npm - @vpxa/aikit - Versions diffs - 0.1.151 → 0.1.153 - Mend

@vpxa/aikit 0.1.151 → 0.1.153

Files changed (30)

package/package.json +1 -1
package/packages/blocks-core/dist/index.d.ts +471 -0
package/packages/blocks-core/dist/index.js +863 -0
package/packages/chunker/dist/index.d.ts +12 -0
package/packages/chunker/dist/index.js +4 -4
package/packages/cli/dist/index.js +15 -15
package/packages/cli/dist/{init-Dk0WDziB.js → init-O57V8aOH.js} +1 -1
package/packages/cli/dist/{scaffold-BB6OrTuA.js → scaffold-DwQDdiCJ.js} +1 -1
package/packages/cli/dist/{templates-D4t_3cJs.js → templates-VOIHbNnT.js} +1 -1
package/packages/present/dist/index.html +818 -3629
package/packages/server/dist/index.js +1 -1
package/packages/server/dist/server-Bs6Rib4s.js +398 -0
package/packages/store/dist/index.js +12 -12
package/scaffold/dist/adapters/_shared.mjs +2 -1
package/scaffold/dist/adapters/claude-code.mjs +10 -9
package/scaffold/dist/adapters/codex.mjs +3 -3
package/scaffold/dist/adapters/copilot.mjs +20 -20
package/scaffold/dist/adapters/gemini.mjs +9 -3
package/scaffold/dist/definitions/agents.mjs +16 -120
package/scaffold/dist/definitions/bodies.mjs +214 -254
package/scaffold/dist/definitions/protocols.mjs +110 -206
package/scaffold/dist/definitions/skills/adr-skill.mjs +27 -0
package/scaffold/dist/definitions/skills/brainstorming.mjs +14 -0
package/scaffold/dist/definitions/skills/browser-use.mjs +1 -1
package/scaffold/dist/definitions/skills/c4-architecture.mjs +46 -1
package/scaffold/dist/definitions/skills/docs.mjs +34 -0
package/scaffold/dist/definitions/skills/frontend-design.mjs +20 -0
package/scaffold/dist/definitions/skills/present.mjs +31 -0
package/scaffold/dist/definitions/skills/session-handoff.mjs +20 -0
package/packages/server/dist/server-D67lImHa.js +0 -540

package/scaffold/dist/definitions/protocols.mjs CHANGED

@@ -1,26 +1,26 @@
-const e={"code-agent-base":`# Code Agent — Shared Base Instructions
-> This file contains shared protocols for all code-modifying agents (Implementer, Frontend, Refactor, Debugger). Each agent's definition file contains only its unique identity, constraints, and workflow. **Do not duplicate this content in agent files.**
-## AI Kit MCP Tool Naming Convention
+function e(e){return`
+## Flow Context Bootstrap
-All tool references in these instructions use **short names** (e.g. \`status\`, \`compact\`, \`search\`).
-At runtime, these are MCP tools exposed by the AI Kit server. Depending on your IDE/client, the actual tool name will be prefixed:
+When dispatched as a subagent within an active flow:
-| Client | Tool naming pattern | Example |
-|--------|-------------------|---------|
-| VS Code Copilot | \`mcp_<serverName>_<tool>\` | \`mcp_aikit_status\` |
-| Claude Code | \`mcp__<serverName>__<tool>\` | \`mcp__aikit__status\` |
-| Other MCP clients | \`<serverName>_<tool>\` or bare \`<tool>\` | \`aikit_status\` or \`status\` |
+1. **Withdraw context first** — before any search or file reads:
+   \`\`\`
+  knowledge({ action: 'withdraw', scope: 'flow', profile: '${e}', budget: 6000 })
+   \`\`\`
+   This returns pre-analyzed context from prior agents.
-The server name is \`aikit\` — check your MCP configuration if tools aren't found.
+2. **Use returned context** — do NOT re-search or re-read files already covered
+3. **\`read_file\` ONLY** for exact lines needed for editing
+4. **Deposit new discoveries:**
+   \`\`\`
+   knowledge({ action: 'remember', scope: 'flow', title: '<discovery>', content: '<details>', category: 'context' })
+   \`\`\`
-**When these instructions say** \`status({})\` **→ call the MCP tool whose name ends with** \`_status\` **and pass** \`{}\` **as arguments.**
+${e===`<PROFILE>`?`**Profile:** Check your role → implementer | documenter | reviewer | researcher | debugger`:`**Profile:** \`${e}\``}
-If tools are deferred/lazy-loaded, load them first (e.g. in VS Code Copilot: \`tool_search_tool_regex({ pattern: "aikit" })\`).
+---`}function t(){return"\n## Evidence Citation Protocol (tier-aware)\n\n**Standalone mode:** If no FORGE task_id was provided in your dispatch prompt, skip `evidence_map` calls entirely — provide free-form findings with `file:line` citations only.\n\nThe Orchestrator runs `forge_classify` before dispatching you, and runs the final `evidence_map({ action: 'gate', task_id })` after you respond. **Do not create your own task_id or run the gate** — feed into the Orchestrator's existing evidence map.\n\n| Tier | Your responsibility |\n|------|---------------------|\n| Floor | Free-form findings with `file.ts#Lxx` citations. No `evidence_map` calls required. |\n| Standard | For every CRITICAL or HIGH finding: `evidence_map({action:'add', task_id, claim, status:'V', receipt:'file.ts#Lxx'})`. Max 2-4 adds to keep signal high. |\n| Critical | Structured claims for all CRITICAL/HIGH findings (2-4 Verified + receipts) AND tag contract/security claims with `safety_gate:'commitment'` or `safety_gate:'provenance'`. |\n\n**Every response MUST include:**\n- `**FORGE Task ID:** <task_id>` (passed in by Orchestrator, or state \"not provided\")\n- `**Tier applied:** Floor | Standard | Critical`\n- `**Findings:** <list>` with `file:line` receipts\n- Verdict: `APPROVED` | `CHANGES_REQUESTED` | `BLOCKED`\n\nDo NOT:\n- Create a new `evidence_map` (the Orchestrator already did)\n- Run `evidence_map({action:'gate'})` yourself — the Orchestrator owns the gate\n- Duplicate findings into the map that weren't CRITICAL/HIGH"}const n={"code-agent-base":`# Code Agent — Shared Base Instructions
----
+> This file contains shared protocols for all code-modifying agents (Implementer, Frontend, Refactor, Debugger). Each agent's definition file contains only its unique identity, constraints, and workflow. **Do not duplicate this content in agent files.**
 ## Invocation Mode Detection
@@ -97,6 +97,10 @@ Always follow this order when you need to understand something. **Never skip to
 Past decisions, conventions, and patterns are stored in curated knowledge. Auto-knowledge captures facts automatically from tool outputs (conventions, errors, test results, research). Use \`search()\` with specific keywords to surface these — they are indexed alongside manually curated entries. You MUST search before implementing:
+- If running as a sub-agent, start with \`knowledge({ action: "withdraw", scope: "flow", profile: "<your-role>", budget: 6000 })\` to pull prior compressed context.
+- Before re-running \`file_summary\`, \`compact\`, \`stratum_card\`, \`search\`, or \`blast_radius\`, check existing flow context first and reuse it when it is sufficient.
+- Reuse existing stash/checkpoint/workset context when present before creating new compressed artifacts.
 \`\`\`
 search("keywords about the feature/area you're changing")  // check for past decisions
 knowledge({ action: "list", category: "decisions" })   // scan recent decisions that might apply
@@ -110,6 +114,7 @@ knowledge({ action: "withdraw", scope: "flow", profile: "<your-role>", budget: 6
 **Rules:**
 - If results exist → **READ them and FOLLOW** established patterns. Do not silently override.
 - If results conflict with the current task → **surface the conflict** to the user/orchestrator.
+- If flow-context search results already contain enough detail → **use them directly** instead of re-running the original tool.
 - If no results → proceed, but **persist your decisions with \`knowledge({ action: "remember", ... })\`** afterward for future recall.
 - Never assume "there's nothing stored" — always search first.
@@ -141,7 +146,7 @@ If unsure which AI Kit tool to use → run \`guide({ topic: "what you need" })\`
 ---
-## Loop Detection & Breaking
+## Loop Detection & Tooling Failure Modes
 Track repeated failures. If the same approach fails, **stop and change strategy**.
@@ -159,6 +164,27 @@ Track repeated failures. If the same approach fails, **stop and change strategy*
 **Never brute-force.** If you catch yourself making the same type of edit repeatedly, you are in a loop.
+### Tooling failure exits
+| Signal | Stop condition | Exit action |
+|--------|---------------|-------------|
+| \`evidence_map\` returns HOLD | Insufficient evidence for FORGE gate | Surface concrete gaps to user — do not retry |
+| Sub-agent returns BLOCKED | Subagent cannot proceed | Read its message, escalate to user with options |
+| \`onboard\` reports stale index (>7 days) | Index is stale | Run \`reindex({})\` ONCE; if still stale, surface to user |
+| \`check\` or \`test_run\` fails 3x identical | Same failure mode repeating | STOP — surface to user with full output, do not retry |
+| \`compact\` returns < 50% reduction | Compression ineffective | Use \`file_summary\` or \`stratum_card\` instead |
+## Sub-agent Context Budget
+When dispatching subagents, choose tier based on task complexity:
+| Tier | Budget | Tools | Use For |
+|------|--------|-------|---------|
+| **Floor** | T1 stratum_card only | Read-only | Quick lookups, single-file Q&A |
+| **Standard** | compact() + T2 stratum_card | Read-only + search | Multi-file analysis, research |
+| **Critical** | digest() + stratum_card + flow context | Full | Implementation, decisions, multi-step |
+Always tell the subagent: profile, tier, and what they should NOT do.
 ---
 ## Hallucination Self-Check
@@ -183,12 +209,14 @@ Track repeated failures. If the same approach fails, **stop and change strategy*
 ---
-## Read-Before-Edit (MANDATORY)
+## Ambiguity Resolution Protocol
-Before modifying ANY file, you MUST read it first using \`file_summary\` or \`compact\` to understand its structure. Then use \`read_file\` for the exact lines you need to edit.
+When a task admits ≥2 valid interpretations:
+1. **Name** each interpretation in one sentence.
+2. **Identify** which assumption causes the most harm if wrong (irreversibility, blast radius, user surprise).
+3. **Ask** ONE question — the one that disambiguates the highest-harm assumption.
-**Forbidden pattern:** Editing a file based on assumptions or partial context from search results alone.
-**Required pattern:** \`file_summary\` → \`read_file\` (exact edit region) → \`replace_string_in_file\`
+Do NOT silently pick. Do NOT ask multiple questions if one is sufficient.
 ## Scope Guard
@@ -234,30 +262,9 @@ For outdated AI Kit entries → \`knowledge({ action: "update", path, content, r
 ---
-## Context Reuse Protocol (MANDATORY)
-Auto-knowledge captures tool responses as shared context between agents. **Before running any read tool, check if another agent already ran it.**
-**Check-before-run pattern:**
-1. Before \`file_summary\`, \`compact\`, \`stratum_card\`, \`search\`, \`blast_radius\`:
-  - \`search({ query: "<tool-name>: <path-or-query>", tags: ["flow-context"], limit: 3 })\`
-  - Example: \`search({ query: "file_summary: src/auth.ts", tags: ["flow-context"] })\`
-2. If results found with enough detail → **use them directly** — do NOT re-run the tool
-3. Only run the original tool if no cached results exist or results are insufficient
-**At agent startup (FIRST action):**
-- Call \`knowledge({ action: "withdraw", profile: "<your-role>", budget: 6000 })\` to receive pre-analyzed context from prior agents in the same flow
-- This surfaces file summaries, search results, and analysis that other agents already performed
-- Parse the withdrawn context — it may contain the exact information you need
-**Why this matters:**
-- Each re-run wastes tokens and time
-- Auto-knowledge stores \`file_summary\`, \`compact\`, \`search\`, \`stratum_card\`, \`blast_radius\`, \`scope_map\` results
-- Results are tagged with \`flow-context\` and searchable via \`search()\`
----
+## AI Kit Tool Discipline
-## FORBIDDEN: Native Tools When AI Kit Alternative Exists
+Use AI Kit retrieval and compression tools first. Prefer reusable compressed context over raw reads, and only drop to native tools when precision for an edit or tool fallback requires it.
 | NEVER use this | USE THIS instead | Why |
 |---|---|---|
@@ -268,33 +275,28 @@ Auto-knowledge captures tool responses as shared context between agents. **Befor
 | \`grep_search\` for a symbol name | \`symbol({ name })\` | Definition + references with scope and call context |
 | \`run_in_terminal\` for tsc/lint | \`check({})\` | Typecheck + lint combined, summary output |
 | \`run_in_terminal\` for test | \`test_run({})\` | Run tests with structured output |
+| Editing without reading | \`file_summary\` then targeted \`read_file\` | Prevents wrong-position edits |
 **\`read_file\` is ONLY acceptable when you need exact line content FOR EDITING (before \`replace_string_in_file\`).**
----
+For edits, first understand structure with \`file_summary\` or \`compact\`, then use targeted \`read_file\` only for the exact region.
+Never patch from search snippets or assumptions alone.
-## Context Efficiency (MANDATORY)
+## compact() Failure Recovery
-**MANDATORY: Use AI Kit over \`read_file\` to understand code** (if tools are loaded). Use the AI Kit compression tools:
-- **\`file_summary({ path })\`** — Structure, exports, imports (~50 tokens vs ~1000+ for read_file)
-- **\`compact({ path, query })\`** — Extract relevant sections from a single file (5-20x token reduction)
-- **\`digest({ sources })\`** — Compress 3+ files into a single token-budgeted summary
-- **\`stratum_card({ files, query })\`** — Generate a reusable T1/T2 context card for files you'll reference repeatedly
+If \`compact()\` returns <200 bytes or empty content, the file is NOT indexed. Follow this fallback:
-**Session phases** — structure your work to minimize context bloat:
+1. **Do NOT retry** compact on the same file — it will fail again
+2. **Use \`read_file\`** with a LARGE range (e.g., \`startLine: 1, endLine: 9999\`) — NEVER chunk into small ranges
+3. **Use \`stash()\`** to cache findings from unindexed files — context pressure causes re-reads
+4. **Check \`status()\`** to see which paths are indexed before calling compact
-| Phase | What to do | Compress after? |
-|-------|-----------|----------------|
-| **Understand** | Search AI Kit, read summaries, trace symbols | Yes — \`digest\` findings before planning |
-| **Plan** | Design approach, identify files to change | Yes — \`stash\` the plan, compact analysis |
-| **Execute** | Make changes, one sub-task at a time | Yes — compact between independent sub-tasks |
-| **Verify** | \`check\` + \`test_run\` + \`blast_radius\` | — |
+**Anti-patterns to avoid:**
+- Retrying compact 3x on same unindexed file (wastes 3 tool calls)
+- Falling back to read_file in small chunks (10-50 lines) — each chunk costs ~3K prompt tokens in overhead
+- Re-reading the same file later because you forgot the content — use stash() to cache
-**Rules:**
-- **Never compact mid-operation** — finish the current sub-task first
-- **Recycle context to files** — save analysis results via \`stash\` or \`knowledge({ action: "remember", ... })\`, not just in conversation
-- **Decompose monolithic work** — break into independent chunks, pass results via artifact files between sub-tasks
-- **One-shot sub-tasks** — for self-contained changes, provide all context upfront to avoid back-and-forth
+*Why:* these tools reduce token cost, shrink duplicate reads, and lower the odds of wrong-file or wrong-position edits while preserving reusable context.
 ---
@@ -401,26 +403,7 @@ When you need user input or need to explain something before asking:
 - **Prefer the simplest method** that adequately conveys the information
 - **CLI mode override:** When running in terminal (not VS Code chat), always use \`format: "browser"\` for any rich content
-## Flow Context Bootstrap
-When dispatched as a subagent within an active flow:
-1. **Withdraw context first** — before any search or file reads:
-   \`\`\`
-   knowledge({ action: 'withdraw', profile: '<PROFILE>', budget: 6000 })
-   \`\`\`
-   This returns pre-analyzed context from prior agents.
-2. **Use returned context** — do NOT re-search or re-read files already covered
-3. **\`read_file\` ONLY** for exact lines needed for editing
-4. **Deposit new discoveries:**
-   \`\`\`
-   knowledge({ action: 'remember', scope: 'flow', title: '<discovery>', content: '<details>', category: 'context' })
-   \`\`\`
-**Profile:** Check your role → implementer | documenter | reviewer | researcher | debugger
----
+${e(`<PROFILE>`)}
 ## Handoff Format
@@ -439,6 +422,23 @@ Always return this structure when invoked as a sub-agent:
   <blockers>{any blocking issues}</blockers>
 </handoff>
 \`\`\`
+  ## AI Kit MCP Tool Naming Convention
+  All tool references in these instructions use **short names** (e.g. \`status\`, \`compact\`, \`search\`).
+  At runtime, these are MCP tools exposed by the AI Kit server. Depending on your IDE/client, the actual tool name will be prefixed:
+  | Client | Tool naming pattern | Example |
+  |--------|-------------------|---------|
+  | VS Code Copilot | \`mcp_<serverName>_<tool>\` | \`mcp_aikit_status\` |
+  | Claude Code | \`mcp__<serverName>__<tool>\` | \`mcp__aikit__status\` |
+  | Other MCP clients | \`<serverName>_<tool>\` or bare \`<tool>\` | \`aikit_status\` or \`status\` |
+  The server name is \`aikit\` — check your MCP configuration if tools aren't found.
+  **When these instructions say** \`status({})\` **→ call the MCP tool whose name ends with** \`_status\` **and pass** \`{}\` **as arguments.**
+  If tools are deferred/lazy-loaded, load them first (e.g. in VS Code Copilot: \`tool_search_tool_regex({ pattern: "aikit" })\`).
 `,"researcher-base":`# Researcher — Shared Base Instructions
 > Shared methodology for all Researcher variants. Each variant's definition contains only its unique identity and model assignment. **Do not duplicate.**
@@ -453,26 +453,7 @@ Follow the **MANDATORY FIRST ACTION** and **Information Lookup Order** from code
 **Start with pre-analyzed artifacts.** They cover 80%+ of common research needs.
-## Flow Context Bootstrap
-When dispatched as a subagent within an active flow:
-1. **Withdraw context first** — before any search or file reads:
-   \`\`\`
-  knowledge({ action: 'withdraw', scope: 'flow', profile: 'researcher', budget: 6000 })
-   \`\`\`
-   This returns pre-analyzed context from prior agents.
-2. **Use returned context** — do NOT re-search or re-read files already covered
-3. **\`read_file\` ONLY** for exact lines needed for editing
-4. **Deposit new discoveries:**
-   \`\`\`
-   knowledge({ action: 'remember', scope: 'flow', title: '<discovery>', content: '<details>', category: 'context' })
-   \`\`\`
-**Profile:** \`researcher\`
----
+${e(`researcher`)}
 ## Research Methodology
@@ -550,19 +531,13 @@ When invoked for a decision analysis, you receive a specific question. You MUST:
 ## Invocation Mode Detection
-- **Direct** (has AI Kit tools) → Follow the **Information Lookup Order** from code-agent-base
-- **Sub-agent** (prompt has "## Prior AI Kit Context") → Skip AI Kit Recall, use provided context
+> **Mode:** Researchers always run as subagents — no Direct mode.
 ---
 ## Context Efficiency
-- **NEVER use \`read_file\` to understand code** — use AI Kit compression tools instead
-- **\`file_summary\`** for structure (exports, imports, call edges — 10x fewer tokens)
-- **\`compact\`** for specific sections (5-20x token reduction vs read_file)
-- **\`digest\`** when synthesizing from 3+ sources
-- **\`stratum_card\`** for files you'll reference repeatedly
-- **\`read_file\` is ONLY acceptable** when you need exact lines for a pending edit operation
+> **Reminder:** Apply Context Efficiency rules — prefer compact/digest/file_summary over raw read_file. See \`code-agent-base\` for full table.
 ## Parallel Exploration via \`lane\`
@@ -585,26 +560,7 @@ Follow the **MANDATORY FIRST ACTION** and **Information Lookup Order** from code
 2. If onboard shows ❌ → Run \`onboard({ path: '.' })\` and wait for completion
 3. If onboard shows ✅ → Read relevant onboard artifacts using \`compact({ path: '<Onboard Directory>/<file>' })\` — especially \`patterns.md\` and \`api-surface.md\` for review context
-## Flow Context Bootstrap
-When dispatched as a subagent within an active flow:
-1. **Withdraw context first** — before any search or file reads:
-   \`\`\`
-  knowledge({ action: 'withdraw', scope: 'flow', profile: 'reviewer', budget: 6000 })
-   \`\`\`
-  This returns pre-analyzed context from prior agents.
-2. **Use returned context** — do NOT re-search or re-read files already covered
-3. **\`read_file\` ONLY** for exact lines needed for editing
-4. **Deposit new discoveries:**
-   \`\`\`
-   knowledge({ action: 'remember', scope: 'flow', title: '<discovery>', content: '<details>', category: 'context' })
-   \`\`\`
-**Profile:** \`reviewer\`
----
+${e(`reviewer`)}
 ## Review Workflow
@@ -656,28 +612,7 @@ When dispatched as a subagent within an active flow:
 - **FAILED** for any CRITICAL finding
 - Always check for **test coverage** on new/changed code
-## Evidence Citation Protocol (tier-aware)
-**Standalone mode:** If no FORGE task_id was provided in your dispatch prompt, skip \`evidence_map\` calls entirely — provide free-form findings with \`file:line\` citations only.
-The Orchestrator runs \`forge_classify\` before dispatching you, and runs the final \`evidence_map({ action: 'gate', task_id })\` after you respond. **Do not create your own task_id or run the gate** — feed into the Orchestrator's existing evidence map.
-| Tier | Your responsibility |
-|------|---------------------|
-| Floor | Free-form findings with \`file.ts#Lxx\` citations. No \`evidence_map\` calls required. |
-| Standard | For every CRITICAL or HIGH finding: \`evidence_map({action:'add', task_id, claim, status:'V', receipt:'file.ts#Lxx'})\`. Max 2-4 adds to keep signal high. |
-| Critical | Structured claims for all CRITICAL/HIGH findings (2-4 Verified + receipts) AND tag contract/security claims with \`safety_gate:'commitment'\` or \`safety_gate:'provenance'\`. |
-**Every response MUST include:**
-- \`**FORGE Task ID:** <task_id>\` (passed in by Orchestrator, or state "not provided")
-- \`**Tier applied:** Floor | Standard | Critical\`
-- \`**Findings:** <list>\` with \`file:line\` receipts
-- Verdict: \`APPROVED\` | \`CHANGES_REQUESTED\` | \`BLOCKED\`
-Do NOT:
-- Create a new \`evidence_map\` (the Orchestrator already did)
-- Run \`evidence_map({action:'gate'})\` yourself — the Orchestrator owns the gate
-- Duplicate findings into the map that weren't CRITICAL/HIGH
+${t()}
 `,"architect-reviewer-base":`# Architect-Reviewer — Shared Base Instructions
 > Shared methodology for all Architect-Reviewer variants. Each variant's definition contains only identity and model. **Do not duplicate.**
@@ -690,26 +625,7 @@ Follow the **MANDATORY FIRST ACTION** and **Information Lookup Order** from code
 2. If onboard shows ❌ → Run \`onboard({ path: '.' })\` and wait for completion
 3. If onboard shows ✅ → Read relevant onboard artifacts using \`compact({ path: '<Onboard Directory>/<file>' })\` — especially \`structure.md\`, \`dependencies.md\`, and \`diagram.md\` for architecture context
-## Flow Context Bootstrap
-When dispatched as a subagent within an active flow:
-1. **Withdraw context first** — before any search or file reads:
-   \`\`\`
-  knowledge({ action: 'withdraw', scope: 'flow', profile: 'reviewer', budget: 6000 })
-   \`\`\`
-  This returns pre-analyzed context from prior agents.
-2. **Use returned context** — do NOT re-search or re-read files already covered
-3. **\`read_file\` ONLY** for exact lines needed for editing
-4. **Deposit new discoveries:**
-   \`\`\`
-   knowledge({ action: 'remember', scope: 'flow', title: '<discovery>', content: '<details>', category: 'context' })
-   \`\`\`
-**Profile:** \`reviewer\`
----
+${e(`reviewer`)}
 ## Review Workflow
@@ -757,28 +673,7 @@ When dispatched as a subagent within an active flow:
 - **BLOCKED** — Fundamental design flaw requiring rethink
 - Always validate **dependency direction** — inner layers must not depend on outer
-## Evidence Citation Protocol (tier-aware)
-**Standalone mode:** If no FORGE task_id was provided in your dispatch prompt, skip \`evidence_map\` calls entirely — provide free-form findings with \`file:line\` citations only.
-The Orchestrator runs \`forge_classify\` before dispatching you, and runs the final \`evidence_map({ action: 'gate', task_id })\` after you respond. **Do not create your own task_id or run the gate** — feed into the Orchestrator's existing evidence map.
-| Tier | Your responsibility |
-|------|---------------------|
-| Floor | Free-form findings with \`file.ts#Lxx\` citations. No \`evidence_map\` calls required. |
-| Standard | For every CRITICAL or HIGH finding: \`evidence_map({action:'add', task_id, claim, status:'V', receipt:'file.ts#Lxx'})\`. Max 2-4 adds to keep signal high. |
-| Critical | Structured claims for all CRITICAL/HIGH findings (2-4 Verified + receipts) AND tag contract/security claims with \`safety_gate:'commitment'\` or \`safety_gate:'provenance'\`. |
-**Every response MUST include:**
-- \`**FORGE Task ID:** <task_id>\` (passed in by Orchestrator, or state "not provided")
-- \`**Tier applied:** Floor | Standard | Critical\`
-- \`**Findings:** <list>\` with \`file:line\` receipts
-- Verdict: \`APPROVED\` | \`CHANGES_REQUESTED\` | \`BLOCKED\`
-Do NOT:
-- Create a new \`evidence_map\` (the Orchestrator already did)
-- Run \`evidence_map({action:'gate'})\` yourself — the Orchestrator owns the gate
-- Duplicate findings into the map that weren't CRITICAL/HIGH
+${t()}
 ## Graph-Assisted Layer Verification
@@ -810,7 +705,9 @@ The Orchestrator uses **multi-model decision analysis** to resolve non-trivial t
 Dispatch ALL available Researcher variants **in parallel** via \`runSubagent\` — one call per variant, same question, simultaneous. Each returns an independent recommendation grounded in their thinking style:
-**IMPORTANT: Include this instruction in every researcher dispatch prompt: "You are running as a subagent. Do NOT use the \`present\` tool — return all analysis as plain text."**
+**IMPORTANT: Include these instructions in every researcher dispatch prompt:**
+- "You are running as a subagent. Do NOT use the \`present\` tool — return all analysis as plain text."
+- "Keep your analysis to ≤ 500 words. Structure: (1) Recommendation, (2) Key evidence, (3) Critical risks. No preamble."
 | Variant | Thinking Style | Lens |
 |---------|---------------|------|
@@ -821,7 +718,10 @@ Dispatch ALL available Researcher variants **in parallel** via \`runSubagent\`
 ### Phase 2 — Peer Review (parallel)
-After all researchers return, **anonymize** their responses as Perspective A / B / C / D (strip agent names). Then dispatch a **second parallel batch** of 4 review sub-agents via \`runSubagent\`:
+After all researchers return:
+1. **Compress** each response to its core argument (≤ 200 words) — \`stash\` full responses if needed later
+2. **Anonymize** as Perspective A / B / C / D (strip agent names)
+3. Dispatch **second parallel batch** of review sub-agents with compressed versions via \`runSubagent\`:
 **Peer Review Prompt Template:**
 \`\`\`
@@ -888,7 +788,7 @@ Trigger the decision protocol when there is an **unresolved non-trivial technica
 - **\`runSubagent\` is ALWAYS available** — it is a core tool in every environment (VS Code, CLI, Copilot Chat). NEVER claim it is unavailable. NEVER simulate researchers inline by "applying lenses yourself." If you cannot call \`runSubagent\`, you have a tool-loading issue — retry or escalate, do NOT degrade to single-agent inline simulation.
 - **No \`present\` in subagents** — always include "Do NOT use the \`present\` tool — return all analysis as plain text" in every researcher dispatch prompt. Subagent visual outputs are invisible to the user.
-- Always launch in **parallel**, minimum 4 variants
+- Always launch in **parallel** — 4 variants for Critical, 2 (Alpha + Delta) for Standard per tier gate
 - Use exact case-sensitive agent names — never rename or alias
 - **Anonymize** researcher outputs before peer review (A/B/C/D, not agent names)
 - Peer review is a SEPARATE parallel batch — never skip it
@@ -897,12 +797,16 @@ Trigger the decision protocol when there is an **unresolved non-trivial technica
 - **Produce an ADR** after every decision resolution
 - \`knowledge({ action: "remember", ... })\` the decision for future recall
-## Shortcut: Floor-Tier Decisions
+## Tier Shortcuts
+**Standard tier** (default for multi-file tasks):
+- Phase 1: 2 researchers only (Alpha + Delta) — skip Beta + Gamma
+- Skip Phase 2 (peer review) — synthesize directly from 2 research outputs
+- Verdict format required but can be concise
+- ADR optional (\`knowledge({ action: "remember", ... })\` at minimum)
-For decisions classified as **Floor tier** (blast_radius ≤ 2, single concern):
-- Skip Phase 2 (peer review) — synthesis directly from Phase 1
-- Verdict format still required but can be abbreviated
-- ADR is optional (use \`knowledge({ action: "remember", ... })\` at minimum)
+**Floor tier** (blast_radius ≤ 2, single concern):
+- Skip the Decision Protocol entirely — decide inline or with 1 researcher max
 `,"forge-protocol":`# FORGE Protocol — Quality Overlay
 > Follow the FORGE (Fact-Oriented Reasoning with Graduated Evidence) protocol for all code generation and modification tasks.
@@ -992,7 +896,7 @@ evidence_map({ action: "gate", task_id: "add-user-api" })  → YIELD ✅
 3. **Standard**: \`evidence_map create\` → add 3-8 claims during work → \`evidence_map gate\`
 4. **Critical**: Full 4-phase flow with comprehensive evidence
 5. **After gate**: YIELD = done, HOLD = fix + re-gate, HARD_BLOCK = escalate
-`},t={"execution-state":`# Execution State: {Task Title}
+`},r={"execution-state":`# Execution State: {Task Title}
 **Status:** PLANNING | IN_PROGRESS | REVIEW | COMPLETED | BLOCKED
 **Started:** {timestamp}
@@ -1044,4 +948,4 @@ evidence_map({ action: "gate", task_id: "add-user-api" })  → YIELD ✅
 ## Alternatives Considered
 {Other approaches evaluated and why they were rejected — keeps the "why not" alongside the "why"}
-`};export{e as PROTOCOLS,t as TEMPLATES};
+`};export{n as PROTOCOLS,r as TEMPLATES};
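
The main structural change in this file is that the repeated "Flow Context Bootstrap" sections are now produced by a single helper (minified as `e`) and interpolated into each protocol string. A de-minified sketch of that pattern is shown here; the names `flowContextBootstrap` and `protocols` are hypothetical, and the section text is abbreviated rather than copied in full.

```javascript
// Hypothetical, de-minified sketch of the helper the diff introduces as `e(e)`.
// Only the profile-dependent parts of the section are reproduced here.
function flowContextBootstrap(profile) {
  // The shared base passes the literal placeholder "<PROFILE>" and gets a
  // role-selection hint; role-specific bases get their profile name instead.
  const profileLine =
    profile === "<PROFILE>"
      ? "**Profile:** Check your role → implementer | documenter | reviewer | researcher | debugger"
      : "**Profile:** `" + profile + "`";

  return [
    "",
    "## Flow Context Bootstrap",
    "",
    "knowledge({ action: 'withdraw', scope: 'flow', profile: '" + profile + "', budget: 6000 })",
    "",
    profileLine,
    "",
    "---",
  ].join("\n");
}

// Each protocol body interpolates the helper instead of repeating the text.
// The headings are placeholders, not the real protocol content.
const protocols = {
  "code-agent-base": "# Code Agent (abbreviated)\n" + flowContextBootstrap("<PROFILE>"),
  "researcher-base": "# Researcher (abbreviated)\n" + flowContextBootstrap("researcher"),
  "architect-reviewer-base": "# Architect-Reviewer (abbreviated)\n" + flowContextBootstrap("reviewer"),
};

console.log(protocols["researcher-base"]);
```

One visible effect of the refactor is that the `code-agent-base` withdraw call now carries `scope: 'flow'`, which the removed inline copy of that section did not include.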

package/scaffold/dist/definitions/skills/adr-skill.mjs CHANGED

@@ -1384,6 +1384,33 @@ metadata:
 # ADR Skill
+## Quick Reference
+**Purpose:** Create Architecture Decision Records optimized for agentic coding — decisions that agents can implement without follow-up questions.
+**Write an ADR when:** Decision changes how system is built, is hard to reverse, affects other people/agents, or has real alternatives.
+**Four-Phase Workflow:**
+1. **Discover** — Socratic questioning to surface intent, constraints, and alternatives
+2. **Draft** — Fill template (Simple or MADR) with concrete, measurable constraints
+3. **Validate** — Agent-readiness checklist (score ≥ 80%)
+4. **Record** — Save to \`docs/decisions/\` or \`adr/\`, update index
+**Two templates:**
+- **Simple** (≤3 options, single-team) — Context → Decision → Consequences → Implementation Plan → Alternatives
+- **MADR** (complex, multi-team) — Full template with Decision Drivers, Pros/Cons matrix, detailed Implementation Plan
+**Commands:**
+| Action | What to do |
+|--------|-----------|
+| Create | Run 4-phase workflow → save to \`docs/decisions/NNNN-slug.md\` |
+| Consult | \`search({ query: "ADR <topic>" })\` before implementing |
+| Update | Edit content, change status in YAML frontmatter |
+| Deprecate | Set \`status: deprecated\`, add superseded-by link |
+| Bootstrap | Create \`docs/decisions/\` + \`index.md\` if missing |
+**Agent-readiness gate:** Every ADR scores ≥ 80% on: Specificity, Testability, Completeness, Independence, Actionability.
 ## Philosophy
 ADRs created with this skill are **executable specifications for coding agents**. A human approves the decision; an agent implements it. The ADR must contain everything the agent needs to write correct code without asking follow-up questions.

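The agent-readiness gate added in the ADR Quick Reference names five criteria and an 80% threshold, but the diff does not show how the score is computed. The following is a minimal sketch that assumes equal-weight pass/fail checks; the function name `agentReadiness`, the shape of the `checks` object, and the scoring rule itself are hypothetical and are not taken from the package.

```javascript
// Hypothetical sketch: criterion names come from the skill text above;
// the equal-weight pass/fail scoring is an assumption, not the package's method.
const CRITERIA = [
  "Specificity",
  "Testability",
  "Completeness",
  "Independence",
  "Actionability",
];
const THRESHOLD = 0.8; // "score >= 80%" from the Quick Reference

// checks: object mapping each criterion name to true (passes) or anything else (fails).
function agentReadiness(checks) {
  const failing = CRITERIA.filter((name) => checks[name] !== true);
  const score = (CRITERIA.length - failing.length) / CRITERIA.length;
  return { score, ready: score >= THRESHOLD, failing };
}

const result = agentReadiness({
  Specificity: true,
  Testability: true,
  Completeness: true,
  Independence: false,
  Actionability: true,
});
console.log(result.ready, result.failing);
```

Under this assumed rule, an ADR may miss at most one of the five criteria and still clear the gate; a weighted or per-question checklist would behave differently.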
package/scaffold/dist/definitions/skills/brainstorming.mjs CHANGED

@@ -14,6 +14,20 @@ argument-hint: "Feature, component, or behavior to design"
 # Brainstorming Ideas Into Designs
+## Quick Reference
+**Purpose:** Explore user intent, requirements, and design BEFORE any implementation. Mandatory for creative work.
+**HARD GATE:** Do NOT write any code until design is presented and user approves. No exceptions, even for "simple" tasks.
+**Two modes** (you decide, don't ask user):
+- **Simple Mode** - <=3 files, single concern, no new boundaries -> brief context + 3-5 targeted questions + short design summary
+- **Full Mode** - multi-file, new services, cross-cutting concerns -> deep context gathering + structured design document + explicit approval
+**Process:** Understand context -> Ask questions (one at a time) -> Present design -> Get user approval -> Hand off to implementation
+**Output:** Design document with: goal, scope, approach, file changes, edge cases, and acceptance criteria.
 Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
 Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.

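The brainstorming Quick Reference tells the agent to pick between Simple and Full mode on its own. As a rough illustration of that rule (at most 3 files, a single concern, and no new boundaries means Simple; anything else means Full), here is a small sketch. The function name `chooseBrainstormMode` and its input fields are hypothetical; the skill describes this decision in prose only.

```javascript
// Hypothetical sketch of the mode rule stated in the Quick Reference.
// Field names (fileCount, singleConcern, newBoundaries) are illustrative assumptions.
function chooseBrainstormMode({ fileCount, singleConcern, newBoundaries }) {
  const isSimple = fileCount <= 3 && singleConcern === true && newBoundaries !== true;
  return isSimple ? "simple" : "full";
}

// A two-file tweak within one concern stays in Simple Mode.
console.log(chooseBrainstormMode({ fileCount: 2, singleConcern: true, newBoundaries: false }));
// Adding a new service boundary escalates to Full Mode even with few files.
console.log(chooseBrainstormMode({ fileCount: 2, singleConcern: true, newBoundaries: true }));
```

Either mode still ends at the hard gate described above: no code is written until the design is presented and the user approves it.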
package/scaffold/dist/definitions/skills/browser-use.mjs CHANGED

@@ -1,4 +1,4 @@
-var e=[{file:`SKILL.md`,content:"---\nname: browser-use\ndescription: \"Browser automation for AI agents using AI Kit's owned `browser` MCP tool. Triggered when: (1) repo-access exhausts its Strategy Ladder and auth requires browser interaction, (2) `web_fetch` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that `web_fetch` cannot handle, (5) user asks to browse, scrape, test, or automate a website, or (6) another skill needs a standard recipe format for browser-driven workflows. Uses AI Kit's owned Chromium runtime and recipe patterns for domain-specific automation skills — no external MCP server dependency.\"\nmetadata:\n  category: cross-cutting\n  domain: general\n  applicability: on-demand\n  inputs: [url, auth-error, browser-task, login-wall]\n  outputs: [page-content, screenshots, extracted-data, authenticated-session, network-captures]\n  requires: []\n  relatedSkills: [repo-access, present, aikit]\nargument-hint: \"URL or browser task description\"\n---\n\n# Browser Automation for AI Agents\n\nUse AI Kit's `browser` MCP tool for authentication barriers, data extraction, form interactions, network capture, and web automation. Single tool, action-based dispatch, owned Chromium runtime.\n\n## Runtime\n\n- Tool: `browser({ action: ... 
})`\n- 11 actions: `open`, `read`, `act`, `navigate`, `network`, `console`, `fetch`, `eval`, `screenshot`, `dialog`, `session`\n- Modes: `headless` (CI), `ui` (desktop), `panel` (VS Code)\n- Install: `aikit browser install`\n- Auto-idle shutdown after timeout\n\n## When to Activate\n\n- `web_fetch` returns login HTML, SAML redirect, or CAPTCHA\n- `http` returns 401/403 and user confirms browser access works\n- `repo-access` Strategy Ladder exhausted — SSO/OAuth blocks CLI\n- Anti-bot detection (Cloudflare, \"verify you are human\")\n- User asks to browse, scrape, automate, test, or interact with a web app\n- Need screenshots, accessibility snapshots, or JS-rendered content\n- Preview or inspect local HTML files (serve locally, then open with browser)\n- Need to capture network traffic or make authenticated API calls using page session\n\n## When NOT to Activate\n\n- Public pages `web_fetch` handles correctly\n- API endpoints reachable via `http` with auth headers\n- Static downloads via `http`\n- Tasks only needing raw HTML/links/outline\n\n## Two Automation Modes\n\n### Script Mode (Default — Imperative)\n\nDirect sequential `browser()` calls. Best for one-off tasks, testing, API capture.\n\n~~~text\n// Open → Read → Act → Read loop\nbrowser({ action: 'open', url: 'https://app.example.com', mode: 'ui' })\nbrowser({ action: 'read', pageId })\nbrowser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })\nbrowser({ action: 'read', pageId })  // verify state changed\n~~~\n\n**Network Intelligence pattern:**\n\n~~~text\nbrowser({ action: 'network', pageId, subAction: 'enable', filter: { resourceTypes: ['xhr', 'fetch'] } })\n// ... 
navigate/interact to trigger API calls ...\nbrowser({ action: 'network', pageId, subAction: 'get' })\nbrowser({ action: 'network', pageId, subAction: 'export-har' })\n~~~\n\n**Authenticated API calls (using page cookies/session):**\n\n~~~text\nbrowser({ action: 'fetch', pageId, fetchUrl: 'https://app.example.com/api/data', fetchMethod: 'GET' })\n~~~\n\nExecutes `fetch()` in the page, so cookies, session state, and CSRF tokens are reused automatically.\n\n**Console capture:**\n\n~~~text\nbrowser({ action: 'console', pageId, consoleSubAction: 'enable' })\n// ... trigger page actions ...\nbrowser({ action: 'console', pageId, consoleSubAction: 'get', level: 'error' })\n~~~\n\n### Recipe Mode (Declarative)\n\nStructured step-by-step format for reusable workflows and domain skills. Each step declares Action, Verify, On Failure, and Extract fields.\n\nLoad [references/recipes.md](references/recipes.md) for full recipe templates and the recipe format specification.\n\nBrief recipe format:\n\n~~~text\nStep N: <description>\n  Action: browser({ ... 
})\n  Verify: <condition to check after action>\n  On Failure: <recovery strategy>\n  Extract: <data to capture for next steps>\n~~~\n\n## Action Reference\n\n| Action | Purpose | Key Params |\n|--------|---------|------------|\n| `open` | Launch page | `url`, `mode` (ui/headless/panel), `waitUntil` |\n| `read` | Extract content | `pageId`, `readMode` (snapshot/dom/markdown/text), `selector` |\n| `act` | DOM interaction | `pageId`, `kind`, `ref`/`selector`, `text`/`key`/`value` |\n| `navigate` | Page navigation | `pageId`, `url` or `type` (back/forward/reload/waitFor) |\n| `network` | Capture traffic | `pageId`, `subAction` (enable/get/clear/export-har), `filter` |\n| `console` | Capture console | `pageId`, `consoleSubAction` (enable/get/clear), `level` |\n| `fetch` | Page-context HTTP | `pageId`, `fetchUrl`, `fetchMethod`, `fetchHeaders`, `fetchBody` |\n| `eval` | Execute JS | `pageId`, `code` |\n| `screenshot` | Capture image | `pageId`, `selector`, `fullPage`, `clip`, `format` |\n| `dialog` | Pre-register handler for NEXT dialog | `pageId`, `accept`, `promptText` |\n| `session` | Manage sessions | `sessionAction` (list/close/cookies/set-cookie/get-storage/...) |\n\n## Read Modes\n\n| Mode | Output | Use Case |\n|------|--------|----------|\n| `snapshot` | ARIA accessibility tree with refs | Element targeting, form interaction |\n| `dom` | Raw HTML | HTML structure, debugging |\n| `markdown` | Clean readable text | Content extraction, summarization |\n| `text` | Plain text | Simple text extraction |\n\n## Interaction Kinds\n\n| Kind | Required Params | Notes |\n|------|-----------------|-------|\n| `click` | `ref` or `selector` | Left-click element |\n| `type` | `ref`/`selector` + `text` | Type into input/textarea |\n| `press` | `ref`/`selector` + `key` | Send key to element. Requires a target — use `ref` from snapshot or `selector`. 
|\n| `hover` | `ref`/`selector` | Trigger hover states |\n| `drag` | `fromRef`/`fromSelector` + `toRef`/`toSelector` | Drag and drop |\n| `select` | `ref`/`selector` + `value` | Select dropdown option |\n| `scroll` | optional `ref`/`selector` | Scroll page or element |\n| `upload` | `ref`/`selector` + `value` (path) | File upload |\n\n### Element Targeting Priority\n\n1. **`ref`** (e.g., `@F12`) — From `read(snapshot)` ARIA tree. Most reliable.\n2. **`selector`** (e.g., `input[name='q']`) — Playwright CSS/attribute selector. Precise.\n3. **`element`** (e.g., `'Submit'`) — Text matching via `text=` locator. **Picks first DOM match regardless of visibility.** Fragile for complex widgets (comboboxes, ARIA roles). Last resort.\n\n**Always `read(snapshot)` first** to get refs before interacting.\n\n> **Visibility Warning**: Playwright `act` waits up to 30s for the target to be visible. If a selector or `element` matches a hidden element first, the action times out. The browser tool does NOT expose a `force` or custom `timeout` parameter.\n>\n> **Workarounds:**\n> - Append `:visible` to selectors: `selector: 'button:has-text(\"Submit\"):visible'`\n> - Use specific selectors instead of `element` when labels are ambiguous (e.g., \"Search\" may match 30+ elements)\n> - Use `read(snapshot)` refs (`@F12`) which always target the specific rendered element\n\n## Network Intelligence\n\nThree new actions for API reverse-engineering and authenticated requests:\n\n**`network`** — Passive traffic capture with circular buffer (200 entries default):\n- `enable`: Start capturing with optional filter (resourceTypes, urlPattern, excludeUrls)\n- `get`: Retrieve captured requests + responses with timing\n- `clear`: Reset buffer\n- `export-har`: Export as HAR 1.2 format\n\nHeaders are redacted by default (Authorization, Cookie, etc.). 
Pass `showSensitive: true` to see full headers.\n\n**`console`** — Browser console message capture (1000 entries default):\n- `enable`: Start capturing all console output\n- `get`: Retrieve messages, optionally filtered by `level`\n- `clear`: Reset buffer\n\n**`fetch`** — Execute HTTP from page context:\n- Uses the page's live cookies, session, CSRF tokens\n- Supports GET/POST/PUT/PATCH/DELETE/HEAD/OPTIONS\n- Body auto-truncated at 256KB\n- Alternative to extracting cookies then calling `http` tool\n\n**Workflow — Reverse-engineer API:**\n\n~~~text\n1. open target page\n2. network enable (filter: xhr, fetch)\n3. interact with the page (click buttons, submit forms)\n4. network get → see API endpoints, methods, headers\n5. fetch → replay API calls using page session\n~~~\n\n## Session Management\n\n| Action | Purpose | Note |\n|--------|---------|------|\n| `cookies` | Export page cookies | `confirm: true` required |\n| `set-cookie` | Inject cookies | `confirm: true` required |\n| `delete-cookie` / `clear-cookies` | Remove cookies | `confirm: true` required |\n| `get-storage` / `set-storage` / `clear-storage` | localStorage/sessionStorage | |\n| `list` | List open pages | |\n| `close` | Close a page | |\n\n## Security Model\n\n**Hard gates — NEVER bypass:**\n- Credentials go via terminal input (NEVER through tool params or chat)\n- CAPTCHA/MFA: pause and ask user\n- Never store tokens in conversation\n- Close pages containing sensitive data when done\n- Verify page URL before entering credentials (phishing prevention)\n- Use `headless` mode for automated non-interactive tasks; `ui` for user-supervised auth\n\n**Cookie safety gate:** All cookie read/write session actions (`cookies`, `set-cookie`, `delete-cookie`, `clear-cookies`) require `confirm: true` as an explicit acknowledgment. Without it, the tool returns an error.\n\n## Local File Preview\n\nThe browser tool blocks `file:///` URLs for security. 
To preview local HTML files, serve them via a local HTTP server first.\n\n**Pattern:**\n\n~~~text\n// 1. Start local server (pick an unused port)\n//    Terminal: npx -y serve <directory> -l <port>\n//    Example: npx -y serve ./dist -l 3847\n\n// 2. Open in browser\nbrowser({ action: 'open', url: 'http://localhost:3847/my-file.html', mode: 'ui' })\n\n// 3. Read content or take screenshot\nbrowser({ action: 'read', pageId, readMode: 'markdown' })\nbrowser({ action: 'screenshot', pageId, fullPage: true })\n\n// 4. Clean up — kill the server terminal when done\n~~~\n\n**Use cases:**\n- Preview generated HTML (viewers, reports, docs)\n- Visual regression testing of local builds\n- Inspect single-file HTML applications\n- Screenshot local pages for review\n\n**Important:** Always use `mode: 'ui'` for visual preview so the user can also see and interact with the page.\n\n## Integration\n\n| Skill | Handoff Pattern |\n|-------|------------------|\n| `repo-access` | Strategy Ladder step 6 → browser-use for SSO/OAuth login |\n| `present` | `present({ format: 'browser' })` returns URL → open with browser tool |\n| `aikit` | `web_fetch` fails → browser-use activates |\n\n## Dialog Handling\n\n`dialog()` registers a **one-shot handler** for the NEXT dialog. It must be called **BEFORE** the action that triggers alert, confirm, or prompt.\n\n**Pattern:**\n~~~text\nbrowser({ action: 'dialog', pageId, accept: true })\nbrowser({ action: 'eval', pageId, code: 'confirm(\"Sure?\")' }) // or browser({ action: 'act', ... 
}) if interaction triggers it\n~~~\n\nFor `prompt` dialogs, pass `promptText` for the response.\n\n## Troubleshooting\n\n| Issue | Fix |\n|-------|-----|\n| \"Browser not installed\" | Run `aikit browser install` |\n| Element not found | `read` with `snapshot` mode first, use ref from ARIA tree |\n| Timeout on navigation | Add `waitUntil: 'networkidle'` to open/navigate |\n| SSO redirect loop | Check cookies with `session({ sessionAction: 'cookies' })` |\n| Anti-bot block | Try `mode: 'ui'`, add delays between actions |\n| Network capture empty | Ensure `enable` called BEFORE navigating |\n\n## Decision Flow\n\n~~~text\nNeed browser?\n├─ Can web_fetch/http handle it? → NO browser needed\n├─ Login wall / SSO / CAPTCHA? → browser-use (Script mode for one-off, Recipe for reusable)\n├─ Need to capture API traffic? → network enable → interact → network get\n├─ Need authenticated API calls? → fetch action (uses page session)\n├─ JS-rendered content? → open + read(markdown)\n├─ Preview local HTML file? → serve dir (npx serve) → open(http://localhost:<port>/file.html, mode: 'ui')\n├─ Form interaction? → Script mode: open → read(snapshot) → act → verify\n└─ Reusable workflow? → Recipe mode (see references/recipes.md)\n~~~\n"},{file:`references/recipes.md`,content:`# Browser Recipes & Domain Skills
+var e=[{file:`SKILL.md`,content:"---\nname: browser-use\ndescription: \"Browser automation for AI agents using AI Kit's owned `browser` MCP tool. Triggered when: (1) repo-access exhausts its Strategy Ladder and auth requires browser interaction, (2) `web_fetch` returns login page HTML, SAML redirect, or CAPTCHA instead of content, (3) user needs to interact with web applications (fill forms, click buttons, extract data), (4) a site requires JavaScript rendering that `web_fetch` cannot handle, (5) user asks to browse, scrape, test, or automate a website, or (6) another skill needs a standard recipe format for browser-driven workflows. Uses AI Kit's owned Chromium runtime and recipe patterns for domain-specific automation skills — no external MCP server dependency.\"\nmetadata:\n  category: cross-cutting\n  domain: general\n  applicability: on-demand\n  inputs: [url, auth-error, browser-task, login-wall]\n  outputs: [page-content, screenshots, extracted-data, authenticated-session, network-captures]\n  requires: []\n  relatedSkills: [repo-access, present, aikit]\nargument-hint: \"URL or browser task description\"\n---\n\n# Browser Automation for AI Agents\n\nUse AI Kit's `browser` MCP tool for authentication barriers, data extraction, form interactions, network capture, and web automation. Single tool, action-based dispatch, owned Chromium runtime.\n\n## Quick Reference\n\n**Tool:** `browser({ action: \"...\", ... 
})` — single tool, 11 actions, owned Chromium.\n\n**Actions:**\n| Action | Purpose | Key params |\n|--------|---------|------------|\n| `open` | Launch page | `url`, `mode` (ui/headless) |\n| `read` | Get page content | `pageId`, `readMode` (snapshot/dom/markdown/text) |\n| `act` | Interact with elements | `pageId`, `kind` (click/type/press/hover/drag/select/scroll/upload) |\n| `navigate` | Go to URL, back/forward, wait | `pageId`, `url` or `type` |\n| `network` | Capture network traffic | `pageId`, `subAction` (enable/get/clear) |\n| `console` | Browser console messages | `pageId`, `consoleSubAction` |\n| `fetch` | HTTP with page cookies | `pageId`, `fetchUrl` |\n| `eval` | Run JS in page context | `pageId`, `code` |\n| `screenshot` | Capture page/element | `pageId`, `fullPage?`, `selector?` |\n| `dialog` | Handle alert/confirm/prompt | `pageId`, `accept` |\n| `session` | List pages, cookies, storage | `sessionAction` |\n\n**Two modes:**\n- **Script Mode** (default) — direct sequential `browser()` calls for one-off tasks\n- **Recipe Mode** — reusable labeled step sequences for domain-specific automation\n\n**Activate when:** `web_fetch` returns login/SAML/CAPTCHA, `http` gets 401/403, anti-bot detection, need JS rendering or screenshots.\n**Skip when:** Public pages (`web_fetch` works), API endpoints (`http` works), static downloads.\n\n**⚠️ `file:///` URLs are blocked** — serve locally with `npx serve` then open `http://localhost`.\n\n## Runtime\n\n- Tool: `browser({ action: ... 
})`\n- 11 actions: `open`, `read`, `act`, `navigate`, `network`, `console`, `fetch`, `eval`, `screenshot`, `dialog`, `session`\n- Modes: `headless` (CI), `ui` (desktop), `panel` (VS Code)\n- Install: `aikit browser install`\n- Auto-idle shutdown after timeout\n\n## When to Activate\n\n- `web_fetch` returns login HTML, SAML redirect, or CAPTCHA\n- `http` returns 401/403 and user confirms browser access works\n- `repo-access` Strategy Ladder exhausted — SSO/OAuth blocks CLI\n- Anti-bot detection (Cloudflare, \"verify you are human\")\n- User asks to browse, scrape, automate, test, or interact with a web app\n- Need screenshots, accessibility snapshots, or JS-rendered content\n- Preview or inspect local HTML files (serve locally, then open with browser)\n- Need to capture network traffic or make authenticated API calls using page session\n\n## When NOT to Activate\n\n- Public pages `web_fetch` handles correctly\n- API endpoints reachable via `http` with auth headers\n- Static downloads via `http`\n- Tasks only needing raw HTML/links/outline\n\n## Two Automation Modes\n\n### Script Mode (Default — Imperative)\n\nDirect sequential `browser()` calls. Best for one-off tasks, testing, API capture.\n\n~~~text\n// Open → Read → Act → Read loop\nbrowser({ action: 'open', url: 'https://app.example.com', mode: 'ui' })\nbrowser({ action: 'read', pageId })\nbrowser({ action: 'act', pageId, kind: 'click', ref: '@login-button' })\nbrowser({ action: 'read', pageId })  // verify state changed\n~~~\n\n**Network Intelligence pattern:**\n\n~~~text\nbrowser({ action: 'network', pageId, subAction: 'enable', filter: { resourceTypes: ['xhr', 'fetch'] } })\n// ... 
navigate/interact to trigger API calls ...\nbrowser({ action: 'network', pageId, subAction: 'get' })\nbrowser({ action: 'network', pageId, subAction: 'export-har' })\n~~~\n\n**Authenticated API calls (using page cookies/session):**\n\n~~~text\nbrowser({ action: 'fetch', pageId, fetchUrl: 'https://app.example.com/api/data', fetchMethod: 'GET' })\n~~~\n\nExecutes `fetch()` in the page, so cookies, session state, and CSRF tokens are reused automatically.\n\n**Console capture:**\n\n~~~text\nbrowser({ action: 'console', pageId, consoleSubAction: 'enable' })\n// ... trigger page actions ...\nbrowser({ action: 'console', pageId, consoleSubAction: 'get', level: 'error' })\n~~~\n\n### Recipe Mode (Declarative)\n\nStructured step-by-step format for reusable workflows and domain skills. Each step declares Action, Verify, On Failure, and Extract fields.\n\nLoad [references/recipes.md](references/recipes.md) for full recipe templates and the recipe format specification.\n\nBrief recipe format:\n\n~~~text\nStep N: <description>\n  Action: browser({ ... 
})\n  Verify: <condition to check after action>\n  On Failure: <recovery strategy>\n  Extract: <data to capture for next steps>\n~~~\n\n## Action Reference\n\n| Action | Purpose | Key Params |\n|--------|---------|------------|\n| `open` | Launch page | `url`, `mode` (ui/headless/panel), `waitUntil` |\n| `read` | Extract content | `pageId`, `readMode` (snapshot/dom/markdown/text), `selector` |\n| `act` | DOM interaction | `pageId`, `kind`, `ref`/`selector`, `text`/`key`/`value` |\n| `navigate` | Page navigation | `pageId`, `url` or `type` (back/forward/reload/waitFor) |\n| `network` | Capture traffic | `pageId`, `subAction` (enable/get/clear/export-har), `filter` |\n| `console` | Capture console | `pageId`, `consoleSubAction` (enable/get/clear), `level` |\n| `fetch` | Page-context HTTP | `pageId`, `fetchUrl`, `fetchMethod`, `fetchHeaders`, `fetchBody` |\n| `eval` | Execute JS | `pageId`, `code` |\n| `screenshot` | Capture image | `pageId`, `selector`, `fullPage`, `clip`, `format` |\n| `dialog` | Pre-register handler for NEXT dialog | `pageId`, `accept`, `promptText` |\n| `session` | Manage sessions | `sessionAction` (list/close/cookies/set-cookie/get-storage/...) |\n\n## Read Modes\n\n| Mode | Output | Use Case |\n|------|--------|----------|\n| `snapshot` | ARIA accessibility tree with refs | Element targeting, form interaction |\n| `dom` | Raw HTML | HTML structure, debugging |\n| `markdown` | Clean readable text | Content extraction, summarization |\n| `text` | Plain text | Simple text extraction |\n\n## Interaction Kinds\n\n| Kind | Required Params | Notes |\n|------|-----------------|-------|\n| `click` | `ref` or `selector` | Left-click element |\n| `type` | `ref`/`selector` + `text` | Type into input/textarea |\n| `press` | `ref`/`selector` + `key` | Send key to element. Requires a target — use `ref` from snapshot or `selector`. 
|\n| `hover` | `ref`/`selector` | Trigger hover states |\n| `drag` | `fromRef`/`fromSelector` + `toRef`/`toSelector` | Drag and drop |\n| `select` | `ref`/`selector` + `value` | Select dropdown option |\n| `scroll` | optional `ref`/`selector` | Scroll page or element |\n| `upload` | `ref`/`selector` + `value` (path) | File upload |\n\n### Element Targeting Priority\n\n1. **`ref`** (e.g., `@F12`) — From `read(snapshot)` ARIA tree. Most reliable.\n2. **`selector`** (e.g., `input[name='q']`) — Playwright CSS/attribute selector. Precise.\n3. **`element`** (e.g., `'Submit'`) — Text matching via `text=` locator. **Picks first DOM match regardless of visibility.** Fragile for complex widgets (comboboxes, ARIA roles). Last resort.\n\n**Always `read(snapshot)` first** to get refs before interacting.\n\n> **Visibility Warning**: Playwright `act` waits up to 30s for the target to be visible. If a selector or `element` matches a hidden element first, the action times out. The browser tool does NOT expose a `force` or custom `timeout` parameter.\n>\n> **Workarounds:**\n> - Append `:visible` to selectors: `selector: 'button:has-text(\"Submit\"):visible'`\n> - Use specific selectors instead of `element` when labels are ambiguous (e.g., \"Search\" may match 30+ elements)\n> - Use `read(snapshot)` refs (`@F12`) which always target the specific rendered element\n\n## Network Intelligence\n\nThree new actions for API reverse-engineering and authenticated requests:\n\n**`network`** — Passive traffic capture with circular buffer (200 entries default):\n- `enable`: Start capturing with optional filter (resourceTypes, urlPattern, excludeUrls)\n- `get`: Retrieve captured requests + responses with timing\n- `clear`: Reset buffer\n- `export-har`: Export as HAR 1.2 format\n\nHeaders are redacted by default (Authorization, Cookie, etc.). 
Pass `showSensitive: true` to see full headers.\n\n**`console`** — Browser console message capture (1000 entries default):\n- `enable`: Start capturing all console output\n- `get`: Retrieve messages, optionally filtered by `level`\n- `clear`: Reset buffer\n\n**`fetch`** — Execute HTTP from page context:\n- Uses the page's live cookies, session, CSRF tokens\n- Supports GET/POST/PUT/PATCH/DELETE/HEAD/OPTIONS\n- Body auto-truncated at 256KB\n- Alternative to extracting cookies then calling `http` tool\n\n**Workflow — Reverse-engineer API:**\n\n~~~text\n1. open target page\n2. network enable (filter: xhr, fetch)\n3. interact with the page (click buttons, submit forms)\n4. network get → see API endpoints, methods, headers\n5. fetch → replay API calls using page session\n~~~\n\n## Session Management\n\n| Action | Purpose | Note |\n|--------|---------|------|\n| `cookies` | Export page cookies | `confirm: true` required |\n| `set-cookie` | Inject cookies | `confirm: true` required |\n| `delete-cookie` / `clear-cookies` | Remove cookies | `confirm: true` required |\n| `get-storage` / `set-storage` / `clear-storage` | localStorage/sessionStorage | |\n| `list` | List open pages | |\n| `close` | Close a page | |\n\n## Security Model\n\n**Hard gates — NEVER bypass:**\n- Credentials go via terminal input (NEVER through tool params or chat)\n- CAPTCHA/MFA: pause and ask user\n- Never store tokens in conversation\n- Close pages containing sensitive data when done\n- Verify page URL before entering credentials (phishing prevention)\n- Use `headless` mode for automated non-interactive tasks; `ui` for user-supervised auth\n\n**Cookie safety gate:** All cookie read/write session actions (`cookies`, `set-cookie`, `delete-cookie`, `clear-cookies`) require `confirm: true` as an explicit acknowledgment. Without it, the tool returns an error.\n\n## Local File Preview\n\nThe browser tool blocks `file:///` URLs for security. 
To preview local HTML files, serve them via a local HTTP server first.\n\n**Pattern:**\n\n~~~text\n// 1. Start local server (pick an unused port)\n//    Terminal: npx -y serve <directory> -l <port>\n//    Example: npx -y serve ./dist -l 3847\n\n// 2. Open in browser\nbrowser({ action: 'open', url: 'http://localhost:3847/my-file.html', mode: 'ui' })\n\n// 3. Read content or take screenshot\nbrowser({ action: 'read', pageId, readMode: 'markdown' })\nbrowser({ action: 'screenshot', pageId, fullPage: true })\n\n// 4. Clean up — kill the server terminal when done\n~~~\n\n**Use cases:**\n- Preview generated HTML (viewers, reports, docs)\n- Visual regression testing of local builds\n- Inspect single-file HTML applications\n- Screenshot local pages for review\n\n**Important:** Always use `mode: 'ui'` for visual preview so the user can also see and interact with the page.\n\n## Integration\n\n| Skill | Handoff Pattern |\n|-------|------------------|\n| `repo-access` | Strategy Ladder step 6 → browser-use for SSO/OAuth login |\n| `present` | `present({ format: 'browser' })` returns URL → open with browser tool |\n| `aikit` | `web_fetch` fails → browser-use activates |\n\n## Dialog Handling\n\n`dialog()` registers a **one-shot handler** for the NEXT dialog. It must be called **BEFORE** the action that triggers alert, confirm, or prompt.\n\n**Pattern:**\n~~~text\nbrowser({ action: 'dialog', pageId, accept: true })\nbrowser({ action: 'eval', pageId, code: 'confirm(\"Sure?\")' }) // or browser({ action: 'act', ... 
}) if interaction triggers it\n~~~\n\nFor `prompt` dialogs, pass `promptText` for the response.\n\n## Troubleshooting\n\n| Issue | Fix |\n|-------|-----|\n| \"Browser not installed\" | Run `aikit browser install` |\n| Element not found | `read` with `snapshot` mode first, use ref from ARIA tree |\n| Timeout on navigation | Add `waitUntil: 'networkidle'` to open/navigate |\n| SSO redirect loop | Check cookies with `session({ sessionAction: 'cookies' })` |\n| Anti-bot block | Try `mode: 'ui'`, add delays between actions |\n| Network capture empty | Ensure `enable` called BEFORE navigating |\n\n## Decision Flow\n\n~~~text\nNeed browser?\n├─ Can web_fetch/http handle it? → NO browser needed\n├─ Login wall / SSO / CAPTCHA? → browser-use (Script mode for one-off, Recipe for reusable)\n├─ Need to capture API traffic? → network enable → interact → network get\n├─ Need authenticated API calls? → fetch action (uses page session)\n├─ JS-rendered content? → open + read(markdown)\n├─ Preview local HTML file? → serve dir (npx serve) → open(http://localhost:<port>/file.html, mode: 'ui')\n├─ Form interaction? → Script mode: open → read(snapshot) → act → verify\n└─ Reusable workflow? → Recipe mode (see references/recipes.md)\n~~~\n"},{file:`references/recipes.md`,content:`# Browser Recipes & Domain Skills
 Reference file for reusable browser automation patterns. Load this when building domain-specific browser workflows.