beth-copilot (npm) 1.0.14 → 1.0.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (236)

package/CHANGELOG.md +195 -177
package/README.md +528 -185
package/bin/cli.js +47 -0
package/dist/cli/commands/doctor.e2e.test.d.ts +8 -0
package/dist/cli/commands/doctor.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/doctor.e2e.test.js +428 -0
package/dist/cli/commands/doctor.e2e.test.js.map +1 -0
package/dist/cli/commands/doctor.test.js +1 -1
package/dist/cli/commands/help.e2e.test.d.ts +9 -0
package/dist/cli/commands/help.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/help.e2e.test.js +150 -0
package/dist/cli/commands/help.e2e.test.js.map +1 -0
package/dist/cli/commands/init.test.d.ts +6 -0
package/dist/cli/commands/init.test.d.ts.map +1 -0
package/dist/cli/commands/init.test.js +289 -0
package/dist/cli/commands/init.test.js.map +1 -0
package/dist/cli/commands/mcp.e2e.test.d.ts +9 -0
package/dist/cli/commands/mcp.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/mcp.e2e.test.js +139 -0
package/dist/cli/commands/mcp.e2e.test.js.map +1 -0
package/dist/cli/commands/pipeline.e2e.test.d.ts +9 -0
package/dist/cli/commands/pipeline.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/pipeline.e2e.test.js +192 -0
package/dist/cli/commands/pipeline.e2e.test.js.map +1 -0
package/dist/cli/commands/quickstart.test.d.ts +6 -0
package/dist/cli/commands/quickstart.test.d.ts.map +1 -0
package/dist/cli/commands/quickstart.test.js +232 -0
package/dist/cli/commands/quickstart.test.js.map +1 -0
package/dist/core/agents/frontmatter.test.d.ts +8 -0
package/dist/core/agents/frontmatter.test.d.ts.map +1 -0
package/dist/core/agents/frontmatter.test.js +589 -0
package/dist/core/agents/frontmatter.test.js.map +1 -0
package/dist/core/agents/handoffs.test.d.ts +8 -0
package/dist/core/agents/handoffs.test.d.ts.map +1 -0
package/dist/core/agents/handoffs.test.js +320 -0
package/dist/core/agents/handoffs.test.js.map +1 -0
package/dist/core/agents/loader.test.js +1 -1
package/dist/core/agents/suite.test.d.ts +8 -0
package/dist/core/agents/suite.test.d.ts.map +1 -0
package/dist/core/agents/suite.test.js +207 -0
package/dist/core/agents/suite.test.js.map +1 -0
package/dist/core/agents/tools.test.d.ts +8 -0
package/dist/core/agents/tools.test.d.ts.map +1 -0
package/dist/core/agents/tools.test.js +332 -0
package/dist/core/agents/tools.test.js.map +1 -0
package/dist/core/context.d.ts +171 -0
package/dist/core/context.d.ts.map +1 -0
package/dist/core/context.js +353 -0
package/dist/core/context.js.map +1 -0
package/dist/core/context.test.d.ts +8 -0
package/dist/core/context.test.d.ts.map +1 -0
package/dist/core/context.test.js +253 -0
package/dist/core/context.test.js.map +1 -0
package/dist/core/handoffs.d.ts +151 -0
package/dist/core/handoffs.d.ts.map +1 -0
package/dist/core/handoffs.js +220 -0
package/dist/core/handoffs.js.map +1 -0
package/dist/core/handoffs.test.d.ts +8 -0
package/dist/core/handoffs.test.d.ts.map +1 -0
package/dist/core/handoffs.test.js +231 -0
package/dist/core/handoffs.test.js.map +1 -0
package/dist/core/orchestrator.d.ts +246 -0
package/dist/core/orchestrator.d.ts.map +1 -0
package/dist/core/orchestrator.js +514 -0
package/dist/core/orchestrator.js.map +1 -0
package/dist/core/orchestrator.test.d.ts +8 -0
package/dist/core/orchestrator.test.d.ts.map +1 -0
package/dist/core/orchestrator.test.js +517 -0
package/dist/core/orchestrator.test.js.map +1 -0
package/dist/core/router.d.ts +102 -0
package/dist/core/router.d.ts.map +1 -0
package/dist/core/router.js +178 -0
package/dist/core/router.js.map +1 -0
package/dist/core/router.test.d.ts +8 -0
package/dist/core/router.test.d.ts.map +1 -0
package/dist/core/router.test.js +215 -0
package/dist/core/router.test.js.map +1 -0
package/dist/index.d.ts +9 -0
package/dist/index.d.ts.map +1 -1
package/dist/index.js +7 -0
package/dist/index.js.map +1 -1
package/dist/init.test.js +288 -0
package/dist/providers/azure.d.ts +147 -0
package/dist/providers/azure.d.ts.map +1 -0
package/dist/providers/azure.js +491 -0
package/dist/providers/azure.js.map +1 -0
package/dist/providers/azure.test.d.ts +11 -0
package/dist/providers/azure.test.d.ts.map +1 -0
package/dist/providers/azure.test.js +330 -0
package/dist/providers/azure.test.js.map +1 -0
package/dist/providers/config.d.ts +87 -0
package/dist/providers/config.d.ts.map +1 -0
package/dist/providers/config.js +193 -0
package/dist/providers/config.js.map +1 -0
package/dist/providers/config.test.d.ts +7 -0
package/dist/providers/config.test.d.ts.map +1 -0
package/dist/providers/config.test.js +370 -0
package/dist/providers/config.test.js.map +1 -0
package/dist/providers/index.d.ts +18 -0
package/dist/providers/index.d.ts.map +1 -0
package/dist/providers/index.js +14 -0
package/dist/providers/index.js.map +1 -0
package/dist/providers/interface.d.ts +191 -0
package/dist/providers/interface.d.ts.map +1 -0
package/dist/providers/interface.js +94 -0
package/dist/providers/interface.js.map +1 -0
package/dist/providers/retry.d.ts +128 -0
package/dist/providers/retry.d.ts.map +1 -0
package/dist/providers/retry.js +205 -0
package/dist/providers/retry.js.map +1 -0
package/dist/providers/retry.test.d.ts +7 -0
package/dist/providers/retry.test.d.ts.map +1 -0
package/dist/providers/retry.test.js +439 -0
package/dist/providers/retry.test.js.map +1 -0
package/dist/providers/streaming.d.ts +157 -0
package/dist/providers/streaming.d.ts.map +1 -0
package/dist/providers/streaming.js +233 -0
package/dist/providers/streaming.js.map +1 -0
package/dist/providers/streaming.test.d.ts +7 -0
package/dist/providers/streaming.test.d.ts.map +1 -0
package/dist/providers/streaming.test.js +372 -0
package/dist/providers/streaming.test.js.map +1 -0
package/dist/providers/types.d.ts +209 -0
package/dist/providers/types.d.ts.map +1 -0
package/dist/providers/types.js +53 -0
package/dist/providers/types.js.map +1 -0
package/dist/providers/types.test.d.ts +7 -0
package/dist/providers/types.test.d.ts.map +1 -0
package/dist/providers/types.test.js +141 -0
package/dist/providers/types.test.js.map +1 -0
package/dist/tools/cli/beads.d.ts +27 -0
package/dist/tools/cli/beads.d.ts.map +1 -0
package/dist/tools/cli/beads.js +172 -0
package/dist/tools/cli/beads.js.map +1 -0
package/dist/tools/cli/beads.test.d.ts +8 -0
package/dist/tools/cli/beads.test.d.ts.map +1 -0
package/dist/tools/cli/beads.test.js +264 -0
package/dist/tools/cli/beads.test.js.map +1 -0
package/dist/tools/cli/editFile.d.ts +17 -0
package/dist/tools/cli/editFile.d.ts.map +1 -0
package/dist/tools/cli/editFile.js +125 -0
package/dist/tools/cli/editFile.js.map +1 -0
package/dist/tools/cli/editFile.test.d.ts +8 -0
package/dist/tools/cli/editFile.test.d.ts.map +1 -0
package/dist/tools/cli/editFile.test.js +177 -0
package/dist/tools/cli/editFile.test.js.map +1 -0
package/dist/tools/cli/readFile.d.ts +25 -0
package/dist/tools/cli/readFile.d.ts.map +1 -0
package/dist/tools/cli/readFile.js +118 -0
package/dist/tools/cli/readFile.js.map +1 -0
package/dist/tools/cli/readFile.test.d.ts +8 -0
package/dist/tools/cli/readFile.test.d.ts.map +1 -0
package/dist/tools/cli/readFile.test.js +194 -0
package/dist/tools/cli/readFile.test.js.map +1 -0
package/dist/tools/cli/search.d.ts +16 -0
package/dist/tools/cli/search.d.ts.map +1 -0
package/dist/tools/cli/search.js +261 -0
package/dist/tools/cli/search.js.map +1 -0
package/dist/tools/cli/search.test.d.ts +8 -0
package/dist/tools/cli/search.test.d.ts.map +1 -0
package/dist/tools/cli/search.test.js +172 -0
package/dist/tools/cli/search.test.js.map +1 -0
package/dist/tools/cli/subagent.d.ts +43 -0
package/dist/tools/cli/subagent.d.ts.map +1 -0
package/dist/tools/cli/subagent.js +99 -0
package/dist/tools/cli/subagent.js.map +1 -0
package/dist/tools/cli/subagent.test.d.ts +8 -0
package/dist/tools/cli/subagent.test.d.ts.map +1 -0
package/dist/tools/cli/subagent.test.js +190 -0
package/dist/tools/cli/subagent.test.js.map +1 -0
package/dist/tools/cli/terminal.d.ts +19 -0
package/dist/tools/cli/terminal.d.ts.map +1 -0
package/dist/tools/cli/terminal.js +164 -0
package/dist/tools/cli/terminal.js.map +1 -0
package/dist/tools/cli/terminal.test.d.ts +8 -0
package/dist/tools/cli/terminal.test.d.ts.map +1 -0
package/dist/tools/cli/terminal.test.js +161 -0
package/dist/tools/cli/terminal.test.js.map +1 -0
package/dist/tools/index.d.ts +25 -0
package/dist/tools/index.d.ts.map +1 -0
package/dist/tools/index.js +41 -0
package/dist/tools/index.js.map +1 -0
package/dist/tools/interface.d.ts +64 -0
package/dist/tools/interface.d.ts.map +1 -0
package/dist/tools/interface.js +37 -0
package/dist/tools/interface.js.map +1 -0
package/dist/tools/interface.test.d.ts +7 -0
package/dist/tools/interface.test.d.ts.map +1 -0
package/dist/tools/interface.test.js +179 -0
package/dist/tools/interface.test.js.map +1 -0
package/dist/tools/mcp/bridge.d.ts +48 -0
package/dist/tools/mcp/bridge.d.ts.map +1 -0
package/dist/tools/mcp/bridge.js +128 -0
package/dist/tools/mcp/bridge.js.map +1 -0
package/dist/tools/mcp/bridge.test.d.ts +8 -0
package/dist/tools/mcp/bridge.test.d.ts.map +1 -0
package/dist/tools/mcp/bridge.test.js +300 -0
package/dist/tools/mcp/bridge.test.js.map +1 -0
package/dist/tools/mcp/client.d.ts +135 -0
package/dist/tools/mcp/client.d.ts.map +1 -0
package/dist/tools/mcp/client.js +263 -0
package/dist/tools/mcp/client.js.map +1 -0
package/dist/tools/mcp/client.test.d.ts +8 -0
package/dist/tools/mcp/client.test.d.ts.map +1 -0
package/dist/tools/mcp/client.test.js +390 -0
package/dist/tools/mcp/client.test.js.map +1 -0
package/dist/tools/registry.d.ts +82 -0
package/dist/tools/registry.d.ts.map +1 -0
package/dist/tools/registry.js +99 -0
package/dist/tools/registry.js.map +1 -0
package/dist/tools/registry.test.d.ts +7 -0
package/dist/tools/registry.test.d.ts.map +1 -0
package/dist/tools/registry.test.js +199 -0
package/dist/tools/registry.test.js.map +1 -0
package/dist/tools/suite.test.d.ts +11 -0
package/dist/tools/suite.test.d.ts.map +1 -0
package/dist/tools/suite.test.js +119 -0
package/dist/tools/suite.test.js.map +1 -0
package/dist/tools/types.d.ts +75 -0
package/dist/tools/types.d.ts.map +1 -0
package/dist/tools/types.js +30 -0
package/dist/tools/types.js.map +1 -0
package/dist/tools/types.test.d.ts +7 -0
package/dist/tools/types.test.d.ts.map +1 -0
package/dist/tools/types.test.js +178 -0
package/dist/tools/types.test.js.map +1 -0
package/package.json +60 -56
package/sbom.json +3302 -8
package/templates/.github/agents/beth.agent.md +329 -329
package/templates/.github/agents/developer.agent.md +572 -572
package/templates/.github/agents/product-manager.agent.md +272 -272
package/templates/.github/agents/researcher.agent.md +338 -338
package/templates/.github/agents/security-reviewer.agent.md +465 -465
package/templates/.github/agents/tester.agent.md +496 -496
package/templates/.github/agents/ux-designer.agent.md +393 -393
package/templates/mcp.json.example +4 -0

package/README.md CHANGED

@@ -12,21 +12,91 @@ They broke her wings once. They forgot she had claws.
 ## What Is This?
-Beth is a master AI orchestrator system—a ruthless, hyper-competent coordinator that runs your development team the way Beth Dutton runs Schwartz & Meyer. No hand-holding. No excuses. Just results.
+Beth is a **multi-agent AI orchestrator** with a TypeScript runtime, CLI toolchain, MCP integrations, and agent-to-agent (A2A) delegation—all driven by a ruthless coordinator who runs your development team the way Beth Dutton runs Schwartz & Meyer.
-She commands an army of specialized agents, each with their own expertise, and she's not afraid to put them to work simultaneously while she lights a cigarette and watches the crew build production-ready code.
+She commands seven specialized agents, each with their own expertise, tools, and handoff chains. On top of the GitHub Copilot agent layer, Beth ships a **TypeScript core engine** with a full agentic loop: agent routing, conversation context management, tool calling, subagent spawning, and agent-to-agent handoffs—all backed by an Azure OpenAI LLM provider with streaming and retry.
-**She handles:**
-- Product strategy that makes competitors weep
-- Research that finds the real dirt
-- Designs so sharp they cut
-- Code that actually works
-- Security that locks the gates
-- Tests that find every weakness before your enemies do
+**The system has five execution layers:**
+| Layer | What It Does | Status |
+|-------|-------------|--------|
+| **Copilot Agents** | `.agent.md` definitions running in VS Code Agent Mode | Live |
+| **CLI Toolchain** | `beth init`, `beth doctor`, `beth quickstart` — TypeScript commands | Live |
+| **Orchestration Engine** | Fan-out routing, tool calling loop, subagent spawning, handoffs | Live |
+| **Tool Abstraction** | 6 CLI tools + MCP bridge — uniform interface for all agent capabilities | Live |
+| **LLM Provider** | Azure OpenAI with Entra ID auth, streaming, retry, tool calling | Live |
+**814 tests.** 813 pass, 1 skip, 0 fail.
+---
+## Architecture
+```mermaid
+flowchart TB
+    subgraph Input["Entry Points"]
+        Copilot["VS Code Copilot Chat"]
+        CLI["Beth CLI"]
+    end
+    subgraph Engine["Orchestration Engine"]
+        Orch["Orchestrator<br/><i>Route → LLM → Tools → Response</i>"]
+    end
+    subgraph Agents["Specialist Agents"]
+        Beth["@Beth"]
+        PM["@product-manager"]
+        UX["@ux-designer"]
+        Dev["@developer"]
+        Sec["@security-reviewer"]
+        Test["@tester"]
+        Res["@researcher"]
+    end
+    subgraph Capabilities["Capabilities"]
+        Tools["Tools<br/><i>files · terminal · search · beads</i>"]
+        Skills["Skills<br/><i>PRD · React · shadcn · security</i>"]
+        MCPs["MCP Servers<br/><i>shadcn · Playwright · Azure</i>"]
+    end
+    LLM["Azure OpenAI<br/><i>Entra ID · Streaming</i>"]
+    Copilot & CLI --> Orch
+    Orch --> Beth
+    Beth -->|"delegates"| PM & UX & Dev & Sec & Test & Res
+    Orch <-->|"chat"| LLM
+    Orch --> Tools & Skills & MCPs
+    style Beth fill:#1e3a5f,color:#fff
+    style Engine fill:#fff3e0
+    style Capabilities fill:#e3f2fd
+```
+---
+## Tech Stack
+| Category | Technology | Notes |
+|----------|-----------|-------|
+| **Runtime** | Node.js ≥ 18 | ES modules, built-in test runner |
+| **Language** | TypeScript (strict mode) | No `any`. Zod for runtime validation |
+| **Target Framework** | React 19 + Next.js App Router | Server Components, Server Actions, Suspense, streaming |
+| **Styling** | Tailwind CSS + `class-variance-authority` (cva) | Utility-first with typed variants |
+| **Components** | shadcn/ui | Radix primitives, copy-paste ownership |
+| **LLM Provider** | Azure OpenAI via `openai` SDK | Entra ID auth (no API keys), streaming + tool calling |
+| **Auth** | `@azure/identity` DefaultAzureCredential | az login, managed identity, VS Code creds |
+| **Frontmatter** | `gray-matter` | Parses `.agent.md` and `SKILL.md` YAML |
+| **Testing** | Node.js built-in test runner | 814 tests — unit, integration, E2E |
+| **Task Tracking** | beads (`bd` CLI) | Dependency-aware issue tracking for agents |
+| **Package Manager** | pnpm | Lockfile committed |
+**Production dependencies:** 1 (`gray-matter`). That's it. Minimal attack surface by design.
+---
 ## Getting Started
-**Project scope:**
+**One command:**
 ```bash
 npx beth-copilot init
 ```
@@ -39,87 +109,97 @@ beth init
 Then open VS Code, switch Copilot Chat to **Agent mode**, and type `@Beth`.
-For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
+**Verify everything works:**
+```bash
+beth doctor       # Health check: Node.js, beads, agents, skills
+beth quickstart   # Init + doctor + beads setup in one shot
+```
-## The Family
+For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
-Beth doesn't work alone. She's got people—loyal, skilled, and ready to execute.
+---
-| Agent | Role | What They Do |
-|-------|------|--------------|
-| **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
-| **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics. |
-| **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt. |
-| **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility. |
-| **@developer** | The Builder | React/TypeScript/Next.js - UI and full-stack. Gets it done. |
-| **@tester** | The Enforcer | Quality assurance, accessibility, performance. Finds every crack. |
-| **@security-reviewer** | The Bodyguard | Enterprise security. Vulnerabilities, compliance, threat modeling. |
+## CLI Commands
-### Product Manager vs UX Designer
+| Command | What It Does |
+|---------|-------------|
+| `beth init` | Install agents, skills, VS Code settings, beads tracking |
+| `beth init --force` | Overwrite existing files |
+| `beth doctor` | Validate Node.js ≥18, beads CLI, agents frontmatter, skills directories |
+| `beth quickstart` | Run init + doctor + beads init in one shot |
+| `beth help` | Show all commands and options |
-| | Product Manager | UX Designer |
-|---|---|---|
-| **Focus** | WHAT to build, WHY, WHEN | HOW it looks, feels, behaves |
-| **Outputs** | PRDs, user stories, priorities | Component specs, design tokens, accessibility |
-| **Example** | "Users need date filtering" | "Date picker: variants, states, ARIA" |
+**Flags:** `--force`, `--skip-backlog`, `--skip-mcp`, `--skip-beads`, `--verbose`
-## Skills (The Weapons)
+---
-Beth's team comes equipped:
+## Agent-to-Agent (A2A) Orchestration
-| Skill | Purpose |
-|-------|---------|
-| **PRD Generation** | Write requirements docs that don't waste anyone's time |
-| **Framer Components** | Build custom React components with property controls |
-| **React/Next.js Best Practices** | Vercel-grade performance patterns |
-| **Web Design Guidelines** | WCAG compliance, UI review, accessibility |
-| **shadcn/ui** | Component library patterns, installation, and best practices |
-| **Security Analysis** | OWASP, threat modeling, vulnerability assessment |
+Beth doesn't micromanage. She delegates to specialists over **subagent** and **handoff** channels, tracks dependencies with beads, and holds every agent accountable.
-## How Beth Works
+### The Family
-She doesn't micromanage. She delegates to specialists and holds them accountable.
+| Agent | Role | What They Do |
+|-------|------|--------------|
+| **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
+| **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics |
+| **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt |
+| **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility |
+| **@developer** | The Builder | React/TypeScript/Next.js — UI and full-stack |
+| **@tester** | The Enforcer | Quality assurance, accessibility, performance |
+| **@security-reviewer** | The Bodyguard | OWASP, compliance, threat modeling |
-### Architecture
+### A2A Delegation Model
 ```mermaid
 flowchart TB
-    subgraph User["👤 User"]
-        Request[User Request]
+    subgraph Orchestration["Beth Orchestration Layer"]
+        BethCore["@Beth<br/><i>Routes work · Spawns subagents</i>"]
     end
-    subgraph Orchestrator["🎯 Beth - The Orchestrator"]
-        Beth["@Beth<br/><i>'I don't speak dipshit'</i>"]
-        Assess[Assess Request]
-        Plan[Plan Workflow]
-        Route[Route to Specialists]
+    subgraph Specialists["Specialist Agents"]
+        PM["@product-manager<br/>Requirements · Priorities"]
+        R["@researcher<br/>User insights · Market intel"]
+        UX["@ux-designer<br/>Component specs · Design tokens"]
+        D["@developer<br/>React/TS/Next.js · Implementation"]
+        S["@security-reviewer<br/>Threat modeling · Vulnerabilities"]
+        T["@tester<br/>QA · a11y · Performance"]
     end
-    subgraph Agents["🧑‍💼 Specialist Agents"]
-        PM["@product-manager<br/>WHAT to build"]
-        Researcher["@researcher<br/>User/Market Intel"]
-        Designer["@ux-designer<br/>HOW it works"]
-        Developer["@developer<br/>Implementation"]
-        Security["@security-reviewer<br/>Protection"]
-        Tester["@tester<br/>Quality Gate"]
-    end
+    BethCore -->|"Product Strategy"| PM
+    BethCore -->|"User Research"| R
+    BethCore -->|"UX Design"| UX
+    BethCore -->|"Development"| D
+    BethCore -->|"Security Review"| S
+    BethCore -->|"Quality Assurance"| T
+    PM -.->|"subagent"| R
+    PM -.->|"subagent"| UX
+    UX -.->|"subagent"| D
+    D -.->|"subagent"| T
+    S -.->|"subagent"| D
+    T -.->|"subagent"| D
+    style BethCore fill:#1e3a5f,color:#fff
+```
-    Request --> Beth
-    Beth --> Assess --> Plan --> Route
-    Route --> PM
-    Route --> Researcher
-    Route --> Designer
-    Route --> Developer
-    Route --> Security
-    Route --> Tester
+### Subagent vs Handoff
-    style Beth fill:#1e3a5f,color:#fff
-    style Orchestrator fill:#f0f4f8
-    style Agents fill:#f8f4f0
+| Mechanism | Control | Use When |
+|-----------|---------|----------|
+| **Subagent** | Beth decides | Task can run autonomously, no human review needed |
+| **Handoff** | User decides | User needs to review before proceeding |
+```typescript
+// Beth spawns a specialist — autonomous execution
+runSubagent({
+  agentName: "developer",
+  prompt: "Implement JWT auth flow with refresh token rotation...",
+  description: "Implement auth"
+})
 ```
-### The Workflow
+### Workflow: New Feature
 ```mermaid
 sequenceDiagram
@@ -133,125 +213,327 @@ sequenceDiagram
     U->>B: "Build me a feature"
     B->>B: Assess & Plan
     B->>PM: Define requirements
-    PM-->>B: Requirements ready
+    PM-->>B: PRD + user stories
     B->>UX: Design the experience
-    UX-->>B: Design specs ready
+    UX-->>B: Component specs + tokens
     B->>D: Implement feature
     D-->>B: Implementation complete
-    B->>S: Security review
-    S-->>B: Security approved
-    B->>T: Test & verify
-    T-->>B: Quality verified
+    par Parallel quality gates
+        B->>S: Security review
+        S-->>B: OWASP approved
+    and
+        B->>T: Test & verify
+        T-->>B: a11y + regression pass
+    end
     B->>U: Feature complete ✅
 ```
-**Bug Hunt?** Tester → Developer → Security → Tester
-**Security Audit?** Security → Developer → Tester → Security
+**Bug Hunt?** Tester → Developer → Security → Tester
+**Security Audit?** Security → Developer → Tester → Security sign-off
+---
+## MCP Integrations
+Model Context Protocol servers extend agent capabilities. All **optional** — agents gracefully degrade without them.
+| Server | Agent | Capability |
+|--------|-------|-----------|
+| **shadcn/ui** | Developer | Component browsing & installation |
+| **Playwright** | Tester | Browser automation, E2E testing |
+| **Azure** | Developer, Security | Cloud resource management |
+| **Brave Search** | Researcher | Internet research |
+| **DeepWiki** | All | Repository documentation lookup |
+### Quick Setup
+```bash
+# Copy example config and enable what you need
+cp mcp.json.example .vscode/mcp.json
+```
+```json
+{
+  "servers": {
+    "shadcn":     { "command": "npx", "args": ["shadcn@latest", "mcp"] },
+    "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
+    "azure":      { "command": "npx", "args": ["@azure/mcp-server"] },
+    "web-search": { "command": "npx", "args": ["@brave/brave-search-mcp-server"] },
+    "deepwiki":   { "url": "https://mcp.deepwiki.com/mcp" }
+  }
+}
+```
+Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
+---
+## Skills (On-Demand Knowledge)
+Skills are domain-knowledge modules that agents load automatically when trigger phrases match. Each skill lives in `.github/skills/<name>/SKILL.md`.
-### Agent Delegation
+| Skill | Triggers On | Used By |
+|-------|------------|---------|
+| **PRD Generation** | "create a prd", "product requirements" | Product Manager |
+| **Framer Components** | "framer component", "property controls" | UX Designer |
+| **React/Next.js Best Practices** | React performance, Next.js patterns | Developer |
+| **Web Design Guidelines** | "review my UI", "check accessibility" | UX Designer |
+| **shadcn/ui** | "shadcn", "ui component" | Developer |
+| **Security Analysis** | "security review", "OWASP", "threat model" | Security Reviewer |
+| **Azure Operations** | Azure resource management | Developer |
+| **Web Search** | Internet research via Brave | Researcher |
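
The trigger matching described in the skills table can be pictured roughly as follows. This is a minimal sketch, not the package's loader: `SkillSketch`, `matchSkills`, and `exampleSkills` are hypothetical names, and plain case-insensitive substring matching is an assumption about how triggers are compared.

```typescript
// Hypothetical shape: the package's SkillDefinition and TriggerMap types may differ.
interface SkillSketch {
  name: string;
  triggers: string[]; // phrases such as "create a prd"
}

// Return the names of every skill with at least one non-empty trigger phrase
// that appears (case-insensitively) in the user's message.
function matchSkills(message: string, skills: readonly SkillSketch[]): string[] {
  const text = message.toLowerCase();
  return skills
    .filter((skill) =>
      skill.triggers.some((t) => t.length > 0 && text.indexOf(t.toLowerCase()) !== -1),
    )
    .map((skill) => skill.name);
}

// Trigger phrases copied from the skills table; the data structure is illustrative.
const exampleSkills: SkillSketch[] = [
  { name: "PRD Generation", triggers: ["create a prd", "product requirements"] },
  { name: "Security Analysis", triggers: ["security review", "OWASP", "threat model"] },
];

console.log(matchSkills("Please run an OWASP check and create a PRD", exampleSkills));
```

Substring matching is the simplest policy that fits the table; a real matcher might tokenize or rank matches instead.
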
+---
+## Orchestration Engine (Fan-Out Pattern)
+The orchestration engine is Beth's brain — the full agentic loop that processes user messages through routing, skill injection, LLM calls, tool execution, and subagent spawning.
 ```mermaid
 flowchart TB
-    subgraph Beth["Beth (Orchestrator)"]
-        BethCore["Routes all work<br/>Spawns subagents"]
-    end
+    User["User Message"] --> Route["AgentRouter\n@mention · skill match · default"]
+    Route --> Context["ConversationContext\nBuild system prompt + history"]
+    Context --> Skills{"Skill triggers match?"}
+    Skills -->|yes| Inject["Inject skill into system prompt"]
+    Skills -->|no| LLM
+    Inject --> LLM["LLM Call\nAzure OpenAI"]
+    LLM --> Decision{"Response type?"}
+    Decision -->|text| Done["Return response"]
+    Decision -->|tool calls| ToolExec["Execute tools\nvia ToolRegistry"]
+    ToolExec --> SubCheck{"Subagent request?"}
+    SubCheck -->|yes| SubAgent["Spawn child loop\ndepth-limited"]
+    SubCheck -->|no| ToolResult["Return tool result"]
+    SubAgent --> ToolResult
+    ToolResult --> LLM
+    Decision -->|handoff| Handoff["HandoffManager\nContext transfer"]
+    Handoff --> Route
+    style User fill:#1e3a5f,color:#fff
+    style LLM fill:#e8f5e9
+    style ToolExec fill:#e3f2fd
+    style SubAgent fill:#fff3e0
+```
-    subgraph PM["Product Manager"]
-        PMCore["Requirements<br/>Priorities"]
-    end
+**Key capabilities:**
+- **Agent routing** — `@mention` parsing, skill trigger matching, current-agent stickiness
+- **Fan-out tool calling** — Iterative LLM → tool call → result → LLM loop (up to 25 iterations)
+- **Subagent spawning** — Nested agent loops with depth limiting (default: 3 levels deep)
+- **Handoff management** — Context transfer between agents with conversation summaries, ping-pong loop detection
+- **Context window management** — Token-estimated truncation with tool call/result consistency repair
+- **Observer callbacks** — Hook into routing decisions, LLM calls, tool executions, handoffs for logging/UI
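
The iterative LLM → tool call → result → LLM loop from the list above can be sketched as a bounded loop. Everything here is an assumption made for illustration: `runToolLoop`, `ModelReply`, `ToolCall`, and `Message` are hypothetical names rather than the package's `Orchestrator` API, and the callbacks are synchronous for brevity where a real provider call would be asynchronous. Only the cap of 25 iterations comes from the README.

```typescript
// Hypothetical types: the package's own message and tool-call types may differ.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}
type ModelReply =
  | { kind: "text"; text: string }
  | { kind: "toolCalls"; calls: ToolCall[] };
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Call the model repeatedly, feeding tool results back into the history,
// until it answers with plain text or the iteration cap is reached.
function runToolLoop(
  userMessage: string,
  callModel: (history: readonly Message[]) => ModelReply,
  runTool: (call: ToolCall) => string,
  maxIterations = 25, // cap stated in the README
): { response: string; iterations: number } {
  const history: Message[] = [{ role: "user", content: userMessage }];
  for (let i = 1; i <= maxIterations; i++) {
    const reply = callModel(history);
    if (reply.kind === "text") {
      return { response: reply.text, iterations: i };
    }
    // Execute each requested tool and append its result for the next model call.
    for (const call of reply.calls) {
      history.push({ role: "tool", content: `${call.name}: ${runTool(call)}` });
    }
  }
  throw new Error(`No final response after ${maxIterations} iterations`);
}
```

Throwing at the cap is one possible policy; returning a partial result would be an equally plausible choice.
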
+```typescript
+// Full orchestrator usage
+import { Orchestrator, createDefaultRegistry } from 'beth-copilot';
+const orchestrator = new Orchestrator({
+  agents: loadAgents('.github/agents'),
+  skills: loadSkills('.github/skills'),
+  provider: new AzureOpenAIProvider(config),
+  toolRegistry: createDefaultRegistry(),
+  toolContext: { workingDir: process.cwd(), permissions: { ... } },
+});
+const result = await orchestrator.processMessage('Implement the login page');
+// result.response — final text
+// result.agentId — who handled it
+// result.toolCallsExecuted — what tools ran
+// result.subagentResults — any nested agent work
+// result.injectedSkills — skills loaded for this turn
+```
-    subgraph R["Researcher"]
-        RCore["User insights<br/>Market intel"]
-    end
+---
-    subgraph UX["UX Designer"]
-        UXCore["Component specs<br/>Design tokens"]
-    end
+## Tool Abstraction Layer
-    subgraph D["Developer"]
-        DCore["React/TS/Next.js<br/>Implementation"]
-    end
+A uniform interface for all agent capabilities — file I/O, terminal, search, beads, subagent spawning, and MCP server tools. Tools expose OpenAI-compatible function calling schemas so the LLM can invoke them directly.
+| Tool | What It Does | Key Features |
+|------|-------------|-------------- |
+| **readFile** | Read file contents | Line ranges, path validation, traversal guards |
+| **editFile** | Atomic string replacement | Single-match enforcement, whitespace-safe |
+| **search** | Ripgrep search | Node.js fallback, regex support, file filtering |
+| **terminal** | Execute shell commands | `execFile('/bin/sh')` — no shell injection, timeouts |
+| **beads** | Issue tracking | `bd create`, `bd close`, `bd list` via CLI wrapper |
+| **subagent** | Spawn nested agents | Returns structured result for orchestrator to process |
+| **MCP Bridge** | External tool servers | JSON-RPC 2.0 over stdio, JSONC config, namespaced tools |
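
As an illustration of what "single-match enforcement" for `editFile` could mean, the sketch below replaces a string only when it occurs exactly once. `replaceExactlyOnce` is a hypothetical helper and not the package's implementation. It works on an in-memory string and leaves out file I/O and atomic writes.

```typescript
// Hypothetical helper: replaces `oldText` with `newText` only if `oldText`
// occurs exactly once in `content`; otherwise it throws and changes nothing.
function replaceExactlyOnce(content: string, oldText: string, newText: string): string {
  if (oldText.length === 0) {
    throw new Error("oldText must not be empty");
  }
  const first = content.indexOf(oldText);
  if (first === -1) {
    throw new Error("oldText not found");
  }
  // Searching from first + 1 also catches overlapping second matches.
  if (content.indexOf(oldText, first + 1) !== -1) {
    throw new Error("oldText matches more than once");
  }
  // Slicing avoids the special "$" patterns that String.prototype.replace interprets.
  return content.slice(0, first) + newText + content.slice(first + oldText.length);
}
```

Refusing ambiguous matches forces the caller to supply enough surrounding context to identify one location.
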
+```typescript
+import { createDefaultRegistry, ToolRegistry, loadAllMCPTools } from 'beth-copilot';
+// Built-in tools
+const registry = createDefaultRegistry();
+// → readFile, editFile, search, terminal, beads, subagent
-    subgraph S["Security"]
-        SCore["Threat modeling<br/>Vulnerabilities"]
+// Add MCP server tools
+const { tools: mcpTools } = await loadAllMCPTools('.vscode/mcp.json');
+for (const tool of mcpTools) {
+  registry.register(tool); // e.g., mcp_shadcn_listComponents
+}
+// Get OpenAI function calling definitions
+const definitions = registry.getDefinitions();
+// Pass to LLM as tools parameter
+```
+---
+## LLM Provider Layer
+The TypeScript core includes a production-ready provider abstraction for running Beth outside VS Code.
+```mermaid
+flowchart LR
+    subgraph Config["Configuration"]
+        Env["process.env"]
+        DotEnv["~/.beth/.env"]
     end
-    subgraph T["Tester"]
-        TCore["QA & a11y<br/>Performance"]
+    subgraph Auth["Authentication"]
+        Entra["Entra ID<br/><i>DefaultAzureCredential</i>"]
     end
-    BethCore -->|"Product Strategy"| PMCore
-    BethCore -->|"User Research"| RCore
-    BethCore -->|"UX Design"| UXCore
-    BethCore -->|"Development"| DCore
-    BethCore -->|"Security Review"| SCore
-    BethCore -->|"Quality Assurance"| TCore
-    PMCore -.->|"subagent"| RCore
-    PMCore -.->|"subagent"| UXCore
-    UXCore -.->|"subagent"| DCore
-    DCore -.->|"subagent"| TCore
-    SCore -.->|"subagent"| DCore
-```
+    subgraph Provider["Provider"]
+        Base["LLMProviderBase<br/><i>Abstract interface</i>"]
+        AzureOAI["AzureOpenAIProvider<br/><i>chat · chatStream · countTokens</i>"]
+    end
-## Quick Commands
+    subgraph Resilience["Resilience"]
+        RetryMod["Exponential Backoff<br/><i>Jitter · 3 retries</i>"]
+        Errors["LLMError<br/><i>Typed error codes</i>"]
+    end
-Don't waste her time. Be direct.
+    subgraph Streaming["Streaming"]
+        Accum["StreamAccumulator<br/><i>Content + tool call assembly</i>"]
+        Collect["collectStream<br/><i>Full response</i>"]
+        Map["mapStream<br/><i>Transform chunks</i>"]
+    end
-```
-@Beth Build me a dashboard for user analytics with real-time updates.
+    Env --> AzureOAI
+    DotEnv --> AzureOAI
+    Entra --> AzureOAI
+    Base --> AzureOAI
+    RetryMod --> AzureOAI
+    AzureOAI --> Accum
+    AzureOAI --> Collect
+    Errors --> RetryMod
 ```
-```
-@Beth Security review for our authentication flow. Find the holes.
-```
+**Key capabilities:**
+- **Entra ID auth** — No API keys. Uses `DefaultAzureCredential` (az login, managed identity, VS Code creds)
+- **Streaming** — `chatStream()` yields `ChatChunk` objects with incremental tool call delta assembly
+- **Retry** — Exponential backoff with jitter for 429/5xx/network errors. Non-transient errors fail fast
+- **Config** — `process.env` → `~/.beth/.env` precedence chain
+- **193 provider tests** covering types, retry, config, streaming, and Azure client
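
The retry behaviour in the list above (exponential backoff with jitter, transient errors retried, everything else failing fast) might look roughly like this. It is a sketch under stated assumptions and not the package's retry module: `RetryOptions`, `isTransientStatus`, `backoffDelayMs`, and `withRetry` are hypothetical names, the delay values are assumed, "full jitter" is one of several jitter strategies, and network errors without an HTTP status are not handled here.

```typescript
// Hypothetical options: only the count of 3 retries comes from the README.
interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number; // assumed
  maxDelayMs: number; // assumed
}

// Treat HTTP 429 and 5xx as transient.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Full jitter: a random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry `operation` on transient errors; rethrow immediately on anything else.
async function withRetry<T>(
  operation: () => Promise<T>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Assumes errors carry a numeric `status` field, as many HTTP clients do.
      const status = (err as { status?: unknown } | null)?.status;
      const transient = typeof status === "number" && isTransientStatus(status);
      if (!transient || attempt >= opts.maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

Passing `random` and `sleep` as parameters keeps the delay calculation deterministic under test.
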
-```
-@developer Implement a drag-and-drop task board. Make it fast.
-```
+---
-```
-@security-reviewer OWASP top 10 assessment on our API endpoints.
-```
+## TypeScript Core
-```
-@tester Accessibility audit. WCAG 2.1 AA. No excuses.
-```
+The engine that powers everything. Parses agent and skill definitions, manages conversations, routes requests, executes tools, and provides typed APIs for the full agentic loop.
-## The Structure
+### Project Structure
 ```
-.github/
-├── agents/                          # The crew
-│   ├── beth.agent.md               # The boss herself
-│   ├── product-manager.agent.md
-│   ├── researcher.agent.md
-│   ├── ux-designer.agent.md
-│   ├── developer.agent.md          # UI + full-stack
-│   ├── tester.agent.md
-│   └── security-reviewer.agent.md  # Enterprise security
-├── skills/                          # Domain expertise
-│   ├── prd/
-│   ├── framer-components/
-│   ├── vercel-react-best-practices/
-│   ├── web-design-guidelines/
-│   └── security-analysis/          # New: security skill
-└── copilot-instructions.md         # The rules of engagement
+beth/
+├── bin/
+│   └── cli.js                      # CLI entry point (init, doctor, quickstart, help)
+├── src/
+│   ├── index.ts                    # Barrel exports (all public API)
+│   ├── cli/commands/
+│   │   ├── doctor.ts               # System health validation
+│   │   └── quickstart.ts           # Guided setup flow
+│   ├── core/
+│   │   ├── orchestrator.ts         # Agentic loop: route → LLM → tools → response
+│   │   ├── router.ts               # @mention routing, skill matching, agent lookup
+│   │   ├── context.ts              # Conversation state, token truncation, skill injection
+│   │   ├── handoffs.ts             # Agent-to-agent transfers, loop detection
+│   │   ├── agents/
+│   │   │   ├── types.ts            # AgentDefinition, AgentFrontmatter, AgentHandoff
+│   │   │   └── loader.ts           # Parse .agent.md → typed definitions
+│   │   └── skills/
+│   │       ├── types.ts            # SkillDefinition, TriggerMap
+│   │       └── loader.ts           # Parse SKILL.md, extract triggers, match queries
+│   ├── lib/
+│   │   └── pathValidation.ts       # Traversal/injection guards
+│   ├── tools/
+│   │   ├── interface.ts            # Tool interface + toToolDefinition()
+│   │   ├── types.ts                # ToolError, ToolResult, ToolContext, ToolPermissions
+│   │   ├── registry.ts             # ToolRegistry: register, get, list, getDefinitions
+│   │   ├── cli/
+│   │   │   ├── readFile.ts         # File reading with line ranges
+│   │   │   ├── editFile.ts         # Atomic string replacement
+│   │   │   ├── search.ts           # Ripgrep with Node.js fallback
+│   │   │   ├── terminal.ts         # Secure command execution
+│   │   │   ├── beads.ts            # Issue tracking via bd CLI
+│   │   │   └── subagent.ts         # Agent spawning interface
+│   │   └── mcp/
+│   │       ├── client.ts           # JSON-RPC 2.0 over stdio
+│   │       └── bridge.ts           # JSONC config, tool namespacing
+│   └── providers/
+│       ├── interface.ts            # LLMProviderBase abstract class
+│       ├── azure.ts                # AzureOpenAIProvider (Entra ID, streaming, tools)
+│       ├── types.ts                # 17 types: ChatMessage, ToolCall, LLMError, etc.
+│       ├── retry.ts                # Exponential backoff with jitter
+│       ├── config.ts               # Environment + dotfile config loader
+│       └── streaming.ts            # StreamAccumulator, collectStream, mapStream
+├── templates/
+│   └── .github/
+│       ├── agents/                 # 7 agent definitions (.agent.md)
+│       └── skills/                 # 8 skill modules (SKILL.md)
+└── docs/
+    ├── INSTALLATION.md
+    ├── MCP-SETUP.md
+    ├── CLI-ARCHITECTURE.md
+    └── SYSTEM-FLOW.md
 ```
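
The tree above lists a `registry.ts` with `register`, `get`, `list`, and `getDefinitions`, plus a `toToolDefinition()` helper in `interface.ts`. A minimal sketch of how such a registry might fit together follows. The class name `SimpleToolRegistry`, the field names on `SketchTool`, and the function-call shape of `SketchToolDefinition` are hypothetical and do not reflect the package's actual types.

```typescript
// Hypothetical sketch of a tool registry; not the package's ToolRegistry.
interface SketchTool {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // assumed to hold a JSON Schema object
}

// Assumed LLM-facing shape, modeled on common function-calling formats.
interface SketchToolDefinition {
  type: "function";
  function: { name: string; description: string; parameters: Record<string, unknown> };
}

// Converts an internal tool into the definition sent to the model.
function toSketchDefinition(tool: SketchTool): SketchToolDefinition {
  return {
    type: "function",
    function: { name: tool.name, description: tool.description, parameters: tool.parameters },
  };
}

class SimpleToolRegistry {
  private readonly tools = new Map<string, SketchTool>();

  // Rejects duplicates so two tools can never shadow each other by name.
  register(tool: SketchTool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  get(name: string): SketchTool | undefined {
    return this.tools.get(name);
  }

  // Names in registration order (Map preserves insertion order).
  list(): string[] {
    return Array.from(this.tools.keys());
  }

  getDefinitions(): SketchToolDefinition[] {
    return Array.from(this.tools.values()).map(toSketchDefinition);
  }
}
```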
-## Her Philosophy
-Beth operates on a few principles:
+### Test Coverage
+**814 tests** (813 pass, 1 skip, 0 fail):
+| Suite | Tests | What It Covers |
+|-------|-------|---------------|
+| **Orchestration** | | |
+| Orchestrator | 30+ | Agentic loop, tool calling, subagent spawning, iteration limits |
+| AgentRouter | 30+ | @mention routing, skill matching, agent resolution |
+| ConversationContext | 30+ | Token truncation, skill injection, tool call repair |
+| HandoffManager | 30+ | Context transfer, depth limits, ping-pong detection |
+| **Tools** | | |
+| Tool interface | 20+ | Tool → ToolDefinition conversion, schema validation |
+| ToolRegistry | 20+ | Register, get, list, definitions, duplicate detection |
+| readFile | 30+ | Line ranges, path validation, encoding |
+| editFile | 30+ | String replacement, single-match enforcement |
+| search | 30+ | Ripgrep, Node.js fallback, regex, file filtering |
+| terminal | 30+ | Command execution, timeouts, output capture |
+| beads | 30+ | bd CLI wrapper, create/close/list/ready |
+| subagent | 30+ | Spawn interface, result marking, agent validation |
+| MCP client | 30+ | JSON-RPC 2.0, protocol handshake, tool listing |
+| MCP bridge | 30+ | JSONC parsing, tool namespacing, error handling |
+| Tool suite | 10+ | createDefaultRegistry, integration tests |
+| **Providers** | | |
+| Provider types | 40+ | LLMError codes, ChatMessage shapes, ToolDefinition schemas |
+| Provider retry | 40+ | Exponential backoff, jitter, transient error detection |
+| Provider config | 30+ | Env precedence, dotenv parsing, URL validation |
+| Provider streaming | 40+ | Chunk accumulation, tool call delta assembly |
+| Provider Azure | 30+ | Message mapping, response mapping, error wrapping |
+| **Core & CLI** | | |
+| Agent loader | 30+ | Frontmatter parsing, validation, code fence stripping, handoffs |
+| Skill loader | 30+ | Trigger extraction, query matching, trigger map building |
+| CLI E2E | 52 | Init/doctor pipeline, MCP template validation, help output |
+| Path validation | 33 | Traversal detection, injection prevention, allowlists |
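
The `editFile` row above mentions string replacement with single-match enforcement. One way to express that rule is sketched below: the edit is applied only when the target text occurs exactly once, so an ambiguous or stale edit is rejected instead of silently changing the wrong place. The function names and error messages are hypothetical, and the real tool also handles file I/O and atomic writes, which this sketch leaves out.

```typescript
// Hypothetical sketch of single-match string replacement; operates on in-memory text only.

// Counts non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  if (needle.length === 0) return 0;
  let count = 0;
  let from = 0;
  while (true) {
    const idx = haystack.indexOf(needle, from);
    if (idx === -1) return count;
    count++;
    from = idx + needle.length;
  }
}

// Replaces oldText with newText only if oldText occurs exactly once.
function replaceExactlyOnce(content: string, oldText: string, newText: string): string {
  if (oldText.length === 0) {
    throw new Error("oldText must not be empty");
  }
  const matches = countOccurrences(content, oldText);
  if (matches !== 1) {
    throw new Error(`Expected exactly 1 match, found ${matches}`);
  }
  const idx = content.indexOf(oldText);
  // Slicing avoids String.prototype.replace, which would interpret "$&" and
  // similar patterns inside newText.
  return content.slice(0, idx) + newText + content.slice(idx + oldText.length);
}
```

Requiring a unique match forces the caller to include enough surrounding context in `oldText` to pin down one location.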
-1. **Protect the family** — Your codebase is the ranch. She defends it.
-2. **No weakness** — Tests, security, accessibility. Cover every flank.
-3. **Move fast, break enemies** — Parallel execution, aggressive timelines.
-4. **Loyalty earns trust** — Agents that perform get the good work.
+---
-### IDEO Design Thinking
+## IDEO Design Thinking
-Beth follows human-centered design methodology:
+Beth follows human-centered design methodology across agent workflows:
 ```mermaid
 flowchart LR
@@ -277,35 +559,38 @@ flowchart LR
     E --> D --> I --> P --> T
     T -.->|iterate| E
-    T -.->|iterate| D
     T -.->|iterate| I
 ```
+---
 ## Quality Standards
 Beth doesn't ship garbage:
-- **Accessibility**: WCAG 2.1 AA minimum. Everyone uses the product.
-- **Performance**: Core Web Vitals green. LCP < 2.5s. No exceptions.
-- **Security**: OWASP compliant. Regular audits. Zero tolerance for vulnerabilities.
-- **Type Safety**: Full TypeScript coverage. No `any` unless you want a lecture.
-- **Test Coverage**: Unit, integration, E2E. If it's not tested, it's not done.
+| Standard | Gate | Enforced By |
+|----------|------|-------------|
+| **WCAG 2.1 AA** | Accessibility compliance | UX Designer + Tester |
+| **Core Web Vitals** | LCP < 2.5s, FID < 100ms, CLS < 0.1 | Developer |
+| **OWASP Top 10** | Zero known vulnerabilities | Security Reviewer |
+| **TypeScript Strict** | No `any` | Developer |
+| **Test Coverage** | Unit + Integration + E2E | Tester |
 ```mermaid
 flowchart TB
     subgraph Standards["Quality Standards"]
-        A11y["WCAG 2.1 AA<br/>Accessibility"]
-        Perf["Core Web Vitals<br/>LCP < 2.5s"]
-        Sec["OWASP Compliant<br/>Zero vulnerabilities"]
-        Type["Full TypeScript<br/>No any"]
-        Coverage["Test Coverage<br/>Unit + Integration + E2E"]
+        A11y["WCAG 2.1 AA"]
+        Perf["Core Web Vitals"]
+        Sec["OWASP Compliant"]
+        Type["Full TypeScript"]
+        Coverage["Test Coverage"]
     end
     subgraph Gates["Enforcement"]
-        Designer["UX Designer<br/>reviews a11y specs"]
-        Developer["Developer<br/>implements patterns"]
-        Security["Security Reviewer<br/>audits code"]
-        Tester["Tester<br/>verifies all gates"]
+        Designer["UX Designer"]
+        Developer["Developer"]
+        Security["Security Reviewer"]
+        Tester["Tester"]
     end
     A11y --> Designer
@@ -324,13 +609,41 @@ flowchart TB
     Fix --> Gates
 ```
+---
+## Quick Commands
+Don't waste her time. Be direct.
+```
+@Beth Build me a dashboard for user analytics with real-time updates.
+```
+```
+@Beth Security review for our authentication flow. Find the holes.
+```
+```
+@developer Implement a drag-and-drop task board. Make it fast.
+```
+```
+@security-reviewer OWASP top 10 assessment on our API endpoints.
+```
+```
+@tester Accessibility audit. WCAG 2.1 AA. No excuses.
+```
+---
 ## Why Beth?
 <p align="center">
   <img src="assets/beth-questioning.png" alt="Beth" width="500">
 </p>
-Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
+Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
 Or you could let Beth handle it.
@@ -344,30 +657,60 @@ Is it magic? No. It's just competence with very good hair.
 ## Requirements
-- VS Code with GitHub Copilot extension
-- GitHub Copilot Chat enabled
-- The spine to actually ship something
+- **Node.js** ≥ 18
+- **VS Code** with GitHub Copilot extension
+- **GitHub Copilot Chat** in Agent mode
+- [**beads**](https://github.com/steveyegge/beads) for task tracking (`bd` CLI)
-### Optional: MCP Servers
+### Installing Beads
-Beth's agents work fine without them, but these make them smarter:
+```bash
+curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
+```
-| Server | What It Does | Setup |
-|--------|--------------|-------|
-| **shadcn/ui** | Component browsing & installation | `npx shadcn@latest mcp init --client vscode` |
-| **Playwright** | Browser automation for testing | See [MCP Setup Guide](docs/MCP-SETUP.md) |
-| **Azure** | Cloud resource management | See [MCP Setup Guide](docs/MCP-SETUP.md) |
-| **Web Search** | Internet research | See [MCP Setup Guide](docs/MCP-SETUP.md) |
+**CGO Troubleshooting (Linux/WSL):** Beads uses Dolt (a Git-for-data database) which requires CGO. If `bd init` or `bd doctor` fails with CGO-related errors:
-Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
+```bash
+# Install C compiler toolchain (required for CGO)
+sudo apt-get update && sudo apt-get install -y build-essential gcc
+# Verify CGO is available
+export CGO_ENABLED=1
+go env CGO_ENABLED  # should print 1
+# Re-install beads
+curl -fsSL https://raw.githubusercontent.com/steveyegge/beads/main/scripts/install.sh | bash
+```
+**Common beads issues:**
+- `bd: command not found` — Add `~/.local/bin` to your PATH: `export PATH="$HOME/.local/bin:$PATH"`
+- `bd doctor` warnings about metadata — Run `bd doctor --fix` to auto-repair
+- Dolt migration errors — Delete `.beads/` and re-initialize with `bd init`
+```bash
+# Verify beads is working
+bd doctor
+```
+### Optional: MCP Servers
+See [MCP Integrations](#mcp-integrations) above or [docs/MCP-SETUP.md](docs/MCP-SETUP.md) for setup.
+---
 ## Documentation
-- [Installation Guide](docs/INSTALLATION.md) — Full setup instructions
-- [MCP Setup](docs/MCP-SETUP.md) — Optional server integrations
-- [System Flow & Diagrams](docs/SYSTEM-FLOW.md) — Architecture and agent orchestration diagrams
-- [Changelog](CHANGELOG.md) — Version history and updates
-- [Security Policy](SECURITY.md) — Vulnerability reporting
+| Doc | Purpose |
+|-----|---------|
+| [Installation Guide](docs/INSTALLATION.md) | Full setup: prerequisites, VS Code config, beads |
+| [MCP Setup](docs/MCP-SETUP.md) | Optional server integrations |
+| [CLI Architecture](docs/CLI-ARCHITECTURE.md) | Dual-interface design, implementation phases |
+| [System Flow](docs/SYSTEM-FLOW.md) | Agent orchestration diagrams |
+| [Contributing Guide](CONTRIBUTING.md) | How to contribute (PR process, review checklist) |
+| [Changelog](CHANGELOG.md) | Version history |
+| [Security Policy](SECURITY.md) | Vulnerability reporting |
+---
 ## License