npm - beth-copilot - Versions diffs - 1.0.13 → 1.0.15 - Mend

beth-copilot 1.0.13 → 1.0.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (104)

package/CHANGELOG.md +195 -170
package/README.md +408 -185
package/bin/cli.js +65 -4
package/dist/cli/commands/doctor.e2e.test.d.ts +8 -0
package/dist/cli/commands/doctor.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/doctor.e2e.test.js +428 -0
package/dist/cli/commands/doctor.e2e.test.js.map +1 -0
package/dist/cli/commands/doctor.test.js +1 -1
package/dist/cli/commands/help.e2e.test.d.ts +9 -0
package/dist/cli/commands/help.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/help.e2e.test.js +150 -0
package/dist/cli/commands/help.e2e.test.js.map +1 -0
package/dist/cli/commands/init.test.d.ts +6 -0
package/dist/cli/commands/init.test.d.ts.map +1 -0
package/dist/cli/commands/init.test.js +289 -0
package/dist/cli/commands/init.test.js.map +1 -0
package/dist/cli/commands/mcp.e2e.test.d.ts +9 -0
package/dist/cli/commands/mcp.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/mcp.e2e.test.js +139 -0
package/dist/cli/commands/mcp.e2e.test.js.map +1 -0
package/dist/cli/commands/pipeline.e2e.test.d.ts +9 -0
package/dist/cli/commands/pipeline.e2e.test.d.ts.map +1 -0
package/dist/cli/commands/pipeline.e2e.test.js +192 -0
package/dist/cli/commands/pipeline.e2e.test.js.map +1 -0
package/dist/cli/commands/quickstart.test.d.ts +6 -0
package/dist/cli/commands/quickstart.test.d.ts.map +1 -0
package/dist/cli/commands/quickstart.test.js +232 -0
package/dist/cli/commands/quickstart.test.js.map +1 -0
package/dist/core/agents/frontmatter.test.d.ts +8 -0
package/dist/core/agents/frontmatter.test.d.ts.map +1 -0
package/dist/core/agents/frontmatter.test.js +589 -0
package/dist/core/agents/frontmatter.test.js.map +1 -0
package/dist/core/agents/handoffs.test.d.ts +8 -0
package/dist/core/agents/handoffs.test.d.ts.map +1 -0
package/dist/core/agents/handoffs.test.js +320 -0
package/dist/core/agents/handoffs.test.js.map +1 -0
package/dist/core/agents/loader.test.js +1 -1
package/dist/core/agents/suite.test.d.ts +8 -0
package/dist/core/agents/suite.test.d.ts.map +1 -0
package/dist/core/agents/suite.test.js +207 -0
package/dist/core/agents/suite.test.js.map +1 -0
package/dist/core/agents/tools.test.d.ts +8 -0
package/dist/core/agents/tools.test.d.ts.map +1 -0
package/dist/core/agents/tools.test.js +332 -0
package/dist/core/agents/tools.test.js.map +1 -0
package/dist/init.test.js +288 -0
package/dist/providers/azure.d.ts +147 -0
package/dist/providers/azure.d.ts.map +1 -0
package/dist/providers/azure.js +491 -0
package/dist/providers/azure.js.map +1 -0
package/dist/providers/azure.test.d.ts +11 -0
package/dist/providers/azure.test.d.ts.map +1 -0
package/dist/providers/azure.test.js +330 -0
package/dist/providers/azure.test.js.map +1 -0
package/dist/providers/config.d.ts +87 -0
package/dist/providers/config.d.ts.map +1 -0
package/dist/providers/config.js +193 -0
package/dist/providers/config.js.map +1 -0
package/dist/providers/config.test.d.ts +7 -0
package/dist/providers/config.test.d.ts.map +1 -0
package/dist/providers/config.test.js +370 -0
package/dist/providers/config.test.js.map +1 -0
package/dist/providers/index.d.ts +18 -0
package/dist/providers/index.d.ts.map +1 -0
package/dist/providers/index.js +14 -0
package/dist/providers/index.js.map +1 -0
package/dist/providers/interface.d.ts +191 -0
package/dist/providers/interface.d.ts.map +1 -0
package/dist/providers/interface.js +94 -0
package/dist/providers/interface.js.map +1 -0
package/dist/providers/retry.d.ts +128 -0
package/dist/providers/retry.d.ts.map +1 -0
package/dist/providers/retry.js +205 -0
package/dist/providers/retry.js.map +1 -0
package/dist/providers/retry.test.d.ts +7 -0
package/dist/providers/retry.test.d.ts.map +1 -0
package/dist/providers/retry.test.js +439 -0
package/dist/providers/retry.test.js.map +1 -0
package/dist/providers/streaming.d.ts +157 -0
package/dist/providers/streaming.d.ts.map +1 -0
package/dist/providers/streaming.js +233 -0
package/dist/providers/streaming.js.map +1 -0
package/dist/providers/streaming.test.d.ts +7 -0
package/dist/providers/streaming.test.d.ts.map +1 -0
package/dist/providers/streaming.test.js +372 -0
package/dist/providers/streaming.test.js.map +1 -0
package/dist/providers/types.d.ts +209 -0
package/dist/providers/types.d.ts.map +1 -0
package/dist/providers/types.js +53 -0
package/dist/providers/types.js.map +1 -0
package/dist/providers/types.test.d.ts +7 -0
package/dist/providers/types.test.d.ts.map +1 -0
package/dist/providers/types.test.js +141 -0
package/dist/providers/types.test.js.map +1 -0
package/package.json +60 -56
package/sbom.json +3302 -8
package/templates/.github/agents/beth.agent.md +329 -329
package/templates/.github/agents/developer.agent.md +572 -572
package/templates/.github/agents/product-manager.agent.md +272 -272
package/templates/.github/agents/researcher.agent.md +338 -338
package/templates/.github/agents/security-reviewer.agent.md +465 -465
package/templates/.github/agents/tester.agent.md +496 -496
package/templates/.github/agents/ux-designer.agent.md +393 -393
package/templates/mcp.json.example +4 -0

package/README.md CHANGED

@@ -12,21 +12,131 @@ They broke her wings once. They forgot she had claws.
 ## What Is This?
-Beth is a master AI orchestrator system—a ruthless, hyper-competent coordinator that runs your development team the way Beth Dutton runs Schwartz & Meyer. No hand-holding. No excuses. Just results.
+Beth is a **multi-agent AI orchestrator** with a TypeScript runtime, CLI toolchain, MCP integrations, and agent-to-agent (A2A) delegation—all driven by a ruthless coordinator who runs your development team the way Beth Dutton runs Schwartz & Meyer.
-She commands an army of specialized agents, each with their own expertise, and she's not afraid to put them to work simultaneously while she lights a cigarette and watches the crew build production-ready code.
+She commands seven specialized agents, each with their own expertise, tools, and handoff chains. On top of the GitHub Copilot agent layer, Beth now ships a **TypeScript core engine** with parsed agent/skill schemas, an Azure OpenAI LLM provider, streaming tool-call support, and a CLI that validates your entire installation in one command.
-**She handles:**
-- Product strategy that makes competitors weep
-- Research that finds the real dirt
-- Designs so sharp they cut
-- Code that actually works
-- Security that locks the gates
-- Tests that find every weakness before your enemies do
+**The system has three execution layers:**
+| Layer | What It Does | Status |
+|-------|-------------|--------|
+| **Copilot Agents** | `.agent.md` definitions running in VS Code Agent Mode | Live |
+| **CLI Toolchain** | `beth init`, `beth doctor`, `beth quickstart` — TypeScript commands with 485 tests | Live |
+| **LLM Provider** | Azure OpenAI with Entra ID auth, streaming, retry, tool calling | Live |
+---
+## Architecture
+```mermaid
+flowchart TB
+    subgraph UI["User Interfaces"]
+        Copilot["VS Code Copilot Chat<br/><i>Agent Mode</i>"]
+        CLI["Beth CLI<br/><i>init · doctor · quickstart</i>"]
+    end
+    subgraph Core["Beth Core Engine — TypeScript"]
+        AgentLoader["Agent Loader<br/><i>Parse .agent.md frontmatter</i>"]
+        SkillLoader["Skill Loader<br/><i>Parse SKILL.md + triggers</i>"]
+        Types["Agent & Skill Types<br/><i>Typed schemas</i>"]
+        PathVal["Path Validation<br/><i>Traversal/injection guard</i>"]
+    end
+    subgraph Agents["Specialist Agents (A2A)"]
+        Beth["@Beth<br/><i>Orchestrator</i>"]
+        PM["@product-manager"]
+        Researcher["@researcher"]
+        Designer["@ux-designer"]
+        Developer["@developer"]
+        Security["@security-reviewer"]
+        Tester["@tester"]
+    end
+    subgraph Skills["Skills — On-Demand Knowledge"]
+        PRD["PRD Generation"]
+        Framer["Framer Components"]
+        React["React/Next.js<br/>Best Practices"]
+        WebDesign["Web Design<br/>Guidelines"]
+        Shadcn["shadcn/ui"]
+        SecAnalysis["Security Analysis"]
+        AzureOps["Azure Operations"]
+        WebSearch["Web Search"]
+    end
+    subgraph MCP["MCP Servers — Optional"]
+        MCPShadcn["shadcn/ui"]
+        MCPPlaywright["Playwright"]
+        MCPAzure["Azure"]
+        MCPBrave["Brave Search"]
+        MCPDeepWiki["DeepWiki"]
+    end
+    subgraph Provider["LLM Provider Layer"]
+        Interface["LLMProviderBase<br/><i>Abstract interface</i>"]
+        Azure["AzureOpenAIProvider<br/><i>Entra ID · Streaming</i>"]
+        Retry["Retry + Backoff<br/><i>Exponential w/ jitter</i>"]
+        Stream["StreamAccumulator<br/><i>Tool call assembly</i>"]
+        Config["Config Loader<br/><i>env → ~/.beth/.env</i>"]
+    end
+    subgraph Tracking["Work Tracking"]
+        Beads["beads (bd CLI)<br/><i>Agent coordination</i>"]
+        Backlog["Backlog.md<br/><i>Human changelog</i>"]
+    end
+    Copilot --> Beth
+    CLI --> Core
+    Core --> Agents
+    Beth -->|"routes"| PM & Researcher & Designer & Developer & Security & Tester
+    PM -.->|"loads"| PRD
+    Designer -.->|"loads"| Framer & WebDesign
+    Developer -.->|"loads"| React & Shadcn
+    Security -.->|"loads"| SecAnalysis
+    Researcher -.->|"loads"| WebSearch
+    Developer -.->|"uses"| MCPShadcn
+    Tester -.->|"uses"| MCPPlaywright
+    Security -.->|"uses"| MCPAzure
+    Researcher -.->|"uses"| MCPBrave
+    Azure --> Interface
+    Retry --> Azure
+    Stream --> Azure
+    Config --> Azure
+    Beth -.->|"tracks"| Beads
+    Beth -.->|"updates"| Backlog
+    style Beth fill:#1e3a5f,color:#fff
+    style Core fill:#f0f4f8
+    style Provider fill:#e8f5e9
+```
+---
+## Tech Stack
+| Category | Technology | Notes |
+|----------|-----------|-------|
+| **Runtime** | Node.js ≥ 18 | ES modules, built-in test runner |
+| **Language** | TypeScript (strict mode) | No `any`. Zod for runtime validation |
+| **Target Framework** | React 19 + Next.js App Router | Server Components, Server Actions, Suspense, streaming |
+| **Styling** | Tailwind CSS + `class-variance-authority` (cva) | Utility-first with typed variants |
+| **Components** | shadcn/ui | Radix primitives, copy-paste ownership |
+| **LLM Provider** | Azure OpenAI via `openai` SDK | Entra ID auth (no API keys), streaming + tool calling |
+| **Auth** | `@azure/identity` DefaultAzureCredential | az login, managed identity, VS Code creds |
+| **Frontmatter** | `gray-matter` | Parses `.agent.md` and `SKILL.md` YAML |
+| **Testing** | Node.js built-in test runner | 485 tests — unit, integration, E2E |
+| **Task Tracking** | beads (`bd` CLI) | Dependency-aware issue tracking for agents |
+| **Package Manager** | pnpm | Lockfile committed |
+**Production dependencies:** 1 (`gray-matter`). That's it. Minimal attack surface by design.
+---
 ## Getting Started
-**Project scope:**
+**One command:**
 ```bash
 npx beth-copilot init
 ```
@@ -39,87 +149,97 @@ beth init
 Then open VS Code, switch Copilot Chat to **Agent mode**, and type `@Beth`.
-For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
+**Verify everything works:**
+```bash
+beth doctor       # Health check: Node.js, beads, agents, skills
+beth quickstart   # Init + doctor + beads setup in one shot
+```
-## The Family
+For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
-Beth doesn't work alone. She's got people—loyal, skilled, and ready to execute.
+---
-| Agent | Role | What They Do |
-|-------|------|--------------|
-| **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
-| **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics. |
-| **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt. |
-| **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility. |
-| **@developer** | The Builder | React/TypeScript/Next.js - UI and full-stack. Gets it done. |
-| **@tester** | The Enforcer | Quality assurance, accessibility, performance. Finds every crack. |
-| **@security-reviewer** | The Bodyguard | Enterprise security. Vulnerabilities, compliance, threat modeling. |
+## CLI Commands
-### Product Manager vs UX Designer
+| Command | What It Does |
+|---------|-------------|
+| `beth init` | Install agents, skills, VS Code settings, beads tracking |
+| `beth init --force` | Overwrite existing files |
+| `beth doctor` | Validate Node.js ≥18, beads CLI, agents frontmatter, skills directories |
+| `beth quickstart` | Run init + doctor + beads init in one shot |
+| `beth help` | Show all commands and options |
-| | Product Manager | UX Designer |
-|---|---|---|
-| **Focus** | WHAT to build, WHY, WHEN | HOW it looks, feels, behaves |
-| **Outputs** | PRDs, user stories, priorities | Component specs, design tokens, accessibility |
-| **Example** | "Users need date filtering" | "Date picker: variants, states, ARIA" |
+**Flags:** `--force`, `--skip-backlog`, `--skip-mcp`, `--skip-beads`, `--verbose`
-## Skills (The Weapons)
+---
-Beth's team comes equipped:
+## Agent-to-Agent (A2A) Orchestration
-| Skill | Purpose |
-|-------|---------|
-| **PRD Generation** | Write requirements docs that don't waste anyone's time |
-| **Framer Components** | Build custom React components with property controls |
-| **React/Next.js Best Practices** | Vercel-grade performance patterns |
-| **Web Design Guidelines** | WCAG compliance, UI review, accessibility |
-| **shadcn/ui** | Component library patterns, installation, and best practices |
-| **Security Analysis** | OWASP, threat modeling, vulnerability assessment |
+Beth doesn't micromanage. She delegates to specialists over **subagent** and **handoff** channels, tracks dependencies with beads, and holds every agent accountable.
-## How Beth Works
+### The Family
-She doesn't micromanage. She delegates to specialists and holds them accountable.
+| Agent | Role | What They Do |
+|-------|------|--------------|
+| **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
+| **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics |
+| **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt |
+| **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility |
+| **@developer** | The Builder | React/TypeScript/Next.js — UI and full-stack |
+| **@tester** | The Enforcer | Quality assurance, accessibility, performance |
+| **@security-reviewer** | The Bodyguard | OWASP, compliance, threat modeling |
-### Architecture
+### A2A Delegation Model
 ```mermaid
 flowchart TB
-    subgraph User["👤 User"]
-        Request[User Request]
+    subgraph Orchestration["Beth Orchestration Layer"]
+        BethCore["@Beth<br/><i>Routes work · Spawns subagents</i>"]
     end
-    subgraph Orchestrator["🎯 Beth - The Orchestrator"]
-        Beth["@Beth<br/><i>'I don't speak dipshit'</i>"]
-        Assess[Assess Request]
-        Plan[Plan Workflow]
-        Route[Route to Specialists]
+    subgraph Specialists["Specialist Agents"]
+        PM["@product-manager<br/>Requirements · Priorities"]
+        R["@researcher<br/>User insights · Market intel"]
+        UX["@ux-designer<br/>Component specs · Design tokens"]
+        D["@developer<br/>React/TS/Next.js · Implementation"]
+        S["@security-reviewer<br/>Threat modeling · Vulnerabilities"]
+        T["@tester<br/>QA · a11y · Performance"]
     end
-    subgraph Agents["🧑‍💼 Specialist Agents"]
-        PM["@product-manager<br/>WHAT to build"]
-        Researcher["@researcher<br/>User/Market Intel"]
-        Designer["@ux-designer<br/>HOW it works"]
-        Developer["@developer<br/>Implementation"]
-        Security["@security-reviewer<br/>Protection"]
-        Tester["@tester<br/>Quality Gate"]
-    end
+    BethCore -->|"Product Strategy"| PM
+    BethCore -->|"User Research"| R
+    BethCore -->|"UX Design"| UX
+    BethCore -->|"Development"| D
+    BethCore -->|"Security Review"| S
+    BethCore -->|"Quality Assurance"| T
+    PM -.->|"subagent"| R
+    PM -.->|"subagent"| UX
+    UX -.->|"subagent"| D
+    D -.->|"subagent"| T
+    S -.->|"subagent"| D
+    T -.->|"subagent"| D
+    style BethCore fill:#1e3a5f,color:#fff
+```
-    Request --> Beth
-    Beth --> Assess --> Plan --> Route
-    Route --> PM
-    Route --> Researcher
-    Route --> Designer
-    Route --> Developer
-    Route --> Security
-    Route --> Tester
+### Subagent vs Handoff
-    style Beth fill:#1e3a5f,color:#fff
-    style Orchestrator fill:#f0f4f8
-    style Agents fill:#f8f4f0
+| Mechanism | Control | Use When |
+|-----------|---------|----------|
+| **Subagent** | Beth decides | Task can run autonomously, no human review needed |
+| **Handoff** | User decides | User needs to review before proceeding |
+```typescript
+// Beth spawns a specialist — autonomous execution
+runSubagent({
+  agentName: "developer",
+  prompt: "Implement JWT auth flow with refresh token rotation...",
+  description: "Implement auth"
+})
 ```
-### The Workflow
+### Workflow: New Feature
 ```mermaid
 sequenceDiagram
@@ -133,125 +253,197 @@ sequenceDiagram
     U->>B: "Build me a feature"
     B->>B: Assess & Plan
     B->>PM: Define requirements
-    PM-->>B: Requirements ready
+    PM-->>B: PRD + user stories
     B->>UX: Design the experience
-    UX-->>B: Design specs ready
+    UX-->>B: Component specs + tokens
     B->>D: Implement feature
     D-->>B: Implementation complete
-    B->>S: Security review
-    S-->>B: Security approved
-    B->>T: Test & verify
-    T-->>B: Quality verified
+    par Parallel quality gates
+        B->>S: Security review
+        S-->>B: OWASP approved
+    and
+        B->>T: Test & verify
+        T-->>B: a11y + regression pass
+    end
     B->>U: Feature complete ✅
 ```
-**Bug Hunt?** Tester → Developer → Security → Tester
-**Security Audit?** Security → Developer → Tester → Security
+**Bug Hunt?** Tester → Developer → Security → Tester
+**Security Audit?** Security → Developer → Tester → Security sign-off
-### Agent Delegation
+---
-```mermaid
-flowchart TB
-    subgraph Beth["Beth (Orchestrator)"]
-        BethCore["Routes all work<br/>Spawns subagents"]
-    end
+## MCP Integrations
-    subgraph PM["Product Manager"]
-        PMCore["Requirements<br/>Priorities"]
-    end
+Model Context Protocol servers extend agent capabilities. All **optional** — agents gracefully degrade without them.
+| Server | Agent | Capability |
+|--------|-------|-----------|
+| **shadcn/ui** | Developer | Component browsing & installation |
+| **Playwright** | Tester | Browser automation, E2E testing |
+| **Azure** | Developer, Security | Cloud resource management |
+| **Brave Search** | Researcher | Internet research |
+| **DeepWiki** | All | Repository documentation lookup |
+### Quick Setup
+```bash
+# Copy example config and enable what you need
+cp mcp.json.example .vscode/mcp.json
+```
+```json
+{
+  "servers": {
+    "shadcn":     { "command": "npx", "args": ["shadcn@latest", "mcp"] },
+    "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
+    "azure":      { "command": "npx", "args": ["@azure/mcp-server"] },
+    "web-search": { "command": "npx", "args": ["@brave/brave-search-mcp-server"] },
+    "deepwiki":   { "url": "https://mcp.deepwiki.com/mcp" }
+  }
+}
+```
+Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
-    subgraph R["Researcher"]
-        RCore["User insights<br/>Market intel"]
+---
+## Skills (On-Demand Knowledge)
+Skills are domain-knowledge modules that agents load automatically when trigger phrases match. Each skill lives in `.github/skills/<name>/SKILL.md`.
+| Skill | Triggers On | Used By |
+|-------|------------|---------|
+| **PRD Generation** | "create a prd", "product requirements" | Product Manager |
+| **Framer Components** | "framer component", "property controls" | UX Designer |
+| **React/Next.js Best Practices** | React performance, Next.js patterns | Developer |
+| **Web Design Guidelines** | "review my UI", "check accessibility" | UX Designer |
+| **shadcn/ui** | "shadcn", "ui component" | Developer |
+| **Security Analysis** | "security review", "OWASP", "threat model" | Security Reviewer |
+| **Azure Operations** | Azure resource management | Developer |
+| **Web Search** | Internet research via Brave | Researcher |
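The added README lines say skills load when trigger phrases match a request, but the diff does not show how the skill loader does the matching. A minimal sketch of one plausible approach follows, assuming case-insensitive substring matching. The names `SkillTriggers`, `PhraseIndex`, `buildPhraseIndex`, and `matchSkills` are hypothetical and are not taken from the package.

```typescript
// Hypothetical sketch: not the package's skill loader API.
interface SkillTriggers {
  name: string;        // skill identifier, e.g. "prd"
  triggers: string[];  // phrases that should activate the skill
}

// Maps a lower-cased trigger phrase to the skills that declare it.
type PhraseIndex = Map<string, string[]>;

function buildPhraseIndex(skills: SkillTriggers[]): PhraseIndex {
  const index: PhraseIndex = new Map<string, string[]>();
  for (const skill of skills) {
    for (const trigger of skill.triggers) {
      const key = trigger.trim().toLowerCase();
      const names = index.get(key) ?? [];
      names.push(skill.name);
      index.set(key, names);
    }
  }
  return index;
}

// Returns each skill whose trigger phrase occurs in the query, without duplicates.
function matchSkills(query: string, index: PhraseIndex): string[] {
  const q = query.toLowerCase();
  const hits = new Set<string>();
  index.forEach((names, phrase) => {
    if (q.includes(phrase)) {
      names.forEach((name) => hits.add(name));
    }
  });
  return Array.from(hits);
}

const index = buildPhraseIndex([
  { name: "prd", triggers: ["create a prd", "product requirements"] },
  { name: "shadcn", triggers: ["shadcn", "ui component"] },
]);

console.log(matchSkills("Please create a PRD for checkout", index));
```

Substring matching is the simplest option and can produce false positives on short phrases; a real loader might tokenize or score matches instead.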
+---
+## LLM Provider Layer
+The TypeScript core includes a production-ready provider abstraction for running Beth outside VS Code.
+```mermaid
+flowchart LR
+    subgraph Config["Configuration"]
+        Env["process.env"]
+        DotEnv["~/.beth/.env"]
     end
-    subgraph UX["UX Designer"]
-        UXCore["Component specs<br/>Design tokens"]
+    subgraph Auth["Authentication"]
+        Entra["Entra ID<br/><i>DefaultAzureCredential</i>"]
     end
-    subgraph D["Developer"]
-        DCore["React/TS/Next.js<br/>Implementation"]
+    subgraph Provider["Provider"]
+        Base["LLMProviderBase<br/><i>Abstract interface</i>"]
+        AzureOAI["AzureOpenAIProvider<br/><i>chat · chatStream · countTokens</i>"]
     end
-    subgraph S["Security"]
-        SCore["Threat modeling<br/>Vulnerabilities"]
+    subgraph Resilience["Resilience"]
+        RetryMod["Exponential Backoff<br/><i>Jitter · 3 retries</i>"]
+        Errors["LLMError<br/><i>Typed error codes</i>"]
     end
-    subgraph T["Tester"]
-        TCore["QA & a11y<br/>Performance"]
+    subgraph Streaming["Streaming"]
+        Accum["StreamAccumulator<br/><i>Content + tool call assembly</i>"]
+        Collect["collectStream<br/><i>Full response</i>"]
+        Map["mapStream<br/><i>Transform chunks</i>"]
     end
-    BethCore -->|"Product Strategy"| PMCore
-    BethCore -->|"User Research"| RCore
-    BethCore -->|"UX Design"| UXCore
-    BethCore -->|"Development"| DCore
-    BethCore -->|"Security Review"| SCore
-    BethCore -->|"Quality Assurance"| TCore
-    PMCore -.->|"subagent"| RCore
-    PMCore -.->|"subagent"| UXCore
-    UXCore -.->|"subagent"| DCore
-    DCore -.->|"subagent"| TCore
-    SCore -.->|"subagent"| DCore
+    Env --> AzureOAI
+    DotEnv --> AzureOAI
+    Entra --> AzureOAI
+    Base --> AzureOAI
+    RetryMod --> AzureOAI
+    AzureOAI --> Accum
+    AzureOAI --> Collect
+    Errors --> RetryMod
 ```
-## Quick Commands
-Don't waste her time. Be direct.
+**Key capabilities:**
+- **Entra ID auth** — No API keys. Uses `DefaultAzureCredential` (az login, managed identity, VS Code creds)
+- **Streaming** — `chatStream()` yields `ChatChunk` objects with incremental tool call delta assembly
+- **Retry** — Exponential backoff with jitter for 429/5xx/network errors. Non-transient errors fail fast
+- **Config** — `process.env` → `~/.beth/.env` precedence chain
+- **193 provider tests** covering types, retry, config, streaming, and Azure client
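The retry bullet above describes exponential backoff with jitter for 429/5xx/network errors and fast failure otherwise. The sketch below shows that general technique, assuming a "full jitter" strategy (a random delay between zero and an exponentially growing ceiling). The diff does not show the contents of `retry.js`, so every name and default here (`RetryOptions`, `backoffDelayMs`, `withRetry`, the delay values) is an assumption; only the count of 3 retries comes from the README diagram.

```typescript
// Hypothetical sketch: not the package's retry.ts API.
interface RetryOptions {
  maxRetries: number;   // retries after the first attempt
  baseDelayMs: number;  // delay ceiling for the first retry
  maxDelayMs: number;   // upper bound on any single delay ceiling
}

const DEFAULTS: RetryOptions = { maxRetries: 3, baseDelayMs: 250, maxDelayMs: 8000 };

// 429 (rate limited) and 5xx responses are treated as transient.
function isTransientStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Full jitter: pick a random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Runs fn, retrying only while the caller-supplied predicate says the error is transient.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  opts: RetryOptions = DEFAULTS,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Non-transient errors and exhausted budgets fail fast.
      if (attempt >= opts.maxRetries || !isRetryable(err)) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}
```

Passing `random` and `sleep` as parameters keeps the delay calculation deterministic in tests. Jitter spreads out retries from many clients so they do not all hit the service at the same instant.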
-```
-@Beth Build me a dashboard for user analytics with real-time updates.
-```
+---
-```
-@Beth Security review for our authentication flow. Find the holes.
-```
+## TypeScript Core
-```
-@developer Implement a drag-and-drop task board. Make it fast.
-```
+The engine that powers everything. Parses agent and skill definitions, validates configuration, and provides typed APIs.
-```
-@security-reviewer OWASP top 10 assessment on our API endpoints.
-```
+### Project Structure
 ```
-@tester Accessibility audit. WCAG 2.1 AA. No excuses.
+beth/
+├── bin/
+│   └── cli.js                      # CLI entry point (init, doctor, quickstart, help)
+├── src/
+│   ├── index.ts                    # Barrel exports
+│   ├── cli/commands/
+│   │   ├── doctor.ts               # System health validation
+│   │   └── quickstart.ts           # Guided setup flow
+│   ├── core/
+│   │   ├── agents/
+│   │   │   ├── types.ts            # AgentDefinition, AgentFrontmatter, AgentHandoff
+│   │   │   └── loader.ts           # Parse .agent.md → typed definitions
+│   │   └── skills/
+│   │       ├── types.ts            # SkillDefinition, TriggerMap
+│   │       └── loader.ts           # Parse SKILL.md, extract triggers, match queries
+│   ├── lib/
+│   │   └── pathValidation.ts       # Traversal/injection guards
+│   └── providers/
+│       ├── interface.ts            # LLMProviderBase abstract class
+│       ├── azure.ts                # AzureOpenAIProvider (Entra ID, streaming, tools)
+│       ├── types.ts                # 17 types: ChatMessage, ToolCall, LLMError, etc.
+│       ├── retry.ts                # Exponential backoff with jitter
+│       ├── config.ts               # Environment + dotfile config loader
+│       └── streaming.ts            # StreamAccumulator, collectStream, mapStream
+├── templates/
+│   └── .github/
+│       ├── agents/                 # 7 agent definitions (.agent.md)
+│       └── skills/                 # 8 skill modules (SKILL.md)
+└── docs/
+    ├── INSTALLATION.md
+    ├── MCP-SETUP.md
+    ├── CLI-ARCHITECTURE.md
+    └── SYSTEM-FLOW.md
 ```
-## The Structure
-```
-.github/
-├── agents/                          # The crew
-│   ├── beth.agent.md               # The boss herself
-│   ├── product-manager.agent.md
-│   ├── researcher.agent.md
-│   ├── ux-designer.agent.md
-│   ├── developer.agent.md          # UI + full-stack
-│   ├── tester.agent.md
-│   └── security-reviewer.agent.md  # Enterprise security
-├── skills/                          # Domain expertise
-│   ├── prd/
-│   ├── framer-components/
-│   ├── vercel-react-best-practices/
-│   ├── web-design-guidelines/
-│   └── security-analysis/          # New: security skill
-└── copilot-instructions.md         # The rules of engagement
-```
+### Test Coverage
-## Her Philosophy
+**485 tests** (484 pass, 1 skip, 0 fail):
-Beth operates on a few principles:
+| Suite | Tests | What It Covers |
+|-------|-------|---------------|
+| Agent loader | 30+ | Frontmatter parsing, validation, code fence stripping, handoffs |
+| Skill loader | 30+ | Trigger extraction, query matching, trigger map building |
+| Provider types | 40+ | LLMError codes, ChatMessage shapes, ToolDefinition schemas |
+| Provider retry | 40+ | Exponential backoff, jitter, transient error detection |
+| Provider config | 30+ | Env precedence, dotenv parsing, URL validation |
+| Provider streaming | 40+ | Chunk accumulation, tool call delta assembly |
+| Provider Azure | 30+ | Message mapping, response mapping, error wrapping |
+| CLI E2E | 52 | Init/doctor pipeline, MCP template validation, help output |
+| Path validation | 33 | Traversal detection, injection prevention, allowlists |
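The path validation row mentions traversal detection and injection prevention, but `pathValidation.ts` itself is not part of this diff. As an illustration of the idea only, the sketch below rejects relative paths that would escape a root directory. The helper names `normalizeSegments` and `isSafeRelativePath` are hypothetical, and the rules shown (no null bytes, no absolute paths, no climbing above the root) are assumptions about what such a guard might check.

```typescript
// Hypothetical sketch: not the package's pathValidation.ts API.

// Resolves "." and ".." segments; returns null if the path climbs above its root.
function normalizeSegments(p: string): string[] | null {
  const out: string[] = [];
  for (const seg of p.split(/[\\/]+/)) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (out.length === 0) return null; // would escape the root
      out.pop();
    } else {
      out.push(seg);
    }
  }
  return out;
}

function isSafeRelativePath(relPath: string): boolean {
  if (relPath.includes("\0")) return false;              // null-byte injection
  if (/^([\\/]|[A-Za-z]:)/.test(relPath)) return false;  // POSIX or Windows absolute path
  return normalizeSegments(relPath) !== null;
}

console.log(isSafeRelativePath("agents/beth.agent.md"));
console.log(isSafeRelativePath("agents/../../secrets.env"));
```

In Node.js code, resolving the candidate with `path.resolve` and checking that the result stays under the root is the more common route; the pure-string version here just keeps the example free of imports.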
-1. **Protect the family** — Your codebase is the ranch. She defends it.
-2. **No weakness** — Tests, security, accessibility. Cover every flank.
-3. **Move fast, break enemies** — Parallel execution, aggressive timelines.
-4. **Loyalty earns trust** — Agents that perform get the good work.
+---
-### IDEO Design Thinking
+## IDEO Design Thinking
-Beth follows human-centered design methodology:
+Beth follows human-centered design methodology across agent workflows:
 ```mermaid
 flowchart LR
@@ -277,35 +469,38 @@ flowchart LR
     E --> D --> I --> P --> T
     T -.->|iterate| E
-    T -.->|iterate| D
     T -.->|iterate| I
 ```
+---
 ## Quality Standards
 Beth doesn't ship garbage:
-- **Accessibility**: WCAG 2.1 AA minimum. Everyone uses the product.
-- **Performance**: Core Web Vitals green. LCP < 2.5s. No exceptions.
-- **Security**: OWASP compliant. Regular audits. Zero tolerance for vulnerabilities.
-- **Type Safety**: Full TypeScript coverage. No `any` unless you want a lecture.
-- **Test Coverage**: Unit, integration, E2E. If it's not tested, it's not done.
+| Standard | Gate | Enforced By |
+|----------|------|-------------|
+| **WCAG 2.1 AA** | Accessibility compliance | UX Designer + Tester |
+| **Core Web Vitals** | LCP < 2.5s, FID < 100ms, CLS < 0.1 | Developer |
+| **OWASP Top 10** | Zero known vulnerabilities | Security Reviewer |
+| **TypeScript Strict** | No `any` | Developer |
+| **Test Coverage** | Unit + Integration + E2E | Tester |
 ```mermaid
 flowchart TB
     subgraph Standards["Quality Standards"]
-        A11y["WCAG 2.1 AA<br/>Accessibility"]
-        Perf["Core Web Vitals<br/>LCP < 2.5s"]
-        Sec["OWASP Compliant<br/>Zero vulnerabilities"]
-        Type["Full TypeScript<br/>No any"]
-        Coverage["Test Coverage<br/>Unit + Integration + E2E"]
+        A11y["WCAG 2.1 AA"]
+        Perf["Core Web Vitals"]
+        Sec["OWASP Compliant"]
+        Type["Full TypeScript"]
+        Coverage["Test Coverage"]
     end
     subgraph Gates["Enforcement"]
-        Designer["UX Designer<br/>reviews a11y specs"]
-        Developer["Developer<br/>implements patterns"]
-        Security["Security Reviewer<br/>audits code"]
-        Tester["Tester<br/>verifies all gates"]
+        Designer["UX Designer"]
+        Developer["Developer"]
+        Security["Security Reviewer"]
+        Tester["Tester"]
     end
     A11y --> Designer
@@ -324,13 +519,41 @@ flowchart TB
     Fix --> Gates
 ```
+---
+## Quick Commands
+Don't waste her time. Be direct.
+```
+@Beth Build me a dashboard for user analytics with real-time updates.
+```
+```
+@Beth Security review for our authentication flow. Find the holes.
+```
+```
+@developer Implement a drag-and-drop task board. Make it fast.
+```
+```
+@security-reviewer OWASP top 10 assessment on our API endpoints.
+```
+```
+@tester Accessibility audit. WCAG 2.1 AA. No excuses.
+```
+---
 ## Why Beth?
 <p align="center">
   <img src="assets/beth-questioning.png" alt="Beth" width="500">
 </p>
-Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
+Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
 Or you could let Beth handle it.
@@ -344,30 +567,30 @@ Is it magic? No. It's just competence with very good hair.
 ## Requirements
-- VS Code with GitHub Copilot extension
-- GitHub Copilot Chat enabled
-- The spine to actually ship something
+- **Node.js** ≥ 18
+- **VS Code** with GitHub Copilot extension
+- **GitHub Copilot Chat** in Agent mode
+- [**beads**](https://github.com/steveyegge/beads) for task tracking (`bd` CLI)
 ### Optional: MCP Servers
-Beth's agents work fine without them, but these make them smarter:
-| Server | What It Does | Setup |
-|--------|--------------|-------|
-| **shadcn/ui** | Component browsing & installation | `npx shadcn@latest mcp init --client vscode` |
-| **Playwright** | Browser automation for testing | See [MCP Setup Guide](docs/MCP-SETUP.md) |
-| **Azure** | Cloud resource management | See [MCP Setup Guide](docs/MCP-SETUP.md) |
-| **Web Search** | Internet research | See [MCP Setup Guide](docs/MCP-SETUP.md) |
+See [MCP Integrations](#mcp-integrations) above or [docs/MCP-SETUP.md](docs/MCP-SETUP.md) for setup.
-Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
+---
 ## Documentation
-- [Installation Guide](docs/INSTALLATION.md) — Full setup instructions
-- [MCP Setup](docs/MCP-SETUP.md) — Optional server integrations
-- [System Flow & Diagrams](docs/SYSTEM-FLOW.md) — Architecture and agent orchestration diagrams
-- [Changelog](CHANGELOG.md) — Version history and updates
-- [Security Policy](SECURITY.md) — Vulnerability reporting
+| Doc | Purpose |
+|-----|---------|
+| [Installation Guide](docs/INSTALLATION.md) | Full setup: prerequisites, VS Code config, beads |
+| [MCP Setup](docs/MCP-SETUP.md) | Optional server integrations |
+| [CLI Architecture](docs/CLI-ARCHITECTURE.md) | Dual-interface design, implementation phases |
+| [System Flow](docs/SYSTEM-FLOW.md) | Agent orchestration diagrams |
+| [Contributing Guide](CONTRIBUTING.md) | How to contribute (PR process, review checklist) |
+| [Changelog](CHANGELOG.md) | Version history |
+| [Security Policy](SECURITY.md) | Vulnerability reporting |
+---
 ## License