beth-copilot 1.0.13 → 1.0.15

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (104)
  1. package/CHANGELOG.md +195 -170
  2. package/README.md +408 -185
  3. package/bin/cli.js +65 -4
  4. package/dist/cli/commands/doctor.e2e.test.d.ts +8 -0
  5. package/dist/cli/commands/doctor.e2e.test.d.ts.map +1 -0
  6. package/dist/cli/commands/doctor.e2e.test.js +428 -0
  7. package/dist/cli/commands/doctor.e2e.test.js.map +1 -0
  8. package/dist/cli/commands/doctor.test.js +1 -1
  9. package/dist/cli/commands/help.e2e.test.d.ts +9 -0
  10. package/dist/cli/commands/help.e2e.test.d.ts.map +1 -0
  11. package/dist/cli/commands/help.e2e.test.js +150 -0
  12. package/dist/cli/commands/help.e2e.test.js.map +1 -0
  13. package/dist/cli/commands/init.test.d.ts +6 -0
  14. package/dist/cli/commands/init.test.d.ts.map +1 -0
  15. package/dist/cli/commands/init.test.js +289 -0
  16. package/dist/cli/commands/init.test.js.map +1 -0
  17. package/dist/cli/commands/mcp.e2e.test.d.ts +9 -0
  18. package/dist/cli/commands/mcp.e2e.test.d.ts.map +1 -0
  19. package/dist/cli/commands/mcp.e2e.test.js +139 -0
  20. package/dist/cli/commands/mcp.e2e.test.js.map +1 -0
  21. package/dist/cli/commands/pipeline.e2e.test.d.ts +9 -0
  22. package/dist/cli/commands/pipeline.e2e.test.d.ts.map +1 -0
  23. package/dist/cli/commands/pipeline.e2e.test.js +192 -0
  24. package/dist/cli/commands/pipeline.e2e.test.js.map +1 -0
  25. package/dist/cli/commands/quickstart.test.d.ts +6 -0
  26. package/dist/cli/commands/quickstart.test.d.ts.map +1 -0
  27. package/dist/cli/commands/quickstart.test.js +232 -0
  28. package/dist/cli/commands/quickstart.test.js.map +1 -0
  29. package/dist/core/agents/frontmatter.test.d.ts +8 -0
  30. package/dist/core/agents/frontmatter.test.d.ts.map +1 -0
  31. package/dist/core/agents/frontmatter.test.js +589 -0
  32. package/dist/core/agents/frontmatter.test.js.map +1 -0
  33. package/dist/core/agents/handoffs.test.d.ts +8 -0
  34. package/dist/core/agents/handoffs.test.d.ts.map +1 -0
  35. package/dist/core/agents/handoffs.test.js +320 -0
  36. package/dist/core/agents/handoffs.test.js.map +1 -0
  37. package/dist/core/agents/loader.test.js +1 -1
  38. package/dist/core/agents/suite.test.d.ts +8 -0
  39. package/dist/core/agents/suite.test.d.ts.map +1 -0
  40. package/dist/core/agents/suite.test.js +207 -0
  41. package/dist/core/agents/suite.test.js.map +1 -0
  42. package/dist/core/agents/tools.test.d.ts +8 -0
  43. package/dist/core/agents/tools.test.d.ts.map +1 -0
  44. package/dist/core/agents/tools.test.js +332 -0
  45. package/dist/core/agents/tools.test.js.map +1 -0
  46. package/dist/init.test.js +288 -0
  47. package/dist/providers/azure.d.ts +147 -0
  48. package/dist/providers/azure.d.ts.map +1 -0
  49. package/dist/providers/azure.js +491 -0
  50. package/dist/providers/azure.js.map +1 -0
  51. package/dist/providers/azure.test.d.ts +11 -0
  52. package/dist/providers/azure.test.d.ts.map +1 -0
  53. package/dist/providers/azure.test.js +330 -0
  54. package/dist/providers/azure.test.js.map +1 -0
  55. package/dist/providers/config.d.ts +87 -0
  56. package/dist/providers/config.d.ts.map +1 -0
  57. package/dist/providers/config.js +193 -0
  58. package/dist/providers/config.js.map +1 -0
  59. package/dist/providers/config.test.d.ts +7 -0
  60. package/dist/providers/config.test.d.ts.map +1 -0
  61. package/dist/providers/config.test.js +370 -0
  62. package/dist/providers/config.test.js.map +1 -0
  63. package/dist/providers/index.d.ts +18 -0
  64. package/dist/providers/index.d.ts.map +1 -0
  65. package/dist/providers/index.js +14 -0
  66. package/dist/providers/index.js.map +1 -0
  67. package/dist/providers/interface.d.ts +191 -0
  68. package/dist/providers/interface.d.ts.map +1 -0
  69. package/dist/providers/interface.js +94 -0
  70. package/dist/providers/interface.js.map +1 -0
  71. package/dist/providers/retry.d.ts +128 -0
  72. package/dist/providers/retry.d.ts.map +1 -0
  73. package/dist/providers/retry.js +205 -0
  74. package/dist/providers/retry.js.map +1 -0
  75. package/dist/providers/retry.test.d.ts +7 -0
  76. package/dist/providers/retry.test.d.ts.map +1 -0
  77. package/dist/providers/retry.test.js +439 -0
  78. package/dist/providers/retry.test.js.map +1 -0
  79. package/dist/providers/streaming.d.ts +157 -0
  80. package/dist/providers/streaming.d.ts.map +1 -0
  81. package/dist/providers/streaming.js +233 -0
  82. package/dist/providers/streaming.js.map +1 -0
  83. package/dist/providers/streaming.test.d.ts +7 -0
  84. package/dist/providers/streaming.test.d.ts.map +1 -0
  85. package/dist/providers/streaming.test.js +372 -0
  86. package/dist/providers/streaming.test.js.map +1 -0
  87. package/dist/providers/types.d.ts +209 -0
  88. package/dist/providers/types.d.ts.map +1 -0
  89. package/dist/providers/types.js +53 -0
  90. package/dist/providers/types.js.map +1 -0
  91. package/dist/providers/types.test.d.ts +7 -0
  92. package/dist/providers/types.test.d.ts.map +1 -0
  93. package/dist/providers/types.test.js +141 -0
  94. package/dist/providers/types.test.js.map +1 -0
  95. package/package.json +60 -56
  96. package/sbom.json +3302 -8
  97. package/templates/.github/agents/beth.agent.md +329 -329
  98. package/templates/.github/agents/developer.agent.md +572 -572
  99. package/templates/.github/agents/product-manager.agent.md +272 -272
  100. package/templates/.github/agents/researcher.agent.md +338 -338
  101. package/templates/.github/agents/security-reviewer.agent.md +465 -465
  102. package/templates/.github/agents/tester.agent.md +496 -496
  103. package/templates/.github/agents/ux-designer.agent.md +393 -393
  104. package/templates/mcp.json.example +4 -0
package/README.md CHANGED
@@ -12,21 +12,131 @@ They broke her wings once. They forgot she had claws.
12
12
 
13
13
  ## What Is This?
14
14
 
15
- Beth is a master AI orchestrator system—a ruthless, hyper-competent coordinator that runs your development team the way Beth Dutton runs Schwartz & Meyer. No hand-holding. No excuses. Just results.
15
+ Beth is a **multi-agent AI orchestrator** with a TypeScript runtime, CLI toolchain, MCP integrations, and agent-to-agent (A2A) delegation—all driven by a ruthless coordinator who runs your development team the way Beth Dutton runs Schwartz & Meyer.
16
16
 
17
- She commands an army of specialized agents, each with their own expertise, and she's not afraid to put them to work simultaneously while she lights a cigarette and watches the crew build production-ready code.
17
+ She commands seven specialized agents, each with their own expertise, tools, and handoff chains. On top of the GitHub Copilot agent layer, Beth now ships a **TypeScript core engine** with parsed agent/skill schemas, an Azure OpenAI LLM provider, streaming tool-call support, and a CLI that validates your entire installation in one command.
18
18
 
19
- **She handles:**
20
- - Product strategy that makes competitors weep
21
- - Research that finds the real dirt
22
- - Designs so sharp they cut
23
- - Code that actually works
24
- - Security that locks the gates
25
- - Tests that find every weakness before your enemies do
19
+ **The system has three execution layers:**
20
+
21
+ | Layer | What It Does | Status |
22
+ |-------|-------------|--------|
23
+ | **Copilot Agents** | `.agent.md` definitions running in VS Code Agent Mode | Live |
24
+ | **CLI Toolchain** | `beth init`, `beth doctor`, `beth quickstart` — TypeScript commands with 485 tests | Live |
25
+ | **LLM Provider** | Azure OpenAI with Entra ID auth, streaming, retry, tool calling | Live |
26
+
27
+ ---
28
+
29
+ ## Architecture
30
+
31
+ ```mermaid
32
+ flowchart TB
33
+ subgraph UI["User Interfaces"]
34
+ Copilot["VS Code Copilot Chat<br/><i>Agent Mode</i>"]
35
+ CLI["Beth CLI<br/><i>init · doctor · quickstart</i>"]
36
+ end
37
+
38
+ subgraph Core["Beth Core Engine — TypeScript"]
39
+ AgentLoader["Agent Loader<br/><i>Parse .agent.md frontmatter</i>"]
40
+ SkillLoader["Skill Loader<br/><i>Parse SKILL.md + triggers</i>"]
41
+ Types["Agent & Skill Types<br/><i>Typed schemas</i>"]
42
+ PathVal["Path Validation<br/><i>Traversal/injection guard</i>"]
43
+ end
44
+
45
+ subgraph Agents["Specialist Agents (A2A)"]
46
+ Beth["@Beth<br/><i>Orchestrator</i>"]
47
+ PM["@product-manager"]
48
+ Researcher["@researcher"]
49
+ Designer["@ux-designer"]
50
+ Developer["@developer"]
51
+ Security["@security-reviewer"]
52
+ Tester["@tester"]
53
+ end
54
+
55
+ subgraph Skills["Skills — On-Demand Knowledge"]
56
+ PRD["PRD Generation"]
57
+ Framer["Framer Components"]
58
+ React["React/Next.js<br/>Best Practices"]
59
+ WebDesign["Web Design<br/>Guidelines"]
60
+ Shadcn["shadcn/ui"]
61
+ SecAnalysis["Security Analysis"]
62
+ AzureOps["Azure Operations"]
63
+ WebSearch["Web Search"]
64
+ end
65
+
66
+ subgraph MCP["MCP Servers — Optional"]
67
+ MCPShadcn["shadcn/ui"]
68
+ MCPPlaywright["Playwright"]
69
+ MCPAzure["Azure"]
70
+ MCPBrave["Brave Search"]
71
+ MCPDeepWiki["DeepWiki"]
72
+ end
73
+
74
+ subgraph Provider["LLM Provider Layer"]
75
+ Interface["LLMProviderBase<br/><i>Abstract interface</i>"]
76
+ Azure["AzureOpenAIProvider<br/><i>Entra ID · Streaming</i>"]
77
+ Retry["Retry + Backoff<br/><i>Exponential w/ jitter</i>"]
78
+ Stream["StreamAccumulator<br/><i>Tool call assembly</i>"]
79
+ Config["Config Loader<br/><i>env → ~/.beth/.env</i>"]
80
+ end
81
+
82
+ subgraph Tracking["Work Tracking"]
83
+ Beads["beads (bd CLI)<br/><i>Agent coordination</i>"]
84
+ Backlog["Backlog.md<br/><i>Human changelog</i>"]
85
+ end
86
+
87
+ Copilot --> Beth
88
+ CLI --> Core
89
+ Core --> Agents
90
+ Beth -->|"routes"| PM & Researcher & Designer & Developer & Security & Tester
91
+
92
+ PM -.->|"loads"| PRD
93
+ Designer -.->|"loads"| Framer & WebDesign
94
+ Developer -.->|"loads"| React & Shadcn
95
+ Security -.->|"loads"| SecAnalysis
96
+ Researcher -.->|"loads"| WebSearch
97
+ Developer -.->|"uses"| MCPShadcn
98
+ Tester -.->|"uses"| MCPPlaywright
99
+ Security -.->|"uses"| MCPAzure
100
+ Researcher -.->|"uses"| MCPBrave
101
+
102
+ Azure --> Interface
103
+ Retry --> Azure
104
+ Stream --> Azure
105
+ Config --> Azure
106
+
107
+ Beth -.->|"tracks"| Beads
108
+ Beth -.->|"updates"| Backlog
109
+
110
+ style Beth fill:#1e3a5f,color:#fff
111
+ style Core fill:#f0f4f8
112
+ style Provider fill:#e8f5e9
113
+ ```
114
+
115
+ ---
116
+
117
+ ## Tech Stack
118
+
119
+ | Category | Technology | Notes |
120
+ |----------|-----------|-------|
121
+ | **Runtime** | Node.js ≥ 18 | ES modules, built-in test runner |
122
+ | **Language** | TypeScript (strict mode) | No `any`. Zod for runtime validation |
123
+ | **Target Framework** | React 19 + Next.js App Router | Server Components, Server Actions, Suspense, streaming |
124
+ | **Styling** | Tailwind CSS + `class-variance-authority` (cva) | Utility-first with typed variants |
125
+ | **Components** | shadcn/ui | Radix primitives, copy-paste ownership |
126
+ | **LLM Provider** | Azure OpenAI via `openai` SDK | Entra ID auth (no API keys), streaming + tool calling |
127
+ | **Auth** | `@azure/identity` DefaultAzureCredential | az login, managed identity, VS Code creds |
128
+ | **Frontmatter** | `gray-matter` | Parses `.agent.md` and `SKILL.md` YAML |
129
+ | **Testing** | Node.js built-in test runner | 485 tests — unit, integration, E2E |
130
+ | **Task Tracking** | beads (`bd` CLI) | Dependency-aware issue tracking for agents |
131
+ | **Package Manager** | pnpm | Lockfile committed |
132
+
133
+ **Production dependencies:** 1 (`gray-matter`). That's it. Minimal attack surface by design.
134
+
135
+ ---
26
136
 
27
137
  ## Getting Started
28
138
 
29
- **Project scope:**
139
+ **One command:**
30
140
  ```bash
31
141
  npx beth-copilot init
32
142
  ```
@@ -39,87 +149,97 @@ beth init
39
149
 
40
150
  Then open VS Code, switch Copilot Chat to **Agent mode**, and type `@Beth`.
41
151
 
42
- For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
152
+ **Verify everything works:**
153
+ ```bash
154
+ beth doctor # Health check: Node.js, beads, agents, skills
155
+ beth quickstart # Init + doctor + beads setup in one shot
156
+ ```
43
157
 
44
- ## The Family
158
+ For detailed setup (prerequisites, task tracking, MCP servers): [docs/INSTALLATION.md](docs/INSTALLATION.md)
45
159
 
46
- Beth doesn't work alone. She's got people—loyal, skilled, and ready to execute.
160
+ ---
47
161
 
48
- | Agent | Role | What They Do |
49
- |-------|------|--------------|
50
- | **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
51
- | **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics. |
52
- | **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt. |
53
- | **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility. |
54
- | **@developer** | The Builder | React/TypeScript/Next.js - UI and full-stack. Gets it done. |
55
- | **@tester** | The Enforcer | Quality assurance, accessibility, performance. Finds every crack. |
56
- | **@security-reviewer** | The Bodyguard | Enterprise security. Vulnerabilities, compliance, threat modeling. |
162
+ ## CLI Commands
57
163
 
58
- ### Product Manager vs UX Designer
164
+ | Command | What It Does |
165
+ |---------|-------------|
166
+ | `beth init` | Install agents, skills, VS Code settings, beads tracking |
167
+ | `beth init --force` | Overwrite existing files |
168
+ | `beth doctor` | Validate Node.js ≥18, beads CLI, agents frontmatter, skills directories |
169
+ | `beth quickstart` | Run init + doctor + beads init in one shot |
170
+ | `beth help` | Show all commands and options |
59
171
 
60
- | | Product Manager | UX Designer |
61
- |---|---|---|
62
- | **Focus** | WHAT to build, WHY, WHEN | HOW it looks, feels, behaves |
63
- | **Outputs** | PRDs, user stories, priorities | Component specs, design tokens, accessibility |
64
- | **Example** | "Users need date filtering" | "Date picker: variants, states, ARIA" |
172
+ **Flags:** `--force`, `--skip-backlog`, `--skip-mcp`, `--skip-beads`, `--verbose`
65
173
 
66
- ## Skills (The Weapons)
174
+ ---
67
175
 
68
- Beth's team comes equipped:
176
+ ## Agent-to-Agent (A2A) Orchestration
69
177
 
70
- | Skill | Purpose |
71
- |-------|---------|
72
- | **PRD Generation** | Write requirements docs that don't waste anyone's time |
73
- | **Framer Components** | Build custom React components with property controls |
74
- | **React/Next.js Best Practices** | Vercel-grade performance patterns |
75
- | **Web Design Guidelines** | WCAG compliance, UI review, accessibility |
76
- | **shadcn/ui** | Component library patterns, installation, and best practices |
77
- | **Security Analysis** | OWASP, threat modeling, vulnerability assessment |
178
+ Beth doesn't micromanage. She delegates to specialists over **subagent** and **handoff** channels, tracks dependencies with beads, and holds every agent accountable.
78
179
 
79
- ## How Beth Works
180
+ ### The Family
80
181
 
81
- She doesn't micromanage. She delegates to specialists and holds them accountable.
182
+ | Agent | Role | What They Do |
183
+ |-------|------|--------------|
184
+ | **@Beth** | The Boss | Orchestrates everything. Routes work. Takes names. |
185
+ | **@product-manager** | The Strategist | WHAT to build: PRDs, user stories, priorities, success metrics |
186
+ | **@researcher** | The Intelligence | Competitive analysis, user insights, market dirt |
187
+ | **@ux-designer** | The Architect | HOW it works: component specs, design tokens, accessibility |
188
+ | **@developer** | The Builder | React/TypeScript/Next.js — UI and full-stack |
189
+ | **@tester** | The Enforcer | Quality assurance, accessibility, performance |
190
+ | **@security-reviewer** | The Bodyguard | OWASP, compliance, threat modeling |
82
191
 
83
- ### Architecture
192
+ ### A2A Delegation Model
84
193
 
85
194
  ```mermaid
86
195
  flowchart TB
87
- subgraph User["👤 User"]
88
- Request[User Request]
196
+ subgraph Orchestration["Beth Orchestration Layer"]
197
+ BethCore["@Beth<br/><i>Routes work · Spawns subagents</i>"]
89
198
  end
90
199
 
91
- subgraph Orchestrator["🎯 Beth - The Orchestrator"]
92
- Beth["@Beth<br/><i>'I don't speak dipshit'</i>"]
93
- Assess[Assess Request]
94
- Plan[Plan Workflow]
95
- Route[Route to Specialists]
200
+ subgraph Specialists["Specialist Agents"]
201
+ PM["@product-manager<br/>Requirements · Priorities"]
202
+ R["@researcher<br/>User insights · Market intel"]
203
+ UX["@ux-designer<br/>Component specs · Design tokens"]
204
+ D["@developer<br/>React/TS/Next.js · Implementation"]
205
+ S["@security-reviewer<br/>Threat modeling · Vulnerabilities"]
206
+ T["@tester<br/>QA · a11y · Performance"]
96
207
  end
97
208
 
98
- subgraph Agents["🧑‍💼 Specialist Agents"]
99
- PM["@product-manager<br/>WHAT to build"]
100
- Researcher["@researcher<br/>User/Market Intel"]
101
- Designer["@ux-designer<br/>HOW it works"]
102
- Developer["@developer<br/>Implementation"]
103
- Security["@security-reviewer<br/>Protection"]
104
- Tester["@tester<br/>Quality Gate"]
105
- end
209
+ BethCore -->|"Product Strategy"| PM
210
+ BethCore -->|"User Research"| R
211
+ BethCore -->|"UX Design"| UX
212
+ BethCore -->|"Development"| D
213
+ BethCore -->|"Security Review"| S
214
+ BethCore -->|"Quality Assurance"| T
215
+
216
+ PM -.->|"subagent"| R
217
+ PM -.->|"subagent"| UX
218
+ UX -.->|"subagent"| D
219
+ D -.->|"subagent"| T
220
+ S -.->|"subagent"| D
221
+ T -.->|"subagent"| D
222
+
223
+ style BethCore fill:#1e3a5f,color:#fff
224
+ ```
106
225
 
107
- Request --> Beth
108
- Beth --> Assess --> Plan --> Route
109
-
110
- Route --> PM
111
- Route --> Researcher
112
- Route --> Designer
113
- Route --> Developer
114
- Route --> Security
115
- Route --> Tester
226
+ ### Subagent vs Handoff
116
227
 
117
- style Beth fill:#1e3a5f,color:#fff
118
- style Orchestrator fill:#f0f4f8
119
- style Agents fill:#f8f4f0
228
+ | Mechanism | Control | Use When |
229
+ |-----------|---------|----------|
230
+ | **Subagent** | Beth decides | Task can run autonomously, no human review needed |
231
+ | **Handoff** | User decides | User needs to review before proceeding |
232
+
233
+ ```typescript
234
+ // Beth spawns a specialist — autonomous execution
235
+ runSubagent({
236
+ agentName: "developer",
237
+ prompt: "Implement JWT auth flow with refresh token rotation...",
238
+ description: "Implement auth"
239
+ })
120
240
  ```
121
241
 
122
- ### The Workflow
242
+ ### Workflow: New Feature
123
243
 
124
244
  ```mermaid
125
245
  sequenceDiagram
@@ -133,125 +253,197 @@ sequenceDiagram
133
253
 
134
254
  U->>B: "Build me a feature"
135
255
  B->>B: Assess & Plan
256
+
136
257
  B->>PM: Define requirements
137
- PM-->>B: Requirements ready
258
+ PM-->>B: PRD + user stories
259
+
138
260
  B->>UX: Design the experience
139
- UX-->>B: Design specs ready
261
+ UX-->>B: Component specs + tokens
262
+
140
263
  B->>D: Implement feature
141
264
  D-->>B: Implementation complete
142
- B->>S: Security review
143
- S-->>B: Security approved
144
- B->>T: Test & verify
145
- T-->>B: Quality verified
265
+
266
+ par Parallel quality gates
267
+ B->>S: Security review
268
+ S-->>B: OWASP approved
269
+ and
270
+ B->>T: Test & verify
271
+ T-->>B: a11y + regression pass
272
+ end
273
+
146
274
  B->>U: Feature complete ✅
147
275
  ```
148
276
 
149
- **Bug Hunt?** Tester → Developer → Security → Tester
150
- **Security Audit?** Security → Developer → Tester → Security
277
+ **Bug Hunt?** Tester → Developer → Security → Tester
278
+ **Security Audit?** Security → Developer → Tester → Security sign-off
151
279
 
152
- ### Agent Delegation
280
+ ---
153
281
 
154
- ```mermaid
155
- flowchart TB
156
- subgraph Beth["Beth (Orchestrator)"]
157
- BethCore["Routes all work<br/>Spawns subagents"]
158
- end
282
+ ## MCP Integrations
159
283
 
160
- subgraph PM["Product Manager"]
161
- PMCore["Requirements<br/>Priorities"]
162
- end
284
+ Model Context Protocol servers extend agent capabilities. All **optional** — agents gracefully degrade without them.
285
+
286
+ | Server | Agent | Capability |
287
+ |--------|-------|-----------|
288
+ | **shadcn/ui** | Developer | Component browsing & installation |
289
+ | **Playwright** | Tester | Browser automation, E2E testing |
290
+ | **Azure** | Developer, Security | Cloud resource management |
291
+ | **Brave Search** | Researcher | Internet research |
292
+ | **DeepWiki** | All | Repository documentation lookup |
293
+
294
+ ### Quick Setup
295
+
296
+ ```bash
297
+ # Copy example config and enable what you need
298
+ cp mcp.json.example .vscode/mcp.json
299
+ ```
300
+
301
+ ```json
302
+ {
303
+ "servers": {
304
+ "shadcn": { "command": "npx", "args": ["shadcn@latest", "mcp"] },
305
+ "playwright": { "command": "npx", "args": ["@playwright/mcp@latest"] },
306
+ "azure": { "command": "npx", "args": ["@azure/mcp-server"] },
307
+ "web-search": { "command": "npx", "args": ["@brave/brave-search-mcp-server"] },
308
+ "deepwiki": { "url": "https://mcp.deepwiki.com/mcp" }
309
+ }
310
+ }
311
+ ```
312
+
313
+ Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
163
314
 
164
- subgraph R["Researcher"]
165
- RCore["User insights<br/>Market intel"]
315
+ ---
316
+
317
+ ## Skills (On-Demand Knowledge)
318
+
319
+ Skills are domain-knowledge modules that agents load automatically when trigger phrases match. Each skill lives in `.github/skills/<name>/SKILL.md`.
320
+
321
+ | Skill | Triggers On | Used By |
322
+ |-------|------------|---------|
323
+ | **PRD Generation** | "create a prd", "product requirements" | Product Manager |
324
+ | **Framer Components** | "framer component", "property controls" | UX Designer |
325
+ | **React/Next.js Best Practices** | React performance, Next.js patterns | Developer |
326
+ | **Web Design Guidelines** | "review my UI", "check accessibility" | UX Designer |
327
+ | **shadcn/ui** | "shadcn", "ui component" | Developer |
328
+ | **Security Analysis** | "security review", "OWASP", "threat model" | Security Reviewer |
329
+ | **Azure Operations** | Azure resource management | Developer |
330
+ | **Web Search** | Internet research via Brave | Researcher |
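The trigger-phrase matching described above can be sketched roughly as follows. This is an illustration only: `SkillSketch`, `buildTriggerMap`, and `matchSkills` are hypothetical names, and the case-insensitive substring rule is an assumption, not the package's confirmed matching logic.

```typescript
// Hypothetical shape; the package's real SkillDefinition/TriggerMap types may differ.
interface SkillSketch {
  name: string;
  triggers: string[];
}

// Build a lookup from lower-cased trigger phrase to the skills that declare it.
function buildTriggerMap(skills: SkillSketch[]): Map<string, string[]> {
  const map = new Map<string, string[]>();
  for (const skill of skills) {
    for (const trigger of skill.triggers) {
      const key = trigger.trim().toLowerCase();
      const names = map.get(key) ?? [];
      names.push(skill.name);
      map.set(key, names);
    }
  }
  return map;
}

// Return the names of skills whose trigger phrase occurs in the query
// (assumed rule: case-insensitive substring match, duplicates removed).
function matchSkills(query: string, triggerMap: Map<string, string[]>): string[] {
  const normalized = query.toLowerCase();
  const matched = new Set<string>();
  triggerMap.forEach((names, trigger) => {
    if (normalized.includes(trigger)) {
      names.forEach((name) => matched.add(name));
    }
  });
  return Array.from(matched);
}

// Trigger phrases taken from the skills table; skill names are illustrative.
const triggerMap = buildTriggerMap([
  { name: "prd", triggers: ["create a prd", "product requirements"] },
  { name: "security-analysis", triggers: ["security review", "OWASP", "threat model"] },
]);
```

With this sketch, a query such as "Please run an OWASP security review" resolves to the security skill once, even though two of its triggers match.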
331
+
332
+ ---
333
+
334
+ ## LLM Provider Layer
335
+
336
+ The TypeScript core includes a production-ready provider abstraction for running Beth outside VS Code.
337
+
338
+ ```mermaid
339
+ flowchart LR
340
+ subgraph Config["Configuration"]
341
+ Env["process.env"]
342
+ DotEnv["~/.beth/.env"]
166
343
  end
167
344
 
168
- subgraph UX["UX Designer"]
169
- UXCore["Component specs<br/>Design tokens"]
345
+ subgraph Auth["Authentication"]
346
+ Entra["Entra ID<br/><i>DefaultAzureCredential</i>"]
170
347
  end
171
348
 
172
- subgraph D["Developer"]
173
- DCore["React/TS/Next.js<br/>Implementation"]
349
+ subgraph Provider["Provider"]
350
+ Base["LLMProviderBase<br/><i>Abstract interface</i>"]
351
+ AzureOAI["AzureOpenAIProvider<br/><i>chat · chatStream · countTokens</i>"]
174
352
  end
175
353
 
176
- subgraph S["Security"]
177
- SCore["Threat modeling<br/>Vulnerabilities"]
354
+ subgraph Resilience["Resilience"]
355
+ RetryMod["Exponential Backoff<br/><i>Jitter · 3 retries</i>"]
356
+ Errors["LLMError<br/><i>Typed error codes</i>"]
178
357
  end
179
358
 
180
- subgraph T["Tester"]
181
- TCore["QA & a11y<br/>Performance"]
359
+ subgraph Streaming["Streaming"]
360
+ Accum["StreamAccumulator<br/><i>Content + tool call assembly</i>"]
361
+ Collect["collectStream<br/><i>Full response</i>"]
362
+ Map["mapStream<br/><i>Transform chunks</i>"]
182
363
  end
183
364
 
184
- BethCore -->|"Product Strategy"| PMCore
185
- BethCore -->|"User Research"| RCore
186
- BethCore -->|"UX Design"| UXCore
187
- BethCore -->|"Development"| DCore
188
- BethCore -->|"Security Review"| SCore
189
- BethCore -->|"Quality Assurance"| TCore
190
-
191
- PMCore -.->|"subagent"| RCore
192
- PMCore -.->|"subagent"| UXCore
193
- UXCore -.->|"subagent"| DCore
194
- DCore -.->|"subagent"| TCore
195
- SCore -.->|"subagent"| DCore
365
+ Env --> AzureOAI
366
+ DotEnv --> AzureOAI
367
+ Entra --> AzureOAI
368
+ Base --> AzureOAI
369
+ RetryMod --> AzureOAI
370
+ AzureOAI --> Accum
371
+ AzureOAI --> Collect
372
+ Errors --> RetryMod
196
373
  ```
197
374
 
198
- ## Quick Commands
199
-
200
- Don't waste her time. Be direct.
375
+ **Key capabilities:**
376
+ - **Entra ID auth** — No API keys. Uses `DefaultAzureCredential` (az login, managed identity, VS Code creds)
377
+ - **Streaming** `chatStream()` yields `ChatChunk` objects with incremental tool call delta assembly
378
+ - **Retry** — Exponential backoff with jitter for 429/5xx/network errors. Non-transient errors fail fast
379
+ - **Config** — `process.env` → `~/.beth/.env` precedence chain
380
+ - **193 provider tests** covering types, retry, config, streaming, and Azure client
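As a rough illustration of the retry behavior listed above, the sketch below retries transient failures with capped exponential backoff and full jitter, and rethrows everything else immediately. The names `RetryOptions`, `isTransient`, `backoffDelayMs`, and `withRetry` are assumptions made for this example, as are the two network error codes; the package's `retry.ts` may be structured differently.

```typescript
// Hypothetical options; not the package's actual retry configuration type.
interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  random?: () => number; // injectable so delays are deterministic in tests
  sleep?: (ms: number) => Promise<void>;
}

// Treat HTTP 429, any 5xx, and two common network error codes as transient.
function isTransient(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const { status, code } = error as { status?: unknown; code?: unknown };
  if (typeof status === "number") {
    return status === 429 || (status >= 500 && status <= 599);
  }
  return code === "ECONNRESET" || code === "ETIMEDOUT";
}

// Full jitter: a random delay between 0 and the capped exponential ceiling.
function backoffDelayMs(attempt: number, options: RetryOptions): number {
  const random = options.random ?? Math.random;
  const ceiling = Math.min(options.maxDelayMs, options.baseDelayMs * Math.pow(2, attempt));
  return random() * ceiling;
}

// Run the operation, retrying transient failures up to maxRetries times.
async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Non-transient errors and exhausted retries fail fast.
      if (!isTransient(error) || attempt >= options.maxRetries) throw error;
      await sleep(backoffDelayMs(attempt, options));
    }
  }
}
```

A caller would wrap a single provider request in `withRetry` with `maxRetries: 3`, matching the retry count shown in the diagram. Jitter spreads retries out so that many clients hitting a 429 do not all retry at the same instant.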
201
381
 
202
- ```
203
- @Beth Build me a dashboard for user analytics with real-time updates.
204
- ```
382
+ ---
205
383
 
206
- ```
207
- @Beth Security review for our authentication flow. Find the holes.
208
- ```
384
+ ## TypeScript Core
209
385
 
210
- ```
211
- @developer Implement a drag-and-drop task board. Make it fast.
212
- ```
386
+ The engine that powers everything. Parses agent and skill definitions, validates configuration, and provides typed APIs.
213
387
 
214
- ```
215
- @security-reviewer OWASP top 10 assessment on our API endpoints.
216
- ```
388
+ ### Project Structure
217
389
 
218
390
  ```
219
- @tester Accessibility audit. WCAG 2.1 AA. No excuses.
391
+ beth/
392
+ ├── bin/
393
+ │ └── cli.js # CLI entry point (init, doctor, quickstart, help)
394
+ ├── src/
395
+ │ ├── index.ts # Barrel exports
396
+ │ ├── cli/commands/
397
+ │ │ ├── doctor.ts # System health validation
398
+ │ │ └── quickstart.ts # Guided setup flow
399
+ │ ├── core/
400
+ │ │ ├── agents/
401
+ │ │ │ ├── types.ts # AgentDefinition, AgentFrontmatter, AgentHandoff
402
+ │ │ │ └── loader.ts # Parse .agent.md → typed definitions
403
+ │ │ └── skills/
404
+ │ │ ├── types.ts # SkillDefinition, TriggerMap
405
+ │ │ └── loader.ts # Parse SKILL.md, extract triggers, match queries
406
+ │ ├── lib/
407
+ │ │ └── pathValidation.ts # Traversal/injection guards
408
+ │ └── providers/
409
+ │ ├── interface.ts # LLMProviderBase abstract class
410
+ │ ├── azure.ts # AzureOpenAIProvider (Entra ID, streaming, tools)
411
+ │ ├── types.ts # 17 types: ChatMessage, ToolCall, LLMError, etc.
412
+ │ ├── retry.ts # Exponential backoff with jitter
413
+ │ ├── config.ts # Environment + dotfile config loader
414
+ │ └── streaming.ts # StreamAccumulator, collectStream, mapStream
415
+ ├── templates/
416
+ │ └── .github/
417
+ │ ├── agents/ # 7 agent definitions (.agent.md)
418
+ │ └── skills/ # 8 skill modules (SKILL.md)
419
+ └── docs/
420
+ ├── INSTALLATION.md
421
+ ├── MCP-SETUP.md
422
+ ├── CLI-ARCHITECTURE.md
423
+ └── SYSTEM-FLOW.md
220
424
  ```
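To make the "tool call assembly" role of `streaming.ts` more concrete, here is a minimal sketch of accumulating streamed chunks into a full response. The chunk and delta shapes (`ChunkSketch`, `ToolCallDelta`) and the `ToolCallAccumulator` class are hypothetical stand-ins loosely modeled on how streamed tool calls usually arrive in indexed fragments; they are not the package's `ChatChunk` or `StreamAccumulator` definitions.

```typescript
// Hypothetical shapes for illustration only.
interface ToolCallDelta {
  index: number;              // which tool call this fragment belongs to
  id?: string;                // usually present only on the first fragment
  name?: string;              // usually present only on the first fragment
  argumentsFragment?: string; // partial JSON text, concatenated in arrival order
}

interface ChunkSketch {
  content?: string;
  toolCalls?: ToolCallDelta[];
}

interface AssembledToolCall {
  id: string;
  name: string;
  arguments: string;
}

class ToolCallAccumulator {
  private content = "";
  private readonly calls = new Map<number, AssembledToolCall>();

  // Merge one streamed chunk into the running state.
  add(chunk: ChunkSketch): void {
    if (chunk.content) this.content += chunk.content;
    for (const delta of chunk.toolCalls ?? []) {
      const call = this.calls.get(delta.index) ?? { id: "", name: "", arguments: "" };
      if (delta.id) call.id = delta.id;
      if (delta.name) call.name = delta.name;
      if (delta.argumentsFragment) call.arguments += delta.argumentsFragment;
      this.calls.set(delta.index, call);
    }
  }

  // Return the assembled text and tool calls, ordered by index.
  result(): { content: string; toolCalls: AssembledToolCall[] } {
    const toolCalls = Array.from(this.calls.entries())
      .sort((a, b) => a[0] - b[0])
      .map((entry) => entry[1]);
    return { content: this.content, toolCalls };
  }
}

// Example: one tool call whose JSON arguments arrive split across two chunks.
const accumulator = new ToolCallAccumulator();
accumulator.add({ content: "Hel" });
accumulator.add({
  content: "lo",
  toolCalls: [{ index: 0, id: "call_1", name: "search", argumentsFragment: '{"q":' }],
});
accumulator.add({ toolCalls: [{ index: 0, argumentsFragment: '"beth"}' }] });
const assembled = accumulator.result();
```

The arguments string is only valid JSON once the final fragment has arrived, which is why the sketch defers parsing to the caller.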
221
425
 
222
- ## The Structure
223
-
224
- ```
225
- .github/
226
- ├── agents/ # The crew
227
- │ ├── beth.agent.md # The boss herself
228
- │ ├── product-manager.agent.md
229
- │ ├── researcher.agent.md
230
- │ ├── ux-designer.agent.md
231
- │ ├── developer.agent.md # UI + full-stack
232
- │ ├── tester.agent.md
233
- │ └── security-reviewer.agent.md # Enterprise security
234
- ├── skills/ # Domain expertise
235
- │ ├── prd/
236
- │ ├── framer-components/
237
- │ ├── vercel-react-best-practices/
238
- │ ├── web-design-guidelines/
239
- │ └── security-analysis/ # New: security skill
240
- └── copilot-instructions.md # The rules of engagement
241
- ```
426
+ ### Test Coverage
242
427
 
243
- ## Her Philosophy
428
+ **485 tests** (484 pass, 1 skip, 0 fail):
244
429
 
245
- Beth operates on a few principles:
430
+ | Suite | Tests | What It Covers |
431
+ |-------|-------|---------------|
432
+ | Agent loader | 30+ | Frontmatter parsing, validation, code fence stripping, handoffs |
433
+ | Skill loader | 30+ | Trigger extraction, query matching, trigger map building |
434
+ | Provider types | 40+ | LLMError codes, ChatMessage shapes, ToolDefinition schemas |
435
+ | Provider retry | 40+ | Exponential backoff, jitter, transient error detection |
436
+ | Provider config | 30+ | Env precedence, dotenv parsing, URL validation |
437
+ | Provider streaming | 40+ | Chunk accumulation, tool call delta assembly |
438
+ | Provider Azure | 30+ | Message mapping, response mapping, error wrapping |
439
+ | CLI E2E | 52 | Init/doctor pipeline, MCP template validation, help output |
440
+ | Path validation | 33 | Traversal detection, injection prevention, allowlists |
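For readers unfamiliar with the traversal detection that the path validation suite covers, a guard of that kind might look roughly like the sketch below. `isInsideRoot` is a hypothetical helper written for this example using Node's `path` module; it is not the package's `pathValidation.ts`, which may apply additional checks such as allowlists.

```typescript
import * as path from "node:path";

// Returns true only if `candidate`, resolved against `rootDir`, stays inside `rootDir`.
function isInsideRoot(rootDir: string, candidate: string): boolean {
  // Reject null bytes, a common injection vector for filesystem APIs.
  if (candidate.includes("\0")) return false;

  const root = path.resolve(rootDir);
  const resolved = path.resolve(root, candidate);
  const relative = path.relative(root, resolved);

  // An empty relative path means the candidate is the root itself.
  if (relative === "") return true;

  // A result of ".." or "../..." escapes the root; an absolute result means
  // the two paths share no common base (for example, different drives on Windows).
  const escapes = relative === ".." || relative.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(relative);
}
```

Comparing the resolved relative path, rather than scanning the raw string for `..`, catches inputs like `a/../../b` while still allowing a legitimate file name such as `..notes`.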
246
441
 
247
- 1. **Protect the family** — Your codebase is the ranch. She defends it.
248
- 2. **No weakness** — Tests, security, accessibility. Cover every flank.
249
- 3. **Move fast, break enemies** — Parallel execution, aggressive timelines.
250
- 4. **Loyalty earns trust** — Agents that perform get the good work.
442
+ ---
251
443
 
252
- ### IDEO Design Thinking
444
+ ## IDEO Design Thinking
253
445
 
254
- Beth follows human-centered design methodology:
446
+ Beth follows human-centered design methodology across agent workflows:
255
447
 
256
448
  ```mermaid
257
449
  flowchart LR
@@ -277,35 +469,38 @@ flowchart LR
277
469
 
278
470
  E --> D --> I --> P --> T
279
471
  T -.->|iterate| E
280
- T -.->|iterate| D
281
472
  T -.->|iterate| I
282
473
  ```
283
474
 
475
+ ---
476
+
284
477
  ## Quality Standards
285
478
 
286
479
  Beth doesn't ship garbage:
287
480
 
288
- - **Accessibility**: WCAG 2.1 AA minimum. Everyone uses the product.
289
- - **Performance**: Core Web Vitals green. LCP < 2.5s. No exceptions.
290
- - **Security**: OWASP compliant. Regular audits. Zero tolerance for vulnerabilities.
291
- - **Type Safety**: Full TypeScript coverage. No `any` unless you want a lecture.
292
- - **Test Coverage**: Unit, integration, E2E. If it's not tested, it's not done.
481
+ | Standard | Gate | Enforced By |
482
+ |----------|------|-------------|
483
+ | **WCAG 2.1 AA** | Accessibility compliance | UX Designer + Tester |
484
+ | **Core Web Vitals** | LCP < 2.5s, FID < 100ms, CLS < 0.1 | Developer |
485
+ | **OWASP Top 10** | Zero known vulnerabilities | Security Reviewer |
486
+ | **TypeScript Strict** | No `any` | Developer |
487
+ | **Test Coverage** | Unit + Integration + E2E | Tester |
293
488
 
294
489
  ```mermaid
295
490
  flowchart TB
296
491
  subgraph Standards["Quality Standards"]
297
- A11y["WCAG 2.1 AA<br/>Accessibility"]
298
- Perf["Core Web Vitals<br/>LCP < 2.5s"]
299
- Sec["OWASP Compliant<br/>Zero vulnerabilities"]
300
- Type["Full TypeScript<br/>No any"]
301
- Coverage["Test Coverage<br/>Unit + Integration + E2E"]
492
+ A11y["WCAG 2.1 AA"]
493
+ Perf["Core Web Vitals"]
494
+ Sec["OWASP Compliant"]
495
+ Type["Full TypeScript"]
496
+ Coverage["Test Coverage"]
302
497
  end
303
498
 
304
499
  subgraph Gates["Enforcement"]
305
- Designer["UX Designer<br/>reviews a11y specs"]
306
- Developer["Developer<br/>implements patterns"]
307
- Security["Security Reviewer<br/>audits code"]
308
- Tester["Tester<br/>verifies all gates"]
500
+ Designer["UX Designer"]
501
+ Developer["Developer"]
502
+ Security["Security Reviewer"]
503
+ Tester["Tester"]
309
504
  end
310
505
 
311
506
  A11y --> Designer
@@ -324,13 +519,41 @@ flowchart TB
324
519
  Fix --> Gates
325
520
  ```
326
521
 
522
+ ---
523
+
524
+ ## Quick Commands
525
+
526
+ Don't waste her time. Be direct.
527
+
528
+ ```
529
+ @Beth Build me a dashboard for user analytics with real-time updates.
530
+ ```
531
+
532
+ ```
533
+ @Beth Security review for our authentication flow. Find the holes.
534
+ ```
535
+
536
+ ```
537
+ @developer Implement a drag-and-drop task board. Make it fast.
538
+ ```
539
+
540
+ ```
541
+ @security-reviewer OWASP top 10 assessment on our API endpoints.
542
+ ```
543
+
544
+ ```
545
+ @tester Accessibility audit. WCAG 2.1 AA. No excuses.
546
+ ```
547
+
548
+ ---
549
+
327
550
  ## Why Beth?
328
551
 
329
552
  <p align="center">
330
553
  <img src="assets/beth-questioning.png" alt="Beth" width="500">
331
554
  </p>
332
555
 
333
- Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
556
+ Look, you *could* try to coordinate seven specialists yourself. You could context-switch between product strategy, security reviews, and accessibility audits while keeping your sanity intact.
334
557
 
335
558
  Or you could let Beth handle it.
336
559
 
@@ -344,30 +567,30 @@ Is it magic? No. It's just competence with very good hair.
344
567
 
345
568
  ## Requirements
346
569
 
347
- - VS Code with GitHub Copilot extension
348
- - GitHub Copilot Chat enabled
349
- - The spine to actually ship something
570
+ - **Node.js** ≥ 18
571
+ - **VS Code** with GitHub Copilot extension
572
+ - **GitHub Copilot Chat** in Agent mode
573
+ - [**beads**](https://github.com/steveyegge/beads) for task tracking (`bd` CLI)
350
574
 
351
575
  ### Optional: MCP Servers
352
576
 
353
- Beth's agents work fine without them, but these make them smarter:
354
-
355
- | Server | What It Does | Setup |
356
- |--------|--------------|-------|
357
- | **shadcn/ui** | Component browsing & installation | `npx shadcn@latest mcp init --client vscode` |
358
- | **Playwright** | Browser automation for testing | See [MCP Setup Guide](docs/MCP-SETUP.md) |
359
- | **Azure** | Cloud resource management | See [MCP Setup Guide](docs/MCP-SETUP.md) |
360
- | **Web Search** | Internet research | See [MCP Setup Guide](docs/MCP-SETUP.md) |
577
+ See [MCP Integrations](#mcp-integrations) above or [docs/MCP-SETUP.md](docs/MCP-SETUP.md) for setup.
361
578
 
362
- Full details: [docs/MCP-SETUP.md](docs/MCP-SETUP.md)
579
+ ---
363
580
 
364
581
  ## Documentation
365
582
 
366
- - [Installation Guide](docs/INSTALLATION.md) Full setup instructions
367
- - [MCP Setup](docs/MCP-SETUP.md) — Optional server integrations
368
- - [System Flow & Diagrams](docs/SYSTEM-FLOW.md) Architecture and agent orchestration diagrams
369
- - [Changelog](CHANGELOG.md) Version history and updates
370
- - [Security Policy](SECURITY.md) Vulnerability reporting
583
+ | Doc | Purpose |
584
+ |-----|---------|
585
+ | [Installation Guide](docs/INSTALLATION.md) | Full setup: prerequisites, VS Code config, beads |
586
+ | [MCP Setup](docs/MCP-SETUP.md) | Optional server integrations |
587
+ | [CLI Architecture](docs/CLI-ARCHITECTURE.md) | Dual-interface design, implementation phases |
588
+ | [System Flow](docs/SYSTEM-FLOW.md) | Agent orchestration diagrams |
589
+ | [Contributing Guide](CONTRIBUTING.md) | How to contribute (PR process, review checklist) |
590
+ | [Changelog](CHANGELOG.md) | Version history |
591
+ | [Security Policy](SECURITY.md) | Vulnerability reporting |
592
+
593
+ ---
371
594
 
372
595
  ## License
373
596