agent-bober 0.1.0

Files changed (212)
  1. package/.claude-plugin/plugin.json +9 -0
  2. package/LICENSE +21 -0
  3. package/README.md +495 -0
  4. package/agents/bober-evaluator.md +323 -0
  5. package/agents/bober-generator.md +245 -0
  6. package/agents/bober-planner.md +248 -0
  7. package/dist/cli/commands/eval.d.ts +6 -0
  8. package/dist/cli/commands/eval.d.ts.map +1 -0
  9. package/dist/cli/commands/eval.js +129 -0
  10. package/dist/cli/commands/eval.js.map +1 -0
  11. package/dist/cli/commands/init.d.ts +5 -0
  12. package/dist/cli/commands/init.d.ts.map +1 -0
  13. package/dist/cli/commands/init.js +547 -0
  14. package/dist/cli/commands/init.js.map +1 -0
  15. package/dist/cli/commands/plan.d.ts +5 -0
  16. package/dist/cli/commands/plan.d.ts.map +1 -0
  17. package/dist/cli/commands/plan.js +87 -0
  18. package/dist/cli/commands/plan.js.map +1 -0
  19. package/dist/cli/commands/run.d.ts +5 -0
  20. package/dist/cli/commands/run.d.ts.map +1 -0
  21. package/dist/cli/commands/run.js +120 -0
  22. package/dist/cli/commands/run.js.map +1 -0
  23. package/dist/cli/commands/sprint.d.ts +6 -0
  24. package/dist/cli/commands/sprint.d.ts.map +1 -0
  25. package/dist/cli/commands/sprint.js +206 -0
  26. package/dist/cli/commands/sprint.js.map +1 -0
  27. package/dist/cli/index.d.ts +3 -0
  28. package/dist/cli/index.d.ts.map +1 -0
  29. package/dist/cli/index.js +124 -0
  30. package/dist/cli/index.js.map +1 -0
  31. package/dist/config/defaults.d.ts +15 -0
  32. package/dist/config/defaults.d.ts.map +1 -0
  33. package/dist/config/defaults.js +226 -0
  34. package/dist/config/defaults.js.map +1 -0
  35. package/dist/config/index.d.ts +4 -0
  36. package/dist/config/index.d.ts.map +1 -0
  37. package/dist/config/index.js +8 -0
  38. package/dist/config/index.js.map +1 -0
  39. package/dist/config/loader.d.ts +18 -0
  40. package/dist/config/loader.d.ts.map +1 -0
  41. package/dist/config/loader.js +189 -0
  42. package/dist/config/loader.js.map +1 -0
  43. package/dist/config/schema.d.ts +904 -0
  44. package/dist/config/schema.d.ts.map +1 -0
  45. package/dist/config/schema.js +181 -0
  46. package/dist/config/schema.js.map +1 -0
  47. package/dist/contracts/eval-result.d.ts +205 -0
  48. package/dist/contracts/eval-result.d.ts.map +1 -0
  49. package/dist/contracts/eval-result.js +87 -0
  50. package/dist/contracts/eval-result.js.map +1 -0
  51. package/dist/contracts/index.d.ts +4 -0
  52. package/dist/contracts/index.d.ts.map +1 -0
  53. package/dist/contracts/index.js +16 -0
  54. package/dist/contracts/index.js.map +1 -0
  55. package/dist/contracts/spec.d.ts +101 -0
  56. package/dist/contracts/spec.d.ts.map +1 -0
  57. package/dist/contracts/spec.js +51 -0
  58. package/dist/contracts/spec.js.map +1 -0
  59. package/dist/contracts/sprint-contract.d.ts +141 -0
  60. package/dist/contracts/sprint-contract.d.ts.map +1 -0
  61. package/dist/contracts/sprint-contract.js +80 -0
  62. package/dist/contracts/sprint-contract.js.map +1 -0
  63. package/dist/evaluators/builtin/api-check.d.ts +13 -0
  64. package/dist/evaluators/builtin/api-check.d.ts.map +1 -0
  65. package/dist/evaluators/builtin/api-check.js +152 -0
  66. package/dist/evaluators/builtin/api-check.js.map +1 -0
  67. package/dist/evaluators/builtin/build-check.d.ts +17 -0
  68. package/dist/evaluators/builtin/build-check.d.ts.map +1 -0
  69. package/dist/evaluators/builtin/build-check.js +155 -0
  70. package/dist/evaluators/builtin/build-check.js.map +1 -0
  71. package/dist/evaluators/builtin/command-runner.d.ts +26 -0
  72. package/dist/evaluators/builtin/command-runner.d.ts.map +1 -0
  73. package/dist/evaluators/builtin/command-runner.js +114 -0
  74. package/dist/evaluators/builtin/command-runner.js.map +1 -0
  75. package/dist/evaluators/builtin/lint.d.ts +17 -0
  76. package/dist/evaluators/builtin/lint.d.ts.map +1 -0
  77. package/dist/evaluators/builtin/lint.js +264 -0
  78. package/dist/evaluators/builtin/lint.js.map +1 -0
  79. package/dist/evaluators/builtin/playwright.d.ts +16 -0
  80. package/dist/evaluators/builtin/playwright.d.ts.map +1 -0
  81. package/dist/evaluators/builtin/playwright.js +238 -0
  82. package/dist/evaluators/builtin/playwright.js.map +1 -0
  83. package/dist/evaluators/builtin/typescript-check.d.ts +12 -0
  84. package/dist/evaluators/builtin/typescript-check.d.ts.map +1 -0
  85. package/dist/evaluators/builtin/typescript-check.js +155 -0
  86. package/dist/evaluators/builtin/typescript-check.js.map +1 -0
  87. package/dist/evaluators/builtin/unit-test.d.ts +18 -0
  88. package/dist/evaluators/builtin/unit-test.d.ts.map +1 -0
  89. package/dist/evaluators/builtin/unit-test.js +279 -0
  90. package/dist/evaluators/builtin/unit-test.js.map +1 -0
  91. package/dist/evaluators/index.d.ts +11 -0
  92. package/dist/evaluators/index.d.ts.map +1 -0
  93. package/dist/evaluators/index.js +13 -0
  94. package/dist/evaluators/index.js.map +1 -0
  95. package/dist/evaluators/plugin-interface.d.ts +50 -0
  96. package/dist/evaluators/plugin-interface.d.ts.map +1 -0
  97. package/dist/evaluators/plugin-interface.js +2 -0
  98. package/dist/evaluators/plugin-interface.js.map +1 -0
  99. package/dist/evaluators/plugin-loader.d.ts +18 -0
  100. package/dist/evaluators/plugin-loader.d.ts.map +1 -0
  101. package/dist/evaluators/plugin-loader.js +107 -0
  102. package/dist/evaluators/plugin-loader.js.map +1 -0
  103. package/dist/evaluators/registry.d.ts +78 -0
  104. package/dist/evaluators/registry.d.ts.map +1 -0
  105. package/dist/evaluators/registry.js +238 -0
  106. package/dist/evaluators/registry.js.map +1 -0
  107. package/dist/index.d.ts +17 -0
  108. package/dist/index.d.ts.map +1 -0
  109. package/dist/index.js +22 -0
  110. package/dist/index.js.map +1 -0
  111. package/dist/orchestrator/context-handoff.d.ts +543 -0
  112. package/dist/orchestrator/context-handoff.d.ts.map +1 -0
  113. package/dist/orchestrator/context-handoff.js +133 -0
  114. package/dist/orchestrator/context-handoff.js.map +1 -0
  115. package/dist/orchestrator/evaluator-agent.d.ts +15 -0
  116. package/dist/orchestrator/evaluator-agent.d.ts.map +1 -0
  117. package/dist/orchestrator/evaluator-agent.js +233 -0
  118. package/dist/orchestrator/evaluator-agent.js.map +1 -0
  119. package/dist/orchestrator/generator-agent.d.ts +16 -0
  120. package/dist/orchestrator/generator-agent.d.ts.map +1 -0
  121. package/dist/orchestrator/generator-agent.js +147 -0
  122. package/dist/orchestrator/generator-agent.js.map +1 -0
  123. package/dist/orchestrator/pipeline.d.ts +24 -0
  124. package/dist/orchestrator/pipeline.d.ts.map +1 -0
  125. package/dist/orchestrator/pipeline.js +290 -0
  126. package/dist/orchestrator/pipeline.js.map +1 -0
  127. package/dist/orchestrator/planner-agent.d.ts +10 -0
  128. package/dist/orchestrator/planner-agent.d.ts.map +1 -0
  129. package/dist/orchestrator/planner-agent.js +187 -0
  130. package/dist/orchestrator/planner-agent.js.map +1 -0
  131. package/dist/state/helpers.d.ts +5 -0
  132. package/dist/state/helpers.d.ts.map +1 -0
  133. package/dist/state/helpers.js +8 -0
  134. package/dist/state/helpers.js.map +1 -0
  135. package/dist/state/history.d.ts +39 -0
  136. package/dist/state/history.d.ts.map +1 -0
  137. package/dist/state/history.js +162 -0
  138. package/dist/state/history.js.map +1 -0
  139. package/dist/state/index.d.ts +8 -0
  140. package/dist/state/index.d.ts.map +1 -0
  141. package/dist/state/index.js +22 -0
  142. package/dist/state/index.js.map +1 -0
  143. package/dist/state/plan-state.d.ts +21 -0
  144. package/dist/state/plan-state.d.ts.map +1 -0
  145. package/dist/state/plan-state.js +108 -0
  146. package/dist/state/plan-state.js.map +1 -0
  147. package/dist/state/sprint-state.d.ts +20 -0
  148. package/dist/state/sprint-state.d.ts.map +1 -0
  149. package/dist/state/sprint-state.js +98 -0
  150. package/dist/state/sprint-state.js.map +1 -0
  151. package/dist/utils/fs.d.ts +31 -0
  152. package/dist/utils/fs.d.ts.map +1 -0
  153. package/dist/utils/fs.js +67 -0
  154. package/dist/utils/fs.js.map +1 -0
  155. package/dist/utils/git.d.ts +35 -0
  156. package/dist/utils/git.d.ts.map +1 -0
  157. package/dist/utils/git.js +84 -0
  158. package/dist/utils/git.js.map +1 -0
  159. package/dist/utils/index.d.ts +4 -0
  160. package/dist/utils/index.d.ts.map +1 -0
  161. package/dist/utils/index.js +4 -0
  162. package/dist/utils/index.js.map +1 -0
  163. package/dist/utils/logger.d.ts +45 -0
  164. package/dist/utils/logger.d.ts.map +1 -0
  165. package/dist/utils/logger.js +73 -0
  166. package/dist/utils/logger.js.map +1 -0
  167. package/hooks/hooks.json +10 -0
  168. package/package.json +67 -0
  169. package/scripts/detect-stack.sh +287 -0
  170. package/scripts/init-project.sh +206 -0
  171. package/scripts/run-eval.sh +175 -0
  172. package/skills/bober.anchor/SKILL.md +365 -0
  173. package/skills/bober.anchor/references/anchor-guide.md +567 -0
  174. package/skills/bober.brownfield/SKILL.md +422 -0
  175. package/skills/bober.brownfield/references/codebase-analysis.md +304 -0
  176. package/skills/bober.eval/SKILL.md +235 -0
  177. package/skills/bober.eval/references/eval-strategies.md +407 -0
  178. package/skills/bober.eval/references/feedback-format.md +182 -0
  179. package/skills/bober.plan/SKILL.md +244 -0
  180. package/skills/bober.plan/references/clarification-guide.md +124 -0
  181. package/skills/bober.plan/references/spec-schema.md +253 -0
  182. package/skills/bober.react/SKILL.md +330 -0
  183. package/skills/bober.react/references/react-scaffold.md +344 -0
  184. package/skills/bober.run/SKILL.md +303 -0
  185. package/skills/bober.solidity/SKILL.md +416 -0
  186. package/skills/bober.solidity/references/solidity-guide.md +487 -0
  187. package/skills/bober.sprint/SKILL.md +280 -0
  188. package/skills/bober.sprint/references/contract-schema.md +251 -0
  189. package/templates/base/CLAUDE.md +20 -0
  190. package/templates/base/bober.config.json +35 -0
  191. package/templates/brownfield/CLAUDE.md +34 -0
  192. package/templates/brownfield/bober.config.json +37 -0
  193. package/templates/presets/anchor/CLAUDE.md +163 -0
  194. package/templates/presets/anchor/bober.config.json +9 -0
  195. package/templates/presets/api-node/CLAUDE.md +153 -0
  196. package/templates/presets/api-node/bober.config.json +10 -0
  197. package/templates/presets/nextjs/CLAUDE.md +82 -0
  198. package/templates/presets/nextjs/bober.config.json +14 -0
  199. package/templates/presets/python-api/CLAUDE.md +202 -0
  200. package/templates/presets/python-api/bober.config.json +9 -0
  201. package/templates/presets/react-vite/CLAUDE.md +71 -0
  202. package/templates/presets/react-vite/bober.config.json +53 -0
  203. package/templates/presets/react-vite/scaffold/package.json +45 -0
  204. package/templates/presets/react-vite/scaffold/server/index.ts +38 -0
  205. package/templates/presets/react-vite/scaffold/server/tsconfig.json +24 -0
  206. package/templates/presets/react-vite/scaffold/src/App.tsx +37 -0
  207. package/templates/presets/react-vite/scaffold/src/index.html +12 -0
  208. package/templates/presets/react-vite/scaffold/src/main.tsx +12 -0
  209. package/templates/presets/react-vite/scaffold/tsconfig.json +27 -0
  210. package/templates/presets/react-vite/scaffold/vite.config.ts +34 -0
  211. package/templates/presets/solidity/CLAUDE.md +106 -0
  212. package/templates/presets/solidity/bober.config.json +9 -0
package/skills/bober.brownfield/references/codebase-analysis.md +304 -0
@@ -0,0 +1,304 @@
1
+ # Codebase Analysis Methodology
2
+
3
+ This document describes how to analyze an existing codebase thoroughly before planning brownfield changes. A complete analysis helps prevent regressions, keeps new code consistent with existing patterns, and makes it possible to size sprint contracts correctly.
4
+
5
+ ## Analysis Phases
6
+
7
+ ### Phase 1: Surface-Level Survey (5 minutes)
8
+
9
+ Get the big picture without reading any code.
10
+
11
+ **1. File structure survey:**
12
+ ```
13
+ Use Glob with broad patterns to understand the layout:
14
+ src/**/*
15
+ app/**/*
16
+ server/**/*
17
+ lib/**/*
18
+ tests/**/*
19
+ e2e/**/*
20
+ ```
21
+
22
+ Questions to answer:
23
+ - Is this a monorepo or single project?
24
+ - What is the top-level organization? (feature folders, layer folders, hybrid)
25
+ - How many source files are there? (rough scale: tens, hundreds, thousands)
26
+ - Where do tests live? (co-located, separate directory, both)
27
+
28
+ **2. Package/dependency analysis:**
29
+
30
+ Read `package.json` (or equivalent) and categorize dependencies:
31
+ - Framework (React, Vue, Angular, Express, Fastify, etc.)
32
+ - ORM/database (Prisma, Drizzle, TypeORM, Mongoose, etc.)
33
+ - State management (Redux, Zustand, MobX, Recoil, etc.)
34
+ - UI library (shadcn, Material UI, Chakra, Ant Design, etc.)
35
+ - Testing (vitest, jest, mocha, playwright, cypress, etc.)
36
+ - Build tools (vite, webpack, esbuild, turbopack, etc.)
37
+ - Utilities (lodash, date-fns, zod, etc.)
38
+
39
+ **3. Configuration file scan:**
40
+
41
+ Check for and read:
42
+ - `tsconfig.json` / `jsconfig.json` — Compiler settings, path aliases, strict mode
43
+ - `vite.config.ts` / `next.config.js` / `webpack.config.js` — Build configuration
44
+ - `eslint.config.js` / `.eslintrc.*` / `biome.json` — Linting rules
45
+ - `tailwind.config.ts` — CSS configuration
46
+ - `prisma/schema.prisma` / `drizzle.config.ts` — Database configuration
47
+ - `.env.example` — Environment variables (reveals integrations and services)
48
+ - `Dockerfile` / `docker-compose.yml` — Container configuration
49
+ - `.github/workflows/*.yml` — CI/CD pipeline
50
+
51
+ ### Phase 2: Architecture Mapping (10 minutes)
52
+
53
+ Understand how the system is organized and how data flows.
54
+
55
+ **1. Entry points:**
56
+
57
+ Identify the application's entry points:
58
+ - Frontend: `main.tsx`, `App.tsx`, `pages/_app.tsx`, `app/layout.tsx`
59
+ - Backend: `server/index.ts`, `src/app.ts`, `main.py`
60
+ - CLI: `bin/`, `cli/`
61
+
62
+ Read each entry point to understand the boot sequence: what middleware is loaded, what routes are registered, what providers wrap the app.
63
+
64
+ **2. Routing map:**
65
+
66
+ Frontend routes:
67
+ ```
68
+ Use Grep to find route definitions:
69
+ Pattern: "path.*:.*/" or "Route.*path" or "<Route" (React Router)
70
+ Pattern: "app/" directory structure (Next.js App Router)
71
+ Pattern: "pages/" directory structure (Next.js Pages Router)
72
+ ```
73
+
74
+ Backend routes:
75
+ ```
76
+ Use Grep to find API route definitions:
77
+ Pattern: "app\.(get|post|put|delete|patch)" (Express)
78
+ Pattern: "router\.(get|post|put|delete|patch)" (Express Router)
79
+ Pattern: "@(Get|Post|Put|Delete|Patch)" (NestJS decorators)
80
+ Pattern: "@app\.(get|post|put|delete|patch)" (FastAPI)
81
+ ```
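The Express patterns above can also be applied programmatically. The following is a minimal sketch, assuming route paths are written as plain string literals; the function name `find_express_routes` and the exact regular expression are illustrative and not part of the package.

```python
import re

# Assumed pattern: combines the two Express patterns listed above and
# additionally captures the path when it is a plain string literal.
EXPRESS_ROUTE = re.compile(
    r'\b(?:app|router)\.(get|post|put|delete|patch)\(\s*[\'"]([^\'"]+)[\'"]'
)


def find_express_routes(source: str) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs for each Express-style route call in source."""
    return [(m.group(1).upper(), m.group(2)) for m in EXPRESS_ROUTE.finditer(source)]


sample = '''
app.get("/api/users", getUsers);
router.post('/api/users', createUser);
'''
for method, path in find_express_routes(sample):
    print(method, path)
```

Routes whose paths are built from variables or template strings are not matched by this sketch, so the resulting table should be treated as a starting point for the route map rather than a complete list.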
82
+
83
+ Produce a route table:
84
+ ```
85
+ Frontend Routes:
86
+ / -> pages/Home.tsx
87
+ /login -> pages/Login.tsx
88
+ /dashboard -> pages/Dashboard.tsx (protected)
89
+ /settings -> pages/Settings.tsx (protected)
90
+
91
+ Backend Routes:
92
+ GET /api/users -> routes/users.ts:getUsers
93
+ POST /api/users -> routes/users.ts:createUser
94
+ GET /api/users/:id -> routes/users.ts:getUser
95
+ PUT /api/users/:id -> routes/users.ts:updateUser
96
+ DELETE /api/users/:id -> routes/users.ts:deleteUser
97
+ POST /api/auth/login -> routes/auth.ts:login
98
+ POST /api/auth/logout -> routes/auth.ts:logout
99
+ ```
100
+
101
+ **3. Database schema map:**
102
+
103
+ Read the ORM schema and produce an entity relationship summary:
104
+ ```
105
+ Models:
106
+ User: id, email, passwordHash, name, createdAt, updatedAt
107
+ Post: id, title, content, authorId -> User, createdAt, updatedAt
108
+ Comment: id, content, postId -> Post, authorId -> User, createdAt
109
+
110
+ Relationships:
111
+ User 1:N Post (author)
112
+ User 1:N Comment (author)
113
+ Post 1:N Comment
114
+ ```
115
+
116
+ **4. Middleware/interceptor chain:**
117
+
118
+ For backend apps, trace the middleware chain:
119
+ ```
120
+ Request -> cors -> helmet -> bodyParser -> authMiddleware -> routeHandler -> errorHandler -> Response
121
+ ```
122
+
123
+ For frontend apps, trace the provider chain:
124
+ ```
125
+ <StrictMode>
126
+ <QueryClientProvider>
127
+ <AuthProvider>
128
+ <ThemeProvider>
129
+ <RouterProvider>
130
+ <App />
131
+ ```
132
+
133
+ ### Phase 3: Pattern Extraction (10 minutes)
134
+
135
+ Read 3-5 representative files of each type to extract patterns.
136
+
137
+ **1. Component patterns (frontend):**
138
+
139
+ Read several components and note:
140
+ - Function declaration style: `function Component()` or `const Component = () =>`
141
+ - Props typing: `interface Props {}` or `type Props = {}` or inline
142
+ - State management: useState, useReducer, store hook
143
+ - Data fetching: useEffect + fetch, React Query, SWR, server components
144
+ - Styling: className strings, CSS modules, styled-components, Tailwind
145
+ - File structure: imports, types, component, exports (in what order?)
146
+
147
+ **2. Route handler patterns (backend):**
148
+
149
+ Read several route handlers and note:
150
+ - Handler style: direct function, controller class, handler + service pattern
151
+ - Request validation: Zod, Joi, class-validator, manual
152
+ - Response format: JSON shape, status codes, error format
153
+ - Error handling: try/catch, error middleware, Either/Result pattern
154
+ - Database access: direct ORM calls or through a service layer?
155
+
156
+ **3. Test patterns:**
157
+
158
+ Read several test files and note:
159
+ - Test structure: describe/it, test(), or BDD-style
160
+ - Assertion library: expect (vitest/jest), assert, chai
161
+ - Mocking approach: vi.mock, jest.mock, manual mocks
162
+ - Test data: factories, fixtures, inline objects
163
+ - Setup/teardown: beforeEach/afterEach patterns
164
+
165
+ **4. Import conventions:**
166
+
167
+ Note:
168
+ - Absolute imports (`@/lib/utils`) vs relative (`../../lib/utils`)
169
+ - Barrel imports (`from '@/components'`) vs direct (`from '@/components/Button'`)
170
+ - Type imports: `import type { X }` vs `import { X }`
171
+ - Import ordering: external first, then internal? Alphabetical?
172
+
173
+ ### Phase 4: Health Assessment (5 minutes)
174
+
175
+ Assess the current health of the codebase.
176
+
177
+ **1. Test coverage:**
178
+ ```bash
179
+ # Count test files
180
+ find src -name "*.test.*" | wc -l
181
+ find tests -name "*.test.*" 2>/dev/null | wc -l
182
+
183
+ # Count source files (to calculate ratio)
184
+ find src -name "*.ts" -not -name "*.test.*" -not -name "*.d.ts" | wc -l
185
+
186
+ # Run tests to get current status
187
+ npm test 2>&1 | tail -20
188
+ ```
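The two `find` counts above are typically combined into a single test-file ratio. Below is a small sketch of that arithmetic, assuming the same filters as the shell commands (`*.test.*` for tests; `*.ts` excluding tests and `*.d.ts` for sources). The function name `file_ratio` is hypothetical. Note that this is a file-count ratio, not line or branch coverage.

```python
from pathlib import PurePosixPath


def file_ratio(paths: list[str]) -> float:
    """Ratio of test files to non-test .ts source files, rounded to 2 places."""
    names = [PurePosixPath(p).name for p in paths]
    tests = [n for n in names if ".test." in n]
    sources = [
        n for n in names
        if n.endswith(".ts") and ".test." not in n and not n.endswith(".d.ts")
    ]
    # Avoid dividing by zero when no source files are found.
    return round(len(tests) / len(sources), 2) if sources else 0.0


paths = ["src/a.ts", "src/a.test.ts", "src/b.ts", "src/types.d.ts", "src/c.ts"]
print(file_ratio(paths))
```

With 23 test files and 67 source files, the same calculation gives `round(23 / 67, 2)`, which is 0.34.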
189
+
190
+ **2. Type safety:**
191
+ ```bash
192
+ # Check for any existing type errors
193
+ npx tsc --noEmit 2>&1 | tail -20
194
+
195
+ # Check for `any` usage (indicates weak typing)
196
+ grep -r ": any" src/ --include="*.ts" --include="*.tsx" | wc -l
197
+ ```
198
+
199
+ **3. Code quality indicators:**
200
+ ```bash
201
+ # Check for TODO/FIXME/HACK comments
202
+ grep -r "TODO\|FIXME\|HACK\|XXX" src/ --include="*.ts" --include="*.tsx" | wc -l
203
+
204
+ # Check for console.log statements
205
+ grep -r "console\.log" src/ --include="*.ts" --include="*.tsx" | wc -l
206
+
207
+ # Check linting status
208
+ npm run lint 2>&1 | tail -10
209
+ ```
210
+
211
+ **4. Git health:**
212
+ ```bash
213
+ # Recent activity (who's working on what)
214
+ git log --pretty=format:"%h %an %s" --since="2 weeks ago" | head -20
215
+
216
+ # Most frequently changed files (hot spots)
217
+ git log --name-only --since="1 month ago" --pretty=format: | grep -v '^$' | sort | uniq -c | sort -rn | head -20
218
+
219
+ # Check for uncommitted changes
220
+ git status --porcelain
221
+ ```
222
+
223
+ ### Phase 5: Risk Map
224
+
225
+ Combine the analysis into a risk assessment:
226
+
227
+ **High-risk areas** (modify with extreme caution):
228
+ - Files imported by >10 other files (high coupling)
229
+ - Files with no test coverage
230
+ - Files with recent high churn (many recent commits)
231
+ - Shared utilities and middleware
232
+ - Database schema (migrations affect everything)
233
+ - Authentication/authorization code
234
+
235
+ **Medium-risk areas** (modify carefully with tests):
236
+ - Components used on multiple pages
237
+ - API route handlers with complex business logic
238
+ - Configuration files
239
+ - Shared types/interfaces
240
+
241
+ **Low-risk areas** (safe to modify):
242
+ - Isolated page components
243
+ - New files that don't modify existing code
244
+ - Test files
245
+ - Documentation
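The risk tiers above can be approximated with a simple classifier. This is only a sketch: the `FileFacts` fields, the `churn_threshold` default, and the "more than one importer means medium" rule are assumptions chosen for illustration, since the text names the categories but gives a number only for coupling (>10 importers).

```python
from dataclasses import dataclass


@dataclass
class FileFacts:
    importers: int           # number of other files importing this one
    has_tests: bool          # whether any test covers this file
    recent_commits: int      # commits touching the file in the recent window
    is_shared: bool = False  # shared utility, middleware, schema, or auth code


def risk_level(f: FileFacts, churn_threshold: int = 5) -> str:
    """Classify a file as high, medium, or low risk (thresholds are assumed)."""
    if (
        f.importers > 10
        or not f.has_tests
        or f.recent_commits >= churn_threshold
        or f.is_shared
    ):
        return "high"
    if f.importers > 1:
        return "medium"
    return "low"


print(risk_level(FileFacts(importers=23, has_tests=True, recent_commits=0)))
```

In practice the category of a file (schema, auth, configuration) matters as much as the counts, so the output is best used to rank files for manual review.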
246
+
247
+ ## Output Format
248
+
249
+ The codebase analysis should produce a structured summary that is saved to `.bober/codebase-analysis.json` (or included in the PlanSpec's `techNotes.existingPatterns`) and referenced by all sprint contracts:
250
+
251
+ ```json
252
+ {
253
+ "timestamp": "<ISO-8601>",
254
+ "commit": "<git commit hash>",
255
+ "techStack": {
256
+ "language": "TypeScript 5.x",
257
+ "frontend": "React 18, Vite, React Router v6",
258
+ "backend": "Express.js",
259
+ "database": "PostgreSQL via Prisma",
260
+ "styling": "Tailwind CSS + shadcn/ui",
261
+ "testing": "Vitest (unit), Playwright (E2E)",
262
+ "cicd": "GitHub Actions"
263
+ },
264
+ "architecture": {
265
+ "pattern": "feature-based with shared lib/",
266
+ "frontendRoutes": 8,
267
+ "backendEndpoints": 15,
268
+ "dbModels": 5
269
+ },
270
+ "health": {
271
+ "testFiles": 23,
272
+ "sourceFiles": 67,
273
+ "testCoverageRatio": 0.34,
274
+ "typeErrors": 0,
275
+ "lintErrors": 3,
276
+ "todoComments": 12,
277
+ "anyUsage": 4
278
+ },
279
+ "patterns": {
280
+ "componentStyle": "Arrow function components with Props interface",
281
+ "stateManagement": "Zustand for global state, useState for local",
282
+ "dataFetching": "TanStack Query with custom hooks in src/hooks/",
283
+ "apiCalls": "Fetch wrapper in src/lib/api.ts",
284
+ "errorHandling": "Error boundaries + toast notifications",
285
+ "testStyle": "describe/it blocks with @testing-library/react",
286
+ "importStyle": "Absolute imports with @/ prefix, type imports separated"
287
+ },
288
+ "highRiskFiles": [
289
+ "src/lib/api.ts (imported by 23 files)",
290
+ "src/middleware/auth.ts (all protected routes depend on this)",
291
+ "prisma/schema.prisma (database schema)"
292
+ ]
293
+ }
294
+ ```
295
+
296
+ ## Tips for Effective Analysis
297
+
298
+ 1. **Read the README first.** It often explains the architecture and setup process.
299
+ 2. **Check CLAUDE.md or CONTRIBUTING.md.** These may have explicit instructions about code patterns.
300
+ 3. **Look at recent PRs** (if accessible) to understand the team's expectations.
301
+ 4. **Do not analyze every file.** Sample 3-5 representative files per category. If the first 3 components all use the same pattern, you can assume the rest do too.
302
+ 5. **Pay attention to the `.gitignore`.** It tells you what's generated vs. authored.
303
+ 6. **Check for a monorepo tool.** `turbo.json`, `nx.json`, `pnpm-workspace.yaml`, `lerna.json` indicate monorepo structure.
304
+ 7. **Look for a design system.** Check `src/components/ui/` or similar. If a design system exists, all new UI must use it.
package/skills/bober.eval/SKILL.md +235 -0
@@ -0,0 +1,235 @@
1
+ ---
2
+ name: bober.eval
3
+ description: Run an independent evaluation of the current sprint state against its contract, producing structured pass/fail feedback.
4
+ argument-hint: "[contract-id]"
5
+ ---
6
+
7
+ # bober.eval — Standalone Evaluation Skill
8
+
9
+ You are running the **bober.eval** skill. Your job is to independently evaluate the current state of a sprint implementation against its contract and produce structured feedback. This skill can be run at any time, independently of the sprint execution loop.
10
+
11
+ ## When to Use This Skill
12
+
13
+ - **During development:** To check your progress against criteria before running the full sprint loop
14
+ - **After manual changes:** When you have fixed something the Generator produced and want to re-evaluate
15
+ - **For debugging:** To understand exactly what is passing and failing in a sprint
16
+ - **As a standalone QA check:** To evaluate any codebase state against a sprint contract
17
+
18
+ ## Step 1: Identify the Target Contract
19
+
20
+ **If a contract ID was provided as an argument:**
21
+ - Load the contract from `.bober/contracts/<contractId>.json`
22
+ - Verify it exists
23
+
24
+ **If no contract ID was provided:**
25
+ - Load the most recent PlanSpec from `.bober/specs/`
26
+ - Find the most recent sprint contract with status `in-progress` or `needs-rework`
27
+ - If none are in-progress, find the first `proposed` contract
28
+ - If all are `completed`, tell the user there is nothing to evaluate
29
+
30
+ Read the contract and its parent PlanSpec.
31
+
32
+ ## Step 2: Load Configuration
33
+
34
+ Read `bober.config.json` and extract:
35
+ - `evaluator.strategies`: The configured evaluation strategies
36
+ - `evaluator.model`: The model to use (informational)
37
+ - `commands`: The project commands for build, test, lint, typecheck
38
+
39
+ ## Step 3: Pre-Flight Checks
40
+
41
+ Before running evaluation strategies, verify the environment:
42
+
43
+ 1. **Check if dependencies are installed:**
44
+ ```bash
45
+ # Check for installed dependencies (varies by stack)
46
+ # Node.js: ls node_modules/.package-lock.json 2>/dev/null
47
+ # Rust/Anchor: check target/ directory
48
+ # Solidity/Hardhat: ls node_modules/.package-lock.json 2>/dev/null
49
+ # Solidity/Foundry: check lib/ directory
50
+ # Python: check venv or .venv
51
+ ```
52
+ If dependencies are not installed, run the configured install command first.
53
+
54
+ 2. **Check the current git branch:**
55
+ ```bash
56
+ git branch --show-current
57
+ ```
58
+ Note the branch for the evaluation report.
59
+
60
+ 3. **Check for uncommitted changes:**
61
+ ```bash
62
+ git status --porcelain
63
+ ```
64
+ Note any uncommitted changes in the report. The evaluation should still proceed, but this is important context.
65
+
66
+ ## Step 4: Execute Evaluation Strategies
67
+
68
+ Run each strategy configured in `evaluator.strategies` from the config. Execute them in this order for fastest feedback on failures:
69
+
70
+ ### Priority 1: Build/Compile Verification
71
+ ```bash
72
+ # Use commands.build from config (varies by stack)
73
+ # e.g., npm run build, anchor build, forge build, cargo build, etc.
74
+ ```
75
+ - Record the full output
76
+ - If the build fails, most other checks are unreliable -- still run them but note this
77
+
78
+ ### Priority 2: Type Checking / Static Analysis
79
+ ```bash
80
+ # Use commands.typecheck from config (varies by stack)
81
+ # e.g., npx tsc --noEmit, cargo clippy, solhint, mypy, etc.
82
+ ```
83
+ - Record every type error with file path and line number
84
+ - Count total errors
85
+
86
+ ### Priority 3: Linting
87
+ ```bash
88
+ # Use commands.lint from config (varies by stack)
89
+ # e.g., npm run lint, solhint, clippy, ruff, etc.
90
+ ```
91
+ - Record every lint error (ignore warnings unless they indicate real problems)
92
+ - Count total errors
93
+
94
+ ### Priority 4: Unit Tests
95
+ ```bash
96
+ # Use commands.test from config (varies by stack)
97
+ # e.g., npm test, anchor test, forge test, pytest, etc.
98
+ ```
99
+ - Record which tests passed and which failed
100
+ - For failures, record the test name, expected vs actual output, and file location
101
+ - Check if any pre-existing tests broke (regression)
102
+
103
+ ### Priority 5: E2E Tests (Playwright)
104
+ ```bash
105
+ # Only run if configured and installed
106
+ npx playwright test 2>&1
107
+ ```
108
+ - If Playwright is not installed, mark as `skipped` (not `failed`)
109
+ - Record which tests passed and failed
110
+ - Note if screenshots are available
111
+
112
+ ### Priority 6: API Checks
113
+ - If the contract has API-related success criteria, start the dev server and test endpoints:
114
+ ```bash
115
+ # Start dev server in background
116
+ # Test endpoints with curl
117
+ curl -s -w "\n%{http_code}" http://localhost:<port>/api/<endpoint>
118
+ ```
119
+ - Record response status codes and body shapes
120
+
121
+ ### Priority 7: Custom Strategies
122
+ - For each strategy with `type: "custom"`, execute the command from the strategy's `config` field
123
+ - Record the output and exit code
124
+
125
+ **For each strategy, record:**
126
+ ```json
127
+ {
128
+ "strategy": "<type>",
129
+ "required": true,
130
+ "result": "pass | fail | skipped",
131
+ "exitCode": 0,
132
+ "output": "<relevant output>",
133
+ "errorCount": 0,
134
+ "details": "<explanation>"
135
+ }
136
+ ```
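A record of this shape can be produced by wrapping the configured command. The sketch below assumes the command is given as an argument list; `run_strategy` is a hypothetical helper, and `errorCount` is only a placeholder because real counts need tool-specific output parsing.

```python
import subprocess
import sys


def run_strategy(strategy_type: str, command: list[str], required: bool = True) -> dict:
    """Run one strategy command and return a record in the shape shown above."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The tool is not installed: report skipped rather than failed.
        return {
            "strategy": strategy_type, "required": required, "result": "skipped",
            "exitCode": None, "output": "", "errorCount": 0,
            "details": f"command not found: {command[0]}",
        }
    passed = proc.returncode == 0
    return {
        "strategy": strategy_type,
        "required": required,
        "result": "pass" if passed else "fail",
        "exitCode": proc.returncode,
        "output": (proc.stdout + proc.stderr).strip()[-2000:],  # keep the tail
        "errorCount": 0 if passed else 1,  # placeholder, not a parsed count
        "details": "",
    }


record = run_strategy("custom", [sys.executable, "-c", "print('ok')"])
print(record["result"], record["exitCode"])
```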
137
+
138
+ ## Step 5: Verify Success Criteria
139
+
140
+ Go through EVERY success criterion in the contract, one by one.
141
+
142
+ For each criterion:
143
+
144
+ 1. **Read the criterion and its verification method**
145
+ 2. **Gather evidence:**
146
+ - For `build`/`typecheck`/`lint`/`unit-test`/`playwright`: Use the strategy results from Step 4
147
+ - For `manual`: Read the relevant source files. Trace the code path. Verify the described behavior exists in the code.
148
+ - For `api-check`: Test the specific endpoint described in the criterion
149
+ - For `custom`: Run the custom command
150
+ 3. **Make a judgment: pass, fail, or skipped**
151
+ 4. **Record evidence supporting the judgment**
152
+
153
+ **Judgment rules:**
154
+ - `pass`: You have concrete evidence the criterion is met
155
+ - `fail`: You have concrete evidence the criterion is NOT met, or you cannot find evidence that it IS met
156
+ - `skipped`: The verification method cannot be executed (e.g., Playwright not installed)
157
+
158
+ **A criterion marked `required: true` MUST have a definitive pass or fail. It cannot be skipped.**
159
+
160
+ ## Step 6: Check for Regressions
161
+
162
+ Beyond the contract criteria, check for broader regressions:
163
+
164
+ 1. **Pre-existing test count:** If you can determine how many tests existed before the sprint, compare to the current count. Fewer passing tests = regression.
165
+ 2. **Build stability:** Does the full project build, not just the new code?
166
+ 3. **Unexpected file changes:** Use `git diff --stat` to see all changed files. Flag any files changed that are NOT in the contract's `estimatedFiles`.
167
+
168
+ ## Step 7: Produce the EvalResult
169
+
170
+ Generate the structured evaluation result following the schema in `skills/bober.eval/references/feedback-format.md`.
171
+
172
+ **Overall result determination:**
173
+ - **PASS:** ALL required strategies passed AND ALL required criteria passed AND no critical regressions
174
+ - **FAIL:** ANY required strategy failed OR ANY required criterion failed OR critical regression found
175
+
176
+ Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json`.
177
+
178
+ If this is the first evaluation for this contract, iteration = 1. Otherwise, read the contract's `iterationHistory` to determine the next iteration number.
179
+
180
+ Append to `.bober/history.jsonl`:
181
+ ```json
182
+ {"event":"eval-completed","contractId":"...","evalId":"...","result":"pass|fail","timestamp":"..."}
183
+ ```
184
+
185
+ ## Step 8: Output Report
186
+
187
+ Present results in a clear, human-readable format:
188
+
189
+ ```
190
+ ## Evaluation Report: <sprint title>
191
+
192
+ **Contract:** <contractId>
193
+ **Iteration:** <N>
194
+ **Result:** PASS / FAIL
195
+ **Branch:** <current branch>
196
+ **Uncommitted changes:** yes/no
197
+
198
+ ### Strategy Results
199
+ | Strategy | Required | Result |
200
+ |----------|----------|--------|
201
+ | build | yes | PASS |
202
+ | typecheck| yes | PASS |
203
+ | lint | yes | FAIL (3 errors) |
204
+ | unit-test| yes | PASS (12/12 tests) |
205
+
206
+ ### Success Criteria
207
+ | ID | Description | Required | Result |
208
+ |----|-------------|----------|--------|
209
+ | sc-1-1 | Project builds successfully | yes | PASS |
210
+ | sc-1-2 | Registration form exists at /register | yes | PASS |
211
+ | sc-1-3 | API returns 201 on valid registration | yes | FAIL |
212
+ ...
213
+
214
+ ### Failures (if any)
215
+
216
+ **sc-1-3: API returns 201 on valid registration**
217
+ - What failed: POST /api/auth/register returns 500 instead of 201
218
+ - Where: src/routes/auth.ts:42
219
+ - Evidence: `curl -X POST http://localhost:3000/api/auth/register -H "Content-Type: application/json" -d '{"email":"test@test.com","password":"password123"}'` returned 500 with error "relation users does not exist"
220
+ - Expected: 201 with `{ id, email }` response body
221
+ - Root cause: The database migration has not been run. The users table does not exist.
222
+
223
+ ### Regressions (if any)
224
+ - <description>
225
+
226
+ ### Summary
227
+ <2-3 sentence summary>
228
+ ```
229
+
230
+ ## Anti-Leniency Reminders
231
+
232
+ - If a criterion says "the form displays an error message" and you can only verify the validation logic exists in code but cannot confirm the message renders, mark it as **fail** with a note about what you could not verify.
233
+ - If the build has warnings that look like potential runtime errors (e.g., unused imports of things that should be used), flag them even if the build technically passes.
234
+ - If a test passes but the test itself is trivial (e.g., `expect(true).toBe(true)`), note this in the report. A passing trivial test does not satisfy a functional criterion.
235
+ - If the Generator's self-report says something works but you find evidence it does not, trust your evidence over the report.