agent-bober 0.11.0 → 0.11.3

package/README.md CHANGED
@@ -3,11 +3,11 @@
3
3
  [![npm version](https://img.shields.io/npm/v/agent-bober.svg)](https://www.npmjs.com/package/agent-bober)
4
4
  [![license](https://img.shields.io/npm/l/agent-bober.svg)](https://github.com/BOBER3r/agent-bober/blob/main/LICENSE)
5
5
 
6
- **Generator-Evaluator multi-agent harness for building applications autonomously with any LLM.**
6
+ **Multi-agent harness for building applications autonomously with any LLM.**
7
7
 
8
8
  [agentbober.com](https://agentbober.com) | [npm](https://www.npmjs.com/package/agent-bober) | [GitHub](https://github.com/BOBER3r/agent-bober)
9
9
 
10
- Inspired by Anthropic's engineering publication [**"Harness design for long-running application development"**](https://www.anthropic.com/engineering/harness-design-long-running-apps), agent-bober implements the Generator-Evaluator multi-agent pattern as a reusable, installable workflow. It orchestrates AI agents in a structured loop: a **Planner** decomposes your idea into sprint contracts, a **Generator** writes the code, and an **Evaluator** independently verifies each sprint against its contract before moving on. The result is autonomous, high-quality software development with built-in guardrails, context resets, and brutally honest evaluation.
10
+ Inspired by Anthropic's engineering publication [**"Harness design for long-running application development"**](https://www.anthropic.com/engineering/harness-design-long-running-apps), agent-bober implements a multi-agent pipeline as a reusable, installable workflow. It orchestrates AI agents in a structured loop: a **Researcher** analyzes your codebase, a **Planner** decomposes your idea into sprint contracts, a **Curator** pre-analyzes code patterns and utilities for each sprint, a **Generator** writes the code with curated context, and an **Evaluator** independently verifies each sprint against its contract before moving on. The result is autonomous, high-quality software development with built-in guardrails, context resets, and brutally honest evaluation.
11
11
 
12
12
  Works with **Claude, GPT, Gemini, Ollama**, and any OpenAI-compatible endpoint. Mix and match providers per agent role.
13
13
 
@@ -116,17 +116,19 @@ Specialized workflows:
116
116
 
117
117
  ## Multi-Provider Support
118
118
 
119
- agent-bober is **provider-agnostic**. Use any LLM provider for any agent role. Mix and match -- Opus for planning, GPT-4.1 for generation, local Ollama for evaluation.
119
+ agent-bober is **provider-agnostic**. Use any LLM provider for any agent role. Mix and match providers freely -- use one for planning, another for generation, a local model for evaluation.
120
120
 
121
121
  ### Supported Providers
122
122
 
123
- | Provider | Models | API Key |
124
- |----------|--------|---------|
123
+ | Provider | Shorthands | API Key |
124
+ |----------|-----------|---------|
125
125
  | **Anthropic** (default) | `opus`, `sonnet`, `haiku` | `ANTHROPIC_API_KEY` |
126
- | **OpenAI** | `gpt-4.1`, `gpt-4.1-mini`, `o3`, `o4-mini` | `OPENAI_API_KEY` |
126
+ | **OpenAI** | Any OpenAI model ID | `OPENAI_API_KEY` |
127
127
  | **Google Gemini** | `gemini-pro`, `gemini-flash` | `GOOGLE_API_KEY` or `GEMINI_API_KEY` |
128
128
  | **OpenAI-Compatible** | Any model (Ollama, LM Studio, Groq, DeepSeek, etc.) | Optional |
129
129
 
130
+ Shorthands resolve to the latest model version automatically. You can also pass any full model ID directly -- it will be sent to the provider as-is.
131
+
130
132
  ### Configuration
131
133
 
132
134
  Set providers per agent role in `bober.config.json`:
@@ -139,21 +141,20 @@ Set providers per agent role in `bober.config.json`:
139
141
  },
140
142
  "generator": {
141
143
  "provider": "openai",
142
- "model": "gpt-4.1"
144
+ "model": "your-preferred-model"
143
145
  },
144
146
  "evaluator": {
145
147
  "provider": "openai-compat",
146
- "model": "llama3.1:70b",
148
+ "model": "any-local-model",
147
149
  "endpoint": "http://localhost:11434/v1"
148
150
  }
149
151
  }
150
152
  ```
151
153
 
152
- Model shorthands auto-resolve to the correct provider:
153
- - `"opus"` / `"sonnet"` / `"haiku"` -- Anthropic
154
- - `"gpt-4.1"` / `"o3"` / `"o4-mini"` -- OpenAI
155
- - `"gemini-pro"` / `"gemini-flash"` -- Google
156
- - `"ollama/llama3"` -- OpenAI-compatible at localhost:11434
154
+ The `ollama/` prefix is a shortcut for local models:
155
+ ```jsonc
156
+ { "model": "ollama/llama3" } // resolves to openai-compat at localhost:11434
157
+ ```
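As a rough illustration of the `ollama/` shortcut described above, the sketch below maps a prefixed model string to an `openai-compat` provider at the local endpoint used in the config example. The names `resolveModel` and `ResolvedModel` are hypothetical, and stripping the prefix from the model ID is an assumption; the package's actual resolver may behave differently.

```typescript
// Hypothetical shape; only the `ollama/` behaviour comes from the README text.
interface ResolvedModel {
  provider?: string;
  model: string;
  endpoint?: string;
}

const OLLAMA_PREFIX = "ollama/";
// Endpoint copied from the README's evaluator config example.
const OLLAMA_ENDPOINT = "http://localhost:11434/v1";

function resolveModel(model: string): ResolvedModel {
  if (model.indexOf(OLLAMA_PREFIX) === 0) {
    return {
      provider: "openai-compat",
      // Assumption: the prefix is stripped before the ID is sent on.
      model: model.slice(OLLAMA_PREFIX.length),
      endpoint: OLLAMA_ENDPOINT,
    };
  }
  // Any other ID is passed through unchanged, as the README says of full model IDs.
  return { model };
}
```

Calling `resolveModel("ollama/llama3")` in this sketch yields the `openai-compat` provider with the localhost endpoint, while any other string is returned as-is.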
157
158
 
158
159
  Override provider for all roles from the CLI:
159
160
  ```bash
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "agent-bober",
3
- "version": "0.11.0",
4
- "description": "Generator-Evaluator multi-agent harness for building applications autonomously with any LLM. Supports Claude, GPT, Gemini, Ollama. Includes MCP server for Cursor/Windsurf.",
3
+ "version": "0.11.3",
4
+ "description": "Multi-agent harness for building applications autonomously with any LLM. Researcher, Planner, Curator, Generator, Evaluator pipeline. Supports Claude, GPT, Gemini, Ollama. MCP server for Cursor/Windsurf.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
7
7
  "types": "dist/index.d.ts",
@@ -251,7 +251,69 @@ For retry iterations (iteration > 1), populate `evaluatorFeedback` with the eval
251
251
 
252
252
  Save the handoff to `.bober/handoffs/<handoffId>.json`.
253
253
 
254
- ### 3d. Spawn the Generator Subagent
254
+ ### 3d. Spawn the Curator Subagent (once per sprint, before the first generator attempt)
255
+
256
+ Check if `curator.enabled` is `true` in `bober.config.json` (default: true). If enabled, and this is iteration 1 (not a retry), spawn a curator subagent to produce a Sprint Briefing.
257
+
258
+ **Skip the curator if:**
259
+ - `curator.enabled` is `false` in config
260
+ - This is a retry iteration (iteration > 1) — the briefing already exists from the first attempt
261
+
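The skip rules above can be sketched as a single predicate. This is a minimal illustration, assuming a config object with an optional `curator.enabled` flag; the `BoberConfig` type and `shouldRunCurator` name are hypothetical and not taken from the package.

```typescript
// Hypothetical config shape; only `curator.enabled` (default: true) is from the text.
interface BoberConfig {
  curator?: { enabled?: boolean };
}

function shouldRunCurator(config: BoberConfig, iteration: number): boolean {
  // A missing flag counts as enabled, matching the documented default.
  const enabled = config.curator?.enabled ?? true;
  // Retries (iteration > 1) reuse the briefing from the first attempt.
  return enabled && iteration === 1;
}
```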
262
+ **Use the Agent tool to spawn the curator:**
263
+
264
+ ```
265
+ Agent tool call:
266
+ description: "Curate sprint <N>: <sprint title>"
267
+ subagent_type: bober-curator
268
+ mode: auto
269
+ prompt: <the full prompt below>
270
+ ```
271
+
272
+ **Build the curator prompt:**
273
+
274
+ ```
275
+ You are the Bober Curator subagent. You have been spawned by the orchestrator to produce a Sprint Briefing.
276
+
277
+ ## Sprint Contract
278
+ Read from: .bober/contracts/<contractId>.json
279
+
280
+ ## Project Overview
281
+ Plan: <spec title>
282
+ Description: <spec description>
283
+ Tech Stack: <spec techStack>
284
+
285
+ ## Completed Sprints
286
+ <list completed sprint titles and what they built, or "No prior sprints completed.">
287
+
288
+ ## Project Root
289
+ <project root path>
290
+
291
+ ## Instructions
292
+ 1. Read the sprint contract at .bober/contracts/<contractId>.json
293
+ 2. For each file in estimatedFiles: read it, extract relevant sections, trace imports
294
+ 3. Find existing utilities the generator should reuse (search src/utils/, src/lib/, src/helpers/)
295
+ 4. Find test files similar to what this sprint needs — extract patterns
296
+ 5. Check .bober/principles.md, README.md, architecture docs
297
+ 6. Identify files/tests that may be affected by the changes (grep for imports)
298
+ 7. Determine implementation sequence based on dependencies
299
+ 8. Save the Sprint Briefing to .bober/briefings/<contractId>-briefing.md
300
+
301
+ Your final response must contain ONLY a JSON object (no markdown fences):
302
+ {
303
+ "contractId": "<contract ID>",
304
+ "briefingPath": ".bober/briefings/<contractId>-briefing.md",
305
+ "filesAnalyzed": ["<files you read>"],
306
+ "patternsFound": <number>,
307
+ "utilsIdentified": <number>,
308
+ "summary": "<2-3 sentence summary>"
309
+ }
310
+ ```
311
+
312
+ **After the curator subagent returns:**
313
+ 1. Verify the briefing was saved: check `.bober/briefings/<contractId>-briefing.md` exists
314
+ 2. Log the curator result but do NOT read the full briefing into orchestrator context — the generator will read it from disk
315
+
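One way the orchestrator might check the curator's JSON response before logging it is sketched below. The field names come from the response template above; the `parseCuratorResult` function, its error messages, and the decision to reject a mismatched `briefingPath` are assumptions for illustration.

```typescript
interface CuratorResult {
  contractId: string;
  briefingPath: string;
  filesAnalyzed: string[];
  patternsFound: number;
  utilsIdentified: number;
  summary: string;
}

// Hypothetical validator: parses the raw response and checks each documented field.
function parseCuratorResult(raw: string, contractId: string): CuratorResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("curator result must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const { filesAnalyzed, patternsFound, utilsIdentified, summary } = record;
  const expectedPath = `.bober/briefings/${contractId}-briefing.md`;

  if (record.contractId !== contractId) throw new Error("contractId mismatch");
  if (record.briefingPath !== expectedPath) throw new Error("unexpected briefingPath");
  if (!Array.isArray(filesAnalyzed) || !filesAnalyzed.every((f) => typeof f === "string")) {
    throw new Error("filesAnalyzed must be an array of strings");
  }
  if (typeof patternsFound !== "number" || typeof utilsIdentified !== "number") {
    throw new Error("patternsFound and utilsIdentified must be numbers");
  }
  if (typeof summary !== "string") throw new Error("summary must be a string");

  return {
    contractId,
    briefingPath: expectedPath,
    filesAnalyzed: filesAnalyzed as string[],
    patternsFound,
    utilsIdentified,
    summary,
  };
}
```

Only the small result object is parsed here; the briefing file itself is left on disk for the generator, as the step above requires.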
316
+ ### 3e. Spawn the Generator Subagent
255
317
 
256
318
  Use the **Agent tool** to spawn a generator subagent.
257
319
 
@@ -269,22 +331,30 @@ IMPORTANT: The generator MUST have full write access (`mode: auto` or `mode: byp
269
331
 
270
332
  **Build the generator prompt:**
271
333
 
334
+ IMPORTANT: Do NOT paste the full handoff JSON inline. The handoff has already been saved to disk. Reference the file path instead — this keeps the orchestrator's context lean.
335
+
272
336
  ```
273
337
  You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.
274
338
 
275
339
  ## Context Handoff
276
- <paste the FULL handoff JSON here — this is ALL the context you get>
340
+ Read the full handoff from: .bober/handoffs/<handoffId>.json
341
+
342
+ ## Sprint Briefing
343
+ Read the curated Sprint Briefing FIRST (if it exists): .bober/briefings/<contractId>-briefing.md
344
+ The briefing contains pre-analyzed code patterns, utilities to reuse, affected files, testing patterns, and implementation sequence. Start here before exploring the codebase.
277
345
 
278
346
  ## Instructions
279
- 1. Read the SprintContract at .bober/contracts/<contractId>.json
280
- 2. Read the PlanSpec at .bober/specs/<specId>.json for broader context
281
- 3. Read bober.config.json for commands configuration
282
- 4. Read .bober/principles.md if it exists — adhere to all principles strictly
283
- 5. Read the files listed in the contract's estimatedFiles
284
- 6. Implement the sprint according to the contract's success criteria
285
- 7. Self-verify: run build, typecheck, lint, and test commands
286
- 8. Commit your changes with proper messages (format: "bober(<sprint-N>): <description>")
287
- 9. Work on the feature branch, never on main/master
347
+ 1. Read the Sprint Briefing at .bober/briefings/<contractId>-briefing.md (if it exists)
348
+ 2. Read the handoff at .bober/handoffs/<handoffId>.json
349
+ 3. Read the SprintContract at .bober/contracts/<contractId>.json
350
+ 4. Read the PlanSpec at .bober/specs/<specId>.json for broader context
351
+ 5. Read bober.config.json for commands configuration
352
+ 6. Read .bober/principles.md if it exists — adhere to all principles strictly
353
+ 7. Read the files listed in the contract's estimatedFiles
354
+ 8. Implement the sprint according to the contract's success criteria
355
+ 9. Self-verify: run build, typecheck, lint, and test commands
356
+ 10. Commit your changes with proper messages (format: "bober(<sprint-N>): <description>")
357
+ 11. Work on the feature branch, never on main/master
288
358
 
289
359
  <IF iteration > 1>
290
360
  ## IMPORTANT — This is a RETRY (iteration <N>)
@@ -331,7 +401,7 @@ When done, respond with EXACTLY this JSON structure (no other text):
331
401
  ```
332
402
  5. If the generator subagent crashed or returned an error, mark the sprint as `needs-rework` and log it.
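The generator prompt now points at files on disk instead of embedding the handoff JSON. A minimal sketch of building those two sections follows; the function name `buildGeneratorContext` is hypothetical, and the wording of each line is copied from the prompt template in this step.

```typescript
// Hypothetical helper: emits path references only, never the JSON payloads themselves.
function buildGeneratorContext(handoffId: string, contractId: string): string {
  return [
    "## Context Handoff",
    `Read the full handoff from: .bober/handoffs/${handoffId}.json`,
    "",
    "## Sprint Briefing",
    `Read the curated Sprint Briefing FIRST (if it exists): .bober/briefings/${contractId}-briefing.md`,
  ].join("\n");
}
```

Because the output holds only identifiers and paths, its size stays constant no matter how large the handoff or briefing grows.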
333
403
 
334
- ### 3e. Spawn the Evaluator Subagent
404
+ ### 3f. Spawn the Evaluator Subagent
335
405
 
336
406
  Use the **Agent tool** to spawn an evaluator subagent.
337
407
 
@@ -349,20 +419,16 @@ NOTE: The evaluator has read + bash access but NO write/edit tools (enforced by
349
419
 
350
420
  **Build the evaluator prompt:**
351
421
 
422
+ IMPORTANT: Do NOT paste the full handoff or contract JSON inline. Reference file paths instead — this keeps the orchestrator's context lean. Only include the generator's completion report and minimal context identifiers.
423
+
352
424
  ```
353
425
  You are the Bober Evaluator subagent. You have been spawned by the orchestrator to evaluate a sprint.
354
426
 
355
427
  ## Sprint Contract
356
- <paste the full SprintContract JSON>
428
+ Read from: .bober/contracts/<contractId>.json
357
429
 
358
430
  ## Generator's Completion Report
359
- <paste the generator's completion report JSON>
360
-
361
- ## Project Configuration
362
- <paste relevant sections of bober.config.json: commands, evaluator>
363
-
364
- ## Project Principles
365
- <paste full text of .bober/principles.md or "No principles file found.">
431
+ <paste the generator's completion report JSON — this is small and needed for context>
366
432
 
367
433
  ## Context
368
434
  - Contract ID: <contractId>
@@ -375,10 +441,10 @@ You are the Bober Evaluator subagent. You have been spawned by the orchestrator
375
441
  ## Instructions
376
442
  1. Read the SprintContract at .bober/contracts/<contractId>.json
377
443
  2. Read bober.config.json for configured eval strategies and commands
378
- 3. Run each configured evaluation strategy (typecheck, lint, build, unit-test, playwright, api-check) using the commands from config
379
- 4. Verify EVERY success criterion in the contract one by one
380
- 5. Check for regressions (pre-existing tests still passing, build stability)
381
- 6. Check adherence to project principles
444
+ 3. Read .bober/principles.md if it exists and check adherence to it
445
+ 4. Run each configured evaluation strategy (typecheck, lint, build, unit-test, playwright, api-check) using the commands from config
446
+ 5. Verify EVERY success criterion in the contract one by one
447
+ 6. Check for regressions (pre-existing tests still passing, build stability)
382
448
  7. Produce a structured EvalResult
383
449
 
384
450
  IMPORTANT: You do NOT have Write or Edit tools. Output the EvalResult JSON in your response, and the orchestrator will save it to disk.
@@ -432,7 +498,7 @@ When done, respond with EXACTLY this JSON structure (no other text):
432
498
  2. Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json` (the evaluator cannot write files).
433
499
  3. Determine pass/fail from the `overallResult` field.
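The save path and the pass/fail check from the steps above could look roughly like this. The sketch assumes `overallResult` takes the values `"pass"` or `"fail"`; the source only names the field, so the value set and the helper names are hypothetical.

```typescript
// Assumption: overallResult is "pass" or "fail"; the text does not list its values.
interface EvalResult {
  overallResult: "pass" | "fail";
}

// Builds the documented path: .bober/eval-results/eval-<contractId>-<iteration>.json
function evalResultPath(contractId: string, iteration: number): string {
  return `.bober/eval-results/eval-${contractId}-${iteration}.json`;
}

function sprintPassed(result: EvalResult): boolean {
  return result.overallResult === "pass";
}
```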
434
500
 
435
- ### 3f. Process the Evaluation Result
501
+ ### 3g. Process the Evaluation Result
436
502
 
437
503
  **On PASS:**
438
504
  1. Update contract status to `completed` and save to `.bober/contracts/`.
@@ -477,14 +543,14 @@ When done, respond with EXACTLY this JSON structure (no other text):
477
543
  - If the failure blocks subsequent sprints, stop the pipeline.
478
544
  4. Print failure report with full context.
479
545
 
480
- ### 3g. Context Reset
546
+ ### 3h. Context Reset
481
547
 
482
548
  After each sprint completes (pass or fail), check `pipeline.contextReset` from config:
483
549
  - `always`: Fresh context for the next sprint. The next sprint's Generator receives only its handoff document. (This is the default with subagent architecture — each spawn IS a fresh context.)
484
550
  - `on-threshold`: Same as `always` with subagents, since each subagent is already isolated.
485
551
  - `never`: Carry summary forward in the handoff. Still a fresh subagent, but with richer handoff.
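To make the three `pipeline.contextReset` modes concrete, the sketch below shows how the next handoff might differ per mode. The `Handoff` shape and its `priorSummary` field are hypothetical; only the mode names and their described behaviour come from the list above.

```typescript
type ContextReset = "always" | "on-threshold" | "never";

// Hypothetical handoff shape; `priorSummary` is an illustrative field name.
interface Handoff {
  handoffId: string;
  priorSummary?: string;
}

function nextHandoff(mode: ContextReset, base: Handoff, summary: string): Handoff {
  // Only "never" carries a summary forward; the subagent is still a fresh context.
  if (mode === "never") {
    return { ...base, priorSummary: summary };
  }
  // "always" and "on-threshold" behave the same with subagents: handoff only.
  return { ...base };
}
```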
486
552
 
487
- ### 3h. Iteration Budget
553
+ ### 3i. Iteration Budget
488
554
 
489
555
  Track total Generator-Evaluator iterations across all sprints:
490
556
  - Each Generator+Evaluator cycle counts as 1 iteration.
@@ -134,7 +134,68 @@ Save the handoff to `.bober/handoffs/<handoffId>.json`.
134
134
  }
135
135
  ```
136
136
 
137
- ## Step 4: Spawn the Generator Subagent
137
+ ## Step 4: Spawn the Curator Subagent (once per sprint)
138
+
139
+ Check if `curator.enabled` is `true` in `bober.config.json` (default: true). If enabled, spawn a curator subagent ONCE before the first generator attempt to produce a Sprint Briefing.
140
+
141
+ **Skip the curator if:**
142
+ - `curator.enabled` is `false` in config
143
+ - A briefing already exists at `.bober/briefings/<contractId>-briefing.md` (from a previous run)
144
+
145
+ **Use the Agent tool to spawn the curator:**
146
+
147
+ ```
148
+ Agent tool call:
149
+ description: "Curate sprint <N>: <sprint title>"
150
+ subagent_type: bober-curator
151
+ mode: auto
152
+ prompt: <the prompt below>
153
+ ```
154
+
155
+ **Curator prompt:**
156
+
157
+ ```
158
+ You are the Bober Curator subagent. You have been spawned by the orchestrator to produce a Sprint Briefing.
159
+
160
+ ## Sprint Contract
161
+ Read from: .bober/contracts/<contractId>.json
162
+
163
+ ## Project Overview
164
+ Plan: <spec title>
165
+ Description: <spec description>
166
+
167
+ ## Completed Sprints
168
+ <list completed sprint titles and what they built, or "No prior sprints completed.">
169
+
170
+ ## Project Root
171
+ <project root path>
172
+
173
+ ## Instructions
174
+ 1. Read the sprint contract at .bober/contracts/<contractId>.json
175
+ 2. For each file in estimatedFiles: read it, extract relevant sections, trace imports
176
+ 3. Find existing utilities the generator should reuse (search src/utils/, src/lib/, src/helpers/)
177
+ 4. Find test files similar to what this sprint needs — extract patterns
178
+ 5. Check .bober/principles.md, README.md, architecture docs
179
+ 6. Identify files/tests that may be affected by the changes (grep for imports)
180
+ 7. Determine implementation sequence based on dependencies
181
+ 8. Save the Sprint Briefing to .bober/briefings/<contractId>-briefing.md
182
+
183
+ Your final response must contain ONLY a JSON object (no markdown fences):
184
+ {
185
+ "contractId": "<contract ID>",
186
+ "briefingPath": ".bober/briefings/<contractId>-briefing.md",
187
+ "filesAnalyzed": ["<files you read>"],
188
+ "patternsFound": <number>,
189
+ "utilsIdentified": <number>,
190
+ "summary": "<2-3 sentence summary>"
191
+ }
192
+ ```
193
+
194
+ **After the curator subagent returns:**
195
+ 1. Verify the briefing was saved: check `.bober/briefings/<contractId>-briefing.md` exists
196
+ 2. Do NOT read the full briefing into orchestrator context — the generator reads it from disk
197
+
198
+ ## Step 5: Spawn the Generator Subagent
138
199
 
139
200
  **Before spawning:**
140
201
  1. Ensure the correct git branch exists and is checked out:
@@ -157,22 +218,30 @@ IMPORTANT: Use `mode: auto` or `mode: bypassPermissions` — the generator needs
157
218
 
158
219
  **Generator prompt:**
159
220
 
221
+ IMPORTANT: Do NOT paste the full handoff JSON inline. The handoff has already been saved to disk. Reference the file path instead — this keeps the orchestrator's context lean.
222
+
160
223
  ```
161
224
  You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.
162
225
 
163
226
  ## Context Handoff
164
- <paste the FULL handoff JSON>
227
+ Read the full handoff from: .bober/handoffs/<handoffId>.json
228
+
229
+ ## Sprint Briefing
230
+ Read the curated Sprint Briefing FIRST (if it exists): .bober/briefings/<contractId>-briefing.md
231
+ The briefing contains pre-analyzed code patterns, utilities to reuse, affected files, testing patterns, and implementation sequence. Start here before exploring the codebase.
165
232
 
166
233
  ## Instructions
167
- 1. Read the SprintContract at .bober/contracts/<contractId>.json
168
- 2. Read the PlanSpec at .bober/specs/<specId>.json for broader context
169
- 3. Read bober.config.json for commands configuration
170
- 4. Read .bober/principles.md if it exists — adhere to all principles strictly
171
- 5. Read the files listed in the contract's estimatedFiles
172
- 6. Implement the sprint according to the contract's success criteria
173
- 7. Self-verify: run build, typecheck, lint, and test commands
174
- 8. Commit your changes (format: "bober(<sprint-N>): <description>")
175
- 9. Work on the feature branch, never on main/master
234
+ 1. Read the Sprint Briefing at .bober/briefings/<contractId>-briefing.md (if it exists)
235
+ 2. Read the handoff at .bober/handoffs/<handoffId>.json
236
+ 3. Read the SprintContract at .bober/contracts/<contractId>.json
237
+ 4. Read the PlanSpec at .bober/specs/<specId>.json for broader context
238
+ 5. Read bober.config.json for commands configuration
239
+ 6. Read .bober/principles.md if it exists — adhere to all principles strictly
240
+ 7. Read the files listed in the contract's estimatedFiles
241
+ 8. Implement the sprint according to the contract's success criteria
242
+ 9. Self-verify: run build, typecheck, lint, and test commands
243
+ 10. Commit your changes (format: "bober(<sprint-N>): <description>")
244
+ 11. Work on the feature branch, never on main/master
176
245
 
177
246
  <IF iteration > 1>
178
247
  ## IMPORTANT — This is a RETRY (iteration <N>)
@@ -206,7 +275,7 @@ When done, respond with EXACTLY this JSON structure (no other text):
206
275
  3. Save the generator report to `.bober/handoffs/gen-report-<contractId>-<iteration>.json`
207
276
  4. If the generator subagent crashed or returned an error, mark the sprint as `needs-rework` with note "Generator subagent failed".
208
277
 
209
- ## Step 5: Spawn the Evaluator Subagent
278
+ ## Step 6: Spawn the Evaluator Subagent
210
279
 
211
280
  **Use the Agent tool to spawn the evaluator:**
212
281
 
@@ -222,20 +291,16 @@ NOTE: The evaluator needs `mode: auto` for bash access (running tests, builds).
222
291
 
223
292
  **Evaluator prompt:**
224
293
 
294
+ IMPORTANT: Do NOT paste the full contract or config JSON inline. Reference file paths instead — this keeps the orchestrator's context lean. Only include the generator's completion report and minimal context identifiers.
295
+
225
296
  ```
226
297
  You are the Bober Evaluator subagent. You have been spawned by the orchestrator to evaluate a sprint.
227
298
 
228
299
  ## Sprint Contract
229
- <paste the full SprintContract JSON>
300
+ Read from: .bober/contracts/<contractId>.json
230
301
 
231
302
  ## Generator's Completion Report
232
- <paste the generator's completion report JSON>
233
-
234
- ## Project Configuration
235
- <paste relevant sections: commands, evaluator config>
236
-
237
- ## Project Principles
238
- <paste full text of .bober/principles.md or "No principles file found.">
303
+ <paste the generator's completion report JSON — this is small and needed for context>
239
304
 
240
305
  ## Context
241
306
  - Contract ID: <contractId>
@@ -243,14 +308,15 @@ You are the Bober Evaluator subagent. You have been spawned by the orchestrator
243
308
  - Sprint: <N> of <total>
244
309
  - Iteration: <N>
245
310
  - Branch: <current branch>
311
+ - Changed files (per generator): <list of files>
246
312
 
247
313
  ## Instructions
248
314
  1. Read the SprintContract at .bober/contracts/<contractId>.json
249
315
  2. Read bober.config.json for configured eval strategies and commands
250
- 3. Run each configured evaluation strategy using the commands from config
251
- 4. Verify EVERY success criterion one by one
252
- 5. Check for regressions
253
- 6. Check adherence to project principles
316
+ 3. Read .bober/principles.md if it exists and check adherence to it
317
+ 4. Run each configured evaluation strategy using the commands from config
318
+ 5. Verify EVERY success criterion one by one
319
+ 6. Check for regressions
254
320
  7. Produce a structured EvalResult
255
321
 
256
322
  IMPORTANT: You do NOT have Write or Edit tools. Output the EvalResult JSON in your response.
@@ -278,7 +344,7 @@ Respond with EXACTLY this JSON structure (no other text):
278
344
  2. Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json` (the evaluator cannot write files).
279
345
  3. Determine pass/fail from the `overallResult` field.
280
346
 
281
- ## Step 6: Process Evaluation Result
347
+ ## Step 7: Process Evaluation Result
282
348
 
283
349
  ### If the sprint PASSES:
284
350
 
@@ -365,7 +431,7 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
365
431
  - Run /bober-plan to revise the plan
366
432
  ```
367
433
 
368
- ## Step 7: Context Reset
434
+ ## Step 8: Context Reset
369
435
 
370
436
  After a sprint completes (pass or fail), manage context:
371
437
 
@@ -389,3 +455,260 @@ Read `pipeline.contextReset` from config:
389
455
  After completing this phase, suggest the following next steps to the user:
390
456
  - `/bober-eval` — Evaluate the current sprint output independently
391
457
  - `/bober-sprint` — Execute the next sprint in the plan
458
+
459
+
460
+ ---
461
+
462
+ <!-- Reference: contract-schema.md -->
463
+
464
+ # SprintContract JSON Schema
465
+
466
+ This document defines the complete schema for SprintContract documents. Sprint contracts are the binding agreement between the Planner, Generator, and Evaluator for a single sprint.
467
+
468
+ ## Location
469
+
470
+ SprintContract files are stored at: `.bober/contracts/<contractId>.json`
471
+
472
+ ## Naming Convention
473
+
474
+ - `contractId` format: `sprint-<specId>-<sprint-number>`
475
+ - Example: `sprint-spec-20260326-user-auth-1`
476
+ - Sprint numbers are 1-indexed (first sprint is 1, not 0)
477
+
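The naming convention above can be expressed as a pair of helpers. This is a sketch only; `makeContractId` and `parseContractId` are hypothetical names, and splitting on the last hyphen assumes the sprint number is always the final segment, as in the example ID.

```typescript
const CONTRACT_PREFIX = "sprint-";

function makeContractId(specId: string, sprintNumber: number): string {
  // Sprint numbers are 1-indexed integers.
  if (!(sprintNumber >= 1) || sprintNumber % 1 !== 0) {
    throw new RangeError("sprintNumber must be an integer >= 1");
  }
  return `${CONTRACT_PREFIX}${specId}-${sprintNumber}`;
}

function parseContractId(contractId: string): { specId: string; sprintNumber: number } {
  const lastDash = contractId.lastIndexOf("-");
  if (contractId.indexOf(CONTRACT_PREFIX) !== 0 || lastDash < CONTRACT_PREFIX.length) {
    throw new Error(`not a contractId: ${contractId}`);
  }
  // The specId may itself contain hyphens, so only the last segment is the number.
  const specId = contractId.slice(CONTRACT_PREFIX.length, lastDash);
  const sprintNumber = Number(contractId.slice(lastDash + 1));
  if (specId.length === 0 || !(sprintNumber >= 1) || sprintNumber % 1 !== 0) {
    throw new Error(`not a contractId: ${contractId}`);
  }
  return { specId, sprintNumber };
}
```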
478
+ ## Full Schema
479
+
480
+ ```json
481
+ {
482
+ "contractId": "string (required)",
483
+ "specId": "string (required, references parent PlanSpec)",
484
+ "sprintNumber": "number (required, 1-indexed)",
485
+ "title": "string (required, concise sprint title)",
486
+ "description": "string (required, what this sprint delivers)",
487
+ "status": "string (required, one of: proposed, in-progress, completed, needs-rework)",
488
+ "createdAt": "string (required, ISO-8601)",
489
+ "updatedAt": "string (required, ISO-8601)",
490
+ "completedAt": "string (optional, ISO-8601, set when status becomes completed)",
491
+
492
+ "dependsOn": [
493
+ "string — contractId references for sprints that must complete before this one"
494
+ ],
495
+
496
+ "features": [
497
+ "string — featureId references from the parent PlanSpec"
498
+ ],
499
+
500
+ "successCriteria": [
501
+ {
502
+ "criterionId": "string (required, format: sc-<sprint>-<index>)",
503
+ "description": "string (required, specific testable criterion)",
504
+ "verificationMethod": "string (required, one of: manual, typecheck, lint, unit-test, playwright, api-check, build, custom)",
505
+ "required": "boolean (required, true = must pass for sprint to pass)",
506
+ "customCommand": "string (optional, command to run for custom verification)"
507
+ }
508
+ ],
509
+
510
+ "generatorNotes": "string (required, guidance for the Generator agent)",
511
+ "evaluatorNotes": "string (required, guidance for the Evaluator agent)",
512
+
513
+ "estimatedFiles": [
514
+ "string — file paths expected to be created or modified"
515
+ ],
516
+
517
+ "estimatedDuration": "string (required, one of: small, medium, large)",
518
+
519
+ "iterationHistory": [
520
+ {
521
+ "iteration": "number",
522
+ "evalId": "string — reference to EvalResult",
523
+ "result": "string (pass | fail)",
524
+ "timestamp": "string (ISO-8601)"
525
+ }
526
+ ],
527
+
528
+ "lastEvalId": "string (optional, reference to most recent EvalResult)"
529
+ }
530
+ ```
531
+
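For readers working in TypeScript, the schema above translates fairly directly into types. The sketch below is one possible rendering, not the package's own type definitions; `completedAt` is typed as nullable because the example contract sets it to `null`, and the `customCriteriaMissingCommand` helper is a hypothetical consistency check.

```typescript
type ContractStatus = "proposed" | "in-progress" | "completed" | "needs-rework";
type VerificationMethod =
  | "manual" | "typecheck" | "lint" | "unit-test"
  | "playwright" | "api-check" | "build" | "custom";

interface SuccessCriterion {
  criterionId: string;
  description: string;
  verificationMethod: VerificationMethod;
  required: boolean;
  customCommand?: string;
}

interface IterationRecord {
  iteration: number;
  evalId: string;
  result: "pass" | "fail";
  timestamp: string; // ISO-8601
}

interface SprintContract {
  contractId: string;
  specId: string;
  sprintNumber: number;
  title: string;
  description: string;
  status: ContractStatus;
  createdAt: string;
  updatedAt: string;
  completedAt?: string | null;
  dependsOn: string[];
  features: string[];
  successCriteria: SuccessCriterion[];
  generatorNotes: string;
  evaluatorNotes: string;
  estimatedFiles: string[];
  estimatedDuration: "small" | "medium" | "large";
  iterationHistory: IterationRecord[];
  lastEvalId?: string;
}

// Hypothetical check: "custom" criteria need a customCommand to be runnable.
function customCriteriaMissingCommand(contract: SprintContract): string[] {
  return contract.successCriteria
    .filter((c) => c.verificationMethod === "custom" && !c.customCommand)
    .map((c) => c.criterionId);
}
```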
532
+ ## Field Descriptions
533
+
534
+ ### Core Fields
535
+
536
+ | Field | Description |
537
+ |-------|-------------|
538
+ | `contractId` | Unique identifier. Generated by the Planner. Never changes. |
539
+ | `specId` | Reference to the parent PlanSpec. Used to load broader context. |
540
+ | `sprintNumber` | Position in the sprint sequence. 1-indexed. |
541
+ | `title` | Concise description of what this sprint delivers. Should start with a verb: "Implement...", "Add...", "Create...". |
542
+ | `description` | 2-4 sentences describing the sprint's deliverables and scope. |
543
+ | `status` | Lifecycle state. See Status Transitions below. |
544
+
545
+ ### Status Transitions
546
+
547
+ ```
548
+ proposed → in-progress → completed
549
+
550
+ needs-rework → in-progress → completed
551
+ ```
552
+
553
+ - `proposed`: Created by the Planner. Not yet started or reviewed.
554
+ - `in-progress`: Contract negotiated and Generator is working on it.
555
+ - `completed`: All required success criteria passed evaluation.
556
+ - `needs-rework`: Failed evaluation after maximum iterations. Requires human intervention or plan revision.
557
+
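The lifecycle above can be encoded as a small transition table. This sketch assumes that `needs-rework` is reached from `in-progress` (after a failed evaluation at the iteration limit) and that `completed` is terminal; the diagram does not draw either edge explicitly, so treat both as inferences. `canTransition` is a hypothetical name.

```typescript
type ContractStatus = "proposed" | "in-progress" | "completed" | "needs-rework";

// Allowed next states per status. The in-progress -> needs-rework edge is inferred
// from the description of needs-rework, not drawn in the diagram.
const ALLOWED_TRANSITIONS: Record<ContractStatus, ContractStatus[]> = {
  "proposed": ["in-progress"],
  "in-progress": ["completed", "needs-rework"],
  "needs-rework": ["in-progress"],
  "completed": [],
};

function canTransition(from: ContractStatus, to: ContractStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}
```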
558
+ ### Dependencies
559
+
560
+ | Field | Description |
561
+ |-------|-------------|
562
+ | `dependsOn` | Array of `contractId` values that must have status `completed` before this sprint can start. Empty array for the first sprint. |
563
+ | `features` | Array of `featureId` values from the parent PlanSpec that this sprint implements (partially or fully). |
564
+
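The `dependsOn` rule amounts to a simple gate: a sprint may start only once every referenced contract is `completed`. A minimal sketch, assuming the caller has a map from `contractId` to status (the `unmetDependencies` helper and that map are hypothetical):

```typescript
type ContractStatus = "proposed" | "in-progress" | "completed" | "needs-rework";

// Returns the contractIds that still block the sprint; an empty array means it may start.
function unmetDependencies(
  dependsOn: string[],
  statusById: Record<string, ContractStatus | undefined>,
): string[] {
  // Unknown IDs count as unmet, since their completion cannot be confirmed.
  return dependsOn.filter((id) => statusById[id] !== "completed");
}
```

The first sprint has an empty `dependsOn`, so the result is empty and it can start immediately.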
565
+ ### Success Criteria
566
+
567
+ Each success criterion is a single testable statement that the Evaluator checks independently.
568
+
569
+ | Field | Description |
570
+ |-------|-------------|
571
+ | `criterionId` | Unique within the contract. Format: `sc-<sprintNumber>-<index>` (1-indexed). |
572
+ | `description` | Specific, testable criterion. Must describe observable behavior or measurable outcome. |
573
+ | `verificationMethod` | How the Evaluator should verify this criterion. |
574
+ | `required` | If `true`, this criterion MUST pass for the sprint to pass. If `false`, it is advisory. |
575
+ | `customCommand` | Only for `verificationMethod: "custom"`. The command the Evaluator should run. |
576
+
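Two rules from the table above lend themselves to a short sketch: the `criterionId` format and the way `required` decides the sprint outcome. The `CriterionOutcome` type and both function names are hypothetical; the logic follows the descriptions of `criterionId` and `required`.

```typescript
// Hypothetical per-criterion outcome as an evaluator might report it.
interface CriterionOutcome {
  criterionId: string;
  required: boolean;
  passed: boolean;
}

// Checks the documented format sc-<sprintNumber>-<index>, with a 1-indexed index.
function isValidCriterionId(id: string, sprintNumber: number): boolean {
  return new RegExp(`^sc-${sprintNumber}-[1-9]\\d*$`).test(id);
}

// A sprint passes when every required criterion passed; advisory ones are ignored.
function sprintPasses(outcomes: CriterionOutcome[]): boolean {
  return outcomes.every((o) => !o.required || o.passed);
}
```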
577
+ ### Verification Methods
578
+
579
+ | Method | What the Evaluator Does |
580
+ |--------|------------------------|
581
+ | `manual` | Reads source code and assesses whether the criterion is met based on code inspection and logic tracing. |
582
+ | `typecheck` | Runs the configured typecheck command. Criterion passes if zero type errors. |
583
+ | `lint` | Runs the configured lint command. Criterion passes if zero lint errors (warnings OK). |
584
+ | `unit-test` | Runs the configured test command. Criterion passes if all tests pass. |
585
+ | `playwright` | Runs Playwright E2E tests. Criterion passes if all relevant E2E tests pass. |
586
+ | `api-check` | Tests specific API endpoints using curl or similar. Criterion passes if responses match expectations. |
587
+ | `build` | Runs the configured build command. Criterion passes if build succeeds with exit code 0. |
588
+ | `custom` | Runs `customCommand` and interprets the result. Exit code 0 = pass. |
589
+
590
+ ### Agent Notes
591
+
592
+ | Field | Description |
593
+ |-------|-------------|
594
+ | `generatorNotes` | Free-form guidance for the Generator. Should include: key files to examine for patterns, known gotchas, suggested implementation order, references to similar existing code. |
595
+ | `evaluatorNotes` | Free-form guidance for the Evaluator. Should include: specific things to test, edge cases to check, how to verify UI criteria, expected API response shapes. |
596
+
597
+ ### Estimates
598
+
599
+ | Field | Description |
600
+ |-------|-------------|
601
+ | `estimatedFiles` | Array of file paths the Generator is expected to create or modify. This is advisory -- the Generator may touch additional files if needed. The Evaluator uses this to check for unexpected changes. |
602
+ | `estimatedDuration` | Relative size estimate: `small` (30-60 min), `medium` (1-3 hours), `large` (3-5 hours). |
603
+
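Since `estimatedFiles` is advisory, a check against it would flag surprises rather than fail the sprint. The sketch below assumes the evaluator has a list of changed file paths available (for instance, from the generator's report); `unexpectedChanges` is a hypothetical helper, and exact string matching on paths is a simplification.

```typescript
// Returns changed files that the contract did not anticipate.
// Advisory only: the Generator may legitimately touch additional files.
function unexpectedChanges(estimatedFiles: string[], changedFiles: string[]): string[] {
  return changedFiles.filter((file) => estimatedFiles.indexOf(file) === -1);
}
```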
604
+ ### Iteration History
605
+
606
+ | Field | Description |
607
+ |-------|-------------|
608
+ | `iterationHistory` | Array of past evaluation attempts. Appended after each evaluation. |
609
+ | `lastEvalId` | Reference to the most recent EvalResult. Updated after each evaluation. |
610
+
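Both fields in this table are updated together after each evaluation. A minimal sketch of that update, written as a pure function that returns a new object; `recordEvaluation` and the `HasHistory` constraint are hypothetical names.

```typescript
interface IterationRecord {
  iteration: number;
  evalId: string;
  result: "pass" | "fail";
  timestamp: string; // ISO-8601
}

interface HasHistory {
  iterationHistory: IterationRecord[];
  lastEvalId?: string;
}

// Appends the record and points lastEvalId at it, leaving the input untouched.
function recordEvaluation<T extends HasHistory>(contract: T, record: IterationRecord): T {
  return {
    ...contract,
    iterationHistory: contract.iterationHistory.concat([record]),
    lastEvalId: record.evalId,
  };
}
```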
611
+ ## Complete Example
612
+
613
+ ```json
614
+ {
615
+ "contractId": "sprint-spec-20260326-user-auth-1",
616
+ "specId": "spec-20260326-user-auth",
617
+ "sprintNumber": 1,
618
+ "title": "Implement user registration with form and API",
619
+ "description": "Create the user registration flow end-to-end: a React registration form with email, password, and confirm-password fields; an Express API endpoint that validates input and creates a user record in PostgreSQL with a bcrypt-hashed password; and basic form validation on both client and server.",
620
+ "status": "proposed",
621
+ "createdAt": "2026-03-26T10:00:00Z",
622
+ "updatedAt": "2026-03-26T10:00:00Z",
623
+ "completedAt": null,
624
+
625
+ "dependsOn": [],
626
+
627
+ "features": ["feat-1"],
628
+
629
+ "successCriteria": [
630
+ {
631
+ "criterionId": "sc-1-1",
632
+ "description": "The project builds successfully with zero errors.",
633
+ "verificationMethod": "build",
634
+ "required": true
635
+ },
636
+ {
637
+ "criterionId": "sc-1-2",
638
+ "description": "TypeScript compilation produces zero type errors.",
639
+ "verificationMethod": "typecheck",
640
+ "required": true
641
+ },
642
+ {
643
+ "criterionId": "sc-1-3",
644
+ "description": "A registration form component exists at the /register route with email, password, and confirm-password input fields, each with an associated label.",
645
+ "verificationMethod": "manual",
646
+ "required": true
647
+ },
648
+ {
649
+ "criterionId": "sc-1-4",
650
+ "description": "POST /api/auth/register accepts { email, password } and returns 201 with { id, email } on success.",
651
+ "verificationMethod": "api-check",
652
+ "required": true
653
+ },
654
+ {
655
+ "criterionId": "sc-1-5",
656
+ "description": "POST /api/auth/register returns 400 with an error message when email is already registered.",
657
+ "verificationMethod": "api-check",
658
+ "required": true
659
+ },
660
+ {
661
+ "criterionId": "sc-1-6",
662
+ "description": "The password is stored as a bcrypt hash in the database, never in plain text.",
663
+ "verificationMethod": "manual",
664
+ "required": true
665
+ },
666
+ {
667
+ "criterionId": "sc-1-7",
668
+ "description": "Client-side validation shows an error when password is shorter than 8 characters before form submission.",
669
+ "verificationMethod": "manual",
670
+ "required": true
671
+ },
672
+ {
673
+ "criterionId": "sc-1-8",
674
+ "description": "ESLint reports zero errors on all new and modified files.",
675
+ "verificationMethod": "lint",
676
+ "required": false
677
+ }
678
+ ],
679
+
680
+ "generatorNotes": "Look at existing route definitions in src/routes/ for the Express routing pattern. The project uses Prisma -- check prisma/schema.prisma for the existing schema and add a User model. Use bcrypt (already in package.json) for password hashing. For the React form, follow the pattern in src/components/ -- the project uses controlled components with useState. The registration form should be at src/pages/Register.tsx and the route added to src/App.tsx.",
681
+
682
+ "evaluatorNotes": "For sc-1-3: Read the Register component source and verify it renders three labeled input fields. For sc-1-4 and sc-1-5: Start the dev server and use curl to test the endpoint. For sc-1-6: Read the route handler code and verify bcrypt.hash is called before database insertion. For sc-1-7: Read the form component code and verify client-side validation logic exists for password length.",
683
+
684
+ "estimatedFiles": [
685
+ "prisma/schema.prisma",
686
+ "src/routes/auth.ts",
687
+ "src/pages/Register.tsx",
688
+ "src/App.tsx"
689
+ ],
690
+
691
+ "estimatedDuration": "medium",
692
+
693
+ "iterationHistory": [],
694
+ "lastEvalId": null
695
+ }
696
+ ```
697
+
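For tooling that consumes a contract like the one above, a partial typing and two small sanity checks might look like this. The interface is inferred from the example only, `verificationMethod` is left as a plain string because the full list of methods is not restated here, and `splitByRequired` and `findDuplicateIds` are hypothetical helpers.

```typescript
// Partial, hypothetical typing inferred from the example contract.
interface SuccessCriterion {
  criterionId: string;
  description: string;
  verificationMethod: string; // e.g. "build", "typecheck", "lint", "api-check", "manual", "custom"
  required: boolean;
}

/** Groups criterion IDs by whether the criterion is marked required. */
function splitByRequired(criteria: SuccessCriterion[]): { required: string[]; optional: string[] } {
  const required: string[] = [];
  const optional: string[] = [];
  for (const c of criteria) {
    (c.required ? required : optional).push(c.criterionId);
  }
  return { required, optional };
}

/** Returns criterion IDs that appear more than once. */
function findDuplicateIds(criteria: SuccessCriterion[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const c of criteria) {
    if (seen.has(c.criterionId)) duplicates.add(c.criterionId);
    seen.add(c.criterionId);
  }
  return [...duplicates];
}

// A subset of the example's criteria.
const sampleCriteria: SuccessCriterion[] = [
  { criterionId: "sc-1-1", description: "The project builds successfully with zero errors.", verificationMethod: "build", required: true },
  { criterionId: "sc-1-4", description: "POST /api/auth/register accepts { email, password } and returns 201 with { id, email } on success.", verificationMethod: "api-check", required: true },
  { criterionId: "sc-1-8", description: "ESLint reports zero errors on all new and modified files.", verificationMethod: "lint", required: false },
];

const grouped = splitByRequired(sampleCriteria);
const duplicateIds = findDuplicateIds(sampleCriteria);
```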
698
+ ## Writing Good Success Criteria
699
+
700
+ ### Do
701
+
702
+ - Start with an observable action or state: "The form displays...", "The API returns...", "The database contains..."
703
+ - Include specific values: "returns 201", "displays 'Invalid email'", "at least 8 characters"
704
+ - Map each criterion to exactly one verification method
705
+ - Include at least one `build` criterion and one functional criterion per sprint
706
+ - Write criteria the Evaluator can verify without guessing
707
+
708
+ ### Do Not
709
+
710
+ - Use subjective language: "looks good", "works well", "clean code"
711
+ - Combine multiple checks in one criterion (split them into separate criteria)
712
+ - Reference internal implementation details unless checking them IS the criterion
713
+ - Write criteria that require human visual judgment (a `manual` criterion is acceptable only when the check can be performed by inspecting the code)
714
+ - Assume the Evaluator has context beyond the contract and handoff documents
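
Some of the guidelines above lend themselves to a mechanical pre-check. The sketch below flags the subjective phrases listed under "Do Not" and a missing `build` criterion. It is an illustrative lint pass under stated assumptions, not part of agent-bober: `lintCriteria`, `SUBJECTIVE_PHRASES`, and the warning format are hypothetical.

```typescript
// Hypothetical lint pass; the phrase list comes from the "Do Not" examples.
interface Criterion {
  criterionId: string;
  description: string;
  verificationMethod: string;
}

const SUBJECTIVE_PHRASES = ["looks good", "works well", "clean code"];

/** Returns human-readable warnings for criteria that break the guidelines. */
function lintCriteria(criteria: Criterion[]): string[] {
  const warnings: string[] = [];
  for (const c of criteria) {
    const text = c.description.toLowerCase();
    for (const phrase of SUBJECTIVE_PHRASES) {
      if (text.includes(phrase)) {
        warnings.push(`${c.criterionId}: subjective phrase "${phrase}"`);
      }
    }
  }
  // The guidelines ask for at least one build criterion per sprint.
  if (!criteria.some((c) => c.verificationMethod === "build")) {
    warnings.push("contract: missing a build criterion");
  }
  return warnings;
}

const draftCriteria: Criterion[] = [
  { criterionId: "sc-2-1", description: "The dashboard looks good on mobile.", verificationMethod: "manual" },
  { criterionId: "sc-2-2", description: "GET /api/stats returns 200 with a JSON array.", verificationMethod: "api-check" },
];

const lintWarnings = lintCriteria(draftCriteria);
```

A check like this only catches surface patterns; whether a criterion is specific enough for the Evaluator to verify without guessing still needs human or Planner review.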