agent-bober 0.4.3 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -4,9 +4,9 @@ description: Full autonomous pipeline — plan a feature, execute all sprints, e
4
4
  argument-hint: <task-description>
5
5
  ---
6
6
 
7
- # bober.run — Full Pipeline Orchestrator
7
+ # bober.run — Multi-Agent Pipeline Orchestrator
8
8
 
9
- You are running the **bober.run** skill. This is the top-level orchestrator that runs the entire Generator-Evaluator pipeline from start to finish: planning, sprint execution, evaluation, and iteration. The user provides a task description and you deliver a working implementation.
9
+ You are the **orchestrator** for the bober.run pipeline. You do NOT plan, code, or evaluate yourself. You spawn subagents for each of those roles using the **Agent tool**, coordinate the flow between them, and track progress. Each subagent runs in its own isolated context window, receiving only the information you explicitly pass in its prompt.
10
10
 
11
11
  ## Autonomous Mode
12
12
 
@@ -20,206 +20,504 @@ This command is designed to run **fully autonomously** — do NOT stop to ask th
20
20
 
21
21
  The user launched this command to walk away and come back to a finished product. Respect that intent.
22
22
 
23
- ## Overview
24
-
25
- The pipeline follows this flow:
23
+ ## Architecture — True Multi-Agent Orchestration
26
24
 
27
25
  ```
28
- User Task Description
29
- |
30
- v
31
- [1. PLAN] -----> PlanSpec + Sprint Contracts
32
- |
33
- v
34
- [2. SPRINT LOOP]
35
- |
36
- +----> [2a. Generate] ---> Code changes
37
- | |
38
- | v
39
- | [2b. Evaluate] ---> Pass/Fail
40
- | |
41
- | fail + retries left?
42
- | |
43
- | yes: feedback --> [2a. Generate]
44
- | no: escalate
45
- |
46
- | pass: next sprint
47
- |
48
- v
49
- [3. COMPLETE] ---> All sprints done
26
+ ORCHESTRATOR (you — this session)
27
+
28
+ ├─ 1. Read bober.config.json, .bober/principles.md
29
+ ├─ 2. Run check-prereqs.sh
30
+
31
+ ├─ 3. SPAWN planner subagent (Agent tool)
32
+ │ └─ Planner reads codebase, generates PlanSpec + sprint contracts
33
+ │ └─ Saves to .bober/specs/ and .bober/contracts/
34
+ │ └─ Returns: spec ID and contract list
35
+
36
+ ├─ 4. For each sprint contract:
37
+ │ │
38
+ │ ├─ 4a. Build context handoff (JSON in the prompt)
39
+ │ │ (spec, contract, previous feedback, principles)
40
+ │ │
41
+ │ ├─ 4b. SPAWN generator subagent (Agent tool)
42
+ │ │ └─ Receives handoff as prompt
43
+ │ │ └─ Implements the sprint, commits code
44
+ │ │ └─ Returns: completion report JSON
45
+ │ │
46
+ │ ├─ 4c. SPAWN evaluator subagent (Agent tool)
47
+ │ │ └─ Receives handoff + generator report
48
+ │ │ └─ Runs eval strategies (typecheck, lint, test, playwright)
49
+ │ │ └─ Returns: eval result JSON with pass/fail
50
+ │ │
51
+ │ ├─ 4d. If FAILED and retries < maxIterations:
52
+ │ │ └─ Add evaluator feedback to handoff
53
+ │ │ └─ Go to 4b (spawn FRESH generator with feedback)
54
+ │ │
55
+ │ └─ 4e. If PASSED: update contract status, log, next sprint
56
+
57
+ └─ 5. Final summary
50
58
  ```
51
59
 
52
- ## Step 1: Initialize and Plan
60
+ **Critical rules for you as orchestrator:**
61
+ - NEVER do the planning, coding, or evaluating yourself — ALWAYS delegate to subagents via the Agent tool.
62
+ - After spawning a subagent, READ the files it created to get the actual results (the subagent's return value is a summary, but files on disk are the source of truth).
63
+ - Keep your own context clean — only track orchestration state (which sprint, which iteration, pass/fail), not implementation details.
64
+ - Each subagent spawn is a FRESH context — this is the whole point. It prevents context degradation over long pipelines.
65
+ - Log progress to `.bober/progress.md` and `.bober/history.jsonl` between every phase transition.
66
+ - Print clear phase banners so progress is visible in the terminal.
67
+
68
+ ---
69
+
70
+ ## Step 1: Initialize
53
71
 
54
- ### 1a. Check Project State
72
+ ### 1a. Read Project Configuration
55
73
 
56
74
  Read `bober.config.json`. If it does not exist:
57
- - Ask the user the minimal initialization questions: project name, mode (greenfield vs brownfield), and what they are building
58
- - Determine the appropriate `mode` and `preset` (if any) from the user's description
59
- - Create `bober.config.json` with appropriate defaults
60
- - Create the `.bober/` directory structure
75
+ - Ask the user the minimal initialization questions: project name, mode (greenfield vs brownfield), and what they are building.
76
+ - Determine the appropriate `mode` and `preset` (if any) from the user's description.
77
+ - Create `bober.config.json` with appropriate defaults.
78
+ - Create the `.bober/` directory structure.
61
79
 
62
80
  If `bober.config.json` exists, read the configuration.
63
81
 
64
- ### 1b. Check for Existing Plans
82
+ Read `.bober/principles.md` if it exists. You will pass the principles text into every subagent prompt.
83
+
84
+ ### 1b. Run Prerequisites Check
85
+
86
+ ```bash
87
+ bash scripts/check-prereqs.sh
88
+ ```
89
+
90
+ If it fails, report the missing prerequisites and stop.
91
+
92
+ ### 1c. Check for Existing Plans
65
93
 
66
94
  Read `.bober/specs/` and `.bober/progress.md`. If there is an existing plan with incomplete sprints:
67
95
 
68
- - If the user provided a new task description that clearly differs from the existing plan create a new plan (option B)
69
- - If the user provided no task or a task that matches the existing plan resume from the next incomplete sprint (option A)
70
- - Log your decision but do NOT ask the user — autonomous mode means you decide and move forward
96
+ - If the user provided a new task description that clearly differs from the existing plan: create a new plan (go to Step 2)
97
+ - If the user provided no task or a task that matches the existing plan: resume from the next incomplete sprint (skip to Step 3)
98
+ - Log your decision but do NOT ask the user — autonomous mode means you decide and move forward.
99
+
100
+ Log event:
101
+ ```json
102
+ {"event":"pipeline-started","timestamp":"<ISO-8601>","task":"<task description>"}
103
+ ```
104
+
105
+ ---
106
+
107
+ ## Step 2: Spawn the Planner Subagent
108
+
109
+ Use the **Agent tool** to spawn a planner subagent.
110
+
111
+ **How to call the Agent tool:**
71
112
 
72
- ### 1c. Run the Planning Phase
113
+ ```
114
+ Agent tool call:
115
+ description: "Plan feature: <title from task description>"
116
+ prompt: <the full prompt below>
117
+ ```
73
118
 
74
- If creating a new plan, execute the bober.plan workflow:
119
+ **Build the planner prompt with ALL of these sections:**
120
+
121
+ ```
122
+ You are the Bober Planner subagent. You have been spawned by the orchestrator to create a plan.
123
+
124
+ ## Your Task
125
+ <paste the user's task description here>
126
+
127
+ ## Project Configuration (bober.config.json)
128
+ <paste the full contents of bober.config.json here>
129
+
130
+ ## Project Principles (.bober/principles.md)
131
+ <paste the full contents of .bober/principles.md here, or "No principles file found." if it does not exist>
132
+
133
+ ## Existing Specs
134
+ <list any existing spec IDs from .bober/specs/, or "None" if no prior specs>
135
+
136
+ ## Instructions
137
+ 1. Read the codebase to understand the project structure (use Glob and Grep to survey, Read to examine key files).
138
+ 2. Generate a PlanSpec with sprint decomposition.
139
+ 3. Save the PlanSpec to .bober/specs/<specId>.json
140
+ 4. Save each SprintContract to .bober/contracts/<contractId>.json
141
+ 5. Update .bober/progress.md with the plan summary.
142
+ 6. Append to .bober/history.jsonl: {"event":"plan-created","specId":"...","timestamp":"...","sprintCount":N}
143
+
144
+ IMPORTANT: You are running as a subagent — do NOT ask clarifying questions. Infer reasonable defaults from the codebase and task description. If something is genuinely ambiguous, document your assumption in the PlanSpec's "assumptions" field.
145
+
146
+ ## Your Response
147
+ When done, respond with EXACTLY this JSON structure (no other text):
148
+ {
149
+ "specId": "<the spec ID you created>",
150
+ "title": "<plan title>",
151
+ "sprintCount": <number>,
152
+ "contractIds": ["<contract-id-1>", "<contract-id-2>", ...],
153
+ "summary": "<2-3 sentence summary of the plan>"
154
+ }
155
+ ```
75
156
 
76
- 1. Gather codebase context (read key files, survey structure)
77
- 2. Ask 3-5 clarifying questions about the task
78
- 3. Wait for user responses
79
- 4. Generate the PlanSpec with sprint decomposition
80
- 5. Save everything to `.bober/`
157
+ **After the planner subagent returns:**
81
158
 
82
- **Configuration values that matter:**
83
- - `planner.maxClarifications`: Max questions to ask
84
- - `sprint.maxSprints`: Maximum number of sprints in the plan
85
- - `sprint.sprintSize`: Size calibration for sprint decomposition
159
+ 1. Parse the planner's response to extract `specId` and `contractIds`.
160
+ 2. Read `.bober/specs/<specId>.json` to verify it was created.
161
+ 3. Read each contract file in `.bober/contracts/` to verify they exist.
162
+ 4. Print the plan summary:
163
+ ```
164
+ === PLAN CREATED ===
165
+ Spec: <specId>
166
+ Title: <title>
167
+ Sprints: <count>
168
+ 1. <Sprint 1 title>
169
+ 2. <Sprint 2 title>
170
+ ...
171
+ ```
172
+ 5. If the planner subagent failed or returned an error, report it and stop the pipeline.
86
173
 
87
- Report the plan summary to the user and proceed.
174
+ ---
88
175
 
89
- ## Step 2: Sprint Execution Loop
176
+ ## Step 3: Sprint Execution Loop
90
177
 
91
178
  Load the sprint contracts from `.bober/contracts/` in order. For each sprint with status `proposed` or `needs-rework`:
92
179
 
93
- ### 2a. Pre-Sprint Checks
180
+ ### 3a. Pre-Sprint Checks
94
181
 
95
- 1. **Verify dependencies:** All sprints in `dependsOn` must have status `completed`
96
- 2. **Verify build state:** The project must build before starting a new sprint
182
+ 1. **Verify dependencies:** All sprints in `dependsOn` must have status `completed`.
183
+ 2. **Verify build state:** The project must build before starting a new sprint.
97
184
  ```bash
98
- # Run configured build/compile command (varies by stack)
99
- # e.g., npm run build, anchor build, forge build, cargo build
185
+ # Run configured build/compile command from bober.config.json commands.build
100
186
  ```
101
- If the build is broken BEFORE the sprint starts, stop and report this to the user. Do not start a sprint on a broken codebase.
102
- 3. **Verify git state:** Ensure we are on the correct feature branch
187
+ If the build is broken BEFORE the sprint starts, stop and report this to the user.
188
+ 3. **Verify git state:** Ensure we are on the correct feature branch.
103
189
  ```bash
104
190
  git branch --show-current
105
191
  ```
106
192
  4. **Check iteration budget:** Read `pipeline.maxIterations` from config. Track total iterations across all sprints. If the budget is exhausted, stop.
107
193
 
108
- ### 2b. Contract Negotiation
194
+ Print phase banner:
195
+ ```
196
+ === SPRINT <N>/<total>: <title> ===
197
+ Iteration: 1 of <maxIterations>
198
+ Budget used: <used>/<max> total iterations
199
+ ```
200
+
201
+ ### 3b. Contract Negotiation
109
202
 
110
203
  If the sprint status is `proposed`:
111
- - Review success criteria for executability
112
- - Verify evaluation strategies are available
113
- - Adjust criteria if needed
114
204
  - Update status to `in-progress`
205
+ - Save the updated contract back to `.bober/contracts/`
206
+ - Log event:
207
+ ```json
208
+ {"event":"sprint-started","contractId":"...","specId":"...","timestamp":"..."}
209
+ ```
115
210
 
116
- ### 2c. Generate
211
+ ### 3c. Build the Context Handoff
212
+
213
+ Build a context handoff JSON. This is the ONLY information the subagent receives — it must be self-contained.
214
+
215
+ **Context Handoff structure:**
216
+ ```json
217
+ {
218
+ "handoffId": "handoff-<contractId>-gen-<iteration>",
219
+ "type": "to-generator",
220
+ "contractId": "<contract ID>",
221
+ "specId": "<spec ID>",
222
+ "timestamp": "<ISO-8601>",
223
+ "iteration": 1,
224
+ "context": {
225
+ "projectOverview": "<Brief project description from PlanSpec>",
226
+ "completedSprints": [
227
+ {
228
+ "contractId": "<ID>",
229
+ "title": "<title>",
230
+ "summary": "<what was built>"
231
+ }
232
+ ],
233
+ "currentBranch": "<git branch name>",
234
+ "relevantFiles": ["<key files the generator should read>"]
235
+ },
236
+ "contract": { "<full SprintContract object>" },
237
+ "config": {
238
+ "commands": { "<commands section from bober.config.json>" },
239
+ "generator": { "<generator section from bober.config.json>" }
240
+ },
241
+ "principles": "<full text of .bober/principles.md or null>",
242
+ "evaluatorFeedback": null
243
+ }
244
+ ```
117
245
 
118
- Create a ContextHandoff for the Generator:
119
- - Include the contract, project context, config, and any evaluator feedback (for retries)
120
- - Include summaries of completed sprints
121
- - Include relevant file paths
246
+ For retry iterations (iteration > 1), populate `evaluatorFeedback` with the evaluator's failure details.
122
247
 
123
- Spawn the `bober-generator` subagent.
248
+ Save the handoff to `.bober/handoffs/<handoffId>.json`.
124
249
 
125
- After generation:
126
- - Read the Generator's completion report
127
- - Verify commits were made
128
- - Proceed to evaluation
250
+ ### 3d. Spawn the Generator Subagent
129
251
 
130
- ### 2d. Evaluate
252
+ Use the **Agent tool** to spawn a generator subagent.
131
253
 
132
- Create a ContextHandoff for the Evaluator:
133
- - Include the contract, Generator's report, config
254
+ **How to call the Agent tool:**
134
255
 
135
- Spawn the `bober-evaluator` subagent.
256
+ ```
257
+ Agent tool call:
258
+ description: "Sprint <N>: <sprint title>"
259
+ prompt: <the full prompt below>
260
+ ```
136
261
 
137
- After evaluation:
138
- - Read the EvalResult
139
- - Save it to `.bober/eval-results/`
140
- - Determine pass/fail
262
+ **Build the generator prompt:**
141
263
 
142
- ### 2e. Process Result
264
+ ```
265
+ You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.
266
+
267
+ ## Context Handoff
268
+ <paste the FULL handoff JSON here — this is ALL the context you get>
269
+
270
+ ## Instructions
271
+ 1. Read the SprintContract at .bober/contracts/<contractId>.json
272
+ 2. Read the PlanSpec at .bober/specs/<specId>.json for broader context
273
+ 3. Read bober.config.json for commands configuration
274
+ 4. Read .bober/principles.md if it exists — adhere to all principles strictly
275
+ 5. Read the files listed in the contract's estimatedFiles
276
+ 6. Implement the sprint according to the contract's success criteria
277
+ 7. Self-verify: run build, typecheck, lint, and test commands
278
+ 8. Commit your changes with proper messages (format: "bober(<sprint-N>): <description>")
279
+ 9. Work on the feature branch, never on main/master
280
+
281
+ <IF iteration > 1>
282
+ ## IMPORTANT — This is a RETRY (iteration <N>)
283
+ The previous attempt failed evaluation. Here is the evaluator's feedback:
284
+ <paste evaluator feedback JSON>
285
+
286
+ Focus on fixing the specific failures listed above. Read the feedback line by line before making any changes.
287
+ </IF>
288
+
289
+ ## Your Response
290
+ When done, respond with EXACTLY this JSON structure (no other text):
291
+ {
292
+ "contractId": "<contract ID>",
293
+ "status": "complete | partial | blocked",
294
+ "criteriaResults": [
295
+ {
296
+ "criterionId": "sc-X-Y",
297
+ "met": true/false,
298
+ "evidence": "<verification evidence>"
299
+ }
300
+ ],
301
+ "filesChanged": [
302
+ {
303
+ "path": "<file path>",
304
+ "action": "created | modified | deleted",
305
+ "description": "<what changed>"
306
+ }
307
+ ],
308
+ "testsAdded": ["<test file paths>"],
309
+ "commits": ["<hash> - <message>"],
310
+ "blockers": ["<any unresolved issues>"],
311
+ "notes": "<additional context for the evaluator>"
312
+ }
313
+ ```
314
+
315
+ **After the generator subagent returns:**
316
+
317
+ 1. Parse the generator's response to extract the completion report.
318
+ 2. Verify commits were made: `git log --oneline -5`
319
+ 3. Save the generator report to `.bober/handoffs/gen-report-<contractId>-<iteration>.json`
320
+ 4. Log event:
321
+ ```json
322
+ {"event":"sprint-iteration-started","contractId":"...","iteration":N,"timestamp":"..."}
323
+ ```
324
+ 5. If the generator subagent crashed or returned an error, mark the sprint as `needs-rework` and log it.
325
+
326
+ ### 3e. Spawn the Evaluator Subagent
327
+
328
+ Use the **Agent tool** to spawn an evaluator subagent.
329
+
330
+ **How to call the Agent tool:**
331
+
332
+ ```
333
+ Agent tool call:
334
+ description: "Evaluate sprint <N>: <sprint title>"
335
+ prompt: <the full prompt below>
336
+ ```
337
+
338
+ **Build the evaluator prompt:**
339
+
340
+ ```
341
+ You are the Bober Evaluator subagent. You have been spawned by the orchestrator to evaluate a sprint.
342
+
343
+ ## Sprint Contract
344
+ <paste the full SprintContract JSON>
345
+
346
+ ## Generator's Completion Report
347
+ <paste the generator's completion report JSON>
348
+
349
+ ## Project Configuration
350
+ <paste relevant sections of bober.config.json: commands, evaluator>
351
+
352
+ ## Project Principles
353
+ <paste full text of .bober/principles.md or "No principles file found.">
354
+
355
+ ## Context
356
+ - Contract ID: <contractId>
357
+ - Spec ID: <specId>
358
+ - Sprint: <N> of <total>
359
+ - Iteration: <N>
360
+ - Branch: <current git branch>
361
+ - Changed files (per generator): <list of files>
362
+
363
+ ## Instructions
364
+ 1. Read the SprintContract at .bober/contracts/<contractId>.json
365
+ 2. Read bober.config.json for configured eval strategies and commands
366
+ 3. Run each configured evaluation strategy (typecheck, lint, build, unit-test, playwright, api-check) using the commands from config
367
+ 4. Verify EVERY success criterion in the contract one by one
368
+ 5. Check for regressions (pre-existing tests still passing, build stability)
369
+ 6. Check adherence to project principles
370
+ 7. Produce a structured EvalResult
371
+
372
+ IMPORTANT: You do NOT have Write or Edit tools. Output the EvalResult JSON in your response, and the orchestrator will save it to disk.
373
+
374
+ ## Your Response
375
+ When done, respond with EXACTLY this JSON structure (no other text):
376
+ {
377
+ "evalId": "eval-<contractId>-<iteration>",
378
+ "contractId": "<contract ID>",
379
+ "specId": "<spec ID>",
380
+ "timestamp": "<ISO-8601>",
381
+ "iteration": <N>,
382
+ "overallResult": "pass | fail",
383
+ "score": {
384
+ "criteriaTotal": <N>,
385
+ "criteriaPassed": <N>,
386
+ "criteriaFailed": <N>,
387
+ "criteriaSkipped": <N>,
388
+ "requiredPassed": <N>,
389
+ "requiredFailed": <N>,
390
+ "requiredTotal": <N>
391
+ },
392
+ "strategyResults": [
393
+ {
394
+ "strategy": "<type>",
395
+ "required": true/false,
396
+ "result": "pass | fail | skipped",
397
+ "output": "<relevant output>",
398
+ "details": "<explanation>"
399
+ }
400
+ ],
401
+ "criteriaResults": [
402
+ {
403
+ "criterionId": "sc-X-Y",
404
+ "description": "<criterion>",
405
+ "required": true/false,
406
+ "result": "pass | fail | skipped",
407
+ "evidence": "<evidence>",
408
+ "feedback": "<failure details if failed>"
409
+ }
410
+ ],
411
+ "regressions": [],
412
+ "generatorFeedback": [],
413
+ "summary": "<2-3 sentence summary>"
414
+ }
415
+ ```
416
+
417
+ **After the evaluator subagent returns:**
418
+
419
+ 1. Parse the evaluator's response to extract the EvalResult.
420
+ 2. Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json` (the evaluator cannot write files).
421
+ 3. Determine pass/fail from the `overallResult` field.
422
+
423
+ ### 3f. Process the Evaluation Result
143
424
 
144
425
  **On PASS:**
145
- 1. Update contract status to `completed`
146
- 2. Update `.bober/progress.md`
147
- 3. Log to `.bober/history.jsonl`
148
- 4. Report milestone to user:
426
+ 1. Update contract status to `completed` and save to `.bober/contracts/`.
427
+ 2. Update `.bober/progress.md`.
428
+ 3. Log event:
429
+ ```json
430
+ {"event":"sprint-completed","contractId":"...","specId":"...","iteration":N,"timestamp":"..."}
431
+ ```
432
+ 4. Print milestone:
149
433
  ```
150
- Sprint <N>/<total> PASSED: <title>
434
+ === Sprint <N>/<total> PASSED ===
435
+ Title: <title>
436
+ Iteration: <M>
151
437
  Progress: [=====> ] <N>/<total> sprints complete
152
438
  Next: <next sprint title>
153
439
  ```
154
- 5. Move to next sprint
440
+ 5. Move to next sprint.
155
441
 
156
442
  **On FAIL with retries remaining:**
157
- 1. Check if iteration count < `evaluator.maxIterations` (default: 3)
158
- 2. Feed evaluator feedback back to Generator (go to 2c)
159
- 3. Report retry:
443
+ 1. Check if iteration < `evaluator.maxIterations` (default: 3).
444
+ 2. Log event:
445
+ ```json
446
+ {"event":"sprint-iteration-failed","contractId":"...","iteration":N,"failedCriteria":[...],"timestamp":"..."}
160
447
  ```
161
- Sprint <N> iteration <M> failed. Retrying with evaluator feedback...
162
- Failed: <brief failure summary>
448
+ 3. Print retry notice:
163
449
  ```
450
+ === Sprint <N> iteration <M> FAILED ===
451
+ Failed criteria: <list>
452
+ Retrying (iteration <M+1> of <maxIterations>)...
453
+ ```
454
+ 4. Build a NEW context handoff with evaluator feedback included.
455
+ 5. Go back to step 3d (spawn a FRESH generator subagent with the feedback).
456
+
457
+ **On FAIL with no retries remaining:**
458
+ 1. Update contract status to `needs-rework` and save.
459
+ 2. Log event:
460
+ ```json
461
+ {"event":"sprint-failed","contractId":"...","specId":"...","totalIterations":N,"timestamp":"..."}
462
+ ```
463
+ 3. Decide whether to continue or stop:
464
+ - If the failure is in a non-blocking sprint (nothing depends on it), skip and continue.
465
+ - If the failure blocks subsequent sprints, stop the pipeline.
466
+ 4. Print failure report with full context.
164
467
 
165
- **On FAIL with no retries:**
166
- 1. Update contract status to `needs-rework`
167
- 2. Decide whether to continue or stop based on severity:
168
- - If the failure is in a non-blocking sprint (nothing depends on it), skip and continue
169
- - If the failure blocks subsequent sprints, stop the pipeline
170
- 3. Report to user with full context
171
-
172
- ### 2f. Context Reset
468
+ ### 3g. Context Reset
173
469
 
174
- After each sprint completes (pass or fail), check `pipeline.contextReset`:
175
- - `always`: Fresh context for the next sprint. The next sprint's Generator receives only its handoff document.
176
- - `on-threshold`: Continue with current context unless it is getting large. If context exceeds a reasonable threshold (use your judgment), reset.
177
- - `never`: Carry all context forward (not recommended for long pipelines).
470
+ After each sprint completes (pass or fail), check `pipeline.contextReset` from config:
471
+ - `always`: Fresh context for the next sprint. The next sprint's Generator receives only its handoff document. (This is the default with subagent architecture — each spawn IS a fresh context.)
472
+ - `on-threshold`: Same as `always` with subagents, since each subagent is already isolated.
473
+ - `never`: Carry summary forward in the handoff. Still a fresh subagent, but with richer handoff.
178
474
 
179
- ### 2g. Iteration Budget
475
+ ### 3h. Iteration Budget
180
476
 
181
477
  Track total Generator-Evaluator iterations across all sprints:
182
- - Each Generator+Evaluator cycle counts as 1 iteration
183
- - When total iterations reach `pipeline.maxIterations` (default: 20), stop the pipeline regardless of sprint status
184
- - Report the budget status:
478
+ - Each Generator+Evaluator cycle counts as 1 iteration.
479
+ - When total iterations reach `pipeline.maxIterations` (default: 20), stop the pipeline.
480
+ - Print budget status after each cycle:
185
481
  ```
186
482
  Iteration budget: <used>/<max>
187
483
  ```
188
484
 
189
- ## Step 3: Completion
485
+ ---
486
+
487
+ ## Step 4: Completion
190
488
 
191
489
  When all sprints are complete (or the pipeline stops):
192
490
 
193
491
  ### All Sprints Passed
194
492
 
195
493
  ```
196
- ## Pipeline Complete
494
+ === PIPELINE COMPLETE ===
197
495
 
198
496
  All <N> sprints passed successfully.
199
497
 
200
498
  ### Results
201
- 1. [PASS] Sprint 1: <title>
202
- 2. [PASS] Sprint 2: <title>
499
+ 1. [PASS] Sprint 1: <title> — iteration <M>
500
+ 2. [PASS] Sprint 2: <title> — iteration <M>
203
501
  ...
204
502
 
205
503
  ### Statistics
206
504
  - Total iterations: <N>
207
505
  - Sprints: <N>/<N> passed
208
- - Time: <start> to <end>
506
+ - Subagents spawned: <count>
209
507
 
210
508
  ### What Was Built
211
509
  <Brief summary of the complete feature>
212
510
 
213
511
  ### Next Steps
214
512
  - Review the code on branch: bober/<feature-slug>
215
- - Run the test suite: npm test
216
- - Merge to main when ready: git merge bober/<feature-slug>
513
+ - Run the test suite: <configured test command>
514
+ - Merge to main when ready
217
515
  ```
218
516
 
219
517
  ### Pipeline Stopped (failures or budget exhausted)
220
518
 
221
519
  ```
222
- ## Pipeline Stopped
520
+ === PIPELINE STOPPED ===
223
521
 
224
522
  Completed <M> of <N> sprints. Stopped because: <reason>
225
523
 
@@ -242,6 +540,8 @@ Sprint 3: <title>
242
540
  - Run /bober.plan to revise the plan
243
541
  ```
244
542
 
543
+ ---
544
+
245
545
  ## Human Escalation Protocol
246
546
 
247
547
  Escalate to the user (pause and ask) when:
@@ -258,6 +558,8 @@ Escalate to the user (pause and ask) when:
258
558
 
259
559
  4. **Halfway checkpoint:** For plans with 5+ sprints, pause after completing half the sprints to report progress and ask if the user wants to continue, adjust, or stop.
260
560
 
561
+ ---
562
+
261
563
  ## Progress Tracking
262
564
 
263
565
  Throughout the pipeline, keep `.bober/progress.md` updated:
@@ -288,6 +590,7 @@ Last updated: <timestamp>
288
590
  ### Pipeline Statistics
289
591
  - Total iterations used: 4 / 20
290
592
  - Sprints completed: 2 / 5
593
+ - Subagents spawned: 6
291
594
  ```
292
595
 
293
596
  And keep `.bober/history.jsonl` updated with events:
@@ -301,10 +604,14 @@ And keep `.bober/history.jsonl` updated with events:
301
604
  - `pipeline-stopped`
302
605
  - `human-escalation`
303
606
 
304
- ## Error Recovery
607
+ ---
608
+
609
+ ## Error Handling
305
610
 
611
+ - **Subagent crash/timeout:** If a subagent call via the Agent tool fails or returns an error, catch it. Log the error, mark the sprint as `needs-rework`, and decide whether to retry or escalate. Do NOT let a subagent failure crash the entire pipeline.
612
+ - **Subagent returns malformed response:** If you cannot parse the subagent's JSON response, read the files on disk (`.bober/specs/`, `.bober/contracts/`, `.bober/eval-results/`) as the source of truth. The subagent may have saved files correctly even if its response text was garbled.
306
613
  - **Git conflicts:** Pause and report to user. Do not auto-resolve.
307
614
  - **npm install failures:** Try once. If it fails, report to user.
308
615
  - **Dev server won't start:** Needed for API checks and Playwright. Report as a configuration issue.
309
- - **Out of context window:** If the conversation is getting extremely long, proactively reset context by summarizing progress and starting a fresh handoff.
310
- - **Previous sprint broke something:** If a completed sprint's code is causing issues in a later sprint, note this but do not go back and modify completed sprints. Instead, have the current sprint fix the issue within its scope.
616
+ - **Out of context window:** With subagent architecture, this is largely mitigated each subagent gets a fresh context. If YOUR orchestrator context gets long, summarize completed sprints more aggressively in the handoff documents.
617
+ - **Previous sprint broke something:** If a completed sprint's code is causing issues in a later sprint, note this but do not go back and modify completed sprints. Instead, include the issue details in the current sprint's generator handoff so it can fix the problem within its scope.