agent-bober 0.4.3 → 0.5.1

@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: bober.sprint
3
- description: Execute the next pending sprint — negotiate contracts, run the Generator, evaluate output, and iterate until passing or exhausting retries.
3
+ description: Execute the next pending sprint — spawn generator and evaluator subagents, orchestrate the retry loop until passing or exhausting retries.
4
4
  argument-hint: "[sprint-number]"
5
5
  handoffs:
6
6
  - label: "Evaluate Sprint"
@@ -11,9 +11,11 @@ handoffs:
11
11
  prompt: "Execute the next sprint"
12
12
  ---
13
13
 
14
- # bober.sprint — Sprint Execution Skill
14
+ # bober.sprint — Sprint Execution Orchestrator
15
15
 
16
- You are running the **bober.sprint** skill. Your job is to execute a single sprint from an existing plan through the full Generator-Evaluator loop: negotiate the contract, generate the implementation, evaluate the output, and iterate until the sprint passes or retries are exhausted.
16
+ You are the **orchestrator** for a single sprint cycle. You do NOT implement code or evaluate it yourself. You spawn subagents using the **Agent tool** for both the generator (implementation) and evaluator (verification) roles, and you manage the retry loop between them.
17
+
18
+ Each subagent runs in its own isolated context window. It receives ONLY the information you explicitly pass in its prompt. After a subagent completes, you read the files it created on disk to get the full results.
17
19
 
18
20
  ## Prerequisites
19
21
 
@@ -22,7 +24,9 @@ Before starting, verify these exist:
22
24
  - At least one PlanSpec in `.bober/specs/`
23
25
  - At least one SprintContract in `.bober/contracts/`
24
26
 
25
- If any are missing, tell the user to run `/bober:plan` first.
27
+ If any are missing, tell the user to run `/bober-plan` first.
28
+
29
+ Also read `.bober/principles.md` if it exists. You will include the principles text in every subagent prompt.
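The prerequisite check described in this hunk could be sketched roughly as follows. This is only an illustration: the function name `check_prerequisites` is hypothetical, the `.json` extension is assumed from the contract and spec paths used later in the skill, and only the two directory checks visible in this hunk are covered.

```python
from __future__ import annotations

from pathlib import Path


def check_prerequisites(root: Path) -> tuple[list[str], str | None]:
    """Return (missing prerequisites, principles text or None)."""
    missing: list[str] = []
    # Only the checks visible in this hunk are covered; anything listed
    # above the hunk is not shown in the diff and is omitted here.
    if not any((root / ".bober" / "specs").glob("*.json")):
        missing.append("PlanSpec in .bober/specs/")
    if not any((root / ".bober" / "contracts").glob("*.json")):
        missing.append("SprintContract in .bober/contracts/")

    # The principles file is optional; None means "no principles file found".
    principles_path = root / ".bober" / "principles.md"
    principles = (
        principles_path.read_text(encoding="utf-8")
        if principles_path.is_file()
        else None
    )
    return missing, principles
```

If `missing` is non-empty, the orchestrator would tell the user to run `/bober-plan` instead of continuing.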
26
30
 
27
31
  ## Step 1: Identify the Target Sprint
28
32
 
@@ -71,9 +75,9 @@ When a contract status is `proposed`, it has not yet been reviewed for executabi
71
75
  {"event":"sprint-started","contractId":"...","specId":"...","timestamp":"..."}
72
76
  ```
73
77
 
74
- ## Step 3: Create Context Handoff
78
+ ## Step 3: Build the Context Handoff
75
79
 
76
- Create a ContextHandoff document for the Generator. This document is the ONLY context the Generator receives -- it must be self-contained.
80
+ Build a ContextHandoff document for the Generator. This document is the ONLY context the Generator subagent receives, so it must be self-contained.
77
81
 
78
82
  **ContextHandoff structure:**
79
83
  ```json
@@ -103,6 +107,7 @@ Create a ContextHandoff document for the Generator. This document is the ONLY co
103
107
  "commands": { "<commands section from bober.config.json>" },
104
108
  "generator": { "<generator section from bober.config.json>" }
105
109
  },
110
+ "principles": "<full text of .bober/principles.md or null>",
106
111
  "evaluatorFeedback": null
107
112
  }
108
113
  ```
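As a rough sketch of how the new `principles` field might be attached before the handoff is saved: the helper name `save_handoff` is hypothetical, and the code assumes `principles` is a top-level key next to `evaluatorFeedback` (as the hunk suggests) and that the handoff dict already carries a `handoffId`.

```python
from __future__ import annotations

import json
from pathlib import Path


def save_handoff(root: Path, handoff: dict, principles: str | None) -> Path:
    """Attach principles text to a handoff and write it under .bober/handoffs/."""
    # Copy so the caller's dict is not mutated. None serializes to JSON null,
    # matching the "or null" case in the structure above.
    document = {**handoff, "principles": principles}
    out_dir = root / ".bober" / "handoffs"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{document['handoffId']}.json"
    out_path.write_text(json.dumps(document, indent=2), encoding="utf-8")
    return out_path
```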
@@ -129,9 +134,7 @@ Save the handoff to `.bober/handoffs/<handoffId>.json`.
129
134
  }
130
135
  ```
131
136
 
132
- ## Step 4: Spawn the Generator
133
-
134
- Invoke the `bober-generator` subagent with the handoff document.
137
+ ## Step 4: Spawn the Generator Subagent
135
138
 
136
139
  **Before spawning:**
137
140
  1. Ensure the correct git branch exists and is checked out:
@@ -140,48 +143,132 @@ Invoke the `bober-generator` subagent with the handoff document.
140
143
  ```
141
144
  2. If this is a retry, the Generator should be on the same branch with the previous attempt's code still present.
142
145
 
143
- **Spawn the Generator:**
144
- Use the `bober-generator` agent definition. Pass it the handoff file path.
146
+ **Use the Agent tool to spawn the generator:**
145
147
 
146
- **After the Generator completes:**
147
- 1. Read the Generator's completion report
148
- 2. Verify the Generator committed its changes (check `git log`)
149
- 3. Proceed to evaluation
148
+ ```
149
+ Agent tool call:
150
+ description: "Sprint <N>: <sprint title>"
151
+ prompt: <the full prompt below>
152
+ ```
150
153
 
151
- ## Step 5: Spawn the Evaluator
154
+ **Generator prompt:**
152
155
 
153
- Create an Evaluator handoff document:
156
+ ```
157
+ You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.
158
+
159
+ ## Context Handoff
160
+ <paste the FULL handoff JSON>
161
+
162
+ ## Instructions
163
+ 1. Read the SprintContract at .bober/contracts/<contractId>.json
164
+ 2. Read the PlanSpec at .bober/specs/<specId>.json for broader context
165
+ 3. Read bober.config.json for commands configuration
166
+ 4. Read .bober/principles.md if it exists — adhere to all principles strictly
167
+ 5. Read the files listed in the contract's estimatedFiles
168
+ 6. Implement the sprint according to the contract's success criteria
169
+ 7. Self-verify: run build, typecheck, lint, and test commands
170
+ 8. Commit your changes (format: "bober(<sprint-N>): <description>")
171
+ 9. Work on the feature branch, never on main/master
172
+
173
+ <IF iteration > 1>
174
+ ## IMPORTANT — This is a RETRY (iteration <N>)
175
+ The previous attempt failed evaluation. Here is the evaluator's feedback:
176
+ <paste evaluator feedback JSON>
177
+
178
+ Focus on fixing the specific failures listed above.
179
+ </IF>
180
+
181
+ ## Your Response
182
+ When done, respond with EXACTLY this JSON structure (no other text):
183
+ {
184
+ "contractId": "<contract ID>",
185
+ "status": "complete | partial | blocked",
186
+ "criteriaResults": [
187
+ {"criterionId": "sc-X-Y", "met": true/false, "evidence": "<evidence>"}
188
+ ],
189
+ "filesChanged": [
190
+ {"path": "<path>", "action": "created | modified | deleted", "description": "<what>"}
191
+ ],
192
+ "testsAdded": ["<test files>"],
193
+ "commits": ["<hash> - <message>"],
194
+ "blockers": ["<issues>"],
195
+ "notes": "<context for evaluator>"
196
+ }
197
+ ```
154
198
 
155
- ```json
199
+ **After the Generator subagent returns:**
200
+ 1. Parse the generator's response to extract the completion report.
201
+ 2. Verify commits were made: `git log --oneline -5`
202
+ 3. Save the generator report to `.bober/handoffs/gen-report-<contractId>-<iteration>.json`
203
+ 4. If the generator subagent crashed or returned an error, mark the sprint as `needs-rework` with note "Generator subagent failed".
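The post-generator steps above might look something like the sketch below. It assumes the subagent's response text is exactly the JSON report; `record_generator_report` is a hypothetical name, and the `git log` verification is left out.

```python
from __future__ import annotations

import json
from pathlib import Path


def record_generator_report(
    root: Path, contract_id: str, iteration: int, response_text: str
) -> dict | None:
    """Parse the generator's JSON response and save it; None if unparsable."""
    try:
        report = json.loads(response_text)
    except json.JSONDecodeError:
        # The caller decides what happens next, e.g. marking the sprint
        # needs-rework or falling back to files on disk.
        return None
    if not isinstance(report, dict):
        return None

    out_dir = root / ".bober" / "handoffs"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"gen-report-{contract_id}-{iteration}.json"
    out_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return report
```

Returning `None` rather than raising keeps a garbled response from crashing the orchestration loop.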
204
+
205
+ ## Step 5: Spawn the Evaluator Subagent
206
+
207
+ **Use the Agent tool to spawn the evaluator:**
208
+
209
+ ```
210
+ Agent tool call:
211
+ description: "Evaluate sprint <N>: <sprint title>"
212
+ prompt: <the full prompt below>
213
+ ```
214
+
215
+ **Evaluator prompt:**
216
+
217
+ ```
218
+ You are the Bober Evaluator subagent. You have been spawned by the orchestrator to evaluate a sprint.
219
+
220
+ ## Sprint Contract
221
+ <paste the full SprintContract JSON>
222
+
223
+ ## Generator's Completion Report
224
+ <paste the generator's completion report JSON>
225
+
226
+ ## Project Configuration
227
+ <paste relevant sections: commands, evaluator config>
228
+
229
+ ## Project Principles
230
+ <paste full text of .bober/principles.md or "No principles file found.">
231
+
232
+ ## Context
233
+ - Contract ID: <contractId>
234
+ - Spec ID: <specId>
235
+ - Sprint: <N> of <total>
236
+ - Iteration: <N>
237
+ - Branch: <current branch>
238
+
239
+ ## Instructions
240
+ 1. Read the SprintContract at .bober/contracts/<contractId>.json
241
+ 2. Read bober.config.json for configured eval strategies and commands
242
+ 3. Run each configured evaluation strategy using the commands from config
243
+ 4. Verify EVERY success criterion one by one
244
+ 5. Check for regressions
245
+ 6. Check adherence to project principles
246
+ 7. Produce a structured EvalResult
247
+
248
+ IMPORTANT: You do NOT have Write or Edit tools. Output the EvalResult JSON in your response.
249
+
250
+ ## Your Response
251
+ Respond with EXACTLY this JSON structure (no other text):
156
252
  {
157
- "handoffId": "handoff-<contractId>-eval-<iteration>",
158
- "type": "to-evaluator",
253
+ "evalId": "eval-<contractId>-<iteration>",
159
254
  "contractId": "<contract ID>",
160
255
  "specId": "<spec ID>",
161
256
  "timestamp": "<ISO-8601>",
162
- "iteration": 1,
163
- "context": {
164
- "generatorReport": { "<Generator's completion report>" },
165
- "changedFiles": ["<files the generator reports changing>"],
166
- "branch": "<current branch>"
167
- },
168
- "contract": { "<full SprintContract object>" },
169
- "config": {
170
- "commands": { "<commands section from bober.config.json>" },
171
- "evaluator": { "<evaluator section from bober.config.json>" }
172
- }
257
+ "iteration": <N>,
258
+ "overallResult": "pass | fail",
259
+ "score": { "criteriaTotal": N, "criteriaPassed": N, "criteriaFailed": N, "criteriaSkipped": N, "requiredPassed": N, "requiredFailed": N, "requiredTotal": N },
260
+ "strategyResults": [ {"strategy": "<type>", "required": true/false, "result": "pass|fail|skipped", "output": "<output>", "details": "<details>"} ],
261
+ "criteriaResults": [ {"criterionId": "sc-X-Y", "description": "<desc>", "required": true/false, "result": "pass|fail|skipped", "evidence": "<evidence>", "feedback": "<if failed>"} ],
262
+ "regressions": [],
263
+ "generatorFeedback": [],
264
+ "summary": "<2-3 sentence summary>"
173
265
  }
174
266
  ```
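To illustrate how the `score` object relates to `criteriaResults`, here is a minimal sketch that derives one from the other. The diff does not say how the evaluator actually computes these counts, so this is an assumption: `tally_score` is a hypothetical helper, and `requiredTotal` is taken to include skipped required criteria.

```python
from __future__ import annotations


def tally_score(criteria_results: list[dict]) -> dict:
    """Derive the score counters from a list of criteriaResults entries."""

    def count(result: str, required_only: bool = False) -> int:
        # Count entries with the given result, optionally required ones only.
        return sum(
            1
            for c in criteria_results
            if c["result"] == result and (c["required"] or not required_only)
        )

    return {
        "criteriaTotal": len(criteria_results),
        "criteriaPassed": count("pass"),
        "criteriaFailed": count("fail"),
        "criteriaSkipped": count("skipped"),
        "requiredPassed": count("pass", required_only=True),
        "requiredFailed": count("fail", required_only=True),
        "requiredTotal": sum(1 for c in criteria_results if c["required"]),
    }
```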
175
267
 
176
- Save the handoff to `.bober/handoffs/<handoffId>.json`.
177
-
178
- **Spawn the Evaluator:**
179
- Use the `bober-evaluator` agent definition. Pass it the handoff file path.
180
-
181
- **After the Evaluator completes:**
182
- 1. Read the EvalResult
183
- 2. Save the EvalResult to `.bober/eval-results/` if the evaluator could not (it lacks Write tools)
184
- 3. Determine pass/fail
268
+ **After the Evaluator subagent returns:**
269
+ 1. Parse the evaluator's response to extract the EvalResult.
270
+ 2. Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json` (the evaluator cannot write files).
271
+ 3. Determine pass/fail from the `overallResult` field.
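Because the evaluator cannot write files, the orchestrator persists the result itself. A minimal sketch, assuming the response has already been parsed into a dict with the fields shown above (`save_eval_result` is a hypothetical name):

```python
from __future__ import annotations

import json
from pathlib import Path


def save_eval_result(root: Path, eval_result: dict) -> bool:
    """Write the EvalResult under .bober/eval-results/ and report pass/fail."""
    out_dir = root / ".bober" / "eval-results"
    out_dir.mkdir(parents=True, exist_ok=True)
    name = f"eval-{eval_result['contractId']}-{eval_result['iteration']}.json"
    (out_dir / name).write_text(json.dumps(eval_result, indent=2), encoding="utf-8")
    # Pass/fail comes from the overallResult field only.
    return eval_result["overallResult"] == "pass"
```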
185
272
 
186
273
  ## Step 6: Process Evaluation Result
187
274
 
@@ -203,7 +290,7 @@ Use the `bober-evaluator` agent definition. Pass it the handoff file path.
203
290
 
204
291
  4. **Report success to the user:**
205
292
  ```
206
- Sprint <N> PASSED on iteration <M>.
293
+ === Sprint <N> PASSED on iteration <M> ===
207
294
 
208
295
  Completed: <sprint title>
209
296
  Key results:
@@ -211,7 +298,7 @@ Use the `bober-evaluator` agent definition. Pass it the handoff file path.
211
298
  - <criterion 2>: PASS
212
299
  ...
213
300
 
214
- Next sprint: <next sprint title> (run /bober.sprint to continue)
301
+ Next sprint: <next sprint title> (run /bober-sprint to continue)
215
302
  ```
216
303
 
217
304
  ### If the sprint FAILS and retries remain:
@@ -227,14 +314,15 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
227
314
 
228
315
  3. **Report the retry to the user:**
229
316
  ```
230
- Sprint <N> iteration <M> FAILED. <X> of <Y> criteria not met.
317
+ === Sprint <N> iteration <M> FAILED ===
318
+ <X> of <Y> criteria not met.
231
319
  Retrying (iteration <M+1> of <maxIterations>)...
232
320
 
233
321
  Failed criteria:
234
322
  - <criterion>: <brief reason>
235
323
  ```
236
324
 
237
- 4. **Go to Step 4** (spawn Generator again with feedback)
325
+ 4. **Go to Step 4** (spawn a FRESH Generator subagent with feedback)
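The retry loop between Steps 4 and 6 can be sketched as below. This is not the package's implementation: `run_generator` and `run_evaluator` are hypothetical stand-ins for the two Agent tool calls, and `max_iterations` corresponds to `evaluator.maxIterations` (default 3).

```python
from __future__ import annotations

from typing import Callable, Optional


def run_sprint(
    run_generator: Callable[[int, Optional[dict]], dict],
    run_evaluator: Callable[[int, dict], dict],
    max_iterations: int = 3,
) -> tuple[bool, int]:
    """Run generator/evaluator rounds; return (passed, iterations used)."""
    feedback: Optional[dict] = None
    for iteration in range(1, max_iterations + 1):
        # Each call stands for a FRESH subagent; the only carry-over is the
        # evaluator feedback passed in explicitly.
        report = run_generator(iteration, feedback)
        eval_result = run_evaluator(iteration, report)
        if eval_result["overallResult"] == "pass":
            return True, iteration
        feedback = eval_result
    return False, max_iterations
```

Passing the feedback in as an argument mirrors the rule that a subagent sees only what the orchestrator puts in its prompt.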
238
326
 
239
327
  ### If the sprint FAILS and no retries remain:
240
328
 
@@ -253,7 +341,7 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
253
341
 
254
342
  4. **Report failure to the user with full context:**
255
343
  ```
256
- Sprint <N> FAILED after <maxIterations> iterations.
344
+ === Sprint <N> FAILED after <maxIterations> iterations ===
257
345
 
258
346
  Contract: <contract title>
259
347
  Failed criteria:
@@ -265,8 +353,8 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
265
353
  Recommended actions:
266
354
  - Review the failed criteria and evaluator feedback
267
355
  - Consider simplifying the sprint scope
268
- - Run /bober.sprint <N> to retry from scratch
269
- - Run /bober.plan to revise the plan
356
+ - Run /bober-sprint <N> to retry from scratch
357
+ - Run /bober-plan to revise the plan
270
358
  ```
271
359
 
272
360
  ## Step 7: Context Reset
@@ -274,20 +362,22 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
274
362
  After a sprint completes (pass or fail), manage context:
275
363
 
276
364
  Read `pipeline.contextReset` from config:
277
- - `always`: Context is fully reset between sprints. The next sprint starts fresh with only the handoff document.
278
- - `on-threshold`: Context resets only if the conversation is getting long. Not applicable in single-sprint skill execution.
279
- - `never`: Context carries forward. Not recommended.
280
-
281
- ## Next Steps
282
-
283
- After completing this phase, suggest the following next steps to the user:
284
- - `/bober-eval` — Evaluate the current sprint output independently
285
- - `/bober-sprint` — Execute the next sprint in the plan
365
+ - `always`: Each subagent already gets a fresh context. This is the default behavior.
366
+ - `on-threshold`: Behaves the same as `always` under the subagent architecture.
367
+ - `never`: Include richer context summaries in the handoff for the next sprint.
286
368
 
287
369
  ## Error Handling
288
370
 
371
+ - **Subagent crash/timeout:** If the Agent tool call fails, log the error. Do not let it crash the orchestration. Mark the sprint appropriately and report to the user.
372
+ - **Subagent returns malformed response:** Read files on disk as the source of truth. The subagent may have saved files correctly even if its text response was garbled.
289
373
  - **Generator fails to produce any output:** Mark sprint as `needs-rework` with note "Generator produced no output"
290
374
  - **Evaluator cannot run strategies:** Report which strategies failed to execute and why. If a required strategy cannot run, mark sprint as `needs-rework` with a configuration issue note.
291
375
  - **Git conflicts:** Report the conflict to the user. Do not auto-resolve.
292
- - **Build broken before sprint started:** Verify the build passes BEFORE starting the Generator. If the build is already broken, report this and do not proceed.
376
+ - **Build broken before sprint started:** Verify the build passes BEFORE spawning the Generator. If the build is already broken, report this and do not proceed.
293
377
  - **Missing dependencies:** If `npm install` or equivalent has not been run, run it before starting.
378
+
379
+ ## Next Steps
380
+
381
+ After completing this phase, suggest the following next steps to the user:
382
+ - `/bober-eval` — Evaluate the current sprint output independently
383
+ - `/bober-sprint` — Execute the next sprint in the plan
@@ -6,7 +6,8 @@
6
6
  { "type": "typecheck", "required": true },
7
7
  { "type": "lint", "required": true },
8
8
  { "type": "build", "required": true },
9
- { "type": "unit-test", "required": true }
9
+ { "type": "unit-test", "required": true },
10
+ { "type": "playwright", "required": false }
10
11
  ], "maxIterations": 3 },
11
12
  "sprint": { "maxSprints": 10, "requireContracts": true, "sprintSize": "medium" },
12
13
  "pipeline": { "maxIterations": 20, "requireApproval": false, "contextReset": "always" },