agent-bober 0.4.2 → 0.5.0
- package/README.md +60 -4
- package/agents/bober-evaluator.md +84 -8
- package/agents/bober-generator.md +102 -0
- package/agents/bober-planner.md +24 -0
- package/dist/cli/commands/init.js +1 -0
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/evaluators/builtin/playwright.d.ts +11 -0
- package/dist/evaluators/builtin/playwright.d.ts.map +1 -1
- package/dist/evaluators/builtin/playwright.js +259 -12
- package/dist/evaluators/builtin/playwright.js.map +1 -1
- package/package.json +1 -1
- package/skills/bober.eval/SKILL.md +145 -148
- package/skills/bober.playwright/SKILL.md +429 -0
- package/skills/bober.playwright/references/playwright-patterns.md +377 -0
- package/skills/bober.run/SKILL.md +433 -119
- package/skills/bober.sprint/SKILL.md +147 -57
- package/templates/presets/nextjs/bober.config.json +2 -1
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: bober.sprint
|
|
3
|
-
description: Execute the next pending sprint —
|
|
3
|
+
description: Execute the next pending sprint — spawn generator and evaluator subagents, orchestrate the retry loop until passing or exhausting retries.
|
|
4
4
|
argument-hint: "[sprint-number]"
|
|
5
5
|
handoffs:
|
|
6
6
|
- label: "Evaluate Sprint"
|
|
@@ -11,9 +11,11 @@ handoffs:
|
|
|
11
11
|
prompt: "Execute the next sprint"
|
|
12
12
|
---
|
|
13
13
|
|
|
14
|
-
# bober.sprint — Sprint Execution
|
|
14
|
+
# bober.sprint — Sprint Execution Orchestrator
|
|
15
15
|
|
|
16
|
-
You are
|
|
16
|
+
You are the **orchestrator** for a single sprint cycle. You do NOT implement code or evaluate it yourself. You spawn subagents using the **Agent tool** for both the generator (implementation) and evaluator (verification) roles, and you manage the retry loop between them.
|
|
17
|
+
|
|
18
|
+
Each subagent runs in its own isolated context window. It receives ONLY the information you explicitly pass in its prompt. After a subagent completes, you read the files it created on disk to get the full results.
|
|
17
19
|
|
|
18
20
|
## Prerequisites
|
|
19
21
|
|
|
@@ -22,7 +24,9 @@ Before starting, verify these exist:
|
|
|
22
24
|
- At least one PlanSpec in `.bober/specs/`
|
|
23
25
|
- At least one SprintContract in `.bober/contracts/`
|
|
24
26
|
|
|
25
|
-
If any are missing, tell the user to run `/bober
|
|
27
|
+
If any are missing, tell the user to run `/bober-plan` first.
|
|
28
|
+
|
|
29
|
+
Also read `.bober/principles.md` if it exists. You will include the principles text in every subagent prompt.
|
|
26
30
|
|
|
27
31
|
## Step 1: Identify the Target Sprint
|
|
28
32
|
|
|
@@ -71,9 +75,9 @@ When a contract status is `proposed`, it has not yet been reviewed for executabi
|
|
|
71
75
|
{"event":"sprint-started","contractId":"...","specId":"...","timestamp":"..."}
|
|
72
76
|
```
|
|
73
77
|
|
|
74
|
-
## Step 3:
|
|
78
|
+
## Step 3: Build the Context Handoff
|
|
75
79
|
|
|
76
|
-
|
|
80
|
+
Build a ContextHandoff document for the Generator. This document is the ONLY context the Generator subagent receives — it must be self-contained.
|
|
77
81
|
|
|
78
82
|
**ContextHandoff structure:**
|
|
79
83
|
```json
|
|
@@ -103,6 +107,7 @@ Create a ContextHandoff document for the Generator. This document is the ONLY co
|
|
|
103
107
|
"commands": { "<commands section from bober.config.json>" },
|
|
104
108
|
"generator": { "<generator section from bober.config.json>" }
|
|
105
109
|
},
|
|
110
|
+
"principles": "<full text of .bober/principles.md or null>",
|
|
106
111
|
"evaluatorFeedback": null
|
|
107
112
|
}
|
|
108
113
|
```
|
|
@@ -129,9 +134,7 @@ Save the handoff to `.bober/handoffs/<handoffId>.json`.
|
|
|
129
134
|
}
|
|
130
135
|
```
|
|
131
136
|
|
|
132
|
-
## Step 4: Spawn the Generator
|
|
133
|
-
|
|
134
|
-
Invoke the `bober-generator` subagent with the handoff document.
|
|
137
|
+
## Step 4: Spawn the Generator Subagent
|
|
135
138
|
|
|
136
139
|
**Before spawning:**
|
|
137
140
|
1. Ensure the correct git branch exists and is checked out:
|
|
@@ -140,48 +143,132 @@ Invoke the `bober-generator` subagent with the handoff document.
|
|
|
140
143
|
```
|
|
141
144
|
2. If this is a retry, the Generator should be on the same branch with the previous attempt's code still present.
|
|
142
145
|
|
|
143
|
-
**
|
|
144
|
-
Use the `bober-generator` agent definition. Pass it the handoff file path.
|
|
146
|
+
**Use the Agent tool to spawn the generator:**
|
|
145
147
|
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
148
|
+
```
|
|
149
|
+
Agent tool call:
|
|
150
|
+
description: "Sprint <N>: <sprint title>"
|
|
151
|
+
prompt: <the full prompt below>
|
|
152
|
+
```
|
|
150
153
|
|
|
151
|
-
|
|
154
|
+
**Generator prompt:**
|
|
152
155
|
|
|
153
|
-
|
|
156
|
+
```
|
|
157
|
+
You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.
|
|
158
|
+
|
|
159
|
+
## Context Handoff
|
|
160
|
+
<paste the FULL handoff JSON>
|
|
161
|
+
|
|
162
|
+
## Instructions
|
|
163
|
+
1. Read the SprintContract at .bober/contracts/<contractId>.json
|
|
164
|
+
2. Read the PlanSpec at .bober/specs/<specId>.json for broader context
|
|
165
|
+
3. Read bober.config.json for commands configuration
|
|
166
|
+
4. Read .bober/principles.md if it exists — adhere to all principles strictly
|
|
167
|
+
5. Read the files listed in the contract's estimatedFiles
|
|
168
|
+
6. Implement the sprint according to the contract's success criteria
|
|
169
|
+
7. Self-verify: run build, typecheck, lint, and test commands
|
|
170
|
+
8. Commit your changes (format: "bober(<sprint-N>): <description>")
|
|
171
|
+
9. Work on the feature branch, never on main/master
|
|
172
|
+
|
|
173
|
+
<IF iteration > 1>
|
|
174
|
+
## IMPORTANT — This is a RETRY (iteration <N>)
|
|
175
|
+
The previous attempt failed evaluation. Here is the evaluator's feedback:
|
|
176
|
+
<paste evaluator feedback JSON>
|
|
177
|
+
|
|
178
|
+
Focus on fixing the specific failures listed above.
|
|
179
|
+
</IF>
|
|
180
|
+
|
|
181
|
+
## Your Response
|
|
182
|
+
When done, respond with EXACTLY this JSON structure (no other text):
|
|
183
|
+
{
|
|
184
|
+
"contractId": "<contract ID>",
|
|
185
|
+
"status": "complete | partial | blocked",
|
|
186
|
+
"criteriaResults": [
|
|
187
|
+
{"criterionId": "sc-X-Y", "met": true/false, "evidence": "<evidence>"}
|
|
188
|
+
],
|
|
189
|
+
"filesChanged": [
|
|
190
|
+
{"path": "<path>", "action": "created | modified | deleted", "description": "<what>"}
|
|
191
|
+
],
|
|
192
|
+
"testsAdded": ["<test files>"],
|
|
193
|
+
"commits": ["<hash> - <message>"],
|
|
194
|
+
"blockers": ["<issues>"],
|
|
195
|
+
"notes": "<context for evaluator>"
|
|
196
|
+
}
|
|
197
|
+
```
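The generator prompt template above could be assembled programmatically along these lines. This is a minimal TypeScript sketch under stated assumptions, not the package's implementation: `buildGeneratorPrompt` and its parameter names are hypothetical, the instruction list and response schema are left as placeholders, and the retry heading is abbreviated. The retry section is emitted only when the iteration is greater than 1, mirroring the `<IF iteration > 1>` marker in the template.

```typescript
// Hypothetical helper (not part of agent-bober): assembles the generator
// prompt sections in the order used by the template.
function buildGeneratorPrompt(
  handoffJson: string,
  iteration: number,
  evaluatorFeedbackJson: string | null
): string {
  const sections: string[] = [
    "You are the Bober Generator subagent. You have been spawned by the orchestrator to implement a sprint.",
    "## Context Handoff\n" + handoffJson,
    "## Instructions\n(numbered instructions from the template go here)",
  ];

  // The retry section only appears on iteration 2 and later, and only
  // when evaluator feedback is actually available.
  if (iteration > 1 && evaluatorFeedbackJson !== null) {
    sections.push(
      "## RETRY (iteration " + iteration + ")\n" +
        "The previous attempt failed evaluation. Here is the evaluator's feedback:\n" +
        evaluatorFeedbackJson + "\n\n" +
        "Focus on fixing the specific failures listed above."
    );
  }

  sections.push("## Your Response\n(completion report JSON schema from the template goes here)");
  return sections.join("\n\n");
}
```

Because each subagent receives only what is in its prompt, the handoff JSON is passed in full rather than by file path alone.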
|
|
154
198
|
|
|
155
|
-
|
|
199
|
+
**After the Generator subagent returns:**
|
|
200
|
+
1. Parse the generator's response to extract the completion report.
|
|
201
|
+
2. Verify commits were made: `git log --oneline -5`
|
|
202
|
+
3. Save the generator report to `.bober/handoffs/gen-report-<contractId>-<iteration>.json`
|
|
203
|
+
4. If the generator subagent crashed or returned an error, mark the sprint as `needs-rework` with note "Generator subagent failed".
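The post-generator steps above (parse the report, then save it under a per-iteration file name) might look roughly like the following. This is a sketch, assuming the generator followed the response format in the prompt; `parseGeneratorReport` and `generatorReportPath` are hypothetical names, and only a subset of the report fields is checked.

```typescript
type GeneratorStatus = "complete" | "partial" | "blocked";

// Subset of the completion report fields listed in the generator prompt.
interface GeneratorReport {
  contractId: string;
  status: GeneratorStatus;
  commits: string[];
  blockers: string[];
}

// Keeps only string entries; anything that is not an array becomes [].
function toStringArray(value: unknown): string[] {
  if (!Array.isArray(value)) return [];
  return value.filter((x): x is string => typeof x === "string");
}

// Returns the parsed report, or null when the response is not the expected JSON.
// A null result corresponds to the "Generator subagent failed" case.
function parseGeneratorReport(responseText: string): GeneratorReport | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(responseText);
  } catch (err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const record = parsed as Record<string, unknown>;
  const contractId = record["contractId"];
  const status = record["status"];
  if (typeof contractId !== "string") return null;
  if (status !== "complete" && status !== "partial" && status !== "blocked") return null;

  return {
    contractId: contractId,
    status: status,
    commits: toStringArray(record["commits"]),
    blockers: toStringArray(record["blockers"]),
  };
}

// File name pattern taken from step 3 above.
function generatorReportPath(contractId: string, iteration: number): string {
  return ".bober/handoffs/gen-report-" + contractId + "-" + iteration + ".json";
}
```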
|
|
204
|
+
|
|
205
|
+
## Step 5: Spawn the Evaluator Subagent
|
|
206
|
+
|
|
207
|
+
**Use the Agent tool to spawn the evaluator:**
|
|
208
|
+
|
|
209
|
+
```
|
|
210
|
+
Agent tool call:
|
|
211
|
+
description: "Evaluate sprint <N>: <sprint title>"
|
|
212
|
+
prompt: <the full prompt below>
|
|
213
|
+
```
|
|
214
|
+
|
|
215
|
+
**Evaluator prompt:**
|
|
216
|
+
|
|
217
|
+
```
|
|
218
|
+
You are the Bober Evaluator subagent. You have been spawned by the orchestrator to evaluate a sprint.
|
|
219
|
+
|
|
220
|
+
## Sprint Contract
|
|
221
|
+
<paste the full SprintContract JSON>
|
|
222
|
+
|
|
223
|
+
## Generator's Completion Report
|
|
224
|
+
<paste the generator's completion report JSON>
|
|
225
|
+
|
|
226
|
+
## Project Configuration
|
|
227
|
+
<paste relevant sections: commands, evaluator config>
|
|
228
|
+
|
|
229
|
+
## Project Principles
|
|
230
|
+
<paste full text of .bober/principles.md or "No principles file found.">
|
|
231
|
+
|
|
232
|
+
## Context
|
|
233
|
+
- Contract ID: <contractId>
|
|
234
|
+
- Spec ID: <specId>
|
|
235
|
+
- Sprint: <N> of <total>
|
|
236
|
+
- Iteration: <N>
|
|
237
|
+
- Branch: <current branch>
|
|
238
|
+
|
|
239
|
+
## Instructions
|
|
240
|
+
1. Read the SprintContract at .bober/contracts/<contractId>.json
|
|
241
|
+
2. Read bober.config.json for configured eval strategies and commands
|
|
242
|
+
3. Run each configured evaluation strategy using the commands from config
|
|
243
|
+
4. Verify EVERY success criterion one by one
|
|
244
|
+
5. Check for regressions
|
|
245
|
+
6. Check adherence to project principles
|
|
246
|
+
7. Produce a structured EvalResult
|
|
247
|
+
|
|
248
|
+
IMPORTANT: You do NOT have Write or Edit tools. Output the EvalResult JSON in your response.
|
|
249
|
+
|
|
250
|
+
## Your Response
|
|
251
|
+
Respond with EXACTLY this JSON structure (no other text):
|
|
156
252
|
{
|
|
157
|
-
"
|
|
158
|
-
"type": "to-evaluator",
|
|
253
|
+
"evalId": "eval-<contractId>-<iteration>",
|
|
159
254
|
"contractId": "<contract ID>",
|
|
160
255
|
"specId": "<spec ID>",
|
|
161
256
|
"timestamp": "<ISO-8601>",
|
|
162
|
-
"iteration":
|
|
163
|
-
"
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
"
|
|
169
|
-
"
|
|
170
|
-
"commands": { "<commands section from bober.config.json>" },
|
|
171
|
-
"evaluator": { "<evaluator section from bober.config.json>" }
|
|
172
|
-
}
|
|
257
|
+
"iteration": <N>,
|
|
258
|
+
"overallResult": "pass | fail",
|
|
259
|
+
"score": { "criteriaTotal": N, "criteriaPassed": N, "criteriaFailed": N, "criteriaSkipped": N, "requiredPassed": N, "requiredFailed": N, "requiredTotal": N },
|
|
260
|
+
"strategyResults": [ {"strategy": "<type>", "required": true/false, "result": "pass|fail|skipped", "output": "<output>", "details": "<details>"} ],
|
|
261
|
+
"criteriaResults": [ {"criterionId": "sc-X-Y", "description": "<desc>", "required": true/false, "result": "pass|fail|skipped", "evidence": "<evidence>", "feedback": "<if failed>"} ],
|
|
262
|
+
"regressions": [],
|
|
263
|
+
"generatorFeedback": [],
|
|
264
|
+
"summary": "<2-3 sentence summary>"
|
|
173
265
|
}
|
|
174
266
|
```
|
|
175
267
|
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
|
|
179
|
-
|
|
180
|
-
|
|
181
|
-
**After the Evaluator completes:**
|
|
182
|
-
1. Read the EvalResult
|
|
183
|
-
2. Save the EvalResult to `.bober/eval-results/` if the evaluator could not (it lacks Write tools)
|
|
184
|
-
3. Determine pass/fail
|
|
268
|
+
**After the Evaluator subagent returns:**
|
|
269
|
+
1. Parse the evaluator's response to extract the EvalResult.
|
|
270
|
+
2. Save the EvalResult to `.bober/eval-results/eval-<contractId>-<iteration>.json` (the evaluator cannot write files).
|
|
271
|
+
3. Determine pass/fail from the `overallResult` field.
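Since the evaluator cannot write files, the orchestrator derives the output path itself. A small sketch of the naming and pass/fail check described in the steps above; the function names are hypothetical, while the `eval-<contractId>-<iteration>` pattern and the `overallResult` field come from the prompt template.

```typescript
// Only the fields needed for naming and the pass/fail decision.
interface EvalResultSummary {
  contractId: string;
  iteration: number;
  overallResult: "pass" | "fail";
}

// Matches the "evalId" format in the evaluator response template.
function evalIdFor(contractId: string, iteration: number): string {
  return "eval-" + contractId + "-" + iteration;
}

// Where the orchestrator saves the EvalResult on the evaluator's behalf.
function evalResultPath(result: EvalResultSummary): string {
  return ".bober/eval-results/" + evalIdFor(result.contractId, result.iteration) + ".json";
}

// Pass/fail is read directly from overallResult, not recomputed.
function sprintPassed(result: EvalResultSummary): boolean {
  return result.overallResult === "pass";
}
```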
|
|
185
272
|
|
|
186
273
|
## Step 6: Process Evaluation Result
|
|
187
274
|
|
|
@@ -203,7 +290,7 @@ Use the `bober-evaluator` agent definition. Pass it the handoff file path.
|
|
|
203
290
|
|
|
204
291
|
4. **Report success to the user:**
|
|
205
292
|
```
|
|
206
|
-
Sprint <N> PASSED on iteration <M
|
|
293
|
+
=== Sprint <N> PASSED on iteration <M> ===
|
|
207
294
|
|
|
208
295
|
Completed: <sprint title>
|
|
209
296
|
Key results:
|
|
@@ -211,7 +298,7 @@ Use the `bober-evaluator` agent definition. Pass it the handoff file path.
|
|
|
211
298
|
- <criterion 2>: PASS
|
|
212
299
|
...
|
|
213
300
|
|
|
214
|
-
Next sprint: <next sprint title> (run /bober
|
|
301
|
+
Next sprint: <next sprint title> (run /bober-sprint to continue)
|
|
215
302
|
```
|
|
216
303
|
|
|
217
304
|
### If the sprint FAILS and retries remain:
|
|
@@ -227,14 +314,15 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
|
|
|
227
314
|
|
|
228
315
|
3. **Report the retry to the user:**
|
|
229
316
|
```
|
|
230
|
-
Sprint <N> iteration <M> FAILED
|
|
317
|
+
=== Sprint <N> iteration <M> FAILED ===
|
|
318
|
+
<X> of <Y> criteria not met.
|
|
231
319
|
Retrying (iteration <M+1> of <maxIterations>)...
|
|
232
320
|
|
|
233
321
|
Failed criteria:
|
|
234
322
|
- <criterion>: <brief reason>
|
|
235
323
|
```
|
|
236
324
|
|
|
237
|
-
4. **Go to Step 4** (spawn Generator
|
|
325
|
+
4. **Go to Step 4** (spawn a FRESH Generator subagent with feedback)
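The branching between the three outcomes in Step 6 can be summarized as a small decision function. This is a sketch under an assumption: the diff truncates the exact retry condition, so `iteration < maxIterations` is inferred from the "Retrying (iteration <M+1> of <maxIterations>)" message. The action labels and the function name are hypothetical; the default of 3 comes from the hunk header.

```typescript
type NextAction = "report-success" | "retry" | "report-failure";

// Decides what the orchestrator does after an evaluation.
// Assumption: a retry is allowed while the current iteration is below maxIterations.
function decideNextAction(
  overallResult: "pass" | "fail",
  iteration: number,
  maxIterations: number = 3
): NextAction {
  if (overallResult === "pass") return "report-success";
  return iteration < maxIterations ? "retry" : "report-failure";
}
```

On `"retry"`, the orchestrator returns to Step 4 and spawns a fresh Generator subagent with the evaluator feedback included in its prompt.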
|
|
238
326
|
|
|
239
327
|
### If the sprint FAILS and no retries remain:
|
|
240
328
|
|
|
@@ -253,7 +341,7 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
|
|
|
253
341
|
|
|
254
342
|
4. **Report failure to the user with full context:**
|
|
255
343
|
```
|
|
256
|
-
Sprint <N> FAILED after <maxIterations> iterations
|
|
344
|
+
=== Sprint <N> FAILED after <maxIterations> iterations ===
|
|
257
345
|
|
|
258
346
|
Contract: <contract title>
|
|
259
347
|
Failed criteria:
|
|
@@ -265,8 +353,8 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
|
|
|
265
353
|
Recommended actions:
|
|
266
354
|
- Review the failed criteria and evaluator feedback
|
|
267
355
|
- Consider simplifying the sprint scope
|
|
268
|
-
- Run /bober
|
|
269
|
-
- Run /bober
|
|
356
|
+
- Run /bober-sprint <N> to retry from scratch
|
|
357
|
+
- Run /bober-plan to revise the plan
|
|
270
358
|
```
|
|
271
359
|
|
|
272
360
|
## Step 7: Context Reset
|
|
@@ -274,20 +362,22 @@ Check `evaluator.maxIterations` from `bober.config.json` (default: 3). If the cu
|
|
|
274
362
|
After a sprint completes (pass or fail), manage context:
|
|
275
363
|
|
|
276
364
|
Read `pipeline.contextReset` from config:
|
|
277
|
-
- `always`:
|
|
278
|
-
- `on-threshold`:
|
|
279
|
-
- `never`:
|
|
280
|
-
|
|
281
|
-
## Next Steps
|
|
282
|
-
|
|
283
|
-
After completing this phase, suggest the following next steps to the user:
|
|
284
|
-
- `/bober-eval` — Evaluate the current sprint output independently
|
|
285
|
-
- `/bober-sprint` — Execute the next sprint in the plan
|
|
365
|
+
- `always`: Each subagent already gets a fresh context. This is the default behavior.
|
|
366
|
+
- `on-threshold`: Behaves the same as `always` under the subagent architecture, because each subagent already starts with a fresh context.
|
|
367
|
+
- `never`: Include richer context summaries in the handoff for the next sprint.
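Read as a lookup, the three `pipeline.contextReset` values reduce to a single distinction under the subagent architecture. A hedged sketch; `contextPolicy` and the `ContextPolicy` shape are hypothetical and only restate the bullets above.

```typescript
type ContextReset = "always" | "on-threshold" | "never";

interface ContextPolicy {
  // Every subagent is spawned with its own context window regardless of mode.
  freshSubagentContext: boolean;
  // Only "never" asks for richer summaries in the next sprint's handoff.
  includeRicherSummaries: boolean;
}

function contextPolicy(mode: ContextReset): ContextPolicy {
  return {
    freshSubagentContext: true,
    includeRicherSummaries: mode === "never",
  };
}
```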
|
|
286
368
|
|
|
287
369
|
## Error Handling
|
|
288
370
|
|
|
371
|
+
- **Subagent crash/timeout:** If the Agent tool call fails, log the error. Do not let it crash the orchestration. Mark the sprint appropriately and report to the user.
|
|
372
|
+
- **Subagent returns malformed response:** Read files on disk as the source of truth. The subagent may have saved files correctly even if its text response was garbled.
|
|
289
373
|
- **Generator fails to produce any output:** Mark sprint as `needs-rework` with note "Generator produced no output"
|
|
290
374
|
- **Evaluator cannot run strategies:** Report which strategies failed to execute and why. If a required strategy cannot run, mark sprint as `needs-rework` with a configuration issue note.
|
|
291
375
|
- **Git conflicts:** Report the conflict to the user. Do not auto-resolve.
|
|
292
|
-
- **Build broken before sprint started:** Verify the build passes BEFORE
|
|
376
|
+
- **Build broken before sprint started:** Verify the build passes BEFORE spawning the Generator. If the build is already broken, report this and do not proceed.
|
|
293
377
|
- **Missing dependencies:** If `npm install` or equivalent has not been run, run it before starting.
|
|
378
|
+
|
|
379
|
+
## Next Steps
|
|
380
|
+
|
|
381
|
+
After completing this phase, suggest the following next steps to the user:
|
|
382
|
+
- `/bober-eval` — Evaluate the current sprint output independently
|
|
383
|
+
- `/bober-sprint` — Execute the next sprint in the plan
|
|
@@ -6,7 +6,8 @@
|
|
|
6
6
|
{ "type": "typecheck", "required": true },
|
|
7
7
|
{ "type": "lint", "required": true },
|
|
8
8
|
{ "type": "build", "required": true },
|
|
9
|
-
{ "type": "unit-test", "required": true }
|
|
9
|
+
{ "type": "unit-test", "required": true },
|
|
10
|
+
{ "type": "playwright", "required": false }
|
|
10
11
|
], "maxIterations": 3 },
|
|
11
12
|
"sprint": { "maxSprints": 10, "requireContracts": true, "sprintSize": "medium" },
|
|
12
13
|
"pipeline": { "maxIterations": 20, "requireApproval": false, "contextReset": "always" },
|
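The preset change above adds `playwright` as an optional strategy (`"required": false`). One plausible reading, stated here as an assumption rather than confirmed behavior, is that only failures of required strategies block a sprint. The sketch below uses the `strategyResults` shape from the evaluator response template; `blockingStrategyFailures` is a hypothetical name.

```typescript
interface StrategyResult {
  strategy: string;
  required: boolean;
  result: "pass" | "fail" | "skipped";
}

// Lists the strategies whose failure would block the sprint.
// Assumption: optional strategies (required: false) are reported but never block.
function blockingStrategyFailures(results: StrategyResult[]): string[] {
  return results
    .filter((r) => r.required && r.result === "fail")
    .map((r) => r.strategy);
}
```

With this reading, a failing `playwright` run in the Next.js preset would appear in the EvalResult without forcing a retry on its own.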