valent-pipeline 0.4.3 → 0.5.0
- package/README.md +26 -19
- package/package.json +1 -1
- package/pipeline/docs/lean-spawn-human-tasks.md +2 -2
- package/pipeline/orchestrators/claude-code/README.md +18 -2
- package/pipeline/orchestrators/claude-code/plan.workflow.js +56 -16
- package/pipeline/orchestrators/claude-code/retro.workflow.js +58 -17
- package/pipeline/orchestrators/claude-code/sprint.workflow.js +97 -9
- package/pipeline/orchestrators/codex/README.md +3 -3
- package/pipeline/orchestrators/codex/lead-loop.md +3 -3
- package/pipeline/prompts/lead.md +1 -1
- package/pipeline/schemas/task-graph.schema.json +1 -1
- package/pipeline/steps/common/distilled-handoff-format.md +1 -1
- package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +1 -1
- package/pipeline/steps/orchestration/sprint-plan.md +2 -2
- package/pipeline/steps/retrospective/calibration.md +1 -1
- package/pipeline/task-graphs/frontend-only.yaml +1 -1
- package/pipeline/task-graphs/fullstack-web.yaml +1 -1
- package/pipeline/task-graphs/mobile-app.yaml +1 -1
- package/pipeline/templates/bend-handoff.template.md +1 -1
- package/pipeline/templates/critic-review.template.md +1 -1
- package/pipeline/templates/data-handoff.template.md +1 -1
- package/pipeline/templates/docgen-handoff.template.md +1 -1
- package/pipeline/templates/execution-report.template.md +1 -1
- package/pipeline/templates/fend-handoff.template.md +1 -1
- package/pipeline/templates/iac-handoff.template.md +1 -1
- package/pipeline/templates/judge-decision.template.md +1 -1
- package/pipeline/templates/libdev-handoff.template.md +1 -1
- package/pipeline/templates/mcp-dev-handoff.template.md +1 -1
- package/pipeline/templates/mobile-handoff.template.md +1 -1
- package/pipeline/templates/qa-test-spec.template.md +1 -1
- package/pipeline/templates/readiness-review.template.md +1 -1
- package/pipeline/templates/reqs-brief.template.md +1 -1
- package/pipeline/templates/uxa-spec.template.md +1 -1
- package/skills/valent-configure/SKILL.md +10 -5
- package/skills/valent-run-epic-workflow/SKILL.md +4 -4
- package/skills/valent-run-project-workflow/SKILL.md +4 -4
- package/skills/valent-run-story-workflow/SKILL.md +4 -3
- package/src/commands/init.js +45 -23
- package/src/commands/resolve-graph.js +3 -6
- package/src/commands/upgrade.js +28 -5
- package/src/lib/config-schema.js +8 -3
- package/src/lib/graph.js +2 -6
- package/src/lib/handoff.js +2 -6
- package/src/lib/paths.js +26 -0
package/README.md
CHANGED
@@ -7,12 +7,11 @@ You write the story. The pipeline handles requirements analysis, UX specificatio
 ## Quick Start

 ```bash
-#
-
-
-# Initialize in your project
+# Initialize in your project — no global install needed.
+# `init` scaffolds .valent-pipeline/ AND vendors the CLI into it, so every project
+# pins its own version and the agents call it via `node .valent-pipeline/bin/cli.js`.
 cd your-project
-valent-pipeline init
+npx valent-pipeline init

 # Run the interactive configuration wizard
 /valent-configure
@@ -21,6 +20,13 @@ valent-pipeline init
 /valent-run-story STORY-001
 ```

+> **No global install required.** `npx valent-pipeline init` copies the CLI (`bin/` + `src/`)
+> into `.valent-pipeline/` and installs its dependencies there. Agents invoke
+> `node .valent-pipeline/bin/cli.js <cmd>` — so different projects can run different CLI
+> versions, and you can customize the pipeline (including `src/`) per project. A global
+> install (`npm install -g valent-pipeline`) still works if you prefer the bare
+> `valent-pipeline` command for manual use.
+
 ## How It Works

 A persistent **Lead** agent reads your story, assembles a team of specialist agents, and orchestrates them through a dependency-driven pipeline:
@@ -108,39 +114,40 @@ Specialized agents that replace BEND for non-API project types:
 - Claude Code CLI
 - npm account (for publishing)

-### Install
-
-```bash
-npm install -g valent-pipeline
-```
-
 ### Initialize a Project

 ```bash
 cd your-project
-valent-pipeline init
+npx valent-pipeline init
 ```

 The init command:
 1. Runs an interactive wizard to set project type, tech stack, and model assignments
 2. Copies pipeline infrastructure to `.valent-pipeline/`
-3.
-
-
+3. **Vendors the CLI** (`bin/` + `src/`) into `.valent-pipeline/` and installs its runtime
+dependencies there, so the project is self-contained and agents run
+`node .valent-pipeline/bin/cli.js <cmd>` — no global install or `npx` round-trip at run time
+4. Generates `pipeline-config.yaml` from your answers
+5. Creates knowledge directories and initializes the backlog
+6. Installs Claude Code skills for story/epic/project execution
+
+A global install (`npm install -g valent-pipeline`) is optional — only needed if you want the
+bare `valent-pipeline` command available for manual use outside a project.

 ### Upgrade

 ```bash
-valent-pipeline upgrade
-valent-pipeline upgrade --dry-run # preview changes without applying
+npx valent-pipeline upgrade
+npx valent-pipeline upgrade --dry-run # preview changes without applying
 ```

-Upgrades pipeline infrastructure (prompts, templates, task graphs, scripts)
+Upgrades pipeline infrastructure (prompts, templates, task graphs, scripts) **and re-vendors the
+CLI** (`bin/` + `src/`) while preserving your project-specific files (config, knowledge, backlog).

 ### Validate Configuration

 ```bash
-valent-pipeline config validate
+node .valent-pipeline/bin/cli.js config validate
 ```

 ## Configuration
package/package.json
CHANGED
@@ -135,8 +135,8 @@ Then test other commands:
 ```bash
 valent-pipeline config validate # should exit 0
 valent-pipeline upgrade --dry-run # should show no changes (just installed)
-valent-pipeline db rebuild # indexes story artifacts (auto-creates DB if missing)
-valent-pipeline db rebuild # should complete (no stories to index yet)
+node .valent-pipeline/bin/cli.js db rebuild # indexes story artifacts (auto-creates DB if missing)
+node .valent-pipeline/bin/cli.js db rebuild # should complete (no stories to index yet)
 ```

 Clean up:
@@ -45,19 +45,35 @@ incl. a resume-safety lint), but:
 | 3b CRITIC | `parallel([blind, edge, acceptance])` independent agents → triage barrier | one CRITIC context, passes anchored on each other |
 | Spawn context | `buildPrompt()` mirrors `spawn.template.md` (Setup/Task/Trigger/Completion) | terse inline instructions |
 | Roll-over | a rejected story is recorded and the batch continues | — |
+| Empty-graph guard | a resolved graph with zero dev agents throws a diagnostic before Build | silent empty Build → CRITIC looping on an empty diff |
+| No-diff guard | if dev agents report no files, CRITIC/QA/JUDGE are skipped and the story rolls over `blocked` | 4-agent CRITIC re-reviewing an empty diff to the cap |
+| Non-actionable verdict | a gate/CRITIC `needs-review` escalates immediately (no re-run) | re-reviewing a structural blocker until the cap |
 | Resume | journal (`resumeFromRunId`) | disk-state rehydration + re-decide |

 ## Args

 ```js
 // batch form (a planned sprint)
-{ stories: [{ storyId, projectType, profiles }, ...], maxRejectionCycles? }
+{ stories: [{ storyId, projectType, profiles }, ...], maxRejectionCycles?, models? }
 // single-story form (back-compat)
-{ storyId, projectType, profiles?, maxRejectionCycles? }
+{ storyId, projectType, profiles?, maxRejectionCycles?, models? }
 ```

 Returns `{ shipped, stories_shipped, stories_rolled_over, results: [{ storyId, shipped, verdict, skipped }] }`.

+### Per-agent model tiers (`models`)
+
+Each workflow assigns a model tier per spawned agent — **gates** (READINESS/CRITIC/JUDGE) → `opus`,
+**spec + build** → `sonnet`, **CLI-runner / IO** steps (resolve-graph, sprint-pack, validate-sprint,
+calibrate, embed, persist) → `haiku`, and the retro's loop-until-dry review (`RETRO-REVIEW`) → `opus`.
+This assignment is baked into each script as a default and is **overridable** via the `models` arg —
+the `pipeline-config.yaml` `models` tier→roles map (`{ opus:[...], sonnet:[...], haiku:[...] }`), which
+the invoking skills pass through. Edit it with `/valent-configure` → "Model Assignments". A Workflow
+script can't read files, so the config arrives via `args`, never a direct read. The same `models`
+config also drives the prose-Lead pipeline (`providers/claude-code/runtime.md`), so the two paths stay
+in sync. Selection is static (default + args only) → journal-replay safe. Omit `models` to use the
+baked-in default; an agent with no tier mapping inherits the session model.
+
 ## Resume & state model (step 8)

 **The journal is the state of record.** Each Workflow invocation returns a `runId`. To resume
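To make the override semantics in the "Per-agent model tiers" section concrete, here is a minimal sketch of how a tier→roles `models` map could be overlaid on a baked-in default. It mirrors the `buildModelMap` / `modelFor` helpers that this release adds to the workflow scripts; the default table is abbreviated and the `overrides` values are made up for illustration.

```javascript
// Abbreviated default (role -> tier); the real scripts define more roles.
const DEFAULT_MODELS = { READINESS: 'opus', REQS: 'sonnet', PACK: 'haiku' }

// Invert a tier -> roles config and overlay it on the default.
function buildModelMap(cfg) {
  const map = { ...DEFAULT_MODELS }
  if (cfg && typeof cfg === 'object' && !Array.isArray(cfg)) {
    for (const tier of ['opus', 'sonnet', 'haiku']) {
      for (const role of cfg[tier] || []) {
        if (typeof role === 'string') map[role.toUpperCase()] = tier
      }
    }
  }
  return map
}

// Hypothetical override, shaped like the `models` block in pipeline-config.yaml.
const overrides = { opus: ['reqs'], haiku: ['readiness'] }
const MODELS = buildModelMap(overrides)
const modelFor = (role) => MODELS[String(role).toUpperCase()]

console.log(modelFor('REQS'))      // opus (overridden)
console.log(modelFor('readiness')) // haiku (overridden, lookup is case-insensitive)
console.log(modelFor('PACK'))      // haiku (default kept)
console.log(modelFor('UNKNOWN'))   // undefined, so the agent inherits the session model
```

Because the map is built only from the static default and `args`, replaying the same invocation yields the same tier for every agent.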
@@ -14,14 +14,17 @@
  * one git branch and must be sequential — see sprint.workflow.js.)
  *
  * The deterministic packing/validation (greedy bin-packing, consistency cross-checks) is NOT
- * done in this script — it lives in `valent-pipeline sprint-pack` / `validate-sprint`
+ * done in this script — it lives in `node .valent-pipeline/bin/cli.js sprint-pack` / `validate-sprint`
  * (src/lib/sprint.js), invoked through an agent because a Workflow script has no CLI/fs
  * access. Both runtimes reuse those CLIs; this workflow just sequences the agents.
  *
  * The return value is shaped to feed straight into sprint.workflow.js:
  * { sprintId, points_planned, stories: [{ storyId, projectType, profiles }] }
  *
- * args: { stories: [{ storyId, projectType }], sprintId, velocity, backlogPath?, maxRejectionCycles? }
+ * args: { stories: [{ storyId, projectType }], sprintId, velocity, backlogPath?, maxRejectionCycles?, models? }
+ * `models` is the pipeline-config.yaml `models` tier->roles map, passed through by the invoking
+ * skill so per-agent model tiers stay config-driven (editable via `valent configure`). Omit it to
+ * use the baked-in default. See sprint.workflow.js for the full rationale.
  */

 export const meta = {
@@ -31,8 +34,8 @@ export const meta = {
     { title: 'Groom', detail: 'reqs -> uxa? -> qa-a -> readiness gate, pipelined across the batch' },
     { title: 'Size', detail: 'profile-matched estimators per story, summed (parallel)' },
     { title: 'Persist', detail: 'write story_points + groomed status to the backlog' },
-    { title: 'Pack', detail: 'valent-pipeline sprint-pack (greedy bin-packing, in code)' },
-    { title: 'Validate', detail: 'write plan/status artifacts + valent-pipeline validate-sprint' },
+    { title: 'Pack', detail: 'node .valent-pipeline/bin/cli.js sprint-pack (greedy bin-packing, in code)' },
+    { title: 'Validate', detail: 'write plan/status artifacts + node .valent-pipeline/bin/cli.js validate-sprint' },
   ],
 }

@@ -116,7 +119,16 @@ const PROFILE_ESTIMATORS = {

 // --- args ---

-
+// args may arrive as a parsed object or as a JSON string, depending on how the invoking
+// skill/harness passes it. Normalize defensively so `a.stories` etc. resolve either way.
+function parseArgs(x) {
+  if (typeof x === 'string') {
+    try { return JSON.parse(x) } catch { return {} }
+  }
+  return x || {}
+}
+
+const a = parseArgs(args)
 const stories = Array.isArray(a.stories) ? a.stories : []
 const sprintId = a.sprintId
 const velocity = a.velocity
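The `parseArgs` helper in the hunk above accepts either a parsed object or a JSON string. One edge it does not appear to cover: a string such as `'null'` parses successfully to `null`, so a later property access like `a.stories` would throw. A slightly hardened variant might look like the sketch below; `normalizeArgs` and `isPlainObject` are hypothetical names for illustration, not part of the package.

```javascript
// Hypothetical helper: true only for non-null, non-array objects.
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v)

// Accept a parsed object or a JSON string; always return a plain object.
function normalizeArgs(x) {
  let value = x
  if (typeof x === 'string') {
    try { value = JSON.parse(x) } catch { value = {} }
  }
  return isPlainObject(value) ? value : {}
}

console.log(normalizeArgs('{"sprintId":"S1"}').sprintId) // S1
console.log(normalizeArgs({ velocity: 20 }).velocity)    // 20
console.log(normalizeArgs('null'))                       // {}
console.log(normalizeArgs('not json'))                   // {}
console.log(normalizeArgs(undefined))                    // {}
```

The existing validation (`!stories.length || !sprintId || ...`) then rejects the empty object with its usual error message instead of failing on a `TypeError`.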
@@ -126,6 +138,34 @@ if (!stories.length || !sprintId || typeof velocity !== 'number') {
   throw new Error('args must include { stories:[{storyId,projectType}], sprintId, velocity }')
 }

+// --- per-agent model tiers ----------------------------------------------------
+// Tiers come from pipeline-config.yaml `models` (a tier->roles map), passed in as
+// args.models by the invoking skill — a Workflow script can't read files. We invert it
+// to role->tier and overlay it on a baked-in default so the workflow self-hosts a sane
+// assignment even when args.models is absent. Static + args only => journal-replay safe.
+// readiness gate -> opus, spec/estimators -> sonnet, CLI-runners/IO -> haiku.
+const DEFAULT_MODELS = {
+  READINESS: 'opus',
+  REQS: 'sonnet', UXA: 'sonnet', 'QA-A': 'sonnet',
+  BEND: 'sonnet', FEND: 'sonnet', DATA: 'sonnet', 'MCP-DEV': 'sonnet',
+  LIBDEV: 'sonnet', DOCGEN: 'sonnet', IAC: 'sonnet', MOBILE: 'sonnet',
+  PERSIST: 'haiku', PACK: 'haiku', VALIDATE: 'haiku',
+}
+function buildModelMap(cfg) {
+  const map = { ...DEFAULT_MODELS }
+  if (cfg && typeof cfg === 'object' && !Array.isArray(cfg)) {
+    for (const tier of ['opus', 'sonnet', 'haiku']) {
+      for (const role of cfg[tier] || []) {
+        if (typeof role === 'string') map[role.toUpperCase()] = tier
+      }
+    }
+  }
+  return map
+}
+const MODELS = buildModelMap(a.models)
+// undefined => the agent inherits the main-loop (session) model.
+const modelFor = (role) => MODELS[String(role).toUpperCase()]
+
 function buildPrompt({ role, promptFile, storyId, taskSubject, trigger, returnContract }) {
   const outputDir = `stories/${storyId}/output`
   return [
@@ -161,7 +201,7 @@ const groomed = await pipeline(
 taskSubject: 'Tag testing_profiles for this story, then produce reqs-brief.md.',
 returnContract: 'Return ONLY { schema:1, agent:"reqs", story, testing_profiles:[...], files:[...] } as JSON.',
 }),
-{ label: `reqs:${story.storyId}`, phase: 'Groom', schema: REQS_GROOM_SCHEMA },
+{ label: `reqs:${story.storyId}`, phase: 'Groom', schema: REQS_GROOM_SCHEMA, model: modelFor('REQS') },
 )
 return { ...story, profiles: r.testing_profiles || [] }
 },
@@ -170,7 +210,7 @@ const groomed = await pipeline(
 if (g.profiles.includes('ui')) {
 await agent(
 buildPrompt({ role: 'UXA', promptFile: 'uxa.md', storyId: g.storyId, taskSubject: 'Translate the brief into uxa-spec.md.' }),
-{ label: `uxa:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA },
+{ label: `uxa:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA, model: modelFor('UXA') },
 )
 }
 return g
@@ -179,7 +219,7 @@ const groomed = await pipeline(
 async (g) => {
 await agent(
 buildPrompt({ role: 'QA-A', promptFile: 'qa-a.md', storyId: g.storyId, taskSubject: 'Produce qa-test-spec.md before any code is written.' }),
-{ label: `qa-a:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA },
+{ label: `qa-a:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA, model: modelFor('QA-A') },
 )
 return g
 },
@@ -192,7 +232,7 @@ const groomed = await pipeline(
 role: 'READINESS', promptFile: 'readiness.md', storyId: g.storyId,
 taskSubject: 'Validate the spec chain (reqs/uxa/qa) is implementation-ready; run cross-story checks (sprint mode).',
 }),
-{ label: `gate:readiness:${g.storyId}`, phase: 'Groom', schema: VERDICT_SCHEMA },
+{ label: `gate:readiness:${g.storyId}`, phase: 'Groom', schema: VERDICT_SCHEMA, model: modelFor('READINESS') },
 )
 if (v.verdict === 'pass') return { ...g, groomedStatus: 'groomed' }
 rejections += 1
@@ -204,7 +244,7 @@ const groomed = await pipeline(
 log(`${g.storyId}: readiness rejection ${rejections}/${maxRejectionCycles} -> ${target}`)
 await agent(
 buildPrompt({ role: target, promptFile: `${target.toLowerCase()}.md`, storyId: g.storyId, taskSubject: 'Address the READINESS rejection and rewrite the affected spec.' }),
-{ label: `rework:${target.toLowerCase()}:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA },
+{ label: `rework:${target.toLowerCase()}:${g.storyId}`, phase: 'Groom', schema: HANDOFF_SCHEMA, model: modelFor(target) },
 )
 }
 },
@@ -227,7 +267,7 @@ const sized = await parallel(
 taskSubject: 'Estimate this story (read your estimate.md step; apply calibration directives if present).',
 returnContract: 'Return ONLY { schema:1, agent, story, points:<int> } as JSON.',
 }),
-{ label: `estimate:${est.toLowerCase()}:${g.storyId}`, phase: 'Size', schema: ESTIMATE_SCHEMA },
+{ label: `estimate:${est.toLowerCase()}:${g.storyId}`, phase: 'Size', schema: ESTIMATE_SCHEMA, model: modelFor(est) },
 )),
 ).then((ests) => ({
 ...g,
@@ -244,15 +284,15 @@ await agent(
 `Update \`${backlogPath}\`: for each of these stories set \`story_points\` and \`status: groomed\`, ` +
 `and write \`testing_profiles\`. Stories (JSON): ${JSON.stringify(sizedStories.map((s) => ({ id: s.storyId, story_points: s.points, testing_profiles: s.profiles })))}. ` +
 `Return your \`valent:handoff\` machine block fields as JSON.`,
-{ label: 'persist-sizing', phase: 'Persist', schema: HANDOFF_SCHEMA },
+{ label: 'persist-sizing', phase: 'Persist', schema: HANDOFF_SCHEMA, model: modelFor('PERSIST') },
 )

 phase('Pack')
 // Deterministic greedy packing happens in code (src/lib/sprint.js), invoked via the CLI.
 const pack = await agent(
-`Run exactly: \`valent-pipeline sprint-pack --velocity ${velocity} --backlog ${backlogPath}\` ` +
+`Run exactly: \`node .valent-pipeline/bin/cli.js sprint-pack --velocity ${velocity} --backlog ${backlogPath}\` ` +
 `in the project root and return its stdout JSON verbatim (fields: sprint_stories, buffer_story_ids, points_planned, remaining_capacity).`,
-{ label: 'sprint-pack', phase: 'Pack', schema: PACK_SCHEMA },
+{ label: 'sprint-pack', phase: 'Pack', schema: PACK_SCHEMA, model: modelFor('PACK') },
 )
 log(`packed ${pack.sprint_stories.length} stories (${pack.points_planned} pts); buffer: ${pack.buffer_story_ids.length}`)

@@ -262,9 +302,9 @@ const validation = await agent(
 `For sprint ${sprintId}: (1) write \`sprint-${sprintId}-plan.md\` from \`.valent-pipeline/templates/sprint-plan.template.md\` ` +
 `and \`sprint-${sprintId}-status.yaml\` from the status template for the packed stories ${JSON.stringify(pack.sprint_stories)}; ` +
 `(2) tag those stories \`sprint: ${sprintId}\` + \`status: sprint-planned\` in \`${backlogPath}\`; ` +
-`(3) run \`valent-pipeline validate-sprint --status sprint-${sprintId}-status.yaml --backlog ${backlogPath}\` and ` +
+`(3) run \`node .valent-pipeline/bin/cli.js validate-sprint --status sprint-${sprintId}-status.yaml --backlog ${backlogPath}\` and ` +
 `return its result as JSON { valid:boolean, errors:[...] } (errors = the lines it printed on failure, else []).`,
-{ label: 'validate-sprint', phase: 'Validate', schema: VALIDATE_SCHEMA },
+{ label: 'validate-sprint', phase: 'Validate', schema: VALIDATE_SCHEMA, model: modelFor('VALIDATE') },
 )
 if (!validation.valid) {
 throw new Error(`sprint ${sprintId} plan failed validation: ${(validation.errors || []).join('; ')}`)
@@ -16,25 +16,28 @@
  * guard) -> embed (CLI).
  *
  * The deterministic pieces are NOT in this script: calibration arithmetic is
- * `valent-pipeline calibrate` (src/lib/sprint.js); embedding is `valent-pipeline db embed`.
+ * `node .valent-pipeline/bin/cli.js calibrate` (src/lib/sprint.js); embedding is `node .valent-pipeline/bin/cli.js db embed`.
  * Both run through agents (a Workflow script has no CLI/fs access). The directive IMPACT
  * GATING and INVARIANT GUARD are deterministic policy, so they are enforced HERE in code —
  * the agent only proposes; the script decides what gets applied vs. surfaced for approval.
  *
- * args: { batchNumber, sprintId?, storyOutputDirs?: string[], dryRounds?: number, maxRounds?: number }
+ * args: { batchNumber, sprintId?, storyOutputDirs?: string[], dryRounds?: number, maxRounds?: number, models? }
  * sprintId present => sprint-mode (calibration runs). dryRounds = consecutive empty rounds
- * that end the loop-until-dry (default 2). maxRounds caps it (default 5).
+ * that end the loop-until-dry (default 2). maxRounds caps it (default 5). `models` is the
+ * pipeline-config.yaml `models` tier->roles map, passed through by the invoking skill so
+ * per-agent model tiers stay config-driven (editable via `valent configure`). Omit it to use
+ * the baked-in default. See sprint.workflow.js for the full rationale.
  */

 export const meta = {
   name: 'valent-retro',
   description: 'Retrospective: calibrate, loop-until-dry aggregate review, gated directives, embed (Workflow)',
   phases: [
-    { title: 'Calibrate', detail: 'valent-pipeline calibrate (estimation accuracy, in code) — sprint mode' },
+    { title: 'Calibrate', detail: 'node .valent-pipeline/bin/cli.js calibrate (estimation accuracy, in code) — sprint mode' },
     { title: 'Analyze', detail: 'CRITIC/QA/JUDGE batch outputs + cost' },
     { title: 'Aggregate', detail: 'loop-until-dry 3-pass aggregate review + completeness critic (R5)' },
     { title: 'Directives', detail: 'agent proposes; code enforces impact gating + invariant guard' },
-    { title: 'Embed', detail: 'valent-pipeline db embed (persist curated patterns)' },
+    { title: 'Embed', detail: 'node .valent-pipeline/bin/cli.js db embed (persist curated patterns)' },
   ],
 }

@@ -109,13 +112,51 @@ const HANDOFF_SCHEMA = {

 // --- args ---

-
+// args may arrive as a parsed object or as a JSON string, depending on how the invoking
+// skill/harness passes it. Normalize defensively so `a.batchNumber` etc. resolve either way.
+function parseArgs(x) {
+  if (typeof x === 'string') {
+    try { return JSON.parse(x) } catch { return {} }
+  }
+  return x || {}
+}
+
+const a = parseArgs(args)
 const batchNumber = a.batchNumber
 const sprintId = a.sprintId || null
 const dryRounds = a.dryRounds ?? 2
 const maxRounds = a.maxRounds ?? 5
 if (batchNumber == null) throw new Error('args must include { batchNumber }')

+// --- per-agent model tiers ----------------------------------------------------
+// Tiers come from pipeline-config.yaml `models` (a tier->roles map), passed in as
+// args.models by the invoking skill — a Workflow script can't read files. We invert it
+// to role->tier and overlay it on a baked-in default so the workflow self-hosts a sane
+// assignment even when args.models is absent. Static + args only => journal-replay safe.
+// Retro stages map to synthetic role keys (not the single RETROSPECTIVE persona) so each
+// stage can be tuned independently: the loop-until-dry aggregate review + completeness
+// critic are the genuine quality work (RETRO-REVIEW -> opus); analyze/directives are
+// lighter (RETRO -> sonnet); calibrate/embed/IO are mechanical (haiku).
+const DEFAULT_MODELS = {
+  'RETRO-REVIEW': 'opus',
+  RETRO: 'sonnet',
+  CALIBRATE: 'haiku', EMBED: 'haiku', PERSIST: 'haiku',
+}
+function buildModelMap(cfg) {
+  const map = { ...DEFAULT_MODELS }
+  if (cfg && typeof cfg === 'object' && !Array.isArray(cfg)) {
+    for (const tier of ['opus', 'sonnet', 'haiku']) {
+      for (const role of cfg[tier] || []) {
+        if (typeof role === 'string') map[role.toUpperCase()] = tier
+      }
+    }
+  }
+  return map
+}
+const MODELS = buildModelMap(a.models)
+// undefined => the agent inherits the main-loop (session) model.
+const modelFor = (role) => MODELS[String(role).toUpperCase()]
+
 const retroPrompt = (instruction, returnContract) =>
   `You are **RETROSPECTIVE**, analyzing story batch ${batchNumber} in the valent-pipeline. ` +
   `Read \`.valent-pipeline/prompts/retrospective.md\` and the step file named in the task. ${instruction} ` +
@@ -131,9 +172,9 @@ if (sprintId) {
 phase('Calibrate')
 // Estimation-accuracy arithmetic lives in code (src/lib/sprint.js); run it via the CLI.
 calibration = await agent(
-`Run exactly: \`valent-pipeline calibrate --sprint ${sprintId}\` in the project root and return its stdout JSON verbatim ` +
+`Run exactly: \`node .valent-pipeline/bin/cli.js calibrate --sprint ${sprintId}\` in the project root and return its stdout JSON verbatim ` +
 `(fields: ratios, flagged_pairs, surface_averages, velocity). This feeds calibration directives.`,
-{ label: 'calibrate', phase: 'Calibrate', schema: { type: 'object', additionalProperties: true } },
+{ label: 'calibrate', phase: 'Calibrate', schema: { type: 'object', additionalProperties: true }, model: modelFor('CALIBRATE') },
 )
 log(`calibration: ${(calibration.flagged_pairs || []).length} flagged pair(s); velocity unstable=${calibration.velocity?.unstable}`)
 }
@@ -144,7 +185,7 @@ await agent(
 'Run analyze.md: read all CRITIC reviews, QA-B bug reports, JUDGE rejections, and cost data; categorize rejection/bug patterns.',
 'Return ONLY { schema:1, findings:[{id,summary,severity,stories}] } as JSON.',
 ),
-{ label: 'analyze', phase: 'Analyze', schema: FINDINGS_SCHEMA },
+{ label: 'analyze', phase: 'Analyze', schema: FINDINGS_SCHEMA, model: modelFor('RETRO') },
 )

 phase('Aggregate')
@@ -164,7 +205,7 @@ while (dry < dryRounds && round < maxRounds) {
 `Report ONLY findings not already reported in earlier rounds.`,
 'Return ONLY { schema:1, findings:[{id,summary,severity,stories}] } as JSON.',
 ),
-{ label: `aggregate:round-${round}`, phase: 'Aggregate', schema: FINDINGS_SCHEMA },
+{ label: `aggregate:round-${round}`, phase: 'Aggregate', schema: FINDINGS_SCHEMA, model: modelFor('RETRO-REVIEW') },
 )
 const fresh = (r.findings || []).filter((f) => !seen.has(findingKey(f)))
 if (!fresh.length) {
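The hunk above shows only a slice of the loop-until-dry review. As a rough illustration of the control flow it implies (stop after `dryRounds` consecutive rounds with no new findings, capped at `maxRounds`), here is a self-contained sketch. The reviewer is a synchronous stub standing in for the `agent()` call, `findingKey` is a hypothetical key function (the real one is not shown in the diff), and resetting the dry counter on fresh findings is an assumption based on the "consecutive empty rounds" wording in the file header.

```javascript
// Stubbed reviewer: each call returns the next round of findings.
function makeReviewer(rounds) {
  let i = 0
  return () => rounds[i++] || []
}

// Hypothetical dedup key; the real findingKey is not visible in this diff.
const findingKey = (f) => f.id

function loopUntilDry(review, { dryRounds = 2, maxRounds = 5 } = {}) {
  const seen = new Set()
  const findings = []
  let dry = 0
  let round = 0
  while (dry < dryRounds && round < maxRounds) {
    round += 1
    const fresh = review().filter((f) => !seen.has(findingKey(f)))
    if (!fresh.length) {
      dry += 1 // another consecutive round with nothing new
      continue
    }
    dry = 0 // new findings reset the dry counter (assumed)
    for (const f of fresh) {
      seen.add(findingKey(f))
      findings.push(f)
    }
  }
  return { findings, rounds: round }
}

const review = makeReviewer([[{ id: 'A' }], [{ id: 'A' }, { id: 'B' }], [], [{ id: 'B' }]])
const result = loopUntilDry(review)
console.log(result.findings.map((f) => f.id), result.rounds) // findings A and B after 4 rounds
```

Rounds three and four contribute nothing new, so the loop ends after two dry rounds instead of running to the `maxRounds` cap.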
@@ -187,7 +228,7 @@ const critic = await agent(
 `List only genuine gaps — empty if coverage is complete.`,
 'Return ONLY { schema:1, gaps:["..."] } as JSON.',
 ),
-{ label: 'completeness-critic', phase: 'Aggregate', schema: COMPLETENESS_SCHEMA },
+{ label: 'completeness-critic', phase: 'Aggregate', schema: COMPLETENESS_SCHEMA, model: modelFor('RETRO-REVIEW') },
 )
 if ((critic.gaps || []).length) {
 log(`completeness-critic surfaced ${critic.gaps.length} gap(s) — running targeted reviews`)
@@ -196,7 +237,7 @@ if ((critic.gaps || []).length) {
 agent(
 retroPrompt(`Targeted aggregate review for the previously-uncovered angle: "${gap}". Report only findings not already reported.`,
 'Return ONLY { schema:1, findings:[{id,summary,severity,stories}] } as JSON.'),
-{ label: `aggregate:gap-${i + 1}`, phase: 'Aggregate', schema: FINDINGS_SCHEMA },
+{ label: `aggregate:gap-${i + 1}`, phase: 'Aggregate', schema: FINDINGS_SCHEMA, model: modelFor('RETRO-REVIEW') },
 )),
 )
 for (const r of extra.filter(Boolean)) {
@@ -222,7 +263,7 @@ const drafted = await agent(
 `propose it and flag it; the orchestrator decides what gets applied.`,
 'Return ONLY { schema:1, directives:[{target_agent,directive,reason,impact_level,touchesInvariant,category}] } as JSON.',
 ),
-{ label: 'draft-directives', phase: 'Directives', schema: DIRECTIVES_SCHEMA },
+{ label: 'draft-directives', phase: 'Directives', schema: DIRECTIVES_SCHEMA, model: modelFor('RETRO') },
 )

 const all = drafted.directives || []
@@ -239,7 +280,7 @@ if (applied.length) {
 `Append these APPROVED correction directives to \`correction-directives.yaml\` (status: active, created_batch: ${batchNumber}). ` +
 `They have passed the impact gate (low/medium only). Directives (JSON): ${JSON.stringify(applied)}. ` +
 `Return { schema:1 } when done.`,
-{ label: 'apply-directives', phase: 'Directives', schema: HANDOFF_SCHEMA },
+{ label: 'apply-directives', phase: 'Directives', schema: HANDOFF_SCHEMA, model: modelFor('PERSIST') },
 )
 }
 if (proposals.length) {
@@ -248,7 +289,7 @@ if (proposals.length) {
 `Write these directive PROPOSALS to \`retrospective-batch-${batchNumber}.md\` under "## Pending Approval" — do NOT add them to ` +
 `correction-directives.yaml. For each, document the proposed directive, why it needs approval (architecture-conflict or high-impact), ` +
 `evidence, risk, and an alternative. Proposals (JSON): ${JSON.stringify(proposals)}. Return { schema:1 } when done.`,
-{ label: 'surface-proposals', phase: 'Directives', schema: HANDOFF_SCHEMA },
+{ label: 'surface-proposals', phase: 'Directives', schema: HANDOFF_SCHEMA, model: modelFor('PERSIST') },
 )
 }

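The code that splits drafted directives into `applied` and `proposals` falls between the hunks shown here, so it is not visible in this diff. Going only by the prompt strings ("passed the impact gate (low/medium only)", "architecture-conflict or high-impact"), the gate could plausibly be sketched as below. The rule itself, the `gateDirectives` name, and the sample data are assumptions for illustration; the field names `impact_level` and `touchesInvariant` come from the directive return contract.

```javascript
// Assumed rule: low/medium-impact directives that do not touch an invariant are
// applied automatically; everything else is surfaced for human approval.
const AUTO_APPLY_LEVELS = new Set(['low', 'medium'])

function gateDirectives(all) {
  const applied = []
  const proposals = []
  for (const d of all) {
    const autoApply = AUTO_APPLY_LEVELS.has(d.impact_level) && !d.touchesInvariant
    if (autoApply) applied.push(d)
    else proposals.push(d)
  }
  return { applied, proposals }
}

// Made-up directives in the shape of the draft-directives return contract.
const drafted = [
  { target_agent: 'QA-A', directive: 'd1', impact_level: 'low', touchesInvariant: false },
  { target_agent: 'BEND', directive: 'd2', impact_level: 'high', touchesInvariant: false },
  { target_agent: 'FEND', directive: 'd3', impact_level: 'medium', touchesInvariant: true },
]
const { applied, proposals } = gateDirectives(drafted)
console.log(applied.length, proposals.length) // 1 2
```

Keeping this decision in script code rather than in the agent prompt is what the file header means by "the agent only proposes; the script decides".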
@@ -257,8 +298,8 @@ phase('Embed')
 const embed = await agent(
 `Run embed-instructions.md: write \`embed-instructions.md\` (curated recurring patterns / novel decisions / bug patterns / ` +
 `broadly-applicable directives only — NOT one-offs) in the most recent story output dir, then run ` +
-`\`valent-pipeline db embed --file <that path>\`. Return { schema:1, embedded:<int count> }.`,
-{ label: 'embed', phase: 'Embed', schema: { type: 'object', additionalProperties: true } },
+`\`node .valent-pipeline/bin/cli.js db embed --file <that path>\`. Return { schema:1, embedded:<int count> }.`,
+{ label: 'embed', phase: 'Embed', schema: { type: 'object', additionalProperties: true }, model: modelFor('EMBED') },
 )

 return {