valent-pipeline 0.2.20 → 0.2.22

Files changed (115)
  1. package/README.md +438 -0
  2. package/package.json +1 -1
  3. package/pipeline/agents-manifest.yaml +61 -1
  4. package/pipeline/docs/agent-reference.md +82 -23
  5. package/pipeline/docs/design/refactor-checklist.md +111 -0
  6. package/pipeline/docs/index.md +60 -0
  7. package/pipeline/docs/pipeline-overview.md +4 -0
  8. package/pipeline/docs/prd-completion-audit-design.md +132 -0
  9. package/pipeline/prompts/bend.md +5 -11
  10. package/pipeline/prompts/critic.md +9 -0
  11. package/pipeline/prompts/data.md +59 -0
  12. package/pipeline/prompts/docgen.md +61 -0
  13. package/pipeline/prompts/fend.md +3 -10
  14. package/pipeline/prompts/iac.md +70 -0
  15. package/pipeline/prompts/lead.md +81 -3
  16. package/pipeline/prompts/libdev.md +61 -0
  17. package/pipeline/prompts/mcp-dev.md +59 -0
  18. package/pipeline/prompts/mobile.md +92 -0
  19. package/pipeline/prompts/qa-a.md +1 -1
  20. package/pipeline/prompts/qa-b.md +1 -1
  21. package/pipeline/prompts/reqs.md +5 -1
  22. package/pipeline/scripts/db-bootstrap.ts +1 -1
  23. package/pipeline/scripts/embed-sqlite.ts +5 -0
  24. package/pipeline/steps/common/quality-standards.md +19 -0
  25. package/pipeline/steps/critic/data-pipeline.md +28 -0
  26. package/pipeline/steps/critic/document-generation.md +21 -0
  27. package/pipeline/steps/critic/iac.md +29 -0
  28. package/pipeline/steps/critic/library.md +24 -0
  29. package/pipeline/steps/critic/mcp-server.md +24 -0
  30. package/pipeline/steps/critic/mobile-app.md +29 -0
  31. package/pipeline/steps/data/estimate.md +51 -0
  32. package/pipeline/steps/data/handoff.md +9 -0
  33. package/pipeline/steps/data/implement.md +16 -0
  34. package/pipeline/steps/data/read-inputs.md +13 -0
  35. package/pipeline/steps/data/write-tests.md +13 -0
  36. package/pipeline/steps/docgen/estimate.md +49 -0
  37. package/pipeline/steps/docgen/handoff.md +9 -0
  38. package/pipeline/steps/docgen/implement.md +19 -0
  39. package/pipeline/steps/docgen/read-inputs.md +13 -0
  40. package/pipeline/steps/docgen/write-tests.md +15 -0
  41. package/pipeline/steps/iac/estimate.md +50 -0
  42. package/pipeline/steps/iac/handoff.md +9 -0
  43. package/pipeline/steps/iac/implement.md +19 -0
  44. package/pipeline/steps/iac/read-inputs.md +13 -0
  45. package/pipeline/steps/iac/write-tests.md +20 -0
  46. package/pipeline/steps/judge/ship-decision.md +14 -1
  47. package/pipeline/steps/libdev/estimate.md +49 -0
  48. package/pipeline/steps/libdev/handoff.md +9 -0
  49. package/pipeline/steps/libdev/implement.md +19 -0
  50. package/pipeline/steps/libdev/read-inputs.md +13 -0
  51. package/pipeline/steps/libdev/write-tests.md +16 -0
  52. package/pipeline/steps/mcp-dev/estimate.md +49 -0
  53. package/pipeline/steps/mcp-dev/handoff.md +9 -0
  54. package/pipeline/steps/mcp-dev/implement.md +29 -0
  55. package/pipeline/steps/mcp-dev/read-inputs.md +13 -0
  56. package/pipeline/steps/mcp-dev/write-tests.md +19 -0
  57. package/pipeline/steps/mobile/emulator-lifecycle.md +67 -0
  58. package/pipeline/steps/mobile/estimate.md +51 -0
  59. package/pipeline/steps/mobile/flutter.md +30 -0
  60. package/pipeline/steps/mobile/handoff.md +18 -0
  61. package/pipeline/steps/mobile/implement.md +20 -0
  62. package/pipeline/steps/mobile/react-native.md +32 -0
  63. package/pipeline/steps/mobile/read-inputs.md +10 -0
  64. package/pipeline/steps/mobile/write-tests.md +59 -0
  65. package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +1 -1
  66. package/pipeline/steps/orchestration/sprint-execute.md +22 -0
  67. package/pipeline/steps/orchestration/sprint-groom.md +4 -0
  68. package/pipeline/steps/orchestration/sprint-init.md +5 -2
  69. package/pipeline/steps/orchestration/sprint-plan.md +9 -3
  70. package/pipeline/steps/orchestration/sprint-review.md +4 -3
  71. package/pipeline/steps/orchestration/sprint-size.md +19 -12
  72. package/pipeline/steps/orchestration/validate-story-inputs.md +9 -0
  73. package/pipeline/steps/qa-a/data-pipeline.md +32 -0
  74. package/pipeline/steps/qa-a/document-generation.md +52 -0
  75. package/pipeline/steps/qa-a/iac.md +30 -0
  76. package/pipeline/steps/qa-a/library.md +42 -0
  77. package/pipeline/steps/qa-a/mcp-server.md +31 -0
  78. package/pipeline/steps/qa-a/mobile-app.md +59 -0
  79. package/pipeline/steps/qa-b/data-pipeline.md +48 -0
  80. package/pipeline/steps/qa-b/document-generation.md +47 -0
  81. package/pipeline/steps/qa-b/iac.md +44 -0
  82. package/pipeline/steps/qa-b/library.md +61 -0
  83. package/pipeline/steps/qa-b/mcp-server.md +40 -0
  84. package/pipeline/steps/qa-b/mobile-app.md +71 -0
  85. package/pipeline/steps/readiness/standalone-review.md +7 -2
  86. package/pipeline/steps/reqs/data-pipeline.md +56 -0
  87. package/pipeline/steps/reqs/document-generation.md +55 -0
  88. package/pipeline/steps/reqs/draft-brief.md +10 -0
  89. package/pipeline/steps/reqs/iac.md +63 -0
  90. package/pipeline/steps/reqs/library.md +56 -0
  91. package/pipeline/steps/reqs/mcp-server.md +48 -0
  92. package/pipeline/steps/reqs/mobile-app.md +54 -0
  93. package/pipeline/steps/reqs/self-review.md +5 -3
  94. package/pipeline/task-graphs/backend-api.yaml +19 -2
  95. package/pipeline/task-graphs/data-pipeline.yaml +29 -12
  96. package/pipeline/task-graphs/document-generation.yaml +29 -12
  97. package/pipeline/task-graphs/frontend-only.yaml +19 -2
  98. package/pipeline/task-graphs/fullstack-web.yaml +19 -2
  99. package/pipeline/task-graphs/library.yaml +29 -12
  100. package/pipeline/task-graphs/mcp-server.yaml +29 -12
  101. package/pipeline/task-graphs/mobile-app.yaml +171 -0
  102. package/pipeline/templates/bugs.template.md +1 -1
  103. package/pipeline/templates/critic-review.template.md +1 -1
  104. package/pipeline/templates/data-handoff.template.md +96 -0
  105. package/pipeline/templates/docgen-handoff.template.md +83 -0
  106. package/pipeline/templates/iac-handoff.template.md +83 -0
  107. package/pipeline/templates/judge-decision.template.md +11 -1
  108. package/pipeline/templates/libdev-handoff.template.md +82 -0
  109. package/pipeline/templates/mcp-dev-handoff.template.md +87 -0
  110. package/pipeline/templates/mobile-handoff.template.md +122 -0
  111. package/pipeline/templates/reqs-brief.template.md +60 -4
  112. package/skills/valent-run-deferred-tests/SKILL.md +109 -0
  113. package/src/commands/db-rebuild.js +5 -0
  114. package/src/lib/config-schema.js +1 -1
  115. package/src/lib/db.js +1 -1
package/README.md ADDED
@@ -0,0 +1,438 @@
1
+ # valent-pipeline
2
+
3
+ A multi-agent AI pipeline that takes user stories and ships tested, reviewed, committed code. Built on Claude Code agent teams.
4
+
5
+ You write the story. The pipeline handles requirements analysis, UX specification, test planning, implementation, adversarial code review, test execution, and a final evidence-based ship decision -- producing a full artifact trail for every story.
6
+
7
+ ## Quick Start
8
+
9
+ ```bash
10
+ # Install globally
11
+ npm install -g valent-pipeline
12
+
13
+ # Initialize in your project
14
+ cd your-project
15
+ valent-pipeline init
16
+
17
+ # Run the interactive configuration wizard
18
+ /valent-configure
19
+
20
+ # Execute a story
21
+ /valent-run-story STORY-001
22
+ ```
23
+
24
+ ## How It Works
25
+
26
+ A persistent **Lead** agent reads your story, assembles a team of specialist agents, and orchestrates them through a dependency-driven pipeline:
27
+
28
+ ```
29
+ REQS -> UXA -> QA-A -> READINESS -> BEND + FEND -> CRITIC -> QA-B -> JUDGE -> SHIP
30
+ ```
31
+
32
+ 1. **REQS** translates acceptance criteria into an implementation brief
33
+ 2. **UXA** converts UX specs into component specifications (frontend projects)
34
+ 3. **QA-A** writes behavioral test specifications *before any code exists*
35
+ 4. **READINESS** gate validates the spec chain -- stops on first failure
36
+ 5. **BEND + FEND** implement production code and tests in parallel
37
+ 6. **CRITIC** runs a 3-pass adversarial code review (blind hunt, edge cases, acceptance audit)
38
+ 7. **QA-B** executes tests against real infrastructure, files bugs, builds traceability matrix
39
+ 8. **JUDGE** makes an evidence-based SHIP or REJECT decision
40
+ 9. **Lead** commits code, writes the story report, and picks the next story
41
+
42
+ Two quality gates (**READINESS** and **JUDGE**) enforce pass/fail checkpoints. Rejection loops send work back to the responsible agent with specific corrections, with a circuit breaker to prevent infinite cycles.
43
+
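The dependency-driven flow described above can be sketched as a small scheduler that releases a task only once everything it depends on has finished. This is a minimal illustration, not the package's implementation: the `TaskGraph` shape, the `fullstackWeb` object, and `executionWaves` are hypothetical names, and the real task-graph YAML schema is not shown in this diff.

```typescript
// Hypothetical in-memory task graph: task name -> tasks it depends on.
// The real task-graph YAML files are not reproduced here.
type TaskGraph = Record<string, string[]>;

const fullstackWeb: TaskGraph = {
  REQS: [],
  UXA: ["REQS"],
  "QA-A": ["UXA"],
  READINESS: ["QA-A"],
  BEND: ["READINESS"],
  FEND: ["READINESS"],
  CRITIC: ["BEND", "FEND"],
  "QA-B": ["CRITIC"],
  JUDGE: ["QA-B"],
};

// Group tasks into waves: each task in a wave has all of its
// dependencies completed in an earlier wave.
function executionWaves(graph: TaskGraph): string[][] {
  const tasks = Object.keys(graph);
  const done: Record<string, boolean> = {};
  const waves: string[][] = [];
  let remaining = tasks.length;
  while (remaining > 0) {
    const ready = tasks.filter(
      (task) => !done[task] && graph[task].every((dep) => done[dep] === true)
    );
    if (ready.length === 0) {
      throw new Error("cycle or unknown dependency in task graph");
    }
    for (const task of ready) done[task] = true;
    remaining -= ready.length;
    waves.push(ready);
  }
  return waves;
}

const waves = executionWaves(fullstackWeb);
// Prints: REQS -> UXA -> QA-A -> READINESS -> BEND + FEND -> CRITIC -> QA-B -> JUDGE
console.log(waves.map((wave) => wave.join(" + ")).join(" -> "));
```

BEND and FEND land in the same wave because both depend only on READINESS, which matches the "in parallel" step in the list above.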
44
+ ## Project Types
45
+
46
+ The pipeline supports 8 project types, each with a tailored task graph and specialized developer agent:
47
+
48
+ | Project Type | Developer Agent | Agents Skipped |
49
+ |---|---|---|
50
+ | `fullstack-web` | BEND + FEND | *(none)* |
51
+ | `backend-api` | BEND | UXA, FEND, PMCP |
52
+ | `frontend-only` | FEND | BEND |
53
+ | `data-pipeline` | DATA | UXA, FEND, PMCP |
54
+ | `mcp-server` | MCP-DEV | UXA, FEND, PMCP |
55
+ | `document-generation` | DOCGEN | UXA, FEND, PMCP |
56
+ | `library` | LIBDEV | UXA, FEND, PMCP |
57
+ | `mobile-app` | MOBILE | *(conditional)* |
58
+
59
+ The Lead selects which agents to spawn based on `project.type` in your `pipeline-config.yaml` and the story's `testing_profiles`.
60
+
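As a rough illustration of how the table above could drive roster selection, the sketch below encodes each row and filters a candidate agent list by project type. The names `ROSTER_RULES` and `filterRoster` are hypothetical, the role of `testing_profiles` is not modelled, and the `mobile-app` skip set is left undecided because the table only says "(conditional)".

```typescript
type ProjectType =
  | "fullstack-web"
  | "backend-api"
  | "frontend-only"
  | "data-pipeline"
  | "mcp-server"
  | "document-generation"
  | "library"
  | "mobile-app";

interface RosterRule {
  developers: string[];
  // null: the table lists the skip set as "(conditional)", so it stays undecided here.
  skipped: string[] | null;
}

// One entry per row of the Project Types table.
const ROSTER_RULES: Record<ProjectType, RosterRule> = {
  "fullstack-web": { developers: ["BEND", "FEND"], skipped: [] },
  "backend-api": { developers: ["BEND"], skipped: ["UXA", "FEND", "PMCP"] },
  "frontend-only": { developers: ["FEND"], skipped: ["BEND"] },
  "data-pipeline": { developers: ["DATA"], skipped: ["UXA", "FEND", "PMCP"] },
  "mcp-server": { developers: ["MCP-DEV"], skipped: ["UXA", "FEND", "PMCP"] },
  "document-generation": { developers: ["DOCGEN"], skipped: ["UXA", "FEND", "PMCP"] },
  library: { developers: ["LIBDEV"], skipped: ["UXA", "FEND", "PMCP"] },
  "mobile-app": { developers: ["MOBILE"], skipped: null },
};

// Remove the agents that the table marks as skipped for this project type.
function filterRoster(projectType: ProjectType, candidates: string[]): string[] {
  const skipped = ROSTER_RULES[projectType].skipped;
  if (skipped === null) return candidates.slice(); // undecided: keep every candidate
  return candidates.filter((agent) => skipped.indexOf(agent) === -1);
}

const specAgents = ["REQS", "UXA", "QA-A", "READINESS", "CRITIC", "QA-B", "JUDGE"];
const backendRoster = filterRoster("backend-api", specAgents).concat(
  ROSTER_RULES["backend-api"].developers
);
// Prints: REQS, QA-A, READINESS, CRITIC, QA-B, JUDGE, BEND
console.log(backendRoster.join(", "));
```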
61
+ ## Agent Roster
62
+
63
+ ### Per-Story Agents (10)
64
+
65
+ Spawned fresh per story, torn down after ship or cancel.
66
+
67
+ | Agent | Model | Role | Output |
68
+ |---|---|---|---|
69
+ | REQS | Sonnet | Requirements analyst | `reqs-brief.md` |
70
+ | UXA | Sonnet | UX specification | `uxa-spec.md` |
71
+ | QA-A | Sonnet | Test specification | `qa-test-spec.md`, `visual-validation-checklist.md` |
72
+ | READINESS | Sonnet | Spec quality gate | `readiness-review.md` |
73
+ | BEND | Sonnet | Backend developer | `bend-handoff.md` |
74
+ | FEND | Sonnet | Frontend developer | `fend-handoff.md` |
75
+ | CRITIC | Opus | Adversarial code reviewer | `critic-review.md` |
76
+ | QA-B | Sonnet | Test executor | `execution-report.md`, `bugs.md`, `traceability-matrix.md` |
77
+ | JUDGE | Sonnet | Final quality gate | `judge-review.md`, `judge-decision.md` |
78
+ | Knowledge | Haiku | Knowledge retrieval | *(inbox only)* |
79
+
80
+ ### Domain Developer Agents
81
+
82
+ Specialized agents that take over the developer role from BEND for non-API project types (IAC is cross-cutting and can join any project type):
83
+
84
+ | Agent | Model | Project Type | Output |
85
+ |---|---|---|---|
86
+ | DATA | Sonnet | `data-pipeline` | `data-handoff.md` |
87
+ | MCP-DEV | Sonnet | `mcp-server` | `mcp-dev-handoff.md` |
88
+ | LIBDEV | Sonnet | `library` | `libdev-handoff.md` |
89
+ | DOCGEN | Sonnet | `document-generation` | `docgen-handoff.md` |
90
+ | IAC | Sonnet | Cross-cutting (any type) | `iac-handoff.md` |
91
+ | MOBILE | Sonnet | `mobile-app` | `mobile-handoff.md` |
92
+
93
+ ### Persistent & Ephemeral Agents
94
+
95
+ | Agent | Model | Lifecycle | Trigger |
96
+ |---|---|---|---|
97
+ | Lead | Opus | Persistent across stories | Always running |
98
+ | PMCP | Sonnet | Ephemeral | QA-B requests visual validation |
99
+ | Embed | Haiku | Ephemeral | After Retrospective curates |
100
+ | Retrospective | Sonnet | Ephemeral | Every N stories (configurable) |
101
+ | Help | Haiku | Ephemeral | User request |
102
+
103
+ ## Installation
104
+
105
+ ### Prerequisites
106
+
107
+ - Node.js >= 18
108
+ - Claude Code CLI
109
+ - npm account (for publishing)
110
+
111
+ ### Install
112
+
113
+ ```bash
114
+ npm install -g valent-pipeline
115
+ ```
116
+
117
+ ### Initialize a Project
118
+
119
+ ```bash
120
+ cd your-project
121
+ valent-pipeline init
122
+ ```
123
+
124
+ The init command:
125
+ 1. Runs an interactive wizard to set project type, tech stack, and model assignments
126
+ 2. Copies pipeline infrastructure to `.valent-pipeline/`
127
+ 3. Generates `pipeline-config.yaml` from your answers
128
+ 4. Creates knowledge directories and initializes the backlog
129
+ 5. Installs Claude Code skills for story/epic/project execution
130
+
131
+ ### Upgrade
132
+
133
+ ```bash
134
+ valent-pipeline upgrade
135
+ valent-pipeline upgrade --dry-run # preview changes without applying
136
+ ```
137
+
138
+ Upgrades pipeline infrastructure (prompts, templates, task graphs, scripts) while preserving your project-specific files (config, knowledge, backlog).
139
+
140
+ ### Validate Configuration
141
+
142
+ ```bash
143
+ valent-pipeline config validate
144
+ ```
145
+
146
+ ## Configuration
147
+
148
+ All configuration lives in `.valent-pipeline/pipeline-config.yaml`. Run `/valent-configure` to edit interactively, or edit the file directly.
149
+
150
+ ### Key Sections
151
+
152
+ ```yaml
153
+ project:
154
+ type: fullstack-web # Project type (determines agent roster)
155
+ root: . # Project root directory
156
+ story_directory: ./stories # Where story inputs live
157
+ backlog_path: ./pipeline-backlog.yaml
158
+
159
+ tech_stack:
160
+ language: TypeScript
161
+ backend_framework: Express
162
+ frontend_framework: React
163
+ test_framework_unit: Vitest
164
+ test_framework_e2e: Playwright
165
+ browser_automation_mcp: playwright-mcp
166
+
167
+ models:
168
+ opus: [BEND, FEND, CRITIC] # Complex code generation, review
169
+ sonnet: [REQS, UXA, QA-A, ...] # Analysis, spec writing, judgment
170
+ haiku: [Knowledge, Embed, Help] # Retrieval, indexing, lookups
171
+
172
+ quality:
173
+ max_rejection_cycles: 5 # Circuit breaker for rejection loops
174
+ retrospective_every_n_stories: 5 # Retrospective trigger frequency
175
+ stall_threshold_minutes: 15 # Agent stall detection timeout
176
+
177
+ git:
178
+ target_branch: "" # Base branch for story branches
179
+ story_branch_prefix: story/ # Branch naming convention
180
+
181
+ knowledge:
182
+ mode: sqlite # none | sqlite | local-docker | connect-to-existing
183
+ sqlite_db_path: ./.valent-pipeline/pipeline.db
184
+
185
+ sprint: # Only used in epic/project mode
186
+ duration_minutes: 480
187
+ initial_velocity_points: 60
188
+ estimation_model: calibrated # calibrated | baseline
189
+ fibonacci_scale: [1, 2, 3, 5, 8, 13, 21]
190
+ ```
191
+
192
+ ## CLI Commands
193
+
194
+ ### Pipeline Management
195
+
196
+ | Command | Description |
197
+ |---|---|
198
+ | `valent-pipeline init` | Initialize pipeline in current project |
199
+ | `valent-pipeline upgrade` | Upgrade pipeline infrastructure |
200
+ | `valent-pipeline upgrade --dry-run` | Preview upgrade changes |
201
+ | `valent-pipeline config validate` | Validate pipeline-config.yaml |
202
+
203
+ ### Database Commands
204
+
205
+ | Command | Description |
206
+ |---|---|
207
+ | `valent-pipeline db init` | Initialize SQLite knowledge database |
208
+ | `valent-pipeline db rebuild` | Drop and recreate all tables |
209
+ | `valent-pipeline db index <story-dir>` | Index a story's artifacts |
210
+ | `valent-pipeline db query <text>` | Full-text search across artifacts |
211
+ | `valent-pipeline db embed <file>` | Generate and store embeddings |
212
+
213
+ ### Claude Code Skills
214
+
215
+ Invoked as slash commands inside Claude Code:
216
+
217
+ | Skill | Description |
218
+ |---|---|
219
+ | `/valent-configure` | Interactive configuration wizard |
220
+ | `/valent-run-story STORY-ID` | Execute a single story |
221
+ | `/valent-run-epic EPIC-ID` | Execute an epic with sprint planning |
222
+ | `/valent-run-project` | Execute a full project across all epics |
223
+ | `/valent-setup-backlog` | Convert epics/stories into pipeline backlog |
224
+ | `/valent-run-retrospective` | Trigger a standalone retrospective |
225
+ | `/valent-run-deferred-tests` | Run deferred iOS tests on Mac |
226
+ | `/valent-debug-export` | Export diagnostic dump |
227
+ | `/valent-help` | Pipeline documentation and FAQ |
228
+
229
+ ## Story Inputs
230
+
231
+ Create a story directory with at least a `story.md` file:
232
+
233
+ ```
234
+ stories/
235
+ STORY-001/
236
+ story.md # Required: user story + acceptance criteria
237
+ ux-spec.md # Optional: UX specification
238
+ trigger-map.md # Optional: interaction flows
239
+ scenarios.md # Optional: behavioral scenarios
240
+ architecture-notes.md # Optional: constraints and decisions
241
+ ```
242
+
243
+ The pipeline writes all output to `stories/STORY-001/output/`.
244
+
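A pre-flight check for the layout above might look like the following sketch. It works on a plain list of file names so it stays independent of the file system. `checkStoryInputs` and its result shape are hypothetical and are not taken from the package's `validate-story-inputs` step.

```typescript
// File names taken from the story directory layout shown above.
const REQUIRED_FILES = ["story.md"];
const OPTIONAL_FILES = [
  "ux-spec.md",
  "trigger-map.md",
  "scenarios.md",
  "architecture-notes.md",
];

interface StoryInputCheck {
  ok: boolean; // true when every required file is present
  missing: string[]; // required files that were not found
  optionalPresent: string[]; // optional inputs that were found
}

// Classify the file names found in a story directory.
function checkStoryInputs(fileNames: string[]): StoryInputCheck {
  const has = (file: string) => fileNames.indexOf(file) !== -1;
  const missing = REQUIRED_FILES.filter((file) => !has(file));
  const optionalPresent = OPTIONAL_FILES.filter((file) => has(file));
  return { ok: missing.length === 0, missing, optionalPresent };
}

const complete = checkStoryInputs(["story.md", "scenarios.md"]);
const incomplete = checkStoryInputs(["ux-spec.md"]);
console.log(complete.ok, complete.optionalPresent); // true [ 'scenarios.md' ]
console.log(incomplete.ok, incomplete.missing); // false [ 'story.md' ]
```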
245
+ ## Pipeline Output
246
+
247
+ For each story, the pipeline produces 15+ artifacts in `stories/{story-id}/output/`:
248
+
249
+ | Artifact | Agent | Purpose |
250
+ |---|---|---|
251
+ | `reqs-brief.md` | REQS | Implementation brief from ACs |
252
+ | `uxa-spec.md` | UXA | Component specs from UX spec |
253
+ | `qa-test-spec.md` | QA-A | Behavioral test specifications |
254
+ | `visual-validation-checklist.md` | QA-A | Browser automation checklist |
255
+ | `{dev}-handoff.md` | BEND/FEND/etc. | Implementation summary |
256
+ | `critic-review.md` | CRITIC | 3-pass code review findings |
257
+ | `execution-report.md` | QA-B | Test execution results |
258
+ | `bugs.md` | QA-B | Filed bugs with priorities |
259
+ | `traceability-matrix.md` | QA-B | AC-to-test coverage map |
260
+ | `readiness-review.md` | READINESS | Spec gate results |
261
+ | `judge-review.md` | JUDGE | Bug review findings |
262
+ | `judge-decision.md` | JUDGE | Ship/reject decision with evidence |
263
+ | `pmcp-evidence.md` | PMCP | Visual validation screenshots |
264
+ | `story-report.md` | Lead | Story completion summary |
265
+ | `decisions.md` | *(any)* | Design Council deliberation log |
266
+
267
+ Plus committed, tested production code in your project source tree.
268
+
269
+ ## Communication Model
270
+
271
+ All inter-agent communication follows the [Distilled Communication Standard](pipeline/docs/communication-standard.md):
272
+
273
+ - **Handoff documents** -- structured artifacts with YAML frontmatter, orchestrator summary, and facts-only content. Every handoff follows a [template skeleton](pipeline/docs/template-skeleton.md).
274
+ - **Inbox messages** -- terse coordination messages (~500 tokens max) with file pointers. Types include `[HANDOFF]`, `[BLOCKER]`, `[REVISION]`, `[CRITIC-REJECTION]`, `[BUG]`, `[DESIGN-COUNCIL]`, `[ESCALATION]`.
275
+ - **Design Council** -- structured deliberation protocol for contested design decisions with position statements, synthesis, and escalation to user if consensus fails.
276
+ - **Human Escalation** -- when agent deliberation is insufficient, the Lead surfaces the issue to the user with full context.
277
+
278
+ ## Knowledge System
279
+
280
+ The pipeline learns from its own output through a [knowledge system](pipeline/docs/knowledge-system.md) with three data sources:
281
+
282
+ | Source | Location | Purpose |
283
+ |---|---|---|
284
+ | Correction directives | `knowledge/correction-directives.yaml` | Behavioral changes for agents from past patterns |
285
+ | Curated knowledge | `knowledge/curated/` | Conventions, validated patterns, known pitfalls |
286
+ | SQLite / ChromaDB | `.valent-pipeline/pipeline.db` | Embedding-based retrieval (optional) |
287
+
288
+ The **Retrospective** agent (triggered every N stories) is the sole gatekeeper for what enters persistent knowledge. It analyzes batch outputs, writes correction directives, and produces indexing instructions for the **Embed** agent. The **Knowledge** agent reads all sources and responds to teammate queries during story execution.
289
+
290
+ ### Knowledge Modes
291
+
292
+ | Mode | Dependencies | Description |
293
+ |---|---|---|
294
+ | `none` | None | Curated files + correction directives only |
295
+ | `sqlite` | better-sqlite3 | Local SQLite with FTS5 and vector search |
296
+ | `local-docker` | Docker | ChromaDB via Docker Compose + curated files |
297
+ | `connect-to-existing` | Network | Remote ChromaDB instance + curated files |
298
+
299
+ ## Execution Modes
300
+
301
+ ### Single Story
302
+
303
+ ```
304
+ /valent-run-story STORY-001
305
+ ```
306
+
307
+ Executes one story through the full pipeline.
308
+
309
+ ### Epic (Sprint-Based)
310
+
311
+ ```
312
+ /valent-run-epic EPIC-001
313
+ ```
314
+
315
+ Runs an epic with sprint planning: grooms stories, estimates sizing using calibrated Fibonacci points, plans sprints, executes stories in priority order, and runs retrospectives between sprints.
316
+
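To make the sizing and planning step concrete, here is a minimal sketch that uses the `fibonacci_scale` and `initial_velocity_points` values from the configuration example. The rounding rule and the greedy fill are assumptions made for illustration; the package's calibrated estimation model is not shown in this diff, and `toStoryPoints` and `planSprint` are hypothetical names.

```typescript
// Values copied from the sprint section of the configuration example.
const FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 21];
const INITIAL_VELOCITY_POINTS = 60;

// Assumption: snap a raw estimate to the smallest scale value that
// covers it, capping at the largest value on the scale.
function toStoryPoints(rawEstimate: number): number {
  for (const points of FIBONACCI_SCALE) {
    if (rawEstimate <= points) return points;
  }
  return FIBONACCI_SCALE[FIBONACCI_SCALE.length - 1];
}

interface SizedStory {
  id: string;
  points: number;
}

// Assumption: take stories in priority order and stop at the first one
// that would push the sprint past the velocity budget.
function planSprint(backlog: SizedStory[], velocity: number): SizedStory[] {
  const sprint: SizedStory[] = [];
  let used = 0;
  for (const story of backlog) {
    if (used + story.points > velocity) break;
    sprint.push(story);
    used += story.points;
  }
  return sprint;
}

// Backlog listed in priority order; raw estimates are made-up numbers.
const backlog: SizedStory[] = [
  { id: "STORY-001", points: toStoryPoints(4) }, // 5
  { id: "STORY-002", points: toStoryPoints(20) }, // 21
  { id: "STORY-003", points: toStoryPoints(13) }, // 13
  { id: "STORY-004", points: toStoryPoints(30) }, // 21 (capped)
  { id: "STORY-005", points: toStoryPoints(2) }, // 2
];

const sprintPlan = planSprint(backlog, INITIAL_VELOCITY_POINTS);
console.log(sprintPlan.map((story) => story.id).join(", "));
```

With these numbers the first four stories total exactly 60 points, so STORY-005 waits for the next sprint.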
317
+ ### Full Project
318
+
319
+ ```
320
+ /valent-run-project
321
+ ```
322
+
323
+ Executes all epics in the backlog with cross-epic dependency resolution.
324
+
325
+ ### Backlog Setup
326
+
327
+ ```
328
+ /valent-setup-backlog
329
+ ```
330
+
331
+ Converts your epics and stories documents into a prioritized `pipeline-backlog.yaml` with vertical slice ordering and knowledge base initialization.
332
+
333
+ ## Quality Gates
334
+
335
+ ### READINESS Gate
336
+
337
+ Validates the spec chain before any code is written:
338
+ - REQS brief completeness and accuracy
339
+ - UXA spec consistency (frontend projects)
340
+ - QA test spec coverage and depth
341
+
342
+ Stops on first failure. The responsible upstream agent must rework before the pipeline proceeds.
343
+
344
+ ### JUDGE Gate
345
+
346
+ Makes the final ship decision based on evidence:
347
+ - Bug priority review (can reclassify P4 bugs to P1-P3)
348
+ - Test execution results verification
349
+ - Traceability matrix completeness
350
+ - PMCP visual evidence (UI projects)
351
+ - Applies "evidence over assertion" -- independently verifies every upstream claim
352
+
353
+ Verdicts: **SHIP** (commit and close), **SHIP-PARTIAL** (mobile: ship Android, defer iOS), **REJECT** (send back with corrections).
354
+
355
+ ### Rejection Loops
356
+
357
+ When CRITIC or JUDGE rejects work:
358
+ 1. Lead re-queues the responsible agent with the specific rejection findings
359
+ 2. Agent reworks and resubmits
360
+ 3. Circuit breaker (`max_rejection_cycles`, default 5) prevents infinite loops
361
+ 4. After max cycles, Lead escalates to user
362
+
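The rejection loop above amounts to a retry loop with a circuit breaker. The sketch below assumes the breaker trips once the number of rejections reaches `max_rejection_cycles`; the exact counting rule is not stated in the README. `runWithCircuitBreaker`, `ReviewResult`, and `LoopOutcome` are hypothetical names, and the `attempt` callback stands in for "agent reworks, reviewer re-reviews".

```typescript
type ReviewResult =
  | { verdict: "SHIP" }
  | { verdict: "REJECT"; findings: string[] };

type LoopOutcome =
  | { state: "shipped"; rejections: number }
  | { state: "escalated"; rejections: number; lastFindings: string[] };

// Re-run the work with the latest rejection findings until it ships or
// the circuit breaker trips (default mirrors max_rejection_cycles: 5).
function runWithCircuitBreaker(
  attempt: (findings: string[]) => ReviewResult,
  maxRejectionCycles: number = 5
): LoopOutcome {
  let findings: string[] = [];
  let rejections = 0;
  while (true) {
    const result = attempt(findings);
    if (result.verdict === "SHIP") {
      return { state: "shipped", rejections };
    }
    rejections += 1;
    findings = result.findings;
    if (rejections >= maxRejectionCycles) {
      // Assumed escalation point: hand the last findings to the user.
      return { state: "escalated", rejections, lastFindings: findings };
    }
  }
}

// A reviewer that rejects twice, then ships.
let reviewCalls = 0;
const recovered = runWithCircuitBreaker((): ReviewResult => {
  reviewCalls += 1;
  if (reviewCalls <= 2) {
    return { verdict: "REJECT", findings: ["finding " + reviewCalls] };
  }
  return { verdict: "SHIP" };
});

// A reviewer that never accepts, with a lower limit of 3.
const stuck = runWithCircuitBreaker(
  (): ReviewResult => ({ verdict: "REJECT", findings: ["still failing"] }),
  3
);

console.log(recovered.state, recovered.rejections); // shipped 2
console.log(stuck.state, stuck.rejections); // escalated 3
```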
363
+ ## Crash Recovery
364
+
365
+ All pipeline state is persisted to disk:
366
+ - `pipeline-state.json` -- current story, backlog, phase timing, team roster
367
+ - Handoff files with YAML frontmatter tracking step progress
368
+ - Git working directory preserves code state
369
+ - Inbox files preserve communication history
370
+
371
+ If the Lead crashes, it can reconstruct the full pipeline state from these artifacts on restart.
372
+
373
+ ## Directory Structure
374
+
375
+ After initialization, the pipeline installs to `.valent-pipeline/` in your project:
376
+
377
+ ```
378
+ .valent-pipeline/
379
+ pipeline-config.yaml # Your project configuration
380
+ pipeline-state.json # Pipeline runtime state
381
+ agents-manifest.yaml # Agent definitions and dependencies
382
+ prompts/ # Agent prompt templates (21 files)
383
+ templates/ # Handoff document templates (27 files)
384
+ task-graphs/ # Task dependency graphs per project type (8 files)
385
+ steps/ # Agent step files (114 files)
386
+ bend/ # Backend developer steps
387
+ fend/ # Frontend developer steps
388
+ critic/ # Code review steps
389
+ qa-a/ # Test spec steps (domain-specific)
390
+ qa-b/ # Test execution steps (domain-specific)
391
+ reqs/ # Requirements analysis steps
392
+ readiness/ # Readiness gate steps
393
+ judge/ # Judge gate steps
394
+ orchestration/ # Lead orchestration steps
395
+ retrospective/ # Retrospective analysis steps
396
+ common/ # Shared agent protocols
397
+ data/ # Data pipeline developer steps
398
+ docgen/ # Document generation steps
400
+ iac/ # Infrastructure-as-code steps
401
+ libdev/ # Library developer steps
402
+ mcp-dev/ # MCP server developer steps
403
+ mobile/ # Mobile developer steps
404
+ uxa/ # UX specification steps
405
+ spawn-templates/ # Agent spawn configuration
406
+ scripts/ # Pipeline utility scripts
407
+ docs/ # Pipeline reference documentation
408
+ knowledge/
409
+ curated/ # Curated knowledge files
410
+ correction-directives.yaml
411
+ pipeline.db # SQLite knowledge database
412
+ ```
413
+
414
+ ## Documentation
415
+
416
+ Full reference documentation lives in `pipeline/docs/`:
417
+
418
+ | Document | Description |
419
+ |---|---|
420
+ | [Pipeline Overview](pipeline/docs/pipeline-overview.md) | Architecture, flow, artifact map |
421
+ | [Agent Reference](pipeline/docs/agent-reference.md) | All agents, models, inputs/outputs |
422
+ | [Communication Standard](pipeline/docs/communication-standard.md) | Handoff format, inbox protocol, Design Council |
423
+ | [Lead Lifecycle](pipeline/docs/lead-lifecycle.md) | Kick-off, monitoring, ship, crash recovery |
424
+ | [Task Graph Specification](pipeline/docs/task-graph.md) | Dependencies, task states, claiming |
425
+ | [Pipeline State Schema](pipeline/docs/pipeline-state-schema.md) | JSON schema for pipeline-state.json |
426
+ | [Knowledge System](pipeline/docs/knowledge-system.md) | RAG assessment, correction directives, curation |
427
+ | [Template Skeleton](pipeline/docs/template-skeleton.md) | Universal handoff document structure |
428
+ | [NPX Packaging](pipeline/docs/npx-packaging.md) | Package distribution and init workflow |
429
+
430
+ ### Reference
431
+
432
+ | Document | Description |
433
+ |---|---|
434
+ | [Refactor Checklist](pipeline/docs/design/refactor-checklist.md) | Every location to update when changing agents, config, tables, or phases |
435
+
436
+ ## License
437
+
438
+ MIT
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "valent-pipeline",
3
- "version": "0.2.20",
3
+ "version": "0.2.22",
4
4
  "description": "v3 multi-agent AI pipeline for software development lifecycle",
5
5
  "type": "module",
6
6
  "bin": {
package/pipeline/agents-manifest.yaml CHANGED
@@ -50,7 +50,7 @@ agents:
50
50
  prompt_template: .valent-pipeline/prompts/uxa.md
51
51
  reads_from: [reqs-brief.md, ux-spec, trigger-map, scenarios]
52
52
  writes_to: [uxa-spec.md]
53
- project_types: [fullstack-web, frontend-only]
53
+ project_types: [fullstack-web, frontend-only, mobile-app]
54
54
  degraded_without: [trigger-map, scenarios] # runs translation-only without these
55
55
 
56
56
  qa_a:
@@ -80,6 +80,7 @@ agents:
80
80
  prompt_template: .valent-pipeline/prompts/bend.md
81
81
  reads_from: [reqs-brief.md, qa-test-spec.md]
82
82
  writes_to: [bend-handoff.md]
83
+ project_types: [backend-api, fullstack-web]
83
84
 
84
85
  fend:
85
86
  name: FEND
@@ -91,6 +92,65 @@ agents:
91
92
  writes_to: [fend-handoff.md]
92
93
  project_types: [fullstack-web, frontend-only]
93
94
 
95
+ mobile:
96
+ name: MOBILE
97
+ model: sonnet
98
+ lifecycle: per-story
99
+ role: "Mobile developer — implements RN/Flutter screens, components, Maestro E2E flows"
100
+ prompt_template: .valent-pipeline/prompts/mobile.md
101
+ reads_from: [reqs-brief.md, uxa-spec.md, qa-test-spec.md]
102
+ writes_to: [mobile-handoff.md]
103
+ project_types: [mobile-app]
104
+
105
+ data:
106
+ name: DATA
107
+ model: sonnet
108
+ lifecycle: per-story
109
+ role: "Data pipeline developer — implements ETL, transforms, data quality, checkpointing"
110
+ prompt_template: .valent-pipeline/prompts/data.md
111
+ reads_from: [reqs-brief.md, qa-test-spec.md]
112
+ writes_to: [data-handoff.md]
113
+ project_types: [data-pipeline]
114
+
115
+ mcp_dev:
116
+ name: MCP-DEV
117
+ model: sonnet
118
+ lifecycle: per-story
119
+ role: "Protocol developer — implements MCP server tools, JSON-RPC handlers, transport"
120
+ prompt_template: .valent-pipeline/prompts/mcp-dev.md
121
+ reads_from: [reqs-brief.md, qa-test-spec.md]
122
+ writes_to: [mcp-dev-handoff.md]
123
+ project_types: [mcp-server]
124
+
125
+ libdev:
126
+ name: LIBDEV
127
+ model: sonnet
128
+ lifecycle: per-story
129
+ role: "Library developer — implements public API, exports, packaging, type declarations"
130
+ prompt_template: .valent-pipeline/prompts/libdev.md
131
+ reads_from: [reqs-brief.md, qa-test-spec.md]
132
+ writes_to: [libdev-handoff.md]
133
+ project_types: [library]
134
+
135
+ docgen:
136
+ name: DOCGEN
137
+ model: sonnet
138
+ lifecycle: per-story
139
+ role: "Document generation developer — implements templates, render pipeline, output formatting"
140
+ prompt_template: .valent-pipeline/prompts/docgen.md
141
+ reads_from: [reqs-brief.md, qa-test-spec.md]
142
+ writes_to: [docgen-handoff.md]
143
+ project_types: [document-generation]
144
+
145
+ iac:
146
+ name: IAC
147
+ model: sonnet
148
+ lifecycle: per-story
149
+ role: "Infrastructure developer — implements IaC definitions, deployment configs, infrastructure tests"
150
+ prompt_template: .valent-pipeline/prompts/iac.md
151
+ reads_from: [reqs-brief.md, qa-test-spec.md]
152
+ writes_to: [iac-handoff.md]
153
+
94
154
  critic:
95
155
  name: CRITIC
96
156
  model: opus
package/pipeline/docs/agent-reference.md CHANGED
@@ -1,28 +1,48 @@
1
1
  # V3 Agent Reference
2
2
 
3
- > Quick reference for all 15 agents in the v3 pipeline.
3
+ > Quick reference for all agents in the v3 pipeline.
4
4
  > Definitive source: `.valent-pipeline/agents-manifest.yaml`
5
5
 
6
6
  ---
7
7
 
8
8
  ## Agent Roster
9
9
 
10
- ### Per-Story Agents (10)
10
+ ### Core Per-Story Agents (10)
11
11
 
12
- Spawned fresh for each story and torn down after the story ships or is cancelled.
12
+ Spawned fresh for each story and torn down after the story ships or is cancelled. These agents form the standard pipeline flow regardless of project type.
13
13
 
14
14
  | Agent | Model | Role | Reads | Writes | Key Behavior |
15
15
  |-------|-------|------|-------|--------|--------------|
16
- | REQS | Sonnet | Requirements analyst -- translates ACs into implementation brief | story-input (ACs, trigger-map, architecture-decisions, UX spec) | `reqs-brief.md` | Brainstorms ambiguity resolutions; escalates only when options have genuinely competing tradeoffs |
16
+ | REQS | Sonnet | Requirements analyst -- translates ACs into implementation brief | story-input (ACs, trigger-map, architecture-decisions, UX spec) | `reqs-brief.md` | Brainstorms ambiguity resolutions; loads domain-specific step files per testing profile; escalates only when options have genuinely competing tradeoffs |
17
17
  | UXA | Sonnet | UX specification -- translates UX spec into component specs | `reqs-brief.md`, ux-spec, trigger-map, scenarios | `uxa-spec.md` | Runs translation-only mode without trigger-map or scenarios; skipped for backend-only projects |
18
- | QA-A | Sonnet | QA spec writer -- produces behavioral test specifications | `reqs-brief.md`, `uxa-spec.md` | `qa-test-spec.md`, `visual-validation-checklist.md` | Writes test specs before code exists; tests are specified, not implemented |
19
- | READINESS | Sonnet | Spec quality gate -- validates specs before execution begins | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `readiness-review.md` | Sequential review: stops on first failure |
20
- | BEND | Opus | Backend developer -- implements production code and tests | `reqs-brief.md`, `qa-test-spec.md` | `bend-handoff.md` | Implements to QA-A test spec; coordinates with FEND via inbox for shared files |
21
- | FEND | Opus | Frontend developer -- implements UI components and tests | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `fend-handoff.md` | Implements to UXA component spec; skipped for backend-only projects |
22
- | CRITIC | Opus | Code reviewer -- 3-pass adversarial review | git-diff, `reqs-brief.md`, `qa-test-spec.md` | `critic-review.md` | 3-pass sequential review (blind hunt, edge-case hunt, acceptance audit) + triage |
23
- | QA-B | Sonnet | Test executor -- runs tests, validates spec alignment, files bugs | `qa-test-spec.md`, `critic-review.md`, `reqs-brief.md` | `execution-report.md`, `bugs.md`, `traceability-matrix.md` | Runs tests against real infrastructure; can request PMCP spawn for visual validation |
18
+ | QA-A | Sonnet | QA spec writer -- produces behavioral test specifications | `reqs-brief.md`, `uxa-spec.md` | `qa-test-spec.md`, `visual-validation-checklist.md` | Writes test specs before code exists; risk-based test depth (P0-P3); domain-specific step files per project type |
19
+ | READINESS | Sonnet | Spec quality gate -- validates specs before execution begins | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `readiness-review.md` | Sequential review: stops on first failure; routes rejection to responsible upstream agent |
20
+ | BEND | Sonnet | Backend developer -- implements production code and tests | `reqs-brief.md`, `qa-test-spec.md` | `bend-handoff.md` | Implements to QA-A test spec; coordinates with FEND via inbox for shared files; fullstack-web and backend-api only |
21
+ | FEND | Sonnet | Frontend developer -- implements UI components and tests | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `fend-handoff.md` | Implements to UXA component spec; runs only for fullstack-web and frontend-only projects |
22
+ | CRITIC | Opus | Code reviewer -- 3-pass adversarial review | git-diff, `reqs-brief.md`, `qa-test-spec.md` | `critic-review.md` | 3-pass sequential review (blind hunt, edge-case hunt, acceptance audit) + triage; domain-specific review steps per project type |
23
+ | QA-B | Sonnet | Test executor -- runs tests, validates spec alignment, files bugs | `qa-test-spec.md`, `critic-review.md`, `reqs-brief.md` | `execution-report.md`, `bugs.md`, `traceability-matrix.md` | Runs tests against real infrastructure; domain-specific execution steps; can request PMCP spawn for visual validation |
24
24
  | JUDGE | Sonnet | Final quality gate -- bug review + ship decision | `execution-report.md`, `traceability-matrix.md`, `pmcp-evidence.md`, `bugs.md`, `qa-test-spec.md` | `judge-review.md`, `judge-decision.md`, `story-report.md` | Evidence over assertion -- independently verifies every upstream claim |
25
- | Knowledge | Haiku | Knowledge retrieval -- answers queries from persistent data sources | chromadb, curated-knowledge-files, correction-directives | _(none -- inbox only)_ | Responds via inbox only; no file output |
25
+ | Knowledge | Haiku | Knowledge retrieval -- answers queries from persistent data sources | chromadb, curated-knowledge-files, correction-directives | _(none -- inbox only)_ | Responds via inbox only; no file output; uses CLI db commands for SQLite queries |
26
+
27
+ ### Domain Developer Agents (6)
28
+
29
+ Specialized developer agents that replace or supplement BEND/FEND for specific project types. Each has its own prompt, step files, handoff template, and domain-specific QA-A/QA-B/CRITIC steps.
30
+
31
+ | Agent | Model | Role | Project Type | Reads | Writes | Key Domain |
32
+ |-------|-------|------|-------------|-------|--------|------------|
33
+ | DATA | Sonnet | Data pipeline developer | `data-pipeline` | `reqs-brief.md`, `qa-test-spec.md` | `data-handoff.md` | ETL/transforms, idempotency, checkpointing, row-level logging |
34
+ | MCP-DEV | Sonnet | Protocol developer | `mcp-server` | `reqs-brief.md`, `qa-test-spec.md` | `mcp-dev-handoff.md` | JSON-RPC/stdio, two-tier error model, tool registration |
35
+ | LIBDEV | Sonnet | Library developer | `library` | `reqs-brief.md`, `qa-test-spec.md` | `libdev-handoff.md` | Public API, exports/packaging, CJS/ESM, semver, type declarations |
36
+ | DOCGEN | Sonnet | Document generation developer | `document-generation` | `reqs-brief.md`, `qa-test-spec.md` | `docgen-handoff.md` | Template engine, render pipeline, encoding, assets |
37
+ | IAC | Sonnet | Infrastructure developer | Cross-cutting (any type) | `reqs-brief.md`, `qa-test-spec.md` | `iac-handoff.md` | Terraform/Pulumi/CloudFormation, K8s, CI/CD, IAM |
38
+ | MOBILE | Sonnet | Mobile developer | `mobile-app` | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `mobile-handoff.md` | React Native/Flutter, Maestro E2E, emulator lifecycle, iOS deferral |
39
+
40
+ **Notes:**
41
+ - DATA, MCP-DEV, LIBDEV, DOCGEN each replace BEND in their dedicated task graph.
42
+ - IAC is cross-cutting -- it slots into ANY task graph when `iac` is in `testing_profiles`, running in parallel with the primary developer agent.
43
+ - MOBILE replaces BEND for mobile-app projects; BEND can still be conditionally included if `testing_profiles` includes `api`.
44
+ - Each domain agent has 5 standard steps: read-inputs, implement, write-tests, handoff, estimate.
45
+ - See the agent prompts in `pipeline/prompts/` and step files in `pipeline/steps/` for full implementation details.
26
46
 
27
47
  ### Persistent Agent (1)
28
48
 
@@ -47,26 +67,65 @@ Spawned on-demand by the Lead when triggered by specific events.
47
67
 
48
68
  ## Project-Type Agent Selection
49
69
 
50
- Not all agents run for every project type. The Lead reads `project_type` from `pipeline-config.yaml` and skips agents that don't apply.
70
+ Not all agents run for every project type. The Lead reads `project_type` from `pipeline-config.yaml`, selects the appropriate task graph, and spawns only the agents that apply.
51
71
 
52
- | Project Type | Agents Skipped |
53
- |-------------|----------------|
54
- | fullstack-web | _(none -- all agents active)_ |
55
- | backend-api | UXA, FEND, PMCP |
56
- | frontend-only | BEND |
57
- | data-pipeline | UXA, FEND, PMCP |
58
- | mcp-server | UXA, FEND, PMCP |
59
- | document-generation | UXA, FEND, PMCP |
60
- | library | UXA, FEND, PMCP |
72
+ | Project Type | Developer Agent(s) | Agents Skipped | Task Graph |
73
+ |---|---|---|---|
74
+ | `fullstack-web` | BEND + FEND | _(none)_ | `fullstack-web.yaml` |
75
+ | `backend-api` | BEND | UXA, FEND, PMCP | `backend-api.yaml` |
76
+ | `frontend-only` | FEND | BEND | `frontend-only.yaml` |
77
+ | `data-pipeline` | DATA | UXA, FEND, PMCP | `data-pipeline.yaml` |
78
+ | `mcp-server` | MCP-DEV | UXA, FEND, PMCP | `mcp-server.yaml` |
79
+ | `document-generation` | DOCGEN | UXA, FEND, PMCP | `document-generation.yaml` |
80
+ | `library` | LIBDEV | UXA, FEND, PMCP | `library.yaml` |
81
+ | `mobile-app` | MOBILE (+ BEND if api profile) | *(conditional)* | `mobile-app.yaml` |
82
+
83
+ **Conditional agents (any project type):**
84
+ - **IAC** -- spawned when `testing_profiles` includes `iac`; runs in parallel with the primary developer agent
85
+ - **PMCP** -- spawned when `testing_profiles` includes `ui`; triggered by QA-B for visual validation
86
+ - **UXA** -- skipped, even for UI projects, when `testing_profiles` excludes `ui`
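
The selection rules in the table and bullets above can be sketched roughly as follows. This is an illustrative sketch only, not the Lead's actual implementation: the `PipelineConfig` shape, the `PRIMARY_DEVELOPERS` map, and the `selectDeveloperAgents` function are hypothetical names, and the sketch assumes the task graph file name is simply the project type plus `.yaml`, as the table suggests.

```typescript
// Hypothetical sketch: names and shapes here are assumptions, not the package's API.
type ProjectType =
  | "fullstack-web"
  | "backend-api"
  | "frontend-only"
  | "data-pipeline"
  | "mcp-server"
  | "document-generation"
  | "library"
  | "mobile-app";

interface PipelineConfig {
  project_type: ProjectType;
  testing_profiles: string[];
}

// Primary developer agent(s) per project type, copied from the table above.
const PRIMARY_DEVELOPERS: Record<ProjectType, string[]> = {
  "fullstack-web": ["BEND", "FEND"],
  "backend-api": ["BEND"],
  "frontend-only": ["FEND"],
  "data-pipeline": ["DATA"],
  "mcp-server": ["MCP-DEV"],
  "document-generation": ["DOCGEN"],
  "library": ["LIBDEV"],
  "mobile-app": ["MOBILE"],
};

function selectDeveloperAgents(config: PipelineConfig): {
  taskGraph: string;
  developers: string[];
} {
  const developers = [...PRIMARY_DEVELOPERS[config.project_type]];

  // MOBILE replaces BEND, but BEND is included again when the api profile is set.
  if (config.project_type === "mobile-app" && config.testing_profiles.includes("api")) {
    developers.push("BEND");
  }

  // IAC is cross-cutting: it joins any task graph when the iac profile is set.
  if (config.testing_profiles.includes("iac")) {
    developers.push("IAC");
  }

  return { taskGraph: `${config.project_type}.yaml`, developers };
}
```

For example, a `mobile-app` project with `testing_profiles: ["api", "iac"]` would resolve to `mobile-app.yaml` with MOBILE, BEND, and IAC under these assumptions.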
61
87
 
62
88
  ---
63
89
 
64
90
  ## Model Tier Summary
65
91
 
92
+ Default assignments from `config-schema.js`:
93
+
66
94
  | Tier | Agents | Use Case | Cost |
67
95
  |------|--------|----------|------|
68
- | Opus | Lead, BEND, FEND, CRITIC | Complex code generation, orchestration, nuanced multi-pass review | Highest |
69
- | Sonnet | REQS, UXA, QA-A, QA-B, READINESS, JUDGE, PMCP, Retrospective | Analysis, spec writing, test execution, judgment, coordination | Balanced |
96
+ | Opus | Lead, CRITIC | Orchestration, nuanced multi-pass code review | Highest |
97
+ | Sonnet | REQS, UXA, QA-A, QA-B, READINESS, JUDGE, PMCP, Retrospective, BEND, FEND, DATA, MCP-DEV, LIBDEV, DOCGEN, IAC, MOBILE | Analysis, spec writing, implementation, test execution, judgment | Balanced |
70
98
  | Haiku | Knowledge, Embed, Help | Mechanical retrieval, indexing instructions, documentation lookups | Lowest |
71
99
 
72
100
  Model assignments are configurable in `pipeline-config.yaml` under the `models` section. Move agents between tiers to adjust the quality/cost tradeoff for your project.
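
As a rough illustration of how defaults and per-project overrides might combine, the sketch below looks up an agent's tier from an override map first and falls back to the defaults in the table. The `DEFAULT_MODELS` map, the lowercase tier names, the agent key spellings, and the `resolveModel` helper are all assumptions made for this example; the real key format of the `models` section is not shown in this document.

```typescript
// Hypothetical sketch: the actual schema in config-schema.js may differ.
type Tier = "opus" | "sonnet" | "haiku";

// Defaults transcribed from the Model Tier Summary table (new version).
const DEFAULT_MODELS: Record<string, Tier> = {
  Lead: "opus",
  CRITIC: "opus",
  REQS: "sonnet",
  UXA: "sonnet",
  "QA-A": "sonnet",
  "QA-B": "sonnet",
  READINESS: "sonnet",
  JUDGE: "sonnet",
  PMCP: "sonnet",
  Retrospective: "sonnet",
  BEND: "sonnet",
  FEND: "sonnet",
  DATA: "sonnet",
  "MCP-DEV": "sonnet",
  LIBDEV: "sonnet",
  DOCGEN: "sonnet",
  IAC: "sonnet",
  MOBILE: "sonnet",
  Knowledge: "haiku",
  Embed: "haiku",
  Help: "haiku",
};

// An override (standing in for the `models` section) wins over the default.
function resolveModel(
  agent: string,
  overrides: Record<string, Tier> = {},
): Tier | undefined {
  return overrides[agent] ?? DEFAULT_MODELS[agent];
}
```

Under these assumptions, `resolveModel("BEND", { BEND: "opus" })` would move BEND back to the Opus tier, which matches its assignment before this change.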
101
+
102
+ ---
103
+
104
+ ## Step File Architecture
105
+
106
+ Each agent has domain-specific step files that provide detailed execution instructions. Step files live in `pipeline/steps/{agent}/` and are referenced by the agent's prompt.
107
+
108
+ ### Shared Steps (`common/`)
109
+
110
+ | Step File | Purpose |
111
+ |---|---|
112
+ | `agent-protocol.md` | Universal agent communication rules |
113
+ | `distilled-handoff-format.md` | How to write distilled handoff documents |
114
+ | `no-api-passthrough.md` | Constraint: no passthrough API endpoints |
115
+ | `no-ui-passthrough.md` | Constraint: no passthrough UI components |
116
+ | `quality-standards.md` | Cross-cutting quality standards for all agents |
117
+
118
+ ### Domain-Specific Steps
119
+
120
+ QA-A, QA-B, CRITIC, and REQS each have domain-specific step files that load based on the project's `testing_profiles`:
121
+
122
+ | Profile | QA-A Step | QA-B Step | CRITIC Step | REQS Step |
123
+ |---|---|---|---|---|
124
+ | `api` | `qa-a/api.md` | `qa-b/api.md` | *(standard)* | *(standard)* |
125
+ | `ui` | `qa-a/ui.md` | `qa-b/ui.md` | *(standard)* | *(standard)* |
126
+ | `data-pipeline` | `qa-a/data-pipeline.md` | `qa-b/data-pipeline.md` | `critic/data-pipeline.md` | `reqs/data-pipeline.md` |
127
+ | `mcp-server` | `qa-a/mcp-server.md` | `qa-b/mcp-server.md` | `critic/mcp-server.md` | `reqs/mcp-server.md` |
128
+ | `library` | `qa-a/library.md` | `qa-b/library.md` | `critic/library.md` | `reqs/library.md` |
129
+ | `document-generation` | `qa-a/document-generation.md` | `qa-b/document-generation.md` | `critic/document-generation.md` | `reqs/document-generation.md` |
130
+ | `iac` | `qa-a/iac.md` | `qa-b/iac.md` | `critic/iac.md` | `reqs/iac.md` |
131
+ | `mobile-app` | `qa-a/mobile-app.md` | `qa-b/mobile-app.md` | `critic/mobile-app.md` | `reqs/mobile-app.md` |
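
The mapping in the table above is regular enough to express as a small lookup. The following is a minimal sketch, assuming paths are relative to `pipeline/steps/` and that `null` stands for the *(standard)* cells; `resolveStepFile` and the two profile sets are hypothetical names introduced for illustration, not functions from the package.

```typescript
// Hypothetical sketch of the profile-to-step-file table; not the package's loader.
type SteppedAgent = "qa-a" | "qa-b" | "critic" | "reqs";

// Profiles with a dedicated step file for all four agents.
const DOMAIN_PROFILES = new Set([
  "data-pipeline",
  "mcp-server",
  "library",
  "document-generation",
  "iac",
  "mobile-app",
]);

// Profiles with dedicated step files for QA-A and QA-B only.
const QA_ONLY_PROFILES = new Set(["api", "ui"]);

// Returns a path relative to pipeline/steps/, or null when the agent
// falls back to its standard steps (or the profile is unknown).
function resolveStepFile(agent: SteppedAgent, profile: string): string | null {
  if (DOMAIN_PROFILES.has(profile)) {
    return `${agent}/${profile}.md`;
  }
  const isQaAgent = agent === "qa-a" || agent === "qa-b";
  if (QA_ONLY_PROFILES.has(profile) && isQaAgent) {
    return `${agent}/${profile}.md`;
  }
  return null;
}
```

With this sketch, `resolveStepFile("critic", "iac")` yields `critic/iac.md`, while `resolveStepFile("critic", "api")` yields `null` because CRITIC uses its standard steps for the `api` profile.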