valent-pipeline 0.1.1 → 0.1.2

Files changed (108)
  1. package/package.json +1 -1
  2. package/pipeline/agents-manifest.yaml +170 -0
  3. package/pipeline/docker-compose.chromadb.yml +15 -0
  4. package/pipeline/docs/agent-reference.md +72 -0
  5. package/pipeline/docs/communication-standard.md +452 -0
  6. package/pipeline/docs/knowledge-system.md +237 -0
  7. package/pipeline/docs/lead-lifecycle.md +262 -0
  8. package/pipeline/docs/lean-spawn-human-tasks.md +207 -0
  9. package/pipeline/docs/npx-implementation-plan.md +171 -0
  10. package/pipeline/docs/npx-packaging.md +85 -0
  11. package/pipeline/docs/pipeline-overview.md +174 -0
  12. package/pipeline/docs/pipeline-state-schema.md +103 -0
  13. package/pipeline/docs/task-graph.md +184 -0
  14. package/pipeline/docs/template-skeleton.md +281 -0
  15. package/pipeline/prompts/bend.md +111 -0
  16. package/pipeline/prompts/critic.md +105 -0
  17. package/pipeline/prompts/embed.md +80 -0
  18. package/pipeline/prompts/fend.md +136 -0
  19. package/pipeline/prompts/help.md +28 -0
  20. package/pipeline/prompts/judge-g1.md +121 -0
  21. package/pipeline/prompts/judge-g2.md +112 -0
  22. package/pipeline/prompts/knowledge.md +129 -0
  23. package/pipeline/prompts/lead.md +682 -0
  24. package/pipeline/prompts/pmcp.md +77 -0
  25. package/pipeline/prompts/qa-a.md +149 -0
  26. package/pipeline/prompts/qa-b.md +132 -0
  27. package/pipeline/prompts/reqs.md +105 -0
  28. package/pipeline/prompts/retrospective.md +82 -0
  29. package/pipeline/prompts/uxa.md +143 -0
  30. package/pipeline/scripts/embed-sqlite.ts +282 -0
  31. package/pipeline/scripts/embed.ts +425 -0
  32. package/pipeline/spawn-templates/agent-spawn.template.md +16 -0
  33. package/pipeline/spawn-templates/knowledge-spawn.template.md +17 -0
  34. package/pipeline/spawn-templates/pipeline-context.template.md +46 -0
  35. package/pipeline/steps/bend/handoff.md +9 -0
  36. package/pipeline/steps/bend/implement.md +13 -0
  37. package/pipeline/steps/bend/read-inputs.md +13 -0
  38. package/pipeline/steps/bend/write-tests.md +15 -0
  39. package/pipeline/steps/common/distilled-handoff-format.md +49 -0
  40. package/pipeline/steps/common/no-api-passthrough.md +18 -0
  41. package/pipeline/steps/common/no-ui-passthrough.md +18 -0
  42. package/pipeline/steps/critic/acceptance-audit.md +24 -0
  43. package/pipeline/steps/critic/blind-hunt.md +18 -0
  44. package/pipeline/steps/critic/edge-case-hunt.md +22 -0
  45. package/pipeline/steps/critic/test-review.md +19 -0
  46. package/pipeline/steps/critic/triage-depth.md +17 -0
  47. package/pipeline/steps/critic/triage.md +12 -0
  48. package/pipeline/steps/critic/write-verdict.md +31 -0
  49. package/pipeline/steps/fend/handoff.md +9 -0
  50. package/pipeline/steps/fend/implement.md +16 -0
  51. package/pipeline/steps/fend/read-inputs.md +10 -0
  52. package/pipeline/steps/fend/write-tests.md +12 -0
  53. package/pipeline/steps/judge-g1/pass1-review.md +117 -0
  54. package/pipeline/steps/judge-g1/pass2-review.md +51 -0
  55. package/pipeline/steps/judge-g2/evidence-review.md +105 -0
  56. package/pipeline/steps/judge-g2/ship-decision.md +43 -0
  57. package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +91 -0
  58. package/pipeline/steps/orchestration/load-agents-manifest.md +9 -0
  59. package/pipeline/steps/orchestration/load-pipeline-config.md +33 -0
  60. package/pipeline/steps/orchestration/resolve-next-work-item.md +32 -0
  61. package/pipeline/steps/orchestration/resolve-story-path.md +12 -0
  62. package/pipeline/steps/orchestration/update-backlog-status.md +28 -0
  63. package/pipeline/steps/orchestration/validate-story-inputs.md +43 -0
  64. package/pipeline/steps/qa-a/api.md +31 -0
  65. package/pipeline/steps/qa-a/read-inputs.md +34 -0
  66. package/pipeline/steps/qa-a/write-spec.md +144 -0
  67. package/pipeline/steps/qa-b/api.md +52 -0
  68. package/pipeline/steps/qa-b/execute-tests.md +90 -0
  69. package/pipeline/steps/qa-b/file-bugs.md +41 -0
  70. package/pipeline/steps/qa-b/write-report.md +55 -0
  71. package/pipeline/steps/reqs/analyze.md +41 -0
  72. package/pipeline/steps/reqs/draft-brief.md +29 -0
  73. package/pipeline/steps/reqs/pre-mortem.md +27 -0
  74. package/pipeline/steps/reqs/read-inputs.md +25 -0
  75. package/pipeline/steps/reqs/self-review.md +22 -0
  76. package/pipeline/steps/reqs/write-output.md +14 -0
  77. package/pipeline/steps/retrospective/aggregate-review.md +51 -0
  78. package/pipeline/steps/retrospective/analyze.md +35 -0
  79. package/pipeline/steps/retrospective/directives.md +60 -0
  80. package/pipeline/steps/retrospective/embed-instructions.md +39 -0
  81. package/pipeline/steps/retrospective/report.md +34 -0
  82. package/pipeline/steps/uxa/read-inputs.md +22 -0
  83. package/pipeline/steps/uxa/translate-spec.md +124 -0
  84. package/pipeline/steps/uxa/write-output.md +15 -0
  85. package/pipeline/task-graphs/backend-api.yaml +139 -0
  86. package/pipeline/task-graphs/data-pipeline.yaml +139 -0
  87. package/pipeline/task-graphs/document-generation.yaml +139 -0
  88. package/pipeline/task-graphs/frontend-only.yaml +178 -0
  89. package/pipeline/task-graphs/fullstack-web.yaml +186 -0
  90. package/pipeline/task-graphs/library.yaml +139 -0
  91. package/pipeline/task-graphs/mcp-server.yaml +139 -0
  92. package/pipeline/templates/bend-handoff.template.md +83 -0
  93. package/pipeline/templates/bugs.template.md +111 -0
  94. package/pipeline/templates/critic-review.template.md +101 -0
  95. package/pipeline/templates/decisions.template.md +29 -0
  96. package/pipeline/templates/embed-instructions.template.md +46 -0
  97. package/pipeline/templates/execution-report.template.md +119 -0
  98. package/pipeline/templates/fend-handoff.template.md +85 -0
  99. package/pipeline/templates/judge-g1-review.template.md +155 -0
  100. package/pipeline/templates/judge-g2-decision.template.md +64 -0
  101. package/pipeline/templates/pmcp-evidence.template.md +49 -0
  102. package/pipeline/templates/qa-test-spec.template.md +153 -0
  103. package/pipeline/templates/reqs-brief.template.md +119 -0
  104. package/pipeline/templates/retrospective.template.md +108 -0
  105. package/pipeline/templates/story-report.template.md +89 -0
  106. package/pipeline/templates/traceability-matrix.template.md +90 -0
  107. package/pipeline/templates/uxa-spec.template.md +169 -0
  108. package/pipeline/templates/visual-validation-checklist.template.md +71 -0
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "valent-pipeline",
-  "version": "0.1.1",
+  "version": "0.1.2",
   "description": "v3 multi-agent AI pipeline for software development lifecycle",
   "type": "module",
   "bin": {
package/pipeline/agents-manifest.yaml ADDED
@@ -0,0 +1,170 @@
+# =============================================================================
+# V3 Agent Manifest — Lead Agent's Primary Reference for Team Composition
+# =============================================================================
+#
+# This manifest defines every agent the lead can spawn, their models, roles,
+# dependencies, and outputs. The lead reads this file at pipeline start to:
+#
+# 1. Phase 1 kick-off: Filter by lifecycle: per-story, check degraded_without
+#    to determine which optional inputs are available, and spawn teammates.
+# 2. Task dependency graph: Built from reads_from / writes_to — if agent B
+#    reads from what agent A writes, B is blockedBy A.
+# 3. Communication: All teammates can message any other living teammate via
+#    inbox. No routing restrictions.
+# 4. Model selection: Centralized here, not scattered across prompt templates.
+#    Changing an agent's model is a one-line YAML edit.
+# 5. Ephemeral spawning: The ephemeral_agents section tells the lead what it
+#    can spawn on-demand and what triggers the spawn.
+#
+# IMPORTANT — Communication & Access:
+# - All teammates can message any other living teammate. There are NO
+#   communication restrictions.
+# - reads_from indicates PRIMARY DEPENDENCIES for task sequencing, not access
+#   restrictions. All agents can read any file in the story folder.
+# =============================================================================
+
+agents:
+  lead:
+    name: Lead
+    model: opus
+    lifecycle: persistent
+    role: "Pipeline orchestrator — spawns team, monitors execution, manages story lifecycle"
+    prompt_template: v3/prompts/lead.md
+    reads_from: [story-input, agents-manifest.yaml, pipeline-config.yaml, pipeline-state.json]
+    writes_to: [pipeline-state.json]
+
+  reqs:
+    name: REQS
+    model: sonnet
+    lifecycle: per-story
+    role: "Requirements analyst — translates ACs into implementation brief"
+    prompt_template: v3/prompts/reqs.md
+    reads_from: [story-input] # all story folder contents: ACs, trigger-map, architecture-decisions, UX spec when available
+    writes_to: [reqs-brief.md]
+
+  uxa:
+    name: UXA
+    model: sonnet
+    lifecycle: per-story
+    role: "UX specification agent — translates UX spec into component specs"
+    prompt_template: v3/prompts/uxa.md
+    reads_from: [reqs-brief.md, ux-spec, trigger-map, scenarios]
+    writes_to: [uxa-spec.md]
+    project_types: [fullstack-web, frontend-only]
+    degraded_without: [trigger-map, scenarios] # runs translation-only without these
+
+  qa_a:
+    name: QA-A
+    model: sonnet
+    lifecycle: per-story
+    role: "QA spec writer — produces behavioral test specifications"
+    prompt_template: v3/prompts/qa-a.md
+    reads_from: [reqs-brief.md, uxa-spec.md]
+    writes_to: [qa-test-spec.md, visual-validation-checklist.md]
+
+  judge_g1:
+    name: JUDGE-G1
+    model: sonnet
+    lifecycle: per-story
+    role: "Quality gate — validates reqs, UXA spec, test specs (Pass 1) and bug priorities (Pass 2)"
+    prompt_template: v3/prompts/judge-g1.md
+    passes:
+      pass1_review_order: [reqs-validation, uxa-validation, qa-spec-validation] # sequential, stop on first failure
+      pass2: bug-review
+    reads_from: [reqs-brief.md, uxa-spec.md, qa-test-spec.md, bugs.md, execution-report.md]
+    writes_to: [judge-g1-review.md]
+
+  bend:
+    name: BEND
+    model: sonnet
+    lifecycle: per-story
+    role: "Backend developer — implements production code and tests"
+    prompt_template: v3/prompts/bend.md
+    reads_from: [reqs-brief.md, qa-test-spec.md]
+    writes_to: [bend-handoff.md]
+
+  fend:
+    name: FEND
+    model: sonnet
+    lifecycle: per-story
+    role: "Frontend developer — implements UI components and tests"
+    prompt_template: v3/prompts/fend.md
+    reads_from: [reqs-brief.md, uxa-spec.md, qa-test-spec.md]
+    writes_to: [fend-handoff.md]
+    project_types: [fullstack-web, frontend-only]
+
+  critic:
+    name: CRITIC
+    model: opus
+    lifecycle: per-story
+    role: "Code reviewer — 3-pass adversarial review of production and test code"
+    prompt_template: v3/prompts/critic.md
+    review_passes: [blind-hunt, edge-case-hunt, acceptance-audit, triage]
+    reads_from: [git-diff, reqs-brief.md, qa-test-spec.md]
+    writes_to: [critic-review.md]
+
+  qa_b:
+    name: QA-B
+    model: sonnet
+    lifecycle: per-story
+    role: "Test executor — runs tests, validates spec alignment, files bugs"
+    prompt_template: v3/prompts/qa-b.md
+    reads_from: [qa-test-spec.md, critic-review.md, reqs-brief.md]
+    writes_to: [execution-report.md, bugs.md, traceability-matrix.md]
+    can_request_spawn: [pmcp] # asks lead to spawn PMCP
+
+  judge_g2:
+    name: JUDGE-G2
+    model: sonnet
+    lifecycle: per-story
+    role: "Final ship gate — evidence-based approval or rejection"
+    prompt_template: v3/prompts/judge-g2.md
+    reads_from: [execution-report.md, traceability-matrix.md, pmcp-evidence.md, bugs.md, judge-g1-review.md, qa-test-spec.md] # critic-review.md intentionally excluded — G2 validates test/execution evidence, not code review; qa-test-spec.md used as reference for assertion cross-check
+    writes_to: [judge-g2-decision.md, story-report.md]
+
+  knowledge:
+    name: Knowledge
+    model: haiku
+    lifecycle: per-story
+    role: "Knowledge retrieval — answers queries from persistent data sources"
+    prompt_template: v3/prompts/knowledge.md
+    data_sources: [chromadb, curated-knowledge-files, correction-directives]
+    context_variables: [knowledge_mode, chromadb_host, chromadb_collection_prefix, curated_files_path, correction_directives]
+    # No writes_to — Knowledge Agent responds via inbox only, no file output
+
+ephemeral_agents:
+  pmcp:
+    name: PMCP
+    model: sonnet
+    role: "Visual validation — executes browser automation MCP checklist, captures screenshots"
+    prompt_template: v3/prompts/pmcp.md
+    reads_from: [visual-validation-checklist.md]
+    writes_to: [pmcp-evidence.md]
+    spawned_by: lead
+    spawn_trigger: qa_a_checklist # Lead spawns idle when QA-A writes visual-validation-checklist.md
+    execution_trigger: qa_b # QA-B sends [PMCP-TRIGGER] with dev server URL to start execution
+
+  embed:
+    name: Embed
+    model: haiku
+    role: "Knowledge indexer — indexes curated patterns into knowledge base"
+    prompt_template: v3/prompts/embed.md
+    spawned_by: lead
+    triggered_by: retrospective # only runs after Retrospective Agent curates what to index
+
+  retrospective:
+    name: Retrospective
+    model: sonnet
+    role: "Batch reviewer — analyzes last N stories for recurring patterns"
+    prompt_template: v3/prompts/retrospective.md
+    spawned_by: lead
+    triggered_by: every-n-stories
+
+  help:
+    name: Help
+    model: haiku
+    role: "Pipeline help — explains any piece of the pipeline from documentation"
+    prompt_template: v3/prompts/help.md
+    reads_from: [v3/docs/]
+    spawned_by: lead
+    triggered_by: user-request
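Reviewer note: the manifest header comment says the task dependency graph is built from reads_from / writes_to ("if agent B reads from what agent A writes, B is blockedBy A"). The sketch below is one possible reading of that rule, not code from the package. The `AgentEntry` interface, the `buildBlockedBy` function, and the hand-copied `sample` subset are all hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: only the two manifest fields the header comment mentions.
interface AgentEntry {
  reads_from?: string[];
  writes_to?: string[];
}

// For each agent, list the agents whose outputs it reads (its blockers).
function buildBlockedBy(agents: Record<string, AgentEntry>): Record<string, string[]> {
  // Index: artifact name -> ids of the agents that write it.
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const writers: Record<string, string[]> = Object.create(null);
  for (const id of Object.keys(agents)) {
    for (const artifact of agents[id].writes_to || []) {
      (writers[artifact] = writers[artifact] || []).push(id);
    }
  }

  const blockedBy: Record<string, string[]> = {};
  for (const id of Object.keys(agents)) {
    const blockers: string[] = [];
    for (const artifact of agents[id].reads_from || []) {
      for (const writer of writers[artifact] || []) {
        // An agent never blocks itself; each blocker is listed once.
        if (writer !== id && blockers.indexOf(writer) === -1) blockers.push(writer);
      }
    }
    blockedBy[id] = blockers.sort();
  }
  return blockedBy;
}

// Subset of the manifest above, transcribed by hand for the example.
const sample: Record<string, AgentEntry> = {
  reqs: { reads_from: ["story-input"], writes_to: ["reqs-brief.md"] },
  uxa: { reads_from: ["reqs-brief.md", "ux-spec", "trigger-map", "scenarios"], writes_to: ["uxa-spec.md"] },
  qa_a: { reads_from: ["reqs-brief.md", "uxa-spec.md"], writes_to: ["qa-test-spec.md", "visual-validation-checklist.md"] },
  bend: { reads_from: ["reqs-brief.md", "qa-test-spec.md"], writes_to: ["bend-handoff.md"] },
};

const graph = buildBlockedBy(sample);
console.log(JSON.stringify(graph));
// prints {"reqs":[],"uxa":["reqs"],"qa_a":["reqs","uxa"],"bend":["qa_a","reqs"]}
```

Applied naively to the full manifest, this rule would also make JUDGE-G1 wait on QA-B, because JUDGE-G1 lists `bugs.md` and `execution-report.md` for its Pass 2. The package ships separate task-graphs/*.yaml files, so the real sequencing may well handle the two passes differently.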
package/pipeline/docker-compose.chromadb.yml ADDED
@@ -0,0 +1,15 @@
+services:
+  chromadb:
+    image: chromadb/chroma:latest
+    ports:
+      - "8000:8000"
+    volumes:
+      - ./chromadb-data:/chroma/chroma
+    environment:
+      IS_PERSISTENT: "TRUE"
+      ANONYMIZED_TELEMETRY: "FALSE"
+    healthcheck:
+      test: ["CMD-SHELL", "curl -f http://localhost:8000/api/v1/heartbeat || exit 1"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
package/pipeline/docs/agent-reference.md ADDED
@@ -0,0 +1,72 @@
+# V3 Agent Reference
+
+> Quick reference for all 15 agents in the v3 pipeline.
+> Definitive source: `v3/agents-manifest.yaml`
+
+---
+
+## Agent Roster
+
+### Per-Story Agents (10)
+
+Spawned fresh for each story and torn down after the story ships or is cancelled.
+
+| Agent | Model | Role | Reads | Writes | Key Behavior |
+|-------|-------|------|-------|--------|--------------|
+| REQS | Sonnet | Requirements analyst -- translates ACs into implementation brief | story-input (ACs, trigger-map, architecture-decisions, UX spec) | `reqs-brief.md` | Brainstorms ambiguity resolutions; escalates only when options have genuinely competing tradeoffs |
+| UXA | Sonnet | UX specification -- translates UX spec into component specs | `reqs-brief.md`, ux-spec, trigger-map, scenarios | `uxa-spec.md` | Runs translation-only mode without trigger-map or scenarios; skipped for backend-only projects |
+| QA-A | Sonnet | QA spec writer -- produces behavioral test specifications | `reqs-brief.md`, `uxa-spec.md` | `qa-test-spec.md`, `visual-validation-checklist.md` | Writes test specs before code exists; tests are specified, not implemented |
+| JUDGE-G1 | Sonnet | Quality gate -- validates specs (Pass 1) and bug priorities (Pass 2) | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md`, `bugs.md`, `execution-report.md` | `judge-g1-review.md` | Sequential review: stops on first failure in Pass 1 |
+| BEND | Opus | Backend developer -- implements production code and tests | `reqs-brief.md`, `qa-test-spec.md` | `bend-handoff.md` | Implements to QA-A test spec; coordinates with FEND via inbox for shared files |
+| FEND | Opus | Frontend developer -- implements UI components and tests | `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md` | `fend-handoff.md` | Implements to UXA component spec; skipped for backend-only projects |
+| CRITIC | Opus | Code reviewer -- 3-pass adversarial review | git-diff, `reqs-brief.md`, `qa-test-spec.md` | `critic-review.md` | 3-pass sequential review (blind hunt, edge-case hunt, acceptance audit) + triage |
+| QA-B | Sonnet | Test executor -- runs tests, validates spec alignment, files bugs | `qa-test-spec.md`, `critic-review.md`, `reqs-brief.md` | `execution-report.md`, `bugs.md`, `traceability-matrix.md` | Runs tests against real infrastructure; can request PMCP spawn for visual validation |
+| JUDGE-G2 | Sonnet | Final ship gate -- evidence-based approval or rejection | `execution-report.md`, `traceability-matrix.md`, `pmcp-evidence.md`, `bugs.md`, `judge-g1-review.md` | `judge-g2-decision.md` | Evidence over assertion -- independently verifies every upstream claim |
+| Knowledge | Haiku | Knowledge retrieval -- answers queries from persistent data sources | chromadb, curated-knowledge-files, correction-directives | _(none -- inbox only)_ | Responds via inbox only; no file output |
+
+### Persistent Agent (1)
+
+Lives across stories. Manages the backlog and orchestrates each story team.
+
+| Agent | Model | Role | Reads | Writes | Key Behavior |
+|-------|-------|------|-------|--------|--------------|
+| Lead | Opus | Pipeline orchestrator -- spawns team, monitors execution, manages story lifecycle | story-input, `agents-manifest.yaml`, `pipeline-config.yaml`, `pipeline-state.json` | `story-report.md`, `pipeline-state.json` | Builds task graph from manifest; enforces circuit breaker on rejection loops; escalates to user as last resort |
+
+### Ephemeral Agents (4)
+
+Spawned on-demand by the Lead when triggered by specific events.
+
+| Agent | Model | Role | Reads | Writes | Trigger |
+|-------|-------|------|-------|--------|---------|
+| PMCP | Sonnet | Visual validation -- browser automation MCP, captures screenshots | `visual-validation-checklist.md` | `pmcp-evidence.md` | Requested by QA-B, BEND, or FEND |
+| Embed | Haiku | Knowledge indexer -- indexes curated patterns into knowledge base | _(retrospective output)_ | _(indexing instructions)_ | After Retrospective agent curates what to index |
+| Retrospective | Sonnet | Batch reviewer -- analyzes last N stories for recurring patterns | _(story reports)_ | _(retrospective report)_ | Every N stories (configurable) |
+| Help | Haiku | Pipeline help -- explains any piece of the pipeline from documentation | `v3/docs/` | _(inbox only)_ | User request |
+
+---
+
+## Project-Type Agent Selection
+
+Not all agents run for every project type. The Lead reads `project_type` from `pipeline-config.yaml` and skips agents that don't apply.
+
+| Project Type | Agents Skipped |
+|-------------|----------------|
+| fullstack-web | _(none -- all agents active)_ |
+| backend-api | UXA, FEND, PMCP |
+| frontend-only | BEND |
+| data-pipeline | UXA, FEND, PMCP |
+| mcp-server | UXA, FEND, PMCP |
+| document-generation | UXA, FEND, PMCP |
+| library | UXA, FEND, PMCP |
+
+---
+
+## Model Tier Summary
+
+| Tier | Agents | Use Case | Cost |
+|------|--------|----------|------|
+| Opus | Lead, BEND, FEND, CRITIC | Complex code generation, orchestration, nuanced multi-pass review | Highest |
+| Sonnet | REQS, UXA, QA-A, QA-B, JUDGE-G1, JUDGE-G2, PMCP, Retrospective | Analysis, spec writing, test execution, judgment, coordination | Balanced |
+| Haiku | Knowledge, Embed, Help | Mechanical retrieval, indexing instructions, documentation lookups | Lowest |
+
+Model assignments are configurable in `pipeline-config.yaml` under the `models` section. Move agents between tiers to adjust the quality/cost tradeoff for your project.
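Reviewer note: the Project-Type Agent Selection table above can be read as a simple lookup from project type to skipped agents. The sketch below transcribes that table by hand and is illustration only, not how the Lead actually implements the skip. `SKIPPED_AGENTS`, `ROSTER`, and `activeAgents` are hypothetical names, and the roster is assumed to be the ten per-story agents plus PMCP.

```typescript
// Project types named in the "Project-Type Agent Selection" table.
type ProjectType =
  | "fullstack-web"
  | "backend-api"
  | "frontend-only"
  | "data-pipeline"
  | "mcp-server"
  | "document-generation"
  | "library";

// The table transcribed as a lookup: project type -> agents skipped.
const SKIPPED_AGENTS: Record<ProjectType, string[]> = {
  "fullstack-web": [],
  "backend-api": ["UXA", "FEND", "PMCP"],
  "frontend-only": ["BEND"],
  "data-pipeline": ["UXA", "FEND", "PMCP"],
  "mcp-server": ["UXA", "FEND", "PMCP"],
  "document-generation": ["UXA", "FEND", "PMCP"],
  "library": ["UXA", "FEND", "PMCP"],
};

// Assumed roster: the ten per-story agents plus the ephemeral PMCP.
const ROSTER: string[] = [
  "REQS", "UXA", "QA-A", "JUDGE-G1", "BEND", "FEND",
  "CRITIC", "QA-B", "JUDGE-G2", "Knowledge", "PMCP",
];

// Return the roster minus whatever the table says to skip for this project type.
function activeAgents(projectType: ProjectType, roster: string[] = ROSTER): string[] {
  const skipped = SKIPPED_AGENTS[projectType];
  return roster.filter((name) => skipped.indexOf(name) === -1);
}

console.log(activeAgents("backend-api").join(", "));
// prints REQS, QA-A, JUDGE-G1, BEND, CRITIC, QA-B, JUDGE-G2, Knowledge
```

The manifest in this same release expresses the skip differently, through a `project_types` allowlist on UXA and FEND only, so a table-driven lookup like this would need to be reconciled with that field before being relied on.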