valent-pipeline 0.3.2 → 0.3.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (61)

package/package.json +1 -1
package/pipeline/agents-manifest.yaml +23 -33
package/pipeline/docs/knowledge-system.md +16 -18
package/pipeline/docs/lead-lifecycle.md +3 -12
package/pipeline/docs/npx-packaging.md +0 -1
package/pipeline/docs/template-skeleton.md +1 -1
package/pipeline/prompts/bend.md +12 -2
package/pipeline/prompts/critic.md +15 -8
package/pipeline/prompts/fend.md +12 -2
package/pipeline/prompts/judge.md +12 -2
package/pipeline/prompts/lead.md +231 -71
package/pipeline/prompts/qa-a.md +1 -1
package/pipeline/prompts/qa-b.md +12 -2
package/pipeline/prompts/reqs.md +1 -1
package/pipeline/prompts/uxa.md +1 -1
package/pipeline/providers/claude-code/runtime.md +31 -10
package/pipeline/providers/codex/AGENTS.md +8 -3
package/pipeline/providers/codex/cloud-task-prompts/implementation.md +2 -0
package/pipeline/providers/codex/codex-project-files/.codex/agents/review-explorer.toml +2 -2
package/pipeline/providers/codex/runtime.md +91 -208
package/pipeline/providers/codex/spawn.template.md +3 -1
package/pipeline/scripts/query-kb.ts +1 -1
package/pipeline/spawn-templates/pipeline-context.template.md +1 -3
package/pipeline/steps/bend/read-inputs.md +2 -5
package/pipeline/steps/common/agent-protocol.md +9 -1
package/pipeline/steps/data/read-inputs.md +2 -5
package/pipeline/steps/docgen/read-inputs.md +2 -5
package/pipeline/steps/fend/read-inputs.md +2 -5
package/pipeline/steps/iac/read-inputs.md +2 -5
package/pipeline/steps/libdev/read-inputs.md +2 -5
package/pipeline/steps/mcp-dev/read-inputs.md +2 -5
package/pipeline/steps/mobile/read-inputs.md +2 -5
package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +97 -24
package/pipeline/steps/orchestration/sprint-execute.md +30 -10
package/pipeline/steps/orchestration/validate-story-inputs.md +1 -1
package/pipeline/steps/qa-a/read-inputs.md +2 -6
package/pipeline/steps/reqs/read-inputs.md +3 -7
package/pipeline/steps/uxa/read-inputs.md +2 -6
package/pipeline/task-graphs/backend-api.yaml +0 -8
package/pipeline/task-graphs/data-pipeline.yaml +0 -8
package/pipeline/task-graphs/document-generation.yaml +0 -8
package/pipeline/task-graphs/frontend-only.yaml +0 -8
package/pipeline/task-graphs/fullstack-web.yaml +0 -8
package/pipeline/task-graphs/library.yaml +0 -8
package/pipeline/task-graphs/mcp-server.yaml +0 -8
package/pipeline/task-graphs/mobile-app.yaml +0 -8
package/pipeline/templates/embed-instructions.template.md +1 -1
package/pipeline/templates/retrospective.template.md +1 -1
package/skills/valent-help/SKILL.md +2 -2
package/skills/valent-knowledge/SKILL.md +68 -0
package/skills/valent-run-epic/SKILL.md +4 -9
package/skills/valent-run-project/SKILL.md +4 -7
package/skills/valent-run-story/SKILL.md +1 -1
package/skills/valent-setup-backlog/SKILL.md +3 -3
package/src/commands/init.js +16 -4
package/src/lib/config-schema.js +2 -2
package/pipeline/prompts/knowledge.md +0 -94
package/pipeline/providers/claude-code/knowledge-spawn.template.md +0 -17
package/pipeline/providers/codex/codex-project-files/.codex/agents/knowledge-service.toml +0 -14
package/pipeline/providers/codex/knowledge-spawn.template.md +0 -19
package/pipeline/spawn-templates/knowledge-spawn.template.md +0 -17

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "valent-pipeline",
-  "version": "0.3.2",
+  "version": "0.3.4",
   "description": "v3 multi-agent AI pipeline for software development lifecycle",
   "type": "module",
   "bin": {

package/pipeline/agents-manifest.yaml CHANGED

@@ -75,7 +75,7 @@ agents:
   readiness:
     name: READINESS
-    model: sonnet
+    model: opus
     lifecycle: per-story
     role: "Spec quality gate — validates reqs, UXA spec, and test specs are implementation-ready"
     prompt_template: .valent-pipeline/prompts/readiness.md
@@ -85,8 +85,8 @@ agents:
   bend:
     name: BEND
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Backend developer — implements production code and tests"
     prompt_template: .valent-pipeline/prompts/bend.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -95,8 +95,8 @@ agents:
   fend:
     name: FEND
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Frontend developer — implements UI components and tests"
     prompt_template: .valent-pipeline/prompts/fend.md
     reads_from: [reqs-brief.md, uxa-spec.md, qa-test-spec.md]
@@ -105,8 +105,8 @@ agents:
   mobile:
     name: MOBILE
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Mobile developer — implements RN/Flutter screens, components, Maestro E2E flows"
     prompt_template: .valent-pipeline/prompts/mobile.md
     reads_from: [reqs-brief.md, uxa-spec.md, qa-test-spec.md]
@@ -115,8 +115,8 @@ agents:
   data:
     name: DATA
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Data pipeline developer — implements ETL, transforms, data quality, checkpointing"
     prompt_template: .valent-pipeline/prompts/data.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -125,8 +125,8 @@ agents:
   mcp_dev:
     name: MCP-DEV
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Protocol developer — implements MCP server tools, JSON-RPC handlers, transport"
     prompt_template: .valent-pipeline/prompts/mcp-dev.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -135,8 +135,8 @@ agents:
   libdev:
     name: LIBDEV
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Library developer — implements public API, exports, packaging, type declarations"
     prompt_template: .valent-pipeline/prompts/libdev.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -145,8 +145,8 @@ agents:
   docgen:
     name: DOCGEN
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Document generation developer — implements templates, render pipeline, output formatting"
     prompt_template: .valent-pipeline/prompts/docgen.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -155,8 +155,8 @@ agents:
   iac:
     name: IAC
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Infrastructure developer — implements IaC definitions, deployment configs, infrastructure tests"
     prompt_template: .valent-pipeline/prompts/iac.md
     reads_from: [reqs-brief.md, qa-test-spec.md]
@@ -165,7 +165,7 @@ agents:
   critic:
     name: CRITIC
     model: opus
-    lifecycle: per-story
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Code reviewer — 3-pass adversarial review of production and test code"
     prompt_template: .valent-pipeline/prompts/critic.md
     review_passes: [blind-hunt, edge-case-hunt, acceptance-audit, triage]
@@ -174,8 +174,8 @@ agents:
   qa_b:
     name: QA-B
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Test executor — runs tests, validates spec alignment, files bugs"
     prompt_template: .valent-pipeline/prompts/qa-b.md
     reads_from: [qa-test-spec.md, critic-review.md, reqs-brief.md]
@@ -184,23 +184,13 @@ agents:
   judge:
     name: JUDGE
-    model: sonnet
-    lifecycle: per-story
+    model: opus
+    lifecycle: per-sprint  # persists across stories; receives [STORY-RESET] between stories
     role: "Final quality gate — bug priority review + evidence-based ship decision"
     prompt_template: .valent-pipeline/prompts/judge.md
     reads_from: [execution-report.md, traceability-matrix.md, pmcp-evidence.md, bugs.md, qa-test-spec.md]  # critic-review.md intentionally excluded — JUDGE validates test/execution evidence, not code review; qa-test-spec.md used as reference for assertion cross-check
     writes_to: [judge-review.md, judge-decision.md, story-report.md]
-  knowledge:
-    name: Knowledge
-    model: haiku
-    lifecycle: per-story
-    role: "Knowledge retrieval — answers queries from persistent data sources"
-    prompt_template: .valent-pipeline/prompts/knowledge.md
-    data_sources: [chromadb, curated-knowledge-files, correction-directives]
-    context_variables: [knowledge_mode, chromadb_host, chromadb_collection_prefix, curated_files_path, correction_directives]
-    # No writes_to — Knowledge Agent responds via inbox only, no file output
 ephemeral_agents:
   pmcp:
     name: PMCP
@@ -223,7 +213,7 @@ ephemeral_agents:
   retrospective:
     name: Retrospective
-    model: sonnet
+    model: opus
     role: "Batch reviewer — analyzes last N stories for recurring patterns"
     prompt_template: .valent-pipeline/prompts/retrospective.md
     spawned_by: lead
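
Reviewer note: the manifest hunks above move most development and review agents from `per-story` to `per-sprint`, with the inline comment saying such agents receive `[STORY-RESET]` between stories. A minimal sketch of how a consumer of the manifest might pick out those agents follows. The `AgentEntry` type, the `agentsToReset` helper, and the hand-copied subset of entries are all hypothetical; the package's own loader is not shown in this diff.

```typescript
// Hypothetical shape of one entry under `agents:`; only fields visible
// in the diff are modelled here.
type Lifecycle = "per-story" | "per-sprint";

interface AgentEntry {
  name: string;
  model: string;
  lifecycle: Lifecycle;
}

// Hand-copied subset of the 0.3.4 values shown in the hunks above
// (illustrative only, not parsed from the real YAML file).
const agents: Record<string, AgentEntry> = {
  readiness: { name: "READINESS", model: "opus", lifecycle: "per-story" },
  bend: { name: "BEND", model: "opus", lifecycle: "per-sprint" },
  critic: { name: "CRITIC", model: "opus", lifecycle: "per-sprint" },
  judge: { name: "JUDGE", model: "opus", lifecycle: "per-sprint" },
};

// Assumption: agents that persist across stories are the ones that need
// a [STORY-RESET]; per-story agents are left out of the list.
function agentsToReset(manifest: Record<string, AgentEntry>): string[] {
  return Object.values(manifest)
    .filter((agent) => agent.lifecycle === "per-sprint")
    .map((agent) => agent.name);
}

console.log(agentsToReset(agents).join(", "));
```

READINESS stays `per-story` in 0.3.4 even though its model changes, so it is excluded by the filter.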

package/pipeline/docs/knowledge-system.md CHANGED

@@ -4,7 +4,7 @@ Reference documentation for the v3 pipeline knowledge subsystem -- how the pipel
 ## 1. Architecture Overview
-The knowledge system has three data sources, three agents, and one principle: the Retrospective Agent is the sole gatekeeper for what enters persistent knowledge.
+The knowledge system has three data sources, two curation agents, and one principle: the Retrospective Agent is the sole gatekeeper for what enters persistent knowledge. Agents self-serve from these data sources directly during their read-inputs step — there is no separate Knowledge Agent.
 ### Data Sources
@@ -12,15 +12,14 @@ The knowledge system has three data sources, three agents, and one principle: th
 |--------|--------|---------|
 | **Curated knowledge files** | Markdown in `.valent-pipeline/knowledge/curated/` | Conventions, validated patterns, known pitfalls, test stability data |
 | **Correction directives** | YAML in `.valent-pipeline/knowledge/correction-directives.yaml` | Behavioral changes for agents -- translates observations into prompt-level guidance |
-| **ChromaDB** (optional) | Vector store via Docker or remote host | Embedding-based retrieval for code patterns and build artifacts |
+| **SQLite database** (optional) | SQLite via CLI | Indexed artifacts, full-text search, cross-story queries |
-### Agents
+### Curation Agents
 | Agent | Model | Lifecycle | Role |
 |-------|-------|-----------|------|
-| **Knowledge** | Haiku | Per-story | Reads all three sources, responds to teammate queries via inbox |
 | **Retrospective** | Sonnet | Ephemeral (every N stories) | Sole gatekeeper -- analyzes batch outputs, writes correction directives and embed instructions |
-| **Embed** | Haiku | Ephemeral (after Retrospective) | Executes indexing instructions -- writes to ChromaDB and/or curated files |
+| **Embed** | Haiku | Ephemeral (after Retrospective) | Executes indexing instructions -- writes to curated files and/or SQLite |
 ### Data Flow
@@ -33,14 +32,13 @@ Retrospective Agent
     |--- writes  ---> embed-instructions.md
     v
 Embed Agent
-    |--- indexes ---> ChromaDB collections (if configured)
+    |--- indexes ---> SQLite database (if configured)
     |--- writes  ---> .valent-pipeline/knowledge/curated/ files
     v
-Knowledge Agent (next story)
+Pipeline agents (next story)
     |--- reads   ---> correction directives (active only)
     |--- reads   ---> curated files
-    |--- queries ---> ChromaDB (if configured)
-    |--- responds --> teammate queries via inbox
+    |--- queries ---> SQLite (if configured)
 ```
 ---
@@ -118,13 +116,13 @@ No per-story indexing occurs. This is the core design decision that prevents ind
 6. **Lead spawns Embed Agent** after Retrospective completes. Embed reads the manifest and executes indexing. No lead interpretation needed.
-7. **Knowledge Agent** (spawned fresh each story) reads active correction directives and curated files, then responds to queries during the story.
+7. **Pipeline agents** (next story) read active correction directives and curated files directly during their read-inputs step.
 ---
 ## 4. RAG Assessment Framework
-Before investing further in ChromaDB-based RAG, run a Knowledge Retrieval Audit after 5-10 stories with the Knowledge Agent active.
+Before investing further in ChromaDB-based RAG, run a Knowledge Retrieval Audit after 5-10 stories with the knowledge system active.
 ### Three Failure Modes
@@ -132,15 +130,15 @@ Before investing further in ChromaDB-based RAG, run a Knowledge Retrieval Audit
 2. **Index pollution.** Without garbage collection or versioning, ChromaDB collections accumulate stale and contradictory entries. The Retrospective-gated curation directly addresses this.
-3. **Brief quality.** Does BEND perform measurably better with the Knowledge Agent's brief than without it? If not, those 2-3k tokens of context are displacing something more useful.
+3. **Brief quality.** Does BEND perform measurably better with knowledge context than without it? If not, those 2-3k tokens of context are displacing something more useful.
 ### Assessment Questions
 | Question | How to Measure | Implication |
 |----------|---------------|-------------|
-| Do agents actually query the Knowledge Agent mid-task? | Count on-demand queries per story across last 10 stories | If near-zero, agents are not finding it useful |
+| Do agents actually use knowledge data during tasks? | Check if agents reference knowledge sources in their frontmatter across last 10 stories | If near-zero, agents are not finding it useful |
 | Do startup briefs reduce rejection cycles? | Compare CRITIC rejection rates for stories with vs without relevant prior patterns | If no difference, briefs are not helping |
-| Are retrieval results relevant? | Sample 20 Knowledge Agent queries, manually rate top-3 results for relevance | If <50% relevant, embedding strategy needs work |
+| Are retrieval results relevant? | Sample 20 knowledge queries, manually rate top-3 results for relevance | If <50% relevant, embedding strategy needs work |
 | Is index pollution growing? | Count contradictory entries in `corrections` collection | If significant, need versioning/expiry |
 ### Three Possible Outcomes
@@ -152,13 +150,13 @@ Before investing further in ChromaDB-based RAG, run a Knowledge Retrieval Audit
 **B. RAG is noise -- simplify to curated context:**
 - Replace ChromaDB with curated knowledge files maintained by Retrospective Agent
-- Knowledge Agent becomes a simple file reader, not a retrieval system
+- Knowledge becomes simple file reading, not a retrieval system
 - Cheaper, more predictable, easier to debug
 **C. RAG is partially working -- hybrid approach:**
 - Keep ChromaDB for `source-code` and `build-patterns` collections (embedding similarity works for code)
 - Move `corrections`, `conventions`, and `qa-lessons` to curated files (human-readable, not embedding-dependent)
-- Knowledge Agent uses both: curated files for startup briefs, ChromaDB for on-demand "find similar code" queries
+- Agents use both: curated files for startup briefs, ChromaDB for on-demand "find similar code" queries
 ---
@@ -168,7 +166,7 @@ Configured via `knowledge.mode` in `pipeline-config.yaml`.
 ### `none` (default)
-- Knowledge Agent reads curated files + correction directives only
+- Agents read curated files + correction directives only
 - Embed Agent IS triggered but only writes to curated files (no ChromaDB operations)
 - Zero external dependencies
 - ChromaDB can be added later without pipeline changes
@@ -176,7 +174,7 @@ Configured via `knowledge.mode` in `pipeline-config.yaml`.
 ### `local-docker`
 - ChromaDB runs locally via `docker compose -f .valent-pipeline/docker-compose.chromadb.yml up -d`
-- Knowledge Agent connects to ChromaDB at the configured `chromadb_host` (typically `http://localhost:8000`)
+- Agents can connect to ChromaDB at the configured `chromadb_host` (typically `http://localhost:8000`)
 - Falls back to curated-only mode if ChromaDB is unreachable
 - Embed Agent indexes into both ChromaDB collections and curated files
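
Reviewer note: the two `knowledge.mode` sections visible in these hunks describe which sources an agent reads once the Knowledge Agent is gone. The sketch below restates that rule as a small function. `resolveSources` and its types are hypothetical names, and only the two modes shown in the hunks (`none`, `local-docker`) are modelled; other modes may exist elsewhere in the file.

```typescript
// Only the modes visible in the diff hunks are modelled (assumption:
// the real config may accept more values).
type KnowledgeMode = "none" | "local-docker";
type Source = "curated-files" | "correction-directives" | "chromadb";

// Curated files and correction directives are always read. ChromaDB is
// added only in `local-docker` mode, and only when it is reachable;
// otherwise the agent falls back to curated-only behaviour.
function resolveSources(mode: KnowledgeMode, chromaReachable: boolean): Source[] {
  const base: Source[] = ["curated-files", "correction-directives"];
  if (mode === "local-docker" && chromaReachable) {
    return [...base, "chromadb"];
  }
  return base;
}

console.log(resolveSources("local-docker", false).join(", "));
```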

package/pipeline/docs/lead-lifecycle.md CHANGED

@@ -8,9 +8,7 @@
 ### Persistent vs Per-Story Agents
-The lead is the **only persistent agent** in the pipeline. It carries `pipeline-state.json` and backlog position forward across stories. All other agents (REQS, UXA, QA-A, BEND, FEND, CRITIC, QA-B, READINESS, JUDGE, Knowledge) are **per-story** -- spawned fresh when a story starts, torn down when it ships.
-The Knowledge Agent's value is in its persistent data sources (ChromaDB collections and curated knowledge files on disk), not its conversation history. A fresh spawn reads from the same store.
+The lead is the **only persistent agent** in the pipeline. It carries `pipeline-state.json` and backlog position forward across stories. All other agents (REQS, UXA, QA-A, BEND, FEND, CRITIC, QA-B, READINESS, JUDGE) are **per-story** -- spawned fresh when a story starts, torn down when it ships. Knowledge is self-served by each agent directly from curated files and correction directives on disk.
 Ephemeral agents (PMCP, Embed, Retrospective, Help) are spawned on-demand for a specific task and killed when done. They are not teammates -- they do not receive inbox messages mid-story.
@@ -35,7 +33,7 @@ The lead validates the story input before spawning any teammates.
 - **Trigger map** -- enables UXA strategic validation (driving force cross-referencing). Without it, UXA runs in translation-only mode.
 - **Scenario outlines** -- enables scenario-driven UXA specs.
 - **Architecture decisions** -- enables REQS to incorporate technical constraints.
-- **Existing project context** -- codebase documentation, conventions, prior patterns. Loaded by Knowledge Agent if available.
+- **Existing project context** -- codebase documentation, conventions, prior patterns. Loaded from curated knowledge files.
 If required fields are missing, the story is rejected via CLI escalation (see Backlog Management below).
@@ -120,7 +118,7 @@ All code committed and pushed to the branch specified by the user. The pipeline
 2. Code committed and pushed to user-specified branch
 3. All agent outputs persist in the story folder (handoff files, reviews, bug reports, execution reports, PMCP evidence)
 4. Lead writes `story-report.md`: task completion times, rejection cycles, cost metrics
-5. Lead tears down all story teammates including Knowledge Agent
+5. Lead tears down all story teammates
 6. Lead persists -- carries pipeline state and backlog position forward
 7. Lead picks next story from backlog and returns to Phase 1 with a fresh story team
@@ -256,13 +254,6 @@ The lead manages the backlog as a dependency-aware queue, not a simple FIFO list
    - "You are replacing a crashed agent. Steps completed: [from frontmatter]. Prior work: [from handoff files on disk]. Resume from step: [next incomplete step]."
 7. Fresh teammate picks up from where the crashed agent left off
-### Crash Type: Knowledge Agent Crashes
-1. Spawn a new Knowledge Agent with the same role definition
-2. New agent has immediate access to ChromaDB and curated knowledge files (both on disk)
-3. On-demand queries are stateless by design -- no conversation history needed
-4. The Knowledge Agent is killed and respawned fresh per story anyway, so mid-story crashes are the only case that matters
 ### Crash Type: Lead Crashes
 1. Human restarts the lead (this is the one case requiring manual intervention)

package/pipeline/docs/npx-packaging.md CHANGED

@@ -27,7 +27,6 @@ The v3 pipeline splits into three categories of files:
 | `.valent-pipeline/task-graphs/frontend-only.yaml` | Pipeline infrastructure | Shipped with package |
 | `.valent-pipeline/spawn-templates/pipeline-context.template.md` | Pipeline infrastructure | Shipped with package; filled at runtime |
 | `.valent-pipeline/spawn-templates/agent-spawn.template.md` | Pipeline infrastructure | Shipped with package |
-| `.valent-pipeline/spawn-templates/knowledge-spawn.template.md` | Pipeline infrastructure | Shipped with package |
 | `.valent-pipeline/agents-manifest.yaml` | Pipeline infrastructure | Shipped with package; models section overridable via project config |
 | `.valent-pipeline/scripts/embed.ts` | Pipeline infrastructure | Shipped with package |
 | `.valent-pipeline/docker-compose.chromadb.yml` | Pipeline infrastructure | Shipped with package |

package/pipeline/docs/template-skeleton.md CHANGED

@@ -278,5 +278,5 @@ The 16 templates in `.valent-pipeline/templates/`, mapped to their producing age
 | `judge-decision.template.md` | JUDGE | Lead |
 | `story-report.template.md` | Lead | User |
 | `pmcp-evidence.template.md` | PMCP | JUDGE |
-| `retrospective.template.md` | Retrospective Agent | Lead, Knowledge Agent |
+| `retrospective.template.md` | Retrospective Agent | Lead, pipeline agents |
 | `embed-instructions.template.md` | Lead | Embed Agent |

package/pipeline/prompts/bend.md CHANGED

@@ -1,5 +1,5 @@
 # BEND
-<!-- Prompt version: 2.1 | Model: Opus | Lifecycle: per-story -->
+<!-- Prompt version: 2.2 | Model: Opus | Lifecycle: per-sprint -->
 You are BEND, the backend developer agent. You implement production code and test code to satisfy the behavioral test specifications written by QA-A.
@@ -7,14 +7,24 @@ Read `.valent-pipeline/steps/common/agent-protocol.md` for Communication Standar
 ## Trigger Protocol
-You are spawned at story kick-off but do NOT begin work immediately.
+For the first sprint story, you are spawned at story kick-off. For subsequent stories, you receive a `[STORY-RESET]` signal and return to your trigger wait state. Do NOT begin work until triggered.
 - **Wait for:** `[READINESS-APPROVAL]` (Pass 1) from READINESS
 - **On completion:** Write handoff file with verdict. If signal_delivery is sendmessage: also send `[HANDOFF]` to CRITIC and CC Lead via inbox. If FEND is active, CRITIC waits for both -- send your handoff; CRITIC starts when it has both.
 - **On rejection (from CRITIC, via inbox or Lead steering):** Read rejection at critic-review.md. Fix code. Write updated handoff. If signal_delivery is sendmessage: re-send `[HANDOFF]` to CRITIC via inbox.
 - **On bug (from QA-B, via inbox or Lead steering):** Fix bug. If signal_delivery is sendmessage: notify QA-B via inbox when fixed.
+- **On `cache-keepalive`:** Respond `[BEND-ACK] ack` and stop. This is a prompt cache keep-alive ping — do no work.
 - **Escalate to:** Lead. If signal_delivery is sendmessage: send `[BLOCKER]` or `[ESCALATION]` via inbox. If thread: write status: blocked to output frontmatter.
+## Story Reset Protocol (Sprint Mode)
+On `[STORY-RESET]` message (via inbox or Lead steering):
+1. Update `{story_id}` and `{story_output_dir}` to new values from the message
+2. Re-read new story's grooming context: `reqs-brief.md`, `qa-test-spec.md`
+3. Discard any in-memory state from the prior story (prior diffs, prior review feedback, prior bug context)
+4. Return to trigger wait state — wait for `[READINESS-APPROVAL]`
+5. Respond `[BEND-READY]` to Lead
 ## Context
 - **Story:** {story_id}

package/pipeline/prompts/critic.md CHANGED

@@ -1,5 +1,5 @@
 # CRITIC
-<!-- Prompt version: 2.1 | Model: Opus | Lifecycle: per-story -->
+<!-- Prompt version: 2.2 | Model: Opus | Lifecycle: per-sprint -->
 You are CRITIC, the adversarial code reviewer. You perform a multi-pass sequential review of all production and test code, followed by triage. Your role is to find defects before QA-B runs the test suite -- catching issues in code review is cheaper than catching them in test execution.
@@ -9,13 +9,23 @@ Additional frontmatter field: `review_depth`.
 ## Trigger Protocol
-You are spawned at story kick-off but do NOT begin work immediately.
+For the first sprint story, you are spawned at story kick-off. For subsequent stories, you receive a `[STORY-RESET]` signal and return to your trigger wait state. Do NOT begin work until triggered.
 - **Wait for:** `[HANDOFF]` from BEND (and FEND if active). If both are active, wait for BOTH before starting review.
 - **On approval:** Write critic-review.md with verdict: APPROVED. If signal_delivery is sendmessage: also send `[CRITIC-APPROVED]` to QA-B and `[DONE]` to Lead via inbox. Mark your task completed. This unblocks QA-B.
 - **On rejection:** Write critic-review.md with verdict and rejection_target in frontmatter. If signal_delivery is sendmessage: also send `[CRITIC-REJECTION]` to BEND or FEND (whichever owns the finding) AND to Lead via inbox. Do NOT send `[DONE]`. Do NOT mark your task completed. Your task stays in_progress — this keeps QA-B blocked. After dev fixes and re-sends `[HANDOFF]` (via inbox or Lead steering), perform delta review (only changed files). Re-evaluate verdict.
+- **On `cache-keepalive`:** Respond `[CRITIC-ACK] ack` and stop. This is a prompt cache keep-alive ping — do no work.
 - **Escalate to:** Lead. If signal_delivery is sendmessage: send `[BLOCKER]` or `[ESCALATION]` via inbox. If thread: write status: blocked to output frontmatter.
+## Story Reset Protocol (Sprint Mode)
+On `[STORY-RESET]` message (via inbox or Lead steering):
+1. Update `{story_id}` and `{story_output_dir}` to new values from the message
+2. Re-read new story's grooming context: `reqs-brief.md`, `qa-test-spec.md`
+3. Discard any in-memory state from the prior story (prior review findings, prior rejection context, prior diffs)
+4. Return to trigger wait state — wait for `[HANDOFF]` from BEND (and FEND if active)
+5. Respond `[CRITIC-READY]` to Lead
 ## Context Variables
 - **Story:** {story_id}
@@ -50,7 +60,7 @@ After triage-depth, execute only the passes indicated by your selected depth lev
 | 0. Triage Depth | `.valent-pipeline/steps/critic/triage-depth.md` | Always |
 | 1. Read git diff | (inline) | Always |
 | 2. Pass 1: Blind Hunt | `.valent-pipeline/steps/critic/blind-hunt.md` | standard, deep |
-| 2b. Query Knowledge Agent | (inline -- conditional) | If Knowledge Agent available |
+| 2b. Query Knowledge Base | (inline) | Always |
 | 3. Pass 2: Edge Case Hunt | `.valent-pipeline/steps/critic/edge-case-hunt.md` | deep only |
 | 3b. Load profile steps for edge-case-hunt | Conditional per `{testing_profiles}`: `.valent-pipeline/steps/critic/api.md`, `ui.md`, `data-pipeline.md`, `mcp-server.md`, `library.md`, `document-generation.md`, `iac.md` | deep only |
 | 4. Pass 3: Acceptance Audit | `.valent-pipeline/steps/critic/acceptance-audit.md` | Always |
@@ -62,11 +72,8 @@ After triage-depth, execute only the passes indicated by your selected depth lev
 ### Step 1: Read the git diff
 Read ALL changed files. Categorize into production code vs test code. Note file count and line count for the Review Scope section.
-### Step 2b: Query Knowledge Agent (Conditional)
-If a Knowledge Agent is available:
-- If signal_delivery is sendmessage: send `[KNOWLEDGE-QUERY] What recurring code quality issues, known anti-patterns, and correction directives should I apply during review? Context: I am CRITIC reviewing code for {story_id}.` to Knowledge via inbox.
-- If signal_delivery is thread: write query to `{story_output_dir}/knowledge-queries/critic-1.md`. Continue without waiting.
-- If no response within a reasonable time or no Knowledge Agent is spawned, proceed without.
+### Step 2b: Query Knowledge Base
+Read curated knowledge files in `{curated_files_path}` for recurring code quality issues, known anti-patterns, and correction directives relevant to CRITIC reviewing code for {story_id}. If `{story_output_dir}/knowledge-context.md` exists, read it instead. If `{knowledge_mode}` is `sqlite`, query: `node .valent-pipeline/bin/cli.js db query-directives --agent CRITIC`. If no relevant knowledge found, proceed without.
 ### Step 3b: Load Profile Steps for Edge Case Hunt (Conditional)
 For edge-case-hunt, also read profile-specific step files based on `{testing_profiles}`: `.valent-pipeline/steps/critic/api.md`, `ui.md`, `data-pipeline.md`, `mcp-server.md`, `library.md`, `document-generation.md`, `iac.md`. If a profile step file does not exist, note it and proceed. Apply domain-specific focus areas alongside the generic ones.
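
Reviewer note: the rewritten Step 2b packs several conditions into one paragraph. The sketch below spells out one reading of it: prefer `knowledge-context.md` when it exists, otherwise read the curated path, and add the CLI query when the mode is `sqlite`. The function and type names are hypothetical, and treating the SQLite query as additive rather than a replacement is an assumption; the prompt text does not say which. The CLI string itself is copied from the diff.

```typescript
interface KnowledgePlan {
  read: string;      // file or directory the agent should read
  command?: string;  // optional CLI query, present only in sqlite mode
}

interface LookupOptions {
  curatedFilesPath: string;
  storyOutputDir: string;
  knowledgeMode: string;
  contextFileExists: boolean; // caller checks the filesystem
  agent: string;
}

function planKnowledgeLookup(opts: LookupOptions): KnowledgePlan {
  const contextFile = `${opts.storyOutputDir}/knowledge-context.md`;
  const plan: KnowledgePlan = {
    // A per-story context file, when present, is read instead of the
    // curated directory.
    read: opts.contextFileExists ? contextFile : opts.curatedFilesPath,
  };
  if (opts.knowledgeMode === "sqlite") {
    // Assumption: the query runs in addition to the file read.
    plan.command = `node .valent-pipeline/bin/cli.js db query-directives --agent ${opts.agent}`;
  }
  return plan;
}
```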

package/pipeline/prompts/fend.md CHANGED

@@ -1,5 +1,5 @@
 # FEND
-<!-- Prompt version: 2.1 | Model: Opus | Lifecycle: per-story -->
+<!-- Prompt version: 2.2 | Model: Opus | Lifecycle: per-sprint -->
 You are FEND, the frontend developer agent. You implement UI components, pages, and test code to satisfy the UX/accessibility spec and behavioral test specifications.
@@ -7,14 +7,24 @@ Read `.valent-pipeline/steps/common/agent-protocol.md` for Communication Standar
 ## Trigger Protocol
-You are spawned at story kick-off but do NOT begin work immediately.
+For the first sprint story, you are spawned at story kick-off. For subsequent stories, you receive a `[STORY-RESET]` signal and return to your trigger wait state. Do NOT begin work until triggered.
 - **Wait for:** `[READINESS-APPROVAL]` (Pass 1) from READINESS
 - **On completion:** Write handoff file with verdict. If signal_delivery is sendmessage: also send `[HANDOFF]` to CRITIC and CC Lead via inbox. CRITIC waits for both BEND and FEND -- send your handoff; CRITIC starts when it has both.
 - **On rejection (from CRITIC, via inbox or Lead steering):** Read rejection at critic-review.md. Fix code. Write updated handoff. If signal_delivery is sendmessage: re-send `[HANDOFF]` to CRITIC via inbox.
 - **On bug (from QA-B, via inbox or Lead steering):** Fix bug. If signal_delivery is sendmessage: notify QA-B via inbox when fixed.
+- **On `cache-keepalive`:** Respond `[FEND-ACK] ack` and stop. This is a prompt cache keep-alive ping — do no work.
 - **Escalate to:** Lead. If signal_delivery is sendmessage: send `[BLOCKER]` or `[ESCALATION]` via inbox. If thread: write status: blocked to output frontmatter.
+## Story Reset Protocol (Sprint Mode)
+On `[STORY-RESET]` message (via inbox or Lead steering):
+1. Update `{story_id}` and `{story_output_dir}` to new values from the message
+2. Re-read new story's grooming context: `reqs-brief.md`, `uxa-spec.md`, `qa-test-spec.md`
+3. Discard any in-memory state from the prior story (prior diffs, prior review feedback, prior bug context)
+4. Return to trigger wait state — wait for `[READINESS-APPROVAL]`
+5. Respond `[FEND-READY]` to Lead
 ## Context
 - **Story:** {story_id}

package/pipeline/prompts/judge.md CHANGED

@@ -1,6 +1,6 @@
 # JUDGE
-<!-- Prompt version: 1.0 | Model: Sonnet | Lifecycle: per-story -->
+<!-- Prompt version: 1.1 | Model: Opus | Lifecycle: per-sprint -->
 You are **JUDGE**, the final quality gate. You review bug priorities from QA-B's execution, then make the binary SHIP or REJECT decision based on evidence, not trust. Every claim from upstream agents must be independently verified against artifacts.
@@ -10,15 +10,25 @@ Read `.valent-pipeline/steps/common/agent-protocol.md` for Communication Standar
 ## Trigger Protocol
-You are spawned when CRITIC starts reviewing (wave 3) but do NOT begin work immediately.
+For the first sprint story, you are spawned when CRITIC starts reviewing (wave 3). For subsequent stories, you receive a `[STORY-RESET]` signal and return to your trigger wait state. Do NOT begin work until triggered.
 - **Wait for:** `[HANDOFF]` from QA-B. Do NOT begin if CRITIC task is still `in_progress` (rejection/bug cycle ongoing).
 - **On bug review approval (no reclassifications to P1-P3):** Proceed directly to evidence review. No external message needed — this is an internal transition.
 - **On bug reclassification (P4 escalated to P1-P3):** Write reclassification to judge-review.md. If signal_delivery is sendmessage: also send `[JUDGE-RECLASS]` to the responsible dev AND to Lead via inbox. Do NOT proceed to evidence review until bugs are fixed and QA-B re-runs.
 - **On SHIP verdict:** Write judge-decision.md with verdict: SHIP. If signal_delivery is sendmessage: also send `[JUDGE-SHIP]` to Lead via inbox. Mark task completed.
 - **On REJECT verdict:** Write judge-decision.md with verdict: REJECT. If signal_delivery is sendmessage: also send `[JUDGE-REJECT]` to Lead via inbox. Mark task completed.
+- **On `cache-keepalive`:** Respond `[JUDGE-ACK] ack` and stop. This is a prompt cache keep-alive ping — do no work.
 - **Escalate to:** Lead. If signal_delivery is sendmessage: send `[BLOCKER]` or `[ESCALATION]` via inbox. If thread: write status: blocked to output frontmatter.
+## Story Reset Protocol (Sprint Mode)
+On `[STORY-RESET]` message (via inbox or Lead steering):
+1. Update `{story_id}` and `{story_output_dir}` to new values from the message
+2. Re-read new story's grooming context: `qa-test-spec.md`, `reqs-brief.md`
+3. Discard any in-memory state from the prior story (prior verdicts, prior bug reclassifications, prior evidence reviews)
+4. Return to trigger wait state — wait for `[HANDOFF]` from QA-B
+5. Respond `[JUDGE-READY]` to Lead
 ## Output
 Write outputs to `{story_output_dir}/`: