npm - @sandrinio/vbounce - Versions diffs - 1.9.0 → 2.0.0 - Mend

@sandrinio/vbounce 1.9.0 → 2.0.0


Files changed (24)

package/README.md +287 -15
package/bin/vbounce.mjs +21 -0
package/brains/AGENTS.md +59 -21
package/brains/CHANGELOG.md +22 -0
package/brains/CLAUDE.md +121 -27
package/brains/GEMINI.md +60 -23
package/brains/claude-agents/developer.md +6 -4
package/brains/copilot/copilot-instructions.md +5 -0
package/brains/cursor-rules/vbounce-process.mdc +3 -0
package/brains/windsurf/.windsurfrules +5 -0
package/package.json +1 -1
package/scripts/close_sprint.mjs +32 -1
package/scripts/post_sprint_improve.mjs +486 -0
package/scripts/suggest_improvements.mjs +206 -43
package/skills/agent-team/SKILL.md +48 -25
package/skills/agent-team/references/discovery.md +97 -0
package/skills/doc-manager/SKILL.md +142 -18
package/skills/improve/SKILL.md +149 -58
package/skills/lesson/SKILL.md +14 -0
package/templates/epic.md +19 -16
package/templates/spike.md +143 -0
package/templates/sprint.md +32 -12
package/templates/sprint_report.md +6 -4
package/templates/story.md +23 -8

package/README.md CHANGED

@@ -8,9 +8,162 @@ V-Bounce Engine turns AI assistants — Claude Code, Cursor, Gemini, Copilot, Co
 ---
-## How It Works
+## The Problem
-V-Bounce Engine is built around a **Context Loop** — a closed feedback system that makes agents smarter with each sprint.
+AI coding agents are powerful — but without structure, they create expensive chaos:
+- **No accountability.** The agent writes code, but nobody reviews it against requirements before it ships. Bugs that a junior engineer would catch survive to production.
+- **Invisible progress.** You ask "how's the feature going?" and the only answer is "the agent is still running." No milestones, no intermediate artifacts, no way to course-correct mid-sprint.
+- **No institutional memory.** Every session starts from zero. The agent makes the same architectural mistake it made last week because nothing captures what went wrong.
+- **Rework cycles.** Without quality gates, bad code compounds. A missed requirement discovered late costs 10x more to fix than one caught early.
+- **Risk blindness.** There's no structured way to assess what could go wrong before the agent starts building.
+V-Bounce Engine solves this by wrapping AI agents in the same discipline that makes human engineering teams reliable: planning documents, role-based reviews, automated gates, and a learning loop that compounds knowledge across sprints.
+---
+## Built-in Guardrails
+Every risk that keeps you up at night has a specific mechanism that catches it:
+| Risk | What catches it |
+|------|----------------|
+| Agent ships code that doesn't match requirements | **QA gate** — validates every story against acceptance criteria before merge |
+| Architectural drift over time | **Architect gate** — audits against your ADRs and safe-zone rules on every story |
+| One bad story breaks everything | **Git worktrees** — every story is isolated; failures can't contaminate other work |
+| Agent gets stuck in a loop | **3-bounce escalation** — after 3 failed attempts, the story surfaces to a human |
+| Scope creep on "quick fixes" | **Hotfix hard-stop** — Developer must stop if a fix touches more than 2 files |
+| Same mistakes keep happening | **LESSONS.md** — agents read accumulated mistakes before writing future code |
+| Silent regressions | **Root cause tagging** — every failure is tagged and tracked across sprints |
+| Framework itself becomes stale | **Self-improvement skill** — analyzes friction patterns and proposes changes (with your approval) |
+---
+## Planning With V-Bounce
+V-Bounce separates planning into two layers: **what to build** and **how to ship it**. The AI is your planning partner — not a tool you invoke with commands.
+### Product Planning — What to Build
+Just talk to the AI. Say "plan a feature for X" or "create an epic for payments" and it handles the rest — reading upstream documents, researching your codebase, and drafting planning documents.
+A document hierarchy that mirrors how product teams already think:
+```
+Charter  (WHY — vision, principles, constraints)
+  → Roadmap  (WHAT/WHEN — releases, milestones, architecture decisions)
+    → Epic  (scoped WHAT — a feature with clear boundaries)
+      → Story  (HOW — implementation spec with acceptance criteria)
+```
+**You write the top levels. The AI builds the bottom — informed by your actual codebase.**
+When creating Epics, the AI researches your codebase to fill Technical Context with real file paths and verified dependencies — not guesses. When decomposing Epics into Stories, the AI reads affected files, explores architecture patterns, and creates small, focused stories by deliverable (vertical slices), not by layer.
+Every document includes an **ambiguity score**:
+- 🔴 High — requirements unclear, blocked from development
+- 🟡 Medium — tech TBD but logic is clear, safe to plan
+- 🟢 Low — fully specified, ready to build
+No level can be skipped. This prevents the most common AI failure mode: building the wrong thing because requirements were vague.
+### Execution Planning — How to Ship It
+Once you know *what* to build, three documents govern *how* it gets delivered:
+| Document | Scope | Who uses it | What it tracks |
+|----------|-------|-------------|----------------|
+| **Delivery Plan** | A full release (multiple sprints) | PM | Which Epics are included, project window (start/end dates), high-level backlog prioritization, escalated/parked stories |
+| **Sprint Plan** | One sprint (typically 1 week) | Team Lead + PM | Active story scope, context pack readiness checklists, execution strategy (parallel vs sequential phases), dependency chains, risk flags, and a live execution log |
+| **Risk Registry** | Cross-cutting (all levels) | PM + Architect | Active risks with likelihood/impact scoring, phase-stamped analysis log, mitigations, and resolution history |
+**How they connect:**
+```
+Delivery Plan  (the milestone — "we're shipping auth + payments by March 30")
+  → Sprint Plan  (this week — "stories 01-03 in parallel, 04 depends on 01")
+       ↑
+  Risk Registry  (cross-cutting — reviewed at every sprint boundary)
+```
+The **Delivery Plan** is updated only at sprint boundaries. The **Sprint Plan** is the single source of truth during active execution — every story state transition is recorded there. At sprint end, the Sprint Plan's execution log becomes the skeleton for the Sprint Report automatically.
+The **Sprint Plan** also includes a **Context Pack Readiness** checklist for each story — a preflight check ensuring the spec is complete, acceptance criteria are defined, and ambiguity is low before any code is written. If a story isn't ready, it stays in Refinement.
+---
+## Reports and Visibility
+V-Bounce generates structured reports at every stage — designed to answer stakeholder questions without requiring anyone to read code:
+| Report | When it's generated | What it answers |
+|--------|-------------------|-----------------|
+| **Implementation Report** | After each story is built | What was built? What decisions were made? What tests were added? |
+| **QA Report** | After validation | Does the implementation match the acceptance criteria? What failed? |
+| **Architect Report** | After audit | Does this align with our architecture? Any ADR violations? |
+| **Sprint Report** | End of sprint | What shipped? What bounced? What's the correction tax? Lessons learned? |
+| **Release Report** | After merge | What went to production? Environment changes? Post-merge validations? |
+| **Scribe Report** | After documentation pass | What product docs were created, updated, or flagged as stale? |
+**You don't need to read code to manage the sprint.** The reports surface exactly what a PM or PO needs to make decisions.
+---
+## What You Can Measure
+V-Bounce tracks metrics that map directly to product and delivery health:
+| Metric | What it tells you | Action when it's bad |
+|--------|------------------|---------------------|
+| **Bounce Rate (QA)** | How often code fails acceptance criteria | Stories may have vague requirements — tighten acceptance criteria |
+| **Bounce Rate (Architect)** | How often code violates architecture rules | ADRs may be unclear, or the agent needs better context |
+| **Correction Tax** | 0% = agent delivered autonomously, 100% = human rewrote everything | High tax means the agent needs better guidance (Charter, Roadmap, or Skills) |
+| **Root Cause Distribution** | Why things fail — `missing_tests`, `adr_violation`, `spec_ambiguity`, etc. | Invest in the category that fails most often |
+| **Escalation Rate** | How often stories hit the 3-bounce limit | Chronic escalation signals structural issues in planning docs |
+| **Sprint Velocity** | Stories completed per sprint | Track trend over time — should improve as LESSONS.md grows |
+Run `vbounce trends` to see cross-sprint analysis. Run `vbounce suggest` for AI-generated improvement recommendations.
+---
+## How a Sprint Flows
+Here's what a sprint looks like from the product side — no terminal commands, no code:
+**Phase 1 — Planning**
+You talk to the AI about what to build. The AI creates Epics and Stories by reading upstream documents and researching your codebase. Ambiguity, risks, and open questions are surfaced and discussed collaboratively.
+**Phase 2 — Sprint Planning**
+You and the AI decide what goes into the sprint together. The AI reads the backlog, proposes scope, and surfaces blockers — open questions, unresolved ambiguity, dependency risks, edge cases. You discuss, adjust, and confirm. **No sprint starts without your explicit confirmation.** The Sprint Plan is mandatory.
+**Phase 3 — The Bounce**
+The AI team works autonomously. For each Story:
+1. The **Developer** builds the feature in isolation (with E2E tests, not just unit tests)
+2. The **QA agent** checks: does the code meet the acceptance criteria?
+3. The **Architect agent** checks: does the code follow our architecture rules?
+4. If either check fails, the work "bounces" back to the Developer with a tagged reason
+5. After 3 bounces, the story escalates — the AI presents root causes and options (re-scope, split, spike, or remove), and you decide
+Lessons are recorded **immediately** after each story merges, not deferred to sprint close.
+**Phase 4 — Review**
+The Sprint Report lands. It tells you:
+- What shipped and what didn't
+- How many bounces each story took (and why)
+- The correction tax (how much human intervention was needed)
+- Test counts per story
+- Lessons already captured during the sprint
+- Recommendations for process improvements
+You review, approve the release, and the sprint archives itself. The next sprint starts smarter because the agents now carry forward everything they learned.
+---
+## Continuous Improvement
+Most AI coding setups are stateless — every session starts from scratch. V-Bounce is the opposite.
+The **Context Loop** is a closed feedback system that makes your AI team measurably better over time:
 ```
 Plan  ──>  Build  ──>  Bounce  ──>  Document  ──>  Learn
@@ -22,15 +175,121 @@ Plan  ──>  Build  ──>  Bounce  ──>  Document  ──>  Learn
                   Next sprint reads it all
 ```
-**Plan.** The Team Lead writes requirements using structured templates (Charter, Epic, Story) before any code is written.
+After each sprint:
+- **LESSONS.md** captures every mistake — agents read this before writing future code
+- **Trend analysis** spots recurring patterns (e.g., "auth-related stories bounce 3x more than average")
+- **Self-improvement pipeline** analyzes friction and proposes concrete framework changes
+- **Scribe** keeps product documentation in sync with actual code
+Sprint 1 might have a 40% bounce rate. By Sprint 5, that number drops — because the agents have accumulated context about your codebase, your architecture decisions, and your team's standards.
+### The Self-Improvement Pipeline
+When a sprint closes (`vbounce sprint close`), an automated pipeline analyzes what went wrong and proposes how to fix the framework itself:
+```
+Sprint Close
+  │
+  ├── Trend Analysis         → Cross-sprint bounce patterns
+  │
+  ├── Retro Parser           → Reads §5 Framework Self-Assessment tables
+  │                            from the Sprint Report
+  │
+  ├── Lesson Analyzer        → Classifies LESSONS.md rules by what they
+  │                            can become: gate checks, scripts, template
+  │                            fields, or permanent agent rules
+  │
+  ├── Recurrence Detector    → Cross-references archived sprint reports
+  │                            to find findings that keep coming back
+  │
+  ├── Effectiveness Checker  → Did last sprint's improvements actually
+  │                            resolve their target findings?
+  │
+  └── Improvement Suggestions → Human-readable proposals with impact levels
+```
+Every proposal gets an **impact level** so you know what to fix first:
+| Level | Label | Meaning | When to fix |
+|-------|-------|---------|-------------|
+| **P0** | Critical | Blocks agent work or causes incorrect output | Before next sprint |
+| **P1** | High | Causes rework — bounces, wasted tokens, repeated manual steps | This improvement cycle |
+| **P2** | Medium | Friction that slows agents but doesn't block | Within 2 sprints |
+| **P3** | Low | Polish — nice-to-have | Batch when convenient |
+### Lessons Become Automation
-**Build.** The Developer agent implements each Story in an isolated git worktree and submits an Implementation Report.
+The pipeline doesn't just track lessons — it classifies each one by what it can become:
-**Bounce.** The QA agent validates against acceptance criteria. The Architect agent audits against your architecture rules. If either fails, the work bounces back to the Developer — up to 3 times before escalating to you. Every failure is tagged with a root cause (`missing_tests`, `adr_violation`, `spec_ambiguity`, etc.) for trend analysis.
+| Lesson pattern | Becomes | Example |
+|---------------|---------|---------|
+| "Always check X", "Never use Y" | **Gate check** — automated grep/lint rule | "Never import from internal modules" → pre-gate grep pattern |
+| "Run X before Y" | **Script** — validation step | "Run type-check before QA" → added to pre_gate_runner.sh |
+| "Include X in the story" | **Template field** — required section | "Include rollback plan" → new field in story template |
+| General behavioral rules (3+ sprints old) | **Agent config** — permanent brain rule | "Always check for N+1 queries" → graduated to Architect config |
-**Document.** After merge, the Scribe agent maps the actual codebase into semantic product documentation using [vdoc](https://github.com/sandrinio/vdoc) (optional).
+This means your framework evolves organically: agents report friction, the pipeline classifies it, you approve the fix, and the next sprint runs smoother. No manual analysis required.
-**Learn.** Sprint mistakes are recorded in `LESSONS.md`. All agents read it before writing future code. The framework also tracks its own performance — bounce rates, correction tax, recurring failure patterns — and suggests improvements to its own templates and skills.
+Run `vbounce improve S-XX` anytime to trigger the pipeline on demand.
+---
+## Is V-Bounce Right For You?
+**Best fit:**
+- Teams using AI agents for production code (not just prototypes)
+- Projects with clear requirements that can be expressed as acceptance criteria
+- Codebases where architectural consistency matters
+- Teams that want to scale AI usage without losing quality control
+**Less ideal for:**
+- One-off scripts or throwaway prototypes (overkill)
+- Exploratory research with no defined requirements
+- Projects where the entire team is deeply embedded in every code change anyway
+**Minimum setup:** One person who can run `npx` commands + one person who can write a Charter and Epics. That's it.
+---
+## Roles and Responsibilities
+### Human
+You own the planning and the final say. The agents never ship without your approval.
+| Responsibility | What it involves |
+|---------------|-----------------|
+| **Set vision and constraints** | Write the Charter and Roadmap — define what to build and what's off-limits |
+| **Define requirements** | Break Roadmap into Epics and Stories with acceptance criteria |
+| **Review and approve** | Read sprint reports, approve releases, intervene on escalations |
+| **Tune agent performance** | Adjust brain files, skills, and ADRs based on trend data and bounce patterns |
+| **Install and configure** | Run the installer, verify setup with `vbounce doctor` |
+### Agent — Team Lead (Orchestrator)
+The Team Lead reads your planning documents and coordinates the entire sprint. It never writes code — it delegates, tracks state, and generates reports.
+| Responsibility | What it involves |
+|---------------|-----------------|
+| **Sprint orchestration** | Assigns stories, manages state transitions, enforces the bounce loop |
+| **Agent delegation** | Spawns Developer, QA, Architect, DevOps, and Scribe agents as needed |
+| **Report routing** | Reads each agent's output and decides the next step (pass, bounce, escalate) |
+| **Escalation** | Surfaces stories to the human after 3 failed bounces |
+| **Sprint reporting** | Consolidates execution data into Sprint Reports and Release Reports |
+### Agent — Specialists (Developer, QA, Architect, DevOps, Scribe)
+Five specialist agents, each with a single job and strict boundaries:
+| Agent | What it does | Constraints |
+|-------|-------------|-------------|
+| **Developer** | Implements stories in isolated worktrees, submits implementation reports | Works only in its assigned worktree |
+| **QA** | Validates code against acceptance criteria | Read-only — cannot modify code |
+| **Architect** | Audits against ADRs, architecture rules, and safe-zone boundaries | Read-only — cannot modify code |
+| **DevOps** | Merges passing stories into the sprint branch | Only acts after both gates pass |
+| **Scribe** | Generates and maintains product documentation from the actual codebase | Only runs after merge |
+One person can fill the entire human side. The framework scales to the team you have.
 ---
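
The impact-level table added in this hunk (P0 to P3) implies an ordering for improvement proposals. A minimal JavaScript sketch of that ordering follows. It assumes a hypothetical proposal shape (`{ id, impact }`); the real layout of `.bounce/improvement-manifest.json` is not shown in this diff, so `prioritize`, `IMPACT`, and the field names are illustrative only.

```javascript
// Impact levels as listed in the README table added in 2.0.0.
const IMPACT = {
  P0: { label: 'Critical', fixBy: 'Before next sprint' },
  P1: { label: 'High', fixBy: 'This improvement cycle' },
  P2: { label: 'Medium', fixBy: 'Within 2 sprints' },
  P3: { label: 'Low', fixBy: 'Batch when convenient' },
};
const ORDER = Object.keys(IMPACT); // ['P0', 'P1', 'P2', 'P3']

// Hypothetical helper (not part of vbounce): return a copy of the
// proposals sorted most-urgent first, leaving the input untouched.
function prioritize(proposals) {
  return [...proposals].sort(
    (a, b) => ORDER.indexOf(a.impact) - ORDER.indexOf(b.impact)
  );
}

// Example input; the ids and the object shape are made up for illustration.
const queue = prioritize([
  { id: 'tighten-story-template', impact: 'P2' },
  { id: 'fix-broken-gate-script', impact: 'P0' },
  { id: 'add-type-check-step', impact: 'P1' },
]);
for (const p of queue) {
  const level = IMPACT[p.impact];
  console.log(`${p.impact} ${level.label}: ${p.id} (${level.fixBy})`);
}
```

Sorting on an explicit `ORDER` list rather than on the label strings keeps the ordering correct if the labels are ever reworded.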
@@ -198,6 +457,7 @@ vbounce validate ready STORY-ID       # Pre-bounce readiness gate
 # Self-improvement
 vbounce trends                        # Cross-sprint trend analysis
 vbounce suggest S-01                  # Generate improvement suggestions
+vbounce improve S-01                  # Full self-improvement pipeline
 # Health check
 vbounce doctor                        # Verify setup
@@ -220,7 +480,9 @@ product_plans/                   # Created when you start planning
 .bounce/                         # Created on first sprint init
   state.json                     #   Machine-readable sprint state (crash recovery)
   reports/                       #   QA and Architect bounce reports
-  improvement-log.md             #   Tracked improvement suggestions
+  improvement-manifest.json      #   Machine-readable improvement proposals (auto-generated)
+  improvement-suggestions.md     #   Human-readable suggestions with impact levels (auto-generated)
+  improvement-log.md             #   Applied/rejected/deferred improvement tracking
 .worktrees/                      # Git worktrees for isolated story branches
@@ -229,18 +491,28 @@ LESSONS.md                       # Accumulated mistakes — agents read this bef
 ---
-## End-of-Sprint Reports
-When a sprint concludes, V-Bounce Engine generates three structured reports:
-- **Sprint Report** — what was delivered, execution metrics (tokens, cost, bounce rates), story results, lessons learned, and a retrospective.
-- **Release Report** — the DevOps agent's merge log, environment changes, and post-merge validations.
-- **Scribe Report** — which product documentation was created, updated, or flagged as stale.
+## Glossary
+| Term | Definition |
+|------|-----------|
+| **Bounce** | When a story fails a quality gate (QA or Architect) and gets sent back to the Developer for fixes |
+| **Bounce Rate** | Percentage of stories that fail a gate on the first attempt |
+| **Context Loop** | The closed feedback cycle: Plan → Build → Bounce → Document → Learn → next sprint |
+| **Correction Tax** | How much human intervention a story needed — 0% is fully autonomous, 100% means a human rewrote it |
+| **Escalation** | When a story hits the 3-bounce limit and surfaces to a human for intervention |
+| **Gate** | An automated quality checkpoint — QA validates requirements, Architect validates structure |
+| **Hotfix Path** | A fast track for trivial (L1) changes: 1-2 files, no QA/Architect gates, human verifies directly |
+| **L1–L4** | Complexity labels: L1 Trivial, L2 Standard, L3 Complex, L4 Strategic |
+| **Root Cause Tag** | A label on every bounce failure (e.g., `missing_tests`, `adr_violation`) used for trend analysis |
+| **Scribe** | The documentation agent that maps code into semantic product docs |
+| **Sprint Report** | End-of-sprint summary: what shipped, metrics, bounce analysis, lessons, retrospective |
+| **Worktree** | An isolated git checkout where a single story is implemented — prevents cross-story interference |
 ---
 ## Documentation
+- [System Overview with diagrams](OVERVIEW.md)
 - [Epic template and structure](templates/epic.md)
 - [Hotfix edge cases](docs/HOTFIX_EDGE_CASES.md)
 - [vdoc integration](https://github.com/sandrinio/vdoc)
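
The Glossary added above defines Bounce Rate as the percentage of stories that fail a gate on the first attempt, and Escalation as hitting the 3-bounce limit. The sketch below shows one way those numbers could be derived from per-story counters. The record shape (`qaBounces`, `archBounces`) and the `sprintMetrics` helper are hypothetical, and counting the 3-bounce limit across both gates combined is an assumption; the diff does not say how `vbounce trends` computes these values.

```javascript
const ESCALATION_LIMIT = 3; // "3-bounce limit" from the README

// Hypothetical helper (not part of vbounce): derive sprint-level
// percentages from per-story bounce counters.
function sprintMetrics(stories) {
  // Guard against an empty sprint to avoid dividing by zero.
  const pct = (n) =>
    stories.length === 0 ? 0 : Math.round((n / stories.length) * 100);
  return {
    // A story counts toward a gate's bounce rate if it failed that gate at least once.
    qaBounceRate: pct(stories.filter((s) => s.qaBounces > 0).length),
    architectBounceRate: pct(stories.filter((s) => s.archBounces > 0).length),
    // Assumption: the limit applies to QA and Architect bounces combined.
    escalationRate: pct(
      stories.filter((s) => s.qaBounces + s.archBounces >= ESCALATION_LIMIT).length
    ),
  };
}

// Made-up sprint data for illustration.
const metrics = sprintMetrics([
  { id: 'STORY-01', qaBounces: 0, archBounces: 0 },
  { id: 'STORY-02', qaBounces: 2, archBounces: 1 },
  { id: 'STORY-03', qaBounces: 1, archBounces: 0 },
  { id: 'STORY-04', qaBounces: 0, archBounces: 0 },
]);
console.log(metrics);
```

Correction Tax is left out on purpose: the README gives only its endpoints (0% and 100%), not a formula.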

package/bin/vbounce.mjs CHANGED

@@ -86,6 +86,7 @@ Usage:
   vbounce docs check <sprintId>        Detect stale vdocs and generate Scribe task
   vbounce trends                       Cross-sprint trend analysis
   vbounce suggest <sprintId>           Generate improvement suggestions
+  vbounce improve <sprintId>           Run full self-improvement pipeline
   vbounce doctor                       Validate all configs and state files
 Install Platforms:
@@ -195,6 +196,26 @@ if (command === 'suggest') {
   runScript('suggest_improvements.mjs', args.slice(1));
 }
+// -- improve --
+if (command === 'improve') {
+  rl.close();
+  // Full pipeline: analyze → trends → suggest
+  const sprintArg = args[1];
+  if (!sprintArg) {
+    console.error('Usage: vbounce improve S-XX');
+    process.exit(1);
+  }
+  // Run trends first
+  const trendsPath = path.join(pkgRoot, 'scripts', 'sprint_trends.mjs');
+  if (fs.existsSync(trendsPath)) {
+    console.log('Step 1/2: Running cross-sprint trend analysis...');
+    spawnSync(process.execPath, [trendsPath], { stdio: 'inherit', cwd: process.cwd() });
+  }
+  // Run suggest (which internally runs post_sprint_improve.mjs)
+  console.log('\nStep 2/2: Running improvement analyzer + suggestions...');
+  runScript('suggest_improvements.mjs', [sprintArg]);
+}
 // -- docs --
 if (command === 'docs') {
   rl.close();
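
In the `improve` handler above, the return value of the `spawnSync` call for the trends step is not inspected, so a failure in Step 1/2 would not stop Step 2/2. If that is not intended, the result object could be mapped to an exit code along these lines. This is a sketch only: `exitCodeFor` is a hypothetical helper, and how `runScript` handles failures is not visible in this diff. The `status`, `signal`, and `error` fields are the ones Node's `child_process.spawnSync` returns.

```javascript
// Hypothetical helper (not part of vbounce): collapse the object returned
// by child_process.spawnSync into a single numeric exit code.
function exitCodeFor(result) {
  // `error` is set when the child could not be spawned or timed out.
  if (result.error) return 1;
  // `status` is null when the child was terminated by a signal.
  if (result.status === null) return 1;
  return result.status;
}

// Plain objects shaped like spawnSync results, so the sketch runs
// without launching any child process.
const steps = [
  { name: 'trends', result: { status: 0, signal: null } },
  { name: 'suggest', result: { status: 2, signal: null } },
];
const failedSteps = steps
  .filter((step) => exitCodeFor(step.result) !== 0)
  .map((step) => step.name);
console.log('failed steps:', failedSteps);
```

In the handler itself this would amount to capturing the `spawnSync` result for the trends step and deciding whether to continue, warn, or exit based on `exitCodeFor(result)`.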

package/brains/AGENTS.md CHANGED

@@ -4,7 +4,11 @@
 ## Identity
-You are an AI coding agent operating within **V-Bounce Engine** — a structured system for planning, implementing, and validating software using AI agents. You work as part of a team: Team Lead, Developer, QA, Architect, DevOps, and Scribe agents collaborate through structured reports.
+You are an AI operating within **V-Bounce Engine** — a structured system for planning, implementing, and validating software.
+You have two roles depending on the phase:
+- **During Planning (Phase 1 & 2):** You work directly with the human. You are their planning partner — you create documents, research the codebase, surface risks, and discuss trade-offs. No subagents are involved.
+- **During Execution (Phase 3):** You are the Team Lead orchestrating specialist agents (Developer, QA, Architect, DevOps, Scribe) through structured reports.
 You MUST follow the V-Bounce process. Deviating from it — skipping validation, ignoring LESSONS.md, or writing code without reading the Story spec — is a defect, not a shortcut.
@@ -24,34 +28,51 @@ Skills are in the `skills/` directory. Each skill has a `SKILL.md` with instruct
 ## The V-Bounce Process
-### Phase 1: Verification (Planning)
-Documents are created in strict hierarchy — no level can be skipped:
+The process has four phases. You determine which phase to operate in based on what the human is asking for.
+### Phase 1: Planning (AI + Human — No Subagents)
+**When to enter:** The human talks about what to build, asks to create or modify planning documents, discusses features, priorities, or asks about work status. This is a direct conversation — no subagents.
+Read `skills/doc-manager/SKILL.md` and follow its workflows.
+**Document hierarchy** — no level can be skipped:
 Charter (why) → Roadmap (strategic what/when) → Epic (detailed what) → Story (how) → Delivery Plan (execution) → Risk Registry (risks)
-### Pre-Bounce Checks
-Before starting any sprint, the Team Lead MUST:
-- **Triage the Request**: Is this an L1 Trivial change (1-2 files, cosmetic/minor)?
-  - If YES → Use the **Hotfix Path** (create a Hotfix document, bypass Epic/Story).
-  - If NO → Use the **Standard Path** (create/find Epic, Story).
-- **Determine Execution Mode**: Full Bounce vs Fast Track.
-- **Dependency Check**: Stories with `Depends On:` must execute sequentially.
-- Read RISK_REGISTRY.md — flag high-severity risks that affect planned stories.
-- Read `sprint-{XX}.md` §2 Sprint Open Questions — do not bounce stories with unresolved blocking questions.
-- If `vdocs/_manifest.json` exists, read it.
-- **Strategic Freeze**: Charter/Roadmap frozen during sprints. Use **Impact Analysis Protocol** if emergency changes occur. Evaluate active stories against new strategy. Pause until human approval.
-### Phase 2: The Bounce (Implementation)
+**Your responsibilities during planning:**
+1. **Creating documents:** Read upstream documents, research the codebase, draft the document. Follow doc-manager's CREATE and DECOMPOSE workflows.
+2. **Surfacing problems:** Assess ambiguity, open questions, edge cases, and risks. Present these clearly to the human — this is collaborative.
+3. **Answering status questions:** Read `product_plans/` to understand current state (backlog/, sprints/, archive/, strategy/).
+4. **Triaging requests:** L1 Trivial → Hotfix Path. Everything else → Standard Path (Epic → Story → Sprint).
+### Phase 2: Sprint Planning (AI + Human — Collaborative Gate)
+**When to enter:** The human wants to start executing work — "let's start a sprint", "what should we work on next?"
+**Hard rule: No bounce can start without a finalized, human-confirmed Sprint Plan.**
+1. Read backlog, archive, Risk Registry, vdocs manifest
+2. Propose sprint scope based on priority, dependencies, complexity
+3. Surface blockers: open questions, 🔴 ambiguity, missing prerequisites, risks, edge cases
+4. Discuss and refine with human
+5. Create Sprint Plan from `templates/sprint.md` — fill §0 Readiness Gate, §1 Active Scope, §2 Execution Strategy, §3 Open Questions
+6. **Gate:** Human confirms the Sprint Plan. Only then set status to "Active"
+**Strategic Freeze:** Charter/Roadmap frozen during sprints. Use **Impact Analysis Protocol** if emergency changes occur. Pause until human approval.
+### Phase 3: The Bounce (Execution)
 **Standard Path (L2-L4 Stories):**
 0. **Orient via state**: Read `.bounce/state.json` (`vbounce state show`) for instant context. Run `vbounce prep sprint S-{XX}` to generate a fresh context pack.
 1. Team Lead sends Story context pack to Developer.
 2. Developer reads LESSONS.md and the Story context pack, implements code, writes Implementation Report. CLI Orchestrator must run `./scripts/validate_report.mjs` on the report to enforce YAML strictness.
 3. **Pre-QA Gate Scan:** Team Lead runs `./scripts/pre_gate_runner.sh qa` to catch mechanical failures before spawning QA. If trivial issues → return to Dev.
 4. QA runs Quick Scan + PR Review (skipping pre-scanned checks), validates against Story §2 The Truth. If fail → Bug Report to Dev. CLI Orchestrator must run `./scripts/validate_report.mjs` on the QA report.
-5. Dev fixes and resubmits. 3+ failures → Escalated.
+5. Dev fixes and resubmits. 3+ failures → Escalated (see Escalation Recovery below).
 6. **Pre-Architect Gate Scan:** Team Lead runs `./scripts/pre_gate_runner.sh arch` to catch structural issues before spawning Architect. If mechanical failures → return to Dev.
 7. Architect runs Deep Audit + Trend Check (skipping pre-scanned checks), validates Safe Zone compliance and ADR adherence.
 8. DevOps merges story branch into sprint branch, validates post-merge (tests + lint + build), handles release tagging.
-9. Team Lead consolidates reports into Sprint Report.
+9. **Record lessons immediately**: After DevOps merge, check Dev and QA reports for `lessons_flagged`. Record to LESSONS.md now — do not wait for sprint close.
+10. Team Lead consolidates reports into Sprint Report.
 **Hotfix Path (L1 Trivial Tasks):**
 1. Team Lead evaluates request and creates `HOTFIX-{Date}-{Name}.md`.
@@ -61,10 +82,27 @@ Before starting any sprint, the Team Lead MUST:
 5. Hotfix is merged directly into the active branch.
 6. DevOps (or Team Lead) runs `./scripts/hotfix_manager.sh sync` to update active worktrees.
-### Phase 3: Review
-Sprint Report → Human review → Delivery Plan updated → Lessons recorded → Next sprint.
+**Escalation Recovery (3+ bounce failures):**
+1. Mark story as "Escalated" in Sprint Plan
+2. Present to human: what failed, root causes from bounce reports, pattern analysis
+3. Propose options: re-scope the story, split into smaller stories, create a spike, or remove from sprint
+4. Human decides. Execute the decision.
+### Phase 4: Review
+Sprint Report → Human review → Delivery Plan updated (at boundary only) → Lessons recorded → Next sprint.
 If sprint delivered new features or Dev reports flagged stale product docs → spawn Scribe agent to generate/update vdocs/ via vdoc.
+**Self-Improvement Pipeline** (auto-runs on `vbounce sprint close`):
+1. `sprint_trends.mjs` → cross-sprint trend analysis → `.bounce/trends.md`
+2. `post_sprint_improve.mjs` → parses §5 retro tables + LESSONS.md automation candidates + recurring patterns + effectiveness checks → `.bounce/improvement-manifest.json`
+3. `suggest_improvements.mjs` → generates human-readable suggestions with impact levels → `.bounce/improvement-suggestions.md`
+4. Human reviews suggestions → approve/reject/defer each item
+5. Run `/improve` to apply approved changes with brain-file sync
+**Impact Levels:** P0 Critical (blocks agents), P1 High (causes rework), P2 Medium (friction), P3 Low (polish). See `/improve` skill for details.
+On-demand: `vbounce improve S-{XX}` runs the full pipeline.
 ## Story States
 Draft → Refinement → Ready to Bounce → Bouncing → QA Passed → Architect Passed → Sprint Review → Done
@@ -101,7 +139,7 @@ Bouncing → Escalated (3+ failures)
 10. One source of truth. Reference upstream documents, don't duplicate.
 11. Change Logs are mandatory on every document modification.
 12. Agent Reports MUST use YAML Frontmatter. Every `.bounce/report/` generated must start with a strict `---` YAML block containing the core status and metrics before the Markdown body.
-13. Framework Integrity. Any modification to a `brains/` or `skills/` file MUST be recorded in `brains/CHANGELOG.md`.
+13. Framework Integrity. Any modification to a `brains/`, `skills/`, `templates/`, or `scripts/` file MUST be recorded in `brains/CHANGELOG.md` and reflected in `MANIFEST.md`.
 ## Framework Structure
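
The Story States line and the "3+ failures → Escalated" rule in this file describe a small state machine. A minimal JavaScript sketch follows, encoding only the transitions visible in this diff. The helper names (`advance`, `recordGateFailure`) and the story object shape are hypothetical, and treating a failed gate as a return to "Bouncing" is an assumption; the full AGENTS.md may define further transitions not shown here.

```javascript
// Forward path exactly as listed under "Story States".
const HAPPY_PATH = [
  'Draft',
  'Refinement',
  'Ready to Bounce',
  'Bouncing',
  'QA Passed',
  'Architect Passed',
  'Sprint Review',
  'Done',
];
const MAX_FAILURES = 3; // "3+ failures → Escalated"

// Hypothetical helper: move one step forward along the happy path.
function advance(state) {
  const index = HAPPY_PATH.indexOf(state);
  if (index === -1 || index === HAPPY_PATH.length - 1) {
    throw new Error(`No forward transition from "${state}"`);
  }
  return HAPPY_PATH[index + 1];
}

// Hypothetical helper: record a failed QA or Architect gate.
// Assumption: the story returns to "Bouncing" until the limit is reached.
function recordGateFailure(story) {
  const failures = story.failures + 1;
  const state = failures >= MAX_FAILURES ? 'Escalated' : 'Bouncing';
  return { ...story, failures, state };
}

// Walk a made-up story through three failed gates.
let story = { id: 'STORY-01', state: 'Bouncing', failures: 0 };
for (let attempt = 0; attempt < 3; attempt += 1) {
  story = recordGateFailure(story);
  console.log(story.failures, story.state);
}
```

Once a story is "Escalated", the Escalation Recovery steps above hand the decision to the human, so the sketch deliberately defines no automatic transition out of that state.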

package/brains/CHANGELOG.md CHANGED

@@ -3,6 +3,28 @@
 This log tracks modifications to the core agentic framework (e.g., `brains/`, `skills/`).
 Per **Rule 13: Framework Integrity**, anytime an entry is made here, all tool-specific brain files must be reviewed for consistency.
+## [2026-03-13] — Discovery Phase: Structured Spike System
+### Spike Template (New)
+- **Added**: `templates/spike.md` — spike document template with YAML frontmatter (spike_id, parent_epic_ref, status, ambiguity_before, time_box), 8 sections (Question, Constraints, Approach, Findings, Decision, Residual Risk, Affected Documents checklist, Change Log). Hierarchy Level 3.5 — child of Epic, sibling of Story. Output location: `product_plans/backlog/EPIC-{NNN}_{name}/SPIKE-{EpicID}-{NNN}-{topic}.md`.
+### Discovery Reference (New)
+- **Added**: `skills/agent-team/references/discovery.md` — spike execution protocol. Covers: when discovery triggers, spike lifecycle (Open → Investigating → Findings Ready → Validated → Closed), 4-step execution protocol (Create → Investigate → Validate → Close & Propagate), timing rules, integration with bounce sequence.
+### Doc-Manager Skill (Modified)
+- **Modified**: `skills/doc-manager/SKILL.md` — added Spike row to Template Locations table; added spike file to folder structure diagram; added spike information flows (Epic §8 → Spike §1, Spike §4 → Epic §4, Spike §5 → Roadmap §3, Spike §6 → Risk Registry); added Spike pre-read requirements; added spike cascade rules; added spike transition gates (Probing/Spiking → Refinement, Spike → Validated, Spike → Closed); updated Developer and Architect agent integration rows with spike ownership; added Ambiguity Assessment Rubric section with 🔴/🟡/🟢 signal definitions and spike creation trigger.
+### Agent-Team Skill (Modified)
+- **Modified**: `skills/agent-team/SKILL.md` — added Step 0.5: Discovery Check between Sprint Setup and Story Initialization; added critical rule "Resolve discovery before bouncing" requiring L4/🔴 stories to complete spikes before entering bounce sequence.
+### Claude Brain (Modified)
+- **Modified**: `brains/CLAUDE.md` — added Discovery Check to Pre-Bounce Checks; expanded L4 complexity label with spike creation and validation requirements; updated Story States diagram to show spike sub-flow (Dev investigates → Arch validates → docs updated).
+### Sync Notes
+- Other brain files (`GEMINI.md`, `AGENTS.md`, `cursor-rules/`) not yet updated — should be synced in a follow-up change.
+---
 ## [2026-03-12] — LanceDB Removal
 - **Removed**: `scripts/vbounce_ask.mjs` — LanceDB semantic query tool. Replaced by direct `LESSONS.md` reads.