npm - buildanything - Versions diffs - 2.1.2 → 2.2.0 - Mend

buildanything 2.1.2 → 2.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/agents/a11y-architect.md +2 -2
package/commands/build.md +72 -28
package/package.json +1 -1
package/protocols/state-schema.json +23 -2
package/protocols/state-schema.md +2 -0
package/protocols/web-phase-branches.md +1 -11
package/src/orchestrator/worktree-launcher.ts +20 -0

package/agents/a11y-architect.md CHANGED

@@ -1,8 +1,8 @@
 ---
 name: a11y-architect
 description: Accessibility Architect specializing in WCAG 2.2 compliance for Web and Native platforms. Use PROACTIVELY when designing UI components, establishing design systems, or auditing code for inclusive user experiences.
-model: opus
-effort: xhigh
+model: sonnet
+effort: medium
 tools: ["Read", "Write", "Edit", "Bash", "Grep", "Glob", "Skill"]
 ---
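
Note on the hunk above: the only change to this agent is its frontmatter routing (`model: opus` to `model: sonnet`, `effort: xhigh` to `effort: medium`). As a minimal sketch of how such a value could be read from an agent file, assuming flat `key: value` frontmatter between two `---` fences: `readFrontmatterField` is a hypothetical helper written for illustration and is not part of the package or of Claude Code.

```typescript
// Hypothetical helper (not from the package): read one scalar field from
// YAML-style frontmatter. Handles only flat `key: value` lines that sit
// between the first pair of `---` fences.
function readFrontmatterField(markdown: string, key: string): string | undefined {
  const lines = markdown.split("\n").map((l) => l.trim());
  if (lines[0] !== "---") return undefined;
  const end = lines.indexOf("---", 1);
  if (end === -1) return undefined;
  for (const line of lines.slice(1, end)) {
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    if (line.slice(0, sep).trim() === key) {
      return line.slice(sep + 1).trim();
    }
  }
  return undefined;
}

// Frontmatter as it appears on the 2.2.0 side of the hunk (abridged).
const agentFile = [
  "---",
  "name: a11y-architect",
  "model: sonnet",
  "effort: medium",
  "---",
  "Agent body...",
].join("\n");

const routedModel = readFrontmatterField(agentFile, "model");
const routedEffort = readFrontmatterField(agentFile, "effort");
const missingField = readFrontmatterField(agentFile, "temperature");
```

A real loader would more likely use a YAML parser; the line-based scan is only meant to show which field the version bump changes.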

package/commands/build.md CHANGED

@@ -24,6 +24,12 @@ Every Agent tool call MUST include a `subagent_type` field unless the dispatch i
 Missing `subagent_type` on a non-INTERNAL dispatch is a HARD-GATE violation. The orchestrator rejects dispatches that don't name a specific agent. If you catch yourself typing `description: "..."` without a `subagent_type:` line alongside it, STOP and look up the right agent from the per-phase dispatch tables further down in this file.
 </HARD-GATE>
+<HARD-GATE>
+MODEL ROUTING — DO NOT OVERRIDE.
+NEVER pass a `model` parameter on Agent tool calls. Each agent `.md` file declares `model:` in its YAML frontmatter (opus, sonnet, or haiku). Claude Code reads the frontmatter and routes to the correct model automatically. Passing `model:` on the invocation overrides the frontmatter and breaks cost routing. The orchestrator's only job is to pass the correct `agent_type` — the plugin handles model selection.
+</HARD-GATE>
 <HARD-GATE>
 ARTIFACT WRITER-OWNER RULE.
@@ -46,7 +52,6 @@ Live downstream docs (read across Phase 3+):
   - `docs/plans/ux-architecture.md`     — P3 writer (web)
   - `docs/plans/ux-flow-validation.md`  — design-ux-researcher writer (web, Step 3.3b)
   - `docs/plans/inclusive-visuals-audit.md` — P3 writer (web)
-  - `docs/plans/a11y-design-review.md`  — P3 writer, a11y-architect writer (web, Step 3.7)
   - `docs/plans/page-specs/*.md`        — P3 writer, design-ux-architect writer (web, Step 3.3 — per-screen wireframes + layout specs)
   - `docs/plans/refs.json`              — P2 writer, P3 writer (P3 extends after visual spec lands)
   - `docs/plans/decisions.jsonl`        — orchestrator-scribe ONLY via `scribe_decision` MCP tool (subagents return `deviation_row` objects; the orchestrator forwards each row through the MCP, which owns ID allocation and atomic append)
@@ -140,6 +145,41 @@ Increment after each agent returns (parallel dispatch of 6 agents = +6). Reset t
 Phase 4 context pressure: With 20+ tasks, compact returns accumulate ~30-40K tokens in the orchestrator's context. The compaction checkpoint (dispatch_count >= 8) is the primary relief valve. If Phase 4 has more than 15 tasks, force a compaction checkpoint after every wave transition regardless of dispatch_count.
+### Phase Boundary Eviction (Context Budget Protocol)
+At every phase boundary (after gate approval, before starting the next phase):
+1. **Write carry-forward summary.** Append to `.build-state.json.phase_summaries[]`:
+   ```jsonc
+   {
+     "phase": <N>,
+     "completed_at": "<ISO timestamp>",
+     "artifacts": ["<paths of files this phase produced>"],
+     "decisions": "<1-2 sentences: key decisions made>",
+     "status": "<approved | approved_with_concerns | auto_approved>",
+     "carry_forward": "<1-2 sentences: user feedback or constraints that affect future phases>"
+   }
+   ```
+   Budget: max 500 tokens for the entire entry. If you can't fit it, you're including too much.
+2. **Save state.** Call `state_save`.
+3. **Drop prior-phase context.** After saving, you do NOT need to retain in working memory:
+   - Agent dispatch prompts from the completed phase (already sent)
+   - Agent returns from the completed phase (already processed, summary in state)
+   - File contents read to compose prompts (still on disk, re-readable)
+   - Metric loop intermediate scores (final score in state)
+   - Gate presentation text (user already approved)
+4. **Re-read for next phase.** Read `.build-state.json` fresh (contains `phase_summaries` — your structured memory of all prior phases). Then read only the input artifacts needed for the next phase:
+   - Entering Phase 3: `architecture.md`, `sprint-tasks.md`, `quality-targets.json`
+   - Entering Phase 4: `feature-delegation-plan.json`, current wave's feature briefs
+   - Entering Phase 5: `quality-targets.json`, feature list from state
+   - Entering Phase 6: Phase 5 findings paths from state, `decisions.jsonl`
+   - Entering Phase 7: LRR verdict from state
+The `phase_summaries` array is your memory of prior phases. You do NOT need the raw conversation that produced them. If you need a specific fact from Phase 1 during Phase 5, read the artifact file — don't try to recall it.
 **Cumulative-cost banner at phase boundaries:** When announcing a phase transition (e.g. "Phase N complete — proceeding to Phase N+1"), prefix the message with `[Cost so far: $X.XX • Y tokens]`. Source the values from the last-appended entry in `docs/plans/build-log.md`'s token-accounting lines (fields `cumulative_usd=...` plus the sum of `input_tokens=...` + `output_tokens=...`), written by `src/orchestrator/hooks/token-accounting.ts` (see module for exact schema). If the build-log has no token-accounting entries yet, omit the prefix rather than guessing.
 Input: $ARGUMENTS
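
Note on the hunk above: the Phase Boundary Eviction protocol appends one small entry to `phase_summaries[]` per phase and caps it at 500 tokens. The sketch below shows one way that shape and budget could be enforced. The field names come from the hunk; `appendPhaseSummary`, `estimateTokens`, and the 4-characters-per-token heuristic are assumptions made for illustration, not the package's implementation.

```typescript
type GateStatus = "approved" | "approved_with_concerns" | "auto_approved";

// Field names mirror the jsonc entry added in the hunk above.
interface PhaseSummary {
  phase: number;
  completed_at: string; // ISO timestamp
  artifacts: string[];
  decisions: string;
  status: GateStatus;
  carry_forward: string;
}

const MAX_SUMMARY_TOKENS = 500;

// Assumption: roughly 4 characters per token. This is a crude heuristic,
// not the tokenizer the orchestrator actually uses.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: append an entry only if it fits the budget.
function appendPhaseSummary(
  state: { phase_summaries: PhaseSummary[] },
  entry: PhaseSummary,
): void {
  const tokens = estimateTokens(JSON.stringify(entry));
  if (tokens > MAX_SUMMARY_TOKENS) {
    throw new Error(`phase summary is ~${tokens} tokens; budget is ${MAX_SUMMARY_TOKENS}`);
  }
  state.phase_summaries.push(entry);
}

const buildState: { phase_summaries: PhaseSummary[] } = { phase_summaries: [] };

appendPhaseSummary(buildState, {
  phase: 2,
  completed_at: "2025-01-01T00:00:00Z", // placeholder timestamp
  artifacts: ["docs/plans/architecture.md", "docs/plans/sprint-tasks.md"],
  decisions: "REST API with a relational store.",
  status: "approved",
  carry_forward: "User wants the dashboard kept minimal.",
});
```

Rejecting an oversized entry, rather than truncating it, matches the hunk's guidance that an entry which does not fit is including too much.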
@@ -559,7 +599,7 @@ Run via the Bash tool:
 ---
-## Phase 2: Plan / Architect — TEAM of 6 + sequence
+## Phase 2: Plan / Architect — TEAM of 4 + sequence + security review
 **Goal**: Convert the PRD into a concrete architecture and ordered task list with explicit dependencies. Every architect receives the PRD (design-doc.md) + the Research Digest + its domain's raw research file (hybrid routing).
@@ -573,21 +613,23 @@ If existing code, call the Agent tool — description: "Explore codebase" — IN
 If greenfield, skip to Step 2.2.
-### Step 2.2 — Architecture Design (TEAM of 6 architects coordinating via SendMessage)
+### Step 2.2 — Architecture Design (TEAM of 4 architects coordinating via SendMessage)
+The 4 architects design as a TEAM — not 4 isolated subagents. Cross-domain contract boundaries (Backend↔Frontend on API shape, Performance↔Backend+Data on query shapes, Frontend↔Performance on bundle budgets) are caught at design time via peer SendMessage, not absorbed silently by a downstream stitcher.
-The 6 architects design as a TEAM — not 6 isolated subagents. Cross-domain contract boundaries (Backend↔Frontend on API shape, Security↔Backend on auth, A11y↔Frontend on component patterns, Performance↔Backend+Data on query shapes) are caught at design time via peer SendMessage, not absorbed silently by a downstream stitcher.
+Security is NOT in the team — it runs as a separate review pass after synthesis (Step 2.4) to avoid the coordination overhead of its dense cross-check pairings.
-**On re-entry from LRR backward routing:** If Phase 2 is being re-opened via the re-entry dispatch template (Step 6.3), skip team creation if the original `phase-2-architects` team is still live from this build; otherwise recreate it. Pass the re-entry payload (`{blocking_finding, prior_output: "docs/plans/architecture.md", decision_row}`) into the dispatch prompt of the architect(s) whose domain matches `decision_row.author` — only those architects re-run, not all 6. The re-dispatched architect revises its `docs/plans/phase-2-contracts/<name>.md` in place, SendMessages peers on any contract boundary it now changes, and the synthesizer re-runs once to re-stitch `architecture.md`. Do NOT redo unaffected domains.
+**On re-entry from LRR backward routing:** If Phase 2 is being re-opened via the re-entry dispatch template (Step 6.3), skip team creation if the original `phase-2-architects` team is still live from this build; otherwise recreate it. Pass the re-entry payload (`{blocking_finding, prior_output: "docs/plans/architecture.md", decision_row}`) into the dispatch prompt of the architect(s) whose domain matches `decision_row.author` — only those architects re-run, not all 4. The re-dispatched architect revises its `docs/plans/phase-2-contracts/<name>.md` in place, SendMessages peers on any contract boundary it now changes, and the synthesizer re-runs once to re-stitch `architecture.md`. Do NOT redo unaffected domains.
 After the synthesizer re-stitches `architecture.md`, re-run the Refs Indexer (Step 2.3 dispatch #4) to update `docs/plans/refs.json` with fresh anchors, and re-run the DAG Validator (Step 2.3 dispatch #3) to verify sprint-tasks.md still references valid architecture sections. Invalidate the sprint-context hash per the refs.json mutation rule.
 **Step 2.2a — Create the team.**
-Call `TeamCreate` with `team_name: "phase-2-architects"`. This team scopes the SendMessage channel for the 6 architects below. Capture the team id in `.build-state.json` for teardown.
+Call `TeamCreate` with `team_name: "phase-2-architects"`. This team scopes the SendMessage channel for the 4 architects below. Capture the team id in `.build-state.json` for teardown.
-**Step 2.2b — Dispatch 6 architects as teammates (ONE message).**
+**Step 2.2b — Dispatch 4 architects as teammates (ONE message).**
-Call the Agent tool 6 times in a single message. Each call passes `team_name: "phase-2-architects"` and a unique `name` (listed below). Each architect receives: `docs/plans/design-doc.md` (PRD) + `docs/plans/phase1-scratch/findings-digest.md` + ITS DOMAIN'S RAW RESEARCH FILE (hybrid routing) + the team roster + cross-check pairings + the per-architect output file path.
+Call the Agent tool 4 times in a single message. Each call passes `team_name: "phase-2-architects"` and a unique `name` (listed below). Each architect receives: `docs/plans/design-doc.md` (PRD) + `docs/plans/phase1-scratch/findings-digest.md` + ITS DOMAIN'S RAW RESEARCH FILE (hybrid routing) + the team roster + cross-check pairings + the per-architect output file path.
 Shared brief appended to every architect prompt:
@@ -597,21 +639,18 @@ ROSTER:
   - backend-architect         (owns services, API contracts, DB schema)
   - frontend-architect        (owns component hierarchy, state mgmt, routing)
   - data-engineer             (owns ETL/ELT, schema versioning, query patterns)
-  - security-engineer         (owns auth model, input validation, threat model)
-  - accessibility-auditor     (owns WCAG 2.2 AA constraints on component/nav choice)
   - performance-benchmarker   (owns quality-targets.json, bundle + latency budgets)
 CROSS-CHECK PAIRINGS (mandatory — if your design touches one of these boundaries, SendMessage the peer before you finalize):
   - Backend ↔ Frontend         on API contract shape (REST vs GraphQL, request/response schemas, error envelope)
-  - Security ↔ Backend         on auth flow (token storage, refresh, session model, authz gates)
-  - Accessibility ↔ Frontend   on component patterns (primitives, focus management, landmark structure)
   - Performance ↔ Backend+Data on query shapes (N+1 risk, indexing strategy, bundle impact of data layer choices)
-  - Security ↔ Frontend        on client-side auth (token storage location, CSRF protection, input sanitization, secure cookie flags)
+  - Frontend ↔ Performance     on bundle budgets (per-Scope classification, animation strategy, MapLibre/heavy-lib placement)
 COORDINATION RULES:
   - Plain text in your output file is INVISIBLE to teammates. If a contract boundary intersects another architect's domain, you MUST `SendMessage` to that peer using the exact `name` from the roster above. Do not assume they will read your file.
   - If a peer SendMessage challenges a decision you have written, revise your output file and SendMessage back with the resolution — do not silently ignore.
-  - Idle (exit) only after: (1) your initial read + draft is complete, AND (2) all cross-check pairings touching your domain have either been resolved via SendMessage or confirmed non-intersecting.
+  - Max 2 rounds of cross-check per pairing. After round 2, document the disagreement in your output file under `### Unresolved Tensions` and idle. The synthesizer resolves remaining tensions.
+  - Idle (exit) only after: (1) your initial read + draft is complete, AND (2) all cross-check pairings touching your domain have either been resolved via SendMessage, confirmed non-intersecting, or hit the 2-round cap.
 OUTPUT:
   Write your findings to `docs/plans/phase-2-contracts/<your-name>.md` (e.g., `docs/plans/phase-2-contracts/backend-architect.md`). This file is the authoritative record of your post-debate position — include both your initial decisions AND any revisions driven by peer SendMessage.
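
Note on the hunk above: the new coordination rules let an architect idle once every pairing touching its domain is resolved, confirmed non-intersecting, or has hit the 2-round cap, with capped pairings recorded under `### Unresolved Tensions`. A minimal sketch of that exit condition follows. `PairingState`, `mayIdle`, and `unresolvedTensions` are hypothetical names chosen for illustration; the package expresses this rule as prompt text, not code.

```typescript
// Hypothetical bookkeeping for one cross-check pairing.
interface PairingState {
  peer: string;             // roster name, e.g. "frontend-architect"
  rounds: number;           // completed SendMessage cross-check rounds
  resolved: boolean;        // peers agreed on the contract boundary
  nonIntersecting: boolean; // boundary confirmed not to touch this domain
}

const MAX_ROUNDS = 2;

// A pairing no longer blocks idling once it is resolved, confirmed
// non-intersecting, or has used up its two rounds.
function isSettled(p: PairingState): boolean {
  return p.resolved || p.nonIntersecting || p.rounds >= MAX_ROUNDS;
}

function mayIdle(draftComplete: boolean, pairings: PairingState[]): boolean {
  return draftComplete && pairings.every(isSettled);
}

// Pairings that hit the cap without agreement; these would be listed
// under `### Unresolved Tensions` for the synthesizer to resolve.
function unresolvedTensions(pairings: PairingState[]): string[] {
  return pairings
    .filter((p) => !p.resolved && !p.nonIntersecting && p.rounds >= MAX_ROUNDS)
    .map((p) => p.peer);
}

const backendPairings: PairingState[] = [
  { peer: "frontend-architect", rounds: 1, resolved: true, nonIntersecting: false },
  { peer: "performance-benchmarker", rounds: 2, resolved: false, nonIntersecting: false },
];

const backendMayIdle = mayIdle(true, backendPairings);
const backendTensions = unresolvedTensions(backendPairings);
```

In this example the backend architect may idle even though one pairing never converged, because the cap turns an open disagreement into a documented tension.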
@@ -621,22 +660,19 @@ Per-architect dispatches:
 **CONTEXT header:** Render `rendered_context_header` for phase 2 per the canonical template (see CONTEXT HEADER HARD-GATE above). Prepend to every Phase 2 architect prompt below.
+All architects use model: **sonnet**.
-1. Description: "Backend architecture" — agent_type: `engineering-backend-architect` — subagent_type: `engineering-backend-architect` — team_name: `phase-2-architects` — name: `backend-architect` — Prompt: "[CONTEXT header above] Design system architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude services, data models, API contracts, database schema, integration points. Respect stack choices from PRD. Map per-feature Business Rules and States to specific endpoints, persistence schemas, and validation logic — every State the product spec defines must have a backend behavior.\n\n[paste shared team brief above]"
+1. Description: "Backend architecture" — agent_type: `engineering-backend-architect` — subagent_type: `engineering-backend-architect` — model: `sonnet` — team_name: `phase-2-architects` — name: `backend-architect` — Prompt: "[CONTEXT header above] Design system architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude services, data models, API contracts, database schema, integration points. Respect stack choices from PRD. Map per-feature Business Rules and States to specific endpoints, persistence schemas, and validation logic — every State the product spec defines must have a backend behavior.\n\n[paste shared team brief above]"
-2. Description: "Frontend architecture" — agent_type: `engineering-frontend-developer` — subagent_type: `engineering-frontend-developer` — team_name: `phase-2-architects` — name: `frontend-architect` — Prompt: "[CONTEXT header above] Design frontend architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude component hierarchy, layout strategy, responsive approach, state management, routing. Align UX with the persona from research. Map the Screen Inventory to your component hierarchy — every screen the product spec lists must have a routable view, and per-feature States must drive the component-state matrix.\n\n[paste shared team brief above]"
+2. Description: "Frontend architecture" — agent_type: `engineering-frontend-developer` — subagent_type: `engineering-frontend-developer` — model: `sonnet` — team_name: `phase-2-architects` — name: `frontend-architect` — Prompt: "[CONTEXT header above] Design frontend architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`, `docs/plans/phase1-scratch/feature-intel.md`\nInclude component hierarchy, layout strategy, responsive approach, state management, routing. Align UX with the persona from research. Map the Screen Inventory to your component hierarchy — every screen the product spec lists must have a routable view, and per-feature States must drive the component-state matrix.\n\n[paste shared team brief above]"
-3. Description: "Data engineering" — agent_type: `engineering-data-engineer` — subagent_type: `engineering-data-engineer` — team_name: `phase-2-architects` — name: `data-engineer` — Prompt: "[CONTEXT header above] Design data architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nInclude ETL/ELT patterns, schema versioning, query patterns, indexing strategy, data lineage, migration plan. Per-feature data requirements from the product spec drive your schema — derived fields, denormalizations, and access patterns must serve specific feature flows.\n\n[paste shared team brief above]"
+3. Description: "Data engineering" — agent_type: `engineering-data-engineer` — subagent_type: `engineering-data-engineer` — model: `sonnet` — team_name: `phase-2-architects` — name: `data-engineer` — Prompt: "[CONTEXT header above] Design data architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nInclude ETL/ELT patterns, schema versioning, query patterns, indexing strategy, data lineage, migration plan. Per-feature data requirements from the product spec drive your schema — derived fields, denormalizations, and access patterns must serve specific feature flows.\n\n[paste shared team brief above]"
-4. Description: "Security architecture" — agent_type: `engineering-security-engineer` — subagent_type: `engineering-security-engineer` — team_name: `phase-2-architects` — name: `security-engineer` — Prompt: "[CONTEXT header above] Security review. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nCover auth model, input validation, secrets management, threat model, dependency hygiene. Use the product spec's ## Permissions & Roles section to drive your auth model — roles in the product spec must map to enforceable permissions in the architecture.\n\n[paste shared team brief above]"
+4. Description: "Performance constraints" — agent_type: `testing-performance-benchmarker` — subagent_type: `testing-performance-benchmarker` — model: `sonnet` — team_name: `phase-2-architects` — name: `performance-benchmarker` — Prompt: "[CONTEXT header above] Define quality targets for this build. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nWrite `docs/plans/quality-targets.json` covering bundle budget, LCP, TTI, API p95, Lighthouse scores. Use per-Scope budgets: Marketing 500KB / Product 300KB / Dashboard 400KB / Internal 200KB gzipped. Per-feature critical-path performance derives from the product spec's Happy Path latency expectations.\n\n[paste shared team brief above]"
-5. Description: "A11y constraints" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — team_name: `phase-2-architects` — name: `accessibility-auditor` — Prompt: "[CONTEXT header above] Accessibility-driven architecture constraints. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/ux-research.md`\nIdentify WCAG 2.2 AA requirements that affect component choice, navigation structure, form patterns, focus management, landmark regions. Per-feature Persona Constraints (e.g., \"user scans, doesn't read\", \"operator on a phone in the field\") drive component-level a11y constraints.\n\n[paste shared team brief above]"
+**Step 2.2c — Wait for all 4 teammates to idle**, then proceed to synthesis. The `docs/plans/phase-2-contracts/*.md` files now contain post-debate positions (initial draft plus any SendMessage-driven revisions). The orchestrator does NOT read these files — the synthesizer below does.
-6. Description: "Performance constraints" — agent_type: `testing-performance-benchmarker` — subagent_type: `testing-performance-benchmarker` — team_name: `phase-2-architects` — name: `performance-benchmarker` — Prompt: "[CONTEXT header above] Define quality targets for this build. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## App Overview, ## Screen Inventory, ## Permissions & Roles, per-feature behavioral sections — feature behaviors are the source of truth your architecture must support)\n  - DIGEST: `docs/plans/phase1-scratch/findings-digest.md`\n  - YOUR DOMAIN RAW: `docs/plans/phase1-scratch/tech-feasibility.md`\nWrite `docs/plans/quality-targets.json` covering bundle budget, LCP, TTI, API p95, Lighthouse scores. Use per-Scope budgets: Marketing 500KB / Product 300KB / Dashboard 400KB / Internal 200KB gzipped. Per-feature critical-path performance derives from the product spec's Happy Path latency expectations.\n\n[paste shared team brief above]"
-**Step 2.2c — Wait for all 6 teammates to idle**, then proceed to synthesis. The `docs/plans/phase-2-contracts/*.md` files now contain post-debate positions (initial draft plus any SendMessage-driven revisions). The orchestrator does NOT read these files — the synthesizer below does.
-After all 6 teammates are idle, the 4 raw research files are **SPENT**. They sit on disk for audit but no downstream phase reads them — they are NOT in the `refs.json` index. The orchestrator MOVES them to `docs/plans/phase1-scratch/` if not already there, to make the distinction physically obvious.
+After all 4 teammates are idle, the 4 raw research files are **SPENT**. They sit on disk for audit but no downstream phase reads them — they are NOT in the `refs.json` index. The orchestrator MOVES them to `docs/plans/phase1-scratch/` if not already there, to make the distinction physically obvious.
 **Step 2.2d — Team teardown.** After the synthesizer dispatch at Step 2.3 returns, call `TeamDelete` on `phase-2-architects` to clean up the team channel.
@@ -646,7 +682,7 @@ Four sequential dispatches.
 **CONTEXT header:** Reuse `rendered_context_header` from phase 2 (already rendered above). Prepend to Step 2.3 synthesizer + sprint-breakdown prompts.
-1. Description: "Implementation blueprint" — agent_type: `code-architect` — subagent_type: `code-architect` — Prompt: "[CONTEXT header above] Implementation blueprint. Read the PRD via your Read tool: `docs/plans/design-doc.md`. Read the product spec: `docs/plans/product-spec.md` (Screen Inventory + per-feature behavioral sections — your blueprint's file-and-build-order list must cover every feature in the spec). Read all 6 post-debate architect positions via your own Read tool from `docs/plans/phase-2-contracts/`:\n  - `backend-architect.md`\n  - `frontend-architect.md`\n  - `data-engineer.md`\n  - `security-engineer.md`\n  - `accessibility-auditor.md`\n  - `performance-benchmarker.md`\n\nThese files are the authoritative team positions AFTER any SendMessage-driven revisions — the architects already cross-checked each other's contract boundaries, so you can stitch without re-debating. Your job is to assemble the 6 positions into a coherent architecture. Where positions conflict OUTSIDE the 5 mandatory cross-check pairings, flag the contradiction explicitly in `architecture.md` under a `### Unresolved Tensions` section and pick the safer default. Do not silently absorb contradictions. Include specific files to create/modify, build sequence, dependency order. Write `docs/plans/architecture.md` with stable section anchors per `protocols/architecture-schema.md`. Required top-level sections: Overview, Frontend, Backend, Data Model, Security, Infrastructure, Scope, Out of Scope. Scope to the boundary from the PRD. Every API endpoint heading in the Backend section MUST include feature attribution annotations — e.g. `**POST /api/orders** (provides: order-placement)` — using the feature kebab names from `product-spec.md`. These annotations are required for the graph indexer to emit cross-feature dependency edges."
+1. Description: "Implementation blueprint" — agent_type: `code-architect` — subagent_type: `code-architect` — Prompt: "[CONTEXT header above] Implementation blueprint. Read the PRD via your Read tool: `docs/plans/design-doc.md`. Read the product spec: `docs/plans/product-spec.md` (Screen Inventory + per-feature behavioral sections — your blueprint's file-and-build-order list must cover every feature in the spec). Read all 4 post-debate architect positions via your own Read tool from `docs/plans/phase-2-contracts/`:\n  - `backend-architect.md`\n  - `frontend-architect.md`\n  - `data-engineer.md`\n  - `performance-benchmarker.md`\n\nThese files are the authoritative team positions AFTER any SendMessage-driven revisions — the architects already cross-checked each other's contract boundaries, so you can stitch without re-debating. Your job is to assemble the 4 positions into a coherent architecture. Where positions conflict OUTSIDE the 3 mandatory cross-check pairings, flag the contradiction explicitly in `architecture.md` under a `### Unresolved Tensions` section and pick the safer default. Do not silently absorb contradictions. Include specific files to create/modify, build sequence, dependency order. Write `docs/plans/architecture.md` with stable section anchors per `protocols/architecture-schema.md`. Required top-level sections: Overview, Frontend, Backend, Data Model, Security, Infrastructure, Scope, Out of Scope. Scope to the boundary from the PRD. Every API endpoint heading in the Backend section MUST include feature attribution annotations — e.g. `**POST /api/orders** (provides: order-placement)` — using the feature kebab names from `product-spec.md`. These annotations are required for the graph indexer to emit cross-feature dependency edges."
 2. Description: "Sprint breakdown" — agent_type: `planner` — subagent_type: `planner` — Prompt: "[CONTEXT header above] Break this architecture into ordered, atomic tasks. Each task needs: description, acceptance criteria, **dependencies** (list of task IDs this depends on), size (S/M/L), **Behavioral Test** field for every UI task (concrete interaction: 'Navigate to [page], click [element], verify [outcome]') or curl-based acceptance test for API tasks, **Feature** — the exact feature name from product-spec.md (e.g. 'Order Placement', 'Auth') that must match a `## Feature: X` heading in product-spec.md (use '—' for infrastructure tasks that don't belong to a specific feature), **Screens** — comma-separated screen names from the product-spec Screen Inventory (e.g. 'Catalog, Product Detail') that must match screen names in product-spec.md (use '—' for backend-only tasks). Read these files via your Read tool before starting:\n  - ARCHITECTURE: `docs/plans/architecture.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (per-feature behavioral sections — every feature in the spec must have at least one task, and per-feature acceptance criteria become Behavioral Test field values)\n  - PRD: `docs/plans/design-doc.md`\nSave to `docs/plans/sprint-tasks.md`. The table must have these columns in order: Task ID, Title, Size, Dependencies, Behavioral Test, Owns Files, Implementing Phase, Feature, Screens. Dependencies field is load-bearing — Phase 4 uses it to batch independent tasks in parallel. Each task's Behavioral Test field SHOULD reference a specific feature acceptance criterion from the product spec (e.g., \"User can submit form with valid email; submitted form appears in admin dashboard within 5s\" — derived from product-spec.md's Happy Path or per-state criteria)."
@@ -667,7 +703,7 @@ Report any violations. If clean, return PASS. If violations, return a list of fi
 For each doc, extract section anchors into a flat index. Schema: `[{\"anchor\": \"design-doc.md#persona\", \"topic\": \"user persona\", \"file_path\": \"docs/plans/design-doc.md\"}, ...]`. This index is consumed by the Phase 4 Briefing Officer for per-task context maps. Do NOT include Phase 1 scratch files — they are SPENT."
-**Architecture Metric Loop (callable service):** Run the Metric Loop Protocol (`protocols/metric-loop.md`) on `architecture.md`. Define a metric: coverage of PRD requirements, specificity, consistency across the 6 architects, and **simplicity** — is this the simplest architecture that meets the requirements? Could any service, abstraction, or dependency be eliminated? Penalize over-engineering. Max 3 iterations.
+**Architecture Metric Loop (callable service):** Run the Metric Loop Protocol (`protocols/metric-loop.md`) on `architecture.md`. Define a metric: coverage of PRD requirements, specificity, consistency across the 4 architects, and **simplicity** — is this the simplest architecture that meets the requirements? Could any service, abstraction, or dependency be eliminated? Penalize over-engineering. Max 3 iterations.
 #### Step 2.3.1.idx — Architecture graph index
@@ -701,6 +737,14 @@ Run via the Bash tool:
 **Architecture decisions:** The Implementation Blueprint synthesizer returns 4 `deviation_row` objects (or a `phase_2_decisions` array of row objects) in its structured result — one per cross-cutting Phase 2 decision (API contract, persistence layer, auth model, framework choice). The orchestrator forwards each row through the `scribe_decision` MCP tool (see Phase 4 "Orchestrator-scribe dispatch"); the MCP allocates `D-2-<seq>` IDs and atomically appends to `docs/plans/decisions.jsonl`. Author = `architect`. Each row carries a `ref` anchor pointing into `architecture.md` per `protocols/decision-log.md`. Total: 4 rows.
+### Step 2.4 — Security Review (post-synthesis, NOT in team)
+Security runs as a standalone subagent AFTER the architecture is synthesized. It reviews the complete picture rather than debating piecemeal during the team phase. This eliminates the coordination overhead of security's dense cross-check pairings while preserving full security coverage.
+Description: "Security architecture review" — agent_type: `engineering-security-engineer` — subagent_type: `engineering-security-engineer` — model: `sonnet` — Prompt: "[CONTEXT header above] Security review of the synthesized architecture. Read these files via your Read tool before starting — do NOT expect pasted content:\n  - ARCHITECTURE: `docs/plans/architecture.md` (the synthesized output — this is your primary input)\n  - PRD: `docs/plans/design-doc.md`\n  - PRODUCT SPEC: `docs/plans/product-spec.md` (## Permissions & Roles is your auth model source of truth)\n  - BACKEND CONTRACT: `docs/plans/phase-2-contracts/backend-architect.md`\n  - FRONTEND CONTRACT: `docs/plans/phase-2-contracts/frontend-architect.md`\n\nReview the architecture for: auth model completeness, input validation coverage, secrets management, threat model, CSRF/XSS/injection surface, RLS policy design, dependency hygiene, client-side auth posture (token storage, secure cookies). Use the product spec's ## Permissions & Roles section to verify every role maps to enforceable permissions.\n\nWrite `docs/plans/phase-2-contracts/security-engineer.md` with your findings. Structure: auth model, RLS policies, threat model, input validation rules, secrets management, security headers, and a `### Required Revisions` section listing any changes needed to `architecture.md`. If no revisions needed, state 'No revisions required.'\n\nIf `### Required Revisions` is non-empty, the synthesizer will re-run once to incorporate your findings."
+**Post-security revision (conditional):** If the security review's `### Required Revisions` section is non-empty, re-dispatch the Implementation Blueprint synthesizer (Step 2.3 dispatch #1) with an additional instruction: "Read `docs/plans/phase-2-contracts/security-engineer.md` § Required Revisions and incorporate into `architecture.md`. Do not re-read other contracts — only apply the security revisions." Then re-run the Refs Indexer. Max 1 revision cycle.
 **Writes:** `docs/plans/architecture.md`, `docs/plans/sprint-tasks.md`, `docs/plans/quality-targets.json`, `docs/plans/refs.json`. Decision rows (4) flow through the orchestrator's `scribe_decision` MCP calls.
 ### Quality Gate 2
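The post-security revision in Step 2.4 hinges on whether the `### Required Revisions` section of `security-engineer.md` is non-empty. A minimal sketch of that check follows. The function name and the parsing rules are assumptions; the package may detect this differently, or leave it to the orchestrator's judgment.

```typescript
// Returns true when the `### Required Revisions` section contains anything
// other than the sentinel sentence 'No revisions required.'
// Assumption: the section ends at the next heading of level 1 to 3.
function needsSecurityRevision(markdown: string): boolean {
  const lines = markdown.split("\n");
  const start = lines.findIndex((l) => l.trim() === "### Required Revisions");
  if (start === -1) return false; // section missing: nothing to apply
  const body: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (/^#{1,3}\s/.test(line)) break;
    if (line.trim() !== "") body.push(line.trim());
  }
  const text = body.join(" ");
  return text !== "" && !/^no revisions required\.?$/i.test(text);
}
```

Because the step allows at most one revision cycle, a caller would run this check once after the security review and not again after the synthesizer re-runs.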
@@ -978,7 +1022,7 @@ Call the Agent tool 5 times in one message:
 2. Description: "Performance audit" — agent_type: `testing-performance-benchmarker` — subagent_type: `testing-performance-benchmarker` — Prompt: "[CONTEXT header above] Measure response times, identify bottlenecks, flag performance issues. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for performance thresholds. Bundle size per-Scope budgets apply (Marketing 500KB / Product 300KB / Dashboard 400KB / Internal 200KB gzipped). Report benchmarks AGAINST these targets, not generic metrics."
-3. Description: "A11y audit" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above] WCAG 2.2 AA runtime compliance audit on all interfaces. Check screen reader, keyboard nav, contrast, focus order, touch targets (>=44px), reduced-motion variants. Report issues with severity (Critical/Serious/Moderate/Minor)."
+3. Description: "A11y audit" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above] Light-touch accessibility sweep — flag only Critical and Serious WCAG 2.2 AA violations (blatant ADA issues). Skip Moderate/Minor. Keep the report concise. WCAG 2.2 AA runtime compliance audit on all interfaces. Check screen reader, keyboard nav, contrast, focus order, touch targets (>=44px), reduced-motion variants. Report issues with severity (Critical/Serious/Moderate/Minor)."
 4. Description: "Security audit" — agent_type: `engineering-security-engineer` — subagent_type: `engineering-security-engineer` — Prompt: "[CONTEXT header above] Security review at app level: auth, input validation, data exposure, dependency vulnerabilities. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for security thresholds. Report findings with severity."
@@ -1155,7 +1199,7 @@ Evaluate whether the build meets NFR targets (response time, load handling, erro
 SRE MAY spawn ONE read-only follow-up investigation, but ONLY if verdict would be BLOCK. Same caps as Security."
-4. Description: "LRR A11y chapter" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above] You are the A11y chapter of the LRR (NEW SEAT in this panel — closes the biggest coverage gap). Read: Phase 5 a11y audit output (from Step 5.1), WCAG 2.2 AA runtime check, per-page accessibility findings, `docs/plans/quality-targets.json` a11y section.
+4. Description: "LRR A11y chapter" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above] Advisory only — BLOCK verdict requires Critical-severity violations only. Serious issues are CONCERNS, not BLOCK. You are the A11y chapter of the LRR (NEW SEAT in this panel — closes the biggest coverage gap). Read: Phase 5 a11y audit output (from Step 5.1), WCAG 2.2 AA runtime check, per-page accessibility findings, `docs/plans/quality-targets.json` a11y section.
 Scoring rules:
   - PASS if zero Serious + zero Critical findings
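Read together with the new "Advisory only" preamble, the visible scoring rule suggests a three-way verdict. The sketch below is an interpretation, not the package's code: the rule list is cut off in this diff, so any rules beyond the first are unknown, and the `a11yVerdict` name is hypothetical.

```typescript
type Severity = "Critical" | "Serious" | "Moderate" | "Minor";
type Verdict = "PASS" | "CONCERNS" | "BLOCK";

// Critical findings block; Serious findings are concerns only;
// anything else passes (zero Serious and zero Critical).
function a11yVerdict(findings: Severity[]): Verdict {
  if (findings.some((f) => f === "Critical")) return "BLOCK";
  if (findings.some((f) => f === "Serious")) return "CONCERNS";
  return "PASS";
}
```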

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "buildanything",
-  "version": "2.1.2",
+  "version": "2.2.0",
   "description": "One command to build an entire product. 44 specialist agents orchestrated into a full engineering pipeline for Claude Code.",
   "bin": {
     "buildanything": "./bin/setup.js",

package/protocols/state-schema.json CHANGED

@@ -1,6 +1,6 @@
 {
   "$schema": "http://json-schema.org/draft-07/schema#",
-  "$comment": "Schema version migration table: schema_version 1 = Stages 1-3; schema_version 2 = Stage 4 (adds backward_routing_count, backward_routing_count_by_target_phase, in_flight_backward_edge, mode_transitions); schema_version 3 = Stage 5 (adds lrr_cycle_state); schema_version 4 = Stage 6 (adds current_sprint_context_hash), schema_version 5 = Stage 7 (adds feature_delegation_plan_path, current_wave, completed_features, feature_acceptance, feature_briefs). --- Runtime validation rules (not encodable in JSON Schema, require code): Rule 5 — step prefix must match current phase number; Rule 6 — mode/autonomous consistency (mode==='autonomous' iff autonomous===true); Rule 7 — iOS fields gating (app_name, bundle_id, xcodeproj_path, ios_features, phase_progress.phase_minus_1 exist iff project_type==='ios'); Rule 10 — pending/in-progress disjoint (in_progress_task.task_id not in pending_tasks or completed_tasks); Rule 11 — resume_point.phase/step must not be ahead of top-level phase/step; Rule 12 — timestamps monotonic (session_last_saved >= session_started).",
+  "$comment": "Schema version migration table: schema_version 1 = Stages 1-3; schema_version 2 = Stage 4 (adds backward_routing_count, backward_routing_count_by_target_phase, in_flight_backward_edge, mode_transitions); schema_version 3 = Stage 5 (adds lrr_cycle_state); schema_version 4 = Stage 6 (adds current_sprint_context_hash), schema_version 5 = Stage 7 (adds feature_delegation_plan_path, current_wave, completed_features, feature_acceptance, feature_briefs), schema_version 6 = Stage 8 (adds phase_summaries). --- Runtime validation rules (not encodable in JSON Schema, require code): Rule 5 — step prefix must match current phase number; Rule 6 — mode/autonomous consistency (mode==='autonomous' iff autonomous===true); Rule 7 — iOS fields gating (app_name, bundle_id, xcodeproj_path, ios_features, phase_progress.phase_minus_1 exist iff project_type==='ios'); Rule 10 — pending/in-progress disjoint (in_progress_task.task_id not in pending_tasks or completed_tasks); Rule 11 — resume_point.phase/step must not be ahead of top-level phase/step; Rule 12 — timestamps monotonic (session_last_saved >= session_started).",
   "title": ".build-state.json",
   "description": "Typed source of truth for BuildAnything build state. Validated by the PreToolUse schema lint hook (W2-2). additionalProperties: false enforces fail-closed per A8 SSOT rule.",
   "type": "object",
@@ -199,6 +199,20 @@
         "session_id": { "type": ["string", "null"] },
         "timestamp":  { "type": "string", "format": "date-time" }
       }
+    },
+    "phase_summary": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["phase", "completed_at", "artifacts", "decisions", "status"],
+      "properties": {
+        "phase": { "type": "integer", "enum": [-1, 0, 1, 2, 3, 4, 5, 6, 7] },
+        "completed_at": { "type": "string", "format": "date-time" },
+        "artifacts": { "type": "array", "items": { "type": "string" } },
+        "decisions": { "type": "string", "maxLength": 300 },
+        "status": { "type": "string", "enum": ["approved", "approved_with_concerns", "auto_approved"] },
+        "carry_forward": { "type": "string", "maxLength": 200 }
+      }
     }
   },
@@ -229,7 +243,7 @@
     "schema_version": {
       "type": "integer",
       "minimum": 1,
-      "maximum": 5,
+      "maximum": 6,
       "description": "Currently 5 (Stage 7). Bumped to 2 at Stage 4, 3 at Stage 5, 4 at Stage 6, 5 at Stage 7."
     },
     "project_type": {
@@ -418,6 +432,13 @@
       "additionalProperties": { "type": "string" },
       "$comment": "Stage 7+ (schema_version >= 5). Written by orchestrator after each briefing-officer dispatch (Step 4.2.a).",
       "description": "Stage 7+ (schema_version >= 5). Map of feature name to feature brief file path (docs/plans/feature-briefs/{feature}.md)."
+    },
+    "phase_summaries": {
+      "type": "array",
+      "items": { "$ref": "#/$defs/phase_summary" },
+      "$comment": "Stage 8+ (schema_version >= 6). Written by orchestrator at each phase boundary per the Context Budget Protocol.",
+      "description": "Stage 8+ (schema_version >= 6). Structured carry-forward summaries from completed phases. Max ~500 tokens per entry."
     }
   }
 }
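For readers consuming `phase_summaries` from TypeScript, the new `phase_summary` definition maps onto a small interface. The sketch below mirrors the constraints in the schema above. The `PhaseSummary` and `validatePhaseSummary` names are hypothetical, and the date check is looser than JSON Schema's `date-time` format.

```typescript
type PhaseStatus = "approved" | "approved_with_concerns" | "auto_approved";

interface PhaseSummary {
  phase: number;          // integer from -1 to 7
  completed_at: string;   // date-time string
  artifacts: string[];
  decisions: string;      // maxLength 300
  status: PhaseStatus;
  carry_forward?: string; // optional, maxLength 200
}

// JSON Schema maxLength counts Unicode code points, so count those
// rather than UTF-16 code units.
function codePoints(s: string): number {
  return Array.from(s).length;
}

// Returns a list of violations; an empty list means the entry looks valid.
function validatePhaseSummary(s: PhaseSummary): string[] {
  const errors: string[] = [];
  const statuses = ["approved", "approved_with_concerns", "auto_approved"];
  if (Math.floor(s.phase) !== s.phase || s.phase < -1 || s.phase > 7) {
    errors.push("phase must be an integer from -1 to 7");
  }
  if (isNaN(Date.parse(s.completed_at))) errors.push("completed_at is not a parseable date");
  if (codePoints(s.decisions) > 300) errors.push("decisions exceeds 300 characters");
  if (statuses.indexOf(s.status) === -1) errors.push("status is not an allowed value");
  if (s.carry_forward !== undefined && codePoints(s.carry_forward) > 200) {
    errors.push("carry_forward exceeds 200 characters");
  }
  return errors;
}
```

Since the definition sets `additionalProperties: false`, a full validator would also reject unknown keys; that part is omitted here.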

package/protocols/state-schema.md CHANGED

@@ -17,6 +17,7 @@
 | 3 | Stage 5 | `lrr_cycle_state` (object; interior fields loose-typed pending Stage 5 iteration — see "Fields added at v3" below) | `BUILDANYTHING_SDK_LRR=false` reverts to markdown aggregator; `lrr_cycle_state` becomes an ignored field on the orchestrator read path (additive-only, no data loss on downgrade) |
 | 4 | Stage 6 | `current_sprint_context_hash` | `BUILDANYTHING_SDK_SPRINT_CONTEXT=false` (web) and/or `BUILDANYTHING_SDK_SPRINT_CONTEXT_IOS=false` (iOS parity gate) reverts Phase 4 to per-task refs re-send; `current_sprint_context_hash` becomes an ignored field on the orchestrator read path (additive-only, no data loss on downgrade) |
 | 5 | Stage 7 | `feature_delegation_plan_path`, `current_wave`, `completed_features`, `feature_acceptance`, `feature_briefs` | Feature-level fields are additive and optional; a Stage 6 runtime reading a Stage 7 state file with `schema_version` downgraded to `4` will ignore these fields without data loss on the read path |
+| 6 | Stage 8 | `phase_summaries` | Additive; ignored by older runtimes (no data loss on downgrade) |
 **A7 forward-reject rule.** When `bin/buildanything-runtime.ts` reads `.build-state.json` at session start, if `schema_version > MAX_SUPPORTED_SCHEMA_VERSION`, the runtime refuses to proceed and emits a clear error pointing to the compat matrix (`docs/migration/sdk-host-compat.md`). This is the A7 defense against silent schema drift — an old runtime must never silently ignore fields a newer runtime persisted. See **Task 4.5.2** for the runtime implementation (out of scope for this prose-only update).
@@ -82,6 +83,7 @@
 | `verification` | object | yes | `{last_verify_result, last_verify_timestamp}`. `last_verify_result` is one of `"PRODUCTION_READY"`, `"NEEDS_WORK"`, `"BLOCKED"`, or `null`. |
 | `blockers` | array | no | Open blockers. Each: `{id, description, surfaced_at, type}`. Type is `"build"`, `"design"`, `"dep"`, or `"external"`. |
 | `decisions_pruned_at_phase0` | boolean | no | Default `false`. Set to `true` after Phase 0 archives stale decision rows. |
+| `phase_summaries` | array | no | Structured carry-forward summaries from completed phases. Each entry: `{phase, completed_at, artifacts[], decisions, status, carry_forward?}`. Max ~500 tokens per entry. Written at phase boundaries per the Context Budget Protocol. |
 ### Decision Log Pruning (Phase 0)
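The A7 forward-reject rule quoted in the hunk above can be illustrated with a short guard. This is a sketch only: the runtime implementation is deferred to Task 4.5.2 and is not part of this diff, the function name is hypothetical, and the constant's value of 6 is an assumption for a runtime that understands `phase_summaries`.

```typescript
// Assumed value for a runtime that supports schema_version 6.
const MAX_SUPPORTED_SCHEMA_VERSION = 6;

// Refuse to proceed when the state file was written by a newer runtime.
function assertSchemaSupported(state: { schema_version: number }): void {
  if (state.schema_version > MAX_SUPPORTED_SCHEMA_VERSION) {
    throw new Error(
      ".build-state.json schema_version " + state.schema_version +
      " is newer than the supported maximum " + MAX_SUPPORTED_SCHEMA_VERSION +
      "; see docs/migration/sdk-host-compat.md"
    );
  }
}
```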

package/protocols/web-phase-branches.md CHANGED

@@ -237,16 +237,6 @@ Call the Agent tool once:
 Record the score history to `docs/plans/build-log.md` under `## Design Critic Loop`.
-### Step 3.7 — A11y Design Review (single agent)
-WCAG 2.2 AA runtime check on the rendered style guide plus any key product pages that exist at this point.
-Call the Agent tool once:
-1. Description: "A11y design review" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — prompt: "[CONTEXT header above — phase: 3] WCAG 2.2 AA runtime check on the rendered `/design-system` route and any key product pages. Check contrast, focus order, keyboard navigation, screen reader labels, reduced-motion variants, and touch targets (>= 44px). Use Playwright and axe-core. Save findings to `docs/plans/a11y-design-review.md` with severity tags (Critical / Serious / Moderate / Minor)."
-Output: `docs/plans/a11y-design-review.md`.
 ### Step 3.8 — Autonomous Quality Gate
 Log to `docs/plans/build-log.md`: final screenshot paths, Design Critic score history (per-round totals plus per-axis subscores), a11y findings count by severity, a DNA compliance score derived from the critic's 7 DNA-axis subscores, and the DESIGN.md lint result (broken-refs count, warning count, hash). No user pause.
@@ -367,7 +357,7 @@ Call the Agent tool 5 times in one message:
 Exceeding the budget by >25% auto-blocks the Phase 6 LRR SRE chapter. Budget violations route back to Phase 3.2 (component mapping — swap a heavy variant for a lighter one) OR Phase 4 (code-splitting, lazy-loading, dynamic imports). Report budget-compliance per Scope axis, with the exact gzipped bundle size and LCP measurement."
-3. Description: "Accessibility audit" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above — phase: 5] WCAG 2.2 AA runtime compliance audit on all interfaces. NFR target: Read `docs/plans/quality-targets.json` via your Read tool for accessibility thresholds. Check screen reader, keyboard nav, contrast, focus order, reduced-motion variants, touch targets >= 44px. Report issues with severity tags (Critical/Serious/Moderate/Minor). This is the same agent that sets constraints at Phase 2 and judges at Phase 6 LRR — keep the standards consistent across all three invocations."
+3. Description: "Accessibility audit" — agent_type: `a11y-architect` — subagent_type: `a11y-architect` — Prompt: "[CONTEXT header above — phase: 5] Light-touch accessibility sweep — flag only Critical and Serious WCAG 2.2 AA violations. Skip Moderate/Minor. WCAG 2.2 AA runtime compliance audit on all interfaces. NFR target: Read `docs/plans/quality-targets.json` via your Read tool for accessibility thresholds. Check screen reader, keyboard nav, contrast, focus order, reduced-motion variants, touch targets >= 44px. Report issues with severity tags (Critical/Serious/Moderate/Minor). This is the same agent that sets constraints at Phase 2 and judges at Phase 6 LRR — keep the standards consistent across all three invocations."
 4. Description: "Security audit" — agent_type: `engineering-security-engineer` — subagent_type: `engineering-security-engineer` — Prompt: "[CONTEXT header above — phase: 5] Security review: auth, input validation, data exposure, dependency vulnerabilities. NFR targets: Read `docs/plans/quality-targets.json` via your Read tool for security thresholds. Report findings with severity."

package/src/orchestrator/worktree-launcher.ts ADDED

@@ -0,0 +1,20 @@
+export interface WorktreeCommandOpts {
+  worktreeName: string;
+  model: string;
+  prompt: string;
+  timeoutSeconds?: number;
+}
+export function buildWorktreeCommand(opts: WorktreeCommandOpts): string {
+  const timeout = opts.timeoutSeconds ?? 1800;
+  const escaped = opts.prompt.replace(/"/g, '\\"');
+  return `timeout ${timeout} claude -p --worktree ${opts.worktreeName} --model ${opts.model} --dangerously-skip-permissions "${escaped}"`;
+}
+export function buildMergeCommands(worktreeName: string): string[] {
+  return [
+    `git merge worktree-${worktreeName} --no-edit`,
+    `git worktree remove .claude/worktrees/${worktreeName} --force 2>/dev/null || true`,
+    `git branch -D worktree-${worktreeName} 2>/dev/null || true`,
+  ];
+}
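One observation on `buildWorktreeCommand` as added above: it escapes only double quotes, so a prompt containing `$`, backticks, or backslashes would still be interpreted by a POSIX shell inside the double-quoted argument. A possible variation, sketched below under the assumption that the command runs in a POSIX shell, wraps each argument in single quotes instead. Both function names are hypothetical; the flags are copied from the diff.

```typescript
// Quote a value for a POSIX shell: wrap in single quotes and rewrite each
// embedded single quote as '\'' (close quote, escaped quote, reopen quote).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Same command shape as buildWorktreeCommand, with every argument quoted.
function buildQuotedWorktreeCommand(
  worktreeName: string,
  model: string,
  prompt: string,
  timeoutSeconds: number = 1800
): string {
  return [
    "timeout", String(timeoutSeconds),
    "claude", "-p",
    "--worktree", shellQuote(worktreeName),
    "--model", shellQuote(model),
    "--dangerously-skip-permissions",
    shellQuote(prompt),
  ].join(" ");
}
```

Passing an argument array to a process-spawning API would avoid shell quoting altogether, but the source returns a single command string, so the sketch keeps that shape.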