codebyplan 1.5.0 → 1.8.0 (npm)

Files changed (206)

package/templates/skills/cbp-round-check/SKILL.md ADDED

@@ -0,0 +1,104 @@
+---
+scope: org-shared
+name: cbp-round-check
+description: Run automated checks standalone for the current round
+effort: low
+---
+<!-- Re-read this file before executing. Do not rely on memory. -->
+# Round Check Command
+Run automated checks independently with mandatory execution. Updates round QA. Hard fails if mandatory checks (build/lint/types) fail.
+## Instructions
+### Step 1: Get Current Round
+Use MCP `get_current_task` to find the active task, then `get_rounds` to find the in-progress round.
+### Step 2: Determine Project Root
+Find the correct app directory:
+```bash
+REPO_ROOT="$(git rev-parse --show-toplevel)"
+```
+Identify app dir from project structure (e.g., `apps/web/` for Next.js).
+### Step 3: Execute Mandatory Checks (Hard Fail)
+For each check, EXECUTE the command and capture stdout + stderr. Log execution status.
+| Check | Command | Hard Fail |
+|-------|---------|-----------|
+| **Build** | `cd {app_dir} && npm run build 2>&1` | YES |
+| **Lint** | `cd {app_dir} && npm run lint 2>&1` | YES |
+| **Types** | `cd {app_dir} && npx tsc --noEmit 2>&1` | YES |
+For each:
+- Run the command via Bash tool
+- Log `EXECUTED: <command>` or `FAILED: <command> (exit code: N)`
+- If skipping (infrastructure-only changes): log `SKIPPED: <command> (reason: no app code changed)`
+### Step 4: Execute Conditional Checks
+| Check | Command | Condition |
+|-------|---------|-----------|
+| **Tests** | `cd {app_dir} && npx vitest --run 2>&1` | Test files exist |
+| **A11y** | Static check (aria, alt, focus) | UI files changed |
+| **API Health** | `curl -s -o /dev/null -w "%{http_code}" http://localhost:{PORT}/` | API routes changed |
+| **Visual** | Visual check flow (page-map + visual-check) | UI work + dev server running |
+### Step 5: Analyze Build Output
+Scan all captured output for:
+- **Warnings** (not just errors)
+- **Deprecation notices** (`grep -i "deprecat"` in output)
+- **Console.log in changed files**: `grep -rn "console\.\(log\|debug\|info\)" {changed_files}` (exclude tests)
+- **Bundle size warnings**
+### Step 6: Save QA Results
+Update round QA via MCP `update_round(round_id, qa: ...)`:
+```json
+{
+  "items": [
+    {"type": "auto", "check": "build", "status": "pass", "ran_at": "...", "notes": null, "executed": true},
+    {"type": "auto", "check": "lint", "status": "fail", "ran_at": "...", "notes": "3 errors", "executed": true},
+    {"type": "auto", "check": "types", "status": "pass", "ran_at": "...", "notes": null, "executed": true},
+    {"type": "auto", "check": "tests", "status": "skipped", "ran_at": "...", "notes": "no test files", "executed": false}
+  ]
+}
+```
+### Step 7: Show Results
+```
+## Round Check Results
+| Check | Status | Executed | Notes |
+|-------|--------|----------|-------|
+| Build | pass | yes | - |
+| Lint | fail | yes | 3 errors |
+| Types | pass | yes | - |
+| Tests | skipped | no | no test files |
+| Visual | pass | yes | screenshots saved |
+### Build Analysis
+- Warnings: [N]
+- Deprecations: [N]
+- Console.logs in code: [N]
+**Result**: [N] passed, [N] failed, [N] skipped
+**Hard fail**: [yes/no]
+```
+If hard fail: `Mandatory checks failed. Fix issues before continuing.`
+If soft failures only: `Run /cbp-round-start to trigger auto-fix, or fix manually.`
+## Integration
+- **Reads**: MCP `get_current_task`, `get_rounds`
+- **Writes**: MCP `update_round` (qa field)
+- **Standalone**: Can be run independently at any time
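
Note on the `cbp-round-check` file above: the result line in Step 7 (`[N] passed, [N] failed, [N] skipped` plus `Hard fail`) can be derived from the QA items saved in Step 6. The following is a minimal sketch, not code from the package; the names `QaItem`, `MANDATORY`, and `summarize` are hypothetical, and it assumes that a hard fail means a `fail` status on one of the three checks Step 3 marks as mandatory.

```typescript
// Hypothetical types mirroring the QA item shape shown in Step 6.
type Status = "pass" | "fail" | "skipped";

interface QaItem {
  type: "auto";
  check: string;
  status: Status;
  ran_at: string;
  notes: string | null;
  executed: boolean;
}

interface Summary {
  passed: number;
  failed: number;
  skipped: number;
  hardFail: boolean;
}

// Checks that the Step 3 table marks as "Hard Fail: YES".
const MANDATORY = new Set(["build", "lint", "types"]);

// Count statuses and flag a hard fail when a mandatory check failed.
function summarize(items: QaItem[]): Summary {
  const summary: Summary = { passed: 0, failed: 0, skipped: 0, hardFail: false };
  for (const item of items) {
    if (item.status === "pass") {
      summary.passed += 1;
    } else if (item.status === "fail") {
      summary.failed += 1;
      if (MANDATORY.has(item.check)) summary.hardFail = true;
    } else {
      summary.skipped += 1;
    }
  }
  return summary;
}

// The four items from the Step 6 example ("ran_at" left as the placeholder).
const exampleItems: QaItem[] = [
  { type: "auto", check: "build", status: "pass", ran_at: "...", notes: null, executed: true },
  { type: "auto", check: "lint", status: "fail", ran_at: "...", notes: "3 errors", executed: true },
  { type: "auto", check: "types", status: "pass", ran_at: "...", notes: null, executed: true },
  { type: "auto", check: "tests", status: "skipped", ran_at: "...", notes: "no test files", executed: false },
];

const roundCheckSummary = summarize(exampleItems);
```

With the Step 6 example, the failing `lint` item is a mandatory check, so the sketch reports a hard fail.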

package/templates/skills/cbp-round-end/SKILL.md ADDED

@@ -0,0 +1,183 @@
+---
+scope: org-shared
+name: cbp-round-end
+description: Summary wrap-up after testing phase completes
+effort: high
+---
+# Round End Command
+Summary phase — presents what was done, then runs code quality review to catch bugs and logic errors that automated checks miss.
+**Inline-fallback for any spawn failure**: when `cbp-improve-round` (or any peer agent) fails to spawn, the orchestrator falls through to an inline procedure that produces equivalent (lower-fidelity but valid) output. The contract: detect failure class → record in `round.context.improve_round_findings.spawn_failure` → walk the agent's Phase checklist inline → continue the skill. Same procedure for every failure class (org/billing, monthly Agent cap, provider 5xx, rate limit, context overflow, tool not available). Pre-emptive skip applies when the same class fired on the prior round.
+See `reference/inline-fallback.md` for full trigger table, procedure, and coverage list.
+## Pipeline
+```
+/cbp-round-execute → /cbp-round-end → [code review + user decisions] → /cbp-round-update
+```
+## Identifier Notation
+This skill operates on the **active** task/round resolved via MCP `get_current_task` / `get_rounds` and does not accept a positional identifier argument. Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `.claude/rules/notation-consistency.md` "CHK / TASK / ROUND Identifier Notation": `108-1` (CHK-108 TASK-1), `45` (standalone TASK-45), `108-1-2` (round 2 of CHK-108 TASK-1), `45-2` (round 2 of standalone TASK-45).
+## Instructions
+### Step 1: Get Current Task and Round
+Use MCP `get_current_task` with repo_id (pass `checkpoint_id` if known to avoid disambiguation) to find the active task.
+Use MCP `get_rounds` for the task to find the in-progress round.
+Load round context with all outputs (executor_output, testing_qa_output, reviewer_output).
+### Step 2: Collect Files Changed
+Collect all files changed during this round from:
+- Work executor output
+- `git diff --name-status HEAD` for final state
+Build the files list with approval status:
+```json
+[
+  {
+    "path": "src/file.ts",
+    "action": "modified",
+    "claude_approved": true,
+    "user_approved": false
+  }
+]
+```
+**claude_approved**: `true` if cbp-testing-qa-agent passed for this file. `false` if issues remain.
+**user_approved**: Always `false` initially. User approves via git staging or web UI.
+### Step 3: Collect and Aggregate QA Results
+**No QA runs here** — all QA was already executed by per-wave `cbp-testing-qa-agent` inside `/cbp-round-execute` Step 5.
+This step is the **canonical user_qa[] construction site**. cbp-testing-qa-agent and cbp-test-e2e-agent are fully independent producers (no cross-read); cross-source user_qa items that combine signals from multiple producers are built here, at the user-facing boundary. Single-source user_qa items (derivable from one producer's data alone) stay with that producer.
+#### 3a — Collect single-source items from agent outputs
+Collect from round context (all three sources are agent-derived):
+- **Auto items**: from `testing_qa_output.auto_qa.items`
+- **User items (single-source)**: from `testing_qa_output.user_qa.items` (Phase 4b.1 design-source comparison + Phase 4b.2 mechanical-sweep spot-check + Phase 4b.0 connection smoke — all derivable from cbp-testing-qa-agent's own data)
+- **Default items**: from `testing_qa_output.default_checklist.items`
+#### 3b — Construct cross-source user_qa from frontend_ui_review (NEW)
+Read `round.context.frontend_ui_review.findings[]` (populated by the `cbp-frontend-ui` skill at `/cbp-round-execute` Step 5b under `phase: 'screenshot_review'`). For each finding, emit a `user_qa` item per the rules below.
+| Finding `category` | Finding `severity` | Emitted user_qa item |
+|--------------------|-------------------|---------------------|
+| `baseline_regression` | any | `{ type: 'user', check: 'Visual baseline regression — {page_or_screen}', status: 'pending', round_number: N, instructions: 'Open the diff PNG at `{screenshot}` (the `-diff.png` sibling shows pixel differences). Pixel-diff was `{baseline_diff_pct}%`. Decide: (a) regression — add a task to fix, OR (b) new rendering is correct — run `pnpm exec playwright test --update-snapshots` in `apps/{app}` and commit the updated baselines. Do NOT proceed until a decision is recorded.' }` |
+| `rendered_visual` | `critical` | `{ type: 'user', check: 'Rendered-visual critical — {page_or_screen}', status: 'pending', round_number: N, instructions: 'Open the screenshot at `{screenshot}`. The cbp-frontend-ui review flagged a critical rendering issue: `{finding.issue}`. Suggested fix: `{finding.suggestion}`. Decide whether this needs a fix-round before proceeding.' }` |
+| `rendered_visual` | `warning` | (no user_qa item; finding stays in `frontend_ui_review.findings` and surfaces via Step 7 findings presentation if relevant) |
+| Other categories | any | (no user_qa item from this step) |
+Skip Step 3b entirely when `round.context.frontend_ui_review` is absent (no e2e ran, or screenshot-review phase didn't execute).
+This is the required user gate for baseline updates — baselines are NEVER auto-accepted.
+#### 3c — Merge
+Combine the single-source items (3a) and cross-source items (3b) into a single `user_qa[]` for the round. Merge with previous rounds (supersede items for re-modified files, preserve verified items).
+### Step 4: Update Task Files and QA
+Update via MCP:
+- `update_task(task_id, files_changed: [...])` — merge with existing
+- `update_round(round_id, files_changed: [...], qa: {items: [...]})` — round-specific
+- `update_task(task_id, qa: {items: [...]})` — aggregated
+### Step 5: Present Summary
+```
+## Round [N] Complete - Ready for Review
+### Work Done
+[Brief summary from executor_output]
+### Files Changed ([N] files, [N] need approval)
+| File | Action | Claude | User |
+|------|--------|--------|------|
+| src/file.ts | modified | approved | pending |
+### Auto Checks
+| Check | Status |
+|-------|--------|
+| Build | pass |
+| Lint | pass |
+| Types | pass |
+| Tests | pass/skipped |
+```
+### Step 6: Spawn Code Quality Review
+Spawn `cbp-improve-round` agent via Agent tool with:
+```yaml
+input:
+  repo_id: [from config]
+  task: {id, title, requirements, context}
+  round: {id, number, requirements, files_changed, context}
+  project_path: [working directory]
+```
+Wait for agent to complete. If the spawn fails for any reason, apply the inline-fallback procedure documented in `reference/inline-fallback.md` (record `round.context.improve_round_findings.spawn_failure`, walk the agent's Phase checklist inline, continue the skill).
+### Step 7: Present Findings
+**If `status: 'no_findings'`:** show `### Code Review\nNo issues found. Code looks good.` and skip to Step 8.
+**If findings exist**, present them grouped by severity (table + per-finding details), then ask the user via AskUserQuestion which to fix: `all`, `1,2` (specific numbers), `none`, or `inline` (only when all findings qualify under the Trivial-Resolution Exception).
+Example tables and the `inline` option gating spec: see `reference/findings-presentation.md`.
+### Step 8: Route Based on Decisions
+**If `round.context.auto_loop_mode === true`** (auto-loop active):
+- Auto-accept ALL findings into `improve_round_findings[]` regardless of severity (the user opted into the loop).
+- Skip the polish-spiral stop-gate (auto-loop has its own cap-exhausted termination).
+- Skip the user findings-decision prompt.
+- Save findings via `update_round` exactly as in manual mode.
+- Auto-trigger `/cbp-round-update` immediately. round-update Step 6 will decide whether to spawn another round or exit clean (see cbp-round-update SKILL.md Step 6).
+**Else (manual mode — flag absent or false):**
+Run the existing flow:
+1. After round 2+, surface the polish-spiral stop-gate per `polish-spiral-stop-gate.md` (defer-to-followups vs continue).
+2. Surface the findings-decision AskUserQuestion (with optional `inline` per the gating rules in `reference/findings-presentation.md`).
+3. Save accepted/rejected findings to round context via MCP `update_round`:
+   ```json
+   {
+     "context": {
+       "improve_round_findings": [accepted findings],
+       "improve_round_rejected": [rejected findings with user reasons]
+     }
+   }
+   ```
+4. Auto-trigger `/cbp-round-update`. round-update will see unapproved files and route to `/cbp-round-input`; `/cbp-round-input` picks up the findings from round context and includes them in the new round's requirements automatically.
+## Key Rules
+- Claude NEVER git adds files — user does code review
+- Auto-triggers `/cbp-round-update` after findings are handled
+- `/cbp-round-end` is auto-triggered by `/cbp-round-execute` (user does not call it directly)
+- Findings are **presented for user decision** — never auto-fix without user consent
+## Integration
+- **Triggered by**: `/cbp-round-execute` (auto, after all waves + testing complete)
+- **Reads**: MCP `get_current_task`, `get_rounds`, round context
+- **Writes**: MCP `update_round`, `update_task` (files_changed, qa, findings)
+- **Spawns**: `cbp-improve-round` (code quality review)
+- **Triggers**: `/cbp-round-update` (auto, after findings handled)
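
Note on the `cbp-round-end` file above: the Step 3b table can be read as a filter over `frontend_ui_review.findings[]`. The sketch below is illustrative only and is not taken from the package. Only the `category` and `severity` fields and their values come from the table; the function name `findingsNeedingUserGate` and the `style` category in the sample data are hypothetical.

```typescript
// Hypothetical shape: only `category` and `severity` come from the Step 3b table.
interface Finding {
  category: string;
  severity: string;
}

// Returns the findings that Step 3b turns into pending user_qa items.
// An absent review (undefined) means Step 3b is skipped entirely.
function findingsNeedingUserGate(
  review: { findings: Finding[] } | undefined,
): Finding[] {
  if (review === undefined) return [];
  return review.findings.filter(
    (f) =>
      // Baseline regressions are gated at any severity.
      f.category === "baseline_regression" ||
      // Rendered-visual findings are gated only when critical.
      (f.category === "rendered_visual" && f.severity === "critical"),
  );
}

const gatedFindings = findingsNeedingUserGate({
  findings: [
    { category: "baseline_regression", severity: "warning" },
    { category: "rendered_visual", severity: "critical" },
    { category: "rendered_visual", severity: "warning" },
    { category: "style", severity: "critical" }, // hypothetical "other" category
  ],
});
```

Findings that the filter drops are not discarded by the skill; per the table they stay in `frontend_ui_review.findings` and may surface in Step 7.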

package/templates/skills/cbp-round-end/reference/findings-presentation.md ADDED

@@ -0,0 +1,44 @@
+# Findings Presentation in `/cbp-round-end` Step 7
+When `improve-round` returns findings, Step 7 presents them grouped by severity and asks the user how to proceed.
+## Example output
+```
+### Code Review Findings
+| # | Severity | Category | File | Issue |
+|---|----------|----------|------|-------|
+| 1 | critical | bug | src/api.ts:42 | Missing null check on user input |
+| 2 | high | logic_error | src/calc.ts:15 | Off-by-one in loop boundary |
+| 3 | medium | edge_case | src/form.ts:88 | Empty array not handled |
+#### Details
+**1. Missing null check on user input** (critical)
+[description + suggested fix from agent]
+**2. Off-by-one in loop boundary** (high)
+[description + suggested fix from agent]
+```
+## AskUserQuestion options
+```
+Which findings should be fixed?
+- "all" — fix all findings in a new round
+- "1,2" — fix specific findings by number
+- "none" — skip all, proceed to round-update
+- "inline" — fix in THIS round before proceeding (only offered when all findings qualify under `infra-issue-absorption.md` Trivial-Resolution Exception)
+- Or explain why specific findings are not issues
+```
+## "inline" option gating
+Only present the "inline" option when ALL pending findings simultaneously satisfy:
+1. Diff is comment-only, annotation-only, banner-only, or single-value rename — no logic, no control flow
+2. Each fix is under ~5 minutes of executor time
+3. Verification is automatic — the existing test/lint/audit pipeline confirms the change
+If any finding fails these gates, omit the "inline" option entirely (revert to the 3-option prompt). When inline is chosen, apply the edits via direct `Edit`, re-run the verification commands (hook syntax check + `testing-qa-agent` scoped to modified files) and proceed to `/cbp-round-update` without spawning a new round. Document the decision in `round.context.inline_fix_log = { findings: [ids], rationale: "trivial-resolution exception", applied_at: <ISO> }` (mirrors the `bypass_log` shape from `infra-issue-absorption.md` "Pipeline Bypass for Trivial-Resolution Rounds").
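
Note on the gating rules above: the "inline" option is an all-or-nothing check across pending findings. A minimal sketch follows, assuming each finding has already been annotated with the three gate inputs. The field names `diffKind`, `estimatedMinutes`, and `autoVerifiable` are hypothetical, as is the choice to never offer the option when there are no pending findings.

```typescript
// Gate 1 input: the kinds of diff listed in the gating rules, plus "logic"
// as a hypothetical catch-all for anything touching logic or control flow.
type DiffKind = "comment" | "annotation" | "banner" | "single_value_rename" | "logic";

interface PendingFinding {
  id: number;
  diffKind: DiffKind;       // gate 1: no logic, no control flow
  estimatedMinutes: number; // gate 2: under ~5 minutes of executor time
  autoVerifiable: boolean;  // gate 3: existing pipeline confirms the change
}

// A single finding qualifies only when it passes all three gates.
function qualifiesForInline(f: PendingFinding): boolean {
  return f.diffKind !== "logic" && f.estimatedMinutes < 5 && f.autoVerifiable;
}

// The option is offered only when every pending finding qualifies.
function offerInlineOption(findings: PendingFinding[]): boolean {
  return findings.length > 0 && findings.every(qualifiesForInline);
}

const trivialFinding: PendingFinding = {
  id: 1, diffKind: "comment", estimatedMinutes: 2, autoVerifiable: true,
};
const logicFinding: PendingFinding = {
  id: 2, diffKind: "logic", estimatedMinutes: 2, autoVerifiable: true,
};
```

One non-qualifying finding is enough to drop the option for the whole prompt, which matches the "omit the option entirely" rule.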

package/templates/skills/cbp-round-end/reference/inline-fallback.md ADDED

@@ -0,0 +1,35 @@
+# Inline-Fallback for Any Spawn Failure
+When `improve-round` (or any agent spawned by this or peer skills) fails to spawn, the orchestrator falls through to an inline procedure that produces equivalent (lower-fidelity but valid) output. Same contract for every failure class — no special-casing per class.
+## Trigger conditions (any one)
+| Failure class             | Detection signal                                                                      |
+| ------------------------- | ------------------------------------------------------------------------------------- |
+| Org/billing limit         | `API Error: Extra usage is required for 1M context` (the original Proposal U trigger) |
+| Monthly Agent usage cap   | `API Error: This conversation has reached the monthly Agent usage limit` or similar   |
+| Provider 5xx              | Spawn returns `API Error 500` / `502` / `503` — transient or sustained                |
+| Rate limit                | `API Error 429` with retry hint                                                       |
+| Context overflow at spawn | Spawn returns `Context window exceeded` before agent can run                          |
+| Tool not available        | Skill caller's tool surface lacks Agent (rare — only in nested-agent contexts)        |
+## Fallback procedure (uniform across all triggers)
+1. Note the failure: `agent_spawned: false`, `skip_reason: "<one-line failure class>"`. Save to `round.context.improve_round_findings.spawn_failure = { class, error_message, decided_at }`.
+2. Perform the agent's analysis inline using whatever tools the orchestrator has (typically `Read` + `Bash` grep/find/head + `Glob`/`Grep`). Use the agent's documented Phase checklist as the script — agents are essentially curated checklists; following them inline produces equivalent (lower-fidelity but valid) output.
+3. Record findings in the same shape the agent would have returned (`findings[]` array with `severity`, `category`, `file`, `description`, `suggested_fix`). Mark each with `mode: 'inline_fallback'` so analytics can distinguish.
+4. Continue the skill — do NOT abort the round on spawn failure. The fallback is intended to keep the pipeline moving; aborting would force the user to manually re-run when the same failure will recur.
+**Pre-emptive skip**: when the same failure class fired in the previous round of the same task, skip the spawn attempt entirely and go straight to inline. This avoids one wasted API call per round during a sustained outage.
+## Coverage
+This fallback applies to:
+- `improve-round` spawned by `/cbp-round-end` (Step 6) — original case
+- `task-planner` spawned by `/cbp-round-start` Step 7 — orchestrator falls back to inline planning using the planner's Phase checklist
+- `testing-qa-agent` spawned by `/cbp-round-execute` Step 5 (per-wave) — orchestrator runs build/lint/types/tests inline via Bash and aggregates results in the agent's output shape
+- `task-check` spawned by `/cbp-task-check` skill — orchestrator walks the agent's verdict checklist inline
+- `improve-claude` spawned by its caller (when re-enabled) — orchestrator walks the agent's Phase 0-7 inline
+For details, each spawning skill carries a brief "Inline fallback" section pointing back to this contract. The canonical reference is here.
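
Note on the fallback contract above: step 1 of the procedure records a `spawn_failure` object of the form `{ class, error_message, decided_at }`. The sketch below shows one way the detection signals in the trigger table could be mapped to a failure class. It is an assumption-laden illustration, not the package's implementation: the class identifiers, `classifySpawnError`, and `buildSpawnFailure` are hypothetical, and the "Tool not available" row is left out because the table gives no error string for it.

```typescript
// Hypothetical identifiers for the failure classes in the trigger table.
type FailureClass =
  | "org_billing"
  | "monthly_agent_cap"
  | "provider_5xx"
  | "rate_limit"
  | "context_overflow"
  | "unknown";

// Match the detection signals quoted in the trigger table.
function classifySpawnError(message: string): FailureClass {
  if (message.includes("Extra usage is required")) return "org_billing";
  if (message.includes("monthly Agent usage limit")) return "monthly_agent_cap";
  if (/API Error 50[023]/.test(message)) return "provider_5xx";
  if (message.includes("API Error 429")) return "rate_limit";
  if (message.includes("Context window exceeded")) return "context_overflow";
  return "unknown";
}

// Shape from step 1 of the fallback procedure.
interface SpawnFailure {
  class: FailureClass;
  error_message: string;
  decided_at: string; // ISO timestamp
}

// Build the record to save under round.context.improve_round_findings.spawn_failure.
function buildSpawnFailure(message: string, now: Date): SpawnFailure {
  return {
    class: classifySpawnError(message),
    error_message: message,
    decided_at: now.toISOString(),
  };
}

const sampleFailure = buildSpawnFailure(
  "API Error 503",
  new Date("2024-01-01T00:00:00Z"),
);
```

Because the procedure is the same for every class, the classification matters mainly for the record itself and for the pre-emptive skip on the next round.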

package/templates/skills/cbp-round-execute/SKILL.md ADDED

@@ -0,0 +1,211 @@
+---
+scope: org-shared
+name: cbp-round-execute
+description: Execute the approved plan from /cbp-round-start — runs per-wave executors, inline testing-qa per wave, and routes to /cbp-round-end
+effort: xhigh
+---
+# Round Execute Command
+Execution and validation phase. Receives the approved plan from `/cbp-round-start`, dispatches wave executors, runs per-wave `cbp-testing-qa-agent` in parallel, and routes to `/cbp-round-end`.
+## Pipeline
+```
+/cbp-round-start → /cbp-round-execute → /cbp-round-end (auto)
+```
+## Identifier Notation
+This skill operates on the **active** task/round resolved via MCP `get_current_task` / `get_rounds` and does not accept a positional identifier argument. Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `.claude/rules/notation-consistency.md` "CHK / TASK / ROUND Identifier Notation": `108-1` (CHK-108 TASK-1), `45` (standalone TASK-45), `108-1-2` (round 2 of CHK-108 TASK-1), `45-2` (round 2 of standalone TASK-45).
+## Instructions
+### Step 1: Get Current Task and Round
+Use MCP `get_current_task` with repo_id (pass checkpoint_id if known) to find the active task.
+Use MCP `get_rounds` for the task to find the in-progress round.
+If no in-progress round: `No active round. Run /cbp-round-start first.`
+### Step 2: Load Approved Plan
+Read the plan from round context (`context.planner_output`). If no plan: `No approved plan in round context. Run /cbp-round-start first.`
+Read effective testing profile: `round.context.testing_profile_override` if set (user override for this round only), else `task.context.testing_profile` (set by planner Phase 4.8), else default `'web'`. Pass the effective profile to all per-wave `cbp-testing-qa-agent` spawns.
+### Step 3: Route Execution Path
+Inspect `approved_plan.files_to_modify[]` and `approved_plan.round_type`. Four execution paths exist; pick the one that matches BEFORE Step 3a/3b.
+| Condition                                                                        | Path                                                                                                             |
+| -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- |
+| `files_to_modify[]` empty AND `round_type === 'survey'`                          | **3-SURVEY** — orchestrator executes inline; constructs executor-equivalent output; NEVER spawn `cbp-round-executor` |
+| Every entry under `.claude/**`                                                   | **3-INLINE** — orchestrator applies via build-cc skills or direct Edit; NEVER spawn `cbp-round-executor`             |
+| At least one entry outside `.claude/**` AND `approved_plan.waves[]` has ≥2 waves | **3-WAVE** — dispatch per-wave per schema in `approved_plan.waves[]`                                             |
+| At least one entry outside `.claude/**` (single wave or no waves field)          | **3-AGENT** — spawn single `cbp-round-executor`                                                                      |
+#### Step 3-SURVEY: Empty-Files Survey Path
+Execute the survey instructions inline using Read/Grep/Bash. Save to `round.context.survey_output`. Build executor-equivalent output object with `round_type: 'survey'`. Skip to Step 3c.
+`round_type: 'survey'` MUST be set in `round.context` so Step 4 (dev-server probe) and downstream skills short-circuit correctly.
+#### Step 3-INLINE: `.claude/`-Only Inline Path
+For each entry, route per `rules/file-routing.md`:
+- `.claude/skills/{name}/SKILL.md` → `cbp-build-cc-skill` via Skill tool
+- `.claude/agents/{name}/AGENT.md` → `cbp-build-cc-agent` via Skill tool
+- `.claude/rules/{name}.md` → `cbp-build-cc-rule` via Skill tool
+- `.claude/CLAUDE.md` → `cbp-build-cc-claude-file` via Skill tool (or direct Edit)
+- `.claude/settings*.json` → `cbp-build-cc-settings` via Skill tool
+- `.claude/context/**`, `.claude/docs/**` → direct Edit
+- `.claude/hooks/{name}.sh` → direct Write/Edit
+Build executor-equivalent output object inline. Skip to Step 3c.
+#### Step 3-WAVE: Multi-Wave Dispatch
+When `approved_plan.waves[]` is present and has ≥2 entries:
+1. Topological-sort waves by `depends_on[]` to determine dispatch order.
+2. For each wave whose `depends_on[]` names are all complete, spawn the wave executor:
+   - `agent_type: 'cbp-round-executor'` → spawn `cbp-round-executor` with wave-scoped input (see `agents/cbp-round-executor.md` wave input contract)
+   - `agent_type: 'inline'` → execute inline as 3-INLINE path, scoped to `wave.files[]`
+3. After each wave completes, spawn `cbp-testing-qa-agent` against `wave.files[]` with `testing_profile` from Step 2. Run this testing spawn in PARALLEL with the next wave's executor when dependency order allows.
+4. After all waves complete, merge all `files_changed[]` into a single executor output.
+#### Step 3-AGENT: Single `cbp-round-executor` Spawn
+#### Mechanical-Edits Delegation Gate
+Before spawning `cbp-round-executor`, inspect `task.context.work_mode` (set by cbp-task-planner Phase 4.1).
+| Value              | Action                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+| ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `mechanical`       | Spawn `cbp-mechanical-edits` instead of `cbp-round-executor`. Derive the spec (renames / substitutions / frontmatter_edits / index_regen) from `approved_plan.files_to_modify[]` + the task's deliverables; pass it via the prompt body per the agent's Input Contract. After the agent returns, verify `git status --porcelain` reflects only expected paths AND `validation.orphaned_refs` is empty. Skip the rest of Step 3-AGENT and proceed to Step 3b.                                                                                                                                                                                                         |
+| `mixed`            | Read `task.context.mechanical_files[]` (populated by cbp-task-planner Phase 4.1 per its partition rule). Spawn `cbp-round-executor` for the AUTHORED portion FIRST — the executor's `files` input is `files_to_modify[]` MINUS `mechanical_files[]`. After it returns, spawn `cbp-mechanical-edits` against ONLY `mechanical_files[]` — derive the spec (renames / substitutions / frontmatter_edits / index_regen) from those entries' purpose strings. Merge both `files_changed[]` results into a single output for Step 5. If `mechanical_files[]` is absent or empty when `work_mode === 'mixed'`, halt with a planner-output error (Phase 4.1 contract violation). |
+| `design` or absent | Proceed with the existing `cbp-round-executor` spawn below (no change in behaviour).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
+**Universal postcondition for both `mechanical` and `mixed`:** if any spawned `cbp-mechanical-edits` reports `validation.orphaned_refs.length > 0`, treat as a hard-fail signal and route through Step 6 (regardless of whether the executor also ran in the `mixed` path).
+This gate is distinct from Phase 2.95's `execution_mode` parallelism hint (consumed downstream by `cbp-round-executor` Step 3.5 for batch-create scenarios). Both gates can fire on the same task.
+Spawn `cbp-round-executor` with the approved plan and full context:
+```yaml
+input:
+  task_number: N
+  round_number: N
+  approved_plan: [from round context]
+  checkpoint: { id, title, goal, context }
+  task: { id, title, requirements, context }
+  resources: [merged from checkpoint.resources + task.resources]
+```
+Wait for executor output.
+### Step 3b: Database Work (if plan includes DB changes)
+If the approved plan includes database schema changes, RLS policies, or type generation:
+1. Spawn `cbp-database-agent` with DB-related steps from the plan
+2. Wait for completion
+3. Merge `files_changed` into executor output
+### Step 3c: Completion Check
+- `status: 'completed'` and all deliverables done → proceed to Step 4
+- `status: 'blocked'` → present blocker to user via AskUserQuestion, resolve, re-spawn executor with remaining work
+- Deliverables incomplete → re-spawn executor with remaining deliverables (max 3 re-triggers). After 3 re-triggers, save partial output and proceed.
+### Step 4: Dev-Server Probe (rounds 2+, web/desktop profile)
+When `round_number >= 2` AND `testing_profile` is `'web'` or `'desktop'` AND `files_changed` contains any UI file, probe the dev server BEFORE cbp-testing-qa-agent spawns (saves a full agent spawn when the server is down).
+Read `.codebyplan/server.json` `port_allocations[]`. For each configured port run:
+```bash
+curl -sf --max-time 3 -o /dev/null -w "%{http_code}" http://localhost:{port}/
+```
+Any non-2xx/3xx response → AskUserQuestion: start the dev server, skip UI checks, or abort.
+Skip this probe for `testing_profile === 'claude_only'` or `'backend'`.
+### Step 5: Per-Wave Testing (or global testing for single-wave)
+Use the effective testing profile resolved in Step 2 (`round.context.testing_profile_override` if set, else `task.context.testing_profile`, else `'web'`).
+**claude_only profile**: run inline checks only (no `cbp-testing-qa-agent` spawn):
+1. `bash -n <hook-file>` for each modified `.sh` in `files_changed`
+2. Verify each modified/created SKILL.md ≤300 lines (warn threshold; hook blocks at 600); `scope:` marker present; no `/cbp-*` notation
+On pass, synthesise `testing_qa_output` inline per the procedure in `reference/inline-fallback.md` "Validation fallback" section (output shape defined in `agents/cbp-testing-qa-agent.md` Output Contract) and persist to `round.context.testing_qa_output` at Step 7.
+**All other profiles**: spawn `cbp-testing-qa-agent` AND `cbp-test-e2e-agent` in parallel (two Agent calls in the same message) per completed wave (or full executor output in single-wave mode). `cbp-test-e2e-agent` is gated on `has_ui_work === true` AND profile in {`web`, `desktop`, `full_matrix`, `cross_app`} — skipped for `claude_only` / `backend`-only.
+Input contracts: `cbp-testing-qa-agent` receives `executor_output`, `testing_profile`, `has_ui_work` (see `agents/cbp-testing-qa-agent.md` Input Contract). `cbp-test-e2e-agent` receives `repo_id`, `round_number`, `files_changed`, `prior_round_files_changed` (full task aggregate when round_number ≥ 2), `whole_checkpoint_mode: false`, `test_strategy`, `pages_affected`, `has_auth`, `dev_server_port` (see `agents/cbp-test-e2e-agent.md` Input Contract for the full shape).
+**Independence**: neither agent reads the other's output. Cross-source user_qa items (cbp-frontend-ui + agent data) are constructed downstream at `/cbp-round-end` Step 3b. Per-wave spawns MAY run in parallel with the next wave's executor when dependency order allows.
+### Step 5b: Post-E2E Screenshot Review (cbp-frontend-ui Phase 6.5)
+When `round.context.e2e_output.screenshots[]` is non-empty, invoke the `cbp-frontend-ui` skill with `phase: 'screenshot_review'` (input: `files_changed`, `e2e_screenshots: round.context.e2e_output.screenshots`, `context: { checkpoint_goal, round_requirements }`). Under this phase the skill runs only Phase 6.5 (Rendered-Output Visual Review) + 7 + 8 — Phases 1-6 (style) already ran inline at executor Step 3.8 with `phase: 'style_only'`.
+Persist findings to `round.context.frontend_ui_review` (merge with Step 3.8's style-only output if present). `/cbp-round-end` Step 3b emits user_qa items for any `category: 'baseline_regression'` (any severity) and any `category: 'rendered_visual' + severity: 'critical'` — neither auto-fails the round. cbp-testing-qa-agent does NOT read these findings (full independence per Step 5).
+**Skip** when `round.context.e2e_output` is absent, `screenshots` is empty, or `testing_profile === 'claude_only'`.
+### Step 6: Hard-Fail Routing
+Per-wave hard-fail signal: `testing_qa_output.totals.hard_fail || e2e_output?.status === 'failed' || (e2e_output?.test_results?.failed ?? 0) > 0`. When `e2e_output` is absent (e2e was gated off), only the testing-qa term applies.
+**All waves hard_fail: false** → proceed to Step 7. **Any wave hard_fail: true**:
+- **Simple fixes with no prior re-trigger this round** (type errors, lint, missing imports, test assertion fixes, e2e `real`-category failures with a clear code-side root cause) → save failure details to round context; re-trigger the failing wave's executor; re-run testing-qa AND test-e2e for that wave.
+- **Structural OR already re-triggered once OR e2e preflight aborts** → save failure context via MCP `update_round`; auto-trigger `/cbp-round-input`. STOP.
+## Inline execution fallback
+When `cbp-round-executor` spawn fails (per `agent-spawn-failure-fallback.md` triggers), fall through to the 3-INLINE branch in Step 3 above for `.claude/`-only edits. For non-`.claude/` edits, walk `agents/cbp-round-executor.md` Phase 1–4 inline using Read / Edit / Write / Bash. Full procedure: `reference/inline-fallback.md` "Execution fallback" section.
+## Inline validation fallback
+When `cbp-testing-qa-agent` spawn fails OR the resolved `testing_profile` is `claude_only` (in which case the agent isn't spawned by design), run validation inline. Apply the profile gate matrix in `agents/cbp-testing-qa-agent.md` Phase 3 to determine in-scope checks. Full procedure + per-profile shape: `reference/inline-fallback.md` "Validation fallback" section.
+### Step 7: Save Executor Output
+Update round context via MCP `update_round`:
+- `context`: { ...existing, executor_output, testing_qa_output, e2e_output, frontend_ui_review }
+`e2e_output` and `frontend_ui_review` are present only when the gates above admitted them: `e2e_output` when e2e ran, `frontend_ui_review` when Step 5b ran.
+### Step 8: Auto-trigger Round End
+```
+Execution and validation complete. Starting round wrap-up...
+```
+Trigger `/cbp-round-end`.
+## Key Rules
+- **Code + test writing + inline validation** — planning lives in `round-start`, summary in `round-end`
+- Per-wave `cbp-testing-qa-agent` AND `cbp-test-e2e-agent` run in parallel (both against the same wave's `files[]`); they may also run in parallel with the NEXT wave's executor when dependency order allows
+- `testing_profile` from `task.context` governs which checks run — read it once in Step 2; pass to every testing-qa + test-e2e spawn
+- `claude_only` profile skips all agent spawns (testing-qa AND test-e2e); runs hook syntax and skill structure checks inline
+- Step 5b (cbp-frontend-ui Phase 6.5) runs only when e2e produced screenshots — gated on `e2e_output.screenshots[]` non-empty
+- Claude NEVER git adds files in round commands
+## Integration
+- **Reads**: MCP `get_current_task`, `get_rounds`
+- **Writes**: MCP `update_round` (context with executor_output + testing_qa_output + e2e_output + frontend_ui_review)
+- **Spawns**: `cbp-round-executor` (per wave or single), `cbp-testing-qa-agent` (per wave, parallel sibling of cbp-test-e2e-agent), `cbp-test-e2e-agent` (per wave when has_ui_work + non-claude_only profile), `cbp-database-agent` (if DB work), `cbp-security-agent` (if security review needed)
+- **Skill invocations**: `cbp-frontend-ui` at Step 5b with `phase: 'screenshot_review'` (post-e2e)
+- **Triggers**: `/cbp-round-end` (auto)
+- **Triggered by**: `/cbp-round-start` (auto, after plan approval)
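
Reading aid for Step 6 of the SKILL.md above: the per-wave hard-fail signal and the two routing branches can be sketched in TypeScript roughly as follows. This is a minimal sketch and not part of the package. The interfaces model only the fields Step 6 names, `FailureInfo` and its flags are hypothetical, and treating an absent `e2e_output` as "not failed" is an assumption.

```typescript
// Hypothetical shapes: only the fields named in Step 6 are modelled.
interface TestingQaOutput { totals: { hard_fail: boolean } }
interface E2eOutput { status: string; test_results?: { failed: number } }

interface WaveResult {
  testing_qa_output: TestingQaOutput;
  e2e_output?: E2eOutput; // absent when the e2e gate skipped the agent
}

// Per-wave hard-fail signal from Step 6.
// Assumption: an absent e2e_output counts as "not failed".
function waveHardFail(w: WaveResult): boolean {
  return (
    w.testing_qa_output.totals.hard_fail ||
    w.e2e_output?.status === "failed" ||
    (w.e2e_output?.test_results?.failed ?? 0) > 0
  );
}

type Route = "proceed_to_step_7" | "retrigger_wave" | "round_input";

// Hypothetical flags; how a failure is judged "simple" is left to the orchestrator.
interface FailureInfo {
  simple: boolean;
  alreadyRetriggered: boolean;
  e2ePreflightAborted: boolean;
}

// No failing wave: continue. Simple first-time failure: re-trigger the wave.
// Anything else (structural, repeat, preflight abort): hand off to /cbp-round-input.
function routeRound(waves: WaveResult[], info: FailureInfo): Route {
  if (!waves.some(waveHardFail)) return "proceed_to_step_7";
  if (info.simple && !info.alreadyRetriggered && !info.e2ePreflightAborted) {
    return "retrigger_wave";
  }
  return "round_input";
}
```

The routing function checks the "any wave failed" condition first, so the failure flags are consulted only when at least one wave reports a hard fail.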

package/templates/skills/cbp-round-execute/reference/inline-fallback.md ADDED

@@ -0,0 +1,59 @@
+---
+scope: org-shared
+---
+# Inline-fallback procedures
+When `round-executor` or `testing-qa-agent` cannot be spawned (env limits, monthly cap, 5xx, rate limit, context overflow), the orchestrator falls through to an inline procedure that walks the agent's Phase checklist using its own tools.
+The two fallback modes are documented separately so the SKILL.md stubs can link the right section.
+## Execution fallback (round-executor spawn failed)
+Triggered when the executor agent spawn returns one of the failure classes documented in `agent-spawn-failure-fallback.md`. Procedure:
+1. Detect failure class from error string. Record:
+   ```yaml
+   round.context.executor_findings.spawn_failure:
+     class: "<failure_class>"  # one of: monthly_agent_usage_limit | provider_5xx | rate_limit_429 | context_overflow_at_spawn | billing_limit
+     error_message: "<verbatim>"
+     decided_at: "<ISO>"
+   ```
+2. For `.claude/`-only file sets, fall through to the 3-INLINE branch in `../SKILL.md` Step 3 (orchestrator routes per file-routing.md to the matching build-cc skill or direct Edit).
+3. For non-`.claude/` file sets, walk `agents/round-executor.md` Phase 1–4 inline using Read / Edit / Write / Bash / Glob / Grep. Step 3 (Implementation) is the load-bearing phase — apply each `files_to_modify[]` deliverable in order, respecting wave boundaries when wave mode is active.
+4. Populate the executor's output contract with `mode: 'inline_fallback'` so analytics can distinguish inline runs from spawned-agent runs.
+5. Pre-emptive skip on repeat: if `prior_round.context.executor_findings.spawn_failure.class === current_class`, skip the spawn attempt entirely and go straight to inline.
+## Validation fallback (testing-qa-agent spawn failed OR claude_only profile)
+Triggered when testing-qa-agent spawn returns a failure class, OR when the resolved profile is `claude_only` (in which case the agent should not have been spawned at all). Procedure:
+1. Detect failure class. Record:
+   ```yaml
+   round.context.testing_qa_findings.spawn_failure:
+     class: "<failure_class>"
+     error_message: "<verbatim>"
+     decided_at: "<ISO>"
+   ```
+2. Apply the profile gate matrix from `agents/testing-qa-agent.md` Phase 3 to determine which checks are in-scope:
+   - `claude_only`: only hook bash syntax (`bash -n <hook>`) + skill structure validation (line counts, scope marker, /cbp-* legacy notation absent)
+   - `web`: skip desktop + backend
+   - `backend`: skip web + desktop
+   - `desktop`: skip web + backend
+   - `full_matrix`: all
+   - `cross_app`: union of touched apps
+3. Walk `agents/testing-qa-agent.md` Phase 1 (Setup) + Phase 2 (Discovery) + Phase 3 (Mandatory Automated Testing) inline using Read / Grep / Bash. Aggregate per-check results.
+4. Populate `testing_qa_output` shape with `mode: 'inline_fallback'`. For `claude_only` specifically, use `mode: 'inline_synthesised_for_claude_only_profile'` (the agent was never expected to spawn — this isn't a fallback, it's the documented happy path).
+5. Pre-emptive skip on repeat: if `prior_round.context.testing_qa_findings.spawn_failure.class === current_class`, skip the spawn attempt entirely.
+## Pre-emptive skip rule
+Per `agent-spawn-failure-fallback.md` item 5: when the same failure class fired in the previous round of the same task, skip the spawn attempt entirely and go straight to inline. This avoids one wasted API call per round during a sustained outage.
+## Pairs With
+- `../SKILL.md` — points at this reference for procedural detail
+- `agents/round-executor.md` — execution-fallback target agent
+- `agents/testing-qa-agent.md` — validation-fallback target agent + Phase 3 profile gate matrix
+- `rules/agent-spawn-failure-fallback.md` — required-coverage table; canonical failure classes
+- `rules/testing-profile.md` — claude_only profile detail; cross-app union semantics
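
Reading aid for the validation fallback above: the profile gate list, the `mode` selection, and the pre-emptive skip comparison can be sketched in TypeScript roughly as follows. This is a minimal sketch and not part of the package. The function names and the `Surface` type are hypothetical, and the real gate matrix lives in `agents/testing-qa-agent.md` Phase 3, which this diff does not show.

```typescript
type Profile = "claude_only" | "web" | "backend" | "desktop" | "full_matrix" | "cross_app";
type Surface = "web" | "backend" | "desktop";

// App surfaces whose checks stay in scope for a profile.
// claude_only yields none: only hook syntax and skill structure checks run.
function surfacesInScope(profile: Profile, touchedApps: Surface[] = []): Surface[] {
  switch (profile) {
    case "claude_only": return [];
    case "web": return ["web"];
    case "backend": return ["backend"];
    case "desktop": return ["desktop"];
    case "full_matrix": return ["web", "backend", "desktop"];
    // cross_app: union of touched apps, de-duplicated while keeping order
    case "cross_app": return touchedApps.filter((s, i) => touchedApps.indexOf(s) === i);
  }
}

// mode value written into testing_qa_output, per step 4 of the procedure.
function outputMode(profile: Profile): string {
  return profile === "claude_only"
    ? "inline_synthesised_for_claude_only_profile"
    : "inline_fallback";
}

// Fields mirror the spawn_failure record in the YAML snippets.
interface SpawnFailure { class: string; error_message: string; decided_at: string }

// Pre-emptive skip: same failure class as the previous round means go straight to inline.
function shouldSkipSpawn(prior: SpawnFailure | undefined, currentClass: string): boolean {
  return prior?.class === currentClass;
}
```

Returning an empty list for `claude_only` keeps the "no app checks" case explicit instead of folding it into a default branch.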