npm - codebyplan - Versions diffs - 1.13.52 → 1.13.54 - Mend

codebyplan 1.13.52 → 1.13.54

Files changed (92)

package/templates/rules/execution-proof.md ADDED

@@ -0,0 +1,70 @@
+---
+description: Real execution proof is a non-skippable verify obligation — tiered by what the round touched, every tier producing a COMMITTED artifact, never prose.
+paths:
+  - ".claude/skills/cbp-verify/**"
+  - ".claude/skills/cbp-round-build/**"
+  - ".claude/agents/cbp-verify-reviewer.md"
+  - ".claude/agents/cbp-e2e-playwright.md"
+  - ".claude/agents/cbp-e2e-maestro.md"
+  - ".claude/agents/cbp-e2e-tauri.md"
+  - ".claude/agents/cbp-e2e-vscode.md"
+  - ".claude/agents/cbp-e2e-xcuitest.md"
+---
+# Execution Proof
+"I verified the build" is not proof. Proof is a **committed artifact** that an auditor can
+re-inspect after the session ends. `cbp-verify` Phase 3 produces it; a passing verdict without
+it is invalid. The required artifact is **tiered by what the round's diff actually touched** —
+the tier is chosen from `files_changed`, not from a `has_ui_work` guess.
+## Tiers
+| Tier | Round touched | Proof obligation | Asserted by |
+|------|---------------|------------------|-------------|
+| **1** | A configured e2e framework's `app` source (`.codebyplan/e2e.json`) | `cbp-e2e-*` specialist runs the app and **commits screenshots** to the framework's committed dir | `codebyplan e2e verify-round` (non-empty gallery + non-zero assertions) |
+| **2** | UI source, but NO e2e framework configured for that app | **MANDATORY** dev-server run + at least one committed route screenshot or HTTP response trace for each changed route | manifest `artifacts[]` + `git ls-files --error-unmatch` |
+| **3** | Backend / API only (route handlers, server actions, endpoints) | Hit each changed endpoint; record an HTTP status trace (method, path, status, ms) committed to the round artifact dir | manifest `artifacts[]` |
+| **4** | `claude_only` / docs / config only (no app surface) | Proof IS the build/test commands — `codebyplan check --scope round\|task` (+ `bash -n` for touched hooks); profile-valid, no screenshot | manifest `gates[]` |
+A round can hit multiple tiers; satisfy each tier its diff touches.
+## Hard Rules
+- **Empty proof on a UI-touching diff is a GATE FAILURE.** A round whose `files_changed`
+  includes UI source but whose manifest carries zero committed screenshots/traces fails verify —
+  route to a fix round that captures the missing artifact. (Mirrors `e2e-mandatory.md`
+  Committed-Screenshot Enforcement; sole exception: `vscode-test`-only behavior rounds.)
+- **Screenshots must be committed, not `/tmp`.** Each artifact path is proven present with
+  `git ls-files --error-unmatch <path>` — an unstaged or `/tmp` file is not proof.
+- **Prose is never proof.** A narrative claim with no artifact path does not satisfy any tier.
+## Manifest Schema
+`cbp-verify` writes a `verify_manifest` into round/task context — the durable record of which
+gates ran and what proof exists:
+```yaml
+verify_manifest:
+  scope: round | task
+  gates:                       # deterministic gate results
+    - name: gate6 | lint | typecheck | tests | audit
+      exit_code: number
+      new_failures: string[]   # post-baseline-diff; [] = pass
+  proof:
+    tier: 1 | 2 | 3 | 4
+    artifacts:                 # committed proof, one per affected surface
+      - kind: screenshot | http_trace | command_log
+        path: string           # repo-relative; verified via git ls-files --error-unmatch
+        affected: string       # route / endpoint / file this proves
+    e2e_verify_round:          # present for Tier 1
+      pass: boolean
+      failed_checks: string[]  # e2e_eligible_skipped | zero_assertion_run | empty_gallery
+  decided_at: ISO8601
+```
+## Cross-References
+- `rules/e2e-mandatory.md` — Tier 1 opt-out contract + committed-screenshot mandate.
+- `rules/two-tier-ci.md` — how proof feeds the soft (round/task) vs hardcore (checkpoint) tiers.
+- `skills/cbp-verify/reference/deterministic-gates.md` — the gate command contracts + manifest write.
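
Note on `execution-proof.md`: the tier table and hard rules in this added file can be read as a small decision procedure. The Python sketch below is illustrative only and is not code from the package. The function names (`required_tiers`, `missing_proof`, `git_is_committed`), the suffix and path heuristics, and the artifact dict shape are assumptions made for the example; only the tier numbers, the artifact `kind` values, and the `git ls-files --error-unmatch` check come from the rule text.

```python
import subprocess
from pathlib import PurePosixPath

# Hypothetical heuristics: the rule names the tiers but does not spell out
# which paths count as "UI source" or "backend", so these are placeholders.
UI_SUFFIXES = {".tsx", ".jsx", ".scss", ".css"}
BACKEND_MARKERS = ("/api/", "/routes/", "/server/")


def required_tiers(files_changed, e2e_app_roots):
    """Return every proof tier the diff touches (a round can hit several)."""
    tiers = set()
    for raw in files_changed:
        if any(raw.startswith(root.rstrip("/") + "/") for root in e2e_app_roots):
            tiers.add(1)  # source of an app with a configured e2e framework
        elif PurePosixPath(raw).suffix in UI_SUFFIXES:
            tiers.add(2)  # UI source, no e2e framework configured for that app
        elif any(marker in "/" + raw for marker in BACKEND_MARKERS):
            tiers.add(3)  # backend / API only
        else:
            tiers.add(4)  # claude_only / docs / config (assumed catch-all)
    return tiers


def git_is_committed(path):
    """True when git tracks the path; --error-unmatch exits non-zero otherwise."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", path],
        capture_output=True,
    )
    return result.returncode == 0


def missing_proof(tiers, artifacts, is_committed=git_is_committed):
    """List hard-rule violations for tiers 1-3; an empty list means proof exists."""
    # Only artifacts that pass the git check count; a /tmp file is not proof.
    kinds = {a["kind"] for a in artifacts if is_committed(a["path"])}
    problems = []
    if 1 in tiers and "screenshot" not in kinds:
        problems.append("tier 1: no committed screenshot")
    if 2 in tiers and not kinds & {"screenshot", "http_trace"}:
        problems.append("tier 2: no committed screenshot or http_trace")
    if 3 in tiers and "http_trace" not in kinds:
        problems.append("tier 3: no committed http_trace")
    return problems
```

Passing `is_committed` as a parameter keeps the tier logic testable without a git checkout. Tier 4 produces no entry here because, per the table, its proof lives in manifest `gates[]` rather than `artifacts[]`.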

package/templates/rules/model-invocation-convention.md CHANGED

@@ -7,8 +7,8 @@ a skill is strictly user-only (i.e. it must never auto-trigger from another skil
 The absence of `disable-model-invocation` (or `disable-model-invocation: false`) is the normal
 state. It allows the skill to be auto-triggered via the Skill tool from within other skills —
-which is how the auto-trigger close-out flow works (e.g. `cbp-task-check` → `cbp-task-testing`,
-`cbp-task-testing` → `cbp-task-complete`).
+which is how the auto-trigger close-out flow works (e.g. `cbp-round-build` → `cbp-verify`,
+`cbp-verify` task scope → `cbp-finalize`).
 ## The sole exception: `cbp-round-complete`

package/templates/rules/parallel-waves.md CHANGED

@@ -1,24 +1,24 @@
 ---
 name: parallel-waves
-description: Wave schema, invariants, and proximity-split algorithm for cbp-task-planner Phase 5.6 wave decomposition.
+description: Wave schema, invariants, and proximity-split algorithm for cbp-round-planner Phase 5.6 wave decomposition.
 paths:
-  - .claude/agents/cbp-task-planner.md
+  - .claude/agents/cbp-round-planner.md
 ---
 # Parallel Waves
-Authoritative expansion of `cbp-task-planner` Phase 5.6. The planner reads this file at wave decomposition time.
+Authoritative expansion of `cbp-round-planner` Phase 5.6. The planner reads this file at wave decomposition time.
 ## Wave Schema
-Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-task-planner.md` Phase 5.6 "Output" block):
+Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 "Output" block):
 ```yaml
 - name: string               # short identifier, e.g. "web-ui", "backend", "db"
-  agent_type: 'round-executor' | 'inline'
+  agent_type: 'round-builder' | 'inline'
   files: string[]            # repo-relative paths owned by this wave
   depends_on: string[]       # names of waves that must complete before this one starts
-  skill_preloads: string[]   # skills invoked by the executor before Step 3 (e.g. "frontend-design")
+  skill_preloads: string[]   # skills invoked by the builder before Step 3 (e.g. "frontend-design")
   note: string               # optional — required on continuation waves from an arbitrary-boundary split
 ```
@@ -31,9 +31,9 @@ Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-t
 **(III) 3–15 files per wave** — every wave holds between 3 and 15 files (inclusive).
   - Below 3: merge into a sibling wave.
   - Above 15: apply the proximity-split algorithm below.
-  - Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-task-planner` Phase 5.6). Zero waves (omitted `waves[]`) trivially satisfies this invariant.
+  - Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-round-planner` Phase 5.6). Zero waves (omitted `waves[]`) trivially satisfies this invariant.
-**(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-task-planner.md` Phase 5.6 step "Populate `skill_preloads[]`").
+**(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 step "Populate `skill_preloads[]`").
 ## Proximity-Split Algorithm
@@ -57,7 +57,7 @@ Invariants I (disjoint files), II (acyclic `depends_on` DAG), and III (3–15 fi
 ## Cross-References
-- `agents/cbp-task-planner.md` Phase 5.6 — consumer of this rule; steps 1–6 and the `validate-waves` verification call.
+- `agents/cbp-round-planner.md` Phase 5.6 — consumer of this rule; steps 1–6 and the `validate-waves` verification call.
 - `packages/codebyplan-package/src/lib/validate-waves.ts` — deterministic enforcement of invariants I–III.
-- `agents/cbp-round-executor.md` Step 2.6 — wave-mode skill preloads.
-- `skills/cbp-round-execute/SKILL.md` Step 3 — per-wave executor dispatch.
+- `agents/cbp-round-builder.md` Step 2.6 — wave-mode skill preloads.
+- `skills/cbp-round-build/SKILL.md` Step 3 — per-wave builder dispatch.
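
Note on `parallel-waves.md`: invariants I (disjoint files), II (acyclic `depends_on` DAG), and III (3 to 15 files per wave) are stated in prose, and the diff says `validate-waves.ts` enforces them. The Python sketch below is one possible reading of those three invariants, not a port of that file. The function name `validate_waves`, the error strings, the unknown-dependency check, and the exact form of the small-plan exemption (a single wave with fewer than 3 total files) are assumptions.

```python
def validate_waves(waves):
    """Return violations of invariants I-III; an empty list means the plan passes."""
    errors = []

    # (I) Disjoint files: no path may be owned by two different waves.
    owner = {}
    for wave in waves:
        for path in wave["files"]:
            first = owner.setdefault(path, wave["name"])
            if first != wave["name"]:
                errors.append(f"I: {path} owned by {first} and {wave['name']}")

    # (II) Acyclic depends_on: Kahn's algorithm must be able to order every wave.
    indegree = {w["name"]: 0 for w in waves}
    dependents = {w["name"]: [] for w in waves}
    for wave in waves:
        for dep in wave["depends_on"]:
            if dep not in indegree:
                # Assumption: a dangling dependency is reported as well.
                errors.append(f"II: {wave['name']} depends on unknown wave {dep}")
                continue
            indegree[wave["name"]] += 1
            dependents[dep].append(wave["name"])
    ready = [name for name, degree in indegree.items() if degree == 0]
    ordered = 0
    while ready:
        current = ready.pop()
        ordered += 1
        for nxt in dependents[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if ordered != len(indegree):
        errors.append("II: depends_on contains a cycle")

    # (III) 3-15 files per wave; the lower bound is waived for a trivially
    # small plan, read here as one wave holding fewer than 3 files in total.
    total = sum(len(w["files"]) for w in waves)
    small_plan = len(waves) == 1 and total < 3
    for wave in waves:
        count = len(wave["files"])
        if count > 15 or (count < 3 and not small_plan):
            errors.append(f"III: {wave['name']} has {count} files")
    return errors
```

An empty `waves` list returns no errors, which matches the statement that omitted `waves[]` trivially satisfies invariant III.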

package/templates/rules/spawn-failure-is-gate-failure.md ADDED

@@ -0,0 +1,76 @@
+---
+description: A subagent spawn failure is a HARD GATE FAILURE — STOP and retry, never walk the agent's steps inline and self-certify.
+paths:
+  - ".claude/skills/cbp-verify/**"
+  - ".claude/skills/cbp-round-build/**"
+  - ".claude/skills/cbp-finalize/**"
+  - ".claude/agents/cbp-verify-reviewer.md"
+  - ".claude/agents/cbp-round-builder.md"
+---
+# Spawn Failure Is Gate Failure
+When a verify/execution stage delegates work to a subagent (e.g. `cbp-verify` spawning
+`cbp-verify-reviewer`, `cbp-round-build` spawning `cbp-round-builder`), the agent is the
+**fresh-context oracle**. If the agent cannot run, the orchestrator does NOT have an
+equivalent signal — and it must NEVER manufacture one.
+## The Rule
+A **spawn failure** — the agent could not run, or died on a terminal error before producing
+its output contract — is a **HARD GATE FAILURE**. The orchestrator STOPS and surfaces a retry
+directive. It does NOT walk the agent's phase checklist inline with its own tools and grade its
+own work. Self-certification by the orchestrator that spawned the agent is precisely the
+fresh-context blind spot the agent exists to remove; reproducing the agent's steps inline
+re-introduces it.
+Spawn-failure classes (non-exhaustive): provider 5xx, rate-limit / monthly-cap / billing block,
+context overflow at spawn, the agent process dying before emitting its output contract.
+**Retry directive shape** (surface verbatim, then STOP):
+```
+## Verify blocked — reviewer could not spawn
+The fresh-context reviewer (<agent>) failed to spawn: <class> — <verbatim error>.
+This is a hard gate failure, not a pass. Retry when capacity returns:
+  Next: /cbp-verify
+```
+Record `<scope>.context.verify.spawn_failure = { agent, class, error_message, decided_at }` so
+the retry is auditable and a verdict is never written on a missing review.
+## Spawn-Failed vs Spawn-Ran-And-Found-Problems
+These are different outcomes with opposite routes — do not conflate them:
+| Outcome | Meaning | Route |
+|---------|---------|-------|
+| **Spawn failed** | Agent never produced its output contract (terminal error). | HARD GATE FAILURE → STOP + retry directive. No verdict written. |
+| **Spawn ran, found problems** | Agent returned findings / `NOT_READY`. | Normal flow → in-scope mechanical fix or `/cbp-round-plan` fix round. |
+A returned `NOT_READY` is a *successful* review with a negative verdict — it is acted on, not
+retried. Only the absence of a contract is a spawn failure.
+## Carve-Out: The `claude_only` Profile Is Not Inline Fallback
+The `claude_only` profile (rounds with no app surface — `.claude/`-only edits, docs, config)
+has **no agent to spawn by design**. Its proof IS the deterministic command set:
+`codebyplan check --scope round|task` plus `bash -n <hook>` for any touched shell file. Running
+those inline is a **first-class deterministic verification path**, not a banned inline fallback —
+there was never a subagent to substitute for. This carve-out applies ONLY when the resolved
+profile is `claude_only`; for every other profile an agent is expected, and its spawn failure is
+a hard gate failure per above.
+## Why (Replaces Inline-Fallback Self-Certification)
+The retired `inline-fallback.md` procedures let an orchestrator that just failed to spawn an
+agent walk that agent's steps and pass its own work. That defeats the entire point of a
+fresh-context review and silently downgraded quality under sustained outages. This rule replaces
+those procedures: a missing review is a STOP, not a self-graded continue.
+## Cross-References
+- `skills/cbp-verify/SKILL.md` Phase 4 — the reviewer spawn + this hard-fail.
+- `agents/cbp-verify-reviewer.md` — the reviewer whose absence triggers this rule.
+- `rules/execution-proof.md` — the proof obligation a passing verdict still requires.
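
Note on `spawn-failure-is-gate-failure.md`: the rule separates "spawn failed" from "spawn ran and found problems" by one test, whether the agent produced its output contract. The Python sketch below illustrates that routing under stated assumptions. `SpawnResult`, `route`, the `verdict` key inside the contract, and the returned action names are hypothetical; the `spawn_failure` record fields, `NOT_READY`, and the `/cbp-verify` and `/cbp-round-plan` routes come from the rule text. Rendering the retry directive itself is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SpawnResult:
    """Hypothetical outcome of trying to run a subagent."""
    agent: str
    contract: Optional[dict] = None       # output contract, if one was produced
    error_class: Optional[str] = None     # e.g. "provider 5xx", "rate-limit"
    error_message: Optional[str] = None   # verbatim error text


def route(result, context):
    """Pick the next step; nothing verdict-like is written without a contract."""
    if result.contract is None:
        # Spawn failed: record it for audit, then stop. The orchestrator does
        # not walk the agent's checklist itself and grade its own work.
        context.setdefault("verify", {})["spawn_failure"] = {
            "agent": result.agent,
            "class": result.error_class,
            "error_message": result.error_message,
            "decided_at": datetime.now(timezone.utc).isoformat(),
        }
        return {"action": "stop_and_retry", "next": "/cbp-verify"}

    if result.contract.get("verdict") == "NOT_READY":
        # A successful review with a negative verdict: act on it, do not retry.
        return {"action": "fix", "next": "/cbp-round-plan"}

    return {"action": "continue", "next": None}
```

The only branch that touches `context` is the spawn-failure branch, which mirrors the rule's point that a missing review is recorded and retried rather than replaced by a self-graded result.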

package/templates/rules/task-routing-recommendation.md CHANGED

@@ -12,7 +12,7 @@ CodeByPlan has two families of task commands since CHK-141:
 | Family | Commands | When to use |
 |--------|----------|-------------|
-| Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-task-check`, `/cbp-task-testing`, `/cbp-task-complete` | Work that belongs to a CHK-NNN checkpoint |
+| Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-verify`, `/cbp-finalize` | Work that belongs to a CHK-NNN checkpoint |
 | Standalone | `/cbp-standalone-task-create`, `/cbp-standalone-task-start {task}`, `/cbp-standalone-task-check`, `/cbp-standalone-task-testing`, `/cbp-standalone-task-complete` | Independent work not tied to any checkpoint |
 ## Round Commands (Both Families)

package/templates/rules/todo-backend.md CHANGED

@@ -62,8 +62,8 @@ The queue head (`get_todos` `rows[0]`) maps to one of these slash commands. The
 | State | Command | Required context |
 |-------|---------|------------------|
-| Round in progress | `/cbp-round-update` | `{checkpoint_id, task_id, round_id}` |
-| Round pending start | `/cbp-round-start` | `{checkpoint_id, task_id}` |
+| Round in progress | `/cbp-verify` | `{checkpoint_id, task_id, round_id}` |
+| Round pending start | `/cbp-round-plan` | `{checkpoint_id, task_id}` |
 | Task pending start | `/cbp-task-start` | `{checkpoint_id, task_id}` or `{task_id}` for standalone |
 | Checkpoint pending activation | `/cbp-checkpoint-update` | `{checkpoint_id}` |
 | Checkpoint done | `/cbp-checkpoint-check` | `{checkpoint_id}` |
@@ -118,4 +118,4 @@ CHK-111 shipped the original todos queue as Postgres triggers + a 583-LOC `regen
 4. Env vars (from `apps/todo-worker/.env.example`): `SUPABASE_URL`, `SUPABASE_SECRET_KEY` (an `sb_secret_...` key), `LOG_LEVEL`, `WORKER_POLL_MS`.
 5. Save the resulting `project_ref` to `.codebyplan.json` `shipment.surfaces.railway-todo-worker.project_ref`.
-Smoke after deploy: run `/cbp-task-complete` in any worktree → tail Railway logs → expect a `claim → apply` cycle within `WORKER_POLL_MS`.
+Smoke after deploy: run `/cbp-finalize` in any worktree → tail Railway logs → expect a `claim → apply` cycle within `WORKER_POLL_MS`.

package/templates/rules/two-tier-ci.md ADDED

@@ -0,0 +1,63 @@
+---
+description: Two CI tiers — soft (round/task → feat) is baseline-tolerant; hardcore (checkpoint → main) is whole-repo absolute green. Branch model is feat→main direct.
+paths:
+  - ".claude/skills/cbp-verify/**"
+  - ".claude/skills/cbp-checkpoint-check/**"
+  - ".claude/skills/cbp-checkpoint-end/**"
+  - ".claude/skills/cbp-ship-main/**"
+  - ".codebyplan/ci.json"
+---
+# Two Tier CI
+CodeByPlan gates work at two strictness tiers. The tier is chosen by **what is being
+promoted**, not by preference.
+## Soft Tier — round / task → feat branch
+Runs at every `cbp-verify` (round scope) and the task-scope escalation. **Baseline-tolerant**:
+pre-existing red is non-blocking; only NEW per-package failures fail.
+- `codebyplan check --scope round|task` (NO `--no-baseline`). Each baselined check
+  (`lint` / `typecheck` / `tests` / `audit`) fails ONLY when its `new_failures[]` is non-empty
+  vs `.check-baseline.json`. `gate6` (sibling-identity parity) is **always hard** — never
+  baselined.
+- `codebyplan e2e verify-round --round-id <id> --task-id <id>` per round (Tier-1 e2e proof).
+- Fresh-context review via `cbp-verify-reviewer` (its spawn failure is a hard gate failure —
+  `rules/spawn-failure-is-gate-failure.md`).
+The soft tier keeps the inner loop fast: a feat branch may carry the repo's known baseline red
+forward without blocking, while guaranteeing the work being added is itself clean.
+## Hardcore Tier — checkpoint → main
+Runs at checkpoint close (`cbp-checkpoint-check` / `cbp-checkpoint-end` / ship). **Zero baseline
+forgiveness — whole-repo absolute green.**
+- `codebyplan check --scope merged --no-baseline` = every failing package and every GHSA id
+  counts; any red fails. (`gate6` unchanged — still always hard.)
+- Aggregate e2e proof across the whole checkpoint diff.
+- Every required `main` branch-protection PR check is green (repo-specific — read the repo's
+  configured required checks, never assume a single hardcoded check name).
+## Critical Constraint — feat→main DIRECT, main-only
+The branch model is **feat→main direct**; `.codebyplan/git.json` has `integration: null`,
+`production: "main"`. There is **NO intermediate integration branch** — the "checkpoint branch"
+IS the per-checkpoint feat branch. The hardcore tier runs against that feat branch's merged
+state before it lands on main; do not assume a staging/integration hop exists.
+## Report-Only Rollout
+The whole-repo hardcore CI **job** lands **report-only first** (`continue-on-error: true`) and is
+flipped to a required check ONLY after the `apps/web` baseline is burned down. Until then,
+`--scope merged --no-baseline` is advisory in CI — surfaced, not enforced — so a pre-existing
+`apps/web` red does not block a merge while the baseline is still being paid down. Locally,
+`cbp-verify` still runs and reports it.
+## Cross-References
+- `rules/execution-proof.md` — the committed-artifact obligation feeding both tiers.
+- `rules/spawn-failure-is-gate-failure.md` — fresh-context review is non-substitutable.
+- `skills/cbp-verify/reference/deterministic-gates.md` — exact gate commands + JSON contracts.
+- `.codebyplan/git.json` — authoritative branch model (`integration: null`, `production: main`).
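
Note on `two-tier-ci.md`: the difference between the soft and hardcore tiers comes down to how each gate result is judged. The Python sketch below shows one way to express that, assuming gate records shaped like the `gates[]` entries in the `verify_manifest` schema (`name`, `exit_code`, `new_failures`). The function name `failing_gates`, the tier labels `"soft"` and `"hardcore"`, and the use of a non-zero `exit_code` as the signal for "red" are assumptions, not the package's contract.

```python
# Checks the rule describes as baselined in the soft tier.
BASELINED = {"lint", "typecheck", "tests", "audit"}


def failing_gates(gates, tier):
    """Return the names of gates that block promotion at the given tier."""
    failed = []
    for gate in gates:
        if gate["name"] == "gate6":
            # Always hard in both tiers, never baselined.
            blocked = gate["exit_code"] != 0
        elif tier == "soft" and gate["name"] in BASELINED:
            # Baseline-tolerant: pre-existing red passes, only NEW failures block.
            blocked = bool(gate["new_failures"])
        else:
            # Hardcore (no baseline): any red fails.
            blocked = gate["exit_code"] != 0
        if blocked:
            failed.append(gate["name"])
    return failed
```

For example, a `lint` gate that exits non-zero with an empty `new_failures` list passes the soft tier and fails the hardcore tier, which is the baseline forgiveness the rule describes.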

package/templates/settings.project.base.json CHANGED

@@ -56,9 +56,9 @@
       "Skill(cbp-checkpoint-check)",
       "Skill(cbp-checkpoint-complete)",
       "Skill(cbp-round-complete)",
-      "Skill(cbp-round-execute)",
+      "Skill(cbp-round-build)",
       "Skill(cbp-session-end)",
-      "Skill(cbp-task-complete)",
+      "Skill(cbp-finalize)",
       "Skill(cbp-standalone-task-create)",
       "Skill(cbp-standalone-task-start)",
       "Skill(cbp-standalone-task-complete)",
@@ -126,13 +126,10 @@
       "Skill(cbp-map-architecture)",
       "Skill(cbp-merge-main)",
       "Skill(cbp-refresh-arch-map)",
-      "Skill(cbp-refresh-infra)",
-      "Skill(cbp-round-check)",
-      "Skill(cbp-round-end)",
-      "Skill(cbp-round-input)",
-      "Skill(cbp-round-start)",
-      "Skill(cbp-round-update)",
+      "Skill(cbp-round-plan)",
       "Skill(cbp-session-start)",
+      "Skill(cbp-setup-cd)",
+      "Skill(cbp-setup-ci)",
       "Skill(cbp-setup-e2e)",
       "Skill(cbp-setup-eslint)",
       "Skill(cbp-ship-configure)",
@@ -142,11 +139,10 @@
       "Skill(cbp-supabase-branch-check)",
       "Skill(cbp-supabase-migrate)",
       "Skill(cbp-supabase-setup)",
-      "Skill(cbp-task-check)",
       "Skill(cbp-task-create)",
       "Skill(cbp-task-start)",
-      "Skill(cbp-task-testing)",
       "Skill(cbp-todo)",
+      "Skill(cbp-verify)",
       "Skill(supabase)",
       "Skill(supabase-postgres-best-practices)",
       "mcp__codebyplan__get_checkpoints",
@@ -212,6 +208,8 @@
       "Bash(npx codebyplan ports:*)",
       "Bash(codebyplan tech-stack:*)",
       "Bash(npx codebyplan tech-stack:*)",
+      "Bash(codebyplan docs:*)",
+      "Bash(npx codebyplan docs:*)",
       "Bash(codebyplan eslint:*)",
       "Bash(npx codebyplan eslint:*)",
       "Bash(codebyplan lsp:*)",
@@ -226,6 +224,8 @@
       "Bash(npx codebyplan checkpoint:*)",
       "Bash(codebyplan task:*)",
       "Bash(npx codebyplan task:*)",
+      "Bash(codebyplan standalone-task:*)",
+      "Bash(npx codebyplan standalone-task:*)",
       "Bash(codebyplan session:*)",
       "Bash(npx codebyplan session:*)",
       "Bash(codebyplan help:*)",
@@ -249,7 +249,11 @@
       "Bash(codebyplan e2e:*)",
       "Bash(npx codebyplan e2e:*)",
       "Bash(codebyplan arch-map:*)",
-      "Bash(npx codebyplan arch-map:*)"
+      "Bash(npx codebyplan arch-map:*)",
+      "Bash(codebyplan ci:*)",
+      "Bash(npx codebyplan ci:*)",
+      "Bash(codebyplan cd:*)",
+      "Bash(npx codebyplan cd:*)"
     ]
   },
   "attribution": {

package/templates/skills/cbp-build-cc-mode/SKILL.md CHANGED

@@ -38,7 +38,7 @@ A skill that carries a `model:` line is a **gap** — remove it unless a deliber
 ### Agents — `model:` + `effort:`
-Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-improve-round`, `cbp-research`, `cbp-round-executor`, `cbp-security-agent`, `cbp-task-check`, `cbp-task-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The other two are exceptions:
+Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-research`, `cbp-round-builder`, `cbp-security-agent`, `cbp-stripe-agent`, `cbp-verify-reviewer`, `cbp-round-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The other two are exceptions:
 | agent                | model  | effort | reason                                                                              |
 | -------------------- | ------ | ------ | ----------------------------------------------------------------------------------- |

package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md CHANGED

@@ -22,7 +22,7 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
 ### `allow` — the autonomous workflow surface
-- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-input`/`-start`/`-update` — `cbp-round-update` is autonomous triage that only reads round state and routes to `cbp-round-complete` or `cbp-round-input`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-check`/`-create`/`-start`/`-testing`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
+- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-plan`, `cbp-verify` — `cbp-verify` is the autonomous verify stage that runs deterministic gates, proves execution, spawns the fresh-context reviewer, and routes to `cbp-round-complete` or `cbp-round-plan`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-create`/`-start`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
 - **All `mcp__codebyplan__*` reads** (`get_*`, `list_*`, `search_*`, `health_check`, `lookup_symbol`, `resolve_library_id`, `get_chunk`).
 - **Routine workflow-write MCP tools** the pipeline calls many times per task: create/update/complete checkpoint, task, and round; session log + session-state writes; `create_worktree`, `add_library`, `flag_stale_chunk`, `update_server_config`, `update_eslint_repo_config`, `update_task_template`. Gating these with `ask` would make the autonomous workflow unusable.
 - **Read/safe CLI commands** (both `codebyplan X` and `npx codebyplan X`): `whoami`, `resolve-worktree`, `statusline`, `ports`, `tech-stack`, `eslint`, `round`, `help`, `--version`.
@@ -30,8 +30,8 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
 ### `ask` — the deliberate confirm-gate
 - **Production-shipment skills**: `cbp-ship`, `cbp-ship-main`, `cbp-checkpoint-end` — these promote/deploy to production, so they prompt even in an otherwise auto-allowed setup.
-- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-task-complete`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt replaces the in-skill confirmation that used to live in `cbp-round-update` — which is now an autonomous, `allow`-tier triage step.
-- **Plan-approval gate**: `cbp-round-execute` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-start` runs its planning Q&A, then hands off to `cbp-round-execute`; the permission prompt is the user's go/no-go on the plan.
+- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-finalize`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt is the human gate downstream of `cbp-verify` — the autonomous, `allow`-tier verify stage whose triage routes here.
+- **Plan-approval gate**: `cbp-round-build` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-plan` runs its planning Q&A, then hands off to `cbp-round-build`; the permission prompt is the user's go/no-go on the plan.
 - **Destructive / admin MCP tools**: `delete_session_log`, `delete_worktree`, `create_repo`, `release_assignment`. (The launch and member-admin tools were dropped from the MCP surface in CHK-180 — those concerns are web-app only now.)
 - **Mutating / external / clobber-risk CLI commands** (both prefixes): `setup`, `login`, `logout`, `upgrade-auth`, `config` (can overwrite committed `.codebyplan/` files), `branch` (rewrites branch config), `ship`, `claude` (`install`/`update`/`uninstall` overwrite `.claude/`).
@@ -53,11 +53,11 @@ A skill invokes the next skill via the Skill tool at the appropriate routing bra
 ### How the human gate works
 - **`allow`-tier** skill: the harness auto-fires it silently when the triggering skill invokes it.
-  No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-task-testing`,
-  `cbp-round-input`) where the trigger condition already encodes the human intent.
+  No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-verify`,
+  `cbp-round-plan`) where the trigger condition already encodes the human intent.
 - **`ask`-tier** skill: the harness pauses and shows a permission prompt before the skill runs.
   **That prompt IS the human gate** — it replaces the old "Next: /cbp-X, run it yourself"
-  manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-task-complete`,
+  manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-finalize`,
   `cbp-checkpoint-check`) where a deliberate confirmation is still desirable.
 This means:
@@ -70,7 +70,7 @@ This means:
 The `cbp-skill-context-guard.sh` PreToolUse hook denies heavy close-out skills when the
 context window exceeds `CBP_CONTEXT_WARN_TOKENS` (default 200 000 tokens). The heavy allowlist
-is: `cbp-round-execute`, `cbp-task-testing`, `cbp-standalone-task-testing`,
+is: `cbp-round-build`, `cbp-verify`, `cbp-standalone-task-testing`,
 `cbp-checkpoint-check`, `cbp-checkpoint-end`.
 When the guard fires, it directs the model to run `/cbp-clear-prep` instead. The flow is:

package/templates/skills/cbp-build-cc-skill/SKILL.md CHANGED

@@ -81,7 +81,7 @@ A Task-pattern skill that must only run on explicit user confirmation is a **per
 - MUST carry `disable-model-invocation: true` — the model cannot invoke it; only the user can (via `/skill-name`).
 - Any upstream skill that auto-triggers it MUST instead emit a `Next: /skill-name` directive and STOP — model invocation of a `disable-model-invocation` skill is blocked at the runtime level.
-- Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-round-update` routes a clean triage via a `Next: /cbp-round-complete` directive and stops — it cannot invoke round-complete directly.
+- Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-verify` routes a clean round via a `Next: /cbp-round-complete` directive and stops — it cannot invoke round-complete directly.
 ### Step 5 — Fill the frontmatter

package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md CHANGED

@@ -79,14 +79,14 @@ A skill should do one thing in the pipeline. If a skill both plans AND executes,
 | Wrong                                   | Right                                                        |
 | --------------------------------------- | ------------------------------------------------------------ |
-| `/cbp-round` (plans + executes + tests) | `/cbp-round-start` → `/cbp-round-execute` → `/cbp-round-end` |
+| `/cbp-round` (plans + executes + tests) | `/cbp-round-plan` → `/cbp-round-build` → `/cbp-verify` |
 ### Pipeline Clarity
 If the skill is part of a chain, show it:
 ```
-/cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
+/cbp-round-plan (planning) → /cbp-round-build (ask-tier permission = plan approval)
 ```
 ### Approval Gates

package/templates/skills/cbp-build-cc-skill/reference/fork-eligibility.md CHANGED

@@ -4,7 +4,7 @@
 parent conversation and, per the runtime, **runs in the background by default**. It is
 isolation for a *whole skill*, not a way to delegate one sub-step. A forked body therefore
 cannot drive the main pipeline: it can't `AskUserQuestion`, can't auto-trigger another
-skill, and can't run an inline-fallback that the orchestrator depends on.
+skill, and can't run the deterministic fallback path the orchestrator depends on.
 So forking only helps a narrow shape of skill. The canonical eligible example is
 [examples/fork-skill.md](../examples/fork-skill.md): a single self-contained analytical task
@@ -19,20 +19,20 @@ A skill is **fork-eligible** only when ALL hold:
 3. It does **not route** — no auto-trigger of another skill, no close-out directive that must
    fire in the main context.
 4. It does **not fan out** — it does not spawn multiple subagents and coordinate them.
-5. It has **no inline-fallback** contract the orchestrator relies on.
+5. It has **no deterministic fallback** path the orchestrator relies on.
 Fail any one → the skill stays **inline** (main context). Inline skills still get clean
 context isolation the right way: by delegating their heavy step to a dedicated **agent**
-(e.g. `cbp-task-check`, `cbp-improve-round`, `cbp-round-executor`). The agent is the
+(e.g. `cbp-verify-reviewer`, `cbp-round-builder`). The agent is the
 isolation boundary; the skill stays in the main thread to orchestrate, route, and interact.
 ## When NOT to use `context: fork` (the disqualifying patterns)
 | Pattern | Why it can't fork | Example skills |
 |---------|-------------------|----------------|
-| **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-execute`, `cbp-checkpoint-check`, `cbp-map-architecture`, `cbp-refresh-arch-map` |
-| **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill / runs inline-fallback | `cbp-task-check`, `cbp-standalone-task-check`, `cbp-round-start`, `cbp-round-end`, `cbp-checkpoint-plan` |
-| **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-task-complete`, `cbp-round-update`, `cbp-merge-main` |
+| **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-build`, `cbp-checkpoint-check`, `cbp-map-architecture`, `cbp-refresh-arch-map` |
+| **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill | `cbp-verify`, `cbp-standalone-task-check`, `cbp-round-plan`, `cbp-checkpoint-plan` |
+| **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-finalize`, `cbp-merge-main` |
 | **consumed-inline** | invoked *by* an agent (e.g. round-executor) and applies fixes synchronously into that context | `cbp-frontend-design`, `cbp-frontend-ui`, `cbp-frontend-ux` |
 | **doc-ref-only** | mentions subagents/fork only as documentation; runs inline authoring | the `cbp-build-cc-*` authoring skills, `cbp-supabase-migrate` |
@@ -40,28 +40,25 @@ isolation boundary; the skill stays in the main thread to orchestrate, route, an
 Every skill whose `SKILL.md` touches the subagent/fork boundary — by spawning a subagent, by
 being invoked inline by an agent, or by documenting the feature — was classified against the
-eligibility test. **Result: 0 of 25 are fork-eligible** — none were migrated, because every
+eligibility test. **Result: 0 of 22 are fork-eligible** — none were migrated, because every
 one either already isolates heavy work in a dedicated agent (the correct boundary) or depends
 on inline orchestration/interaction that a background fork would break.
 | Skill | Pattern | Fork-eligible |
 |-------|---------|:---:|
-| cbp-round-execute | fan-out | no |
+| cbp-round-build | fan-out | no |
 | cbp-checkpoint-check | fan-out | no |
 | cbp-map-architecture | fan-out | no |
 | cbp-refresh-arch-map | fan-out | no |
-| cbp-round-start | spawn-then-route | no |
-| cbp-round-end | spawn-then-route | no |
-| cbp-task-check | spawn-then-route | no |
+| cbp-round-plan | spawn-then-route | no |
+| cbp-verify | spawn-then-route | no |
 | cbp-standalone-task-check | spawn-then-route | no |
 | cbp-checkpoint-plan | spawn-then-route | no |
-| cbp-round-update | inline-by-design | no |
 | cbp-task-create | inline-by-design | no |
 | cbp-standalone-task-create | inline-by-design | no |
-| cbp-task-complete | inline-by-design | no |
+| cbp-finalize | inline-by-design | no |
 | cbp-standalone-task-complete | inline-by-design | no |
 | cbp-merge-main | inline-by-design | no |
-| cbp-task-testing | inline-by-design | no |
 | cbp-standalone-task-testing | inline-by-design | no |
 | cbp-frontend-design | consumed-inline | no |
 | cbp-frontend-ui | consumed-inline | no |

package/templates/skills/cbp-checkpoint-check/SKILL.md CHANGED

@@ -1,4 +1,5 @@
 ---
+scope: org-shared
 name: cbp-checkpoint-check
 description: Full re-evaluation of a checkpoint with before/after comparison
 argument-hint: [CHK-NNN]
@@ -83,7 +84,14 @@ Aggregate QA from all tasks and rounds:
 | TASK-[N] | READY | all_pass | [N] |
 ```
-Re-run build/lint/types on current codebase to verify nothing regressed across tasks.
+Re-run build/lint/types on the current codebase to verify nothing regressed across tasks. Detect `$PLATFORM` from the project type (same signal table as `cbp-testing-qa-agent.md` Step 1), then resolve commands from `.codebyplan/ci.json`:
+```bash
+CI_BUILD_CMD=$(npx codebyplan ci resolve build --platform "$PLATFORM" 2>/dev/null)
+CI_TYPES_CMD=$(npx codebyplan ci resolve typecheck --platform "$PLATFORM" 2>/dev/null)
+```
+Run: `${CI_BUILD_CMD:-npm run build}` and `${CI_TYPES_CMD:-npx tsc --noEmit}`. For lint use the whole-repo command (`pnpm -w lint`). Fallback: if `.codebyplan/ci.json` is absent, `ci resolve` returns the central default; if the binary is unavailable the `${CI_*_CMD:-<literal>}` guard uses the hardcoded fallback.
 ### Step 5b: Whole-Checkpoint E2E
@@ -119,11 +127,11 @@ Aggregate the files touched across all tasks (reusing Step 4's deduplicated tabl
    Continue to Step 6.
 5. **On fail** (any framework `f`: `e2e_outputs[f].status === 'failed'` OR `e2e_outputs[f].test_results.failed > 0`): build a failure summary from `e2e_outputs[*].test_results.failures[]` aggregated and grouped by `category`. Surface via `AskUserQuestion`:
-   - **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-task-planner` can verify failure premises. Per `cbp-round-end` reference `findings-presentation.md` "Infra Issue Absorption Contract — Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
+   - **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-round-planner` can verify failure premises. Per `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
    - **(b) Surface as warning only — proceed to checkpoint-end** — append `| Checkpoint E2E | warning | N failures (deferred) |` to Step 5 QA Summary; continue to Step 6.
    - **(c) Halt — review manually** — STOP and wait for the user.
-   See `cbp-round-end` reference `findings-presentation.md` "Infra Issue Absorption Contract — Infra-Class Issue Catalog" row "Checkpoint-level e2e failure" for the routing rationale.
+   See `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Infra-Class Issue Catalog" row "Checkpoint-level e2e failure" for the routing rationale.
 ### Step 6: User Discussion
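
Reviewer note on the Step 5 hunk above: the `${CI_*_CMD:-<literal>}` guard relies on standard shell parameter expansion, which substitutes the literal whenever the variable is unset or empty. The sketch below illustrates only that behavior. `pick_cmd` is a hypothetical helper introduced for illustration, and the `codebyplan` CLI is deliberately not invoked, so nothing here describes what `ci resolve` actually returns.

```bash
#!/usr/bin/env bash
# pick_cmd is a hypothetical helper, not part of the package.
# It mirrors the ${VAR:-fallback} guard: use the resolved command when it is
# non-empty, otherwise fall back to the hardcoded literal.
pick_cmd() {
  local resolved="$1" fallback="$2"
  printf '%s\n' "${resolved:-$fallback}"
}

# Resolver produced nothing (for example, the binary is unavailable and
# stderr was discarded), so the literal is used.
pick_cmd "" "npm run build"

# Resolver produced a command, which takes precedence over the literal.
pick_cmd "pnpm build" "npm run build"
```

Because the guard treats an empty string the same as an unset variable, a resolver that fails silently degrades to the fallback without extra error handling.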

package/templates/skills/cbp-checkpoint-create/SKILL.md CHANGED

@@ -87,7 +87,22 @@ This is the first identity-stamping point — when claiming, passing `worktree_i
 Read `.codebyplan/git.json` `branch_config.production` (default `"main"`) as `BASE`. codebyplan repos are main-only — never create or branch from a `development`/integration branch.
-Compute the slug deterministically:
+**8.1 — Reuse the cloud-created branch when present.** When the repo is GitHub-connected, the CHK-207 `create-feat-branch` Edge Function fires on the Step 7 row INSERT, creates `feat/CHK-{NNN}-<slug>` on origin, and writes `branch_name` back to the checkpoint row. Creating a second, differently-slugged branch here orphans the cloud one — so re-read the row first:
+```bash
+sleep 5  # give the INSERT webhook a moment to write branch_name back
+npx codebyplan sync 2>/dev/null || true
+BRANCH=$(jq -r '.branch_name // empty' ".codebyplan/state/checkpoints/{checkpoint-id}.json" 2>/dev/null)
+```
+(Break-glass: MCP `get_checkpoints` and read the row's `branch_name`.) If `BRANCH` is non-empty, check out the existing remote branch and skip 8.2 entirely — do NOT push (it already exists on origin) and do NOT persist `--branch-name` (the Edge Function already recorded it):
+```bash
+git fetch origin "$BRANCH"
+git checkout -b "$BRANCH" --track "origin/$BRANCH"
+```
+**8.2 — Fallback: create the branch locally.** Only when `BRANCH` is empty (repo not GitHub-connected, or the webhook hasn't landed). Compute the slug deterministically:
 ```bash
 SLUG=$(codebyplan slug "{checkpoint title}")

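Reviewer note on the 8.1/8.2 split above: the choice between reusing the cloud-created branch and creating one locally depends only on whether `BRANCH` is non-empty after the row is re-read. A minimal sketch of that decision follows. `choose_branch_step` is a hypothetical helper and the branch name is an invented example value. It only reports which sub-step applies and runs no `git` or `codebyplan` commands.

```bash
#!/usr/bin/env bash
# choose_branch_step is a hypothetical helper, not part of the package.
# It reports which sub-step applies for a given branch_name value.
choose_branch_step() {
  local branch="$1"
  if [ -n "$branch" ]; then
    # branch_name was written back to the row: reuse the remote branch.
    printf '8.1 reuse %s\n' "$branch"
  else
    # Empty value: repo not GitHub-connected, or the webhook has not landed.
    printf '8.2 create locally\n'
  fi
}

# Example value, invented for illustration.
choose_branch_step "feat/CHK-001-example"
choose_branch_step ""
```
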
package/templates/skills/cbp-checkpoint-end/SKILL.md CHANGED

@@ -96,7 +96,11 @@ Runtime deployment for the base branch is handled in Step 7 by `/cbp-ship` (whic
 ### Step 7: Runtime Shipment via `/cbp-ship`
-After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface:
+After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface.
+`/cbp-ship` reads `.codebyplan/cd.json` when present to inform per-surface deploy variant
+selection (trigger, environment, approval gate, OIDC auth, credential env-var names). Repos
+without `cd.json` fall back to filesystem surface detection — no behavior change. Run
+`/cbp-setup-cd` to set up `cd.json` for a repo that has not yet migrated.
 - Vercel auto-deploy verification
 - Mobile shipment (asks user: skip / EAS internal TestFlight / EAS external TestFlight)