codebyplan 1.13.52 → 1.13.54

Files changed (92)
  1. package/dist/cli.js +3226 -897
  2. package/package.json +1 -1
  3. package/templates/agents/cbp-database-agent.md +1 -1
  4. package/templates/agents/cbp-e2e-maestro.md +1 -1
  5. package/templates/agents/cbp-e2e-playwright.md +24 -16
  6. package/templates/agents/cbp-e2e-tauri.md +1 -1
  7. package/templates/agents/cbp-e2e-vscode.md +1 -1
  8. package/templates/agents/cbp-e2e-xcuitest.md +1 -1
  9. package/templates/agents/cbp-improve-claude.md +2 -2
  10. package/templates/agents/{cbp-round-executor.md → cbp-round-builder.md} +23 -23
  11. package/templates/agents/{cbp-task-planner.md → cbp-round-planner.md} +26 -25
  12. package/templates/agents/cbp-security-agent.md +10 -2
  13. package/templates/agents/cbp-stripe-agent.md +2 -2
  14. package/templates/agents/cbp-testing-qa-agent.md +34 -20
  15. package/templates/agents/cbp-verify-reviewer.md +236 -0
  16. package/templates/context/architecture-map.md +4 -4
  17. package/templates/context/mcp-docs.md +57 -11
  18. package/templates/context/testing/e2e.md +9 -9
  19. package/templates/github-workflows/ci.yml +104 -0
  20. package/templates/github-workflows/publish.yml +8 -27
  21. package/templates/github-workflows/release-desktop.yml +215 -0
  22. package/templates/hooks/cbp-skill-context-guard.sh +1 -1
  23. package/templates/hooks/cbp-test-hooks.sh +9 -9
  24. package/templates/hooks/validate-structure-lengths.sh +1 -1
  25. package/templates/hooks/validate-structure-patterns.sh +1 -1
  26. package/templates/rules/README.md +1 -2
  27. package/templates/rules/agent-claim-verification.md +1 -1
  28. package/templates/rules/context-file-loading.md +10 -10
  29. package/templates/rules/development-workflow.md +73 -0
  30. package/templates/rules/e2e-mandatory.md +8 -8
  31. package/templates/rules/execution-proof.md +70 -0
  32. package/templates/rules/model-invocation-convention.md +2 -2
  33. package/templates/rules/parallel-waves.md +11 -11
  34. package/templates/rules/spawn-failure-is-gate-failure.md +76 -0
  35. package/templates/rules/task-routing-recommendation.md +1 -1
  36. package/templates/rules/todo-backend.md +3 -3
  37. package/templates/rules/two-tier-ci.md +63 -0
  38. package/templates/settings.project.base.json +15 -11
  39. package/templates/skills/cbp-build-cc-mode/SKILL.md +1 -1
  40. package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md +7 -7
  41. package/templates/skills/cbp-build-cc-skill/SKILL.md +1 -1
  42. package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md +2 -2
  43. package/templates/skills/cbp-build-cc-skill/reference/fork-eligibility.md +11 -14
  44. package/templates/skills/cbp-checkpoint-check/SKILL.md +11 -3
  45. package/templates/skills/cbp-checkpoint-create/SKILL.md +16 -1
  46. package/templates/skills/cbp-checkpoint-end/SKILL.md +5 -1
  47. package/templates/skills/cbp-checkpoint-update/SKILL.md +3 -3
  48. package/templates/skills/cbp-clear-continue/SKILL.md +2 -2
  49. package/templates/skills/cbp-clear-prep/SKILL.md +3 -3
  50. package/templates/skills/{cbp-task-complete → cbp-finalize}/SKILL.md +25 -29
  51. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/checkpoint-done-branching.md +1 -1
  52. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/next-step-heuristic.md +1 -1
  53. package/templates/skills/cbp-frontend-design/SKILL.md +1 -1
  54. package/templates/skills/cbp-frontend-ui/SKILL.md +7 -7
  55. package/templates/skills/cbp-git-commit/SKILL.md +3 -3
  56. package/templates/skills/cbp-merge-main/SKILL.md +4 -4
  57. package/templates/skills/{cbp-round-execute → cbp-round-build}/SKILL.md +93 -75
  58. package/templates/skills/cbp-round-complete/SKILL.md +15 -14
  59. package/templates/skills/cbp-round-plan/SKILL.md +344 -0
  60. package/templates/skills/cbp-session-end/SKILL.md +1 -1
  61. package/templates/skills/cbp-setup-cd/SKILL.md +291 -0
  62. package/templates/skills/cbp-setup-cd/reference/github-actions-cd.md +231 -0
  63. package/templates/skills/cbp-setup-ci/SKILL.md +175 -0
  64. package/templates/skills/cbp-setup-ci/reference/github-actions.md +100 -0
  65. package/templates/skills/cbp-ship/SKILL.md +21 -0
  66. package/templates/skills/cbp-ship-main/SKILL.md +3 -2
  67. package/templates/skills/cbp-standalone-task-check/SKILL.md +10 -9
  68. package/templates/skills/cbp-standalone-task-complete/SKILL.md +12 -13
  69. package/templates/skills/cbp-standalone-task-create/SKILL.md +16 -9
  70. package/templates/skills/cbp-standalone-task-start/SKILL.md +9 -5
  71. package/templates/skills/cbp-standalone-task-testing/SKILL.md +16 -7
  72. package/templates/skills/cbp-task-create/SKILL.md +6 -7
  73. package/templates/skills/cbp-task-start/SKILL.md +8 -8
  74. package/templates/skills/cbp-todo/SKILL.md +6 -8
  75. package/templates/skills/cbp-verify/SKILL.md +146 -0
  76. package/templates/skills/cbp-verify/reference/deterministic-gates.md +114 -0
  77. package/templates/skills/{cbp-round-end → cbp-verify}/reference/findings-presentation.md +16 -12
  78. package/templates/skills/cbp-verify/reference/round-scope.md +62 -0
  79. package/templates/skills/cbp-verify/reference/task-scope.md +71 -0
  80. package/templates/agents/cbp-improve-round.md +0 -283
  81. package/templates/agents/cbp-task-check.md +0 -217
  82. package/templates/skills/cbp-round-check/SKILL.md +0 -132
  83. package/templates/skills/cbp-round-end/SKILL.md +0 -173
  84. package/templates/skills/cbp-round-end/reference/inline-fallback.md +0 -35
  85. package/templates/skills/cbp-round-execute/reference/inline-fallback.md +0 -55
  86. package/templates/skills/cbp-round-input/SKILL.md +0 -197
  87. package/templates/skills/cbp-round-start/SKILL.md +0 -261
  88. package/templates/skills/cbp-round-update/SKILL.md +0 -120
  89. package/templates/skills/cbp-ship/templates/workflow-eas-submit.yml +0 -53
  90. package/templates/skills/cbp-ship/templates/workflow-vsce-publish.yml +0 -31
  91. package/templates/skills/cbp-task-check/SKILL.md +0 -172
  92. package/templates/skills/cbp-task-testing/SKILL.md +0 -277
@@ -0,0 +1,70 @@
+ ---
+ description: Real execution proof is a non-skippable verify obligation — tiered by what the round touched, every tier producing a COMMITTED artifact, never prose.
+ paths:
+ - ".claude/skills/cbp-verify/**"
+ - ".claude/skills/cbp-round-build/**"
+ - ".claude/agents/cbp-verify-reviewer.md"
+ - ".claude/agents/cbp-e2e-playwright.md"
+ - ".claude/agents/cbp-e2e-maestro.md"
+ - ".claude/agents/cbp-e2e-tauri.md"
+ - ".claude/agents/cbp-e2e-vscode.md"
+ - ".claude/agents/cbp-e2e-xcuitest.md"
+ ---
+
+ # Execution Proof
+
+ "I verified the build" is not proof. Proof is a **committed artifact** that an auditor can
+ re-inspect after the session ends. `cbp-verify` Phase 3 produces it; a passing verdict without
+ it is invalid. The required artifact is **tiered by what the round's diff actually touched** —
+ the tier is chosen from `files_changed`, not from a `has_ui_work` guess.
+
+ ## Tiers
+
+ | Tier | Round touched | Proof obligation | Asserted by |
+ |------|---------------|------------------|-------------|
+ | **1** | A configured e2e framework's `app` source (`.codebyplan/e2e.json`) | `cbp-e2e-*` specialist runs the app and **commits screenshots** to the framework's committed dir | `codebyplan e2e verify-round` (non-empty gallery + non-zero assertions) |
+ | **2** | UI source, but NO e2e framework configured for that app | **MANDATORY** dev-server run + at least one committed route screenshot or HTTP response trace for each changed route | manifest `artifacts[]` + `git ls-files --error-unmatch` |
+ | **3** | Backend / API only (route handlers, server actions, endpoints) | Hit each changed endpoint; record an HTTP status trace (method, path, status, ms) committed to the round artifact dir | manifest `artifacts[]` |
+ | **4** | `claude_only` / docs / config only (no app surface) | Proof IS the build/test commands — `codebyplan check --scope round\|task` (+ `bash -n` for touched hooks); profile-valid, no screenshot | manifest `gates[]` |
+
+ A round can hit multiple tiers; satisfy each tier its diff touches.
+
+ ## Hard Rules
+
+ - **Empty proof on a UI-touching diff is a GATE FAILURE.** A round whose `files_changed`
+ includes UI source but whose manifest carries zero committed screenshots/traces fails verify —
+ route to a fix round that captures the missing artifact. (Mirrors `e2e-mandatory.md`
+ Committed-Screenshot Enforcement; sole exception: `vscode-test`-only behavior rounds.)
+ - **Screenshots must be committed, not `/tmp`.** Each artifact path is proven present with
+ `git ls-files --error-unmatch <path>` — an unstaged or `/tmp` file is not proof.
+ - **Prose is never proof.** A narrative claim with no artifact path does not satisfy any tier.
+
+ ## Manifest Schema
+
+ `cbp-verify` writes a `verify_manifest` into round/task context — the durable record of which
+ gates ran and what proof exists:
+
+ ```yaml
+ verify_manifest:
+ scope: round | task
+ gates: # deterministic gate results
+ - name: gate6 | lint | typecheck | tests | audit
+ exit_code: number
+ new_failures: string[] # post-baseline-diff; [] = pass
+ proof:
+ tier: 1 | 2 | 3 | 4
+ artifacts: # committed proof, one per affected surface
+ - kind: screenshot | http_trace | command_log
+ path: string # repo-relative; verified via git ls-files --error-unmatch
+ affected: string # route / endpoint / file this proves
+ e2e_verify_round: # present for Tier 1
+ pass: boolean
+ failed_checks: string[] # e2e_eligible_skipped | zero_assertion_run | empty_gallery
+ decided_at: ISO8601
+ ```
+
+ ## Cross-References
+
+ - `rules/e2e-mandatory.md` — Tier 1 opt-out contract + committed-screenshot mandate.
+ - `rules/two-tier-ci.md` — how proof feeds the soft (round/task) vs hardcore (checkpoint) tiers.
+ - `skills/cbp-verify/reference/deterministic-gates.md` — the gate command contracts + manifest write.
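
The Hard Rules in this new file lend themselves to a mechanical check. The following is a minimal Python sketch, not CodeByPlan's implementation: the manifest is assumed to be a plain dict with `proof.artifacts[]`, and `UI_SUFFIXES`, `git_tracked`, and `proof_failures` are hypothetical names chosen for illustration. Only the `git ls-files --error-unmatch` command comes from the rule text itself.

```python
import subprocess
from typing import Callable

# Assumption: which suffixes count as "UI source" is illustrative only.
UI_SUFFIXES = (".tsx", ".jsx", ".scss")


def git_tracked(path: str) -> bool:
    """True when git tracks `path`; --error-unmatch makes git exit non-zero otherwise."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", path],
        capture_output=True,
    )
    return result.returncode == 0


def proof_failures(
    files_changed: list[str],
    manifest: dict,
    is_tracked: Callable[[str], bool] = git_tracked,
) -> list[str]:
    """Return gate failures as messages; an empty list means the proof holds."""
    failures = []
    artifacts = manifest.get("proof", {}).get("artifacts", [])
    touches_ui = any(f.endswith(UI_SUFFIXES) for f in files_changed)
    visual = [a for a in artifacts if a["kind"] in ("screenshot", "http_trace")]
    # Hard rule: a UI-touching diff with zero screenshots/traces fails verify.
    if touches_ui and not visual:
        failures.append("empty proof on a UI-touching diff")
    # Hard rule: every artifact path must be known to git, never a /tmp file.
    for artifact in artifacts:
        if not is_tracked(artifact["path"]):
            failures.append(f"artifact not committed: {artifact['path']}")
    return failures
```

Passing `is_tracked` as a parameter keeps the rule logic testable without a repository; a real caller would leave the default in place.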
@@ -7,8 +7,8 @@ a skill is strictly user-only (i.e. it must never auto-trigger from another skil
 
  The absence of `disable-model-invocation` (or `disable-model-invocation: false`) is the normal
  state. It allows the skill to be auto-triggered via the Skill tool from within other skills —
- which is how the auto-trigger close-out flow works (e.g. `cbp-task-check` → `cbp-task-testing`,
- `cbp-task-testing` → `cbp-task-complete`).
+ which is how the auto-trigger close-out flow works (e.g. `cbp-round-build` → `cbp-verify`,
+ `cbp-verify` task scope → `cbp-finalize`).
 
  ## The sole exception: `cbp-round-complete`
 
@@ -1,24 +1,24 @@
  ---
  name: parallel-waves
- description: Wave schema, invariants, and proximity-split algorithm for cbp-task-planner Phase 5.6 wave decomposition.
+ description: Wave schema, invariants, and proximity-split algorithm for cbp-round-planner Phase 5.6 wave decomposition.
  paths:
- - .claude/agents/cbp-task-planner.md
+ - .claude/agents/cbp-round-planner.md
  ---
 
  # Parallel Waves
 
- Authoritative expansion of `cbp-task-planner` Phase 5.6. The planner reads this file at wave decomposition time.
+ Authoritative expansion of `cbp-round-planner` Phase 5.6. The planner reads this file at wave decomposition time.
 
  ## Wave Schema
 
- Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-task-planner.md` Phase 5.6 "Output" block):
+ Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 "Output" block):
 
  ```yaml
  - name: string # short identifier, e.g. "web-ui", "backend", "db"
- agent_type: 'round-executor' | 'inline'
+ agent_type: 'round-builder' | 'inline'
  files: string[] # repo-relative paths owned by this wave
  depends_on: string[] # names of waves that must complete before this one starts
- skill_preloads: string[] # skills invoked by the executor before Step 3 (e.g. "frontend-design")
+ skill_preloads: string[] # skills invoked by the builder before Step 3 (e.g. "frontend-design")
  note: string # optional — required on continuation waves from an arbitrary-boundary split
  ```
 
@@ -31,9 +31,9 @@ Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-t
  **(III) 3–15 files per wave** — every wave holds between 3 and 15 files (inclusive).
  - Below 3: merge into a sibling wave.
  - Above 15: apply the proximity-split algorithm below.
- - Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-task-planner` Phase 5.6). Zero waves (omitted `waves[]`) trivially satisfies this invariant.
+ - Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-round-planner` Phase 5.6). Zero waves (omitted `waves[]`) trivially satisfies this invariant.
 
- **(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-task-planner.md` Phase 5.6 step "Populate `skill_preloads[]`").
+ **(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 step "Populate `skill_preloads[]`").
 
  ## Proximity-Split Algorithm
 
@@ -57,7 +57,7 @@ Invariants I (disjoint files), II (acyclic `depends_on` DAG), and III (3–15 fi
 
  ## Cross-References
 
- - `agents/cbp-task-planner.md` Phase 5.6 — consumer of this rule; steps 1–6 and the `validate-waves` verification call.
+ - `agents/cbp-round-planner.md` Phase 5.6 — consumer of this rule; steps 1–6 and the `validate-waves` verification call.
  - `packages/codebyplan-package/src/lib/validate-waves.ts` — deterministic enforcement of invariants I–III.
- - `agents/cbp-round-executor.md` Step 2.6 — wave-mode skill preloads.
- - `skills/cbp-round-execute/SKILL.md` Step 3 — per-wave executor dispatch.
+ - `agents/cbp-round-builder.md` Step 2.6 — wave-mode skill preloads.
+ - `skills/cbp-round-build/SKILL.md` Step 3 — per-wave builder dispatch.
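
Invariants I to III named in this file (disjoint files, acyclic `depends_on`, 3 to 15 files per wave) can be checked in a few lines. The sketch below is an illustration in Python and is not the contents of `validate-waves.ts`; the function name, the error strings, and the reading of the small-plan exemption (a single wave with fewer than 3 files is allowed) are assumptions.

```python
def validate_waves(waves: list[dict]) -> list[str]:
    """Check invariants I-III on plan.waves[]; return a list of violations."""
    errors = []

    # Invariant I: no file may be owned by two waves.
    owner = {}
    for wave in waves:
        for path in wave["files"]:
            if path in owner:
                errors.append(f"I: {path} is owned by both {owner[path]} and {wave['name']}")
            else:
                owner[path] = wave["name"]

    # Invariant II: depends_on must form a DAG. Repeatedly release every wave
    # whose dependencies are all done; if nothing can be released, there is a cycle.
    names = {w["name"] for w in waves}
    remaining = {}
    for wave in waves:
        deps = set(wave.get("depends_on", []))
        for dep in sorted(deps - names):
            errors.append(f"II: {wave['name']} depends on unknown wave {dep}")
        remaining[wave["name"]] = deps & names
    done = set()
    while remaining:
        ready = [n for n, deps in remaining.items() if deps <= done]
        if not ready:
            errors.append("II: dependency cycle among " + ", ".join(sorted(remaining)))
            break
        for name in ready:
            done.add(name)
            del remaining[name]

    # Invariant III: 3-15 files per wave, except a single-wave plan under 3 files.
    total = sum(len(w["files"]) for w in waves)
    small_plan = len(waves) == 1 and total < 3
    for wave in waves:
        count = len(wave["files"])
        if count > 15 or (count < 3 and not small_plan):
            errors.append(f"III: {wave['name']} has {count} files (expected 3-15)")
    return errors
```

An empty `waves` list returns no errors, which matches the statement that zero waves trivially satisfies the invariant.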
@@ -0,0 +1,76 @@
+ ---
+ description: A subagent spawn failure is a HARD GATE FAILURE — STOP and retry, never walk the agent's steps inline and self-certify.
+ paths:
+ - ".claude/skills/cbp-verify/**"
+ - ".claude/skills/cbp-round-build/**"
+ - ".claude/skills/cbp-finalize/**"
+ - ".claude/agents/cbp-verify-reviewer.md"
+ - ".claude/agents/cbp-round-builder.md"
+ ---
+
+ # Spawn Failure Is Gate Failure
+
+ When a verify/execution stage delegates work to a subagent (e.g. `cbp-verify` spawning
+ `cbp-verify-reviewer`, `cbp-round-build` spawning `cbp-round-builder`), the agent is the
+ **fresh-context oracle**. If the agent cannot run, the orchestrator does NOT have an
+ equivalent signal — and it must NEVER manufacture one.
+
+ ## The Rule
+
+ A **spawn failure** — the agent could not run, or died on a terminal error before producing
+ its output contract — is a **HARD GATE FAILURE**. The orchestrator STOPS and surfaces a retry
+ directive. It does NOT walk the agent's phase checklist inline with its own tools and grade its
+ own work. Self-certification by the orchestrator that spawned the agent is precisely the
+ fresh-context blind spot the agent exists to remove; reproducing the agent's steps inline
+ re-introduces it.
+
+ Spawn-failure classes (non-exhaustive): provider 5xx, rate-limit / monthly-cap / billing block,
+ context overflow at spawn, the agent process dying before emitting its output contract.
+
+ **Retry directive shape** (surface verbatim, then STOP):
+
+ ```
+ ## Verify blocked — reviewer could not spawn
+
+ The fresh-context reviewer (<agent>) failed to spawn: <class> — <verbatim error>.
+ This is a hard gate failure, not a pass. Retry when capacity returns:
+ Next: /cbp-verify
+ ```
+
+ Record `<scope>.context.verify.spawn_failure = { agent, class, error_message, decided_at }` so
+ the retry is auditable and a verdict is never written on a missing review.
+
+ ## Spawn-Failed vs Spawn-Ran-And-Found-Problems
+
+ These are different outcomes with opposite routes — do not conflate them:
+
+ | Outcome | Meaning | Route |
+ |---------|---------|-------|
+ | **Spawn failed** | Agent never produced its output contract (terminal error). | HARD GATE FAILURE → STOP + retry directive. No verdict written. |
+ | **Spawn ran, found problems** | Agent returned findings / `NOT_READY`. | Normal flow → in-scope mechanical fix or `/cbp-round-plan` fix round. |
+
+ A returned `NOT_READY` is a *successful* review with a negative verdict — it is acted on, not
+ retried. Only the absence of a contract is a spawn failure.
+
+ ## Carve-Out: The `claude_only` Profile Is Not Inline Fallback
+
+ The `claude_only` profile (rounds with no app surface — `.claude/`-only edits, docs, config)
+ has **no agent to spawn by design**. Its proof IS the deterministic command set:
+ `codebyplan check --scope round|task` plus `bash -n <hook>` for any touched shell file. Running
+ those inline is a **first-class deterministic verification path**, not a banned inline fallback —
+ there was never a subagent to substitute for. This carve-out applies ONLY when the resolved
+ profile is `claude_only`; for every other profile an agent is expected, and its spawn failure is
+ a hard gate failure per above.
+
+ ## Why (Replaces Inline-Fallback Self-Certification)
+
+ The retired `inline-fallback.md` procedures let an orchestrator that just failed to spawn an
+ agent walk that agent's steps and pass its own work. That defeats the entire point of a
+ fresh-context review and silently downgraded quality under sustained outages. This rule replaces
+ those procedures: a missing review is a STOP, not a self-graded continue.
+
+ ## Cross-References
+
+ - `skills/cbp-verify/SKILL.md` Phase 4 — the reviewer spawn + this hard-fail.
+ - `agents/cbp-verify-reviewer.md` — the reviewer whose absence triggers this rule.
+ - `rules/execution-proof.md` — the proof obligation a passing verdict still requires.
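
The distinction this rule draws (no output contract versus a contract carrying a negative verdict) reduces to a small routing function. The Python below is a hedged sketch: `SpawnResult`, the `verdict` key, and the route labels are hypothetical, and only the four fields of the `spawn_failure` record are taken from the rule text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SpawnResult:
    agent: str
    contract: Optional[dict] = None      # the agent's output contract, if any
    error_class: Optional[str] = None    # e.g. "provider 5xx", "rate-limit"
    error_message: Optional[str] = None  # verbatim error text


def route(result: SpawnResult) -> str:
    """Map a spawn outcome to the next step; never self-certify on a missing contract."""
    if result.contract is None:
        return "STOP_AND_RETRY"          # hard gate failure, no verdict written
    if result.contract.get("verdict") == "NOT_READY":
        return "FIX_ROUND"               # a successful review with a negative verdict
    return "CONTINUE"


def spawn_failure_record(result: SpawnResult) -> dict:
    """Build the auditable record with the four fields the rule names."""
    return {
        "agent": result.agent,
        "class": result.error_class,
        "error_message": result.error_message,
        "decided_at": datetime.now(timezone.utc).isoformat(),
    }
```

Note that `route` has no branch that inspects the error and proceeds anyway; the absence of a contract always stops, which is the point of the rule.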
@@ -12,7 +12,7 @@ CodeByPlan has two families of task commands since CHK-141:
 
  | Family | Commands | When to use |
  |--------|----------|-------------|
- | Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-task-check`, `/cbp-task-testing`, `/cbp-task-complete` | Work that belongs to a CHK-NNN checkpoint |
+ | Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-verify`, `/cbp-finalize` | Work that belongs to a CHK-NNN checkpoint |
  | Standalone | `/cbp-standalone-task-create`, `/cbp-standalone-task-start {task}`, `/cbp-standalone-task-check`, `/cbp-standalone-task-testing`, `/cbp-standalone-task-complete` | Independent work not tied to any checkpoint |
 
  ## Round Commands (Both Families)
@@ -62,8 +62,8 @@ The queue head (`get_todos` `rows[0]`) maps to one of these slash commands. The
 
  | State | Command | Required context |
  |-------|---------|------------------|
- | Round in progress | `/cbp-round-update` | `{checkpoint_id, task_id, round_id}` |
- | Round pending start | `/cbp-round-start` | `{checkpoint_id, task_id}` |
+ | Round in progress | `/cbp-verify` | `{checkpoint_id, task_id, round_id}` |
+ | Round pending start | `/cbp-round-plan` | `{checkpoint_id, task_id}` |
  | Task pending start | `/cbp-task-start` | `{checkpoint_id, task_id}` or `{task_id}` for standalone |
  | Checkpoint pending activation | `/cbp-checkpoint-update` | `{checkpoint_id}` |
  | Checkpoint done | `/cbp-checkpoint-check` | `{checkpoint_id}` |
@@ -118,4 +118,4 @@ CHK-111 shipped the original todos queue as Postgres triggers + a 583-LOC `regen
  4. Env vars (from `apps/todo-worker/.env.example`): `SUPABASE_URL`, `SUPABASE_SECRET_KEY` (an `sb_secret_...` key), `LOG_LEVEL`, `WORKER_POLL_MS`.
  5. Save the resulting `project_ref` to `.codebyplan.json` `shipment.surfaces.railway-todo-worker.project_ref`.
 
- Smoke after deploy: run `/cbp-task-complete` in any worktree → tail Railway logs → expect a `claim → apply` cycle within `WORKER_POLL_MS`.
+ Smoke after deploy: run `/cbp-finalize` in any worktree → tail Railway logs → expect a `claim → apply` cycle within `WORKER_POLL_MS`.
@@ -0,0 +1,63 @@
+ ---
+ description: Two CI tiers — soft (round/task → feat) is baseline-tolerant; hardcore (checkpoint → main) is whole-repo absolute green. Branch model is feat→main direct.
+ paths:
+ - ".claude/skills/cbp-verify/**"
+ - ".claude/skills/cbp-checkpoint-check/**"
+ - ".claude/skills/cbp-checkpoint-end/**"
+ - ".claude/skills/cbp-ship-main/**"
+ - ".codebyplan/ci.json"
+ ---
+
+ # Two Tier CI
+
+ CodeByPlan gates work at two strictness tiers. The tier is chosen by **what is being
+ promoted**, not by preference.
+
+ ## Soft Tier — round / task → feat branch
+
+ Runs at every `cbp-verify` (round scope) and the task-scope escalation. **Baseline-tolerant**:
+ pre-existing red is non-blocking; only NEW per-package failures fail.
+
+ - `codebyplan check --scope round|task` (NO `--no-baseline`). Each baselined check
+ (`lint` / `typecheck` / `tests` / `audit`) fails ONLY when its `new_failures[]` is non-empty
+ vs `.check-baseline.json`. `gate6` (sibling-identity parity) is **always hard** — never
+ baselined.
+ - `codebyplan e2e verify-round --round-id <id> --task-id <id>` per round (Tier-1 e2e proof).
+ - Fresh-context review via `cbp-verify-reviewer` (its spawn failure is a hard gate failure —
+ `rules/spawn-failure-is-gate-failure.md`).
+
+ The soft tier keeps the inner loop fast: a feat branch may carry the repo's known baseline red
+ forward without blocking, while guaranteeing the work being added is itself clean.
+
+ ## Hardcore Tier — checkpoint → main
+
+ Runs at checkpoint close (`cbp-checkpoint-check` / `cbp-checkpoint-end` / ship). **Zero baseline
+ forgiveness — whole-repo absolute green.**
+
+ - `codebyplan check --scope merged --no-baseline` = every failing package and every GHSA id
+ counts; any red fails. (`gate6` unchanged — still always hard.)
+ - Aggregate e2e proof across the whole checkpoint diff.
+ - Every required `main` branch-protection PR check is green (repo-specific — read the repo's
+ configured required checks, never assume a single hardcoded check name).
+
+ ## Critical Constraint — feat→main DIRECT, main-only
+
+ The branch model is **feat→main direct**; `.codebyplan/git.json` has `integration: null`,
+ `production: "main"`. There is **NO intermediate integration branch** — the "checkpoint branch"
+ IS the per-checkpoint feat branch. The hardcore tier runs against that feat branch's merged
+ state before it lands on main; do not assume a staging/integration hop exists.
+
+ ## Report-Only Rollout
+
+ The whole-repo hardcore CI **job** lands **report-only first** (`continue-on-error: true`) and is
+ flipped to a required check ONLY after the `apps/web` baseline is burned down. Until then,
+ `--scope merged --no-baseline` is advisory in CI — surfaced, not enforced — so a pre-existing
+ `apps/web` red does not block a merge while the baseline is still being paid down. Locally,
+ `cbp-verify` still runs and reports it.
+
+ ## Cross-References
+
+ - `rules/execution-proof.md` — the committed-artifact obligation feeding both tiers.
+ - `rules/spawn-failure-is-gate-failure.md` — fresh-context review is non-substitutable.
+ - `skills/cbp-verify/reference/deterministic-gates.md` — exact gate commands + JSON contracts.
+ - `.codebyplan/git.json` — authoritative branch model (`integration: null`, `production: main`).
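
The difference between the two tiers comes down to how one gate result is read. A minimal Python sketch follows, assuming each gate is a dict shaped like an entry of the manifest's `gates[]` (`name`, `exit_code`, `new_failures`) and that a zero `exit_code` means absolute green; the function name and the `"soft"` / `"hardcore"` labels are illustrative, not part of the CLI.

```python
# The four checks the rule lists as baselined; gate6 is deliberately absent.
BASELINED = {"lint", "typecheck", "tests", "audit"}


def gate_passes(gate: dict, tier: str) -> bool:
    """Evaluate one gate result under the "soft" or "hardcore" tier."""
    if tier == "soft" and gate["name"] in BASELINED:
        # Soft tier: pre-existing red is tolerated, only NEW failures fail.
        return not gate["new_failures"]
    # gate6 in any tier, and every gate in the hardcore tier: any red fails.
    return gate["exit_code"] == 0
```

For example, a `lint` gate with `exit_code` 1 and an empty `new_failures` list passes the soft tier and fails the hardcore tier, while a red `gate6` fails both.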
@@ -56,9 +56,9 @@
  "Skill(cbp-checkpoint-check)",
  "Skill(cbp-checkpoint-complete)",
  "Skill(cbp-round-complete)",
- "Skill(cbp-round-execute)",
+ "Skill(cbp-round-build)",
  "Skill(cbp-session-end)",
- "Skill(cbp-task-complete)",
+ "Skill(cbp-finalize)",
  "Skill(cbp-standalone-task-create)",
  "Skill(cbp-standalone-task-start)",
  "Skill(cbp-standalone-task-complete)",
@@ -126,13 +126,10 @@
  "Skill(cbp-map-architecture)",
  "Skill(cbp-merge-main)",
  "Skill(cbp-refresh-arch-map)",
- "Skill(cbp-refresh-infra)",
- "Skill(cbp-round-check)",
- "Skill(cbp-round-end)",
- "Skill(cbp-round-input)",
- "Skill(cbp-round-start)",
- "Skill(cbp-round-update)",
+ "Skill(cbp-round-plan)",
  "Skill(cbp-session-start)",
+ "Skill(cbp-setup-cd)",
+ "Skill(cbp-setup-ci)",
  "Skill(cbp-setup-e2e)",
  "Skill(cbp-setup-eslint)",
  "Skill(cbp-ship-configure)",
@@ -142,11 +139,10 @@
  "Skill(cbp-supabase-branch-check)",
  "Skill(cbp-supabase-migrate)",
  "Skill(cbp-supabase-setup)",
- "Skill(cbp-task-check)",
  "Skill(cbp-task-create)",
  "Skill(cbp-task-start)",
- "Skill(cbp-task-testing)",
  "Skill(cbp-todo)",
+ "Skill(cbp-verify)",
  "Skill(supabase)",
  "Skill(supabase-postgres-best-practices)",
  "mcp__codebyplan__get_checkpoints",
@@ -212,6 +208,8 @@
  "Bash(npx codebyplan ports:*)",
  "Bash(codebyplan tech-stack:*)",
  "Bash(npx codebyplan tech-stack:*)",
+ "Bash(codebyplan docs:*)",
+ "Bash(npx codebyplan docs:*)",
  "Bash(codebyplan eslint:*)",
  "Bash(npx codebyplan eslint:*)",
  "Bash(codebyplan lsp:*)",
@@ -226,6 +224,8 @@
  "Bash(npx codebyplan checkpoint:*)",
  "Bash(codebyplan task:*)",
  "Bash(npx codebyplan task:*)",
+ "Bash(codebyplan standalone-task:*)",
+ "Bash(npx codebyplan standalone-task:*)",
  "Bash(codebyplan session:*)",
  "Bash(npx codebyplan session:*)",
  "Bash(codebyplan help:*)",
@@ -249,7 +249,11 @@
  "Bash(codebyplan e2e:*)",
  "Bash(npx codebyplan e2e:*)",
  "Bash(codebyplan arch-map:*)",
- "Bash(npx codebyplan arch-map:*)"
+ "Bash(npx codebyplan arch-map:*)",
+ "Bash(codebyplan ci:*)",
+ "Bash(npx codebyplan ci:*)",
+ "Bash(codebyplan cd:*)",
+ "Bash(npx codebyplan cd:*)"
  ]
  },
  "attribution": {
@@ -38,7 +38,7 @@ A skill that carries a `model:` line is a **gap** — remove it unless a deliber
 
  ### Agents — `model:` + `effort:`
 
- Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-improve-round`, `cbp-research`, `cbp-round-executor`, `cbp-security-agent`, `cbp-task-check`, `cbp-task-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The other two are exceptions:
+ Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-research`, `cbp-round-builder`, `cbp-security-agent`, `cbp-stripe-agent`, `cbp-verify-reviewer`, `cbp-round-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The other two are exceptions:
 
  | agent | model | effort | reason |
  | -------------------- | ------ | ------ | ----------------------------------------------------------------------------------- |
@@ -22,7 +22,7 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
 
  ### `allow` — the autonomous workflow surface
 
- - **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-input`/`-start`/`-update` — `cbp-round-update` is autonomous triage that only reads round state and routes to `cbp-round-complete` or `cbp-round-input`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-check`/`-create`/`-start`/`-testing`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
+ - **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-plan`, `cbp-verify` — `cbp-verify` is the autonomous verify stage that runs deterministic gates, proves execution, spawns the fresh-context reviewer, and routes to `cbp-round-complete` or `cbp-round-plan`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-create`/`-start`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
  - **All `mcp__codebyplan__*` reads** (`get_*`, `list_*`, `search_*`, `health_check`, `lookup_symbol`, `resolve_library_id`, `get_chunk`).
  - **Routine workflow-write MCP tools** the pipeline calls many times per task: create/update/complete checkpoint, task, and round; session log + session-state writes; `create_worktree`, `add_library`, `flag_stale_chunk`, `update_server_config`, `update_eslint_repo_config`, `update_task_template`. Gating these with `ask` would make the autonomous workflow unusable.
  - **Read/safe CLI commands** (both `codebyplan X` and `npx codebyplan X`): `whoami`, `resolve-worktree`, `statusline`, `ports`, `tech-stack`, `eslint`, `round`, `help`, `--version`.
@@ -30,8 +30,8 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
  ### `ask` — the deliberate confirm-gate
 
  - **Production-shipment skills**: `cbp-ship`, `cbp-ship-main`, `cbp-checkpoint-end` — these promote/deploy to production, so they prompt even in an otherwise auto-allowed setup.
- - **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-task-complete`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt replaces the in-skill confirmation that used to live in `cbp-round-update` — which is now an autonomous, `allow`-tier triage step.
34
- - **Plan-approval gate**: `cbp-round-execute` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-start` runs its planning Q&A, then hands off to `cbp-round-execute`; the permission prompt is the user's go/no-go on the plan.
33
+ - **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-finalize`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt is the human gate downstream of `cbp-verify` — the autonomous, `allow`-tier verify stage whose triage routes here.
34
+ - **Plan-approval gate**: `cbp-round-build` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-plan` runs its planning Q&A, then hands off to `cbp-round-build`; the permission prompt is the user's go/no-go on the plan.
35
35
  - **Destructive / admin MCP tools**: `delete_session_log`, `delete_worktree`, `create_repo`, `release_assignment`. (The launch and member-admin tools were dropped from the MCP surface in CHK-180 — those concerns are web-app only now.)
36
36
  - **Mutating / external / clobber-risk CLI commands** (both prefixes): `setup`, `login`, `logout`, `upgrade-auth`, `config` (can overwrite committed `.codebyplan/` files), `branch` (rewrites branch config), `ship`, `claude` (`install`/`update`/`uninstall` overwrite `.claude/`).
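As an illustration of the `deny > ask > allow` precedence stated in the hunk header above, here is a minimal sketch of how one effective tier could be chosen when several rules match the same skill or command. The function name `effective_tier` and the `none` result for "no rule matched" are assumptions made for this example; this is not the harness's actual resolver.

```shell
# Hypothetical sketch, not the harness's real resolver.
# effective_tier prints the winning tier, given the tier of every rule that matched.
effective_tier() {
  winner="none"   # assumption: "none" stands for "no rule matched"
  for tier in "$@"; do
    case "$tier" in
      # deny always wins, regardless of order
      deny)  winner="deny" ;;
      # ask beats allow, but never overrides deny
      ask)   if [ "$winner" != "deny" ]; then winner="ask"; fi ;;
      # allow only applies when nothing stronger has matched
      allow) if [ "$winner" = "none" ]; then winner="allow"; fi ;;
    esac
  done
  echo "$winner"
}
```

For example, `effective_tier allow ask` prints `ask`: a skill matched by a broad `allow` pattern still prompts if an `ask` rule also names it.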
37
37
 
@@ -53,11 +53,11 @@ A skill invokes the next skill via the Skill tool at the appropriate routing bra
53
53
  ### How the human gate works
54
54
 
55
55
  - **`allow`-tier** skill: the harness auto-fires it silently when the triggering skill invokes it.
56
- No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-task-testing`,
57
- `cbp-round-input`) where the trigger condition already encodes the human intent.
56
+ No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-verify`,
57
+ `cbp-round-plan`) where the trigger condition already encodes the human intent.
58
58
  - **`ask`-tier** skill: the harness pauses and shows a permission prompt before the skill runs.
59
59
  **That prompt IS the human gate** — it replaces the old "Next: /cbp-X, run it yourself"
60
- manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-task-complete`,
60
+ manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-finalize`,
61
61
  `cbp-checkpoint-check`) where a deliberate confirmation is still desirable.
62
62
 
63
63
  This means:
@@ -70,7 +70,7 @@ This means:
70
70
 
71
71
  The `cbp-skill-context-guard.sh` PreToolUse hook denies heavy close-out skills when the
72
72
  context window exceeds `CBP_CONTEXT_WARN_TOKENS` (default 200 000 tokens). The heavy allowlist
73
- is: `cbp-round-execute`, `cbp-task-testing`, `cbp-standalone-task-testing`,
73
+ is: `cbp-round-build`, `cbp-verify`, `cbp-standalone-task-testing`,
74
74
  `cbp-checkpoint-check`, `cbp-checkpoint-end`.
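The guard's decision can be sketched roughly as follows. This is a simplified illustration, not the shipped `cbp-skill-context-guard.sh`: the function names are hypothetical, and the current token count is assumed to be passed in by the caller rather than read from the PreToolUse payload.

```shell
# Hypothetical sketch of the guard decision; the real hook may differ.
CBP_CONTEXT_WARN_TOKENS="${CBP_CONTEXT_WARN_TOKENS:-200000}"

# Succeeds when the skill is on the heavy allowlist described above.
is_heavy_skill() {
  case "$1" in
    cbp-round-build|cbp-verify|cbp-standalone-task-testing|cbp-checkpoint-check|cbp-checkpoint-end)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# $1 = skill name, $2 = current context size in tokens (assumed input).
guard_decision() {
  if is_heavy_skill "$1" && [ "$2" -gt "$CBP_CONTEXT_WARN_TOKENS" ]; then
    echo "deny"    # heavy skill with the context window over the threshold
  else
    echo "allow"   # light skill, or context still within budget
  fi
}
```

Under these assumptions, only a heavy skill invoked with the context above the threshold is denied; everything else passes through.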
75
75
 
76
76
  When the guard fires, it directs the model to run `/cbp-clear-prep` instead. The flow is:
@@ -81,7 +81,7 @@ A Task-pattern skill that must only run on explicit user confirmation is a **per
81
81
 
82
82
  - MUST carry `disable-model-invocation: true` — the model cannot invoke it; only the user can (via `/skill-name`).
83
83
  - Any upstream skill that auto-triggers it MUST instead emit a `Next: /skill-name` directive and STOP — model invocation of a `disable-model-invocation` skill is blocked at the runtime level.
84
- - Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-round-update` routes a clean triage via a `Next: /cbp-round-complete` directive and stops — it cannot invoke round-complete directly.
84
+ - Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-verify` routes a clean round via a `Next: /cbp-round-complete` directive and stops — it cannot invoke round-complete directly.
85
85
 
86
86
  ### Step 5 — Fill the frontmatter
87
87
 
@@ -79,14 +79,14 @@ A skill should do one thing in the pipeline. If a skill both plans AND executes,
79
79
 
80
80
  | Wrong | Right |
81
81
  | --------------------------------------- | ------------------------------------------------------------ |
82
- | `/cbp-round` (plans + executes + tests) | `/cbp-round-start` → `/cbp-round-execute` → `/cbp-round-end` |
82
+ | `/cbp-round` (plans + executes + tests) | `/cbp-round-plan` → `/cbp-round-build` → `/cbp-verify` |
83
83
 
84
84
  ### Pipeline Clarity
85
85
 
86
86
  If the skill is part of a chain, show it:
87
87
 
88
88
  ```
89
- /cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
89
+ /cbp-round-plan (planning) → /cbp-round-build (ask-tier permission = plan approval)
90
90
  ```
91
91
 
92
92
  ### Approval Gates
@@ -4,7 +4,7 @@
4
4
  parent conversation and, per the runtime, **runs in the background by default**. It is
5
5
  isolation for a *whole skill*, not a way to delegate one sub-step. A forked body therefore
6
6
  cannot drive the main pipeline: it can't `AskUserQuestion`, can't auto-trigger another
7
- skill, and can't run an inline-fallback that the orchestrator depends on.
7
+ skill, and can't run the deterministic fallback path the orchestrator depends on.
8
8
 
9
9
  So forking only helps a narrow shape of skill. The canonical eligible example is
10
10
  [examples/fork-skill.md](../examples/fork-skill.md): a single self-contained analytical task
@@ -19,20 +19,20 @@ A skill is **fork-eligible** only when ALL hold:
19
19
  3. It does **not route** — no auto-trigger of another skill, no close-out directive that must
20
20
  fire in the main context.
21
21
  4. It does **not fan out** — it does not spawn multiple subagents and coordinate them.
22
- 5. It has **no inline-fallback** contract the orchestrator relies on.
22
+ 5. It has **no deterministic fallback** path the orchestrator relies on.
23
23
 
24
24
  Fail any one → the skill stays **inline** (main context). Inline skills still get clean
25
25
  context isolation the right way: by delegating their heavy step to a dedicated **agent**
26
- (e.g. `cbp-task-check`, `cbp-improve-round`, `cbp-round-executor`). The agent is the
26
+ (e.g. `cbp-verify-reviewer`, `cbp-round-builder`). The agent is the
27
27
  isolation boundary; the skill stays in the main thread to orchestrate, route, and interact.
28
28
 
29
29
  ## When NOT to use `context: fork` (the disqualifying patterns)
30
30
 
31
31
  | Pattern | Why it can't fork | Example skills |
32
32
  |---------|-------------------|----------------|
33
- | **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-execute`, `cbp-checkpoint-check`, `cbp-map-architecture`, `cbp-refresh-arch-map` |
34
- | **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill / runs inline-fallback | `cbp-task-check`, `cbp-standalone-task-check`, `cbp-round-start`, `cbp-round-end`, `cbp-checkpoint-plan` |
35
- | **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-task-complete`, `cbp-round-update`, `cbp-merge-main` |
33
+ | **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-build`, `cbp-checkpoint-check`, `cbp-map-architecture`, `cbp-refresh-arch-map` |
34
+ | **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill | `cbp-verify`, `cbp-standalone-task-check`, `cbp-round-plan`, `cbp-checkpoint-plan` |
35
+ | **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-finalize`, `cbp-merge-main` |
36
36
  | **consumed-inline** | invoked *by* an agent (e.g. round-executor) and applies fixes synchronously into that context | `cbp-frontend-design`, `cbp-frontend-ui`, `cbp-frontend-ux` |
37
37
  | **doc-ref-only** | mentions subagents/fork only as documentation; runs inline authoring | the `cbp-build-cc-*` authoring skills, `cbp-supabase-migrate` |
38
38
 
@@ -40,28 +40,25 @@ isolation boundary; the skill stays in the main thread to orchestrate, route, an
40
40
 
41
41
  Every skill whose `SKILL.md` touches the subagent/fork boundary — by spawning a subagent, by
42
42
  being invoked inline by an agent, or by documenting the feature — was classified against the
43
- eligibility test. **Result: 0 of 25 are fork-eligible** — none were migrated, because every
43
+ eligibility test. **Result: 0 of 22 are fork-eligible** — none were migrated, because every
44
44
  one either already isolates heavy work in a dedicated agent (the correct boundary) or depends
45
45
  on inline orchestration/interaction that a background fork would break.
46
46
 
47
47
  | Skill | Pattern | Fork-eligible |
48
48
  |-------|---------|:---:|
49
- | cbp-round-execute | fan-out | no |
49
+ | cbp-round-build | fan-out | no |
50
50
  | cbp-checkpoint-check | fan-out | no |
51
51
  | cbp-map-architecture | fan-out | no |
52
52
  | cbp-refresh-arch-map | fan-out | no |
53
- | cbp-round-start | spawn-then-route | no |
54
- | cbp-round-end | spawn-then-route | no |
55
- | cbp-task-check | spawn-then-route | no |
53
+ | cbp-round-plan | spawn-then-route | no |
54
+ | cbp-verify | spawn-then-route | no |
56
55
  | cbp-standalone-task-check | spawn-then-route | no |
57
56
  | cbp-checkpoint-plan | spawn-then-route | no |
58
- | cbp-round-update | inline-by-design | no |
59
57
  | cbp-task-create | inline-by-design | no |
60
58
  | cbp-standalone-task-create | inline-by-design | no |
61
- | cbp-task-complete | inline-by-design | no |
59
+ | cbp-finalize | inline-by-design | no |
62
60
  | cbp-standalone-task-complete | inline-by-design | no |
63
61
  | cbp-merge-main | inline-by-design | no |
64
- | cbp-task-testing | inline-by-design | no |
65
62
  | cbp-standalone-task-testing | inline-by-design | no |
66
63
  | cbp-frontend-design | consumed-inline | no |
67
64
  | cbp-frontend-ui | consumed-inline | no |
@@ -1,4 +1,5 @@
1
1
  ---
2
+ scope: org-shared
2
3
  name: cbp-checkpoint-check
3
4
  description: Full re-evaluation of a checkpoint with before/after comparison
4
5
  argument-hint: [CHK-NNN]
@@ -83,7 +84,14 @@ Aggregate QA from all tasks and rounds:
83
84
  | TASK-[N] | READY | all_pass | [N] |
84
85
  ```
85
86
 
86
- Re-run build/lint/types on current codebase to verify nothing regressed across tasks.
87
+ Re-run build/lint/types on the current codebase to verify nothing regressed across tasks. Detect `$PLATFORM` from the project type (same signal table as `cbp-testing-qa-agent.md` Step 1), then resolve commands from `.codebyplan/ci.json`:
88
+
89
+ ```bash
90
+ CI_BUILD_CMD=$(npx codebyplan ci resolve build --platform "$PLATFORM" 2>/dev/null)
91
+ CI_TYPES_CMD=$(npx codebyplan ci resolve typecheck --platform "$PLATFORM" 2>/dev/null)
92
+ ```
93
+
94
+ Run: `${CI_BUILD_CMD:-npm run build}` and `${CI_TYPES_CMD:-npx tsc --noEmit}`. For lint use the whole-repo command (`pnpm -w lint`). Fallback: if `.codebyplan/ci.json` is absent, `ci resolve` returns the central default; if the binary is unavailable the `${CI_*_CMD:-<literal>}` guard uses the hardcoded fallback.
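The `${VAR:-literal}` guard works because a failed command substitution leaves the variable empty. A minimal sketch, using hypothetical stand-ins for the resolver (`resolve_unavailable` simulates a missing binary, and `pnpm typecheck` is an invented resolver answer, not a value taken from `ci.json`):

```shell
# Hypothetical stand-in for a resolver that cannot run: prints nothing, exits non-zero.
resolve_unavailable() { return 127; }

# Failed substitution -> empty variable (|| true keeps the script going under set -e).
CI_BUILD_CMD=$(resolve_unavailable 2>/dev/null) || true
# Pretend the resolver answered for typecheck (invented value, for illustration only).
CI_TYPES_CMD=$(echo "pnpm typecheck")

# ${VAR:-literal} expands to the literal when VAR is unset or empty.
BUILD_TO_RUN="${CI_BUILD_CMD:-npm run build}"
TYPES_TO_RUN="${CI_TYPES_CMD:-npx tsc --noEmit}"

echo "build: $BUILD_TO_RUN"
echo "types: $TYPES_TO_RUN"
```

Here the build command falls back to the hardcoded literal while the typecheck command keeps the resolved value, which is the split the paragraph above describes.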
87
95
 
88
96
  ### Step 5b: Whole-Checkpoint E2E
89
97
 
@@ -119,11 +127,11 @@ Aggregate the files touched across all tasks (reusing Step 4's deduplicated tabl
119
127
  Continue to Step 6.
120
128
 
121
129
  5. **On fail** (any framework `f`: `e2e_outputs[f].status === 'failed'` OR `e2e_outputs[f].test_results.failed > 0`): build a failure summary from `e2e_outputs[*].test_results.failures[]` aggregated and grouped by `category`. Surface via `AskUserQuestion`:
122
- - **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-task-planner` can verify failure premises. Per `cbp-round-end` reference `findings-presentation.md` "Infra Issue Absorption Contract — Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
130
+ - **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-round-planner` can verify failure premises. Per `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
123
131
  - **(b) Surface as warning only — proceed to checkpoint-end** — append `| Checkpoint E2E | warning | N failures (deferred) |` to Step 5 QA Summary; continue to Step 6.
124
132
  - **(c) Halt — review manually** — STOP and wait for the user.
125
133
 
126
- See `cbp-round-end` reference `findings-presentation.md` "Infra Issue Absorption Contract — Infra-Class Issue Catalog" row "Checkpoint-level e2e failure" for the routing rationale.
134
+ See `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Infra-Class Issue Catalog" row "Checkpoint-level e2e failure" for the routing rationale.
127
135
 
128
136
  ### Step 6: User Discussion
129
137
 
@@ -87,7 +87,22 @@ This is the first identity-stamping point — when claiming, passing `worktree_i
87
87
 
88
88
  Read `.codebyplan/git.json` `branch_config.production` (default `"main"`) as `BASE`. codebyplan repos are main-only — never create or branch from a `development`/integration branch.
89
89
 
90
- Compute the slug deterministically:
90
+ **8.1 — Reuse the cloud-created branch when present.** When the repo is GitHub-connected, the CHK-207 `create-feat-branch` Edge Function fires on the Step 7 row INSERT, creates `feat/CHK-{NNN}-<slug>` on origin, and writes `branch_name` back to the checkpoint row. Creating a second, differently-slugged branch here orphans the cloud one — so re-read the row first:
91
+
92
+ ```bash
93
+ sleep 5 # give the INSERT webhook a moment to write branch_name back
94
+ npx codebyplan sync 2>/dev/null || true
95
+ BRANCH=$(jq -r '.branch_name // empty' ".codebyplan/state/checkpoints/{checkpoint-id}.json" 2>/dev/null)
96
+ ```
97
+
98
+ (Break-glass: MCP `get_checkpoints` and read the row's `branch_name`.) If `BRANCH` is non-empty, check out the existing remote branch and skip 8.2 entirely — do NOT push (it already exists on origin) and do NOT persist `--branch-name` (the Edge Function already recorded it):
99
+
100
+ ```bash
101
+ git fetch origin "$BRANCH"
102
+ git checkout -b "$BRANCH" --track "origin/$BRANCH"
103
+ ```
104
+
105
+ **8.2 — Fallback: create the branch locally.** Only when `BRANCH` is empty (repo not GitHub-connected, or the webhook hasn't landed). Compute the slug deterministically:
91
106
 
92
107
  ```bash
93
108
  SLUG=$(codebyplan slug "{checkpoint title}")
@@ -96,7 +96,11 @@ Runtime deployment for the base branch is handled in Step 7 by `/cbp-ship` (whic
96
96
 
97
97
  ### Step 7: Runtime Shipment via `/cbp-ship`
98
98
 
99
- After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface:
99
+ After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface.
100
+ `/cbp-ship` reads `.codebyplan/cd.json` when present to inform per-surface deploy variant
101
+ selection (trigger, environment, approval gate, OIDC auth, credential env-var names). Repos
102
+ without `cd.json` fall back to filesystem surface detection — no behavior change. Run
103
+ `/cbp-setup-cd` to set up `cd.json` for a repo that has not yet migrated.
100
104
 
101
105
  - Vercel auto-deploy verification
102
106
  - Mobile shipment (asks user: skip / EAS internal TestFlight / EAS external TestFlight)