npm - codebyplan - Versions diffs - 1.8.0 → 1.9.0 - Mend

codebyplan 1.8.0 → 1.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/dist/cli.js CHANGED

@@ -14,7 +14,7 @@ var VERSION, PACKAGE_NAME;
 var init_version = __esm({
   "src/lib/version.ts"() {
     "use strict";
-    VERSION = "1.8.0";
+    VERSION = "1.9.0";
     PACKAGE_NAME = "codebyplan";
   }
 });

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "codebyplan",
-  "version": "1.8.0",
+  "version": "1.9.0",
   "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
   "type": "module",
   "bin": {

package/templates/skills/cbp-checkpoint-create/SKILL.md CHANGED

@@ -1,65 +1,42 @@
 ---
 scope: org-shared
 name: cbp-checkpoint-create
-description: In-depth idea assessment, Q&A, and task creation
+description: Mechanical checkpoint creation — capture the user's description, infer title + goal, dedup against existing modules, create the checkpoint row + feat branch, then hand off to /cbp-checkpoint-plan for deep planning. Creates ZERO tasks.
 argument-hint: [checkpoint description]
-effort: xhigh
+effort: high
 ---
 # Checkpoint Create Command
-Runs INLINE (no subagent) - all context stays in session. Assesses the idea, runs research if needed, conducts exhaustive Q&A, builds context, and creates tasks as vertical slices.
+Runs INLINE. This is the **mechanical** stage only: capture raw user input, infer a title/goal, run a cheap module-overlap check, create the checkpoint row, create + switch to the feat branch, then auto-trigger `/cbp-checkpoint-plan`. It does **NOT** assess the idea, run research, conduct exhaustive Q&A, generate a plan, or create tasks — all of that is `/cbp-checkpoint-plan`'s job.
-## Instructions
-### Step 1: Check for Existing Checkpoint Data
-Before asking the user anything, check if a checkpoint already exists with ideas/context:
-1. Use MCP `get_next_action` response context — if it includes a checkpoint, load it via MCP `get_checkpoints`
-2. If `$ARGUMENTS` contains a checkpoint number (e.g., `69` or `CHK-69`), load that checkpoint
-3. Check the checkpoint's `ideas` array and `context` fields
+## Pipeline
-**If checkpoint has `ideas[]` with descriptions/requirements:**
-- Use `ideas[].description` as the checkpoint description (DO NOT ask the user)
-- Use `ideas[].requirements` as the requirements list
-- Use existing `context` (decisions, discoveries, etc.) if populated
-- Skip Steps 1b and 2 if deadline already set
-- Proceed directly to Step 3 with this pre-loaded context
+```
+/cbp-checkpoint-create (mechanical, here) → /cbp-checkpoint-plan (deep planning + tasks) → /cbp-checkpoint-start (activate + claim)
+```
-**If checkpoint has NO ideas or is brand new:** Continue to Step 1b.
+## Instructions
-### Step 1b: Get Checkpoint Description
+### Step 1: Check for Existing Checkpoint Data
-**If arguments provided:** Use `$ARGUMENTS` as `user_prompt`
+Source `repo_id` from `.codebyplan/repo.json`. If `$ARGUMENTS` contains a checkpoint number, or MCP `get_next_action` returns one, load it via MCP `get_checkpoints`. If the checkpoint already has `ideas[]` with descriptions, reuse `ideas[].description` (do not re-ask) and skip Step 2.
-**If NO arguments:** Ask user via AskUserQuestion:
-```
-What should this checkpoint accomplish?
-Describe the work in a sentence or two.
-```
+### Step 2: Get Checkpoint Description
-**Data routing for ideas[]:**
-- `ideas[].description` = user's raw prompt (user-written, immutable after creation)
-- `ideas[].assessment` = Claude's structured analysis of this idea (Claude-updatable)
-- `context` = structured analysis output (decisions, constraints, discoveries)
-- Do NOT write Claude analysis into `user_context` — that field is for raw user text only
+**If `$ARGUMENTS` provided:** use it as the description. **Else** ask via AskUserQuestion: "What should this checkpoint accomplish? Describe the work in a sentence or two."
-### Step 2: Prompt for Deadline
+Data routing: `ideas[].description` = the user's raw words (immutable; never overwrite). Do NOT write any Claude analysis into `user_context` — that field is raw user text only. Assessment, decisions, and discoveries are written later by `/cbp-checkpoint-plan`.
-**Skip if checkpoint already has a deadline set.**
+### Step 3: Prompt for Deadline
-Ask user via AskUserQuestion with options:
-- Today
-- Tomorrow
-- This week (Friday)
-- Custom date
+Skip if a deadline is already set. Else ask via AskUserQuestion: Today / Tomorrow / This week (Friday) / Custom date.
-### Step 2.9: Semantic-Domain Module Dedup (BEFORE assessment)
+### Step 4: Semantic-Domain Module Dedup
-Run BEFORE Step 3 (Assess Idea) so the assessment factors in any existing module that might already cover the proposed feature. Cheaper to clarify intent at checkpoint creation than during planner Phase 1 Q&A.
+Cheap module-overlap pre-flight — catching "is this even a new module?" here halves the planner's round-1 surface area for duplicate-feature checkpoints.
-**Trigger**: the checkpoint description (or any `ideas[].description`) contains a feature verb in a user-facing semantic domain. Domain table — extend as new modules ship:
+**Trigger**: the description contains a feature verb/noun in a user-facing semantic domain:
 | Domain | Trigger verbs / nouns | Glob target |
 |--------|----------------------|-------------|
@@ -75,213 +52,65 @@ Run BEFORE Step 3 (Assess Idea) so the assessment factors in any existing module
 | Pets | pet, dog, cat, animal | `apps/mobile/src/features/modules/pet*/` |
 | Work | work, job, task, productivity | `apps/mobile/src/features/modules/work*/` |
-**Procedure**:
-1. Tokenise the description into verbs/nouns. Match against the trigger column.
-2. For each matched domain, Glob the corresponding target. If the glob returns zero entries, no overlap — proceed.
-3. If the glob returns one or more existing modules, ask the user via AskUserQuestion (BEFORE running Step 3 codebase analysis):
-   > Found existing module(s) `{matched-paths}` which may already cover {domain} functionality.
-   >
-   > A) Extend the existing module — add the new feature inside `{primary-match}` rather than creating a new module
-   > B) Build a separate module — explain the distinction in 1-2 sentences (different user intent / different data shape)
-   > C) Unrelated — the new feature is in the same semantic neighbourhood but does not overlap functionally
-   > D) Cancel checkpoint creation — re-frame the description
-4. Save the answer as a locked decision in checkpoint context:
-   ```json
-   {
-     "decision": "Extend existing food module rather than create eating module",
-     "rationale": "User clarified meal-tracking lives alongside existing food/recipe data",
-     "locked": true
-   }
-   ```
-   Pass `locked: true` so planner Phase 2 honors it.
-5. If (B) "Build separate", record the user's distinction verbatim as the rationale — planner Phase 1 will use it to avoid duplicate scaffolding.
-**Why this fires here, not in planner Phase 1**: the planner's Q&A already catches duplicates (livebyplan TASK-4 caught "eating" → existing "food" via planner Q&A), but only after spawning Explore and reading codebase context — wasteful when the question is "is this even a new module?" Catching at checkpoint creation halves the round-1 surface area for duplicate-feature checkpoints.
-Skip this step when the description has zero domain matches OR when the user explicitly states intent in the prompt ("extend the existing food module to support meal sessions") with a referenced module path.
-### Step 3: Assess Idea (INLINE - no subagent)
-Analyze the idea thoroughly:
-1. **Discovery Level Detection:**
-   - Level 0: Simple change, no research needed
-   - Level 1: Minor research, check existing patterns
-   - Level 2: Moderate research, check docs/APIs
-   - Level 3: Deep research, unfamiliar territory
-2. **Codebase Analysis:** Use Glob/Grep/Read to understand:
-   - Related existing code
-   - Patterns to follow
-   - Dependencies affected
-   - Files that will need changes
-3. **Research (Level 2+ only):** Spawn a single `research` subagent for web research. Wait for results.
-   - Research findings MUST be stored in checkpoint.research JSONB via MCP `update_checkpoint` — NEVER write to local docs/ folder
-4. **Cross-Reference All Ideas:**
-   - When analyzing ideas[], iterate ALL items — not just the first
-   - Cross-reference requirements across all context items
-   - Identify overlaps, conflicts, and dependencies between ideas
-   - Store cross-reference results in checkpoint context (discoveries[], dependencies[])
-5. **Store Assessment Per Idea:**
-   - For each idea analyzed, write Claude's analysis into `ideas[N].assessment`
-   - Use MCP `update_checkpoint` with the full ideas array including assessment fields
-   - The assessment is Claude-written; the description remains the user's original text
-### Step 4: Exhaustive Q&A
-Ask the user targeted questions to fill gaps. Every answer gets saved immediately.
-**Build questions based on:**
-- Ambiguities in the description
-- Multiple valid approaches detected
-- UI/UX decisions needed
-- Scope boundaries unclear
-- Integration points uncertain
-Use AskUserQuestion for each question batch (max 4 per call). After each response, save to checkpoint context via MCP `update_checkpoint`:
-```json
-{
-  "context": {
-    "decisions": [{"decision": "...", "rationale": "...", "locked": true}],
-    "discoveries": [{"topic": "...", "finding": "..."}],
-    "dependencies": ["..."],
-    "constraints": ["..."],
-    "qa_answers": [{"question": "...", "answer": "..."}]
-  }
-}
-```
-Continue asking until all ambiguities resolved.
-### Step 5: Determine Next Checkpoint Number
-Use MCP `get_checkpoints` for the repo. Find highest checkpoint number, add 1.
-### Step 6: Create Checkpoint in DB
-**Before calling `create_checkpoint`**: resolve worktree_id via `npx codebyplan resolve-worktree 2>/dev/null`. If non-empty, pass as the `worktree_id` parameter so the checkpoint is born assigned to this worktree. (Per CHK-104 TASK-2 — DB-level hard-lock requires identifying the caller worktree at creation time.)
-**Why here specifically**: this is the first identity-stamping point for the checkpoint; if `worktree_id` is missing here, every downstream task and round inherits the gap and the hard-lock pre-guards have no caller identity to compare against. If empty, prompt the user to run `npx codebyplan setup` first from this directory to register the worktree before creating the checkpoint.
+**Procedure**: tokenise the description; for each matched domain, Glob the target. Zero entries → proceed. One+ existing modules → ask via AskUserQuestion before continuing: (A) extend the existing module, (B) build a separate module — give the 1–2 sentence distinction, (C) unrelated, (D) cancel. Save the answer as a `locked: true` decision in `context.decisions[]` so `/cbp-checkpoint-plan` honors it.
-Use MCP `create_checkpoint`:
-- `repo_id`: from .codebyplan/repo.json
-- `worktree_id`: from `npx codebyplan resolve-worktree 2>/dev/null` (omit param if empty — checkpoint is created with NULL worktree_id, unassigned)
-- `title`: derived from description
-- `number`: next number
-- `goal`: detailed goal from assessment
-- `deadline`: from Step 2
-- `status`: "pending"
-- `context`: accumulated from Q&A
-- `research`: from Step 3 (if any)
+Skip when the description has zero domain matches OR the user already named a target module path.
-### Step 6b: Generate Plan
+### Step 5: Infer Title + Goal
-Based on the assessment and Q&A, generate a structured plan:
+Lightweight inference from the description — no deep analysis. **Title**: concise, ≤80 chars. **Goal**: ≤300 chars, a faithful restatement of intent (not a plan).
-1. Analyze ALL `ideas[]` items — every description and every requirement
-2. Create an ordered sequence of plan steps that covers everything
-3. Each step: `{ title: "What to build", description: "How and why", scope: "Which idea(s) it addresses" }`
-4. Ensure every idea description and every requirement is addressed by at least one step
-5. Cross-reference: if multiple ideas relate, consolidate into cohesive steps
+### Step 6: Claim-or-Open Prompt
-Save plan to checkpoint via MCP `update_checkpoint(checkpoint_id, plan: { steps: [...] })`.
+Ask the user via AskUserQuestion whether to claim this checkpoint now:
-### Step 7: Create Tasks as Vertical Slices
+- **Claim for me + this worktree** (default) — resolve `npx codebyplan resolve-worktree 2>/dev/null` and set it as the checkpoint `worktree_id` at create. The creator carries momentum straight through plan → start.
+- **Leave it open** — create with `worktree_id` null so anyone free can claim it later via `/cbp-checkpoint-start`.
-**Critical design principle:** Each task is a complete vertical slice that can be independently implemented and tested. Each task should produce a complete, production-ready deliverable.
+Record the choice; it drives both the create call (Step 8) and the plan→start routing in `/cbp-checkpoint-plan`.
-**BAD (horizontal layers):**
-- TASK-1: Create database functions
-- TASK-2: Create API routes
-- TASK-3: Create UI components
+### Step 7: Determine Next Checkpoint Number
-**GOOD (vertical slices grouped by theme):**
-- TASK-1: Implement user authentication (DB + API + UI)
-- TASK-2: Implement user profile page (DB + API + UI)
+MCP `get_checkpoints` for the repo; highest `number` + 1.
-**GOOD (theme-based grouping for infrastructure):**
-- TASK-1: Config rule + remove Step 0 from all commands (one theme: eliminate redundant reads)
-- TASK-2: Task sizing + round workflow improvements (one theme: workflow optimization)
+### Step 8: Create Checkpoint Row
-**Sizing:** With 1M token context, tasks can be large. Group logically by theme — each task should encompass all work needed to deliver a complete feature or improvement. Only split a vertical slice when it covers genuinely independent themes, not because of size. A single task touching 30+ files is fine if they all serve one coherent purpose.
+MCP `create_checkpoint`:
+- `repo_id` (from `.codebyplan/repo.json`), `number`, `title`, `goal`, `deadline`, `status: "pending"`
+- `ideas`: `[{ description: <raw user text> }]`
+- `worktree_id`: the resolved worktree from Step 6 **only if the user chose "claim"**; omit when "leave open"
-For each task, use MCP `create_task`:
-- `checkpoint_id`: from Step 6
-- `title`: descriptive vertical slice title
-- `number`: sequential
-- `requirements`: detailed requirements
-- `context`: extracted relevant context from checkpoint
+This is the first identity-stamping point — when claiming, passing `worktree_id` here engages the CHK-104 hard-lock from birth. No `context`, `research`, `plan`, or tasks are written here.
-### Step 8: Create Git Branch
+### Step 9: Create + Switch to Feat Branch
-Check: `git branch -a | grep development`
+Read `.codebyplan/git.json` `branch_config.production` (default `"main"`) as `BASE`. codebyplan repos are main-only — never create or branch from a `development`/integration branch.
-**If NO development branch (local or remote):**
-Create it from main:
 ```bash
-git checkout main && git checkout -b development && git push -u origin development && echo "Created development branch from main"
+git fetch origin "$BASE" 2>/dev/null || true
+git checkout -b "feat/CHK-{NNN}-{slug}" "origin/$BASE" 2>/dev/null \
+  || git checkout -b "feat/CHK-{NNN}-{slug}" "$BASE"
+git push -u origin "feat/CHK-{NNN}-{slug}"
 ```
-**If development branch exists:**
+Slug: lowercase, dash-joined, punctuation dropped, ≤40 chars. Persist the branch via MCP `update_checkpoint(checkpoint_id, branch_name: "feat/CHK-{NNN}-{slug}")`. (The dedicated `/cbp-git-branch-feat-create` skill is the canonical config-driven helper if you prefer to delegate.)
-Detect worktree: `IS_WORKTREE=$(test -f "$(git rev-parse --show-toplevel)/.git" && echo "yes" || echo "no")`
-**(a) In a worktree:** Create feature branch from current branch:
-```bash
-git checkout -b feat/CHK-{NNN}-{slug}
-git push -u origin feat/CHK-{NNN}-{slug}
-```
-**(b) Not in a worktree:** Run:
-```bash
-git checkout -b feat/CHK-{NNN}-{slug}
-git push -u origin feat/CHK-{NNN}-{slug}
-```
-### Step 9: Assign Worktree
-Resolve `worktree_id` at runtime (CHK-108: never read from `.codebyplan/repo.json`):
-```bash
-WORKTREE_ID=$(npx codebyplan resolve-worktree 2>/dev/null)
-```
-If `WORKTREE_ID` is non-empty:
-```
-MCP update_checkpoint(checkpoint_id, worktree_id: WORKTREE_ID)
-```
-If empty (tuple unmatched): skip — Step 6 already passed `worktree_id` to `create_checkpoint`, so this re-stamp is only a defensive no-op.
-### Step 10: Show Result
+### Step 10: Show Result + Auto-Trigger Plan
 ```
 ## Checkpoint Created
-**ID**: CHK-[NNN]
-**Title**: [Title]
-**Deadline**: [date]
-**Discovery Level**: [0-3]
-**Tasks**: [N]
+**CHK-NNN**: [title]  •  **Deadline**: [date]  •  **Branch**: feat/CHK-NNN-slug
+**Claim**: [claimed by this worktree / left open]
-### Tasks:
-1. TASK-1: [title]
-2. TASK-2: [title]
-...
-Run `/cbp-todo` to start working.
+Now planning CHK-NNN… handing off to /cbp-checkpoint-plan.
 ```
+Auto-trigger `/cbp-checkpoint-plan {NNN}` in the same context. This skill created ZERO tasks — the plan skill produces them.
 ## Integration
-- **Runs inline**: All analysis in current session (no context loss)
-- **Spawns**: `research` agent (Level 2+ only, for web research)
-- **Saves to DB**: Context, research, QA answers via MCP
+- **Runs inline**: mechanical setup only — no assessment, research, Q&A, plan, or tasks
+- **Reads**: MCP `get_next_action`, `get_checkpoints`; `.codebyplan/repo.json`, `.codebyplan/git.json`; `npx codebyplan resolve-worktree`
+- **Writes**: MCP `create_checkpoint` (description-only ideas + deadline + optional worktree_id), `update_checkpoint` (branch_name)
+- **Triggers**: `/cbp-checkpoint-plan` (auto)
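
Reviewer sketch (not part of the package diff): the numbering rule in Step 7 and the slug rule in Step 9 of the new `cbp-checkpoint-create` skill can be illustrated as below. This is a minimal sketch under stated assumptions, not code from the package. `slugify`, `nextCheckpointNumber`, and `featBranchName` are hypothetical names; dropping non-ASCII characters and leaving the checkpoint number unpadded are assumptions the skill text does not settle.

```typescript
// Hypothetical helpers for illustration only; none of these exist in codebyplan.

/** Lowercase, drop punctuation, join words with dashes, cap at maxLen chars. */
function slugify(title: string, maxLen = 40): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // assumption: anything outside a-z, 0-9, space, dash is dropped
    .trim()
    .split(/[\s-]+/) // split on whitespace or existing dashes
    .filter(Boolean)
    .join("-");
  // Cut to the cap, then remove a trailing dash left behind by the cut.
  return slug.slice(0, maxLen).replace(/-+$/, "");
}

/** Highest existing checkpoint number + 1 (1 when the repo has none). */
function nextCheckpointNumber(existing: number[]): number {
  return existing.reduce((max, n) => Math.max(max, n), 0) + 1;
}

/** Branch name in the feat/CHK-{NNN}-{slug} shape; number is not zero-padded (assumption). */
function featBranchName(num: number, title: string): string {
  return `feat/CHK-${num}-${slugify(title)}`;
}
```

Under these assumptions, a repo whose highest checkpoint is 138 and a title of "Add Meal Tracking!" would give `feat/CHK-139-add-meal-tracking`.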

package/templates/skills/cbp-checkpoint-plan/SKILL.md ADDED

@@ -0,0 +1,137 @@
+---
+scope: org-shared
+name: cbp-checkpoint-plan
+description: Deep inline planning for a checkpoint — assess, gap-analyse, decide dependencies, compare alternatives, optionally e2e-probe a suspected-broken area, then create tasks as vertical slices. Runs after /cbp-checkpoint-create (mechanical) and before /cbp-checkpoint-start (activate + claim). Does NOT activate or claim.
+argument-hint: [checkpoint-number]
+effort: xhigh
+---
+# Checkpoint Plan Command
+Runs INLINE (no subagent) — all analysis and Q&A stay in the main session. This is the rigour stage that prevents half-baked plans: it discovers shortcomings, decides whether existing dependencies suffice or a new one is warranted, compares competing approaches, and only THEN creates tasks. It produces `plan.steps[]` + tasks but **never activates the checkpoint and never claims a user/worktree** — that is `/cbp-checkpoint-start`.
+## Pipeline
+```
+/cbp-checkpoint-create (mechanical) → /cbp-checkpoint-plan (deep planning, here) → /cbp-checkpoint-start (activate + claim)
+```
+Semantic-domain module dedup already ran in `/cbp-checkpoint-create` (its Step 4) — do NOT repeat it here. This skill assumes the checkpoint row, title, goal, branch, and any module-overlap decision already exist.
+## Instructions
+### Step 0: Parse `$ARGUMENTS`
+Source `repo_id` from `.codebyplan/repo.json` — every MCP call below that takes `repo_id` uses it.
+| Shape | Resolves to |
+|-------|-------------|
+| `{chk}` (e.g. `138`) | CHK-{chk} via MCP `get_checkpoints` filtered by `number` |
+| _(empty)_ | Active/pending checkpoint via MCP `get_current_task`; if none in progress, the most recent `pending` checkpoint that has no `plan.steps` |
+Malformed (non-numeric, contains `-`): surface `checkpoint-plan: invalid argument` and stop.
+### Step 1: Load Checkpoint + Existing Tasks
+1. Resolve the checkpoint (Step 0). Load `user_context`, `ideas[]`, `context` (decisions / discoveries / dependencies / constraints / qa_answers / alternatives), `research`, `plan`.
+2. MCP `get_tasks(checkpoint_id)` — load existing tasks. This sets the mode:
+   - **fresh** — zero tasks: full plan + create all tasks.
+   - **additive re-plan** — tasks exist: gap-analyse against them; only ADD new tasks or refine requirements for gaps. NEVER delete or overwrite an in-flight task.
+3. Note whether `worktree_id` is set (claimed at create) — drives routing in Step 11.
+### Step 2: Assess Ideas + Codebase
+Iterate ALL `ideas[]` (not just the first). For each:
+1. **Discovery level** — 0 (trivial) · 1 (check existing patterns) · 2 (read docs/APIs) · 3 (unfamiliar territory).
+2. **Codebase analysis** — Glob/Grep/Read the related code: existing patterns to follow, files that will change, integration points.
+3. **Research (level 2+ only)** — spawn a single `cbp-research` subagent for web/library research. Persist findings to `checkpoint.research` via MCP `update_checkpoint` — never to local docs. In additive re-plan mode, read the existing `research` (loaded in Step 1) and append — do not replace the object.
+4. **Cross-reference** ideas against each other: overlaps, conflicts, shared dependencies.
+Write each idea's analysis into `ideas[N].assessment` (Claude-authored; never touch `description`, which is the user's words).
+### Step 3: Gap Analysis
+Find what the raw request misses — this is the core anti-"half-ass" step. Load `reference/gap-analysis-playbook.md` and run its two passes:
+- **Pass 1 (in-scope gaps)** — shortcomings, half-implemented patterns, and missing foundations WITHIN the stated scope. Record each as a `context.discoveries[]` entry.
+- **Pass 2 (adjacent findings)** — problems in the same neighbourhood but outside the literal request. Per locked policy, adjacent findings are **pulled INTO this checkpoint** as additional plan steps / tasks (scope absorption) rather than deferred — unless the user explicitly scopes them out in Step 7, or absorbing them would push the checkpoint past its deadline (surface in Step 7 for confirmation before absorbing).
+### Step 4: E2E Discovery Probe (opt-in)
+Only relevant when an idea touches a UI surface AND you SUSPECT an existing flow is already broken. Load `reference/e2e-discovery-probe.md`.
+1. Surface the suspicion: name the area + the specific pages/screens, and why you think it is broken.
+2. Ask the user via AskUserQuestion to confirm running the probe (it needs a running dev server).
+3. On confirm, spawn `cbp-test-e2e-agent` with `whole_checkpoint_mode: true`, `round_number: 0`, `files_changed: []`, the `pages_affected` you proposed, plus `repo_id` / `test_strategy` / `has_auth` / `dev_server_port`. Resolve `test_strategy` and `dev_server_port` per `reference/e2e-discovery-probe.md` (do not pass placeholder strings).
+4. Record the probe outcome (what actually failed vs. what you assumed) in `context.discoveries[]` so the plan targets real defects.
+Skip this step entirely for non-UI checkpoints or when no breakage is suspected.
+### Step 5: Dependency Decisions
+When an idea could be built by extending something already installed OR by adding a new dependency, do NOT silently pick one. Load `reference/dep-decision-rubric.md` and:
+1. Identify the capability needed and whether an existing dependency already covers it.
+2. If a new dependency is a candidate, surface the trade-off (capability gap, bundle weight, maintenance, vendor-docs availability) — via AskUserQuestion when the choice is consequential.
+3. Lock the outcome as a `context.decisions[]` entry with `locked: true` and a rationale.
+### Step 6: Alternative Comparison
+When more than one viable approach exists for a meaningful design fork, present the alternatives before committing. Load `reference/alternative-comparison-template.md`:
+1. Build 2–4 options with a one-line trade-off each; mark a recommendation.
+2. Ask via AskUserQuestion (use `preview` for code/layout comparisons).
+3. Save the comparison + the user's choice to `context.alternatives[]` and mirror the chosen path as a `context.decisions[]` entry.
+### Step 7: Exhaustive Q&A
+Resolve every remaining ambiguity. Ask via AskUserQuestion (max 4 per batch). After each batch, append to `context.qa_answers[]`. Continue until nothing material is unresolved — including confirming the scope-absorption candidates from Step 3 Pass 2 (keep / drop each).
+### Step 8: Generate or Extend `plan.steps[]`
+Build an ordered `plan.steps[]` where each step is `{ title, description, scope }` (`scope` = which idea/requirement it addresses). Every `ideas[].requirements` item AND every kept gap finding must be covered by ≥1 step. In additive mode, append/refine steps — do not drop steps that map to existing tasks. Save via MCP `update_checkpoint(checkpoint_id, plan: { steps: [...] })` — `plan` is a top-level checkpoint field, so this write does not touch `context` or `research`; the context-replaces rule in Step 10 applies only to the `context` JSONB.
+### Step 9: Create Tasks as Vertical Slices
+Each task is a complete, independently shippable vertical slice — group by theme, not by layer.
+**BAD (horizontal layers):** TASK-1 DB functions · TASK-2 API routes · TASK-3 UI.
+**GOOD (vertical slices):** TASK-1 user auth (DB + API + UI) · TASK-2 profile page (DB + API + UI).
+**GOOD (infra by theme):** TASK-1 config rule + Step-0 removal · TASK-2 task-sizing + round-workflow.
+Sizing: with a 1M-token context, tasks can be large — group by coherent purpose; a single task touching 30+ files is fine if they serve one theme. Only split when themes are genuinely independent.
+For each task use MCP `create_task` (`checkpoint_id`, sequential `number` after the current max, `title`, `requirements`, `context` with the relevant decisions/discoveries). **Additive mode:** create only tasks for steps not already covered by an existing task; never delete in-flight tasks. **Coverage check** — a plan step counts as covered only when an existing task's requirements address its full scope; when uncertain, create the task and note in its requirements `may overlap with TASK-N — review before executing`.
+### Step 10: Persist Full Context
+Final write of the complete `checkpoint.context` JSONB via MCP `update_checkpoint`. Honor the **context-replaces-not-merges** contract: read the current context, merge your additions in memory, write the FULL object (`decisions` + `discoveries` + `dependencies` + `constraints` + `qa_answers` + `alternatives`). A partial write clobbers sibling keys. Phrase any database-verb words as prose in payloads (WAF gate).
+### Step 11: Show Result + Route
+```
+## Checkpoint Planned
+**CHK-NNN**: [title]
+**Tasks**: [created N] (additive: [+M new])  •  **Plan steps**: [count]
+**Locked decisions**: [count]  •  **E2E probe**: [fired / skipped]
+### Tasks
+1. TASK-1: [title]
+...
+```
+This skill does **NOT** activate the checkpoint and does **NOT** claim a user/worktree.
+- **Claimed by THIS session** — `worktree_id` is set AND equals `npx codebyplan resolve-worktree 2>/dev/null`: auto-trigger `/cbp-checkpoint-start` in the same context (the creator carries momentum into activation).
+- **Otherwise** — `worktree_id` is null, set to a different worktree, or `resolve-worktree` is empty: surface a single directive — `Next: /cbp-checkpoint-start` — so the owning session (or anyone free, if open) claims and starts it. Never auto-activate a checkpoint owned by a different worktree.
+## Integration
+- **Reads**: MCP `get_current_task`, `get_checkpoints`, `get_tasks`
+- **Writes**: MCP `update_checkpoint` (ideas assessment, context, plan, research), `create_task`
+- **Spawns**: `cbp-research` (level 2+ only), `cbp-test-e2e-agent` (opt-in discovery probe, `whole_checkpoint_mode`)
+- **Triggered by**: `/cbp-checkpoint-create` (auto), or user directly
+- **Triggers**: `/cbp-checkpoint-start` (auto when claimed at create; directive when left open)
+- **Never**: activates the checkpoint or claims a user/worktree — that is `/cbp-checkpoint-start`
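
Reviewer sketch (not part of the package diff): Step 10 of `cbp-checkpoint-plan` requires a read-merge-write of the whole `context` object because a partial write replaces sibling keys. The sketch below shows one way the in-memory merge could look. `mergeContext` and `concat` are hypothetical names, the entry shapes are simplified assumptions taken from the keys the skill lists, and the MCP `update_checkpoint` call itself is left out because its client signature does not appear in this diff.

```typescript
// Entry shapes are simplified assumptions based on the keys named in Step 10.
interface Decision { decision: string; rationale: string; locked: boolean }
interface Discovery { topic: string; finding: string }
interface QaAnswer { question: string; answer: string }

interface CheckpointContext {
  decisions: Decision[];
  discoveries: Discovery[];
  dependencies: string[];
  constraints: string[];
  qa_answers: QaAnswer[];
  alternatives: unknown[];
}

// Concatenate two optional arrays, treating a missing key as empty.
function concat<T>(a: T[] | undefined, b: T[] | undefined): T[] {
  return [...(a ?? []), ...(b ?? [])];
}

/**
 * Hypothetical helper: merge additions into the context that was read from
 * the checkpoint and return the FULL object, so that the following write
 * carries every key and cannot clobber siblings.
 */
function mergeContext(
  current: Partial<CheckpointContext> | null,
  additions: Partial<CheckpointContext>,
): CheckpointContext {
  const cur: Partial<CheckpointContext> = current ?? {};
  return {
    decisions: concat(cur.decisions, additions.decisions),
    discoveries: concat(cur.discoveries, additions.discoveries),
    dependencies: concat(cur.dependencies, additions.dependencies),
    constraints: concat(cur.constraints, additions.constraints),
    qa_answers: concat(cur.qa_answers, additions.qa_answers),
    alternatives: concat(cur.alternatives, additions.alternatives),
  };
}
```

The returned object always contains all six keys, which is the property the context-replaces-not-merges contract depends on. Appending is the only merge strategy shown here; de-duplication of repeated entries is out of scope for this sketch.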

package/templates/skills/cbp-checkpoint-plan/reference/alternative-comparison-template.md ADDED

@@ -0,0 +1,54 @@
+---
+scope: org-shared
+---
+# Alternative Comparison Template
+Loaded by `/cbp-checkpoint-plan` Step 6. Use when a meaningful design fork has more than one viable answer. Surfacing the alternatives — instead of silently picking one — is what lets the user redirect before tasks are created.
+## When a fork warrants a question
+Ask the user when ALL of these hold:
+- There are 2–4 genuinely viable options (not one obvious winner).
+- The choice is hard to reverse later (architecture, data shape, public API, a one-way dependency).
+- The options have materially different trade-offs the user would care about.
+Resolve inline (record a decision, no question) when the choice is reversible, low-stakes, or has a single defensible answer.
+## How to present
+Use AskUserQuestion. Lead with your recommendation as the first option and tag it `(Recommended)`. Give each option a one-line trade-off. For code/layout/config forks, use the `preview` field to show concrete snippets side by side.
+```
+Question: "How should X be structured?"
+Options:
+  - "Approach A (Recommended)" — <one-line trade-off>
+  - "Approach B"               — <one-line trade-off>
+  - "Approach C"               — <one-line trade-off>
+```
+## Record format
+Persist to `context.alternatives[]` (a JSONB key introduced by this skill — schema-flexible; declare it as below). Read-merge-write the full context per Step 10.
+```json
+{
+  "question": "How should X be structured?",
+  "options": [
+    { "label": "Approach A", "tradeoff": "...", "recommended": true },
+    { "label": "Approach B", "tradeoff": "..." }
+  ],
+  "chosen": "Approach A",
+  "rationale": "user picked A because ...",
+  "decided_at": "2026-06-01T..."
+}
+```
+Mirror the chosen path as a `context.decisions[]` entry with `locked: true` so the executor treats it as settled and the planner does not re-ask on a re-run.
+## Notes
+- One question per fork; do not bundle unrelated forks into one multi-select.
+- If the user picks "Other" and writes a custom answer, record their text verbatim as the rationale.
+- Keep the options mutually exclusive — overlapping options produce ambiguous decisions.
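
Reviewer sketch (not part of the package diff): the record format in this template can be expressed as a type, with a small check for the constraints the template states (2-4 options, one recommendation) and a helper that mirrors the choice into a locked decision. All names here (`AlternativeRecord`, `validateAlternative`, `toLockedDecision`) are hypothetical. Treating "at most one recommended option" as a rule, using unique labels as a rough proxy for mutual exclusivity, and the wording of the mirrored `decision` string are assumptions of this sketch.

```typescript
interface AlternativeOption { label: string; tradeoff: string; recommended?: boolean }

interface AlternativeRecord {
  question: string;
  options: AlternativeOption[];
  chosen: string; // may be free text when the user answers "Other"
  rationale: string;
  decided_at: string; // timestamp string
}

interface Decision { decision: string; rationale: string; locked: boolean }

/** Returns a list of problems; an empty list means the record looks well formed. */
function validateAlternative(rec: AlternativeRecord): string[] {
  const problems: string[] = [];
  if (rec.options.length < 2 || rec.options.length > 4) {
    problems.push("expected 2-4 options");
  }
  const labels = new Set(rec.options.map((o) => o.label));
  if (labels.size !== rec.options.length) {
    problems.push("option labels must be unique");
  }
  if (rec.options.filter((o) => o.recommended).length > 1) {
    problems.push("at most one option may be recommended");
  }
  if (Number.isNaN(Date.parse(rec.decided_at))) {
    problems.push("decided_at is not a parseable date");
  }
  return problems;
}

/** Mirror the chosen path as a locked decision entry. */
function toLockedDecision(rec: AlternativeRecord): Decision {
  return {
    decision: `${rec.question} Chosen: ${rec.chosen}`,
    rationale: rec.rationale,
    locked: true,
  };
}
```

The check deliberately does not require `chosen` to match an option label, since the template allows a custom "Other" answer.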

package/templates/skills/cbp-checkpoint-plan/reference/dep-decision-rubric.md ADDED

@@ -0,0 +1,50 @@
+---
+scope: org-shared
+---
+# Dependency Decision Rubric
+Loaded by `/cbp-checkpoint-plan` Step 5. Use when an idea could be built by extending something already installed OR by pulling in a new dependency. The goal is a deliberate, recorded choice — never a silent `pnpm add`.
+## Decision tree
+1. **Does an installed dependency already cover this?**
+   - Check `package.json` (root + the relevant workspace) and `vendor/INDEX.md` for an existing library.
+   - Grep the codebase for prior art — the capability may already be wrapped in a util/hook/service.
+   - If yes and it fits → **extend the existing dependency**. Record a `locked` decision and stop.
+2. **Can the existing dependency be extended at acceptable cost?**
+   - A thin wrapper / adapter over an installed lib almost always beats a new dependency.
+   - If extension means forking or fighting the library → a new dependency may be justified; continue.
+3. **Is a new dependency warranted?** Weigh:
+   | Factor | Favors extending | Favors new dependency |
+   |--------|------------------|-----------------------|
+   | Capability gap | Existing covers ~all of it | Existing covers little / poorly |
+   | Bundle weight | Adds to an already-loaded dep | Heavy add to a lean surface (esp. mobile) |
+   | Maintenance | No new supply-chain surface | Well-maintained, widely used, typed |
+   | Vendor docs | — | A `vendor/{lib}/v{ver}/` mirror exists or can be scaffolded via `/cbp-build-vendor-doc` |
+   | Lock-in / migration | Reuses known patterns | One-way door; migration cost later |
+4. **Consequential choice?** If adding a new dependency meaningfully changes bundle size, security surface, or architecture, surface it to the user via AskUserQuestion with the trade-off table above. Otherwise decide inline and record it.
+## Record format
+Write the outcome as a `context.decisions[]` entry (read-merge-write the full context per Step 10):
+```json
+{
+  "decision": "Extend the installed date-fns rather than add dayjs",
+  "rationale": "date-fns already bundled; needed formatter is a 3-line wrapper; avoids a second date lib",
+  "locked": true
+}
+```
+When a new dependency IS chosen, also add a `context.dependencies[]` entry naming it and (if no vendor mirror exists) a plan step to scaffold one via `/cbp-build-vendor-doc`.
+## Anti-patterns
+- Adding a library for something the standard lib or an installed dep already does.
+- Two libraries that solve the same problem (e.g. two date libs, two state managers) — consolidate instead.
+- Deciding silently — every extend-vs-add fork must leave a recorded, locked decision so the executor and future planners can see the reasoning.
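
The decision tree in this rubric can be sketched as a small function. This is only an illustration under stated assumptions: `DepAssessment`, `decideDependency`, and `toDecisionEntry` are hypothetical names, and the Step 3 weighing is collapsed into a single boolean that a human or planner would fill in. Only the `decision` / `rationale` / `locked` fields come from the record format above.

```typescript
// Hypothetical input shape: field names are illustrative, not part of the rubric.
interface DepAssessment {
  installedDepFits: boolean;            // Step 1: an installed dependency already covers it
  extendableAtAcceptableCost: boolean;  // Step 2: a thin wrapper or adapter is enough
  newDependencyWarranted: boolean;      // Step 3: result of weighing the factor table
  consequential: boolean;               // Step 4: changes bundle size, security surface, or architecture
}

interface DepOutcome {
  action: "extend" | "add";
  askUser: boolean; // true when the choice should go through AskUserQuestion
}

// Mirrors the record format: decision, rationale, locked.
interface DecisionEntry {
  decision: string;
  rationale: string;
  locked: boolean;
}

function decideDependency(a: DepAssessment): DepOutcome {
  // Steps 1 and 2: prefer what is already installed.
  if (a.installedDepFits || a.extendableAtAcceptableCost) {
    return { action: "extend", askUser: false };
  }
  // Step 3: the weighing may still come out in favor of extending.
  if (!a.newDependencyWarranted) {
    return { action: "extend", askUser: false };
  }
  // Step 4: only consequential additions are surfaced to the user.
  return { action: "add", askUser: a.consequential };
}

function toDecisionEntry(decision: string, rationale: string): DecisionEntry {
  // Every extend-vs-add fork leaves a locked record.
  return { decision, rationale, locked: true };
}
```

Whatever `decideDependency` returns, the caller would still write a `toDecisionEntry(...)` result into `context.decisions[]`, so the fork is never decided silently.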

package/templates/skills/cbp-checkpoint-plan/reference/e2e-discovery-probe.md ADDED

@@ -0,0 +1,57 @@
+---
+scope: org-shared
+---
+# E2E Discovery Probe
+Loaded by `/cbp-checkpoint-plan` Step 4. The probe answers one question before you plan a fix: **is this area actually broken, and how?** It reuses `cbp-test-e2e-agent` (the sole owner of e2e execution) in `whole_checkpoint_mode` rather than introducing a second smoke-test path.
+## When to offer the probe
+Offer it only when BOTH hold:
+- An idea touches a UI surface — its text or the affected files mention a page / screen / route / form / component.
+- You have a concrete suspicion that an existing flow is already broken (not "let's test everything"). The probe targets a named area, not the whole app.
+Skip silently for backend-only / infra / `claude_only` checkpoints, or when you have no breakage suspicion.
+## Procedure
+1. **State the suspicion** — name the area, the specific pages/screens, and why you think it is broken (a stale selector, a route that 404s, a recent refactor nearby).
+2. **Confirm with the user** via AskUserQuestion — the probe needs a running dev server, so it is opt-in. Options: run the probe / skip and plan from assumption / let me name different pages.
+3. **Resolve the dev-server port** from `.codebyplan/server.json` `port_allocations[]` (pick the entry whose `server_type` matches the app, e.g. `nextjs`). If nothing is running there, ask the user to start it or skip.
+4. **Resolve `test_strategy`** — call MCP `get_repos()`, find the entry where `id === repo_id`, and read the affected app's platform + e2e framework from its `tech_stack` record. If the record has no e2e data, pass `null` for the unknown fields — the agent resolves them itself at its Step 1.5. Do NOT pass placeholder strings.
+5. **Spawn** `cbp-test-e2e-agent` with the payload below.
+6. **Interpret** the result: compare what actually failed against what you assumed. Record the delta in `context.discoveries[]` so the plan targets real defects, not imagined ones.
+## Spawn payload (whole_checkpoint_mode)
+`round_number: 0` is the documented sentinel for `whole_checkpoint_mode` in the agent's Input Contract — in that mode `files_changed` / `prior_round_files_changed` are ignored and the agent runs the `pages_affected` you give it.
+```yaml
+input:
+  repo_id: <repo UUID from .codebyplan/repo.json>
+  round_number: 0                 # sentinel — whole_checkpoint_mode
+  whole_checkpoint_mode: true
+  files_changed: []               # nothing changed yet — probing current state
+  prior_round_files_changed: []   # ignored under whole_checkpoint_mode; required for non-probe round_number >= 2 calls
+  test_strategy:
+    platform: <from tech_stack DB record>
+    e2e_framework: <playwright | maestro | webdriverio | xcuitest | vscode-test>
+  pages_affected: ["<route or screen you suspect>", ...]
+  has_auth: <true | false>
+  dev_server_port: <port from server.json, or null>
+```
+## What you get back
+The agent returns `test_results` (passed / failed / skipped + per-failure `category` and `classification_reason`) and `preflight`. For planning purposes:
+- `category: 'real'` failures → genuine defects; turn each into a plan step / task.
+- `category: 'env' | 'auth' | 'access' | 'flake'` → not the feature's fault; note it but do not plan a code fix around it.
+- A clean pass → your breakage suspicion was wrong; plan the actual requested change without a "fix" step you did not need.
+## Why reuse the agent (not a new smoke probe)
+`cbp-test-e2e-agent` is the declared sole owner of e2e: it auto-detects the platform, reconciles against the `tech_stack` DB record, configures the framework if missing, and classifies failures. A bespoke in-skill smoke check would duplicate that ownership and drift. The probe is a thin, opt-in caller of the existing agent.
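
The interpretation rules under "What you get back" can be sketched as a triage function. This is a minimal sketch, not the agent's actual output contract: `ProbeFailure`, `triageProbe`, and the `test` field are hypothetical, and only `category` and `classification_reason` are named in the probe document.

```typescript
type FailureCategory = "real" | "env" | "auth" | "access" | "flake";

// Hypothetical per-failure shape; `test` is an assumed identifier field.
interface ProbeFailure {
  test: string;
  category: FailureCategory;
  classification_reason: string;
}

interface ProbeTriage {
  planSteps: string[];          // one entry per genuine defect
  notes: string[];              // environment, auth, access, or flake noise
  suspicionConfirmed: boolean;  // false means: plan the requested change without a "fix" step
}

function triageProbe(failures: ProbeFailure[]): ProbeTriage {
  const planSteps: string[] = [];
  const notes: string[] = [];
  for (const f of failures) {
    if (f.category === "real") {
      // Genuine defect: becomes a plan step / task.
      planSteps.push(`Fix ${f.test}: ${f.classification_reason}`);
    } else {
      // Not the feature's fault: record it, but do not plan a code fix around it.
      notes.push(`${f.category}: ${f.test} (${f.classification_reason})`);
    }
  }
  return { planSteps, notes, suspicionConfirmed: planSteps.length > 0 };
}
```

An empty `failures` array corresponds to the clean-pass case: no plan steps, no notes, and `suspicionConfirmed` is `false`.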

package/templates/skills/cbp-checkpoint-plan/reference/gap-analysis-playbook.md ADDED

@@ -0,0 +1,47 @@
+---
+scope: org-shared
+---
+# Gap Analysis Playbook
+Loaded by `/cbp-checkpoint-plan` Step 3. The job: find what the raw request misses, before any task is created. Most "half-baked" outcomes come from planning only what was literally asked and ignoring the foundations it depends on or the adjacent breakage it sits next to.
+## Two passes
+### Pass 1 — in-scope gaps
+For each idea, compare the stated requirements against codebase reality (Glob/Grep/Read the affected area). Look for:
+- **Missing foundations** — the request assumes a helper / table / route / type that does not exist yet.
+- **Half-implemented patterns** — a similar feature exists but is incomplete (e.g. a hook with no test, a route with no error path, a component with no loading/empty state).
+- **Implicit requirements** — auth, validation, RLS, migration, types regen, i18n, a11y that the request did not name but the feature needs to be production-ready.
+- **Consistency debt** — the new work would diverge from an established convention unless explicitly aligned.
+Record each as a `context.discoveries[]` entry: `{ topic, finding }`. These become plan steps in Step 8.
+### Pass 2 — adjacent findings
+While reading the area, you will notice problems just outside the literal request (a neighbouring bug, a stale comment referencing a removed symbol, a sibling file with the same latent defect). Classify each:
+| Class | Meaning | Default action |
+|-------|---------|----------------|
+| `in_scope_gap` | Needed for the request to be production-ready | Add a plan step (Pass 1) |
+| `adjacent_absorbed` | Same neighbourhood, cheap to fix while here, low risk | **Pull into this checkpoint** as a task (locked policy) |
+| `adjacent_deferred` | Real but large/independent enough to warrant its own checkpoint | Record as discovery; confirm with user in Step 7 before dropping |
+The locked CHK-138 policy is **scope absorption**: prefer `adjacent_absorbed` over `adjacent_deferred`. Only defer when the user explicitly scopes it out in Step 7, or when absorbing it would balloon the checkpoint beyond its deadline.
+## What "production-ready" means here
+A plan is complete when, for every idea, you can answer yes to:
+- Does every requirement have a covering plan step?
+- Are the implicit requirements (auth / validation / migration / types / tests) either covered or explicitly deemed N/A?
+- Did every Pass 2 finding get a disposition (absorbed or deferred-with-user-confirmation)?
+- Would shipping this leave the touched files free of half-implemented patterns?
+If any answer is no, the plan is not done — add the step or ask the question.
+## Output
+Everything from this playbook lands in `context.discoveries[]` and, after Step 8, in `plan.steps[]`. Do not create tasks here — task creation is Step 9, after dependency decisions and alternatives are settled.
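
The Pass 2 classification table can be read as a mapping from class to default action. The sketch below assumes a hypothetical `Finding` shape and `dispositionFor` helper; only the three class names and the `{ topic, finding }` fields come from the playbook.

```typescript
type FindingClass = "in_scope_gap" | "adjacent_absorbed" | "adjacent_deferred";

// `topic` and `finding` follow the discoveries entry; the other fields are assumptions.
interface Finding {
  topic: string;
  finding: string;
  cls: FindingClass;
  userConfirmedDeferral: boolean; // set after the Step 7 confirmation
}

type Disposition = "plan_step" | "absorb_as_task" | "confirm_with_user" | "defer";

function dispositionFor(f: Finding): Disposition {
  if (f.cls === "in_scope_gap") {
    return "plan_step";
  }
  if (f.cls === "adjacent_absorbed") {
    // Scope absorption is the locked default for cheap, low-risk neighbours.
    return "absorb_as_task";
  }
  // adjacent_deferred: never dropped without the user's confirmation.
  return f.userConfirmedDeferral ? "defer" : "confirm_with_user";
}

// A plan is only complete once no finding is still waiting on the user.
function allDisposed(findings: Finding[]): boolean {
  return findings.every((f) => dispositionFor(f) !== "confirm_with_user");
}
```

`allDisposed` corresponds to the third completeness question: every Pass 2 finding has been absorbed or deferred with confirmation.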

package/templates/skills/cbp-checkpoint-start/SKILL.md ADDED

@@ -0,0 +1,84 @@
+---
+scope: org-shared
+name: cbp-checkpoint-start
+description: Activate a planned checkpoint and claim it for the current user/worktree, then route into task work. Runs after /cbp-checkpoint-plan (which produces tasks but never activates). Refuses to start an unplanned checkpoint.
+argument-hint: [checkpoint-number]
+effort: high
+---
+# Checkpoint Start Command
+The activation + claim gate of the checkpoint pipeline. `/cbp-checkpoint-plan` produces tasks but deliberately leaves the checkpoint `pending` and possibly unclaimed so it can sit in a team queue. This skill flips it to `active`, claims it for the caller's worktree if still open, and routes into the first task.
+## Pipeline
+```
+/cbp-checkpoint-create → /cbp-checkpoint-plan (tasks, no activation) → /cbp-checkpoint-start (activate + claim, here) → /cbp-task-start
+```
+## Instructions
+### Step 0: Parse `$ARGUMENTS`
+Source `repo_id` from `.codebyplan/repo.json`. Resolve caller worktree once for the whole skill: `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`.
+| Shape | Resolves to |
+|-------|-------------|
+| `{chk}` (e.g. `138`) | CHK-{chk} via MCP `get_checkpoints` filtered by `number` |
+| _(empty)_ | The next `pending` checkpoint that already has tasks (planned but not yet started); if several, the lowest-numbered |
+Malformed (non-numeric, contains `-`): surface `checkpoint-start: invalid argument` and stop.
+### Step 1: Load Checkpoint + Tasks
+Load the checkpoint (`status`, `worktree_id`, `plan`) and its tasks via MCP `get_tasks(checkpoint_id)`.
+### Step 2: Planned-Gate
+A checkpoint must be planned before it can start.
+- **No tasks AND empty `plan.steps[]`** → refuse: surface `CHK-NNN is not planned yet.` and auto-trigger `/cbp-checkpoint-plan {NNN}`. STOP. (An unplanned checkpoint has a null `worktree_id`, so there is nothing to own yet; always kick off planning, matching `/cbp-todo` Rule A.)
+- **Already `active`** → no activation needed; run Step 3 as a claim check only, write the claim in Step 4 if one is still needed, then continue to Step 5.
+- **`pending` with tasks** → proceed.
+### Step 3: Claim Logic
+Compare the checkpoint's `worktree_id` against `CALLER_WT`:
+| Checkpoint `worktree_id` | Action |
+|--------------------------|--------|
+| null (left open at create) | Claim it: in Step 4 pass `worktree_id: CALLER_WT`. If `CALLER_WT` is empty, warn the checkpoint will stay unclaimed and proceed without it. |
+| equals `CALLER_WT` | Already yours — no-op. |
+| a DIFFERENT worktree | STOP. Surface: `CHK-NNN is claimed by worktree {other}; current worktree is {CALLER_WT}. If {other} is dead, a maintainer can release it via the release_assignment MCP tool, then re-run /cbp-checkpoint-start.` Do not activate. |
+This mirrors the CHK-104 hard-lock model — never wrest a checkpoint from a live worktree.
+### Step 4: Activate
+If the checkpoint is already `active` AND `worktree_id` already equals `CALLER_WT` (the Step 3 no-op row), skip this step entirely and proceed to Step 5 — nothing to write.
+Otherwise set the checkpoint `active` via MCP `update_checkpoint(checkpoint_id, status: "active")`, adding `worktree_id: CALLER_WT` when claiming per Step 3 and `caller_worktree_id: CALLER_WT` so the hard-lock pre-guard accepts the call (omit `caller_worktree_id` only when `CALLER_WT` is empty). If the checkpoint was already `active` but a claim is still needed, skip the status write and only write `worktree_id`.
+### Step 5: Route
+Follow the close-out routing convention — auto-trigger the next same-context step, never an A/B/C menu. `{first-pending-task}` is the lowest-numbered pending task from Step 1 (not necessarily TASK-1, since additive re-planning may have completed earlier ones):
+- **Claimed by THIS session** (`CALLER_WT` now owns the checkpoint): auto-trigger `/cbp-task-start {chk}-{first-pending-task}` in the same context.
+- **`CALLER_WT` empty / unresolved**: surface a single directive — `Next: /cbp-task-start {chk}-{first-pending-task}` — and let the user proceed.
+Show a short confirmation before routing:
+```
+## Checkpoint Started
+**CHK-NNN**: [title]  •  **Status**: active  •  **Claimed by**: [worktree or "open"]
+**Next task**: TASK-[N] — [title]
+```
+## Integration
+- **Reads**: MCP `get_checkpoints`, `get_tasks`; `npx codebyplan resolve-worktree`
+- **Writes**: MCP `update_checkpoint` (status + worktree_id, with caller_worktree_id pre-guard)
+- **Triggered by**: `/cbp-checkpoint-plan` (auto when claimed at create), `/cbp-todo` (planned-but-pending gate), or user directly
+- **Triggers**: `/cbp-task-start` (auto when claimed), or `/cbp-checkpoint-plan` (when the checkpoint is unplanned)
+- **Never**: plans or creates tasks — that is `/cbp-checkpoint-plan`
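
The Step 3 claim table can be sketched as a pure function over the two worktree ids. This is an illustration only: `resolveClaim` and `ClaimAction` are hypothetical names, an empty string stands in for an unresolved `CALLER_WT`, and the message is shortened from the one the skill surfaces.

```typescript
type ClaimAction =
  | { kind: "claim"; worktreeId: string } // pass worktree_id in Step 4
  | { kind: "proceed_unclaimed" }         // CALLER_WT empty: warn and continue
  | { kind: "noop" }                      // already owned by the caller
  | { kind: "stop"; message: string };    // owned by a different worktree

function resolveClaim(
  chk: number,
  checkpointWorktreeId: string | null,
  callerWt: string,
): ClaimAction {
  if (checkpointWorktreeId === null) {
    // Left open at create: claim it if the caller worktree resolved.
    return callerWt === ""
      ? { kind: "proceed_unclaimed" }
      : { kind: "claim", worktreeId: callerWt };
  }
  if (checkpointWorktreeId === callerWt) {
    return { kind: "noop" };
  }
  // Hard-lock model: never take a checkpoint from another worktree.
  return {
    kind: "stop",
    message: `CHK-${chk} is claimed by worktree ${checkpointWorktreeId}; current worktree is ${callerWt}.`,
  };
}
```

Keeping the decision separate from the MCP write makes the three table rows easy to check before any `update_checkpoint` call is issued.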

package/templates/skills/cbp-task-start/SKILL.md CHANGED

@@ -72,8 +72,8 @@ The task MUST run on its target feat branch. Claude switches/creates that branch
 #### 3.1 — Determine the target branch
 Read `.codebyplan/git.json`:
-- `branch_config.protected` (fall back to `["main", "development"]`)
-- `branch_config.integration` (fall back to `"development"`)
+- `branch_config.protected` (fall back to `["main"]`)
+- `branch_config.production` (fall back to `"main"`) → store as `PRODUCTION`
 Compute `TARGET`:
@@ -102,16 +102,16 @@ if git rev-parse --verify "$TARGET" >/dev/null 2>&1; then
 elif git rev-parse --verify "origin/$TARGET" >/dev/null 2>&1; then
   git checkout -t "origin/$TARGET"
-# (c) target doesn't exist — create from integration branch
+# (c) target doesn't exist — create from production branch (main)
 else
-  # First make sure integration is up to date
-  git fetch origin "$INTEGRATION" 2>/dev/null || true
-  git checkout -b "$TARGET" "origin/$INTEGRATION" 2>/dev/null \
-    || git checkout -b "$TARGET" "$INTEGRATION"
+  # First make sure production is up to date
+  git fetch origin "$PRODUCTION" 2>/dev/null || true
+  git checkout -b "$TARGET" "origin/$PRODUCTION" 2>/dev/null \
+    || git checkout -b "$TARGET" "$PRODUCTION"
 fi
 ```
-**Carrying uncommitted work** — `git checkout` carries clean (non-conflicting) working-tree changes to the new branch automatically. This is intended: changes made on `development` while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
+**Carrying uncommitted work** — `git checkout` carries clean (non-conflicting) working-tree changes to the new branch automatically. This is intended: changes made on `main` while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
 **If `git checkout` exits non-zero** (typically "would clobber" because a tracked file has unstaged changes that conflict with target's version): surface the raw git error verbatim, stop, do NOT attempt recovery. The user resolves and re-invokes. This is the only case where `/cbp-task-start` halts on branch state.
@@ -128,7 +128,7 @@ After successful switch:
 #### 3.5 — Protected-branch sanity (defensive)
-After all of the above, `current` should be a feat branch by construction. If somehow it's still in `branch_config.protected` (e.g. TARGET resolved to "development" because a checkpoint has a malformed `branch_name`), THEN block with a hard error citing the bad config — this is a data bug, not a user workflow problem.
+After all of the above, `current` should be a feat branch by construction. If somehow it's still in `branch_config.protected` (e.g. TARGET resolved to "main" because a checkpoint has a malformed `branch_name`), THEN block with a hard error citing the bad config — this is a data bug, not a user workflow problem.
 ### Step 3b: Clean Slate
@@ -173,18 +173,18 @@ Before activating the task, verify the caller's worktree matches the assigned wo
 4. If `TARGET_WT IS NULL` or matches, proceed.
-### Step 3.6: Integration Drift Check (optional)
+### Step 3.6: Main-Drift Check (optional)
-Before loading context, check if the feat branch has drifted from integration. Soft-skip on fetch failure (this gate is optional, not mandatory).
+Before loading context, check if the feat branch has drifted from the production branch (main). Soft-skip on fetch failure (this gate is optional, not mandatory).
-1. Run `git fetch origin {INTEGRATION}` where `{INTEGRATION}` comes from `.codebyplan/git.json` `branch_config.integration` (default `development`). If fetch fails (offline, auth), skip this entire step silently and proceed to Step 4.
-2. Compute: `BEHIND=$(git rev-list --count HEAD..origin/{INTEGRATION})`.
+1. Run `git fetch origin {PRODUCTION}` where `{PRODUCTION}` comes from `.codebyplan/git.json` `branch_config.production` (default `main`). If fetch fails (offline, auth), skip this entire step silently and proceed to Step 4.
+2. Compute: `BEHIND=$(git rev-list --count HEAD..origin/{PRODUCTION})`.
 3. Compute `LAST_FETCH_AGE_HOURS` from the mtime of `.git/FETCH_HEAD` relative to now.
 4. If `BEHIND >= 10` OR `LAST_FETCH_AGE_HOURS > 24`: surface AskUserQuestion:
    ```
-   Feat branch is {BEHIND} commits behind origin/{INTEGRATION} (last fetch {AGE}h ago).
-   Merge integration now before starting the task?
+   Feat branch is {BEHIND} commits behind origin/{PRODUCTION} (last fetch {AGE}h ago).
+   Merge main now before starting the task?
    A) Yes — run /cbp-merge-main (recommended)
    B) No — continue without merging
    ```
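
The drift threshold in Step 3.6 reduces to a small predicate. The sketch below assumes the caller has already obtained `BEHIND` from `git rev-list --count` and the mtime of `.git/FETCH_HEAD`; `fetchAgeHours` and `driftPrompt` are hypothetical helper names, and rounding the age for display is an assumption.

```typescript
const BEHIND_THRESHOLD = 10;    // commits behind origin/{PRODUCTION}
const MAX_FETCH_AGE_HOURS = 24; // hours since .git/FETCH_HEAD was last written

const MS_PER_HOUR = 60 * 60 * 1000;

function fetchAgeHours(fetchHeadMtimeMs: number, nowMs: number): number {
  return (nowMs - fetchHeadMtimeMs) / MS_PER_HOUR;
}

// Returns the question to surface, or null when the gate stays silent.
function driftPrompt(
  behind: number,
  lastFetchAgeHours: number,
  production: string = "main",
): string | null {
  const drifted = behind >= BEHIND_THRESHOLD || lastFetchAgeHours > MAX_FETCH_AGE_HOURS;
  if (!drifted) {
    return null;
  }
  const age = Math.round(lastFetchAgeHours);
  return `Feat branch is ${behind} commits behind origin/${production} (last fetch ${age}h ago).`;
}
```

Note the asymmetry carried over from the step text: the commit count uses `>=` while the fetch age uses a strict `>`, so exactly 24 hours does not trigger the prompt.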

package/templates/skills/cbp-todo/SKILL.md CHANGED

@@ -26,6 +26,18 @@ WORKTREE_ID=$(npx codebyplan resolve-worktree 2>/dev/null)
 Use MCP `get_next_action` with `repo_id` and `worktree_id` (if present from Step 0).
+### Step 1.5: Checkpoint Planning Gate
+Before honoring the command from Step 1, gate on the resolved active/next checkpoint's planning + activation state. This keeps work from starting on a half-baked or un-activated checkpoint. Resolve the checkpoint from the `get_next_action` response context (or MCP `get_current_task`), then load its `plan` + `status` via MCP `get_checkpoints` and its task count via MCP `get_tasks(checkpoint_id)`.
+Evaluate two rules in order (Rule A wins if both could match):
+- **RULE A — unplanned**: empty `plan.steps[]` **AND** zero tasks → the checkpoint has not been planned. Suppress the Step-1 command; surface `Now planning CHK-NNN… handing off to /cbp-checkpoint-plan` and auto-trigger `/cbp-checkpoint-plan {NNN}`.
+- **RULE B — planned-but-pending**: has tasks (or non-empty `plan.steps[]`) **BUT** `status === "pending"` (not yet activated) → the checkpoint is planned but not started. Suppress the Step-1 command; surface `Now starting CHK-NNN… handing off to /cbp-checkpoint-start` and auto-trigger `/cbp-checkpoint-start {NNN}` (a planned checkpoint must be started + claimed before task work).
+- **Neither** (planned AND `active`) → fall through to Step 2 unchanged. No regression to the existing flow.
+Skip this gate when `get_next_action` returns no checkpoint (idle — see Step 3) or the command is `/cbp-session-start`.
 ### Step 2: Load Context Based on Command
 Before triggering the command, load the context it needs. This ensures `/clear` + `/cbp-todo` reliably restores full working context.
@@ -36,6 +48,8 @@ Before triggering the command, load the context it needs. This ensures `/clear`
 |----------------|-----------------|
 | `/cbp-session-start` | None — `/cbp-session-start` handles its own loading |
 | `/cbp-checkpoint-create` | If checkpoint exists in response context: load checkpoint via MCP `get_checkpoints` (filter by number). Display checkpoint title, goal, ideas summary |
+| `/cbp-checkpoint-plan` | Load checkpoint via MCP `get_checkpoints` (filter by number) + `get_tasks(checkpoint_id)`. Display checkpoint title, goal, ideas, existing task count |
+| `/cbp-checkpoint-start` | Load checkpoint via MCP `get_checkpoints` + `get_tasks(checkpoint_id)`. Display checkpoint title, status, claim state, first pending task |
 | `/cbp-task-start [N]` | Load via MCP `get_current_task`. Display checkpoint title + task title/requirements summary |
 | `/cbp-round-start` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + round count + last round summary |
 | `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed approval summary |
@@ -93,5 +107,5 @@ If Step 3 found actionable work, show `instructions` from the get_next_action re
 ## Integration
 - **Called by**: `/cbp-session-start`, `/cbp-task-complete`, `/cbp-checkpoint-complete`, manual, after `/clear`
-- **Reads**: MCP `get_next_action`, `get_current_task`, `get_rounds`, `get_checkpoints`
-- **Triggers**: `command` from response (auto)
+- **Reads**: MCP `get_next_action`, `get_current_task`, `get_rounds`, `get_checkpoints`, `get_tasks`
+- **Triggers**: `command` from response (auto); Step 1.5 gate overrides to `/cbp-checkpoint-plan` (unplanned) or `/cbp-checkpoint-start` (planned-but-pending)
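
The Step 1.5 gate added in this hunk can be sketched as a function from checkpoint state to an override command. `CheckpointState` and `planningGate` are hypothetical names; the two commands and the `"pending"` status come from the rules themselves.

```typescript
// Hypothetical summary of what the gate loads via get_checkpoints and get_tasks.
interface CheckpointState {
  planStepCount: number; // length of plan.steps[]
  taskCount: number;
  status: string;        // e.g. "pending" or "active"
}

type GateOverride = "/cbp-checkpoint-plan" | "/cbp-checkpoint-start" | null;

function planningGate(c: CheckpointState): GateOverride {
  const planned = c.planStepCount > 0 || c.taskCount > 0;
  // Rule A: unplanned, so hand off to planning.
  if (!planned) {
    return "/cbp-checkpoint-plan";
  }
  // Rule B: planned but not yet activated.
  if (c.status === "pending") {
    return "/cbp-checkpoint-start";
  }
  // Neither rule matched: fall through to the Step 1 command unchanged.
  return null;
}
```

A `null` result means the command from `get_next_action` is honored as before; the sketch does not model the skip conditions (idle response or `/cbp-session-start`).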