codebyplan 1.8.0 → 1.9.0

package/dist/cli.js CHANGED
@@ -14,7 +14,7 @@ var VERSION, PACKAGE_NAME;
14
14
  var init_version = __esm({
15
15
  "src/lib/version.ts"() {
16
16
  "use strict";
17
- VERSION = "1.8.0";
17
+ VERSION = "1.9.0";
18
18
  PACKAGE_NAME = "codebyplan";
19
19
  }
20
20
  });
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "codebyplan",
3
- "version": "1.8.0",
3
+ "version": "1.9.0",
4
4
  "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
5
5
  "type": "module",
6
6
  "bin": {
@@ -1,65 +1,42 @@
1
1
  ---
2
2
  scope: org-shared
3
3
  name: cbp-checkpoint-create
4
- description: In-depth idea assessment, Q&A, and task creation
4
+ description: Mechanical checkpoint creation — capture the user's description, infer title + goal, dedup against existing modules, create the checkpoint row + feat branch, then hand off to /cbp-checkpoint-plan for deep planning. Creates ZERO tasks.
5
5
  argument-hint: [checkpoint description]
6
- effort: xhigh
6
+ effort: high
7
7
  ---
8
8
 
9
9
  # Checkpoint Create Command
10
10
 
11
- Runs INLINE (no subagent) - all context stays in session. Assesses the idea, runs research if needed, conducts exhaustive Q&A, builds context, and creates tasks as vertical slices.
11
+ Runs INLINE. This is the **mechanical** stage only: capture raw user input, infer a title/goal, run a cheap module-overlap check, create the checkpoint row, create + switch to the feat branch, then auto-trigger `/cbp-checkpoint-plan`. It does **NOT** assess the idea, run research, conduct exhaustive Q&A, generate a plan, or create tasks; all of that is `/cbp-checkpoint-plan`'s job.
12
12
 
13
- ## Instructions
14
-
15
- ### Step 1: Check for Existing Checkpoint Data
16
-
17
- Before asking the user anything, check if a checkpoint already exists with ideas/context:
18
-
19
- 1. Use MCP `get_next_action` response context — if it includes a checkpoint, load it via MCP `get_checkpoints`
20
- 2. If `$ARGUMENTS` contains a checkpoint number (e.g., `69` or `CHK-69`), load that checkpoint
21
- 3. Check the checkpoint's `ideas` array and `context` fields
13
+ ## Pipeline
22
14
 
23
- **If checkpoint has `ideas[]` with descriptions/requirements:**
24
- - Use `ideas[].description` as the checkpoint description (DO NOT ask the user)
25
- - Use `ideas[].requirements` as the requirements list
26
- - Use existing `context` (decisions, discoveries, etc.) if populated
27
- - Skip Steps 1b and 2 if deadline already set
28
- - Proceed directly to Step 3 with this pre-loaded context
15
+ ```
16
+ /cbp-checkpoint-create (mechanical, here) → /cbp-checkpoint-plan (deep planning + tasks) → /cbp-checkpoint-start (activate + claim)
17
+ ```
29
18
 
30
- **If checkpoint has NO ideas or is brand new:** Continue to Step 1b.
19
+ ## Instructions
31
20
 
32
- ### Step 1b: Get Checkpoint Description
21
+ ### Step 1: Check for Existing Checkpoint Data
33
22
 
34
- **If arguments provided:** Use `$ARGUMENTS` as `user_prompt`
23
+ Source `repo_id` from `.codebyplan/repo.json`. If `$ARGUMENTS` contains a checkpoint number, or MCP `get_next_action` returns one, load it via MCP `get_checkpoints`. If the checkpoint already has `ideas[]` with descriptions, reuse `ideas[].description` (do not re-ask) and skip Step 2.
35
24
 
36
- **If NO arguments:** Ask user via AskUserQuestion:
37
- ```
38
- What should this checkpoint accomplish?
39
- Describe the work in a sentence or two.
40
- ```
25
+ ### Step 2: Get Checkpoint Description
41
26
 
42
- **Data routing for ideas[]:**
43
- - `ideas[].description` = user's raw prompt (user-written, immutable after creation)
44
- - `ideas[].assessment` = Claude's structured analysis of this idea (Claude-updatable)
45
- - `context` = structured analysis output (decisions, constraints, discoveries)
46
- - Do NOT write Claude analysis into `user_context` — that field is for raw user text only
27
+ **If `$ARGUMENTS` provided:** use it as the description. **Else** ask via AskUserQuestion: "What should this checkpoint accomplish? Describe the work in a sentence or two."
47
28
 
48
- ### Step 2: Prompt for Deadline
29
+ Data routing: `ideas[].description` = the user's raw words (immutable; never overwrite). Do NOT write any Claude analysis into `user_context` — that field is raw user text only. Assessment, decisions, and discoveries are written later by `/cbp-checkpoint-plan`.
49
30
 
50
- **Skip if checkpoint already has a deadline set.**
31
+ ### Step 3: Prompt for Deadline
51
32
 
52
- Ask user via AskUserQuestion with options:
53
- - Today
54
- - Tomorrow
55
- - This week (Friday)
56
- - Custom date
33
+ Skip if a deadline is already set. Else ask via AskUserQuestion: Today / Tomorrow / This week (Friday) / Custom date.
57
34
 
58
- ### Step 2.9: Semantic-Domain Module Dedup (BEFORE assessment)
35
+ ### Step 4: Semantic-Domain Module Dedup
59
36
 
60
- Run BEFORE Step 3 (Assess Idea) so the assessment factors in any existing module that might already cover the proposed feature. Cheaper to clarify intent at checkpoint creation than during planner Phase 1 Q&A.
37
+ Cheap module-overlap pre-flight: catching "is this even a new module?" here halves the planner's round-1 surface area for duplicate-feature checkpoints.
61
38
 
62
- **Trigger**: the checkpoint description (or any `ideas[].description`) contains a feature verb in a user-facing semantic domain. Domain table — extend as new modules ship:
39
+ **Trigger**: the description contains a feature verb/noun in a user-facing semantic domain:
63
40
 
64
41
  | Domain | Trigger verbs / nouns | Glob target |
65
42
  |--------|----------------------|-------------|
@@ -75,213 +52,65 @@ Run BEFORE Step 3 (Assess Idea) so the assessment factors in any existing module
75
52
  | Pets | pet, dog, cat, animal | `apps/mobile/src/features/modules/pet*/` |
76
53
  | Work | work, job, task, productivity | `apps/mobile/src/features/modules/work*/` |
77
54
 
78
- **Procedure**:
79
-
80
- 1. Tokenise the description into verbs/nouns. Match against the trigger column.
81
- 2. For each matched domain, Glob the corresponding target. If the glob returns zero entries, no overlap — proceed.
82
- 3. If the glob returns one or more existing modules, ask the user via AskUserQuestion (BEFORE running Step 3 codebase analysis):
83
-
84
- > Found existing module(s) `{matched-paths}` which may already cover {domain} functionality.
85
- >
86
- > A) Extend the existing module — add the new feature inside `{primary-match}` rather than creating a new module
87
- > B) Build a separate module — explain the distinction in 1-2 sentences (different user intent / different data shape)
88
- > C) Unrelated — the new feature is in the same semantic neighbourhood but does not overlap functionally
89
- > D) Cancel checkpoint creation — re-frame the description
90
-
91
- 4. Save the answer as a locked decision in checkpoint context:
92
- ```json
93
- {
94
- "decision": "Extend existing food module rather than create eating module",
95
- "rationale": "User clarified meal-tracking lives alongside existing food/recipe data",
96
- "locked": true
97
- }
98
- ```
99
- Pass `locked: true` so planner Phase 2 honors it.
100
-
101
- 5. If (B) "Build separate", record the user's distinction verbatim as the rationale — planner Phase 1 will use it to avoid duplicate scaffolding.
102
-
103
- **Why this fires here, not in planner Phase 1**: the planner's Q&A already catches duplicates (livebyplan TASK-4 caught "eating" → existing "food" via planner Q&A), but only after spawning Explore and reading codebase context — wasteful when the question is "is this even a new module?" Catching at checkpoint creation halves the round-1 surface area for duplicate-feature checkpoints.
104
-
105
- Skip this step when the description has zero domain matches OR when the user explicitly states intent in the prompt ("extend the existing food module to support meal sessions") with a referenced module path.
106
-
107
- ### Step 3: Assess Idea (INLINE - no subagent)
108
-
109
- Analyze the idea thoroughly:
110
-
111
- 1. **Discovery Level Detection:**
112
- - Level 0: Simple change, no research needed
113
- - Level 1: Minor research, check existing patterns
114
- - Level 2: Moderate research, check docs/APIs
115
- - Level 3: Deep research, unfamiliar territory
116
-
117
- 2. **Codebase Analysis:** Use Glob/Grep/Read to understand:
118
- - Related existing code
119
- - Patterns to follow
120
- - Dependencies affected
121
- - Files that will need changes
122
-
123
- 3. **Research (Level 2+ only):** Spawn a single `research` subagent for web research. Wait for results.
124
- - Research findings MUST be stored in checkpoint.research JSONB via MCP `update_checkpoint` — NEVER write to local docs/ folder
125
-
126
- 4. **Cross-Reference All Ideas:**
127
- - When analyzing ideas[], iterate ALL items — not just the first
128
- - Cross-reference requirements across all context items
129
- - Identify overlaps, conflicts, and dependencies between ideas
130
- - Store cross-reference results in checkpoint context (discoveries[], dependencies[])
131
-
132
- 5. **Store Assessment Per Idea:**
133
- - For each idea analyzed, write Claude's analysis into `ideas[N].assessment`
134
- - Use MCP `update_checkpoint` with the full ideas array including assessment fields
135
- - The assessment is Claude-written; the description remains the user's original text
136
-
137
- ### Step 4: Exhaustive Q&A
138
-
139
- Ask the user targeted questions to fill gaps. Every answer gets saved immediately.
140
-
141
- **Build questions based on:**
142
- - Ambiguities in the description
143
- - Multiple valid approaches detected
144
- - UI/UX decisions needed
145
- - Scope boundaries unclear
146
- - Integration points uncertain
147
-
148
- Use AskUserQuestion for each question batch (max 4 per call). After each response, save to checkpoint context via MCP `update_checkpoint`:
149
-
150
- ```json
151
- {
152
- "context": {
153
- "decisions": [{"decision": "...", "rationale": "...", "locked": true}],
154
- "discoveries": [{"topic": "...", "finding": "..."}],
155
- "dependencies": ["..."],
156
- "constraints": ["..."],
157
- "qa_answers": [{"question": "...", "answer": "..."}]
158
- }
159
- }
160
- ```
161
-
162
- Continue asking until all ambiguities resolved.
163
-
164
- ### Step 5: Determine Next Checkpoint Number
165
-
166
- Use MCP `get_checkpoints` for the repo. Find highest checkpoint number, add 1.
167
-
168
- ### Step 6: Create Checkpoint in DB
169
-
170
- **Before calling `create_checkpoint`**: resolve worktree_id via `npx codebyplan resolve-worktree 2>/dev/null`. If non-empty, pass as the `worktree_id` parameter so the checkpoint is born assigned to this worktree. (Per CHK-104 TASK-2 — DB-level hard-lock requires identifying the caller worktree at creation time.)
171
-
172
- **Why here specifically**: this is the first identity-stamping point for the checkpoint; if `worktree_id` is missing here, every downstream task and round inherits the gap and the hard-lock pre-guards have no caller identity to compare against. If empty, prompt the user to run `npx codebyplan setup` first from this directory to register the worktree before creating the checkpoint.
55
+ **Procedure**: tokenise the description; for each matched domain, Glob the target. Zero entries → proceed. One+ existing modules → ask via AskUserQuestion before continuing: (A) extend the existing module, (B) build a separate module — give the 1–2 sentence distinction, (C) unrelated, (D) cancel. Save the answer as a `locked: true` decision in `context.decisions[]` so `/cbp-checkpoint-plan` honors it.
173
56
 
174
- Use MCP `create_checkpoint`:
175
- - `repo_id`: from .codebyplan/repo.json
176
- - `worktree_id`: from `npx codebyplan resolve-worktree 2>/dev/null` (omit param if empty — checkpoint is created with NULL worktree_id, unassigned)
177
- - `title`: derived from description
178
- - `number`: next number
179
- - `goal`: detailed goal from assessment
180
- - `deadline`: from Step 2
181
- - `status`: "pending"
182
- - `context`: accumulated from Q&A
183
- - `research`: from Step 3 (if any)
57
+ Skip when the description has zero domain matches OR the user already named a target module path.
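The trigger-matching half of this procedure can be sketched as follows. This is only an illustration: the `DomainRule` type and `matchDomains` function are hypothetical names, the rule list contains just the two table rows visible in this hunk, and the whole-word tokeniser is an assumption (the skill itself is carried out by the agent with its Glob tool, not by code like this).

```typescript
// Hypothetical shape for one row of the domain table.
interface DomainRule {
  domain: string;
  triggers: string[];
  globTarget: string;
}

// Only the two rows visible in this diff hunk; the real table has more.
const RULES: DomainRule[] = [
  { domain: "Pets", triggers: ["pet", "dog", "cat", "animal"], globTarget: "apps/mobile/src/features/modules/pet*/" },
  { domain: "Work", triggers: ["work", "job", "task", "productivity"], globTarget: "apps/mobile/src/features/modules/work*/" },
];

// Lowercase the description, split on non-letters, and return every rule
// that has at least one trigger present as a whole token.
function matchDomains(description: string, rules: DomainRule[] = RULES): DomainRule[] {
  const tokens = new Set(
    description.toLowerCase().split(/[^a-z]+/).filter((t) => t.length > 0),
  );
  return rules.filter((rule) => rule.triggers.some((trigger) => tokens.has(trigger)));
}
```

Each returned rule's `globTarget` would then be globbed; an empty result list corresponds to the "zero domain matches" skip case. Note that whole-token matching does not catch plurals such as "pets", so a real implementation would probably want stemming.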
184
58
 
185
- ### Step 6b: Generate Plan
59
+ ### Step 5: Infer Title + Goal
186
60
 
187
- Based on the assessment and Q&A, generate a structured plan:
61
+ Lightweight inference from the description; no deep analysis. **Title**: concise, ≤80 chars. **Goal**: ≤300 chars, a faithful restatement of intent (not a plan).
188
62
 
189
- 1. Analyze ALL `ideas[]` items — every description and every requirement
190
- 2. Create an ordered sequence of plan steps that covers everything
191
- 3. Each step: `{ title: "What to build", description: "How and why", scope: "Which idea(s) it addresses" }`
192
- 4. Ensure every idea description and every requirement is addressed by at least one step
193
- 5. Cross-reference: if multiple ideas relate, consolidate into cohesive steps
63
+ ### Step 6: Claim-or-Open Prompt
194
64
 
195
- Save plan to checkpoint via MCP `update_checkpoint(checkpoint_id, plan: { steps: [...] })`.
65
+ Ask the user via AskUserQuestion whether to claim this checkpoint now:
196
66
 
197
- ### Step 7: Create Tasks as Vertical Slices
67
+ - **Claim for me + this worktree** (default) — resolve `npx codebyplan resolve-worktree 2>/dev/null` and set it as the checkpoint `worktree_id` at create. The creator carries momentum straight through plan → start.
68
+ - **Leave it open** — create with `worktree_id` null so anyone free can claim it later via `/cbp-checkpoint-start`.
198
69
 
199
- **Critical design principle:** Each task is a complete vertical slice that can be independently implemented and tested. Each task should produce a complete, production-ready deliverable.
70
+ Record the choice; it drives both the create call (Step 8) and the plan→start routing in `/cbp-checkpoint-plan`.
200
71
 
201
- **BAD (horizontal layers):**
202
- - TASK-1: Create database functions
203
- - TASK-2: Create API routes
204
- - TASK-3: Create UI components
72
+ ### Step 7: Determine Next Checkpoint Number
205
73
 
206
- **GOOD (vertical slices grouped by theme):**
207
- - TASK-1: Implement user authentication (DB + API + UI)
208
- - TASK-2: Implement user profile page (DB + API + UI)
74
+ MCP `get_checkpoints` for the repo; highest `number` + 1.
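As a small illustration of the numbering rule, assuming the rows returned by `get_checkpoints` each carry a numeric `number` field (the `CheckpointRow` type and function name are hypothetical):

```typescript
// Hypothetical minimal view of a checkpoint row; only `number` matters here.
interface CheckpointRow {
  number: number;
}

// Highest existing number + 1. Starting an empty repo at 1 is an
// assumption; the skill text does not say what the first number is.
function nextCheckpointNumber(rows: CheckpointRow[]): number {
  return rows.reduce((max, row) => Math.max(max, row.number), 0) + 1;
}
```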
209
75
 
210
- **GOOD (theme-based grouping for infrastructure):**
211
- - TASK-1: Config rule + remove Step 0 from all commands (one theme: eliminate redundant reads)
212
- - TASK-2: Task sizing + round workflow improvements (one theme: workflow optimization)
76
+ ### Step 8: Create Checkpoint Row
213
77
 
214
- **Sizing:** With 1M token context, tasks can be large. Group logically by theme — each task should encompass all work needed to deliver a complete feature or improvement. Only split a vertical slice when it covers genuinely independent themes, not because of size. A single task touching 30+ files is fine if they all serve one coherent purpose.
78
+ MCP `create_checkpoint`:
79
+ - `repo_id` (from `.codebyplan/repo.json`), `number`, `title`, `goal`, `deadline`, `status: "pending"`
80
+ - `ideas`: `[{ description: <raw user text> }]`
81
+ - `worktree_id`: the resolved worktree from Step 6 **only if the user chose "claim"**; omit when "leave open"
215
82
 
216
- For each task, use MCP `create_task`:
217
- - `checkpoint_id`: from Step 6
218
- - `title`: descriptive vertical slice title
219
- - `number`: sequential
220
- - `requirements`: detailed requirements
221
- - `context`: extracted relevant context from checkpoint
83
+ This is the first identity-stamping point — when claiming, passing `worktree_id` here engages the CHK-104 hard-lock from birth. No `context`, `research`, `plan`, or tasks are written here.
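The claim-or-open rule for `worktree_id` can be sketched as a payload builder. The field names follow the list above; the `CreateCheckpointArgs` type, the `buildCreateArgs` helper, and the choice to omit `worktree_id` when the resolved id is empty are assumptions for illustration, not the tool's actual schema.

```typescript
type ClaimChoice = "claim" | "open";

// Hypothetical argument shape mirroring the fields listed in Step 8.
interface CreateCheckpointArgs {
  repo_id: string;
  number: number;
  title: string;
  goal: string;
  deadline: string;
  status: "pending";
  ideas: { description: string }[];
  worktree_id?: string;
}

function buildCreateArgs(
  base: Omit<CreateCheckpointArgs, "status" | "ideas" | "worktree_id">,
  rawDescription: string,
  choice: ClaimChoice,
  resolvedWorktreeId: string,
): CreateCheckpointArgs {
  const args: CreateCheckpointArgs = {
    ...base,
    status: "pending",
    // Description-only ideas: the user's raw words, no assessment.
    ideas: [{ description: rawDescription }],
  };
  // Stamp the worktree only on an explicit claim with a non-empty id
  // (assumption: an empty resolve result is treated like "leave open").
  const worktreeId = resolvedWorktreeId.trim();
  if (choice === "claim" && worktreeId !== "") {
    args.worktree_id = worktreeId;
  }
  return args;
}
```

Omitting the key entirely, rather than sending an empty string, matches the "omit when leave open" wording above.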
222
84
 
223
- ### Step 8: Create Git Branch
85
+ ### Step 9: Create + Switch to Feat Branch
224
86
 
225
- Check: `git branch -a | grep development`
87
+ Read `.codebyplan/git.json` `branch_config.production` (default `"main"`) as `BASE`. codebyplan repos are main-only: never create or branch from a `development`/integration branch.
226
88
 
227
- **If NO development branch (local or remote):**
228
- Create it from main:
229
89
  ```bash
230
- git checkout main && git checkout -b development && git push -u origin development && echo "Created development branch from main"
90
+ git fetch origin "$BASE" 2>/dev/null || true
91
+ git checkout -b "feat/CHK-{NNN}-{slug}" "origin/$BASE" 2>/dev/null \
92
+ || git checkout -b "feat/CHK-{NNN}-{slug}" "$BASE"
93
+ git push -u origin "feat/CHK-{NNN}-{slug}"
231
94
  ```
232
95
 
233
- **If development branch exists:**
96
+ Slug: lowercase, dash-joined, punctuation dropped, ≤40 chars. Persist the branch via MCP `update_checkpoint(checkpoint_id, branch_name: "feat/CHK-{NNN}-{slug}")`. (The dedicated `/cbp-git-branch-feat-create` skill is the canonical config-driven helper if you prefer to delegate.)
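One possible reading of the slug rule, as a minimal sketch. The function names are hypothetical, and two details are assumptions: a dash left dangling by the 40-character cut is stripped, and the checkpoint number is interpolated as given (the text does not say whether `{NNN}` is zero-padded).

```typescript
// Lowercase, drop punctuation, join words with dashes, cap at 40 characters.
function slugify(title: string, maxLength = 40): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .split(/[\s-]+/) // split on whitespace and existing dashes
    .filter((word) => word.length > 0)
    .join("-");
  // Truncate, then strip any dash left dangling at the cut point.
  return slug.slice(0, maxLength).replace(/-+$/, "");
}

// Assumption: the number is used as-is, with no zero padding.
function featBranchName(checkpointNumber: number, title: string): string {
  return `feat/CHK-${checkpointNumber}-${slugify(title)}`;
}
```

For example, `featBranchName(138, "Fix: sync")` yields `feat/CHK-138-fix-sync`.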
234
97
 
235
- Detect worktree: `IS_WORKTREE=$(test -f "$(git rev-parse --show-toplevel)/.git" && echo "yes" || echo "no")`
236
-
237
- **(a) In a worktree:** Create feature branch from current branch:
238
- ```bash
239
- git checkout -b feat/CHK-{NNN}-{slug}
240
- git push -u origin feat/CHK-{NNN}-{slug}
241
- ```
242
-
243
- **(b) Not in a worktree:** Run:
244
- ```bash
245
- git checkout -b feat/CHK-{NNN}-{slug}
246
- git push -u origin feat/CHK-{NNN}-{slug}
247
- ```
248
-
249
- ### Step 9: Assign Worktree
250
-
251
- Resolve `worktree_id` at runtime (CHK-108: never read from `.codebyplan/repo.json`):
252
-
253
- ```bash
254
- WORKTREE_ID=$(npx codebyplan resolve-worktree 2>/dev/null)
255
- ```
256
-
257
- If `WORKTREE_ID` is non-empty:
258
- ```
259
- MCP update_checkpoint(checkpoint_id, worktree_id: WORKTREE_ID)
260
- ```
261
-
262
- If empty (tuple unmatched): skip — Step 6 already passed `worktree_id` to `create_checkpoint`, so this re-stamp is only a defensive no-op.
263
-
264
- ### Step 10: Show Result
98
+ ### Step 10: Show Result + Auto-Trigger Plan
265
99
 
266
100
  ```
267
101
  ## Checkpoint Created
268
102
 
269
- **ID**: CHK-[NNN]
270
- **Title**: [Title]
271
- **Deadline**: [date]
272
- **Discovery Level**: [0-3]
273
- **Tasks**: [N]
103
+ **CHK-NNN**: [title] • **Deadline**: [date] • **Branch**: feat/CHK-NNN-slug
104
+ **Claim**: [claimed by this worktree / left open]
274
105
 
275
- ### Tasks:
276
- 1. TASK-1: [title]
277
- 2. TASK-2: [title]
278
- ...
279
-
280
- Run `/cbp-todo` to start working.
106
+ Now planning CHK-NNN… handing off to /cbp-checkpoint-plan.
281
107
  ```
282
108
 
109
+ Auto-trigger `/cbp-checkpoint-plan {NNN}` in the same context. This skill created ZERO tasks — the plan skill produces them.
110
+
283
111
  ## Integration
284
112
 
285
- - **Runs inline**: All analysis in current session (no context loss)
286
- - **Spawns**: `research` agent (Level 2+ only, for web research)
287
- - **Saves to DB**: Context, research, QA answers via MCP
113
+ - **Runs inline**: mechanical setup only; no assessment, research, Q&A, plan, or tasks
114
+ - **Reads**: MCP `get_next_action`, `get_checkpoints`; `.codebyplan/repo.json`, `.codebyplan/git.json`; `npx codebyplan resolve-worktree`
115
+ - **Writes**: MCP `create_checkpoint` (description-only ideas + deadline + optional worktree_id), `update_checkpoint` (branch_name)
116
+ - **Triggers**: `/cbp-checkpoint-plan` (auto)
@@ -0,0 +1,137 @@
1
+ ---
2
+ scope: org-shared
3
+ name: cbp-checkpoint-plan
4
+ description: Deep inline planning for a checkpoint — assess, gap-analyse, decide dependencies, compare alternatives, optionally e2e-probe a suspected-broken area, then create tasks as vertical slices. Runs after /cbp-checkpoint-create (mechanical) and before /cbp-checkpoint-start (activate + claim). Does NOT activate or claim.
5
+ argument-hint: [checkpoint-number]
6
+ effort: xhigh
7
+ ---
8
+
9
+ # Checkpoint Plan Command
10
+
11
+ Runs INLINE (no subagent) — all analysis and Q&A stay in the main session. This is the rigour stage that prevents half-baked plans: it discovers shortcomings, decides whether existing dependencies suffice or a new one is warranted, compares competing approaches, and only THEN creates tasks. It produces `plan.steps[]` + tasks but **never activates the checkpoint and never claims a user/worktree** — that is `/cbp-checkpoint-start`.
12
+
13
+ ## Pipeline
14
+
15
+ ```
16
+ /cbp-checkpoint-create (mechanical) → /cbp-checkpoint-plan (deep planning, here) → /cbp-checkpoint-start (activate + claim)
17
+ ```
18
+
19
+ Semantic-domain module dedup already ran in `/cbp-checkpoint-create` (its Step 4) — do NOT repeat it here. This skill assumes the checkpoint row, title, goal, branch, and any module-overlap decision already exist.
20
+
21
+ ## Instructions
22
+
23
+ ### Step 0: Parse `$ARGUMENTS`
24
+
25
+ Source `repo_id` from `.codebyplan/repo.json` — every MCP call below that takes `repo_id` uses it.
26
+
27
+ | Shape | Resolves to |
28
+ |-------|-------------|
29
+ | `{chk}` (e.g. `138`) | CHK-{chk} via MCP `get_checkpoints` filtered by `number` |
30
+ | _(empty)_ | Active/pending checkpoint via MCP `get_current_task`; if none in progress, the most recent `pending` checkpoint that has no `plan.steps` |
31
+
32
+ Malformed (non-numeric, contains `-`): surface `checkpoint-plan: invalid argument` and stop.
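The three argument shapes above (numeric, empty, malformed) can be sketched as a small parser. The `ParsedArg` type and `parsePlanArgument` name are hypothetical; only the error string is taken from the text.

```typescript
type ParsedArg =
  | { kind: "number"; checkpointNumber: number }
  | { kind: "empty" }
  | { kind: "invalid"; message: string };

function parsePlanArgument(raw: string): ParsedArg {
  const arg = raw.trim();
  if (arg === "") return { kind: "empty" };
  // Digits only: rejects "CHK-138", "13-8", "abc", and negative numbers.
  if (!/^\d+$/.test(arg)) {
    return { kind: "invalid", message: "checkpoint-plan: invalid argument" };
  }
  return { kind: "number", checkpointNumber: Number(arg) };
}
```

The `empty` case is where the fallback lookup (current task, then most recent unplanned `pending` checkpoint) would run.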
33
+
34
+ ### Step 1: Load Checkpoint + Existing Tasks
35
+
36
+ 1. Resolve the checkpoint (Step 0). Load `user_context`, `ideas[]`, `context` (decisions / discoveries / dependencies / constraints / qa_answers / alternatives), `research`, `plan`.
37
+ 2. MCP `get_tasks(checkpoint_id)` — load existing tasks. This sets the mode:
38
+ - **fresh** — zero tasks: full plan + create all tasks.
39
+ - **additive re-plan** — tasks exist: gap-analyse against them; only ADD new tasks or refine requirements for gaps. NEVER delete or overwrite an in-flight task.
40
+ 3. Note whether `worktree_id` is set (claimed at create) — drives routing in Step 11.
41
+
42
+ ### Step 2: Assess Ideas + Codebase
43
+
44
+ Iterate ALL `ideas[]` (not just the first). For each:
45
+
46
+ 1. **Discovery level** — 0 (trivial) · 1 (check existing patterns) · 2 (read docs/APIs) · 3 (unfamiliar territory).
47
+ 2. **Codebase analysis** — Glob/Grep/Read the related code: existing patterns to follow, files that will change, integration points.
48
+ 3. **Research (level 2+ only)** — spawn a single `cbp-research` subagent for web/library research. Persist findings to `checkpoint.research` via MCP `update_checkpoint` — never to local docs. In additive re-plan mode, read the existing `research` (loaded in Step 1) and append — do not replace the object.
49
+ 4. **Cross-reference** ideas against each other: overlaps, conflicts, shared dependencies.
50
+
51
+ Write each idea's analysis into `ideas[N].assessment` (Claude-authored; never touch `description`, which is the user's words).
52
+
53
+ ### Step 3: Gap Analysis
54
+
55
+ Find what the raw request misses — this is the core anti-"half-ass" step. Load `reference/gap-analysis-playbook.md` and run its two passes:
56
+
57
+ - **Pass 1 (in-scope gaps)** — shortcomings, half-implemented patterns, and missing foundations WITHIN the stated scope. Record each as a `context.discoveries[]` entry.
58
+ - **Pass 2 (adjacent findings)** — problems in the same neighbourhood but outside the literal request. Per locked policy, adjacent findings are **pulled INTO this checkpoint** as additional plan steps / tasks (scope absorption) rather than deferred — unless the user explicitly scopes them out in Step 7, or absorbing them would push the checkpoint past its deadline (surface in Step 7 for confirmation before absorbing).
59
+
60
+ ### Step 4: E2E Discovery Probe (opt-in)
61
+
62
+ Only relevant when an idea touches a UI surface AND you SUSPECT an existing flow is already broken. Load `reference/e2e-discovery-probe.md`.
63
+
64
+ 1. Surface the suspicion: name the area + the specific pages/screens, and why you think it is broken.
65
+ 2. Ask the user via AskUserQuestion to confirm running the probe (it needs a running dev server).
66
+ 3. On confirm, spawn `cbp-test-e2e-agent` with `whole_checkpoint_mode: true`, `round_number: 0`, `files_changed: []`, the `pages_affected` you proposed, plus `repo_id` / `test_strategy` / `has_auth` / `dev_server_port`. Resolve `test_strategy` and `dev_server_port` per `reference/e2e-discovery-probe.md` (do not pass placeholder strings).
67
+ 4. Record the probe outcome (what actually failed vs. what you assumed) in `context.discoveries[]` so the plan targets real defects.
68
+
69
+ Skip this step entirely for non-UI checkpoints or when no breakage is suspected.
70
+
71
+ ### Step 5: Dependency Decisions
72
+
73
+ When an idea could be built by extending something already installed OR by adding a new dependency, do NOT silently pick one. Load `reference/dep-decision-rubric.md` and:
74
+
75
+ 1. Identify the capability needed and whether an existing dependency already covers it.
76
+ 2. If a new dependency is a candidate, surface the trade-off (capability gap, bundle weight, maintenance, vendor-docs availability) — via AskUserQuestion when the choice is consequential.
77
+ 3. Lock the outcome as a `context.decisions[]` entry with `locked: true` and a rationale.
78
+
79
+ ### Step 6: Alternative Comparison
80
+
81
+ When more than one viable approach exists for a meaningful design fork, present the alternatives before committing. Load `reference/alternative-comparison-template.md`:
82
+
83
+ 1. Build 2–4 options with a one-line trade-off each; mark a recommendation.
84
+ 2. Ask via AskUserQuestion (use `preview` for code/layout comparisons).
85
+ 3. Save the comparison + the user's choice to `context.alternatives[]` and mirror the chosen path as a `context.decisions[]` entry.
86
+
87
+ ### Step 7: Exhaustive Q&A
88
+
89
+ Resolve every remaining ambiguity. Ask via AskUserQuestion (max 4 per batch). After each batch, append to `context.qa_answers[]`. Continue until nothing material is unresolved — including confirming the scope-absorption candidates from Step 3 Pass 2 (keep / drop each).
90
+
91
+ ### Step 8: Generate or Extend `plan.steps[]`
92
+
93
+ Build an ordered `plan.steps[]` where each step is `{ title, description, scope }` (`scope` = which idea/requirement it addresses). Every `ideas[].requirements` item AND every kept gap finding must be covered by ≥1 step. In additive mode, append/refine steps — do not drop steps that map to existing tasks. Save via MCP `update_checkpoint(checkpoint_id, plan: { steps: [...] })` — `plan` is a top-level checkpoint field, so this write does not touch `context` or `research`; the context-replaces rule in Step 10 applies only to the `context` JSONB.
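The coverage requirement could be checked mechanically along these lines. This is a rough sketch under a strong assumption: that each step's `scope` string quotes the exact text of every requirement or kept gap finding it covers. The skill does not specify how `scope` is written, and `uncoveredItems` is a hypothetical helper.

```typescript
// Step shape as described in Step 8.
interface PlanStep {
  title: string;
  description: string;
  scope: string;
}

// Returns the items that no step's scope mentions. Assumes (hypothetically)
// that `scope` quotes each requirement or finding label verbatim.
function uncoveredItems(items: string[], steps: PlanStep[]): string[] {
  return items.filter((item) => !steps.some((step) => step.scope.includes(item)));
}
```

An empty result means every item is covered by at least one step; anything returned still needs a step before the plan is saved.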
94
+
95
+ ### Step 9: Create Tasks as Vertical Slices
96
+
97
+ Each task is a complete, independently shippable vertical slice — group by theme, not by layer.
98
+
99
+ **BAD (horizontal layers):** TASK-1 DB functions · TASK-2 API routes · TASK-3 UI.
100
+ **GOOD (vertical slices):** TASK-1 user auth (DB + API + UI) · TASK-2 profile page (DB + API + UI).
101
+ **GOOD (infra by theme):** TASK-1 config rule + Step-0 removal · TASK-2 task-sizing + round-workflow.
102
+
103
+ Sizing: with a 1M-token context, tasks can be large — group by coherent purpose; a single task touching 30+ files is fine if they serve one theme. Only split when themes are genuinely independent.
104
+
105
+ For each task use MCP `create_task` (`checkpoint_id`, sequential `number` after the current max, `title`, `requirements`, `context` with the relevant decisions/discoveries). **Additive mode:** create only tasks for steps not already covered by an existing task; never delete in-flight tasks. **Coverage check** — a plan step counts as covered only when an existing task's requirements address its full scope; when uncertain, create the task and note in its requirements `may overlap with TASK-N — review before executing`.
106
+
107
+ ### Step 10: Persist Full Context
108
+
109
+ Final write of the complete `checkpoint.context` JSONB via MCP `update_checkpoint`. Honor the **context-replaces-not-merges** contract: read the current context, merge your additions in memory, write the FULL object (`decisions` + `discoveries` + `dependencies` + `constraints` + `qa_answers` + `alternatives`). A partial write clobbers sibling keys. Phrase any database-verb words as prose in payloads (WAF gate).
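The read-merge-write contract can be illustrated with a small in-memory merge. The six keys come from the text above; the `CheckpointContext` type, the append-only merge strategy, and the `mergeContext` name are assumptions for this sketch.

```typescript
// The six context keys named in Step 10; entry shapes are left opaque.
interface CheckpointContext {
  decisions: unknown[];
  discoveries: unknown[];
  dependencies: unknown[];
  constraints: unknown[];
  qa_answers: unknown[];
  alternatives: unknown[];
}

const CONTEXT_KEYS = [
  "decisions",
  "discoveries",
  "dependencies",
  "constraints",
  "qa_answers",
  "alternatives",
] as const;

// Merge additions into the current context and return the FULL object,
// so that writing the result cannot clobber a sibling key.
function mergeContext(
  current: Partial<CheckpointContext>,
  additions: Partial<CheckpointContext>,
): CheckpointContext {
  const merged = {} as CheckpointContext;
  for (const key of CONTEXT_KEYS) {
    merged[key] = [...(current[key] ?? []), ...(additions[key] ?? [])];
  }
  return merged;
}
```

The returned object is what would be passed as `context` to `update_checkpoint`; passing only `additions` would be the partial write the contract warns against.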
110
+
111
+ ### Step 11: Show Result + Route
112
+
113
+ ```
114
+ ## Checkpoint Planned
115
+
116
+ **CHK-NNN**: [title]
117
+ **Tasks**: [created N] (additive: [+M new]) • **Plan steps**: [count]
118
+ **Locked decisions**: [count] • **E2E probe**: [fired / skipped]
119
+
120
+ ### Tasks
121
+ 1. TASK-1: [title]
122
+ ...
123
+ ```
124
+
125
+ This skill does **NOT** activate the checkpoint and does **NOT** claim a user/worktree.
126
+
127
+ - **Claimed by THIS session** — `worktree_id` is set AND equals `npx codebyplan resolve-worktree 2>/dev/null`: auto-trigger `/cbp-checkpoint-start` in the same context (the creator carries momentum into activation).
128
+ - **Otherwise** — `worktree_id` is null, set to a different worktree, or `resolve-worktree` is empty: surface a single directive — `Next: /cbp-checkpoint-start` — so the owning session (or anyone free, if open) claims and starts it. Never auto-activate a checkpoint owned by a different worktree.
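The two routing branches reduce to one comparison, sketched below. `routeAfterPlan` and the `Route` labels are hypothetical names; the second argument stands for the output of `npx codebyplan resolve-worktree`.

```typescript
type Route = "auto-start" | "directive";

// Auto-start only when the checkpoint is claimed AND the claim matches the
// worktree resolved for this session; every other case gets the directive.
function routeAfterPlan(checkpointWorktreeId: string | null, resolvedWorktreeId: string): Route {
  const resolved = resolvedWorktreeId.trim();
  const claimedByThisSession =
    checkpointWorktreeId !== null && resolved !== "" && checkpointWorktreeId === resolved;
  return claimedByThisSession ? "auto-start" : "directive";
}
```

Treating an empty resolve result as "not this session" keeps the sketch on the safe side of the rule never to auto-activate a checkpoint owned by another worktree.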
129
+
130
+ ## Integration
131
+
132
+ - **Reads**: MCP `get_current_task`, `get_checkpoints`, `get_tasks`
133
+ - **Writes**: MCP `update_checkpoint` (ideas assessment, context, plan, research), `create_task`
134
+ - **Spawns**: `cbp-research` (level 2+ only), `cbp-test-e2e-agent` (opt-in discovery probe, `whole_checkpoint_mode`)
135
+ - **Triggered by**: `/cbp-checkpoint-create` (auto), or user directly
136
+ - **Triggers**: `/cbp-checkpoint-start` (auto when claimed at create; directive when left open)
137
+ - **Never**: activates the checkpoint or claims a user/worktree — that is `/cbp-checkpoint-start`
@@ -0,0 +1,54 @@
1
+ ---
2
+ scope: org-shared
3
+ ---
4
+
5
+ # Alternative Comparison Template
6
+
7
+ Loaded by `/cbp-checkpoint-plan` Step 6. Use when a meaningful design fork has more than one viable answer. Surfacing the alternatives — instead of silently picking one — is what lets the user redirect before tasks are created.
8
+
9
+ ## When a fork warrants a question
10
+
11
+ Ask the user when ALL of these hold:
12
+
13
+ - There are 2–4 genuinely viable options (not one obvious winner).
14
+ - The choice is hard to reverse later (architecture, data shape, public API, a one-way dependency).
15
+ - The options have materially different trade-offs the user would care about.
16
+
17
+ Resolve inline (record a decision, no question) when the choice is reversible, low-stakes, or has a single defensible answer.
18
+
19
+ ## How to present
20
+
21
+ Use AskUserQuestion. Lead with your recommendation as the first option and tag it `(Recommended)`. Give each option a one-line trade-off. For code/layout/config forks, use the `preview` field to show concrete snippets side by side.
22
+
23
+ ```
24
+ Question: "How should X be structured?"
25
+ Options:
26
+ - "Approach A (Recommended)" — <one-line trade-off>
27
+ - "Approach B" — <one-line trade-off>
28
+ - "Approach C" — <one-line trade-off>
29
+ ```
30
+
31
+ ## Record format
32
+
33
+ Persist to `context.alternatives[]` (a JSONB key introduced by this skill — schema-flexible; declare it as below). Read-merge-write the full context per Step 10.
34
+
35
+ ```json
36
+ {
37
+ "question": "How should X be structured?",
38
+ "options": [
39
+ { "label": "Approach A", "tradeoff": "...", "recommended": true },
40
+ { "label": "Approach B", "tradeoff": "..." }
41
+ ],
42
+ "chosen": "Approach A",
43
+ "rationale": "user picked A because ...",
44
+ "decided_at": "2026-06-01T..."
45
+ }
46
+ ```
47
+
48
+ Mirror the chosen path as a `context.decisions[]` entry with `locked: true` so the executor treats it as settled and the planner does not re-ask on a re-run.
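A minimal sketch of that mirroring step, assuming the `context.decisions[]` entry uses the `decision` / `rationale` / `locked` fields shown in the dependency rubric. The `toLockedDecision` helper and the wording of the `decision` string are hypothetical.

```typescript
interface AlternativeOption {
  label: string;
  tradeoff: string;
  recommended?: boolean;
}

interface AlternativeRecord {
  question: string;
  options: AlternativeOption[];
  chosen: string;
  rationale: string;
  decided_at: string; // ISO timestamp
}

interface DecisionEntry {
  decision: string;
  rationale: string;
  locked: boolean;
}

// Hypothetical helper: derive the locked decision that mirrors a recorded alternative.
function toLockedDecision(alt: AlternativeRecord): DecisionEntry {
  return {
    decision: `${alt.question} Chosen: ${alt.chosen}`,
    rationale: alt.rationale,
    locked: true, // settled: the executor follows it and the planner does not re-ask
  };
}
```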
49
+
50
+ ## Notes
51
+
52
+ - One question per fork; do not bundle unrelated forks into one multi-select.
53
+ - If the user picks "Other" and writes a custom answer, record their text verbatim as the rationale.
54
+ - Keep the options mutually exclusive — overlapping options produce ambiguous decisions.
@@ -0,0 +1,50 @@
1
+ ---
2
+ scope: org-shared
3
+ ---
4
+
5
+ # Dependency Decision Rubric
6
+
7
+ Loaded by `/cbp-checkpoint-plan` Step 5. Use when an idea could be built by extending something already installed OR by pulling in a new dependency. The goal is a deliberate, recorded choice — never a silent `pnpm add`.
8
+
9
+ ## Decision tree
10
+
11
+ 1. **Does an installed dependency already cover this?**
12
+ - Check `package.json` (root + the relevant workspace) and `vendor/INDEX.md` for an existing library.
13
+ - Grep the codebase for prior art — the capability may already be wrapped in a util/hook/service.
14
+ - If yes and it fits → **extend the existing dependency**. Record a `locked` decision and stop.
15
+
16
+ 2. **Can the existing dependency be extended at acceptable cost?**
17
+ - A thin wrapper / adapter over an installed lib almost always beats a new dependency.
18
+ - If extension means forking or fighting the library → a new dependency may be justified; continue.
19
+
20
+ 3. **Is a new dependency warranted?** Weigh:
21
+
22
+ | Factor | Favors extending | Favors new dependency |
23
+ |--------|------------------|-----------------------|
24
+ | Capability gap | Existing covers ~all of it | Existing covers little / poorly |
25
+ | Bundle weight | Adds to an already-loaded dep | Heavy add to a lean surface (esp. mobile) |
26
+ | Maintenance | No new supply-chain surface | Well-maintained, widely used, typed |
27
+ | Vendor docs | — | A `vendor/{lib}/v{ver}/` mirror exists or can be scaffolded via `/cbp-build-vendor-doc` |
28
+ | Lock-in / migration | Reuses known patterns | One-way door; migration cost later |
29
+
30
+ 4. **Consequential choice?** If adding a new dependency meaningfully changes bundle size, security surface, or architecture, surface it to the user via AskUserQuestion with the trade-off table above. Otherwise decide inline and record it.
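The four steps above can be read as a short chain of early returns. The sketch below assumes the planner has already made each judgment; the `DependencyFacts` fields and the `decideDependency` name are hypothetical, and the step 3 weighing is collapsed into a single boolean.

```typescript
type DependencyOutcome = "extend_existing" | "add_new_inline" | "ask_user";

// Hypothetical inputs: each boolean is the planner's answer to one step of the tree.
interface DependencyFacts {
  installedCoversIt: boolean;       // step 1: an installed dep covers it and fits
  extensionCostAcceptable: boolean; // step 2: a thin wrapper or adapter is enough
  newDependencyWarranted: boolean;  // step 3: outcome of weighing the factor table
  consequential: boolean;           // step 4: changes bundle size, security surface, or architecture
}

function decideDependency(facts: DependencyFacts): DependencyOutcome {
  if (facts.installedCoversIt) return "extend_existing";
  if (facts.extensionCostAcceptable) return "extend_existing";
  if (!facts.newDependencyWarranted) return "extend_existing";
  // A new dependency is justified: ask only when the choice is consequential.
  return facts.consequential ? "ask_user" : "add_new_inline";
}
```

Whatever the outcome, the rubric still expects a recorded entry in `context.decisions[]`; the function only picks the branch.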
31
+
32
+ ## Record format
33
+
34
+ Write the outcome as a `context.decisions[]` entry (read-merge-write the full context per Step 10):
35
+
36
+ ```json
37
+ {
38
+ "decision": "Extend the installed date-fns rather than add dayjs",
39
+ "rationale": "date-fns already bundled; needed formatter is a 3-line wrapper; avoids a second date lib",
40
+ "locked": true
41
+ }
42
+ ```
43
+
44
+ When a new dependency IS chosen, also add a `context.dependencies[]` entry naming it and (if no vendor mirror exists) a plan step to scaffold one via `/cbp-build-vendor-doc`.
45
+
46
+ ## Anti-patterns
47
+
48
+ - Adding a library for something the standard lib or an installed dep already does.
49
+ - Two libraries that solve the same problem (e.g. two date libs, two state managers) — consolidate instead.
50
+ - Deciding silently — every extend-vs-add fork must leave a recorded, locked decision so the executor and future planners can see the reasoning.
@@ -0,0 +1,57 @@
1
+ ---
2
+ scope: org-shared
3
+ ---
4
+
5
+ # E2E Discovery Probe
6
+
7
+ Loaded by `/cbp-checkpoint-plan` Step 4. The probe answers one question before you plan a fix: **is this area actually broken, and how?** It reuses `cbp-test-e2e-agent` (the sole owner of e2e execution) in `whole_checkpoint_mode` rather than introducing a second smoke-test path.
8
+
9
+ ## When to offer the probe
10
+
11
+ Offer it only when BOTH hold:
12
+
13
+ - An idea touches a UI surface — its text or the affected files mention a page / screen / route / form / component.
14
+ - You have a concrete suspicion that an existing flow is already broken (not "let's test everything"). The probe targets a named area, not the whole app.
15
+
16
+ Skip silently for backend-only / infra / `claude_only` checkpoints, or when you have no breakage suspicion.
17
+
18
+ ## Procedure
19
+
20
+ 1. **State the suspicion** — name the area, the specific pages/screens, and why you think it is broken (a stale selector, a route that 404s, a recent refactor nearby).
21
+ 2. **Confirm with the user** via AskUserQuestion — the probe needs a running dev server, so it is opt-in. Options: run the probe / skip and plan from assumption / let me name different pages.
22
+ 3. **Resolve the dev-server port** from `.codebyplan/server.json` `port_allocations[]` (pick the entry whose `server_type` matches the app, e.g. `nextjs`). If nothing is running there, ask the user to start it or skip.
23
+ 4. **Resolve `test_strategy`** — call MCP `get_repos()`, find the entry where `id === repo_id`, and read the affected app's platform + e2e framework from its `tech_stack` record. If the record has no e2e data, pass `null` for the unknown fields — the agent resolves them itself at its Step 1.5. Do NOT pass placeholder strings.
24
+ 5. **Spawn** `cbp-test-e2e-agent` with the payload below.
25
+ 6. **Interpret** the result: compare what actually failed against what you assumed. Record the delta in `context.discoveries[]` so the plan targets real defects, not imagined ones.
26
+
27
+ ## Spawn payload (whole_checkpoint_mode)
28
+
29
+ `round_number: 0` is the documented sentinel for `whole_checkpoint_mode` in the agent's Input Contract — in that mode `files_changed` / `prior_round_files_changed` are ignored and the agent runs the `pages_affected` you give it.
30
+
31
+ ```yaml
32
+ input:
33
+ repo_id: <repo UUID from .codebyplan/repo.json>
34
+ round_number: 0 # sentinel — whole_checkpoint_mode
35
+ whole_checkpoint_mode: true
36
+ files_changed: [] # nothing changed yet — probing current state
37
+ prior_round_files_changed: [] # ignored under whole_checkpoint_mode; required for non-probe round_number >= 2 calls
38
+
39
+ test_strategy:
40
+ platform: <from tech_stack DB record>
41
+ e2e_framework: <playwright | maestro | webdriverio | xcuitest | vscode-test>
42
+ pages_affected: ["<route or screen you suspect>", ...]
43
+ has_auth: <true | false>
44
+ dev_server_port: <port from server.json, or null>
45
+ ```
46
+
47
+ ## What you get back
48
+
49
+ The agent returns `test_results` (passed / failed / skipped + per-failure `category` and `classification_reason`) and `preflight`. For planning purposes:
50
+
51
+ - `category: 'real'` failures → genuine defects; turn each into a plan step / task.
52
+ - `category: 'env' | 'auth' | 'access' | 'flake'` → not the feature's fault; note it but do not plan a code fix around it.
53
+ - A clean pass → your breakage suspicion was wrong; plan the actual requested change without a "fix" step you did not need.
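One way to apply those three rules to the returned failures is sketched below. The `category` and `classification_reason` fields come from the description above; the `test` field, the `ProbeTriage` shape, and the `triageProbe` name are assumptions made for illustration.

```typescript
type FailureCategory = "real" | "env" | "auth" | "access" | "flake";

interface ProbeFailure {
  test: string; // hypothetical identifier for the failing page or spec
  category: FailureCategory;
  classification_reason: string;
}

interface ProbeTriage {
  planSteps: string[];         // one per genuine defect
  notes: string[];             // environment-side failures, noted but not planned
  suspicionConfirmed: boolean; // false on a clean pass or env-only failures
}

function triageProbe(failures: ProbeFailure[]): ProbeTriage {
  const real = failures.filter((f) => f.category === "real");
  const other = failures.filter((f) => f.category !== "real");
  return {
    planSteps: real.map((f) => `Fix ${f.test}: ${f.classification_reason}`),
    notes: other.map((f) => `${f.category}: ${f.test}`),
    suspicionConfirmed: real.length > 0,
  };
}
```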
54
+
55
+ ## Why reuse the agent (not a new smoke probe)
56
+
57
+ `cbp-test-e2e-agent` is the declared sole owner of e2e: it auto-detects the platform, reconciles against the `tech_stack` DB record, configures the framework if missing, and classifies failures. A bespoke in-skill smoke check would duplicate that ownership and drift. The probe is a thin, opt-in caller of the existing agent.
@@ -0,0 +1,47 @@
1
+ ---
2
+ scope: org-shared
3
+ ---
4
+
5
+ # Gap Analysis Playbook
6
+
7
+ Loaded by `/cbp-checkpoint-plan` Step 3. The job: find what the raw request misses, before any task is created. Most "half-baked" outcomes come from planning only what was literally asked and ignoring the foundations it depends on or the adjacent breakage it sits next to.
8
+
9
+ ## Two passes
10
+
11
+ ### Pass 1 — in-scope gaps
12
+
13
+ For each idea, compare the stated requirements against codebase reality (Glob/Grep/Read the affected area). Look for:
14
+
15
+ - **Missing foundations** — the request assumes a helper / table / route / type that does not exist yet.
16
+ - **Half-implemented patterns** — a similar feature exists but is incomplete (e.g. a hook with no test, a route with no error path, a component with no loading/empty state).
17
+ - **Implicit requirements** — auth, validation, RLS, migration, types regen, i18n, a11y that the request did not name but the feature needs to be production-ready.
18
+ - **Consistency debt** — the new work would diverge from an established convention unless explicitly aligned.
19
+
20
+ Record each as a `context.discoveries[]` entry: `{ topic, finding }`. These become plan steps in Step 8.
21
+
22
+ ### Pass 2 — adjacent findings
23
+
24
+ While reading the area, you will notice problems just outside the literal request (a neighbouring bug, a stale comment referencing a removed symbol, a sibling file with the same latent defect). Classify each:
25
+
26
+ | Class | Meaning | Default action |
27
+ |-------|---------|----------------|
28
+ | `in_scope_gap` | Needed for the request to be production-ready | Add a plan step (Pass 1) |
29
+ | `adjacent_absorbed` | Same neighbourhood, cheap to fix while here, low risk | **Pull into this checkpoint** as a task (locked policy) |
30
+ | `adjacent_deferred` | Real but large/independent enough to warrant its own checkpoint | Record as discovery; confirm with user in Step 7 before dropping |
31
+
32
+ The locked CHK-138 policy is **scope absorption**: prefer `adjacent_absorbed` over `adjacent_deferred`. Only defer when the user explicitly scopes it out in Step 7, or when absorbing it would balloon the checkpoint beyond its deadline.
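The absorption policy reduces to a default with two escape hatches. A minimal sketch, assuming the planner already knows whether the user scoped the finding out and whether absorbing it would overrun the deadline; the `AdjacentFinding` shape and function name are hypothetical.

```typescript
type AdjacentClass = "adjacent_absorbed" | "adjacent_deferred";

// Hypothetical inputs describing one Pass 2 finding.
interface AdjacentFinding {
  userScopedOut: boolean;       // user explicitly excluded it in Step 7
  wouldExceedDeadline: boolean; // absorbing it would push the checkpoint past its deadline
}

// Default to absorption; defer only for the two reasons the policy allows.
function classifyAdjacent(finding: AdjacentFinding): AdjacentClass {
  if (finding.userScopedOut || finding.wouldExceedDeadline) {
    return "adjacent_deferred";
  }
  return "adjacent_absorbed";
}
```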
33
+
34
+ ## What "production-ready" means here
35
+
36
+ A plan is complete when, for every idea, you can answer yes to:
37
+
38
+ - Does every requirement have a covering plan step?
39
+ - Are the implicit requirements (auth / validation / migration / types / tests) either covered or explicitly deemed N/A?
40
+ - Did Pass 2 findings get a disposition (absorbed or deferred-with-user-confirmation)?
41
+ - Would shipping this leave any half-implemented pattern in the touched files?
42
+
43
+ If any answer is no, the plan is not done — add the step or ask the question.
44
+
45
+ ## Output
46
+
47
+ Everything from this playbook lands in `context.discoveries[]` and, after Step 8, in `plan.steps[]`. Do not create tasks here — task creation is Step 9, after dependency decisions and alternatives are settled.
@@ -0,0 +1,84 @@
1
+ ---
2
+ scope: org-shared
3
+ name: cbp-checkpoint-start
4
+ description: Activate a planned checkpoint and claim it for the current user/worktree, then route into task work. Runs after /cbp-checkpoint-plan (which produces tasks but never activates). Refuses to start an unplanned checkpoint.
5
+ argument-hint: [checkpoint-number]
6
+ effort: high
7
+ ---
8
+
9
+ # Checkpoint Start Command
10
+
11
+ The activation + claim gate of the checkpoint pipeline. `/cbp-checkpoint-plan` produces tasks but deliberately leaves the checkpoint `pending` and possibly unclaimed so it can sit in a team queue. This skill flips it to `active`, claims it for the caller's worktree if still open, and routes into the first task.
12
+
13
+ ## Pipeline
14
+
15
+ ```
16
+ /cbp-checkpoint-create → /cbp-checkpoint-plan (tasks, no activation) → /cbp-checkpoint-start (activate + claim, here) → /cbp-task-start
17
+ ```
18
+
19
+ ## Instructions
20
+
21
+ ### Step 0: Parse `$ARGUMENTS`
22
+
23
+ Source `repo_id` from `.codebyplan/repo.json`. Resolve caller worktree once for the whole skill: `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`.
24
+
25
+ | Shape | Resolves to |
26
+ |-------|-------------|
27
+ | `{chk}` (e.g. `138`) | CHK-{chk} via MCP `get_checkpoints` filtered by `number` |
28
+ | _(empty)_ | The next `pending` checkpoint that already has tasks (planned but not yet started); if several, the lowest-numbered |
29
+
30
+ Malformed (non-numeric, contains `-`): surface `checkpoint-start: invalid argument` and stop.
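The argument shapes above could be parsed along these lines. This is a sketch under assumptions: the `StartArg` union and the `parseStartArgument` name are hypothetical, and only the error string is taken from the text.

```typescript
type StartArg =
  | { kind: "number"; chk: number }
  | { kind: "next_pending" }
  | { kind: "invalid"; message: string };

function parseStartArgument(raw: string): StartArg {
  const arg = raw.trim();
  // Empty argument: fall back to the next planned-but-pending checkpoint.
  if (arg === "") return { kind: "next_pending" };
  // Digits only: anything non-numeric (including a "-") is malformed.
  if (!/^\d+$/.test(arg)) {
    return { kind: "invalid", message: "checkpoint-start: invalid argument" };
  }
  return { kind: "number", chk: Number(arg) };
}
```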
31
+
32
+ ### Step 1: Load Checkpoint + Tasks
33
+
34
+ Load the checkpoint (`status`, `worktree_id`, `plan`) and its tasks via MCP `get_tasks(checkpoint_id)`.
35
+
36
+ ### Step 2: Planned-Gate
37
+
38
+ A checkpoint must be planned before it can start.
39
+
40
+ - **No tasks AND empty `plan.steps[]`** → refuse: surface `CHK-NNN is not planned yet.` and auto-trigger `/cbp-checkpoint-plan {NNN}`. STOP. (An unplanned checkpoint has a null `worktree_id`, so there is nothing to own yet; always kick off planning, matching `/cbp-todo` Rule A.)
41
+ - **Already `active`** → no activation needed; run Step 3 as a claim check only, then Step 4 solely to write a still-missing claim, then Step 5.
42
+ - **`pending` with tasks** → proceed.
43
+
44
+ ### Step 3: Claim Logic
45
+
46
+ Compare the checkpoint's `worktree_id` against `CALLER_WT`:
47
+
48
+ | Checkpoint `worktree_id` | Action |
49
+ |--------------------------|--------|
50
+ | null (left open at create) | Claim it: in Step 4 pass `worktree_id: CALLER_WT`. If `CALLER_WT` is empty, warn that the checkpoint will stay unclaimed and proceed without a claim. |
51
+ | equals `CALLER_WT` | Already yours — no-op. |
52
+ | a DIFFERENT worktree | STOP. Surface: `CHK-NNN is claimed by worktree {other}; current worktree is {CALLER_WT}. If {other} is dead, a maintainer can release it via the release_assignment MCP tool, then re-run /cbp-checkpoint-start.` Do not activate. |
53
+
54
+ This mirrors the CHK-104 hard-lock model — never wrest a checkpoint from a live worktree.
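The three table rows map onto a small decision function. The sketch below is illustrative: the `ClaimAction` union and the `resolveClaim` name are hypothetical, an empty string stands for an unresolved `CALLER_WT`, and the stop message is shortened from the full text above.

```typescript
type ClaimAction =
  | { kind: "claim"; worktreeId: string }
  | { kind: "proceed_unclaimed"; warning: string }
  | { kind: "noop" }
  | { kind: "stop"; message: string };

function resolveClaim(
  chk: number,
  checkpointWorktree: string | null,
  callerWt: string,
): ClaimAction {
  if (checkpointWorktree === null) {
    // Left open at create: claim it if the caller worktree resolved.
    return callerWt === ""
      ? { kind: "proceed_unclaimed", warning: `CHK-${chk} will stay unclaimed` }
      : { kind: "claim", worktreeId: callerWt };
  }
  if (checkpointWorktree === callerWt) return { kind: "noop" };
  // Hard lock: never take a checkpoint from another worktree.
  return {
    kind: "stop",
    message: `CHK-${chk} is claimed by worktree ${checkpointWorktree}; current worktree is ${callerWt}.`,
  };
}
```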
55
+
56
+ ### Step 4: Activate
57
+
58
+ If the checkpoint is already `active` AND `worktree_id` already equals `CALLER_WT` (the Step 3 no-op row), skip this step entirely and proceed to Step 5 — nothing to write.
59
+
60
+ Otherwise set the checkpoint `active` via MCP `update_checkpoint(checkpoint_id, status: "active")`, adding `worktree_id: CALLER_WT` when claiming per Step 3 and `caller_worktree_id: CALLER_WT` so the hard-lock pre-guard accepts the call (omit `caller_worktree_id` only when `CALLER_WT` is empty). If the checkpoint was already `active` but a claim is still needed, skip the status write and only write `worktree_id`.
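A rough sketch of how the `update_checkpoint` arguments might be assembled from those rules. The `UpdateCheckpointArgs` shape and the `buildActivationArgs` name are hypothetical, the different-worktree case is assumed to have stopped in Step 3, and sending `caller_worktree_id` on a claim-only write is an assumption based on the pre-guard remark.

```typescript
interface UpdateCheckpointArgs {
  checkpoint_id: string;
  status?: "active";
  worktree_id?: string;
  caller_worktree_id?: string;
}

// Returns null when there is nothing to write (Step 4 is skipped).
function buildActivationArgs(
  checkpointId: string,
  currentStatus: string,
  checkpointWorktree: string | null,
  callerWt: string,
): UpdateCheckpointArgs | null {
  const alreadyActive = currentStatus === "active";
  const needsClaim = checkpointWorktree === null && callerWt !== "";
  if (alreadyActive && !needsClaim) return null;

  const args: UpdateCheckpointArgs = { checkpoint_id: checkpointId };
  if (!alreadyActive) args.status = "active";       // skip the status write if already active
  if (needsClaim) args.worktree_id = callerWt;      // claim an open checkpoint
  if (callerWt !== "") args.caller_worktree_id = callerWt; // pre-guard; omitted only when empty
  return args;
}
```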
61
+
62
+ ### Step 5: Route
63
+
64
+ Follow the close-out routing convention — auto-trigger the next same-context step, never an A/B/C menu. `{first-pending-task}` is the lowest-numbered pending task from Step 1 (not necessarily TASK-1, since additive re-planning may have completed earlier ones):
65
+
66
+ - **Claimed by THIS session** (`CALLER_WT` now owns the checkpoint): auto-trigger `/cbp-task-start {chk}-{first-pending-task}` in the same context.
67
+ - **`CALLER_WT` empty / unresolved**: surface a single directive — `Next: /cbp-task-start {chk}-{first-pending-task}` — and let the user proceed.
68
+
69
+ Show a short confirmation before routing:
70
+
71
+ ```
72
+ ## Checkpoint Started
73
+
74
+ **CHK-NNN**: [title] • **Status**: active • **Claimed by**: [worktree or "open"]
75
+ **Next task**: TASK-[N] — [title]
76
+ ```
77
+
78
+ ## Integration
79
+
80
+ - **Reads**: MCP `get_checkpoints`, `get_tasks`; `npx codebyplan resolve-worktree`
81
+ - **Writes**: MCP `update_checkpoint` (status + worktree_id, with caller_worktree_id pre-guard)
82
+ - **Triggered by**: `/cbp-checkpoint-plan` (auto when claimed at create), `/cbp-todo` (planned-but-pending gate), or user directly
83
+ - **Triggers**: `/cbp-task-start` (auto when claimed), or `/cbp-checkpoint-plan` (when the checkpoint is unplanned)
84
+ - **Never**: plans or creates tasks — that is `/cbp-checkpoint-plan`
@@ -72,8 +72,8 @@ The task MUST run on its target feat branch. Claude switches/creates that branch
72
72
  #### 3.1 — Determine the target branch
73
73
 
74
74
  Read `.codebyplan/git.json`:
75
- - `branch_config.protected` (fall back to `["main", "development"]`)
76
- - `branch_config.integration` (fall back to `"development"`)
75
+ - `branch_config.protected` (fall back to `["main"]`)
76
+ - `branch_config.production` (fall back to `"main"`) → store as `PRODUCTION`
77
77
 
78
78
  Compute `TARGET`:
79
79
 
@@ -102,16 +102,16 @@ if git rev-parse --verify "$TARGET" >/dev/null 2>&1; then
102
102
  elif git rev-parse --verify "origin/$TARGET" >/dev/null 2>&1; then
103
103
  git checkout -t "origin/$TARGET"
104
104
 
105
- # (c) target doesn't exist — create from integration branch
105
+ # (c) target doesn't exist — create from production branch (main)
106
106
  else
107
- # First make sure integration is up to date
108
- git fetch origin "$INTEGRATION" 2>/dev/null || true
109
- git checkout -b "$TARGET" "origin/$INTEGRATION" 2>/dev/null \
110
- || git checkout -b "$TARGET" "$INTEGRATION"
107
+ # First make sure production is up to date
108
+ git fetch origin "$PRODUCTION" 2>/dev/null || true
109
+ git checkout -b "$TARGET" "origin/$PRODUCTION" 2>/dev/null \
110
+ || git checkout -b "$TARGET" "$PRODUCTION"
111
111
  fi
112
112
  ```
113
113
 
114
- **Carrying uncommitted work** — `git checkout` carries clean (non-conflicting) working-tree changes to the new branch automatically. This is intended: changes made on `development` while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
114
+ **Carrying uncommitted work** — `git checkout` carries clean (non-conflicting) working-tree changes to the new branch automatically. This is intended: changes made on `main` while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
115
115
 
116
116
  **If `git checkout` exits non-zero** (typically "would clobber" because a tracked file has unstaged changes that conflict with target's version): surface the raw git error verbatim, stop, do NOT attempt recovery. The user resolves and re-invokes. This is the only case where `/cbp-task-start` halts on branch state.
117
117
 
@@ -128,7 +128,7 @@ After successful switch:
128
128
 
129
129
  #### 3.5 — Protected-branch sanity (defensive)
130
130
 
131
- After all of the above, `current` should be a feat branch by construction. If somehow it's still in `branch_config.protected` (e.g. TARGET resolved to "development" because a checkpoint has a malformed `branch_name`), THEN block with a hard error citing the bad config — this is a data bug, not a user workflow problem.
131
+ After all of the above, `current` should be a feat branch by construction. If somehow it's still in `branch_config.protected` (e.g. TARGET resolved to "main" because a checkpoint has a malformed `branch_name`), THEN block with a hard error citing the bad config — this is a data bug, not a user workflow problem.
132
132
 
133
133
  ### Step 3b: Clean Slate
134
134
 
@@ -173,18 +173,18 @@ Before activating the task, verify the caller's worktree matches the assigned wo
173
173
 
174
174
  4. If `TARGET_WT IS NULL` or matches, proceed.
175
175
 
176
- ### Step 3.6: Integration Drift Check (optional)
176
+ ### Step 3.6: Main-Drift Check (optional)
177
177
 
178
- Before loading context, check if the feat branch has drifted from integration. Soft-skip on fetch failure (this gate is optional, not mandatory).
178
+ Before loading context, check if the feat branch has drifted from the production branch (main). Soft-skip on fetch failure (this gate is optional, not mandatory).
179
179
 
180
- 1. Run `git fetch origin {INTEGRATION}` where `{INTEGRATION}` comes from `.codebyplan/git.json` `branch_config.integration` (default `development`). If fetch fails (offline, auth), skip this entire step silently and proceed to Step 4.
181
- 2. Compute: `BEHIND=$(git rev-list --count HEAD..origin/{INTEGRATION})`.
180
+ 1. Run `git fetch origin {PRODUCTION}` where `{PRODUCTION}` comes from `.codebyplan/git.json` `branch_config.production` (default `main`). If fetch fails (offline, auth), skip this entire step silently and proceed to Step 4.
181
+ 2. Compute: `BEHIND=$(git rev-list --count HEAD..origin/{PRODUCTION})`.
182
182
  3. Compute `LAST_FETCH_AGE_HOURS` from the mtime of `.git/FETCH_HEAD` relative to now.
183
183
  4. If `BEHIND >= 10` OR `LAST_FETCH_AGE_HOURS > 24`: surface AskUserQuestion:
184
184
 
185
185
  ```
186
- Feat branch is {BEHIND} commits behind origin/{INTEGRATION} (last fetch {AGE}h ago).
187
- Merge integration now before starting the task?
186
+ Feat branch is {BEHIND} commits behind origin/{PRODUCTION} (last fetch {AGE}h ago).
187
+ Merge main now before starting the task?
188
188
  A) Yes — run /cbp-merge-main (recommended)
189
189
  B) No — continue without merging
190
190
  ```
@@ -26,6 +26,18 @@ WORKTREE_ID=$(npx codebyplan resolve-worktree 2>/dev/null)
26
26
 
27
27
  Use MCP `get_next_action` with `repo_id` and `worktree_id` (if present from Step 0).
28
28
 
29
+ ### Step 1.5: Checkpoint Planning Gate
30
+
31
+ Before honoring the command from Step 1, gate on the resolved active/next checkpoint's planning + activation state. This keeps work from starting on a half-baked or un-activated checkpoint. Resolve the checkpoint from the `get_next_action` response context (or MCP `get_current_task`), then load its `plan` + `status` via MCP `get_checkpoints` and its task count via MCP `get_tasks(checkpoint_id)`.
32
+
33
+ Evaluate two rules in order (Rule A wins if both could match):
34
+
35
+ - **RULE A — unplanned**: empty `plan.steps[]` **AND** zero tasks → the checkpoint has not been planned. Suppress the Step-1 command; surface `Now planning CHK-NNN… handing off to /cbp-checkpoint-plan` and auto-trigger `/cbp-checkpoint-plan {NNN}`.
36
+ - **RULE B — planned-but-pending**: has tasks (or non-empty `plan.steps[]`) **BUT** `status === "pending"` (not yet activated) → the checkpoint is planned but not started. Suppress the Step-1 command; surface `Now starting CHK-NNN… handing off to /cbp-checkpoint-start` and auto-trigger `/cbp-checkpoint-start {NNN}` (a planned checkpoint must be started + claimed before task work).
37
+ - **Neither** (planned AND `active`) → fall through to Step 2 unchanged. No regression to the existing flow.
38
+
39
+ Skip this gate when `get_next_action` returns no checkpoint (idle — see Step 3) or the command is `/cbp-session-start`.
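The two rules and the fall-through can be sketched as an ordered check, with Rule A evaluated first so it wins when both could match. The `TodoGate` type and the `planningGate` name are hypothetical; any status other than `"pending"` falls through here, which is an assumption beyond the `active` case named above.

```typescript
type TodoGate = "plan" | "start" | "fall_through";

function planningGate(
  planStepCount: number,
  taskCount: number,
  status: string,
): TodoGate {
  // Rule A: unplanned, so hand off to /cbp-checkpoint-plan.
  if (planStepCount === 0 && taskCount === 0) return "plan";
  // Rule B: planned (tasks or plan steps exist) but not yet activated.
  if (status === "pending") return "start";
  // Neither: planned and active, continue to Step 2 unchanged.
  return "fall_through";
}
```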
40
+
29
41
  ### Step 2: Load Context Based on Command
30
42
 
31
43
  Before triggering the command, load the context it needs. This ensures `/clear` + `/cbp-todo` reliably restores full working context.
@@ -36,6 +48,8 @@ Before triggering the command, load the context it needs. This ensures `/clear`
36
48
  |----------------|-----------------|
37
49
  | `/cbp-session-start` | None — `/cbp-session-start` handles its own loading |
38
50
  | `/cbp-checkpoint-create` | If checkpoint exists in response context: load checkpoint via MCP `get_checkpoints` (filter by number). Display checkpoint title, goal, ideas summary |
51
+ | `/cbp-checkpoint-plan` | Load checkpoint via MCP `get_checkpoints` (filter by number) + `get_tasks(checkpoint_id)`. Display checkpoint title, goal, ideas, existing task count |
52
+ | `/cbp-checkpoint-start` | Load checkpoint via MCP `get_checkpoints` + `get_tasks(checkpoint_id)`. Display checkpoint title, status, claim state, first pending task |
39
53
  | `/cbp-task-start [N]` | Load via MCP `get_current_task`. Display checkpoint title + task title/requirements summary |
40
54
  | `/cbp-round-start` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + round count + last round summary |
41
55
  | `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed approval summary |
@@ -93,5 +107,5 @@ If Step 3 found actionable work, show `instructions` from the get_next_action re
93
107
  ## Integration
94
108
 
95
109
  - **Called by**: `/cbp-session-start`, `/cbp-task-complete`, `/cbp-checkpoint-complete`, manual, after `/clear`
96
- - **Reads**: MCP `get_next_action`, `get_current_task`, `get_rounds`, `get_checkpoints`
97
- - **Triggers**: `command` from response (auto)
110
+ - **Reads**: MCP `get_next_action`, `get_current_task`, `get_rounds`, `get_checkpoints`, `get_tasks`
111
+ - **Triggers**: `command` from response (auto); Step 1.5 gate overrides to `/cbp-checkpoint-plan` (unplanned) or `/cbp-checkpoint-start` (planned-but-pending)