codebyplan 1.13.52 → 1.13.54

This diff shows the contents of publicly released package versions as published to a supported registry. It is provided for informational purposes only and reflects the changes between versions as they appear in their public registries.

Files changed (92)

package/templates/skills/cbp-standalone-task-complete/SKILL.md CHANGED

@@ -25,7 +25,7 @@ Any multi-segment input is an error:
 ```
 standalone-task-complete: argument `{value}` looks like a checkpoint-task pair.
-Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks.
+Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks.
 Standalone tasks use a bare number, e.g. /cbp-standalone-task-complete 45.
 ```
@@ -35,7 +35,7 @@ Error cases: any multi-segment input, `abc`, `108-`, `-1`, anything with whitesp
 - `standalone-task-complete 45` → standalone TASK-45
 - `standalone-task-complete` (no arg) → active in-progress task via `get_current_standalone_task`
-- `standalone-task-complete 141-3` → error: "Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks."
+- `standalone-task-complete 141-3` → error: "Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks."
 - `standalone-task-complete abc` → error: malformed
 ### Step 1.5: Get Current Task
@@ -56,7 +56,7 @@ If any round is `in_progress`:
 ```
 ## Cannot Complete Standalone Task
-Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-round-update` to finish it.
+Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-verify` to finish it.
 ```
 Stop here.
@@ -66,7 +66,7 @@ Verify at least one round has `testing_qa_output` in its context. If not:
 ```
 ## Cannot Complete Standalone Task
-No testing-qa-agent validation found. Run `/cbp-round-start` to execute a validated round.
+No testing-qa-agent validation found. Run `/cbp-round-plan` to execute a validated round.
 ```
 Stop here.
@@ -179,18 +179,17 @@ When `branch_deleted === true` in the ship JSON:
 ### Step 7.5: Complete Standalone Task
-Note: `complete_standalone_task` is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
+Note: completion is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
-Resolve caller worktree: `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`.
+Complete via the CLI (wraps the `complete_standalone_task` MCP tool):
-Call `complete_standalone_task(standalone_task_id, caller_worktree_id: CALLER_WT)`. `caller_worktree_id` is REQUIRED — the MCP server's pre-guard rejects mutations from non-matching worktrees. The server auto-clears `assigned_worktree_id` on the task on success.
+```bash
+codebyplan standalone-task complete --id <standalone_task.id>
+```
-If `CALLER_WT` is empty, surface this warning and ask user to confirm before proceeding:
+The CLI auto-resolves `caller_worktree_id` (override → worktree cache → resolver). `caller_worktree_id` is REQUIRED — the MCP server's pre-guard rejects mutations from non-matching worktrees, and the CLI hard-fails (exit 1) with registration guidance rather than sending an undefined id. The server auto-clears `assigned_worktree_id` on the task on success.
-```
-Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
-The complete_standalone_task call may be rejected by the pre-guard. Proceed anyway? (yes / no)
-```
+If the CLI exits 1 with a "could not resolve caller_worktree_id" message, run `npx codebyplan setup` (or `codebyplan resolve-worktree --cache`) from this worktree, then re-run the command.
 ### Step 8: Run Cleanup + Migration (inline)
@@ -238,6 +237,6 @@ Do NOT use AskUserQuestion for routing. Do NOT use the Skill tool to auto-trigge
 - **Chain**: `/cbp-standalone-task-check` → `/cbp-standalone-task-testing` → `/cbp-standalone-task-complete`
 - **Delegates to**: `codebyplan ship` CLI (Step 7 — PR creation, check polling, merge, branch cleanup)
 - **Reads**: MCP `get_current_standalone_task`, `get_standalone_tasks`, `get_standalone_rounds`
-- **Writes**: MCP `update_standalone_task`, `complete_standalone_task`
+- **Writes**: MCP `update_standalone_task` (Step 6 files); `codebyplan standalone-task complete` (wraps `complete_standalone_task`)
 - **Uses skills (inline, no sub-agent)**: `cleanup` (if deletions), `migration` (if exports renamed)
 - **Does NOT** auto-trigger next skill — emits single directive only
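
The argument rules in the hunks above (bare number, no argument, multi-segment error, malformed) can be sketched as a small POSIX shell classifier. This is only an illustration under stated assumptions: `classify_standalone_arg` and its output labels are hypothetical names, not part of the skill or the `codebyplan` CLI.

```shell
# Hypothetical helper: classify the argument given to /cbp-standalone-task-complete.
# Prints one of: active | task:<N> | checkpoint-pair | malformed
classify_standalone_arg() {
  arg="${1-}"

  # No argument: the skill falls back to the active in-progress task.
  if [ -z "$arg" ]; then
    echo "active"
    return 0
  fi

  case "$arg" in
    # Any character other than a digit or hyphen, a leading or trailing
    # hyphen, or an empty segment: abc, 108-, -1, "4 5", 108--1.
    *[!0-9-]*|-*|*-|*--*) echo "malformed" ;;
    # Digits separated by hyphens, e.g. 141-3: the skill reports an error
    # and points the user at /cbp-finalize {chk}-{task}.
    *-*) echo "checkpoint-pair" ;;
    # Digits only, e.g. 45 -> standalone TASK-45.
    *) echo "task:$arg" ;;
  esac
}

classify_standalone_arg 45      # prints "task:45"
classify_standalone_arg 141-3   # prints "checkpoint-pair"
classify_standalone_arg "108-"  # prints "malformed"
```

Only glob patterns are used, so the sketch does not depend on bash-specific `[[ =~ ]]` matching.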

package/templates/skills/cbp-standalone-task-create/SKILL.md CHANGED

@@ -17,7 +17,7 @@ Create a new standalone task — independent of any checkpoint. Gathers user con
 ## Identifier Notation
-Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
+Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
 ## Instructions
@@ -96,13 +96,20 @@ Resolve worktree_id via `npx codebyplan resolve-worktree 2>/dev/null`.
 ### Step 7: Create Standalone Task
-Use MCP `create_standalone_task` with:
-- **repo_id**: from `.codebyplan/repo.json`
-- **title**: Concise task title
-- **number**: next available number from Step 3
-- **requirements**: Numbered requirements list
-- **context**: Include decisions from Q&A and source findings
-- **assigned_worktree_id**: from Step 6 (if resolved)
+Create via the CLI (wraps the `create_standalone_task` MCP tool; auto-resolves `caller_worktree_id`):
+```bash
+codebyplan standalone-task create \
+  --title "<concise task title>" \
+  --number <next number from Step 3> \
+  --requirements "<numbered requirements list>" \
+  --context '<JSON: decisions from Q&A + source findings>' \
+  --assigned-worktree-id <from Step 6, if resolved>
+```
+- `--repo-id` is optional — the CLI reads it from `.codebyplan/repo.json`.
+- Omit `--assigned-worktree-id` when Step 6 did not resolve a worktree.
+- On success the CLI prints the created row JSON (including `.id`) to stdout.
 ```
 ## Standalone Task Created
@@ -145,6 +152,6 @@ Waiting for user to decide next step.
 ## Integration
 - **Reads**: MCP `get_standalone_tasks`
-- **Writes**: MCP `create_standalone_task`
+- **Writes**: `codebyplan standalone-task create` (wraps `create_standalone_task` MCP tool)
 - **Triggered by**: user manual
 - **Does NOT auto-trigger** next command — user decides

package/templates/skills/cbp-standalone-task-start/SKILL.md CHANGED

@@ -149,13 +149,17 @@ Load context from DB:
 ### Step 5: Set Task Status
-Use MCP `update_standalone_task(task_id, status: "in_progress")`.
+Set status via the CLI (wraps `update_standalone_task`; auto-resolves `caller_worktree_id`):
-If `CALLER_WT` is present, include `caller_worktree_id: CALLER_WT`.
+```bash
+codebyplan standalone-task update --id <task.id> --status in_progress
+```
+`--id` is the standalone task UUID resolved in Step 2. The CLI resolves `caller_worktree_id` itself (override → worktree cache → resolver), so `CALLER_WT` does not need to be passed.
 ### Step 6: Auto-trigger Round Start
-Trigger `/cbp-round-start` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
+Trigger `/cbp-round-plan` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
 ```
 Starting first round...
@@ -164,5 +168,5 @@ Starting first round...
 ## Integration
 - **Reads**: MCP `get_standalone_tasks`, `get_current_standalone_task`, `get_standalone_rounds`
-- **Writes**: MCP `update_standalone_task`
-- **Triggers**: `/cbp-round-start` (no argument — auto, round 1)
+- **Writes**: `codebyplan standalone-task update` (Step 5 status); MCP `update_standalone_task` (Step 3.4 branch_name persist)
+- **Triggers**: `/cbp-round-plan` (no argument — auto, round 1)

package/templates/skills/cbp-standalone-task-testing/SKILL.md CHANGED

@@ -19,7 +19,7 @@ Comprehensive task-level testing for standalone tasks — the **cross-round doub
 ## Scope vs Round-Level Validation
-Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
+Per-wave `testing-qa-agent` runs inside `/cbp-round-build` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
 ## Instructions
@@ -34,7 +34,7 @@ Any multi-segment input is an error:
 ```
 standalone-task-testing: argument `{value}` looks like a checkpoint-task pair.
-Use /cbp-task-testing {chk}-{task} for checkpoint-bound tasks.
+Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.
 Standalone tasks use a bare number, e.g. /cbp-standalone-task-testing 45.
 ```
@@ -57,7 +57,7 @@ Use MCP `get_standalone_rounds(standalone_task_id)`. Verify all rounds are `comp
 ## Cannot Run Standalone Task Testing
 Standalone TASK-[N] has an active round (Round [N]). Complete it first:
-- Run `/cbp-round-update` to finish the round
+- Run `/cbp-verify` to finish the round
 ```
 Stop.
@@ -86,13 +86,22 @@ Read every non-deleted file in the aggregated list. Build a mental model of the
 Capture stdout and stderr for each check.
+**ci.json command resolution (absent-fallback safe):** Before running the checks below, resolve commands from `.codebyplan/ci.json`:
+```bash
+CI_TYPES_CMD=$(npx codebyplan ci resolve typecheck 2>/dev/null)
+CI_UNIT_CMD=$(npx codebyplan ci resolve unit_test 2>/dev/null)
+```
+Fallback: if `.codebyplan/ci.json` is absent, `ci resolve` returns the central default (exit 0). If the binary is unavailable, the variable is empty and the `${CI_*_CMD:-<literal>}` guards in the table below activate the hardcoded fallback.
 **Hard-fail tests** (block completion):
 | Category | Command | Condition |
 |----------|---------|-----------|
 | Full-repo lint | `pnpm -w lint` | Always |
-| Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
-| Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
+| Full-repo types | `${CI_TYPES_CMD:-pnpm exec tsc --noEmit}` | Source files changed |
+| Full-repo unit tests | `${CI_UNIT_CMD:-pnpm test --run}` | Source files in aggregated_files |
 | Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
 These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here.
@@ -176,11 +185,11 @@ Next: /cbp-standalone-task-complete {N}
 ---
 **Next:**
-Run `/cbp-round-input` to address the minor issues found during testing.
+Run `/cbp-round-plan` to address the minor issues found during testing.
 ---
-Waiting for user to run `/cbp-round-input`.
+Waiting for user to run `/cbp-round-plan`.
 **Major problems found:**
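
The `${CI_*_CMD:-<literal>}` guards added in this file rely on standard shell parameter expansion: an unset or empty variable expands to the literal after `:-`. A minimal sketch of that behaviour follows. `resolve_or_default` is a hypothetical helper and `vitest run` is an assumed example value; neither comes from the package.

```shell
# Hypothetical helper: return the resolved command if non-empty, else the fallback.
resolve_or_default() {
  resolved="$1"
  fallback="$2"
  echo "${resolved:-$fallback}"
}

# Simulate the case where the binary is unavailable: the command
# substitution in the skill would leave the variable empty.
CI_TYPES_CMD=""
echo "types: ${CI_TYPES_CMD:-pnpm exec tsc --noEmit}"   # prints "types: pnpm exec tsc --noEmit"

# Simulate a resolved command (assumed example value).
CI_UNIT_CMD="vitest run"
echo "unit: ${CI_UNIT_CMD:-pnpm test --run}"            # prints "unit: vitest run"
```

Because `:-` treats an empty string the same as an unset variable, a failed `ci resolve` call that prints nothing still falls through to the hardcoded command.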

package/templates/skills/cbp-task-create/SKILL.md CHANGED

@@ -10,13 +10,12 @@ Create a new task within the active checkpoint. Gathers user context, analyzes e
 ## When Used
-- Suggested by `/cbp-task-check` when scope issues require a new task
-- Suggested by `/cbp-task-testing` when major problems need a separate task
+- Suggested by `/cbp-verify` (task scope) when scope issues or major problems require a separate task
 - User manually wants to add a task to the current checkpoint
 ## Identifier Notation
-This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
+This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
 **Bare-number argument**: if a bare number (e.g. `42`) is provided with no checkpoint context, this skill cannot resolve it to a checkpoint-bound task:
@@ -44,8 +43,8 @@ Use AskUserQuestion to understand the new task:
 Why is this task needed? What should it accomplish?
-If this was triggered by `/cbp-task-check` or `/cbp-task-testing`, the findings are:
-[pre-loaded context from check/testing findings if available]
+If this was triggered by `/cbp-verify` (task scope), the findings are:
+[pre-loaded context from verify findings if available]
 Please describe:
 1. What the task should accomplish
@@ -70,7 +69,7 @@ Discovered issues MUST be captured. The default target is current scope (round
 | Situation | Action |
 |-----------|--------|
-| Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-round-end` reference `findings-presentation.md` "Trivial-Resolution Exception" |
+| Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-verify` reference `findings-presentation.md` "Trivial-Resolution Exception" |
 | Related to the current task's domain | Create a new ROUND in the current task |
 | Fits the current checkpoint goal but is meaningfully separate | Create a new TASK in the current checkpoint via `create_task(checkpoint_id)` |
 | Large enough to need multiple tasks AND fits no current checkpoint | Create a NEW CHECKPOINT via `create_checkpoint` |
@@ -193,5 +192,5 @@ Waiting for user to decide next step.
 - **Reads**: Local state `.codebyplan/state/checkpoints/<id>.json` + `.../tasks/<id>.json`; on miss `npx codebyplan sync` once; MCP `get_current_task` / `get_tasks` as documented break-glass when the state dir is absent and sync fails. Step 3.5 dedup `get_tasks(standalone=true)` stays MCP — no local-state equivalent for standalone listing.
 - **Writes**: `codebyplan task create --checkpoint-id <id> ...` (CLI write-through); MCP `create_task` break-glass.
-- **Triggered by**: `/cbp-task-check` (suggested), `/cbp-task-testing` (suggested), user manual
+- **Triggered by**: `/cbp-verify` (task scope, suggested), user manual
 - **Does NOT auto-trigger** next command — user decides

package/templates/skills/cbp-task-start/SKILL.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: cbp-task-start
 description: Start a task, load context from DB
-triggers: [cbp-round-start]
+triggers: [cbp-round-plan]
 argument-hint: [chk-task]  # e.g. `108-1` (CHK-108 TASK-1)
 effort: xhigh
 ---
@@ -14,7 +14,7 @@ Start a task by loading context from the database and preparing for work.
 ### Step 1: Parse `$ARGUMENTS`
-Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
+Parse the argument using the canonical chk-task-round notation (see `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
 | Shape | Regex | Resolves to |
 |-------|-------|-------------|
@@ -30,7 +30,7 @@ task-start: invalid argument `{value}`. Expected:
   (empty) → next pending task
 For standalone tasks, use `/cbp-standalone-task-start {N}`.
-For a specific round, use `/cbp-round-start 108-1-2`.
+For a specific round, use `/cbp-round-plan 108-1-2`.
 ```
 Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108--1`, anything with whitespace or non-numeric characters.
@@ -40,7 +40,7 @@ Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108-
 - `task-start 108-1` → CHK-108 TASK-1
 - `task-start` (no arg) → next pending via `get_current_task`
 - `task-start 45` → error: "Use /cbp-standalone-task-start 45 instead — bare numbers no longer route to standalone tasks."
-- `task-start 108-1-2` → error: "use `/cbp-round-start 108-1-2`"
+- `task-start 108-1-2` → error: "use `/cbp-round-plan 108-1-2`"
 - `task-start abc` → error: malformed
 - `task-start 108-` → error: malformed
@@ -75,7 +75,7 @@ Ask via AskUserQuestion, naming the resolved task and disclosing the actions:
 > - **Cancel** — do nothing
 - **Proceed** → continue to Step 3.
-- **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-start` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
+- **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-plan` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
 ### Step 3: Branch Auto-Handling
@@ -221,17 +221,17 @@ Display context summary:
 ### Step 6: Auto-trigger Round Start
-The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-start` for the first round.
+The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-plan` for the first round.
 ```
 Starting first round...
 ```
-Trigger `/cbp-round-start` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
+Trigger `/cbp-round-plan` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
 ## Integration
 - **Gates**: Step 2.5 permission gate — asks the user to confirm before any side effect; **Cancel** aborts cleanly with no writes. Fires on every invocation (manual, auto-trigger, auto-loop).
 - **Reads**: `.codebyplan/state/checkpoints/*.json`, `checkpoints/<id>/tasks/*.json`, `checkpoints/<id>/tasks/<id>/rounds/*.json`, `todos.json` (local-first; `npx codebyplan sync` on miss; MCP `get_current_task`/`get_tasks`/`get_rounds` break-glass)
 - **Writes**: `codebyplan task update` (CLI write-through; MCP `update_task` break-glass)
-- **Triggers**: `/cbp-round-start` (auto, round 1, no argument)
+- **Triggers**: `/cbp-round-plan` (auto, round 1, no argument)
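
The Step 1 examples in this file (`108-1`, no argument, `45`, `108-1-2`, `abc`, `108-`) suggest a segment-count dispatch. The regex table itself is elided in the diff, so the sketch below is an assumption built only from those examples. `classify_task_start_arg` and its output labels are hypothetical, and treating four or more segments as malformed is a guess.

```shell
# Hypothetical helper: classify the argument given to /cbp-task-start.
classify_task_start_arg() {
  arg="${1-}"
  case "$arg" in
    # No argument: next pending task.
    '') echo "next-pending" ;;
    # Non-numeric characters, whitespace, leading or trailing hyphen, empty segment.
    *[!0-9-]*|-*|*-|*--*) echo "malformed" ;;
    # Four or more segments: not covered by the source, assumed malformed.
    *-*-*-*) echo "malformed" ;;
    # Three segments is the round shape.
    *-*-*) echo "error: use /cbp-round-plan $arg" ;;
    # Two segments: CHK-TASK, the only accepted explicit shape.
    *-*) echo "checkpoint-task:$arg" ;;
    # Bare number: standalone tasks have their own skill.
    *) echo "error: use /cbp-standalone-task-start $arg" ;;
  esac
}

classify_task_start_arg 108-1     # prints "checkpoint-task:108-1"
classify_task_start_arg 108-1-2   # prints "error: use /cbp-round-plan 108-1-2"
classify_task_start_arg 45        # prints "error: use /cbp-standalone-task-start 45"
```

The branch order matters: the malformed patterns run first, so the later segment-count patterns only ever see digits separated by single hyphens.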

package/templates/skills/cbp-todo/SKILL.md CHANGED

@@ -131,19 +131,17 @@ Once the gates pass, load the context the head command needs. This ensures `/cle
 | `/cbp-checkpoint-plan` | Load checkpoint from `.codebyplan/state/checkpoints/<id>.json` + task files under `checkpoints/<id>/tasks/` (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, goal, ideas, existing task count |
 | `/cbp-checkpoint-start` | Load checkpoint + task files from local state (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, status, claim state, first pending task |
 | `/cbp-task-start [N]` | Load from `.codebyplan/state/session/current.json` (fallback MCP `get_current_task`). Display checkpoint title + task title/requirements summary |
-| `/cbp-round-start` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
-| `/cbp-round-update` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
-| `/cbp-round-input` | **Full context load** (see Step 2b) |
-| `/cbp-task-check` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task + files summary |
-| `/cbp-task-testing` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + testing status summary |
+| `/cbp-round-plan` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
+| `/cbp-round-plan` | **Full context load** (see Step 2b) |
+| `/cbp-verify` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
 | `/cbp-task-create` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task list summary |
-| `/cbp-task-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
+| `/cbp-finalize` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
 | `/cbp-checkpoint-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint summary |
 | *(no command / idle)* | See Step 3 — suggest `/cbp-session-end` |
 **For any unrecognized command:** Load from local state session (fallback MCP `get_current_task`) as a safe default. Display whatever context is available.
-### Step 2b: Full Context Load (for `/cbp-round-input`)
+### Step 2b: Full Context Load (for `/cbp-round-plan`)
 This is the most context-dependent command. Load everything:
@@ -190,7 +188,7 @@ Reached only when the Step 1.5 ownership gate allowed routing to continue, the S
 ## Integration
-- **Called by**: `/cbp-session-start`, `/cbp-task-complete`, `/cbp-checkpoint-complete`, manual, after `/clear`
+- **Called by**: `/cbp-session-start`, `/cbp-finalize`, `/cbp-checkpoint-complete`, manual, after `/clear`
 - **Resolves**: `npx codebyplan resolve-worktree --json` (worktree id + distress signal), `npx codebyplan whoami --json` (user id)
 - **Reads**: `.codebyplan/state/todos.json`, `session/current.json`, `checkpoints/<id>.json`, `checkpoints/<id>/tasks/<id>.json`, `checkpoints/<id>/tasks/<id>/rounds/<id>.json`, `worktrees.json`. If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass: MCP `get_todos`, `get_current_task`, `get_rounds`, `get_checkpoints`, `get_tasks` when state dir absent and sync fails. `get_worktrees` stays MCP (display-only ownership-block path; no CLI verb).
 - **Triggers**: `rows[0].command` (auto, after the Step 1.5 ownership gate and Step 1.55 stale-entity guard pass, and the Step 1.6 planning gate falls through); Step 1.55 overrides to STOP (stale completed/cancelled entity); Step 1.6 overrides to `/cbp-checkpoint-plan` (unplanned) or `/cbp-checkpoint-start` (planned-but-pending)

package/templates/skills/cbp-verify/SKILL.md ADDED

@@ -0,0 +1,146 @@
+---
+name: cbp-verify
+description: Unified verify stage — deterministic gates, real-execution proof, and a fresh-context diff review at round or task scope. Auto-triggered by cbp-round-build; escalates to task scope on the last clean round.
+argument-hint: [chk-task[-round] | task[-round]]
+triggers: [cbp-round-plan, cbp-round-complete, cbp-finalize]
+effort: xhigh
+---
+# Verify Command
+The single verify stage for the execution half. Collapses automated checks, finished-round
+triage, AI production review, and comprehensive task-level testing into one scope-aware skill.
+The deterministic spine lives in the CLI (`codebyplan check`, `codebyplan e2e verify-round`);
+this skill orchestrates the gates, proves execution, spawns ONE fresh-context reviewer, and
+routes on a single directive.
+Auto-triggered by `/cbp-round-build` after execution. The human gate is NOT here — it is the
+separate `ask`-tier `cbp-round-complete` (round) and the one batched walkthrough in Phase 6
+(task). This skill is model-invocable on purpose.
+## Scope & Kind
+- **SCOPE** (`round` | `task`) — auto-detected: a 3-segment `{chk}-{task}-{round}` (or 2-segment
+  `{task}-{round}` standalone) argument, or an auto-trigger from `cbp-round-build`, is
+  `scope=round`. A 2-segment `{chk}-{task}` (or bare `{task}` standalone) argument, or the Phase 5
+  escalation, is `scope=task`.
+- **KIND** (`checkpoint` | `standalone`) — detected ONCE at the top from identifier shape
+  (3-segment / 2-segment-chk = checkpoint; 2-segment / bare = standalone). KIND selects MCP tool
+  names per the table in `reference/deterministic-gates.md`.
+All reads are local-state-first (`.codebyplan/state/**`); on miss run `npx codebyplan sync` once
+and re-read; MCP `get_*` is the break-glass fallback. All writes go through `codebyplan ... update`
+(CLI write-through), MCP break-glass.
+## HARD GATES — non-negotiable
+- The deterministic-gate JSON **is** the verdict — never narrate "I verified the build". (Phase 2)
+- Empty execution proof on a UI-touching diff = GATE FAILURE. (Phase 3, `rules/execution-proof.md`)
+- Reviewer spawn failure = HARD GATE FAILURE → STOP + retry directive; NEVER self-review inline.
+  (Phase 4, `rules/spawn-failure-is-gate-failure.md`)
+- `gate6` is always hard (never baselined); baseline regressions are a user-accept gate, never
+  auto-accepted. (`rules/two-tier-ci.md`)
+## Phase Skeleton
+### PHASE 1 — RESOLVE
+Parse `$ARGUMENTS` (notation per `cbp-round-plan` identifier vocabulary). Detect SCOPE and KIND
+(above). Resolve the active round/task from local state. If `scope=round` and no in-progress
+round → `No active round. Run /cbp-round-plan first.` and STOP. If `scope=task` and any round is
+still `in_progress` → STOP with "complete the active round first". Full resolution + KIND tool
+table: `reference/deterministic-gates.md`.
+### PHASE 2 — DETERMINISTIC GATES
+Run the unified matrix and capture the JSON:
+```bash
+codebyplan check --scope <round|task> --json
+```
+The JSON `{ results[], any_failed, hard_fail_checks[], no_baseline }` IS the verdict — record
+each result's `check`, `status`, `exit_code`, `new_failures[]`. `gate6` is ALWAYS hard;
+`lint`/`typecheck`/`tests`/`audit` fail only on NEW per-package failures vs the committed
+`.check-baseline.json` (baseline-tolerant soft tier, `rules/two-tier-ci.md`). `any_failed === true`
+(equivalently `hard_fail_checks.length > 0`) → carry into the Phase 5 verdict as a fail. Exact
+contract + the `claude_only` carve-out (deterministic-only path, no agent): see
+`reference/deterministic-gates.md`.
+### PHASE 3 — REAL EXECUTION PROOF
+Produce the committed proof for every tier the diff touches (`rules/execution-proof.md`):
+- **Tier 1** (configured e2e framework whose `app` source changed) — persist `e2e_eligible` /
+  `e2e_outputs` to round context, then:
+  ```bash
+  codebyplan e2e verify-round --round-id <round_id> --task-id <task_id>
+  ```
+  Exit 0 = pass; exit 1 → surface `result.failed_checks[]` (`e2e_eligible_skipped` /
+  `zero_assertion_run` / `empty_gallery`) verbatim and carry as a fail.
+- **Tier 2/3/4** — dev-server screenshot / HTTP trace / command log per the rule.
+**Empty proof on a UI diff = GATE FAILURE.** Verify each screenshot/trace is committed with
+`git ls-files --error-unmatch <path>`. Write the `verify_manifest` (gates + proof, schema in
+`rules/execution-proof.md`). Per-scope detail: `reference/round-scope.md`, `reference/task-scope.md`.
+### PHASE 4 — FRESH-CONTEXT DIFF REVIEW
+Spawn `cbp-verify-reviewer` with `scope` (round → round diff; task → full task diff) and the
+Input Contract from `agents/cbp-verify-reviewer.md`. **SPAWN FAILURE = HARD GATE FAILURE** → STOP
+and surface the retry directive (`rules/spawn-failure-is-gate-failure.md`); record
+`<scope>.context.verify.spawn_failure`; do NOT walk the reviewer's phases inline. A returned
+`NOT_READY` is a successful review — act on it, do not retry.
+Triage the returned findings: in-scope mechanical fixes the orchestrator applies itself
+(`Edit`/`Write`); blocking out-of-scope findings → `/cbp-round-plan` fix round. A baseline
+regression is a **blocking user-accept gate** — never auto-accepted.
+### PHASE 5 — VERDICT + ROUTE (single directive, never an A/B/C menu)
+Combine Phase 2 + 3 + 4. Route on one directive (`feedback-close-out-routing.md`):
+| Result | Route |
+|--------|-------|
+| Any gate/proof/review fail | `Next: /cbp-round-plan` (open a fix round) |
+| Pass + more work wanted | `Next: /cbp-round-plan` (another round) |
+| Pass + LAST round + clean (scope=round) | escalate to `scope=task` → re-enter at Phase 1 |
+| Pass (scope=task) | proceed to Phase 6 finalize |
+### PHASE 6 — FINALIZE
+- **scope=round** — route to the human git-add gate: `Next: /cbp-round-complete`
+  (`ask`-tier; reconciles `sync-approvals` + `complete_round`). cbp-verify does NOT stage files
+  or complete the round. Detail: `reference/round-scope.md`.
+- **scope=task** — whole-repo `codebyplan check --scope task`, holistic `cbp-verify-reviewer`
+  (scope=task) already run in Phase 4, then the ONE genuine human step: a single batched
+  `AskUserQuestion` walkthrough (all user-testable items in one prompt, never one-per-question).
+  On satisfaction, write `task.context.verify_verdict = { verdict: 'READY', manifest, decided_at }`
+  and route `Next: /cbp-finalize`. Detail: `reference/task-scope.md`.
+## Key Rules
+- The JSON verdict from `codebyplan check` / `e2e verify-round` is authoritative — no prose
+  substitution.
+- Reviewer spawn failure STOPS the skill (retry directive); never self-certify inline.
+- Empty proof on a UI diff fails verify; screenshots must be committed.
+- Claude NEVER `git add`s — staging is the user's approval signal at `cbp-round-complete`.
+- Single-directive routing only — never an A/B/C menu.
+- `claude_only` profile is the deterministic-only carve-out (no reviewer spawn expected).
+## Integration
+- **Triggered by**: `/cbp-round-build` (auto, scope=round after execution); self-escalates to
+  scope=task on the last clean round.
+- **Reads**: `.codebyplan/state/**` (local-first; `npx codebyplan sync` on miss; MCP `get_*`
+  break-glass); changed files + git diff via the reviewer.
+- **Writes**: `codebyplan round update` / `codebyplan task update` (CLI write-through; MCP
+  `update_round` / `update_task` break-glass) — `verify_manifest`, `verify_verdict`.
+- **Spawns**: `cbp-verify-reviewer` (scope param); the `cbp-e2e-*` specialists feed Tier-1 proof
+  upstream in `cbp-round-build`.
+- **Triggers**: `/cbp-round-plan` (any fail or more-work), `/cbp-round-complete` (scope=round
+  finalize), `/cbp-finalize` (scope=task READY).
+- **References**: `reference/round-scope.md`, `reference/task-scope.md`,
+  `reference/deterministic-gates.md`.

package/templates/skills/cbp-verify/reference/deterministic-gates.md ADDED

@@ -0,0 +1,114 @@
+# Deterministic Gates — Command Contracts & Manifest
+Authoritative gate-command + manifest detail for `cbp-verify`. The SKILL.md phases point here;
+this file is loaded on demand.
+## KIND tool table
+KIND is detected once at SKILL Phase 1 from the identifier shape. MCP tool names differ by KIND;
+all writes prefer the CLI write-through and fall back to MCP.
+| Operation | `checkpoint` KIND | `standalone` KIND |
+|-----------|------------------|-------------------|
+| Get task | local state (break-glass `get_current_task`) | `get_current_standalone_task(repo_id)` |
+| Get rounds | local state (break-glass `get_rounds`) | `get_standalone_rounds(standalone_task_id)` |
+| Update round | `codebyplan round update` (MCP `update_round`) | MCP `update_standalone_round` |
+| Update task | `codebyplan task update` (MCP `update_task`) | MCP `update_standalone_task` |
+Empty-arg KIND detection: probe `get_current_standalone_task` first; if a task is found → `standalone`;
+else `checkpoint` via `get_current_task`. (This probe cannot avoid MCP: with no identifier yet,
+there is no local path to read. Everything after detection is local-first.)
+## Phase 1 resolution detail
+| Parse | Resolution |
+|-------|-----------|
+| `{chk}-{task}-{round}` | checkpoint round. Read `.codebyplan/state/checkpoints/*.json` → filter `number==={chk}`; `.../tasks/*.json` → `{task}`; `.../rounds/*.json` → `{round}`. |
+| `{chk}-{task}` | checkpoint task (scope=task). Resolve checkpoint + task; verify all rounds `completed`. |
+| `{task}-{round}` | standalone round (scope=round). |
+| `{task}` (bare) | standalone task (scope=task). |
+| _(empty)_ | the active in-progress task/round from `.codebyplan/state/todos.json`. |
+On any miss: `npx codebyplan sync` once, re-read; MCP `get_*` break-glass only when the state dir
+is absent AND sync fails.
+## Phase 2 — `codebyplan check`
+```bash
+codebyplan check --scope <round|task> --json
+```
+JSON shape (`RunCheckResult`, source `packages/codebyplan-package/src/lib/check.ts:185`):
+```jsonc
+{
+  "results": [
+    { "check": "gate6|lint|typecheck|tests|audit",
+      "status": "pass|fail|skipped",
+      "exit_code": 0,
+      "command": "...",
+      "stdout": "...", "stderr": "...",
+      "executed": true,
+      "new_failures": ["@scope/pkg", "GHSA-xxxx"] }  // omitted for gate6
+  ],
+  "any_failed": false,
+  "hard_fail_checks": [],          // names of checks that failed post-baseline-diff
+  "no_baseline": false
+}
+```
+- **`gate6`** (sibling-identity parity) is ALWAYS hard — never baselined, no `new_failures` field.
+- `lint` / `typecheck` / `tests` / `audit` are **baseline-diffed**: `status: 'pass'` when
+  `new_failures` is `[]` even if the underlying command exited non-zero (pre-existing red is
+  tolerated). `audit.new_failures` lists new GHSA ids not in the allowlist.
+- Verdict: `any_failed === true` (≡ `hard_fail_checks.length > 0`) is a fail — surface each failing
+  result's `new_failures` / `stdout` / `stderr`. **This JSON is the verdict; never substitute prose.**
+- The soft tier never passes `--no-baseline`. The whole-repo absolute-green tier
+  (`--scope merged --no-baseline`) belongs to checkpoint close, not this skill
+  (`rules/two-tier-ci.md`).
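A minimal sketch of how a caller might turn the `codebyplan check --json` output into a verdict, assuming the JSON shape shown above. The type and function names (`CheckResult`, `CheckSummary`, `summarizeCheck`) are hypothetical and are not taken from `check.ts`; only the field names come from the documented shape.

```typescript
// Sketch only: field names follow the documented JSON shape; all type and
// function names here are hypothetical, not the package's own.
type CheckStatus = "pass" | "fail" | "skipped";

interface CheckResult {
  check: string;
  status: CheckStatus;
  exit_code: number;
  command: string;
  stdout: string;
  stderr: string;
  executed: boolean;
  new_failures?: string[]; // omitted for gate6
}

interface RunCheckResult {
  results: CheckResult[];
  any_failed: boolean;
  hard_fail_checks: string[];
  no_baseline: boolean;
}

interface CheckSummary {
  verdict: "pass" | "fail";
  failing: { check: string; new_failures: string[]; stdout: string; stderr: string }[];
}

function summarizeCheck(rawJson: string): CheckSummary {
  const parsed = JSON.parse(rawJson) as RunCheckResult;

  // The JSON flag is the verdict; exit codes of individual commands are not
  // consulted, because baseline-diffed checks may pass with a non-zero exit.
  if (!parsed.any_failed) {
    return { verdict: "pass", failing: [] };
  }

  // Surface the evidence for each failing result.
  const failing = parsed.results
    .filter((r) => r.status === "fail")
    .map((r) => ({
      check: r.check,
      new_failures: r.new_failures ?? [], // gate6 carries no new_failures
      stdout: r.stdout,
      stderr: r.stderr,
    }));

  return { verdict: "fail", failing };
}
```

The sketch reads `any_failed` rather than re-deriving a verdict from `exit_code`, which mirrors the rule that pre-existing red is tolerated for baseline-diffed checks.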
+## Phase 3 — `codebyplan e2e verify-round`
+```bash
+codebyplan e2e verify-round --round-id <uuid> --task-id <uuid>
+```
+Persist `round.context.e2e_eligible[]` + `e2e_outputs{}` FIRST (the CLI reads the round row from
+the DB). Verdict JSON (`VerifyRoundResult`, source `packages/codebyplan-package/src/lib/e2e.ts:127`):
+```jsonc
+{ "round_id": "...", "task_id": "...",
+  "result": { "pass": true, "failed_checks": [], "skipped_validly": [] } }
+```
+Exit 0 = pass. Exit 1 → one or more of `e2e_eligible_skipped` / `zero_assertion_run` /
+`empty_gallery` in `result.failed_checks[]` — surface verbatim, carry as a fail, route to a fix
+round (`rules/e2e-mandatory.md`). When `e2e_eligible[]` is empty, skip the call — nothing to verify.
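The Phase 3 decision can be sketched as follows, under two assumptions that the text above does not confirm: that the verdict JSON arrives on stdout, and that the caller supplies a `runCli` callback that invokes `codebyplan e2e verify-round`. The names `CliRun`, `E2eOutcome`, and `verifyRoundOutcome` are hypothetical.

```typescript
// Sketch only: the result fields follow the documented VerifyRoundResult
// shape; the callback, type names, and stdout assumption are hypothetical.
interface VerifyRoundResult {
  round_id: string;
  task_id: string;
  result: { pass: boolean; failed_checks: string[]; skipped_validly: string[] };
}

interface CliRun {
  exitCode: number;
  stdout: string; // assumed to carry the verdict JSON
}

type E2eOutcome =
  | { kind: "skipped" }
  | { kind: "pass" }
  | { kind: "fail"; failedChecks: string[] };

function verifyRoundOutcome(
  e2eEligible: readonly unknown[],
  runCli: () => CliRun,
): E2eOutcome {
  // Nothing eligible: skip the call entirely, there is nothing to verify.
  if (e2eEligible.length === 0) {
    return { kind: "skipped" };
  }

  const { exitCode, stdout } = runCli();

  // Exit 0 = pass.
  if (exitCode === 0) {
    return { kind: "pass" };
  }

  // Non-zero exit: carry failed_checks verbatim so the fix round sees them.
  const parsed = JSON.parse(stdout) as VerifyRoundResult;
  return { kind: "fail", failedChecks: parsed.result.failed_checks };
}
```

Taking the CLI call as a callback keeps the skip rule visible in the code: when `e2e_eligible[]` is empty, the callback is never invoked.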
+## `claude_only` carve-out (deterministic-only path)
+When the resolved profile is `claude_only` (round touched only `.claude/**` / docs / config — no
+app surface), there is **no reviewer to spawn by design**. Proof IS the deterministic set:
+1. `codebyplan check --scope <round|task> --json` (gate6 + matrix as above).
+2. `bash -n <hook>` for each touched `.sh` file.
+3. SKILL/agent/rule structure sanity for touched `.claude/` files (line counts, no `/cbp-*`
+   legacy notation).
+This is a first-class verification path, NOT a banned inline fallback
+(`rules/spawn-failure-is-gate-failure.md` carve-out) — Phase 4's reviewer spawn is skipped, and
+that skip is recorded as `verify_manifest.proof.tier: 4`, not a spawn failure.
+## verify-manifest write
+Write the manifest into round/task context (merge into existing context — the `update_*`
+REPLACE contract requires re-sending the full object):
+```bash
+codebyplan round update --id <round_id> --task-id <uuid> --checkpoint-id <uuid> --context '<json>'
+# break-glass: MCP update_round / update_standalone_round
+```
+Schema (canonical in `rules/execution-proof.md`): `verify_manifest = { scope, gates[], proof{ tier,
+artifacts[], e2e_verify_round }, decided_at }`. Each `proof.artifacts[].path` is proven committed
+via `git ls-files --error-unmatch <path>` before it counts.
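A rough sketch of the two manifest rules above: filtering `proof.artifacts[]` through `git ls-files --error-unmatch`, and merging the manifest into the existing context so the full object is re-sent. The helper names (`trackedArtifacts`, `mergeContext`, `Runner`) are hypothetical, and the context is treated as a plain object for illustration. Note that `git ls-files --error-unmatch` exits non-zero when a path is not in the index, so this check establishes that a path is tracked.

```typescript
import { execFileSync } from "node:child_process";

// Sketch only: helper names are hypothetical; the git invocation is the one
// named in the text above.
type Runner = (path: string) => void; // must throw when git exits non-zero

const gitLsFiles: Runner = (path) => {
  // execFileSync throws if the command exits with a non-zero status.
  execFileSync("git", ["ls-files", "--error-unmatch", "--", path], { stdio: "ignore" });
};

interface Artifact {
  path: string;
}

// Keep only artifacts whose path git knows about; the rest do not count.
function trackedArtifacts(artifacts: Artifact[], run: Runner = gitLsFiles): Artifact[] {
  return artifacts.filter((artifact) => {
    try {
      run(artifact.path);
      return true;
    } catch {
      return false;
    }
  });
}

// Merge into the existing context rather than sending the manifest alone,
// since an update replaces the whole context object.
function mergeContext(
  existing: Record<string, unknown>,
  verifyManifest: Record<string, unknown>,
): Record<string, unknown> {
  return { ...existing, verify_manifest: verifyManifest };
}
```

The `Runner` parameter exists only so the filter can be exercised without a repository; by default it shells out to git.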