npm - codebyplan - Versions diffs - 1.13.53 → 1.13.55 - Mend

codebyplan 1.13.53 → 1.13.55


Files changed (84)

package/templates/skills/cbp-round-plan/SKILL.md ADDED

@@ -0,0 +1,344 @@
+---
+name: cbp-round-plan
+description: Plan a round — round-1 planning AND round-2+ deep-analysis of unapproved work, then spawn cbp-round-planner and auto-trigger /cbp-round-build
+triggers: [cbp-round-build]
+argument-hint: [chk-task-round | task-round | requirements-text]  # e.g. `108-1-2`, `45-2`, or free-text
+effort: xhigh
+---
+## Kind Detection
+Inspect the resolved identifier from argument parsing to determine the task kind:
+| Identifier shape | KIND |
+|-----------------|------|
+| `{task}-{round}` (2-segment, e.g. `45-2`) | `standalone` |
+| `{chk}-{task}-{round}` (3-segment, e.g. `141-3-1`) | `checkpoint` |
+| _(empty / free-text)_ | Check `get_current_standalone_task` first; if found → `standalone`. Else → `checkpoint` via `get_current_task`. |
+Set `KIND` for the rest of this skill. Read/write sources vary by KIND:
+| Operation | `checkpoint` KIND | `standalone` KIND |
+|-----------|------------------|-------------------|
+| Get task | local state (break-glass: `get_current_task`) | `get_current_standalone_task(repo_id)` |
+| Get rounds | local state (break-glass: `get_rounds`) | `get_standalone_rounds(standalone_task_id)` |
+| Add round | `codebyplan round add` (break-glass: `add_round`) | `add_standalone_round(standalone_task_id, ...)` |
+| Update round | `codebyplan round update` (break-glass: `update_round`) | `update_standalone_round(standalone_round_id, ...)` |
+| Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
+| Update task | `codebyplan task update` (break-glass: `update_task`) | `update_standalone_task(standalone_task_id, ...)` |
+> **Note**: The `standalone` KIND column uses MCP tools unchanged — standalone local-first migration is out of scope and will be addressed in a later task.
+# Round Plan Command
+The single **planning entry** for the round cycle. It plans BOTH the first round (from `task.requirements`) AND every follow-up round (deep analysis of the prior round's unapproved work, verify gate/proof failures, and reviewer findings — the round-input deep-analysis role is now folded in here). It analyzes context, spawns `cbp-round-planner` for the plan, then auto-triggers `/cbp-round-build` — the `ask`-tier permission prompt on that skill IS the user's plan approval. NO execution or testing here — those belong to `/cbp-round-build` and `/cbp-verify`.
+## Planner Spawn Failure Is A Gate Failure
+If the `cbp-round-planner` agent spawn fails (`API Error: Extra usage required`, monthly Agent
+usage cap, provider 5xx, rate limit, context overflow at spawn, the agent dying before emitting
+its output contract), that is a **HARD GATE FAILURE** per `rules/spawn-failure-is-gate-failure.md`.
+The orchestrator does **NOT** walk the planner's phases inline and self-certify a plan — doing so
+re-introduces the exact fresh-context blind spot the planner exists to remove. STOP and surface the
+retry directive verbatim:
+```
+## Plan blocked — planner could not spawn
+The round planner (cbp-round-planner) failed to spawn: <class> — <verbatim error>.
+This is a hard gate failure, not a plan. Retry when capacity returns:
+  Next: /cbp-round-plan
+```
+Record `round.context.planner_findings.spawn_failure = { class, error_message, decided_at }` so the
+retry is auditable. Do NOT continue to Step 8/9; no plan means no execution.
+## Pipeline
+```
+/cbp-round-plan (planning) → /cbp-round-build (ask-tier permission = plan approval)
+```
+**Auto-loop mode**: when `auto_loop_mode === true` carries forward from the prior round, the Step 6
+Q&A and the Step 4b mode prompt are skipped, and Step 8's `/cbp-round-build` permission is
+auto-approved. See `cbp-verify` reference `round-scope.md` Phase 5 (the auto-loop decision that
+carries `auto_loop_mode` forward) for the contract.
+## Instructions
+### Step 0: Parse `$ARGUMENTS` shape
+Disambiguate the argument up front. Three input shapes:
+### CHK / TASK / ROUND Identifier Notation Vocabulary
+| Shape | Regex | Meaning |
+|-------|-------|---------|
+| `{chk}-{task}-{round}` (e.g. `108-1-2`) | `^[0-9]+-[0-9]+-[0-9]+$` | **Identifier**: targets round {round} of CHK-{chk} TASK-{task}. Sets `target_checkpoint`, `target_task`, `target_round`. |
+| `{task}-{round}` (e.g. `45-2`) | `^[0-9]+-[0-9]+$` | **Identifier**: targets round {round} of standalone TASK-{task}. Sets `target_task`, `target_round` (no checkpoint). |
+| Non-empty, non-identifier-shaped | — | **Free-text round requirements** (the follow-up-round requirements path). |
+| _(empty)_ | — | No identifier, no requirements text — derive task/round from Kind Detection above. |
+Malformed identifier shapes (`108-`, `-1-2`, `108--1`, `abc-1`) — surface this error and stop, mirroring `/cbp-task-start`'s error vocabulary:
+```
+round-plan: invalid argument `{value}`. Expected:
+  108-1-2  → round 2 of CHK-108 TASK-1
+  45-2     → round 2 of standalone TASK-45
+  (free-text) → round requirements (follow-up round)
+  (empty)  → derive from active task/round state
+```
+#### Worked examples
+- `round-plan 108-1-2` → resolve CHK-108 TASK-1, target round 2
+- `round-plan 45-2` → resolve standalone TASK-45, target round 2
+- `round-plan 108-1` → **standalone TASK-108 round 1** (NOT CHK-108 TASK-1). The 2-segment form means `{task}-{round}`; checkpoint-bound round-targeting requires all three numbers (`108-1-2`). If you got here trying to target CHK-108 TASK-1, you want `108-1-2` (specifying a round number) or `/cbp-task-start 108-1` (specifying just a task).
+- `round-plan "Implement OAuth flow per CHK-X plan"` → free-text path; current-task/round derivation from state
+- `round-plan` (empty) → derive from Kind Detection; round 1 uses `task.requirements`
+- `round-plan 108-` → error: malformed identifier
+- `round-plan -1-2` → error: malformed identifier
+- `round-plan abc-1` → error: malformed identifier
+> **Mental-model warning**: `task-start` and `round-plan` interpret the SAME `108-1` argument differently. `task-start 108-1` → CHK-108 TASK-1 (checkpoint-bound task). `round-plan 108-1` → standalone TASK-108 round 1 (because round-plan needs three numbers for a checkpoint-bound round). When in doubt, write all three: `round-plan 108-1-2`.
+### Step 1: Get Current Task
+If Step 0 produced an identifier (`target_task` set): resolve directly per the identifier (bound or standalone) — use KIND-appropriate sources per the Kind Detection table above.
+Otherwise: use Kind Detection to determine KIND, then:
+- **checkpoint KIND**: read `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>.json` (local-first). If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass fallback: MCP `get_current_task` when the state dir is absent and sync fails (daemon-dead + CLI-unavailable).
+- **standalone KIND**: MCP `get_current_standalone_task(repo_id)` — standalone tools are NOT migrated; standalone KIND still uses MCP until a later task.
+If no in-progress task and Step 0 produced no identifier, show error: `No active task. Run /cbp-task-start first.`
+### Step 2: Determine Round Number
+If Step 0 produced `target_round`: use that as the round number directly (may target an existing round for resume, or N+1 for a new round — disambiguate by reading local round files or MCP per KIND below).
+Otherwise:
+- **checkpoint KIND**: list `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>/rounds/` (local-first). If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass fallback: MCP `get_rounds` when the state dir is absent and sync fails.
+- **standalone KIND**: MCP `get_standalone_rounds(standalone_task_id)` — standalone still uses MCP.
+Count existing rounds; next is N+1. **Round 1** → run the Step 3 first-round path. **Round 2+** → run the Step 3D deep-analysis path (the folded-in round-input deep-analysis role) before creating the round.
+### Step 3: Gather Round Requirements
+**If Step 0 parsed an identifier**: this skill was invoked to resume/start a specific round. Requirements come from existing round data (when the round exists) or task.requirements (for new round 1) — `$ARGUMENTS` is the identifier itself, NOT requirements text.
+**Else (free-text path), if round number is 1:**
+- Use `task.requirements` as the primary requirements
+- If `$ARGUMENTS` provided (free-text), merge as additional context
+**Else (round number > 1):** requirements come from the Step 3D Deep Analysis below — the prior round's unapproved work, verify gate/proof failures, and reviewer findings. Free-text `$ARGUMENTS` (if provided) is merged in as additional direction.
+### Step 3D: Deep Analysis (round 2+ only — MANDATORY)
+Follow-up rounds get the SAME depth as round 1. This is the deep-analysis role that round-input
+formerly owned; it now runs inline here for every round after the first.
+**3D-a:** Load task (already loaded in Step 1) — confirm `files_changed`, `requirements`, `context`, `qa` are present.
+**3D-b:** Load all rounds:
+- **checkpoint KIND** — Read `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>/rounds/*.json` (local-first; break-glass: MCP `get_rounds(task_id)`). Get previous round context + verify outcome.
+- **standalone KIND** — MCP `get_standalone_rounds(standalone_task_id)`.
+**3D-c:** Identify unapproved files from `task.files_changed` where `user_approved === false`.
+**3D-d:** Read the content of each unapproved file (Read tool, max 5 files, first 100 lines each).
+**3D-e:** Cross-reference with task requirements — for each requirement, is it met or missed?
+**3D-f:** Extract verify findings from the latest round context. `/cbp-verify` writes its outcome
+into the round/task context (`verify_manifest`, gate `new_failures[]`, execution-proof gaps, and any
+blocking `cbp-verify-reviewer` findings it could not auto-apply inline because they reference files
+outside the prior round's `files_changed[]`). Include those out-of-scope findings as high-priority
+requirements; deterministic gate failures and proof gaps come first (they block shipping).
+**3D-g:** Identify root causes — not "file X is wrong" but "requirement Y was not met because Z".
+**3D-h:** Present the analysis to the user:
+```
+## Follow-up Round Analysis
+### Unapproved Files ([N])
+| File | Action | Issue |
+|------|--------|-------|
+| path | modified | [what's wrong based on reading the file] |
+### Requirements Coverage
+| Requirement | Status | Gap |
+|-------------|--------|-----|
+| [req] | met/missed | [what's missing] |
+### Verify Findings (from /cbp-verify on the previous round)
+| # | Source | File | Issue |
+|---|--------|------|-------|
+| [gate failure / proof gap / reviewer finding the previous verify could not auto-apply] |
+### Root Causes
+1. [root cause with explanation]
+2. [root cause with explanation]
+```
+### Step 4: Create Round in DB
+**checkpoint KIND**: `codebyplan round add --task-id <uuid> --checkpoint-id <uuid> --number <N> --status in_progress --started-at <ISO> --triggered-by <user|claude|auto_loop> [--requirements <text>] [--context <json>]` (CLI write-through: writes local state at `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>/rounds/<roundId>.json` + REST). Break-glass fallback: MCP `add_round` when the CLI is unavailable.
+**standalone KIND**: MCP `add_standalone_round(standalone_task_id, ...)` — standalone still uses MCP.
+Fields to supply:
+- `task_id` (checkpoint) / `standalone_task_id` (standalone): current task ID
+- `number`: next round number
+- `status`: "in_progress"
+- `started_at`: now (ISO format)
+- `triggered_by`: `"user"` | `"claude"` | `"auto_loop"` — `auto_loop` when `auto_loop_mode === true` carries forward; `claude` when auto-triggered by a verify failure; `user` otherwise
+- `requirements`: round requirements from Step 3 / 3D
+- `context`: when `auto_loop_mode === true`, persist `auto_loop_mode: true`, `auto_loop_index: N`, `auto_loop_cap: C` into `round.context`
+### Step 4b: Choose Mode (round 2+ only)
+**If `round.context.auto_loop_mode === true` on the prior round**: skip the AskUserQuestion below. Auto-pick "Start round with these findings" using the Step 3D analysis directly. Set the next round's `auto_loop_mode = true`, carry `auto_loop_index` and `auto_loop_cap` forward.
+**Else (manual mode)**, ask user via AskUserQuestion:
+- **Start round with these findings** — use the Step 3D analysis as requirements
+- **Discuss first** — open conversation about the analysis
+- **Adjust** — user provides their own requirements
+If "Discuss first": enter open discussion; when direction is clear, continue. If "Adjust": merge the user's requirements with the analysis context.
+**Under `auto_loop_mode`**: requirements come VERBATIM from the prior round's reviewer findings and verify gate/proof failures, treated as a flat list. Test/gate failures are listed first (they block shipping); reviewer findings follow. Both sets are included unchanged — do not re-summarise or re-prioritise either.
+### Step 5: Load Context from DB
+Load from checkpoint and task:
+- Checkpoint context (decisions, dependencies, constraints, goal)
+- Task context and requirements
+- Resources from checkpoint and task (documentation links, API docs, guides)
+- Previous round results (files_changed, verify outcomes)
+- For round 2+: load the latest completed round's `context.verify_manifest` and `context.executor_output`:
+  - **checkpoint KIND**: read `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>/rounds/<roundId>.json` (local-first). If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass fallback: MCP `get_rounds` when the state dir is absent and sync fails.
+  - **standalone KIND**: MCP `get_standalone_rounds(standalone_task_id)` — standalone still uses MCP.
+- Identify unapproved files from task.files_changed (where user_approved === false)
+### Step 6: First Round Analysis (Round 1 only)
+For the first round only:
+1. Analyze the task context and requirements thoroughly
+2. Identify any ambiguities or gaps
+3. If questions are needed, ask user via AskUserQuestion (max 4 per batch)
+4. If task context changes from Q&A answers:
+   - **checkpoint KIND**: `codebyplan task update --id <task-id> --checkpoint-id <uuid> --context <json>` (CLI write-through: local state at `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>.json` + REST). Break-glass fallback: MCP `update_task` when the CLI is unavailable.
+   - **standalone KIND**: MCP `update_standalone_task(standalone_task_id, context: {...})` — standalone still uses MCP.
+   - Save Q&A decisions as `context.decisions[]`
+Skip this step for round 2+ (context already established via the Step 3D deep analysis).
+**If `auto_loop_mode === true`, skip Q&A regardless of round number** — the auto-loop already has authoritative requirements from the prior round's findings.
+### Step 7: Spawn Round Planner Agent
+Spawn `cbp-round-planner` agent with:
+```yaml
+input:
+  task_number: N
+  round_number: N
+  requirements: task.requirements
+  round_requirements: [round-specific input from Step 3 / 3D]
+  checkpoint: {id, title, goal, context}
+  task: {id, title, requirements, context}
+  previous_rounds: {count, files_pending, files_changed, verify_manifest, unapproved_files}
+```
+**Round 2+ context**: the planner receives the previous round's verify outcome and the list of unapproved files (from Step 3D).
+**Priority**: if `round_requirements` is set (round 2+), the planner focuses on those specific goals while respecting `task.requirements` as context.
+Wait for planner output. **If the spawn fails, STOP per "Planner Spawn Failure Is A Gate Failure" above** — do not self-certify a plan inline.
+### Step 8: Present Plan
+Present the plan to user:
+```
+## Round [N] Plan
+**Goal**: [goal]
+### Steps:
+1. [step]
+2. [step]
+...
+### Files to modify:
+| File | Action | Purpose |
+|------|--------|---------|
+| ... | ... | ... |
+```
+**Wave table** — when `approved_plan.waves[]` contains ≥2 entries, render a wave summary table BEFORE the files table:
+```
+### Execution Waves
+| Wave | Agent type | Files | Depends on | Skill preloads |
+|------|-----------|-------|-----------|----------------|
+| web-ui | cbp-round-builder | 7 | — | cbp-frontend-design |
+| backend-api | cbp-round-builder | 4 | — | — |
+```
+Single-wave plans present the existing flat plan view (no wave table) — backward compatible.
+**Plan approval is the `ask`-tier `Skill(cbp-round-build)` permission prompt** — there is NO approve/needs-changes/wrong AskUserQuestion here. After presenting the plan, proceed to Step 9, which auto-triggers `/cbp-round-build`; the harness then shows the `ask`-tier permission prompt, and confirming it IS the user's go-ahead on the plan.
+**Denied-build handling** — if the user declines the `/cbp-round-build` permission, the plan does not run. Treat the decline as "the plan must change":
+- **Minor changes**: collect the user's feedback, re-spawn `cbp-round-planner` with it as a constraint (re-run Step 7), present the revised plan, and re-trigger `/cbp-round-build`.
+- **Wrong direction**: save the rejection reason to round context and re-run this skill's deep-analysis path (Step 3D) for new requirements.
+**If `auto_loop_mode === true`**: the loop auto-approves — log `round.context.plan_approval = { mode: "auto_loop", auto_approved_at: <ISO> }`, surface a one-line note `"Auto-approved under auto_loop_mode (round N of cap C)"`, and proceed to Step 9 (the `/cbp-round-build` permission is auto-approved under the loop).
+### Step 9: Auto-trigger Round Build
+Save planner output to round context, then trigger `/cbp-round-build`. The `ask`-tier permission prompt on `/cbp-round-build` is the user's plan approval (see Step 8).
+- **checkpoint KIND**: `codebyplan round update --id <round-id> --task-id <uuid> --checkpoint-id <uuid> --context <json>` (CLI write-through: local state at `.codebyplan/state/checkpoints/<checkpointId>/tasks/<taskId>/rounds/<roundId>.json` + REST). Break-glass fallback: MCP `update_round` when the CLI is unavailable.
+- **standalone KIND**: MCP `update_standalone_round(standalone_round_id, ...)` — standalone still uses MCP.
+```
+Starting build phase...
+```
+## Fallback: Context Recovery
+If this command is triggered **directly** (not via `/cbp-todo`) and no context is available in the session:
+1. Read `.codebyplan/repo.json` for `repo_id`
+2. Use Kind Detection to determine KIND, then:
+   - **checkpoint KIND**: Read local `.codebyplan/state/` files for checkpoint + task (break-glass: MCP `get_current_task`).
+   - **standalone KIND**: MCP `get_current_standalone_task`.
+3. Load round history:
+   - **checkpoint KIND**: Read `.codebyplan/state/checkpoints/<id>/tasks/<id>/rounds/*.json` (break-glass: MCP `get_rounds(task_id)`).
+   - **standalone KIND**: MCP `get_standalone_rounds(standalone_task_id)`.
+4. Continue from Step 1 (round 2+ deep analysis at Step 3D handles all context loading)
+## Key Rules
+- **Planning ONLY** — no code execution, no testing
+- Planner gets full context (checkpoint + task + previous rounds)
+- Round 1 plans from `task.requirements`; round 2+ runs the MANDATORY Step 3D deep analysis (same depth as round 1 — no quick-fix behavior)
+- Planner spawn failure is a HARD GATE FAILURE — STOP + retry directive, NEVER self-certify a plan inline (`rules/spawn-failure-is-gate-failure.md`)
+- Claude NEVER git adds files in round commands
+## Integration
+- **Reads (checkpoint KIND)**: `.codebyplan/state/checkpoints/<id>.json`, `checkpoints/<id>/tasks/<id>.json`, `checkpoints/<id>/tasks/<id>/rounds/<id>.json` (local-first; `npx codebyplan sync` on miss; MCP `get_current_task` / `get_rounds` as break-glass), file contents (Read tool, round 2+ deep analysis)
+- **Reads (standalone KIND)**: MCP `get_current_standalone_task`, `get_standalone_rounds` — standalone still uses MCP
+- **Writes (checkpoint KIND)**: `codebyplan round add` (Step 4), `codebyplan round update` (Step 9), `codebyplan task update` (Step 6 context only) — break-glass: MCP `add_round` / `update_round` / `update_task`
+- **Writes (standalone KIND)**: MCP `add_standalone_round`, `update_standalone_round`, `update_standalone_task` — standalone still uses MCP
+- **Spawns**: `cbp-round-planner` (spawn failure = hard gate failure → STOP)
+- **Triggers**: `/cbp-round-build` (auto, on plan approval)
+- **Triggered by**: `/cbp-task-start` (round 1), `/cbp-verify` (round 2+ fix round or more-work), `/cbp-todo` (after /clear), user manually
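
Reviewer sketch (not part of the package): the Step 0 argument shapes in the file above can be illustrated with a small classifier. The two regexes are copied from the Step 0 table. Everything else is an assumption made for illustration: the function names, the returned dictionary layout, and in particular the heuristic that separates a malformed identifier from free text (a single whitespace-free token containing both a hyphen and a digit). The skill text does not say how that boundary is drawn.

```python
import re

# Regexes taken from the Step 0 notation table.
CHECKPOINT_RE = re.compile(r"^[0-9]+-[0-9]+-[0-9]+$")  # {chk}-{task}-{round}
STANDALONE_RE = re.compile(r"^[0-9]+-[0-9]+$")          # {task}-{round}


def looks_like_broken_identifier(arg: str) -> bool:
    """Hypothetical heuristic: one whitespace-free token with a hyphen and a digit."""
    return (
        not any(ch.isspace() for ch in arg)
        and "-" in arg
        and any(ch.isdigit() for ch in arg)
    )


def parse_round_plan_argument(raw: str) -> dict:
    """Classify a round-plan argument into one of the Step 0 shapes."""
    arg = raw.strip()
    if not arg:
        # Empty: task and round are derived later via Kind Detection.
        return {"shape": "empty"}
    if CHECKPOINT_RE.match(arg):
        chk, task, rnd = (int(part) for part in arg.split("-"))
        return {
            "shape": "identifier",
            "kind": "checkpoint",
            "target_checkpoint": chk,
            "target_task": task,
            "target_round": rnd,
        }
    if STANDALONE_RE.match(arg):
        task, rnd = (int(part) for part in arg.split("-"))
        return {
            "shape": "identifier",
            "kind": "standalone",
            "target_task": task,
            "target_round": rnd,
        }
    if looks_like_broken_identifier(arg):
        raise ValueError(f"round-plan: invalid argument `{arg}`")
    # Anything else is treated as free-text round requirements.
    return {"shape": "free_text", "requirements": arg}
```

Under these assumptions, `parse_round_plan_argument("108-1")` yields a standalone target (TASK-108, round 1), which matches the mental-model warning in the skill: the two-segment form never addresses a checkpoint.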

package/templates/skills/cbp-session-end/SKILL.md CHANGED

@@ -30,7 +30,7 @@ Snapshot the current next-action so the next `/cbp-session-start` (Step 4.5) can
 2. If `rows[0]` exists and its `command` is non-empty (active work in flight):
    ```yaml
    handoff:
-     command: <rows[0].command> # e.g. "/cbp-round-update"
+     command: <rows[0].command> # e.g. "/cbp-verify"
      instructions: <rows[0].instructions> # human-readable trigger reason
      state: <rows[0].state> # workflow state label
      context: # entity ids used by the Step 4.5 freshness probe

package/templates/skills/cbp-ship-main/SKILL.md CHANGED

@@ -79,12 +79,13 @@ If `branch_deleted === true`, run a conditional Supabase preview-branch teardown
 > Lifecycle contract: see [[supabase-branch-lifecycle]].
 - Read `FEAT_BRANCH` from the `feat_branch` field in the Step 3 JSON output — NOT from `git branch --show-current`. By the time Step 4 runs, `codebyplan ship` has already checked out the base branch (unless `--keep-feat` was passed), so the live branch is the base, not the feat branch.
-- Call `mcp__supabase__list_branches` with `project_id: rrvtrumtkhrsbhcyrwvf`.
+- Resolve the parent project ref from config: `PARENT_REF=$(jq -r '.shipment.surfaces.supabase.project_ref' .codebyplan/shipment.json)` — never a hardcoded literal. If `PARENT_REF` resolves empty or `null` (no Supabase surface configured), skip the teardown silently — there is no preview branch to remove.
+- Call `mcp__supabase__list_branches` with `project_id: $PARENT_REF`.
 - Scan the returned list for an entry whose `name` exactly equals `$FEAT_BRANCH`.
 - If found: call `mcp__supabase__delete_branch` with its `branch_id`. Report the Supabase delete outcome alongside `pr_url` / `merge_commit` / `branch_deleted`.
 - If not found: no-op silently — the GitHub integration may have already removed the preview branch on PR close; not-found is success, NOT an error.
 - If the `list_branches` call itself fails (network, auth, or a non-success response — distinct from a successful lookup that returns no match): emit a non-blocking warning that the Supabase preview branch for `$FEAT_BRANCH` may still exist and should be verified in the dashboard. Do not treat an API failure as a not-found success.
-- Never delete the parent project `rrvtrumtkhrsbhcyrwvf` itself or any persistent/production branch.
+- Never delete the parent project `$PARENT_REF` itself or any persistent/production branch.
 - This coordinates safely with `/cbp-checkpoint-end` — the existence-checked delete makes any second attempt a harmless no-op.
 ## Key Rules
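
Reviewer sketch (not part of the package): the teardown rules in this hunk amount to a small decision table, which the following Python illustrates. The lookup path `shipment.surfaces.supabase.project_ref` comes from the diff. The function names, the returned action labels, and the `id` and `persistent` keys on branch entries are assumptions for illustration only; the real shape of the `list_branches` response is not shown in the diff.

```python
from typing import List, Optional


def resolve_parent_ref(shipment_config: dict) -> Optional[str]:
    """Walk shipment.surfaces.supabase.project_ref; missing, null, or empty means no surface."""
    node = shipment_config
    for key in ("shipment", "surfaces", "supabase", "project_ref"):
        if not isinstance(node, dict):
            return None
        node = node.get(key)
    # `jq -r` prints the string "null" for a missing value, so treat it as unset too.
    if not isinstance(node, str) or node in ("", "null"):
        return None
    return node


def plan_teardown(
    parent_ref: Optional[str],
    feat_branch: str,
    branches: Optional[List[dict]],
) -> dict:
    """Decide what to do; `branches` is None when the list call itself failed."""
    if parent_ref is None:
        return {"action": "skip"}  # no Supabase surface configured
    if branches is None:
        # An API failure is not a not-found success: warn, but do not block.
        return {"action": "warn", "branch": feat_branch}
    for branch in branches:
        if branch.get("name") == feat_branch:  # exact name match only
            if branch.get("persistent"):  # assumed field name
                return {"action": "refuse", "branch": feat_branch}
            return {"action": "delete", "branch_id": branch["id"]}  # assumed key
    return {"action": "noop"}  # already removed elsewhere; counts as success
```

Keeping the decision separate from the MCP calls makes the three outcomes (delete, silent no-op, non-blocking warning) easy to check without touching a real project.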

package/templates/skills/cbp-standalone-task-check/SKILL.md CHANGED

@@ -9,11 +9,11 @@ effort: high
 # Standalone Task Check Command
-AI-driven production readiness review for standalone tasks. Spawns the `cbp-task-check` agent for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting.
+AI-driven production readiness review for standalone tasks. Spawns the `cbp-verify-reviewer` agent (`scope: 'task'`) for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting.
-## Inline-Fallback for Spawn Failure
+## Spawn Failure Is A Hard Gate Failure
-If the `cbp-task-check` agent spawn fails for any reason, follow the canonical inline-fallback procedure documented in `skills/cbp-round-end/SKILL.md` "Inline-fallback for any spawn failure". Walk every agent phase inline with the same Read/Grep depth the agent would have used — do not skip phases.
+If the `cbp-verify-reviewer` agent spawn fails for any reason, STOP and surface a retry directive per `rules/spawn-failure-is-gate-failure.md`. Do NOT walk the agent's phases inline and self-certify — a missing fresh-context review is a hard gate failure, not a pass. Retry when capacity returns.
 ## When Used
@@ -34,7 +34,7 @@ Any multi-segment input is an error:
 ```
 standalone-task-check: argument `{value}` looks like a checkpoint-task pair.
-Use /cbp-task-check {chk}-{task} for checkpoint-bound tasks.
+Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.
 Standalone tasks use a bare number, e.g. /cbp-standalone-task-check 45.
 ```
@@ -44,7 +44,7 @@ Error cases: any multi-segment input, `abc`, `108-`, `-1`, anything with whitesp
 - `standalone-task-check 45` → standalone TASK-45
 - `standalone-task-check` (no arg) → active in-progress task via `get_current_standalone_task`
-- `standalone-task-check 141-3` → error: "Use /cbp-task-check {chk}-{task} for checkpoint-bound tasks."
+- `standalone-task-check 141-3` → error: "Use /cbp-verify {chk}-{task} for checkpoint-bound tasks."
 - `standalone-task-check abc` → error: malformed
 ### Step 1.5: Get Current Task
@@ -66,7 +66,7 @@ If any rounds still in_progress:
 ## Cannot Run Standalone Task Check
 Standalone TASK-[N] has an active round (Round [N]). Complete it first:
-- Run `/cbp-round-update` to finish the round
+- Run `/cbp-verify` to finish the round
 ```
 Stop here.
@@ -78,12 +78,13 @@ Stop here.
 No checkpoint context is needed — standalone tasks have no parent checkpoint.
-### Step 4: Spawn Task Check Agent
+### Step 4: Spawn Verify Reviewer Agent
-Spawn `cbp-task-check` agent with full context:
+Spawn `cbp-verify-reviewer` agent (`scope: 'task'`) with full context:
 ```yaml
 input:
+  scope: 'task'
   task_number: [N]
   round_number: [total rounds]
   checkpoint: null
@@ -117,7 +118,7 @@ Issues found that need addressing:
 - [issue 2]
 ```
-Suggest: `/cbp-round-input` with specific issues. Stop — wait for user.
+Suggest: `/cbp-round-plan` with specific issues. Stop — wait for user.
 **NOT READY — needs new task:**
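
Reviewer sketch (not part of the package): the argument rule this skill applies (bare number accepted, multi-segment input redirected to `/cbp-verify`, anything else malformed) could be expressed as below. The redirect wording follows the updated error text in the diff. The function name, the `None` return for a missing argument, and the wording of the malformed-input message are assumptions for illustration.

```python
import re
from typing import Optional

BARE_NUMBER_RE = re.compile(r"[0-9]+")
MULTI_SEGMENT_RE = re.compile(r"[0-9]+(-[0-9]+)+")


def parse_standalone_task_argument(raw: str) -> Optional[int]:
    """Return the standalone task number, or None when no argument was given."""
    if raw == "":
        # No argument: the skill falls back to the active in-progress task.
        return None
    if BARE_NUMBER_RE.fullmatch(raw):
        return int(raw)
    if MULTI_SEGMENT_RE.fullmatch(raw):
        raise ValueError(
            f"standalone-task-check: argument `{raw}` looks like a checkpoint-task pair.\n"
            "Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.\n"
            "Standalone tasks use a bare number, e.g. /cbp-standalone-task-check 45."
        )
    # Hypothetical wording; the diff only labels these cases as "malformed".
    raise ValueError(f"standalone-task-check: malformed argument `{raw}`")
```

For example, `parse_standalone_task_argument("141-3")` raises with the `/cbp-verify` redirect, while `"abc"` and `"108-"` fall through to the malformed branch.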

package/templates/skills/cbp-standalone-task-complete/SKILL.md CHANGED

@@ -25,7 +25,7 @@ Any multi-segment input is an error:
 ```
 standalone-task-complete: argument `{value}` looks like a checkpoint-task pair.
-Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks.
+Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks.
 Standalone tasks use a bare number, e.g. /cbp-standalone-task-complete 45.
 ```
@@ -35,7 +35,7 @@ Error cases: any multi-segment input, `abc`, `108-`, `-1`, anything with whitesp
 - `standalone-task-complete 45` → standalone TASK-45
 - `standalone-task-complete` (no arg) → active in-progress task via `get_current_standalone_task`
-- `standalone-task-complete 141-3` → error: "Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks."
+- `standalone-task-complete 141-3` → error: "Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks."
 - `standalone-task-complete abc` → error: malformed
 ### Step 1.5: Get Current Task
@@ -56,7 +56,7 @@ If any round is `in_progress`:
 ```
 ## Cannot Complete Standalone Task
-Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-round-update` to finish it.
+Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-verify` to finish it.
 ```
 Stop here.
@@ -66,7 +66,7 @@ Verify at least one round has `testing_qa_output` in its context. If not:
 ```
 ## Cannot Complete Standalone Task
-No testing-qa-agent validation found. Run `/cbp-round-start` to execute a validated round.
+No testing-qa-agent validation found. Run `/cbp-round-plan` to execute a validated round.
 ```
 Stop here.
@@ -179,18 +179,17 @@ When `branch_deleted === true` in the ship JSON:
 ### Step 7.5: Complete Standalone Task
-Note: `complete_standalone_task` is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
+Note: completion is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
-Resolve caller worktree: `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`.
+Complete via the CLI (wraps the `complete_standalone_task` MCP tool):
-Call `complete_standalone_task(standalone_task_id, caller_worktree_id: CALLER_WT)`. `caller_worktree_id` is REQUIRED — the MCP server's pre-guard rejects mutations from non-matching worktrees. The server auto-clears `assigned_worktree_id` on the task on success.
+```bash
+codebyplan standalone-task complete --id <standalone_task.id>
+```
-If `CALLER_WT` is empty, surface this warning and ask user to confirm before proceeding:
+The CLI auto-resolves `caller_worktree_id` (override → worktree cache → resolver). `caller_worktree_id` is REQUIRED — the MCP server's pre-guard rejects mutations from non-matching worktrees, and the CLI hard-fails (exit 1) with registration guidance rather than sending an undefined id. The server auto-clears `assigned_worktree_id` on the task on success.
-```
-Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
-The complete_standalone_task call may be rejected by the pre-guard. Proceed anyway? (yes / no)
-```
+If the CLI exits 1 with a "could not resolve caller_worktree_id" message, run `npx codebyplan setup` (or `codebyplan resolve-worktree --cache`) from this worktree, then re-run the command.
 ### Step 8: Run Cleanup + Migration (inline)
@@ -238,6 +237,6 @@ Do NOT use AskUserQuestion for routing. Do NOT use the Skill tool to auto-trigge
 - **Chain**: `/cbp-standalone-task-check` → `/cbp-standalone-task-testing` → `/cbp-standalone-task-complete`
 - **Delegates to**: `codebyplan ship` CLI (Step 7 — PR creation, check polling, merge, branch cleanup)
 - **Reads**: MCP `get_current_standalone_task`, `get_standalone_tasks`, `get_standalone_rounds`
-- **Writes**: MCP `update_standalone_task`, `complete_standalone_task`
+- **Writes**: MCP `update_standalone_task` (Step 6 files); `codebyplan standalone-task complete` (wraps `complete_standalone_task`)
 - **Uses skills (inline, no sub-agent)**: `cleanup` (if deletions), `migration` (if exports renamed)
 - **Does NOT** auto-trigger next skill — emits single directive only
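
Reviewer sketch (not part of the package): Step 7.5 now describes a resolution order for `caller_worktree_id` (override, then worktree cache, then resolver) with a hard failure when nothing resolves. A minimal Python illustration of that fallback chain follows. The class name, function name, and the idea of passing each source as a zero-argument callable are assumptions; the CLI's real internals are not shown in the diff.

```python
from typing import Callable, Optional, Sequence


class WorktreeResolutionError(RuntimeError):
    """Raised instead of sending an undefined caller_worktree_id."""


def resolve_caller_worktree_id(
    sources: Sequence[Callable[[], Optional[str]]],
) -> str:
    """Try each source in order and return the first non-empty id."""
    for source in sources:
        value = source()
        if value:
            return value
    # Mirrors the described hard fail (exit 1) with registration guidance.
    raise WorktreeResolutionError(
        "could not resolve caller_worktree_id; run `npx codebyplan setup` "
        "from this worktree, then re-run the command"
    )


# Usage with stand-in sources: no override, a cached id, resolver never reached.
worktree_id = resolve_caller_worktree_id(
    [lambda: None, lambda: "wt-cached", lambda: "wt-resolved"]
)
```

Failing loudly here replaces the removed "Proceed anyway?" prompt: the call can no longer reach the server pre-guard with an empty id.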

package/templates/skills/cbp-standalone-task-create/SKILL.md CHANGED

@@ -17,7 +17,7 @@ Create a new standalone task — independent of any checkpoint. Gathers user con
 ## Identifier Notation
-Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
+Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
 ## Instructions
@@ -96,13 +96,20 @@ Resolve worktree_id via `npx codebyplan resolve-worktree 2>/dev/null`.
 ### Step 7: Create Standalone Task
-Use MCP `create_standalone_task` with:
-- **repo_id**: from `.codebyplan/repo.json`
-- **title**: Concise task title
-- **number**: next available number from Step 3
-- **requirements**: Numbered requirements list
-- **context**: Include decisions from Q&A and source findings
-- **assigned_worktree_id**: from Step 6 (if resolved)
+Create via the CLI (wraps the `create_standalone_task` MCP tool; auto-resolves `caller_worktree_id`):
+```bash
+codebyplan standalone-task create \
+  --title "<concise task title>" \
+  --number <next number from Step 3> \
+  --requirements "<numbered requirements list>" \
+  --context '<JSON: decisions from Q&A + source findings>' \
+  --assigned-worktree-id <from Step 6, if resolved>
+```
+- `--repo-id` is optional — the CLI reads it from `.codebyplan/repo.json`.
+- Omit `--assigned-worktree-id` when Step 6 did not resolve a worktree.
+- On success the CLI prints the created row JSON (including `.id`) to stdout.
 ```
 ## Standalone Task Created
@@ -145,6 +152,6 @@ Waiting for user to decide next step.
 ## Integration
 - **Reads**: MCP `get_standalone_tasks`
-- **Writes**: MCP `create_standalone_task`
+- **Writes**: `codebyplan standalone-task create` (wraps `create_standalone_task` MCP tool)
 - **Triggered by**: user manual
 - **Does NOT auto-trigger** next command — user decides

package/templates/skills/cbp-standalone-task-start/SKILL.md CHANGED

@@ -149,13 +149,17 @@ Load context from DB:
 ### Step 5: Set Task Status
-Use MCP `update_standalone_task(task_id, status: "in_progress")`.
+Set status via the CLI (wraps `update_standalone_task`; auto-resolves `caller_worktree_id`):
-If `CALLER_WT` is present, include `caller_worktree_id: CALLER_WT`.
+```bash
+codebyplan standalone-task update --id <task.id> --status in_progress
+```
+`--id` is the standalone task UUID resolved in Step 2. The CLI resolves `caller_worktree_id` itself (override → worktree cache → resolver), so `CALLER_WT` does not need to be passed.
 ### Step 6: Auto-trigger Round Start
-Trigger `/cbp-round-start` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
+Trigger `/cbp-round-plan` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
 ```
 Starting first round...
@@ -164,5 +168,5 @@ Starting first round...
 ## Integration
 - **Reads**: MCP `get_standalone_tasks`, `get_current_standalone_task`, `get_standalone_rounds`
-- **Writes**: MCP `update_standalone_task`
-- **Triggers**: `/cbp-round-start` (no argument — auto, round 1)
+- **Writes**: `codebyplan standalone-task update` (Step 5 status); MCP `update_standalone_task` (Step 3.4 branch_name persist)
+- **Triggers**: `/cbp-round-plan` (no argument — auto, round 1)
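
Step 5 in this file's diff says the CLI resolves `caller_worktree_id` in the order override, worktree cache, resolver. The sketch below illustrates that "first non-empty source wins" order only. The function name and the choice to pass the override and cached values as plain arguments are assumptions for illustration, and the CLI's real internals are not shown in this diff. The last branch calls `npx codebyplan resolve-worktree`, the command the skills themselves use, on the assumption that it prints the ID to stdout.

```bash
# Sketch only: resolution order for caller_worktree_id.
# `resolve_caller_worktree_id` and its two arguments are hypothetical.
resolve_caller_worktree_id() {
  local override=${1-} cached=${2-}
  if [ -n "$override" ]; then
    printf '%s\n' "$override"   # 1. explicit override wins
  elif [ -n "$cached" ]; then
    printf '%s\n' "$cached"     # 2. otherwise the cached worktree ID
  else
    # 3. otherwise ask the resolver (assumed to print the ID to stdout)
    npx codebyplan resolve-worktree 2>/dev/null
  fi
}
```

Because the chain ends in the resolver, a caller never has to pass `CALLER_WT` explicitly, which matches the note in Step 5.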

package/templates/skills/cbp-standalone-task-testing/SKILL.md CHANGED

@@ -19,7 +19,7 @@ Comprehensive task-level testing for standalone tasks — the **cross-round doub
 ## Scope vs Round-Level Validation
-Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
+Per-wave `testing-qa-agent` runs inside `/cbp-round-build` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
 ## Instructions
@@ -34,7 +34,7 @@ Any multi-segment input is an error:
 ```
 standalone-task-testing: argument `{value}` looks like a checkpoint-task pair.
-Use /cbp-task-testing {chk}-{task} for checkpoint-bound tasks.
+Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.
 Standalone tasks use a bare number, e.g. /cbp-standalone-task-testing 45.
 ```
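
The argument rule in the hunk above (a bare number is a standalone task, any multi-segment input is an error) can be sketched as a small guard. `check_standalone_arg` is a hypothetical helper name. The skill describes this check in prose and does not ship such a function.

```bash
# Sketch only: accept a bare task number, reject multi-segment identifiers.
# `check_standalone_arg` is hypothetical; the error text mirrors the skill's message.
check_standalone_arg() {
  local value=${1-}
  case $value in
    *-*)
      # Multi-segment input such as 141-3 belongs to a checkpoint task.
      printf 'standalone-task-testing: argument `%s` looks like a checkpoint-task pair.\n' "$value" >&2
      printf 'Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.\n' >&2
      return 1 ;;
    *)
      # Bare value: pass it through as the standalone task number.
      printf '%s\n' "$value" ;;
  esac
}
```

The guard tests only for a hyphen, which is enough to tell `45` from `141-3`. It does not validate that the bare value is numeric.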
@@ -57,7 +57,7 @@ Use MCP `get_standalone_rounds(standalone_task_id)`. Verify all rounds are `comp
 ## Cannot Run Standalone Task Testing
 Standalone TASK-[N] has an active round (Round [N]). Complete it first:
-- Run `/cbp-round-update` to finish the round
+- Run `/cbp-verify` to finish the round
 ```
 Stop.
@@ -185,11 +185,11 @@ Next: /cbp-standalone-task-complete {N}
 ---
 **Next:**
-Run `/cbp-round-input` to address the minor issues found during testing.
+Run `/cbp-round-plan` to address the minor issues found during testing.
 ---
-Waiting for user to run `/cbp-round-input`.
+Waiting for user to run `/cbp-round-plan`.
 **Major problems found:**