codebyplan 1.13.53 → 1.13.55

Files changed (84)
  1. package/dist/cli.js +1364 -352
  2. package/package.json +1 -1
  3. package/templates/agents/cbp-database-agent.md +1 -1
  4. package/templates/agents/cbp-e2e-maestro.md +1 -1
  5. package/templates/agents/cbp-e2e-playwright.md +24 -16
  6. package/templates/agents/cbp-e2e-tauri.md +1 -1
  7. package/templates/agents/cbp-e2e-vscode.md +1 -1
  8. package/templates/agents/cbp-e2e-xcuitest.md +1 -1
  9. package/templates/agents/cbp-improve-claude.md +2 -2
  10. package/templates/agents/{cbp-round-executor.md → cbp-round-builder.md} +23 -23
  11. package/templates/agents/{cbp-task-planner.md → cbp-round-planner.md} +26 -25
  12. package/templates/agents/cbp-security-agent.md +1 -1
  13. package/templates/agents/cbp-stripe-agent.md +2 -2
  14. package/templates/agents/cbp-testing-qa-agent.md +11 -11
  15. package/templates/agents/cbp-verify-reviewer.md +236 -0
  16. package/templates/context/architecture-map.md +4 -4
  17. package/templates/context/mcp-docs.md +57 -11
  18. package/templates/context/testing/e2e.md +9 -9
  19. package/templates/github-workflows/ci.yml +58 -0
  20. package/templates/hooks/cbp-skill-context-guard.sh +1 -1
  21. package/templates/hooks/cbp-test-hooks.sh +9 -9
  22. package/templates/hooks/validate-structure-lengths.sh +1 -1
  23. package/templates/hooks/validate-structure-patterns.sh +1 -1
  24. package/templates/rules/README.md +1 -2
  25. package/templates/rules/agent-claim-verification.md +1 -1
  26. package/templates/rules/context-file-loading.md +10 -10
  27. package/templates/rules/development-workflow.md +73 -0
  28. package/templates/rules/e2e-mandatory.md +8 -8
  29. package/templates/rules/execution-proof.md +70 -0
  30. package/templates/rules/model-invocation-convention.md +2 -2
  31. package/templates/rules/parallel-waves.md +11 -11
  32. package/templates/rules/spawn-failure-is-gate-failure.md +76 -0
  33. package/templates/rules/task-routing-recommendation.md +1 -1
  34. package/templates/rules/todo-backend.md +3 -3
  35. package/templates/rules/two-tier-ci.md +63 -0
  36. package/templates/settings.project.base.json +8 -10
  37. package/templates/skills/cbp-build-cc-mode/SKILL.md +1 -1
  38. package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md +7 -7
  39. package/templates/skills/cbp-build-cc-skill/SKILL.md +1 -1
  40. package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md +2 -2
  41. package/templates/skills/cbp-build-cc-skill/reference/fork-eligibility.md +11 -14
  42. package/templates/skills/cbp-checkpoint-check/SKILL.md +2 -2
  43. package/templates/skills/cbp-checkpoint-create/SKILL.md +16 -1
  44. package/templates/skills/cbp-checkpoint-update/SKILL.md +3 -3
  45. package/templates/skills/cbp-clear-continue/SKILL.md +2 -2
  46. package/templates/skills/cbp-clear-prep/SKILL.md +3 -3
  47. package/templates/skills/{cbp-task-complete → cbp-finalize}/SKILL.md +25 -29
  48. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/checkpoint-done-branching.md +1 -1
  49. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/next-step-heuristic.md +1 -1
  50. package/templates/skills/cbp-frontend-design/SKILL.md +1 -1
  51. package/templates/skills/cbp-frontend-ui/SKILL.md +7 -7
  52. package/templates/skills/cbp-git-commit/SKILL.md +3 -3
  53. package/templates/skills/cbp-merge-main/SKILL.md +4 -4
  54. package/templates/skills/{cbp-round-execute → cbp-round-build}/SKILL.md +93 -75
  55. package/templates/skills/cbp-round-complete/SKILL.md +15 -14
  56. package/templates/skills/cbp-round-plan/SKILL.md +344 -0
  57. package/templates/skills/cbp-session-end/SKILL.md +1 -1
  58. package/templates/skills/cbp-ship-main/SKILL.md +3 -2
  59. package/templates/skills/cbp-standalone-task-check/SKILL.md +10 -9
  60. package/templates/skills/cbp-standalone-task-complete/SKILL.md +12 -13
  61. package/templates/skills/cbp-standalone-task-create/SKILL.md +16 -9
  62. package/templates/skills/cbp-standalone-task-start/SKILL.md +9 -5
  63. package/templates/skills/cbp-standalone-task-testing/SKILL.md +5 -5
  64. package/templates/skills/cbp-task-create/SKILL.md +6 -7
  65. package/templates/skills/cbp-task-start/SKILL.md +8 -8
  66. package/templates/skills/cbp-todo/SKILL.md +6 -8
  67. package/templates/skills/cbp-verify/SKILL.md +146 -0
  68. package/templates/skills/cbp-verify/reference/deterministic-gates.md +114 -0
  69. package/templates/skills/{cbp-round-end → cbp-verify}/reference/findings-presentation.md +16 -12
  70. package/templates/skills/cbp-verify/reference/round-scope.md +62 -0
  71. package/templates/skills/cbp-verify/reference/task-scope.md +71 -0
  72. package/templates/agents/cbp-improve-round.md +0 -283
  73. package/templates/agents/cbp-task-check.md +0 -217
  74. package/templates/skills/cbp-round-check/SKILL.md +0 -134
  75. package/templates/skills/cbp-round-end/SKILL.md +0 -173
  76. package/templates/skills/cbp-round-end/reference/inline-fallback.md +0 -35
  77. package/templates/skills/cbp-round-execute/reference/inline-fallback.md +0 -55
  78. package/templates/skills/cbp-round-input/SKILL.md +0 -197
  79. package/templates/skills/cbp-round-start/SKILL.md +0 -261
  80. package/templates/skills/cbp-round-update/SKILL.md +0 -120
  81. package/templates/skills/cbp-ship/templates/workflow-eas-submit.yml +0 -53
  82. package/templates/skills/cbp-ship/templates/workflow-vsce-publish.yml +0 -31
  83. package/templates/skills/cbp-task-check/SKILL.md +0 -172
  84. package/templates/skills/cbp-task-testing/SKILL.md +0 -279
@@ -10,13 +10,12 @@ Create a new task within the active checkpoint. Gathers user context, analyzes e
10
10
 
11
11
  ## When Used
12
12
 
13
- - Suggested by `/cbp-task-check` when scope issues require a new task
14
- - Suggested by `/cbp-task-testing` when major problems need a separate task
13
+ - Suggested by `/cbp-verify` (task scope) when scope issues or major problems require a separate task
15
14
  - User manually wants to add a task to the current checkpoint
16
15
 
17
16
  ## Identifier Notation
18
17
 
19
- This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
18
+ This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
20
19
 
21
20
  **Bare-number argument**: if a bare number (e.g. `42`) is provided with no checkpoint context, this skill cannot resolve it to a checkpoint-bound task:
22
21
 
@@ -44,8 +43,8 @@ Use AskUserQuestion to understand the new task:
44
43
 
45
44
  Why is this task needed? What should it accomplish?
46
45
 
47
- If this was triggered by `/cbp-task-check` or `/cbp-task-testing`, the findings are:
48
- [pre-loaded context from check/testing findings if available]
46
+ If this was triggered by `/cbp-verify` (task scope), the findings are:
47
+ [pre-loaded context from verify findings if available]
49
48
 
50
49
  Please describe:
51
50
  1. What the task should accomplish
@@ -70,7 +69,7 @@ Discovered issues MUST be captured. The default target is current scope (round
70
69
 
71
70
  | Situation | Action |
72
71
  |-----------|--------|
73
- | Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-round-end` reference `findings-presentation.md` "Trivial-Resolution Exception" |
72
+ | Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-verify` reference `findings-presentation.md` "Trivial-Resolution Exception" |
74
73
  | Related to the current task's domain | Create a new ROUND in the current task |
75
74
  | Fits the current checkpoint goal but is meaningfully separate | Create a new TASK in the current checkpoint via `create_task(checkpoint_id)` |
76
75
  | Large enough to need multiple tasks AND fits no current checkpoint | Create a NEW CHECKPOINT via `create_checkpoint` |
@@ -193,5 +192,5 @@ Waiting for user to decide next step.
193
192
 
194
193
  - **Reads**: Local state `.codebyplan/state/checkpoints/<id>.json` + `.../tasks/<id>.json`; on miss `npx codebyplan sync` once; MCP `get_current_task` / `get_tasks` as documented break-glass when the state dir is absent and sync fails. Step 3.5 dedup `get_tasks(standalone=true)` stays MCP — no local-state equivalent for standalone listing.
195
194
  - **Writes**: `codebyplan task create --checkpoint-id <id> ...` (CLI write-through); MCP `create_task` break-glass.
196
- - **Triggered by**: `/cbp-task-check` (suggested), `/cbp-task-testing` (suggested), user manual
195
+ - **Triggered by**: `/cbp-verify` (task scope, suggested), user manual
197
196
  - **Does NOT auto-trigger** next command — user decides
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: cbp-task-start
3
3
  description: Start a task, load context from DB
4
- triggers: [cbp-round-start]
4
+ triggers: [cbp-round-plan]
5
5
  argument-hint: [chk-task] # e.g. `108-1` (CHK-108 TASK-1)
6
6
  effort: xhigh
7
7
  ---
@@ -14,7 +14,7 @@ Start a task by loading context from the database and preparing for work.
14
14
 
15
15
  ### Step 1: Parse `$ARGUMENTS`
16
16
 
17
- Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
17
+ Parse the argument using the canonical chk-task-round notation (see `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
18
18
 
19
19
  | Shape | Regex | Resolves to |
20
20
  |-------|-------|-------------|
@@ -30,7 +30,7 @@ task-start: invalid argument `{value}`. Expected:
30
30
  (empty) → next pending task
31
31
 
32
32
  For standalone tasks, use `/cbp-standalone-task-start {N}`.
33
- For a specific round, use `/cbp-round-start 108-1-2`.
33
+ For a specific round, use `/cbp-round-plan 108-1-2`.
34
34
  ```
35
35
 
36
36
  Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108--1`, anything with whitespace or non-numeric characters.
@@ -40,7 +40,7 @@ Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108-
40
40
  - `task-start 108-1` → CHK-108 TASK-1
41
41
  - `task-start` (no arg) → next pending via `get_current_task`
42
42
  - `task-start 45` → error: "Use /cbp-standalone-task-start 45 instead — bare numbers no longer route to standalone tasks."
43
- - `task-start 108-1-2` → error: "use `/cbp-round-start 108-1-2`"
43
+ - `task-start 108-1-2` → error: "use `/cbp-round-plan 108-1-2`"
44
44
  - `task-start abc` → error: malformed
45
45
  - `task-start 108-` → error: malformed
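
Reader annotation (not part of the package): the routing examples above can be sketched in Python roughly as follows. This is a minimal sketch under assumptions: the Shape/Regex table rows are elided in this hunk, so the three patterns below are guesses at their intent, and `parse_task_start_arg` is a hypothetical name.

```python
import re

# Assumed patterns: the real Shape/Regex rows are not visible in this hunk.
CHK_TASK = r"([0-9]+)-([0-9]+)"          # e.g. 108-1
CHK_TASK_ROUND = r"[0-9]+-[0-9]+-[0-9]+"  # e.g. 108-1-2 (a round shape)
BARE_NUMBER = r"[0-9]+"                   # e.g. 45


def parse_task_start_arg(arg: str) -> dict:
    """Classify a task-start argument following the examples in the skill text."""
    if arg == "":
        # No argument: the skill resolves the next pending task from state.
        return {"kind": "next_pending"}
    match = re.fullmatch(CHK_TASK, arg)
    if match:
        return {
            "kind": "checkpoint_task",
            "checkpoint": int(match.group(1)),
            "task": int(match.group(2)),
        }
    if re.fullmatch(CHK_TASK_ROUND, arg):
        # Three segments is a round identifier, so point at the round skill.
        return {"kind": "error", "hint": f"use /cbp-round-plan {arg}"}
    if re.fullmatch(BARE_NUMBER, arg):
        # Bare numbers no longer route to standalone tasks.
        return {"kind": "error", "hint": f"use /cbp-standalone-task-start {arg}"}
    return {"kind": "error", "hint": "malformed"}
```

Using `re.fullmatch` rather than `re.match` keeps inputs with whitespace or trailing characters in the malformed bucket, which matches the listed error cases.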
46
46
 
@@ -75,7 +75,7 @@ Ask via AskUserQuestion, naming the resolved task and disclosing the actions:
75
75
  > - **Cancel** — do nothing
76
76
 
77
77
  - **Proceed** → continue to Step 3.
78
- - **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-start` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
78
+ - **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-plan` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
79
79
 
80
80
  ### Step 3: Branch Auto-Handling
81
81
 
@@ -221,17 +221,17 @@ Display context summary:
221
221
 
222
222
  ### Step 6: Auto-trigger Round Start
223
223
 
224
- The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-start` for the first round.
224
+ The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-plan` for the first round.
225
225
 
226
226
  ```
227
227
  Starting first round...
228
228
  ```
229
229
 
230
- Trigger `/cbp-round-start` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
230
+ Trigger `/cbp-round-plan` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
231
231
 
232
232
  ## Integration
233
233
 
234
234
  - **Gates**: Step 2.5 permission gate — asks the user to confirm before any side effect; **Cancel** aborts cleanly with no writes. Fires on every invocation (manual, auto-trigger, auto-loop).
235
235
  - **Reads**: `.codebyplan/state/checkpoints/*.json`, `checkpoints/<id>/tasks/*.json`, `checkpoints/<id>/tasks/<id>/rounds/*.json`, `todos.json` (local-first; `npx codebyplan sync` on miss; MCP `get_current_task`/`get_tasks`/`get_rounds` break-glass)
236
236
  - **Writes**: `codebyplan task update` (CLI write-through; MCP `update_task` break-glass)
237
- - **Triggers**: `/cbp-round-start` (auto, round 1, no argument)
237
+ - **Triggers**: `/cbp-round-plan` (auto, round 1, no argument)
@@ -131,19 +131,17 @@ Once the gates pass, load the context the head command needs. This ensures `/cle
131
131
  | `/cbp-checkpoint-plan` | Load checkpoint from `.codebyplan/state/checkpoints/<id>.json` + task files under `checkpoints/<id>/tasks/` (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, goal, ideas, existing task count |
132
132
  | `/cbp-checkpoint-start` | Load checkpoint + task files from local state (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, status, claim state, first pending task |
133
133
  | `/cbp-task-start [N]` | Load from `.codebyplan/state/session/current.json` (fallback MCP `get_current_task`). Display checkpoint title + task title/requirements summary |
134
- | `/cbp-round-start` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
135
- | `/cbp-round-update` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
136
- | `/cbp-round-input` | **Full context load** (see Step 2b) |
137
- | `/cbp-task-check` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task + files summary |
138
- | `/cbp-task-testing` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + testing status summary |
134
+ | `/cbp-round-plan` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
135
+ | `/cbp-round-plan` | **Full context load** (see Step 2b) |
136
+ | `/cbp-verify` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
139
137
  | `/cbp-task-create` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task list summary |
140
- | `/cbp-task-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
138
+ | `/cbp-finalize` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
141
139
  | `/cbp-checkpoint-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint summary |
142
140
  | *(no command / idle)* | See Step 3 — suggest `/cbp-session-end` |
143
141
 
144
142
  **For any unrecognized command:** Load from local state session (fallback MCP `get_current_task`) as a safe default. Display whatever context is available.
145
143
 
146
- ### Step 2b: Full Context Load (for `/cbp-round-input`)
144
+ ### Step 2b: Full Context Load (for `/cbp-round-plan`)
147
145
 
148
146
  This is the most context-dependent command. Load everything:
149
147
 
@@ -190,7 +188,7 @@ Reached only when the Step 1.5 ownership gate allowed routing to continue, the S
190
188
 
191
189
  ## Integration
192
190
 
193
- - **Called by**: `/cbp-session-start`, `/cbp-task-complete`, `/cbp-checkpoint-complete`, manual, after `/clear`
191
+ - **Called by**: `/cbp-session-start`, `/cbp-finalize`, `/cbp-checkpoint-complete`, manual, after `/clear`
194
192
  - **Resolves**: `npx codebyplan resolve-worktree --json` (worktree id + distress signal), `npx codebyplan whoami --json` (user id)
195
193
  - **Reads**: `.codebyplan/state/todos.json`, `session/current.json`, `checkpoints/<id>.json`, `checkpoints/<id>/tasks/<id>.json`, `checkpoints/<id>/tasks/<id>/rounds/<id>.json`, `worktrees.json`. If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass: MCP `get_todos`, `get_current_task`, `get_rounds`, `get_checkpoints`, `get_tasks` when state dir absent and sync fails. `get_worktrees` stays MCP (display-only ownership-block path; no CLI verb).
196
194
  - **Triggers**: `rows[0].command` (auto, after the Step 1.5 ownership gate and Step 1.55 stale-entity guard pass, and the Step 1.6 planning gate falls through); Step 1.55 overrides to STOP (stale completed/cancelled entity); Step 1.6 overrides to `/cbp-checkpoint-plan` (unplanned) or `/cbp-checkpoint-start` (planned-but-pending)
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: cbp-verify
3
+ description: Unified verify stage — deterministic gates, real-execution proof, and a fresh-context diff review at round or task scope. Auto-triggered by cbp-round-build; escalates to task scope on the last clean round.
4
+ argument-hint: [chk-task[-round] | task[-round]]
5
+ triggers: [cbp-round-plan, cbp-round-complete, cbp-finalize]
6
+ effort: xhigh
7
+ ---
8
+
9
+ # Verify Command
10
+
11
+ The single verify stage for the execution half. Collapses automated checks, finished-round
12
+ triage, AI production review, and comprehensive task-level testing into one scope-aware skill.
13
+ The deterministic spine lives in the CLI (`codebyplan check`, `codebyplan e2e verify-round`);
14
+ this skill orchestrates the gates, proves execution, spawns ONE fresh-context reviewer, and
15
+ routes on a single directive.
16
+
17
+ Auto-triggered by `/cbp-round-build` after execution. The human gate is NOT here — it is the
18
+ separate `ask`-tier `cbp-round-complete` (round) and the one batched walkthrough in Phase 6
19
+ (task). This skill is model-invocable on purpose.
20
+
21
+ ## Scope & Kind
22
+
23
+ - **SCOPE** (`round` | `task`) — auto-detected: a 3-segment `{chk}-{task}-{round}` (or 2-segment
24
+ `{task}-{round}` standalone) argument, or an auto-trigger from `cbp-round-build`, is
25
+ `scope=round`. A 2-segment `{chk}-{task}` (or bare `{task}` standalone) argument, or the Phase 5
26
+ escalation, is `scope=task`.
27
+ - **KIND** (`checkpoint` | `standalone`) — detected ONCE at the top from identifier shape
28
+ (3-segment / 2-segment-chk = checkpoint; 2-segment / bare = standalone). KIND selects MCP tool
29
+ names per the table in `reference/deterministic-gates.md`.
30
+
31
+ All reads are local-state-first (`.codebyplan/state/**`); on miss run `npx codebyplan sync` once
32
+ and re-read; MCP `get_*` is the break-glass fallback. All writes go through `codebyplan ... update`
33
+ (CLI write-through), MCP break-glass.
34
+
35
+ ## HARD GATES — non-negotiable
36
+
37
+ - The deterministic-gate JSON **is** the verdict — never narrate "I verified the build". (Phase 2)
38
+ - Empty execution proof on a UI-touching diff = GATE FAILURE. (Phase 3, `rules/execution-proof.md`)
39
+ - Reviewer spawn failure = HARD GATE FAILURE → STOP + retry directive; NEVER self-review inline.
40
+ (Phase 4, `rules/spawn-failure-is-gate-failure.md`)
41
+ - `gate6` is always hard (never baselined); baseline regressions are a user-accept gate, never
42
+ auto-accepted. (`rules/two-tier-ci.md`)
43
+
44
+ ## Phase Skeleton
45
+
46
+ ### PHASE 1 — RESOLVE
47
+
48
+ Parse `$ARGUMENTS` (notation per `cbp-round-plan` identifier vocabulary). Detect SCOPE and KIND
49
+ (above). Resolve the active round/task from local state. If `scope=round` and no in-progress
50
+ round → `No active round. Run /cbp-round-plan first.` and STOP. If `scope=task` and any round is
51
+ still `in_progress` → STOP with "complete the active round first". Full resolution + KIND tool
52
+ table: `reference/deterministic-gates.md`.
53
+
54
+ ### PHASE 2 — DETERMINISTIC GATES
55
+
56
+ Run the unified matrix and capture the JSON:
57
+
58
+ ```bash
59
+ codebyplan check --scope <round|task> --json
60
+ ```
61
+
62
+ The JSON `{ results[], any_failed, hard_fail_checks[], no_baseline }` IS the verdict — record
63
+ each result's `check`, `status`, `exit_code`, `new_failures[]`. `gate6` is ALWAYS hard;
64
+ `lint`/`typecheck`/`tests`/`audit` fail only on NEW per-package failures vs the committed
65
+ `.check-baseline.json` (baseline-tolerant soft tier, `rules/two-tier-ci.md`). `any_failed === true`
66
+ (equivalently `hard_fail_checks.length > 0`) → carry into the Phase 5 verdict as a fail. Exact
67
+ contract + the `claude_only` carve-out (deterministic-only path, no agent): see
68
+ `reference/deterministic-gates.md`.
69
+
70
+ ### PHASE 3 — REAL EXECUTION PROOF
71
+
72
+ Produce the committed proof for every tier the diff touches (`rules/execution-proof.md`):
73
+
74
+ - **Tier 1** (configured e2e framework whose `app` source changed) — persist `e2e_eligible` /
75
+ `e2e_outputs` to round context, then:
76
+
77
+ ```bash
78
+ codebyplan e2e verify-round --round-id <round_id> --task-id <task_id>
79
+ ```
80
+
81
+ Exit 0 = pass; exit 1 → surface `result.failed_checks[]` (`e2e_eligible_skipped` /
82
+ `zero_assertion_run` / `empty_gallery`) verbatim and carry as a fail.
83
+ - **Tier 2/3/4** — dev-server screenshot / HTTP trace / command log per the rule.
84
+
85
+ **Empty proof on a UI diff = GATE FAILURE.** Verify each screenshot/trace is committed with
86
+ `git ls-files --error-unmatch <path>`. Write the `verify_manifest` (gates + proof, schema in
87
+ `rules/execution-proof.md`). Per-scope detail: `reference/round-scope.md`, `reference/task-scope.md`.
88
+
89
+ ### PHASE 4 — FRESH-CONTEXT DIFF REVIEW
90
+
91
+ Spawn `cbp-verify-reviewer` with `scope` (round → round diff; task → full task diff) and the
92
+ Input Contract from `agents/cbp-verify-reviewer.md`. **SPAWN FAILURE = HARD GATE FAILURE** → STOP
93
+ and surface the retry directive (`rules/spawn-failure-is-gate-failure.md`); record
94
+ `<scope>.context.verify.spawn_failure`; do NOT walk the reviewer's phases inline. A returned
95
+ `NOT_READY` is a successful review — act on it, do not retry.
96
+
97
+ Triage the returned findings: in-scope mechanical fixes the orchestrator applies itself
98
+ (`Edit`/`Write`); blocking out-of-scope findings → `/cbp-round-plan` fix round. A baseline
99
+ regression is a **blocking user-accept gate** — never auto-accepted.
100
+
101
+ ### PHASE 5 — VERDICT + ROUTE (single directive, never an A/B/C menu)
102
+
103
+ Combine Phase 2 + 3 + 4. Route on one directive (`feedback-close-out-routing.md`):
104
+
105
+ | Result | Route |
106
+ |--------|-------|
107
+ | Any gate/proof/review fail | `Next: /cbp-round-plan` (open a fix round) |
108
+ | Pass + more work wanted | `Next: /cbp-round-plan` (another round) |
109
+ | Pass + LAST round + clean (scope=round) | escalate to `scope=task` → re-enter at Phase 1 |
110
+ | Pass (scope=task) | proceed to Phase 6 finalize |
111
+
112
+ ### PHASE 6 — FINALIZE
113
+
114
+ - **scope=round** — route to the human git-add gate: `Next: /cbp-round-complete`
115
+ (`ask`-tier; reconciles `sync-approvals` + `complete_round`). cbp-verify does NOT stage files
116
+ or complete the round. Detail: `reference/round-scope.md`.
117
+ - **scope=task** — whole-repo `codebyplan check --scope task`, holistic `cbp-verify-reviewer`
118
+ (scope=task) already run in Phase 4, then the ONE genuine human step: a single batched
119
+ `AskUserQuestion` walkthrough (all user-testable items in one prompt, never one-per-question).
120
+ On satisfaction, write `task.context.verify_verdict = { verdict: 'READY', manifest, decided_at }`
121
+ and route `Next: /cbp-finalize`. Detail: `reference/task-scope.md`.
122
+
123
+ ## Key Rules
124
+
125
+ - The JSON verdict from `codebyplan check` / `e2e verify-round` is authoritative — no prose
126
+ substitution.
127
+ - Reviewer spawn failure STOPS the skill (retry directive); never self-certify inline.
128
+ - Empty proof on a UI diff fails verify; screenshots must be committed.
129
+ - Claude NEVER `git add`s — staging is the user's approval signal at `cbp-round-complete`.
130
+ - Single-directive routing only — never an A/B/C menu.
131
+ - `claude_only` profile is the deterministic-only carve-out (no reviewer spawn expected).
132
+
133
+ ## Integration
134
+
135
+ - **Triggered by**: `/cbp-round-build` (auto, scope=round after execution); self-escalates to
136
+ scope=task on the last clean round.
137
+ - **Reads**: `.codebyplan/state/**` (local-first; `npx codebyplan sync` on miss; MCP `get_*`
138
+ break-glass); changed files + git diff via the reviewer.
139
+ - **Writes**: `codebyplan round update` / `codebyplan task update` (CLI write-through; MCP
140
+ `update_round` / `update_task` break-glass) — `verify_manifest`, `verify_verdict`.
141
+ - **Spawns**: `cbp-verify-reviewer` (scope param); the `cbp-e2e-*` specialists feed Tier-1 proof
142
+ upstream in `cbp-round-build`.
143
+ - **Triggers**: `/cbp-round-plan` (any fail or more-work), `/cbp-round-complete` (scope=round
144
+ finalize), `/cbp-finalize` (scope=task READY).
145
+ - **References**: `reference/round-scope.md`, `reference/task-scope.md`,
146
+ `reference/deterministic-gates.md`.
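
Reader annotation (not part of the package): the Phase 5 routing table in the new `cbp-verify` skill could be expressed as a single function along these lines. This is a sketch, not the package's implementation; `route` and its parameters are hypothetical names, the return strings are illustrative labels, and the final branch is an assumption because the table has no row for a passing non-last round with no further work requested.

```python
def route(scope: str, passed: bool, more_work_wanted: bool = False,
          last_round: bool = False) -> str:
    """Return the single next directive for a combined Phase 2 + 3 + 4 result."""
    if scope not in ("round", "task"):
        raise ValueError(f"unknown scope: {scope!r}")
    if not passed:
        # Any gate, proof, or review failure opens a fix round.
        return "Next: /cbp-round-plan"
    if more_work_wanted:
        # A clean pass with more work wanted plans another round.
        return "Next: /cbp-round-plan"
    if scope == "task":
        return "Phase 6 (task)"
    if last_round:
        # Last clean round: re-enter at Phase 1 with scope=task.
        return "escalate to scope=task"
    # Assumption: not a row in the table. Treated here as round-level finalize.
    return "Phase 6 (round)"
```

Every branch returns exactly one directive, which mirrors the "single directive, never an A/B/C menu" rule stated in the skill.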
@@ -0,0 +1,114 @@
1
+ # Deterministic Gates — Command Contracts & Manifest
2
+
3
+ Authoritative gate-command + manifest detail for `cbp-verify`. The SKILL.md phases point here;
4
+ this file is loaded on demand.
5
+
6
+ ## KIND tool table
7
+
8
+ KIND is detected once at SKILL Phase 1 from the identifier shape. MCP tool names differ by KIND;
9
+ all writes prefer the CLI write-through and fall back to MCP.
10
+
11
+ | Operation | `checkpoint` KIND | `standalone` KIND |
12
+ |-----------|------------------|-------------------|
13
+ | Get task | local state (break-glass `get_current_task`) | `get_current_standalone_task(repo_id)` |
14
+ | Get rounds | local state (break-glass `get_rounds`) | `get_standalone_rounds(standalone_task_id)` |
15
+ | Update round | `codebyplan round update` (MCP `update_round`) | MCP `update_standalone_round` |
16
+ | Update task | `codebyplan task update` (MCP `update_task`) | MCP `update_standalone_task` |
17
+
18
+ Empty-arg KIND detection: probe `get_current_standalone_task` first; if found → `standalone`;
19
+ else `checkpoint` via `get_current_task`. (KIND detection is MCP-unavoidable — no identifier yet
20
+ means no local path to probe; everything after is local-first.)
21
+
22
+ ## Phase 1 resolution detail
23
+
24
+ | Parse | Resolution |
25
+ |-------|-----------|
26
+ | `{chk}-{task}-{round}` | checkpoint round. Read `.codebyplan/state/checkpoints/*.json` → filter `number==={chk}`; `.../tasks/*.json` → `{task}`; `.../rounds/*.json` → `{round}`. |
27
+ | `{chk}-{task}` | checkpoint task (scope=task). Resolve checkpoint + task; verify all rounds `completed`. |
28
+ | `{task}-{round}` | standalone round (scope=round). |
29
+ | `{task}` (bare) | standalone task (scope=task). |
30
+ | _(empty)_ | the active in-progress task/round from `.codebyplan/state/todos.json`. |
31
+
32
+ On any miss: `npx codebyplan sync` once, re-read; MCP `get_*` break-glass only when the state dir
33
+ is absent AND sync fails.
34
+
35
+ ## Phase 2 — `codebyplan check`
36
+
37
+ ```bash
38
+ codebyplan check --scope <round|task> --json
39
+ ```
40
+
41
+ JSON shape (`RunCheckResult`, source `packages/codebyplan-package/src/lib/check.ts:185`):
42
+
43
+ ```jsonc
44
+ {
45
+ "results": [
46
+ { "check": "gate6|lint|typecheck|tests|audit",
47
+ "status": "pass|fail|skipped",
48
+ "exit_code": 0,
49
+ "command": "...",
50
+ "stdout": "...", "stderr": "...",
51
+ "executed": true,
52
+ "new_failures": ["@scope/pkg", "GHSA-xxxx"] } // omitted for gate6
53
+ ],
54
+ "any_failed": false,
55
+ "hard_fail_checks": [], // names of checks that failed post-baseline-diff
56
+ "no_baseline": false
57
+ }
58
+ ```
59
+
60
+ - **`gate6`** (sibling-identity parity) is ALWAYS hard — never baselined, no `new_failures` field.
61
+ - `lint` / `typecheck` / `tests` / `audit` are **baseline-diffed**: `status: 'pass'` when
62
+ `new_failures` is `[]` even if the underlying command exited non-zero (pre-existing red is
63
+ tolerated). `audit.new_failures` lists new GHSA ids not in the allowlist.
64
+ - Verdict: `any_failed === true` (≡ `hard_fail_checks.length > 0`) is a fail — surface each failing
65
+ result's `new_failures` / `stdout` / `stderr`. **This JSON is the verdict; never substitute prose.**
66
+ - Soft tier uses NO `--no-baseline`. The whole-repo absolute-green tier
67
+ (`--scope merged --no-baseline`) belongs to checkpoint close, not this skill
68
+ (`rules/two-tier-ci.md`).
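
Reader annotation (not part of the package): a minimal Python sketch of reading the `codebyplan check --json` output described above. It relies only on the fields shown in the documented JSON shape; `summarize_check` is a hypothetical helper, and the sample payload in the usage note is invented for illustration.

```python
import json


def summarize_check(raw: str) -> dict:
    """Reduce the check JSON to a verdict plus the new failures to surface."""
    data = json.loads(raw)
    failing = [r for r in data["results"] if r["status"] == "fail"]
    return {
        # any_failed is the verdict; nothing is re-derived from exit codes.
        "verdict": "fail" if data["any_failed"] else "pass",
        "hard_fail_checks": data["hard_fail_checks"],
        # gate6 carries no new_failures field, hence the default.
        "new_failures": {r["check"]: r.get("new_failures", []) for r in failing},
    }


sample = json.dumps({
    "results": [
        # Pre-existing red: non-zero exit, but no NEW failures, so status is pass.
        {"check": "lint", "status": "pass", "exit_code": 1, "new_failures": []},
        {"check": "tests", "status": "fail", "exit_code": 1,
         "new_failures": ["@scope/pkg"]},
    ],
    "any_failed": True,
    "hard_fail_checks": ["tests"],
    "no_baseline": False,
})
summary = summarize_check(sample)
```

The sketch deliberately ignores `exit_code` when deciding the verdict, since the text says a baseline-diffed check can pass even when the underlying command exited non-zero.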
69
+
70
+ ## Phase 3 — `codebyplan e2e verify-round`
71
+
72
+ ```bash
73
+ codebyplan e2e verify-round --round-id <uuid> --task-id <uuid>
74
+ ```
75
+
76
+ Persist `round.context.e2e_eligible[]` + `e2e_outputs{}` FIRST (the CLI reads the round row from
77
+ the DB). Verdict JSON (`VerifyRoundResult`, source `packages/codebyplan-package/src/lib/e2e.ts:127`):
78
+
79
+ ```jsonc
80
+ { "round_id": "...", "task_id": "...",
81
+ "result": { "pass": true, "failed_checks": [], "skipped_validly": [] } }
82
+ ```
83
+
84
+ Exit 0 = pass. Exit 1 → one or more of `e2e_eligible_skipped` / `zero_assertion_run` /
85
+ `empty_gallery` in `result.failed_checks[]` — surface verbatim, carry as a fail, route to a fix
86
+ round (`rules/e2e-mandatory.md`). When `e2e_eligible[]` is empty, skip the call — nothing to verify.
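
Reader annotation (not part of the package): the handling of the `verify-round` verdict described above might look like this in Python. It is a sketch under assumptions: `e2e_verdict` is a hypothetical name, and the returned dictionary shape is illustrative rather than anything the CLI defines.

```python
import json
from typing import Optional


def e2e_verdict(e2e_eligible: list, raw_result: Optional[str] = None) -> dict:
    """Interpret a VerifyRoundResult JSON string, or skip when nothing is eligible."""
    if not e2e_eligible:
        # Nothing to verify, so the CLI call is skipped entirely.
        return {"status": "skipped", "failed_checks": []}
    if raw_result is None:
        raise ValueError("e2e_eligible is non-empty but no verify-round output was given")
    result = json.loads(raw_result)["result"]
    return {
        "status": "pass" if result["pass"] else "fail",
        # Surfaced verbatim, e.g. zero_assertion_run or empty_gallery.
        "failed_checks": list(result["failed_checks"]),
    }
```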
87
+
+ ## `claude_only` carve-out (deterministic-only path)
+
+ When the resolved profile is `claude_only` (round touched only `.claude/**` / docs / config — no
+ app surface), there is **no reviewer to spawn by design**. Proof IS the deterministic set:
+
+ 1. `codebyplan check --scope <round|task> --json` (gate6 + matrix as above).
+ 2. `bash -n <hook>` for each touched `.sh` file.
+ 3. SKILL/agent/rule structure sanity for touched `.claude/` files (line counts, no `/cbp-*`
+ legacy notation).
+
+ This is a first-class verification path, NOT a banned inline fallback
+ (`rules/spawn-failure-is-gate-failure.md` carve-out) — Phase 4's reviewer spawn is skipped, and
+ that skip is recorded as `verify_manifest.proof.tier: 4`, not a spawn failure.
+
+ ## verify-manifest write
+
+ Write the manifest into round/task context (merge into existing context — the `update_*`
+ REPLACE contract requires re-sending the full object):
+
+ ```bash
+ codebyplan round update --id <round_id> --task-id <uuid> --checkpoint-id <uuid> --context '<json>'
+ # break-glass: MCP update_round / update_standalone_round
+ ```
+
+ Schema (canonical in `rules/execution-proof.md`): `verify_manifest = { scope, gates[], proof{ tier,
+ artifacts[], e2e_verify_round }, decided_at }`. Each `proof.artifacts[].path` is proven committed
+ via `git ls-files --error-unmatch <path>` before it counts.
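Because the update replaces the whole context object, the manifest has to be merged into the existing context before it is sent. A minimal sketch of that merge, assuming the context is a plain JSON object: the `VerifyManifest` field types below are guesses derived from the schema line above (the canonical schema is in `rules/execution-proof.md`), and `withManifest` and `toContextArg` are hypothetical helper names.

```typescript
// Field names follow the schema line; the field types are assumptions.
interface VerifyManifest {
  scope: string;
  gates: unknown[];
  proof: {
    tier: number;
    artifacts: { path: string }[];
    e2e_verify_round?: unknown;
  };
  decided_at: string; // assumed to be an ISO timestamp
}

type Context = Record<string, unknown>;

// Keeps every existing key and sets verify_manifest, so the full object
// can be re-sent under the REPLACE contract.
function withManifest(existing: Context, manifest: VerifyManifest): Context {
  return { ...existing, verify_manifest: manifest };
}

// Serializes the merged context for use as the --context '<json>' value.
function toContextArg(context: Context): string {
  return JSON.stringify(context);
}
```

Sending only `{ verify_manifest }` would drop keys such as `e2e_eligible` from the stored context, which is the failure this merge step is meant to avoid.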
@@ -1,6 +1,8 @@
- # Findings Presentation in `/cbp-round-end` Step 7
+ # Findings Presentation & Infra Issue Absorption
 
- When `improve-round` returns findings, Step 7 presents them grouped by severity, then **auto-applies in-scope findings inline** (manual mode) or defers them to the next loop round (auto-loop mode). There is no findings-decision prompt.
+ When `cbp-verify-reviewer` returns findings, `cbp-verify` Phase 4 presents them grouped by
+ severity, then **auto-applies in-scope findings inline** (manual mode) or defers them to the next
+ loop round (auto-loop mode). There is no findings-decision prompt.
 
  ## Example output
 
@@ -24,14 +26,16 @@ When `improve-round` returns findings, Step 7 presents them grouped by severity,
 
  ## Auto-apply model (manual mode)
 
- Step 7 auto-applies all **in-scope** findings inline — no user prompt. A finding is *in-scope* when every file it references is within the round's `files_changed[]`; it is *out-of-scope* otherwise.
+ `cbp-verify` Phase 4 auto-applies all **in-scope** findings inline — no user prompt. A finding is
+ *in-scope* when every file it references is within the round's `files_changed[]`; it is
+ *out-of-scope* otherwise.
 
- **In-scope** → the round-end orchestrator (main context, has Edit/Write) applies the fix directly via `Edit` / `Write`, re-runs the verification commands (hook syntax check + `cbp-testing-qa-agent` scoped to modified files), and records it in `round.context.inline_fix_log = { findings: [ids], rationale, fixes: [...], applied_at: <ISO> }`. The `cbp-improve-round` agent stays read-only/advisory and never writes.
- **Out-of-scope** → saved to `round.context.improve_round_findings[]`; Step 8 routes them to `/cbp-round-input` (next round) or a new task per the Infra Issue Absorption Contract below.
+ **In-scope** → the verify orchestrator (main context, has Edit/Write) applies the fix directly via `Edit` / `Write`, re-runs the verification commands (hook syntax check + `cbp-testing-qa-agent` scoped to modified files), and records it in `round.context.inline_fix_log = { findings: [ids], rationale, fixes: [...], applied_at: <ISO> }`. The `cbp-verify-reviewer` agent stays read-only/advisory and never writes.
+ **Out-of-scope** → saved to `round.context.verify_findings[]`; Phase 5 routes them to `/cbp-round-plan` (next round) or a new task per the Infra Issue Absorption Contract below.
 
- The only user decision in Step 7 is the **baseline-regression accept** gate (baselines are NEVER auto-accepted). Under `auto_loop_mode`, Step 7 does not auto-apply — all findings are accepted into `improve_round_findings[]` and deferred to the next loop round.
+ The only user decision in Phase 4 is the **baseline-regression accept** gate (baselines are NEVER auto-accepted). Under `auto_loop_mode`, Phase 4 does not auto-apply — all findings are accepted into `verify_findings[]` and deferred to the next loop round.
 
- The **Trivial-Resolution Exception** below still governs the deeper bypass cases (skipping executor / testing-qa / improve-round for ≤5-line non-logic corrective rounds); it is referenced by `/cbp-round-execute` and `/cbp-task-testing` for infra-issue absorption.
+ The **Trivial-Resolution Exception** below still governs the deeper bypass cases (skipping executor / testing-qa / fresh-context review for ≤5-line non-logic corrective rounds); it is referenced by `/cbp-round-build` and `/cbp-verify` (task scope) for infra-issue absorption.
 
  ---
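The in-scope test from the auto-apply model above reduces to a set-membership check. The sketch below is one possible reading of it: the `Finding` shape (an `id` plus a `files` list), the treatment of `files_changed[]` as plain path strings, and the names `partitionFindings` and `FindingPartition` are all assumptions for illustration.

```typescript
// Hypothetical finding shape; the reviewer's real output format is not shown.
interface Finding {
  id: string;
  files: string[];
}

interface FindingPartition {
  applyInline: Finding[]; // in-scope, applied by the orchestrator
  deferred: Finding[]; // recorded in verify_findings[] for later routing
}

function partitionFindings(
  findings: Finding[],
  filesChanged: string[],
  autoLoopMode: boolean
): FindingPartition {
  // Under auto_loop_mode nothing is auto-applied; everything is deferred.
  if (autoLoopMode) return { applyInline: [], deferred: findings.slice() };

  const changed = new Set(filesChanged);
  const partition: FindingPartition = { applyInline: [], deferred: [] };
  for (const finding of findings) {
    // In-scope means EVERY referenced file is in the round's files_changed[].
    // Note: a finding with no files passes every() vacuously; how the real
    // orchestrator treats that case is not stated in the text.
    const inScope = finding.files.every((file) => changed.has(file));
    (inScope ? partition.applyInline : partition.deferred).push(finding);
  }
  return partition;
}
```

A finding that touches even one file outside the round's diff lands in `deferred`, which is why the check uses `every` rather than `some`.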
 
@@ -39,7 +43,7 @@ The **Trivial-Resolution Exception** below still governs the deeper bypass cases
 
  ### Resolve-in-Current-Scope by Default
 
- When `/cbp-round-execute` Step 5 (per-wave `cbp-testing-qa-agent`) or `/cbp-task-testing` surfaces a pre-existing infra-class issue (critical/high CVE, broken ESLint config-load, Playwright env-loading gap, dead CI pipeline, etc.), the default response is **absorb into current scope** — NOT create a standalone task.
+ When `/cbp-round-build` Step 5 (per-wave `cbp-testing-qa-agent`) or `/cbp-verify` (task scope) surfaces a pre-existing infra-class issue (critical/high CVE, broken ESLint config-load, Playwright env-loading gap, dead CI pipeline, etc.), the default response is **absorb into current scope** — NOT create a standalone task.
 
  Order of preference for routing a finding:
 
@@ -84,10 +88,10 @@ When the trivial-resolution exception qualifies, the orchestrator MAY bypass the
 
  | Stage | Bypass allowed when | Document as |
  |-------|--------------------|-------------|
- | `cbp-round-executor` | Single-file Edit fully specified by prior reviewer output | `bypass_log.executor: "single-file edit, used direct Edit"` |
+ | `cbp-round-builder` | Single-file Edit fully specified by prior reviewer output | `bypass_log.executor: "single-file edit, used direct Edit"` |
  | `cbp-testing-qa-agent` | Edit is non-code (comment, doc, type-annotation) AND existing test coverage protects the area | `bypass_log.testing_qa: "non-code edit, existing tests cover area"` |
- | `cbp-improve-round` | Diff is ≤5 lines AND no logic changed | `bypass_log.improve_round: "≤5 lines non-logic, skipped"` |
- | `cbp-task-planner` | Path B (the planner's trivial-corrective bypass that keeps repeat fix-rounds cheap) already qualifies | `bypass_log.planner: "Path B trivial-corrective bypass"` |
+ | `cbp-verify-reviewer` | Diff is ≤5 lines AND no logic changed | `bypass_log.review: "≤5 lines non-logic, skipped"` |
+ | `cbp-round-planner` | Path B (the planner's trivial-corrective bypass that keeps repeat fix-rounds cheap) already qualifies | `bypass_log.planner: "Path B trivial-corrective bypass"` |
 
  **ALL four bypasses simultaneously** is acceptable for ≤5-line non-logic corrective edits where every premise was verified by a prior reviewer.
 
@@ -95,7 +99,7 @@ When the trivial-resolution exception qualifies, the orchestrator MAY bypass the
 
  ### Infra-Class Issue Catalog
 
- These categories surface from per-wave `cbp-testing-qa-agent` or from `/cbp-task-testing`. Default routing for each is in-scope absorption unless genuinely off-axis from the active checkpoint.
+ These categories surface from per-wave `cbp-testing-qa-agent` or from `/cbp-verify` (task scope). Default routing for each is in-scope absorption unless genuinely off-axis from the active checkpoint.
 
  | Category | Examples |
  |----------|----------|
@@ -0,0 +1,62 @@
+ # Round-Scope Verify
+
+ Loaded by `cbp-verify` when `scope=round`. This is the per-round quality pass that runs after
+ `/cbp-round-build` finishes execution — the soft tier of `rules/two-tier-ci.md`.
+
+ ## What round scope verifies
+
+ The review window is THIS round's diff only (`round.files_changed` + `git diff` of the round). It
+ covers automated checks, fresh-context review spawn, and finished-round triage routing in one
+ scope-aware pass.
+
+ ## Phase mapping (round)
+
+ - **Phase 2 — gates**: `codebyplan check --scope round --json`. Baseline-tolerant: only NEW
+ per-package failures fail; `gate6` always hard. The JSON is the verdict.
+ - **Phase 3 — proof**: tier from the round's diff (`rules/execution-proof.md`).
+ - Tier 1: the `cbp-e2e-*` specialists already ran inside `cbp-round-build`; here persist
+ `e2e_eligible` / `e2e_outputs` then run `codebyplan e2e verify-round`. Empty gallery /
+ zero-assertion / eligible-skipped → fail.
+ - Tier 2/3: dev-server screenshot or HTTP trace for the round's changed routes/endpoints,
+ committed and proven via `git ls-files --error-unmatch`.
+ - Tier 4 (`claude_only`): deterministic-only path, no reviewer spawn (see
+ `reference/deterministic-gates.md`).
+ - **Phase 4 — review**: spawn `cbp-verify-reviewer` with `scope: 'round'`. Spawn failure = HARD
+ GATE FAILURE → STOP + retry directive (`rules/spawn-failure-is-gate-failure.md`). In-scope
+ mechanical findings → orchestrator applies via Edit/Write; blocking findings →
+ `/cbp-round-plan`. A baseline regression surfaced by the reviewer or e2e is a blocking
+ user-accept gate, never auto-accepted.
+
+ ## Phase 5 routing (round)
+
+ | Result | Directive |
+ |--------|-----------|
+ | Any gate / proof / review fail | `Next: /cbp-round-plan` (fix round) |
+ | Pass, but more work wanted on the task | `Next: /cbp-round-plan` (another round) |
+ | Pass + LAST round + clean | escalate to `scope=task` (re-enter Phase 1) |
+
+ "More work wanted" is signalled the same way the old pipeline did — unstaged files at
+ `/cbp-round-complete` mean the user wants more on them. cbp-verify does not decide that; it routes
+ to the human gate and lets staging speak.
+
+ ## Phase 6 finalize (round) — hand to the human git-add gate
+
+ cbp-verify does NOT complete the round and NEVER `git add`s. On a clean pass it persists the
+ `verify_manifest` to round context and routes:
+
+ ```
+ Next: /cbp-round-complete
+ ```
+
+ `/cbp-round-complete` is the separate `ask`-tier, `disable-model-invocation` finalizer: the user
+ stages the files they approve (`git add`), the skill reconciles via `codebyplan round
+ sync-approvals` and `complete_round`, then routes onward (all files approved → escalate to task
+ verify; some withheld → `/cbp-round-plan`). The permission prompt on `/cbp-round-complete` IS the
+ human confirmation — do not add an AskUserQuestion in cbp-verify at round scope.
+
+ ## Writes (round)
+
+ `codebyplan round update --id <round_id> --task-id <uuid> --checkpoint-id <uuid> --context '<json>'`
+ (merge `verify_manifest` into existing context; the REPLACE contract requires the full object).
+ Break-glass: MCP `update_round` (checkpoint KIND) / `update_standalone_round` (standalone KIND) —
+ pass `caller_worktree_id` on locked feat rows.
@@ -0,0 +1,71 @@
+ # Task-Scope Verify
+
+ Loaded by `cbp-verify` when `scope=task` — reached by escalation from the last clean round, or by
+ an explicit `{chk}-{task}` / bare-`{task}` argument. This is the holistic cross-round
+ double-check — AI production review plus comprehensive task-level testing in one pass.
+
+ ## Precondition
+
+ All rounds of the task must be `completed`. If any round is `in_progress`, STOP:
+
+ ```
+ ## Cannot run task verify
+ TASK-[N] has an active round (Round [N]). Finish it first (run /cbp-verify at round scope, then
+ /cbp-round-complete).
+ ```
+
+ ## What task scope verifies
+
+ The review window is the FULL aggregated task diff — all rounds' `files_changed` deduplicated
+ (latest action per path wins). Task scope catches what no single round can see: requirements
+ traceability, checkpoint-goal alignment, cross-round integration gaps, whole-repo lint/type/test
+ regressions, and shippability.
+
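The "latest action per path wins" deduplication described above might look like the sketch below. It assumes each `files_changed` entry carries a `path` and an `action` and that rounds are supplied in chronological order; the `FileChange` shape and the `aggregateFilesChanged` name are hypothetical, since the section does not show the stored format.

```typescript
// Assumed entry shape for a round's files_changed list.
interface FileChange {
  path: string;
  action: string; // e.g. "created", "modified", "deleted" (assumed values)
}

// rounds must be ordered oldest first so that later rounds overwrite earlier
// entries for the same path.
function aggregateFilesChanged(rounds: FileChange[][]): FileChange[] {
  const latestByPath = new Map<string, FileChange>();
  for (const round of rounds) {
    for (const change of round) {
      // Map.set on an existing key replaces the value but keeps the key's
      // original position, so output order follows first appearance.
      latestByPath.set(change.path, change);
    }
  }
  return Array.from(latestByPath.values());
}
```

With this reading, a file created in one round and deleted in a later round appears once in the task review window, carrying the later `deleted` action.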
+ ## Phase mapping (task)
+
+ - **Phase 2 — gates**: `codebyplan check --scope task --json`. Whole-repo + baseline; only NEW
+ per-package failures fail; `gate6` always hard. This is the cross-package layer invisible to
+ per-round checks (a non-web package edit that slipped past per-round web-only lints surfaces
+ here).
+ - **Phase 3 — proof**: aggregate proof across the task diff — every UI surface touched across all
+ rounds must have a committed artifact (`rules/execution-proof.md`). Re-run `codebyplan e2e
+ verify-round` for each round whose `e2e_eligible[]` is non-empty.
+ - **Phase 4 — review**: spawn `cbp-verify-reviewer` with `scope: 'task'`. It grades each
+ requirement (`met`/`partially met`/`not met` with `path:line` evidence), checks
+ `checkpoint.goal` alignment, runs the holistic cross-round code review + shippable gate, and
+ surfaces `scope_divergence_candidates`. Spawn failure = HARD GATE FAILURE → STOP + retry
+ (`rules/spawn-failure-is-gate-failure.md`).
+
+ ## Phase 6 — the ONE genuine human step
+
+ After the deterministic gates + reviewer pass, run a single batched `AskUserQuestion` walkthrough:
+ present every user-testable item (visual quality, UX flow, business-logic correctness, edge cases,
+ content accuracy) in ONE checklist prompt with a single overall answer — NEVER one question per
+ item. Generate the items from task requirements + the aggregated diff + round context.
+
+ `scope_divergence_candidates` from the reviewer are confirmed here (the reviewer cannot capture
+ user input — it is read-only). If the user confirms a divergence about FUTURE scope, route to
+ `/cbp-checkpoint-update` instead of finalize (the current task delivered correctly; the divergence
+ belongs to checkpoint replanning).
+
+ ## Phase 5/6 routing (task)
+
+ | Result | Directive |
+ |--------|-----------|
+ | Any gate / proof / review fail (fixable) | `Next: /cbp-round-plan` (fix round) |
+ | Reviewer NOT_READY — needs new task scope | `Suggest: /cbp-task-create` then STOP (user scope decision) |
+ | Confirmed future-scope divergence | `Next: /cbp-checkpoint-update` |
+ | Pass + user satisfied | write verdict, `Next: /cbp-finalize` |
+
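The routing table above can be read as a lookup from outcome to directive. The sketch below encodes it that way; the directive strings are taken from the table, while the `TaskOutcome` labels, the `stop` and `writeVerdict` flags, and the `taskDirective` name are hypothetical conveniences rather than anything the skill defines.

```typescript
// Hypothetical labels for the four table rows.
type TaskOutcome =
  | "fixable_fail"
  | "needs_new_task_scope"
  | "future_scope_divergence"
  | "pass_user_satisfied";

interface TaskRoute {
  directive: string; // text taken from the routing table
  stop: boolean; // true when the user must make a scope decision first
  writeVerdict: boolean; // true only on the pass path
}

const TASK_ROUTES: Record<TaskOutcome, TaskRoute> = {
  fixable_fail: { directive: "Next: /cbp-round-plan", stop: false, writeVerdict: false },
  needs_new_task_scope: { directive: "Suggest: /cbp-task-create", stop: true, writeVerdict: false },
  future_scope_divergence: { directive: "Next: /cbp-checkpoint-update", stop: false, writeVerdict: false },
  pass_user_satisfied: { directive: "Next: /cbp-finalize", stop: false, writeVerdict: true },
};

function taskDirective(outcome: TaskOutcome): TaskRoute {
  return TASK_ROUTES[outcome];
}
```

Typing the table as `Record<TaskOutcome, TaskRoute>` makes the compiler flag any outcome that lacks a route, which is the main reason to prefer a lookup over a chain of conditionals here.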
+ On the pass path, write `task.context.verify_verdict = { verdict: 'READY', manifest, user_tests,
+ decided_at }`:
+
+ ```bash
+ codebyplan task update --id <task_id> --checkpoint-id <uuid> --context '<json>'
+ # break-glass: MCP update_task (checkpoint KIND) / update_standalone_task (standalone KIND)
+ ```
+
+ `/cbp-finalize` (the task-level ship finalizer) reads
+ `task.context.verify_verdict` — it must exist with `verdict: 'READY'` before finalize proceeds.
+ cbp-verify never edits source at task scope beyond the orchestrator-applied in-scope mechanical
+ fixes from Phase 4; it never `git add`s.