codebyplan 1.13.52 → 1.13.54

Files changed (92)
  1. package/dist/cli.js +3226 -897
  2. package/package.json +1 -1
  3. package/templates/agents/cbp-database-agent.md +1 -1
  4. package/templates/agents/cbp-e2e-maestro.md +1 -1
  5. package/templates/agents/cbp-e2e-playwright.md +24 -16
  6. package/templates/agents/cbp-e2e-tauri.md +1 -1
  7. package/templates/agents/cbp-e2e-vscode.md +1 -1
  8. package/templates/agents/cbp-e2e-xcuitest.md +1 -1
  9. package/templates/agents/cbp-improve-claude.md +2 -2
  10. package/templates/agents/{cbp-round-executor.md → cbp-round-builder.md} +23 -23
  11. package/templates/agents/{cbp-task-planner.md → cbp-round-planner.md} +26 -25
  12. package/templates/agents/cbp-security-agent.md +10 -2
  13. package/templates/agents/cbp-stripe-agent.md +2 -2
  14. package/templates/agents/cbp-testing-qa-agent.md +34 -20
  15. package/templates/agents/cbp-verify-reviewer.md +236 -0
  16. package/templates/context/architecture-map.md +4 -4
  17. package/templates/context/mcp-docs.md +57 -11
  18. package/templates/context/testing/e2e.md +9 -9
  19. package/templates/github-workflows/ci.yml +104 -0
  20. package/templates/github-workflows/publish.yml +8 -27
  21. package/templates/github-workflows/release-desktop.yml +215 -0
  22. package/templates/hooks/cbp-skill-context-guard.sh +1 -1
  23. package/templates/hooks/cbp-test-hooks.sh +9 -9
  24. package/templates/hooks/validate-structure-lengths.sh +1 -1
  25. package/templates/hooks/validate-structure-patterns.sh +1 -1
  26. package/templates/rules/README.md +1 -2
  27. package/templates/rules/agent-claim-verification.md +1 -1
  28. package/templates/rules/context-file-loading.md +10 -10
  29. package/templates/rules/development-workflow.md +73 -0
  30. package/templates/rules/e2e-mandatory.md +8 -8
  31. package/templates/rules/execution-proof.md +70 -0
  32. package/templates/rules/model-invocation-convention.md +2 -2
  33. package/templates/rules/parallel-waves.md +11 -11
  34. package/templates/rules/spawn-failure-is-gate-failure.md +76 -0
  35. package/templates/rules/task-routing-recommendation.md +1 -1
  36. package/templates/rules/todo-backend.md +3 -3
  37. package/templates/rules/two-tier-ci.md +63 -0
  38. package/templates/settings.project.base.json +15 -11
  39. package/templates/skills/cbp-build-cc-mode/SKILL.md +1 -1
  40. package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md +7 -7
  41. package/templates/skills/cbp-build-cc-skill/SKILL.md +1 -1
  42. package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md +2 -2
  43. package/templates/skills/cbp-build-cc-skill/reference/fork-eligibility.md +11 -14
  44. package/templates/skills/cbp-checkpoint-check/SKILL.md +11 -3
  45. package/templates/skills/cbp-checkpoint-create/SKILL.md +16 -1
  46. package/templates/skills/cbp-checkpoint-end/SKILL.md +5 -1
  47. package/templates/skills/cbp-checkpoint-update/SKILL.md +3 -3
  48. package/templates/skills/cbp-clear-continue/SKILL.md +2 -2
  49. package/templates/skills/cbp-clear-prep/SKILL.md +3 -3
  50. package/templates/skills/{cbp-task-complete → cbp-finalize}/SKILL.md +25 -29
  51. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/checkpoint-done-branching.md +1 -1
  52. package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/next-step-heuristic.md +1 -1
  53. package/templates/skills/cbp-frontend-design/SKILL.md +1 -1
  54. package/templates/skills/cbp-frontend-ui/SKILL.md +7 -7
  55. package/templates/skills/cbp-git-commit/SKILL.md +3 -3
  56. package/templates/skills/cbp-merge-main/SKILL.md +4 -4
  57. package/templates/skills/{cbp-round-execute → cbp-round-build}/SKILL.md +93 -75
  58. package/templates/skills/cbp-round-complete/SKILL.md +15 -14
  59. package/templates/skills/cbp-round-plan/SKILL.md +344 -0
  60. package/templates/skills/cbp-session-end/SKILL.md +1 -1
  61. package/templates/skills/cbp-setup-cd/SKILL.md +291 -0
  62. package/templates/skills/cbp-setup-cd/reference/github-actions-cd.md +231 -0
  63. package/templates/skills/cbp-setup-ci/SKILL.md +175 -0
  64. package/templates/skills/cbp-setup-ci/reference/github-actions.md +100 -0
  65. package/templates/skills/cbp-ship/SKILL.md +21 -0
  66. package/templates/skills/cbp-ship-main/SKILL.md +3 -2
  67. package/templates/skills/cbp-standalone-task-check/SKILL.md +10 -9
  68. package/templates/skills/cbp-standalone-task-complete/SKILL.md +12 -13
  69. package/templates/skills/cbp-standalone-task-create/SKILL.md +16 -9
  70. package/templates/skills/cbp-standalone-task-start/SKILL.md +9 -5
  71. package/templates/skills/cbp-standalone-task-testing/SKILL.md +16 -7
  72. package/templates/skills/cbp-task-create/SKILL.md +6 -7
  73. package/templates/skills/cbp-task-start/SKILL.md +8 -8
  74. package/templates/skills/cbp-todo/SKILL.md +6 -8
  75. package/templates/skills/cbp-verify/SKILL.md +146 -0
  76. package/templates/skills/cbp-verify/reference/deterministic-gates.md +114 -0
  77. package/templates/skills/{cbp-round-end → cbp-verify}/reference/findings-presentation.md +16 -12
  78. package/templates/skills/cbp-verify/reference/round-scope.md +62 -0
  79. package/templates/skills/cbp-verify/reference/task-scope.md +71 -0
  80. package/templates/agents/cbp-improve-round.md +0 -283
  81. package/templates/agents/cbp-task-check.md +0 -217
  82. package/templates/skills/cbp-round-check/SKILL.md +0 -132
  83. package/templates/skills/cbp-round-end/SKILL.md +0 -173
  84. package/templates/skills/cbp-round-end/reference/inline-fallback.md +0 -35
  85. package/templates/skills/cbp-round-execute/reference/inline-fallback.md +0 -55
  86. package/templates/skills/cbp-round-input/SKILL.md +0 -197
  87. package/templates/skills/cbp-round-start/SKILL.md +0 -261
  88. package/templates/skills/cbp-round-update/SKILL.md +0 -120
  89. package/templates/skills/cbp-ship/templates/workflow-eas-submit.yml +0 -53
  90. package/templates/skills/cbp-ship/templates/workflow-vsce-publish.yml +0 -31
  91. package/templates/skills/cbp-task-check/SKILL.md +0 -172
  92. package/templates/skills/cbp-task-testing/SKILL.md +0 -277
@@ -25,7 +25,7 @@ Any multi-segment input is an error:
 
  ```
  standalone-task-complete: argument `{value}` looks like a checkpoint-task pair.
- Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks.
+ Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks.
  Standalone tasks use a bare number, e.g. /cbp-standalone-task-complete 45.
  ```
 
@@ -35,7 +35,7 @@ Error cases: any multi-segment input, `abc`, `108-`, `-1`, anything with whitesp
 
  - `standalone-task-complete 45` → standalone TASK-45
  - `standalone-task-complete` (no arg) → active in-progress task via `get_current_standalone_task`
- - `standalone-task-complete 141-3` → error: "Use /cbp-task-complete {chk}-{task} for checkpoint-bound tasks."
+ - `standalone-task-complete 141-3` → error: "Use /cbp-finalize {chk}-{task} for checkpoint-bound tasks."
  - `standalone-task-complete abc` → error: malformed
 
  ### Step 1.5: Get Current Task
@@ -56,7 +56,7 @@ If any round is `in_progress`:
  ```
  ## Cannot Complete Standalone Task
 
- Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-round-update` to finish it.
+ Standalone TASK-[N] has an active round (Round [N]). Run `/cbp-verify` to finish it.
  ```
 
  Stop here.
@@ -66,7 +66,7 @@ Verify at least one round has `testing_qa_output` in its context. If not:
  ```
  ## Cannot Complete Standalone Task
 
- No testing-qa-agent validation found. Run `/cbp-round-start` to execute a validated round.
+ No testing-qa-agent validation found. Run `/cbp-round-plan` to execute a validated round.
  ```
 
  Stop here.
@@ -179,18 +179,17 @@ When `branch_deleted === true` in the ship JSON:
 
  ### Step 7.5: Complete Standalone Task
 
- Note: `complete_standalone_task` is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
+ Note: completion is called only after `codebyplan ship` succeeds (no `checks_failed`) — the DB completion record reflects work that has landed in production.
 
- Resolve caller worktree: `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`.
+ Complete via the CLI (wraps the `complete_standalone_task` MCP tool):
 
- Call `complete_standalone_task(standalone_task_id, caller_worktree_id: CALLER_WT)`. `caller_worktree_id` is REQUIRED — the MCP server's pre-guard rejects mutations from non-matching worktrees. The server auto-clears `assigned_worktree_id` on the task on success.
+ ```bash
+ codebyplan standalone-task complete --id <standalone_task.id>
+ ```
 
- If `CALLER_WT` is empty, surface this warning and ask user to confirm before proceeding:
+ The CLI auto-resolves `caller_worktree_id` (override → worktree cache → resolver). `caller_worktree_id` is REQUIRED the MCP server's pre-guard rejects mutations from non-matching worktrees, and the CLI hard-fails (exit 1) with registration guidance rather than sending an undefined id. The server auto-clears `assigned_worktree_id` on the task on success.
 
- ```
- Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
- The complete_standalone_task call may be rejected by the pre-guard. Proceed anyway? (yes / no)
- ```
+ If the CLI exits 1 with a "could not resolve caller_worktree_id" message, run `npx codebyplan setup` (or `codebyplan resolve-worktree --cache`) from this worktree, then re-run the command.
 
  ### Step 8: Run Cleanup + Migration (inline)
 
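Reviewer note: the "override → worktree cache → resolver" order described in the hunk above can be pictured with a small shell sketch. This is only an illustration of a first-match fallback chain; `CBP_WORKTREE_ID_OVERRIDE`, the cache file path, and `resolve_via_cli` are hypothetical names, not the CLI's actual internals.

```bash
# Hypothetical stand-in for a real resolver; it prints nothing here,
# which simulates "the resolver could not find a worktree".
resolve_via_cli() { :; }

# Sketch of an "override -> cache -> resolver" lookup.
# $1 is an assumed cache file path (illustrative default below).
resolve_caller_worktree_id() {
  cache_file="${1:-.codebyplan/worktree-id.cache}"
  if [ -n "${CBP_WORKTREE_ID_OVERRIDE:-}" ]; then
    # 1. An explicit override wins.
    printf '%s\n' "$CBP_WORKTREE_ID_OVERRIDE"
  elif [ -s "$cache_file" ]; then
    # 2. Otherwise use the first line of a non-empty cache file.
    head -n 1 "$cache_file"
  else
    # 3. Otherwise ask the resolver; fail hard instead of returning an empty id.
    id=$(resolve_via_cli 2>/dev/null)
    if [ -n "$id" ]; then
      printf '%s\n' "$id"
    else
      echo "could not resolve caller_worktree_id" >&2
      return 1
    fi
  fi
}
```

The final branch mirrors the behaviour the hunk describes: exit non-zero with a message rather than send an undefined id.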
@@ -238,6 +237,6 @@ Do NOT use AskUserQuestion for routing. Do NOT use the Skill tool to auto-trigge
  - **Chain**: `/cbp-standalone-task-check` → `/cbp-standalone-task-testing` → `/cbp-standalone-task-complete`
  - **Delegates to**: `codebyplan ship` CLI (Step 7 — PR creation, check polling, merge, branch cleanup)
  - **Reads**: MCP `get_current_standalone_task`, `get_standalone_tasks`, `get_standalone_rounds`
- - **Writes**: MCP `update_standalone_task`, `complete_standalone_task`
+ - **Writes**: MCP `update_standalone_task` (Step 6 files); `codebyplan standalone-task complete` (wraps `complete_standalone_task`)
  - **Uses skills (inline, no sub-agent)**: `cleanup` (if deletions), `migration` (if exports renamed)
  - **Does NOT** auto-trigger next skill — emits single directive only
@@ -17,7 +17,7 @@ Create a new standalone task — independent of any checkpoint. Gathers user con
 
  ## Identifier Notation
 
- Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
+ Standalone tasks use a bare number (e.g. `45` = standalone TASK-45). There is no checkpoint segment. Canonical notation follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
 
  ## Instructions
 
@@ -96,13 +96,20 @@ Resolve worktree_id via `npx codebyplan resolve-worktree 2>/dev/null`.
 
  ### Step 7: Create Standalone Task
 
- Use MCP `create_standalone_task` with:
- - **repo_id**: from `.codebyplan/repo.json`
- - **title**: Concise task title
- - **number**: next available number from Step 3
- - **requirements**: Numbered requirements list
- - **context**: Include decisions from Q&A and source findings
- - **assigned_worktree_id**: from Step 6 (if resolved)
+ Create via the CLI (wraps the `create_standalone_task` MCP tool; auto-resolves `caller_worktree_id`):
+
+ ```bash
+ codebyplan standalone-task create \
+ --title "<concise task title>" \
+ --number <next number from Step 3> \
+ --requirements "<numbered requirements list>" \
+ --context '<JSON: decisions from Q&A + source findings>' \
+ --assigned-worktree-id <from Step 6, if resolved>
+ ```
+
+ - `--repo-id` is optional — the CLI reads it from `.codebyplan/repo.json`.
+ - Omit `--assigned-worktree-id` when Step 6 did not resolve a worktree.
+ - On success the CLI prints the created row JSON (including `.id`) to stdout.
 
  ```
  ## Standalone Task Created
@@ -145,6 +152,6 @@ Waiting for user to decide next step.
  ## Integration
 
  - **Reads**: MCP `get_standalone_tasks`
- - **Writes**: MCP `create_standalone_task`
+ - **Writes**: `codebyplan standalone-task create` (wraps `create_standalone_task` MCP tool)
  - **Triggered by**: user manual
  - **Does NOT auto-trigger** next command — user decides
@@ -149,13 +149,17 @@ Load context from DB:
 
  ### Step 5: Set Task Status
 
- Use MCP `update_standalone_task(task_id, status: "in_progress")`.
+ Set status via the CLI (wraps `update_standalone_task`; auto-resolves `caller_worktree_id`):
 
- If `CALLER_WT` is present, include `caller_worktree_id: CALLER_WT`.
+ ```bash
+ codebyplan standalone-task update --id <task.id> --status in_progress
+ ```
+
+ `--id` is the standalone task UUID resolved in Step 2. The CLI resolves `caller_worktree_id` itself (override → worktree cache → resolver), so `CALLER_WT` does not need to be passed.
 
  ### Step 6: Auto-trigger Round Start
 
- Trigger `/cbp-round-start` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
+ Trigger `/cbp-round-plan` with no argument. Do NOT pass the task number — round-start's 2-segment form means standalone TASK-{a} round {b}, not a checkpoint task. Passing no argument causes round-start to derive the active task/round from state, which is correct.
 
  ```
  Starting first round...
@@ -164,5 +168,5 @@ Starting first round...
  ## Integration
 
  - **Reads**: MCP `get_standalone_tasks`, `get_current_standalone_task`, `get_standalone_rounds`
- - **Writes**: MCP `update_standalone_task`
- - **Triggers**: `/cbp-round-start` (no argument — auto, round 1)
+ - **Writes**: `codebyplan standalone-task update` (Step 5 status); MCP `update_standalone_task` (Step 3.4 branch_name persist)
+ - **Triggers**: `/cbp-round-plan` (no argument — auto, round 1)
@@ -19,7 +19,7 @@ Comprehensive task-level testing for standalone tasks — the **cross-round doub
 
  ## Scope vs Round-Level Validation
 
- Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
+ Per-wave `testing-qa-agent` runs inside `/cbp-round-build` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
 
  ## Instructions
 
@@ -34,7 +34,7 @@ Any multi-segment input is an error:
 
  ```
  standalone-task-testing: argument `{value}` looks like a checkpoint-task pair.
- Use /cbp-task-testing {chk}-{task} for checkpoint-bound tasks.
+ Use /cbp-verify {chk}-{task} for checkpoint-bound tasks.
  Standalone tasks use a bare number, e.g. /cbp-standalone-task-testing 45.
  ```
 
@@ -57,7 +57,7 @@ Use MCP `get_standalone_rounds(standalone_task_id)`. Verify all rounds are `comp
  ## Cannot Run Standalone Task Testing
 
  Standalone TASK-[N] has an active round (Round [N]). Complete it first:
- - Run `/cbp-round-update` to finish the round
+ - Run `/cbp-verify` to finish the round
  ```
 
  Stop.
@@ -86,13 +86,22 @@ Read every non-deleted file in the aggregated list. Build a mental model of the
 
  Capture stdout and stderr for each check.
 
+ **ci.json command resolution (absent-fallback safe):** Before running the checks below, resolve commands from `.codebyplan/ci.json`:
+
+ ```bash
+ CI_TYPES_CMD=$(npx codebyplan ci resolve typecheck 2>/dev/null)
+ CI_UNIT_CMD=$(npx codebyplan ci resolve unit_test 2>/dev/null)
+ ```
+
+ Fallback: if `.codebyplan/ci.json` is absent, `ci resolve` returns the central default (exit 0). If the binary is unavailable, the variable is empty and the `${CI_*_CMD:-<literal>}` guards in the table below activate the hardcoded fallback.
+
  **Hard-fail tests** (block completion):
 
  | Category | Command | Condition |
  |----------|---------|-----------|
  | Full-repo lint | `pnpm -w lint` | Always |
- | Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
- | Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
+ | Full-repo types | `${CI_TYPES_CMD:-pnpm exec tsc --noEmit}` | Source files changed |
+ | Full-repo unit tests | `${CI_UNIT_CMD:-pnpm test --run}` | Source files in aggregated_files |
  | Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
 
  These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here.
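Reviewer note: the `${CI_*_CMD:-<literal>}` guard in the hunk above relies on standard shell parameter expansion, where `${VAR:-default}` yields `default` when `VAR` is unset or empty. A minimal sketch of that behaviour follows; `pick_cmd` is a hypothetical helper introduced only for illustration and is not part of the package.

```bash
# Hypothetical helper: print the resolved command ($1) if it is non-empty,
# otherwise the hardcoded fallback ($2). This is the same ":-" expansion
# the table uses, applied to positional parameters.
pick_cmd() {
  printf '%s\n' "${1:-$2}"
}

# Resolver produced nothing (e.g. the binary was unavailable): fallback is used.
pick_cmd "" "pnpm exec tsc --noEmit"

# Resolver produced a command ("turbo run typecheck" is an invented example value).
pick_cmd "turbo run typecheck" "pnpm exec tsc --noEmit"
```

Because `:-` treats an empty string the same as an unset variable, a failed `$(...)` substitution that prints nothing falls through to the literal without any extra check.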
@@ -176,11 +185,11 @@ Next: /cbp-standalone-task-complete {N}
  ---
 
  **Next:**
- Run `/cbp-round-input` to address the minor issues found during testing.
+ Run `/cbp-round-plan` to address the minor issues found during testing.
 
  ---
 
- Waiting for user to run `/cbp-round-input`.
+ Waiting for user to run `/cbp-round-plan`.
 
  **Major problems found:**
 
@@ -10,13 +10,12 @@ Create a new task within the active checkpoint. Gathers user context, analyzes e
 
  ## When Used
 
- - Suggested by `/cbp-task-check` when scope issues require a new task
- - Suggested by `/cbp-task-testing` when major problems need a separate task
+ - Suggested by `/cbp-verify` (task scope) when scope issues or major problems require a separate task
  - User manually wants to add a task to the current checkpoint
 
  ## Identifier Notation
 
- This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
+ This skill operates on the **active** checkpoint resolved via MCP `get_current_task` and does not accept a positional identifier argument. The task it creates gets its `number` from the next-available slot within the active checkpoint (checkpoint-bound). Canonical chk-task-round notation — used in prose, error messages, and cross-references — follows `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary": `108-1` (CHK-108 TASK-1), `108-1-2` (round 2 of CHK-108 TASK-1).
 
  **Bare-number argument**: if a bare number (e.g. `42`) is provided with no checkpoint context, this skill cannot resolve it to a checkpoint-bound task:
 
@@ -44,8 +43,8 @@ Use AskUserQuestion to understand the new task:
 
  Why is this task needed? What should it accomplish?
 
- If this was triggered by `/cbp-task-check` or `/cbp-task-testing`, the findings are:
- [pre-loaded context from check/testing findings if available]
+ If this was triggered by `/cbp-verify` (task scope), the findings are:
+ [pre-loaded context from verify findings if available]
 
  Please describe:
  1. What the task should accomplish
@@ -70,7 +69,7 @@ Discovered issues MUST be captured. The default target is current scope (round
 
  | Situation | Action |
  |-----------|--------|
- | Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-round-end` reference `findings-presentation.md` "Trivial-Resolution Exception" |
+ | Trivial inline fix (≤5 min, mechanical, scope-clean) | Apply in the CURRENT round per `cbp-verify` reference `findings-presentation.md` "Trivial-Resolution Exception" |
  | Related to the current task's domain | Create a new ROUND in the current task |
  | Fits the current checkpoint goal but is meaningfully separate | Create a new TASK in the current checkpoint via `create_task(checkpoint_id)` |
  | Large enough to need multiple tasks AND fits no current checkpoint | Create a NEW CHECKPOINT via `create_checkpoint` |
@@ -193,5 +192,5 @@ Waiting for user to decide next step.
 
  - **Reads**: Local state `.codebyplan/state/checkpoints/<id>.json` + `.../tasks/<id>.json`; on miss `npx codebyplan sync` once; MCP `get_current_task` / `get_tasks` as documented break-glass when the state dir is absent and sync fails. Step 3.5 dedup `get_tasks(standalone=true)` stays MCP — no local-state equivalent for standalone listing.
  - **Writes**: `codebyplan task create --checkpoint-id <id> ...` (CLI write-through); MCP `create_task` break-glass.
- - **Triggered by**: `/cbp-task-check` (suggested), `/cbp-task-testing` (suggested), user manual
+ - **Triggered by**: `/cbp-verify` (task scope, suggested), user manual
  - **Does NOT auto-trigger** next command — user decides
@@ -1,7 +1,7 @@
  ---
  name: cbp-task-start
  description: Start a task, load context from DB
- triggers: [cbp-round-start]
+ triggers: [cbp-round-plan]
  argument-hint: [chk-task] # e.g. `108-1` (CHK-108 TASK-1)
  effort: xhigh
  ---
@@ -14,7 +14,7 @@ Start a task by loading context from the database and preparing for work.
 
  ### Step 1: Parse `$ARGUMENTS`
 
- Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
+ Parse the argument using the canonical chk-task-round notation (see `cbp-round-plan` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
 
  | Shape | Regex | Resolves to |
  |-------|-------|-------------|
@@ -30,7 +30,7 @@ task-start: invalid argument `{value}`. Expected:
  (empty) → next pending task
 
  For standalone tasks, use `/cbp-standalone-task-start {N}`.
- For a specific round, use `/cbp-round-start 108-1-2`.
+ For a specific round, use `/cbp-round-plan 108-1-2`.
  ```
 
  Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108--1`, anything with whitespace or non-numeric characters.
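Reviewer note: the argument shapes and error cases listed in the hunk above can be sketched as a small shell classifier. The regexes below are assumptions inferred from the listed examples (the skill's own regex table is not visible in this diff), and `classify_task_start_arg` with its result labels is a hypothetical name used only for illustration.

```bash
# Hypothetical helper: classify a task-start argument by its shape.
classify_task_start_arg() {
  arg="${1-}"
  case "$arg" in
    "")            echo "next-pending";    return ;;  # (empty) -> next pending task
    *[[:space:]]*) echo "error:malformed"; return ;;  # any whitespace is rejected
  esac
  if printf '%s\n' "$arg" | grep -Eq '^[0-9]+-[0-9]+$'; then
    echo "chk-task"            # e.g. 108-1 -> CHK-108 TASK-1
  elif printf '%s\n' "$arg" | grep -Eq '^[0-9]+-[0-9]+-[0-9]+$'; then
    echo "error:round-shape"   # e.g. 108-1-2 belongs to /cbp-round-plan
  elif printf '%s\n' "$arg" | grep -Eq '^[0-9]+$'; then
    echo "error:standalone"    # bare number -> /cbp-standalone-task-start
  else
    echo "error:malformed"     # abc, 108-, -1, 108--1, other non-numeric input
  fi
}
```

Checking the empty and whitespace cases before the regexes keeps each pattern anchored on a single-line value, so `grep` never sees multi-line input.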
@@ -40,7 +40,7 @@ Error cases: `108-1-2` (that is round-start's shape), `abc`, `108-`, `-1`, `108-
  - `task-start 108-1` → CHK-108 TASK-1
  - `task-start` (no arg) → next pending via `get_current_task`
  - `task-start 45` → error: "Use /cbp-standalone-task-start 45 instead — bare numbers no longer route to standalone tasks."
- - `task-start 108-1-2` → error: "use `/cbp-round-start 108-1-2`"
+ - `task-start 108-1-2` → error: "use `/cbp-round-plan 108-1-2`"
  - `task-start abc` → error: malformed
  - `task-start 108-` → error: malformed
 
@@ -75,7 +75,7 @@ Ask via AskUserQuestion, naming the resolved task and disclosing the actions:
  > - **Cancel** — do nothing
 
  - **Proceed** → continue to Step 3.
- - **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-start` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
+ - **Cancel** → abort cleanly: make NO writes (no branch switch, no commit, no `update_task`, no `/cbp-round-plan` trigger) and exit with one line: `Cancelled by user — TASK-{N} not started.`
 
  ### Step 3: Branch Auto-Handling
 
@@ -221,17 +221,17 @@ Display context summary:
 
  ### Step 6: Auto-trigger Round Start
 
- The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-start` for the first round.
+ The Step 2.5 permission gate already covered this hand-off (the user approved running the skill, round-1 auto-start included), so no further prompt is needed here — automatically trigger `/cbp-round-plan` for the first round.
 
  ```
  Starting first round...
  ```
 
- Trigger `/cbp-round-start` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
+ Trigger `/cbp-round-plan` with **no argument**. Do NOT pass the task identifier (`{chk}-{task}`) — round-start's 2-segment form is interpreted as standalone TASK-`{chk}` round `{task}`, not CHK-`{chk}` TASK-`{task}`. Passing no argument causes round-start to derive the active task/round from state, which is the correct path here.
 
  ## Integration
 
  - **Gates**: Step 2.5 permission gate — asks the user to confirm before any side effect; **Cancel** aborts cleanly with no writes. Fires on every invocation (manual, auto-trigger, auto-loop).
  - **Reads**: `.codebyplan/state/checkpoints/*.json`, `checkpoints/<id>/tasks/*.json`, `checkpoints/<id>/tasks/<id>/rounds/*.json`, `todos.json` (local-first; `npx codebyplan sync` on miss; MCP `get_current_task`/`get_tasks`/`get_rounds` break-glass)
  - **Writes**: `codebyplan task update` (CLI write-through; MCP `update_task` break-glass)
- - **Triggers**: `/cbp-round-start` (auto, round 1, no argument)
+ - **Triggers**: `/cbp-round-plan` (auto, round 1, no argument)
@@ -131,19 +131,17 @@ Once the gates pass, load the context the head command needs. This ensures `/cle
  | `/cbp-checkpoint-plan` | Load checkpoint from `.codebyplan/state/checkpoints/<id>.json` + task files under `checkpoints/<id>/tasks/` (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, goal, ideas, existing task count |
  | `/cbp-checkpoint-start` | Load checkpoint + task files from local state (fallback MCP `get_checkpoints` + `get_tasks`). Display checkpoint title, status, claim state, first pending task |
  | `/cbp-task-start [N]` | Load from `.codebyplan/state/session/current.json` (fallback MCP `get_current_task`). Display checkpoint title + task title/requirements summary |
- | `/cbp-round-start` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
- | `/cbp-round-update` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
- | `/cbp-round-input` | **Full context load** (see Step 2b) |
- | `/cbp-task-check` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task + files summary |
- | `/cbp-task-testing` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + testing status summary |
+ | `/cbp-round-plan` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + round count + last round summary |
+ | `/cbp-round-plan` | **Full context load** (see Step 2b) |
+ | `/cbp-verify` | Load from local state session + round files (fallback MCP `get_current_task` + `get_rounds`). Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
  | `/cbp-task-create` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task list summary |
- | `/cbp-task-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
+ | `/cbp-finalize` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint + task summary |
  | `/cbp-checkpoint-complete` | Load from local state session (fallback MCP `get_current_task`). Display checkpoint summary |
  | *(no command / idle)* | See Step 3 — suggest `/cbp-session-end` |
 
  **For any unrecognized command:** Load from local state session (fallback MCP `get_current_task`) as a safe default. Display whatever context is available.
 
- ### Step 2b: Full Context Load (for `/cbp-round-input`)
+ ### Step 2b: Full Context Load (for `/cbp-round-plan`)
 
  This is the most context-dependent command. Load everything:
 
@@ -190,7 +188,7 @@ Reached only when the Step 1.5 ownership gate allowed routing to continue, the S
 
  ## Integration
 
- - **Called by**: `/cbp-session-start`, `/cbp-task-complete`, `/cbp-checkpoint-complete`, manual, after `/clear`
+ - **Called by**: `/cbp-session-start`, `/cbp-finalize`, `/cbp-checkpoint-complete`, manual, after `/clear`
  - **Resolves**: `npx codebyplan resolve-worktree --json` (worktree id + distress signal), `npx codebyplan whoami --json` (user id)
  - **Reads**: `.codebyplan/state/todos.json`, `session/current.json`, `checkpoints/<id>.json`, `checkpoints/<id>/tasks/<id>.json`, `checkpoints/<id>/tasks/<id>/rounds/<id>.json`, `worktrees.json`. If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass: MCP `get_todos`, `get_current_task`, `get_rounds`, `get_checkpoints`, `get_tasks` when state dir absent and sync fails. `get_worktrees` stays MCP (display-only ownership-block path; no CLI verb).
  - **Triggers**: `rows[0].command` (auto, after the Step 1.5 ownership gate and Step 1.55 stale-entity guard pass, and the Step 1.6 planning gate falls through); Step 1.55 overrides to STOP (stale completed/cancelled entity); Step 1.6 overrides to `/cbp-checkpoint-plan` (unplanned) or `/cbp-checkpoint-start` (planned-but-pending)
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: cbp-verify
3
+ description: Unified verify stage — deterministic gates, real-execution proof, and a fresh-context diff review at round or task scope. Auto-triggered by cbp-round-build; escalates to task scope on the last clean round.
4
+ argument-hint: [chk-task[-round] | task[-round]]
5
+ triggers: [cbp-round-plan, cbp-round-complete, cbp-finalize]
6
+ effort: xhigh
7
+ ---
8
+
9
+ # Verify Command
10
+
11
+ The single verify stage for the execution half. Collapses automated checks, finished-round
12
+ triage, AI production review, and comprehensive task-level testing into one scope-aware skill.
13
+ The deterministic spine lives in the CLI (`codebyplan check`, `codebyplan e2e verify-round`);
14
+ this skill orchestrates the gates, proves execution, spawns ONE fresh-context reviewer, and
15
+ routes on a single directive.
16
+
17
+ Auto-triggered by `/cbp-round-build` after execution. The human gate is NOT here — it is the
18
+ separate `ask`-tier `cbp-round-complete` (round) and the one batched walkthrough in Phase 6
19
+ (task). This skill is model-invocable on purpose.
20
+
21
+ ## Scope & Kind
22
+
23
+ - **SCOPE** (`round` | `task`) — auto-detected: a 3-segment `{chk}-{task}-{round}` (or 2-segment
24
+ `{task}-{round}` standalone) argument, or an auto-trigger from `cbp-round-build`, is
25
+ `scope=round`. A 2-segment `{chk}-{task}` (or bare `{task}` standalone) argument, or the Phase 5
26
+ escalation, is `scope=task`.
27
+ - **KIND** (`checkpoint` | `standalone`) — detected ONCE at the top from identifier shape
28
+ (3-segment / 2-segment-chk = checkpoint; 2-segment / bare = standalone). KIND selects MCP tool
29
+ names per the table in `reference/deterministic-gates.md`.
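The shape-based detection above can be sketched roughly as follows. This is a minimal TypeScript illustration, assuming segments are separated by `-` and are otherwise opaque; `classifyIdentifier` and `Reading` are hypothetical names, not part of the package. A 2-segment argument has two valid readings by shape alone, so the sketch returns both rather than guessing how the skill disambiguates them.

```typescript
type Scope = "round" | "task";
type Kind = "checkpoint" | "standalone";

interface Reading {
  scope: Scope;
  kind: Kind;
}

// Hypothetical helper: returns every SCOPE/KIND reading the identifier shape allows.
// An empty argument returns [] (the active task/round is resolved from state instead).
function classifyIdentifier(arg: string): Reading[] {
  const trimmed = arg.trim();
  if (trimmed === "") return [];

  const segments = trimmed.split("-");
  if (segments.some((s) => s === "")) {
    throw new Error(`malformed identifier: ${arg}`);
  }

  switch (segments.length) {
    case 3: // {chk}-{task}-{round}
      return [{ scope: "round", kind: "checkpoint" }];
    case 2: // {chk}-{task} OR {task}-{round}: ambiguous by shape alone
      return [
        { scope: "task", kind: "checkpoint" },
        { scope: "round", kind: "standalone" },
      ];
    case 1: // bare {task}
      return [{ scope: "task", kind: "standalone" }];
    default:
      throw new Error(`too many segments: ${arg}`);
  }
}
```

Returning a list keeps the ambiguity visible to the caller; how the real skill picks between the two 2-segment readings is not stated in this diff.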
30
+
31
+ All reads are local-state-first (`.codebyplan/state/**`); on miss run `npx codebyplan sync` once
32
+ and re-read; MCP `get_*` is the break-glass fallback. All writes go through `codebyplan ... update`
33
+ (CLI write-through), MCP break-glass.
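The read order described above (local state, one sync, re-read, then break-glass) could look something like this. It is only a sketch: `readLocalFirst` and its three callbacks are hypothetical stand-ins for reading `.codebyplan/state/**`, running `npx codebyplan sync`, and calling an MCP `get_*` tool.

```typescript
// Hypothetical helper: none of these names come from the codebyplan package.
function readLocalFirst<T>(
  readLocal: () => T | undefined, // stand-in for a .codebyplan/state/** read
  syncOnce: () => void, // stand-in for `npx codebyplan sync`
  breakGlass: () => T | undefined, // stand-in for an MCP get_* call
): T | undefined {
  const cached = readLocal();
  if (cached !== undefined) return cached;

  try {
    syncOnce(); // exactly one sync attempt on a miss
  } catch {
    // A failed sync falls through to the re-read and then to break-glass.
  }

  const refreshed = readLocal();
  if (refreshed !== undefined) return refreshed;

  return breakGlass();
}
```

The sketch reaches the break-glass call whenever the re-read still misses; `reference/deterministic-gates.md` words the condition more narrowly (state dir absent and sync fails), so a stricter variant may be closer to the real behaviour.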
34
+
35
+ ## HARD GATES — non-negotiable
36
+
37
+ - The deterministic-gate JSON **is** the verdict — never narrate "I verified the build". (Phase 2)
38
+ - Empty execution proof on a UI-touching diff = GATE FAILURE. (Phase 3, `rules/execution-proof.md`)
39
+ - Reviewer spawn failure = HARD GATE FAILURE → STOP + retry directive; NEVER self-review inline.
40
+ (Phase 4, `rules/spawn-failure-is-gate-failure.md`)
41
+ - `gate6` is always hard (never baselined); baseline regressions are a user-accept gate, never
42
+ auto-accepted. (`rules/two-tier-ci.md`)
43
+
44
+ ## Phase Skeleton
45
+
46
+ ### PHASE 1 — RESOLVE
47
+
48
+ Parse `$ARGUMENTS` (notation per `cbp-round-plan` identifier vocabulary). Detect SCOPE and KIND
49
+ (above). Resolve the active round/task from local state. If `scope=round` and no in-progress
50
+ round → `No active round. Run /cbp-round-plan first.` and STOP. If `scope=task` and any round is
51
+ still `in_progress` → STOP with "complete the active round first". Full resolution + KIND tool
52
+ table: `reference/deterministic-gates.md`.
53
+
54
+ ### PHASE 2 — DETERMINISTIC GATES
55
+
56
+ Run the unified matrix and capture the JSON:
57
+
58
+ ```bash
59
+ codebyplan check --scope <round|task> --json
60
+ ```
61
+
62
+ The JSON `{ results[], any_failed, hard_fail_checks[], no_baseline }` IS the verdict — record
63
+ each result's `check`, `status`, `exit_code`, `new_failures[]`. `gate6` is ALWAYS hard;
64
+ `lint`/`typecheck`/`tests`/`audit` fail only on NEW per-package failures vs the committed
65
+ `.check-baseline.json` (baseline-tolerant soft tier, `rules/two-tier-ci.md`). `any_failed === true`
66
+ (equivalently `hard_fail_checks.length > 0`) → carry into the Phase 5 verdict as a fail. Exact
67
+ contract + the `claude_only` carve-out (deterministic-only path, no agent): see
68
+ `reference/deterministic-gates.md`.
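To make "the JSON IS the verdict" concrete, here is a minimal TypeScript sketch that reads the captured output and reports only what the JSON says. The interfaces mirror the fields named above; `readVerdict` and its return shape are hypothetical and not the package's own API.

```typescript
type CheckStatus = "pass" | "fail" | "skipped";

interface CheckResult {
  check: string;
  status: CheckStatus;
  exit_code: number;
  new_failures?: string[]; // absent for gate6
}

interface RunCheckResult {
  results: CheckResult[];
  any_failed: boolean;
  hard_fail_checks: string[];
  no_baseline: boolean;
}

// Hypothetical helper: derives the Phase 2 outcome purely from the JSON fields.
function readVerdict(rawJson: string): { failed: boolean; findings: string[] } {
  const parsed = JSON.parse(rawJson) as RunCheckResult;

  const findings = parsed.results
    .filter((r) => r.status === "fail")
    .map((r) => {
      const detail =
        r.new_failures && r.new_failures.length > 0
          ? ` new_failures=${r.new_failures.join(",")}`
          : "";
      return `${r.check}: exit_code=${r.exit_code}${detail}`;
    });

  // any_failed is taken as-is; the sketch never recomputes or overrides it.
  return { failed: parsed.any_failed, findings };
}
```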
69
+
70
+ ### PHASE 3 — REAL EXECUTION PROOF
71
+
72
+ Produce the committed proof for every tier the diff touches (`rules/execution-proof.md`):
73
+
74
+ - **Tier 1** (configured e2e framework whose `app` source changed) — persist `e2e_eligible` /
75
+ `e2e_outputs` to round context, then:
76
+
77
+ ```bash
78
+ codebyplan e2e verify-round --round-id <round_id> --task-id <task_id>
79
+ ```
80
+
81
+ Exit 0 = pass; exit 1 → surface `result.failed_checks[]` (`e2e_eligible_skipped` /
82
+ `zero_assertion_run` / `empty_gallery`) verbatim and carry as a fail.
83
+ - **Tier 2/3/4** — dev-server screenshot / HTTP trace / command log per the rule.
84
+
85
+ **Empty proof on a UI diff = GATE FAILURE.** Verify each screenshot/trace is committed with
86
+ `git ls-files --error-unmatch <path>`. Write the `verify_manifest` (gates + proof, schema in
87
+ `rules/execution-proof.md`). Per-scope detail: `reference/round-scope.md`, `reference/task-scope.md`.
88
+
89
+ ### PHASE 4 — FRESH-CONTEXT DIFF REVIEW
90
+
91
+ Spawn `cbp-verify-reviewer` with `scope` (round → round diff; task → full task diff) and the
92
+ Input Contract from `agents/cbp-verify-reviewer.md`. **SPAWN FAILURE = HARD GATE FAILURE** → STOP
93
+ and surface the retry directive (`rules/spawn-failure-is-gate-failure.md`); record
94
+ `<scope>.context.verify.spawn_failure`; do NOT walk the reviewer's phases inline. A returned
95
+ `NOT_READY` is a successful review — act on it, do not retry.
96
+
97
+ Triage the returned findings: in-scope mechanical fixes the orchestrator applies itself
98
+ (`Edit`/`Write`); blocking out-of-scope findings → `/cbp-round-plan` fix round. A baseline
99
+ regression is a **blocking user-accept gate** — never auto-accepted.
100
+
101
+ ### PHASE 5 — VERDICT + ROUTE (single directive, never an A/B/C menu)
102
+
103
+ Combine Phase 2 + 3 + 4. Route on one directive (`feedback-close-out-routing.md`):
104
+
105
+ | Result | Route |
106
+ |--------|-------|
107
+ | Any gate/proof/review fail | `Next: /cbp-round-plan` (open a fix round) |
108
+ | Pass + more work wanted | `Next: /cbp-round-plan` (another round) |
109
+ | Pass + LAST round + clean (scope=round) | escalate to `scope=task` → re-enter at Phase 1 |
110
+ | Pass (scope=task) | proceed to Phase 6 finalize |
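The routing table can be read as a small pure function. The TypeScript sketch below is an interpretation, not the skill's implementation: `routeVerdict`, `Verdict`, and `Route` are hypothetical names, and it assumes "LAST round + clean" means a passing round with no more work wanted.

```typescript
type Scope = "round" | "task";

interface Verdict {
  scope: Scope;
  passed: boolean; // Phase 2 + 3 + 4 combined
  moreWorkWanted: boolean; // assumption: false on the last clean round
}

type Route =
  | { kind: "directive"; next: "/cbp-round-plan" }
  | { kind: "escalate"; toScope: "task" }
  | { kind: "finalize" };

// Hypothetical helper: always yields exactly one route, never a menu of options.
function routeVerdict(v: Verdict): Route {
  if (!v.passed || v.moreWorkWanted) {
    // Fix round on any failure, or another round when more work is wanted.
    return { kind: "directive", next: "/cbp-round-plan" };
  }
  if (v.scope === "round") {
    // Last clean round: re-enter at Phase 1 with scope=task.
    return { kind: "escalate", toScope: "task" };
  }
  return { kind: "finalize" }; // scope=task pass proceeds to Phase 6
}
```

Modelling the result as a discriminated union makes the single-directive rule a property of the type: a caller cannot receive more than one option.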
111
+
112
+ ### PHASE 6 — FINALIZE
113
+
114
+ - **scope=round** — route to the human git-add gate: `Next: /cbp-round-complete`
115
+ (`ask`-tier; reconciles `sync-approvals` + `complete_round`). cbp-verify does NOT stage files
116
+ or complete the round. Detail: `reference/round-scope.md`.
117
+ - **scope=task** — whole-repo `codebyplan check --scope task`, holistic `cbp-verify-reviewer`
118
+ (scope=task) already run in Phase 4, then the ONE genuine human step: a single batched
119
+ `AskUserQuestion` walkthrough (all user-testable items in one prompt, never one-per-question).
120
+ On satisfaction, write `task.context.verify_verdict = { verdict: 'READY', manifest, decided_at }`
121
+ and route `Next: /cbp-finalize`. Detail: `reference/task-scope.md`.
122
+
123
+ ## Key Rules
124
+
125
+ - The JSON verdict from `codebyplan check` / `e2e verify-round` is authoritative — no prose
126
+ substitution.
127
+ - Reviewer spawn failure STOPS the skill (retry directive); never self-certify inline.
128
+ - Empty proof on a UI diff fails verify; screenshots must be committed.
129
+ - Claude NEVER `git add`s — staging is the user's approval signal at `cbp-round-complete`.
130
+ - Single-directive routing only — never an A/B/C menu.
131
+ - `claude_only` profile is the deterministic-only carve-out (no reviewer spawn expected).
132
+
133
+ ## Integration
134
+
135
+ - **Triggered by**: `/cbp-round-build` (auto, scope=round after execution); self-escalates to
136
+ scope=task on the last clean round.
137
+ - **Reads**: `.codebyplan/state/**` (local-first; `npx codebyplan sync` on miss; MCP `get_*`
138
+ break-glass); changed files + git diff via the reviewer.
139
+ - **Writes**: `codebyplan round update` / `codebyplan task update` (CLI write-through; MCP
140
+ `update_round` / `update_task` break-glass) — `verify_manifest`, `verify_verdict`.
141
+ - **Spawns**: `cbp-verify-reviewer` (scope param); the `cbp-e2e-*` specialists feed Tier-1 proof
142
+ upstream in `cbp-round-build`.
143
+ - **Triggers**: `/cbp-round-plan` (any fail or more-work), `/cbp-round-complete` (scope=round
144
+ finalize), `/cbp-finalize` (scope=task READY).
145
+ - **References**: `reference/round-scope.md`, `reference/task-scope.md`,
146
+ `reference/deterministic-gates.md`.
@@ -0,0 +1,114 @@
1
+ # Deterministic Gates — Command Contracts & Manifest
2
+
3
+ Authoritative gate-command + manifest detail for `cbp-verify`. The SKILL.md phases point here;
4
+ this file is loaded on demand.
5
+
6
+ ## KIND tool table
7
+
8
+ KIND is detected once at SKILL Phase 1 from the identifier shape. MCP tool names differ by KIND;
9
+ all writes prefer the CLI write-through and fall back to MCP.
10
+
11
+ | Operation | `checkpoint` KIND | `standalone` KIND |
12
+ |-----------|------------------|-------------------|
13
+ | Get task | local state (break-glass `get_current_task`) | `get_current_standalone_task(repo_id)` |
14
+ | Get rounds | local state (break-glass `get_rounds`) | `get_standalone_rounds(standalone_task_id)` |
15
+ | Update round | `codebyplan round update` (MCP `update_round`) | MCP `update_standalone_round` |
16
+ | Update task | `codebyplan task update` (MCP `update_task`) | MCP `update_standalone_task` |
17
+
18
+ Empty-arg KIND detection: probe `get_current_standalone_task` first; if found → `standalone`;
19
+ else `checkpoint` via `get_current_task`. (KIND detection is MCP-unavoidable — no identifier yet
20
+ means no local path to probe; everything after is local-first.)
21
+
22
+ ## Phase 1 resolution detail
23
+
24
+ | Parse | Resolution |
25
+ |-------|-----------|
26
+ | `{chk}-{task}-{round}` | checkpoint round. Read `.codebyplan/state/checkpoints/*.json` → filter `number==={chk}`; `.../tasks/*.json` → `{task}`; `.../rounds/*.json` → `{round}`. |
27
+ | `{chk}-{task}` | checkpoint task (scope=task). Resolve checkpoint + task; verify all rounds `completed`. |
28
+ | `{task}-{round}` | standalone round (scope=round). |
29
+ | `{task}` (bare) | standalone task (scope=task). |
30
+ | _(empty)_ | the active in-progress task/round from `.codebyplan/state/todos.json`. |
31
+
32
+ On any miss: `npx codebyplan sync` once, re-read; MCP `get_*` break-glass only when the state dir
33
+ is absent AND sync fails.
34
+
35
+ ## Phase 2 — `codebyplan check`
36
+
37
+ ```bash
38
+ codebyplan check --scope <round|task> --json
39
+ ```
40
+
41
+ JSON shape (`RunCheckResult`, source `packages/codebyplan-package/src/lib/check.ts:185`):
42
+
43
+ ```jsonc
44
+ {
45
+ "results": [
46
+ { "check": "gate6|lint|typecheck|tests|audit",
47
+ "status": "pass|fail|skipped",
48
+ "exit_code": 0,
49
+ "command": "...",
50
+ "stdout": "...", "stderr": "...",
51
+ "executed": true,
52
+ "new_failures": ["@scope/pkg", "GHSA-xxxx"] } // omitted for gate6
53
+ ],
54
+ "any_failed": false,
55
+ "hard_fail_checks": [], // names of checks that failed post-baseline-diff
56
+ "no_baseline": false
57
+ }
58
+ ```
59
+
60
+ - **`gate6`** (sibling-identity parity) is ALWAYS hard — never baselined, no `new_failures` field.
61
+ - `lint` / `typecheck` / `tests` / `audit` are **baseline-diffed**: `status: 'pass'` when
62
+ `new_failures` is `[]` even if the underlying command exited non-zero (pre-existing red is
63
+ tolerated). `audit.new_failures` lists new GHSA ids not in the allowlist.
64
+ - Verdict: `any_failed === true` (≡ `hard_fail_checks.length > 0`) is a fail — surface each failing
65
+ result's `new_failures` / `stdout` / `stderr`. **This JSON is the verdict; never substitute prose.**
66
+ - Soft tier uses NO `--no-baseline`. The whole-repo absolute-green tier
67
+ (`--scope merged --no-baseline`) belongs to checkpoint close, not this skill
68
+ (`rules/two-tier-ci.md`).
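The baseline-diffed behaviour of the soft tier can be illustrated with a set difference. This TypeScript sketch assumes the baseline boils down to a flat list of already-failing ids (package names or GHSA ids); the real `.check-baseline.json` layout is not shown in this diff, and `newFailures` / `softStatus` are hypothetical names.

```typescript
// Hypothetical helper: ids failing now that were not failing in the baseline.
function newFailures(
  current: readonly string[],
  baseline: readonly string[],
): string[] {
  const known = new Set(baseline);
  return current.filter((id) => !known.has(id));
}

// Soft-tier status: pre-existing red is tolerated, only new failures fail the check.
// The underlying command's exit code is deliberately not consulted here.
function softStatus(
  current: readonly string[],
  baseline: readonly string[],
): "pass" | "fail" {
  return newFailures(current, baseline).length === 0 ? "pass" : "fail";
}
```

Under this model a check whose command exits non-zero still reports `pass` as long as every failing id is already in the baseline, which matches the tolerance described in the bullets above. `gate6` is outside this logic entirely because it is never baselined.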
69
+
70
+ ## Phase 3 — `codebyplan e2e verify-round`
71
+
72
+ ```bash
73
+ codebyplan e2e verify-round --round-id <uuid> --task-id <uuid>
74
+ ```
75
+
76
+ Persist `round.context.e2e_eligible[]` + `e2e_outputs{}` FIRST (the CLI reads the round row from
77
+ the DB). Verdict JSON (`VerifyRoundResult`, source `packages/codebyplan-package/src/lib/e2e.ts:127`):
78
+
79
+ ```jsonc
80
+ { "round_id": "...", "task_id": "...",
81
+ "result": { "pass": true, "failed_checks": [], "skipped_validly": [] } }
82
+ ```
83
+
84
+ Exit 0 = pass. Exit 1 → one or more of `e2e_eligible_skipped` / `zero_assertion_run` /
85
+ `empty_gallery` in `result.failed_checks[]` — surface verbatim, carry as a fail, route to a fix
86
+ round (`rules/e2e-mandatory.md`). When `e2e_eligible[]` is empty, skip the call — nothing to verify.
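A minimal TypeScript sketch of how the Tier 1 outcome might be handled, including the skip when `e2e_eligible[]` is empty. `VerifyRoundResult` mirrors the JSON above; `verifyTier1`, `E2eOutcome`, and the injected `runVerifyRound` callback (standing in for the CLI call) are hypothetical.

```typescript
interface VerifyRoundResult {
  round_id: string;
  task_id: string;
  result: {
    pass: boolean;
    failed_checks: string[];
    skipped_validly: string[];
  };
}

interface E2eOutcome {
  called: boolean; // false when there was nothing to verify
  failed: boolean;
  surfaced: string[]; // failed_checks[], passed through verbatim
}

// Hypothetical helper: runVerifyRound stands in for `codebyplan e2e verify-round`.
function verifyTier1(
  eligible: readonly string[],
  runVerifyRound: () => VerifyRoundResult,
): E2eOutcome {
  if (eligible.length === 0) {
    // Empty e2e_eligible[]: skip the call entirely.
    return { called: false, failed: false, surfaced: [] };
  }
  const { result } = runVerifyRound();
  return {
    called: true,
    failed: !result.pass,
    surfaced: [...result.failed_checks],
  };
}
```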
87
+
88
+ ## `claude_only` carve-out (deterministic-only path)
89
+
90
+ When the resolved profile is `claude_only` (round touched only `.claude/**` / docs / config — no
91
+ app surface), there is **no reviewer to spawn by design**. Proof IS the deterministic set:
92
+
93
+ 1. `codebyplan check --scope <round|task> --json` (gate6 + matrix as above).
94
+ 2. `bash -n <hook>` for each touched `.sh` file.
95
+ 3. SKILL/agent/rule structure sanity for touched `.claude/` files (line counts, no `/cbp-*`
96
+ legacy notation).
97
+
98
+ This is a first-class verification path, NOT a banned inline fallback
99
+ (`rules/spawn-failure-is-gate-failure.md` carve-out) — Phase 4's reviewer spawn is skipped, and
100
+ that skip is recorded as `verify_manifest.proof.tier: 4`, not a spawn failure.
101
+
102
+ ## verify-manifest write
103
+
104
+ Write the manifest into round/task context (merge into existing context — the `update_*`
105
+ REPLACE contract requires re-sending the full object):
106
+
107
+ ```bash
108
+ codebyplan round update --id <round_id> --task-id <uuid> --checkpoint-id <uuid> --context '<json>'
109
+ # break-glass: MCP update_round / update_standalone_round
110
+ ```
111
+
112
+ Schema (canonical in `rules/execution-proof.md`): `verify_manifest = { scope, gates[], proof{ tier,
113
+ artifacts[], e2e_verify_round }, decided_at }`. Each `proof.artifacts[].path` is proven committed
114
+ via `git ls-files --error-unmatch <path>` before it counts.
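Because the `update_*` contract replaces the whole context, the manifest has to be merged into the existing object before it is sent. The TypeScript sketch below shows one way to build the `--context` payload; the element types of `gates[]` and `e2e_verify_round` are not given in this diff, so they are left as `unknown`, and `withVerifyManifest` is a hypothetical name.

```typescript
interface VerifyManifest {
  scope: "round" | "task";
  gates: unknown[]; // element shape not specified in this diff
  proof: {
    tier: number;
    artifacts: { path: string }[];
    e2e_verify_round: unknown; // shape not specified in this diff
  };
  decided_at: string;
}

type Context = Record<string, unknown>;

// Hypothetical helper: keeps every existing context key and sets verify_manifest,
// returning the full object serialized for the --context argument.
function withVerifyManifest(existing: Context, manifest: VerifyManifest): string {
  const merged: Context = { ...existing, verify_manifest: manifest };
  return JSON.stringify(merged);
}
```

The spread is shallow: an earlier `verify_manifest` is overwritten as a whole, while unrelated keys such as `e2e_eligible` are carried over unchanged.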