codebyplan 1.13.28 → 1.13.29

package/dist/cli.js CHANGED
@@ -14,7 +14,7 @@ var VERSION, PACKAGE_NAME;
14
14
  var init_version = __esm({
15
15
  "src/lib/version.ts"() {
16
16
  "use strict";
17
- VERSION = "1.13.28";
17
+ VERSION = "1.13.29";
18
18
  PACKAGE_NAME = "codebyplan";
19
19
  }
20
20
  });
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "codebyplan",
3
- "version": "1.13.28",
3
+ "version": "1.13.29",
4
4
  "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
5
5
  "type": "module",
6
6
  "bin": {
@@ -279,6 +279,6 @@ Return findings sorted by severity (critical first). If no findings, return `sta
279
279
  ## Integration
280
280
 
281
281
  - **Spawned by**: `/cbp-round-end` (Step 6)
282
- - **Returns to**: `/cbp-round-end` which presents findings to user
282
+ - **Returns to**: `/cbp-round-end` which auto-applies in-scope findings inline and routes out-of-scope findings to `/cbp-round-update`
283
283
  - **Does NOT**: Apply any changes
284
284
  - **Reads**: Changed files, task requirements, round context
@@ -9,7 +9,7 @@ effort: xhigh
9
9
 
10
10
  # Task Check Agent
11
11
 
12
- AI-driven production readiness review with user satisfaction discussion. Verifies all task requirements are met, checkpoint goals are aligned, and work is production-ready.
12
+ AI-driven production readiness review with user satisfaction discussion. This is the **cross-round double-check** layer: per-round QA (build/lint/types per app, the `console.log`/debug scan, the OWASP/secret grep, API auth-enforcement curls, `pnpm audit`) already ran inside each round's `testing-qa-agent` — this agent does NOT re-run it. Its unique value is holistic: verifying all task requirements are met, checkpoint goals are aligned, the aggregated work is shippable, and — for tasks that span many rounds where scope can shift as new ideas/problems surface — detecting scope drift that should update the checkpoint or task rather than re-running per-round checks.
13
13
 
14
14
  **Numeric-claim verification (Proposal P6)**: when round summaries assert numeric facts (file counts, package counts, percentage changes, line counts, version numbers), verify each via direct count: `find ... | wc -l`, `grep -c`, `wc -l <file>`. Do NOT accept narrative numbers without a verification command. Mismatches between asserted and actual counts indicate documentation drift; flag as a finding requiring a fix.
15
15
 
@@ -95,14 +95,16 @@ Check `task.files_changed`:
95
95
  - List unapproved files
96
96
  - Determine if unapproved files block completion
97
97
 
98
- ### Phase 6: Code Review
98
+ ### Phase 6: Code Review (holistic spot-check)
99
99
 
100
- Read ALL changed files and verify:
101
- - No obvious bugs or regressions
102
- - No security issues (hardcoded secrets, SQL injection, XSS)
103
- - No leftover debug code (console.log, TODO from this task)
104
- - Error handling present where needed
105
- - Consistent with existing codebase patterns
100
+ Per-round QA already ran the line-level checks — the `console.log`/debug scan (round `testing-qa-agent` Phase 3.5), the OWASP secret/injection grep (Phase 5), the API auth-enforcement curl (Phase 3.55), and `pnpm audit` (Phase 3.7). Do NOT re-run them here. Phase 6 is a light holistic spot-check across the aggregated diff for what a single round cannot see:
101
+
102
+ - No obvious bugs or regressions that emerge only when all rounds' changes are read together
103
+ - No cross-round integration gaps (a field/contract introduced in one round that a later round broke)
104
+ - Error handling present where needed at the feature boundary
105
+ - Consistent with existing codebase patterns across the full task diff
106
+
107
+ If the aggregated diff surfaces an obvious issue per-round QA missed, flag it as a finding — but the per-round scans are authoritative for line-level concerns.
106
108
 
107
109
  ### Phase 7: Shippable Feature Gate
108
110
 
@@ -125,6 +127,8 @@ Update `round_outcome_analysis` with findings.
125
127
 
126
128
  ### Phase 9: User Satisfaction Discussion
127
129
 
130
+ For tasks that ran many rounds, scope drift accumulates quietly — each round may have absorbed a new idea or problem without the checkpoint/task requirements being updated. The satisfaction discussion is where that drift surfaces; treat the scope-divergence scan below as a first-class output, not an afterthought.
131
+
128
132
  Present findings to user via AskUserQuestion:
129
133
 
130
134
  ```
@@ -5,6 +5,15 @@
5
5
  # staging-status flip, and web-UI flag sync to the codebyplan CLI.
6
6
  # Replaces the inline jq merge + curl PATCH with a single CLI call.
7
7
  #
8
+ # Trigger context (CHK-197): complete_round is now called by /cbp-round-complete
9
+ # (the permission-gated finalizer), which already runs `sync-approvals` once
10
+ # BEFORE completing the round. This hook firing afterward is the expected
11
+ # post-complete safety net — it catches approval drift between that pre-complete
12
+ # sync and the approval_locked write; it is NOT a duplicate run to remove.
13
+ # NOTE: this hook matches complete_round only; complete_standalone_round is NOT
14
+ # covered, so standalone rounds rely solely on /cbp-round-complete's pre-complete
15
+ # sync (pre-existing coverage gap, documented here intentionally — not fixed).
16
+ #
8
17
  # Delegates to: npx codebyplan round sync-approvals
9
18
  # - Git-diff drift merge (in/not-in DB vs git)
10
19
  # - Staging-status → user_approved flip
@@ -55,7 +55,8 @@
55
55
  "Skill(cbp-checkpoint-create)",
56
56
  "Skill(cbp-checkpoint-check)",
57
57
  "Skill(cbp-checkpoint-complete)",
58
- "Skill(cbp-round-update)",
58
+ "Skill(cbp-round-complete)",
59
+ "Skill(cbp-round-execute)",
59
60
  "Skill(cbp-session-end)",
60
61
  "Skill(cbp-task-complete)",
61
62
  "Skill(cbp-standalone-task-create)",
@@ -114,9 +115,9 @@
114
115
  "Skill(cbp-refresh-infra)",
115
116
  "Skill(cbp-round-check)",
116
117
  "Skill(cbp-round-end)",
117
- "Skill(cbp-round-execute)",
118
118
  "Skill(cbp-round-input)",
119
119
  "Skill(cbp-round-start)",
120
+ "Skill(cbp-round-update)",
120
121
  "Skill(cbp-session-start)",
121
122
  "Skill(cbp-setup-e2e)",
122
123
  "Skill(cbp-setup-eslint)",
@@ -31,19 +31,20 @@ Fifteen of the 16 authoring agents take the default (`cbp-cc-executor`, `cbp-dat
31
31
 
32
32
  | skill | model | effort | reason |
33
33
  | ----------------- | ------ | ------ | ----------------------------------------------------------------------------------------------------------------- |
34
- | cbp-round-end | sonnet | high | Spawns cbp-improve-round agent; skill body summarises + presents user findings-decision — lighter than xhigh suffices |
34
+ | cbp-round-end | sonnet | high | Spawns cbp-improve-round agent; auto-applies in-scope findings + routes out-of-scope to round-update — lighter than xhigh suffices |
35
35
  | cbp-task-check | sonnet | high | Thin orchestrator over spawned cbp-task-check agent; inline-fallback path keeps Opus for safety |
36
36
  | cbp-checkpoint-update | sonnet | high | Status updates + context patches mostly; lighter than xhigh |
37
37
  | cbp-ship-main | sonnet | high | Production-impacting PR creation; keep Opus reasoning but drop effort |
38
38
  | cbp-merge-main | sonnet | high | Long-lived-branch integration merge — surgical conflict resolution, no authoring |
39
39
 
40
- ### Haiku-low skills (9)
40
+ ### Haiku-low skills (10)
41
41
 
42
42
  `model: haiku` + `effort: low`. Pure mechanical / dispatch / templated work.
43
43
 
44
44
  | skill | model | effort | reason |
45
45
  | ---------------------- | ----- | ------ | ---------------------------------------------------------------------------------------- |
46
- | cbp-round-update | haiku | low | Pure mechanical: read git status, check approvals, route per rules |
46
+ | cbp-round-update | haiku | low | Pure mechanical: triage round state (claude_approved/findings/hard_fail), route to round-complete or round-input |
47
+ | cbp-round-complete | haiku | low | Pure mechanical: sync-approvals git-add reconcile, complete round, route per unapproved count |
47
48
  | cbp-round-check | haiku | low | Run build/lint/types commands, parse output, update QA |
48
49
  | cbp-todo | haiku | low | Dispatch: single MCP call + route to next command |
49
50
  | cbp-checkpoint-complete | haiku | low | Pure finalization — mark completed, write summary; judgment happened in checkpoint-check |
@@ -22,7 +22,7 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
22
22
 
23
23
  ### `allow` — the autonomous workflow surface
24
24
 
25
- - **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-execute`/`-input`/`-start`), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-check`/`-create`/`-start`/`-testing`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition skills are the exception — they live in `ask` (next section).
25
+ - **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-input`/`-start`/`-update` — `cbp-round-update` is autonomous triage that only reads round state and routes to `cbp-round-complete` or `cbp-round-input`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-check`/`-create`/`-start`/`-testing`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
26
26
  - **All `mcp__codebyplan__*` reads** (`get_*`, `list_*`, `search_*`, `health_check`, `lookup_symbol`, `resolve_library_id`, `get_chunk`).
27
27
  - **Routine workflow-write MCP tools** the pipeline calls many times per task: create/update/complete checkpoint, task, and round; session log + session-state writes; `create_worktree`, `add_library`, `flag_stale_chunk`, `update_server_config`, `update_eslint_repo_config`, `update_task_template`. Gating these with `ask` would make the autonomous workflow unusable.
28
28
  - **Read/safe CLI commands** (both `codebyplan X` and `npx codebyplan X`): `whoami`, `resolve-worktree`, `statusline`, `ports`, `tech-stack`, `eslint`, `round`, `help`, `--version`.
@@ -30,7 +30,8 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
30
30
  ### `ask` — the deliberate confirm-gate
31
31
 
32
32
  - **Production-shipment skills**: `cbp-ship`, `cbp-ship-main`, `cbp-checkpoint-end` — these promote/deploy to production, so they prompt even in an otherwise auto-allowed setup.
33
- - **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-update`, `cbp-session-end`, `cbp-task-complete`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously.
33
+ - **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-task-complete`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt replaces the in-skill confirmation that used to live in `cbp-round-update` — which is now an autonomous, `allow`-tier triage step.
34
+ - **Plan-approval gate**: `cbp-round-execute` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-start` runs its planning Q&A, then hands off to `cbp-round-execute`; the permission prompt is the user's go/no-go on the plan.
34
35
  - **Destructive / admin MCP tools**: `delete_session_log`, `delete_worktree`, `create_repo`, `release_assignment`. (The launch and member-admin tools were dropped from the MCP surface in CHK-180 — those concerns are web-app only now.)
35
36
  - **Mutating / external / clobber-risk CLI commands** (both prefixes): `setup`, `login`, `logout`, `upgrade-auth`, `config` (can overwrite committed `.codebyplan/` files), `branch` (rewrites branch config), `ship`, `claude` (`install`/`update`/`uninstall` overwrite `.claude/`).
36
37
 
@@ -88,7 +88,7 @@ A skill should do one thing in the pipeline. If a skill both plans AND executes,
88
88
  If the skill is part of a chain, show it:
89
89
 
90
90
  ```
91
- /cbp-round-start (planning) → [user approval] → /cbp-round-execute (auto)
91
+ /cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
92
92
  ```
93
93
 
94
94
  ### Approval Gates
@@ -42,7 +42,7 @@ Triggered by `/cbp-task-start` (Step 3.6, optional stale-check), `/cbp-task-comp
42
42
  - **Cancel** — abort the skill.
43
43
  - `unstaged_dirty=false AND staged_present=true` → print one informational line and proceed to Step 1:
44
44
  `Staged changes detected — proceeding with merge.`
45
- (Pre-staged files will be included in the merge commit at Step 2 — this is intentional; the caller already approved them via /cbp-round-update.)
45
+ (Pre-staged files will be included in the merge commit at Step 2 — this is intentional; the caller already approved them via /cbp-round-complete.)
46
46
  - `unstaged_dirty=false AND staged_present=false` → proceed silently to Step 1.
47
47
  - Either `git diff` command exits with code ≥ 2 (git hard error — not-a-repo, detached HEAD with no commits, index lock, corrupt object store): surface the raw error output and STOP. Do NOT proceed to Step 1.
48
48
 
@@ -0,0 +1,164 @@
1
+ ---
2
+ scope: org-shared
3
+ name: cbp-round-complete
4
+ description: Reconcile user git-add approvals, complete the round, and route to the next step
5
+ argument-hint: [chk-task-round | task-round]
6
+ triggers: [cbp-task-check, cbp-standalone-task-check, cbp-round-input]
7
+ effort: low
8
+ ---
9
+
10
+ ## Kind Detection
11
+
12
+ Inspect the resolved identifier from argument parsing to determine the task kind:
13
+
14
+ | Identifier shape | KIND |
15
+ |-----------------|------|
16
+ | `{task}-{round}` (2-segment, e.g. `45-2`) | `standalone` |
17
+ | `{chk}-{task}-{round}` (3-segment, e.g. `141-3-1`) | `checkpoint` |
18
+ | _(empty / free-text)_ | Check `get_current_standalone_task` first; if found → `standalone`. Else → `checkpoint` via `get_current_task`. |
19
+
20
+ Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
21
+
22
+ | Operation | `checkpoint` KIND | `standalone` KIND |
23
+ |-----------|------------------|-------------------|
24
+ | Get task | `get_current_task(repo_id)` | `get_current_standalone_task(repo_id)` |
25
+ | Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
26
+ | Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
27
+ | Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
28
+
29
+ # Round Complete Command
30
+
31
+ The **permission-gated finalizer** for a round that `/cbp-round-update` triaged as clean. It reconciles which files the **user** approved via `git add`, completes the round, and routes to the next step.
32
+
33
+ This skill is gated by an `ask`-tier `Skill(cbp-round-complete)` permission rule in `settings.json`. **The permission prompt IS the user confirmation** — there is NO AskUserQuestion inside this skill. If the user declines the permission, the skill does not run: nothing is synced, no round is completed, and the user can stage files and re-invoke (directly or by re-running `/cbp-round-update`) when ready.
34
+
35
+ ## HARD GATE — Every Step Must Execute
36
+
37
+ Step 2 (sync-approvals CLI) MUST exit 0. If it fails, do NOT proceed to Step 3. Before completing the round, verify:
38
+
39
+ - [ ] `codebyplan round sync-approvals` exited 0
40
+
41
+ If this is false: DO NOT proceed to Step 3.
42
+
43
+ ## Instructions
44
+
45
+ ### Step 1: Parse `$ARGUMENTS`
46
+
47
+ Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
48
+
49
+ | Shape | Regex | Resolves to |
50
+ |-------|-------|-------------|
51
+ | `{chk}-{task}-{round}` (e.g. `108-1-2`) | `^[0-9]+-[0-9]+-[0-9]+$` | Checkpoint-bound: CHK-{chk} TASK-{task} ROUND-{round} |
52
+ | `{task}-{round}` (e.g. `45-2`) | `^[0-9]+-[0-9]+$` | Standalone: standalone TASK-{task} ROUND-{round} |
53
+ | _(empty)_ | — | Use Kind Detection to find active task and latest round |
54
+
55
+ Anything else is malformed — surface this error and stop:
56
+
57
+ ```
58
+ round-complete: invalid argument `{value}`. Expected:
59
+ 108-1-2 → CHK-108 TASK-1 ROUND-2 (checkpoint-bound)
60
+ 45-2 → standalone TASK-45 ROUND-2
61
+ (empty) → active task and latest round
62
+ ```
63
+
64
+ Note that `108-1` is **valid** here — it resolves to standalone TASK-108 ROUND-1 per the 2-segment task-round form. To target a checkpoint-bound round, use the 3-segment form `108-1-2`.
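
A minimal sketch of the Step 1 parse, offered as an illustration and not as part of the package. The two regexes and the three argument shapes come from the table above; the names `RoundTarget` and `parseRoundArg` are hypothetical, the error text is abbreviated, and the skill itself is prose executed by the agent, so it may not contain code like this.

```typescript
// Hypothetical types: only the shapes and regexes are taken from the Step 1 table.
type RoundTarget =
  | { kind: "checkpoint"; chk: number; task: number; round: number }
  | { kind: "standalone"; task: number; round: number }
  | { kind: "active" }; // empty argument: fall back to Kind Detection

function parseRoundArg(raw: string): RoundTarget {
  const value = raw.trim();
  if (value === "") return { kind: "active" };

  // 3-segment form, e.g. 108-1-2 → CHK-108 TASK-1 ROUND-2
  const three = /^([0-9]+)-([0-9]+)-([0-9]+)$/.exec(value);
  if (three) {
    return {
      kind: "checkpoint",
      chk: Number(three[1]),
      task: Number(three[2]),
      round: Number(three[3]),
    };
  }

  // 2-segment form, e.g. 45-2 → standalone TASK-45 ROUND-2
  const two = /^([0-9]+)-([0-9]+)$/.exec(value);
  if (two) {
    return { kind: "standalone", task: Number(two[1]), round: Number(two[2]) };
  }

  // Anything else is malformed: surface the error and stop.
  throw new Error("round-complete: invalid argument " + JSON.stringify(value));
}
```

Under this reading, `parseRoundArg("108-1")` yields a standalone target (TASK-108 ROUND-1), which matches the note above that the 2-segment form never resolves to a checkpoint-bound round.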
65
+
66
+ ### Step 1.5: Get Current Task and Round
67
+
68
+ Given the parse from Step 1:
69
+
70
+ | Parse | Resolution path |
71
+ |-------|-----------------|
72
+ | `{chk}-{task}-{round}` | MCP `get_checkpoints(repo_id)` → filter `number === {chk}`. MCP `get_tasks(checkpoint_id)` → filter `number === {task}`. MCP `get_rounds(task_id)` → filter `number === {round}`. |
73
+ | `{task}-{round}` | MCP `get_standalone_rounds` via `get_current_standalone_task` or direct task lookup → filter `number === {round}`. |
74
+ | _(empty)_ | Use Kind Detection: checkpoint KIND → MCP `get_current_task(repo_id)` + `get_rounds(task_id)`; standalone KIND → MCP `get_current_standalone_task(repo_id)` + `get_standalone_rounds(standalone_task_id)`. |
75
+
76
+ If no task found: `No active task. Nothing to complete.`
77
+
78
+ ### Step 2: Sync git diff + approvals via CLI
79
+
80
+ Reconcile which files the user has approved by staging them. Run:
81
+
82
+ ```
83
+ npx codebyplan round sync-approvals --round-id <round_id> --task-id <task_id>
84
+ ```
85
+
86
+ The CLI auto-resolves the caller worktree id with the following precedence:
87
+ 1. `--caller-worktree-id <uuid>` override (if passed — skips all resolution)
88
+ 2. Per-device branch-keyed cache (`.codebyplan/worktree.local.json`)
89
+ 3. In-process tuple API call: `POST /worktrees/resolve` using `(device_id, repo_path, branch)`
90
+
91
+ On the write path (non `--dry-run`), if the worktree id cannot be resolved the CLI **hard-fails with exit 1** and prints an actionable message. To pre-populate the cache:
92
+
93
+ ```
94
+ npx codebyplan resolve-worktree --cache
95
+ ```
96
+
97
+ If this worktree is not yet registered, run `npx codebyplan setup` first, then re-run `/cbp-round-complete`.
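
The resolution precedence described in Step 2 can be sketched as follows. This is an assumption-laden illustration, not the CLI's implementation: `resolveCallerWorktreeId`, `Lookup`, and both callback parameters are hypothetical stand-ins for the cache file read and the `POST /worktrees/resolve` call, and the real CLI presumably performs these steps asynchronously.

```typescript
// Hypothetical helper: only the 1-2-3 precedence and the write-path hard-fail come from the text.
type Lookup = () => string | undefined;

function resolveCallerWorktreeId(
  override: string | undefined, // value of --caller-worktree-id, if passed
  readCache: Lookup, // stands in for .codebyplan/worktree.local.json
  callResolveApi: Lookup, // stands in for POST /worktrees/resolve
  dryRun: boolean,
): string | undefined {
  // 1. An explicit override skips all resolution.
  // 2. Otherwise try the per-device cache, then 3. the resolve API.
  const id = override ? override : readCache() ?? callResolveApi();

  // On the write path an unresolved id is a hard failure (the CLI exits 1).
  if (id === undefined && !dryRun) {
    throw new Error("caller worktree id could not be resolved");
  }
  return id;
}
```

The API callback is only invoked when the cache misses, which mirrors the ordering in the list above.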
98
+
99
+ The CLI parses `git status --short`, merges drift + staging + web-UI flag, and writes both round and task (forwarding `caller_worktree_id` on both writes so the server honors the feat-worktree lock). A **cleanly staged** file (`git add`-ed, no further unstaged changes) becomes `user_approved: true`.
100
+
101
+ Read the stdout JSON: `{ added, stale_marked, reactivated, total_files }`.
102
+
103
+ If the command exits non-zero, surface the stderr and STOP. Do NOT proceed to Step 3.
104
+
105
+ This is the **single** explicit reconcile owned by this skill. (The `cbp-mcp-round-sync.sh` PostToolUse hook fires again right after Step 3's `complete_round` — see the note below — but that is the existing post-complete safety net, not a duplicate run to schedule here.)
106
+
107
+ ### Step 3: Complete the Round
108
+
109
+ Calculate duration from the round's `started_at` to now in minutes.
110
+
111
+ - **checkpoint KIND**: MCP `complete_round(round_id, duration_minutes)`.
112
+ - **standalone KIND**: MCP `complete_standalone_round(standalone_round_id, duration_minutes, caller_worktree_id)`. ⚠️ `caller_worktree_id` is REQUIRED — resolve via `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`. If `CALLER_WT` is empty, surface this warning and ask the user to confirm before proceeding:
113
+
114
+ ```
115
+ Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
116
+ The complete_standalone_round call may be rejected by the pre-guard. Proceed anyway? (yes / no)
117
+ ```
118
+
119
+ If the user confirms yes, proceed with `caller_worktree_id: ""`. If no, stop.
120
+
121
+ `complete_round` / `complete_standalone_round` sets the round `completed`, locks all `file_changes` for the round (`approval_locked: true`), and returns `unapproved_files[]` + `unapproved_count`. Hold those for routing.
122
+
123
+ > **PostToolUse hook note**: completing the round fires the `cbp-mcp-round-sync.sh` PostToolUse hook (matcher `mcp__codebyplan__complete_round`), which runs `sync-approvals` once more as a post-complete safety net for any approval drift between Step 2 and the lock. This is **expected** and is not double-processing — Step 2 is the pre-complete reconcile that makes `unapproved_count` accurate for routing; the hook is the existing catch-up. Note the hook matches `complete_round` only — `complete_standalone_round` is **not** covered by it (a pre-existing gap), so standalone rounds rely solely on this skill's Step 2 reconcile.
124
+
125
+ ### Step 4: Route
126
+
127
+ **4a — Count files** — Display: `"Round N complete — Files: X total, Y approved, Z pending"`.
128
+
129
+ **4b — Route on `unapproved_count`** (from Step 3's `complete_round` response):
130
+
131
+ - **`unapproved_count === 0`** (every file user-approved): the user has signed off on the whole round.
132
+ - checkpoint KIND → auto-trigger `/cbp-task-check`.
133
+ - standalone KIND → auto-trigger `/cbp-standalone-task-check`.
134
+ - **`unapproved_count > 0`** (user withheld approval on some files): the unstaged files are the signal that more work is wanted on them. Auto-trigger `/cbp-round-input` — its Step 2 deep analysis reads exactly those `user_approved === false` files and formulates the next round's requirements. This route is **independent of how many files are staged**; round-input is reachable even when zero files were staged.
135
+
136
+ - **Degenerate auto-loop guard**: if the just-completed round had `round.context.auto_loop_mode === true` AND it was a clean exit (no `improve_round_findings[]`, no hard-fail — which is why `/cbp-round-update` triaged it to round-complete in the first place), do NOT auto-trigger `/cbp-round-input`. Its auto-loop path transcribes the prior round's findings verbatim, and a clean round has none — auto-triggering would spin on an empty input. Instead surface the clean-exit note below and STOP; the user stages the pending files and re-invokes (or runs `/cbp-round-input` manually). Persist `round.context.round_complete.degenerate_auto_loop_exit = true`.
137
+
138
+ ```
139
+ ## Round N Complete — Auto-loop finished clean
140
+
141
+ **Files**: X total, Y approved, Z pending
142
+
143
+ Pending files passed all checks; they are just not staged. Stage them
144
+ (`git add <path>`) to finish the task, or run /cbp-round-input to start
145
+ another round.
146
+ ```
147
+
148
+ Persist a breadcrumb on the round via `update_round` / `update_standalone_round` per KIND: `round.context.round_complete = { staged_count, unstaged_count, route, decided_at }`.
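
The Step 4b routing, including the degenerate auto-loop guard, can be summarised as a small decision function. This is a sketch under stated assumptions: the field and function names (`CompletedRound`, `routeAfterComplete`) are hypothetical, the inputs are assumed to have been read from the `complete_round` response and `round.context`, and the breadcrumb write is not modelled.

```typescript
type Kind = "checkpoint" | "standalone";

// Hypothetical shape: mirrors the values Step 4 reads, not an actual MCP response type.
interface CompletedRound {
  kind: Kind;
  unapprovedCount: number; // `unapproved_count` from the complete call
  autoLoopMode: boolean; // `round.context.auto_loop_mode === true`
  hasFindings: boolean; // any entries in `improve_round_findings[]`
  hardFail: boolean;
}

type Route =
  | "/cbp-task-check"
  | "/cbp-standalone-task-check"
  | "/cbp-round-input"
  | "stop";

function routeAfterComplete(round: CompletedRound): Route {
  // Every file user-approved: hand off to the task-level check for this KIND.
  if (round.unapprovedCount === 0) {
    return round.kind === "checkpoint"
      ? "/cbp-task-check"
      : "/cbp-standalone-task-check";
  }

  // Degenerate auto-loop guard: a clean auto-loop round has no findings to
  // transcribe, so auto-triggering round-input would spin on empty input.
  const cleanExit = !round.hasFindings && !round.hardFail;
  if (round.autoLoopMode && cleanExit) {
    return "stop";
  }

  // Some files were withheld: the next round is formulated from them.
  return "/cbp-round-input";
}
```

Note that the number of staged files never appears in the decision; only `unapprovedCount` and the auto-loop state do, which is the independence the step describes.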
149
+
150
+ ## Key Rules
151
+
152
+ - **Permission prompt = confirmation** — gated by `ask`-tier `Skill(cbp-round-complete)`. NEVER add an AskUserQuestion to confirm running; the harness prompt is the gate. A declined permission is a clean no-op.
153
+ - **Step 2 (CLI) must exit 0** — if it fails, STOP before `complete_round`. The merge semantics are enforced by the CLI.
154
+ - **NEVER ask the user to git add files** — Step 2 only reads staging status. **NEVER stage files** — Claude does not touch the git staging area; the user's `git add` is the approval signal.
155
+ - **standalone KIND Step 3**: `caller_worktree_id` is REQUIRED for `complete_standalone_round` — always resolve and pass it.
156
+ - **Auto-triggered by `/cbp-round-update`** (clean triage), or run manually by the user.
157
+
158
+ ## Integration
159
+
160
+ - **Gates**: `ask`-tier `Skill(cbp-round-complete)` permission prompt — the harness confirms before the skill runs; a decline makes NO writes. There is no in-skill AskUserQuestion.
161
+ - **Triggered by**: `/cbp-round-update` (auto, clean triage), or user manually
162
+ - **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND); delegates git+approval sync to `npx codebyplan round sync-approvals`
163
+ - **Writes**: MCP `complete_round` / `complete_standalone_round` (per KIND); `update_round` / `update_standalone_round` (`round_complete` breadcrumb); round+task `files_changed` written by the CLI
164
+ - **Triggers**: `/cbp-task-check` (checkpoint KIND, all files approved), `/cbp-standalone-task-check` (standalone KIND, all files approved), `/cbp-round-input` (some files unapproved — fires independent of staging count)
@@ -16,7 +16,7 @@ See `reference/inline-fallback.md` for full trigger table, procedure, and covera
16
16
  ## Pipeline
17
17
 
18
18
  ```
19
- /cbp-round-execute → /cbp-round-end → [code review + user decisions] → /cbp-round-update
19
+ /cbp-round-execute → /cbp-round-end → [code review + auto-apply in-scope] → /cbp-round-update
20
20
  ```
21
21
 
22
22
  ## Identifier Notation
@@ -126,9 +126,13 @@ Wait for agent to complete. If the spawn fails for any reason, apply the inline-
126
126
 
127
127
  **If `status: 'no_findings'`:** show `### Code Review\nNo issues found. Code looks good.` and skip to Step 8.
128
128
 
129
- **If findings exist**, present them grouped by severity (table + per-finding details), then ask the user via AskUserQuestion which to fix: `all`, `1,2` (specific numbers), `none`, or `inline` (only when all findings qualify under the Trivial-Resolution Exception).
129
+ **If findings exist**, present them grouped by severity (table + per-finding details).
130
130
 
131
- Example tables and the `inline` option gating spec: see `reference/findings-presentation.md`.
131
+ **Under `auto_loop_mode === true`**: do NOT auto-apply here — Step 8's auto-loop path accepts all findings into `improve_round_findings[]` and defers the fixes to the next loop round. Skip straight to Step 8.
132
+
133
+ **Manual mode**: **auto-apply all in-scope findings inline**. A finding is *in-scope* when every file it references is within the round's `files_changed[]`. The round-end orchestrator (main context — it has Edit/Write) applies these fixes directly; the `cbp-improve-round` agent stays read-only/advisory and never writes. Record each applied fix in `round.context.inline_fix_log` (findings indices, rationale, `fixes[]`, applied_at). After applying, re-run the verification scoped to the modified files (hook syntax check for `.sh`; `cbp-testing-qa-agent` for code) per `reference/findings-presentation.md`; if it fails, do NOT record the fix — treat the finding as out-of-scope instead. Findings that reference files OUTSIDE `files_changed[]` are **out-of-scope** — do NOT apply them; save them to `improve_round_findings[]` so Step 8 routes them to `/cbp-round-input` or a new task. There is no findings-decision AskUserQuestion — the round was already approved at the `/cbp-round-execute` permission prompt. The baseline-regression gate above is the ONLY user decision in this step.
134
+
135
+ Example tables and the in-scope/out-of-scope classification: see `reference/findings-presentation.md`.
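
The in-scope rule in Step 7 (a finding is in-scope when every file it references is within the round's `files_changed[]`) can be sketched as below. The `Finding` shape and the `classifyFindings` name are hypothetical; the actual classification spec lives in `reference/findings-presentation.md`, which this diff excerpt truncates, so treat this as one plausible reading of it.

```typescript
// Hypothetical finding shape: an index plus the file paths the finding references.
interface Finding {
  index: number;
  files: string[];
}

function classifyFindings(
  findings: Finding[],
  filesChanged: string[],
): { inScope: Finding[]; outOfScope: Finding[] } {
  const changed = new Set(filesChanged);
  const inScope: Finding[] = [];
  const outOfScope: Finding[] = [];

  for (const finding of findings) {
    // In-scope only when EVERY referenced file is in files_changed[].
    // Assumption: a finding that references no files passes vacuously.
    if (finding.files.every((path) => changed.has(path))) {
      inScope.push(finding);
    } else {
      outOfScope.push(finding);
    }
  }
  return { inScope, outOfScope };
}
```

In manual mode the `inScope` list would be the set applied inline and logged to `round.context.inline_fix_log`, while `outOfScope` would be saved to `improve_round_findings[]` for Step 8 to route.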
132
136
 
133
137
  ### Step 8: Route Based on Decisions
134
138
 
@@ -136,33 +140,31 @@ Example tables and the `inline` option gating spec: see `reference/findings-pres
136
140
 
137
141
  - Auto-accept ALL findings into `improve_round_findings[]` regardless of severity (the user opted into the loop).
138
142
  - Skip the polish-spiral stop-gate (auto-loop has its own cap-exhausted termination).
139
- - Skip the user findings-decision prompt.
143
+ - Skip Step 7's inline auto-apply (findings are deferred to the next loop round, not applied this round).
140
144
  - Save findings via `update_round` exactly as in manual mode.
141
- - Auto-trigger `/cbp-round-update` immediately. round-update Step 6 will decide whether to spawn another round or exit clean (see cbp-round-update SKILL.md Step 6).
145
+ - Auto-trigger `/cbp-round-update` immediately. round-update triages the round and either routes to `/cbp-round-input` (spawn another round) or `/cbp-round-complete` (clean exit) — see cbp-round-update SKILL.md Step 2/3.
142
146
 
143
147
  **Else (manual mode — flag absent or false):**
144
148
 
145
- Run the existing flow:
149
+ Step 7 already auto-applied in-scope findings and logged them to `round.context.inline_fix_log`. Now record any out-of-scope findings and route:
146
150
 
147
- 1. After round 2+, surface the polish-spiral stop-gate per `polish-spiral-stop-gate.md` (defer-to-followups vs continue).
148
- 2. Surface the findings-decision AskUserQuestion (with optional `inline` per the gating rules in `reference/findings-presentation.md`).
149
- 3. Save accepted/rejected findings to round context via MCP `update_round`:
151
+ 1. **Polish-spiral stop-gate** (round 2+ only): if this is round 2 or later AND the prior round also ended with code-review fixes, surface a one-line stop-gate via AskUserQuestion — *defer remaining polish to a follow-up task* vs *continue with another round*. This is a genuine user decision about scope (it guards against endless low-value polish loops), not a flow-control prompt. Skip on round 1.
152
+ 2. Save out-of-scope findings (those NOT auto-applied in Step 7) to round context via MCP `update_round`:
150
153
  ```json
151
154
  {
152
155
  "context": {
153
- "improve_round_findings": [accepted findings],
154
- "improve_round_rejected": [rejected findings with user reasons]
156
+ "improve_round_findings": [out-of-scope findings]
155
157
  }
156
158
  }
157
159
  ```
158
- 4. Auto-trigger `/cbp-round-update`. round-update will see unapproved files and route to `/cbp-round-input`; `/cbp-round-input` picks up the findings from round context and includes them in the new round's requirements automatically.
160
+ 3. Auto-trigger `/cbp-round-update`. round-update triages the round: if out-of-scope findings (or a hard-fail) remain it routes to `/cbp-round-input` (which picks up the findings from round context and includes them in the new round's requirements automatically); if the round is clean it routes to `/cbp-round-complete` (the permission-gated finalizer that reconciles the user's `git add`s and completes the round).
159
161
 
160
162
  ## Key Rules
161
163
 
162
- - Claude NEVER git adds files — user does code review
164
+ - Claude NEVER git adds files — user approval is via git staging at `/cbp-round-complete`
163
165
  - Auto-triggers `/cbp-round-update` after findings are handled
164
166
  - `/cbp-round-end` is auto-triggered by `/cbp-round-execute` (user does not call it directly)
165
- - Findings are **presented for user decision** — never auto-fix without user consent
167
+ - In-scope findings are **auto-applied inline** by the round-end orchestrator (the round was already approved at the `/cbp-round-execute` permission); out-of-scope findings route to `/cbp-round-input`. `cbp-improve-round` stays read-only/advisory. Baseline-regression accept (Step 7 gate) stays a user decision — baselines are NEVER auto-accepted.
166
168
 
167
169
  ## Integration
168
170
 
@@ -1,6 +1,6 @@
1
1
  # Findings Presentation in `/cbp-round-end` Step 7
2
2
 
3
- When `improve-round` returns findings, Step 7 presents them grouped by severity and asks the user how to proceed.
3
+ When `improve-round` returns findings, Step 7 presents them grouped by severity, then **auto-applies in-scope findings inline** (manual mode) or defers them to the next loop round (auto-loop mode). There is no findings-decision prompt.
4
4
 
5
5
  ## Example output
6
6
 
@@ -22,26 +22,16 @@ When `improve-round` returns findings, Step 7 presents them grouped by severity
22
22
  [description + suggested fix from agent]
23
23
  ```
24
24
 
25
- ## AskUserQuestion options
25
+ ## Auto-apply model (manual mode)
26
26
 
27
- ```
28
- Which findings should be fixed?
29
- - "all" — fix all findings in a new round
30
- - "1,2" — fix specific findings by number
31
- - "none" — skip all, proceed to round-update
32
- - "inline" — fix in THIS round before proceeding (only offered when all findings qualify under the Trivial-Resolution Exception below)
33
- - Or explain why specific findings are not issues
34
- ```
35
-
36
- ## "inline" option gating
27
+ Step 7 auto-applies all **in-scope** findings inline — no user prompt. A finding is *in-scope* when every file it references is within the round's `files_changed[]`; it is *out-of-scope* otherwise.
37
28
 
38
- Only present the "inline" option when ALL pending findings simultaneously qualify under the **Trivial-Resolution Exception** (see subsection below):
29
+ - **In-scope** → the round-end orchestrator (main context, has Edit/Write) applies the fix directly via `Edit` / `Write`, re-runs the verification commands (hook syntax check + `cbp-testing-qa-agent` scoped to modified files), and records it in `round.context.inline_fix_log = { findings: [ids], rationale, fixes: [...], applied_at: <ISO> }`. The `cbp-improve-round` agent stays read-only/advisory and never writes.
30
+ - **Out-of-scope** → saved to `round.context.improve_round_findings[]`; Step 8 routes them to `/cbp-round-input` (next round) or a new task per the Infra Issue Absorption Contract below.
39
31
 
40
- 1. Diff is comment-only, annotation-only, banner-only, or single-value rename — no logic, no control flow
41
- 2. Each fix is under ~5 minutes of executor time
42
- 3. Verification is automatic — the existing test/lint/audit pipeline confirms the change
32
+ The only user decision in Step 7 is the **baseline-regression accept** gate (baselines are NEVER auto-accepted). Under `auto_loop_mode`, Step 7 does not auto-apply — all findings are accepted into `improve_round_findings[]` and deferred to the next loop round.
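
The in-scope / out-of-scope split described above can be sketched as a small pure function. This is an illustrative sketch only, not the package's implementation: the `Finding` shape (`id`, `files`) and the names `partitionFindings`, `applyInline`, and `deferred` are assumptions; only `files_changed[]`, `improve_round_findings[]`, and `auto_loop_mode` come from the text.

```typescript
// Hypothetical finding shape: the `id` and `files` fields are assumptions.
interface Finding {
  id: string;
  files: string[];
}

interface FindingPartition {
  applyInline: Finding[]; // in-scope: fixed inline by the round-end orchestrator
  deferred: Finding[]; // saved to improve_round_findings[] for a later round
}

function partitionFindings(
  findings: Finding[],
  filesChanged: string[],
  autoLoopMode: boolean,
): FindingPartition {
  // Under auto_loop_mode nothing is auto-applied; every finding is deferred.
  if (autoLoopMode) {
    return { applyInline: [], deferred: [...findings] };
  }
  const changed = new Set(filesChanged);
  const applyInline: Finding[] = [];
  const deferred: Finding[] = [];
  for (const finding of findings) {
    // In-scope only when EVERY referenced file is in files_changed[].
    // Assumption: a finding that references no files counts as in-scope.
    const inScope = finding.files.every((file) => changed.has(file));
    (inScope ? applyInline : deferred).push(finding);
  }
  return { applyInline, deferred };
}
```

A finding that touches even one file outside the round's diff is deferred as a whole, which matches the "every file it references" wording.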
43
33
 
44
- If any finding fails these gates, omit the "inline" option entirely (revert to the 3-option prompt). When inline is chosen, apply the edits via direct `Edit`, re-run the verification commands (hook syntax check + `cbp-testing-qa-agent` scoped to modified files) and proceed to `/cbp-round-update` without spawning a new round. Document the decision in `round.context.inline_fix_log = { findings: [ids], rationale: "trivial-resolution exception", applied_at: <ISO> }` (mirrors the `bypass_log` shape from the Pipeline Bypass subsection below).
34
+ The **Trivial-Resolution Exception** below still governs the deeper bypass cases (skipping executor / testing-qa / improve-round for ≤5-line non-logic corrective rounds); it is referenced by `/cbp-round-execute` and `/cbp-task-testing` for infra-issue absorption.
45
35
 
46
36
  ---
47
37
 
@@ -15,6 +15,10 @@ Execution and validation phase. Receives the approved plan from `/cbp-round-star
15
15
  /cbp-round-start → /cbp-round-execute → /cbp-round-end (auto)
16
16
  ```
17
17
 
18
+ ## Approval Model
19
+
20
+ The `ask`-tier `Skill(cbp-round-execute)` permission prompt (configured in `settings.json`) is the **plan-approval gate** handed off from `/cbp-round-start`: confirming the permission approves the plan; declining it returns control to `/cbp-round-start` (re-plan with feedback) or `/cbp-round-input` (wrong direction). Once execution begins, the executors (`cbp-round-executor`, `cbp-mechanical-edits`) and the 3-INLINE / 3-SURVEY paths apply edits **automatically** — there is NO in-skill AskUserQuestion for approval. The only downstream user decisions are genuine ones: the dev-server start prompt (Step 4) and the baseline-regression accept gate (`/cbp-round-end` Step 7).
21
+
18
22
  ## Identifier Notation
19
23
 
20
24
  This skill operates on the **active** task/round resolved via MCP `get_current_task` / `get_rounds` and does not accept a positional identifier argument. Canonical chk-task-round notation is defined in `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
@@ -24,7 +24,6 @@ Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
24
24
  | Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
25
25
  | Add round | `add_round(task_id, ...)` | `add_standalone_round(standalone_task_id, ...)` |
26
26
  | Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
27
- | Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
28
27
  | Update task | `update_task(task_id, ...)` | `update_standalone_task(standalone_task_id, ...)` |
29
28
 
30
29
  # Round Input Command
@@ -33,7 +32,7 @@ Gathers input for a new round. Performs deep analysis of unapproved files, requi
33
32
 
34
33
  ## When Used
35
34
 
36
- - After `/cbp-round-update` routes here (unapproved files)
35
+ - After `/cbp-round-update` triages a round as not-clean and routes here, or `/cbp-round-complete` routes here (files left unapproved after completing the round)
37
36
  - After `/cbp-round-execute` Step 6 routes here (structural failure or retry-exhausted hard-fail)
38
37
  - After `/clear` + `/cbp-todo` reloads context and triggers this
39
38
  - When user wants to start a new round with specific changes
@@ -78,8 +77,9 @@ If the argument matches the numeric regex, resolve the target task/round from DB
78
77
  **2f:** Extract testing-qa failures from latest round context (`context.testing_qa_output`)
79
78
 
80
79
  **2g:** Extract code review findings from latest round context (`context.improve_round_findings`).
81
- These are user-accepted findings from `improve-round` agent — bugs, logic errors, edge cases
82
- that the user agreed should be fixed. Include them as high-priority requirements.
80
+ These are out-of-scope findings from the `improve-round` agent — bugs, logic errors, edge cases
81
+ that round-end could not auto-apply inline (they reference files outside the prior round's
82
+ `files_changed[]`). Include them as high-priority requirements.
83
83
 
84
84
  **2h:** Identify root causes — not "file X is wrong" but "requirement Y was not met because Z"
85
85
 
@@ -175,12 +175,12 @@ If this command is triggered **directly** (not via `/cbp-todo`) and no context i
175
175
  - **Deep analysis is MANDATORY** — always runs, even if arguments provided (for context)
176
176
  - **Analysis reads from DB (MCP)**, not conversation history
177
177
  - **Follow-up rounds get same depth as round 1** — no quick-fix behavior
178
- - **Never ask to git add** — file approval is handled by `/cbp-round-update`
178
+ - **Never ask to git add** — user file approval (git staging) is reconciled by `/cbp-round-complete`
179
179
  - **Update all context locations** — task, checkpoint, and round should all have consistent information
180
180
 
181
181
  ## Integration
182
182
 
183
- - **Triggered by**: `/cbp-round-update` (auto, unapproved files), `/cbp-round-execute` (auto, on hard-fail after retry exhausted), `/cbp-todo` (after /clear), user manually
183
+ - **Triggered by**: `/cbp-round-update` (auto, not-clean triage), `/cbp-round-complete` (auto, files left unapproved after completing the round), `/cbp-round-execute` (auto, on hard-fail after retry exhausted), `/cbp-todo` (after /clear), user manually
184
184
  - **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND), file contents (Read tool)
185
185
  - **Writes**: MCP `update_task` / `update_standalone_task` (context), `update_checkpoint` (context, if checkpoint KIND and needed)
186
186
  - **Triggers**: `/cbp-round-start` (auto)
@@ -30,7 +30,7 @@ Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
30
30
 
31
31
  # Round Start Command
32
32
 
33
- Planning phase for a new round. Analyzes context, creates plan, gets user approval. NO execution or testing — those are separate commands.
33
+ Planning phase for a new round. Analyzes context, creates a plan, then auto-triggers `/cbp-round-execute` — the `ask`-tier permission prompt on that skill IS the user's plan approval. NO execution or testing — those are separate commands.
34
34
 
35
35
  ## Inline-Fallback for Planner Spawn Failure
36
36
 
@@ -42,17 +42,17 @@ Procedure summary (pointer back to canonical):
42
42
  2. Walk the planner's documented Phase 0-8 checklist inline using `Read` / `Grep` / `Bash` / MCP `get_*` — `agents/cbp-task-planner.md` is the inline script. Phase 1.5 (Requirement Premise Verification) and Phase 4.7 (Migration Shape-Distribution Pre-Flight) are MANDATORY in fallback mode — these are the gates the agent uniquely enforces; skipping them produces unverified plans.
43
43
  3. Populate the planner's output contract (`approved_plan` shape: `files_to_modify[]`, `deliverables`, `specialist_needs`, `round_type`, `shape_distribution` if applicable, `context_summary`) with `mode: 'inline_fallback'`.
44
44
  4. Apply the pre-emptive-skip rule: when the same failure class fired in the previous spawn of this session, skip the spawn attempt entirely and go straight to inline.
45
- 5. Continue the skill — do NOT abort. The plan still requires user approval at Step 9.
45
+ 5. Continue the skill — do NOT abort. Step 9 auto-triggers `/cbp-round-execute`; the `ask`-tier permission prompt on that skill is the user's plan approval (see Step 8).
46
46
 
47
47
  Inline-fallback is NOT a quality downgrade trapdoor — Phase 1.5 row-by-row verification is mandatory. A fallback plan that skipped premise verification is a regression caught by the next session's cbp-improve-round.
48
48
 
49
49
  ## Pipeline
50
50
 
51
51
  ```
52
- /cbp-round-start (planning) → [user approval] → /cbp-round-execute (auto)
52
+ /cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
53
53
  ```
54
54
 
55
- **Auto-loop mode**: when `round.context.auto_loop_mode === true` flows in from `/cbp-round-input`, Step 6 (Q&A) and Step 8 (user approval) skip user prompts. See cbp-round-update SKILL.md Step 4 (auto-loop decision) and cbp-round-end SKILL.md Step 8 for the full contract.
55
+ **Auto-loop mode**: when `round.context.auto_loop_mode === true` flows in from `/cbp-round-input`, Step 6 (Q&A) is skipped and Step 8's `/cbp-round-execute` permission is auto-approved. See cbp-round-update SKILL.md Step 3b (auto-loop decision) and cbp-round-end SKILL.md Step 8 for the full contract.
56
56
 
57
57
  ## Instructions
58
58
 
@@ -176,7 +176,7 @@ input:
176
176
 
177
177
  Wait for planner output.
178
178
 
179
- ### Step 8: User Approval
179
+ ### Step 8: Present Plan
180
180
 
181
181
  Present the plan to user:
182
182
 
@@ -208,24 +208,21 @@ Present the plan to user:
208
208
 
209
209
  Single-wave plans present the existing flat plan view (no wave table) — backward compatible.
210
210
 
211
- **If `auto_loop_mode === true`**: skip the AskUserQuestion below; auto-approve the plan. Log `round.context.plan_approval = { mode: "auto_loop", auto_approved_at: <ISO> }`. Surface a one-line note in the chat output: `"Auto-approved under auto_loop_mode (round N of cap C)"` so the user can see the loop is active. Proceed to Step 9.
211
+ **Plan approval is the `ask`-tier `Skill(cbp-round-execute)` permission prompt** — there is NO approve/needs-changes/wrong AskUserQuestion here. After presenting the plan, proceed to Step 9, which auto-triggers `/cbp-round-execute`; the harness then shows the `ask`-tier permission prompt, and confirming it IS the user's go-ahead on the plan.
212
212
 
213
- **Else (manual mode)**, ask user via AskUserQuestion with explicit options:
213
+ **Denied-execute handling** — if the user declines the `/cbp-round-execute` permission, the plan does not run. Treat the decline as "the plan must change":
214
214
 
215
- 1. **Yes** — approve and start execution
216
- 2. **No — needs changes** — user provides feedback, revise the plan (re-run Step 7 with feedback)
217
- 3. **No — totally wrong** — discard plan, return to `/cbp-round-input` for new requirements
215
+ - **Minor changes**: collect the user's feedback, re-spawn `cbp-task-planner` with it as a constraint (re-run Step 7), present the revised plan, and re-trigger `/cbp-round-execute`.
216
+ - **Wrong direction**: save the rejection reason to round context and auto-trigger `/cbp-round-input` for new requirements.
218
217
 
219
- **If "Yes"**: proceed to Step 9.
220
- **If "Needs changes"**: collect user feedback, re-spawn `cbp-task-planner` with feedback as constraint, present revised plan, ask again.
221
- **If "Totally wrong"**: save rejection reason to round context, auto-trigger `/cbp-round-input`.
218
+ **If `auto_loop_mode === true`**: the loop auto-approves — log `round.context.plan_approval = { mode: "auto_loop", auto_approved_at: <ISO> }`, surface a one-line note `"Auto-approved under auto_loop_mode (round N of cap C)"`, and proceed to Step 9 (the `/cbp-round-execute` permission is auto-approved under the loop).
222
219
 
223
220
  ### Step 9: Auto-trigger Round Execute
224
221
 
225
- On approval, save planner output to round context via MCP `update_round` / `update_standalone_round` per KIND, then trigger `/cbp-round-execute`.
222
+ Save planner output to round context via MCP `update_round` / `update_standalone_round` per KIND, then trigger `/cbp-round-execute`. The `ask`-tier permission prompt on `/cbp-round-execute` is the user's plan approval (see Step 8).
226
223
 
227
224
  ```
228
- Plan approved. Starting execution phase...
225
+ Starting execution phase...
229
226
  ```
230
227
 
231
228
  ## Key Rules
@@ -1,9 +1,9 @@
1
1
  ---
2
2
  scope: org-shared
3
3
  name: cbp-round-update
4
- description: Check file approvals, complete round, and route to next step
4
+ description: Triage a finished round (Claude-only) and route to round-complete or round-input
5
5
  argument-hint: [chk-task-round | task-round]
6
- triggers: [cbp-round-input, cbp-task-check, cbp-standalone-task-check]
6
+ triggers: [cbp-round-complete, cbp-round-input]
7
7
  effort: low
8
8
  ---
9
9
 
@@ -17,28 +17,24 @@ Inspect the resolved identifier from argument parsing to determine the task kind
17
17
  | `{chk}-{task}-{round}` (3-segment, e.g. `141-3-1`) | `checkpoint` |
18
18
  | _(empty / free-text)_ | Check `get_current_standalone_task` first; if found → `standalone`. Else → `checkpoint` via `get_current_task`. |
19
19
 
20
- Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
20
+ Set `KIND` for the rest of this skill. round-update is **read + triage only** — it reads round state and routes; it never completes the round or writes file approvals. MCP read/audit tool names vary by KIND:
21
21
 
22
22
  | Operation | `checkpoint` KIND | `standalone` KIND |
23
23
  |-----------|------------------|-------------------|
24
24
  | Get task | `get_current_task(repo_id)` | `get_current_standalone_task(repo_id)` |
25
25
  | Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
26
- | Add round | `add_round(task_id, ...)` | `add_standalone_round(standalone_task_id, ...)` |
27
- | Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
28
- | Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
29
- | Update task | `update_task(task_id, ...)` | `update_standalone_task(standalone_task_id, ...)` |
26
+ | Update round (audit only) | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
30
27
 
31
- # Round Update Command
32
-
33
- Checks file approval status, completes the round, and routes to next step. NEVER asks the user to git add or stage anything — it only reads current state.
28
+ The completion + file-approval reconcile (`sync-approvals`, `complete_round` / `complete_standalone_round`) now lives in `/cbp-round-complete`.
34
29
 
35
- ## HARD GATE — Every Step Must Execute
30
+ # Round Update Command
36
31
 
37
- Step 2 (sync-approvals CLI) MUST exit 0. If it fails, do NOT proceed to Step 3. Before completing the round, verify:
32
+ **Claude-only, autonomous triage.** round-update inspects a finished round's automated state — which files Claude approved (`claude_approved`), whether testing-QA hard-failed, and whether `improve-round` left outstanding findings — and routes to exactly one next step. It makes **no writes** beyond an audit breadcrumb and **never prompts the user**: it is auto-triggered by `/cbp-round-end` and runs without a confirmation gate. The user-facing confirmation has moved to `/cbp-round-complete` (an `ask`-tier permission prompt). round-update NEVER reads or touches git staging — user approval is reconciled later by `/cbp-round-complete`.
38
33
 
39
- - [ ] `codebyplan round sync-approvals` exited 0
34
+ ## Routing in one line
40
35
 
41
- If this is false: DO NOT proceed to Step 3.
36
+ - **Round is clean** → trigger `/cbp-round-complete` (the permission-gated finalizer reconciles your `git add`s and completes the round).
37
+ - **Round needs more work** → trigger `/cbp-round-input` (more changes / planning).
42
38
 
43
39
  ## Instructions
44
40
 
@@ -82,147 +78,39 @@ Given the parse from Step 1:
82
78
 
83
79
  If no task found: `No active task. Nothing to update.`
84
80
 
85
- ### Step 1.6: Permission Gate
86
-
87
- Step 1 (parse) and Step 1.5 (resolve task + round) are read-only. Step 2 onward mutates state — the `sync-approvals` CLI write, `complete_round`, and the auto-trigger of the next step. Before any of that, confirm the user wants this skill to run.
88
-
89
- This gate fires on **every** invocation — manual, auto-triggered by `/cbp-round-end`, and on every iteration of the Step 4 auto-loop. There is no bypass. It sits **before** and is independent of the top-of-file HARD GATE (which governs Step 2's exit code, not user consent), and it is distinct from the Branch A clean-exit route choice in Step 5 (that one picks where to go next; this one authorizes running at all).
90
-
91
- Ask via AskUserQuestion, naming the resolved round and disclosing the actions:
92
-
93
- > Update ROUND-{N} of TASK-{M}?
94
- > This will sync the git diff + file approvals, complete the round, and route to the next step (which may auto-start a new round).
95
- >
96
- > - **Proceed** — run the skill
97
- > - **Cancel** — do nothing
98
-
99
- - **Proceed** → continue to Step 2.
100
- - **Cancel** → abort cleanly: make NO writes (no `sync-approvals`, no `complete_round`, no auto-trigger) and exit with one line: `Cancelled by user — ROUND-{N} not updated.`
101
-
102
- ### Step 2: Sync git diff + approvals via CLI
103
-
104
- Run:
105
-
106
- ```
107
- npx codebyplan round sync-approvals --round-id <round_id> --task-id <task_id>
108
- ```
109
-
110
- The CLI auto-resolves the caller worktree id with the following precedence:
111
- 1. `--caller-worktree-id <uuid>` override (if passed — skips all resolution)
112
- 2. Per-device branch-keyed cache (`.codebyplan/worktree.local.json`)
113
- 3. In-process tuple API call: `POST /worktrees/resolve` using `(device_id, repo_path, branch)`
114
-
115
- On the write path (non `--dry-run`), if the worktree id cannot be resolved the CLI **hard-fails with exit 1** and prints an actionable message. To pre-populate the cache:
116
-
117
- ```
118
- npx codebyplan resolve-worktree --cache
119
- ```
120
-
121
- If this worktree is not yet registered, run `npx codebyplan setup` first, then re-run `/cbp-round-update`.
122
-
123
- The CLI parses `git status --short`, merges drift + staging + web-UI flag, and writes both round and task (forwarding `caller_worktree_id` on both writes so the server honors the feat-worktree lock).
124
-
125
- Read the stdout JSON: `{ added, stale_marked, reactivated, total_files }`.
126
-
127
- If the command exits non-zero, surface the stderr and STOP. Do NOT proceed to Step 3.
128
-
129
- ### Step 3: Complete Round
130
-
131
- Calculate duration from `started_at` to now in minutes.
132
-
133
- - **checkpoint KIND**: MCP `complete_round(round_id, duration_minutes)`.
134
- - **standalone KIND**: MCP `complete_standalone_round(standalone_round_id, duration_minutes, caller_worktree_id)`. ⚠️ `caller_worktree_id` is REQUIRED — resolve via `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`. If `CALLER_WT` is empty, surface this warning and ask user to confirm before proceeding:
135
-
136
- ```
137
- Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
138
- The complete_standalone_round call may be rejected by the pre-guard. Proceed anyway? (yes / no)
139
- ```
81
+ This step is **read-only**. There is no permission gate — round-update is autonomous (see the Key Rules below).
140
82
 
141
- If user confirms yes, proceed with `caller_worktree_id: ""`. If no, stop.
83
+ ### Step 2: Triage the Round
142
84
 
143
- ### Step 4: Auto-Loop Decision
85
+ Read the latest round's context: `round.context.testing_qa_output.totals.hard_fail`, `round.context.improve_round_findings[]`, `round.context.round_type`, and `round.files_changed[]` (each entry's `claude_approved`). Compute a single `clean` verdict:
144
86
 
145
- Read the latest round's `improve_round_findings[]` and `testing_qa_output.totals.hard_fail`. If EITHER is non-clean (findings non-empty OR `hard_fail === true`), the auto-loop must spawn the next round automatically.
87
+ - **Survey round** (`round.context.round_type === 'survey'`): no file diff exists, so QA/approval predicates do not apply. `clean = improve_round_findings[]` is empty.
88
+ - **Normal round**: `clean = (every file in files_changed[] has claude_approved === true) AND testing_qa_output.totals.hard_fail === false AND improve_round_findings[]` is empty.
146
89
 
147
- Procedure:
90
+ Display a one-line triage summary, e.g. `"ROUND-N triage: clean"` or `"ROUND-N triage: N findings / hard_fail=true → needs another round"`. round-update reads `claude_approved` only — it does **not** read git staging or `user_approved`; those belong to `/cbp-round-complete`.
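
The `clean` verdict from Step 2 can be sketched as a single predicate. This is a hedged sketch under stated assumptions, not the skill's actual code: the `RoundState` and `ChangedFile` interfaces and the `isRoundClean` name are hypothetical, and treating a missing `testing_qa_output` as not clean is a conservative assumption the text does not state.

```typescript
// Hypothetical shapes built from the field names used in Step 2.
interface ChangedFile {
  path: string;
  claude_approved: boolean;
}

interface RoundState {
  files_changed: ChangedFile[];
  context: {
    round_type?: string;
    improve_round_findings?: unknown[];
    testing_qa_output?: { totals: { hard_fail: boolean } };
  };
}

function isRoundClean(round: RoundState): boolean {
  const noFindings = (round.context.improve_round_findings ?? []).length === 0;

  // Survey rounds produce no file diff, so only findings matter.
  if (round.context.round_type === "survey") {
    return noFindings;
  }

  const allApproved = round.files_changed.every(
    (file) => file.claude_approved === true,
  );
  // Assumption: absent QA output is treated as not clean (hard_fail unknown).
  const qaPassed = round.context.testing_qa_output?.totals.hard_fail === false;

  return allApproved && qaPassed && noFindings;
}
```

Note that the predicate reads `claude_approved` only; it never looks at git staging or `user_approved`, in line with the triage-only contract.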
148
91
 
149
- 1. Compute `next_index = (round.context.auto_loop_index ?? 0) + 1`.
150
- 2. **If `next_index > (round.context.auto_loop_cap ?? 5)`**: surface the cap-exhausted prompt via AskUserQuestion (options: extend cap, stop loop / drop into round-input, close task as-is). Persist `round.context.auto_loop_cap_exhausted = { user_choice, decided_at }` and route per choice.
151
- 3. **Otherwise**: persist `round.context.auto_loop_decision = { spawned_next: true, next_index, decided_at }` on the CURRENT round via `update_round` / `update_standalone_round` per KIND (audit trail), then auto-trigger `/cbp-round-input` with NO AskUserQuestion. Pass `auto_loop_mode: true`, `auto_loop_index: next_index`, `auto_loop_cap: (prior cap ?? 5)` forward — round-start Step 4 persists them on the new round.
92
+ ### Step 3: Route
152
93
 
153
- > **Permission-gate note**: the "NO AskUserQuestion" in item 3 governs only the auto-loop's decision to spawn `/cbp-round-input` — it does not add a prompt for that hand-off. It is NOT a bypass of the Step 1.6 permission gate: when the spawned round eventually re-enters `/cbp-round-update`, Step 1.6 prompts again (the gate always fires, including inside the auto-loop).
154
-
155
- If BOTH signals are clean, fall through to Step 5 (exit routing).
156
-
157
- ### Step 5: Exit Routing
158
-
159
- **5a: Count files** — Display: `"Files: X total, Y approved, Z pending"`
160
-
161
- **5b: Route with four branches** (Step 4 already handled the dirty-loop case; Step 5 is the clean-exit path).
162
-
163
- **Branch D — IF `round.context.round_type === 'survey'` (checked FIRST):**
164
-
165
- Survey rounds produce no file diff; A/B/C predicates assume `files_changed[]` non-empty. Survey routing is decided by `improve_round_findings[]` instead:
166
-
167
- - `improve_round_findings[]` non-empty → auto-trigger `/cbp-round-input`
168
- - `improve_round_findings[]` empty → checkpoint KIND: auto-trigger `/cbp-task-check`; standalone KIND: auto-trigger `/cbp-standalone-task-check`
169
-
170
- Output: `"## Round [N] Complete — Survey Round"` with duration, files=0, findings count, routing message. Skip Branches A/B/C.
171
-
172
- **Branch A — ELSE IF all files have `user_approved: true`:**
173
-
174
- ```
175
- ## Round [N] Complete - All Files Approved
176
-
177
- **Duration**: [N] minutes
178
- **Files**: [X] total, [X] approved, 0 pending
179
- ```
180
-
181
- Surface AskUserQuestion (clean-exit user-gate):
182
-
183
- - **(a) close & complete round** → checkpoint KIND: auto-trigger `/cbp-task-check`; standalone KIND: auto-trigger `/cbp-standalone-task-check`
184
- - **(b) start new round** → auto-trigger `/cbp-round-input`
185
-
186
- Persist `round.context.auto_loop_exit = { staged_count, unstaged_count, route, decided_at }`.
187
-
188
- **Branch B — ELSE IF unapproved files exist AND every unapproved file has `claude_approved: true` AND `testing_qa_output.totals.hard_fail: false` AND no `improve_round_findings[]`:**
189
-
190
- The clean-but-unstaged staging-gate case. Mode-dependent:
191
-
192
- - **Auto-loop exit** (the completed round had `auto_loop_mode === true` on its context AND Step 4 fell through here with BOTH signals clean): surface the Branch A clean-exit user-gate — do NOT auto-trigger `/cbp-round-input`. Branch B's entry condition already requires `no improve_round_findings[]`, so the auto-loop's verbatim-from-findings procedure has no input to formulate requirements from; auto-triggering round-input would loop on an empty input. Persist `round.context.auto_loop_exit.degenerate_empty_findings: true` (degenerate sub-case: no findings to feed round-input, so surface the clean-exit user-gate above instead of auto-triggering).
193
- - **Manual round**: emit staging-gate prompt and STOP. User stages files (never-git-add rule) and re-invokes; command then falls into Branch A.
194
-
195
- Manual-mode prompt:
196
-
197
- ```
198
- ## Round [N] Complete — Files Pending Staging
199
-
200
- **Files**: [X] total, [Y] approved, [Z] pending
201
-
202
- ### Pending (passed all checks; not yet staged):
203
- - [path]
204
-
205
- Stage them (`git add <path>`) and re-run `/cbp-round-update` to proceed.
206
- Waiting for user to stage files.
207
- ```
94
+ **3a — Clean → `/cbp-round-complete`.** Auto-trigger `/cbp-round-complete`. round-complete is `ask`-tier: the harness shows a permission prompt (the user's confirmation to finalize the round). round-complete then reconciles the user's `git add`s, completes the round, and routes onward (all files staged → task-check; some withheld → round-input). round-update writes nothing here beyond the Step 2 summary. In `auto_loop_mode`, a clean triage is the loop's success exit — the loop continues only via the not-clean path in 3b; round-complete owns the degenerate clean-but-unstaged guard.
208
95
 
209
- **Branch C — ELSE (unapproved AND outstanding findings or `claude_approved: false`):**
96
+ **3b — Not clean → `/cbp-round-input`.** More changes or planning are needed. Routing is **independent of git staging** — round-input is reachable whether or not the user has staged anything (it performs its own deep analysis of the unapproved files). Two sub-cases:
210
97
 
211
- Unreachable in the auto-loop path — Step 4 catches it first. Retained for MANUAL invocation. Output `## Round [N] Complete - [Z] Files Pending` with file counts and list of unapproved paths; auto-trigger `/cbp-round-input`.
98
+ - **Auto-loop** (`round.context.auto_loop_mode === true`): compute `next_index = (round.context.auto_loop_index ?? 0) + 1`.
99
+ - If `next_index > (round.context.auto_loop_cap ?? 5)`: surface the cap-exhausted prompt via AskUserQuestion (a genuine multi-option user decision — keep it). Options: extend cap, stop loop / drop into round-input, close task as-is. Persist `round.context.auto_loop_cap_exhausted = { user_choice, decided_at }` and route per choice.
100
+ - Otherwise: persist `round.context.auto_loop_decision = { spawned_next: true, next_index, decided_at }` on the current round via `update_round` / `update_standalone_round` (audit trail), then auto-trigger `/cbp-round-input` with NO prompt. Pass `auto_loop_mode: true`, `auto_loop_index: next_index`, `auto_loop_cap: (prior cap ?? 5)` forward — round-start Step 4 persists them on the new round.
101
+ - **Manual round**: auto-trigger `/cbp-round-input` directly (no prompt).
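
The Step 3 routing, including the auto-loop cap check, can be sketched as follows. This is a minimal illustration assuming a precomputed `clean` verdict; the `Route` and `LoopContext` types, the `routeRound` name, and the `"cap-exhausted-prompt"` label are hypothetical, while the `auto_loop_*` field names and the default cap of 5 come from the text.

```typescript
// Hypothetical view of the auto-loop fields on round.context.
interface LoopContext {
  auto_loop_mode?: boolean;
  auto_loop_index?: number;
  auto_loop_cap?: number;
}

interface Route {
  next: "cbp-round-complete" | "cbp-round-input" | "cap-exhausted-prompt";
  // Values passed forward to the next round (auto-loop spawn only).
  forward?: LoopContext;
}

function routeRound(clean: boolean, ctx: LoopContext): Route {
  // 3a: a clean triage always goes to the permission-gated finalizer.
  if (clean) {
    return { next: "cbp-round-complete" };
  }
  // 3b, manual round: go straight to round-input, no prompt.
  if (ctx.auto_loop_mode !== true) {
    return { next: "cbp-round-input" };
  }
  // 3b, auto-loop: bump the index and compare it against the cap.
  const cap = ctx.auto_loop_cap ?? 5;
  const nextIndex = (ctx.auto_loop_index ?? 0) + 1;
  if (nextIndex > cap) {
    return { next: "cap-exhausted-prompt" };
  }
  return {
    next: "cbp-round-input",
    forward: { auto_loop_mode: true, auto_loop_index: nextIndex, auto_loop_cap: cap },
  };
}
```

Git staging appears nowhere in the inputs, which reflects the rule that the not-clean route fires regardless of what the user has staged.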
212
102
 
213
103
  ## Key Rules
214
104
 
215
- - **Step 1.6 permission gate fires first** — every invocation (incl. auto-trigger and the Step 4 auto-loop) asks the user to confirm before any write; **Cancel** is a clean abort (no `sync-approvals`, no `complete_round`, no auto-trigger).
216
- - **Step 2 (CLI) must exit 0** — if it fails, STOP. The merge semantics are enforced by the CLI.
217
- - **Step 4 owns the dirty-loop case**; Step 5 owns the clean-exit case. Step 5 Branch C is for manual invocation only.
218
- - **NEVER ask user to git add files** — only reads staging status. **NEVER stage files** — Claude does not touch git staging area.
219
- - Auto-triggered by `/cbp-round-end`, or run manually by user.
220
- - **standalone KIND Step 3**: `caller_worktree_id` is REQUIRED for `complete_standalone_round` — always resolve and pass it.
105
+ - **Autonomous + Claude-only** — round-update never prompts before running. It is auto-triggered by `/cbp-round-end`. The confirmation step is `/cbp-round-complete`'s `ask`-tier permission prompt, not an AskUserQuestion here. (The auto-loop cap-exhausted AskUserQuestion in Step 3b is a genuine user decision, not a run gate.)
106
+ - **Triage, never finalize** — round-update does NOT call `sync-approvals`, `complete_round`, or `complete_standalone_round`, and does NOT write file approvals. All of that is `/cbp-round-complete`.
107
+ - **Never touches git** — round-update reads `claude_approved` from the DB only; it never reads staging, asks the user to `git add`, or stages files.
108
+ - **git-add independence** — the "needs more work" route to `/cbp-round-input` fires regardless of whether files are staged. There is no clean-but-unstaged dead-end.
109
+ - **standalone parity** — KIND detection governs which read/audit tools are used; the clean→`/cbp-round-complete` and not-clean→`/cbp-round-input` routing is identical for both KINDs (round-complete and round-input self-detect KIND).
221
110
 
222
111
  ## Integration
223
112
 
224
- - **Gates**: Step 1.6 permission gate — asks the user to confirm before any side effect; **Cancel** aborts cleanly with no writes. Fires on every invocation incl. the Step 4 auto-loop; sits before and independent of the top-of-file Step 2 hard gate.
225
113
  - **Triggered by**: `/cbp-round-end` (auto), or user manually
226
- - **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND); delegates git+approval sync to `npx codebyplan round sync-approvals`
227
- - **Writes**: MCP `update_round` / `update_standalone_round` (auto_loop_decision / auto_loop_exit / auto_loop_cap_exhausted); `complete_round` / `complete_standalone_round` (per KIND); round+task files_changed written by CLI
228
- - **Triggers**: `/cbp-round-input` (Step 4 dirty-loop spawn; Branch A 'b' choice; Branch B auto-loop-exit; Branch C manual), `/cbp-task-check` (Branch A 'a' choice per checkpoint KIND; Branch D survey-clean per checkpoint KIND), `/cbp-standalone-task-check` (Branch A 'a' choice per standalone KIND; Branch D survey-clean per standalone KIND), staging-gate stop (Branch B manual mode), cap-exhausted prompt routes from Step 4 (any of the three options)
114
+ - **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND); round context (`testing_qa_output`, `improve_round_findings`, `round_type`, `files_changed[].claude_approved`)
115
+ - **Writes**: MCP `update_round` / `update_standalone_round` audit only (`auto_loop_decision` / `auto_loop_cap_exhausted`). No completion, no file-approval writes.
116
+ - **Triggers**: `/cbp-round-complete` (clean triage; the `ask`-tier permission prompt is the user confirmation), `/cbp-round-input` (not-clean triage: outstanding findings, hard-fail, or unapproved Claude checks; fires independently of git staging; also the auto-loop dirty spawn), cap-exhausted prompt routes from Step 3b (any of the three options)
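The clean / not-clean routing described in the added Key Rules and Triggers lines can be pictured with a small sketch. This is an illustration under stated assumptions, not the package's implementation: `RoundContext`, `FileChange`, `triageRound`, and the shapes of `improve_round_findings` and `hard_fail` are hypothetical. Only the field names `claude_approved` and `improve_round_findings` and the two target skills come from the text.

```typescript
// Hypothetical shapes: only the field names claude_approved and
// improve_round_findings appear in the text; everything else is assumed.
interface FileChange {
  path: string;
  claude_approved: boolean;
}

interface RoundContext {
  files_changed: FileChange[];
  improve_round_findings: string[]; // assumed: a list of outstanding findings
  hard_fail: boolean; // assumed: a flag derived from the round's QA output
}

type Route = "/cbp-round-complete" | "/cbp-round-input";

// Clean triage routes to round-complete; anything else routes to round-input.
// Git staging is never consulted, matching the "git-add independence" rule.
function triageRound(round: RoundContext): Route {
  const allApproved = round.files_changed.every((f) => f.claude_approved);
  const clean =
    allApproved &&
    round.improve_round_findings.length === 0 &&
    !round.hard_fail;
  return clean ? "/cbp-round-complete" : "/cbp-round-input";
}
```

Under these assumptions, a single unapproved file or one outstanding finding is enough to send the round back to `/cbp-round-input`, whether or not anything is staged.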
@@ -17,7 +17,7 @@ If the `cbp-task-check` agent spawn fails for any reason, follow the canonical i
17
17
 
18
18
  ## When Used
19
19
 
20
- - After all rounds complete and all files approved (auto-triggered by `/cbp-round-update`)
20
+ - After all rounds complete and all files approved (auto-triggered by `/cbp-round-complete`)
21
21
  - Before `/cbp-standalone-task-testing`
22
22
  - Never skippable
23
23
 
@@ -149,4 +149,4 @@ Suggest: Approve files, then re-run `/cbp-standalone-task-check {task}`. Stop
149
149
  - **Reads**: MCP `get_current_standalone_task`, `get_standalone_tasks`, `get_standalone_rounds`, all changed files (via agent)
150
150
  - **Writes**: MCP `update_standalone_task` (context.check_verdict)
151
151
  - **Triggers**: emits directive `Next: /clear, then /cbp-standalone-task-testing {task}` on READY + satisfied
152
- - **Triggered by**: `/cbp-round-update` (auto, when all files approved)
152
+ - **Triggered by**: `/cbp-round-complete` (auto, when all files approved)
@@ -10,6 +10,8 @@ effort: xhigh
10
10
 
11
11
  Complete a standalone task. Auto-triggered by `/cbp-standalone-task-testing` when all tests pass. Can also be run manually.
12
12
 
13
+ This skill is gated by an `ask`-tier `Skill(cbp-standalone-task-complete)` permission rule in `settings.json` (shipped templates). **The permission prompt IS the user confirmation** — there is NO completion-confirmation AskUserQuestion inside this skill (the Step 7.5 `caller_worktree_id` guard is a separate environmental safety prompt, not a flow-control confirmation). A declined permission is a clean no-op (nothing committed, merged, pushed, or completed).
14
+
13
15
  ## Instructions
14
16
 
15
17
  ### Step 1: Parse `$ARGUMENTS`
@@ -84,9 +86,7 @@ Load `task.qa` and `task.files_changed`:
84
86
  1. **QA**: count items by status (pass / fail / pending / skipped). If any item has status `fail` or `pending`, warn the user.
85
87
  2. **Files**: list any file with `user_approved === false` and warn.
86
88
 
87
- **If issues exist**, AskUserQuestion: `Complete anyway` / `Run QA first` / `Cancel`. On `Run QA first` or `Cancel`, stop. On `Complete anyway`, continue.
88
-
89
- **If no issues**, AskUserQuestion to confirm: `Ready to complete standalone TASK-[N]: [title] — [N] rounds, [N] files. Proceed?`
89
+ If any QA item is `fail`/`pending` or any file is unapproved, **surface the warnings in the output and continue** — record them for the Step 9 summary. There is NO confirmation AskUserQuestion here: `Skill(cbp-standalone-task-complete)` is `ask`-tier, so the harness permission prompt that gated this skill IS the user's confirmation to complete. The hard gates in Steps 2–2.6 (all rounds completed, ≥1 round has `testing_qa_output`, `check_verdict` READY, `task_testing_output.all_passed`) already block completion when prerequisites are unmet; these QA / file-approval items are warnings, not blockers.
90
90
 
91
91
  ### Step 4: Aggregate Files Changed
92
92
 
@@ -169,6 +169,7 @@ Apply the `cleanup` skill inline to remove orphan references to deleted/modified
169
169
  **Files**: [N] changed
170
170
  **Commit**: [hash]
171
171
  **Branch merged**: [feat-branch] → {PRODUCTION}
172
+ **Warnings**: [any QA / file-approval warnings from Step 3, or "none"]
172
173
  ```
173
174
 
174
175
  #### Route (single directive — never a menu)
@@ -9,7 +9,7 @@ effort: xhigh
9
9
 
10
10
  # Standalone Task Testing Command
11
11
 
12
- Comprehensive task-level testing for standalone tasks — runs all automated tests and walks the user through manual testing one-by-one. Tests the entire delivered feature holistically after all rounds are complete. Runs inline — no sub-agent.
12
+ Comprehensive task-level testing for standalone tasks — the **cross-round double-check** run once after all rounds complete. Per-round QA (per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, `pnpm audit`) is owned by each round's `testing-qa-agent`; this skill does NOT re-run it. It tests the entire delivered feature holistically across the full task diff — catching cross-package and cross-round problems no single round can see. Runs inline — no sub-agent.
13
13
 
14
14
  ## When Used
15
15
 
@@ -19,7 +19,7 @@ Comprehensive task-level testing for standalone tasks — runs all automated tes
19
19
 
20
20
  ## Scope vs Round-Level Validation
21
21
 
22
- Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5. This skill adds the cross-cutting layer visible only across the full task diff: full-repo lint, workspace tsc, full test suite, `pnpm audit`, and full-diff security scan.
22
+ Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
23
23
 
24
24
  ## Instructions
25
25
 
@@ -93,9 +93,9 @@ Capture stdout and stderr for each check.
93
93
  | Full-repo lint | `pnpm -w lint` | Always |
94
94
  | Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
95
95
  | Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
96
- | Full-repo audit | `pnpm audit` | Always |
97
96
  | Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
98
- | Full-diff security scan | inline grep or `security-agent` | Always |
97
+
98
+ These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here.
99
99
 
100
100
  **Soft tests** (report, don't block):
101
101
 
@@ -9,7 +9,7 @@ effort: high
9
9
 
10
10
  # Task Check Command
11
11
 
12
- AI-driven production readiness review. Spawns the `cbp-task-check` agent for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting.
12
+ AI-driven production readiness review. Spawns the `cbp-task-check` agent for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting. It is the **cross-round double-check**: rounds already own per-round QA (debug scan, security grep, audit, per-app build/lint/types), so this layer focuses on holistic concerns visible only across the full task diff — requirements traceability, checkpoint alignment, shippability, holistic code review, and scope drift — never re-running per-round checks.
13
13
 
14
14
  ## Inline-Fallback for Spawn Failure
15
15
 
@@ -27,7 +27,7 @@ Inline-fallback is NOT a quality downgrade trapdoor — every Phase from the age
27
27
 
28
28
  ## When Used
29
29
 
30
- - After all rounds complete and all files approved (auto-triggered by `/cbp-round-update`)
30
+ - After all rounds complete and all files approved (auto-triggered by `/cbp-round-complete`)
31
31
  - Before `/cbp-task-testing`
32
32
  - `/cbp-task-check` is NEVER skippable
33
33
 
@@ -163,4 +163,4 @@ Suggest: Approve files, then re-run `/cbp-task-check`. **STOP HERE** — wait fo
163
163
  - **Reads**: MCP `get_current_task`, `get_rounds`, all changed files (via agent)
164
164
  - **Writes**: MCP `update_task` (context.check_verdict)
165
165
  - **Triggers**: emits directive `Next: /clear, then /cbp-task-testing {chk-task}` on READY + satisfied (cross-context — testing is heavyweight, fresh context helps)
166
- - **Triggered by**: `/cbp-round-update` (auto, when all files approved)
166
+ - **Triggered by**: `/cbp-round-complete` (auto, when all files approved)
@@ -10,6 +10,8 @@ effort: xhigh
10
10
 
11
11
  Complete the current task. Auto-triggered by `/cbp-task-testing` when all tests pass. Can also be run manually.
12
12
 
13
+ This skill is gated by an `ask`-tier `Skill(cbp-task-complete)` permission rule in `settings.json`. **The permission prompt IS the user confirmation** — there is NO AskUserQuestion inside this skill. A declined permission is a clean no-op (nothing committed, merged, pushed, or completed).
14
+
13
15
  ## Instructions
14
16
 
15
17
  ### Step 1: Parse `$ARGUMENTS`
@@ -90,12 +92,10 @@ Stop here.
90
92
 
91
93
  Load `task.qa` and `task.files_changed`:
92
94
 
93
- 1. **QA**: count items by status (pass / fail / pending / skipped) across all types. If any item has status `fail` or `pending` (including default checklists), warn the user.
94
- 2. **Files**: list any file with `user_approved === false` and warn.
95
-
96
- **If issues exist**, AskUserQuestion: `Complete anyway` / `Run QA first` (suggest `/cbp-task-check`) / `Cancel`. On `Run QA first` or `Cancel`, stop. On `Complete anyway`, continue.
95
+ 1. **QA**: count items by status (pass / fail / pending / skipped) across all types.
96
+ 2. **Files**: list any file with `user_approved === false`.
97
97
 
98
- **If no issues**, AskUserQuestion to confirm: `Ready to complete TASK-[N]: [title], [N] rounds, [N] files. Proceed?`
98
+ If any QA item is `fail`/`pending` or any file is unapproved, **surface the warnings in the output and continue** — record them for the Step 9 summary. There is NO confirmation AskUserQuestion here: `Skill(cbp-task-complete)` is `ask`-tier, so the harness permission prompt that gated this skill IS the user's confirmation to complete. The hard gates in Steps 2–2.6 (all rounds completed, ≥1 round has `testing_qa_output`, `check_verdict` READY, `task_testing_output.all_passed`) already block completion when prerequisites are unmet; these QA / file-approval items are warnings, not blockers.
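The "warn and continue" behaviour in the added paragraph can be sketched as a pure function that collects warnings without blocking. This is a minimal illustration only: `QaItem`, `ChangedFile`, `collectWarnings`, `formatWarnings`, and the `name` field are hypothetical. The statuses, the `user_approved` field, and the `**Warnings**` summary label are taken from the text.

```typescript
type QaStatus = "pass" | "fail" | "pending" | "skipped";

// Hypothetical shapes; `name` is an assumed field used only for display.
interface QaItem {
  name: string;
  status: QaStatus;
}

interface ChangedFile {
  path: string;
  user_approved: boolean;
}

// Collects warnings for the Step 9 summary. It never throws or stops the
// flow: these items are warnings, not blockers.
function collectWarnings(qa: QaItem[], files: ChangedFile[]): string[] {
  const warnings: string[] = [];
  for (const item of qa) {
    if (item.status === "fail" || item.status === "pending") {
      warnings.push(`QA ${item.status}: ${item.name}`);
    }
  }
  for (const file of files) {
    if (file.user_approved === false) {
      warnings.push(`Unapproved file: ${file.path}`);
    }
  }
  return warnings;
}

// Renders the summary line, falling back to "none" when nothing was recorded.
function formatWarnings(warnings: string[]): string {
  return `**Warnings**: ${warnings.length > 0 ? warnings.join("; ") : "none"}`;
}
```

Keeping the collection step free of side effects means the hard gates in Steps 2–2.6 stay the only place where completion can be blocked.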
99
99
 
100
100
  ### Step 4: Aggregate Files Changed
101
101
 
@@ -142,7 +142,7 @@ Call `complete_task(task_id)`. The server resolves the caller's worktree identit
142
142
 
143
143
  Apply the `cleanup` skill inline to remove orphan references to deleted/modified files. Then apply `migration` to propagate renames/moves to consumers. Both run without sub-agent spawns. Skip cleanup if no deletions/modifications; skip migration if cleanup handled everything.
144
144
 
145
- ### Step 9: Show Result and Route (User-Confirmed)
145
+ ### Step 9: Show Result and Route
146
146
 
147
147
  Show the completion summary:
148
148
 
@@ -153,6 +153,7 @@ Show the completion summary:
153
153
  **Rounds**: [N] completed
154
154
  **Files**: [N] changed
155
155
  **Commit**: [hash]
156
+ **Warnings**: [any QA / file-approval warnings from Step 3, or "none"]
156
157
  ```
157
158
 
158
159
  Then route. Same-context transitions (next task in this checkpoint) auto-trigger via the Skill tool. Cross-context transitions (checkpoint done → /cbp-checkpoint-check, session end) surface as a single directive 'Next: /clear, then /cbp-X' for the user to invoke after refreshing context.
@@ -9,7 +9,7 @@ effort: xhigh
9
9
 
10
10
  # Task Testing Command
11
11
 
12
- Comprehensive task-level testing — runs all automated tests and walks the user through manual testing one-by-one. Distinct from round-level testing (`testing-qa-agent`): this tests the **entire delivered feature holistically** after all rounds are complete. Runs inline — no sub-agent.
12
+ Comprehensive task-level testing — the **cross-round double-check** run once after all rounds complete. Per-round QA (per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, `pnpm audit`) is owned by each round's `testing-qa-agent`; this skill does NOT re-run it. Instead it tests the **entire delivered feature holistically** across the full task diff — catching cross-package and cross-round problems no single round can see. Runs inline — no sub-agent.
13
13
 
14
14
  ## When Used
15
15
 
@@ -19,7 +19,7 @@ Comprehensive task-level testing — runs all automated tests and walks the user
19
19
 
20
20
  ## Scope vs Round-Level Validation
21
21
 
22
- Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5. This skill adds the cross-cutting layer that is only visible across the full task diff: full-repo lint, workspace tsc, full test suite, `pnpm audit`, and full-diff security scan each run once here, not per-round.
22
+ Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5), the autonomous sim screenshot loop (Step 6.x), and the user manual walkthrough (Step 8).
23
23
 
24
24
  ## Instructions
25
25
 
@@ -109,11 +109,9 @@ Capture stdout and stderr for each check.
109
109
  | Full-repo lint | `pnpm -w lint` | Always |
110
110
  | Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
111
111
  | Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
112
- | Full-repo audit | `pnpm audit` | Always |
113
112
  | Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
114
- | Full-diff security scan | inline grep or `security-agent` | Always |
115
113
 
116
- Per-file lint + format are enforced by `lint-format-on-edit.sh` hook per edit. This step catches cross-package issues invisible to per-wave checks.
114
+ These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here. Per-file lint + format are enforced by `lint-format-on-edit.sh` per edit. This step catches cross-package issues invisible to per-wave checks.
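The "when to run" column of the hard-test table above can be read as a small selection function. The sketch below is illustrative only: `AggregatedFile`, its `isSource` and `uiPackage` fields, and `selectHardChecks` are hypothetical, and how a file is classified as source or UI is assumed to be decided elsewhere. The command strings are the ones listed in the table.

```typescript
// Hypothetical shape: classification of each file is assumed to be done
// upstream; the text does not say how source or UI files are detected.
interface AggregatedFile {
  path: string;
  isSource: boolean;
  uiPackage?: string; // assumed: owning package name when the file is a UI file
}

// Returns the workspace-wide commands to run for a task's aggregated files.
function selectHardChecks(files: AggregatedFile[]): string[] {
  const commands: string[] = ["pnpm -w lint"]; // always runs

  if (files.some((f) => f.isSource)) {
    commands.push("pnpm exec tsc --noEmit");
    commands.push("pnpm test --run");
  }

  // One E2E run per distinct package that has UI files in the diff.
  const seen: string[] = [];
  for (const file of files) {
    if (file.uiPackage !== undefined && seen.indexOf(file.uiPackage) === -1) {
      seen.push(file.uiPackage);
      commands.push(`pnpm --filter ${file.uiPackage} e2e:test`);
    }
  }
  return commands;
}
```

Note that `pnpm audit` and the security scan are deliberately absent from the returned list, since the text assigns them to the per-round `testing-qa-agent`.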
117
115
 
118
116
  **Soft tests** (report, don't block):
119
117
 
@@ -133,7 +133,7 @@ Once the gates pass, load the context the head command needs. This ensures `/cle
133
133
  | `/cbp-checkpoint-start` | Load checkpoint via MCP `get_checkpoints` + `get_tasks(checkpoint_id)`. Display checkpoint title, status, claim state, first pending task |
134
134
  | `/cbp-task-start [N]` | Load via MCP `get_current_task`. Display checkpoint title + task title/requirements summary |
135
135
  | `/cbp-round-start` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + round count + last round summary |
136
- | `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed approval summary |
136
+ | `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
137
137
  | `/cbp-round-input` | **Full context load** (see Step 2b) |
138
138
  | `/cbp-task-check` | Load via MCP `get_current_task`. Display checkpoint + task + files summary |
139
139
  | `/cbp-task-testing` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + testing status summary |