codebyplan 1.13.28 → 1.13.29
- package/dist/cli.js +1 -1
- package/package.json +1 -1
- package/templates/agents/cbp-improve-round.md +1 -1
- package/templates/agents/cbp-task-check.md +12 -8
- package/templates/hooks/cbp-mcp-round-sync.sh +9 -0
- package/templates/settings.project.base.json +3 -2
- package/templates/skills/cbp-build-cc-mode/SKILL.md +4 -3
- package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md +3 -2
- package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md +1 -1
- package/templates/skills/cbp-merge-main/SKILL.md +1 -1
- package/templates/skills/cbp-round-complete/SKILL.md +164 -0
- package/templates/skills/cbp-round-end/SKILL.md +16 -14
- package/templates/skills/cbp-round-end/reference/findings-presentation.md +7 -17
- package/templates/skills/cbp-round-execute/SKILL.md +4 -0
- package/templates/skills/cbp-round-input/SKILL.md +6 -6
- package/templates/skills/cbp-round-start/SKILL.md +12 -15
- package/templates/skills/cbp-round-update/SKILL.md +31 -143
- package/templates/skills/cbp-standalone-task-check/SKILL.md +2 -2
- package/templates/skills/cbp-standalone-task-complete/SKILL.md +4 -3
- package/templates/skills/cbp-standalone-task-testing/SKILL.md +4 -4
- package/templates/skills/cbp-task-check/SKILL.md +3 -3
- package/templates/skills/cbp-task-complete/SKILL.md +7 -6
- package/templates/skills/cbp-task-testing/SKILL.md +3 -5
- package/templates/skills/cbp-todo/SKILL.md +1 -1
package/dist/cli.js
CHANGED
package/package.json
CHANGED
|
@@ -279,6 +279,6 @@ Return findings sorted by severity (critical first). If no findings, return `sta
|
|
|
279
279
|
## Integration
|
|
280
280
|
|
|
281
281
|
- **Spawned by**: `/cbp-round-end` (Step 6)
|
|
282
|
-
- **Returns to**: `/cbp-round-end` which
|
|
282
|
+
- **Returns to**: `/cbp-round-end` which auto-applies in-scope findings inline and routes out-of-scope findings to `/cbp-round-update`
|
|
283
283
|
- **Does NOT**: Apply any changes
|
|
284
284
|
- **Reads**: Changed files, task requirements, round context
|
|
@@ -9,7 +9,7 @@ effort: xhigh
|
|
|
9
9
|
|
|
10
10
|
# Task Check Agent
|
|
11
11
|
|
|
12
|
-
AI-driven production readiness review with user satisfaction discussion.
|
|
12
|
+
AI-driven production readiness review with user satisfaction discussion. This is the **cross-round double-check** layer: per-round QA (build/lint/types per app, the `console.log`/debug scan, the OWASP/secret grep, API auth-enforcement curls, `pnpm audit`) already ran inside each round's `testing-qa-agent` — this agent does NOT re-run it. Its unique value is holistic: verifying all task requirements are met, checkpoint goals are aligned, the aggregated work is shippable, and — for tasks that span many rounds where scope can shift as new ideas/problems surface — detecting scope drift that should update the checkpoint or task rather than re-running per-round checks.
|
|
13
13
|
|
|
14
14
|
**Numeric-claim verification (Proposal P6)**: when round summaries assert numeric facts (file counts, package counts, percentage changes, line counts, version numbers), verify each via direct count: `find ... | wc -l`, `grep -c`, `wc -l <file>`. Do NOT accept narrative numbers without a verification command. Mismatches between asserted and actual counts indicate documentation drift; flag as a finding requiring a fix.
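As an illustration of the direct-count rule above, the following is a minimal sketch of checking a claimed line count against an actual count. The helper names (`count_newlines`, `count_lines`, `check_claim`) are hypothetical and not part of codebyplan; the agent text itself only names the shell commands `find ... | wc -l`, `grep -c`, and `wc -l <file>`.

```python
from pathlib import Path
from typing import Optional


def count_newlines(data: bytes) -> int:
    """Count newline bytes, which is the number `wc -l` reports."""
    return data.count(b"\n")


def count_lines(path: str) -> int:
    """Hypothetical helper: line count of a file, equivalent to `wc -l <file>`."""
    return count_newlines(Path(path).read_bytes())


def check_claim(label: str, claimed: int, actual: int) -> Optional[str]:
    """Return a finding message when the narrative number disagrees with the count."""
    if claimed == actual:
        return None
    return f"{label}: summary claims {claimed}, direct count is {actual}"
```

A non-`None` return value corresponds to the documentation-drift finding the paragraph describes.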
|
|
15
15
|
|
|
@@ -95,14 +95,16 @@ Check `task.files_changed`:
|
|
|
95
95
|
- List unapproved files
|
|
96
96
|
- Determine if unapproved files block completion
|
|
97
97
|
|
|
98
|
-
### Phase 6: Code Review
|
|
98
|
+
### Phase 6: Code Review (holistic spot-check)
|
|
99
99
|
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
- No
|
|
103
|
-
- No
|
|
104
|
-
- Error handling present where needed
|
|
105
|
-
- Consistent with existing codebase patterns
|
|
100
|
+
Per-round QA already ran the line-level checks — the `console.log`/debug scan (round `testing-qa-agent` Phase 3.5), the OWASP secret/injection grep (Phase 5), the API auth-enforcement curl (Phase 3.55), and `pnpm audit` (Phase 3.7). Do NOT re-run them here. Phase 6 is a light holistic spot-check across the aggregated diff for what a single round cannot see:
|
|
101
|
+
|
|
102
|
+
- No obvious bugs or regressions that emerge only when all rounds' changes are read together
|
|
103
|
+
- No cross-round integration gaps (a field/contract introduced in one round that a later round broke)
|
|
104
|
+
- Error handling present where needed at the feature boundary
|
|
105
|
+
- Consistent with existing codebase patterns across the full task diff
|
|
106
|
+
|
|
107
|
+
If the aggregated diff surfaces an obvious issue per-round QA missed, flag it as a finding — but the per-round scans are authoritative for line-level concerns.
|
|
106
108
|
|
|
107
109
|
### Phase 7: Shippable Feature Gate
|
|
108
110
|
|
|
@@ -125,6 +127,8 @@ Update `round_outcome_analysis` with findings.
|
|
|
125
127
|
|
|
126
128
|
### Phase 9: User Satisfaction Discussion
|
|
127
129
|
|
|
130
|
+
For tasks that ran many rounds, scope drift accumulates quietly — each round may have absorbed a new idea or problem without the checkpoint/task requirements being updated. The satisfaction discussion is where that drift surfaces; treat the scope-divergence scan below as a first-class output, not an afterthought.
|
|
131
|
+
|
|
128
132
|
Present findings to user via AskUserQuestion:
|
|
129
133
|
|
|
130
134
|
```
|
|
@@ -5,6 +5,15 @@
|
|
|
5
5
|
# staging-status flip, and web-UI flag sync to the codebyplan CLI.
|
|
6
6
|
# Replaces the inline jq merge + curl PATCH with a single CLI call.
|
|
7
7
|
#
|
|
8
|
+
# Trigger context (CHK-197): complete_round is now called by /cbp-round-complete
|
|
9
|
+
# (the permission-gated finalizer), which already runs `sync-approvals` once
|
|
10
|
+
# BEFORE completing the round. This hook firing afterward is the expected
|
|
11
|
+
# post-complete safety net — it catches approval drift between that pre-complete
|
|
12
|
+
# sync and the approval_locked write; it is NOT a duplicate run to remove.
|
|
13
|
+
# NOTE: this hook matches complete_round only; complete_standalone_round is NOT
|
|
14
|
+
# covered, so standalone rounds rely solely on /cbp-round-complete's pre-complete
|
|
15
|
+
# sync (pre-existing coverage gap, documented here intentionally — not fixed).
|
|
16
|
+
#
|
|
8
17
|
# Delegates to: npx codebyplan round sync-approvals
|
|
9
18
|
# - Git-diff drift merge (in/not-in DB vs git)
|
|
10
19
|
# - Staging-status → user_approved flip
|
|
@@ -55,7 +55,8 @@
|
|
|
55
55
|
"Skill(cbp-checkpoint-create)",
|
|
56
56
|
"Skill(cbp-checkpoint-check)",
|
|
57
57
|
"Skill(cbp-checkpoint-complete)",
|
|
58
|
-
"Skill(cbp-round-
|
|
58
|
+
"Skill(cbp-round-complete)",
|
|
59
|
+
"Skill(cbp-round-execute)",
|
|
59
60
|
"Skill(cbp-session-end)",
|
|
60
61
|
"Skill(cbp-task-complete)",
|
|
61
62
|
"Skill(cbp-standalone-task-create)",
|
|
@@ -114,9 +115,9 @@
|
|
|
114
115
|
"Skill(cbp-refresh-infra)",
|
|
115
116
|
"Skill(cbp-round-check)",
|
|
116
117
|
"Skill(cbp-round-end)",
|
|
117
|
-
"Skill(cbp-round-execute)",
|
|
118
118
|
"Skill(cbp-round-input)",
|
|
119
119
|
"Skill(cbp-round-start)",
|
|
120
|
+
"Skill(cbp-round-update)",
|
|
120
121
|
"Skill(cbp-session-start)",
|
|
121
122
|
"Skill(cbp-setup-e2e)",
|
|
122
123
|
"Skill(cbp-setup-eslint)",
|
|
@@ -31,19 +31,20 @@ Fifteen of the 16 authoring agents take the default (`cbp-cc-executor`, `cbp-dat
|
|
|
31
31
|
|
|
32
32
|
| skill | model | effort | reason |
|
|
33
33
|
| ----------------- | ------ | ------ | ----------------------------------------------------------------------------------------------------------------- |
|
|
34
|
-
| cbp-round-end | sonnet | high | Spawns cbp-improve-round agent;
|
|
34
|
+
| cbp-round-end | sonnet | high | Spawns cbp-improve-round agent; auto-applies in-scope findings + routes out-of-scope to round-update — lighter than xhigh suffices |
|
|
35
35
|
| cbp-task-check | sonnet | high | Thin orchestrator over spawned cbp-task-check agent; inline-fallback path keeps Opus for safety |
|
|
36
36
|
| cbp-checkpoint-update | sonnet | high | Status updates + context patches mostly; lighter than xhigh |
|
|
37
37
|
| cbp-ship-main | sonnet | high | Production-impacting PR creation; keep Opus reasoning but drop effort |
|
|
38
38
|
| cbp-merge-main | sonnet | high | Long-lived-branch integration merge — surgical conflict resolution, no authoring |
|
|
39
39
|
|
|
40
|
-
### Haiku-low skills (
|
|
40
|
+
### Haiku-low skills (10)
|
|
41
41
|
|
|
42
42
|
`model: haiku` + `effort: low`. Pure mechanical / dispatch / templated work.
|
|
43
43
|
|
|
44
44
|
| skill | model | effort | reason |
|
|
45
45
|
| ---------------------- | ----- | ------ | ---------------------------------------------------------------------------------------- |
|
|
46
|
-
| cbp-round-update | haiku | low | Pure mechanical:
|
|
46
|
+
| cbp-round-update | haiku | low | Pure mechanical: triage round state (claude_approved/findings/hard_fail), route to round-complete or round-input |
|
|
47
|
+
| cbp-round-complete | haiku | low | Pure mechanical: sync-approvals git-add reconcile, complete round, route per unapproved count |
|
|
47
48
|
| cbp-round-check | haiku | low | Run build/lint/types commands, parse output, update QA |
|
|
48
49
|
| cbp-todo | haiku | low | Dispatch: single MCP call + route to next command |
|
|
49
50
|
| cbp-checkpoint-complete | haiku | low | Pure finalization — mark completed, write summary; judgment happened in checkpoint-check |
|
|
@@ -22,7 +22,7 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
|
|
|
22
22
|
|
|
23
23
|
### `allow` — the autonomous workflow surface
|
|
24
24
|
|
|
25
|
-
- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-
|
|
25
|
+
- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-check`/`-end`/`-input`/`-start`/`-update` — `cbp-round-update` is autonomous triage that only reads round state and routes to `cbp-round-complete` or `cbp-round-input`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-check`/`-create`/`-start`/`-testing`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
|
|
26
26
|
- **All `mcp__codebyplan__*` reads** (`get_*`, `list_*`, `search_*`, `health_check`, `lookup_symbol`, `resolve_library_id`, `get_chunk`).
|
|
27
27
|
- **Routine workflow-write MCP tools** the pipeline calls many times per task: create/update/complete checkpoint, task, and round; session log + session-state writes; `create_worktree`, `add_library`, `flag_stale_chunk`, `update_server_config`, `update_eslint_repo_config`, `update_task_template`. Gating these with `ask` would make the autonomous workflow unusable.
|
|
28
28
|
- **Read/safe CLI commands** (both `codebyplan X` and `npx codebyplan X`): `whoami`, `resolve-worktree`, `statusline`, `ports`, `tech-stack`, `eslint`, `round`, `help`, `--version`.
|
|
@@ -30,7 +30,8 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
|
|
|
30
30
|
### `ask` — the deliberate confirm-gate
|
|
31
31
|
|
|
32
32
|
- **Production-shipment skills**: `cbp-ship`, `cbp-ship-main`, `cbp-checkpoint-end` — these promote/deploy to production, so they prompt even in an otherwise auto-allowed setup.
|
|
33
|
-
- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-
|
|
33
|
+
- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-task-complete`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt replaces the in-skill confirmation that used to live in `cbp-round-update` — which is now an autonomous, `allow`-tier triage step.
|
|
34
|
+
- **Plan-approval gate**: `cbp-round-execute` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-start` runs its planning Q&A, then hands off to `cbp-round-execute`; the permission prompt is the user's go/no-go on the plan.
|
|
34
35
|
- **Destructive / admin MCP tools**: `delete_session_log`, `delete_worktree`, `create_repo`, `release_assignment`. (The launch and member-admin tools were dropped from the MCP surface in CHK-180 — those concerns are web-app only now.)
|
|
35
36
|
- **Mutating / external / clobber-risk CLI commands** (both prefixes): `setup`, `login`, `logout`, `upgrade-auth`, `config` (can overwrite committed `.codebyplan/` files), `branch` (rewrites branch config), `ship`, `claude` (`install`/`update`/`uninstall` overwrite `.claude/`).
|
|
36
37
|
|
|
@@ -88,7 +88,7 @@ A skill should do one thing in the pipeline. If a skill both plans AND executes,
|
|
|
88
88
|
If the skill is part of a chain, show it:
|
|
89
89
|
|
|
90
90
|
```
|
|
91
|
-
/cbp-round-start (planning) →
|
|
91
|
+
/cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
|
|
92
92
|
```
|
|
93
93
|
|
|
94
94
|
### Approval Gates
|
|
@@ -42,7 +42,7 @@ Triggered by `/cbp-task-start` (Step 3.6, optional stale-check), `/cbp-task-comp
|
|
|
42
42
|
- **Cancel** — abort the skill.
|
|
43
43
|
- `unstaged_dirty=false AND staged_present=true` → print one informational line and proceed to Step 1:
|
|
44
44
|
`Staged changes detected — proceeding with merge.`
|
|
45
|
-
(Pre-staged files will be included in the merge commit at Step 2 — this is intentional; the caller already approved them via /cbp-round-
|
|
45
|
+
(Pre-staged files will be included in the merge commit at Step 2 — this is intentional; the caller already approved them via /cbp-round-complete.)
|
|
46
46
|
- `unstaged_dirty=false AND staged_present=false` → proceed silently to Step 1.
|
|
47
47
|
- Either `git diff` command exits with code ≥ 2 (git hard error — not-a-repo, detached HEAD with no commits, index lock, corrupt object store): surface the raw error output and STOP. Do NOT proceed to Step 1.
|
|
48
48
|
|
|
@@ -0,0 +1,164 @@
|
|
|
1
|
+
---
|
|
2
|
+
scope: org-shared
|
|
3
|
+
name: cbp-round-complete
|
|
4
|
+
description: Reconcile user git-add approvals, complete the round, and route to the next step
|
|
5
|
+
argument-hint: [chk-task-round | task-round]
|
|
6
|
+
triggers: [cbp-task-check, cbp-standalone-task-check, cbp-round-input]
|
|
7
|
+
effort: low
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## Kind Detection
|
|
11
|
+
|
|
12
|
+
Inspect the resolved identifier from argument parsing to determine the task kind:
|
|
13
|
+
|
|
14
|
+
| Identifier shape | KIND |
|
|
15
|
+
|-----------------|------|
|
|
16
|
+
| `{task}-{round}` (2-segment, e.g. `45-2`) | `standalone` |
|
|
17
|
+
| `{chk}-{task}-{round}` (3-segment, e.g. `141-3-1`) | `checkpoint` |
|
|
18
|
+
| _(empty / free-text)_ | Check `get_current_standalone_task` first; if found → `standalone`. Else → `checkpoint` via `get_current_task`. |
|
|
19
|
+
|
|
20
|
+
Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
|
|
21
|
+
|
|
22
|
+
| Operation | `checkpoint` KIND | `standalone` KIND |
|
|
23
|
+
|-----------|------------------|-------------------|
|
|
24
|
+
| Get task | `get_current_task(repo_id)` | `get_current_standalone_task(repo_id)` |
|
|
25
|
+
| Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
|
|
26
|
+
| Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
|
|
27
|
+
| Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
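One way to read the table above is as a lookup keyed by KIND. The sketch below is illustrative only: the tool names are taken from the table, while the dictionary layout, the operation keys, and the `tool_for` helper are assumptions made for this example.

```python
# Tool names come from the table above; the structure is a hypothetical illustration.
TOOLS_BY_KIND = {
    "checkpoint": {
        "get_task": "get_current_task",
        "get_rounds": "get_rounds",
        "update_round": "update_round",
        "complete_round": "complete_round",
    },
    "standalone": {
        "get_task": "get_current_standalone_task",
        "get_rounds": "get_standalone_rounds",
        "update_round": "update_standalone_round",
        # Per the table, this call also requires caller_worktree_id.
        "complete_round": "complete_standalone_round",
    },
}


def tool_for(kind: str, operation: str) -> str:
    """Return the MCP tool name for an operation under the given KIND."""
    return TOOLS_BY_KIND[kind][operation]
```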
|
|
28
|
+
|
|
29
|
+
# Round Complete Command
|
|
30
|
+
|
|
31
|
+
The **permission-gated finalizer** for a round that `/cbp-round-update` triaged as clean. It reconciles which files the **user** approved via `git add`, completes the round, and routes to the next step.
|
|
32
|
+
|
|
33
|
+
This skill is gated by an `ask`-tier `Skill(cbp-round-complete)` permission rule in `settings.json`. **The permission prompt IS the user confirmation** — there is NO AskUserQuestion inside this skill. If the user declines the permission, the skill does not run: nothing is synced, no round is completed, and the user can stage files and re-invoke (directly or by re-running `/cbp-round-update`) when ready.
|
|
34
|
+
|
|
35
|
+
## HARD GATE — Every Step Must Execute
|
|
36
|
+
|
|
37
|
+
Step 2 (sync-approvals CLI) MUST exit 0. If it fails, do NOT proceed to Step 3. Before completing the round, verify:
|
|
38
|
+
|
|
39
|
+
- [ ] `codebyplan round sync-approvals` exited 0
|
|
40
|
+
|
|
41
|
+
If this is false: DO NOT proceed to Step 3.
|
|
42
|
+
|
|
43
|
+
## Instructions
|
|
44
|
+
|
|
45
|
+
### Step 1: Parse `$ARGUMENTS`
|
|
46
|
+
|
|
47
|
+
Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
|
|
48
|
+
|
|
49
|
+
| Shape | Regex | Resolves to |
|
|
50
|
+
|-------|-------|-------------|
|
|
51
|
+
| `{chk}-{task}-{round}` (e.g. `108-1-2`) | `^[0-9]+-[0-9]+-[0-9]+$` | Checkpoint-bound: CHK-{chk} TASK-{task} ROUND-{round} |
|
|
52
|
+
| `{task}-{round}` (e.g. `45-2`) | `^[0-9]+-[0-9]+$` | Standalone: standalone TASK-{task} ROUND-{round} |
|
|
53
|
+
| _(empty)_ | — | Use Kind Detection to find active task and latest round |
|
|
54
|
+
|
|
55
|
+
Anything else is malformed — surface this error and stop:
|
|
56
|
+
|
|
57
|
+
```
|
|
58
|
+
round-complete: invalid argument `{value}`. Expected:
|
|
59
|
+
108-1-2 → CHK-108 TASK-1 ROUND-2 (checkpoint-bound)
|
|
60
|
+
45-2 → standalone TASK-45 ROUND-2
|
|
61
|
+
(empty) → active task and latest round
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
Note that `108-1` is **valid** here — it resolves to standalone TASK-108 ROUND-1 per the 2-segment task-round form. To target a checkpoint-bound round, use the 3-segment form `108-1-2`.
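The parsing rules in Step 1 can be sketched as follows. This is a minimal illustration, assuming the two regexes from the table are applied to the trimmed argument. The `RoundRef` type and `parse_round_argument` function are hypothetical names, not part of the skill.

```python
import re
from typing import NamedTuple, Optional


class RoundRef(NamedTuple):
    """Hypothetical parse result; field names are illustrative."""
    kind: str                      # "checkpoint" or "standalone"
    chk: Optional[int]             # checkpoint number, checkpoint KIND only
    task: int
    round_number: int


# Regexes as given in the Step 1 table.
CHECKPOINT_RE = re.compile(r"^[0-9]+-[0-9]+-[0-9]+$")
STANDALONE_RE = re.compile(r"^[0-9]+-[0-9]+$")


def parse_round_argument(value: str) -> Optional[RoundRef]:
    """Return a RoundRef, or None for an empty argument (use Kind Detection)."""
    value = value.strip()
    if not value:
        return None
    if CHECKPOINT_RE.match(value):
        chk, task, rnd = (int(part) for part in value.split("-"))
        return RoundRef("checkpoint", chk, task, rnd)
    if STANDALONE_RE.match(value):
        task, rnd = (int(part) for part in value.split("-"))
        return RoundRef("standalone", None, task, rnd)
    # Anything else is malformed: surface the error and stop.
    raise ValueError(f"round-complete: invalid argument `{value}`")
```

Note how `108-1` falls through to the 2-segment branch and resolves as standalone, matching the remark above.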
|
|
65
|
+
|
|
66
|
+
### Step 1.5: Get Current Task and Round
|
|
67
|
+
|
|
68
|
+
Given the parse from Step 1:
|
|
69
|
+
|
|
70
|
+
| Parse | Resolution path |
|
|
71
|
+
|-------|-----------------|
|
|
72
|
+
| `{chk}-{task}-{round}` | MCP `get_checkpoints(repo_id)` → filter `number === {chk}`. MCP `get_tasks(checkpoint_id)` → filter `number === {task}`. MCP `get_rounds(task_id)` → filter `number === {round}`. |
|
|
73
|
+
| `{task}-{round}` | MCP `get_standalone_rounds` via `get_current_standalone_task` or direct task lookup → filter `number === {round}`. |
|
|
74
|
+
| _(empty)_ | Use Kind Detection: checkpoint KIND → MCP `get_current_task(repo_id)` + `get_rounds(task_id)`; standalone KIND → MCP `get_current_standalone_task(repo_id)` + `get_standalone_rounds(standalone_task_id)`. |
|
|
75
|
+
|
|
76
|
+
If no task found: `No active task. Nothing to complete.`
|
|
77
|
+
|
|
78
|
+
### Step 2: Sync git diff + approvals via CLI
|
|
79
|
+
|
|
80
|
+
Reconcile which files the user has approved by staging them. Run:
|
|
81
|
+
|
|
82
|
+
```
|
|
83
|
+
npx codebyplan round sync-approvals --round-id <round_id> --task-id <task_id>
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
The CLI auto-resolves the caller worktree id with the following precedence:
|
|
87
|
+
1. `--caller-worktree-id <uuid>` override (if passed — skips all resolution)
|
|
88
|
+
2. Per-device branch-keyed cache (`.codebyplan/worktree.local.json`)
|
|
89
|
+
3. In-process tuple API call: `POST /worktrees/resolve` using `(device_id, repo_path, branch)`
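The precedence list above can be sketched as a simple fall-through. This is not the CLI's implementation: the function name and the two callables standing in for the cache read and the `POST /worktrees/resolve` call are hypothetical.

```python
from typing import Callable, Optional


def resolve_worktree_id(
    override: Optional[str],
    read_cache: Callable[[], Optional[str]],
    resolve_remote: Callable[[], Optional[str]],
) -> Optional[str]:
    """Resolve the caller worktree id using the documented precedence."""
    # 1. An explicit --caller-worktree-id override skips all resolution.
    if override:
        return override
    # 2. Per-device branch-keyed cache (stand-in for worktree.local.json).
    cached = read_cache()
    if cached:
        return cached
    # 3. Remote resolution from (device_id, repo_path, branch); may return None.
    return resolve_remote()
```

A `None` result models the unresolved case, which a caller would have to treat as a failure on the write path.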
|
|
90
|
+
|
|
91
|
+
On the write path (non `--dry-run`), if the worktree id cannot be resolved the CLI **hard-fails with exit 1** and prints an actionable message. To pre-populate the cache:
|
|
92
|
+
|
|
93
|
+
```
|
|
94
|
+
npx codebyplan resolve-worktree --cache
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
If this worktree is not yet registered, run `npx codebyplan setup` first, then re-run `/cbp-round-complete`.
|
|
98
|
+
|
|
99
|
+
The CLI parses `git status --short`, merges drift + staging + web-UI flag, and writes both round and task (forwarding `caller_worktree_id` on both writes so the server honors the feat-worktree lock). A **cleanly staged** file (`git add`-ed, no further unstaged changes) becomes `user_approved: true`.
|
|
100
|
+
|
|
101
|
+
Read the stdout JSON: `{ added, stale_marked, reactivated, total_files }`.
|
|
102
|
+
|
|
103
|
+
If the command exits non-zero, surface the stderr and STOP. Do NOT proceed to Step 3.
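The exit-code gate and the stdout summary described above might be handled along these lines. This is a sketch under assumptions: the function name is hypothetical, the process is assumed to have been run already, and the value types of the four summary keys are not stated in the skill, so they are passed through untouched.

```python
import json

# Keys named in the skill text for the sync-approvals stdout JSON.
EXPECTED_KEYS = ("added", "stale_marked", "reactivated", "total_files")


def check_sync_result(exit_code: int, stdout: str, stderr: str) -> dict:
    """Return the parsed summary, or raise so the caller stops before Step 3."""
    if exit_code != 0:
        # Surface stderr and stop; the round must not be completed.
        raise RuntimeError(f"sync-approvals failed (exit {exit_code}): {stderr.strip()}")
    summary = json.loads(stdout)
    missing = [key for key in EXPECTED_KEYS if key not in summary]
    if missing:
        raise RuntimeError(f"sync-approvals output is missing keys: {missing}")
    return summary
```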
|
|
104
|
+
|
|
105
|
+
This is the **single** explicit reconcile owned by this skill. (The `cbp-mcp-round-sync.sh` PostToolUse hook fires again right after Step 3's `complete_round` — see the note below — but that is the existing post-complete safety net, not a duplicate run to schedule here.)
|
|
106
|
+
|
|
107
|
+
### Step 3: Complete the Round
|
|
108
|
+
|
|
109
|
+
Calculate duration from the round's `started_at` to now in minutes.
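A minimal sketch of the duration calculation, assuming `started_at` is an ISO 8601 timestamp with an explicit UTC offset and that partial minutes are rounded down. Neither detail is specified by the skill, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def duration_minutes(started_at: str, now: Optional[datetime] = None) -> int:
    """Whole minutes elapsed since started_at (assumed ISO 8601 with offset)."""
    started = datetime.fromisoformat(started_at)
    current = now or datetime.now(timezone.utc)
    elapsed_seconds = (current - started).total_seconds()
    # Assumption: floor to whole minutes and never report a negative duration.
    return max(0, int(elapsed_seconds // 60))
```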
|
|
110
|
+
|
|
111
|
+
- **checkpoint KIND**: MCP `complete_round(round_id, duration_minutes)`.
|
|
112
|
+
- **standalone KIND**: MCP `complete_standalone_round(standalone_round_id, duration_minutes, caller_worktree_id)`. ⚠️ `caller_worktree_id` is REQUIRED — resolve via `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`. If `CALLER_WT` is empty, surface this warning and ask the user to confirm before proceeding:
|
|
113
|
+
|
|
114
|
+
```
|
|
115
|
+
Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
|
|
116
|
+
The complete_standalone_round call may be rejected by the pre-guard. Proceed anyway? (yes / no)
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
If the user confirms yes, proceed with `caller_worktree_id: ""`. If no, stop.
|
|
120
|
+
|
|
121
|
+
`complete_round` / `complete_standalone_round` sets the round `completed`, locks all `file_changes` for the round (`approval_locked: true`), and returns `unapproved_files[]` + `unapproved_count`. Hold those for routing.
|
|
122
|
+
|
|
123
|
+
> **PostToolUse hook note**: completing the round fires the `cbp-mcp-round-sync.sh` PostToolUse hook (matcher `mcp__codebyplan__complete_round`), which runs `sync-approvals` once more as a post-complete safety net for any approval drift between Step 2 and the lock. This is **expected** and is not double-processing — Step 2 is the pre-complete reconcile that makes `unapproved_count` accurate for routing; the hook is the existing catch-up. Note the hook matches `complete_round` only — `complete_standalone_round` is **not** covered by it (a pre-existing gap), so standalone rounds rely solely on this skill's Step 2 reconcile.
|
|
124
|
+
|
|
125
|
+
### Step 4: Route
|
|
126
|
+
|
|
127
|
+
**4a — Count files** — Display: `"Round N complete — Files: X total, Y approved, Z pending"`.
|
|
128
|
+
|
|
129
|
+
**4b — Route on `unapproved_count`** (from Step 3's `complete_round` response):
|
|
130
|
+
|
|
131
|
+
- **`unapproved_count === 0`** (every file user-approved): the user has signed off on the whole round.
|
|
132
|
+
- checkpoint KIND → auto-trigger `/cbp-task-check`.
|
|
133
|
+
- standalone KIND → auto-trigger `/cbp-standalone-task-check`.
|
|
134
|
+
- **`unapproved_count > 0`** (user withheld approval on some files): the unstaged files are the signal that more work is wanted on them. Auto-trigger `/cbp-round-input` — its Step 2 deep analysis reads exactly those `user_approved === false` files and formulates the next round's requirements. This route is **independent of how many files are staged**; round-input is reachable even when zero files were staged.
|
|
135
|
+
|
|
136
|
+
- **Degenerate auto-loop guard**: if the just-completed round had `round.context.auto_loop_mode === true` AND it was a clean exit (no `improve_round_findings[]`, no hard-fail — which is why `/cbp-round-update` triaged it to round-complete in the first place), do NOT auto-trigger `/cbp-round-input`. Its auto-loop path transcribes the prior round's findings verbatim, and a clean round has none — auto-triggering would spin on an empty input. Instead surface the clean-exit note below and STOP; the user stages the pending files and re-invokes (or runs `/cbp-round-input` manually). Persist `round.context.round_complete.degenerate_auto_loop_exit = true`.
|
|
137
|
+
|
|
138
|
+
```
|
|
139
|
+
## Round N Complete — Auto-loop finished clean
|
|
140
|
+
|
|
141
|
+
**Files**: X total, Y approved, Z pending
|
|
142
|
+
|
|
143
|
+
Pending files passed all checks; they are just not staged. Stage them
|
|
144
|
+
(`git add <path>`) to finish the task, or run /cbp-round-input to start
|
|
145
|
+
another round.
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
Persist a breadcrumb on the round via `update_round` / `update_standalone_round` per KIND: `round.context.round_complete = { staged_count, unstaged_count, route, decided_at }`.
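The Step 4b routing, including the degenerate auto-loop guard, can be summarized in a small decision function. This is an illustrative sketch: the function name, its parameters, and the `"stop"` sentinel are hypothetical, while the skill names and conditions come from the step above.

```python
def route_after_complete(
    kind: str,
    unapproved_count: int,
    auto_loop_mode: bool = False,
    has_findings: bool = False,
    hard_fail: bool = False,
) -> str:
    """Return the next skill to trigger, or "stop" for the clean auto-loop exit."""
    if unapproved_count == 0:
        # Every file is user-approved: move on to the task-level check.
        if kind == "checkpoint":
            return "/cbp-task-check"
        return "/cbp-standalone-task-check"
    # Some files are unapproved. Guard against looping on an empty input first.
    if auto_loop_mode and not has_findings and not hard_fail:
        return "stop"
    return "/cbp-round-input"
```

The guard is checked only on the `unapproved_count > 0` branch, which mirrors how it is nested under that bullet in the step.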
|
|
149
|
+
|
|
150
|
+
## Key Rules
|
|
151
|
+
|
|
152
|
+
- **Permission prompt = confirmation** — gated by `ask`-tier `Skill(cbp-round-complete)`. NEVER add an AskUserQuestion to confirm running; the harness prompt is the gate. A declined permission is a clean no-op.
|
|
153
|
+
- **Step 2 (CLI) must exit 0** — if it fails, STOP before `complete_round`. The merge semantics are enforced by the CLI.
|
|
154
|
+
- **NEVER ask the user to git add files** — Step 2 only reads staging status. **NEVER stage files** — Claude does not touch the git staging area; the user's `git add` is the approval signal.
|
|
155
|
+
- **standalone KIND Step 3**: `caller_worktree_id` is REQUIRED for `complete_standalone_round` — always resolve and pass it.
|
|
156
|
+
- **Auto-triggered by `/cbp-round-update`** (clean triage), or run manually by the user.
|
|
157
|
+
|
|
158
|
+
## Integration
|
|
159
|
+
|
|
160
|
+
- **Gates**: `ask`-tier `Skill(cbp-round-complete)` permission prompt — the harness confirms before the skill runs; a decline makes NO writes. There is no in-skill AskUserQuestion.
|
|
161
|
+
- **Triggered by**: `/cbp-round-update` (auto, clean triage), or user manually
|
|
162
|
+
- **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND); delegates git+approval sync to `npx codebyplan round sync-approvals`
|
|
163
|
+
- **Writes**: MCP `complete_round` / `complete_standalone_round` (per KIND); `update_round` / `update_standalone_round` (`round_complete` breadcrumb); round+task `files_changed` written by the CLI
|
|
164
|
+
- **Triggers**: `/cbp-task-check` (checkpoint KIND, all files approved), `/cbp-standalone-task-check` (standalone KIND, all files approved), `/cbp-round-input` (some files unapproved — fires independent of staging count)
|
|
@@ -16,7 +16,7 @@ See `reference/inline-fallback.md` for full trigger table, procedure, and covera
|
|
|
16
16
|
## Pipeline
|
|
17
17
|
|
|
18
18
|
```
|
|
19
|
-
/cbp-round-execute → /cbp-round-end → [code review +
|
|
19
|
+
/cbp-round-execute → /cbp-round-end → [code review + auto-apply in-scope] → /cbp-round-update
|
|
20
20
|
```
|
|
21
21
|
|
|
22
22
|
## Identifier Notation
|
|
@@ -126,9 +126,13 @@ Wait for agent to complete. If the spawn fails for any reason, apply the inline-
|
|
|
126
126
|
|
|
127
127
|
**If `status: 'no_findings'`:** show `### Code Review\nNo issues found. Code looks good.` and skip to Step 8.
|
|
128
128
|
|
|
129
|
-
**If findings exist**, present them grouped by severity (table + per-finding details)
|
|
129
|
+
**If findings exist**, present them grouped by severity (table + per-finding details).
|
|
130
130
|
|
|
131
|
-
|
|
131
|
+
**Under `auto_loop_mode === true`**: do NOT auto-apply here — Step 8's auto-loop path accepts all findings into `improve_round_findings[]` and defers the fixes to the next loop round. Skip straight to Step 8.
|
|
132
|
+
|
|
133
|
+
**Manual mode**: **auto-apply all in-scope findings inline**. A finding is *in-scope* when every file it references is within the round's `files_changed[]`. The round-end orchestrator (main context — it has Edit/Write) applies these fixes directly; the `cbp-improve-round` agent stays read-only/advisory and never writes. Record each applied fix in `round.context.inline_fix_log` (findings indices, rationale, `fixes[]`, applied_at). After applying, re-run the verification scoped to the modified files (hook syntax check for `.sh`; `cbp-testing-qa-agent` for code) per `reference/findings-presentation.md`; if it fails, do NOT record the fix — treat the finding as out-of-scope instead. Findings that reference files OUTSIDE `files_changed[]` are **out-of-scope** — do NOT apply them; save them to `improve_round_findings[]` so Step 8 routes them to `/cbp-round-input` or a new task. There is no findings-decision AskUserQuestion — the round was already approved at the `/cbp-round-execute` permission prompt. The baseline-regression gate above is the ONLY user decision in this step.
|
|
134
|
+
|
|
135
|
+
Example tables and the in-scope/out-of-scope classification: see `reference/findings-presentation.md`.
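The in-scope rule from Step 7 can be sketched as a partition over the findings. The shapes here are assumptions: each finding is modeled as a dict with a `files` list of paths, and `files_changed` as a list of path strings. Treating a finding that references no files as out-of-scope is a conservative choice made for this example, not something the skill states.

```python
from typing import Dict, List, Tuple


def classify_findings(
    findings: List[Dict], files_changed: List[str]
) -> Tuple[List[Dict], List[Dict]]:
    """Split findings into (in_scope, out_of_scope) by referenced files."""
    changed = set(files_changed)
    in_scope: List[Dict] = []
    out_of_scope: List[Dict] = []
    for finding in findings:
        referenced = finding.get("files", [])
        # In-scope only when every referenced file is in files_changed.
        if referenced and all(path in changed for path in referenced):
            in_scope.append(finding)
        else:
            out_of_scope.append(finding)
    return in_scope, out_of_scope
```

Under this sketch, the first list is what the orchestrator would auto-apply in manual mode, and the second is what would be saved to `improve_round_findings[]`.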
|
|
132
136
|
|
|
133
137
|
### Step 8: Route Based on Decisions
|
|
134
138
|
|
|
@@ -136,33 +140,31 @@ Example tables and the `inline` option gating spec: see `reference/findings-pres
|
|
|
136
140
|
|
|
137
141
|
- Auto-accept ALL findings into `improve_round_findings[]` regardless of severity (the user opted into the loop).
|
|
138
142
|
- Skip the polish-spiral stop-gate (auto-loop has its own cap-exhausted termination).
|
|
139
|
-
- Skip
|
|
143
|
+
- Skip Step 7's inline auto-apply (findings are deferred to the next loop round, not applied this round).
|
|
140
144
|
- Save findings via `update_round` exactly as in manual mode.
|
|
141
|
-
- Auto-trigger `/cbp-round-update` immediately. round-update
|
|
145
|
+
- Auto-trigger `/cbp-round-update` immediately. round-update triages the round and either routes to `/cbp-round-input` (spawn another round) or `/cbp-round-complete` (clean exit) — see cbp-round-update SKILL.md Step 2/3.
|
|
142
146
|
|
|
143
147
|
**Else (manual mode — flag absent or false):**
|
|
144
148
|
|
|
145
|
-
|
|
149
|
+
Step 7 already auto-applied in-scope findings and logged them to `round.context.inline_fix_log`. Now record any out-of-scope findings and route:
|
|
146
150
|
|
|
147
|
-
1.
|
|
148
|
-
2.
|
|
149
|
-
3. Save accepted/rejected findings to round context via MCP `update_round`:
|
|
151
|
+
1. **Polish-spiral stop-gate** (round 2+ only): if this is round 2 or later AND the prior round also ended with code-review fixes, surface a one-line stop-gate via AskUserQuestion — *defer remaining polish to a follow-up task* vs *continue with another round*. This is a genuine user decision about scope (it guards against endless low-value polish loops), not a flow-control prompt. Skip on round 1.
|
|
152
|
+
2. Save out-of-scope findings (those NOT auto-applied in Step 7) to round context via MCP `update_round`:
|
|
150
153
|
```json
|
|
151
154
|
{
|
|
152
155
|
"context": {
|
|
153
|
-
"improve_round_findings": [
|
|
154
|
-
"improve_round_rejected": [rejected findings with user reasons]
|
|
156
|
+
"improve_round_findings": [out-of-scope findings]
|
|
155
157
|
}
|
|
156
158
|
}
|
|
157
159
|
```
|
|
158
|
-
|
|
160
|
+
3. Auto-trigger `/cbp-round-update`. round-update triages the round: if out-of-scope findings (or a hard-fail) remain it routes to `/cbp-round-input` (which picks up the findings from round context and includes them in the new round's requirements automatically); if the round is clean it routes to `/cbp-round-complete` (the permission-gated finalizer that reconciles the user's `git add`s and completes the round).
|
|
159
161
|
|
|
160
162
|
## Key Rules
|
|
161
163
|
|
|
162
|
-
- Claude NEVER git adds files — user
|
|
164
|
+
- Claude NEVER git adds files — user approval is via git staging at `/cbp-round-complete`
|
|
163
165
|
- Auto-triggers `/cbp-round-update` after findings are handled
|
|
164
166
|
- `/cbp-round-end` is auto-triggered by `/cbp-round-execute` (user does not call it directly)
|
|
165
|
-
-
|
|
167
|
+
- In-scope findings are **auto-applied inline** by the round-end orchestrator (the round was already approved at the `/cbp-round-execute` permission); out-of-scope findings route to `/cbp-round-input`. `cbp-improve-round` stays read-only/advisory. Baseline-regression accept (Step 7 gate) stays a user decision — baselines are NEVER auto-accepted.
|
|
166
168
|
|
|
167
169
|
## Integration
|
|
168
170
|
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Findings Presentation in `/cbp-round-end` Step 7
|
|
2
2
|
|
|
3
|
-
When `improve-round` returns findings, Step 7 presents them grouped by severity
|
|
3
|
+
When `improve-round` returns findings, Step 7 presents them grouped by severity, then **auto-applies in-scope findings inline** (manual mode) or defers all findings to the next loop round (auto-loop mode). There is no findings-decision prompt.
|
|
4
4
|
|
|
5
5
|
## Example output
|
|
6
6
|
|
|
@@ -22,26 +22,16 @@ When `improve-round` returns findings, Step 7 presents them grouped by severity
|
|
|
22
22
|
[description + suggested fix from agent]
|
|
23
23
|
```
|
|
24
24
|
|
|
25
|
-
##
|
|
25
|
+
## Auto-apply model (manual mode)
|
|
26
26
|
|
|
27
|
-
|
|
28
|
-
Which findings should be fixed?
|
|
29
|
-
- "all" — fix all findings in a new round
|
|
30
|
-
- "1,2" — fix specific findings by number
|
|
31
|
-
- "none" — skip all, proceed to round-update
|
|
32
|
-
- "inline" — fix in THIS round before proceeding (only offered when all findings qualify under the Trivial-Resolution Exception below)
|
|
33
|
-
- Or explain why specific findings are not issues
|
|
34
|
-
```
|
|
35
|
-
|
|
36
|
-
## "inline" option gating
|
|
27
|
+
Step 7 auto-applies all **in-scope** findings inline — no user prompt. A finding is *in-scope* when every file it references is within the round's `files_changed[]`; it is *out-of-scope* otherwise.
|
|
37
28
|
|
|
38
|
-
|
|
29
|
+
- **In-scope** → the round-end orchestrator (main context, has Edit/Write) applies the fix directly via `Edit` / `Write`, re-runs the verification commands (hook syntax check + `cbp-testing-qa-agent` scoped to modified files), and records it in `round.context.inline_fix_log = { findings: [ids], rationale, fixes: [...], applied_at: <ISO> }`. The `cbp-improve-round` agent stays read-only/advisory and never writes.
|
|
30
|
+
- **Out-of-scope** → saved to `round.context.improve_round_findings[]`; Step 8 routes them to `/cbp-round-input` (next round) or a new task per the Infra Issue Absorption Contract below.
|
|
39
31
|
|
|
40
|
-
|
|
41
|
-
2. Each fix is under ~5 minutes of executor time
|
|
42
|
-
3. Verification is automatic — the existing test/lint/audit pipeline confirms the change
|
|
32
|
+
The only user decision in Step 7 is the **baseline-regression accept** gate (baselines are NEVER auto-accepted). Under `auto_loop_mode`, Step 7 does not auto-apply — all findings are accepted into `improve_round_findings[]` and deferred to the next loop round.
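The classification rule above can be sketched as a small function. This is a minimal illustration, not the package's implementation: the function name, the return keys, and the assumption that each finding carries a `files` list are hypothetical. A finding that references no files is vacuously in-scope under the literal rule; the source does not address that case.

```python
def plan_step7(findings, files_changed, auto_loop_mode=False):
    """Split findings into those applied inline and those deferred."""
    if auto_loop_mode:
        # Auto-loop: nothing is applied this round, everything is deferred.
        return {"apply_inline": [], "improve_round_findings": list(findings)}

    changed = set(files_changed)
    apply_inline, deferred = [], []
    for finding in findings:
        # In-scope only when EVERY referenced file is in files_changed[].
        if all(path in changed for path in finding["files"]):
            apply_inline.append(finding)
        else:
            deferred.append(finding)
    return {"apply_inline": apply_inline, "improve_round_findings": deferred}
```

A finding that touches one changed file and one untouched file lands in the deferred list, because the rule requires every referenced file to be in the round's diff.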
|
|
43
33
|
|
|
44
|
-
|
|
34
|
+
The **Trivial-Resolution Exception** below still governs the deeper bypass cases (skipping executor / testing-qa / improve-round for ≤5-line non-logic corrective rounds); it is referenced by `/cbp-round-execute` and `/cbp-task-testing` for infra-issue absorption.
|
|
45
35
|
|
|
46
36
|
---
|
|
47
37
|
|
|
@@ -15,6 +15,10 @@ Execution and validation phase. Receives the approved plan from `/cbp-round-star
|
|
|
15
15
|
/cbp-round-start → /cbp-round-execute → /cbp-round-end (auto)
|
|
16
16
|
```
|
|
17
17
|
|
|
18
|
+
## Approval Model
|
|
19
|
+
|
|
20
|
+
The `ask`-tier `Skill(cbp-round-execute)` permission prompt (configured in `settings.json`) is the **plan-approval gate** handed off from `/cbp-round-start`: confirming the permission approves the plan; declining it returns control to `/cbp-round-start` (re-plan with feedback) or `/cbp-round-input` (wrong direction). Once execution begins, the executors (`cbp-round-executor`, `cbp-mechanical-edits`) and the 3-INLINE / 3-SURVEY paths apply edits **automatically** — there is NO in-skill AskUserQuestion for approval. The only downstream user decisions are genuine ones: the dev-server start prompt (Step 4) and the baseline-regression accept gate (`/cbp-round-end` Step 7).
|
|
21
|
+
|
|
18
22
|
## Identifier Notation
|
|
19
23
|
|
|
20
24
|
This skill operates on the **active** task/round resolved via MCP `get_current_task` / `get_rounds` and does not accept a positional identifier argument. Canonical chk-task-round notation is defined in `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary".
|
|
@@ -24,7 +24,6 @@ Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
|
|
|
24
24
|
| Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
|
|
25
25
|
| Add round | `add_round(task_id, ...)` | `add_standalone_round(standalone_task_id, ...)` |
|
|
26
26
|
| Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
|
|
27
|
-
| Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
|
|
28
27
|
| Update task | `update_task(task_id, ...)` | `update_standalone_task(standalone_task_id, ...)` |
|
|
29
28
|
|
|
30
29
|
# Round Input Command
|
|
@@ -33,7 +32,7 @@ Gathers input for a new round. Performs deep analysis of unapproved files, requi
|
|
|
33
32
|
|
|
34
33
|
## When Used
|
|
35
34
|
|
|
36
|
-
- After `/cbp-round-update` routes here (unapproved
|
|
35
|
+
- After `/cbp-round-update` triages a round as not-clean and routes here, or `/cbp-round-complete` routes here (files left unapproved after completing the round)
|
|
37
36
|
- After `/cbp-round-execute` Step 6 routes here (structural failure or retry-exhausted hard-fail)
|
|
38
37
|
- After `/clear` + `/cbp-todo` reloads context and triggers this
|
|
39
38
|
- When user wants to start a new round with specific changes
|
|
@@ -78,8 +77,9 @@ If the argument matches the numeric regex, resolve the target task/round from DB
|
|
|
78
77
|
**2f:** Extract testing-qa failures from latest round context (`context.testing_qa_output`)
|
|
79
78
|
|
|
80
79
|
**2g:** Extract code review findings from latest round context (`context.improve_round_findings`).
|
|
81
|
-
These are
|
|
82
|
-
that
|
|
80
|
+
These are out-of-scope findings from the `improve-round` agent — bugs, logic errors, edge cases
|
|
81
|
+
that round-end could not auto-apply inline (they reference files outside the prior round's
|
|
82
|
+
`files_changed[]`). Include them as high-priority requirements.
|
|
83
83
|
|
|
84
84
|
**2h:** Identify root causes — not "file X is wrong" but "requirement Y was not met because Z"
|
|
85
85
|
|
|
@@ -175,12 +175,12 @@ If this command is triggered **directly** (not via `/cbp-todo`) and no context i
|
|
|
175
175
|
- **Deep analysis is MANDATORY** — always runs, even if arguments provided (for context)
|
|
176
176
|
- **Analysis reads from DB (MCP)**, not conversation history
|
|
177
177
|
- **Follow-up rounds get same depth as round 1** — no quick-fix behavior
|
|
178
|
-
- **Never ask to git add** — file approval is
|
|
178
|
+
- **Never ask to git add** — user file approval (git staging) is reconciled by `/cbp-round-complete`
|
|
179
179
|
- **Update all context locations** — task, checkpoint, and round should all have consistent information
|
|
180
180
|
|
|
181
181
|
## Integration
|
|
182
182
|
|
|
183
|
-
- **Triggered by**: `/cbp-round-update` (auto, unapproved
|
|
183
|
+
- **Triggered by**: `/cbp-round-update` (auto, not-clean triage), `/cbp-round-complete` (auto, files left unapproved after completing the round), `/cbp-round-execute` (auto, on hard-fail after retry exhausted), `/cbp-todo` (after /clear), user manually
|
|
184
184
|
- **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND), file contents (Read tool)
|
|
185
185
|
- **Writes**: MCP `update_task` / `update_standalone_task` (context), `update_checkpoint` (context, if checkpoint KIND and needed)
|
|
186
186
|
- **Triggers**: `/cbp-round-start` (auto)
|
|
@@ -30,7 +30,7 @@ Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
|
|
|
30
30
|
|
|
31
31
|
# Round Start Command
|
|
32
32
|
|
|
33
|
-
Planning phase for a new round. Analyzes context, creates plan,
|
|
33
|
+
Planning phase for a new round. Analyzes context, creates a plan, then auto-triggers `/cbp-round-execute` — the `ask`-tier permission prompt on that skill IS the user's plan approval. NO execution or testing — those are separate commands.
|
|
34
34
|
|
|
35
35
|
## Inline-Fallback for Planner Spawn Failure
|
|
36
36
|
|
|
@@ -42,17 +42,17 @@ Procedure summary (pointer back to canonical):
|
|
|
42
42
|
2. Walk the planner's documented Phase 0-8 checklist inline using `Read` / `Grep` / `Bash` / MCP `get_*` — `agents/cbp-task-planner.md` is the inline script. Phase 1.5 (Requirement Premise Verification) and Phase 4.7 (Migration Shape-Distribution Pre-Flight) are MANDATORY in fallback mode — these are the gates the agent uniquely enforces; skipping them produces unverified plans.
|
|
43
43
|
3. Populate the planner's output contract (`approved_plan` shape: `files_to_modify[]`, `deliverables`, `specialist_needs`, `round_type`, `shape_distribution` if applicable, `context_summary`) with `mode: 'inline_fallback'`.
|
|
44
44
|
4. Apply the pre-emptive-skip rule: when the same failure class fired in the previous spawn of this session, skip the spawn attempt entirely and go straight to inline.
|
|
45
|
-
5. Continue the skill — do NOT abort.
|
|
45
|
+
5. Continue the skill — do NOT abort. Step 9 auto-triggers `/cbp-round-execute`; the `ask`-tier permission prompt on that skill is the user's plan approval (see Step 8).
|
|
46
46
|
|
|
47
47
|
Inline-fallback is NOT a quality downgrade trapdoor — Phase 1.5 row-by-row verification is mandatory. A fallback plan that skipped premise verification is a regression caught by the next session's cbp-improve-round.
|
|
48
48
|
|
|
49
49
|
## Pipeline
|
|
50
50
|
|
|
51
51
|
```
|
|
52
|
-
/cbp-round-start (planning) →
|
|
52
|
+
/cbp-round-start (planning) → /cbp-round-execute (ask-tier permission = plan approval)
|
|
53
53
|
```
|
|
54
54
|
|
|
55
|
-
**Auto-loop mode**: when `round.context.auto_loop_mode === true` flows in from `/cbp-round-input`, Step 6 (Q&A) and Step 8
|
|
55
|
+
**Auto-loop mode**: when `round.context.auto_loop_mode === true` flows in from `/cbp-round-input`, Step 6 (Q&A) is skipped and Step 8's `/cbp-round-execute` permission is auto-approved. See cbp-round-update SKILL.md Step 3b (auto-loop decision) and cbp-round-end SKILL.md Step 8 for the full contract.
|
|
56
56
|
|
|
57
57
|
## Instructions
|
|
58
58
|
|
|
@@ -176,7 +176,7 @@ input:
|
|
|
176
176
|
|
|
177
177
|
Wait for planner output.
|
|
178
178
|
|
|
179
|
-
### Step 8:
|
|
179
|
+
### Step 8: Present Plan
|
|
180
180
|
|
|
181
181
|
Present the plan to user:
|
|
182
182
|
|
|
@@ -208,24 +208,21 @@ Present the plan to user:
|
|
|
208
208
|
|
|
209
209
|
Single-wave plans present the existing flat plan view (no wave table) — backward compatible.
|
|
210
210
|
|
|
211
|
-
**
|
|
211
|
+
**Plan approval is the `ask`-tier `Skill(cbp-round-execute)` permission prompt** — there is NO approve/needs-changes/wrong AskUserQuestion here. After presenting the plan, proceed to Step 9, which auto-triggers `/cbp-round-execute`; the harness then shows the `ask`-tier permission prompt, and confirming it IS the user's go-ahead on the plan.
|
|
212
212
|
|
|
213
|
-
**
|
|
213
|
+
**Denied-execute handling** — if the user declines the `/cbp-round-execute` permission, the plan does not run. Treat the decline as "the plan must change":
|
|
214
214
|
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
3. **No — totally wrong** — discard plan, return to `/cbp-round-input` for new requirements
|
|
215
|
+
- **Minor changes**: collect the user's feedback, re-spawn `cbp-task-planner` with it as a constraint (re-run Step 7), present the revised plan, and re-trigger `/cbp-round-execute`.
|
|
216
|
+
- **Wrong direction**: save the rejection reason to round context and auto-trigger `/cbp-round-input` for new requirements.
|
|
218
217
|
|
|
219
|
-
**If "
|
|
220
|
-
**If "Needs changes"**: collect user feedback, re-spawn `cbp-task-planner` with feedback as constraint, present revised plan, ask again.
|
|
221
|
-
**If "Totally wrong"**: save rejection reason to round context, auto-trigger `/cbp-round-input`.
|
|
218
|
+
**If `auto_loop_mode === true`**: the loop auto-approves — log `round.context.plan_approval = { mode: "auto_loop", auto_approved_at: <ISO> }`, surface a one-line note `"Auto-approved under auto_loop_mode (round N of cap C)"`, and proceed to Step 9 (the `/cbp-round-execute` permission is auto-approved under the loop).
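The auto-approval bookkeeping above might look like the following sketch. The function name and the `now` parameter (added so the timestamp can be injected) are assumptions for illustration; the `plan_approval` shape and the note text follow the paragraph above.

```python
from datetime import datetime, timezone


def auto_approve_plan(context, round_number, cap, now=None):
    """Record auto-approval on the round context and return the one-line note.

    Returns None when the round is not in auto-loop mode, in which case the
    permission prompt on /cbp-round-execute remains the approval gate.
    """
    if context.get("auto_loop_mode") is not True:
        return None
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    context["plan_approval"] = {"mode": "auto_loop", "auto_approved_at": stamp}
    return f"Auto-approved under auto_loop_mode (round {round_number} of cap {cap})"
```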
|
|
222
219
|
|
|
223
220
|
### Step 9: Auto-trigger Round Execute
|
|
224
221
|
|
|
225
|
-
|
|
222
|
+
Save planner output to round context via MCP `update_round` / `update_standalone_round` per KIND, then trigger `/cbp-round-execute`. The `ask`-tier permission prompt on `/cbp-round-execute` is the user's plan approval (see Step 8).
|
|
226
223
|
|
|
227
224
|
```
|
|
228
|
-
|
|
225
|
+
Starting execution phase...
|
|
229
226
|
```
|
|
230
227
|
|
|
231
228
|
## Key Rules
|
|
@@ -1,9 +1,9 @@
|
|
|
1
1
|
---
|
|
2
2
|
scope: org-shared
|
|
3
3
|
name: cbp-round-update
|
|
4
|
-
description:
|
|
4
|
+
description: Triage a finished round (Claude-only) and route to round-complete or round-input
|
|
5
5
|
argument-hint: [chk-task-round | task-round]
|
|
6
|
-
triggers: [cbp-round-
|
|
6
|
+
triggers: [cbp-round-complete, cbp-round-input]
|
|
7
7
|
effort: low
|
|
8
8
|
---
|
|
9
9
|
|
|
@@ -17,28 +17,24 @@ Inspect the resolved identifier from argument parsing to determine the task kind
|
|
|
17
17
|
| `{chk}-{task}-{round}` (3-segment, e.g. `141-3-1`) | `checkpoint` |
|
|
18
18
|
| _(empty / free-text)_ | Check `get_current_standalone_task` first; if found → `standalone`. Else → `checkpoint` via `get_current_task`. |
|
|
19
19
|
|
|
20
|
-
Set `KIND` for the rest of this skill. MCP tool names vary by KIND:
|
|
20
|
+
Set `KIND` for the rest of this skill. round-update is **read + triage only** — it reads round state and routes; it never completes the round or writes file approvals. MCP read/audit tool names vary by KIND:
|
|
21
21
|
|
|
22
22
|
| Operation | `checkpoint` KIND | `standalone` KIND |
|
|
23
23
|
|-----------|------------------|-------------------|
|
|
24
24
|
| Get task | `get_current_task(repo_id)` | `get_current_standalone_task(repo_id)` |
|
|
25
25
|
| Get rounds | `get_rounds(task_id)` | `get_standalone_rounds(standalone_task_id)` |
|
|
26
|
-
|
|
|
27
|
-
| Update round | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
|
|
28
|
-
| Complete round | `complete_round(round_id, duration_minutes?)` | `complete_standalone_round(standalone_round_id, duration_minutes?, caller_worktree_id)` ⚠️ `caller_worktree_id` is REQUIRED for standalone |
|
|
29
|
-
| Update task | `update_task(task_id, ...)` | `update_standalone_task(standalone_task_id, ...)` |
|
|
26
|
+
| Update round (audit only) | `update_round(round_id, ...)` | `update_standalone_round(standalone_round_id, ...)` |
|
|
30
27
|
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
Checks file approval status, completes the round, and routes to next step. NEVER asks the user to git add or stage anything — it only reads current state.
|
|
28
|
+
The completion + file-approval reconcile (`sync-approvals`, `complete_round` / `complete_standalone_round`) now lives in `/cbp-round-complete`.
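One way to read the KIND table above is as a lookup from operation and KIND to an MCP tool name. The sketch below is illustrative only: the operation keys (`get_task`, `get_rounds`, `update_round`) and the helper name are hypothetical labels, while the tool names are the ones listed in the table.

```python
# Tool names copied from the KIND table; the dictionary keys are hypothetical.
ROUND_UPDATE_TOOLS = {
    "get_task": {
        "checkpoint": "get_current_task",
        "standalone": "get_current_standalone_task",
    },
    "get_rounds": {
        "checkpoint": "get_rounds",
        "standalone": "get_standalone_rounds",
    },
    "update_round": {  # audit breadcrumb only
        "checkpoint": "update_round",
        "standalone": "update_standalone_round",
    },
}


def tool_for(operation, kind):
    """Return the MCP tool name for an operation under the given KIND."""
    try:
        return ROUND_UPDATE_TOOLS[operation][kind]
    except KeyError:
        raise ValueError(f"unknown operation/kind: {operation!r}/{kind!r}")
```

Completion tools are deliberately absent from the mapping, matching the statement that round-update never completes a round.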
|
|
34
29
|
|
|
35
|
-
|
|
30
|
+
# Round Update Command
|
|
36
31
|
|
|
37
|
-
|
|
32
|
+
**Claude-only, autonomous triage.** round-update inspects a finished round's automated state — which files Claude approved (`claude_approved`), whether testing-QA hard-failed, and whether `improve-round` left outstanding findings — and routes to exactly one next step. It makes **no writes** beyond an audit breadcrumb and **never prompts the user**: it is auto-triggered by `/cbp-round-end` and runs without a confirmation gate. The user-facing confirmation has moved to `/cbp-round-complete` (an `ask`-tier permission prompt). round-update NEVER reads or touches git staging — user approval is reconciled later by `/cbp-round-complete`.
|
|
38
33
|
|
|
39
|
-
|
|
34
|
+
## Routing in one line
|
|
40
35
|
|
|
41
|
-
|
|
36
|
+
- **Round is clean** → trigger `/cbp-round-complete` (the permission-gated finalizer reconciles your `git add`s and completes the round).
|
|
37
|
+
- **Round needs more work** → trigger `/cbp-round-input` (more changes / planning).
|
|
42
38
|
|
|
43
39
|
## Instructions
|
|
44
40
|
|
|
@@ -82,147 +78,39 @@ Given the parse from Step 1:
|
|
|
82
78
|
|
|
83
79
|
If no task found: `No active task. Nothing to update.`
|
|
84
80
|
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
Step 1 (parse) and Step 1.5 (resolve task + round) are read-only. Step 2 onward mutates state — the `sync-approvals` CLI write, `complete_round`, and the auto-trigger of the next step. Before any of that, confirm the user wants this skill to run.
|
|
88
|
-
|
|
89
|
-
This gate fires on **every** invocation — manual, auto-triggered by `/cbp-round-end`, and on every iteration of the Step 4 auto-loop. There is no bypass. It sits **before** and is independent of the top-of-file HARD GATE (which governs Step 2's exit code, not user consent), and it is distinct from the Branch A clean-exit route choice in Step 5 (that one picks where to go next; this one authorizes running at all).
|
|
90
|
-
|
|
91
|
-
Ask via AskUserQuestion, naming the resolved round and disclosing the actions:
|
|
92
|
-
|
|
93
|
-
> Update ROUND-{N} of TASK-{M}?
|
|
94
|
-
> This will sync the git diff + file approvals, complete the round, and route to the next step (which may auto-start a new round).
|
|
95
|
-
>
|
|
96
|
-
> - **Proceed** — run the skill
|
|
97
|
-
> - **Cancel** — do nothing
|
|
98
|
-
|
|
99
|
-
- **Proceed** → continue to Step 2.
|
|
100
|
-
- **Cancel** → abort cleanly: make NO writes (no `sync-approvals`, no `complete_round`, no auto-trigger) and exit with one line: `Cancelled by user — ROUND-{N} not updated.`
|
|
101
|
-
|
|
102
|
-
### Step 2: Sync git diff + approvals via CLI
|
|
103
|
-
|
|
104
|
-
Run:
|
|
105
|
-
|
|
106
|
-
```
|
|
107
|
-
npx codebyplan round sync-approvals --round-id <round_id> --task-id <task_id>
|
|
108
|
-
```
|
|
109
|
-
|
|
110
|
-
The CLI auto-resolves the caller worktree id with the following precedence:
|
|
111
|
-
1. `--caller-worktree-id <uuid>` override (if passed — skips all resolution)
|
|
112
|
-
2. Per-device branch-keyed cache (`.codebyplan/worktree.local.json`)
|
|
113
|
-
3. In-process tuple API call: `POST /worktrees/resolve` using `(device_id, repo_path, branch)`
|
|
114
|
-
|
|
115
|
-
On the write path (non `--dry-run`), if the worktree id cannot be resolved the CLI **hard-fails with exit 1** and prints an actionable message. To pre-populate the cache:
|
|
116
|
-
|
|
117
|
-
```
|
|
118
|
-
npx codebyplan resolve-worktree --cache
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
If this worktree is not yet registered, run `npx codebyplan setup` first, then re-run `/cbp-round-update`.
|
|
122
|
-
|
|
123
|
-
The CLI parses `git status --short`, merges drift + staging + web-UI flag, and writes both round and task (forwarding `caller_worktree_id` on both writes so the server honors the feat-worktree lock).
|
|
124
|
-
|
|
125
|
-
Read the stdout JSON: `{ added, stale_marked, reactivated, total_files }`.
|
|
126
|
-
|
|
127
|
-
If the command exits non-zero, surface the stderr and STOP. Do NOT proceed to Step 3.
|
|
128
|
-
|
|
129
|
-
### Step 3: Complete Round
|
|
130
|
-
|
|
131
|
-
Calculate duration from `started_at` to now in minutes.
|
|
132
|
-
|
|
133
|
-
- **checkpoint KIND**: MCP `complete_round(round_id, duration_minutes)`.
|
|
134
|
-
- **standalone KIND**: MCP `complete_standalone_round(standalone_round_id, duration_minutes, caller_worktree_id)`. ⚠️ `caller_worktree_id` is REQUIRED — resolve via `CALLER_WT=$(npx codebyplan resolve-worktree 2>/dev/null)`. If `CALLER_WT` is empty, surface this warning and ask user to confirm before proceeding:
|
|
135
|
-
|
|
136
|
-
```
|
|
137
|
-
Warning: could not resolve caller_worktree_id (npx codebyplan resolve-worktree returned empty).
|
|
138
|
-
The complete_standalone_round call may be rejected by the pre-guard. Proceed anyway? (yes / no)
|
|
139
|
-
```
|
|
81
|
+
This step is **read-only**. There is no permission gate — round-update is autonomous (see the Key Rules below).
|
|
140
82
|
|
|
141
|
-
|
|
83
|
+
### Step 2: Triage the Round
|
|
142
84
|
|
|
143
|
-
|
|
85
|
+
Read the latest round's context: `round.context.testing_qa_output.totals.hard_fail`, `round.context.improve_round_findings[]`, `round.context.round_type`, and `round.files_changed[]` (each entry's `claude_approved`). Compute a single `clean` verdict:
|
|
144
86
|
|
|
145
|
-
|
|
87
|
+
- **Survey round** (`round.context.round_type === 'survey'`): no file diff exists, so QA/approval predicates do not apply. `clean` is true when `improve_round_findings[]` is empty.
|
|
88
|
+
- **Normal round**: `clean` is true when every file in `files_changed[]` has `claude_approved === true`, `testing_qa_output.totals.hard_fail === false`, and `improve_round_findings[]` is empty.
|
|
146
89
|
|
|
147
|
-
|
|
90
|
+
Display a one-line triage summary, e.g. `"ROUND-N triage: clean"` or `"ROUND-N triage: K findings / hard_fail=true → needs another round"` (where K is the number of outstanding findings). round-update reads `claude_approved` only; it does **not** read git staging or `user_approved`, which belong to `/cbp-round-complete`.
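The `clean` verdict from Step 2 can be expressed as a single predicate. This is a sketch under stated assumptions: the round is modelled as a plain dictionary with `context` and `files_changed` keys, and the function name is hypothetical. Because the rule compares `hard_fail` to `false` strictly, a missing `hard_fail` value is treated here as not clean; that reading is an assumption.

```python
def triage_is_clean(round_record):
    """Compute the Step 2 `clean` verdict for a finished round."""
    ctx = round_record.get("context") or {}
    no_findings = not ctx.get("improve_round_findings")

    # Survey rounds have no file diff, so only the findings list matters.
    if ctx.get("round_type") == "survey":
        return no_findings

    files = round_record.get("files_changed") or []
    all_approved = all(f.get("claude_approved") is True for f in files)

    totals = (ctx.get("testing_qa_output") or {}).get("totals") or {}
    qa_passed = totals.get("hard_fail") is False  # missing value is not clean

    return all_approved and qa_passed and no_findings
```

Git staging and `user_approved` do not appear anywhere in the predicate, which mirrors the rule that round-update reads `claude_approved` only.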
|
|
148
91
|
|
|
149
|
-
|
|
150
|
-
2. **If `next_index > (round.context.auto_loop_cap ?? 5)`**: surface the cap-exhausted prompt via AskUserQuestion (options: extend cap, stop loop / drop into round-input, close task as-is). Persist `round.context.auto_loop_cap_exhausted = { user_choice, decided_at }` and route per choice.
|
|
151
|
-
3. **Otherwise**: persist `round.context.auto_loop_decision = { spawned_next: true, next_index, decided_at }` on the CURRENT round via `update_round` / `update_standalone_round` per KIND (audit trail), then auto-trigger `/cbp-round-input` with NO AskUserQuestion. Pass `auto_loop_mode: true`, `auto_loop_index: next_index`, `auto_loop_cap: (prior cap ?? 5)` forward — round-start Step 4 persists them on the new round.
|
|
92
|
+
### Step 3: Route
|
|
152
93
|
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
If BOTH signals are clean, fall through to Step 5 (exit routing).
|
|
156
|
-
|
|
157
|
-
### Step 5: Exit Routing
|
|
158
|
-
|
|
159
|
-
**5a: Count files** — Display: `"Files: X total, Y approved, Z pending"`
|
|
160
|
-
|
|
161
|
-
**5b: Route with four branches** (Step 4 already handled the dirty-loop case; Step 5 is the clean-exit path).
|
|
162
|
-
|
|
163
|
-
**Branch D — IF `round.context.round_type === 'survey'` (checked FIRST):**
|
|
164
|
-
|
|
165
|
-
Survey rounds produce no file diff; A/B/C predicates assume `files_changed[]` non-empty. Survey routing is decided by `improve_round_findings[]` instead:
|
|
166
|
-
|
|
167
|
-
- `improve_round_findings[]` non-empty → auto-trigger `/cbp-round-input`
|
|
168
|
-
- `improve_round_findings[]` empty → checkpoint KIND: auto-trigger `/cbp-task-check`; standalone KIND: auto-trigger `/cbp-standalone-task-check`
|
|
169
|
-
|
|
170
|
-
Output: `"## Round [N] Complete — Survey Round"` with duration, files=0, findings count, routing message. Skip Branches A/B/C.
|
|
171
|
-
|
|
172
|
-
**Branch A — ELSE IF all files have `user_approved: true`:**
|
|
173
|
-
|
|
174
|
-
```
|
|
175
|
-
## Round [N] Complete - All Files Approved
|
|
176
|
-
|
|
177
|
-
**Duration**: [N] minutes
|
|
178
|
-
**Files**: [X] total, [X] approved, 0 pending
|
|
179
|
-
```
|
|
180
|
-
|
|
181
|
-
Surface AskUserQuestion (clean-exit user-gate):
|
|
182
|
-
|
|
183
|
-
- **(a) close & complete round** → checkpoint KIND: auto-trigger `/cbp-task-check`; standalone KIND: auto-trigger `/cbp-standalone-task-check`
|
|
184
|
-
- **(b) start new round** → auto-trigger `/cbp-round-input`
|
|
185
|
-
|
|
186
|
-
Persist `round.context.auto_loop_exit = { staged_count, unstaged_count, route, decided_at }`.
|
|
187
|
-
|
|
188
|
-
**Branch B — ELSE IF unapproved files exist AND every unapproved file has `claude_approved: true` AND `testing_qa_output.totals.hard_fail: false` AND no `improve_round_findings[]`:**
|
|
189
|
-
|
|
190
|
-
The clean-but-unstaged staging-gate case. Mode-dependent:
|
|
191
|
-
|
|
192
|
-
- **Auto-loop exit** (the completed round had `auto_loop_mode === true` on its context AND Step 4 fell through here with BOTH signals clean): surface the Branch A clean-exit user-gate — do NOT auto-trigger `/cbp-round-input`. Branch B's entry condition already requires `no improve_round_findings[]`, so the auto-loop's verbatim-from-findings procedure has no input to formulate requirements from; auto-triggering round-input would loop on an empty input. Persist `round.context.auto_loop_exit.degenerate_empty_findings: true` (degenerate sub-case: no findings to feed round-input, so surface the clean-exit user-gate above instead of auto-triggering).
|
|
193
|
-
- **Manual round**: emit staging-gate prompt and STOP. User stages files (never-git-add rule) and re-invokes; command then falls into Branch A.
|
|
194
|
-
|
|
195
|
-
Manual-mode prompt:
|
|
196
|
-
|
|
197
|
-
```
|
|
198
|
-
## Round [N] Complete — Files Pending Staging
|
|
199
|
-
|
|
200
|
-
**Files**: [X] total, [Y] approved, [Z] pending
|
|
201
|
-
|
|
202
|
-
### Pending (passed all checks; not yet staged):
|
|
203
|
-
- [path]
|
|
204
|
-
|
|
205
|
-
Stage them (`git add <path>`) and re-run `/cbp-round-update` to proceed.
|
|
206
|
-
Waiting for user to stage files.
|
|
207
|
-
```
|
|
94
|
+
**3a — Clean → `/cbp-round-complete`.** Auto-trigger `/cbp-round-complete`. round-complete is `ask`-tier: the harness shows a permission prompt (the user's confirmation to finalize the round). round-complete then reconciles the user's `git add`s, completes the round, and routes onward (all files staged → task-check; some withheld → round-input). round-update writes nothing here beyond the Step 2 summary. In `auto_loop_mode`, a clean triage is the loop's success exit — the loop continues only via the not-clean path in 3b; round-complete owns the degenerate clean-but-unstaged guard.
|
|
208
95
|
|
|
209
|
-
**
|
|
96
|
+
**3b — Not clean → `/cbp-round-input`.** More changes or planning are needed. Routing is **independent of git staging** — round-input is reachable whether or not the user has staged anything (it performs its own deep analysis of the unapproved files). Two sub-cases:
|
|
210
97
|
|
|
211
|
-
|
|
98
|
+
- **Auto-loop** (`round.context.auto_loop_mode === true`): compute `next_index = (round.context.auto_loop_index ?? 0) + 1`.
|
|
99
|
+
- If `next_index > (round.context.auto_loop_cap ?? 5)`: surface the cap-exhausted prompt via AskUserQuestion (a genuine multi-option user decision — keep it). Options: extend cap, stop loop / drop into round-input, close task as-is. Persist `round.context.auto_loop_cap_exhausted = { user_choice, decided_at }` and route per choice.
|
|
100
|
+
- Otherwise: persist `round.context.auto_loop_decision = { spawned_next: true, next_index, decided_at }` on the current round via `update_round` / `update_standalone_round` (audit trail), then auto-trigger `/cbp-round-input` with NO prompt. Pass `auto_loop_mode: true`, `auto_loop_index: next_index`, `auto_loop_cap: (prior cap ?? 5)` forward — round-start Step 4 persists them on the new round.
|
|
101
|
+
- **Manual round**: auto-trigger `/cbp-round-input` directly (no prompt).
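The Step 3 routing, including the auto-loop cap check, can be sketched as follows. The function name, the return shape, and the `ask_user_cap_exhausted` label are hypothetical; the default cap of 5, the `next_index` arithmetic, and the forwarded keys follow the text above.

```python
DEFAULT_AUTO_LOOP_CAP = 5


def route_after_triage(clean, context):
    """Pick the next step for a triaged round (Step 3a / 3b)."""
    if clean:
        # 3a: the ask-tier permission prompt on round-complete is the confirmation.
        return {"next": "/cbp-round-complete"}

    if context.get("auto_loop_mode") is True:
        index = context.get("auto_loop_index")
        next_index = (index if index is not None else 0) + 1
        cap = context.get("auto_loop_cap")
        cap = cap if cap is not None else DEFAULT_AUTO_LOOP_CAP
        if next_index > cap:
            # Cap exhausted: the one genuine user decision in this skill.
            return {"next": "ask_user_cap_exhausted"}
        return {
            "next": "/cbp-round-input",
            "forward": {
                "auto_loop_mode": True,
                "auto_loop_index": next_index,
                "auto_loop_cap": cap,
            },
        }

    # 3b manual round: route directly, with no prompt.
    return {"next": "/cbp-round-input"}
```

With the default cap, a round whose `auto_loop_index` is 4 still spawns round 5, while an index of 5 triggers the cap-exhausted prompt.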
|
|
212
102
|
|
|
213
103
|
## Key Rules
|
|
214
104
|
|
|
215
|
-
- **
|
|
216
|
-
- **
|
|
217
|
-
- **
|
|
218
|
-
- **
|
|
219
|
-
-
|
|
220
|
-
- **standalone KIND Step 3**: `caller_worktree_id` is REQUIRED for `complete_standalone_round` — always resolve and pass it.
|
|
105
|
+
- **Autonomous + Claude-only** — round-update never prompts before running. It is auto-triggered by `/cbp-round-end`. The confirmation step is `/cbp-round-complete`'s `ask`-tier permission prompt, not an AskUserQuestion here. (The auto-loop cap-exhausted AskUserQuestion in Step 3b is a genuine user decision, not a run gate.)
|
|
106
|
+
- **Triage, never finalize** — round-update does NOT call `sync-approvals`, `complete_round`, or `complete_standalone_round`, and does NOT write file approvals. All of that is `/cbp-round-complete`.
|
|
107
|
+
- **Never touches git** — round-update reads `claude_approved` from the DB only; it never reads staging, asks the user to `git add`, or stages files.
|
|
108
|
+
- **git-add independence** — the "needs more work" route to `/cbp-round-input` fires regardless of whether files are staged. There is no clean-but-unstaged dead-end.
|
|
109
|
+
- **standalone parity** — KIND detection governs which read/audit tools are used; the clean→`/cbp-round-complete` and not-clean→`/cbp-round-input` routing is identical for both KINDs (round-complete and round-input self-detect KIND).
|
|
221
110
|
|
|
222
111
|
## Integration
|
|
223
112
|
|
|
224
|
-
- **Gates**: Step 1.6 permission gate — asks the user to confirm before any side effect; **Cancel** aborts cleanly with no writes. Fires on every invocation incl. the Step 4 auto-loop; sits before and independent of the top-of-file Step 2 hard gate.
|
|
225
113
|
- **Triggered by**: `/cbp-round-end` (auto), or user manually
|
|
226
|
-
- **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND);
|
|
227
|
-
- **Writes**: MCP `update_round` / `update_standalone_round`
|
|
228
|
-
- **Triggers**: `/cbp-round-
|
|
114
|
+
- **Reads**: MCP `get_current_task` / `get_current_standalone_task`, `get_rounds` / `get_standalone_rounds` (per KIND); round context (`testing_qa_output`, `improve_round_findings`, `round_type`, `files_changed[].claude_approved`)
|
|
115
|
+
- **Writes**: MCP `update_round` / `update_standalone_round` — audit only (`auto_loop_decision` / `auto_loop_cap_exhausted`). No completion, no file-approval writes.
|
|
116
|
+
- **Triggers**: `/cbp-round-complete` (clean triage — `ask`-tier permission prompt is the user confirmation), `/cbp-round-input` (not-clean triage: outstanding findings, hard-fail, or unapproved Claude checks — fires independent of git staging; also the auto-loop dirty spawn), cap-exhausted prompt routes from Step 3b (any of the three options)
|
|
@@ -17,7 +17,7 @@ If the `cbp-task-check` agent spawn fails for any reason, follow the canonical i
|
|
|
17
17
|
|
|
18
18
|
## When Used
|
|
19
19
|
|
|
20
|
-
- After all rounds complete and all files approved (auto-triggered by `/cbp-round-
|
|
20
|
+
- After all rounds complete and all files approved (auto-triggered by `/cbp-round-complete`)
|
|
21
21
|
- Before `/cbp-standalone-task-testing`
|
|
22
22
|
- Never skippable
|
|
23
23
|
|
|
@@ -149,4 +149,4 @@ Suggest: Approve files, then re-run `/cbp-standalone-task-check {task}`. Stop
|
|
|
149
149
|
- **Reads**: MCP `get_current_standalone_task`, `get_standalone_tasks`, `get_standalone_rounds`, all changed files (via agent)
|
|
150
150
|
- **Writes**: MCP `update_standalone_task` (context.check_verdict)
|
|
151
151
|
- **Triggers**: emits directive `Next: /clear, then /cbp-standalone-task-testing {task}` on READY + satisfied
|
|
152
|
-
- **Triggered by**: `/cbp-round-
|
|
152
|
+
- **Triggered by**: `/cbp-round-complete` (auto, when all files approved)
|
|
@@ -10,6 +10,8 @@ effort: xhigh
|
|
|
10
10
|
|
|
11
11
|
Complete a standalone task. Auto-triggered by `/cbp-standalone-task-testing` when all tests pass. Can also be run manually.
|
|
12
12
|
|
|
13
|
+
This skill is gated by an `ask`-tier `Skill(cbp-standalone-task-complete)` permission rule in `settings.json` (shipped templates). **The permission prompt IS the user confirmation** — there is NO completion-confirmation AskUserQuestion inside this skill (the Step 7.5 `caller_worktree_id` guard is a separate environmental safety prompt, not a flow-control confirmation). A declined permission is a clean no-op (nothing committed, merged, pushed, or completed).
|
|
14
|
+
|
|
13
15
|
## Instructions
|
|
14
16
|
|
|
15
17
|
### Step 1: Parse `$ARGUMENTS`
|
|
@@ -84,9 +86,7 @@ Load `task.qa` and `task.files_changed`:
|
|
|
84
86
|
1. **QA**: count items by status (pass / fail / pending / skipped). If any item has status `fail` or `pending`, warn the user.
|
|
85
87
|
2. **Files**: list any file with `user_approved === false` and warn.
|
|
86
88
|
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
**If no issues**, AskUserQuestion to confirm: `Ready to complete standalone TASK-[N]: [title] — [N] rounds, [N] files. Proceed?`
|
|
89
|
+
If any QA item is `fail`/`pending` or any file is unapproved, **surface the warnings in the output and continue** — record them for the Step 9 summary. There is NO confirmation AskUserQuestion here: `Skill(cbp-standalone-task-complete)` is `ask`-tier, so the harness permission prompt that gated this skill IS the user's confirmation to complete. The hard gates in Steps 2–2.6 (all rounds completed, ≥1 round has `testing_qa_output`, `check_verdict` READY, `task_testing_output.all_passed`) already block completion when prerequisites are unmet; these QA / file-approval items are warnings, not blockers.
|
|
90
90
|
|
|
91
91
|
### Step 4: Aggregate Files Changed
|
|
92
92
|
|
|
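Editor's sketch of the warn-and-continue pass that the hunk above introduces in place of the confirmation prompt: QA items are counted by status, unapproved files are listed, and the result is carried into the Step 9 `**Warnings**` line. The status values and `user_approved` come from the skill text. The `QaItem` and `ChangedFile` shapes, the `title` field, and both function names are hypothetical.

```typescript
type QaStatus = "pass" | "fail" | "pending" | "skipped";

// Hypothetical record shapes; only `status` values and `user_approved` are from the skill text.
interface QaItem {
  title: string;
  status: QaStatus;
}

interface ChangedFile {
  path: string;
  user_approved: boolean;
}

// Collects warnings without blocking: the caller continues regardless of the result.
function collectWarnings(qa: QaItem[], files: ChangedFile[]): string[] {
  const counts: Record<QaStatus, number> = { pass: 0, fail: 0, pending: 0, skipped: 0 };
  for (const item of qa) {
    counts[item.status] += 1;
  }
  const warnings: string[] = [];
  if (counts.fail > 0) warnings.push(`${counts.fail} QA item(s) failed`);
  if (counts.pending > 0) warnings.push(`${counts.pending} QA item(s) pending`);
  for (const file of files) {
    if (file.user_approved === false) warnings.push(`unapproved file: ${file.path}`);
  }
  return warnings;
}

// Formats the summary line; "none" when nothing was recorded.
function warningsLine(warnings: string[]): string {
  return `**Warnings**: ${warnings.length > 0 ? warnings.join("; ") : "none"}`;
}
```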
@@ -169,6 +169,7 @@ Apply the `cleanup` skill inline to remove orphan references to deleted/modified
|
|
|
169
169
|
**Files**: [N] changed
|
|
170
170
|
**Commit**: [hash]
|
|
171
171
|
**Branch merged**: [feat-branch] → {PRODUCTION}
|
|
172
|
+
**Warnings**: [any QA / file-approval warnings from Step 3, or "none"]
|
|
172
173
|
```
|
|
173
174
|
|
|
174
175
|
#### Route (single directive — never a menu)
|
|
@@ -9,7 +9,7 @@ effort: xhigh
|
|
|
9
9
|
|
|
10
10
|
# Standalone Task Testing Command
|
|
11
11
|
|
|
12
|
-
Comprehensive task-level testing for standalone tasks —
|
|
12
|
+
Comprehensive task-level testing for standalone tasks — the **cross-round double-check** run once after all rounds complete. Per-round QA (per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, `pnpm audit`) is owned by each round's `testing-qa-agent`; this skill does NOT re-run it. It tests the entire delivered feature holistically across the full task diff — catching cross-package and cross-round problems no single round can see. Runs inline — no sub-agent.
|
|
13
13
|
|
|
14
14
|
## When Used
|
|
15
15
|
|
|
@@ -19,7 +19,7 @@ Comprehensive task-level testing for standalone tasks — runs all automated tes
|
|
|
19
19
|
|
|
20
20
|
## Scope vs Round-Level Validation
|
|
21
21
|
|
|
22
|
-
Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5. This skill adds the cross-
|
|
22
|
+
Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5) and the user manual walkthrough (Step 8).
|
|
23
23
|
|
|
24
24
|
## Instructions
|
|
25
25
|
|
|
@@ -93,9 +93,9 @@ Capture stdout and stderr for each check.
|
|
|
93
93
|
| Full-repo lint | `pnpm -w lint` | Always |
|
|
94
94
|
| Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
|
|
95
95
|
| Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
|
|
96
|
-
| Full-repo audit | `pnpm audit` | Always |
|
|
97
96
|
| Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
|
|
98
|
-
|
|
97
|
+
|
|
98
|
+
These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here.
|
|
99
99
|
|
|
100
100
|
**Soft tests** (report, don't block):
|
|
101
101
|
|
|
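Editor's sketch of how the trimmed hard-test table above could be turned into a check list. The commands and their conditions are copied from the table rows that remain after this change. The `isSource` and `isUi` predicates, the `uiPackages` argument, and the function name are assumptions, since the diff does not show how source or UI files are detected.

```typescript
interface HardCheck {
  name: string;
  command: string;
}

// Assumed heuristics: the skill text does not define "source file" or "UI file".
const isSource = (path: string): boolean => /\.(ts|tsx|js|jsx)$/.test(path);
const isUi = (path: string): boolean => /\.(tsx|jsx)$/.test(path);

// Builds the workspace-wide check list. `pnpm audit` and the security scan are
// intentionally absent: per the hunk above they already ran per round.
function selectHardChecks(aggregatedFiles: string[], uiPackages: string[]): HardCheck[] {
  const checks: HardCheck[] = [{ name: "Full-repo lint", command: "pnpm -w lint" }];
  if (aggregatedFiles.some(isSource)) {
    checks.push({ name: "Full-repo types", command: "pnpm exec tsc --noEmit" });
    checks.push({ name: "Full-repo unit tests", command: "pnpm test --run" });
  }
  if (aggregatedFiles.some(isUi)) {
    for (const pkg of uiPackages) {
      checks.push({ name: "Per-package E2E", command: `pnpm --filter ${pkg} e2e:test` });
    }
  }
  return checks;
}
```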
@@ -9,7 +9,7 @@ effort: high
|
|
|
9
9
|
|
|
10
10
|
# Task Check Command
|
|
11
11
|
|
|
12
|
-
AI-driven production readiness review. Spawns the `cbp-task-check` agent for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting.
|
|
12
|
+
AI-driven production readiness review. Spawns the `cbp-task-check` agent for thorough verification including user satisfaction discussion. This command is a thin orchestrator — the agent does the heavy lifting. It is the **cross-round double-check**: rounds already own per-round QA (debug scan, security grep, audit, per-app build/lint/types), so this layer focuses on holistic concerns visible only across the full task diff — requirements traceability, checkpoint alignment, shippability, holistic code review, and scope drift — never re-running per-round checks.
|
|
13
13
|
|
|
14
14
|
## Inline-Fallback for Spawn Failure
|
|
15
15
|
|
|
@@ -27,7 +27,7 @@ Inline-fallback is NOT a quality downgrade trapdoor — every Phase from the age
|
|
|
27
27
|
|
|
28
28
|
## When Used
|
|
29
29
|
|
|
30
|
-
- After all rounds complete and all files approved (auto-triggered by `/cbp-round-
|
|
30
|
+
- After all rounds complete and all files approved (auto-triggered by `/cbp-round-complete`)
|
|
31
31
|
- Before `/cbp-task-testing`
|
|
32
32
|
- `/cbp-task-check` is NEVER skippable
|
|
33
33
|
|
|
@@ -163,4 +163,4 @@ Suggest: Approve files, then re-run `/cbp-task-check`. **STOP HERE** — wait fo
|
|
|
163
163
|
- **Reads**: MCP `get_current_task`, `get_rounds`, all changed files (via agent)
|
|
164
164
|
- **Writes**: MCP `update_task` (context.check_verdict)
|
|
165
165
|
- **Triggers**: emits directive `Next: /clear, then /cbp-task-testing {chk-task}` on READY + satisfied (cross-context — testing is heavyweight, fresh context helps)
|
|
166
|
-
- **Triggered by**: `/cbp-round-
|
|
166
|
+
- **Triggered by**: `/cbp-round-complete` (auto, when all files approved)
|
|
@@ -10,6 +10,8 @@ effort: xhigh
|
|
|
10
10
|
|
|
11
11
|
Complete the current task. Auto-triggered by `/cbp-task-testing` when all tests pass. Can also be run manually.
|
|
12
12
|
|
|
13
|
+
This skill is gated by an `ask`-tier `Skill(cbp-task-complete)` permission rule in `settings.json`. **The permission prompt IS the user confirmation** — there is NO AskUserQuestion inside this skill. A declined permission is a clean no-op (nothing committed, merged, pushed, or completed).
|
|
14
|
+
|
|
13
15
|
## Instructions
|
|
14
16
|
|
|
15
17
|
### Step 1: Parse `$ARGUMENTS`
|
|
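Editor's sketch of the `ask`-tier gate referenced in the hunk above, written as a lookup over a permissions block. The rule string `Skill(cbp-task-complete)` comes from the skill text. The `allow` / `ask` / `deny` layout of `settings.json`, the deny-over-ask-over-allow precedence, and the `skillTier` function are assumptions for illustration; the changed `settings.project.base.json` is not shown in this part of the diff.

```typescript
// Assumed shape of the permissions block in settings.json.
interface Permissions {
  allow: string[];
  ask: string[];
  deny: string[];
}

type Tier = "deny" | "ask" | "allow" | "unlisted";

// Resolves which tier governs a skill. Precedence (deny, then ask, then allow) is assumed.
function skillTier(permissions: Permissions, skill: string): Tier {
  const rule = `Skill(${skill})`;
  if (permissions.deny.indexOf(rule) !== -1) return "deny";
  if (permissions.ask.indexOf(rule) !== -1) return "ask";
  if (permissions.allow.indexOf(rule) !== -1) return "allow";
  return "unlisted";
}
```

With a skill in the `ask` tier, the harness prompt happens before the skill body runs, which is why a declined permission leaves nothing committed, merged, pushed, or completed.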
@@ -90,12 +92,10 @@ Stop here.
|
|
|
90
92
|
|
|
91
93
|
Load `task.qa` and `task.files_changed`:
|
|
92
94
|
|
|
93
|
-
1. **QA**: count items by status (pass / fail / pending / skipped) across all types.
|
|
94
|
-
2. **Files**: list any file with `user_approved === false
|
|
95
|
-
|
|
96
|
-
**If issues exist**, AskUserQuestion: `Complete anyway` / `Run QA first` (suggest `/cbp-task-check`) / `Cancel`. On `Run QA first` or `Cancel`, stop. On `Complete anyway`, continue.
|
|
95
|
+
1. **QA**: count items by status (pass / fail / pending / skipped) across all types.
|
|
96
|
+
2. **Files**: list any file with `user_approved === false`.
|
|
97
97
|
|
|
98
|
-
|
|
98
|
+
If any QA item is `fail`/`pending` or any file is unapproved, **surface the warnings in the output and continue** — record them for the Step 9 summary. There is NO confirmation AskUserQuestion here: `Skill(cbp-task-complete)` is `ask`-tier, so the harness permission prompt that gated this skill IS the user's confirmation to complete. The hard gates in Steps 2–2.6 (all rounds completed, ≥1 round has `testing_qa_output`, `check_verdict` READY, `task_testing_output.all_passed`) already block completion when prerequisites are unmet; these QA / file-approval items are warnings, not blockers.
|
|
99
99
|
|
|
100
100
|
### Step 4: Aggregate Files Changed
|
|
101
101
|
|
|
@@ -142,7 +142,7 @@ Call `complete_task(task_id)`. The server resolves the caller's worktree identit
|
|
|
142
142
|
|
|
143
143
|
Apply the `cleanup` skill inline to remove orphan references to deleted/modified files. Then apply `migration` to propagate renames/moves to consumers. Both run without sub-agent spawns. Skip cleanup if no deletions/modifications; skip migration if cleanup handled everything.
|
|
144
144
|
|
|
145
|
-
### Step 9: Show Result and Route
|
|
145
|
+
### Step 9: Show Result and Route
|
|
146
146
|
|
|
147
147
|
Show the completion summary:
|
|
148
148
|
|
|
@@ -153,6 +153,7 @@ Show the completion summary:
|
|
|
153
153
|
**Rounds**: [N] completed
|
|
154
154
|
**Files**: [N] changed
|
|
155
155
|
**Commit**: [hash]
|
|
156
|
+
**Warnings**: [any QA / file-approval warnings from Step 3, or "none"]
|
|
156
157
|
```
|
|
157
158
|
|
|
158
159
|
Then route. Same-context transitions (next task in this checkpoint) auto-trigger via the Skill tool. Cross-context transitions (checkpoint done → /cbp-checkpoint-check, session end) surface as a single directive 'Next: /clear, then /cbp-X' for the user to invoke after refreshing context.
|
|
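Editor's sketch of the Step 9 routing rule above, which distinguishes same-context transitions (auto-triggered) from cross-context ones (a single `Next: /clear, then ...` directive). The directive format and `/cbp-checkpoint-check` come from the skill text. The `Transition` type, the `hasNextTaskInCheckpoint` flag, and the choice of `/cbp-task-start` as the auto-triggered skill are assumptions.

```typescript
type Transition =
  | { kind: "auto"; skill: string }      // same context: invoked via the Skill tool
  | { kind: "directive"; text: string }; // cross context: the user runs it after /clear

// Always returns exactly one route, never a menu of options.
function routeAfterTaskComplete(hasNextTaskInCheckpoint: boolean): Transition {
  if (hasNextTaskInCheckpoint) {
    // Assumed target skill for the next task in the same checkpoint.
    return { kind: "auto", skill: "/cbp-task-start" };
  }
  return { kind: "directive", text: "Next: /clear, then /cbp-checkpoint-check" };
}
```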
@@ -9,7 +9,7 @@ effort: xhigh
|
|
|
9
9
|
|
|
10
10
|
# Task Testing Command
|
|
11
11
|
|
|
12
|
-
Comprehensive task-level testing —
|
|
12
|
+
Comprehensive task-level testing — the **cross-round double-check** run once after all rounds complete. Per-round QA (per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, `pnpm audit`) is owned by each round's `testing-qa-agent`; this skill does NOT re-run it. Instead it tests the **entire delivered feature holistically** across the full task diff — catching cross-package and cross-round problems no single round can see. Runs inline — no sub-agent.
|
|
13
13
|
|
|
14
14
|
## When Used
|
|
15
15
|
|
|
@@ -19,7 +19,7 @@ Comprehensive task-level testing — runs all automated tests and walks the user
|
|
|
19
19
|
|
|
20
20
|
## Scope vs Round-Level Validation
|
|
21
21
|
|
|
22
|
-
Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5. This skill adds the cross-
|
|
22
|
+
Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5), the autonomous sim screenshot loop (Step 6.x), and the user manual walkthrough (Step 8).
|
|
23
23
|
|
|
24
24
|
## Instructions
|
|
25
25
|
|
|
@@ -109,11 +109,9 @@ Capture stdout and stderr for each check.
|
|
|
109
109
|
| Full-repo lint | `pnpm -w lint` | Always |
|
|
110
110
|
| Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
|
|
111
111
|
| Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
|
|
112
|
-
| Full-repo audit | `pnpm audit` | Always |
|
|
113
112
|
| Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
|
|
114
|
-
| Full-diff security scan | inline grep or `security-agent` | Always |
|
|
115
113
|
|
|
116
|
-
Per-file lint + format are enforced by `lint-format-on-edit.sh`
|
|
114
|
+
These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here. Per-file lint + format are enforced by `lint-format-on-edit.sh` per edit. This step catches cross-package issues invisible to per-wave checks.
|
|
117
115
|
|
|
118
116
|
**Soft tests** (report, don't block):
|
|
119
117
|
|
|
@@ -133,7 +133,7 @@ Once the gates pass, load the context the head command needs. This ensures `/cle
|
|
|
133
133
|
| `/cbp-checkpoint-start` | Load checkpoint via MCP `get_checkpoints` + `get_tasks(checkpoint_id)`. Display checkpoint title, status, claim state, first pending task |
|
|
134
134
|
| `/cbp-task-start [N]` | Load via MCP `get_current_task`. Display checkpoint title + task title/requirements summary |
|
|
135
135
|
| `/cbp-round-start` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + round count + last round summary |
|
|
136
|
-
| `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed
|
|
136
|
+
| `/cbp-round-update` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + files_changed triage summary (claude_approved, findings, hard_fail) |
|
|
137
137
|
| `/cbp-round-input` | **Full context load** (see Step 2b) |
|
|
138
138
|
| `/cbp-task-check` | Load via MCP `get_current_task`. Display checkpoint + task + files summary |
|
|
139
139
|
| `/cbp-task-testing` | Load via MCP `get_current_task` + `get_rounds(task_id)`. Display checkpoint + task + testing status summary |
|