npm - create-claude-cabinet - Versions diffs - 0.32.0 → 0.33.0 - Mend

create-claude-cabinet 0.32.0 → 0.33.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/lib/cli.js +1 -1
package/package.json +1 -1
package/templates/README.md +1 -1
package/templates/cabinet/checkpoint-protocol.md +48 -0
package/templates/hooks/action-completion-gate.sh +1 -1
package/templates/skills/cc-upgrade/SKILL.md +6 -0
package/templates/skills/cc-upgrade/phases/execute-group-workflow-split-detect.md +74 -0
package/templates/skills/execute-group/SKILL.md +184 -97
package/templates/workflows/execute-group-complete.js +303 -0
package/templates/workflows/execute-group-implement.js +243 -0
package/templates/workflows/execute-group.js +0 -506

package/lib/cli.js CHANGED

@@ -485,7 +485,7 @@ const MODULES = {
     mandatory: false,
     default: true,
     lean: true,
-    templates: ['skills/plan', 'skills/execute', 'skills/generate-plan-groups', 'skills/execute-group', 'workflows/execute-group.js', 'skills/investigate', 'cabinet/checkpoint-protocol.md'],
+    templates: ['skills/plan', 'skills/execute', 'skills/generate-plan-groups', 'skills/execute-group', 'workflows/execute-group-implement.js', 'workflows/execute-group-complete.js', 'skills/investigate', 'cabinet/checkpoint-protocol.md'],
   },
   'compliance': {
     name: 'Compliance Stack (rules + enforcement)',
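
To see what the `templates` change above amounts to, the two arrays can be compared mechanically. This is a minimal sketch: the arrays are copied from the removed and added lines of the hunk, and `diffTemplates` is a hypothetical helper for illustration, not something `cli.js` contains.

```javascript
// Template lists copied from the 0.32.0 (-) and 0.33.0 (+) lines of the hunk above.
const before = [
  'skills/plan', 'skills/execute', 'skills/generate-plan-groups',
  'skills/execute-group', 'workflows/execute-group.js',
  'skills/investigate', 'cabinet/checkpoint-protocol.md',
];
const after = [
  'skills/plan', 'skills/execute', 'skills/generate-plan-groups',
  'skills/execute-group', 'workflows/execute-group-implement.js',
  'workflows/execute-group-complete.js',
  'skills/investigate', 'cabinet/checkpoint-protocol.md',
];

// Hypothetical helper (not part of cli.js): list entries dropped and entries introduced.
function diffTemplates(oldList, newList) {
  const oldSet = new Set(oldList);
  const newSet = new Set(newList);
  return {
    removed: oldList.filter((t) => !newSet.has(t)),
    added: newList.filter((t) => !oldSet.has(t)),
  };
}

console.log(diffTemplates(before, after));
```

One template is dropped and two are introduced, which matches the `-506` / `+303` / `+243` workflow entries in the file list.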

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "create-claude-cabinet",
-  "version": "0.32.0",
+  "version": "0.33.0",
   "description": "Claude Cabinet — opinionated process scaffolding for Claude Code projects",
   "bin": {
     "create-claude-cabinet": "bin/create-claude-cabinet.js"

package/templates/README.md CHANGED

@@ -40,7 +40,7 @@ templates, see [EXTENSIONS.md](EXTENSIONS.md).
 | `skills/debrief-quick/` | Quick debrief variant — core phases only, skip presentation. |
 | `skills/execute/` | Execute a plan with cabinet member checkpoints. 3-checkpoint protocol (pre-implementation, per-file-group, pre-commit). 5 phase files. |
 | `skills/generate-plan-groups/` | Scheduler: find plans with surface-area declarations, build a conflict graph, persist conflict-free parallel groups as pib-db `grp:` tags. Does not execute — hands each group to /execute-group. |
-| `skills/execute-group/` | Runner: execute one generated group via the `execute-group.js` workflow — cabinet pre-review, parallel worktree implementation, sequential merge with per-plan review, integration, informed final review, completion report. |
+| `skills/execute-group/` | Runner: execute one generated group as a 3-stage pipeline — interactive cabinet pre-review (CP1) the operator decides on, then the `execute-group-implement.js` workflow (parallel worktree implementation + sequential merge) and the `execute-group-complete.js` workflow (advisory review + integration + completion report), with operator checkpoints between stages. |
 | `skills/cc-extract/` | Analyze project artifacts and propose upstream extraction candidates for Claude Cabinet. |
 | `skills/investigate/` | Structured codebase exploration: frame, observe, hypothesize, test, conclude. |
 | `skills/cc-link/` | Set up local development linking for Claude Cabinet source repo work. |

package/templates/cabinet/checkpoint-protocol.md CHANGED

@@ -27,6 +27,36 @@ set of conflict-free plans implemented concurrently in separate worktrees,
 then merged together. `/execute` never exercises that scope — it runs one
 plan at a time and uses only the first three.
+## Checkpoint modes — who acts on the verdict
+The scope says *what* is reviewed. The **mode** says *what happens to the
+verdict*. This distinction is load-bearing: an autonomous gate that reverts
+or halts on a false-positive `stop` is fragile and expensive. The default for
+high-stakes reviews is to put judgment in front of the operator.
+| Mode | Where it runs | What a `stop`/`pause` does | Used by |
+|------|---------------|----------------------------|---------|
+| **Interactive CP** | Main session (skill level) | Surfaced to the operator, who decides (proceed / drop / override / abort). Never automatic. | `/execute-group` CP1 |
+| **Advisory CP** | Workflow | Recorded in the Completion Report as a concern. Never halts or reverts. The only automatic gate alongside it is `/validate`. | `/execute-group` CP3 |
+| **Full CP** | Main session or workflow | Halts on `stop`, escalates 3+ `pause` to a halt, requires explicit override. The classic gate. | `/execute` CP1/CP2/CP3 |
+**Why Interactive and Advisory exist.** `/execute-group` once ran CP1 and CP3
+as autonomous gates inside a single workflow: a cabinet `stop` halted the run
+or reverted a merge with no human in the loop. False positives there cost real
+money (a CP1 halted twice consecutively — 1.6M+ tokens — on concerns the plan
+text already addressed). Moving CP1 to interactive (operator decides) and CP3
+to advisory (concerns recorded, `/validate` is the only hard gate) keeps the
+review signal while removing the destructive autonomous action.
+### Interactive CP adds a required `addressed_by_plan` field
+At Interactive CP (`/execute-group` CP1, `pre-impl` scope), each agent's
+verdict carries one extra **required** field, `addressed_by_plan` — the list
+of risks the plan already handles. The agent must enumerate these *first*,
+before raising any concern. This forces the plan-first discipline structurally:
+a risk listed in `addressed_by_plan` cannot also be raised as a concern. It is
+the direct fix for the false-positive halts.
 ## Step 1 — Select which members to spawn
 Spawn one Agent per cabinet member that matches **either**:
@@ -107,8 +137,26 @@ Each agent returns exactly this shape:
 }
 ```
+At **Interactive CP** (`/execute-group` CP1), add the required
+`addressed_by_plan` array described above:
+```json
+{
+  "cabinet_member": "name",
+  "addressed_by_plan": ["risks the plan already handles"],
+  "verdict": "continue" | "pause" | "stop",
+  "concerns": [ ... ]
+}
+```
 ## Step 4 — Apply escalation
+The escalation below is **Full CP** behavior (used by `/execute`). For
+**Interactive CP** the verdicts are surfaced to the operator severity-first
+and the operator decides — no automatic halt. For **Advisory CP** the concerns
+are recorded in the Completion Report and nothing halts or reverts; `/validate`
+is the only automatic gate. See "Checkpoint modes" above.
 Collect every verdict, then:
 - **Any `stop`** → halt. Show the concern. Require an explicit override

package/templates/hooks/action-completion-gate.sh CHANGED

@@ -98,7 +98,7 @@ try:
     cp3g = cks.get('cp3_group', '')
     if me is None: print('NOT_IN_REPORT')
     elif me.get('status') != 'merged': print('plan-status=' + str(me.get('status')))
-    elif cp3g not in ('continue', 'skipped', 'n/a'): print('cp3_group=' + str(cp3g))
+    elif cp3g not in ('continue', 'skipped', 'n/a'): print('cp3_group=' + str(cp3g))  # n/a: backward-compat with pre-v0.32 reports
     elif integ.get('validate') != 'pass': print('integration.validate=' + str(integ.get('validate')))
     elif integ.get('breadcrumbs') != 'valid': print('integration.breadcrumbs=' + str(integ.get('breadcrumbs')))
     else: print('OK')
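
The branch chain in this hunk is easier to follow as a standalone function. The sketch below restates it in JavaScript under stated assumptions: `checkGroupReport` is a hypothetical name, the lookup of the plan's entry by a `fid` field is a guess (the hunk does not show how `me` is found), and missing values print as `undefined` here where the Python prints `None`.

```javascript
// Hypothetical restatement of the gate's decision chain, not the hook itself.
// Assumption: per_plan entries carry a `fid` field that identifies the plan.
function checkGroupReport(report, planFid) {
  const checkpoints = report.checkpoints || {};
  const integration = checkpoints.integration || {};
  const me = (report.per_plan || []).find((p) => p.fid === planFid);
  const cp3Group = checkpoints.cp3_group ?? '';

  if (me === undefined) return 'NOT_IN_REPORT';
  if (me.status !== 'merged') return `plan-status=${me.status}`;
  // 'n/a' is still accepted so that older reports keep passing the gate.
  if (!['continue', 'skipped', 'n/a'].includes(cp3Group)) return `cp3_group=${cp3Group}`;
  if (integration.validate !== 'pass') return `integration.validate=${integration.validate}`;
  if (integration.breadcrumbs !== 'valid') return `integration.breadcrumbs=${integration.breadcrumbs}`;
  return 'OK';
}
```

The checks run in order, so the first failing field is the one reported.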

package/templates/skills/cc-upgrade/SKILL.md CHANGED

@@ -280,6 +280,12 @@ orphans conversationally:
 - **`execute-plans/` → `generate-plan-groups/` + `execute-group/`:** if
   `.claude/skills/execute-plans/` exists, run
   `phases/execute-plans-rename-detect.md`.
+- **`execute-group.js` → `execute-group-implement.js` + `execute-group-complete.js`:**
+  if `.claude/workflows/execute-group.js` exists, run
+  `phases/execute-group-workflow-split-detect.md`. That phase removes the
+  orphaned monolithic workflow once both replacement workflow scripts are
+  present (the cleanup loop won't, since the file was deleted upstream rather
+  than renamed).
 - **`handoff*` → `engagement*` (+ `.claude/handoff/` infra → `.claude/engagement/`):**
   if any of `.claude/skills/handoff*` or `.claude/handoff/` exists, run
   `phases/handoff-rename-detect.md`. That phase removes the orphaned skill

package/templates/skills/cc-upgrade/phases/execute-group-workflow-split-detect.md ADDED

@@ -0,0 +1,74 @@
+# execute-group.js workflow split detection
+In the execute-group redesign, the monolithic `execute-group.js` workflow
+(one script running CP1 → implement → merge with per-plan CP3 → integration →
+group CP3 → completion as autonomous gates) was split into two focused
+workflow scripts plus a skill-level interactive checkpoint:
+- **`execute-group-implement.js`** — mechanical parallel implementation +
+  sequential merge (no cabinet review).
+- **`execute-group-complete.js`** — advisory CP3 + integration + completion
+  report.
+- **Interactive CP1** moved into the `/execute-group` SKILL.md (the operator
+  decides; it is no longer an autonomous gate).
+The installer copies the two new workflow files, but the old
+`.claude/workflows/execute-group.js` is **not** removed by the cleanup loop:
+that loop only deletes files still mapping to a current CC template, and
+`execute-group.js` no longer maps to one (it was deleted upstream, not
+renamed). So after an upgrade a project that had it ends up with the stale
+`execute-group.js` sitting next to the two new scripts.
+This phase detects and removes that orphan.
+## When this phase runs
+Only when the orphan workflow file is actually on disk:
+```bash
+test -f .claude/workflows/execute-group.js && echo "HAS_ORPHAN=1"
+```
+If it's absent, skip silently — nothing to clean up.
+## What to do
+When the orphan is present, explain the split in plain terms:
+> The `/execute-group` workflow was split. The old single
+> `execute-group.js` script (which ran cabinet review as autonomous
+> halt/revert gates) is replaced by:
+> - **`execute-group-implement.js`** — mechanical implementation + merge
+> - **`execute-group-complete.js`** — advisory review + completion
+> - an **interactive CP1** that now lives in the `/execute-group` skill, so
+>   *you* decide on pre-implementation concerns instead of a gate halting
+>   automatically.
+>
+> Both new workflow scripts are installed. The old `execute-group.js` is
+> left over from before the split and should be removed.
+The orphan is only safe to remove once **both** replacement workflows are
+present (otherwise removing it would strand `/execute-group` with no
+orchestrator):
+```bash
+if [ -f .claude/workflows/execute-group-implement.js ] && \
+   [ -f .claude/workflows/execute-group-complete.js ]; then
+  rm -f .claude/workflows/execute-group.js
+  echo "Removed orphaned .claude/workflows/execute-group.js"
+else
+  echo "WARN: replacement workflows not both present — leaving execute-group.js in place"
+fi
+```
+If either replacement is missing, leave the old script in place and tell the
+user the upgrade didn't fully install the new workflows (re-run the
+installer). A working monolith beats a half-removed split.
+## Note on in-flight runs
+Completion Reports already written to `.claude/verification/group-*-report.json`
+by the old workflow remain valid — the completion gate reads the same
+`per_plan[].status`, `checkpoints.cp3_group`, and `checkpoints.integration`
+fields, which the new `execute-group-complete.js` preserves. No report
+migration is needed.

package/templates/skills/execute-group/SKILL.md CHANGED

@@ -2,12 +2,12 @@
 name: execute-group
 description: |
   Run one parallel plan group produced by /generate-plan-groups. Validates
-  the group hasn't drifted, then launches the execute-group workflow:
-  cabinet pre-review, parallel worktree implementation, sequential merge with
-  per-plan review, integration check, informed final review, and a completion
-  report. Use when: "execute group", "run group", "/execute-group".
+  the group hasn't drifted, runs an interactive cabinet pre-review (CP1) you
+  decide on, then drives a two-workflow pipeline: mechanical parallel
+  implementation + merge, an operator checkpoint, then advisory review +
+  completion. Use when: "execute group", "run group", "/execute-group".
 disable-model-invocation: true
-argument-hint: "group label — e.g., '2026-05-30-1'"
+argument-hint: "group label — e.g., '2026-05-30-1' (append --advisory to skip the CP1 pause)"
 related:
   - type: skill
     name: generate-plan-groups
@@ -15,10 +15,13 @@ related:
     name: execute
   - type: file
     path: .claude/cabinet/checkpoint-protocol.md
-    role: "The cabinet checkpoint mechanism — the workflow's review agents read and follow it"
+    role: "The cabinet checkpoint mechanism — CP1 (interactive) and CP3 (advisory) both read it"
   - type: file
-    path: .claude/workflows/execute-group.js
-    role: "The orchestrator this skill launches"
+    path: .claude/workflows/execute-group-implement.js
+    role: "Stage 2 — mechanical parallel implementation + sequential merge"
+  - type: file
+    path: .claude/workflows/execute-group-complete.js
+    role: "Stage 3 — advisory CP3 + integration + completion report"
 ---
 # /execute-group — Run a Generated Parallel Plan Group
@@ -26,51 +29,64 @@ related:
 ## Purpose
 `/generate-plan-groups` decides *what can run in parallel* and persists each
-conflict-free group as pib-db `grp:` tags. This skill *runs one group*: it
-re-checks the group is still safe, then hands off to a **workflow
-orchestrator** (`execute-group.js`) that drives implementation and cabinet
-review end to end.
-**Why a workflow, not direct Agent-tool spawning:** worktree agents cannot
-spawn sub-agents (no Agent-tool access — empirically verified). So a worktree
-agent cannot run its own cabinet checkpoints. The workflow script solves this
-by being the single orchestrator: it spawns worktree agents for
-implementation AND cabinet agents for review as first-class parallel
-participants. This is the capability the old all-in-one parallel-execution
-skill could not provide.
+conflict-free group as pib-db `grp:` tags. This skill *runs one group* as a
+**three-stage pipeline with operator checkpoints between stages**:
+1. **Interactive CP1** (this skill, main session) — cabinet members pre-review
+   the plans; *you* decide whether to proceed, drop plans, or pass overrides.
+2. **Implementation workflow** (`execute-group-implement.js`) — parallel
+   worktree implementation + sequential merge. Purely mechanical, no review.
+3. **Review + completion workflow** (`execute-group-complete.js`) — advisory
+   cabinet review of the merged diff, integration check, and completion report.
+**Why this shape (and not one monolithic workflow):** an earlier design ran
+CP1 and CP3 *inside* a single workflow as autonomous gates — a cabinet `stop`
+halted the run or reverted a merge automatically. False positives in those
+gates produced expensive halts (field evidence: a CP1 halted twice in a row,
+1.6M+ tokens, on concerns the plan text already addressed). The fix:
+**judgment belongs to the operator, automation stays mechanical.** CP1 is now
+interactive (you see the findings and decide). CP3 is advisory (concerns are
+recorded, never auto-halt/revert). The only hard, automatic gate is
+`/validate` — a deterministic build check, not a judgment call.
+**Why two workflows instead of direct Agent-tool spawning:** worktree agents
+cannot spawn sub-agents (no Agent-tool access — empirically verified), so a
+worktree agent cannot run its own checkpoints. CP1 escapes this by running at
+the skill level (main session, where the Agent tool *is* available). Stages 2
+and 3 are workflows because they need to spawn worktree/merge/review agents as
+first-class parallel participants.
 ## Prerequisites
 - The group must have been produced by `/generate-plan-groups` (its plans
   carry `grp:<label>`, `grp-generated:`, and `grp-hash:` tags).
 - Plans must still have `## Surface Area` sections in their notes.
-- The Workflow tool must be available (the orchestrator runs as a workflow).
+- The Workflow tool must be available (Stages 2 and 3 run as workflows).
 ## Honest ceiling — read before relying on this
-The workflow runs the checkpoints; it does not guarantee the *review was
-thorough*. Specifically:
-- **No mid-implementation (CP2) review.** Worktree agents implement without
-  a reviewer looking over their shoulder. CP1 reviews before, CP3 reviews
-  after. For a plan whose diff is large or touches high-risk surface, run
-  `/execute <plan>` individually instead of via a group — full `/execute`
-  has the per-file-group checkpoint this path sacrifices for parallelism.
+- **No mid-implementation (CP2) review.** Worktree agents implement without a
+  reviewer watching. CP1 reviews before, CP3 reviews after. For a plan whose
+  diff is large or touches high-risk surface, run `/execute <plan>`
+  individually instead — full `/execute` has the per-file-group checkpoint
+  this path sacrifices for parallelism.
+- **CP3 is advisory.** It surfaces concerns; it does not block completion.
+  Only `/validate` blocks. A real problem invisible to `/validate` (e.g. a
+  subtle behavioral regression) will land on main with an advisory note, not
+  a revert. The operator must read the CP3 concerns in the report.
 - **Surface area is intent, not reality.** Under-declared surface area can
-  hide a semantic conflict the conflict graph missed; only CP3 catches it.
-- **Feature-file "affect" is heuristic.** Behavioral coupling not textually
-  referenced may be missed.
+  hide a semantic conflict the conflict graph missed.
 ## Workflow
-### Step 1 — Staleness guard (skill-level, BEFORE launching)
+### Step 1 — Staleness guard (BEFORE anything else)
-The persisted group is a hint, not a contract. Re-validate it against the
-*current* state before running:
+The persisted group is a hint, not a contract. Re-validate against *current*
+state:
 1. **Fetch the group's plans.** Query actions whose `tags` contain
-   `grp:<label>` (the argument). Use `pib_query` (or `node scripts/pib-db.mjs
-   query`):
+   `grp:<label>` (the argument, minus any `--advisory` flag). Use `pib_query`
+   (or `node scripts/pib-db.mjs query`):
    ```sql
    SELECT a.fid, a.text, a.notes, a.tags
    FROM actions a
@@ -81,11 +97,10 @@ The persisted group is a hint, not a contract. Re-validate it against the
 2. **Drop plans that are no longer open or lost their surface area.** Report
    each dropped plan and why.
-3. **Recompute the surface-area hash and compare.** Recompute it **exactly
-   as `/generate-plan-groups` did**: for every still-open plan in the group,
-   parse its `## Surface Area` file/dir list, concatenate all entries across
-   the group, sort, and hash. Compare to the `grp-hash:` token stored on the
-   plans.
+3. **Recompute the surface-area hash and compare.** Recompute it **exactly as
+   `/generate-plan-groups` did**: for every still-open plan in the group, parse
+   its `## Surface Area` file/dir list, concatenate all entries across the
+   group, sort, and hash. Compare to the `grp-hash:` token stored on the plans.
    - **Hash matches** → the group is current. Proceed.
    - **Hash differs** → a plan's surface area changed since grouping. **HALT:**
      > Group `<label>` has drifted since it was generated (surface areas
@@ -94,90 +109,162 @@ The persisted group is a hint, not a contract. Re-validate it against the
      Do not run a stale group — the conflict-free guarantee no longer holds.
 4. **Edge cases:**
-   - **0 plans survive** the filter → tell the user the group is empty
-     (all drifted/closed) and stop. Don't launch the workflow.
-   - **1 plan survives** → you may still launch (the workflow skips
-     group-level checkpoints for a single plan), or just suggest
+   - **0 plans survive** → tell the user the group is empty (all
+     drifted/closed) and stop. Do not launch any workflow.
+   - **1 plan survives** → you may still run it (Stages 2/3 skip the
+     group-level aggregate review for a single plan), or suggest
      `/execute <plan>` directly. Single-plan groups gain nothing from the
      parallel machinery.
 ### Step 2 — Select cabinet members
-Select the cabinet members the workflow's checkpoints will use. Use
+Select the cabinet members CP1 and CP3 will use. From
 `.claude/skills/_index.json`: members whose `standingMandate` includes
 `execute`, plus any whose file patterns match the group's aggregate surface
-area. For each, collect `{ key, agentType, path, directive }` (the
-`agentType` is the registered `cabinet-<name>` subagent; `directive` is
-`directives.execute` if present). The workflow's review agents each read
-`.claude/cabinet/checkpoint-protocol.md` and follow it, scoped to the
-checkpoint they run (group aggregate / pre-impl / post-merge).
+area. For each, collect `{ key, agentType, path, directive }` (`agentType` is
+the registered `cabinet-<name>` subagent; `directive` is `directives.execute`
+if present).
+If the project has no cabinet members, skip CP1 and tell the user the run
+proceeds without review (implementation + `/validate` only).
+### Step 2.5 — Interactive CP1 (this skill, main session)
+Spawn one Agent per selected cabinet member **in a single message** (parallel).
+Each agent reads `.claude/cabinet/checkpoint-protocol.md` (interactive CP mode,
+`pre-impl` scope), its own SKILL.md at `path`, the project briefing, and the
+plans' full notes. Each returns this verdict shape (note the **required
+`addressed_by_plan`** field — CP1 only):
+```
+CP1_VERDICT_SCHEMA:
+{
+  "cabinet_member": "name",
+  "addressed_by_plan": ["risks the plan already handles — enumerate FIRST"],
+  "verdict": "continue" | "pause" | "stop",
+  "concerns": [
+    { "description": "...", "evidence": "...", "severity": "blocking" | "advisory" }
+  ]
+}
+```
+`addressed_by_plan` is required to force plan-first review: the agent must
+enumerate what the plan already covers *before* raising concerns. A concern
+the plan explicitly handles must not be raised — this is the discipline whose
+absence caused the false-positive halts.
+**Present findings to the operator severity-first**, not verdict-first:
+1. **Blocking concerns** (any `severity: blocking`) — listed first, with
+   member + evidence.
+2. **Advisory concerns** — next.
+3. **Addressed-by-plan** — collapsed to a one-line count ("12 risks the plans
+   already cover") unless the operator asks to expand.
+Then **the operator decides** — this is the checkpoint, not an automatic gate:
+- If any agent returned `stop` or raised a blocking concern:
+  > A cabinet member recommends stopping: [concern]. Proceed anyway, drop the
+  > affected plan from this run, or abort?
+- Otherwise summarize and ask: **"Launch implementation?"**
+Capture the operator's response as:
+- `operatorOverrides`: an array of free-text directives to pass into the
+  implementation agents (e.g. "skip plan X", "watch the migration ordering").
+  Empty array if none.
+- `cp1Findings`: the structured CP1 verdicts (recorded in the final report).
+**`--advisory` flag:** if the argument includes `--advisory`, still run the
+CP1 agents and record `cp1Findings`, but **skip the operator pause** — print
+the severity-first summary and proceed straight to Stage 2 with no overrides.
+Use this for low-risk groups where you trust the plans.
+**All CP1 agents errored:** if every agent failed to return a verdict, do not
+silently proceed. Warn:
+> Cabinet review failed (no agent returned a verdict). Proceed without review
+> or abort?
+### Step 3a — Launch the Implementation workflow (Stage 2)
+Invoke the Workflow tool:
+- **script:** `.claude/workflows/execute-group-implement.js`
+- **args:**
+  ```json
+  {
+    "label": "<label>",
+    "plans": [{ "fid": "...", "text": "...", "notes": "...", "surfaceArea": "..." }],
+    "operatorOverrides": ["...optional operator directives from CP1..."]
+  }
+  ```
+  Pass `plans` and `operatorOverrides` as real JSON (not stringified).
-If the project has no cabinet members, the workflow still runs — it just
-skips the checkpoints (implementation + validate only). Say so.
+The workflow returns a structured result: `{ label, plans_implemented,
+per_plan, merged, loose_ends }`. **Present it plainly:**
+> N of M plans merged. [list any failed/parked/noop plans and their reasons
+> from `loose_ends`].
-### Step 3 — Launch the workflow
+Then **operator checkpoint:** "Continue to review + completion?" Wait for
+confirmation before Stage 3. If nothing merged, say so and ask whether to stop
+(Stage 3 has nothing to review).
-Invoke the Workflow tool with the orchestrator script and the assembled
-arguments:
+### Step 3b — Launch the Review + Completion workflow (Stage 3)
-- **script:** `.claude/workflows/execute-group.js`
+Invoke the Workflow tool:
+- **script:** `.claude/workflows/execute-group-complete.js`
 - **args:**
   ```json
   {
     "label": "<label>",
-    "plans": [{ "fid": "...", "text": "...", "notes": "...", "surfaceArea": "..." }],
+    "mergedPlans": [ ...the `merged` array from Stage 2's result... ],
+    "implPerPlan": [ ...the `per_plan` array from Stage 2's result (all plans, not just merged)... ],
     "cabinetMembers": [{ "key": "...", "agentType": "cabinet-...", "path": "...", "directive": "..." }],
+    "cp1Findings": [ ...the CP1 verdicts captured in Step 2.5... ],
     "checkpointProtocolPath": ".claude/cabinet/checkpoint-protocol.md",
     "briefingPath": ".claude/cabinet/_briefing.md"
   }
   ```
-Pass `plans` and `cabinetMembers` as real JSON arrays (not stringified).
+The workflow runs advisory CP3 over the aggregate diff, the integration check,
+and completion (it writes the Completion Report to
+`.claude/verification/group-<label>-report.json` **before** marking plans done,
+then marks merged plans done itself). It returns the Completion Report.
 ### Step 4 — Present the Completion Report
-The workflow returns a structured Completion Report. Present it plainly:
-which plans merged, which parked/failed, the checkpoint verdicts, the
-integration result, any new pib-db actions created for deferred manual ACs,
-and the `loose_ends`. The report is also the evidence the completion gate
-(`action-completion-gate.sh`) checks when a `grp:`-tagged plan is marked
-done — don't discard it.
+Present it plainly: which plans completed, the advisory CP3 concerns (call
+these out — they are the operator's to weigh), the integration result, any new
+pib-db actions created for deferred manual ACs, and the `loose_ends`. The
+report on disk is also the evidence the completion gate
+(`action-completion-gate.sh`) checks for `grp:`-tagged plans — don't discard
+it.
-If the workflow halted early (a checkpoint returned `stop`, or integration
-failed), report exactly where and why. Nothing was merged on a pre-merge
-halt; on a post-merge CP3 stop, the offending plan was reverted.
+If completion was gated (final `/validate` failed), report exactly why; the
+merged plans are left **open** for the operator to fix and re-run.
 #### Recovery steps for parked/failed plans
-After a mixed result, present explicit next steps for each status:
-- **Merged** — done. No action needed.
-- **Parked** (a merge was reverted after CP3 rejection, or /validate failed
-  post-merge) — the worktree branch is preserved. To retry this plan
-  individually with full cabinet checkpoints (including the per-file-group
-  CP2 that the group path skips), **strip its `grp:` tags first** and then
-  run `/execute <plan>`. If you don't strip the tags, the completion gate
-  will block because the Completion Report shows this plan as "parked," not
-  "merged." Strip with: `pib_update_action --tags "<non-grp-tags-only>"`.
-- **Failed implementation** — the worktree agent could not complete the
-  plan. Investigate the `deviations` in the report, fix the plan, then
-  strip the `grp:` tags and run `/execute <plan>` individually.
-- **No result** — the worktree agent errored entirely. Same recovery:
-  strip tags, retry via `/execute`.
-Re-running `/generate-plan-groups` automatically replaces stale `grp:` tags
-on any plans it re-groups — but only for plans it selects. Plans you retry
-individually should have their tags stripped before running `/execute`.
+- **Merged & completed** — done.
+- **Parked / failed implementation / no result** — the worktree branch (if
+  any) is preserved. To retry individually with full cabinet checkpoints
+  (including the per-file-group CP2 the group path skips), **strip the `grp:`
+  tags first**, then run `/execute <plan>`. If you don't strip the tags, the
+  completion gate blocks because the report shows the plan as not-merged.
+  Strip with: `pib_update_action --tags "<non-grp-tags-only>"`.
+Re-running `/generate-plan-groups` automatically replaces stale `grp:` tags on
+any plans it re-groups — but only for plans it selects.
 ## Principles
-- **The group is a hint, not a contract.** Always re-validate (Step 1)
-  before running. Regenerate freely.
-- **The workflow is the single orchestrator.** Don't try to run the
-  checkpoints from this skill — the whole point is that the workflow can
-  spawn both implementors and reviewers, and a worktree agent cannot.
+- **Judgment to the operator, automation mechanical.** CP1 is an interactive
+  decision; CP3 is advisory; `/validate` is the only automatic gate.
+- **The group is a hint, not a contract.** Always re-validate (Step 1) before
+  running. Regenerate freely.
+- **Operator checkpoints between stages.** You see CP1 findings before
+  implementation and merge results before review+completion. Each is a real
+  decision point, not a rubber stamp.
 - **Sequential merges, parallel everything else.** Merges into main are
-  serialized with `/validate` between them; CP1, implementation, and
-  per-plan CP3 run in parallel.
-- **Honest about the ceiling.** This runs the checkpoints; it does not prove
-  the review was deep. For high-risk plans, prefer individual `/execute`.
+  serialized with `/validate` between them; CP1, implementation, and CP3 run
+  in parallel.
+- **Honest about the ceiling.** For high-risk plans, prefer individual
+  `/execute`.
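
The staleness guard in Step 1 of this skill (parse each plan's `## Surface Area` list, concatenate across the group, sort, hash, compare to `grp-hash:`) can be sketched as follows. The diff does not show how `/generate-plan-groups` actually computes the hash, so everything concrete here is an assumption: the bullet-list format of the section, SHA-256, the newline join, and the helper names `parseSurfaceArea`, `groupHash`, and `isStale`.

```javascript
const crypto = require('crypto');

// Collect the entries under a "## Surface Area" heading in a plan's notes.
// Assumption: entries are markdown bullets and the section ends at the next "## " heading.
function parseSurfaceArea(notes) {
  const lines = notes.split('\n');
  const start = lines.findIndex((l) => l.trim() === '## Surface Area');
  if (start === -1) return [];
  const entries = [];
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith('## ')) break;
    const m = line.match(/^\s*[-*]\s+(.+?)\s*$/);
    if (m) entries.push(m[1].replace(/`/g, ''));
  }
  return entries;
}

// Concatenate entries across the group, sort, then hash.
// Assumptions: SHA-256 and a newline separator; the real generator may differ.
function groupHash(plans) {
  const all = plans.flatMap((p) => parseSurfaceArea(p.notes)).sort();
  return crypto.createHash('sha256').update(all.join('\n')).digest('hex');
}

// A group is stale when the recomputed hash no longer matches the stored one.
function isStale(plans, storedHash) {
  return groupHash(plans) !== storedHash;
}
```

Because the entries are sorted before hashing, the order in which plans are fetched does not affect the result; only a change to some plan's declared surface area does. A real implementation would have to reuse the generator's exact procedure, since any difference in parsing or hashing would make every group look stale.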