npm - create-claude-cabinet - Versions diffs - 0.30.0 → 0.31.1 - Mend

create-claude-cabinet 0.30.0 → 0.31.1

Files changed (14)

package/lib/cli.js +18 -1
package/package.json +1 -1
package/templates/README.md +4 -2
package/templates/cabinet/checkpoint-protocol.md +134 -0
package/templates/hooks/action-completion-gate.sh +70 -0
package/templates/skills/cc-upgrade/SKILL.md +14 -0
package/templates/skills/cc-upgrade/phases/execute-plans-rename-detect.md +77 -0
package/templates/skills/execute/SKILL.md +30 -46
package/templates/skills/execute-group/SKILL.md +183 -0
package/templates/skills/{execute-plans → generate-plan-groups}/SKILL.md +72 -89
package/templates/skills/plan/SKILL.md +2 -1
package/templates/skills/validate/phases/validators.md +37 -0
package/templates/workflows/execute-group.js +506 -0
package/templates/skills/{execute-plans → generate-plan-groups}/scripts/build-conflict-graph.js +0 -0

package/lib/cli.js CHANGED

@@ -485,7 +485,7 @@ const MODULES = {
     mandatory: false,
     default: true,
     lean: true,
-    templates: ['skills/plan', 'skills/execute', 'skills/execute-plans', 'skills/investigate'],
+    templates: ['skills/plan', 'skills/execute', 'skills/generate-plan-groups', 'skills/execute-group', 'workflows/execute-group.js', 'skills/investigate', 'cabinet/checkpoint-protocol.md'],
   },
   'compliance': {
     name: 'Compliance Stack (rules + enforcement)',
@@ -1279,6 +1279,23 @@ async function run() {
         }
       }
     }
+    // execute-plans/ → generate-plan-groups/ (the plan→parallel split).
+    // Key-matched, not version-gated: if the old key is present it needs
+    // migrating; if it isn't, this no-ops. Idempotent on re-run.
+    for (const key of Object.keys(existingManifest)) {
+      const match = key.match(/\.claude\/skills\/execute-plans\//);
+      if (match) {
+        const newKey = key.replace('skills/execute-plans/', 'skills/generate-plan-groups/');
+        // Partial-state guard: if the project already tracks the new key
+        // (a prior partial migration), keep its hash — don't clobber it with
+        // the stale execute-plans hash, which would force a needless re-copy.
+        if (!existingManifest[newKey]) {
+          existingManifest[newKey] = existingManifest[key];
+        }
+        delete existingManifest[key];
+        migrationCount++;
+      }
+    }
     // Future manifest key migrations go here
     if (migrationCount > 0) {
       console.log(`  🔄 Migrated ${migrationCount} manifest key${migrationCount === 1 ? '' : 's'} for directory rename`);
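The key migration added to `run()` above can be read as a small pure function. What follows is a minimal sketch, not the package's actual code: `migrateManifestKeys` and the sample hashes are hypothetical names introduced for illustration, and it assumes the manifest is a plain object mapping file paths to content hashes.

```javascript
// Hypothetical helper (not part of create-claude-cabinet): re-key manifest
// entries after a directory rename, mirroring the loop in the hunk above.
function migrateManifestKeys(manifest, oldSegment, newSegment) {
  let migrated = 0;
  // Object.keys() returns a snapshot, so mutating the object in the loop is safe.
  for (const key of Object.keys(manifest)) {
    if (!key.includes(oldSegment)) continue;
    const newKey = key.replace(oldSegment, newSegment);
    // Partial-state guard: keep an already-tracked new key's hash.
    if (!manifest[newKey]) {
      manifest[newKey] = manifest[key];
    }
    delete manifest[key];
    migrated++;
  }
  return migrated;
}

// Sample data is invented for the example.
const manifest = {
  '.claude/skills/execute-plans/SKILL.md': 'hash-old',
  '.claude/skills/plan/SKILL.md': 'hash-plan',
};
const OLD = 'skills/execute-plans/';
const NEW = 'skills/generate-plan-groups/';

const firstRun = migrateManifestKeys(manifest, OLD, NEW);  // migrates one key
const secondRun = migrateManifestKeys(manifest, OLD, NEW); // nothing left to do

// A manifest left half-migrated by an earlier run keeps the newer hash.
const partial = {
  '.claude/skills/execute-plans/SKILL.md': 'hash-stale',
  '.claude/skills/generate-plan-groups/SKILL.md': 'hash-current',
};
migrateManifestKeys(partial, OLD, NEW);
```

Because the function only acts when an old key is present, running it a second time is a no-op, which is the idempotence the diff's comment describes.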

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "create-claude-cabinet",
-  "version": "0.30.0",
+  "version": "0.31.1",
   "description": "Claude Cabinet — opinionated process scaffolding for Claude Code projects",
   "bin": {
     "create-claude-cabinet": "bin/create-claude-cabinet.js"

package/templates/README.md CHANGED

@@ -27,7 +27,7 @@ templates, see [EXTENSIONS.md](EXTENSIONS.md).
 | `rules/enforcement-pipeline.md` | Generic enforcement pipeline: capture, classify, promote, encode, monitor. Describes the compliance stack and promotion criteria. |
 | `rules/memory-capture.md` | When and how to capture memories via /cc-remember to the per-file curated layout at ~/.claude/projects/<slug>/memory/. What to capture, what not to, cadence guidance. |
-### Skills (22 workflow + 31 cabinet members)
+### Skills (24 workflow + 31 cabinet members)
 **Workflow Skills:**
@@ -39,7 +39,8 @@ templates, see [EXTENSIONS.md](EXTENSIONS.md).
 | `skills/debrief/` | Session close. Inventory work, close items, run cabinet consultations, update state, persist, record lessons. 9 phase files. |
 | `skills/debrief-quick/` | Quick debrief variant — core phases only, skip presentation. |
 | `skills/execute/` | Execute a plan with cabinet member checkpoints. 3-checkpoint protocol (pre-implementation, per-file-group, pre-commit). 5 phase files. |
-| `skills/execute-plans/` | Batch execution of multiple plans with conflict detection. |
+| `skills/generate-plan-groups/` | Scheduler: find plans with surface-area declarations, build a conflict graph, persist conflict-free parallel groups as pib-db `grp:` tags. Does not execute — hands each group to /execute-group. |
+| `skills/execute-group/` | Runner: execute one generated group via the `execute-group.js` workflow — cabinet pre-review, parallel worktree implementation, sequential merge with per-plan review, integration, informed final review, completion report. |
 | `skills/cc-extract/` | Analyze project artifacts and propose upstream extraction candidates for Claude Cabinet. |
 | `skills/investigate/` | Structured codebase exploration: frame, observe, hypothesize, test, conclude. |
 | `skills/cc-link/` | Set up local development linking for Claude Cabinet source repo work. |
@@ -103,6 +104,7 @@ mandates and scoped directives.
 | `cabinet/eval-protocol.md` | Structured assessment framework for evaluating skill/cabinet member effectiveness. |
 | `cabinet/lifecycle.md` | When to adopt, retire, and assess cabinet members. |
 | `cabinet/output-contract.md` | How cabinet members produce structured findings for the audit system. |
+| `cabinet/checkpoint-protocol.md` | The cabinet checkpoint mechanism (member selection, verdict schema, escalation) shared by /execute and /execute-group — read, not copied, so both stay in sync. |
 | `cabinet/prompt-guide.md` | Craft knowledge for writing cabinet member prompts. 17 principles. |
 ### Scripts (12)

package/templates/cabinet/checkpoint-protocol.md ADDED

@@ -0,0 +1,134 @@
+# Cabinet Checkpoint Protocol
+The single source of truth for how cabinet members review work in
+progress. `/execute` and `/execute-group` both **read this file and
+follow it** rather than copying the mechanism — so a change here flows to
+every checkpoint, everywhere, with no copy-drift.
+A checkpoint is a chance to stop before the cost of fixing goes up. The
+mechanism is the same at every scale; only the **scope** of what's
+reviewed changes.
+## When you are told to "follow the checkpoint protocol scoped to X"
+The caller names a scope. The scope determines what each spawned agent
+reviews — everything else (how to spawn, what to collect, how to
+escalate) is identical.
+| Scope | Reviews | Runs |
+|-------|---------|------|
+| `pre-impl` | The plan text + the list of files it will change | Before any code is written |
+| `this file group` | The git diff for one logical group of changed files | After each file group is implemented |
+| `pre-commit` | The full git diff of all changes | After implementation, before commit |
+| `this group's aggregate` *(group runs only)* | The combined diff of all plans in a parallel group | After a parallel group merges |
+A *parallel group* (the last row) is `/execute-group`'s unit of work: a
+set of conflict-free plans implemented concurrently in separate worktrees,
+then merged together. `/execute` never exercises that scope — it runs one
+plan at a time and uses only the first three.
+## Step 1 — Select which members to spawn
+Spawn one Agent per cabinet member that matches **either**:
+- **Standing mandate** — `standingMandate` includes the current verb
+  (`execute`). Read `.claude/skills/_index.json` to find them. These run
+  at every checkpoint regardless of surface area.
+- **Surface area** — a file in the reviewed scope matches the member's
+  file patterns, or a keyword in the plan description matches the
+  member's topic keywords.
+Fall back to reading `cabinet-*/SKILL.md` frontmatter if the index is
+missing.
+**Err toward inclusion.** A member that activates unnecessarily costs a
+few seconds; one that stays silent when it was needed costs rework. For
+`this file group` scope, narrow to members matching *that group's* files
+— a member reviewing 3 changed files gives sharper feedback than one
+reviewing 30.
+If the project has no cabinet members, skip the checkpoint and proceed —
+checkpoints add depth, not structure.
+## Step 2 — Spawn the agents (concurrently)
+Spawn the selected members concurrently — they don't depend on each
+other. **How** you spawn depends on the caller:
+- From `/execute` (main session): issue all Agent-tool calls in a single
+  message so they run in parallel.
+- From `/execute-group` (workflow script): issue the spawns as `agent()`
+  calls inside a `parallel()` block. Worktree agents cannot spawn
+  reviewers themselves — the workflow orchestrator does it.
+Either way, each spawned agent receives:
+- The cabinet member's full `SKILL.md` content
+- Essential project briefing from `.claude/cabinet/_briefing.md` (read it
+  once, reuse for every agent)
+- The member's `directives.execute`, if present — paste it in to sharpen
+  the member's focus
+- **The scoped material:** plan text + file list (`pre-impl`), or the
+  relevant git diff (`this file group`, `pre-commit`, aggregate)
+- An instruction to return the verdict object below
+**Plan-first review discipline (critical for `pre-impl` scope):** at
+`pre-impl` scope, the agent receives the plan's full notes. The plan IS
+the primary input — it may already address common risks (auth, validation,
+XSS, race conditions). The agent MUST:
+1. **Read the plan text first.** Understand what the plan says it will do
+   and what mitigations it already includes.
+2. **Only raise concerns the plan does NOT address.** If the plan says
+   "preview action lives in Admin::TargetsController with three-layered
+   auth," do not raise "needs admin auth" as a concern — the plan already
+   covers it. Explicitly acknowledge addressed concerns rather than
+   re-raising them.
+3. **Distinguish "the codebase has this risk" from "the plan doesn't
+   mitigate this risk."** A checkpoint is not a codebase audit. The
+   question is whether THIS PLAN is safe to start — not whether the
+   codebase has pre-existing issues outside the plan's scope.
+Without this discipline, cabinet members pattern-match against codebase
+state and raise false positives that the plan already handles, wasting
+tokens on re-runs that produce the same concerns.
+## Step 3 — Collect verdicts
+Each agent returns exactly this shape:
+```json
+{
+  "cabinet_member": "name",
+  "verdict": "continue" | "pause" | "stop",
+  "concerns": [
+    { "description": "...", "evidence": "...", "severity": "blocking" | "advisory" }
+  ]
+}
+```
+## Step 4 — Apply escalation
+Collect every verdict, then:
+- **Any `stop`** → halt. Show the concern. Require an explicit override
+  from the user before proceeding.
+- **Any `pause`** → show the concern with options: proceed / address /
+  abort.
+- **3+ `pause`** → escalate to stop-equivalent (halt, require override).
+- **All `continue`** → proceed with a brief one-line summary.
+At `pre-commit` and aggregate scopes, re-check earlier `continue`
+concerns: a concern that was minor in one file group can become
+significant once all changes are viewed together.
+## Principles
+- **Cabinet members are guardrails, not gates.** The user always has the
+  final say. A `stop` requires explicit override — it is not an automatic
+  rejection.
+- **Scope tightly.** The narrower the diff a member reviews, the better
+  the feedback.
+- **The pre-commit sweep catches emergent issues.** File groups that look
+  fine alone create problems in combination — type mismatches across
+  boundaries, security gaps from API + frontend changes landing together.

package/templates/hooks/action-completion-gate.sh CHANGED

@@ -42,4 +42,74 @@ if [ "$AC_VERIFIED" != "True" ]; then
   exit 0
 fi
+# --- Group-plan gate (Piece 4) ---
+# Plans run via /execute-group carry a grp:<label> tag. For these, the
+# workflow's Completion Report is the proof that the checkpoint sequence ran.
+# The report is the workflow's own execution record — it either ran the
+# checkpoints or it didn't. (Honest ceiling: this proves the workflow ran all
+# checkpoints and they returned continue, NOT that the right members reviewed
+# or that the review was deep. Auto-upgrades when cabinet subagent identity
+# becomes trustworthy.)
+#
+# Tag lookup is best-effort: if pib.db can't be read, GRP_LABEL is empty and
+# this gate is skipped — the base breadcrumb gate above still applies.
+DB_PATH="${PIB_DB_PATH:-pib.db}"
+TAGS=$(python3 -c "
+import sqlite3, sys
+try:
+    c = sqlite3.connect('$DB_PATH')
+    r = c.execute('SELECT tags FROM actions WHERE fid=?', ('$FID',)).fetchone()
+    sys.stdout.write(r[0] if r and r[0] else '')
+except Exception:
+    sys.stdout.write('')
+" 2>/dev/null)
+GRP_LABEL=$(printf '%s' "$TAGS" | tr ',' '\n' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | grep '^grp:' | head -1 | sed 's/^grp://')
+# Sanitize: group labels are date-style tokens (A-Za-z0-9_-). Strip anything
+# else before interpolating into a file path / python string — defends against
+# path traversal (../) and quote-breaking from a malformed tag.
+GRP_LABEL=$(printf '%s' "$GRP_LABEL" | tr -cd 'A-Za-z0-9_-')
+if [ -n "$GRP_LABEL" ]; then
+  # grp plans must carry the scenarios_updated field (the worktree agent
+  # records it — an empty array is fine, but absence means the agent didn't
+  # run the feature-file step).
+  HAS_SCENARIOS=$(python3 -c "import json; d=json.load(open('$BREADCRUMB')); print('scenarios_updated' in d)" 2>/dev/null)
+  if [ "$HAS_SCENARIOS" != "True" ]; then
+    echo "{\"decision\":\"block\",\"reason\":\"Action $FID (grp:$GRP_LABEL) breadcrumb is missing the scenarios_updated field. The /execute-group worktree agent records it (empty array if no e2e/features files were affected). Re-run /execute-group $GRP_LABEL so the field is written.\"}"
+    exit 0
+  fi
+  REPORT="$VERIFY_DIR/group-$GRP_LABEL-report.json"
+  if [ ! -f "$REPORT" ]; then
+    echo "{\"decision\":\"block\",\"reason\":\"Action $FID carries grp:$GRP_LABEL but its Completion Report is missing ($REPORT). Grouped plans are completed by /execute-group, which writes the report after running cabinet checkpoints and marks merged plans done itself. To complete: run /execute-group $GRP_LABEL. If you are completing this plan outside the group flow, remove the grp:$GRP_LABEL tag from its tags first.\"}"
+    exit 0
+  fi
+  VERDICT=$(python3 -c "
+import json
+try:
+    d = json.load(open('$REPORT'))
+    pp = d.get('per_plan', [])
+    me = next((p for p in pp if isinstance(p, dict) and p.get('fid') == '$FID'), None)
+    cks = d.get('checkpoints', {}) or {}
+    integ = cks.get('integration', {}) or {}
+    cp3g = cks.get('cp3_group', '')
+    if me is None: print('NOT_IN_REPORT')
+    elif me.get('status') != 'merged': print('plan-status=' + str(me.get('status')))
+    elif cp3g not in ('continue', 'skipped', 'n/a'): print('cp3_group=' + str(cp3g))
+    elif integ.get('validate') != 'pass': print('integration.validate=' + str(integ.get('validate')))
+    elif integ.get('breadcrumbs') != 'valid': print('integration.breadcrumbs=' + str(integ.get('breadcrumbs')))
+    else: print('OK')
+except Exception:
+    print('REPORT_UNREADABLE')
+" 2>/dev/null)
+  if [ "$VERDICT" != "OK" ]; then
+    echo "{\"decision\":\"block\",\"reason\":\"Action $FID (grp:$GRP_LABEL) is not cleared by its Completion Report: $VERDICT. The report must show this plan with status=merged, checkpoints.cp3_group=continue, integration.validate=pass, and integration.breadcrumbs=valid. Inspect it: cat $REPORT . If the group run did not finish cleanly, re-run /execute-group $GRP_LABEL; do not force-complete a plan the workflow parked or that failed integration.\"}"
+    exit 0
+  fi
+fi
 exit 0
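The two decisions the group-plan gate makes, extracting a sanitized `grp:` label from the tag string and checking the Completion Report, can be restated in JavaScript for readability. This is only an illustrative sketch of the logic the hook implements with shell and inline Python: `extractGroupLabel`, `reportVerdict`, and the sample report are hypothetical, and the exact failure strings differ slightly from the hook's (JavaScript prints `undefined` where Python prints `None`).

```javascript
// Hypothetical restatement of the gate's tag parsing: first grp: tag wins,
// then everything outside A-Za-z0-9_- is stripped to block path traversal.
function extractGroupLabel(tags) {
  const tag = tags
    .split(',')
    .map((t) => t.trim())
    .find((t) => t.startsWith('grp:'));
  if (!tag) return '';
  return tag.slice('grp:'.length).replace(/[^A-Za-z0-9_-]/g, '');
}

// Hypothetical restatement of the report check: returns 'OK' or the first
// failing condition, in the same order the hook evaluates them.
function reportVerdict(report, fid) {
  try {
    const perPlan = Array.isArray(report.per_plan) ? report.per_plan : [];
    const me = perPlan.find((p) => p && typeof p === 'object' && p.fid === fid);
    const checkpoints = report.checkpoints || {};
    const integration = checkpoints.integration || {};
    const cp3Group = checkpoints.cp3_group ?? '';

    if (!me) return 'NOT_IN_REPORT';
    if (me.status !== 'merged') return 'plan-status=' + String(me.status);
    if (!['continue', 'skipped', 'n/a'].includes(cp3Group)) return 'cp3_group=' + String(cp3Group);
    if (integration.validate !== 'pass') return 'integration.validate=' + String(integration.validate);
    if (integration.breadcrumbs !== 'valid') return 'integration.breadcrumbs=' + String(integration.breadcrumbs);
    return 'OK';
  } catch (err) {
    return 'REPORT_UNREADABLE';
  }
}

// Sample report invented for the example; field names follow the hook.
const sampleReport = {
  per_plan: [
    { fid: 'act-1', status: 'merged' },
    { fid: 'act-2', status: 'parked' },
  ],
  checkpoints: {
    cp3_group: 'continue',
    integration: { validate: 'pass', breadcrumbs: 'valid' },
  },
};
```

Note that the check accepts `skipped` and `n/a` for `cp3_group` as well as `continue`, matching the Python in the hook even though the block message mentions only `continue`.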

package/templates/skills/cc-upgrade/SKILL.md CHANGED

@@ -269,6 +269,20 @@ and correct path.
 `~/.claude-cabinet/omega-venv/` exists OR `~/.claude/settings.json`
 contains `omega-venv`, run `phases/omega-migration-detect.md`.
+### 2.6. Directory Rename Cleanup
+**Not version-gated — runs on any upgrade, keyed on disk presence.** When
+CC renames a skill directory, the installer re-keys the manifest but
+leaves the old directory on disk (the cleanup loop classifies it as a
+non-template file and keeps it). These phases detect and remove such
+orphans conversationally:
+- **`execute-plans/` → `generate-plan-groups/` + `execute-group/`:** if
+  `.claude/skills/execute-plans/` exists, run
+  `phases/execute-plans-rename-detect.md`.
+If the orphan directory isn't present, the phase skips silently.
 ### 3. Explain What Changed
 Read `phases/explain-changes.md` for how to present changes.

package/templates/skills/cc-upgrade/phases/execute-plans-rename-detect.md ADDED

@@ -0,0 +1,77 @@
+# execute-plans → generate-plan-groups rename detection
+In the plan→parallel-execution split, the all-in-one `/execute-plans`
+skill was divided into `/generate-plan-groups` (scheduler) and
+`/execute-group` (runner). The installer's manifest-key migration re-keys
+the tracked files for hash continuity, but it does **not** delete the old
+`.claude/skills/execute-plans/` directory on disk — the cleanup loop
+classifies it as a non-template file and keeps it. So after a mechanical
+upgrade, a project that had `execute-plans` ends up with the orphan
+directory still present, and `/execute-plans` muscle-memory keeps invoking
+the old checkpoint-dropping skill.
+This phase detects and removes that orphan.
+## Detection
+This phase proceeds only if the orphan directory is actually on disk:
+```bash
+test -d .claude/skills/execute-plans && echo "HAS_ORPHAN=1"
+```
+If the directory is absent (fresh install, or already cleaned), skip this
+phase silently — say nothing.
+## What to explain to the user
+When the orphan is present, explain the rename in plain terms:
+> `/execute-plans` has been split into two skills:
+> - **`/generate-plan-groups`** — finds plans that can run in parallel and
+>   tags them into conflict-free groups (the old Steps 1–4).
+> - **`/execute-group <label>`** — runs one group: worktree implementation
+>   *plus* cabinet checkpoints (which the old skill claimed to run but
+>   couldn't, because worktree agents can't spawn reviewers).
+>
+> Both new skills are now installed. The old `execute-plans/` directory is
+> left over from before the rename and should be removed so `/execute-plans`
+> stops resolving to the obsolete skill.
+## Removal
+The orphan is only safe to remove once its **direct replacement** —
+`generate-plan-groups` (the renamed scheduler half) — is on disk. The
+runner half, `execute-group`, may or may not be present (it ships in a
+later piece / may be deselected); its absence must NOT block removal,
+because the scheduler is the rename of the old skill. This single guard
+covers both cases:
+```bash
+if [ -f .claude/skills/generate-plan-groups/SKILL.md ]; then
+  rm -rf .claude/skills/execute-plans
+  echo "Removed orphaned .claude/skills/execute-plans/"
+  if [ ! -f .claude/skills/execute-group/SKILL.md ]; then
+    echo "Note: /execute-group (the runner) is not installed — /generate-plan-groups"
+    echo "persists groups; add the runner to execute them, or run /execute per plan."
+  fi
+else
+  echo "WARN: generate-plan-groups not found — leaving execute-plans/ in place"
+fi
+```
+Never remove the orphan if `generate-plan-groups/SKILL.md` is absent —
+that would delete the only copy of the scheduler logic.
+## Persisted-group note
+If the project has actions tagged with `grp:` tokens (from a prior
+`/generate-plan-groups` run), those tags remain valid — they reference
+plans, not the skill directory. No migration of tags is needed.
+## What this phase does NOT do
+- It does not rewrite historical pib-db actions that mention
+  `execute-plans` in their notes — that's history, left as-is.
+- It does not touch the skill index (`_index.json`) — the installer
+  regenerates that from the installed skills on every run.

package/templates/skills/execute/SKILL.md CHANGED

@@ -9,6 +9,9 @@ description: |
 related:
   - type: skill
     name: validate
+  - type: file
+    path: .claude/cabinet/checkpoint-protocol.md
+    role: "The cabinet checkpoint mechanism — read and followed at Checkpoints 1/2/3"
   - type: file
     path: .claude/skills/execute/phases/load-plan.md
     role: "Project-specific: where plans live and how to read them"
@@ -162,31 +165,14 @@ If no cabinet members exist in the project, skip all checkpoint steps
 (3, 4b, 5) and execute the plan directly. Checkpoints add depth, not
 structure.
-### 3. Checkpoint 1: Pre-Implementation Review (Parallel Agents)
-Before writing any code, **spawn one Agent per activated cabinet member**
-in a single message. Each receives:
-- The cabinet member's full SKILL.md content
-- Essential project briefing from `.claude/cabinet/_briefing.md`
-- The plan text and list of files that will change
-- Instructions to evaluate whether the plan is safe to start
-Each agent returns:
-```json
-{
-  "cabinet_member": "name",
-  "verdict": "continue" | "pause" | "stop",
-  "concerns": [
-    { "description": "...", "evidence": "...", "severity": "blocking" | "advisory" }
-  ]
-}
-```
+### 3. Checkpoint 1: Pre-Implementation Review
-**Collect all verdicts.** Apply escalation:
-- Any **stop** → halt, show concern, require explicit override from user
-- Any **pause** → show concern with options: proceed / address / abort
-- 3+ **pause** → escalate to stop-equivalent
-- All **continue** → proceed with brief summary
+Before writing any code, **read `.claude/cabinet/checkpoint-protocol.md`
+and follow it, scoped to `pre-impl`.** The protocol covers which members
+to spawn, what each receives, the verdict shape, and the escalation
+rules. The reviewed material at this scope is the plan text and the list
+of files that will change — the question each member answers is "is this
+plan safe to start?"
 ### 4. Implement (File Group by File Group)
@@ -206,22 +192,23 @@ For each group:
      versions — prop APIs change between major versions and guessing
      wastes build cycles.
 3. **Checkpoint 2: File Group Review** — if cabinet members are active,
-   spawn agents for ONLY cabinet members matching the changed files. Each
-   receives the git diff for this file group + plan context. Same
-   escalation rules as Checkpoint 1.
+   **read `.claude/cabinet/checkpoint-protocol.md` and follow it, scoped
+   to `this file group`.** The reviewed material is the git diff for this
+   file group plus plan context; member selection narrows to those
+   matching the changed files.
 4. If all continue, move to the next group
 File-group granularity keeps reviews focused. A cabinet member reviewing
 3 changed files gives better feedback than one reviewing 30.
-### 5. Checkpoint 3: Pre-Commit Sweep (Parallel Agents)
+### 5. Checkpoint 3: Pre-Commit Sweep
-After all implementation is complete, **spawn one Agent per activated
-cabinet member** in a single message. Each receives the full git diff of
-all changes + plan context.
-Earlier "continue" concerns are re-checked — a concern that was minor
-in isolation may be significant in the aggregate.
+After all implementation is complete, **read
+`.claude/cabinet/checkpoint-protocol.md` and follow it, scoped to
+`pre-commit`.** The reviewed material is the full git diff of all changes
+plus plan context. As the protocol notes for this scope, earlier
+"continue" concerns are re-checked — a concern that was minor in
+isolation may be significant in the aggregate.
 ### 6. Validate and Commit
@@ -334,18 +321,15 @@ doesn't define. Execute them at their declared position.
 ## Principles
-- **Cabinet members are guardrails, not gates.** The user always has the
-  final say. Stop verdicts require explicit override, not automatic
-  rejection.
-- **Err toward inclusion** when selecting cabinet members. Better to have
-  a cabinet member say "looks fine" than to miss a concern.
-- **File-group granularity** keeps checkpoint reviews focused. A
-  cabinet member reviewing 3 changed files gives better feedback than one
-  reviewing 30.
-- **The pre-commit sweep catches emergent issues.** Individual file
-  groups may look fine but create problems in combination (type
-  mismatches across boundaries, security gaps from API + frontend
-  changes together).
+The principles governing the checkpoints themselves — guardrails not
+gates, err toward inclusion, tight scoping, the pre-commit sweep — live
+in `.claude/cabinet/checkpoint-protocol.md` (the single source of truth
+the checkpoints read). The principle specific to `/execute`:
+- **Verify every acceptance criterion before marking work done.** The
+  checkpoints catch design and review issues; the QA gate (Step 7)
+  catches "looks complete but the AC was never actually run." Neither
+  substitutes for the other.
 ## Calibration