codebyplan 1.13.57 → 1.13.58

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/dist/cli.js +1576 -943
package/package.json +2 -2
package/templates/github-workflows/release-desktop.yml +2 -2
package/templates/skills/cbp-session-start/SKILL.md +62 -211
package/templates/skills/cbp-session-start/qa-regression.md +13 -11

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "codebyplan",
-  "version": "1.13.57",
+  "version": "1.13.58",
   "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
   "type": "module",
   "bin": {
@@ -53,7 +53,7 @@
     "@eslint/js": "^9.18.0",
     "@types/node": "^20",
     "@vitest/eslint-plugin": "^1.1.44",
-    "esbuild": "^0.28",
+    "esbuild": ">=0.28.1",
     "eslint": "^9.18.0",
     "eslint-config-prettier": "^10.0.1",
     "eslint-plugin-no-secrets": "^2.2.1",
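
The `esbuild` range change above is more than a patch bump: under npm's semver rules, `^0.28` on a `0.x` version is equivalent to `>=0.28.0 <0.29.0`, while `>=0.28.1` has no upper bound. The sketch below illustrates the difference with a hand-rolled comparison. It is not npm's resolver; it ignores prerelease tags, and the helper names (`parseVersion`, `satisfiesOldRange`, `satisfiesNewRange`) are hypothetical.

```typescript
// Minimal semver comparison sketch (no prerelease or build-metadata handling).
type Version = [number, number, number];

// Parse "major.minor.patch"; missing parts default to 0.
function parseVersion(v: string): Version {
  const [major = 0, minor = 0, patch = 0] = v.split(".").map(Number);
  return [major, minor, patch];
}

// Returns -1, 0, or 1, comparing major, then minor, then patch.
function compare(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// Old range "^0.28": for 0.x versions the caret pins the minor,
// so this is >=0.28.0 and <0.29.0.
function satisfiesOldRange(v: string): boolean {
  const ver = parseVersion(v);
  return compare(ver, [0, 28, 0]) >= 0 && compare(ver, [0, 29, 0]) < 0;
}

// New range ">=0.28.1": a lower bound only, with no ceiling.
function satisfiesNewRange(v: string): boolean {
  return compare(parseVersion(v), [0, 28, 1]) >= 0;
}
```

Under these definitions `0.28.0` satisfies only the old range, `0.28.1` satisfies both, and `0.29.0` or `1.0.0` satisfy only the new one, so the new range also admits future minor and major releases.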

package/templates/github-workflows/release-desktop.yml CHANGED

@@ -163,7 +163,7 @@ jobs:
       - name: Post release metadata to API
         env:
-          CBP_API_KEY: ${{ secrets.CODEBYPLAN_API_KEY }}
+          DESKTOP_RELEASE_TOKEN: ${{ secrets.DESKTOP_RELEASE_TOKEN }}
           VERSION: ${{ needs.check-version.outputs.version }}
         run: |
           TAG="desktop-v${VERSION}"
@@ -205,7 +205,7 @@ jobs:
           curl -fL -X POST \
             "https://www.codebyplan.com/api/desktop/releases" \
             -H "Content-Type: application/json" \
-            -H "x-api-key: ${CBP_API_KEY}" \
+            -H "x-release-token: ${DESKTOP_RELEASE_TOKEN}" \
             -d "$(jq -n \
               --arg version "${VERSION}" \
               --arg notes "${NOTES}" \

package/templates/skills/cbp-session-start/SKILL.md CHANGED

@@ -11,14 +11,14 @@ Activate the session, open a fresh session log, and surface the previous log's p
 ## Instructions
-Run Steps 0 through 5.8 silently (no intermediate output) — except Step 0 aborts the session on MCP failure, Step 1.5 may surface a one-line infra-drift nudge, Step 1.55 may surface a one-line architecture-map drift nudge, Step 1.6 may run an install-and-halt path, Step 1.7 may surface a one-line LSP binary nudge, Step 4.5 may auto-resume a handoff and exit session-start entirely (no Step 6 output), and Step 5.7 may surface an approval gate. (Step numbers are organizational labels; execution order is 0 → 1 → 1.5 → 1.55 → 1.6 → 1.7 → 3 → 4 → 4.5 → 5 → 5.7 → 5.8 → 6 → 7.) Produce ONE output block at Step 6, then auto-trigger or stop per Step 7.
+Run Step 0 silently (hard gate). Then run `codebyplan session start --json` (Step 1 through Step 5.8 collapsed into one CLI call). Parse the envelope and print the `rendered_block`. Apply Claude-side Step 5.7 (interactive commit gate) and Step 7 (routing) from the envelope. Produce ONE output block, then auto-trigger or stop per Step 7.
 ### Step 0: MCP Health Gate
 Call MCP `health_check` tool. **The MCP connection is vital — this is a hard gate.**
-- **If succeeds**: Continue silently to Step 1.
-- **If fails**: Print the error below and **STOP the entire session-start**. Do NOT continue to Step 1 or any later step — no config load, no `update_session_state(activate)`, no `create_session_log`, no `/cbp-todo` trigger:
+- **If succeeds**: Continue silently to the orchestrator.
+- **If fails**: Print the error below and **STOP the entire session-start**. Do NOT continue — no activate, no create-log, no `/cbp-todo` trigger:
   ```
   ✖ MCP connection failed — session-start aborted. Check:
@@ -29,237 +29,88 @@ Call MCP `health_check` tool. **The MCP connection is vital — this is a hard g
   Fix the connection, then run /cbp-session-start again.
   ```
-### Step 1: Load Config
+### Steps 1–5.8: Orchestrator
-Read per-concern config files from the project root. Single load point for the session:
-- `repo_id` (UUID) — from `.codebyplan/repo.json`, required for all MCP operations
-- `git_branch` — from `.codebyplan/git.json`
-Resolve `worktree_id` at runtime using the structured JSON form:
+Run the CLI orchestrator and parse the JSON envelope it emits:
 ```bash
-RESOLVE_JSON=$(npx codebyplan resolve-worktree --json)
-# → {"worktree_id":"<uuid>|null","error_kind":null|"<kind>"}
+codebyplan session start --json
 ```
-Extract `worktree_id` and `error_kind` from the JSON output.
-- `error_kind` is `null` or `"tuple_miss"` → healthy. `WORKTREE_ID` = `worktree_id` (may be `null`: a legitimate main-repo or unregistered-worktree case — proceed normally; the server resolves worktree identity from the JWT/ctx, falling back to the repo main-worktree when no specific worktree is matched).
-- `error_kind` is `local_config_read_failed`, `local_config_write_failed`, `legacy_file_blocks_dir`, `api_failed`, `git_failed`, or `unhandled` → **broken local state**. Hold the `error_kind` for Step 6 to display as a distress warning. Session continues (non-blocking — unlike `/cbp-todo`, session-start does NOT hard-stop on a non-tuple-miss distress).
-Pass `WORKTREE_ID` to MCP tools that support it. Null `WORKTREE_ID` means the (device, path, branch) tuple is unregistered — note this for Step 6.
-### Step 1.5: Infra Drift Check
-Surface — never block — when this worktree's source-monorepo `.claude/` infra has fallen behind. Runs after Step 1 and may add one line to the Step 6 output (`$PRODUCTION` is `branch_config.production` read in Step 1). Consumer-repo package-version freshness is handled by Step 1.6 (the freshness gate), not here:
-- **Monorepo (concept A)** — both `packages/codebyplan-package/templates/` and `scripts/infra-drift.mjs` exist. Refresh `origin/$PRODUCTION` best-effort first, then run the reporter:
-  ```bash
-  git fetch origin "$PRODUCTION" 2>/dev/null || true
-  node scripts/infra-drift.mjs 2>/dev/null || true
-  ```
-  The script self-guards (feat branch + behind > 0) and emits at most one `⚠ .claude/ infra is N behind — run /cbp-refresh-infra` line. Hold any output for Step 6.
-- **Neither** → skip silently.
-Fully non-blocking; every failure path falls through with no output.
-### Step 1.55: Architecture-Map Drift Check
-Surface — never block — when this repo's generated `.claude/architecture/` maps have
-gone stale (a mapped module's source was committed past the SHA stamped in
-`.codebyplan/architecture.json`). Runs after the infra-drift check (Step 1.5) and may add
-one line to the Step 6 output. This is the freshness nudge for the architecture-map
-pipeline (mirrors the infra-drift pattern; the analog for first-party code maps):
-- **No manifest** — `.codebyplan/architecture.json` absent → skip silently (the repo has
-  no maps yet; nothing to drift-check).
-- **Manifest present** — run the drift probe best-effort and count drifted modules:
-  ```bash
-  npx codebyplan arch-map drift 2>/dev/null | grep -c '^[[:space:]]*DRIFTED' || true
-  ```
-  If the count is `> 0`, hold this single line for Step 6:
-  ```
-  ⚠ .claude/architecture is N module(s) stale — run /cbp-refresh-arch-map
-  ```
-  A count of `0` (all maps fresh, no stamped modules, or any probe failure) emits nothing.
-Fully non-blocking; every failure path falls through with no output. The `arch-map` CLI is
-itself hook-safe (exits 0 on any internal error), so a missing or too-old `codebyplan`
-binary simply yields a zero count and no line.
-### Step 1.6: Package Freshness Gate
-Check whether a newer `codebyplan` is published and safe to auto-install on this worktree's current branch. Runs AFTER the architecture-map drift check (Step 1.55) and BEFORE session activation (Step 3).
-Run `codebyplan session freshness-gate --halt-on-update` and parse the JSON output (`{ result: 'skipped'|'guarded'|'up_to_date'|'updated'|'error', ... }`):
-- **Probe failed** — the command errored or output cannot be parsed as JSON. → **FAIL-SAFE SKIP**: proceed silently to Step 3. Never disrupt a session over a best-effort freshness probe — the MCP gate (Step 0) is the only vital gate.
-- **`result === 'skipped'` / `'guarded'` / `'up_to_date'`** → skip silently, proceed to Step 3. Gate on the `result` field only.
-- **`result === 'error'`** → fail-safe skip, proceed to Step 3.
-- **`result === 'updated'`** → the CLI already ran the install + `npx codebyplan claude update`. Parse the JSON response:
-  - If `changed_files[]` is present and non-empty, offer the same commit gate as Step 5.7:
-    ```
-    codebyplan updated. Commit the resulting .claude/ and .codebyplan/ changes before exiting?
-    [list of changed paths under .claude/ and .codebyplan/]
-    Reply: yes | no | select
-    ```
+The orchestrator performs (in order, all fail-safe):
+1. Resolve `repo_id` from `.codebyplan/repo.json`
+2. `codebyplan resolve-worktree --json` → `worktree_id`, `worktree_error_kind`
+3. `codebyplan session freshness-gate --halt-on-update` → update_halt short-circuit (no activate, no create-log when halt)
+4. Infra-drift nudge (monorepo-only), arch-map drift nudge, LSP binary nudge
+5. Read previous session log from `.codebyplan/state/session/current.json` (sync on miss)
+6. Handoff freshness probe (Step 4.5 gate — uses previous row before create-log overwrites it)
+7. `codebyplan session update-state --action activate` (write-through)
+8. `codebyplan session create-log` → `session_log_id`
+9. Infra-files set math: active task round files subtracted from `git status --porcelain`
+10. `get_checkpoints({ repo_id, status:'active' })` → ownership partition (owned vs cross)
+11. Compute `next_action` + render the Step 6 output block
+The envelope shape:
+```typescript
+{
+  status: 'ok' | 'update_halt',
+  session_log_id: string | null,
+  worktree_id: string | null,
+  worktree_error_kind: string | null,
+  infra_drift_nudge: string | null,
+  arch_map_nudge: string | null,
+  lsp_nudges: string[],
+  handoff: { fresh: boolean, command?: string, context?: object, state?: unknown },
+  infra_files_to_commit: string[],
+  owned_checkpoints: Array<{ id: string, title: string }>,
+  cross_checkpoints: Array<{ id: string, title: string }>,
+  owned_count: number,
+  total_count: number,
+  next_action: 'mcp_update_halt' | 'resume_handoff' | 'commit_then_todo' | 'trigger_todo' | 'stop',
+  rendered_block: string,
+  previous_session: { title?: string, summary?: string, pending?: string } | null,
+}
+```
-    On `yes`: `git add` the listed paths only, then trigger `/cbp-git-commit`. On `no`: skip. On `select`: ask which subset.
-  - **HALT** — do NOT proceed to Step 3. Print:
+**Parse and branch**:
-    ```
-    ✓ codebyplan updated. Start a FRESH Claude Code session
-    (run /clear or open a new window) so the updated .claude/ takes effect.
-    ```
+- `status === 'update_halt'` → print `rendered_block` (the update-halt message) and **STOP**. No further writes, no `/cbp-todo` trigger.
+- Otherwise → print `rendered_block` (the Step 6 output block), then proceed to Step 5.7 and Step 7.
-    On this update-and-halt path the session is NOT continued: `update_session_state(activate)` is NOT called, `create_session_log` is NOT called, and `/cbp-todo` is NOT triggered.
+### Step 5.7: Commit Non-Task Files (Claude-side)
-Populate the claude-status cache best-effort (pure cache population — never gates session-start):
+Driven by `envelope.infra_files_to_commit[]`. If non-empty, present once:
-```bash
-npx codebyplan claude status --write-cache --quiet 2>/dev/null || true
 ```
+Commit these non-task files before starting session?
+[list of infra_files_to_commit]
-### Step 1.7: LSP Binary Nudge
-Surface — never block — any language-server binaries still missing for this repo's tech stack. Runs after the freshness gate (Step 1.6) and before session activation (Step 3); may add one line per missing binary to the Step 6 output. Fully non-blocking — every failure path falls through with no output.
-```bash
-LSP_NUDGE=$(npx codebyplan lsp --check 2>/dev/null || true)
+Reply: yes | no | select
 ```
-`codebyplan lsp --check` reads the committed `.codebyplan/lsp.json` (written by a prior project-scope `codebyplan claude install`/`update`), re-checks each LSP server binary on PATH, prunes any now-resolved `.codebyplan/todo/session-start/` nudge file, and prints ONE install-hint line per still-missing binary (e.g. `npm i -g typescript-language-server`). When `.codebyplan/lsp.json` is absent or every binary is already on PATH, it prints nothing. If `$LSP_NUDGE` is non-empty, hold each line for the Step 6 output prefixed with `LSP:`; otherwise skip silently.
-### Step 3: Update Session State
-Run `codebyplan session update-state --action activate` (CLI write-through: writes `.codebyplan/state/session/state.json` + REST). This deactivates all other repos automatically. Break-glass fallback: MCP `update_session_state` with action `activate` when the CLI is unavailable. Note: the CLI validates `--action` — only `activate`/`deactivate` are accepted; a missing, valueless, or invalid value exits 1 with a usage message.
-Note: Step 0 `health_check` stays MCP unconditionally — it tests MCP connectivity itself and must not be replaced.
-### Step 4: Read Last Session Log
-Read `.codebyplan/state/session/current.json` (local-first). If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass fallback: MCP `get_session_logs({ repo_id, worktree_id, limit: 1 })` when the state dir is absent and sync fails (daemon-dead + CLI-unavailable).
-Take the first row — same inclusive-worktree scope as Step 4.5 so the previous-session display and the handoff probe agree on which row is "most recent for this worktree".
-- If a previous log exists, hold its title/summary/pending items for the Step 6 output so the user sees where they left off.
-- If none exists (first session ever for this worktree), skip silently.
-### Step 5: Create This Session's Log
-Run `codebyplan session create-log --started-at <now> --repo-id <repo_id> --worktree-id <WORKTREE_ID>` (CLI write-through: writes `.codebyplan/state/session/current.json` + REST). Create it **even if empty** — this establishes the record for session-end to finalize. Break-glass fallback: MCP `create_session_log` when the CLI is unavailable.
-Minimal seed content:
-- `started_at`: now
-- `repo_id` from config; `worktree_id` from `WORKTREE_ID` resolved in Step 1
-- `summary`: empty (session-end fills this in)
-Hold the new log's ID in context so `/cbp-session-end` can update the same record.
-### Step 4.5: Handoff Auto-Resume Probe
-Probe the most-recent closed session log for a structured handoff payload (the handoff freshness-gate contract is specified inline in this step) and auto-resume directly into the captured command when fresh. Additive — placed BEFORE the existing `/cbp-todo` auto-trigger; ALL failure paths fall through silently to Step 7.
-1. Reuse the row held from Step 4 (held from Step 4 in memory — do NOT re-read from disk here; at this point `session/current.json` still holds the previous session row, which Step 5 will overwrite).
-2. **Defensive gates** (any failure → silent fall-through to Step 7):
-   - No row returned → fall through.
-   - Row missing `closed_at` (orphan / still-open session) → fall through.
-   - `row.content` is `null` → no handoff captured at end-of-session → fall through.
-   - `row.content` exists but parse throws or shape mismatch (`command` field absent OR is an empty string) → fall through.
-3. **Freshness gate** — load the row as `handoff = row.content` (per CHK-111 Migration A column alias). Mark stale when ANY of:
-   - `(now - row.closed_at) > freshness_window_hours` (read from `.codebyplan/repo.json`, default 24 hours)
-   - Referenced entity in `handoff.context` has shifted. For each id present, read the matching local state file and check `updated_at`:
-     - `checkpoint_id` → read `.codebyplan/state/checkpoints/<checkpoint_id>.json` (local-first; sync + MCP break-glass if missing)
-     - `task_id` → read `.codebyplan/state/checkpoints/<checkpoint_id>/tasks/<task_id>.json` (local-first; sync + MCP break-glass if missing)
-     - `round_id` → read `.codebyplan/state/checkpoints/<checkpoint_id>/tasks/<task_id>/rounds/<round_id>.json` (local-first; sync + MCP break-glass if missing)
-       Then compare `entry.updated_at > handoff.captured_at` → stale on any inequality.
-   - Local file missing after sync attempt → stale (referenced entity gone or moved out of reach).
-   - `handoff.context.checkpoint_id` resolves to a checkpoint whose `worktree_id` is non-null AND (caller `WORKTREE_ID` is `null` OR differs from `checkpoint.worktree_id`) → stale (a fresh handoff for another worktree's work — or for assigned work this caller cannot confirm ownership of — must not auto-resume here). Mirrors the cbp-todo Step 1.5 ownership rule.
-4. **On stale OR any defensive gate hit**: fall through silently to Step 7 (existing `/cbp-todo` trigger).
-5. **On fresh hit**: trigger `handoff.command` directly with `handoff.context` / `handoff.state` in the trigger arguments. The downstream skill self-loads its full context — do NOT duplicate `/cbp-todo` Step 2's context-loading matrix here. Skip Step 5.7, Step 6 output, and Step 7.
-### Step 5.7: Commit Non-Task Files
-Clean the working tree of leftover infra before the session begins. Only commit files that are **not** part of an unfinished task.
-1. Resolve the active task's round files (local-first):
-   - Read `.codebyplan/state/todos.json` (local-first) to identify the active task id. If missing/stale, run `npx codebyplan sync` once and re-read. Break-glass fallback: MCP `get_current_task(repo_id)` when the state dir is absent and sync fails.
-   - If active task exists: read `.codebyplan/state/checkpoints/<checkpoint_id>/tasks/<task_id>/rounds/` and filter to rounds with status not in `completed` / `cancelled`; collect their `files[]` → `task_files` set. Break-glass fallback: MCP `get_rounds(task_id)`.
-   - If no active task exists, `task_files` is empty.
-2. Run `codebyplan session infra-files --json --task-files "<csv>"` where `<csv>` is the comma-separated task files from step 1. Parse the JSON response (`{ infra_files: string[], task_files: string[], note?: string }`). The CLI re-runs `git status --porcelain` internally and applies the set-math deterministically — the race-safe recompute, reading the index after Steps 0–5 round-trips complete.
-3. If `infra_files` is empty → skip. Otherwise present once:
-   ```
-   Commit these non-task files before starting session?
-   [list of infra_files]
-   Reply: yes | no | select
-   ```
-4. On `yes`: `git add` the listed files, then trigger `/cbp-git-commit` (it handles conventional message + commit).
-   On `no`: skip. On `select`: ask which subset.
+On `yes`: `git add` the listed files, then trigger `/cbp-git-commit`.
+On `no`: skip. On `select`: ask which subset.
 Non-blocking — session start proceeds either way.
-### Step 5.8: Resolve Ownership
-Call MCP `get_checkpoints({ repo_id, status: 'active' })` (MCP only — deliberate: the active-filter query has no local-mirror equivalent). Partition results into:
-- `owned[]` — entries where `checkpoint.worktree_id === WORKTREE_ID`, OR both are `null`
-- `cross_worktree[]` — entries where `checkpoint.worktree_id` is non-null AND differs from `WORKTREE_ID` (includes the case where caller `WORKTREE_ID` is `null` but the target has a non-null `worktree_id`)
-Hold `owned_count = owned.length`, `total_count = owned.length + cross_worktree.length`, `owned_names` (CHK-NNN + title for each owned entry), and `cross_names` (CHK-NNN + name for each cross-worktree entry). These values are consumed by Step 6 and Step 7 — single MCP call, no duplicate round-trips.
-### Step 6: Output
-```
-Session active | Worktree: [worktree_id or "unregistered"]
-[⚠ resolve-worktree: <error_kind> — local state is broken; routing may be unreliable. Run `npx codebyplan setup` to repair. — only when error_kind is non-null and not tuple_miss]
-[⚠ .claude/ infra is N behind — run /cbp-refresh-infra — only when Step 1.5 found drift]
-[⚠ .claude/architecture is N module(s) stale — run /cbp-refresh-arch-map — only when Step 1.55 found drift]
-Previous session: [title or "none"]
-  Pending: [pending items from previous log, or "—"]
-Ownership: [total_count] active CHK(s), [owned_count] owned by this worktree
-[Owned: CHK-NNN (title), … — only when owned_count > 0]
-[Cross-worktree: CHK-ZZZ (name), … — only when total_count > owned_count]
-[LSP: <install hint> — one line per still-missing LSP binary held from Step 1.7, only when LSP_NUDGE is non-empty]
-[⚠ Worktree unregistered — run `npx codebyplan setup` to register — only when WORKTREE_ID is null and no resolver distress was already shown]
-```
-READ-ONLY — this block never proposes reassignment, release, or lock transfer of cross-worktree checkpoints.
 ### Step 7: Auto-trigger
-Three-branch gate using `owned_count` and `total_count` from Step 5.8:
+Route from `envelope.next_action`:
-- **`owned_count >= 1`**: trigger `/cbp-todo` (owns active work — proceed as today).
-- **`total_count >= 1` AND `owned_count === 0`**: **STOP** — do NOT auto-trigger `/cbp-todo`. The Ownership block shown in Step 6 already communicates the situation; the user must switch to the owning worktree or start new work explicitly.
-- **`total_count === 0`** (no active checkpoints anywhere): trigger `/cbp-todo` (idle path — leads to checkpoint-create or session-end).
+- **`mcp_update_halt`** — already handled above (STOP after printing rendered_block).
+- **`resume_handoff`** — trigger `envelope.handoff.command` directly with `envelope.handoff.context` / `envelope.handoff.state`. Skip Step 5.7.
+- **`commit_then_todo`** — run Step 5.7 (infra commit gate), then trigger `/cbp-todo`.
+- **`trigger_todo`** — trigger `/cbp-todo` (owns active work, or idle path).
+- **`stop`** — do NOT auto-trigger `/cbp-todo`. The Ownership block in `rendered_block` communicates the situation; the user must switch to the owning worktree or start new work explicitly.
 ## Integration
 - **Triggered by**: user invocation, `/clear` recovery
-- **Resolves**: `npx codebyplan resolve-worktree --json` (worktree id + distress signal; non-tuple-miss distress is non-blocking at session-start)
-- **Reads**: `.codebyplan/repo.json`, `.codebyplan/git.json` (`branch_config.production` for Step 1.5 infra-drift fetch), MCP `health_check` (Step 0 hard gate — stays MCP unconditionally); local-first reads (with `npx codebyplan sync` + MCP break-glass): `.codebyplan/state/session/current.json` (Step 4 previous log + Step 4.5 handoff probe), `.codebyplan/state/checkpoints/<id>.json` / `tasks/<id>.json` / `rounds/<id>.json` (Step 4.5 freshness probe), `.codebyplan/state/todos.json` (Step 5.7 active-task lookup); MCP `get_checkpoints({ repo_id, status: 'active' })` (Step 5.8 ownership partition — MCP only, no local mirror for active-filter query); `scripts/infra-drift.mjs` + a best-effort `git fetch` (Step 1.5 monorepo drift); `npx codebyplan arch-map drift` + `.codebyplan/architecture.json` presence (Step 1.55 architecture-map drift nudge, non-blocking); `codebyplan session freshness-gate --halt-on-update` (Step 1.6 package-freshness gate); `codebyplan session infra-files --json --task-files <csv>` (Step 5.7 infra-file set math); `npx codebyplan lsp --check` (Step 1.7 LSP binary nudge — reads `.codebyplan/lsp.json`, non-blocking). Reads at Step 3 and later do NOT fire on a Step 0 MCP hard-fail or the Step 1.6 update-and-halt path
-- **Writes**: `codebyplan session create-log` (Step 5 — CLI write-through; break-glass: MCP `create_session_log`), `codebyplan session update-state --action activate` (Step 3 — CLI write-through to `.codebyplan/state/session/state.json`; break-glass: MCP `update_session_state`) — both SKIPPED on a Step 0 MCP hard-fail and on the Step 1.6 update-and-halt path
+- **Reads**: MCP `health_check` (Step 0 hard gate — stays MCP unconditionally); `codebyplan session start --json` (Steps 1–5.8 — the orchestrator reads `.codebyplan/repo.json`, `.codebyplan/git.json`, `.codebyplan/state/session/current.json`, `.codebyplan/state/todos.json`, `.codebyplan/state/checkpoints/` entity files, `scripts/infra-drift.mjs`, `.codebyplan/architecture.json`, `.codebyplan/lsp.json`)
+- **Writes**: orchestrator calls `codebyplan session update-state --action activate` (Step 7) and `codebyplan session create-log` (Step 8) — both SKIPPED on Step 0 hard-fail and on `status: update_halt`
 - **Spawns**: none
-- **Triggers**: `/cbp-git-commit` (conditional, on user approval at Step 5.7 or the Step 1.6 update path), `handoff.command` (on fresh handoff hit at Step 4.5), `/cbp-todo` (auto fall-through when owned_count >= 1 or total_count === 0; STOPS with no trigger when total_count >= 1 AND owned_count === 0; NOT triggered on a Step 0 hard-fail or the Step 1.6 update-and-halt path)
+- **Triggers**: `/cbp-git-commit` (conditional, on user approval at Step 5.7), `envelope.handoff.command` (on `next_action: resume_handoff`), `/cbp-todo` (on `next_action: trigger_todo` or `commit_then_todo`); STOPS with no trigger on `next_action: stop` or `mcp_update_halt`
 - **Paired with**: `/cbp-session-end`
-- **Pairs with**: `/cbp-session-end` Step 1.3 (handoff write-path; the freshness-gate contract is specified inline in Step 4.5 above)
+- **Pairs with**: `/cbp-session-end` Step 1.3 (handoff write-path; the freshness-gate contract is implemented inside `codebyplan session start`)
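
The new SKILL.md reduces the Claude-side logic to "parse the envelope, print `rendered_block`, route on `next_action`". A minimal sketch of that routing follows, using only the envelope fields the diff lists. The `Action` type and `planActions` function are hypothetical names for illustration, not part of the `codebyplan` CLI. Two readings are assumptions: the commit gate runs only on `commit_then_todo`, and a `resume_handoff` envelope without `handoff.command` is treated as an error.

```typescript
type NextAction =
  | "mcp_update_halt"
  | "resume_handoff"
  | "commit_then_todo"
  | "trigger_todo"
  | "stop";

// Subset of the envelope fields needed for routing.
interface Envelope {
  status: "ok" | "update_halt";
  next_action: NextAction;
  rendered_block: string;
  infra_files_to_commit: string[];
  handoff: { fresh: boolean; command?: string; context?: object; state?: unknown };
}

// Hypothetical description of what the skill should do next.
type Action =
  | { kind: "print"; text: string }
  | { kind: "commit_gate"; files: string[] }
  | { kind: "trigger"; command: string; context?: object | undefined; state?: unknown }
  | { kind: "stop" };

function planActions(env: Envelope): Action[] {
  // rendered_block is always printed first, on both the ok and halt paths.
  const actions: Action[] = [{ kind: "print", text: env.rendered_block }];
  if (env.status === "update_halt") {
    actions.push({ kind: "stop" });
    return actions;
  }
  switch (env.next_action) {
    case "mcp_update_halt":
    case "stop":
      actions.push({ kind: "stop" });
      break;
    case "resume_handoff": {
      const command = env.handoff.command;
      // Assumption: an envelope in this state always carries a command.
      if (!command) throw new Error("resume_handoff without handoff.command");
      actions.push({
        kind: "trigger",
        command,
        context: env.handoff.context,
        state: env.handoff.state,
      });
      break;
    }
    case "commit_then_todo":
      // Step 5.7 gate, presented only when there is something to commit.
      if (env.infra_files_to_commit.length > 0) {
        actions.push({ kind: "commit_gate", files: env.infra_files_to_commit });
      }
      actions.push({ kind: "trigger", command: "/cbp-todo" });
      break;
    case "trigger_todo":
      actions.push({ kind: "trigger", command: "/cbp-todo" });
      break;
  }
  return actions;
}
```

Returning a list of planned actions rather than executing them keeps the routing testable without a live session; the caller would map each `Action` to the corresponding print, prompt, or skill trigger.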

package/templates/skills/cbp-session-start/qa-regression.md CHANGED

@@ -5,22 +5,24 @@ description: Manual regression procedure for the cbp-session-start worktree-owne
 # cbp-session-start — Worktree-Ownership Regression
-Manual procedure verifying that `/cbp-session-start` correctly resolves the caller's worktree identity, gates Step 7 auto-trigger on ownership, and surfaces distress signals non-blocking. No automated harness exists for markdown skills; run these by hand (or exercise the MCP calls directly) whenever Step 1, Step 4.5, Step 5.8, Step 6, or Step 7 of `SKILL.md` changes.
+Manual procedure verifying that `/cbp-session-start` correctly resolves the caller's worktree identity, gates Step 7 auto-trigger on ownership, and surfaces distress signals non-blocking. Run these by hand whenever the SKILL's envelope-consumption logic (Step 0 health gate, the `codebyplan session start` invocation, the Step 5.7 commit gate, or the Step 7 `next_action` routing) changes.
+Since the worktree resolution, handoff freshness, ownership partition, and Step 6 render are now computed inside `codebyplan session start` (Steps 1–5.8 collapsed into one CLI call), the deterministic behavior is covered by unit tests (`src/cli/session/start.test.ts`, `src/lib/session.test.ts`). These scenarios are now best verified by inspecting the **envelope** the CLI emits — run `codebyplan session start --json` and read its fields — rather than tracing SKILL prose. The `ownership`, `next_action`, `handoff.fresh`, and `worktree_error_kind` fields below correspond to envelope keys.
 Repo under test: `2ff6d405-39c5-47b8-a6d1-59f998ac0537`.
 ## Preconditions
-- Step 1 uses `resolve-worktree --json` (not the legacy `2>/dev/null` form) — confirm with `grep -n 'resolve-worktree' SKILL.md` → line contains `--json`.
-- Step 5.8 calls `get_checkpoints({ repo_id, status: 'active' })` — confirm no Step 7 auto-trigger bypasses this gate.
-- Step 6 Ownership block is READ-ONLY — confirm no "reassign", "release_assignment", or "transfer" language appears in SKILL.md: `grep -n 'reassign\|release_assignment\|transfer' SKILL.md` → no hits.
-- Step 4.5 freshness gate includes the cross-worktree stale bullet (checkpoint `worktree_id` non-null and differs from caller).
+- The SKILL keeps Step 0 (`health_check`) as a hard gate, then delegates Steps 1–5.8 to `codebyplan session start --json` — confirm with `grep -n 'session start --json' SKILL.md` → at least one hit.
+- The orchestrator resolves worktree identity via `resolve-worktree --json` and partitions ownership via `get_checkpoints({ repo_id, status: 'active' })` — confirm the envelope carries `worktree_id` / `worktree_error_kind` / `owned_count` / `total_count`.
+- The Step 6 Ownership block (the envelope's `rendered_block`) is READ-ONLY — confirm no "reassign", "release_assignment", or "transfer" language appears in SKILL.md: `grep -n 'reassign\|release_assignment\|transfer' SKILL.md` → no hits.
+- The handoff freshness gate includes the cross-worktree stale rule (checkpoint `worktree_id` non-null and differs from caller) — covered by the `probeHandoff` unit tests; the envelope surfaces the outcome as `handoff.fresh`.
 ## Scenario A — caller owns an active CHK → auto-trigger
 1. Run from a worktree whose `WORKTREE_ID` matches the active checkpoint's `worktree_id` (or both are `null`).
 2. `get_checkpoints({ repo_id, status: 'active' })` returns at least one entry whose `worktree_id === WORKTREE_ID` (or both null).
-3. **Expected**: Step 5.8 sets `owned_count >= 1`. Step 6 shows `Ownership: N active CHK(s), N owned by this worktree`. Step 7 first branch fires: `/cbp-todo` is auto-triggered.
+3. **Expected**: envelope `owned_count >= 1`, `next_action: 'trigger_todo'`. `rendered_block` shows `Ownership: N active CHK(s), N owned by this worktree`. The SKILL's Step 7 fires `/cbp-todo`.
 ## Scenario B — active CHK(s) exist but none owned by caller → Ownership block + STOP
@@ -29,22 +31,22 @@ Repro: caller worktree is `codebyplan-claude-2` (`38cd7dfa`). The only active ch
 1. `resolve-worktree --json` returns `{"worktree_id":"38cd7dfa-...","error_kind":null}`.
 2. `get_checkpoints({ repo_id, status: 'active' })` returns CHK-136 with `worktree_id = "016bd7f2-..."`.
 3. Step 5.8: `owned_count = 0`, `total_count = 1`, `cross_worktree = [CHK-136]`.
-4. **Expected**: Step 6 shows `Ownership: 1 active CHK(s), 0 owned by this worktree` and `[Cross-worktree: CHK-136 (…)]`. Step 7 second branch fires: Ownership block is displayed (already in Step 6) and skill **STOPS** — `/cbp-todo` is NOT auto-triggered. No reassignment language appears.
+4. **Expected**: envelope `owned_count = 0`, `total_count = 1`, `next_action: 'stop'`, `cross_checkpoints = [CHK-136]`. `rendered_block` shows `Ownership: 1 active CHK(s), 0 owned by this worktree` and `[Cross-worktree: CHK-136 (…)]`. The SKILL **STOPS** on `next_action: 'stop'` — `/cbp-todo` is NOT auto-triggered. No reassignment language appears.
 ## Scenario C — no active CHKs anywhere → idle /cbp-todo trigger
 1. `get_checkpoints({ repo_id, status: 'active' })` returns `[]`.
 2. Step 5.8: `owned_count = 0`, `total_count = 0`.
-3. **Expected**: Step 6 shows `Ownership: 0 active CHK(s), 0 owned by this worktree`. Step 7 third branch fires: `/cbp-todo` is auto-triggered (idle path → checkpoint-create or session-end suggestion).
+3. **Expected**: envelope `owned_count = 0`, `total_count = 0`, `next_action: 'trigger_todo'`. `rendered_block` shows `Ownership: 0 active CHK(s), 0 owned by this worktree`. The SKILL fires `/cbp-todo` (idle path → checkpoint-create or session-end suggestion).
 ## Scenario D — cross-worktree handoff → Step 4.5 marks stale, falls through
 1. The most-recent closed session log contains a handoff payload whose `context.checkpoint_id` resolves to a checkpoint with `worktree_id = "016bd7f2-..."` (a different worktree).
 2. Caller `WORKTREE_ID = "38cd7dfa-..."`.
-3. **Expected**: Step 4.5 freshness gate hits the cross-worktree stale bullet (`checkpoint.worktree_id` non-null AND differs from caller) → marks stale → falls through silently to Step 7. The mismatched handoff is NOT auto-resumed. Ownership output from Step 5.8 / Step 6 / Step 7 proceeds normally.
+3. **Expected**: the orchestrator's handoff freshness gate hits the cross-worktree stale rule (`checkpoint.worktree_id` non-null AND differs from caller) → envelope `handoff.fresh: false`. The mismatched handoff is NOT auto-resumed (`next_action` is never `resume_handoff`); ownership/routing proceed normally per `owned_count`/`total_count`.
 ## Scenario E — resolver distress (non-tuple-miss) → warning line above Ownership block, session proceeds
 1. `resolve-worktree --json` returns `{"worktree_id":null,"error_kind":"local_config_read_failed"}` (or any other non-null, non-tuple-miss `error_kind`).
-2. **Expected**: Step 1 holds the `error_kind`; session continues (non-blocking). Step 6 surfaces `⚠ resolve-worktree: local_config_read_failed — local state is broken; routing may be unreliable. Run \`npx codebyplan setup\` to repair.` ABOVE the Ownership block. All subsequent steps still run (Steps 2–5.8). Step 7 proceeds per the `owned_count`/`total_count` values from Step 5.8 as normal — the skill does NOT hard-stop the way `/cbp-todo` does on the same distress kinds.
-3. **Compound case** (distress typically leaves `WORKTREE_ID` null): Step 5.8 then classifies every checkpoint with a non-null `worktree_id` as `cross_worktree[]` (only truly-null-`worktree_id` checkpoints land in `owned[]`). So if all active checkpoints are assigned, `owned_count = 0` and Step 7's second branch STOPS — the distress warning + Ownership block are the only output. The session log created in Step 5 records `worktree_id: null` (the resolver could not read local state); this is expected, not a failure.
+2. **Expected**: the envelope carries `worktree_error_kind: 'local_config_read_failed'`; `status` stays `'ok'` (non-blocking). `rendered_block` surfaces `⚠ resolve-worktree: local_config_read_failed — local state is broken; routing may be unreliable. Run \`npx codebyplan setup\` to repair.` ABOVE the Ownership block. The orchestrator still activates + creates the session log. `next_action` routes per the `owned_count`/`total_count` values as normal — the skill does NOT hard-stop the way `/cbp-todo` does on the same distress kinds.
+3. **Compound case** (distress typically leaves `worktree_id` null): ownership then classifies every checkpoint with a non-null `worktree_id` as `cross_checkpoints[]` (only truly-null-`worktree_id` checkpoints land in `owned_checkpoints[]`). So if all active checkpoints are assigned, `owned_count = 0` and `next_action: 'stop'` — the distress warning + Ownership block are the only output. The created session log records `worktree_id: null` (the resolver could not read local state); this is expected, not a failure.
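
Scenarios A, B, C, and the compound case in E all exercise the same ownership partition, which can be sketched as below. This follows the rule stated in the removed Step 5.8 and Step 7 prose; it is an illustration, not the code inside `codebyplan session start`. The names `resolveOwnership`, `Checkpoint`, and the worktree ids used in any call are hypothetical. The sketch covers only the `trigger_todo` versus `stop` decision and ignores the `commit_then_todo` and `resume_handoff` outcomes.

```typescript
interface Checkpoint {
  id: string;
  title: string;
  worktree_id: string | null;
}

type GateAction = "trigger_todo" | "stop";

interface Ownership {
  owned: Checkpoint[];
  cross: Checkpoint[];
  owned_count: number;
  total_count: number;
  next_action: GateAction;
}

function resolveOwnership(
  checkpoints: Checkpoint[],
  callerWorktreeId: string | null,
): Ownership {
  // Owned: ids match exactly, which also covers the "both null" case.
  const owned = checkpoints.filter((c) => c.worktree_id === callerWorktreeId);
  // Cross-worktree: the checkpoint is assigned and the assignment differs
  // from the caller (including a null caller).
  const cross = checkpoints.filter(
    (c) => c.worktree_id !== null && c.worktree_id !== callerWorktreeId,
  );
  // Read literally, an unassigned checkpoint seen by a registered caller
  // lands in neither set and is not counted in total_count.
  const owned_count = owned.length;
  const total_count = owned.length + cross.length;
  // Stop only when active work exists but none of it belongs to the caller.
  const next_action: GateAction =
    total_count >= 1 && owned_count === 0 ? "stop" : "trigger_todo";
  return { owned, cross, owned_count, total_count, next_action };
}
```

With one active checkpoint assigned to a different worktree (Scenario B), this yields `owned_count = 0`, `total_count = 1`, and `stop`; with an empty list (Scenario C) it yields `trigger_todo`.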