codebyplan 1.13.15 → 1.13.17

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +77 -2
package/dist/cli.js +709 -116
package/package.json +2 -2
package/templates/agents/cbp-e2e-playwright.md +194 -93
package/templates/context/testing/e2e.md +14 -9
package/templates/hooks/README.md +23 -1
package/templates/settings.project.base.json +10 -0
package/templates/skills/cbp-git-worktree-create/SKILL.md +11 -0
package/templates/skills/cbp-round-execute/SKILL.md +18 -0
package/templates/skills/cbp-setup-cmux/SKILL.md +170 -0
package/templates/skills/cbp-task-complete/SKILL.md +14 -0
package/templates/skills/cbp-task-start/SKILL.md +8 -0

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "codebyplan",
-  "version": "1.13.15",
+  "version": "1.13.17",
   "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
   "type": "module",
   "bin": {
@@ -61,6 +61,6 @@
     "prettier": "^3.8.1",
     "typescript": "^5",
     "typescript-eslint": "^8.20.0",
-    "vitest": "^4.1.2"
+    "vitest": "^4.1.8"
   }
 }

package/templates/agents/cbp-e2e-playwright.md CHANGED

@@ -26,136 +26,237 @@ pnpm exec playwright install --with-deps chromium
 ## playwright.config.ts
-Derive `baseURL` from `.codebyplan/server.json` at config-read time. Match by label
-(`"Web Dev"`) rather than array position — a monorepo can have several nextjs allocations.
+Resolve the apps/web dev-server port at config-read time via the shared resolver
+`apps/web/e2e/resolve-web-dev-port.ts` — imported by BOTH `playwright.config.ts` and
+`e2e/auth.setup.ts` (single source of truth). It reads the per-worktree
+`.codebyplan/server.local.json` overlay first, then the committed `.codebyplan/server.json`.
+Match by label rather than array position — a monorepo can have several Next.js allocations
+with similar label prefixes.
+**Label-matching rules** (`findWebDevPort`):
+- `server.local.json` overlay: each label has the worktree name appended as the last
+  parenthetical group (e.g. `"Web Dev (codebyplan-mcp-1)"`). Strip exactly ONE trailing
+  `" (…)"` group, then require the result `=== "Web Dev"`.
+  - `"Web Dev (codebyplan-mcp-1)"` → strip → `"Web Dev"` ✓
+  - `"Web Dev (codebyplan-desktop) (codebyplan-mcp-1)"` → strip → `"Web Dev (codebyplan-desktop)"` ✗
+- `server.json` committed base: require `label === "Web Dev"` exactly (do NOT strip —
+  `"Web Dev (codebyplan-desktop)"` must not match).
+**Resolution order** (first hit wins):
+0. `PLAYWRIGHT_BASE_URL` — explicit CI / local override (`parsePortFromUrl` extracts the port)
+1. `.codebyplan/server.local.json` — `findWebDevPort(…, {stripWorktreeSuffix: true})`
+2. `.codebyplan/server.json` — `findWebDevPort(…, {stripWorktreeSuffix: false})`
+3. `E2E_BASE_URL` — `parsePortFromUrl` (kept BELOW the overlay: a stale `E2E_BASE_URL` in a
+   gitignored `.env.local` must never shadow the worktree's own port — set `PLAYWRIGHT_BASE_URL`
+   to override in CI)
+4. `3010` — last resort
+The resolver uses `readFileSync` + `JSON.parse` with paths relative to `apps/web/e2e/`
+(`resolve(__dirname, "../../../.codebyplan/…")`). Each read is wrapped in `try/catch` — the
+overlay is gitignored and absent in CI. Do NOT import from the `codebyplan` CLI package
+(async, cross-package coupling). `findWebDevPort` + `parsePortFromUrl` are pure and unit-tested
+in `e2e/__tests__/resolve-web-dev-port.test.ts`.
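Reviewer sketch (not part of the package): the label-matching rules and resolution order described in the added lines above can be expressed as three pure functions. This is a minimal illustration under stated assumptions, not the contents of `resolve-web-dev-port.ts`. The `PortAllocation` and `ResolveInputs` shapes and the `resolvePort` name are hypothetical; only `label`, `port`, `findWebDevPort`, `parsePortFromUrl`, and the option `stripWorktreeSuffix` appear in the diff. File reading is left out so the logic stays testable, and a URL without an explicit port is treated as "no opinion", which is a simplification.

```ts
// Hypothetical shapes: only `label` and `port` are attested by the diff.
interface PortAllocation {
  label: string;
  port: number;
}

interface ResolveInputs {
  env: Record<string, string | undefined>;
  overlay?: PortAllocation[]; // server.local.json port_allocations, if readable
  base?: PortAllocation[]; // server.json port_allocations, if readable
}

// Strip exactly one trailing " (...)" group; other labels are returned unchanged.
function stripOneTrailingGroup(label: string): string {
  return label.replace(/ \([^()]*\)$/, "");
}

function findWebDevPort(
  allocations: PortAllocation[],
  opts: { stripWorktreeSuffix: boolean },
): number | undefined {
  for (const { label, port } of allocations) {
    const candidate = opts.stripWorktreeSuffix ? stripOneTrailingGroup(label) : label;
    if (candidate === "Web Dev") return port;
  }
  return undefined;
}

function parsePortFromUrl(raw: string | undefined): number | undefined {
  if (!raw) return undefined;
  try {
    const { port } = new URL(raw);
    return port ? Number(port) : undefined; // no explicit port: no opinion
  } catch {
    return undefined; // not a parseable URL
  }
}

// Resolution order 0 to 4 from the diff; the first defined value wins.
function resolvePort({ env, overlay = [], base = [] }: ResolveInputs): number {
  return (
    parsePortFromUrl(env["PLAYWRIGHT_BASE_URL"]) ??
    findWebDevPort(overlay, { stripWorktreeSuffix: true }) ??
    findWebDevPort(base, { stripWorktreeSuffix: false }) ??
    parsePortFromUrl(env["E2E_BASE_URL"]) ??
    3010
  );
}
```

Keeping `E2E_BASE_URL` below both JSON files in the `??` chain is what prevents a stale value in `.env.local` from shadowing the worktree's own allocation.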
 ```ts
+import { readFileSync } from "node:fs";
+import { resolve } from "node:path";
 import { defineConfig, devices } from "@playwright/test";
-import { execSync } from "child_process";
-function getBaseUrl(): string {
+import { resolveWebDevPort } from "./e2e/resolve-web-dev-port";
+// Load apps/web/.env.local into process.env (process.env wins on conflict)
+(function loadDotEnvLocal() {
+  try {
+    const text = readFileSync(resolve(__dirname, ".env.local"), "utf-8");
+    for (const line of text.split("\n")) {
+      const t = line.trim();
+      if (!t || t.startsWith("#")) continue;
+      const eq = t.indexOf("=");
+      if (eq === -1) continue;
+      const k = t.slice(0, eq).trim();
+      let v = t.slice(eq + 1).trim();
+      if ((v.startsWith('"') && v.endsWith('"')) || (v.startsWith("'") && v.endsWith("'")))
+        v = v.slice(1, -1);
+      if (!(k in process.env)) process.env[k] = v;
+    }
+  } catch { /* absent in CI */ }
+})();
+// Load .codebyplan/e2e.env — Supabase + auth credentials (process.env wins)
+(function loadE2eEnv() {
   try {
-    const raw = execSync(
-      "jq -r '.port_allocations[] | select(.label==\"Web Dev\") | .port' .codebyplan/server.json 2>/dev/null | head -1",
-      { encoding: "utf-8" }
-    ).trim();
-    const port = parseInt(raw, 10);
-    return `http://localhost:${port}`;
-  } catch {
-    return "http://localhost:3010";
-  }
-}
+    const text = readFileSync(resolve(__dirname, "../../.codebyplan/e2e.env"), "utf-8");
+    for (const line of text.split("\n")) {
+      const t = line.trim();
+      if (!t || t.startsWith("#")) continue;
+      const eq = t.indexOf("=");
+      if (eq === -1) continue;
+      const k = t.slice(0, eq).trim();
+      let v = t.slice(eq + 1).trim();
+      if ((v.startsWith('"') && v.endsWith('"')) || (v.startsWith("'") && v.endsWith("'")))
+        v = v.slice(1, -1);
+      if (!(k in process.env)) process.env[k] = v;
+    }
+  } catch { /* absent in CI — shell / CI secrets used instead */ }
+})();
+// findWebDevPort, parsePortFromUrl, and resolveWebDevPort live in the shared
+// module ./e2e/resolve-web-dev-port.ts (imported above) — single source of
+// truth, also consumed by e2e/auth.setup.ts. Resolution order:
+//   0. PLAYWRIGHT_BASE_URL → 1. server.local.json → 2. server.json
+//   → 3. E2E_BASE_URL → 4. 3010.
+const port = resolveWebDevPort();
 export default defineConfig({
-  testDir: "apps/web/e2e",
-  fullyParallel: false,
+  testDir: "./e2e",
+  testMatch: "*.spec.ts",
+  globalSetup: require.resolve("./e2e/global-setup"),
+  fullyParallel: true,
   forbidOnly: !!process.env.CI,
-  retries: process.env.CI ? 2 : 0,
+  retries: process.env.CI ? 1 : 0,
   workers: 1,            // serialize against shared remote Supabase — see e2e.md § Supabase Parallelism
-  reporter: process.env.CI ? "github" : "html",
-  globalSetup: "./apps/web/e2e/global-setup",
+  reporter: "list",
+  timeout: 30_000,
+  expect: {
+    toHaveScreenshot: { stylePath: "./e2e/screenshot.css" },
+  },
   use: {
-    baseURL: getBaseUrl(),
+    baseURL: `http://localhost:${port}`,
+    storageState: "e2e/.auth/refreshed-state.json",
+    actionTimeout: 15_000,
     trace: "on-first-retry",
     screenshot: "only-on-failure",
   },
-  projects: [
-    { name: "setup", testMatch: /global\.setup\.ts/ },
-    {
-      name: "web",
-      use: { ...devices["Desktop Chrome"], storageState: "apps/web/e2e/.auth/user.json" },
-      dependencies: ["setup"],
-    },
-  ],
   webServer: {
-    command: "pnpm --filter @codebyplan/web dev",
-    url: getBaseUrl(),
+    command: `pnpm --filter @codebyplan/web dev --port ${port}`,
+    url: `http://localhost:${port}`,
     reuseExistingServer: !process.env.CI,
     timeout: 120_000,
+    env: {
+      // Forward Supabase + auth vars into the spawned dev server. Only forward
+      // vars that are present in process.env (undefined would stringify to "undefined").
+      ...(process.env.NEXT_PUBLIC_SUPABASE_URL && {
+        NEXT_PUBLIC_SUPABASE_URL: process.env.NEXT_PUBLIC_SUPABASE_URL,
+      }),
+      ...(process.env.NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY && {
+        NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY: process.env.NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY,
+      }),
+      ...(process.env.SUPABASE_SECRET_KEY && {
+        SUPABASE_SECRET_KEY: process.env.SUPABASE_SECRET_KEY,
+      }),
+    },
   },
+  projects: [
+    {
+      name: "chromium",
+      use: { ...devices["Desktop Chrome"] },
+    },
+  ],
 });
 ```
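Reviewer sketch (not part of the package): the two loader IIFEs in the config above, `loadDotEnvLocal` and `loadE2eEnv`, differ only in the file path. Assuming that duplication is not intentional, the parsing could live in one helper. The names `parseDotEnv` and `applyEnv` are hypothetical. The behaviour mirrors the loops in the diff, with one deliberate difference: a value must be at least two characters long before its quotes are stripped.

```ts
// Parse KEY=VALUE lines, skipping blanks, comments, and lines without "=".
function parseDotEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const line of text.split("\n")) {
    const t = line.trim();
    if (!t || t.startsWith("#")) continue;
    const eq = t.indexOf("=");
    if (eq === -1) continue;
    const key = t.slice(0, eq).trim();
    let value = t.slice(eq + 1).trim();
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    out[key] = value;
  }
  return out;
}

// Copy parsed values into a target env; keys already present win.
function applyEnv(
  parsed: Record<string, string>,
  target: Record<string, string | undefined>,
): void {
  for (const [key, value] of Object.entries(parsed)) {
    if (!(key in target)) target[key] = value;
  }
}
```

Each loader would then reduce to a `try { applyEnv(parseDotEnv(readFileSync(path, "utf-8")), process.env) } catch {}` call, keeping the "absent in CI" tolerance from the original.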
 ## Auth — Global Setup + Storage State
-`apps/web/e2e/global-setup.ts`:
+`apps/web/e2e/global-setup.ts` performs two phases at startup:
-```ts
-import { chromium, FullConfig } from "@playwright/test";
-import path from "path";
-const AUTH_FILE = path.join(__dirname, ".auth/user.json");
-export default async function globalSetup(config: FullConfig) {
-  const email = process.env.E2E_TEST_EMAIL;
-  const password = process.env.E2E_TEST_PASSWORD;
-  if (!email || !password) {
-    throw new Error(
-      "E2E_TEST_EMAIL and E2E_TEST_PASSWORD must be set.\n" +
-        "Copy .env.local.example to .env.local, then run: pnpm e2e:provision"
-    );
-  }
-  const { baseURL } = config.projects[0].use;
-  const browser = await chromium.launch();
-  const page = await browser.newPage();
-  await page.goto(`${baseURL}/login`);
-  await page.getByLabel(/email/i).fill(email);
-  await page.getByLabel(/password/i).fill(password);
-  await page.getByRole("button", { name: /sign in|log in/i }).click();
-  await page.waitForURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
-  await page.goto(baseURL!);   // cold-start warmup
-  await page.context().storageState({ path: AUTH_FILE });
-  await browser.close();
-}
-```
+**Phase 1 — Auth refresh**: reads `e2e/.auth/state.json`, finds the Supabase auth cookie
+(`sb-<projectref>-auth-token`), decodes its base64-JSON payload (`decodeAuthCookie` from the
+shared `e2e/auth-cookie.ts` module), calls `supabase.auth.refreshSession({refresh_token})` for
+fresh tokens, re-encodes via `encodeAuthCookie`, and writes the result to
+`e2e/.auth/refreshed-state.json`. No browser required — pure HTTP against Supabase auth.
-Gitignore storage state before first use:
+**Phase 2 — Maintainer seeding**: uses the service-role client to ensure the test user
+holds a maintainer-or-above role on at least one organization (`e2e-test-fixture` slug).
+Idempotent — if a qualifying membership already exists, phase 2 is a no-op.
-```bash
-mkdir -p apps/web/e2e/.auth
-echo "apps/web/e2e/.auth/" >> .gitignore
-```
+**Required env vars** (read via `readEnv(name, fallbacks)` which checks `process.env` first,
+then falls back to the parsed `.env.local` file):
+- `NEXT_PUBLIC_SUPABASE_URL` — both phases
+- `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` — Phase 1 refresh client
+- `SUPABASE_SECRET_KEY` — Phase 2 admin operations
+- `E2E_USER_EMAIL` — Phase 2 user lookup + seeding
+Because `readEnv` checks `process.env` before `.env.local`, loading these vars into
+`process.env` via `loadE2eEnv()` in `playwright.config.ts` is sufficient — global-setup
+will pick them up.
+### Seeding `state.json` — `e2e/auth.setup.ts` (`pnpm e2e:auth-setup`)
+global-setup Phase 1 only *refreshes* an existing `state.json`; the initial seed is written by
+`e2e/auth.setup.ts`. It is a pure-HTTP API seed — **no browser, no dev server, no hydration
+timing**: it loads creds from `.env.local` + `.codebyplan/e2e.env`, calls
+`supabase.auth.signInWithPassword({email, password})` with the publishable-key client, derives
+the project ref from `NEXT_PUBLIC_SUPABASE_URL`, and writes a `sb-<projectref>-auth-token`
+cookie (domain `localhost`) into `state.json` using the same `encodeAuthCookie` from
+`e2e/auth-cookie.ts` that global-setup consumes. This makes seeding deterministic in any
+worktree — run `pnpm e2e:auth-setup` (optionally `--port N`) when `state.json` is missing or
+its refresh token has expired. Do NOT reintroduce a browser-login flow (the `(auth)/login`
+page is a client component whose `onSubmit` only attaches after hydration — clicking submit
+pre-hydration falls through to a native GET and never authenticates).
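Reviewer sketch (not part of the package): both `global-setup.ts` and `auth.setup.ts` rely on a shared cookie codec and on deriving the project ref from `NEXT_PUBLIC_SUPABASE_URL`. The diff does not show `e2e/auth-cookie.ts`, so the following is an illustration under assumptions. It assumes the cookie value is a `base64-` prefix followed by base64url-encoded JSON, that the project ref is the first hostname label of the Supabase URL, and that the cookie is not split into chunks. `projectRefFromUrl` and `authCookieName` are hypothetical helper names.

```ts
const BASE64_PREFIX = "base64-"; // assumption: prefix used by the cookie format

// Assumption: the project ref is the first label of the Supabase hostname.
function projectRefFromUrl(supabaseUrl: string): string {
  return new URL(supabaseUrl).hostname.split(".")[0];
}

function authCookieName(supabaseUrl: string): string {
  return `sb-${projectRefFromUrl(supabaseUrl)}-auth-token`;
}

// Serialize a session object into a cookie value.
function encodeAuthCookie(session: unknown): string {
  const json = JSON.stringify(session);
  return BASE64_PREFIX + Buffer.from(json, "utf-8").toString("base64url");
}

// Inverse of encodeAuthCookie; also tolerates a value without the prefix.
function decodeAuthCookie(value: string): unknown {
  const raw = value.startsWith(BASE64_PREFIX)
    ? value.slice(BASE64_PREFIX.length)
    : value;
  return JSON.parse(Buffer.from(raw, "base64url").toString("utf-8"));
}
```

Sharing one codec between the seeding script and the refresh phase means a format change only has to be made in one place, which matches the "single source of truth" intent stated for the port resolver.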
 ## Auth Probe
-`apps/web/e2e/_probe/auth.spec.ts` — validates the login path directly (outside storage-
-state flow) so credential failures are diagnosed cleanly:
+`apps/web/e2e/_probe/auth.spec.ts` — verifies that the stored auth state
+(`refreshed-state.json`) grants access to the authenticated dashboard without
+redirecting to the login page. It is intentionally minimal (one test) and runs
+before the full suite to confirm the auth preflight:
 ```ts
 import { test, expect } from "@playwright/test";
-test("auth probe: can log in with E2E_TEST_EMAIL/E2E_TEST_PASSWORD", async ({ page }) => {
-  const email = process.env.E2E_TEST_EMAIL;
-  const password = process.env.E2E_TEST_PASSWORD;
-  expect(email, "E2E_TEST_EMAIL env var is required").toBeTruthy();
-  expect(password, "E2E_TEST_PASSWORD env var is required").toBeTruthy();
-  await page.goto("/login");
-  await page.getByLabel(/email/i).fill(email!);
-  await page.getByLabel(/password/i).fill(password!);
-  await page.getByRole("button", { name: /sign in|log in/i }).click();
+test("auth probe: authenticated user reaches /dashboard without login redirect", async ({
+  page,
+}) => {
+  const response = await page.goto("/dashboard", {
+    waitUntil: "domcontentloaded",
+    timeout: 20_000,
+  });
-  await expect(page).toHaveURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
+  // Must NOT be redirected to /login or /auth/login.
+  const finalUrl = page.url();
+  expect(
+    finalUrl,
+    `Auth probe failed: landed on ${finalUrl} instead of /dashboard. Check state.json / refreshed-state.json.`
+  ).not.toMatch(/\/(auth\/)?login/);
+  // HTTP status must be < 400.
+  const status = response?.status() ?? 0;
+  expect(
+    status,
+    `Auth probe failed: /dashboard returned HTTP ${status}.`
+  ).toBeLessThan(400);
+  // Dashboard heading must be present.
+  await expect(
+    page.getByRole("heading", { level: 1, name: /welcome to codebyplan|dashboard/i })
+  ).toBeVisible({ timeout: 15_000 });
 });
 ```
-Run probe: `pnpm exec playwright test --project=web _probe/auth`
+Run probe: `pnpm exec playwright test --project=chromium _probe/auth`
 ## Pre-flight Probes (Step 6.5.2)
 **Dev server**: `curl -s -o /dev/null -w "%{http_code}" http://localhost:{port}/` — expect
 200/3xx. On failure:
-> "Dev server is not responding on port `{port}`. Please run `cd apps/{app} && pnpm dev`
+> "Dev server is not responding on port `{port}`. Please run `cd apps/web && pnpm dev --port {port}`
 > in a separate terminal, then reply 'ready' when the page loads in your browser."
-**Port alignment**: parse `playwright.config.ts` `baseURL` port; compare to
-`.codebyplan/server.json` `port_allocations[]`. On mismatch ask which is correct, then
-propose an Edit to align them.
+Note: Playwright's `webServer` block behaviour differs by environment. In local worktree
+runs (`reuseExistingServer: true`), Playwright reuses an already-running server and only
+auto-starts one when nothing is listening on the port — this probe is mainly a safety net.
+In CI (`reuseExistingServer: false`), Playwright always spawns a fresh server regardless of
+any already-running process, so the dev-server readiness probe is the active guard for that
+path.
+**Port alignment**: parse `playwright.config.ts` `baseURL` port; compare to the resolved
+port from `.codebyplan/server.local.json` (worktree overlay, checked first) then
+`.codebyplan/server.json` (committed base). On mismatch ask which is correct, then propose
+an Edit to align them.
 ## Spec-Writing Patterns
@@ -227,7 +328,7 @@ Include this in the specialist output alongside `screenshots[]`.
 ## Run Command
 ```bash
-pnpm exec playwright test {spec} --project=web --reporter=list
+pnpm exec playwright test {spec} --project=chromium --reporter=list
 ```
 ## Selector Conventions
@@ -238,13 +339,13 @@ the new page state rather than holding stale `Locator` handles.
 ## CI Secrets
-`E2E_TEST_EMAIL`, `E2E_TEST_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
-`NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` (or legacy `_ANON_KEY`).
+`E2E_USER_EMAIL`, `E2E_USER_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
+`NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY`, `SUPABASE_SECRET_KEY`.
 ## Pitfalls
 **Cold-start timeouts** — warmup in `globalSetup` (after `page.goto(baseURL!)`) primes
-Turbopack compilation. **Port mismatch** — compare `baseURL` port to `server.json` before
-running. **Supabase parallelism** — remote Supabase requires `workers: 1` to prevent
-auth/RLS races. **SCSS Module selectors** — use `[class*='componentName'].first()` or
-role-based selectors.
+Turbopack compilation. **Port mismatch** — compare `baseURL` port to resolved port from
+`server.local.json` / `server.json` before running. **Supabase parallelism** — remote
+Supabase requires `workers: 1` to prevent auth/RLS races. **SCSS Module selectors** — use
+`[class*='componentName'].first()` or role-based selectors.

package/templates/context/testing/e2e.md CHANGED

@@ -118,11 +118,12 @@ with the blocking preflight field populated.
 ### 6.5.1 Environment Variables
-Check `apps/{app}/.env.local` and process env. Framework-specific required var names come
-from the `credential_vars` input field (the dispatching skill reads them from
-`.codebyplan/e2e.json`). Naming conventions:
+Check `apps/{app}/.env.local`, `.codebyplan/e2e.env`, and process env. Framework-specific
+required var names come from the `credential_vars` input field (the dispatching skill reads
+them from `.codebyplan/e2e.json`). Naming conventions:
-- Playwright uses `E2E_TEST_*` (avoids collision with non-E2E `TEST_*` vars).
+- Playwright uses `E2E_USER_EMAIL` / `E2E_USER_PASSWORD` (matches `.codebyplan/e2e.json`
+  `credentials.frameworks.playwright` and the `e2e.env` file loaded by `playwright.config.ts`).
 - Maestro/XCUITest stay on `TEST_*` per `rules/maestro-auth-state-reset.md`.
 For any missing var:
@@ -148,8 +149,9 @@ On any failure, `AskUserQuestion` with remediation steps; re-probe after "ready"
 silently skip a required runtime prerequisite.
 **Port alignment (Playwright only)**: parse `playwright.config.ts` `baseURL` and compare
-to `.codebyplan/server.json` `port_allocations[]` for the app. On mismatch ask which is
-correct before running.
+to the resolved port from `.codebyplan/server.local.json` (worktree overlay, checked first)
+then `.codebyplan/server.json` (committed base) `port_allocations[]` for the app. On
+mismatch ask which is correct before running.
 ### 6.5.3 Auth Probe (only when `has_auth`)
@@ -303,9 +305,12 @@ Every repo with Playwright auth ships:
 - `scripts/provision-e2e-user.ts` — idempotent script creating the canonical E2E user
   and (for multi-tenant repos) a `test` subdomain. Wired to `pnpm e2e:provision`.
-- `.env.local.example` — lists every env var `globalSetup` requires.
-- CI secrets: `E2E_TEST_EMAIL`, `E2E_TEST_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
-  `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` (or legacy `_ANON_KEY`).
+- `.codebyplan/e2e.env` — gitignored per-worktree file listing every env var `globalSetup`
+  requires. Written by `codebyplan e2e:provision` or manually. Loaded by `playwright.config.ts`
+  into `process.env` so global-setup picks them up via its `readEnv` helper.
+- `.env.local.example` — lists every env var `globalSetup` requires for reference.
+- CI secrets: `E2E_USER_EMAIL`, `E2E_USER_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
+  `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY`, `SUPABASE_SECRET_KEY`.
 Per-repo specifics (email, vault name, remaining-spec migration list) live in the repo's
 own `docs/e2e-setup.md`, not in this shared file.
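Reviewer sketch (not part of the package): the renamed variable set (`E2E_USER_*` plus `SUPABASE_SECRET_KEY`) lends itself to a single preflight check. The helper name `missingE2eVars` is hypothetical, and treating an empty string as missing is an assumption.

```ts
// The five names listed under "CI secrets" in the diff.
const REQUIRED_E2E_VARS = [
  "E2E_USER_EMAIL",
  "E2E_USER_PASSWORD",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY",
  "SUPABASE_SECRET_KEY",
] as const;

// Return the required names that are unset or empty in the given env.
function missingE2eVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_E2E_VARS.filter((name) => !env[name]);
}
```

Calling `missingE2eVars(process.env)` after the config has loaded `.env.local` and `.codebyplan/e2e.env` would report gaps from all three sources at once.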

package/templates/hooks/README.md CHANGED

@@ -226,7 +226,7 @@ After a `complete_round` MCP call succeeds, reconciles the round's `files_change
 ### `cbp-cmux-workspace-sync.sh` — SessionStart, matcher `*`
-On every session start, syncs the active [cmux](https://github.com/nicholasgasior/cmux) workspace title to the current git branch and the workspace description to the repo folder basename (the directory that contains `.git/`).
+On every session start, syncs the active [cmux](https://github.com/nicholasgasior/cmux) workspace title to the current git branch, the workspace description to the repo folder basename (the directory that contains `.git/`), and applies the workspace color from `.codebyplan/cmux.json` via `cmux workspace-action --action set-color`. All three actions are delegated to `codebyplan cmux-sync`. If no `workspace_color` is configured, a one-line nudge is printed to stdout prompting the user to run `/cbp-setup-cmux`.
 **Blocks vs warns**: never blocks — exit 0 on every path. A SessionStart hook must never prevent a session from opening.
@@ -252,6 +252,28 @@ After any Bash tool call that contains a `git checkout` or `git switch` invocati
 ---
+### Auto dev server (`codebyplan cmux-serve`)
+At the start of each round, `cbp-round-execute` (Step 2a) calls `codebyplan cmux-serve --files "<round files>"` to auto-start the dev server for any app whose source files are touched. The subcommand probes each allocated port via `node:net`, starts a `cmux new-split` terminal pane + sends the dev command for any non-listening app, then opens a browser pane. If the port is already listening (another worktree) it only opens the browser pane. No hook registration is needed — the skill invokes the subcommand directly. Gated by `auto_dev_server` in `.codebyplan/cmux.json`; no-op outside cmux.
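Reviewer sketch (not part of the package): the probe-then-act behaviour described for `codebyplan cmux-serve` can be illustrated with a TCP connect attempt via `node:net`. The actual implementation in `dist/cli.js` is not reproduced here. `isPortListening`, `planServeActions`, the action names, the 500 ms timeout, and the `127.0.0.1` host are all assumptions made for illustration.

```ts
import { createConnection } from "node:net";

// Resolve true if something accepts a TCP connection on the port.
function isPortListening(
  port: number,
  host = "127.0.0.1",
  timeoutMs = 500,
): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = createConnection({ port, host });
    const finish = (result: boolean) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

type ServeAction = "start-dev-server" | "open-browser";

// Already listening: browser pane only. Otherwise: start the server first.
function planServeActions(listening: boolean): ServeAction[] {
  return listening ? ["open-browser"] : ["start-dev-server", "open-browser"];
}
```

Separating the probe from the decision keeps the idempotence rule (never start a second server on a port another worktree already holds) testable without opening sockets.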
+---
+### Status surface (`codebyplan cmux-status`)
+The lifecycle skills push CodeByPlan development state into the cmux workspace sidebar via `codebyplan cmux-status`. No hook registration is needed — the skills invoke the subcommand directly:
+| Skill | What is pushed |
+| --- | --- |
+| `cbp-task-start` (Step 4.5) | `--checkpoint "CHK-NNN: title" --task "TASK-N: title"` |
+| `cbp-task-complete` (Step 7.3) | `--task "TASK-N: title done" --progress completed/total` |
+| `cbp-round-execute` (Step 3d) | `--qa "R{n} {status}"` where status ∈ completed / blocked / re-triggering |
+**`auto_status` toggle.** Gated by the `auto_status` field in `.codebyplan/cmux.json` (configured via `/cbp-setup-cmux`). When `auto_status` is `false`, every call is a no-op. Default is `true` (enabled).
+**No-op outside cmux.** `codebyplan cmux-status` checks for `$CMUX_WORKSPACE_ID` before doing anything. Outside a cmux workspace it exits immediately — safe to call unconditionally from skills and hooks.
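Reviewer sketch (not part of the package): the two gates described above (the `auto_status` toggle and the `$CMUX_WORKSPACE_ID` check) and the `--qa` argument format can be sketched as pure functions. `shouldPushStatus`, `qaStatusArgs`, and the `CmuxConfig` shape are hypothetical names; only the `auto_status` field, its default of `true`, and the three status values come from the diff.

```ts
type QaStatus = "completed" | "blocked" | "re-triggering";

interface CmuxConfig {
  auto_status?: boolean; // defaults to true when absent
}

// Push only inside a cmux workspace and only when auto_status is not disabled.
function shouldPushStatus(
  env: Record<string, string | undefined>,
  config: CmuxConfig,
): boolean {
  if (!env["CMUX_WORKSPACE_ID"]) return false;
  return config.auto_status ?? true;
}

// Build the argument list for: codebyplan cmux-status --qa "R{n} {status}"
function qaStatusArgs(roundNumber: number, status: QaStatus): string[] {
  return ["cmux-status", "--qa", `R${roundNumber} ${status}`];
}
```

Because both gates fail closed, a skill can call the subcommand unconditionally, as the text above notes.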
+---
 ## Supporting (not registered)
 ### `test-hooks.sh` — invoked by `auto-test-hooks.sh`

package/templates/settings.project.base.json CHANGED

@@ -124,6 +124,7 @@
       "Skill(cbp-round-update)",
       "Skill(cbp-session-end)",
       "Skill(cbp-session-start)",
+      "Skill(cbp-setup-cmux)",
       "Skill(cbp-setup-e2e)",
       "Skill(cbp-setup-eslint)",
       "Skill(cbp-ship-configure)",
@@ -135,6 +136,11 @@
       "Skill(cbp-supabase-branch-check)",
       "Skill(cbp-supabase-migrate)",
       "Skill(cbp-supabase-setup)",
+      "Skill(cbp-standalone-task-check)",
+      "Skill(cbp-standalone-task-complete)",
+      "Skill(cbp-standalone-task-create)",
+      "Skill(cbp-standalone-task-start)",
+      "Skill(cbp-standalone-task-testing)",
       "Skill(cbp-task-check)",
       "Skill(cbp-task-complete)",
       "Skill(cbp-task-create)",
@@ -196,6 +202,10 @@
       "Bash(npx codebyplan resolve-worktree:*)",
       "Bash(codebyplan cmux-sync:*)",
       "Bash(npx codebyplan cmux-sync:*)",
+      "Bash(codebyplan cmux-status:*)",
+      "Bash(npx codebyplan cmux-status:*)",
+      "Bash(codebyplan cmux-serve:*)",
+      "Bash(npx codebyplan cmux-serve:*)",
       "Bash(codebyplan version-status:*)",
       "Bash(npx codebyplan version-status:*)",
       "Bash(codebyplan statusline:*)",

package/templates/skills/cbp-git-worktree-create/SKILL.md CHANGED

@@ -142,6 +142,17 @@ cp "$MAIN_REPO/.env.local" "$WORKTREE_PATH/.env.local"
 Verify `.env.local` is already in `.gitignore` (it should be via `.env.local` pattern). If not, add it.
+Also copy the gitignored E2E credentials source (`.codebyplan/e2e.env`, referenced by `.codebyplan/e2e.json`) so the new worktree can run Playwright auth flows immediately:
+```bash
+mkdir -p "$WORKTREE_PATH/.codebyplan"
+if [ -f "$MAIN_REPO/.codebyplan/e2e.env" ]; then
+  cp "$MAIN_REPO/.codebyplan/e2e.env" "$WORKTREE_PATH/.codebyplan/e2e.env"
+fi
+```
+If the main repo has no `.codebyplan/e2e.env` yet, provision it after setup by running `codebyplan ports --path "$WORKTREE_PATH" --provision-e2e` (copies the canonical E2E vars from `apps/web/.env.local`). Pass `--path` BEFORE the boolean flag. `.codebyplan/e2e.env` is gitignored — never commit it.
 ### Step 8: Push Branch
 ```bash

package/templates/skills/cbp-round-execute/SKILL.md CHANGED

@@ -57,6 +57,16 @@ Read the plan from round context (`context.planner_output`). If no plan: `No app
 Read effective testing profile: `round.context.testing_profile_override` if set (user override for this round only), else `task.context.testing_profile` (set by planner Phase 4.8), else default `'web'`. Pass the effective profile to all per-wave `cbp-testing-qa-agent` spawns.
+### Step 2a: Auto-Dev-Server (cmux)
+Fire the dev-server hook at round-execution start. Self-no-ops outside cmux or when `auto_dev_server` is disabled in `.codebyplan/cmux.json`.
+```bash
+npx codebyplan cmux-serve --files "<comma-separated approved_plan.files_to_modify[].path>"
+```
+The subcommand reads `.codebyplan/server.json` `port_allocations[]`, resolves which apps' source dirs intersect the round's files, probes each allocated port, and starts a cmux terminal split + browser pane for any app not already serving. Idempotent — if the port is already listening it only opens the browser pane (mitigating the multi-worktree port collision).
 ### Step 3: Route Execution Path
 Inspect `approved_plan.files_to_modify[]` and `approved_plan.round_type`. Four execution paths exist; pick the one that matches BEFORE Step 3a/3b.
@@ -143,6 +153,14 @@ If the approved plan includes database schema changes, RLS policies, or type gen
 - `status: 'blocked'` → present blocker to user via AskUserQuestion, resolve, re-spawn executor with remaining work
 - Deliverables incomplete → re-spawn executor with remaining deliverables (max 3 re-triggers). After 3 re-triggers, save partial output and proceed.
+### Step 3d: Push cmux QA Status
+Push the round's QA outcome to the cmux workspace sidebar. Self-no-ops outside cmux or when `auto_status` is disabled. Status is one of: `completed`, `blocked`, or `re-triggering`.
+```bash
+npx codebyplan cmux-status --qa "R{round_number} {status}"
+```
 ### Step 4: Dev-Server Probe (rounds 2+, web/desktop profile)
 When `round_number >= 2` AND `testing_profile` is `'web'` or `'desktop'` AND `files_changed` contains any UI file, probe the dev server BEFORE cbp-testing-qa-agent spawns (saves a full agent spawn when the server is down).