codebyplan 1.13.15 → 1.13.17

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "codebyplan",
- "version": "1.13.15",
+ "version": "1.13.17",
  "description": "CLI for CodeByPlan — AI-powered development planning and tracking",
  "type": "module",
  "bin": {
@@ -61,6 +61,6 @@
  "prettier": "^3.8.1",
  "typescript": "^5",
  "typescript-eslint": "^8.20.0",
- "vitest": "^4.1.2"
+ "vitest": "^4.1.8"
  }
  }
@@ -26,136 +26,237 @@ pnpm exec playwright install --with-deps chromium
26
26
 
27
27
  ## playwright.config.ts
28
28
 
29
- Derive `baseURL` from `.codebyplan/server.json` at config-read time. Match by label
30
- (`"Web Dev"`) rather than array position, since a monorepo can have several nextjs allocations.
29
+ Resolve the apps/web dev-server port at config-read time via the shared resolver
30
+ `apps/web/e2e/resolve-web-dev-port.ts` — imported by BOTH `playwright.config.ts` and
31
+ `e2e/auth.setup.ts` (single source of truth). It reads the per-worktree
32
+ `.codebyplan/server.local.json` overlay first, then the committed `.codebyplan/server.json`.
33
+ Match by label rather than array position — a monorepo can have several Next.js allocations
34
+ with similar label prefixes.
35
+
36
+ **Label-matching rules** (`findWebDevPort`):
37
+
38
+ - `server.local.json` overlay: each label has the worktree name appended as the last
39
+ parenthetical group (e.g. `"Web Dev (codebyplan-mcp-1)"`). Strip exactly ONE trailing
40
+ `" (…)"` group, then require the result `=== "Web Dev"`.
41
+ - `"Web Dev (codebyplan-mcp-1)"` → strip → `"Web Dev"` ✓
42
+ - `"Web Dev (codebyplan-desktop) (codebyplan-mcp-1)"` → strip → `"Web Dev (codebyplan-desktop)"` ✗
43
+ - `server.json` committed base: require `label === "Web Dev"` exactly (do NOT strip —
44
+ `"Web Dev (codebyplan-desktop)"` must not match).
45
+
46
+ **Resolution order** (first hit wins):
47
+
48
+ 0. `PLAYWRIGHT_BASE_URL` — explicit CI / local override (`parsePortFromUrl` extracts the port)
49
+ 1. `.codebyplan/server.local.json` — `findWebDevPort(…, {stripWorktreeSuffix: true})`
50
+ 2. `.codebyplan/server.json` — `findWebDevPort(…, {stripWorktreeSuffix: false})`
51
+ 3. `E2E_BASE_URL` — `parsePortFromUrl` (kept BELOW the overlay: a stale `E2E_BASE_URL` in a
52
+ gitignored `.env.local` must never shadow the worktree's own port — set `PLAYWRIGHT_BASE_URL`
53
+ to override in CI)
54
+ 4. `3010` — last resort
55
+
56
+ The resolver uses `readFileSync` + `JSON.parse` with paths relative to `apps/web/e2e/`
57
+ (`resolve(__dirname, "../../../.codebyplan/…")`). Each read is wrapped in `try/catch` — the
58
+ overlay is gitignored and absent in CI. Do NOT import from the `codebyplan` CLI package
59
+ (async, cross-package coupling). `findWebDevPort` + `parsePortFromUrl` are pure and unit-tested
60
+ in `e2e/__tests__/resolve-web-dev-port.test.ts`.
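The label-matching rules and resolution order described above can be sketched as follows. This is a minimal illustration, not the contents of `resolve-web-dev-port.ts`: the `PortAllocation` and `ResolveInputs` shapes, the `stripOneTrailingGroup` helper, and the `resolvePort` signature are assumptions made for the example, and file reading is left out so the logic stays pure.

```typescript
// Sketch only. PortAllocation and ResolveInputs are hypothetical shapes;
// only `label` and `port` are suggested by the previous jq-based lookup.
interface PortAllocation {
  label: string;
  port: number;
}

// Remove exactly one trailing " (...)" group, e.g. the worktree suffix.
function stripOneTrailingGroup(label: string): string {
  return label.replace(/ \([^()]*\)$/, "");
}

function findWebDevPort(
  allocations: PortAllocation[],
  opts: { stripWorktreeSuffix: boolean },
): number | undefined {
  for (const a of allocations) {
    const label = opts.stripWorktreeSuffix ? stripOneTrailingGroup(a.label) : a.label;
    if (label === "Web Dev") return a.port;
  }
  return undefined;
}

// Returns the explicit port of a URL, or undefined when the URL is missing,
// malformed, or carries no explicit port.
function parsePortFromUrl(url: string | undefined): number | undefined {
  if (!url) return undefined;
  try {
    const port = Number(new URL(url).port);
    return port > 0 ? port : undefined;
  } catch {
    return undefined;
  }
}

interface ResolveInputs {
  playwrightBaseUrl?: string; // PLAYWRIGHT_BASE_URL
  overlay?: PortAllocation[]; // parsed server.local.json, if readable
  base?: PortAllocation[]; // parsed server.json, if readable
  e2eBaseUrl?: string; // E2E_BASE_URL
}

// First hit wins, in the documented order 0 to 4.
function resolvePort(inputs: ResolveInputs): number {
  return (
    parsePortFromUrl(inputs.playwrightBaseUrl) ??
    findWebDevPort(inputs.overlay ?? [], { stripWorktreeSuffix: true }) ??
    findWebDevPort(inputs.base ?? [], { stripWorktreeSuffix: false }) ??
    parsePortFromUrl(inputs.e2eBaseUrl) ??
    3010
  );
}
```

Keeping the inputs as plain values is one way to make the ordering testable without touching the filesystem; the real resolver reads the two JSON files itself.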
31
61
 
32
62
  ```ts
63
+ import { readFileSync } from "node:fs";
64
+ import { resolve } from "node:path";
33
65
  import { defineConfig, devices } from "@playwright/test";
34
- import { execSync } from "child_process";
35
66
 
36
- function getBaseUrl(): string {
67
+ import { resolveWebDevPort } from "./e2e/resolve-web-dev-port";
68
+
69
+ // Load apps/web/.env.local into process.env (process.env wins on conflict)
70
+ (function loadDotEnvLocal() {
71
+ try {
72
+ const text = readFileSync(resolve(__dirname, ".env.local"), "utf-8");
73
+ for (const line of text.split("\n")) {
74
+ const t = line.trim();
75
+ if (!t || t.startsWith("#")) continue;
76
+ const eq = t.indexOf("=");
77
+ if (eq === -1) continue;
78
+ const k = t.slice(0, eq).trim();
79
+ let v = t.slice(eq + 1).trim();
80
+ if ((v.startsWith('"') && v.endsWith('"')) || (v.startsWith("'") && v.endsWith("'")))
81
+ v = v.slice(1, -1);
82
+ if (!(k in process.env)) process.env[k] = v;
83
+ }
84
+ } catch { /* absent in CI */ }
85
+ })();
86
+
87
+ // Load .codebyplan/e2e.env — Supabase + auth credentials (process.env wins)
88
+ (function loadE2eEnv() {
37
89
  try {
38
- const raw = execSync(
39
- "jq -r '.port_allocations[] | select(.label==\"Web Dev\") | .port' .codebyplan/server.json 2>/dev/null | head -1",
40
- { encoding: "utf-8" }
41
- ).trim();
42
- const port = parseInt(raw, 10);
43
- return `http://localhost:${port}`;
44
- } catch {
45
- return "http://localhost:3010";
46
- }
47
- }
90
+ const text = readFileSync(resolve(__dirname, "../../.codebyplan/e2e.env"), "utf-8");
91
+ for (const line of text.split("\n")) {
92
+ const t = line.trim();
93
+ if (!t || t.startsWith("#")) continue;
94
+ const eq = t.indexOf("=");
95
+ if (eq === -1) continue;
96
+ const k = t.slice(0, eq).trim();
97
+ let v = t.slice(eq + 1).trim();
98
+ if ((v.startsWith('"') && v.endsWith('"')) || (v.startsWith("'") && v.endsWith("'")))
99
+ v = v.slice(1, -1);
100
+ if (!(k in process.env)) process.env[k] = v;
101
+ }
102
+ } catch { /* absent in CI — shell / CI secrets used instead */ }
103
+ })();
104
+
105
+ // findWebDevPort, parsePortFromUrl, and resolveWebDevPort live in the shared
106
+ // module ./e2e/resolve-web-dev-port.ts (imported above) — single source of
107
+ // truth, also consumed by e2e/auth.setup.ts. Resolution order:
108
+ // 0. PLAYWRIGHT_BASE_URL → 1. server.local.json → 2. server.json
109
+ // → 3. E2E_BASE_URL → 4. 3010.
110
+ const port = resolveWebDevPort();
48
111
 
49
112
  export default defineConfig({
50
- testDir: "apps/web/e2e",
51
- fullyParallel: false,
113
+ testDir: "./e2e",
114
+ testMatch: "*.spec.ts",
115
+ globalSetup: require.resolve("./e2e/global-setup"),
116
+ fullyParallel: true,
52
117
  forbidOnly: !!process.env.CI,
53
- retries: process.env.CI ? 2 : 0,
118
+ retries: process.env.CI ? 1 : 0,
54
119
  workers: 1, // serialize against shared remote Supabase — see e2e.md § Supabase Parallelism
55
- reporter: process.env.CI ? "github" : "html",
56
- globalSetup: "./apps/web/e2e/global-setup",
120
+ reporter: "list",
121
+ timeout: 30_000,
122
+ expect: {
123
+ toHaveScreenshot: { stylePath: "./e2e/screenshot.css" },
124
+ },
57
125
  use: {
58
- baseURL: getBaseUrl(),
126
+ baseURL: `http://localhost:${port}`,
127
+ storageState: "e2e/.auth/refreshed-state.json",
128
+ actionTimeout: 15_000,
59
129
  trace: "on-first-retry",
60
130
  screenshot: "only-on-failure",
61
131
  },
62
- projects: [
63
- { name: "setup", testMatch: /global\.setup\.ts/ },
64
- {
65
- name: "web",
66
- use: { ...devices["Desktop Chrome"], storageState: "apps/web/e2e/.auth/user.json" },
67
- dependencies: ["setup"],
68
- },
69
- ],
70
132
  webServer: {
71
- command: "pnpm --filter @codebyplan/web dev",
72
- url: getBaseUrl(),
133
+ command: `pnpm --filter @codebyplan/web dev --port ${port}`,
134
+ url: `http://localhost:${port}`,
73
135
  reuseExistingServer: !process.env.CI,
74
136
  timeout: 120_000,
137
+ env: {
138
+ // Forward Supabase + auth vars into the spawned dev server. Only forward
139
+ // vars that are present in process.env (undefined would stringify to "undefined").
140
+ ...(process.env.NEXT_PUBLIC_SUPABASE_URL && {
141
+ NEXT_PUBLIC_SUPABASE_URL: process.env.NEXT_PUBLIC_SUPABASE_URL,
142
+ }),
143
+ ...(process.env.NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY && {
144
+ NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY: process.env.NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY,
145
+ }),
146
+ ...(process.env.SUPABASE_SECRET_KEY && {
147
+ SUPABASE_SECRET_KEY: process.env.SUPABASE_SECRET_KEY,
148
+ }),
149
+ },
75
150
  },
151
+ projects: [
152
+ {
153
+ name: "chromium",
154
+ use: { ...devices["Desktop Chrome"] },
155
+ },
156
+ ],
76
157
  });
77
158
  ```
78
159
 
79
160
  ## Auth — Global Setup + Storage State
80
161
 
81
- `apps/web/e2e/global-setup.ts`:
162
+ `apps/web/e2e/global-setup.ts` performs two phases at startup:
82
163
 
83
- ```ts
84
- import { chromium, FullConfig } from "@playwright/test";
85
- import path from "path";
86
-
87
- const AUTH_FILE = path.join(__dirname, ".auth/user.json");
88
-
89
- export default async function globalSetup(config: FullConfig) {
90
- const email = process.env.E2E_TEST_EMAIL;
91
- const password = process.env.E2E_TEST_PASSWORD;
92
-
93
- if (!email || !password) {
94
- throw new Error(
95
- "E2E_TEST_EMAIL and E2E_TEST_PASSWORD must be set.\n" +
96
- "Copy .env.local.example to .env.local, then run: pnpm e2e:provision"
97
- );
98
- }
99
-
100
- const { baseURL } = config.projects[0].use;
101
- const browser = await chromium.launch();
102
- const page = await browser.newPage();
103
-
104
- await page.goto(`${baseURL}/login`);
105
- await page.getByLabel(/email/i).fill(email);
106
- await page.getByLabel(/password/i).fill(password);
107
- await page.getByRole("button", { name: /sign in|log in/i }).click();
108
- await page.waitForURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
109
-
110
- await page.goto(baseURL!); // cold-start warmup
111
- await page.context().storageState({ path: AUTH_FILE });
112
- await browser.close();
113
- }
114
- ```
164
+ **Phase 1 — Auth refresh**: reads `e2e/.auth/state.json`, finds the Supabase auth cookie
165
+ (`sb-<projectref>-auth-token`), decodes its base64-JSON payload (`decodeAuthCookie` from the
166
+ shared `e2e/auth-cookie.ts` module), calls `supabase.auth.refreshSession({refresh_token})` for
167
+ fresh tokens, re-encodes via `encodeAuthCookie`, and writes the result to
168
+ `e2e/.auth/refreshed-state.json`. No browser required — pure HTTP against Supabase auth.
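As a rough illustration of the decode and re-encode step in Phase 1, the sketch below round-trips a session object through a base64-JSON cookie value. It is an assumption-laden stand-in for `e2e/auth-cookie.ts`: the `base64-` prefix, the base64url alphabet, and the `SessionLike` shape are guesses for the example and should be checked against the actual cookie.

```typescript
// Hypothetical stand-ins for the helpers in e2e/auth-cookie.ts.
// Assumed format: literal prefix "base64-" followed by base64url-encoded JSON.
const BASE64_PREFIX = "base64-";

interface SessionLike {
  access_token: string;
  refresh_token: string;
  [key: string]: unknown;
}

function encodeAuthCookie(session: SessionLike): string {
  const json = JSON.stringify(session);
  return BASE64_PREFIX + Buffer.from(json, "utf-8").toString("base64url");
}

function decodeAuthCookie(value: string): SessionLike {
  // Tolerate a value without the prefix so a bare payload still decodes.
  const payload = value.startsWith(BASE64_PREFIX)
    ? value.slice(BASE64_PREFIX.length)
    : value;
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf-8")) as SessionLike;
}
```

In the flow described above, the decoded `refresh_token` would be passed to `supabase.auth.refreshSession`, and the refreshed session re-encoded before being written to `refreshed-state.json`.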
115
169
 
116
- Gitignore storage state before first use:
170
+ **Phase 2 - Maintainer seeding**: uses the service-role client to ensure the test user
171
+ holds a maintainer-or-above role on at least one organization (`e2e-test-fixture` slug).
172
+ Idempotent — if a qualifying membership already exists, phase 2 is a no-op.
117
173
 
118
- ```bash
119
- mkdir -p apps/web/e2e/.auth
120
- echo "apps/web/e2e/.auth/" >> .gitignore
121
- ```
174
+ **Required env vars** (read via `readEnv(name, fallbacks)`, which checks `process.env` first
175
+ and then falls back to the parsed `.env.local` file):
176
+
177
+ - `NEXT_PUBLIC_SUPABASE_URL` — both phases
178
+ - `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` — Phase 1 refresh client
179
+ - `SUPABASE_SECRET_KEY` — Phase 2 admin operations
180
+ - `E2E_USER_EMAIL` — Phase 2 user lookup + seeding
181
+
182
+ Because `readEnv` checks `process.env` before `.env.local`, loading these vars into
183
+ `process.env` via `loadE2eEnv()` in `playwright.config.ts` is sufficient — global-setup
184
+ will pick them up.
185
+
186
+ ### Seeding `state.json` — `e2e/auth.setup.ts` (`pnpm e2e:auth-setup`)
187
+
188
+ global-setup Phase 1 only *refreshes* an existing `state.json`; the initial seed is written by
189
+ `e2e/auth.setup.ts`. It is a pure-HTTP API seed — **no browser, no dev server, no hydration
190
+ timing**: it loads creds from `.env.local` + `.codebyplan/e2e.env`, calls
191
+ `supabase.auth.signInWithPassword({email, password})` with the publishable-key client, derives
192
+ the project ref from `NEXT_PUBLIC_SUPABASE_URL`, and writes a `sb-<projectref>-auth-token`
193
+ cookie (domain `localhost`) into `state.json` using the same `encodeAuthCookie` from
194
+ `e2e/auth-cookie.ts` that global-setup consumes. This makes seeding deterministic in any
195
+ worktree — run `pnpm e2e:auth-setup` (optionally `--port N`) when `state.json` is missing or
196
+ its refresh token has expired. Do NOT reintroduce a browser-login flow (the `(auth)/login`
197
+ page is a client component whose `onSubmit` only attaches after hydration — clicking submit
198
+ pre-hydration falls through to a native GET and never authenticates).
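The cookie-writing part of the seed described above might look roughly like this. It is a sketch under stated assumptions: `projectRefFromUrl`, `buildStorageState`, and the chosen cookie attributes (`path`, `httpOnly`, `secure`, `sameSite`) are illustrative, and the encoded value is taken as an input rather than produced here. The object layout follows Playwright's storage-state file format (`cookies` plus `origins`).

```typescript
// Derive "<projectref>" from a URL such as https://<projectref>.supabase.co.
function projectRefFromUrl(supabaseUrl: string): string {
  const ref = new URL(supabaseUrl).hostname.split(".")[0];
  if (!ref) throw new Error(`Cannot derive project ref from ${supabaseUrl}`);
  return ref;
}

interface StorageStateCookie {
  name: string;
  value: string;
  domain: string;
  path: string;
  expires: number; // Unix seconds; -1 means session cookie
  httpOnly: boolean;
  secure: boolean;
  sameSite: "Strict" | "Lax" | "None";
}

interface StorageState {
  cookies: StorageStateCookie[];
  origins: { origin: string; localStorage: { name: string; value: string }[] }[];
}

// Build the object that would be serialized to e2e/.auth/state.json.
// Attribute values below are assumptions for a localhost dev server.
function buildStorageState(
  supabaseUrl: string,
  encodedCookieValue: string,
  expiresAtSeconds: number,
): StorageState {
  return {
    cookies: [
      {
        name: `sb-${projectRefFromUrl(supabaseUrl)}-auth-token`,
        value: encodedCookieValue,
        domain: "localhost",
        path: "/",
        expires: expiresAtSeconds,
        httpOnly: false,
        secure: false,
        sameSite: "Lax",
      },
    ],
    origins: [],
  };
}
```

Because the cookie domain is `localhost` with no port, the same `state.json` stays valid whichever port the worktree resolves to.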
122
199
 
123
200
  ## Auth Probe
124
201
 
125
- `apps/web/e2e/_probe/auth.spec.ts` — validates the login path directly (outside storage-
126
- state flow) so credential failures are diagnosed cleanly:
202
+ `apps/web/e2e/_probe/auth.spec.ts` — verifies that the stored auth state
203
+ (`refreshed-state.json`) grants access to the authenticated dashboard without
204
+ redirecting to the login page. It is intentionally minimal (one test) and runs
205
+ before the full suite to confirm the auth preflight:
127
206
 
128
207
  ```ts
129
208
  import { test, expect } from "@playwright/test";
130
209
 
131
- test("auth probe: can log in with E2E_TEST_EMAIL/E2E_TEST_PASSWORD", async ({ page }) => {
132
- const email = process.env.E2E_TEST_EMAIL;
133
- const password = process.env.E2E_TEST_PASSWORD;
134
- expect(email, "E2E_TEST_EMAIL env var is required").toBeTruthy();
135
- expect(password, "E2E_TEST_PASSWORD env var is required").toBeTruthy();
136
-
137
- await page.goto("/login");
138
- await page.getByLabel(/email/i).fill(email!);
139
- await page.getByLabel(/password/i).fill(password!);
140
- await page.getByRole("button", { name: /sign in|log in/i }).click();
210
+ test("auth probe: authenticated user reaches /dashboard without login redirect", async ({
211
+ page,
212
+ }) => {
213
+ const response = await page.goto("/dashboard", {
214
+ waitUntil: "domcontentloaded",
215
+ timeout: 20_000,
216
+ });
141
217
 
142
- await expect(page).toHaveURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
218
+ // Must NOT be redirected to /login or /auth/login.
219
+ const finalUrl = page.url();
220
+ expect(
221
+ finalUrl,
222
+ `Auth probe failed: landed on ${finalUrl} instead of /dashboard. Check state.json / refreshed-state.json.`
223
+ ).not.toMatch(/\/(auth\/)?login/);
224
+
225
+ // HTTP status must be < 400.
226
+ const status = response?.status() ?? 0;
227
+ expect(
228
+ status,
229
+ `Auth probe failed: /dashboard returned HTTP ${status}.`
230
+ ).toBeLessThan(400);
231
+
232
+ // Dashboard heading must be present.
233
+ await expect(
234
+ page.getByRole("heading", { level: 1, name: /welcome to codebyplan|dashboard/i })
235
+ ).toBeVisible({ timeout: 15_000 });
143
236
  });
144
237
  ```
145
238
 
146
- Run probe: `pnpm exec playwright test --project=web _probe/auth`
239
+ Run probe: `pnpm exec playwright test --project=chromium _probe/auth`
147
240
 
148
241
  ## Pre-flight Probes (Step 6.5.2)
149
242
 
150
243
  **Dev server**: `curl -s -o /dev/null -w "%{http_code}" http://localhost:{port}/` — expect
151
244
  200/3xx. On failure:
152
245
 
153
- > "Dev server is not responding on port `{port}`. Please run `cd apps/{app} && pnpm dev`
246
+ > "Dev server is not responding on port `{port}`. Please run `cd apps/web && pnpm dev --port {port}`
154
247
  > in a separate terminal, then reply 'ready' when the page loads in your browser."
155
248
 
156
- **Port alignment**: parse `playwright.config.ts` `baseURL` port; compare to
157
- `.codebyplan/server.json` `port_allocations[]`. On mismatch ask which is correct, then
158
- propose an Edit to align them.
249
+ Note: Playwright's `webServer` block behaviour differs by environment. In local worktree
250
+ runs (`reuseExistingServer: true`), Playwright reuses an already-running server and only
251
+ auto-starts one when nothing is listening on the port — this probe is mainly a safety net.
252
+ In CI (`reuseExistingServer: false`), Playwright always spawns its own server and throws if
253
+ another process is already listening on the port, so the dev-server readiness probe is the
254
+ active guard for that path.
255
+
256
+ **Port alignment**: parse `playwright.config.ts` `baseURL` port; compare to the resolved
257
+ port from `.codebyplan/server.local.json` (worktree overlay, checked first) then
258
+ `.codebyplan/server.json` (committed base). On mismatch ask which is correct, then propose
259
+ an Edit to align them.
159
260
 
160
261
  ## Spec-Writing Patterns
161
262
 
@@ -227,7 +328,7 @@ Include this in the specialist output alongside `screenshots[]`.
  ## Run Command

  ```bash
- pnpm exec playwright test {spec} --project=web --reporter=list
+ pnpm exec playwright test {spec} --project=chromium --reporter=list
  ```

  ## Selector Conventions
@@ -238,13 +339,13 @@ the new page state rather than holding stale `Locator` handles.
238
339
 
239
340
  ## CI Secrets
240
341
 
241
- `E2E_TEST_EMAIL`, `E2E_TEST_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
242
- `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` (or legacy `_ANON_KEY`).
342
+ `E2E_USER_EMAIL`, `E2E_USER_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
343
+ `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY`, `SUPABASE_SECRET_KEY`.
243
344
 
244
345
  ## Pitfalls
245
346
 
246
347
  **Cold-start timeouts** — warmup in `globalSetup` (after `page.goto(baseURL!)`) primes
247
- Turbopack compilation. **Port mismatch** — compare `baseURL` port to `server.json` before
248
- running. **Supabase parallelism** — remote Supabase requires `workers: 1` to prevent
249
- auth/RLS races. **SCSS Module selectors** — use `[class*='componentName'].first()` or
250
- role-based selectors.
348
+ Turbopack compilation. **Port mismatch** — compare `baseURL` port to resolved port from
349
+ `server.local.json` / `server.json` before running. **Supabase parallelism** — remote
350
+ Supabase requires `workers: 1` to prevent auth/RLS races. **SCSS Module selectors** — use
351
+ `[class*='componentName'].first()` or role-based selectors.
@@ -118,11 +118,12 @@ with the blocking preflight field populated.
118
118
 
119
119
  ### 6.5.1 Environment Variables
120
120
 
121
- Check `apps/{app}/.env.local` and process env. Framework-specific required var names come
122
- from the `credential_vars` input field (the dispatching skill reads them from
123
- `.codebyplan/e2e.json`). Naming conventions:
121
+ Check `apps/{app}/.env.local`, `.codebyplan/e2e.env`, and process env. Framework-specific
122
+ required var names come from the `credential_vars` input field (the dispatching skill reads
123
+ them from `.codebyplan/e2e.json`). Naming conventions:
124
124
 
125
- - Playwright uses `E2E_TEST_*` (avoids collision with non-E2E `TEST_*` vars).
125
+ - Playwright uses `E2E_USER_EMAIL` / `E2E_USER_PASSWORD` (matches `.codebyplan/e2e.json`
126
+ `credentials.frameworks.playwright` and the `e2e.env` file loaded by `playwright.config.ts`).
126
127
  - Maestro/XCUITest stay on `TEST_*` per `rules/maestro-auth-state-reset.md`.
127
128
 
128
129
  For any missing var:
@@ -148,8 +149,9 @@ On any failure, `AskUserQuestion` with remediation steps; re-probe after "ready"
148
149
  silently skip a required runtime prerequisite.
149
150
 
150
151
  **Port alignment (Playwright only)**: parse `playwright.config.ts` `baseURL` and compare
151
- to `.codebyplan/server.json` `port_allocations[]` for the app. On mismatch ask which is
152
- correct before running.
152
+ to the port resolved from `port_allocations[]` for the app: `.codebyplan/server.local.json`
153
+ (worktree overlay, checked first), then `.codebyplan/server.json` (committed base). On
154
+ mismatch, ask which is correct before running.
153
155
 
154
156
  ### 6.5.3 Auth Probe (only when `has_auth`)
155
157
 
@@ -303,9 +305,12 @@ Every repo with Playwright auth ships:
303
305
 
304
306
  - `scripts/provision-e2e-user.ts` — idempotent script creating the canonical E2E user
305
307
  and (for multi-tenant repos) a `test` subdomain. Wired to `pnpm e2e:provision`.
306
- - `.env.local.example` — lists every env var `globalSetup` requires.
307
- - CI secrets: `E2E_TEST_EMAIL`, `E2E_TEST_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
308
- `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` (or legacy `_ANON_KEY`).
308
+ - `.codebyplan/e2e.env` — gitignored per-worktree file listing every env var `globalSetup`
309
+ requires. Written by `codebyplan e2e:provision` or manually. Loaded by `playwright.config.ts`
310
+ into `process.env` so global-setup picks them up via its `readEnv` helper.
311
+ - `.env.local.example` — lists every env var `globalSetup` requires for reference.
312
+ - CI secrets: `E2E_USER_EMAIL`, `E2E_USER_PASSWORD`, `NEXT_PUBLIC_SUPABASE_URL`,
313
+ `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY`, `SUPABASE_SECRET_KEY`.
309
314
 
310
315
  Per-repo specifics (email, vault name, remaining-spec migration list) live in the repo's
311
316
  own `docs/e2e-setup.md`, not in this shared file.
@@ -226,7 +226,7 @@ After a `complete_round` MCP call succeeds, reconciles the round's `files_change
226
226
 
227
227
  ### `cbp-cmux-workspace-sync.sh` — SessionStart, matcher `*`
228
228
 
229
- On every session start, syncs the active [cmux](https://github.com/nicholasgasior/cmux) workspace title to the current git branch and the workspace description to the repo folder basename (the directory that contains `.git/`).
229
+ On every session start, syncs the active [cmux](https://github.com/nicholasgasior/cmux) workspace title to the current git branch, the workspace description to the repo folder basename (the directory that contains `.git/`), and applies the workspace color from `.codebyplan/cmux.json` via `cmux workspace-action --action set-color`. All three actions are delegated to `codebyplan cmux-sync`. If no `workspace_color` is configured, a one-line nudge is printed to stdout prompting the user to run `/cbp-setup-cmux`.
230
230
 
231
231
  **Blocks vs warns**: never blocks — exit 0 on every path. A SessionStart hook must never prevent a session from opening.
232
232
 
@@ -252,6 +252,28 @@ After any Bash tool call that contains a `git checkout` or `git switch` invocati
252
252
 
253
253
  ---
254
254
 
255
+ ### Auto dev server (`codebyplan cmux-serve`)
256
+
257
+ At the start of each round, `cbp-round-execute` (Step 2a) calls `codebyplan cmux-serve --files "<round files>"` to auto-start the dev server for any app whose source files are touched. The subcommand probes each allocated port via `node:net`, starts a `cmux new-split` terminal pane + sends the dev command for any non-listening app, then opens a browser pane. If the port is already listening (another worktree) it only opens the browser pane. No hook registration is needed — the skill invokes the subcommand directly. Gated by `auto_dev_server` in `.codebyplan/cmux.json`; no-op outside cmux.
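The port probe mentioned above can be approximated with a short TCP connect check. This is a sketch of the general technique using `node:net`, not the `cmux-serve` implementation; the function name `isPortListening`, the default host, and the timeout are assumptions.

```typescript
import { createConnection } from "node:net";

// Resolve true if something accepts a TCP connection on host:port,
// false on refusal, other socket errors, or timeout.
function isPortListening(
  port: number,
  host = "127.0.0.1",
  timeoutMs = 500,
): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = createConnection({ port, host });
    const finish = (listening: boolean) => {
      socket.destroy(); // release the socket in every outcome
      resolve(listening);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}
```

A caller would start the dev server only when the probe resolves `false`, and open just the browser pane when it resolves `true`.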
258
+
259
+ ---
260
+
261
+ ### Status surface (`codebyplan cmux-status`)
262
+
263
+ The lifecycle skills push CodeByPlan development state into the cmux workspace sidebar via `codebyplan cmux-status`. No hook registration is needed — the skills invoke the subcommand directly:
264
+
265
+ | Skill | What is pushed |
266
+ | --- | --- |
267
+ | `cbp-task-start` (Step 4.5) | `--checkpoint "CHK-NNN: title" --task "TASK-N: title"` |
268
+ | `cbp-task-complete` (Step 7.3) | `--task "TASK-N: title done" --progress completed/total` |
269
+ | `cbp-round-execute` (Step 3d) | `--qa "R{n} {status}"` where status ∈ completed / blocked / re-triggering |
270
+
271
+ **`auto_status` toggle.** Gated by the `auto_status` field in `.codebyplan/cmux.json` (configured via `/cbp-setup-cmux`). When `auto_status` is `false`, every call is a no-op. Default is `true` (enabled).
272
+
273
+ **No-op outside cmux.** `codebyplan cmux-status` checks for `$CMUX_WORKSPACE_ID` before doing anything. Outside a cmux workspace it exits immediately — safe to call unconditionally from skills and hooks.
274
+
275
+ ---
276
+
255
277
  ## Supporting (not registered)
256
278
 
257
279
  ### `test-hooks.sh` — invoked by `auto-test-hooks.sh`
@@ -124,6 +124,7 @@
  "Skill(cbp-round-update)",
  "Skill(cbp-session-end)",
  "Skill(cbp-session-start)",
+ "Skill(cbp-setup-cmux)",
  "Skill(cbp-setup-e2e)",
  "Skill(cbp-setup-eslint)",
  "Skill(cbp-ship-configure)",
@@ -135,6 +136,11 @@
  "Skill(cbp-supabase-branch-check)",
  "Skill(cbp-supabase-migrate)",
  "Skill(cbp-supabase-setup)",
+ "Skill(cbp-standalone-task-check)",
+ "Skill(cbp-standalone-task-complete)",
+ "Skill(cbp-standalone-task-create)",
+ "Skill(cbp-standalone-task-start)",
+ "Skill(cbp-standalone-task-testing)",
  "Skill(cbp-task-check)",
  "Skill(cbp-task-complete)",
  "Skill(cbp-task-create)",
@@ -196,6 +202,10 @@
  "Bash(npx codebyplan resolve-worktree:*)",
  "Bash(codebyplan cmux-sync:*)",
  "Bash(npx codebyplan cmux-sync:*)",
+ "Bash(codebyplan cmux-status:*)",
+ "Bash(npx codebyplan cmux-status:*)",
+ "Bash(codebyplan cmux-serve:*)",
+ "Bash(npx codebyplan cmux-serve:*)",
  "Bash(codebyplan version-status:*)",
  "Bash(npx codebyplan version-status:*)",
  "Bash(codebyplan statusline:*)",
@@ -142,6 +142,17 @@ cp "$MAIN_REPO/.env.local" "$WORKTREE_PATH/.env.local"
142
142
 
143
143
  Verify `.env.local` is already in `.gitignore` (it should be via `.env.local` pattern). If not, add it.
144
144
 
145
+ Also copy the gitignored E2E credentials source (`.codebyplan/e2e.env`, referenced by `.codebyplan/e2e.json`) so the new worktree can run Playwright auth flows immediately:
146
+
147
+ ```bash
148
+ mkdir -p "$WORKTREE_PATH/.codebyplan"
149
+ if [ -f "$MAIN_REPO/.codebyplan/e2e.env" ]; then
150
+ cp "$MAIN_REPO/.codebyplan/e2e.env" "$WORKTREE_PATH/.codebyplan/e2e.env"
151
+ fi
152
+ ```
153
+
154
+ If the main repo has no `.codebyplan/e2e.env` yet, provision it after setup by running `codebyplan ports --path "$WORKTREE_PATH" --provision-e2e` (copies the canonical E2E vars from `apps/web/.env.local`). Pass `--path` BEFORE the boolean flag. `.codebyplan/e2e.env` is gitignored — never commit it.
155
+
145
156
  ### Step 8: Push Branch
146
157
 
147
158
  ```bash
@@ -57,6 +57,16 @@ Read the plan from round context (`context.planner_output`). If no plan: `No app
57
57
 
58
58
  Read effective testing profile: `round.context.testing_profile_override` if set (user override for this round only), else `task.context.testing_profile` (set by planner Phase 4.8), else default `'web'`. Pass the effective profile to all per-wave `cbp-testing-qa-agent` spawns.
59
59
 
60
+ ### Step 2a: Auto-Dev-Server (cmux)
61
+
62
+ Run the `codebyplan cmux-serve` subcommand at round-execution start. It is a no-op outside cmux or when `auto_dev_server` is disabled in `.codebyplan/cmux.json`.
63
+
64
+ ```bash
65
+ npx codebyplan cmux-serve --files "<comma-separated approved_plan.files_to_modify[].path>"
66
+ ```
67
+
68
+ The subcommand reads `.codebyplan/server.json` `port_allocations[]`, resolves which apps' source dirs intersect the round's files, probes each allocated port, and starts a cmux terminal split + browser pane for any app not already serving. Idempotent — if the port is already listening it only opens the browser pane (mitigating the multi-worktree port collision).
69
+
60
70
  ### Step 3: Route Execution Path
61
71
 
62
72
  Inspect `approved_plan.files_to_modify[]` and `approved_plan.round_type`. Four execution paths exist; pick the one that matches BEFORE Step 3a/3b.
@@ -143,6 +153,14 @@ If the approved plan includes database schema changes, RLS policies, or type gen
143
153
  - `status: 'blocked'` → present blocker to user via AskUserQuestion, resolve, re-spawn executor with remaining work
144
154
  - Deliverables incomplete → re-spawn executor with remaining deliverables (max 3 re-triggers). After 3 re-triggers, save partial output and proceed.
145
155
 
156
+ ### Step 3d: Push cmux QA Status
157
+
158
+ Push the round's QA outcome to the cmux workspace sidebar. Self-no-ops outside cmux or when `auto_status` is disabled. Status is one of: `completed`, `blocked`, or `re-triggering`.
159
+
160
+ ```bash
161
+ npx codebyplan cmux-status --qa "R{round_number} {status}"
162
+ ```
163
+
146
164
  ### Step 4: Dev-Server Probe (rounds 2+, web/desktop profile)
147
165
 
148
166
  When `round_number >= 2` AND `testing_profile` is `'web'` or `'desktop'` AND `files_changed` contains any UI file, probe the dev server BEFORE cbp-testing-qa-agent spawns (saves a full agent spawn when the server is down).