codebyplan 1.11.1 → 1.11.2

Files changed (38)
  1. package/dist/cli.js +56 -5
  2. package/package.json +1 -1
  3. package/templates/README.md +1 -1
  4. package/templates/agents/cbp-cc-executor.md +1 -1
  5. package/templates/agents/cbp-e2e-maestro.md +202 -0
  6. package/templates/agents/cbp-e2e-playwright.md +229 -0
  7. package/templates/agents/cbp-e2e-tauri.md +184 -0
  8. package/templates/agents/cbp-e2e-vscode.md +203 -0
  9. package/templates/agents/cbp-e2e-xcuitest.md +224 -0
  10. package/templates/agents/cbp-improve-claude.md +1 -1
  11. package/templates/agents/cbp-round-executor.md +11 -11
  12. package/templates/agents/cbp-task-check.md +1 -1
  13. package/templates/agents/cbp-task-planner.md +2 -0
  14. package/templates/agents/cbp-testing-qa-agent.md +9 -9
  15. package/templates/context/testing/e2e.md +303 -0
  16. package/templates/hooks/validate-structure-lengths.sh +2 -0
  17. package/templates/hooks/validate-structure-smoke.sh +2 -1
  18. package/templates/hooks/validate-structure-templates.sh +1 -0
  19. package/templates/rules/context-file-loading.md +4 -1
  20. package/templates/rules/e2e-mandatory.md +70 -0
  21. package/templates/skills/cbp-build-cc-agent/SKILL.md +16 -14
  22. package/templates/skills/cbp-build-cc-agent/reference/cbp-quality.md +4 -4
  23. package/templates/skills/cbp-build-cc-agent/scripts/validate-agent.sh +8 -6
  24. package/templates/skills/cbp-build-cc-mode/SKILL.md +4 -4
  25. package/templates/skills/cbp-checkpoint-check/SKILL.md +12 -8
  26. package/templates/skills/cbp-checkpoint-plan/SKILL.md +2 -2
  27. package/templates/skills/cbp-checkpoint-plan/reference/e2e-discovery-probe.md +5 -5
  28. package/templates/skills/cbp-e2e-setup/SKILL.md +254 -0
  29. package/templates/skills/cbp-e2e-setup/reference/maestro.md +200 -0
  30. package/templates/skills/cbp-e2e-setup/reference/playwright.md +212 -0
  31. package/templates/skills/cbp-e2e-setup/reference/tauri.md +147 -0
  32. package/templates/skills/cbp-e2e-setup/reference/vscode.md +154 -0
  33. package/templates/skills/cbp-e2e-setup/reference/xcuitest.md +185 -0
  34. package/templates/skills/cbp-frontend-ui/SKILL.md +6 -6
  35. package/templates/skills/cbp-frontend-ux/SKILL.md +1 -1
  36. package/templates/skills/cbp-round-execute/SKILL.md +30 -17
  37. package/templates/skills/cbp-task-check/SKILL.md +2 -2
  38. package/templates/agents/cbp-test-e2e-agent.md +0 -363
@@ -0,0 +1,200 @@
1
+ # Maestro Reference
2
+
3
+ Full install, config, flows, and CI walkthrough for Maestro on an Expo/React Native project.
4
+ Source: vendor/maestro/v2.6 + `.claude/context/testing/e2e.md`.
5
+
6
+ ## Prerequisites
7
+
8
+ - Java 17 or later: `java -version` (install via `brew install openjdk@17` on macOS)
9
+ - Android emulator (for android targets) or iOS Simulator (for ios targets)
10
+ - Expo app bundled and running on the target device/emulator
11
+
12
+ ## Install
13
+
14
+ ```bash
15
+ # macOS — recommended
16
+ curl -fsSL "https://get.maestro.mobile.dev" | bash
17
+
18
+ # Alternative: Homebrew tap
19
+ brew tap mobile-dev-inc/tap
20
+ brew install maestro
21
+ ```
22
+
23
+ Verify:
24
+
25
+ ```bash
26
+ maestro --version
27
+ ```
28
+
29
+ Update later with: `maestro upgrade`
30
+
31
+ ## maestro/config.yaml
32
+
33
+ Create `maestro/config.yaml` at the repo root:
34
+
35
+ ```yaml
36
+ # Maestro workspace configuration
37
+ # See: https://docs.maestro.dev/maestro-flows/workspace-management/repository-configuration
38
+
39
+ appId: com.yourorg.yourapp # must match app.config.ts / expo config
40
+ env:
41
+   TEST_EMAIL: ${TEST_EMAIL}
41
+   TEST_PASSWORD: ${TEST_PASSWORD}
42
+   APP_ID: com.yourorg.yourapp
44
+ ```
45
+
46
+ `appId` must match the value in `app.config.ts` `ios.bundleIdentifier` /
47
+ `android.package`. If they differ across platforms, use the Android package ID for
48
+ Maestro's `appId` on Android and the iOS bundle ID on iOS tests.
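The per-platform rule above can be sketched as a small helper. This is only an illustration under stated assumptions: `AppIds` and `resolveAppId` are hypothetical names, not part of Maestro or Expo, and the identifiers are placeholders.

```typescript
// Hypothetical helper: choose the identifier Maestro should use as `appId`
// for a given platform. All names and values are illustrative placeholders.
type Platform = "android" | "ios";

interface AppIds {
  androidPackage: string; // from the Expo config: android.package
  iosBundleIdentifier: string; // from the Expo config: ios.bundleIdentifier
}

function resolveAppId(platform: Platform, ids: AppIds): string {
  return platform === "android" ? ids.androidPackage : ids.iosBundleIdentifier;
}

const ids: AppIds = {
  androidPackage: "com.yourorg.yourapp.android",
  iosBundleIdentifier: "com.yourorg.yourapp",
};

console.log(resolveAppId("android", ids));
console.log(resolveAppId("ios", ids));
```

The resolved value could then be supplied to the flows as `APP_ID`, so that `appId: ${APP_ID}` picks up the right identifier on each platform.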
49
+
50
+ ## Shared login flow
51
+
52
+ Create `maestro/flows/_shared/login.yaml`:
53
+
54
+ ```yaml
55
+ appId: ${APP_ID}
56
+ ---
57
+ - launchApp:
58
+     clearState: true
59
+ - assertVisible: "Sign in"
60
+ - tapOn: "Email"
61
+ - inputText: ${TEST_EMAIL}
62
+ - tapOn: "Password"
63
+ - inputText: ${TEST_PASSWORD}
64
+ - tapOn: "Sign in"
65
+ - assertVisible:
66
+     text: ".*" # Replace with a post-login element (e.g. "Dashboard")
67
+     timeout: 15000
68
+ ```
69
+
70
+ Reference it from any other flow:
71
+
72
+ ```yaml
73
+ - runFlow: _shared/login.yaml
74
+ ```
75
+
76
+ ## Auth probe flow
77
+
78
+ `maestro/flows/_probe/auth.yaml` — minimal login verification:
79
+
80
+ ```yaml
81
+ appId: ${APP_ID}
82
+ tags:
83
+ - probe
84
+ ---
85
+ - launchApp:
86
+     clearState: true
87
+ - assertVisible: "Sign in"
88
+ - tapOn: "Email"
89
+ - inputText: ${TEST_EMAIL}
90
+ - tapOn: "Password"
91
+ - inputText: ${TEST_PASSWORD}
92
+ - tapOn: "Sign in"
93
+ - assertVisible:
94
+     text: ".*" # Replace with a post-login element (e.g. "Dashboard"); ".*" matches any visible text
95
+     timeout: 15000
96
+ ```
97
+
98
+ Run the probe before the full suite: `maestro test maestro/flows/_probe/auth.yaml`
99
+
100
+ ## Platform targeting
101
+
102
+ Maestro v2.6 exposes `-p` / `--platform` as a global option (placed BEFORE the `test`
103
+ subcommand). Values: `android`, `ios`, or `web`.
104
+
105
+ Run on Android: start an Android emulator, then
106
+
107
+ ```bash
108
+ maestro --platform=android test maestro/flows/
109
+ ```
110
+
111
+ Run on iOS: boot an iOS Simulator, then
112
+
113
+ ```bash
114
+ maestro --platform=ios test maestro/flows/
115
+ ```
116
+
117
+ Omitting the flag is also valid — platform is then implicit from whichever single
118
+ emulator/simulator is currently running.
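The flag ordering described above can be sketched as a tiny argument builder. This is a hedged illustration: `buildMaestroArgs` is a hypothetical helper, and it covers only the `--platform` flag and the `test` subcommand as they are described in this section.

```typescript
// Sketch: build the argument list for a Maestro run, keeping the platform
// flag ahead of the `test` subcommand. The helper name is hypothetical.
type MaestroPlatform = "android" | "ios" | "web";

function buildMaestroArgs(flowsPath: string, platform?: MaestroPlatform): string[] {
  // Omitting the platform leaves the choice to whichever device is running.
  const globalArgs = platform ? [`--platform=${platform}`] : [];
  return [...globalArgs, "test", flowsPath];
}

console.log(["maestro", ...buildMaestroArgs("maestro/flows/", "android")].join(" "));
console.log(["maestro", ...buildMaestroArgs("maestro/flows/")].join(" "));
```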
119
+
120
+ Target a specific device by UDID / emulator name:
121
+
122
+ ```bash
123
+ maestro test --device <device-id> maestro/flows/
124
+ ```
125
+
126
+ ## Directory structure
127
+
128
+ ```
129
+ maestro/
130
+   config.yaml
131
+   flows/
132
+     _shared/
133
+       login.yaml
134
+       open-side-menu.yaml
135
+     _probe/
136
+       auth.yaml
137
+     onboarding/
138
+       signup.yaml
139
+     home/
140
+       dashboard.yaml
141
+ ```
142
+
143
+ One subdirectory per app module under `maestro/flows/`. Shared flows under `_shared/`.
144
+
145
+ ## Screenshots
146
+
147
+ ```yaml
148
+ - takeScreenshot: "after-login"
149
+ ```
150
+
151
+ Configure a repo-local screenshots path in `maestro/config.yaml`:
152
+
153
+ ```yaml
154
+ screenshotsDir: maestro/screenshots
155
+ ```
156
+
157
+ ## pnpm scripts
158
+
159
+ Add to root `package.json`:
160
+
161
+ ```json
162
+ {
163
+ "scripts": {
164
+ "maestro:test": "maestro test maestro/flows/",
165
+ "maestro:test:probe": "maestro test maestro/flows/_probe/",
166
+ "maestro:studio": "maestro studio"
167
+ }
168
+ }
169
+ ```
170
+
171
+ ## CI (GitHub Actions)
172
+
173
+ Maestro CI runs require a connected device. For GitHub Actions use
174
+ [Maestro Cloud](https://cloud.maestro.dev) or a self-hosted runner with a connected
175
+ device. A minimal Maestro Cloud step:
176
+
177
+ ```yaml
178
+ - name: Run Maestro flows
179
+   uses: mobile-dev-inc/action-maestro-cloud@v1
180
+   with:
181
+     api-key: ${{ secrets.MAESTRO_CLOUD_API_KEY }}
182
+     app-file: path/to/app.apk # or .ipa
183
+     flow-file: maestro/flows/
184
+   env:
185
+     TEST_EMAIL: ${{ secrets.TEST_EMAIL }}
186
+     TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}
187
+ ```
188
+
189
+ For a self-hosted runner, set `TEST_EMAIL` and `TEST_PASSWORD` as runner env vars.
190
+
191
+ ## Pitfalls
192
+
193
+ **App ID mismatch** — `appId` in config.yaml must exactly match the compiled bundle
194
+ identifier. Re-run `expo prebuild` if you changed the identifier after prebuild.
195
+
196
+ **clearState: true** — always clear app state in `launchApp` for the login flow so
197
+ each run starts from a signed-out state.
198
+
199
+ **Java version** — Maestro requires Java 17+. If `maestro --version` fails, check
200
+ `JAVA_HOME` or install via Homebrew.
@@ -0,0 +1,212 @@
1
+ # Playwright Reference
2
+
3
+ Full install, config, auth, and CI walkthrough for Playwright on a Next.js monorepo.
4
+ Source: vendor/playwright/v1.60 + `.claude/context/testing/e2e.md`.
5
+
6
+ ## Install
7
+
8
+ ```bash
9
+ pnpm add -D @playwright/test
10
+ pnpm exec playwright install chromium
11
+ ```
12
+
13
+ For CI with system dependencies:
14
+
15
+ ```bash
16
+ pnpm exec playwright install --with-deps chromium
17
+ ```
18
+
19
+ ## playwright.config.ts
20
+
21
+ Derive `baseURL` from `.codebyplan/server.json` at config-read time:
22
+
23
+ ```ts
24
+ import { defineConfig, devices } from "@playwright/test";
25
+ import { execSync } from "child_process";
26
+ import path from "path";
27
+
28
+ // Pull the Web Dev port from server.json so config stays in sync. Match by label
29
+ // rather than server_type — a repo can have several nextjs allocations, so
30
+ // array-position head -1 is not stable.
31
+ function getBaseUrl(): string {
32
+ try {
33
+ const raw = execSync(
34
+ "jq -r '.port_allocations[] | select(.label==\"Web Dev\") | .port' .codebyplan/server.json 2>/dev/null | head -1",
35
+ { encoding: "utf-8" }
36
+ ).trim();
37
+ const port = parseInt(raw, 10);
38
+ return Number.isFinite(port) ? `http://localhost:${port}` : "http://localhost:3010"; // empty jq output parses to NaN
39
+ } catch {
40
+ return "http://localhost:3010"; // fallback
41
+ }
42
+ }
43
+
44
+ export default defineConfig({
45
+ testDir: "apps/web/e2e",
46
+ fullyParallel: false,
47
+ forbidOnly: !!process.env.CI,
48
+ retries: process.env.CI ? 2 : 0,
49
+ workers: 1, // serialize against shared remote Supabase — see e2e.md § Supabase Parallelism
50
+ reporter: process.env.CI ? "github" : "html",
51
+ globalSetup: "./apps/web/e2e/global-setup", // string path — resolved relative to config; safe under ESM
52
+ use: {
53
+ baseURL: getBaseUrl(),
54
+ trace: "on-first-retry",
55
+ screenshot: "only-on-failure",
56
+ },
57
+ projects: [
58
+ { name: "setup", testMatch: /global\.setup\.ts/ },
59
+ {
60
+ name: "web",
61
+ use: { ...devices["Desktop Chrome"], storageState: "apps/web/e2e/.auth/user.json" },
62
+ dependencies: ["setup"],
63
+ },
64
+ ],
65
+ webServer: {
66
+ command: "pnpm --filter @codebyplan/web dev",
67
+ url: getBaseUrl(),
68
+ reuseExistingServer: !process.env.CI,
69
+ timeout: 120_000,
70
+ },
71
+ });
72
+ ```
73
+
74
+ Key options:
75
+
76
+ | Option | Why |
77
+ | --- | --- |
78
+ | `workers: 1` | Prevents auth/RLS races on a shared remote Supabase project |
79
+ | `globalSetup` | Logs in once, writes `storageState` so tests start authenticated |
80
+ | `reuseExistingServer` | Skip dev-server startup when already running locally |
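
The label-based lookup that the config performs with `jq` can also be sketched in plain TypeScript, which avoids depending on `jq` being installed. This is a sketch under assumptions: the field names `port_allocations`, `label`, and `port` are taken from the jq filter in the config, while `ServerJson`, `baseUrlFor`, and the sample data are hypothetical.

```typescript
// Sketch: resolve a base URL from already-parsed server.json content.
// Field names mirror the jq filter in the config; the helper is hypothetical.
interface PortAllocation {
  label: string;
  port: number;
}

interface ServerJson {
  port_allocations: PortAllocation[];
}

const FALLBACK_PORT = 3010; // same fallback value as the config

function baseUrlFor(server: ServerJson | null, label: string): string {
  // Match by label, not by array position, so extra allocations are harmless.
  const match = server?.port_allocations.find((a) => a.label === label);
  const port = match && Number.isInteger(match.port) ? match.port : FALLBACK_PORT;
  return `http://localhost:${port}`;
}

// Illustrative data only.
const sample: ServerJson = {
  port_allocations: [
    { label: "Docs Dev", port: 3020 },
    { label: "Web Dev", port: 3015 },
  ],
};

console.log(baseUrlFor(sample, "Web Dev"));
console.log(baseUrlFor(null, "Web Dev"));
```

In a real config the `server` argument would come from reading and parsing `.codebyplan/server.json` inside a `try`/`catch`, passing `null` when the file is missing or invalid.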
81
+
82
+ ## Auth — global setup + storage state
83
+
84
+ Create `apps/web/e2e/global-setup.ts`:
85
+
86
+ ```ts
87
+ import { chromium, FullConfig } from "@playwright/test";
88
+ import path from "path";
89
+
90
+ const AUTH_FILE = path.join(__dirname, ".auth/user.json");
91
+
92
+ export default async function globalSetup(config: FullConfig) {
93
+ const email = process.env.E2E_TEST_EMAIL;
94
+ const password = process.env.E2E_TEST_PASSWORD;
95
+
96
+ if (!email || !password) {
97
+ throw new Error(
98
+ "E2E_TEST_EMAIL and E2E_TEST_PASSWORD must be set.\n" +
99
+ "Copy .env.local.example to .env.local, then run: pnpm e2e:provision"
100
+ );
101
+ }
102
+
103
+ const { baseURL } = config.projects[0].use;
104
+ const browser = await chromium.launch();
105
+ const page = await browser.newPage();
106
+
107
+ await page.goto(`${baseURL}/login`);
108
+ await page.getByLabel(/email/i).fill(email);
109
+ await page.getByLabel(/password/i).fill(password);
110
+ await page.getByRole("button", { name: /sign in|log in/i }).click();
111
+ await page.waitForURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
112
+
113
+ // Warm up the first route to avoid cold-start timeouts in specs
114
+ await page.goto(baseURL!);
115
+ await page.context().storageState({ path: AUTH_FILE });
116
+ await browser.close();
117
+ }
118
+ ```
119
+
120
+ Gitignore the auth state — run before first use:
121
+
122
+ ```bash
123
+ mkdir -p apps/web/e2e/.auth
124
+ echo "apps/web/e2e/.auth/" >> .gitignore
125
+ ```
126
+
127
+ ## Auth probe spec
128
+
129
+ `apps/web/e2e/_probe/auth.spec.ts` — validates the login path directly (outside
130
+ storage-state flow) so credential failures are diagnosed cleanly:
131
+
132
+ ```ts
133
+ import { test, expect } from "@playwright/test";
134
+
135
+ test("auth probe: can log in with E2E_TEST_EMAIL/E2E_TEST_PASSWORD", async ({
136
+ page,
137
+ }) => {
138
+ const email = process.env.E2E_TEST_EMAIL;
139
+ const password = process.env.E2E_TEST_PASSWORD;
140
+ expect(email, "E2E_TEST_EMAIL env var is required").toBeTruthy();
141
+ expect(password, "E2E_TEST_PASSWORD env var is required").toBeTruthy();
142
+
143
+ await page.goto("/login");
144
+ await page.getByLabel(/email/i).fill(email!);
145
+ await page.getByLabel(/password/i).fill(password!);
146
+ await page.getByRole("button", { name: /sign in|log in/i }).click();
147
+
148
+ await expect(page).toHaveURL(/\/(dashboard|home|app)/, { timeout: 15_000 });
149
+ });
150
+ ```
151
+
152
+ Run the probe before the full suite: `pnpm exec playwright test --project=web _probe/auth`.
153
+
154
+ ## Provision script convention
155
+
156
+ Every repo with Playwright auth ships `scripts/provision-e2e-user.ts` — an idempotent
157
+ script that creates the test user in the dev Supabase project. Wire it to `package.json`:
158
+
159
+ ```json
160
+ {
161
+ "scripts": {
162
+ "e2e:provision": "tsx scripts/provision-e2e-user.ts"
163
+ }
164
+ }
165
+ ```
166
+
167
+ The skill records the path; the repo author writes the script. See
168
+ `.claude/context/testing/e2e.md` § Provisioning Playwright credentials for the full
169
+ contract (idempotency, multi-tenant subdomain, `.env.local.example`).
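
The idempotency requirement can be illustrated in isolation. This is a minimal sketch, not the repo's provisioning script: `UserStore` is a hypothetical stand-in for whatever admin client the real script uses (it is not a Supabase API), and the in-memory store exists only to demonstrate the behaviour.

```typescript
// Sketch of the idempotency contract only. `UserStore` is a hypothetical
// interface, not a real Supabase client.
interface UserStore {
  findByEmail(email: string): Promise<{ id: string } | undefined>;
  create(email: string, password: string): Promise<{ id: string }>;
}

async function ensureE2eUser(
  store: UserStore,
  email: string,
  password: string
): Promise<{ id: string; created: boolean }> {
  // Look first, create only when absent, so repeated runs are safe.
  const existing = await store.findByEmail(email);
  if (existing) return { id: existing.id, created: false };
  const user = await store.create(email, password);
  return { id: user.id, created: true };
}

// In-memory store used purely for demonstration.
function memoryStore(): UserStore {
  const users = new Map<string, { id: string }>();
  return {
    async findByEmail(email) {
      return users.get(email);
    },
    async create(email) {
      const user = { id: `user-${users.size + 1}` };
      users.set(email, user);
      return user;
    },
  };
}
```

Running `ensureE2eUser` twice against the same store should report `created: true` the first time and `created: false` the second, with the same `id` both times.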
170
+
171
+ ## CI secrets
172
+
173
+ Add these four secrets to the GitHub repo (Settings → Secrets → Actions):
174
+
175
+ | Secret | Purpose |
176
+ | --- | --- |
177
+ | `E2E_TEST_EMAIL` | Test account email |
178
+ | `E2E_TEST_PASSWORD` | Test account password |
179
+ | `NEXT_PUBLIC_SUPABASE_URL` | Supabase project URL |
180
+ | `NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY` | Supabase publishable (anon) key |
181
+
182
+ ## GitHub Actions snippet
183
+
184
+ ```yaml
185
+ - name: Install Playwright browsers
186
+   run: pnpm exec playwright install --with-deps chromium
187
+
188
+ - name: Run Playwright tests
189
+   run: pnpm exec playwright test
190
+   env:
191
+     E2E_TEST_EMAIL: ${{ secrets.E2E_TEST_EMAIL }}
192
+     E2E_TEST_PASSWORD: ${{ secrets.E2E_TEST_PASSWORD }}
193
+     NEXT_PUBLIC_SUPABASE_URL: ${{ secrets.NEXT_PUBLIC_SUPABASE_URL }}
194
+     NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY: ${{ secrets.NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY }}
195
+ ```
196
+
197
+ ## Pitfalls
198
+
199
+ **Cold-start timeouts** — Next.js dev mode compiles routes lazily. The warmup fetch in
200
+ `globalSetup` (the final `page.goto(baseURL!)` call) primes the compiler before specs run.
201
+
202
+ **SCSS Module selectors** — prefer `[class*='componentName']` with `.first()` over
203
+ positional `.nth(N)` locators. Prefer `getByRole`/`getByLabel`/`getByTestId` when
204
+ accessible names are available.
205
+
206
+ **SCSS import errors in tests** — Playwright runs in Node, not webpack. If your test
207
+ imports a component that imports SCSS, the stylesheet import fails to load. Keep such
207
+ imports out of E2E specs and exercise the component through the running app instead.
209
+
210
+ **Port mismatch** — before running, compare `playwright.config.ts` `baseURL` port with
211
+ `.codebyplan/server.json`. On mismatch, the `cbp-e2e-playwright` agent will ask which is
212
+ correct — do not guess.
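
The port comparison described in the last pitfall could look roughly like this. It is a sketch only: `portOf` and `portsMatch` are hypothetical helper names, and the allocated port is assumed to have been read from `.codebyplan/server.json` elsewhere.

```typescript
// Sketch: compare the port in a Playwright baseURL with an allocated port.
function portOf(baseURL: string): number {
  const url = new URL(baseURL);
  if (url.port) return Number(url.port);
  // The URL API reports an empty port when the scheme default is used.
  return url.protocol === "https:" ? 443 : 80;
}

function portsMatch(baseURL: string, allocatedPort: number): boolean {
  return portOf(baseURL) === allocatedPort;
}

console.log(portsMatch("http://localhost:3010", 3010)); // true
console.log(portsMatch("http://localhost:3010", 3015)); // false
```

On a mismatch the sketch only reports the difference; deciding which side is correct stays with the person running the suite, as the pitfall says.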
@@ -0,0 +1,147 @@
1
+ # Tauri Reference
2
+
3
+ Full install, config, and test walkthrough for Tauri desktop apps using WebDriverIO +
4
+ tauri-driver. Source: upstream WebDriverIO docs + `.claude/context/testing/e2e.md`.
5
+
6
+ ## Prerequisites
7
+
8
+ - Rust toolchain: `rustup --version` (install via https://rustup.rs)
9
+ - `tauri-driver` binary: installed via Cargo (see Install below)
10
+ - Built Tauri binary: `cargo build` must complete before any tests run
11
+ - Node.js / pnpm available
12
+
13
+ ## Install
14
+
15
+ ```bash
16
+ # WebDriverIO runner + framework
17
+ pnpm add -D @wdio/cli @wdio/local-runner @wdio/mocha-framework @wdio/spec-reporter
18
+
19
+ # Tauri driver (native binary — needs Cargo)
20
+ cargo install tauri-driver
21
+ ```
22
+
23
+ Verify:
24
+
25
+ ```bash
26
+ which tauri-driver
27
+ tauri-driver --version
28
+ ```
29
+
30
+ ## wdio.conf.ts
31
+
32
+ Place at `apps/desktop/wdio.conf.ts` (or repo root for single-app repos):
33
+
34
+ ```ts
35
+ import { spawn } from "child_process";
36
+ import type { Options } from "@wdio/types";
37
+
38
+ // Path to your built Tauri binary
39
+ const BINARY_PATH = "./src-tauri/target/debug/your-app-name";
40
+
41
+ let tauriDriver: ReturnType<typeof spawn>;
42
+
43
+ export const config: Options.Testrunner = {
44
+ specs: ["./e2e/**/*.spec.ts"],
45
+ maxInstances: 1,
46
+ capabilities: [
47
+ {
48
+ "tauri:options": { application: BINARY_PATH },
49
+ maxInstances: 1,
50
+ },
51
+ ],
52
+ hostname: "127.0.0.1", port: 4444, // connect to tauri-driver directly; no chromedriver service is needed
53
+ framework: "mocha",
54
+ reporters: ["spec"],
55
+ mochaOpts: { timeout: 60_000 },
56
+
57
+ beforeSession: async () => {
58
+ // Start tauri-driver before each session
59
+ tauriDriver = spawn("tauri-driver", [], {
60
+ stdio: [null, process.stdout, process.stderr],
61
+ });
62
+ },
63
+
64
+ afterSession: async () => {
65
+ // Kill tauri-driver after each session
66
+ tauriDriver.kill();
67
+ },
68
+ };
69
+ ```
70
+
71
+ ## Build before running
72
+
73
+ Tests will fail if the binary is stale or absent. Always build before running tests:
74
+
75
+ ```bash
76
+ # Build the Tauri app
77
+ cargo build --manifest-path apps/desktop/src-tauri/Cargo.toml
78
+
79
+ # Then run WebDriverIO
80
+ pnpm --filter @codebyplan/desktop wdio run wdio.conf.ts
81
+ ```
82
+
83
+ Or as a combined pnpm script:
84
+
85
+ ```json
86
+ {
87
+ "scripts": {
88
+ "e2e": "cargo build --manifest-path src-tauri/Cargo.toml && wdio run wdio.conf.ts",
89
+ "e2e:test": "wdio run wdio.conf.ts"
90
+ }
91
+ }
92
+ ```
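
A missing binary can be caught before the WebDriver session starts. The check below is a minimal sketch: `assertBinaryBuilt` is a hypothetical helper, and it detects only an absent binary, not a stale one.

```typescript
import { existsSync, statSync } from "node:fs";

// Sketch: fail fast with a readable message when the built binary is missing.
// This cannot tell whether an existing binary is out of date.
function assertBinaryBuilt(binaryPath: string): void {
  if (!existsSync(binaryPath) || !statSync(binaryPath).isFile()) {
    throw new Error(
      `Tauri binary not found at ${binaryPath}. Run cargo build before the tests.`
    );
  }
}

try {
  assertBinaryBuilt("./src-tauri/target/debug/your-app-name");
  console.log("binary present");
} catch (err) {
  console.log((err as Error).message);
}
```

One place to call it would be a WebDriverIO `onPrepare` hook, so the run stops with a clear message instead of a session error.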
93
+
94
+ ## Writing tests
95
+
96
+ Use `data-testid` attributes for stable targeting (Tauri WebView renders HTML):
97
+
98
+ ```ts
99
+ import { browser, $ } from "@wdio/globals";
100
+ import { expect } from "@wdio/globals";
101
+
102
+ describe("Desktop app", () => {
103
+ it("opens the main window", async () => {
104
+ const navBar = await $("[data-testid='nav']");
105
+ await expect(navBar).toBeDisplayed();
106
+ });
107
+
108
+ it("navigates to settings", async () => {
109
+ await $("[data-testid='settings-link']").click();
110
+ await expect($("[data-testid='settings-panel']")).toBeDisplayed();
111
+ });
112
+ });
113
+ ```
114
+
115
+ Prefer `data-testid` over CSS class selectors — SCSS Modules mangle class names.
116
+
117
+ ## Auth probe
118
+
119
+ `apps/desktop/e2e/_probe/auth.spec.ts`:
120
+
121
+ ```ts
122
+ import { browser, $ } from "@wdio/globals";
123
+ import { expect } from "@wdio/globals";
124
+
125
+ describe("auth probe", () => {
126
+ it("can reach the main window", async () => {
127
+ // Tauri desktop apps often skip network auth — adapt if your app has auth
128
+ const root = await $("[data-testid='app-root']");
129
+ await expect(root).toBeDisplayed();
130
+ });
131
+ });
132
+ ```
133
+
134
+ ## Pitfalls
135
+
136
+ **Must build before run** — tauri-driver launches the binary; if the binary doesn't
137
+ exist or is stale the session fails immediately with a confusing error.
138
+
139
+ **Binary path** — the `application` path in `capabilities` must be the exact path
140
+ to the compiled binary. Debug builds are under `src-tauri/target/debug/`, release
141
+ builds under `src-tauri/target/release/`.
142
+
143
+ **Port conflicts** — tauri-driver listens on port 4444 by default. Ensure no other
144
+ WebDriver session is running on the same port.
145
+
146
+ **CI** — Tauri desktop E2E on CI requires a display (Xvfb on Linux) and the full Rust
147
+ build toolchain. Use GitHub-hosted `ubuntu-latest` or `windows-latest` runners; tauri-driver does not support macOS.
@@ -0,0 +1,154 @@
1
+ # VS Code Extension Reference
2
+
3
+ Full install, config, and test walkthrough for VS Code extension testing using
4
+ `@vscode/test-cli` and `@vscode/test-electron`. Source: upstream VS Code extension
5
+ testing docs.
6
+
7
+ ## Prerequisites
8
+
9
+ - VS Code installed (used as the test host)
10
+ - Node.js / pnpm available
11
+ - On Linux CI: Xvfb for a display server (extensions require a GUI)
12
+
13
+ ## Install
14
+
15
+ ```bash
16
+ pnpm add -D @vscode/test-cli @vscode/test-electron
17
+ ```
18
+
19
+ Verify:
20
+
21
+ ```bash
22
+ pnpm exec vscode-test --version
23
+ ```
24
+
25
+ ## .vscode-test.mjs
26
+
27
+ Create `.vscode-test.mjs` at the extension package root (e.g. `apps/vscode/`):
28
+
29
+ ```js
30
+ import { defineConfig } from "@vscode/test-cli";
31
+
32
+ export default defineConfig({
33
+ files: "e2e/**/*.test.js", // compiled output path (JS, not TS)
34
+ extensionDevelopmentPath: ".", // path to the extension package root
35
+ workspaceFolder: "test-fixtures/workspace", // optional: open a fixture workspace
36
+ mocha: {
37
+ timeout: 20_000,
38
+ ui: "bdd",
39
+ },
40
+ });
41
+ ```
42
+
43
+ For TypeScript source, compile tests before running:
44
+
45
+ ```json
46
+ {
47
+ "scripts": {
48
+ "test:e2e": "tsc -p tsconfig.test.json && vscode-test",
49
+ "test:e2e:watch": "vscode-test --watch"
50
+ }
51
+ }
52
+ ```
53
+
54
+ ## Extension host lifecycle
55
+
56
+ `@vscode/test-electron` downloads an isolated VS Code instance, loads your extension from
57
+ `extensionDevelopmentPath`, opens the workspace, and runs the Mocha suite inside the extension host process.
58
+
59
+ Tests import from `vscode` — the module is available because they run inside VS Code:
60
+
61
+ ```ts
62
+ import * as vscode from "vscode";
63
+ import * as assert from "assert";
64
+
65
+ suite("Extension", () => {
66
+ test("extension activates", async () => {
67
+ const ext = vscode.extensions.getExtension("yourpublisher.yourextension");
68
+ assert.ok(ext, "extension not found");
69
+ await ext.activate();
70
+ assert.ok(ext.isActive);
71
+ });
72
+
73
+ test("command is registered", async () => {
74
+ const commands = await vscode.commands.getCommands();
75
+ assert.ok(
76
+ commands.includes("yourextension.yourCommand"),
77
+ "command not registered"
78
+ );
79
+ });
80
+ });
81
+ ```
82
+
83
+ The test file runs inside the VS Code extension host — full `vscode` API is available,
84
+ including workspace, editors, commands, and diagnostics.
85
+
86
+ ## Directory structure
87
+
88
+ ```
89
+ apps/vscode/
90
+   .vscode-test.mjs
91
+   e2e/
92
+     _probe/
93
+       activation.test.ts
94
+     commands/
95
+       my-command.test.ts
96
+   test-fixtures/
97
+     workspace/ # optional: committed fixture files opened in tests
98
+ ```
99
+
100
+ ## Activation probe
101
+
102
+ `apps/vscode/e2e/_probe/activation.test.ts`:
103
+
104
+ ```ts
105
+ import * as vscode from "vscode";
106
+ import * as assert from "assert";
107
+
108
+ suite("Activation probe", () => {
109
+ test("extension activates without error", async () => {
110
+ // Replace with your publisher.extensionname from package.json
111
+ const ext = vscode.extensions.getExtension("yourpublisher.yourextension");
112
+ assert.ok(ext, "Extension not installed in test host");
113
+ if (!ext.isActive) {
114
+ await ext.activate();
115
+ }
116
+ assert.ok(ext.isActive, "Extension did not activate");
117
+ });
118
+ });
119
+ ```
120
+
121
+ ## CI (GitHub Actions)
122
+
123
+ Linux runners require Xvfb. Use the `xvfb-run` wrapper:
124
+
125
+ ```yaml
126
+ - name: Install dependencies
127
+   run: pnpm install
128
+
129
+ - name: Compile extension tests
130
+   run: pnpm --filter @codebyplan/vscode test:compile
131
+
132
+ - name: Run VS Code extension tests
133
+   run: xvfb-run -a pnpm --filter @codebyplan/vscode test:e2e
134
+   env:
135
+     DISPLAY: ':99.0'
136
+ ```
137
+
138
+ On macOS/Windows runners, Xvfb is not needed — `vscode-test` uses the native display.
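
The platform rule above can be sketched as a small command builder for local scripts. This is an illustration under assumptions: `e2eCommand` is a hypothetical helper, and the platform strings follow Node's `process.platform` values (`linux`, `darwin`, `win32`).

```typescript
// Sketch: wrap the test command in xvfb-run only on Linux.
function e2eCommand(platform: string, baseCommand: string[]): string[] {
  return platform === "linux" ? ["xvfb-run", "-a", ...baseCommand] : baseCommand;
}

const base = ["pnpm", "--filter", "@codebyplan/vscode", "test:e2e"];

console.log(e2eCommand("linux", base).join(" "));
console.log(e2eCommand("darwin", base).join(" "));
```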
139
+
140
+ ## Pitfalls
141
+
142
+ **Wrong extensionDevelopmentPath** — if the path in `.vscode-test.mjs` doesn't point
143
+ to the package root (where `package.json` has the `contributes` block), VS Code won't
144
+ find the extension and activation tests will fail silently.
145
+
146
+ **TypeScript source vs compiled output** — `@vscode/test-cli` runs compiled JS.
147
+ Always compile before invoking `vscode-test` in CI.
148
+
149
+ **Extension host isolation** — the runner downloads its own VS Code build into
150
+ `.vscode-test/` and reuses it on later runs. This is intentional; do not try to reuse the system VS Code installation.
151
+
152
+ **`vscode` module availability** — tests run inside the extension host, so `import
153
+ * as vscode from "vscode"` resolves correctly. The same import will fail if you try
154
+ to run these files with plain Node.js outside the host.