codebyplan 1.11.1 → 1.12.0
- package/dist/cli.js +602 -345
- package/package.json +1 -1
- package/templates/README.md +1 -1
- package/templates/agents/cbp-cc-executor.md +1 -1
- package/templates/agents/cbp-e2e-maestro.md +202 -0
- package/templates/agents/cbp-e2e-playwright.md +229 -0
- package/templates/agents/cbp-e2e-tauri.md +184 -0
- package/templates/agents/cbp-e2e-vscode.md +203 -0
- package/templates/agents/cbp-e2e-xcuitest.md +224 -0
- package/templates/agents/cbp-improve-claude.md +1 -1
- package/templates/agents/cbp-round-executor.md +11 -11
- package/templates/agents/cbp-task-check.md +1 -1
- package/templates/agents/cbp-task-planner.md +2 -0
- package/templates/agents/cbp-testing-qa-agent.md +9 -9
- package/templates/context/testing/e2e.md +303 -0
- package/templates/hooks/cbp-statusline.mjs +44 -0
- package/templates/hooks/cbp-statusline.py +24 -2
- package/templates/hooks/cbp-statusline.sh +22 -2
- package/templates/hooks/validate-structure-lengths.sh +2 -0
- package/templates/hooks/validate-structure-smoke.sh +2 -1
- package/templates/hooks/validate-structure-templates.sh +1 -0
- package/templates/rules/README.md +8 -1
- package/templates/rules/context-file-loading.md +4 -1
- package/templates/rules/e2e-mandatory.md +70 -0
- package/templates/rules/supabase-branch-lifecycle.md +99 -0
- package/templates/settings.project.base.json +1 -2
- package/templates/skills/cbp-build-cc-agent/SKILL.md +16 -14
- package/templates/skills/cbp-build-cc-agent/reference/cbp-quality.md +4 -4
- package/templates/skills/cbp-build-cc-agent/scripts/validate-agent.sh +8 -6
- package/templates/skills/cbp-build-cc-mode/SKILL.md +4 -4
- package/templates/skills/cbp-build-cc-settings/reference/cbp-conventions.md +1 -2
- package/templates/skills/cbp-checkpoint-check/SKILL.md +12 -8
- package/templates/skills/cbp-checkpoint-create/SKILL.md +2 -0
- package/templates/skills/cbp-checkpoint-end/SKILL.md +27 -5
- package/templates/skills/cbp-checkpoint-plan/SKILL.md +2 -2
- package/templates/skills/cbp-checkpoint-plan/reference/e2e-discovery-probe.md +5 -5
- package/templates/skills/cbp-e2e-setup/SKILL.md +254 -0
- package/templates/skills/cbp-e2e-setup/reference/maestro.md +200 -0
- package/templates/skills/cbp-e2e-setup/reference/playwright.md +212 -0
- package/templates/skills/cbp-e2e-setup/reference/tauri.md +147 -0
- package/templates/skills/cbp-e2e-setup/reference/vscode.md +154 -0
- package/templates/skills/cbp-e2e-setup/reference/xcuitest.md +185 -0
- package/templates/skills/cbp-frontend-ui/SKILL.md +6 -6
- package/templates/skills/cbp-frontend-ux/SKILL.md +1 -1
- package/templates/skills/cbp-git-worktree-remove/SKILL.md +17 -1
- package/templates/skills/cbp-round-execute/SKILL.md +30 -17
- package/templates/skills/cbp-session-start/SKILL.md +27 -2
- package/templates/skills/cbp-ship-main/SKILL.md +13 -0
- package/templates/skills/cbp-supabase-branch-check/SKILL.md +12 -5
- package/templates/skills/cbp-supabase-migrate/SKILL.md +139 -9
- package/templates/skills/cbp-supabase-migrate/reference/preflight-dry-run.md +1 -1
- package/templates/skills/cbp-supabase-setup/SKILL.md +13 -7
- package/templates/skills/cbp-supabase-setup/reference/branching-setup.md +2 -2
- package/templates/skills/cbp-task-check/SKILL.md +2 -2
- package/templates/skills/cbp-task-start/SKILL.md +2 -0
- package/templates/agents/cbp-test-e2e-agent.md +0 -363
@@ -0,0 +1,203 @@
+---
+name: cbp-e2e-vscode
+description: VS Code extension E2E test authoring + execution using @vscode/test-cli and @vscode/test-electron. Spawned by /cbp-round-execute Step 5 and /cbp-checkpoint-check Step 5b when framework is 'vscode-test'.
+tools: Read, Write, Edit, Glob, Grep, Bash, AskUserQuestion, mcp__codebyplan__get_repos
+model: sonnet
+effort: xhigh
+scope: org-shared
+---
+
+# VS Code Extension E2E Agent
+
+Read `context/testing/e2e.md` for the shared contract (Input/Output, Step 6.5 preflight,
+Step 7.5 failure classification, screenshot collection, completion rule, never-silently-skip).
+
+Framework: `@vscode/test-cli` + `@vscode/test-electron` for VS Code extensions.
+Dispatched when `.codebyplan/e2e.json` records `framework: "vscode-test"`.
+
+## Prerequisites
+
+- VS Code installed (used as the test host)
+- On Linux CI: Xvfb for a display server (extensions require a GUI)
+
+## Install
+
+```bash
+pnpm add -D @vscode/test-cli @vscode/test-electron
+pnpm exec vscode-test --version # verify
+```
+
+## .vscode-test.mjs
+
+Create at the extension package root (e.g. `apps/vscode/`):
+
+```js
+import { defineConfig } from "@vscode/test-cli";
+
+export default defineConfig({
+  files: "e2e/**/*.test.js", // compiled JS output path
+  extensionDevelopmentPath: ".", // path to the extension package root
+  workspaceFolder: "test-fixtures/workspace", // optional fixture workspace
+  mocha: {
+    timeout: 20_000,
+    ui: "bdd",
+  },
+});
+```
+
+pnpm scripts:
+
+```json
+{
+  "scripts": {
+    "test:e2e": "tsc -p tsconfig.test.json && vscode-test",
+    "test:e2e:watch": "vscode-test --watch",
+    "test:compile": "tsc -p tsconfig.test.json"
+  }
+}
+```
+
+## Extension Host Lifecycle
+
+`@vscode/test-electron` downloads an isolated VS Code instance, installs the extension,
+opens the workspace, and runs the Mocha suite inside the extension host process. Tests
+import from `vscode` — the module is available because they run inside VS Code:
+
+```ts
+import * as vscode from "vscode";
+import * as assert from "assert";
+
+suite("Extension", () => {
+  test("extension activates", async () => {
+    const ext = vscode.extensions.getExtension("yourpublisher.yourextension");
+    assert.ok(ext, "extension not found");
+    await ext.activate();
+    assert.ok(ext.isActive);
+  });
+
+  test("command is registered", async () => {
+    const commands = await vscode.commands.getCommands();
+    assert.ok(commands.includes("yourextension.yourCommand"), "command not registered");
+  });
+});
+```
+
+## Directory Structure
+
+```
+apps/vscode/
+  .vscode-test.mjs
+  e2e/
+    _probe/
+      activation.test.ts
+    commands/
+      my-command.test.ts
+  test-fixtures/
+    workspace/ # committed fixture files opened in tests
+```
+
+## Activation Probe
+
+`apps/vscode/e2e/_probe/activation.test.ts`:
+
+```ts
+import * as vscode from "vscode";
+import * as assert from "assert";
+
+suite("Activation probe", () => {
+  test("extension activates without error", async () => {
+    const ext = vscode.extensions.getExtension("yourpublisher.yourextension");
+    assert.ok(ext, "Extension not installed in test host");
+    if (!ext.isActive) {
+      await ext.activate();
+    }
+    assert.ok(ext.isActive, "Extension did not activate");
+  });
+});
+```
+
+## Pre-flight Probe (Step 6.5.2)
+
+**Compiled output**: verify `e2e/**/*.test.js` files exist (TS must be compiled first).
+
+```bash
+ls apps/vscode/e2e/**/*.test.js 2>/dev/null | head -1
+```
+
+On missing output:
+
+> "VS Code extension tests need to be compiled first. Please run
+> `pnpm --filter @codebyplan/vscode test:compile`. Reply 'ready' when complete."
+
+No network auth probe — extension tests run inside VS Code host with no remote auth.
+
+## Spec-Writing Patterns
+
+Write tests using the full `vscode` API:
+
+```ts
+import * as vscode from "vscode";
+import * as assert from "assert";
+
+suite("My Command", () => {
+  test("executes and returns expected result", async () => {
+    const result = await vscode.commands.executeCommand(
+      "yourextension.myCommand",
+      "testArg"
+    );
+    assert.strictEqual(result, "expectedValue");
+  });
+
+  test("reads workspace configuration", () => {
+    const config = vscode.workspace.getConfiguration("yourextension");
+    const value = config.get<string>("someKey");
+    assert.ok(value !== undefined, "configuration key missing");
+  });
+});
+```
+
+For diagnostic captures, use `vscode.window.showInformationMessage` output or write
+snapshots to `test-fixtures/`.
+
+## Screenshot Capture
+
+VS Code extension tests do not have browser-style screenshot capture. For visual review,
+write fixture output files to `test-fixtures/` and reference them in `screenshots[]`
+with `viewport: 'device'`. `baseline_diff_pct: null` for all entries.
+
+Enumerate screenshots: `apps/vscode/test-fixtures/**/*.png`.
+
+## Run Command
+
+```bash
+pnpm --filter @codebyplan/vscode test:e2e
+```
+
+## CI (GitHub Actions)
+
+Linux requires Xvfb:
+
+```yaml
+- name: Install dependencies
+  run: pnpm install
+
+- name: Compile extension tests
+  run: pnpm --filter @codebyplan/vscode test:compile
+
+- name: Run VS Code extension tests
+  run: xvfb-run -a pnpm --filter @codebyplan/vscode test:e2e
+  env:
+    DISPLAY: ':99.0'
+```
+
+On macOS/Windows, Xvfb is not needed — `vscode-test` uses the native display.
+
+## Pitfalls
+
+**Wrong extensionDevelopmentPath** — if `.vscode-test.mjs` doesn't point to the package
+root (where `package.json` has the `contributes` block), VS Code won't find the extension
+and activation tests fail silently. **TypeScript source vs compiled output** — `@vscode/test-cli`
+runs compiled JS; always compile before running in CI. **Extension host isolation** — each
+run downloads a fresh VS Code binary into a temp dir; do not reuse the system installation.
+**`vscode` module availability** — tests must run inside the extension host; the same import
+fails in plain Node.js.
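Reviewer note: the file above wraps the test run in `xvfb-run -a` on Linux CI and runs it directly on macOS and Windows. The sketch below shows one way that branching could be expressed. `buildE2eCommand`, `Platform`, and the `hasDisplay` flag are hypothetical names that do not appear in the package; only the `pnpm --filter @codebyplan/vscode test:e2e` command and the `xvfb-run -a` wrapper come from the diff.

```ts
// Hypothetical helper, not part of codebyplan: chooses how to launch the
// VS Code extension e2e suite depending on the host platform.
type Platform = "linux" | "darwin" | "win32";

// Run command taken from the agent file in this diff.
const E2E_COMMAND: string[] = ["pnpm", "--filter", "@codebyplan/vscode", "test:e2e"];

function buildE2eCommand(platform: Platform, hasDisplay: boolean): string[] {
  // Assumption: a Linux host that already has a display (for example a
  // developer desktop) does not need a virtual framebuffer; headless CI does.
  if (platform === "linux" && !hasDisplay) {
    // `-a` lets xvfb-run pick a free display number on its own.
    return ["xvfb-run", "-a"].concat(E2E_COMMAND);
  }
  // macOS and Windows use the native display.
  return E2E_COMMAND.slice();
}
```

Returning an argument array rather than a shell string keeps the command safe to hand to a process spawner without extra quoting.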
@@ -0,0 +1,224 @@
+---
+name: cbp-e2e-xcuitest
+description: XCUITest native iOS E2E test authoring + execution for Expo apps targeting system dialogs, HealthKit, watchOS, or other areas Maestro cannot reach. Spawned by /cbp-round-execute Step 5 and /cbp-checkpoint-check Step 5b when framework is 'xcuitest'.
+tools: Read, Write, Edit, Glob, Grep, Bash, AskUserQuestion, mcp__codebyplan__get_repos
+model: sonnet
+effort: xhigh
+scope: org-shared
+---
+
+# XCUITest E2E Agent
+
+Read `context/testing/e2e.md` for the shared contract (Input/Output, Step 6.5 preflight,
+Step 7.5 failure classification, screenshot collection, completion rule, never-silently-skip).
+
+Framework: XCUITest via the Expo `withXCUITests` plugin. Dispatched when
+`.codebyplan/e2e.json` records `framework: "xcuitest"`.
+
+**Use XCUITest when Maestro cannot reach the target UI**: Apple Watch companion, HealthKit
+permission dialogs, system sheets (share, notification permissions), Face ID / Touch ID
+prompts, camera / microphone dialogs. For standard UI flows, prefer Maestro.
+
+## Prerequisites
+
+- macOS with Xcode 15+
+- Active Apple Developer account (free tier sufficient for Simulator testing)
+- Expo managed workflow with prebuild enabled
+- `xcbeautify`: `brew install xcbeautify`
+
+## Setup — Expo withXCUITests Plugin
+
+```bash
+pnpm add -D expo-xcuitest
+```
+
+`app.config.ts`:
+
+```ts
+plugins: [
+  ["expo-xcuitest", { testTargetName: "AppUITests" }]
+]
+```
+
+After updating `app.config.ts`, regenerate the native project:
+
+```bash
+expo prebuild --platform ios --clean
+```
+
+`--clean` ensures a fresh native project. Commit the generated `ios/` directory so CI
+can build without running prebuild.
+
+## Swift Test Class
+
+`ios/AppUITests/AppUITests.swift`:
+
+```swift
+import XCTest
+
+class AppUITests: XCTestCase {
+
+    var app: XCUIApplication!
+
+    override func setUpWithError() throws {
+        continueAfterFailure = false
+        app = XCUIApplication()
+
+        app.launchEnvironment["TEST_EMAIL"] = ProcessInfo.processInfo.environment["TEST_EMAIL"] ?? ""
+        app.launchEnvironment["TEST_PASSWORD"] = ProcessInfo.processInfo.environment["TEST_PASSWORD"] ?? ""
+
+        app.launch()
+    }
+
+    func testLoginFlow() throws {
+        let emailField = app.textFields["email-input"]
+        XCTAssertTrue(emailField.waitForExistence(timeout: 10))
+
+        emailField.tap()
+        emailField.typeText(app.launchEnvironment["TEST_EMAIL"]!)
+
+        let passwordField = app.secureTextFields["password-input"]
+        passwordField.tap()
+        passwordField.typeText(app.launchEnvironment["TEST_PASSWORD"]!)
+
+        app.buttons["sign-in-button"].tap()
+
+        let dashboard = app.staticTexts["Dashboard"]
+        XCTAssertTrue(dashboard.waitForExistence(timeout: 15))
+    }
+}
+```
+
+## accessibilityIdentifier Targeting
+
+React Native maps `testID` to `accessibilityIdentifier` on iOS:
+
+```tsx
+<TextInput
+  testID="email-input" // becomes accessibilityIdentifier on iOS
+  accessibilityLabel="Email"
+/>
+```
+
+XCUITest queries by identifier:
+
+```swift
+app.textFields["email-input"] // TextInput
+app.buttons["sign-in-button"] // TouchableOpacity / Pressable
+app.staticTexts["Dashboard"] // Text component
+```
+
+## Pre-flight Probe (Step 6.5.2)
+
+**Scheme**: `xcodebuild -list` returns the target scheme; prebuild artifacts present.
+
+```bash
+xcodebuild -list -workspace ios/YourApp.xcworkspace 2>&1 | grep "Schemes" -A 5
+```
+
+On missing prebuild:
+
+> "iOS prebuild missing. Run `pnpm expo prebuild --platform ios --clean`. Reply 'ready'
+> when done."
+
+**Env vars**: `TEST_EMAIL`, `TEST_PASSWORD` via Xcode scheme environment variables.
+
+In Xcode: Product → Scheme → Edit Scheme → Run → Arguments → Environment Variables.
+
+## Auth Probe (when has_auth)
+
+Run only the login test method against the UITest target:
+
+```bash
+xcodebuild test \
+  -workspace ios/YourApp.xcworkspace \
+  -scheme YourApp \
+  -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' \
+  -only-testing:AppUITests/AppUITests/testLoginFlow \
+  TEST_EMAIL="$TEST_EMAIL" TEST_PASSWORD="$TEST_PASSWORD" \
+  | xcbeautify
+```
+
+## Spec-Writing Patterns
+
+Use `waitForExistence(timeout:)` on every element — React Native renders asynchronously:
+
+```swift
+func testHealthKitPermissionDialog() throws {
+    app.buttons["request-health-access"].tap()
+
+    // System dialog — only reachable via XCUITest
+    let allowButton = app.alerts.buttons["Allow Full Access"]
+    XCTAssertTrue(allowButton.waitForExistence(timeout: 10))
+    allowButton.tap()
+
+    let confirmation = app.staticTexts["Health data linked"]
+    XCTAssertTrue(confirmation.waitForExistence(timeout: 15))
+}
+```
+
+## Screenshot Capture
+
+XCUITest captures screenshots via:
+
+```swift
+let screenshot = XCTAttachment(screenshot: XCUIScreen.main.screenshot())
+screenshot.name = "after-health-permission"
+screenshot.lifetime = .keepAlways
+add(screenshot)
+```
+
+Attachments are written to the test results bundle under `DerivedData`. Reference them
+in `screenshots[]` with `viewport: 'device'` and `baseline_diff_pct: null`.
+
+Enumerate: `~/Library/Developer/Xcode/DerivedData/**/Attachments/*.png` (CI: results
+bundle path from `xcodebuild -resultBundlePath ./build/results.xcresult`).
+
+## Run Command
+
+```bash
+xcodebuild test \
+  -workspace ios/YourApp.xcworkspace \
+  -scheme YourApp \
+  -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' \
+  TEST_EMAIL="$TEST_EMAIL" \
+  TEST_PASSWORD="$TEST_PASSWORD" \
+  | xcbeautify
+```
+
+## pnpm Script
+
+```json
+{
+  "scripts": {
+    "xcuitest": "xcodebuild test -workspace ios/YourApp.xcworkspace -scheme YourApp -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' | xcbeautify"
+  }
+}
+```
+
+## CI (GitHub Actions)
+
+```yaml
+- name: Pre-boot simulator
+  run: xcrun simctl boot "iPhone 16"
+
+- name: Run XCUITest
+  run: |
+    xcodebuild test \
+      -workspace ios/YourApp.xcworkspace \
+      -scheme YourApp \
+      -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' \
+      TEST_EMAIL="${{ secrets.TEST_EMAIL }}" \
+      TEST_PASSWORD="${{ secrets.TEST_PASSWORD }}" \
+      | xcbeautify
+```
+
+## Pitfalls
+
+**Simulator not booted** — pre-boot in CI setup step to avoid slow first run. **testID
+drop-through** — ensure components render `testID` all the way through; some wrappers
+drop it (verify with `accessibility.identifier` in the Xcode accessibility inspector).
+**waitForExistence** — always use `waitForExistence(timeout:)`, never immediate
+`XCTAssertTrue(element.exists)`. **Derived data cache** — stale data can cause failures
+after schema changes; clear with `rm -rf ~/Library/Developer/Xcode/DerivedData` if
+tests pass locally but fail after a native project change.
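Reviewer note: the auth probe, run command, and CI step in the file above all call `xcodebuild test` with the same workspace, scheme, and destination. They differ only in `-only-testing` and the optional `-resultBundlePath`. The sketch below assembles that argument list in one place. `XcuitestRun` and `xcodebuildTestArgs` are illustrative names that are not part of the package, and the sketch deliberately leaves out the `TEST_EMAIL` / `TEST_PASSWORD` settings.

```ts
// Hypothetical helper, not part of codebyplan: builds the argument list
// for `xcodebuild test` from the values used in the agent file.
interface XcuitestRun {
  workspace: string;          // e.g. "ios/YourApp.xcworkspace"
  scheme: string;             // e.g. "YourApp"
  simulator: string;          // e.g. "iPhone 16"
  onlyTesting?: string;       // e.g. "AppUITests/AppUITests/testLoginFlow"
  resultBundlePath?: string;  // e.g. "./build/results.xcresult"
}

function xcodebuildTestArgs(run: XcuitestRun): string[] {
  const args: string[] = [
    "test",
    "-workspace", run.workspace,
    "-scheme", run.scheme,
    "-destination", "platform=iOS Simulator,name=" + run.simulator + ",OS=latest",
  ];
  // Restrict the run to one test identifier (used by the auth probe).
  if (run.onlyTesting) {
    args.push("-only-testing:" + run.onlyTesting);
  }
  // Write the result bundle to a known path so screenshots can be enumerated.
  if (run.resultBundlePath) {
    args.push("-resultBundlePath", run.resultBundlePath);
  }
  return args;
}
```

Because the destination is passed as a single array element, no shell quoting is needed around the simulator name.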
@@ -170,7 +170,7 @@ Before proposing any new file, read what already exists:
 2. Glob `.claude/skills/*/SKILL.md` — read names and frontmatter descriptions
 3. Glob `.claude/context/*.md` — read names and first heading
 4. Glob `.claude/docs/architecture/*.md` — read names and first heading
-5. Glob `.claude/agents/*/AGENT.md` — read names and frontmatter descriptions
+5. Glob `.claude/agents/*.md` (and `.claude/agents/*/AGENT.md` for folder-form agents) — read names and frontmatter descriptions
 
 **5b: Propose changes with update-first discipline (HARD RULE)**
 
@@ -69,14 +69,14 @@ output:
 specialist_needs: # What specialist agents are needed post-execution
 tests_written:
 unit_tests: string[] # Unit test files written inline (Step 3.6)
-e2e_tests: string[] # Always empty — e2e test files are written by cbp-
+e2e_tests: string[] # Always empty — e2e test files are written by the cbp-e2e-* specialist agents (dispatched per context/testing/e2e.md), spawned by /cbp-round-execute Step 5, NOT by this executor
 framework_configured: boolean # True if test/lint framework was set up
 review_needed:
 ui_review: boolean # Visual design review needed
 ux_review: boolean # UX flow review needed
 security_review: boolean # Security scan needed
-testing_profile: string # Read from task.context.testing_profile (and round.context.testing_profile_override if set); surfaced for /cbp-round-execute Step 5 per-wave cbp-testing-qa-agent + cbp-
-# NOTE:
+testing_profile: string # Read from task.context.testing_profile (and round.context.testing_profile_override if set); surfaced for /cbp-round-execute Step 5 per-wave cbp-testing-qa-agent + cbp-e2e-* specialist skip logic per rules/testing-profile.md
+# NOTE: e2e output is populated by /cbp-round-execute Step 5 (NOT this agent) and lives at round.context.e2e_outputs (a framework-keyed map, one entry per eligible cbp-e2e-* specialist). The executor's Step 3.8 cbp-frontend-ui invocation runs with phase: 'style_only' and never sees screenshots; the post-e2e screenshot review happens at Step 5b.
 ```
 
 ## Tools Available
@@ -165,7 +165,7 @@ Before ANY Write/Edit invocation during execution, the target path MUST appear i
 
 **Exemptions** — paths that may be edited without an entry in `files_to_modify[]`:
 
-- Test files written by Step 3.6 (unit only — e2e is written by `cbp-
+- Test files written by Step 3.6 (unit only — e2e is written by the `cbp-e2e-*` specialist agents post-executor, not by this agent) when the plan flagged `tests_written` as a deliverable
 - Lockfiles regenerated by `pnpm install` after `package.json` edits already in scope
 - Generated TypeScript types (e.g. `apps/web/src/lib/database.types.ts`) when DB migrations are in scope
 - Auto-formatted prettier rewrites of files already in `files_to_modify[]`
@@ -181,7 +181,7 @@ Two categories of work are NOT performed by this agent and must be returned to t
 | Action | Why excluded | Where it goes |
 |--------|--------------|---------------|
 | MCP `create_task`, `update_task`, `complete_task`, `add_round`, etc. (any DB-side state mutation) | Executor frontmatter does NOT include MCP DB tools. Tool-not-available errors force orchestrator improvisation. | Surface as `improvements_noted` entry; orchestrator runs the MCP call after this agent returns. Executor never tries to invoke MCP DB tools. |
-| Spawning `cbp-
+| Spawning `cbp-e2e-*` specialist agents | Executor's tools list (Read/Write/Edit/Glob/Grep/Bash/TaskUpdate/AskUserQuestion/Skill) does NOT include the `Task` / Agent tool. E2E execution is owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`), spawned by `/cbp-round-execute` Step 5 (parallel with `cbp-testing-qa-agent`) and is invoked by the orchestrator. | Set `specialist_needs.review_needed.ux_review` / `ui_review` if applicable. Do NOT attempt to spawn any e2e agent from inside the executor. |
 
 If the plan implies either action, complete the rest of the work and surface the carved-out steps in `improvements_noted[]` for the orchestrator to handle.
 
@@ -358,7 +358,7 @@ When the approved plan includes specialized work, delegate to sub-executor agent
 
 After implementing features in Step 3, write unit tests for all new/modified code. Tests are deliverables — they ship with the code in the same round.
 
-**Reference**: Read `.claude/context/testing/unit.md` (when present) for platform-specific patterns and setup instructions.
+**Reference**: Read `.claude/context/testing/unit.md` (when present) for platform-specific patterns and setup instructions. E2E test authoring is owned by the `cbp-e2e-*` specialist agents — do NOT write e2e specs here.
 
 **Platform detection** from `test_strategy` in approved plan (set by `cbp-task-planner` Phase 2.9):
 
@@ -383,7 +383,7 @@ After implementing features in Step 3, write unit tests for all new/modified cod
 
 ### Step 3.7: REMOVED — E2E execution moved to /cbp-round-execute Step 5
 
-E2E test authoring + execution is owned by `cbp-
+E2E test authoring + execution is owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`), spawned in parallel with `cbp-testing-qa-agent` by `/cbp-round-execute` Step 5. The executor does NOT spawn them (Step 0.2 carve-out). When the plan declares e2e work is needed, the executor's only obligation is to set `specialist_needs.review_needed.ui_review` / `ux_review` if applicable; the orchestrator handles the rest.
 
 ### Step 3.65: Defensive React Checklist (after writing component code)
 
@@ -396,7 +396,7 @@ E2E test authoring + execution is owned by `cbp-test-e2e-agent`, spawned in para
 
 ### Step 3.8: Frontend Self-Review (UI + UX, style-only)
 
-After unit tests (Step 3.6) and the defensive React checklist (Step 3.65), run inline style-quality self-review on the round's UI work BEFORE Step 4 quality checks. This pass runs WITHOUT e2e screenshots — the screenshot-driven Phase 6.5 of `cbp-frontend-ui` runs separately at `/cbp-round-execute` Step 5b once `cbp-
+After unit tests (Step 3.6) and the defensive React checklist (Step 3.65), run inline style-quality self-review on the round's UI work BEFORE Step 4 quality checks. This pass runs WITHOUT e2e screenshots — the screenshot-driven Phase 6.5 of `cbp-frontend-ui` runs separately at `/cbp-round-execute` Step 5b once the `cbp-e2e-*` specialist agent has produced screenshots. Mirror counterpart of Step 2.7's pre-implementation `cbp-frontend-design` pass — design decided up-front, polish reviewed at the end of execution.
 
 **Trigger gate** — fire when `files_changed` contains ANY of:
 
@@ -461,7 +461,7 @@ Analyze the completed work and populate `specialist_needs`:
 
 **Tests written** (execution phase — completed in Step 3.6):
 - `unit_tests_written`: List unit test files written inline by executor (Step 3.6)
-- `e2e_tests_written`: Always empty here — E2E test authoring is owned by `cbp-
+- `e2e_tests_written`: Always empty here — E2E test authoring is owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`), spawned by `/cbp-round-execute` Step 5 (post-executor)
 - `framework_configured`: true if a unit-test/lint framework was set up from scratch
 
 **Review needed** (validation phase — these review quality):
@@ -515,7 +515,7 @@ This gate makes the contract enforceable. Without it, Step 3.4 can be silently s
 
 #### Subagent Cost Recording
 
-When ANY background subagents were spawned during execution (general-purpose, cbp-database-agent,
+When ANY background subagents were spawned during execution (general-purpose, cbp-database-agent, etc.), populate `round.context.subagent_summaries[]` with one entry per agent:
 
 ```yaml
 subagent_summaries:
@@ -583,7 +583,7 @@ Which would you prefer?
 - **Spawned by**: `/cbp-round-execute` Step 3 (single-wave 3-AGENT path or per-wave 3-WAVE path)
 - **Returns to**: `/cbp-round-execute` which collects output and runs per-wave `cbp-testing-qa-agent`
 - **Depends on**: `cbp-task-planner` agent (provides approved plan)
-- **May spawn**: `cbp-database-agent` as sub-executor for Supabase operations. (NOT `cbp-
+- **May spawn**: `cbp-database-agent` as sub-executor for Supabase operations. (NOT any `cbp-e2e-*` specialist — those are owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`), spawned by `/cbp-round-execute` Step 5 per Step 0.2 carve-out.)
 
 ## Structure Knowledge
 
@@ -82,7 +82,7 @@ Review all QA items across all rounds:
 - **Auto items**: Verify all passed (build, lint, types, tests)
 - **Default items**: Verify all resolved (pass or skipped with reason)
 
-**E2E pass vs skipped distinction**: When reading `auto_qa.items[]` for `check: 'e2e'`, do NOT conflate `status: 'pass'` with `status: 'skipped'`. A spec that ran with `passed === 0 && skipped > 0` for any path touching `files_changed` is a hard fail, not a pass — verdict text MUST explicitly call this out: "E2E spec authored but assertions did not execute (skip-gated)." Do NOT issue a READY verdict on a zero-assertion e2e run; route to a fix round per `rules/
+**E2E pass vs skipped distinction**: When reading `auto_qa.items[]` for `check: 'e2e'`, do NOT conflate `status: 'pass'` with `status: 'skipped'`. A spec that ran with `passed === 0 && skipped > 0` for any path touching `files_changed` is a hard fail, not a pass — verdict text MUST explicitly call this out: "E2E spec authored but assertions did not execute (skip-gated)." Do NOT issue a READY verdict on a zero-assertion e2e run; route to a fix round per `rules/e2e-mandatory.md`.
 
 List any pending or failed items. Determine if they are blockers.
 
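Reviewer note: the skip-gated rule in the hunk above (`passed === 0 && skipped > 0` is a hard fail, not a pass) is easy to get wrong when a result is reduced to a single status. The sketch below spells it out. `E2eRunCounts`, `isSkipGated`, and `skipGatedVerdictNote` are hypothetical names that do not appear in the package, and the sketch assumes the counts have already been limited to specs touching `files_changed`. The message text is quoted from the diff.

```ts
// Hypothetical shape: only the `passed` and `skipped` counters are named
// in the diff; the interface itself is an illustration.
interface E2eRunCounts {
  passed: number;
  skipped: number;
}

// Verdict text required by the rule in the hunk above.
const SKIP_GATED_MESSAGE =
  "E2E spec authored but assertions did not execute (skip-gated).";

// True when the spec ran but every assertion was skipped.
function isSkipGated(run: E2eRunCounts): boolean {
  return run.passed === 0 && run.skipped > 0;
}

// Returns the mandatory verdict note for a skip-gated run, or null when
// the rule does not apply.
function skipGatedVerdictNote(run: E2eRunCounts): string | null {
  return isSkipGated(run) ? SKIP_GATED_MESSAGE : null;
}
```

A run with at least one passing assertion is not skip-gated even if other specs were skipped, which matches the condition as written in the diff.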
@@ -502,6 +502,8 @@ plan.testing_profile: 'claude_only' | 'web' | 'desktop' | 'backend' | 'full_matr

 User may override at round-start via `$ARGUMENTS`. Planner's detection is the default — not a hard gate.

+**E2E eligibility is config-driven at execute time, not here.** `/cbp-round-execute` Step 5 reads `.codebyplan/e2e.json` and dispatches a `cbp-e2e-*` specialist for every framework that is `enabled && auto_run` and whose `app` path intersects the round's `files_changed` (see `rules/e2e-mandatory.md`). `testing_profile` and `has_ui_work` are **hints only**: they short-circuit e2e solely for `claude_only` / `backend`-only rounds — they do not decide eligibility for any other profile. Do not gate e2e on `has_ui_work` in the plan. Optionally, if `.codebyplan/e2e.json` exists, read each framework's `app` path to seed `pages_affected` for the routes the round touches.
+
 ### Phase 5: Design Solution

 Honor locked decisions. Create solution design with files, integration points.
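A rough sketch of the eligibility rule added in this hunk may help. The real layout of `.codebyplan/e2e.json` is not shown in the diff, so the `FrameworkConfig` shape, the prefix-based reading of "intersects", and the `eligibleFrameworks` helper are all assumptions; only the `enabled`, `auto_run`, and `app` names and the two short-circuit profiles come from the text.

```typescript
// Assumed per-framework config entry; the actual e2e.json schema may differ.
interface FrameworkConfig {
  enabled: boolean;
  auto_run: boolean;
  app: string; // app path, for example "apps/web"
}

type E2eConfig = Record<string, FrameworkConfig>;

function eligibleFrameworks(
  config: E2eConfig,
  filesChanged: string[],
  testingProfile: string,
): string[] {
  // The profile is a hint only: it short-circuits e2e for these two values
  // and has no say in eligibility for any other profile.
  if (testingProfile === "claude_only" || testingProfile === "backend") {
    return [];
  }

  return Object.entries(config)
    .filter(([, fw]) => fw.enabled && fw.auto_run)
    .filter(([, fw]) => {
      // "Intersects" is interpreted here as a path-prefix match (assumption).
      const prefix = fw.app.endsWith("/") ? fw.app : fw.app + "/";
      return filesChanged.some((f) => f === fw.app || f.startsWith(prefix));
    })
    .map(([name]) => name);
}
```

Note that nothing in this sketch consults `has_ui_work`, matching the instruction not to gate e2e on it in the plan.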
@@ -20,7 +20,7 @@ Single agent that handles non-e2e quality validation in the per-wave validation
 - Apply default production checklist items
 - Detect unrelated issues and missing tests

-E2E execution (Playwright / Maestro / WebDriverIO / XCUITest / vscode-test) is owned by `cbp-
+E2E execution (Playwright / Maestro / WebDriverIO / XCUITest / vscode-test) is owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`), spawned in parallel with this agent by `/cbp-round-execute` Step 5. **The agents are fully independent — this agent does NOT read `round.context.e2e_outputs` or `round.context.frontend_ui_review`.** This agent emits auto QA items and default checklist items. Baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (an explicit accept-or-fix user decision; baselines are NEVER auto-accepted).

 ## Input Contract

@@ -92,7 +92,7 @@ output:
 passed: number
 warnings: number
 failed: number
-hard_fail: boolean # true if build/lint/types failed, unit tests (vitest/jest/cargo) failed when applicable, OR npm audit found critical/high vulnerabilities. E2E hard_fail is owned by
+hard_fail: boolean # true if build/lint/types failed, unit tests (vitest/jest/cargo) failed when applicable, OR npm audit found critical/high vulnerabilities. E2E hard_fail is owned by the cbp-e2e-* specialist agents and surfaced via round.context.e2e_outputs.
 critical_issues: string[]
 captured_tasks:
 - issue_index: number # index into unrelated_issues[]
@@ -147,7 +147,7 @@ Apply `testing_profile` from input before running any checks. When `testing_prof
 | full_matrix | Run all checks |
 | cross_app | Run union of touched apps' checks (intersection by detected files) |

-E2E (Playwright / Maestro / WebDriverIO / XCUITest / vscode-test) is NEVER run by this agent under any profile — it's owned by `cbp-
+E2E (Playwright / Maestro / WebDriverIO / XCUITest / vscode-test) is NEVER run by this agent under any profile — it's owned by the `cbp-e2e-*` specialist agents (dispatched per `context/testing/e2e.md`; parallel siblings spawned by `/cbp-round-execute` Step 5).

 **CRITICAL: Within your profile's allowed check set (see Profile Gate Matrix above), every applicable command MUST be executed. No skipping an in-scope check without an explicit, logged reason.**

@@ -187,7 +187,7 @@ Procedure:

 This closes the cycle where R2 adds a flat-config and the QA pass lints only R2 files, only for `/cbp-task-check` to later lint the full task and surface dozens of errors on R1 files — wasting an entire corrective round. Plan-time premise verification does not catch this; only test-time scope expansion does.

-**Hard fail means: if any of build/lint/types/unit fails or is not executed when applicable, set `totals.hard_fail = true`. The round CANNOT complete.** E2E hard_fail is set independently by `
+**Hard fail means: if any of build/lint/types/unit fails or is not executed when applicable, set `totals.hard_fail = true`. The round CANNOT complete.** E2E hard_fail is set independently by the `cbp-e2e-*` specialist agents and surfaced via `round.context.e2e_outputs`; `/cbp-round-execute` Step 6 considers both signals.

 **Step 3a: Execute conditional unit-test checks (HARD FAIL when applicable):**

@@ -209,7 +209,7 @@ Run the unit-test runners detected in Step 1:
 If condition is met and test fails: set `totals.hard_fail = true`.
 If condition is not met (no applicable files changed): log `SKIPPED: <command> (reason: no applicable files changed)`.

-E2E commands and their preflight (dev server / simulator / emulator / built binary / auth probe) are owned by `cbp-
+E2E commands and their preflight (dev server / simulator / emulator / built binary / auth probe) are owned by the `cbp-e2e-*` specialist agents. See `context/testing/e2e.md` for the canonical preflight contract (Step 6.5 and the shared workflow).

 **Step 3b: Execute conditional checks (soft):**

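The two context lines at the top of this hunk describe a small decision: run the unit-test command when applicable files changed, and log a skip otherwise. A minimal sketch, assuming a hypothetical `evaluateConditionalCheck` helper and a `run` callback that returns whether the command passed; the `PASS:` and `FAIL:` log prefixes are invented for illustration, while the `SKIPPED:` line follows the format quoted in the template.

```typescript
interface CheckResult {
  hardFail: boolean;
  log: string;
}

function evaluateConditionalCheck(
  command: string,
  applicable: boolean,
  run: () => boolean, // hypothetical runner: returns true when the command passes
): CheckResult {
  if (!applicable) {
    // Condition not met: the check is skipped with an explicit, logged reason.
    return {
      hardFail: false,
      log: `SKIPPED: ${command} (reason: no applicable files changed)`,
    };
  }

  // Condition met: a failing run sets the hard-fail flag.
  const passed = run();
  return {
    hardFail: !passed,
    log: `${passed ? "PASS" : "FAIL"}: ${command}`,
  };
}
```

The caller would fold `hardFail` into `totals.hard_fail`; the sketch deliberately leaves that aggregation out.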
@@ -360,7 +360,7 @@ Return complete output contract.
 - Auto and default QA items generated
 - `hard_fail` flag correctly set
 - **Vitest/Jest/Cargo unit-test hard_fail enforced** when source files changed
-- E2E execution + preflight delegated entirely to `
+- E2E execution + preflight delegated entirely to the `cbp-e2e-*` specialist agents (this agent never runs Playwright/Maestro/wdio/etc.)

 ## Failure Modes

@@ -373,6 +373,6 @@ Return complete output contract.

 ## Integration

-- **Spawned by**: `/cbp-round-execute` Step 5 (per-wave; runs in parallel with `
-- **Parallel
-- **Output consumed by**: `/cbp-round-execute` Step 6 (hard-fail routing — this agent's `totals.hard_fail` is OR'd
+- **Spawned by**: `/cbp-round-execute` Step 5 (per-wave; runs in parallel with the `cbp-e2e-*` specialists and may also run in parallel with next wave's executor)
+- **Parallel siblings**: `cbp-e2e-*` specialist agents (fully independent — no cross-read; all agents complete on their own timeline using only their own inputs)
+- **Output consumed by**: `/cbp-round-execute` Step 6 (hard-fail routing — this agent's `totals.hard_fail` is OR'd across `round.context.e2e_outputs` entries: any `e2e_outputs[f].test_results.failed > 0` or `e2e_outputs[f].status === 'failed'`, plus the `e2e_eligible_skipped` signal), `/cbp-round-end` Step 3 (reads this agent's `auto_qa[]` and `default_checklist[]`). This agent does not emit `user_qa` items; baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (an explicit accept-or-fix user decision; baselines are NEVER auto-accepted).
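The Step 6 hard-fail routing described in the new "Output consumed by" bullet can be read as a single boolean combination. The sketch below is one interpretation: the `test_results.failed` and `status` fields are taken from the bullet, but the surrounding `E2eOutput` type, the `roundHardFail` helper, and the treatment of `e2e_eligible_skipped` as a plain boolean are assumptions.

```typescript
// Assumed minimal shape of one round.context.e2e_outputs entry.
interface E2eOutput {
  status: string;
  test_results: { failed: number };
}

function roundHardFail(
  qaHardFail: boolean, // the QA agent's totals.hard_fail
  e2eOutputs: Record<string, E2eOutput>, // keyed by framework name
  e2eEligibleSkipped: boolean, // assumed boolean form of the e2e_eligible_skipped signal
): boolean {
  // Any framework with failed tests or a failed status trips the gate.
  const e2eFailed = Object.values(e2eOutputs).some(
    (out) => out.test_results.failed > 0 || out.status === "failed",
  );
  return qaHardFail || e2eFailed || e2eEligibleSkipped;
}
```

Keeping the combination in the consumer means neither agent has to read the other's output, which is consistent with the "fully independent, no cross-read" wording in the sibling bullet.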