npm - codebyplan - Versions diffs - 1.11.1 → 1.11.2 - Mend

codebyplan 1.11.1 → 1.11.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38)

package/dist/cli.js +56 -5
package/package.json +1 -1
package/templates/README.md +1 -1
package/templates/agents/cbp-cc-executor.md +1 -1
package/templates/agents/cbp-e2e-maestro.md +202 -0
package/templates/agents/cbp-e2e-playwright.md +229 -0
package/templates/agents/cbp-e2e-tauri.md +184 -0
package/templates/agents/cbp-e2e-vscode.md +203 -0
package/templates/agents/cbp-e2e-xcuitest.md +224 -0
package/templates/agents/cbp-improve-claude.md +1 -1
package/templates/agents/cbp-round-executor.md +11 -11
package/templates/agents/cbp-task-check.md +1 -1
package/templates/agents/cbp-task-planner.md +2 -0
package/templates/agents/cbp-testing-qa-agent.md +9 -9
package/templates/context/testing/e2e.md +303 -0
package/templates/hooks/validate-structure-lengths.sh +2 -0
package/templates/hooks/validate-structure-smoke.sh +2 -1
package/templates/hooks/validate-structure-templates.sh +1 -0
package/templates/rules/context-file-loading.md +4 -1
package/templates/rules/e2e-mandatory.md +70 -0
package/templates/skills/cbp-build-cc-agent/SKILL.md +16 -14
package/templates/skills/cbp-build-cc-agent/reference/cbp-quality.md +4 -4
package/templates/skills/cbp-build-cc-agent/scripts/validate-agent.sh +8 -6
package/templates/skills/cbp-build-cc-mode/SKILL.md +4 -4
package/templates/skills/cbp-checkpoint-check/SKILL.md +12 -8
package/templates/skills/cbp-checkpoint-plan/SKILL.md +2 -2
package/templates/skills/cbp-checkpoint-plan/reference/e2e-discovery-probe.md +5 -5
package/templates/skills/cbp-e2e-setup/SKILL.md +254 -0
package/templates/skills/cbp-e2e-setup/reference/maestro.md +200 -0
package/templates/skills/cbp-e2e-setup/reference/playwright.md +212 -0
package/templates/skills/cbp-e2e-setup/reference/tauri.md +147 -0
package/templates/skills/cbp-e2e-setup/reference/vscode.md +154 -0
package/templates/skills/cbp-e2e-setup/reference/xcuitest.md +185 -0
package/templates/skills/cbp-frontend-ui/SKILL.md +6 -6
package/templates/skills/cbp-frontend-ux/SKILL.md +1 -1
package/templates/skills/cbp-round-execute/SKILL.md +30 -17
package/templates/skills/cbp-task-check/SKILL.md +2 -2
package/templates/agents/cbp-test-e2e-agent.md +0 -363

package/templates/skills/cbp-e2e-setup/reference/xcuitest.md ADDED

@@ -0,0 +1,185 @@
+# XCUITest Reference
+Full walkthrough for iOS native E2E testing with XCUITest via the Expo `withXCUITests`
+plugin. Source: Apple XCUITest docs + Expo prebuild docs.
+## When to use XCUITest vs Maestro
+| Scenario | Use |
+| --- | --- |
+| Standard UI flows (login, navigation, forms) | Maestro — simpler, cross-platform |
+| Apple Watch companion app testing | XCUITest — Maestro can't target watchOS |
+| HealthKit permission dialogs | XCUITest — system dialogs not reachable by Maestro |
+| iOS system sheet interactions (share sheet, notification permissions) | XCUITest |
+| Face ID / Touch ID prompts | XCUITest |
+| Camera/microphone permission dialogs | XCUITest |
+Choose Maestro first; escalate to XCUITest only when Maestro genuinely cannot reach
+the target UI.
+## Prerequisites
+- macOS with Xcode 15+ installed
+- An active Apple Developer account (free tier sufficient for Simulator testing)
+- Expo managed workflow with prebuild enabled
+- `xcbeautify` for readable output: `brew install xcbeautify`
+## Setup — Expo withXCUITests plugin
+Add the plugin to `app.config.ts` (or `app.config.js`):
+```ts
+export default {
+  expo: {
+    plugins: [
+      ["expo-build-properties", { ios: { useFrameworks: "static" } }],
+      // Add your withXCUITests plugin config
+      ["./plugins/withXCUITests", {}],
+    ],
+  },
+};
+```
+If using the community `expo-xcuitest` plugin:
+```bash
+pnpm add -D expo-xcuitest
+```
+Then in `app.config.ts`:
+```ts
+plugins: [
+  ["expo-xcuitest", { testTargetName: "AppUITests" }]
+]
+```
+## Prebuild
+After updating `app.config.ts`, regenerate the native project:
+```bash
+expo prebuild --platform ios --clean
+```
+`--clean` ensures a fresh native project from the current config. Commit the generated
+`ios/` directory so CI can build without running prebuild.
+## Swift test class
+Create `ios/AppUITests/AppUITests.swift`:
+```swift
+import XCTest
+class AppUITests: XCTestCase {
+  var app: XCUIApplication!
+  override func setUpWithError() throws {
+    continueAfterFailure = false
+    app = XCUIApplication()
+    // Inject credentials via scheme environment variables
+    app.launchEnvironment["TEST_EMAIL"] = ProcessInfo.processInfo.environment["TEST_EMAIL"] ?? ""
+    app.launchEnvironment["TEST_PASSWORD"] = ProcessInfo.processInfo.environment["TEST_PASSWORD"] ?? ""
+    app.launch()
+  }
+  func testLoginFlow() throws {
+    // Wait for the login screen
+    let emailField = app.textFields["email-input"]
+    XCTAssertTrue(emailField.waitForExistence(timeout: 10))
+    emailField.tap()
+    emailField.typeText(app.launchEnvironment["TEST_EMAIL"]!)
+    let passwordField = app.secureTextFields["password-input"]
+    passwordField.tap()
+    passwordField.typeText(app.launchEnvironment["TEST_PASSWORD"]!)
+    app.buttons["sign-in-button"].tap()
+    // Assert post-login element
+    let dashboard = app.staticTexts["Dashboard"]
+    XCTAssertTrue(dashboard.waitForExistence(timeout: 15))
+  }
+}
+```
+## accessibilityIdentifier targeting
+Set `accessibilityIdentifier` in your React Native components so XCUITest can find them:
+```tsx
+// In React Native
+<TextInput
+  testID="email-input"          // becomes accessibilityIdentifier on iOS
+  accessibilityLabel="Email"
+/>
+```
+In XCUITest, query by identifier:
+```swift
+app.textFields["email-input"]     // TextInput
+app.buttons["sign-in-button"]     // TouchableOpacity / Pressable
+app.staticTexts["Dashboard"]      // Text component
+```
+## Credentials via scheme environment variables
+Rather than hardcoding credentials, inject them via the Xcode scheme.
+In Xcode: Product → Scheme → Edit Scheme → Run → Arguments → Environment Variables.
+Add `TEST_EMAIL` and `TEST_PASSWORD` pointing to your local values.
+For CI, pass them via `xcodebuild`:
+```bash
+xcodebuild test \
+  -workspace ios/YourApp.xcworkspace \
+  -scheme YourApp \
+  -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' \
+  TEST_EMAIL="$TEST_EMAIL" \
+  TEST_PASSWORD="$TEST_PASSWORD" \
+  | xcbeautify
+```
+## Running tests
+```bash
+xcodebuild test \
+  -workspace ios/YourApp.xcworkspace \
+  -scheme YourApp \
+  -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' \
+  | xcbeautify
+```
+## pnpm script
+```json
+{
+  "scripts": {
+    "xcuitest": "xcodebuild test -workspace ios/YourApp.xcworkspace -scheme YourApp -destination 'platform=iOS Simulator,name=iPhone 16,OS=latest' | xcbeautify"
+  }
+}
+```
+## Pitfalls
+**Simulator not booted** — `xcodebuild` will boot the simulator if needed, but the
+first run is slow. Pre-boot with `xcrun simctl boot "iPhone 16"` in CI setup.
+**accessibilityIdentifier vs testID** — React Native maps `testID` to
+`accessibilityIdentifier` on iOS. Ensure the component renders the prop all the way
+through; some wrappers drop it.
+**waitForExistence timeout** — always use `waitForExistence(timeout:)` rather than
+asserting element existence immediately. React Native renders asynchronously; the
+element may not be in the view hierarchy at the instant of the assertion.
+**Derived data cache** — stale derived data can cause confusing failures. Clear with
+`rm -rf ~/Library/Developer/Xcode/DerivedData` if tests pass locally but fail after
+a schema change.
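
The CI credential hand-off described in the added reference can also be sketched as a small TypeScript helper that a pnpm script might call. This is a minimal sketch, not part of the package: `buildXcuitestRun` and the `XcuitestRun` shape are hypothetical names, and it assumes the `TEST_RUNNER_` environment prefix described in the `xcodebuild` man page, under which `TEST_RUNNER_TEST_EMAIL` reaches the test runner process as `TEST_EMAIL`. Piping through `xcbeautify` is left to the caller.

```typescript
// Hypothetical helper (not part of codebyplan): describes an `xcodebuild test`
// invocation without running it, so a wrapper script can spawn it later.
interface XcuitestRun {
  command: string;
  args: string[];
  env: Record<string, string>;
}

function buildXcuitestRun(
  workspace: string,
  scheme: string,
  simulator: string,
  credentials: Record<string, string>,
): XcuitestRun {
  // Assumption: xcodebuild forwards TEST_RUNNER_<NAME> to the test runner
  // as <NAME>, so the Swift test can read it from ProcessInfo.
  const env: Record<string, string> = {};
  for (const name of Object.keys(credentials)) {
    env["TEST_RUNNER_" + name] = credentials[name];
  }
  return {
    command: "xcodebuild",
    args: [
      "test",
      "-workspace", workspace,
      "-scheme", scheme,
      "-destination", "platform=iOS Simulator,name=" + simulator + ",OS=latest",
    ],
    env,
  };
}

// Example values are placeholders, mirroring the names used in the reference.
const run = buildXcuitestRun("ios/YourApp.xcworkspace", "YourApp", "iPhone 16", {
  TEST_EMAIL: "qa@example.com",
  TEST_PASSWORD: "not-a-real-password",
});
```

Keeping the invocation as data rather than a shell string avoids quoting problems around the `-destination` value, which contains spaces and commas.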

package/templates/skills/cbp-frontend-ui/SKILL.md CHANGED

@@ -10,7 +10,7 @@ effort: xhigh
 Invoked twice per round in non-`claude_only` profiles:
 1. `round-executor` Step 3.8 — `phase: 'style_only'`, no e2e screenshots. Reviews token/spacing/typography/color/cohesion against the just-written code.
-2. `/cbp-round-execute` Step 5b — `phase: 'screenshot_review'`, with screenshots from `test-e2e-agent`. Reviews rendered output and detects baseline regressions.
+2. `/cbp-round-execute` Step 5b — `phase: 'screenshot_review'`, with screenshots from the `cbp-e2e-*` specialists. Reviews rendered output and detects baseline regressions.
 Default `phase: 'full'` runs everything (back-compat for any caller not yet migrated). Inline counterpart of the up-front `frontend-design` skill — `frontend-design` decides direction before code; `frontend-ui` reviews and polishes after code.
@@ -36,7 +36,7 @@ input:
   context:
     checkpoint_goal: string
     round_requirements: string
-  e2e_screenshots:                          # Required for phase 'screenshot_review' or 'full' (when present); empty / omitted for 'style_only'. Sourced from round.context.e2e_output.screenshots (populated by test-e2e-agent at /cbp-round-execute Step 5).
+  e2e_screenshots:                          # Required for phase 'screenshot_review' or 'full' (when present); empty / omitted for 'style_only'. Sourced from the aggregated round.context.e2e_outputs[*].screenshots (populated by the cbp-e2e-* specialists at /cbp-round-execute Step 5).
     - test_name: string
       path: string                          # Repo-relative or absolute path to PNG
       page_or_screen: string
@@ -213,7 +213,7 @@ The skill's auto-fix capability is for in-scope polish, not opportunistic sweeps
 **Specifically forbidden** (always out of scope, never edited regardless of `files_changed`):
 - `.claude/**` — managed infrastructure under user-level governance
-- Project test infrastructure (e.g., `playwright.config.*`, `e2e/**`) — governed by `test-e2e-agent`
+- Project test infrastructure (e.g., `playwright.config.*`, `e2e/**`) — governed by the `cbp-e2e-*` specialist agents
 - DB migrations (e.g., `supabase/migrations/**`) — governed by `database-agent`
 - Vendor mirrors and read-only reference trees
@@ -254,9 +254,9 @@ Go beyond fixing violations — actively improve visual quality. If spacing coul
 - **Loaded twice per round** (non-`claude_only` profiles):
   1. `round-executor` Step 3.8 with `phase: 'style_only'` and empty `e2e_screenshots[]` — reviews the just-written code's tokens/spacing/typography/color/cohesion (mandatory when files_changed contains UI / styling files)
-  2. `/cbp-round-execute` Step 5b with `phase: 'screenshot_review'` and screenshots from `round.context.e2e_output.screenshots` — runs Phase 6.5 only (rendered-output review + baseline regressions). Skipped when no e2e ran (`claude_only` / `backend` / `has_ui_work === false`).
-- **Also invoked by**: `/cbp-checkpoint-check` (TASK-2 deliverable, future) with screenshots from a whole-checkpoint e2e run
-- **Consumes**: `e2e_screenshots[]` from `round.context.e2e_output.screenshots` (populated by `test-e2e-agent` at `/cbp-round-execute` Step 5)
+  2. `/cbp-round-execute` Step 5b with `phase: 'screenshot_review'` and screenshots aggregated from `round.context.e2e_outputs[*].screenshots` — runs Phase 6.5 only (rendered-output review + baseline regressions). Skipped when no e2e ran (`claude_only` / `backend`, or no eligible framework in `.codebyplan/e2e.json`).
+- **Also invoked by**: `/cbp-checkpoint-check` with screenshots aggregated from a whole-checkpoint e2e run
+- **Consumes**: `e2e_screenshots[]` aggregated from `round.context.e2e_outputs[*].screenshots` (populated by the `cbp-e2e-*` specialists at `/cbp-round-execute` Step 5)
 - **Output written to**: `round.context.frontend_ui_review` — when invoked twice per round, the second invocation merges with the first
 - **Downstream gate**: this skill emits `findings[]` only. Baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (baselines never auto-accepted); rendered-visual critical findings are surfaced in the Step 7 findings presentation.
 - **Paired with**: `frontend-design` (pre-implementation aesthetic decision), `frontend-ux` (interaction-quality self-review, also Step 3.8)
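
The change from a single `e2e_output` to a framework-keyed `e2e_outputs` map means the skill's `e2e_screenshots[]` input is now a flatten across specialists. A minimal TypeScript sketch of that aggregation follows. The field names on `Screenshot` are the ones in the input contract above; `SpecialistOutput` and `aggregateScreenshots` are illustrative names, not identifiers from the package.

```typescript
// Screenshot fields as listed in the skill's input contract.
interface Screenshot {
  test_name: string;
  path: string;
  page_or_screen: string;
}

// Illustrative shape: only the part of a specialist output used here.
interface SpecialistOutput {
  screenshots?: Screenshot[];
}

// Flattens screenshots from every specialist that ran. A missing map or a
// specialist without screenshots contributes nothing.
function aggregateScreenshots(
  e2eOutputs?: Record<string, SpecialistOutput>,
): Screenshot[] {
  const outputs = e2eOutputs ?? {};
  let all: Screenshot[] = [];
  for (const framework of Object.keys(outputs)) {
    all = all.concat(outputs[framework].screenshots ?? []);
  }
  return all;
}

// Placeholder data: one specialist with two screenshots, one with none.
const aggregated = aggregateScreenshots({
  playwright: {
    screenshots: [
      { test_name: "login", path: "e2e/shots/login.png", page_or_screen: "/login" },
      { test_name: "home", path: "e2e/shots/home.png", page_or_screen: "/" },
    ],
  },
  maestro: {},
});
const nothingRan = aggregateScreenshots(undefined);
```

An empty result corresponds to the documented skip condition for the `screenshot_review` phase.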

package/templates/skills/cbp-frontend-ux/SKILL.md CHANGED

@@ -148,7 +148,7 @@ This rule applies to every file. The skill's auto-fix surface exists because in-
 **Specifically forbidden** (always out of scope, never edited regardless of `files_changed`):
 - `.claude/**` — managed infrastructure under user-level governance
-- Project test infrastructure (e.g., `playwright.config.*`, `e2e/**`) — governed by `test-e2e-agent`
+- Project test infrastructure (e.g., `playwright.config.*`, `e2e/**`) — governed by the `cbp-e2e-*` specialist agents
 - DB migrations (e.g., `supabase/migrations/**`) — governed by `database-agent`
 - Vendor mirrors and read-only reference trees

package/templates/skills/cbp-round-execute/SKILL.md CHANGED

@@ -56,7 +56,7 @@ Execute the survey instructions inline using Read/Grep/Bash. Save to `round.cont
 For each entry, route per `rules/file-routing.md`:
 - `.claude/skills/{name}/SKILL.md` → `cbp-build-cc-skill` via Skill tool
-- `.claude/agents/{name}/AGENT.md` → `cbp-build-cc-agent` via Skill tool
+- `.claude/agents/{name}.md` (or `{name}/AGENT.md` folder form) → `cbp-build-cc-agent` via Skill tool
 - `.claude/rules/{name}.md` → `cbp-build-cc-rule` via Skill tool
 - `.claude/CLAUDE.md` → `cbp-build-cc-claude-file` via Skill tool (or direct Edit)
 - `.claude/settings*.json` → `cbp-build-cc-settings` via Skill tool
@@ -145,28 +145,40 @@ Read `task.context.testing_profile` (already loaded in Step 2).
 On pass, synthesise `testing_qa_output` inline per the procedure in `reference/inline-fallback.md` "Validation fallback" section (output shape defined in `agents/cbp-testing-qa-agent.md` Output Contract) and persist to `round.context.testing_qa_output` at Step 7.
-**All other profiles**: spawn `cbp-testing-qa-agent` AND `cbp-test-e2e-agent` in parallel (two Agent calls in the same message) per completed wave (or full executor output in single-wave mode). `cbp-test-e2e-agent` is gated on `has_ui_work === true` AND profile in {`web`, `desktop`, `full_matrix`, `cross_app`} — skipped for `claude_only` / `backend`-only.
+**All other profiles**: spawn `cbp-testing-qa-agent` against the wave's `files[]` (or full executor output in single-wave mode), and dispatch e2e specialists **config-driven** in parallel — all Agent calls in the same message:
-Input contracts: `cbp-testing-qa-agent` receives `executor_output`, `testing_profile`, `has_ui_work` (see `agents/cbp-testing-qa-agent.md` Input Contract). `cbp-test-e2e-agent` receives `repo_id`, `round_number`, `files_changed`, `prior_round_files_changed` (full task aggregate when round_number ≥ 2), `whole_checkpoint_mode: false`, `test_strategy`, `pages_affected`, `has_auth`, `dev_server_port` (see `agents/cbp-test-e2e-agent.md` Input Contract for the full shape).
+1. **Short-circuit hints** (applied *before* reading `e2e.json`, emit no `e2e_eligible_skipped` signal): if `testing_profile === 'backend'` OR `round.context.round_type === 'survey'`, dispatch `cbp-testing-qa-agent` alone and skip e2e entirely. (The `claude_only` branch above already skips all agent spawns.)
+2. Read `.codebyplan/e2e.json`. If the file is absent or `frameworks` is missing/empty, no framework is eligible — skip e2e entirely (no `e2e_eligible_skipped` signal) and run `cbp-testing-qa-agent` alone.
+3. For each entry in `frameworks` where `enabled === true` AND `auto_run === true`: if `platforms[]` does not include the current CI target (e.g. an iOS-only config on a Linux runner with no simulator), skip the framework — a recorded valid platform skip per `rules/e2e-mandatory.md`, NOT added to `e2e_eligible[]`. Otherwise mark it **eligible** when its `app` source path intersects this wave's `files_changed` (repo root for single-app repos). Record the eligible framework names as `round.context.e2e_eligible[]`.
+4. For every eligible framework, spawn the matching `cbp-e2e-*` specialist (per the `context/testing/e2e.md` dispatch routing table) IN PARALLEL with `cbp-testing-qa-agent` and with each other. Inject `framework`, `app`, `platforms`, and `credential_vars` from `e2e.json` — the config is authoritative; agents do not auto-detect.
+5. `has_ui_work` and `testing_profile` are **hints only** beyond the short-circuit above — they never suppress an eligible framework. Pure `.claude/`-only and docs-only rounds match no configured `app` path and are therefore not eligible.
-**Independence**: neither agent reads the other's output. Baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (an explicit accept-or-fix user decision; baselines are NEVER auto-accepted). Per-wave spawns MAY run in parallel with the next wave's executor when dependency order allows.
+This realises the opt-out contract in `rules/e2e-mandatory.md`: an eligible framework whose specialist does not run — without a recorded valid skip reason — is an `e2e_eligible_skipped` hard-fail at Step 6.
+Input contracts: `cbp-testing-qa-agent` receives `executor_output`, `testing_profile`, `has_ui_work` (see `agents/cbp-testing-qa-agent.md` Input Contract). The `cbp-e2e-*` specialist receives `repo_id`, `round_number`, `files_changed`, `prior_round_files_changed` (full task aggregate when round_number ≥ 2), `whole_checkpoint_mode: false`, `framework`, `app`, `platforms`, `credential_vars`, `test_strategy`, `pages_affected`, `has_auth`, `dev_server_port` (see `context/testing/e2e.md` Input Contract for the full shape). `test_strategy` is injected here in per-round mode; `/cbp-checkpoint-check` Step 5b omits it (the specialist self-resolves from `e2e.json` + DB in `whole_checkpoint_mode`).
+**Independence**: neither agent reads the other's output. Baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (an explicit accept-or-fix user decision; baselines are NEVER auto-accepted). Per-wave spawns MAY run in parallel with the next wave's executor when dependency order allows. The `cbp-e2e-*` specialists are parallel siblings of `cbp-testing-qa-agent` — they do not share state.
 ### Step 5b: Post-E2E Screenshot Review (cbp-frontend-ui Phase 6.5)
-When `round.context.e2e_output.screenshots[]` is non-empty, invoke the `cbp-frontend-ui` skill with `phase: 'screenshot_review'` (input: `files_changed`, `e2e_screenshots: round.context.e2e_output.screenshots`, `context: { checkpoint_goal, round_requirements }`). Under this phase the skill runs only Phase 6.5 (Rendered-Output Visual Review) + 7 + 8 — Phases 1-6 (style) already ran inline at executor Step 3.8 with `phase: 'style_only'`.
+Aggregate screenshots across ALL specialists that ran: `screenshots = Object.values(round.context.e2e_outputs ?? {}).flatMap(o => o.screenshots ?? [])`. When the aggregated list is non-empty, invoke the `cbp-frontend-ui` skill with `phase: 'screenshot_review'` (input: `files_changed`, `e2e_screenshots: <aggregated screenshots>`, `context: { checkpoint_goal, round_requirements }`). Under this phase the skill runs only Phase 6.5 (Rendered-Output Visual Review) + 7 + 8 — Phases 1-6 (style) already ran inline at executor Step 3.8 with `phase: 'style_only'`.
 Persist findings to `round.context.frontend_ui_review` (merge with Step 3.8's style-only output if present). Baseline-regression findings surface as a BLOCKING gate at `/cbp-round-end` Step 7 (an explicit accept-or-fix user decision; baselines are NEVER auto-accepted); rendered_visual critical findings are surfaced in the Step 7 findings presentation. Neither auto-fails the round. cbp-testing-qa-agent does NOT read these findings (full independence per Step 5).
-**Skip** when `round.context.e2e_output` is absent, `screenshots` is empty, or `testing_profile === 'claude_only'`.
+**Skip** when `round.context.e2e_outputs` is absent/empty, the aggregated `screenshots` list is empty, or `testing_profile === 'claude_only'`.
 ### Step 6: Hard-Fail Routing
-Per-wave hard-fail signal: `testing_qa_output.totals.hard_fail || e2e_output.status === 'failed' || e2e_output.test_results?.failed > 0`.
+Per-wave hard-fail signal — true when ANY hold:
+- `testing_qa_output.totals.hard_fail === true`.
+- For any framework `f` in `round.context.e2e_outputs`: `e2e_outputs[f].status === 'failed'` OR `e2e_outputs[f].test_results?.failed > 0`.
+- **`e2e_eligible_skipped`**: any framework in `round.context.e2e_eligible[]` for which no specialist output exists in `round.context.e2e_outputs` AND no valid skip reason is recorded (per the `rules/e2e-mandatory.md` valid-skip list). A silently-skipped eligible framework is a hard-fail.
 **All waves hard_fail: false** → proceed to Step 7. **Any wave hard_fail: true**:
-- **Simple fixes** (type errors, lint, missing imports, test assertion fixes, e2e `real`-category with clear code-side root cause, no prior re-trigger this round) → save failure details to round context; retrigger the failing wave's executor; re-run testing-qa AND test-e2e for that wave.
-- **Structural OR already re-triggered once OR e2e preflight aborts** → save failure context via MCP `update_round`; auto-trigger `/cbp-round-input`. STOP.
+- **Simple fixes** (type errors, lint, missing imports, test assertion fixes, e2e `real`-category with clear code-side root cause, no prior re-trigger this round) → save failure details to round context; retrigger the failing wave's executor; re-run testing-qa AND the eligible `cbp-e2e-*` specialists for that wave.
+- **Structural OR already re-triggered once OR e2e preflight aborts OR `e2e_eligible_skipped`** → save failure context via MCP `update_round`; auto-trigger `/cbp-round-input`. STOP.
 ## Inline execution fallback
@@ -180,9 +192,9 @@ When `cbp-testing-qa-agent` spawn fails OR the resolved `testing_profile` is `cl
 Update round context via MCP `update_round`:
-- `context`: { ...existing, executor_output, testing_qa_output, e2e_output, frontend_ui_review }
+- `context`: { ...existing, executor_output, testing_qa_output, e2e_eligible, e2e_outputs, frontend_ui_review }
-`e2e_output` and `frontend_ui_review` are present only when the gates above admitted them (e2e ran AND Step 5b ran).
+`e2e_outputs` (a framework-keyed map of specialist outputs, e.g. `{ playwright: {...}, maestro: {...} }`) and `frontend_ui_review` are present only when the gates above admitted them (≥1 eligible framework ran AND Step 5b ran). `e2e_eligible[]` records which frameworks were eligible this round and drives the Step 6 `e2e_eligible_skipped` check.
 ### Step 8: Auto-trigger Round End
@@ -195,17 +207,18 @@ Trigger `/cbp-round-end`.
 ## Key Rules
 - **Code + test writing + inline validation** — planning lives in `round-start`, summary in `round-end`
-- Per-wave `cbp-testing-qa-agent` AND `cbp-test-e2e-agent` run in parallel (both against the same wave's `files[]`); they may also run in parallel with the NEXT wave's executor when dependency order allows
-- `testing_profile` from `task.context` governs which checks run — read it once in Step 2; pass to every testing-qa + test-e2e spawn
-- `claude_only` profile skips all agent spawns (testing-qa AND test-e2e); runs hook syntax and skill structure checks inline
-- Step 5b (cbp-frontend-ui Phase 6.5) runs only when e2e produced screenshots — gated on `e2e_output.screenshots[]` non-empty
+- Per-wave `cbp-testing-qa-agent` AND the `cbp-e2e-*` specialist run in parallel (both against the same wave's `files[]`); they may also run in parallel with the NEXT wave's executor when dependency order allows
+- `testing_profile` from `task.context` governs which checks run — read it once in Step 2; pass to every testing-qa + e2e specialist spawn
+- `claude_only` profile skips all agent spawns (testing-qa AND `cbp-e2e-*`); runs hook syntax and skill structure checks inline
+- E2E dispatch is **config-driven and opt-out** (`.codebyplan/e2e.json`), not gated on `has_ui_work`/`testing_profile` — an eligible framework that silently does not run is an `e2e_eligible_skipped` hard-fail (`rules/e2e-mandatory.md`)
+- Step 5b (cbp-frontend-ui Phase 6.5) runs only when e2e produced screenshots — gated on the aggregated `e2e_outputs[*].screenshots[]` being non-empty
 - Claude NEVER git adds files in round commands
 ## Integration
 - **Reads**: MCP `get_current_task`, `get_rounds`
-- **Writes**: MCP `update_round` (context with executor_output + testing_qa_output + e2e_output + frontend_ui_review)
-- **Spawns**: `cbp-round-executor` (per wave or single), `cbp-testing-qa-agent` (per wave, parallel sibling of cbp-test-e2e-agent), `cbp-test-e2e-agent` (per wave when has_ui_work + non-claude_only profile), `cbp-database-agent` (if DB work), `cbp-security-agent` (if security review needed)
+- **Writes**: MCP `update_round` (context with executor_output + testing_qa_output + e2e_eligible + e2e_outputs + frontend_ui_review)
+- **Spawns**: `cbp-round-executor` (per wave or single), `cbp-testing-qa-agent` (per wave, parallel sibling of the `cbp-e2e-*` specialists), the `cbp-e2e-*` specialists (config-driven dispatch per `context/testing/e2e.md`, one per eligible framework in `.codebyplan/e2e.json`), `cbp-database-agent` (if DB work), `cbp-security-agent` (if security review needed)
 - **Skill invocations**: `cbp-frontend-ui` at Step 5b with `phase: 'screenshot_review'` (post-e2e)
 - **Triggers**: `/cbp-round-end` (auto)
 - **Triggered by**: `/cbp-round-start` (auto, after plan approval)
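
The new Step 5 eligibility rule and the Step 6 hard-fail signal in this file can be read as two small predicates. The TypeScript below is a sketch under stated assumptions, not the package's implementation: the diff does not show the `.codebyplan/e2e.json` schema, so `frameworks` is assumed to be an array of entries with a `framework` name, `app` is assumed to be a directory prefix (repo-root handling for single-app repos is left out), platform strings such as `"linux"` are placeholders, and `recordedSkips` is a hypothetical map standing in for the valid skip reasons defined in `rules/e2e-mandatory.md`.

```typescript
// Assumed entry shape; only fields named in the diff are used.
interface FrameworkEntry {
  framework: string;
  enabled: boolean;
  auto_run: boolean;
  app: string;
  platforms: string[];
}

interface E2eOutput {
  status: string;
  test_results?: { failed: number };
}

// Step 5 sketch: enabled + auto_run, runnable on the current target, and
// at least one changed file under the framework's app path.
function eligibleFrameworks(
  frameworks: FrameworkEntry[],
  filesChanged: string[],
  currentPlatform: string,
): string[] {
  return frameworks
    .filter((f) => f.enabled && f.auto_run)
    .filter((f) => f.platforms.indexOf(currentPlatform) !== -1)
    .filter((f) => {
      const prefix = f.app.replace(/\/+$/, "") + "/";
      return filesChanged.some((file) => file.indexOf(prefix) === 0);
    })
    .map((f) => f.framework);
}

// Step 6 sketch: QA hard-fail, any failed specialist, or an eligible
// framework with neither an output nor a recorded skip reason.
function isHardFail(
  qaHardFail: boolean,
  eligible: string[],
  outputs: Record<string, E2eOutput>,
  recordedSkips: Record<string, string>,
): boolean {
  const anyFailed = Object.keys(outputs).some((name) => {
    const out = outputs[name];
    return out.status === "failed" || (out.test_results?.failed ?? 0) > 0;
  });
  const eligibleSkipped = eligible.some(
    (name) => !(name in outputs) && !(name in recordedSkips),
  );
  return qaHardFail || anyFailed || eligibleSkipped;
}

// Placeholder config: xcuitest is filtered by platform, maestro is disabled.
const frameworks: FrameworkEntry[] = [
  { framework: "playwright", enabled: true, auto_run: true, app: "apps/web", platforms: ["linux", "macos"] },
  { framework: "xcuitest", enabled: true, auto_run: true, app: "apps/mobile", platforms: ["macos"] },
  { framework: "maestro", enabled: false, auto_run: true, app: "apps/mobile", platforms: ["linux", "macos"] },
];
const eligible = eligibleFrameworks(
  frameworks,
  ["apps/web/src/login.tsx", "apps/mobile/App.tsx"],
  "linux",
);
const silentlySkipped = isHardFail(false, eligible, {}, {});
const ranClean = isHardFail(
  false,
  eligible,
  { playwright: { status: "passed", test_results: { failed: 0 } } },
  {},
);
```

In this example only `playwright` is eligible, so a round with no `playwright` output and no recorded skip trips the `e2e_eligible_skipped` branch, while a clean `playwright` run does not.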

package/templates/skills/cbp-task-check/SKILL.md CHANGED

@@ -18,12 +18,12 @@ If the `cbp-task-check` agent spawn fails for any reason (`API Error: Extra usag
 Procedure summary (pointer back to canonical):
 1. Detect the failure class from the error string; record `round.context.task_check_findings.spawn_failure = { class, error_message, decided_at }`.
-2. Walk the agent's documented Phase 1-10 checklist inline using `Read` / `Grep` / `Bash` / MCP `get_*` tools — the agent's AGENT.md is the inline script.
+2. Walk the agent's documented Phase 1-10 checklist inline using `Read` / `Grep` / `Bash` / MCP `get_*` tools — the agent's definition file is the inline script.
 3. Populate the agent's output contract (`verdict`, `route_recommendation`, `requirements_status`, `qa_status`, `code_review_findings`, `user_satisfaction`, `scope_divergence_detected`, etc.) with `mode: 'inline_fallback'` so analytics distinguishes.
 4. Apply the pre-emptive-skip rule: when the same failure class fired in the previous skill of this session, skip the spawn attempt entirely and go straight to inline.
 5. Continue the skill — do NOT abort. Inline-fallback is intended to keep the pipeline moving under sustained outages.
-Inline-fallback is NOT a quality downgrade trapdoor — every Phase from the AGENT.md MUST be walked, in order, with the same Read/Grep depth the agent would have used. Skipping phases under the banner of fallback is a separate failure mode that `cbp-improve-claude` flags as `inline_fallback_shortcutting`.
+Inline-fallback is NOT a quality downgrade trapdoor — every Phase from the agent definition MUST be walked, in order, with the same Read/Grep depth the agent would have used. Skipping phases under the banner of fallback is a separate failure mode that `cbp-improve-claude` flags as `inline_fallback_shortcutting`.
 ## When Used