codebyplan 1.11.1 → 1.11.2
- package/dist/cli.js +56 -5
- package/package.json +1 -1
- package/templates/README.md +1 -1
- package/templates/agents/cbp-cc-executor.md +1 -1
- package/templates/agents/cbp-e2e-maestro.md +202 -0
- package/templates/agents/cbp-e2e-playwright.md +229 -0
- package/templates/agents/cbp-e2e-tauri.md +184 -0
- package/templates/agents/cbp-e2e-vscode.md +203 -0
- package/templates/agents/cbp-e2e-xcuitest.md +224 -0
- package/templates/agents/cbp-improve-claude.md +1 -1
- package/templates/agents/cbp-round-executor.md +11 -11
- package/templates/agents/cbp-task-check.md +1 -1
- package/templates/agents/cbp-task-planner.md +2 -0
- package/templates/agents/cbp-testing-qa-agent.md +9 -9
- package/templates/context/testing/e2e.md +303 -0
- package/templates/hooks/validate-structure-lengths.sh +2 -0
- package/templates/hooks/validate-structure-smoke.sh +2 -1
- package/templates/hooks/validate-structure-templates.sh +1 -0
- package/templates/rules/context-file-loading.md +4 -1
- package/templates/rules/e2e-mandatory.md +70 -0
- package/templates/skills/cbp-build-cc-agent/SKILL.md +16 -14
- package/templates/skills/cbp-build-cc-agent/reference/cbp-quality.md +4 -4
- package/templates/skills/cbp-build-cc-agent/scripts/validate-agent.sh +8 -6
- package/templates/skills/cbp-build-cc-mode/SKILL.md +4 -4
- package/templates/skills/cbp-checkpoint-check/SKILL.md +12 -8
- package/templates/skills/cbp-checkpoint-plan/SKILL.md +2 -2
- package/templates/skills/cbp-checkpoint-plan/reference/e2e-discovery-probe.md +5 -5
- package/templates/skills/cbp-e2e-setup/SKILL.md +254 -0
- package/templates/skills/cbp-e2e-setup/reference/maestro.md +200 -0
- package/templates/skills/cbp-e2e-setup/reference/playwright.md +212 -0
- package/templates/skills/cbp-e2e-setup/reference/tauri.md +147 -0
- package/templates/skills/cbp-e2e-setup/reference/vscode.md +154 -0
- package/templates/skills/cbp-e2e-setup/reference/xcuitest.md +185 -0
- package/templates/skills/cbp-frontend-ui/SKILL.md +6 -6
- package/templates/skills/cbp-frontend-ux/SKILL.md +1 -1
- package/templates/skills/cbp-round-execute/SKILL.md +30 -17
- package/templates/skills/cbp-task-check/SKILL.md +2 -2
- package/templates/agents/cbp-test-e2e-agent.md +0 -363
@@ -1,7 +1,7 @@
 ---
 scope: org-shared
 name: cbp-build-cc-agent
-description: Build a Claude Code subagent at .claude/agents/{name}
+description: Build a Claude Code subagent at .claude/agents/{name}.md (flat form, per the official sub-agents spec) following the official sub-agents spec (frontmatter, tools, model, memory, hooks, skills preload, permission modes, isolation).
 argument-hint: "[agent-name] [--scope=project|user] [--memory=project|user|local] [--isolation=worktree]"
 allowed-tools: Read, Write, Edit, Glob, Grep, Bash(mkdir *), Bash(chmod *)
 effort: xhigh
@@ -32,18 +32,18 @@ Do **not** create a subagent for one-off work — spawn `general-purpose` inline
 
 - Name must be lowercase kebab-case (a–z, 0–9, hyphens only)
 - Reject collisions with built-in agents: `Explore`, `Plan`, `general-purpose`, `statusline-setup`, `Claude Code Guide`
-- Reject if `.claude/agents/{name}
+- Reject if `.claude/agents/{name}.md` already exists unless the user asked to update
 
 ### Step 2 — Pick scope
 
-
+The default layout is **flat form** — a single `.claude/agents/{name}.md` file, matching the official Claude Code sub-agents spec and all 12 existing CBP agents. Use folder form `{name}/AGENT.md` only when the agent bundles supporting files (context, examples, scripts) next to it.
 
-| Scope | Path
-| ------- |
-| project | `.claude/agents/{name}
-| user | `~/.claude/agents/{name}
+| Scope | Path | Use when |
+| ------- | ------------------------------- | --------------------------------------- |
+| project | `.claude/agents/{name}.md` | Repo-specific, shared with team via git |
+| user | `~/.claude/agents/{name}.md` | Personal, reused across all projects |
 
-Default: `project`. Pass `--scope=user` to override.
+Default: `project`. Pass `--scope=user` to override. Use folder form `{scope-path}/{name}/AGENT.md` only when bundling supporting files alongside the agent definition.
 
 ### Step 3 — Read the template and CBP context
 
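The Step 1 naming rules in the hunk above (lowercase kebab-case, no collision with a built-in agent) could be checked roughly as follows. This is a minimal sketch, not the skill's own implementation: the function name `valid_agent_name` is hypothetical, and it assumes collisions are exact, case-sensitive matches, so only the two built-in names that pass the character check need an explicit rejection.

```bash
#!/bin/bash
# Hypothetical helper, not part of the codebyplan package.
# Returns 0 when the name uses only a-z, 0-9 and hyphens and does not
# collide with a built-in agent name listed in the skill.
valid_agent_name() {
  local name="$1"
  [ -n "$name" ] || return 1
  case "$name" in
    # Any character outside the allowed set fails the format rule.
    # An explicit character list avoids locale-dependent ranges.
    *[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
  esac
  case "$name" in
    # Assumption: exact-match collision. Explore, Plan and
    # "Claude Code Guide" already fail the character check above.
    general-purpose|statusline-setup) return 1 ;;
  esac
  return 0
}
```

A caller would typically run `valid_agent_name "$name" || exit 1` before checking whether `.claude/agents/{name}.md` already exists.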
@@ -89,8 +89,8 @@ Cross-references for complex cases:
 
 ### Step 6 — Write the agent file
 
-1.
-2.
+1. Write `{scope-path}/{name}.md` (flat form — default for all new agents)
+2. If the agent bundles supporting files (context, examples, scripts), create a folder instead: `mkdir -p {scope-path}/{name}` and write `{scope-path}/{name}/AGENT.md`
 3. Copy the template, fill frontmatter, then write the system prompt as the markdown body. The body becomes the agent's entire system prompt — no CLAUDE.md prefix, no Claude Code defaults.
 
 System prompt guidance:
@@ -100,16 +100,18 @@ System prompt guidance:
 - Output format it must return
 - Failure modes (when to return `blocked`, `failed`, `no_findings`)
 
-Supporting files (optional) live next to
+Supporting files (optional) live next to the agent in a same-name folder: `{name}/context.md`, `{name}/examples/`, etc. Only create the folder when you are actually writing supporting files.
 
 ### Step 7 — Validate
 
 Run the validator:
 
 ```bash
-bash "${CLAUDE_SKILL_DIR}/scripts/validate-agent.sh" "{scope-path}/{name}
+bash "${CLAUDE_SKILL_DIR}/scripts/validate-agent.sh" "{scope-path}/{name}.md"
 ```
 
+(Pass `{scope-path}/{name}/AGENT.md` instead if folder form was chosen.)
+
 It checks: name format, frontmatter parses, required fields present (`name`, `description`, `scope`), tool names are valid, model alias is recognised.
 
 ### Step 8 — Surface to the user
@@ -126,7 +128,7 @@ Report:
 
 - **Triggered by**: user invocation
 - **Reads**: `${CLAUDE_SKILL_DIR}/templates/agent.md`, `${CLAUDE_SKILL_DIR}/reference/*.md` (including `cbp-quality.md`)
-- **Writes**: `.claude/agents/{name}
+- **Writes**: `.claude/agents/{name}.md` (flat default) or `.claude/agents/{name}/AGENT.md` (folder form, when bundling supporting files); same under `~/.claude/` for user scope
 - **Related skills**: `/cbp-build-cc-skill` (skill-level workflows), `/cbp-build-cc-settings` (session-level `SubagentStart`/`SubagentStop` hooks), `/cbp-build-cc-mode` (canonical model/effort matrix — audit or apply)
 
 ## Key Rules
@@ -136,4 +138,4 @@ Report:
 - `bypassPermissions` and `acceptEdits` inherited from the parent cannot be weakened by the child
 - Preloaded skills inject _full content_ at startup — don't preload `disable-model-invocation: true` skills
 - Restart the session or use `/agents` to load a newly written agent file
-- CBP folder form is
+- Flat form `.claude/agents/{name}.md` is the default and matches all 12 existing CBP agents; folder form `{name}/AGENT.md` is optional and only for agents that bundle supporting files alongside them
@@ -4,11 +4,11 @@ scope: org-shared
 
 # Agent Authoring Quality
 
-Quality expectations and structure for `/.claude/agents/{name}
+Quality expectations and structure for `/.claude/agents/{name}.md` files. This file adds CBP-specific constraints on top of the official Claude Code sub-agents spec.
 
-##
+## Agent Layout
 
-CBP uses the
+CBP uses the flat form `.claude/agents/{name}.md` — the default per the official Claude Code sub-agents spec. All 12 existing CBP agents use this layout. Folder form `.claude/agents/{name}/AGENT.md` is optional and only appropriate when the agent bundles supporting files (context, examples, scripts) next to the definition.
 
 ## Required CBP Frontmatter
 
@@ -125,7 +125,7 @@ Agents that modify `.claude/` files MUST:
 
 | Situation | Action |
 | ---------------------------------------- | ----------------------------------------------- |
-| Existing agent covers the domain | Update its
+| Existing agent covers the domain | Update its agent definition file |
 | New domain, no existing agent covers it | Create new agent |
 | Existing agent is too large (>400 lines) | Split into orchestrator + specialist |
 | One-off task, not recurring | Don't create an agent — use inline instructions |
@@ -1,11 +1,12 @@
 #!/bin/bash
-# Validate a Claude Code subagent file (CBP folder form).
-# Usage: validate-agent.sh <path-to-
+# Validate a Claude Code subagent file (flat form or CBP folder form).
+# Usage: validate-agent.sh <path-to-agent-file>
+# Accepts: flat agents/{name}.md (spec default) or folder agents/{name}/AGENT.md (for bundling).
 # Exit 0 = valid, exit 1 = invalid (errors printed to stderr).
 
 set -uo pipefail
 
-FILE="${1:?Usage: validate-agent.sh <path-to-
+FILE="${1:?Usage: validate-agent.sh <path-to-agent-file>}"
 
 if [ ! -f "$FILE" ]; then
 echo "ERROR: file not found: $FILE" >&2
@@ -15,10 +16,11 @@ fi
 errors=0
 err() { echo " - $1" >&2; errors=$((errors + 1)); }
 
-#
+# Accept flat agents/{name}.md (spec default) or folder agents/{name}/AGENT.md (when bundling)
 case "$FILE" in
-*/AGENT.md) ;;
-
+*/AGENT.md) ;; # folder form (agent bundles supporting files)
+*/agents/*.md) ;; # flat form — Claude Code spec default
+*) err "expected a subagent markdown file under agents/ — flat agents/{name}.md or folder agents/{name}/AGENT.md (got: $FILE)";;
 esac
 
 # Frontmatter must be present
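To see how the three `case` arms in the hunk above sort a path, the same patterns can be wrapped in a small function. This is an illustrative sketch only: `agent_form` is a hypothetical name, and the real script calls `err` rather than printing a label.

```bash
#!/bin/bash
# Hypothetical helper mirroring the case arms of validate-agent.sh.
# Prints which layout a path would be treated as.
agent_form() {
  case "$1" in
    */AGENT.md)    echo "folder" ;;   # folder form: agents/{name}/AGENT.md
    */agents/*.md) echo "flat" ;;     # flat form: agents/{name}.md
    *)             echo "invalid" ;;  # anything else is reported as an error
  esac
}

agent_form ".claude/agents/cbp-research.md"
agent_form ".claude/agents/cbp-research/AGENT.md"
agent_form "docs/readme.md"
```

One property worth knowing: in a `case` pattern `*` also matches `/`, so a supporting file such as `.claude/agents/foo/context.md` matches the flat arm too. Whether that permissiveness is intended is not stated in the diff.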
@@ -1,7 +1,7 @@
 ---
 scope: org-shared
 name: cbp-build-cc-mode
-description: Audit + apply the CHK-109 model/effort matrix across every authoring skill and agent under `packages/codebyplan-package/templates/`. Bare invocation walks all SKILL.md and
+description: Audit + apply the CHK-109 model/effort matrix across every authoring skill and agent under `packages/codebyplan-package/templates/`. Bare invocation walks all SKILL.md and agent .md files and reports frontmatter gaps; with a path argument, edits the target file's frontmatter to the matrix-decided values.
 argument-hint: "[path-to-agent-or-skill]"
 allowed-tools: Read, Write, Edit, Glob, Grep
 effort: xhigh
@@ -23,7 +23,7 @@ Audit or apply the canonical `model:` + `effort:` frontmatter convention across
 
 `model: sonnet` + `effort: xhigh`
 
-
+Fifteen of the 16 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-improve-round`, `cbp-research`, `cbp-round-executor`, `cbp-security-agent`, `cbp-task-check`, `cbp-task-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The 16th — `cbp-mechanical-edits` — is an explicit haiku-low exception (see below). 27 skills take the default: cbp-round-start, cbp-round-input, cbp-round-execute, cbp-task-create, cbp-task-start, cbp-task-complete, cbp-task-testing, cbp-checkpoint-create, cbp-checkpoint-check, cbp-checkpoint-end, cbp-build-cc-mode, cbp-build-cc-agent, cbp-build-cc-skill, cbp-build-cc-rule, cbp-build-cc-claude-file, cbp-build-cc-memory, cbp-build-cc-settings, cbp-frontend-a11y, cbp-frontend-design, cbp-frontend-ui, cbp-frontend-ux, cbp-session-end, cbp-ship, cbp-ship-configure, cbp-supabase-setup, cbp-supabase-migrate, cbp-supabase-branch-check.
 
 ### Effort-lowered skills (5)
 
@@ -55,7 +55,7 @@ Eleven of the 12 authoring agents take the default (`cbp-cc-executor`, `cbp-data
 
 ### Haiku-low agents (1)
 
-`model: haiku` + `effort: low`. The
+`model: haiku` + `effort: low`. The 16th authoring agent — pure I/O mechanical work, never authors logic.
 
 | agent | model | effort | reason |
 | -------------------- | ----- | ------ | ----------------------------------------------------------------------------------- |
@@ -87,7 +87,7 @@ The audit is read-only — no edits performed.
 
 ## Apply Mode
 
-Invoked with a path argument (the target `SKILL.md` or
+Invoked with a path argument (the target `SKILL.md` or agent `.md`). Procedure:
 
 1. Read the file. Identify the skill or agent name from the `name:` frontmatter field (or from the path basename if `name:` is absent).
 2. Look up target `model` + `effort` from the matrix above.
@@ -90,11 +90,9 @@ Re-run build/lint/types on current codebase to verify nothing regressed across t
 
 Aggregate the files touched across all tasks (reusing Step 4's deduplicated table) and run e2e once against the union of pages they affect.
 
-1. **Build `pages_affected`** — derive from Step 4's `files_changed` union using the same heuristic as `
+1. **Build `pages_affected`** — derive from Step 4's `files_changed` union using the same heuristic as `context/testing/e2e.md` Step 5.1 (Next.js: `app/<route>/page.tsx` chains; Expo: screen files; Tauri: route components; fallback: directory-based grouping). Deduplicate by route / screen.
 
-2. **
-
-`test_strategy` is intentionally omitted — the agent auto-detects per-app via its Step 1.5 DB tech-stack lookup (`get_repos`) and Step 2 filesystem reconciliation. Mixed-framework monorepos disambiguate via `tech_stack.apps[]`; pre-pass `test_strategy` only when DB tech_stack is empty AND filesystem probe is ambiguous.
+2. **Config-driven dispatch** (per `context/testing/e2e.md` dispatch contract): read `.codebyplan/e2e.json`. If the file is absent or `frameworks` is empty, append `| Checkpoint E2E | n/a | no framework configured |` to the Step 5 QA Summary and continue to Step 6. Otherwise, for each entry where `enabled === true` AND `auto_run === true` whose `app` path intersects the aggregated `files_changed` union, spawn the matching `cbp-e2e-*` specialist via the Agent tool — one per eligible framework, in parallel. Inject `framework`, `app`, `platforms`, and `credential_vars` from `e2e.json` (authoritative; the agent does not auto-detect):
 
 ```yaml
 input:
@@ -105,18 +103,24 @@ Aggregate the files touched across all tasks (reusing Step 4's deduplicated tabl
 pages_affected: [aggregated]
 has_auth: [boolean, from .codebyplan/server.json + repo]
 dev_server_port: [from .codebyplan/server.json]
+framework: [from e2e.json — authoritative]
+app: [from e2e.json — e.g. apps/web]
+platforms: [from e2e.json — e.g. ["web"]]
+credential_vars: [from e2e.json — env var names only, never secrets]
 ```
 
-
+Hold each specialist's output keyed by framework (an `e2e_outputs[framework]` map) for this skill's aggregation — checkpoint-check has no MCP round, so this lives in-memory during the run (persist to `checkpoint.context` via `update_checkpoint` at Step 7 if a durable record is needed). `test_strategy` is intentionally omitted — the agent resolves it from `.codebyplan/e2e.json` and the DB tech-stack record.
+
+3. **Wait for all specialists to complete.** Each agent's output carries `whole_checkpoint_aggregated: true` confirming whole-checkpoint formatting.
 
-4. **On pass** (`
+4. **On pass** (every eligible framework `f`: `e2e_outputs[f].status === 'completed'` AND `e2e_outputs[f].test_results.failed === 0`): append a row to the Step 5 QA Summary table:
 ```
 | Checkpoint E2E | pass | aggregated |
 ```
 Continue to Step 6.
 
-5. **On fail** (`
-- **(a) Create fix-task in CHK-{NNN} (recommended)** — invoke MCP `create_task` with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `
+5. **On fail** (any framework `f`: `e2e_outputs[f].status === 'failed'` OR `e2e_outputs[f].test_results.failed > 0`): build a failure summary from `e2e_outputs[*].test_results.failures[]` aggregated and grouped by `category`. Surface via `AskUserQuestion`:
+- **(a) Create fix-task in CHK-{NNN} (recommended)** — invoke MCP `create_task` with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-task-planner` can verify failure premises. Per `infra-issue-absorption.md` "Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
 - **(b) Surface as warning only — proceed to checkpoint-end** — append `| Checkpoint E2E | warning | N failures (deferred) |` to Step 5 QA Summary; continue to Step 6.
 - **(c) Halt — review manually** — STOP and wait for the user.
 
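The eligibility rule in the config-driven dispatch step (`enabled === true` AND `auto_run === true` AND the framework's `app` path intersects `files_changed`) can be sketched as below. The function names are hypothetical, the values would really come from `.codebyplan/e2e.json`, and "intersects" is assumed here to mean that at least one changed file lives under the `app` directory.

```bash
#!/bin/bash
# Hypothetical helpers, not taken from the package.

# app_touched <app-path> <changed-file>...
# Succeeds when any changed file sits under the app directory.
app_touched() {
  local app="${1%/}" f
  shift
  for f in "$@"; do
    case "$f" in
      "$app"/*) return 0 ;;   # quoted prefix: apps/web does not match apps/website
    esac
  done
  return 1
}

# eligible <enabled> <auto_run> <app-path> <changed-file>...
# Succeeds when the framework entry should get a specialist spawned.
eligible() {
  local enabled="$1" auto_run="$2"
  shift 2
  [ "$enabled" = "true" ] && [ "$auto_run" = "true" ] && app_touched "$@"
}

if eligible true true apps/web apps/web/app/page.tsx packages/ui/button.tsx; then
  echo "spawn specialist for apps/web"
fi
```

Matching on the directory prefix with a trailing slash keeps sibling apps with a shared name prefix from triggering each other's runs.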
@@ -63,7 +63,7 @@ Only relevant when an idea touches a UI surface AND you SUSPECT an existing flow
 
 1. Surface the suspicion: name the area + the specific pages/screens, and why you think it is broken.
 2. Ask the user via AskUserQuestion to confirm running the probe (it needs a running dev server).
-3. On confirm, spawn `cbp-
+3. On confirm, spawn the config-matched `cbp-e2e-*` specialist (for a web app, `cbp-e2e-playwright`) per the `context/testing/e2e.md` dispatch contract, with `whole_checkpoint_mode: true`, `round_number: 0`, `files_changed: []`, the `pages_affected` you proposed, plus `repo_id` / `test_strategy` / `has_auth` / `dev_server_port`. Resolve `test_strategy` and `dev_server_port` per `reference/e2e-discovery-probe.md` (do not pass placeholder strings).
 4. Record the probe outcome (what actually failed vs. what you assumed) in `context.discoveries[]` so the plan targets real defects.
 
 Skip this step entirely for non-UI checkpoints or when no breakage is suspected.
@@ -131,7 +131,7 @@ This skill does **NOT** activate the checkpoint and does **NOT** claim a user/wo
 
 - **Reads**: MCP `get_current_task`, `get_checkpoints`, `get_tasks`
 - **Writes**: MCP `update_checkpoint` (ideas assessment, context, plan, research), `create_task`
-- **Spawns**: `cbp-research` (level 2+ only), `cbp-
+- **Spawns**: `cbp-research` (level 2+ only), config-matched `cbp-e2e-*` specialist (opt-in discovery probe, `whole_checkpoint_mode` — see `context/testing/e2e.md` dispatch contract)
 - **Triggered by**: `/cbp-checkpoint-create` (auto), or user directly
 - **Triggers**: `/cbp-checkpoint-start` (auto when claimed at create; directive when left open)
 - **Never**: activates the checkpoint or claims a user/worktree — that is `/cbp-checkpoint-start`
@@ -4,7 +4,7 @@ scope: org-shared
 
 # E2E Discovery Probe
 
-Loaded by `/cbp-checkpoint-plan` Step 4. The probe answers one question before you plan a fix: **is this area actually broken, and how?** It reuses `cbp-
+Loaded by `/cbp-checkpoint-plan` Step 4. The probe answers one question before you plan a fix: **is this area actually broken, and how?** It reuses the config-matched `cbp-e2e-*` specialist (the framework owners of e2e execution) in `whole_checkpoint_mode` rather than introducing a second smoke-test path. See `context/testing/e2e.md` for the dispatch contract that selects which specialist to spawn.
 
 ## When to offer the probe
 
@@ -20,8 +20,8 @@ Skip silently for backend-only / infra / `claude_only` checkpoints, or when you
 1. **State the suspicion** — name the area, the specific pages/screens, and why you think it is broken (a stale selector, a route that 404s, a recent refactor nearby).
 2. **Confirm with the user** via AskUserQuestion — the probe needs a running dev server, so it is opt-in. Options: run the probe / skip and plan from assumption / let me name different pages.
 3. **Resolve the dev-server port** from `.codebyplan/server.json` `port_allocations[]` (pick the entry whose `server_type` matches the app, e.g. `nextjs`). If nothing is running there, ask the user to start it or skip.
-4. **Resolve `test_strategy`** — call MCP `get_repos()`, find the entry where `id === repo_id`, and read the affected app's platform + e2e framework from its `tech_stack` record. If the record has no e2e data, pass `null` for the unknown fields — the agent resolves them itself
-5. **Spawn** `cbp-
+4. **Resolve `test_strategy`** — call MCP `get_repos()`, find the entry where `id === repo_id`, and read the affected app's platform + e2e framework from its `tech_stack` record. If the record has no e2e data, pass `null` for the unknown fields — the agent resolves them itself. Do NOT pass placeholder strings.
+5. **Spawn** the config-matched `cbp-e2e-*` specialist per `context/testing/e2e.md` dispatch contract (e.g. `cbp-e2e-playwright` for a web app) with the payload below.
 6. **Interpret** the result: compare what actually failed against what you assumed. Record the delta in `context.discoveries[]` so the plan targets real defects, not imagined ones.
 
 ## Spawn payload (whole_checkpoint_mode)
@@ -52,6 +52,6 @@ The agent returns `test_results` (passed / failed / skipped + per-failure `categ
 - `category: 'env' | 'auth' | 'access' | 'flake'` → not the feature's fault; note it but do not plan a code fix around it.
 - A clean pass → your breakage suspicion was wrong; plan the actual requested change without a "fix" step you did not need.
 
-## Why reuse the
+## Why reuse the specialist agents (not a new smoke probe)
 
-`cbp-
+The `cbp-e2e-*` specialists are the declared framework owners of e2e: they configure the framework if missing, run preflight, and classify failures. A bespoke in-skill smoke check would duplicate that ownership and drift. The probe is a thin, opt-in caller of the matching existing agent (selected per `context/testing/e2e.md` dispatch contract).
@@ -0,0 +1,254 @@
+---
+scope: org-shared
+name: cbp-e2e-setup
+description: Detect installed E2E frameworks, ask which to enable, record credentials source (gitignored env-file path + var names only, never secrets), and write/refresh .codebyplan/e2e.json. Interactive, idempotent.
+argument-hint: "[--force]"
+model: sonnet
+effort: xhigh
+allowed-tools: Read, Write, Edit, Bash(cat *), Bash(jq *), Bash(which *), Bash(test *), Bash(mkdir *), Bash(cp *), Bash(echo *), Bash(date *), Bash(mv *), Bash(git check-ignore *), AskUserQuestion, mcp__codebyplan__get_repos
+---
+
+# E2E Setup
+
+Configure `.codebyplan/e2e.json` so the E2E test pipeline knows which frameworks are
+enabled, where each app lives, and where to read credentials at test time.
+
+Invoke at any time. Already-configured frameworks are preserved unless `--force` is passed.
+Pass `--force` to re-ask all questions including credentials blocks.
+
+## Arguments
+
+Inspect `$ARGUMENTS` for `--force`. If present, set `force_mode = true`.
+Absent: use idempotent mode — preserve existing credentials blocks, skip re-asking
+already-configured frameworks.
+
+## Step 1 — Detect installed frameworks
+
+Run both detection signals and merge:
+
+**Signal A — DB tech_stack** via `mcp__codebyplan__get_repos` (match `repo_id` from
+`.codebyplan/repo.json`). Scan `tech_stack[]` for: `playwright`, `maestro`, `xcuitest`,
+`webdriverio`, `@wdio/cli`, `@vscode/test-cli`.
+
+**Signal B — Filesystem probes:**
+
+```bash
+test -f playwright.config.ts || test -f playwright.config.js # → playwright
+test -f maestro/config.yaml || test -d maestro # → maestro
+test -d ios && find ios -name '*UITests' -maxdepth 2 | grep -q . # → xcuitest
+test -f wdio.conf.ts || test -f wdio.conf.js # → tauri (wdio)
+test -f .vscode-test.mjs || test -d apps/vscode # → vscode
+```
+
+**Signal C — Read existing `.codebyplan/e2e.json`** for idempotent merge:
+
+```bash
+cat .codebyplan/e2e.json 2>/dev/null || echo '{}'
+```
+
+A framework detected by A or B is "detected". A framework already in e2e.json is
+"configured". A framework with `enabled: false` in e2e.json is "configured-disabled".
+
+## Step 2 — Ask which to enable
+
+Display a summary table of detection results:
+
+```
+Framework | Detected | Configured | Status
+----------- | -------- | ---------- | ------
+playwright | yes | yes | enabled
+maestro | no | no | absent
+xcuitest | no | no | absent
+tauri | no | no | absent
+vscode | no | no | absent
+```
+
+AskUserQuestion (multi-select):
+
+```
+Which E2E frameworks should be enabled?
+Detected frameworks are pre-checked. Undetected ones can still be enabled.
+
+Select all that apply:
+A) playwright (web — Next.js)
+B) maestro (mobile — Expo/React Native)
+C) xcuitest (iOS native — Apple Watch, HealthKit, system dialogs)
+D) tauri (desktop — WebDriverIO + tauri-driver)
+E) vscode (VS Code extension — @vscode/test-cli)
+```
+
+In `--force` mode: re-ask even for frameworks already enabled.
+Otherwise: frameworks already `enabled: true` in e2e.json are kept without asking.
+
+## Step 3 — Mobile platforms (conditional)
+
+If maestro or xcuitest is in the enabled set and the framework is not yet configured
+(or `--force`), AskUserQuestion:
+
+```
+Mobile platform target for <framework>:
+A) Android only
+B) iOS only
+C) Both Android and iOS
+```
+
+Record the answer as `platforms: ["android"]`, `["ios"]`, or `["android", "ios"]` on the
+framework config block.
+
+## Step 4 — Credentials source
+
+For each enabled framework that touches auth (playwright, maestro, xcuitest):
+
+**Idempotency gate** (skip if `--force` is absent AND
+`credentials.frameworks[framework].email_var` AND
+`credentials.frameworks[framework].password_var` are both set to non-empty strings —
+an empty `{}` entry counts as unconfigured, so prompt for it):
+
+```
+Credentials block for <framework> already configured — use --force to reset.
+env_file: <path>
+email_var: <var>
+password_var: <var>
+```
+
+Print the preserved values and continue to Step 5.
+
+**Otherwise**, AskUserQuestion (one question per framework, step-by-step):
+
+```
+Credentials source for <framework>
+
+1. Gitignored env-file path that holds the secrets
+(default: .codebyplan/e2e.env — a dedicated file, separate from app .env.local)
+Path: ___
+
+2. Email env var name (default: E2E_TEST_EMAIL for playwright, TEST_EMAIL for others)
+Var name: ___
+
+3. Password env var name (default: E2E_TEST_PASSWORD for playwright, TEST_PASSWORD for others)
+Var name: ___
+
+4. (Optional) Provision script path — the skill only records the path, never creates it
+(convention: scripts/provision-e2e-user.ts — leave blank to skip)
+Path: ___
+```
+
+After collecting the env-file path, verify it is gitignored and capture the result —
+this boolean is persisted as the required `credentials.gitignored` field:
+
+```bash
+git check-ignore -q <env_file_path> && GITIGNORED=true || GITIGNORED=false
+```
+
+If NOT ignored (`GITIGNORED=false`): warn and offer to append it:
+
+```
+Warning: <path> is not in .gitignore.
+This file will contain live credentials — committing it is a credential leak.
+Append <path> to .gitignore? (Y/n)
+```
+
+On yes: append the path to `.gitignore` and set `GITIGNORED=true`. On no: leave
+`GITIGNORED=false` and record in the output but do not block.
+
+Carry `gitignored: $GITIGNORED` into the credentials block assembled for Step 5 — the
+`E2eCredentials.gitignored` field is required, so it must always be populated.
+
+Never write secret values into e2e.json — only the path, var names, and provision_script
+reference are persisted.
+
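The Step 4 idempotency gate above reduces to a three-input decision: the `--force` flag plus the two stored variable names. A minimal sketch, assuming the caller has already extracted `email_var` and `password_var` from the existing credentials entry (empty strings when the entry is `{}` or missing); `should_prompt_credentials` is a hypothetical name, not something the package defines.

```bash
#!/bin/bash
# Hypothetical helper, not part of the codebyplan package.
# should_prompt_credentials <force: true|false> <email_var> <password_var>
# Succeeds (exit 0) when Step 4 should ask the credentials questions.
should_prompt_credentials() {
  local force="$1" email_var="$2" password_var="$3"
  # --force always re-asks, even for a fully configured framework.
  [ "$force" = "true" ] && return 0
  # Both names present and non-empty: keep the existing block, skip the prompt.
  if [ -n "$email_var" ] && [ -n "$password_var" ]; then
    return 1
  fi
  # Empty or partial entry counts as unconfigured.
  return 0
}

if should_prompt_credentials false "E2E_TEST_EMAIL" "E2E_TEST_PASSWORD"; then
  echo "ask credentials questions"
else
  echo "credentials already configured, use --force to reset"
fi
```

Treating a partially filled entry as unconfigured follows the rule that both names must be non-empty for the gate to skip.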
|
|
160
|
+
## Step 5 — Write .codebyplan/e2e.json
|
|
161
|
+
|
|
162
|
+
Build the updated payload conforming to the `E2eConfig` schema
|
|
163
|
+
(`packages/codebyplan-package/src/lib/types.ts`).
|
|
164
|
+
|
|
165
|
+
For playwright, derive `base_url` from `.codebyplan/server.json`:
|
|
166
|
+
|
|
167
|
+
```bash
|
|
168
|
+
jq -r '.port_allocations[] | select(.label == "Web Dev") | "http://localhost:\(.port)"' \
|
|
169
|
+
.codebyplan/server.json | head -1
|
|
170
|
+
```
|
|
171
|
+
|
|
172
|
+
Match by the `Web Dev` label rather than `server_type == "nextjs"` — a repo can have
|
|
173
|
+
several `nextjs` allocations (e.g. one per sibling worktree), so picking the first
|
|
174
|
+
`nextjs` entry by array position is not stable; with the label filter, `head -1` only guards against a duplicated label. Confirm the derived URL with the user before writing
|
|
175
|
+
(`Derived base_url: <url> — correct? (Y/n)`) and allow an override. Store as
|
|
176
|
+
`frameworks.playwright.base_url`.
|
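A stricter variant of the derivation could refuse to guess when the label is missing or duplicated, instead of silently taking the first line. The sketch below is an illustration only: `derive_base_url` is a hypothetical name, and the `port_allocations[].label` / `.port` layout is taken from the snippet above rather than from a verified schema.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints the base URL only when exactly one
# allocation carries the "Web Dev" label; otherwise jq exits non-zero
# and the caller should ask the user for the URL.
derive_base_url() {
  local server_json="$1"
  jq -er '
    [.port_allocations[] | select(.label == "Web Dev") | .port] as $ports
    | if ($ports | length) == 1
      then "http://localhost:\($ports[0])"
      else error("expected exactly one Web Dev allocation, found \($ports | length)")
      end
  ' "$server_json"
}
```

A caller might run `BASE_URL=$(derive_base_url .codebyplan/server.json) || read -r -p "base_url: " BASE_URL`, so the confirmation prompt doubles as the fallback when derivation fails.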
|
177
|
+
|
|
178
|
+
Idempotency rule: the jq object-merge (`. + {...}`) REPLACES top-level keys, so build
|
|
179
|
+
`$CREDENTIALS_JSON` as a deep-merge of the existing block and the newly-collected data
|
|
180
|
+
BEFORE the write — otherwise a second run for an additional framework would clobber the
|
|
181
|
+
first framework's credentials. Assemble it in the shell:
|
|
182
|
+
|
|
183
|
+
```bash
|
|
184
|
+
EXISTING_CREDS=$(jq -c '.credentials // {}' .codebyplan/e2e.json)
|
|
185
|
+
CREDENTIALS_JSON=$(echo "$EXISTING_CREDS" | jq -c --argjson new "$NEW_CREDS_JSON" '. * $new')
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
The `*` operator deep-merges, so frameworks skipped by the idempotency gate keep their
|
|
189
|
+
prior entry. Then write atomically using jq temp+mv to avoid partial writes (only the two
|
|
190
|
+
schema fields `frameworks` and `credentials` are written — `E2eConfig` defines no other):
|
|
191
|
+
|
|
192
|
+
```bash
|
|
193
|
+
jq --argjson frameworks "$FRAMEWORKS_JSON" \
|
|
194
|
+
--argjson credentials "$CREDENTIALS_JSON" \
|
|
195
|
+
'. + {frameworks: $frameworks, credentials: $credentials}' \
|
|
196
|
+
.codebyplan/e2e.json > .codebyplan/e2e.json.tmp \
|
|
197
|
+
&& mv .codebyplan/e2e.json.tmp .codebyplan/e2e.json
|
|
198
|
+
```
|
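The difference between the two jq operators is easy to see on a tiny payload. In the sketch below, the function names and the per-framework layout of the credentials object are assumptions made for illustration; the point is only that `+` swaps out the whole `credentials` value while `*` merges into it recursively.

```bash
#!/usr/bin/env bash
# Both helpers take the current e2e.json content ($1) and the newly
# collected credentials ($2) as JSON strings. Names are hypothetical.

# Shallow: `+` replaces the whole top-level "credentials" value.
shallow_write() {
  printf '%s' "$1" | jq -c --argjson new "$2" '. + {credentials: $new}'
}

# Deep: merge the existing block with the new data first (`*` recurses
# into nested objects), then write the merged result.
deep_write() {
  printf '%s' "$1" | jq -c --argjson new "$2" \
    '. + {credentials: ((.credentials // {}) * $new)}'
}

# Illustrative payloads; the key names inside are assumed.
existing='{"credentials":{"playwright":{"env_file":".codebyplan/e2e.env"}}}'
new='{"maestro":{"env_file":".codebyplan/e2e.env"}}'
```

With these payloads, `shallow_write "$existing" "$new"` returns a `credentials` object that contains only `maestro`, while `deep_write "$existing" "$new"` keeps `playwright` alongside it.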
|
199
|
+
|
|
200
|
+
Framework config shape per framework:
|
|
201
|
+
|
|
202
|
+
| Field | playwright | maestro / xcuitest | tauri / vscode |
|
|
203
|
+
| ------------ | ------------------- | -------------------- | -------------- |
|
|
204
|
+
| `enabled` | true | true | true |
|
|
205
|
+
| `app` | `apps/web` (nextjs) | `apps/<expo-app>` | `apps/desktop` / `apps/vscode` |
|
|
206
|
+
| `config_path`| `playwright.config.ts` | `maestro/config.yaml` | `wdio.conf.ts` / `.vscode-test.mjs` |
|
|
207
|
+
| `auto_run` | false | false | false |
|
|
208
|
+
| `test_dir` | `apps/web/e2e` | — | — |
|
|
209
|
+
| `base_url` | from server.json | — | — |
|
|
210
|
+
| `platforms` | — | from Step 3 | — |
|
|
211
|
+
|
|
212
|
+
Disabled frameworks get `enabled: false`; all other fields are preserved from their
|
|
213
|
+
prior configured state.
|
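Disabling a framework while preserving its other fields amounts to a path assignment instead of rebuilding the entry. The following is a sketch, assuming `frameworks` is an object keyed by framework name (as `frameworks.playwright.base_url` suggests); `disable_framework` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: flips only the `enabled` flag of one framework
# and leaves every other field of that entry untouched.
disable_framework() {
  local file="$1" name="$2"
  # Same temp + mv pattern as the main write, so a failed jq run
  # never truncates the real file.
  jq --arg name "$name" '.frameworks[$name].enabled = false' "$file" \
    > "$file.tmp" \
    && mv "$file.tmp" "$file"
}
```

For instance, `disable_framework .codebyplan/e2e.json playwright` would keep `app`, `test_dir`, and `base_url` in place, so re-enabling later needs no fresh discovery.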
|
214
|
+
|
|
215
|
+
## Step 6 — Verify and report
|
|
216
|
+
|
|
217
|
+
Re-read `.codebyplan/e2e.json` and emit a per-framework summary:
|
|
218
|
+
|
|
219
|
+
```
|
|
220
|
+
E2E Setup — Complete
|
|
221
|
+
|
|
222
|
+
Framework | Status | App | base_url / platforms | Creds source
|
|
223
|
+
----------- | -------- | ---------- | --------------------- | ------------
|
|
224
|
+
playwright | enabled | apps/web | http://localhost:3010 | .codebyplan/e2e.env (E2E_TEST_EMAIL)
|
|
225
|
+
maestro | disabled | — | — | —
|
|
226
|
+
...
|
|
227
|
+
|
|
228
|
+
e2e.json written to .codebyplan/e2e.json
|
|
229
|
+
|
|
230
|
+
Next steps per framework — see reference docs:
|
|
231
|
+
playwright → reference/playwright.md
|
|
232
|
+
maestro → reference/maestro.md
|
|
233
|
+
xcuitest → reference/xcuitest.md
|
|
234
|
+
tauri → reference/tauri.md
|
|
235
|
+
vscode → reference/vscode.md
|
|
236
|
+
```
|
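The re-read step can be sketched with one jq filter that walks the `frameworks` object. This is a simplified illustration: `summarize_frameworks` is a hypothetical name, it prints tab-separated rows with `-` for missing values, and it omits the platforms and credentials columns of the report above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: one tab-separated row per framework with
# name, status, app, and base_url ("-" when a field is absent).
summarize_frameworks() {
  local file="$1"
  jq -r '
    .frameworks // {}
    | to_entries[]
    | [ .key,
        (if .value.enabled then "enabled" else "disabled" end),
        (.value.app // "-"),
        (.value.base_url // "-")
      ]
    | @tsv
  ' "$file"
}
```

Piping the result through `column -t -s "$(printf '\t')"` would give an aligned table, assuming `column` is available on the machine.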
|
237
|
+
|
|
238
|
+
## Key Rules
|
|
239
|
+
|
|
240
|
+
- Never write secret values into e2e.json — only env-file path + var names
|
|
241
|
+
- Gitignore guard runs before any credentials are persisted
|
|
242
|
+
- Preserved credentials blocks are printed verbatim so the user can verify them
|
|
243
|
+
- Atomic write (tmp + mv) — never leaves e2e.json in a partial state
|
|
244
|
+
- `auto_run: false` by default — the user opts in explicitly
|
|
245
|
+
|
|
246
|
+
## Additional resources
|
|
247
|
+
|
|
248
|
+
- Playwright install + auth + CI: [reference/playwright.md](reference/playwright.md)
|
|
249
|
+
- Maestro install + flows + CI: [reference/maestro.md](reference/maestro.md)
|
|
250
|
+
- Tauri (WebDriverIO): [reference/tauri.md](reference/tauri.md)
|
|
251
|
+
- VS Code extension testing: [reference/vscode.md](reference/vscode.md)
|
|
252
|
+
- XCUITest (iOS native): [reference/xcuitest.md](reference/xcuitest.md)
|
|
253
|
+
- E2E schema types: `packages/codebyplan-package/src/lib/types.ts` (E2eConfig)
|
|
254
|
+
- Shared E2E conventions: `.claude/context/testing/e2e.md`
|