codebyplan 1.13.43 → 1.13.45

Files changed (42)
  1. package/dist/cli.js +5079 -1556
  2. package/package.json +1 -1
  3. package/templates/agents/cbp-task-check.md +1 -3
  4. package/templates/agents/cbp-task-planner.md +8 -6
  5. package/templates/github-workflows/publish.yml +93 -21
  6. package/templates/hooks/cbp-auto-test-hooks.sh +1 -0
  7. package/templates/hooks/cbp-e2e-spec-patterns.sh +100 -0
  8. package/templates/hooks/cbp-lint-format-on-edit.sh +1 -0
  9. package/templates/hooks/cbp-maestro-yaml-validate.sh +1 -0
  10. package/templates/hooks/cbp-pre-commit-quality-gate.sh +1 -0
  11. package/templates/hooks/cbp-statusline.sh +0 -0
  12. package/templates/hooks/cbp-subagent-statusline.sh +0 -0
  13. package/templates/hooks/cbp-test-coverage-gate.sh +1 -0
  14. package/templates/hooks/cbp-test-hooks.sh +1 -0
  15. package/templates/hooks/hooks.json +4 -0
  16. package/templates/hooks/verify-parity.sh +20 -0
  17. package/templates/rules/parallel-waves.md +8 -3
  18. package/templates/rules/scope-vocabulary.md +4 -3
  19. package/templates/settings.project.base.json +22 -0
  20. package/templates/skills/cbp-build-cc-claude-file/SKILL.md +11 -1
  21. package/templates/skills/cbp-build-cc-claude-file/scripts/validate-claude-file.sh +72 -0
  22. package/templates/skills/cbp-build-cc-mode/SKILL.md +12 -16
  23. package/templates/skills/cbp-build-cc-rule/SKILL.md +11 -1
  24. package/templates/skills/cbp-build-cc-rule/scripts/validate-rule.sh +69 -0
  25. package/templates/skills/cbp-build-cc-settings/SKILL.md +2 -2
  26. package/templates/skills/cbp-build-cc-settings/scripts/validate-settings.sh +67 -0
  27. package/templates/skills/cbp-checkpoint-create/SKILL.md +12 -4
  28. package/templates/skills/cbp-checkpoint-end/SKILL.md +19 -11
  29. package/templates/skills/cbp-git-commit/SKILL.md +10 -12
  30. package/templates/skills/cbp-git-worktree-create/SKILL.md +7 -48
  31. package/templates/skills/cbp-git-worktree-remove/SKILL.md +23 -40
  32. package/templates/skills/cbp-map-architecture/SKILL.md +1 -0
  33. package/templates/skills/cbp-merge-main/SKILL.md +21 -26
  34. package/templates/skills/cbp-refresh-arch-map/SKILL.md +1 -0
  35. package/templates/skills/cbp-round-check/SKILL.md +37 -36
  36. package/templates/skills/cbp-round-execute/SKILL.md +9 -3
  37. package/templates/skills/cbp-session-end/SKILL.md +27 -47
  38. package/templates/skills/cbp-session-start/SKILL.md +35 -51
  39. package/templates/skills/cbp-standalone-task-start/SKILL.md +10 -19
  40. package/templates/skills/cbp-supabase-migrate/SKILL.md +24 -27
  41. package/templates/skills/cbp-task-start/SKILL.md +9 -21
  42. package/templates/skills/cbp-task-testing/SKILL.md +18 -10
package/templates/skills/cbp-task-start/SKILL.md +9 -21
@@ -93,9 +93,7 @@ Compute `TARGET`:
  | Task type | TARGET |
  |-----------|--------|
  | Checkpoint-bound, `checkpoint.branch_name` set | `checkpoint.branch_name` (e.g. `feat/CHK-095-pricing-page`) |
- | Checkpoint-bound, `branch_name` null (legacy) | `feat/CHK-{NNN}-{kebab-slug-from-checkpoint-title}` |
-
- Slug rules: lowercase, words joined by `-`, drop punctuation, truncate to 40 chars.
+ | Checkpoint-bound, `branch_name` null (legacy) | `feat/CHK-{NNN}-{slug}` where slug is computed via `codebyplan slug "{checkpoint title}"` |
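For reference, the removed slug rules (lowercase, words joined by `-`, punctuation dropped, truncated to 40 characters) can be approximated in plain shell. This is a minimal sketch only: `slugify` is a hypothetical helper, the trailing-hyphen trim after truncation is an added assumption, and the diff does not confirm that `codebyplan slug` follows exactly these rules.

```bash
# Hypothetical helper approximating the removed slug rules.
# It is not the implementation behind `codebyplan slug`.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9 -]//g' -e 's/[ -][ -]*/-/g' -e 's/^-//' -e 's/-$//' \
    | cut -c1-40 \
    | sed -e 's/-$//'   # assumption: trim a hyphen left dangling by truncation
}

# Build the legacy-style branch name from a checkpoint number and title.
TARGET="feat/CHK-095-$(slugify 'Pricing Page: v2!')"
echo "$TARGET"
```

The pipeline lowercases first, drops every character that is not a letter, digit, space, or hyphen, and then collapses runs of spaces and hyphens into a single `-` before truncating.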
 
  #### 3.2 — Compare current branch
 
@@ -103,29 +101,19 @@ Run `git branch --show-current`. If `current == TARGET` → continue to Step 3b.
 
  #### 3.3 — Switch automatically (no AskUserQuestion, no blocking)
 
- Run the switch directly. Three sub-cases, in order:
+ Run the branch checkout via the deterministic CLI:
 
  ```bash
- # (a) target exists locally
- if git rev-parse --verify "$TARGET" >/dev/null 2>&1; then
- git checkout "$TARGET"
-
- # (b) target exists on origin (track it)
- elif git rev-parse --verify "origin/$TARGET" >/dev/null 2>&1; then
- git checkout -t "origin/$TARGET"
-
- # (c) target doesn't exist — create from production branch (main)
- else
- # First make sure production is up to date
- git fetch origin "$PRODUCTION" 2>/dev/null || true
- git checkout -b "$TARGET" "origin/$PRODUCTION" 2>/dev/null \
- || git checkout -b "$TARGET" "$PRODUCTION"
- fi
+ RESULT=$(codebyplan branch checkout --target "$TARGET" --production "$PRODUCTION")
+ # Parse JSON: { action: "checked_out_local" | "checked_out_tracking" | "created_from_production" | "error", branch, previous?, error? }
+ ACTION=$(echo "$RESULT" | jq -r '.action')
  ```
 
- **Carrying uncommitted work** — `git checkout` carries clean (non-conflicting) working-tree changes to the new branch automatically. This is intended: changes made on `main` while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
+ **Interpreting the result:**
+ - `action === "error"`: surface the raw `error` field verbatim, stop, do NOT attempt recovery. The user resolves git state and re-invokes. This is the only case where `/cbp-task-start` halts on branch state.
+ - Otherwise (all success actions): branch is now `TARGET`. Continue to Step 3.4.
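The interpretation rules above can be sketched as a small shell function. This is an illustrative sketch, not code from the package: `handle_checkout_action` is a hypothetical name, and treating an undocumented `action` value as a failure is an assumption the diff does not state.

```bash
# Hypothetical helper: maps the `action` field to a continue/stop decision.
# Arguments: $1 = action string, $2 = error message (may be empty).
handle_checkout_action() {
  action="$1"
  error_msg="$2"
  case "$action" in
    checked_out_local|checked_out_tracking|created_from_production)
      # All success actions: the branch is now TARGET.
      echo "continue"
      return 0
      ;;
    error)
      # Surface the raw error verbatim and stop; no recovery attempt.
      printf '%s\n' "$error_msg" >&2
      return 1
      ;;
    *)
      # Assumption: any undocumented action is treated as a failure.
      printf 'unexpected action: %s\n' "$action" >&2
      return 1
      ;;
  esac
}
```

With the `RESULT` and `ACTION` variables from the block above, a caller might run `handle_checkout_action "$ACTION" "$(echo "$RESULT" | jq -r '.error // empty')" || exit 1`.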
 
- **If `git checkout` exits non-zero** (typically "would clobber" because a tracked file has unstaged changes that conflict with target's version): surface the raw git error verbatim, stop, do NOT attempt recovery. The user resolves and re-invokes. This is the only case where `/cbp-task-start` halts on branch state.
+ **Carrying uncommitted work** the CLI respects git's automatic carrying of clean (non-conflicting) working-tree changes. Changes made while preparing the task move with the user to the new feat branch. No `git stash`, ever (per `git-safety.md`). No `git add`, ever (per `git-workflow.md`).
 
  **Note — Supabase preview branch**: no Supabase branch is created at this point. Creation is lazy — it happens on the first DB change when `/cbp-supabase-migrate` runs on the feat branch, which provisions a Supabase branch named identically to the git branch. See `cbp-supabase-migrate` Step 2.3 for the creation protocol.
 
package/templates/skills/cbp-task-testing/SKILL.md +18 -10
@@ -9,7 +9,7 @@ effort: xhigh
 
  # Task Testing Command
 
- Comprehensive task-level testing — the **cross-round double-check** run once after all rounds complete. Per-round QA (per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, `pnpm audit`) is owned by each round's `testing-qa-agent`; this skill does NOT re-run it. Instead it tests the **entire delivered feature holistically** across the full task diff — catching cross-package and cross-round problems no single round can see. Runs inline — no sub-agent.
+ Comprehensive task-level testing — runs all automated tests and walks the user through manual testing one-by-one. Distinct from round-level testing (`testing-qa-agent`): this tests the **entire delivered feature holistically** after all rounds are complete. Runs inline — no sub-agent.
 
  ## When Used
 
@@ -19,13 +19,13 @@ Comprehensive task-level testing — the **cross-round double-check** run once a
 
  ## Scope vs Round-Level Validation
 
- Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5 and **owns per-round QA**: per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret full-diff grep, and `pnpm audit`. This skill does NOT repeat them. It adds only the cross-round layer invisible within a single round: workspace-wide lint, workspace tsc, and the full test suite (which catch cross-package breakage), plus the cross-round code review (Step 6.5), the autonomous sim screenshot loop (Step 6.x), and the user manual walkthrough (Step 8).
+ Per-wave `testing-qa-agent` runs inside `/cbp-round-execute` Step 5. This skill adds the cross-cutting layer that is only visible across the full task diff: whole-repo lint, whole-repo typecheck, full test suite, `pnpm audit` (via `codebyplan check --scope task --json`), and full-diff security scan each run once here, not per-round.
 
  ## Instructions
 
  ### Step 1: Parse `$ARGUMENTS`
 
- Parse the argument using the canonical chk-task-round notation (see `cbp-round-start` Step 0 "CHK / TASK / ROUND Identifier Notation Vocabulary"):
+ Parse the argument using the canonical chk-task-round notation (see `.claude/rules/notation-consistency.md`):
 
  | Shape | Regex | Resolves to |
  |-------|-------|-------------|
@@ -104,14 +104,24 @@ Capture stdout and stderr for each check.
 
  **Hard-fail tests** (block completion):
 
+ Run the unified check matrix:
+
+ ```bash
+ codebyplan check --scope task --json
+ ```
+
+ Capture the JSON result. The runner is **whole-repo + baseline**: it runs `turbo run lint|typecheck|test` across every package and diffs each per-package result against the committed `.check-baseline.json`, so only NEW per-package failures fail a check. Five checks run for `--scope task`: `gate6` (sibling-identity parity — ALWAYS hard-fail, never baselined), `lint`, `typecheck`, `tests`, and `audit` (`audit.new_failures` lists new GHSA advisory ids not in the allowlist). A baselined check's `status` is `pass` when its `new_failures` array is empty even if the underlying command exited non-zero. If `any_failed === true` (or `hard_fail_checks` is non-empty), this is a hard fail — surface each failing result's `stdout`/`stderr`/`new_failures` and stop.
+
+ For each result entry, record: `category` (from `result.check`), `status` (from `result.status`), `details`, `stdout` (from `result.stdout`), `stderr` (from `result.stderr`), and `new_failures` (from `result.new_failures` — the newly-failing packages / new GHSA ids; the field is omitted/`undefined` for `gate6`, not `null`).
+
+ Additional hard-fail checks (not part of the runner):
+
  | Category | Command | Condition |
  | ----------------------- | ------------------------------- | -------------------------------- |
- | Full-repo lint | `pnpm -w lint` | Always |
- | Full-repo types | `pnpm exec tsc --noEmit` | Source files changed |
- | Full-repo unit tests | `pnpm test --run` | Source files in aggregated_files |
  | Per-package E2E | `pnpm --filter <pkg> e2e:test` | UI files in aggregated_files |
+ | Full-diff security scan | inline grep or `security-agent` | Always |
 
- These are the workspace-wide / cross-package checks only — per-app build/lint/types, the `console.log`/debug scan, the OWASP/secret grep, and `pnpm audit` already ran per-round inside `testing-qa-agent` and are NOT repeated here. Per-file lint + format are enforced by `lint-format-on-edit.sh` per edit. This step catches cross-package issues invisible to per-wave checks.
+ Per-file lint + format are enforced by `lint-format-on-edit.sh` hook per edit. This step catches cross-package issues invisible to per-wave checks.
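The hard-fail decision on the runner's JSON could be sketched with `jq` as follows. This is a hedged illustration under stated assumptions: `evaluate_check_result` is a hypothetical helper, the top-level `results` array name is assumed (the diff names the per-entry fields but not their container), and `new_failures` is assumed to hold plain strings.

```bash
# Hypothetical helper: reads the runner's JSON on stdin and returns
# non-zero on a hard fail (any_failed true, or hard_fail_checks non-empty).
evaluate_check_result() {
  json=$(cat)
  if printf '%s' "$json" \
    | jq -e '(.any_failed == true) or ((.hard_fail_checks // []) | length > 0)' >/dev/null; then
    # Assumption: entries live in a top-level `results` array.
    # `new_failures` is omitted for gate6, so default it to an empty list.
    printf '%s' "$json" | jq -r '
      (.results // [])[]
      | select(.status == "fail")
      | "\(.check): new_failures=\((.new_failures // []) | join(","))"' >&2
    return 1
  fi
  return 0
}
```

A caller might pipe the runner output straight in, for example `codebyplan check --scope task --json | evaluate_check_result || exit 1`.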
 
  **Soft tests** (report, don't block):
 
@@ -120,8 +130,6 @@ These are the workspace-wide / cross-package checks only — per-app build/lint/
  | Visual | Screenshot compare via `e2e:visual-check` | UI work + dev server running |
  | API Health | `curl` health endpoint | API routes changed |
 
- For each test, record: `category, status (pass|fail|skipped), details, stdout, stderr`.
-
  #### Step 6.x: Autonomous Sim Screenshot Validation (mobile / on-device)
 
  For mobile rounds (Maestro / XCUITest / Tauri-mobile) where unit tests passed but the round touched component-mount code paths (custom hooks, prop signatures, conditional renders, navigation tabs), unit-test green is NOT sufficient evidence that the screen mounts at runtime. Use the autonomous sim screenshot loop to catch runtime crashes invisible to mocked unit tests.
@@ -168,7 +176,7 @@ For each finding, record: `{category, file, description, severity: 'low'|'medium
 
  Findings with severity `medium` or `high` feed the Step 9 problem classification. `low` findings are recorded in `task_testing_output` for the record but do not block.
 
- If any finding points to a need that exceeds task scope (e.g. a utility worth extracting for the wider codebase, a convention the repo should adopt globally), route per `cbp-task-create` Step 3.5 "Immediate Issue Capture Contract — How to Capture" — default to a NEW TASK in the current checkpoint, not a standalone task. Standalone routing applies only when the finding is genuinely off-axis from every active checkpoint AND the user has confirmed standalone routing.
+ If any finding points to a need that exceeds task scope (e.g. a utility worth extracting for the wider codebase, a convention the repo should adopt globally), route per `immediate-issue-capture.md` "How to Capture" — default to a NEW TASK in the current checkpoint, not a standalone task. Standalone routing applies only when the finding is genuinely off-axis from every active checkpoint AND the user has confirmed standalone routing.
 
  ### Step 7: Separate Claude-Testable vs User-Testable