codebyplan 1.13.52 → 1.13.54
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/cli.js +3226 -897
- package/package.json +1 -1
- package/templates/agents/cbp-database-agent.md +1 -1
- package/templates/agents/cbp-e2e-maestro.md +1 -1
- package/templates/agents/cbp-e2e-playwright.md +24 -16
- package/templates/agents/cbp-e2e-tauri.md +1 -1
- package/templates/agents/cbp-e2e-vscode.md +1 -1
- package/templates/agents/cbp-e2e-xcuitest.md +1 -1
- package/templates/agents/cbp-improve-claude.md +2 -2
- package/templates/agents/{cbp-round-executor.md → cbp-round-builder.md} +23 -23
- package/templates/agents/{cbp-task-planner.md → cbp-round-planner.md} +26 -25
- package/templates/agents/cbp-security-agent.md +10 -2
- package/templates/agents/cbp-stripe-agent.md +2 -2
- package/templates/agents/cbp-testing-qa-agent.md +34 -20
- package/templates/agents/cbp-verify-reviewer.md +236 -0
- package/templates/context/architecture-map.md +4 -4
- package/templates/context/mcp-docs.md +57 -11
- package/templates/context/testing/e2e.md +9 -9
- package/templates/github-workflows/ci.yml +104 -0
- package/templates/github-workflows/publish.yml +8 -27
- package/templates/github-workflows/release-desktop.yml +215 -0
- package/templates/hooks/cbp-skill-context-guard.sh +1 -1
- package/templates/hooks/cbp-test-hooks.sh +9 -9
- package/templates/hooks/validate-structure-lengths.sh +1 -1
- package/templates/hooks/validate-structure-patterns.sh +1 -1
- package/templates/rules/README.md +1 -2
- package/templates/rules/agent-claim-verification.md +1 -1
- package/templates/rules/context-file-loading.md +10 -10
- package/templates/rules/development-workflow.md +73 -0
- package/templates/rules/e2e-mandatory.md +8 -8
- package/templates/rules/execution-proof.md +70 -0
- package/templates/rules/model-invocation-convention.md +2 -2
- package/templates/rules/parallel-waves.md +11 -11
- package/templates/rules/spawn-failure-is-gate-failure.md +76 -0
- package/templates/rules/task-routing-recommendation.md +1 -1
- package/templates/rules/todo-backend.md +3 -3
- package/templates/rules/two-tier-ci.md +63 -0
- package/templates/settings.project.base.json +15 -11
- package/templates/skills/cbp-build-cc-mode/SKILL.md +1 -1
- package/templates/skills/cbp-build-cc-settings/reference/cbp-permission-policy.md +7 -7
- package/templates/skills/cbp-build-cc-skill/SKILL.md +1 -1
- package/templates/skills/cbp-build-cc-skill/reference/cbp-quality.md +2 -2
- package/templates/skills/cbp-build-cc-skill/reference/fork-eligibility.md +11 -14
- package/templates/skills/cbp-checkpoint-check/SKILL.md +11 -3
- package/templates/skills/cbp-checkpoint-create/SKILL.md +16 -1
- package/templates/skills/cbp-checkpoint-end/SKILL.md +5 -1
- package/templates/skills/cbp-checkpoint-update/SKILL.md +3 -3
- package/templates/skills/cbp-clear-continue/SKILL.md +2 -2
- package/templates/skills/cbp-clear-prep/SKILL.md +3 -3
- package/templates/skills/{cbp-task-complete → cbp-finalize}/SKILL.md +25 -29
- package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/checkpoint-done-branching.md +1 -1
- package/templates/skills/{cbp-task-complete → cbp-finalize}/reference/next-step-heuristic.md +1 -1
- package/templates/skills/cbp-frontend-design/SKILL.md +1 -1
- package/templates/skills/cbp-frontend-ui/SKILL.md +7 -7
- package/templates/skills/cbp-git-commit/SKILL.md +3 -3
- package/templates/skills/cbp-merge-main/SKILL.md +4 -4
- package/templates/skills/{cbp-round-execute → cbp-round-build}/SKILL.md +93 -75
- package/templates/skills/cbp-round-complete/SKILL.md +15 -14
- package/templates/skills/cbp-round-plan/SKILL.md +344 -0
- package/templates/skills/cbp-session-end/SKILL.md +1 -1
- package/templates/skills/cbp-setup-cd/SKILL.md +291 -0
- package/templates/skills/cbp-setup-cd/reference/github-actions-cd.md +231 -0
- package/templates/skills/cbp-setup-ci/SKILL.md +175 -0
- package/templates/skills/cbp-setup-ci/reference/github-actions.md +100 -0
- package/templates/skills/cbp-ship/SKILL.md +21 -0
- package/templates/skills/cbp-ship-main/SKILL.md +3 -2
- package/templates/skills/cbp-standalone-task-check/SKILL.md +10 -9
- package/templates/skills/cbp-standalone-task-complete/SKILL.md +12 -13
- package/templates/skills/cbp-standalone-task-create/SKILL.md +16 -9
- package/templates/skills/cbp-standalone-task-start/SKILL.md +9 -5
- package/templates/skills/cbp-standalone-task-testing/SKILL.md +16 -7
- package/templates/skills/cbp-task-create/SKILL.md +6 -7
- package/templates/skills/cbp-task-start/SKILL.md +8 -8
- package/templates/skills/cbp-todo/SKILL.md +6 -8
- package/templates/skills/cbp-verify/SKILL.md +146 -0
- package/templates/skills/cbp-verify/reference/deterministic-gates.md +114 -0
- package/templates/skills/{cbp-round-end → cbp-verify}/reference/findings-presentation.md +16 -12
- package/templates/skills/cbp-verify/reference/round-scope.md +62 -0
- package/templates/skills/cbp-verify/reference/task-scope.md +71 -0
- package/templates/agents/cbp-improve-round.md +0 -283
- package/templates/agents/cbp-task-check.md +0 -217
- package/templates/skills/cbp-round-check/SKILL.md +0 -132
- package/templates/skills/cbp-round-end/SKILL.md +0 -173
- package/templates/skills/cbp-round-end/reference/inline-fallback.md +0 -35
- package/templates/skills/cbp-round-execute/reference/inline-fallback.md +0 -55
- package/templates/skills/cbp-round-input/SKILL.md +0 -197
- package/templates/skills/cbp-round-start/SKILL.md +0 -261
- package/templates/skills/cbp-round-update/SKILL.md +0 -120
- package/templates/skills/cbp-ship/templates/workflow-eas-submit.yml +0 -53
- package/templates/skills/cbp-ship/templates/workflow-vsce-publish.yml +0 -31
- package/templates/skills/cbp-task-check/SKILL.md +0 -172
- package/templates/skills/cbp-task-testing/SKILL.md +0 -277
@@ -0,0 +1,70 @@
+---
+description: Real execution proof is a non-skippable verify obligation — tiered by what the round touched, every tier producing a COMMITTED artifact, never prose.
+paths:
+- ".claude/skills/cbp-verify/**"
+- ".claude/skills/cbp-round-build/**"
+- ".claude/agents/cbp-verify-reviewer.md"
+- ".claude/agents/cbp-e2e-playwright.md"
+- ".claude/agents/cbp-e2e-maestro.md"
+- ".claude/agents/cbp-e2e-tauri.md"
+- ".claude/agents/cbp-e2e-vscode.md"
+- ".claude/agents/cbp-e2e-xcuitest.md"
+---
+
+# Execution Proof
+
+"I verified the build" is not proof. Proof is a **committed artifact** that an auditor can
+re-inspect after the session ends. `cbp-verify` Phase 3 produces it; a passing verdict without
+it is invalid. The required artifact is **tiered by what the round's diff actually touched** —
+the tier is chosen from `files_changed`, not from a `has_ui_work` guess.
+
+## Tiers
+
+| Tier | Round touched | Proof obligation | Asserted by |
+|------|---------------|------------------|-------------|
+| **1** | A configured e2e framework's `app` source (`.codebyplan/e2e.json`) | `cbp-e2e-*` specialist runs the app and **commits screenshots** to the framework's committed dir | `codebyplan e2e verify-round` (non-empty gallery + non-zero assertions) |
+| **2** | UI source, but NO e2e framework configured for that app | **MANDATORY** dev-server run + at least one committed route screenshot or HTTP response trace for each changed route | manifest `artifacts[]` + `git ls-files --error-unmatch` |
+| **3** | Backend / API only (route handlers, server actions, endpoints) | Hit each changed endpoint; record an HTTP status trace (method, path, status, ms) committed to the round artifact dir | manifest `artifacts[]` |
+| **4** | `claude_only` / docs / config only (no app surface) | Proof IS the build/test commands — `codebyplan check --scope round\|task` (+ `bash -n` for touched hooks); profile-valid, no screenshot | manifest `gates[]` |
+
+A round can hit multiple tiers; satisfy each tier its diff touches.
+
+## Hard Rules
+
+- **Empty proof on a UI-touching diff is a GATE FAILURE.** A round whose `files_changed`
+includes UI source but whose manifest carries zero committed screenshots/traces fails verify —
+route to a fix round that captures the missing artifact. (Mirrors `e2e-mandatory.md`
+Committed-Screenshot Enforcement; sole exception: `vscode-test`-only behavior rounds.)
+- **Screenshots must be committed, not `/tmp`.** Each artifact path is proven present with
+`git ls-files --error-unmatch <path>` — an unstaged or `/tmp` file is not proof.
+- **Prose is never proof.** A narrative claim with no artifact path does not satisfy any tier.
+
+## Manifest Schema
+
+`cbp-verify` writes a `verify_manifest` into round/task context — the durable record of which
+gates ran and what proof exists:
+
+```yaml
+verify_manifest:
+  scope: round | task
+  gates: # deterministic gate results
+    - name: gate6 | lint | typecheck | tests | audit
+      exit_code: number
+      new_failures: string[] # post-baseline-diff; [] = pass
+  proof:
+    tier: 1 | 2 | 3 | 4
+    artifacts: # committed proof, one per affected surface
+      - kind: screenshot | http_trace | command_log
+        path: string # repo-relative; verified via git ls-files --error-unmatch
+        affected: string # route / endpoint / file this proves
+    e2e_verify_round: # present for Tier 1
+      pass: boolean
+      failed_checks: string[] # e2e_eligible_skipped | zero_assertion_run | empty_gallery
+  decided_at: ISO8601
+```
+
+## Cross-References
+
+- `rules/e2e-mandatory.md` — Tier 1 opt-out contract + committed-screenshot mandate.
+- `rules/two-tier-ci.md` — how proof feeds the soft (round/task) vs hardcore (checkpoint) tiers.
+- `skills/cbp-verify/reference/deterministic-gates.md` — the gate command contracts + manifest write.
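The hard rules in the execution-proof hunk can be read as a small check over the manifest. What follows is a minimal sketch, assuming a manifest shaped like the `verify_manifest` schema. The names `check_proof`, `git_is_tracked`, the `touches_ui` flag, and the failure labels are hypothetical and do not come from the package; only `git ls-files --error-unmatch <path>` is taken from the rule text.

```python
import subprocess
from typing import Callable, Dict, List


def git_is_tracked(path: str) -> bool:
    """Return True when `git ls-files --error-unmatch` exits 0 for `path`."""
    result = subprocess.run(
        ["git", "ls-files", "--error-unmatch", path],
        capture_output=True,
    )
    return result.returncode == 0


def check_proof(
    manifest: Dict,
    touches_ui: bool,
    is_tracked: Callable[[str], bool] = git_is_tracked,
) -> List[str]:
    """Return gate-failure reasons (hypothetical labels); empty list = proof holds."""
    failures: List[str] = []
    artifacts = manifest.get("proof", {}).get("artifacts", [])

    # Hard rule: empty proof on a UI-touching diff is a gate failure.
    ui_proof = [a for a in artifacts if a.get("kind") in ("screenshot", "http_trace")]
    if touches_ui and not ui_proof:
        failures.append("empty_proof_on_ui_diff")

    for artifact in artifacts:
        path = artifact.get("path", "")
        # Hard rule: prose is never proof, so every artifact needs a path.
        if not path:
            failures.append("artifact_without_path")
        # Hard rule: the file must be known to git, not unstaged or under /tmp.
        elif path.startswith("/tmp") or not is_tracked(path):
            failures.append(f"not_committed:{path}")
    return failures
```

Passing `is_tracked` as a parameter keeps the rule logic testable without a git checkout; the default wraps the real git command named in the rule.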
@@ -7,8 +7,8 @@ a skill is strictly user-only (i.e. it must never auto-trigger from another skil

 The absence of `disable-model-invocation` (or `disable-model-invocation: false`) is the normal
 state. It allows the skill to be auto-triggered via the Skill tool from within other skills —
-which is how the auto-trigger close-out flow works (e.g. `cbp-
-`cbp-
+which is how the auto-trigger close-out flow works (e.g. `cbp-round-build` → `cbp-verify`,
+`cbp-verify` task scope → `cbp-finalize`).

 ## The sole exception: `cbp-round-complete`

@@ -1,24 +1,24 @@
 ---
 name: parallel-waves
-description: Wave schema, invariants, and proximity-split algorithm for cbp-
+description: Wave schema, invariants, and proximity-split algorithm for cbp-round-planner Phase 5.6 wave decomposition.
 paths:
-- .claude/agents/cbp-
+- .claude/agents/cbp-round-planner.md
 ---

 # Parallel Waves

-Authoritative expansion of `cbp-
+Authoritative expansion of `cbp-round-planner` Phase 5.6. The planner reads this file at wave decomposition time.

 ## Wave Schema

-Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-
+Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 "Output" block):

 ```yaml
 - name: string # short identifier, e.g. "web-ui", "backend", "db"
-  agent_type: 'round-
+  agent_type: 'round-builder' | 'inline'
   files: string[] # repo-relative paths owned by this wave
   depends_on: string[] # names of waves that must complete before this one starts
-  skill_preloads: string[] # skills invoked by the
+  skill_preloads: string[] # skills invoked by the builder before Step 3 (e.g. "frontend-design")
   note: string # optional — required on continuation waves from an arbitrary-boundary split
 ```

@@ -31,9 +31,9 @@ Each entry in `plan.waves[]` carries these fields (source: `.claude/agents/cbp-t
 **(III) 3–15 files per wave** — every wave holds between 3 and 15 files (inclusive).
 - Below 3: merge into a sibling wave.
 - Above 15: apply the proximity-split algorithm below.
-- Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-
+- Sole exception — trivially small plans are exempt from the lower bound: a plan with fewer than 3 total files uses one single wave, and a single-app plan with ≤5 total files MAY skip decomposition entirely (one wave, or `waves[]` omitted — see `cbp-round-planner` Phase 5.6). Zero waves (omitted `waves[]`) trivially satisfies this invariant.

-**(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-
+**(IV) UI skill preloads** — for each wave whose `files[]` contains UI-bearing paths (`*.tsx`, `*.jsx`, `*.scss`, etc.), add `"frontend-design"` to `skill_preloads[]` (source: `.claude/agents/cbp-round-planner.md` Phase 5.6 step "Populate `skill_preloads[]`").

 ## Proximity-Split Algorithm

@@ -57,7 +57,7 @@ Invariants I (disjoint files), II (acyclic `depends_on` DAG), and III (3–15 fi

 ## Cross-References

-- `agents/cbp-
+- `agents/cbp-round-planner.md` Phase 5.6 — consumer of this rule; steps 1–6 and the `validate-waves` verification call.
 - `packages/codebyplan-package/src/lib/validate-waves.ts` — deterministic enforcement of invariants I–III.
-- `agents/cbp-round-
-- `skills/cbp-round-
+- `agents/cbp-round-builder.md` Step 2.6 — wave-mode skill preloads.
+- `skills/cbp-round-build/SKILL.md` Step 3 — per-wave builder dispatch.
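Invariants I to III named in the parallel-waves hunks (disjoint files, acyclic `depends_on`, 3 to 15 files per wave) lend themselves to a mechanical check. The sketch below is an illustration only and is not the package's `validate-waves.ts`; the function name `validate_waves`, the message format, and the way the small-plan exemption is approximated (a single wave in a plan with fewer than 3 files) are assumptions.

```python
from typing import Dict, List


def validate_waves(waves: List[Dict]) -> List[str]:
    """Check invariants I-III; return a list of violation messages."""
    errors: List[str] = []
    if not waves:  # omitted waves[] trivially satisfies the invariants
        return errors

    # Invariant I: no file may be owned by two waves.
    owner: Dict[str, str] = {}
    for wave in waves:
        for path in wave["files"]:
            if path in owner:
                errors.append(f"I: {path} in {owner[path]} and {wave['name']}")
            owner[path] = wave["name"]

    # Invariant II: depends_on must form a DAG (depth-first cycle check).
    deps = {w["name"]: w.get("depends_on", []) for w in waves}
    state: Dict[str, int] = {}  # 1 = on the current path, 2 = finished

    def has_cycle(name: str) -> bool:
        if state.get(name) == 1:
            return True
        if state.get(name) == 2:
            return False
        state[name] = 1
        found = any(has_cycle(d) for d in deps.get(name, []))
        state[name] = 2
        return found

    if any(has_cycle(name) for name in deps):
        errors.append("II: depends_on contains a cycle")

    # Invariant III: 3-15 files per wave, except a single wave in a tiny plan.
    total = sum(len(w["files"]) for w in waves)
    tiny_plan = len(waves) == 1 and total < 3
    for wave in waves:
        count = len(wave["files"])
        if count > 15 or (count < 3 and not tiny_plan):
            errors.append(f"III: {wave['name']} has {count} files")
    return errors
```

A wave above 15 files is only reported here; splitting it is the job of the proximity-split algorithm the rule describes, which this sketch does not attempt.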
@@ -0,0 +1,76 @@
+---
+description: A subagent spawn failure is a HARD GATE FAILURE — STOP and retry, never walk the agent's steps inline and self-certify.
+paths:
+- ".claude/skills/cbp-verify/**"
+- ".claude/skills/cbp-round-build/**"
+- ".claude/skills/cbp-finalize/**"
+- ".claude/agents/cbp-verify-reviewer.md"
+- ".claude/agents/cbp-round-builder.md"
+---
+
+# Spawn Failure Is Gate Failure
+
+When a verify/execution stage delegates work to a subagent (e.g. `cbp-verify` spawning
+`cbp-verify-reviewer`, `cbp-round-build` spawning `cbp-round-builder`), the agent is the
+**fresh-context oracle**. If the agent cannot run, the orchestrator does NOT have an
+equivalent signal — and it must NEVER manufacture one.
+
+## The Rule
+
+A **spawn failure** — the agent could not run, or died on a terminal error before producing
+its output contract — is a **HARD GATE FAILURE**. The orchestrator STOPS and surfaces a retry
+directive. It does NOT walk the agent's phase checklist inline with its own tools and grade its
+own work. Self-certification by the orchestrator that spawned the agent is precisely the
+fresh-context blind spot the agent exists to remove; reproducing the agent's steps inline
+re-introduces it.
+
+Spawn-failure classes (non-exhaustive): provider 5xx, rate-limit / monthly-cap / billing block,
+context overflow at spawn, the agent process dying before emitting its output contract.
+
+**Retry directive shape** (surface verbatim, then STOP):
+
+```
+## Verify blocked — reviewer could not spawn
+
+The fresh-context reviewer (<agent>) failed to spawn: <class> — <verbatim error>.
+This is a hard gate failure, not a pass. Retry when capacity returns:
+Next: /cbp-verify
+```
+
+Record `<scope>.context.verify.spawn_failure = { agent, class, error_message, decided_at }` so
+the retry is auditable and a verdict is never written on a missing review.
+
+## Spawn-Failed vs Spawn-Ran-And-Found-Problems
+
+These are different outcomes with opposite routes — do not conflate them:
+
+| Outcome | Meaning | Route |
+|---------|---------|-------|
+| **Spawn failed** | Agent never produced its output contract (terminal error). | HARD GATE FAILURE → STOP + retry directive. No verdict written. |
+| **Spawn ran, found problems** | Agent returned findings / `NOT_READY`. | Normal flow → in-scope mechanical fix or `/cbp-round-plan` fix round. |
+
+A returned `NOT_READY` is a *successful* review with a negative verdict — it is acted on, not
+retried. Only the absence of a contract is a spawn failure.
+
+## Carve-Out: The `claude_only` Profile Is Not Inline Fallback
+
+The `claude_only` profile (rounds with no app surface — `.claude/`-only edits, docs, config)
+has **no agent to spawn by design**. Its proof IS the deterministic command set:
+`codebyplan check --scope round|task` plus `bash -n <hook>` for any touched shell file. Running
+those inline is a **first-class deterministic verification path**, not a banned inline fallback —
+there was never a subagent to substitute for. This carve-out applies ONLY when the resolved
+profile is `claude_only`; for every other profile an agent is expected, and its spawn failure is
+a hard gate failure per above.
+
+## Why (Replaces Inline-Fallback Self-Certification)
+
+The retired `inline-fallback.md` procedures let an orchestrator that just failed to spawn an
+agent walk that agent's steps and pass its own work. That defeats the entire point of a
+fresh-context review and silently downgraded quality under sustained outages. This rule replaces
+those procedures: a missing review is a STOP, not a self-graded continue.
+
+## Cross-References
+
+- `skills/cbp-verify/SKILL.md` Phase 4 — the reviewer spawn + this hard-fail.
+- `agents/cbp-verify-reviewer.md` — the reviewer whose absence triggers this rule.
+- `rules/execution-proof.md` — the proof obligation a passing verdict still requires.
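The routing in the spawn-failure rule (failed spawn, returned `NOT_READY`, and the `claude_only` carve-out) reduces to a short decision function. This is a minimal sketch under stated assumptions: `SpawnResult`, `route`, the `verdict` key, the `READY` value, and the returned route labels are all hypothetical names chosen for illustration, since the rule does not define a data shape for the output contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpawnResult:
    """What the orchestrator observed after trying to spawn the agent."""
    contract: Optional[dict]   # the agent's output contract, or None if it never arrived
    error_class: str = ""      # e.g. "provider_5xx" (illustrative label)
    error_message: str = ""


def route(profile: str, result: Optional[SpawnResult]) -> str:
    """Map a verify outcome to a route label, following the rule's outcome table."""
    # Carve-out: claude_only has no agent by design; inline commands are the proof.
    if profile == "claude_only":
        return "run_deterministic_commands"
    # No contract means the agent never reported: hard gate failure, no verdict.
    if result is None or result.contract is None:
        return "stop_and_retry"
    # A returned NOT_READY is a successful review with a negative verdict.
    if result.contract.get("verdict") == "NOT_READY":
        return "act_on_findings"
    return "continue"
```

The ordering matters: the absence of a contract is tested before any verdict is read, so a missing review can never fall through to a pass.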
@@ -12,7 +12,7 @@ CodeByPlan has two families of task commands since CHK-141:

 | Family | Commands | When to use |
 |--------|----------|-------------|
-| Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-
+| Checkpoint-bound | `/cbp-task-create`, `/cbp-task-start {chk}-{task}`, `/cbp-verify`, `/cbp-finalize` | Work that belongs to a CHK-NNN checkpoint |
 | Standalone | `/cbp-standalone-task-create`, `/cbp-standalone-task-start {task}`, `/cbp-standalone-task-check`, `/cbp-standalone-task-testing`, `/cbp-standalone-task-complete` | Independent work not tied to any checkpoint |

 ## Round Commands (Both Families)
@@ -62,8 +62,8 @@ The queue head (`get_todos` `rows[0]`) maps to one of these slash commands. The

 | State | Command | Required context |
 |-------|---------|------------------|
-| Round in progress | `/cbp-
-| Round pending start | `/cbp-round-
+| Round in progress | `/cbp-verify` | `{checkpoint_id, task_id, round_id}` |
+| Round pending start | `/cbp-round-plan` | `{checkpoint_id, task_id}` |
 | Task pending start | `/cbp-task-start` | `{checkpoint_id, task_id}` or `{task_id}` for standalone |
 | Checkpoint pending activation | `/cbp-checkpoint-update` | `{checkpoint_id}` |
 | Checkpoint done | `/cbp-checkpoint-check` | `{checkpoint_id}` |
@@ -118,4 +118,4 @@ CHK-111 shipped the original todos queue as Postgres triggers + a 583-LOC `regen
 4. Env vars (from `apps/todo-worker/.env.example`): `SUPABASE_URL`, `SUPABASE_SECRET_KEY` (an `sb_secret_...` key), `LOG_LEVEL`, `WORKER_POLL_MS`.
 5. Save the resulting `project_ref` to `.codebyplan.json` `shipment.surfaces.railway-todo-worker.project_ref`.

-Smoke after deploy: run `/cbp-
+Smoke after deploy: run `/cbp-finalize` in any worktree → tail Railway logs → expect a `claim → apply` cycle within `WORKER_POLL_MS`.
@@ -0,0 +1,63 @@
+---
+description: Two CI tiers — soft (round/task → feat) is baseline-tolerant; hardcore (checkpoint → main) is whole-repo absolute green. Branch model is feat→main direct.
+paths:
+- ".claude/skills/cbp-verify/**"
+- ".claude/skills/cbp-checkpoint-check/**"
+- ".claude/skills/cbp-checkpoint-end/**"
+- ".claude/skills/cbp-ship-main/**"
+- ".codebyplan/ci.json"
+---
+
+# Two Tier CI
+
+CodeByPlan gates work at two strictness tiers. The tier is chosen by **what is being
+promoted**, not by preference.
+
+## Soft Tier — round / task → feat branch
+
+Runs at every `cbp-verify` (round scope) and the task-scope escalation. **Baseline-tolerant**:
+pre-existing red is non-blocking; only NEW per-package failures fail.
+
+- `codebyplan check --scope round|task` (NO `--no-baseline`). Each baselined check
+(`lint` / `typecheck` / `tests` / `audit`) fails ONLY when its `new_failures[]` is non-empty
+vs `.check-baseline.json`. `gate6` (sibling-identity parity) is **always hard** — never
+baselined.
+- `codebyplan e2e verify-round --round-id <id> --task-id <id>` per round (Tier-1 e2e proof).
+- Fresh-context review via `cbp-verify-reviewer` (its spawn failure is a hard gate failure —
+`rules/spawn-failure-is-gate-failure.md`).
+
+The soft tier keeps the inner loop fast: a feat branch may carry the repo's known baseline red
+forward without blocking, while guaranteeing the work being added is itself clean.
+
+## Hardcore Tier — checkpoint → main
+
+Runs at checkpoint close (`cbp-checkpoint-check` / `cbp-checkpoint-end` / ship). **Zero baseline
+forgiveness — whole-repo absolute green.**
+
+- `codebyplan check --scope merged --no-baseline` = every failing package and every GHSA id
+counts; any red fails. (`gate6` unchanged — still always hard.)
+- Aggregate e2e proof across the whole checkpoint diff.
+- Every required `main` branch-protection PR check is green (repo-specific — read the repo's
+configured required checks, never assume a single hardcoded check name).
+
+## Critical Constraint — feat→main DIRECT, main-only
+
+The branch model is **feat→main direct**; `.codebyplan/git.json` has `integration: null`,
+`production: "main"`. There is **NO intermediate integration branch** — the "checkpoint branch"
+IS the per-checkpoint feat branch. The hardcore tier runs against that feat branch's merged
+state before it lands on main; do not assume a staging/integration hop exists.
+
+## Report-Only Rollout
+
+The whole-repo hardcore CI **job** lands **report-only first** (`continue-on-error: true`) and is
+flipped to a required check ONLY after the `apps/web` baseline is burned down. Until then,
+`--scope merged --no-baseline` is advisory in CI — surfaced, not enforced — so a pre-existing
+`apps/web` red does not block a merge while the baseline is still being paid down. Locally,
+`cbp-verify` still runs and reports it.
+
+## Cross-References
+
+- `rules/execution-proof.md` — the committed-artifact obligation feeding both tiers.
+- `rules/spawn-failure-is-gate-failure.md` — fresh-context review is non-substitutable.
+- `skills/cbp-verify/reference/deterministic-gates.md` — exact gate commands + JSON contracts.
+- `.codebyplan/git.json` — authoritative branch model (`integration: null`, `production: main`).
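The difference between the soft and hardcore tiers in the two-tier CI rule comes down to how each gate result is judged. The sketch below assumes gate records shaped like the `gates[]` entries of the `verify_manifest` schema and treats a non-zero `exit_code` as red; that reading, along with the function name `failing_gates` and the tier labels `"soft"` and `"hardcore"`, is an assumption for illustration and not the CLI's actual logic.

```python
from typing import Dict, List

# Checks the rule describes as baselined at the soft tier.
BASELINED = {"lint", "typecheck", "tests", "audit"}


def failing_gates(gates: List[Dict], tier: str) -> List[str]:
    """Return the names of gates that block promotion at the given tier."""
    blocked: List[str] = []
    for gate in gates:
        name = gate["name"]
        if tier == "soft" and name in BASELINED:
            # Soft tier: pre-existing red is tolerated; only new failures block.
            if gate.get("new_failures"):
                blocked.append(name)
        elif gate["exit_code"] != 0:
            # Hardcore tier, and gate6 at either tier: any red blocks.
            blocked.append(name)
    return blocked
```

With the same gate results, a lint gate that is red only because of baseline failures passes at the soft tier and blocks at the hardcore tier, which mirrors the "zero baseline forgiveness" wording.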
@@ -56,9 +56,9 @@
 "Skill(cbp-checkpoint-check)",
 "Skill(cbp-checkpoint-complete)",
 "Skill(cbp-round-complete)",
-"Skill(cbp-round-
+"Skill(cbp-round-build)",
 "Skill(cbp-session-end)",
-"Skill(cbp-
+"Skill(cbp-finalize)",
 "Skill(cbp-standalone-task-create)",
 "Skill(cbp-standalone-task-start)",
 "Skill(cbp-standalone-task-complete)",
@@ -126,13 +126,10 @@
 "Skill(cbp-map-architecture)",
 "Skill(cbp-merge-main)",
 "Skill(cbp-refresh-arch-map)",
-"Skill(cbp-
-"Skill(cbp-round-check)",
-"Skill(cbp-round-end)",
-"Skill(cbp-round-input)",
-"Skill(cbp-round-start)",
-"Skill(cbp-round-update)",
+"Skill(cbp-round-plan)",
 "Skill(cbp-session-start)",
+"Skill(cbp-setup-cd)",
+"Skill(cbp-setup-ci)",
 "Skill(cbp-setup-e2e)",
 "Skill(cbp-setup-eslint)",
 "Skill(cbp-ship-configure)",
@@ -142,11 +139,10 @@
 "Skill(cbp-supabase-branch-check)",
 "Skill(cbp-supabase-migrate)",
 "Skill(cbp-supabase-setup)",
-"Skill(cbp-task-check)",
 "Skill(cbp-task-create)",
 "Skill(cbp-task-start)",
-"Skill(cbp-task-testing)",
 "Skill(cbp-todo)",
+"Skill(cbp-verify)",
 "Skill(supabase)",
 "Skill(supabase-postgres-best-practices)",
 "mcp__codebyplan__get_checkpoints",
@@ -212,6 +208,8 @@
 "Bash(npx codebyplan ports:*)",
 "Bash(codebyplan tech-stack:*)",
 "Bash(npx codebyplan tech-stack:*)",
+"Bash(codebyplan docs:*)",
+"Bash(npx codebyplan docs:*)",
 "Bash(codebyplan eslint:*)",
 "Bash(npx codebyplan eslint:*)",
 "Bash(codebyplan lsp:*)",
@@ -226,6 +224,8 @@
 "Bash(npx codebyplan checkpoint:*)",
 "Bash(codebyplan task:*)",
 "Bash(npx codebyplan task:*)",
+"Bash(codebyplan standalone-task:*)",
+"Bash(npx codebyplan standalone-task:*)",
 "Bash(codebyplan session:*)",
 "Bash(npx codebyplan session:*)",
 "Bash(codebyplan help:*)",
@@ -249,7 +249,11 @@
 "Bash(codebyplan e2e:*)",
 "Bash(npx codebyplan e2e:*)",
 "Bash(codebyplan arch-map:*)",
-"Bash(npx codebyplan arch-map:*)"
+"Bash(npx codebyplan arch-map:*)",
+"Bash(codebyplan ci:*)",
+"Bash(npx codebyplan ci:*)",
+"Bash(codebyplan cd:*)",
+"Bash(npx codebyplan cd:*)"
 ]
 },
 "attribution": {
@@ -38,7 +38,7 @@ A skill that carries a `model:` line is a **gap** — remove it unless a deliber

 ### Agents — `model:` + `effort:`

-Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-
+Default `model: sonnet` + `effort: xhigh`. Fifteen of the 17 authoring agents take the default (`cbp-cc-executor`, `cbp-database-agent`, `cbp-improve-claude`, `cbp-research`, `cbp-round-builder`, `cbp-security-agent`, `cbp-stripe-agent`, `cbp-verify-reviewer`, `cbp-round-planner`, `cbp-testing-qa-agent`, `cbp-e2e-playwright`, `cbp-e2e-maestro`, `cbp-e2e-tauri`, `cbp-e2e-vscode`, `cbp-e2e-xcuitest`). The other two are exceptions:

 | agent | model | effort | reason |
 | -------------------- | ------ | ------ | ----------------------------------------------------------------------------------- |
@@ -22,7 +22,7 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro

 ### `allow` — the autonomous workflow surface

-- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-
+- **Non-lifecycle, non-shipment `/cbp-*` skills** — authoring (`cbp-build-cc-*`), frontend (`cbp-frontend-*`), git (`cbp-git-*`, `cbp-merge-main`, `cbp-refresh-infra`), round work (`cbp-round-plan`, `cbp-verify` — `cbp-verify` is the autonomous verify stage that runs deterministic gates, proves execution, spawns the fresh-context reviewer, and routes to `cbp-round-complete` or `cbp-round-plan`, so it runs without a prompt), setup/configure (`cbp-setup-*`, `cbp-ship-configure`, `cbp-supabase-*`), task prep (`cbp-task-create`/`-start`, `cbp-standalone-task-check`/`-testing`), planning (`cbp-checkpoint-plan`/`-update`), plus `cbp-session-start` and `cbp-todo`. Invoking a skill is the intended mode of operation; the gated side effects happen inside via the Bash/MCP tools the skill calls, which carry their own tiering. The lifecycle/state-transition and plan-approval skills are the exception — they live in `ask` (next section).
 - **All `mcp__codebyplan__*` reads** (`get_*`, `list_*`, `search_*`, `health_check`, `lookup_symbol`, `resolve_library_id`, `get_chunk`).
 - **Routine workflow-write MCP tools** the pipeline calls many times per task: create/update/complete checkpoint, task, and round; session log + session-state writes; `create_worktree`, `add_library`, `flag_stale_chunk`, `update_server_config`, `update_eslint_repo_config`, `update_task_template`. Gating these with `ask` would make the autonomous workflow unusable.
 - **Read/safe CLI commands** (both `codebyplan X` and `npx codebyplan X`): `whoami`, `resolve-worktree`, `statusline`, `ports`, `tech-stack`, `eslint`, `round`, `help`, `--version`.
@@ -30,8 +30,8 @@ Precedence is `deny > ask > allow`; arrays union across scopes (managed/user/pro
|
|
|
30
30
|
### `ask` — the deliberate confirm-gate
|
|
31
31
|
|
|
32
32
|
- **Production-shipment skills**: `cbp-ship`, `cbp-ship-main`, `cbp-checkpoint-end` — these promote/deploy to production, so they prompt even in an otherwise auto-allowed setup.
|
|
33
|
-
- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-
|
|
34
|
-
- **Plan-approval gate**: `cbp-round-
|
|
33
|
+
- **Lifecycle / state-transition skills**: `cbp-checkpoint-start`, `cbp-checkpoint-create`, `cbp-checkpoint-check`, `cbp-checkpoint-complete`, `cbp-round-complete`, `cbp-session-end`, `cbp-finalize`, `cbp-standalone-task-create`, `cbp-standalone-task-start`, `cbp-standalone-task-complete` — these open or close checkpoints, tasks, rounds, and sessions (advancing workflow state in the database), so they stop for explicit confirmation rather than running autonomously. `cbp-round-complete` is the permission-gated round finalizer (reconciles the user's `git add`s, completes the round, routes onward); its `ask` prompt is the human gate downstream of `cbp-verify` — the autonomous, `allow`-tier verify stage whose triage routes here.
|
|
34
|
+
- **Plan-approval gate**: `cbp-round-build` — the round plan is approved by confirming this `ask` prompt rather than via an in-skill AskUserQuestion. `cbp-round-plan` runs its planning Q&A, then hands off to `cbp-round-build`; the permission prompt is the user's go/no-go on the plan.
|
|
35
35
|
- **Destructive / admin MCP tools**: `delete_session_log`, `delete_worktree`, `create_repo`, `release_assignment`. (The launch and member-admin tools were dropped from the MCP surface in CHK-180 — those concerns are web-app only now.)
|
|
36
36
|
- **Mutating / external / clobber-risk CLI commands** (both prefixes): `setup`, `login`, `logout`, `upgrade-auth`, `config` (can overwrite committed `.codebyplan/` files), `branch` (rewrites branch config), `ship`, `claude` (`install`/`update`/`uninstall` overwrite `.claude/`).
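The precedence rule this file states (`deny > ask > allow`, with arrays unioned across scopes) can be sketched in shell. This is a minimal illustration, not the harness's actual implementation: the helper names `in_list` and `resolve_tier` are hypothetical, and the three lists are assumed to be already unioned across scopes and passed as space-separated strings.

```shell
# Hypothetical helper: succeed if the first argument equals any later argument.
in_list() {
  needle=$1
  shift
  for entry in "$@"; do
    if [ "$entry" = "$needle" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical helper: print the effective tier for one skill/tool name.
# Arguments: name, deny list, ask list, allow list (space-separated strings).
resolve_tier() {
  name=$1 deny=$2 ask=$3 allow=$4
  # The lists are left unquoted on purpose so they split into words.
  if in_list "$name" $deny; then
    echo deny
  elif in_list "$name" $ask; then
    echo ask
  elif in_list "$name" $allow; then
    echo allow
  else
    # What the harness does with an unlisted name is not modelled here.
    echo unlisted
  fi
}

# A name present in both ask and allow resolves to the stricter tier.
resolve_tier cbp-ship "" "cbp-ship cbp-finalize" "cbp-ship cbp-verify"
```

Checking the tiers in strictness order means a name that lands in several lists after the union always gets the most restrictive one, which is the behaviour the precedence rule describes.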
|
|
37
37
|
|
|
@@ -53,11 +53,11 @@ A skill invokes the next skill via the Skill tool at the appropriate routing bra
|
|
|
53
53
|
### How the human gate works
|
|
54
54
|
|
|
55
55
|
- **`allow`-tier** skill: the harness auto-fires it silently when the triggering skill invokes it.
|
|
56
|
-
No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-
|
|
57
|
-
`cbp-round-
|
|
56
|
+
No permission prompt. Use for safe, routine-flow skills (e.g. `cbp-verify`,
|
|
57
|
+
`cbp-round-plan`) where the trigger condition already encodes the human intent.
|
|
58
58
|
- **`ask`-tier** skill: the harness pauses and shows a permission prompt before the skill runs.
|
|
59
59
|
**That prompt IS the human gate** — it replaces the old "Next: /cbp-X, run it yourself"
|
|
60
|
-
manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-
|
|
60
|
+
manual directive. Use for lifecycle/state-transition skills (e.g. `cbp-finalize`,
|
|
61
61
|
`cbp-checkpoint-check`) where a deliberate confirmation is still desirable.
|
|
62
62
|
|
|
63
63
|
This means:
|
|
@@ -70,7 +70,7 @@ This means:
|
|
|
70
70
|
|
|
71
71
|
The `cbp-skill-context-guard.sh` PreToolUse hook denies heavy close-out skills when the
|
|
72
72
|
context window exceeds `CBP_CONTEXT_WARN_TOKENS` (default 200 000 tokens). The heavy allowlist
|
|
73
|
-
is: `cbp-round-
|
|
73
|
+
is: `cbp-round-build`, `cbp-verify`, `cbp-standalone-task-testing`,
|
|
74
74
|
`cbp-checkpoint-check`, `cbp-checkpoint-end`.
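The guard's decision can be sketched as a small shell function. This is an illustration under stated assumptions, not the contents of `cbp-skill-context-guard.sh`: the function name `guard_decision`, its two arguments, and the `deny`/`pass` output strings are hypothetical, and how the real hook measures the context window is not shown in this diff.

```shell
# Threshold from the environment, defaulting to the documented 200 000 tokens.
CBP_CONTEXT_WARN_TOKENS=${CBP_CONTEXT_WARN_TOKENS:-200000}

# Hypothetical helper: print "deny" when a heavy close-out skill is invoked
# with the context window above the threshold, otherwise print "pass".
guard_decision() {
  skill=$1 tokens=$2
  case $skill in
    cbp-round-build|cbp-verify|cbp-standalone-task-testing|cbp-checkpoint-check|cbp-checkpoint-end)
      # "Exceeds" is read as strictly greater than the threshold.
      if [ "$tokens" -gt "$CBP_CONTEXT_WARN_TOKENS" ]; then
        echo deny
        return 0
      fi
      ;;
  esac
  echo pass
}

guard_decision cbp-verify 250000
guard_decision cbp-todo 250000
```

Skills outside the heavy list pass through regardless of context size, so only the five listed close-out skills are ever blocked.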
|
|
75
75
|
|
|
76
76
|
When the guard fires, it directs the model to run `/cbp-clear-prep` instead. The flow is:
|
|
@@ -81,7 +81,7 @@ A Task-pattern skill that must only run on explicit user confirmation is a **per
|
|
|
81
81
|
|
|
82
82
|
- MUST carry `disable-model-invocation: true` — the model cannot invoke it; only the user can (via `/skill-name`).
|
|
83
83
|
- Any upstream skill that would otherwise auto-trigger it MUST instead emit a `Next: /skill-name` directive and STOP, because model invocation of a `disable-model-invocation` skill is blocked at the runtime level.
|
|
84
|
-
- Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-
|
|
84
|
+
- Canonical example: `/cbp-round-complete` (the round finalizer). `/cbp-verify` routes a clean round via a `Next: /cbp-round-complete` directive and stops — it cannot invoke round-complete directly.
|
|
85
85
|
|
|
86
86
|
### Step 5 — Fill the frontmatter
|
|
87
87
|
|
|
@@ -79,14 +79,14 @@ A skill should do one thing in the pipeline. If a skill both plans AND executes,
|
|
|
79
79
|
|
|
80
80
|
| Wrong | Right |
|
|
81
81
|
| --------------------------------------- | ------------------------------------------------------------ |
|
|
82
|
-
| `/cbp-round` (plans + executes + tests) | `/cbp-round-
|
|
82
|
+
| `/cbp-round` (plans + executes + tests) | `/cbp-round-plan` → `/cbp-round-build` → `/cbp-verify` |
|
|
83
83
|
|
|
84
84
|
### Pipeline Clarity
|
|
85
85
|
|
|
86
86
|
If the skill is part of a chain, show it:
|
|
87
87
|
|
|
88
88
|
```
|
|
89
|
-
/cbp-round-
|
|
89
|
+
/cbp-round-plan (planning) → /cbp-round-build (ask-tier permission = plan approval)
|
|
90
90
|
```
|
|
91
91
|
|
|
92
92
|
### Approval Gates
|
|
@@ -4,7 +4,7 @@
|
|
|
4
4
|
parent conversation and, per the runtime, **runs in the background by default**. It is
|
|
5
5
|
isolation for a *whole skill*, not a way to delegate one sub-step. A forked body therefore
|
|
6
6
|
cannot drive the main pipeline: it can't `AskUserQuestion`, can't auto-trigger another
|
|
7
|
-
skill, and can't run
|
|
7
|
+
skill, and can't run the deterministic fallback path the orchestrator depends on.
|
|
8
8
|
|
|
9
9
|
So forking only helps a narrow shape of skill. The canonical eligible example is
|
|
10
10
|
[examples/fork-skill.md](../examples/fork-skill.md): a single self-contained analytical task
|
|
@@ -19,20 +19,20 @@ A skill is **fork-eligible** only when ALL hold:
|
|
|
19
19
|
3. It does **not route** — no auto-trigger of another skill, no close-out directive that must
|
|
20
20
|
fire in the main context.
|
|
21
21
|
4. It does **not fan out** — it does not spawn multiple subagents and coordinate them.
|
|
22
|
-
5. It has **no
|
|
22
|
+
5. It has **no deterministic fallback** path the orchestrator relies on.
|
|
23
23
|
|
|
24
24
|
Fail any one → the skill stays **inline** (main context). Inline skills still get clean
|
|
25
25
|
context isolation the right way: by delegating their heavy step to a dedicated **agent**
|
|
26
|
-
(e.g. `cbp-
|
|
26
|
+
(e.g. `cbp-verify-reviewer`, `cbp-round-builder`). The agent is the
|
|
27
27
|
isolation boundary; the skill stays in the main thread to orchestrate, route, and interact.
|
|
28
28
|
|
|
29
29
|
## When NOT to use `context: fork` (the disqualifying patterns)
|
|
30
30
|
|
|
31
31
|
| Pattern | Why it can't fork | Example skills |
|
|
32
32
|
|---------|-------------------|----------------|
|
|
33
|
-
| **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-
|
|
34
|
-
| **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill
|
|
35
|
-
| **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-
|
|
33
|
+
| **fan-out** | spawns multiple agents in parallel and coordinates them | `cbp-round-build`, `cbp-checkpoint-check`, `cbp-map-architecture`, `cbp-refresh-arch-map` |
|
|
34
|
+
| **spawn-then-route** | spawns one agent, then `AskUserQuestion` / auto-triggers the next skill | `cbp-verify`, `cbp-standalone-task-check`, `cbp-round-plan`, `cbp-checkpoint-plan` |
|
|
35
|
+
| **inline-by-design** | interactive Q&A or stepwise writes that must stay in the main context | `cbp-task-create`, `cbp-finalize`, `cbp-merge-main` |
|
|
36
36
|
| **consumed-inline** | invoked *by* an agent (e.g. `cbp-round-builder`) and applies fixes synchronously into that context | `cbp-frontend-design`, `cbp-frontend-ui`, `cbp-frontend-ux` |
|
|
37
37
|
| **doc-ref-only** | mentions subagents/fork only as documentation; runs inline authoring | the `cbp-build-cc-*` authoring skills, `cbp-supabase-migrate` |
|
|
38
38
|
|
|
@@ -40,28 +40,25 @@ isolation boundary; the skill stays in the main thread to orchestrate, route, an
|
|
|
40
40
|
|
|
41
41
|
Every skill whose `SKILL.md` touches the subagent/fork boundary — by spawning a subagent, by
|
|
42
42
|
being invoked inline by an agent, or by documenting the feature — was classified against the
|
|
43
|
-
eligibility test. **Result: 0 of
|
|
43
|
+
eligibility test. **Result: 0 of 22 are fork-eligible** — none were migrated, because every
|
|
44
44
|
one either already isolates heavy work in a dedicated agent (the correct boundary) or depends
|
|
45
45
|
on inline orchestration/interaction that a background fork would break.
|
|
46
46
|
|
|
47
47
|
| Skill | Pattern | Fork-eligible |
|
|
48
48
|
|-------|---------|:---:|
|
|
49
|
-
| cbp-round-
|
|
49
|
+
| cbp-round-build | fan-out | no |
|
|
50
50
|
| cbp-checkpoint-check | fan-out | no |
|
|
51
51
|
| cbp-map-architecture | fan-out | no |
|
|
52
52
|
| cbp-refresh-arch-map | fan-out | no |
|
|
53
|
-
| cbp-round-
|
|
54
|
-
| cbp-
|
|
55
|
-
| cbp-task-check | spawn-then-route | no |
|
|
53
|
+
| cbp-round-plan | spawn-then-route | no |
|
|
54
|
+
| cbp-verify | spawn-then-route | no |
|
|
56
55
|
| cbp-standalone-task-check | spawn-then-route | no |
|
|
57
56
|
| cbp-checkpoint-plan | spawn-then-route | no |
|
|
58
|
-
| cbp-round-update | inline-by-design | no |
|
|
59
57
|
| cbp-task-create | inline-by-design | no |
|
|
60
58
|
| cbp-standalone-task-create | inline-by-design | no |
|
|
61
|
-
| cbp-
|
|
59
|
+
| cbp-finalize | inline-by-design | no |
|
|
62
60
|
| cbp-standalone-task-complete | inline-by-design | no |
|
|
63
61
|
| cbp-merge-main | inline-by-design | no |
|
|
64
|
-
| cbp-task-testing | inline-by-design | no |
|
|
65
62
|
| cbp-standalone-task-testing | inline-by-design | no |
|
|
66
63
|
| cbp-frontend-design | consumed-inline | no |
|
|
67
64
|
| cbp-frontend-ui | consumed-inline | no |
|
|
@@ -1,4 +1,5 @@
|
|
|
1
1
|
---
|
|
2
|
+
scope: org-shared
|
|
2
3
|
name: cbp-checkpoint-check
|
|
3
4
|
description: Full re-evaluation of a checkpoint with before/after comparison
|
|
4
5
|
argument-hint: [CHK-NNN]
|
|
@@ -83,7 +84,14 @@ Aggregate QA from all tasks and rounds:
|
|
|
83
84
|
| TASK-[N] | READY | all_pass | [N] |
|
|
84
85
|
```
|
|
85
86
|
|
|
86
|
-
Re-run build/lint/types on current codebase to verify nothing regressed across tasks.
|
|
87
|
+
Re-run build/lint/types on the current codebase to verify nothing regressed across tasks. Detect `$PLATFORM` from the project type (same signal table as `cbp-testing-qa-agent.md` Step 1), then resolve commands from `.codebyplan/ci.json`:
|
|
88
|
+
|
|
89
|
+
```bash
|
|
90
|
+
CI_BUILD_CMD=$(npx codebyplan ci resolve build --platform "$PLATFORM" 2>/dev/null)
|
|
91
|
+
CI_TYPES_CMD=$(npx codebyplan ci resolve typecheck --platform "$PLATFORM" 2>/dev/null)
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Run: `${CI_BUILD_CMD:-npm run build}` and `${CI_TYPES_CMD:-npx tsc --noEmit}`. For lint, use the whole-repo command (`pnpm -w lint`). Fallback: if `.codebyplan/ci.json` is absent, `ci resolve` returns the central default; if the binary is unavailable, the command substitution yields an empty string and the `${CI_*_CMD:-<literal>}` guard substitutes the hardcoded fallback.
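The `${VAR:-literal}` guard relies on standard shell parameter expansion: the literal is used when the variable is unset or empty. A minimal sketch of that behaviour follows; the helper name `with_fallback` and the resolved value `pnpm build` are hypothetical stand-ins for whatever `ci resolve` would print.

```shell
# Hypothetical helper: print the resolved command, or the fallback when the
# resolved value is empty (for example because the resolver printed nothing).
with_fallback() {
  resolved=$1 fallback=$2
  printf '%s\n' "${resolved:-$fallback}"
}

# Resolver produced a command: it wins.
with_fallback "pnpm build" "npm run build"

# Resolver produced nothing: the hardcoded literal is used.
with_fallback "" "npm run build"
with_fallback "" "npx tsc --noEmit"
```

Because `:-` treats an empty string the same as an unset variable, a resolver that fails silently (stderr discarded, nothing on stdout) degrades to the literal rather than to an empty command.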
|
|
87
95
|
|
|
88
96
|
### Step 5b: Whole-Checkpoint E2E
|
|
89
97
|
|
|
@@ -119,11 +127,11 @@ Aggregate the files touched across all tasks (reusing Step 4's deduplicated tabl
|
|
|
119
127
|
Continue to Step 6.
|
|
120
128
|
|
|
121
129
|
5. **On fail** (any framework `f`: `e2e_outputs[f].status === 'failed'` OR `e2e_outputs[f].test_results.failed > 0`): build a failure summary from `e2e_outputs[*].test_results.failures[]` aggregated and grouped by `category`. Surface via `AskUserQuestion`:
|
|
122
|
-
- **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-
|
|
130
|
+
- **(a) Create fix-task in CHK-{NNN} (recommended)** — run `codebyplan task create` (CLI write-through; break-glass: MCP `create_task`) with `checkpoint_id=current_checkpoint_id`, `title="Fix checkpoint-level e2e failures (CHK-{NNN})"`, `requirements` containing the detailed failure breakdown (category counts, files involved, pages broken, screenshot paths from `e2e_outputs[*].screenshots[]`), AND `context: { source_checkpoint_id, e2e_failure_summary: { category_counts, pages_broken, screenshot_paths }, fix_type: "checkpoint_e2e" }` so downstream `cbp-round-planner` can verify failure premises. Per `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Resolve-in-Current-Scope by Default", checkpoint-level e2e failures absorb into the active checkpoint — not standalone.
|
|
123
131
|
- **(b) Surface as warning only — proceed to checkpoint-end** — append `| Checkpoint E2E | warning | N failures (deferred) |` to Step 5 QA Summary; continue to Step 6.
|
|
124
132
|
- **(c) Halt — review manually** — STOP and wait for the user.
|
|
125
133
|
|
|
126
|
-
See `cbp-
|
|
134
|
+
See `cbp-verify` reference `findings-presentation.md` "Infra Issue Absorption Contract — Infra-Class Issue Catalog" row "Checkpoint-level e2e failure" for the routing rationale.
|
|
127
135
|
|
|
128
136
|
### Step 6: User Discussion
|
|
129
137
|
|
|
@@ -87,7 +87,22 @@ This is the first identity-stamping point — when claiming, passing `worktree_i
|
|
|
87
87
|
|
|
88
88
|
Read `.codebyplan/git.json` `branch_config.production` (default `"main"`) as `BASE`. codebyplan repos are main-only — never create or branch from a `development`/integration branch.
|
|
89
89
|
|
|
90
|
-
|
|
90
|
+
**8.1 — Reuse the cloud-created branch when present.** When the repo is GitHub-connected, the CHK-207 `create-feat-branch` Edge Function fires on the Step 7 row INSERT, creates `feat/CHK-{NNN}-<slug>` on origin, and writes `branch_name` back to the checkpoint row. Creating a second, differently-slugged branch here orphans the cloud one — so re-read the row first:
|
|
91
|
+
|
|
92
|
+
```bash
|
|
93
|
+
sleep 5 # give the INSERT webhook a moment to write branch_name back
|
|
94
|
+
npx codebyplan sync 2>/dev/null || true
|
|
95
|
+
BRANCH=$(jq -r '.branch_name // empty' ".codebyplan/state/checkpoints/{checkpoint-id}.json" 2>/dev/null)
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
(Break-glass: MCP `get_checkpoints` and read the row's `branch_name`.) If `BRANCH` is non-empty, check out the existing remote branch and skip 8.2 entirely — do NOT push (it already exists on origin) and do NOT persist `--branch-name` (the Edge Function already recorded it):
|
|
99
|
+
|
|
100
|
+
```bash
|
|
101
|
+
git fetch origin "$BRANCH"
|
|
102
|
+
git checkout -b "$BRANCH" --track "origin/$BRANCH"
|
|
103
|
+
```
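The reuse-or-create decision in 8.1 can be sketched as a dry run that prints the commands rather than executing them. This is an illustration only: the function name `plan_branch_checkout` and the example branch name are hypothetical, and the real skill runs the git commands directly.

```shell
# Hypothetical dry-run helper: given the branch_name read back from the
# checkpoint row (possibly empty), print what the step would do.
plan_branch_checkout() {
  branch=$1
  if [ -n "$branch" ]; then
    # Cloud-created branch exists on origin: fetch and track it.
    # No push and no --branch-name write are needed in this case.
    printf 'git fetch origin "%s"\n' "$branch"
    printf 'git checkout -b "%s" --track "origin/%s"\n' "$branch" "$branch"
  else
    # Nothing was written back: fall through to local creation.
    echo "fallback: create the branch locally (8.2)"
  fi
}

plan_branch_checkout "feat/CHK-001-example"
plan_branch_checkout ""
```

Printing the plan first makes it easy to confirm which path a given checkpoint row would take before any branch is touched.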
|
|
104
|
+
|
|
105
|
+
**8.2 — Fallback: create the branch locally.** Only when `BRANCH` is empty (repo not GitHub-connected, or the webhook hasn't landed). Compute the slug deterministically:
|
|
91
106
|
|
|
92
107
|
```bash
|
|
93
108
|
SLUG=$(codebyplan slug "{checkpoint title}")
|
|
@@ -96,7 +96,11 @@ Runtime deployment for the base branch is handled in Step 7 by `/cbp-ship` (whic
|
|
|
96
96
|
|
|
97
97
|
### Step 7: Runtime Shipment via `/cbp-ship`
|
|
98
98
|
|
|
99
|
-
After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface
|
|
99
|
+
After branch promotion to main completes, invoke `/cbp-ship` to deploy every configured surface.
|
|
100
|
+
`/cbp-ship` reads `.codebyplan/cd.json` when present to inform per-surface deploy variant
|
|
101
|
+
selection (trigger, environment, approval gate, OIDC auth, credential env-var names). Repos
|
|
102
|
+
without `cd.json` fall back to filesystem surface detection — no behavior change. Run
|
|
103
|
+
`/cbp-setup-cd` to set up `cd.json` for a repo that has not yet migrated.
|
|
100
104
|
|
|
101
105
|
- Vercel auto-deploy verification
|
|
102
106
|
- Mobile shipment (asks user: skip / EAS internal TestFlight / EAS external TestFlight)
|