baldart 4.43.1 → 4.45.0

package/CHANGELOG.md CHANGED
@@ -5,6 +5,34 @@ All notable changes to BALDART will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [4.45.0] - 2026-06-15
9
+
10
+ **The main-repo-root (`$MAIN`) resolution across `worktree-manager` + `/prd` + `/new` is unified onto one correct, `separate-git-dir`-safe canonical — fixing two latent bugs while *refuting* the obvious "unify on `--git-common-dir`" fix the deferred plan proposed.** v4.42.0 deferred the `$MAIN` cleanup after an adversarial pass flagged it as a likely trap; this release does it properly. The root resolution lived in ~5 forms across three genuine execution contexts (main-checkout cwd, worktree cwd, the `allocate-id.sh` startup), and the deferred plan wanted to collapse them onto `git rev-parse --git-common-dir` + `/..` (the recipe already in `allocate-id.sh` `resolve_main()`). A 3-skeptic adversarial review **before implementation** demolished that plan with git experiments: (1) `--git-common-dir` + parent returns the *parent of the git dir*, which is the WRONG directory under `git init --separate-git-dir` (the git dir lives outside the working tree) — so the "canonical" recipe was itself fragile, and `resolve_main()` only worked because BALDART repos use in-tree `.git`; (2) the alleged `prd/SKILL.md` Step-1 bug ("`--show-toplevel` from inside a worktree") does **not** exist — Step 1 runs only at fresh kickoff on the main checkout, and resume reads the persisted `main_path`, so `--show-toplevel` is correct there *and* is the only form that survives `separate-git-dir`; (3) `git worktree list` head is also not `separate-git-dir`-safe (returns the git dir). The corrected design keeps the task's **context classification** but fixes the **primitive**: every site resolves the root through `--show-toplevel` (the true working-tree root in all cases), using `git -C .. rev-parse --show-toplevel` from a worktree (its parent `.worktrees/` lives inside the main repo). 
`resolve_main()` is rewritten as the canonical reference (detects a linked worktree via `git-dir != git-common-dir`, walks one level up) — **byte-identical output to the old form for in-tree repos** (verified), correct for `separate-git-dir`, and fails cleanly (`return 1`) instead of emitting a garbage path. Two real fragilities fixed along the way: the `/mw` `$MAIN` fallback (`--git-common-dir` + `/..` under `git -C`, relative-base hazard + `separate-git-dir` break) and the `3b` card-sync (`--show-superproject-working-tree || pwd`, which returned the SUPERPROJECT for a real submodule and only worked otherwise by a cwd accident). All recipes dogfooded across normal + `separate-git-dir` repos in every cwd context. **MINOR** (hardening + bug fixes across skills/agents; no removed surface, **no new `baldart.config.yml` key** ⇒ schema-change propagation rule N/A).
11
+
12
+ ### Fixed
13
+
14
+ - **`framework/.claude/skills/worktree-manager/scripts/allocate-id.sh` — `resolve_main()` rewritten `--show-toplevel`-based.** Now resolves the working-tree root (correct under `git init --separate-git-dir`, where `--git-common-dir` + parent pointed at the wrong directory), detects a linked worktree (`git-dir != git-common-dir`) and walks one level up. Identical output to the prior form for in-tree repos; returns non-zero instead of a garbage path when not in a git repo.
15
+ - **`framework/.claude/skills/worktree-manager/SKILL.md` — `/mw` step 4c `$MAIN` fallback** no longer derives the root via `git -C "$WORKTREE_PATH" rev-parse --git-common-dir` + `/..` (parent-of-git-dir, wrong under a separate git dir; appending `/..` to a possibly-relative `--git-common-dir` under `git -C` resolves against the wrong base). It `cd`s into the worktree then `git -C .. rev-parse --show-toplevel`, with a loud guard replacing the silent failure path.
16
+ - **`framework/.claude/skills/worktree-manager/SKILL.md` — `3b. Sync untracked cards`** drops the `git -C "$WORKTREE_PATH" --show-superproject-working-tree || pwd` form (returned the *superproject* for a real git submodule, where the cards do not live; only resolved otherwise by relying on cwd). At step 3b cwd is still the main checkout, so it now uses `git rev-parse --show-toplevel` — correct for normal, submodule, and `separate-git-dir` repos.
17
+
18
+ ### Changed
19
+
20
+ - **`framework/.claude/skills/worktree-manager/SKILL.md` — new `## Resolving the main repo root` section** documenting the three execution contexts, the correct primitive per context, and an explicit "do NOT use `--git-common-dir` + `/..`" rule (codifying the adversarial finding so the next maintainer doesn't re-attempt the refuted unification). The `/nw` step-1, nw-docs step-0, and `/nw` env-copy sites gain cross-ref comments; env-copy keeps `git -C .. rev-parse --show-toplevel` but replaces its silent `|| echo '../..'` fallback with a loud guard.
21
+ - **`framework/.claude/skills/prd/references/validation-phase.md` + `framework/.claude/agents/{prd,prd-card-writer}.md`** — the legacy `$MAIN` fallback and the descriptive `$MAIN` comments switch from `--git-common-dir` to the canonical `--show-toplevel` resolution (and point at the new section). `prd/SKILL.md` Step 1 is deliberately **unchanged** — `--show-toplevel` there is correct (refuted "bug").
22
+ - **`framework/.claude/skills/new/references/setup.md` — `$MAIN` resolution corrected**: from a worktree-invocation, resolve via the worktree's parent toplevel rather than `--show-superproject-working-tree` (a submodule primitive) or `git worktree list` (not `separate-git-dir`-safe).
23
+
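As a rough illustration of the `--show-toplevel`-based resolution described in this entry, the sketch below shows one way such a function might look. It is not the shipped `resolve_main()` from `allocate-id.sh`: the body and the local variable names are assumptions, and it relies on worktrees living at `<main>/.worktrees/<name>` as the entry states.

```bash
#!/usr/bin/env bash
# Sketch only (hypothetical body): resolve the main working-tree root
# through --show-toplevel, never through --git-common-dir + "/..".
resolve_main() {
  local top git_dir common_dir

  # Working-tree root of whatever checkout cwd is in; fail cleanly otherwise.
  top="$(git rev-parse --show-toplevel 2>/dev/null)" || return 1
  [ -n "$top" ] || return 1

  # Normalise both git dirs to absolute physical paths before comparing,
  # because rev-parse may print them relative to the current directory.
  git_dir="$(cd "$top" && cd "$(git rev-parse --git-dir)" && pwd -P)" || return 1
  common_dir="$(cd "$top" && cd "$(git rev-parse --git-common-dir)" && pwd -P)" || return 1

  if [ "$git_dir" != "$common_dir" ]; then
    # Linked worktree: its parent (.worktrees/) sits inside the main
    # checkout, so asking for the toplevel from there yields the main root.
    top="$(git -C "$top/.." rev-parse --show-toplevel 2>/dev/null)" || return 1
  fi

  printf '%s\n' "$top"
}
```

Called from the main checkout or from a linked worktree under `.worktrees/`, both invocations should print the same main working-tree root; outside a git repository the function returns non-zero and prints nothing.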
24
+ ## [4.44.0] - 2026-06-15
25
+
26
+ **The toolchain layer (v4.41.0) now reaches the *execution-side* gates too: the worktree baseline/merge builds and the classic `/new` reference suite stop hard-coding `npx tsc`/`npx eslint`/`npm run build` and run the consumer's configured `toolchain.commands.*` instead.** v4.41.0 wired the *review* gates (`qa-sentinel`, `coder`, the `/new`+`/new2` review workflows) through `toolchain.commands.*`, but a dozen gates that actually run a build/lint/typecheck were still hard-coded — so a Biome/Vitest consumer's worktree baseline silently ran `eslint`, and the classic `/new` per-card + final gates ran `npm run build` regardless of config. This release makes those gates toolchain-aware while preserving each site's existing **severity** policy (a configured command failing is the same STOP/continue verdict the default produced — the protocol's no-fallback rule governs only *which* command runs, never what to do with its exit code). The mechanism follows the **execution context**: a markdown **prose note** where a full model writes the gate command (the `/new` references, `/bug`, `/simplify`), and an inline **shell resolver** in `worktree-manager` — which has no `args.config` and whose baseline runs in a weak/background subagent, so it resolves `toolchain.commands.*` from `baldart.config.yml` on disk the same way it already resolves `git.trunk_branch`, rather than relying on a prose note a weak model could skip. The design was adversarially reviewed before implementation (3 skeptics), which corrected the initial plan on three points: (1) the `/mw` best-effort `npm run test 2>/dev/null || true` keeps its `|| true` swallow (never a STOP trigger); (2) the `/mw` changed-files lint scope is preserved as the default (a configured whole-tree command like `biome check .` runs by the consumer's contract); (3) two missed twins — `/bug` and `/simplify` verify steps — were added to scope. 
**Declared debt (intentionally untouched):** `npm install` stays hard-coded (there is no `toolchain.commands.install` key — adding one would be a 5-layer schema change; keeping this MINOR), `markdownlint` is not in the curated map, and the descriptive `(tsc+lint+build)` labels in `new2.js`/`setup.md` are not load-bearing (the `new2` projectBrief already injects the verbatim instruction — refuted by review). **MINOR** (additive: more gates honor the existing `features.has_toolchain` + `toolchain.commands.*`; no removed surface, **no new `baldart.config.yml` key** ⇒ schema-change propagation rule N/A).
27
+
28
+ ### Changed
29
+
30
+ - **`framework/.claude/skills/worktree-manager/SKILL.md` — gates toolchain-aware via an inline shell resolver.** New `## Toolchain-aware gates` convention + a Project-Context dependency on `features.has_toolchain` + `toolchain.commands.{typecheck,lint,build,test}`. Each gate block (`/nw` step 5 baseline, `/mw` step 2 pre-merge, step 4b rebase build, step 5 post-merge) inlines a `_tc()` resolver that reads the config from disk (mirroring the existing `git.trunk_branch`/`git.merge_strategy` grep idiom) and runs the configured command via `eval "${VAR:-<default>}"`, falling back to the hard-coded default when unset/flag-off. Three carve-outs documented: `npm install` is not a gate key, severity is unchanged, and the `/mw` best-effort `test` line keeps its `|| true` swallow.
31
+ - **`framework/.claude/skills/new/SKILL.md` — new `## Toolchain gates` core invariant** (cited as `§ "Toolchain gates"` by the reference modules, same pattern as `§ "Context economy"`): when `has_toolchain`, the mechanical gates use `toolchain.commands.<gate>` verbatim, a non-zero configured command is a real FAIL (no masking fallback), and `npm install`/`markdownlint` are not toolchain keys.
32
+ - **`framework/.claude/skills/new/references/{implement,review-cycle,completeness,final-review,commit,team-mode}.md` — gate sites cite `§ "Toolchain gates"`** so the classic `/new` per-card (Phase 1 verify, Phase 2.55/2.5x re-runs, Phase 3 doc re-verify, completeness gap re-run, commit pre-check), team-mode (per-card coder briefing, group build, group simplify re-run), and final-review build all resolve `toolchain.commands.*` verbatim when the flag is on.
33
+ - **`framework/.claude/skills/{bug,simplify}/SKILL.md` — standalone verify steps toolchain-aware** (the two twins surfaced by the adversarial scope review): `/bug` PHASE 5 regression checks and `/simplify` Step 5 verify gain a self-contained Toolchain note + a Project-Context dependency on `features.has_toolchain` + `toolchain.commands.{lint,typecheck,test}`.
34
+ - **`framework/agents/toolchain-protocol.md` — Consumers section updated** to record the execution-side gates and the two resolution mechanisms (prose note vs on-disk shell resolver), and to restate the `npm install`/`markdownlint` carve-outs.
35
+
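The on-disk resolver idea above can be sketched as follows. This is an assumption-heavy illustration, not the shipped `_tc()`: the function body is hypothetical, and it assumes `baldart.config.yml` contains a `has_toolchain: true` line plus an indented `<gate>: <command>` line under `toolchain.commands`. It uses a naive `grep`/`sed` match rather than a YAML parser, so a same-named key in another section would also match.

```bash
# _tc <gate> [config-file]
# Prints the configured command for a gate, or nothing when the flag is off,
# the file is missing, or the key is unset. Hypothetical sketch.
_tc() {
  local gate="$1" cfg="${2:-baldart.config.yml}"

  [ -f "$cfg" ] || return 0

  # Flag off or missing: print nothing so the caller uses its default.
  grep -Eq '^[[:space:]]*has_toolchain:[[:space:]]*true[[:space:]]*$' "$cfg" || return 0

  # Take the first indented "<gate>: value" line, then strip trailing
  # whitespace and one pair of surrounding quotes.
  sed -n "s/^[[:space:]]\{1,\}${gate}:[[:space:]]*//p" "$cfg" \
    | head -n 1 \
    | sed -e 's/[[:space:]]*$//' -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}
```

A gate block could then run `LINT_CMD="$(_tc lint)"` followed by `eval "${LINT_CMD:-npm run lint}"`, which matches the `eval "${VAR:-<default>}"` fallback shape the entry describes.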
8
36
  ## [4.43.1] - 2026-06-15
9
37
 
10
38
  **The npm publish workflow is now race-safe: pushing several `v*.*.*` tags close together can no longer leave `latest` on a lower version.** Publishing v4.41.0/4.42.0/4.43.0 together exposed the bug — the three tag-push workflow runs executed in **parallel**, and `npm publish` (with no explicit dist-tag) points `latest` at whichever version finishes **last chronologically**, not the highest. Real outcome: `latest` landed on **4.42.0**, so `npx baldart` installed a non-latest version (fixed by hand with `npm dist-tag add baldart@4.43.0 latest`). This release closes the race two ways: (1) a workflow-level **`concurrency` group** (`group: publish-npm`, `cancel-in-progress: false`) serializes all publish runs so they never race on the dist-tag and a publish is never cancelled; (2) a new final step **reconciles `latest` to the highest published version** — it computes the max stable semver across `npm view baldart versions` (unioned with the just-published `VERSION` to survive registry propagation lag, numeric per-segment compare so `4.10.0 > 4.9.0`) and repoints `latest` only when it has drifted. Because the runs are serialized, the last run always leaves `latest` on the maximum regardless of push order. The existing "Verify tag matches VERSION" guard is untouched. **PATCH** (CI/release-machinery bugfix, no change to installed surface or `baldart.config.yml` ⇒ schema-change propagation rule N/A).
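The reconcile step described above boils down to a numeric per-segment maximum over stable versions. A minimal sketch follows, assuming the version list arrives one per line; the function names `max_stable_semver` and `reconcile_latest` are hypothetical, and the second function only prints the `npm dist-tag` command instead of running it.

```bash
# Read versions on stdin (one per line) and print the highest stable X.Y.Z.
# Pre-release strings such as 4.10.0-beta.1 are filtered out.
max_stable_semver() {
  grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' \
    | sort -t. -k1,1n -k2,2n -k3,3n \
    | tail -n 1
}

# reconcile_latest <current-latest> <published-list> <just-published>
# Dry run: prints the repoint command only when `latest` has drifted.
# The just-published version is unioned in to survive registry lag.
reconcile_latest() {
  local current="$1" published="$2" just_published="$3" max
  max="$(printf '%s\n%s\n' "$published" "$just_published" | max_stable_semver)"
  if [ -n "$max" ] && [ "$current" != "$max" ]; then
    echo "npm dist-tag add baldart@$max latest"
  fi
}
```

Sorting each dot-separated field numerically is what keeps `4.10.0` above `4.9.0`; a plain lexical sort would order them the other way round.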
package/VERSION CHANGED
@@ -1 +1 @@
1
- 4.43.1
1
+ 4.45.0
@@ -311,7 +311,8 @@ applies even when N=1.
311
311
  worktree on this machine (lock + shared high-water mark under `.worktrees/`):
312
312
 
313
313
  ```bash
314
- # $MAIN is the main repo root: dirname of `git rev-parse --git-common-dir`.
314
+ # $MAIN is the persisted main repo root (see worktree-manager § "Resolving the
315
+ # main repo root" — via --show-toplevel, NOT --git-common-dir).
315
316
  ALLOC="$MAIN/.claude/skills/worktree-manager/scripts/allocate-id.sh"
316
317
  if [ -x "$ALLOC" ]; then
317
318
  N="$("$ALLOC" reserve FEAT "<slug>")" # e.g. prints 0024
@@ -504,7 +504,9 @@ parallelism (concurrent sessions on sibling worktrees both pick the same "next"
504
504
  integer and conflict at merge):
505
505
 
506
506
  ```bash
507
- # $MAIN = dirname of `git rev-parse --git-common-dir` (shared across worktrees).
507
+ # $MAIN = the persisted main repo root (the same value SKILL.md Step 1 wrote; see
508
+ # worktree-manager § "Resolving the main repo root" — resolved via --show-toplevel,
509
+ # NOT --git-common-dir, which breaks under git init --separate-git-dir).
508
510
  ALLOC="$MAIN/.claude/skills/worktree-manager/scripts/allocate-id.sh"
509
511
  [ -x "$ALLOC" ] && N="$("$ALLOC" reserve FEAT "<slug>")" # also BUG | UI | DOC | PERF
510
512
  ```
@@ -17,7 +17,7 @@ Argument: optional bug description (e.g., `/bug feature X is not saving`).
17
17
 
18
18
  ## Project Context
19
19
 
20
- **Reads from `baldart.config.yml`:** `paths.design_system`, `paths.references_dir`, `paths.wiki_dir` (Phase 0 `rg` lookup target).
20
+ **Reads from `baldart.config.yml`:** `paths.design_system`, `paths.references_dir`, `paths.wiki_dir` (Phase 0 `rg` lookup target), `features.has_toolchain` + `toolchain.commands.{lint,typecheck,test}` (the PHASE 5 regression gates run those verbatim when the flag is on).
21
21
  **Gated by features:** `features.has_design_system` (when `true`, Phase 4's design-system reads become BLOCKING for UI-touching bugs).
22
22
  **Overlay:** loads `.baldart/overlays/bug.md` if present — project-specific debug entry points (e.g. SWR debug switches, env summary helpers, error-code modules). The base skill stays generic; project-specific code paths live in the overlay.
23
23
  **On missing/empty keys:** ask the user; do not assume defaults. See `framework/agents/project-context.md` § 3.
@@ -202,6 +202,8 @@ If the project exposes an env-summary helper (listed in `.baldart/overlays/bug.m
202
202
 
203
203
  ## PHASE 5: VERIFY & CLEAN UP
204
204
 
205
+ > **Toolchain:** when `features.has_toolchain: true` in `baldart.config.yml`, run `toolchain.commands.{typecheck,lint,test}` **verbatim** instead of the defaults below — per `framework/agents/toolchain-protocol.md`. Empty/absent key (or flag off) → the default. A configured command that exits non-zero is a real FAIL (do not fall back).
206
+
205
207
  1. Reproduce original scenario — confirm fix
206
208
  2. Check regressions: `npx tsc --noEmit` + `npx eslint --max-warnings=0 <changed-files>`
207
209
  3. If tests exist: `npm run test`
@@ -243,6 +243,24 @@ baselines. Keep that bulk on disk and pass **paths**, not bodies.
243
243
 
244
244
  ---
245
245
 
246
+ ## Toolchain gates
247
+
248
+ When `features.has_toolchain: true` in `baldart.config.yml`, every mechanical gate
249
+ this skill runs (`lint`, `typecheck`, `test`, `build`) uses the consumer's
250
+ configured command from `toolchain.commands.<gate>` **verbatim** instead of the
251
+ `npm`/`npx` default shown at the call site — per `framework/agents/toolchain-protocol.md`.
252
+ Resolve per gate: a non-empty `toolchain.commands.{lint,typecheck,test,build}` wins;
253
+ empty/absent (or the flag off/missing) → the default at the call site, identical to
254
+ pre-toolchain behavior. A configured command that exits non-zero is a real gate
255
+ **FAIL** — do NOT then run the default (that would mask the failure); fall back only
256
+ when the key is **unset**. `npm install` and `markdownlint` are NOT toolchain keys
257
+ (there is no `commands.install`, and markdownlint is not in the curated map) — they
258
+ stay as written. The gate-output discipline (§ "Context economy" → redirect to disk,
259
+ surface only the exit code) applies to the resolved command exactly as to the default.
260
+ Reference modules cite this section as `§ "Toolchain gates"`.
261
+
262
+ ---
263
+
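The resolution and no-masking rules in this section can be sketched as a small helper. This is only an illustration under assumptions: `run_gate` and its argument order are hypothetical, and the caller is assumed to have already read the configured command, passing an empty string when the key is unset or the flag is off.

```bash
# run_gate <name> <configured-cmd> <default-cmd> <output-file>
# Hypothetical helper: a non-empty configured command wins; the default is
# used only when the configured one is empty. The default is never run
# as a second attempt after a configured command fails.
run_gate() {
  local name="$1" configured="$2" default="$3" out="$4" cmd rc=0

  cmd="${configured:-$default}"

  # Gate-output discipline: send everything to disk, surface only the code.
  eval "$cmd" > "$out" 2>&1 || rc=$?

  echo "$name:$rc"
  return "$rc"
}
```

For example, `run_gate lint "$LINT_CMD" "npm run lint" /tmp/lint.txt` prints `lint:0` on success; when the configured command fails, its exit code is reported as is and the default is not consulted.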
246
264
 
247
265
  ## Routing — the per-card pipeline and the on-demand modules
248
266
 
@@ -250,9 +268,9 @@ baselines. Keep that bulk on disk and pass **paths**, not bodies.
250
268
  system, re-read on EVERY turn. To avoid paying 60k+ tokens of phase instructions on
251
269
  every turn, the step-by-step detail of each phase lives in a **`references/<x>.md` module**
252
270
  loaded on demand. This file (the core) keeps only the cross-phase invariants
253
- (Context Tracking, Progress Visibility, § "Context economy", Agent Routing, QA
254
- Profile, Trivial fast-lane, Risk-signal detector, Fix Application Log) + this
255
- navigation map.
271
+ (Context Tracking, Progress Visibility, § "Context economy", § "Toolchain gates",
272
+ Agent Routing, QA Profile, Trivial fast-lane, Risk-signal detector, Fix Application
273
+ Log) + this navigation map.
256
274
 
257
275
  > **HARD RULE — read the module BEFORE executing the phase.** When you enter a
258
276
  > phase, **Read** its `references/<x>.md` module and then execute it. Executing a
@@ -260,9 +278,9 @@ navigation map.
260
278
  > compaction that evicted it) is a protocol violation: reload it. Record
261
279
  > in the tracker, under `## Current Card`, the field `phase_module_loaded: <module>` at
262
280
  > load time, so that § "Context recovery protocol" knows what to re-read after a
263
- > compaction. The `§ "..."` references cited by the modules (Context economy, Context Tracking,
264
- > Trivial-card fast-lane, Risk-signal detector, Fix Application Log) point to sections
265
- > that live **here in the core** → they always resolve.
281
+ > compaction. The `§ "..."` references cited by the modules (Context economy, Toolchain gates, Context
282
+ > Tracking, Trivial-card fast-lane, Risk-signal detector, Fix Application Log) point to
283
+ > sections that live **here in the core** → they always resolve.
266
284
 
267
285
  **Sequence (for each card, in order — per-card modules are loaded as they are traversed):**
268
286
 
@@ -6,7 +6,7 @@
6
6
 
7
7
  > Sequential-mode global step numbering resumes here at 26 (Phase 3.5 ended at 25; Phase 3.7 used its own local C.0–C.6 counter). The tracker phase-string `4-commit` therefore maps to step 26, NOT a second step 25.
8
8
 
9
- 26. **Update tracker**: phase = "4-commit". **Entry assertion** — before committing, verify the Phase 3.7 e2e re-run obligation was honored: read the tracker for `e2e-rerun: triggered` / `e2e-rerun: not-needed`. If Phase 3.7 touched UI files but no `e2e-rerun` entry exists, do NOT commit yet — go run the re-run per Phase 3.7 step 6 first. Also confirm Phase 3.5/3.7 fixes did not leave lint/tsc broken: if the Phase 3.7 fix sub-loop applied any patch, run `npm run lint` + `npx tsc --noEmit` (when typescript) once before committing (redirect to disk per § "Context economy").
9
+ 26. **Update tracker**: phase = "4-commit". **Entry assertion** — before committing, verify the Phase 3.7 e2e re-run obligation was honored: read the tracker for `e2e-rerun: triggered` / `e2e-rerun: not-needed`. If Phase 3.7 touched UI files but no `e2e-rerun` entry exists, do NOT commit yet — go run the re-run per Phase 3.7 step 6 first. Also confirm Phase 3.5/3.7 fixes did not leave lint/tsc broken: if the Phase 3.7 fix sub-loop applied any patch, run `npm run lint` + `npx tsc --noEmit` (when typescript) once before committing — when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck}` verbatim (§ "Toolchain gates") — (redirect to disk per § "Context economy").
10
10
  27. Stage and commit **all changes together** in the worktree using format `[CARD-ID] Brief description` (MUST per AGENTS.md). Include all relevant files — implementation, review fixes, QA-driven fixes, and doc updates in a single commit. Do NOT merge or push yet — that happens post-batch.
11
11
  - **IMPORTANT — explicit staging**: NEVER use `git add -A` or `git add .`. Always stage files by explicit name:
12
12
  ```bash
@@ -118,7 +118,7 @@ Before triggering any review, you MUST verify that the coder agent implemented *
118
118
  - The exact list of unimplemented items (copy the checklist rows)
119
119
  - The file-ownership restrictions from `## File Ownership Map`
120
120
  - The instruction: "Implement ONLY these missing items. Do not refactor or expand scope."
121
- - After the fix agent completes, re-run the static gates the fix could have broken — `npm run lint`, `npx tsc --noEmit` (when `stack.language` includes typescript), and `npm test` — not just build + lint (a gap-fix can introduce a type error or break a test that the earlier Phase 2 gate had passed). Redirect each to `/tmp/<gate>-<CARD-ID>.txt` per § "Context economy" (never inline).
121
+ - After the fix agent completes, re-run the static gates the fix could have broken — `npm run lint`, `npx tsc --noEmit` (when `stack.language` includes typescript), and `npm test` (when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck,test}` verbatim § "Toolchain gates") — not just build + lint (a gap-fix can introduce a type error or break a test that the earlier Phase 2 gate had passed). Redirect each to `/tmp/<gate>-<CARD-ID>.txt` per § "Context economy" (never inline).
122
122
  - Re-verify each fixed item against the code — do NOT trust the agent's self-report.
123
123
  - Repeat this sub-loop up to **2 times** (per-item budget, shared with step 0 — see "Loop-counter scope"). After 2 loops, if items remain Partial or Missing:
124
124
  - Log in `## Issues & Flags`: list each unimplemented requirement.
@@ -258,7 +258,7 @@ that is a **gate violation**: log it as
258
258
  - **`security`-domain findings** (path in `paths.high_risk_modules`, or RLS-policy SQL) → route to **security-reviewer** in write mode (canonical writer map v4.26.1 — it owns the security-invariant contract a coder lacks; NEVER route security fixes to coder). **`migration`-domain findings** (SQL under the migrations dir) → route to **coder**. For both, apply the Sub-agent failure protocol's STOP-on-crash rule (never inline-fallback on a security/migration fix). These are NOT collapsed into a generic "everything else" bucket.
259
259
  - **All remaining findings** (other code, perf, test) → invoke the **coder** agent once to apply them in a single pass.
260
260
  Run in the order doc-reviewer → security-reviewer → coder (skip any whose partition is empty). Pass only the verified findings, not false positives.
261
- 12. Run final build: `npm run lint && npx tsc --noEmit && npm run build` (redirect each to `/tmp/final-<gate>.txt` per § "Context economy"; surface only exit code + bounded extract on failure).
261
+ 12. Run final build: `npm run lint && npx tsc --noEmit && npm run build` (when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck,build}` verbatim — § "Toolchain gates"; redirect each to `/tmp/final-<gate>.txt` per § "Context economy"; surface only exit code + bounded extract on failure).
262
262
  If any check fails, apply self-healing retry loop (up to 3 times).
263
263
  13. **Update tracker** with final review results:
264
264
  - Review engine: Codex (a non-Anthropic frontier model, resolved at runtime by `codex-companion.mjs`) (primary) | Claude code-reviewer (fallback)
@@ -335,6 +335,7 @@
335
335
  ```
336
336
 
337
337
  8. **Run the verification gates and CAPTURE their output to disk** (so step 9 can pass it to a fix agent) — **redirect, never `tee`/stream inline** (per § "Context economy" → Gate-output discipline). Each is its own gate:
338
+ When `features.has_toolchain: true`, substitute each command below with `toolchain.commands.{lint,typecheck,test,build}` run verbatim (§ "Toolchain gates"; defaults shown are the fallback when a key is unset).
338
339
  ```bash
339
340
  cd <worktree-path>
340
341
  npm run lint > /tmp/lint-<CARD-ID>.txt 2>&1; echo "lint:$?"
@@ -117,7 +117,7 @@ After completeness is verified, clean up the implementation before it reaches re
117
117
 
118
118
  **Telemetry (Fix Application Log)** — for EVERY finding (valid OR skipped) append one row to the tracker's `## Fix Application Log` section per the schema above. Use `domain=simplify-{reuse|quality|efficiency}` matching the originating agent. Include the `severity` trailing key. Inline: `decision=inline | applied_by=orchestrator | est_lines=<n> | severity=<HIGH|MEDIUM> | finding=<1-line>`. Delegated (domain-override): `decision=<coder|doc-reviewer> | applied_by=<coder|doc-reviewer> | est_lines=<n> | severity=<...> | finding=<1-line>`. Skipped: `decision=skipped | applied_by=- | est_lines=0 | reason=<false-positive|not-worth-addressing>`.
119
119
 
120
- 5. After all fixes, run `npm run lint` and `npx tsc --noEmit` to confirm nothing broke (redirect to disk per § "Context economy"; surface only exit code + a bounded extract on failure).
120
+ 5. After all fixes, run `npm run lint` and `npx tsc --noEmit` (when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck}` verbatim — § "Toolchain gates") to confirm nothing broke (redirect to disk per § "Context economy"; surface only exit code + a bounded extract on failure).
121
121
  If either fails, fix the regression (up to **2 retries**). **If it still fails after 2 retries**: do NOT silently continue to Phase 2.6 with a broken tree — log the failure in `## Issues & Flags` as `[SIMPLIFY-REGRESSION]` and invoke `AskUserQuestion` (revert the simplify fixes / keep and have me fix manually / stop the card), mirroring the Phase 3.5 escalation.
122
122
 
123
123
  6. **Update tracker**: phase = "2.55-simplify DONE", log count of fixes applied (or "clean — 0 fixes").
@@ -288,7 +288,7 @@ skill's Phase 1 falls back to deriving Gherkin scenarios from
288
288
  Doc-reviewer applies all doc-domain fixes itself. The orchestrator does NOT spawn a coder for doc fixes (since v3.40.0 — `doc` is owned by `doc-reviewer`, see "Domain-Override Domains"). The only doc-reviewer output that leaves this phase unfixed is a **doc-drift→bug finding rooted in CODE** (the implementation contradicts a documented contract). Route it explicitly: if the conflicting code file matches the `security` Domain-Override match rule (`paths.high_risk_modules`) → spawn `security-reviewer` with the finding now, in this phase (a security-class code fix is not deferrable to a `light` Phase 3.7, and security is owned by `security-reviewer` — never a coder); otherwise carry the finding into the Phase 3.7 `/codexreview` input as a known code-drift bug and let the Phase 3.7 fix sub-loop apply it. Either way, append a Fix Application Log row with `domain=codex-correctness` (NOT `doc`) so telemetry attributes it as a code fix. Do NOT leave it accumulating in the tracker with no fix owner.
289
289
  14. **Knowledge-corpus sync (OPTIONAL — only if the project ships a corpus-sync agent)**: There is NO shipped `obsidian-sync` agent — do NOT dispatch one (a hard dispatch to a non-existent subagent fails silently). Only when the project provides its own knowledge-corpus sync agent (declared in `.baldart/overlays/new.md`) AND doc-reviewer's findings indicate a corpus impact, invoke that agent with the listed paths after the doc fixes are applied. Otherwise skip with a one-line notice (`knowledge-corpus sync: skipped (no corpus-sync agent configured)`). Non-blocking either way.
290
290
  15. **Telemetry** — after doc-reviewer returns, append one row per doc finding to `## Fix Application Log`: `3 | doc | est_lines=<n> | decision=doc-reviewer | applied_by=doc-reviewer | finding=<1-line>`. If 0 findings, append one row: `3 | doc | est_lines=0 | decision=skipped | applied_by=- | reason=no-findings`. **Phase-8 producer (named counter)** — ALSO record the per-card doc-gap counts as a structured line in `## Current Card` (carried into `## Completed Cards` at Phase 5): `doc_gaps: found=<N> fixed=<M>` where `N` = total doc findings doc-reviewer raised and `M` = those it applied. This is the single named producer for Phase 8's `doc_gaps_found` / `doc_gaps_fixed` fields — without it those fields have no upstream write and Phase 8 would hard-code zeros. (D.4a is the team-mode producer of the same counter — see Phase 7 § D.4a.)
291
- 16. Run `npm run lint` and `npx tsc --noEmit` (when `stack.language` includes typescript) to verify nothing broke (redirect to disk per § "Context economy"). If doc-reviewer touched any source-adjacent file (a `.ts`/`.tsx` helper, a co-located doc export), also run `npm run build`. If any check fails, apply the self-healing retry loop (up to 3 times, no user prompt). **If still failing after 3 retries**: do NOT fall through silently to Phase 3.5 — log `[DOC-PHASE-REGRESSION]` in `## Issues & Flags` and invoke `AskUserQuestion` (revert the doc-phase edits that broke the build / keep and fix manually / stop the card).
291
+ 16. Run `npm run lint` and `npx tsc --noEmit` (when `stack.language` includes typescript) — when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck,build}` verbatim (§ "Toolchain gates") — to verify nothing broke (redirect to disk per § "Context economy"). If doc-reviewer touched any source-adjacent file (a `.ts`/`.tsx` helper, a co-located doc export), also run `npm run build`. If any check fails, apply the self-healing retry loop (up to 3 times, no user prompt). **If still failing after 3 retries**: do NOT fall through silently to Phase 3.5 — log `[DOC-PHASE-REGRESSION]` in `## Issues & Flags` and invoke `AskUserQuestion` (revert the doc-phase edits that broke the build / keep and fix manually / stop the card).
292
292
  17. **Telemetry for the step-16 self-heal** — if the retry loop spawned any fix (a code edit to recover from a doc-phase regression), append a Fix Application Log row for it AFTER the loop settles (the step-15 doc telemetry row was written before this loop ran, so it does not capture step-16 fixes). Then update tracker: phase = "3-doc-review DONE", log doc findings count, fixes applied.
293
293
  If doc-reviewer found a recurring gap, append 1-line to `## Lessons Learned`:
294
294
  `DOC: <pattern>`
@@ -15,7 +15,7 @@
15
15
  - Resolve `$TRUNK` = `git.trunk_branch` from `baldart.config.yml`. **When the key is absent** (consumer updated to ≥4.0.0 without re-running `configure`), do NOT hard-assume `develop` — autodetect the repo's real default branch exactly as worktree-manager does, so `/new` and `nw` agree on the base: `git -C "$MAIN" symbolic-ref --quiet refs/remotes/origin/HEAD` (strip `refs/remotes/origin/`), else the first existing local branch among `develop` / `main` / `master`. A `main`-trunk repo defaulted to `develop` here would diverge from the worktree base `nw` picks and break every `git diff "$TRUNK...HEAD"` gate. Only HALT ("Trunk branch unresolved — run `npx baldart configure`") if nothing resolves. Persist the resolved value as `Trunk branch:` in the tracker `## Worktree` section. **Every later Phase 0 / Phase 6c bash snippet that references the integration trunk MUST use `$TRUNK`, never a baked-in `develop`.** Begin every later consumer with a guard: if `$TRUNK` is empty → HALT with "Trunk branch unresolved — re-read `git.trunk_branch` from the tracker".
16
16
  - Resolve `$METRICS` = `paths.metrics` from `baldart.config.yml` (default `docs/metrics`). This is the framework-owned telemetry directory written by Phase 8 (tracker archive, `skill-runs.jsonl`, `sessions/`). The dirty-tree gate (step 3) reads `$METRICS` to recognise — and never surface — its own telemetry output. Persist it as `Metrics dir:` in the tracker `## Worktree` section.
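The trunk autodetection described in the `$TRUNK` bullet can be sketched roughly as below. The function name `resolve_trunk` is hypothetical, and the sketch covers only the fallback path used when `git.trunk_branch` is absent from the config; reading the key itself and persisting the result to the tracker are left out.

```bash
# resolve_trunk <main-repo-path>
# Hypothetical sketch of the fallback: prefer the remote's default branch,
# else the first existing local branch among develop / main / master.
resolve_trunk() {
  local main="$1" ref b

  if ref="$(git -C "$main" symbolic-ref --quiet refs/remotes/origin/HEAD 2>/dev/null)"; then
    # Strip the refs/remotes/origin/ prefix to get the bare branch name.
    printf '%s\n' "${ref#refs/remotes/origin/}"
    return 0
  fi

  for b in develop main master; do
    if git -C "$main" show-ref --verify --quiet "refs/heads/$b"; then
      printf '%s\n' "$b"
      return 0
    fi
  done

  # Nothing resolved: the caller should HALT rather than assume a branch.
  return 1
}
```

A caller might use `TRUNK="$(resolve_trunk "$MAIN")" || { echo "Trunk branch unresolved"; exit 1; }` so that an empty `$TRUNK` never reaches a later `git diff "$TRUNK...HEAD"` gate.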
17
17
 
18
- 1. **Resolve `$MAIN`** — the absolute path of the main repo (not a worktree). If `/new` was invoked from inside a worktree, walk up to the parent repo via `git rev-parse --show-superproject-working-tree` or `git worktree list` until you find the non-worktree root. Persist as `Main repo:` in the tracker `## Worktree` section. **Write `$MAIN` to the tracker the moment it is computed** — every later consumer (Phase 6c, Phase 6b) MUST re-read it from the tracker and HALT with "`$MAIN` absent from tracker" if the field is missing or empty, never silently use an undefined `$MAIN` (it does not survive context compaction).
18
+ 1. **Resolve `$MAIN`** — the absolute path of the main repo (not a worktree). Use `git rev-parse --show-toplevel` when cwd is the main checkout. If `/new` was invoked from **inside a worktree**, that returns the worktree, so resolve the main root from the worktree's parent instead: `cd "$(git rev-parse --show-toplevel)/.." && git rev-parse --show-toplevel` (the worktree lives at `<main>/.worktrees/<name>`). Do **not** use `--git-common-dir` + `/..` (wrong under `git init --separate-git-dir`) or `git worktree list` (its main entry is the git dir, not the working tree, under a separate git dir). This is the same contract as worktree-manager § "Resolving the main repo root". Persist as `Main repo:` in the tracker `## Worktree` section. **Write `$MAIN` to the tracker the moment it is computed** — every later consumer (Phase 6c, Phase 6b) MUST re-read it from the tracker and HALT with "`$MAIN` absent from tracker" if the field is missing or empty, never silently use an undefined `$MAIN` (it does not survive context compaction).
19
19
 
20
20
  1b. **Migration Gate (BLOCKING only when a migration is *declared* — else a silent no-op)** — resolve DB migrations interactively **before** the worktree exists, so the schema is live before any card builds against it. *Why this exists*: a migration applied to a shared/remote DB is owner-gated, so without this gate it is deferred to the END of the batch — and every downstream card in the batch is then built and verified against a schema that is not yet live (`validation_commands` / QA / E2E / DB-generated `tsc` types fail falsely → those cards cascade into deferral/blocked). Front-loading the migration removes that root cause. **The declaration lives in the EPIC card** (`migration_plan` block — project-specific, authored by the user, typically via the `.baldart/overlays/new.md` overlay). Steps:
21
21
 
@@ -105,6 +105,7 @@ Agent tool call:
105
105
  a) Print the numbered requirements checklist (anti-skip measure)
106
106
  b) Implement ALL requirements
107
107
  c) Run: npx tsc --noEmit && npx eslint --max-warnings=0 <your-files>
108
+ (when toolchain.commands.{typecheck,lint} are configured, run those verbatim instead — per agents/toolchain-protocol.md)
108
109
  d) Self-heal up to 3 times if checks fail
109
110
  e) Verify completeness: for each requirement, confirm code exists (read it)
110
111
  f) If any requirement is missing after implementation, implement it now
@@ -151,7 +152,7 @@ For each completed agent:
151
152
 
152
153
  After ALL agents in the group complete successfully:
153
154
 
154
- 1. **D.1 — Build verification (group)** — Run `npm run build` in the worktree to verify combined changes compile (redirect to `/tmp/build-group.txt` per § "Context economy"; surface only exit code + bounded extract on failure). If build fails, identify which card's changes broke it (from `git diff --name-only` per card), spawn a targeted fix-coder for those files only.
155
+ 1. **D.1 — Build verification (group)** — Run `npm run build` (when `has_toolchain`, the configured `toolchain.commands.build` verbatim — § "Toolchain gates") in the worktree to verify combined changes compile (redirect to `/tmp/build-group.txt` per § "Context economy"; surface only exit code + bounded extract on failure). If build fails, identify which card's changes broke it (from `git diff --name-only` per card), spawn a targeted fix-coder for those files only.
155
156
 
156
157
  1.5. **D.1.5 — Effective per-card review profile (compute ONCE; drives D.2 + D.4b)** — For EACH card in the group, compute its **effective codex profile** with the SAME deterministic rule the sequential Phase 3.7 Step C uses, so the two paths never disagree:
157
158
  - **Floor**: read the card's `review_profile` field (`skip`/`light`/`balanced`/`deep`) per the QA Profile Selector (fallback-computed only for legacy cards lacking the field).
@@ -242,7 +243,7 @@ After ALL agents in the group complete successfully:
242
243
 
243
244
  3a. **D.3a — Phase 2.5b AC-Closure Gate (per-card, BLOCKING — non-skippable)** — For EACH card in the group, **sequentially**, invoke the full Phase 2.5b gate as documented in `### Phase 2.5b — AC-Closure Gate (BLOCKING — Scope Closure Discipline)`. This includes: build the AC Closure Ledger from the card YAML, run the rationalization scan, invoke `AskUserQuestion` one-per-deferred-AC, run the `implementation_notes` deferral audit, and persist the ledger in the tracker. Until EVERY card in the group exits PASS, do NOT proceed to D.3b. Cards exiting with `not_implemented` ACs that the user routes to "Implementa adesso" must finish their fix-coder loop and re-pass the gate before D.3b starts for the next card. Log under `## AC Closure Ledger — <CARD-ID>` per card.
244
245
 
245
- 3b. **D.3b — Phase 2.55 Simplify (per-card, FANNED OUT across the group)** — The Simplify agents are **read-only analysis on file-disjoint per-card diffs** (the orchestrator applies the fixes afterward), so there is NO reason to run them one card at a time. **Spawn the per-card Simplify analysis for ALL eligible cards in PARALLEL** — in a SINGLE message, fire each card's Phase 2.55 trio (Reuse / Quality / Efficiency) against that card's diff captured to `/tmp/diff-<CARD-ID>.txt` (per Phase 2.55 step 2 — pass each trio the **path**, scoped to the card's File Ownership Map; never inline the diff). Per-card (not group-aggregate) so findings stay attributable. When all analyses return, **apply fixes per card** (file-disjoint → no write conflict), then re-run `npm run lint` and `npx tsc --noEmit` on the worktree ONCE for the whole group (redirect to disk per § "Context economy"). (Concurrency is capped by the platform; passing N cards is safe — excess agents queue.)
246
+ 3b. **D.3b — Phase 2.55 Simplify (per-card, FANNED OUT across the group)** — The Simplify agents are **read-only analysis on file-disjoint per-card diffs** (the orchestrator applies the fixes afterward), so there is NO reason to run them one card at a time. **Spawn the per-card Simplify analysis for ALL eligible cards in PARALLEL** — in a SINGLE message, fire each card's Phase 2.55 trio (Reuse / Quality / Efficiency) against that card's diff captured to `/tmp/diff-<CARD-ID>.txt` (per Phase 2.55 step 2 — pass each trio the **path**, scoped to the card's File Ownership Map; never inline the diff). Per-card (not group-aggregate) so findings stay attributable. When all analyses return, **apply fixes per card** (file-disjoint → no write conflict), then re-run `npm run lint` and `npx tsc --noEmit` (when `has_toolchain`, the configured `toolchain.commands.{lint,typecheck}` verbatim — § "Toolchain gates") on the worktree ONCE for the whole group (redirect to disk per § "Context economy"). (Concurrency is capped by the platform; passing N cards is safe — excess agents queue.)
246
247
  - **Gate (enumerated, `TRIVIAL_CARDS`-driven)**: SKIP D.3b for a card in **`TRIVIAL_CARDS`** (the set already computed at D.1.5 — `review_profile == skip` AND 0 Step-A triggers AND **non-source diff**), aligning team mode with sequential Phase 2.55's `IS_TRIVIAL` re-confirmation on the ACTUAL diff and with team-mode's own D.1.5 SSOT. A trivial card has no substantive diff to simplify, and Simplify is quality-only (no merge-gate coverage to lose). Log `simplify: SKIPPED (trivial — non-source diff)`. **A card with `review_profile == skip` whose committed diff DID touch a source file is NOT in `TRIVIAL_CARDS` → run D.3b for it** (exactly as sequential 2.55 does — `skip` is the floor, the real diff is the deciding check). For `light`/`balanced`/`deep` cards D.3b runs unchanged. This is the ONLY enumerated skip — never skip D.3b "for time" on a non-trivial card. (Skipped cards are simply omitted from the parallel fan-out.)
247
248
 
248
249
  3c. **D.3c — Phase 2.6 E2E-Review (per-card)** — First, evaluate the existing Gate table for EVERY card at once (skip when `features.has_e2e_review: false`, backend-only diff per the diff predicate documented in Phase 2.6, or card type in the Phase 2.6 skip set — `backend`/`api`/`db`/`infra`/`docs`/`chore`/`config`). In practice most cards in a group skip this gate (backend/db/api), so the eligible set is usually 0–1. For the cards that PASS the gate, invoke `/e2e-review` in programmatic mode with that card's payload. Each `/e2e-review` keeps its own isolated state dir (`.baldart/e2e-review/<CARD-ID>/`), so multiple runs do not clobber each other's artifacts.
@@ -66,7 +66,7 @@ created by Step 1 (HARD RULE 17). `$WORKTREE_PATH` is set in the state file.
66
66
 
67
67
  **Variable guard (R6).** Before executing any item in Step 7, resolve and verify:
68
68
  - `$WORKTREE_PATH` — read from the state file `## Worktree / path:`. HALT with `"ABORT: WORKTREE_PATH not set in state file — cannot commit or merge."` if absent or empty.
69
- - `$MAIN` — the absolute path of the main (non-worktree) git repository root. **READ the persisted value from the state file `## Worktree / main_path:`** (written at SKILL.md Step 1, the R6 persist-then-read pattern `/new` uses). HALT with `"ABORT: MAIN absent from state — re-run from a session whose Step 1 persisted main_path."` if the field is absent or empty. Only when the field is genuinely missing (legacy state file) FALL BACK to deriving it once via `git -C "$WORKTREE_PATH" rev-parse --git-common-dir` (strips `/.git`). Every subsequent `git` command that runs outside the worktree MUST use `git -C "$MAIN" …`; never use a bare `git` command whose cwd is undefined. (This is the same variable name and resolution contract `/new` uses — the two skills must agree.)
69
+ - `$MAIN` — the absolute path of the main (non-worktree) git repository root. **READ the persisted value from the state file `## Worktree / main_path:`** (written at SKILL.md Step 1, the R6 persist-then-read pattern `/new` uses). HALT with `"ABORT: MAIN absent from state — re-run from a session whose Step 1 persisted main_path."` if the field is absent or empty. Only when the field is genuinely missing (legacy state file) FALL BACK to deriving it once from the worktree's parent toplevel: `MAIN="$(cd "$WORKTREE_PATH" && git -C .. rev-parse --show-toplevel)"`. (Do **not** use `git rev-parse --git-common-dir` + `/..` — that returns the parent of the git dir, which is wrong under `git init --separate-git-dir`; `--show-toplevel` is the true working-tree root. This mirrors the canonical resolution in worktree-manager § "Resolving the main repo root".) Every subsequent `git` command that runs outside the worktree MUST use `git -C "$MAIN" …`; never use a bare `git` command whose cwd is undefined. (This is the same variable name and resolution contract `/new` uses — the two skills must agree.)
70
70
 
71
71
  ### Resolve remaining items
72
72
 
@@ -8,7 +8,7 @@ description: Review changed code for reuse, quality, and efficiency, then fix an
8
8
 
9
9
  ## Project Context
10
10
 
11
- **Reads from `baldart.config.yml`:** `paths.references_dir`, `paths.components_primitives`, `paths.components_root`, `paths.design_system`.
11
+ **Reads from `baldart.config.yml`:** `paths.references_dir`, `paths.components_primitives`, `paths.components_root`, `paths.design_system`, `features.has_toolchain` + `toolchain.commands.{lint,typecheck,test}` (the Step 5 verify gates run those verbatim when the flag is on — see Step 5).
12
12
  **Gated by features:** `features.has_design_system` (Design System check section is BLOCKING when `true`).
13
13
  **Overlay:** loads `.baldart/overlays/simplify.md` if present — project-specific utility module paths, hook conventions.
14
14
  **On missing/empty keys:** ask the user; do not assume defaults. See `framework/agents/project-context.md` § 3.
@@ -142,6 +142,12 @@ When extracting shared code, write to the SAME `${paths.*}` keys the review phas
142
142
  These static checks run AFTER Step 4 has mutated code — a pre-fix pass would not cover the post-fix
143
143
  code, so re-running them here is mandatory whenever Step 4 changed anything.
144
144
 
145
+ > **Toolchain:** when `features.has_toolchain: true` in `baldart.config.yml`, run the command from
146
+ > `toolchain.commands.{lint,typecheck,test}` **verbatim** instead of the default shown below — per
147
+ > `framework/agents/toolchain-protocol.md`. Per gate: a non-empty config command wins; empty/absent
148
+ > (or the flag off) falls back to the default. A configured command that exits non-zero is a real
149
+ > FAIL (do not fall back to the default).
150
+
145
151
  Run the linter (use the project's lint command):
146
152
 
147
153
  ```
@@ -20,6 +20,7 @@ description: >
20
20
  - `git.merge_strategy` — `pr` (default, GitHub PR via `gh`) or `local-push` (direct fast-forward to `origin/<git.trunk_branch>`). Drives `/mw` step 4c.
21
21
  - `paths.backlog_dir` — used for syncing untracked backlog cards (`/nw` step 3b).
22
22
  - `paths.metrics` — JSONL telemetry dir (default `docs/metrics`) used by the rebase conflict-resolution table.
23
+ - `features.has_toolchain` + `toolchain.commands.{typecheck,lint,build,test}` — when the flag is `true`, the mechanical gates below run the consumer's configured commands verbatim instead of the hardcoded `npx tsc`/`npx eslint`/`npm run build` defaults (see "## Toolchain-aware gates"). Absent/`false` → defaults, identical to pre-toolchain behavior.
23
24
  - Protocol reference: `framework/agents/project-context.md`. Skills must ASK when a needed key is missing — never assume.
24
25
 
25
26
  ## Effort
@@ -30,6 +31,47 @@ reasoning depth for this run — detect it once at kickoff and strip the token
30
31
  before consuming user input. Level→behavior mapping, parsing contract, and
31
32
  precedence caveats: `framework/agents/effort-protocol.md`.
32
33
 
34
+ ## Toolchain-aware gates
35
+
36
+ When `features.has_toolchain: true` in `baldart.config.yml`, every mechanical
37
+ gate in this skill (type-check, lint, build, test) runs the command from
38
+ `toolchain.commands.<gate>` **verbatim** instead of the hardcoded default — per
39
+ `framework/agents/toolchain-protocol.md`. Resolution is done **in shell**, the
40
+ same way this skill already resolves `git.trunk_branch` / `git.merge_strategy`
41
+ from disk (a background `/nw` baseline runner is a weak/background model — a prose
42
+ "please substitute" note would be silently skipped while the hardcoded command
43
+ ran anyway; the shell resolver is model-independent). Each gate block inlines this
44
+ compact resolver (self-contained, since each runs as its own `Bash` call):
45
+
46
+ ```bash
47
+ # Resolve baldart.config.yml from disk; missing/unreadable → empty → hardcoded default.
48
+ CFG="baldart.config.yml"; [ -f "$CFG" ] || CFG="$(git rev-parse --show-toplevel 2>/dev/null)/baldart.config.yml"
49
+ _tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
50
+ grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
51
+ | grep -E "^[[:space:]]+$1:" | head -1 \
52
+ | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }
53
+ ```
54
+
55
+ `eval "${VAR:-<default>}"` then runs the configured command or falls back.
56
+
57
+ **Three carve-outs specific to this skill:**
58
+ - **`npm install` is NOT a toolchain gate** — there is no `toolchain.commands.install`
59
+ key (the command map is lint/format/typecheck/test/test_related/build/audit).
60
+ Dependency installation stays as-is; do **not** route it through the resolver.
61
+ - **Severity is unchanged.** The resolver picks *which* command runs, never *what
62
+ to do with its exit code*. The protocol's "a configured command that exits
63
+ non-zero is a real FAIL — do not then run the default" governs only resolution
64
+ (never mask a failure by running the default after the configured one failed);
65
+ it does **not** change each block's STOP/continue policy: a `/nw` baseline
66
+ type-check/lint failure is still "report but continue", a build failure is still
67
+ "STOP", exactly as written at each block.
68
+ - **The `/mw` best-effort test line stays best-effort.** `npm run test 2>/dev/null
69
+ || true` (pre-merge step 2) is intentionally non-blocking; the resolved command
70
+ keeps the `|| true` swallow, so `test` is never a `/mw` STOP trigger.
71
+
72
+ A configured **lint** command may be whole-tree (e.g. `npx biome check .`) where a
73
+ default is changed-files-scoped (`/mw` pre-merge) — that wider scope is the
74
+ consumer's configured contract, applied only when `has_toolchain` is on.
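To see how the resolver above behaves per gate, here is a small sketch that runs the same `_tc` function against a throwaway config. The config contents and the temp path are assumptions made for the demo, and the gates are only echoed rather than executed.

```bash
# Sketch only: a throwaway config (assumed contents) fed to the resolver quoted above.
TMP="$(mktemp -d)"
CFG="$TMP/baldart.config.yml"
cat > "$CFG" <<'EOF'
features:
  has_toolchain: true
toolchain:
  commands:
    lint: "npx biome check ."
    typecheck: ""
EOF

# Same resolver as in the section above, reproduced so the demo is self-contained.
_tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
  grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
    | grep -E "^[[:space:]]+$1:" | head -1 \
    | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }

TC_LINT="$(_tc lint)"        # configured and non-empty: the configured command wins
TC_TC="$(_tc typecheck)"     # configured but empty: falls back to the default
TC_BUILD="$(_tc build)"      # key absent: falls back to the default

# Echo instead of eval so the demo does not need npm or npx installed.
echo "lint:      ${TC_LINT:-npx eslint --max-warnings=0 src/}"
echo "typecheck: ${TC_TC:-npx tsc --noEmit}"
echo "build:     ${TC_BUILD:-npm run build}"
```

Swapping `echo` for `eval` gives the form used in the gate blocks. The exit code of the evaluated command is then handled by each block's own STOP/continue policy.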
33
75
 
34
76
  **IMMEDIATE EXECUTION**: When invoked via `/nw`, `/mw`, `/lw`, or `/cw`, do NOT explain the process. Start executing the matching command flow immediately:
35
77
 
@@ -112,6 +154,9 @@ Output:
112
154
  # Without this, the worktree IS git-tracked: `git status` on the main
113
155
  # repo explodes with every file inside the worktree, exactly defeating
114
156
  # the parallel-isolation purpose of the docs-mode.
157
+ # Main-checkout context (the docs worktree is created below, cwd is still the
158
+ # main repo) → `--show-toplevel` is the main root, separate-git-dir-safe. See
159
+ # § "Resolving the main repo root".
115
160
  MAIN_ROOT="$(git rev-parse --show-toplevel)"
116
161
  if [ ! -f "$MAIN_ROOT/.gitignore" ] || ! grep -qE '^\.worktrees/?$' "$MAIN_ROOT/.gitignore"; then
117
162
  echo "ERROR: .worktrees/ is not in $MAIN_ROOT/.gitignore" >&2
@@ -367,7 +412,8 @@ unmerged sibling branch, invisible to both the local backlog and the trunk
367
412
  merge-base. They collide at rebase/merge time.
368
413
 
369
414
  `scripts/allocate-id.sh` closes that race **on the same machine**. Every worktree
370
- shares one main repo root (resolved from `git rev-parse --git-common-dir`), and
415
+ shares one main repo root (resolved by the canonical `resolve_main()` helper in
416
+ that script — see § "Resolving the main repo root" below), and
371
417
  `.worktrees/` lives there and is gitignored — so it is the natural shared
372
418
  coordination point, exactly like `registry.json`. The allocator anchors a lock
373
419
  and a per-prefix high-water-mark there:
@@ -381,6 +427,36 @@ and a per-prefix high-water-mark there:
381
427
  .claude/skills/worktree-manager/scripts/allocate-id.sh release <worktree-path>
382
428
  ```
383
429
 
430
+ ---
431
+
432
+ ## Resolving the main repo root
433
+
434
+ Several steps need `$MAIN` — the absolute path of the **main repo working tree**
435
+ (never a worktree). It is resolved in **three distinct execution contexts**, and
436
+ each genuinely needs a different primitive — this is NOT entropy, and "unifying"
437
+ them onto one naked primitive regresses real cases (verified):
438
+
439
+ 1. **Main-checkout cwd** (a worktree is being *created*, or cards synced before
440
+ any `cd`): `git rev-parse --show-toplevel`. This is the true working-tree root
441
+ even when `.git` lives elsewhere (`git init --separate-git-dir`). Sites:
442
+ `/nw` step 1, nw-docs step 0, `3b. Sync untracked cards` (cwd is still main
443
+ there), and `/prd` SKILL.md Step 1.
444
+ 2. **Worktree cwd** (after `cd "$WORKTREE_PATH"`): `git -C .. rev-parse --show-toplevel`.
445
+ The worktree's parent is `.worktrees/`, which lives inside the main repo, so
446
+ its toplevel is the main root — correct under a separate git dir too. Sites:
447
+ `/nw` step 4 env-copy, `/mw` step 4c `$MAIN` fallback.
448
+ 3. **Arbitrary cwd, main-or-worktree** (the `allocate-id.sh` startup, called from
449
+ either): the canonical `resolve_main()` in `scripts/allocate-id.sh` — it uses
450
+ `--show-toplevel`, then detects a linked worktree (`git-dir != git-common-dir`)
451
+ and walks one level up.
452
+
453
+ **Do NOT resolve `$MAIN` via `git rev-parse --git-common-dir` + `/..`.** That
454
+ returns the parent of the *git dir*, which equals the working-tree root only when
455
+ `.git` is in-tree; under `git init --separate-git-dir` it points at the wrong
456
+ directory. Always derive the root through `--show-toplevel`. `resolve_main()` is
457
+ the reference implementation; the SKILL snippets mirror it inline because the
458
+ script is not on disk inside a freshly-checked-out worktree.
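The difference between the two primitives can be reproduced with a short experiment. The sketch below is an illustration under stated assumptions (a throwaway repo created with `git init --separate-git-dir`, a linked worktree named `demo`); it is not taken from the package.

```bash
# Sketch: compare `--show-toplevel` with `--git-common-dir` + parent under a separate git dir.
TMP="$(cd "$(mktemp -d)" && pwd -P)"     # canonical path, so string comparisons are stable
mkdir -p "$TMP/gitdirs"
git init -q --separate-git-dir "$TMP/gitdirs/repo.git" "$TMP/work"
cd "$TMP/work" || exit 1

# Context 1: main-checkout cwd.
TOP="$(git rev-parse --show-toplevel)"
COMMON_PARENT="$(cd "$(git rev-parse --git-common-dir)/.." && pwd -P)"
echo "show-toplevel:         $TOP"
echo "git-common-dir parent: $COMMON_PARENT"

# Context 2: worktree cwd. A linked worktree needs at least one commit to branch from.
git -c user.name=demo -c user.email=demo@example.invalid -c commit.gpgsign=false \
  commit -q --allow-empty -m init
mkdir -p .worktrees
git worktree add -q -b demo .worktrees/demo
cd .worktrees/demo || exit 1
WT_TOP="$(git rev-parse --show-toplevel)"
MAIN_FROM_WT="$(git -C .. rev-parse --show-toplevel)"
echo "worktree toplevel:     $WT_TOP"
echo "main via 'git -C ..':  $MAIN_FROM_WT"
```

Here `--show-toplevel` reports the working tree (`work`), the parent of the git dir is the unrelated `gitdirs` directory, and `git -C ..` from inside the worktree resolves back to `work`.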
459
+
384
460
  Shared files, all under `$MAIN/.worktrees/` (already gitignored):
385
461
  - `.id-alloc.lock/` — cross-process mutex (atomic `mkdir`, stale-stolen after 30s
386
462
  via the dir's own mtime; it is a directory, NOT a git ref, so it does not touch
@@ -422,6 +498,9 @@ Supports three modes:
422
498
  # Resolve $MAIN (main repo root) and $TRUNK (git.trunk_branch) up front, and
423
499
  # persist both onto the registry entry created in step 6 (R6). Every later step
424
500
  # reads them from the registry with a presence guard — never from in-context state.
501
+ # Main-checkout context: cwd is the main repo (the orchestrator invokes /nw from
502
+ # $MAIN — setup.md resolves $MAIN before spawning), so `--show-toplevel` is the
503
+ # main root and is separate-git-dir-safe. See § "Resolving the main repo root".
425
504
  MAIN="$(git rev-parse --show-toplevel)"
426
505
  # .gitignore safety (auto-heal, NON-blocking — code-mode differs from nw-docs, which
427
506
  # hard-aborts). Without `.worktrees/` ignored the worktree is git-tracked and `git status`
@@ -520,7 +599,14 @@ but are NOT on the trunk branch yet. The worktree (branched from the trunk) won'
520
599
  # when the key is absent — emit it resolved, never the literal ${paths.backlog_dir}
521
600
  # token, which is not valid bash). Same resolution as /new's card-scoped diff block.
522
601
  BACKLOG_DIR="<value of paths.backlog_dir, or 'backlog' if the key is absent>"
523
- MAIN_ROOT="$(git -C "$WORKTREE_PATH" rev-parse --show-superproject-working-tree 2>/dev/null || pwd)"
602
+ # At this point cwd is still the MAIN checkout (the `cd "$WORKTREE_PATH"` happens
603
+ # in step 4 below), so `--show-toplevel` IS the main repo root — correct for a
604
+ # normal repo, a git submodule, and a `--separate-git-dir` repo alike. (The old
605
+ # `git -C "$WORKTREE_PATH" --show-superproject-working-tree || pwd` was wrong for a
606
+ # real submodule — it returned the SUPERPROJECT, not this repo's root where the
607
+ # cards live — and only resolved otherwise by relying on this same cwd accident.
608
+ # See § "Resolving the main repo root".)
609
+ MAIN_ROOT="$(git rev-parse --show-toplevel)"
524
610
 
525
611
  # For each card in the batch, check if its YAML exists in the main repo but not in the worktree
526
612
  for CARD_FILE in $(ls "$MAIN_ROOT/$BACKLOG_DIR"/*.yml 2>/dev/null); do
@@ -549,7 +635,13 @@ npm install
549
635
 
550
636
  # 3. Copy environment files from main repo root
551
637
  # Build requires Firebase env vars — this copy is critical.
552
- MAIN_ROOT="$(git -C .. rev-parse --show-toplevel 2>/dev/null || echo '../..')"
638
+ # cwd is the worktree; its parent (`.worktrees/`) lives inside the main repo,
639
+ # so `git -C ..` resolves the MAIN repo's toplevel — correct under a
640
+ # `--separate-git-dir` repo too (unlike `--git-common-dir`+parent). The old
641
+ # silent `|| echo '../..'` fallback masked a resolution failure with a relative
642
+ # guess; fail loud instead. See § "Resolving the main repo root".
643
+ MAIN_ROOT="$(git -C .. rev-parse --show-toplevel 2>/dev/null)"
644
+ [ -n "$MAIN_ROOT" ] || { echo "ERROR: cannot resolve main repo root from worktree $(pwd)" >&2; exit 1; }
553
645
  cp "$MAIN_ROOT/.env.local" .env.local 2>/dev/null || true
554
646
  cp "$MAIN_ROOT/.env" .env 2>/dev/null || true
555
647
 
@@ -597,16 +689,27 @@ fi
597
689
  ### 5. Verify baseline
598
690
 
599
691
  ```bash
692
+ # Toolchain-aware (§ "Toolchain-aware gates"): when features.has_toolchain: true,
693
+ # run toolchain.commands.{typecheck,lint,build} verbatim; else the defaults below.
694
+ CFG="baldart.config.yml"; [ -f "$CFG" ] || CFG="$(git rev-parse --show-toplevel 2>/dev/null)/baldart.config.yml"
695
+ _tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
696
+ grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
697
+ | grep -E "^[[:space:]]+$1:" | head -1 \
698
+ | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }
699
+
700
+ TC_TC=$(_tc typecheck); TC_LINT=$(_tc lint); TC_BUILD=$(_tc build)
701
+
600
702
  # TypeScript + lint (fast)
601
- npx tsc --noEmit
602
- npx eslint --max-warnings=0 src/
703
+ eval "${TC_TC:-npx tsc --noEmit}"
704
+ eval "${TC_LINT:-npx eslint --max-warnings=0 src/}"
603
705
 
604
706
  # Full build verification (required — confirms worktree is functional)
605
- npm run build
707
+ eval "${TC_BUILD:-npm run build}"
606
708
  ```
607
709
 
608
710
  If build fails → STOP and report. Do NOT continue — the worktree is broken.
609
711
  If only tsc/lint fails → report but continue (the trunk branch should be clean, may be a transient issue).
712
+ (Severity is by-gate as stated here; the toolchain resolver only changes *which* command runs — § "Toolchain-aware gates".)
610
713
 
611
714
  ### 6. Update registry
612
715
 
@@ -717,15 +820,27 @@ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>"
717
820
  fi
718
821
 
719
822
  # 2. Run quality checks
720
- # Lint only changed files (vs the trunk branch)
721
- npx eslint --max-warnings=0 $(git diff --name-only "origin/$TRUNK" -- '*.ts' '*.tsx')
722
- npx tsc --noEmit
823
+ # Toolchain-aware (§ "Toolchain-aware gates"): when features.has_toolchain: true,
824
+ # run toolchain.commands.{lint,typecheck,build,test} verbatim; else the defaults below.
825
+ CFG="baldart.config.yml"; [ -f "$CFG" ] || CFG="$(git rev-parse --show-toplevel 2>/dev/null)/baldart.config.yml"
826
+ _tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
827
+ grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
828
+ | grep -E "^[[:space:]]+$1:" | head -1 \
829
+ | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }
830
+ TC_LINT=$(_tc lint); TC_TC=$(_tc typecheck); TC_BUILD=$(_tc build); TC_TEST=$(_tc test)
831
+
832
+ # Lint — default is changed-files-scoped (vs trunk); a configured whole-tree command
833
+ # (e.g. `npx biome check .`) runs verbatim by the consumer's contract (§ carve-out).
834
+ CHANGED_TS=$(git diff --name-only "origin/$TRUNK" -- '*.ts' '*.tsx')
835
+ eval "${TC_LINT:-npx eslint --max-warnings=0 $CHANGED_TS}"
836
+ eval "${TC_TC:-npx tsc --noEmit}"
723
837
 
724
838
  # Full build (required before PR per AGENTS.md)
725
- npm run build
839
+ eval "${TC_BUILD:-npm run build}"
726
840
 
727
- # Tests
728
- npm run test 2>/dev/null || true
841
+ # Tests — BEST-EFFORT by design: never a STOP trigger. Keep the `|| true` swallow
842
+ # whether configured or default (§ carve-out: the /mw test line stays best-effort).
843
+ { eval "${TC_TEST:-npm run test}"; } 2>/dev/null || true
729
844
  ```
730
845
 
731
846
  If lint, tsc, or build fails → report and STOP. Do NOT proceed to PR.
@@ -898,7 +1013,13 @@ done
898
1013
 
899
1014
  # Continue rebase + verify build (no stash to restore — step 4b never stashes)
900
1015
  git rebase --continue
901
- npm run build
1016
+ # Toolchain-aware (§ "Toolchain-aware gates"): toolchain.commands.build verbatim when set, else default.
1017
+ CFG="baldart.config.yml"; [ -f "$CFG" ] || CFG="$(git rev-parse --show-toplevel 2>/dev/null)/baldart.config.yml"
1018
+ _tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
1019
+ grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
1020
+ | grep -E "^[[:space:]]+$1:" | head -1 \
1021
+ | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }
1022
+ TC_BUILD=$(_tc build); eval "${TC_BUILD:-npm run build}"
902
1023
  ```
903
1024
 
904
1025
  If build fails after rebase → STOP and report. The rebase introduced incompatibilities.
@@ -911,11 +1032,16 @@ Read `git.merge_strategy` from `baldart.config.yml` (default: `pr`):
911
1032
 
912
1033
  ```bash
913
1034
  # $MAIN and $TRUNK were resolved in step 1 (read from the registry entry, R6) and
914
- # presence-guarded. Fall back to deriving $MAIN from git-common-dir ONLY if it is
1035
+ # presence-guarded. Fall back to deriving $MAIN from the worktree ONLY if it is
915
1036
  # still unset — never silently re-derive over a value the registry already supplied.
916
1037
  if [ -z "$MAIN" ]; then
917
- MAIN=$(git -C "$WORKTREE_PATH" rev-parse --git-common-dir)/..
918
- MAIN=$(cd "$MAIN" && pwd)
1038
+ # Canonical worktree→main resolution (see § "Resolving the main repo root"):
1039
+ # cd INTO the worktree so `git -C ..` resolves the MAIN repo's toplevel. NOT
1040
+ # `--git-common-dir`+`/..` — that returns the parent of the git dir (wrong under
1041
+ # `git init --separate-git-dir`), and appending `/..` to a possibly-relative
1042
+ # `--git-common-dir` under `git -C` resolves against the wrong base.
1043
+ MAIN="$(cd "$WORKTREE_PATH" 2>/dev/null && git -C .. rev-parse --show-toplevel 2>/dev/null)"
1044
+ [ -n "$MAIN" ] || { echo "ERROR: cannot resolve \$MAIN from worktree $WORKTREE_PATH (registry mainRoot was empty)." >&2; exit 1; }
919
1045
  fi
920
1046
 
921
1047
  # Resolve strategy from baldart.config.yml (default: pr)
@@ -1093,8 +1219,15 @@ fi
1093
1219
  ### 5. Post-merge verification
1094
1220
 
1095
1221
  ```bash
1096
- npx tsc --noEmit
1097
- npm run build
1222
+ # Toolchain-aware (§ "Toolchain-aware gates"): toolchain.commands.{typecheck,build} verbatim when set, else defaults.
1223
+ CFG="baldart.config.yml"; [ -f "$CFG" ] || CFG="$(git rev-parse --show-toplevel 2>/dev/null)/baldart.config.yml"
1224
+ _tc() { grep -E '^[[:space:]]*has_toolchain:[[:space:]]*true' "$CFG" >/dev/null 2>&1 || return 0
1225
+ grep -A20 '^toolchain:' "$CFG" 2>/dev/null | grep -A15 '^[[:space:]]*commands:' \
1226
+ | grep -E "^[[:space:]]+$1:" | head -1 \
1227
+ | sed -E "s/.*$1:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" | sed -E 's/[[:space:]]+$//'; }
1228
+ TC_TC=$(_tc typecheck); TC_BUILD=$(_tc build)
1229
+ eval "${TC_TC:-npx tsc --noEmit}"
1230
+ eval "${TC_BUILD:-npm run build}"
1098
1231
  ```
1099
1232
 
1100
1233
  If post-merge build fails → STOP and report. Do NOT cleanup worktree (may need to investigate).
@@ -8,8 +8,8 @@
8
8
  # backlog scan cannot see IDs that are still in flight on an unmerged sibling
9
9
  # worktree branch.
10
10
  #
11
- # Mechanism: every worktree shares the same main repo root (resolved from
12
- # `git rev-parse --git-common-dir`), and `.worktrees/` lives there and is
11
+ # Mechanism: every worktree shares the same main repo root (resolved by
12
+ # `resolve_main()` below), and `.worktrees/` lives there and is
13
13
  # gitignored. We anchor a lock + a per-prefix high-water-mark file there, so a
14
14
  # reservation is atomic across every worktree on this machine. The high-water
15
15
  # bumped under the lock is the correctness mechanism; the max() against the real
@@ -34,15 +34,29 @@ LOCK_MAX_TRIES=150 # ~30s at 0.2s/try
34
34
 
35
35
  err() { printf '%s\n' "$*" >&2; }
36
36
 
37
- # --- Resolve the shared main repo root from any worktree -------------------
37
+ # --- Resolve the main repo root from the main checkout OR any worktree -----
38
+ # CANONICAL main-repo-root resolution for the whole framework (mirrored inline in
39
+ # worktree-manager/SKILL.md + prd, where this script is not reachable from a
40
+ # worktree checkout). Built on `--show-toplevel`, NOT `--git-common-dir`+parent:
41
+ # the latter returns the PARENT OF THE GIT DIR, which is wrong under
42
+ # `git init --separate-git-dir` (the git dir lives outside the working tree).
43
+ # `--show-toplevel` is the true working-tree root in every case. From a linked
44
+ # worktree `--show-toplevel` returns the WORKTREE root, so we detect that
45
+ # (git-dir != git-common-dir) and walk one level up out of `.worktrees/` — the
46
+ # worktree path is always `<main>/.worktrees/<name>` (SKILL.md R8) — and resolve
47
+ # the main checkout's toplevel there. Fails (return 1) rather than print garbage.
38
48
  resolve_main() {
39
- local common
49
+ local top gd common
50
+ top="$(git rev-parse --show-toplevel 2>/dev/null)" || return 1
51
+ gd="$(git rev-parse --git-dir 2>/dev/null)" || return 1
40
52
  common="$(git rev-parse --git-common-dir 2>/dev/null)" || return 1
41
- case "$common" in
42
- /*) ;; # already absolute
43
- *) common="$(pwd)/$common" ;; # relative (we're in the main repo) → absolutise
44
- esac
45
- (cd "$common/.." 2>/dev/null && pwd) || return 1
53
+ if [ "$gd" = "$common" ]; then
54
+ printf '%s\n' "$top" # main checkout — toplevel IS the main repo root
55
+ else
56
+ # linked worktree — the main root is the toplevel of the directory that
57
+ # contains `.worktrees/` (one level above this worktree).
58
+ (cd "$top/.." 2>/dev/null && git rev-parse --show-toplevel 2>/dev/null) || return 1
59
+ fi
46
60
  }
47
61
 
48
62
  # --- Read a paths.* / git.* scalar from baldart.config.yml -----------------
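
The `[ "$gd" = "$common" ]` test in `resolve_main()` compares the two paths as strings, so it relies on Git printing `--git-dir` and `--git-common-dir` in the same form. Git can print one as an absolute path and the other relative to the current directory (for example when invoked from a subdirectory), so a hedged variant, which is not part of this release, canonicalises both before comparing. The names `_abs_dir` and `resolve_main_normalised` are hypothetical.

```bash
# Hypothetical helper: print the physical absolute path of a directory.
_abs_dir() { (cd "$1" 2>/dev/null && pwd -P); }

# Sketch only: same decision as resolve_main(), but on canonical paths.
resolve_main_normalised() {
  local top gd common
  top="$(git rev-parse --show-toplevel 2>/dev/null)" || return 1
  gd="$(_abs_dir "$(git rev-parse --git-dir 2>/dev/null)")" || return 1
  common="$(_abs_dir "$(git rev-parse --git-common-dir 2>/dev/null)")" || return 1
  if [ "$gd" = "$common" ]; then
    printf '%s\n' "$top"   # main checkout: toplevel is the main repo root
  else
    # linked worktree: assumes the <main>/.worktrees/<name> layout, so the
    # parent directory sits inside the main checkout's working tree.
    (cd "$top/.." 2>/dev/null && git rev-parse --show-toplevel 2>/dev/null) || return 1
  fi
}
```

Relative output from `git rev-parse` is relative to the current directory, which is why `_abs_dir` can resolve it with a plain `cd`. The worktree branch keeps the layout assumption stated in the comment block of the hunk: a worktree created elsewhere would still fail or resolve to an unrelated repository.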
@@ -64,6 +64,21 @@ protocol. The `/new` workflow scripts receive the resolved config via their
64
64
  `args.config` payload (the consuming skill passes `baldart.config.yml`) — they
65
65
  must never hard-code project facts (see the workflows contamination contract).
66
66
 
67
+ Since v4.42.0 the **execution-side** gates resolve through this protocol too: the
68
+ `worktree-manager` skill (`/nw` baseline + `/mw` pre-merge / rebase / post-merge
69
+ build gates), the classic `/new` reference suite (`implement`, `review-cycle`,
70
+ `completeness`, `final-review`, `commit`, `team-mode` — cited as `§ "Toolchain
71
+ gates"` from the `/new` core), and the standalone `/bug` + `/simplify` verify
72
+ steps. Two execution-context-specific resolution mechanisms are used: a markdown
73
+ **prose note** where a full model reads the skill and writes the gate command
74
+ (the `/new` references, `/bug`, `/simplify`, the agents), and an inline **shell
75
+ resolver** in `worktree-manager` (it has no `args.config` and its baseline runs in
76
+ a weak/background subagent, so it resolves `toolchain.commands.*` from
77
+ `baldart.config.yml` on disk — the same way it already resolves `git.trunk_branch`
78
+ — rather than relying on a prose note a weak model could skip). `npm install` is
79
+ **not** a gate (there is no `commands.install` key), and `markdownlint` is not in
80
+ the curated map — both stay as written everywhere.
81
+
67
82
  ## Fallback rules
68
83
 
69
84
  - A configured command that EXITS NON-ZERO is a genuine gate **FAIL** — do not
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "baldart",
3
- "version": "4.43.1",
3
+ "version": "4.45.0",
4
4
  "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
5
5
  "bin": {
6
6
  "baldart": "./bin/baldart.js"