npm - baldart - Versions diffs - 4.40.0 → 4.41.0 - Mend

baldart 4.40.0 → 4.41.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28) hide show

package/CHANGELOG.md +21 -0
package/README.md +6 -2
package/VERSION +1 -1
package/framework/.claude/agents/coder.md +4 -2
package/framework/.claude/agents/qa-sentinel.md +10 -1
package/framework/.claude/commands/qa.md +2 -0
package/framework/.claude/skills/toolchain-bootstrap/SKILL.md +127 -0
package/framework/.claude/workflows/new-card-review.js +17 -2
package/framework/.claude/workflows/new2.js +3 -0
package/framework/agents/index.md +2 -0
package/framework/agents/toolchain-protocol.md +80 -0
package/framework/docs/TOOLCHAIN-LAYER.md +135 -0
package/framework/templates/baldart.config.template.yml +40 -0
package/package.json +1 -1
package/src/commands/configure.js +81 -0
package/src/commands/doctor.js +67 -0
package/src/commands/update.js +12 -0
package/src/utils/tool-currency.js +52 -0
package/src/utils/toolchain-adapters/biome.js +92 -0
package/src/utils/toolchain-adapters/eslint.js +39 -0
package/src/utils/toolchain-adapters/husky.js +30 -0
package/src/utils/toolchain-adapters/index.js +83 -0
package/src/utils/toolchain-adapters/jest.js +34 -0
package/src/utils/toolchain-adapters/lefthook.js +84 -0
package/src/utils/toolchain-adapters/prettier.js +39 -0
package/src/utils/toolchain-adapters/tsc.js +50 -0
package/src/utils/toolchain-adapters/vitest.js +46 -0
package/src/utils/toolchain-installer.js +233 -0

package/CHANGELOG.md CHANGED Viewed

@@ -5,6 +5,27 @@ All notable changes to BALDART will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [4.41.0] - 2026-06-15
+**BALDART becomes opinionated about the *tools*, not just the workflow: a new curated toolchain layer installs best-in-class JS/TS dev tools at first install and makes the agents actually use them.** Until now BALDART shipped agents and workflows but was agnostic about linters/formatters/test-runners — every quality gate hard-coded `eslint`/`tsc`/`jest`. This release adds an opt-in toolchain layer (`features.has_toolchain`) that, on a JS/TS project, PRESELECTS and installs a curated set as devDependencies — **Biome** (format + lint + import organizer), **Vitest**, **tsc**, **Lefthook** (pre-commit) — then records literal gate commands in `toolchain.commands.*` that the gate flows (`/new`, `/new2`, `/qa`, `qa-sentinel`, `coder`) run verbatim instead of guessing. The design is the fourth member of the install-adapter family (alongside routine-/tool-/lsp-adapters) and inherits their invariants: **opinionated but askable** (default Y, opt-out), **non-destructive** (configs written only when absent; existing ESLint/Prettier/Jest/husky are detected and a migration is only ever PROPOSED, never automatic — `.husky/` is never overwritten), **never silent in CI** (`--non-interactive` writes the flag only; `baldart doctor` backfills), and **silent fallback** (an unset command degrades to the project-standard default; the layer is invisible to non-JS projects and consumers with their own toolchain). **MINOR** (additive capability + new `features.has_toolchain` + `toolchain.*` config keys, propagated end-to-end per the schema-change propagation rule; backwards-compatible — flag defaults `false`, every gate falls back to today's behavior when unset).
+### Added
+- **`src/utils/toolchain-adapters/`** — new adapter family (same dispatcher pattern as `lsp-adapters/`). Curated installers `biome.js` / `vitest.js` / `tsc.js` / `lefthook.js` (each: `installCommand`/`verifyCommand`/`commands()`/`initConfig`/`static replaces`/`static detect`) + incumbent **detectors** `eslint.js` / `prettier.js` / `jest.js` / `husky.js` (detection-only, drive the migration proposal). `index.js` exposes `REGISTRY` + `INCUMBENTS` + `detectAll` + `detectIncumbents`.
+- **`src/utils/toolchain-installer.js`** — orchestrator (modeled on `graphify-installer.js`): `recommend()` (clean installs), `migrations()` (incumbent-blocked tools → manual migration), `install()` (devDeps; writes a tool's config BEFORE the package's own postinstall so Lefthook registers our Biome pre-commit, not its commented-out example), `initConfigs()` (non-destructive), `commandsFor()`, `activeTools()`, `certify()`. Never throws.
+- **`framework/agents/toolchain-protocol.md`** — runtime protocol: per-gate resolution (explicit `toolchain.commands.*` → project-standard fallback → skip), the command map, and the rule that a configured command which FAILS is a real failure (fallback applies only to *unset* commands). Routed from `framework/agents/index.md`.
+- **`framework/docs/TOOLCHAIN-LAYER.md`** — operator guide (plumbing, lifecycle, edge cases, how to add a language/tool).
+- **`framework/.claude/skills/toolchain-bootstrap/SKILL.md`** — explicit install path (`/toolchain-bootstrap`), CI-safe; mirrors `/lsp-bootstrap`.
+- **`framework/templates/baldart.config.template.yml`** — `features.has_toolchain` + the `toolchain:` block (`installed_tools`, `commands.{lint,format,typecheck,test,test_related,build,audit}`, `auto_verify`).
+### Changed
+- **`src/commands/configure.js`** — autodetects the prompt default (JS/TS project AND no incumbent → Y), and on enable runs the install/migrate/write/certify flow (clean set preselected; incumbents → migration proposal, never automatic).
+- **`src/commands/update.js`** — the schema-drift detector now diffs the `toolchain:` block including its nested `commands.*` map (same nested-block contract as `graph:`), so a future gate key surfaces for pre-existing consumers.
+- **`src/commands/doctor.js`** — backfill actions `toolchain-install` (devDeps missing) and `toolchain-init-config` (default config missing), gated on `features.has_toolchain`, never blocking.
+- **`src/utils/tool-currency.js`** — `_toolchainRecords` reports curated devDeps behind their npm `latest` (installed version read from `node_modules`, not a global binary); `autoUpgradable:false` (devDep upgrades touch `package.json` — surfaced as a command, user-run). Honest `unknown` offline.
+- **Consumers wired to read `toolchain.commands.*` with silent fallback** — `framework/.claude/agents/qa-sentinel.md`, `framework/.claude/agents/coder.md`, `framework/.claude/commands/qa.md`, `framework/.claude/workflows/new-card-review.js` (post-fix re-verify + qa gate brief), `framework/.claude/workflows/new2.js` (pre-flight baseline config facts).
 ## [4.40.0] - 2026-06-15
 **The classic `/new` now slims its single-card final review the same way `new2` already does — closing an asymmetry that made an N=1 batch re-run review finders it had already run.** `new-final-review.js` has carried an F-041 single-card slim since v4.17.x (keep the unique-value cross-model Codex pass + the qa-sentinel merge gate; drop the duplicate Claude breadth finders that already ran per-card), but it only ever fired for **`new2`**, which passes `singleCard`. The classic `/new` final-review delegation (`final-review.md` Step F.1.5) **never passed the flag**, so `slim` was always false and a one-card `/new` batch ran the full breadth set in the final review even though the per-card pass had already covered the same files — a duplicate cross-model Codex + redundant doc review on every single-card run. The user's framing ("for one card, skip the per-card review") was the wrong lever (it would drop Simplify, break the fail-fast ordering that runs review *before* E2E + doc-sync, and lose the early security pass — and was already litigated: v3.35.0 introduced an N=1 final-review skip, v3.37.0 reverted it precisely because the per-card pass can run shallow under `light`). The right lever, already chosen for `new2`, is the inverse: keep the per-card pass, slim the *final*. This release extends that slim to classic `/new` — but **per-finder, coverage-gated**, because classic `/new` differs from `new2`: its `doc-reviewer` runs in the skill's Phase 3 (and can defer to Final under `light`), and its `api-perf-cost-auditor` is deferred-to-final *by design* (never runs per-card). **MINOR** (changes the N=1 merge-gate composition; no new surface, no `baldart.config.yml` key — `singleCard`/`slimDoc`/`slimApi` are workflow args ⇒ schema-propagation rule N/A).

package/README.md CHANGED Viewed

@@ -208,13 +208,13 @@ Skills always-ask when required keys are missing — never silently default.
 never overwrites your file. Full guide:
 [`framework/docs/PROJECT-CONFIGURATION.md`](framework/docs/PROJECT-CONFIGURATION.md).
-### Skills (32 portable skills)
+### Skills (33 portable skills)
 Skills live under `.claude/skills/` and are auto-discovered by Claude Code.
 Bundled skills:
 - **Workflow**: `new`, `new2` (v4.16.0 — EXPERIMENTAL workflow-hosted `/new`, Claude-only, for A/B testing context economy), `prd`, `prd-add`, `bug`, `simplify`, `worktree-manager`, `issue-review`, `context-primer`
-- **Code quality**: `skill-creator`, `find-skills`, `webapp-testing`, `playwright-skill`, `lsp-bootstrap` (v3.10.0), `graphify-bootstrap` (v4.21.0 — code knowledge graph), `graph-align` (v4.21.0 — doc↔graph alignment), `e2e-review` (v3.18.0)
+- **Code quality**: `skill-creator`, `find-skills`, `webapp-testing`, `playwright-skill`, `lsp-bootstrap` (v3.10.0), `graphify-bootstrap` (v4.21.0 — code knowledge graph), `graph-align` (v4.21.0 — doc↔graph alignment), `toolchain-bootstrap` (v4.41.0 — curated dev toolchain), `e2e-review` (v3.18.0)
 - **Design**: `frontend-design`, `ui-design`, `motion-design`, `gamification-design`, `design-system-init` (v3.11.0)
 - **Product**: `seo-audit`, `copywriting`, `api-design-principles`
 - **Knowledge**: `doc-writing-for-rag`, `capture` (LLM wiki overlay)
@@ -245,6 +245,10 @@ When `features.has_lsp_layer: true`, `codebase-architect` and the code-explorati
 When `features.has_code_graph: true`, agents prefer the [Graphify](https://github.com/safishamsi/graphify) code knowledge graph (tree-sitter, local/offline, native Leiden communities) for **structural / relational** queries — "what connects X to Y", blast-radius of a change, which modules cluster — via `graphify query`/`path`/`explain`/`affected`. The same graph **re-activates the LLM-wiki auto-learning loop** (dormant since the RAG removal in v4.20.0): `wiki-curator`, `/capture`, and the nightly `doc-graph-align` routine feed synthesis candidates from Graphify's native `GRAPH_REPORT.md` (god nodes, communities, suggested questions) — entirely offline. Graphify is a single language-agnostic tool (`pipx install graphifyy`); install via `baldart configure` or `/graphify-bootstrap` (never silent in CI — `baldart doctor` backfills). Falls back silently to LSP→Grep→Git. See [`framework/agents/code-graph-protocol.md`](framework/agents/code-graph-protocol.md) and [`framework/docs/CODE-GRAPH-LAYER.md`](framework/docs/CODE-GRAPH-LAYER.md).
+### Curated Toolchain Layer (new in v4.41.0)
+When `features.has_toolchain: true`, BALDART becomes opinionated about the *tools* you build with, not just the workflow. On a JS/TS project `baldart configure` PRESELECTS and installs a curated set as devDependencies — **Biome** (format + lint + import organizer), **Vitest**, **tsc**, **Lefthook** (pre-commit) — and records literal gate commands in `toolchain.commands.*`. The quality-gate flows (`/new`, `/new2`, `/qa`, `qa-sentinel`, `coder`) then run those commands verbatim instead of hard-coding `eslint`/`tsc`/`jest`. Opinionated **but askable** (default Y, opt-out) and **non-destructive**: existing ESLint/Prettier/Jest/husky setups are detected and a migration is only ever PROPOSED, never automatic (`.husky/` is never overwritten). Never silent in CI (`baldart doctor` backfills); each gate falls back silently to the project-standard default when its command is unset, so non-JS projects and consumers with their own toolchain are unaffected. Install via `baldart configure` or `/toolchain-bootstrap`. See [`framework/agents/toolchain-protocol.md`](framework/agents/toolchain-protocol.md) and [`framework/docs/TOOLCHAIN-LAYER.md`](framework/docs/TOOLCHAIN-LAYER.md).
 ### Commands
 - **/new**: Batch orchestrator with QA validation, production readiness checklist, and context recovery (also available as a skill)

package/VERSION CHANGED Viewed

	@@ -1 +1 @@
1	- 4.40.0
1	+ 4.41.0

package/framework/.claude/agents/coder.md CHANGED Viewed

@@ -226,9 +226,11 @@ persistence code while `stack.database` is empty, surface a warning suggesting
 Do NOT wait until the end to verify your work. After each sub-task that changes types, interfaces, or function signatures:
-1. **When `stack.language` includes `typescript`**: run `npx tsc --noEmit` — fix ALL errors before proceeding to the next sub-task. On non-TypeScript stacks, run the project's equivalent type/static check instead (e.g. `mypy` / `pyright` for Python, `go vet` for Go) when one is configured.
+> **Toolchain (since v4.41.0):** when `features.has_toolchain: true`, use the commands in `toolchain.commands.*` (`typecheck`, `lint`, `test`/`test_related`) verbatim for the steps below — per `agents/toolchain-protocol.md` — falling back to the defaults shown when a key is unset.
+1. **When `stack.language` includes `typescript`**: run `toolchain.commands.typecheck` if set, else `npx tsc --noEmit` — fix ALL errors before proceeding to the next sub-task. On non-TypeScript stacks, run the project's equivalent type/static check instead (e.g. `mypy` / `pyright` for Python, `go vet` for Go) when one is configured.
 2. If the type check fails: **STOP**. Fix the errors first. Do NOT continue writing new files on top of broken types.
-3. After the final sub-task, run the project's linter on your changed files (e.g. `npx eslint --max-warnings=0 <files>` for a JS/TS stack, or the configured equivalent).
+3. After the final sub-task, run the linter on your changed files: `toolchain.commands.lint` if set, else the project default (e.g. `npx eslint --max-warnings=0 <files>` for a JS/TS stack, or the configured equivalent).
 4. **Unit tests**: run ONLY the specific test file, not the full suite. `npm run test` is slow (minutes). Run the single file with the project's actual runner — check `package.json` scripts / devDependencies and use whichever is present (e.g. `npx vitest run <file>`, `npx jest <file>`, `node --import tsx <file>`, or `pytest <file>` / `go test ./<pkg>` on non-JS stacks). Do NOT assume `tsx` is installed: if the chosen command fails with a missing-loader/module error, that means the runner is wrong — try the next candidate, do NOT treat the failure as "no related tests". If the task doesn't involve a known test file, skip the test run.
 **Error Recovery — three-branch state machine (cap: 3 attempts)**: If a build breaks mid-implementation:

package/framework/.claude/agents/qa-sentinel.md CHANGED Viewed

@@ -23,7 +23,16 @@ Run mechanical quality gates fast, return a PASS/FAIL verdict, and get out of th
 > patterns. Pre-commit gates below are the typical defaults — adjust the toolchain to
 > match your project (e.g. `ruff` instead of `eslint`, `pytest` instead of `npm test`).
-Default pre-commit gates (MUST pass before any commit verdict):
+**Toolchain commands (since v4.41.0):** when `features.has_toolchain: true` in
+`baldart.config.yml`, run the command from `toolchain.commands.<gate>` (lint,
+typecheck, test, test_related, build, audit) **verbatim** instead of the default
+below — per `agents/toolchain-protocol.md`. Resolve per gate: a non-empty config
+command wins; an empty/absent one falls back to the default below. A configured
+command that EXITS NON-ZERO is a real FAIL (do not fall back). Example: a project
+on Biome runs `npx biome check .` for lint; a Vitest project runs `npx vitest run`.
+Default pre-commit gates (MUST pass before any commit verdict) — used when no
+`toolchain.commands.*` is configured:
   1. Lint: `npx eslint --max-warnings=0 <changed source files>` (or project equivalent)
   2. Type-check: `npx tsc --noEmit` (or project equivalent)
   3. Markdown lint: `npx markdownlint-cli2 <changed .md files>` (if .md files changed)

package/framework/.claude/commands/qa.md CHANGED Viewed

@@ -98,6 +98,8 @@ Write the initial file:
 ## Step 3 — Execute Profile
+> **Toolchain (since v4.41.0):** when `features.has_toolchain: true` in `baldart.config.yml`, run the gate commands from `toolchain.commands.*` (lint, typecheck, test, test_related, build, audit) verbatim — per `agents/toolchain-protocol.md` — instead of the generic examples below. A configured command that fails is a real FAIL.
 ### LIGHT — Fast confidence (<3 min target [DESIGN-CHOICE: scoped to lint + type-check + related tests only; sufficient for low-risk or doc-only diffs])
 Run directly in this session (no sub-agents):

package/framework/.claude/skills/toolchain-bootstrap/SKILL.md ADDED Viewed

@@ -0,0 +1,127 @@
+---
+name: toolchain-bootstrap
+effort: medium
+description: >
+  Install, verify, and wire the curated dev toolchain for this project. Detects
+  the JS/TS stack, installs the best-in-class tools as devDependencies (Biome for
+  format+lint+import, Vitest, tsc, Lefthook), writes their default config only
+  when absent, records toolchain.installed_tools + toolchain.commands.* in
+  baldart.config.yml, and proposes (never forces) a migration off any existing
+  ESLint/Prettier/Jest/husky. Use when the user says /toolchain-bootstrap,
+  "install the toolchain", "set up biome", or after enabling
+  features.has_toolchain for the first time. Idempotent — re-running re-verifies
+  and reports drift, never overwriting existing configs.
+---
+# Toolchain Bootstrap
+Set up the curated dev toolchain so the quality gates (`/new`, `/qa`, `/check`,
+`qa-sentinel`) run best-in-class tools instead of whatever the project happened
+to have. Non-destructive throughout: configs are written only when absent and
+incumbent tools are never installed over or removed.
+## Project Context
+**Reads from `baldart.config.yml`:** `features.has_toolchain`, `toolchain.installed_tools`, `toolchain.commands.*`, `toolchain.auto_verify`.
+**Gated by features:** `features.has_toolchain` — this skill refuses to run when the flag is `false`, prompting the user to flip it via `npx baldart configure` first.
+**Overlay:** loads `.baldart/overlays/toolchain-bootstrap.md` if present — project-specific tool choices or install commands.
+**On missing keys:** ask the user; do not assume defaults. See `framework/agents/project-context.md` § 3.
+## Effort
+**Baseline:** `effort: medium` (frontmatter). **Inline override:** pass
+`effort=<low|medium|high|xhigh|max>` anywhere in the invocation to scale
+reasoning depth for this run — detect it once at kickoff and strip the token
+before consuming user input. Level→behavior mapping, parsing contract, and
+precedence caveats: `framework/agents/effort-protocol.md`.
+## What This Skill Does
+1. Reads `baldart.config.yml` and confirms `features.has_toolchain: true`.
+2. Probes the cwd via `src/utils/toolchain-installer.js` to compute the clean
+   recommendation (`recommend()`) and any incumbent-blocked migrations
+   (`migrations()`).
+3. Installs the recommended tools as **devDependencies** (`npm install -D -E`) —
+   never global (these are project tools invoked via `npx`, unlike LSP servers).
+4. Writes each tool's default config **only when absent** (`biome.json`,
+   `lefthook.yml`) and registers Lefthook's `pre-commit` hook.
+5. For incumbents (ESLint/Prettier/Jest/husky), PROPOSES a migration with a
+   preview — never automatic, never touching the existing config (`.husky/` is
+   never overwritten).
+6. Writes `toolchain.installed_tools` + `toolchain.commands.*` to
+   `baldart.config.yml`, then `certify()`s the set.
+7. Prints a compact status and points to
+   `framework/agents/toolchain-protocol.md` for runtime behavior.
+## Workflow
+1. **Refusal check.** If `features.has_toolchain: false` (or missing):
+   ```
+   This project hasn't opted in to the curated toolchain yet.
+   Run `npx baldart configure` and answer YES to "Enable curated dev toolchain?"
+   then re-run /toolchain-bootstrap.
+   ```
+   Stop here. Do not proceed.
+2. **Probe.** Call the installer:
+   ```
+   node -e 'const T=require("./.framework/src/utils/toolchain-installer"); const t=new T(); console.log(JSON.stringify({hasPkg:t.hasPackageJson(),recommend:t.recommend(),migrations:t.migrations()},null,2))'
+   ```
+   If `hasPkg` is false, report "no package.json — JS toolchain not applicable"
+   and stop. Otherwise show the recommended tools and any migrations.
+3. **Confirm.** For the clean recommendation, default YES. For each migration,
+   show what it replaces (and for Biome, the `npx biome migrate eslint/prettier
+   --write` preview) and default NO.
+4. **Install + config + commands.** Run the installer's `install`, `initConfigs`,
+   then read back `commandsFor(activeTools())`. Capture failures into a "Skipped"
+   bucket — never abort the whole skill on one failure.
+5. **Persist.** Update `baldart.config.yml`:
+   ```yaml
+   toolchain:
+     installed_tools: [biome, vitest, tsc, lefthook]
+     commands:
+       lint: npx biome check .
+       format: npx biome format --write .
+       typecheck: npx tsc --noEmit
+       test: npx vitest run
+       test_related: npx vitest related --run
+     auto_verify: true
+   ```
+6. **Output.** A compact one-line-per-tool status, in the project's
+   `identity.language` from `baldart.config.yml` (English default).
+## Output Contract
+A single short status block — never a multi-page report. Format:
+```
+Toolchain bootstrap:
+  ✓ biome      (devDep 2.x, biome.json written, lint+format wired)
+  ✓ vitest     (devDep, test + test_related wired)
+  ✓ tsc        (typescript present, typecheck wired)
+  ✓ lefthook   (devDep, lefthook.yml + pre-commit hook installed)
+  · eslint     → migration available (declined): npx biome migrate eslint --write
+Config updated: baldart.config.yml toolchain.installed_tools = [biome, vitest, tsc, lefthook]
+Next: /qa, /new, qa-sentinel will now run these commands instead of the defaults.
+```
+## Notes
+- This skill never edits source code. Its only side effects are devDependency
+  installs, writing absent default configs, registering the Lefthook hook, and a
+  YAML write to `baldart.config.yml`.
+- Re-run safely. `recommend()`/`migrations()` are recomputed each time, configs
+  are skipped when present, and already-usable tools are left untouched —
+  additions / repairs are idempotent.
+- `baldart doctor` is the unattended backfill: it installs missing tools
+  (`toolchain-install`) and restores missing default config (`toolchain-init-config`).
+## See Also
+- `framework/agents/toolchain-protocol.md` — runtime command resolution + fallback.
+- `framework/docs/TOOLCHAIN-LAYER.md` — full plumbing + lifecycle.
+- `src/utils/toolchain-adapters/` — per-tool install/verify recipes.

package/framework/.claude/workflows/new-card-review.js CHANGED Viewed

@@ -38,6 +38,14 @@ const cfg = a.config || {}
 const highRisk = (cfg.paths && cfg.paths.high_risk_modules) || [] // security-domain hint
 const protocolRef = '.claude/skills/new/references/review-cycle.md'
+// Curated toolchain (since v4.41.0): when features.has_toolchain is on, the
+// consumer records LITERAL gate commands in toolchain.commands.* — agents run
+// THOSE instead of guessing (e.g. `npx biome check .` not `npm run lint`). Empty
+// / absent → silent fallback to the project-standard default. Read from the
+// config the skill already passes via args.config (never hardcoded project facts).
+const tcCmds = (cfg.toolchain && cfg.toolchain.commands) || {}
+const tc = (key, fallback) => (tcCmds[key] && String(tcCmds[key]).trim()) || fallback
 // Per-card result accumulator — built up-front (so the early-return guards can return it) and
 // populated with fixesApplied/residual in the Fix phase.
 const perCard = {}
@@ -193,8 +201,15 @@ const codexPrompt =
   `For each finding return: finding_id, title, severity (BLOCKER|HIGH|MEDIUM|LOW), confidence (0-100), evidence (exact file:line + code quote), minimal_fix_direction, and domain (doc|security|migration|code|perf|test). ` +
   `Run the mandatory false-positive check on every finding and suppress the unconvincing ones (your findings are treated as already FP-validated). Set codexAvailable:true when the review ran.`
+const tcGateLines = [
+  ['lint', tcCmds.lint], ['typecheck', tcCmds.typecheck], ['test', tcCmds.test],
+  ['build', tcCmds.build], ['audit', tcCmds.audit],
+].filter(([, v]) => v && String(v).trim());
 const qaPrompt =
-  `Run MECHANICAL GATES ONLY over the wave scope, per ${protocolRef} (Phase 3.5 qa-sentinel contract): lint, type-check (when stack uses typescript), the full test suite, build, dependency audit, and markdownlint as applicable. You are a GATE RUNNER: do NOT read source for code findings, do NOT emit severities — return only a PASS/FAIL/SKIP gate table.\n\nWorktree: ${a.worktreePath || '(cwd)'} — cd into it first.\nChanged files:\n${unionScope.join('\n')}`
+  `Run MECHANICAL GATES ONLY over the wave scope, per ${protocolRef} (Phase 3.5 qa-sentinel contract): lint, type-check (when stack uses typescript), the full test suite, build, dependency audit, and markdownlint as applicable. You are a GATE RUNNER: do NOT read source for code findings, do NOT emit severities — return only a PASS/FAIL/SKIP gate table.\n\nWorktree: ${a.worktreePath || '(cwd)'} — cd into it first.\nChanged files:\n${unionScope.join('\n')}` +
+  (tcGateLines.length
+    ? `\n\nThis project configures a curated toolchain — run THESE EXACT commands for the corresponding gates (do not substitute):\n${tcGateLines.map(([k, v]) => `  • ${k}: ${String(v).trim()}`).join('\n')}`
+    : '')
 function simplifyPrompt(c) {
   return `Simplify analysis (read-only — you do NOT edit, the workflow applies fixes afterward) over ONE card's committed diff, per ${protocolRef} (Phase 2.55). Cover all THREE lenses and return findings:\n` +
@@ -342,7 +357,7 @@ async function applyFixPass(findings, writer, label, role) {
     `You MAY edit ONLY these files (ownership map — touching anything else is a violation):\n${unionEditable.join('\n')}\n\n` +
     `Findings to fix (fix the code, not the tests unless a test itself is wrong; do NOT expand scope beyond the finding):\n` +
     findings.map((f) => `- [${f.finding_id}] (${f.card || '?'} / ${f.domain} / ${f.severity}) ${f.title}\n    evidence: ${f.evidence}\n    direction: ${f.minimal_fix_direction}`).join('\n') +
-    `\n\nAfter applying: run \`npm run lint\` and (when the project uses typescript) \`npx tsc --noEmit\` and \`npm run build\` in the worktree. If a check fails because of an edit you made, fix the regression — at most 2 retries — staying within the allowed files. ` +
+    `\n\nAfter applying: run \`${tc('lint', 'npm run lint')}\` and (when the project uses typescript) \`${tc('typecheck', 'npx tsc --noEmit')}\` and \`${tc('build', 'npm run build')}\` in the worktree. If a check fails because of an edit you made, fix the regression — at most 2 retries — staying within the allowed files. ` +
     `Do NOT commit. Do NOT git stash (refs/stash is shared across worktrees). ` +
     `Return: applied (finding_ids you fixed), unresolved (finding_ids you could NOT fix within the allowed files / 2 retries), and checks (PASS/FAIL/SKIP for lint, tsc, build).`
   const r = await agent(fixBrief, { label, phase: 'Fix', agentType: writer, schema: FIX_SCHEMA })

package/framework/.claude/workflows/new2.js CHANGED Viewed

@@ -215,6 +215,9 @@ const projectBrief = [
   `Trunk: ${TRUNK}`,
   `Reference modules (Read these for the EXACT /new semantics — this workflow only sequences them): ${REF}/`,
   `Config facts: stack=${JSON.stringify(cfg.stack || {})}; features=${JSON.stringify(features)}; high_risk_modules=${JSON.stringify(highRisk)}; paths.backlog_dir=${paths.backlog_dir || '?'}; paths.references_dir=${paths.references_dir || '?'}.`,
+  (features.has_toolchain && cfg.toolchain && cfg.toolchain.commands)
+    ? `Toolchain commands (run THESE verbatim for the baseline + any gate, per agents/toolchain-protocol.md; empty → project default): ${JSON.stringify(cfg.toolchain.commands)}.`
+    : '',
   FLAGS.effort ? `Reasoning effort for this run: ${FLAGS.effort}.` : '',
 ].filter(Boolean).join('\n')

package/framework/agents/index.md CHANGED Viewed

@@ -24,6 +24,7 @@ Route agents to the right module with minimal reading.
 - If touching architecture, auth, or tech stack -> read `agents/architecture.md`.
 - If touching workflow/process/commits/backlog -> read `agents/workflows.md`.
 - If CREATING or MUTATING a backlog card (any prefix — `FEAT`/`CHORE`/`BUG`/`DOC`/`PERF`/`UI`), or consuming one type-blind (`/new`, `/new2`) -> read `agents/card-schema.md` (the universal, profile-aware baseline) before writing/validating fields.
+- If running MECHANICAL GATES (lint, format, type-check, test, build, audit) and `features.has_toolchain: true` -> read `agents/toolchain-protocol.md` and run the command from `toolchain.commands.<gate>` verbatim, falling back to the project-standard default when a key is unset. A configured command that FAILS is a real gate failure (do not fall back).
 - If touching testing or QA issues -> read `agents/testing.md` (also documents the scope-aware,
   profile-driven test-selection strategy consumed by `qa-sentinel`).
 - If touching GitHub issues or issue workflow -> read `agents/github-issue-subagent.md`.
@@ -65,6 +66,7 @@ When adding or updating agents, update REGISTRY.md — not this file.
 - `agents/project-context.md` — Project context protocol: `baldart.config.yml` + overlays + missing-key handling (since v3.0.0)
 - `agents/code-search-protocol.md` — Retrieval hierarchy for code search: LSP → Grep → Git (since v3.10.0, gated on `features.has_lsp_layer`)
 - `agents/code-graph-protocol.md` — Structural/relational retrieval via the Graphify code knowledge graph (since v4.21.0, gated on `features.has_code_graph`)
+- `agents/toolchain-protocol.md` — Mechanical-gate command resolution (lint/format/typecheck/test/build/audit) from `toolchain.commands.*` with silent project-standard fallback (since v4.41.0, gated on `features.has_toolchain`)
 - `agents/design-system-protocol.md` — Registry-first discipline for UI work: BLOCKING cascade on `INDEX.md` + `tokens-reference.md` + `components/<Name>.md` (since v3.11.0, gated on `features.has_design_system`)
 - `agents/card-schema.md` — Atomic Card Baseline Schema: the universal, profile-aware (epic/child/standalone) field contract every backlog card satisfies, plus the consumer HALT/BACK-FILL/WARN contract (since v4.35.0)

package/framework/agents/toolchain-protocol.md ADDED Viewed

@@ -0,0 +1,80 @@
+# Toolchain Protocol
+## Purpose
+Define how agents/skills run the **mechanical quality gates** — lint, format,
+type-check, test, build, audit — when a project has opted into the curated
+toolchain. The commands are read from `toolchain.commands.*` in
+`baldart.config.yml` and run **verbatim**, instead of each gate hard-coding
+`eslint` / `tsc` / `jest` / `npm run build`. This lets a project standardize on
+the best-in-class tools (Biome for format+lint+import, Vitest, tsc, Lefthook)
+and have every gate flow use them consistently.
+## Scope
+**In**: which command to invoke for each gate; the resolution order; the silent
+fallback. **Out**: code review for correctness/quality (that is the reviewer
+agents' job, not a mechanical gate); installing the tools (that is
+`baldart configure` / `/toolchain-bootstrap` / `baldart doctor`, see
+`framework/docs/TOOLCHAIN-LAYER.md`).
+## Gating
+Conditional on `features.has_toolchain: true` in `baldart.config.yml`. The layer
+degrades **silently and per-command**: for each gate, if
+`toolchain.commands.<gate>` is a non-empty string, run it **exactly**; otherwise
+fall back to the flow's built-in project-standard default (`npx eslint`,
+`npx tsc --noEmit`, `npm test`, `npm run build`, `npm audit`, …). When the flag
+is `false`/missing, every gate uses its default — identical to pre-toolchain
+behavior. **Never block a task** because a command is unset or a tool is missing;
+surface install gaps through `baldart doctor` (`toolchain-install` /
+`toolchain-init-config` actions), not mid-task.
+## Resolution hierarchy (per gate)
+1. **Explicit config** — `toolchain.commands.<gate>` from `baldart.config.yml`,
+   run verbatim. This is the source of truth when set.
+2. **Project-standard fallback** — the gate's existing default command, inferred
+   from the stack (eslint/tsc/jest/npm scripts). Used when the config key is
+   empty/absent.
+3. **Skip** — when neither resolves (e.g. no build script in a library), report
+   `SKIP`, never fail.
+## Command map
+| Gate | Config key | Curated default (Biome/Vitest/tsc) | Fallback |
+| --- | --- | --- | --- |
+| Lint | `toolchain.commands.lint` | `npx biome check .` | `npx eslint --max-warnings=0 <files>` |
+| Format | `toolchain.commands.format` | `npx biome format --write .` | `npx prettier --write <files>` |
+| Type-check | `toolchain.commands.typecheck` | `npx tsc --noEmit` | `npx tsc --noEmit` |
+| Test | `toolchain.commands.test` | `npx vitest run` | `npm test` |
+| Test (related) | `toolchain.commands.test_related` | `npx vitest related --run <files>` | `npx jest --findRelatedTests <files>` |
+| Build | `toolchain.commands.build` | `npm run build` | `npm run build` |
+| Audit | `toolchain.commands.audit` | `npm audit --audit-level=high` | `npm audit` |
+Biome's `check` runs lint + format-check + import-organize in one pass, so the
+`lint` gate alone covers what a separate ESLint+Prettier+import-plugin trio
+would; the `format` command only `--write`s.
+## Consumers
+`qa-sentinel` (gate runner), `/qa`, `/check`, the `coder` post-fix re-verify,
+and the `/new` + `/new2` review workflows resolve gate commands through this
+protocol. The `/new` workflow scripts receive the resolved config via their
+`args.config` payload (the consuming skill passes `baldart.config.yml`) — they
+must never hard-code project facts (see the workflows contamination contract).
+## Fallback rules
+- A configured command that EXITS NON-ZERO is a genuine gate **FAIL** — do not
+  silently fall back to the default (that would mask a real failure). Fallback
+  applies only when the command is **unset**, not when it fails.
+- A configured command whose binary is missing (`command not found`) is a setup
+  gap → report it and fall back to the default for that single run, then let
+  `baldart doctor` repair the install. Never abort the task.
+## See also
+- `framework/docs/TOOLCHAIN-LAYER.md` — install lifecycle, adapters, plumbing.
+- `agents/testing.md` — scope-aware, profile-driven test selection (the `test` /
+  `test_related` gates compose with it).

package/framework/docs/TOOLCHAIN-LAYER.md ADDED Viewed

@@ -0,0 +1,135 @@
+# Curated Toolchain Layer — Operator Guide
+> Since v4.41.0. Opt-in, gated on `features.has_toolchain`. The **runtime**
+> protocol (which command each gate runs) lives in
+> [`framework/agents/toolchain-protocol.md`](../agents/toolchain-protocol.md);
+> this document covers the *plumbing* — install, config, lifecycle, fallback.
+## 1. Why this exists
+BALDART has always been opinionated about *how* you work (workflows, review,
+gates) but agnostic about the *tools* you work with. Every quality-gate flow
+hard-coded `eslint` / `tsc` / `jest` / `npm run build`. This layer lets BALDART
+also recommend and install a curated, best-in-class JS/TS toolchain — and, more
+importantly, makes the agents **use it**: the gate flows read the literal
+commands from `toolchain.commands.*` instead of guessing.
+The curated set (JS/TS):
+| Slot | Tool | Why |
+| --- | --- | --- |
+| Format + Lint + Imports | **Biome** | One fast Rust binary replacing ESLint + Prettier + import plugins |
+| Test | **Vitest** | Fast, ESM-native test runner |
+| Type-check | **tsc** | The TypeScript compiler in `--noEmit` mode |
+| Pre-commit | **Lefthook** | Fast, parallel git-hook runner (owns `pre-commit`) |
+Other languages (Ruff for Python, gofmt for Go, …) plug in by adding an adapter —
+no other layer changes.
+## 2. Principles (non-negotiable)
+- **Opinionated but askable** — on a JS/TS project with no incumbent, `configure`
+  PRESELECTS the tools (default Y); the user opts out.
+- **Non-destructive** — config files are written only when absent; incumbent
+  tools (ESLint/Prettier/Jest/husky) are detected, **never installed or removed**.
+  A migration is only ever PROPOSED (with a preview), never automatic. Flipping
+  `features.has_toolchain` true→false does NOT uninstall anything — it just stops
+  agents from using the commands.
+- **Never silent in CI** — `--non-interactive`/`--yes` installs nothing; it writes
+  the flag and `baldart doctor` backfills interactively.
+- **Silent fallback, never abort** — for each gate, an unset command falls back to
+  the project-standard default; a missing tool is a doctor item, never a task
+  blocker. (A configured command that EXITS NON-ZERO is a real failure — fallback
+  applies only to *unset* commands, not failing ones.)
+## 3. What BALDART adds vs what the tools ship
+BALDART does not reimplement Biome/Vitest/tsc/Lefthook. It adds only:
+- **Gating + config** (`features.has_toolchain` + the `toolchain:` block).
+- **Opt-in install** (`configure` interactive branch, `/toolchain-bootstrap`,
+  `doctor` backfill) via per-tool adapters.
+- **A command-resolution protocol** telling BALDART's agents which command each
+  gate runs.
+- **Currency** — `baldart doctor` reports devDeps behind their npm latest
+  (non-blocking; devDep upgrades are user-run).
+- **Fallback discipline** — silent degrade to project-standard defaults.
+## 4. The moving parts
+| Part | Where | Role |
+|------|-------|------|
+| Config flag + block | `framework/templates/baldart.config.template.yml` | `features.has_toolchain` + `toolchain.{installed_tools,commands.*,auto_verify}` |
+| Curated adapters | `src/utils/toolchain-adapters/{biome,vitest,tsc,lefthook}.js` | `installCommand/verifyCommand/commands()/initConfig/static replaces/static detect` |
+| Incumbent detectors | `src/utils/toolchain-adapters/{eslint,prettier,jest,husky}.js` | detection only (migration proposal) |
+| Dispatcher | `src/utils/toolchain-adapters/index.js` | `REGISTRY` + `INCUMBENTS` + `detectAll` + `detectIncumbents` |
+| Installer wrapper | `src/utils/toolchain-installer.js` | `recommend/migrations/install/initConfigs/commandsFor/certify/activeTools` |
+| Configure | `src/commands/configure.js` | autodetect default + prompt + interactive install/migrate/write |
+| Update detector | `src/commands/update.js` | diffs the `toolchain:` block (incl. nested `commands.*`) |
+| Doctor | `src/commands/doctor.js` | `toolchain-install` / `toolchain-init-config` backfill + currency |
+| Currency | `src/utils/tool-currency.js` | `_toolchainRecords` — installed (node_modules) vs npm latest |
+| Retrieval protocol | `framework/agents/toolchain-protocol.md` | which command each gate runs; fallback |
+| Consumers | `qa-sentinel`, `coder`, `/qa`, `new-card-review.js`, `new2.js` | run `toolchain.commands.*` with fallback |
+| Bootstrap skill | `/toolchain-bootstrap` | explicit install path (CI-safe backfill) |
+## 5. Install lifecycle
+```
+features.has_toolchain: true
+        │
+        ├─ baldart configure (INTERACTIVE)
+        │      → recommend() clean set → npm install -D -E (devDeps)
+        │      → initConfigs() writes biome.json / lefthook.yml IF ABSENT
+        │      → Lefthook registers its pre-commit hook
+        │      → write toolchain.installed_tools + toolchain.commands.*
+        │      → certify() (each tool resolves via npx + config present)
+        │      └─ incumbent present → PROPOSE migration (never automatic)
+        │
+        ├─ baldart configure --yes / CI
+        │      → writes the flag from autodetect; installs NOTHING
+        │      → doctor backfills the install
+        │
+        ├─ /qa · /new · /new2 · qa-sentinel · coder
+        │      → resolve each gate from toolchain.commands.* (fallback to default)
+        │
+        └─ baldart doctor
+               → toolchain-install   (devDep missing)
+               → toolchain-init-config (default config missing)
+               → tool-upgrade:toolchain:<tool> (behind npm latest — non-blocking)
+```
+## 6. Fallback hierarchy (per gate)
+1. **Explicit** — `toolchain.commands.<gate>`, run verbatim.
+2. **Project-standard** — the flow's built-in default (`npx eslint`, `npx tsc
+   --noEmit`, `npm test`, `npm run build`, `npm audit`).
+3. **Skip** — when neither resolves (e.g. no build script), report `SKIP`.
+## 7. Edge cases
+- **Non-JS project** — no adapter `detect()` fires; nothing is proposed.
+- **Monorepo** — Biome/Vitest are root-scoped (one `biome.json` covers the
+  workspace); Lefthook is root-only (one `.git`).
+- **Existing ESLint/Prettier/Jest/husky** — detected as incumbents; the matching
+  curated tool routes to the migration proposal, never an automatic install. husky
+  specifically: Lefthook is NOT installed over `.husky/` — the hook handoff is
+  manual.
+- **`biome.json` / `lefthook.yml` already present** — kept, never overwritten.
+- **No `package.json`** — the JS toolchain is skipped (cannot install a devDep);
+  the flag stays on so a later `configure`/`doctor` can complete it.
+## 8. Adding a language / tool
+1. Create `src/utils/toolchain-adapters/<tool>.js` shaped like `biome.js`
+   (`installCommand`, `verifyCommand`, `commands()`, `initConfig`, `static
+   replaces`, `static detect`). Add it to `REGISTRY` in `index.js`.
+2. For an incumbent it supersedes, add a detection-only adapter to `INCUMBENTS`
+   and list it in the curated tool's `static replaces`.
+3. Everything else (configure, doctor, currency, the protocol) iterates the
+   registry — no further wiring.
+## See also
+- `framework/agents/toolchain-protocol.md` — runtime command resolution.
+- `framework/.claude/skills/toolchain-bootstrap/SKILL.md` — explicit install path.
+- `src/utils/toolchain-adapters/` — per-tool install/verify recipes.

package/framework/templates/baldart.config.template.yml CHANGED Viewed

@@ -202,6 +202,19 @@ features:
   # changed. See framework/.claude/skills/e2e-review/SKILL.md.
   has_e2e_review: false
+  # Curated dev toolchain (since v4.41.0). When true, BALDART installs and
+  # enables an opinionated set of best-in-class JS/TS dev tools (Biome for
+  # format+lint+import; later Vitest, tsc, Lefthook) and the quality-gate flows
+  # (/new, /qa, /check, qa-sentinel) run the commands recorded in
+  # `toolchain.commands.*` below instead of hardcoding eslint/tsc/jest. Falls
+  # back SILENTLY to the project-standard defaults when a command is unset, so a
+  # consumer with its own toolchain (or a non-JS project) is unaffected.
+  # `baldart configure` / `/toolchain-bootstrap` install the tools; `baldart
+  # doctor` backfills. Existing ESLint/Prettier setups are never touched — a
+  # migration is only ever PROPOSED. See framework/agents/toolchain-protocol.md
+  # and framework/docs/TOOLCHAIN-LAYER.md.
+  has_toolchain: false
 # ─── E2E REVIEW ──────────────────────────────────────────────────────────
 # Tuning for the /e2e-review skill (only consulted when has_e2e_review: true).
 e2e_review:
@@ -263,6 +276,33 @@ graph:
   # without it, so this is an enhancement, not a requirement.
   register_mcp: false
+# ─── TOOLCHAIN ─────────────────────────────────────────────────────────────
+# State of the curated dev toolchain. Populated by `baldart configure` / the
+# `/toolchain-bootstrap` skill. Only meaningful when `features.has_toolchain:
+# true`. The commands are LITERAL (run verbatim by agents) — an empty string
+# means "fall back to the project-standard default for this gate".
+toolchain:
+  # Curated tools BALDART has installed/verified locally. Identifiers map to
+  # adapters under src/utils/toolchain-adapters/ (biome — extend with vitest,
+  # tsc, lefthook, …).
+  installed_tools: []
+  # Literal gate commands read by the quality-gate flows (/new, /qa, /check,
+  # qa-sentinel). Empty string → the flow uses its built-in fallback (eslint /
+  # tsc / jest / npm run build / npm audit), so the layer degrades cleanly.
+  commands:
+    lint: ""            # e.g. npx biome check .
+    format: ""          # e.g. npx biome format --write .
+    typecheck: ""       # e.g. npx tsc --noEmit
+    test: ""            # e.g. npx vitest run
+    test_related: ""    # e.g. npx vitest related --run
+    build: ""           # e.g. npm run build
+    audit: ""           # e.g. npm audit --audit-level=high
+  # When true, `baldart doctor` re-verifies the installed tools resolve and
+  # restores any missing default config. Disable on CI if too noisy.
+  auto_verify: true
 # ─── GIT ─────────────────────────────────────────────────────────────────
 # Controls how worktree-manager (`/mw`) integrates a worktree's feature
 # branch back into the integration trunk.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "baldart",
-  "version": "4.40.0",
+  "version": "4.41.0",
   "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
   "bin": {
     "baldart": "./bin/baldart.js"