baldart 4.49.0 → 4.49.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -5,6 +5,14 @@ All notable changes to BALDART will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [4.49.1] - 2026-06-17
9
+
10
+ **The four `/new` dynamic workflows shed their paragraph-length `meta.description` — the last prose that re-entered the orchestrator context on every workflow completion.** The `Workflow` tool contract specifies `description` as a *one-line* field shown in the permission dialog, but `new-card-review` / `new-final-review` / `new2` / `new2-resolve` each carried a full-paragraph rationale (~150–230 words) that the runtime echoes back into the orchestrator prefix on the completion line — pure prose with zero runtime value (the orchestrator already knows what it invoked). Each is collapsed to a single line; the full semantics were already duplicated in the script's own args-contract comment block (source code, never injected into context) plus the `references/*.md` SSOT, so nothing is lost. Same family as the v4.49.0 prose-leak closures, applied to the workflow layer. **PATCH** (no behaviour change — descriptions are display/permission-dialog text only; **no new `baldart.config.yml` key**).
11
+
12
+ ### Changed
13
+
14
+ - **`framework/.claude/workflows/{new-card-review,new-final-review,new2,new2-resolve}.js`** — `meta.description` collapsed from a full paragraph to one line each (e.g. `'Per-wave review+fix cluster for /new (off-context).'`). A two-line comment above each marks why it must stay a one-liner (re-enters orchestrator context on completion) and points to the args-contract comment block + reference module that hold the full semantics.
15
+
8
16
  ## [4.49.0] - 2026-06-17
9
17
 
10
18
  **Caveman-style terse contracts close the last prose-leak the v4.47.0 turn-economy left out of scope: `codebase-architect` gets an opt-in `OUTPUT=terse` grounding mode, and `/new`'s orchestrator is forced near-silent.** A study of the [caveman](https://github.com/JuliusBrussee/caveman) skill (compress *how the agent talks*, not *what it builds* — output tokens only, reasoning untouched) plus a **prose-leak surface map of BALDART's own fleet** found the surface already covered almost everywhere — finders emit YAML/schema (`code-reviewer`, auditors, `plan-auditor`, `prd-card-writer`, `coder` completion-report), `senior-researcher` is file-resident, `context-primer` is already capped at a 30–50-line XML — with ONE genuinely-uncovered prose exploder: `codebase-architect`, whose budget cap is *input*-side only (`codebase-architect.md`), whose "Communication Style" explicitly licenses narrative prose, and whose return is outside the v4.47.0 rule by design (`new/SKILL.md` § "Context economy": *"It does NOT change any agent's return contract"*). The earlier ponytail-style "cap all finder prose" idea was **refuted** by a 3-skeptic adversarial pass (finders already capped + data-driven; new2 already isolates subagent output; driver is turn-count) — this release ships only the two data-validated survivors. A follow-up **return-consumption audit** of the whole fleet confirmed the floor is already reached everywhere else — finders/auditors return a parsed schema or override-to-`## FINDINGS`, and `qa-sentinel`/`doc-reviewer`/`senior-researcher`/`prd-card-writer` are file-resident with a thin manifest return; the only remaining lever was wiring the new `OUTPUT=terse` flag into the remaining machine-to-machine `codebase-architect` call-sites (`context-primer`, `/prd` discovery dimension-resolution + ISA spawns). The near-silent rule is scoped to `/new` only — `/prd` and `/bug` are conversational, where the prose IS the deliverable (caveman's own Auto-Clarity principle: never compress what a human reads to decide). `/new` is the opposite: not conversational — the user keeps the orchestrator attached only so it can surface a genuine decision, so its running narration serves no one. **MINOR** (additive agent capability + skill behaviour; **no new `baldart.config.yml` key** ⇒ schema-change propagation rule N/A; no removed surface).
package/VERSION CHANGED
@@ -1 +1 @@
1
- 4.49.0
1
+ 4.49.1
@@ -1,7 +1,8 @@
1
1
  export const meta = {
2
2
  name: 'new-card-review',
3
- description:
4
- "Per-wave review+fix cluster for /new. Takes 1..N co-located cards (1 = sequential per-card; N = a team-mode group/wave) and runs the whole review fan-out OUTSIDE the orchestrator context: Simplify + cross-model Codex (agent-launched, code-reviewer fallback) + qa-sentinel gates + security-reviewer (high-risk only), each specialist FP-checking its OWN findings; then ONE coder applies all VERIFIED code/perf/security/simplify findings in a single pass (files disjoint by ownership) and re-verifies lint/tsc/build. Doc-review and api-perf are deliberately OUT (doc runs post-E2E in the skill; api-perf is deferred to the Final FULL gate). Returns a compact {perCard:{fixesApplied,residual}, gateTable, summary} — the only object that re-enters the orchestrator prefix. Maps to references/review-cycle.md (Phase 2.55+3.5+3.7) and references/team-mode.md (Step D.2-D.4b). NOT new2: no human gates (returned as residual), no AC-closure, no E2E, no merge/commit/backlog.",
3
+ // One-liner only: this string re-enters the orchestrator context on completion.
4
+ // Full semantics live in the args-contract comment block below + references/review-cycle.md.
5
+ description: 'Per-wave review+fix cluster for /new (off-context).',
5
6
  phases: [
6
7
  { title: 'Baseline', detail: 'architecture grounding for the wave scope' },
7
8
  { title: 'Discovery', detail: 'parallel finders per card (simplify / codex / qa / security)' },
@@ -1,7 +1,8 @@
1
1
  export const meta = {
2
2
  name: 'new-final-review',
3
- description:
4
- "Cross-batch final code review for /new. Fans out a multi-agent review (Codex primary + doc-reviewer + api-perf-cost-auditor + qa-sentinel gates) over the WHOLE batch diff. Each domain specialist OWNS its lane end-to-end — it FP-checks its own findings in the finding pass, so there is no generic code-reviewer re-judge over another specialist's domain (cross-domain) nor over its own findings (self-judge); any residual unresolved finding is routed to its domain specialist. Read-only: returns classified findings + a gate table, applies NO fixes (the calling skill owns fix application + user gates). Maps to references/final-review.md steps F.2–F.4.",
3
+ // One-liner only: this string re-enters the orchestrator context on completion.
4
+ // Full semantics live in the args-contract comment block below + references/final-review.md.
5
+ description: 'Cross-batch final code review for /new (read-only, off-context).',
5
6
  phases: [
6
7
  { title: 'Baseline', detail: 'architecture grounding for the batch scope (F.2)' },
7
8
  { title: 'Review', detail: 'parallel multi-agent review of the batch diff (F.3)' },
@@ -1,7 +1,8 @@
1
1
  export const meta = {
2
2
  name: 'new2-resolve',
3
- description:
4
- "Self-healing resolution pass for the autonomous new2 batch workflow. Called by new2 whenever a deterministic gate would otherwise need a human: a card fail/blocker (ac-unmet | blocker | qa-fail | e2e-blocked | merge-blocker) or a legitimate scope-EXPANDING finding (scope-expansion). Tier-1 targeted fix with a TERMINAL short-circuit (skips the costly multi-attempt when the problem is impossible-by-definition, verified not trusted), then a judged multi-attempt; a MANDATORY adversarial judge cross-checks every verified claim against the real diff (prevents fabricated success) whenever the judge is a DIFFERENT specialist than the fixer. When fixer === judge (doc→doc-reviewer, security→security-reviewer) the second pass would just judge its own work — no cross-model diversity, pure waste — so the judge AND the terminal-verdict ratification are skipped (the reviewer-writer self-verifies in its single pass). Specialized per domain (doc→doc-reviewer single pass, ui→ui-expert fix + code-reviewer judge, security→security-reviewer single pass, perf→api-perf-cost-auditor judge). Terminal is a tracked follow-up. Accepts a `findings` list (batched per area). Uses agent()/parallel() only — no nested workflows.",
3
+ // One-liner only: this string re-enters the orchestrator context on completion.
4
+ // Full semantics live in the args-contract comment block below.
5
+ description: 'Self-healing resolution pass for the autonomous new2 batch workflow.',
5
6
  phases: [
6
7
  { title: 'Diagnose', detail: 'classify + terminal short-circuit + scope-expansion boundary' },
7
8
  { title: 'Repair', detail: 'targeted fix, then judged multi-attempt if needed' },
@@ -1,7 +1,8 @@
1
1
  export const meta = {
2
2
  name: 'new2',
3
- description:
4
- "EXPERIMENTAL workflow host for /new (A/B testing). Runs an ENTIRE backlog-card batch autonomously in the background runtime — pre-flight, a dependency-gated (DAG) per-card implement+review pipeline with specialized agents, cross-batch final review, and an integrity-gated auto-merge — so subagent output never enters the main orchestrator context. Zero AskUserQuestion: each /new gate is a deterministic policy; blocking gates and scope-expanding findings are routed to the new2-resolve self-healing workflow. Before the final review a cross-card integration pass implements the residuals that were deferred only by the per-card ownership artifact (out-of-ownership whose remedy lands inside the batch union) or by a transient outage — re-running resolve with batch-union ownership, domain-routed — so they are not left as follow-ups to manage later (owner-gated / not-a-code-defect / baseline / new-AC scope-expansion still defer). Resilient by design: transient API errors are retried, a sustained outage degrades cleanly (durable resume), follow-ups for residuals are reconciled by the skill (offline-safe), and the merge is blocked unless the batch is verifiably complete (no false-DONE, no unreviewed code). Agents Read /new's reference modules for semantics (args.refModulesBase) so this script encodes only orchestration shape + gate policy. Claude-only.",
3
+ // One-liner only: this string re-enters the orchestrator context on completion.
4
+ // Full semantics live in the args-contract comment block below + /new's reference modules.
5
+ description: 'EXPERIMENTAL autonomous workflow host for the whole /new batch (off-context, Claude-only).',
5
6
  phases: [
6
7
  { title: 'Pre-flight', detail: 'deterministic workspace hygiene + worktree + dep-graph + cross-card grounding (Phase 0)' },
7
8
  { title: 'Implement', detail: 'dependency-gated per-card implement+review pipeline with owner_agent + specialized review fan-out' },
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "baldart",
3
- "version": "4.49.0",
3
+ "version": "4.49.1",
4
4
  "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
5
5
  "bin": {
6
6
  "baldart": "./bin/baldart.js"