baldart 4.27.0 → 4.27.1
- package/CHANGELOG.md +8 -0
- package/VERSION +1 -1
- package/framework/.claude/workflows/new2-resolve.js +14 -1
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -5,6 +5,14 @@ All notable changes to BALDART will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [4.27.1] - 2026-06-11
+
+**`new2` resolve: never run an adversarial doc review.** When a resolution pass fixed a `doc`-domain finding, the fixer was `doc-reviewer` (which applies *and* self-verifies the fix in one pass) — but `judgeVerify()` then spawned `doc-reviewer` *again* as the mandatory adversarial judge, plus the terminal-judge ratification used it a third time. That is `doc-reviewer` judging `doc-reviewer`: zero cross-model diversity, pure waste of tokens and time. The doc domain now trusts the single reviewer-writer pass — the adversarial judge and the terminal-judge ratification are both skipped for `doc` only (every other domain keeps the mandatory cross-check unchanged, since their fixer and judge are genuinely independent specialists). **PATCH** (cost/latency optimization on the EXPERIMENTAL `new2` surface; no config key, no change to `/new`, no behavior change for code/ui/security/perf/test resolutions).
+
+### Changed
+
+- **`framework/.claude/workflows/new2-resolve.js`** — `judgeVerify()` short-circuits for `domain === 'doc'` (accepts the first verified attempt without re-spawning), and the terminal-judge branch ratifies a doc terminal verdict directly instead of re-running `doc-reviewer`. `meta.description` updated to document the doc exception.
+
 ## [4.27.0] - 2026-06-11
 
 **`/prd`: the Obsidian back-reference is now two-phase — the spec note is linked the moment the PRD starts, not only at merge.** The end-of-run snippet (v4.22.0) is gated on a successful merge, so a PRD that is paused or abandoned mid-flow (e.g. parked at the UI-design step, worktree never merged) left its origin note empty — even though detection already happened at Step 1. Observed in the wild: a `realtime-order-collaboration` PRD stuck at `ui-design` never wrote back, while a completed sibling PRD did. **MINOR** (additive capability on `/prd`; reuses the existing slug-keyed markers + the snippet `status:` lifecycle field; no `baldart.config.yml` key, so the schema-change propagation rule does not apply).
package/VERSION
CHANGED
@@ -1 +1 @@
-4.27.0
+4.27.1
package/framework/.claude/workflows/new2-resolve.js
CHANGED
@@ -1,7 +1,7 @@
 export const meta = {
   name: 'new2-resolve',
   description:
-    "Self-healing resolution pass for the autonomous new2 batch workflow. Called by new2 whenever a deterministic gate would otherwise need a human: a card fail/blocker (ac-unmet | blocker | qa-fail | e2e-blocked | merge-blocker) or a legitimate scope-EXPANDING finding (scope-expansion). Tier-1 targeted fix with a TERMINAL short-circuit (skips the costly multi-attempt when the problem is impossible-by-definition, verified not trusted), then a judged multi-attempt; a MANDATORY adversarial judge cross-checks every verified claim against the real diff (prevents fabricated success). Specialized per domain (doc→doc-reviewer, ui→ui-expert, security→security-reviewer judge, perf→api-perf-cost-auditor judge). Terminal is a tracked follow-up. Accepts a `findings` list (batched per area). Uses agent()/parallel() only — no nested workflows.",
+    "Self-healing resolution pass for the autonomous new2 batch workflow. Called by new2 whenever a deterministic gate would otherwise need a human: a card fail/blocker (ac-unmet | blocker | qa-fail | e2e-blocked | merge-blocker) or a legitimate scope-EXPANDING finding (scope-expansion). Tier-1 targeted fix with a TERMINAL short-circuit (skips the costly multi-attempt when the problem is impossible-by-definition, verified not trusted), then a judged multi-attempt; a MANDATORY adversarial judge cross-checks every verified claim against the real diff (prevents fabricated success) for every domain EXCEPT doc — doc-reviewer is a reviewer-writer that owns its domain (applies + self-verifies in one pass), so a second adversarial doc-reviewer would just judge itself: wasteful, so the doc judge is skipped. Specialized per domain (doc→doc-reviewer single pass, ui→ui-expert, security→security-reviewer judge, perf→api-perf-cost-auditor judge). Terminal is a tracked follow-up. Accepts a `findings` list (batched per area). Uses agent()/parallel() only — no nested workflows.",
   phases: [
     { title: 'Diagnose', detail: 'classify + terminal short-circuit + scope-expansion boundary' },
     { title: 'Repair', detail: 'targeted fix, then judged multi-attempt if needed' },
@@ -197,6 +197,11 @@ if (attempt && attempt.terminal) {
 let confirmed = false
 if (tr === 'out-of-ownership') {
   confirmed = !filesInScope(attempt.remedyFiles) // genuinely terminal iff remedy files are NOT in MAY-EDIT
+} else if (domain === 'doc') {
+  // doc is fixed by doc-reviewer (reviewer-writer that owns its domain). Ratifying its
+  // terminal verdict with a SECOND doc-reviewer is doc-reviewer-judges-doc-reviewer —
+  // no cross-model diversity, pure waste. Trust the single pass; no adversarial re-run.
+  confirmed = true
 } else {
   // owner-gated / not-a-code-defect / baseline-not-reached — ratify with the judge.
   try {
@@ -260,6 +265,14 @@ return await materialiseFollowup(kind, (attempt && attempt.note) || 'unresolved
 // returns the files it independently confirmed changed; we cross-check ⊆ MAY-EDIT.
 async function judgeVerify(verifiedAttempts) {
   if (!verifiedAttempts.length) return { ok: false, best: 0 }
+  // doc-reviewer is a reviewer-writer that owns its domain: it applies the fix AND
+  // self-verifies in the same pass. A SECOND adversarial doc-reviewer cross-check is
+  // doc-reviewer-judges-doc-reviewer — no cross-model diversity, a pure waste of tokens
+  // and time. Trust the single pass; accept the first verified attempt without re-spawning.
+  if (domain === 'doc') {
+    const first = verifiedAttempts.find((v) => v.r && v.r.verified) || verifiedAttempts[0]
+    return { ok: true, best: first.i }
+  }
   let judge = null
   try {
     judge = await agentSafe(
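The two doc-domain short-circuits above can be exercised in isolation. The following is a minimal stand-alone sketch, not the workflow's actual code: `pickDocAttempt`, `confirmTerminal`, the `remedyInScope` flag, and the `ratifyWithJudge` callback are hypothetical names introduced here for illustration, and the real file reads `domain` from its enclosing scope and calls the judge asynchronously.

```javascript
// Hypothetical sketch of the doc-domain short-circuits shown in the diff.
// Only the { ok, best } return shape and the v.r.verified / v.i fields come
// from the diff; every other name is an assumption for illustration.

// Doc domain: accept the first verified attempt without spawning a judge.
// Falls back to the first attempt when none is marked verified, as in the diff.
function pickDocAttempt(verifiedAttempts) {
  if (!verifiedAttempts.length) return { ok: false, best: 0 }
  const first = verifiedAttempts.find((v) => v.r && v.r.verified) || verifiedAttempts[0]
  return { ok: true, best: first.i }
}

// Terminal verdict: out-of-ownership is decided by file scope, doc is trusted
// directly, and every other domain still goes through the judge.
// `ratifyWithJudge` is a hypothetical stand-in for the adversarial judge call.
function confirmTerminal({ domain, reason, remedyInScope }, ratifyWithJudge) {
  if (reason === 'out-of-ownership') return !remedyInScope
  if (domain === 'doc') return true // single reviewer-writer pass, no re-run
  return ratifyWithJudge()
}

const sample = [
  { i: 0, r: { verified: false } },
  { i: 2, r: { verified: true } },
]
console.log(pickDocAttempt(sample)) // { ok: true, best: 2 }
```

Note that the fallback to `verifiedAttempts[0]` means the doc branch reports `ok: true` even when no attempt carries `verified`, which mirrors the diff as written; whether callers only ever pass pre-verified attempts is not visible in these hunks.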