baldart 4.30.1 → 4.31.1

package/CHANGELOG.md CHANGED
@@ -5,6 +5,33 @@ All notable changes to BALDART will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [4.31.1] - 2026-06-11
+
+ **`new2`: fix v4.31.0's dedup coverage + drop empty residuals + honest A/B cost — verified against the real FEAT-0022 telemetry/report (not the narrative).** Reading the actual run's `skill-runs.jsonl` + workflow report (instead of the prior turn's recollection) exposed three things v4.31.0 missed:
+ - **The owner-gated dedup was too narrow** — it filtered by `deferralClass ∈ {owner-gated, not-a-code-defect}`, but the run's FIVE identical `db:push` residuals on card -01 carried kinds `ac-unmet`/`blocker`/`policy-deferred-ac`; the two `policy-deferred-ac` (an F-016 class that is ALWAYS an external/infra action) slipped through. Fix: dedup on `EXTERNAL_DEFERRAL = {owner-gated, not-a-code-defect, policy-deferred-ac}`. A second defect caught by a self-test before publishing: the action key split those 5 across `db-migration-deploy` and `migration:<file>` (the filename branch ran first), collapsing to 2 not 1 — **one `db:push` pushes all pending migrations, so the deploy intent must win over a bare filename**; reordered the key so every db-migration-deploy residual maps to one key. Verified on the real residuals: 7 db:push residuals (5 on -01, 1 on -02, 1 batch-level `F001`) → 2 (one per real card; `F001` dropped), matching the hand reconciliation.
+ - **Empty out-of-scope residuals** — the run emitted two `out-of-scope` residuals with no file/line/evidence (`":"`), which would mint contentless follow-up cards. `resolve()` now skips an out-of-scope finding whose composed evidence is empty after stripping `:`/space separators.
+ - **A/B cost telemetry was ~8× under** — the workflow reported `total_tokens=476k` (`budget.spent()`, output-only) / `agent_count=30` while the harness saw ~4.15M subagent tokens / 64 agents, because the workflow counters exclude the subagents spawned inside the nested workflows (`new2-resolve`, `new-final-review`). Since `total_tokens` was non-null, the skill's "backfill only if null" rule accepted the wrong figure — defeating new2's whole purpose (A/B on context economy). SKILL.md Step 5.5 now mandates **always** reading the real transcript `usage` via the `-stats` script as the headline cost, labelling the workflow figures as partial.
+
+ **PATCH** (corrects v4.31.0's dedup coverage + noise filter + telemetry honesty on the EXPERIMENTAL `new2` surface; no new capability, no config key). Meta: same lesson as the feature itself — the v4.31.0 dedup was written from the prior turn's recollection; reading the raw telemetry/report found the gap. Validate fixes against the data, not the memory of the data.
+
+ ### Changed
+
+ - **`framework/.claude/workflows/new2.js`** — `dedupOwnerGatedResiduals` now keys on `EXTERNAL_DEFERRAL` (adds `policy-deferred-ac`); `ownerGatedActionKey` reorders the db-migration-deploy branch ahead of the filename branch (one key per deploy action); `resolve()` skips empty `out-of-scope` residuals.
+ - **`framework/.claude/skills/new2/SKILL.md`** — Step 5.5 cost recording: always backfill real transcript `usage` (the workflow's `total_tokens`/`agent_count` are partial — output-only + exclude nested workflows); keep them labelled `*_workflow`.
+
+ ## [4.31.0] - 2026-06-11
+
+ **`new2`: the residual ledger self-corrects before returning — no more duplicate-of-done follow-ups, no more N defers for one external action.** A real `new2` run (FEAT-0022 epic, 3 cards) surfaced two over-report classes in the offline-safe residual ledger that the skill was absorbing **by hand** every run: (1) **4 of 8 follow-ups were false-open** — scope-expansion residuals deferred early in the batch but satisfied LATER by another card's commit / a final-review fix, which only `integrateCrossCard()` retracts; a residual closed by any *other* in-batch path stayed falsely-open (and left a best-effort, uncommitted follow-up YAML in the worktree). (2) **3 follow-ups for ONE physical action** — one migration's remote `db:push` re-raised per-card AND batch-wide by the final review → three near-identical owner-gated cards. Both were caught only by the skill's manual per-residual disk grep + consolidation (load-bearing, repeated every run). This release moves that work into the workflow, once, deterministically:
+ - **Ledger self-correction (`reconcileLedgerAgainstHead`)** — before returning, re-verify every still-pending code/doc residual (`scope-expansion`/`out-of-scope`/`unresolved`/`file-diff-violation`/`merge-artifact-skipped`/`out-of-ownership`, `materialized` true OR false) against the worktree HEAD via one read-only agent, and retract only the ones it can back with a `file:line` proof. **Conservative by design** — default KEEP: a false-open residual is recoverable downstream (the skill re-checks disk), but a wrong retract silently drops real work (F-029). Owner-gated / not-a-code-defect / policy-deferred are EXTERNAL actions a commit cannot close → never auto-retracted here. Telemetry `ledger_reconciled`.
+ - **Owner-gated dedup (`dedupOwnerGatedResiduals`)** — collapse owner-gated / not-a-code-defect residuals that share one action key (migration filename · `db:push`/`db:check-sync` · deploy · secret · DNS), keeping **one per distinct real card** (the skill marks each card DONE only after ITS follow-up exists) and dropping only batch-level duplicates (residual `card` is a finding id → no DONE-linkage); a batch-level residual with no matching per-card entry is a genuinely-new action and is kept. Telemetry `owner_gated_deduped`.
+
+ The skill's Step 5.1 disk reconciliation **still runs** — it is now a safety net over a pre-cleaned ledger (the self-correction is conservative; F-040 worktree-not-merged still applies), not the sole defence. Same recurring shape as prior `new2` fixes: the splice existed in ONE location (`integrateCrossCard`), the resolution-detection was needed in ALL paths that close a residual. **MINOR** (additive capability + observability on the EXPERIMENTAL `new2` surface; no behavior regression — the guards/policies are unchanged and the retract is proof-gated + conservative; no `baldart.config.yml` key, so the schema-change propagation rule does not apply; no change to `/new`).
+
+ ### Added
+
+ - **`framework/.claude/workflows/new2.js`** — `reconcileLedgerAgainstHead()` (new `Reconcile` phase, conservative proof-gated retract of already-satisfied residuals) + `dedupOwnerGatedResiduals()` (collapse duplicate external actions), both run right before `finalReturn`. New telemetry fields `ledger_reconciled` + `owner_gated_deduped`; new counters wired through `buildTelemetry`.
+ - **`framework/.claude/skills/new2/SKILL.md`** — documents that `residuals[]` now arrives pre-cleaned (Step 5.1 reframed as a safety net) and records the two new telemetry fields in the A/B step.
+
  ## [4.30.1] - 2026-06-11
 
  **`new2`: stop spending opus on mechanical ops steps — explicit per-step model overrides.** Three `general-purpose` agents had no `model:` override, so they inherited the session's main-loop model (opus) for work that needs none. The Merge step is a deterministic OPS/GIT executor (git merge + YAML status reconciliation + grep-based epic closure + leave-and-report hygiene gates) whose correctness-critical checks (F-029 forcedDone guard, F-040 deferred guard) are enforced in JS AFTER it returns — independent of the agent's reasoning → **sonnet**. The per-card Codex review agent is a pure DRIVER (runs the companion, strips `[codex]` traces, maps findings) — the review intelligence is Codex, run externally → **haiku**. The post-merge Production Readiness checklist is non-blocking report-not-execute → **sonnet**. The Pre-flight agent (DAG + ownership map + idempotency — it grounds the whole batch) intentionally **stays opus**. **PATCH** (cost optimization on the EXPERIMENTAL `new2` surface; no behavior change — the deterministic guards/policies are unchanged; no config key).
package/VERSION CHANGED
@@ -1 +1 @@
- 4.30.1
+ 4.31.1
package/framework/.claude/skills/new2/SKILL.md CHANGED
@@ -159,6 +159,14 @@ returns when the batch is done. It returns:
  flag is **advisory only** — `true` means the workflow *attempted* a write (possibly
  into a worktree that never merged), not that a card exists on disk in the main repo.
  **You (the skill) must reconcile EVERY residual against the main-repo disk** (Step 5.1).
+ **v4.31.0 — the workflow now self-corrects this ledger before returning**: it re-verifies
+ every still-pending code/doc residual against the worktree HEAD and **retracts the ones a
+ later in-batch commit already satisfied** (telemetry `ledger_reconciled`), and **collapses
+ duplicate owner-gated residuals that map to one external action** (e.g. one migration's
+ `db:push` re-raised per-card + batch-wide → telemetry `owner_gated_deduped`). So `residuals[]`
+ arrives pre-cleaned of false-open and duplicate entries; your Step 5.1 disk reconciliation is
+ now a **safety net** over a clean ledger (it still runs — the self-correction is conservative
+ and F-040 worktree-not-merged still applies), not the sole defence.
  - `degraded` / `degradationReasons` — the batch stopped early under a sustained
  outage (or another degradation). The batch is NOT complete; it must be resumed.
  - `telemetry` — the Phase-8 record (`variant:"new2"`).
@@ -220,9 +228,16 @@ returns when the batch is done. It returns:
  (`git -C $MAIN log --oneline ${trunk} | grep <card>`); annotate any divergence and never present
  progress the disk does not show. Then fill `wall_clock_s` (now − kickoff `ts`) and
  `followups_on_disk` (count the actual follow-up files on disk in the main repo, NOT
- `residualFollowups.length` — which double-counts). `total_tokens`/`agent_count` come from the
- workflow; if `total_tokens` is null, run the `/new` Phase-8 `-stats` script to backfill real
- `usage`. Keep `degraded`/`degradation_reasons` + `cards_deferred_done_pending` in the record so
+ `residualFollowups.length` — which double-counts). **Cost (the A/B's whole point) — do NOT trust
+ the workflow's `total_tokens`/`agent_count` as the headline figure.** They are **partial**:
+ `total_tokens` is `budget.spent()` output-only AND the agent counters do not include the subagents
+ spawned inside the nested workflows (`new2-resolve`, `new-final-review`) — on the real FEAT-0022 run
+ the workflow reported `total_tokens=476k` / `agent_count=30` while the harness saw ~4.15M subagent
+ tokens / 64 agents (≈8× under). So **always** run the `/new` Phase-8 `-stats` script to read the
+ real transcript `usage` (not only when `total_tokens` is null), record it as the headline cost, and
+ keep the workflow's figures clearly labelled as `total_tokens_workflow`/`agent_count_workflow`
+ (partial, output-only, excludes nested workflows). Keep `degraded`/`degradation_reasons` +
+ `cards_deferred_done_pending` in the record so
  the A/B comparison stays honest. Also record `migration_gate: <migration.status>`
  (`none`|`applied`|`skipped`|`degraded`) — the Step-3.5 gate is a pre-launch interaction, NOT a
  mid-batch question, so it does not break the zero-ask-during-batch invariant; logging it keeps the
@@ -234,4 +249,9 @@ returns when the batch is done. It returns:
  too (count of residuals the pre-final-review Cross-Card Integration Pass implemented in-batch —
  out-of-ownership-within-batch + outage retries — instead of leaving as follow-ups); with
  `deferral_breakdown` it shows how many deferrals were genuinely undeferrable vs absorbed in-batch.
+ Keep `ledger_reconciled` + `owner_gated_deduped` (v4.31.0) too — they quantify the ledger
+ self-correction: `ledger_reconciled` > 0 means the workflow retracted residuals a later commit had
+ already satisfied (work the skill used to suppress by hand; a persistently high value signals
+ deferrals resolving too late — order the dependent card earlier), and `owner_gated_deduped` > 0
+ means N defers were collapsed to one external action.
  Do NOT re-summarise the cards — the workflow already did.
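Reviewer sketch for the Step 5.5 cost rule above: one way the A/B record could carry both figures, with the transcript `usage` as the headline and the workflow counters under the `*_workflow` labels. The `buildCostRecord` helper, the shape of its two arguments and the `undercount_ratio` field are assumptions made for illustration; only the names `total_tokens`, `agent_count`, `total_tokens_workflow` and `agent_count_workflow` come from the diff, and the numbers are the rounded FEAT-0022 figures it quotes.

```javascript
// Hypothetical helper, not part of baldart. Input shapes are assumed.
function buildCostRecord(workflow, transcript) {
  const ratio = workflow.total_tokens
    ? Math.round((transcript.totalTokens / workflow.total_tokens) * 10) / 10
    : null // no workflow figure to compare against
  return {
    // Headline cost always comes from the real transcript usage.
    total_tokens: transcript.totalTokens,
    agent_count: transcript.agentCount,
    // Workflow counters are kept, but labelled as partial.
    total_tokens_workflow: workflow.total_tokens,
    agent_count_workflow: workflow.agent_count,
    // Illustrative extra field: how far the workflow figure undercounts.
    undercount_ratio: ratio,
  }
}

const record = buildCostRecord(
  { total_tokens: 476000, agent_count: 30 },
  { totalTokens: 4150000, agentCount: 64 },
)
console.log(record.undercount_ratio) // 8.7
```

With these inputs the ratio lands near the "≈8× under" quoted in the hunk, and a non-null workflow `total_tokens` no longer decides whether the transcript figure is read.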
package/framework/.claude/workflows/new2.js CHANGED
@@ -9,6 +9,7 @@ export const meta = {
  { title: 'Final', detail: 'cross-batch final review (delegates to new-final-review)' },
  { title: 'Merge', detail: 'integrity-gated auto-merge to trunk via git.merge_strategy + cleanup' },
  { title: 'Production', detail: 'post-merge production-readiness checklist (Phase 7, non-blocking)' },
+ { title: 'Reconcile', detail: 'ledger self-correction: retract residuals already satisfied in HEAD + dedup owner-gated external actions (v4.31.0)' },
  ],
  }
 
@@ -71,6 +72,8 @@ const cardMayEdit = {} // v4.30.0 — per-card MAY-EDIT, for the cross-car
  const perCardResults = []
  let prodReadiness = null
  let integratedCount = 0 // v4.30.0 — residuals resolved by the Cross-Card Integration Pass
+ let ledgerReconciled = 0 // v4.31.0 — residuals retracted because already-satisfied in HEAD (resolved by a later commit, not by integrateCrossCard)
+ let ownerGatedDeduped = 0 // v4.31.0 — duplicate owner-gated residuals collapsed (N defers → 1 external action)
  let degraded = false
  const degradationReasons = []
 
@@ -386,9 +389,14 @@ async function resolve(kind, card, evidence, extra) {
  // domain and decide whether an out-of-ownership remedy lands inside the batch union.
  residuals.push({ card, kind, evidence, materialized: !!fc, deferralClass, domain: dom, remedyFiles: (res && res.remedyFiles) || [] })
  }
- // F-022 — route out-of-scope findings the resolve surfaced.
+ // F-022 — route out-of-scope findings the resolve surfaced. v4.31.1 — a finding with no file,
+ // no line AND no evidence is noise (the resolve emitted an empty placeholder); pushing it would
+ // mint a contentless follow-up card. Skip when the composed evidence is empty after stripping the
+ // `:`/space separators — a real finding always carries at least a file, a line, or a description.
  for (const osf of (res && res.outOfScopeFindings) || []) {
- residuals.push({ card, kind: 'out-of-scope', evidence: `${osf.file || ''}:${osf.line || ''} ${osf.evidence || ''}`, materialized: false })
+ const ev = `${osf.file || ''}:${osf.line || ''} ${osf.evidence || ''}`
+ if (!ev.replace(/[:\s]/g, '')) continue
+ residuals.push({ card, kind: 'out-of-scope', evidence: ev.trim(), materialized: false })
  }
  ledger(card, 'resolve:' + kind, status, (res && (res.followupCard || res.reason)) || '')
  return { status, deferralClass }
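Reviewer sketch of the empty-residual guard added in this hunk, pulled out as a standalone function so its edge cases are easy to see. The template string and the `/[:\s]/g` strip are copied from the hunk; the name `composeEvidence`, the `null` return convention and the sample findings are hypothetical.

```javascript
// Hypothetical wrapper around the v4.31.1 guard; returns null for a finding to skip.
function composeEvidence(osf) {
  const ev = `${osf.file || ''}:${osf.line || ''} ${osf.evidence || ''}`
  // Nothing left once the separators are stripped: the finding was an empty placeholder.
  if (!ev.replace(/[:\s]/g, '')) return null
  return ev.trim()
}

console.log(composeEvidence({}))
console.log(composeEvidence({ file: 'src/db.ts', line: 12, evidence: 'missing index' }))
console.log(composeEvidence({ evidence: 'description only' }))
```

The first call yields `null` (the `":"` placeholder case from the changelog), the second yields `src/db.ts:12 missing index`, and the third yields `: description only`, since `trim()` removes surrounding whitespace but leaves the leading colon in place.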
@@ -1062,6 +1070,123 @@ if (mergeResult && mergeResult.merged) {
  ledger(firstCard, 'phase7-production', 'SKIPPED', 'not merged')
  }
 
+ // ───────────────────────────────────────────────────────────────────────────
+ // Ledger self-correction (v4.31.0) — a residual deferred earlier in the batch can be satisfied
+ // LATER by another card's commit, the Cross-Card Integration Pass, or the final-review resolve.
+ // Only integrateCrossCard() splices ITS OWN resolutions; a residual closed by any OTHER in-batch
+ // path stays falsely-open in the offline-safe ledger (and materialiseFollowup left a best-effort,
+ // uncommitted follow-up YAML in the worktree). The skill then has to grep main-repo disk per
+ // residual to suppress the false-open ones — load-bearing, manual, repeated every run. Move that
+ // verification HERE, once, deterministically: re-check every still-pending code/doc residual
+ // against the worktree HEAD and retract only the ones an agent can PROVE are already satisfied.
+ // CONSERVATIVE: default KEEP — a false-open residual is recoverable downstream (the skill re-checks
+ // disk), but a wrong retract silently drops real work (F-029). Owner-gated / not-a-code-defect /
+ // policy-deferred are EXTERNAL actions a commit cannot close → never auto-retracted here.
+ // ───────────────────────────────────────────────────────────────────────────
+ const VERIFIABLE_RETRACT = new Set(['scope-expansion', 'out-of-scope', 'unresolved', 'file-diff-violation', 'merge-artifact-skipped', 'out-of-ownership'])
+ async function reconcileLedgerAgainstHead() {
+ if (degraded || !sharedCtx || !sharedCtx.worktreePath) return
+ // materialized:true is NOT proof a residual is correctly-open — its best-effort worktree YAML may
+ // be for work since closed, which would mint a duplicate-of-done card in the main repo. So verify
+ // BOTH materialized true and false, restricted to classes a later commit could actually close.
+ const candidates = residuals.filter((r) => !r.integrated && VERIFIABLE_RETRACT.has(r.deferralClass || r.kind))
+ if (!candidates.length) return
+ phase('Reconcile')
+ let verdict = null
+ try {
+ verdict = await agentSafe(
+ `Read-only verification (you are general-purpose ops; ROLE BOUNDARY: git read commands only — you NEVER edit, write, or delete any file). cd into the worktree ${sharedCtx.worktreePath}; its HEAD holds every committed card of the batch.\n\n` +
+ `For each residual below, decide whether the gap it describes is ALREADY SATISFIED in the worktree HEAD — a later card's commit, a cross-card fix, or a final-review fix may have closed it after it was deferred. Inspect the actual code/docs/tests in HEAD; do NOT trust the residual text.\n` +
+ `CONSERVATIVE CONTRACT: set resolved only for indices you can back with the exact file:line in HEAD that satisfies the gap. If you are not certain, OMIT the index (a still-open residual is safely re-checked downstream; a wrong 'resolved' silently drops real work).\n\n` +
+ `Residuals (index · card · class · evidence):\n` +
+ candidates.map((r, i) => ` [${i}] ${r.card} · ${r.deferralClass || r.kind} · ${r.evidence}`).join('\n') +
+ `\n\nReturn: { resolved: [ { index, proof } ] } — only indices already satisfied in HEAD, each with a file:line proof.`,
+ { label: 'ledger-reconcile', phase: 'Reconcile', agentType: 'general-purpose', model: 'sonnet',
+ schema: { type: 'object', required: ['resolved'], additionalProperties: true, properties: { resolved: { type: 'array', items: { type: 'object', required: ['index'], additionalProperties: true, properties: { index: { type: 'number' }, proof: { type: 'string' } } } } } } }
+ )
+ } catch (e) { if (e && e.transientExhausted) noteDegraded('outage'); return }
+ const satisfied = ((verdict && verdict.resolved) || []).map((x) => x.index).filter((i) => Number.isInteger(i) && i >= 0 && i < candidates.length)
+ if (!satisfied.length) { ledger(firstCard, 'ledger-reconcile', 'CLEAN', `${candidates.length} pending residual(s); none verifiably already-satisfied in HEAD`); return }
+ const retracted = []
+ for (const i of satisfied) {
+ const r = candidates[i]
+ const idx = residuals.indexOf(r)
+ if (idx < 0) continue // already spliced (duplicate index)
+ residuals.splice(idx, 1)
+ for (let j = residualFollowups.length - 1; j >= 0; j--) {
+ if (residualFollowups[j].card === r.card && residualFollowups[j].kind === r.kind) residualFollowups.splice(j, 1)
+ }
+ retracted.push(`${r.card}/${r.deferralClass || r.kind}`)
+ ledgerReconciled++
+ }
+ ledger(firstCard, 'ledger-reconcile', 'RETRACTED', `${retracted.length} residual(s) already-satisfied in HEAD → retracted (no duplicate-of-done card): ${retracted.join(', ')}`)
+ }
+
+ // ───────────────────────────────────────────────────────────────────────────
+ // Owner-gated dedup (v4.31.0) — several deferred ACs across the batch can map to ONE physical
+ // external action (e.g. one migration's remote `db:push` re-raised per-card AND batch-wide by the
+ // final review). Minting N follow-ups for one action is redundant. Collapse owner-gated /
+ // not-a-code-defect residuals that share an action key — but KEEP one per distinct REAL card (the
+ // skill marks each card DONE only after ITS follow-up exists) and drop only batch-level duplicates
+ // (residual.card is a finding id, not a backlog card → no DONE-linkage to preserve). A batch-level
+ // owner-gated residual with NO matching real-card entry is a genuinely-new action → kept untouched.
+ // ───────────────────────────────────────────────────────────────────────────
+ function ownerGatedActionKey(r) {
+ const hay = `${r.evidence || ''} ${(r.remedyFiles || []).join(' ')}`
+ // ORDER MATTERS — the migration-deploy intent wins over a bare filename match. One `db:push`
+ // pushes ALL pending migrations in a single action, so every "migration not deployed" / db:check-
+ // -sync / db:push residual must collapse onto ONE key regardless of whether it also names the .sql
+ // file (the real FEAT-0022 run split 5 identical db:push residuals across `db-migration-deploy`
+ // and `migration:<file>` because the filename branch ran first → only the deploy intent is canon).
+ if (/\bdb:push\b|\bdb:check-sync\b|remote db push|migration[^.]*(deploy|remote|push|not[_ ]?deployed)/i.test(hay)) return 'db-migration-deploy'
+ const mig = hay.match(/(\d{14}_[a-z0-9_]+\.sql)/i)
+ if (mig) return 'migration:' + mig[1].toLowerCase()
+ if (/\bdeploy(ment)?\b/i.test(hay)) return 'deploy'
+ if (/\bsecret\b/i.test(hay)) return 'secret'
+ if (/\bDNS\b|\bdomain\b/i.test(hay)) return 'dns'
+ return null // unknown action → never dedup (avoid collapsing two genuinely-distinct externals)
+ }
+ // classes that are ALWAYS an external/infra action (never code a commit can close) — the only ones
+ // safe to collapse by shared action key. `policy-deferred-ac` belongs here (F-016: an AC whose
+ // remedy is out-of-ownership or an owner-gated infra step) — the real FEAT-0022 run proved a single
+ // `db:push` surfaced as owner-gated AND policy-deferred-ac on the same card, so excluding the latter
+ // left duplicates uncollapsed. ac-unmet/blocker/merge-blocker are NOT here: they may be code defects,
+ // and their EXTERNAL instances are already reclassified to `owner-gated` by the F-040 classifier.
+ const EXTERNAL_DEFERRAL = new Set(['owner-gated', 'not-a-code-defect', 'policy-deferred-ac'])
+ function dedupOwnerGatedResiduals() {
+ const realCard = new Set(cardIds)
+ const groups = {}
+ for (const r of residuals) {
+ if (!EXTERNAL_DEFERRAL.has(r.deferralClass)) continue
+ const k = ownerGatedActionKey(r)
+ if (!k) continue
+ ;(groups[k] = groups[k] || []).push(r)
+ }
+ for (const k of Object.keys(groups)) {
+ const g = groups[k]
+ if (g.length < 2) continue
+ if (!g.some((r) => realCard.has(r.card))) continue // all batch-level (new action) — keep them
+ const seenCard = new Set()
+ for (const r of g) {
+ const isReal = realCard.has(r.card)
+ const isDup = isReal ? seenCard.has(r.card) : true // keep one per real card; every batch-level entry is a dup of the per-card tracking
+ if (isReal) seenCard.add(r.card)
+ if (!isDup) continue
+ const idx = residuals.indexOf(r)
+ if (idx < 0) continue
+ residuals.splice(idx, 1)
+ for (let j = residualFollowups.length - 1; j >= 0; j--) {
+ if (residualFollowups[j].card === r.card && residualFollowups[j].kind === r.kind) residualFollowups.splice(j, 1)
+ }
+ ownerGatedDeduped++
+ }
+ }
+ if (ownerGatedDeduped) ledger(firstCard, 'owner-gated-dedup', 'COLLAPSED', `${ownerGatedDeduped} duplicate owner-gated residual(s) collapsed (multiple defers → one external action)`)
+ }
+
+ await reconcileLedgerAgainstHead()
+ dedupOwnerGatedResiduals()
+
  return finalReturn({ fatal: false })
 
  // ───────────────────────────────────────────────────────────────────────────
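Reviewer sketch of the two dedup rules this hunk introduces: the deploy-intent branch is tested before the filename branch, and within a shared action key one residual per real card survives while batch-level entries are dropped. The regexes and the `EXTERNAL_DEFERRAL` set are copied from the hunk; `actionKey`, `dedupExternal`, the card ids, the migration filename and the evidence strings are hypothetical sample data. The sketch omits the deploy/secret/DNS branches and returns a filtered copy instead of splicing `residuals` in place as the workflow does.

```javascript
// Sketch only. Card ids, filenames and evidence strings are made up for illustration.
const EXTERNAL_DEFERRAL = new Set(['owner-gated', 'not-a-code-defect', 'policy-deferred-ac'])

function actionKey(r) {
  const hay = `${r.evidence || ''} ${(r.remedyFiles || []).join(' ')}`
  // Deploy intent is tested first, so a residual that also names a .sql file
  // still lands on the single deploy key.
  if (/\bdb:push\b|\bdb:check-sync\b|remote db push|migration[^.]*(deploy|remote|push|not[_ ]?deployed)/i.test(hay)) return 'db-migration-deploy'
  const mig = hay.match(/(\d{14}_[a-z0-9_]+\.sql)/i)
  if (mig) return 'migration:' + mig[1].toLowerCase()
  return null // unknown action: never collapsed
}

function dedupExternal(residuals, cardIds) {
  const realCard = new Set(cardIds)
  const groups = new Map()
  for (const r of residuals) {
    if (!EXTERNAL_DEFERRAL.has(r.deferralClass)) continue
    const k = actionKey(r)
    if (!k) continue
    if (!groups.has(k)) groups.set(k, [])
    groups.get(k).push(r)
  }
  const drop = new Set()
  for (const g of groups.values()) {
    if (g.length < 2) continue
    if (!g.some((r) => realCard.has(r.card))) continue // all batch-level: keep them
    const seen = new Set()
    for (const r of g) {
      // The first entry of each real card survives; everything else is a duplicate.
      if (realCard.has(r.card) && !seen.has(r.card)) { seen.add(r.card); continue }
      drop.add(r)
    }
  }
  return residuals.filter((r) => !drop.has(r))
}

const mk = (card, deferralClass, evidence) => ({ card, deferralClass, evidence })
const sample = [
  mk('CARD-01', 'owner-gated', 'db:push pending for 20260611120000_add_table.sql'),
  mk('CARD-01', 'policy-deferred-ac', 'migration not deployed: 20260611120000_add_table.sql'),
  mk('CARD-01', 'policy-deferred-ac', 'remote db push required'),
  mk('CARD-01', 'owner-gated', 'db:check-sync reports drift'),
  mk('CARD-01', 'owner-gated', 'db:push not run on remote'),
  mk('CARD-02', 'owner-gated', 'db:push pending'),
  mk('F001', 'owner-gated', 'db:push pending (batch-level)'),
]
const kept = dedupExternal(sample, ['CARD-01', 'CARD-02'])
console.log(kept.map((r) => r.card))
```

The sample mirrors the 5 + 1 + 1 shape described in the changelog: seven residuals collapse to two, one per real card, and the batch-level `F001` entry is dropped.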
@@ -1105,6 +1230,13 @@ function buildTelemetry() {
  // v4.30.0 — residuals the Cross-Card Integration Pass implemented in-batch (out-of-ownership
  // within the batch union + outage retries) instead of leaving as follow-ups to manage later.
  cross_card_integrated: integratedCount,
+ // v4.31.0 — residuals retracted because a later in-batch commit already satisfied them (verified
+ // against the worktree HEAD, conservative). A non-zero value means the ledger self-corrected what
+ // the skill used to suppress by hand; a persistently high value signals deferrals that resolve too
+ // late (consider ordering the dependent card earlier).
+ ledger_reconciled: ledgerReconciled,
+ // v4.31.0 — duplicate owner-gated residuals collapsed to one external action (e.g. one db:push).
+ owner_gated_deduped: ownerGatedDeduped,
  // followups_on_disk is filled by the SKILL after it materialises pending residuals.
  followups_materialized_in_workflow: residuals.filter((x) => x.materialized).length,
  resolve_invocations: resolvedSignatures.size,
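Reviewer note on `ledger_reconciled`: the changelog describes the retract as proof-gated, while the index filter in `reconcileLedgerAgainstHead()` validates only `index` (the schema lists `proof` as an optional string). A stricter variant, sketched here as an assumption and not as the package's behaviour, would count an index only when it also carries a `file:line`-shaped proof. The name `acceptedIndices`, the `FILE_LINE` pattern and the sample verdict are hypothetical.

```javascript
// Hypothetical stricter filter; the workflow code in the diff checks `index` alone.
const FILE_LINE = /\S+:\d+/

function acceptedIndices(verdict, candidateCount) {
  const seen = new Set()
  for (const x of (verdict && verdict.resolved) || []) {
    if (!x || !Number.isInteger(x.index) || x.index < 0 || x.index >= candidateCount) continue
    // No file:line proof means the residual stays open (default KEEP).
    if (typeof x.proof !== 'string' || !FILE_LINE.test(x.proof)) continue
    seen.add(x.index) // a Set also absorbs duplicate indices
  }
  return [...seen].sort((a, b) => a - b)
}

const verdict = { resolved: [
  { index: 0, proof: 'src/api/handler.ts:42 implements the guard' },
  { index: 1 },                                 // no proof
  { index: 7, proof: 'src/x.ts:1' },            // out of range for 3 candidates
  { index: 0, proof: 'src/api/handler.ts:42' }, // duplicate index
  { index: 2.5, proof: 'src/y.ts:3' },          // not an integer
] }
console.log(acceptedIndices(verdict, 3))
```

Only index 0 is accepted here; the other four entries fall back to the conservative default of leaving the residual open for the skill's Step 5.1 disk check.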
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "baldart",
- "version": "4.30.1",
+ "version": "4.31.1",
  "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
  "bin": {
  "baldart": "./bin/baldart.js"