baldart 4.23.0 → 4.24.1

package/CHANGELOG.md CHANGED
@@ -5,6 +5,34 @@ All notable changes to BALDART will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [4.24.1] - 2026-06-10
9
+
10
+ **`new2`: an owner-gated gate no longer destroys a completed card (silent work-loss → commit + defer).** A real `/new2` run on a schema-change card produced **zero output** despite 52 min of work: the card's only obstacle was an *owner-gated* step (`db:check-sync` needs an approved remote `db:push`). That step was correctly `policy-deferred` up front — but the **same** condition was *also* re-raised by a reviewer as a fresh `MIGRATION_NOT_DEPLOYED` blocker, which (via the review-block branch's `s !== 'resolved' → cardBlocked`) triggered `rollbackCard`'s `git clean -fd` and **erased the completed migration**. Compounding it: the `E4-file-diff` gate logged `AUTO-REVERTED` while reverting *nothing* (leaving the card's DoD-mandated ADR/ER-doc edits orphaned in the worktree — its MAY-EDIT map was narrower than the DoD), and the residual follow-up was written *inside the worktree* and marked `materialized:true` without disk proof, so it vanished when the batch didn't merge. Root cause confirmed via the gate ledger + on-disk state + two rounds of adversarial review (the obvious "E4 reverted it" diagnosis was **wrong** — E4 was a no-op; `rollbackCard` was the eraser). **PATCH** (bug-fix to the EXPERIMENTAL `new2` skill + its workflows; **no `baldart.config.yml` key** and **no change to shared `/new` prose** — the DONE-deferral is handled entirely inside `new2`'s own merge prompt + skill, so `/new` interactive is provably unaffected; the schema-change propagation rule does not apply).
11
+
12
+ ### Fixed
13
+
14
+ - **`framework/.claude/workflows/new2.js` + `new2-resolve.js` — classification-based card-block (the primary fix, F-040).** `resolve()` now propagates a structured `deferralClass` from `new2-resolve`'s terminal-judge. A review-block that resolves to an **owner-gated / not-a-code-defect** deferral no longer sets `cardBlocked` → the card's *complete* code is committed (it is **not** rolled back); only a genuine unresolved **code** defect (or `out-of-ownership`/`baseline`/`outage`) still blocks + rolls back. A deliberately-broken migration (`db:reset` failing) is still classified code-defect → blocks, so this is not an over-match escape hatch.
15
+ - **`new2.js` — `E4-file-diff` is honest.** The old `AUTO-REVERTED` log reverted nothing. The owner agent now reconciles out-of-ownership edits itself (per `implement.md §11b`) and reports `revertedOutOfOwnership`; the gate logs `REVERTED` / `FLAGGED` to match reality, and an unresolved violation becomes a tracked residual (never silent, never orphaned).
16
+ - **`new2.js` pre-flight — ownership map ⊇ DoD (root cause of the E4 false positive).** A card's MAY-EDIT now = `files_likely_touched` ∪ paths **named explicitly** in its `acceptance_criteria`/`definition_of_done` (the ADR/data-model/ER doc a schema-change must touch), so editing a DoD-mandated doc is no longer a file-diff violation.
17
+ - **`new2.js` + `new2/SKILL.md` — DONE deferred to the skill, gated on the follow-up existing on disk (closes the F-029 false-DONE the review surfaced).** A card carrying an open owner-gated/policy-deferred AC commits its code but stays **NON-DONE**; the merge agent is told to leave those cards non-DONE; the **skill** marks them DONE post-run **only after** verifying the deferral's follow-up exists on disk in the main repo (fail-loud otherwise — never DONE with a dropped requirement).
18
+ - **`new2-resolve.js` + `new2/SKILL.md` — follow-ups are reliable.** Follow-up materialisation is best-effort inside the workflow (it rides the merge if the batch merges); the **skill is the SSOT**, verifying every residual against the **main-repo** disk and creating any missing follow-up there — so a non-merged batch never loses one. `materialized` is now advisory only.
19
+ - **`new2.js` (Fix G) — AC-deferral dedup is text-drift-proof.** The policy-deferred-AC key is now scoped to the AC *number* (`acSig`), so a deferred AC is no longer re-routed to `resolve()` a second time when the pre-flight and implement agents word the AC text slightly differently.
20
+ - **`new2/SKILL.md` — telemetry reconciled against disk.** Before recording, the skill verifies each `committed` card actually has a commit on trunk and never presents progress the disk does not show; adds `cards_deferred_done_pending` so the A/B record distinguishes "code landed" from "DONE".
21
+
22
+ ## [4.24.0] - 2026-06-10
23
+
24
+ **Atomic backlog-ID allocator — no FEAT/BUG collisions across parallel worktrees.** When several `/prd` (or `/new`/`new2` follow-up) sessions run in parallel on sibling worktrees, each branched from the same trunk, the old `max(^id: FEAT-) + 1` scan made them all land on the **same next integer**: the other session's card was in flight on an unmerged sibling branch, invisible to both the local backlog and the trunk merge-base — so two epics both became `FEAT-0024` and conflicted at rebase/merge. The `git fetch` + merge-base scan only ever covered *already-merged* IDs, never in-flight ones. A new allocator anchors a lock + per-prefix high-water mark in `$MAIN/.worktrees/` (the shared coordination point every worktree already reaches via `git rev-parse --git-common-dir`, gitignored like `registry.json`), so a reservation is atomic across every worktree **on the same machine**. The high-water mark bumped under the lock is the correctness anchor; `max()` against the real backlog + sibling-worktree backlogs + reservations log + trunk merge-base makes it **self-healing**. **MINOR** (additive capability on the `worktree-manager` skill; opt-in — callers fall back to the inline merge-base scan + `[ID-RACE-RISK]` note when the script is absent, so older installs and cross-machine cloud agents are unaffected. **No `baldart.config.yml` key** — the allocator reuses `paths.backlog_dir` + the gitignored `.worktrees/` convention, so the schema-change propagation rule does not apply).
25
+
26
+ ### Added
27
+
28
+ - **`framework/.claude/skills/worktree-manager/scripts/allocate-id.sh`** — prefix-parametric (`FEAT`/`BUG`/`UI`/`DOC`/`PERF`/…) atomic ID allocator. `reserve <PREFIX> <slug>` prints the next free integer zero-padded to 4 digits; `release <worktree-path>` prunes a finished worktree's reservations. Cross-process mutex via atomic `mkdir` (stale-stolen after 30s via the lock dir's own mtime — a directory, not a git ref, so it never touches the shared `refs/stash` that `git stash` in worktrees is forbidden over). The slow `git fetch` runs **before** the lock, keeping the critical section local-FS-only (sub-second). Reads `paths.backlog_dir` + `git.trunk_branch` from `baldart.config.yml` and resolves `$MAIN` at runtime — no hardcoded project facts (passes the framework-edit-gate). Same-machine scope; the trunk merge-base scan is the best-effort cross-machine guard.
29
+
30
+ ### Changed
31
+
32
+ - **`framework/.claude/skills/worktree-manager/SKILL.md`** — new section **"ID Allocation"** documenting the allocator, the shared `.id-alloc.lock/` / `.id-hwm-<PREFIX>` / `id-reservations.jsonl` files, the same-machine scope + opt-in fallback contract, and the gap-tolerant monotonic-counter design. `release` is wired into the cleanup paths of `/mw` (step 7), `mw-docs` (step 6), and `/cw` (step 4).
33
+ - **`framework/.claude/agents/prd-card-writer.md`** — § "FEAT-XXXX numbering" + Pre-Generation Checklist item 1 now reserve the integer via the allocator (primary path), with the trunk fetch + merge-base scan + `[ID-RACE-RISK]` note as the documented fallback when the script is absent. Applies to any prefix the writer mints.
34
+ - **`framework/.claude/agents/prd.md`** — § 4.1 NAMING CONVENTIONS prefers the allocator (any prefix) over the plain backlog scan, with the same documented fallback.
35
+
8
36
  ## [4.23.0] - 2026-06-09
9
37
 
10
38
  **Functional Traceability Gate — no orphan UI affordances from mockups.** Handed-off mockups (Claude Design / Figma / the internal generators) routinely add interactive-looking chrome — buttons, icons, menu entries — that no requirement backed. Implementing a mockup **1:1 for fidelity** materialised that chrome into real markup + dead handlers + unused icon imports, and the cruft propagated forever. A new **unconditional** gate (independent of `features.has_design_system`) now requires every interactive or iconographic artifact to trace to a function before it becomes code: each artifact is classified **backed** (implement) / **decorative** (static-render, `aria-hidden`) / **orphan** (drop or escalate — never silent). The oracle is the PRD **UI Element Inventory `function_ref`** allowlist → card AC → orphan. Enforced at three boundaries: PRD mockup-intake (orphans become BLOCKING Discovery items), `ui-expert` implementation-time (BLOCKING pre-work), and `code-reviewer` / `/design-review` per-merge (`UI_ORPHAN_AFFORDANCE` HIGH finding). The `visual-fidelity-verifier` gains a `resolved_orphans` carve-out so a deliberately-dropped orphan is *expected-absent*, not a `component-missing` false positive. **MINOR** (additive discipline across existing UI surfaces; no new agent/skill/command and **no `baldart.config.yml` key** — the gate is registry-independent and reads its allowlist from the PRD inventory, so the schema-change propagation rule does not apply).
package/VERSION CHANGED
@@ -1 +1 @@
1
- 4.23.0
1
+ 4.24.1
@@ -300,18 +300,36 @@ applies even when N=1.
300
300
 
301
301
  ### FEAT-XXXX numbering
302
302
 
303
- - Pick the next free `FEAT-XXXX` integer (Grep `^id: FEAT-` in `${paths.backlog_dir}/`, find max,
304
- add 1). **Reserve the entire integer for this epic**; do NOT reuse the integer
305
- for unrelated cards even if it has fewer than ~99 children.
306
- - **Scan the canonical backlog, not just the worktree branch.** The worktree
307
- branched from the trunk may not contain IDs committed on sibling branches not
308
- yet merged. Before computing max, run
309
- `git fetch <git.trunk_branch>` and grep `${paths.backlog_dir}/` on the merge-base of the
310
- trunk (`git grep '^id: FEAT-' $(git merge-base HEAD <git.trunk_branch>)` plus
311
- the local worktree) so concurrent PRD sessions don't both land on the same
312
- "next free" integer. If a `git fetch` is not possible (offline), state the
313
- reservation is best-effort against the local view and surface a
314
- `[ID-RACE-RISK]` note so the caller can re-verify before commit.
303
+ - Pick the next free `FEAT-XXXX` integer. **Reserve the entire integer for this
304
+ epic**; do NOT reuse the integer for unrelated cards even if it has fewer than
305
+ ~99 children.
306
+ - **Use the atomic allocator when present (primary path).** Plain `max + 1` is
307
+ unsafe under parallelism: concurrent PRD sessions on sibling worktrees, all
308
+ branched from the same trunk, see the same max (the other session's card is in
309
+ flight on an unmerged branch) and both pick the same integer → merge conflict.
310
+ The `worktree-manager` allocator reserves the integer atomically across every
311
+ worktree on this machine (lock + shared high-water mark under `.worktrees/`):
312
+
313
+ ```bash
314
+ # $MAIN is the main repo root: dirname of `git rev-parse --git-common-dir`.
315
+ ALLOC="$MAIN/.claude/skills/worktree-manager/scripts/allocate-id.sh"
316
+ if [ -x "$ALLOC" ]; then
317
+ N="$("$ALLOC" reserve FEAT "<slug>")" # e.g. prints 0024
318
+ # → the reserved epic id is FEAT-$N. The integer is now claimed; a parallel
319
+ # session calling reserve will get the next integer (e.g. 0025 after 0024).
320
+ fi
321
+ ```
322
+
323
+ State the reserved integer in your response.
324
+ - **Fallback when the allocator is absent** (older install, no script): degrade to
325
+ the canonical-backlog scan — `git fetch <git.trunk_branch>`, then
326
+ `max` over `git grep '^id: FEAT-' $(git merge-base HEAD <git.trunk_branch>)`
327
+ plus the local worktree backlog. This still misses in-flight sibling-worktree
328
+ IDs, so surface a `[ID-RACE-RISK]` note so the caller can re-verify before
329
+ commit. If `git fetch` is impossible (offline), say the reservation is
330
+ best-effort against the local view.
331
+ - The same procedure applies to any other prefix you mint (`BUG`, `UI`, `DOC`,
332
+ `PERF`): `reserve <PREFIX> <slug>` — the allocator is prefix-parametric.
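
The primary-path / fallback contract described in this hunk can be sketched roughly as follows. This is a minimal illustration, not the framework's own code: `pick_id` and its argument order are hypothetical, the fallback branch covers only the local-backlog scan (the `git fetch` + merge-base scan is left out for brevity), and it assumes each card carries an `id: PREFIX-NNNN` line.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of baldart): prefer the allocator, else scan.

# pick_id <allocator-path> <PREFIX> <slug> <backlog-dir>
# Prints PREFIX-NNNN on stdout; the race warning goes to stderr.
pick_id() {
  local alloc="$1" prefix="$2" slug="$3" backlog="$4" n max

  # Primary path: the allocator exists, is executable, and printed a number.
  if [ -x "$alloc" ] && n="$("$alloc" reserve "$prefix" "$slug")" && [ -n "$n" ]; then
    printf '%s-%s\n' "$prefix" "$n"
    return 0
  fi

  # Fallback: highest id in the local backlog only. In-flight ids on sibling
  # worktrees are invisible here, hence the [ID-RACE-RISK] note.
  max="$(grep -hoE "^id:[[:space:]]*${prefix}-[0-9]+" "$backlog"/*.yml 2>/dev/null \
    | sed -E 's/.*-0*([0-9]+)$/\1/' | sort -n | tail -n 1)"
  echo "[ID-RACE-RISK] ${prefix} id chosen from the local view only" >&2
  printf '%s-%04d\n' "$prefix" "$(( ${max:-0} + 1 ))"
}
```

A caller would keep stdout as the id and pass the stderr note on to whoever commits the card, so the id can be re-verified before commit.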
315
333
 
316
334
  ### FORBIDDEN PATTERNS — agent MUST refuse to generate
317
335
 
@@ -350,10 +368,12 @@ Mirror their structure when generating new epic+child sets.
350
368
 
351
369
  Before writing the first YAML file, confirm and report to the caller:
352
370
 
353
- 1. **Reserved FEAT-XXXX integer**: Grep `^id: FEAT-` in `${paths.backlog_dir}/`, find max,
354
- pick max+1, per the concurrency-safe procedure in § "FEAT-XXXX numbering"
355
- (fetch trunk + merge-base scan to avoid duplicate IDs across parallel
356
- sessions). State the chosen integer in your response.
371
+ 1. **Reserved FEAT-XXXX integer**: reserve it via the atomic allocator
372
+ (`allocate-id.sh reserve FEAT <slug>`), per § "FEAT-XXXX numbering" — this is
373
+ the concurrency-safe path that prevents duplicate IDs across parallel sessions
374
+ on sibling worktrees. If the allocator is absent, fall back to the trunk
375
+ fetch + merge-base scan with an `[ID-RACE-RISK]` note. State the chosen
376
+ integer in your response.
357
377
  2. **Drafted epic + N children list**: list the planned filenames
358
378
  (`FEAT-XXXX-00-<slug>-epic.yml`, `FEAT-XXXX-01-<sub-slug>.yml`, ...) BEFORE
359
379
  writing them. This is a contract — the actual files MUST match the list.
@@ -498,7 +498,20 @@ Create protocol-compliant backlog cards for every actionable step. Cards MUST fo
498
498
 
499
499
  #### 4.1 — NAMING CONVENTIONS
500
500
 
501
- Before creating cards, scan `${paths.backlog_dir}/*.yml` to determine the next available number. File naming rules:
501
+ Before creating cards, reserve the next number for the chosen prefix. **Prefer the
502
+ atomic allocator** — a plain `${paths.backlog_dir}/*.yml` scan collides under
503
+ parallelism (concurrent sessions on sibling worktrees both pick the same "next"
504
+ integer and conflict at merge):
505
+
506
+ ```bash
507
+ # $MAIN = dirname of `git rev-parse --git-common-dir` (shared across worktrees).
508
+ ALLOC="$MAIN/.claude/skills/worktree-manager/scripts/allocate-id.sh"
509
+ [ -x "$ALLOC" ] && N="$("$ALLOC" reserve FEAT "<slug>")" # also BUG | UI | DOC | PERF
510
+ ```
511
+
512
+ If the allocator is absent (older install), fall back to scanning
513
+ `${paths.backlog_dir}/*.yml` for the max and add 1, noting `[ID-RACE-RISK]`. See
514
+ `worktree-manager` § "ID Allocation" for the mechanism. File naming rules:
502
515
 
503
516
  | Prefix | When to use | Example |
504
517
  |--------|-------------|---------|
@@ -117,42 +117,56 @@ returns when the batch is done. It returns:
117
117
 
118
118
  - `report` — ready-to-show markdown batch summary.
119
119
  - `residuals` — the **OFFLINE-SAFE ledger of record**: every residual the workflow
120
- could not finish, each `{ card, kind, evidence, materialized }`. A residual with
121
- `materialized:false` has NO follow-up card on disk yet (e.g. the workflow hit an
122
- outage where no agent could write the file). **You (the skill) must reconcile it.**
120
+ could not finish, each `{ card, kind, evidence, materialized }`. The `materialized`
121
+ flag is **advisory only**: `true` means the workflow *attempted* a write (possibly
122
+ into a worktree that never merged), not that a card exists on disk in the main repo.
123
+ **You (the skill) must reconcile EVERY residual against the main-repo disk** (Step 5.1).
123
124
  - `degraded` / `degradationReasons` — the batch stopped early under a sustained
124
125
  outage (or another degradation). The batch is NOT complete; it must be resumed.
125
126
  - `telemetry` — the Phase-8 record (`variant:"new2"`).
126
127
 
127
128
  ### Step 5 — Reconcile, resume, present, record
128
129
 
129
- 1. **Materialise missing follow-ups (offline-safe: you have filesystem access, the
130
- workflow does not).** For every `residuals[]` entry with `materialized:false`,
131
- create a follow-up card `${paths.backlog_dir}/<card>-followup-<kind>.yml` (status:
132
- TODO). **Delegate the write to the `prd-card-writer` agent**, the same owner the
133
- workflow uses (card-template, Rule C `review_profile`, `owner_agent` routed to the
134
- residual's domain, traceability), derived from the residual (≥1 requirement;
135
- `acceptance_criteria` = the verbatim residual; `files_likely_touched` from the card's
136
- ownership). Do NOT hand-write a minimal stub; the offline path must match the
137
- agent-path quality (F-039). It MUST pass the `/new` pre-flight field check. If
138
- `prd-card-writer` itself is unavailable (total outage), fall back to a minimal valid
139
- stub so the card still exists. This is the layer that guarantees **nothing is ever
140
- dropped, even when every agent was dead** during the run.
141
- 2. **Resume if degraded.** If `degraded` is true, re-invoke the workflow with
130
+ 1. **Materialise follow-ups in the MAIN repo: verify on disk, do NOT trust `materialized`
131
+ (F-040).** The workflow's agents run cd'd into the *worktree*, so any follow-up they wrote
132
+ may live in a worktree that was NOT merged (and is now gone) — `materialized:true` only means
133
+ "the workflow attempted a write", never proof on disk. So for **every** `residuals[]` entry
134
+ (regardless of the `materialized` flag), check whether a matching follow-up card actually exists
135
+ on disk under `${paths.backlog_dir}` in the **MAIN repo** (`<card>-followup-*.yml`). If it is
136
+ absent, create it by **delegating to the `prd-card-writer` agent**, the same owner the workflow
137
+ uses (card-template, Rule C `review_profile`, `owner_agent` routed to the residual's domain,
138
+ traceability), derived from the residual (≥1 requirement; `acceptance_criteria` = the verbatim
139
+ residual; `files_likely_touched` from the card's ownership). Do NOT hand-write a minimal stub —
140
+ the offline path must match agent-path quality (F-039); it MUST pass the `/new` pre-flight field
141
+ check. If `prd-card-writer` is unavailable (total outage), fall back to a minimal valid stub. This
142
+ main-repo, **disk-verified** write is the SSOT — nothing is dropped even on a non-merged batch.
143
+ 2. **Mark deferred cards DONE — only after their follow-up exists (F-040/H).** Some committed cards
144
+ were intentionally left **NON-DONE** because they carry an open owner-gated/policy-deferred AC
145
+ (e.g. a pending remote `db:push`): they are `perCardResults[]` entries with `deferred:true` (also
146
+ `cards_deferred_done_pending` in telemetry + the `F040-deferred` ledger row). For each such card,
147
+ now that step 1 guaranteed its deferral's follow-up exists on disk in the main repo, set the card
148
+ `status: DONE` + `completed_date` + an implementation_note (`"DONE post-run (new2) — AC deferred to
149
+ follow-up <id>"`) in `${paths.backlog_dir}/<card>.yml`, and fold all of them into ONE reconciliation
150
+ commit in the MAIN repo. **If a card's follow-up could NOT be created in step 1, leave it NON-DONE
151
+ and surface it** — fail-loud; NEVER mark a card DONE with a silently-dropped requirement (F-029).
152
+ 3. **Resume if degraded.** If `degraded` is true, re-invoke the workflow with
142
153
  `Workflow({ scriptPath, resumeFromRunId })` (same `args` + the new `ts`). The
143
154
  per-card **skip-completed** guard makes the resume idempotent — already-committed
144
155
  cards are skipped, only the incomplete/blocked ones run. Repeat until `degraded`
145
156
  is false (or the same cards stall twice → surface to the user).
146
- 3. **Present.** Print `report` verbatim. Surface `residuals` prominently
157
+ 4. **Present.** Print `report` verbatim. Surface `residuals` prominently
147
158
  ("questi residui sono tracciati come follow-up: …") — the post-run review that
148
159
  replaced the ~25 mid-run questions. If `degraded`, say so plainly (the run was
149
160
  incomplete and resumed).
150
- 4. **Record truthful telemetry.** Before appending `telemetry` to
151
- `${metricsDir}/skill-runs.jsonl`, fill the fields the workflow could not compute:
152
- `wall_clock_s` (now kickoff `ts`) and `followups_on_disk` (count the actual
153
- follow-up files on disk, NOT `residualFollowups.length` which double-counts).
154
- `total_tokens`/`agent_count` come from the workflow (`budget.spent()` delta +
155
- spawn counter); if `total_tokens` is null, run the `/new` Phase-8 `-stats` script
156
- to backfill real `usage`. Keep `degraded`/`degradation_reasons` in the record so
157
- the A/B comparison can exclude or weight degraded runs. Do NOT re-summarise the
158
- cards; the workflow already did.
161
+ 5. **Record truthful telemetry — reconciled against disk (F-040).** Before appending `telemetry`
162
+ to `${metricsDir}/skill-runs.jsonl`, fill the fields the workflow could not compute and
163
+ **reconcile the report against the real disk state** (agent `reason` strings can over-claim — a
164
+ residual may say "AC PASS / migration created" about a change a rollback later erased). Verify:
165
+ every `perCardResults` entry marked `committed` actually has a commit in `${trunk}`
166
+ (`git -C $MAIN log --oneline ${trunk} | grep <card>`); annotate any divergence and never present
167
+ progress the disk does not show. Then fill `wall_clock_s` (now kickoff `ts`) and
168
+ `followups_on_disk` (count the actual follow-up files on disk in the main repo, NOT
169
+ `residualFollowups.length`which double-counts). `total_tokens`/`agent_count` come from the
170
+ workflow; if `total_tokens` is null, run the `/new` Phase-8 `-stats` script to backfill real
171
+ `usage`. Keep `degraded`/`degradation_reasons` + `cards_deferred_done_pending` in the record so
172
+ the A/B comparison stays honest. Do NOT re-summarise the cards — the workflow already did.
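
The disk check that step 1 of this hunk asks for could look roughly like the sketch below. It is an assumption-laden illustration rather than the skill's actual procedure: `missing_followups` is a hypothetical helper, the card ids in the usage note are made up, and the only thing taken from the text is the `<card>-followup-*.yml` naming under the main-repo backlog directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of baldart): list residual cards that have
# no follow-up file on disk, ignoring the advisory `materialized` flag.

# missing_followups <main-repo-backlog-dir> <card>...
# Prints one card id per line for every card lacking <card>-followup-*.yml.
missing_followups() {
  local backlog="$1" card f found
  shift
  for card in "$@"; do
    found=0
    # An unmatched glob stays literal, so the -f test simply fails.
    for f in "$backlog/$card"-followup-*.yml; do
      if [ -f "$f" ]; then
        found=1
        break
      fi
    done
    [ "$found" -eq 1 ] || printf '%s\n' "$card"
  done
}
```

Calling `missing_followups "$MAIN/backlog" FEAT-0012-03 FEAT-0012-04` would print only the cards that still need a follow-up created. An empty result is the precondition for marking the deferred cards DONE in step 2.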
@@ -292,7 +292,9 @@ git -C "$WORKTREE_PATH" rebase "origin/$TRUNK"
292
292
  # 5. Sync local trunk ref — reuse /mw "Common" block: ff-only pull when main
293
293
  # repo HEAD is already $TRUNK; otherwise just fetch.
294
294
 
295
- # 6. Remove the registry entry for this worktree.
295
+ # 6. Remove the registry entry for this worktree, then release its backlog-ID
296
+ # reservations (best-effort, high-water mark untouched):
297
+ # .claude/skills/worktree-manager/scripts/allocate-id.sh release "$WORKTREE_PATH" 2>/dev/null || true
296
298
  ```
297
299
 
298
300
  **Forbidden in docs merge**: pre-merge `npm run build`, `npx eslint`, `npx tsc`, `npm run test`, post-merge `npm run build`. The whole point of docs mode is that these gates are inapplicable.
@@ -356,6 +358,57 @@ Create the file if it doesn't exist. Update it on every `/nw`, `/mw`, `/cw` oper
356
358
 
357
359
  ---
358
360
 
361
+ ## ID Allocation — atomic backlog-ID reservation across parallel worktrees
362
+
363
+ When several PRD / card sessions run in parallel on sibling worktrees, each
364
+ branched from the same trunk, a plain `max(^id: PREFIX-) + 1` scan makes them all
365
+ land on the **same next integer**: the other session's card is in flight on an
366
+ unmerged sibling branch, invisible to both the local backlog and the trunk
367
+ merge-base. They collide at rebase/merge time.
368
+
369
+ `scripts/allocate-id.sh` closes that race **on the same machine**. Every worktree
370
+ shares one main repo root (resolved from `git rev-parse --git-common-dir`), and
371
+ `.worktrees/` lives there and is gitignored — so it is the natural shared
372
+ coordination point, exactly like `registry.json`. The allocator anchors a lock
373
+ and a per-prefix high-water-mark there:
374
+
375
+ ```bash
376
+ # Reserve the next free integer for a prefix (FEAT, BUG, UI, DOC, PERF, …).
377
+ # Prints the integer zero-padded to 4 digits on stdout; diagnostics on stderr.
378
+ .claude/skills/worktree-manager/scripts/allocate-id.sh reserve FEAT <slug>
379
+
380
+ # Release a finished worktree's reservations (best-effort prune; never blocks).
381
+ .claude/skills/worktree-manager/scripts/allocate-id.sh release <worktree-path>
382
+ ```
383
+
384
+ Shared files, all under `$MAIN/.worktrees/` (already gitignored):
385
+ - `.id-alloc.lock/` — cross-process mutex (atomic `mkdir`, stale-stolen after 30s
386
+ via the dir's own mtime; it is a directory, NOT a git ref, so it does not touch
387
+ the shared `refs/stash` that `git stash` in worktrees is forbidden over).
388
+ - `.id-hwm-<PREFIX>` — monotonic high-water mark per prefix. **This is the
389
+ correctness anchor**: bumped under the lock, never decremented.
390
+ - `id-reservations.jsonl` — append-only `{prefix,id,worktree,slug,ts}` log for
391
+ observability + `release` pruning. NOT load-bearing for correctness.
392
+
393
+ Inside the lock, `reserve` takes `NEXT = max(high-water file, local backlog, every
394
+ sibling worktree's backlog from registry.json, reservations log, trunk
395
+ merge-base) + 1`. The `max()` against the real backlog makes it **self-healing**:
396
+ if `.id-hwm-*` is ever lost or a card was created bypassing the allocator, the
397
+ next reservation still skips occupied numbers. Abandoned reservations leave
398
+ **gaps**, which is fine — backlog IDs need not be contiguous, and "reserve the
399
+ whole integer, never reuse" is already the rule.
400
+
401
+ **Scope: same machine only.** The shared file cannot reach separate cloud agents;
402
+ the trunk merge-base scan inside `reserve` is the best-effort cross-machine guard.
403
+ Callers (`prd-card-writer`, the `prd` agent) invoke `reserve` when the script is
404
+ present and **fall back to their inline merge-base scan + `[ID-RACE-RISK]` note**
405
+ when it is absent (older installs) — the allocator is additive, never required.
406
+
407
+ `release` is wired into the cleanup path of `/mw`, `mw-docs`, and `/cw` (see those
408
+ steps) so a merged/cleaned worktree's reservations are pruned.
409
+
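
To make the lock-plus-high-water-mark idea in this section concrete, here is a stripped-down sketch. It is not `allocate-id.sh` itself: `to_int` and `reserve_next` are hypothetical names, the caller passes the highest id it has observed instead of the script scanning backlogs, and stale-lock stealing and the reservations log are omitted.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an mkdir mutex guarding a monotonic counter file.

# to_int <string>: keep digits only, drop leading zeros, default to 0.
to_int() {
  local n
  n="$(printf '%s' "${1:-}" | tr -cd '0-9' | sed 's/^0*//')"
  printf '%s' "${n:-0}"
}

# reserve_next <state-dir> <PREFIX> [observed-max]
# Prints the next integer zero-padded to 4 digits and bumps the stored mark.
reserve_next() {
  local state_dir="$1" prefix="$2" observed lock hwm_file tries=0 hwm=0 next
  observed="$(to_int "${3:-0}")"
  lock="$state_dir/.id-alloc.lock"
  hwm_file="$state_dir/.id-hwm-$prefix"

  mkdir -p "$state_dir" || return 1
  # mkdir either creates the directory or fails, so only one caller wins.
  until mkdir "$lock" 2>/dev/null; do
    tries=$((tries + 1))
    if [ "$tries" -gt 50 ]; then
      echo "could not acquire lock" >&2
      return 1
    fi
    sleep 0.1
  done

  # Floor is the stored mark; an observed id above it heals a lost counter.
  if [ -f "$hwm_file" ]; then
    hwm="$(to_int "$(cat "$hwm_file")")"
  fi
  if [ "$observed" -gt "$hwm" ]; then
    hwm="$observed"
  fi
  next=$((hwm + 1))
  printf '%s\n' "$next" > "$hwm_file"

  rmdir "$lock"
  printf '%04d\n' "$next"
}
```

Because the counter is only ever raised while the lock directory exists, two callers cannot read the same mark, and an abandoned reservation just leaves a gap in the sequence.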
410
+ ---
411
+
359
412
  ## /nw — New Worktree
360
413
 
361
414
  Supports three modes:
@@ -1077,6 +1130,13 @@ git worktree prune
1077
1130
 
1078
1131
  Remove the entry from `.worktrees/registry.json`.
1079
1132
 
1133
+ Also release this worktree's backlog-ID reservations (best-effort; the high-water
1134
+ mark stays untouched so numbering remains monotonic):
1135
+
1136
+ ```bash
1137
+ .claude/skills/worktree-manager/scripts/allocate-id.sh release "$WORKTREE_PATH" 2>/dev/null || true
1138
+ ```
1139
+
1080
1140
  ### 8. Report
1081
1141
 
1082
1142
  ```
@@ -1184,7 +1244,12 @@ git worktree prune
1184
1244
 
1185
1245
  ### 4. Update registry
1186
1246
 
1187
- Remove cleaned entries from `.worktrees/registry.json`.
1247
+ Remove cleaned entries from `.worktrees/registry.json`. For each removed
1248
+ worktree, also release its backlog-ID reservations (best-effort):
1249
+
1250
+ ```bash
1251
+ .claude/skills/worktree-manager/scripts/allocate-id.sh release "<removed-worktree-path>" 2>/dev/null || true
1252
+ ```
1188
1253
 
1189
1254
  ### 5. Report
1190
1255
 
@@ -0,0 +1,207 @@
1
+ #!/usr/bin/env bash
2
+ #
3
+ # allocate-id.sh — atomic, cross-worktree backlog-ID allocator (same machine).
4
+ #
5
+ # Problem it solves: several PRD/card sessions running in parallel on sibling
6
+ # worktrees all branch from the same trunk, so each computes the same
7
+ # "max(^id: PREFIX-) + 1" and they collide at rebase/merge time. A plain
8
+ # backlog scan cannot see IDs that are still in flight on an unmerged sibling
9
+ # worktree branch.
10
+ #
11
+ # Mechanism: every worktree shares the same main repo root (resolved from
12
+ # `git rev-parse --git-common-dir`), and `.worktrees/` lives there and is
13
+ # gitignored. We anchor a lock + a per-prefix high-water-mark file there, so a
14
+ # reservation is atomic across every worktree on this machine. The high-water
15
+ # bumped under the lock is the correctness mechanism; the max() against the real
16
+ # backlog makes it self-healing if the counter file is ever lost or bypassed.
17
+ #
18
+ # Scope: SAME MACHINE only (a shared local file). Cross-machine (separate cloud
19
+ # agents) is not covered here — the caller keeps the merge-base scan as a
20
+ # best-effort cross-machine guard and surfaces an [ID-RACE-RISK] note.
21
+ #
22
+ # Usage:
23
+ # allocate-id.sh reserve <PREFIX> <slug> # prints the reserved integer, zero-padded to 4
24
+ # allocate-id.sh release <worktree-path> # prunes that worktree's reservations (best-effort)
25
+ #
26
+ # Exit non-zero on hard failure (not a git repo, lock unobtainable) so the
27
+ # caller can fall back to its inline scan. All diagnostics go to stderr; stdout
28
+ # carries ONLY the allocated number on success.
29
+
30
+ set -u
31
+
32
+ LOCK_STALE_SECONDS=30
33
+ LOCK_MAX_TRIES=150 # ~30s at 0.2s/try
34
+
35
+ err() { printf '%s\n' "$*" >&2; }
36
+
37
+ # --- Resolve the shared main repo root from any worktree -------------------
38
+ resolve_main() {
39
+ local common
40
+ common="$(git rev-parse --git-common-dir 2>/dev/null)" || return 1
41
+ case "$common" in
42
+ /*) ;; # already absolute
43
+ *) common="$(pwd)/$common" ;; # relative (we're in the main repo) → absolutise
44
+ esac
45
+ (cd "$common/.." 2>/dev/null && pwd) || return 1
46
+ }
47
+
48
+ # --- Read a paths.* / git.* scalar from baldart.config.yml -----------------
49
+ config_scalar() {
50
+ # $1 = top-level block (paths|git), $2 = key
51
+ local block="$1" key="$2" cfg="$MAIN/baldart.config.yml"
52
+ [ -f "$cfg" ] || return 0
53
+ grep -A60 "^${block}:" "$cfg" 2>/dev/null \
54
+ | grep -m1 "[[:space:]]*${key}:" \
55
+ | sed -E "s/.*${key}:[[:space:]]*\"?([^\"#]*)\"?.*/\1/" \
56
+ | sed -E 's/[[:space:]]+$//'
57
+ }
58
+
59
+ # --- Highest integer for PREFIX across one or more directories -------------
60
+ scan_dirs_max() {
61
+ local prefix="$1"; shift
62
+ local max="$1"; shift # starting floor
63
+ local d line n
64
+ for d in "$@"; do
65
+ [ -d "$d" ] || continue
66
+ while IFS= read -r line; do
67
+ [ -n "$line" ] || continue
68
+ n="$(printf '%s' "$line" | sed -E "s/^id:[[:space:]]*${prefix}-0*([0-9]+).*/\1/")"
69
+ case "$n" in ''|*[!0-9]*) continue ;; esac
70
+ [ "$n" -gt "$max" ] && max="$n"
71
+ done <<EOF
72
+ $(grep -rhoE "^id:[[:space:]]*${prefix}-[0-9]+" "$d"/*.yml 2>/dev/null)
73
+ EOF
74
+ done
75
+ printf '%s' "$max"
76
+ }
77
+
78
+ # --- Lock helpers (mkdir is atomic on POSIX; portable, no flock) -----------
79
+ # Staleness uses the lock directory's own mtime (set atomically by mkdir, and
80
+ # refreshed when the holder writes pid) — NOT a ts file that may not be written
81
+ # yet, which would race a just-created lock into an instant false steal. The
82
+ # critical section is local-FS only (the slow `git fetch` runs before the lock),
83
+ # so the stale threshold is never tripped by a legitimate holder.
84
+ file_mtime() { stat -f %m "$1" 2>/dev/null || stat -c %Y "$1" 2>/dev/null || echo 0; }
85
+ acquire_lock() {
86
+ local tries=0 now m age
87
+ mkdir -p "$WT_DIR" 2>/dev/null || true
88
+ while ! mkdir "$LOCKDIR" 2>/dev/null; do
89
+ if [ -d "$LOCKDIR" ]; then
90
+ now="$(date +%s)"; m="$(file_mtime "$LOCKDIR")"
91
+ case "$m" in ''|*[!0-9]*) m=0 ;; esac
92
+ age=$(( now - m ))
93
+ if [ "$m" -gt 0 ] && [ "$age" -gt "$LOCK_STALE_SECONDS" ]; then
94
+ err "WARN: stealing stale id-alloc lock (age ${age}s)"; rm -rf "$LOCKDIR" 2>/dev/null; continue
95
+ fi
96
+ fi
97
+ tries=$(( tries + 1 ))
98
+ [ "$tries" -gt "$LOCK_MAX_TRIES" ] && { err "ERROR: could not acquire id-alloc lock"; return 1; }
99
+ sleep 0.2
100
+ done
101
+ printf '%s\n' "$$" > "$LOCKDIR/pid" 2>/dev/null || true
102
+ }
103
+ release_lock() { rm -rf "$LOCKDIR" 2>/dev/null || true; }
104
+
105
+ # ===========================================================================
106
+ CMD="${1:-}"
107
+ MAIN="$(resolve_main)" || { err "ERROR: not inside a git repository"; exit 1; }
108
+ WT_DIR="$MAIN/.worktrees"
109
+ LOCKDIR="$WT_DIR/.id-alloc.lock"
110
+ RESV="$WT_DIR/id-reservations.jsonl"
111
+ REG="$WT_DIR/registry.json"
112
+
113
+ case "$CMD" in
114
+ reserve)
115
+ PREFIX="${2:-}"; SLUG="${3:-}"
116
+ [ -n "$PREFIX" ] || { err "usage: allocate-id.sh reserve <PREFIX> <slug>"; exit 2; }
117
+ case "$PREFIX" in *[!A-Z]*|'') err "ERROR: PREFIX must be uppercase letters (e.g. FEAT, BUG)"; exit 2 ;; esac
118
+
119
+ BACKLOG_DIR="$(config_scalar paths backlog_dir)"; [ -n "$BACKLOG_DIR" ] || BACKLOG_DIR="backlog"
120
+ TRUNK="$(config_scalar git trunk_branch)"
121
+ WT="$(git rev-parse --show-toplevel 2>/dev/null || echo "$MAIN")"
122
+ HWM="$WT_DIR/.id-hwm-$PREFIX"
123
+
124
+ # Refresh the remote trunk ref BEFORE taking the lock — it's the only slow
125
+ # (network) step; keeping it out of the critical section means a holder can
126
+ # never be stale-stolen mid-allocation.
127
+ [ -n "$TRUNK" ] && git fetch origin "$TRUNK" --quiet 2>/dev/null || true
128
+
129
+ acquire_lock || exit 1
130
+ trap release_lock EXIT
131
+
132
+ # 1) Floor = stored high-water mark (correctness anchor).
133
+ max=0
134
+ if [ -f "$HWM" ]; then
135
+ max="$(tr -cd '0-9' < "$HWM" 2>/dev/null)"; case "$max" in ''|*[!0-9]*) max=0 ;; esac
136
+ fi
137
+
138
+ # 2) Local worktree backlog + every sibling worktree's backlog (in flight).
139
+ dirs="$MAIN/$BACKLOG_DIR"
140
+ [ "$WT" != "$MAIN" ] && dirs="$dirs $WT/$BACKLOG_DIR"
141
+ if [ -f "$REG" ]; then
142
+ while IFS= read -r p; do
143
+ [ -n "$p" ] && dirs="$dirs $p/$BACKLOG_DIR"
144
+ done <<EOF
145
+ $(grep -oE '"path"[[:space:]]*:[[:space:]]*"[^"]*"' "$REG" 2>/dev/null | sed -E 's/.*"([^"]*)"$/\1/')
146
+ EOF
147
+ fi
148
+ # shellcheck disable=SC2086
149
+ max="$(scan_dirs_max "$PREFIX" "$max" $dirs)"
150
+
151
+ # 3) Reservations log (covers a wiped HWM with reservations still in flight).
152
+ if [ -f "$RESV" ]; then
153
+ while IFS= read -r n; do
154
+ n="$(printf '%s' "$n" | sed -E "s/.*\"${PREFIX}-0*([0-9]+).*/\1/")"
155
+ case "$n" in ''|*[!0-9]*) continue ;; esac
156
+ [ "$n" -gt "$max" ] && max="$n"
157
+ done <<EOF
158
+ $(grep -oE "\"id\":\"${PREFIX}-[0-9]+" "$RESV" 2>/dev/null)
159
+ EOF
160
+ fi
161
+
162
+ # 4) Best-effort cross-machine guard: merge-base of trunk (already-merged IDs).
163
+ # The fetch already ran before the lock; this is a local object-store read.
164
+ if [ -n "$TRUNK" ]; then
165
+ MB="$(git merge-base HEAD "origin/$TRUNK" 2>/dev/null || git merge-base HEAD "$TRUNK" 2>/dev/null || true)"
166
+ if [ -n "$MB" ]; then
167
+ while IFS= read -r n; do
168
+ n="$(printf '%s' "$n" | sed -E "s/^id:[[:space:]]*${PREFIX}-0*([0-9]+).*/\1/")"
169
+ case "$n" in ''|*[!0-9]*) continue ;; esac
170
+ [ "$n" -gt "$max" ] && max="$n"
171
+ done <<EOF
172
+ $(git grep -hoE "^id:[[:space:]]*${PREFIX}-[0-9]+" "$MB" -- "$BACKLOG_DIR/*.yml" 2>/dev/null)
173
+ EOF
174
+ fi
175
+ fi
176
+
177
+ NEXT=$(( max + 1 ))
178
+ printf '%s\n' "$NEXT" > "$HWM" 2>/dev/null || true
179
+ printf '{"prefix":"%s","id":"%s-%04d","worktree":"%s","slug":"%s","ts":"%s"}\n' \
180
+ "$PREFIX" "$PREFIX" "$NEXT" "$WT" "$SLUG" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> "$RESV" 2>/dev/null || true
181
+
182
+ release_lock; trap - EXIT
183
+ printf '%04d\n' "$NEXT"
184
+ ;;
185
+
186
+ release)
187
+ WT="${2:-}"
188
+ [ -n "$WT" ] || { err "usage: allocate-id.sh release <worktree-path>"; exit 2; }
189
+ [ -f "$RESV" ] || exit 0
190
+ acquire_lock || exit 0 # best-effort prune; never block cleanup
191
+ trap release_lock EXIT
192
+ # grep -v exit code: 0 = some lines kept, 1 = none kept (all pruned → empty
193
+ # file is the correct result), >=2 = real error (keep the original).
194
+ grep -vF "\"worktree\":\"$WT\"" "$RESV" > "$RESV.tmp" 2>/dev/null
195
+ if [ "$?" -le 1 ]; then
196
+ mv "$RESV.tmp" "$RESV" 2>/dev/null || rm -f "$RESV.tmp" 2>/dev/null
197
+ else
198
+ rm -f "$RESV.tmp" 2>/dev/null
199
+ fi
200
+ release_lock; trap - EXIT
201
+ ;;
202
+
203
+ *)
204
+ err "usage: allocate-id.sh {reserve <PREFIX> <slug> | release <worktree-path>}"
205
+ exit 2
206
+ ;;
207
+ esac
@@ -153,12 +153,12 @@ if (kind === 'scope-expansion') {
153
153
  `If INTEGRATE: apply (you are ${fixerAgent}), re-run lint+tsc, return applied:true verified:true. If FOLLOW-UP: applied:false verified:false note:'needs-followup: <why>'.`,
154
154
  { label: `resolve:scope:${card}`, phase: 'Repair', agentType: fixerAgent, schema: FIX_SCHEMA }
155
155
  )
156
- } catch (e) { if (e && e.transientExhausted) return { status: 'followup', reason: 'outage during scope-expansion', outOfScopeFindings: [] }; throw e }
156
+ } catch (e) { if (e && e.transientExhausted) return { status: 'followup', reason: 'outage during scope-expansion', deferralClass: 'outage', outOfScopeFindings: [] }; throw e }
157
157
  if (decide && decide.verified) {
158
158
  const ok = await judgeVerify([{ i: 1, r: decide }])
159
159
  if (ok.ok) { log('scope-expansion integrated within ownership.'); return { status: 'resolved', outOfScopeFindings: collectOOS(decide) } }
160
160
  }
161
- return await materialiseFollowup('scope-expansion', (decide && decide.note) || 'outside ownership / new AC / protected', collectOOS(decide))
161
+ return await materialiseFollowup('scope-expansion', (decide && decide.note) || 'outside ownership / new AC / protected', collectOOS(decide), 'scope-expansion')
162
162
  }
163
163
 
164
164
  // ───────────────────────────────────────────────────────────────────────────
@@ -179,7 +179,7 @@ try {
179
179
  `Tier-1 targeted repair for card ${card} (${kind}).\n\n${brief}\n\n${gateHint}\n\nApply the minimal correct fix within MAY-EDIT only. Re-run the originating gate and report verified honestly (never claim verified without re-running it).`,
180
180
  { label: `resolve:${kind}:${card}`, phase: 'Repair', agentType: fixerAgent, schema: FIX_SCHEMA }
181
181
  )
182
- } catch (e) { if (e && e.transientExhausted) return { status: 'followup', reason: 'outage during tier-1', outOfScopeFindings: [] }; throw e }
182
+ } catch (e) { if (e && e.transientExhausted) return { status: 'followup', reason: 'outage during tier-1', deferralClass: 'outage', outOfScopeFindings: [] }; throw e }
183
183
 
184
184
  // F-008 — terminal short-circuit, verified not trusted.
185
185
  if (attempt && attempt.terminal) {
@@ -197,7 +197,7 @@ if (attempt && attempt.terminal) {
197
197
  confirmed = !!(tj && tj.confirmed)
198
198
  } catch (_) { confirmed = false }
199
199
  }
200
- if (confirmed) { log(`${kind} terminal (${tr}) — short-circuit to follow-up.`); return await materialiseFollowup(kind, `terminal: ${tr} — ${attempt.note || ''}`, collectOOS(attempt)) }
200
+ if (confirmed) { log(`${kind} terminal (${tr}) — short-circuit to follow-up.`); return await materialiseFollowup(kind, `terminal: ${tr} — ${attempt.note || ''}`, collectOOS(attempt), tr || 'unresolved') }
201
201
  log(`terminal verdict (${tr}) rejected — proceeding to multi-attempt.`)
202
202
  }
203
203
 
@@ -242,7 +242,7 @@ if (canFanOut && !protectedDomain) {
242
242
  log('budget near target — skipping tier-2 fan-out.')
243
243
  }
244
244
 
245
- return await materialiseFollowup(kind, (attempt && attempt.note) || 'unresolved after repair tiers', collectOOS(attempt).concat(tier2OOS))
245
+ return await materialiseFollowup(kind, (attempt && attempt.note) || 'unresolved after repair tiers', collectOOS(attempt).concat(tier2OOS), 'unresolved')
246
246
 
247
247
  // ───────────────────────────────────────────────────────────────────────────
248
248
  // F-015/F-033 — mandatory adversarial judge + deterministic JS cross-check.
@@ -266,11 +266,20 @@ async function judgeVerify(verifiedAttempts) {
266
266
  return { ok: true, best: judge.best }
267
267
  }
268
268
 
269
- async function materialiseFollowup(k, reason, oos) {
269
+ // F-040 — `deferralClass` (4th arg) classifies WHY this became a follow-up so new2.js can
270
+ // decide whether the CARD should still commit. owner-gated / not-a-code-defect → the card's
271
+ // own code is complete; the residual is an external step (do NOT roll the card back). Anything
272
+ // else (unresolved code defect, out-of-ownership, baseline) → genuine block → rollback as before.
273
+ // The classifier flows back through resolve() in new2.js; never write it into the worktree.
274
+ async function materialiseFollowup(k, reason, oos, deferralClass) {
275
+ const cls = deferralClass || 'unresolved'
270
276
  let r = null
271
277
  try {
272
278
  // F-039 — backlog cards are owned by prd-card-writer (card-template + Rule C
273
279
  // review_profile + owner_agent + traceability), NOT a hand-written Haiku stub.
280
+ // F-040 — the workflow agent runs cd'd into the worktree, so this write is BEST-EFFORT
281
+ // (it rides the merge if the batch merges). The SKILL is the SSOT: it verifies/creates the
282
+ // card in the MAIN repo post-run, so a non-merged batch never loses the follow-up.
274
283
  r = await agentSafe(
275
284
  `Create ONE follow-up backlog card so this residual is TRACKED, not dropped (per ${REF}/completeness.md Phase 2.5b option 3). You are prd-card-writer: apply your card-template, Rule C (review_profile), owner_agent routing, and traceability rules — do NOT emit a minimal stub.\n\n${brief}\nKind: ${k}\nResidual domain: ${domain}\nReason unresolved: ${reason}\n\n` +
276
285
  `Write ${backlogDir}/${card}-followup-<gate>.yml with status: TODO, derived from the residual: requirements + acceptance_criteria (the verbatim residual as ≥1 AC), owner_agent routed to the residual domain (${domain}), review_profile per Rule C, files_likely_touched ≥1 from the card ownership / remedy files. It MUST pass the /new pre-flight field check. Return the created card id.`,
@@ -280,9 +289,9 @@ async function materialiseFollowup(k, reason, oos) {
280
289
  // F-020 — could not materialise (e.g. outage): return WITHOUT a followupCard so the
281
290
  // SKILL writes it from the offline-safe residual ledger. Never claim it was created.
282
291
  log(`follow-up materialisation failed (${String(e && e.message)}) — skill will reconcile.`)
283
- return { status: 'followup', followupCard: null, reason, outOfScopeFindings: oos || [] }
292
+ return { status: 'followup', followupCard: null, reason, deferralClass: cls, outOfScopeFindings: oos || [] }
284
293
  }
285
294
  const followupCard = (r && r.created && r.followupCard) ? r.followupCard : null
286
295
  log(`${k} → follow-up ${followupCard || '(deferred to skill)'} (nothing dropped).`)
287
- return { status: 'followup', followupCard, reason, outOfScopeFindings: oos || [] }
296
+ return { status: 'followup', followupCard, reason, deferralClass: cls, outOfScopeFindings: oos || [] }
288
297
  }
@@ -70,6 +70,11 @@ function sig(card, gate, evidence) {
70
70
  const e = String(evidence || '').toLowerCase().replace(/\s+/g, ' ').replace(/[0-9a-f]{7,40}/g, '#').trim().slice(0, 160)
71
71
  return `${card}::${String(gate || '').toLowerCase()}::${e}`
72
72
  }
73
+ // F-040 (Fix G) — AC-deferral key on AC NUMBER ONLY. The full-text sig() drifted between the
74
+ // pre-flight policyDeferredACs[].text and the implement agent's unmetACs[].text, so a policy-
75
+ // deferred AC got re-routed to resolve() a second time (the migration-card double-routing). This
76
+ // coarse key is scoped to ac-defer so it never collides with a freeform 'blocker' finding.
77
+ function acSig(card, n) { return `${card}::ac-defer::ac-${String(n)}` }
73
78
 
74
79
  function ledger(card, gate, decision, detail) {
75
80
  gateLedger.push({ card, gate, decision, detail: detail || '' })
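The text-drift problem that `acSig` addresses can be reproduced in isolation. The sketch below copies `sig` and `acSig` from the hunk above; the card id and the two AC wordings are hypothetical, chosen only to show a pre-flight text and an implement-agent text that describe the same AC differently.

```javascript
// sig() and acSig() as they appear in the diff.
function sig(card, gate, evidence) {
  const e = String(evidence || '').toLowerCase().replace(/\s+/g, ' ').replace(/[0-9a-f]{7,40}/g, '#').trim().slice(0, 160)
  return `${card}::${String(gate || '').toLowerCase()}::${e}`
}
function acSig(card, n) { return `${card}::ac-defer::ac-${String(n)}` }

// Hypothetical inputs: same AC number, two different phrasings.
const card = 'FEAT-0001'
const preflightText = 'db:check-sync passes after remote db:push'
const implementText = 'db:check-sync must pass once the remote push is approved'

// Pre-flight registers both keys, as the F-040 hunk does.
const acceptedDeferrals = new Set()
acceptedDeferrals.add(sig(card, 'ac-unmet', `AC-3: ${preflightText}`))
acceptedDeferrals.add(acSig(card, 3))

// Later, the implement agent reports AC-3 as unmet with its own wording.
const textKeyHit = acceptedDeferrals.has(sig(card, 'ac-unmet', `AC-3: ${implementText}`))
const numberKeyHit = acceptedDeferrals.has(acSig(card, 3))

console.log({ textKeyHit, numberKeyHit }) // { textKeyHit: false, numberKeyHit: true }
```

With only the full-text key, the second lookup misses and the AC would be routed to `resolve()` again; the number-only key still matches.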
@@ -163,6 +168,7 @@ const MERGE_SCHEMA = {
163
168
  mergeTs: { type: 'string' },
164
169
  reconciliation: { type: 'string' },
165
170
  forcedDone: { type: 'array', items: { type: 'string' }, description: 'MUST be empty — false-DONE is forbidden (F-029)' },
171
+ deferredLeftOpen: { type: 'array', items: { type: 'string' }, description: 'F-040 — committed cards left NON-DONE (open owner-gated AC); the skill marks them DONE post-run' },
166
172
  epicsClosed: { type: 'array', items: { type: 'string' }, description: 'Epic/parent cards marked DONE by Phase 6b step 5e (all children DONE) — NOT a forcedDone violation' },
167
173
  uncommittedLeft: { type: 'boolean', description: 'true if dirty code was left (NOT committed) + reported (F-030)' },
168
174
  note: { type: 'string' },
@@ -208,7 +214,7 @@ try {
208
214
  `• G5 depends-on: a card whose depends_on names a non-DONE card NOT in this batch → EXCLUDE it AND every in-batch card that transitively depends on it.\n` +
209
215
  `• cardGraph (REQUIRED, F-021): for every runnable card return { id, dependsOn:[IN-BATCH deps only], ownerAgent (the card's owner_agent; G25 unknown→'coder'), reviewProfile (the card's review_profile; default 'balanced'), policyDeferredACs }.\n` +
210
216
  `• F-016 AC↔ownership consistency: for each acceptance_criterion, derive the file(s) it requires editing. If those files are NOT a subset of the card's MAY-EDIT/files_likely_touched → add the AC to policyDeferredACs:[{n,text,owningCard|owningFile,reason}] (it will become ONE follow-up, never a resolve). Do the same for any AC whose remedy is an owner-gated infra action (remote db push / deploy / secret / DNS).\n` +
211
- `• Complexity (setup.md 3c): decide executionMode sequential|team (+ groups for team). Build the file-ownership map → /tmp; return ownershipMapPath.\n` +
217
+ `• Complexity (setup.md 3c): decide executionMode sequential|team (+ groups for team). Build the file-ownership map → /tmp; return ownershipMapPath. F-040: each card's MAY-EDIT = files_likely_touched ∪ every path NAMED EXPLICITLY in that card's acceptance_criteria/definition_of_done (an ADR the DoD says to update, the data-model / ER doc for a schema-change, etc.) — so editing a DoD-mandated doc is NOT a file-diff violation. Do NOT add another card's files this way.\n` +
212
218
  `• Persist per-card architecture baselines to /tmp/arch-baseline-<CARD>.md; return archBaselinePaths.\n\n` +
213
219
  `Return the structured PREFLIGHT object. ok:false ONLY if the workspace is unworkable.`,
214
220
  { label: 'preflight', phase: 'Pre-flight', agentType: 'general-purpose', schema: PREFLIGHT_SCHEMA }
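The F-040 MAY-EDIT rule in the pre-flight prompt above amounts to a set union followed by a membership test. A minimal sketch, under stated assumptions: `buildMayEdit`, `partitionDirty`, and the `dodNamedPaths` field are hypothetical names (in the workflow the agent derives these paths from the card text), paths are compared as exact strings, and the example paths are made up.

```javascript
// MAY-EDIT = files_likely_touched ∪ paths named explicitly in the card's AC/DoD.
// `dodNamedPaths` is an assumed, pre-extracted list, not a field from the diff.
function buildMayEdit(card) {
  return new Set([...(card.files_likely_touched || []), ...(card.dodNamedPaths || [])])
}

// Split dirty files into in-scope edits and out-of-ownership edits to revert.
function partitionDirty(card, dirtyFiles) {
  const mayEdit = buildMayEdit(card)
  const inScope = []
  const revertedOutOfOwnership = []
  for (const f of dirtyFiles) {
    if (mayEdit.has(f)) inScope.push(f)
    else revertedOutOfOwnership.push(f)
  }
  return { inScope, revertedOutOfOwnership }
}

// Hypothetical schema-change card whose DoD names an ADR and the data-model doc.
const schemaCard = {
  files_likely_touched: ['db/migrations/0042_add_table.sql'],
  dodNamedPaths: ['docs/adr/ADR-012.md', 'docs/data-model.md'],
}
const dirty = ['db/migrations/0042_add_table.sql', 'docs/adr/ADR-012.md', 'src/other-card/file.ts']
const result = partitionDirty(schemaCard, dirty)

console.log(result)
```

Under these assumptions the ADR edit counts as in-scope rather than a file-diff violation, while the file belonging to another card is the only one listed for revert.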
@@ -249,6 +255,8 @@ for (const n of cardGraph) {
249
255
  for (const ac of (n.policyDeferredACs || [])) {
250
256
  residuals.push({ card: n.id, kind: 'policy-deferred-ac', evidence: `AC-${ac.n}: ${ac.text} (${ac.reason || 'out-of-ownership / owner-gated'})`, materialized: false })
251
257
  acceptedDeferrals.add(sig(n.id, 'ac-unmet', `AC-${ac.n}: ${ac.text}`))
258
+ acceptedDeferrals.add(acSig(n.id, ac.n)) // F-040 (Fix G) — text-drift-proof AC key
259
+
252
260
  ledger(n.id, 'F016-policy-defer', 'DEFERRED-BY-POLICY', `AC-${ac.n} → follow-up (owner: ${ac.owningCard || ac.owningFile || '?'})`)
253
261
  }
254
262
  }
@@ -274,11 +282,15 @@ function domainMayEdit(dom, codeScope) {
274
282
  return docPaths.length ? docPaths : codeScope // doc-only ownership; fall back to code scope if no doc paths configured
275
283
  }
276
284
 
285
+ // F-040 — returns { status:'resolved'|'followup'|'fatal', deferralClass }. deferralClass tells
286
+ // the caller WHY a followup happened: 'owner-gated'/'not-a-code-defect' → the card's own code is
287
+ // complete (external/infra step remains) → caller must NOT roll the card back; anything else
288
+ // ('unresolved'/'out-of-ownership'/'baseline-not-reached'/'outage') → genuine block → rollback.
277
289
  async function resolve(kind, card, evidence, extra) {
278
290
  const s = sig(card, kind, evidence)
279
291
  if (resolvedSignatures.has(s) || acceptedDeferrals.has(s)) {
280
292
  ledger(card, 'resolve:' + kind, 'DEDUP-SKIP', 'already resolved/deferred this run')
281
- return 'resolved'
293
+ return { status: 'resolved', deferralClass: null }
282
294
  }
283
295
  resolvedSignatures.add(s)
284
296
  const dom = (extra && extra.domain) || 'code'
@@ -295,10 +307,11 @@ async function resolve(kind, card, evidence, extra) {
295
307
  })
296
308
  } catch (e) {
297
309
  if (e && (e.transientExhausted || isTransient(e))) noteDegraded('outage')
298
- res = { status: 'followup', reason: 'resolve workflow error: ' + String(e && e.message) }
310
+ res = { status: 'followup', reason: 'resolve workflow error: ' + String(e && e.message), deferralClass: 'outage' }
299
311
  }
300
312
  const status = (res && res.status) || 'followup'
301
- if (status === 'fatal') { batchFatal = true; ledger(card, 'resolve:' + kind, 'FATAL', (res && res.reason) || ''); return status }
313
+ const deferralClass = (res && res.deferralClass) || null
314
+ if (status === 'fatal') { batchFatal = true; ledger(card, 'resolve:' + kind, 'FATAL', (res && res.reason) || ''); return { status, deferralClass } }
302
315
  if (status === 'followup') {
303
316
  acceptedDeferrals.add(s) // F-028 — a deferred residual must not be re-routed by a later gate.
304
317
  const fc = (res && res.followupCard) || null
@@ -310,7 +323,7 @@ async function resolve(kind, card, evidence, extra) {
310
323
  residuals.push({ card, kind: 'out-of-scope', evidence: `${osf.file || ''}:${osf.line || ''} ${osf.evidence || ''}`, materialized: false })
311
324
  }
312
325
  ledger(card, 'resolve:' + kind, status, (res && (res.followupCard || res.reason)) || '')
313
- return status
326
+ return { status, deferralClass }
314
327
  }
315
328
 
316
329
  // ───────────────────────────────────────────────────────────────────────────
@@ -337,6 +350,11 @@ async function runCard(cardId, cardPath, lessons) {
337
350
  const node = graphById[cardId] || {}
338
351
  const ownerAgent = node.ownerAgent || 'coder'
339
352
  const reviewProfile = node.reviewProfile || 'balanced'
353
+ // F-040/H — a card carrying an open owner-gated/policy-deferred AC commits its code but stays
354
+ // NON-DONE; the SKILL marks it DONE post-run only after the deferral's follow-up exists on disk
355
+ // in the main repo. Seeded from the pre-flight policy-deferred ACs; set by any owner-gated review
356
+ // deferral or unmet-AC follow-up below.
357
+ let deferredOpen = ((node.policyDeferredACs) || []).length > 0
340
358
  function g(name, decision, detail) { gates.push({ gate: name, decision, detail: detail || '' }); ledger(cardId, name, decision, detail) }
341
359
 
342
360
  // F-026 — skip-completed: only if committed AND gates green for that sha AND no open follow-up.
@@ -361,10 +379,11 @@ async function runCard(cardId, cardPath, lessons) {
361
379
  impl = await agentSafe(
362
380
  `Implement card ${cardId} per ${REF}/implement.md (Phase 1 claim+architect+plan-auditor, Phase 2 you ARE the owner_agent '${ownerAgent}') and ${REF}/completeness.md (Phase 2.5 + 2.5b AC-closure ledger). Run all gates/bash yourself.\n\n${cardBrief}\n\n` +
363
381
  `POLICIES: G26 Phase-2 lint/tsc/test/build failing after the module's retry cap → buildBlocked:true + blockedGate. Build the AC Closure Ledger (one row per AC: implemented|unmet|policy-deferred). DO NOT silently defer; report unmet rows (excluding policy-deferred). Persist arch baseline to /tmp/arch-baseline-${cardId}.md and the diff to /tmp/diff-${cardId}.txt.\n\n` +
364
- `Return: { epic, buildBlocked, blockedGate, unmetACs:[{n,text}], scopeFiles, mayEditPaths, fileDiffViolation, note }`,
382
+ `E4 OWNERSHIP RECONCILE (implement.md §11b — do this BEFORE returning): the card's MAY-EDIT includes files_likely_touched ∪ paths NAMED EXPLICITLY in this card's acceptance_criteria/definition_of_done (e.g. an ADR the DoD says to update, the data-model / ER doc for a schema change). Editing THOSE is in-scope. For any OTHER dirty file outside MAY-EDIT (another card's file, or unrelated): \`git checkout -- <file>\` to revert it (NEVER leave it orphaned), list it in revertedOutOfOwnership. Set fileDiffViolation:true ONLY if such an edit genuinely could not be reverted (then say why in note) — it is no longer a silent label.\n\n` +
383
+ `Return: { epic, buildBlocked, blockedGate, unmetACs:[{n,text}], scopeFiles, mayEditPaths, revertedOutOfOwnership:[paths], fileDiffViolation, note }`,
365
384
  { label: `implement:${cardId}`, phase: 'Implement', agentType: ownerAgent,
366
385
  schema: { type: 'object', required: ['epic', 'buildBlocked', 'unmetACs', 'scopeFiles'], additionalProperties: true,
367
- properties: { epic: { type: 'boolean' }, buildBlocked: { type: 'boolean' }, blockedGate: { type: 'string' }, unmetACs: { type: 'array', items: { type: 'object', additionalProperties: true } }, scopeFiles: { type: 'array', items: { type: 'string' } }, mayEditPaths: { type: 'array', items: { type: 'string' } }, fileDiffViolation: { type: 'boolean' }, note: { type: 'string' } } } }
386
+ properties: { epic: { type: 'boolean' }, buildBlocked: { type: 'boolean' }, blockedGate: { type: 'string' }, unmetACs: { type: 'array', items: { type: 'object', additionalProperties: true } }, scopeFiles: { type: 'array', items: { type: 'string' } }, mayEditPaths: { type: 'array', items: { type: 'string' } }, revertedOutOfOwnership: { type: 'array', items: { type: 'string' } }, fileDiffViolation: { type: 'boolean' }, note: { type: 'string' } } } }
368
387
  )
369
388
  } catch (e) {
370
389
  if (e && e.transientExhausted) { noteDegraded('outage'); return { card: cardId, status: 'pending', gates, telemetry: tele } }
@@ -375,19 +394,28 @@ async function runCard(cardId, cardPath, lessons) {
375
394
 
376
395
  const mayEdit = (impl && impl.mayEditPaths) || []
377
396
  const scopeFiles = (impl && impl.scopeFiles) || []
378
- if (impl && impl.fileDiffViolation) g('E4-file-diff', 'AUTO-REVERTED', 'coder touched files outside ownership')
397
+ // F-040 E4 honest label: 'AUTO-REVERTED' used to be a no-op log (files were left orphaned).
398
+ // Now the owner agent reconciles out-of-ownership edits itself (implement.md §11b); we report
399
+ // what it actually did. A genuine unresolved violation becomes a tracked residual, never silent.
400
+ const reverted = (impl && impl.revertedOutOfOwnership) || []
401
+ if (reverted.length) g('E4-file-diff', 'REVERTED', `out-of-ownership reverted: ${reverted.join(', ')}`)
402
+ if (impl && impl.fileDiffViolation) {
403
+ g('E4-file-diff', 'FLAGGED', 'unresolved out-of-ownership edit — tracked as residual')
404
+ residuals.push({ card: cardId, kind: 'file-diff-violation', evidence: `unresolved out-of-ownership edit: ${(impl && impl.note) || ''}`, materialized: false })
405
+ }
379
406
 
380
407
  if (impl && impl.buildBlocked) {
381
- const s = await resolve('blocker', cardId, `Phase-2 gate failing: ${impl.blockedGate}`, { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })
408
+ const s = (await resolve('blocker', cardId, `Phase-2 gate failing: ${impl.blockedGate}`, { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })).status
382
409
  g('G26-build', s === 'resolved' ? 'RESOLVED' : 'FOLLOWUP', impl.blockedGate)
383
410
  if (s !== 'resolved') { await rollbackCard(cardId, mayEdit); return { card: cardId, status: 'followup', gates, commit: '-', scopeFiles, telemetry: tele } }
384
411
  }
385
412
 
386
413
  // F-010/F-016 — unmet ACs that are policy-deferred are skipped (already tracked).
387
414
  for (const ac of (impl && impl.unmetACs) || []) {
388
- if (acceptedDeferrals.has(sig(cardId, 'ac-unmet', `AC-${ac.n}: ${ac.text}`))) { g('G7-ac-closure', 'DEFERRED-BY-POLICY', `AC-${ac.n}`); continue }
389
- const s = await resolve('ac-unmet', cardId, `AC-${ac.n}: ${ac.text}`, { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })
415
+ if (acceptedDeferrals.has(acSig(cardId, ac.n)) || acceptedDeferrals.has(sig(cardId, 'ac-unmet', `AC-${ac.n}: ${ac.text}`))) { g('G7-ac-closure', 'DEFERRED-BY-POLICY', `AC-${ac.n}`); deferredOpen = true; continue }
416
+ const s = (await resolve('ac-unmet', cardId, `AC-${ac.n}: ${ac.text}`, { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })).status
390
417
  g('G7-ac-closure', s === 'resolved' ? 'RESOLVED' : 'FOLLOWUP', `AC-${ac.n}`)
418
+ if (s !== 'resolved') deferredOpen = true // F-040/H — unmet AC tracked as follow-up → card stays NON-DONE
391
419
  }
392
420
 
393
421
  // --- Review fan-out (F-024/F-025): specialized agents, trimmed by review_profile. ---
@@ -437,23 +465,38 @@ async function runCard(cardId, cardPath, lessons) {
437
465
  let cardBlocked = false
438
466
  for (const b of blocks) {
439
467
  const kind = /e2e/i.test(b.gate) ? 'e2e-blocked' : /qa/i.test(b.gate) ? 'qa-fail' : 'blocker'
440
- const s = await resolve(kind, cardId, `${b.gate}: ${b.evidence}`, { mayEditPaths: mayEdit, scopeFiles, domain: b.domain || 'code' })
441
- g(b.gate, s === 'resolved' ? 'RESOLVED' : 'FOLLOWUP', b.evidence)
442
- if (s !== 'resolved') cardBlocked = true
468
+ const r = await resolve(kind, cardId, `${b.gate}: ${b.evidence}`, { mayEditPaths: mayEdit, scopeFiles, domain: b.domain || 'code' })
469
+ // F-040 — THE primary fix. An owner-gated / not-a-code-defect deferral means the card's OWN
470
+ // code is complete and correct; the residual is an external/infra step (e.g. a remote db push)
471
+ // already tracked as a follow-up. Do NOT roll the card back — it proceeds to commit, NON-DONE
472
+ // (the skill marks it DONE post-run once the follow-up exists on disk). This replaces the old
473
+ // `s !== 'resolved' → cardBlocked` which destroyed a completed migration card's work over a db:push gate.
474
+ // A genuine unresolved CODE defect (or out-of-ownership/baseline/outage) still blocks + rolls back.
475
+ const ownerGated = r.status === 'followup' && (r.deferralClass === 'owner-gated' || r.deferralClass === 'not-a-code-defect')
476
+ g(b.gate, r.status === 'resolved' ? 'RESOLVED' : ownerGated ? 'DEFERRED-OWNER-GATED' : 'FOLLOWUP', b.evidence)
477
+ if (ownerGated) deferredOpen = true
478
+ else if (r.status !== 'resolved') cardBlocked = true
443
479
  }
444
480
  for (const sx of scopeExp) {
445
- const s = await resolve('scope-expansion', cardId, sx.evidence || '', { mayEditPaths: mayEdit, scopeFiles, domain: sx.domain || 'code' })
481
+ const s = (await resolve('scope-expansion', cardId, sx.evidence || '', { mayEditPaths: mayEdit, scopeFiles, domain: sx.domain || 'code' })).status
446
482
  g('scope-expansion', s === 'resolved' ? 'INTEGRATED' : 'FOLLOWUP', sx.evidence || '')
447
483
  }
448
484
 
449
485
  if (cardBlocked) { await rollbackCard(cardId, mayEdit); return { card: cardId, status: 'followup', gates, commit: '-', scopeFiles, archBaselinePath: `/tmp/arch-baseline-${cardId}.md`, telemetry: tele } }
450
486
 
451
487
  // --- Phase 4 — commit (F-023: Haiku + git-status reconcile, never git add -A). ---
488
+ // F-040/H — DONE policy. A card with an OPEN owner-gated/policy-deferred AC commits its code but
489
+ // must NOT be marked DONE here (its own DoD isn't met yet — e.g. the remote db:push is pending).
490
+ // The new2 SKILL marks it DONE post-run, ONLY after the deferral's follow-up exists on disk in the
491
+ // main repo (so a card is never DONE with a silently-dropped requirement — F-029).
492
+ const doneStep = deferredOpen
493
+ ? `(4) DO NOT mark the card DONE: it has an OPEN owner-gated/policy-deferred AC. Keep status IN_PROGRESS and add an implementation_note "deferred — DONE pending follow-up (new2 skill reconciles post-run)". STILL add the ssot-registry row for the committed code.`
494
+ : `(4) mark the card DONE in its YAML + add the ssot-registry row.`
452
495
  let commitRes
453
496
  try {
454
497
  commitRes = await agentSafe(
455
498
  `Commit card ${cardId} in worktree ${sharedCtx.worktreePath}. MECHANICAL — do NOT re-read reference modules.\n` +
456
- `Steps: (1) \`git status --porcelain\`; (2) stage = MAY-EDIT (${JSON.stringify(mayEdit)}) ∩ dirty — NEVER \`git add -A\`, NEVER \`git stash\`; if dirty has files OUTSIDE MAY-EDIT, do NOT stage them and set reconcileNote; (3) commit message \`[${cardId}] <concise>\`; (4) mark the card DONE in its YAML + add the ssot-registry row; (5) 'nothing to commit' = already committed (record HEAD).\n` +
499
+ `Steps: (1) \`git status --porcelain\`; (2) stage = MAY-EDIT (${JSON.stringify(mayEdit)}) ∩ dirty — NEVER \`git add -A\`, NEVER \`git stash\`; if dirty has files OUTSIDE MAY-EDIT, do NOT stage them and set reconcileNote; (3) commit message \`[${cardId}] <concise>\`; ${doneStep} (5) 'nothing to commit' = already committed (record HEAD).\n` +
457
500
  `On COMMIT_LOCK: clear stale lock + retry once. Still locked → committed:false.\n\n` +
458
501
  `Return: { committed, commit, filesChanged, reconcileNote }`,
459
502
  { label: `commit:${cardId}`, phase: 'Implement', agentType: 'general-purpose', model: 'haiku',
@@ -465,17 +508,20 @@ async function runCard(cardId, cardPath, lessons) {
465
508
  }
466
509
 
467
510
  if (!commitRes || !commitRes.committed) {
468
- const s = await resolve('blocker', cardId, 'commit blocked after retries', { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })
511
+ const s = (await resolve('blocker', cardId, 'commit blocked after retries', { mayEditPaths: mayEdit, scopeFiles, domain: 'code' })).status
469
512
  g('G16-commit', s === 'resolved' ? 'RESOLVED' : 'FOLLOWUP')
470
513
  if (s !== 'resolved') { await rollbackCard(cardId, mayEdit); return { card: cardId, status: 'followup', gates, commit: '-', scopeFiles, archBaselinePath: `/tmp/arch-baseline-${cardId}.md`, telemetry: tele } }
471
514
  }
472
515
  if (commitRes && commitRes.reconcileNote) g('commit-reconcile', 'NOTE', commitRes.reconcileNote)
473
516
 
474
- g('commit', 'COMMITTED', (commitRes && commitRes.commit) || '')
517
+ g('commit', 'COMMITTED', `${(commitRes && commitRes.commit) || ''}${deferredOpen ? ' (NON-DONE — deferred, skill reconciles)' : ''}`)
475
518
  return {
476
519
  card: cardId, status: 'committed',
477
520
  commit: (commitRes && commitRes.commit) || '-',
478
521
  filesChanged: (commitRes && commitRes.filesChanged) || [],
522
+ // F-040/H — true when this committed card is intentionally left NON-DONE (open deferral). The
523
+ // merge agent leaves it non-DONE and the SKILL marks it DONE after its follow-up materialises.
524
+ deferred: deferredOpen,
479
525
  scopeFiles, archBaselinePath: `/tmp/arch-baseline-${cardId}.md`, gates, telemetry: tele,
480
526
  }
481
527
  }
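The review-block branch in the hunk above reduces to a three-way decision on the object returned by `resolve()`. The sketch below isolates that decision as a pure function; `classifyReviewBlock` is a hypothetical name introduced for illustration, while the `ownerGated` predicate and the ledger labels are taken from the diff.

```javascript
// Maps a resolve() result to the gate label and the two flags runCard tracks.
// Only owner-gated / not-a-code-defect follow-ups avoid the rollback path.
function classifyReviewBlock(r) {
  const ownerGated = r.status === 'followup' &&
    (r.deferralClass === 'owner-gated' || r.deferralClass === 'not-a-code-defect')
  if (r.status === 'resolved') return { decision: 'RESOLVED', deferredOpen: false, cardBlocked: false }
  if (ownerGated) return { decision: 'DEFERRED-OWNER-GATED', deferredOpen: true, cardBlocked: false }
  return { decision: 'FOLLOWUP', deferredOpen: false, cardBlocked: true }
}

const resolved = classifyReviewBlock({ status: 'resolved', deferralClass: null })
const ownerGatedCase = classifyReviewBlock({ status: 'followup', deferralClass: 'owner-gated' })
const outageCase = classifyReviewBlock({ status: 'followup', deferralClass: 'outage' })
const unclassified = classifyReviewBlock({ status: 'followup', deferralClass: null })

console.log(resolved.decision, ownerGatedCase.decision, outageCase.decision, unclassified.decision)
```

A follow-up with no `deferralClass` falls through to the blocking branch, so an unclassified residual still rolls the card back as it did before this change.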
@@ -594,14 +640,14 @@ if (committed.length && !batchFatal && !degraded) {
594
640
  }
595
641
  for (const area of Object.keys(byArea)) {
596
642
  const group = byArea[area]
597
- const s = await resolve('merge-blocker', group[0].finding_id || firstCard,
643
+ const s = (await resolve('merge-blocker', group[0].finding_id || firstCard,
598
644
  group.map((f) => `${f.severity} ${f.title}: ${f.evidence}`).join(' || '),
599
645
  { mayEditPaths: reviewScopeFiles, scopeFiles: reviewScopeFiles, domain: group[0].domain || 'code',
600
- findings: group.map((f) => ({ kind: 'merge-blocker', evidence: `${f.title}: ${f.evidence}`, domain: f.domain || 'code' })) })
646
+ findings: group.map((f) => ({ kind: 'merge-blocker', evidence: `${f.title}: ${f.evidence}`, domain: f.domain || 'code' })) })).status
601
647
  if (s !== 'resolved') mergeBlocked = true
602
648
  }
603
649
  if (finalSummary && finalSummary.failingGates && finalSummary.failingGates.length) {
604
- const s = await resolve('qa-fail', firstCard, `final gates failing: ${finalSummary.failingGates.join(', ')}`, { mayEditPaths: reviewScopeFiles, scopeFiles: reviewScopeFiles, domain: 'code' })
650
+ const s = (await resolve('qa-fail', firstCard, `final gates failing: ${finalSummary.failingGates.join(', ')}`, { mayEditPaths: reviewScopeFiles, scopeFiles: reviewScopeFiles, domain: 'code' })).status
  if (s !== 'resolved') mergeBlocked = true
  }
  } else {
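Note on the hunk above: both call sites now read `.status` from the awaited `resolve(...)` result, which suggests `resolve` returns an object rather than a bare status string. A minimal sketch of why the unwrap matters, assuming a hypothetical stand-in `fakeResolve` (the real `resolve` signature and any fields other than `status` are not shown in this diff, and the sketch is synchronous for brevity):

```javascript
// Hypothetical stand-in for resolve(); only the `status` field is implied by
// the diff. The real function is awaited; this sketch is synchronous.
function fakeResolve(kind) {
  return { status: kind === 'qa-fail' ? 'unresolved' : 'resolved' }
}

// Mirrors the patched call sites: compare the unwrapped status string.
function isBlocked(kind) {
  const s = fakeResolve(kind).status
  return s !== 'resolved'
}

// Mirrors an un-patched call site against an object-returning resolve():
// an object is never strictly equal to 'resolved', so this always blocks.
function isBlockedWithoutUnwrap(kind) {
  const s = fakeResolve(kind)
  return s !== 'resolved'
}

console.log(isBlocked('merge-blocker'))              // false
console.log(isBlocked('qa-fail'))                    // true
console.log(isBlockedWithoutUnwrap('merge-blocker')) // true
```

If that reading is right, leaving a call site un-patched would set `mergeBlocked` on every run, including successful resolutions.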
@@ -621,6 +667,10 @@ if (committed.length && !batchFatal && !degraded) {
  phase('Merge')
  let mergeResult = null
  const incomplete = runnableCards.filter((id) => state[id] !== 'committed' && state[id] !== 'epic-skipped')
+ // F-040/H — committed cards intentionally left NON-DONE (open owner-gated/policy-deferred AC). They
+ // ARE merged (their code is complete), but Phase 6b must NOT force them to DONE; the SKILL does that
+ // post-run once the deferral's follow-up exists on disk. They count as complete for the merge gate.
+ const deferredCards = committed.filter((r) => r.deferred).map((r) => r.card)
  const integrityOK = committed.length > 0 && !mergeBlocked && !batchFatal && !degraded && incomplete.length === 0
  if (!committed.length) {
  ledger(firstCard, 'merge', 'SKIPPED', 'no committed cards')
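Note on the hunk above: the comment says deferred cards "count as complete for the merge gate". The sketch below illustrates how that falls out of the existing expressions, since a deferred card still has state `committed`. The card IDs and sample data are hypothetical, and the other flags are pinned to their passing values.

```javascript
// Hypothetical sample data; field names follow the diff (card, status, deferred).
const committed = [
  { card: 'CARD-1', status: 'committed', deferred: false },
  { card: 'CARD-2', status: 'committed', deferred: true },
]
const state = { 'CARD-1': 'committed', 'CARD-2': 'committed', 'CARD-3': 'epic-skipped' }
const runnableCards = Object.keys(state)

// A deferred card is still in state 'committed', so it is not "incomplete".
const incomplete = runnableCards.filter((id) => state[id] !== 'committed' && state[id] !== 'epic-skipped')

// Same expression as the added line in the diff.
const deferredCards = committed.filter((r) => r.deferred).map((r) => r.card)

// Remaining flags from the diff, fixed here to their passing values.
const mergeBlocked = false, batchFatal = false, degraded = false
const integrityOK = committed.length > 0 && !mergeBlocked && !batchFatal && !degraded && incomplete.length === 0

console.log({ incomplete, deferredCards, integrityOK })
```

In this sample the gate passes while `deferredCards` still names `CARD-2`, so the merge can proceed and the DONE decision for that card is left to a later step.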
@@ -635,15 +685,23 @@ if (!committed.length) {
  `• G24 → auto-merge via merge_strategy.\n` +
  `• F-030 HARD RULE: NEVER \`git add\`/commit code that did not pass the per-card gates. If the worktree is dirty with uncommitted code → DO NOT commit it; leave it, set uncommittedLeft:true, and report. NO "safety commit". Security/migration code is NEVER swept in.\n` +
  `• F-029 HARD RULE: Phase 6b reconciliation marks a card DONE ONLY if it has a real commit in ${TRUNK}..HEAD AND its gates are green. NEVER force a non-implemented card to DONE. Return forcedDone:[] (must be empty).\n` +
+ `• F-040 DEFERRED CARDS — leave NON-DONE (do NOT force to DONE in Phase 6b): ${deferredCards.length ? deferredCards.join(' ') : '(none)'}. These committed their code but carry an OPEN owner-gated/policy-deferred AC (e.g. a pending remote db:push). Their YAML is INTENTIONALLY IN_PROGRESS; the new2 skill marks them DONE post-run after materialising the deferral's follow-up. They ARE part of the merge — just skip them in the DONE-reconciliation. Return deferredLeftOpen:[the ones you left non-DONE].\n` +
  `• EPIC CLOSURE (Phase 6b step 5e): the epic/parent card (group.is_epic:true) is NOT in the batch and stays TODO unless closed here. For each distinct group.parent of the batch cards (and any epic card in the batch itself): if EVERY child of that epic — \`grep -l "parent: <EPIC-ID>" backlog/*.yml | xargs grep -L "status: DONE"\` prints nothing — set the epic card status:DONE + completed_date + note "epic-closure gate — all children DONE" and fold into the reconciliation commit. If any child is still open → leave the epic untouched. This is NOT a forcedDone violation (the epic is a tracker, gated on all-children-DONE, not on its own commit). Return epicsClosed:[<EPIC-IDs marked DONE>].\n` +
  `• G19 sync-deferred → HEAD==${TRUNK} ff-pull, else leave+report. G20 → leave+report. G21 post-batch dirty → partition-ignore framework artifacts; leave the rest + report (do NOT commit). G22 divergence → behind: ff-pull; ahead/both: leave+report; NEVER reset --hard/force-push. G23 stash restore conflict → leave intact + report.\n\n` +
- `Return: { merged, mergeCommit, mergeTs, reconciliation, forcedDone:[], uncommittedLeft, note }`,
+ `Return: { merged, mergeCommit, mergeTs, reconciliation, forcedDone:[], deferredLeftOpen:[], uncommittedLeft, note }`,
  { label: 'merge', phase: 'Merge', agentType: 'general-purpose', schema: MERGE_SCHEMA }
  )
  } catch (e) { if (e && e.transientExhausted) noteDegraded('outage'); mergeResult = null }
  if (mergeResult && (mergeResult.forcedDone || []).length) { noteDegraded('false_done'); ledger(firstCard, 'F029-guard', 'VIOLATION', `forcedDone: ${mergeResult.forcedDone.join(' ')}`) }
  if (mergeResult && mergeResult.uncommittedLeft) ledger(firstCard, 'F030-guard', 'LEFT-UNCOMMITTED', 'dirty code left (not swept) + reported')
  if (mergeResult && (mergeResult.epicsClosed || []).length) ledger(firstCard, 'epic-closure', 'CLOSED', `epics marked DONE (all children DONE): ${mergeResult.epicsClosed.join(' ')}`)
+ if (deferredCards.length) {
+ ledger(firstCard, 'F040-deferred', 'LEFT-NON-DONE', `${deferredCards.join(' ')} — skill marks DONE post-run after follow-up materialises`)
+ // F-040 guard — catch a merge agent that ignored the instruction and force-DONE'd a deferred card.
+ const leftOpen = (mergeResult && mergeResult.deferredLeftOpen) || []
+ const wronglyDone = deferredCards.filter((c) => !leftOpen.includes(c))
+ if (mergeResult && wronglyDone.length) { noteDegraded('false_done'); ledger(firstCard, 'F040-guard', 'VIOLATION', `deferred cards force-DONE by merge: ${wronglyDone.join(' ')}`) }
+ }
  ledger(firstCard, 'G24-merge', (mergeResult && mergeResult.merged) ? 'MERGED' : 'INCOMPLETE', (mergeResult && (mergeResult.mergeCommit || mergeResult.note)) || '')
  if (mergeResult && mergeResult.reconciliation) ledger(firstCard, 'G19-23-reconcile', 'AUTO', mergeResult.reconciliation)
  }
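Note on the hunk above: the F-040 guard compares the cards the workflow asked to leave open against the `deferredLeftOpen` list the merge agent reports back. A minimal sketch of that set difference, with the helper name `findWronglyDone` and the card IDs being hypothetical:

```javascript
// Returns the deferred cards the merge agent did NOT report as left open.
// `mergeResult` may be null when the merge step failed (the diff's catch path).
function findWronglyDone(deferredCards, mergeResult) {
  const leftOpen = (mergeResult && mergeResult.deferredLeftOpen) || []
  return deferredCards.filter((c) => !leftOpen.includes(c))
}

const deferredCards = ['CARD-2', 'CARD-5'] // hypothetical IDs

// Agent reported both as left open: nothing is flagged.
console.log(findWronglyDone(deferredCards, { deferredLeftOpen: ['CARD-2', 'CARD-5'] }))
// Agent omitted CARD-5: it is flagged as force-DONE.
console.log(findWronglyDone(deferredCards, { deferredLeftOpen: ['CARD-2'] }))
// Agent returned no deferredLeftOpen field at all: every deferred card is flagged.
console.log(findWronglyDone(deferredCards, {}))
```

Two consequences seem worth keeping in mind when reading the diff. A merge result that omits `deferredLeftOpen` entirely is treated the same as one that force-DONE'd every deferred card. And because the diff's condition is `mergeResult && wronglyDone.length`, a null `mergeResult` records no `F040-guard` violation even though the difference is non-empty.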
@@ -686,6 +744,9 @@ function buildTelemetry() {
  ts: TS || null,
  cards_total: cardIds.length,
  cards_real_done: perCardResults.filter((r) => r.status === 'committed').length,
+ // F-040/H — committed cards left NON-DONE pending their owner-gated follow-up (the skill marks
+ // them DONE post-run). Surfaced so the A/B telemetry distinguishes "code landed" from "DONE".
+ cards_deferred_done_pending: perCardResults.filter((r) => r.deferred).length,
  cards_force_done: 0, // F-029 — force-DONE forbidden; always 0.
  cards_followup: perCardResults.filter((r) => r.status === 'followup').length,
  cards_blocked: runnableCards.filter((id) => state[id] === 'blocked').length,
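Note on the hunk above: a deferred card is returned with `status: 'committed'`, so it appears to be counted in `cards_real_done` as well as in the new `cards_deferred_done_pending`. The sketch below works through the arithmetic on hypothetical sample results. The derived `doneNow` figure is an illustration only and is not a field in the diff.

```javascript
// Hypothetical per-card results; field names follow the diff.
const perCardResults = [
  { card: 'CARD-1', status: 'committed', deferred: false },
  { card: 'CARD-2', status: 'committed', deferred: true },
  { card: 'CARD-3', status: 'followup' },
]

// Same filter expressions as the telemetry lines in the diff.
const telemetry = {
  cards_real_done: perCardResults.filter((r) => r.status === 'committed').length,
  cards_deferred_done_pending: perCardResults.filter((r) => r.deferred).length,
  cards_followup: perCardResults.filter((r) => r.status === 'followup').length,
}

// Illustrative only (not in the diff): committed cards that are not awaiting
// an owner-gated follow-up.
const doneNow = telemetry.cards_real_done - telemetry.cards_deferred_done_pending

console.log(telemetry, doneNow)
```

Under these assumptions, a consumer that wants "code landed and marked DONE" would subtract the pending count rather than read `cards_real_done` directly.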
@@ -716,7 +777,7 @@ function buildReport(o) {
  L.push(``, `## Esito card`)
  L.push(`| Card | Status | Commit | File |`)
  L.push(`|------|--------|--------|------|`)
- for (const r of perCardResults) L.push(`| ${r.card} | ${r.status} | ${r.commit || '-'} | ${(r.filesChanged || []).length} |`)
+ for (const r of perCardResults) L.push(`| ${r.card} | ${r.status}${r.deferred ? ' (NON-DONE: deferred)' : ''} | ${r.commit || '-'} | ${(r.filesChanged || []).length} |`)
  const blockedIds = runnableCards.filter((id) => state[id] === 'blocked' || state[id] === 'pending')
  for (const id of blockedIds) L.push(`| ${id} | ${state[id]} | - | 0 |`)
  if (finalSummary) {
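Note on the hunk above: the report row now appends a `(NON-DONE: deferred)` suffix to the status cell. A small sketch of the resulting rows, wrapping the diff's template in a hypothetical `reportRow` helper with made-up card data:

```javascript
// Hypothetical wrapper around the row template from the diff.
function reportRow(r) {
  return `| ${r.card} | ${r.status}${r.deferred ? ' (NON-DONE: deferred)' : ''} | ${r.commit || '-'} | ${(r.filesChanged || []).length} |`
}

// Deferred committed card: the suffix is appended to the status cell.
console.log(reportRow({ card: 'CARD-2', status: 'committed', deferred: true, commit: 'abc1234', filesChanged: ['a.sql', 'b.md'] }))
// prints: | CARD-2 | committed (NON-DONE: deferred) | abc1234 | 2 |

// Card without commit or files: the fallbacks '-' and 0 are used.
console.log(reportRow({ card: 'CARD-3', status: 'followup' }))
// prints: | CARD-3 | followup | - | 0 |
```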
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "baldart",
- "version": "4.23.0",
+ "version": "4.24.1",
  "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
  "bin": {
  "baldart": "./bin/baldart.js"