baldart 4.35.0 → 4.36.0

package/CHANGELOG.md CHANGED
@@ -5,6 +5,25 @@ All notable changes to BALDART will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [4.36.0] - 2026-06-13
9
+
10
+ **`/new` security-domain fixes are now applied by `security-reviewer`, not `coder` — the v4.26.1 canonical writer map, finally propagated from `new2` to `/new`.** Auditing the `new2` lessons for guards/logic missing on `/new` surfaced one real gap (the others — args-string guard, JS router clamp, no-self-judge + specialist-owned lane, relevance-gated fan-out — were already present on `/new`). `new2-resolve.js` routes security fixes to `security-reviewer` (`fixerAgent = {doc:'doc-reviewer', ui:'ui-expert', security:'security-reviewer'}[domain] || 'coder'`), but the canonical writer map was never propagated to `/new`'s SSOT: the `Domain-Override Domains` table (SKILL.md) and every fix-routing site still sent `security` → `coder`. A coder applying a one-line RLS/permission/auth fix lacks the security-invariant contract that lives in `security-reviewer`'s system prompt — the same class of error as "wrong agent for the card", and a direct violation of the user's standing strict-specialization principle. **MINOR** (changes which agent applies security fixes across `/new`; backwards-compatible — `migration` stays `coder`, no install/layout change, no `baldart.config.yml` key ⇒ schema-propagation rule N/A).
11
+
12
+ ### Changed
13
+
14
+ - **`framework/.claude/skills/new/SKILL.md`** — `Domain-Override Domains` table: `security` owning agent `coder` → **`security-reviewer`** (write mode), plus a new "Why `security` is owned by `security-reviewer`" rationale mirroring the `doc` one. The sequential-mode overview line aligned too.
15
+ - **`framework/.claude/skills/new/references/review-cycle.md`**, **`final-review.md`**, **`team-mode.md`**, **`codex-gate.md`** — every security-fix routing site (Phase 2.55 Domain-Override delegation, the delegated-workflow residual routing, the Final FULL merge-blocking partition, the Phase 3.7 codex fix sub-loop, and the doc-drift→bug security path) now routes `security` → `security-reviewer` and runs it before the `coder` pass. `migration` stays `coder`.
16
+ - **`framework/.claude/workflows/new-card-review.js`** — the Fix phase no longer folds security into the single coder pass. It partitions `VERIFIED` findings into a `security-reviewer` pass (domain `security`) and a `coder` pass (`code`/`perf`/`migration`/`test`/`simplify`), run sequentially (security first) over the disjoint-by-ownership editable set so shared-file edits never conflict; a FAIL in either pass fails the wave. `new-final-review.js` needed no change (it is read-only — the calling skill applies fixes — and its `domainVerifier` already routes security verification to `security-reviewer`).
17
+ - **`framework/.claude/agents/security-reviewer.md`** — new "Dual mode — review vs. apply" Behavior Rule: by default it audits and proposes (read-only), but when invoked as the security domain writer (by `/new`/`new2`/the codex fix loop) it APPLIES the remediation directly via Edit/Write and re-verifies — security fixes are owned by it, never deferred to a coder.
18
+
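The Fix-phase partition described in the `new-card-review.js` entry above can be sketched roughly as follows. This is a minimal illustration, not the shipped workflow. The `isDoc` / `isSecurity` / `isManual` predicates mirror the ones visible in the diff, while `partitionFindings`, `runFixPasses`, the synchronous `runPass` callback, and the `'PASS'`/`'FAIL'` return values are hypothetical names assumed for the example.

```javascript
// Predicates mirroring the ones shown in the new-card-review.js hunk.
const isDoc = (f) => /doc|wiki|ssot|readme/.test(String(f.domain).toLowerCase())
const isSecurity = (f) => String(f.domain).toLowerCase() === 'security'
const isManual = (f) => f.classification === 'NEEDS_MANUAL_CONFIRMATION'

// Hypothetical helper: split surviving findings into disjoint buckets.
function partitionFindings(surviving) {
  const verified = surviving.filter((f) => !isManual(f))
  return {
    securityFix: verified.filter((f) => isSecurity(f)),              // applied by security-reviewer
    actionable: verified.filter((f) => !isDoc(f) && !isSecurity(f)), // applied by coder
    docResidual: verified.filter((f) => isDoc(f)),                   // left for doc-reviewer
    manualResidual: surviving.filter((f) => isManual(f)),            // left for the human gate
  }
}

// Hypothetical runner: security pass first, then the coder pass.
// `runPass(agentName, findings)` is assumed to return 'PASS' or 'FAIL'.
function runFixPasses(parts, runPass) {
  const results = []
  if (parts.securityFix.length) results.push(runPass('security-reviewer', parts.securityFix))
  if (parts.actionable.length) results.push(runPass('coder', parts.actionable))
  // A FAIL in either pass fails the wave.
  return results.includes('FAIL') ? 'FAIL' : 'PASS'
}
```

The sketch runs both passes even when the first one fails; whether the real workflow short-circuits is not stated in the changelog. Matching `security` by exact string keeps `migration` findings in the coder bucket, as the entry requires.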
19
+ ## [4.35.1] - 2026-06-13
20
+
21
+ **`/new` workflow delegation no longer degrades to a silent no-op when `args` arrives as a JSON string.** A live `/new FEAT-0027 -full` team-mode run delegated its per-wave review cluster to the `new-card-review` workflow and got back a degenerate result (`cards:0`, 0 agents, ~24ms) — the orchestrator correctly fell back to the inline cluster, but the delegation (the single biggest context-economy win in team mode) was wasted on every wave. Root cause: the `Workflow` tool sometimes serializes a structured `args` object to a JSON **string**; `new-card-review.js` and `new-final-review.js` read `args.cards` / `args.reviewScopeFiles` directly, so a string `args` left those `undefined` → empty scope → the early-return guard fired. The `new2` family (`new2.js`, `new2-resolve.js`) had already been hardened against exactly this (`F-001/F-004` parse-or-default guard), but the fix was **never propagated** to the two `/new` workflows — a parallel-location miss. **PATCH** (bugfix to shipped workflow payload, no behaviour change to install, no config key ⇒ schema-propagation rule N/A).
22
+
23
+ ### Fixed
24
+
25
+ - **`framework/.claude/workflows/new-card-review.js`**, **`framework/.claude/workflows/new-final-review.js`** — added the same defensive `if (typeof a === 'string') { try { a = JSON.parse(a) } catch (_) { a = {} } }` guard already present in `new2.js`/`new2-resolve.js`. All four workflows now tolerate `args` delivered as a JSON string, so `/new`'s delegated review cluster and Final Review fan-out run as intended instead of no-op'ing into the inline fallback.
26
+
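The parse-or-default guard from the 4.35.1 fix can be tried in isolation with a small sketch like the one below. The function name `normalizeArgs` is hypothetical (the shipped workflows inline the guard), and the final non-object check is an extra hardening step assumed here, not something the changelog describes.

```javascript
// Hypothetical wrapper around the guard quoted in the changelog entry.
function normalizeArgs(args) {
  let a = args || {}
  // The guard from the fix: tolerate args delivered as a JSON string.
  if (typeof a === 'string') { try { a = JSON.parse(a) } catch (_) { a = {} } }
  // Assumption beyond the source: a string such as 'null' or '42' parses to a
  // non-object, so fall back to an empty object before reading properties.
  if (a === null || typeof a !== 'object') a = {}
  return a
}

// Usage: both call shapes yield the same card list.
const fromObject = normalizeArgs({ cards: [{ cardId: 'FEAT-0027' }] })
const fromString = normalizeArgs('{"cards":[{"cardId":"FEAT-0027"}]}')
```

With the guard in place, `fromString.cards` is a real array rather than `undefined`, so the early-return path for an empty scope is not taken.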
8
27
  ## [4.35.0] - 2026-06-13
9
28
 
10
29
  **Card-baseline standardization — every backlog card, any prefix/origin, conforms to one profile-aware SSOT; `/new` normalizes foreign cards at ingestion.** A real `CHORE-0007` (consumer repo `mayo`) reached `/new` **without `review_profile`** (and without `scope`/`scope_boundaries`/`canonical_docs`): it was hand/ad-hoc authored after a graph-align finding, never by the canonical writer. `/new` and `/new2` consume cards **type-blind** — they scale per-card review depth on `review_profile` and run the same pipeline regardless of prefix — so an off-baseline card silently degrades the pipeline. Root cause: the baseline was scattered across `card-template.yml` + `prd-card-writer`'s Required-Fields + Rule C with **no single SSOT and no validator**; `prd-card-writer` only documented the `/prd` epic+children flow (its standalone single-card mode, used by `new2-resolve`, was undocumented); and three writers diverged — `/prd`/`new2`/`new2-resolve` emit the full baseline, but the `/new` AC-deferral stub (`completeness.md`) and `/issue-review` (`issue-review.md`) wrote partial cards.
package/VERSION CHANGED
@@ -1 +1 @@
1
- 4.35.0
1
+ 4.36.0
@@ -50,6 +50,7 @@ Before reviewing:
50
50
 
51
51
  ## Behavior Rules
52
52
 
53
+ - **Dual mode — review vs. apply.** By default you AUDIT and *propose* remediations (read-only). But when invoked as the **security domain writer** (e.g. by `/new` / `new2` / the Phase 3.7 codex fix loop, whose brief tells you to "apply the verified security findings" in write mode), you ARE the fixer: APPLY the minimal remediation directly with the Edit/Write tools, then re-verify (lint/tsc/build as instructed). Security fixes are owned by you — never deferred to a coder — because the auth/permission/RLS/multi-tenant-isolation invariants live in YOUR system prompt, not the coder's. Stay within the files the brief's ownership map allows; if a fix needs a file outside that scope, report it as residual rather than expanding scope.
53
54
  - Be extremely critical, thorough, and skeptical. Optimize for correctness and security, not politeness.
54
55
  - Do NOT assume the developer did things safely unless proven by code evidence.
55
56
  - Treat ALL external input as hostile.
@@ -291,7 +291,7 @@ per-card in the D.x sub-steps (never aggregate). Load it when Pre-flight selects
291
291
  ### Sequential mode (default for small batches)
292
292
 
293
293
  - Cards execute one at a time through the full per-card pipeline (Phases 1-5).
294
- - Code review and doc review for the same card run as **parallel read-only audits**, then fixes are applied by domain owner: **doc findings → `doc-reviewer` (write mode)**, code/security/migration findings → `coder`. (Sequential Phase 3 is even simpler — doc-reviewer runs alone, so it audits AND applies in one invocation.)
294
+ - Code review and doc review for the same card run as **parallel read-only audits**, then fixes are applied by domain owner (see § "Domain-Override Domains"): **doc findings → `doc-reviewer` (write mode)**, **security findings → `security-reviewer` (write mode)**, code/perf/migration findings → `coder`. (Sequential Phase 3 is even simpler — doc-reviewer runs alone, so it audits AND applies in one invocation.)
295
295
  - This mode is unchanged from the original behavior.
296
296
 
297
297
  ### Team mode (for complex batches)
@@ -541,11 +541,13 @@ Enumerated exhaustively:
541
541
  | Domain | Owning agent | Match rule |
542
542
  |---|---|---|
543
543
  | `doc` | **`doc-reviewer`** (write mode) | File path matching `*.md` under `${paths.references_dir}`, `${paths.prd_dir}`, project root `CHANGELOG.md`, or any `ssot-registry.md`. |
544
- | `security` | `coder` | File path matching any entry in `paths.high_risk_modules` (`baldart.config.yml`) — the same auth/permission/payment-class paths the Phase 3.7 Step A detector reads. Also any SQL migration whose content matches `CREATE POLICY|ALTER POLICY|DROP POLICY` (RLS policy mutations). If `paths.high_risk_modules` is absent, the security match rule emits a one-line diagnostic and matches nothing (no hardcoded default). |
544
+ | `security` | **`security-reviewer`** (write mode) | File path matching any entry in `paths.high_risk_modules` (`baldart.config.yml`) — the same auth/permission/payment-class paths the Phase 3.7 Step A detector reads. Also any SQL migration whose content matches `CREATE POLICY|ALTER POLICY|DROP POLICY` (RLS policy mutations). If `paths.high_risk_modules` is absent, the security match rule emits a one-line diagnostic and matches nothing (no hardcoded default). |
545
545
  | `migration` | `coder` | File path matching `${paths.migrations_dir}/*.sql` if `paths.migrations_dir` is defined in `baldart.config.yml`; otherwise the project's migrations dir per convention (`migrations/`, `db/migrate/`, `supabase/migrations/`, `prisma/migrations/`). |
546
546
 
547
547
  **Why `doc` is owned by `doc-reviewer`, not `coder` (since v3.40.0)** — the doc invariants the orchestrator must not break (freshness markers, linking protocol, frontmatter standard, tabular formatting, SSOT/registry coverage, dependency-topological order, SCIP/code refs) are encoded in the **`doc-reviewer`** system prompt, NOT the coder's. The coder is a code-oriented agent that lacks the doc-invariant contract — routing doc fixes to it is the wrong agent doing work the auditing agent already has full context for. The agent that *audits* the docs is also the agent that *fixes* them (`doc-reviewer.md` § Constraints: "WRITE missing docs directly. You are fully responsible — do not defer to other agents"). NEVER route a `doc`-domain fix to `coder`.
548
548
 
549
+ **Why `security` is owned by `security-reviewer`, not `coder` (since v4.36.0)** — the same logic as `doc`, applied to the security domain (canonical writer map v4.26.1; user principle "only coder writes code, only security-reviewer writes security"). The auth/permission/RLS/multi-tenant-isolation invariants live in the **`security-reviewer`** system prompt, not the coder's; a coder applying a one-line RLS or permission fix without that contract is the same class of error as the "wrong agent for the card". `security-reviewer` is the writer for security-domain fixes — it audits AND applies. `migration` stays `coder` (SQL authoring is the coder's lane; a migration's security-policy content matching the RLS rule above is classified `security`, not `migration`). NEVER route a `security`-domain fix to `coder`.
550
+
549
551
  **Edge case explicit** — a mechanical append-a-row update to `CHANGELOG.md` or `ssot-registry.md` is still classified `doc` and still goes through `doc-reviewer`, never inline and never `coder`. The uniformity of the rule matters more than the cost of the individual spawn.
550
552
 
551
553
  Domains NOT listed here remain governed by the per-phase rules of the corresponding phase (e.g. `simplify-*` follows Phase 2.55 inline rule).
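The match rules in the Domain-Override table can be read as a small classifier. The sketch below is one possible reading, not the skill's actual logic. `classifyDomain` and `underDir` are hypothetical names, high-risk entries are treated as plain path prefixes, files nested under the migrations directory count as migrations, `doc` is checked before `security`, and an absent `paths.high_risk_modules` disables the whole security rule. All of these are assumptions made for the example.

```javascript
const RLS_POLICY = /CREATE POLICY|ALTER POLICY|DROP POLICY/
const CONVENTION_MIGRATION_DIRS = ['migrations/', 'db/migrate/', 'supabase/migrations/', 'prisma/migrations/']

// Hypothetical helper: true when `file` sits under `dir` (prefix match).
const underDir = (file, dir) => Boolean(dir) && file.startsWith(dir.endsWith('/') ? dir : dir + '/')

// Returns 'doc', 'security', 'migration', or null (not a Domain-Override domain).
function classifyDomain(file, content, paths) {
  const base = file.split('/').pop()
  if (file.endsWith('.md') && (underDir(file, paths.references_dir) || underDir(file, paths.prd_dir) ||
      file === 'CHANGELOG.md' || base === 'ssot-registry.md')) return 'doc'

  const migrationDirs = paths.migrations_dir ? [paths.migrations_dir] : CONVENTION_MIGRATION_DIRS
  const isMigrationSql = file.endsWith('.sql') && migrationDirs.some((d) => underDir(file, d))

  const highRisk = paths.high_risk_modules
  if (Array.isArray(highRisk)) {
    if (highRisk.some((entry) => file.startsWith(entry))) return 'security'
    // RLS policy mutations are classified security, not migration.
    if (isMigrationSql && RLS_POLICY.test(content || '')) return 'security'
  } else {
    // One-line diagnostic; no hardcoded default list.
    console.warn('security match rule: paths.high_risk_modules is not set, matching nothing')
  }
  return isMigrationSql ? 'migration' : null
}
```

Checking `security` before `migration` is what lets an RLS-policy migration reach `security-reviewer` while ordinary SQL migrations stay with `coder`.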
@@ -128,9 +128,9 @@ For EVERY card (no conditional skip — the gate ALWAYS runs; only its DEPTH var
128
128
 
129
129
  4. **Apply fix sub-loop** (mirror of Phase 3.5 retry pattern):
130
130
  - If 0 BLOCKER and 0 HIGH → log `verdict: PASS — proceeding to Phase 4` in tracker. Done. (MEDIUM/LOW findings are advisory at this per-card gate; they are not silently lost — the post-batch **Final-review FULL gate** applies every VERIFIED finding ≥ MEDIUM. Log the MEDIUM count in the tracker so it is visible.)
131
- - If 1+ BLOCKER OR 1+ HIGH → spawn `coder` agent with the report path + list of VERIFIED bugs. **At `full` profile** the report contains Codex-suggested inline patches: pass them and have the coder **apply the suggested patches** with the right system prompt (project conventions, naming, testing patterns) — it does NOT re-do the analysis or re-grep (since v3.28.3), BUT it MUST first confirm each patch still applies against the current file state (prior fix-loop iterations may have shifted line offsets); if a patch no longer applies cleanly, the coder re-locates the target by content and applies the equivalent edit rather than a stale-offset verbatim paste. **At `light` profile** (since v4.18.0) the findings come from **Codex** (the sole finder) — the report carries Codex's `minimal_fix_direction`; brief the coder to apply it (treat it like the `full`-profile Codex fix direction). **On the Codex-unavailable fallback** the `light` findings come from `code-reviewer` instead — brief the coder to apply the `code-reviewer` fix direction (no Codex patches to paste). After coder fixes, **re-write the lean contract `/tmp/codexreview-lean-<CARD-ID>.json` (it is consumed-once and deleted by `/codexreview`)** and re-invoke `/codexreview` via the Skill tool with `args: <CARD-ID>` (NOT a bare prose mention — the card ID MUST be passed so the retry reviews THIS card, not an inferred one). Repeat **max 2 times**.
131
+ - If 1+ BLOCKER OR 1+ HIGH → spawn the **domain writer** with the report path + list of VERIFIED bugs (canonical writer map v4.26.1 — see SKILL.md § "Domain-Override Domains"): **`security`-domain findings** (touching `paths.high_risk_modules` or RLS-policy SQL — the same `security` match rule) → **`security-reviewer`** in write mode (it owns the security-invariant contract a coder lacks; never route a security fix to `coder`); **all other findings** (`correctness`/code/perf/`other`) → **`coder`**. Run security-reviewer first, then coder (skip either if its partition is empty). **At `full` profile** the report contains Codex-suggested inline patches: pass them and have the coder **apply the suggested patches** with the right system prompt (project conventions, naming, testing patterns) — it does NOT re-do the analysis or re-grep (since v3.28.3), BUT it MUST first confirm each patch still applies against the current file state (prior fix-loop iterations may have shifted line offsets); if a patch no longer applies cleanly, the coder re-locates the target by content and applies the equivalent edit rather than a stale-offset verbatim paste. **At `light` profile** (since v4.18.0) the findings come from **Codex** (the sole finder) — the report carries Codex's `minimal_fix_direction`; brief the coder to apply it (treat it like the `full`-profile Codex fix direction). **On the Codex-unavailable fallback** the `light` findings come from `code-reviewer` instead — brief the coder to apply the `code-reviewer` fix direction (no Codex patches to paste). After coder fixes, **re-write the lean contract `/tmp/codexreview-lean-<CARD-ID>.json` (it is consumed-once and deleted by `/codexreview`)** and re-invoke `/codexreview` via the Skill tool with `args: <CARD-ID>` (NOT a bare prose mention — the card ID MUST be passed so the retry reviews THIS card, not an inferred one). Repeat **max 2 times**.
132
132
  - If still BLOCKER/HIGH after 2 retries → log in `## Issues & Flags` and **ask the user** whether to proceed, escalate, or stop. The Phase 4 commit MUST NOT happen until the Pre-Merge Codex Review verdict is PASS or user explicitly overrides.
133
- - **Telemetry** — for EVERY codex finding processed (verified BLOCKER, verified HIGH, or false-positive-filtered), append one row to `## Fix Application Log`: `3.7 | codex-<security|correctness|other> | est_lines=<n> | decision=<coder|skipped> | applied_by=<coder|-> | severity=<BLOCKER|HIGH|FALSE-POSITIVE> | retry=<n>`. Classify domain: `security` for findings touching RLS / auth / permissions / payments; `correctness` for logic / data integrity / race conditions; `other` for everything else.
133
+ - **Telemetry** — for EVERY codex finding processed (verified BLOCKER, verified HIGH, or false-positive-filtered), append one row to `## Fix Application Log`: `3.7 | codex-<security|correctness|other> | est_lines=<n> | decision=<security-reviewer|coder|skipped> | applied_by=<security-reviewer|coder|-> | severity=<BLOCKER|HIGH|FALSE-POSITIVE> | retry=<n>`. (`security`-domain fixes are applied by `security-reviewer`, all others by `coder`.) Classify domain: `security` for findings touching RLS / auth / permissions / payments; `correctness` for logic / data integrity / race conditions; `other` for everything else.
134
134
 
135
135
  5. **Update tracker**: phase = `3.7-codexgate DONE` (the gate runs unconditionally for every card — the legacy `3.7-highrisk` name implied it only fired on high-risk cards, which is no longer true), log final verdict, retry count, list of fixed findings, and the report path.
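The `## Fix Application Log` row format from the telemetry bullet above could be produced by a formatter along these lines. This is a sketch only. `fixLogRow` and its parameter names are hypothetical, and it assumes that a `FALSE-POSITIVE` finding is logged as `decision=skipped` with `applied_by=-`.

```javascript
// Hypothetical formatter for one Phase 3.7 telemetry row.
// domain: 'security' | 'correctness' | 'other'
// severity: 'BLOCKER' | 'HIGH' | 'FALSE-POSITIVE'
function fixLogRow({ domain, estLines, severity, retry }) {
  const skipped = severity === 'FALSE-POSITIVE' // assumption: filtered findings are skipped
  const writer = domain === 'security' ? 'security-reviewer' : 'coder'
  return [
    '3.7',
    `codex-${domain}`,
    `est_lines=${estLines}`,
    `decision=${skipped ? 'skipped' : writer}`,
    `applied_by=${skipped ? '-' : writer}`,
    `severity=${severity}`,
    `retry=${retry}`,
  ].join(' | ')
}

const exampleRow = fixLogRow({ domain: 'security', estLines: 3, severity: 'HIGH', retry: 0 })
```

Deriving the writer from the domain in one place keeps `decision` and `applied_by` consistent with the routing rule that `security` goes to `security-reviewer` and everything else to `coder`.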
136
136
 
@@ -220,9 +220,9 @@ that is a **gate violation**: log it as
220
220
  10. **Persist verified findings** to `/tmp/batch-final-review-<FIRST-CARD-ID>.md`.
221
221
  11. **Merge-blocking gate (mirrors the per-card Phase 3.7 gate this final pass backstops):** if any VERIFIED **BLOCKER or HIGH** finding exists, it MUST be resolved before Phase 6 merge. Apply fixes by **domain owner** (since v3.40.0 — same Domain-Override routing as the per-card phases), then re-verify; if a BLOCKER/HIGH cannot be resolved in a single apply + one retry, log it in `## Issues & Flags` and invoke `AskUserQuestion` (override with reason / escalate to a follow-up card / halt) — do NOT proceed to Phase 6 with an unresolved BLOCKER or HIGH. VERIFIED findings of severity MEDIUM are also applied (advisory below that). Partition the verified findings by the **Domain-Override match rules** ("Domain-Override Domains"):
222
222
  - **`doc`-domain findings** (file path matching the `doc` match rule — `*.md` under `${paths.references_dir}`/`${paths.prd_dir}`, `CHANGELOG.md`, `ssot-registry.md`) → invoke the **doc-reviewer** agent once in write mode to apply them. NEVER route doc fixes to coder.
223
- - **`security`-domain findings** (path in `paths.high_risk_modules`, or RLS-policy SQL) and **`migration`-domain findings** (SQL under the migrations dir) → route to **coder**, but apply the Sub-agent failure protocol's STOP-on-crash rule for these domains (never inline-fallback on a security/migration fix). These are NOT collapsed into a generic "everything else" bucket.
223
+ - **`security`-domain findings** (path in `paths.high_risk_modules`, or RLS-policy SQL) route to **security-reviewer** in write mode (canonical writer map v4.26.1 — it owns the security-invariant contract a coder lacks; NEVER route security fixes to coder). **`migration`-domain findings** (SQL under the migrations dir) → route to **coder**. For both, apply the Sub-agent failure protocol's STOP-on-crash rule (never inline-fallback on a security/migration fix). These are NOT collapsed into a generic "everything else" bucket.
224
224
  - **All remaining findings** (other code, perf, test) → invoke the **coder** agent once to apply them in a single pass.
225
- Run in the order doc-reviewer → coder (or skip either if its partition is empty). Pass only the verified findings, not false positives.
225
+ Run in the order doc-reviewer → security-reviewer → coder (skip any whose partition is empty). Pass only the verified findings, not false positives.
226
226
  12. Run final build: `npm run lint && npx tsc --noEmit && npm run build` (redirect each to `/tmp/final-<gate>.txt` per § "Context economy"; surface only exit code + bounded extract on failure).
227
227
  If any check fails, apply self-healing retry loop (up to 3 times).
228
228
  13. **Update tracker** with final review results:
@@ -51,8 +51,10 @@ so it surfaces in telemetry.
51
51
  ```
52
52
 
53
53
  The workflow runs Simplify + Codex (agent-launched, code-reviewer fallback) + qa-sentinel + security,
54
- FP-checks each specialist's own findings, then **one coder applies all VERIFIED
55
- code/perf/security/simplify findings in a single pass** and re-verifies lint/tsc/build. It returns
54
+ FP-checks each specialist's own findings, then the **domain writer applies its VERIFIED findings**
55
+ (canonical writer map v4.26.1: `security` → `security-reviewer`; `code`/`perf`/`migration`/`test`/
56
+ `simplify` → `coder`) — security-reviewer pass first, then the coder pass — and re-verifies
57
+ lint/tsc/build. It returns
56
58
  `{ codexEngine, perCard: { <CARD-ID>: { fixesApplied, residual } }, gateTable, summary }`.
57
59
  **Skip the inline Phase 2.55 + Phase 3.5 below AND the Phase 3.7 gate in `codex-gate.md`** (all three
58
60
  are now done), then handle the workflow output HERE in the skill. **Process each `residual` finding by
@@ -61,7 +63,9 @@ so it surfaces in telemetry.
61
63
  - `classification == NEEDS_MANUAL_CONFIRMATION` (any domain) → `AskUserQuestion` — the human gate the
62
64
  workflow cannot run. (`summary.needsManual` counts these, doc included.)
63
65
  - else `domain == doc` residual → carry into **Phase 3** (the doc-reviewer runs there, post-E2E, on final code).
64
- - else `code`/`perf`/`security`/`migration` residual (a fix the coder could not converge in its 2 retries)
66
+ - else `security` residual (a fix not converged in 2 retries) → spawn a targeted `security-reviewer`
67
+ now over this card's `editableFiles` (it owns the security-invariant contract — never a coder).
68
+ - else `code`/`perf`/`migration` residual (a fix the coder could not converge in its 2 retries)
65
69
  → spawn a targeted `coder` now over this card's `editableFiles`.
66
70
  - **QA gate (BLOCKING — mirror of inline Phase 3.5 step 24)**: if `gateTable` has any `status:"FAIL"`
67
71
  **OR** `summary.checksFailed` is true, the merge gate is NOT satisfied. Spawn a `coder` on the
@@ -107,7 +111,7 @@ After completeness is verified, clean up the implementation before it reaches re
107
111
  - **Efficiency agent** — flag unnecessary work (redundant computations, duplicate API calls, N+1), missed concurrency, hot-path bloat, recurring no-op updates without change-detection guards, TOCTOU existence checks, memory issues (unbounded structures, missing cleanup), overly broad operations.
108
112
 
109
113
  4. Aggregate findings from all three agents. For each finding:
110
- - **Valid AND in a Domain-Override domain** (the finding's target file matches the `doc`, `security`, or `migration` match rule in "Domain-Override Domains") → do NOT apply inline. Delegate to the domain owner: `doc` → `doc-reviewer` (write mode), `security`/`migration` → `coder`. Even a one-line efficiency fix in `paths.high_risk_modules` or a migration file goes to the owning agent — the orchestrator lacks that domain's invariant contract.
114
+ - **Valid AND in a Domain-Override domain** (the finding's target file matches the `doc`, `security`, or `migration` match rule in "Domain-Override Domains") → do NOT apply inline. Delegate to the domain **writer** (canonical writer map v4.26.1): `doc` → `doc-reviewer` (write mode), `security` → `security-reviewer` (write mode — it owns the security-invariant contract a coder lacks), `migration` → `coder`. Even a one-line efficiency fix in `paths.high_risk_modules` (security) or a migration file goes to the owning agent — the orchestrator lacks that domain's invariant contract.
111
115
  - **Valid AND not in a Domain-Override domain** → fix directly (apply edits inline).
112
116
  - **False positive / not worth addressing** → skip, BUT record it (see telemetry). If the skip rests on a "covered by X" / "redundant" / "not needed" rationalization (the same family the AC-Closure Gate guards against), do NOT discard silently — verify the rationale by reading `X`, and if it does not hold, treat the finding as valid.
113
117
 
@@ -279,9 +283,9 @@ skill's Phase 1 falls back to deriving Gherkin scenarios from
279
283
  per-card Phase 3.7 gate now skips that duplicate (lean mode), so THIS pass MUST carry it.
280
284
  A doc-drift→bug finding whose root cause is in CODE (not the doc) is the ONE thing
281
285
  doc-reviewer does NOT fix itself — report it with the conflicting code location + the doc
282
- it violates, and the orchestrator routes it to the `security`/code fix path as appropriate.
286
+ it violates, and the orchestrator routes it to the `security` (→ security-reviewer) / code (→ coder) fix path as appropriate.
283
287
  ```
284
- Doc-reviewer applies all doc-domain fixes itself. The orchestrator does NOT spawn a coder for doc fixes (since v3.40.0 — `doc` is owned by `doc-reviewer`, see "Domain-Override Domains"). The only doc-reviewer output that leaves this phase unfixed is a **doc-drift→bug finding rooted in CODE** (the implementation contradicts a documented contract). Route it explicitly: if the conflicting code file matches the `security` Domain-Override match rule (`paths.high_risk_modules`) → spawn `coder` with the finding now, in this phase (a security-class code fix is not deferrable to a `light` Phase 3.7); otherwise carry the finding into the Phase 3.7 `/codexreview` input as a known code-drift bug and let the Phase 3.7 fix sub-loop apply it. Either way, append a Fix Application Log row with `domain=codex-correctness` (NOT `doc`) so telemetry attributes it as a code fix. Do NOT leave it accumulating in the tracker with no fix owner.
288
+ Doc-reviewer applies all doc-domain fixes itself. The orchestrator does NOT spawn a coder for doc fixes (since v3.40.0 — `doc` is owned by `doc-reviewer`, see "Domain-Override Domains"). The only doc-reviewer output that leaves this phase unfixed is a **doc-drift→bug finding rooted in CODE** (the implementation contradicts a documented contract). Route it explicitly: if the conflicting code file matches the `security` Domain-Override match rule (`paths.high_risk_modules`) → spawn `security-reviewer` with the finding now, in this phase (a security-class code fix is not deferrable to a `light` Phase 3.7, and security is owned by `security-reviewer` — never a coder); otherwise carry the finding into the Phase 3.7 `/codexreview` input as a known code-drift bug and let the Phase 3.7 fix sub-loop apply it. Either way, append a Fix Application Log row with `domain=codex-correctness` (NOT `doc`) so telemetry attributes it as a code fix. Do NOT leave it accumulating in the tracker with no fix owner.
285
289
  14. **Knowledge-corpus sync (OPTIONAL — only if the project ships a corpus-sync agent)**: There is NO shipped `obsidian-sync` agent — do NOT dispatch one (a hard dispatch to a non-existent subagent fails silently). Only when the project provides its own knowledge-corpus sync agent (declared in `.baldart/overlays/new.md`) AND doc-reviewer's findings indicate a corpus impact, invoke that agent with the listed paths after the doc fixes are applied. Otherwise skip with a one-line notice (`knowledge-corpus sync: skipped (no corpus-sync agent configured)`). Non-blocking either way.
286
290
  15. **Telemetry** — after doc-reviewer returns, append one row per doc finding to `## Fix Application Log`: `3 | doc | est_lines=<n> | decision=doc-reviewer | applied_by=doc-reviewer | finding=<1-line>`. If 0 findings, append one row: `3 | doc | est_lines=0 | decision=skipped | applied_by=- | reason=no-findings`. **Phase-8 producer (named counter)** — ALSO record the per-card doc-gap counts as a structured line in `## Current Card` (carried into `## Completed Cards` at Phase 5): `doc_gaps: found=<N> fixed=<M>` where `N` = total doc findings doc-reviewer raised and `M` = those it applied. This is the single named producer for Phase 8's `doc_gaps_found` / `doc_gaps_fixed` fields — without it those fields have no upstream write and Phase 8 would hard-code zeros. (D.4a is the team-mode producer of the same counter — see Phase 7 § D.4a.)
287
291
  16. Run `npm run lint` and `npx tsc --noEmit` (when `stack.language` includes typescript) to verify nothing broke (redirect to disk per § "Context economy"). If doc-reviewer touched any source-adjacent file (a `.ts`/`.tsx` helper, a co-located doc export), also run `npm run build`. If any check fails, apply the self-healing retry loop (up to 3 times, no user prompt). **If still failing after 3 retries**: do NOT fall through silently to Phase 3.5 — log `[DOC-PHASE-REGRESSION]` in `## Issues & Flags` and invoke `AskUserQuestion` (revert the doc-phase edits that broke the build / keep and fix manually / stop the card).
@@ -184,13 +184,15 @@ After ALL agents in the group complete successfully:
184
184
  }})
185
185
  ```
186
186
  The workflow fans out the finders per card, runs ONE Codex pass + ONE qa-sentinel (group max tier)
187
- over the union, and **one coder applies all VERIFIED code/perf/security/simplify fixes for the
188
- whole group in a single pass** (files disjoint by ownership → no conflict, same as D.3). It returns
189
- `{ codexEngine, perCard, gateTable, summary }`. **Skip the inline D.2 (code portion), D.3, D.3b,
187
+ over the union, and the **domain writer applies all VERIFIED fixes for the whole group** (canonical
188
+ writer map v4.26.1: `security` → `security-reviewer`, then `code`/`perf`/`migration`/`test`/`simplify`
189
+ → `coder`; the two passes run sequentially over disjoint-by-ownership files → no conflict, same as D.3).
190
+ It returns `{ codexEngine, perCard, gateTable, summary }`. **Skip the inline D.2 (code portion), D.3, D.3b,
190
191
  D.4, D.4b** below. Then per card handle `perCard[<id>].residual` exactly as the sequential gate does
191
192
  (`references/review-cycle.md` § Phase 2.5x — **by classification first**: `NEEDS_MANUAL_CONFIRMATION`
192
- any-domain → `AskUserQuestion`; else doc residual → the post-E2E doc step; else unconverged
193
- code/perf/security residual → targeted `coder`). Apply the **same BLOCKING QA-gate consumption**:
193
+ any-domain → `AskUserQuestion`; else doc residual → the post-E2E doc step; else unconverged `security`
194
+ residual → targeted `security-reviewer`; else unconverged code/perf residual → targeted `coder`).
195
+ Apply the **same BLOCKING QA-gate consumption**:
194
196
  `gateTable` with any `status:"FAIL"` OR `summary.checksFailed` → coder fix (≤2 retries) then
195
197
  `AskUserQuestion`; **D.5 commit MUST NOT happen until `gateTable` is PASS/SKIP and `checksFailed` is
196
198
  false** (a delegated QA FAIL blocks exactly as inline D.4 / Phase 3.5 would — `gateTable` is
@@ -28,7 +28,11 @@ export const meta = {
  // gateTable, summary }
  // ───────────────────────────────────────────────────────────────────────────

- const a = args || {}
+ // Tolerate args delivered as a JSON string (parse-or-default) — the Workflow tool
+ // sometimes serializes a structured `args` object to a string; without this guard
+ // `a.cards` is undefined → empty `cards` → degenerate no-op return (cards:0, 0 agents).
+ let a = args || {}
+ if (typeof a === 'string') { try { a = JSON.parse(a) } catch (_) { a = {} } }
  const cards = (Array.isArray(a.cards) ? a.cards : []).filter((c) => c && c.cardId)
  const cfg = a.config || {}
  const highRisk = (cfg.paths && cfg.paths.high_risk_modules) || [] // security-domain hint
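
The parse-or-default guard added in this hunk can be read in isolation as the sketch below. `normalizeArgs` and `readCards` are hypothetical helper names used only for illustration; the package inlines the logic rather than exporting functions. One labeled assumption: the sketch also falls back to `{}` when the parsed value is not a plain object (for example the string `"null"`), which the hunk itself does not do.

```javascript
// Hypothetical helper (not part of baldart): mirrors the inline parse-or-default guard.
function normalizeArgs(args) {
  let a = args || {}
  if (typeof a === 'string') {
    // A serialized args object is parsed; malformed JSON degrades to an empty object.
    try { a = JSON.parse(a) } catch (_) { a = {} }
  }
  // Assumption added for this sketch only: valid JSON such as "null" or "42"
  // parses without error but is not an object, so treat it as empty too.
  if (a === null || typeof a !== 'object' || Array.isArray(a)) a = {}
  return a
}

// Same card filter the workflow applies right after the guard.
function readCards(args) {
  const a = normalizeArgs(args)
  return (Array.isArray(a.cards) ? a.cards : []).filter((c) => c && c.cardId)
}

console.log(readCards({ cards: [{ cardId: 'C-1' }] }).length)  // structured object
console.log(readCards('{"cards":[{"cardId":"C-1"}]}').length)  // same payload, serialized
console.log(readCards('not json').length)                       // malformed string
```

The first two calls see the same single card, while the malformed string yields an empty card list instead of throwing.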
@@ -298,59 +302,83 @@ const surviving = classified
  .map((f) => ({ ...f, card: attributeCard(f, fileToCard, cards) }))

  // ───────────────────────────────────────────────────────────────────────────
- // Phase Fix — ONE coder applies all VERIFIED code/perf/security/simplify findings.
- // doc findings → residual (the skill runs doc-reviewer post-E2E on final code).
- // NEEDS_MANUAL_CONFIRMATION → residual (human gate, owned by the skill).
+ // Phase Fix — the DOMAIN WRITER applies its verified findings (canonical writer
+ // map v4.26.1): security → security-reviewer (owns the security-invariant
+ // contract, never folded into the coder pass); code/perf/migration/test/simplify
+ // → coder. doc findings → residual (the skill runs doc-reviewer post-E2E on final
+ // code). NEEDS_MANUAL_CONFIRMATION → residual (human gate, owned by the skill).
  // ───────────────────────────────────────────────────────────────────────────
  phase('Fix')
  const isDoc = (f) => /doc|wiki|ssot|readme/.test(String(f.domain).toLowerCase())
+ // 'security' domain → security-reviewer. migration STAYS coder (canonical writer map: code/perf/
+ // migration/test → coder), so match the exact 'security' domain, not the broader verifier regex.
+ const isSecurity = (f) => String(f.domain).toLowerCase() === 'security'
  const isManual = (f) => f.classification === 'NEEDS_MANUAL_CONFIRMATION'
  // Partition `surviving` (= VERIFIED + NEEDS_MANUAL; FALSE_POSITIVE already dropped) with NO overlap:
- // actionable = VERIFIED non-doc → the coder fixes these.
+ // securityFix = VERIFIED security → security-reviewer applies (it owns the security invariants).
+ // actionable = VERIFIED non-doc non-security → the coder fixes these.
  // docResidual = VERIFIED doc → the skill runs doc-reviewer post-E2E on final code.
  // manualResidual= NEEDS_MANUAL any → human gate, owned by the skill (a doc-manual must NOT be
  // silently auto-re-reviewed: it carries its needs-manual classification out).
- const actionable = surviving.filter((f) => f.classification === 'VERIFIED' && !isDoc(f))
+ const securityFix = surviving.filter((f) => f.classification === 'VERIFIED' && !isDoc(f) && isSecurity(f))
+ const actionable = surviving.filter((f) => f.classification === 'VERIFIED' && !isDoc(f) && !isSecurity(f))
  const docResidual = surviving.filter((f) => f.classification === 'VERIFIED' && isDoc(f))
  const manualResidual = surviving.filter(isManual)

  const SKIP_CHECKS = { lint: 'SKIP', tsc: 'SKIP', build: 'SKIP' }
- let fixResult = { applied: [], unresolved: [], checks: { ...SKIP_CHECKS } }
- if (actionable.length && unionEditable.length) {
+
+ // One fix pass: the domain WRITER applies its verified findings to the worktree, then re-verifies.
+ // Passes run SEQUENTIALLY (security-reviewer before coder) so edits on shared files never conflict
+ // without having to partition the ownership map; the last pass to run carries the build it verified.
+ async function applyFixPass(findings, writer, label, role) {
+ if (!findings.length) return { applied: [], unresolved: [], checks: { ...SKIP_CHECKS }, ran: false }
+ if (!unionEditable.length) {
+ log(`Fix: ${findings.length} ${label} finding(s) but no editable files in scope — returned as residual (${writer} skipped).`)
+ return { applied: [], unresolved: findings.map((f) => f.finding_id), checks: { ...SKIP_CHECKS }, ran: false }
+ }
  const fixBrief =
- `Apply ALL of the verified review findings below to the worktree, then verify the build. You are the SINGLE fix pass for this wave.\n\n` +
+ `Apply ALL of the verified ${role} review findings below to the worktree, then verify the build. You are the ${writer} fix pass for this wave.\n\n` +
  `Worktree: ${a.worktreePath || '(cwd)'} — cd into it.\n` +
  `You MAY edit ONLY these files (ownership map — touching anything else is a violation):\n${unionEditable.join('\n')}\n\n` +
- `Findings to fix (grouped — fix the code, not the tests unless a test itself is wrong; do NOT expand scope beyond the finding):\n` +
- actionable.map((f) => `- [${f.finding_id}] (${f.card || '?'} / ${f.domain} / ${f.severity}) ${f.title}\n evidence: ${f.evidence}\n direction: ${f.minimal_fix_direction}`).join('\n') +
+ `Findings to fix (fix the code, not the tests unless a test itself is wrong; do NOT expand scope beyond the finding):\n` +
+ findings.map((f) => `- [${f.finding_id}] (${f.card || '?'} / ${f.domain} / ${f.severity}) ${f.title}\n evidence: ${f.evidence}\n direction: ${f.minimal_fix_direction}`).join('\n') +
  `\n\nAfter applying: run \`npm run lint\` and (when the project uses typescript) \`npx tsc --noEmit\` and \`npm run build\` in the worktree. If a check fails because of an edit you made, fix the regression — at most 2 retries — staying within the allowed files. ` +
  `Do NOT commit. Do NOT git stash (refs/stash is shared across worktrees). ` +
  `Return: applied (finding_ids you fixed), unresolved (finding_ids you could NOT fix within the allowed files / 2 retries), and checks (PASS/FAIL/SKIP for lint, tsc, build).`
- const r = await agent(fixBrief, { label: 'fix-coder', phase: 'Fix', agentType: 'coder', schema: FIX_SCHEMA })
- // Normalize: the coder may die (null) or return a truthy object missing fields.
- fixResult = (r && typeof r === 'object') ? r : { applied: [], unresolved: actionable.map((f) => f.finding_id), checks: { ...SKIP_CHECKS } }
- if (!Array.isArray(fixResult.applied)) fixResult.applied = []
- if (!Array.isArray(fixResult.unresolved)) fixResult.unresolved = []
- if (!fixResult.checks || typeof fixResult.checks !== 'object') fixResult.checks = { ...SKIP_CHECKS }
- log(`Fix: coder applied ${fixResult.applied.length}/${actionable.length} finding(s); checks lint=${fixResult.checks.lint} tsc=${fixResult.checks.tsc} build=${fixResult.checks.build}.`)
- } else if (actionable.length) {
- // Actionable findings exist but NO editable files are mapped → cannot fix; return all as residual
- // (no wasted coder spawn — the skill will route them to a targeted coder with a proper ownership scope).
- fixResult = { applied: [], unresolved: actionable.map((f) => f.finding_id), checks: { ...SKIP_CHECKS } }
- log(`Fix: ${actionable.length} actionable finding(s) but no editable files in scope — returned as residual (coder skipped).`)
- } else {
- log('Fix: no actionable code/perf/security/simplify findings — coder skipped.')
+ const r = await agent(fixBrief, { label, phase: 'Fix', agentType: writer, schema: FIX_SCHEMA })
+ // Normalize: the agent may die (null) or return a truthy object missing fields.
+ const res = (r && typeof r === 'object') ? r : { applied: [], unresolved: findings.map((f) => f.finding_id), checks: { ...SKIP_CHECKS } }
+ if (!Array.isArray(res.applied)) res.applied = []
+ if (!Array.isArray(res.unresolved)) res.unresolved = []
+ if (!res.checks || typeof res.checks !== 'object') res.checks = { ...SKIP_CHECKS }
+ res.ran = true
+ log(`Fix: ${writer} applied ${res.applied.length}/${findings.length} ${label} finding(s); checks lint=${res.checks.lint} tsc=${res.checks.tsc} build=${res.checks.build}.`)
+ return res
  }

+ // Security writer FIRST (owns the security-invariant contract), then the coder. Sequential → no
+ // edit conflict on shared files; the coder pass (when it runs) carries the authoritative build.
+ const secResult = await applyFixPass(securityFix, 'security-reviewer', 'fix-security', 'security')
+ const codeFixResult = await applyFixPass(actionable, 'coder', 'fix-coder', 'code/perf/simplify')
+ if (!securityFix.length && !actionable.length) log('Fix: no actionable code/perf/security/simplify findings — fixers skipped.')
+
+ // Merge the two passes. A FAIL in EITHER pass fails the wave; PASS only when a pass actually ran it.
+ const fixPasses = [secResult, codeFixResult]
+ const allActionable = [...securityFix, ...actionable]
+ const appliedIds = new Set(fixPasses.flatMap((p) => (p.applied || []).map((x) => x.finding_id)))
+ const unresolvedIds = new Set(fixPasses.flatMap((p) => p.unresolved || []))
+ const ranChecks = fixPasses.filter((p) => p.ran).map((p) => p.checks)
+ const mergedChecks = ['lint', 'tsc', 'build'].reduce((acc, k) => {
+ acc[k] = ranChecks.some((c) => c[k] === 'FAIL') ? 'FAIL' : (ranChecks.some((c) => c[k] === 'PASS') ? 'PASS' : 'SKIP')
+ return acc
+ }, {})
  // Unfixed actionable findings become residual (human/coder follow-up owned by the skill).
- const appliedIds = new Set((fixResult.applied || []).map((x) => x.finding_id))
- const unresolvedIds = new Set(fixResult.unresolved || [])
- const codeResidual = actionable.filter((f) => !appliedIds.has(f.finding_id) || unresolvedIds.has(f.finding_id))
- const checksFailed = ['lint', 'tsc', 'build'].some((k) => fixResult.checks && fixResult.checks[k] === 'FAIL')
+ const codeResidual = allActionable.filter((f) => !appliedIds.has(f.finding_id) || unresolvedIds.has(f.finding_id))
+ const checksFailed = ['lint', 'tsc', 'build'].some((k) => mergedChecks[k] === 'FAIL')

  // ---- Assemble per-card result ----------------------------------------------
  function bucket(cardId) { return perCard[cardId] || (perCard[cardId] = { fixesApplied: [], residual: [] }) }
- for (const f of actionable) {
+ for (const f of allActionable) {
  if (appliedIds.has(f.finding_id) && !unresolvedIds.has(f.finding_id)) {
  bucket(f.card || cards[0].cardId).fixesApplied.push(`[${f.finding_id}] ${f.title}`)
  }
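
The partition and check-merge rules in the Fix phase above can be exercised on their own with the sketch below. `partition` and `mergeChecks` are hypothetical names introduced for illustration (the package keeps this logic inline), and the sample findings and check results are made up. The predicates and the FAIL-over-PASS-over-SKIP precedence follow the hunk.

```javascript
// Illustrative sketch only; partition/mergeChecks are hypothetical helper names.
const isDoc = (f) => /doc|wiki|ssot|readme/.test(String(f.domain).toLowerCase())
// Exact match on 'security' so that 'migration' stays with the coder.
const isSecurity = (f) => String(f.domain).toLowerCase() === 'security'

// Split surviving findings into the four non-overlapping groups the Fix phase uses.
function partition(surviving) {
  const verified = surviving.filter((f) => f.classification === 'VERIFIED')
  return {
    securityFix: verified.filter((f) => !isDoc(f) && isSecurity(f)),  // security-reviewer
    actionable: verified.filter((f) => !isDoc(f) && !isSecurity(f)),  // coder
    docResidual: verified.filter(isDoc),                              // post-E2E doc step
    manualResidual: surviving.filter((f) => f.classification === 'NEEDS_MANUAL_CONFIRMATION'),
  }
}

// Merge per-pass checks: any FAIL wins, PASS needs a pass that actually ran, else SKIP.
function mergeChecks(passes) {
  const ran = passes.filter((p) => p.ran).map((p) => p.checks)
  const merged = {}
  for (const k of ['lint', 'tsc', 'build']) {
    if (ran.some((c) => c[k] === 'FAIL')) merged[k] = 'FAIL'
    else if (ran.some((c) => c[k] === 'PASS')) merged[k] = 'PASS'
    else merged[k] = 'SKIP'
  }
  return merged
}

// Made-up sample data.
const parts = partition([
  { finding_id: 'F1', domain: 'security', classification: 'VERIFIED' },
  { finding_id: 'F2', domain: 'migration', classification: 'VERIFIED' },
  { finding_id: 'F3', domain: 'doc', classification: 'VERIFIED' },
  { finding_id: 'F4', domain: 'perf', classification: 'NEEDS_MANUAL_CONFIRMATION' },
])
const merged = mergeChecks([
  { ran: true, checks: { lint: 'PASS', tsc: 'PASS', build: 'FAIL' } },
  { ran: false, checks: { lint: 'SKIP', tsc: 'SKIP', build: 'SKIP' } },
])
console.log(parts.securityFix.map((f) => f.finding_id), merged)
```

A pass that never ran contributes nothing to the merge, so a skipped coder pass cannot turn the security pass's FAIL into a PASS.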
@@ -24,7 +24,11 @@ export const meta = {
  // { codexEngine, findings:[…classified, FALSE_POSITIVE dropped], gateTable, summary }
  // ───────────────────────────────────────────────────────────────────────────

- const a = args || {}
+ // Tolerate args delivered as a JSON string (parse-or-default) — the Workflow tool
+ // sometimes serializes a structured `args` object to a string; without this guard
+ // `a.reviewScopeFiles`/`a.cardPaths` are undefined → empty scope → degenerate no-op return.
+ let a = args || {}
+ if (typeof a === 'string') { try { a = JSON.parse(a) } catch (_) { a = {} } }
  const scope = Array.isArray(a.reviewScopeFiles) ? a.reviewScopeFiles : []
  const cards = Array.isArray(a.cardPaths) ? a.cardPaths : []
  const cfg = a.config || {}
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "baldart",
- "version": "4.35.0",
+ "version": "4.36.0",
  "description": "Claude Agent Framework - Reusable framework for coordinating AI agents and humans in software projects",
  "bin": {
  "baldart": "./bin/baldart.js"