baldart 4.38.0 → 4.39.0
- package/CHANGELOG.md +12 -0
- package/VERSION +1 -1
- package/bin/baldart.js +10 -0
- package/framework/.claude/skills/new/references/merge-cleanup.md +15 -0
- package/framework/.claude/skills/new2/SKILL.md +11 -0
- package/framework/.claude/skills/prd/references/validation-phase.md +14 -0
- package/package.json +1 -1
- package/src/commands/reap-orphans.js +90 -0
package/CHANGELOG.md
CHANGED
@@ -5,6 +5,18 @@ All notable changes to BALDART will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [4.39.0] - 2026-06-15
+
+**`/new`, `/new2`, and `/prd` now auto-reap orphaned Codex MCP servers at their workspace-hygiene finalizers — the v4.37.0 doctor reaper, made automatic.** v4.37.0 added an on-demand reaper to `baldart doctor`, but the leak compounds *per skill run*: every batch's Codex finder calls (`/new`/`new2` per-card review + final review, `/prd` discovery-completeness + plan audit) drive `codex app-server`, whose detached broker spawns the `~/.codex/config.toml` MCP servers (Playwright, …) as children that orphan to init (ppid 1) when the broker dies and keep burning CPU. Waiting for a manual `baldart doctor` let them accumulate between runs. Now each batch ends by sweeping them. A new focused, non-interactive CLI command (`baldart reap-orphans`) is the SSOT the three finalizers call; it shares the v4.37.0 `codex-orphans.js` detection/reaping logic and the same hard safety invariant — it reaps ONLY orphaned MCP servers (ppid 1 ⇒ broker dead ⇒ stdio broken), and NEVER kills a live `codex app-server` broker (a shared, detached runtime that may still serve the user's interactive session). Because an MCP child of a still-warm broker is not yet orphaned, this is a cumulative orphan sweep (catches this run's debris once its broker dies, plus any prior runs'), not a per-run broker teardown. **MINOR** (new CLI command + skill-finalizer wiring; backwards-compatible — non-blocking hygiene step, no-op when nothing is orphaned, no install/layout change, no `baldart.config.yml` key ⇒ schema-propagation rule N/A).
+
+### Added
+
+- **`src/commands/reap-orphans.js`** + **`bin/baldart.js`** — new `baldart reap-orphans` command: detects orphaned MCP servers (ppid 1 + MCP signature) and reaps each process tree via syscall, then prints a one-line summary. `--dry-run` reports without killing; `--json` emits a machine-readable result (`schema:"baldart.reap-orphans/1"`). Always exits 0 (hygiene, never a blocker). Reuses `src/utils/codex-orphans.js` (the v4.37.0 SSOT); live `codex app-server` brokers are detected and reported but never killed.
+
+### Changed
+
+- **`framework/.claude/skills/new/references/merge-cleanup.md`** (Phase 6c, new step 5b), **`framework/.claude/skills/prd/references/validation-phase.md`** (Step 7.5, new non-blocking closer), **`framework/.claude/skills/new2/SKILL.md`** (Step 5, new item 6) — each workspace-hygiene finalizer now runs `npx baldart reap-orphans` as a NON-BLOCKING step and folds its summary into the phase log. `new2` runs it in the main context after the workflow returns (the workflow sandbox cannot run Bash). All three carry the explicit "reaps orphans only, never the broker" note so a future maintainer does not escalate it into a broker kill.
+
 ## [4.38.0] - 2026-06-15
 
 **`baldart doctor` now checks whether the external tools BALDART installs are out of date upstream, and offers a one-command upgrade.** BALDART installs external tools into consumer machines but **pins none of them** — `pipx install graphifyy`, `npm install -g typescript-language-server`, … all grab "latest" at install time, and neither pipx nor npm ever auto-upgrades. So a consumer who installed months ago is frozen on whatever version they got, and never receives upstream security/correctness fixes; the `add`/`update`/`configure` flows can't help because they only run at install time. Concrete trigger: Graphify shipped `0.8.37` (SSRF guard thread-safety + prompt-injection mitigation + a macOS NFC/NFD re-extraction loop fix), `0.8.38` (`calls` edge-direction + JS/TS default import/export + tsconfig `paths` correctness) and `0.8.39` (a `graphify affected` `KeyError` crash fix — a command BALDART agents actually invoke) — all invisible to a frozen install. This release adds the continuous currency check that was missing: the tool-dependency analogue of the `baldart` CLI's own `UpdateNotifier`. **MINOR** (new doctor diagnostic + self-heal action; backwards-compatible — network-gated and skipped under `--offline`, zero output when every tool is current, no install/layout change, no `baldart.config.yml` key ⇒ schema-propagation rule N/A).
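The 4.39.0 entry above states the safety invariant in prose: only processes with ppid 1 and an MCP signature are reaped, while `codex app-server` brokers are reported and left alone. The sketch below illustrates that classification rule on a flat process list. It is not the package's code: the real logic lives in `src/utils/codex-orphans.js`, which this diff does not show, and `classify`, `MCP_SIGNATURE`, and `BROKER_SIGNATURE` are hypothetical names with guessed patterns. Only the `{ mcp, runtime }` result shape mirrors what `reap-orphans.js` destructures.

```javascript
// Hypothetical sketch of the "orphans only, never the broker" rule.
// Both patterns are illustrative guesses, not the package's actual signatures.
const MCP_SIGNATURE = /\bmcp\b/i;
const BROKER_SIGNATURE = /codex\s+app-server/;

// procs: array of { pid, ppid, command } records, e.g. parsed from `ps` output.
function classify(procs) {
  const mcp = [];
  const runtime = [];
  for (const p of procs) {
    if (BROKER_SIGNATURE.test(p.command)) {
      // Brokers are detached by design and may also show ppid 1,
      // so they are only reported, whatever their parent is.
      runtime.push(p);
    } else if (p.ppid === 1 && MCP_SIGNATURE.test(p.command)) {
      // Reparented to init with an MCP signature: the broker is gone.
      mcp.push(p);
    }
    // An MCP server whose ppid is a live broker is deliberately skipped.
  }
  return { mcp, runtime };
}
```

Checking the broker pattern first is the design choice that matters here: a broker at ppid 1 can never fall through into the reap list, even if its command line happened to match the MCP pattern.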
package/VERSION
CHANGED
@@ -1 +1 @@
-4.
+4.39.0
package/bin/baldart.js
CHANGED
@@ -154,6 +154,16 @@ program
     await doctorCommand({ auto: !!options.auto, offline: !!options.offline });
   });
 
+program
+  .command('reap-orphans')
+  .description('Sweep orphaned MCP-server processes left by Codex calls (ppid 1 — their broker is dead). Used by the /new and /prd finalizers; safe to run anytime. Never touches a live codex app-server broker.')
+  .option('--json', 'Machine-readable output: emit a single JSON result object on stdout')
+  .option('--dry-run', 'Detect and report orphans without killing anything')
+  .action(async (options) => {
+    const reapCommand = require('../src/commands/reap-orphans');
+    await reapCommand({ json: !!options.json, dryRun: !!options.dryRun });
+  });
+
 const overlayGroup = program
   .command('overlay')
   .description('Author and check .baldart/overlays/ — scaffolds, validates, and detects drift on skill/agent/command overlays');
package/framework/.claude/skills/new/references/merge-cleanup.md
CHANGED
@@ -177,6 +177,20 @@ The most common failure mode is leaving cards IN_PROGRESS after merge. This crea
 - Question: `"Restore dello stash di Phase 0 ha generato conflitti. Lo stash è ancora presente (NON eliminato). Come procedo?"`
 - Options: `[Lascia lo stash + apri istruzioni per merge manuale]` / `[Mostrami il conflitto inline]` / `[Halt]`.
 
+5b. **Process hygiene — reap orphaned Codex MCP servers (NON-BLOCKING)**. This
+batch's per-card / final-review Codex calls drive `codex app-server`, whose
+broker spawns the MCP servers declared in `~/.codex/config.toml` (Playwright,
+…) as children; when a broker dies the OS reparents those MCP servers to init
+(ppid 1) and they keep burning CPU. Sweep them now so the batch ends clean:
+```bash
+npx baldart reap-orphans 2>/dev/null || true
+```
+This reaps ONLY orphaned MCP servers (ppid 1 ⇒ their broker is already dead ⇒
+stdio is broken ⇒ dead weight). It deliberately NEVER kills a live
+`codex app-server` broker (a shared, detached runtime that may still serve the
+user's interactive session). Never gate the close on this — any error or a
+"nothing to reap" result is fine; capture its one-line summary for the log.
+
 6. **Log and exit**:
 ```
 ## Phase 6c — Workspace Hygiene Post-merge
@@ -185,6 +199,7 @@ The most common failure mode is leaving cards IN_PROGRESS after merge. This crea
 Divergence (local…origin/$TRUNK): <0\t0 | resolved: pushed/cherry-picked/ff-pulled/rebased>
 Sync-deferred markers: <none | reconciled | user-retained>
 Phase 0 snapshot restore: <n/a | popped clean | conflict-deferred-to-user>
+Codex MCP hygiene: <reaped N/M | nothing to reap | skipped (error)>
 Completed: <timestamp>
 ```
 If any step ended in HALT, set `Status: HALT` and report — Phase 7 must NOT start with an unclean main repo unless the user explicitly chose `[Lascia così]`.
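The Phase 6c log template above gains a `Codex MCP hygiene` line with three possible values. As a rough illustration of how that value could be derived mechanically, the sketch below maps the `--json` result object onto those three strings. It assumes the finalizer were invoked with `--json` (the skill text as written captures the human-readable one-line summary instead), and `hygieneLogValue` is a hypothetical helper name. The field names `found`, `reaped`, and `error` come from the result object in `reap-orphans.js`.

```javascript
// Hypothetical helper: turn `baldart reap-orphans --json` stdout into the
// value for the "Codex MCP hygiene:" log line. Never throws, because the
// hygiene step must not block the finalizer.
function hygieneLogValue(stdout) {
  let result;
  try {
    result = JSON.parse(stdout);
  } catch {
    // Empty or malformed output (for example, npx failed before running).
    return 'skipped (error)';
  }
  if (!result || result.error) return 'skipped (error)';
  if (result.found === 0) return 'nothing to reap';
  // A --dry-run result would show up here as "reaped 0/N".
  return `reaped ${result.reaped}/${result.found}`;
}
```

The error check comes before the `found === 0` check because the command leaves `found` at 0 when the probe itself fails, and that case should not be logged as a clean "nothing to reap".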
package/framework/.claude/skills/new2/SKILL.md
CHANGED
@@ -265,3 +265,14 @@ returns when the batch is done. It returns:
 deferrals resolving too late — order the dependent card earlier), and `owner_gated_deduped` > 0
 means N defers were collapsed to one external action.
 Do NOT re-summarise the cards — the workflow already did.
+6. **Process hygiene — reap orphaned Codex MCP servers (NON-BLOCKING).** The batch's per-card Codex
+finder calls drive `codex app-server`, whose broker spawns the `~/.codex/config.toml` MCP servers
+(Playwright, …) as children; when a broker dies they leak to init (ppid 1) and keep burning CPU.
+Sweep them in the main context (the workflow sandbox cannot run Bash, so this MUST run here, after
+the workflow returns) so the run ends clean:
+```bash
+npx baldart reap-orphans 2>/dev/null || true
+```
+Reaps ONLY orphaned MCP servers (ppid 1 ⇒ broker dead); NEVER kills a live `codex app-server`
+broker. Non-blocking — any error / "nothing to reap" is fine; fold its one-line summary into the
+record (`codex_mcp_reaped`).
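The shell idiom in item 6 above (`2>/dev/null || true`) is what makes the step non-blocking. For a caller written in Node rather than Bash, a comparable wrapper might look like the sketch below. This is an assumption-laden illustration, not something the package ships: `reapNonBlocking` and `defaultRunner` are hypothetical names, the 30-second timeout is arbitrary, and Windows (`npx.cmd`) is not handled.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical default runner: roughly `npx baldart reap-orphans --json 2>/dev/null`.
function defaultRunner() {
  const out = spawnSync('npx', ['baldart', 'reap-orphans', '--json'], {
    encoding: 'utf8',
    stdio: ['ignore', 'pipe', 'ignore'], // drop stderr, like 2>/dev/null
    timeout: 30000, // illustrative value
  });
  if (out.error) throw out.error; // spawn failure or timeout
  return out.stdout;
}

// Equivalent of `|| true`: any failure becomes data, never an exception.
// The runner is injectable so the wrapper can be exercised without npx.
function reapNonBlocking(runner = defaultRunner) {
  try {
    const stdout = runner();
    return { ok: true, stdout: String(stdout || '').trim() };
  } catch (err) {
    return { ok: false, stdout: '', error: (err && err.message) || String(err) };
  }
}
```

A caller would record the returned `stdout` in its summary and carry on regardless of `ok`, which matches the "never gate the close on this" rule in the skill text.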
package/framework/.claude/skills/prd/references/validation-phase.md
CHANGED
@@ -247,6 +247,20 @@ markers it can emit, then act:
 empty, no `[SYNC-NEEDS-DECISION]` marker is left unhandled, and the merged remote
 branch is gone (or its deletion is explicitly user-deferred).
 
+**Process hygiene — reap orphaned Codex MCP servers (NON-BLOCKING).** This run's
+Codex calls (discovery-completeness check, plan audit) drive `codex app-server`,
+whose broker spawns the MCP servers from `~/.codex/config.toml` (Playwright, …)
+as children; when a broker dies the OS reparents them to init (ppid 1) and they
+keep burning CPU. Sweep them now so the run ends clean:
+```bash
+npx baldart reap-orphans 2>/dev/null || true
+```
+This reaps ONLY orphaned MCP servers (ppid 1 ⇒ broker dead ⇒ stdio broken ⇒ dead
+weight); it NEVER kills a live `codex app-server` broker (a shared, detached
+runtime that may still serve the user's interactive session). This is NOT part
+of the blocking gate — any error or "nothing to reap" is fine; include its
+one-line summary in the final summary's hygiene line.
+
 ### Step 7.6 — Obsidian back-reference (NON-BLOCKING — runs only when a spec note was given)
 
 **Why this exists.** When the user kicked off the PRD from an Obsidian note (state file
package/package.json
CHANGED
package/src/commands/reap-orphans.js
ADDED
@@ -0,0 +1,90 @@
+/**
+ * `baldart reap-orphans` — sweep orphaned MCP-server processes left by Codex
+ * calls (since v4.38.0).
+ *
+ * Non-interactive, focused companion to the `baldart doctor` reap action: it
+ * detects MCP servers that have been orphaned to init (ppid 1 — their parent
+ * `codex app-server` broker is dead) and kills each process tree. Designed to be
+ * called from the `/new` (Phase 6c) and `/prd` (Step 7.5) workspace-hygiene
+ * finalizers so each batch ends by clearing the MCP debris its Codex finder
+ * calls (and any prior runs) left behind.
+ *
+ * SCOPE & SAFETY (do not "improve" this into killing the broker):
+ *   - Reaps ONLY orphaned MCP servers (ppid 1 + MCP signature). A live, in-use
+ *     `codex app-server` broker is `detached + unref'd` BY DESIGN and also shows
+ *     ppid 1, so we never touch brokers — killing one could break the user's
+ *     concurrent interactive Codex session. Brokers are reported, never killed.
+ *   - Because MCP children of a *still-alive* broker have ppid = broker (not 1),
+ *     a run whose broker is still warm may leave its own MCP non-orphaned at
+ *     finalizer time; those get swept by the next run's finalizer (or `doctor`).
+ *     This command is a cumulative orphan sweep, not a per-run broker teardown.
+ *
+ * Always exits 0 — this is hygiene, never a blocker. The SSOT for detection /
+ * reaping logic is `src/utils/codex-orphans.js`; this command only frames it.
+ */
+
+const UI = require('../utils/ui');
+const CodexOrphans = require('../utils/codex-orphans');
+
+async function reapOrphans(opts = {}) {
+  const json = !!opts.json;
+  const dryRun = !!opts.dryRun;
+
+  const result = {
+    schema: 'baldart.reap-orphans/1',
+    found: 0,
+    reaped: 0,
+    failed: 0,
+    runtimeBrokers: 0,
+    dryRun,
+    orphans: [],
+    failures: [],
+  };
+
+  try {
+    const procs = CodexOrphans.listProcesses();
+    const { mcp, runtime } = CodexOrphans.detectOrphans(procs);
+    result.found = mcp.length;
+    result.runtimeBrokers = runtime.length;
+    result.orphans = mcp.map((p) => ({ pid: p.pid, etime: p.etime, command: p.command }));
+
+    if (!dryRun && mcp.length > 0) {
+      const { killed, failed } = CodexOrphans.reapOrphans(mcp, procs);
+      result.reaped = killed.length;
+      result.failed = failed.length;
+      result.failures = failed;
+    }
+  } catch (err) {
+    result.error = (err && err.message) || String(err);
+  }
+
+  if (json) {
+    process.stdout.write(JSON.stringify(result) + '\n');
+    return result;
+  }
+
+  // Human output — single concise summary line + optional detail.
+  if (result.error) {
+    UI.warning(`Codex MCP reap skipped (probe error: ${result.error}).`);
+  } else if (result.found === 0) {
+    UI.success('Codex MCP hygiene: no orphaned MCP servers — nothing to reap.');
+  } else if (dryRun) {
+    UI.warning(`Codex MCP hygiene: ${result.found} orphaned MCP server(s) found (dry-run, not killed):`);
+    result.orphans.slice(0, 8).forEach((o) =>
+      console.log(` • pid ${o.pid} (up ${o.etime}): ${o.command.slice(0, 70)}`));
+    if (result.orphans.length > 8) console.log(` • … and ${result.orphans.length - 8} more`);
+  } else {
+    UI.success(`Codex MCP hygiene: reaped ${result.reaped}/${result.found} orphaned MCP server(s) (incl. descendants).`);
+    if (result.failed > 0) {
+      UI.warning(`${result.failed} could not be killed:`);
+      result.failures.forEach((f) => console.log(` pid ${f.pid}: ${f.error}`));
+    }
+  }
+  if (result.runtimeBrokers > 0) {
+    UI.info(`(${result.runtimeBrokers} codex app-server broker(s) detected — left untouched by design.)`);
+  }
+
+  return result;
+}
+
+module.exports = reapOrphans;
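The command above passes both the orphan list and the full process list to `CodexOrphans.reapOrphans(mcp, procs)` and reports the result as "incl. descendants". The diff does not include `codex-orphans.js`, so how the tree is walked is unknown; the sketch below is only one plausible way to collect a process tree from a flat `{ pid, ppid }` list. `collectTree` is a hypothetical name, and the children-before-parents ordering is an assumption about what a reaper would want, not a documented behaviour.

```javascript
// Hypothetical sketch: gather rootPid and all of its descendants from a flat
// process list, ordered so that every child comes before its parent.
function collectTree(rootPid, procs) {
  // Index children by parent pid for O(1) lookups during traversal.
  const childrenOf = new Map();
  for (const p of procs) {
    if (!childrenOf.has(p.ppid)) childrenOf.set(p.ppid, []);
    childrenOf.get(p.ppid).push(p.pid);
  }

  // Breadth-first walk from the root; `seen` guards against malformed input.
  const order = [];
  const seen = new Set([rootPid]);
  const queue = [rootPid];
  while (queue.length > 0) {
    const pid = queue.shift();
    order.push(pid);
    for (const child of childrenOf.get(pid) || []) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push(child);
      }
    }
  }

  // BFS visits parents first, so reversing yields children first, root last.
  return order.reverse();
}
```

Signalling children before their parent is one way to avoid creating fresh orphans partway through the sweep. The sketch stops at computing the list and leaves the actual signalling out.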