@tekyzinc/gsd-t 4.0.29 → 4.2.10

package/CHANGELOG.md CHANGED
@@ -2,6 +2,43 @@
 
  All notable changes to GSD-T are documented here. Updated with each release.
 
+ ## [4.2.10] - 2026-06-05 (M83 Left-Shifted Plan Hardening - minor)
+
+ ### Added - Plan-phase hardening: catch dead deliverables and edge cases BEFORE execute
+
+ Left-shifts failure detection from verify to plan. Adversarial validation (the Red Team) ran only at verify — after code exists — so a milestone whose headline capability shipped as DEAD CODE (the NiceNote M5 incident: a 100MB+ chunked reader built but never wired into `openPath`, with no test exercising it) burned **four verify cycles** re-litigating the milestone's reason to exist. The root cause was in the plan: it never bound each acceptance criterion to a code path + a killing test, and nothing adversarial reviewed the design before code was written. The `plan` phase now runs two blocking gates before execute.
+
+ - **Acceptance-traceability gate** (deterministic) — `bin/gsd-t-traceability-gate.cjs`, dispatched as `gsd-t traceability-gate`. Parses `.gsd-t/domains/*/tasks.md`; every behavioral task (one declaring acceptance criteria) must bind its ACs to a `**Files**` code path AND a named killing test; a `**Headline:** true` task must have BOTH a real implementation path and a test. Exit 4 blocks execute. Field detection is emphasis-stripped + colon-position-agnostic (`**Label**:` ≡ `**Label:**`); task blocks are detected by any non-structural heading bearing an AC (descriptive headings are not dropped); the test check is tied to the Test/Files/AC fields only (an incidental runner word in a Dependencies note does not clear it); pytest `test_*.py` / `*_test.py` conventions are preserved.
+ - **Adversarial pre-mortem** (generative) — `templates/prompts/pre-mortem-subagent.md`, an opus, fresh-context, assume-the-plan-is-flawed reviewer wired into the plan workflow. Predicts edge-case / dead-deliverable / NFR / shallow-test failures and converts each blocking finding into a **required test** the plan must adopt (advisory notes forbidden — that is how M5's chunk reader shipped three data-loss bugs across three cycles). Verdict `BLOCK` / `CLEARED`.
+ - The two gates are the temporal dual of the Red Team: attack the design at plan, not just the code at verify. The deterministic gate runs first and fails CLOSED (an unevaluable gate blocks); the pre-mortem cannot approve a gate-blocked plan.
+ - New CLI `gsd-t traceability-gate [--milestone Mxx] [--tasks FILE]` (exit 0/4/64), added to project + global bin tools. Contract `.gsd-t/contracts/plan-hardening-contract.md` v1.0.0 STABLE. `gsd-t-plan.md` + the phase-workflow plan objective updated to require traceable tasks up front.
+ - **Verification**: orthogonal triad ran. Adversarial Workflow Red Team (Opus, fresh context) FAILed first pass (1 CRITICAL — colon-inside-bold markdown defeated all field detection, silently passing the literal M5 dead-code plan — + 2 HIGH + 2 MEDIUM), all fixed; re-validation found a regression the CRITICAL fix introduced (underscore-stripping broke pytest paths, HIGH), fixed; final re-validation GRUDGING-PASS (14/14 checks, no new HIGH/CRITICAL). Real-sandbox acceptance gate passed (gate fires through the Workflow sandbox and blocks the bad plan). Suite 1372/0/4 (+15 M83 tests). Self-tested against the actual NiceNote M5 dead-code plan — the gate FAILs it at plan time, which is the milestone's reason to exist.
+ - Origin: review of the NiceNote 9-milestone build, where the triad caught real bugs at verify but late; the user's proposal for an adversarial risk-assessment agent at plan.
+
+ ### Versioning
+
+ Minor bump 4.1.10 → 4.2.10 (new feature, additive; patch reset to 10).
+
+ ## [4.1.10] - 2026-06-05 (M82 Competition Mode - minor)
+
+ ### Added - Competition Mode: generate-and-judge for upstream, pre-contract phases
+
+ The *generative* dual of the orthogonal validation triad. The triad is adversarial (many critics, one candidate → a filter); Competition Mode is generative (many candidates, one judge → a generator). GSD-T historically filtered hard but **generated singly** — every upstream artifact was a single draft. Competition Mode adds the missing generator on the phases where it pays. **Watershed rule:** generate-and-judge ABOVE the contract; attack-and-filter BELOW it.
+
+ - **Opt-in `--competition N`** (N clamped 2–5; default off) on eligible upstream phases: `partition`, `milestone`, `discuss`, `design-decompose`. Ignored (single producer, logged) on ineligible phases (plan/impact/prd/doc-ripple) and impossible on post-contract phases (execute/verify/…).
+ - **Producers = Self-MoA** — N samples of ONE strong model (opus), diversified by prompt *angle* (max-parallelism / simplicity / risk-isolation / dependency-depth / balance), not by a model zoo. Evidence (Self-MoA, arXiv 2502.00674): aggregation is far more sensitive to candidate quality than diversity; mixing models injects low-quality candidates. No debate — producers stay independent.
+ - **Objective judge for partition (the v1 beachhead)** — `bin/gsd-t-competition-judge.cjs --kind partition` scores candidate decompositions via the SAME file-disjointness oracle the dispatcher uses (`bin/gsd-t-file-disjointness.cjs`): parallelGroups / waveDepth / validity. A calculator, not an LLM critic → immune to position/verbosity/self-preference bias. Touch paths normalized (`./a` ≡ `a`, `//`, backslashes, trailing slash, dedupe; case preserved).
+ - **Subjective judge for milestone/discuss/design** — blind + deterministically-shuffled + different-model (sonnet) + rubric-scored; the winner is finalized deterministically by `--kind generic` (highest weighted score; reproducible tiebreak; zero inference in the substrate).
+ - **Two-gate selection policy** (synthesize only when candidate-quality-uniform AND artifact-is-list-shaped; else pick-one) + three artifact classes (coupled-thesis → pick-one; line-items → union/dedup; structurally-validated → synthesize+re-validate). The finalizer does pick-one-at-thesis + union-at-line-item-level, then partition re-validates the graft via the oracle and BLOCKS on a reintroduced overlap.
+ - **New CLI**: `gsd-t competition-judge [--in SPEC.json] [--project-dir P]` (exit 0 winner / 4 no valid candidate / 64 bad input). Added to project + global bin tools.
+ - **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE (6 invariants).
+ - **Verification**: orthogonal triad ran. Adversarial Workflow Red Team (Opus, fresh context) FAILed first pass (3 HIGH + 2 MEDIUM), all fixed, re-validation Red Team GRUDGING-PASS (all 5 fixed, no new HIGH/CRITICAL). Real-sandbox acceptance gate passed (judge integration ran end-to-end in the Workflow sandbox). Suite 1357/0/4 (+6 M82 tests). **SC#1 measured on M82's own partition: competition (3 producers) → 3 parallel groups vs N=1 baseline's 1 (3× parallelism), invalid overlap candidate correctly disqualified.** SC#3 position-bias probe: order-invariant winner (100%).
+ - Origin: brainstorm 2026-06-05 grounded in 2 deep-research runs (best-of-N/judge/debate + synthesis-vs-pick-one/MoA/Frankenstein).
+
+ ### Versioning
+
+ Minor bump 4.0.29 → 4.1.10 (new feature, additive; patch reset to 10).
+
  ## [4.0.29] - 2026-06-05 (M81 Workflows Runtime-Native - patch)
 
  ### Fixed - TD-113: 6 of 7 workflows (+ quick) crashed in the Workflow sandbox and had never run
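The acceptance-traceability rule in the 4.2.10 entry (a task that declares acceptance criteria must also name a code path and a killing test) can be illustrated with a small check. This is a minimal sketch, not the shipped `bin/gsd-t-traceability-gate.cjs`: the one-heading-per-task layout and the `**Acceptance**` and `**Test**` field labels are assumptions made for the example (the changelog itself only names `**Files**` and `**Headline:**`), and `readField` / `checkTasks` are hypothetical helpers. It mirrors two behaviors the entry describes: the colon may sit inside or outside the bold markers, and a failing plan maps to exit code 4.

```javascript
// Hypothetical sketch of an acceptance-traceability check.
// ASSUMPTIONS: each task is a markdown heading block; fields are written as
// "**Label**: value" or "**Label:** value"; "Acceptance" and "Test" are
// illustrative labels, not confirmed names from the package.

// Read one field's value, accepting the colon inside or outside the bold.
function readField(block, label) {
  const re = new RegExp("\\*\\*" + label + ":?\\*\\*:?[ \\t]*(.*)", "i");
  const m = block.match(re);
  return m ? m[1].trim() : "";
}

// Return exit code 4 if any task with acceptance criteria lacks Files or Test.
function checkTasks(markdown) {
  const blocks = markdown
    .split(/^(?=#{2,6}\s)/m)              // split before each heading line
    .filter((b) => /^#{2,6}\s/.test(b));  // drop any preamble before the first heading
  const failures = [];
  for (const block of blocks) {
    const title = block.split("\n")[0].replace(/^#+\s*/, "").trim();
    if (!readField(block, "Acceptance")) continue; // not a behavioral task in this sketch
    if (!readField(block, "Files")) failures.push({ title, missing: "Files" });
    if (!readField(block, "Test")) failures.push({ title, missing: "Test" });
  }
  return { exitCode: failures.length ? 4 : 0, failures };
}

const tracedPlan =
  "### T1\n**Acceptance**: opens a 100MB file\n**Files**: src/open.js\n**Test:** test/open.test.js\n";
const deadPlan =
  "### T2\n**Acceptance:** chunked read\n**Files**: src/chunk.js\n";

console.log(checkTasks(tracedPlan).exitCode); // 0
console.log(checkTasks(deadPlan));            // exitCode 4, T2 is missing "Test"
```

The second plan fails because its acceptance criterion is bound to a file but to no test, which is the dead-deliverable shape the entry says the gate exists to catch.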
package/README.md CHANGED
@@ -122,8 +122,14 @@ gsd-t build-coverage --json # M57: new top-level pat
  gsd-t ci-parity --json # M57: reproduce the project's actual CI build locally (auto docker build)
  gsd-t test-data --list [--run ID] [--json] # M58: list test-data ledger entries
  gsd-t test-data --purge --run ID [--dry-run] [--json] # M58: purge tagged test data after Verify (Step 4.5)
+ gsd-t competition-judge --in SPEC.json [--project-dir P] # M82: generate-and-judge selection oracle (partition / generic)
+ gsd-t traceability-gate --milestone Mxx [--project-dir P] # M83: plan-phase acceptance-traceability gate (AC → path → killing test)
  ```
 
+ **Plan Hardening (M83).** The `plan` phase now runs two blocking gates before execute, so a plan can't ship a dead deliverable: a deterministic **acceptance-traceability gate** (`gsd-t traceability-gate` — every AC must bind to a code path + a killing test; the headline capability needs both impl and test) and an adversarial **pre-mortem** agent (opus, fresh-context, predicts edge-case/NFR/dead-deliverable failures and requires a test for each). The temporal dual of the Red Team — attack the design at plan, not just the code at verify. Origin: a build where the headline capability shipped as dead code and burned 4 verify cycles. See `.gsd-t/contracts/plan-hardening-contract.md`.
+
+ **Competition Mode (M82).** Opt-in `--competition N` (N 2–5) on upstream, pre-contract phases (`/gsd-t-partition`, `/gsd-t-milestone`, `/gsd-t-design-decompose`) fans out N parallel candidate producers and a judge selects the winner — the generative dual of the orthogonal validation triad. Partition uses an *objective* file-disjointness oracle as the judge (a calculator, not a biased critic); subjective phases use a blind + different-model + rubric judge. Default off. See `.gsd-t/contracts/competition-mode-contract.md`.
+
  `gsd-t parallel` consumes the M44 task-graph (D1) and applies three pre-spawn gates (D4 depgraph validation → D5 file-disjointness → D6 economics) followed by mode-aware headroom/split math. Extends — does not replace — the M40 orchestrator. Contract: `.gsd-t/contracts/wave-join-contract.md` v1.1.0.
 
  Each iteration runs as a fresh `claude -p` session. A cumulative debug ledger (`.gsd-t/debug-state.jsonl`) preserves hypothesis/fix/learning history across sessions. An anti-repetition preamble prevents retrying failed approaches.
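The README's "objective file-disjointness oracle" comes down to one question per candidate partition: does any file appear in the write list of more than one domain? The sketch below illustrates that question only. `findOverlaps` is a hypothetical helper, not a function of the package, and the file names are made up. The `{ kind, candidates: [{ id, domains: [{ name, touches }] }] }` shape follows the partition input documented in the judge's header comment in this diff.

```javascript
// Hypothetical helper (not part of the package): list files that two or more
// domains of one candidate partition intend to write.
function findOverlaps(domains) {
  const owners = new Map(); // file path -> names of the domains that write it
  for (const d of domains) {
    for (const file of new Set(d.touches || [])) {
      if (!owners.has(file)) owners.set(file, []);
      owners.get(file).push(d.name);
    }
  }
  return [...owners]
    .filter(([, names]) => names.length > 1)
    .map(([file, names]) => ({ file, names }));
}

// A two-candidate spec in the documented "partition" shape (illustrative data).
const specDemo = {
  kind: "partition",
  candidates: [
    { id: "A", domains: [
      { name: "api", touches: ["src/api.js"] },
      { name: "ui", touches: ["src/ui.js"] },
    ] },
    { id: "B", domains: [
      { name: "api", touches: ["src/api.js"] },
      { name: "ui", touches: ["src/ui.js", "src/api.js"] },
    ] },
  ],
};

for (const c of specDemo.candidates) {
  console.log(c.id, JSON.stringify(findOverlaps(c.domains)));
}
```

Candidate A has no shared files, while candidate B has both domains writing `src/api.js`, so a disjointness-based judge would be expected to disqualify B. Serializing `specDemo` with `JSON.stringify` gives a file that could be passed as `SPEC.json` to the `gsd-t competition-judge --in SPEC.json` command listed above.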
@@ -0,0 +1,344 @@
+ "use strict";
+
+ /**
+  * gsd-t-competition-judge (M82 D1)
+  *
+  * The selection oracle for Competition Mode (generate-and-judge on upstream,
+  * pre-contract phases). Given N candidate artifacts produced by parallel
+  * producers, score them and emit a winner. This is the GENERATIVE dual of the
+  * orthogonal validation triad (which is adversarial: many critics, one
+  * candidate). Contract: .gsd-t/contracts/competition-mode-contract.md v1.0.0.
+  *
+  * Two judge modes, chosen by the spec's `kind` field (parseArgs below reads
+  * no `--kind` flag):
+  *
+  *   kind "partition" → OBJECTIVE judge (the v1 beachhead). Each candidate is a
+  *     proposed domain decomposition: a list of domains, each with a
+  *     write-touch list. We score it with the SAME disjointness oracle the real
+  *     parallel dispatcher uses (bin/gsd-t-file-disjointness.cjs), so the judge
+  *     is a CALCULATOR, not a critic, and sidesteps every LLM-judge bias
+  *     (position / verbosity / self-preference). Metrics, higher-is-better
+  *     unless noted:
+  *       - valid           : zero write-target overlaps across domains (HARD gate)
+  *       - parallelGroups  : count of disjoint domains that can fan out at once
+  *       - waveDepth       : serial gates (sequential groups + 1 if any); LOWER better
+  *       - unprovableCount : domains with no touch list; LOWER better (safe-default seq)
+  *     Ranking: invalid candidates are disqualified; among valid ones, rank by
+  *     (parallelGroups desc, waveDepth asc, unprovableCount asc, domainCount asc).
+  *
+  *   kind "generic" → records a SUBJECTIVE judge's verdict. The numeric scoring
+  *     lives in the rubric the Workflow's judge agent fills in (blind+shuffled,
+  *     different-model, rubric-scored; see the contract). This CLI only
+  *     validates/normalizes the rubric scores the agent supplies and picks the
+  *     winner deterministically (highest weighted score; ties → lowest index of
+  *     the ORIGINAL, pre-shuffle order to keep selection reproducible). It does
+  *     NOT call an LLM, keeping inference out of the deterministic substrate
+  *     (per feedback_deterministic_orchestration + anthropic-key-measurement-only).
+  *
+  * Input: a JSON spec on stdin OR via --in <path>. Shapes:
+  *
+  *   partition: {
+  *     "kind": "partition",
+  *     "candidates": [
+  *       { "id": "A", "domains": [ { "name": "d1", "touches": ["a.js","b.js"] }, ... ] },
+  *       ...
+  *     ]
+  *   }
+  *
+  *   generic: {
+  *     "kind": "generic",
+  *     "axes": [ { "key": "coherence", "weight": 1 }, { "key": "completeness", "weight": 1 }, ... ],
+  *     "candidates": [
+  *       { "id": "A", "scores": { "coherence": 4, "completeness": 3, ... } },
+  *       ...
+  *     ]
+  *   }
+  *
+  * Output (JSON envelope, the shape runCli parses):
+  *   {
+  *     ok: boolean, // true only when a winner was selected (false on exit 4 and 64)
+  *     exitCode: 0 | 4 | 64,
+  *     kind, n,
+  *     winner: <candidateId|null>,
+  *     ranked: [ { id, valid?, parallelGroups?, waveDepth?, unprovableCount?, score?, rank } ],
+  *     reason?: string
+  *   }
+  *
+  * Exit codes: 0 winner selected · 4 usable input but NO valid candidate
+  * (all disqualified; ok is false) · 64 bad input.
+  *
+  * Hard rules (mirrors the disjointness prover's discipline):
+  *   - Zero external runtime deps (Node built-ins only).
+  *   - Never throws; always emits an envelope.
+  *   - Pure / read-only: no project mutation. Deterministic given the same input.
+  */
+
+ const fs = require("node:fs");
+
+ // The objective partition judge reuses the production disjointness oracle so the
+ // judge's notion of "parallelizable" is byte-identical to the dispatcher's.
+ let proveDisjointness;
+ try {
+   ({ proveDisjointness } = require("./gsd-t-file-disjointness.cjs"));
+ } catch {
+   proveDisjointness = null;
+ }
+
+ // ─── Partition scoring (objective) ───────────────────────────────────────
+
+ // Normalize a touch path to a stable file identity so two spellings of the SAME
+ // file (./bin/x.js vs bin/x.js, trailing slash, backslashes, redundant ./ or //)
+ // are detected as a conflict. Without this, an overlapping partition could be
+ // scored `valid` and WIN, and then the real dispatcher would hit a write conflict.
+ // Note: case is preserved (most CI runs on case-sensitive Linux); collapsing case
+ // here would create false conflicts on case-sensitive repos. Path identity only.
+ function _normPath(p) {
+   if (typeof p !== "string") return "";
+   let s = p.trim().replace(/\\/g, "/"); // backslashes -> forward
+   s = s.replace(/\/+/g, "/"); // collapse repeated slashes
+   s = s.replace(/^\.\//, ""); // drop leading ./
+   while (s.includes("/./")) s = s.replace("/./", "/"); // drop interior /./
+   s = s.replace(/\/+$/, ""); // drop trailing slash
+   return s;
+ }
+
+ /**
+  * Score one candidate partition by running its domains through the disjointness
+  * oracle. Each domain becomes a pseudo-task {id, domain, touches}; we never hit
+  * git history (every domain carries an explicit touch list or is counted
+  * unprovable), so scoring is pure and deterministic.
+  *
+  * @returns {{valid, domainCount, parallelGroups, sequentialGroups, unprovableCount, waveDepth}}
+  */
+ function scorePartition(candidate, projectDir) {
+   const domains = Array.isArray(candidate.domains) ? candidate.domains : [];
+   const tasks = domains.map((d, i) => ({
+     id: `${candidate.id}:${d.name || `d${i}`}`,
+     domain: d.name || `d${i}`,
+     // Only honor an explicit touch list; never let the oracle fall through to
+     // git history during scoring (would make the judge non-deterministic).
+     // Normalize + de-dupe so path-spelling variants are caught as real conflicts.
+     touches: Array.isArray(d.touches)
+       ? Array.from(new Set(d.touches.map(_normPath).filter(Boolean)))
+       : [],
+   }));
+
+   // Run the real oracle when available; otherwise fall back to a self-contained
+   // overlap check so the judge still works if the lib isn't co-located.
+   const res = proveDisjointness
+     ? proveDisjointness({ tasks, projectDir })
+     : _localDisjoint(tasks);
+
+   const parallelGroups = (res.parallel || []).length;
+   const sequentialGroups = (res.sequential || []).filter(
+     (g) => !(g.length === 1 && (res.unprovable || []).includes(g[0])),
+   ).length;
+   const unprovableCount = (res.unprovable || []).length;
+
+   // VALID = no two domains with declared touch lists write the same file. An
+   // overlap shows up as a sequential group of size ≥2 among provable tasks.
+   const overlapGroup = (res.sequential || []).some((g) => g.length >= 2);
+   const valid = !overlapGroup;
+
+   // waveDepth: 1 wave for the disjoint fan-out, +1 if there is ANY serial
+   // bottleneck (overlapping/unprovable domains that must run after), so the
+   // value is always 1 or 2. Fewer = better.
+   const serialBottlenecks = sequentialGroups + unprovableCount;
+   const waveDepth = (parallelGroups > 0 ? 1 : 0) + (serialBottlenecks > 0 ? 1 : 0) || 1;
+
+   return {
+     valid,
+     domainCount: domains.length,
+     parallelGroups,
+     sequentialGroups,
+     unprovableCount,
+     waveDepth,
+   };
+ }
+
+ // Self-contained overlap fallback (only used if the oracle lib is absent).
+ function _localDisjoint(tasks) {
+   const parallel = [];
+   const sequential = [];
+   const unprovable = [];
+   const provable = [];
+   for (const t of tasks) {
+     if (!t.touches || t.touches.length === 0) {
+       unprovable.push(t);
+       sequential.push([t]);
+     } else {
+       provable.push(t);
+     }
+   }
+   // union-find over file overlap
+   const parent = provable.map((_, i) => i);
+   const find = (i) => {
+     while (parent[i] !== i) { parent[i] = parent[parent[i]]; i = parent[i]; }
+     return i;
+   };
+   for (let i = 0; i < provable.length; i++) {
+     for (let j = i + 1; j < provable.length; j++) {
+       const a = new Set(provable[i].touches);
+       if (provable[j].touches.some((f) => a.has(f))) {
+         const ra = find(i), rb = find(j);
+         if (ra !== rb) parent[ra] = rb;
+       }
+     }
+   }
+   const groups = new Map();
+   for (let i = 0; i < provable.length; i++) {
+     const r = find(i);
+     if (!groups.has(r)) groups.set(r, []);
+     groups.get(r).push(provable[i]);
+   }
+   for (const g of groups.values()) (g.length === 1 ? parallel : sequential).push(g);
+   return { parallel, sequential, unprovable };
+ }
+
+ // Drop candidates that are not usable objects with a string id (Red Team MED-4:
+ // the 'never throws' guarantee is on the function, not just the CLI shell. An
+ // in-process caller passing [null] or {id:{}} must not crash, and a non-string id
+ // could never match `c.id === winnerId` in the workflow anyway).
+ function _safeCandidates(candidates) {
+   return (Array.isArray(candidates) ? candidates : []).filter(
+     (c) => c && typeof c === "object" && typeof c.id === "string" && c.id.length > 0,
+   );
+ }
+
+ function rankPartitions(rawCandidates, projectDir) {
+   const candidates = _safeCandidates(rawCandidates);
+   const scored = candidates.map((c) => ({ id: c.id, ...scorePartition(c, projectDir) }));
+   // Disqualify invalid (file-overlap) candidates from winning, but keep them in
+   // the ranking so the caller can see why they lost.
+   const valid = scored.filter((s) => s.valid);
+   const cmp = (a, b) =>
+     b.parallelGroups - a.parallelGroups || // more concurrency wins
+     a.waveDepth - b.waveDepth || // fewer serial gates wins
+     a.unprovableCount - b.unprovableCount || // fewer unknowns wins
+     a.domainCount - b.domainCount; // simpler (fewer domains) wins
+   valid.sort(cmp);
+   const invalid = scored.filter((s) => !s.valid);
+   const ordered = [...valid, ...invalid];
+   ordered.forEach((s, i) => { s.rank = i + 1; });
+   return { ranked: ordered, winner: valid.length ? valid[0].id : null };
+ }
+
+ // ─── Generic scoring (subjective rubric, deterministic selection) ────────
+
+ function rankGeneric(spec) {
+   const axes = Array.isArray(spec.axes) && spec.axes.length
+     ? spec.axes
+     : [{ key: "quality", weight: 1 }];
+   const candidates = _safeCandidates(spec.candidates);
+   const scored = candidates.map((c, idx) => {
+     const scores = c.scores || {};
+     let total = 0;
+     let weightSum = 0;
+     for (const ax of axes) {
+       const w = Number(ax.weight) || 0;
+       const v = Number(scores[ax.key]) || 0;
+       total += w * v;
+       weightSum += w;
+     }
+     const score = weightSum > 0 ? total / weightSum : 0;
+     return { id: c.id, score: Number(score.toFixed(4)), _idx: idx };
+   });
+   // Highest weighted score wins; ties broken by ORIGINAL index (reproducible,
+   // immune to candidate-order shuffling done for bias control upstream).
+   scored.sort((a, b) => b.score - a.score || a._idx - b._idx);
+   scored.forEach((s, i) => { s.rank = i + 1; delete s._idx; });
+   return { ranked: scored, winner: scored.length ? scored[0].id : null };
+ }
+
+ // ─── Driver ──────────────────────────────────────────────────────────────
+
+ function judge(spec, projectDir) {
+   const candidates = Array.isArray(spec && spec.candidates) ? spec.candidates : [];
+   if (!candidates.length) {
+     return { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: "no-candidates" };
+   }
+   const kind = spec.kind === "generic" ? "generic" : "partition";
+   const { ranked, winner } = kind === "partition"
+     ? rankPartitions(candidates, projectDir)
+     : rankGeneric(spec);
+   const ok = winner != null;
+   return {
+     ok,
+     exitCode: ok ? 0 : 4,
+     kind,
+     n: candidates.length,
+     winner,
+     ranked,
+     ...(ok ? {} : { reason: kind === "partition" ? "no-valid-candidate" : "no-candidates" }),
+   };
+ }
+
+ function readInput(opts) {
+   if (opts.in) return fs.readFileSync(opts.in, "utf8");
+   // stdin
+   try {
+     return fs.readFileSync(0, "utf8");
+   } catch {
+     return "";
+   }
+ }
+
+ function parseArgs(argv) {
+   const opts = { json: true, in: null, projectDir: process.cwd(), help: false };
+   for (let i = 0; i < argv.length; i++) {
+     const a = argv[i];
+     if (a === "--help" || a === "-h") opts.help = true;
+     else if (a === "--in") opts.in = argv[++i];
+     else if (a === "--project-dir") opts.projectDir = argv[++i];
+     else if (a === "--json") opts.json = true;
+   }
+   return opts;
+ }
+
+ const HELP = `Usage: gsd-t competition-judge [--in PATH] [--project-dir PATH]
+
+ Reads a candidate-set JSON spec (stdin or --in) and emits a ranked winner.
+
+   --in PATH            Read spec from file instead of stdin.
+   --project-dir PATH   Project root (default: cwd).
+   --json               Emit JSON envelope (default; always on).
+
+ Spec.kind:
+   "partition"   Objective oracle judge: scores domain decompositions via the
+                 file-disjointness prover (parallelGroups / waveDepth / validity).
+   "generic"     Deterministic rubric selector: picks the highest weighted score
+                 from rubric values an upstream judge agent supplied.
+
+ Exit codes: 0 winner · 4 no valid candidate · 64 bad input.`;
+
+ function main() {
+   const opts = parseArgs(process.argv.slice(2));
+   if (opts.help) {
+     process.stdout.write(HELP + "\n");
+     process.exit(0);
+   }
+   let spec;
+   try {
+     const raw = readInput(opts);
+     spec = JSON.parse(raw);
+   } catch (e) {
+     const env = { ok: false, exitCode: 64, kind: null, n: 0, winner: null, ranked: [], reason: `bad-input: ${e && e.message}` };
+     process.stdout.write(JSON.stringify(env, null, 2) + "\n");
+     process.exit(64);
+   }
+   let result;
+   try {
+     result = judge(spec, opts.projectDir);
+   } catch (e) {
+     result = { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: `judge-error: ${e && e.message}` };
+   }
+   process.stdout.write(JSON.stringify(result, null, 2) + "\n");
+   process.exit(result.exitCode);
+ }
+
+ if (require.main === module) main();
+
+ module.exports = {
+   judge,
+   scorePartition,
+   rankPartitions,
+   rankGeneric,
+   _internal: { _localDisjoint, _normPath },
+ };
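The weighted-mean selection in `rankGeneric` is easiest to follow with numbers. The sketch below restates that arithmetic in a standalone helper so it runs without the package installed. `weightedScore` and `rankedDemo` are illustrative names, not exports of the module, and the axes and rubric values are made up for the example.

```javascript
// Illustrative re-statement of rankGeneric's scoring rule (not the module itself):
// score = sum(weight * value) / sum(weight), rounded to 4 decimal places.
function weightedScore(axes, scores) {
  let total = 0;
  let weightSum = 0;
  for (const { key, weight } of axes) {
    const w = Number(weight) || 0;            // a non-numeric weight counts as 0
    total += w * (Number(scores[key]) || 0);  // a missing score counts as 0
    weightSum += w;
  }
  return weightSum > 0 ? Number((total / weightSum).toFixed(4)) : 0;
}

const demoAxes = [
  { key: "coherence", weight: 2 },
  { key: "completeness", weight: 1 },
];
const demoCandidates = [
  { id: "A", scores: { coherence: 4, completeness: 3 } }, // (2*4 + 1*3) / 3 = 11/3
  { id: "B", scores: { coherence: 3, completeness: 5 } }, // (2*3 + 1*5) / 3 = 11/3
  { id: "C", scores: { coherence: 2, completeness: 2 } }, // (2*2 + 1*2) / 3 = 2
];

// Highest score first; equal scores fall back to the supplied (original) index.
const rankedDemo = demoCandidates
  .map((c, idx) => ({ id: c.id, score: weightedScore(demoAxes, c.scores), idx }))
  .sort((a, b) => b.score - a.score || a.idx - b.idx);

console.log(rankedDemo.map((r) => `${r.id}=${r.score}`).join(" "));
// prints "A=3.6667 B=3.6667 C=2"
```

A and B tie at 11/3, so the index tiebreak selects A. That makes the result depend on the order in which candidates are supplied, which is why the header comment ties the tiebreak to the original, pre-shuffle order.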