@tekyzinc/gsd-t 4.0.28 → 4.1.10

package/CHANGELOG.md CHANGED
@@ -2,6 +2,41 @@
2
2
 
3
3
  All notable changes to GSD-T are documented here. Updated with each release.
4
4
 
5
+ ## [4.1.10] - 2026-06-05 (M82 Competition Mode - minor)
6
+
7
+ ### Added - Competition Mode: generate-and-judge for upstream, pre-contract phases
8
+
9
+ The *generative* dual of the orthogonal validation triad. The triad is adversarial (many critics, one candidate → a filter); Competition Mode is generative (many candidates, one judge → a generator). GSD-T historically filtered hard but **generated singly** — every upstream artifact was a single draft. Competition Mode adds the missing generator on the phases where it pays. **Watershed rule:** generate-and-judge ABOVE the contract; attack-and-filter BELOW it.
10
+
11
+ - **Opt-in `--competition N`** (N clamped 2–5; default off) on eligible upstream phases: `partition`, `milestone`, `discuss`, `design-decompose`. Ignored (single producer, logged) on ineligible phases (plan/impact/prd/doc-ripple) and impossible on post-contract phases (execute/verify/…).
12
+ - **Producers = Self-MoA** — N samples of ONE strong model (opus), diversified by prompt *angle* (max-parallelism / simplicity / risk-isolation / dependency-depth / balance), not by a model zoo. Evidence (Self-MoA, arXiv 2502.00674): aggregation is far more sensitive to candidate quality than diversity; mixing models injects low-quality candidates. No debate — producers stay independent.
13
+ - **Objective judge for partition (the v1 beachhead)** — `bin/gsd-t-competition-judge.cjs --kind partition` scores candidate decompositions via the SAME file-disjointness oracle the dispatcher uses (`bin/gsd-t-file-disjointness.cjs`): parallelGroups / waveDepth / validity. A calculator, not an LLM critic → immune to position/verbosity/self-preference bias. Touch paths normalized (`./a` ≡ `a`, `//`, backslashes, trailing slash, dedupe; case preserved).
14
+ - **Subjective judge for milestone/discuss/design** — blind + deterministically-shuffled + different-model (sonnet) + rubric-scored; the winner is finalized deterministically by `--kind generic` (highest weighted score; reproducible tiebreak; zero inference in the substrate).
15
+ - **Two-gate selection policy** (synthesize only when candidate-quality-uniform AND artifact-is-list-shaped; else pick-one) + three artifact classes (coupled-thesis → pick-one; line-items → union/dedup; structurally-validated → synthesize+re-validate). The finalizer does pick-one-at-thesis + union-at-line-item-level, then partition re-validates the graft via the oracle and BLOCKS on a reintroduced overlap.
16
+ - **New CLI**: `gsd-t competition-judge [--in SPEC.json] [--project-dir P]` (exit 0 winner / 4 no valid candidate / 64 bad input). Added to project + global bin tools.
17
+ - **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE (6 invariants).
18
+ - **Verification**: orthogonal triad ran. Adversarial Workflow Red Team (Opus, fresh context) FAILed first pass (3 HIGH + 2 MEDIUM), all fixed, re-validation Red Team GRUDGING-PASS (all 5 fixed, no new HIGH/CRITICAL). Real-sandbox acceptance gate passed (judge integration ran end-to-end in the Workflow sandbox). Suite 1357/0/4 (+6 M82 tests). **SC#1 measured on M82's own partition: competition (3 producers) → 3 parallel groups vs N=1 baseline's 1 (3× parallelism), invalid overlap candidate correctly disqualified.** SC#3 position-bias probe: order-invariant winner (100%).
19
+ - Origin: brainstorm 2026-06-05 grounded in 2 deep-research runs (best-of-N/judge/debate + synthesis-vs-pick-one/MoA/Frankenstein).
20
+
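The opt-in rule in the first bullet can be sketched as a small resolver. This is a minimal illustration, not the shipped workflow code: the helper name `resolveProducerCount` is hypothetical, and treating a missing, non-numeric, or `N <= 1` value as "off" is an assumption based on the "default off" and "Omit / set 1 = off" wording.

```javascript
// Hypothetical helper: the name and the "N <= 1 means off" choice are assumptions.
const ELIGIBLE_PHASES = new Set(["partition", "milestone", "discuss", "design-decompose"]);

function resolveProducerCount(phase, requested) {
  const n = Math.floor(Number(requested));
  // Off by default: missing, non-numeric, or N <= 1 means a single producer.
  if (!Number.isFinite(n) || n <= 1) return 1;
  // Ineligible phases ignore the flag and keep a single producer.
  if (!ELIGIBLE_PHASES.has(phase)) return 1;
  // Clamp into the documented 2..5 range.
  return Math.min(5, Math.max(2, n));
}

console.log(resolveProducerCount("partition", 9)); // clamped to the upper bound
console.log(resolveProducerCount("plan", 3));      // ineligible phase, single producer
```

Returning `1` for every "off" case keeps the caller's fan-out loop uniform: one producer is simply the pre-M82 behavior.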
21
+ ### Versioning
22
+
23
+ Minor bump 4.0.29 → 4.1.10 (new feature, additive; patch reset to 10).
24
+
25
+ ## [4.0.29] - 2026-06-05 (M81 Workflows Runtime-Native - patch)
26
+
27
+ ### Fixed - TD-113: 6 of 7 workflows (+ quick) crashed in the Workflow sandbox and had never run
28
+
29
+ The GSD-T self-scan (Scan #12) and a live NiceNote session both confirmed it: every `*.workflow.js` except `gsd-t-scan` opened with `require("./_lib.js")`, which the Anthropic Workflow sandbox forbids (it provides only `agent/parallel/pipeline/log/phase/budget/args` — no `require`/`fs`/`path`/`child_process`/`process`). Each threw `ReferenceError` on first eval, so the entire orchestration layer — `execute`, `verify`, `wave`, `integrate`, `debug`, `phase`, `quick` — silently fell back to hand-driven runs and never actually executed as workflows.
30
+
31
+ Ported all 7 to the runtime-native pattern proven on scan in M71/M80: inline `async` helpers that delegate each CLI call (preflight, verify-gate, brief, build-coverage, ci-parity, test-data, parallel/disjointness) to an `agent()`'s Bash — preferring project-local `bin/<tool>.cjs`, falling back to the global `gsd-t` PATH binary — and parse the JSON envelope. `args` is now `JSON.parse`d (it arrives stringified). File reads moved into the agents that have `Read` (worker reads its own scope.md/tasks.md; triad agents read their own protocol from `templates/prompts/`). `verify`'s raw `spawnSync`/`require` CI-parity block and `Date.now()` run-id were replaced; the M57/M58 FAIL-blocking semantics are unchanged.
32
+
33
+ - `templates/workflows/gsd-t-{execute,verify,wave,integrate,debug,phase,quick}.workflow.js`: runtime-native port.
34
+ - `test/m71-workflow-runtime-native-lint.test.js`: lint now covers all 8 workflows (was scan-only).
35
+ - `test/m81-workflows-runtime-native.test.js`: structural invariants (no `_lib` require, args-string parse, no `spawnSync`/`Date.now`/`Math.random` in orchestrator, FAIL-blocking gates preserved).
36
+ - `CLAUDE.md`, `~/.claude/CLAUDE.md` + `templates/CLAUDE-global.md`: documented the runtime-native invariant; retired `_lib.js` as a workflow dependency.
37
+
38
+ Proven in the REAL sandbox: `quick` ran end-to-end (verify-gate PASS), `verify` evaluated through its CLI delegations returning a real verify-gate envelope, `execute` evaluated cleanly to its arg-guard — all with zero ReferenceError. Suite 1341/1341 pass.
39
+
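Two pieces of the runtime-native pattern described above lend themselves to a short sketch: parsing the stringified `args`, and pulling a JSON envelope out of an agent's Bash output. Both helper names are hypothetical, and the assumption that agent output may carry stray text around the JSON is mine, not something the changelog states.

```javascript
// Hypothetical helpers; the sandbox's real `args` and agent output shapes may differ.
function parseWorkflowArgs(raw) {
  // `args` arrives as a JSON string; tolerate an already-parsed object as well.
  if (raw && typeof raw === "object") return raw;
  if (typeof raw !== "string" || raw.trim() === "") return {};
  try {
    const parsed = JSON.parse(raw);
    return parsed && typeof parsed === "object" ? parsed : {};
  } catch {
    return {};
  }
}

function parseEnvelope(text) {
  // Assumption: take the outermost {...} span so surrounding log lines are ignored.
  const s = String(text);
  const start = s.indexOf("{");
  const end = s.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(s.slice(start, end + 1));
  } catch {
    return null;
  }
}

const args = parseWorkflowArgs('{"phase":"quick","projectDir":"."}');
const envelope = parseEnvelope('running gate...\n{"ok":true,"exitCode":0}\ndone');
console.log(args.phase, envelope && envelope.ok);
```

Neither helper touches `require`, `fs`, or `process`, which is the point of the invariant: everything that needs those runs inside the agent's Bash instead.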
5
40
  ## [4.0.28] - 2026-06-04 (M80 Scan Document-Phase Fixes - patch)
6
41
 
7
42
  ### Fixed - scan workflow crashed at the document phase, then shipped a truncated plain-English doc
package/README.md CHANGED
@@ -122,8 +122,11 @@ gsd-t build-coverage --json # M57: new top-level pat
122
122
  gsd-t ci-parity --json # M57: reproduce the project's actual CI build locally (auto docker build)
123
123
  gsd-t test-data --list [--run ID] [--json] # M58: list test-data ledger entries
124
124
  gsd-t test-data --purge --run ID [--dry-run] [--json] # M58: purge tagged test data after Verify (Step 4.5)
125
+ gsd-t competition-judge --in SPEC.json [--project-dir P] # M82: generate-and-judge selection oracle (partition / generic)
125
126
  ```
126
127
 
128
+ **Competition Mode (M82).** Opt-in `--competition N` (N 2–5) on upstream, pre-contract phases (`/gsd-t-partition`, `/gsd-t-milestone`, `/gsd-t-design-decompose`) fans out N parallel candidate producers and a judge selects the winner — the generative dual of the orthogonal validation triad. Partition uses an *objective* file-disjointness oracle as the judge (a calculator, not a biased critic); subjective phases use a blind + different-model + rubric judge. Default off. See `.gsd-t/contracts/competition-mode-contract.md`.
129
+
127
130
  `gsd-t parallel` consumes the M44 task-graph (D1) and applies three pre-spawn gates (D4 depgraph validation → D5 file-disjointness → D6 economics) followed by mode-aware headroom/split math. Extends — does not replace — the M40 orchestrator. Contract: `.gsd-t/contracts/wave-join-contract.md` v1.1.0.
128
131
 
129
132
  Each iteration runs as a fresh `claude -p` session. A cumulative debug ledger (`.gsd-t/debug-state.jsonl`) preserves hypothesis/fix/learning history across sessions. An anti-repetition preamble prevents retrying failed approaches.
@@ -0,0 +1,344 @@
1
+ "use strict";
2
+
3
+ /**
4
+ * gsd-t-competition-judge — M82 D1
5
+ *
6
+ * The selection oracle for Competition Mode (generate-and-judge on upstream,
7
+ * pre-contract phases). Given N candidate artifacts produced by parallel
8
+ * producers, score them and emit a winner — the GENERATIVE dual of the
9
+ * orthogonal validation triad (which is adversarial: many critics, one
10
+ * candidate). Contract: .gsd-t/contracts/competition-mode-contract.md v1.0.0.
11
+ *
12
+ * Two judge modes, chosen by `--kind`:
13
+ *
14
+ * --kind partition → OBJECTIVE judge (the v1 beachhead). Each candidate is a
15
+ * proposed domain decomposition: a list of domains, each with a write-
16
+ * touch list. We score it with the SAME disjointness oracle the real
17
+ * parallel dispatcher uses (bin/gsd-t-file-disjointness.cjs), so the judge
18
+ * is a CALCULATOR, not a critic — it sidesteps every LLM-judge bias
19
+ * (position / verbosity / self-preference). Metrics, higher-is-better
20
+ * unless noted:
21
+ * - valid : zero write-target overlaps across domains (HARD gate)
22
+ * - parallelGroups : count of disjoint domains that can fan out at once
23
+ * - waveDepth : serial gates (sequential groups + 1 if any) — LOWER better
24
+ * - unprovableCount : domains with no touch list — LOWER better (safe-default seq)
25
+ * Ranking: invalid candidates are disqualified; among valid ones, rank by
26
+ * (parallelGroups desc, waveDepth asc, unprovableCount asc, domainCount asc).
27
+ *
28
+ * --kind generic → records a SUBJECTIVE judge's verdict. The numeric scoring
29
+ * lives in the rubric the Workflow's judge agent fills in (blind+shuffled,
30
+ * different-model, rubric-scored — see the contract). This CLI only
31
+ * validates/normalizes the rubric scores the agent supplies and picks the
32
+ * winner deterministically (highest weighted score; ties → lowest index of
33
+ * the ORIGINAL, pre-shuffle order to keep selection reproducible). It does
34
+ * NOT call an LLM — keeping inference out of the deterministic substrate
35
+ * (per feedback_deterministic_orchestration + anthropic-key-measurement-only).
36
+ *
37
+ * Input: a JSON spec on stdin OR via --in <path>. Shapes:
38
+ *
39
+ * partition: {
40
+ * "kind": "partition",
41
+ * "candidates": [
42
+ * { "id": "A", "domains": [ { "name": "d1", "touches": ["a.js","b.js"] }, ... ] },
43
+ * ...
44
+ * ]
45
+ * }
46
+ *
47
+ * generic: {
48
+ * "kind": "generic",
49
+ * "axes": [ { "key": "coherence", "weight": 1 }, { "key": "completeness", "weight": 1 }, ... ],
50
+ * "candidates": [
51
+ * { "id": "A", "scores": { "coherence": 4, "completeness": 3, ... } },
52
+ * ...
53
+ * ]
54
+ * }
55
+ *
56
+ * Output (JSON envelope, the shape runCli parses):
57
+ * {
58
+ * ok: boolean, // true unless input was unusable
59
+ * exitCode: 0 | 4 | 64,
60
+ * kind, n,
61
+ * winner: <candidateId|null>,
62
+ * ranked: [ { id, valid?, parallelGroups?, waveDepth?, unprovableCount?, score?, rank } ],
63
+ * reason?: string
64
+ * }
65
+ *
66
+ * Exit codes: 0 ok+winner · 4 ok but NO valid candidate (all disqualified) · 64 bad input.
67
+ *
68
+ * Hard rules (mirrors the disjointness prover's discipline):
69
+ * - Zero external runtime deps (Node built-ins only).
70
+ * - Never throws — always emits an envelope.
71
+ * - Pure / read-only — no project mutation. Deterministic given the same input.
72
+ */
73
+
74
+ const fs = require("node:fs");
75
+
76
+ // The objective partition judge reuses the production disjointness oracle so the
77
+ // judge's notion of "parallelizable" is byte-identical to the dispatcher's.
78
+ let proveDisjointness;
79
+ try {
80
+ ({ proveDisjointness } = require("./gsd-t-file-disjointness.cjs"));
81
+ } catch {
82
+ proveDisjointness = null;
83
+ }
84
+
85
+ // ─── Partition scoring (objective) ───────────────────────────────────────
86
+
87
+ /**
88
+ * Score one candidate partition by running its domains through the disjointness
89
+ * oracle. Each domain becomes a pseudo-task {id, domain, touches}; we never hit
90
+ * git history (every domain carries an explicit touch list or is counted
91
+ * unprovable), so scoring is pure and deterministic.
92
+ *
93
+ * @returns {{valid, domainCount, parallelGroups, sequentialGroups, unprovableCount, waveDepth}}
94
+ */
95
+ // Normalize a touch path to a stable file identity so two spellings of the SAME
96
+ // file (./bin/x.js vs bin/x.js, trailing slash, backslashes, redundant ./ or //)
97
+ // are detected as a conflict. Without this, an overlapping partition could be
98
+ // scored `valid` and WIN — then the real dispatcher would hit a write conflict.
99
+ // Note: case is preserved (most CI runs on case-sensitive Linux); collapsing case
100
+ // here would create false conflicts on case-sensitive repos. Path identity only.
101
+ function _normPath(p) {
102
+ if (typeof p !== "string") return "";
103
+ let s = p.trim().replace(/\\/g, "/"); // backslashes -> forward
104
+ s = s.replace(/\/+/g, "/"); // collapse repeated slashes
105
+ s = s.replace(/^(\.\/)+/, ""); // drop leading ./ (including repeated ././)
106
+ while (s.includes("/./")) s = s.replace("/./", "/"); // drop interior /./
107
+ s = s.replace(/\/+$/, ""); // drop trailing slash
108
+ return s;
109
+ }
110
+
111
+ function scorePartition(candidate, projectDir) {
112
+ const domains = Array.isArray(candidate.domains) ? candidate.domains : [];
113
+ const tasks = domains.map((d, i) => ({
114
+ id: `${candidate.id}:${d.name || `d${i}`}`,
115
+ domain: d.name || `d${i}`,
116
+ // Only honor an explicit touch list — never let the oracle fall through to
117
+ // git history during scoring (would make the judge non-deterministic).
118
+ // Normalize + de-dupe so path-spelling variants are caught as real conflicts.
119
+ touches: Array.isArray(d.touches)
120
+ ? Array.from(new Set(d.touches.map(_normPath).filter(Boolean)))
121
+ : [],
122
+ }));
123
+
124
+ // Run the real oracle when available; otherwise fall back to a self-contained
125
+ // overlap check so the judge still works if the lib isn't co-located.
126
+ const res = proveDisjointness
127
+ ? proveDisjointness({ tasks, projectDir })
128
+ : _localDisjoint(tasks);
129
+
130
+ const parallelGroups = (res.parallel || []).length;
131
+ const sequentialGroups = (res.sequential || []).filter(
132
+ (g) => !(g.length === 1 && (res.unprovable || []).includes(g[0])),
133
+ ).length;
134
+ const unprovableCount = (res.unprovable || []).length;
135
+
136
+ // VALID = no two domains with declared touch lists write the same file. An
137
+ // overlap shows up as a sequential group of size ≥2 among provable tasks.
138
+ const overlapGroup = (res.sequential || []).some((g) => g.length >= 2);
139
+ const valid = !overlapGroup;
140
+
141
+ // waveDepth: 1 wave for the disjoint fan-out, +1 if there is any serial bottleneck
142
+ // (overlapping/unprovable domains that must run after). Fewer = better.
143
+ const serialBottlenecks = sequentialGroups + unprovableCount;
144
+ const waveDepth = (parallelGroups > 0 ? 1 : 0) + (serialBottlenecks > 0 ? 1 : 0) || 1;
145
+
146
+ return {
147
+ valid,
148
+ domainCount: domains.length,
149
+ parallelGroups,
150
+ sequentialGroups,
151
+ unprovableCount,
152
+ waveDepth,
153
+ };
154
+ }
155
+
156
+ // Self-contained overlap fallback (only used if the oracle lib is absent).
157
+ function _localDisjoint(tasks) {
158
+ const parallel = [];
159
+ const sequential = [];
160
+ const unprovable = [];
161
+ const provable = [];
162
+ for (const t of tasks) {
163
+ if (!t.touches || t.touches.length === 0) {
164
+ unprovable.push(t);
165
+ sequential.push([t]);
166
+ } else {
167
+ provable.push(t);
168
+ }
169
+ }
170
+ // union-find over file overlap
171
+ const parent = provable.map((_, i) => i);
172
+ const find = (i) => {
173
+ while (parent[i] !== i) { parent[i] = parent[parent[i]]; i = parent[i]; }
174
+ return i;
175
+ };
176
+ for (let i = 0; i < provable.length; i++) {
177
+ for (let j = i + 1; j < provable.length; j++) {
178
+ const a = new Set(provable[i].touches);
179
+ if (provable[j].touches.some((f) => a.has(f))) {
180
+ const ra = find(i), rb = find(j);
181
+ if (ra !== rb) parent[ra] = rb;
182
+ }
183
+ }
184
+ }
185
+ const groups = new Map();
186
+ for (let i = 0; i < provable.length; i++) {
187
+ const r = find(i);
188
+ if (!groups.has(r)) groups.set(r, []);
189
+ groups.get(r).push(provable[i]);
190
+ }
191
+ for (const g of groups.values()) (g.length === 1 ? parallel : sequential).push(g);
192
+ return { parallel, sequential, unprovable };
193
+ }
194
+
195
+ // Drop candidates that are not usable objects with a string id (Red Team MED-4:
196
+ // the 'never throws' guarantee is on the function, not just the CLI shell — an
197
+ // in-process caller passing [null] or {id:{}} must not crash, and a non-string id
198
+ // could never match `c.id === winnerId` in the workflow anyway).
199
+ function _safeCandidates(candidates) {
200
+ return (Array.isArray(candidates) ? candidates : []).filter(
201
+ (c) => c && typeof c === "object" && typeof c.id === "string" && c.id.length > 0,
202
+ );
203
+ }
204
+
205
+ function rankPartitions(rawCandidates, projectDir) {
206
+ const candidates = _safeCandidates(rawCandidates);
207
+ const scored = candidates.map((c) => ({ id: c.id, ...scorePartition(c, projectDir) }));
208
+ // Disqualify invalid (file-overlap) candidates from winning, but keep them in
209
+ // the ranking so the caller can see why they lost.
210
+ const valid = scored.filter((s) => s.valid);
211
+ const cmp = (a, b) =>
212
+ b.parallelGroups - a.parallelGroups || // more concurrency wins
213
+ a.waveDepth - b.waveDepth || // fewer serial gates wins
214
+ a.unprovableCount - b.unprovableCount || // fewer unknowns wins
215
+ a.domainCount - b.domainCount; // simpler (fewer domains) wins
216
+ valid.sort(cmp);
217
+ const invalid = scored.filter((s) => !s.valid);
218
+ const ordered = [...valid, ...invalid];
219
+ ordered.forEach((s, i) => { s.rank = i + 1; });
220
+ return { ranked: ordered, winner: valid.length ? valid[0].id : null };
221
+ }
222
+
223
+ // ─── Generic scoring (subjective rubric, deterministic selection) ────────
224
+
225
+ function rankGeneric(spec) {
226
+ const axes = Array.isArray(spec.axes) && spec.axes.length
227
+ ? spec.axes
228
+ : [{ key: "quality", weight: 1 }];
229
+ const candidates = _safeCandidates(spec.candidates);
230
+ const scored = candidates.map((c, idx) => {
231
+ const scores = c.scores || {};
232
+ let total = 0;
233
+ let weightSum = 0;
234
+ for (const ax of axes) {
235
+ const w = Number(ax.weight) || 0;
236
+ const v = Number(scores[ax.key]) || 0;
237
+ total += w * v;
238
+ weightSum += w;
239
+ }
240
+ const score = weightSum > 0 ? total / weightSum : 0;
241
+ return { id: c.id, score: Number(score.toFixed(4)), _idx: idx };
242
+ });
243
+ // Highest weighted score wins; ties broken by ORIGINAL index (reproducible,
244
+ // immune to candidate-order shuffling done for bias control upstream).
245
+ scored.sort((a, b) => b.score - a.score || a._idx - b._idx);
246
+ scored.forEach((s, i) => { s.rank = i + 1; delete s._idx; });
247
+ return { ranked: scored, winner: scored.length ? scored[0].id : null };
248
+ }
249
+
250
+ // ─── Driver ──────────────────────────────────────────────────────────────
251
+
252
+ function judge(spec, projectDir) {
253
+ const candidates = Array.isArray(spec && spec.candidates) ? spec.candidates : [];
254
+ if (!candidates.length) {
255
+ return { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: "no-candidates" };
256
+ }
257
+ const kind = spec.kind === "generic" ? "generic" : "partition";
258
+ const { ranked, winner } = kind === "partition"
259
+ ? rankPartitions(candidates, projectDir)
260
+ : rankGeneric(spec);
261
+ const ok = winner != null;
262
+ return {
263
+ ok,
264
+ exitCode: ok ? 0 : 4,
265
+ kind,
266
+ n: candidates.length,
267
+ winner,
268
+ ranked,
269
+ ...(ok ? {} : { reason: kind === "partition" ? "no-valid-candidate" : "no-candidates" }),
270
+ };
271
+ }
272
+
273
+ function readInput(opts) {
274
+ if (opts.in) return fs.readFileSync(opts.in, "utf8");
275
+ // stdin
276
+ try {
277
+ return fs.readFileSync(0, "utf8");
278
+ } catch {
279
+ return "";
280
+ }
281
+ }
282
+
283
+ function parseArgs(argv) {
284
+ const opts = { json: true, in: null, projectDir: process.cwd(), help: false };
285
+ for (let i = 0; i < argv.length; i++) {
286
+ const a = argv[i];
287
+ if (a === "--help" || a === "-h") opts.help = true;
288
+ else if (a === "--in") opts.in = argv[++i];
289
+ else if (a === "--project-dir") opts.projectDir = argv[++i];
290
+ else if (a === "--json") opts.json = true;
291
+ }
292
+ return opts;
293
+ }
294
+
295
+ const HELP = `Usage: gsd-t competition-judge [--in PATH] [--project-dir PATH]
296
+
297
+ Reads a candidate-set JSON spec (stdin or --in) and emits a ranked winner.
298
+
299
+ --in PATH Read spec from file instead of stdin.
300
+ --project-dir PATH Project root (default: cwd).
301
+ --json Emit JSON envelope (default; always on).
302
+
303
+ Spec.kind:
304
+ "partition" Objective oracle judge — scores domain decompositions via the
305
+ file-disjointness prover (parallelGroups / waveDepth / validity).
306
+ "generic" Deterministic rubric selector — picks the highest weighted score
307
+ from rubric values an upstream judge agent supplied.
308
+
309
+ Exit codes: 0 winner · 4 no valid candidate · 64 bad input.`;
310
+
311
+ function main() {
312
+ const opts = parseArgs(process.argv.slice(2));
313
+ if (opts.help) {
314
+ process.stdout.write(HELP + "\n");
315
+ process.exit(0);
316
+ }
317
+ let spec;
318
+ try {
319
+ const raw = readInput(opts);
320
+ spec = JSON.parse(raw);
321
+ } catch (e) {
322
+ const env = { ok: false, exitCode: 64, kind: null, n: 0, winner: null, ranked: [], reason: `bad-input: ${e && e.message}` };
323
+ process.stdout.write(JSON.stringify(env, null, 2) + "\n");
324
+ process.exit(64);
325
+ }
326
+ let result;
327
+ try {
328
+ result = judge(spec, opts.projectDir);
329
+ } catch (e) {
330
+ result = { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: `judge-error: ${e && e.message}` };
331
+ }
332
+ process.stdout.write(JSON.stringify(result, null, 2) + "\n");
333
+ process.exit(result.exitCode);
334
+ }
335
+
336
+ if (require.main === module) main();
337
+
338
+ module.exports = {
339
+ judge,
340
+ scorePartition,
341
+ rankPartitions,
342
+ rankGeneric,
343
+ _internal: { _localDisjoint, _normPath },
344
+ };
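To make the validity gate concrete, here is a self-contained sketch of why path normalization matters before the overlap check. It is an illustrative restatement, not the shipped oracle: `normPath` mirrors only part of `_normPath` (interior `/./` handling is omitted), `findOverlaps` is a hypothetical stand-in for the disjointness prover, and the sample domains are invented.

```javascript
// Illustrative restatement of the validity gate; not the shipped oracle.
function normPath(p) {
  if (typeof p !== "string") return "";
  return p
    .trim()
    .replace(/\\/g, "/")   // backslashes -> forward slashes
    .replace(/\/+/g, "/")  // collapse repeated slashes
    .replace(/^\.\//, "")  // drop a leading ./
    .replace(/\/+$/, "");  // drop a trailing slash
}

function findOverlaps(domains) {
  const owner = new Map(); // normalized path -> first domain that claimed it
  const overlaps = [];
  for (const d of domains) {
    const touches = new Set((d.touches || []).map(normPath).filter(Boolean));
    for (const file of touches) {
      const prev = owner.get(file);
      if (prev === undefined) owner.set(file, d.name);
      else overlaps.push({ file, domains: [prev, d.name] });
    }
  }
  return overlaps;
}

// Two spellings of the same file in different domains: a real write conflict.
const candidate = {
  id: "A",
  domains: [
    { name: "judge", touches: ["./bin/x.js", "bin/y.js"] },
    { name: "docs", touches: ["bin\\x.js", "README.md"] },
  ],
};
const overlaps = findOverlaps(candidate.domains);
console.log(overlaps.length === 0 ? "valid" : "invalid", overlaps);
```

Without the normalization step, `./bin/x.js` and `bin\x.js` would compare as different strings and this candidate would pass the hard gate.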
package/bin/gsd-t.js CHANGED
@@ -1182,6 +1182,8 @@ const GLOBAL_BIN_TOOLS = [
1182
1182
  // M57 — CI-parity verify-gate checks (structural build-coverage + containment-safe ci-parity).
1183
1183
  "gsd-t-build-coverage.cjs",
1184
1184
  "gsd-t-ci-parity.cjs",
1185
+ // M82 — Competition Mode generate-and-judge selection oracle.
1186
+ "gsd-t-competition-judge.cjs",
1185
1187
  ];
1186
1188
 
1187
1189
  function installGlobalBinTools() {
@@ -2469,6 +2471,10 @@ const PROJECT_BIN_TOOLS = [
2469
2471
  "cli-preflight.cjs", "parallel-cli.cjs", "parallel-cli-tee.cjs",
2470
2472
  "gsd-t-context-brief.cjs",
2471
2473
  "gsd-t-verify-gate.cjs", "gsd-t-verify-gate-judge.cjs",
2474
+ // M82 — Competition Mode judge + its disjointness oracle dependency, so a
2475
+ // project's gsd-t-phase workflow can score candidate partitions via the
2476
+ // project-local bin (runCli prefers bin/<tool>.cjs over the global binary).
2477
+ "gsd-t-competition-judge.cjs", "gsd-t-file-disjointness.cjs",
2472
2478
  ];
2473
2479
 
2474
2480
  // Files that older versions of this installer copied into project bin/ but
@@ -4546,6 +4552,16 @@ if (require.main === module) {
4546
4552
  });
4547
4553
  process.exit(res.status == null ? 1 : res.status);
4548
4554
  }
4555
+ case "competition-judge": {
4556
+ // M82 D1 — `gsd-t competition-judge` thin dispatcher to the generate-and-judge
4557
+ // selection oracle (objective partition judge + deterministic rubric selector).
4558
+ const { spawnSync } = require("child_process");
4559
+ const js = path.join(__dirname, "gsd-t-competition-judge.cjs");
4560
+ const res = spawnSync(process.execPath, [js, ...args.slice(1)], {
4561
+ stdio: "inherit",
4562
+ });
4563
+ process.exit(res.status == null ? 1 : res.status);
4564
+ }
4549
4565
  case "metrics":
4550
4566
  doMetrics(args.slice(1));
4551
4567
  break;
@@ -25,14 +25,21 @@ Capture the design reference from `$ARGUMENTS` (Figma URL / image path). If Figm
25
25
  args: {
26
26
  phase: "design-decompose",
27
27
  projectDir: ".",
28
- userInput: "$ARGUMENTS"
28
+ userInput: "$ARGUMENTS",
29
+ // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
30
+ // parallel decompositions; a blind, different-model, rubric judge (fidelity /
31
+ // completeness / reuse / simplicity) selects the winner. Useful when a design
32
+ // is ambiguous or the component boundaries aren't obvious.
33
+ competition: 1
29
34
  }
30
35
  }
31
36
  ```
32
37
 
38
+ **Competition Mode (`--competition N`).** When a design is ambiguous or the element/widget/page boundaries aren't obvious, `/gsd-t-design-decompose --competition 3` fans out N candidate decompositions and a blind, different-model rubric judge picks the best. Parse N (clamped 2..5). See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
39
+
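As a rough illustration of how the rubric verdict might be finalized for this phase, the sketch below computes a weighted mean per candidate and breaks ties by original order. The axis keys come from the rubric named above; the weights, scores, and the helper name `pickByRubric` are made-up sample values, not anything the contract specifies.

```javascript
// Illustrative only: the axis weights and scores are invented sample data.
function pickByRubric(axes, candidates) {
  const scored = candidates.map((c, idx) => {
    let total = 0;
    let weightSum = 0;
    for (const ax of axes) {
      const w = Number(ax.weight) || 0;
      total += w * (Number((c.scores || {})[ax.key]) || 0);
      weightSum += w;
    }
    return { id: c.id, score: weightSum > 0 ? total / weightSum : 0, idx };
  });
  // Highest weighted mean first; ties fall back to the original (pre-shuffle) order.
  scored.sort((a, b) => b.score - a.score || a.idx - b.idx);
  return scored;
}

const axes = [
  { key: "fidelity", weight: 2 },
  { key: "completeness", weight: 1 },
  { key: "reuse", weight: 1 },
  { key: "simplicity", weight: 1 },
];
const ranked = pickByRubric(axes, [
  { id: "A", scores: { fidelity: 4, completeness: 3, reuse: 4, simplicity: 3 } },
  { id: "B", scores: { fidelity: 5, completeness: 4, reuse: 2, simplicity: 3 } },
  { id: "C", scores: { fidelity: 4, completeness: 3, reuse: 3, simplicity: 4 } },
]);
console.log(ranked.map((r) => `${r.id}=${r.score}`).join(" "));
```

Here A and C tie on the weighted mean, so A ranks ahead of C purely because it came first in the original order, which keeps the pick reproducible even when the judge saw the candidates shuffled.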
33
40
  ## Step 3: Interpret the result
34
41
 
35
- The Workflow returns `{ status, artifacts, summary, decisions }`.
42
+ The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
36
43
 
37
44
  - `status === "complete"`: the element → widget → page contract tree is written under `.gsd-t/contracts/design/`.
38
45
  - `status === "partial" | "blocked"`: the agent needs the design source (e.g. Figma auth) or a stack-capability decision. Surface it.
@@ -479,6 +479,14 @@ Use these when user asks for help on a specific command:
479
479
  - **Use when**: Test data hygiene. Catches the GSD-T-Board class (2442 orphaned `E2E_TEST_*` / `E2E_DRAG_*` ideas left in the production data store after a passing Verify run).
480
480
  - **CLI**: `gsd-t test-data --list [--run <id>] [--json]` / `gsd-t test-data --purge --run <id> [--dry-run] [--json] [--project <dir>]`. Exit 0 on success, 4 on adapter errors, 64 on usage error.
481
481
 
482
+ ### competition-judge (M82)
483
+ - **Summary**: The selection oracle for Competition Mode (generate-and-judge — the *generative* dual of the orthogonal validation triad). Two modes: `--kind partition` scores candidate domain decompositions via the file-disjointness oracle (parallelGroups / waveDepth / validity — a calculator, not an LLM critic, so it's immune to judge bias); `--kind generic` is a deterministic rubric selector that finalizes a winner from rubric scores an upstream blind/different-model judge supplied.
484
+ - **Auto-invoked**: Yes — by `gsd-t-phase.workflow.js` when an eligible phase (partition / milestone / discuss / design-decompose) is run with `competition: N` (N 2–5). Opt-in per phase via `/gsd-t-partition --competition N` etc. Default off.
485
+ - **Files**: `bin/gsd-t-competition-judge.cjs` (reuses `bin/gsd-t-file-disjointness.cjs`).
486
+ - **Use when**: Upstream, pre-contract, wide-solution-space decisions where the cost of a single draft is high (partition, milestone decomposition, ambiguous design decomposition). Never on post-contract phases (execute/verify/etc.) — those are owned by the adversarial triad.
487
+ - **CLI**: `gsd-t competition-judge [--in <spec.json>] [--project-dir <dir>]` (spec via stdin or `--in`). Exit 0 winner · 4 no valid candidate · 64 bad input.
488
+ - **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE.
489
+
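A caller that shells out to the judge needs to branch on the three documented exit codes. The sketch below shows one way to do that. The function name and the `action` labels are hypothetical; only the exit codes (0 / 4 / 64) and the `winner` / `reason` envelope fields come from the documentation above.

```javascript
// Hypothetical wrapper; only the exit codes and envelope fields come from the docs.
function interpretJudgeResult(exitCode, stdout) {
  let envelope = null;
  try {
    envelope = JSON.parse(stdout);
  } catch {
    // Leave envelope as null; the exit code alone still tells us what happened.
  }
  switch (exitCode) {
    case 0:
      return { action: "use-winner", winner: envelope ? envelope.winner : null };
    case 4:
      return { action: "no-valid-candidate", reason: envelope ? envelope.reason : null };
    case 64:
      return { action: "bad-input", reason: envelope ? envelope.reason : null };
    default:
      return { action: "unexpected", reason: `exit ${exitCode}` };
  }
}

console.log(interpretJudgeResult(0, '{"ok":true,"exitCode":0,"winner":"A"}'));
console.log(interpretJudgeResult(4, '{"ok":false,"exitCode":4,"reason":"no-valid-candidate"}'));
```

What the caller does after a `no-valid-candidate` result (for example, falling back to a single producer) is a workflow decision this sketch deliberately leaves open.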
482
490
  ## Unknown Command
483
491
 
484
492
  If user asks for help on unrecognized command:
@@ -25,14 +25,21 @@ Read `.gsd-t/progress.md` (current version + completed milestones), `docs/requir
25
25
  args: {
26
26
  phase: "milestone",
27
27
  projectDir: ".",
28
- userInput: "$ARGUMENTS"
28
+ userInput: "$ARGUMENTS",
29
+ // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
30
+ // parallel Self-MoA producers proposing different decomposition strategies
31
+ // (risk-first / value-first / dependency-first); a blind, different-model,
32
+ // rubric judge selects the winner. Coupled-thesis → pick-one (no Frankenstein).
33
+ competition: 1
29
34
  }
30
35
  }
31
36
  ```
32
37
 
38
+ **Competition Mode (`--competition N`).** Milestone decomposition is the highest-altitude decision in the system — different strategies are genuinely different. If the user invokes `/gsd-t-milestone --competition 3`, parse N (clamped 2..5) and pass `competition: N`. Because a milestone decomposition is a *coupled thesis*, the judge selects one winner whole (pick-one) and only salvages non-overlapping good line-items from the losers — it never Frankensteins. See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
39
+
33
40
  ## Step 3: Interpret the result
34
41
 
35
- The Workflow returns `{ status, artifacts, summary, decisions }`.
42
+ The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
36
43
 
37
44
  - `status === "complete"`: milestone defined and appended to progress.md with falsifiable SCs. Do NOT auto-partition for large/risky milestones — show the Next Up hint.
38
45
  - `status === "blocked"`: the agent needs a scoping decision from the user.
@@ -30,14 +30,21 @@ Call the `Workflow` tool with:
30
30
  phase: "partition",
31
31
  milestone: "M{NN}",
32
32
  projectDir: ".",
33
- userInput: "$ARGUMENTS"
33
+ userInput: "$ARGUMENTS",
34
+ // M82 Competition Mode (opt-in): if the user passed `--competition N` in
35
+ // $ARGUMENTS (N in 2..5), set competition: N. N parallel Self-MoA producers
36
+ // propose partitions; the OBJECTIVE oracle judge (file-disjointness scoring)
37
+ // picks the most-parallelizable valid decomposition. Omit / set 1 = off.
38
+ competition: 1
34
39
  }
35
40
  }
36
41
  ```
37
42
 
43
+ **Competition Mode (`--competition N`).** Partition is the v1 beachhead for generate-and-judge: its judge is the file-disjointness oracle, so it is a calculator, not a biased critic. If the user invokes `/gsd-t-partition --competition 3`, parse N (clamped 2..5) and pass `competition: N`. The workflow fans out N candidate partitions, scores each on measured parallelism / wave-depth / boundary-cleanliness, and finalizes the winner. See `.gsd-t/contracts/competition-mode-contract.md`. Default off (single producer).
44
+
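The selection step can be pictured as a plain lexicographic sort over already-measured metrics. The sketch below follows the ordering used in `bin/gsd-t-competition-judge.cjs` (parallelGroups descending, then waveDepth, unprovableCount, and domainCount ascending, with invalid candidates kept but unable to win). The function name `rankCandidates` and the sample metrics are invented for illustration.

```javascript
// Sample metrics are invented; field names follow the judge's output envelope.
function rankCandidates(scored) {
  const cmp = (a, b) =>
    b.parallelGroups - a.parallelGroups ||   // more concurrency first
    a.waveDepth - b.waveDepth ||             // then fewer serial gates
    a.unprovableCount - b.unprovableCount || // then fewer domains without a touch list
    a.domainCount - b.domainCount;           // then the simpler decomposition
  const valid = scored.filter((s) => s.valid).sort(cmp);
  const invalid = scored.filter((s) => !s.valid); // listed last, never the winner
  return {
    winner: valid.length ? valid[0].id : null,
    order: [...valid, ...invalid].map((s) => s.id),
  };
}

const result = rankCandidates([
  { id: "baseline", valid: true, parallelGroups: 1, waveDepth: 2, unprovableCount: 0, domainCount: 2 },
  { id: "overlap", valid: false, parallelGroups: 4, waveDepth: 2, unprovableCount: 0, domainCount: 5 },
  { id: "wide", valid: true, parallelGroups: 3, waveDepth: 1, unprovableCount: 0, domainCount: 3 },
]);
console.log(result);
```

Note that `overlap` has the highest raw parallelism yet cannot win: validity is a hard gate applied before the comparator, not one more term inside it.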
38
45
  ## Step 3: Interpret the result
39
46
 
40
- The Workflow returns `{ status, artifacts, summary, decisions }`.
47
+ The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
41
48
 
42
49
  - `status === "complete"`: domains scoped, contracts drafted. Auto-advance to `/gsd-t-plan`.
43
50
  - `status === "partial" | "blocked"`: read `summary` for what's missing (e.g. ambiguous scope needing discussion).
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@tekyzinc/gsd-t",
3
- "version": "4.0.28",
3
+ "version": "4.1.10",
4
4
  "description": "GSD-T: Contract-Driven Development for Claude Code — 54 slash commands with headless-by-default workflow spawning, unattended supervisor relay with event stream, graph-powered code analysis, real-time agent dashboard, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
5
5
  "author": "Tekyz, Inc.",
6
6
  "license": "MIT",
@@ -328,10 +328,10 @@ Canonical scripts:
  - `gsd-t-integrate.workflow.js` — cross-domain wire-up + light verify-gate
  - `gsd-t-debug.workflow.js` — 2-cycle diagnose/fix/verify (CLAUDE.md Prime Rule)
  - `gsd-t-quick.workflow.js` — preflight + brief + single-task + verify-gate (M56-D4)
- - `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple)
+ - `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple). **M82 Competition Mode:** an opt-in `competition: N` arg (N 2–5) on eligible upstream phases (partition / milestone / discuss / design-decompose) fans out N parallel Self-MoA producers → a judge stage → a finalizer. Partition's judge is the OBJECTIVE file-disjointness oracle (`gsd-t competition-judge --kind partition` — a calculator, not an LLM critic, immune to judge bias, the v1 beachhead); subjective phases use a blind + shuffled + different-model + rubric judge whose pick is finalized deterministically by `--kind generic`. The generative dual of the orthogonal validation triad; watershed rule = generate-and-judge ABOVE the contract, attack-and-filter BELOW. Default off. Contract: `competition-mode-contract.md` v1.0.0.
  - `gsd-t-scan.workflow.js` — preflight → volume-probe → pipeline(per-slice deep finder → single verify) → synthesis → document → render (M66: fans out by codebase VOLUME, not a fixed 5-teammate dimension count; M67: deep document phase deterministically produces the full living-doc set + dimension files, per-doc fan-out)
 
- Shared helpers: `templates/workflows/_lib.js` — `runPreflight`, `generateBrief`, `proveFileDisjointness`, `runVerifyGate`, `loadProtocol`, `readDomainTasks`, `readScope`. Each prefers project-local `bin/<tool>.cjs` and falls back to global `gsd-t` PATH binary (preserves M55-D5 project-local-bin invariant).
+ **Runtime-native invariant (M81 — v4.0.29+):** the Workflow sandbox provides ONLY `agent/parallel/pipeline/log/phase/budget/args` — NO `require`/`fs`/`path`/`child_process`/`process`, and `args` arrives as a JSON STRING. Each workflow is self-contained: it `JSON.parse`s `args` and delegates every CLI call (preflight, verify-gate, brief, build-coverage, ci-parity, test-data, disjointness) to inline `async` helpers that run the command via an `agent()`'s Bash (preferring project-local `bin/<tool>.cjs`, else the global `gsd-t` PATH binary) and parse the JSON envelope — preserving the M55-D5 project-local-bin invariant. The old `require("./_lib.js")` pattern threw `ReferenceError` on first eval and silently broke every workflow except scan (TD-113, fixed M81); `_lib.js` is retired as a workflow dependency.
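The file-disjointness check that the `gsd-t-phase.workflow.js` entry above names as partition's judge can be illustrated with a toy version. This is not the code in `bin/gsd-t-file-disjointness.cjs`. The normalization rules follow the list in the 4.1.10 changelog entry (`./a` ≡ `a`, `//`, backslashes, trailing slash, dedupe, case preserved); the function names, the input shape, and the choice to count only exact path matches as overlap are assumptions.

```javascript
// Normalize a touch path: backslashes become "/", repeated slashes collapse,
// a leading "./" and a trailing "/" are dropped. Case is preserved.
function normalizeTouchPath(p) {
  let s = String(p).replace(/\\/g, "/").replace(/\/{2,}/g, "/");
  while (s.startsWith("./")) s = s.slice(2);
  if (s.length > 1 && s.endsWith("/")) s = s.slice(0, -1);
  return s;
}

// Hypothetical overlap finder. `domains` maps a domain name to the paths it
// touches. Assumption: two domains conflict only when they list the same
// normalized path (directory containment is not considered here).
function findOverlaps(domains) {
  const owners = new Map();
  for (const [name, paths] of Object.entries(domains)) {
    // A Set dedupes paths that normalize to the same string within one domain.
    for (const p of new Set(paths.map(normalizeTouchPath))) {
      if (!owners.has(p)) owners.set(p, []);
      owners.get(p).push(name);
    }
  }
  return [...owners]
    .filter(([, names]) => names.length > 1)
    .map(([path, names]) => ({ path, domains: names }));
}

const overlaps = findOverlaps({
  ui: ["./src/a.js", "src/a.js"],
  api: ["src\\a.js", "src/b.js"],
});
console.log(overlaps); // one conflict: src/a.js is claimed by both ui and api
```

A candidate partition with an empty overlap list would be file-disjoint under these assumptions. How the real judge turns that into parallelGroups, waveDepth, and validity scores is not shown in this diff.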
 
  ## Preflight Gate (KEPT from M55)
 
@@ -16,10 +16,39 @@ export const meta = {
  ],
  };
 
- const lib = require("./_lib.js");
+ // M81: runtime-native helpers (sandbox bans require/fs/child_process/process — the old
+ // require("./_lib.js") crashed this workflow on first eval, TD-113). Delegate CLI calls
+ // to an agent's Bash; args arrives as a JSON STRING in this runtime. See gsd-t-scan.workflow.js.
+ const _args = (typeof args === "string") ? (() => { try { return JSON.parse(args); } catch { return {}; } })() : (args || {});
+ const _CLI_ENVELOPE_SCHEMA = {
+ type: "object", required: ["ok", "exitCode"], additionalProperties: true,
+ properties: { ok: { type: "boolean" }, exitCode: { type: "integer" }, envelope: {}, stdout: { type: "string" }, stderr: { type: "string" }, via: { type: "string" } },
+ };
+ async function runCli(projectDir, subcmd, argv, localBin, label, parseJson = true, phaseName) {
+ const argStr = (argv || []).map((a) => `'${String(a).replace(/'/g, "'\\''")}'`).join(" ");
+ const prompt = [
+ `Run a GSD-T CLI command for the project at \`${projectDir}\` and report the result. Steps:`,
+ `1. If \`${projectDir}/bin/${localBin}\` exists, run: \`node ${projectDir}/bin/${localBin} ${argStr}\` (set via="local"). Otherwise run: \`gsd-t ${subcmd} ${argStr}\` (set via="global"). Use cwd \`${projectDir}\`.`,
+ `2. Capture exit code (ok = exitCode 0) and stdout/stderr.`,
+ parseJson ? `3. Parse stdout as JSON into \`envelope\` (null if not JSON). Return JSON per the schema.` : `3. Put stdout (trimmed, ≤4000 chars) in \`stdout\`. Return JSON per the schema.`,
+ `Do NOT do any other work. ONLY run this one command and report.`,
+ ].join("\n");
+ const opts = { label, schema: _CLI_ENVELOPE_SCHEMA, model: "haiku" };
+ if (phaseName) opts.phase = phaseName;
+ const r = await agent(prompt, opts).catch((e) => ({ ok: false, exitCode: -1, envelope: null, stderr: String(e && e.message), via: "error" }));
+ return r || { ok: false, exitCode: -1, envelope: null, via: "error" };
+ }
+ async function runPreflight(projectDir, label = "preflight", phaseName) { return runCli(projectDir, "preflight", ["--json"], "cli-preflight.cjs", label, true, phaseName); }
+ async function generateBrief(projectDir, { kind = "execute", milestone, domain, id, label = "brief", phaseName } = {}) {
+ const argv = ["--kind", kind, "--spawn-id", id, "--out", `${projectDir}/.gsd-t/briefs/${id}.json`];
+ if (milestone) argv.push("--milestone", milestone);
+ if (domain) argv.push("--domain", domain);
+ const r = await runCli(projectDir, "brief", argv, "gsd-t-context-brief.cjs", label, false, phaseName);
+ return { ok: r.ok, briefPath: `${projectDir}/.gsd-t/briefs/${id}.json`, via: r.via };
+ }
 
- const projectDir = (args && args.projectDir) || ".";
- const symptom = (args && args.symptom) || null;
+ const projectDir = _args.projectDir || ".";
+ const symptom = _args.symptom || null;
 
  const DEBUG_CYCLE_SCHEMA = {
  type: "object",
@@ -42,9 +71,9 @@ if (!symptom) {
  }
 
  phase("Preflight");
- const pre = lib.runPreflight({ projectDir });
+ const pre = await runPreflight(projectDir);
  if (!pre.ok) return { status: "failed", reason: "preflight-failed", preflight: pre.envelope };
- const brief = lib.generateBrief({ kind: "execute", projectDir });
+ const brief = await generateBrief(projectDir, { kind: "execute", id: "debug-brief" });
 
  let lastResult = null;
  for (let cycle = 1; cycle <= 2; cycle++) {