npm - @tekyzinc/gsd-t - Versions diffs - 4.0.28 → 4.1.10 - Mend

@tekyzinc/gsd-t 4.0.28 → 4.1.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

package/CHANGELOG.md +35 -0
package/README.md +3 -0
package/bin/gsd-t-competition-judge.cjs +344 -0
package/bin/gsd-t.js +16 -0
package/commands/gsd-t-design-decompose.md +9 -2
package/commands/gsd-t-help.md +8 -0
package/commands/gsd-t-milestone.md +9 -2
package/commands/gsd-t-partition.md +9 -2
package/package.json +1 -1
package/templates/CLAUDE-global.md +2 -2
package/templates/workflows/gsd-t-debug.workflow.js +34 -5
package/templates/workflows/gsd-t-execute.workflow.js +54 -29
package/templates/workflows/gsd-t-integrate.workflow.js +37 -7
package/templates/workflows/gsd-t-phase.workflow.js +368 -25
package/templates/workflows/gsd-t-quick.workflow.js +59 -7
package/templates/workflows/gsd-t-verify.workflow.js +67 -47
package/templates/workflows/gsd-t-wave.workflow.js +7 -4

package/CHANGELOG.md CHANGED

@@ -2,6 +2,41 @@
 All notable changes to GSD-T are documented here. Updated with each release.
+## [4.1.10] - 2026-06-05 (M82 Competition Mode - minor)
+### Added - Competition Mode: generate-and-judge for upstream, pre-contract phases
+The *generative* dual of the orthogonal validation triad. The triad is adversarial (many critics, one candidate → a filter); Competition Mode is generative (many candidates, one judge → a generator). GSD-T historically filtered hard but **generated singly** — every upstream artifact was a single draft. Competition Mode adds the missing generator on the phases where it pays. **Watershed rule:** generate-and-judge ABOVE the contract; attack-and-filter BELOW it.
+- **Opt-in `--competition N`** (N clamped 2–5; default off) on eligible upstream phases: `partition`, `milestone`, `discuss`, `design-decompose`. Ignored (single producer, logged) on ineligible phases (plan/impact/prd/doc-ripple) and impossible on post-contract phases (execute/verify/…).
+- **Producers = Self-MoA** — N samples of ONE strong model (opus), diversified by prompt *angle* (max-parallelism / simplicity / risk-isolation / dependency-depth / balance), not by a model zoo. Evidence (Self-MoA, arXiv 2502.00674): aggregation is far more sensitive to candidate quality than diversity; mixing models injects low-quality candidates. No debate — producers stay independent.
+- **Objective judge for partition (the v1 beachhead)** — `bin/gsd-t-competition-judge.cjs --kind partition` scores candidate decompositions via the SAME file-disjointness oracle the dispatcher uses (`bin/gsd-t-file-disjointness.cjs`): parallelGroups / waveDepth / validity. A calculator, not an LLM critic → immune to position/verbosity/self-preference bias. Touch paths normalized (`./a` ≡ `a`, `//`, backslashes, trailing slash, dedupe; case preserved).
+- **Subjective judge for milestone/discuss/design** — blind + deterministically-shuffled + different-model (sonnet) + rubric-scored; the winner is finalized deterministically by `--kind generic` (highest weighted score; reproducible tiebreak; zero inference in the substrate).
+- **Two-gate selection policy** (synthesize only when candidate-quality-uniform AND artifact-is-list-shaped; else pick-one) + three artifact classes (coupled-thesis → pick-one; line-items → union/dedup; structurally-validated → synthesize+re-validate). The finalizer does pick-one-at-thesis + union-at-line-item-level, then partition re-validates the graft via the oracle and BLOCKS on a reintroduced overlap.
+- **New CLI**: `gsd-t competition-judge [--in SPEC.json] [--project-dir P]` (exit 0 winner / 4 no valid candidate / 64 bad input). Added to project + global bin tools.
+- **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE (6 invariants).
+- **Verification**: orthogonal triad ran. Adversarial Workflow Red Team (Opus, fresh context) FAILed first pass (3 HIGH + 2 MEDIUM), all fixed, re-validation Red Team GRUDGING-PASS (all 5 fixed, no new HIGH/CRITICAL). Real-sandbox acceptance gate passed (judge integration ran end-to-end in the Workflow sandbox). Suite 1357/0/4 (+6 M82 tests). **SC#1 measured on M82's own partition: competition (3 producers) → 3 parallel groups vs N=1 baseline's 1 (3× parallelism), invalid overlap candidate correctly disqualified.** SC#3 position-bias probe: order-invariant winner (100%).
+- Origin: brainstorm 2026-06-05 grounded in 2 deep-research runs (best-of-N/judge/debate + synthesis-vs-pick-one/MoA/Frankenstein).
+### Versioning
+Minor bump 4.0.29 → 4.1.10 (new feature, additive; patch reset to 10).
+## [4.0.29] - 2026-06-05 (M81 Workflows Runtime-Native - patch)
+### Fixed - TD-113: 6 of 7 workflows (+ quick) crashed in the Workflow sandbox and had never run
+The GSD-T self-scan (Scan #12) and a live NiceNote session both confirmed it: every `*.workflow.js` except `gsd-t-scan` opened with `require("./_lib.js")`, which the Anthropic Workflow sandbox forbids (it provides only `agent/parallel/pipeline/log/phase/budget/args` — no `require`/`fs`/`path`/`child_process`/`process`). Each threw `ReferenceError` on first eval, so the entire orchestration layer — `execute`, `verify`, `wave`, `integrate`, `debug`, `phase`, `quick` — silently fell back to hand-driven runs and never actually executed as workflows.
+Ported all 7 to the runtime-native pattern proven on scan in M71/M80: inline `async` helpers that delegate each CLI call (preflight, verify-gate, brief, build-coverage, ci-parity, test-data, parallel/disjointness) to an `agent()`'s Bash — preferring project-local `bin/<tool>.cjs`, falling back to the global `gsd-t` PATH binary — and parse the JSON envelope. `args` is now `JSON.parse`d (it arrives stringified). File reads moved into the agents that have `Read` (worker reads its own scope.md/tasks.md; triad agents read their own protocol from `templates/prompts/`). `verify`'s raw `spawnSync`/`require` CI-parity block and `Date.now()` run-id were replaced; the M57/M58 FAIL-blocking semantics are unchanged.
+- `templates/workflows/gsd-t-{execute,verify,wave,integrate,debug,phase,quick}.workflow.js`: runtime-native port.
+- `test/m71-workflow-runtime-native-lint.test.js`: lint now covers all 8 workflows (was scan-only).
+- `test/m81-workflows-runtime-native.test.js`: structural invariants (no `_lib` require, args-string parse, no `spawnSync`/`Date.now`/`Math.random` in orchestrator, FAIL-blocking gates preserved).
+- `CLAUDE.md`, `~/.claude/CLAUDE.md` + `templates/CLAUDE-global.md`: documented the runtime-native invariant; retired `_lib.js` as a workflow dependency.
+Proven in the REAL sandbox: `quick` ran end-to-end (verify-gate PASS), `verify` evaluated through its CLI delegations returning a real verify-gate envelope, `execute` evaluated cleanly to its arg-guard — all with zero ReferenceError. Suite 1341/1341 pass.
 ## [4.0.28] - 2026-06-04 (M80 Scan Document-Phase Fixes - patch)
 ### Fixed - scan workflow crashed at the document phase, then shipped a truncated plain-English doc
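
The 4.1.10 entry above says touch paths are normalized before the disjointness check, so that two spellings of one file count as a conflict. The following is a minimal sketch of that idea, not the package's implementation: `normalizeTouch` and `findOverlaps` are hypothetical names, and only the rules listed in the changelog entry are applied.

```javascript
"use strict";

// Hypothetical helper: applies the normalization rules listed in the changelog
// (backslashes, repeated slashes, leading "./", trailing slash; case preserved).
function normalizeTouch(p) {
  if (typeof p !== "string") return "";
  return p
    .trim()
    .replace(/\\/g, "/")   // backslashes -> forward slashes
    .replace(/\/+/g, "/")  // collapse repeated slashes
    .replace(/^\.\//, "")  // drop a leading "./"
    .replace(/\/+$/, "");  // drop trailing slashes
}

// Hypothetical helper: returns normalized paths claimed by more than one domain.
function findOverlaps(domains) {
  const owners = new Map(); // normalized path -> Set of domain names
  for (const d of domains) {
    for (const raw of d.touches) {
      const key = normalizeTouch(raw);
      if (!key) continue;
      if (!owners.has(key)) owners.set(key, new Set());
      owners.get(key).add(d.name);
    }
  }
  return [...owners].filter(([, names]) => names.size > 1).map(([file]) => file);
}

const overlaps = findOverlaps([
  { name: "api", touches: ["./bin/x.js", "src/api.js"] },
  { name: "cli", touches: ["bin\\x.js", "src/cli.js"] },
  { name: "docs", touches: ["README.md", "readme.md"] },
]);
console.log(overlaps); // [ 'bin/x.js' ]
```

Because case is preserved, `README.md` and `readme.md` stay distinct, which matches the changelog's "case preserved" note.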

package/README.md CHANGED

@@ -122,8 +122,11 @@ gsd-t build-coverage --json                             # M57: new top-level pat
 gsd-t ci-parity --json                                  # M57: reproduce the project's actual CI build locally (auto docker build)
 gsd-t test-data --list [--run ID] [--json]              # M58: list test-data ledger entries
 gsd-t test-data --purge --run ID [--dry-run] [--json]   # M58: purge tagged test data after Verify (Step 4.5)
+gsd-t competition-judge --in SPEC.json [--project-dir P] # M82: generate-and-judge selection oracle (partition / generic)
 ```
+**Competition Mode (M82).** Opt-in `--competition N` (N 2–5) on upstream, pre-contract phases (`/gsd-t-partition`, `/gsd-t-milestone`, `/gsd-t-design-decompose`) fans out N parallel candidate producers and a judge selects the winner — the generative dual of the orthogonal validation triad. Partition uses an *objective* file-disjointness oracle as the judge (a calculator, not a biased critic); subjective phases use a blind + different-model + rubric judge. Default off. See `.gsd-t/contracts/competition-mode-contract.md`.
 `gsd-t parallel` consumes the M44 task-graph (D1) and applies three pre-spawn gates (D4 depgraph validation → D5 file-disjointness → D6 economics) followed by mode-aware headroom/split math. Extends — does not replace — the M40 orchestrator. Contract: `.gsd-t/contracts/wave-join-contract.md` v1.1.0.
 Each iteration runs as a fresh `claude -p` session. A cumulative debug ledger (`.gsd-t/debug-state.jsonl`) preserves hypothesis/fix/learning history across sessions. An anti-repetition preamble prevents retrying failed approaches.
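
The Competition Mode paragraph above describes the partition judge as a calculator that ranks candidates on measured metrics. As a rough illustration of that ordering (invalid candidates cannot win; valid ones are ordered by parallel groups, then wave depth, then unprovable count, then domain count), here is a small sketch. The candidate metrics are made-up sample data and the variable names are assumptions.

```javascript
"use strict";

// Hypothetical metrics for three candidate partitions.
const candidates = [
  { id: "A", valid: true, parallelGroups: 3, waveDepth: 2, unprovableCount: 1, domainCount: 4 },
  { id: "B", valid: true, parallelGroups: 3, waveDepth: 1, unprovableCount: 0, domainCount: 3 },
  { id: "C", valid: false, parallelGroups: 4, waveDepth: 2, unprovableCount: 0, domainCount: 6 },
];

// More parallel groups first, then fewer waves, fewer unknowns, fewer domains.
const byPreference = (a, b) =>
  b.parallelGroups - a.parallelGroups ||
  a.waveDepth - b.waveDepth ||
  a.unprovableCount - b.unprovableCount ||
  a.domainCount - b.domainCount;

const valid = candidates.filter((c) => c.valid).sort(byPreference);
const invalid = candidates.filter((c) => !c.valid);
const ranked = [...valid, ...invalid].map((c, i) => ({ id: c.id, rank: i + 1 }));
const winner = valid.length ? valid[0].id : null;

console.log(winner, ranked.map((r) => r.id)); // B [ 'B', 'A', 'C' ]
```

Candidate C reports the most parallel groups but is invalid, so it is listed last and cannot win.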

package/bin/gsd-t-competition-judge.cjs ADDED

@@ -0,0 +1,344 @@
+"use strict";
+/**
+ * gsd-t-competition-judge — M82 D1
+ *
+ * The selection oracle for Competition Mode (generate-and-judge on upstream,
+ * pre-contract phases). Given N candidate artifacts produced by parallel
+ * producers, score them and emit a winner — the GENERATIVE dual of the
+ * orthogonal validation triad (which is adversarial: many critics, one
+ * candidate). Contract: .gsd-t/contracts/competition-mode-contract.md v1.0.0.
+ *
+ * Two judge modes, chosen by `--kind`:
+ *
+ *   --kind partition  → OBJECTIVE judge (the v1 beachhead). Each candidate is a
+ *       proposed domain decomposition: a list of domains, each with a write-
+ *       touch list. We score it with the SAME disjointness oracle the real
+ *       parallel dispatcher uses (bin/gsd-t-file-disjointness.cjs), so the judge
+ *       is a CALCULATOR, not a critic — it sidesteps every LLM-judge bias
+ *       (position / verbosity / self-preference). Metrics, higher-is-better
+ *       unless noted:
+ *         - valid             : zero write-target overlaps across domains (HARD gate)
+ *         - parallelGroups    : count of disjoint domains that can fan out at once
+ *         - waveDepth         : serial gates (sequential groups + 1 if any) — LOWER better
+ *         - unprovableCount   : domains with no touch list — LOWER better (safe-default seq)
+ *       Ranking: invalid candidates are disqualified; among valid ones, rank by
+ *       (parallelGroups desc, waveDepth asc, unprovableCount asc, domainCount asc).
+ *
+ *   --kind generic   → records a SUBJECTIVE judge's verdict. The numeric scoring
+ *       lives in the rubric the Workflow's judge agent fills in (blind+shuffled,
+ *       different-model, rubric-scored — see the contract). This CLI only
+ *       validates/normalizes the rubric scores the agent supplies and picks the
+ *       winner deterministically (highest weighted score; ties → lowest index of
+ *       the ORIGINAL, pre-shuffle order to keep selection reproducible). It does
+ *       NOT call an LLM — keeping inference out of the deterministic substrate
+ *       (per feedback_deterministic_orchestration + anthropic-key-measurement-only).
+ *
+ * Input: a JSON spec on stdin OR via --in <path>. Shapes:
+ *
+ *   partition: {
+ *     "kind": "partition",
+ *     "candidates": [
+ *       { "id": "A", "domains": [ { "name": "d1", "touches": ["a.js","b.js"] }, ... ] },
+ *       ...
+ *     ]
+ *   }
+ *
+ *   generic: {
+ *     "kind": "generic",
+ *     "axes": [ { "key": "coherence", "weight": 1 }, { "key": "completeness", "weight": 1 }, ... ],
+ *     "candidates": [
+ *       { "id": "A", "scores": { "coherence": 4, "completeness": 3, ... } },
+ *       ...
+ *     ]
+ *   }
+ *
+ * Output (JSON envelope, the shape runCli parses):
+ *   {
+ *     ok: boolean,            // true unless input was unusable
+ *     exitCode: 0 | 4 | 64,
+ *     kind, n,
+ *     winner: <candidateId|null>,
+ *     ranked: [ { id, valid?, parallelGroups?, waveDepth?, unprovableCount?, score?, rank } ],
+ *     reason?: string
+ *   }
+ *
+ * Exit codes: 0 ok+winner · 4 ok but NO valid candidate (all disqualified) · 64 bad input.
+ *
+ * Hard rules (mirrors the disjointness prover's discipline):
+ *   - Zero external runtime deps (Node built-ins only).
+ *   - Never throws — always emits an envelope.
+ *   - Pure / read-only — no project mutation. Deterministic given the same input.
+ */
+const fs = require("node:fs");
+// The objective partition judge reuses the production disjointness oracle so the
+// judge's notion of "parallelizable" is byte-identical to the dispatcher's.
+let proveDisjointness;
+try {
+  ({ proveDisjointness } = require("./gsd-t-file-disjointness.cjs"));
+} catch {
+  proveDisjointness = null;
+}
+// ─── Partition scoring (objective) ───────────────────────────────────────
+/**
+ * Score one candidate partition by running its domains through the disjointness
+ * oracle. Each domain becomes a pseudo-task {id, domain, touches}; we never hit
+ * git history (every domain carries an explicit touch list or is counted
+ * unprovable), so scoring is pure and deterministic.
+ *
+ * @returns {{valid, domainCount, parallelGroups, sequentialGroups, unprovableCount, waveDepth}}
+ */
+// Normalize a touch path to a stable file identity so two spellings of the SAME
+// file (./bin/x.js vs bin/x.js, trailing slash, backslashes, redundant ./ or //)
+// are detected as a conflict. Without this, an overlapping partition could be
+// scored `valid` and WIN — then the real dispatcher would hit a write conflict.
+// Note: case is preserved (most CI runs on case-sensitive Linux); collapsing case
+// here would create false conflicts on case-sensitive repos. Path identity only.
+function _normPath(p) {
+  if (typeof p !== "string") return "";
+  let s = p.trim().replace(/\\/g, "/");        // backslashes -> forward
+  s = s.replace(/\/+/g, "/");                    // collapse repeated slashes
+  s = s.replace(/^\.\//, "");                    // drop leading ./
+  while (s.includes("/./")) s = s.replace("/./", "/"); // drop interior /./
+  s = s.replace(/\/+$/, "");                      // drop trailing slash
+  return s;
+}
+function scorePartition(candidate, projectDir) {
+  const domains = Array.isArray(candidate.domains) ? candidate.domains : [];
+  const tasks = domains.map((d, i) => ({
+    id: `${candidate.id}:${d.name || `d${i}`}`,
+    domain: d.name || `d${i}`,
+    // Only honor an explicit touch list — never let the oracle fall through to
+    // git history during scoring (would make the judge non-deterministic).
+    // Normalize + de-dupe so path-spelling variants are caught as real conflicts.
+    touches: Array.isArray(d.touches)
+      ? Array.from(new Set(d.touches.map(_normPath).filter(Boolean)))
+      : [],
+  }));
+  // Run the real oracle when available; otherwise fall back to a self-contained
+  // overlap check so the judge still works if the lib isn't co-located.
+  const res = proveDisjointness
+    ? proveDisjointness({ tasks, projectDir })
+    : _localDisjoint(tasks);
+  const parallelGroups = (res.parallel || []).length;
+  const sequentialGroups = (res.sequential || []).filter(
+    (g) => !(g.length === 1 && (res.unprovable || []).includes(g[0])),
+  ).length;
+  const unprovableCount = (res.unprovable || []).length;
+  // VALID = no two domains with declared touch lists write the same file. An
+  // overlap shows up as a sequential group of size ≥2 among provable tasks.
+  const overlapGroup = (res.sequential || []).some((g) => g.length >= 2);
+  const valid = !overlapGroup;
+  // waveDepth: 1 wave for the disjoint fan-out, +1 per serial bottleneck
+  // (overlapping/unprovable domains that must run after). Fewer = better.
+  const serialBottlenecks = sequentialGroups + unprovableCount;
+  const waveDepth = (parallelGroups > 0 ? 1 : 0) + (serialBottlenecks > 0 ? 1 : 0) || 1;
+  return {
+    valid,
+    domainCount: domains.length,
+    parallelGroups,
+    sequentialGroups,
+    unprovableCount,
+    waveDepth,
+  };
+}
+// Self-contained overlap fallback (only used if the oracle lib is absent).
+function _localDisjoint(tasks) {
+  const parallel = [];
+  const sequential = [];
+  const unprovable = [];
+  const provable = [];
+  for (const t of tasks) {
+    if (!t.touches || t.touches.length === 0) {
+      unprovable.push(t);
+      sequential.push([t]);
+    } else {
+      provable.push(t);
+    }
+  }
+  // union-find over file overlap
+  const parent = provable.map((_, i) => i);
+  const find = (i) => {
+    while (parent[i] !== i) { parent[i] = parent[parent[i]]; i = parent[i]; }
+    return i;
+  };
+  for (let i = 0; i < provable.length; i++) {
+    for (let j = i + 1; j < provable.length; j++) {
+      const a = new Set(provable[i].touches);
+      if (provable[j].touches.some((f) => a.has(f))) {
+        const ra = find(i), rb = find(j);
+        if (ra !== rb) parent[ra] = rb;
+      }
+    }
+  }
+  const groups = new Map();
+  for (let i = 0; i < provable.length; i++) {
+    const r = find(i);
+    if (!groups.has(r)) groups.set(r, []);
+    groups.get(r).push(provable[i]);
+  }
+  for (const g of groups.values()) (g.length === 1 ? parallel : sequential).push(g);
+  return { parallel, sequential, unprovable };
+}
+// Drop candidates that are not usable objects with a string id (Red Team MED-4:
+// the 'never throws' guarantee is on the function, not just the CLI shell — an
+// in-process caller passing [null] or {id:{}} must not crash, and a non-string id
+// could never match `c.id === winnerId` in the workflow anyway).
+function _safeCandidates(candidates) {
+  return (Array.isArray(candidates) ? candidates : []).filter(
+    (c) => c && typeof c === "object" && typeof c.id === "string" && c.id.length > 0,
+  );
+}
+function rankPartitions(rawCandidates, projectDir) {
+  const candidates = _safeCandidates(rawCandidates);
+  const scored = candidates.map((c) => ({ id: c.id, ...scorePartition(c, projectDir) }));
+  // Disqualify invalid (file-overlap) candidates from winning, but keep them in
+  // the ranking so the caller can see why they lost.
+  const valid = scored.filter((s) => s.valid);
+  const cmp = (a, b) =>
+    b.parallelGroups - a.parallelGroups ||      // more concurrency wins
+    a.waveDepth - b.waveDepth ||                 // fewer serial gates wins
+    a.unprovableCount - b.unprovableCount ||     // fewer unknowns wins
+    a.domainCount - b.domainCount;               // simpler (fewer domains) wins
+  valid.sort(cmp);
+  const invalid = scored.filter((s) => !s.valid);
+  const ordered = [...valid, ...invalid];
+  ordered.forEach((s, i) => { s.rank = i + 1; });
+  return { ranked: ordered, winner: valid.length ? valid[0].id : null };
+}
+// ─── Generic scoring (subjective rubric, deterministic selection) ────────
+function rankGeneric(spec) {
+  const axes = Array.isArray(spec.axes) && spec.axes.length
+    ? spec.axes
+    : [{ key: "quality", weight: 1 }];
+  const candidates = _safeCandidates(spec.candidates);
+  const scored = candidates.map((c, idx) => {
+    const scores = c.scores || {};
+    let total = 0;
+    let weightSum = 0;
+    for (const ax of axes) {
+      const w = Number(ax.weight) || 0;
+      const v = Number(scores[ax.key]) || 0;
+      total += w * v;
+      weightSum += w;
+    }
+    const score = weightSum > 0 ? total / weightSum : 0;
+    return { id: c.id, score: Number(score.toFixed(4)), _idx: idx };
+  });
+  // Highest weighted score wins; ties broken by ORIGINAL index (reproducible,
+  // immune to candidate-order shuffling done for bias control upstream).
+  scored.sort((a, b) => b.score - a.score || a._idx - b._idx);
+  scored.forEach((s, i) => { s.rank = i + 1; delete s._idx; });
+  return { ranked: scored, winner: scored.length ? scored[0].id : null };
+}
+// ─── Driver ──────────────────────────────────────────────────────────────
+function judge(spec, projectDir) {
+  const candidates = Array.isArray(spec && spec.candidates) ? spec.candidates : [];
+  if (!candidates.length) {
+    return { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: "no-candidates" };
+  }
+  const kind = spec.kind === "generic" ? "generic" : "partition";
+  const { ranked, winner } = kind === "partition"
+    ? rankPartitions(candidates, projectDir)
+    : rankGeneric(spec);
+  const ok = winner != null;
+  return {
+    ok,
+    exitCode: ok ? 0 : 4,
+    kind,
+    n: candidates.length,
+    winner,
+    ranked,
+    ...(ok ? {} : { reason: kind === "partition" ? "no-valid-candidate" : "no-candidates" }),
+  };
+}
+function readInput(opts) {
+  if (opts.in) return fs.readFileSync(opts.in, "utf8");
+  // stdin
+  try {
+    return fs.readFileSync(0, "utf8");
+  } catch {
+    return "";
+  }
+}
+function parseArgs(argv) {
+  const opts = { json: true, in: null, projectDir: process.cwd(), help: false };
+  for (let i = 0; i < argv.length; i++) {
+    const a = argv[i];
+    if (a === "--help" || a === "-h") opts.help = true;
+    else if (a === "--in") opts.in = argv[++i];
+    else if (a === "--project-dir") opts.projectDir = argv[++i];
+    else if (a === "--json") opts.json = true;
+  }
+  return opts;
+}
+const HELP = `Usage: gsd-t competition-judge [--in PATH] [--project-dir PATH]
+Reads a candidate-set JSON spec (stdin or --in) and emits a ranked winner.
+  --in PATH          Read spec from file instead of stdin.
+  --project-dir PATH Project root (default: cwd).
+  --json             Emit JSON envelope (default; always on).
+Spec.kind:
+  "partition"  Objective oracle judge — scores domain decompositions via the
+               file-disjointness prover (parallelGroups / waveDepth / validity).
+  "generic"    Deterministic rubric selector — picks the highest weighted score
+               from rubric values an upstream judge agent supplied.
+Exit codes: 0 winner · 4 no valid candidate · 64 bad input.`;
+function main() {
+  const opts = parseArgs(process.argv.slice(2));
+  if (opts.help) {
+    process.stdout.write(HELP + "\n");
+    process.exit(0);
+  }
+  let spec;
+  try {
+    const raw = readInput(opts);
+    spec = JSON.parse(raw);
+  } catch (e) {
+    const env = { ok: false, exitCode: 64, kind: null, n: 0, winner: null, ranked: [], reason: `bad-input: ${e && e.message}` };
+    process.stdout.write(JSON.stringify(env, null, 2) + "\n");
+    process.exit(64);
+  }
+  let result;
+  try {
+    result = judge(spec, opts.projectDir);
+  } catch (e) {
+    result = { ok: false, exitCode: 64, kind: spec && spec.kind, n: 0, winner: null, ranked: [], reason: `judge-error: ${e && e.message}` };
+  }
+  process.stdout.write(JSON.stringify(result, null, 2) + "\n");
+  process.exit(result.exitCode);
+}
+if (require.main === module) main();
+module.exports = {
+  judge,
+  scorePartition,
+  rankPartitions,
+  rankGeneric,
+  _internal: { _localDisjoint, _normPath },
+};
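
The generic selector in the file above reduces each candidate's rubric scores to a weighted mean and breaks ties by original index. A worked example of that arithmetic may help. This is a standalone sketch: `pickByRubric` is a hypothetical name and the axes and scores are invented sample values.

```javascript
"use strict";

// Hypothetical restatement of the selector's arithmetic: weighted mean per
// candidate, ties resolved by original (pre-shuffle) index.
function pickByRubric(axes, candidates) {
  const scored = candidates.map((c, idx) => {
    let total = 0;
    let weightSum = 0;
    for (const { key, weight } of axes) {
      total += weight * (Number(c.scores[key]) || 0);
      weightSum += weight;
    }
    return { id: c.id, score: weightSum > 0 ? total / weightSum : 0, idx };
  });
  scored.sort((a, b) => b.score - a.score || a.idx - b.idx);
  return scored;
}

const axes = [
  { key: "coherence", weight: 2 },
  { key: "completeness", weight: 1 },
];
const rubricRanked = pickByRubric(axes, [
  { id: "A", scores: { coherence: 4, completeness: 3 } }, // (2*4 + 1*3) / 3
  { id: "B", scores: { coherence: 3, completeness: 5 } }, // (2*3 + 1*5) / 3
  { id: "C", scores: { coherence: 5, completeness: 2 } }, // (2*5 + 1*2) / 3
]);
console.log(rubricRanked.map((r) => r.id)); // [ 'C', 'A', 'B' ]
```

A and B tie at 11/3, so A ranks ahead of B because it came first in the original order.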

package/bin/gsd-t.js CHANGED

@@ -1182,6 +1182,8 @@ const GLOBAL_BIN_TOOLS = [
   // M57 — CI-parity verify-gate checks (structural build-coverage + containment-safe ci-parity).
   "gsd-t-build-coverage.cjs",
   "gsd-t-ci-parity.cjs",
+  // M82 — Competition Mode generate-and-judge selection oracle.
+  "gsd-t-competition-judge.cjs",
 ];
 function installGlobalBinTools() {
@@ -2469,6 +2471,10 @@ const PROJECT_BIN_TOOLS = [
   "cli-preflight.cjs", "parallel-cli.cjs", "parallel-cli-tee.cjs",
   "gsd-t-context-brief.cjs",
   "gsd-t-verify-gate.cjs", "gsd-t-verify-gate-judge.cjs",
+  // M82 — Competition Mode judge + its disjointness oracle dependency, so a
+  // project's gsd-t-phase workflow can score candidate partitions via the
+  // project-local bin (runCli prefers bin/<tool>.cjs over the global binary).
+  "gsd-t-competition-judge.cjs", "gsd-t-file-disjointness.cjs",
 ];
 // Files that older versions of this installer copied into project bin/ but
@@ -4546,6 +4552,16 @@ if (require.main === module) {
       });
       process.exit(res.status == null ? 1 : res.status);
     }
+    case "competition-judge": {
+      // M82 D1 — `gsd-t competition-judge` thin dispatcher to the generate-and-judge
+      // selection oracle (objective partition judge + deterministic rubric selector).
+      const { spawnSync } = require("child_process");
+      const js = path.join(__dirname, "gsd-t-competition-judge.cjs");
+      const res = spawnSync(process.execPath, [js, ...args.slice(1)], {
+        stdio: "inherit",
+      });
+      process.exit(res.status == null ? 1 : res.status);
+    }
     case "metrics":
       doMetrics(args.slice(1));
       break;
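
The `competition-judge` case above forwards the child's exit status and falls back to 1 when `res.status` is `null`, which `spawnSync` reports when the child was terminated by a signal. The sketch below reproduces that pattern with a throwaway child process in place of the judge script. The inline `-e` program is an assumption made for the demo.

```javascript
"use strict";
const { spawnSync } = require("node:child_process");

// Run a child Node process that exits with the judge's "no valid candidate" code.
const res = spawnSync(process.execPath, ["-e", "process.exit(4)"], {
  stdio: "inherit",
});

// status is null when the child was killed by a signal, so map that case to 1.
const exitCode = res.status == null ? 1 : res.status;
console.log(exitCode); // 4
```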

package/commands/gsd-t-design-decompose.md CHANGED

@@ -25,14 +25,21 @@ Capture the design reference from `$ARGUMENTS` (Figma URL / image path). If Figm
   args: {
     phase: "design-decompose",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
+    // parallel decompositions; a blind, different-model, rubric judge (fidelity /
+    // completeness / reuse / simplicity) selects the winner. Useful when a design
+    // is ambiguous or the component boundaries aren't obvious.
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** When a design is ambiguous or the element/widget/page boundaries aren't obvious, `/gsd-t-design-decompose --competition 3` fans out N candidate decompositions and a blind, different-model rubric judge picks the best. Parse N (clamped 2..5). See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: the element → widget → page contract tree is written under `.gsd-t/contracts/design/`.
 - `status === "partial" | "blocked"`: the agent needs the design source (e.g. Figma auth) or a stack-capability decision. Surface it.

package/commands/gsd-t-help.md CHANGED

@@ -479,6 +479,14 @@ Use these when user asks for help on a specific command:
 - **Use when**: Test data hygiene. Catches the GSD-T-Board class (2442 orphaned `E2E_TEST_*` / `E2E_DRAG_*` ideas left in the production data store after a passing Verify run).
 - **CLI**: `gsd-t test-data --list [--run <id>] [--json]` / `gsd-t test-data --purge --run <id> [--dry-run] [--json] [--project <dir>]`. Exit 0 on success, 4 on adapter errors, 64 on usage error.
+### competition-judge (M82)
+- **Summary**: The selection oracle for Competition Mode (generate-and-judge — the *generative* dual of the orthogonal validation triad). Two modes: `--kind partition` scores candidate domain decompositions via the file-disjointness oracle (parallelGroups / waveDepth / validity — a calculator, not an LLM critic, so it's immune to judge bias); `--kind generic` is a deterministic rubric selector that finalizes a winner from rubric scores an upstream blind/different-model judge supplied.
+- **Auto-invoked**: Yes — by `gsd-t-phase.workflow.js` when an eligible phase (partition / milestone / design-decompose) is run with `competition: N` (N 2–5). Opt-in per phase via `/gsd-t-partition --competition N` etc. Default off.
+- **Files**: `bin/gsd-t-competition-judge.cjs` (reuses `bin/gsd-t-file-disjointness.cjs`).
+- **Use when**: Upstream, pre-contract, wide-solution-space decisions where the cost of a single draft is high (partition, milestone decomposition, ambiguous design decomposition). Never on post-contract phases (execute/verify/etc.) — those are owned by the adversarial triad.
+- **CLI**: `gsd-t competition-judge [--in <spec.json>] [--project-dir <dir>]` (spec via stdin or `--in`). Exit 0 winner · 4 no valid candidate · 64 bad input.
+- **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE.
 ## Unknown Command
 If user asks for help on unrecognized command:
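
The help entry above documents three exit codes for `competition-judge` (0 winner, 4 no valid candidate, 64 bad input). A caller might branch on the envelope roughly as sketched here. `interpretEnvelope` and the action names are hypothetical. What the real workflow does on each code is not shown in this diff.

```javascript
"use strict";

// Hypothetical caller-side helper: map the documented envelope to a next step.
function interpretEnvelope(env) {
  switch (env.exitCode) {
    case 0:
      return { action: "finalize", winner: env.winner };
    case 4:
      // Every candidate was disqualified; the caller must decide what to do.
      return { action: "no-winner", reason: env.reason };
    case 64:
      return { action: "abort", reason: env.reason };
    default:
      return { action: "abort", reason: `unexpected exit code ${env.exitCode}` };
  }
}

const sample = interpretEnvelope({
  ok: false,
  exitCode: 4,
  kind: "partition",
  n: 2,
  winner: null,
  ranked: [],
  reason: "no-valid-candidate",
});
console.log(sample); // { action: 'no-winner', reason: 'no-valid-candidate' }
```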

package/commands/gsd-t-milestone.md CHANGED

@@ -25,14 +25,21 @@ Read `.gsd-t/progress.md` (current version + completed milestones), `docs/requir
   args: {
     phase: "milestone",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
+    // parallel Self-MoA producers proposing different decomposition strategies
+    // (risk-first / value-first / dependency-first); a blind, different-model,
+    // rubric judge selects the winner. Coupled-thesis → pick-one (no Frankenstein).
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** Milestone decomposition is the highest-altitude decision in the system — different strategies are genuinely different. If the user invokes `/gsd-t-milestone --competition 3`, parse N (clamped 2..5) and pass `competition: N`. Because a milestone decomposition is a *coupled thesis*, the judge selects one winner whole (pick-one) and only salvages non-overlapping good line-items from the losers — it never Frankensteins. See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: milestone defined and appended to progress.md with falsifiable SCs. Do NOT auto-partition for large/risky milestones — show the Next Up hint.
 - `status === "blocked"`: the agent needs a scoping decision from the user.

package/commands/gsd-t-partition.md CHANGED

@@ -30,14 +30,21 @@ Call the `Workflow` tool with:
     phase: "partition",
     milestone: "M{NN}",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): if the user passed `--competition N` in
+    // $ARGUMENTS (N in 2..5), set competition: N. N parallel Self-MoA producers
+    // propose partitions; the OBJECTIVE oracle judge (file-disjointness scoring)
+    // picks the most-parallelizable valid decomposition. Omit / set 1 = off.
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** Partition is the v1 beachhead for generate-and-judge: its judge is the file-disjointness oracle, so it is a calculator, not a biased critic. If the user invokes `/gsd-t-partition --competition 3`, parse N (clamped 2..5) and pass `competition: N`. The workflow fans out N candidate partitions, scores each on measured parallelism / wave-depth / boundary-cleanliness, and finalizes the winner. See `.gsd-t/contracts/competition-mode-contract.md`. Default off (single producer).
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: domains scoped, contracts drafted. Auto-advance to `/gsd-t-plan`.
 - `status === "partial" | "blocked"`: read `summary` for what's missing (e.g. ambiguous scope needing discussion).
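
The command text above asks the caller to parse N from `--competition N` and clamp it to 2..5, with `competition: 1` meaning off. One possible reading is sketched below. `parseCompetition` is a hypothetical helper. Treating values below 2 as off is an assumption, since the diff does not say how `--competition 1` is handled.

```javascript
"use strict";

// Hypothetical parser: extract `--competition N` from a raw argument string.
// Returns 1 (off) when the flag is absent or N < 2 (assumption); otherwise
// caps N at the documented maximum of 5.
function parseCompetition(rawArgs) {
  const m = /(?:^|\s)--competition\s+(\d+)(?=\s|$)/.exec(rawArgs || "");
  if (!m) return 1;
  const n = Number.parseInt(m[1], 10);
  if (n < 2) return 1;
  return Math.min(5, n);
}

console.log(parseCompetition("--competition 3 split the auth domain")); // 3
console.log(parseCompetition("--competition 9")); // 5
console.log(parseCompetition("split the auth domain")); // 1
```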

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@tekyzinc/gsd-t",
-  "version": "4.0.28",
+  "version": "4.1.10",
   "description": "GSD-T: Contract-Driven Development for Claude Code — 54 slash commands with headless-by-default workflow spawning, unattended supervisor relay with event stream, graph-powered code analysis, real-time agent dashboard, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
   "author": "Tekyz, Inc.",
   "license": "MIT",

package/templates/CLAUDE-global.md CHANGED

@@ -328,10 +328,10 @@ Canonical scripts:
 - `gsd-t-integrate.workflow.js` — cross-domain wire-up + light verify-gate
 - `gsd-t-debug.workflow.js` — 2-cycle diagnose/fix/verify (CLAUDE.md Prime Rule)
 - `gsd-t-quick.workflow.js` — preflight + brief + single-task + verify-gate (M56-D4)
-- `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple)
+- `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple). **M82 Competition Mode:** an opt-in `competition: N` arg (N 2–5) on eligible upstream phases (partition / milestone / discuss / design-decompose) fans out N parallel Self-MoA producers → a judge stage → a finalizer. Partition's judge is the OBJECTIVE file-disjointness oracle (`gsd-t competition-judge --kind partition` — a calculator, not an LLM critic, immune to judge bias, the v1 beachhead); subjective phases use a blind + shuffled + different-model + rubric judge whose pick is finalized deterministically by `--kind generic`. The generative dual of the orthogonal validation triad; watershed rule = generate-and-judge ABOVE the contract, attack-and-filter BELOW. Default off. Contract: `competition-mode-contract.md` v1.0.0.
 - `gsd-t-scan.workflow.js` — preflight → volume-probe → pipeline(per-slice deep finder → single verify) → synthesis → document → render (M66: fans out by codebase VOLUME, not a fixed 5-teammate dimension count; M67: deep document phase deterministically produces the full living-doc set + dimension files, per-doc fan-out)
-Shared helpers: `templates/workflows/_lib.js` — `runPreflight`, `generateBrief`, `proveFileDisjointness`, `runVerifyGate`, `loadProtocol`, `readDomainTasks`, `readScope`. Each prefers project-local `bin/<tool>.cjs` and falls back to global `gsd-t` PATH binary (preserves M55-D5 project-local-bin invariant).
+**Runtime-native invariant (M81 — v4.0.29+):** the Workflow sandbox provides ONLY `agent/parallel/pipeline/log/phase/budget/args` — NO `require`/`fs`/`path`/`child_process`/`process`, and `args` arrives as a JSON STRING. Each workflow is self-contained: it `JSON.parse`s `args` and delegates every CLI call (preflight, verify-gate, brief, build-coverage, ci-parity, test-data, disjointness) to inline `async` helpers that run the command via an `agent()`'s Bash (preferring project-local `bin/<tool>.cjs`, else the global `gsd-t` PATH binary) and parse the JSON envelope — preserving the M55-D5 project-local-bin invariant. The old `require("./_lib.js")` pattern threw `ReferenceError` on first eval and silently broke every workflow except scan (TD-113, fixed M81); `_lib.js` is retired as a workflow dependency.
 ## Preflight Gate (KEPT from M55)
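
Two pieces of the runtime-native invariant described in this hunk lend themselves to a small sketch: `args` arriving as a JSON string, and CLI arguments being single-quoted before they are handed to an agent's Bash. The function names `parseWorkflowArgs` and `shellQuote` are hypothetical, chosen for illustration; the extra guard against a non-object parse result (for example the string `"null"`) is an added assumption, not something the package is confirmed to do.

```javascript
// Hypothetical helpers illustrating the invariant; not the package's own API.

// The sandbox may pass `args` as a JSON string. Normalise it to an object
// and fall back to {} on malformed or non-object input.
function parseWorkflowArgs(args) {
  if (typeof args !== "string") return args || {};
  try {
    const parsed = JSON.parse(args);
    // Assumption: anything that is not an object (null, number, ...) is ignored.
    return parsed && typeof parsed === "object" ? parsed : {};
  } catch {
    return {};
  }
}

// Wrap every argument in single quotes for a POSIX shell; an embedded
// single quote is rewritten as '\'' (close quote, escaped quote, reopen).
function shellQuote(argv) {
  return (argv || [])
    .map((a) => `'${String(a).replace(/'/g, "'\\''")}'`)
    .join(" ");
}
```

With these, `parseWorkflowArgs('{"projectDir":"/tmp/p"}').projectDir` is `"/tmp/p"`, and `shellQuote(["--kind", "it's"])` produces `'--kind' 'it'\''s'`.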

package/templates/workflows/gsd-t-debug.workflow.js CHANGED

@@ -16,10 +16,39 @@ export const meta = {
   ],
 };
-const lib = require("./_lib.js");
+// M81: runtime-native helpers (sandbox bans require/fs/child_process/process — the old
+// require("./_lib.js") crashed this workflow on first eval, TD-113). Delegate CLI calls
+// to an agent's Bash; args arrives as a JSON STRING in this runtime. See gsd-t-scan.workflow.js.
+const _args = (typeof args === "string") ? (() => { try { return JSON.parse(args); } catch { return {}; } })() : (args || {});
+const _CLI_ENVELOPE_SCHEMA = {
+  type: "object", required: ["ok", "exitCode"], additionalProperties: true,
+  properties: { ok: { type: "boolean" }, exitCode: { type: "integer" }, envelope: {}, stdout: { type: "string" }, stderr: { type: "string" }, via: { type: "string" } },
+};
+async function runCli(projectDir, subcmd, argv, localBin, label, parseJson = true, phaseName) {
+  const argStr = (argv || []).map((a) => `'${String(a).replace(/'/g, "'\\''")}'`).join(" ");
+  const prompt = [
+    `Run a GSD-T CLI command for the project at \`${projectDir}\` and report the result. Steps:`,
+    `1. If \`${projectDir}/bin/${localBin}\` exists, run: \`node ${projectDir}/bin/${localBin} ${argStr}\` (set via="local"). Otherwise run: \`gsd-t ${subcmd} ${argStr}\` (set via="global"). Use cwd \`${projectDir}\`.`,
+    `2. Capture exit code (ok = exitCode 0) and stdout/stderr.`,
+    parseJson ? `3. Parse stdout as JSON into \`envelope\` (null if not JSON). Return JSON per the schema.` : `3. Put stdout (trimmed, ≤4000 chars) in \`stdout\`. Return JSON per the schema.`,
+    `Do NOT do any other work. ONLY run this one command and report.`,
+  ].join("\n");
+  const opts = { label, schema: _CLI_ENVELOPE_SCHEMA, model: "haiku" };
+  if (phaseName) opts.phase = phaseName;
+  const r = await agent(prompt, opts).catch((e) => ({ ok: false, exitCode: -1, envelope: null, stderr: String(e && e.message), via: "error" }));
+  return r || { ok: false, exitCode: -1, envelope: null, via: "error" };
+}
+async function runPreflight(projectDir, label = "preflight", phaseName) { return runCli(projectDir, "preflight", ["--json"], "cli-preflight.cjs", label, true, phaseName); }
+async function generateBrief(projectDir, { kind = "execute", milestone, domain, id, label = "brief", phaseName } = {}) {
+  const argv = ["--kind", kind, "--spawn-id", id, "--out", `${projectDir}/.gsd-t/briefs/${id}.json`];
+  if (milestone) argv.push("--milestone", milestone);
+  if (domain) argv.push("--domain", domain);
+  const r = await runCli(projectDir, "brief", argv, "gsd-t-context-brief.cjs", label, false, phaseName);
+  return { ok: r.ok, briefPath: `${projectDir}/.gsd-t/briefs/${id}.json`, via: r.via };
+}
-const projectDir = (args && args.projectDir) || ".";
-const symptom = (args && args.symptom) || null;
+const projectDir = _args.projectDir || ".";
+const symptom = _args.symptom || null;
 const DEBUG_CYCLE_SCHEMA = {
   type: "object",
@@ -42,9 +71,9 @@ if (!symptom) {
 }
 phase("Preflight");
-const pre = lib.runPreflight({ projectDir });
+const pre = await runPreflight(projectDir);
 if (!pre.ok) return { status: "failed", reason: "preflight-failed", preflight: pre.envelope };
-const brief = lib.generateBrief({ kind: "execute", projectDir });
+const brief = await generateBrief(projectDir, { kind: "execute", id: "debug-brief" });
 let lastResult = null;
 for (let cycle = 1; cycle <= 2; cycle++) {