npm - @tekyzinc/gsd-t - Versions diffs - 4.0.29 → 4.2.10 - Mend

@tekyzinc/gsd-t 4.0.29 → 4.2.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/CHANGELOG.md +37 -0
package/README.md +6 -0
package/bin/gsd-t-competition-judge.cjs +344 -0
package/bin/gsd-t-traceability-gate.cjs +338 -0
package/bin/gsd-t.js +29 -0
package/commands/gsd-t-design-decompose.md +9 -2
package/commands/gsd-t-help.md +16 -0
package/commands/gsd-t-milestone.md +9 -2
package/commands/gsd-t-partition.md +9 -2
package/commands/gsd-t-plan.md +5 -3
package/package.json +1 -1
package/templates/CLAUDE-global.md +1 -1
package/templates/prompts/pre-mortem-subagent.md +46 -0
package/templates/workflows/gsd-t-phase.workflow.js +414 -19

package/bin/gsd-t-traceability-gate.cjs ADDED

@@ -0,0 +1,338 @@
+"use strict";
+/**
+ * gsd-t-traceability-gate — M83 D1
+ *
+ * The plan-phase acceptance-traceability gate. The deterministic half of
+ * Left-Shifted Plan Hardening (the adversarial pre-mortem agent is the
+ * generative half). Contract: .gsd-t/contracts/plan-hardening-contract.md.
+ *
+ * ORIGIN (NiceNote M5 incident, 2026-06-05): M5's headline capability (AC-6,
+ * 100MB+ chunked read) shipped as DEAD CODE — the chunk reader was built but
+ * openPath still materialized whole files, and NO test asserted the headline
+ * capability, so the suite stayed green. The triad burned 4 verify cycles
+ * re-litigating the milestone's reason to exist. Root cause: the plan never
+ * bound each acceptance criterion to (a) a real code path and (b) a test that
+ * FAILS if that path is absent. This gate enforces that binding BEFORE execute.
+ *
+ * What it checks, per `.gsd-t/domains/* /tasks.md` task block:
+ *   - Every task that declares **Acceptance criteria** MUST declare **Files**
+ *     (an implementing code path) — an AC with no path is an unbacked promise.
+ *   - Every such task MUST declare a TEST reference (a Test/Tests field, a
+ *     test-runner mention, or a Files entry matching a test path pattern) — an
+ *     AC with no killing test is the dead-code class (passes vacuously / never
+ *     exercised). The milestone's HEADLINE capability without a test is exactly
+ *     the M5 failure.
+ *   - A task tagged as the milestone HEADLINE (**Headline:** true, or an AC
+ *     referencing the milestone's named capability) gets a STRICTER check: it
+ *     MUST have a non-test Files entry (real implementation, not just a test)
+ *     AND a test entry. A headline with only a test, or only an impl, fails.
+ *
+ * It does NOT judge whether the code is correct (that's verify) — only whether
+ * the PLAN is complete enough that execute can't produce a dead deliverable.
+ *
+ * Input: --milestone Mxx --project-dir PATH  (reads .gsd-t/domains/* /tasks.md).
+ *        OR --tasks <file> to check a single tasks.md (used by tests).
+ * Output: JSON envelope { ok, exitCode, milestone, tasks:[...], violations:[...] }.
+ * Exit: 0 all tasks traceable · 4 ≥1 violation (blocks execute) · 64 bad input.
+ *
+ * Hard rules: zero deps, never throws, pure/read-only.
+ */
+const fs = require("node:fs");
+const path = require("node:path");
+// ─── tasks.md parsing ────────────────────────────────────────────────────
+// Red Team CRITICAL/HIGH-3/MEDIUM-1 (M83 verify): markdown field labels appear in
+// BOTH `**Label**: v` (colon outside bold) and `**Label:** v` (colon inside) forms.
+// Matching against the raw line missed the colon-inside form — defeating the entire
+// gate on the canonical M5 dead-code plan. Fix: STRIP emphasis markers first, then
+// match the colon-agnostic bare text. All field detection runs on the bared line.
+function _bare(line) {
+  return String(line == null ? "" : line).replace(/[*_`]/g, "");
+}
+// Path-safe bare: strips only emphasis that wraps labels (* and backtick), but
+// PRESERVES underscores — pytest's test_*.py / *_test.py conventions depend on
+// them, and TEST_PATH_RE has `_test\.` / `test_` alternatives (Red Team M83
+// recheck HIGH: stripping `_` before the test-path scan false-failed Python plans).
+function _barePath(s) {
+  return String(s == null ? "" : s).replace(/[*`]/g, "");
+}
+// A test reference is: an explicit Test/Tests field, a known runner mention, or a
+// Files path that looks like a test file. Kept broad on purpose — the gate asserts
+// a test is NAMED, not that it exists yet (plan precedes execute).
+const TEST_PATH_RE = /(\.test\.|\.spec\.|(^|\/)tests?\/|(^|\/)e2e\/|_test\.|test_|cargo test|vitest|playwright|pytest|jest)/i;
+// Field regexes run on the BARED line, so the colon can be anywhere the label ends.
+const TEST_FIELD_RE = /^\s*[-*]?\s*(tests?|test\s*ref|test\s*coverage|verified\s*by)\s*:/i;
+const FILES_FIELD_RE = /^\s*[-*]?\s*files?\s*:/i;
+const AC_FIELD_RE = /^\s*[-*]?\s*(acceptance(\s*criteria)?|accept|ac)\s*:/i;
+const HEADLINE_FIELD_RE = /^\s*[-*]?\s*headline\s*:\s*(true|yes)/i;
+const HEADING_RE = /^(#{2,4})\s+(.*\S.*)$/;
+// Headings that are structural, never tasks — so we don't mis-parse a Summary/
+// Overview block as a behavioral task. Everything else that bears an AC field IS
+// assessed (Red Team HIGH-2: do NOT gate task detection on heading wording —
+// anchor on the AC, so a descriptive heading like "Implement the reader" is caught).
+const NON_TASK_HEADING_RE = /^(summary|overview|notes?|context|goal|background|wave\s*history|index|integration\s*points?|dependencies|references?|appendix|tasks)\s*$/i;
+/**
+ * Parse a tasks.md into candidate blocks: every `##`–`####` heading starts a
+ * block (except the structural-heading skip list). A block becomes a TASK for
+ * assessment iff it contains an acceptance-criteria field (decided later in
+ * assessTask) — but we keep ALL non-structural blocks so no AC-bearing block is
+ * ever dropped on heading wording.
+ * @returns {Array<{title, raw, lines}>}
+ */
+function parseTasks(md) {
+  const lines = (md || "").split(/\r?\n/);
+  const blocks = [];
+  let cur = null;
+  for (const line of lines) {
+    const m = line.match(HEADING_RE);
+    if (m) {
+      const title = m[2].trim();
+      // Close any open block at every heading.
+      if (cur) { blocks.push(cur); cur = null; }
+      // Structural headings start no block; everything else does.
+      if (!NON_TASK_HEADING_RE.test(_bare(title).trim())) {
+        cur = { title, lines: [] };
+      }
+      continue;
+    }
+    if (cur) cur.lines.push(line);
+  }
+  if (cur) blocks.push(cur);
+  return blocks.map((t) => ({ title: t.title, raw: t.lines.join("\n"), lines: t.lines }));
+}
+// ─── per-task traceability assessment ────────────────────────────────────
+// All field matching runs on the BARED line (emphasis stripped) so colon
+// position inside/outside bold is irrelevant (Red Team CRITICAL fix).
+function fieldValue(lines, re) {
+  for (const ln of lines) {
+    const bare = _bare(ln);
+    if (re.test(bare)) {
+      const idx = bare.indexOf(":");
+      return idx >= 0 ? bare.slice(idx + 1).trim() : "";
+    }
+  }
+  return null;
+}
+// Like fieldValue but PRESERVES underscores in the returned value (label is still
+// matched emphasis-agnostically) — used for value-level test-path scans so
+// test_*.py / *_test.py survive (Red Team recheck HIGH).
+function fieldValueRaw(lines, re) {
+  for (const ln of lines) {
+    if (re.test(_bare(ln))) {
+      const raw = _barePath(ln);
+      const idx = raw.indexOf(":");
+      return idx >= 0 ? raw.slice(idx + 1).trim() : "";
+    }
+  }
+  return null;
+}
+function hasMultiField(lines, re) {
+  return lines.some((ln) => re.test(_bare(ln)));
+}
+// Collect the indented/bulleted sub-lines that follow an Acceptance-criteria
+// label up to the next top-level field — these ARE the acceptance criteria, and
+// an AC may name its own verifying test there ("…; verified by cargo test").
+function _acBulletText(lines) {
+  const out = [];
+  let inAc = false;
+  for (const ln of lines) {
+    const bare = _bare(ln);
+    if (AC_FIELD_RE.test(bare)) { inAc = true; continue; }
+    if (!inAc) continue;
+    // A new NON-INDENTED "Label:" line closes the AC block.
+    if (/^\s*[-*]?\s*[a-z][a-z\s]{1,24}:/i.test(bare) && !/^\s{2,}/.test(ln)) {
+      inAc = false; continue;
+    }
+    out.push(_barePath(ln)); // preserve underscores for test-path detection
+  }
+  return out.join("\n");
+}
+/**
+ * A task is "behavioral" (subject to the gate) if it declares acceptance
+ * criteria — i.e. it promises an observable behavior. Pure-scaffolding tasks
+ * with no ACs are out of scope (nothing to trace).
+ */
+function assessTask(task) {
+  const lines = task.lines;
+  const hasAc = hasMultiField(lines, AC_FIELD_RE);
+  if (!hasAc) {
+    return { title: task.title, behavioral: false, violations: [] };
+  }
+  // Underscore-preserving values for path/runner scans (Red Team recheck HIGH).
+  const filesVal = fieldValueRaw(lines, FILES_FIELD_RE) || "";
+  const hasFiles = hasMultiField(lines, FILES_FIELD_RE) && filesVal.replace(/[—–-]/g, "").trim().length > 0;
+  // Test reference (MEDIUM-1 fix): satisfied ONLY by a runner/test-path tied to a
+  // RELEVANT field — the Test field, the Files field, or the Acceptance-criteria
+  // value (where an AC may name its own verifying test, e.g. "…; verified by cargo
+  // test"). An incidental runner mention in an UNRELATED field (Dependencies,
+  // Notes, Scope) must NOT vacuously clear the killing-test requirement.
+  const hasTestField = hasMultiField(lines, TEST_FIELD_RE);
+  const testFieldVal = fieldValueRaw(lines, TEST_FIELD_RE) || "";
+  const acVal = fieldValueRaw(lines, AC_FIELD_RE) || "";
+  // AC criteria often span bullet sub-lines after the label; gather those too
+  // (underscore-preserving, so a test_*.py named in a bullet still matches).
+  const acBullets = _acBulletText(lines);
+  const filesHasTestPath = TEST_PATH_RE.test(filesVal);
+  const testFieldHasRunner = TEST_PATH_RE.test(testFieldVal);
+  const acHasRunner = TEST_PATH_RE.test(acVal) || TEST_PATH_RE.test(acBullets);
+  const hasTest = hasTestField || filesHasTestPath || testFieldHasRunner || acHasRunner;
+  // A non-test implementing path: a Files entry that is NOT only test files.
+  const fileTokens = filesVal.split(/[,\s]+/).map((s) => s.replace(/[`*()]/g, "").trim()).filter(Boolean);
+  const implTokens = fileTokens.filter((f) => /[./]/.test(f) && !TEST_PATH_RE.test(f));
+  const hasImplPath = implTokens.length > 0;
+  const isHeadline = lines.some((ln) => HEADLINE_FIELD_RE.test(_bare(ln)));
+  const violations = [];
+  if (!hasFiles) {
+    violations.push({ kind: "ac-without-path", detail: "task declares acceptance criteria but no **Files** implementing path — an unbacked promise." });
+  }
+  if (!hasTest) {
+    violations.push({ kind: "ac-without-test", detail: "task declares acceptance criteria but names no test (Test field, test path, or runner) — the dead-code class: it can pass vacuously / never be exercised." });
+  }
+  if (isHeadline && !hasImplPath) {
+    violations.push({ kind: "headline-without-impl", detail: "HEADLINE task has no non-test implementing path — the milestone's reason to exist is not bound to real code (the M5 AC-6 dead-code failure)." });
+  }
+  if (isHeadline && !hasTest) {
+    violations.push({ kind: "headline-without-test", detail: "HEADLINE task has no test proving the milestone's core capability is delivered (the missing >100MB-fixture failure)." });
+  }
+  return {
+    title: task.title,
+    behavioral: true,
+    isHeadline,
+    hasFiles, hasTest, hasImplPath,
+    violations,
+  };
+}
+// ─── driver ──────────────────────────────────────────────────────────────
+function listTasksFiles(projectDir, milestone) {
+  const domainsDir = path.join(projectDir, ".gsd-t", "domains");
+  let entries = [];
+  try {
+    entries = fs.readdirSync(domainsDir, { withFileTypes: true });
+  } catch {
+    return [];
+  }
+  const out = [];
+  const mPrefix = milestone ? milestone.toLowerCase() : null;
+  for (const e of entries) {
+    if (!e.isDirectory()) continue;
+    // When a milestone is given, prefer domains whose name carries that mNN
+    // prefix; if none match, fall back to all domains (single-milestone repos).
+    const tasksPath = path.join(domainsDir, e.name, "tasks.md");
+    if (fs.existsSync(tasksPath)) out.push({ domain: e.name, tasksPath });
+  }
+  if (mPrefix) {
+    const matched = out.filter((d) => d.domain.toLowerCase().startsWith(mPrefix));
+    if (matched.length) return matched;
+  }
+  return out;
+}
+function runGate({ projectDir = process.cwd(), milestone = null, tasksFile = null } = {}) {
+  let files;
+  if (tasksFile) {
+    files = [{ domain: path.basename(path.dirname(tasksFile)), tasksPath: tasksFile }];
+  } else {
+    files = listTasksFiles(projectDir, milestone);
+  }
+  if (!files.length) {
+    return { ok: false, exitCode: 64, milestone, reason: "no-tasks-files", tasks: [], violations: [] };
+  }
+  const taskResults = [];
+  const violations = [];
+  let behavioralCount = 0;
+  for (const f of files) {
+    let md;
+    try { md = fs.readFileSync(f.tasksPath, "utf8"); } catch { continue; }
+    for (const t of parseTasks(md)) {
+      const r = assessTask(t);
+      r.domain = f.domain;
+      taskResults.push(r);
+      if (r.behavioral) behavioralCount++;
+      for (const v of r.violations) {
+        violations.push({ domain: f.domain, task: r.title, ...v });
+      }
+    }
+  }
+  const ok = violations.length === 0;
+  return {
+    ok,
+    exitCode: ok ? 0 : 4,
+    milestone,
+    summary: {
+      tasksTotal: taskResults.length,
+      behavioral: behavioralCount,
+      violations: violations.length,
+    },
+    tasks: taskResults,
+    violations,
+    ...(ok ? {} : { reason: "untraceable-acceptance-criteria" }),
+  };
+}
+// ─── CLI ─────────────────────────────────────────────────────────────────
+function parseArgs(argv) {
+  const o = { projectDir: process.cwd(), milestone: null, tasksFile: null, help: false };
+  for (let i = 0; i < argv.length; i++) {
+    const a = argv[i];
+    if (a === "--help" || a === "-h") o.help = true;
+    else if (a === "--project-dir") o.projectDir = argv[++i];
+    else if (a === "--milestone") o.milestone = argv[++i];
+    else if (a === "--tasks") o.tasksFile = argv[++i];
+    else if (a === "--json") {/* default */}
+  }
+  return o;
+}
+const HELP = `Usage: gsd-t traceability-gate [--milestone Mxx] [--project-dir PATH] [--tasks FILE]
+Plan-phase acceptance-traceability gate (M83). Asserts every behavioral task in
+the milestone's .gsd-t/domains/* /tasks.md binds its acceptance criteria to an
+implementing **Files** path AND a named test. Headline tasks must have BOTH a
+real implementation path and a test. Blocks execute on any violation.
+  --milestone Mxx    Limit to domains whose name carries the mNN prefix.
+  --project-dir P    Project root (default: cwd).
+  --tasks FILE       Check a single tasks.md (overrides domain discovery).
+Exit: 0 all traceable · 4 ≥1 violation · 64 no tasks files / bad input.`;
+function main() {
+  const o = parseArgs(process.argv.slice(2));
+  if (o.help) { process.stdout.write(HELP + "\n"); process.exit(0); }
+  let res;
+  try {
+    res = runGate(o);
+  } catch (e) {
+    res = { ok: false, exitCode: 64, milestone: o.milestone, reason: `gate-error: ${e && e.message}`, tasks: [], violations: [] };
+  }
+  process.stdout.write(JSON.stringify(res, null, 2) + "\n");
+  process.exit(res.exitCode);
+}
+if (require.main === module) main();
+module.exports = { runGate, parseTasks, assessTask, _internal: { fieldValue, TEST_PATH_RE } };
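
The two parsing fixes described in the comments of this file (emphasis-agnostic label matching and underscore-preserving path scans) can be illustrated with a minimal standalone sketch. The two strip helpers and the regexes are copied from the diff above; the sample task lines and all variable names are hypothetical and exist only for illustration.

```javascript
// Standalone sketch. The strip helpers and the regexes are copied from
// gsd-t-traceability-gate.cjs as shown in the diff; the sample lines and
// variable names are hypothetical.
const bare = (line) => String(line == null ? "" : line).replace(/[*_`]/g, "");
const barePath = (s) => String(s == null ? "" : s).replace(/[*`]/g, "");

const AC_FIELD_RE = /^\s*[-*]?\s*(acceptance(\s*criteria)?|accept|ac)\s*:/i;
const TEST_PATH_RE = /(\.test\.|\.spec\.|(^|\/)tests?\/|(^|\/)e2e\/|_test\.|test_|cargo test|vitest|playwright|pytest|jest)/i;

// 1. Label matching: both bold styles reduce to the same bare text.
const colonOutside = "- **Acceptance criteria**: reads 100MB+ files in chunks";
const colonInside = "- **Acceptance criteria:** reads 100MB+ files in chunks";
const matchesOutside = AC_FIELD_RE.test(bare(colonOutside));
const matchesInside = AC_FIELD_RE.test(bare(colonInside));
const matchesRaw = AC_FIELD_RE.test(colonInside); // unstripped line, for contrast

// 2. Path scanning: stripping "_" would hide a pytest-style file name.
const filesLine = "- **Files**: `src/reader.py`, `test_reader.py`";
const valueAfterColon = (s) => s.slice(s.indexOf(":") + 1).trim();
const strippedValue = valueAfterColon(bare(filesLine));
const rawValue = valueAfterColon(barePath(filesLine));
const testSeenStripped = TEST_PATH_RE.test(strippedValue);
const testSeenRaw = TEST_PATH_RE.test(rawValue);

// 3. Implementation tokens: Files entries that look like paths but not tests.
const implTokens = rawValue
  .split(/[,\s]+/)
  .filter((f) => /[./]/.test(f) && !TEST_PATH_RE.test(f));

console.log({ matchesOutside, matchesInside, matchesRaw, testSeenStripped, testSeenRaw, implTokens });
```

In this sketch the label regex only succeeds once emphasis is stripped, while the test-path scan only succeeds when underscores are kept, which is why the gate uses two different strip helpers.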

package/bin/gsd-t.js CHANGED

@@ -1182,6 +1182,10 @@ const GLOBAL_BIN_TOOLS = [
   // M57 — CI-parity verify-gate checks (structural build-coverage + containment-safe ci-parity).
   "gsd-t-build-coverage.cjs",
   "gsd-t-ci-parity.cjs",
+  // M82 — Competition Mode generate-and-judge selection oracle.
+  "gsd-t-competition-judge.cjs",
+  // M83 — Plan-phase acceptance-traceability gate.
+  "gsd-t-traceability-gate.cjs",
 ];
 function installGlobalBinTools() {
@@ -2469,6 +2473,12 @@ const PROJECT_BIN_TOOLS = [
   "cli-preflight.cjs", "parallel-cli.cjs", "parallel-cli-tee.cjs",
   "gsd-t-context-brief.cjs",
   "gsd-t-verify-gate.cjs", "gsd-t-verify-gate-judge.cjs",
+  // M82 — Competition Mode judge + its disjointness oracle dependency, so a
+  // project's gsd-t-phase workflow can score candidate partitions via the
+  // project-local bin (runCli prefers bin/<tool>.cjs over the global binary).
+  "gsd-t-competition-judge.cjs", "gsd-t-file-disjointness.cjs",
+  // M83 — Plan-phase acceptance-traceability gate (runs in the plan workflow).
+  "gsd-t-traceability-gate.cjs",
 ];
 // Files that older versions of this installer copied into project bin/ but
@@ -4546,6 +4556,25 @@ if (require.main === module) {
       });
       process.exit(res.status == null ? 1 : res.status);
     }
+    case "competition-judge": {
+      // M82 D1 — `gsd-t competition-judge` thin dispatcher to the generate-and-judge
+      // selection oracle (objective partition judge + deterministic rubric selector).
+      const { spawnSync } = require("child_process");
+      const js = path.join(__dirname, "gsd-t-competition-judge.cjs");
+      const res = spawnSync(process.execPath, [js, ...args.slice(1)], {
+        stdio: "inherit",
+      });
+      process.exit(res.status == null ? 1 : res.status);
+    }
+    case "traceability-gate": {
+      // M83 D1 — `gsd-t traceability-gate` plan-phase acceptance-traceability gate.
+      const { spawnSync } = require("child_process");
+      const js = path.join(__dirname, "gsd-t-traceability-gate.cjs");
+      const res = spawnSync(process.execPath, [js, ...args.slice(1)], {
+        stdio: "inherit",
+      });
+      process.exit(res.status == null ? 1 : res.status);
+    }
     case "metrics":
       doMetrics(args.slice(1));
       break;
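
Both new `case` branches follow the same thin-dispatcher pattern: spawn the tool with the current Node binary, inherit stdio, and forward the child's exit status (falling back to 1 when no status is available). A minimal sketch of that pattern, assuming a hypothetical helper named `dispatch` and using inline `-e` scripts in place of the real `.cjs` tools:

```javascript
const { spawnSync } = require("child_process");

// Hypothetical helper (not part of the package): run a command and return the
// exit code the parent process should exit with.
function dispatch(command, args) {
  const res = spawnSync(command, args, { stdio: "inherit" });
  // status is null when the child was killed by a signal or could not be spawned.
  return res.status == null ? 1 : res.status;
}

// Stand-ins for the real tools: inline scripts that exit with a chosen code.
const blockedCode = dispatch(process.execPath, ["-e", "process.exit(4)"]);
const okCode = dispatch(process.execPath, ["-e", "process.exit(0)"]);
// A command that does not exist: spawnSync reports an error and a null status.
const missingCode = dispatch("/nonexistent/gsd-t-sketch-binary", []);

console.log({ blockedCode, okCode, missingCode });
```

Because the status is forwarded unchanged, a caller of `gsd-t traceability-gate` sees the same 0 / 4 / 64 codes the gate itself documents.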

package/commands/gsd-t-design-decompose.md CHANGED

@@ -25,14 +25,21 @@ Capture the design reference from `$ARGUMENTS` (Figma URL / image path). If Figm
   args: {
     phase: "design-decompose",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
+    // parallel decompositions; a blind, different-model, rubric judge (fidelity /
+    // completeness / reuse / simplicity) selects the winner. Useful when a design
+    // is ambiguous or the component boundaries aren't obvious.
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** When a design is ambiguous or the element/widget/page boundaries aren't obvious, `/gsd-t-design-decompose --competition 3` fans out N candidate decompositions and a blind, different-model rubric judge picks the best. Parse N (clamped 2..5). See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: the element → widget → page contract tree is written under `.gsd-t/contracts/design/`.
 - `status === "partial" | "blocked"`: the agent needs the design source (e.g. Figma auth) or a stack-capability decision. Surface it.

package/commands/gsd-t-help.md CHANGED

@@ -479,6 +479,22 @@ Use these when user asks for help on a specific command:
 - **Use when**: Test data hygiene. Catches the GSD-T-Board class (2442 orphaned `E2E_TEST_*` / `E2E_DRAG_*` ideas left in the production data store after a passing Verify run).
 - **CLI**: `gsd-t test-data --list [--run <id>] [--json]` / `gsd-t test-data --purge --run <id> [--dry-run] [--json] [--project <dir>]`. Exit 0 on success, 4 on adapter errors, 64 on usage error.
+### competition-judge (M82)
+- **Summary**: The selection oracle for Competition Mode (generate-and-judge — the *generative* dual of the orthogonal validation triad). Two modes: `--kind partition` scores candidate domain decompositions via the file-disjointness oracle (parallelGroups / waveDepth / validity — a calculator, not an LLM critic, so it's immune to judge bias); `--kind generic` is a deterministic rubric selector that finalizes a winner from rubric scores an upstream blind/different-model judge supplied.
+- **Auto-invoked**: Yes — by `gsd-t-phase.workflow.js` when an eligible phase (partition / milestone / design-decompose) is run with `competition: N` (N 2–5). Opt-in per phase via `/gsd-t-partition --competition N` etc. Default off.
+- **Files**: `bin/gsd-t-competition-judge.cjs` (reuses `bin/gsd-t-file-disjointness.cjs`).
+- **Use when**: Upstream, pre-contract, wide-solution-space decisions where the cost of a single draft is high (partition, milestone decomposition, ambiguous design decomposition). Never on post-contract phases (execute/verify/etc.) — those are owned by the adversarial triad.
+- **CLI**: `gsd-t competition-judge [--in <spec.json>] [--project-dir <dir>]` (spec via stdin or `--in`). Exit 0 winner · 4 no valid candidate · 64 bad input.
+- **Contract**: `.gsd-t/contracts/competition-mode-contract.md` v1.0.0 STABLE.
+### traceability-gate (M83)
+- **Summary**: Plan-phase acceptance-traceability gate — the deterministic half of Left-Shifted Plan Hardening. Parses `.gsd-t/domains/*/tasks.md` and asserts every behavioral task binds its acceptance criteria to a `**Files**` code path AND a named killing test; a `**Headline:** true` task must have both a real implementation path and a test. Catches the dead-deliverable class (a capability built but never tested/wired) at PLAN time instead of at verify.
+- **Auto-invoked**: Yes — by `gsd-t-phase.workflow.js` at the end of the `plan` phase, blocking before execute (alongside the adversarial pre-mortem agent, protocol `templates/prompts/pre-mortem-subagent.md`).
+- **Files**: `bin/gsd-t-traceability-gate.cjs`.
+- **Use when**: Every plan phase (automatic). Origin: NiceNote M5 shipped its headline 100MB+ chunked-read as dead code with no test → 4 verify cycles.
+- **CLI**: `gsd-t traceability-gate [--milestone <Mxx>] [--project-dir <dir>] [--tasks <file>]`. Exit 0 all traceable · 4 ≥1 untraceable AC (blocks execute) · 64 no tasks files.
+- **Contract**: `.gsd-t/contracts/plan-hardening-contract.md` v1.0.0 STABLE.
 ## Unknown Command
 If user asks for help on unrecognized command:

package/commands/gsd-t-milestone.md CHANGED

@@ -25,14 +25,21 @@ Read `.gsd-t/progress.md` (current version + completed milestones), `docs/requir
   args: {
     phase: "milestone",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): `--competition N` (N 2..5) fans out N
+    // parallel Self-MoA producers proposing different decomposition strategies
+    // (risk-first / value-first / dependency-first); a blind, different-model,
+    // rubric judge selects the winner. Coupled-thesis → pick-one (no Frankenstein).
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** Milestone decomposition is the highest-altitude decision in the system — different strategies are genuinely different. If the user invokes `/gsd-t-milestone --competition 3`, parse N (clamped 2..5) and pass `competition: N`. Because a milestone decomposition is a *coupled thesis*, the judge selects one winner whole (pick-one) and only salvages non-overlapping good line-items from the losers — it never Frankensteins. See `.gsd-t/contracts/competition-mode-contract.md`. Default off.
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: milestone defined and appended to progress.md with falsifiable SCs. Do NOT auto-partition for large/risky milestones — show the Next Up hint.
 - `status === "blocked"`: the agent needs a scoping decision from the user.

package/commands/gsd-t-partition.md CHANGED

@@ -30,14 +30,21 @@ Call the `Workflow` tool with:
     phase: "partition",
     milestone: "M{NN}",
     projectDir: ".",
-    userInput: "$ARGUMENTS"
+    userInput: "$ARGUMENTS",
+    // M82 Competition Mode (opt-in): if the user passed `--competition N` in
+    // $ARGUMENTS (N in 2..5), set competition: N. N parallel Self-MoA producers
+    // propose partitions; the OBJECTIVE oracle judge (file-disjointness scoring)
+    // picks the most-parallelizable valid decomposition. Omit / set 1 = off.
+    competition: 1
   }
 }
 ```
+**Competition Mode (`--competition N`).** Partition is the v1 beachhead for generate-and-judge: its judge is the file-disjointness oracle, so it is a calculator, not a biased critic. If the user invokes `/gsd-t-partition --competition 3`, parse N (clamped 2..5) and pass `competition: N`. The workflow fans out N candidate partitions, scores each on measured parallelism / wave-depth / boundary-cleanliness, and finalizes the winner. See `.gsd-t/contracts/competition-mode-contract.md`. Default off (single producer).
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions }` (plus `competition: { n, winner, ranked }` when Competition Mode ran).
 - `status === "complete"`: domains scoped, contracts drafted. Auto-advance to `/gsd-t-plan`.
 - `status === "partial" | "blocked"`: read `summary` for what's missing (e.g. ambiguous scope needing discussion).
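
The command text asks the caller to parse N from `$ARGUMENTS`, clamp it to 2..5, and treat an omitted flag or a value of 1 as off. One way that parsing might look is sketched below. The helper name `parseCompetition` is hypothetical, and treating values below 2 (including 0 and negatives) as off is an assumption: the diff only states that omitting the flag or setting 1 disables the mode.

```javascript
// Hypothetical helper, not part of the package. Returns the value to pass as
// `competition`: 1 means off, otherwise an integer in 2..5.
function parseCompetition(argString) {
  const m = /--competition\s+(-?\d+)/.exec(String(argString || ""));
  if (!m) return 1; // flag omitted: off
  const n = Number.parseInt(m[1], 10);
  if (n < 2) return 1; // assumption: anything below 2 is treated as off
  return Math.min(n, 5); // clamp the upper bound to 5
}

console.log(parseCompetition("split the billing domain --competition 3"));
console.log(parseCompetition("--competition 9"));
console.log(parseCompetition("no flag here"));
```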

package/commands/gsd-t-plan.md CHANGED

@@ -33,12 +33,14 @@ Read `.gsd-t/progress.md` and each domain's `scope.md`/`constraints.md`. The par
 ## Step 3: Interpret the result
-The Workflow returns `{ status, artifacts, summary, decisions }`.
+The Workflow returns `{ status, artifacts, summary, decisions, traceability?, preMortem? }`.
-- `status === "complete"`: every domain has atomic tasks; `gsd-t parallel --dry-run` validates disjointness. Auto-advance to `/gsd-t-execute`.
-- `status === "partial" | "blocked"`: read `summary` (e.g. file-overlap between domains needing re-scoping).
+- `status === "complete"`: every domain has atomic tasks; `gsd-t parallel --dry-run` validates disjointness; **M83 plan hardening passed** (acceptance-traceability gate + adversarial pre-mortem). Auto-advance to `/gsd-t-execute`.
+- `status === "partial" | "blocked"`: read `summary` (e.g. file-overlap between domains; or **M83 plan hardening blocked** — see `traceability.violations` / `preMortem.findings`: an AC not bound to a code path + killing test, or a predicted failure condition with no planned test. Fix `tasks.md` and re-run plan).
 - `status === "failed"`: read `summary`.
+**M83 Plan Hardening (runs automatically at the end of plan, blocking before execute).** Two gates ensure the plan can't produce a dead deliverable: (1) the deterministic **acceptance-traceability gate** (`gsd-t traceability-gate`) — every behavioral task's ACs must bind to a `**Files**` code path + a named test; the **Headline:** task needs both a real impl path and a test. (2) the adversarial **pre-mortem** agent (opus, fresh-context) — predicts edge-case/dead-deliverable/NFR failures and requires a test for each. Origin: NiceNote M5 shipped its headline (100MB+ chunked read) as dead code with no test, burning 4 verify cycles. Contract: `.gsd-t/contracts/plan-hardening-contract.md`.
 ## Document Ripple
 The plan agent writes per-domain `tasks.md`, updates `integration-points.md`, and adds a Decision Log entry.
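
As a rough illustration of reading a blocked plan result, the sketch below flattens `traceability.violations` and `preMortem.findings` into printable lines. The helper name `planBlockReasons` is hypothetical. The violation fields (`domain`, `task`, `kind`, `detail`) follow the gate's envelope shown in this diff, on the assumption that the workflow passes them through unchanged; the shape of a pre-mortem finding is not shown here, so findings are printed as-is.

```javascript
// Hypothetical helper, not part of the package.
function planBlockReasons(result) {
  const reasons = [];
  const violations = (result && result.traceability && result.traceability.violations) || [];
  for (const v of violations) {
    reasons.push(`[${v.kind}] ${v.domain} / ${v.task}: ${v.detail}`);
  }
  const findings = (result && result.preMortem && result.preMortem.findings) || [];
  for (const f of findings) {
    // Assumption: the finding shape is unknown, so strings pass through and
    // anything else is serialized.
    reasons.push(`[pre-mortem] ${typeof f === "string" ? f : JSON.stringify(f)}`);
  }
  return reasons;
}

// Made-up sample data for illustration only.
const sample = {
  status: "blocked",
  traceability: {
    violations: [
      { domain: "m05-reader", task: "Implement the reader", kind: "ac-without-test", detail: "names no test" },
    ],
  },
  preMortem: { findings: ["codepoint split across a chunk boundary has no planned test"] },
};
const reasons = planBlockReasons(sample);
console.log(reasons.join("\n"));
```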

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@tekyzinc/gsd-t",
-  "version": "4.0.29",
+  "version": "4.2.10",
   "description": "GSD-T: Contract-Driven Development for Claude Code — 54 slash commands with headless-by-default workflow spawning, unattended supervisor relay with event stream, graph-powered code analysis, real-time agent dashboard, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
   "author": "Tekyz, Inc.",
   "license": "MIT",

package/templates/CLAUDE-global.md CHANGED

@@ -328,7 +328,7 @@ Canonical scripts:
 - `gsd-t-integrate.workflow.js` — cross-domain wire-up + light verify-gate
 - `gsd-t-debug.workflow.js` — 2-cycle diagnose/fix/verify (CLAUDE.md Prime Rule)
 - `gsd-t-quick.workflow.js` — preflight + brief + single-task + verify-gate (M56-D4)
-- `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple)
+- `gsd-t-phase.workflow.js` — generic upper-stage runner (partition / plan / discuss / impact / milestone / prd / design-decompose / doc-ripple). **M82 Competition Mode:** an opt-in `competition: N` arg (N 2–5) on eligible upstream phases (partition / milestone / discuss / design-decompose) fans out N parallel Self-MoA producers → a judge stage → a finalizer. Partition's judge is the OBJECTIVE file-disjointness oracle (`gsd-t competition-judge --kind partition` — a calculator, not an LLM critic, immune to judge bias, the v1 beachhead); subjective phases use a blind + shuffled + different-model + rubric judge whose pick is finalized deterministically by `--kind generic`. The generative dual of the orthogonal validation triad; watershed rule = generate-and-judge ABOVE the contract, attack-and-filter BELOW. Default off. Contract: `competition-mode-contract.md` v1.0.0. **M83 Plan Hardening:** the `plan` phase runs two blocking gates before execute — a deterministic acceptance-traceability gate (`gsd-t traceability-gate`: every AC binds to a code path + a killing test; the `Headline:` task needs both impl and test) and an adversarial pre-mortem agent (opus, fresh-context, protocol `pre-mortem-subagent.md`: predicts edge-case/dead-deliverable/NFR failures, each → a required test). The temporal dual of the Red Team (attack the design at plan, not just code at verify). Contract: `plan-hardening-contract.md` v1.0.0.
 - `gsd-t-scan.workflow.js` — preflight → volume-probe → pipeline(per-slice deep finder → single verify) → synthesis → document → render (M66: fans out by codebase VOLUME, not a fixed 5-teammate dimension count; M67: deep document phase deterministically produces the full living-doc set + dimension files, per-doc fan-out)
 **Runtime-native invariant (M81 — v4.0.29+):** the Workflow sandbox provides ONLY `agent/parallel/pipeline/log/phase/budget/args` — NO `require`/`fs`/`path`/`child_process`/`process`, and `args` arrives as a JSON STRING. Each workflow is self-contained: it `JSON.parse`s `args` and delegates every CLI call (preflight, verify-gate, brief, build-coverage, ci-parity, test-data, disjointness) to inline `async` helpers that run the command via an `agent()`'s Bash (preferring project-local `bin/<tool>.cjs`, else the global `gsd-t` PATH binary) and parse the JSON envelope — preserving the M55-D5 project-local-bin invariant. The old `require("./_lib.js")` pattern threw `ReferenceError` on first eval and silently broke every workflow except scan (TD-113, fixed M81); `_lib.js` is retired as a workflow dependency.
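
The invariant above says `args` arrives as a JSON string and that each workflow parses it itself, without `require` or `fs`. A minimal sketch of such a guard, using only language built-ins, might look like this. The helper name `readArgs` is hypothetical, and the fallbacks (accepting an already-parsed object, returning `{}` on invalid JSON) are assumptions rather than behavior confirmed by the diff.

```javascript
// Hypothetical helper, not part of the package. Uses only JSON.parse, so it
// needs nothing the sandbox does not provide.
function readArgs(args) {
  if (typeof args === "string") {
    try {
      const parsed = JSON.parse(args);
      return parsed && typeof parsed === "object" ? parsed : {};
    } catch {
      return {}; // assumption: invalid JSON degrades to empty args
    }
  }
  // Assumption: tolerate an already-parsed object as well.
  return args && typeof args === "object" ? args : {};
}

const fromString = readArgs('{"phase":"plan","projectDir":".","competition":3}');
const fromObject = readArgs({ phase: "partition" });
const fromGarbage = readArgs("not json");
console.log(fromString.phase, fromObject.phase, Object.keys(fromGarbage).length);
```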

package/templates/prompts/pre-mortem-subagent.md ADDED

@@ -0,0 +1,46 @@
+# Pre-Mortem Subagent Prompt — Adversarial Plan Review (pre-execute)
+You are an adversarial Pre-Mortem reviewer. You attack the PLAN, not the code — because the code does not exist yet. Your job is to predict, BEFORE a single line is executed, how this milestone will fail: the edge cases it will hit, the deliverables it will leave hollow, and the assumptions it is quietly making. You are the generative-adversarial dual of the Red Team: the Red Team attacks finished code at verify; you attack the design at plan, so the milestone is built right the FIRST time instead of being re-litigated across verify cycles.
+**Inverted incentives.** Your value is measured by REAL failure conditions surfaced now, not by approving the plan. A plan you bless that later burns verify cycles is YOUR failure. Assume the plan is flawed and find where.
+<!-- Workflow-stage invocation -->
+**Invocation context.** When this protocol runs as a native Workflow `agent()` stage (via `templates/workflows/gsd-t-phase.workflow.js` plan phase), your **final emission MUST be a single StructuredOutput object** matching the PRE_MORTEM schema declared by the Workflow. Bash/git/Read tool use is permitted DURING analysis; the final emission is the JSON verdict.
+<!-- brief-first rule -->
+**Brief first.** If you're about to grep, read, or run something, check the brief at `$BRIEF_PATH` first (a ≤2,500-token snapshot of CLAUDE.md + contracts + scope + requirements). It identifies the milestone's acceptance criteria and high-risk surfaces — your starting attack surface. If unset/missing, fall back to reading the plan artifacts directly, but log the gap.
+## What you are given
+The milestone's PLAN: `.gsd-t/domains/*/{scope,constraints,tasks}.md`, the relevant `.gsd-t/contracts/`, and the acceptance criteria / FRs / NFRs in `docs/requirements.md`. Read the milestone's stated GOAL and its HEADLINE capability (the one thing the milestone exists to deliver).
+## Hard Rules
+- **Failure conditions = value.** A short list is failure. Exhaust every category below.
+- **A finding must be CONCRETE and FALSIFIABLE.** "Could have edge cases" is not a finding. "A multi-byte UTF-8 codepoint split across a chunk boundary in `read_file_chunk` will corrupt or stall — there is no test for it" IS a finding.
+- **Every blocking finding must become a REQUIRED TEST.** This is the core rule. Do not emit advisory notes — advisory notes get deferred, and a deferred edge case is exactly how the NiceNote M5 chunk reader shipped three distinct data-loss bugs across three verify cycles. For each finding, state the test that must exist in the plan before execute may start. If the plan already names that test, it is not a finding.
+- **The headline capability gets the hardest scrutiny.** Ask explicitly: is the milestone's reason-to-exist (a) bound to a real code path in the plan, (b) reachable from a user action / entry point, and (c) covered by a test that FAILS if that path is dead? The NiceNote M5 milestone shipped its headline (100MB+ chunked read) as DEAD CODE because the plan never required a test that exercised it. Catch that here.
+- **Deferral is illegitimate for a milestone's own headline.** If the plan defers the milestone's defining capability (or a core AC) to a later milestone, that is a blocking finding — an incomplete milestone, not a warning.
+- Style/taste is NOT a finding. Theoretical purity is NOT a finding. Only predicted, concrete, testable failure.
+## Attack Categories (exhaust ALL)
+1. **Dead-deliverable / wiring gaps** — Is every acceptance criterion bound to a code path that is actually CALLED from an entry point? Could a capability be built but never invoked (the M5 dead-code class)? Is the headline reachable from a real user action?
+2. **Boundary & edge inputs** — empty / null / huge / zero-length / off-by-one / max-size. For each data path the plan introduces: what is the worst input, and is there a test for it? (split codepoints, chunk boundaries, 0-byte files, files at exactly the threshold, unicode, path traversal.)
+3. **Resource / NFR conditions** — memory, time, file-handle, DOM-node, payload-size ceilings. Does any NFR (performance, bounded memory, scale) have a FALSIFIABLE measured acceptance check in the plan? An NFR with no measured test is a blocking finding (the NiceNote NFR-1 160k-DOM-node class).
+4. **Error & failure paths** — what happens when the new code's dependency fails, the input is malformed, the operation is interrupted mid-flight? Does the plan specify graceful degradation, and is there a test for the failure path (not just the happy path)?
+5. **State / ordering / concurrency** — actions out of order, partial completion, re-entry, two things racing over a shared resource (the verify-gate port-race class). Does the plan account for it?
+6. **Contract & integration seams** — at every cross-domain boundary the plan defines, do both sides agree on shape, error behavior, and who owns the shared file? Is there an integration test for the seam, not just unit tests on each side?
+7. **Shallow-test traps** — does the plan's testing approach risk vacuous passes? (assertions gated behind `if (count > 0)`, `toBeVisible()` standing in for a functional check, `toHaveCount` with no state assertion.) Flag any planned test that would pass on a broken implementation.
+8. **Missing acceptance coverage** — read requirements. Is there an AC / FR / NFR with no task that delivers it, or no test that proves it?
+## Verdict
+- **BLOCK** — one or more concrete, falsifiable failure conditions that the plan does not yet cover with a required test. The plan may NOT proceed to execute until each blocking finding is answered by a named required test (or the design is changed to make the condition impossible). This is the FAIL-equivalent.
+- **CLEARED** — exhaustive search; every predicted failure condition is already covered by a named test in the plan, the headline is bound+reachable+tested, and every NFR has a measured acceptance check. (The plan-quality equivalent of GRUDGING-PASS — earned by exhaustion, not by haste.)
+## Output (StructuredOutput)
+Emit a single object: `{ verdict: "BLOCK" | "CLEARED", findings: [ { severity: "CRITICAL"|"HIGH"|"MEDIUM"|"LOW", category, condition, whyItFails, requiredTest, affectedAC? } ], headlineAssessment: { capability, boundToPath, reachable, hasKillingTest }, notes }`.
+`requiredTest` is the load-bearing field: the specific test that must be added to the plan to close the finding. A finding without a `requiredTest` is incomplete — every blocking finding converts to a test the plan must adopt before execute.
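
The output rules in the prompt above (a verdict of `BLOCK` or `CLEARED`, every finding carrying a `requiredTest`, and `CLEARED` only when the headline is bound, reachable, and tested) could be checked with something like the sketch below. It is illustrative only: the real PRE_MORTEM schema is declared by the Workflow and is not shown in this diff, `checkPreMortem` is a hypothetical name, and treating the three `headlineAssessment` flags as booleans is an assumption.

```javascript
// Hypothetical checker: the actual PRE_MORTEM schema lives in the Workflow.
const VERDICTS = ["BLOCK", "CLEARED"];
const SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];
const REQUIRED_STRINGS = ["category", "condition", "whyItFails", "requiredTest"];

// Returns a list of human-readable problems; an empty list means the shape is acceptable.
function checkPreMortem(output) {
  if (!output || typeof output !== "object") return ["output is not an object"];
  const problems = [];
  if (!VERDICTS.includes(output.verdict)) {
    problems.push("verdict must be BLOCK or CLEARED");
  }

  const findings = Array.isArray(output.findings) ? output.findings : [];
  findings.forEach((raw, i) => {
    const f = raw && typeof raw === "object" ? raw : {};
    if (!SEVERITIES.includes(f.severity)) {
      problems.push(`findings[${i}]: unknown severity`);
    }
    for (const key of REQUIRED_STRINGS) {
      if (typeof f[key] !== "string" || f[key].trim() === "") {
        problems.push(`findings[${i}]: missing ${key}`);
      }
    }
  });

  // Assumption: the three headline flags are booleans.
  const h = output.headlineAssessment || {};
  const headlineOk =
    h.boundToPath === true && h.reachable === true && h.hasKillingTest === true;
  if (output.verdict === "CLEARED" && !headlineOk) {
    problems.push("CLEARED requires a bound, reachable, tested headline");
  }
  if (output.verdict === "BLOCK" && findings.length === 0) {
    problems.push("BLOCK requires at least one finding");
  }
  return problems;
}
```

A `BLOCK` verdict whose only finding omits `requiredTest` yields one problem, which mirrors the prompt's statement that a finding without a `requiredTest` is incomplete.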