@tekyzinc/gsd-t 2.71.14 → 2.71.16
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +15 -0
- package/bin/design-orchestrator.js +95 -4
- package/bin/orchestrator.js +109 -1
- package/commands/gsd-t-design-build.md +30 -373
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED

@@ -2,6 +2,21 @@
 
 All notable changes to GSD-T are documented here. Updated with each release.
 
+## [2.71.16] - 2026-04-08
+
+### Added (orchestrator — automated AI review loop)
+- **Automated review before human review** — orchestrator now spawns an independent reviewer Claude (no builder context) that compares built components against design contracts. If issues found, spawns a fixer Claude, re-measures, and re-reviews (max 2 cycles). Only after automated review passes do items reach human review. This is the Term 2 equivalent, running deterministically in JavaScript.
+- **Review report persistence** — each auto-review cycle writes results to `.gsd-t/design-review/auto-review/`. Unresolved issues are written to `{phase}-unresolved.json` for human visibility.
+- **Structured review output** — reviewer uses `[REVIEW_ISSUES]` markers for reliable parsing. Fallback parser catches DEVIATION/FAIL/CRITICAL keywords.
+
+### Pipeline (updated)
+Build → Measure → **Automated AI Review** (reviewer → fixer → re-review loop) → Human Review → Next Tier
+
+## [2.71.15] - 2026-04-08
+
+### Changed (design-build command → orchestrator delegate)
+- **`gsd-t-design-build.md` now delegates to the JS orchestrator** — the 388-line prompt-based command is replaced with a thin wrapper that runs `gsd-t design-build`. Both `/user:gsd-t-design-build` and `gsd-t design-build` now end up in the same deterministic pipeline. No more prompt-based gates that get skipped.
+
 ## [2.71.14] - 2026-04-08
 
 ### Added (design-build orchestrator)
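The `[REVIEW_ISSUES]` marker convention described in the 2.71.16 notes can be exercised in isolation — a minimal sketch of extracting and parsing the block, where the sample reviewer `output` is hypothetical:

```javascript
// Minimal sketch: pull the JSON issues array out of [REVIEW_ISSUES] markers.
// The `output` string is a hypothetical reviewer transcript, for illustration.
const output = [
  "Reviewed 2 components against their contracts.",
  "[REVIEW_ISSUES]",
  '[{"component": "DonutChart", "severity": "high", "description": "Grid gap: contract 16px, actual 24px"}]',
  "[/REVIEW_ISSUES]",
].join("\n");

// Non-greedy match so only the first marker pair is consumed.
const match = output.match(/\[REVIEW_ISSUES\]([\s\S]*?)\[\/REVIEW_ISSUES\]/);
const issues = match ? JSON.parse(match[1].trim()) : [];

console.log(issues.length);       // 1
console.log(issues[0].severity);  // high
```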
package/bin/design-orchestrator.js
CHANGED

@@ -300,6 +300,91 @@ function buildFixPrompt(phase, needsWork) {
   return `Apply these specific fixes to ${phase} components:\n\n${fixes}\n\nApply the changes and EXIT. Do not rebuild anything else.`;
 }
 
+// ─── Automated AI Review (Term 2 equivalent) ───────────────────────────────
+
+function buildReviewPrompt(phase, items, measurements, projectDir, ports) {
+  const singular = PHASE_SINGULAR[phase];
+  const contractsDir = path.join(projectDir, CONTRACTS_DIR);
+
+  const componentList = items.map(c => {
+    const sourcePath = c.sourcePath || guessPaths(phase, c);
+    return `- **${c.componentName}** — contract: ${c.fullContractPath}, source: ${sourcePath}, selector: \`${c.selector || "." + c.id}\``;
+  }).join("\n");
+
+  // Include any measurement failures for context
+  const failedMeasurements = [];
+  for (const item of items) {
+    const m = measurements[item.id] || [];
+    const failures = m.filter(x => !x.pass);
+    if (failures.length > 0) {
+      failedMeasurements.push(`- ${item.componentName}: ${failures.map(f => `${f.property}: expected ${f.expected}, got ${f.actual}`).join("; ")}`);
+    }
+  }
+  const measurementContext = failedMeasurements.length > 0
+    ? `\n## Known Measurement Failures\nPlaywright already detected these — verify they are real issues:\n${failedMeasurements.join("\n")}\n`
+    : "";
+
+  return `You are an INDEPENDENT design reviewer. You have NO knowledge of how these components were built. Your job is to compare the built ${phase} against their design contracts and find deviations.
+
+## Components to Review
+${componentList}
+
+${measurementContext}
+## Review Process
+
+For EACH component:
+1. Read the design contract file (path given above) — note every specified property value
+2. Read the source file — check that specified values are implemented correctly
+3. Use Playwright to render the component at http://localhost:${ports.reviewPort}/ and measure:
+   - Does the component render and have correct dimensions?
+   - Do colors, fonts, spacing, border-radius match the contract?
+   - For charts: correct chart type, orientation, axis labels, legend position, data format?
+   - For layouts: correct grid columns, gap, padding, child count and arrangement?
+   - For interactive elements: correct states, hover effects, click behavior?
+4. Compare contract values against actual rendered values — be SPECIFIC (exact px, hex, counts)
+
+## Output Format
+
+Output your findings between these markers. Each issue must have component, severity (critical/high/medium/low), and description with SPECIFIC contract vs. actual values:
+
+[REVIEW_ISSUES]
+[
+  {"component": "ComponentName", "severity": "critical", "description": "Contract specifies donut chart but rendered as pie chart (no inner radius)"},
+  {"component": "ComponentName", "severity": "high", "description": "Grid gap: contract 16px, actual 24px"}
+]
+[/REVIEW_ISSUES]
+
+If ALL components match their contracts, output:
+[REVIEW_ISSUES]
+[]
+[/REVIEW_ISSUES]
+
+## Rules
+- You write ZERO code. You ONLY review.
+- Be HARSH. Your value is in catching what the builder missed.
+- NEVER say "looks close" or "appears to match" — give SPECIFIC values.
+- Every contract property must be verified. Missing verification = missed issue.
+- Severity guide: critical = wrong component type, missing element, broken render. high = wrong dimensions, colors, layout. medium = spacing/padding off. low = minor visual difference.`;
+}
+
+function buildAutoFixPrompt(phase, issues, items, projectDir) {
+  const issueList = issues.map((issue, i) => {
+    const item = items.find(c => c.componentName === issue.component);
+    const contractPath = item ? item.fullContractPath : "check .gsd-t/contracts/design/";
+    return `${i + 1}. [${issue.severity}] **${issue.component}** — ${issue.description}\n   Contract: ${contractPath}`;
+  }).join("\n");
+
+  return `The automated design reviewer found these issues. Fix each one by reading the design contract and correcting the implementation.
+
+## Issues to Fix
+${issueList}
+
+## Rules
+- Read each component's design contract for the correct values — do NOT guess
+- Fix ONLY the listed issues — do not modify other components or add features
+- After fixing all issues, EXIT. Do not start servers or ask for review.`;
+}
+
 // ─── Summary ────────────────────────────────────────────────────────────────
 
 function formatSummary(phase, result) {
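The fixer-prompt assembly above reduces to numbering each issue and resolving its contract path, with a fallback when no matching item exists. The core mapping can be run standalone — a sketch using hypothetical issue/item data:

```javascript
// Sketch of the issue-list formatting used when handing issues to a fixer.
// `issues` and `items` are hypothetical sample data, not real project state.
const issues = [
  { severity: "critical", component: "DonutChart", description: "Rendered as pie chart; contract specifies donut" },
  { severity: "high", component: "StatCard", description: "Padding: contract 24px, actual 16px" },
];
const items = [
  { componentName: "DonutChart", fullContractPath: ".gsd-t/contracts/design/elements/donut-chart.md" },
];

const issueList = issues.map((issue, i) => {
  const item = items.find(c => c.componentName === issue.component);
  // StatCard has no matching item, so it gets the generic fallback path.
  const contractPath = item ? item.fullContractPath : "check .gsd-t/contracts/design/";
  return `${i + 1}. [${issue.severity}] **${issue.component}** — ${issue.description}\n   Contract: ${contractPath}`;
}).join("\n");

console.log(issueList);
```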
@@ -329,11 +414,13 @@ ${BOLD}Pipeline:${RESET}
   1. Read contracts from .gsd-t/contracts/design/
   2. Start dev server + review server
   3. For each tier (elements → widgets → pages):
-     a. Spawn Claude to build components
+     a. Spawn Claude (builder) to build components from contracts
      b. Measure with Playwright
-     c.
-     d.
-     e.
+     c. Spawn Claude (reviewer) to compare against contracts — independent, no builder context
+     d. If reviewer finds issues → spawn Claude (fixer) → re-measure → re-review (max 2 cycles)
+     e. Queue for human review (only after automated review passes)
+     f. Wait for human review submission (blocks until human approves)
+     g. Process feedback, proceed to next tier
 `);
 }
 
@@ -351,11 +438,15 @@ const designBuildWorkflow = {
     timeout: 600_000,
     devServerTimeout: 30_000,
     maxReviewCycles: 3,
+    maxAutoReviewCycles: 2,
+    reviewTimeout: 300_000,
   },
   completionMessage: "All done. Run your app to verify: npm run dev",
 
   discoverWork,
   buildPrompt,
+  buildReviewPrompt,
+  buildAutoFixPrompt,
   measure,
   buildQueueItem,
   buildFixPrompt,
package/bin/orchestrator.js
CHANGED

@@ -576,7 +576,78 @@ ${BOLD}Phases:${RESET} ${this.wf.phases.join(" → ")}
       measurements = this.wf.measure(projectDir, phase, items, { devPort, reviewPort }) || {};
     }
 
-    //
+    // 6d.5. Automated AI review loop (Term 2 equivalent)
+    // Spawns an independent reviewer Claude that compares built output against contracts.
+    // If issues found → spawn fixer Claude → re-measure → re-review until clean.
+    const maxAutoReviewCycles = this.wf.defaults?.maxAutoReviewCycles || 2;
+    if (this.wf.buildReviewPrompt) {
+      let autoReviewCycle = 0;
+      let autoReviewClean = false;
+
+      while (autoReviewCycle < maxAutoReviewCycles && !autoReviewClean) {
+        autoReviewCycle++;
+        heading(`Automated Review — ${phase} (cycle ${autoReviewCycle}/${maxAutoReviewCycles})`);
+
+        // Spawn reviewer Claude — independent, no builder context
+        const reviewPrompt = this.wf.buildReviewPrompt(phase, items, measurements, projectDir, { devPort, reviewPort });
+        log(`\n${CYAN} ⚙${RESET} Spawning reviewer Claude for ${phase}...`);
+        const reviewTimeout = this.wf.defaults?.reviewTimeout || 300_000;
+        const reviewResult = this.spawnClaude(projectDir, reviewPrompt, reviewTimeout);
+
+        // Parse reviewer output for issues
+        const issues = this.wf.parseReviewResult
+          ? this.wf.parseReviewResult(reviewResult.output, phase)
+          : this._parseDefaultReviewResult(reviewResult.output);
+
+        if (reviewResult.exitCode === 0) {
+          success(`Reviewer finished in ${reviewResult.duration}s`);
+        } else {
+          warn(`Reviewer exited with code ${reviewResult.exitCode} after ${reviewResult.duration}s`);
+        }
+
+        // Write review report
+        const reportDir = path.join(this.getReviewDir(projectDir), "auto-review");
+        ensureDir(reportDir);
+        fs.writeFileSync(
+          path.join(reportDir, `${phase}-cycle-${autoReviewCycle}.json`),
+          JSON.stringify({ cycle: autoReviewCycle, issues, output: reviewResult.output.slice(0, 5000) }, null, 2)
+        );
+
+        if (issues.length === 0) {
+          autoReviewClean = true;
+          success(`Automated review passed — no issues found in ${phase}`);
+        } else {
+          warn(`Automated review found ${issues.length} issue(s) in ${phase}`);
+          for (const issue of issues) {
+            dim(`${issue.component || "?"}: ${issue.description || issue.reason || "issue"} [${issue.severity || "medium"}]`);
+          }
+
+          if (autoReviewCycle < maxAutoReviewCycles) {
+            // Spawn fixer Claude with the issues
+            const fixPrompt = this.wf.buildAutoFixPrompt
+              ? this.wf.buildAutoFixPrompt(phase, issues, items, projectDir)
+              : this._defaultAutoFixPrompt(phase, issues);
+
+            log(`\n${CYAN} ⚙${RESET} Spawning fixer Claude for ${issues.length} issue(s)...`);
+            const fixResult = this.spawnClaude(projectDir, fixPrompt, 120_000);
+            if (fixResult.exitCode === 0) success(`Fixer finished in ${fixResult.duration}s`);
+            else warn(`Fixer exited with code ${fixResult.exitCode}`);
+
+            // Re-measure after fixes
+            if (!skipMeasure && this.wf.measure) {
+              measurements = this.wf.measure(projectDir, phase, items, { devPort, reviewPort }) || {};
+            }
+          } else {
+            warn(`Max auto-review cycles reached — ${issues.length} issue(s) will go to human review`);
+            // Attach unresolved issues to measurements for human visibility
+            const issueFile = path.join(this.getReviewDir(projectDir), "auto-review", `${phase}-unresolved.json`);
+            fs.writeFileSync(issueFile, JSON.stringify(issues, null, 2));
+          }
+        }
+      }
+    }
+
+    // 6e. Human review cycle
     let reviewCycle = 0;
     let allApproved = false;
 
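Stripped of the Claude spawning and file I/O, the bounded review→fix loop above has a small core: review, fix if another cycle remains, stop when clean or when cycles run out. A sketch with stubbed review/fix callbacks — all names here are illustrative, not the orchestrator's real API:

```javascript
// Sketch of the bounded review→fix loop with stubbed steps.
// `review` returns an issues array; `fix` runs only when another cycle
// is still available (no fix on the last cycle — issues go to humans).
function runAutoReview({ review, fix, maxCycles = 2 }) {
  let cycle = 0;
  let issues = [];
  while (cycle < maxCycles) {
    cycle++;
    issues = review(cycle);
    if (issues.length === 0) return { clean: true, cycle, issues };
    if (cycle < maxCycles) fix(issues);
  }
  return { clean: false, cycle, issues };
}

// Stub: first review finds one issue, second finds none.
let fixes = 0;
const result = runAutoReview({
  review: cycle => (cycle === 1 ? [{ description: "gap mismatch" }] : []),
  fix: () => { fixes++; },
});
console.log(result.clean, result.cycle, fixes); // true 2 1
```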
@@ -655,6 +726,43 @@ ${BOLD}Phases:${RESET} ${this.wf.phases.join(" → ")}
     };
   }
 
+  _parseDefaultReviewResult(output) {
+    // Try to parse JSON issues array from reviewer output.
+    // Reviewer is instructed to output JSON between markers.
+    const jsonMatch = output.match(/\[REVIEW_ISSUES\]([\s\S]*?)\[\/REVIEW_ISSUES\]/);
+    if (jsonMatch) {
+      try { return JSON.parse(jsonMatch[1].trim()); } catch { /* fall through */ }
+    }
+    // Fallback: look for PASS/FAIL verdict
+    if (/\bGRUDGING PASS\b/i.test(output) || /\bPASS\b.*\b0 issues\b/i.test(output) || /no issues found/i.test(output)) {
+      return [];
+    }
+    // If we see FAIL or DEVIATION keywords, extract what we can
+    const issues = [];
+    const deviationRegex = /(?:DEVIATION|FAIL|CRITICAL|ISSUE)[:\s—-]+(.+)/gi;
+    let match;
+    while ((match = deviationRegex.exec(output)) !== null) {
+      issues.push({ description: match[1].trim(), severity: "medium" });
+    }
+    return issues;
+  }
+
+  _defaultAutoFixPrompt(phase, issues) {
+    const issueList = issues.map((issue, i) =>
+      `${i + 1}. [${issue.severity || "medium"}] ${issue.component || "unknown"}: ${issue.description || issue.reason || "fix needed"}`
+    ).join("\n");
+
+    return `The automated reviewer found these issues in the ${phase} components. Fix each one.
+
+## Issues
+${issueList}
+
+## Rules
+- Read the relevant design contract for each component to verify the correct values
+- Fix ONLY the listed issues — do not modify other components
+- After fixing, EXIT. Do not start servers or ask for review.`;
+  }
+
   _defaultFixPrompt(phase, needsWork) {
     const fixes = needsWork.map(item => {
       const parts = [`Fix ${item.id}:`];
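The keyword fallback in `_parseDefaultReviewResult` can be checked against a sample transcript — a sketch where the reviewer output is hypothetical:

```javascript
// Sketch: the DEVIATION/FAIL/CRITICAL keyword fallback, applied to a
// hypothetical reviewer transcript that lacks [REVIEW_ISSUES] markers.
const output = [
  "DEVIATION: grid gap is 24px, contract says 16px",
  "Colors and typography match the contract.",
  "FAIL — legend rendered left, contract says right",
].join("\n");

const issues = [];
// Same pattern as the diff: keyword, then a separator run, then the message.
const deviationRegex = /(?:DEVIATION|FAIL|CRITICAL|ISSUE)[:\s—-]+(.+)/gi;
let match;
while ((match = deviationRegex.exec(output)) !== null) {
  issues.push({ description: match[1].trim(), severity: "medium" });
}

console.log(issues.length); // 2
```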
@@ -1,387 +1,44 @@
|
|
|
1
|
-
# GSD-T: Design Build —
|
|
1
|
+
# GSD-T: Design Build — Deterministic Design-to-Code Pipeline
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
This command delegates to the **JavaScript orchestrator** for ironclad flow control. Do NOT attempt to run the build pipeline inline — the orchestrator handles Claude spawning, measurement, review gates, and feedback processing deterministically.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
## Step 1: Launch the Orchestrator
|
|
6
6
|
|
|
7
|
-
## Step 0: Validate Prerequisites
|
|
8
|
-
|
|
9
|
-
1. Read `.gsd-t/contracts/design/INDEX.md` — if missing, STOP: "Run `/user:gsd-t-design-decompose` first to create design contracts."
|
|
10
|
-
2. Read `.gsd-t/progress.md` for current state
|
|
11
|
-
3. Verify `scripts/gsd-t-design-review-server.js` exists (GSD-T package)
|
|
12
|
-
- Get the GSD-T install path: `npm root -g` → `{global_root}/@tekyzinc/gsd-t/scripts/`
|
|
13
|
-
- If not found: "Update GSD-T: `/user:gsd-t-version-update`"
|
|
14
|
-
|
|
15
|
-
## Step 1: Start Infrastructure
|
|
16
|
-
|
|
17
|
-
### 1a. Dev Server
|
|
18
|
-
|
|
19
|
-
Check if a dev server is running:
|
|
20
|
-
```bash
|
|
21
|
-
lsof -i :5173 2>/dev/null | head -2
|
|
22
|
-
```
|
|
23
|
-
|
|
24
|
-
If not running, detect and start:
|
|
25
|
-
```bash
|
|
26
|
-
# Check package.json for dev command
|
|
27
|
-
npm run dev &
|
|
28
|
-
# Wait for server to be ready
|
|
29
|
-
for i in $(seq 1 30); do curl -s http://localhost:5173 > /dev/null 2>&1 && break; sleep 1; done
|
|
30
|
-
```
|
|
31
|
-
|
|
32
|
-
Record the dev server port as `$DEV_PORT`.
|
|
33
|
-
|
|
34
|
-
### 1b. Review Server
|
|
35
|
-
|
|
36
|
-
Find the review server script:
|
|
37
|
-
```bash
|
|
38
|
-
GSD_ROOT=$(npm root -g)/@tekyzinc/gsd-t/scripts
|
|
39
|
-
```
|
|
40
|
-
|
|
41
|
-
Start the review server as a background process:
|
|
42
|
-
```bash
|
|
43
|
-
node $GSD_ROOT/gsd-t-design-review-server.js \
|
|
44
|
-
--target http://localhost:$DEV_PORT \
|
|
45
|
-
--project $PWD \
|
|
46
|
-
--port 3456 &
|
|
47
|
-
REVIEW_PID=$!
|
|
48
|
-
```
|
|
49
|
-
|
|
50
|
-
Verify it's running:
|
|
51
|
-
```bash
|
|
52
|
-
curl -s http://localhost:3456/review/api/status
|
|
53
|
-
```
|
|
54
|
-
|
|
55
|
-
### 1c. Launch Term 2 (Independent Review Session)
|
|
56
|
-
|
|
57
|
-
Write the review prompt to disk (Term 2 reads this, not Term 1's context):
|
|
58
|
-
```bash
|
|
59
|
-
cat > .gsd-t/design-review/review-prompt.md << 'PROMPT'
|
|
60
|
-
You are the design review agent (Term 2). You are an INDEPENDENT reviewer — you have NO knowledge of how components were built. Your job is to compare built output against design contracts.
|
|
61
|
-
|
|
62
|
-
## Your Loop
|
|
63
|
-
|
|
64
|
-
1. Poll `.gsd-t/design-review/queue/` every 5 seconds for new JSON files
|
|
65
|
-
2. For each queue item:
|
|
66
|
-
a. Read the queue JSON — it has component name, selector, measurements, source path
|
|
67
|
-
b. Read the matching design contract from `.gsd-t/contracts/design/`
|
|
68
|
-
c. Open the app at http://localhost:3456/ (proxied through review server)
|
|
69
|
-
d. Use Playwright to navigate to the component and evaluate it against the contract
|
|
70
|
-
e. AI Review — check what measurements can't catch:
|
|
71
|
-
- Does the visual hierarchy feel right?
|
|
72
|
-
- Are proportions correct even if individual measurements pass?
|
|
73
|
-
- Does the component look like the contract describes, holistically?
|
|
74
|
-
f. If CRITICAL issues found (wrong chart type, missing elements, wrong data model):
|
|
75
|
-
- Write rejection to `.gsd-t/design-review/rejected/{id}.json`:
|
|
76
|
-
`{ "id": "...", "reason": "...", "severity": "critical", "timestamp": "..." }`
|
|
77
|
-
- The review server will notify Term 1 automatically
|
|
78
|
-
g. If no critical issues → the component passes to human review (review UI handles this)
|
|
79
|
-
3. When `.gsd-t/design-review/shutdown.json` appears → exit cleanly
|
|
80
|
-
|
|
81
|
-
## Rules
|
|
82
|
-
- You write ZERO code. You ONLY review.
|
|
83
|
-
- You have NO context about how anything was built — judge purely by contract vs. output.
|
|
84
|
-
- Be harsh. Your value is in catching what the builder missed.
|
|
85
|
-
- Check `.gsd-t/contracts/design/elements/`, `widgets/`, and `pages/` for specs.
|
|
86
|
-
PROMPT
|
|
87
|
-
```
|
|
88
|
-
|
|
89
|
-
Launch a new terminal with an independent Claude session:
|
|
90
|
-
```bash
|
|
91
|
-
# macOS
|
|
92
|
-
osascript -e "tell application \"Terminal\" to do script \"cd $(pwd) && claude --print 'Read .gsd-t/design-review/review-prompt.md and execute the instructions. This is your only directive.'\""
|
|
93
|
-
```
|
|
94
|
-
|
|
95
|
-
```bash
|
|
96
|
-
# Linux (fallback)
|
|
97
|
-
gnome-terminal -- bash -c "cd $(pwd) && claude --print 'Read .gsd-t/design-review/review-prompt.md and execute the instructions. This is your only directive.'; exec bash"
|
|
98
|
-
```
|
|
99
|
-
|
|
100
|
-
Display to user:
|
|
101
|
-
```
|
|
102
|
-
Design Build — Infrastructure Ready
|
|
103
|
-
✓ Dev server: http://localhost:$DEV_PORT
|
|
104
|
-
✓ Review server: http://localhost:3456
|
|
105
|
-
✓ Review UI: http://localhost:3456/review
|
|
106
|
-
✓ Term 2: Independent review session launched
|
|
107
|
-
```
|
|
108
|
-
|
|
109
|
-
Auto-open the review UI:
|
|
110
|
-
```bash
|
|
111
|
-
open http://localhost:3456/review 2>/dev/null || xdg-open http://localhost:3456/review 2>/dev/null
|
|
112
|
-
```
|
|
113
|
-
|
|
114
|
-
## Step 2: Build Phase — Elements
|
|
115
|
-
|
|
116
|
-
Read all element contracts from `.gsd-t/contracts/design/elements/`. Sort by any `order` field or alphabetically.
|
|
117
|
-
|
|
118
|
-
For each element contract:
|
|
119
|
-
|
|
120
|
-
### 2a. Build the Element
|
|
121
|
-
|
|
122
|
-
Read the element contract. Build or update the component at the specified source path. Follow the contract exactly:
|
|
123
|
-
- Chart type, dimensions, colors, spacing, typography
|
|
124
|
-
- Props interface matching the contract's data model
|
|
125
|
-
- Import patterns (use existing design system components where specified)
|
|
126
|
-
|
|
127
|
-
### 2b. Render-Measure-Compare
|
|
128
|
-
|
|
129
|
-
After building, use Playwright to measure the rendered component against the contract specs:
|
|
130
|
-
|
|
131
|
-
```javascript
|
|
132
|
-
// In Playwright page.evaluate():
|
|
133
|
-
const el = document.querySelector(selector);
|
|
134
|
-
const s = getComputedStyle(el);
|
|
135
|
-
const rect = el.getBoundingClientRect();
|
|
136
|
-
// Measure each spec property from the contract
|
|
137
|
-
// Compare against expected values
|
|
138
|
-
// Build measurements array: [{ property, expected, actual, pass }]
|
|
139
|
-
```
|
|
140
|
-
|
|
141
|
-
### 2c. Queue for Review
|
|
142
|
-
|
|
143
|
-
Write the queue JSON to `.gsd-t/design-review/queue/{element-id}.json`:
|
|
144
|
-
```json
|
|
145
|
-
{
|
|
146
|
-
"id": "element-donut-chart",
|
|
147
|
-
"name": "DonutChart",
|
|
148
|
-
"type": "element",
|
|
149
|
-
"order": 1,
|
|
150
|
-
"selector": "svg[viewBox='0 0 200 200']",
|
|
151
|
-
"sourcePath": "src/components/elements/DonutChart.vue",
|
|
152
|
-
"route": "/",
|
|
153
|
-
"measurements": [
|
|
154
|
-
{ "property": "chart type", "expected": "donut", "actual": "donut", "pass": true },
|
|
155
|
-
...
|
|
156
|
-
]
|
|
157
|
-
}
|
|
158
|
-
```
|
|
159
|
-
|
|
160
|
-
### 2d. Check for Auto-Rejections
|
|
161
|
-
|
|
162
|
-
After queuing, check for immediate rejection from Term 2:
|
|
163
|
-
```bash
|
|
164
|
-
# Brief wait for Term 2 to process
|
|
165
|
-
sleep 3
|
|
166
|
-
ls .gsd-t/design-review/rejected/{element-id}.json 2>/dev/null
|
|
167
|
-
```
|
|
168
|
-
|
|
169
|
-
If rejected:
|
|
170
|
-
- Read the rejection reason
|
|
171
|
-
- Fix the element based on the rejection feedback
|
|
172
|
-
- Re-measure and re-queue (max 2 auto-rejection cycles per element)
|
|
173
|
-
|
|
174
|
-
### 2e. Continue Building
|
|
175
|
-
|
|
176
|
-
Continue building remaining elements. Don't wait for human review — queue them all.
|
|
177
|
-
|
|
178
|
-
## Step 3: Wait for Human Review (BLOCKING)
|
|
179
|
-
|
|
180
|
-
**This is a BLOCKING step. Do NOT proceed to Step 4 until the human submits their review. Do NOT skip this step. Do NOT treat this as optional.**
|
|
181
|
-
|
|
182
|
-
After all elements are built and queued, **STOP** and enter the poll loop:
|
|
183
|
-
|
|
184
|
-
```
|
|
185
|
-
Waiting for human review of elements...
|
|
186
|
-
Review UI: http://localhost:3456/review
|
|
187
|
-
{N} elements queued, awaiting submission
|
|
188
|
-
```
|
|
189
|
-
|
|
190
|
-
Poll for the review-complete signal — **actually run this bash loop**:
|
|
191
7
|
```bash
|
|
192
|
-
|
|
193
|
-
sleep 5
|
|
194
|
-
done
|
|
8
|
+
gsd-t design-build $ARGUMENTS
|
|
195
9
|
```
|
|
196
10
|
|
|
197
|
-
|
|
198
|
-
|
|
199
|
-
## Step 4: Process Feedback
|
|
11
|
+
That's it. The orchestrator handles everything:
|
|
200
12
|
|
|
201
|
-
|
|
13
|
+
1. Reads contracts from `.gsd-t/contracts/design/`
|
|
14
|
+
2. Starts dev server + review server
|
|
15
|
+
3. For each tier (elements → widgets → pages):
|
|
16
|
+
- Spawns Claude to build components from contracts
|
|
17
|
+
- Measures with Playwright
|
|
18
|
+
- Queues for human review
|
|
19
|
+
- **Blocks in a JS polling loop** until the human submits (ironclad gate)
|
|
20
|
+
- Processes feedback, applies fixes if needed
|
|
21
|
+
- Proceeds to next tier only after approval
|
|
202
22
|
|
|
203
|
-
|
|
23
|
+
## Options
|
|
204
24
|
|
|
205
|
-
|
|
206
|
-
- Mark element as complete
|
|
207
|
-
- No action needed
|
|
25
|
+
Pass any of these as `$ARGUMENTS`:
|
|
208
26
|
|
|
209
|
-
|
|
210
|
-
|
|
211
|
-
|
|
212
|
-
|
|
213
|
-
|
|
214
|
-
|
|
215
|
-
|
|
216
|
-
|
|
217
|
-
-
|
|
27
|
+
| Flag | Purpose |
|
|
28
|
+
|------|---------|
|
|
29
|
+
| `--resume` | Resume from last saved state after interruption |
|
|
30
|
+
| `--tier <name>` | Start from a specific tier (`elements`, `widgets`, `pages`) |
|
|
31
|
+
| `--project <dir>` | Target project directory (default: current directory) |
|
|
32
|
+
| `--dev-port <N>` | Dev server port (default: 5173) |
|
|
33
|
+
| `--review-port <N>` | Review server port (default: 3456) |
|
|
34
|
+
| `--timeout <sec>` | Claude timeout per tier in seconds (default: 600) |
|
|
35
|
+
| `--skip-measure` | Skip Playwright measurement (human-review only) |
|
|
218
36
|
|
|
219
|
-
|
|
220
|
-
- Read the comment — it describes a change to make
|
|
221
|
-
- Implement the change described in the comment
|
|
222
|
-
- Re-measure the element
|
|
223
|
-
- Re-queue for review
|
|
37
|
+
## Prerequisites
|
|
224
38
|
|
|
225
|
-
|
|
226
|
-
-
|
|
227
|
-
- Read the comment for additional context or fixes beyond the property changes
|
|
228
|
-
- Implement any additional changes
|
|
229
|
-
- Re-measure and re-queue
|
|
230
|
-
|
|
231
|
-
After processing all feedback:
|
|
232
|
-
- Clear the queue: `rm .gsd-t/design-review/queue/*.json`
|
|
233
|
-
- Delete the signal: `rm .gsd-t/design-review/review-complete.json`
|
|
234
|
-
- Clear feedback: `rm .gsd-t/design-review/feedback/*.json`
|
|
235
|
-
|
|
236
|
-
If any elements were re-queued → return to Step 3 (max 3 review cycles total).
|
|
237
|
-
|
|
238
|
-
## Step 5: Widget Assembly Phase
|
|
239
|
-
|
|
240
|
-
**GATE: Do NOT start this step until ALL elements from Step 2 have been reviewed and approved by the human in the review UI.**
|
|
241
|
-
|
|
242
|
-
1. Read widget contracts from `.gsd-t/contracts/design/widgets/`
|
|
243
|
-
2. For each widget:
|
|
244
|
-
a. Build the widget — MUST import approved element components, not rebuild them inline
|
|
245
|
-
b. Verify imports: grep the widget file for element imports
|
|
246
|
-
c. **Render-Measure-Compare** — use Playwright to measure the assembled widget against its contract specs:
|
|
247
|
-
```javascript
|
|
248
|
-
// In Playwright page.evaluate():
|
|
249
|
-
const widget = document.querySelector(widgetSelector);
|
|
250
|
-
const s = getComputedStyle(widget);
|
|
251
|
-
const rect = widget.getBoundingClientRect();
|
|
252
|
-
// Measure layout properties from widget contract:
|
|
253
|
-
// - grid-template-columns / grid-template-rows (column count, track sizes)
|
|
254
|
-
// - gap / row-gap / column-gap
|
|
255
|
-
// - padding, margin, border-radius
|
|
256
|
-
// - width, height, aspect-ratio
|
|
257
|
-
// Count direct children and verify against contract's expected child count
|
|
258
|
-
const children = widget.children;
|
|
259
|
-
const childRects = [...children].map(c => c.getBoundingClientRect());
|
|
260
|
-
// Verify children-per-row: count children whose top is within 2px of first child's top
|
|
261
|
-
const firstRowTop = childRects[0]?.top;
|
|
262
|
-
const childrenPerRow = childRects.filter(r => Math.abs(r.top - firstRowTop) < 2).length;
|
|
263
|
-
// Compare ALL measured values against contract specs
|
|
264
|
-
// Build measurements array: [{ property, expected, actual, pass }]
|
|
265
|
-
```
|
|
266
|
-
**Every measurement failure MUST appear in the queue JSON measurements array with `pass: false`.**
|
|
267
|
-
d. Write queue JSON to `.gsd-t/design-review/queue/{widget-id}.json` (same format as elements, with `"type": "widget"`)
|
|
268
|
-
e. Check for auto-rejections (sleep 3, check `rejected/` dir) — fix and re-queue if rejected (max 2 cycles)
|
|
269
|
-
3. After ALL widgets are built and queued, **STOP and poll for human review** — this is a BLOCKING wait, do NOT proceed until the human submits:
|
|
270
|
-
```
|
|
271
|
-
Waiting for human review of widgets...
|
|
272
|
-
Review UI: http://localhost:3456/review
|
|
273
|
-
{N} widgets queued, awaiting submission
|
|
274
|
-
```
|
|
275
|
-
```bash
|
|
276
|
-
while [ ! -f .gsd-t/design-review/review-complete.json ]; do
|
|
277
|
-
sleep 5
|
|
278
|
-
done
|
|
279
|
-
```
|
|
280
|
-
4. Process feedback — read `review-complete.json` and each feedback file, apply changes/fixes per the same rules as Step 4. Clear queue, delete signal, clear feedback after processing. If any widgets re-queued → return to sub-step 3 (max 3 review cycles).
|
|
281
|
-
|
|
282
|
-
## Step 6: Page Composition Phase
|
|
283
|
-
|
|
284
|
-
**GATE: Do NOT start this step until ALL widgets from Step 5 have been reviewed and approved by the human in the review UI.**
|
|
285
|
-
|
|
286
|
-
1. Read page contracts from `.gsd-t/contracts/design/pages/`
|
|
287
|
-
2. For each page:
|
|
288
|
-
a. Build the page — MUST import approved widget components
|
|
289
|
-
b. **Render-Measure-Compare** — use Playwright to verify the full page layout against its contract specs:
|
|
290
|
-
```javascript
|
|
291
|
-
// In Playwright page.evaluate():
|
|
292
|
-
// 1. GRID LAYOUT — verify column count matches contract
|
|
293
|
-
const grid = document.querySelector(pageGridSelector);
|
|
294
|
-
const gridStyle = getComputedStyle(grid);
|
|
295
|
-
const columns = gridStyle.gridTemplateColumns.split(' ').length;
|
|
296
|
-
// Compare against contract's expected column count (e.g., 2, not 4)
|
|
297
|
-
|
|
298
|
-
// 2. SECTION ORDERING — verify widgets appear in contract-specified order
|
|
299
|
-
const sections = [...grid.querySelectorAll('[data-section]')];
|
|
300
|
-
const sectionOrder = sections.map(s => s.dataset.section);
|
|
301
|
-
// Compare against contract's section order array
|
|
302
|
-
|
|
303
|
-
// 3. WIDGET DIMENSIONS — verify each widget's width relative to grid
|
|
304
|
-
const gridRect = grid.getBoundingClientRect();
|
|
305
|
-
const widgetRects = sections.map(s => s.getBoundingClientRect());
|
|
306
|
-
// For a 2-column grid: each widget should be ~50% of grid width (minus gap)
|
|
307
|
-
// For a 1-column widget spanning full width: should be ~100% of grid width
|
|
308
|
-
|
|
309
|
-
// 4. SPACING — verify gap, padding, margins match contract
|
|
310
|
-
// gap, rowGap, columnGap, padding
|
|
311
|
-
|
|
312
|
-
// 5. RESPONSIVE — if contract specifies breakpoints, measure at each viewport width
|
|
313
|
-
// Build measurements array: [{ property, expected, actual, pass }]
|
|
314
|
-
```
|
|
315
|
-
**Grid column count is a CRITICAL measurement. If the contract says 2 columns and the page renders 4, this is `severity: "critical"` and MUST be flagged.**
|
|
- c. Write queue JSON to `.gsd-t/design-review/queue/{page-id}.json` (same format, with `"type": "page"`)
- e. Check for auto-rejections (sleep 3, check `rejected/` dir) — fix and re-queue if rejected (max 2 cycles)
- 3. After ALL pages are built and queued, **STOP and poll for human review** — this is a BLOCKING wait, do NOT proceed until the human submits:
- ```
- Waiting for human review of pages...
- Review UI: http://localhost:3456/review
- {N} pages queued, awaiting submission
- ```
- ```bash
- while [ ! -f .gsd-t/design-review/review-complete.json ]; do
-   sleep 5
- done
- ```
- 4. Process feedback — same rules as Step 4. Clear queue, delete signal, clear feedback. If any pages re-queued → return to sub-step 3 (max 3 review cycles).
-
- ## Step 7: Cleanup and Report
-
- ```bash
- # Signal Term 2 to shut down
- echo '{"shutdown": true}' > .gsd-t/design-review/shutdown.json
- sleep 2
-
- # Kill review server
- kill $REVIEW_PID 2>/dev/null
-
- # Kill dev server if we started it
- # (only if we started it in Step 1a)
- ```
-
- Report:
- ```
- Design Build Complete
- Elements: {N} built, {N} approved
- Widgets: {N} built, {N} approved
- Pages: {N} built, {N} approved
- Review cycles: {N}
-
- Changes applied from review: {N}
- Comments addressed: {N}
- ```
-
- Update `.gsd-t/progress.md` with completion status.
-
- ## Coordination Directory Structure
-
- ```
- .gsd-t/design-review/
- ├── queue/                 # Term 1 writes, Term 2 + human reads
- │   ├── element-donut.json
- │   └── element-bar.json
- ├── feedback/              # Human review writes, Term 1 reads
- │   ├── element-donut.json
- │   └── element-bar.json
- ├── rejected/              # Term 2 auto-rejects, Term 1 reads
- │   └── element-bar.json
- ├── review-complete.json   # Human submit signal → Term 1 polls
- ├── review-prompt.md       # Term 2's instructions (no builder context)
- ├── shutdown.json          # Term 1 signals Term 2 to exit
- └── status.json            # Review server state
- ```
+ - Design contracts must exist in `.gsd-t/contracts/design/` with an `INDEX.md`
+ - If no contracts exist, run `/user:gsd-t-design-decompose` first

- ##
+ ## Why a JS Orchestrator?

-
- - NEVER skip the review cycle — every component goes through Term 2 + human
- - NEVER proceed to widgets before all elements are reviewed and approved by the human — the bash polling loop MUST block until `review-complete.json` appears
- - NEVER proceed to pages before all widgets are reviewed and approved by the human — same blocking poll
- - NEVER rebuild element functionality inline in widgets — always import
- - NEVER skip or shortcut the bash polling loop — it MUST actually execute and block
- - Max 3 review cycles per phase — if still failing, stop and present to user
- - Auto-open the browser review UI so the user doesn't have to find it
- - Kill all background processes on completion or error
+ Claude Code agents optimize for task completion and will skip any prompt instruction that asks them to pause indefinitely — including bash polling loops, `BLOCKING` headers, and `STOP` directives. Three separate attempts to enforce review gates via prompt instructions all failed. The JS orchestrator moves flow control out of prompts entirely into deterministic JavaScript.
package/package.json
CHANGED

@@ -1,6 +1,6 @@
  {
    "name": "@tekyzinc/gsd-t",
-   "version": "2.71.14",
+   "version": "2.71.16",
    "description": "GSD-T: Contract-Driven Development for Claude Code — 56 slash commands with headless CI/CD mode, graph-powered code analysis, real-time agent dashboard, execution intelligence, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
    "author": "Tekyz, Inc.",
    "license": "MIT",