@tekyzinc/gsd-t 2.70.14 → 2.70.16

package/CHANGELOG.md CHANGED
@@ -2,6 +2,16 @@
2
2
 
3
3
  All notable changes to GSD-T are documented here. Updated with each release.
4
4
 
5
+ ## [2.70.15] - 2026-04-06
6
+
7
+ ### Changed (design pipeline — decompose verification)
8
+ - **Separate Verification Agent** — `gsd-t-design-decompose` Step 6.5 now spawns a dedicated opus-model verification subagent instead of self-verifying chart classifications. The decompose agent cannot verify its own work — sunk cost bias causes it to rubber-stamp its classifications. The separate agent has fresh context and its sole incentive is finding mismatches.
9
+ - **BAR CHART ORIENTATION PROOF** — mechanical decision tree injected into the verification agent prompt: rectangles in a ROW → HORIZONTAL, rectangles BOTTOM-TO-TOP → VERTICAL. Eliminates the #1 misclassification (horizontal percentage bars classified as vertical grouped).
10
+ - **Max 2 fix cycles** — if the verifier finds mismatches, contracts are corrected and re-verified (up to 2 cycles). Persistent failures block decompose completion.
11
+
12
+ ### Why
13
+ v2.70.14 ensured "build follows contracts" but the contracts themselves were wrong. The decompose command's Step 6.5 asked the same agent to verify its own chart type classifications — it always passed itself. Three charts (`Number of Tools`, `Time on Page`, `Number of Visits`) were classified as `bar-vertical-grouped` when the Figma shows `bar-stacked-horizontal-percentage`. A separate verification agent with no sunk cost catches these mismatches before contracts are finalized.
14
+
5
15
  ## [2.70.14] - 2026-04-06
6
16
 
7
17
  ### Changed (design pipeline — hierarchical execution)
@@ -252,11 +252,38 @@ For each widget contract:
252
252
  1. Copy `templates/widget-contract.md` as scaffold
253
253
  2. Reference elements by name in the "Elements Used" table
254
254
  3. Define layout, data binding, responsive behavior, widget-level verification
255
+ 4. **Extract layout CSS from `get_design_context` output (MANDATORY)**:
256
+ The Figma MCP returns code with explicit CSS layout properties. Parse these into the
257
+ widget contract's "Internal Element Layout" section:
258
+ - `body_layout`: Look at the parent container's CSS in the Figma output.
259
+ `flex flex-row` or `flex gap-[16px] items-center` → `flex-row`.
260
+ `flex flex-col` or `flex-col gap-[16px]` → `flex-column`.
261
+ `grid grid-cols-2` → `grid 2-col`. Write EXACTLY what the Figma shows.
262
+ - `body_gap`: Extract the gap value from the Figma CSS (e.g., `gap-[16px]` → `16px`)
263
+ - Legend position: If legend is a SIBLING of the chart in a `flex-row` container →
264
+ legend is BESIDE the chart (`body_sidebar`). If legend is BELOW the chart in a
265
+ `flex-col` container → legend is in `footer_legend`. This distinction is CRITICAL —
266
+ it's the difference between a side-by-side layout and a stacked layout.
267
+ - `container_height`: If the Figma shows `h-[334px]` → fixed height `334px`.
268
+ If no explicit height → `auto`.
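The class-to-contract mapping in step 4 above can be sketched as a small parser. This is a minimal illustration, not part of the package: the function name `parseLayoutClasses` and the returned object shape are assumptions, and it only handles the Tailwind-style utilities quoted in the step.

```javascript
// Hypothetical helper: name and return shape are illustrative assumptions.
function parseLayoutClasses(classString) {
  const classes = classString.trim().split(/\s+/);
  const has = (name) => classes.includes(name);

  // body_layout: grid first, then explicit column, then any flex
  // (a flex container without a direction class defaults to row in CSS).
  let bodyLayout = 'unknown';
  const gridCols = classes.map((c) => /^grid-cols-(\d+)$/.exec(c)).find(Boolean);
  if (has('grid') && gridCols) {
    bodyLayout = `grid ${gridCols[1]}-col`;
  } else if (has('flex-col')) {
    bodyLayout = 'flex-column';
  } else if (has('flex') || has('flex-row')) {
    bodyLayout = 'flex-row';
  }

  // Arbitrary-value utilities such as gap-[16px] and h-[334px].
  const arbitrary = (prefix) => {
    const re = new RegExp(`^${prefix}-\\[(.+)\\]$`);
    const hit = classes.map((c) => re.exec(c)).find(Boolean);
    return hit ? hit[1] : null;
  };

  return {
    body_layout: bodyLayout,
    body_gap: arbitrary('gap'),                 // null when no gap class is present
    container_height: arbitrary('h') ?? 'auto', // no explicit height means auto
  };
}
```

For example, `parseLayoutClasses('flex gap-[16px] items-center h-[334px]')` yields `body_layout: 'flex-row'`, `body_gap: '16px'`, and `container_height: '334px'`.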
255
269
 
256
270
  For each page contract:
257
271
  1. Copy `templates/page-contract.md` as scaffold
258
272
  2. Reference widgets in grid positions
259
273
  3. Define route, data loading, global states, performance budget
274
+ 4. **Extract grid structure from `get_design_context` output (MANDATORY)**:
275
+ The Figma MCP returns the page's layout as nested containers. Parse the structure:
276
+ - Count "Row" or `flex-row` containers and their children to determine grid dimensions
277
+ (e.g., 2 Row containers with 2 cards each → `grid 2×2`, NOT `grid 1×4`)
278
+ - Extract `gap` values between rows and between cards within rows
279
+ - Extract explicit heights on cards (e.g., `h-[334px]`)
280
+ - Document in the page contract's "Widgets Used" table:
281
+ ```
282
+ | grid[row=1, cols=1-2] | most-popular-tools + number-of-tools | 2 per row |
283
+ | grid[row=2, cols=1-2] | time-on-page + number-of-visits | 2 per row |
284
+ ```
285
+ - **Anti-pattern**: Seeing 4 sibling cards and writing `grid-cols-4` when the Figma
286
+ groups them into 2 rows of 2. ALWAYS check the parent container structure.
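The row-counting rule above can be sketched as follows. The input shape (an array of row objects, each with a `children` list of widget names) and the function name `inferGrid` are assumptions made for illustration; the real `get_design_context` output would need to be reduced to this shape first.

```javascript
// Hypothetical input: one entry per "Row" container, children are widget names.
function inferGrid(rows) {
  const rowCount = rows.length;
  const colCounts = rows.map((row) => row.children.length);
  const cols = Math.max(0, ...colCounts);
  // Flag uneven rows so they can be reviewed by hand.
  const uniform = colCounts.every((n) => n === cols);

  // One "Widgets Used" entry per row, in the format shown in the step above.
  const widgetsUsed = rows.map((row, i) => ({
    position: `grid[row=${i + 1}, cols=1-${row.children.length}]`,
    widgets: row.children.join(' + '),
    note: `${row.children.length} per row`,
  }));

  return { grid: `grid ${rowCount}×${cols}`, uniform, widgetsUsed };
}
```

Because the dimensions come from the parent containers rather than from a flat count of cards, four cards grouped into two rows produce `grid 2×2`, which is the case the anti-pattern warns about.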
260
287
 
261
288
  Write `INDEX.md` as a navigation map:
262
289
 
@@ -301,40 +328,112 @@ Per-page element manifest (for verification agent):
301
328
  | {analytics} | {N} | {N} — {list} |
302
329
  ```
303
330
 
304
- ## Step 6.5: Contract-vs-Figma Verification Gate (MANDATORY)
331
+ ## Step 6.5: Contract-vs-Figma Verification Gate — SEPARATE AGENT (MANDATORY)
305
332
 
306
- After writing all contracts but BEFORE proceeding to partition or build, verify that each contract accurately represents the Figma design. This gate catches errors that would otherwise propagate through the entire build.
333
+ After writing all contracts but BEFORE proceeding to partition or build, spawn a **dedicated verification subagent** to independently verify every chart classification against the Figma source. This agent has FRESH context, no sunk cost in the classifications, and its sole incentive is finding mismatches.
307
334
 
308
- ### For each widget contract:
335
+ > **Why a separate agent?** The decompose agent that classified the charts cannot objectively verify its own classifications. It has the same blind spots that caused the misclassification. This was proven repeatedly — the same agent rubber-stamps its own work. A fresh agent with only the contracts and Figma access catches what the classifier missed.
309
336
 
310
- 1. **Re-read the Figma node** — call `get_design_context` (or re-examine the screenshot) for the specific widget node
311
- 2. **Compare the contract's claimed structure against the actual Figma node:**
337
+ **OBSERVABILITY LOGGING (MANDATORY):**
338
+ Before spawning, run via Bash:
339
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
312
340
 
313
- | Check | What to verify | Failure mode it prevents |
314
- |-------|---------------|------------------------|
315
- | Chart type | Contract's element name matches the actual visual pattern | Donut classified as stacked bar (or vice versa) |
316
- | Data labels | Contract's Test Fixture labels match the Figma text exactly | Hallucinated column headers, invented metrics |
317
- | Element count | Number of sub-elements in contract matches Figma | Missing legends, extra charts, wrong layout |
318
- | Text content | Every title, subtitle, label, legend item matches Figma verbatim | "Engagement per video" subtitle that doesn't exist in Figma |
319
- | Layout structure | Widget's claimed layout matches Figma arrangement | Side-by-side classified as stacked, 2 charts classified as 1 |
320
-
321
- 3. **Produce a contract-vs-Figma mismatch report:**
341
+ [opus] gsd-t-design-decompose Chart Classification Verifier
322
342
 
323
343
  ```
324
- CONTRACT-VS-FIGMA VERIFICATION
325
- ───────────────────────────────
326
- most-popular-tools-card: chart-donut MATCHES Figma node 123:458
327
- number-of-tools-card: chart-bar-stacked-horizontal-percentage MATCHES
328
- member-state-card: chart-donut MISMATCH: Figma shows stacked vertical bars, not donuts
329
- Fix: reclassify as chart-bar-stacked-vertical, rewrite element contract
330
- ❌ video-playlist-table: columns [Title, Duration, Views, Watch Time, Completion]
331
- MISMATCH: Figma shows [Video, Viewed, Clicked Thumbnail, Clicked CTA, Avg. Seconds Watched]
332
- Fix: update Test Fixture column headers
344
+ Task subagent (general-purpose, model: opus):
345
+ "You are the Chart Classification Verifier. Your ONLY job is to independently
346
+ verify that each element contract's chart type classification matches the actual
347
+ Figma design. You have ZERO knowledge of how the charts were classified; you
348
+ are seeing them fresh. Your incentive: every misclassification you catch prevents
349
+ a wrong chart being built. Every misclassification you miss causes a rebuild.
350
+
351
+ ## Contracts to Verify
352
+ {list each element contract filename + its claimed chart type from INDEX.md}
353
+
354
+ ## Figma Source
355
+ File key: {fileKey}
356
+ Page node: {nodeId}
357
+
358
+ ## Verification Process
359
+
360
+ For EACH element contract that claims a chart/visualization type:
361
+
362
+ 1. Read the element contract — note its claimed type (e.g., 'bar-vertical-grouped')
363
+ 2. Find the Figma node ID referenced in the contract (or in the widget that uses it)
364
+ 3. Call `get_design_context` on that specific node ID — examine the STRUCTURE:
365
+ - Layout mode (horizontal vs vertical arrangement of children)
366
+ - Child elements (are they bars? segments? slices?)
367
+ - How children are arranged (side by side? stacked? overlapping?)
368
+ - Dimensions (do bars extend horizontally or vertically?)
369
+
370
+ 4. Walk the decision tree INDEPENDENTLY (do NOT read the contract's reasoning):
371
+
372
+ BAR CHART ORIENTATION PROOF:
373
+ a. Are the data-bearing rectangles arranged HORIZONTALLY (left to right)?
374
+ → Segments share ONE ROW, each segment's WIDTH encodes its value
375
+ → This is HORIZONTAL (stacked if touching, grouped if separated)
376
+ b. Are the data-bearing rectangles arranged VERTICALLY (bottom to top)?
377
+ → Each bar is a COLUMN, each bar's HEIGHT encodes its value
378
+ → This is VERTICAL (stacked if layered, grouped if side-by-side)
379
+ c. Is it ONE bar with colored segments? → STACKED
380
+ Is it MULTIPLE separate bars? → GROUPED
381
+ d. Do labels show percentages summing to 100%? → PERCENTAGE variant
382
+
383
+ CRITICAL DISTINCTION — the #1 misclassification:
384
+ A single horizontal bar divided into colored segments (each segment's WIDTH
385
+ represents a percentage) is chart-bar-stacked-horizontal-percentage.
386
+ Multiple vertical columns of different heights side-by-side is
387
+ chart-bar-grouped-vertical. These render COMPLETELY DIFFERENTLY.
388
+ If you see colored blocks in a ROW → HORIZONTAL. Period.
389
+
390
+ 5. Compare YOUR classification against the contract's classification.
391
+
392
+ 6. For EACH element, produce:
393
+
394
+ ```
395
+ Element: {name}
396
+ Contract claims: {chart type}
397
+ Figma node: {id}
398
+ I SEE: {describe what the Figma MCP returned — layout, children, arrangement}
399
+ MY CLASSIFICATION: {your independent classification}
400
+ VERDICT: ✅ MATCH or ❌ MISMATCH
401
+ If MISMATCH: Contract says {X} but Figma shows {Y} because {evidence}
402
+ ```
403
+
404
+ ## Report
405
+
406
+ Produce the full verification table:
407
+
408
+ | # | Element | Contract Type | Verified Type | Figma Evidence | Verdict |
409
+ |---|---------|--------------|---------------|----------------|---------|
410
+ | 1 | chart-donut | chart-donut | chart-donut | circular arcs + center hole | ✅ MATCH |
411
+ | 2 | bar-vertical-grouped | bar-vertical-grouped | bar-stacked-horizontal-pct | 4 segments in ONE horizontal row | ❌ MISMATCH |
412
+
413
+ If ANY ❌ MISMATCH found:
414
+ - List each mismatch with the correct classification and evidence
415
+ - Report: 'VERIFICATION FAILED — {N} misclassifications found. Contracts must be fixed before build.'
416
+
417
+ If ALL ✅ MATCH:
418
+ - Report: 'VERIFICATION PASSED — all {N} chart classifications confirmed against Figma source.'
419
+ "
333
420
  ```
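The BAR CHART ORIENTATION PROOF in the prompt above is mechanical enough to express as a function. The sketch below is a literal reading of steps a to d; the observation fields (`arrangement`, `barCount`, `labelsSumTo100`) and the `chart-bar-{grouping}-{orientation}` name order are assumptions chosen to match `chart-bar-stacked-horizontal-percentage`, not an API from the package.

```javascript
// Hypothetical observation shape:
//   arrangement: 'row' when data rectangles run left to right,
//                'column' when they rise bottom to top
//   barCount: number of separate bars (1 means one bar split into segments)
//   labelsSumTo100: true when labels are percentages adding up to 100
function classifyBarChart({ arrangement, barCount, labelsSumTo100 }) {
  if (arrangement !== 'row' && arrangement !== 'column') {
    throw new Error(`unknown arrangement: ${arrangement}`);
  }
  const orientation = arrangement === 'row' ? 'horizontal' : 'vertical'; // steps a, b
  const grouping = barCount === 1 ? 'stacked' : 'grouped';               // step c
  const suffix = labelsSumTo100 ? '-percentage' : '';                    // step d
  return `chart-bar-${grouping}-${orientation}${suffix}`;
}

// Compare an independent classification against the contract's claim.
function verdict(contractType, observation) {
  const verified = classifyBarChart(observation);
  return { verified, verdict: verified === contractType ? 'MATCH' : 'MISMATCH' };
}
```

A single bar split into percentage segments in one row classifies as `chart-bar-stacked-horizontal-percentage`, so a contract claiming `chart-bar-grouped-vertical` for it gets a `MISMATCH` verdict.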
334
421
 
335
- 4. **If ANY mismatches found**: fix the contracts BEFORE proceeding. Do not build from wrong contracts.
422
+ After the subagent returns, run via Bash:
423
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
424
+
425
+ Compute tokens/compaction per standard pattern. Append to `.gsd-t/token-log.md`.
426
+
427
+ **If VERIFICATION FAILED**: Fix every misclassified element contract before proceeding:
428
+ 1. Rename the contract file to match the correct chart type
429
+ 2. Rewrite the visual spec section to match the correct chart type
430
+ 3. Update INDEX.md references
431
+ 4. Update any widget contracts that reference the renamed element
432
+ 5. **Re-run the verification subagent** to confirm fixes (max 2 cycles)
433
+
434
+ **If VERIFICATION PASSED**: Proceed to Step 7.
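The verify, fix, and re-verify flow with its two-cycle cap can be sketched as a bounded loop. The `verify` and `fix` callbacks are hypothetical stand-ins for spawning the verification subagent and correcting the contracts; they are synchronous here only to keep the sketch short.

```javascript
// verify() returns an array of mismatches (empty means VERIFICATION PASSED).
// fix(mismatches) corrects the contracts. Both are caller-supplied assumptions.
function runVerificationGate(verify, fix, maxFixCycles = 2) {
  let mismatches = verify();
  let cycles = 0;
  while (mismatches.length > 0 && cycles < maxFixCycles) {
    fix(mismatches);
    cycles += 1;
    mismatches = verify(); // re-run the verifier after each fix
  }
  // passed === false after the cap means decompose must not complete.
  return { passed: mismatches.length === 0, cycles, mismatches };
}
```

The cap bounds the cost of a verifier and fixer that keep disagreeing: after the second failed re-verification the loop stops and returns the remaining mismatches instead of retrying indefinitely.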
336
435
 
337
- > **Why this gate exists**: The two-terminal validation (tasks 001-013) proved the system produces 50/50 scores when contracts are correct, but also revealed that scoring code-vs-contract doesn't catch contract-vs-Figma errors. This gate closes that gap.
436
+ > **Why this gate exists**: The decompose agent's own examples show the correct classification for "Number of Tools" as `chart-bar-stacked-horizontal-percentage`, yet the agent classified the same chart as `bar-vertical-grouped` in practice. Soft instructions ("MANDATORY decision tree") don't prevent misclassification. A separate agent with fresh context and inverted incentives (success = finding errors) does.
338
437
 
339
438
  ## Step 7: Wire Into Partition
340
439
 
@@ -309,12 +309,57 @@ Execute the task above:
309
309
  recovers (retry button works, form can be resubmitted, etc.).
310
310
  A test that would pass on an empty HTML page with the right element IDs is useless.
311
311
  Every assertion must prove the FEATURE WORKS, not that the ELEMENT EXISTS.
312
- 8. **Visual Design Note** (when design-to-code stack rule is active):
313
- Do NOT perform visual verification yourself; a dedicated Design Verification Agent
314
- (Step 5.25) runs after all domain tasks complete and handles the full visual comparison.
315
- Your job: write precise code from the design contract tokens. Use exact hex colors,
316
- exact spacing values, exact typography. Every CSS value must trace to the design contract.
317
- The verification agent will open a browser and prove whether your code matches.
312
+ 8. **Render-Measure-Compare Loop** (when design-to-code stack rule is active — MANDATORY):
313
+ After implementing the component, you MUST verify it renders correctly by measuring
314
+ the actual DOM output against the contract's layout spec. This is not optional.
315
+ Do NOT rely on visual inspection or screenshots; measure mechanically.
316
+
317
+ a. **Render**: Start the dev server if not running. Navigate to a route where the
318
+ component is visible (or create a temporary test route that renders it in isolation).
319
+
320
+ b. **Measure via Playwright** — run `page.evaluate()` to extract DOM properties:
321
+ ```javascript
322
+ // For a widget: measure its internal layout
323
+ const el = document.querySelector('.widget-selector');
324
+ const style = getComputedStyle(el);
325
+ return {
326
+ display: style.display, // 'flex' or 'grid'
327
+ flexDirection: style.flexDirection, // 'row' or 'column'
328
+ gap: style.gap,
329
+ gridTemplateColumns: style.gridTemplateColumns,
330
+ width: el.offsetWidth,
331
+ height: el.offsetHeight,
332
+ childCount: el.children.length,
333
+ children: Array.from(el.children).map(c => ({
334
+ tag: c.tagName,
335
+ width: c.offsetWidth,
336
+ height: c.offsetHeight,
337
+ display: getComputedStyle(c).display,
338
+ flexDirection: getComputedStyle(c).flexDirection,
339
+ }))
340
+ };
341
+ ```
342
+
343
+ c. **Compare to contract** — check each measured value against the contract spec:
344
+ - `body_layout: flex-row` → verify `flexDirection === 'row'`
345
+ - `container_height: 334px` → verify measured `height` is within ±2px of `334`
346
+ - Grid `2×2` → verify parent has 2 row children, each with 2 card children
347
+ - Legend position: if contract says `body_sidebar` (beside chart) →
348
+ verify legend and chart share a `flex-row` parent.
349
+ If contract says `footer_legend` (below chart) →
350
+ verify legend is in a `flex-column` parent below the chart.
351
+
352
+ d. **Fix mismatches** — if ANY measurement doesn't match the contract:
353
+ - Log: "LAYOUT MISMATCH: {property} expected {contract value}, got {measured value}"
354
+ - Fix the code to match the contract spec
355
+ - Re-render and re-measure (max 2 fix cycles)
356
+ - If still mismatched after 2 cycles → log to `.gsd-t/deferred-items.md`
357
+
358
+ e. **All pass** → log "RENDER-MEASURE PASS: {N} layout properties verified" and proceed.
359
+
360
+ This loop catches the exact class of errors that visual inspection misses:
361
+ grid-cols-4 instead of 2×2, legend below instead of beside, wrong flex-direction.
362
+ These are data comparisons, not visual judgments — the same kind of check as a unit test.
318
363
  9. Run ALL test suites — this is NOT optional, not conditional, not "if applicable":
319
364
  a. Detect configured test runners: check for vitest/jest config, playwright.config.*, cypress.config.*
320
365
  b. Run EVERY detected suite. Unit tests alone are NEVER sufficient when E2E exists.
@@ -159,8 +159,16 @@ When you encounter unexpected situations:
159
159
  - If building/modifying a PAGE: IMPORT existing widget components — do NOT rebuild widget functionality inline.
160
160
  - **Contract is authoritative**: Follow the contract spec, not the Figma screenshot, when they appear to disagree.
161
161
  5. Make the change — **adapt new code to existing structures**, not the other way around
162
- 6. Verify it works
163
- 7. Commit: `[quick] {description}`
162
+ 6. **Render-Measure-Compare** (if design component — MANDATORY):
163
+ After implementing, verify via Playwright DOM measurement (not screenshots):
164
+ - Render the component in browser
165
+ - `page.evaluate()` to extract: display, flexDirection, gap, gridTemplateColumns,
166
+ offsetWidth, offsetHeight, child count and layout
167
+ - Compare each value to the contract's layout spec (body_layout, container_height, etc.)
168
+ - Mismatches → fix code → re-measure (max 2 cycles)
169
+ - This catches: wrong grid structure, legend below vs beside, wrong flex-direction
170
+ 7. Verify it works
171
+ 8. Commit: `[quick] {description}`
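The compare step of the Render-Measure-Compare loop described above can be sketched as a pure function over the measured values. The contract keys mirror the fields named in the diff (`body_layout`, `body_gap`, `container_height`) and the measured keys mirror the `page.evaluate()` result; the function name `compareLayout` and the exact object shapes are assumptions. It also assumes the computed `gap` is reported as a single pixel string such as `16px`.

```javascript
// Hypothetical shapes: contract uses the contract field names,
// measured uses the keys returned by the page.evaluate() snippet.
function compareLayout(contract, measured, tolerancePx = 2) {
  const mismatches = [];
  const report = (property, expected, got) =>
    mismatches.push(`LAYOUT MISMATCH: ${property} expected ${expected}, got ${got}`);

  if (contract.body_layout === 'flex-row' || contract.body_layout === 'flex-column') {
    const expected = contract.body_layout === 'flex-row' ? 'row' : 'column';
    if (measured.display !== 'flex') report('display', 'flex', measured.display);
    if (measured.flexDirection !== expected) {
      report('flexDirection', expected, measured.flexDirection);
    }
  }

  if (contract.body_gap && measured.gap !== contract.body_gap) {
    report('gap', contract.body_gap, measured.gap);
  }

  // Fixed heights are checked with a pixel tolerance; 'auto' is not measured.
  if (contract.container_height && contract.container_height !== 'auto') {
    const expectedPx = parseFloat(contract.container_height);
    if (Math.abs(measured.height - expectedPx) > tolerancePx) {
      report('height', contract.container_height, `${measured.height}px`);
    }
  }

  return mismatches; // empty array means every checked property passed
}
```

Returning the mismatch strings rather than throwing lets the caller log them, apply a fix, and re-measure within the two-cycle limit.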
164
172
 
165
173
  ## Step 3.5: Emit Task Metrics
166
174
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@tekyzinc/gsd-t",
3
- "version": "2.70.14",
3
+ "version": "2.70.16",
4
4
  "description": "GSD-T: Contract-Driven Development for Claude Code — 54 slash commands with headless CI/CD mode, graph-powered code analysis, real-time agent dashboard, execution intelligence, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
5
5
  "author": "Tekyz, Inc.",
6
6
  "license": "MIT",