@tekyzinc/gsd-t 2.70.14 → 2.70.16
- package/CHANGELOG.md +10 -0
- package/commands/gsd-t-design-decompose.md +124 -25
- package/commands/gsd-t-execute.md +51 -6
- package/commands/gsd-t-quick.md +10 -2
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,16 @@
 
 All notable changes to GSD-T are documented here. Updated with each release.
 
+## [2.70.15] - 2026-04-06
+
+### Changed (design pipeline — decompose verification)
+- **Separate Verification Agent** — `gsd-t-design-decompose` Step 6.5 now spawns a dedicated opus-model verification subagent instead of self-verifying chart classifications. The decompose agent cannot verify its own work — sunk cost bias causes it to rubber-stamp its classifications. The separate agent has fresh context and its sole incentive is finding mismatches.
+- **BAR CHART ORIENTATION PROOF** — mechanical decision tree injected into the verification agent prompt: rectangles in a ROW → HORIZONTAL, rectangles BOTTOM-TO-TOP → VERTICAL. Eliminates the #1 misclassification (horizontal percentage bars classified as vertical grouped).
+- **Max 2 fix cycles** — if the verifier finds mismatches, contracts are corrected and re-verified (up to 2 cycles). Persistent failures block decompose completion.
+
+### Why
+v2.70.14 ensured "build follows contracts" but the contracts themselves were wrong. The decompose command's Step 6.5 asked the same agent to verify its own chart type classifications — it always passed itself. Three charts (`Number of Tools`, `Time on Page`, `Number of Visits`) were classified as `bar-vertical-grouped` when the Figma shows `bar-stacked-horizontal-percentage`. A separate verification agent with no sunk cost catches these mismatches before contracts are finalized.
+
 ## [2.70.14] - 2026-04-06
 
 ### Changed (design pipeline — hierarchical execution)
package/commands/gsd-t-design-decompose.md
CHANGED
@@ -252,11 +252,38 @@ For each widget contract:
 1. Copy `templates/widget-contract.md` as scaffold
 2. Reference elements by name in the "Elements Used" table
 3. Define layout, data binding, responsive behavior, widget-level verification
+4. **Extract layout CSS from `get_design_context` output (MANDATORY)**:
+The Figma MCP returns code with explicit CSS layout properties. Parse these into the
+widget contract's "Internal Element Layout" section:
+- `body_layout`: Look at the parent container's CSS in the Figma output.
+`flex flex-row` or `flex gap-[16px] items-center` → `flex-row`.
+`flex flex-col` or `flex-col gap-[16px]` → `flex-column`.
+`grid grid-cols-2` → `grid 2-col`. Write EXACTLY what the Figma shows.
+- `body_gap`: Extract the gap value from the Figma CSS (e.g., `gap-[16px]` → `16px`)
+- Legend position: If legend is a SIBLING of the chart in a `flex-row` container →
+legend is BESIDE the chart (`body_sidebar`). If legend is BELOW the chart in a
+`flex-col` container → legend is in `footer_legend`. This distinction is CRITICAL —
+it's the difference between a side-by-side layout and a stacked layout.
+- `container_height`: If the Figma shows `h-[334px]` → fixed height `334px`.
+If no explicit height → `auto`.
 
 For each page contract:
 1. Copy `templates/page-contract.md` as scaffold
 2. Reference widgets in grid positions
 3. Define route, data loading, global states, performance budget
+4. **Extract grid structure from `get_design_context` output (MANDATORY)**:
+The Figma MCP returns the page's layout as nested containers. Parse the structure:
+- Count "Row" or `flex-row` containers and their children to determine grid dimensions
+(e.g., 2 Row containers with 2 cards each → `grid 2×2`, NOT `grid 1×4`)
+- Extract `gap` values between rows and between cards within rows
+- Extract explicit heights on cards (e.g., `h-[334px]`)
+- Document in the page contract's "Widgets Used" table:
+```
+| grid[row=1, cols=1-2] | most-popular-tools + number-of-tools | 2 per row |
+| grid[row=2, cols=1-2] | time-on-page + number-of-visits | 2 per row |
+```
+- **Anti-pattern**: Seeing 4 sibling cards and writing `grid-cols-4` when the Figma
+groups them into 2 rows of 2. ALWAYS check the parent container structure.
 
 Write `INDEX.md` as a navigation map:
 
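Reviewer sketch (not part of the package): the extraction rules added in the hunk above are mechanical enough to express as code. The example below is a minimal illustration, assuming the Figma output exposes Tailwind-style class strings like the ones quoted in the hunk. The function names `extractLayout` and `inferGrid`, and the `{ name, classes, children }` node shape, are hypothetical and chosen only for this example.

```javascript
// Hypothetical helper: maps a Tailwind-style class string onto the
// widget-contract fields named in the hunk (body_layout, body_gap,
// container_height).
function extractLayout(classString) {
  const classes = classString.trim().split(/\s+/);
  const has = (name) => classes.includes(name);

  let bodyLayout = 'unknown';
  const gridCols = classes.find((c) => /^grid-cols-\d+$/.test(c));
  if (has('grid') && gridCols) {
    bodyLayout = `grid ${gridCols.split('-')[2]}-col`;
  } else if (has('flex-col')) {
    bodyLayout = 'flex-column';
  } else if (has('flex') || has('flex-row')) {
    // A plain `flex` container lays its children out in a row by default.
    bodyLayout = 'flex-row';
  }

  // Reads arbitrary-value utilities such as gap-[16px] or h-[334px].
  const arbitrary = (prefix) => {
    const re = new RegExp(`^${prefix}-\\[(.+)\\]$`);
    for (const c of classes) {
      const m = c.match(re);
      if (m) return m[1];
    }
    return null;
  };

  return {
    body_layout: bodyLayout,
    body_gap: arbitrary('gap'),
    container_height: arbitrary('h') ?? 'auto',
  };
}

// Hypothetical helper: counts row containers and their children to get the
// grid dimensions, instead of counting all cards as siblings.
function inferGrid(pageNode) {
  const isRow = (n) =>
    /^Row\b/.test(n.name ?? '') ||
    (n.classes ?? '').split(/\s+/).includes('flex-row');
  const rows = pageNode.children.filter(isRow);
  if (rows.length === 0) return null;
  const cols = Math.max(...rows.map((r) => r.children.length));
  return `grid ${rows.length}×${cols}`;
}
```

With these assumptions, `extractLayout('flex gap-[16px] items-center')` yields `flex-row` with a `16px` gap and `auto` height, and a page node holding two `Row` containers of two cards each is reported as `grid 2×2` rather than `grid 1×4`.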
@@ -301,40 +328,112 @@ Per-page element manifest (for verification agent):
 | {analytics} | {N} | {N} — {list} |
 ```
 
-## Step 6.5: Contract-vs-Figma Verification Gate (MANDATORY)
+## Step 6.5: Contract-vs-Figma Verification Gate — SEPARATE AGENT (MANDATORY)
 
-After writing all contracts but BEFORE proceeding to partition or build, verify
+After writing all contracts but BEFORE proceeding to partition or build, spawn a **dedicated verification subagent** to independently verify every chart classification against the Figma source. This agent has FRESH context, no sunk cost in the classifications, and its sole incentive is finding mismatches.
 
-
+> **Why a separate agent?** The decompose agent that classified the charts cannot objectively verify its own classifications. It has the same blind spots that caused the misclassification. This was proven repeatedly — the same agent rubber-stamps its own work. A fresh agent with only the contracts and Figma access catches what the classifier missed.
 
-
-
+**OBSERVABILITY LOGGING (MANDATORY):**
+Before spawning — run via Bash:
+`T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
 
-
-|-------|---------------|------------------------|
-| Chart type | Contract's element name matches the actual visual pattern | Donut classified as stacked bar (or vice versa) |
-| Data labels | Contract's Test Fixture labels match the Figma text exactly | Hallucinated column headers, invented metrics |
-| Element count | Number of sub-elements in contract matches Figma | Missing legends, extra charts, wrong layout |
-| Text content | Every title, subtitle, label, legend item matches Figma verbatim | "Engagement per video" subtitle that doesn't exist in Figma |
-| Layout structure | Widget's claimed layout matches Figma arrangement | Side-by-side classified as stacked, 2 charts classified as 1 |
-
-3. **Produce a contract-vs-Figma mismatch report:**
+⚙ [opus] gsd-t-design-decompose → Chart Classification Verifier
 
 ```
-
-
-
-
-
-
-
-
-
+Task subagent (general-purpose, model: opus):
+"You are the Chart Classification Verifier. Your ONLY job is to independently
+verify that each element contract's chart type classification matches the actual
+Figma design. You have ZERO knowledge of how the charts were classified — you
+are seeing them fresh. Your incentive: every misclassification you catch prevents
+a wrong chart being built. Every misclassification you miss causes a rebuild.
+
+## Contracts to Verify
+{list each element contract filename + its claimed chart type from INDEX.md}
+
+## Figma Source
+File key: {fileKey}
+Page node: {nodeId}
+
+## Verification Process
+
+For EACH element contract that claims a chart/visualization type:
+
+1. Read the element contract — note its claimed type (e.g., 'bar-vertical-grouped')
+2. Find the Figma node ID referenced in the contract (or in the widget that uses it)
+3. Call `get_design_context` on that specific node ID — examine the STRUCTURE:
+- Layout mode (horizontal vs vertical arrangement of children)
+- Child elements (are they bars? segments? slices?)
+- How children are arranged (side by side? stacked? overlapping?)
+- Dimensions (do bars extend horizontally or vertically?)
+
+4. Walk the decision tree INDEPENDENTLY (do NOT read the contract's reasoning):
+
+BAR CHART ORIENTATION PROOF:
+a. Are the data-bearing rectangles arranged HORIZONTALLY (left to right)?
+→ Segments share ONE ROW, each segment's WIDTH encodes its value
+→ This is HORIZONTAL (stacked if touching, grouped if separated)
+b. Are the data-bearing rectangles arranged VERTICALLY (bottom to top)?
+→ Each bar is a COLUMN, each bar's HEIGHT encodes its value
+→ This is VERTICAL (stacked if layered, grouped if side-by-side)
+c. Is it ONE bar with colored segments? → STACKED
+Is it MULTIPLE separate bars? → GROUPED
+d. Do labels show percentages summing to 100%? → PERCENTAGE variant
+
+CRITICAL DISTINCTION — the #1 misclassification:
+A single horizontal bar divided into colored segments (each segment's WIDTH
+represents a percentage) is chart-bar-stacked-horizontal-percentage.
+Multiple vertical columns of different heights side-by-side is
+chart-bar-grouped-vertical. These render COMPLETELY DIFFERENTLY.
+If you see colored blocks in a ROW → HORIZONTAL. Period.
+
+5. Compare YOUR classification against the contract's classification.
+
+6. For EACH element, produce:
+
+```
+Element: {name}
+Contract claims: {chart type}
+Figma node: {id}
+I SEE: {describe what the Figma MCP returned — layout, children, arrangement}
+MY CLASSIFICATION: {your independent classification}
+VERDICT: ✅ MATCH or ❌ MISMATCH
+If MISMATCH: Contract says {X} but Figma shows {Y} because {evidence}
+```
+
+## Report
+
+Produce the full verification table:
+
+| # | Element | Contract Type | Verified Type | Figma Evidence | Verdict |
+|---|---------|--------------|---------------|----------------|---------|
+| 1 | chart-donut | chart-donut | chart-donut | circular arcs + center hole | ✅ MATCH |
+| 2 | bar-vertical-grouped | bar-vertical-grouped | bar-stacked-horizontal-pct | 4 segments in ONE horizontal row | ❌ MISMATCH |
+
+If ANY ❌ MISMATCH found:
+- List each mismatch with the correct classification and evidence
+- Report: 'VERIFICATION FAILED — {N} misclassifications found. Contracts must be fixed before build.'
+
+If ALL ✅ MATCH:
+- Report: 'VERIFICATION PASSED — all {N} chart classifications confirmed against Figma source.'
+"
 ```
 
-
+After subagent returns — run via Bash:
+`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
+
+Compute tokens/compaction per standard pattern. Append to `.gsd-t/token-log.md`.
+
+**If VERIFICATION FAILED**: Fix every misclassified element contract before proceeding:
+1. Rename the contract file to match the correct chart type
+2. Rewrite the visual spec section to match the correct chart type
+3. Update INDEX.md references
+4. Update any widget contracts that reference the renamed element
+5. **Re-run the verification subagent** to confirm fixes (max 2 cycles)
+
+**If VERIFICATION PASSED**: Proceed to Step 7.
 
-> **Why this gate exists**: The
+> **Why this gate exists**: The decompose agent's own examples show the correct classification for "Number of Tools" as `chart-bar-stacked-horizontal-percentage` — yet the agent classified the same chart as `bar-vertical-grouped` in practice. Soft instructions ("MANDATORY decision tree") don't prevent misclassification. A separate agent with fresh context and inverted incentives (success = finding errors) does.
 
 ## Step 7: Wire Into Partition
 
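Reviewer sketch (not part of the package): the BAR CHART ORIENTATION PROOF in the verifier prompt above is described as a mechanical decision tree, so it can be illustrated over plain rectangle geometry. The example below assumes each data-bearing rectangle is available as `{ x, y, width, height }` with `y` measured from the top. That input shape, the 1px tolerance, and the function name `classifyBars` are assumptions made for illustration. Vertically layered stacks are not handled and are reported as `null`.

```javascript
// Hypothetical classifier following the prompt's decision tree:
// one shared row with varying widths -> horizontal;
// a shared baseline with varying heights -> vertical.
function classifyBars(rects, labels = []) {
  if (rects.length < 2) return null; // a single rectangle proves nothing

  const TOL = 1; // pixel tolerance, an assumption
  const spread = (vals) => Math.max(...vals) - Math.min(...vals);
  const sameTop = spread(rects.map((r) => r.y)) <= TOL;
  const sameBottom = spread(rects.map((r) => r.y + r.height)) <= TOL;
  const sameHeight = spread(rects.map((r) => r.height)) <= TOL;
  const sameWidth = spread(rects.map((r) => r.width)) <= TOL;

  let orientation;
  if (sameTop && sameHeight && !sameWidth) {
    orientation = 'horizontal'; // segments share one row; width encodes value
  } else if (sameBottom && sameWidth && !sameHeight) {
    orientation = 'vertical'; // columns share a baseline; height encodes value
  } else {
    return null; // ambiguous: leave it to a human or a closer inspection
  }

  // Stacked if neighbouring segments touch, grouped if there are gaps.
  const byX = [...rects].sort((a, b) => a.x - b.x);
  const touching = byX.every(
    (r, i) => i === 0 || r.x - (byX[i - 1].x + byX[i - 1].width) <= TOL
  );
  const arrangement =
    orientation === 'horizontal' ? (touching ? 'stacked' : 'grouped') : 'grouped';

  // Percentage variant: one numeric label per rectangle, summing to about 100.
  const pcts = labels.map((l) => parseFloat(l)).filter(Number.isFinite);
  const total = pcts.reduce((sum, p) => sum + p, 0);
  const percentage =
    pcts.length === rects.length && Math.abs(total - 100) <= 0.5;

  return { orientation, arrangement, percentage };
}
```

The result is returned as separate fields rather than a contract name, because the naming of chart types varies across this diff.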
package/commands/gsd-t-execute.md
CHANGED
@@ -309,12 +309,57 @@ Execute the task above:
 recovers (retry button works, form can be resubmitted, etc.).
 A test that would pass on an empty HTML page with the right element IDs is useless.
 Every assertion must prove the FEATURE WORKS, not that the ELEMENT EXISTS.
-8. **
-
-
-
-
-
+8. **Render-Measure-Compare Loop** (when design-to-code stack rule is active — MANDATORY):
+After implementing the component, you MUST verify it renders correctly by measuring
+the actual DOM output against the contract's layout spec. This is not optional.
+Do NOT rely on visual inspection or screenshots — measure mechanically.
+
+a. **Render**: Start the dev server if not running. Navigate to a route where the
+component is visible (or create a temporary test route that renders it in isolation).
+
+b. **Measure via Playwright** — run `page.evaluate()` to extract DOM properties:
+```javascript
+// For a widget: measure its internal layout
+const el = document.querySelector('.widget-selector');
+const style = getComputedStyle(el);
+return {
+display: style.display, // 'flex' or 'grid'
+flexDirection: style.flexDirection, // 'row' or 'column'
+gap: style.gap,
+gridTemplateColumns: style.gridTemplateColumns,
+width: el.offsetWidth,
+height: el.offsetHeight,
+childCount: el.children.length,
+children: Array.from(el.children).map(c => ({
+tag: c.tagName,
+width: c.offsetWidth,
+height: c.offsetHeight,
+display: getComputedStyle(c).display,
+flexDirection: getComputedStyle(c).flexDirection,
+}))
+};
+```
+
+c. **Compare to contract** — check each measured value against the contract spec:
+- `body_layout: flex-row` → verify `flexDirection === 'row'`
+- `container_height: 334px` → verify `height === 334` (±2px tolerance)
+- Grid `2×2` → verify parent has 2 row children, each with 2 card children
+- Legend position: if contract says `body_sidebar` (beside chart) →
+verify legend and chart share a `flex-row` parent.
+If contract says `footer_legend` (below chart) →
+verify legend is in a `flex-column` parent below the chart.
+
+d. **Fix mismatches** — if ANY measurement doesn't match the contract:
+- Log: "LAYOUT MISMATCH: {property} expected {contract value}, got {measured value}"
+- Fix the code to match the contract spec
+- Re-render and re-measure (max 2 fix cycles)
+- If still mismatched after 2 cycles → log to `.gsd-t/deferred-items.md`
+
+e. **All pass** → log "RENDER-MEASURE PASS: {N} layout properties verified" and proceed.
+
+This loop catches the exact class of errors that visual inspection misses:
+grid-cols-4 instead of 2×2, legend below instead of beside, wrong flex-direction.
+These are data comparisons, not visual judgments — the same kind of check as a unit test.
 9. Run ALL test suites — this is NOT optional, not conditional, not "if applicable":
 a. Detect configured test runners: check for vitest/jest config, playwright.config.*, cypress.config.*
 b. Run EVERY detected suite. Unit tests alone are NEVER sufficient when E2E exists.
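Reviewer sketch (not part of the package): steps 8c and 8d of the Render-Measure-Compare loop above amount to a data comparison followed by a bounded retry. The example below is a minimal illustration, assuming the measured object has the shape returned by the `page.evaluate()` snippet in the hunk and that the contract has already been parsed into an object with `body_layout`, `body_gap` and `container_height` fields. The names `compareToContract` and `renderMeasureCompare`, and the `measure` and `fix` callbacks, are hypothetical.

```javascript
const HEIGHT_TOLERANCE_PX = 2; // the ±2px tolerance quoted in step 8c

// Returns one log line per mismatch, in the format given in step 8d.
function compareToContract(contract, measured) {
  const mismatches = [];
  const report = (property, expected, got) =>
    mismatches.push(`LAYOUT MISMATCH: ${property} expected ${expected}, got ${got}`);

  if (contract.body_layout === 'flex-row' || contract.body_layout === 'flex-column') {
    const expectedDir = contract.body_layout === 'flex-row' ? 'row' : 'column';
    if (measured.display !== 'flex' || measured.flexDirection !== expectedDir) {
      report('body_layout', contract.body_layout,
        `${measured.display}/${measured.flexDirection}`);
    }
  }

  if (contract.body_gap && measured.gap !== contract.body_gap) {
    report('body_gap', contract.body_gap, measured.gap);
  }

  if (contract.container_height && contract.container_height !== 'auto') {
    const expected = parseFloat(contract.container_height);
    // Tolerance check rather than strict equality on the pixel height.
    if (Math.abs(measured.height - expected) > HEIGHT_TOLERANCE_PX) {
      report('container_height', contract.container_height, `${measured.height}px`);
    }
  }

  return mismatches;
}

// Measures, then allows at most `maxFixCycles` fix-and-remeasure rounds.
// `measure` and `fix` are caller-supplied async callbacks (assumptions).
async function renderMeasureCompare(contract, measure, fix, maxFixCycles = 2) {
  let mismatches = compareToContract(contract, await measure());
  for (let cycle = 0; cycle < maxFixCycles && mismatches.length > 0; cycle++) {
    await fix(mismatches);
    mismatches = compareToContract(contract, await measure());
  }
  return { passed: mismatches.length === 0, mismatches };
}
```

In a real run, `measure` would wrap the Playwright `page.evaluate()` call, and a result with `passed: false` after two cycles would be the point at which the mismatches are written to `.gsd-t/deferred-items.md`.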
package/commands/gsd-t-quick.md
CHANGED
@@ -159,8 +159,16 @@ When you encounter unexpected situations:
 - If building/modifying a PAGE: IMPORT existing widget components — do NOT rebuild widget functionality inline.
 - **Contract is authoritative**: Follow the contract spec, not the Figma screenshot, when they appear to disagree.
 5. Make the change — **adapt new code to existing structures**, not the other way around
-6.
-
+6. **Render-Measure-Compare** (if design component — MANDATORY):
+After implementing, verify via Playwright DOM measurement (not screenshots):
+- Render the component in browser
+- `page.evaluate()` to extract: display, flexDirection, gap, gridTemplateColumns,
+offsetWidth, offsetHeight, child count and layout
+- Compare each value to the contract's layout spec (body_layout, container_height, etc.)
+- Mismatches → fix code → re-measure (max 2 cycles)
+- This catches: wrong grid structure, legend below vs beside, wrong flex-direction
+7. Verify it works
+8. Commit: `[quick] {description}`
 
 ## Step 3.5: Emit Task Metrics
 
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@tekyzinc/gsd-t",
-"version": "2.70.
+"version": "2.70.16",
 "description": "GSD-T: Contract-Driven Development for Claude Code — 54 slash commands with headless CI/CD mode, graph-powered code analysis, real-time agent dashboard, execution intelligence, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
 "author": "Tekyz, Inc.",
 "license": "MIT",