@tekyzinc/gsd-t 2.39.13 → 2.46.11
- package/CHANGELOG.md +12 -0
- package/README.md +19 -10
- package/bin/desktop.ini +2 -0
- package/bin/global-sync-manager.js +350 -0
- package/bin/gsd-t.js +592 -2
- package/bin/metrics-collector.js +167 -0
- package/bin/metrics-rollup.js +200 -0
- package/bin/patch-lifecycle.js +195 -0
- package/bin/rule-engine.js +160 -0
- package/commands/desktop.ini +2 -0
- package/commands/gsd-t-complete-milestone.md +194 -6
- package/commands/gsd-t-debug.md +38 -3
- package/commands/gsd-t-doc-ripple.md +148 -0
- package/commands/gsd-t-execute.md +328 -54
- package/commands/gsd-t-help.md +32 -10
- package/commands/gsd-t-integrate.md +59 -7
- package/commands/gsd-t-metrics.md +143 -0
- package/commands/gsd-t-plan.md +49 -2
- package/commands/gsd-t-qa.md +26 -5
- package/commands/gsd-t-quick.md +36 -3
- package/commands/gsd-t-status.md +78 -0
- package/commands/gsd-t-test-sync.md +23 -2
- package/commands/gsd-t-verify.md +142 -10
- package/commands/gsd-t-visualize.md +11 -1
- package/commands/gsd-t-wave.md +64 -18
- package/docs/GSD-T-README.md +10 -6
- package/docs/architecture.md +84 -2
- package/docs/ci-examples/desktop.ini +2 -0
- package/docs/ci-examples/github-actions.yml +104 -0
- package/docs/ci-examples/gitlab-ci.yml +116 -0
- package/docs/desktop.ini +2 -0
- package/docs/framework-comparison-scorecard.md +160 -0
- package/docs/infrastructure.md +87 -1
- package/docs/prd-graph-engine.md +2 -2
- package/docs/prd-gsd2-hybrid.md +258 -135
- package/docs/requirements.md +66 -2
- package/examples/.gsd-t/contracts/desktop.ini +2 -0
- package/examples/.gsd-t/desktop.ini +2 -0
- package/examples/.gsd-t/domains/desktop.ini +2 -0
- package/examples/.gsd-t/domains/example-domain/desktop.ini +2 -0
- package/examples/desktop.ini +2 -0
- package/examples/rules/.gitkeep +0 -0
- package/examples/rules/desktop.ini +2 -0
- package/package.json +40 -40
- package/scripts/desktop.ini +2 -0
- package/scripts/gsd-t-dashboard-server.js +19 -2
- package/scripts/gsd-t-dashboard.html +63 -0
- package/scripts/gsd-t-event-writer.js +1 -0
- package/templates/CLAUDE-global.md +92 -10
- package/templates/desktop.ini +2 -0
package/docs/requirements.md
CHANGED
@@ -37,6 +37,31 @@
 | REQ-029 | Diagram Rendering Toolchain — three rendering backends supported, each free and open source, selected automatically based on availability: (1) Primary — Mermaid CLI (mmdc, @mermaid-js/mermaid-cli, MIT, npm) renders .mmd files to SVG via headless Chromium; (2) Enhanced — D2 (MPL-2.0, terrastruct/d2, Go binary) as optional renderer for architecture and dataflow diagrams — uses dagre/ELK/neato layouts (TALA excluded, paid); (3) Fallback — Kroki HTTP API (MIT, yuzutech/kroki) renders any supported format via single HTTP POST to kroki.io (public free tier) or self-hosted Docker instance | P2 | complete (M17) | test/scan.test.js + verify-gates.js |
 | REQ-030 | MCP Diagram Server Support — gsd-t-scan supports optional MCP-based diagram generation when registered MCP servers are detected in Claude Code settings: diagram-bridge-mcp (MIT, tohachan) selects optimal format and renders via Kroki; C4Diagrammer (MIT, jonverrier) specialized for existing codebase → C4 architecture diagrams; mcp-mermaid (MIT, hustcc) for 22 Mermaid diagram types; MCP path is preferred over CLI when available | P3 | complete (M17) | test/scan.test.js + verify-gates.js |
 
+| REQ-031 | Per-Task Telemetry Collection — metrics-collector.js emits structured records to task-metrics.jsonl with weighted signal taxonomy (5 signal types), pre-flight intelligence check warns on domain failure patterns | P1 | complete (M25) | test/metrics-collector.test.js |
+| REQ-032 | Milestone Rollup & Process ELO — metrics-rollup.js aggregates task-metrics into rollup.jsonl with first_pass_rate, ELO scoring (K=32), trend comparison, 4 detection heuristics (first-pass-failure-spike, rework-rate-anomaly, context-overflow-correlation, duration-regression) | P1 | complete (M25) | test/metrics-rollup.test.js |
+| REQ-033 | Metrics Dashboard Panel — Chart.js trend line (first_pass_rate over milestones), domain health heatmap, ELO display in existing dashboard via GET /metrics endpoint | P2 | complete (M25) | test/dashboard-server.test.js (extend) |
+| REQ-034 | gsd-t-metrics Command — 50th command reads task-metrics.jsonl + rollup.jsonl, displays metrics summary, ELO, signal distribution, domain breakdown, trend comparison, heuristic warnings | P1 | complete (M25) | validated by use |
+| REQ-035 | Process ELO in Status — gsd-t-status displays current ELO score and quality budget summary from rollup.jsonl | P2 | complete (M25) | validated by use |
+
+| REQ-036 | Declarative Rule Engine — bin/rule-engine.js loads rules from rules.jsonl, evaluates triggers against task-metrics with 8 operators (gt, gte, lt, lte, eq, neq, in, pattern_count), tracks activation counts, flags inactive rules, consolidates related rules | P1 | planned | test/rule-engine.test.js |
+| REQ-037 | Patch Template System — patch-templates.jsonl maps rule triggers to file edits (append, prepend, insert_after, replace), templates reference target files and edit content | P1 | planned | test/rule-engine.test.js |
+| REQ-038 | Patch Lifecycle Manager — bin/patch-lifecycle.js manages 5-stage lifecycle (candidate->applied->measured->promoted->graduated) with promotion gate (>55% improvement over 2+ milestones) and graduation (3+ milestones sustained) | P1 | planned | test/patch-lifecycle.test.js |
+| REQ-039 | Active Rule Injection in Execute — gsd-t-execute.md injects firing rules (max 10 lines) into subagent prompts before task dispatch | P1 | planned | validated by use |
+| REQ-040 | Rule-Based Pre-Mortem in Plan — gsd-t-plan.md Step 1.7 enhanced with getPreMortemRules to surface historical rule matches for domain types | P2 | planned | validated by use |
+| REQ-041 | Distillation Extension — gsd-t-complete-milestone.md distillation step extended with rule evaluation, patch candidate generation, promotion gate check, graduation, consolidation, and quality budget governance | P1 | planned | validated by use |
+| REQ-042 | Quality Budget Governance — per-milestone rework ceiling (default 20%), auto-tightens constraints (force discuss, require contract review, split large tasks) when exceeded | P2 | planned | validated by use |
+
+| REQ-043 | Global Sync Manager — bin/global-sync-manager.js reads local metrics, writes global aggregated files to ~/.claude/metrics/, provides APIs for global rollup, global rules, signal distribution comparison, universal rule promotion | P1 | planned | test/global-sync-manager.test.js |
+| REQ-044 | Cross-Project Rule Propagation — gsd-t-version-update-all syncs global rules (universal or promotion_count >= 2) to all registered projects as candidates | P1 | planned | test/global-rule-sync.test.js |
+| REQ-045 | Universal Rule Promotion — rules promoted in 3+ projects marked universal, 5+ projects become npm distribution candidates shipped in examples/rules/ | P1 | planned | test/global-sync-manager.test.js |
+| REQ-046 | Cross-Project Signal Comparison — gsd-t-metrics --cross-project displays signal-type distribution comparison across registered projects | P2 | planned | validated by use |
+| REQ-047 | Global ELO & Rankings — gsd-t-status displays global ELO score and cross-project rank when global metrics exist | P2 | planned | validated by use |
+| REQ-048 | Global Rule Promotion on Milestone Completion — gsd-t-complete-milestone copies promoted rules to global-rules.jsonl and updates global rollup after local promotion | P1 | planned | validated by use |
+| REQ-049 | E2E Enforcement Rule — when playwright.config.* or cypress.config.* exists, ALL test-running commands (execute, quick, debug, test-sync, integrate, verify, complete-milestone) MUST run the full E2E suite. Unit-only results are NEVER sufficient. QA subagent prompts explicitly mandate E2E detection and execution. | P1 | complete | enforced in 7 command files + CLAUDE.md + pre-commit-gate contract |
+| REQ-050 | Functional E2E Test Quality Standard — Playwright specs MUST verify functional behavior (state changes, data flow, content updates after actions), NOT just element existence (isVisible, toBeEnabled). Shallow layout tests that would pass on an empty HTML page are flagged and block verification. QA subagent audits for shallow tests. | P1 | complete | enforced in execute, qa, test-sync, verify, quick, debug, integrate, complete-milestone + global CLAUDE.md + CLAUDE-global template |
+| REQ-051 | Document Ripple Completion Gate — when a change affects multiple files, identify the full blast radius BEFORE starting, complete ALL updates in one pass, and only report completion after every downstream document is updated. Partial delivery is never acceptable. The user should never need to ask "did you update everything?" | P1 | complete | enforced in global CLAUDE.md + CLAUDE-global template + project CLAUDE.md |
+| REQ-052 | Doc-Ripple Subagent — dedicated agent auto-spawned after code-modifying commands (execute, integrate, quick, debug, wave) that analyzes git diff, identifies full blast radius of affected documents, and spawns parallel subagents to update them. Produces manifest audit trail. Threshold logic skips trivial changes. | P1 | complete | M28: contract ACTIVE, command file, 43 tests, wired into execute/integrate/quick/debug/wave |
+
 ## Technical Requirements
 
 | ID | Requirement | Priority | Status |
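Reviewer note on REQ-032: the row names ELO scoring with K=32, but `metrics-rollup.js` itself is not shown in this diff. The following is a minimal sketch of a textbook Elo update under stated assumptions, not the package's implementation. It assumes the milestone's `first_pass_rate` acts as the match result and that the opponent is a fixed baseline rating of 1500. `expectedScore`, `updateElo`, `baseline`, and `elo_before` are hypothetical names; `elo_after` and `elo_delta` match the fields the dashboard panel reads.

```javascript
// Hypothetical sketch of an Elo-style update with K=32 (not taken from metrics-rollup.js).
const K = 32;

// Standard Elo expected score for a player rated `rating` against `opponent`.
function expectedScore(rating, opponent) {
  return 1 / (1 + Math.pow(10, (opponent - rating) / 400));
}

// Assumption: first_pass_rate (0..1) plays the role of the match result,
// and the "opponent" is a fixed baseline rating.
function updateElo(eloBefore, firstPassRate, baseline = 1500) {
  const expected = expectedScore(eloBefore, baseline);
  const eloDelta = K * (firstPassRate - expected);
  return {
    elo_before: eloBefore,
    elo_after: eloBefore + eloDelta,
    elo_delta: eloDelta,
  };
}

console.log(updateElo(1500, 0.75));
```

With this formulation a milestone gains rating only when its first-pass rate beats the expectation implied by its current rating, so a high-rated process has to keep performing to hold its score.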
@@ -99,8 +124,47 @@
 | REQ-029 | Diagram Rendering Toolchain — Mermaid CLI → D2 → Kroki fallback chain | scan-diagrams | pending | planned |
 | REQ-030 | MCP Diagram Server Support — diagram-bridge-mcp / C4Diagrammer / mcp-mermaid | scan-diagrams | pending | planned |
 
-
-
+## Requirements Traceability (updated by plan phase — M25)
+
+| REQ-ID | Requirement Summary | Domain | Task(s) | Status |
+|---------|-------------------------------------------------------------|---------------------|--------------------------------|---------|
+| REQ-031 | Per-Task Telemetry Collection — collector + emission | metrics-collection | Task 1, 2, 3, 4, 5 | planned |
+| REQ-032 | Milestone Rollup & Process ELO — rollup + heuristics | metrics-rollup | Task 1, 2, 3, 4, 5 | planned |
+| REQ-033 | Metrics Dashboard Panel — /metrics endpoint + Chart.js | metrics-dashboard | Task 1, 2 | planned |
+| REQ-034 | gsd-t-metrics Command — 50th command | metrics-commands | Task 1, 3, 4 | planned |
+| REQ-035 | Process ELO in Status — ELO display in status output | metrics-commands | Task 2 | planned |
+
+**Orphaned requirements**: REQ-001 through REQ-017 (all M1-M13 deliverables, complete — not mapped to M14+ tasks by design).
+**Unanchored tasks**: metrics-commands Task 3 (CLI count) and Task 4 (4 reference files) are infrastructure supporting REQ-034 — implicitly mapped.
+
+## Requirements Traceability (updated by plan phase — M26)
+
+| REQ-ID | Requirement Summary | Domain | Task(s) | Status |
+|---------|-------------------------------------------------------------|---------------------|--------------------------------|---------|
+| REQ-036 | Declarative Rule Engine — rule evaluator + activation tracking | rule-engine | Task 1, 2, 3 | pending |
+| REQ-037 | Patch Template System — templates.jsonl + seed data | rule-engine | Task 2, 4 | pending |
+| REQ-038 | Patch Lifecycle Manager — 5-stage lifecycle + promotion gate | patch-lifecycle | Task 1, 2, 3 | pending |
+| REQ-039 | Active Rule Injection in Execute | command-integration | Task 1 | pending |
+| REQ-040 | Rule-Based Pre-Mortem in Plan | command-integration | Task 2 | pending |
+| REQ-041 | Distillation Extension — rules + patches + graduation | command-integration | Task 3 | pending |
+| REQ-042 | Quality Budget Governance — rework ceiling + tightening | command-integration | Task 3 | pending |
+
+**Orphaned requirements**: None — all M26 REQs mapped to tasks.
+**Unanchored tasks**: rule-engine Task 5 (tests) and patch-lifecycle Task 4 (tests) are QA infrastructure supporting all REQs. command-integration Task 4 (reference docs) supports Pre-Commit Gate compliance.
+
+## Requirements Traceability (updated by plan phase — M27)
+
+| REQ-ID | Requirement Summary | Domain | Task(s) | Status |
+|---------|-------------------------------------------------------------|---------------------|--------------------------------|---------|
+| REQ-043 | Global Sync Manager — read local, write global, compare | global-metrics | Task 1, 2, 3, 4 | pending |
+| REQ-044 | Cross-Project Rule Propagation — update-all syncs rules | cross-project-sync | Task 1, 3 | pending |
+| REQ-045 | Universal Rule Promotion — 3+ universal, 5+ npm candidate | global-metrics, cross-project-sync | gm Task 3, cps Task 2 | pending |
+| REQ-046 | Cross-Project Signal Comparison — metrics --cross-project | command-extensions | Task 1 | pending |
+| REQ-047 | Global ELO & Rankings — status global ELO display | command-extensions | Task 2 | pending |
+| REQ-048 | Global Rule Promotion on Milestone Completion | command-extensions | Task 3 | pending |
+
+**Orphaned requirements**: None — all M27 REQs mapped to tasks.
+**Unanchored tasks**: global-metrics Task 4 (tests) and cross-project-sync Task 3 (tests) are QA infrastructure supporting REQ-043 through REQ-045. command-extensions Task 4 (reference docs) supports Pre-Commit Gate compliance.
 
 ---
 
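Reviewer note on REQ-038: the five lifecycle stages and both thresholds are stated in the requirement, but `bin/patch-lifecycle.js` is not part of the rendered diff. The sketch below is one possible reading, not the shipped logic. It assumes "improvement over 2+ milestones" means every one of at least two measured milestones improved by more than 55%, and that graduation needs three sustained milestones after promotion. `nextStage`, `improvements`, and `sustainedMilestones` are hypothetical names.

```javascript
// Hypothetical sketch of the lifecycle in REQ-038; not taken from bin/patch-lifecycle.js.
const STAGES = ["candidate", "applied", "measured", "promoted", "graduated"];

// Assumption: `improvements` holds one relative-improvement value per milestone
// measured since the patch was applied (0.6 means 60% better than baseline).
function nextStage(patch) {
  const { stage, improvements = [], sustainedMilestones = 0 } = patch;
  if (!STAGES.includes(stage)) throw new Error(`unknown stage: ${stage}`);
  if (stage === "candidate") return "applied";
  if (stage === "applied") return improvements.length > 0 ? "measured" : "applied";
  if (stage === "measured") {
    // Promotion gate: more than 55% improvement across at least 2 milestones.
    const passesGate = improvements.length >= 2 && improvements.every((v) => v > 0.55);
    return passesGate ? "promoted" : "measured";
  }
  if (stage === "promoted") return sustainedMilestones >= 3 ? "graduated" : "promoted";
  return stage; // "graduated" is terminal
}

console.log(nextStage({ stage: "measured", improvements: [0.6, 0.7] }));
```

Keeping the transition function pure (patch in, stage name out) makes the gate easy to unit test separately from whatever file I/O the real manager performs.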
File without changes
package/package.json
CHANGED
@@ -1,40 +1,40 @@
-{
-  "name": "@tekyzinc/gsd-t",
-  "version": "2.
-  "description": "GSD-T: Contract-Driven Development for Claude Code —
-  "author": "Tekyz, Inc.",
-  "license": "MIT",
-  "repository": {
-    "type": "git",
-    "url": "git+https://github.com/Tekyz-Inc/get-stuff-done-teams.git"
-  },
-  "homepage": "https://github.com/Tekyz-Inc/get-stuff-done-teams#readme",
-  "keywords": [
-    "claude-code",
-    "gsd",
-    "ai-development",
-    "agent-teams",
-    "contract-driven-development",
-    "slash-commands"
-  ],
-  "main": "bin/gsd-t.js",
-  "bin": {
-    "gsd-t": "bin/gsd-t.js"
-  },
-  "scripts": {
-    "test": "node --test",
-    "prepublishOnly": "npm test"
-  },
-  "files": [
-    "bin/",
-    "commands/",
-    "scripts/",
-    "templates/",
-    "examples/",
-    "docs/",
-    "CHANGELOG.md"
-  ],
-  "engines": {
-    "node": ">=16.0.0"
-  }
-}
+{
+  "name": "@tekyzinc/gsd-t",
+  "version": "2.46.11",
+  "description": "GSD-T: Contract-Driven Development for Claude Code — 51 slash commands with headless CI/CD mode, graph-powered code analysis, real-time agent dashboard, execution intelligence, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
+  "author": "Tekyz, Inc.",
+  "license": "MIT",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/Tekyz-Inc/get-stuff-done-teams.git"
+  },
+  "homepage": "https://github.com/Tekyz-Inc/get-stuff-done-teams#readme",
+  "keywords": [
+    "claude-code",
+    "gsd",
+    "ai-development",
+    "agent-teams",
+    "contract-driven-development",
+    "slash-commands"
+  ],
+  "main": "bin/gsd-t.js",
+  "bin": {
+    "gsd-t": "bin/gsd-t.js"
+  },
+  "scripts": {
+    "test": "node --test",
+    "prepublishOnly": "npm test"
+  },
+  "files": [
+    "bin/",
+    "commands/",
+    "scripts/",
+    "templates/",
+    "examples/",
+    "docs/",
+    "CHANGELOG.md"
+  ],
+  "engines": {
+    "node": ">=16.0.0"
+  }
+}
package/scripts/gsd-t-dashboard-server.js
CHANGED
@@ -109,11 +109,28 @@ function handleEvents(req, res, eventsDir) {
   req.on("close", () => { clearInterval(timer); if (unwatchFile) unwatchFile(); if (dirWatcher) dirWatcher.close(); });
 }
 
-function
+function readMetricsData(metricsDir) {
+  const taskFile = path.join(metricsDir, "task-metrics.jsonl");
+  const rollupFile = path.join(metricsDir, "rollup.jsonl");
+  const taskMetrics = fs.existsSync(taskFile) ? safeReadJsonl(taskFile) : [];
+  const rollups = fs.existsSync(rollupFile) ? safeReadJsonl(rollupFile) : [];
+  return { taskMetrics, rollups };
+}
+
+function handleMetrics(req, res, projectDir) {
+  const metricsDir = path.join(projectDir, ".gsd-t", "metrics");
+  const data = readMetricsData(metricsDir);
+  res.writeHead(200, { "Content-Type": "application/json" });
+  res.end(JSON.stringify(data));
+}
+
+function startServer(port, eventsDir, htmlPath, projectDir) {
+  const projDir = projectDir || path.resolve(eventsDir, "..", "..");
   const server = http.createServer((req, res) => {
     const url = req.url.split("?")[0];
     if (url === "/" || url === "") return handleRoot(req, res, htmlPath);
     if (url === "/events") return handleEvents(req, res, eventsDir);
+    if (url === "/metrics") return handleMetrics(req, res, projDir);
     if (url === "/ping") return handlePing(req, res, port);
     if (url === "/stop") return handleStop(req, res, server);
     res.writeHead(404); res.end("Not found");
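Reviewer note: `readMetricsData` calls `safeReadJsonl`, which is neither defined nor imported in the hunks shown here. The sketch below is a hypothetical stand-in that illustrates what such a helper might do. It assumes an unreadable file yields an empty array and that malformed lines are skipped instead of throwing, which would keep `GET /metrics` responsive while a writer is mid-append. The real helper may behave differently.

```javascript
const fs = require("fs");

// Hypothetical stand-in for safeReadJsonl, which the hunk above calls but does not define.
// Assumption: unreadable files yield [] and malformed lines are skipped, not thrown.
function safeReadJsonl(filePath) {
  let text;
  try {
    text = fs.readFileSync(filePath, "utf8");
  } catch {
    return [];
  }
  const records = [];
  for (const line of text.split(/\r?\n/)) {
    if (!line.trim()) continue; // ignore blank lines, including the trailing one
    try {
      records.push(JSON.parse(line));
    } catch {
      // Skip a partially written or corrupt line.
    }
  }
  return records;
}
```

If the helper behaves like this, the `fs.existsSync` guards in `readMetricsData` are redundant but harmless; if it throws on a missing file, they are required.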
@@ -122,7 +139,7 @@ function startServer(port, eventsDir, htmlPath) {
   return { server, url: `http://localhost:${port}` };
 }
 
-module.exports = { startServer, tailEventsFile, readExistingEvents, parseEventLine, findEventsDir };
+module.exports = { startServer, tailEventsFile, readExistingEvents, parseEventLine, findEventsDir, readMetricsData };
 
 if (require.main === module) {
   const argv = process.argv.slice(2);
package/scripts/gsd-t-dashboard.html
CHANGED
@@ -8,6 +8,7 @@
 <script src="https://unpkg.com/dagre@0.8.5/dist/dagre.min.js"></script>
 <script src="https://unpkg.com/reactflow@11.11.4/dist/umd/index.js"></script>
 <link rel="stylesheet" href="https://unpkg.com/reactflow@11.11.4/dist/style.css">
+<script src="https://cdn.jsdelivr.net/npm/chart.js@4/dist/chart.umd.min.js"></script>
 <style>
 :root{--bg:#0d1117;--surface:#161b22;--border:#30363d;--text:#e6edf3;--muted:#7d8590;
 --green:#3fb950;--green-bg:#1a3a1e;--red:#f85149;--red-bg:#3a1a1a;
@@ -50,6 +51,15 @@ body{background:var(--bg);color:var(--text);font-family:var(--font);font-size:12
 .model-name{width:50px;color:var(--text);font-weight:500;}.model-bar-bg{flex:1;height:8px;background:var(--bg);border-radius:4px;overflow:hidden;}
 .model-bar{height:100%;border-radius:4px;transition:width .3s ease;}.model-bar.opus{background:var(--blue);}.model-bar.sonnet{background:var(--green);}.model-bar.haiku{background:var(--yellow);}
 .model-cnt{color:var(--muted);font-size:10px;min-width:20px;text-align:right;}
+.metrics-panel{background:var(--surface);border-top:1px solid var(--border);padding:12px;flex-shrink:0;max-height:280px;overflow-y:auto;}
+.metrics-panel .sb-hdr{padding:0 0 8px;border-bottom:none;}
+.elo-display{display:flex;align-items:center;gap:8px;margin-bottom:8px;font-size:12px;}
+.elo-score{font-size:18px;font-weight:bold;color:var(--blue);}
+.elo-delta{font-size:12px;}.elo-delta.up{color:var(--green);}.elo-delta.down{color:var(--red);}
+.metrics-chart{height:120px;margin-bottom:8px;}
+.domain-heat{width:100%;border-collapse:collapse;font-size:10px;margin-top:4px;}
+.domain-heat th{color:var(--muted);text-align:left;padding:2px 6px;border-bottom:1px solid var(--border);}
+.domain-heat td{padding:2px 6px;}.rate-good{color:var(--green);}.rate-warn{color:var(--yellow);}.rate-bad{color:var(--red);}
 </style>
 </head>
 <body>
@@ -65,6 +75,15 @@ body{background:var(--bg);color:var(--text);font-family:var(--font);font-size:12
 <div id="models" class="models"><div class="noevents">No model data yet</div></div>
 <div class="sb-hdr">Live Event Feed</div>
 <div id="feed" class="feed"><div class="noevents">Waiting for events...</div></div>
+<div class="metrics-panel">
+<div class="sb-hdr">Process Metrics</div>
+<div id="elo-display" class="elo-display"><span class="noevents">No metrics data</span></div>
+<div class="metrics-chart"><canvas id="trend-chart"></canvas></div>
+<table id="domain-heat" class="domain-heat" style="display:none">
+<thead><tr><th>Domain</th><th>Pass%</th><th>Avg(s)</th><th>Tasks</th></tr></thead>
+<tbody id="domain-heat-body"></tbody>
+</table>
+</div>
 </div>
 </div>
 <script>
@@ -194,6 +213,50 @@ function Dashboard(){
 }
 
 ReactDOM.render(React.createElement(Dashboard),document.getElementById('rf-root'));
+
+// ── Metrics Panel ──────────────────────────────────────────────────────────
+(function initMetrics(){
+  let trendChart=null;
+  function rateClass(r){return r>=0.8?'rate-good':r>=0.6?'rate-warn':'rate-bad';}
+  function fetchMetrics(){
+    fetch(`http://localhost:${PORT}/metrics`).then(r=>r.json()).then(data=>{
+      renderELO(data.rollups);renderTrend(data.rollups);renderHeatmap(data.rollups);
+    }).catch(()=>{});
+  }
+  function renderELO(rollups){
+    const el=document.getElementById('elo-display');
+    if(!rollups||!rollups.length){el.innerHTML='<span class="noevents">No metrics data</span>';return;}
+    const last=rollups[rollups.length-1];
+    const dc=last.elo_delta>=0?'up':'down';const arrow=last.elo_delta>=0?'+':'';
+    el.innerHTML=`<span class="elo-score">${Math.round(last.elo_after)}</span><span>ELO</span>`+
+      `<span class="elo-delta ${dc}">${arrow}${last.elo_delta.toFixed(1)}</span>`;
+  }
+  function renderTrend(rollups){
+    const ctx=document.getElementById('trend-chart');
+    if(!ctx||!rollups||rollups.length<1)return;
+    const labels=rollups.map(r=>r.milestone);
+    const rates=rollups.map(r=>(r.first_pass_rate*100).toFixed(1));
+    if(trendChart)trendChart.destroy();
+    trendChart=new Chart(ctx,{type:'line',data:{labels,datasets:[{label:'First-Pass %',
+      data:rates,borderColor:'#3fb950',backgroundColor:'rgba(63,185,80,0.1)',fill:true,tension:0.3}]},
+      options:{responsive:true,maintainAspectRatio:false,scales:{y:{min:0,max:100,
+      ticks:{color:'#7d8590',font:{size:9}}},x:{ticks:{color:'#7d8590',font:{size:9}}}},
+      plugins:{legend:{labels:{color:'#e6edf3',font:{size:10}}}}}});
+  }
+  function renderHeatmap(rollups){
+    const tbl=document.getElementById('domain-heat');const body=document.getElementById('domain-heat-body');
+    if(!rollups||!rollups.length||!tbl||!body){return;}
+    const last=rollups[rollups.length-1];
+    if(!last.domain_breakdown||!last.domain_breakdown.length)return;
+    tbl.style.display='table';
+    body.innerHTML=last.domain_breakdown.map(d=>{
+      const r=d.first_pass_rate;const cls=rateClass(r);
+      return `<tr><td>${d.domain}</td><td class="${cls}">${(r*100).toFixed(0)}%</td>`+
+        `<td>${d.avg_duration_s.toFixed(0)}</td><td>${d.tasks}</td></tr>`;
+    }).join('');
+  }
+  fetchMetrics();setInterval(fetchMetrics,30000);
+})();
 </script>
 </body>
 </html>
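Reviewer note: the panel code implies a shape for each `rollup.jsonl` record (`milestone`, `first_pass_rate`, `elo_after`, `elo_delta`, and a `domain_breakdown` array with `domain`, `first_pass_rate`, `avg_duration_s`, `tasks`). The sketch below restates that shape with invented sample values and reproduces the heatmap row logic as a pure function so it can run outside a browser. `sampleRollup`, `escapeHtml`, and `heatmapRows` are hypothetical. The escaping step is a suggestion, since the diffed code interpolates `d.domain` into `innerHTML` as-is.

```javascript
// Sample /metrics rollup entry, inferred from the fields the panel reads.
// The values are invented for illustration.
const sampleRollup = {
  milestone: "M25",
  first_pass_rate: 0.82,
  elo_after: 1512.4,
  elo_delta: 12.4,
  domain_breakdown: [
    { domain: "metrics-rollup", first_pass_rate: 0.9, avg_duration_s: 41.6, tasks: 5 },
    { domain: "metrics-dashboard", first_pass_rate: 0.5, avg_duration_s: 73.2, tasks: 2 },
  ],
};

// Same thresholds as rateClass in the diffed dashboard code.
function rateClass(r) {
  return r >= 0.8 ? "rate-good" : r >= 0.6 ? "rate-warn" : "rate-bad";
}

// Hypothetical helper: escape text before it is placed in innerHTML.
function escapeHtml(s) {
  return String(s).replace(/[&<>"']/g, (c) =>
    ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[c]));
}

// Pure version of the heatmap row builder: returns data, not markup.
function heatmapRows(rollup) {
  return (rollup.domain_breakdown || []).map((d) => ({
    domain: escapeHtml(d.domain),
    cls: rateClass(d.first_pass_rate),
    pass: (d.first_pass_rate * 100).toFixed(0) + "%",
    avg: d.avg_duration_s.toFixed(0),
    tasks: d.tasks,
  }));
}

console.log(heatmapRows(sampleRollup));
```

Separating row computation from DOM writes would also let the existing `node --test` suite cover the threshold logic without a browser.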
package/templates/CLAUDE-global.md
CHANGED
@@ -44,20 +44,22 @@ PROJECT or FEATURE or SCAN
 | `/user:gsd-t-milestone` | Define new milestone |
 | `/user:gsd-t-partition` | Decompose into domains + contracts |
 | `/user:gsd-t-discuss` | Multi-perspective design exploration |
-| `/user:gsd-t-plan` | Create atomic task lists per domain |
+| `/user:gsd-t-plan` | Create atomic task lists per domain (tasks auto-split to fit one context window) |
 | `/user:gsd-t-impact` | Analyze downstream effects before execution |
-| `/user:gsd-t-execute` | Run tasks
+| `/user:gsd-t-execute` | Run tasks — task-level fresh dispatch, worktree isolation, adaptive replanning, active rule injection |
 | `/user:gsd-t-test-sync` | Keep tests aligned with code changes |
 | `/user:gsd-t-qa` | QA agent — test generation, execution, gap reporting |
+| `/user:gsd-t-doc-ripple` | Automated document ripple — update downstream docs after code changes |
 | `/user:gsd-t-integrate` | Wire domains together |
-| `/user:gsd-t-verify` | Run quality gates |
-| `/user:gsd-t-complete-milestone` | Archive milestone + git tag |
+| `/user:gsd-t-verify` | Run quality gates + goal-backward behavior verification |
+| `/user:gsd-t-complete-milestone` | Archive milestone + git tag (goal-backward gate, rule engine distillation) |
 | `/user:gsd-t-wave` | Full cycle (auto-advances all phases) |
-| `/user:gsd-t-status` | Cross-domain progress view |
+| `/user:gsd-t-status` | Cross-domain progress view with token breakdown, global ELO and cross-project rankings |
 | `/user:gsd-t-debug` | Systematic debugging |
 | `/user:gsd-t-quick` | Fast task, respects contracts |
 | `/user:gsd-t-reflect` | Generate retrospective from event stream, propose memory updates |
 | `/user:gsd-t-visualize` | Launch browser dashboard |
+| `/user:gsd-t-metrics` | View task telemetry, process ELO, domain health, and cross-project comparison (`--cross-project`) |
 | `/user:gsd-t-health` | Validate .gsd-t/ structure, optionally repair |
 | `/user:gsd-t-pause` | Save exact position for reliable resume |
 | `/user:gsd-t-populate` | Auto-populate docs from existing codebase |
@@ -232,6 +234,49 @@ After Playwright tests finish (pass or fail), **kill any app/server processes th
 
 This applies everywhere Playwright tests are executed: execute, test-sync, verify, quick, wave, debug, complete-milestone, and integrate.
 
+### E2E Enforcement Rule (MANDATORY)
+
+**Running only unit tests when E2E tests exist is a test failure.** This is non-negotiable.
+
+```
+BEFORE reporting "tests pass" for ANY task:
+├── Does playwright.config.* or cypress.config.* exist?
+│ YES → You MUST run the full E2E suite. Unit-only results are INCOMPLETE.
+│ NO → Unit/integration tests are sufficient.
+├── Did you run every detected test runner?
+│ NO → Run it now. Do not commit until ALL suites pass.
+└── Report format MUST include all suites:
+"Unit: X/Y pass | E2E: X/Y pass" (or "E2E: N/A — no config")
+```
+
+The conditional "if UI/routes/flows changed" in command files applies to **writing new E2E specs**, not to **running existing ones**. You always run existing E2E specs. Always.
+
+### E2E Test Quality Standard (MANDATORY)
+
+**E2E tests must be FUNCTIONAL tests, not LAYOUT tests.** This is non-negotiable.
+
+A layout test checks that elements exist (`isVisible`, `toBeAttached`, `toBeEnabled`, `toHaveCount`). A functional test checks that features work — actions produce correct outcomes.
+
+```
+LAYOUT TEST (WRONG — passes even if every feature is broken):
+await expect(page.locator('#tab-sessions')).toBeVisible();
+await page.click('#tab-sessions');
+// ← No assertion that the tab's content actually loaded
+
+FUNCTIONAL TEST (RIGHT — fails if the feature is broken):
+await page.click('#tab-sessions');
+await expect(page.locator('.session-list')).toContainText('Session 1');
+// ← Proves clicking the tab loaded the session data
+```
+
+Every Playwright assertion must verify one of:
+- **State changed**: After click/type/submit, the app state is different (new content, updated data, changed status)
+- **Data flowed**: User input → API call → response rendered (use `page.waitForResponse` or assert on rendered data)
+- **Content loaded**: Navigation/tab switch → destination content appeared (assert on text/data unique to destination)
+- **Widget responded**: Terminal accepted keystrokes and produced output, editor saved changes, form submitted and data persisted
+
+**If a test would pass on an empty HTML page with the correct element IDs and no JavaScript, it is not a functional test.** Rewrite it.
+
 ## QA Agent (Mandatory)
 
 Any GSD-T phase that produces or validates code **MUST run QA**. The QA agent's sole job is test generation, execution, and gap reporting. It never writes feature code.
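Reviewer note: the first branch of the E2E Enforcement Rule in the hunk above is a file-existence check, which can be sketched in a few lines. This is an illustration only; the template expresses the rule as instructions to an agent, not as code. `detectE2EConfigs` and `requiredSuites` are hypothetical names, and the sketch assumes the config file sits directly in the project root.

```javascript
const fs = require("fs");

// Hypothetical helper: returns the E2E config files found directly in projectDir.
// Matches playwright.config.* and cypress.config.* as named in the rule.
function detectE2EConfigs(projectDir) {
  return fs
    .readdirSync(projectDir)
    .filter((name) => /^(playwright|cypress)\.config\.[^.]+$/.test(name))
    .sort();
}

// Decide which suites must run before "tests pass" may be reported.
function requiredSuites(projectDir) {
  return detectE2EConfigs(projectDir).length > 0 ? ["unit", "e2e"] : ["unit"];
}

console.log(requiredSuites(process.cwd()));
```

A monorepo with configs in subpackages would need a recursive search; the root-only check here is the simplest reading of the rule.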
@@ -246,12 +291,20 @@ Any GSD-T phase that produces or validates code **MUST run QA**. The QA agent's
 **Task subagent spawn instruction (execute/integrate):**
 ```
 Task subagent (general-purpose):
-"Run
-
-
+"Run ALL configured test suites — detect and run every one:
+a. Unit tests (vitest/jest/mocha): run the full suite
+b. E2E tests: check for playwright.config.* or cypress.config.* — if found, run the FULL E2E suite
+c. NEVER skip E2E when a config file exists. Running only unit tests is a QA FAILURE.
+d. Read .gsd-t/contracts/ for contract definitions. Check contract compliance.
+e. AUDIT E2E test quality: Review each Playwright spec — if any test only checks element
+existence (isVisible, toBeAttached, toBeEnabled) without verifying functional behavior
+(state changes, data loaded, content updated after user actions), flag it as
+'SHALLOW TEST — needs functional assertions'. A passing test suite that doesn't catch
+broken features is a QA FAILURE.
+Report format: 'Unit: X/Y pass | E2E: X/Y pass (or N/A if no config) | Contract: compliant/violations | Shallow tests: N'"
 ```
 
-**QA failure blocks phase completion.** Lead cannot proceed until QA reports PASS or user explicitly overrides.
+**QA failure OR shallow tests found blocks phase completion.** Lead cannot proceed until QA reports PASS with zero shallow tests, or user explicitly overrides.
 
 ## Model Display (MANDATORY)
 
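Reviewer note: step (e) of the new QA prompt asks the subagent to flag specs that only check element existence. In the package this audit is performed by an LLM subagent reading the spec. The sketch below is a much cruder text heuristic, offered only to make the criterion concrete. `EXISTENCE_ONLY` lists the calls the template names (plus `toBeVisible`, which its example uses). `FUNCTIONAL` is an assumed, incomplete list of Playwright calls that suggest behavior is being verified. `countCalls` and `isShallowSpec` are hypothetical names.

```javascript
// Calls the template treats as existence-only checks.
const EXISTENCE_ONLY = ["isVisible", "toBeVisible", "toBeAttached", "toBeEnabled", "toHaveCount"];
// Assumption: these calls count as evidence of functional verification.
const FUNCTIONAL = ["toContainText", "toHaveText", "toHaveValue", "toHaveURL", "waitForResponse"];

// Count call sites such as `.toBeVisible(` in the spec source text.
function countCalls(source, names) {
  return names.reduce(
    (n, name) => n + (source.match(new RegExp(`\\b${name}\\s*\\(`, "g")) || []).length,
    0
  );
}

// A spec is flagged when it has existence checks and no functional ones.
function isShallowSpec(source) {
  return countCalls(source, EXISTENCE_ONLY) > 0 && countCalls(source, FUNCTIONAL) === 0;
}

const layoutSpec = `
await expect(page.locator('#tab-sessions')).toBeVisible();
await page.click('#tab-sessions');
`;
console.log(isShallowSpec(layoutSpec));
```

A text match like this cannot tell whether an assertion follows a user action, so it would at best pre-filter candidates for the subagent's review.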
@@ -340,6 +393,35 @@ BEFORE EVERY COMMIT:
 
 If ANY answer is YES and the doc is NOT updated, update it BEFORE committing. No exceptions.
 
+## Document Ripple Completion Gate (MANDATORY)
+
+**NEVER report a task as "done" or present a summary until ALL downstream documents are updated.** This is not optional.
+
+When a change affects multiple files (e.g., a new standard that applies across command files, a renamed API, a new convention), you MUST:
+
+1. **Identify the full blast radius BEFORE starting**: List every file that needs the change
+2. **Complete ALL updates in one pass**: Do not update 3 of 8 files and then present a summary
+3. **Run the Pre-Commit Gate on the COMPLETE changeset**: Not on a partial subset
+4. **Only THEN report completion**
+
+```
+BEFORE reporting "done" or presenting a summary:
+├── Did this change establish a new standard, rule, or convention?
+│ YES → Grep for every file that should enforce it. Update ALL of them.
+├── Did this change modify a pattern used in multiple command files?
+│ YES → Find and update EVERY command file that uses that pattern.
+├── Did this change affect a template (CLAUDE-global, CLAUDE-project, etc.)?
+│ YES → The template AND the live equivalent (~/.claude/CLAUDE.md) must match.
+├── Did this change add a new requirement?
+│ YES → Add to docs/requirements.md in the same pass.
+├── Have I checked EVERY file in the blast radius?
+│ NO → Keep going. Do not present partial work.
+└── Am I about to say "want me to also update X?" or "should I check Y?"
+YES → STOP. Just update X and check Y. Then report done.
+```
+
+**The test for this gate**: If the user asks "did you update all the documents?" and the answer would be "no, I missed some" — you failed this gate. The user should never need to ask.
+
 ## Execution Behavior
 - ALWAYS check docs/architecture.md before adding or modifying components.
 - ALWAYS check docs/workflows.md before changing any multi-step process.
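Reviewer note: the gate in the hunk above tells the agent to "grep for every file that should enforce" a changed standard. A minimal sketch of that blast-radius listing follows, assuming a plain substring match and skipping `node_modules` and `.git`. `findBlastRadius` is a hypothetical name, and the doc-ripple subagent described in REQ-052 may work quite differently (it is said to analyze `git diff`).

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: list files under rootDir whose text mentions `needle`.
function findBlastRadius(rootDir, needle) {
  const hits = [];
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    const full = path.join(rootDir, entry.name);
    if (entry.isDirectory()) {
      // Skip directories that never contain project documents.
      if (entry.name === "node_modules" || entry.name === ".git") continue;
      hits.push(...findBlastRadius(full, needle));
    } else if (entry.isFile() && fs.readFileSync(full, "utf8").includes(needle)) {
      hits.push(full);
    }
  }
  return hits.sort();
}
```

Calling `findBlastRadius("commands", "playwright.config")` before editing would list every command file that mentions the pattern, so step 1 of the gate starts from a concrete list.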
@@ -427,7 +509,7 @@ Successor mapping:
 | `execute` | `test-sync` | |
 | `test-sync` | `verify` | `integrate` (if multi-domain) |
 | `integrate` | `verify` | |
-| `verify` |
+| `verify` | *(auto-invokes complete-milestone)* | |
 | `complete-milestone` | `status` | |
 | `scan` | `promote-debt` | `milestone` |
 | `init` | `scan` | `milestone` |