@tekyzinc/gsd-t 2.39.13 → 2.46.11

Files changed (50)
  1. package/CHANGELOG.md +12 -0
  2. package/README.md +19 -10
  3. package/bin/desktop.ini +2 -0
  4. package/bin/global-sync-manager.js +350 -0
  5. package/bin/gsd-t.js +592 -2
  6. package/bin/metrics-collector.js +167 -0
  7. package/bin/metrics-rollup.js +200 -0
  8. package/bin/patch-lifecycle.js +195 -0
  9. package/bin/rule-engine.js +160 -0
  10. package/commands/desktop.ini +2 -0
  11. package/commands/gsd-t-complete-milestone.md +194 -6
  12. package/commands/gsd-t-debug.md +38 -3
  13. package/commands/gsd-t-doc-ripple.md +148 -0
  14. package/commands/gsd-t-execute.md +328 -54
  15. package/commands/gsd-t-help.md +32 -10
  16. package/commands/gsd-t-integrate.md +59 -7
  17. package/commands/gsd-t-metrics.md +143 -0
  18. package/commands/gsd-t-plan.md +49 -2
  19. package/commands/gsd-t-qa.md +26 -5
  20. package/commands/gsd-t-quick.md +36 -3
  21. package/commands/gsd-t-status.md +78 -0
  22. package/commands/gsd-t-test-sync.md +23 -2
  23. package/commands/gsd-t-verify.md +142 -10
  24. package/commands/gsd-t-visualize.md +11 -1
  25. package/commands/gsd-t-wave.md +64 -18
  26. package/docs/GSD-T-README.md +10 -6
  27. package/docs/architecture.md +84 -2
  28. package/docs/ci-examples/desktop.ini +2 -0
  29. package/docs/ci-examples/github-actions.yml +104 -0
  30. package/docs/ci-examples/gitlab-ci.yml +116 -0
  31. package/docs/desktop.ini +2 -0
  32. package/docs/framework-comparison-scorecard.md +160 -0
  33. package/docs/infrastructure.md +87 -1
  34. package/docs/prd-graph-engine.md +2 -2
  35. package/docs/prd-gsd2-hybrid.md +258 -135
  36. package/docs/requirements.md +66 -2
  37. package/examples/.gsd-t/contracts/desktop.ini +2 -0
  38. package/examples/.gsd-t/desktop.ini +2 -0
  39. package/examples/.gsd-t/domains/desktop.ini +2 -0
  40. package/examples/.gsd-t/domains/example-domain/desktop.ini +2 -0
  41. package/examples/desktop.ini +2 -0
  42. package/examples/rules/.gitkeep +0 -0
  43. package/examples/rules/desktop.ini +2 -0
  44. package/package.json +40 -40
  45. package/scripts/desktop.ini +2 -0
  46. package/scripts/gsd-t-dashboard-server.js +19 -2
  47. package/scripts/gsd-t-dashboard.html +63 -0
  48. package/scripts/gsd-t-event-writer.js +1 -0
  49. package/templates/CLAUDE-global.md +92 -10
  50. package/templates/desktop.ini +2 -0
@@ -104,7 +104,8 @@ Work through each dimension sequentially. For each:
104
104
  - Confirm specs cover: happy path, error states, edge cases, all modes/flags
105
105
  - If specs are missing or incomplete → invoke `gsd-t-test-sync` to create them, then re-run
106
106
  - **Missing E2E coverage on new functionality = verification FAIL**
107
- 5. Tests are NOT optional — verification cannot pass without running them and confirming comprehensive coverage
107
+ 5. **Functional test quality audit**: Read every Playwright spec. For each `test()` block, verify assertions check **functional behavior** (state changed after action, data loaded, content updated, widget responded) NOT just element existence (`isVisible`, `toBeAttached`, `toBeEnabled`). A test that would pass on an empty HTML page with the right element IDs is a **shallow test** and counts as a verification FAIL. Flag shallow tests and rewrite them before proceeding.
108
+ 6. Tests are NOT optional — verification cannot pass without running them and confirming comprehensive, functional coverage
108
109
 
109
110
  ### Team Mode (when agent teams are enabled)
110
111
  ```
@@ -199,6 +200,95 @@ Create or update `.gsd-t/verify-report.md`:
199
200
  | 2 | ui | Add loading states for async calls | WARN |
200
201
  ```
201
202
 
203
+ ## Step 5.25: Metrics Quality Budget Check
204
+
205
+ Check task-metrics for the current milestone to detect quality budget violations:
206
+
207
+ 1. Run via Bash:
208
+ `node -e "const c = require('./bin/metrics-collector.js'); const r = c.readTaskMetrics({milestone: '{milestone-id}'}); if(!r.length){console.log('No metrics data — quality budget check skipped');process.exit(0);} const pass=r.filter(t=>t.fix_cycles===0&&t.pass).length; const rate=pass/r.length; console.log('First-pass rate: '+(rate*100).toFixed(1)+'% ('+pass+'/'+r.length+')'); if(rate<0.6) console.log('⚠️ Quality budget WARNING: first-pass rate below 60%');" 2>/dev/null || true`
209
+
210
+ 2. Run heuristics check via Bash:
211
+ `node -e "const m=require('./bin/metrics-rollup.js'); const r=m.readRollups({milestone:'{milestone-id}'}); if(r.length&&r[r.length-1].heuristic_flags.some(f=>f.severity==='HIGH')) console.log('⚠️ HIGH severity heuristic flag detected — review before completing milestone');" 2>/dev/null || true`
212
+
213
+ 3. Display quality metrics summary inline. Quality budget violation is a **WARNING** (non-blocking) — does not fail verify.
214
+
215
+ 4. Include quality budget status in the verification report (Step 5):
216
+ `- Quality Budget: {PASS/WARN} — first-pass rate {N}%{, HIGH heuristic: {name} if any}`
217
+
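The inline `node -e` command in item 1 is dense, so here is the same first-pass-rate logic unpacked as a minimal sketch. It assumes each metrics record carries the `fix_cycles` and `pass` fields that the one-liner reads; the helper name `summarizeQualityBudget` and the sample records are hypothetical and not part of the package.

```javascript
// Hypothetical helper: mirrors the logic of the inline `node -e` command.
// Assumes each record has `fix_cycles` (number) and `pass` (boolean).
function summarizeQualityBudget(records, threshold = 0.6) {
  if (records.length === 0) {
    // No metrics data: the check is skipped rather than failed.
    return { status: "SKIPPED", rate: null, passed: 0, total: 0 };
  }
  // A task counts as first-pass only if it passed with zero fix cycles.
  const passed = records.filter((t) => t.fix_cycles === 0 && t.pass).length;
  const rate = passed / records.length;
  return {
    status: rate < threshold ? "WARN" : "PASS", // WARN is non-blocking
    rate,
    passed,
    total: records.length,
  };
}

// Made-up sample data for illustration.
const sampleRecords = [
  { fix_cycles: 0, pass: true },
  { fix_cycles: 2, pass: true },
  { fix_cycles: 0, pass: false },
  { fix_cycles: 0, pass: true },
];
const budget = summarizeQualityBudget(sampleRecords);
console.log(
  `First-pass rate: ${(budget.rate * 100).toFixed(1)}% ` +
    `(${budget.passed}/${budget.total}) -> ${budget.status}`
);
```

Returning a status object instead of printing directly keeps the warning non-blocking: the caller decides how to surface it in the verification report.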
218
+ ## Step 5.5: Goal-Backward Verification (Post-Gate Behavior Check)
219
+
220
+ This step runs **after all 8 quality gates pass**. It verifies that milestone goals are actually achieved end-to-end — not just structurally present. It catches placeholder implementations that pass all structural gates.
221
+
222
+ Refer to `.gsd-t/contracts/goal-backward-contract.md` for the full verification flow, placeholder patterns, and findings report format.
223
+
224
+ ### 5.5.1 Load Milestone Goals and Requirements
225
+
226
+ 1. Read `.gsd-t/progress.md` — extract the current milestone name and goals
227
+ 2. Read `docs/requirements.md` — identify **critical requirements** (skip trivial/low-priority items)
228
+
229
+ ### 5.5.2 Trace Requirements to Behavior
230
+
231
+ For each critical requirement:
232
+
233
+ 1. **If `.gsd-t/graph/meta.json` exists (graph available)**:
234
+ - Trace the requirement → code path → behavior chain using graph queries
235
+ - Use `getRequirementFor`, `getCallers`, and `getTestsFor` to build the chain
236
+ - Flag requirements with no traceable code path as CRITICAL findings
237
+
238
+ 2. **If graph is not available (fallback to grep)**:
239
+ - Search the codebase for the feature/function implementing each requirement
240
+ - Trace from entry point → core logic → output/response
241
+
242
+ ### 5.5.3 Scan for Placeholder Patterns
243
+
244
+ For each file identified in the requirement traces above, scan for these placeholder patterns:
245
+
246
+ | Pattern | Detection Hint | Severity |
247
+ |---------|---------------|----------|
248
+ | console.log placeholder | `console.log.*TODO\|console.log.*implement` | CRITICAL |
249
+ | TODO/FIXME in implementation | `// TODO\|// FIXME\|# TODO\|# FIXME` in non-test files | CRITICAL |
250
+ | Empty function body | `function \w+\(\) \{\}` or `\(\) => \{\}` with no logic | CRITICAL |
251
+ | Throw not-implemented | `throw new Error.*not implemented\|throw new Error.*TODO` | CRITICAL |
252
+ | Hardcoded return | `return "success"\|return true` with no conditional logic | HIGH |
253
+ | Static UI text | Static `<span>` or text that never updates based on state | HIGH |
254
+ | Pass-through stub | `return input\|return req\|return data` with no transformation | MEDIUM |
255
+
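A line-by-line scan for the CRITICAL rows of the table above could look roughly like the sketch below. The regular expressions are adapted from the detection hints into JavaScript syntax and are assumptions, as are the names `PLACEHOLDER_RULES` and `scanForPlaceholders`. The HIGH and MEDIUM rows need surrounding-code analysis and are left out, and the "non-test files" restriction is not enforced here.

```javascript
// Illustrative rules adapted from the detection hints; regexes are assumptions.
const PLACEHOLDER_RULES = [
  { pattern: "console.log placeholder", severity: "CRITICAL", regex: /console\.log.*(TODO|implement)/ },
  { pattern: "TODO/FIXME in implementation", severity: "CRITICAL", regex: /(\/\/|#)\s*(TODO|FIXME)/ },
  { pattern: "Empty function body", severity: "CRITICAL", regex: /function \w+\(\) \{\}|\(\) => \{\}/ },
  { pattern: "Throw not-implemented", severity: "CRITICAL", regex: /throw new Error.*(not implemented|TODO)/ },
];

// Returns one finding per offending line (first matching rule wins).
function scanForPlaceholders(filePath, sourceText) {
  const findings = [];
  sourceText.split("\n").forEach((line, index) => {
    const rule = PLACEHOLDER_RULES.find((r) => r.regex.test(line));
    if (rule) {
      findings.push({
        location: `${filePath}:${index + 1}`, // matches the File:Line report column
        pattern: rule.pattern,
        severity: rule.severity,
      });
    }
  });
  return findings;
}

// Made-up source text for illustration.
const sampleSource = [
  "function syncUser() {}",
  "function save(data) {",
  "  // TODO wire up the database",
  "  console.log('TODO: implement save');",
  "  throw new Error('not implemented');",
  "}",
].join("\n");

const placeholderFindings = scanForPlaceholders("src/user.js", sampleSource);
console.log(placeholderFindings);
```

Because rule order decides which pattern is reported for a line that matches several rules, the most specific rules are listed first.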
256
+ ### 5.5.4 Produce Findings Report
257
+
258
+ Format findings per the goal-backward-contract.md report format:
259
+
260
+ ```markdown
261
+ ## Goal-Backward Verification Report
262
+
263
+ ### Status: PASS | FAIL
264
+
265
+ ### Findings
266
+ | # | Requirement | File:Line | Pattern | Severity | Description |
267
+ |---|-------------|-----------|---------|----------|-------------|
268
+ | 1 | {req-id} | {path}:{line} | {pattern} | {severity} | {what's wrong} |
269
+
270
+ ### Summary
271
+ - Requirements checked: {N}
272
+ - Findings: {N} ({critical}, {high}, {medium})
273
+ - Verdict: {PASS if 0 critical/high, FAIL otherwise}
274
+ ```
275
+
276
+ ### 5.5.5 Apply Blocking Rules
277
+
278
+ - **CRITICAL or HIGH findings** → Goal-Backward status = **FAIL** — block verification
279
+ - Append findings to the Critical section of the verification report (Step 5)
280
+ - Set overall verification status to FAIL
281
+ - **MEDIUM findings** → Goal-Backward status = **WARN** — log but do not block
282
+ - Append findings to the Warnings section of the verification report (Step 5)
283
+ - **No findings** → Goal-Backward status = **PASS** — add to verification report summary
284
+
285
+ Add a `Goal-Backward:` line to the Step 5 verification report summary:
286
+ ```
287
+ - Goal-Backward: {PASS/WARN/FAIL} — {N} requirements checked, {N} findings ({critical} critical, {high} high, {medium} medium)
288
+ ```
289
+
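The blocking rules in 5.5.5 reduce to a small severity-to-status mapping. The sketch below shows one way to express it, assuming findings are objects with a `severity` field of `CRITICAL`, `HIGH`, or `MEDIUM`; the function name `goalBackwardStatus` is hypothetical.

```javascript
// Hypothetical helper: derives the Goal-Backward status from finding severities.
function goalBackwardStatus(findings) {
  const count = (severity) => findings.filter((f) => f.severity === severity).length;
  const critical = count("CRITICAL");
  const high = count("HIGH");
  const medium = count("MEDIUM");

  let status = "PASS"; // no findings at all
  if (critical + high > 0) {
    status = "FAIL"; // blocks verification
  } else if (medium > 0) {
    status = "WARN"; // logged, does not block
  }
  return { status, critical, high, medium };
}

console.log(goalBackwardStatus([{ severity: "MEDIUM" }, { severity: "HIGH" }]));
console.log(goalBackwardStatus([{ severity: "MEDIUM" }]));
console.log(goalBackwardStatus([]));
```

The returned counts map directly onto the `{critical}`, `{high}`, and `{medium}` placeholders in the `Goal-Backward:` summary line.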
290
+ ---
291
+
202
292
  ## Step 6: Handle Remediation
203
293
 
204
294
  If there are CRITICAL findings:
@@ -217,15 +307,9 @@ Update `.gsd-t/progress.md`:
217
307
 
218
308
  ### Autonomy Behavior
219
309
 
220
- **Level 3 (Full Auto)**:
221
- - VERIFIED → Log "✅ Verify complete — all quality gates passed" and auto-advance to complete-milestone. Do NOT wait for user input.
222
- - CONDITIONAL PASS → Log warnings, treat as VERIFIED, and auto-advance. Do NOT wait for user input.
223
- - FAIL → Auto-execute remediation tasks (up to 2 fix attempts). If still failing after 2 attempts, STOP and report to user.
224
-
225
- **Level 1–2**:
226
- - VERIFIED → Milestone complete, proceed to next milestone or ship
227
- - CONDITIONAL PASS → User decides if warnings are acceptable
228
- - FAIL → Return to execute phase for remediation tasks
310
+ **All Levels**:
311
+ - VERIFIED or CONDITIONAL PASS → **Auto-invoke complete-milestone** (see Step 8 below). Completing a verified milestone is mechanical — there is no judgment call that benefits from user review.
312
+ - FAIL → **Level 3**: Auto-execute remediation tasks (up to 2 fix attempts). If still failing after 2 attempts, STOP and report to user. **Level 1–2**: Return to execute phase for remediation tasks.
229
313
 
230
314
  ## Document Ripple
231
315
 
@@ -238,6 +322,54 @@ Update `.gsd-t/progress.md`:
238
322
  4. **`.gsd-t/techdebt.md`** — If verification found new quality or security issues, add as debt
239
323
  5. **`docs/requirements.md`** — If verification revealed unmet requirements, update status
240
324
 
325
+ ## Step 8: Auto-Invoke Complete-Milestone
326
+
327
+ **This step is MANDATORY and runs at ALL autonomy levels.** Completing a verified milestone is a mechanical operation (archive, tag, bump version, update docs). There is no decision that benefits from user review — the decision was made when verification passed.
328
+
329
+ If status is VERIFY-FAILED:
330
+ - Do NOT invoke complete-milestone
331
+ - Report failures and stop
332
+
333
+ If status is VERIFIED or VERIFIED-WITH-WARNINGS:
334
+ 1. Log: "✅ Verify complete — spawning complete-milestone agent..."
335
+
336
+ **OBSERVABILITY LOGGING (MANDATORY):**
337
+ Before spawning — run via Bash:
338
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
339
+
340
+ 2. Spawn a Task subagent (model: sonnet, mode: bypassPermissions):
341
+ ```
342
+ "Execute the complete-milestone phase of the current GSD-T milestone.
343
+
344
+ Read and follow the full instructions in commands/gsd-t-complete-milestone.md
345
+ (resolve from ~/.claude/commands/ if not in project).
346
+ Read .gsd-t/progress.md for current milestone and state.
347
+ Read CLAUDE.md for project conventions.
348
+ Read .gsd-t/contracts/ for domain interfaces.
349
+
350
+ Complete the phase fully:
351
+ - Follow every step in the command file
352
+ - Update .gsd-t/progress.md status when done
353
+ - Run document ripple as specified
354
+ - Commit your work
355
+
356
+ Report back: one-line status summary."
357
+ ```
358
+
359
+ After subagent returns — run via Bash:
360
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
361
+ Compute tokens and compaction:
362
+ - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
363
+ - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
364
+ Append to `.gsd-t/token-log.md`:
365
+ `| {DT_START} | {DT_END} | gsd-t-verify | Step 8 | sonnet | {DURATION}s | auto-complete-milestone | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
366
+
367
+ 3. Verify subagent result: Read `.gsd-t/progress.md` — confirm status is COMPLETED. If not, report the failure.
368
+
369
+ **Why this is mandatory**: Without auto-completion, verified milestones remain in VERIFIED state indefinitely. Requirements stay unmarked, progress.md is stale, and future sessions cannot tell the work was done. This is the root cause of "GSD-T forgot it did this work" — the milestone was built and verified but never formally completed.
370
+
371
+ **Why a subagent**: Complete-milestone is a 12-step process (gap analysis, archive, version bump, git tag, doc ripple). Verify is already heavy with 8+ quality gates. Spawning a fresh-context subagent avoids compaction risk — and complete-milestone loads everything it needs from files (progress.md, verify-report.md, contracts).
372
+
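The token and compaction arithmetic in Step 8 can be read as the following sketch. It mirrors the two shell expressions given in that step and, like them, assumes at most one compaction happened while the subagent ran. The function name `tokenUsage` and its parameter names are hypothetical.

```javascript
// Hypothetical helper mirroring the shell arithmetic in Step 8.
function tokenUsage({ tokStart, tokEnd, tokMax, finishedAt }) {
  if (tokEnd >= tokStart) {
    // Counter only grew: no compaction detected.
    return { tokens: tokEnd - tokStart, compacted: null };
  }
  // Counter dropped: assume one compaction. Count the tokens up to the
  // context maximum plus the tokens accumulated after the reset.
  return { tokens: tokMax - tokStart + tokEnd, compacted: finishedAt };
}

console.log(tokenUsage({ tokStart: 50000, tokEnd: 80000, tokMax: 200000, finishedAt: "2026-03-22 10:05" }));
console.log(tokenUsage({ tokStart: 180000, tokEnd: 40000, tokMax: 200000, finishedAt: "2026-03-22 10:05" }));
```

In the second call the counter fell from 180000 to 40000, so the estimate is (200000 - 180000) + 40000 tokens and the compaction timestamp is recorded.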
241
373
  $ARGUMENTS
242
374
 
243
375
  ## Auto-Clear
@@ -39,7 +39,17 @@ Run via Bash:
39
39
  node ~/.claude/scripts/gsd-t-event-writer.js --type command_invoked --command gsd-t-visualize --reasoning "Launching dashboard" || true
40
40
  ```
41
41
 
42
- ## Step 1.5: Graph Data for Dashboard
42
+ ## Step 1.5: Context Metrics for Dashboard
43
+
44
+ If `.gsd-t/token-log.md` exists, the dashboard server automatically reads it and provides context utilization metrics for visualization. These metrics are served from the `/api/token-breakdown` endpoint and rendered as:
45
+
46
+ 1. **Context utilization timeline** — Ctx% over time, ordered by Datetime-start
47
+ 2. **Token breakdown by domain** — bar chart grouping Tokens by Domain column (gracefully handles older rows without Domain column — they are grouped as "(untagged)")
48
+ 3. **Compaction proximity warnings** — rows where Ctx% >= 70 are highlighted; rows where Ctx% >= 85 are marked critical (🔴)
49
+
50
+ If `.gsd-t/token-log.md` does not exist, context metrics panels are hidden (not shown as errors).
51
+
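How the dashboard server actually parses the log is not shown in this diff. As a rough sketch of the behaviour described above, the code below parses token-log rows, groups tokens by domain with the "(untagged)" fallback for older rows, and classifies Ctx% against the 70 and 85 thresholds. Column positions assume the 12-column header that the wave command writes; all function names and sample rows are hypothetical.

```javascript
// Assumed column order (1-based): Tokens = 8, Domain = 10, Ctx% = 12.
function parseTokenLogRow(line) {
  const cells = line.split("|").slice(1, -1).map((cell) => cell.trim());
  const pct = parseFloat(cells[11]); // NaN for missing cells and for "N/A"
  return {
    start: cells[0],
    tokens: Number(cells[7]) || 0,
    domain: cells[9] || "(untagged)", // older 9-column rows have no Domain cell
    ctxPct: Number.isNaN(pct) ? null : pct,
  };
}

// Sums tokens per domain for the bar chart.
function tokensByDomain(rows) {
  const totals = {};
  for (const row of rows) {
    totals[row.domain] = (totals[row.domain] || 0) + row.tokens;
  }
  return totals;
}

// Classifies a row for the compaction proximity warnings.
function proximityLevel(ctxPct) {
  if (ctxPct === null) return "unknown";
  if (ctxPct >= 85) return "critical";
  if (ctxPct >= 70) return "warning";
  return "ok";
}

// Made-up sample rows: one 12-column row, one older 9-column row, one critical row.
const sampleRows = [
  "| 2026-03-22 10:00 | 2026-03-22 10:05 | gsd-t-wave | EXECUTE | sonnet | 300s | phase: EXECUTE | 12000 | null | auth | T1 | 72.5 |",
  "| 2026-03-20 09:00 | 2026-03-20 09:04 | gsd-t-wave | PLAN | sonnet | 240s | phase: PLAN | 8000 | null |",
  "| 2026-03-22 11:00 | 2026-03-22 11:06 | gsd-t-verify | Step 8 | sonnet | 360s | auto-complete-milestone | 3000 | null | auth | | 88 |",
].map(parseTokenLogRow);

console.log(tokensByDomain(sampleRows));
console.log(sampleRows.map((row) => proximityLevel(row.ctxPct)));
```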
52
+ ## Step 1.6: Graph Data for Dashboard
43
53
 
44
54
  If `.gsd-t/graph/index.json` exists, the dashboard can render entity-relationship visualizations from the graph data. The dashboard server will detect and serve graph data automatically — no additional configuration needed.
45
55
 
@@ -79,8 +79,24 @@ After phase agent returns — run via Bash:
79
79
  Compute tokens and compaction:
80
80
  - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
81
81
  - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
82
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
83
- `| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | {TOKENS} | {COMPACTED} |`
82
+ Compute context utilization — run via Bash:
83
+ `if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
84
+ Alert on context thresholds (display to user inline):
85
+ - If CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
86
+ - If CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting in plan."`
87
+
88
+ **Orchestrator Context Self-Check (MANDATORY):**
89
+ After EVERY phase agent returns, check the wave orchestrator's own context:
90
+ - **If CTX_PCT >= 70:**
91
+ 1. Save checkpoint to `.gsd-t/progress.md` — record which phases are complete, which remain
92
+ 2. Output: `⚠️ Wave orchestrator context at {CTX_PCT}% — approaching limit. Progress saved. Run /clear then /user:gsd-t-wave to continue from the next phase.`
93
+ 3. **STOP the wave loop.** Do NOT spawn the next phase agent. The next session resumes from saved state.
94
+ - **If CTX_PCT < 70:** Continue to next phase.
95
+
96
+ This prevents the wave orchestrator from running out of context mid-wave.
97
+
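The utilization formula and the self-check thresholds can be summarized in a short sketch. Two points are assumptions rather than statements from the command file: the wave continues when the percentage is unavailable (the "N/A" case), and the 85% check is evaluated before the 70% check so that only one alert fires. The names `contextPercent` and `orchestratorDecision` are hypothetical.

```javascript
// Percentage of the context window in use, truncated to one decimal place
// (similar to `bc` with scale=1). Returns null for the "N/A" case.
function contextPercent(used, max) {
  if (!(max > 0)) return null;
  return Math.floor((used * 1000) / max) / 10;
}

// Decides whether the wave loop may spawn the next phase agent.
function orchestratorDecision(ctxPct) {
  if (ctxPct === null) return { alert: "none", continueWave: true }; // assumption
  if (ctxPct >= 85) return { alert: "critical", continueWave: false };
  if (ctxPct >= 70) return { alert: "warning", continueWave: false };
  return { alert: "none", continueWave: true };
}

const pct = contextPercent(141000, 200000);
console.log(pct, orchestratorDecision(pct));
console.log(orchestratorDecision(contextPercent(60000, 200000)));
```

Doing the comparison on a number avoids a pitfall of the shell version: `[ ... -ge ... ]` in POSIX shells only compares integers, while `bc` with `scale=1` yields a decimal string.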
98
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
99
+ `| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
84
100
 
85
101
  ### Phase Sequence
86
102
 
@@ -114,8 +130,13 @@ Spawn agent → `commands/gsd-t-impact.md`
114
130
 
115
131
  #### 5. EXECUTE
116
132
  Spawn agent → `commands/gsd-t-execute.md`
117
- - This is the heaviest phase. The execute agent will handle its own domain agent spawning and QA agent internally.
118
- - After: Read `progress.md`, verify status = EXECUTED
133
+ - This is the heaviest phase. The execute agent uses **task-level dispatch** (fresh-dispatch-contract.md): one Task subagent per task within each domain, each receiving only scope.md + relevant contracts + single task + graph context + up to 5 prior summaries. The execute agent handles domain task-dispatching and QA internally.
134
+ - **Adaptive replanning**: After each domain completes, the execute agent runs a replan check (per `adaptive-replan-contract.md`). If a completed domain's task summaries reveal new constraints (e.g., deprecated API, wrong column name, incompatible library), the execute agent checks remaining domains' `tasks.md` files for invalidated assumptions and revises them on disk before dispatching the next domain. Maximum 2 replan cycles per execute run — if exceeded, execution pauses for user input. All replan decisions are logged to the Decision Log in `progress.md`. The wave phase summary includes any replan actions taken.
135
+ - **Team/parallel mode**: If the plan defines parallel domains (same wave), the execute agent dispatches each domain teammate with `isolation: "worktree"` (per worktree-isolation-contract.md). Each domain works in an isolated git worktree. After all domains complete, the execute agent runs the Sequential Merge Protocol: merge domain A → test → merge domain B → test. Per-domain rollback if tests fail. Worktrees are cleaned up after all merges complete.
136
+ - After: Read `progress.md`, verify status = EXECUTED. Phase summary must include replan actions if any occurred:
137
+ ```
138
+ 📋 Phase 5 (EXECUTE): {N}/{N} tasks done | Replan cycles: {N} | Domains revised: {list or "none"}
139
+ ```
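The replan budget described in the adaptive replanning bullet can be pictured as a loop with a counter. This is a minimal sketch, not the execute agent's logic: `runDomain` and `reviseRemaining` are hypothetical callbacks standing in for agent work, and it assumes a replan cycle is counted each time a completed domain reports new constraints while other domains remain.

```javascript
const MAX_REPLAN_CYCLES = 2; // limit stated in the EXECUTE phase description

// runDomain(domain) -> { newConstraints: string[] }
// reviseRemaining(remainingDomains, constraints) -> names of domains whose tasks.md was revised
function executeWithReplanning(domains, runDomain, reviseRemaining) {
  let replanCycles = 0;
  const revised = new Set();
  for (let i = 0; i < domains.length; i++) {
    const { newConstraints } = runDomain(domains[i]);
    const remaining = domains.slice(i + 1);
    if (newConstraints.length === 0 || remaining.length === 0) continue;
    if (replanCycles === MAX_REPLAN_CYCLES) {
      // A third cycle would exceed the budget: pause for user input.
      return { status: "PAUSED", replanCycles, revised: [...revised], completed: i + 1 };
    }
    replanCycles += 1;
    reviseRemaining(remaining, newConstraints).forEach((d) => revised.add(d));
  }
  return { status: "EXECUTED", replanCycles, revised: [...revised], completed: domains.length };
}

// Made-up run: the first domain discovers a renamed column that affects "api".
const replanResult = executeWithReplanning(
  ["auth", "api", "ui"],
  (domain) => ({ newConstraints: domain === "auth" ? ["users.email was renamed"] : [] }),
  (remaining) => remaining.filter((d) => d === "api")
);
console.log(replanResult);
```

The returned `replanCycles` and `revised` values correspond to the `Replan cycles` and `Domains revised` fields of the phase summary line.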
119
140
 
120
141
  #### 6. TEST-SYNC
121
142
  Spawn agent → `commands/gsd-t-test-sync.md`
@@ -125,15 +146,39 @@ Spawn agent → `commands/gsd-t-test-sync.md`
125
146
  Spawn agent → `commands/gsd-t-integrate.md`
126
147
  - After: Read `progress.md`, verify status = INTEGRATED
127
148
 
128
- #### 8. VERIFY
149
+ #### 8. VERIFY + COMPLETE
129
150
  Spawn agent → `commands/gsd-t-verify.md`
151
+ - The verify agent runs all 8 standard quality gates **plus** the goal-backward verification step (Step 5.5 in gsd-t-verify.md), which checks that milestone goals are actually achieved end-to-end and scans for placeholder patterns per `.gsd-t/contracts/goal-backward-contract.md`
152
+ - Goal-backward runs after all structural gates pass — CRITICAL or HIGH findings block verification; MEDIUM findings are warnings only
153
+ - **Verify auto-invokes complete-milestone** (Step 8 of gsd-t-verify.md). The verify agent handles both verification AND milestone completion in a single agent context. Do NOT spawn a separate complete agent.
130
154
  - After: Read `progress.md`, check status:
131
- - VERIFIED → proceed to Complete
132
- - VERIFY_FAILED → handle remediation (see Error Recovery)
155
+ - COMPLETED → milestone done (verify passed and auto-completed)
156
+ - VERIFIED → verify passed but complete-milestone failed — spawn a standalone complete agent as fallback
157
+ - VERIFY_FAILED → handle remediation (see Error Recovery) — includes goal-backward failures
158
+ - Phase summary must include the `Goal-Backward:` line from verify-report.md:
159
+ ```
160
+ 📋 Phase 8 (VERIFY+COMPLETE): {N} gates passed | Goal-Backward: {PASS/WARN/FAIL} — {N} requirements checked, {N} findings
161
+ ```
162
+
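The three status outcomes of phase 8 amount to a small dispatch table. The sketch below only illustrates that routing; the action strings and the function name `afterVerifyPhase` are hypothetical, and treating an unknown status as an error is an assumption.

```javascript
// Maps the status read from progress.md after phase 8 to the orchestrator's next action.
function afterVerifyPhase(status) {
  switch (status) {
    case "COMPLETED":
      return "milestone-done"; // verify passed and auto-completed
    case "VERIFIED":
      return "spawn-complete-fallback"; // verify passed, complete-milestone did not finish
    case "VERIFY_FAILED":
      return "remediate"; // includes goal-backward failures
    default:
      throw new Error(`Unexpected status after verify: ${status}`);
  }
}

console.log(["COMPLETED", "VERIFIED", "VERIFY_FAILED"].map(afterVerifyPhase));
```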
163
+ #### 9. DOC-RIPPLE (Automated — after verify+complete)
164
+
165
+ After the final phase completes but before wave reports done:
166
+
167
+ 1. Run threshold check — read `git diff --name-only HEAD~1` and evaluate against doc-ripple-contract.md trigger conditions
168
+ 2. If SKIP: log "Doc-ripple: SKIP — {reason}" and proceed
169
+ 3. If FIRE: spawn doc-ripple agent:
170
+
171
+ ⚙ [{model}] gsd-t-doc-ripple → blast radius analysis + parallel updates
172
+
173
+ Task subagent (general-purpose, model: sonnet):
174
+ "Execute the doc-ripple workflow per commands/gsd-t-doc-ripple.md.
175
+ Git diff context: {files changed list}
176
+ Command that triggered: wave
177
+ Produce manifest at .gsd-t/doc-ripple-manifest.md.
178
+ Update all affected documents.
179
+ Report: 'Doc-ripple: {N} checked, {N} updated, {N} skipped'"
133
180
 
134
- #### 9. COMPLETE
135
- Spawn agent → `commands/gsd-t-complete-milestone.md`
136
- - After: Read `progress.md`, verify status = COMPLETED
181
+ 4. After doc-ripple returns, verify manifest exists and report summary inline
137
182
 
138
183
  ### Between Each Phase
139
184
 
@@ -286,16 +331,17 @@ If command files in `~/.claude/commands/` are tampered with, wave agents will ex
286
331
  │ check check check check + check │
287
332
  │ gate │
288
333
  │ │
289
- ┌──────────┐ ┌────────┐ ┌───────────┐ ┌─────────────────┐
290
- │ │ COMPLETE │ ← │ VERIFY │ ← │ INTEGRATE │ ←──── │ FULL TEST-SYNC │
291
- │ │ agent 9 │ │agent 8 │ │ agent 7 │ │ agent 6 │
292
- └────┬────┘ └────┬────┘ └─────┬─────┘ └────────┬────────┘
293
-
294
- archive status + status status
295
- git tag gate check check check
334
+ ┌──────────────────┐ ┌───────────┐ ┌─────────────────┐
335
+ │ │ VERIFY+COMPLETE │ ← │ INTEGRATE │ ←──── │ FULL TEST-SYNC │
336
+ │ │ agent 8 │ │ agent 7 │ │ agent 6 │
337
+ └────────┬─────────┘ └─────┬─────┘ └────────┬────────┘
338
+ ↓ ↓
339
+ gate check → status status
340
+ auto-complete check check
341
+ │ archive + tag │
296
342
  │ │
297
343
  │ Each agent: fresh context window, reads state from files, dies when done │
298
- │ Orchestrator: ~30KB total, never compacts
344
+ │ Orchestrator: 8 agents (was 9), ~30KB total, never compacts
299
345
  └──────────────────────────────────────────────────────────────────────────────┘
300
346
  ```
301
347
 
@@ -12,6 +12,8 @@ A methodology for reliable, parallelizable development using Claude Code with op
12
12
 
13
13
  **Catches downstream effects** — analyzes impact before changes break things.
14
14
 
15
+ **Self-learning rule engine** — declarative rules detect failure patterns from task metrics. Patches progress through 5 lifecycle stages with measurable improvement gates before graduating into permanent methodology.
16
+
15
17
  ---
16
18
 
17
19
  ## Quick Start
@@ -96,26 +98,28 @@ GSD-T reads all state files and tells you exactly where you left off.
96
98
  | `/user:gsd-t-milestone` | Define new milestone | Manual |
97
99
  | `/user:gsd-t-partition` | Decompose into domains + contracts | In wave |
98
100
  | `/user:gsd-t-discuss` | Multi-perspective design exploration | In wave |
99
- | `/user:gsd-t-plan` | Create atomic task lists per domain | In wave |
101
+ | `/user:gsd-t-plan` | Create atomic task lists per domain (tasks auto-split to fit one context window) | In wave |
100
102
  | `/user:gsd-t-impact` | Analyze downstream effects | In wave |
101
- | `/user:gsd-t-execute` | Run tasks (solo or team) | In wave |
103
+ | `/user:gsd-t-execute` | Run tasks — task-level fresh dispatch, worktree isolation, adaptive replanning | In wave |
102
104
  | `/user:gsd-t-test-sync` | Sync tests with code changes | In wave |
103
105
  | `/user:gsd-t-qa` | QA agent — test generation, execution, gap reporting | Auto-spawned |
106
+ | `/user:gsd-t-doc-ripple` | Automated document ripple — update downstream docs after code changes | Auto-spawned |
104
107
  | `/user:gsd-t-integrate` | Wire domains together | In wave |
105
- | `/user:gsd-t-verify` | Run quality gates | In wave |
106
- | `/user:gsd-t-complete-milestone` | Archive + git tag | In wave |
108
+ | `/user:gsd-t-verify` | Run quality gates + goal-backward verification → auto-invokes complete-milestone | In wave |
109
+ | `/user:gsd-t-complete-milestone` | Archive + git tag (auto-invoked by verify, also standalone) | In wave |
107
110
 
108
111
  ### Automation & Utilities
109
112
 
110
113
  | Command | Purpose | Auto |
111
114
  |---------|---------|------|
112
115
  | `/user:gsd-t-wave` | Full cycle, auto-advances all phases | Manual |
113
- | `/user:gsd-t-status` | Cross-domain progress view | Manual |
116
+ | `/user:gsd-t-status` | Cross-domain progress view with token breakdown, global ELO and cross-project rankings | Manual |
114
117
  | `/user:gsd-t-resume` | Restore context, continue | Manual |
115
118
  | `/user:gsd-t-quick` | Fast task with GSD-T guarantees | Manual |
116
119
  | `/user:gsd-t-reflect` | Generate retrospective from event stream, propose memory updates | Manual |
117
120
  | `/user:gsd-t-visualize` | Launch browser dashboard — SSE server + React Flow agent visualization | Manual |
118
121
  | `/user:gsd-t-debug` | Systematic debugging with state | Manual |
122
+ | `/user:gsd-t-metrics` | View task telemetry, process ELO, signal distribution, domain health, and cross-project comparison (`--cross-project`) | Manual |
119
123
  | `/user:gsd-t-health` | Validate .gsd-t/ structure, optionally repair | Manual |
120
124
  | `/user:gsd-t-pause` | Save exact position for reliable resume | Manual |
121
125
  | `/user:gsd-t-log` | Sync progress Decision Log with recent git activity | Manual |
@@ -154,7 +158,7 @@ GSD-T reads all state files and tells you exactly where you left off.
154
158
  │ │ │ task + at verify)│ │
155
159
  │ │ └───────────────────┘ │
156
160
  │ ▼ │
157
- │ complete-milestone ◄── verify ◄── integrate ◄──────────────────────┘
161
+ verify+complete ◄──────────── integrate ◄──────────────────────┘
158
162
  │ │
159
163
  └─────────────────────────────────────────────────────────────────────────────┘
160
164
  ```
@@ -1,6 +1,6 @@
1
1
  # Architecture — GSD-T Framework (@tekyzinc/gsd-t)
2
2
 
3
- ## Last Updated: 2026-03-19 (Scan #10, Post-M20/M21)
3
+ ## Last Updated: 2026-03-22 (M23 Headless Mode)
4
4
 
5
5
  ## System Overview
6
6
 
@@ -16,7 +16,7 @@ The framework has no runtime — it is consumed entirely by Claude Code's slash
16
16
  - **Purpose**: Install, update, diagnose, and manage GSD-T across projects
17
17
  - **Location**: `bin/gsd-t.js` (1,798 lines, 90+ functions, all ≤ 30 lines)
18
18
  - **Dependencies**: Node.js built-ins only (fs, path, os, child_process, https, crypto)
19
- - **Subcommands**: install, update, status, doctor, init, uninstall, update-all, register, changelog, graph (index/status/query)
19
+ - **Subcommands**: install, update, status, doctor, init, uninstall, update-all, register, changelog, graph (index/status/query), headless (exec/query)
20
20
  - **Organization**: Configuration → Guard section → Helpers → Heartbeat → Commands → Install/Update → Init → Status → Uninstall → Update-All → Doctor → Register → Update Check → Help → Main dispatch
21
21
  - **All functions ≤ 30 lines** (M6 refactoring). Largest: `doRegister()` at 30 lines, `summarize()` at 30 lines.
22
22
 
@@ -66,6 +66,18 @@ The framework has no runtime — it is consumed entirely by Claude Code's slash
66
66
  - **`scripts/gsd-t-dashboard.html`** (194 lines): React 17 + React Flow v11.11.4 + Dagre via CDN (no build step, no npm deps). Dark theme (`#0d1117`). Renders agent hierarchy as directed graph from `parent_agent_id` relationships. Live event feed (max 200, outcome color-coded). Auto-reconnects on SSE disconnect. Port configurable via `?port=` URL param.
67
67
  - **`commands/gsd-t-visualize.md`** (104 lines, 48th command): Starts server via `--detach`, polls `/ping` up to 5s, opens browser cross-platform (win32/darwin/linux). Accepts `stop` argument to shut down server. Step 0 self-spawn with OBSERVABILITY LOGGING.
68
68
 
69
+ ### Headless Mode (M23 — complete)
70
+ - **doHeadless(args)**: Dispatch function for the `headless` CLI subcommand.
71
+ - **doHeadlessExec(command, cmdArgs, flags)**: Wraps `claude -p "/user:gsd-t-{command}"` via `execFileSync`. Verifies claude CLI availability, enforces timeout, writes log file if `--log` requested. Returns structured JSON if `--json` flag set.
72
+ - **parseHeadlessFlags(args)**: Extracts `--json`, `--timeout=N`, `--log` from raw args. Returns `{ flags, positional }`.
73
+ - **buildHeadlessCmd(command, cmdArgs)**: Builds the `/user:gsd-t-{command}` prompt string.
74
+ - **mapHeadlessExitCode(processExitCode, output)**: Maps process exit code + output text patterns to GSD-T exit codes (0–4).
75
+ - **headlessLogPath(projectDir, timestamp)**: Generates `.gsd-t/headless-{timestamp}.log` path.
76
+ - **doHeadlessQuery(type)**: Dispatches to one of 7 query functions. All pure Node.js file reads, no LLM calls, <100ms.
77
+ - **Query functions** (7): `queryStatus`, `queryDomains`, `queryContracts`, `queryDebt`, `queryContext`, `queryBacklog`, `queryGraph` — each reads corresponding `.gsd-t/` file and returns typed JSON result.
78
+ - **Exit codes**: 0=success, 1=verify-fail, 2=context-budget-exceeded, 3=error, 4=blocked-needs-human
79
+ - **CI/CD examples**: `docs/ci-examples/github-actions.yml` (GitHub Actions), `docs/ci-examples/gitlab-ci.yml` (GitLab CI)
80
+
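The source of `parseHeadlessFlags` is not part of this excerpt. Based only on the one-line description above (extract `--json`, `--timeout=N`, `--log`; return `{ flags, positional }`), a flag parser of that shape might look like the sketch below. The function name `parseHeadlessFlagsSketch`, the default values, and the handling of invalid timeouts are assumptions.

```javascript
// Illustrative only: not the package's implementation.
function parseHeadlessFlagsSketch(args) {
  const flags = { json: false, log: false, timeout: null };
  const positional = [];
  for (const arg of args) {
    if (arg === "--json") {
      flags.json = true;
    } else if (arg === "--log") {
      flags.log = true;
    } else if (arg.startsWith("--timeout=")) {
      const value = Number(arg.slice("--timeout=".length));
      // Assumption: ignore non-numeric or non-positive timeouts.
      if (Number.isFinite(value) && value > 0) flags.timeout = value;
    } else {
      positional.push(arg); // subcommand and its arguments
    }
  }
  return { flags, positional };
}

// Exit-code meanings as listed in the architecture notes.
const HEADLESS_EXIT_CODES = {
  0: "success",
  1: "verify-fail",
  2: "context-budget-exceeded",
  3: "error",
  4: "blocked-needs-human",
};

const parsedArgs = parseHeadlessFlagsSketch(["exec", "verify", "--json", "--timeout=300"]);
console.log(parsedArgs, HEADLESS_EXIT_CODES[1]);
```

Keeping flag parsing separate from execution makes the CI examples easier to reason about: a pipeline step only needs the process exit code to decide whether to fail the job.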
69
81
  ### Graph Engine (M20 — complete)
70
82
  - **`bin/graph-store.js`** (147 lines): File-based graph storage in `.gsd-t/graph/`. 8 JSON files (index, calls, imports, contracts, requirements, tests, surfaces, meta). Read/write operations, MD5 file hashing for incremental indexing, staleness detection. Zero external deps. Note: no symlink protection (TD-099).
71
83
  - **`bin/graph-parsers.js`** (327 lines): Language-specific entity parsers. JS/TS: function declarations, arrow functions, classes, methods, imports (ES/CJS), exports. Python: def/class/import. Regex-based (no Tree-sitter). Returns `{ entities, imports, calls }`.
@@ -271,6 +283,76 @@ QA runs inline or as Task subagent depending on phase (M10 refactor). Removed fr
  | 2026-02-18 | gsd-t-tools.js as state utility CLI | Reduces token-heavy markdown parsing; compact JSON responses save ~50K tokens/wave | Parsing progress.md inline (original) |
  | 2026-02-18 | continue-here files for pause/resume | More precise than progress.md; captures exact task+next-action, not just phase | progress.md alone (less precise) |

+ ### GSD 2 Tier 1 — Execution Quality (M22 — complete v2.40.10)
+
+ Five interlocking capabilities eliminate context rot, enable safe parallel execution, and verify behavior rather than structure alone.
+
+ **Task-Level Fresh Dispatch**
+
+ Execute dispatches one subagent per TASK (not per domain). Each task agent gets a fresh context window containing only: domain scope.md, relevant contracts, the single current task, graph context for touched files, and prior task summaries (10-20 lines each). Context utilization per task: ~10-20% (down from 60-75% cumulative per domain). Compaction never triggers. The domain dispatcher (lightweight orchestrator) sequences tasks and passes summaries — it never accumulates full task context.
+
+ ```
+ Execute orchestrator (summaries only — ~4-8% ctx)
+ └── Domain-A task-dispatcher
+     ├── Task 1 subagent (fresh, 10-20% ctx) → summary → dies
+     ├── Task 2 subagent (fresh + task 1 summary) → summary → dies
+     └── Task N subagent (fresh + prior summaries) → summary → dies
+ ```
+
+ **Plan command constraint** (added M22): Every task must fit in one context window. If estimated scope exceeds 70% context, plan splits the task automatically.
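The dispatch pattern described in this hunk can be sketched roughly as follows. This is a minimal, synchronous illustration and not code from the package: `dispatchDomain` and the caller-supplied `runTask` are hypothetical names, and a real dispatcher would presumably spawn subagents asynchronously.

```javascript
// Hypothetical sketch of task-level fresh dispatch (names are illustrative,
// not GSD-T APIs). The dispatcher keeps only short summaries between tasks.
function dispatchDomain(domain, tasks, runTask) {
  const summaries = []; // the only state carried from one task to the next

  for (const task of tasks) {
    // Each task gets a freshly built context: the domain scope, the single
    // current task, and copies of prior summaries. Nothing else is reused.
    const context = {
      scope: domain.scope,
      task,
      priorSummaries: [...summaries],
    };
    const summary = runTask(context); // stand-in for a subagent run
    summaries.push({ taskId: task.id, summary });
  }
  return summaries;
}
```

Because the context object is rebuilt on every iteration, nothing a task accumulates internally can leak into the next one; only the returned summary survives.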
+
+ **Worktree Isolation**
+
+ Parallel domain agents work in isolated git worktrees via Agent tool's `isolation: "worktree"` parameter. No shared filesystem — domains cannot step on each other's files. Merges are sequential and atomic:
+
+ ```
+ Dispatch N domains (isolation: "worktree") → parallel execution
+ └── Domain A completes → merge A → run integration tests
+ └── Domain B completes → merge B → run integration tests
+ └── Conflict or test failure → rollback that domain, others unaffected
+ ```
+
+ Rollback granularity is per-domain (not per-commit). Worktrees are cleaned up after all merges complete.
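One way to picture the sequential merge with per-domain rollback is the sketch below. It assumes three caller-supplied callbacks (`mergeDomain`, `runIntegrationTests`, `rollbackDomain`); these are hypothetical stand-ins, since the diff does not show how the package actually performs these steps.

```javascript
// Hypothetical sketch: merge completed domains one at a time and roll back
// only the domain whose merge conflicts or whose integration tests fail.
function mergeSequentially(domains, { mergeDomain, runIntegrationTests, rollbackDomain }) {
  const results = [];
  for (const domain of domains) {
    let ok = false;
    try {
      mergeDomain(domain);          // may throw on a merge conflict
      ok = runIntegrationTests();   // true when the merged tree passes
    } catch (err) {
      ok = false;
    }
    if (!ok) rollbackDomain(domain); // other domains are left untouched
    results.push({ domain, status: ok ? "merged" : "rolled-back" });
  }
  return results;
}
```

Merging one domain at a time means a failing test run can be attributed to exactly one domain, which is what makes per-domain rollback sufficient.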
+
+ **Goal-Backward Verification**
+
+ After all structural quality gates pass (tests, contracts, file existence), a goal-backward pass verifies behavior. Reads milestone goals, traces each requirement to code, and checks for placeholders:
+ - `console.log("TODO")` / `console.log("implement X")`
+ - Hardcoded return values (`return "Synced"`, `return 200` on a path that should compute)
+ - `// TODO`, `// FIXME`, `// PLACEHOLDER` comments in critical paths
+ - UI components rendering static strings where dynamic data is required
+
+ Applied in: `verify`, `complete-milestone`, `wave` (verification phase).
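Two of the placeholder categories listed above (TODO-style log calls and marker comments) lend themselves to a simple text scan, sketched below. The patterns and the `findPlaceholders` name are assumptions for illustration; in the package this check is described as part of an LLM-driven verification pass, and hardcoded return values or static UI strings need semantic judgment that a regex cannot provide.

```javascript
// Hypothetical sketch: flag lines that look like leftover placeholders.
// The pattern list is illustrative and deliberately incomplete.
const PLACEHOLDER_PATTERNS = [
  { name: "todo-log", regex: /console\.log\(\s*["'`](TODO|implement\b)/i },
  { name: "marker-comment", regex: /\/\/\s*(TODO|FIXME|PLACEHOLDER)\b/ },
];

function findPlaceholders(source) {
  const hits = [];
  source.split("\n").forEach((text, index) => {
    for (const { name, regex } of PLACEHOLDER_PATTERNS) {
      if (regex.test(text)) hits.push({ line: index + 1, name });
    }
  });
  return hits;
}
```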
+
+ **Adaptive Replanning**
+
+ After each domain completes in execute, the orchestrator reads the domain's result summary and evaluates whether remaining domain plans remain valid. If execution revealed new constraints (deprecated API, schema mismatch, missing dependency, incompatible library), affected domain `tasks.md` files are rewritten on disk before the next domain is dispatched.
+
+ Guard: max 2 replanning cycles per execute run. After that, pause for user input (prevents new-constraint → replan → new-constraint loops).
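The replanning guard amounts to a bounded counter around the domain loop. The sketch below shows one possible shape; `runDomains`, `executeDomain`, `replan`, and the `newConstraints` field are hypothetical names, not identifiers taken from the package.

```javascript
// Hypothetical sketch of the "max 2 replanning cycles" guard.
const MAX_REPLAN_CYCLES = 2;

function runDomains(domains, executeDomain, replan) {
  let replanCycles = 0;
  for (let i = 0; i < domains.length; i++) {
    const result = executeDomain(domains[i]);
    const constraints = result.newConstraints || [];
    if (constraints.length === 0) continue; // remaining plans still valid

    if (replanCycles >= MAX_REPLAN_CYCLES) {
      // Budget exhausted: stop and hand control back to the user.
      return { status: "paused-for-user", completed: i + 1, replanCycles };
    }
    replan(domains.slice(i + 1), constraints); // rewrite remaining plans
    replanCycles++;
  }
  return { status: "done", completed: domains.length, replanCycles };
}
```

Counting cycles per run, rather than per domain, is what breaks a chain in which every replan surfaces a fresh constraint.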
+
+ **Context Observability**
+
+ Extended token-log.md format (M22) includes `Domain`, `Task`, and `Ctx%` columns:
+
+ ```
+ | Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |
+ ```
+
+ Alert thresholds (inline display):
+ - `Ctx% >= 70%` → warning: task approaching compaction, consider splitting
+ - `Ctx% >= 85%` → critical: compaction likely, task MUST be split
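The two alert thresholds map onto a small classifier. The function name and the `"ok"` label for values below 70 are assumptions added for illustration; only the 70 and 85 cut-offs come from the text above.

```javascript
// Hypothetical sketch: classify a Ctx% value using the documented thresholds.
function classifyCtx(ctxPercent) {
  if (ctxPercent >= 85) return "critical"; // compaction likely, task must be split
  if (ctxPercent >= 70) return "warning";  // approaching compaction, consider splitting
  return "ok";                             // label assumed for values below 70
}
```

Checking the higher threshold first matters: any value at or above 85 also satisfies the 70 test, so the order of the comparisons determines the result.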
+
+ `gsd-t-status` displays token breakdown by domain/task/phase. `gsd-t-visualize` consumes the same data for dashboard rendering.
+
+ ## Planned Architecture Changes (M23-M24)
+
+ **M23: Headless Mode**
+ - New `gsd-t headless` CLI subcommand wrapping `claude -p` for unattended execution.
+ - New `gsd-t headless query` for instant JSON state access (no LLM).
+
+ **M24: Docker**
+ - Dockerfile + docker-compose for containerized enterprise execution.
+
  ## Known Architecture Concerns

  1. **CLI single-file size**: bin/gsd-t.js at 1,438 lines exceeds the 200-line convention, but splitting adds complexity for questionable benefit given zero-dependency constraint. Accepted deviation.
@@ -0,0 +1,2 @@
+ [.ShellClassInfo]
+ IconResource=C:\Program Files\Google\Drive File Stream\122.0.1.0\GoogleDriveFS.exe,27
@@ -0,0 +1,104 @@
+ # GSD-T Headless Mode — GitHub Actions Example
+ #
+ # This workflow demonstrates using `gsd-t headless` for automated milestone
+ # verification in CI/CD pipelines.
+ #
+ # Prerequisites:
+ # - ANTHROPIC_API_KEY secret configured in your GitHub repository
+ # - GSD-T installed globally (npm install -g @tekyzinc/gsd-t)
+ # - Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
+ #
+ # Usage: Copy this file to .github/workflows/gsd-t-verify.yml in your project
+
+ name: GSD-T Headless Verify
+
+ on:
+   push:
+     branches: [main, develop]
+   pull_request:
+     branches: [main]
+   workflow_dispatch:
+     inputs:
+       command:
+         description: 'GSD-T command to run (default: verify)'
+         required: false
+         default: 'verify'
+
+ jobs:
+   gsd-t-verify:
+     name: GSD-T Quality Gates
+     runs-on: ubuntu-latest
+     timeout-minutes: 30
+
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: '20'
+
+       - name: Install GSD-T
+         run: npm install -g @tekyzinc/gsd-t @anthropic-ai/claude-code
+
+       - name: Query project status (no LLM needed)
+         id: status
+         run: |
+           STATUS=$(gsd-t headless query status)
+           echo "status=$STATUS" >> $GITHUB_OUTPUT
+           echo "GSD-T Project Status: $STATUS"
+
+       - name: Run GSD-T headless verify
+         env:
+           ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+         run: |
+           gsd-t headless ${{ github.event.inputs.command || 'verify' }} \
+             --json \
+             --timeout=1200 \
+             --log
+           # Exit codes:
+           # 0 = success
+           # 1 = verify-fail (tests/gates failed)
+           # 2 = context-budget-exceeded (try splitting milestone)
+           # 3 = error (claude CLI error)
+           # 4 = blocked-needs-human (requires manual review)
+
+       - name: Upload headless log on failure
+         if: failure()
+         uses: actions/upload-artifact@v4
+         with:
+           name: gsd-t-headless-log
+           path: .gsd-t/headless-*.log
+           retention-days: 7
+
+   gsd-t-status-gate:
+     name: GSD-T State Query Gate
+     runs-on: ubuntu-latest
+
+     steps:
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: '20'
+
+       - name: Install GSD-T
+         run: npm install -g @tekyzinc/gsd-t
+
+       - name: Check project status
+         run: gsd-t headless query status
+
+       - name: Check for open tech debt
+         run: |
+           DEBT=$(gsd-t headless query debt)
+           # Print the count so COUNT is populated; fall back to "unknown" if the JSON cannot be parsed.
+           COUNT=$(echo "$DEBT" | node -e "const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf8')); console.log(d.data.count)" || echo "unknown")
+           echo "Tech debt items: $COUNT"
+
+       - name: List active domains
+         run: gsd-t headless query domains
+
+       - name: Check graph index
+         run: gsd-t headless query graph
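A CI wrapper script could translate the exit codes listed in the workflow comments into labels, as in this minimal sketch. The code-to-label table is copied from those comments; `describeExit` and the `"unknown"` fallback are hypothetical additions, not part of the package.

```javascript
// Hypothetical sketch: map `gsd-t headless` exit codes (as listed in the
// workflow comments) to readable labels for CI summaries.
const HEADLESS_EXIT_CODES = Object.freeze({
  0: "success",
  1: "verify-fail",
  2: "context-budget-exceeded",
  3: "error",
  4: "blocked-needs-human",
});

function describeExit(code) {
  // Codes outside the documented table get an assumed "unknown" label.
  return Object.prototype.hasOwnProperty.call(HEADLESS_EXIT_CODES, code)
    ? HEADLESS_EXIT_CODES[code]
    : "unknown";
}
```

A wrapper could, for example, treat `verify-fail` as a failed check while routing `blocked-needs-human` to a manual review step.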