@tekyzinc/gsd-t 2.39.13 → 2.46.11

Files changed (50)

package/CHANGELOG.md +12 -0
package/README.md +19 -10
package/bin/desktop.ini +2 -0
package/bin/global-sync-manager.js +350 -0
package/bin/gsd-t.js +592 -2
package/bin/metrics-collector.js +167 -0
package/bin/metrics-rollup.js +200 -0
package/bin/patch-lifecycle.js +195 -0
package/bin/rule-engine.js +160 -0
package/commands/desktop.ini +2 -0
package/commands/gsd-t-complete-milestone.md +194 -6
package/commands/gsd-t-debug.md +38 -3
package/commands/gsd-t-doc-ripple.md +148 -0
package/commands/gsd-t-execute.md +328 -54
package/commands/gsd-t-help.md +32 -10
package/commands/gsd-t-integrate.md +59 -7
package/commands/gsd-t-metrics.md +143 -0
package/commands/gsd-t-plan.md +49 -2
package/commands/gsd-t-qa.md +26 -5
package/commands/gsd-t-quick.md +36 -3
package/commands/gsd-t-status.md +78 -0
package/commands/gsd-t-test-sync.md +23 -2
package/commands/gsd-t-verify.md +142 -10
package/commands/gsd-t-visualize.md +11 -1
package/commands/gsd-t-wave.md +64 -18
package/docs/GSD-T-README.md +10 -6
package/docs/architecture.md +84 -2
package/docs/ci-examples/desktop.ini +2 -0
package/docs/ci-examples/github-actions.yml +104 -0
package/docs/ci-examples/gitlab-ci.yml +116 -0
package/docs/desktop.ini +2 -0
package/docs/framework-comparison-scorecard.md +160 -0
package/docs/infrastructure.md +87 -1
package/docs/prd-graph-engine.md +2 -2
package/docs/prd-gsd2-hybrid.md +258 -135
package/docs/requirements.md +66 -2
package/examples/.gsd-t/contracts/desktop.ini +2 -0
package/examples/.gsd-t/desktop.ini +2 -0
package/examples/.gsd-t/domains/desktop.ini +2 -0
package/examples/.gsd-t/domains/example-domain/desktop.ini +2 -0
package/examples/desktop.ini +2 -0
package/examples/rules/.gitkeep +0 -0
package/examples/rules/desktop.ini +2 -0
package/package.json +40 -40
package/scripts/desktop.ini +2 -0
package/scripts/gsd-t-dashboard-server.js +19 -2
package/scripts/gsd-t-dashboard.html +63 -0
package/scripts/gsd-t-event-writer.js +1 -0
package/templates/CLAUDE-global.md +92 -10
package/templates/desktop.ini +2 -0

package/commands/gsd-t-verify.md CHANGED

@@ -104,7 +104,8 @@ Work through each dimension sequentially. For each:
    - Confirm specs cover: happy path, error states, edge cases, all modes/flags
    - If specs are missing or incomplete → invoke `gsd-t-test-sync` to create them, then re-run
    - **Missing E2E coverage on new functionality = verification FAIL**
-5. Tests are NOT optional — verification cannot pass without running them and confirming comprehensive coverage
+5. **Functional test quality audit**: Read every Playwright spec. For each `test()` block, verify assertions check **functional behavior** (state changed after action, data loaded, content updated, widget responded) — NOT just element existence (`isVisible`, `toBeAttached`, `toBeEnabled`). A test that would pass on an empty HTML page with the right element IDs is a **shallow test** and counts as a verification FAIL. Flag shallow tests and rewrite them before proceeding.
+6. Tests are NOT optional — verification cannot pass without running them and confirming comprehensive, functional coverage
 ### Team Mode (when agent teams are enabled)
 ```
@@ -199,6 +200,95 @@ Create or update `.gsd-t/verify-report.md`:
 | 2 | ui | Add loading states for async calls | WARN |
 ```
+## Step 5.25: Metrics Quality Budget Check
+Check task-metrics for the current milestone to detect quality budget violations:
+1. Run via Bash:
+   `node -e "const c = require('./bin/metrics-collector.js'); const r = c.readTaskMetrics({milestone: '{milestone-id}'}); if(!r.length){console.log('No metrics data — quality budget check skipped');process.exit(0);} const pass=r.filter(t=>t.fix_cycles===0&&t.pass).length; const rate=pass/r.length; console.log('First-pass rate: '+(rate*100).toFixed(1)+'% ('+pass+'/'+r.length+')'); if(rate<0.6) console.log('⚠️ Quality budget WARNING: first-pass rate below 60%');" 2>/dev/null || true`
+2. Run heuristics check via Bash:
+   `node -e "const m=require('./bin/metrics-rollup.js'); const r=m.readRollups({milestone:'{milestone-id}'}); if(r.length&&r[r.length-1].heuristic_flags.some(f=>f.severity==='HIGH')) console.log('⚠️ HIGH severity heuristic flag detected — review before completing milestone');" 2>/dev/null || true`
+3. Display quality metrics summary inline. Quality budget violation is a **WARNING** (non-blocking) — does not fail verify.
+4. Include quality budget status in the verification report (Step 5):
+   `- Quality Budget: {PASS/WARN} — first-pass rate {N}%{, HIGH heuristic: {name} if any}`
+## Step 5.5: Goal-Backward Verification (Post-Gate Behavior Check)
+This step runs **after all 8 quality gates pass**. It verifies that milestone goals are actually achieved end-to-end — not just structurally present. It catches placeholder implementations that pass all structural gates.
+Refer to `.gsd-t/contracts/goal-backward-contract.md` for the full verification flow, placeholder patterns, and findings report format.
+### 5.5.1 Load Milestone Goals and Requirements
+1. Read `.gsd-t/progress.md` — extract the current milestone name and goals
+2. Read `docs/requirements.md` — identify **critical requirements** (skip trivial/low-priority items)
+### 5.5.2 Trace Requirements to Behavior
+For each critical requirement:
+1. **If `.gsd-t/graph/meta.json` exists (graph available)**:
+   - Trace the requirement → code path → behavior chain using graph queries
+   - Use `getRequirementFor`, `getCallers`, and `getTestsFor` to build the chain
+   - Flag requirements with no traceable code path as CRITICAL findings
+2. **If graph is not available (fallback to grep)**:
+   - Search the codebase for the feature/function implementing each requirement
+   - Trace from entry point → core logic → output/response
+### 5.5.3 Scan for Placeholder Patterns
+For each file identified in the requirement traces above, scan for these placeholder patterns:
+| Pattern | Detection Hint | Severity |
+|---------|---------------|----------|
+| console.log placeholder | `console.log.*TODO\|console.log.*implement` | CRITICAL |
+| TODO/FIXME in implementation | `// TODO\|// FIXME\|# TODO\|# FIXME` in non-test files | CRITICAL |
+| Empty function body | `function \w+\(\) \{\}` or `\(\) => \{\}` with no logic | CRITICAL |
+| Throw not-implemented | `throw new Error.*not implemented\|throw new Error.*TODO` | CRITICAL |
+| Hardcoded return | `return "success"\|return true` with no conditional logic | HIGH |
+| Static UI text | Static `<span>` or text that never updates based on state | HIGH |
+| Pass-through stub | `return input\|return req\|return data` with no transformation | MEDIUM |
+### 5.5.4 Produce Findings Report
+Format findings per the goal-backward-contract.md report format:
+```markdown
+## Goal-Backward Verification Report
+### Status: PASS | FAIL
+### Findings
+| # | Requirement | File:Line | Pattern | Severity | Description |
+|---|-------------|-----------|---------|----------|-------------|
+| 1 | {req-id}    | {path}:{line} | {pattern} | {severity} | {what's wrong} |
+### Summary
+- Requirements checked: {N}
+- Findings: {N} ({critical}, {high}, {medium})
+- Verdict: {PASS if 0 critical/high, FAIL otherwise}
+```
+### 5.5.5 Apply Blocking Rules
+- **CRITICAL or HIGH findings** → Goal-Backward status = **FAIL** — block verification
+  - Append findings to the Critical section of the verification report (Step 5)
+  - Set overall verification status to FAIL
+- **MEDIUM findings** → Goal-Backward status = **WARN** — log but do not block
+  - Append findings to the Warnings section of the verification report (Step 5)
+- **No findings** → Goal-Backward status = **PASS** — add to verification report summary
+Add a `Goal-Backward:` line to the Step 5 verification report summary:
+```
+- Goal-Backward: {PASS/WARN/FAIL} — {N} requirements checked, {N} findings ({critical} critical, {high} high, {medium} medium)
+```
+---
 ## Step 6: Handle Remediation
 If there are CRITICAL findings:
@@ -217,15 +307,9 @@ Update `.gsd-t/progress.md`:
 ### Autonomy Behavior
-**Level 3 (Full Auto)**:
-- VERIFIED → Log "✅ Verify complete — all quality gates passed" and auto-advance to complete-milestone. Do NOT wait for user input.
-- CONDITIONAL PASS → Log warnings, treat as VERIFIED, and auto-advance. Do NOT wait for user input.
-- FAIL → Auto-execute remediation tasks (up to 2 fix attempts). If still failing after 2 attempts, STOP and report to user.
-**Level 1–2**:
-- VERIFIED → Milestone complete, proceed to next milestone or ship
-- CONDITIONAL PASS → User decides if warnings are acceptable
-- FAIL → Return to execute phase for remediation tasks
+**All Levels**:
+- VERIFIED or CONDITIONAL PASS → **Auto-invoke complete-milestone** (see Step 8 below). Completing a verified milestone is mechanical — there is no judgment call that benefits from user review.
+- FAIL → **Level 3**: Auto-execute remediation tasks (up to 2 fix attempts). If still failing after 2 attempts, STOP and report to user. **Level 1–2**: Return to execute phase for remediation tasks.
 ## Document Ripple
@@ -238,6 +322,54 @@ Update `.gsd-t/progress.md`:
 4. **`.gsd-t/techdebt.md`** — If verification found new quality or security issues, add as debt
 5. **`docs/requirements.md`** — If verification revealed unmet requirements, update status
+## Step 8: Auto-Invoke Complete-Milestone
+**This step is MANDATORY and runs at ALL autonomy levels.** Completing a verified milestone is a mechanical operation (archive, tag, bump version, update docs). There is no decision that benefits from user review — the decision was made when verification passed.
+If status is VERIFY-FAILED:
+- Do NOT invoke complete-milestone
+- Report failures and stop
+If status is VERIFIED or VERIFIED-WITH-WARNINGS:
+1. Log: "✅ Verify complete — spawning complete-milestone agent..."
+**OBSERVABILITY LOGGING (MANDATORY):**
+Before spawning — run via Bash:
+`T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+2. Spawn a Task subagent (model: sonnet, mode: bypassPermissions):
+```
+"Execute the complete-milestone phase of the current GSD-T milestone.
+Read and follow the full instructions in commands/gsd-t-complete-milestone.md
+(resolve from ~/.claude/commands/ if not in project).
+Read .gsd-t/progress.md for current milestone and state.
+Read CLAUDE.md for project conventions.
+Read .gsd-t/contracts/ for domain interfaces.
+Complete the phase fully:
+- Follow every step in the command file
+- Update .gsd-t/progress.md status when done
+- Run document ripple as specified
+- Commit your work
+Report back: one-line status summary."
+```
+After subagent returns — run via Bash:
+`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
+Compute tokens and compaction:
+- No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
+- Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
+Append to `.gsd-t/token-log.md`:
+`| {DT_START} | {DT_END} | gsd-t-verify | Step 8 | sonnet | {DURATION}s | auto-complete-milestone | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
+3. Verify subagent result: Read `.gsd-t/progress.md` — confirm status is COMPLETED. If not, report the failure.
+**Why this is mandatory**: Without auto-completion, verified milestones remain in VERIFIED state indefinitely. Requirements stay unmarked, progress.md is stale, and future sessions cannot tell the work was done. This is the root cause of "GSD-T forgot it did this work" — the milestone was built and verified but never formally completed.
+**Why a subagent**: Complete-milestone is a 12-step process (gap analysis, archive, version bump, git tag, doc ripple). Verify is already heavy with 8+ quality gates. Spawning a fresh-context subagent avoids compaction risk — and complete-milestone loads everything it needs from files (progress.md, verify-report.md, contracts).
 $ARGUMENTS
 ## Auto-Clear
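
The placeholder scan and blocking rules added in Steps 5.5.3 to 5.5.5 of `gsd-t-verify.md` can be pictured with a minimal JavaScript sketch. It is not code from the package: `scanForPlaceholders`, `goalBackwardStatus`, and the shape of a finding are hypothetical names chosen for illustration, and only the purely textual CRITICAL patterns from the table are covered. The regexes are adapted from the "Detection Hint" column, where `\|` is a Markdown-escaped alternation.

```javascript
// Illustrative sketch only: function names and the finding shape are assumptions.
// Patterns adapted from the "Detection Hint" column of the Step 5.5.3 table.
const PLACEHOLDER_PATTERNS = [
  { name: "console.log placeholder", severity: "CRITICAL", regex: /console\.log.*(TODO|implement)/ },
  { name: "TODO/FIXME in implementation", severity: "CRITICAL", regex: /(\/\/|#) (TODO|FIXME)/ },
  { name: "Empty function body", severity: "CRITICAL", regex: /function \w+\(\) \{\}|\(\) => \{\}/ },
  { name: "Throw not-implemented", severity: "CRITICAL", regex: /throw new Error.*(not implemented|TODO)/ },
];

// Scan one file's text line by line and return a finding per pattern match.
// The caller is assumed to pass only non-test files, as the table requires.
function scanForPlaceholders(filePath, sourceText) {
  const findings = [];
  sourceText.split("\n").forEach((line, index) => {
    for (const { name, severity, regex } of PLACEHOLDER_PATTERNS) {
      if (regex.test(line)) {
        findings.push({ file: filePath, line: index + 1, pattern: name, severity });
      }
    }
  });
  return findings;
}

// Step 5.5.5 blocking rules: CRITICAL or HIGH fails, MEDIUM warns, none passes.
function goalBackwardStatus(findings) {
  if (findings.some((f) => f.severity === "CRITICAL" || f.severity === "HIGH")) return "FAIL";
  if (findings.some((f) => f.severity === "MEDIUM")) return "WARN";
  return "PASS";
}
```

Calling `goalBackwardStatus(scanForPlaceholders(path, text))` for each traced file would yield the value for the `Goal-Backward:` summary line. The HIGH and MEDIUM patterns in the table (hardcoded returns, static UI text, pass-through stubs) need surrounding context to judge, so this line-based sketch deliberately leaves them out.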

package/commands/gsd-t-visualize.md CHANGED

@@ -39,7 +39,17 @@ Run via Bash:
 node ~/.claude/scripts/gsd-t-event-writer.js --type command_invoked --command gsd-t-visualize --reasoning "Launching dashboard" || true
 ```
-## Step 1.5: Graph Data for Dashboard
+## Step 1.5: Context Metrics for Dashboard
+If `.gsd-t/token-log.md` exists, the dashboard server automatically reads it and provides context utilization metrics for visualization. These metrics are served from the `/api/token-breakdown` endpoint and rendered as:
+1. **Context utilization timeline** — Ctx% over time, ordered by Datetime-start
+2. **Token breakdown by domain** — bar chart grouping Tokens by Domain column (gracefully handles older rows without Domain column — they are grouped as "(untagged)")
+3. **Compaction proximity warnings** — rows where Ctx% >= 70 are highlighted; rows where Ctx% >= 85 are marked critical (🔴)
+If `.gsd-t/token-log.md` does not exist, context metrics panels are hidden (not shown as errors).
+## Step 1.6: Graph Data for Dashboard
 If `.gsd-t/graph/index.json` exists, the dashboard can render entity-relationship visualizations from the graph data. The dashboard server will detect and serve graph data automatically — no additional configuration needed.
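
The context metrics described in Step 1.5 can be illustrated with a short JavaScript sketch. This is not the dashboard server's code; `parseTokenLogRow`, `ctxLevel`, and `tokensByDomain` are hypothetical helpers, and the column positions assume the 12-column `token-log.md` header introduced in this release (older rows stop after `Compacted`).

```javascript
// Illustrative sketch only: helper names are assumptions, not the package's API.
// Column order assumed: Datetime-start, Datetime-end, Command, Step, Model,
// Duration(s), Notes, Tokens, Compacted, Domain, Task, Ctx%.
function parseTokenLogRow(row) {
  // Strip the outer pipes, then split on the inner ones and trim each cell.
  const cells = row.trim().replace(/^\|/, "").replace(/\|$/, "").split("|").map((c) => c.trim());
  const ctxPct = parseFloat(cells[11]); // NaN for older rows or "N/A"
  return {
    tokens: Number(cells[7]) || 0,
    domain: cells[9] || "(untagged)", // older rows have no Domain column
    ctxPct: Number.isNaN(ctxPct) ? null : ctxPct,
  };
}

// Thresholds from Step 1.5: >= 70 is highlighted, >= 85 is marked critical.
function ctxLevel(ctxPct) {
  if (ctxPct === null) return "none";
  if (ctxPct >= 85) return "critical";
  if (ctxPct >= 70) return "warning";
  return "ok";
}

// Sum the Tokens column per domain for the bar chart.
function tokensByDomain(rows) {
  const totals = {};
  for (const row of rows) {
    const { domain, tokens } = parseTokenLogRow(row);
    totals[domain] = (totals[domain] || 0) + tokens;
  }
  return totals;
}
```

A row that lacks the three new columns falls into the `"(untagged)"` bucket and gets no Ctx% level, which matches the graceful handling the step describes.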

package/commands/gsd-t-wave.md CHANGED

@@ -79,8 +79,24 @@ After phase agent returns — run via Bash:
 Compute tokens and compaction:
 - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
 - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
-Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
-`| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | {TOKENS} | {COMPACTED} |`
+Compute context utilization — run via Bash:
+`if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
+Alert on context thresholds (display to user inline):
+- If CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
+- If CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting in plan."`
+**Orchestrator Context Self-Check (MANDATORY):**
+After EVERY phase agent returns, check the wave orchestrator's own context:
+- **If CTX_PCT >= 70:**
+  1. Save checkpoint to `.gsd-t/progress.md` — record which phases are complete, which remain
+  2. Output: `⚠️ Wave orchestrator context at {CTX_PCT}% — approaching limit. Progress saved. Run /clear then /user:gsd-t-wave to continue from the next phase.`
+  3. **STOP the wave loop.** Do NOT spawn the next phase agent. The next session resumes from saved state.
+- **If CTX_PCT < 70:** Continue to next phase.
+This prevents the wave orchestrator from running out of context mid-wave.
+Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
+`| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
 ### Phase Sequence
@@ -114,8 +130,13 @@ Spawn agent → `commands/gsd-t-impact.md`
 #### 5. EXECUTE
 Spawn agent → `commands/gsd-t-execute.md`
-- This is the heaviest phase. The execute agent will handle its own domain agent spawning and QA agent internally.
-- After: Read `progress.md`, verify status = EXECUTED
+- This is the heaviest phase. The execute agent uses **task-level dispatch** (fresh-dispatch-contract.md): one Task subagent per task within each domain, each receiving only scope.md + relevant contracts + single task + graph context + up to 5 prior summaries. The execute agent handles domain task-dispatching and QA internally.
+- **Adaptive replanning**: After each domain completes, the execute agent runs a replan check (per `adaptive-replan-contract.md`). If a completed domain's task summaries reveal new constraints (e.g., deprecated API, wrong column name, incompatible library), the execute agent checks remaining domains' `tasks.md` files for invalidated assumptions and revises them on disk before dispatching the next domain. Maximum 2 replan cycles per execute run — if exceeded, execution pauses for user input. All replan decisions are logged to the Decision Log in `progress.md`. The wave phase summary includes any replan actions taken.
+- **Team/parallel mode**: If the plan defines parallel domains (same wave), the execute agent dispatches each domain teammate with `isolation: "worktree"` (per worktree-isolation-contract.md). Each domain works in an isolated git worktree. After all domains complete, the execute agent runs the Sequential Merge Protocol: merge domain A → test → merge domain B → test. Per-domain rollback if tests fail. Worktrees are cleaned up after all merges complete.
+- After: Read `progress.md`, verify status = EXECUTED. Phase summary must include replan actions if any occurred:
+  ```
+  📋 Phase 5 (EXECUTE): {N}/{N} tasks done | Replan cycles: {N} | Domains revised: {list or "none"}
+  ```
 #### 6. TEST-SYNC
 Spawn agent → `commands/gsd-t-test-sync.md`
@@ -125,15 +146,39 @@ Spawn agent → `commands/gsd-t-test-sync.md`
 Spawn agent → `commands/gsd-t-integrate.md`
 - After: Read `progress.md`, verify status = INTEGRATED
-#### 8. VERIFY
+#### 8. VERIFY + COMPLETE
 Spawn agent → `commands/gsd-t-verify.md`
+- The verify agent runs all 8 standard quality gates **plus** the goal-backward verification step (Step 5.5 in gsd-t-verify.md), which checks that milestone goals are actually achieved end-to-end and scans for placeholder patterns per `.gsd-t/contracts/goal-backward-contract.md`
+- Goal-backward runs after all structural gates pass — CRITICAL or HIGH findings block verification; MEDIUM findings are warnings only
+- **Verify auto-invokes complete-milestone** (Step 8 of gsd-t-verify.md). The verify agent handles both verification AND milestone completion in a single agent context. Do NOT spawn a separate complete agent.
 - After: Read `progress.md`, check status:
-  - VERIFIED → proceed to Complete
-  - VERIFY_FAILED → handle remediation (see Error Recovery)
+  - COMPLETED → milestone done (verify passed and auto-completed)
+  - VERIFIED → verify passed but complete-milestone failed — spawn a standalone complete agent as fallback
+  - VERIFY_FAILED → handle remediation (see Error Recovery) — includes goal-backward failures
+- Phase summary must include the `Goal-Backward:` line from verify-report.md:
+  ```
+  📋 Phase 8 (VERIFY+COMPLETE): {N} gates passed | Goal-Backward: {PASS/WARN/FAIL} — {N} requirements checked, {N} findings
+  ```
+#### 9. DOC-RIPPLE (Automated — after verify+complete)
+After the final phase completes but before wave reports done:
+1. Run threshold check — read `git diff --name-only HEAD~1` and evaluate against doc-ripple-contract.md trigger conditions
+2. If SKIP: log "Doc-ripple: SKIP — {reason}" and proceed
+3. If FIRE: spawn doc-ripple agent:
+⚙ [{model}] gsd-t-doc-ripple → blast radius analysis + parallel updates
+Task subagent (general-purpose, model: sonnet):
+"Execute the doc-ripple workflow per commands/gsd-t-doc-ripple.md.
+Git diff context: {files changed list}
+Command that triggered: wave
+Produce manifest at .gsd-t/doc-ripple-manifest.md.
+Update all affected documents.
+Report: 'Doc-ripple: {N} checked, {N} updated, {N} skipped'"
-#### 9. COMPLETE
-Spawn agent → `commands/gsd-t-complete-milestone.md`
-- After: Read `progress.md`, verify status = COMPLETED
+4. After doc-ripple returns, verify manifest exists and report summary inline
 ### Between Each Phase
@@ -286,16 +331,17 @@ If command files in `~/.claude/commands/` are tampered with, wave agents will ex
 │    check           check       check       check +       check              │
 │                                           gate                              │
 │                                                                              │
-│  ┌──────────┐   ┌────────┐   ┌───────────┐       ┌─────────────────┐       │
-│  │ COMPLETE │ ← │ VERIFY │ ← │ INTEGRATE │ ←──── │ FULL TEST-SYNC  │       │
-│  │ agent 9  │   │agent 8 │   │  agent 7  │       │    agent 6      │       │
-│  └────┬────┘   └────┬────┘   └─────┬─────┘       └────────┬────────┘       │
-│       ↓              ↓              ↓                      ↓               │
-│    archive        status +       status                 status              │
-│    git tag        gate check     check                  check               │
+│  ┌──────────────────┐   ┌───────────┐       ┌─────────────────┐            │
+│  │ VERIFY+COMPLETE  │ ← │ INTEGRATE │ ←──── │ FULL TEST-SYNC  │            │
+│  │    agent 8       │   │  agent 7  │       │    agent 6      │            │
+│  └────────┬─────────┘   └─────┬─────┘       └────────┬────────┘            │
+│           ↓                    ↓                      ↓                     │
+│    gate check →             status                 status                   │
+│    auto-complete            check                  check                    │
+│    archive + tag                                                            │
 │                                                                              │
 │  Each agent: fresh context window, reads state from files, dies when done   │
-│  Orchestrator: ~30KB total, never compacts                                  │
+│  Orchestrator: 8 agents (was 9), ~30KB total, never compacts               │
 └──────────────────────────────────────────────────────────────────────────────┘
 ```
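
The token accounting and the orchestrator self-check in the `gsd-t-wave.md` hunks above are written as shell one-liners. The same arithmetic can be sketched in JavaScript as follows. The function names are hypothetical, and the inputs stand in for the `TOK_START`, `TOK_END`, `TOK_MAX`, and `CLAUDE_CONTEXT_TOKENS_*` values the command reads from the environment.

```javascript
// Illustrative sketch only: function names are assumptions, not part of the package.

// Token delta for one phase. If the counter went down, compaction happened
// mid-phase, so the estimate is the headroom consumed before the reset
// plus the tokens used since.
function tokensUsed(tokStart, tokEnd, tokMax) {
  if (tokEnd >= tokStart) {
    return { tokens: tokEnd - tokStart, compacted: false };
  }
  return { tokens: (tokMax - tokStart) + tokEnd, compacted: true };
}

// Context utilization as a percentage with one decimal place.
// Truncates rather than rounds, like `bc` with scale=1; returns null
// where the shell version would yield "N/A".
function contextPercent(used, max) {
  if (!(max > 0)) return null;
  return Math.floor((used * 1000) / max) / 10;
}

// Orchestrator self-check: stop the wave loop at 70% or above.
function shouldStopWave(ctxPct) {
  return ctxPct !== null && ctxPct >= 70;
}
```

For example, with a 200000-token window, a phase that starts at 180000 and ends at 30000 is treated as compacted and charged 50000 tokens. Returning `null` instead of the string `"N/A"` keeps the threshold comparison numeric, which is a design choice of this sketch rather than something the diff specifies.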

package/docs/GSD-T-README.md CHANGED

@@ -12,6 +12,8 @@ A methodology for reliable, parallelizable development using Claude Code with op
 **Catches downstream effects** — analyzes impact before changes break things.
+**Self-learning rule engine** — declarative rules detect failure patterns from task metrics. Patches progress through 5 lifecycle stages with measurable improvement gates before graduating into permanent methodology.
 ---
 ## Quick Start
@@ -96,26 +98,28 @@ GSD-T reads all state files and tells you exactly where you left off.
 | `/user:gsd-t-milestone` | Define new milestone | Manual |
 | `/user:gsd-t-partition` | Decompose into domains + contracts | In wave |
 | `/user:gsd-t-discuss` | Multi-perspective design exploration | In wave |
-| `/user:gsd-t-plan` | Create atomic task lists per domain | In wave |
+| `/user:gsd-t-plan` | Create atomic task lists per domain (tasks auto-split to fit one context window) | In wave |
 | `/user:gsd-t-impact` | Analyze downstream effects | In wave |
-| `/user:gsd-t-execute` | Run tasks (solo or team) | In wave |
+| `/user:gsd-t-execute` | Run tasks — task-level fresh dispatch, worktree isolation, adaptive replanning | In wave |
 | `/user:gsd-t-test-sync` | Sync tests with code changes | In wave |
 | `/user:gsd-t-qa` | QA agent — test generation, execution, gap reporting | Auto-spawned |
+| `/user:gsd-t-doc-ripple` | Automated document ripple — update downstream docs after code changes | Auto-spawned |
 | `/user:gsd-t-integrate` | Wire domains together | In wave |
-| `/user:gsd-t-verify` | Run quality gates | In wave |
-| `/user:gsd-t-complete-milestone` | Archive + git tag | In wave |
+| `/user:gsd-t-verify` | Run quality gates + goal-backward verification → auto-invokes complete-milestone | In wave |
+| `/user:gsd-t-complete-milestone` | Archive + git tag (auto-invoked by verify, also standalone) | In wave |
 ### Automation & Utilities
 | Command | Purpose | Auto |
 |---------|---------|------|
 | `/user:gsd-t-wave` | Full cycle, auto-advances all phases | Manual |
-| `/user:gsd-t-status` | Cross-domain progress view | Manual |
+| `/user:gsd-t-status` | Cross-domain progress view with token breakdown, global ELO and cross-project rankings | Manual |
 | `/user:gsd-t-resume` | Restore context, continue | Manual |
 | `/user:gsd-t-quick` | Fast task with GSD-T guarantees | Manual |
 | `/user:gsd-t-reflect` | Generate retrospective from event stream, propose memory updates | Manual |
 | `/user:gsd-t-visualize` | Launch browser dashboard — SSE server + React Flow agent visualization | Manual |
 | `/user:gsd-t-debug` | Systematic debugging with state | Manual |
+| `/user:gsd-t-metrics` | View task telemetry, process ELO, signal distribution, domain health, and cross-project comparison (`--cross-project`) | Manual |
 | `/user:gsd-t-health` | Validate .gsd-t/ structure, optionally repair | Manual |
 | `/user:gsd-t-pause` | Save exact position for reliable resume | Manual |
 | `/user:gsd-t-log` | Sync progress Decision Log with recent git activity | Manual |
@@ -154,7 +158,7 @@ GSD-T reads all state files and tells you exactly where you left off.
 │                                              │    │  task + at verify)│     │
 │                                              │    └───────────────────┘     │
 │                                              ▼                              │
-│  complete-milestone ◄── verify ◄── integrate ◄──────────────────────┘      │
+│  verify+complete ◄──────────── integrate ◄──────────────────────┘          │
 │                                                                             │
 └─────────────────────────────────────────────────────────────────────────────┘
 ```

package/docs/architecture.md CHANGED

@@ -1,6 +1,6 @@
 # Architecture — GSD-T Framework (@tekyzinc/gsd-t)
-## Last Updated: 2026-03-19 (Scan #10, Post-M20/M21)
+## Last Updated: 2026-03-22 (M23 — Headless Mode)
 ## System Overview
@@ -16,7 +16,7 @@ The framework has no runtime — it is consumed entirely by Claude Code's slash
 - **Purpose**: Install, update, diagnose, and manage GSD-T across projects
 - **Location**: `bin/gsd-t.js` (1,798 lines, 90+ functions, all ≤ 30 lines)
 - **Dependencies**: Node.js built-ins only (fs, path, os, child_process, https, crypto)
-- **Subcommands**: install, update, status, doctor, init, uninstall, update-all, register, changelog, graph (index/status/query)
+- **Subcommands**: install, update, status, doctor, init, uninstall, update-all, register, changelog, graph (index/status/query), headless (exec/query)
 - **Organization**: Configuration → Guard section → Helpers → Heartbeat → Commands → Install/Update → Init → Status → Uninstall → Update-All → Doctor → Register → Update Check → Help → Main dispatch
 - **All functions ≤ 30 lines** (M6 refactoring). Largest: `doRegister()` at 30 lines, `summarize()` at 30 lines.
@@ -66,6 +66,18 @@ The framework has no runtime — it is consumed entirely by Claude Code's slash
 - **`scripts/gsd-t-dashboard.html`** (194 lines): React 17 + React Flow v11.11.4 + Dagre via CDN (no build step, no npm deps). Dark theme (`#0d1117`). Renders agent hierarchy as directed graph from `parent_agent_id` relationships. Live event feed (max 200, outcome color-coded). Auto-reconnects on SSE disconnect. Port configurable via `?port=` URL param.
 - **`commands/gsd-t-visualize.md`** (104 lines, 48th command): Starts server via `--detach`, polls `/ping` up to 5s, opens browser cross-platform (win32/darwin/linux). Accepts `stop` argument to shut down server. Step 0 self-spawn with OBSERVABILITY LOGGING.
+### Headless Mode (M23 — complete)
+- **doHeadless(args)**: Dispatch function for the `headless` CLI subcommand.
+- **doHeadlessExec(command, cmdArgs, flags)**: Wraps `claude -p "/user:gsd-t-{command}"` via `execFileSync`. Verifies claude CLI availability, enforces timeout, writes log file if `--log` requested. Returns structured JSON if `--json` flag set.
+- **parseHeadlessFlags(args)**: Extracts `--json`, `--timeout=N`, `--log` from raw args. Returns `{ flags, positional }`.
+- **buildHeadlessCmd(command, cmdArgs)**: Builds the `/user:gsd-t-{command}` prompt string.
+- **mapHeadlessExitCode(processExitCode, output)**: Maps process exit code + output text patterns to GSD-T exit codes (0–4).
+- **headlessLogPath(projectDir, timestamp)**: Generates `.gsd-t/headless-{timestamp}.log` path.
+- **doHeadlessQuery(type)**: Dispatches to one of 7 query functions. All pure Node.js file reads, no LLM calls, <100ms.
+- **Query functions** (7): `queryStatus`, `queryDomains`, `queryContracts`, `queryDebt`, `queryContext`, `queryBacklog`, `queryGraph` — each reads corresponding `.gsd-t/` file and returns typed JSON result.
+- **Exit codes**: 0=success, 1=verify-fail, 2=context-budget-exceeded, 3=error, 4=blocked-needs-human
+- **CI/CD examples**: `docs/ci-examples/github-actions.yml` (GitHub Actions), `docs/ci-examples/gitlab-ci.yml` (GitLab CI)
 ### Graph Engine (M20 — complete)
 - **`bin/graph-store.js`** (147 lines): File-based graph storage in `.gsd-t/graph/`. 8 JSON files (index, calls, imports, contracts, requirements, tests, surfaces, meta). Read/write operations, MD5 file hashing for incremental indexing, staleness detection. Zero external deps. Note: no symlink protection (TD-099).
 - **`bin/graph-parsers.js`** (327 lines): Language-specific entity parsers. JS/TS: function declarations, arrow functions, classes, methods, imports (ES/CJS), exports. Python: def/class/import. Regex-based (no Tree-sitter). Returns `{ entities, imports, calls }`.
@@ -271,6 +283,76 @@ QA runs inline or as Task subagent depending on phase (M10 refactor). Removed fr
 | 2026-02-18 | gsd-t-tools.js as state utility CLI | Reduces token-heavy markdown parsing; compact JSON responses save ~50K tokens/wave | Parsing progress.md inline (original) |
 | 2026-02-18 | continue-here files for pause/resume | More precise than progress.md; captures exact task+next-action, not just phase | progress.md alone (less precise) |
+### GSD 2 Tier 1 — Execution Quality (M22 — complete v2.40.10)
+Five interlocking capabilities eliminate context rot, enable safe parallel execution, and verify behavior rather than structure alone.
+**Task-Level Fresh Dispatch**
+Execute dispatches one subagent per TASK (not per domain). Each task agent gets a fresh context window containing only: domain scope.md, relevant contracts, the single current task, graph context for touched files, and prior task summaries (10-20 lines each). Context utilization per task: ~10-20% (down from 60-75% cumulative per domain). Compaction never triggers. The domain dispatcher (lightweight orchestrator) sequences tasks and passes summaries — it never accumulates full task context.
+```
+Execute orchestrator (summaries only — ~4-8% ctx)
+  └── Domain-A task-dispatcher
+       ├── Task 1 subagent (fresh, 10-20% ctx) → summary → dies
+       ├── Task 2 subagent (fresh + task 1 summary) → summary → dies
+       └── Task N subagent (fresh + prior summaries) → summary → dies
+```
+**Plan command constraint** (added M22): Every task must fit in one context window. If estimated scope exceeds 70% context, plan splits the task automatically.
+**Worktree Isolation**
+Parallel domain agents work in isolated git worktrees via Agent tool's `isolation: "worktree"` parameter. No shared filesystem — domains cannot step on each other's files. Merges are sequential and atomic:
+```
+Dispatch N domains (isolation: "worktree") → parallel execution
+  └── Domain A completes → merge A → run integration tests
+  └── Domain B completes → merge B → run integration tests
+  └── Conflict or test failure → rollback that domain, others unaffected
+```
+Rollback granularity is per-domain (not per-commit). Worktrees are cleaned up after all merges complete.
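A rough sketch of the sequential merge loop with per-domain rollback, under the assumption that merging, testing, and rollback are supplied as callbacks. The names `mergeSequentially`, `merge`, `runIntegrationTests`, and `rollback` are hypothetical and do not come from the package.

```javascript
// Illustrative sketch only: all function names here are hypothetical.
// Domains are merged one at a time; a failure undoes only that domain.
function mergeSequentially(domains, { merge, runIntegrationTests, rollback }) {
  const results = [];
  for (const domain of domains) {
    let ok;
    try {
      merge(domain);              // assumed to throw on a merge conflict
      ok = runIntegrationTests(); // assumed to return false on test failure
    } catch (err) {
      ok = false;
    }
    if (!ok) {
      rollback(domain);           // other domains are left untouched
    }
    results.push({ domain, merged: ok });
  }
  return results;
}
```

Because each domain is merged and tested before the next one starts, a failure can always be attributed to a single domain.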
+**Goal-Backward Verification**
+After all structural quality gates pass (tests, contracts, file existence), a goal-backward pass verifies behavior. It reads the milestone goals, traces each requirement to code, and checks for placeholders:
+- `console.log("TODO")` / `console.log("implement X")`
+- Hardcoded return values (`return "Synced"`, `return 200` on a path that should compute)
+- `// TODO`, `// FIXME`, `// PLACEHOLDER` comments in critical paths
+- UI components rendering static strings where dynamic data is required
+Applied in: `verify`, `complete-milestone`, `wave` (verification phase).
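The purely textual placeholder checks above could be approximated with a line scan like the one below. It is a sketch under stated assumptions: `findPlaceholders` and the pattern list are hypothetical, and it covers only the `console.log` and marker-comment cases. Hardcoded return values and static UI strings need semantic review that a regex cannot provide.

```javascript
// Illustrative sketch only: the pattern list and findPlaceholders are
// hypothetical, not the checks the package actually runs.
const PLACEHOLDER_PATTERNS = [
  { name: 'console.log placeholder', regex: /console\.log\(\s*["'`](TODO|implement\b)/i },
  { name: 'marker comment', regex: /\/\/\s*(TODO|FIXME|PLACEHOLDER)\b/ },
];

// Returns one entry per (line, pattern) match, with 1-based line numbers.
function findPlaceholders(source) {
  const hits = [];
  source.split('\n').forEach((line, index) => {
    for (const { name, regex } of PLACEHOLDER_PATTERNS) {
      if (regex.test(line)) {
        hits.push({ line: index + 1, name });
      }
    }
  });
  return hits;
}
```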
+**Adaptive Replanning**
+After each domain completes in execute, the orchestrator reads the domain's result summary and evaluates whether the remaining domain plans are still valid. If execution revealed new constraints (deprecated API, schema mismatch, missing dependency, incompatible library), affected domain `tasks.md` files are rewritten on disk before the next domain is dispatched.
+Guard: max 2 replanning cycles per execute run. After that, pause for user input (prevents new-constraint → replan → new-constraint loops).
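The guard can be sketched as a counter around the domain loop. This is an assumption-laden illustration: `executeWithReplanning`, `runDomain`, `replan`, and the `newConstraints` field are hypothetical names, and the returned status strings are invented for the example.

```javascript
// Illustrative sketch only: function names, the newConstraints field and
// the status strings are hypothetical.
const MAX_REPLANS = 2;

function executeWithReplanning(domains, { runDomain, replan }) {
  let replans = 0;
  for (let i = 0; i < domains.length; i++) {
    const summary = runDomain(domains[i]);
    const remaining = domains.slice(i + 1);
    // Nothing to do if no new constraints surfaced or no domains are left.
    if (summary.newConstraints.length === 0 || remaining.length === 0) continue;
    // Budget exhausted: stop and hand control back to the user.
    if (replans >= MAX_REPLANS) {
      return { status: 'paused-for-user', completed: i + 1, replans };
    }
    replan(remaining, summary.newConstraints); // rewrite remaining plans
    replans++;
  }
  return { status: 'done', completed: domains.length, replans };
}
```

Counting replans per run, rather than per domain, is what bounds a chain of constraints that each trigger another replan.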
+**Context Observability**
+Extended token-log.md format (M22) includes `Domain`, `Task`, and `Ctx%` columns:
+```
+| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |
+```
+Alert thresholds (inline display):
+- `Ctx% >= 70%` → warning: task approaching compaction, consider splitting
+- `Ctx% >= 85%` → critical: compaction likely, task MUST be split
+`gsd-t-status` displays token breakdown by domain/task/phase. `gsd-t-visualize` consumes the same data for dashboard rendering.
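As an illustration of how the extended log row and the two thresholds fit together, here is a small sketch. `parseRow` and `ctxAlert` are hypothetical helpers, the sample row in the usage note is made up, and the parser assumes no cell contains a literal `|`.

```javascript
// Illustrative sketch only: parseRow and ctxAlert are hypothetical helpers.
const COLUMNS = [
  'Datetime-start', 'Datetime-end', 'Command', 'Step', 'Model', 'Duration(s)',
  'Notes', 'Tokens', 'Compacted', 'Domain', 'Task', 'Ctx%',
];

// Splits one markdown table row into an object keyed by column name.
function parseRow(row) {
  const cells = row.split('|').slice(1, -1).map((cell) => cell.trim());
  return Object.fromEntries(COLUMNS.map((name, i) => [name, cells[i]]));
}

// Maps a context percentage to the alert levels described above.
function ctxAlert(ctxPct) {
  if (ctxPct >= 85) return 'critical'; // compaction likely, task must be split
  if (ctxPct >= 70) return 'warning';  // approaching compaction
  return 'ok';
}
```

For a made-up row whose last cell is `72%`, `ctxAlert(parseFloat(parseRow(row)['Ctx%']))` yields `'warning'`, since `parseFloat` ignores the trailing percent sign.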
+## Planned Architecture Changes (M23-M24)
+**M23: Headless Mode**
+- New `gsd-t headless` CLI subcommand wrapping `claude -p` for unattended execution.
+- New `gsd-t headless query` for instant JSON state access (no LLM).
+**M24: Docker**
+- Dockerfile + docker-compose for containerized enterprise execution.
 ## Known Architecture Concerns
 1. **CLI single-file size**: bin/gsd-t.js at 1,438 lines exceeds the 200-line convention, but splitting adds complexity for questionable benefit given zero-dependency constraint. Accepted deviation.

package/docs/ci-examples/desktop.ini ADDED

@@ -0,0 +1,2 @@
+[.ShellClassInfo]
+IconResource=C:\Program Files\Google\Drive File Stream\122.0.1.0\GoogleDriveFS.exe,27

package/docs/ci-examples/github-actions.yml ADDED

@@ -0,0 +1,104 @@
+# GSD-T Headless Mode — GitHub Actions Example
+#
+# This workflow demonstrates using `gsd-t headless` for automated milestone
+# verification in CI/CD pipelines.
+#
+# Prerequisites:
+#   - ANTHROPIC_API_KEY secret configured in your GitHub repository
+#   - GSD-T installed globally (npm install -g @tekyzinc/gsd-t)
+#   - Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
+#
+# Usage: Copy this file to .github/workflows/gsd-t-verify.yml in your project
+name: GSD-T Headless Verify
+on:
+  push:
+    branches: [main, develop]
+  pull_request:
+    branches: [main]
+  workflow_dispatch:
+    inputs:
+      command:
+        description: 'GSD-T command to run (default: verify)'
+        required: false
+        default: 'verify'
+jobs:
+  gsd-t-verify:
+    name: GSD-T Quality Gates
+    runs-on: ubuntu-latest
+    timeout-minutes: 30
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - name: Install GSD-T
+        run: npm install -g @tekyzinc/gsd-t @anthropic-ai/claude-code
+      - name: Query project status (no LLM needed)
+        id: status
+        run: |
+          STATUS=$(gsd-t headless query status)
+          echo "status=$STATUS" >> $GITHUB_OUTPUT
+          echo "GSD-T Project Status: $STATUS"
+      - name: Run GSD-T headless verify
+        env:
+          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+        run: |
+          gsd-t headless ${{ github.event.inputs.command || 'verify' }} \
+            --json \
+            --timeout=1200 \
+            --log
+        # Exit codes:
+        #   0 = success
+        #   1 = verify-fail (tests/gates failed)
+        #   2 = context-budget-exceeded (try splitting milestone)
+        #   3 = error (claude CLI error)
+        #   4 = blocked-needs-human (requires manual review)
+      - name: Upload headless log on failure
+        if: failure()
+        uses: actions/upload-artifact@v4
+        with:
+          name: gsd-t-headless-log
+          path: .gsd-t/headless-*.log
+          retention-days: 7
+  gsd-t-status-gate:
+    name: GSD-T State Query Gate
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+      - name: Install GSD-T
+        run: npm install -g @tekyzinc/gsd-t
+      - name: Check project status
+        run: gsd-t headless query status
+      - name: Check for open tech debt
+        run: |
+          DEBT=$(gsd-t headless query debt)
+          # Report the open-item count without failing the job
+          COUNT=$(echo "$DEBT" | node -e "const d=JSON.parse(require('fs').readFileSync('/dev/stdin','utf8')); console.log(d.data.count)")
+          echo "Tech debt items: $COUNT"
+      - name: List active domains
+        run: gsd-t headless query domains
+      - name: Check graph index
+        run: gsd-t headless query graph