@tekyzinc/gsd-t 2.50.12 → 2.53.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (99)
  1. package/CHANGELOG.md +24 -0
  2. package/README.md +379 -372
  3. package/bin/component-registry.js +250 -0
  4. package/bin/graph-cgc.js +510 -510
  5. package/bin/graph-indexer.js +147 -147
  6. package/bin/graph-overlay.js +195 -195
  7. package/bin/graph-parsers.js +327 -327
  8. package/bin/graph-query.js +453 -452
  9. package/bin/graph-store.js +154 -154
  10. package/bin/qa-calibrator.js +194 -0
  11. package/bin/scan-data-collector.js +153 -153
  12. package/bin/scan-diagrams-generators.js +187 -187
  13. package/bin/scan-diagrams.js +79 -79
  14. package/bin/scan-renderer.js +92 -92
  15. package/bin/scan-report-sections.js +121 -121
  16. package/bin/scan-report.js +184 -184
  17. package/bin/scan-schema-parsers.js +199 -199
  18. package/bin/scan-schema.js +103 -103
  19. package/bin/token-budget.js +246 -0
  20. package/commands/Claude-md.md +10 -10
  21. package/commands/branch.md +15 -15
  22. package/commands/checkin.md +45 -45
  23. package/commands/global-change.md +209 -209
  24. package/commands/gsd-t-audit.md +199 -0
  25. package/commands/gsd-t-backlog-add.md +94 -94
  26. package/commands/gsd-t-backlog-edit.md +111 -111
  27. package/commands/gsd-t-backlog-list.md +63 -63
  28. package/commands/gsd-t-backlog-move.md +94 -94
  29. package/commands/gsd-t-backlog-promote.md +123 -123
  30. package/commands/gsd-t-backlog-remove.md +86 -86
  31. package/commands/gsd-t-backlog-settings.md +158 -158
  32. package/commands/gsd-t-complete-milestone.md +528 -515
  33. package/commands/gsd-t-debug.md +506 -399
  34. package/commands/gsd-t-discuss.md +174 -174
  35. package/commands/gsd-t-execute.md +758 -634
  36. package/commands/gsd-t-feature.md +276 -276
  37. package/commands/gsd-t-health.md +142 -142
  38. package/commands/gsd-t-help.md +465 -457
  39. package/commands/gsd-t-impact.md +302 -302
  40. package/commands/gsd-t-init.md +320 -280
  41. package/commands/gsd-t-integrate.md +365 -249
  42. package/commands/gsd-t-milestone.md +87 -87
  43. package/commands/gsd-t-partition.md +442 -361
  44. package/commands/gsd-t-pause.md +82 -82
  45. package/commands/gsd-t-plan.md +345 -344
  46. package/commands/gsd-t-populate.md +111 -111
  47. package/commands/gsd-t-prd.md +326 -326
  48. package/commands/gsd-t-project.md +211 -211
  49. package/commands/gsd-t-promote-debt.md +123 -123
  50. package/commands/gsd-t-prompt.md +137 -137
  51. package/commands/gsd-t-qa.md +266 -266
  52. package/commands/gsd-t-quick.md +357 -234
  53. package/commands/gsd-t-reflect.md +134 -134
  54. package/commands/gsd-t-resume.md +72 -72
  55. package/commands/gsd-t-scan.md +615 -615
  56. package/commands/gsd-t-setup.md +76 -0
  57. package/commands/gsd-t-status.md +192 -166
  58. package/commands/gsd-t-test-sync.md +381 -381
  59. package/commands/gsd-t-triage-and-merge.md +171 -171
  60. package/commands/gsd-t-verify.md +382 -382
  61. package/commands/gsd-t-visualize.md +118 -118
  62. package/commands/gsd-t-wave.md +401 -378
  63. package/docs/GSD-T-README.md +425 -422
  64. package/docs/architecture.md +385 -369
  65. package/docs/harness-design-analysis.md +371 -0
  66. package/docs/infrastructure.md +205 -205
  67. package/docs/prd-graph-engine.md +398 -398
  68. package/docs/prd-gsd2-hybrid.md +559 -559
  69. package/docs/prd-harness-evolution.md +583 -0
  70. package/docs/requirements.md +14 -0
  71. package/docs/workflows.md +226 -226
  72. package/examples/.gsd-t/domains/example-domain/scope.md +13 -13
  73. package/package.json +40 -40
  74. package/scripts/gsd-t-auto-route.js +39 -39
  75. package/scripts/gsd-t-dashboard-mockup.html +1143 -1143
  76. package/scripts/gsd-t-dashboard-server.js +171 -171
  77. package/scripts/gsd-t-dashboard.html +262 -262
  78. package/scripts/gsd-t-event-writer.js +128 -128
  79. package/scripts/gsd-t-statusline.js +94 -94
  80. package/scripts/gsd-t-tools.js +175 -175
  81. package/templates/CLAUDE-global.md +639 -614
  82. package/templates/CLAUDE-project.md +24 -0
  83. package/templates/backlog-settings.md +18 -18
  84. package/templates/backlog.md +1 -1
  85. package/templates/progress.md +40 -40
  86. package/templates/shared-services-contract.md +60 -60
  87. package/templates/stacks/desktop.ini +2 -2
  88. package/bin/desktop.ini +0 -2
  89. package/commands/desktop.ini +0 -2
  90. package/docs/ci-examples/desktop.ini +0 -2
  91. package/docs/desktop.ini +0 -2
  92. package/examples/.gsd-t/contracts/desktop.ini +0 -2
  93. package/examples/.gsd-t/desktop.ini +0 -2
  94. package/examples/.gsd-t/domains/desktop.ini +0 -2
  95. package/examples/.gsd-t/domains/example-domain/desktop.ini +0 -2
  96. package/examples/desktop.ini +0 -2
  97. package/examples/rules/desktop.ini +0 -2
  98. package/scripts/desktop.ini +0 -2
  99. package/templates/desktop.ini +0 -2
@@ -1,381 +1,381 @@
1
- # GSD-T: Test Sync — Keep Tests Aligned with Code
2
-
3
- You are maintaining test coverage as code changes. Your job is to identify stale tests, coverage gaps, and dead tests, then generate tasks to address them.
4
-
5
- This command is:
6
- - **Auto-invoked** during execute phase (after each task) and verify phase
7
- - **Standalone** when user wants to audit test health
8
-
9
- ## Step 1: Load Context
10
-
11
- Read:
12
- 1. `CLAUDE.md` — testing conventions, test locations
13
- 2. `.gsd-t/progress.md` — what just changed
14
- 3. `.gsd-t/test-coverage.md` — previous coverage state (if exists)
15
- 4. `.gsd-t/domains/{current}/tasks.md` — recent completed tasks
16
-
17
- Identify:
18
- - **Unit/integration test framework** (pytest, jest, vitest, etc.)
19
- - **E2E test framework** (Playwright, Cypress, Puppeteer, etc.) — check for `playwright.config.*`, `cypress.config.*`, `playwright/`, `cypress/`, `e2e/`, or E2E-related dependencies in package.json/requirements.txt
20
- - Test directory structure
21
- - Naming conventions
22
- - Test run commands (from package.json scripts, Makefile, or CI config)
23
-
24
- ## Step 1.5: Graph-Enhanced Test Discovery
25
-
26
- If `.gsd-t/graph/meta.json` exists (graph index is available):
27
- 1. Query `getTestsFor` each changed entity to find stale or missing tests more precisely than filesystem search
28
- 2. Query `getTransitiveCallers` for changed functions to find indirectly affected tests that may need updating
29
- 3. Feed these findings into the coverage map (Step 3) and issue detection (Step 4)
30
-
31
- If graph is not available, skip this step.
32
-
33
- ## Step 2: Contract Coverage Audit
34
-
35
- Perform inline contract testing and gap analysis:
36
-
37
- 1. Read all contracts in `.gsd-t/contracts/` — identify the interface each one defines
38
- 2. For each contract, check whether a test file exists that validates it
39
- 3. Run the full test suite: `npm test` (or project equivalent)
40
- 4. Identify gaps: contracts with no tests, stale tests referencing removed APIs, uncovered code paths
41
- 5. Report: coverage gaps, stale tests, and recommended test tasks
42
-
43
- Test-sync cannot complete if critical contract gaps remain unaddressed.
44
-
45
- ## Step 3: Map Code to Tests
46
-
47
- For each file changed in recent tasks:
48
-
49
- ### A) Find Existing Tests
50
- ```bash
51
- # Common patterns
52
- find tests/ -name "*{module_name}*"
53
- find __tests__/ -name "*{module_name}*"
54
- find . -name "*.test.*" | xargs grep -l "{function_name}"
55
- find . -name "*.spec.*" | xargs grep -l "{class_name}"
56
- ```
57
-
58
- ### B) Build Coverage Map
59
- ```
60
- | Source File | Test File(s) | Coverage Status |
61
- |-------------|--------------|-----------------|
62
- | src/auth/login.py | tests/test_login.py | COVERED |
63
- | src/auth/roles.py | (none) | GAP |
64
- | src/api/users.py | tests/test_users.py | PARTIAL |
65
- ```
66
-
67
- ## Step 4: Detect Test Issues
68
-
69
- ### A) Stale Tests
70
- Tests that reference old behavior:
71
- - Function signatures that changed
72
- - Removed functions still being tested
73
- - Old API shapes in assertions
74
- - Mocked data that no longer matches schema
75
-
76
- Check:
77
- ```bash
78
- # Find tests importing changed modules
79
- grep -r "from {changed_module}" tests/
80
- # Check if test assertions match new behavior
81
- ```
82
-
83
- ### B) Coverage Gaps
84
- New or changed code without tests:
85
- - New functions with no test
86
- - New branches with no coverage
87
- - Changed behavior with no updated assertions
88
- - New error cases with no error tests
89
-
90
- ### C) Dead Tests
91
- Tests for deleted functionality:
92
- - Tests importing deleted modules
93
- - Tests for removed features
94
- - Skipped tests that should be removed
95
-
96
- ### D) Flaky Tests (if test history available)
97
- Tests that sometimes fail:
98
- - Check recent CI runs
99
- - Note any intermittent failures
100
-
101
- ## Step 5: Run Affected Tests
102
-
103
- ### A) Unit/Integration Tests
104
- Execute tests that cover changed code:
105
-
106
- ```bash
107
- # Example for pytest
108
- pytest tests/test_{module}.py -v
109
-
110
- # Example for jest
111
- npm test -- --testPathPattern="{module}"
112
- ```
113
-
114
- ### B) E2E Tests (MANDATORY when config exists)
115
- If `playwright.config.*` or `cypress.config.*` exists, you MUST run E2E tests — skipping is never acceptable:
116
-
117
- ```bash
118
- # Playwright
119
- npx playwright test {affected-spec}.spec.ts
120
-
121
- # Cypress
122
- npx cypress run --spec "cypress/e2e/{affected-spec}.cy.ts"
123
- ```
124
-
125
- Determine which E2E specs are affected:
126
- - Changed a UI component or page? → Run specs that test that page/flow
127
- - Changed an API endpoint? → Run specs that exercise that endpoint
128
- - Changed auth/session logic? → Run all auth-related E2E specs
129
- - Changed database schema? → Run specs that depend on that data
130
- - Not sure what's affected? → Run the full E2E suite
131
-
132
- ### C) Create and Update Playwright E2E Tests (MANDATORY when UI/routes/flows/modes changed)
133
-
134
- If Playwright is configured (`playwright.config.*` or Playwright in dependencies):
135
-
136
- **For new features, pages, modes, or flows — CREATE comprehensive specs:**
137
- - Happy path for every new user flow
138
- - All feature modes/flags (e.g., `--component` mode gets its own test suite, not just default mode)
139
- - Form validation: valid input, invalid input, empty fields, boundary values
140
- - Error states: network failures, API errors, permission denied, timeout
141
- - Empty states: no data, first-time user, cleared data
142
- - Loading states: skeleton screens, spinners, progressive loading
143
- - Edge cases: rapid clicking, double submission, back/forward navigation, browser refresh mid-flow
144
- - Responsive: test at mobile and desktop breakpoints if layout changes
145
-
146
- **For changed features — UPDATE existing specs AND add missing coverage:**
147
- - Changed UI elements (selectors, text, layout) → update locators and assertions
148
- - Changed form fields or validation → update form fill steps and error assertions
149
- - Removed features → remove or update affected E2E specs
150
- - Review existing specs for missing edge cases and add them
151
-
152
- **This is NOT optional.** Every new code path that a user can reach must have a Playwright spec. "We'll add tests later" is never acceptable.
153
-
154
- **FUNCTIONAL TESTS — NOT LAYOUT TESTS (MANDATORY):**
155
- E2E specs that only check element existence (`isVisible`, `toBeAttached`, `toBeEnabled`) are
156
- layout tests. Layout tests pass even when every feature is broken — they are worthless for QA.
157
-
158
- Every Playwright assertion MUST verify **functional behavior** — that an action produced the
159
- correct outcome:
160
- - **Tab/navigation**: Click → assert the NEW content loaded (unique text, data, or elements
161
- that only appear on the destination view). Never just assert the tab element exists.
162
- - **Forms**: Fill → submit → assert success feedback AND data persisted (API call observed
163
- via `page.waitForResponse`, or list/table updated with new entry).
164
- - **Interactive widgets** (terminals, editors, code panels): Open → interact → assert the
165
- widget responded (keystroke produced output, content was saved, command executed).
166
- - **Connections** (WebSocket, SSE, polling): Assert status transitions ("Connecting" →
167
- "Connected") and verify data flows through the connection.
168
- - **State toggles** (dark mode, expand/collapse, enable/disable): Assert the EFFECT of the
169
- toggle, not just that the toggle control exists.
170
- - **Error handling**: Trigger error → assert error content → assert recovery path works.
171
-
172
- **Rule: If a test would pass on an empty HTML page with the correct element IDs and no
173
- JavaScript, it is not a functional test. Rewrite it.**
174
-
175
- ### D) Capture Results
176
- For all test types:
177
- - PASS: Test still valid
178
- - FAIL: Test needs update or code has bug
179
- - ERROR: Test broken (import error, etc.)
180
-
181
- ## Step 6: Produce Test Coverage Report
182
-
183
- Create/update `.gsd-t/test-coverage.md`:
184
-
185
- ```markdown
186
- # Test Coverage Report — {date}
187
-
188
- ## Summary
189
- - Source files analyzed: {N}
190
- - Unit/integration test files: {N}
191
- - E2E test specs: {N}
192
- - Coverage gaps: {N}
193
- - Stale tests: {N}
194
- - Dead tests: {N}
195
- - Unit tests passing: {N}/{total}
196
- - E2E tests passing: {N}/{total}
197
-
198
- ## Coverage Status
199
-
200
- ### ✅ Well Covered
201
- | Source | Test | Last Verified |
202
- |--------|------|---------------|
203
- | {file} | {test} | {date} |
204
-
205
- ### ⚠️ Partial Coverage
206
- | Source | Test | Gap |
207
- |--------|------|-----|
208
- | {file} | {test} | {missing: error cases, edge cases, etc.} |
209
-
210
- ### ❌ No Coverage
211
- | Source | Risk Level | Reason |
212
- |--------|------------|--------|
213
- | {file} | {HIGH/MED/LOW} | {new file, complex logic, etc.} |
214
-
215
- ---
216
-
217
- ## Issues Found
218
-
219
- ### Stale Tests
220
- | Test | Issue | Action |
221
- |------|-------|--------|
222
- | {test} | {function signature changed} | Update assertions |
223
- | {test} | {mock data outdated} | Update mock |
224
-
225
- ### Dead Tests
226
- | Test | Reason | Action |
227
- |------|--------|--------|
228
- | {test} | {tests deleted feature} | Remove |
229
- | {test} | {imports removed module} | Remove |
230
-
231
- ### Failing Tests
232
- | Test | Error | Likely Cause |
233
- |------|-------|--------------|
234
- | {test} | {error message} | {code bug or test needs update} |
235
-
236
- ---
237
-
238
- ## Test Health Metrics
239
-
240
- - Test-to-code ratio: {N tests / N source files}
241
- - Average assertions per test: {N}
242
- - Critical paths covered: {list}
243
- - Critical paths uncovered: {list}
244
-
245
- ---
246
-
247
- ## Generated Tasks
248
-
249
- ### High Priority (blocking)
250
- - [ ] TEST-001: Fix failing test {test} — {reason}
251
- - [ ] TEST-002: Update stale test {test} — {what changed}
252
-
253
- ### Medium Priority (should do)
254
- - [ ] TEST-010: Add tests for {file} — {N} functions uncovered
255
- - [ ] TEST-011: Add error case tests for {function}
256
-
257
- ### Low Priority (nice to have)
258
- - [ ] TEST-020: Remove dead test {test}
259
- - [ ] TEST-021: Add edge case tests for {function}
260
-
261
- ---
262
-
263
- ## Recommendations
264
-
265
- {Based on findings, what should be prioritized}
266
- ```
267
-
268
- ## Step 7: Generate Test Tasks
269
-
270
- If issues found, add to current domain's tasks:
271
-
272
- ```markdown
273
- ## Auto-Generated Test Tasks
274
-
275
- ### From Test Sync — {date}
276
-
277
- - [ ] TEST-001: Fix failing test `test_login.py::test_valid_credentials`
278
- - Error: AssertionError — expected 200, got 201
279
- - Cause: API return code changed
280
- - Action: Update assertion to expect 201
281
-
282
- - [ ] TEST-002: Add tests for `src/auth/roles.py`
283
- - Functions: check_permission, assign_role, revoke_role
284
- - Priority: HIGH — authorization logic
285
-
286
- - [ ] TEST-003: Update mock data in `test_users.py`
287
- - Schema changed: added `last_login` field
288
- - Action: Update all user fixtures
289
- ```
290
-
291
- ## Step 8: Integration with Workflow
292
-
293
- ### During Execute Phase (auto-invoked):
294
- After each task completes:
295
- 1. Scan changed files and map to existing tests
296
- 2. **If new code paths have zero test coverage: write tests NOW** — do not defer
297
- 3. Run ALL affected unit/integration tests
298
- 4. Run ALL affected Playwright E2E tests
299
- 5. If failures: fix immediately (up to 2 attempts) before continuing. If both attempts fail:
300
- 1. Write failure context to `.gsd-t/debug-state.jsonl` via `node -e "require('./bin/debug-ledger.js').appendEntry('.', {iteration:1,timestamp:new Date().toISOString(),test:'test-sync-failure',error:'2 in-context fix attempts exhausted',hypothesis:'see test-coverage.md',fix:'n/a',fixFiles:[],result:'STILL_FAILS',learning:'delegating to headless debug-loop',model:'sonnet',duration:0})"`
301
- 2. Log: "Delegating to headless debug-loop (2 in-context attempts exhausted)"
302
- 3. Run: `gsd-t headless --debug-loop --max-iterations 10`
303
- 4. Exit code 0 → tests pass, continue; 1/4 → log to `.gsd-t/deferred-items.md`, report failure; 3 → report error
304
- 6. If E2E specs are missing for new features/modes/flows: **create them NOW**, not later
305
- 7. If E2E specs need updating for changed behavior: update them before continuing
306
- 8. **No task is complete until its tests exist and pass** — do not move to the next task with test gaps
307
-
308
- ### During Verify Phase (auto-invoked):
309
- Full sync:
310
- 1. Complete coverage analysis (unit + E2E)
311
- 2. Run ALL unit/integration tests
312
- 3. Run the FULL E2E test suite — this is mandatory, not optional
313
- 4. Generate full report
314
- 5. Block verification if any critical tests failing (unit or E2E)
315
-
316
- ### Standalone Mode:
317
- ```
318
- /user:gsd-t-test-sync
319
- ```
320
- 1. Full analysis of entire codebase
321
- 2. Comprehensive report
322
- 3. Generate all test tasks
323
- 4. Do not auto-add to domains — present for review
324
-
325
- ## Step 9: Report to User
326
-
327
- ### Quick Mode (during execute):
328
- ```
329
- 🧪 Test sync: 3 tests affected, 3 passing
330
- 1 coverage gap noted → will address in verify phase
331
- ```
332
-
333
- ### Full Mode (during verify or standalone):
334
- ```
335
- 🧪 Test Sync Complete
336
-
337
- Unit/Integration:
338
- - Tests run: 45
339
- - Passing: 43
340
- - Failing: 2
341
-
342
- E2E ({framework}):
343
- - Specs run: 12
344
- - Passing: 11
345
- - Failing: 1
346
-
347
- Coverage:
348
- - Gaps: 3
349
- - Stale tests: 1
350
- - Dead tests: 0
351
-
352
- Action Required:
353
- - 2 failing unit tests must be fixed before verify passes
354
- - 1 failing E2E spec must be fixed before verify passes
355
- - See .gsd-t/test-coverage.md for details
356
-
357
- Generated 5 test tasks → added to current domain
358
- ```
359
-
360
- ### Autonomy Behavior
361
-
362
- **Level 3 (Full Auto)**: Log the summary and auto-advance to the next phase. If there are failing tests, attempt auto-fix (up to 2 attempts) before continuing. Do NOT wait for user input.
363
-
364
- **Level 1–2**: Present the full report and wait for user input before proceeding.
365
-
366
- ## Document Ripple
367
-
368
- ### Always update:
369
- 1. **`.gsd-t/progress.md`** — Log test sync results in Decision Log (standalone mode)
370
- 2. **`.gsd-t/test-coverage.md`** — Created/updated with coverage report (Step 6)
371
-
372
- ### Check if affected:
373
- 3. **`docs/requirements.md`** — If test tasks map to requirements, update the Test Coverage table
374
- 4. **`.gsd-t/domains/{current}/tasks.md`** — If test tasks were generated, append them (Step 7)
375
- 5. **`.gsd-t/techdebt.md`** — If persistent test gaps were found, add as debt items
376
-
377
- $ARGUMENTS
378
-
379
- ## Auto-Clear
380
-
381
- All work is committed to project files. Execute `/clear` to free the context window for the next command.
1
+ # GSD-T: Test Sync — Keep Tests Aligned with Code
2
+
3
+ You are maintaining test coverage as code changes. Your job is to identify stale tests, coverage gaps, and dead tests, then generate tasks to address them.
4
+
5
+ This command is:
6
+ - **Auto-invoked** during execute phase (after each task) and verify phase
7
+ - **Standalone** when user wants to audit test health
8
+
9
+ ## Step 1: Load Context
10
+
11
+ Read:
12
+ 1. `CLAUDE.md` — testing conventions, test locations
13
+ 2. `.gsd-t/progress.md` — what just changed
14
+ 3. `.gsd-t/test-coverage.md` — previous coverage state (if exists)
15
+ 4. `.gsd-t/domains/{current}/tasks.md` — recent completed tasks
16
+
17
+ Identify:
18
+ - **Unit/integration test framework** (pytest, jest, vitest, etc.)
19
+ - **E2E test framework** (Playwright, Cypress, Puppeteer, etc.) — check for `playwright.config.*`, `cypress.config.*`, `playwright/`, `cypress/`, `e2e/`, or E2E-related dependencies in package.json/requirements.txt
20
+ - Test directory structure
21
+ - Naming conventions
22
+ - Test run commands (from package.json scripts, Makefile, or CI config)
23
+
24
+ ## Step 1.5: Graph-Enhanced Test Discovery
25
+
26
+ If `.gsd-t/graph/meta.json` exists (graph index is available):
27
+ 1. Query `getTestsFor` each changed entity to find stale or missing tests more precisely than filesystem search
28
+ 2. Query `getTransitiveCallers` for changed functions to find indirectly affected tests that may need updating
29
+ 3. Feed these findings into the coverage map (Step 3) and issue detection (Step 4)
30
+
31
+ If graph is not available, skip this step.
32
+
33
+ ## Step 2: Contract Coverage Audit
34
+
35
+ Perform inline contract testing and gap analysis:
36
+
37
+ 1. Read all contracts in `.gsd-t/contracts/` — identify the interface each one defines
38
+ 2. For each contract, check whether a test file exists that validates it
39
+ 3. Run the full test suite: `npm test` (or project equivalent)
40
+ 4. Identify gaps: contracts with no tests, stale tests referencing removed APIs, uncovered code paths
41
+ 5. Report: coverage gaps, stale tests, and recommended test tasks
42
+
43
+ Test-sync cannot complete if critical contract gaps remain unaddressed.
44
+
45
+ ## Step 3: Map Code to Tests
46
+
47
+ For each file changed in recent tasks:
48
+
49
+ ### A) Find Existing Tests
50
+ ```bash
51
+ # Common patterns
52
+ find tests/ -name "*{module_name}*"
53
+ find __tests__/ -name "*{module_name}*"
54
+ find . -name "*.test.*" | xargs grep -l "{function_name}"
55
+ find . -name "*.spec.*" | xargs grep -l "{class_name}"
56
+ ```
57
+
58
+ ### B) Build Coverage Map
59
+ ```
60
+ | Source File | Test File(s) | Coverage Status |
61
+ |-------------|--------------|-----------------|
62
+ | src/auth/login.py | tests/test_login.py | COVERED |
63
+ | src/auth/roles.py | (none) | GAP |
64
+ | src/api/users.py | tests/test_users.py | PARTIAL |
65
+ ```
66
+
67
+ ## Step 4: Detect Test Issues
68
+
69
+ ### A) Stale Tests
70
+ Tests that reference old behavior:
71
+ - Function signatures that changed
72
+ - Removed functions still being tested
73
+ - Old API shapes in assertions
74
+ - Mocked data that no longer matches schema
75
+
76
+ Check:
77
+ ```bash
78
+ # Find tests importing changed modules
79
+ grep -r "from {changed_module}" tests/
80
+ # Check if test assertions match new behavior
81
+ ```
82
+
83
+ ### B) Coverage Gaps
84
+ New or changed code without tests:
85
+ - New functions with no test
86
+ - New branches with no coverage
87
+ - Changed behavior with no updated assertions
88
+ - New error cases with no error tests
89
+
90
+ ### C) Dead Tests
91
+ Tests for deleted functionality:
92
+ - Tests importing deleted modules
93
+ - Tests for removed features
94
+ - Skipped tests that should be removed
95
+
96
+ ### D) Flaky Tests (if test history available)
97
+ Tests that sometimes fail:
98
+ - Check recent CI runs
99
+ - Note any intermittent failures
100
+
101
+ ## Step 5: Run Affected Tests
102
+
103
+ ### A) Unit/Integration Tests
104
+ Execute tests that cover changed code:
105
+
106
+ ```bash
107
+ # Example for pytest
108
+ pytest tests/test_{module}.py -v
109
+
110
+ # Example for jest
111
+ npm test -- --testPathPattern="{module}"
112
+ ```
113
+
114
+ ### B) E2E Tests (MANDATORY when config exists)
115
+ If `playwright.config.*` or `cypress.config.*` exists, you MUST run E2E tests — skipping is never acceptable:
116
+
117
+ ```bash
118
+ # Playwright
119
+ npx playwright test {affected-spec}.spec.ts
120
+
121
+ # Cypress
122
+ npx cypress run --spec "cypress/e2e/{affected-spec}.cy.ts"
123
+ ```
124
+
125
+ Determine which E2E specs are affected:
126
+ - Changed a UI component or page? → Run specs that test that page/flow
127
+ - Changed an API endpoint? → Run specs that exercise that endpoint
128
+ - Changed auth/session logic? → Run all auth-related E2E specs
129
+ - Changed database schema? → Run specs that depend on that data
130
+ - Not sure what's affected? → Run the full E2E suite
131
+
132
+ ### C) Create and Update Playwright E2E Tests (MANDATORY when UI/routes/flows/modes changed)
133
+
134
+ If Playwright is configured (`playwright.config.*` or Playwright in dependencies):
135
+
136
+ **For new features, pages, modes, or flows — CREATE comprehensive specs:**
137
+ - Happy path for every new user flow
138
+ - All feature modes/flags (e.g., `--component` mode gets its own test suite, not just default mode)
139
+ - Form validation: valid input, invalid input, empty fields, boundary values
140
+ - Error states: network failures, API errors, permission denied, timeout
141
+ - Empty states: no data, first-time user, cleared data
142
+ - Loading states: skeleton screens, spinners, progressive loading
143
+ - Edge cases: rapid clicking, double submission, back/forward navigation, browser refresh mid-flow
144
+ - Responsive: test at mobile and desktop breakpoints if layout changes
145
+
146
+ **For changed features — UPDATE existing specs AND add missing coverage:**
147
+ - Changed UI elements (selectors, text, layout) → update locators and assertions
148
+ - Changed form fields or validation → update form fill steps and error assertions
149
+ - Removed features → remove or update affected E2E specs
150
+ - Review existing specs for missing edge cases and add them
151
+
152
+ **This is NOT optional.** Every new code path that a user can reach must have a Playwright spec. "We'll add tests later" is never acceptable.
153
+
154
+ **FUNCTIONAL TESTS — NOT LAYOUT TESTS (MANDATORY):**
155
+ E2E specs that only check element existence (`isVisible`, `toBeAttached`, `toBeEnabled`) are
156
+ layout tests. Layout tests pass even when every feature is broken — they are worthless for QA.
157
+
158
+ Every Playwright assertion MUST verify **functional behavior** — that an action produced the
159
+ correct outcome:
160
+ - **Tab/navigation**: Click → assert the NEW content loaded (unique text, data, or elements
161
+ that only appear on the destination view). Never just assert the tab element exists.
162
+ - **Forms**: Fill → submit → assert success feedback AND data persisted (API call observed
163
+ via `page.waitForResponse`, or list/table updated with new entry).
164
+ - **Interactive widgets** (terminals, editors, code panels): Open → interact → assert the
165
+ widget responded (keystroke produced output, content was saved, command executed).
166
+ - **Connections** (WebSocket, SSE, polling): Assert status transitions ("Connecting" →
167
+ "Connected") and verify data flows through the connection.
168
+ - **State toggles** (dark mode, expand/collapse, enable/disable): Assert the EFFECT of the
169
+ toggle, not just that the toggle control exists.
170
+ - **Error handling**: Trigger error → assert error content → assert recovery path works.
171
+
172
+ **Rule: If a test would pass on an empty HTML page with the correct element IDs and no
173
+ JavaScript, it is not a functional test. Rewrite it.**
174
+
175
+ ### D) Capture Results
176
+ For all test types:
177
+ - PASS: Test still valid
178
+ - FAIL: Test needs update or code has bug
179
+ - ERROR: Test broken (import error, etc.)
180
+
181
+ ## Step 6: Produce Test Coverage Report
182
+
183
+ Create/update `.gsd-t/test-coverage.md`:
184
+
185
+ ```markdown
186
+ # Test Coverage Report — {date}
187
+
188
+ ## Summary
189
+ - Source files analyzed: {N}
190
+ - Unit/integration test files: {N}
191
+ - E2E test specs: {N}
192
+ - Coverage gaps: {N}
193
+ - Stale tests: {N}
194
+ - Dead tests: {N}
195
+ - Unit tests passing: {N}/{total}
196
+ - E2E tests passing: {N}/{total}
197
+
198
+ ## Coverage Status
199
+
200
+ ### ✅ Well Covered
201
+ | Source | Test | Last Verified |
202
+ |--------|------|---------------|
203
+ | {file} | {test} | {date} |
204
+
205
+ ### ⚠️ Partial Coverage
206
+ | Source | Test | Gap |
207
+ |--------|------|-----|
208
+ | {file} | {test} | {missing: error cases, edge cases, etc.} |
209
+
210
+ ### ❌ No Coverage
211
+ | Source | Risk Level | Reason |
212
+ |--------|------------|--------|
213
+ | {file} | {HIGH/MED/LOW} | {new file, complex logic, etc.} |
214
+
215
+ ---
216
+
217
+ ## Issues Found
218
+
219
+ ### Stale Tests
220
+ | Test | Issue | Action |
221
+ |------|-------|--------|
222
+ | {test} | {function signature changed} | Update assertions |
223
+ | {test} | {mock data outdated} | Update mock |
224
+
225
+ ### Dead Tests
226
+ | Test | Reason | Action |
227
+ |------|--------|--------|
228
+ | {test} | {tests deleted feature} | Remove |
229
+ | {test} | {imports removed module} | Remove |
230
+
231
+ ### Failing Tests
232
+ | Test | Error | Likely Cause |
233
+ |------|-------|--------------|
234
+ | {test} | {error message} | {code bug or test needs update} |
235
+
236
+ ---
237
+
238
+ ## Test Health Metrics
239
+
240
+ - Test-to-code ratio: {N tests / N source files}
241
+ - Average assertions per test: {N}
242
+ - Critical paths covered: {list}
243
+ - Critical paths uncovered: {list}
244
+
245
+ ---
246
+
247
+ ## Generated Tasks
248
+
249
+ ### High Priority (blocking)
250
+ - [ ] TEST-001: Fix failing test {test} — {reason}
251
+ - [ ] TEST-002: Update stale test {test} — {what changed}
252
+
253
+ ### Medium Priority (should do)
254
+ - [ ] TEST-010: Add tests for {file} — {N} functions uncovered
255
+ - [ ] TEST-011: Add error case tests for {function}
256
+
257
+ ### Low Priority (nice to have)
258
+ - [ ] TEST-020: Remove dead test {test}
259
+ - [ ] TEST-021: Add edge case tests for {function}
260
+
261
+ ---
262
+
263
+ ## Recommendations
264
+
265
+ {Based on findings, what should be prioritized}
266
+ ```
267
+
268
+ ## Step 7: Generate Test Tasks
269
+
270
+ If issues found, add to current domain's tasks:
271
+
272
+ ```markdown
273
+ ## Auto-Generated Test Tasks
274
+
275
+ ### From Test Sync — {date}
276
+
277
+ - [ ] TEST-001: Fix failing test `test_login.py::test_valid_credentials`
278
+ - Error: AssertionError — expected 200, got 201
279
+ - Cause: API return code changed
280
+ - Action: Update assertion to expect 201
281
+
282
+ - [ ] TEST-002: Add tests for `src/auth/roles.py`
283
+ - Functions: check_permission, assign_role, revoke_role
284
+ - Priority: HIGH — authorization logic
285
+
286
+ - [ ] TEST-003: Update mock data in `test_users.py`
287
+ - Schema changed: added `last_login` field
288
+ - Action: Update all user fixtures
289
+ ```
290
+
291
+ ## Step 8: Integration with Workflow
292
+
293
+ ### During Execute Phase (auto-invoked):
294
+ After each task completes:
295
+ 1. Scan changed files and map to existing tests
296
+ 2. **If new code paths have zero test coverage: write tests NOW** — do not defer
297
+ 3. Run ALL affected unit/integration tests
298
+ 4. Run ALL affected Playwright E2E tests
299
+ 5. If failures: fix immediately (up to 2 attempts) before continuing. If both attempts fail:
300
+ 1. Write failure context to `.gsd-t/debug-state.jsonl` via `node -e "require('./bin/debug-ledger.js').appendEntry('.', {iteration:1,timestamp:new Date().toISOString(),test:'test-sync-failure',error:'2 in-context fix attempts exhausted',hypothesis:'see test-coverage.md',fix:'n/a',fixFiles:[],result:'STILL_FAILS',learning:'delegating to headless debug-loop',model:'sonnet',duration:0})"`
301
+ 2. Log: "Delegating to headless debug-loop (2 in-context attempts exhausted)"
302
+ 3. Run: `gsd-t headless --debug-loop --max-iterations 10`
303
+ 4. Exit code 0 → tests pass, continue; 1/4 → log to `.gsd-t/deferred-items.md`, report failure; 3 → report error
304
+ 6. If E2E specs are missing for new features/modes/flows: **create them NOW**, not later
305
+ 7. If E2E specs need updating for changed behavior: update them before continuing
306
+ 8. **No task is complete until its tests exist and pass** — do not move to the next task with test gaps
307
+
308
+ ### During Verify Phase (auto-invoked):
309
+ Full sync:
310
+ 1. Complete coverage analysis (unit + E2E)
311
+ 2. Run ALL unit/integration tests
312
+ 3. Run the FULL E2E test suite — this is mandatory, not optional
313
+ 4. Generate full report
314
+ 5. Block verification if any critical tests failing (unit or E2E)
315
+
316
+ ### Standalone Mode:
317
+ ```
318
+ /user:gsd-t-test-sync
319
+ ```
320
+ 1. Full analysis of entire codebase
321
+ 2. Comprehensive report
322
+ 3. Generate all test tasks
323
+ 4. Do not auto-add to domains — present for review
324
+
325
+ ## Step 9: Report to User
326
+
327
+ ### Quick Mode (during execute):
328
+ ```
329
+ 🧪 Test sync: 3 tests affected, 3 passing
330
+ 1 coverage gap noted → will address in verify phase
331
+ ```
332
+
333
+ ### Full Mode (during verify or standalone):
334
+ ```
335
+ 🧪 Test Sync Complete
336
+
337
+ Unit/Integration:
338
+ - Tests run: 45
339
+ - Passing: 43
340
+ - Failing: 2
341
+
342
+ E2E ({framework}):
343
+ - Specs run: 12
344
+ - Passing: 11
345
+ - Failing: 1
346
+
347
+ Coverage:
348
+ - Gaps: 3
349
+ - Stale tests: 1
350
+ - Dead tests: 0
351
+
352
+ Action Required:
353
+ - 2 failing unit tests must be fixed before verify passes
354
+ - 1 failing E2E spec must be fixed before verify passes
355
+ - See .gsd-t/test-coverage.md for details
356
+
357
+ Generated 5 test tasks → added to current domain
358
+ ```
359
+
360
+ ### Autonomy Behavior
361
+
362
+ **Level 3 (Full Auto)**: Log the summary and auto-advance to the next phase. If there are failing tests, attempt auto-fix (up to 2 attempts) before continuing. Do NOT wait for user input.
363
+
364
+ **Level 1–2**: Present the full report and wait for user input before proceeding.
365
+
366
+ ## Document Ripple
367
+
368
+ ### Always update:
369
+ 1. **`.gsd-t/progress.md`** — Log test sync results in Decision Log (standalone mode)
370
+ 2. **`.gsd-t/test-coverage.md`** — Created/updated with coverage report (Step 6)
371
+
372
+ ### Check if affected:
373
+ 3. **`docs/requirements.md`** — If test tasks map to requirements, update the Test Coverage table
374
+ 4. **`.gsd-t/domains/{current}/tasks.md`** — If test tasks were generated, append them (Step 7)
375
+ 5. **`.gsd-t/techdebt.md`** — If persistent test gaps were found, add as debt items
376
+
377
+ $ARGUMENTS
378
+
379
+ ## Auto-Clear
380
+
381
+ All work is committed to project files. Execute `/clear` to free the context window for the next command.
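
Note for readers of this diff: the exit-code handling in Step 8 (item 5, sub-step 4) of `gsd-t-test-sync.md` can be sketched roughly as below. This is a minimal illustration, not code from the package. The function name `routeDebugLoopExit` and the returned action labels are hypothetical, and treating any unlisted exit code as an error is an assumption, since the command file only specifies codes 0, 1, 3 and 4.

```javascript
// Hypothetical helper (not part of @tekyzinc/gsd-t): maps the exit code of
// `gsd-t headless --debug-loop --max-iterations 10` to the follow-up that
// Step 8 describes. Action labels are illustrative only.
function routeDebugLoopExit(exitCode) {
  switch (exitCode) {
    case 0:
      // Tests pass: continue with the next task.
      return { action: "continue", logTo: null };
    case 1:
    case 4:
      // Log to the deferred-items file, then report the failure.
      return { action: "report-failure", logTo: ".gsd-t/deferred-items.md" };
    case 3:
      // Report an error; nothing is written to the deferred-items file.
      return { action: "report-error", logTo: null };
    default:
      // Assumption: the source does not define other codes, so they are
      // treated as errors here rather than silently ignored.
      return { action: "report-error", logTo: null };
  }
}

// Example: a run that exits with code 4 is logged as a deferred item.
const outcome = routeDebugLoopExit(4);
console.log(outcome.action, outcome.logTo);
```

Keeping the mapping in one small function makes the three outcomes (continue, report failure, report error) easy to check in isolation before wiring them to any real process invocation.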