@tekyzinc/gsd-t 2.50.12 → 2.53.10
- package/CHANGELOG.md +24 -0
- package/README.md +379 -372
- package/bin/component-registry.js +250 -0
- package/bin/graph-cgc.js +510 -510
- package/bin/graph-indexer.js +147 -147
- package/bin/graph-overlay.js +195 -195
- package/bin/graph-parsers.js +327 -327
- package/bin/graph-query.js +453 -452
- package/bin/graph-store.js +154 -154
- package/bin/qa-calibrator.js +194 -0
- package/bin/scan-data-collector.js +153 -153
- package/bin/scan-diagrams-generators.js +187 -187
- package/bin/scan-diagrams.js +79 -79
- package/bin/scan-renderer.js +92 -92
- package/bin/scan-report-sections.js +121 -121
- package/bin/scan-report.js +184 -184
- package/bin/scan-schema-parsers.js +199 -199
- package/bin/scan-schema.js +103 -103
- package/bin/token-budget.js +246 -0
- package/commands/Claude-md.md +10 -10
- package/commands/branch.md +15 -15
- package/commands/checkin.md +45 -45
- package/commands/global-change.md +209 -209
- package/commands/gsd-t-audit.md +199 -0
- package/commands/gsd-t-backlog-add.md +94 -94
- package/commands/gsd-t-backlog-edit.md +111 -111
- package/commands/gsd-t-backlog-list.md +63 -63
- package/commands/gsd-t-backlog-move.md +94 -94
- package/commands/gsd-t-backlog-promote.md +123 -123
- package/commands/gsd-t-backlog-remove.md +86 -86
- package/commands/gsd-t-backlog-settings.md +158 -158
- package/commands/gsd-t-complete-milestone.md +528 -515
- package/commands/gsd-t-debug.md +506 -399
- package/commands/gsd-t-discuss.md +174 -174
- package/commands/gsd-t-execute.md +758 -634
- package/commands/gsd-t-feature.md +276 -276
- package/commands/gsd-t-health.md +142 -142
- package/commands/gsd-t-help.md +465 -457
- package/commands/gsd-t-impact.md +302 -302
- package/commands/gsd-t-init.md +320 -280
- package/commands/gsd-t-integrate.md +365 -249
- package/commands/gsd-t-milestone.md +87 -87
- package/commands/gsd-t-partition.md +442 -361
- package/commands/gsd-t-pause.md +82 -82
- package/commands/gsd-t-plan.md +345 -344
- package/commands/gsd-t-populate.md +111 -111
- package/commands/gsd-t-prd.md +326 -326
- package/commands/gsd-t-project.md +211 -211
- package/commands/gsd-t-promote-debt.md +123 -123
- package/commands/gsd-t-prompt.md +137 -137
- package/commands/gsd-t-qa.md +266 -266
- package/commands/gsd-t-quick.md +357 -234
- package/commands/gsd-t-reflect.md +134 -134
- package/commands/gsd-t-resume.md +72 -72
- package/commands/gsd-t-scan.md +615 -615
- package/commands/gsd-t-setup.md +76 -0
- package/commands/gsd-t-status.md +192 -166
- package/commands/gsd-t-test-sync.md +381 -381
- package/commands/gsd-t-triage-and-merge.md +171 -171
- package/commands/gsd-t-verify.md +382 -382
- package/commands/gsd-t-visualize.md +118 -118
- package/commands/gsd-t-wave.md +401 -378
- package/docs/GSD-T-README.md +425 -422
- package/docs/architecture.md +385 -369
- package/docs/harness-design-analysis.md +371 -0
- package/docs/infrastructure.md +205 -205
- package/docs/prd-graph-engine.md +398 -398
- package/docs/prd-gsd2-hybrid.md +559 -559
- package/docs/prd-harness-evolution.md +583 -0
- package/docs/requirements.md +14 -0
- package/docs/workflows.md +226 -226
- package/examples/.gsd-t/domains/example-domain/scope.md +13 -13
- package/package.json +40 -40
- package/scripts/gsd-t-auto-route.js +39 -39
- package/scripts/gsd-t-dashboard-mockup.html +1143 -1143
- package/scripts/gsd-t-dashboard-server.js +171 -171
- package/scripts/gsd-t-dashboard.html +262 -262
- package/scripts/gsd-t-event-writer.js +128 -128
- package/scripts/gsd-t-statusline.js +94 -94
- package/scripts/gsd-t-tools.js +175 -175
- package/templates/CLAUDE-global.md +639 -614
- package/templates/CLAUDE-project.md +24 -0
- package/templates/backlog-settings.md +18 -18
- package/templates/backlog.md +1 -1
- package/templates/progress.md +40 -40
- package/templates/shared-services-contract.md +60 -60
- package/templates/stacks/desktop.ini +2 -2
- package/bin/desktop.ini +0 -2
- package/commands/desktop.ini +0 -2
- package/docs/ci-examples/desktop.ini +0 -2
- package/docs/desktop.ini +0 -2
- package/examples/.gsd-t/contracts/desktop.ini +0 -2
- package/examples/.gsd-t/desktop.ini +0 -2
- package/examples/.gsd-t/domains/desktop.ini +0 -2
- package/examples/.gsd-t/domains/example-domain/desktop.ini +0 -2
- package/examples/desktop.ini +0 -2
- package/examples/rules/desktop.ini +0 -2
- package/scripts/desktop.ini +0 -2
- package/templates/desktop.ini +0 -2
|
@@ -1,381 +1,381 @@
|
|
|
1
|
-
# GSD-T: Test Sync — Keep Tests Aligned with Code
|
|
2
|
-
|
|
3
|
-
You are maintaining test coverage as code changes. Your job is to identify stale tests, coverage gaps, and dead tests, then generate tasks to address them.
|
|
4
|
-
|
|
5
|
-
This command is:
|
|
6
|
-
- **Auto-invoked** during execute phase (after each task) and verify phase
|
|
7
|
-
- **Standalone** when user wants to audit test health
|
|
8
|
-
|
|
9
|
-
## Step 1: Load Context
|
|
10
|
-
|
|
11
|
-
Read:
|
|
12
|
-
1. `CLAUDE.md` — testing conventions, test locations
|
|
13
|
-
2. `.gsd-t/progress.md` — what just changed
|
|
14
|
-
3. `.gsd-t/test-coverage.md` — previous coverage state (if exists)
|
|
15
|
-
4. `.gsd-t/domains/{current}/tasks.md` — recent completed tasks
|
|
16
|
-
|
|
17
|
-
Identify:
|
|
18
|
-
- **Unit/integration test framework** (pytest, jest, vitest, etc.)
|
|
19
|
-
- **E2E test framework** (Playwright, Cypress, Puppeteer, etc.) — check for `playwright.config.*`, `cypress.config.*`, `playwright/`, `cypress/`, `e2e/`, or E2E-related dependencies in package.json/requirements.txt
|
|
20
|
-
- Test directory structure
|
|
21
|
-
- Naming conventions
|
|
22
|
-
- Test run commands (from package.json scripts, Makefile, or CI config)
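The framework check above can be sketched as a small lookup. This is a minimal illustration, not part of GSD-T: the function name `detectE2EFramework` and its inputs (a list of repo-relative paths and a dependency map) are assumptions made for the example.

```javascript
// Hypothetical helper: guess the E2E framework from file paths and dependencies.
// The markers mirror the checks listed above; the name and inputs are illustrative.
function detectE2EFramework(paths, dependencies = {}) {
  const hasPath = (pattern) => paths.some((p) => pattern.test(p));

  if (hasPath(/(^|\/)playwright\.config\.[^/]+$/) ||
      hasPath(/(^|\/)playwright\//) ||
      '@playwright/test' in dependencies) {
    return 'playwright';
  }
  if (hasPath(/(^|\/)cypress\.config\.[^/]+$/) ||
      hasPath(/(^|\/)cypress\//) ||
      'cypress' in dependencies) {
    return 'cypress';
  }
  if ('puppeteer' in dependencies) return 'puppeteer';

  // An e2e/ directory alone signals E2E tests without naming the framework.
  return hasPath(/(^|\/)e2e\//) ? 'unknown' : null;
}
```

A `null` result means no E2E markers were found, so the mandatory E2E steps later in the command would not apply.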
|
|
23
|
-
|
|
24
|
-
## Step 1.5: Graph-Enhanced Test Discovery
|
|
25
|
-
|
|
26
|
-
If `.gsd-t/graph/meta.json` exists (graph index is available):
|
|
27
|
-
1. Query `getTestsFor` for each changed entity to find stale or missing tests more precisely than filesystem search
|
|
28
|
-
2. Query `getTransitiveCallers` for changed functions to find indirectly affected tests that may need updating
|
|
29
|
-
3. Feed these findings into the coverage map (Step 3) and issue detection (Step 4)
|
|
30
|
-
|
|
31
|
-
If graph is not available, skip this step.
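To illustrate why transitive callers matter here, the sketch below walks a toy call graph. The graph shape and the helper names `transitiveCallers` and `affectedTests` are assumptions for this example; the real `getTestsFor` and `getTransitiveCallers` queries and the on-disk index format may look quite different.

```javascript
// Hypothetical in-memory graph: this shape is an assumption, not the GSD-T index format.
const graph = {
  calledBy: { hashPassword: ['login'], login: ['handleLoginRoute'], handleLoginRoute: [] },
  tests: { login: ['tests/login.test.js'], handleLoginRoute: ['e2e/login.spec.ts'] },
};

// Breadth-first walk over "called by" edges to collect direct and indirect callers.
function transitiveCallers(g, entity) {
  const seen = new Set();
  const queue = [entity];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const caller of g.calledBy[current] || []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  return [...seen];
}

// Tests attached to the changed entity itself plus tests reached through its callers.
function affectedTests(g, entity) {
  const entities = [entity, ...transitiveCallers(g, entity)];
  return [...new Set(entities.flatMap((e) => g.tests[e] || []))];
}
```

In this toy data, `hashPassword` has no direct test, which would surface as a gap in Step 4, while a change to it still flags the two tests that reach it through `login`.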
|
|
32
|
-
|
|
33
|
-
## Step 2: Contract Coverage Audit
|
|
34
|
-
|
|
35
|
-
Perform inline contract testing and gap analysis:
|
|
36
|
-
|
|
37
|
-
1. Read all contracts in `.gsd-t/contracts/` — identify the interface each one defines
|
|
38
|
-
2. For each contract, check whether a test file exists that validates it
|
|
39
|
-
3. Run the full test suite: `npm test` (or project equivalent)
|
|
40
|
-
4. Identify gaps: contracts with no tests, stale tests referencing removed APIs, uncovered code paths
|
|
41
|
-
5. Report: coverage gaps, stale tests, and recommended test tasks
|
|
42
|
-
|
|
43
|
-
Test-sync cannot complete if critical contract gaps remain unaddressed.
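The contract-to-test check in items 1, 2, and 4 could look roughly like this. The matching rule (a test file name containing the contract's base name) is an assumption for illustration; a project may map contracts to tests in some other way.

```javascript
const path = require('path');

// Hypothetical matching rule: a contract counts as tested when some test file
// name contains the contract's base name. Real projects may map these differently.
function findUntestedContracts(contractFiles, testFiles) {
  return contractFiles.filter((contract) => {
    const name = path.basename(contract, path.extname(contract)).toLowerCase();
    return !testFiles.some((t) => path.basename(t).toLowerCase().includes(name));
  });
}
```

Any path returned here is a contract with no test, which is the kind of gap that blocks test-sync from completing.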
|
|
44
|
-
|
|
45
|
-
## Step 3: Map Code to Tests
|
|
46
|
-
|
|
47
|
-
For each file changed in recent tasks:
|
|
48
|
-
|
|
49
|
-
### A) Find Existing Tests
|
|
50
|
-
```bash
|
|
51
|
-
# Common patterns
|
|
52
|
-
find tests/ -name "*{module_name}*"
|
|
53
|
-
find __tests__/ -name "*{module_name}*"
|
|
54
|
-
find . -name "*.test.*" -exec grep -l "{function_name}" {} +
|
|
55
|
-
find . -name "*.spec.*" -exec grep -l "{class_name}" {} +
|
|
56
|
-
```
|
|
57
|
-
|
|
58
|
-
### B) Build Coverage Map
|
|
59
|
-
```
|
|
60
|
-
| Source File | Test File(s) | Coverage Status |
|
|
61
|
-
|-------------|--------------|-----------------|
|
|
62
|
-
| src/auth/login.py | tests/test_login.py | COVERED |
|
|
63
|
-
| src/auth/roles.py | (none) | GAP |
|
|
64
|
-
| src/api/users.py | tests/test_users.py | PARTIAL |
|
|
65
|
-
```
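One way to derive the COVERED / PARTIAL / GAP column is to compare the functions a source file exports with the ones some test references. The input shape below is hypothetical, and treating a file with no exported functions as a GAP is a simplification of this sketch.

```javascript
// Hypothetical inputs: the functions a source file exports and the subset
// of those functions that at least one test references.
function coverageStatus(exportedFunctions, testedFunctions) {
  const tested = exportedFunctions.filter((fn) => testedFunctions.includes(fn));
  if (tested.length === 0) return 'GAP';
  return tested.length === exportedFunctions.length ? 'COVERED' : 'PARTIAL';
}

// Produces one row per source file, in the same three columns as the table above.
function buildCoverageMap(sources) {
  return sources.map(({ file, tests, exported, tested }) => ({
    file,
    tests: tests.length > 0 ? tests.join(', ') : '(none)',
    status: coverageStatus(exported, tested),
  }));
}
```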
|
|
66
|
-
|
|
67
|
-
## Step 4: Detect Test Issues
|
|
68
|
-
|
|
69
|
-
### A) Stale Tests
|
|
70
|
-
Tests that reference old behavior:
|
|
71
|
-
- Function signatures that changed
|
|
72
|
-
- Removed functions still being tested
|
|
73
|
-
- Old API shapes in assertions
|
|
74
|
-
- Mocked data that no longer matches schema
|
|
75
|
-
|
|
76
|
-
Check:
|
|
77
|
-
```bash
|
|
78
|
-
# Find tests importing changed modules
|
|
79
|
-
grep -r "from {changed_module}" tests/
|
|
80
|
-
# Check if test assertions match new behavior
|
|
81
|
-
```
|
|
82
|
-
|
|
83
|
-
### B) Coverage Gaps
|
|
84
|
-
New or changed code without tests:
|
|
85
|
-
- New functions with no test
|
|
86
|
-
- New branches with no coverage
|
|
87
|
-
- Changed behavior with no updated assertions
|
|
88
|
-
- New error cases with no error tests
|
|
89
|
-
|
|
90
|
-
### C) Dead Tests
|
|
91
|
-
Tests for deleted functionality:
|
|
92
|
-
- Tests importing deleted modules
|
|
93
|
-
- Tests for removed features
|
|
94
|
-
- Skipped tests that should be removed
|
|
95
|
-
|
|
96
|
-
### D) Flaky Tests (if test history available)
|
|
97
|
-
Tests that sometimes fail:
|
|
98
|
-
- Check recent CI runs
|
|
99
|
-
- Note any intermittent failures
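If CI history is available in some structured form, intermittent failures can be spotted by looking for a test that both passed and failed on the same commit. The record shape (`test`, `commit`, `result`) is an assumption for this sketch.

```javascript
// Hypothetical CI history: one record per test per run, with the commit under test.
function findFlakyTests(runs) {
  const outcomes = new Map(); // key "test@commit" -> set of observed results
  for (const { test, commit, result } of runs) {
    const key = `${test}@${commit}`;
    if (!outcomes.has(key)) outcomes.set(key, new Set());
    outcomes.get(key).add(result);
  }

  const flaky = new Set();
  for (const [key, results] of outcomes) {
    // Same test, same commit, both outcomes: the code did not change, so the test is suspect.
    if (results.has('PASS') && results.has('FAIL')) {
      flaky.add(key.slice(0, key.lastIndexOf('@')));
    }
  }
  return [...flaky].sort();
}
```

A test that fails on one commit and passes on the next is not flagged, since a code change may explain the difference.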
|
|
100
|
-
|
|
101
|
-
## Step 5: Run Affected Tests
|
|
102
|
-
|
|
103
|
-
### A) Unit/Integration Tests
|
|
104
|
-
Execute tests that cover changed code:
|
|
105
|
-
|
|
106
|
-
```bash
|
|
107
|
-
# Example for pytest
|
|
108
|
-
pytest tests/test_{module}.py -v
|
|
109
|
-
|
|
110
|
-
# Example for jest
|
|
111
|
-
npm test -- --testPathPattern="{module}"
|
|
112
|
-
```
|
|
113
|
-
|
|
114
|
-
### B) E2E Tests (MANDATORY when config exists)
|
|
115
|
-
If `playwright.config.*` or `cypress.config.*` exists, you MUST run E2E tests — skipping is never acceptable:
|
|
116
|
-
|
|
117
|
-
```bash
|
|
118
|
-
# Playwright
|
|
119
|
-
npx playwright test {affected-spec}.spec.ts
|
|
120
|
-
|
|
121
|
-
# Cypress
|
|
122
|
-
npx cypress run --spec "cypress/e2e/{affected-spec}.cy.ts"
|
|
123
|
-
```
|
|
124
|
-
|
|
125
|
-
Determine which E2E specs are affected:
|
|
126
|
-
- Changed a UI component or page? → Run specs that test that page/flow
|
|
127
|
-
- Changed an API endpoint? → Run specs that exercise that endpoint
|
|
128
|
-
- Changed auth/session logic? → Run all auth-related E2E specs
|
|
129
|
-
- Changed database schema? → Run specs that depend on that data
|
|
130
|
-
- Not sure what's affected? → Run the full E2E suite
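The selection rules above amount to a lookup with a full-suite fallback. In this sketch each spec is assumed to declare the areas it exercises (pages, endpoints, `auth`, tables); that metadata and the function name are hypothetical.

```javascript
// Hypothetical metadata: each spec lists the areas it exercises.
function selectE2ESpecs(changedAreas, specs) {
  const selected = new Set();
  for (const area of changedAreas) {
    const matches = specs.filter((s) => s.areas.includes(area));
    // "Not sure what's affected?" -> fall back to the full suite.
    if (matches.length === 0) return specs.map((s) => s.file);
    matches.forEach((s) => selected.add(s.file));
  }
  return [...selected];
}
```

Falling back to every spec when any changed area is unmapped keeps the selection conservative: an unknown impact never results in fewer tests being run.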
|
|
131
|
-
|
|
132
|
-
### C) Create and Update Playwright E2E Tests (MANDATORY when UI/routes/flows/modes changed)
|
|
133
|
-
|
|
134
|
-
If Playwright is configured (`playwright.config.*` or Playwright in dependencies):
|
|
135
|
-
|
|
136
|
-
**For new features, pages, modes, or flows — CREATE comprehensive specs:**
|
|
137
|
-
- Happy path for every new user flow
|
|
138
|
-
- All feature modes/flags (e.g., `--component` mode gets its own test suite, not just default mode)
|
|
139
|
-
- Form validation: valid input, invalid input, empty fields, boundary values
|
|
140
|
-
- Error states: network failures, API errors, permission denied, timeout
|
|
141
|
-
- Empty states: no data, first-time user, cleared data
|
|
142
|
-
- Loading states: skeleton screens, spinners, progressive loading
|
|
143
|
-
- Edge cases: rapid clicking, double submission, back/forward navigation, browser refresh mid-flow
|
|
144
|
-
- Responsive: test at mobile and desktop breakpoints if layout changes
|
|
145
|
-
|
|
146
|
-
**For changed features — UPDATE existing specs AND add missing coverage:**
|
|
147
|
-
- Changed UI elements (selectors, text, layout) → update locators and assertions
|
|
148
|
-
- Changed form fields or validation → update form fill steps and error assertions
|
|
149
|
-
- Removed features → remove or update affected E2E specs
|
|
150
|
-
- Review existing specs for missing edge cases and add them
|
|
151
|
-
|
|
152
|
-
**This is NOT optional.** Every new code path that a user can reach must have a Playwright spec. "We'll add tests later" is never acceptable.
|
|
153
|
-
|
|
154
|
-
**FUNCTIONAL TESTS — NOT LAYOUT TESTS (MANDATORY):**
|
|
155
|
-
E2E specs that only check element existence (`isVisible`, `toBeAttached`, `toBeEnabled`) are
|
|
156
|
-
layout tests. Layout tests pass even when every feature is broken — they are worthless for QA.
|
|
157
|
-
|
|
158
|
-
Every Playwright assertion MUST verify **functional behavior** — that an action produced the
|
|
159
|
-
correct outcome:
|
|
160
|
-
- **Tab/navigation**: Click → assert the NEW content loaded (unique text, data, or elements
|
|
161
|
-
that only appear on the destination view). Never just assert the tab element exists.
|
|
162
|
-
- **Forms**: Fill → submit → assert success feedback AND data persisted (API call observed
|
|
163
|
-
via `page.waitForResponse`, or list/table updated with new entry).
|
|
164
|
-
- **Interactive widgets** (terminals, editors, code panels): Open → interact → assert the
|
|
165
|
-
widget responded (keystroke produced output, content was saved, command executed).
|
|
166
|
-
- **Connections** (WebSocket, SSE, polling): Assert status transitions ("Connecting" →
|
|
167
|
-
"Connected") and verify data flows through the connection.
|
|
168
|
-
- **State toggles** (dark mode, expand/collapse, enable/disable): Assert the EFFECT of the
|
|
169
|
-
toggle, not just that the toggle control exists.
|
|
170
|
-
- **Error handling**: Trigger error → assert error content → assert recovery path works.
|
|
171
|
-
|
|
172
|
-
**Rule: If a test would pass on an empty HTML page with the correct element IDs and no
|
|
173
|
-
JavaScript, it is not a functional test. Rewrite it.**
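A rough way to flag layout-only specs is to scan a spec's source for its matchers and check whether all of them are existence checks. This is a text heuristic under stated assumptions, not a Playwright feature: the regex only sees `.toXxx(` style calls and `isVisible(`, and `toBeVisible` is added to the list as an assumption beyond the three names given above.

```javascript
// Existence-only checks named above, plus toBeVisible (added as an assumption).
const LAYOUT_ONLY = ['isVisible', 'toBeVisible', 'toBeAttached', 'toBeEnabled'];

// Returns true when the spec contains assertions and every one is an existence check.
function isLayoutOnlySpec(source) {
  const matchers = [...source.matchAll(/\.(to[A-Z]\w*|isVisible)\s*\(/g)].map((m) => m[1]);
  return matchers.length > 0 && matchers.every((name) => LAYOUT_ONLY.includes(name));
}
```

A spec that clicks a tab and then asserts the destination text with `toHaveText` is not flagged, because at least one assertion checks an outcome and not just presence.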
|
|
174
|
-
|
|
175
|
-
### D) Capture Results
|
|
176
|
-
For all test types:
|
|
177
|
-
- PASS: Test still valid
|
|
178
|
-
- FAIL: Test needs update or code has bug
|
|
179
|
-
- ERROR: Test broken (import error, etc.)
|
|
180
|
-
|
|
181
|
-
## Step 6: Produce Test Coverage Report
|
|
182
|
-
|
|
183
|
-
Create/update `.gsd-t/test-coverage.md`:
|
|
184
|
-
|
|
185
|
-
```markdown
|
|
186
|
-
# Test Coverage Report — {date}
|
|
187
|
-
|
|
188
|
-
## Summary
|
|
189
|
-
- Source files analyzed: {N}
|
|
190
|
-
- Unit/integration test files: {N}
|
|
191
|
-
- E2E test specs: {N}
|
|
192
|
-
- Coverage gaps: {N}
|
|
193
|
-
- Stale tests: {N}
|
|
194
|
-
- Dead tests: {N}
|
|
195
|
-
- Unit tests passing: {N}/{total}
|
|
196
|
-
- E2E tests passing: {N}/{total}
|
|
197
|
-
|
|
198
|
-
## Coverage Status
|
|
199
|
-
|
|
200
|
-
### ✅ Well Covered
|
|
201
|
-
| Source | Test | Last Verified |
|
|
202
|
-
|--------|------|---------------|
|
|
203
|
-
| {file} | {test} | {date} |
|
|
204
|
-
|
|
205
|
-
### ⚠️ Partial Coverage
|
|
206
|
-
| Source | Test | Gap |
|
|
207
|
-
|--------|------|-----|
|
|
208
|
-
| {file} | {test} | {missing: error cases, edge cases, etc.} |
|
|
209
|
-
|
|
210
|
-
### ❌ No Coverage
|
|
211
|
-
| Source | Risk Level | Reason |
|
|
212
|
-
|--------|------------|--------|
|
|
213
|
-
| {file} | {HIGH/MED/LOW} | {new file, complex logic, etc.} |
|
|
214
|
-
|
|
215
|
-
---
|
|
216
|
-
|
|
217
|
-
## Issues Found
|
|
218
|
-
|
|
219
|
-
### Stale Tests
|
|
220
|
-
| Test | Issue | Action |
|
|
221
|
-
|------|-------|--------|
|
|
222
|
-
| {test} | {function signature changed} | Update assertions |
|
|
223
|
-
| {test} | {mock data outdated} | Update mock |
|
|
224
|
-
|
|
225
|
-
### Dead Tests
|
|
226
|
-
| Test | Reason | Action |
|
|
227
|
-
|------|--------|--------|
|
|
228
|
-
| {test} | {tests deleted feature} | Remove |
|
|
229
|
-
| {test} | {imports removed module} | Remove |
|
|
230
|
-
|
|
231
|
-
### Failing Tests
|
|
232
|
-
| Test | Error | Likely Cause |
|
|
233
|
-
|------|-------|--------------|
|
|
234
|
-
| {test} | {error message} | {code bug or test needs update} |
|
|
235
|
-
|
|
236
|
-
---
|
|
237
|
-
|
|
238
|
-
## Test Health Metrics
|
|
239
|
-
|
|
240
|
-
- Test-to-code ratio: {N tests / N source files}
|
|
241
|
-
- Average assertions per test: {N}
|
|
242
|
-
- Critical paths covered: {list}
|
|
243
|
-
- Critical paths uncovered: {list}
|
|
244
|
-
|
|
245
|
-
---
|
|
246
|
-
|
|
247
|
-
## Generated Tasks
|
|
248
|
-
|
|
249
|
-
### High Priority (blocking)
|
|
250
|
-
- [ ] TEST-001: Fix failing test {test} — {reason}
|
|
251
|
-
- [ ] TEST-002: Update stale test {test} — {what changed}
|
|
252
|
-
|
|
253
|
-
### Medium Priority (should do)
|
|
254
|
-
- [ ] TEST-010: Add tests for {file} — {N} functions uncovered
|
|
255
|
-
- [ ] TEST-011: Add error case tests for {function}
|
|
256
|
-
|
|
257
|
-
### Low Priority (nice to have)
|
|
258
|
-
- [ ] TEST-020: Remove dead test {test}
|
|
259
|
-
- [ ] TEST-021: Add edge case tests for {function}
|
|
260
|
-
|
|
261
|
-
---
|
|
262
|
-
|
|
263
|
-
## Recommendations
|
|
264
|
-
|
|
265
|
-
{Based on findings, what should be prioritized}
|
|
266
|
-
```
|
|
267
|
-
|
|
268
|
-
## Step 7: Generate Test Tasks
|
|
269
|
-
|
|
270
|
-
If issues found, add to current domain's tasks:
|
|
271
|
-
|
|
272
|
-
```markdown
|
|
273
|
-
## Auto-Generated Test Tasks
|
|
274
|
-
|
|
275
|
-
### From Test Sync — {date}
|
|
276
|
-
|
|
277
|
-
- [ ] TEST-001: Fix failing test `test_login.py::test_valid_credentials`
|
|
278
|
-
- Error: AssertionError — expected 200, got 201
|
|
279
|
-
- Cause: API return code changed
|
|
280
|
-
- Action: Update assertion to expect 201
|
|
281
|
-
|
|
282
|
-
- [ ] TEST-002: Add tests for `src/auth/roles.py`
|
|
283
|
-
- Functions: check_permission, assign_role, revoke_role
|
|
284
|
-
- Priority: HIGH — authorization logic
|
|
285
|
-
|
|
286
|
-
- [ ] TEST-003: Update mock data in `test_users.py`
|
|
287
|
-
- Schema changed: added `last_login` field
|
|
288
|
-
- Action: Update all user fixtures
|
|
289
|
-
```
|
|
290
|
-
|
|
291
|
-
## Step 8: Integration with Workflow
|
|
292
|
-
|
|
293
|
-
### During Execute Phase (auto-invoked):
|
|
294
|
-
After each task completes:
|
|
295
|
-
1. Scan changed files and map to existing tests
|
|
296
|
-
2. **If new code paths have zero test coverage: write tests NOW** — do not defer
|
|
297
|
-
3. Run ALL affected unit/integration tests
|
|
298
|
-
4. Run ALL affected Playwright E2E tests
|
|
299
|
-
5. If failures: fix immediately (up to 2 attempts) before continuing. If both attempts fail:
|
|
300
|
-
1. Write failure context to `.gsd-t/debug-state.jsonl` via `node -e "require('./bin/debug-ledger.js').appendEntry('.', {iteration:1,timestamp:new Date().toISOString(),test:'test-sync-failure',error:'2 in-context fix attempts exhausted',hypothesis:'see test-coverage.md',fix:'n/a',fixFiles:[],result:'STILL_FAILS',learning:'delegating to headless debug-loop',model:'sonnet',duration:0})"`
|
|
301
|
-
2. Log: "Delegating to headless debug-loop (2 in-context attempts exhausted)"
|
|
302
|
-
3. Run: `gsd-t headless --debug-loop --max-iterations 10`
|
|
303
|
-
4. Exit code 0 → tests pass, continue; 1/4 → log to `.gsd-t/deferred-items.md`, report failure; 3 → report error
|
|
304
|
-
6. If E2E specs are missing for new features/modes/flows: **create them NOW**, not later
|
|
305
|
-
7. If E2E specs need updating for changed behavior: update them before continuing
|
|
306
|
-
8. **No task is complete until its tests exist and pass** — do not move to the next task with test gaps
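The retry-then-delegate flow in item 5 can be summarized as follows. The action names, the `attemptFix` callback, and the handling of exit codes not listed above are assumptions of this sketch; only the codes 0, 1, 3, and 4 come from the text.

```javascript
// Maps the exit codes listed above to a next action. Action names are illustrative.
function actionForDebugLoopExit(code) {
  if (code === 0) return 'continue';
  if (code === 1 || code === 4) return 'defer-and-report-failure';
  // Code 3 is documented as an error; unlisted codes are treated the same (an assumption).
  return 'report-error';
}

// Runs an in-context fix up to maxAttempts times, then signals delegation.
// `attemptFix` is a hypothetical callback that returns true once tests pass.
function fixOrDelegate(attemptFix, maxAttempts = 2) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    if (attemptFix(attempt)) return { status: 'fixed', attempts: attempt };
  }
  return { status: 'delegate', attempts: maxAttempts };
}
```

A `delegate` status corresponds to the point where the failure context is written to the debug ledger and the headless debug loop takes over.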
|
|
307
|
-
|
|
308
|
-
### During Verify Phase (auto-invoked):
|
|
309
|
-
Full sync:
|
|
310
|
-
1. Complete coverage analysis (unit + E2E)
|
|
311
|
-
2. Run ALL unit/integration tests
|
|
312
|
-
3. Run the FULL E2E test suite — this is mandatory, not optional
|
|
313
|
-
4. Generate full report
|
|
314
|
-
5. Block verification if any critical tests failing (unit or E2E)
|
|
315
|
-
|
|
316
|
-
### Standalone Mode:
|
|
317
|
-
```
|
|
318
|
-
/user:gsd-t-test-sync
|
|
319
|
-
```
|
|
320
|
-
1. Full analysis of entire codebase
|
|
321
|
-
2. Comprehensive report
|
|
322
|
-
3. Generate all test tasks
|
|
323
|
-
4. Do not auto-add to domains — present for review
|
|
324
|
-
|
|
325
|
-
## Step 9: Report to User
|
|
326
|
-
|
|
327
|
-
### Quick Mode (during execute):
|
|
328
|
-
```
|
|
329
|
-
🧪 Test sync: 3 tests affected, 3 passing
|
|
330
|
-
1 coverage gap noted → will address in verify phase
|
|
331
|
-
```
|
|
332
|
-
|
|
333
|
-
### Full Mode (during verify or standalone):
|
|
334
|
-
```
|
|
335
|
-
🧪 Test Sync Complete
|
|
336
|
-
|
|
337
|
-
Unit/Integration:
|
|
338
|
-
- Tests run: 45
|
|
339
|
-
- Passing: 43
|
|
340
|
-
- Failing: 2
|
|
341
|
-
|
|
342
|
-
E2E ({framework}):
|
|
343
|
-
- Specs run: 12
|
|
344
|
-
- Passing: 11
|
|
345
|
-
- Failing: 1
|
|
346
|
-
|
|
347
|
-
Coverage:
|
|
348
|
-
- Gaps: 3
|
|
349
|
-
- Stale tests: 1
|
|
350
|
-
- Dead tests: 0
|
|
351
|
-
|
|
352
|
-
Action Required:
|
|
353
|
-
- 2 failing unit tests must be fixed before verify passes
|
|
354
|
-
- 1 failing E2E spec must be fixed before verify passes
|
|
355
|
-
- See .gsd-t/test-coverage.md for details
|
|
356
|
-
|
|
357
|
-
Generated 5 test tasks → added to current domain
|
|
358
|
-
```
|
|
359
|
-
|
|
360
|
-
### Autonomy Behavior
|
|
361
|
-
|
|
362
|
-
**Level 3 (Full Auto)**: Log the summary and auto-advance to the next phase. If there are failing tests, attempt auto-fix (up to 2 attempts) before continuing. Do NOT wait for user input.
|
|
363
|
-
|
|
364
|
-
**Level 1–2**: Present the full report and wait for user input before proceeding.
|
|
365
|
-
|
|
366
|
-
## Document Ripple
|
|
367
|
-
|
|
368
|
-
### Always update:
|
|
369
|
-
1. **`.gsd-t/progress.md`** — Log test sync results in Decision Log (standalone mode)
|
|
370
|
-
2. **`.gsd-t/test-coverage.md`** — Created/updated with coverage report (Step 6)
|
|
371
|
-
|
|
372
|
-
### Check if affected:
|
|
373
|
-
3. **`docs/requirements.md`** — If test tasks map to requirements, update the Test Coverage table
|
|
374
|
-
4. **`.gsd-t/domains/{current}/tasks.md`** — If test tasks were generated, append them (Step 7)
|
|
375
|
-
5. **`.gsd-t/techdebt.md`** — If persistent test gaps were found, add as debt items
|
|
376
|
-
|
|
377
|
-
$ARGUMENTS
|
|
378
|
-
|
|
379
|
-
## Auto-Clear
|
|
380
|
-
|
|
381
|
-
All work is committed to project files. Execute `/clear` to free the context window for the next command.
|
|
1
|
+
# GSD-T: Test Sync — Keep Tests Aligned with Code
|
|
2
|
+
|
|
3
|
+
You are maintaining test coverage as code changes. Your job is to identify stale tests, coverage gaps, and dead tests, then generate tasks to address them.
|
|
4
|
+
|
|
5
|
+
This command is:
|
|
6
|
+
- **Auto-invoked** during execute phase (after each task) and verify phase
|
|
7
|
+
- **Standalone** when user wants to audit test health
|
|
8
|
+
|
|
9
|
+
## Step 1: Load Context
|
|
10
|
+
|
|
11
|
+
Read:
|
|
12
|
+
1. `CLAUDE.md` — testing conventions, test locations
|
|
13
|
+
2. `.gsd-t/progress.md` — what just changed
|
|
14
|
+
3. `.gsd-t/test-coverage.md` — previous coverage state (if exists)
|
|
15
|
+
4. `.gsd-t/domains/{current}/tasks.md` — recent completed tasks
|
|
16
|
+
|
|
17
|
+
Identify:
|
|
18
|
+
- **Unit/integration test framework** (pytest, jest, vitest, etc.)
|
|
19
|
+
- **E2E test framework** (Playwright, Cypress, Puppeteer, etc.) — check for `playwright.config.*`, `cypress.config.*`, `playwright/`, `cypress/`, `e2e/`, or E2E-related dependencies in package.json/requirements.txt
|
|
20
|
+
- Test directory structure
|
|
21
|
+
- Naming conventions
|
|
22
|
+
- Test run commands (from package.json scripts, Makefile, or CI config)
|
|
23
|
+
|
|
24
|
+
## Step 1.5: Graph-Enhanced Test Discovery
|
|
25
|
+
|
|
26
|
+
If `.gsd-t/graph/meta.json` exists (graph index is available):
|
|
27
|
+
1. Query `getTestsFor` for each changed entity to find stale or missing tests more precisely than filesystem search
|
|
28
|
+
2. Query `getTransitiveCallers` for changed functions to find indirectly affected tests that may need updating
|
|
29
|
+
3. Feed these findings into the coverage map (Step 3) and issue detection (Step 4)
|
|
30
|
+
|
|
31
|
+
If graph is not available, skip this step.
|
|
32
|
+
|
|
33
|
+
## Step 2: Contract Coverage Audit
|
|
34
|
+
|
|
35
|
+
Perform inline contract testing and gap analysis:
|
|
36
|
+
|
|
37
|
+
1. Read all contracts in `.gsd-t/contracts/` — identify the interface each one defines
|
|
38
|
+
2. For each contract, check whether a test file exists that validates it
|
|
39
|
+
3. Run the full test suite: `npm test` (or project equivalent)
|
|
40
|
+
4. Identify gaps: contracts with no tests, stale tests referencing removed APIs, uncovered code paths
|
|
41
|
+
5. Report: coverage gaps, stale tests, and recommended test tasks
|
|
42
|
+
|
|
43
|
+
Test-sync cannot complete if critical contract gaps remain unaddressed.
|
|
44
|
+
|
|
45
|
+
## Step 3: Map Code to Tests
|
|
46
|
+
|
|
47
|
+
For each file changed in recent tasks:
|
|
48
|
+
|
|
49
|
+
### A) Find Existing Tests
|
|
50
|
+
```bash
|
|
51
|
+
# Common patterns
|
|
52
|
+
find tests/ -name "*{module_name}*"
|
|
53
|
+
find __tests__/ -name "*{module_name}*"
|
|
54
|
+
find . -name "*.test.*" -exec grep -l "{function_name}" {} +
|
|
55
|
+
find . -name "*.spec.*" -exec grep -l "{class_name}" {} +
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
### B) Build Coverage Map
|
|
59
|
+
```
|
|
60
|
+
| Source File | Test File(s) | Coverage Status |
|
|
61
|
+
|-------------|--------------|-----------------|
|
|
62
|
+
| src/auth/login.py | tests/test_login.py | COVERED |
|
|
63
|
+
| src/auth/roles.py | (none) | GAP |
|
|
64
|
+
| src/api/users.py | tests/test_users.py | PARTIAL |
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## Step 4: Detect Test Issues
|
|
68
|
+
|
|
69
|
+
### A) Stale Tests
|
|
70
|
+
Tests that reference old behavior:
|
|
71
|
+
- Function signatures that changed
|
|
72
|
+
- Removed functions still being tested
|
|
73
|
+
- Old API shapes in assertions
|
|
74
|
+
- Mocked data that no longer matches schema
|
|
75
|
+
|
|
76
|
+
Check:
|
|
77
|
+
```bash
|
|
78
|
+
# Find tests importing changed modules
|
|
79
|
+
grep -r "from {changed_module}" tests/
|
|
80
|
+
# Check if test assertions match new behavior
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
### B) Coverage Gaps
|
|
84
|
+
New or changed code without tests:
|
|
85
|
+
- New functions with no test
|
|
86
|
+
- New branches with no coverage
|
|
87
|
+
- Changed behavior with no updated assertions
|
|
88
|
+
- New error cases with no error tests
|
|
89
|
+
|
|
90
|
+
### C) Dead Tests
|
|
91
|
+
Tests for deleted functionality:
|
|
92
|
+
- Tests importing deleted modules
|
|
93
|
+
- Tests for removed features
|
|
94
|
+
- Skipped tests that should be removed
|
|
95
|
+
|
|
96
|
+
### D) Flaky Tests (if test history available)
|
|
97
|
+
Tests that sometimes fail:
|
|
98
|
+
- Check recent CI runs
|
|
99
|
+
- Note any intermittent failures
|
|
100
|
+
|
|
101
|
+
## Step 5: Run Affected Tests
|
|
102
|
+
|
|
103
|
+
### A) Unit/Integration Tests
|
|
104
|
+
Execute tests that cover changed code:
|
|
105
|
+
|
|
106
|
+
```bash
|
|
107
|
+
# Example for pytest
|
|
108
|
+
pytest tests/test_{module}.py -v
|
|
109
|
+
|
|
110
|
+
# Example for jest
|
|
111
|
+
npm test -- --testPathPattern="{module}"
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
### B) E2E Tests (MANDATORY when config exists)
|
|
115
|
+
If `playwright.config.*` or `cypress.config.*` exists, you MUST run E2E tests — skipping is never acceptable:
|
|
116
|
+
|
|
117
|
+
```bash
|
|
118
|
+
# Playwright
|
|
119
|
+
npx playwright test {affected-spec}.spec.ts
|
|
120
|
+
|
|
121
|
+
# Cypress
|
|
122
|
+
npx cypress run --spec "cypress/e2e/{affected-spec}.cy.ts"
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
Determine which E2E specs are affected:
|
|
126
|
+
- Changed a UI component or page? → Run specs that test that page/flow
|
|
127
|
+
- Changed an API endpoint? → Run specs that exercise that endpoint
|
|
128
|
+
- Changed auth/session logic? → Run all auth-related E2E specs
|
|
129
|
+
- Changed database schema? → Run specs that depend on that data
|
|
130
|
+
- Not sure what's affected? → Run the full E2E suite
|
|
131
|
+
|
|
132
|
+
### C) Create and Update Playwright E2E Tests (MANDATORY when UI/routes/flows/modes changed)
|
|
133
|
+
|
|
134
|
+
If Playwright is configured (`playwright.config.*` or Playwright in dependencies):
|
|
135
|
+
|
|
136
|
+
**For new features, pages, modes, or flows — CREATE comprehensive specs:**
|
|
137
|
+
- Happy path for every new user flow
|
|
138
|
+
- All feature modes/flags (e.g., `--component` mode gets its own test suite, not just default mode)
|
|
139
|
+
- Form validation: valid input, invalid input, empty fields, boundary values
|
|
140
|
+
- Error states: network failures, API errors, permission denied, timeout
|
|
141
|
+
- Empty states: no data, first-time user, cleared data
|
|
142
|
+
- Loading states: skeleton screens, spinners, progressive loading
|
|
143
|
+
- Edge cases: rapid clicking, double submission, back/forward navigation, browser refresh mid-flow
|
|
144
|
+
- Responsive: test at mobile and desktop breakpoints if layout changes
|
|
145
|
+
|
|
146
|
+
**For changed features — UPDATE existing specs AND add missing coverage:**
|
|
147
|
+
- Changed UI elements (selectors, text, layout) → update locators and assertions
|
|
148
|
+
- Changed form fields or validation → update form fill steps and error assertions
|
|
149
|
+
- Removed features → remove or update affected E2E specs
|
|
150
|
+
- Review existing specs for missing edge cases and add them
|
|
151
|
+
|
|
152
|
+
**This is NOT optional.** Every new code path that a user can reach must have a Playwright spec. "We'll add tests later" is never acceptable.
|
|
153
|
+
|
|
154
|
+
**FUNCTIONAL TESTS — NOT LAYOUT TESTS (MANDATORY):**
|
|
155
|
+
E2E specs that only check element existence (`isVisible`, `toBeAttached`, `toBeEnabled`) are
|
|
156
|
+
layout tests. Layout tests pass even when every feature is broken — they are worthless for QA.
|
|
157
|
+
|
|
158
|
+
Every Playwright assertion MUST verify **functional behavior** — that an action produced the
|
|
159
|
+
correct outcome:
|
|
160
|
+
- **Tab/navigation**: Click → assert the NEW content loaded (unique text, data, or elements
|
|
161
|
+
that only appear on the destination view). Never just assert the tab element exists.
|
|
162
|
+
- **Forms**: Fill → submit → assert success feedback AND data persisted (API call observed
|
|
163
|
+
via `page.waitForResponse`, or list/table updated with new entry).
|
|
164
|
+
- **Interactive widgets** (terminals, editors, code panels): Open → interact → assert the
|
|
165
|
+
widget responded (keystroke produced output, content was saved, command executed).
|
|
166
|
+
- **Connections** (WebSocket, SSE, polling): Assert status transitions ("Connecting" →
|
|
167
|
+
"Connected") and verify data flows through the connection.
|
|
168
|
+
- **State toggles** (dark mode, expand/collapse, enable/disable): Assert the EFFECT of the
|
|
169
|
+
toggle, not just that the toggle control exists.
|
|
170
|
+
- **Error handling**: Trigger error → assert error content → assert recovery path works.
|
|
171
|
+
|
|
172
|
+
**Rule: If a test would pass on an empty HTML page with the correct element IDs and no
|
|
173
|
+
JavaScript, it is not a functional test. Rewrite it.**
|
|
174
|
+
|
|
175
|
+
### D) Capture Results
|
|
176
|
+
For all test types:
|
|
177
|
+
- PASS: Test still valid
|
|
178
|
+
- FAIL: Test needs update or code has bug
|
|
179
|
+
- ERROR: Test broken (import error, etc.)
|
|
180
|
+
|
|
181
|
+
## Step 6: Produce Test Coverage Report
|
|
182
|
+
|
|
183
|
+
Create/update `.gsd-t/test-coverage.md`:
|
|
184
|
+
|
|
185
|
+
```markdown
|
|
186
|
+
# Test Coverage Report — {date}
|
|
187
|
+
|
|
188
|
+
## Summary
|
|
189
|
+
- Source files analyzed: {N}
|
|
190
|
+
- Unit/integration test files: {N}
|
|
191
|
+
- E2E test specs: {N}
|
|
192
|
+
- Coverage gaps: {N}
|
|
193
|
+
- Stale tests: {N}
|
|
194
|
+
- Dead tests: {N}
|
|
195
|
+
- Unit tests passing: {N}/{total}
|
|
196
|
+
- E2E tests passing: {N}/{total}
|
|
197
|
+
|
|
198
|
+
## Coverage Status
|
|
199
|
+
|
|
200
|
+
### ✅ Well Covered
|
|
201
|
+
| Source | Test | Last Verified |
|
|
202
|
+
|--------|------|---------------|
|
|
203
|
+
| {file} | {test} | {date} |
|
|
204
|
+
|
|
205
|
+
### ⚠️ Partial Coverage
|
|
206
|
+
| Source | Test | Gap |
|
|
207
|
+
|--------|------|-----|
|
|
208
|
+
| {file} | {test} | {missing: error cases, edge cases, etc.} |
|
|
209
|
+
|
|
210
|
+
### ❌ No Coverage
|
|
211
|
+
| Source | Risk Level | Reason |
|
|
212
|
+
|--------|------------|--------|
|
|
213
|
+
| {file} | {HIGH/MED/LOW} | {new file, complex logic, etc.} |
|
|
214
|
+
|
|
215
|
+
---
|
|
216
|
+
|
|
217
|
+
## Issues Found
|
|
218
|
+
|
|
219
|
+
### Stale Tests
|
|
220
|
+
| Test | Issue | Action |
|
|
221
|
+
|------|-------|--------|
|
|
222
|
+
| {test} | {function signature changed} | Update assertions |
|
|
223
|
+
| {test} | {mock data outdated} | Update mock |
|
|
224
|
+
|
|
225
|
+
### Dead Tests
|
|
226
|
+
| Test | Reason | Action |
|
|
227
|
+
|------|--------|--------|
|
|
228
|
+
| {test} | {tests deleted feature} | Remove |
|
|
229
|
+
| {test} | {imports removed module} | Remove |
|
|
230
|
+
|
|
231
|
+
### Failing Tests
|
|
232
|
+
| Test | Error | Likely Cause |
|
|
233
|
+
|------|-------|--------------|
|
|
234
|
+
| {test} | {error message} | {code bug or test needs update} |
|
|
235
|
+
|
|
236
|
+
---
|
|
237
|
+
|
|
238
|
+
## Test Health Metrics
|
|
239
|
+
|
|
240
|
+
- Test-to-code ratio: {N tests / N source files}
|
|
241
|
+
- Average assertions per test: {N}
|
|
242
|
+
- Critical paths covered: {list}
|
|
243
|
+
- Critical paths uncovered: {list}
|
|
244
|
+
|
|
245
|
+
---
|
|
246
|
+
|
|
247
|
+
## Generated Tasks
|
|
248
|
+
|
|
249
|
+
### High Priority (blocking)
|
|
250
|
+
- [ ] TEST-001: Fix failing test {test} — {reason}
|
|
251
|
+
- [ ] TEST-002: Update stale test {test} — {what changed}
|
|
252
|
+
|
|
253
|
+
### Medium Priority (should do)
|
|
254
|
+
- [ ] TEST-010: Add tests for {file} — {N} functions uncovered
|
|
255
|
+
- [ ] TEST-011: Add error case tests for {function}
|
|
256
|
+
|
|
257
|
+
### Low Priority (nice to have)
|
|
258
|
+
- [ ] TEST-020: Remove dead test {test}
|
|
259
|
+
- [ ] TEST-021: Add edge case tests for {function}
|
|
260
|
+
|
|
261
|
+
---
|
|
262
|
+
|
|
263
|
+
## Recommendations
|
|
264
|
+
|
|
265
|
+
{Based on findings, what should be prioritized}
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
## Step 7: Generate Test Tasks
|
|
269
|
+
|
|
270
|
+
If issues are found, add them to the current domain's tasks:
|
|
271
|
+
|
|
272
|
+
```markdown
|
|
273
|
+
## Auto-Generated Test Tasks
|
|
274
|
+
|
|
275
|
+
### From Test Sync — {date}
|
|
276
|
+
|
|
277
|
+
- [ ] TEST-001: Fix failing test `test_login.py::test_valid_credentials`
|
|
278
|
+
- Error: AssertionError — expected 200, got 201
|
|
279
|
+
- Cause: API return code changed
|
|
280
|
+
- Action: Update assertion to expect 201
|
|
281
|
+
|
|
282
|
+
- [ ] TEST-002: Add tests for `src/auth/roles.py`
|
|
283
|
+
- Functions: check_permission, assign_role, revoke_role
|
|
284
|
+
- Priority: HIGH — authorization logic
|
|
285
|
+
|
|
286
|
+
- [ ] TEST-003: Update mock data in `test_users.py`
|
|
287
|
+
- Schema changed: added `last_login` field
|
|
288
|
+
- Action: Update all user fixtures
|
|
289
|
+
```
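The task entries above follow a regular shape, so they could be produced by a small formatter. The sketch below is illustrative only: `formatTestTask` is a hypothetical name, and the two-space indentation of the detail bullets is an assumption, since the template does not state it.

```javascript
// Hypothetical formatter for auto-generated test tasks.
// `seq` is the task number; `details` become indented sub-bullets
// (two-space indentation is an assumption of this sketch).
function formatTestTask(seq, title, details = []) {
  const id = `TEST-${String(seq).padStart(3, '0')}`; // zero-padded to three digits
  const lines = [`- [ ] ${id}: ${title}`];
  for (const detail of details) {
    lines.push(`  - ${detail}`);
  }
  return lines.join('\n');
}

console.log(
  formatTestTask(3, 'Update mock data in `test_users.py`', [
    'Schema changed: added `last_login` field',
    'Action: Update all user fixtures',
  ])
);
```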
|
|
290
|
+
|
|
291
|
+
## Step 8: Integration with Workflow
|
|
292
|
+
|
|
293
|
+
### During Execute Phase (auto-invoked):
|
|
294
|
+
After each task completes:
|
|
295
|
+
1. Scan changed files and map to existing tests
|
|
296
|
+
2. **If new code paths have zero test coverage: write tests NOW** — do not defer
|
|
297
|
+
3. Run ALL affected unit/integration tests
|
|
298
|
+
4. Run ALL affected Playwright E2E tests
|
|
299
|
+
5. If failures: fix immediately (up to 2 attempts) before continuing. If both attempts fail:
|
|
300
|
+
1. Write failure context to `.gsd-t/debug-state.jsonl` via `node -e "require('./bin/debug-ledger.js').appendEntry('.', {iteration:1,timestamp:new Date().toISOString(),test:'test-sync-failure',error:'2 in-context fix attempts exhausted',hypothesis:'see test-coverage.md',fix:'n/a',fixFiles:[],result:'STILL_FAILS',learning:'delegating to headless debug-loop',model:'sonnet',duration:0})"`
|
|
301
|
+
2. Log: "Delegating to headless debug-loop (2 in-context attempts exhausted)"
|
|
302
|
+
3. Run: `gsd-t headless --debug-loop --max-iterations 10`
|
|
303
|
+
4. Exit code 0 → tests pass, continue; 1 or 4 → log to `.gsd-t/deferred-items.md`, report failure; 3 → report error
|
|
304
|
+
6. If E2E specs are missing for new features/modes/flows: **create them NOW**, not later
|
|
305
|
+
7. If E2E specs need updating for changed behavior: update them before continuing
|
|
306
|
+
8. **No task is complete until its tests exist and pass** — do not move to the next task with test gaps
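The failure handling in step 5 can be sketched as two small functions: a bounded fix loop and a mapping from the debug-loop exit code to the next action. Both function names and the shape of the returned objects are hypothetical; the exit-code meanings are taken from the list above, and any code not listed there is treated as unknown, which is an assumption of this sketch.

```javascript
// Hypothetical sketch of the in-context fix loop.
// `runTests` returns true when the affected tests pass; `attemptFix` applies one fix.
function fixWithRetries(runTests, attemptFix, maxAttempts = 2) {
  if (runTests()) return { passed: true, attempts: 0 };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    attemptFix(attempt);
    if (runTests()) return { passed: true, attempts: attempt };
  }
  // All in-context attempts exhausted: the caller delegates to the debug loop.
  return { passed: false, attempts: maxAttempts };
}

// Maps the debug-loop exit code to the next action, per the list above.
// Codes not listed there are reported as 'unknown' (an assumption of this sketch).
function interpretDebugLoopExit(code) {
  switch (code) {
    case 0:
      return { action: 'continue', logDeferred: false };
    case 1:
    case 4:
      return { action: 'report-failure', logDeferred: true };
    case 3:
      return { action: 'report-error', logDeferred: false };
    default:
      return { action: 'unknown', logDeferred: false };
  }
}
```

Keeping the retry bound as a parameter makes the "up to 2 attempts" limit explicit rather than buried in control flow.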
|
|
307
|
+
|
|
308
|
+
### During Verify Phase (auto-invoked):
|
|
309
|
+
Full sync:
|
|
310
|
+
1. Complete coverage analysis (unit + E2E)
|
|
311
|
+
2. Run ALL unit/integration tests
|
|
312
|
+
3. Run the FULL E2E test suite — this is mandatory, not optional
|
|
313
|
+
4. Generate full report
|
|
314
|
+
5. Block verification if any critical tests are failing (unit or E2E)
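The blocking rule in the last step could look like the following sketch. The result shape (`name`, `suite`, `status`, `critical`) and the function name are hypothetical, and treating `ERROR` the same as `FAIL` for blocking purposes is an assumption, since the text only says "failing".

```javascript
// Hypothetical verify-phase gate. Each result is assumed to look like
// { name, suite: 'unit' | 'e2e', status: 'PASS' | 'FAIL' | 'ERROR', critical: boolean }.
function verificationGate(results) {
  // Any critical test that did not pass blocks verification, whichever suite it is in.
  const blocking = results.filter((r) => r.critical && r.status !== 'PASS');
  return {
    blocked: blocking.length > 0,
    blocking: blocking.map((r) => r.name),
  };
}
```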
|
|
315
|
+
|
|
316
|
+
### Standalone Mode:
|
|
317
|
+
```
|
|
318
|
+
/user:gsd-t-test-sync
|
|
319
|
+
```
|
|
320
|
+
1. Full analysis of entire codebase
|
|
321
|
+
2. Comprehensive report
|
|
322
|
+
3. Generate all test tasks
|
|
323
|
+
4. Do not auto-add to domains — present for review
|
|
324
|
+
|
|
325
|
+
## Step 9: Report to User
|
|
326
|
+
|
|
327
|
+
### Quick Mode (during execute):
|
|
328
|
+
```
|
|
329
|
+
🧪 Test sync: 3 tests affected, 3 passing
|
|
330
|
+
1 coverage gap found → tests written and passing
|
|
331
|
+
```
|
|
332
|
+
|
|
333
|
+
### Full Mode (during verify or standalone):
|
|
334
|
+
```
|
|
335
|
+
🧪 Test Sync Complete
|
|
336
|
+
|
|
337
|
+
Unit/Integration:
|
|
338
|
+
- Tests run: 45
|
|
339
|
+
- Passing: 43
|
|
340
|
+
- Failing: 2
|
|
341
|
+
|
|
342
|
+
E2E ({framework}):
|
|
343
|
+
- Specs run: 12
|
|
344
|
+
- Passing: 11
|
|
345
|
+
- Failing: 1
|
|
346
|
+
|
|
347
|
+
Coverage:
|
|
348
|
+
- Gaps: 3
|
|
349
|
+
- Stale tests: 1
|
|
350
|
+
- Dead tests: 0
|
|
351
|
+
|
|
352
|
+
Action Required:
|
|
353
|
+
- 2 failing unit tests must be fixed before verify passes
|
|
354
|
+
- 1 failing E2E spec must be fixed before verify passes
|
|
355
|
+
- See .gsd-t/test-coverage.md for details
|
|
356
|
+
|
|
357
|
+
Generated 5 test tasks → added to current domain
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
### Autonomy Behavior
|
|
361
|
+
|
|
362
|
+
**Level 3 (Full Auto)**: Log the summary and auto-advance to the next phase. If there are failing tests, attempt auto-fix (up to 2 attempts) before continuing. Do NOT wait for user input.
|
|
363
|
+
|
|
364
|
+
**Level 1–2**: Present the full report and wait for user input before proceeding.
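The two autonomy behaviors above reduce to a small decision, sketched here with a hypothetical `nextStep` helper. The returned field names are illustrative and not part of the command's actual interface.

```javascript
// Hypothetical decision helper for the autonomy levels described above.
function nextStep(autonomyLevel, failingTests) {
  if (autonomyLevel >= 3) {
    // Full Auto: never wait for the user; try to fix failures first, then advance.
    return {
      waitForUser: false,
      autoFixAttempts: failingTests > 0 ? 2 : 0,
      then: 'advance',
    };
  }
  // Levels 1 and 2: show the full report and wait for input.
  return { waitForUser: true, autoFixAttempts: 0, then: 'present-report' };
}
```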
|
|
365
|
+
|
|
366
|
+
## Document Ripple
|
|
367
|
+
|
|
368
|
+
### Always update:
|
|
369
|
+
1. **`.gsd-t/progress.md`** — Log test sync results in Decision Log (standalone mode)
|
|
370
|
+
2. **`.gsd-t/test-coverage.md`** — Created/updated with coverage report (Step 6)
|
|
371
|
+
|
|
372
|
+
### Check if affected:
|
|
373
|
+
3. **`docs/requirements.md`** — If test tasks map to requirements, update the Test Coverage table
|
|
374
|
+
4. **`.gsd-t/domains/{current}/tasks.md`** — If test tasks were generated, append them (Step 7)
|
|
375
|
+
5. **`.gsd-t/techdebt.md`** — If persistent test gaps were found, add as debt items
|
|
376
|
+
|
|
377
|
+
$ARGUMENTS
|
|
378
|
+
|
|
379
|
+
## Auto-Clear
|
|
380
|
+
|
|
381
|
+
All work is committed to project files. Execute `/clear` to free the context window for the next command.
|