@tekyzinc/gsd-t 2.74.10 → 2.74.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -170,19 +170,11 @@ Note: Exploratory findings do NOT count against the scripted test pass/fail rati
170
170
 
171
171
  **OBSERVABILITY LOGGING (MANDATORY):**
172
172
  Before spawning — run via Bash:
173
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
173
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
174
174
  After subagent returns — run via Bash:
175
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
176
- Compute tokens and compaction:
177
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
178
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
179
- Compute context utilization — run via Bash:
180
- `if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
181
- Alert on context thresholds (display to user inline):
182
- - If CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
183
- - If CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting in plan."`
184
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
185
- `| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
175
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
176
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
177
+ `| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | | | {COUNTER} |`
186
178
  If QA found issues, append each to `.gsd-t/qa-issues.md` (create with header `| Date | Command | Step | Model | Duration(s) | Severity | Finding |` if missing):
187
179
  `| {DT_START} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {severity} | {finding} |`
188
180
 
@@ -220,99 +212,30 @@ After integration and doc ripple, verify everything works together:
220
212
 
221
213
  ## Step 7.5: Red Team — Adversarial QA (MANDATORY)
222
214
 
223
- After integration tests pass, spawn an adversarial Red Team agent. This agent's sole purpose is to BREAK the integrated system. Its success is measured by bugs found, not tests passed.
215
+ After integration tests pass, spawn an adversarial Red Team agent on the integrated system. Success is measured by bugs found, not tests passed.
224
216
 
225
- ⚙ [{model}] Red Team → adversarial validation of integrated system
226
-
227
- **OBSERVABILITY LOGGING (MANDATORY):**
228
- Before spawning — run via Bash:
229
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
217
+ ⚙ [opus] Red Team → adversarial validation of integrated system
230
218
 
219
+ Resolve the templated prompt path via Bash:
231
220
  ```
232
- Task subagent (general-purpose, model: opus):
233
- "You are a Red Team QA adversary. Your job is to BREAK the integrated system.
234
-
235
- Your value is measured by REAL bugs found. More bugs = more value.
236
- If you find zero bugs, you must prove you were thorough — list every
237
- attack vector you tried and why it didn't break. A short list means
238
- you didn't try hard enough.
239
-
240
- Rules:
241
- - False positives DESTROY your credibility. If you report something
242
- as a bug and it's actually correct behavior, that's worse than
243
- missing a real bug. Never report something you haven't reproduced.
244
- - Style opinions are not bugs. Theoretical concerns are not bugs.
245
- A bug is: 'I did X, expected Y, got Z.' With proof.
246
- - You are done ONLY when you have exhausted every category below
247
- and either found a bug or documented exactly what you tried.
248
-
249
- ## Attack Categories (exhaust ALL of these)
250
-
251
- 1. **Contract Violations**: Read .gsd-t/contracts/. Does the code EXACTLY
252
- match every contract? Test each endpoint/interface/schema shape.
253
- 2. **Boundary Inputs**: Empty strings, null, undefined, huge payloads,
254
- special characters, SQL injection attempts, XSS payloads, path traversal.
255
- 3. **State Transitions**: What happens when actions are performed out of
256
- order? Double-submit? Concurrent access? Refresh mid-flow?
257
- 4. **Error Paths**: Remove env vars. Kill the database. Send malformed
258
- requests. Does the code handle failures gracefully or crash?
259
- 5. **Missing Flows**: Read docs/requirements.md. Are there user flows that
260
- exist in requirements but have NO test coverage? Write tests for them.
261
- 6. **Regression**: Run the FULL test suite. Did any existing tests break?
262
- 7. **E2E Functional Gaps**: Review ALL Playwright specs. Do they test actual
263
- behavior (state changes, data loaded, navigation works) or just check
264
- that elements exist? Flag and rewrite any shallow/layout tests.
265
- 8. **Cross-Domain Boundaries**: Test data flow across EVERY domain boundary.
266
- Does data arriving from domain A get validated by domain B? What happens
267
- when domain A sends malformed data that passed A's own validation?
268
-
269
- ## Exploratory Testing (if Playwright MCP available)
270
-
271
- After all scripted tests pass:
272
- 1. Check if Playwright MCP is registered in Claude Code settings (look for "playwright" in mcpServers)
273
- 2. If available: spend 5 minutes on adversarial interactive exploration using Playwright MCP
274
- - Attempt race conditions, double-submits, concurrent access patterns
275
- - Try unexpected input sequences, boundary values, rapid state transitions
276
- - Probe error recovery: does the app recover after failures or get stuck?
277
- 3. Tag all findings [EXPLORATORY] in your report
278
- 4. If Playwright MCP is not available: skip this section silently
279
- Note: Exploratory findings are additive — they do not replace scripted test results.
280
-
281
- ## Report Format
282
-
283
- For each bug found:
284
- - **BUG-{N}**: {severity: CRITICAL/HIGH/MEDIUM/LOW}
285
- - **Reproduction**: {exact steps to reproduce}
286
- - **Expected**: {what should happen}
287
- - **Actual**: {what actually happens}
288
- - **Proof**: {test file or command that demonstrates the bug}
289
-
290
- Summary:
291
- - BUGS FOUND: {count} (with severity breakdown)
292
- - COVERAGE GAPS: {untested flows from requirements}
293
- - SHALLOW TESTS REWRITTEN: {count}
294
- - CONTRACTS VERIFIED: {N}/{total}
295
- - ATTACK VECTORS TRIED: {list every category attempted and results}
296
- - VERDICT: FAIL ({N} bugs found) | GRUDGING PASS (exhaustive search, nothing found)
297
-
298
- Write all findings to .gsd-t/red-team-report.md.
299
- If bugs found, also append to .gsd-t/qa-issues.md."
221
+ RT_PROMPT="$(npm root -g 2>/dev/null)/@tekyzinc/gsd-t/templates/prompts/red-team-subagent.md"
222
+ [ -f "$RT_PROMPT" ] || RT_PROMPT="templates/prompts/red-team-subagent.md"
223
+ T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")
300
224
  ```
301
225
 
226
+ Spawn Task subagent (general-purpose, model: opus):
227
+ > "Read `$RT_PROMPT` and follow it. Context: cross-domain integration run. **Additional category for this run: Cross-Domain Boundaries** — test data flow across every domain boundary; does data arriving from domain A get validated by domain B; what happens when A sends malformed data that passed A's own validation. Write findings to `.gsd-t/red-team-report.md`."
228
+
302
229
  After subagent returns — run via Bash:
303
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
304
- Compute tokens and compaction:
305
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
306
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
230
+ ```
231
+ T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))
232
+ COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")
233
+ ```
307
234
  Append to `.gsd-t/token-log.md`:
308
- `| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | sonnet | {DURATION}s | {VERDICT} — {N} bugs found | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
309
-
310
- **If Red Team VERDICT is FAIL:**
311
- 1. Fix all CRITICAL and HIGH bugs immediately (up to 2 fix attempts per bug)
312
- 2. Re-run Red Team after fixes
313
- 3. If bugs persist after 2 fix cycles, log to `.gsd-t/deferred-items.md` and present to user
235
+ `| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {COUNTER} |`
314
236
 
315
- **If Red Team VERDICT is GRUDGING PASS:** Proceed to doc-ripple.
237
+ **If FAIL:** fix CRITICAL/HIGH bugs (≤2 cycles) re-run. Persistent bugs → `.gsd-t/deferred-items.md`.
238
+ **If GRUDGING PASS:** proceed to doc-ripple.
316
239
 
317
240
  ## Step 8: Doc-Ripple (Automated)
318
241
 
@@ -374,19 +374,11 @@ Report: PASS (all checks pass) or FAIL with specific gaps listed."
374
374
 
375
375
  **OBSERVABILITY LOGGING (MANDATORY):**
376
376
  Before spawning — run via Bash:
377
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
377
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
378
378
  After subagent returns — run via Bash:
379
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
380
- Compute tokens and compaction:
381
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
382
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
383
- Compute context utilization — run via Bash:
384
- `if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
385
- Alert on context thresholds (display to user inline):
386
- - If CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
387
- - If CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting in plan."`
388
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
389
- `| {DT_START} | {DT_END} | gsd-t-plan | Step 7 | haiku | {DURATION}s | {PASS/FAIL}, iteration {N} | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
379
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
380
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
381
+ `| {DT_START} | {DT_END} | gsd-t-plan | Step 7 | haiku | {DURATION}s | {PASS/FAIL}, iteration {N} | | | {COUNTER} |`
390
382
  If validation FAIL, append each gap to `.gsd-t/qa-issues.md` (create with header `| Date | Command | Step | Model | Duration(s) | Severity | Finding |` if missing):
391
383
  `| {DT_START} | gsd-t-plan | Step 7 | haiku | {DURATION}s | medium | {gap description} |`
392
384
 
@@ -12,7 +12,7 @@ To give PRD generation a fresh context window:
12
12
 
13
13
  **OBSERVABILITY LOGGING (MANDATORY):**
14
14
  Before spawning — run via Bash:
15
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
15
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
16
16
 
17
17
  Spawn a fresh subagent using the Task tool:
18
18
  ```
@@ -23,12 +23,9 @@ Read CLAUDE.md and .gsd-t/progress.md for project context, then execute gsd-t-pr
23
23
  ```
24
24
 
25
25
  After subagent returns — run via Bash:
26
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
27
- Compute tokens and compaction:
28
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
29
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
30
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
31
- `| {DT_START} | {DT_END} | gsd-t-prd | Step 0 | sonnet | {DURATION}s | prd: {topic summary} | {TOKENS} | {COMPACTED} |`
26
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
27
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
28
+ `| {DT_START} | {DT_END} | gsd-t-prd | Step 0 | sonnet | {DURATION}s | prd: {topic summary} | {COUNTER} |`
32
29
 
33
30
  Relay the subagent's summary to the user. **Do not execute Steps 1–6 yourself.**
34
31
 
@@ -10,7 +10,7 @@ To give this task a fresh context window and prevent compaction during consecuti
10
10
 
11
11
  **OBSERVABILITY LOGGING (MANDATORY):**
12
12
  Before spawning — run via Bash:
13
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
13
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
14
14
 
15
15
  **Token Budget Check (before spawning subagent):**
16
16
 
@@ -95,12 +95,9 @@ Read CLAUDE.md and .gsd-t/progress.md for project context, then execute gsd-t-qu
95
95
  ```
96
96
 
97
97
  After subagent returns — run via Bash:
98
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
99
- Compute tokens and compaction:
100
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
101
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
102
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
103
- `| {DT_START} | {DT_END} | gsd-t-quick | Step 0 | sonnet | {DURATION}s | quick: {task summary} | {TOKENS} | {COMPACTED} |`
98
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
99
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
100
+ `| {DT_START} | {DT_END} | gsd-t-quick | Step 0 | sonnet | {DURATION}s | quick: {task summary} | {COUNTER} |`
104
101
 
105
102
  Relay the subagent's summary to the user. **Do not execute Steps 1–5 yourself.**
106
103
 
@@ -262,7 +259,7 @@ If it DOES exist and this task involved UI changes — spawn the Design Verifica
262
259
 
263
260
  **OBSERVABILITY LOGGING (MANDATORY):**
264
261
  Before spawning — run via Bash:
265
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
262
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
266
263
 
267
264
  ```
268
265
  Task subagent (general-purpose, model: opus):
@@ -327,96 +324,34 @@ After subagent returns — run observability Bash and append to token-log.md.
327
324
 
328
325
  ## Step 5.5: Red Team — Adversarial QA (MANDATORY)
329
326
 
330
- After tests pass, spawn an adversarial Red Team agent. This agent's sole purpose is to BREAK the code that was just changed. Its success is measured by bugs found, not tests passed.
327
+ After tests pass, spawn an adversarial Red Team agent. Its success is measured by bugs found, not tests passed.
331
328
 
332
- ⚙ [{model}] Red Team → adversarial validation of quick task
333
-
334
- **OBSERVABILITY LOGGING (MANDATORY):**
335
- Before spawning — run via Bash:
336
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
329
+ ⚙ [opus] Red Team → adversarial validation of quick task
337
330
 
331
+ Resolve the templated prompt path via Bash (same pattern as execute.md):
338
332
  ```
339
- Task subagent (general-purpose, model: opus):
340
- "You are a Red Team QA adversary. Your job is to BREAK the code that was just changed.
341
-
342
- Your value is measured by REAL bugs found. More bugs = more value.
343
- If you find zero bugs, you must prove you were thorough — list every
344
- attack vector you tried and why it didn't break. A short list means
345
- you didn't try hard enough.
346
-
347
- Rules:
348
- - False positives DESTROY your credibility. If you report something
349
- as a bug and it's actually correct behavior, that's worse than
350
- missing a real bug. Never report something you haven't reproduced.
351
- - Style opinions are not bugs. Theoretical concerns are not bugs.
352
- A bug is: 'I did X, expected Y, got Z.' With proof.
353
- - You are done ONLY when you have exhausted every category below
354
- and either found a bug or documented exactly what you tried.
355
-
356
- ## Attack Categories (exhaust ALL of these)
357
-
358
- 1. **Contract Violations**: Read .gsd-t/contracts/. Does the code EXACTLY
359
- match every contract? Test each endpoint/interface/schema shape.
360
- 2. **Boundary Inputs**: Empty strings, null, undefined, huge payloads,
361
- special characters, SQL injection attempts, XSS payloads, path traversal.
362
- 3. **State Transitions**: What happens when actions are performed out of
363
- order? Double-submit? Concurrent access? Refresh mid-flow?
364
- 4. **Error Paths**: Remove env vars. Kill the database. Send malformed
365
- requests. Does the code handle failures gracefully or crash?
366
- 5. **Missing Flows**: Read docs/requirements.md. Are there user flows that
367
- exist in requirements but have NO test coverage? Write tests for them.
368
- 6. **Regression**: Run the FULL test suite. Did any existing tests break?
369
- 7. **E2E Functional Gaps**: Review ALL Playwright specs. Do they test actual
370
- behavior (state changes, data loaded, navigation works) or just check
371
- that elements exist? Flag and rewrite any shallow/layout tests.
372
-
373
- ## Exploratory Testing (if Playwright MCP available)
374
-
375
- After all scripted tests pass:
376
- 1. Check if Playwright MCP is registered in Claude Code settings (look for "playwright" in mcpServers)
377
- 2. If available: spend 5 minutes on adversarial interactive exploration using Playwright MCP
378
- - Attempt race conditions, double-submits, concurrent access patterns
379
- - Try unexpected input sequences, boundary values, rapid state transitions
380
- - Probe error recovery: does the app recover after failures or get stuck?
381
- 3. Tag all findings [EXPLORATORY] in your report
382
- 4. If Playwright MCP is not available: skip this section silently
383
- Note: Exploratory findings are additive — they do not replace scripted test results.
384
-
385
- ## Report Format
386
-
387
- For each bug found:
388
- - **BUG-{N}**: {severity: CRITICAL/HIGH/MEDIUM/LOW}
389
- - **Reproduction**: {exact steps to reproduce}
390
- - **Expected**: {what should happen}
391
- - **Actual**: {what actually happens}
392
- - **Proof**: {test file or command that demonstrates the bug}
393
-
394
- Summary:
395
- - BUGS FOUND: {count} (with severity breakdown)
396
- - COVERAGE GAPS: {untested flows from requirements}
397
- - SHALLOW TESTS REWRITTEN: {count}
398
- - CONTRACTS VERIFIED: {N}/{total}
399
- - ATTACK VECTORS TRIED: {list every category attempted and results}
400
- - VERDICT: FAIL ({N} bugs found) | GRUDGING PASS (exhaustive search, nothing found)
401
-
402
- Write all findings to .gsd-t/red-team-report.md.
403
- If bugs found, also append to .gsd-t/qa-issues.md."
333
+ RT_PROMPT="$(npm root -g 2>/dev/null)/@tekyzinc/gsd-t/templates/prompts/red-team-subagent.md"
334
+ [ -f "$RT_PROMPT" ] || RT_PROMPT="templates/prompts/red-team-subagent.md"
335
+ T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")
404
336
  ```
405
337
 
338
+ Spawn Task subagent (general-purpose, model: opus):
339
+ > "Read `$RT_PROMPT` and follow it. Context for this run: quick task — adversarial validation of the code just changed. Write findings to `.gsd-t/red-team-report.md`."
340
+
406
341
  After subagent returns — run via Bash:
407
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
408
- Compute tokens and compaction:
409
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
410
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
342
+ ```
343
+ T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))
344
+ COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")
345
+ ```
411
346
  Append to `.gsd-t/token-log.md`:
412
- `| {DT_START} | {DT_END} | gsd-t-quick | Red Team | sonnet | {DURATION}s | {VERDICT} — {N} bugs found | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
347
+ `| {DT_START} | {DT_END} | gsd-t-quick | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {COUNTER} |`
413
348
 
414
349
  **If Red Team VERDICT is FAIL:**
415
- 1. Fix all CRITICAL and HIGH bugs immediately (up to 2 fix attempts per bug)
350
+ 1. Fix all CRITICAL and HIGH bugs (up to 2 fix cycles)
416
351
  2. Re-run Red Team after fixes
417
- 3. If bugs persist after 2 fix cycles, log to `.gsd-t/deferred-items.md` and present to user
352
+ 3. If bugs persist, log to `.gsd-t/deferred-items.md` and present to user
418
353
 
419
- **If Red Team VERDICT is GRUDGING PASS:** Proceed to doc-ripple.
354
+ **If GRUDGING PASS:** Proceed to doc-ripple.
420
355
 
421
356
  ## Step 6: Doc-Ripple (Automated)
422
357
 
@@ -9,7 +9,7 @@ When invoked directly by the user, spawn yourself as a Task subagent for a fresh
9
9
  **OBSERVABILITY LOGGING — before spawning:**
10
10
 
11
11
  Run via Bash:
12
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
12
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
13
13
 
14
14
  ```
15
15
  Task subagent (general-purpose, model: sonnet):
@@ -21,14 +21,14 @@ Skip Step 0 — you are already the subagent."
21
21
  **OBSERVABILITY LOGGING — after subagent returns:**
22
22
 
23
23
  Run via Bash:
24
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
24
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
25
25
 
26
26
  Compute tokens:
27
27
  - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
28
28
  - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
29
29
 
30
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
31
- `| {DT_START} | {DT_END} | gsd-t-reflect | Step 0 | sonnet | {DURATION}s | retrospective generated | {TOKENS} | {COMPACTED} |`
30
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
31
+ `| {DT_START} | {DT_END} | gsd-t-reflect | Step 0 | sonnet | {DURATION}s | retrospective generated | {COUNTER} |`
32
32
 
33
33
  Return the subagent's output and stop. Only skip Step 0 if you are already running as a subagent.
34
34
 
@@ -158,15 +158,12 @@ Teammate assignments:
158
158
  Lead: After receiving teammate reports:
159
159
  **OBSERVABILITY LOGGING (MANDATORY):**
160
160
  Before spawning — run via Bash:
161
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
161
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
162
162
  Spawn a Task subagent to run the full test suite and contract audit.
163
163
  After subagent returns — run via Bash:
164
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
165
- Compute tokens and compaction:
166
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
167
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
168
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
169
- `| {DT_START} | {DT_END} | gsd-t-verify | Step 4 | haiku | {DURATION}s | test audit + contract review | {TOKENS} | {COMPACTED} |`
164
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
165
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
166
+ `| {DT_START} | {DT_END} | gsd-t-verify | Step 4 | haiku | {DURATION}s | test audit + contract review | {COUNTER} |`
170
167
  Collect all reports, synthesize, create remediation plan.
171
168
  ```
172
169
 
@@ -348,7 +345,7 @@ If status is VERIFIED or VERIFIED-WITH-WARNINGS:
348
345
 
349
346
  **OBSERVABILITY LOGGING (MANDATORY):**
350
347
  Before spawning — run via Bash:
351
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
348
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
352
349
 
353
350
  2. Spawn a Task subagent (model: sonnet, mode: bypassPermissions):
354
351
  ```
@@ -370,12 +367,9 @@ Report back: one-line status summary."
370
367
  ```
371
368
 
372
369
  After subagent returns — run via Bash:
373
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
374
- Compute tokens and compaction:
375
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
376
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
370
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
377
371
  Append to `.gsd-t/token-log.md`:
378
- `| {DT_START} | {DT_END} | gsd-t-verify | Step 8 | sonnet | {DURATION}s | auto-complete-milestone | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
372
+ `| {DT_START} | {DT_END} | gsd-t-verify | Step 8 | sonnet | {DURATION}s | auto-complete-milestone | | | {COUNTER} |`
379
373
 
380
374
  3. Verify subagent result: Read `.gsd-t/progress.md` — confirm status is COMPLETED. If not, report the failure.
381
375
 
@@ -9,7 +9,7 @@ When invoked directly by the user, spawn yourself as a Task subagent for a fresh
9
9
  **OBSERVABILITY LOGGING — before spawning:**
10
10
 
11
11
  Run via Bash:
12
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
12
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
13
13
 
14
14
  ```
15
15
  Task subagent (general-purpose, model: sonnet):
@@ -21,14 +21,14 @@ Skip Step 0 — you are already the subagent."
21
21
  **OBSERVABILITY LOGGING — after subagent returns:**
22
22
 
23
23
  Run via Bash:
24
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
24
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
25
25
 
26
26
  Compute tokens:
27
27
  - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
28
28
  - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
29
29
 
30
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
31
- `| {DT_START} | {DT_END} | gsd-t-visualize | Step 0 | sonnet | {DURATION}s | dashboard launched | {TOKENS} | {COMPACTED} |`
30
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
31
+ `| {DT_START} | {DT_END} | gsd-t-visualize | Step 0 | sonnet | {DURATION}s | dashboard launched | {COUNTER} |`
32
32
 
33
33
  Return the subagent's output and stop. Only skip Step 0 if you are already running as a subagent.
34
34
 
@@ -2,6 +2,16 @@
2
2
 
3
3
  You are the wave orchestrator. You do NOT execute phases yourself. Instead, you spawn an **independent agent for each phase**, giving each a fresh context window. This eliminates context accumulation across phases and prevents mid-wave compaction.
4
4
 
5
+ ## Step 0: Reset Phase-Count Gate (MANDATORY — first thing in a fresh session)
6
+
7
+ Run via Bash:
8
+
9
+ ```bash
10
+ node bin/task-counter.cjs reset
11
+ ```
12
+
13
+ This clears `.gsd-t/.task-counter` so the new wave session starts at 0. The gate logic is in the Phase Agent Spawn Pattern below — it forces a /clear-and-resume after N phase spawns to prevent the wave orchestrator from itself running out of context. Default N=5, override per-project via `.gsd-t/task-counter-config.json` (`{"limit":8}`) or env `GSD_T_TASK_LIMIT=8`.
14
+
5
15
  ## Step 1: Load State (Lightweight)
6
16
 
7
17
  Read ONLY:
@@ -93,7 +103,7 @@ Run via Bash:
93
103
 
94
104
  **OBSERVABILITY LOGGING (MANDATORY) — repeat for every phase spawn:**
95
105
  Before spawning — run via Bash:
96
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
106
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
97
107
 
98
108
  ```
99
109
  Task agent (subagent_type: "general-purpose", mode: "bypassPermissions"):
@@ -114,28 +124,31 @@ Task agent (subagent_type: "general-purpose", mode: "bypassPermissions"):
114
124
  ```
115
125
 
116
126
  After phase agent returns — run via Bash:
117
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
118
- Compute tokens and compaction:
119
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
120
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
121
- Compute context utilization run via Bash:
122
- `if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
123
- Alert on context thresholds (display to user inline):
124
- - If CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
125
- - If CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting in plan."`
126
-
127
- **Orchestrator Context Self-Check (MANDATORY):**
128
- After EVERY phase agent returns, check the wave orchestrator's own context:
129
- - **If CTX_PCT >= 70:**
130
- 1. Save checkpoint to `.gsd-t/progress.md` record which phases are complete, which remain
131
- 2. Output: `⚠️ Wave orchestrator context at {CTX_PCT}% — approaching limit. Progress saved. Run /clear then /user:gsd-t-wave to continue from the next phase.`
132
- 3. **STOP the wave loop.** Do NOT spawn the next phase agent. The next session resumes from saved state.
133
- - **If CTX_PCT < 70:** Continue to next phase.
134
-
135
- This prevents the wave orchestrator from running out of context mid-wave.
136
-
137
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
138
- `| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
127
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
128
+
129
+ **Wave Orchestrator Phase-Count Gate (MANDATORY) replaces the broken context-percent check from v2.74.x:**
130
+
131
+ Run via Bash AFTER each phase agent returns:
132
+
133
+ ```bash
134
+ node bin/task-counter.cjs increment phase
135
+ ```
136
+
137
+ Read the JSON status the command prints. If `should_stop` is `true` (or the command's exit code is `10`):
138
+ 1. Save checkpoint to `.gsd-t/progress.md` record which phases are complete, which remain.
139
+ 2. Output exactly: `⏸️ Wave orchestrator phase-count gate reached ({count}/{limit} phases in this session). Progress saved. Run /clear then /user:gsd-t-wave to continue from the next phase.`
140
+ 3. **STOP the wave loop.** Do NOT spawn the next phase agent. The next session resumes from saved state.
141
+
142
+ The wave orchestrator shares the same `bin/task-counter.cjs` counter as the execute orchestrator. Each phase spawn (PARTITION, DISCUSS, PLAN, IMPACT, EXECUTE, TEST-SYNC, INTEGRATE, VERIFY+COMPLETE, DOC-RIPPLE) increments the counter by 1. With the default limit of 5, a wave will run at most 5 phase agents per session before forcing a /clear-and-resume — typically two sessions per full wave. Override via `.gsd-t/task-counter-config.json` (`{"limit":8}`) or `GSD_T_TASK_LIMIT=8`.
143
+
144
+ The previous version of this gate relied on `CLAUDE_CONTEXT_TOKENS_USED`/`_MAX` env vars which Claude Code does not export — that check was inert and let the orchestrator drain context until forced compaction. The deterministic on-disk counter has the same intent (force a /clear before context runs out) but actually works.
145
+
146
+ **On wave entry**, the wave orchestrator runs `node bin/task-counter.cjs reset` exactly once (see Step 0 — it is the very first thing the wave does in a fresh session). The reset is the SIGNAL that this is a clean post-/clear session.
147
+
148
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
149
+ `| {DT_START} | {DT_END} | gsd-t-wave | {PHASE} | sonnet | {DURATION}s | phase: {PHASE} | | | {COUNTER} |`
150
+
151
+ Where `{COUNTER}` is the `count` field from the JSON the increment command just printed.
139
152
 
140
153
  ### Phase Sequence
141
154
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@tekyzinc/gsd-t",
3
- "version": "2.74.10",
3
+ "version": "2.74.12",
4
4
  "description": "GSD-T: Contract-Driven Development for Claude Code — 56 slash commands with headless CI/CD mode, graph-powered code analysis, real-time agent dashboard, execution intelligence, task telemetry, doc-ripple enforcement, backlog management, impact analysis, test sync, milestone archival, and PRD generation",
5
5
  "author": "Tekyz, Inc.",
6
6
  "license": "MIT",
@@ -0,0 +1,30 @@
1
+ # GSD-T Subagent Prompt Templates
2
+
3
+ This directory holds long-form prompts that used to be inlined into command markdown files. Inlining them caused massive context burn — each Task subagent spawn re-materialized ~3000 tokens of prompt boilerplate, dozens of times per milestone.
4
+
5
+ Now command files reference these by path. The orchestrator passes the file path to the subagent, the subagent reads it itself. The orchestrator never holds the full prompt in its own context.
6
+
7
+ ## Files
8
+
9
+ | File | Purpose | Run frequency |
10
+ |------|---------|---------------|
11
+ | `qa-subagent.md` | Test generation, execution, gap reporting | Per task |
12
+ | `red-team-subagent.md` | Adversarial bug hunting | Per domain (NOT per task) |
13
+ | `design-verify-subagent.md` | Visual element-by-element design audit | Per domain (NOT per task) |
14
+
15
+ ## Why per-domain instead of per-task
16
+
17
+ Red Team and Design Verification were originally per-domain. They were promoted to per-task by commits `da6d3ae` and `b68353e`, on the assumption that the orchestrator's `CLAUDE_CONTEXT_TOKENS_USED` self-check would catch context drain before it got bad. That env var is never set by Claude Code — the self-check was vaporware. With it inert, per-task spawning of ~10k-token Red Team subagents drained sessions in 5-10 tasks. Reverting them to per-domain raises the safe task count from ~5 to ~15+.
18
+
19
+ QA stays per-task because (a) it's much smaller, (b) it grounds against contracts which can drift task by task.
20
+
21
+ ## Adding a new prompt
22
+
23
+ Write the prompt as a self-contained markdown file in this directory. Reference it from the command file with:
24
+
25
+ ```
26
+ Spawn Task subagent (general-purpose, model: <model>):
27
+ "Read `templates/prompts/<your-prompt>.md` and follow it. Context for this run: <one-line context>."
28
+ ```
29
+
30
+ Keep the inline context to one line. The prompt body must live in the file, not the command.