@tekyzinc/gsd-t 2.74.10 → 2.74.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -2,6 +2,38 @@
 
  All notable changes to GSD-T are documented here. Updated with each release.
 
+ ## [2.74.12] - 2026-04-14
+
+ ### Fixed — Context-Burn Regression (P0, affects every GSD-T project)
+
+ **Root cause**: commit `0b91429` (2026-03-24) added an "orchestrator context self-check" that read `CLAUDE_CONTEXT_TOKENS_USED` / `CLAUDE_CONTEXT_TOKENS_MAX` — environment variables **Claude Code never exports**. The guard was always false, so the self-check was silently inert. Commits `da6d3ae` and `b68353e` then promoted Red Team and Design Verification from per-domain to per-task on the assumption that this guard would catch context drain. With the guard broken, per-task spawning of ~10K-token adversarial prompts drained sessions from 77% → 12% context in just 2 tasks (bee-poc reproducer).
+
+ **The fix ships as a three-part correction:**
+
+ #### Fix 1: Real task-count gate (replaces the vaporware env-var check)
+ - **NEW `bin/task-counter.cjs`** — deterministic on-disk task counter. State: `.gsd-t/.task-counter`. Config: `.gsd-t/task-counter-config.json` (default limit: 5). Env override: `GSD_T_TASK_LIMIT`. Commands: `increment <kind>`, `status`, `reset`, `should-stop` (exit code 10 at limit). This is the real signal the old self-check *pretended* to be.
+ - **`commands/gsd-t-execute.md`** — Step 0 resets the counter; Step 3.5 calls `node bin/task-counter.cjs should-stop` as a gate before every task spawn; Step 5 increments after each task. At the limit, the orchestrator checkpoints and STOPs — the user runs `/clear` then `/user:gsd-t-resume`.
+ - **`commands/gsd-t-wave.md`** — an analogous phase-count gate replaces the broken "Wave Orchestrator Context Self-Check."
+ - **`bin/token-budget.js`** — `getSessionStatus()` rewritten to read the task counter instead of env vars. API surface preserved (threshold/pct/consumed/estimated_remaining) so all dependent commands keep working. Graduated-degradation thresholds (warn/downgrade/conserve/stop) now fire on a real signal.
+
+ #### Fix 2: Revert per-task Red Team / Design Verify, extract prompts to templates
+ - **NEW `templates/prompts/`** directory with three self-contained prompt files — `qa-subagent.md`, `red-team-subagent.md`, `design-verify-subagent.md` — plus a `README.md` explaining the architecture. Command files reference prompts by **file path** rather than inlining the body. Subagents read the prompt file themselves, so the orchestrator never re-materializes ~3500-token prompt bodies in its own context per spawn.
+ - **`commands/gsd-t-execute.md`** — Red Team and Design Verification moved back to **per-domain** (where they were before `da6d3ae` / `b68353e`). QA stays per-task (it is smaller, and contracts can drift task-by-task). Result: the safe task count per session rises from ~5 to ~15+.
+ - **`commands/gsd-t-quick.md`, `gsd-t-integrate.md`, `gsd-t-debug.md`** — Red Team spawn blocks converted to templated-prompt references. ~270 lines of duplicated adversarial prompt boilerplate removed; run-specific categories (Cross-Domain Boundaries, Regression Around the Fix, Original Bug Variants) preserved as one-line context notes to the subagent.
+
+ #### Fix 3: Token-log schema & placeholder cleanup
+ - Removed the `Tokens | Compacted | Ctx%` columns from the token-log schema (they always wrote `0 | null | N/A` because the env vars were never set). Added `Tasks-Since-Reset` as the real burn signal.
+ - Neutralized **70+ references** to `CLAUDE_CONTEXT_TOKENS_USED` / `CLAUDE_CONTEXT_TOKENS_MAX` across 14 command files. The 3 remaining references (gsd-t-execute.md, gsd-t-wave.md, gsd-t-doc-ripple.md) are historical-note mentions only. `scripts/gsd-t-heartbeat.js` and `scripts/gsd-t-statusline.js` still read the env vars but treat them as optional fallbacks that gracefully degrade (unchanged behavior).
+ - Test suite (`test/token-budget.test.js`) rewritten around the new counter-based `getSessionStatus()`. 36/36 passing.
+
+ ### Propagation
+ After publishing, run `/user:gsd-t-version-update-all` to propagate the fix to every registered GSD-T project. Projects will receive the new `bin/task-counter.cjs` and updated command files in a single sweep.
+
+ ## [2.74.11] - 2026-04-13
+
+ ### Fixed
+ - **`bin/archive-progress.js` → `.cjs` rename** — the new bin tools used CommonJS `require()` but failed in projects with `"type": "module"` in `package.json` (caught on BDS-Analytics-UI during the first update-all). Renamed all three new bin tools to `.cjs` so they run as CommonJS regardless of the host project's module type. `version-update-all` now copies `.cjs` files and runs `archive-progress.cjs`.
+
  ## [2.74.10] - 2026-04-13
 
  ### Added
package/bin/gsd-t.js CHANGED
@@ -1557,8 +1557,9 @@ function updateSingleProject(projectDir, counts) {
  }
 
  // Bin tools that should ship with every registered project. Listed here so adding
- // a new tool only requires appending to this array.
- const PROJECT_BIN_TOOLS = ["archive-progress.js", "log-tail.js", "context-budget-audit.js"];
+ // a new tool only requires appending to this array. Use .cjs extension so they
+ // always run as CommonJS regardless of the project's package.json "type" field.
+ const PROJECT_BIN_TOOLS = ["archive-progress.cjs", "log-tail.cjs", "context-budget-audit.cjs"];
 
  function copyBinToolsToProject(projectDir, projectName) {
    const projectBinDir = path.join(projectDir, "bin");
@@ -1611,7 +1612,7 @@ function runProgressArchiveMigration(projectDir, projectName) {
  const markerPath = path.join(projectDir, ".gsd-t", ".archive-migration-v1");
  if (fs.existsSync(markerPath)) return false;
 
- const archiveScript = path.join(projectDir, "bin", "archive-progress.js");
+ const archiveScript = path.join(projectDir, "bin", "archive-progress.cjs");
  if (!fs.existsSync(archiveScript)) return false;
 
  try {
package/bin/task-counter.cjs ADDED
@@ -0,0 +1,161 @@
+ #!/usr/bin/env node
+
+ /**
+  * GSD-T Task Counter — real, deterministic context-burn gate.
+  *
+  * Replaces the broken CLAUDE_CONTEXT_TOKENS_USED self-check (which never
+  * worked because Claude Code does not export those env vars). Instead of
+  * trying to read the orchestrator's own token usage, we count completed
+  * task subagent spawns. After N tasks the orchestrator MUST checkpoint
+  * progress and STOP so the user can /clear and resume cleanly.
+  *
+  * State lives at .gsd-t/.task-counter (single JSON file). A counter
+  * persists across orchestrator runs until a /clear-then-resume cycle
+  * resets it via the `reset` command.
+  *
+  * Threshold defaults are conservative — five tasks per session before
+  * stop. Override via:
+  *   - .gsd-t/task-counter-config.json  { "limit": 8 }
+  *   - env GSD_T_TASK_LIMIT=8
+  *
+  * Zero external dependencies (Node.js built-ins only).
+  *
+  * CLI usage (called from command markdown via `node bin/task-counter.cjs`):
+  *   increment <kind>  → bump counter, print JSON status
+  *   status            → print JSON status without bumping
+  *   reset             → clear counter (called after /clear resume)
+  *   should-stop       → exit 0 if okay to spawn, exit 10 if must stop
+  *
+  * JSON status shape:
+  *   { count, limit, remaining, should_stop, started_at, last_kind }
+  */
+
+ const fs = require("fs");
+ const path = require("path");
+
+ const DEFAULT_LIMIT = 5;
+ const STATE_FILE = ".gsd-t/.task-counter";
+ const CONFIG_FILE = ".gsd-t/task-counter-config.json";
+
+ function projectDir() {
+   return process.cwd();
+ }
+
+ function statePath() {
+   return path.join(projectDir(), STATE_FILE);
+ }
+
+ function configPath() {
+   return path.join(projectDir(), CONFIG_FILE);
+ }
+
+ function readLimit() {
+   if (process.env.GSD_T_TASK_LIMIT) {
+     const n = parseInt(process.env.GSD_T_TASK_LIMIT, 10);
+     if (!isNaN(n) && n > 0) return n;
+   }
+   try {
+     const raw = fs.readFileSync(configPath(), "utf8");
+     const cfg = JSON.parse(raw);
+     if (cfg && typeof cfg.limit === "number" && cfg.limit > 0) return cfg.limit;
+   } catch (_) {}
+   return DEFAULT_LIMIT;
+ }
+
+ function readState() {
+   try {
+     const raw = fs.readFileSync(statePath(), "utf8");
+     const s = JSON.parse(raw);
+     return {
+       count: typeof s.count === "number" ? s.count : 0,
+       started_at: s.started_at || null,
+       last_kind: s.last_kind || null,
+       stopped: !!s.stopped,
+     };
+   } catch (_) {
+     return { count: 0, started_at: null, last_kind: null, stopped: false };
+   }
+ }
+
+ function writeState(s) {
+   const dir = path.dirname(statePath());
+   if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
+   fs.writeFileSync(statePath(), JSON.stringify(s, null, 2));
+ }
+
+ function buildStatus(state, limit) {
+   const remaining = Math.max(0, limit - state.count);
+   return {
+     count: state.count,
+     limit,
+     remaining,
+     should_stop: state.count >= limit || state.stopped,
+     started_at: state.started_at,
+     last_kind: state.last_kind,
+   };
+ }
+
+ function cmdIncrement(kind) {
+   const s = readState();
+   s.count += 1;
+   s.last_kind = kind || "task";
+   if (!s.started_at) s.started_at = new Date().toISOString();
+   const limit = readLimit();
+   if (s.count >= limit) s.stopped = true;
+   writeState(s);
+   return buildStatus(s, limit);
+ }
+
+ function cmdStatus() {
+   return buildStatus(readState(), readLimit());
+ }
+
+ function cmdReset() {
+   writeState({ count: 0, started_at: null, last_kind: null, stopped: false });
+   return buildStatus(readState(), readLimit());
+ }
+
+ function cmdShouldStop() {
+   return buildStatus(readState(), readLimit()).should_stop;
+ }
+
+ function main() {
+   const cmd = process.argv[2];
+   const arg = process.argv[3];
+   switch (cmd) {
+     case "increment": {
+       const status = cmdIncrement(arg);
+       process.stdout.write(JSON.stringify(status));
+       process.exit(status.should_stop ? 10 : 0);
+     }
+     case "status": {
+       process.stdout.write(JSON.stringify(cmdStatus()));
+       process.exit(0);
+     }
+     case "reset": {
+       process.stdout.write(JSON.stringify(cmdReset()));
+       process.exit(0);
+     }
+     case "should-stop": {
+       process.exit(cmdShouldStop() ? 10 : 0);
+     }
+     default: {
+       process.stderr.write(
+         "Usage: task-counter.cjs <increment|status|reset|should-stop> [kind]\n"
+       );
+       process.exit(2);
+     }
+   }
+ }
+
+ if (require.main === module) main();
+
+ module.exports = {
+   cmdIncrement,
+   cmdStatus,
+   cmdReset,
+   cmdShouldStop,
+   readLimit,
+   readState,
+   DEFAULT_LIMIT,
+ };
package/bin/token-budget.js CHANGED
@@ -77,13 +77,45 @@ function estimateCost(model, taskType, options) {
  /**
   * @param {string} [projectDir]
   * @returns {{ consumed: number, estimated_remaining: number, pct: number, threshold: string }}
+  *
+  * v2.74.12: previously this read process.env.CLAUDE_CONTEXT_TOKENS_USED /
+  * CLAUDE_CONTEXT_TOKENS_MAX, which Claude Code does not export — so consumed
+  * was always 0 and threshold was always 'normal'. The graduated-degradation
+  * machinery downstream was inert. Now we synthesise a percent from the real
+  * task counter at .gsd-t/.task-counter, mapping 0..limit linearly to
+  * 0..100%. This keeps the API stable so commands that ask for thresholds
+  * (downgrade/conserve/stop) get a real signal.
   */
  function getSessionStatus(projectDir) {
- const maxTokens = parseInt(process.env.CLAUDE_CONTEXT_TOKENS_MAX || "200000", 10);
- const consumed = readSessionConsumed(projectDir);
- const pct = maxTokens > 0 ? Math.round((consumed / maxTokens) * 100 * 10) / 10 : 0;
+ const dir = projectDir || process.cwd();
+ const counter = readTaskCounter(dir);
+ const limit = counter.limit > 0 ? counter.limit : 5;
+ const consumed = counter.count;
+ const pct = Math.min(100, Math.round((consumed / limit) * 100 * 10) / 10);
  const threshold = resolveThreshold(pct);
- return { consumed, estimated_remaining: maxTokens - consumed, pct, threshold };
+ const estimated_remaining = Math.max(0, limit - consumed);
+ return { consumed, estimated_remaining, pct, threshold };
+ }
+
+ function readTaskCounter(dir) {
+   try {
+     const fp = path.join(dir, ".gsd-t", ".task-counter");
+     const raw = fs.readFileSync(fp, "utf8");
+     const s = JSON.parse(raw);
+     let limit = 5;
+     try {
+       const cfgRaw = fs.readFileSync(path.join(dir, ".gsd-t", "task-counter-config.json"), "utf8");
+       const cfg = JSON.parse(cfgRaw);
+       if (cfg && typeof cfg.limit === "number" && cfg.limit > 0) limit = cfg.limit;
+     } catch (_) {}
+     if (process.env.GSD_T_TASK_LIMIT) {
+       const n = parseInt(process.env.GSD_T_TASK_LIMIT, 10);
+       if (!isNaN(n) && n > 0) limit = n;
+     }
+     return { count: typeof s.count === "number" ? s.count : 0, limit };
+   } catch (_) {
+     return { count: 0, limit: 5 };
+   }
  }
 
  // ── recordUsage ──────────────────────────────────────────────────────────────
@@ -123,13 +155,16 @@ function getDegradationActions(projectDir) {
   * @returns {{ estimatedTokens: number, estimatedPct: number, feasible: boolean }}
   */
  function estimateMilestoneCost(remainingTasks, projectDir) {
- const { estimated_remaining } = getSessionStatus(projectDir);
- const maxTokens = parseInt(process.env.CLAUDE_CONTEXT_TOKENS_MAX || "200000", 10);
+ const status = getSessionStatus(projectDir);
+ const limit = status.consumed + status.estimated_remaining || 5;
  const estimatedTokens = remainingTasks.reduce((sum, t) => {
    return sum + estimateCost(t.model, t.taskType, { complexity: t.complexity, projectDir });
  }, 0);
- const estimatedPct = maxTokens > 0 ? Math.round((estimatedTokens / maxTokens) * 100 * 10) / 10 : 0;
- const feasible = estimatedTokens <= estimated_remaining * 0.8;
+ // Approximate context burn in task-equivalents: one per remaining task,
+ // compared against the task-counter limit rather than a token budget.
+ const taskEquivalents = remainingTasks.length;
+ const estimatedPct = limit > 0 ? Math.min(100, Math.round((taskEquivalents / limit) * 100 * 10) / 10) : 0;
+ const feasible = taskEquivalents <= status.estimated_remaining;
  return { estimatedTokens, estimatedPct, feasible };
  }
 
package/commands/gsd-t-audit.md CHANGED
@@ -10,7 +10,7 @@ To keep the main conversation context lean, run audit via a Task subagent.
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  Spawn a fresh subagent using the Task tool:
  ```
@@ -22,12 +22,9 @@ Read CLAUDE.md and .gsd-t/progress.md for project context, then execute gsd-t-au
  ```
 
  After subagent returns — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
  Append to `.gsd-t/token-log.md` (create with header if missing):
- `| {DT_START} | {DT_END} | gsd-t-audit | Step 0 | sonnet | {DURATION}s | audit: {args summary} | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
+ `| {DT_START} | {DT_END} | gsd-t-audit | Step 0 | sonnet | {DURATION}s | audit: {args summary} | | | {COUNTER} |`
 
  Relay the subagent's summary to the user. **Do not execute Steps 1–5 yourself.**
package/commands/gsd-t-brainstorm.md CHANGED
@@ -90,7 +90,7 @@ Before drawing any conclusions or presenting final insights, spawn a team of par
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning the team — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  ```
  Spawn a deep research team (run all three in parallel):
@@ -123,12 +123,9 @@ Do NOT proceed to Step 5 until this synthesis is complete.
  ```
 
  After team completes — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
- `| {DT_START} | {DT_END} | gsd-t-brainstorm | Step 3 | sonnet | {DURATION}s | deep research: {topic summary} | {TOKENS} | {COMPACTED} |`
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
+ `| {DT_START} | {DT_END} | gsd-t-brainstorm | Step 3 | sonnet | {DURATION}s | deep research: {topic summary} | {COUNTER} |`
 
  ## Step 4: Capture the Sparks
package/commands/gsd-t-debug.md CHANGED
@@ -72,7 +72,7 @@ If STACK_RULES is empty (no templates/stacks/ dir or no matches), skip silently.
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  Spawn a fresh subagent using the Task tool:
  ```
@@ -83,12 +83,9 @@ Read CLAUDE.md and .gsd-t/progress.md for project context, then execute gsd-t-de
  ```
 
  After subagent returns — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
- `| {DT_START} | {DT_END} | gsd-t-debug | Step 0 | sonnet | {DURATION}s | debug: {issue summary} | {TOKENS} | {COMPACTED} |`
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
+ `| {DT_START} | {DT_END} | gsd-t-debug | Step 0 | sonnet | {DURATION}s | debug: {issue summary} | {COUNTER} |`
 
  Relay the subagent's summary to the user. **Do not execute Steps 1–5 yourself.**
 
@@ -124,7 +121,7 @@ The current approach has failed 3+ times. This means the root cause is not yet u
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  ```
  Spawn a deep research team (run all three in parallel):
@@ -153,12 +150,9 @@ Lead: Wait for all three researchers to complete. Then synthesize:
  ```
 
  After team completes — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
 
  Append to `.gsd-t/token-log.md`:
- `| {DT_START} | {DT_END} | gsd-t-debug | Step 1.5 | sonnet | {DURATION}s | deep research loop break: {issue summary} | {TOKENS} | {COMPACTED} |`
+ `| {DT_START} | {DT_END} | gsd-t-debug | Step 1.5 | sonnet | {DURATION}s | deep research loop break: {issue summary} | {COUNTER} |`
 
  **STOP. Present findings to the user before making any changes:**
 
@@ -378,98 +372,30 @@ Commit: `[debug] Fix {description} — root cause: {explanation}`
 
  ## Step 5.3: Red Team — Adversarial QA (MANDATORY)
 
- After the fix passes all tests, spawn an adversarial Red Team agent. This agent's sole purpose is to BREAK the fix and find regressions. Its success is measured by bugs found, not tests passed.
+ After the fix passes all tests, spawn an adversarial Red Team agent to BREAK the fix and find regressions.
 
- ⚙ [{model}] Red Team → adversarial validation of debug fix
-
- **OBSERVABILITY LOGGING (MANDATORY):**
- Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ ⚙ [opus] Red Team → adversarial validation of debug fix
 
+ Resolve the templated prompt path via Bash:
  ```
- Task subagent (general-purpose, model: opus):
- "You are a Red Team QA adversary. Your job is to BREAK the fix that was just applied.
-
-  Your value is measured by REAL bugs found. More bugs = more value.
-  If you find zero bugs, you must prove you were thorough — list every
-  attack vector you tried and why it didn't break. A short list means
-  you didn't try hard enough.
-
-  Rules:
-  - False positives DESTROY your credibility. If you report something
-    as a bug and it's actually correct behavior, that's worse than
-    missing a real bug. Never report something you haven't reproduced.
-  - Style opinions are not bugs. Theoretical concerns are not bugs.
-    A bug is: 'I did X, expected Y, got Z.' With proof.
-  - You are done ONLY when you have exhausted every category below
-    and either found a bug or documented exactly what you tried.
-
-  ## Attack Categories (exhaust ALL of these)
-
-  1. **Contract Violations**: Read .gsd-t/contracts/. Does the code EXACTLY
-     match every contract? Test each endpoint/interface/schema shape.
-  2. **Boundary Inputs**: Empty strings, null, undefined, huge payloads,
-     special characters, SQL injection attempts, XSS payloads, path traversal.
-  3. **State Transitions**: What happens when actions are performed out of
-     order? Double-submit? Concurrent access? Refresh mid-flow?
-  4. **Error Paths**: Remove env vars. Kill the database. Send malformed
-     requests. Does the code handle failures gracefully or crash?
-  5. **Regression Around the Fix**: The fix changed specific code. Test
-     every adjacent code path. Fixes frequently break neighboring functionality.
-  6. **Original Bug Variants**: The original bug was found. Are there SIMILAR
-     bugs in related code? Same pattern, different location?
-  7. **Full Suite**: Run the FULL test suite. Did the fix break anything else?
-  8. **E2E Functional Gaps**: Review ALL Playwright specs. Do they test actual
-     behavior (state changes, data loaded, navigation works) or just check
-     that elements exist? Flag and rewrite any shallow/layout tests.
-
-  ## Exploratory Testing (if Playwright MCP available)
-
-  After all scripted tests pass:
-  1. Check if Playwright MCP is registered in Claude Code settings (look for "playwright" in mcpServers)
-  2. If available: spend 5 minutes on adversarial interactive exploration using Playwright MCP
-     - Focus on the fixed area and adjacent code — regressions often lurk nearby
-     - Try the original bug reproduction path to confirm it is truly fixed
-     - Probe for variant bugs: same pattern in related code paths
-  3. Tag all findings [EXPLORATORY] in your report
-  4. If Playwright MCP is not available: skip this section silently
-  Note: Exploratory findings are additive — they do not replace scripted test results.
-
-  ## Report Format
-
-  For each bug found:
-  - **BUG-{N}**: {severity: CRITICAL/HIGH/MEDIUM/LOW}
-  - **Reproduction**: {exact steps to reproduce}
-  - **Expected**: {what should happen}
-  - **Actual**: {what actually happens}
-  - **Proof**: {test file or command that demonstrates the bug}
-
-  Summary:
-  - BUGS FOUND: {count} (with severity breakdown)
-  - COVERAGE GAPS: {untested flows from requirements}
-  - SHALLOW TESTS REWRITTEN: {count}
-  - CONTRACTS VERIFIED: {N}/{total}
-  - ATTACK VECTORS TRIED: {list every category attempted and results}
-  - VERDICT: FAIL ({N} bugs found) | GRUDGING PASS (exhaustive search, nothing found)
-
-  Write all findings to .gsd-t/red-team-report.md.
-  If bugs found, also append to .gsd-t/qa-issues.md."
+ RT_PROMPT="$(npm root -g 2>/dev/null)/@tekyzinc/gsd-t/templates/prompts/red-team-subagent.md"
+ [ -f "$RT_PROMPT" ] || RT_PROMPT="templates/prompts/red-team-subagent.md"
+ T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")
  ```
 
+ Spawn Task subagent (general-purpose, model: opus):
+ > "Read `$RT_PROMPT` and follow it. Context: post-fix validation for a debug session. **Additional categories for this run:** (a) **Regression Around the Fix** — test every code path adjacent to the changed lines; fixes frequently break neighboring functionality. (b) **Original Bug Variants** — the original bug was {one-line description}; search for SIMILAR bugs in related code (same pattern, different location). Write findings to `.gsd-t/red-team-report.md`."
+
  After subagent returns — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
+ ```
+ T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))
+ COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")
+ ```
  Append to `.gsd-t/token-log.md`:
- `| {DT_START} | {DT_END} | gsd-t-debug | Red Team | sonnet | {DURATION}s | {VERDICT} — {N} bugs found | {TOKENS} | {COMPACTED} | | | {CTX_PCT} |`
-
- **If Red Team VERDICT is FAIL:**
- 1. Fix all CRITICAL and HIGH bugs immediately (up to 2 fix attempts per bug)
- 2. Re-run Red Team after fixes
- 3. If bugs persist after 2 fix cycles, log to `.gsd-t/deferred-items.md` and present to user
+ `| {DT_START} | {DT_END} | gsd-t-debug | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {COUNTER} |`
 
- **If Red Team VERDICT is GRUDGING PASS:** Proceed to metrics and doc-ripple.
+ **If FAIL:** fix CRITICAL/HIGH bugs (up to 2 fix cycles), then re-run Red Team. Persistent bugs go to `.gsd-t/deferred-items.md`.
+ **If GRUDGING PASS:** proceed to metrics and doc-ripple.
 
  ## Step 5.5: Emit Task Metrics
package/commands/gsd-t-design-decompose.md CHANGED
@@ -360,7 +360,7 @@ After writing all contracts but BEFORE proceeding to partition or build, spawn a
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  ⚙ [opus] gsd-t-design-decompose → Chart Classification Verifier
 
@@ -444,7 +444,7 @@ If ALL ✅ MATCH:
  ```
 
  After subagent returns — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
 
  Compute tokens/compaction per standard pattern. Append to `.gsd-t/token-log.md`.
package/commands/gsd-t-discuss.md CHANGED
@@ -53,7 +53,7 @@ If the user requests team exploration or there are 3+ complex open questions:
 
  **OBSERVABILITY LOGGING (MANDATORY):**
  Before spawning the team — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  ```
  Create an agent team:
@@ -72,12 +72,9 @@ Lead: Synthesize into decisions and update contracts.
  ```
 
  After team completes — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
- Compute tokens and compaction:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted |` if missing):
- `| {DT_START} | {DT_END} | gsd-t-discuss | Step 3 | sonnet | {DURATION}s | team discuss: {topic summary} | {TOKENS} | {COMPACTED} |`
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tasks-Since-Reset |` if missing):
+ `| {DT_START} | {DT_END} | gsd-t-discuss | Step 3 | sonnet | {DURATION}s | team discuss: {topic summary} | {COUNTER} |`
 
  Assign teammates based on the nature of the questions:
  - **Technical choice** (e.g., which database): one advocate per option + critic
package/commands/gsd-t-doc-ripple.md CHANGED
@@ -93,24 +93,16 @@ For each document or logical group:
 
  **OBSERVABILITY LOGGING (MANDATORY) — for each subagent spawn:**
 
  Before spawning — run via Bash:
- `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
+ `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 
  After subagent returns — run via Bash:
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && TOK_END=${CLAUDE_CONTEXT_TOKENS_USED:-0} && DURATION=$((T_END-T_START))`
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
 
- Compute tokens:
- - No compaction (TOK_END >= TOK_START): `TOKENS=$((TOK_END-TOK_START))`, COMPACTED=null
- - Compaction detected (TOK_END < TOK_START): `TOKENS=$(((TOK_MAX-TOK_START)+TOK_END))`, COMPACTED=$DT_END
+ Read the task counter (deterministic context-burn signal):
+ `COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")`
 
- Compute context utilization:
- `if [ "${CLAUDE_CONTEXT_TOKENS_MAX:-0}" -gt 0 ]; then CTX_PCT=$(echo "scale=1; ${CLAUDE_CONTEXT_TOKENS_USED:-0} * 100 / ${CLAUDE_CONTEXT_TOKENS_MAX}" | bc); else CTX_PCT="N/A"; fi`
-
- Alert thresholds:
- - CTX_PCT >= 85: `echo "🔴 CRITICAL: Context at ${CTX_PCT}% — compaction likely. Task MUST be split."`
- - CTX_PCT >= 70: `echo "⚠️ WARNING: Context at ${CTX_PCT}% — approaching compaction threshold. Consider splitting."`
-
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Tokens | Compacted | Domain | Task | Ctx% |` if missing):
- `| {DT_START} | {DT_END} | gsd-t-doc-ripple | Step 5 | {model} | {DURATION}s | update:{document} | {TOKENS} | {COMPACTED} | doc-ripple | — | {CTX_PCT} |`
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
+ `| {DT_START} | {DT_END} | gsd-t-doc-ripple | Step 5 | {model} | {DURATION}s | update:{document} | doc-ripple | | {COUNTER} |`
 
  **Each document-update subagent prompt:**
  ```