npm - @tekyzinc/gsd-t - Versions diffs - 2.74.13 → 2.76.10 - Mend

@tekyzinc/gsd-t 2.74.13 → 2.76.10


Files changed (61)

package/CHANGELOG.md +116 -0
package/README.md +71 -1
package/bin/advisor-integration.js +93 -0
package/bin/check-headless-sessions.js +140 -0
package/bin/context-meter-config.cjs +101 -0
package/bin/context-meter-config.test.cjs +101 -0
package/bin/gsd-t.js +709 -16
package/bin/headless-auto-spawn.js +290 -0
package/bin/model-selector.js +224 -0
package/bin/runway-estimator.js +242 -0
package/bin/token-budget.js +96 -89
package/bin/token-optimizer.js +471 -0
package/bin/token-telemetry.js +246 -0
package/commands/gsd-t-audit.md +3 -3
package/commands/gsd-t-backlog-list.md +38 -0
package/commands/gsd-t-brainstorm.md +3 -3
package/commands/gsd-t-complete-milestone.md +24 -0
package/commands/gsd-t-debug.md +124 -7
package/commands/gsd-t-discuss.md +10 -3
package/commands/gsd-t-doc-ripple.md +32 -4
package/commands/gsd-t-execute.md +107 -52
package/commands/gsd-t-help.md +19 -0
package/commands/gsd-t-integrate.md +67 -4
package/commands/gsd-t-optimization-apply.md +91 -0
package/commands/gsd-t-optimization-reject.md +94 -0
package/commands/gsd-t-partition.md +7 -0
package/commands/gsd-t-pause.md +3 -0
package/commands/gsd-t-plan.md +10 -3
package/commands/gsd-t-prd.md +3 -3
package/commands/gsd-t-quick.md +71 -9
package/commands/gsd-t-reflect.md +3 -7
package/commands/gsd-t-resume.md +36 -0
package/commands/gsd-t-status.md +31 -0
package/commands/gsd-t-test-sync.md +7 -0
package/commands/gsd-t-verify.md +12 -5
package/commands/gsd-t-visualize.md +3 -7
package/commands/gsd-t-wave.md +82 -18
package/docs/GSD-T-README.md +52 -0
package/docs/architecture.md +95 -0
package/docs/infrastructure.md +117 -0
package/docs/methodology.md +36 -0
package/docs/prd-harness-evolution.md +51 -37
package/docs/requirements.md +66 -0
package/package.json +1 -1
package/scripts/context-meter/count-tokens-client.js +221 -0
package/scripts/context-meter/count-tokens-client.test.js +308 -0
package/scripts/context-meter/test-injector.js +55 -0
package/scripts/context-meter/threshold.js +88 -0
package/scripts/context-meter/threshold.test.js +255 -0
package/scripts/context-meter/transcript-parser.js +252 -0
package/scripts/context-meter/transcript-parser.test.js +320 -0
package/scripts/gsd-t-context-meter.e2e.test.js +415 -0
package/scripts/gsd-t-context-meter.js +350 -0
package/scripts/gsd-t-context-meter.test.js +417 -0
package/scripts/gsd-t-heartbeat.js +2 -2
package/scripts/gsd-t-statusline.js +23 -8
package/templates/CLAUDE-global.md +5 -1
package/templates/CLAUDE-project.md +26 -6
package/templates/context-meter-config.json +10 -0
package/templates/prompts/README.md +1 -1
package/bin/task-counter.cjs +0 -161

package/commands/gsd-t-execute.md CHANGED

@@ -2,17 +2,86 @@
 You are the lead agent coordinating task execution across domains. Choose solo or team mode based on the plan.
-## Step 0: Reset Task-Count Gate (MANDATORY — first thing in a fresh session)
+## Model Assignment
+Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0. Selection is deterministic via `bin/model-selector.js` — never runtime-overridden by context pressure.
+- **Default**: `sonnet` — routine task execution (`selectModel({phase: "execute"})`). Sonnet is the M35 routine tier.
+- **Mechanical subroutines** (demote to `haiku`):
+  - Test runners (`selectModel({phase: "execute", task_type: "test_runner"})`)
+  - Branch guards (`selectModel({phase: "execute", task_type: "branch_guard"})`)
+  - File-existence checks (`selectModel({phase: "execute", task_type: "file_check"})`)
+- **QA subagent (Step 2)**: `sonnet` — evaluation needs judgment per M31 tier refinement (`selectModel({phase: "execute", task_type: "qa"})`).
+- **Red Team (Step 5.5)**: `opus` — adversarial reasoning benefits most from top tier (`selectModel({phase: "execute", task_type: "red_team"})`).
+- **Escalation points**: at any declared high-stakes sub-decision (cross-module refactor, contract design, security-boundary change), invoke the convention-based `/advisor` fallback from `bin/advisor-integration.js`. If the `/advisor` tool is unavailable, the caller proceeds at the assigned model and logs a missed escalation to `.gsd-t/token-log.md` (see `.gsd-t/M35-advisor-findings.md`). Never silently downgrade the model or skip Red Team / doc-ripple under context pressure — M35 removed that behavior.
+## Per-Spawn Token Bracket (MANDATORY — wrap EVERY Task subagent spawn)
+Per `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0. Every Task subagent spawn below (Step 2 QA, Step 3 domain dispatcher, Step 5.25 Design Verification, Step 5.5 Red Team, Step 7 doc-ripple) **MUST** be wrapped in this token bracket so `.gsd-t/token-metrics.jsonl` gets one record per spawn. This is additive — the existing OBSERVABILITY LOGGING blocks in each spawn site are preserved unmodified alongside this bracket.
+**Before each spawn — read starting context tokens:**
+```bash
+T0_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
+T0_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
+```
+**After each spawn — record the bracket:**
+```bash
+T1_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
+T1_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
+node -e "require('./bin/token-telemetry.js').recordSpawn({timestamp:new Date().toISOString(),milestone:process.env.GSD_T_MILESTONE||'',command:'gsd-t-execute',phase:'${PHASE:-execute}',step:'${STEP:-}',domain:'${DOMAIN:-}',domain_type:'${DOMAIN_TYPE:-}',task:'${TASK:-}',model:'${MODEL:-sonnet}',duration_s:${DURATION:-0},input_tokens_before:${T0_TOKENS},input_tokens_after:${T1_TOKENS},tokens_consumed:${T1_TOKENS}-${T0_TOKENS},context_window_pct_before:${T0_PCT},context_window_pct_after:${T1_PCT},outcome:'${OUTCOME:-success}',halt_type:${HALT_TYPE:-null},escalated_via_advisor:${ESCALATED_VIA_ADVISOR:-false}})" 2>/dev/null || true
+```
+The bracket is additive to the existing `.gsd-t/token-log.md` OBSERVABILITY LOGGING rows. Both sinks coexist — token-log.md is human-readable with context percentage, token-metrics.jsonl is machine-readable with the full 18-field schema for `gsd-t metrics --tokens/--halts/--context-window` aggregation.
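Reviewer sketch of the bracket arithmetic above: a minimal illustration, not the contents of `bin/token-telemetry.js`. The helper name `buildSpawnRecord` and the shape of its arguments are assumptions made for this example. Only the field names are taken from the `recordSpawn` call in the diff, and only a subset of them is shown.

```javascript
// Hypothetical helper, NOT part of bin/token-telemetry.js.
// Builds one spawn record from the readings taken before and after a spawn.
function buildSpawnRecord(before, after, meta) {
  return {
    timestamp: new Date().toISOString(),
    ...meta,
    input_tokens_before: before.tokens,
    input_tokens_after: after.tokens,
    // The bracket's core value: how many input tokens the spawn added.
    tokens_consumed: after.tokens - before.tokens,
    context_window_pct_before: before.pct,
    context_window_pct_after: after.pct,
  };
}

// Illustrative readings only; real values would come from the state file.
const record = buildSpawnRecord(
  { tokens: 42000, pct: 21 },
  { tokens: 51500, pct: 25.75 },
  { command: 'gsd-t-execute', phase: 'execute', model: 'sonnet', outcome: 'success' }
);

// One JSON object per line, as a .jsonl sink would expect.
console.log(JSON.stringify(record));
```

Because the diff's shell version falls back to `0` for both readings when the state file is unreadable, a record with `tokens_consumed` of `0` may mean "no measurement" rather than "no cost".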
+## Step 0: Runway Check (MANDATORY — before any other work in a fresh session)
+Run via Bash. Count the `remaining_tasks` from the unblocked task list (Step 1 reads `.gsd-t/domains/*/tasks.md`), or use a conservative estimate of 5 if the count is unknown yet:
+```bash
+node -e "
+const r = require('./bin/runway-estimator.js').estimateRunway({
+  command: 'gsd-t-execute',
+  domain_type: '{DOMAIN_TYPE}',
+  remaining_tasks: {N},
+  projectDir: '.'
+});
+console.log(JSON.stringify(r, null, 2));
+if (!r.can_start) {
+  console.log('⛔ Insufficient runway — projected ' + r.projected_end_pct + '% (current ' + r.current_pct + '%, ' + r.pct_per_task + '%/task, ' + r.confidence + ' confidence, ' + r.confidence_basis + ' records)');
+  console.log('Auto-spawning headless to continue in a fresh context.');
+  const s = require('./bin/headless-auto-spawn.js').autoSpawnHeadless({
+    command: 'gsd-t-execute', args: [], continue_from: '.'
+  });
+  console.log('Session ID: ' + s.id);
+  console.log('Status: tail ' + s.logPath);
+  console.log('');
+  console.log('Your interactive session remains idle — you can use it for other work.');
+  console.log('You will be notified when the headless run completes.');
+  process.exit(0);
+}
+"
+```
+If `can_start === false`, the Step 0 block above has already spawned the headless continuation and exited. The interactive session stops here — do NOT proceed to Step 0.1. If the command continues past Step 0, `can_start === true` and runway is sufficient.
+**Contract**: `.gsd-t/contracts/runway-estimator-contract.md` v1.0.0 defines the decision-object shape and the refusal banner format. The stop threshold (85%) mirrors `.gsd-t/contracts/token-budget-contract.md` v3.0.0.
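Reviewer sketch of how the refusal banner's numbers could relate to each other. The diff does not show the body of `estimateRunway`, so the linear projection below is an assumption, as are the function name `projectRunway` and its camelCase parameters. The output field names and the 85% stop threshold are the ones quoted in the diff.

```javascript
// Stop threshold quoted in the diff (85%).
const STOP_THRESHOLD_PCT = 85;

// Hypothetical stand-in for estimateRunway(); the linear model is an assumption.
function projectRunway({ currentPct, pctPerTask, remainingTasks }) {
  // Assume each remaining task costs the same share of the context window.
  const projectedEndPct = currentPct + pctPerTask * remainingTasks;
  return {
    current_pct: currentPct,
    pct_per_task: pctPerTask,
    projected_end_pct: projectedEndPct,
    // Start only if the projection stays below the stop band.
    can_start: projectedEndPct < STOP_THRESHOLD_PCT,
  };
}

// 30% used, 8% per task, 5 tasks left: projects to 70%, so the run may start.
const roomy = projectRunway({ currentPct: 30, pctPerTask: 8, remainingTasks: 5 });

// 60% used with the same per-task cost: projects to 100%, so the run is refused.
const tight = projectRunway({ currentPct: 60, pctPerTask: 8, remainingTasks: 5 });

console.log(roomy.can_start, tight.can_start);
```

Under this assumed model, the conservative default of 5 remaining tasks only matters when the per-task cost is high enough to push the projection past 85%.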
+## Step 0.1: Verify Context Gate Readiness (MANDATORY — first thing in a fresh session)
 Run via Bash:
 ```bash
-node bin/task-counter.cjs reset
+node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); console.log(JSON.stringify(s));"
 ```
-This clears `.gsd-t/.task-counter` so the new session starts at 0. The reset is the SIGNAL that this is a clean post-`/clear` orchestrator. Do this exactly ONCE per `/user:gsd-t-execute` invocation, immediately on entry. The gate logic is in Step 3.5; do NOT skip it. If `bin/task-counter.cjs` is missing in this project, `npm install` it via `gsd-t install` then retry — the gate is required.
+This calls `getSessionStatus()` (v2.0.0) which reads `.gsd-t/.context-meter-state.json` produced by the Context Meter PostToolUse hook. If the state file is fresh (timestamp within 5 min), you get real `pct` and `threshold` values; if missing or stale, the call falls back to the historical heuristic from `.gsd-t/token-log.md`.
+Use the returned `threshold` as the gate signal for the rest of this run. The gate logic is in Step 3.5; do NOT skip it. If the Context Meter hook isn't installed (`.gsd-t/.context-meter-state.json` missing and doctor reports it), run `gsd-t doctor` to diagnose — the gate still works via the heuristic fallback but real-time readings give much better guardrails.
-Why: every `/user:gsd-t-execute` invocation is a fresh orchestrator session. Without the reset, the counter from the previous session would still be at the limit and the gate would refuse to spawn anything. Reset is the only acceptable way to advance the counter back to 0.
+Why: every `/user:gsd-t-execute` invocation is a fresh orchestrator session and needs a current reading of context utilization before spawning any subagents. The authoritative source is the Context Meter state file; the fallback keeps the gate functional on projects that haven't installed the hook yet.
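Reviewer sketch of the freshness rule described in Step 0.1 (state file timestamp within 5 minutes). The diff does not show the state file's schema, so the `timestamp` field name and its ISO-8601 format are assumptions, and `isFresh` is a hypothetical helper rather than a function in `bin/token-budget.js`.

```javascript
// Freshness window quoted in the diff: 5 minutes.
const MAX_AGE_MS = 5 * 60 * 1000;

// Hypothetical check; assumes the state object carries an ISO-8601 `timestamp`.
function isFresh(state, nowMs = Date.now()) {
  if (!state || typeof state.timestamp !== 'string') return false;
  const writtenMs = Date.parse(state.timestamp);
  if (Number.isNaN(writtenMs)) return false; // unparseable timestamp counts as stale
  return nowMs - writtenMs <= MAX_AGE_MS;
}

// Fixed clock so the example is deterministic.
const now = Date.parse('2026-04-13T12:00:00Z');

const recent = isFresh({ timestamp: '2026-04-13T11:57:00Z' }, now); // 3 minutes old
const stale = isFresh({ timestamp: '2026-04-13T11:50:00Z' }, now);  // 10 minutes old
const missing = isFresh(null, now);                                 // no state file

console.log(recent, stale, missing);
```

A caller following the diff's description would use the real `pct` when this returns `true` and fall back to the heuristic otherwise.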
 ## Step 1: Load State
@@ -112,23 +181,22 @@ Before spawning — run via Bash:
 After subagent returns — run via Bash:
 `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
-Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
-`| {DT_START} | {DT_END} | gsd-t-execute | task:{task-id} | sonnet | {DURATION}s | {pass/fail} | {domain-name} | task-{task-id} | {COUNTER} |`
+Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Ctx% |` if missing):
+`| {DT_START} | {DT_END} | gsd-t-execute | task:{task-id} | sonnet | {DURATION}s | {pass/fail} | {domain-name} | task-{task-id} | {CTX_PCT} |`
-Where `{COUNTER}` is the value returned by `node bin/task-counter.cjs status` (see Step 3.5). Note: the legacy `Tokens`, `Compacted`, and `Ctx%` columns were removed in v2.74.12 — Claude Code does not export `CLAUDE_CONTEXT_TOKENS_USED`/`_MAX`, so those columns always wrote zeros and the orchestrator self-check based on them was inert. The real burn signal is now `Tasks-Since-Reset`, which the task-counter gate in Step 3.5 enforces.
+Where `{CTX_PCT}` is the current `pct` value returned by `getSessionStatus()` (Step 3.5). As of v2.0.0 (M34), `pct` reads the **real** `input_tokens` count from `.gsd-t/.context-meter-state.json` — the count_tokens-based measurement produced by the Context Meter PostToolUse hook. When the state file is absent or stale, the fallback heuristic writes a best-effort percentage and this column reads `N/A` instead. The previous `Tasks-Since-Reset` column (v2.74.12) is retired.
 **For each domain (in wave order), run the domain task-dispatcher:**
 **Token Budget Check (before dispatching each domain's tasks):**
 Run via Bash:
-`node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); const d = tb.getDegradationActions(s.threshold, '.'); process.stdout.write(JSON.stringify({threshold: s.threshold, actions: d}));" 2>/dev/null`
+`node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); const d = tb.getDegradationActions(s.threshold, '.'); process.stdout.write(JSON.stringify({band: d.band, pct: d.pct, message: d.message}));" 2>/dev/null`
-Apply the result:
-- `threshold: 'normal'` or file missing → skip silently, proceed with standard model assignments
-- `threshold: 'downgrade'` → apply model overrides from `actions.modelOverride` (e.g., downgrade opus tasks to sonnet)
-- `threshold: 'conserve'` → checkpoint progress to `.gsd-t/progress.md` and skip non-essential operations (Red Team, doc-ripple) for this domain
-- `threshold: 'stop'` → checkpoint all progress, output: "Token budget exhausted — progress saved. Resume after session reset.", and halt execution for remaining domains
+Apply the result (three-band model per `token-budget-contract.md` v3.0.0 — never silently degrade quality):
+- `band: 'normal'` or file missing → proceed with standard model assignments
+- `band: 'warn'` (≥70%) → log the warning to `.gsd-t/token-log.md` and proceed at full quality; do NOT downgrade models or skip phases
+- `band: 'stop'` (≥85%) → checkpoint all progress, output: "Orchestrator context gate reached ({pct}%). Progress saved. Resume after session reset.", and halt execution for remaining domains. Runway estimator / headless auto-spawn will handle the handoff once they exist (m35-runway-estimator, m35-headless-auto-spawn).
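Reviewer sketch of the three-band mapping listed above, combined with the exit codes that Step 3.5 of this diff assigns to each band (`0`, `13`, `10`). The constant names match those the diff attributes to `bin/token-budget.js`, but `classifyBand` itself is a hypothetical function, and treating the boundaries as inclusive (`>=`) is an assumption based on the "≥70%" and "≥85%" wording.

```javascript
// Band boundaries as stated in the diff.
const WARN_THRESHOLD_PCT = 70;
const STOP_THRESHOLD_PCT = 85;

// Hypothetical classifier; maps a context percentage to a band and exit code.
function classifyBand(pct) {
  if (pct >= STOP_THRESHOLD_PCT) return { band: 'stop', exitCode: 10 };
  if (pct >= WARN_THRESHOLD_PCT) return { band: 'warn', exitCode: 13 };
  return { band: 'normal', exitCode: 0 };
}

// Print the band for a few sample readings, including both boundaries.
for (const pct of [12, 69.9, 70, 84.9, 85]) {
  const { band, exitCode } = classifyBand(pct);
  console.log(`${pct}% -> ${band} (exit ${exitCode})`);
}
```

Only the `stop` band changes control flow in the new text. The `warn` band is informational, which is the point of the "never silently degrade quality" rule.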
 **Pre-dispatch experience retrieval (before dispatching each domain's tasks):**
 Run via Bash:
@@ -232,7 +300,7 @@ For each task in `.gsd-t/domains/{domain-name}/tasks.md` (in order, skip complet
 1. Load prior summaries: Read up to 5 most recent `.gsd-t/domains/{domain-name}/task-*-summary.md` files (10-20 lines each)
 2. Load graph context (if `.gsd-t/graph/meta.json` exists): query task's files for relevant graph context
 3. Display: `⚙ [sonnet] gsd-t-execute → domain: {domain-name}, task-{task-id}`
-4. Run observability Bash (T_START / DT_START / TOK_START / TOK_MAX)
+4. Run observability Bash (T_START / DT_START)
 5. Spawn task subagent:
 ```
@@ -414,11 +482,11 @@ Report back:
 ```
 6. After task subagent returns:
-   - Run observability Bash (T_END / TOK_END / DURATION / CTX_PCT)
+   - Run observability Bash (T_END / DURATION / CTX_PCT)
    - Append to token-log.md (per-task row)
    - Alert on CTX_PCT thresholds (display to user inline)
    - **Emit task-metrics record** — run via Bash:
-     `node bin/metrics-collector.js --milestone {milestone} --domain {domain-name} --task task-{task-id} --command execute --duration_s $DURATION --tokens_used $TOKENS --context_pct ${CTX_PCT:-0} --pass {true|false} --fix_cycles {0|N} --signal_type {pass-through|fix-cycle} --notes "{brief outcome}" 2>/dev/null || true`
+     `node bin/metrics-collector.js --milestone {milestone} --domain {domain-name} --task task-{task-id} --command execute --duration_s $DURATION --tokens_used 0 --context_pct ${CTX_PCT:-0} --pass {true|false} --fix_cycles {0|N} --signal_type {pass-through|fix-cycle} --notes "{brief outcome}" 2>/dev/null || true`
      Signal type: `pass-through` if task passed on first attempt; `fix-cycle` if rework was needed.
    - **Emit task_complete event** — run via Bash:
      `node ~/.claude/scripts/gsd-t-event-writer.js --type task_complete --command gsd-t-execute --reasoning "signal_type={signal_type}, domain={domain-name}" --outcome {success|failure} || true`
@@ -461,7 +529,7 @@ Report back:
 6. **Per-domain Red Team** — invoke Step 5.5 (Red Team) NOW for this domain. This is the first place Red Team runs in v2.74.12 — there is no global post-execute Red Team anymore. If Red Team returns FAIL, fix bugs and re-run before proceeding to the next domain (max 2 fix-and-verify cycles); if bugs persist, log to `.gsd-t/deferred-items.md` and present to user.
-7. **Task-count gate re-check** — run `node bin/task-counter.cjs should-stop`. If exit code is `10`, follow the Step 3.5 STOP procedure now (do NOT spawn the next domain).
+7. **Context gate re-check** — run `node -e "const tb=require('./bin/token-budget.js'); const s=tb.getSessionStatus('.'); if(s.threshold==='stop')process.exit(10); if(s.threshold==='warn')process.exit(13);"`. If exit code is `10`, follow the Step 3.5 STOP procedure now (do NOT spawn the next domain). If exit code is `13`, log the warning and proceed at full quality for the next domain (no model overrides, no phase skips — quality is never silently degraded).
 ### Team Mode (when agent teams are enabled)
 Spawn teammates for domains within the same wave. Only domains in the same wave can run in parallel — do not spawn teammates for domains in different waves simultaneously. Each teammate uses the **domain task-dispatcher pattern** — one subagent per task within their domain (same as solo mode).
@@ -605,31 +673,28 @@ After all merges complete (whether all passed, some rolled back, or errors occur
 Cleanup is not optional — orphaned worktrees waste disk space and can confuse subsequent executions. Always run cleanup, even if earlier steps failed.
 ```
-## Step 3.5: Orchestrator Task-Count Gate (MANDATORY)
+## Step 3.5: Orchestrator Context Gate (MANDATORY)
-The orchestrator MUST check `bin/task-counter.cjs` BEFORE every task subagent spawn AND immediately AFTER every domain completes. This is the real context-burn guardrail. The previous version of this step relied on `CLAUDE_CONTEXT_TOKENS_USED`/`_MAX` env vars which Claude Code does not export — that check was inert and silently let the orchestrator drain context until forced compaction. The replacement below uses a deterministic on-disk task counter.
+The orchestrator MUST check `getSessionStatus()` BEFORE every task subagent spawn AND immediately AFTER every domain completes. This is the real context-burn guardrail. As of v2.0.0 (M34), `bin/token-budget.js` reads `.gsd-t/.context-meter-state.json` — the live count_tokens-based `input_tokens` measurement produced by the Context Meter PostToolUse hook. When the state file is fresh (timestamp within 5 min), thresholds reflect the ACTUAL context window utilization; when absent or stale, the call falls back to the historical heuristic from `.gsd-t/token-log.md`.
 **Before each task spawn — gate check:**
 ```bash
-node bin/task-counter.cjs should-stop
+node -e "const tb=require('./bin/token-budget.js'); const s=tb.getSessionStatus('.'); process.stdout.write(JSON.stringify(s)); if(s.threshold==='stop')process.exit(10); if(s.threshold==='warn')process.exit(13);"
 ```
-If the exit code is `10` (counter is at or past its limit), STOP immediately. Do NOT spawn the next task. Jump straight to the checkpoint/STOP procedure below.
-If the exit code is `0`, proceed to spawn the task.
-**After each task subagent returns — increment:**
+Exit code semantics (three-band model per `token-budget-contract.md` v3.0.0):
+- `0` → `normal` band (< 70% ctx). Proceed with standard model assignments.
+- `13` → `warn` band (70–85%). Log the warning to `.gsd-t/token-log.md` and proceed at full quality. **Never downgrade models or skip phases** — M35 removed that behavior intentionally. If the projected runway is insufficient, the runway estimator (m35-runway-estimator) will halt cleanly before reaching `stop`.
+- `10` → `stop` band (≥ 85%). STOP immediately. Do NOT spawn the next task. Jump straight to the STOP procedure below.
-```bash
-node bin/task-counter.cjs increment task
-```
+The JSON on stdout contains `{consumed, estimated_remaining, pct, threshold}` — capture `pct` as `{CTX_PCT}` for the token-log `Ctx%` column on the NEXT spawn.
-This prints a JSON status line like `{"count":3,"limit":5,"remaining":2,"should_stop":false,...}`. Use this status when writing the token-log row (the `Tasks-Since-Reset` column).
+**After each task subagent returns — re-check:**
-If `should_stop` is `true` after the increment, STOP after this task completes — even if more tasks remain in the current domain.
+Run the same command again. The fresh reading reflects post-task consumption (the Context Meter hook refreshes after each tool call). If the band crossed into `stop`, STOP after this task completes even if more tasks remain in the current domain.
-**STOP procedure (when `should_stop` is true):**
+**STOP procedure (when threshold === 'stop'):**
 1. **Save checkpoint to disk** — update `.gsd-t/progress.md` with:
    - Which domains are complete, which remain
@@ -637,30 +702,20 @@ If `should_stop` is `true` after the increment, STOP after this task completes
    - Last completed task id and the next pending task id
 2. **Instruct user**: Output exactly:
    ```
-   ⏸️ Orchestrator task-count gate reached ({count}/{limit} tasks in this session).
+   ⏸️ Orchestrator context gate reached ({pct}% of model window).
    Progress saved. Run `/clear` then `/user:gsd-t-execute` to continue from the next task.
    ```
-3. **STOP execution.** Do NOT spawn another task or domain subagent. The next session resumes from saved state. The first thing the resumed orchestrator does in Step 0 is run `node bin/task-counter.cjs reset` (see below).
-**Configuring the limit:**
-The default limit is 5 tasks per session — conservative, designed for the model+harness combination as of 2026-04-13. Override per-project via `.gsd-t/task-counter-config.json`:
+3. **STOP execution.** Do NOT spawn another task or domain subagent. The next session resumes from saved state with a fresh context window.
-```json
-{ "limit": 8 }
-```
+**Configuring threshold bands:**
-Or per-session via env var: `GSD_T_TASK_LIMIT=8 /user:gsd-t-execute`.
+Band boundaries (`warn=70`, `stop=85`) are defined in `bin/token-budget.js` (`WARN_THRESHOLD_PCT` / `STOP_THRESHOLD_PCT` constants) and documented in `.gsd-t/contracts/token-budget-contract.md` v3.0.0. The `modelWindowSize` used for the denominator comes from `.gsd-t/context-meter-config.json` (default `200000`). Override the window size there if running against a different model. There is no per-session env-var override — the real-time measurement supersedes the need for one.
 **On resume (Step 0 — first thing the orchestrator does in a fresh session):**
-```bash
-node bin/task-counter.cjs reset
-```
-This clears the counter so the new session starts fresh. The reset is the SIGNAL that this is a clean post-`/clear` session — never reset mid-session.
+Step 0 runs `getSessionStatus()` once for readiness confirmation. The reading should be fresh (the Context Meter hook fires on every tool call), so the gate immediately reflects the new session's starting pct — typically near 0 since `/clear` resets the conversation.
-This deterministic gate replaces the vaporware env-var check. It is fail-safe: if `bin/task-counter.cjs` is missing for any reason, the `should-stop` command exits non-zero (treated as STOP) rather than silently allowing unlimited spawns.
+This gate replaces the v2.74.12 task counter proxy and the (never-functional) v1.x env-var check. It is fail-safe: if `bin/token-budget.js` or the state file is unreadable for any reason, `getSessionStatus()` throws and the gate exits non-zero (treated as STOP) rather than silently allowing unlimited spawns.
 ## Step 4: Checkpoint Handling
@@ -736,9 +791,9 @@ and summary, and the full comparison table per the protocol's Step 7."
 ```
 After subagent returns — run via Bash:
-`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && COUNTER_JSON=$(node bin/task-counter.cjs status 2>/dev/null || echo '{}') && COUNTER=$(echo "$COUNTER_JSON" | node -e "let s=''; process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")`
+`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")`
 Append to `.gsd-t/token-log.md`:
-`| {DT_START} | {DT_END} | gsd-t-execute | Design Verify | opus | {DURATION}s | {VERDICT} — {MATCH}/{TOTAL} elements for {domain-name} | | | {COUNTER} |`
+`| {DT_START} | {DT_END} | gsd-t-execute | Design Verify | opus | {DURATION}s | {VERDICT} — {MATCH}/{TOTAL} elements for {domain-name} | | | {CTX_PCT} |`
 **Artifact Gate (MANDATORY):**
 After the Design Verification Agent returns, check `.gsd-t/contracts/design-contract.md`:
@@ -757,7 +812,7 @@ After the Design Verification Agent returns, check `.gsd-t/contracts/design-cont
 ## Step 5.5: Red Team — Adversarial QA (per-domain, MANDATORY)
-**IMPORTANT — frequency change in v2.74.12**: Red Team was promoted to per-task by commit `da6d3ae` on the assumption that the orchestrator would catch context drain via the `CLAUDE_CONTEXT_TOKENS_USED` self-check. That env var is never set by Claude Code, so the check was inert and the per-task spawning of ~10k-token Red Team subagents was the largest single contributor to the v2.74.x context-burn regression. Red Team is now run ONCE PER COMPLETED DOMAIN — call this step from the "After all tasks in a domain complete" block, not from a per-task hook.
+**IMPORTANT — frequency change in v2.74.12**: Red Team was promoted to per-task by commit `da6d3ae` on the assumption that the orchestrator would catch context drain via an environment-variable-based self-check. That env-var path was never populated by Claude Code, so the check was inert and the per-task spawning of ~10k-token Red Team subagents was the largest single contributor to the v2.74.x context-burn regression. Red Team is now run ONCE PER COMPLETED DOMAIN — call this step from the "After all tasks in a domain complete" block, not from a per-task hook.
 After all tasks in the CURRENT DOMAIN pass their tests, spawn an adversarial Red Team agent. Its sole purpose is to BREAK the domain that was just built.
@@ -789,9 +844,9 @@ attack categories exhausted, and the path to the written
 ```
 After subagent returns — run via Bash:
-`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")`
+`T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")`
 Append to `.gsd-t/token-log.md`:
-`| {DT_START} | {DT_END} | gsd-t-execute | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found in {domain-name} | | | {COUNTER} |`
+`| {DT_START} | {DT_END} | gsd-t-execute | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found in {domain-name} | | | {CTX_PCT} |`
 **If Red Team VERDICT is FAIL:**
 1. Fix all CRITICAL and HIGH bugs immediately (up to 2 fix attempts per bug)

package/commands/gsd-t-help.md CHANGED

@@ -85,6 +85,11 @@ BACKLOG                                                                Manual
   backlog-promote     Refine, classify, and launch GSD-T workflow
   backlog-settings    Manage types, apps, categories, and defaults
+OPTIMIZATION                                                           Manual
+───────────────────────────────────────────────────────────────────────────────
+  optimization-apply  Promote a pending token-optimizer recommendation
+  optimization-reject Dismiss a recommendation with optional reason + cooldown
 ───────────────────────────────────────────────────────────────────────────────
 Type /user:gsd-t-help {command} for detailed help on any command.
 Example: /user:gsd-t-help impact
@@ -456,6 +461,20 @@ Use these when user asks for help on a specific command:
 - **Files**: `.gsd-t/backlog-settings.md`
 - **Use when**: Customizing the classification dimensions for your project
+### optimization-apply
+- **Summary**: Promote a pending token-optimizer recommendation from `.gsd-t/optimization-backlog.md`
+- **Auto-invoked**: No
+- **Files**: `.gsd-t/optimization-backlog.md`, `.gsd-t/progress.md`, `.gsd-t/token-log.md`
+- **Usage**: `/user:gsd-t-optimization-apply {ID}`
+- **Use when**: A recommendation looks correct and you want to act on it — offers a quick-task or full backlog-promote path
+### optimization-reject
+- **Summary**: Dismiss a recommendation with an optional reason; sets a 5-milestone cooldown
+- **Auto-invoked**: No
+- **Files**: `.gsd-t/optimization-backlog.md`, `.gsd-t/progress.md`, `.gsd-t/token-log.md`
+- **Usage**: `/user:gsd-t-optimization-reject {ID} [--reason "text"]`
+- **Use when**: A recommendation is wrong or premature — prevents the same signal from re-surfacing for 5 milestones
 ## Unknown Command
 If user asks for help on unrecognized command:

package/commands/gsd-t-integrate.md CHANGED

@@ -2,6 +2,69 @@
 You are the lead agent performing integration work. This phase is ALWAYS single-session — one agent with full context across all domains to handle the seams.
+## Model Assignment
+Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0.
+- **Default**: `sonnet` (`selectModel({phase: "integrate"})`) — integration wiring is routine coordination.
+- **Mechanical subroutines** (demote to `haiku`): integration test runners.
+- **Red Team**: `opus` — adversarial QA at integration seams always runs at top tier.
+- **Escalation**: `/advisor` convention-based fallback from `bin/advisor-integration.js` when a seam reveals a contract gap or security boundary. Never silently downgrade the model or skip Red Team / doc-ripple under context pressure — M35 removed that behavior.
+## Per-Spawn Token Bracket (MANDATORY — wrap EVERY Task subagent spawn)
+Per `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0. Every Task subagent spawn below **MUST** be wrapped in this token bracket so `.gsd-t/token-metrics.jsonl` gets one record per spawn. This is additive — the existing OBSERVABILITY LOGGING blocks in each spawn site are preserved unmodified alongside this bracket.
+**Before each spawn — read starting context tokens:**
+```bash
+T0_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
+T0_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
+```
+**After each spawn — record the bracket:**
+```bash
+T1_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
+T1_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
+node -e "require('./bin/token-telemetry.js').recordSpawn({timestamp:new Date().toISOString(),milestone:process.env.GSD_T_MILESTONE||'',command:'gsd-t-integrate',phase:'integrate',step:'${STEP:-}',domain:'${DOMAIN:-}',domain_type:'${DOMAIN_TYPE:-}',task:'${TASK:-}',model:'${MODEL:-sonnet}',duration_s:${DURATION:-0},input_tokens_before:${T0_TOKENS},input_tokens_after:${T1_TOKENS},tokens_consumed:${T1_TOKENS}-${T0_TOKENS},context_window_pct_before:${T0_PCT},context_window_pct_after:${T1_PCT},outcome:'${OUTCOME:-success}',halt_type:${HALT_TYPE:-null},escalated_via_advisor:${ESCALATED_VIA_ADVISOR:-false}})" 2>/dev/null || true
+```
+The bracket is additive to the existing `.gsd-t/token-log.md` OBSERVABILITY LOGGING rows. Both sinks coexist.
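For reference, the record assembled by the inline `recordSpawn` call above can be sketched as a plain function. This is an illustrative sketch, not code from the package: `buildSpawnRecord` and its `before`/`after`/`meta` parameters are hypothetical names, and only the field names are taken from the call above.

```javascript
// Hypothetical helper (not part of the package): assembles one token-bracket
// record from the before/after readings. Field names mirror the recordSpawn
// call in the hunk above; defaults mirror its ${VAR:-default} fallbacks.
function buildSpawnRecord(before, after, meta = {}) {
  return {
    timestamp: new Date().toISOString(),
    milestone: process.env.GSD_T_MILESTONE || '',
    command: meta.command || 'gsd-t-integrate',
    phase: meta.phase || 'integrate',
    step: meta.step || '',
    domain: meta.domain || '',
    domain_type: meta.domain_type || '',
    task: meta.task || '',
    model: meta.model || 'sonnet',
    duration_s: meta.duration_s || 0,
    input_tokens_before: before.tokens,
    input_tokens_after: after.tokens,
    // The bracket's key figure: tokens added to the context by this spawn.
    tokens_consumed: after.tokens - before.tokens,
    context_window_pct_before: before.pct,
    context_window_pct_after: after.pct,
    outcome: meta.outcome || 'success',
    halt_type: meta.halt_type ?? null,
    escalated_via_advisor: meta.escalated_via_advisor ?? false,
  };
}

// Example readings (made-up numbers for illustration only).
const rec = buildSpawnRecord(
  { tokens: 41000, pct: 20 },
  { tokens: 47500, pct: 23 },
  { step: 'Step 5', model: 'haiku', duration_s: 42 }
);
console.log(rec.tokens_consumed); // 6500
```

Building the object in JavaScript rather than through shell interpolation means string fields need no hand-quoting. In the inline version, `HALT_TYPE` is interpolated unquoted, so a string value appears to need its own quotes to parse as JavaScript.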
+## Step 0: Runway Check (MANDATORY — before any other work in a fresh session)
+Count the integration wiring seams in `.gsd-t/contracts/integration-points.md` as `remaining_tasks` (conservative estimate = integration-points section count). Substitute that count for `{N}` below, then run via Bash:
+```bash
+node -e "
+const r = require('./bin/runway-estimator.js').estimateRunway({
+  command: 'gsd-t-integrate',
+  domain_type: '',
+  remaining_tasks: {N},
+  projectDir: '.'
+});
+console.log(JSON.stringify(r, null, 2));
+if (!r.can_start) {
+  console.log('⛔ Insufficient runway — projected ' + r.projected_end_pct + '% (current ' + r.current_pct + '%, ' + r.pct_per_task + '%/task, ' + r.confidence + ' confidence, ' + r.confidence_basis + ' records)');
+  console.log('Auto-spawning headless to continue in a fresh context.');
+  const s = require('./bin/headless-auto-spawn.js').autoSpawnHeadless({
+    command: 'gsd-t-integrate', args: [], continue_from: '.'
+  });
+  console.log('Session ID: ' + s.id);
+  console.log('Status: tail ' + s.logPath);
+  console.log('');
+  console.log('Your interactive session remains idle — you can use it for other work.');
+  console.log('You will be notified when the headless run completes.');
+  process.exit(0);
+}
+"
+```
+If `can_start === false`, the headless continuation has been spawned and the interactive session must stop here. Do NOT proceed to Step 1.
+**Contract**: `.gsd-t/contracts/runway-estimator-contract.md` v1.0.0; stop threshold (85%) mirrors `.gsd-t/contracts/token-budget-contract.md` v3.0.0.
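The fields printed by the runway check (`current_pct`, `pct_per_task`, `projected_end_pct`, `can_start`) suggest a projection against the 85% stop threshold. A minimal sketch of that idea follows, assuming a simple linear projection and a strict less-than comparison; the actual `estimateRunway` in `bin/runway-estimator.js` is not shown in this diff and may compute these differently. `projectRunway` is a hypothetical name.

```javascript
// Stop threshold taken from the contract line above.
const STOP_THRESHOLD_PCT = 85;

// Hypothetical linear projection (an assumption, not the package's algorithm):
// end-of-run context usage = current usage + per-task cost * remaining tasks.
function projectRunway({ current_pct, pct_per_task, remaining_tasks }) {
  const projected_end_pct = current_pct + pct_per_task * remaining_tasks;
  return {
    current_pct,
    pct_per_task,
    projected_end_pct,
    // Assumed rule: only start when the projection stays under the threshold.
    can_start: projected_end_pct < STOP_THRESHOLD_PCT,
  };
}

const roomy = projectRunway({ current_pct: 30, pct_per_task: 5, remaining_tasks: 6 });
const tight = projectRunway({ current_pct: 60, pct_per_task: 5, remaining_tasks: 6 });
console.log(roomy.projected_end_pct, roomy.can_start); // 60 true
console.log(tight.projected_end_pct, tight.can_start); // 90 false
```

In the second case the command text above would hand off to a headless run instead of proceeding to Step 1.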
 ## Step 1: Load Full State
 Read everything:
@@ -173,8 +236,8 @@ Before spawning — run via Bash:
 `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
 After subagent returns — run via Bash:
 `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
-Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
-`| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | | | {COUNTER} |`
+Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Ctx% |` if missing):
+`| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | | | {CTX_PCT} |`
 If QA found issues, append each to `.gsd-t/qa-issues.md` (create with header `| Date | Command | Step | Model | Duration(s) | Severity | Finding |` if missing):
 `| {DT_START} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {severity} | {finding} |`
@@ -229,10 +292,10 @@ Spawn Task subagent (general-purpose, model: opus):
 After subagent returns — run via Bash:
 ```
 T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))
-COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")
+CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")
 ```
 Append to `.gsd-t/token-log.md`:
-`| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {COUNTER} |`
+`| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {CTX_PCT} |`
 **If FAIL:** fix CRITICAL/HIGH bugs (≤2 cycles) → re-run. Persistent bugs → `.gsd-t/deferred-items.md`.
 **If GRUDGING PASS:** proceed to doc-ripple.

package/commands/gsd-t-optimization-apply.md ADDED

@@ -0,0 +1,91 @@
+# GSD-T: Optimization Apply — Promote a Recommendation
+Apply (promote) a pending recommendation from `.gsd-t/optimization-backlog.md`. Takes `$ARGUMENTS` as the recommendation ID (e.g., `M35-OPT-001`).
+Recommendations are produced by `bin/token-optimizer.js` at `complete-milestone` and are **never auto-applied**. This command is the user's deliberate promotion step.
+## Usage
+```
+/user:gsd-t-optimization-apply M35-OPT-001
+```
+## Step 0: Parse $ARGUMENTS
+Extract the recommendation ID from `$ARGUMENTS`. If empty, print:
+```
+Usage: /user:gsd-t-optimization-apply {ID}
+Example: /user:gsd-t-optimization-apply M35-OPT-001
+Run `/user:gsd-t-backlog-list --file optimization-backlog.md` to see pending recommendations.
+```
+Then exit.
+## Step 1: Load the recommendation
+```bash
+node -e "
+const opt = require('./bin/token-optimizer.js');
+const content = opt.readBacklog('.');
+const entries = opt.parseBacklog(content);
+const id = process.argv[1];
+const entry = entries.find(e => e.id === id);
+if (!entry) {
+  console.error('Recommendation not found: ' + id);
+  console.error('Run /user:gsd-t-backlog-list --file optimization-backlog.md');
+  process.exit(1);
+}
+console.log(JSON.stringify(entry, null, 2));
+" {ID}
+```
+## Step 2: Idempotency check
+- If `Status: promoted` → print "Already promoted. No action taken." and exit cleanly.
+- If `Status: rejected` → print "This recommendation was rejected. Use `/user:gsd-t-optimization-reject --reason` to update, or wait out the cooldown." and exit.
+- If `Status: pending` → proceed to Step 3.
+## Step 3: Present the recommendation + promotion options
+Print the recommendation's metadata (Type, Evidence, Proposed change, Risk, Projected savings) and offer two promotion paths:
+1. **Quick task** (recommended for small changes): `/user:gsd-t-quick "{proposed_change}"`
+2. **Full backlog entry** (recommended for larger work): `/user:gsd-t-backlog-promote` so it flows through the normal milestone pipeline.
+At Autonomy Level 3: automatically choose option 1 (quick task) unless the recommendation Type is `investigate` (which warrants a full backlog entry since the scope is not yet defined).
+## Step 4: Mark the entry as promoted
+```bash
+node -e "
+const opt = require('./bin/token-optimizer.js');
+let content = opt.readBacklog('.');
+content = opt.setRecommendationStatus(content, process.argv[1], {
+  status: 'promoted'
+});
+opt.writeBacklog('.', content);
+console.log('Marked ' + process.argv[1] + ' as promoted.');
+" {ID}
+```
+## Step 5: Observability logging
+Append a line to `.gsd-t/token-log.md` documenting the promotion. This command does not spawn a subagent, so there is no model to record; put `manual` in the Model column:
+```bash
+DT=$(date +"%Y-%m-%d %H:%M")
+printf "| %s | %s | %s | %s | %s | %s | %s | %s | %s | %s |\n" \
+  "$DT" "$DT" "gsd-t-optimization-apply" "Step 4" "manual" "0s" "promoted {ID}" "" "" "" \
+  >> .gsd-t/token-log.md
+```
+## Step 6: Pre-Commit Gate
+This command modifies `.gsd-t/optimization-backlog.md`. Add a Decision Log entry to `.gsd-t/progress.md`:
+```
+- YYYY-MM-DD HH:MM: Promoted optimization recommendation {ID} — {summary}
+```
+## Contract References
+- `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0 (source of truth for the optimization-backlog flow)
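
The Step 2 idempotency check in `gsd-t-optimization-apply.md` amounts to a small guard over the entry's status. The sketch below is illustrative only: `checkPromotable` is a hypothetical name, and it assumes parsed entries expose the backlog's `Status` field as `status`, which this diff does not confirm.

```javascript
// Hypothetical guard mirroring Step 2 of gsd-t-optimization-apply.
// Assumption: parseBacklog entries expose the "Status" field as `status`.
function checkPromotable(entry) {
  switch (entry.status) {
    case 'pending':
      // Only pending recommendations move on to Step 3.
      return { proceed: true, message: '' };
    case 'promoted':
      // Re-running the command is a no-op, so it is safe to repeat.
      return { proceed: false, message: 'Already promoted. No action taken.' };
    case 'rejected':
      return { proceed: false, message: 'This recommendation was rejected.' };
    default:
      // Not covered by the command text; treated as a refusal in this sketch.
      return { proceed: false, message: 'Unknown status: ' + entry.status };
  }
}

const results = ['pending', 'promoted', 'rejected'].map(
  (status) => checkPromotable({ id: 'M35-OPT-001', status }).proceed
);
console.log(results); // [ true, false, false ]
```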

package/commands/gsd-t-optimization-reject.md ADDED

@@ -0,0 +1,94 @@
+# GSD-T: Optimization Reject — Dismiss a Recommendation
+Reject (dismiss) a pending recommendation from `.gsd-t/optimization-backlog.md` with an optional reason. Sets a 5-milestone cooldown so the same signal doesn't re-surface immediately.
+Takes `$ARGUMENTS` as `{ID} [--reason "text"]`.
+## Usage
+```
+/user:gsd-t-optimization-reject M35-OPT-001
+/user:gsd-t-optimization-reject M35-OPT-001 --reason "test-sync needs opus — mechanical reruns mask real failures"
+```
+## Step 0: Parse $ARGUMENTS
+- Extract the recommendation ID (first positional argument).
+- Extract the `--reason` text if present (quoted string after `--reason`).
+- If ID is empty, print usage and exit.
+```
+Usage: /user:gsd-t-optimization-reject {ID} [--reason "text"]
+Example: /user:gsd-t-optimization-reject M35-OPT-001 --reason "still needs opus"
+Run `/user:gsd-t-backlog-list --file optimization-backlog.md` to see pending recommendations.
+```
+## Step 1: Load the recommendation
+```bash
+node -e "
+const opt = require('./bin/token-optimizer.js');
+const content = opt.readBacklog('.');
+const entries = opt.parseBacklog(content);
+const id = process.argv[1];
+const entry = entries.find(e => e.id === id);
+if (!entry) {
+  console.error('Recommendation not found: ' + id);
+  process.exit(1);
+}
+console.log(JSON.stringify(entry, null, 2));
+" {ID}
+```
+## Step 2: Idempotency check
+- If `Status: rejected` → print "Already rejected. Cooldown: {N} milestones remaining." and exit cleanly.
+- If `Status: promoted` → print "This recommendation was already promoted; cannot reject now." and exit.
+- If `Status: pending` → proceed.
+## Step 3: Mark the entry as rejected with cooldown
+The reason text defaults to "no reason given" when `--reason` is absent.
+```bash
+REASON="${REASON:-no reason given}"
+node -e "
+const opt = require('./bin/token-optimizer.js');
+let content = opt.readBacklog('.');
+content = opt.setRecommendationStatus(content, process.argv[1], {
+  status: 'rejected',
+  rejection_cooldown: 5
+});
+opt.writeBacklog('.', content);
+console.log('Marked ' + process.argv[1] + ' as rejected (cooldown: 5 milestones). Reason: ' + process.argv[2]);
+" {ID} "$REASON"
+```
+Note: the reason text is captured in the observability log (Step 4) and the Decision Log (Step 5); it is not embedded directly in the backlog entry so that `parseBacklog` stays simple.
+## Step 4: Observability logging
+Append to `.gsd-t/token-log.md`:
+```bash
+DT=$(date +"%Y-%m-%d %H:%M")
+printf "| %s | %s | %s | %s | %s | %s | %s | %s | %s | %s |\n" \
+  "$DT" "$DT" "gsd-t-optimization-reject" "Step 3" "manual" "0s" "rejected {ID}: $REASON" "" "" "" \
+  >> .gsd-t/token-log.md
+```
+## Step 5: Pre-Commit Gate
+Add a Decision Log entry to `.gsd-t/progress.md`:
+```
+- YYYY-MM-DD HH:MM: Rejected optimization recommendation {ID} — {reason}
+```
+## Cooldown Behavior
+After rejection, `bin/token-optimizer.js` will skip any fingerprint-matching recommendation for 5 subsequent `complete-milestone` invocations. The cooldown counter is stored in the entry's `Rejection cooldown` field and decrements at each `complete-milestone` run (decrement logic lives in `bin/token-optimizer.js` — Wave 5 docs task DAT-T? covers the decrement step if missing).
+## Contract References
+- `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0 (source of truth for the optimization-backlog flow)
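
The cooldown described in `gsd-t-optimization-reject.md` can be pictured as a counter that suppresses matching recommendations until it reaches zero. The following is a sketch under stated assumptions, not the logic in `bin/token-optimizer.js`: the function names are hypothetical, the `status`, `fingerprint`, and `rejection_cooldown` entry fields are assumed shapes, and it assumes suppression is checked before the counter is decremented on each `complete-milestone` run.

```javascript
// Hypothetical: is a new recommendation with this fingerprint still suppressed?
function isSuppressed(entries, fingerprint) {
  return entries.some(
    (e) =>
      e.status === 'rejected' &&
      e.fingerprint === fingerprint &&
      e.rejection_cooldown > 0
  );
}

// Hypothetical: decrement every active cooldown once per complete-milestone run.
function tickCooldowns(entries) {
  return entries.map((e) =>
    e.status === 'rejected' && e.rejection_cooldown > 0
      ? { ...e, rejection_cooldown: e.rejection_cooldown - 1 }
      : e
  );
}

// Simulate six complete-milestone runs after a rejection with cooldown 5.
let backlog = [
  { id: 'M35-OPT-001', status: 'rejected', fingerprint: 'fp-a', rejection_cooldown: 5 },
];
const suppressedPerRun = [];
for (let run = 1; run <= 6; run++) {
  suppressedPerRun.push(isSuppressed(backlog, 'fp-a')); // check first
  backlog = tickCooldowns(backlog);                     // then decrement
}
console.log(suppressedPerRun.join(','));
// true,true,true,true,true,false
```

With this ordering the signal is skipped for exactly five runs and may resurface on the sixth, which matches the "5 subsequent `complete-milestone` invocations" wording above.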

package/commands/gsd-t-partition.md CHANGED

@@ -2,6 +2,13 @@
 You are the lead agent in a contract-driven development workflow. Your job is to decompose the current milestone into independent domains with explicit boundaries and contracts.
+## Model Assignment
+Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0.
+- **Default**: `opus` (`selectModel({phase: "partition"})`) — domain decomposition is architectural reasoning. High stakes.
+- **Escalation**: already at opus; there is no stronger tier. Partition decisions are always made at full quality.
 ## Step 0.5: Scan Freshness Auto-Refresh
 Before reading scan data, check if scan docs are stale and auto-refresh if needed. This ensures partition decisions are based on current code — no warnings, no user involvement.

package/commands/gsd-t-pause.md CHANGED

@@ -47,6 +47,9 @@ Create `.gsd-t/continue-here-{timestamp}.md`:
 {any known blockers, pending decisions, or things to watch out for}
 {None if clean}
+## Outstanding User Directive
+{Copy any multi-step chain the user gave earlier in the session that has NOT been fully executed, verbatim. Examples: "run until milestone complete, then checkin publish update all", "complete M34 and then archive + publish". Resume honors this after the resumed phase finishes. Leave as _None_ if no outstanding chain.}
 ## User Note
 {$ARGUMENTS if provided, otherwise: _No note provided._}