@tekyzinc/gsd-t 2.74.13 → 2.76.10

Files changed (61)
  1. package/CHANGELOG.md +116 -0
  2. package/README.md +71 -1
  3. package/bin/advisor-integration.js +93 -0
  4. package/bin/check-headless-sessions.js +140 -0
  5. package/bin/context-meter-config.cjs +101 -0
  6. package/bin/context-meter-config.test.cjs +101 -0
  7. package/bin/gsd-t.js +709 -16
  8. package/bin/headless-auto-spawn.js +290 -0
  9. package/bin/model-selector.js +224 -0
  10. package/bin/runway-estimator.js +242 -0
  11. package/bin/token-budget.js +96 -89
  12. package/bin/token-optimizer.js +471 -0
  13. package/bin/token-telemetry.js +246 -0
  14. package/commands/gsd-t-audit.md +3 -3
  15. package/commands/gsd-t-backlog-list.md +38 -0
  16. package/commands/gsd-t-brainstorm.md +3 -3
  17. package/commands/gsd-t-complete-milestone.md +24 -0
  18. package/commands/gsd-t-debug.md +124 -7
  19. package/commands/gsd-t-discuss.md +10 -3
  20. package/commands/gsd-t-doc-ripple.md +32 -4
  21. package/commands/gsd-t-execute.md +107 -52
  22. package/commands/gsd-t-help.md +19 -0
  23. package/commands/gsd-t-integrate.md +67 -4
  24. package/commands/gsd-t-optimization-apply.md +91 -0
  25. package/commands/gsd-t-optimization-reject.md +94 -0
  26. package/commands/gsd-t-partition.md +7 -0
  27. package/commands/gsd-t-pause.md +3 -0
  28. package/commands/gsd-t-plan.md +10 -3
  29. package/commands/gsd-t-prd.md +3 -3
  30. package/commands/gsd-t-quick.md +71 -9
  31. package/commands/gsd-t-reflect.md +3 -7
  32. package/commands/gsd-t-resume.md +36 -0
  33. package/commands/gsd-t-status.md +31 -0
  34. package/commands/gsd-t-test-sync.md +7 -0
  35. package/commands/gsd-t-verify.md +12 -5
  36. package/commands/gsd-t-visualize.md +3 -7
  37. package/commands/gsd-t-wave.md +82 -18
  38. package/docs/GSD-T-README.md +52 -0
  39. package/docs/architecture.md +95 -0
  40. package/docs/infrastructure.md +117 -0
  41. package/docs/methodology.md +36 -0
  42. package/docs/prd-harness-evolution.md +51 -37
  43. package/docs/requirements.md +66 -0
  44. package/package.json +1 -1
  45. package/scripts/context-meter/count-tokens-client.js +221 -0
  46. package/scripts/context-meter/count-tokens-client.test.js +308 -0
  47. package/scripts/context-meter/test-injector.js +55 -0
  48. package/scripts/context-meter/threshold.js +88 -0
  49. package/scripts/context-meter/threshold.test.js +255 -0
  50. package/scripts/context-meter/transcript-parser.js +252 -0
  51. package/scripts/context-meter/transcript-parser.test.js +320 -0
  52. package/scripts/gsd-t-context-meter.e2e.test.js +415 -0
  53. package/scripts/gsd-t-context-meter.js +350 -0
  54. package/scripts/gsd-t-context-meter.test.js +417 -0
  55. package/scripts/gsd-t-heartbeat.js +2 -2
  56. package/scripts/gsd-t-statusline.js +23 -8
  57. package/templates/CLAUDE-global.md +5 -1
  58. package/templates/CLAUDE-project.md +26 -6
  59. package/templates/context-meter-config.json +10 -0
  60. package/templates/prompts/README.md +1 -1
  61. package/bin/task-counter.cjs +0 -161
@@ -2,17 +2,86 @@
2
2
 
3
3
  You are the lead agent coordinating task execution across domains. Choose solo or team mode based on the plan.
4
4
 
5
- ## Step 0: Reset Task-Count Gate (MANDATORY — first thing in a fresh session)
5
+ ## Model Assignment
6
+
7
+ Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0. Selection is deterministic via `bin/model-selector.js` — never runtime-overridden by context pressure.
8
+
9
+ - **Default**: `sonnet` — routine task execution (`selectModel({phase: "execute"})`). Sonnet is the M35 routine tier.
10
+ - **Mechanical subroutines** (demote to `haiku`):
11
+ - Test runners (`selectModel({phase: "execute", task_type: "test_runner"})`)
12
+ - Branch guards (`selectModel({phase: "execute", task_type: "branch_guard"})`)
13
+ - File-existence checks (`selectModel({phase: "execute", task_type: "file_check"})`)
14
+ - **QA subagent (Step 2)**: `sonnet` — evaluation needs judgment per M31 tier refinement (`selectModel({phase: "execute", task_type: "qa"})`).
15
+ - **Red Team (Step 5.5)**: `opus` — adversarial reasoning benefits most from top tier (`selectModel({phase: "execute", task_type: "red_team"})`).
16
+ - **Escalation points**: at any declared high-stakes sub-decision (cross-module refactor, contract design, security-boundary change), invoke the convention-based `/advisor` fallback from `bin/advisor-integration.js`. If the `/advisor` tool is unavailable, the caller proceeds at the assigned model and logs a missed escalation to `.gsd-t/token-log.md` (see `.gsd-t/M35-advisor-findings.md`). Never silently downgrade the model or skip Red Team / doc-ripple under context pressure — M35 removed that behavior.
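The tier assignments above amount to a fixed lookup from `task_type` to model. The following is a minimal sketch of that idea only: the function name `selectModelSketch`, the table name, and the string return value are assumptions, since the contents of `bin/model-selector.js` are not shown in this diff.

```javascript
// Hypothetical sketch of a deterministic task_type -> model lookup.
// The tiers mirror the Model Assignment bullets; the function body and its
// return shape are assumptions, not the code in bin/model-selector.js.
const TASK_TYPE_MODELS = {
  test_runner: 'haiku',
  branch_guard: 'haiku',
  file_check: 'haiku',
  qa: 'sonnet',
  red_team: 'opus',
};

function selectModelSketch({ phase, task_type } = {}) {
  // `phase` is accepted for parity with the documented call shape but is not
  // used here. There is deliberately no context-pressure input: the same
  // arguments always produce the same model.
  if (task_type && Object.prototype.hasOwnProperty.call(TASK_TYPE_MODELS, task_type)) {
    return TASK_TYPE_MODELS[task_type];
  }
  return 'sonnet'; // routine-tier default
}

console.log(selectModelSketch({ phase: 'execute', task_type: 'red_team' }));
```

Because the lookup takes no runtime state, a spawn site cannot be silently downgraded by a high context reading, which is the property the section insists on.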
17
+
18
+ ## Per-Spawn Token Bracket (MANDATORY — wrap EVERY Task subagent spawn)
19
+
20
+ Per `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0. Every Task subagent spawn below (Step 2 QA, Step 3 domain dispatcher, Step 5.25 Design Verification, Step 5.5 Red Team, Step 7 doc-ripple) **MUST** be wrapped in this token bracket so `.gsd-t/token-metrics.jsonl` gets one record per spawn. This is additive — the existing OBSERVABILITY LOGGING blocks in each spawn site are preserved unmodified alongside this bracket.
21
+
22
+ **Before each spawn — read starting context tokens:**
23
+
24
+ ```bash
25
+ T0_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
26
+ T0_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
27
+ ```
28
+
29
+ **After each spawn — record the bracket:**
30
+
31
+ ```bash
32
+ T1_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
33
+ T1_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
34
+ node -e "require('./bin/token-telemetry.js').recordSpawn({timestamp:new Date().toISOString(),milestone:process.env.GSD_T_MILESTONE||'',command:'gsd-t-execute',phase:'${PHASE:-execute}',step:'${STEP:-}',domain:'${DOMAIN:-}',domain_type:'${DOMAIN_TYPE:-}',task:'${TASK:-}',model:'${MODEL:-sonnet}',duration_s:${DURATION:-0},input_tokens_before:${T0_TOKENS},input_tokens_after:${T1_TOKENS},tokens_consumed:${T1_TOKENS}-${T0_TOKENS},context_window_pct_before:${T0_PCT},context_window_pct_after:${T1_PCT},outcome:'${OUTCOME:-success}',halt_type:${HALT_TYPE:-null},escalated_via_advisor:${ESCALATED_VIA_ADVISOR:-false}})" 2>/dev/null || true
35
+ ```
36
+
37
+ The bracket is additive to the existing `.gsd-t/token-log.md` OBSERVABILITY LOGGING rows. Both sinks coexist — token-log.md is human-readable with context percentage, token-metrics.jsonl is machine-readable with the full 18-field schema for `gsd-t metrics --tokens/--halts/--context-window` aggregation.
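As a rough illustration of what one bracket produces, the sketch below builds a record from a before/after reading and serializes it as a single JSONL line. The helper name `buildSpawnRecord` and its argument shapes are hypothetical, and only a subset of the 18 fields is shown; the field names themselves are taken from the `recordSpawn` call above.

```javascript
// Hypothetical helper: derive one token-metrics record from two readings.
// `before` and `after` are assumed to look like { tokens: number, pct: number }.
function buildSpawnRecord(before, after, meta = {}) {
  return {
    timestamp: new Date().toISOString(),
    command: meta.command || 'gsd-t-execute',
    model: meta.model || 'sonnet',
    input_tokens_before: before.tokens,
    input_tokens_after: after.tokens,
    // Consumption is simply the difference between the two readings.
    tokens_consumed: after.tokens - before.tokens,
    context_window_pct_before: before.pct,
    context_window_pct_after: after.pct,
    outcome: meta.outcome || 'success',
  };
}

const record = buildSpawnRecord({ tokens: 41000, pct: 20.5 }, { tokens: 44000, pct: 22 });
// One record per line is what makes the file JSONL.
const jsonlLine = JSON.stringify(record) + '\n';
process.stdout.write(jsonlLine);
```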
38
+
39
+ ## Step 0: Runway Check (MANDATORY — before any other work in a fresh session)
40
+
41
+ Run via Bash. Count the `remaining_tasks` from the unblocked task list (Step 1 reads `.gsd-t/domains/*/tasks.md`), or use a conservative estimate of 5 if the count is unknown yet:
42
+
43
+ ```bash
44
+ node -e "
45
+ const r = require('./bin/runway-estimator.js').estimateRunway({
46
+ command: 'gsd-t-execute',
47
+ domain_type: '{DOMAIN_TYPE}',
48
+ remaining_tasks: {N},
49
+ projectDir: '.'
50
+ });
51
+ console.log(JSON.stringify(r, null, 2));
52
+ if (!r.can_start) {
53
+ console.log('⛔ Insufficient runway — projected ' + r.projected_end_pct + '% (current ' + r.current_pct + '%, ' + r.pct_per_task + '%/task, ' + r.confidence + ' confidence, ' + r.confidence_basis + ' records)');
54
+ console.log('Auto-spawning headless to continue in a fresh context.');
55
+ const s = require('./bin/headless-auto-spawn.js').autoSpawnHeadless({
56
+ command: 'gsd-t-execute', args: [], continue_from: '.'
57
+ });
58
+ console.log('Session ID: ' + s.id);
59
+ console.log('Status: tail ' + s.logPath);
60
+ console.log('');
61
+ console.log('Your interactive session remains idle — you can use it for other work.');
62
+ console.log('You will be notified when the headless run completes.');
63
+ process.exit(0);
64
+ }
65
+ "
66
+ ```
67
+
68
+ If `can_start === false`, the Step 0 block above has already spawned the headless continuation and exited. The interactive session stops here — do NOT proceed to Step 0.1. If the command continues past Step 0, `can_start === true` and runway is sufficient.
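One plausible reading of the decision object is a linear projection: current usage plus the per-task cost times the remaining tasks, compared against the 85% stop threshold. The sketch below assumes exactly that; the formula and the name `estimateRunwaySketch` are assumptions, and the real `bin/runway-estimator.js` (which also reports `confidence` and `confidence_basis`) may compute the projection differently.

```javascript
// Assumed linear projection; not the implementation in bin/runway-estimator.js.
const STOP_THRESHOLD_PCT = 85; // stop threshold stated in the contract note

function estimateRunwaySketch({ current_pct, pct_per_task, remaining_tasks }) {
  const projected_end_pct = current_pct + pct_per_task * remaining_tasks;
  return {
    current_pct,
    pct_per_task,
    projected_end_pct,
    // Refuse to start if the projection would reach the stop band.
    can_start: projected_end_pct < STOP_THRESHOLD_PCT,
  };
}

console.log(estimateRunwaySketch({ current_pct: 40, pct_per_task: 5, remaining_tasks: 5 }));
```

With the conservative default of 5 remaining tasks, a session at 70% that burns 4% per task would project to 90% and be refused under this assumption.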
69
+
70
+ **Contract**: `.gsd-t/contracts/runway-estimator-contract.md` v1.0.0 defines the decision-object shape and the refusal banner format. The stop threshold (85%) mirrors `.gsd-t/contracts/token-budget-contract.md` v3.0.0.
71
+
72
+ ## Step 0.1: Verify Context Gate Readiness (MANDATORY — first thing in a fresh session)
6
73
 
7
74
  Run via Bash:
8
75
 
9
76
  ```bash
10
- node bin/task-counter.cjs reset
77
+ node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); console.log(JSON.stringify(s));"
11
78
  ```
12
79
 
13
- This clears `.gsd-t/.task-counter` so the new session starts at 0. The reset is the SIGNAL that this is a clean post-`/clear` orchestrator. Do this exactly ONCE per `/user:gsd-t-execute` invocation, immediately on entry. The gate logic is in Step 3.5; do NOT skip it. If `bin/task-counter.cjs` is missing in this project, `npm install` it via `gsd-t install` then retry — the gate is required.
80
+ This calls `getSessionStatus()` (v2.0.0) which reads `.gsd-t/.context-meter-state.json` produced by the Context Meter PostToolUse hook. If the state file is fresh (timestamp within 5 min), you get real `pct` and `threshold` values; if missing or stale, the call falls back to the historical heuristic from `.gsd-t/token-log.md`.
81
+
82
+ Use the returned `threshold` as the gate signal for the rest of this run. The gate logic is in Step 3.5; do NOT skip it. If the Context Meter hook isn't installed (`.gsd-t/.context-meter-state.json` missing and doctor reports it), run `gsd-t doctor` to diagnose — the gate still works via the heuristic fallback but real-time readings give much better guardrails.
14
83
 
15
- Why: every `/user:gsd-t-execute` invocation is a fresh orchestrator session. Without the reset, the counter from the previous session would still be at the limit and the gate would refuse to spawn anything. Reset is the only acceptable way to advance the counter back to 0.
84
+ Why: every `/user:gsd-t-execute` invocation is a fresh orchestrator session and needs a current reading of context utilization before spawning any subagents. The authoritative source is the Context Meter state file; the fallback keeps the gate functional on projects that haven't installed the hook yet.
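The fresh-versus-stale rule described in this step can be sketched as a small predicate. The 5-minute window comes from the text above; the `timestamp` field being an ISO-8601 string and the name `readingSource` are assumptions made for illustration.

```javascript
// Sketch of the freshness check, assuming the state file carries an
// ISO-8601 `timestamp` field (the exact state schema is not shown here).
const FRESH_WINDOW_MS = 5 * 60 * 1000; // "within 5 min" per the step text

function readingSource(state, nowMs = Date.now()) {
  if (!state || typeof state.timestamp !== 'string') return 'heuristic';
  const writtenAt = Date.parse(state.timestamp);
  if (Number.isNaN(writtenAt)) return 'heuristic';
  // Fresh readings come from the Context Meter; anything older falls back
  // to the token-log heuristic.
  return nowMs - writtenAt <= FRESH_WINDOW_MS ? 'meter' : 'heuristic';
}

const now = Date.parse('2026-04-13T12:00:00Z');
console.log(readingSource({ timestamp: '2026-04-13T11:58:00Z' }, now));
```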
16
85
 
17
86
  ## Step 1: Load State
18
87
 
@@ -112,23 +181,22 @@ Before spawning — run via Bash:
112
181
  After subagent returns — run via Bash:
113
182
  `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
114
183
 
115
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
116
- `| {DT_START} | {DT_END} | gsd-t-execute | task:{task-id} | sonnet | {DURATION}s | {pass/fail} | {domain-name} | task-{task-id} | {COUNTER} |`
184
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Ctx% |` if missing):
185
+ `| {DT_START} | {DT_END} | gsd-t-execute | task:{task-id} | sonnet | {DURATION}s | {pass/fail} | {domain-name} | task-{task-id} | {CTX_PCT} |`
117
186
 
118
- Where `{COUNTER}` is the value returned by `node bin/task-counter.cjs status` (see Step 3.5). Note: the legacy `Tokens`, `Compacted`, and `Ctx%` columns were removed in v2.74.12: Claude Code does not export `CLAUDE_CONTEXT_TOKENS_USED`/`_MAX`, so those columns always wrote zeros and the orchestrator self-check based on them was inert. The real burn signal is now `Tasks-Since-Reset`, which the task-counter gate in Step 3.5 enforces.
187
+ Where `{CTX_PCT}` is the current `pct` value returned by `getSessionStatus()` (Step 3.5). As of v2.0.0 (M34), `pct` reads the **real** `input_tokens` count from `.gsd-t/.context-meter-state.json`, the count_tokens-based measurement produced by the Context Meter PostToolUse hook. When the state file is absent or stale, the fallback heuristic writes a best-effort percentage and this column reads `N/A` instead. The previous `Tasks-Since-Reset` column (v2.74.12) is retired.
119
188
 
120
189
  **For each domain (in wave order), run the domain task-dispatcher:**
121
190
 
122
191
  **Token Budget Check (before dispatching each domain's tasks):**
123
192
 
124
193
  Run via Bash:
125
- `node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); const d = tb.getDegradationActions(s.threshold, '.'); process.stdout.write(JSON.stringify({threshold: s.threshold, actions: d}));" 2>/dev/null`
194
+ `node -e "const tb = require('./bin/token-budget.js'); const s = tb.getSessionStatus('.'); const d = tb.getDegradationActions(s.threshold, '.'); process.stdout.write(JSON.stringify({band: d.band, pct: d.pct, message: d.message}));" 2>/dev/null`
126
195
 
127
- Apply the result:
128
- - `threshold: 'normal'` or file missing → skip silently, proceed with standard model assignments
129
- - `threshold: 'downgrade'` → apply model overrides from `actions.modelOverride` (e.g., downgrade opus tasks to sonnet)
130
- - `threshold: 'conserve'` → checkpoint progress to `.gsd-t/progress.md` and skip non-essential operations (Red Team, doc-ripple) for this domain
131
- - `threshold: 'stop'` → checkpoint all progress, output: "Token budget exhausted — progress saved. Resume after session reset.", and halt execution for remaining domains
196
+ Apply the result (three-band model per `token-budget-contract.md` v3.0.0 — never silently degrade quality):
197
+ - `band: 'normal'` or file missing → proceed with standard model assignments
198
+ - `band: 'warn'` (≥70%) → log the warning to `.gsd-t/token-log.md` and proceed at full quality; do NOT downgrade models or skip phases
199
+ - `band: 'stop'` (≥85%) → checkpoint all progress, output: "Orchestrator context gate reached ({pct}%). Progress saved. Resume after session reset.", and halt execution for remaining domains. Runway estimator / headless auto-spawn will handle the handoff once they exist (m35-runway-estimator, m35-headless-auto-spawn).
132
200
 
133
201
  **Pre-dispatch experience retrieval (before dispatching each domain's tasks):**
134
202
  Run via Bash:
@@ -232,7 +300,7 @@ For each task in `.gsd-t/domains/{domain-name}/tasks.md` (in order, skip complet
232
300
  1. Load prior summaries: Read up to 5 most recent `.gsd-t/domains/{domain-name}/task-*-summary.md` files (10-20 lines each)
233
301
  2. Load graph context (if `.gsd-t/graph/meta.json` exists): query task's files for relevant graph context
234
302
  3. Display: `⚙ [sonnet] gsd-t-execute → domain: {domain-name}, task-{task-id}`
235
- 4. Run observability Bash (T_START / DT_START / TOK_START / TOK_MAX)
303
+ 4. Run observability Bash (T_START / DT_START)
236
304
  5. Spawn task subagent:
237
305
 
238
306
  ```
@@ -414,11 +482,11 @@ Report back:
414
482
  ```
415
483
 
416
484
  6. After task subagent returns:
417
- - Run observability Bash (T_END / TOK_END / DURATION / CTX_PCT)
485
+ - Run observability Bash (T_END / DURATION / CTX_PCT)
418
486
  - Append to token-log.md (per-task row)
419
487
  - Alert on CTX_PCT thresholds (display to user inline)
420
488
  - **Emit task-metrics record** — run via Bash:
421
- `node bin/metrics-collector.js --milestone {milestone} --domain {domain-name} --task task-{task-id} --command execute --duration_s $DURATION --tokens_used $TOKENS --context_pct ${CTX_PCT:-0} --pass {true|false} --fix_cycles {0|N} --signal_type {pass-through|fix-cycle} --notes "{brief outcome}" 2>/dev/null || true`
489
+ `node bin/metrics-collector.js --milestone {milestone} --domain {domain-name} --task task-{task-id} --command execute --duration_s $DURATION --tokens_used 0 --context_pct ${CTX_PCT:-0} --pass {true|false} --fix_cycles {0|N} --signal_type {pass-through|fix-cycle} --notes "{brief outcome}" 2>/dev/null || true`
422
490
  Signal type: `pass-through` if task passed on first attempt; `fix-cycle` if rework was needed.
423
491
  - **Emit task_complete event** — run via Bash:
424
492
  `node ~/.claude/scripts/gsd-t-event-writer.js --type task_complete --command gsd-t-execute --reasoning "signal_type={signal_type}, domain={domain-name}" --outcome {success|failure} || true`
@@ -461,7 +529,7 @@ Report back:
461
529
 
462
530
  6. **Per-domain Red Team** — invoke Step 5.5 (Red Team) NOW for this domain. This is the first place Red Team runs in v2.74.12 — there is no global post-execute Red Team anymore. If Red Team returns FAIL, fix bugs and re-run before proceeding to the next domain (max 2 fix-and-verify cycles); if bugs persist, log to `.gsd-t/deferred-items.md` and present to user.
463
531
 
464
- 7. **Task-count gate re-check** — run `node bin/task-counter.cjs should-stop`. If exit code is `10`, follow the Step 3.5 STOP procedure now (do NOT spawn the next domain).
532
+ 7. **Context gate re-check** — run `node -e "const tb=require('./bin/token-budget.js'); const s=tb.getSessionStatus('.'); if(s.threshold==='stop')process.exit(10); if(s.threshold==='warn')process.exit(13);"`. If exit code is `10`, follow the Step 3.5 STOP procedure now (do NOT spawn the next domain). If exit code is `13`, log the warning and proceed at full quality for the next domain (no model overrides, no phase skips — quality is never silently degraded).
465
533
 
466
534
  ### Team Mode (when agent teams are enabled)
467
535
  Spawn teammates for domains within the same wave. Only domains in the same wave can run in parallel — do not spawn teammates for domains in different waves simultaneously. Each teammate uses the **domain task-dispatcher pattern** — one subagent per task within their domain (same as solo mode).
@@ -605,31 +673,28 @@ After all merges complete (whether all passed, some rolled back, or errors occur
605
673
  Cleanup is not optional — orphaned worktrees waste disk space and can confuse subsequent executions. Always run cleanup, even if earlier steps failed.
606
674
  ```
607
675
 
608
- ## Step 3.5: Orchestrator Task-Count Gate (MANDATORY)
676
+ ## Step 3.5: Orchestrator Context Gate (MANDATORY)
609
677
 
610
- The orchestrator MUST check `bin/task-counter.cjs` BEFORE every task subagent spawn AND immediately AFTER every domain completes. This is the real context-burn guardrail. The previous version of this step relied on `CLAUDE_CONTEXT_TOKENS_USED`/`_MAX` env vars which Claude Code does not export; that check was inert and silently let the orchestrator drain context until forced compaction. The replacement below uses a deterministic on-disk task counter.
678
+ The orchestrator MUST check `getSessionStatus()` BEFORE every task subagent spawn AND immediately AFTER every domain completes. This is the real context-burn guardrail. As of v2.0.0 (M34), `bin/token-budget.js` reads `.gsd-t/.context-meter-state.json`, the live count_tokens-based `input_tokens` measurement produced by the Context Meter PostToolUse hook. When the state file is fresh (timestamp within 5 min), thresholds reflect the ACTUAL context window utilization; when absent or stale, the call falls back to the historical heuristic from `.gsd-t/token-log.md`.
611
679
 
612
680
  **Before each task spawn — gate check:**
613
681
 
614
682
  ```bash
615
- node bin/task-counter.cjs should-stop
683
+ node -e "const tb=require('./bin/token-budget.js'); const s=tb.getSessionStatus('.'); process.stdout.write(JSON.stringify(s)); if(s.threshold==='stop')process.exit(10); if(s.threshold==='warn')process.exit(13);"
616
684
  ```
617
685
 
618
- If the exit code is `10` (counter is at or past its limit), STOP immediately. Do NOT spawn the next task. Jump straight to the checkpoint/STOP procedure below.
619
-
620
- If the exit code is `0`, proceed to spawn the task.
621
-
622
- **After each task subagent returns — increment:**
686
+ Exit code semantics (three-band model per `token-budget-contract.md` v3.0.0):
687
+ - `0` → `normal` band (< 70% ctx). Proceed with standard model assignments.
688
+ - `13` → `warn` band (70–85%). Log the warning to `.gsd-t/token-log.md` and proceed at full quality. **Never downgrade models or skip phases** — M35 removed that behavior intentionally. If the projected runway is insufficient, the runway estimator (m35-runway-estimator) will halt cleanly before reaching `stop`.
689
+ - `10` → `stop` band (≥ 85%). STOP immediately. Do NOT spawn the next task. Jump straight to the STOP procedure below.
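The three bands and their exit codes can be expressed as two small functions. This is a sketch under stated assumptions: the constant names match the ones this file attributes to `bin/token-budget.js`, but `bandFor` and `gateExitCode` are hypothetical names, and the real module exposes the band through the `threshold` field of `getSessionStatus()`.

```javascript
// Sketch of the three-band classification and the gate's exit codes.
const WARN_THRESHOLD_PCT = 70;
const STOP_THRESHOLD_PCT = 85;

function bandFor(pct) {
  if (pct >= STOP_THRESHOLD_PCT) return 'stop';
  if (pct >= WARN_THRESHOLD_PCT) return 'warn';
  return 'normal';
}

// Exit codes as listed above: 0 = normal, 13 = warn, 10 = stop.
const EXIT_CODES = { normal: 0, warn: 13, stop: 10 };

function gateExitCode(pct) {
  return EXIT_CODES[bandFor(pct)];
}

for (const pct of [42, 70, 84.9, 85]) {
  console.log(pct, bandFor(pct), gateExitCode(pct));
}
```

Only the `stop` band changes control flow; `warn` is a logging signal and never alters model choice or skips a phase.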
623
690
 
624
- ```bash
625
- node bin/task-counter.cjs increment task
626
- ```
691
+ The JSON on stdout contains `{consumed, estimated_remaining, pct, threshold}` — capture `pct` as `{CTX_PCT}` for the token-log `Ctx%` column on the NEXT spawn.
627
692
 
628
- This prints a JSON status line like `{"count":3,"limit":5,"remaining":2,"should_stop":false,...}`. Use this status when writing the token-log row (the `Tasks-Since-Reset` column).
693
+ **After each task subagent returns, re-check:**
629
694
 
630
- If `should_stop` is `true` after the increment, STOP after this task completes even if more tasks remain in the current domain.
695
+ Run the same command again. The fresh reading reflects post-task consumption (the Context Meter hook refreshes after each tool call). If the band crossed into `stop`, STOP after this task completes even if more tasks remain in the current domain.
631
696
 
632
- **STOP procedure (when `should_stop` is true):**
697
+ **STOP procedure (when threshold === 'stop'):**
633
698
 
634
699
  1. **Save checkpoint to disk** — update `.gsd-t/progress.md` with:
635
700
  - Which domains are complete, which remain
@@ -637,30 +702,20 @@ If `should_stop` is `true` after the increment, STOP after this task completes
637
702
  - Last completed task id and the next pending task id
638
703
  2. **Instruct user**: Output exactly:
639
704
  ```
640
- ⏸️ Orchestrator task-count gate reached ({count}/{limit} tasks in this session).
705
+ ⏸️ Orchestrator context gate reached ({pct}% of model window).
641
706
  Progress saved. Run `/clear` then `/user:gsd-t-execute` to continue from the next task.
642
707
  ```
643
- 3. **STOP execution.** Do NOT spawn another task or domain subagent. The next session resumes from saved state. The first thing the resumed orchestrator does in Step 0 is run `node bin/task-counter.cjs reset` (see below).
644
-
645
- **Configuring the limit:**
646
-
647
- The default limit is 5 tasks per session — conservative, designed for the model+harness combination as of 2026-04-13. Override per-project via `.gsd-t/task-counter-config.json`:
708
+ 3. **STOP execution.** Do NOT spawn another task or domain subagent. The next session resumes from saved state with a fresh context window.
648
709
 
649
- ```json
650
- { "limit": 8 }
651
- ```
710
+ **Configuring threshold bands:**
652
711
 
653
- Or per-session via env var: `GSD_T_TASK_LIMIT=8 /user:gsd-t-execute`.
712
+ Band boundaries (`warn=70`, `stop=85`) are defined in `bin/token-budget.js` (`WARN_THRESHOLD_PCT` / `STOP_THRESHOLD_PCT` constants) and documented in `.gsd-t/contracts/token-budget-contract.md` v3.0.0. The `modelWindowSize` used for the denominator comes from `.gsd-t/context-meter-config.json` (default `200000`). Override the window size there if running against a different model. There is no per-session env-var override — the real-time measurement supersedes the need for one.
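To make the role of `modelWindowSize` concrete, the sketch below computes a percentage from an input-token count and a config object. The default of `200000` is from the text above; the function name `contextPct`, the rounding to one decimal place, and the handling of invalid config values are assumptions.

```javascript
// Sketch: context utilization as a percentage of the model window.
const DEFAULT_MODEL_WINDOW_SIZE = 200000; // documented default

function contextPct(inputTokens, config = {}) {
  const configured = Number(config.modelWindowSize);
  // Fall back to the default when the config value is missing or invalid.
  const windowSize = configured > 0 ? configured : DEFAULT_MODEL_WINDOW_SIZE;
  // Round to one decimal place (an assumption about presentation only).
  return Math.round((inputTokens / windowSize) * 1000) / 10;
}

console.log(contextPct(100000));
console.log(contextPct(100000, { modelWindowSize: 1000000 }));
```

The same token count lands in a different band when the window size changes, which is why the override belongs in the config file rather than in a per-session variable.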
654
713
 
655
714
  **On resume (Step 0 — first thing the orchestrator does in a fresh session):**
656
715
 
657
- ```bash
658
- node bin/task-counter.cjs reset
659
- ```
660
-
661
- This clears the counter so the new session starts fresh. The reset is the SIGNAL that this is a clean post-`/clear` session — never reset mid-session.
716
+ Step 0 runs `getSessionStatus()` once for readiness confirmation. The reading should be fresh (the Context Meter hook fires on every tool call), so the gate immediately reflects the new session's starting pct — typically near 0 since `/clear` resets the conversation.
662
717
 
663
- This deterministic gate replaces the vaporware env-var check. It is fail-safe: if `bin/task-counter.cjs` is missing for any reason, the `should-stop` command exits non-zero (treated as STOP) rather than silently allowing unlimited spawns.
718
+ This gate replaces the v2.74.12 task counter proxy and the (never-functional) v1.x env-var check. It is fail-safe: if `bin/token-budget.js` or the state file is unreadable for any reason, `getSessionStatus()` throws and the gate exits non-zero (treated as STOP) rather than silently allowing unlimited spawns.
664
719
 
665
720
  ## Step 4: Checkpoint Handling
666
721
 
@@ -736,9 +791,9 @@ and summary, and the full comparison table per the protocol's Step 7."
736
791
  ```
737
792
 
738
793
  After subagent returns — run via Bash:
739
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && COUNTER_JSON=$(node bin/task-counter.cjs status 2>/dev/null || echo '{}') && COUNTER=$(echo "$COUNTER_JSON" | node -e "let s=''; process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")`
794
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")`
740
795
  Append to `.gsd-t/token-log.md`:
741
- `| {DT_START} | {DT_END} | gsd-t-execute | Design Verify | opus | {DURATION}s | {VERDICT} — {MATCH}/{TOTAL} elements for {domain-name} | | | {COUNTER} |`
796
+ `| {DT_START} | {DT_END} | gsd-t-execute | Design Verify | opus | {DURATION}s | {VERDICT} — {MATCH}/{TOTAL} elements for {domain-name} | | | {CTX_PCT} |`
742
797
 
743
798
  **Artifact Gate (MANDATORY):**
744
799
  After the Design Verification Agent returns, check `.gsd-t/contracts/design-contract.md`:
@@ -757,7 +812,7 @@ After the Design Verification Agent returns, check `.gsd-t/contracts/design-cont
757
812
 
758
813
  ## Step 5.5: Red Team — Adversarial QA (per-domain, MANDATORY)
759
814
 
760
- **IMPORTANT — frequency change in v2.74.12**: Red Team was promoted to per-task by commit `da6d3ae` on the assumption that the orchestrator would catch context drain via the `CLAUDE_CONTEXT_TOKENS_USED` self-check. That env var is never set by Claude Code, so the check was inert and the per-task spawning of ~10k-token Red Team subagents was the largest single contributor to the v2.74.x context-burn regression. Red Team is now run ONCE PER COMPLETED DOMAIN — call this step from the "After all tasks in a domain complete" block, not from a per-task hook.
815
+ **IMPORTANT — frequency change in v2.74.12**: Red Team was promoted to per-task by commit `da6d3ae` on the assumption that the orchestrator would catch context drain via an environment-variable-based self-check. That env-var path was never populated by Claude Code, so the check was inert and the per-task spawning of ~10k-token Red Team subagents was the largest single contributor to the v2.74.x context-burn regression. Red Team is now run ONCE PER COMPLETED DOMAIN — call this step from the "After all tasks in a domain complete" block, not from a per-task hook.
761
816
 
762
817
  After all tasks in the CURRENT DOMAIN pass their tests, spawn an adversarial Red Team agent. Its sole purpose is to BREAK the domain that was just built.
763
818
 
@@ -789,9 +844,9 @@ attack categories exhausted, and the path to the written
789
844
  ```
790
845
 
791
846
  After subagent returns — run via Bash:
792
- `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")`
847
+ `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START)) && CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")`
793
848
  Append to `.gsd-t/token-log.md`:
794
- `| {DT_START} | {DT_END} | gsd-t-execute | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found in {domain-name} | | | {COUNTER} |`
849
+ `| {DT_START} | {DT_END} | gsd-t-execute | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found in {domain-name} | | | {CTX_PCT} |`
795
850
 
796
851
  **If Red Team VERDICT is FAIL:**
797
852
  1. Fix all CRITICAL and HIGH bugs immediately (up to 2 fix attempts per bug)
@@ -85,6 +85,11 @@ BACKLOG Manual
85
85
  backlog-promote Refine, classify, and launch GSD-T workflow
86
86
  backlog-settings Manage types, apps, categories, and defaults
87
87
 
88
+ OPTIMIZATION Manual
89
+ ───────────────────────────────────────────────────────────────────────────────
90
+ optimization-apply Promote a pending token-optimizer recommendation
91
+ optimization-reject Dismiss a recommendation with optional reason + cooldown
92
+
88
93
  ───────────────────────────────────────────────────────────────────────────────
89
94
  Type /user:gsd-t-help {command} for detailed help on any command.
90
95
  Example: /user:gsd-t-help impact
@@ -456,6 +461,20 @@ Use these when user asks for help on a specific command:
456
461
  - **Files**: `.gsd-t/backlog-settings.md`
457
462
  - **Use when**: Customizing the classification dimensions for your project
458
463
 
464
+ ### optimization-apply
465
+ - **Summary**: Promote a pending token-optimizer recommendation from `.gsd-t/optimization-backlog.md`
466
+ - **Auto-invoked**: No
467
+ - **Files**: `.gsd-t/optimization-backlog.md`, `.gsd-t/progress.md`, `.gsd-t/token-log.md`
468
+ - **Usage**: `/user:gsd-t-optimization-apply {ID}`
469
+ - **Use when**: A recommendation looks correct and you want to act on it — offers a quick-task or full backlog-promote path
470
+
471
+ ### optimization-reject
472
+ - **Summary**: Dismiss a recommendation with an optional reason; sets a 5-milestone cooldown
473
+ - **Auto-invoked**: No
474
+ - **Files**: `.gsd-t/optimization-backlog.md`, `.gsd-t/progress.md`, `.gsd-t/token-log.md`
475
+ - **Usage**: `/user:gsd-t-optimization-reject {ID} [--reason "text"]`
476
+ - **Use when**: A recommendation is wrong or premature — prevents the same signal from re-surfacing for 5 milestones
477
+
459
478
  ## Unknown Command
460
479
 
461
480
  If user asks for help on unrecognized command:
@@ -2,6 +2,69 @@
2
2
 
3
3
  You are the lead agent performing integration work. This phase is ALWAYS single-session — one agent with full context across all domains to handle the seams.
4
4
 
5
+ ## Model Assignment
6
+
7
+ Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0.
8
+
9
+ - **Default**: `sonnet` (`selectModel({phase: "integrate"})`) — integration wiring is routine coordination.
10
+ - **Mechanical subroutines** (demote to `haiku`): integration test runners.
11
+ - **Red Team**: `opus` — adversarial QA at integration seams always runs at top tier.
12
+ - **Escalation**: `/advisor` convention-based fallback from `bin/advisor-integration.js` when a seam reveals a contract gap or security boundary. Never silently downgrade the model or skip Red Team / doc-ripple under context pressure — M35 removed that behavior.
13
+
14
+ ## Per-Spawn Token Bracket (MANDATORY — wrap EVERY Task subagent spawn)
15
+
16
+ Per `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0. Every Task subagent spawn below **MUST** be wrapped in this token bracket so `.gsd-t/token-metrics.jsonl` gets one record per spawn. This is additive — the existing OBSERVABILITY LOGGING blocks in each spawn site are preserved unmodified alongside this bracket.
17
+
18
+ **Before each spawn — read starting context tokens:**
19
+
20
+ ```bash
21
+ T0_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
22
+ T0_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
23
+ ```
24
+
25
+ **After each spawn — record the bracket:**
26
+
27
+ ```bash
28
+ T1_TOKENS=$(node -e "try{const s=require('fs').readFileSync('.gsd-t/.context-meter-state.json','utf8');process.stdout.write(String(JSON.parse(s).inputTokens||0))}catch(_){process.stdout.write('0')}")
29
+ T1_PCT=$(node -e "try{const tb=require('./bin/token-budget.js');process.stdout.write(String(tb.getSessionStatus('.').pct||0))}catch(_){process.stdout.write('0')}")
30
+ node -e "require('./bin/token-telemetry.js').recordSpawn({timestamp:new Date().toISOString(),milestone:process.env.GSD_T_MILESTONE||'',command:'gsd-t-integrate',phase:'integrate',step:'${STEP:-}',domain:'${DOMAIN:-}',domain_type:'${DOMAIN_TYPE:-}',task:'${TASK:-}',model:'${MODEL:-sonnet}',duration_s:${DURATION:-0},input_tokens_before:${T0_TOKENS},input_tokens_after:${T1_TOKENS},tokens_consumed:${T1_TOKENS}-${T0_TOKENS},context_window_pct_before:${T0_PCT},context_window_pct_after:${T1_PCT},outcome:'${OUTCOME:-success}',halt_type:${HALT_TYPE:-null},escalated_via_advisor:${ESCALATED_VIA_ADVISOR:-false}})" 2>/dev/null || true
31
+ ```
32
+
33
+ The bracket is additive to the existing `.gsd-t/token-log.md` OBSERVABILITY LOGGING rows. Both sinks coexist.
34
+
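The bracket arithmetic above can be sketched in plain Node as follows. This is an illustration only: `readInputTokens` and `buildSpawnRecord` are hypothetical helper names that do not appear in the package, and the only thing assumed about the state file is the `inputTokens` key that the bash one-liners already read. The field names mirror the `recordSpawn` call in the bracket.

```javascript
const fs = require('fs');

// Read inputTokens from the context-meter state file. Any error (missing
// file, bad JSON, missing key) yields 0, like the try/catch in the one-liners.
function readInputTokens(statePath) {
  try {
    const state = JSON.parse(fs.readFileSync(statePath, 'utf8'));
    return Number(state.inputTokens) || 0;
  } catch (_) {
    return 0;
  }
}

// Hypothetical helper (not part of the package): assemble the object that
// would be handed to recordSpawn(). `before` and `after` are
// { tokens, pct } snapshots taken around one Task subagent spawn.
function buildSpawnRecord(before, after, extra = {}) {
  return {
    timestamp: new Date().toISOString(),
    command: 'gsd-t-integrate',
    phase: 'integrate',
    model: 'sonnet',
    outcome: 'success',
    halt_type: null,
    escalated_via_advisor: false,
    ...extra, // caller-supplied overrides such as step, domain, task, model
    input_tokens_before: before.tokens,
    input_tokens_after: after.tokens,
    tokens_consumed: after.tokens - before.tokens,
    context_window_pct_before: before.pct,
    context_window_pct_after: after.pct,
  };
}
```

Computing `tokens_consumed` in one place keeps the before/after subtraction out of the shell string interpolation, which is the part of the one-liner that is easiest to get wrong.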
35
+ ## Step 0: Runway Check (MANDATORY — before any other work in a fresh session)
36
+
37
+ Count the integration wiring seams in `.gsd-t/contracts/integration-points.md` and use that count as `remaining_tasks` (a conservative estimate: the number of integration-points sections). Substitute it for `{N}` below, then run via Bash:
38
+
39
+ ```bash
40
+ node -e "
41
+ const r = require('./bin/runway-estimator.js').estimateRunway({
42
+ command: 'gsd-t-integrate',
43
+ domain_type: '',
44
+ remaining_tasks: {N},
45
+ projectDir: '.'
46
+ });
47
+ console.log(JSON.stringify(r, null, 2));
48
+ if (!r.can_start) {
49
+ console.log('⛔ Insufficient runway — projected ' + r.projected_end_pct + '% (current ' + r.current_pct + '%, ' + r.pct_per_task + '%/task, ' + r.confidence + ' confidence, ' + r.confidence_basis + ' records)');
50
+ console.log('Auto-spawning headless to continue in a fresh context.');
51
+ const s = require('./bin/headless-auto-spawn.js').autoSpawnHeadless({
52
+ command: 'gsd-t-integrate', args: [], continue_from: '.'
53
+ });
54
+ console.log('Session ID: ' + s.id);
55
+ console.log('Status: tail ' + s.logPath);
56
+ console.log('');
57
+ console.log('Your interactive session remains idle — you can use it for other work.');
58
+ console.log('You will be notified when the headless run completes.');
59
+ process.exit(0);
60
+ }
61
+ "
62
+ ```
63
+
64
+ If `can_start === false`, the headless continuation has been spawned and the interactive session must stop here. Do NOT proceed to Step 1.
65
+
66
+ **Contract**: `.gsd-t/contracts/runway-estimator-contract.md` v1.0.0; stop threshold (85%) mirrors `.gsd-t/contracts/token-budget-contract.md` v3.0.0.
67
+
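For intuition about the fields printed in Step 0, here is a minimal sketch of a linear runway projection. It is an assumption, not the actual logic of `bin/runway-estimator.js`: the diff shows only the result fields (`current_pct`, `pct_per_task`, `projected_end_pct`, `can_start`) and the 85% stop threshold. `projectRunway` is a hypothetical name, and both the linear formula and the strict `<` comparison are guesses.

```javascript
// Stop threshold named in the contract line above.
const STOP_THRESHOLD_PCT = 85;

// Assumed model: each remaining task consumes pctPerTask percentage points
// of the context window, starting from currentPct.
function projectRunway(currentPct, pctPerTask, remainingTasks) {
  const projectedEndPct = currentPct + pctPerTask * remainingTasks;
  return {
    current_pct: currentPct,
    pct_per_task: pctPerTask,
    projected_end_pct: projectedEndPct,
    can_start: projectedEndPct < STOP_THRESHOLD_PCT,
  };
}

// Five seams at 6 points each, from two different starting points.
console.log(projectRunway(40, 6, 5)); // projected_end_pct is 70, under the threshold
console.log(projectRunway(60, 6, 5)); // projected_end_pct is 90, over the threshold
```

Under this model the second call is the case where Step 0 would hand off to a headless session instead of proceeding to Step 1.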
5
68
  ## Step 1: Load Full State
6
69
 
7
70
  Read everything:
@@ -173,8 +236,8 @@ Before spawning — run via Bash:
173
236
  `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M")`
174
237
  After subagent returns — run via Bash:
175
238
  `T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))`
176
- Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Tasks-Since-Reset |` if missing):
177
- `| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | | | {COUNTER} |`
239
+ Append to `.gsd-t/token-log.md` (create with header `| Datetime-start | Datetime-end | Command | Step | Model | Duration(s) | Notes | Domain | Task | Ctx% |` if missing):
240
+ `| {DT_START} | {DT_END} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {pass/fail}, {N} boundaries tested | | | {CTX_PCT} |`
178
241
  If QA found issues, append each to `.gsd-t/qa-issues.md` (create with header `| Date | Command | Step | Model | Duration(s) | Severity | Finding |` if missing):
179
242
  `| {DT_START} | gsd-t-integrate | Step 5 | haiku | {DURATION}s | {severity} | {finding} |`
180
243
 
@@ -229,10 +292,10 @@ Spawn Task subagent (general-purpose, model: opus):
229
292
  After subagent returns — run via Bash:
230
293
  ```
231
294
  T_END=$(date +%s) && DT_END=$(date +"%Y-%m-%d %H:%M") && DURATION=$((T_END-T_START))
232
- COUNTER=$(node bin/task-counter.cjs status 2>/dev/null | node -e "let s='';process.stdin.on('data',d=>s+=d).on('end',()=>{try{process.stdout.write(String(JSON.parse(s).count||''))}catch(_){process.stdout.write('')}})")
295
+ CTX_PCT=$(node -e "try{const tb=require('./bin/token-budget.js'); process.stdout.write(String(tb.getSessionStatus('.').pct))}catch(_){process.stdout.write('N/A')}")
233
296
  ```
234
297
  Append to `.gsd-t/token-log.md`:
235
- `| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {COUNTER} |`
298
+ `| {DT_START} | {DT_END} | gsd-t-integrate | Red Team | opus | {DURATION}s | {VERDICT} — {N} bugs found | | | {CTX_PCT} |`
236
299
 
237
300
  **If FAIL:** fix CRITICAL/HIGH bugs (≤2 cycles) → re-run. Persistent bugs → `.gsd-t/deferred-items.md`.
238
301
  **If GRUDGING PASS:** proceed to doc-ripple.
@@ -0,0 +1,91 @@
1
+ # GSD-T: Optimization Apply — Promote a Recommendation
2
+
3
+ Apply (promote) a pending recommendation from `.gsd-t/optimization-backlog.md`. Takes `$ARGUMENTS` as the recommendation ID (e.g., `M35-OPT-001`).
4
+
5
+ Recommendations are produced by `bin/token-optimizer.js` at `complete-milestone` and are **never auto-applied**. This command is the user's deliberate promotion step.
6
+
7
+ ## Usage
8
+
9
+ ```
10
+ /user:gsd-t-optimization-apply M35-OPT-001
11
+ ```
12
+
13
+ ## Step 0: Parse $ARGUMENTS
14
+
15
+ Extract the recommendation ID from `$ARGUMENTS`. If empty, print:
16
+ ```
17
+ Usage: /user:gsd-t-optimization-apply {ID}
18
+ Example: /user:gsd-t-optimization-apply M35-OPT-001
19
+
20
+ Run `/user:gsd-t-backlog-list --file optimization-backlog.md` to see pending recommendations.
21
+ ```
22
+ Then exit.
23
+
24
+ ## Step 1: Load the recommendation
25
+
26
+ ```bash
27
+ node -e "
28
+ const opt = require('./bin/token-optimizer.js');
29
+ const content = opt.readBacklog('.');
30
+ const entries = opt.parseBacklog(content);
31
+ const id = process.argv[1];
32
+ const entry = entries.find(e => e.id === id);
33
+ if (!entry) {
34
+ console.error('Recommendation not found: ' + id);
35
+ console.error('Run /user:gsd-t-backlog-list --file optimization-backlog.md');
36
+ process.exit(1);
37
+ }
38
+ console.log(JSON.stringify(entry, null, 2));
39
+ " {ID}
40
+ ```
41
+
42
+ ## Step 2: Idempotency check
43
+
44
+ - If `Status: promoted` → print "Already promoted. No action taken." and exit cleanly.
45
+ - If `Status: rejected` → print "This recommendation was rejected. Use `/user:gsd-t-optimization-reject --reason` to update, or wait out the cooldown." and exit.
46
+ - If `Status: pending` → proceed to Step 3.
47
+
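The three-way status check in Step 2 can be sketched as a small pure function. `applyGate` is a hypothetical name, and the sketch assumes that a parsed entry exposes its status as a lowercase `status` string. The diff does not show the shape returned by `parseBacklog`, so treat that as an assumption.

```javascript
// Hypothetical helper: decide whether optimization-apply should continue
// to Step 3, and which message to print otherwise.
function applyGate(entry) {
  switch (entry.status) {
    case 'promoted':
      return { proceed: false, message: 'Already promoted. No action taken.' };
    case 'rejected':
      return {
        proceed: false,
        message:
          'This recommendation was rejected. Use /user:gsd-t-optimization-reject --reason ' +
          'to update, or wait out the cooldown.',
      };
    case 'pending':
      return { proceed: true, message: '' };
    default:
      // Not covered by the command text; failing closed is an assumption.
      return { proceed: false, message: 'Unknown status: ' + entry.status };
  }
}
```

Returning a value instead of exiting inside the function makes the idempotent paths (already promoted, already rejected) easy to check without touching the backlog file.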
48
+ ## Step 3: Present the recommendation + promotion options
49
+
50
+ Print the recommendation's metadata (Type, Evidence, Proposed change, Risk, Projected savings) and offer two promotion paths:
51
+
52
+ 1. **Quick task** (recommended for small changes): `/user:gsd-t-quick "{proposed_change}"`
53
+ 2. **Full backlog entry** (recommended for larger work): `/user:gsd-t-backlog-promote` so it flows through the normal milestone pipeline.
54
+
55
+ At Autonomy Level 3: automatically choose option 1 (quick task) unless the recommendation Type is `investigate` (which warrants a full backlog entry since the scope is not yet defined).
56
+
57
+ ## Step 4: Mark the entry as promoted
58
+
59
+ ```bash
60
+ node -e "
61
+ const opt = require('./bin/token-optimizer.js');
62
+ let content = opt.readBacklog('.');
63
+ content = opt.setRecommendationStatus(content, process.argv[1], {
64
+ status: 'promoted'
65
+ });
66
+ opt.writeBacklog('.', content);
67
+ console.log('Marked ' + process.argv[1] + ' as promoted.');
68
+ " {ID}
69
+ ```
70
+
71
+ ## Step 5: Observability logging
72
+
73
+ Append a line to `.gsd-t/token-log.md` documenting the promotion. This command does not spawn a subagent, so there is no model to record; put `manual` in the Model column:
74
+
75
+ ```bash
76
+ DT=$(date +"%Y-%m-%d %H:%M")
77
+ printf "| %s | %s | %s | %s | %s | %s | %s | %s | %s | %s |\n" \
78
+ "$DT" "$DT" "gsd-t-optimization-apply" "Step 4" "manual" "0s" "promoted {ID}" "" "" "" \
79
+ >> .gsd-t/token-log.md
80
+ ```
81
+
82
+ ## Step 6: Pre-Commit Gate
83
+
84
+ This command modifies `.gsd-t/optimization-backlog.md`. Add a Decision Log entry to `.gsd-t/progress.md`:
85
+ ```
86
+ - YYYY-MM-DD HH:MM: Promoted optimization recommendation {ID} — {summary}
87
+ ```
88
+
89
+ ## Contract References
90
+
91
+ - `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0 (source of truth for the optimization-backlog flow)
@@ -0,0 +1,94 @@
1
+ # GSD-T: Optimization Reject — Dismiss a Recommendation
2
+
3
+ Reject (dismiss) a pending recommendation from `.gsd-t/optimization-backlog.md` with an optional reason. Sets a 5-milestone cooldown so the same signal doesn't re-surface immediately.
4
+
5
+ Takes `$ARGUMENTS` as `{ID} [--reason "text"]`.
6
+
7
+ ## Usage
8
+
9
+ ```
10
+ /user:gsd-t-optimization-reject M35-OPT-001
11
+ /user:gsd-t-optimization-reject M35-OPT-001 --reason "test-sync needs opus — mechanical reruns mask real failures"
12
+ ```
13
+
14
+ ## Step 0: Parse $ARGUMENTS
15
+
16
+ - Extract the recommendation ID (first positional argument).
17
+ - Extract the `--reason` text if present (quoted string after `--reason`).
18
+ - If ID is empty, print usage and exit.
19
+
20
+ ```
21
+ Usage: /user:gsd-t-optimization-reject {ID} [--reason "text"]
22
+ Example: /user:gsd-t-optimization-reject M35-OPT-001 --reason "still needs opus"
23
+
24
+ Run `/user:gsd-t-backlog-list --file optimization-backlog.md` to see pending recommendations.
25
+ ```
26
+
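One way to carry out the Step 0 parsing is sketched below. `parseRejectArgs` is a hypothetical helper, and it assumes `$ARGUMENTS` arrives as a single raw string. The default reason text is the one the command itself uses in Step 3.

```javascript
// Split "{ID} [--reason "text"]" into its two parts.
function parseRejectArgs(raw) {
  const text = (raw || '').trim();
  let id = text.split(/\s+/)[0] || '';
  // A leading flag means no ID was supplied.
  if (id.startsWith('--')) id = '';
  // Prefer a double-quoted reason; otherwise take the rest of the line.
  const m = text.match(/--reason\s+(?:"([^"]*)"|(.+))$/);
  const reason = m ? (m[1] !== undefined ? m[1] : m[2]).trim() : 'no reason given';
  return { id, reason };
}

console.log(parseRejectArgs('M35-OPT-001 --reason "still needs opus"'));
```

An empty `id` is the signal to print the usage text and exit.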
27
+ ## Step 1: Load the recommendation
28
+
29
+ ```bash
30
+ node -e "
31
+ const opt = require('./bin/token-optimizer.js');
32
+ const content = opt.readBacklog('.');
33
+ const entries = opt.parseBacklog(content);
34
+ const id = process.argv[1];
35
+ const entry = entries.find(e => e.id === id);
36
+ if (!entry) {
37
+ console.error('Recommendation not found: ' + id);
38
+ process.exit(1);
39
+ }
40
+ console.log(JSON.stringify(entry, null, 2));
41
+ " {ID}
42
+ ```
43
+
44
+ ## Step 2: Idempotency check
45
+
46
+ - If `Status: rejected` → print "Already rejected. Cooldown: {N} milestones remaining." and exit cleanly.
47
+ - If `Status: promoted` → print "This recommendation was already promoted; cannot reject now." and exit.
48
+ - If `Status: pending` → proceed.
49
+
50
+ ## Step 3: Mark the entry as rejected with cooldown
51
+
52
+ The reason text defaults to "no reason given" when `--reason` is absent.
53
+
54
+ ```bash
55
+ REASON="${REASON:-no reason given}"
56
+ node -e "
57
+ const opt = require('./bin/token-optimizer.js');
58
+ let content = opt.readBacklog('.');
59
+ content = opt.setRecommendationStatus(content, process.argv[1], {
60
+ status: 'rejected',
61
+ rejection_cooldown: 5
62
+ });
63
+ opt.writeBacklog('.', content);
64
+ console.log('Marked ' + process.argv[1] + ' as rejected (cooldown: 5 milestones). Reason: ' + process.argv[2]);
65
+ " {ID} "$REASON"
66
+ ```
67
+
68
+ Note: the reason text is captured in the observability log (Step 4) and the Decision Log (Step 5); it is not embedded in the backlog entry, which keeps `parseBacklog` simple.
69
+
70
+ ## Step 4: Observability logging
71
+
72
+ Append to `.gsd-t/token-log.md`:
73
+
74
+ ```bash
75
+ DT=$(date +"%Y-%m-%d %H:%M")
76
+ printf "| %s | %s | %s | %s | %s | %s | %s | %s | %s | %s |\n" \
77
+ "$DT" "$DT" "gsd-t-optimization-reject" "Step 3" "manual" "0s" "rejected {ID}: $REASON" "" "" "" \
78
+ >> .gsd-t/token-log.md
79
+ ```
80
+
81
+ ## Step 5: Pre-Commit Gate
82
+
83
+ Add a Decision Log entry to `.gsd-t/progress.md`:
84
+ ```
85
+ - YYYY-MM-DD HH:MM: Rejected optimization recommendation {ID} — {reason}
86
+ ```
87
+
88
+ ## Cooldown Behavior
89
+
90
+ After rejection, `bin/token-optimizer.js` will skip any fingerprint-matching recommendation for 5 subsequent `complete-milestone` invocations. The cooldown counter is stored in the entry's `Rejection cooldown` field and decrements at each `complete-milestone` run (decrement logic lives in `bin/token-optimizer.js` — Wave 5 docs task DAT-T? covers the decrement step if missing).
91
+
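Since the paragraph above leaves open whether the decrement step exists yet, the following is a purely hypothetical sketch of how such a cooldown could behave. The entry shape (`status`, `fingerprint`, `rejection_cooldown`) and both function names are assumptions; nothing here is taken from `bin/token-optimizer.js`.

```javascript
// Called once per complete-milestone run: lower the cooldown of every
// rejected entry that still has one, leaving other entries untouched.
function decrementCooldowns(entries) {
  return entries.map((e) => {
    if (e.status !== 'rejected' || !(e.rejection_cooldown > 0)) return e;
    return { ...e, rejection_cooldown: e.rejection_cooldown - 1 };
  });
}

// A new recommendation is skipped while a rejected entry with the same
// fingerprint still has cooldown remaining.
function isSuppressed(entries, fingerprint) {
  return entries.some(
    (e) =>
      e.status === 'rejected' &&
      e.fingerprint === fingerprint &&
      e.rejection_cooldown > 0
  );
}
```

Whether the check runs before or after the decrement within a `complete-milestone` run shifts the effective window by one milestone, so that ordering would need to be pinned down in the contract.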
92
+ ## Contract References
93
+
94
+ - `.gsd-t/contracts/token-telemetry-contract.md` v1.0.0 (source of truth for the optimization-backlog flow)
@@ -2,6 +2,13 @@
2
2
 
3
3
  You are the lead agent in a contract-driven development workflow. Your job is to decompose the current milestone into independent domains with explicit boundaries and contracts.
4
4
 
5
+ ## Model Assignment
6
+
7
+ Per `.gsd-t/contracts/model-selection-contract.md` v1.0.0.
8
+
9
+ - **Default**: `opus` (`selectModel({phase: "partition"})`) — domain decomposition is architectural reasoning. High stakes.
10
+ - **Escalation**: already at opus; there is no stronger tier. Partition decisions are always made at full quality.
11
+
5
12
  ## Step 0.5: Scan Freshness Auto-Refresh
6
13
 
7
14
  Before reading scan data, check if scan docs are stale and auto-refresh if needed. This ensures partition decisions are based on current code — no warnings, no user involvement.
@@ -47,6 +47,9 @@ Create `.gsd-t/continue-here-{timestamp}.md`:
47
47
  {any known blockers, pending decisions, or things to watch out for}
48
48
  {None if clean}
49
49
 
50
+ ## Outstanding User Directive
51
+ {Copy any multi-step chain the user gave earlier in the session that has NOT been fully executed, verbatim. Examples: "run until milestone complete, then checkin publish update all", "complete M34 and then archive + publish". Resume honors this after the resumed phase finishes. Leave as _None_ if no outstanding chain.}
52
+
50
53
  ## User Note
51
54
  {$ARGUMENTS if provided, otherwise: _No note provided._}
52
55