@tekyzinc/gsd-t 2.46.11 → 2.50.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (40)
  1. package/CHANGELOG.md +11 -0
  2. package/README.md +22 -2
  3. package/bin/debug-ledger.js +193 -0
  4. package/bin/gsd-t.js +259 -1
  5. package/commands/gsd-t-debug.md +26 -1
  6. package/commands/gsd-t-execute.md +31 -3
  7. package/commands/gsd-t-help.md +18 -2
  8. package/commands/gsd-t-integrate.md +16 -0
  9. package/commands/gsd-t-quick.md +18 -1
  10. package/commands/gsd-t-test-sync.md +5 -1
  11. package/commands/gsd-t-verify.md +6 -1
  12. package/commands/gsd-t-wave.md +26 -0
  13. package/docs/GSD-T-README.md +83 -1
  14. package/docs/architecture.md +9 -1
  15. package/docs/requirements.md +30 -0
  16. package/package.json +1 -1
  17. package/templates/CLAUDE-global.md +19 -2
  18. package/templates/stacks/_security.md +243 -0
  19. package/templates/stacks/desktop.ini +2 -0
  20. package/templates/stacks/docker.md +202 -0
  21. package/templates/stacks/firebase.md +166 -0
  22. package/templates/stacks/flutter.md +205 -0
  23. package/templates/stacks/github-actions.md +201 -0
  24. package/templates/stacks/graphql.md +216 -0
  25. package/templates/stacks/neo4j.md +218 -0
  26. package/templates/stacks/nextjs.md +184 -0
  27. package/templates/stacks/node-api.md +196 -0
  28. package/templates/stacks/playwright.md +528 -0
  29. package/templates/stacks/postgresql.md +225 -0
  30. package/templates/stacks/python.md +243 -0
  31. package/templates/stacks/react-native.md +216 -0
  32. package/templates/stacks/react.md +293 -0
  33. package/templates/stacks/redux.md +193 -0
  34. package/templates/stacks/rest-api.md +202 -0
  35. package/templates/stacks/supabase.md +188 -0
  36. package/templates/stacks/tailwind.md +169 -0
  37. package/templates/stacks/typescript.md +176 -0
  38. package/templates/stacks/vite.md +176 -0
  39. package/templates/stacks/vue.md +189 -0
  40. package/templates/stacks/zustand.md +203 -0
package/CHANGELOG.md CHANGED
@@ -2,6 +2,17 @@
 
 All notable changes to GSD-T are documented here. Updated with each release.
 
+## [2.50.10] - 2026-03-25
+
+### Added
+- **18 new stack rule files** — python, flutter, tailwind, react-native, vite, nextjs, vue, docker, postgresql (with graph-in-SQL section), github-actions, rest-api, supabase, firebase, graphql, zustand, redux, neo4j, playwright. Total: 22 stack rules (was 4).
+- **Playwright best practices** — coverage matrix per feature, pairwise combinatorial testing, state transition testing, multi-step workflow testing, Page Object Model, API mocking patterns. Enforces rigorous test depth across permutations.
+- **react.md expanded** — added state management decision table, form management (react-hook-form + zod), React naming conventions (3 new sections from external best practices review).
+
+### Changed
+- Stack detection in execute, quick, and debug commands updated to cover all 22 stack files with conditional detection per project dependencies.
+- PostgreSQL graph-in-SQL patterns (adjacency lists, junction tables, recursive CTEs) added to postgresql.md based on real project analysis.
+
 ## [2.46.11] - 2026-03-24
 
 ### Added
package/README.md CHANGED
@@ -3,6 +3,7 @@
 A methodology for reliable, parallelizable development using Claude Code with optional Agent Teams support.
 
 **Eliminates context rot** — task-level fresh dispatch (one subagent per task, ~10-20% context each) means compaction never triggers.
+**Compaction-proof debug loops** — `gsd-t headless --debug-loop` runs test-fix-retest cycles as separate `claude -p` sessions. A JSONL debug ledger persists all hypothesis/fix/learning history across fresh sessions. Anti-repetition preamble injection prevents retrying failed hypotheses. Escalation tiers (sonnet → opus → human) and a hard iteration ceiling are enforced externally.
 **Safe parallel execution** — worktree isolation gives each domain agent its own filesystem; sequential atomic merges prevent conflicts.
 **Maintains test coverage** — automatically keeps tests aligned with code changes.
 **Catches downstream effects** — analyzes impact before changes break things.
@@ -11,6 +12,7 @@ A methodology for reliable, parallelizable development using Claude Code with op
 **Generates visual scan reports** — every `/gsd-t-scan` produces a self-contained HTML report with 6 live architectural diagrams, a tech debt register, and domain health scores; optional DOCX/PDF export via `--export docx|pdf`.
 **Self-learning rule engine** — declarative rules in rules.jsonl detect failure patterns from task metrics. Candidate patches progress through a 5-stage lifecycle (candidate, applied, measured, promoted, graduated) with >55% improvement gates before becoming permanent methodology artifacts.
 **Cross-project learning** — proven rules propagate to `~/.claude/metrics/` and sync across all registered projects via `update-all`. Rules validated in 3+ projects become universal; 5+ projects qualify for npm distribution. Cross-project signal comparison and global ELO rankings available via `gsd-t-metrics --cross-project` and `gsd-t-status`.
+**Stack Rules Engine** — auto-detects the project tech stack (React, TypeScript, Node API, Python, Go, Rust) from manifest files and injects mandatory best-practice rules into subagent prompts at execute time. Universal security rules always apply; stack-specific rules layer on top. Extensible: drop a `.md` file in `templates/stacks/` to add a new stack.
 
 ---
 
@@ -83,8 +85,21 @@ npx @tekyzinc/gsd-t uninstall # Remove commands (keeps project files)
 gsd-t headless verify --json --timeout=1200 # Run verify non-interactively
 gsd-t headless query status # Get project state (no LLM, <100ms)
 gsd-t headless query domains # List domains (no LLM)
+
+# Headless debug-loop (compaction-proof automated test-fix-retest)
+gsd-t headless --debug-loop # Auto-detect test cmd, up to 20 iterations
+gsd-t headless --debug-loop --max-iterations=10 # Cap at 10 iterations
+gsd-t headless --debug-loop --test-cmd="npm test" # Override test command
+gsd-t headless --debug-loop --fix-scope="src/auth/**" # Limit fix scope
+gsd-t headless --debug-loop --json --log # Structured output + per-iteration logs
 ```
 
+Each iteration runs as a fresh `claude -p` session. A cumulative debug ledger (`.gsd-t/debug-state.jsonl`) preserves hypothesis/fix/learning history across sessions. An anti-repetition preamble prevents retrying failed approaches.
+
+**Escalation tiers**: sonnet (iterations 1–5) → opus (6–15) → STOP with diagnostic summary (16–20)
+
+**Exit codes**: `0` all tests pass · `1` max iterations reached · `2` compaction error · `3` process error · `4` needs human decision
+
 ### Updating
 
 When a new version is published:
@@ -321,7 +336,7 @@ get-stuff-done-teams/
 │ ├── branch.md # Git branch helper
 │ ├── checkin.md # Auto-version + commit/push helper
 │ └── Claude-md.md # Reload CLAUDE.md directives
-├── templates/ # Document templates
+├── templates/ # Document templates (9 base + stacks/)
 │ ├── CLAUDE-global.md
 │ ├── CLAUDE-project.md
 │ ├── requirements.md
@@ -330,7 +345,12 @@ get-stuff-done-teams/
 │ ├── infrastructure.md
 │ ├── progress.md
 │ ├── backlog.md
-└── backlog-settings.md
+├── backlog-settings.md
+│ └── stacks/ # Stack Rules Engine templates
+│ ├── _security.md # Universal — always injected
+│ ├── react.md
+│ ├── typescript.md
+│ └── node-api.md
 ├── scripts/ # Runtime utility scripts (installed to ~/.claude/scripts/)
 │ ├── gsd-t-tools.js # State CLI (get/set/validate/list)
 │ ├── gsd-t-statusline.js # Context usage bar
package/bin/debug-ledger.js ADDED
@@ -0,0 +1,193 @@
+#!/usr/bin/env node
+
+/**
+ * GSD-T Debug Ledger — Persistent debug iteration store
+ *
+ * Reads and writes debug iteration records to .gsd-t/debug-state.jsonl.
+ * Supports compaction detection and ledger lifecycle management.
+ *
+ * Zero external dependencies (Node.js built-ins only).
+ */
+
+const fs = require("fs");
+const path = require("path");
+
+// ── Constants ─────────────────────────────────────────────────────────────────
+
+const COMPACTION_THRESHOLD = 51200; // 50KB
+
+const REQUIRED_FIELDS = [
+  "iteration", "timestamp", "test", "error",
+  "hypothesis", "fix", "fixFiles", "result",
+  "learning", "model", "duration",
+];
+
+const VALID_RESULTS = new Set(["PASS", "STILL_FAILS"]);
+
+// ── Exports ───────────────────────────────────────────────────────────────────
+
+module.exports = {
+  readLedger, appendEntry, getLedgerStats, clearLedger,
+  compactLedger, generateAntiRepetitionPreamble,
+};
+
+// ── readLedger ────────────────────────────────────────────────────────────────
+
+/**
+ * Read all entries from the debug ledger.
+ * @param {string} projectDir - Root directory of the project
+ * @returns {object[]} Array of parsed ledger entry objects
+ */
+function readLedger(projectDir) {
+  const fp = ledgerPath(projectDir);
+  if (!fs.existsSync(fp)) return [];
+  const content = fs.readFileSync(fp, "utf8").trim();
+  if (!content) return [];
+  return content.split("\n").map(safeParse).filter(Boolean);
+}
+
+// ── appendEntry ───────────────────────────────────────────────────────────────
+
+/**
+ * Validate and append one debug iteration entry to the ledger.
+ * Creates the file and parent directories if they do not exist.
+ * @param {string} projectDir - Root directory of the project
+ * @param {object} entry - Debug iteration record (see Required Fields)
+ * @throws {Error} If required fields are missing or invalid
+ */
+function appendEntry(projectDir, entry) {
+  const err = validateEntry(entry);
+  if (err) throw new Error(err);
+  const fp = ledgerPath(projectDir);
+  ensureDir(path.dirname(fp));
+  fs.appendFileSync(fp, JSON.stringify(entry) + "\n");
+}
+
+// ── getLedgerStats ────────────────────────────────────────────────────────────
+
+/**
+ * Return summary statistics for the current ledger.
+ * @param {string} projectDir - Root directory of the project
+ * @returns {{ entryCount: number, sizeBytes: number, needsCompaction: boolean, failedHypotheses: string[], passCount: number, failCount: number }}
+ */
+function getLedgerStats(projectDir) {
+  const fp = ledgerPath(projectDir);
+  const entries = readLedger(projectDir);
+  const sizeBytes = fs.existsSync(fp) ? fs.statSync(fp).size : 0;
+  const failedHypotheses = entries
+    .filter((e) => e.result === "STILL_FAILS" && e.hypothesis)
+    .map((e) => e.hypothesis);
+  const passCount = entries.filter((e) => e.result === "PASS").length;
+  const failCount = entries.filter((e) => e.result === "STILL_FAILS").length;
+  return {
+    entryCount: entries.length,
+    sizeBytes,
+    needsCompaction: sizeBytes > COMPACTION_THRESHOLD,
+    failedHypotheses,
+    passCount,
+    failCount,
+  };
+}
+
+// ── clearLedger ───────────────────────────────────────────────────────────────
+
+/**
+ * Delete the debug ledger file. Called when all tests pass.
+ * No-op if the file does not exist.
+ * @param {string} projectDir - Root directory of the project
+ */
+function clearLedger(projectDir) {
+  const fp = ledgerPath(projectDir);
+  if (fs.existsSync(fp)) fs.unlinkSync(fp);
+}
+
+// ── compactLedger ─────────────────────────────────────────────────────────────
+
+/**
+ * Compact the ledger by replacing all but the last 5 entries with a summary.
+ * @param {string} projectDir - Root directory of the project
+ * @param {string} summary - Summarization of compacted entries
+ */
+function compactLedger(projectDir, summary) {
+  const entries = readLedger(projectDir);
+  const tail = entries.slice(-5);
+  const compactedEntry = {
+    compacted: true,
+    learning: summary,
+    iteration: 0,
+    timestamp: new Date().toISOString(),
+    test: "compacted",
+    error: "see summary",
+    hypothesis: "compacted",
+    fix: "compacted",
+    fixFiles: [],
+    result: "compacted",
+    model: "haiku",
+    duration: 0,
+  };
+  const fp = ledgerPath(projectDir);
+  ensureDir(path.dirname(fp));
+  const lines = [compactedEntry, ...tail].map((e) => JSON.stringify(e)).join("\n") + "\n";
+  fs.writeFileSync(fp, lines);
+}
+
+// ── generateAntiRepetitionPreamble ────────────────────────────────────────────
+
+/**
+ * Build a preamble string listing failed hypotheses and the current narrowing
+ * direction. Injected into each claude -p session to prevent repeated attempts.
+ * @param {string} projectDir - Root directory of the project
+ * @returns {string} Formatted preamble, or empty string if ledger is empty
+ */
+function generateAntiRepetitionPreamble(projectDir) {
+  const entries = readLedger(projectDir);
+  if (!entries.length) return "";
+  const failed = entries.filter((e) => e.result === "STILL_FAILS");
+  const learnings = entries.filter((e) => e.learning && !e.compacted);
+  const lastLearning = learnings.length ? learnings[learnings.length - 1].learning : null;
+  const failLines = failed
+    .map((e, i) => `${i + 1}. [iteration ${e.iteration}] "${e.hypothesis}" — FAILED: ${e.error}`)
+    .join("\n");
+  const stillFailing = failed.map((e) => `- ${e.test}: ${e.error}`).join("\n");
+  const direction = lastLearning
+    ? `Based on ${entries.length} iterations, the evidence points to: ${lastLearning}`
+    : "No narrowing direction established yet.";
+  return [
+    "## Debug Ledger Context (DO NOT retry failed approaches)",
+    "",
+    "### Failed Hypotheses (DO NOT retry these):",
+    failLines || "(none yet)",
+    "",
+    "### Current Narrowing Direction:",
+    direction,
+    "",
+    "### Tests Still Failing:",
+    stillFailing || "(none recorded)",
+  ].join("\n");
+}
+
+// ── Internal helpers ──────────────────────────────────────────────────────────
+
+function ledgerPath(projectDir) {
+  return path.join(projectDir || process.cwd(), ".gsd-t", "debug-state.jsonl");
+}
+
+function ensureDir(dir) {
+  if (!fs.existsSync(dir)) fs.mkdirSync(dir, { recursive: true });
+}
+
+function safeParse(line) {
+  try { return JSON.parse(line); } catch { return null; }
+}
+
+function validateEntry(entry) {
+  if (!entry || typeof entry !== "object") return "Entry must be an object";
+  for (const f of REQUIRED_FIELDS) {
+    if (entry[f] === undefined || entry[f] === null) return `Missing required field: ${f}`;
+  }
+  if (typeof entry.iteration !== "number") return "iteration must be a number";
+  if (typeof entry.duration !== "number") return "duration must be a number";
+  if (!Array.isArray(entry.fixFiles)) return "fixFiles must be an array";
+  if (!VALID_RESULTS.has(entry.result)) return `result must be "PASS" or "STILL_FAILS"`;
+  return null;
+}
package/bin/gsd-t.js CHANGED
@@ -19,6 +19,7 @@ const fs = require("fs");
 const path = require("path");
 const os = require("os");
 const { execFileSync, spawn: cpSpawn } = require("child_process");
+const debugLedger = require(path.join(__dirname, "debug-ledger.js"));
 
 // ─── Configuration ───────────────────────────────────────────────────────────
 
@@ -2174,6 +2175,236 @@ function doHeadlessQuery(type) {
   process.stdout.write(JSON.stringify(result) + "\n");
 }
 
+/**
+ * Parse debug-loop flags from args array.
+ * Extracts --max-iterations, --test-cmd, --fix-scope, --json, --log from args.
+ */
+function parseDebugLoopFlags(args) {
+  const flags = { maxIterations: 20, testCmd: null, fixScope: null, json: false, log: false };
+  const positional = [];
+  for (const arg of args) {
+    if (arg.startsWith("--max-iterations=")) {
+      const n = parseInt(arg.slice("--max-iterations=".length), 10);
+      if (!isNaN(n) && n > 0) flags.maxIterations = n;
+    } else if (arg.startsWith("--test-cmd=")) {
+      flags.testCmd = arg.slice("--test-cmd=".length);
+    } else if (arg.startsWith("--fix-scope=")) {
+      flags.fixScope = arg.slice("--fix-scope=".length);
+    } else if (arg === "--json") {
+      flags.json = true;
+    } else if (arg === "--log") {
+      flags.log = true;
+    } else {
+      positional.push(arg);
+    }
+  }
+  return { flags, positional };
+}
+
+/**
+ * Return the escalation model for a given iteration number.
+ * Tiers: 1-5 → sonnet, 6-15 → opus, 16+ → null (stop)
+ */
+function getEscalationModel(iteration) {
+  if (iteration >= 1 && iteration <= 5) return "sonnet";
+  if (iteration >= 6 && iteration <= 15) return "opus";
+  return null;
+}
+
+/**
+ * Spawn a single `claude -p` session and return stdout as a string.
+ * Returns null if the process fails.
+ */
+function spawnClaudeSession(prompt, model) {
+  try {
+    return execFileSync("claude", ["-p", prompt, "--model", model], {
+      encoding: "utf8", timeout: 300000,
+      stdio: ["pipe", "pipe", "pipe"],
+    });
+  } catch (e) {
+    return (e.stdout || "") + (e.stderr || "") || null;
+  }
+}
+
+/**
+ * Parse test pass/fail from claude output.
+ * Returns { passed: bool, summary: string }.
+ */
+function parseTestResult(output) {
+  const out = (output || "").toLowerCase();
+  const passed =
+    /\ball tests? pass(ed|ing)?\b/.test(out) ||
+    /\ball \d+ tests? pass/.test(out) ||
+    /\bno (test )?failures?\b/.test(out) ||
+    /\btests? (all )?pass(ed)?\b/.test(out);
+  const failed =
+    /\bfail(ed|ing|ure)?\b/.test(out) ||
+    /\berror\b/.test(out) ||
+    /\bnot ok\b/.test(out);
+  const summary = (output || "").slice(0, 500).replace(/\n/g, " ").trim();
+  return { passed: passed && !failed, summary };
+}
+
+/**
+ * Run ledger compaction: spawn haiku to summarize, then compact.
+ */
+function runLedgerCompaction(projectDir, jsonMode) {
+  const entries = debugLedger.readLedger(projectDir);
+  const compactPrompt =
+    "Read this debug ledger. Produce a condensed summary of what has been tried, " +
+    "what failed, and what the evidence suggests. Be concise.\n\n" +
+    JSON.stringify(entries, null, 2);
+  let summary = "Compacted — see previous entries.";
+  try {
+    const out = execFileSync("claude", ["-p", compactPrompt, "--model", "haiku"], {
+      encoding: "utf8", timeout: 120000, stdio: ["pipe", "pipe", "pipe"],
+    });
+    summary = (out || "").trim() || summary;
+  } catch (e) {
+    if (!jsonMode) warn("Compaction haiku session failed — using default summary");
+  }
+  debugLedger.compactLedger(projectDir, summary);
+}
+
+/**
+ * Write a per-iteration log file under .gsd-t/.
+ */
+function writeIterationLog(projectDir, ts, iteration, entry, rawOutput) {
+  const logDir = path.join(projectDir, ".gsd-t");
+  if (!fs.existsSync(logDir)) fs.mkdirSync(logDir, { recursive: true });
+  const fname = `headless-debug-${ts}-iter-${iteration}.log`;
+  const content = [
+    `Iteration: ${iteration}`,
+    `Timestamp: ${entry.timestamp}`,
+    `Model: ${entry.model}`,
+    `Result: ${entry.result}`,
+    `Fix: ${entry.fix}`,
+    `Learning: ${entry.learning}`,
+    `---`,
+    rawOutput || "",
+  ].join("\n");
+  fs.writeFileSync(path.join(logDir, fname), content);
+}
+
+/**
+ * Full debug-loop: validate flags, check claude CLI, run iteration cycle.
+ */
+function doHeadlessDebugLoop(flags) {
+  const opts = flags || {};
+  const jsonMode = opts.json || false;
+  const projectDir = process.cwd();
+
+  if (opts.maxIterations < 1) {
+    const msg = "--max-iterations must be >= 1";
+    if (jsonMode) process.stdout.write(JSON.stringify({ success: false, exitCode: 3, error: msg }) + "\n");
+    else error(msg);
+    process.exit(3);
+  }
+
+  try {
+    execFileSync("claude", ["--version"], { encoding: "utf8", timeout: 5000, stdio: ["pipe", "pipe", "pipe"] });
+  } catch {
+    const msg = "claude CLI not found. Install with: npm install -g @anthropic-ai/claude-code";
+    if (jsonMode) process.stdout.write(JSON.stringify({ success: false, exitCode: 3, error: msg }) + "\n");
+    else error(msg);
+    process.exit(3);
+  }
+
+  if (!jsonMode) {
+    heading("GSD-T Headless — Debug Loop");
+    info(`Max iterations: ${opts.maxIterations}`);
+    if (opts.testCmd) info(`Test command: ${opts.testCmd}`);
+    if (opts.fixScope) info(`Fix scope: ${opts.fixScope}`);
+    if (opts.log) info(`Logging: enabled`);
+    log("");
+  }
+
+  const ts = Date.now();
+
+  for (let iteration = 1; iteration <= opts.maxIterations; iteration++) {
+    const model = getEscalationModel(iteration);
+
+    // STOP tier: escalation stop
+    if (model === null) {
+      const entries = debugLedger.readLedger(projectDir);
+      const stats = debugLedger.getLedgerStats(projectDir);
+      const diagMsg = `ESCALATION STOP at iteration ${iteration}. ` +
+        `Entries: ${stats.entryCount}, Failures: ${stats.failCount}. ` +
+        `Failed hypotheses:\n${stats.failedHypotheses.map((h, i) => ` ${i + 1}. ${h}`).join("\n")}`;
+      if (jsonMode) {
+        process.stdout.write(JSON.stringify({ success: false, exitCode: 4, iteration, diagnostic: diagMsg, entries }) + "\n");
+      } else {
+        log("");
+        warn(diagMsg);
+      }
+      process.exit(4);
+    }
+
+    // Check compaction
+    const stats = debugLedger.getLedgerStats(projectDir);
+    if (stats.needsCompaction) {
+      if (!jsonMode) info("Ledger compaction triggered...");
+      try { runLedgerCompaction(projectDir, jsonMode); }
+      catch { process.exit(2); }
+    }
+
+    // Generate preamble and build prompt
+    const preamble = debugLedger.generateAntiRepetitionPreamble(projectDir);
+    const scopeHint = opts.fixScope ? `\nFix scope: ${opts.fixScope}` : "";
+    const testHint = opts.testCmd ? `\nRun tests with: ${opts.testCmd}` : "";
+    const prompt = [preamble, `Fix the failing test(s). Write your fix, then run the test suite. Report results.${scopeHint}${testHint}`]
+      .filter(Boolean).join("\n\n");
+
+    if (!jsonMode) info(`Iteration ${iteration}/${opts.maxIterations} [${model}]...`);
+
+    const iterStart = Date.now();
+    let rawOutput = null;
+    try { rawOutput = spawnClaudeSession(prompt, model); }
+    catch (e) {
+      if (jsonMode) process.stdout.write(JSON.stringify({ success: false, exitCode: 3, iteration, error: String(e) }) + "\n");
+      else error(`Process error at iteration ${iteration}: ${e.message}`);
+      process.exit(3);
+    }
+    const duration = Math.round((Date.now() - iterStart) / 1000);
+
+    const { passed, summary } = parseTestResult(rawOutput);
+    const result = passed ? "PASS" : "STILL_FAILS";
+
+    // Extract fix description from output (first substantial line, capped at 200 chars)
+    const fixDesc = (rawOutput || "").split("\n").find((l) => l.trim().length > 20) || "see output";
+    const entry = {
+      iteration, timestamp: new Date().toISOString(),
+      test: opts.testCmd || "unspecified", error: passed ? "" : summary,
+      hypothesis: `iteration-${iteration}`, fix: fixDesc.trim().slice(0, 200),
+      fixFiles: [], result, learning: summary.slice(0, 300),
+      model, duration,
+    };
+
+    try { debugLedger.appendEntry(projectDir, entry); }
+    catch (e) {
+      if (!jsonMode) warn(`Failed to append ledger entry: ${e.message}`);
+    }
+
+    if (opts.log) writeIterationLog(projectDir, ts, iteration, entry, rawOutput);
+
+    if (jsonMode) {
+      process.stdout.write(JSON.stringify({ success: passed, exitCode: passed ? 0 : 1, iteration, result, model, duration, summary }) + "\n");
+    } else {
+      info(` Result: ${result}`);
+    }
+
+    if (passed) {
+      debugLedger.clearLedger(projectDir);
+      if (!jsonMode) log(`\n${GREEN}All tests pass — debug loop complete.${RESET}`);
+      process.exit(0);
+    }
+  }
+
+  // Max iterations reached
+  if (!jsonMode) warn(`Max iterations (${opts.maxIterations}) reached without all tests passing.`);
+  process.exit(1);
+}
+
 function doHeadless(args) {
   const sub = args[0];
   if (!sub || sub === "--help" || sub === "-h") {
@@ -2181,6 +2412,12 @@ function doHeadless(args) {
     return;
   }
 
+  if (sub === "--debug-loop") {
+    const { flags } = parseDebugLoopFlags(args.slice(1));
+    doHeadlessDebugLoop(flags);
+    return;
+  }
+
  if (sub === "query") {
    const type = args[1];
    doHeadlessQuery(type);
@@ -2196,7 +2433,24 @@ function showHeadlessHelp() {
   log(`\n${BOLD}GSD-T Headless Mode${RESET}\n`);
   log(`${BOLD}Usage:${RESET}`);
   log(` ${CYAN}gsd-t headless${RESET} <command> [args] [--json] [--timeout=N] [--log]`);
-  log(` ${CYAN}gsd-t headless query${RESET} <type>\n`);
+  log(` ${CYAN}gsd-t headless query${RESET} <type>`);
+  log(` ${CYAN}gsd-t headless --debug-loop${RESET} [--max-iterations=N] [--test-cmd=CMD] [--fix-scope=SCOPE] [--json] [--log]\n`);
+  log(`${BOLD}Debug-loop flags:${RESET}`);
+  log(` ${CYAN}--max-iterations=N${RESET} Hard ceiling on iterations (default: 20)`);
+  log(` ${CYAN}--test-cmd=CMD${RESET} Override test command`);
+  log(` ${CYAN}--fix-scope=SCOPE${RESET} Limit fix scope to specific files or test patterns`);
+  log(` ${CYAN}--json${RESET} Structured JSON output per iteration`);
+  log(` ${CYAN}--log${RESET} Write per-iteration logs to .gsd-t/\n`);
+  log(`${BOLD}Debug-loop escalation tiers:${RESET}`);
+  log(` Iterations 1-5: sonnet (standard debug)`);
+  log(` Iterations 6-15: opus (deeper reasoning)`);
+  log(` Iterations 16-20: STOP (exit code 4 — needs human)\n`);
+  log(`${BOLD}Debug-loop exit codes:${RESET}`);
+  log(` 0 all tests pass`);
+  log(` 1 max iterations reached`);
+  log(` 2 ledger compaction error`);
+  log(` 3 process error`);
+  log(` 4 escalation stop — needs human\n`);
   log(`${BOLD}Exec flags:${RESET}`);
   log(` ${CYAN}--json${RESET} Structured JSON output`);
   log(` ${CYAN}--timeout=N${RESET} Kill after N seconds (default: 300)`);
@@ -2304,6 +2558,10 @@ module.exports = {
   doHeadlessExec,
   doHeadlessQuery,
   doHeadless,
+  // Headless debug-loop
+  parseDebugLoopFlags,
+  getEscalationModel,
+  doHeadlessDebugLoop,
   queryStatus,
   queryDomains,
   queryContracts,
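The pass/fail detection in `parseTestResult` is a regex heuristic over session output: a run counts as passing only when a pass phrase appears and no failure phrase appears. A condensed mirror of that heuristic (a sketch using a subset of the package's regexes, not the full set):

```javascript
// Condensed mirror of the parseTestResult() heuristic: a run counts as passing
// only when a pass phrase appears AND no failure/error phrase appears.
function looksLikePass(output) {
  const out = (output || "").toLowerCase();
  const passed =
    /\ball tests? pass(ed|ing)?\b/.test(out) ||
    /\bno (test )?failures?\b/.test(out);
  const failed = /\bfail(ed|ing|ure)?\b/.test(out) || /\berror\b/.test(out);
  return passed && !failed;
}

console.log(looksLikePass("All tests passed (12/12)")); // true
console.log(looksLikePass("All tests passed, 1 error in teardown")); // false
```

Note the conservative bias: any mention of "error" anywhere in the output vetoes a pass verdict, so ambiguous output is recorded as STILL_FAILS rather than falsely clearing the loop.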
package/commands/gsd-t-debug.md CHANGED
@@ -8,6 +8,22 @@ To give this debug session a fresh context window and prevent compaction, always
 
 **If you are the orchestrating agent** (you received the slash command directly):
 
+**Stack Rules Detection (before spawning subagent):**
+Run via Bash to detect project stack and collect matching rules:
+`GSD_T_DIR=$(npm root -g 2>/dev/null)/@tekyzinc/gsd-t; STACKS_DIR="$GSD_T_DIR/templates/stacks"; STACK_RULES=""; if [ -d "$STACKS_DIR" ]; then for f in "$STACKS_DIR"/_*.md; do [ -f "$f" ] && STACK_RULES="${STACK_RULES}$(cat "$f")"$'\n\n'; done; if [ -f "package.json" ]; then grep -q '"react-native"' package.json 2>/dev/null && [ -f "$STACKS_DIR/react-native.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/react-native.md")"$'\n\n'; grep -q '"react"' package.json 2>/dev/null && ! grep -q '"react-native"' package.json 2>/dev/null && [ -f "$STACKS_DIR/react.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/react.md")"$'\n\n'; grep -q '"next"' package.json 2>/dev/null && [ -f "$STACKS_DIR/nextjs.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/nextjs.md")"$'\n\n'; grep -q '"vue"' package.json 2>/dev/null && [ -f "$STACKS_DIR/vue.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/vue.md")"$'\n\n'; (grep -q '"typescript"' package.json 2>/dev/null || [ -f "tsconfig.json" ]) && [ -f "$STACKS_DIR/typescript.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/typescript.md")"$'\n\n'; grep -qE '"(express|fastify|hono|koa)"' package.json 2>/dev/null && [ -f "$STACKS_DIR/node-api.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/node-api.md")"$'\n\n'; grep -q '"tailwindcss"' package.json 2>/dev/null && [ -f "$STACKS_DIR/tailwind.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/tailwind.md")"$'\n\n'; grep -q '"vite"' package.json 2>/dev/null && [ -f "$STACKS_DIR/vite.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/vite.md")"$'\n\n'; grep -q '"@supabase/supabase-js"' package.json 2>/dev/null && [ -f "$STACKS_DIR/supabase.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/supabase.md")"$'\n\n'; grep -q '"firebase"' package.json 2>/dev/null && [ -f "$STACKS_DIR/firebase.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/firebase.md")"$'\n\n'; grep -qE '"(graphql|@apollo/client|urql)"' package.json 2>/dev/null && [ -f "$STACKS_DIR/graphql.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/graphql.md")"$'\n\n'; grep -q '"zustand"' package.json 2>/dev/null && [ -f "$STACKS_DIR/zustand.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/zustand.md")"$'\n\n'; grep -q '"@reduxjs/toolkit"' package.json 2>/dev/null && [ -f "$STACKS_DIR/redux.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/redux.md")"$'\n\n'; grep -q '"neo4j-driver"' package.json 2>/dev/null && [ -f "$STACKS_DIR/neo4j.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/neo4j.md")"$'\n\n'; grep -qE '"(pg|prisma|drizzle-orm|knex)"' package.json 2>/dev/null && [ -f "$STACKS_DIR/postgresql.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/postgresql.md")"$'\n\n'; grep -qE '"(express|fastify|hono|koa)"' package.json 2>/dev/null && [ -f "$STACKS_DIR/rest-api.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/rest-api.md")"$'\n\n'; fi; ([ -f "requirements.txt" ] || [ -f "pyproject.toml" ] || [ -f "Pipfile" ]) && [ -f "$STACKS_DIR/python.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/python.md")"$'\n\n'; ([ -f "requirements.txt" ] && grep -q "psycopg" requirements.txt 2>/dev/null || [ -f "pyproject.toml" ] && grep -q "psycopg" pyproject.toml 2>/dev/null) && [ -f "$STACKS_DIR/postgresql.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/postgresql.md")"$'\n\n'; ([ -f "requirements.txt" ] && grep -q "neo4j" requirements.txt 2>/dev/null) && [ -f "$STACKS_DIR/neo4j.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/neo4j.md")"$'\n\n'; [ -f "pubspec.yaml" ] && [ -f "$STACKS_DIR/flutter.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/flutter.md")"$'\n\n'; [ -f "Dockerfile" ] && [ -f "$STACKS_DIR/docker.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/docker.md")"$'\n\n'; [ -d ".github/workflows" ] && [ -f "$STACKS_DIR/github-actions.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/github-actions.md")"$'\n\n'; ([ -f "playwright.config.ts" ] || [ -f "playwright.config.js" ]) && [ -f "$STACKS_DIR/playwright.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/playwright.md")"$'\n\n'; [ -f "go.mod" ] && [ -f "$STACKS_DIR/go.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/go.md")"$'\n\n'; [ -f "Cargo.toml" ] && [ -f "$STACKS_DIR/rust.md" ] && STACK_RULES="${STACK_RULES}$(cat "$STACKS_DIR/rust.md")"$'\n\n'; fi`
+
+If STACK_RULES is non-empty, append to the subagent prompt:
+```
+## Stack Rules (MANDATORY — violations fail this task)
+
+{STACK_RULES}
+
+These standards have the same enforcement weight as contract compliance.
+Violations are task failures, not warnings.
+```
+
+If STACK_RULES is empty (no templates/stacks/ dir or no matches), skip silently.
+
 **OBSERVABILITY LOGGING (MANDATORY):**
 Before spawning — run via Bash:
 `T_START=$(date +%s) && DT_START=$(date +"%Y-%m-%d %H:%M") && TOK_START=${CLAUDE_CONTEXT_TOKENS_USED:-0} && TOK_MAX=${CLAUDE_CONTEXT_TOKENS_MAX:-200000}`
@@ -215,7 +231,16 @@ When you encounter unexpected situations during the fix:
 3. **Blocker (missing file, wrong API response)** → Fix blocker and continue. Log if non-trivial.
 4. **Architectural change required to fix correctly** → STOP. Explain what exists, what needs to change, what breaks, and a migration path. Wait for user approval. Never self-approve.
 
-**3-attempt limit**: If your fix doesn't work after 3 attempts within this session, treat it as a loop. Do NOT keep trying the same approach. Log the attempt to `.gsd-t/progress.md` Decision Log with a `[failure]` prefix, then return to Step 1.5 and run Deep Research Mode before any further attempts. Present findings and options to the user before proceeding.
+**3-attempt limit**: If your fix doesn't work after 3 attempts within this session, treat it as a loop. Do NOT keep trying the same approach. Before entering Deep Research Mode, first try the headless debug-loop:
+1. Write current failure context to `.gsd-t/debug-state.jsonl` via appendEntry
+2. Log: "Delegating to headless debug-loop (3 in-context attempts exhausted)"
+3. Run: `gsd-t headless --debug-loop --max-iterations=10`
+4. Check exit code:
+   - 0: Tests pass, continue
+   - 1/4: Log to `.gsd-t/deferred-items.md`, then enter Deep Research Mode
+   - 3: Report error, stop
+
+If the debug-loop also fails (exit 1/4), log the attempt to `.gsd-t/progress.md` Decision Log with a `[failure]` prefix, return to Step 1.5 and run Deep Research Mode before any further attempts. Present findings and options to the user before proceeding.
 
 ### Solo Mode
 1. Reproduce the issue — **reproduction script must exist before step 2** (see Step 2.5)
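The stack-detection one-liner above boils down to one idea: map manifest evidence (package.json dependencies, marker files like `tsconfig.json` or `Dockerfile`) to rule files under `templates/stacks/`, with `_security.md` always included. A hypothetical Node sketch of that mapping (`detectStacks` and its truncated dependency table are illustrative, not the package's API):

```javascript
// Hypothetical re-implementation of the detection idea: manifest evidence in,
// stack rule filenames out. Only a few of the 22 stacks are shown here.
function detectStacks(pkg, extraFiles = []) {
  const deps = { ...(pkg.dependencies || {}), ...(pkg.devDependencies || {}) };
  const stacks = ["_security.md"]; // universal security rules always apply
  if (deps["react-native"]) stacks.push("react-native.md");
  else if (deps["react"]) stacks.push("react.md"); // react-native wins over react
  if (deps["next"]) stacks.push("nextjs.md");
  if (deps["typescript"] || extraFiles.includes("tsconfig.json")) stacks.push("typescript.md");
  if (["express", "fastify", "hono", "koa"].some((d) => deps[d])) stacks.push("node-api.md");
  if (extraFiles.includes("Dockerfile")) stacks.push("docker.md");
  return stacks;
}

console.log(detectStacks({ dependencies: { react: "^18.0.0", express: "^4.18.0" } }, ["tsconfig.json"]));
// [ '_security.md', 'react.md', 'typescript.md', 'node-api.md' ]
```

The shell version performs the same layering: universal `_*.md` files first, then one guarded `cat` per detected stack, concatenated into STACK_RULES.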