gsd-pi 2.78.1-dev.e9d88a536 → 2.78.1-dev.eccf86e27

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (212)
  1. package/README.md +5 -7
  2. package/dist/help-text.js +1 -1
  3. package/dist/resource-loader.js +6 -1
  4. package/dist/resources/.managed-resources-content-hash +1 -1
  5. package/dist/resources/extensions/gsd/auto/detect-stuck.js +41 -5
  6. package/dist/resources/extensions/gsd/auto/loop.js +235 -36
  7. package/dist/resources/extensions/gsd/auto/phases.js +14 -7
  8. package/dist/resources/extensions/gsd/auto/session.js +36 -0
  9. package/dist/resources/extensions/gsd/auto-dispatch.js +49 -4
  10. package/dist/resources/extensions/gsd/auto-post-unit.js +26 -12
  11. package/dist/resources/extensions/gsd/auto-worktree.js +185 -201
  12. package/dist/resources/extensions/gsd/auto.js +139 -49
  13. package/dist/resources/extensions/gsd/bootstrap/agent-end-recovery.js +1 -1
  14. package/dist/resources/extensions/gsd/bootstrap/register-hooks.js +26 -20
  15. package/dist/resources/extensions/gsd/bootstrap/write-gate.js +67 -55
  16. package/dist/resources/extensions/gsd/crash-recovery.js +160 -47
  17. package/dist/resources/extensions/gsd/db/auto-workers.js +227 -0
  18. package/dist/resources/extensions/gsd/db/command-queue.js +105 -0
  19. package/dist/resources/extensions/gsd/db/milestone-leases.js +210 -0
  20. package/dist/resources/extensions/gsd/db/runtime-kv.js +91 -0
  21. package/dist/resources/extensions/gsd/db/unit-dispatches.js +322 -0
  22. package/dist/resources/extensions/gsd/db-writer.js +96 -16
  23. package/dist/resources/extensions/gsd/delegation-policy.js +155 -0
  24. package/dist/resources/extensions/gsd/docs/COORDINATION.md +42 -0
  25. package/dist/resources/extensions/gsd/doctor-proactive.js +4 -0
  26. package/dist/resources/extensions/gsd/doctor-runtime-checks.js +22 -6
  27. package/dist/resources/extensions/gsd/doctor.js +12 -2
  28. package/dist/resources/extensions/gsd/gsd-db.js +355 -3
  29. package/dist/resources/extensions/gsd/guided-flow-queue.js +1 -1
  30. package/dist/resources/extensions/gsd/guided-flow.js +116 -26
  31. package/dist/resources/extensions/gsd/interrupted-session.js +18 -15
  32. package/dist/resources/extensions/gsd/metrics.js +287 -1
  33. package/dist/resources/extensions/gsd/paths.js +79 -8
  34. package/dist/resources/extensions/gsd/prompts/complete-slice.md +4 -4
  35. package/dist/resources/extensions/gsd/prompts/execute-task.md +3 -3
  36. package/dist/resources/extensions/gsd/prompts/guided-discuss-milestone.md +8 -1
  37. package/dist/resources/extensions/gsd/prompts/guided-discuss-project.md +22 -7
  38. package/dist/resources/extensions/gsd/prompts/guided-discuss-requirements.md +6 -2
  39. package/dist/resources/extensions/gsd/prompts/guided-discuss-slice.md +8 -1
  40. package/dist/resources/extensions/gsd/state.js +21 -6
  41. package/dist/resources/extensions/gsd/templates/project.md +10 -0
  42. package/dist/resources/extensions/gsd/workflow-mcp.js +2 -2
  43. package/dist/resources/extensions/gsd/workspace.js +59 -0
  44. package/dist/resources/extensions/gsd/worktree-resolver.js +79 -2
  45. package/dist/resources/extensions/gsd/write-intercept.js +3 -3
  46. package/dist/tsconfig.extensions.tsbuildinfo +1 -1
  47. package/dist/web/standalone/.next/BUILD_ID +1 -1
  48. package/dist/web/standalone/.next/app-path-routes-manifest.json +14 -14
  49. package/dist/web/standalone/.next/build-manifest.json +2 -2
  50. package/dist/web/standalone/.next/prerender-manifest.json +3 -3
  51. package/dist/web/standalone/.next/required-server-files.json +1 -1
  52. package/dist/web/standalone/.next/server/app/_global-error.html +1 -1
  53. package/dist/web/standalone/.next/server/app/_global-error.rsc +1 -1
  54. package/dist/web/standalone/.next/server/app/_global-error.segments/_full.segment.rsc +1 -1
  55. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error/__PAGE__.segment.rsc +1 -1
  56. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error.segment.rsc +1 -1
  57. package/dist/web/standalone/.next/server/app/_global-error.segments/_head.segment.rsc +1 -1
  58. package/dist/web/standalone/.next/server/app/_global-error.segments/_index.segment.rsc +1 -1
  59. package/dist/web/standalone/.next/server/app/_global-error.segments/_tree.segment.rsc +1 -1
  60. package/dist/web/standalone/.next/server/app/_not-found.html +1 -1
  61. package/dist/web/standalone/.next/server/app/_not-found.rsc +1 -1
  62. package/dist/web/standalone/.next/server/app/_not-found.segments/_full.segment.rsc +1 -1
  63. package/dist/web/standalone/.next/server/app/_not-found.segments/_head.segment.rsc +1 -1
  64. package/dist/web/standalone/.next/server/app/_not-found.segments/_index.segment.rsc +1 -1
  65. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found/__PAGE__.segment.rsc +1 -1
  66. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found.segment.rsc +1 -1
  67. package/dist/web/standalone/.next/server/app/_not-found.segments/_tree.segment.rsc +1 -1
  68. package/dist/web/standalone/.next/server/app/index.html +1 -1
  69. package/dist/web/standalone/.next/server/app/index.rsc +1 -1
  70. package/dist/web/standalone/.next/server/app/index.segments/__PAGE__.segment.rsc +1 -1
  71. package/dist/web/standalone/.next/server/app/index.segments/_full.segment.rsc +1 -1
  72. package/dist/web/standalone/.next/server/app/index.segments/_head.segment.rsc +1 -1
  73. package/dist/web/standalone/.next/server/app/index.segments/_index.segment.rsc +1 -1
  74. package/dist/web/standalone/.next/server/app/index.segments/_tree.segment.rsc +1 -1
  75. package/dist/web/standalone/.next/server/app-paths-manifest.json +14 -14
  76. package/dist/web/standalone/.next/server/middleware-build-manifest.js +1 -1
  77. package/dist/web/standalone/.next/server/pages/404.html +1 -1
  78. package/dist/web/standalone/.next/server/pages/500.html +1 -1
  79. package/dist/web/standalone/.next/server/server-reference-manifest.json +1 -1
  80. package/dist/web/standalone/server.js +1 -1
  81. package/package.json +1 -1
  82. package/packages/mcp-server/README.md +2 -11
  83. package/packages/mcp-server/dist/remote-questions.d.ts +27 -0
  84. package/packages/mcp-server/dist/remote-questions.d.ts.map +1 -1
  85. package/packages/mcp-server/dist/remote-questions.js +28 -0
  86. package/packages/mcp-server/dist/remote-questions.js.map +1 -1
  87. package/packages/mcp-server/dist/server.d.ts +28 -0
  88. package/packages/mcp-server/dist/server.d.ts.map +1 -1
  89. package/packages/mcp-server/dist/server.js +94 -4
  90. package/packages/mcp-server/dist/server.js.map +1 -1
  91. package/packages/mcp-server/dist/workflow-tools.js.map +1 -1
  92. package/packages/mcp-server/src/mcp-server.test.ts +226 -0
  93. package/packages/mcp-server/src/remote-questions.test.ts +103 -0
  94. package/packages/mcp-server/src/remote-questions.ts +35 -0
  95. package/packages/mcp-server/src/server.ts +129 -6
  96. package/packages/mcp-server/src/workflow-tools.ts +1 -1
  97. package/packages/mcp-server/tsconfig.tsbuildinfo +1 -1
  98. package/src/resources/extensions/gsd/auto/detect-stuck.ts +37 -5
  99. package/src/resources/extensions/gsd/auto/loop.ts +263 -41
  100. package/src/resources/extensions/gsd/auto/phases.ts +15 -7
  101. package/src/resources/extensions/gsd/auto/session.ts +40 -0
  102. package/src/resources/extensions/gsd/auto-dispatch.ts +63 -4
  103. package/src/resources/extensions/gsd/auto-post-unit.ts +27 -12
  104. package/src/resources/extensions/gsd/auto-worktree.ts +218 -225
  105. package/src/resources/extensions/gsd/auto.ts +166 -43
  106. package/src/resources/extensions/gsd/bootstrap/agent-end-recovery.ts +1 -1
  107. package/src/resources/extensions/gsd/bootstrap/register-hooks.ts +26 -21
  108. package/src/resources/extensions/gsd/bootstrap/tests/write-gate-basepath.test.ts +103 -0
  109. package/src/resources/extensions/gsd/bootstrap/write-gate.ts +80 -55
  110. package/src/resources/extensions/gsd/crash-recovery.ts +177 -43
  111. package/src/resources/extensions/gsd/db/auto-workers.ts +273 -0
  112. package/src/resources/extensions/gsd/db/command-queue.ts +149 -0
  113. package/src/resources/extensions/gsd/db/milestone-leases.ts +274 -0
  114. package/src/resources/extensions/gsd/db/runtime-kv.ts +127 -0
  115. package/src/resources/extensions/gsd/db/unit-dispatches.ts +446 -0
  116. package/src/resources/extensions/gsd/db-writer.ts +113 -17
  117. package/src/resources/extensions/gsd/delegation-policy.ts +197 -0
  118. package/src/resources/extensions/gsd/docs/COORDINATION.md +42 -0
  119. package/src/resources/extensions/gsd/doctor-proactive.ts +4 -0
  120. package/src/resources/extensions/gsd/doctor-runtime-checks.ts +24 -6
  121. package/src/resources/extensions/gsd/doctor.ts +10 -2
  122. package/src/resources/extensions/gsd/gsd-db.ts +354 -3
  123. package/src/resources/extensions/gsd/guided-flow-queue.ts +1 -1
  124. package/src/resources/extensions/gsd/guided-flow.ts +152 -26
  125. package/src/resources/extensions/gsd/interrupted-session.ts +19 -12
  126. package/src/resources/extensions/gsd/metrics.ts +321 -1
  127. package/src/resources/extensions/gsd/paths.ts +67 -8
  128. package/src/resources/extensions/gsd/prompts/complete-slice.md +4 -4
  129. package/src/resources/extensions/gsd/prompts/execute-task.md +3 -3
  130. package/src/resources/extensions/gsd/prompts/guided-discuss-milestone.md +8 -1
  131. package/src/resources/extensions/gsd/prompts/guided-discuss-project.md +22 -7
  132. package/src/resources/extensions/gsd/prompts/guided-discuss-requirements.md +6 -2
  133. package/src/resources/extensions/gsd/prompts/guided-discuss-slice.md +8 -1
  134. package/src/resources/extensions/gsd/state.ts +44 -6
  135. package/src/resources/extensions/gsd/templates/project.md +10 -0
  136. package/src/resources/extensions/gsd/tests/auto-discuss-milestone-deadlock-4973.test.ts +14 -14
  137. package/src/resources/extensions/gsd/tests/auto-loop-no-copy-artifacts.test.ts +72 -0
  138. package/src/resources/extensions/gsd/tests/auto-loop-symlink-worktree.test.ts +190 -0
  139. package/src/resources/extensions/gsd/tests/auto-session-scope.test.ts +331 -0
  140. package/src/resources/extensions/gsd/tests/auto-workers.test.ts +105 -0
  141. package/src/resources/extensions/gsd/tests/auto-worktree-registry.test.ts +176 -0
  142. package/src/resources/extensions/gsd/tests/command-queue.test.ts +141 -0
  143. package/src/resources/extensions/gsd/tests/crash-recovery-via-db.test.ts +203 -0
  144. package/src/resources/extensions/gsd/tests/crash-recovery.test.ts +169 -59
  145. package/src/resources/extensions/gsd/tests/db-writer-path-containment.test.ts +152 -0
  146. package/src/resources/extensions/gsd/tests/db-writer-root-artifact.test.ts +221 -0
  147. package/src/resources/extensions/gsd/tests/db-writer-scope.test.ts +230 -0
  148. package/src/resources/extensions/gsd/tests/delegation-policy.test.ts +151 -0
  149. package/src/resources/extensions/gsd/tests/detect-stuck-respects-retry.test.ts +173 -0
  150. package/src/resources/extensions/gsd/tests/dispatch-backgroundable-annotation.test.ts +55 -0
  151. package/src/resources/extensions/gsd/tests/draft-promotion.test.ts +3 -23
  152. package/src/resources/extensions/gsd/tests/gate-1b-orphan-discrimination.test.ts +193 -0
  153. package/src/resources/extensions/gsd/tests/gate-1b-recovery-bound-corrections.test.ts +246 -0
  154. package/src/resources/extensions/gsd/tests/gate-1b-recovery-bound.test.ts +218 -0
  155. package/src/resources/extensions/gsd/tests/gsd-db-failed-open-restore.test.ts +117 -0
  156. package/src/resources/extensions/gsd/tests/gsd-db-workspace-scope.test.ts +226 -0
  157. package/src/resources/extensions/gsd/tests/gsd-root-canonical.test.ts +66 -0
  158. package/src/resources/extensions/gsd/tests/gsd-root-home-guard.test.ts +68 -5
  159. package/src/resources/extensions/gsd/tests/guided-flow-prompt-consolidation.test.ts +4 -4
  160. package/src/resources/extensions/gsd/tests/integration/auto-worktree.test.ts +22 -12
  161. package/src/resources/extensions/gsd/tests/integration/doctor-proactive.test.ts +24 -10
  162. package/src/resources/extensions/gsd/tests/integration/doctor-runtime.test.ts +35 -23
  163. package/src/resources/extensions/gsd/tests/integration/workspace-collapse-integration.test.ts +369 -0
  164. package/src/resources/extensions/gsd/tests/interrupted-session-auto.test.ts +72 -25
  165. package/src/resources/extensions/gsd/tests/interrupted-session-ui.test.ts +72 -25
  166. package/src/resources/extensions/gsd/tests/memory-pressure-stuck-state.test.ts +9 -6
  167. package/src/resources/extensions/gsd/tests/metrics-atomic-merge.test.ts +222 -0
  168. package/src/resources/extensions/gsd/tests/metrics-lock-hardening.test.ts +400 -0
  169. package/src/resources/extensions/gsd/tests/metrics-lock-not-acquired.test.ts +141 -0
  170. package/src/resources/extensions/gsd/tests/metrics-lock-retry-sleep.test.ts +287 -0
  171. package/src/resources/extensions/gsd/tests/metrics-prune-cache-invalidation.test.ts +149 -0
  172. package/src/resources/extensions/gsd/tests/metrics-scope.test.ts +378 -0
  173. package/src/resources/extensions/gsd/tests/milestone-leases.test.ts +152 -0
  174. package/src/resources/extensions/gsd/tests/originalbase-path-comparison.test.ts +329 -0
  175. package/src/resources/extensions/gsd/tests/parallel-milestone-isolation.test.ts +106 -0
  176. package/src/resources/extensions/gsd/tests/path-cache-decoupled.test.ts +209 -0
  177. package/src/resources/extensions/gsd/tests/path-normalization-unified.test.ts +175 -0
  178. package/src/resources/extensions/gsd/tests/paths-cache.test.ts +170 -0
  179. package/src/resources/extensions/gsd/tests/paused-session-via-db.test.ts +119 -0
  180. package/src/resources/extensions/gsd/tests/pending-autostart-scope.test.ts +120 -0
  181. package/src/resources/extensions/gsd/tests/pipeline-variant-dispatch.test.ts +58 -0
  182. package/src/resources/extensions/gsd/tests/preferences-worktree-sync.test.ts +3 -17
  183. package/src/resources/extensions/gsd/tests/prompt-contracts.test.ts +150 -7
  184. package/src/resources/extensions/gsd/tests/register-hooks-depth-verification.test.ts +138 -16
  185. package/src/resources/extensions/gsd/tests/resume-missing-worktree-warning.test.ts +209 -0
  186. package/src/resources/extensions/gsd/tests/runtime-kv.test.ts +120 -0
  187. package/src/resources/extensions/gsd/tests/skipped-validation-completion.test.ts +133 -28
  188. package/src/resources/extensions/gsd/tests/skipped-validation-db-atomicity.test.ts +17 -0
  189. package/src/resources/extensions/gsd/tests/stuck-state-via-db.test.ts +134 -0
  190. package/src/resources/extensions/gsd/tests/sync-layer-scope.test.ts +434 -0
  191. package/src/resources/extensions/gsd/tests/teardown-chdir-failure-clears-registry.test.ts +162 -0
  192. package/src/resources/extensions/gsd/tests/teardown-cleanup-parity.test.ts +98 -0
  193. package/src/resources/extensions/gsd/tests/teardown-failure-clears-registry.test.ts +186 -0
  194. package/src/resources/extensions/gsd/tests/tool-invocation-error-loop-break.test.ts +1 -1
  195. package/src/resources/extensions/gsd/tests/unit-dispatches.test.ts +247 -0
  196. package/src/resources/extensions/gsd/tests/validate-milestone.test.ts +41 -1
  197. package/src/resources/extensions/gsd/tests/validator-scope-parity.test.ts +239 -0
  198. package/src/resources/extensions/gsd/tests/workflow-mcp.test.ts +2 -2
  199. package/src/resources/extensions/gsd/tests/workflow-tool-executors.test.ts +9 -15
  200. package/src/resources/extensions/gsd/tests/workspace.test.ts +196 -0
  201. package/src/resources/extensions/gsd/tests/write-gate-predicates.test.ts +35 -35
  202. package/src/resources/extensions/gsd/tests/write-gate.test.ts +94 -71
  203. package/src/resources/extensions/gsd/tests/write-intercept.test.ts +1 -1
  204. package/src/resources/extensions/gsd/workflow-mcp.ts +2 -2
  205. package/src/resources/extensions/gsd/workspace.ts +95 -0
  206. package/src/resources/extensions/gsd/worktree-resolver.ts +78 -2
  207. package/src/resources/extensions/gsd/write-intercept.ts +3 -3
  208. package/src/resources/extensions/gsd/tests/auto-lock-creation.test.ts +0 -213
  209. package/src/resources/extensions/gsd/tests/auto-stale-lock-self-kill.test.ts +0 -87
  210. package/src/resources/extensions/gsd/tests/stop-auto-remote.test.ts +0 -159
  211. /package/dist/web/standalone/.next/static/{oZGTPvJBQX_IDKKnuV8Bt → Y5UeGFkXTYM9WIQOWHkot}/_buildManifest.js +0 -0
  212. /package/dist/web/standalone/.next/static/{oZGTPvJBQX_IDKKnuV8Bt → Y5UeGFkXTYM9WIQOWHkot}/_ssgManifest.js +0 -0
@@ -1,3 +1,4 @@
+// GSD-2 + metrics.ts: token & cost tracking for auto-mode units
 /**
  * GSD Metrics — Token & Cost Tracking
  *
@@ -14,6 +15,7 @@
  */
 
 import { join } from "node:path";
+import { openSync, closeSync, unlinkSync, statSync, writeFileSync } from "node:fs";
 import type { ExtensionContext } from "@gsd/pi-coding-agent";
 import { gsdRoot } from "./paths.js";
 import { getAndClearSkills } from "./skill-telemetry.js";
@@ -21,6 +23,8 @@ import { loadJsonFile, loadJsonFileOrNull, saveJsonFile } from "./json-persisten
 import { parseUnitId } from "./unit-id.js";
 import { buildAuditEnvelope, emitUokAuditEvent } from "./uok/audit.js";
 import { isUnifiedAuditEnabled } from "./uok/audit-toggle.js";
+import type { MilestoneScope } from "./workspace.js";
+import { logWarning } from "./workflow-logger.js";
 
 // Re-export from shared — import directly from format-utils to avoid pulling
 // in the full barrel (mod.js → ui.js → @gsd/pi-tui) which breaks when loaded
@@ -108,11 +112,17 @@ export function classifyUnitPhase(unitType: string): MetricsPhase {
 let ledger: MetricsLedger | null = null;
 let basePath: string = "";
 
+// Per-workspace ledger map, keyed by workspace.identityKey.
+// Populated by initMetricsByScope; independent of the module singleton.
+const scopedLedgers = new Map<string, MetricsLedger>();
+
 // ─── Public API ───────────────────────────────────────────────────────────────
 
 /**
  * Initialize the metrics system for a given project.
  * Loads existing ledger from disk if present.
+ *
+ * @deprecated TODO(C-future): remove module singleton. Use initMetricsByScope instead.
  */
 export function initMetrics(base: string): void {
   basePath = base;
@@ -121,6 +131,8 @@ export function initMetrics(base: string): void {
 
 /**
  * Reset in-memory state. Called when auto-mode stops.
+ *
+ * @deprecated TODO(C-future): remove module singleton. Use resetMetricsByScope instead.
  */
 export function resetMetrics(): void {
   ledger = null;
@@ -130,6 +142,8 @@ export function resetMetrics(): void {
 /**
  * Snapshot usage metrics from the current session before it's wiped.
  * Scans session entries for AssistantMessage usage data.
+ *
+ * @deprecated TODO(C-future): remove module singleton. Use snapshotUnitMetricsByScope instead.
  */
 export function snapshotUnitMetrics(
   ctx: ExtensionContext,
@@ -272,6 +286,182 @@ export function getLedger(): MetricsLedger | null {
   return ledger;
 }
 
+// ─── Scope-aware API (canonical) ─────────────────────────────────────────────
+
+/**
+ * Initialize the metrics system for a given workspace scope.
+ * Loads existing ledger from disk into the per-scope ledger map.
+ * Does NOT touch the module-level singleton.
+ */
+export function initMetricsByScope(scope: MilestoneScope): void {
+  const base = scope.workspace.projectRoot;
+  const loaded = loadLedger(base);
+  scopedLedgers.set(scope.workspace.identityKey, loaded);
+}
+
+/**
+ * Get the in-memory ledger for the given scope, or null if not initialized.
+ */
+export function getLedgerByScope(scope: MilestoneScope): MetricsLedger | null {
+  return scopedLedgers.get(scope.workspace.identityKey) ?? null;
+}
+
+/**
+ * Reset scoped in-memory state for a workspace. Called when auto-mode stops.
+ */
+export function resetMetricsByScope(scope: MilestoneScope): void {
+  scopedLedgers.delete(scope.workspace.identityKey);
+}
+
+/**
+ * Snapshot usage metrics using an explicit workspace scope.
+ *
+ * This is the canonical variant. It derives the metrics path from
+ * scope.workspace.projectRoot rather than the module singleton, so it
+ * remains correct across session resume and in multi-workspace processes.
+ *
+ * Preserves the atomic write-merge logic from saveLedger so concurrent
+ * workers cannot silently discard each other's entries.
+ *
+ * If initMetricsByScope has not been called, the ledger is loaded from
+ * disk on first call (lazy init).
+ */
+export function snapshotUnitMetricsByScope(
+  scope: MilestoneScope,
+  ctx: ExtensionContext,
+  unitType: string,
+  unitId: string,
+  startedAt: number,
+  model: string,
+  opts?: {
+    tier?: string;
+    modelDowngraded?: boolean;
+    contextWindowTokens?: number;
+    truncationSections?: number;
+    continueHereFired?: boolean;
+    promptCharCount?: number;
+    baselineCharCount?: number;
+    autoSessionKey?: string;
+    traceId?: string;
+    turnId?: string;
+    causedBy?: string;
+  },
+): UnitMetrics | null {
+  const base = scope.workspace.projectRoot;
+  const key = scope.workspace.identityKey;
+
+  // Lazy init: load from disk if not yet in scoped map.
+  if (!scopedLedgers.has(key)) {
+    scopedLedgers.set(key, loadLedger(base));
+  }
+  const scopedLedger = scopedLedgers.get(key)!;
+
+  const entries = ctx.sessionManager.getEntries();
+  if (!entries || entries.length === 0) return null;
+
+  const tokens: TokenCounts = { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 };
+  let cost = 0;
+  let toolCalls = 0;
+  let assistantMessages = 0;
+  let userMessages = 0;
+
+  for (const entry of entries) {
+    if (entry.type !== "message") continue;
+    const msg = (entry as any).message;
+    if (!msg) continue;
+
+    if (msg.role === "assistant") {
+      assistantMessages++;
+      if (msg.usage) {
+        tokens.input += msg.usage.input ?? 0;
+        tokens.output += msg.usage.output ?? 0;
+        tokens.cacheRead += msg.usage.cacheRead ?? 0;
+        tokens.cacheWrite += msg.usage.cacheWrite ?? 0;
+        tokens.total += msg.usage.totalTokens ?? 0;
+        if (msg.usage.cost != null) {
+          const c = msg.usage.cost;
+          cost += typeof c === "number" ? c : (c.total ?? 0);
+        }
+      }
+      if (msg.content && Array.isArray(msg.content)) {
+        for (const block of msg.content) {
+          if (block.type === "toolCall") toolCalls++;
+        }
+      }
+    } else if (msg.role === "user") {
+      userMessages++;
+    }
+  }
+
+  const unit: UnitMetrics = {
+    type: unitType,
+    id: unitId,
+    model,
+    startedAt,
+    finishedAt: Date.now(),
+    ...(opts?.autoSessionKey ? { autoSessionKey: opts.autoSessionKey } : {}),
+    tokens,
+    cost,
+    toolCalls,
+    assistantMessages,
+    userMessages,
+    apiRequests: assistantMessages,
+    ...(opts?.tier ? { tier: opts.tier } : {}),
+    ...(opts?.modelDowngraded !== undefined ? { modelDowngraded: opts.modelDowngraded } : {}),
+    ...(opts?.contextWindowTokens !== undefined ? { contextWindowTokens: opts.contextWindowTokens } : {}),
+    ...(opts?.truncationSections !== undefined ? { truncationSections: opts.truncationSections } : {}),
+    ...(opts?.continueHereFired !== undefined ? { continueHereFired: opts.continueHereFired } : {}),
+    ...(opts?.promptCharCount != null ? { promptCharCount: opts.promptCharCount } : {}),
+    ...(opts?.baselineCharCount != null ? { baselineCharCount: opts.baselineCharCount } : {}),
+  };
+
+  // Auto-capture skill telemetry (#599)
+  const skills = getAndClearSkills();
+  if (skills.length > 0) {
+    unit.skills = skills;
+  }
+
+  // Compute cache hit rate
+  if (tokens.cacheRead > 0 || tokens.input > 0) {
+    const totalInput = tokens.cacheRead + tokens.input;
+    unit.cacheHitRate = totalInput > 0 ? Math.round((tokens.cacheRead / totalInput) * 100) : 0;
+  }
+
+  // Idempotency guard: update in-place on duplicate, append otherwise.
+  const dupeIdx = scopedLedger.units.findIndex(
+    (u) => u.type === unit.type && u.id === unit.id && u.startedAt === unit.startedAt,
+  );
+  if (dupeIdx >= 0) {
+    scopedLedger.units[dupeIdx] = unit;
+  } else {
+    scopedLedger.units.push(unit);
+  }
+  saveLedger(base, scopedLedger);
+
+  if (isUnifiedAuditEnabled()) {
+    emitUokAuditEvent(
+      base,
+      buildAuditEnvelope({
+        traceId: opts?.traceId ?? `metrics:${unitType}:${unitId}`,
+        turnId: opts?.turnId,
+        causedBy: opts?.causedBy,
+        category: "metrics",
+        type: "unit-metrics-snapshot",
+        payload: {
+          unitType,
+          unitId,
+          model,
+          tokens: unit.tokens,
+          cost: unit.cost,
+          toolCalls: unit.toolCalls,
+        },
+      }),
+    );
+  }
+
+  return unit;
+}
+
 // ─── Aggregation helpers ──────────────────────────────────────────────────────
 
 export interface PhaseAggregate {
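The idempotency guard and the cache-hit-rate computation in `snapshotUnitMetricsByScope` can be exercised in isolation. The following is a minimal sketch under stated assumptions, not the package's code: `UnitSketch`, `upsertUnit`, `cacheHitRate`, and the sample unit type and IDs are hypothetical names introduced for illustration. Only the matching rule (type, id, startedAt) and the rounding formula are taken from the hunk above.

```typescript
// Hypothetical, trimmed-down stand-in for UnitMetrics (only the fields used here).
interface UnitSketch {
  type: string;
  id: string;
  startedAt: number;
  finishedAt: number;
}

// Cache hit rate as a whole-number percentage of all input tokens,
// mirroring the Math.round((cacheRead / totalInput) * 100) expression in the diff.
function cacheHitRate(cacheRead: number, input: number): number {
  const totalInput = cacheRead + input;
  return totalInput > 0 ? Math.round((cacheRead / totalInput) * 100) : 0;
}

// Replace an existing entry with the same (type, id, startedAt); append otherwise.
function upsertUnit(units: UnitSketch[], unit: UnitSketch): void {
  const dupeIdx = units.findIndex(
    (u) => u.type === unit.type && u.id === unit.id && u.startedAt === unit.startedAt,
  );
  if (dupeIdx >= 0) {
    units[dupeIdx] = unit;
  } else {
    units.push(unit);
  }
}

// Sample data (hypothetical): the second call re-snapshots the same unit.
const units: UnitSketch[] = [];
upsertUnit(units, { type: "execute-task", id: "T01", startedAt: 1000, finishedAt: 2000 });
upsertUnit(units, { type: "execute-task", id: "T01", startedAt: 1000, finishedAt: 3000 });
upsertUnit(units, { type: "execute-task", id: "T02", startedAt: 1500, finishedAt: 2500 });
```

Snapshotting the same unit twice leaves a single entry carrying the later `finishedAt`, which is why a retried snapshot does not inflate the ledger.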
@@ -593,6 +783,12 @@ export function pruneMetricsLedger(base: string, keepCount: number): number {
   if (ledger) {
     ledger.units = ledger.units.slice(-keepCount);
   }
+  // Invalidate all scoped ledger cache entries. Prune is rare; clearing the
+  // entire map is simpler than tracking which entry belongs to `base`. Without
+  // this, scopedLedgers entries for the pruned workspace hold a pre-prune
+  // MetricsLedger that snapshotUnitMetricsByScope would merge back in, causing
+  // pruned units to reappear in subsequent snapshots.
+  scopedLedgers.clear();
   return removed;
 }
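The reason the prune path clears `scopedLedgers` can be reproduced with a small in-memory simulation. This is a sketch only: `disk`, `cache`, `save`, `prune`, `snapshot`, and `run` are hypothetical stand-ins for the metrics file, the scoped ledger map, and the functions in the diff, and the merge is simplified to a union by `id`.

```typescript
// Hypothetical stand-ins: `disk` plays the metrics file, `cache` plays scopedLedgers.
type Unit = { id: string };
const disk = new Map<string, Unit[]>();
const cache = new Map<string, Unit[]>();

// Simplified save: union of on-disk and in-memory units by id.
function save(key: string, inMemory: Unit[]): void {
  const merged = new Map<string, Unit>();
  for (const u of [...(disk.get(key) ?? []), ...inMemory]) merged.set(u.id, u);
  disk.set(key, Array.from(merged.values()));
}

// Prune keeps the newest `keepCount` units on disk and optionally clears the cache.
function prune(key: string, keepCount: number, invalidate: boolean): void {
  disk.set(key, (disk.get(key) ?? []).slice(-keepCount));
  if (invalidate) cache.clear();
}

// Snapshot: lazily load the cached ledger, append the unit, then save with merge.
function snapshot(key: string, unit: Unit): Unit[] {
  if (!cache.has(key)) cache.set(key, [...(disk.get(key) ?? [])]);
  const ledger = cache.get(key)!;
  ledger.push(unit);
  save(key, ledger);
  return disk.get(key)!;
}

function run(invalidate: boolean): string[] {
  disk.clear();
  cache.clear();
  snapshot("ws", { id: "a" });
  snapshot("ws", { id: "b" });
  snapshot("ws", { id: "c" });
  prune("ws", 1, invalidate);
  return snapshot("ws", { id: "d" }).map((u) => u.id);
}

const withoutInvalidation = run(false); // stale cache merges "a" and "b" back in
const withInvalidation = run(true); // only "c" and "d" remain
```

Without invalidation the cached pre-prune ledger is merged into the next save, so the pruned units return; clearing the cache forces the next snapshot to reload the pruned state from disk.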
 
@@ -635,6 +831,130 @@ function deduplicateUnits(units: UnitMetrics[]): UnitMetrics[] {
635
831
  return Array.from(map.values());
636
832
  }
637
833
 
834
+ // How long a lock file must be untouched (in ms) before it is considered
835
+ // orphaned from a crashed process. Set to 2× the acquire timeout.
836
+ export const STALE_LOCK_THRESHOLD_MS = 4000;
837
+
838
+ // Retry interval between lock acquire attempts (ms). Caps syscall rate at
839
+ // ~200 attempts over a 2s timeout instead of ~20,000 without any sleep.
840
+ // Exposed for tests.
841
+ export const LOCK_RETRY_INTERVAL_MS = 5;
842
+
843
+ // Sync sleep via Atomics.wait — true OS-level sleep, no CPU spin.
844
+ // Int32Array must reference a SharedArrayBuffer; we wait on index 0 which
845
+ // will never be woken by a Atomics.notify, so the wait always times out.
846
+ const _lockSleepBuf = new Int32Array(new SharedArrayBuffer(4));
847
+ function syncSleep(ms: number): void {
848
+ Atomics.wait(_lockSleepBuf, 0, 0, ms);
849
+ }
850
+
851
+ // Counts the number of sleepy retries (non-stale-evicting) made by acquireLock
852
+ // across all calls since the last reset. Exported for test instrumentation only.
853
+ let _lockSleepyRetries = 0;
854
+ export function getLockSleepyRetries(): number { return _lockSleepyRetries; }
855
+ export function resetLockSleepyRetries(): void { _lockSleepyRetries = 0; }
856
+
857
+ /**
858
+ * Acquire an exclusive .lock sentinel file via O_EXCL.
859
+ *
860
+ * Improvements over the original:
861
+ * - No busy spin: the inner `while (Date.now() < waitUntil) {}` spin that
862
+ * burned CPU doing nothing useful is removed. Each retry attempt now makes
863
+ * one `openSync` syscall and immediately re-checks the deadline, which is
864
+ * orders of magnitude cheaper than a tight spin loop.
865
+ * - Stale-lock detection: if the existing lock file's mtime is older than
866
+ * STALE_LOCK_THRESHOLD_MS, the lock is considered orphaned (e.g. the
867
+ * writing process crashed) and is forcibly removed before retrying.
868
+ * A warning is logged so operators can detect crash patterns.
869
+ * - PID stamp: on success, writes the acquiring process's PID and a
870
+ * timestamp into the lock file so external monitors can identify orphans.
871
+ * - Retry sleep: after each non-stale-evicting retry, sleeps
872
+ * LOCK_RETRY_INTERVAL_MS (5ms) via Atomics.wait so the process yields to
873
+ * the OS. This caps syscall rate at ~200–400/s under contention instead of
874
+ * the ~20,000/s that would result from a tight openSync loop.
875
+ * After a stale-lock eviction (lock already removed), no sleep is injected
876
+ * — we retry immediately to close the short race window.
877
+ *
878
+ * Returns true on success, false on timeout.
879
+ */
880
+ function acquireLock(lockPath: string, timeoutMs = 2000): boolean {
881
+ const deadline = Date.now() + timeoutMs;
882
+ while (Date.now() < deadline) {
883
+ try {
884
+ const fd = openSync(lockPath, "wx"); // O_WRONLY | O_CREAT | O_EXCL
885
+ closeSync(fd);
886
+ // Write PID stamp so external monitors can identify the lock owner.
887
+ try {
888
+ writeFileSync(lockPath, `${process.pid}\n${new Date().toISOString()}\n`, "utf-8");
889
+ } catch { /* non-fatal — stamp is diagnostic only */ }
890
+ return true;
891
+ } catch {
892
+ // Lock held by another process — check for staleness before retrying.
893
+ try {
894
+ const st = statSync(lockPath);
895
+ if (Date.now() - st.mtimeMs > STALE_LOCK_THRESHOLD_MS) {
896
+ logWarning(
897
+ "fs",
898
+ `stale metrics lock at ${lockPath} (age ${Date.now() - st.mtimeMs}ms); forcibly removing and retrying`,
899
+ );
900
+ try { unlinkSync(lockPath); } catch { /* already gone */ }
901
+ // Do NOT sleep after stale-lock eviction — retry the open
902
+ // immediately. The lock file was just removed; a short race window
903
+ // exists and sleeping here would unnecessarily delay recovery.
904
+ continue;
905
+ }
906
+ } catch { /* lock file disappeared between the failed open and stat — retry */ }
907
+ // Sleep between retries to yield to the OS and cap syscall rate.
908
+ // Uses Atomics.wait for a true blocking sleep (no CPU spin).
909
+ _lockSleepyRetries++;
910
+ syncSleep(LOCK_RETRY_INTERVAL_MS);
911
+ }
912
+ }
913
+ return false;
914
+ }
915
+
916
+ function releaseLock(lockPath: string): void {
917
+ try { unlinkSync(lockPath); } catch { /* ignore */ }
918
+ }
+
+/**
+ * Save the ledger with cross-process merge semantics.
+ *
+ * Acquires a .lock sentinel file, reads the current on-disk ledger,
+ * merges worker units with existing peer units (worker's entry wins on
+ * type+id+startedAt conflict since it has the latest finishedAt),
+ * then writes atomically. This prevents parallel auto-mode workers from
+ * silently discarding each other's metrics entries.
+ *
+ * Falls back to a direct write (no merge) if the lock cannot be acquired
+ * within the timeout — better to potentially overwrite than to lose data
+ * entirely.
+ */
 function saveLedger(base: string, data: MetricsLedger): void {
-  saveJsonFile(metricsPath(base), data);
+  const path = metricsPath(base);
+  const lockPath = `${path}.lock`;
+  const acquired = acquireLock(lockPath);
+  if (acquired) {
+    try {
+      // Read current on-disk state and merge with worker's in-memory units.
+      // Worker units take precedence on conflict (by finishedAt in deduplicateUnits).
+      const onDisk = loadJsonFileOrNull(path, isMetricsLedger);
+      if (onDisk && onDisk.units.length > 0) {
+        const merged = deduplicateUnits([...onDisk.units, ...data.units]);
+        saveJsonFile(path, { ...data, units: merged });
+      } else {
+        saveJsonFile(path, data);
+      }
+    } finally {
+      releaseLock(lockPath);
+    }
+  } else {
+    // Lock could not be acquired within the timeout. Fall back to a direct
+    // write (no cross-process merge) to avoid losing this worker's data
+    // entirely. A concurrent writer may overwrite us, but that is preferable
+    // to a torn write caused by two writers simultaneously executing the
+    // read-merge-write sequence without mutual exclusion.
+    logWarning("fs", "saveLedger: lock not acquired — falling back to direct write (no merge)");
+    saveJsonFile(path, data);
+  }
 }
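The body of `deduplicateUnits` is not part of this diff. As a rough illustration of the merge rule the doc comment describes (same `type` + `id` + `startedAt` is a conflict, and the entry with the later `finishedAt` wins), a sketch might look like this. `UnitRecord` and `mergeUnits` are hypothetical names, and the real ledger entries likely carry more fields.

```typescript
// Hypothetical, reduced shape of a ledger unit; only the fields the merge rule needs.
interface UnitRecord {
  type: string;
  id: string;
  startedAt: number;
  finishedAt: number;
}

/** Merge peer units from disk with this worker's units; later finishedAt wins on conflict. */
function mergeUnits(onDisk: UnitRecord[], mine: UnitRecord[]): UnitRecord[] {
  const byKey = new Map<string, UnitRecord>();
  for (const unit of [...onDisk, ...mine]) {
    const key = `${unit.type}\u0000${unit.id}\u0000${unit.startedAt}`;
    const prev = byKey.get(key);
    // ">=" lets the worker's copy win ties, because its units are iterated last.
    if (!prev || unit.finishedAt >= prev.finishedAt) byKey.set(key, unit);
  }
  return [...byKey.values()];
}

const merged = mergeUnits(
  [
    { type: "task", id: "T01", startedAt: 1, finishedAt: 10 },
    { type: "task", id: "T02", startedAt: 2, finishedAt: 12 },
  ],
  [
    { type: "task", id: "T01", startedAt: 1, finishedAt: 20 },
    { type: "slice", id: "S01", startedAt: 3, finishedAt: 30 },
  ],
);
```

Here `merged` holds three units: the peer's `T02` survives, and `T01` carries the worker's later `finishedAt`.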
@@ -1,3 +1,4 @@
+// GSD-2 — ID-based path resolution for GSD project files and directories
 /**
  * GSD Paths — ID-based path resolution
  *
@@ -128,9 +129,15 @@ function cachedReaddir(dirPath: string): string[] {
 }
 
 /**
- * Clear the directory listing cache.
+ * Clear the volatile directory listing caches.
  * Call after milestone transitions, file creation in planning directories,
  * or at the start/end of a dispatch cycle.
+ *
+ * NOTE: This does NOT clear gsdRootCache. The project root is stable for
+ * the lifetime of a process; clearing it on every agent turn-end caused a
+ * 250–2500 ms regression per session (git rev-parse + dir walk per turn).
+ * Use _clearGsdRootCache() at session-reset boundaries (workspace switch,
+ * process exit) when the project root may genuinely change.
  */
 export function clearPathCache(): void {
   dirEntryCache.clear();
@@ -285,6 +292,14 @@ const LEGACY_GSD_ROOT_FILES: Record<GSDRootFileKey, string> = {
 
 // ─── GSD Root Discovery ───────────────────────────────────────────────────────
 
+// Process-lifetime cache for gsdRoot() results.
+// Keys are realpath-normalized (via normCacheKey) so /foo and /foo/ share the
+// same entry and so do case-variant paths on case-insensitive volumes. This
+// normalization is the safety net that prevents cache poisoning from the
+// ~/.gsd walk-up bug (fixed in c46cf4786 + b35e070eb), making it safe to
+// hold this cache for the entire process lifetime.
+// Use _clearGsdRootCache() only at session-reset boundaries (workspace switch,
+// process exit) — NOT inside clearPathCache(), which runs on every agent turn.
 const gsdRootCache = new Map<string, string>();
 
 export interface GsdPathContract {
@@ -337,11 +352,37 @@ export function resolveGsdPathContract(
   };
 }
 
-/** Exported for tests only — do not call in production code. */
+/**
+ * Invalidate the gsdRoot cache.
+ * Use ONLY at session-reset boundaries: workspace switch, process exit, or
+ * any context where the project root itself may genuinely change.
+ * Do NOT call this on every agent turn — use clearPathCache() for volatile
+ * directory listing invalidation instead.
+ */
 export function _clearGsdRootCache(): void {
   gsdRootCache.clear();
 }
 
+/**
+ * Resolve a path to its canonical real path using the native resolver.
+ * On macOS case-insensitive (HFS+/APFS) volumes, realpathSync.native normalizes
+ * case — ensuring that /foo/Bar and /foo/bar resolve to the same string.
+ * Falls back to resolve(p) for non-existent paths.
+ *
+ * Use this helper everywhere a path is used as an identity/cache key so that
+ * all callers agree on the canonical form.
+ */
+export function normalizeRealPath(p: string): string {
+  try { return realpathSync.native(p); } catch { return resolve(p); }
+}
+
+/** Normalize a path for use as a gsdRootCache key (realpath + trailing-slash strip). */
+function normCacheKey(p: string): string {
+  const r = normalizeRealPath(p);
+  const s = r.replaceAll("\\", "/").replace(/\/+$/, "");
+  return process.platform === "win32" ? s.toLowerCase() : s;
+}
+ }
385
+
345
386
  /**
346
387
  * Resolve the `.gsd` directory for a given project base path.
347
388
  *
@@ -351,13 +392,19 @@ export function _clearGsdRootCache(): void {
351
392
  * 3. Walk up from basePath — handles moved .gsd in an ancestor (bounded by git root)
352
393
  * 4. basePath/.gsd — creation fallback (init scenario)
353
394
  *
354
- * Result is cached per basePath for the process lifetime.
395
+ * Result is cached per normalized basePath for the process lifetime.
396
+ * Keys are realpath-normalized so /foo and /foo/ share the same cache entry.
355
397
  */
356
398
  export function gsdRoot(basePath: string): string {
357
- const cached = gsdRootCache.get(basePath);
399
+ const cacheKey = normCacheKey(basePath);
400
+ const cached = gsdRootCache.get(cacheKey);
358
401
  if (cached) return cached;
359
402
 
360
- const result = probeGsdRoot(basePath);
403
+ // Canonicalize result via realpath before asserting and caching so that
404
+ // callers always receive a canonical path regardless of whether probeGsdRoot
405
+ // returned a path through a symlink. Without this, the cached value can
406
+ // diverge from other realpath-normalized paths (e.g. workspace.identityKey).
407
+ const result = normalizeRealPath(probeGsdRoot(basePath));
361
408
 
362
409
  // Defense-in-depth: if basePath resolves to the user's home directory and
363
410
  // the result equals gsdHome(), refuse — project-scoped writes must never
@@ -365,7 +412,7 @@ export function gsdRoot(basePath: string): string {
365
412
  // valid (their basePath does not equal homedir).
366
413
  assertNotGlobalGsdHome(basePath, result);
367
414
 
368
- gsdRootCache.set(basePath, result);
415
+ gsdRootCache.set(cacheKey, result);
369
416
  return result;
370
417
  }
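To illustrate why the cache key is normalized before lookup, here is a small self-contained sketch. `cacheKey` mirrors the normalization shown in the hunk above, while `cachedRoot`, `fakeProbe`, and `probeCalls` are hypothetical stand-ins for the real cache and probe. Unlike the package code, the sketch does not canonicalize the probed result.

```typescript
import { realpathSync } from "node:fs";
import { resolve } from "node:path";

/** Realpath-normalize a path, strip trailing slashes, and fold case on Windows. */
function cacheKey(p: string): string {
  let real: string;
  try { real = realpathSync.native(p); } catch { real = resolve(p); }
  const s = real.replaceAll("\\", "/").replace(/\/+$/, "");
  return process.platform === "win32" ? s.toLowerCase() : s;
}

const cache = new Map<string, string>();

/** Memoize an expensive probe under the normalized key rather than the raw input. */
function cachedRoot(basePath: string, probe: (p: string) => string): string {
  const key = cacheKey(basePath);
  const hit = cache.get(key);
  if (hit) return hit;
  const result = probe(basePath);
  cache.set(key, result);
  return result;
}

// Hypothetical probe that counts how often the expensive path is taken.
let probeCalls = 0;
const fakeProbe = (p: string): string => {
  probeCalls++;
  return `${cacheKey(p)}/.gsd`;
};

const here = process.cwd();
const a = cachedRoot(here, fakeProbe);
const b = cachedRoot(here + "/", fakeProbe); // trailing slash maps to the same key
```

With raw `basePath` keys, the second call would miss the cache and run the probe again. With normalized keys, both spellings share one entry.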
@@ -466,9 +513,21 @@ function probeGsdRoot(rawBasePath: string): string {
     }
   } catch { /* git not available */ }
 
+  // Compute gsdHome once for the skip-check used in steps 2 and 3.
+  const normPath = (p: string): string => {
+    let r: string;
+    try { r = realpathSync.native(p); } catch { r = p; }
+    const s = r.replaceAll("\\", "/").replace(/\/+$/, "");
+    return process.platform === "win32" ? s.toLowerCase() : s;
+  };
+  let gsdHomeNorm: string;
+  try { gsdHomeNorm = normPath(gsdHome()); } catch { gsdHomeNorm = ""; }
+
   if (gitRoot) {
     const candidate = join(gitRoot, ".gsd");
-    if (existsSync(candidate)) return candidate;
+    // Skip if the candidate resolves to the global GSD home — a subdir basePath
+    // must not be anchored to ~/.gsd just because $HOME is a git repo.
+    if (existsSync(candidate) && normPath(candidate) !== gsdHomeNorm) return candidate;
   }
 
   // 3. Walk up from basePath to the git root (only if we are in a subdirectory)
@@ -476,7 +535,7 @@ function probeGsdRoot(rawBasePath: string): string {
     let cur = dirname(basePath);
     while (cur !== basePath) {
       const candidate = join(cur, ".gsd");
-      if (existsSync(candidate)) return candidate;
+      if (existsSync(candidate) && normPath(candidate) !== gsdHomeNorm) return candidate;
       if (cur === gitRoot) break;
       basePath = cur;
       cur = dirname(cur);
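The skip-check added in steps 2 and 3 can be pictured with a simplified walk-up. This is a sketch under assumptions, not the package's `probeGsdRoot`: `findMarkerDir` is a hypothetical helper, it expects already-canonical paths, and a temporary directory tree stands in for `$HOME` and its global `.gsd`.

```typescript
import { existsSync, mkdirSync, mkdtempSync, realpathSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

/**
 * Walk from `start` up to `stopAt`, returning the first `.gsd` directory found,
 * but never the one at `skip` (the stand-in for the global GSD home).
 */
function findMarkerDir(start: string, stopAt: string, skip: string): string | null {
  let cur = start;
  while (true) {
    const candidate = join(cur, ".gsd");
    if (existsSync(candidate) && candidate !== skip) return candidate;
    const parent = dirname(cur);
    if (cur === stopAt || parent === cur) return null; // reached the boundary or the filesystem root
    cur = parent;
  }
}

// Build a throwaway tree: root/.gsd plays the global home, root/proj/.gsd is a real project.
const root = realpathSync.native(mkdtempSync(join(tmpdir(), "walk-demo-")));
const fakeHomeGsd = join(root, ".gsd");
mkdirSync(fakeHomeGsd);
mkdirSync(join(root, "proj", ".gsd"), { recursive: true });
mkdirSync(join(root, "proj", "src", "deep"), { recursive: true });
mkdirSync(join(root, "other", "sub"), { recursive: true });

const found = findMarkerDir(join(root, "proj", "src", "deep"), root, fakeHomeGsd);
const skipped = findMarkerDir(join(root, "other", "sub"), root, fakeHomeGsd);
```

`found` resolves to the project's own `.gsd`, while `skipped` is `null`: the only `.gsd` above `other/sub` is the global one, so the walk refuses to anchor to it.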
@@ -28,7 +28,7 @@ This unit runs under the `planning-dispatch` tools-policy: you may use the `suba
 - **Touched auth, network, parsing, file IO, shell exec, or crypto** → dispatch the **security** agent for an OWASP-style audit.
 - **Added or modified tests** → dispatch the **tester** agent to assess coverage gaps relative to the slice plan.
 
-Subagents read the diff and report findings — they do **not** write user source. You remain responsible for acting on their feedback before calling `gsd_complete_slice` with `milestoneId` and `sliceId`.
+Subagents read the diff and report findings — they do **not** write user source. You remain responsible for acting on their feedback before calling `gsd_slice_complete` with `milestoneId` and `sliceId`.
 
 Then:
 1. Use the **Slice Summary** and **UAT** output templates from the inlined context above
@@ -37,11 +37,11 @@ Then:
 4. If the slice plan includes observability/diagnostic surfaces, confirm they work. Skip this for simple slices that don't have observability sections.
 5. Address every gate listed in the **Gates to Close** section above — each gate maps to a specific slice-summary section the handler inspects (for example, Q8 maps to **Operational Readiness**: health signal, failure signal, recovery procedure, and monitoring gaps). Leaving a section empty records the gate as `omitted`.
 6. If this slice produced evidence that a requirement changed status (Active → Validated, Active → Deferred, etc.), call `gsd_requirement_update` with the requirement ID, updated `status`, and `validation` evidence. Do NOT write `.gsd/REQUIREMENTS.md` directly — the engine renders it from the database.
-7. Prepare the slice completion content you will pass to `gsd_complete_slice` using the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`. Do **not** manually write `{{sliceSummaryPath}}`. Do **not** manually write `{{sliceUatPath}}` — the DB-backed tool is the canonical write path for both artifacts.
+7. Prepare the slice completion content you will pass to `gsd_slice_complete` using the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`. Do **not** manually write `{{sliceSummaryPath}}`. Do **not** manually write `{{sliceUatPath}}` — the DB-backed tool is the canonical write path for both artifacts.
 8. Draft the UAT content you will pass as `uatContent` — a concrete UAT script with real test cases derived from the slice plan and task summaries. Include preconditions, numbered steps with expected outcomes, and edge cases. This must NOT be a placeholder or generic template — tailor every test case to what this slice actually built. Fill the `UAT Type` and `Not Proven By This UAT` sections explicitly so the artifact states what class of acceptance it covers and what still remains unproven (e.g. live integration paths, performance under load, scenarios deferred to a later slice).
 9. Review task summaries for `key_decisions`. For each significant decision, call `capture_thought` with `category: "architecture"` (or `"pattern"`) and a `structuredFields` payload of `{ scope, decision, choice, rationale, made_by: "agent", revisable }`.
 10. Review task summaries for patterns, gotchas, or non-obvious lessons learned. For each one that would save future agents from repeating investigation, call `capture_thought` with the matching category (`gotcha`, `convention`, `pattern`, `environment`). The memory store is the single source of truth (ADR-013); do not append to `.gsd/DECISIONS.md` or `.gsd/KNOWLEDGE.md` directly.
-11. Call `gsd_complete_slice` with the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`, plus any optional enrichment fields you have. Do NOT manually mark the roadmap checkbox — the tool writes to the DB, renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}`, and updates the ROADMAP.md projection automatically.
+11. Call `gsd_slice_complete` with the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`, plus any optional enrichment fields you have. Do NOT manually mark the roadmap checkbox — the tool writes to the DB, renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}`, and updates the ROADMAP.md projection automatically.
 12. Do not run git commands — the system commits your changes and handles any merge after this unit succeeds.
 13. Update `.gsd/PROJECT.md` if it exists — refresh current state if needed: use the `write` tool with `path: ".gsd/PROJECT.md"` and `content` containing the full updated document reflecting current project state. Do NOT use the `edit` tool for this — PROJECT.md is a full-document refresh.
 
@@ -49,6 +49,6 @@ Then:
 
 **File system safety:** Task summaries are preloaded in the inlined context above. Task artifacts use a **flat file layout** — files such as `T01-SUMMARY.md` and `T02-SUMMARY.md` live directly inside the `tasks/` directory, not inside per-task subdirectories like `tasks/T01/SUMMARY.md`. If you need to re-read any of them, use `find .gsd/milestones/{{milestoneId}}/slices/{{sliceId}}/tasks -name "*-SUMMARY.md"` to list file paths first. Never use `tasks/*/SUMMARY.md`, and never pass `{{slicePath}}` or any other directory path directly to the `read` tool. The `read` tool only accepts file paths, not directories.
 
-**You MUST call `gsd_complete_slice` with the slice summary and UAT content before finishing. The tool persists to both DB and disk and renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}` automatically.**
+**You MUST call `gsd_slice_complete` with the slice summary and UAT content before finishing. The tool persists to both DB and disk and renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}` automatically.**
 
 When done, say: "Slice {{sliceId}} complete."
@@ -85,14 +85,14 @@ Then:
 17. If you made an architectural, pattern, library, or observability decision during this task that downstream work should know about, call `capture_thought` with `category: "architecture"` (or `"pattern"`). For decisions, populate `structuredFields` with `{ scope, decision, choice, rationale, made_by: "agent", revisable }` so future projection back to a human-visible decisions register stays lossless. Not every task produces decisions — only capture when a meaningful choice was made.
 18. If you discover a non-obvious rule, recurring gotcha, or useful pattern during execution, call `capture_thought` with `category: "gotcha"`, `"convention"`, `"pattern"`, or `"environment"` as appropriate. Only capture entries that would save future agents from repeating your investigation — don't capture obvious things. The memory store is the single source of truth for cross-session knowledge (ADR-013); do not append to `.gsd/DECISIONS.md` or `.gsd/KNOWLEDGE.md` directly.
 19. Read the template at `~/.gsd/agent/extensions/gsd/templates/task-summary.md`
-20. Use that template to prepare the completion content you will pass to `gsd_complete_task` using the camelCase fields `milestoneId`, `sliceId`, `taskId`, `oneLiner`, `narrative`, `verification`, and `verificationEvidence`. Do **not** manually write `{{taskSummaryPath}}` — the DB-backed tool is the canonical write path and renders the summary file for you.
-21. Call `gsd_complete_task` with milestoneId, sliceId, taskId, and the completion fields derived from the template. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, renders `{{taskSummaryPath}}`, and updates PLAN.md automatically.
+20. Use that template to prepare the completion content you will pass to `gsd_task_complete` using the camelCase fields `milestoneId`, `sliceId`, `taskId`, `oneLiner`, `narrative`, `verification`, and `verificationEvidence`. Do **not** manually write `{{taskSummaryPath}}` — the DB-backed tool is the canonical write path and renders the summary file for you.
+21. Call `gsd_task_complete` with milestoneId, sliceId, taskId, and the completion fields derived from the template. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, renders `{{taskSummaryPath}}`, and updates PLAN.md automatically.
 22. Do not run git commands — the system reads your task summary after completion and creates a meaningful commit from it (type inferred from title, message from your one-liner, key files from frontmatter). Write a clear, specific one-liner in the summary — it becomes the commit message.
 
 All work stays in your working directory: `{{workingDirectory}}`.
 
 **Autonomous execution:** Do not call `ask_user_questions` or `secure_env_collect`. You are running in auto-mode — there is no human available to answer questions. Make reasonable assumptions and document them in the task summary. If a decision genuinely requires human input, note it in the summary and proceed with the best available option.
 
-**You MUST call `gsd_complete_task` before finishing. Do not manually write `{{taskSummaryPath}}`.**
+**You MUST call `gsd_task_complete` before finishing. Do not manually write `{{taskSummaryPath}}`.**
 
 When done, say: "Task {{taskId}} complete."
@@ -12,6 +12,13 @@ Discuss milestone {{milestoneId}} ("{{milestoneTitle}}"). Identify gray areas, a
 
 {{fastPathInstruction}}
 
+### Read project shape
+
+Before your first question round, read `.gsd/PROJECT.md` and look for `## Project Shape` → `**Complexity:**`. The verdict is either **`simple`** or **`complex`** (default to `complex` if PROJECT.md is missing the section, predates this convention, or the value is unclear). The verdict scales the rest of this stage:
+
+- **`simple`** — favor 1–2 plain-text question rounds. Skip the parallel-research investigation. Skip `ask_user_questions` unless presenting concrete alternatives.
+- **`complex`** — full investigation, 3–4-option structured questions, multi-round.
+
 ### Before your first question round
 
 Do a lightweight targeted investigation so your questions are grounded in reality:
@@ -36,7 +43,7 @@ Ask **1–3 questions per round**. Keep each question focused on one of:
 
 **Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
 
-**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
+**If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). In **`complex`** mode, each multi-choice question MUST present **3 or 4 concrete, researched options** plus a final **"Other let me discuss"** option; options must be grounded in the investigation above (codebase signals, library docs, prior `.gsd/` artifacts), not generic placeholders. In **`simple`** mode, 2 options is fine when alternatives are genuinely binary. Binary depth-check / wrap-up gates are exempt from the 3-or-4 rule. When the user picks "Other let me discuss" or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
 
 **If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions. Wait for answers before asking the next round.
 
@@ -26,6 +26,18 @@ Ask the user a single freeform question in plain text, not structured: **"What d
 
 Wait for their response. This grounds every follow-up in their own terminology.
 
+### Classify project shape
+
+After the opening answer, classify the project as **`simple`** or **`complex`** before continuing. Print the verdict in chat as one line: `Project shape: simple` or `Project shape: complex` followed by a one-line rationale.
+
+**`simple`** — most of these apply: single primary user (the user themselves or one immediate team), no external integrations beyond well-known SDKs/libs, greenfield or self-contained, scope describable in 1–2 sentences without ambiguity, no compliance/regulatory needs, ≤5 distinct capabilities.
+
+**`complex`** — any of these apply: multi-user with roles/permissions, non-trivial brownfield codebase, external integrations with auth/data exchange, compliance/security/regulated domain (PII, payments, healthcare), >5 capabilities or unclear scope, cross-team/cross-org coordination, novel domain where assumptions need validation.
+
+**Default to `complex` when uncertain.** The user can override the verdict in plain text; if they do, accept it and proceed.
+
+The verdict drives the rest of this stage and gets persisted to PROJECT.md → `## Project Shape`. Downstream stages (`discuss-requirements`, `discuss-milestone`, `discuss-slice`) read it from there.
+
 ### Before deeper rounds
 
 Do a lightweight targeted investigation so your questions are grounded in reality:
@@ -50,9 +62,11 @@ Ask **1–3 questions per round**. Each round targets one of:
 
 **Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
 
-**Plain-text default:** Project discovery is open-ended. Ask question rounds in plain text unless you are presenting 2–3 concrete alternatives with clear tradeoffs.
+**Cadence is shape-dependent:**
+- **`simple`** — favor 1–2 plain-text rounds. Skip `ask_user_questions` unless you are presenting concrete alternatives. Get to the depth checklist fast.
+- **`complex`** — full investigation, multiple rounds, structured questions when meaningful alternatives exist.
 
-**If `{{structuredQuestionsAvailable}}` is `true` and you use `ask_user_questions`:** ask 1–3 questions per call. Every question object MUST include a stable lowercase `id`. Keep option labels short (3–5 words). Do not add a separate "Other" option; the question UI provides a freeform path automatically. Wait for each tool result before asking the next round.
+**If `{{structuredQuestionsAvailable}}` is `true` and you use `ask_user_questions`:** ask 1–3 questions per call. Every question object MUST include a stable lowercase `id`. Keep option labels short (3–5 words). In **`complex`** mode, each multi-choice question MUST present **3 or 4 concrete, researched options** plus a final **"Other — let me discuss"** option; options must be grounded in your investigation (codebase signals, library docs, prior `.gsd/` artifacts), not generic placeholders. In **`simple`** mode, 2 options is fine. Binary depth-check / wrap-up gates are exempt from the 3-or-4 rule. Wait for each tool result before asking the next round.
 
 **If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions.
 
@@ -126,8 +140,9 @@ Once the user confirms depth:
 
 1. Use the **Project** output template (inlined above).
 2. Call `gsd_summary_save` with `artifact_type: "PROJECT"` and the full project markdown as `content`; omit `milestone_id`. The tool writes `.gsd/PROJECT.md` to disk and persists to DB. Preserve the user's exact terminology, emphasis, and framing.
-3. The `## Capability Contract` section MUST reference `.gsd/REQUIREMENTS.md` that file does not yet exist; the next stage (`discuss-requirements`) will produce it.
-4. The `## Milestone Sequence` MUST list at least M001 with title and one-liner. Subsequent milestones may be listed as known intents; they will be elaborated in their own discuss-milestone stages.
-5. Do NOT use `artifact_type: "CONTEXT"` and do NOT pass `milestone_id: "PROJECT"`; that creates a fake milestone named PROJECT.
-6. {{commitInstruction}}
-7. Say exactly: `"Project context written."` — nothing else.
+3. The `## Project Shape` section MUST contain `**Complexity:** simple` or `**Complexity:** complex` (matching the verdict you announced) plus a one-line `**Why:**` rationale. Downstream stages read this line.
+4. The `## Capability Contract` section MUST reference `.gsd/REQUIREMENTS.md` that file does not yet exist; the next stage (`discuss-requirements`) will produce it.
+5. The `## Milestone Sequence` MUST list at least M001 with title and one-liner. Subsequent milestones may be listed as known intents; they will be elaborated in their own discuss-milestone stages.
+6. Do NOT use `artifact_type: "CONTEXT"` and do NOT pass `milestone_id: "PROJECT"`; that creates a fake milestone named PROJECT.
+7. {{commitInstruction}}
+8. Say exactly: `"Project context written."` — nothing else.
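
The prompts above leave the reading of `## Project Shape` to the agent. If a tool or test needed to read the same verdict programmatically, a parser could look roughly like the sketch below. `readProjectShape` is a hypothetical helper, not part of the package. It assumes the `**Complexity:**` line format required in step 3 and follows the documented default of `complex` when the section is missing or unclear.

```typescript
type ProjectShape = "simple" | "complex";

/** Extract the complexity verdict from PROJECT.md text; default to "complex". */
function readProjectShape(projectMd: string | null): ProjectShape {
  if (!projectMd) return "complex"; // file missing
  // Split on level-2 headings so the verdict is only read from its own section.
  const section = projectMd.split(/^## /m).find((s) => s.startsWith("Project Shape"));
  if (!section) return "complex"; // document predates the convention
  const match = section.match(/\*\*Complexity:\*\*\s*`?(simple|complex)`?/i);
  return match?.[1]?.toLowerCase() === "simple" ? "simple" : "complex";
}

const sample = [
  "# Demo project",
  "",
  "## Project Shape",
  "",
  "**Complexity:** simple",
  "**Why:** single-user CLI with no external integrations.",
  "",
  "## Capability Contract",
].join("\n");

const verdict = readProjectShape(sample);
```

For `sample`, `verdict` is `"simple"`. Any input without a recognizable verdict in that section falls back to `"complex"`.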