gsd-pi 2.78.1-dev.e9d88a536 → 2.78.1-dev.eccf86e27

Files changed (212)
  1. package/README.md +5 -7
  2. package/dist/help-text.js +1 -1
  3. package/dist/resource-loader.js +6 -1
  4. package/dist/resources/.managed-resources-content-hash +1 -1
  5. package/dist/resources/extensions/gsd/auto/detect-stuck.js +41 -5
  6. package/dist/resources/extensions/gsd/auto/loop.js +235 -36
  7. package/dist/resources/extensions/gsd/auto/phases.js +14 -7
  8. package/dist/resources/extensions/gsd/auto/session.js +36 -0
  9. package/dist/resources/extensions/gsd/auto-dispatch.js +49 -4
  10. package/dist/resources/extensions/gsd/auto-post-unit.js +26 -12
  11. package/dist/resources/extensions/gsd/auto-worktree.js +185 -201
  12. package/dist/resources/extensions/gsd/auto.js +139 -49
  13. package/dist/resources/extensions/gsd/bootstrap/agent-end-recovery.js +1 -1
  14. package/dist/resources/extensions/gsd/bootstrap/register-hooks.js +26 -20
  15. package/dist/resources/extensions/gsd/bootstrap/write-gate.js +67 -55
  16. package/dist/resources/extensions/gsd/crash-recovery.js +160 -47
  17. package/dist/resources/extensions/gsd/db/auto-workers.js +227 -0
  18. package/dist/resources/extensions/gsd/db/command-queue.js +105 -0
  19. package/dist/resources/extensions/gsd/db/milestone-leases.js +210 -0
  20. package/dist/resources/extensions/gsd/db/runtime-kv.js +91 -0
  21. package/dist/resources/extensions/gsd/db/unit-dispatches.js +322 -0
  22. package/dist/resources/extensions/gsd/db-writer.js +96 -16
  23. package/dist/resources/extensions/gsd/delegation-policy.js +155 -0
  24. package/dist/resources/extensions/gsd/docs/COORDINATION.md +42 -0
  25. package/dist/resources/extensions/gsd/doctor-proactive.js +4 -0
  26. package/dist/resources/extensions/gsd/doctor-runtime-checks.js +22 -6
  27. package/dist/resources/extensions/gsd/doctor.js +12 -2
  28. package/dist/resources/extensions/gsd/gsd-db.js +355 -3
  29. package/dist/resources/extensions/gsd/guided-flow-queue.js +1 -1
  30. package/dist/resources/extensions/gsd/guided-flow.js +116 -26
  31. package/dist/resources/extensions/gsd/interrupted-session.js +18 -15
  32. package/dist/resources/extensions/gsd/metrics.js +287 -1
  33. package/dist/resources/extensions/gsd/paths.js +79 -8
  34. package/dist/resources/extensions/gsd/prompts/complete-slice.md +4 -4
  35. package/dist/resources/extensions/gsd/prompts/execute-task.md +3 -3
  36. package/dist/resources/extensions/gsd/prompts/guided-discuss-milestone.md +8 -1
  37. package/dist/resources/extensions/gsd/prompts/guided-discuss-project.md +22 -7
  38. package/dist/resources/extensions/gsd/prompts/guided-discuss-requirements.md +6 -2
  39. package/dist/resources/extensions/gsd/prompts/guided-discuss-slice.md +8 -1
  40. package/dist/resources/extensions/gsd/state.js +21 -6
  41. package/dist/resources/extensions/gsd/templates/project.md +10 -0
  42. package/dist/resources/extensions/gsd/workflow-mcp.js +2 -2
  43. package/dist/resources/extensions/gsd/workspace.js +59 -0
  44. package/dist/resources/extensions/gsd/worktree-resolver.js +79 -2
  45. package/dist/resources/extensions/gsd/write-intercept.js +3 -3
  46. package/dist/tsconfig.extensions.tsbuildinfo +1 -1
  47. package/dist/web/standalone/.next/BUILD_ID +1 -1
  48. package/dist/web/standalone/.next/app-path-routes-manifest.json +14 -14
  49. package/dist/web/standalone/.next/build-manifest.json +2 -2
  50. package/dist/web/standalone/.next/prerender-manifest.json +3 -3
  51. package/dist/web/standalone/.next/required-server-files.json +1 -1
  52. package/dist/web/standalone/.next/server/app/_global-error.html +1 -1
  53. package/dist/web/standalone/.next/server/app/_global-error.rsc +1 -1
  54. package/dist/web/standalone/.next/server/app/_global-error.segments/_full.segment.rsc +1 -1
  55. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error/__PAGE__.segment.rsc +1 -1
  56. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error.segment.rsc +1 -1
  57. package/dist/web/standalone/.next/server/app/_global-error.segments/_head.segment.rsc +1 -1
  58. package/dist/web/standalone/.next/server/app/_global-error.segments/_index.segment.rsc +1 -1
  59. package/dist/web/standalone/.next/server/app/_global-error.segments/_tree.segment.rsc +1 -1
  60. package/dist/web/standalone/.next/server/app/_not-found.html +1 -1
  61. package/dist/web/standalone/.next/server/app/_not-found.rsc +1 -1
  62. package/dist/web/standalone/.next/server/app/_not-found.segments/_full.segment.rsc +1 -1
  63. package/dist/web/standalone/.next/server/app/_not-found.segments/_head.segment.rsc +1 -1
  64. package/dist/web/standalone/.next/server/app/_not-found.segments/_index.segment.rsc +1 -1
  65. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found/__PAGE__.segment.rsc +1 -1
  66. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found.segment.rsc +1 -1
  67. package/dist/web/standalone/.next/server/app/_not-found.segments/_tree.segment.rsc +1 -1
  68. package/dist/web/standalone/.next/server/app/index.html +1 -1
  69. package/dist/web/standalone/.next/server/app/index.rsc +1 -1
  70. package/dist/web/standalone/.next/server/app/index.segments/__PAGE__.segment.rsc +1 -1
  71. package/dist/web/standalone/.next/server/app/index.segments/_full.segment.rsc +1 -1
  72. package/dist/web/standalone/.next/server/app/index.segments/_head.segment.rsc +1 -1
  73. package/dist/web/standalone/.next/server/app/index.segments/_index.segment.rsc +1 -1
  74. package/dist/web/standalone/.next/server/app/index.segments/_tree.segment.rsc +1 -1
  75. package/dist/web/standalone/.next/server/app-paths-manifest.json +14 -14
  76. package/dist/web/standalone/.next/server/middleware-build-manifest.js +1 -1
  77. package/dist/web/standalone/.next/server/pages/404.html +1 -1
  78. package/dist/web/standalone/.next/server/pages/500.html +1 -1
  79. package/dist/web/standalone/.next/server/server-reference-manifest.json +1 -1
  80. package/dist/web/standalone/server.js +1 -1
  81. package/package.json +1 -1
  82. package/packages/mcp-server/README.md +2 -11
  83. package/packages/mcp-server/dist/remote-questions.d.ts +27 -0
  84. package/packages/mcp-server/dist/remote-questions.d.ts.map +1 -1
  85. package/packages/mcp-server/dist/remote-questions.js +28 -0
  86. package/packages/mcp-server/dist/remote-questions.js.map +1 -1
  87. package/packages/mcp-server/dist/server.d.ts +28 -0
  88. package/packages/mcp-server/dist/server.d.ts.map +1 -1
  89. package/packages/mcp-server/dist/server.js +94 -4
  90. package/packages/mcp-server/dist/server.js.map +1 -1
  91. package/packages/mcp-server/dist/workflow-tools.js.map +1 -1
  92. package/packages/mcp-server/src/mcp-server.test.ts +226 -0
  93. package/packages/mcp-server/src/remote-questions.test.ts +103 -0
  94. package/packages/mcp-server/src/remote-questions.ts +35 -0
  95. package/packages/mcp-server/src/server.ts +129 -6
  96. package/packages/mcp-server/src/workflow-tools.ts +1 -1
  97. package/packages/mcp-server/tsconfig.tsbuildinfo +1 -1
  98. package/src/resources/extensions/gsd/auto/detect-stuck.ts +37 -5
  99. package/src/resources/extensions/gsd/auto/loop.ts +263 -41
  100. package/src/resources/extensions/gsd/auto/phases.ts +15 -7
  101. package/src/resources/extensions/gsd/auto/session.ts +40 -0
  102. package/src/resources/extensions/gsd/auto-dispatch.ts +63 -4
  103. package/src/resources/extensions/gsd/auto-post-unit.ts +27 -12
  104. package/src/resources/extensions/gsd/auto-worktree.ts +218 -225
  105. package/src/resources/extensions/gsd/auto.ts +166 -43
  106. package/src/resources/extensions/gsd/bootstrap/agent-end-recovery.ts +1 -1
  107. package/src/resources/extensions/gsd/bootstrap/register-hooks.ts +26 -21
  108. package/src/resources/extensions/gsd/bootstrap/tests/write-gate-basepath.test.ts +103 -0
  109. package/src/resources/extensions/gsd/bootstrap/write-gate.ts +80 -55
  110. package/src/resources/extensions/gsd/crash-recovery.ts +177 -43
  111. package/src/resources/extensions/gsd/db/auto-workers.ts +273 -0
  112. package/src/resources/extensions/gsd/db/command-queue.ts +149 -0
  113. package/src/resources/extensions/gsd/db/milestone-leases.ts +274 -0
  114. package/src/resources/extensions/gsd/db/runtime-kv.ts +127 -0
  115. package/src/resources/extensions/gsd/db/unit-dispatches.ts +446 -0
  116. package/src/resources/extensions/gsd/db-writer.ts +113 -17
  117. package/src/resources/extensions/gsd/delegation-policy.ts +197 -0
  118. package/src/resources/extensions/gsd/docs/COORDINATION.md +42 -0
  119. package/src/resources/extensions/gsd/doctor-proactive.ts +4 -0
  120. package/src/resources/extensions/gsd/doctor-runtime-checks.ts +24 -6
  121. package/src/resources/extensions/gsd/doctor.ts +10 -2
  122. package/src/resources/extensions/gsd/gsd-db.ts +354 -3
  123. package/src/resources/extensions/gsd/guided-flow-queue.ts +1 -1
  124. package/src/resources/extensions/gsd/guided-flow.ts +152 -26
  125. package/src/resources/extensions/gsd/interrupted-session.ts +19 -12
  126. package/src/resources/extensions/gsd/metrics.ts +321 -1
  127. package/src/resources/extensions/gsd/paths.ts +67 -8
  128. package/src/resources/extensions/gsd/prompts/complete-slice.md +4 -4
  129. package/src/resources/extensions/gsd/prompts/execute-task.md +3 -3
  130. package/src/resources/extensions/gsd/prompts/guided-discuss-milestone.md +8 -1
  131. package/src/resources/extensions/gsd/prompts/guided-discuss-project.md +22 -7
  132. package/src/resources/extensions/gsd/prompts/guided-discuss-requirements.md +6 -2
  133. package/src/resources/extensions/gsd/prompts/guided-discuss-slice.md +8 -1
  134. package/src/resources/extensions/gsd/state.ts +44 -6
  135. package/src/resources/extensions/gsd/templates/project.md +10 -0
  136. package/src/resources/extensions/gsd/tests/auto-discuss-milestone-deadlock-4973.test.ts +14 -14
  137. package/src/resources/extensions/gsd/tests/auto-loop-no-copy-artifacts.test.ts +72 -0
  138. package/src/resources/extensions/gsd/tests/auto-loop-symlink-worktree.test.ts +190 -0
  139. package/src/resources/extensions/gsd/tests/auto-session-scope.test.ts +331 -0
  140. package/src/resources/extensions/gsd/tests/auto-workers.test.ts +105 -0
  141. package/src/resources/extensions/gsd/tests/auto-worktree-registry.test.ts +176 -0
  142. package/src/resources/extensions/gsd/tests/command-queue.test.ts +141 -0
  143. package/src/resources/extensions/gsd/tests/crash-recovery-via-db.test.ts +203 -0
  144. package/src/resources/extensions/gsd/tests/crash-recovery.test.ts +169 -59
  145. package/src/resources/extensions/gsd/tests/db-writer-path-containment.test.ts +152 -0
  146. package/src/resources/extensions/gsd/tests/db-writer-root-artifact.test.ts +221 -0
  147. package/src/resources/extensions/gsd/tests/db-writer-scope.test.ts +230 -0
  148. package/src/resources/extensions/gsd/tests/delegation-policy.test.ts +151 -0
  149. package/src/resources/extensions/gsd/tests/detect-stuck-respects-retry.test.ts +173 -0
  150. package/src/resources/extensions/gsd/tests/dispatch-backgroundable-annotation.test.ts +55 -0
  151. package/src/resources/extensions/gsd/tests/draft-promotion.test.ts +3 -23
  152. package/src/resources/extensions/gsd/tests/gate-1b-orphan-discrimination.test.ts +193 -0
  153. package/src/resources/extensions/gsd/tests/gate-1b-recovery-bound-corrections.test.ts +246 -0
  154. package/src/resources/extensions/gsd/tests/gate-1b-recovery-bound.test.ts +218 -0
  155. package/src/resources/extensions/gsd/tests/gsd-db-failed-open-restore.test.ts +117 -0
  156. package/src/resources/extensions/gsd/tests/gsd-db-workspace-scope.test.ts +226 -0
  157. package/src/resources/extensions/gsd/tests/gsd-root-canonical.test.ts +66 -0
  158. package/src/resources/extensions/gsd/tests/gsd-root-home-guard.test.ts +68 -5
  159. package/src/resources/extensions/gsd/tests/guided-flow-prompt-consolidation.test.ts +4 -4
  160. package/src/resources/extensions/gsd/tests/integration/auto-worktree.test.ts +22 -12
  161. package/src/resources/extensions/gsd/tests/integration/doctor-proactive.test.ts +24 -10
  162. package/src/resources/extensions/gsd/tests/integration/doctor-runtime.test.ts +35 -23
  163. package/src/resources/extensions/gsd/tests/integration/workspace-collapse-integration.test.ts +369 -0
  164. package/src/resources/extensions/gsd/tests/interrupted-session-auto.test.ts +72 -25
  165. package/src/resources/extensions/gsd/tests/interrupted-session-ui.test.ts +72 -25
  166. package/src/resources/extensions/gsd/tests/memory-pressure-stuck-state.test.ts +9 -6
  167. package/src/resources/extensions/gsd/tests/metrics-atomic-merge.test.ts +222 -0
  168. package/src/resources/extensions/gsd/tests/metrics-lock-hardening.test.ts +400 -0
  169. package/src/resources/extensions/gsd/tests/metrics-lock-not-acquired.test.ts +141 -0
  170. package/src/resources/extensions/gsd/tests/metrics-lock-retry-sleep.test.ts +287 -0
  171. package/src/resources/extensions/gsd/tests/metrics-prune-cache-invalidation.test.ts +149 -0
  172. package/src/resources/extensions/gsd/tests/metrics-scope.test.ts +378 -0
  173. package/src/resources/extensions/gsd/tests/milestone-leases.test.ts +152 -0
  174. package/src/resources/extensions/gsd/tests/originalbase-path-comparison.test.ts +329 -0
  175. package/src/resources/extensions/gsd/tests/parallel-milestone-isolation.test.ts +106 -0
  176. package/src/resources/extensions/gsd/tests/path-cache-decoupled.test.ts +209 -0
  177. package/src/resources/extensions/gsd/tests/path-normalization-unified.test.ts +175 -0
  178. package/src/resources/extensions/gsd/tests/paths-cache.test.ts +170 -0
  179. package/src/resources/extensions/gsd/tests/paused-session-via-db.test.ts +119 -0
  180. package/src/resources/extensions/gsd/tests/pending-autostart-scope.test.ts +120 -0
  181. package/src/resources/extensions/gsd/tests/pipeline-variant-dispatch.test.ts +58 -0
  182. package/src/resources/extensions/gsd/tests/preferences-worktree-sync.test.ts +3 -17
  183. package/src/resources/extensions/gsd/tests/prompt-contracts.test.ts +150 -7
  184. package/src/resources/extensions/gsd/tests/register-hooks-depth-verification.test.ts +138 -16
  185. package/src/resources/extensions/gsd/tests/resume-missing-worktree-warning.test.ts +209 -0
  186. package/src/resources/extensions/gsd/tests/runtime-kv.test.ts +120 -0
  187. package/src/resources/extensions/gsd/tests/skipped-validation-completion.test.ts +133 -28
  188. package/src/resources/extensions/gsd/tests/skipped-validation-db-atomicity.test.ts +17 -0
  189. package/src/resources/extensions/gsd/tests/stuck-state-via-db.test.ts +134 -0
  190. package/src/resources/extensions/gsd/tests/sync-layer-scope.test.ts +434 -0
  191. package/src/resources/extensions/gsd/tests/teardown-chdir-failure-clears-registry.test.ts +162 -0
  192. package/src/resources/extensions/gsd/tests/teardown-cleanup-parity.test.ts +98 -0
  193. package/src/resources/extensions/gsd/tests/teardown-failure-clears-registry.test.ts +186 -0
  194. package/src/resources/extensions/gsd/tests/tool-invocation-error-loop-break.test.ts +1 -1
  195. package/src/resources/extensions/gsd/tests/unit-dispatches.test.ts +247 -0
  196. package/src/resources/extensions/gsd/tests/validate-milestone.test.ts +41 -1
  197. package/src/resources/extensions/gsd/tests/validator-scope-parity.test.ts +239 -0
  198. package/src/resources/extensions/gsd/tests/workflow-mcp.test.ts +2 -2
  199. package/src/resources/extensions/gsd/tests/workflow-tool-executors.test.ts +9 -15
  200. package/src/resources/extensions/gsd/tests/workspace.test.ts +196 -0
  201. package/src/resources/extensions/gsd/tests/write-gate-predicates.test.ts +35 -35
  202. package/src/resources/extensions/gsd/tests/write-gate.test.ts +94 -71
  203. package/src/resources/extensions/gsd/tests/write-intercept.test.ts +1 -1
  204. package/src/resources/extensions/gsd/workflow-mcp.ts +2 -2
  205. package/src/resources/extensions/gsd/workspace.ts +95 -0
  206. package/src/resources/extensions/gsd/worktree-resolver.ts +78 -2
  207. package/src/resources/extensions/gsd/write-intercept.ts +3 -3
  208. package/src/resources/extensions/gsd/tests/auto-lock-creation.test.ts +0 -213
  209. package/src/resources/extensions/gsd/tests/auto-stale-lock-self-kill.test.ts +0 -87
  210. package/src/resources/extensions/gsd/tests/stop-auto-remote.test.ts +0 -159
  211. /package/dist/web/standalone/.next/static/{oZGTPvJBQX_IDKKnuV8Bt → Y5UeGFkXTYM9WIQOWHkot}/_buildManifest.js +0 -0
  212. /package/dist/web/standalone/.next/static/{oZGTPvJBQX_IDKKnuV8Bt → Y5UeGFkXTYM9WIQOWHkot}/_ssgManifest.js +0 -0
@@ -1,3 +1,4 @@
1
+ // GSD-2 + metrics.ts: token & cost tracking for auto-mode units
1
2
  /**
2
3
  * GSD Metrics — Token & Cost Tracking
3
4
  *
@@ -13,12 +14,14 @@
13
14
  * 4. On crash recovery or fresh start, the ledger is loaded from disk
14
15
  */
15
16
  import { join } from "node:path";
17
+ import { openSync, closeSync, unlinkSync, statSync, writeFileSync } from "node:fs";
16
18
  import { gsdRoot } from "./paths.js";
17
19
  import { getAndClearSkills } from "./skill-telemetry.js";
18
20
  import { loadJsonFile, loadJsonFileOrNull, saveJsonFile } from "./json-persistence.js";
19
21
  import { parseUnitId } from "./unit-id.js";
20
22
  import { buildAuditEnvelope, emitUokAuditEvent } from "./uok/audit.js";
21
23
  import { isUnifiedAuditEnabled } from "./uok/audit-toggle.js";
24
+ import { logWarning } from "./workflow-logger.js";
22
25
  // Re-export from shared — import directly from format-utils to avoid pulling
23
26
  // in the full barrel (mod.js → ui.js → @gsd/pi-tui) which breaks when loaded
24
27
  // outside jiti's alias resolution (e.g. dynamic import in auto-loop reports).
@@ -48,10 +51,15 @@ export function classifyUnitPhase(unitType) {
48
51
  // ─── In-memory state ──────────────────────────────────────────────────────────
49
52
  let ledger = null;
50
53
  let basePath = "";
54
+ // Per-workspace ledger map, keyed by workspace.identityKey.
55
+ // Populated by initMetricsByScope; independent of the module singleton.
56
+ const scopedLedgers = new Map();
51
57
  // ─── Public API ───────────────────────────────────────────────────────────────
52
58
  /**
53
59
  * Initialize the metrics system for a given project.
54
60
  * Loads existing ledger from disk if present.
61
+ *
62
+ * @deprecated TODO(C-future): remove module singleton. Use initMetricsByScope instead.
55
63
  */
56
64
  export function initMetrics(base) {
57
65
  basePath = base;
@@ -59,6 +67,8 @@ export function initMetrics(base) {
59
67
  }
60
68
  /**
61
69
  * Reset in-memory state. Called when auto-mode stops.
70
+ *
71
+ * @deprecated TODO(C-future): remove module singleton. Use resetMetricsByScope instead.
62
72
  */
63
73
  export function resetMetrics() {
64
74
  ledger = null;
@@ -67,6 +77,8 @@ export function resetMetrics() {
67
77
  /**
68
78
  * Snapshot usage metrics from the current session before it's wiped.
69
79
  * Scans session entries for AssistantMessage usage data.
80
+ *
81
+ * @deprecated TODO(C-future): remove module singleton. Use snapshotUnitMetricsByScope instead.
70
82
  */
71
83
  export function snapshotUnitMetrics(ctx, unitType, unitId, startedAt, model, opts) {
72
84
  if (!ledger)
@@ -180,6 +192,147 @@ export function snapshotUnitMetrics(ctx, unitType, unitId, startedAt, model, opt
180
192
  export function getLedger() {
181
193
  return ledger;
182
194
  }
195
+ // ─── Scope-aware API (canonical) ─────────────────────────────────────────────
196
+ /**
197
+ * Initialize the metrics system for a given workspace scope.
198
+ * Loads existing ledger from disk into the per-scope ledger map.
199
+ * Does NOT touch the module-level singleton.
200
+ */
201
+ export function initMetricsByScope(scope) {
202
+ const base = scope.workspace.projectRoot;
203
+ const loaded = loadLedger(base);
204
+ scopedLedgers.set(scope.workspace.identityKey, loaded);
205
+ }
206
+ /**
207
+ * Get the in-memory ledger for the given scope, or null if not initialized.
208
+ */
209
+ export function getLedgerByScope(scope) {
210
+ return scopedLedgers.get(scope.workspace.identityKey) ?? null;
211
+ }
212
+ /**
213
+ * Reset scoped in-memory state for a workspace. Called when auto-mode stops.
214
+ */
215
+ export function resetMetricsByScope(scope) {
216
+ scopedLedgers.delete(scope.workspace.identityKey);
217
+ }
218
+ /**
219
+ * Snapshot usage metrics using an explicit workspace scope.
220
+ *
221
+ * This is the canonical variant. It derives the metrics path from
222
+ * scope.workspace.projectRoot rather than the module singleton, so it
223
+ * remains correct across session resume and in multi-workspace processes.
224
+ *
225
+ * Preserves the atomic write-merge logic from saveLedger so concurrent
226
+ * workers cannot silently discard each other's entries.
227
+ *
228
+ * If initMetricsByScope has not been called, the ledger is loaded from
229
+ * disk on first call (lazy init).
230
+ */
231
+ export function snapshotUnitMetricsByScope(scope, ctx, unitType, unitId, startedAt, model, opts) {
232
+ const base = scope.workspace.projectRoot;
233
+ const key = scope.workspace.identityKey;
234
+ // Lazy init: load from disk if not yet in scoped map.
235
+ if (!scopedLedgers.has(key)) {
236
+ scopedLedgers.set(key, loadLedger(base));
237
+ }
238
+ const scopedLedger = scopedLedgers.get(key);
239
+ const entries = ctx.sessionManager.getEntries();
240
+ if (!entries || entries.length === 0)
241
+ return null;
242
+ const tokens = { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 };
243
+ let cost = 0;
244
+ let toolCalls = 0;
245
+ let assistantMessages = 0;
246
+ let userMessages = 0;
247
+ for (const entry of entries) {
248
+ if (entry.type !== "message")
249
+ continue;
250
+ const msg = entry.message;
251
+ if (!msg)
252
+ continue;
253
+ if (msg.role === "assistant") {
254
+ assistantMessages++;
255
+ if (msg.usage) {
256
+ tokens.input += msg.usage.input ?? 0;
257
+ tokens.output += msg.usage.output ?? 0;
258
+ tokens.cacheRead += msg.usage.cacheRead ?? 0;
259
+ tokens.cacheWrite += msg.usage.cacheWrite ?? 0;
260
+ tokens.total += msg.usage.totalTokens ?? 0;
261
+ if (msg.usage.cost != null) {
262
+ const c = msg.usage.cost;
263
+ cost += typeof c === "number" ? c : (c.total ?? 0);
264
+ }
265
+ }
266
+ if (msg.content && Array.isArray(msg.content)) {
267
+ for (const block of msg.content) {
268
+ if (block.type === "toolCall")
269
+ toolCalls++;
270
+ }
271
+ }
272
+ }
273
+ else if (msg.role === "user") {
274
+ userMessages++;
275
+ }
276
+ }
277
+ const unit = {
278
+ type: unitType,
279
+ id: unitId,
280
+ model,
281
+ startedAt,
282
+ finishedAt: Date.now(),
283
+ ...(opts?.autoSessionKey ? { autoSessionKey: opts.autoSessionKey } : {}),
284
+ tokens,
285
+ cost,
286
+ toolCalls,
287
+ assistantMessages,
288
+ userMessages,
289
+ apiRequests: assistantMessages,
290
+ ...(opts?.tier ? { tier: opts.tier } : {}),
291
+ ...(opts?.modelDowngraded !== undefined ? { modelDowngraded: opts.modelDowngraded } : {}),
292
+ ...(opts?.contextWindowTokens !== undefined ? { contextWindowTokens: opts.contextWindowTokens } : {}),
293
+ ...(opts?.truncationSections !== undefined ? { truncationSections: opts.truncationSections } : {}),
294
+ ...(opts?.continueHereFired !== undefined ? { continueHereFired: opts.continueHereFired } : {}),
295
+ ...(opts?.promptCharCount != null ? { promptCharCount: opts.promptCharCount } : {}),
296
+ ...(opts?.baselineCharCount != null ? { baselineCharCount: opts.baselineCharCount } : {}),
297
+ };
298
+ // Auto-capture skill telemetry (#599)
299
+ const skills = getAndClearSkills();
300
+ if (skills.length > 0) {
301
+ unit.skills = skills;
302
+ }
303
+ // Compute cache hit rate
304
+ if (tokens.cacheRead > 0 || tokens.input > 0) {
305
+ const totalInput = tokens.cacheRead + tokens.input;
306
+ unit.cacheHitRate = totalInput > 0 ? Math.round((tokens.cacheRead / totalInput) * 100) : 0;
307
+ }
308
+ // Idempotency guard: update in-place on duplicate, append otherwise.
309
+ const dupeIdx = scopedLedger.units.findIndex((u) => u.type === unit.type && u.id === unit.id && u.startedAt === unit.startedAt);
310
+ if (dupeIdx >= 0) {
311
+ scopedLedger.units[dupeIdx] = unit;
312
+ }
313
+ else {
314
+ scopedLedger.units.push(unit);
315
+ }
316
+ saveLedger(base, scopedLedger);
317
+ if (isUnifiedAuditEnabled()) {
318
+ emitUokAuditEvent(base, buildAuditEnvelope({
319
+ traceId: opts?.traceId ?? `metrics:${unitType}:${unitId}`,
320
+ turnId: opts?.turnId,
321
+ causedBy: opts?.causedBy,
322
+ category: "metrics",
323
+ type: "unit-metrics-snapshot",
324
+ payload: {
325
+ unitType,
326
+ unitId,
327
+ model,
328
+ tokens: unit.tokens,
329
+ cost: unit.cost,
330
+ toolCalls: unit.toolCalls,
331
+ },
332
+ }));
333
+ }
334
+ return unit;
335
+ }
183
336
  function emptyTokens() {
184
337
  return { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 };
185
338
  }
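The scope-aware functions in this hunk follow a per-key cache pattern: a `Map` keyed by `workspace.identityKey`, lazy loading on first use, and an in-place update when the same unit is snapshotted twice. The sketch below illustrates that pattern in isolation. `loadLedgerStub`, `ledgerFor`, `upsertUnit`, `cacheHitRate`, the `{ units: [] }` ledger shape, and the scope literals are hypothetical stand-ins, not the package's API.

```javascript
// Minimal sketch of a per-scope ledger cache (hypothetical names throughout).
const scopedLedgers = new Map();
let diskLoads = 0;

// Stand-in for the real disk loader; assumed to return an object with a units array.
function loadLedgerStub(projectRoot) {
  diskLoads++;
  return { projectRoot, units: [] };
}

// Lazy init: load once per identityKey, then reuse the in-memory ledger.
function ledgerFor(scope) {
  const key = scope.workspace.identityKey;
  if (!scopedLedgers.has(key)) {
    scopedLedgers.set(key, loadLedgerStub(scope.workspace.projectRoot));
  }
  return scopedLedgers.get(key);
}

// Idempotency guard: same type + id + startedAt replaces the earlier entry.
function upsertUnit(ledger, unit) {
  const i = ledger.units.findIndex(
    (u) => u.type === unit.type && u.id === unit.id && u.startedAt === unit.startedAt,
  );
  if (i >= 0) ledger.units[i] = unit;
  else ledger.units.push(unit);
}

// Cache hit rate as a whole percentage of cached reads over all input tokens.
function cacheHitRate(tokens) {
  const totalInput = tokens.cacheRead + tokens.input;
  return totalInput > 0 ? Math.round((tokens.cacheRead / totalInput) * 100) : 0;
}

const scopeA = { workspace: { identityKey: "a", projectRoot: "/tmp/a" } };
const scopeB = { workspace: { identityKey: "b", projectRoot: "/tmp/b" } };

upsertUnit(ledgerFor(scopeA), { type: "task", id: "T01", startedAt: 1, cost: 0.1 });
upsertUnit(ledgerFor(scopeA), { type: "task", id: "T01", startedAt: 1, cost: 0.2 });
upsertUnit(ledgerFor(scopeB), { type: "task", id: "T01", startedAt: 1, cost: 0.5 });
```

Each scope gets its own ledger, so the two workspaces above never see each other's units even though the unit ids collide.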
@@ -423,6 +576,12 @@ export function pruneMetricsLedger(base, keepCount) {
423
576
  if (ledger) {
424
577
  ledger.units = ledger.units.slice(-keepCount);
425
578
  }
579
+ // Invalidate all scoped ledger cache entries. Prune is rare; clearing the
580
+ // entire map is simpler than tracking which entry belongs to `base`. Without
581
+ // this, scopedLedgers entries for the pruned workspace hold a pre-prune
582
+ // MetricsLedger that snapshotUnitMetricsByScope would merge back in, causing
583
+ // pruned units to reappear in subsequent snapshots.
584
+ scopedLedgers.clear();
426
585
  return removed;
427
586
  }
428
587
  /**
@@ -461,6 +620,133 @@ function deduplicateUnits(units) {
461
620
  }
462
621
  return Array.from(map.values());
463
622
  }
623
+ // How long a lock file must be untouched (in ms) before it is considered
624
+ // orphaned from a crashed process. Set to 2× the acquire timeout.
625
+ export const STALE_LOCK_THRESHOLD_MS = 4000;
626
+ // Retry interval between lock acquire attempts (ms). Caps syscall rate at
627
+ // ~200 attempts over a 2s timeout instead of ~20,000 without any sleep.
628
+ // Exposed for tests.
629
+ export const LOCK_RETRY_INTERVAL_MS = 5;
630
+ // Sync sleep via Atomics.wait — true OS-level sleep, no CPU spin.
631
+ // Int32Array must reference a SharedArrayBuffer; we wait on index 0 which
632
+ // will never be woken by a Atomics.notify, so the wait always times out.
633
+ const _lockSleepBuf = new Int32Array(new SharedArrayBuffer(4));
634
+ function syncSleep(ms) {
635
+ Atomics.wait(_lockSleepBuf, 0, 0, ms);
636
+ }
637
+ // Counts the number of sleepy retries (non-stale-evicting) made by acquireLock
638
+ // across all calls since the last reset. Exported for test instrumentation only.
639
+ let _lockSleepyRetries = 0;
640
+ export function getLockSleepyRetries() { return _lockSleepyRetries; }
641
+ export function resetLockSleepyRetries() { _lockSleepyRetries = 0; }
642
+ /**
643
+ * Acquire an exclusive .lock sentinel file via O_EXCL.
644
+ *
645
+ * Improvements over the original:
646
+ * - No busy spin: the inner `while (Date.now() < waitUntil) {}` spin that
647
+ * burned CPU doing nothing useful is removed. Each retry attempt now makes
648
+ * one `openSync` syscall and immediately re-checks the deadline, which is
649
+ * orders of magnitude cheaper than a tight spin loop.
650
+ * - Stale-lock detection: if the existing lock file's mtime is older than
651
+ * STALE_LOCK_THRESHOLD_MS, the lock is considered orphaned (e.g. the
652
+ * writing process crashed) and is forcibly removed before retrying.
653
+ * A warning is logged so operators can detect crash patterns.
654
+ * - PID stamp: on success, writes the acquiring process's PID and a
655
+ * timestamp into the lock file so external monitors can identify orphans.
656
+ * - Retry sleep: after each non-stale-evicting retry, sleeps
657
+ * LOCK_RETRY_INTERVAL_MS (5ms) via Atomics.wait so the process yields to
658
+ * the OS. This caps syscall rate at ~200–400/s under contention instead of
659
+ * the ~20,000/s that would result from a tight openSync loop.
660
+ * After a stale-lock eviction (lock already removed), no sleep is injected
661
+ * — we retry immediately to close the short race window.
662
+ *
663
+ * Returns true on success, false on timeout.
664
+ */
665
+ function acquireLock(lockPath, timeoutMs = 2000) {
666
+ const deadline = Date.now() + timeoutMs;
667
+ while (Date.now() < deadline) {
668
+ try {
669
+ const fd = openSync(lockPath, "wx"); // O_WRONLY | O_CREAT | O_EXCL
670
+ closeSync(fd);
671
+ // Write PID stamp so external monitors can identify the lock owner.
672
+ try {
673
+ writeFileSync(lockPath, `${process.pid}\n${new Date().toISOString()}\n`, "utf-8");
674
+ }
675
+ catch { /* non-fatal — stamp is diagnostic only */ }
676
+ return true;
677
+ }
678
+ catch {
679
+ // Lock held by another process — check for staleness before retrying.
680
+ try {
681
+ const st = statSync(lockPath);
682
+ if (Date.now() - st.mtimeMs > STALE_LOCK_THRESHOLD_MS) {
683
+ logWarning("fs", `stale metrics lock at ${lockPath} (age ${Date.now() - st.mtimeMs}ms); forcibly removing and retrying`);
684
+ try {
685
+ unlinkSync(lockPath);
686
+ }
687
+ catch { /* already gone */ }
688
+ // Do NOT sleep after stale-lock eviction — retry the open
689
+ // immediately. The lock file was just removed; a short race window
690
+ // exists and sleeping here would unnecessarily delay recovery.
691
+ continue;
692
+ }
693
+ }
694
+ catch { /* lock file disappeared between the failed open and stat — retry */ }
695
+ // Sleep between retries to yield to the OS and cap syscall rate.
696
+ // Uses Atomics.wait for a true blocking sleep (no CPU spin).
697
+ _lockSleepyRetries++;
698
+ syncSleep(LOCK_RETRY_INTERVAL_MS);
699
+ }
700
+ }
701
+ return false;
702
+ }
703
+ function releaseLock(lockPath) {
704
+ try {
705
+ unlinkSync(lockPath);
706
+ }
707
+ catch { /* ignore */ }
708
+ }
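The lock protocol described above (exclusive create, stale eviction by mtime, blocking sleep between retries) can be exercised on its own. The following is a reduced sketch under assumptions, not the package's code: `tryAcquire`, `sleepMs`, and the option names are hypothetical, the PID stamp and warning log are omitted, and the temp directory exists only for the demonstration.

```javascript
// Sketch of an O_EXCL lock file with stale eviction (hypothetical helper names).
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Blocking sleep: nothing ever notifies index 0, so the wait always times out.
const sleepBuf = new Int32Array(new SharedArrayBuffer(4));
function sleepMs(ms) {
  Atomics.wait(sleepBuf, 0, 0, ms);
}

function tryAcquire(lockPath, { timeoutMs = 200, retryMs = 5, staleMs = 4000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      // "wx" fails with EEXIST when the file already exists.
      fs.closeSync(fs.openSync(lockPath, "wx"));
      return true;
    } catch {
      try {
        if (Date.now() - fs.statSync(lockPath).mtimeMs > staleMs) {
          try { fs.unlinkSync(lockPath); } catch { /* already gone */ }
          continue; // retry immediately after evicting a stale lock
        }
      } catch { /* lock vanished between open and stat; fall through and retry */ }
      sleepMs(retryMs);
    }
  }
  return false;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), "lock-sketch-"));
const lockPath = path.join(dir, "metrics.json.lock");

const first = tryAcquire(lockPath);                     // no lock yet
const second = tryAcquire(lockPath, { timeoutMs: 50 }); // fresh lock is held

// Backdate the lock by 10 s so it exceeds the stale threshold.
const old = new Date(Date.now() - 10000);
fs.utimesSync(lockPath, old, old);
const third = tryAcquire(lockPath, { timeoutMs: 50 });  // stale lock is evicted

fs.rmSync(dir, { recursive: true, force: true });
```

Staleness here is judged purely by file mtime, so a slow but live holder that exceeds the threshold would also be evicted; the threshold has to be chosen well above the longest expected critical section.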
709
+ /**
710
+ * Save the ledger with cross-process merge semantics.
711
+ *
712
+ * Acquires a .lock sentinel file, reads the current on-disk ledger,
713
+ * merges worker units with existing peer units (worker's entry wins on
714
+ * type+id+startedAt conflict since it has the latest finishedAt),
715
+ * then writes atomically. This prevents parallel auto-mode workers from
716
+ * silently discarding each other's metrics entries.
717
+ *
718
+ * Falls back to a direct write (no merge) if the lock cannot be acquired
719
+ * within the timeout — better to potentially overwrite than to lose data
720
+ * entirely.
721
+ */
464
722
  function saveLedger(base, data) {
465
- saveJsonFile(metricsPath(base), data);
723
+ const path = metricsPath(base);
724
+ const lockPath = `${path}.lock`;
725
+ const acquired = acquireLock(lockPath);
726
+ if (acquired) {
727
+ try {
728
+ // Read current on-disk state and merge with worker's in-memory units.
729
+ // Worker units take precedence on conflict (by finishedAt in deduplicateUnits).
730
+ const onDisk = loadJsonFileOrNull(path, isMetricsLedger);
731
+ if (onDisk && onDisk.units.length > 0) {
732
+ const merged = deduplicateUnits([...onDisk.units, ...data.units]);
733
+ saveJsonFile(path, { ...data, units: merged });
734
+ }
735
+ else {
736
+ saveJsonFile(path, data);
737
+ }
738
+ }
739
+ finally {
740
+ releaseLock(lockPath);
741
+ }
742
+ }
743
+ else {
744
+ // Lock could not be acquired within the timeout. Fall back to a direct
745
+ // write (no cross-process merge) to avoid losing this worker's data
746
+ // entirely. A concurrent writer may overwrite us, but that is preferable
747
+ // to a torn write caused by two writers simultaneously executing the
748
+ // read-merge-write sequence without mutual exclusion.
749
+ logWarning("fs", "saveLedger: lock not acquired — falling back to direct write (no merge)");
750
+ saveJsonFile(path, data);
751
+ }
466
752
  }
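The merge step in `saveLedger` relies on `deduplicateUnits`, whose body is not part of this hunk. As a rough sketch of the semantics the doc comment describes (one entry per type+id+startedAt, with the later `finishedAt` winning), assuming a hypothetical stand-in named `mergeUnits` and made-up sample units:

```javascript
// Hypothetical stand-in for deduplicateUnits: keep one unit per
// type+id+startedAt key, preferring the entry with the later finishedAt.
function mergeUnits(units) {
  const byKey = new Map();
  for (const unit of units) {
    const key = `${unit.type}|${unit.id}|${unit.startedAt}`;
    const prev = byKey.get(key);
    if (!prev || (unit.finishedAt ?? 0) >= (prev.finishedAt ?? 0)) {
      byKey.set(key, unit);
    }
  }
  return [...byKey.values()];
}

// Illustrative data only: what a peer already wrote to disk.
const onDiskUnits = [
  { type: "task", id: "T01", startedAt: 100, finishedAt: 150 },
  { type: "task", id: "T02", startedAt: 200, finishedAt: 250 },
];
// What this worker holds in memory, including a newer copy of T02.
const workerUnits = [
  { type: "task", id: "T02", startedAt: 200, finishedAt: 300 },
  { type: "task", id: "T03", startedAt: 400, finishedAt: 450 },
];

const mergedUnits = mergeUnits([...onDiskUnits, ...workerUnits]);
console.log(mergedUnits.map((u) => `${u.id}:${u.finishedAt}`).join(", "));
```

Placing the on-disk units first and the worker's units second means that, on an exact `finishedAt` tie, the worker's copy replaces the peer's, which matches the "worker's entry wins" wording in the comment.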
@@ -1,3 +1,4 @@
1
+ // GSD-2 — ID-based path resolution for GSD project files and directories
1
2
  /**
2
3
  * GSD Paths — ID-based path resolution
3
4
  *
@@ -121,9 +122,15 @@ function cachedReaddir(dirPath) {
121
122
  return entries;
122
123
  }
123
124
  /**
124
- * Clear the directory listing cache.
125
+ * Clear the volatile directory listing caches.
125
126
  * Call after milestone transitions, file creation in planning directories,
126
127
  * or at the start/end of a dispatch cycle.
128
+ *
129
+ * NOTE: This does NOT clear gsdRootCache. The project root is stable for
130
+ * the lifetime of a process; clearing it on every agent turn-end caused a
131
+ * 250–2500 ms regression per session (git rev-parse + dir walk per turn).
132
+ * Use _clearGsdRootCache() at session-reset boundaries (workspace switch,
133
+ * process exit) when the project root may genuinely change.
127
134
  */
128
135
  export function clearPathCache() {
129
136
  dirEntryCache.clear();
@@ -270,6 +277,14 @@ const LEGACY_GSD_ROOT_FILES = {
270
277
  CODEBASE: "codebase.md",
271
278
  };
272
279
  // ─── GSD Root Discovery ───────────────────────────────────────────────────────
280
+ // Process-lifetime cache for gsdRoot() results.
281
+ // Keys are realpath-normalized (via normCacheKey) so /foo and /foo/ share the
282
+ // same entry and so do case-variant paths on case-insensitive volumes. This
283
+ // normalization is the safety net that prevents cache poisoning from the
284
+ // ~/.gsd walk-up bug (fixed in c46cf4786 + b35e070eb), making it safe to
285
+ // hold this cache for the entire process lifetime.
286
+ // Use _clearGsdRootCache() only at session-reset boundaries (workspace switch,
287
+ // process exit) — NOT inside clearPathCache(), which runs on every agent turn.
273
288
  const gsdRootCache = new Map();
274
289
  export function resolveGsdPathContract(workRoot, originalProjectRoot) {
275
290
  const resolvedWorkRoot = resolve(workRoot || process.cwd());
@@ -301,10 +316,39 @@ export function resolveGsdPathContract(workRoot, originalProjectRoot) {
301
316
  isWorktree,
302
317
  };
303
318
  }
304
- /** Exported for tests only — do not call in production code. */
319
+ /**
320
+ * Invalidate the gsdRoot cache.
321
+ * Use ONLY at session-reset boundaries: workspace switch, process exit, or
322
+ * any context where the project root itself may genuinely change.
323
+ * Do NOT call this on every agent turn — use clearPathCache() for volatile
324
+ * directory listing invalidation instead.
325
+ */
305
326
  export function _clearGsdRootCache() {
306
327
  gsdRootCache.clear();
307
328
  }
329
+ /**
330
+ * Resolve a path to its canonical real path using the native resolver.
331
+ * On macOS case-insensitive (HFS+/APFS) volumes, realpathSync.native normalizes
332
+ * case — ensuring that /foo/Bar and /foo/bar resolve to the same string.
333
+ * Falls back to resolve(p) for non-existent paths.
334
+ *
335
+ * Use this helper everywhere a path is used as an identity/cache key so that
336
+ * all callers agree on the canonical form.
337
+ */
338
+ export function normalizeRealPath(p) {
339
+ try {
340
+ return realpathSync.native(p);
341
+ }
342
+ catch {
343
+ return resolve(p);
344
+ }
345
+ }
346
+ /** Normalize a path for use as a gsdRootCache key (realpath + trailing-slash strip). */
347
+ function normCacheKey(p) {
348
+ const r = normalizeRealPath(p);
349
+ const s = r.replaceAll("\\", "/").replace(/\/+$/, "");
350
+ return process.platform === "win32" ? s.toLowerCase() : s;
351
+ }
308
352
  /**
309
353
  * Resolve the `.gsd` directory for a given project base path.
310
354
  *
@@ -314,19 +358,25 @@ export function _clearGsdRootCache() {
314
358
  * 3. Walk up from basePath — handles moved .gsd in an ancestor (bounded by git root)
315
359
  * 4. basePath/.gsd — creation fallback (init scenario)
316
360
  *
317
- * Result is cached per basePath for the process lifetime.
361
+ * Result is cached per normalized basePath for the process lifetime.
362
+ * Keys are realpath-normalized so /foo and /foo/ share the same cache entry.
318
363
  */
319
364
  export function gsdRoot(basePath) {
320
- const cached = gsdRootCache.get(basePath);
365
+ const cacheKey = normCacheKey(basePath);
366
+ const cached = gsdRootCache.get(cacheKey);
321
367
  if (cached)
322
368
  return cached;
323
- const result = probeGsdRoot(basePath);
369
+ // Canonicalize result via realpath before asserting and caching so that
370
+ // callers always receive a canonical path regardless of whether probeGsdRoot
371
+ // returned a path through a symlink. Without this, the cached value can
372
+ // diverge from other realpath-normalized paths (e.g. workspace.identityKey).
373
+ const result = normalizeRealPath(probeGsdRoot(basePath));
324
374
  // Defense-in-depth: if basePath resolves to the user's home directory and
325
375
  // the result equals gsdHome(), refuse — project-scoped writes must never
326
376
  // land in the global ~/.gsd. Paths under ~/.gsd/projects/<hash>/ are still
327
377
  // valid (their basePath does not equal homedir).
328
378
  assertNotGlobalGsdHome(basePath, result);
329
- gsdRootCache.set(basePath, result);
379
+ gsdRootCache.set(cacheKey, result);
330
380
  return result;
331
381
  }
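The effect of keying the cache on a normalized path can be illustrated with a small standalone sketch. It mirrors the shape of `normCacheKey` and `gsdRoot` shown above, but `canonicalKey`, `cachedRoot`, and the stand-in probe are hypothetical names introduced for illustration; the real probe logic lives in `probeGsdRoot`.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Canonical form used as the cache key: realpath, forward slashes,
// no trailing slash, lowercased on Windows.
function canonicalKey(p) {
  let real;
  try {
    real = fs.realpathSync.native(p);
  } catch {
    real = path.resolve(p); // non-existent path: fall back to a lexical resolve
  }
  const stripped = real.replaceAll("\\", "/").replace(/\/+$/, "");
  return process.platform === "win32" ? stripped.toLowerCase() : stripped;
}

const rootCache = new Map();
let probeCount = 0;

// Hypothetical cached lookup; the "probe" here just appends ".gsd".
function cachedRoot(basePath) {
  const key = canonicalKey(basePath);
  const hit = rootCache.get(key);
  if (hit) return hit;
  probeCount++;
  const result = path.join(key, ".gsd");
  rootCache.set(key, result);
  return result;
}

const base = os.tmpdir();
const a = cachedRoot(base);
const b = cachedRoot(base + path.sep); // same directory, trailing separator
console.log(a === b, probeCount);
```

With raw `basePath` keys, the second call would miss the cache and run the probe again; with normalized keys, both spellings share one entry.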
332
382
  function assertNotGlobalGsdHome(basePath, result) {
@@ -435,9 +485,30 @@ function probeGsdRoot(rawBasePath) {
435
485
  }
436
486
  }
437
487
  catch { /* git not available */ }
488
+ // Compute gsdHome once for the skip-check used in steps 2 and 3.
489
+ const normPath = (p) => {
490
+ let r;
491
+ try {
492
+ r = realpathSync.native(p);
493
+ }
494
+ catch {
495
+ r = p;
496
+ }
497
+ const s = r.replaceAll("\\", "/").replace(/\/+$/, "");
498
+ return process.platform === "win32" ? s.toLowerCase() : s;
499
+ };
500
+ let gsdHomeNorm;
501
+ try {
502
+ gsdHomeNorm = normPath(gsdHome());
503
+ }
504
+ catch {
505
+ gsdHomeNorm = "";
506
+ }
438
507
  if (gitRoot) {
439
508
  const candidate = join(gitRoot, ".gsd");
440
- if (existsSync(candidate))
509
+ // Skip if the candidate resolves to the global GSD home — a subdir basePath
510
+ // must not be anchored to ~/.gsd just because $HOME is a git repo.
511
+ if (existsSync(candidate) && normPath(candidate) !== gsdHomeNorm)
441
512
  return candidate;
442
513
  }
443
514
  // 3. Walk up from basePath to the git root (only if we are in a subdirectory)
@@ -445,7 +516,7 @@ function probeGsdRoot(rawBasePath) {
445
516
  let cur = dirname(basePath);
446
517
  while (cur !== basePath) {
447
518
  const candidate = join(cur, ".gsd");
448
- if (existsSync(candidate))
519
+ if (existsSync(candidate) && normPath(candidate) !== gsdHomeNorm)
449
520
  return candidate;
450
521
  if (cur === gitRoot)
451
522
  break;
@@ -28,7 +28,7 @@ This unit runs under the `planning-dispatch` tools-policy: you may use the `suba
28
28
  - **Touched auth, network, parsing, file IO, shell exec, or crypto** → dispatch the **security** agent for an OWASP-style audit.
29
29
  - **Added or modified tests** → dispatch the **tester** agent to assess coverage gaps relative to the slice plan.
30
30
 
31
- Subagents read the diff and report findings — they do **not** write user source. You remain responsible for acting on their feedback before calling `gsd_complete_slice` with `milestoneId` and `sliceId`.
31
+ Subagents read the diff and report findings — they do **not** write user source. You remain responsible for acting on their feedback before calling `gsd_slice_complete` with `milestoneId` and `sliceId`.
32
32
 
33
33
  Then:
34
34
  1. Use the **Slice Summary** and **UAT** output templates from the inlined context above
@@ -37,11 +37,11 @@ Then:
37
37
  4. If the slice plan includes observability/diagnostic surfaces, confirm they work. Skip this for simple slices that don't have observability sections.
38
38
  5. Address every gate listed in the **Gates to Close** section above — each gate maps to a specific slice-summary section the handler inspects (for example, Q8 maps to **Operational Readiness**: health signal, failure signal, recovery procedure, and monitoring gaps). Leaving a section empty records the gate as `omitted`.
39
39
  6. If this slice produced evidence that a requirement changed status (Active → Validated, Active → Deferred, etc.), call `gsd_requirement_update` with the requirement ID, updated `status`, and `validation` evidence. Do NOT write `.gsd/REQUIREMENTS.md` directly — the engine renders it from the database.
40
- 7. Prepare the slice completion content you will pass to `gsd_complete_slice` using the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`. Do **not** manually write `{{sliceSummaryPath}}`. Do **not** manually write `{{sliceUatPath}}` — the DB-backed tool is the canonical write path for both artifacts.
40
+ 7. Prepare the slice completion content you will pass to `gsd_slice_complete` using the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`. Do **not** manually write `{{sliceSummaryPath}}`. Do **not** manually write `{{sliceUatPath}}` — the DB-backed tool is the canonical write path for both artifacts.
41
41
  8. Draft the UAT content you will pass as `uatContent` — a concrete UAT script with real test cases derived from the slice plan and task summaries. Include preconditions, numbered steps with expected outcomes, and edge cases. This must NOT be a placeholder or generic template — tailor every test case to what this slice actually built. Fill the `UAT Type` and `Not Proven By This UAT` sections explicitly so the artifact states what class of acceptance it covers and what still remains unproven (e.g. live integration paths, performance under load, scenarios deferred to a later slice).
42
42
  9. Review task summaries for `key_decisions`. For each significant decision, call `capture_thought` with `category: "architecture"` (or `"pattern"`) and a `structuredFields` payload of `{ scope, decision, choice, rationale, made_by: "agent", revisable }`.
43
43
  10. Review task summaries for patterns, gotchas, or non-obvious lessons learned. For each one that would save future agents from repeating investigation, call `capture_thought` with the matching category (`gotcha`, `convention`, `pattern`, `environment`). The memory store is the single source of truth (ADR-013); do not append to `.gsd/DECISIONS.md` or `.gsd/KNOWLEDGE.md` directly.
44
- 11. Call `gsd_complete_slice` with the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`, plus any optional enrichment fields you have. Do NOT manually mark the roadmap checkbox — the tool writes to the DB, renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}`, and updates the ROADMAP.md projection automatically.
44
+ 11. Call `gsd_slice_complete` with the camelCase fields `milestoneId`, `sliceId`, `sliceTitle`, `oneLiner`, `narrative`, `verification`, and `uatContent`, plus any optional enrichment fields you have. Do NOT manually mark the roadmap checkbox — the tool writes to the DB, renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}`, and updates the ROADMAP.md projection automatically.
45
45
  12. Do not run git commands — the system commits your changes and handles any merge after this unit succeeds.
46
46
  13. Update `.gsd/PROJECT.md` if it exists — refresh current state if needed: use the `write` tool with `path: ".gsd/PROJECT.md"` and `content` containing the full updated document reflecting current project state. Do NOT use the `edit` tool for this — PROJECT.md is a full-document refresh.
47
47
 
@@ -49,6 +49,6 @@ Then:
49
49
 
50
50
  **File system safety:** Task summaries are preloaded in the inlined context above. Task artifacts use a **flat file layout** — files such as `T01-SUMMARY.md` and `T02-SUMMARY.md` live directly inside the `tasks/` directory, not inside per-task subdirectories like `tasks/T01/SUMMARY.md`. If you need to re-read any of them, use `find .gsd/milestones/{{milestoneId}}/slices/{{sliceId}}/tasks -name "*-SUMMARY.md"` to list file paths first. Never use `tasks/*/SUMMARY.md`, and never pass `{{slicePath}}` or any other directory path directly to the `read` tool. The `read` tool only accepts file paths, not directories.
51
51
 
52
- **You MUST call `gsd_complete_slice` with the slice summary and UAT content before finishing. The tool persists to both DB and disk and renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}` automatically.**
52
+ **You MUST call `gsd_slice_complete` with the slice summary and UAT content before finishing. The tool persists to both DB and disk and renders `{{sliceSummaryPath}}` and `{{sliceUatPath}}` automatically.**
53
53
 
54
54
  When done, say: "Slice {{sliceId}} complete."
@@ -85,14 +85,14 @@ Then:
85
85
  17. If you made an architectural, pattern, library, or observability decision during this task that downstream work should know about, call `capture_thought` with `category: "architecture"` (or `"pattern"`). For decisions, populate `structuredFields` with `{ scope, decision, choice, rationale, made_by: "agent", revisable }` so future projection back to a human-visible decisions register stays lossless. Not every task produces decisions — only capture when a meaningful choice was made.
86
86
  18. If you discover a non-obvious rule, recurring gotcha, or useful pattern during execution, call `capture_thought` with `category: "gotcha"`, `"convention"`, `"pattern"`, or `"environment"` as appropriate. Only capture entries that would save future agents from repeating your investigation — don't capture obvious things. The memory store is the single source of truth for cross-session knowledge (ADR-013); do not append to `.gsd/DECISIONS.md` or `.gsd/KNOWLEDGE.md` directly.
87
87
  19. Read the template at `~/.gsd/agent/extensions/gsd/templates/task-summary.md`
88
- 20. Use that template to prepare the completion content you will pass to `gsd_complete_task` using the camelCase fields `milestoneId`, `sliceId`, `taskId`, `oneLiner`, `narrative`, `verification`, and `verificationEvidence`. Do **not** manually write `{{taskSummaryPath}}` — the DB-backed tool is the canonical write path and renders the summary file for you.
89
- 21. Call `gsd_complete_task` with milestoneId, sliceId, taskId, and the completion fields derived from the template. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, renders `{{taskSummaryPath}}`, and updates PLAN.md automatically.
88
+ 20. Use that template to prepare the completion content you will pass to `gsd_task_complete` using the camelCase fields `milestoneId`, `sliceId`, `taskId`, `oneLiner`, `narrative`, `verification`, and `verificationEvidence`. Do **not** manually write `{{taskSummaryPath}}` — the DB-backed tool is the canonical write path and renders the summary file for you.
89
+ 21. Call `gsd_task_complete` with milestoneId, sliceId, taskId, and the completion fields derived from the template. This is your final required step — do NOT manually edit PLAN.md checkboxes. The tool marks the task complete, updates the DB, renders `{{taskSummaryPath}}`, and updates PLAN.md automatically.
90
90
  22. Do not run git commands — the system reads your task summary after completion and creates a meaningful commit from it (type inferred from title, message from your one-liner, key files from frontmatter). Write a clear, specific one-liner in the summary — it becomes the commit message.
91
91
 
92
92
  All work stays in your working directory: `{{workingDirectory}}`.
93
93
 
94
94
  **Autonomous execution:** Do not call `ask_user_questions` or `secure_env_collect`. You are running in auto-mode — there is no human available to answer questions. Make reasonable assumptions and document them in the task summary. If a decision genuinely requires human input, note it in the summary and proceed with the best available option.
95
95
 
96
- **You MUST call `gsd_complete_task` before finishing. Do not manually write `{{taskSummaryPath}}`.**
96
+ **You MUST call `gsd_task_complete` before finishing. Do not manually write `{{taskSummaryPath}}`.**
97
97
 
98
98
  When done, say: "Task {{taskId}} complete."
@@ -12,6 +12,13 @@ Discuss milestone {{milestoneId}} ("{{milestoneTitle}}"). Identify gray areas, a
12
12
 
13
13
  {{fastPathInstruction}}
14
14
 
15
+ ### Read project shape
16
+
17
+ Before your first question round, read `.gsd/PROJECT.md` and look for `## Project Shape` → `**Complexity:**`. The verdict is either **`simple`** or **`complex`** (default to `complex` if PROJECT.md is missing the section, predates this convention, or the value is unclear). The verdict scales the rest of this stage:
18
+
19
+ - **`simple`** — favor 1–2 plain-text question rounds. Skip the parallel-research investigation. Skip `ask_user_questions` unless presenting concrete alternatives.
20
+ - **`complex`** — full investigation, 3–4-option structured questions, multi-round.
21
+
15
22
  ### Before your first question round
16
23
 
17
24
  Do a lightweight targeted investigation so your questions are grounded in reality:
@@ -36,7 +43,7 @@ Ask **1–3 questions per round**. Keep each question focused on one of:
36
43
 
37
44
  **Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
38
45
 
39
- **If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). Always include a freeform "Other / let me explain" option. When the user picks that option or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
46
+ **If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions` for each round. 1–3 questions per call, each as a separate question object. Keep option labels short (3–5 words). In **`complex`** mode, each multi-choice question MUST present **3 or 4 concrete, researched options** plus a final **"Other let me discuss"** option; options must be grounded in the investigation above (codebase signals, library docs, prior `.gsd/` artifacts), not generic placeholders. In **`simple`** mode, 2 options is fine when alternatives are genuinely binary. Binary depth-check / wrap-up gates are exempt from the 3-or-4 rule. When the user picks "Other let me discuss" or writes a long freeform answer, switch to plain text follow-up for that thread before resuming structured questions. **IMPORTANT: Call `ask_user_questions` exactly once per turn. Never make multiple calls with the same or overlapping questions — wait for the user's response before asking the next round.**
40
47
 
41
48
  **If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions. Wait for answers before asking the next round.
42
49
 
@@ -26,6 +26,18 @@ Ask the user a single freeform question in plain text, not structured: **"What d
26
26
 
27
27
  Wait for their response. This grounds every follow-up in their own terminology.
28
28
 
29
+ ### Classify project shape
30
+
31
+ After the opening answer, classify the project as **`simple`** or **`complex`** before continuing. Print the verdict in chat as one line: `Project shape: simple` or `Project shape: complex` followed by a one-line rationale.
32
+
33
+ **`simple`** — most of these apply: single primary user (the user themselves or one immediate team), no external integrations beyond well-known SDKs/libs, greenfield or self-contained, scope describable in 1–2 sentences without ambiguity, no compliance/regulatory needs, ≤5 distinct capabilities.
34
+
35
+ **`complex`** — any of these apply: multi-user with roles/permissions, non-trivial brownfield codebase, external integrations with auth/data exchange, compliance/security/regulated domain (PII, payments, healthcare), >5 capabilities or unclear scope, cross-team/cross-org coordination, novel domain where assumptions need validation.
36
+
37
+ **Default to `complex` when uncertain.** The user can override the verdict in plain text; if they do, accept it and proceed.
38
+
39
+ The verdict drives the rest of this stage and gets persisted to PROJECT.md → `## Project Shape`. Downstream stages (`discuss-requirements`, `discuss-milestone`, `discuss-slice`) read it from there.
40
+
29
41
  ### Before deeper rounds
30
42
 
31
43
  Do a lightweight targeted investigation so your questions are grounded in reality:
@@ -50,9 +62,11 @@ Ask **1–3 questions per round**. Each round targets one of:
50
62
 
51
63
  **Never fabricate or simulate user input.** Never generate fake transcript markers like `[User]`, `[Human]`, or `User:`. Ask one question round, then wait for the user's actual response before continuing.
52
64
 
53
- **Plain-text default:** Project discovery is open-ended. Ask question rounds in plain text unless you are presenting 2–3 concrete alternatives with clear tradeoffs.
65
+ **Cadence is shape-dependent:**
66
+ - **`simple`** — favor 1–2 plain-text rounds. Skip `ask_user_questions` unless you are presenting concrete alternatives. Get to the depth checklist fast.
67
+ - **`complex`** — full investigation, multiple rounds, structured questions when meaningful alternatives exist.
54
68
 
55
- **If `{{structuredQuestionsAvailable}}` is `true` and you use `ask_user_questions`:** ask 1–3 questions per call. Every question object MUST include a stable lowercase `id`. Keep option labels short (3–5 words). Do not add a separate "Other" option; the question UI provides a freeform path automatically. Wait for each tool result before asking the next round.
69
+ **If `{{structuredQuestionsAvailable}}` is `true` and you use `ask_user_questions`:** ask 1–3 questions per call. Every question object MUST include a stable lowercase `id`. Keep option labels short (3–5 words). In **`complex`** mode, each multi-choice question MUST present **3 or 4 concrete, researched options** plus a final **"Other — let me discuss"** option; options must be grounded in your investigation (codebase signals, library docs, prior `.gsd/` artifacts), not generic placeholders. In **`simple`** mode, 2 options is fine. Binary depth-check / wrap-up gates are exempt from the 3-or-4 rule. Wait for each tool result before asking the next round.
56
70
 
57
71
  **If `{{structuredQuestionsAvailable}}` is `false`:** ask questions in plain text. Keep each round to 1–3 focused questions.
58
72
 
@@ -126,8 +140,9 @@ Once the user confirms depth:
126
140
 
127
141
  1. Use the **Project** output template (inlined above).
128
142
  2. Call `gsd_summary_save` with `artifact_type: "PROJECT"` and the full project markdown as `content`; omit `milestone_id`. The tool writes `.gsd/PROJECT.md` to disk and persists to DB. Preserve the user's exact terminology, emphasis, and framing.
129
- 3. The `## Capability Contract` section MUST reference `.gsd/REQUIREMENTS.md`. That file does not yet exist; the next stage (`discuss-requirements`) will produce it.
130
- 4. The `## Milestone Sequence` MUST list at least M001 with title and one-liner. Subsequent milestones may be listed as known intents; they will be elaborated in their own discuss-milestone stages.
131
- 5. Do NOT use `artifact_type: "CONTEXT"` and do NOT pass `milestone_id: "PROJECT"`; that creates a fake milestone named PROJECT.
132
- 6. {{commitInstruction}}
133
- 7. Say exactly: `"Project context written."` — nothing else.
143
+ 3. The `## Project Shape` section MUST contain `**Complexity:** simple` or `**Complexity:** complex` (matching the verdict you announced) plus a one-line `**Why:**` rationale. Downstream stages read this line.
144
+ 4. The `## Capability Contract` section MUST reference `.gsd/REQUIREMENTS.md`. That file does not yet exist; the next stage (`discuss-requirements`) will produce it.
145
+ 5. The `## Milestone Sequence` MUST list at least M001 with title and one-liner. Subsequent milestones may be listed as known intents; they will be elaborated in their own discuss-milestone stages.
146
+ 6. Do NOT use `artifact_type: "CONTEXT"` and do NOT pass `milestone_id: "PROJECT"`; that creates a fake milestone named PROJECT.
147
+ 7. {{commitInstruction}}
148
+ 8. Say exactly: `"Project context written."` — nothing else.
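The prompts above ask downstream stages to read the `**Complexity:**` line under `## Project Shape` and to default to `complex` when it is missing or unclear. The prompts leave this to the agent, so the following is only a sketch of that rule as code, assuming a hypothetical helper named `readProjectShape` and an invented sample document:

```javascript
// Hypothetical helper: extract the shape verdict from PROJECT.md text.
// Anything missing, malformed, or unrecognized falls back to "complex".
function readProjectShape(projectMarkdown) {
  if (typeof projectMarkdown !== "string") return "complex";
  // Split on level-2 headings and pick the "Project Shape" section.
  const section = projectMarkdown
    .split(/^## /m)
    .find((s) => s.startsWith("Project Shape"));
  if (!section) return "complex";
  const match = section.match(/\*\*Complexity:\*\*\s*`?(simple|complex)\b`?/i);
  return match ? match[1].toLowerCase() : "complex";
}

// Illustrative document only.
const sampleProject = [
  "# Demo Project",
  "",
  "## Project Shape",
  "",
  "**Complexity:** simple",
  "**Why:** single user, no external integrations",
  "",
  "## Capability Contract",
  "",
  "See `.gsd/REQUIREMENTS.md`.",
].join("\n");

console.log(readProjectShape(sampleProject));
console.log(readProjectShape("# Legacy project without the section"));
```

Defaulting to `complex` errs toward the fuller investigation, which is the safer failure mode when an older PROJECT.md predates the convention.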
@@ -21,9 +21,13 @@ Before your first action, print this banner verbatim in chat:
21
21
  ## Pre-flight
22
22
 
23
23
  1. Read `.gsd/PROJECT.md` end-to-end. If it does not exist, STOP and emit: `"PROJECT.md missing — run discuss-project first."`
24
- 2. Extract: Core Value, Anti-goals, Constraints, Milestone Sequence.
24
+ 2. Extract: Core Value, Anti-goals, Constraints, Milestone Sequence, and the project shape verdict — read the `## Project Shape` section and look for `**Complexity:**` (verdict is either `simple` or `complex`; default to `complex` if the section is missing or unclear).
25
25
  3. Check for existing `.gsd/REQUIREMENTS.md` — if present, this is a refinement pass, not a fresh write. Read existing requirements and treat them as the working set.
26
26
 
27
+ **Shape-dependent cadence:**
28
+ - **`simple`** — favor a single fast pass: extract requirements directly from PROJECT.md, ask 1–2 plain-text clarifying questions only if a class or status assignment is genuinely ambiguous, then write REQUIREMENTS.md.
29
+ - **`complex`** — full multi-round questioning with structured 3–4-option questions where alternatives matter.
30
+
27
31
  ---
28
32
 
29
33
  ## Interview Protocol
@@ -51,7 +55,7 @@ Ask **1–3 questions per round**. Each round targets one dimension:
51
55
 
52
56
  **Never fabricate or simulate user input.** Wait for actual responses.
53
57
 
54
- **If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions`. Every question object MUST include a stable lowercase `id`. For class assignments, present the allowed classes as multi-select options. For status, present the four statuses as exclusive options. Ask 1–3 questions per call. Wait for each tool result before asking the next round.
58
+ **If `{{structuredQuestionsAvailable}}` is `true`:** use `ask_user_questions`. Every question object MUST include a stable lowercase `id`. For class assignments, present the allowed classes as multi-select options. For status, present the four statuses as exclusive options. In **`complex`** mode, any free-form question MUST present **3 or 4 concrete, researched options** plus a final **"Other — let me discuss"** option grounded in the investigation above. The class-assignment and status questions are exempt — they have fixed enumerations. Ask 1–3 questions per call. Wait for each tool result before asking the next round.
55
59
 
56
60
  **If `{{structuredQuestionsAvailable}}` is `false`:** ask in plain text. Keep each round to 1–3 questions.
57
61