@opengsd/gsd-pi 1.1.1-dev.2034b16 → 1.1.1-dev.595401e

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (110)
  1. package/dist/resources/.managed-resources-content-hash +1 -1
  2. package/dist/resources/extensions/gsd/auto-post-unit.js +21 -3
  3. package/dist/resources/extensions/gsd/auto-prompts.js +15 -6
  4. package/dist/resources/extensions/gsd/bootstrap/db-tools.js +2 -2
  5. package/dist/resources/extensions/gsd/browser-evidence.js +29 -2
  6. package/dist/resources/extensions/gsd/docs/preferences-reference.md +8 -0
  7. package/dist/resources/extensions/gsd/doctor-runtime-checks.js +2 -2
  8. package/dist/resources/extensions/gsd/post-unit-hooks.js +9 -0
  9. package/dist/resources/extensions/gsd/preferences-validation.js +39 -0
  10. package/dist/resources/extensions/gsd/prompt-loader.js +7 -0
  11. package/dist/resources/extensions/gsd/prompts/run-uat.md +40 -22
  12. package/dist/resources/extensions/gsd/prompts/validate-milestone.md +3 -3
  13. package/dist/resources/extensions/gsd/rule-registry.js +428 -52
  14. package/dist/resources/extensions/gsd/tools/validate-milestone.js +46 -16
  15. package/dist/resources/extensions/gsd/tools/workflow-tool-executors.js +29 -14
  16. package/dist/resources/extensions/gsd/verdict-parser.js +59 -15
  17. package/dist/rtk.d.ts +7 -1
  18. package/dist/rtk.js +27 -11
  19. package/dist/web/standalone/.next/BUILD_ID +1 -1
  20. package/dist/web/standalone/.next/app-path-routes-manifest.json +7 -7
  21. package/dist/web/standalone/.next/build-manifest.json +2 -2
  22. package/dist/web/standalone/.next/prerender-manifest.json +3 -3
  23. package/dist/web/standalone/.next/server/app/_global-error.html +1 -1
  24. package/dist/web/standalone/.next/server/app/_global-error.rsc +1 -1
  25. package/dist/web/standalone/.next/server/app/_global-error.segments/_full.segment.rsc +1 -1
  26. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error/__PAGE__.segment.rsc +1 -1
  27. package/dist/web/standalone/.next/server/app/_global-error.segments/_global-error.segment.rsc +1 -1
  28. package/dist/web/standalone/.next/server/app/_global-error.segments/_head.segment.rsc +1 -1
  29. package/dist/web/standalone/.next/server/app/_global-error.segments/_index.segment.rsc +1 -1
  30. package/dist/web/standalone/.next/server/app/_global-error.segments/_tree.segment.rsc +1 -1
  31. package/dist/web/standalone/.next/server/app/_not-found.html +1 -1
  32. package/dist/web/standalone/.next/server/app/_not-found.rsc +1 -1
  33. package/dist/web/standalone/.next/server/app/_not-found.segments/_full.segment.rsc +1 -1
  34. package/dist/web/standalone/.next/server/app/_not-found.segments/_head.segment.rsc +1 -1
  35. package/dist/web/standalone/.next/server/app/_not-found.segments/_index.segment.rsc +1 -1
  36. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found/__PAGE__.segment.rsc +1 -1
  37. package/dist/web/standalone/.next/server/app/_not-found.segments/_not-found.segment.rsc +1 -1
  38. package/dist/web/standalone/.next/server/app/_not-found.segments/_tree.segment.rsc +1 -1
  39. package/dist/web/standalone/.next/server/app/index.html +1 -1
  40. package/dist/web/standalone/.next/server/app/index.rsc +1 -1
  41. package/dist/web/standalone/.next/server/app/index.segments/__PAGE__.segment.rsc +1 -1
  42. package/dist/web/standalone/.next/server/app/index.segments/_full.segment.rsc +1 -1
  43. package/dist/web/standalone/.next/server/app/index.segments/_head.segment.rsc +1 -1
  44. package/dist/web/standalone/.next/server/app/index.segments/_index.segment.rsc +1 -1
  45. package/dist/web/standalone/.next/server/app/index.segments/_tree.segment.rsc +1 -1
  46. package/dist/web/standalone/.next/server/app-paths-manifest.json +7 -7
  47. package/dist/web/standalone/.next/server/chunks/8357.js +1 -1
  48. package/dist/web/standalone/.next/server/middleware-build-manifest.js +1 -1
  49. package/dist/web/standalone/.next/server/pages/404.html +1 -1
  50. package/dist/web/standalone/.next/server/pages/500.html +1 -1
  51. package/dist/web/standalone/.next/server/server-reference-manifest.json +1 -1
  52. package/package.json +1 -1
  53. package/packages/cloud-mcp-gateway/package.json +2 -2
  54. package/packages/contracts/package.json +1 -1
  55. package/packages/daemon/package.json +4 -4
  56. package/packages/gsd-agent-core/dist/session/agent-session-compaction.d.ts +2 -0
  57. package/packages/gsd-agent-core/dist/session/agent-session-compaction.d.ts.map +1 -1
  58. package/packages/gsd-agent-core/dist/session/agent-session-compaction.js +8 -2
  59. package/packages/gsd-agent-core/dist/session/agent-session-compaction.js.map +1 -1
  60. package/packages/gsd-agent-core/package.json +5 -5
  61. package/packages/gsd-agent-modes/package.json +7 -7
  62. package/packages/mcp-server/dist/remote-questions.d.ts.map +1 -1
  63. package/packages/mcp-server/dist/remote-questions.js +23 -9
  64. package/packages/mcp-server/dist/remote-questions.js.map +1 -1
  65. package/packages/mcp-server/dist/workflow-tools.js +1 -1
  66. package/packages/mcp-server/dist/workflow-tools.js.map +1 -1
  67. package/packages/mcp-server/package.json +3 -3
  68. package/packages/native/package.json +1 -1
  69. package/packages/pi-agent-core/package.json +1 -1
  70. package/packages/pi-ai/dist/models.generated.d.ts +17 -17
  71. package/packages/pi-ai/dist/models.generated.js +19 -19
  72. package/packages/pi-ai/dist/models.generated.js.map +1 -1
  73. package/packages/pi-ai/package.json +1 -1
  74. package/packages/pi-coding-agent/package.json +7 -7
  75. package/packages/pi-tui/package.json +1 -1
  76. package/packages/rpc-client/package.json +2 -2
  77. package/pkg/package.json +1 -1
  78. package/src/resources/extensions/gsd/auto-post-unit.ts +28 -2
  79. package/src/resources/extensions/gsd/auto-prompts.ts +16 -6
  80. package/src/resources/extensions/gsd/bootstrap/db-tools.ts +2 -2
  81. package/src/resources/extensions/gsd/browser-evidence.ts +26 -2
  82. package/src/resources/extensions/gsd/docs/preferences-reference.md +8 -0
  83. package/src/resources/extensions/gsd/doctor-runtime-checks.ts +2 -2
  84. package/src/resources/extensions/gsd/post-unit-hooks.ts +14 -1
  85. package/src/resources/extensions/gsd/preferences-validation.ts +36 -0
  86. package/src/resources/extensions/gsd/prompt-loader.ts +8 -0
  87. package/src/resources/extensions/gsd/prompts/run-uat.md +40 -22
  88. package/src/resources/extensions/gsd/prompts/validate-milestone.md +3 -3
  89. package/src/resources/extensions/gsd/rule-registry.ts +558 -58
  90. package/src/resources/extensions/gsd/rule-types.ts +2 -0
  91. package/src/resources/extensions/gsd/tests/browser-evidence.test.ts +142 -0
  92. package/src/resources/extensions/gsd/tests/complete-milestone-excerpt.test.ts +30 -0
  93. package/src/resources/extensions/gsd/tests/doctor-runtime-checks.test.ts +27 -0
  94. package/src/resources/extensions/gsd/tests/integration/auto-recovery.test.ts +4 -4
  95. package/src/resources/extensions/gsd/tests/integration/run-uat.test.ts +66 -10
  96. package/src/resources/extensions/gsd/tests/post-unit-hooks.test.ts +157 -0
  97. package/src/resources/extensions/gsd/tests/post-unit-retry-on-orchestrator-bridge.test.ts +179 -0
  98. package/src/resources/extensions/gsd/tests/preferences.test.ts +29 -0
  99. package/src/resources/extensions/gsd/tests/prompt-contracts.test.ts +22 -1
  100. package/src/resources/extensions/gsd/tests/prompt-loader-extension-dir.test.ts +14 -0
  101. package/src/resources/extensions/gsd/tests/rule-registry.test.ts +75 -0
  102. package/src/resources/extensions/gsd/tests/validate-milestone-prompt-verification-classes.test.ts +6 -3
  103. package/src/resources/extensions/gsd/tests/validate-milestone-write-order.test.ts +133 -0
  104. package/src/resources/extensions/gsd/tests/workflow-tool-executors.test.ts +74 -0
  105. package/src/resources/extensions/gsd/tools/validate-milestone.ts +46 -15
  106. package/src/resources/extensions/gsd/tools/workflow-tool-executors.ts +31 -14
  107. package/src/resources/extensions/gsd/types.ts +63 -0
  108. package/src/resources/extensions/gsd/verdict-parser.ts +54 -13
  109. /package/dist/web/standalone/.next/static/{StOMnvtgGiBHrBOZJZ1Gr → IDKjyRHLIaumjgonPcYiX}/_buildManifest.js +0 -0
  110. /package/dist/web/standalone/.next/static/{StOMnvtgGiBHrBOZJZ1Gr → IDKjyRHLIaumjgonPcYiX}/_ssgManifest.js +0 -0
@@ -1 +1 @@
1
- 2c5648f1e27d7188
1
+ e81bc72bf9d51027
@@ -32,7 +32,7 @@ import { isDbAvailable, getDbPath, refreshOpenDatabaseFromDisk, getTask, getSlic
32
32
  import { renderPlanCheckboxes, renderRoadmapFromDb } from "./markdown-renderer.js";
33
33
  import { parseRoadmap as parseLegacyRoadmap } from "./parsers-legacy.js";
34
34
  import { consumeSignal } from "./session-status-io.js";
35
- import { checkPostUnitHooks, isRetryPending, consumeRetryTrigger, persistHookState, resolveHookArtifactPath, } from "./post-unit-hooks.js";
35
+ import { checkPostUnitHooks, consumeHookFailure, isRetryPending, consumeRetryTrigger, consumeGateBlock, persistHookState, resolveHookArtifactPath, } from "./post-unit-hooks.js";
36
36
  import { hasPendingCaptures, loadPendingCaptures, revertExecutorResolvedCaptures } from "./captures.js";
37
37
  import { debugLog } from "./debug-logger.js";
38
38
  import { runSafely } from "./auto-utils.js";
@@ -1860,18 +1860,25 @@ export async function postUnitPostVerification(pctx) {
1860
1860
  // ── Post-unit hooks ──
1861
1861
  if (s.currentUnit && !s.stepMode) {
1862
1862
  const hookUnit = checkPostUnitHooks(s.currentUnit.type, s.currentUnit.id, s.basePath);
1863
+ persistHookState(s.basePath);
1863
1864
  if (hookUnit) {
1864
1865
  if (s.currentUnit) {
1865
1866
  await closeoutUnit(ctx, s.basePath, s.currentUnit.type, s.currentUnit.id, s.currentUnit.startedAt, buildSnapshotOpts(s.currentUnit.type, s.currentUnit.id));
1866
1867
  }
1867
- persistHookState(s.basePath);
1868
1868
  return enqueueSidecar(s, ctx, { kind: "hook", unitType: hookUnit.unitType, unitId: hookUnit.unitId, prompt: hookUnit.prompt, model: hookUnit.model }, { hookName: hookUnit.hookName });
1869
1869
  }
1870
+ const hookFailure = consumeHookFailure();
1871
+ if (hookFailure) {
1872
+ ctx.ui.notify(`Post-unit hook ${hookFailure.hookName} failed for ${hookFailure.unitId}: ${hookFailure.reason}. Pausing auto-mode.`, "warning");
1873
+ await pauseAuto(ctx, pi);
1874
+ return "stopped";
1875
+ }
1870
1876
  // Check if a hook requested a retry of the trigger unit
1871
1877
  if (isRetryPending()) {
1872
1878
  const trigger = consumeRetryTrigger();
1873
1879
  if (trigger) {
1874
- ctx.ui.notify(`Hook requested retry of ${trigger.unitType} ${trigger.unitId} — resetting task state.`, "info");
1880
+ persistHookState(s.basePath);
1881
+ ctx.ui.notify(`Hook requested retry of ${trigger.unitType} ${trigger.unitId} — resetting trigger unit state.`, "info");
1875
1882
  await s.orchestration?.retryActiveUnit({
1876
1883
  unitType: trigger.unitType,
1877
1884
  unitId: trigger.unitId,
@@ -1918,6 +1925,17 @@ export async function postUnitPostVerification(pctx) {
1918
1925
  // Fall through to normal dispatch — deriveState will re-derive the unit
1919
1926
  }
1920
1927
  }
1928
+ const gateBlock = consumeGateBlock();
1929
+ if (gateBlock) {
1930
+ persistHookState(s.basePath);
1931
+ const verdict = gateBlock.verdict ? ` verdict=${gateBlock.verdict};` : "";
1932
+ const artifact = gateBlock.artifact ? ` artifact=${gateBlock.artifact};` : "";
1933
+ const message = `Post-unit gate "${gateBlock.hookName}" blocked ${gateBlock.triggerUnitType} ${gateBlock.triggerUnitId}:` +
1934
+ `${verdict}${artifact} ${gateBlock.reason}. Run /gsd status to inspect, then /gsd auto after recovery.`;
1935
+ ctx.ui.notify(message, "warning");
1936
+ await pauseAuto(ctx, pi);
1937
+ return "stopped";
1938
+ }
1921
1939
  }
1922
1940
  // ── Fast-path stop detection (#3487) ──
1923
1941
  // Before waiting for triage, check if any PENDING captures contain explicit
@@ -1271,7 +1271,7 @@ export async function checkNeedsRunUat(base, mid, state, prefs) {
1271
1271
  if (hasVerdict(uatContent))
1272
1272
  continue;
1273
1273
  // Also check the ASSESSMENT file — the run-uat prompt writes the verdict
1274
- // there (via gsd_summary_save artifact_type:"ASSESSMENT"), not into the
1274
+ // there (via gsd_uat_result_save), not into the
1275
1275
  // UAT spec file. Without this check the unit re-dispatches indefinitely.
1276
1276
  const assessmentFile = resolveSliceFile(base, mid, sid, "ASSESSMENT");
1277
1277
  if (assessmentFile) {
@@ -2568,17 +2568,26 @@ export async function buildValidateMilestonePrompt(mid, midTitle, base, level) {
2568
2568
  if (isDbAvailable()) {
2569
2569
  const milestone = getMilestone(mid);
2570
2570
  if (milestone) {
2571
+ const escapeCell = (value) => value.replace(/[\\|]/g, (char) => `\\${char}`).replace(/\r?\n/g, " ");
2571
2572
  const classes = [];
2572
2573
  if (milestone.verification_contract)
2573
- classes.push(`- **Contract:** ${milestone.verification_contract}`);
2574
+ classes.push(`| Contract | ${escapeCell(milestone.verification_contract)} |`);
2574
2575
  if (milestone.verification_integration)
2575
- classes.push(`- **Integration:** ${milestone.verification_integration}`);
2576
+ classes.push(`| Integration | ${escapeCell(milestone.verification_integration)} |`);
2576
2577
  if (milestone.verification_operational)
2577
- classes.push(`- **Operational:** ${milestone.verification_operational}`);
2578
+ classes.push(`| Operational | ${escapeCell(milestone.verification_operational)} |`);
2578
2579
  if (milestone.verification_uat)
2579
- classes.push(`- **UAT:** ${milestone.verification_uat}`);
2580
+ classes.push(`| UAT | ${escapeCell(milestone.verification_uat)} |`);
2580
2581
  if (classes.length > 0) {
2581
- const verificationClasses = `### Verification Classes (from planning)\n\nThese verification tiers were defined during milestone planning. Each non-empty class must be checked for evidence during validation.\n\n${classes.join("\n")}`;
2582
+ const verificationClasses = [
2583
+ "### Verification Classes (from planning)",
2584
+ "",
2585
+ "These verification tiers were defined during milestone planning. Every row in this table must appear in `verificationClasses` with the same canonical class name.",
2586
+ "",
2587
+ "| Class | Planned Check |",
2588
+ "| --- | --- |",
2589
+ ...classes,
2590
+ ].join("\n");
2582
2591
  inlined.push(verificationClasses);
2583
2592
  trackPromptContext(contextTelemetry, "verification-classes", "inline", verificationClasses);
2584
2593
  }
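The hunk above moves the planned verification text from a bullet list into a Markdown table, which is why `escapeCell` is needed: a raw `|` or a newline in the planned text would otherwise split or break the row. The sketch below reuses the same replacement chain in isolation. `renderClassRow` is a hypothetical wrapper added only for illustration and is not part of the package.

```typescript
// Same replacement chain as the escapeCell arrow function in the hunk above:
// backslashes and pipes are prefixed with a backslash, newlines become spaces.
const escapeCell = (value: string): string =>
  value.replace(/[\\|]/g, (char) => `\\${char}`).replace(/\r?\n/g, " ");

// Hypothetical wrapper (not in the package): renders one two-column table row.
function renderClassRow(name: string, plannedCheck: string): string {
  return `| ${name} | ${escapeCell(plannedCheck)} |`;
}

const row = renderClassRow("Contract", "unit tests | lint\nmust pass");
console.log(row);
```

With the old bullet format the same input would have rendered across two lines. In the table format the escaped pipe keeps the planned check inside a single cell.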
@@ -1010,7 +1010,7 @@ export function registerDbTools(pi) {
1010
1010
  promptGuidelines: [
1011
1011
  "Use gsd_validate_milestone when all slices are done and the milestone needs validation before completion.",
1012
1012
  "Parameters: milestoneId, verdict, remediationRound, successCriteriaChecklist, sliceDeliveryAudit, crossSliceIntegration, requirementCoverage, verificationClasses (optional), verdictRationale, remediationPlan (optional).",
1013
- "If verification classes were planned, verificationClasses must include canonical class rows using the exact class names Contract, Integration, Operational, and UAT when present in planning.",
1013
+ "If verification classes were planned, verificationClasses must be a complete canonical table with one row for every applicable planned class using the exact class names Contract, Integration, Operational, and UAT. Do not submit a partial table.",
1014
1014
  "Planned verification text marked as none/not required/not applicable/N/A (including suffixed variants such as 'not required - backend-only') is treated as not applicable and does not require a class row.",
1015
1015
  "If verdict is 'needs-remediation', also provide remediationPlan and use gsd_reassess_roadmap to add remediation slices to the roadmap.",
1016
1016
  "On success, returns validationPath where VALIDATION.md was written.",
@@ -1023,7 +1023,7 @@ export function registerDbTools(pi) {
1023
1023
  sliceDeliveryAudit: Type.String({ description: "Markdown table auditing each slice's claimed vs delivered output" }),
1024
1024
  crossSliceIntegration: Type.String({ description: "Markdown describing any cross-slice boundary mismatches" }),
1025
1025
  requirementCoverage: Type.String({ description: "Markdown describing any unaddressed requirements" }),
1026
- verificationClasses: Type.Optional(Type.String({ description: "Markdown describing verification class compliance and gaps using canonical class names (Contract, Integration, Operational, UAT) for each applicable planned class" })),
1026
+ verificationClasses: Type.Optional(Type.String({ description: "Complete markdown table describing verification class compliance and gaps; include one canonical row for every applicable planned class (Contract, Integration, Operational, UAT)" })),
1027
1027
  verdictRationale: Type.String({ description: "Why this verdict was chosen" }),
1028
1028
  remediationPlan: Type.Optional(Type.String({ description: "Remediation plan (required if verdict is needs-remediation)" })),
1029
1029
  }),
@@ -1,17 +1,44 @@
1
1
  // Project/App: gsd-pi
2
2
  // File Purpose: Shared browser-observable UAT requirement and evidence detection.
3
- export const BROWSER_REQUIREMENT_RE = /\b(?:browser|file:\/\/|localhost|dom|localstorage|click(?:ing|ed)?|button|screenshot|snapshot|reload(?:ed)?|page refresh|user-visible|strikethrough|search box)\b/i;
3
+ export const BROWSER_REQUIREMENT_RE = /\b(?:file:\/\/|localhost|playwright|chrome|screenshot|snapshot|browser_(?:assert|batch|find|verify|snapshot_refs))\b|\b(?:open|launch|navigate|load|visit|serve|start)\b.{0,80}\b(?:browser|page|localhost|file:\/\/)\b|\bbrowser\s+(?:check|session|test|uat|tool|automation|interaction|flow)\b/i;
4
4
  export const NO_BROWSER_EVIDENCE_RE = /\b(?:no|without|not|wasn'?t|isn'?t)\s+(?:automated\s+)?(?:live\s+)?browser(?:\s+(?:session|test|uat))?|\bno\s+automated\s+browser\b|\bnot\s+conducted\b/i;
5
5
  export const BROWSER_RUNTIME_RE = /\b(?:browser|playwright|chrome|camoufox|browser_(?:assert|batch|find|verify|snapshot_refs)|screenshot|snapshot|file:\/\/|localhost)\b/i;
6
6
  export const BROWSER_ACTION_RE = /\b(?:open(?:ed)?|navigate(?:d)?|click(?:ed)?|type(?:d)?|reload(?:ed)?|capture(?:d)?|screenshot|snapshot)\b/i;
7
7
  export const BROWSER_ASSERTION_RE = /\b(?:assert(?:ed|ion)?|observed|confirmed|verified|expected|visible|text|count|label|strikethrough|localstorage|screenshot|snapshot|passed)\b/i;
8
+ const NON_REQUIREMENT_BROWSER_HEADING_RE = /^(?:not\s+proven|not\s+covered|out\s+of\s+scope|deferred|follow-?ups?|known\s+limitations|notes\s+for\s+tester)\b/i;
9
+ const NON_REQUIREMENT_BROWSER_LINE_RE = /\b(?:deferred|not\s+proven|not\s+covered|out\s+of\s+scope|future\s+slice|follow-?up|no\s+(?:live\s+)?browser|without\s+(?:a\s+)?browser|not\s+(?:a\s+)?browser)\b/i;
8
10
  export function compactTextParts(parts) {
9
11
  return parts.flatMap((part) => Array.isArray(part) ? part : [part])
10
12
  .filter((part) => typeof part === "string" && part.trim().length > 0)
11
13
  .join("\n");
12
14
  }
13
15
  export function hasBrowserRequiredText(text) {
14
- return BROWSER_REQUIREMENT_RE.test(text);
16
+ let inNonRequirementSection = false;
17
+ let nonRequirementDepth = 0;
18
+ for (const line of text.split(/\r?\n/)) {
19
+ const headingMatch = line.match(/^(#{1,6})\s+(.+?)\s*$/);
20
+ if (headingMatch) {
21
+ const depth = headingMatch[1].length;
22
+ const title = headingMatch[2] ?? "";
23
+ // Only update section context when at the same or higher level than the
24
+ // heading that opened the non-requirement zone. A sub-heading deeper than
25
+ // the opening heading must not escape or re-enter the zone on its own.
26
+ if (!inNonRequirementSection || depth <= nonRequirementDepth) {
27
+ inNonRequirementSection = NON_REQUIREMENT_BROWSER_HEADING_RE.test(title);
28
+ nonRequirementDepth = inNonRequirementSection ? depth : 0;
29
+ }
30
+ // Check the heading title itself — section state is already updated, so
31
+ // we correctly skip headings that opened a non-requirement zone.
32
+ if (!inNonRequirementSection && BROWSER_REQUIREMENT_RE.test(title))
33
+ return true;
34
+ continue;
35
+ }
36
+ if (inNonRequirementSection || NON_REQUIREMENT_BROWSER_LINE_RE.test(line))
37
+ continue;
38
+ if (BROWSER_REQUIREMENT_RE.test(line))
39
+ return true;
40
+ }
41
+ return false;
15
42
  }
16
43
  export function hasBrowserEvidenceText(text) {
17
44
  if (!text.trim())
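The rewritten `hasBrowserRequiredText` above no longer tests the whole text at once. It walks the text line by line and ignores sections opened by headings such as "Out of scope" until a heading of the same or a shallower depth closes them. A reduced sketch of that depth rule follows. The two regexes are deliberately simplified stand-ins, not the package's patterns, and the per-line `NON_REQUIREMENT_BROWSER_LINE_RE` filter is omitted.

```typescript
// Simplified stand-ins for the package's regexes (assumptions for illustration).
const SKIP_HEADING = /^(?:out\s+of\s+scope|deferred)\b/i;
const REQUIREMENT = /\b(?:localhost|playwright)\b/i;

function mentionsRequirement(text: string): boolean {
  let skipping = false;
  let skipDepth = 0;
  for (const line of text.split(/\r?\n/)) {
    const heading = line.match(/^(#{1,6})\s+(.+?)\s*$/);
    if (heading) {
      const depth = (heading[1] ?? "").length;
      const title = heading[2] ?? "";
      // A deeper sub-heading inside a skipped section must not end the skip.
      if (!skipping || depth <= skipDepth) {
        skipping = SKIP_HEADING.test(title);
        skipDepth = skipping ? depth : 0;
      }
      if (!skipping && REQUIREMENT.test(title)) return true;
      continue;
    }
    if (skipping) continue;
    if (REQUIREMENT.test(line)) return true;
  }
  return false;
}

const nested = "## Out of scope\nOpen localhost\n### Details\nplaywright run";
const closed = "## Out of scope\nlocalhost\n## Checks\nRun playwright";
console.log(mentionsRequirement(nested), mentionsRequirement(closed));
```

In `nested`, the `###` heading sits below the `##` heading that opened the skipped section, so the skip stays active. In `closed`, the second `##` heading ends it and the following line is scanned.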
@@ -305,10 +305,18 @@ This config sets a parent workspace with two child repositories. The implicit `p
305
305
  - `max_cycles`: number — max times this hook fires per trigger (default: 1, max: 10).
306
306
  - `model`: string — optional model override.
307
307
  - `artifact`: string — expected output file name (relative to task/slice dir). Hook is skipped if file already exists (idempotent).
308
+ - `criticality`: `"advisory"` or `"blocking"` — advisory preserves current best-effort behavior; blocking requires clean hook completion plus a valid outcome verdict before auto-mode advances. Default: `"advisory"`.
308
309
  - `retry_on`: string — if this file is produced instead of the artifact, re-run the trigger unit then re-run hooks.
310
+ - `on_block`: object — optional routing for blocking findings:
311
+ - `action`: `"retry-unit"`, `"retry-task"`, `"queue-task"`, `"queue-slice"`, or `"pause"`.
312
+ - `artifact`: string — optional compatibility artifact for retry routing.
309
313
  - `agent`: string — agent definition file to use for hook execution.
310
314
  - `enabled`: boolean — toggle without removing (default: `true`).
311
315
 
316
+ Blocking hook artifacts must begin with YAML frontmatter containing either `verdict` or `outcome.verdict`.
317
+ Supported verdicts are `pass`, `advisory`, `needs-rework`, `needs-remediation`, and `needs-attention`.
318
+ `pass` and `advisory` continue; `needs-rework` retries the trigger unit when routed with `retry-unit`/`retry-task`; `needs-remediation` and `needs-attention` pause with recovery guidance.
319
+
312
320
  - `pre_dispatch_hooks`: array — hooks that fire before a unit is dispatched. Each entry has:
313
321
  - `name`: string — unique hook identifier.
314
322
  - `before`: string[] — unit types to intercept.
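The verdict rules documented in the hunk above can be read as a small decision table. The sketch below encodes only what the added documentation states. `routeVerdict` and the `Outcome` type are hypothetical names, the package's actual routing lives in `rule-registry` and may differ, and the fallback to pausing for `needs-rework` without a retry action is an assumption, since the documentation does not cover that case.

```typescript
type Verdict = "pass" | "advisory" | "needs-rework" | "needs-remediation" | "needs-attention";
type OnBlockAction = "retry-unit" | "retry-task" | "queue-task" | "queue-slice" | "pause";
type Outcome = "continue" | "retry-trigger-unit" | "pause";

// Hypothetical helper: maps a blocking hook's verdict and on_block action to an outcome.
function routeVerdict(verdict: Verdict, action?: OnBlockAction): Outcome {
  // Documented: pass and advisory let auto-mode continue.
  if (verdict === "pass" || verdict === "advisory") return "continue";
  // Documented: needs-rework retries the trigger unit when routed with retry-unit/retry-task.
  if (verdict === "needs-rework" && (action === "retry-unit" || action === "retry-task")) {
    return "retry-trigger-unit";
  }
  // Documented for needs-remediation and needs-attention; assumed for any other combination.
  return "pause";
}

console.log(routeVerdict("needs-rework", "retry-unit"));
```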
@@ -258,14 +258,14 @@ export async function checkRuntimeHealth(basePath, issues, fixesApplied, shouldF
258
258
  catch {
259
259
  count = MAX_UAT_ATTEMPTS + 1;
260
260
  }
261
- if (count <= MAX_UAT_ATTEMPTS)
261
+ if (count < MAX_UAT_ATTEMPTS)
262
262
  continue;
263
263
  issues.push({
264
264
  severity: "warning",
265
265
  code: "uat_retry_exhausted",
266
266
  scope: "slice",
267
267
  unitId: `${mid}/${sid}`,
268
- message: `run-uat for ${mid}/${sid} exhausted ${count - 1} retry attempt(s) without an ASSESSMENT verdict. Reset the retry counter after fixing the underlying UAT/tool issue, then rerun /gsd auto.`,
268
+ message: `run-uat for ${mid}/${sid} exhausted ${count} attempt(s) without an ASSESSMENT verdict. Reset the retry counter after fixing the underlying UAT/tool issue, then rerun /gsd auto.`,
269
269
  file: `.gsd/runtime/${fileName}`,
270
270
  fixable: true,
271
271
  });
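The comparison change above shifts the exhaustion warning by one attempt: the loop used to skip while `count <= MAX_UAT_ATTEMPTS` and now skips only while `count < MAX_UAT_ATTEMPTS`, and the message reports `count` rather than `count - 1`. The sketch below contrasts the two conditions. The value of `MAX_UAT_ATTEMPTS` is not visible in this diff, so `3` is a placeholder, and both helper names are hypothetical.

```typescript
const MAX_UAT_ATTEMPTS = 3; // placeholder; the real constant is not shown in this diff

// True when the doctor check would push a uat_retry_exhausted issue.
const flaggedBefore = (count: number): boolean => !(count <= MAX_UAT_ATTEMPTS);
const flaggedAfter = (count: number): boolean => !(count < MAX_UAT_ATTEMPTS);

for (const count of [2, 3, 4]) {
  console.log(count, flaggedBefore(count), flaggedAfter(count));
}
```

Under that placeholder, a counter that equals the maximum is reported only by the new condition. Counters below the maximum are skipped by both, and counters above it are reported by both.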
@@ -19,6 +19,15 @@ export function isRetryPending() {
19
19
  export function consumeRetryTrigger() {
20
20
  return getOrCreateRegistry().consumeRetryTrigger();
21
21
  }
22
+ export function consumeHookFailure() {
23
+ return getOrCreateRegistry().consumeHookFailure();
24
+ }
25
+ export function isGateBlockPending() {
26
+ return getOrCreateRegistry().isGateBlockPending();
27
+ }
28
+ export function consumeGateBlock() {
29
+ return getOrCreateRegistry().consumeGateBlock();
30
+ }
22
31
  export function resetHookState() {
23
32
  getOrCreateRegistry().resetState();
24
33
  }
@@ -15,6 +15,14 @@ const VALID_UOK_TURN_ACTIONS = new Set([
15
15
  "snapshot",
16
16
  "status-only",
17
17
  ]);
18
+ const VALID_POST_UNIT_HOOK_CRITICALITIES = new Set(["advisory", "blocking"]);
19
+ const VALID_POST_UNIT_HOOK_ON_BLOCK_ACTIONS = new Set([
20
+ "retry-unit",
21
+ "retry-task",
22
+ "queue-task",
23
+ "queue-slice",
24
+ "pause",
25
+ ]);
18
26
  export function validatePreferences(preferences) {
19
27
  const errors = [];
20
28
  const warnings = [];
@@ -474,9 +482,40 @@ export function validatePreferences(preferences) {
474
482
  if (typeof hook.artifact === "string" && hook.artifact.trim()) {
475
483
  validHook.artifact = hook.artifact.trim();
476
484
  }
485
+ if (hook.criticality !== undefined) {
486
+ const criticality = typeof hook.criticality === "string" ? hook.criticality.trim() : "";
487
+ if (VALID_POST_UNIT_HOOK_CRITICALITIES.has(criticality)) {
488
+ validHook.criticality = criticality;
489
+ }
490
+ else {
491
+ errors.push(`post_unit_hooks "${name}" invalid criticality: ${String(hook.criticality)}`);
492
+ }
493
+ }
477
494
  if (typeof hook.retry_on === "string" && hook.retry_on.trim()) {
478
495
  validHook.retry_on = hook.retry_on.trim();
479
496
  }
497
+ if (hook.on_block !== undefined) {
498
+ if (!hook.on_block || typeof hook.on_block !== "object") {
499
+ errors.push(`post_unit_hooks "${name}" on_block must be an object`);
500
+ }
501
+ else {
502
+ const onBlock = hook.on_block;
503
+ const action = typeof onBlock.action === "string" ? onBlock.action.trim() : "";
504
+ if (!VALID_POST_UNIT_HOOK_ON_BLOCK_ACTIONS.has(action)) {
505
+ errors.push(`post_unit_hooks "${name}" invalid on_block action: ${String(onBlock.action)}`);
506
+ }
507
+ else {
508
+ validHook.on_block = { action: action };
509
+ if (typeof onBlock.artifact === "string" && onBlock.artifact.trim()) {
510
+ validHook.on_block.artifact = onBlock.artifact.trim();
511
+ }
512
+ }
513
+ }
514
+ }
515
+ if (validHook.criticality === "blocking" && !validHook.artifact) {
516
+ errors.push(`post_unit_hooks "${name}" criticality blocking requires artifact`);
517
+ continue;
518
+ }
480
519
  if (typeof hook.agent === "string" && hook.agent.trim()) {
481
520
  validHook.agent = hook.agent.trim();
482
521
  }
@@ -26,9 +26,16 @@ function hasRequiredExtensionAssets(rootDir, exists = existsSync) {
26
26
  return (exists(join(rootDir, "prompts")) &&
27
27
  exists(join(rootDir, "templates", "task-summary.md")));
28
28
  }
29
+ function isSourceExtensionDir(moduleDir) {
30
+ return moduleDir.replaceAll("\\", "/").endsWith("/src/resources/extensions/gsd");
31
+ }
29
32
  export function resolveExtensionDirFromCandidates(moduleDir, agentGsdDir, exists = existsSync) {
30
33
  const moduleUsable = hasRequiredExtensionAssets(moduleDir, exists);
31
34
  const agentUsable = hasRequiredExtensionAssets(agentGsdDir, exists);
35
+ // Source checkouts must use their own prompt tree. Otherwise local tests and
36
+ // dev runs can silently render stale prompts from ~/.gsd/agent/extensions/gsd.
37
+ if (moduleUsable && isSourceExtensionDir(moduleDir))
38
+ return moduleDir;
32
39
  // Prefer the user-local extension tree when both are valid. This avoids
33
40
  // leaking npm/global-install paths into prompts on Windows.
34
41
  if (agentUsable)
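The new `isSourceExtensionDir` check above normalizes backslashes before comparing the path suffix, so a Windows source checkout is recognized the same way as a POSIX one. The sketch below shows that behavior in isolation. Both example paths are hypothetical, and a global regex replaces the `replaceAll` call from the diff so the snippet also compiles for older language targets.

```typescript
function isSourceExtensionDir(moduleDir: string): boolean {
  // The package calls replaceAll("\\", "/"); this regex has the same effect.
  return moduleDir.replace(/\\/g, "/").endsWith("/src/resources/extensions/gsd");
}

// Hypothetical paths, for illustration only.
const windowsCheckout = "C:\\work\\gsd-pi\\src\\resources\\extensions\\gsd";
const installedCopy = "/usr/lib/node_modules/gsd-pi/dist/resources/extensions/gsd";

console.log(isSourceExtensionDir(windowsCheckout), isSourceExtensionDir(installedCopy));
```

Only the source checkout matches, which is the case where the hunk makes `resolveExtensionDirFromCandidates` return `moduleDir` before it considers the user-local tree.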
@@ -63,35 +63,53 @@ After running all checks, compute the **overall verdict**:
63
63
  - `FAIL` — one or more automatable checks failed
64
64
  - `PARTIAL` — one or more automatable checks were skipped or returned inconclusive results (not the same as `NEEDS-HUMAN` — use PARTIAL only when the agent itself could not determine pass/fail for a check it was supposed to automate)
65
65
 
66
- Call `gsd_summary_save` with `milestone_id: "{{milestoneId}}"`, `slice_id: "{{sliceId}}"`, `artifact_type: "ASSESSMENT"`, and the full UAT result markdown as `content`. The tool computes the assessment path, persists to DB/disk, and saves the aggregate UAT gate. The content should follow this logical shape:
66
+ Call `gsd_uat_result_save` once after all checks are complete. The tool computes the assessment path, persists to DB/disk, saves attempt history, and saves the aggregate UAT gate.
67
67
 
68
- ```markdown
69
- ---
70
- sliceId: {{sliceId}}
71
- uatType: {{uatType}}
72
- verdict: PASS | FAIL | PARTIAL
73
- date: <ISO 8601 timestamp>
74
- ---
75
-
76
- # UAT Result — {{sliceId}}
77
-
78
- ## Checks
68
+ Pass these top-level fields:
79
69
 
80
- | Check | Mode | Result | Notes |
81
- |-------|------|--------|-------|
82
- | <check description> | artifact / runtime / human-follow-up | PASS / FAIL / NEEDS-HUMAN | <observed output, evidence, or reason> |
83
-
84
- ## Overall Verdict
85
-
86
- <PASS / FAIL / PARTIAL> — <one sentence summary>
70
+ ```ts
71
+ milestoneId: "{{milestoneId}}",
72
+ sliceId: "{{sliceId}}",
73
+ uatType: "{{uatType}}",
74
+ verdict: "PASS" | "FAIL" | "PARTIAL",
75
+ notes: "<one sentence overall verdict rationale>",
76
+ ```
87
77
 
88
- ## Notes
78
+ Use this exact `presentation` shape in the save call so the audit can verify the run-uat tool surface without retrying missing fields one by one:
79
+
80
+ ```ts
81
+ presentation: {
82
+ surface: "mcp",
83
+ presentedTools: [
84
+ "gsd_uat_exec",
85
+ "gsd_uat_result_save",
86
+ "gsd_resume",
87
+ "gsd_milestone_status",
88
+ "gsd_journal_query",
89
+ ],
90
+ blockedTools: [
91
+ { name: "gsd_exec", reason: "forbidden during run-uat" },
92
+ { name: "gsd_summary_save", reason: "forbidden during run-uat" },
93
+ { name: "gsd_save_gate_result", reason: "forbidden during run-uat" },
94
+ ],
95
+ }
96
+ ```
89
97
 
90
- <any additional context, errors encountered, screenshots/logs gathered, or manual follow-up still required>
98
+ Pass `checks` with this logical shape:
99
+
100
+ ```ts
101
+ checks: [{
102
+ id: "<stable check id>",
103
+ description: "<check description from the UAT file>",
104
+ mode: "artifact" | "runtime" | "browser" | "human-follow-up",
105
+ result: "PASS" | "FAIL" | "NEEDS-HUMAN",
106
+ evidence: [{ kind: "gsd_uat_exec", ref: "<evidence id>" }],
107
+ notes: "<observed output, evidence, reason, or manual follow-up>",
108
+ }]
91
109
  ```
92
110
 
93
111
  ---
94
112
 
95
- **You MUST call `gsd_summary_save` with `artifact_type: "ASSESSMENT"` and the UAT result content before finishing. Do not write the assessment file directly.**
113
+ **You MUST call `gsd_uat_result_save` before finishing. Do not write the assessment file directly, and do not call `gsd_summary_save` as a substitute.**
96
114
 
97
115
  When done, say: "UAT {{sliceId}} complete."
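The prompt hunk above describes the `gsd_uat_result_save` arguments in three separate fragments. Read together they suggest roughly the following shape. This is a sketch inferred from the prompt text, not the tool's actual schema: the interface names are hypothetical, and the ID, `uatType`, and evidence values in the example are placeholders.

```typescript
type UatVerdict = "PASS" | "FAIL" | "PARTIAL";
type CheckMode = "artifact" | "runtime" | "browser" | "human-follow-up";
type CheckResult = "PASS" | "FAIL" | "NEEDS-HUMAN";

interface UatCheck {
  id: string;
  description: string;
  mode: CheckMode;
  result: CheckResult;
  evidence: { kind: string; ref: string }[];
  notes: string;
}

interface UatResultPayload {
  milestoneId: string;
  sliceId: string;
  uatType: string;
  verdict: UatVerdict;
  notes: string;
  presentation: {
    surface: string;
    presentedTools: string[];
    blockedTools: { name: string; reason: string }[];
  };
  checks: UatCheck[];
}

// Placeholder values; only the tool names and reasons come from the prompt text.
const payload: UatResultPayload = {
  milestoneId: "M001",
  sliceId: "S01",
  uatType: "runtime",
  verdict: "PASS",
  notes: "All automatable checks passed.",
  presentation: {
    surface: "mcp",
    presentedTools: ["gsd_uat_exec", "gsd_uat_result_save", "gsd_resume", "gsd_milestone_status", "gsd_journal_query"],
    blockedTools: [
      { name: "gsd_exec", reason: "forbidden during run-uat" },
      { name: "gsd_summary_save", reason: "forbidden during run-uat" },
      { name: "gsd_save_gate_result", reason: "forbidden during run-uat" },
    ],
  },
  checks: [{
    id: "check-1",
    description: "Example check copied from the UAT file",
    mode: "runtime",
    result: "PASS",
    evidence: [{ kind: "gsd_uat_exec", ref: "evidence-1" }],
    notes: "Command exited with status 0.",
  }],
};

console.log(payload.checks.length, payload.presentation.blockedTools.length);
```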
@@ -33,7 +33,7 @@ Prompt: "Review milestone {{milestoneId}} requirements coverage. Working directo
33
33
  Prompt: "Review milestone {{milestoneId}} cross-slice integration. Working directory: {{workingDirectory}}. Read `{{roadmapPath}}` and find the boundary map (produces/consumes contracts). For each boundary, confirm producer SUMMARY produced the artifact and consumer SUMMARY consumed it. Output table: Boundary | Producer Summary | Consumer Summary | Status. End with one-line verdict: PASS if all boundaries honored, NEEDS-ATTENTION if any gaps."
34
34
 
35
35
  **Reviewer C - Assessment & Acceptance Criteria**
36
- Prompt: "Review milestone {{milestoneId}} assessment evidence and acceptance criteria. Working directory: {{workingDirectory}}. Read `.gsd/milestones/{{milestoneId}}/{{milestoneId}}-CONTEXT.md` for criteria. Check slice SUMMARY and ASSESSMENT files under `.gsd/milestones/{{milestoneId}}/slices/`; UAT files are specs, not evidence. Verify each criterion maps to passing evidence. Then review inlined milestone verification classes. For each non-empty planned class, output table: Class | Planned Check | Evidence | Verdict. Use the exact class names `Contract`, `Integration`, `Operational`, and `UAT` whenever those classes are present. If a planned browser/UAT class has no ASSESSMENT with browser/runtime actions and assertions, return NEEDS-ATTENTION. If no verification classes were planned, say that explicitly. Output sections `Acceptance Criteria` with checklist `[ ] Criterion | Evidence`, and `Verification Classes` with the table. End with one-line verdict: PASS if all criteria and classes are covered by evidence, NEEDS-ATTENTION if gaps exist."
36
+ Prompt: "Review milestone {{milestoneId}} assessment evidence and acceptance criteria. Working directory: {{workingDirectory}}. Read `.gsd/milestones/{{milestoneId}}/{{milestoneId}}-CONTEXT.md` for criteria. Check slice SUMMARY and ASSESSMENT files under `.gsd/milestones/{{milestoneId}}/slices/`; UAT files are specs, not evidence. Verify each criterion maps to passing evidence. Then review the inlined `Verification Classes (from planning)` table. For every planned row in that table, output a `Verification Classes` table with columns `Class | Planned Check | Evidence | Verdict`. Preserve every planned non-empty class row; do not summarize, rename, combine, or omit planned classes. The first cell of each row must be exactly `Contract`, `Integration`, `Operational`, or `UAT` when that class is present in planning. If a planned class lacks evidence, still include its canonical row and mark the verdict NEEDS-ATTENTION or FAIL. If a planned browser/UAT class has no ASSESSMENT with browser/runtime actions and assertions, return NEEDS-ATTENTION. If no verification classes were planned, say that explicitly. Output sections `Acceptance Criteria` with checklist `[ ] Criterion | Evidence`, and `Verification Classes` with the table. End with one-line verdict: PASS if all criteria and classes are covered by evidence, NEEDS-ATTENTION if gaps exist."
37
37
 
38
38
  ### Step 2 - Synthesize Findings
39
39
 
@@ -71,8 +71,8 @@ reviewers: 3
71
71
  <if verdict is not pass: specific actions required>
72
72
  ```
73
73
 
74
- Call `gsd_validate_milestone` with the camelCase fields `milestoneId`, `verdict`, `remediationRound`, `successCriteriaChecklist`, `sliceDeliveryAudit`, `crossSliceIntegration`, `requirementCoverage`, `verdictRationale`, and `remediationPlan` when needed. If you include verification-class analysis, pass it in `verificationClasses`.
75
- Extract the `Verification Classes` subsection from Reviewer C and pass it verbatim in `verificationClasses` so the persisted validation output uses the canonical class names `Contract`, `Integration`, `Operational`, and `UAT`.
74
+ Call `gsd_validate_milestone` with the camelCase fields `milestoneId`, `verdict`, `remediationRound`, `successCriteriaChecklist`, `sliceDeliveryAudit`, `crossSliceIntegration`, `requirementCoverage`, `verdictRationale`, and `remediationPlan` when needed. If planning included verification classes, pass a complete canonical table in `verificationClasses`.
75
+ Set `verificationClasses` to the `Verification Classes` subsection from Reviewer C. It must include one canonical row for every non-empty planned class from `Verification Classes (from planning)`: `Contract`, `Integration`, `Operational`, and/or `UAT`. If Reviewer C omitted a planned class, reconstruct the missing row from the planning table, set Evidence to the gap, and use NEEDS-ATTENTION or FAIL. Do not call `gsd_validate_milestone` with a partial `verificationClasses` table.
76
76
 
77
77
  **DB access safety:** Do NOT query `.gsd/gsd.db` directly via `sqlite3` or `node -e require('better-sqlite3')` - the engine owns the WAL connection. Use `gsd_milestone_status` for milestone and slice state. Data is already inlined or available via `gsd_*` tools. Direct DB access risks WAL corruption and bypasses validation.
78
78