@nathapp/nax 0.18.1

Files changed (459)
  1. package/.gitlab-ci.yml +96 -0
  2. package/BRIEF.md +140 -0
  3. package/CHANGELOG.md +60 -0
  4. package/CLAUDE.md +159 -0
  5. package/README.md +373 -0
  6. package/US-007-IMPLEMENTATION.md +139 -0
  7. package/bin/nax.ts +930 -0
  8. package/biome.json +14 -0
  9. package/bun.lock +168 -0
  10. package/bunfig.toml +11 -0
  11. package/docs/20260216-fix-plan-context-review.md +56 -0
  12. package/docs/20260216-relentless-vs-ngent-comparison.md +208 -0
  13. package/docs/20260216-v02-plan.md +136 -0
  14. package/docs/20260216-v02-review.md +685 -0
  15. package/docs/20260217-dogfood-findings.md +56 -0
  16. package/docs/20260217-p2-plus-plan.md +117 -0
  17. package/docs/20260217-partial-fixes-plan.md +62 -0
  18. package/docs/20260217-plan-analyze-spec.md +117 -0
  19. package/docs/20260217-post-impl-review.md +1137 -0
  20. package/docs/20260217-quick-wins-plan.md +66 -0
  21. package/docs/20260217-split-runner-plan.md +75 -0
  22. package/docs/20260217-v03-impl-plan.md +80 -0
  23. package/docs/20260217-v03-post-impl-review.md +589 -0
  24. package/docs/20260217-v04-impl-plan.md +86 -0
  25. package/docs/20260217-v05-post-impl-review.md +850 -0
  26. package/docs/20260217-v06-post-impl-review.md +817 -0
  27. package/docs/20260218-adr003-port-plan.md +151 -0
  28. package/docs/20260218-review-adr003-verification.md +175 -0
  29. package/docs/20260219-fix-plan-bug16-19.md +79 -0
  30. package/docs/20260219-fix-plan-bug20-22.md +114 -0
  31. package/docs/20260219-plan-llm-routing.md +116 -0
  32. package/docs/20260219-review-bug20-22-fixes.md +135 -0
  33. package/docs/20260219-routing-baseline-keyword.md +63 -0
  34. package/docs/20260220-plan-structured-logging-p1.md +80 -0
  35. package/docs/20260220-plan-structured-logging-p2.md +37 -0
  36. package/docs/20260220-review-llm-routing.md +180 -0
  37. package/docs/20260220-review-post-fix-llm-routing.md +70 -0
  38. package/docs/20260221-fix-plan-relevantfiles-split.md +101 -0
  39. package/docs/20260221-fix-plan-routing-mode.md +125 -0
  40. package/docs/20260221-review-v0.9-implementation.md +379 -0
  41. package/docs/20260222-fix-plan-v091-routing-isolation.md +197 -0
  42. package/docs/20260223-fix-plan-prompt-audit.md +62 -0
  43. package/docs/20260224-nax-roadmap-phases.md +189 -0
  44. package/docs/20260225-phase2-llm-service-layer.md +401 -0
  45. package/docs/20260225-review-v0.10.1.md +187 -0
  46. package/docs/20260303-v010-implementation-plan.md +165 -0
  47. package/docs/CLAUDE.md.bak +191 -0
  48. package/docs/ROADMAP.md +165 -0
  49. package/docs/SPEC-rectification.md +0 -0
  50. package/docs/SPEC.md +324 -0
  51. package/docs/US-001-plugin-loading-verification.md +152 -0
  52. package/docs/architecture-analysis.md +1076 -0
  53. package/docs/bugs/BUG-21-escalation-null-attempts.md +48 -0
  54. package/docs/bugs-from-dogfood-run-c.md +243 -0
  55. package/docs/code-review-20260228.md +612 -0
  56. package/docs/code-review-v0.15.0.md +629 -0
  57. package/docs/hook-lifecycle-test-plan.md +149 -0
  58. package/docs/releases/v0.11.0-and-earlier.md +20 -0
  59. package/docs/releases/v0.12.0.md +15 -0
  60. package/docs/releases/v0.13.0.md +14 -0
  61. package/docs/releases/v0.14.0.md +20 -0
  62. package/docs/releases/v0.14.1.md +36 -0
  63. package/docs/releases/v0.14.2.md +51 -0
  64. package/docs/releases/v0.14.3.md +174 -0
  65. package/docs/releases/v0.14.4.md +94 -0
  66. package/docs/releases/v0.15.0.md +502 -0
  67. package/docs/releases/v0.15.1.md +170 -0
  68. package/docs/releases/v0.15.3.md +193 -0
  69. package/docs/specs/status-file-v0.10.1.md +812 -0
  70. package/docs/v0.10-global-config.md +206 -0
  71. package/docs/v0.10-plugin-system.md +415 -0
  72. package/docs/v0.10-prompt-optimizer.md +234 -0
  73. package/docs/v0.3-spec.md +244 -0
  74. package/docs/v0.4-spec.md +140 -0
  75. package/docs/v0.5-spec.md +237 -0
  76. package/docs/v0.6-spec.md +371 -0
  77. package/docs/v0.7-spec.md +177 -0
  78. package/docs/v0.8-llm-routing.md +206 -0
  79. package/docs/v0.8-structured-logging.md +132 -0
  80. package/docs/v0.9.3-prompt-audit.md +112 -0
  81. package/examples/plugins/console-reporter/index.test.ts +207 -0
  82. package/examples/plugins/console-reporter/index.ts +110 -0
  83. package/nax/config.json +147 -0
  84. package/nax/features/bugfix-v0171/prd.json +52 -0
  85. package/nax/features/config-management/prd.json +108 -0
  86. package/nax/features/config-management/progress.txt +5 -0
  87. package/nax/features/diagnose/acceptance.test.ts +412 -0
  88. package/nax/features/diagnose/prd.json +41 -0
  89. package/nax/features/orchestration-fixes/prd.json +89 -0
  90. package/nax/features/orchestration-fixes/progress.txt +1 -0
  91. package/nax/features/plugin-integration/US-007-VERIFICATION.md +259 -0
  92. package/nax/features/plugin-integration/prd.json +208 -0
  93. package/nax/features/plugin-integration/progress.txt +5 -0
  94. package/nax/features/precheck/prd.json +205 -0
  95. package/nax/features/precheck/progress.txt +15 -0
  96. package/nax/features/structured-logging/prd.json +199 -0
  97. package/nax/features/unlock/prd.json +36 -0
  98. package/package.json +47 -0
  99. package/src/acceptance/fix-generator.ts +348 -0
  100. package/src/acceptance/generator.ts +282 -0
  101. package/src/acceptance/index.ts +30 -0
  102. package/src/acceptance/types.ts +79 -0
  103. package/src/agents/claude-decompose.ts +169 -0
  104. package/src/agents/claude-plan.ts +139 -0
  105. package/src/agents/claude.ts +324 -0
  106. package/src/agents/cost.ts +268 -0
  107. package/src/agents/index.ts +13 -0
  108. package/src/agents/registry.ts +48 -0
  109. package/src/agents/types-extended.ts +133 -0
  110. package/src/agents/types.ts +113 -0
  111. package/src/agents/validation.ts +69 -0
  112. package/src/analyze/classifier.ts +305 -0
  113. package/src/analyze/index.ts +16 -0
  114. package/src/analyze/scanner.ts +175 -0
  115. package/src/analyze/types.ts +51 -0
  116. package/src/cli/accept.ts +108 -0
  117. package/src/cli/analyze-parser.ts +284 -0
  118. package/src/cli/analyze.ts +207 -0
  119. package/src/cli/config.ts +561 -0
  120. package/src/cli/constitution.ts +109 -0
  121. package/src/cli/diagnose-analysis.ts +159 -0
  122. package/src/cli/diagnose-formatter.ts +87 -0
  123. package/src/cli/diagnose.ts +203 -0
  124. package/src/cli/generate.ts +127 -0
  125. package/src/cli/index.ts +37 -0
  126. package/src/cli/init.ts +188 -0
  127. package/src/cli/interact.ts +295 -0
  128. package/src/cli/plan.ts +198 -0
  129. package/src/cli/plugins.ts +111 -0
  130. package/src/cli/prompts.ts +295 -0
  131. package/src/cli/runs.ts +174 -0
  132. package/src/cli/status-cost.ts +151 -0
  133. package/src/cli/status-features.ts +338 -0
  134. package/src/cli/status.ts +13 -0
  135. package/src/commands/common.ts +171 -0
  136. package/src/commands/diagnose.ts +17 -0
  137. package/src/commands/index.ts +8 -0
  138. package/src/commands/logs.ts +384 -0
  139. package/src/commands/precheck.ts +86 -0
  140. package/src/commands/unlock.ts +96 -0
  141. package/src/config/defaults.ts +160 -0
  142. package/src/config/index.ts +22 -0
  143. package/src/config/loader.ts +121 -0
  144. package/src/config/merger.ts +147 -0
  145. package/src/config/path-security.ts +121 -0
  146. package/src/config/paths.ts +27 -0
  147. package/src/config/schema.ts +56 -0
  148. package/src/config/schemas.ts +286 -0
  149. package/src/config/types.ts +423 -0
  150. package/src/config/validate.ts +103 -0
  151. package/src/constitution/generator.ts +191 -0
  152. package/src/constitution/generators/aider.ts +41 -0
  153. package/src/constitution/generators/claude.ts +35 -0
  154. package/src/constitution/generators/cursor.ts +36 -0
  155. package/src/constitution/generators/opencode.ts +38 -0
  156. package/src/constitution/generators/types.ts +33 -0
  157. package/src/constitution/generators/windsurf.ts +36 -0
  158. package/src/constitution/index.ts +10 -0
  159. package/src/constitution/loader.ts +133 -0
  160. package/src/constitution/types.ts +31 -0
  161. package/src/context/auto-detect.ts +227 -0
  162. package/src/context/builder.ts +246 -0
  163. package/src/context/elements.ts +83 -0
  164. package/src/context/formatter.ts +107 -0
  165. package/src/context/generator.ts +129 -0
  166. package/src/context/generators/aider.ts +34 -0
  167. package/src/context/generators/claude.ts +28 -0
  168. package/src/context/generators/cursor.ts +28 -0
  169. package/src/context/generators/opencode.ts +30 -0
  170. package/src/context/generators/windsurf.ts +28 -0
  171. package/src/context/greenfield.ts +114 -0
  172. package/src/context/index.ts +33 -0
  173. package/src/context/injector.ts +279 -0
  174. package/src/context/test-scanner.ts +370 -0
  175. package/src/context/types.ts +98 -0
  176. package/src/errors.ts +67 -0
  177. package/src/execution/batching.ts +157 -0
  178. package/src/execution/crash-recovery.ts +373 -0
  179. package/src/execution/escalation/escalation.ts +44 -0
  180. package/src/execution/escalation/index.ts +13 -0
  181. package/src/execution/escalation/tier-escalation.ts +295 -0
  182. package/src/execution/escalation/tier-outcome.ts +158 -0
  183. package/src/execution/helpers.ts +38 -0
  184. package/src/execution/index.ts +45 -0
  185. package/src/execution/lifecycle/acceptance-loop.ts +272 -0
  186. package/src/execution/lifecycle/headless-formatter.ts +85 -0
  187. package/src/execution/lifecycle/index.ts +12 -0
  188. package/src/execution/lifecycle/parallel-lifecycle.ts +101 -0
  189. package/src/execution/lifecycle/precheck-runner.ts +140 -0
  190. package/src/execution/lifecycle/run-cleanup.ts +81 -0
  191. package/src/execution/lifecycle/run-completion.ts +129 -0
  192. package/src/execution/lifecycle/run-initialization.ts +141 -0
  193. package/src/execution/lifecycle/run-lifecycle.ts +312 -0
  194. package/src/execution/lifecycle/run-setup.ts +204 -0
  195. package/src/execution/lifecycle/story-hooks.ts +38 -0
  196. package/src/execution/lifecycle/story-size-prompts.ts +123 -0
  197. package/src/execution/lock.ts +115 -0
  198. package/src/execution/parallel-executor.ts +216 -0
  199. package/src/execution/parallel.ts +400 -0
  200. package/src/execution/pid-registry.ts +280 -0
  201. package/src/execution/pipeline-result-handler.ts +388 -0
  202. package/src/execution/post-verify-rectification.ts +188 -0
  203. package/src/execution/post-verify.ts +274 -0
  204. package/src/execution/progress.ts +25 -0
  205. package/src/execution/prompts.ts +127 -0
  206. package/src/execution/queue-handler.ts +109 -0
  207. package/src/execution/rectification.ts +13 -0
  208. package/src/execution/runner.ts +377 -0
  209. package/src/execution/sequential-executor.ts +388 -0
  210. package/src/execution/status-file.ts +264 -0
  211. package/src/execution/status-writer.ts +139 -0
  212. package/src/execution/story-context.ts +229 -0
  213. package/src/execution/test-output-parser.ts +14 -0
  214. package/src/execution/verification.ts +72 -0
  215. package/src/hooks/index.ts +2 -0
  216. package/src/hooks/runner.ts +286 -0
  217. package/src/hooks/types.ts +67 -0
  218. package/src/interaction/chain.ts +154 -0
  219. package/src/interaction/index.ts +60 -0
  220. package/src/interaction/init.ts +83 -0
  221. package/src/interaction/plugins/auto.ts +217 -0
  222. package/src/interaction/plugins/cli.ts +300 -0
  223. package/src/interaction/plugins/telegram.ts +384 -0
  224. package/src/interaction/plugins/webhook.ts +258 -0
  225. package/src/interaction/state.ts +171 -0
  226. package/src/interaction/triggers.ts +229 -0
  227. package/src/interaction/types.ts +163 -0
  228. package/src/logger/formatters.ts +84 -0
  229. package/src/logger/index.ts +16 -0
  230. package/src/logger/logger.ts +298 -0
  231. package/src/logger/types.ts +48 -0
  232. package/src/logging/formatter.ts +355 -0
  233. package/src/logging/index.ts +22 -0
  234. package/src/logging/types.ts +93 -0
  235. package/src/metrics/aggregator.ts +190 -0
  236. package/src/metrics/index.ts +14 -0
  237. package/src/metrics/tracker.ts +200 -0
  238. package/src/metrics/types.ts +109 -0
  239. package/src/optimizer/index.ts +62 -0
  240. package/src/optimizer/noop.optimizer.ts +24 -0
  241. package/src/optimizer/rule-based.optimizer.ts +248 -0
  242. package/src/optimizer/types.ts +53 -0
  243. package/src/pipeline/events.ts +130 -0
  244. package/src/pipeline/index.ts +19 -0
  245. package/src/pipeline/runner.ts +161 -0
  246. package/src/pipeline/stages/acceptance.ts +197 -0
  247. package/src/pipeline/stages/completion.ts +99 -0
  248. package/src/pipeline/stages/constitution.ts +63 -0
  249. package/src/pipeline/stages/context.ts +117 -0
  250. package/src/pipeline/stages/execution.ts +194 -0
  251. package/src/pipeline/stages/index.ts +62 -0
  252. package/src/pipeline/stages/optimizer.ts +74 -0
  253. package/src/pipeline/stages/prompt.ts +57 -0
  254. package/src/pipeline/stages/queue-check.ts +103 -0
  255. package/src/pipeline/stages/review.ts +181 -0
  256. package/src/pipeline/stages/routing.ts +81 -0
  257. package/src/pipeline/stages/verify.ts +100 -0
  258. package/src/pipeline/types.ts +167 -0
  259. package/src/plugins/index.ts +31 -0
  260. package/src/plugins/loader.ts +287 -0
  261. package/src/plugins/registry.ts +168 -0
  262. package/src/plugins/types.ts +327 -0
  263. package/src/plugins/validator.ts +352 -0
  264. package/src/prd/index.ts +172 -0
  265. package/src/prd/types.ts +202 -0
  266. package/src/precheck/checks-blockers.ts +391 -0
  267. package/src/precheck/checks-warnings.ts +142 -0
  268. package/src/precheck/checks.ts +30 -0
  269. package/src/precheck/index.ts +247 -0
  270. package/src/precheck/story-size-gate.ts +144 -0
  271. package/src/precheck/types.ts +31 -0
  272. package/src/queue/index.ts +2 -0
  273. package/src/queue/manager.ts +254 -0
  274. package/src/queue/types.ts +54 -0
  275. package/src/review/index.ts +8 -0
  276. package/src/review/runner.ts +172 -0
  277. package/src/review/types.ts +66 -0
  278. package/src/routing/builder.ts +81 -0
  279. package/src/routing/chain.ts +74 -0
  280. package/src/routing/index.ts +16 -0
  281. package/src/routing/loader.ts +58 -0
  282. package/src/routing/router.ts +303 -0
  283. package/src/routing/strategies/adaptive.ts +215 -0
  284. package/src/routing/strategies/index.ts +8 -0
  285. package/src/routing/strategies/keyword.ts +163 -0
  286. package/src/routing/strategies/llm-prompts.ts +209 -0
  287. package/src/routing/strategies/llm.ts +235 -0
  288. package/src/routing/strategies/manual.ts +50 -0
  289. package/src/routing/strategy.ts +99 -0
  290. package/src/tdd/cleanup.ts +111 -0
  291. package/src/tdd/index.ts +23 -0
  292. package/src/tdd/isolation.ts +123 -0
  293. package/src/tdd/orchestrator.ts +383 -0
  294. package/src/tdd/prompts.ts +270 -0
  295. package/src/tdd/rectification-gate.ts +183 -0
  296. package/src/tdd/session-runner.ts +179 -0
  297. package/src/tdd/types.ts +81 -0
  298. package/src/tdd/verdict.ts +271 -0
  299. package/src/tui/App.tsx +265 -0
  300. package/src/tui/components/AgentPanel.tsx +75 -0
  301. package/src/tui/components/CostOverlay.tsx +118 -0
  302. package/src/tui/components/HelpOverlay.tsx +107 -0
  303. package/src/tui/components/StatusBar.tsx +63 -0
  304. package/src/tui/components/StoriesPanel.tsx +177 -0
  305. package/src/tui/hooks/useKeyboard.ts +142 -0
  306. package/src/tui/hooks/useLayout.ts +137 -0
  307. package/src/tui/hooks/usePipelineEvents.ts +183 -0
  308. package/src/tui/hooks/usePty.ts +194 -0
  309. package/src/tui/index.tsx +38 -0
  310. package/src/tui/types.ts +76 -0
  311. package/src/utils/git.ts +83 -0
  312. package/src/utils/queue-writer.ts +54 -0
  313. package/src/verification/executor.ts +235 -0
  314. package/src/verification/gate.ts +207 -0
  315. package/src/verification/index.ts +12 -0
  316. package/src/verification/parser.ts +230 -0
  317. package/src/verification/rectification.ts +108 -0
  318. package/src/verification/types.ts +113 -0
  319. package/src/worktree/dispatcher.ts +65 -0
  320. package/src/worktree/index.ts +2 -0
  321. package/src/worktree/manager.ts +187 -0
  322. package/src/worktree/merge.ts +301 -0
  323. package/src/worktree/types.ts +4 -0
  324. package/test/TEST_COVERAGE_US001.md +217 -0
  325. package/test/TEST_COVERAGE_US003.md +84 -0
  326. package/test/TEST_COVERAGE_US005.md +86 -0
  327. package/test/US-002-orchestrator.test.ts +246 -0
  328. package/test/acceptance/cm-003-default-view.test.ts +194 -0
  329. package/test/execution/pid-registry.test.ts +240 -0
  330. package/test/execution/post-verify.test.ts +224 -0
  331. package/test/helpers/timeout.ts +42 -0
  332. package/test/integration/US-002-TEST-SUMMARY.md +107 -0
  333. package/test/integration/US-003-TEST-SUMMARY.md +149 -0
  334. package/test/integration/US-004-TEST-SUMMARY.md +106 -0
  335. package/test/integration/US-005-TEST-SUMMARY.md +138 -0
  336. package/test/integration/US-007-TEST-SUMMARY.md +100 -0
  337. package/test/integration/agent-validation.test.ts +439 -0
  338. package/test/integration/analyze-integration.test.ts +261 -0
  339. package/test/integration/analyze-scanner.test.ts +131 -0
  340. package/test/integration/cli-config-default-edge-cases.test.ts +222 -0
  341. package/test/integration/cli-config-default-view.test.ts +229 -0
  342. package/test/integration/cli-config-diff.test.ts +460 -0
  343. package/test/integration/cli-config.test.ts +736 -0
  344. package/test/integration/cli-diagnose.test.ts +592 -0
  345. package/test/integration/cli-logs.test.ts +314 -0
  346. package/test/integration/cli-plugins.test.ts +678 -0
  347. package/test/integration/cli-precheck.test.ts +371 -0
  348. package/test/integration/cli-run-headless.test.ts +173 -0
  349. package/test/integration/cli.test.ts +75 -0
  350. package/test/integration/config/merger.test.ts +465 -0
  351. package/test/integration/config/paths.test.ts +51 -0
  352. package/test/integration/config-loader.test.ts +265 -0
  353. package/test/integration/config.test.ts +444 -0
  354. package/test/integration/context-integration.test.ts +702 -0
  355. package/test/integration/context-provider-injection.test.ts +506 -0
  356. package/test/integration/context-verification-integration.test.ts +295 -0
  357. package/test/integration/e2e.test.ts +896 -0
  358. package/test/integration/execution.test.ts +625 -0
  359. package/test/integration/helpers.test.ts +295 -0
  360. package/test/integration/hooks.test.ts +361 -0
  361. package/test/integration/interaction-chain-pipeline.test.ts +464 -0
  362. package/test/integration/isolation.test.ts +143 -0
  363. package/test/integration/logger.test.ts +461 -0
  364. package/test/integration/parallel.test.ts +250 -0
  365. package/test/integration/path-security.test.ts +173 -0
  366. package/test/integration/pipeline-acceptance.test.ts +302 -0
  367. package/test/integration/pipeline-events.test.ts +475 -0
  368. package/test/integration/pipeline.test.ts +658 -0
  369. package/test/integration/plan.test.ts +157 -0
  370. package/test/integration/plugin-routing.test.ts +921 -0
  371. package/test/integration/plugins/config-integration.test.ts +172 -0
  372. package/test/integration/plugins/config-resolution.test.ts +522 -0
  373. package/test/integration/plugins/loader.test.ts +641 -0
  374. package/test/integration/plugins/registry.test.ts +746 -0
  375. package/test/integration/plugins/validator.test.ts +563 -0
  376. package/test/integration/prd-pause.test.ts +205 -0
  377. package/test/integration/prd-resolvers.test.ts +185 -0
  378. package/test/integration/precheck-integration.test.ts +468 -0
  379. package/test/integration/precheck.test.ts +805 -0
  380. package/test/integration/progress.test.ts +34 -0
  381. package/test/integration/rectification-flow.test.ts +512 -0
  382. package/test/integration/reporter-lifecycle.test.ts +860 -0
  383. package/test/integration/review-config-commands.test.ts +319 -0
  384. package/test/integration/review-config-schema.test.ts +116 -0
  385. package/test/integration/review-plugin-integration.test.ts +722 -0
  386. package/test/integration/review.test.ts +149 -0
  387. package/test/integration/routing-stage-bug-021.test.ts +274 -0
  388. package/test/integration/routing-stage-greenfield.test.ts +286 -0
  389. package/test/integration/runner-config-plugins.test.ts +461 -0
  390. package/test/integration/runner-fixes.test.ts +399 -0
  391. package/test/integration/runner-plugin-integration.test.ts +543 -0
  392. package/test/integration/runner.test.ts +1679 -0
  393. package/test/integration/s5-greenfield-fallback.test.ts +297 -0
  394. package/test/integration/status-file-integration.test.ts +325 -0
  395. package/test/integration/status-file.test.ts +379 -0
  396. package/test/integration/status-writer.test.ts +345 -0
  397. package/test/integration/story-id-in-events.test.ts +273 -0
  398. package/test/integration/tdd-cleanup.test.ts +246 -0
  399. package/test/integration/tdd-orchestrator.test.ts +1762 -0
  400. package/test/integration/test-scanner.test.ts +403 -0
  401. package/test/integration/verification-asset-check.test.ts +142 -0
  402. package/test/integration/verify-stage.test.ts +275 -0
  403. package/test/integration/worktree/manager.test.ts +218 -0
  404. package/test/integration/worktree/merge.test.ts +341 -0
  405. package/test/manual/logging-formatter-demo.ts +158 -0
  406. package/test/ui/tui-agent-panel.test.tsx +99 -0
  407. package/test/ui/tui-controls.test.ts +334 -0
  408. package/test/ui/tui-cost-and-pty.test.ts +189 -0
  409. package/test/ui/tui-layout.test.ts +378 -0
  410. package/test/ui/tui-pty-integration.test.tsx +159 -0
  411. package/test/ui/tui-stories.test.ts +332 -0
  412. package/test/unit/acceptance.test.ts +186 -0
  413. package/test/unit/agent-stderr-capture.test.ts +146 -0
  414. package/test/unit/analyze-classifier.test.ts +215 -0
  415. package/test/unit/analyze.test.ts +224 -0
  416. package/test/unit/auto-detect.test.ts +249 -0
  417. package/test/unit/cli-status.test.ts +417 -0
  418. package/test/unit/commands/common.test.ts +320 -0
  419. package/test/unit/commands/logs.test.ts +416 -0
  420. package/test/unit/commands/unlock.test.ts +319 -0
  421. package/test/unit/constitution-generators.test.ts +160 -0
  422. package/test/unit/constitution.test.ts +209 -0
  423. package/test/unit/context.test.ts +1722 -0
  424. package/test/unit/cost.test.ts +231 -0
  425. package/test/unit/crash-recovery.test.ts +308 -0
  426. package/test/unit/escalation.test.ts +126 -0
  427. package/test/unit/execution-logging-stderr.test.ts +156 -0
  428. package/test/unit/execution-stage.test.ts +122 -0
  429. package/test/unit/fix-generator.test.ts +275 -0
  430. package/test/unit/formatters.test.ts +469 -0
  431. package/test/unit/greenfield.test.ts +179 -0
  432. package/test/unit/helpers.test.ts +317 -0
  433. package/test/unit/interaction/human-review-trigger.test.ts +164 -0
  434. package/test/unit/interaction-network-failures.test.ts +389 -0
  435. package/test/unit/interaction-plugins.test.ts +164 -0
  436. package/test/unit/isolation.test.ts +134 -0
  437. package/test/unit/logging/formatter.test.ts +455 -0
  438. package/test/unit/merge.test.ts +268 -0
  439. package/test/unit/metrics.test.ts +276 -0
  440. package/test/unit/optimizer/noop.optimizer.test.ts +125 -0
  441. package/test/unit/optimizer/rule-based.optimizer.test.ts +358 -0
  442. package/test/unit/prd-auto-default.test.ts +290 -0
  443. package/test/unit/prd-failure-category.test.ts +176 -0
  444. package/test/unit/prd-get-next-story.test.ts +186 -0
  445. package/test/unit/precheck-checks.test.ts +840 -0
  446. package/test/unit/precheck-story-size-gate.test.ts +287 -0
  447. package/test/unit/precheck-types.test.ts +142 -0
  448. package/test/unit/prompts.test.ts +475 -0
  449. package/test/unit/queue.test.ts +237 -0
  450. package/test/unit/rectification.test.ts +284 -0
  451. package/test/unit/registry.test.ts +287 -0
  452. package/test/unit/routing.test.ts +937 -0
  453. package/test/unit/run-lifecycle.test.ts +140 -0
  454. package/test/unit/storyid-events.test.ts +224 -0
  455. package/test/unit/tdd-verdict.test.ts +492 -0
  456. package/test/unit/test-output-parser.test.ts +377 -0
  457. package/test/unit/verdict.test.ts +324 -0
  458. package/test/unit/worktree-manager.test.ts +158 -0
  459. package/tsconfig.json +27 -0
@@ -0,0 +1,812 @@
1
+ # Spec: v0.10.1 — Status File + TDD Escalation Retry
2
+
3
+ **Version:** v0.10.1
4
+ **Author:** Subrina
5
+ **Date:** 2026-02-25
6
+ **Status:** Draft
7
+
8
+ ---
9
+
10
+ ## Summary
11
+
12
+ Add a `--status-file <path>` flag to `nax run` that writes a machine-readable JSON status file, updated at run start, at each story start and finish, and at run end. This lets external tools (CI/CD, orchestrators, dashboards) monitor nax runs without parsing logs or aggregating hook events.
13
+
14
+ ## Motivation
15
+
16
+ - **Log parsing is fragile** — format changes break consumers
17
+ - **Hook aggregation has gaps** — if a hook fails, events are lost; no single source of truth
18
+ - **nax already tracks this state** — `RunResult`, story counts, cost, PRD status are all in memory
19
+ - **General-purpose** — useful for any integration, not just our orchestrator skill
20
+
21
+ ## Interface
22
+
23
+ ### CLI Flag
24
+
25
+ ```bash
26
+ nax run -f <feature> --headless --status-file ./nax-status.json
27
+ ```
28
+
29
+ | Flag | Type | Default | Description |
30
+ |:-----|:-----|:--------|:------------|
31
+ | `--status-file` | `string` | `undefined` | Path to write JSON status file. If not set, no file is written. |
32
+
33
+ Relative paths are resolved from `cwd` (same as `--headless` log behavior).
34
+
35
+ ### Status File Schema
36
+
37
+ ```typescript
38
+ interface NaxStatusFile {
39
+   /** Schema version for forward compatibility */
40
+   version: 1;
41
+
42
+   /** Run metadata */
43
+   run: {
44
+     id: string; // Run ID (e.g. "run-2026-02-25T10-00-00-000Z")
45
+     feature: string; // Feature name
46
+     startedAt: string; // ISO 8601
47
+     status: "running" | "completed" | "failed" | "stalled";
48
+     dryRun: boolean;
49
+   };
50
+
51
+   /** Aggregate progress */
52
+   progress: {
53
+     total: number; // Total stories in PRD
54
+     passed: number;
55
+     failed: number;
56
+     paused: number;
57
+     blocked: number;
58
+     pending: number; // total - passed - failed - paused - blocked
59
+   };
60
+
61
+   /** Cost tracking */
62
+   cost: {
63
+     spent: number; // USD accumulated
64
+     limit: number | null; // From config.execution.costLimit
65
+   };
66
+
67
+   /** Current story being processed (null if between stories) */
68
+   current: {
69
+     storyId: string;
70
+     title: string;
71
+     complexity: string; // simple | medium | complex
72
+     tddStrategy: string; // test-after | tdd-lite | three-session-tdd
73
+     model: string; // Resolved model name
74
+     attempt: number; // Current attempt (1-based)
75
+     phase: string; // routing | test-write | implement | verify | review
76
+   } | null;
77
+
78
+   /** Iteration count */
79
+   iterations: number;
80
+
81
+   /** Last updated timestamp */
82
+   updatedAt: string; // ISO 8601
83
+
84
+   /** Duration so far in ms */
85
+   durationMs: number;
86
+ }
87
+ ```
88
+
89
+ ### Example Output
90
+
91
+ ```json
92
+ {
93
+   "version": 1,
94
+   "run": {
95
+     "id": "run-2026-02-25T10-00-00-000Z",
96
+     "feature": "auth-refactor",
97
+     "startedAt": "2026-02-25T10:00:00Z",
98
+     "status": "running",
99
+     "dryRun": false
100
+   },
101
+   "progress": {
102
+     "total": 12,
103
+     "passed": 7,
104
+     "failed": 1,
105
+     "paused": 0,
106
+     "blocked": 1,
107
+     "pending": 3
108
+   },
109
+   "cost": {
110
+     "spent": 1.23,
111
+     "limit": 5.00
112
+   },
113
+   "current": {
114
+     "storyId": "US-008",
115
+     "title": "Add retry logic to queue handler",
116
+     "complexity": "medium",
117
+     "tddStrategy": "tdd-lite",
118
+     "model": "claude-sonnet-4-5-20250514",
119
+     "attempt": 1,
120
+     "phase": "implement"
121
+   },
122
+   "iterations": 8,
123
+   "updatedAt": "2026-02-25T10:15:32Z",
124
+   "durationMs": 932000
125
+ }
126
+ ```
127
+
128
+ ## Implementation
129
+
130
+ ### Files to Change
131
+
132
+ | File | Change |
133
+ |:-----|:-------|
134
+ | `src/execution/runner.ts` | Add `statusFile?: string` to `RunOptions`. Call `writeStatusFile()` at key points. |
135
+ | `src/execution/status-file.ts` | **New file.** `writeStatusFile()` function — builds `NaxStatusFile` from run state, writes atomically. |
136
+ | `src/main.ts` (or wherever CLI args are parsed) | Add `--status-file` option, pass to `RunOptions`. |
137
+
138
+ ### Write Points
139
+
140
+ Status file is updated at these moments:
141
+
142
+ 1. **Run start** — initial state (all stories pending)
143
+ 2. **Story start** — update `current` with story info
144
+ 3. **Story complete/fail/pause** — update `progress` counts, clear `current`
145
+ 4. **Run end** — final state (`status: "completed"` or `"failed"`)
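The four write points above can be sketched as pure state transitions. This is only an illustration under stated assumptions: `StatusCore`, `CurrentStory`, and the `on*` function names are hypothetical and cover just the fields each write point touches directly; they are not taken from the nax codebase. Progress counts are left out because the spec derives them from the PRD.

```typescript
// Hypothetical, reduced subset of NaxStatusFile used only for this sketch.
interface CurrentStory {
  storyId: string;
  title: string;
  attempt: number; // 1-based
}

interface StatusCore {
  status: "running" | "completed" | "failed" | "stalled";
  current: CurrentStory | null;
  updatedAt: string; // ISO 8601
}

// 1. Run start: nothing is being processed yet.
function onRunStart(now: Date): StatusCore {
  return { status: "running", current: null, updatedAt: now.toISOString() };
}

// 2. Story start: record the story being processed.
function onStoryStart(prev: StatusCore, story: CurrentStory, now: Date): StatusCore {
  return { ...prev, current: story, updatedAt: now.toISOString() };
}

// 3. Story complete/fail/pause: clear `current`.
function onStoryEnd(prev: StatusCore, now: Date): StatusCore {
  return { ...prev, current: null, updatedAt: now.toISOString() };
}

// 4. Run end: record the final status.
function onRunEnd(prev: StatusCore, succeeded: boolean, now: Date): StatusCore {
  return {
    ...prev,
    status: succeeded ? "completed" : "failed",
    current: null,
    updatedAt: now.toISOString(),
  };
}
```

Each transition returns a fresh object, so a caller could hand the result to the status writer after every step without sharing mutable state.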
146
+
147
+ ### Atomic Writes
148
+
149
+ Write to `<path>.tmp` then rename to `<path>` to prevent readers from seeing partial JSON:
150
+
151
+ ```typescript
152
+ import { rename } from "node:fs/promises";
153
+
154
+ async function writeStatusFile(path: string, status: NaxStatusFile): Promise<void> {
155
+   const tmpPath = `${path}.tmp`;
156
+   await Bun.write(tmpPath, JSON.stringify(status, null, 2));
157
+   await rename(tmpPath, path);
158
+ }
159
+ ```
160
+
161
+ ### Integration with RunOptions
162
+
163
+ ```typescript
164
+ // src/execution/runner.ts
165
+ export interface RunOptions {
166
+   // ... existing fields
167
+   /** Path to write JSON status file (optional) */
168
+   statusFile?: string;
169
+ }
170
+ ```
171
+
172
+ ### Progress Counting
173
+
174
+ Derive from PRD state (already loaded):
175
+
176
+ ```typescript
177
+ function countProgress(prd: PRD): NaxStatusFile["progress"] {
178
+   const stories = prd.stories;
179
+   const passed = stories.filter(s => s.status === "passed").length;
180
+   const failed = stories.filter(s => s.status === "failed").length;
181
+   const paused = stories.filter(s => s.status === "paused").length;
182
+   const blocked = stories.filter(s => s.status === "blocked").length;
183
+   const total = stories.length;
184
+   return { total, passed, failed, paused, blocked, pending: total - passed - failed - paused - blocked };
185
+ }
186
+ ```
187
+
188
+ ### Cleanup
189
+
190
+ The status file is **not** deleted on run end — it persists as a record of the last run. Consumers can check `run.status` to determine if the run is still active.
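A consumer-side check might look like the following sketch. It assumes the consumer has already read the file contents into a string. `StatusSnapshot` and `readSnapshot` are hypothetical names introduced here, and `settled` (stories no longer pending) is an illustrative field, not part of the schema.

```typescript
// Hypothetical subset of the status file schema that a poller needs.
interface StatusSnapshot {
  version: number;
  run: { status: "running" | "completed" | "failed" | "stalled" };
  progress: { total: number; pending: number };
}

// Parse the file contents and report whether the run is still active.
function readSnapshot(raw: string): { active: boolean; settled: number; total: number } {
  const snap = JSON.parse(raw) as StatusSnapshot;
  if (snap.version !== 1) {
    // Refuse unknown schema versions rather than guess at their shape.
    throw new Error(`Unsupported status file version: ${snap.version}`);
  }
  const { total, pending } = snap.progress;
  return {
    active: snap.run.status === "running",
    settled: total - pending, // passed + failed + paused + blocked
    total,
  };
}
```

Because the file persists after the run, a poller that only checks for the file's existence cannot tell a finished run from an active one; checking `run.status` as above avoids that.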
191
+
192
+ ## Testing
193
+
194
+ | Test | Description |
195
+ |:-----|:------------|
196
+ | `status-file.test.ts` | Unit: `writeStatusFile()` produces valid JSON, atomic write works |
197
+ | `status-file.test.ts` | Unit: `countProgress()` correctly counts all states |
198
+ | `runner.test.ts` | Integration: `--status-file` option flows through to `RunOptions` |
199
+ | `runner.test.ts` | Integration: status file updates at each write point |
200
+ | Manual | `--status-file` + `--dry-run` produces correct output |
201
+
202
+ ## Non-Goals
203
+
204
+ - **Real-time streaming** — this is a polled file, not a websocket/SSE stream
205
+ - **Historical run data** — status file represents current/last run only (hooks + events.jsonl cover history)
206
+ - **`nax status --json` command** — future work, can read this file
207
+
208
+ ## Migration
209
+
210
+ None. New optional flag, no breaking changes. If `--status-file` is not passed, behavior is identical to v0.10.0.
211
+
212
+ ---
213
+
214
+ # Feature 2: TDD Escalation Retry
215
+
216
+ ## Summary
217
+
218
+ Three-session TDD currently hard-codes `pause` for all failures — isolation violations, session crashes, and test failures all result in the story being paused with no retry. This means TDD stories never benefit from the escalation system that test-after stories use.
219
+
220
+ Change: TDD failures should follow the same escalation retry pattern as test-after. Only pause when all retry paths are exhausted.
221
+
222
+ ## Problem
223
+
224
+ Current flow (all TDD failures):
225
+ ```
226
+ TDD failure → needsHumanReview=true → execution stage returns "pause" → story paused → NO RETRY
227
+ ```
228
+
229
+ test-after flow (for comparison):
230
+ ```
231
+ Agent failure → execution stage returns "escalate" → runner bumps tier → retries → only fails after max attempts
232
+ ```
233
+
234
+ ## Proposed Retry Strategy
235
+
236
+ TDD failures are classified into three categories with different retry paths:
237
+
238
+ ### Category 1: Isolation Violation (test-writer touches source)
239
+
240
+ **Current:** Pause immediately.
241
+ **Proposed:** Auto-downgrade to tdd-lite, then escalate.
242
+
243
+ ```
244
+ three-session-tdd fails (isolation violation)
245
+ → Retry 1: three-session-tdd-lite (same tier, skip isolation for writer/implementer)
246
+ → Success? Done ✅
247
+ → Fail? Escalate to next tier
248
+ → Retry 2: tdd-lite + stronger model
249
+ → Success? Done ✅
250
+ → Fail? Continue escalation through tier chain
251
+ → All tiers exhausted → pause (needs human review) ⏸
252
+ ```
253
+
254
+ **Note:** The zero-file fallback already does this for one specific case (test-writer creates no test files → auto-retry as lite). This generalizes that pattern to all isolation violations.
255
+
256
+ ### Category 2: Session Failure (agent crash, timeout, non-zero exit)
257
+
258
+ **Current:** Pause immediately.
259
+ **Proposed:** Escalate model tier (same as test-after).
260
+
261
+ ```
262
+ TDD session fails (crash/timeout)
263
+ → Escalate to next model tier
264
+ → Retry with stronger model (same TDD strategy)
265
+ → Success? Done ✅
266
+ → Fail? Continue escalation
267
+ → All tiers exhausted → mark failed ❌
268
+ ```
269
+
270
+ ### Category 3: Tests Still Failing After All Sessions
271
+
272
+ **Current:** Post-TDD verification runs. If tests fail → pause.
273
+ **Proposed:** Escalate model tier.
274
+
275
+ ```
276
+ All 3 sessions complete but tests still fail
277
+ → Escalate to next model tier
278
+ → Retry full TDD with stronger model
279
+ → Success? Done ✅
280
+ → Fail? Continue escalation
281
+ → All tiers exhausted → mark failed ❌
282
+ ```
283
+
284
+ ### Summary Table
285
+
286
+ | Failure Type | Current Action | New Action | Final Fallback |
287
+ |:-------------|:--------------|:-----------|:--------------|
288
+ | Isolation violation | pause | Downgrade to lite → escalate | pause (human review) |
289
+ | Zero test files created | lite retry (exists) | Keep existing + escalate | pause (human review) |
290
+ | Session crash/timeout | pause | Escalate tier | fail |
291
+ | Tests fail post-TDD | pause | Escalate tier | fail |
292
+ | Verifier flags bad code | pause | Escalate tier | pause (human review) |
293
+
294
+ **Why "pause" for isolation/verifier but "fail" for crashes?**
295
+ - Isolation violations and verifier concerns suggest the code needs *human judgment* — the AI may be fundamentally misunderstanding the task.
296
+ - Crashes and test failures are mechanical — a stronger model usually fixes them.
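
The "Final Fallback" column of the summary table can be read as a lookup from failure category to terminal action. The following is a minimal sketch of that reading, not the actual runner code: `finalFallbackFor` and `FinalAction` are hypothetical names, and treating a missing category as `pause` is an assumption that mirrors the conservative default in the routing switch.

```typescript
// Hypothetical helper for illustration; the spec does not define this function.
type FailureCategory =
  | "isolation-violation"
  | "session-failure"
  | "tests-failing"
  | "verifier-rejected";

type FinalAction = "pause" | "fail";

/** Terminal action once every tier in the escalation chain has been tried. */
function finalFallbackFor(category: FailureCategory | undefined): FinalAction {
  switch (category) {
    // Judgment problems: hand the story to a human.
    case "isolation-violation":
    case "verifier-rejected":
      return "pause";
    // Mechanical problems: the story is marked failed.
    case "session-failure":
    case "tests-failing":
      return "fail";
    // Assumption: an unknown or missing category keeps the old pause behavior.
    default:
      return "pause";
  }
}
```

Keeping this mapping in one place would let the runner decide between `pause` and `fail` without re-inspecting the TDD result.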
297
+
298
+ ## Implementation
299
+
300
+ ### Changes to `ThreeSessionTddResult`
301
+
302
+ Add a `failureCategory` field so the execution stage can differentiate:
303
+
304
+ ```typescript
305
+ export interface ThreeSessionTddResult {
306
+ success: boolean;
307
+ sessions: TddSessionResult[];
308
+ needsHumanReview: boolean;
309
+ reviewReason?: string;
310
+ totalCost: number;
311
+ lite: boolean;
312
+
313
+ /** NEW: Categorize failure for retry routing */
314
+ failureCategory?: "isolation-violation" | "session-failure" | "tests-failing" | "verifier-rejected";
315
+ }
316
+ ```
317
+
318
+ ### Changes to `execution.ts` (pipeline stage)
319
+
320
+ Replace the blanket `pause` with category-based routing:
321
+
322
+ ```typescript
323
+ // Current:
324
+ if (tddResult.needsHumanReview) {
325
+ return { action: "pause", reason: tddResult.reviewReason };
326
+ }
327
+
328
+ // Proposed:
329
+ if (!tddResult.success) {
330
+ switch (tddResult.failureCategory) {
331
+ case "isolation-violation":
332
+ // If already lite → escalate. If strict → retry as lite (same tier while that tier has attempts left).
333
+ if (tddResult.lite) {
334
+ return { action: "escalate", reason: tddResult.reviewReason };
335
+ }
336
+ // Store flag in context so runner knows to downgrade strategy
337
+ ctx.retryAsLite = true;
338
+ return { action: "escalate", reason: `Isolation violation — downgrading to lite` };
339
+
340
+ case "session-failure":
341
+ case "tests-failing":
342
+ return { action: "escalate", reason: tddResult.reviewReason };
343
+
344
+ case "verifier-rejected":
345
+ // Escalate first, pause only after all tiers exhausted
346
+ return { action: "escalate", reason: tddResult.reviewReason };
347
+
348
+ default:
349
+ return { action: "pause", reason: tddResult.reviewReason };
350
+ }
351
+ }
352
+ ```
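
For unit testing, the same routing can be expressed as a pure function that returns the decision instead of mutating `ctx`. This is a sketch under assumptions: `routeTddFailure`, `TddOutcome`, and `RoutingDecision` are hypothetical names, and `TddOutcome` carries only the fields of `ThreeSessionTddResult` that the routing reads.

```typescript
// Illustrative only: a side-effect-free variant of the category switch.
type FailureCategory =
  | "isolation-violation"
  | "session-failure"
  | "tests-failing"
  | "verifier-rejected";

interface TddOutcome {
  success: boolean;
  lite: boolean;
  reviewReason?: string;
  failureCategory?: FailureCategory;
}

interface RoutingDecision {
  action: "escalate" | "pause";
  reason: string | undefined;
  /** True when the next attempt should switch to three-session-tdd-lite. */
  retryAsLite: boolean;
}

/** Returns null on success, otherwise the action the execution stage should take. */
function routeTddFailure(result: TddOutcome): RoutingDecision | null {
  if (result.success) return null;

  switch (result.failureCategory) {
    case "isolation-violation":
      // A strict run is first downgraded to lite; a lite run just escalates.
      return result.lite
        ? { action: "escalate", reason: result.reviewReason, retryAsLite: false }
        : { action: "escalate", reason: "Isolation violation, downgrading to lite", retryAsLite: true };

    case "session-failure":
    case "tests-failing":
    case "verifier-rejected":
      return { action: "escalate", reason: result.reviewReason, retryAsLite: false };

    default:
      // No category recorded: keep the pre-change behavior.
      return { action: "pause", reason: result.reviewReason, retryAsLite: false };
  }
}
```

The caller would copy `retryAsLite` onto the pipeline context, which keeps the switch itself easy to cover with table-driven tests.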
353
+
354
+ ### Changes to `runner.ts` (escalation handler)
355
+
356
+ When escalating a TDD story with `retryAsLite`, update the story's routing to use `three-session-tdd-lite`:
357
+
358
+ ```typescript
359
+ case "escalate": {
360
+ // ... existing escalation logic ...
361
+
362
+ // NEW: If retryAsLite flag set, downgrade TDD strategy
363
+ if (pipelineResult.context?.retryAsLite && story.routing) {
364
+ story.routing.testStrategy = "three-session-tdd-lite";
365
+ }
366
+
367
+ // ... rest of escalation ...
368
+ }
369
+ ```
370
+
371
+ ### Changes to `tdd/orchestrator.ts`
372
+
373
+ Set `failureCategory` based on what went wrong:
374
+
375
+ ```typescript
376
+ // After session 1 (test-writer) isolation failure:
377
+ return {
378
+ success: false,
379
+ // ...remaining fields unchanged
380
+ failureCategory: "isolation-violation",
381
+ };
382
+
383
+ // After session crash/timeout:
384
+ return {
385
+ success: false,
386
+ // ...remaining fields unchanged
387
+ failureCategory: "session-failure",
388
+ };
389
+
390
+ // After post-TDD verification fails:
391
+ return {
392
+ success: false,
393
+ // ...remaining fields unchanged
394
+ failureCategory: "tests-failing",
395
+ };
396
+ ```
397
+
398
+ ### Files to Change
399
+
400
+ | File | Change |
401
+ |:-----|:-------|
402
+ | `src/tdd/types.ts` | Add `failureCategory` to `ThreeSessionTddResult` |
403
+ | `src/tdd/orchestrator.ts` | Set `failureCategory` at each failure point |
404
+ | `src/pipeline/stages/execution.ts` | Route by `failureCategory` instead of blanket `pause` |
405
+ | `src/pipeline/types.ts` | Add `retryAsLite?: boolean` to `PipelineContext` |
406
+ | `src/execution/runner.ts` | Handle `retryAsLite` flag in escalation case |
407
+
408
+ ### Testing
409
+
410
+ | Test | Description |
411
+ |:-----|:------------|
412
+ | `tdd/orchestrator.test.ts` | Unit: each failure path sets correct `failureCategory` |
413
+ | `pipeline/execution.test.ts` | Unit: isolation violation returns `escalate` (not `pause`) |
414
+ | `pipeline/execution.test.ts` | Unit: lite isolation violation returns `escalate` |
415
+ | `pipeline/execution.test.ts` | Unit: session failure returns `escalate` |
416
+ | `execution/runner.test.ts` | Integration: TDD story escalates through tiers before failing |
417
+ | `execution/runner.test.ts` | Integration: `retryAsLite` downgrades strategy on next attempt |
418
+ | Manual | Run with intentionally strict project, verify lite downgrade + tier escalation |
419
+
420
+ ## Retry Budget
421
+
422
+ Uses the existing escalation config (`autoMode.escalation.tierOrder`). Example:
423
+
424
+ ```json
425
+ {
426
+ "autoMode": {
427
+ "escalation": {
428
+ "enabled": true,
429
+ "tierOrder": [
430
+ { "tier": "fast", "attempts": 2 },
431
+ { "tier": "balanced", "attempts": 2 },
432
+ { "tier": "powerful", "attempts": 1 }
433
+ ]
434
+ }
435
+ }
436
+ }
437
+ ```
438
+
439
+ For a strict TDD story with isolation violation:
440
+ ```
441
+ Attempt 1: three-session-tdd @ fast → isolation violation
442
+ Attempt 2: three-session-tdd-lite @ fast → tests fail
443
+ Attempt 3: tdd-lite @ balanced → tests fail
444
+ Attempt 4: tdd-lite @ balanced → tests fail
445
+ Attempt 5: tdd-lite @ powerful → success ✅ (or fail → pause)
446
+ ```
447
+
448
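+ Max cost is bounded by the existing tier budget: with the example config above, a story gets at most 2 + 2 + 1 = 5 attempts, and the lite downgrade consumes one of them rather than adding a new one. No new config needed.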
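
The worst-case trace above can be derived mechanically from `tierOrder`. The sketch below is illustrative only: `planAttempts` and `liteFromAttempt` are hypothetical names, and it assumes every attempt fails and that the strategy stays lite once downgraded.

```typescript
// Hypothetical planner: expands a tier budget into a flat worst-case schedule.
interface TierBudget {
  tier: string;
  attempts: number;
}

type Strategy = "three-session-tdd" | "three-session-tdd-lite";

interface PlannedAttempt {
  attempt: number; // 1-based attempt counter across all tiers
  tier: string;
  strategy: Strategy;
}

/**
 * liteFromAttempt is the first attempt that runs as lite, e.g. 2 when the
 * isolation violation happens on attempt 1.
 */
function planAttempts(tierOrder: TierBudget[], liteFromAttempt: number): PlannedAttempt[] {
  const plan: PlannedAttempt[] = [];
  for (const { tier, attempts } of tierOrder) {
    for (let i = 0; i < attempts; i++) {
      const attempt = plan.length + 1;
      const strategy: Strategy =
        attempt >= liteFromAttempt ? "three-session-tdd-lite" : "three-session-tdd";
      plan.push({ attempt, tier, strategy });
    }
  }
  return plan;
}

const tierOrder: TierBudget[] = [
  { tier: "fast", attempts: 2 },
  { tier: "balanced", attempts: 2 },
  { tier: "powerful", attempts: 1 },
];

// Isolation violation on attempt 1, so attempts 2 through 5 run as lite.
const plan = planAttempts(tierOrder, 2);
```

The plan length equals the sum of the `attempts` values, which is why the downgrade needs no budget of its own.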
449
+
450
+ ---
451
+
452
+ # Feature 3: Structured Verifier Verdicts
453
+
454
+ ## Summary
455
+
456
+ The verifier (session 3) is designed to judge whether the implementer's changes are legitimate — especially when the implementer modified test files. Currently, this judgment is implicit: the verifier runs as a regular agent, and the only signal is "did tests pass after verifier ran?" There's no structured verdict flowing back to the pipeline.
457
+
458
+ Add structured output parsing to the verifier session so its judgment feeds into `failureCategory` and the escalation system.
459
+
460
+ ## Problem
461
+
462
+ Current verifier prompt asks it to:
463
+ 1. Run tests and verify they pass
464
+ 2. Review implementation quality
465
+ 3. Check acceptance criteria
466
+ 4. **Check if implementer modified test files and judge legitimacy**
467
+ 5. Fix issues minimally
468
+
469
+ But the result is just `{ success: boolean, estimatedCost: number }` — same as any agent session. The verifier's judgment about test modifications, code quality, and acceptance criteria is lost.
470
+
471
+ **Consequences:**
472
+ - If verifier finds illegitimate test modifications, it tries to fix them but we don't know *what* it found
473
+ - If verifier can't fix the issue, it exits non-zero → treated same as a crash
474
+ - No signal to differentiate "tests pass but code is bad" from "tests fail"
475
+ - The `VerifierDecision` type exists in `types.ts` but is **never populated**
476
+
477
+ ## Proposed Solution
478
+
479
+ ### Structured Verdict File
480
+
481
+ Instead of parsing agent stdout (fragile), the verifier writes a structured verdict file that the orchestrator reads after the session:
482
+
483
+ ```
484
+ <workdir>/.nax-verifier-verdict.json
485
+ ```
486
+
487
+ **Why a file?** Claude Code (the agent) can easily write files. Parsing structured output from stdout is unreliable with Claude Code since it mixes tool calls, thinking, and output.
488
+
489
+ ### Verdict Schema
490
+
491
+ ```typescript
492
+ interface VerifierVerdict {
493
+ /** Schema version */
494
+ version: 1;
495
+
496
+ /** Overall approval */
497
+ approved: boolean;
498
+
499
+ /** Test results */
500
+ tests: {
501
+ /** Did all tests pass? */
502
+ allPassing: boolean;
503
+ /** Number of tests passing */
504
+ passCount: number;
505
+ /** Number of tests failing */
506
+ failCount: number;
507
+ };
508
+
509
+ /** Implementer test modification review */
510
+ testModifications: {
511
+ /** Were test files modified by implementer? */
512
+ detected: boolean;
513
+ /** List of modified test files */
514
+ files: string[];
515
+ /** Are the modifications legitimate? */
516
+ legitimate: boolean;
517
+ /** Reasoning for legitimacy judgment */
518
+ reasoning: string;
519
+ };
520
+
521
+ /** Acceptance criteria check */
522
+ acceptanceCriteria: {
523
+ /** All criteria met? */
524
+ allMet: boolean;
525
+ /** Per-criterion status */
526
+ criteria: Array<{
527
+ criterion: string;
528
+ met: boolean;
529
+ note?: string;
530
+ }>;
531
+ };
532
+
533
+ /** Code quality assessment */
534
+ quality: {
535
+ /** Overall quality: good | acceptable | poor */
536
+ rating: "good" | "acceptable" | "poor";
537
+ /** Issues found */
538
+ issues: string[];
539
+ };
540
+
541
+ /** Fixes applied by verifier */
542
+ fixes: string[];
543
+
544
+ /** Overall reasoning */
545
+ reasoning: string;
546
+ }
547
+ ```
548
+
549
+ ### Updated Verifier Prompt
550
+
551
+ ```typescript
552
+ export function buildVerifierPrompt(story: UserStory): string {
553
+ return `# Test-Driven Development — Session 3: Verify
554
+
555
+ You are in the third session of a three-session TDD workflow. Tests and implementation are complete.
556
+
557
+ **Story:** ${story.title}
558
+
559
+ **Your tasks:**
560
+ 1. Run all tests and verify they pass
561
+ 2. Review the implementation for quality and correctness
562
+ 3. Check that the implementation meets all acceptance criteria
563
+ 4. Check if test files were modified by the implementer. If yes, verify the changes are legitimate fixes (e.g. fixing incorrect expectations) and NOT just loosening assertions to mask bugs.
564
+ 5. If any issues exist, fix them minimally
565
+
566
+ **Acceptance Criteria:**
567
+ ${story.acceptanceCriteria.map((ac, i) => `${i + 1}. ${ac}`).join("\n")}
568
+
569
+ **IMPORTANT — Write Verdict File:**
570
+ After completing your review, write a JSON verdict file to \`.nax-verifier-verdict.json\` in the project root.
571
+
572
+ \`\`\`json
573
+ {
574
+ "version": 1,
575
+ "approved": true,
576
+ "tests": {
577
+ "allPassing": true,
578
+ "passCount": 15,
579
+ "failCount": 0
580
+ },
581
+ "testModifications": {
582
+ "detected": false,
583
+ "files": [],
584
+ "legitimate": true,
585
+ "reasoning": "No test files were modified by implementer"
586
+ },
587
+ "acceptanceCriteria": {
588
+ "allMet": true,
589
+ "criteria": [
590
+ { "criterion": "Criterion text", "met": true }
591
+ ]
592
+ },
593
+ "quality": {
594
+ "rating": "good",
595
+ "issues": []
596
+ },
597
+ "fixes": [],
598
+ "reasoning": "All tests pass, implementation is clean, all criteria met."
599
+ }
600
+ \`\`\`
601
+
602
+ Set \`approved: false\` if:
603
+ - Tests are failing and you cannot fix them
604
+ - Implementer loosened test assertions to mask bugs (testModifications.legitimate = false)
605
+ - Critical acceptance criteria are not met
606
+ - Code quality is poor with security or correctness issues
607
+
608
+ Set \`approved: true\` if:
609
+ - All tests pass (or pass after your minimal fixes)
610
+ - Implementation is clean and follows conventions
611
+ - All acceptance criteria met
612
+ - Any test modifications by implementer are legitimate fixes
613
+
614
+ When done, commit any fixes with message: "fix: verify and adjust ${story.title}"`;
615
+ }
616
+ ```
617
+
618
+ ### Orchestrator Changes
619
+
620
+ After verifier session completes, read and parse the verdict file:
621
+
622
+ ```typescript
623
+ // In tdd/orchestrator.ts, after session 3 completes (requires `path` from "node:path" and `unlink` from "node:fs/promises"):
624
+
625
+ // Read verdict file
626
+ const verdictPath = path.join(workdir, ".nax-verifier-verdict.json");
627
+ let verdict: VerifierVerdict | null = null;
628
+
629
+ try {
630
+ const file = Bun.file(verdictPath);
631
+ if (await file.exists()) {
632
+ verdict = await file.json() as VerifierVerdict;
633
+ logger.info("tdd", "Verifier verdict loaded", {
634
+ storyId: story.id,
635
+ approved: verdict.approved,
636
+ testsAllPassing: verdict.tests.allPassing,
637
+ testModsDetected: verdict.testModifications.detected,
638
+ testModsLegitimate: verdict.testModifications.legitimate,
639
+ qualityRating: verdict.quality.rating,
640
+ allCriteriaMet: verdict.acceptanceCriteria.allMet,
641
+ });
642
+ } else {
643
+ logger.warn("tdd", "No verifier verdict file found — falling back to test-only check", {
644
+ storyId: story.id,
645
+ });
646
+ }
647
+ } catch (err) {
+ verdict = null; // discard a partially populated verdict so later code takes the fallback path
648
+ logger.warn("tdd", "Failed to parse verifier verdict", {
649
+ storyId: story.id,
650
+ error: String(err),
651
+ });
652
+ }
653
+
654
+ // Clean up verdict file (don't leave it in the repo)
655
+ try {
656
+ await unlink(verdictPath);
657
+ } catch { /* ignore */ }
658
+ ```
659
+
660
+ ### Verdict → failureCategory Mapping
661
+
662
+ ```typescript
663
+ function categorizeVerdict(
664
+ verdict: VerifierVerdict | null,
665
+ session3Success: boolean,
666
+ testsPass: boolean,
667
+ ): { success: boolean; failureCategory?: FailureCategory; reviewReason?: string } {
668
+
669
+ // No verdict file → fall back to existing behavior (test-only check)
670
+ if (!verdict) {
671
+ if (testsPass) return { success: true };
672
+ return {
673
+ success: false,
674
+ failureCategory: "tests-failing",
675
+ reviewReason: "Tests failing after all sessions (no verdict file)",
676
+ };
677
+ }
678
+
679
+ // Verdict: approved
680
+ if (verdict.approved) {
681
+ return { success: true };
682
+ }
683
+
684
+ // Verdict: not approved — classify why
685
+
686
+ // Illegitimate test modifications (implementer cheated)
687
+ if (verdict.testModifications.detected && !verdict.testModifications.legitimate) {
688
+ return {
689
+ success: false,
690
+ failureCategory: "verifier-rejected",
691
+ reviewReason: `Verifier rejected: illegitimate test modifications in ${verdict.testModifications.files.join(", ")}. ${verdict.testModifications.reasoning}`,
692
+ };
693
+ }
694
+
695
+ // Tests failing
696
+ if (!verdict.tests.allPassing) {
697
+ return {
698
+ success: false,
699
+ failureCategory: "tests-failing",
700
+ reviewReason: `Tests failing: ${verdict.tests.failCount} failures. ${verdict.reasoning}`,
701
+ };
702
+ }
703
+
704
+ // Acceptance criteria not met
705
+ if (!verdict.acceptanceCriteria.allMet) {
706
+ const unmet = verdict.acceptanceCriteria.criteria
707
+ .filter(c => !c.met)
708
+ .map(c => c.criterion);
709
+ return {
710
+ success: false,
711
+ failureCategory: "verifier-rejected",
712
+ reviewReason: `Acceptance criteria not met: ${unmet.join("; ")}`,
713
+ };
714
+ }
715
+
716
+ // Poor quality
717
+ if (verdict.quality.rating === "poor") {
718
+ return {
719
+ success: false,
720
+ failureCategory: "verifier-rejected",
721
+ reviewReason: `Poor code quality: ${verdict.quality.issues.join("; ")}`,
722
+ };
723
+ }
724
+
725
+ // Catch-all: verdict says not approved but no clear reason
726
+ return {
727
+ success: false,
728
+ failureCategory: "verifier-rejected",
729
+ reviewReason: verdict.reasoning || "Verifier rejected without specific reason",
730
+ };
731
+ }
732
+ ```
733
+
734
+ ### Escalation Behavior per Verdict
735
+
736
+ | Verdict Reason | failureCategory | Escalation Path |
737
+ |:---------------|:---------------|:---------------|
738
+ | Illegitimate test mods | `verifier-rejected` | Escalate tier → pause after all tiers |
739
+ | Tests failing | `tests-failing` | Escalate tier → fail after all tiers |
740
+ | Criteria not met | `verifier-rejected` | Escalate tier → pause after all tiers |
741
+ | Poor quality | `verifier-rejected` | Escalate tier → pause after all tiers |
742
+ | Approved | — | Success ✅ |
743
+ | No verdict file | `tests-failing` if the independent test run fails | Success if tests pass; otherwise escalate tier → fail after all tiers |
744
+
745
+ ### Verdict File Lifecycle
746
+
747
+ 1. **Created by:** Verifier agent (session 3) writes `.nax-verifier-verdict.json`
748
+ 2. **Read by:** TDD orchestrator after session 3 completes
749
+ 3. **Deleted by:** TDD orchestrator after reading (not committed to git)
750
+ 4. **Fallback:** If file missing or unparseable, fall back to existing behavior (post-TDD test verification)
751
+
752
+ ### `.gitignore`
753
+
754
+ Add to project `.gitignore` (or nax init template):
755
+ ```
756
+ .nax-verifier-verdict.json
757
+ ```
758
+
759
+ ### Files to Change
760
+
761
+ | File | Change |
762
+ |:-----|:-------|
763
+ | `src/tdd/types.ts` | Add `VerifierVerdict` interface |
764
+ | `src/tdd/prompts.ts` | Update `buildVerifierPrompt()` with verdict file instructions |
765
+ | `src/tdd/orchestrator.ts` | Read verdict file after session 3, map to `failureCategory` |
766
+ | `src/tdd/verdict.ts` | **New file.** `readVerdict()`, `categorizeVerdict()`, `cleanupVerdict()` |
767
+
768
+ ### Testing
769
+
770
+ | Test | Description |
771
+ |:-----|:------------|
772
+ | `tdd/verdict.test.ts` | Unit: `categorizeVerdict()` for all verdict combinations |
773
+ | `tdd/verdict.test.ts` | Unit: missing verdict file falls back gracefully |
774
+ | `tdd/verdict.test.ts` | Unit: malformed JSON falls back gracefully |
775
+ | `tdd/orchestrator.test.ts` | Integration: verdict file read + cleanup after session 3 |
776
+ | `tdd/orchestrator.test.ts` | Integration: illegitimate test mods → `verifier-rejected` |
777
+ | Manual | Run TDD on a story, verify verdict file is written and consumed |
778
+
779
+ ### Robustness
780
+
781
+ **What if the agent doesn't write the verdict file?**
782
+ Fall back to existing behavior: run tests independently, check pass/fail. This is the same as v0.10.0. The verdict file is an enhancement, not a requirement.
783
+
784
+ **What if the JSON is malformed?**
785
+ Log warning, fall back to test-only check. Never crash.
786
+
787
+ **What if the agent writes wrong data?**
788
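+ Validate required fields (`version`, `approved`, `tests`), plus any nested section that `categorizeVerdict()` dereferences (`testModifications`, `acceptanceCriteria`, `quality`), before use. Missing or mistyped fields → fall back to the test-only check. The verdict is advisory: the independent test run is the ground truth for "tests pass."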
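
The validation described above might look like the following. This is a sketch, not the planned `readVerdict()`: `parseVerdict`, `MinimalVerdict`, and `isApproved` are hypothetical names, and only the three fields named in the text (`version`, `approved`, `tests`) are checked; the other sections could be checked the same way. `isApproved` encodes one reading of "the independent test run is the ground truth".

```typescript
// Illustrative validator for the raw contents of the verdict file.
interface MinimalVerdict {
  version: 1;
  approved: boolean;
  tests: { allPassing: boolean; passCount: number; failCount: number };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Returns null for malformed JSON or missing/mistyped required fields; never throws. */
function parseVerdict(raw: string): MinimalVerdict | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // malformed JSON: caller falls back to the test-only check
  }
  if (!isRecord(data)) return null;

  const approved = data["approved"];
  const tests = data["tests"];
  if (data["version"] !== 1 || typeof approved !== "boolean" || !isRecord(tests)) return null;

  const allPassing = tests["allPassing"];
  const passCount = tests["passCount"];
  const failCount = tests["failCount"];
  if (
    typeof allPassing !== "boolean" ||
    typeof passCount !== "number" ||
    typeof failCount !== "number"
  ) {
    return null;
  }
  return { version: 1, approved, tests: { allPassing, passCount, failCount } };
}

/** Assumption: a failing independent test run overrides an approving verdict. */
function isApproved(verdict: MinimalVerdict | null, testsPass: boolean): boolean {
  if (!testsPass) return false;
  return verdict === null ? true : verdict.approved;
}
```

Returning `null` for every invalid input lets the orchestrator treat "missing", "malformed", and "wrong data" identically.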
789
+
790
+ ---
791
+
792
+ # v0.10.1 Summary
793
+
794
+ Three features, cohesive release:
795
+
796
+ | Feature | Files Changed | Effort | Dependency |
797
+ |:--------|:-------------|:-------|:-----------|
798
+ | 1. `--status-file` | 3 (new `status-file.ts`, modify `runner.ts`, CLI) | Medium | None |
799
+ | 2. TDD Escalation Retry | 5 (types, orchestrator, execution stage, pipeline types, runner) | Medium | None |
800
+ | 3. Structured Verifier Verdicts | 4 (types, prompts, orchestrator, new `verdict.ts`) | Medium | Feature 2 (feeds `failureCategory`) |
801
+
802
+ **Total files:** 9 changed/new (12 touches across the three features; `src/tdd/types.ts` and `src/tdd/orchestrator.ts` are shared by features 2+3, and `runner.ts` by features 1+2).
803
+
804
+ **Breaking changes:** None. All features are additive/optional.
805
+
806
+ **Config changes:** None. Uses existing escalation config.
807
+
808
+ ### Implementation Order
809
+
810
+ 1. Feature 1 (`--status-file`) — independent, can ship alone
811
+ 2. Feature 2 (TDD escalation) — core retry logic
812
+ 3. Feature 3 (verifier verdicts) — builds on feature 2's `failureCategory`