@nathapp/nax 0.28.0 → 0.29.0

Files changed (376)
  1. package/CHANGELOG.md +13 -2
  2. package/dist/nax.js +72691 -0
  3. package/package.json +12 -4
  4. package/src/cli/config.ts +3 -1
  5. package/src/config/defaults.ts +1 -0
  6. package/src/config/schemas.ts +1 -0
  7. package/src/config/types.ts +1 -0
  8. package/src/context/builder.ts +10 -1
  9. package/src/prompts/sections/role-task.ts +4 -2
  10. package/src/review/runner.ts +6 -1
  11. package/src/version.ts +2 -1
  12. package/.claude/rules/01-project-conventions.md +0 -34
  13. package/.claude/rules/02-test-architecture.md +0 -39
  14. package/.claude/rules/03-test-writing.md +0 -58
  15. package/.claude/rules/04-forbidden-patterns.md +0 -29
  16. package/.claude/settings.json +0 -15
  17. package/.githooks/pre-commit +0 -16
  18. package/.gitlab-ci.yml +0 -103
  19. package/.mcp.json +0 -8
  20. package/BRIEF.md +0 -140
  21. package/CLAUDE.md +0 -143
  22. package/US-007-IMPLEMENTATION.md +0 -139
  23. package/biome.json +0 -14
  24. package/bun.lock +0 -163
  25. package/bunfig.toml +0 -12
  26. package/docker-compose.test.yml +0 -15
  27. package/docs/20260216-fix-plan-context-review.md +0 -56
  28. package/docs/20260216-relentless-vs-ngent-comparison.md +0 -208
  29. package/docs/20260216-v02-plan.md +0 -136
  30. package/docs/20260216-v02-review.md +0 -685
  31. package/docs/20260217-dogfood-findings.md +0 -56
  32. package/docs/20260217-p2-plus-plan.md +0 -117
  33. package/docs/20260217-partial-fixes-plan.md +0 -62
  34. package/docs/20260217-plan-analyze-spec.md +0 -117
  35. package/docs/20260217-post-impl-review.md +0 -1137
  36. package/docs/20260217-quick-wins-plan.md +0 -66
  37. package/docs/20260217-split-runner-plan.md +0 -75
  38. package/docs/20260217-v03-impl-plan.md +0 -80
  39. package/docs/20260217-v03-post-impl-review.md +0 -589
  40. package/docs/20260217-v04-impl-plan.md +0 -86
  41. package/docs/20260217-v05-post-impl-review.md +0 -850
  42. package/docs/20260217-v06-post-impl-review.md +0 -817
  43. package/docs/20260218-adr003-port-plan.md +0 -151
  44. package/docs/20260218-review-adr003-verification.md +0 -175
  45. package/docs/20260219-fix-plan-bug16-19.md +0 -79
  46. package/docs/20260219-fix-plan-bug20-22.md +0 -114
  47. package/docs/20260219-plan-llm-routing.md +0 -116
  48. package/docs/20260219-review-bug20-22-fixes.md +0 -135
  49. package/docs/20260219-routing-baseline-keyword.md +0 -63
  50. package/docs/20260220-plan-structured-logging-p1.md +0 -80
  51. package/docs/20260220-plan-structured-logging-p2.md +0 -37
  52. package/docs/20260220-review-llm-routing.md +0 -180
  53. package/docs/20260220-review-post-fix-llm-routing.md +0 -70
  54. package/docs/20260221-fix-plan-relevantfiles-split.md +0 -101
  55. package/docs/20260221-fix-plan-routing-mode.md +0 -125
  56. package/docs/20260221-review-v0.9-implementation.md +0 -379
  57. package/docs/20260222-fix-plan-v091-routing-isolation.md +0 -197
  58. package/docs/20260223-fix-plan-prompt-audit.md +0 -62
  59. package/docs/20260224-nax-roadmap-phases.md +0 -189
  60. package/docs/20260225-phase2-llm-service-layer.md +0 -401
  61. package/docs/20260225-review-v0.10.1.md +0 -187
  62. package/docs/20260303-v010-implementation-plan.md +0 -165
  63. package/docs/20260304-review-nax.md +0 -492
  64. package/docs/CLAUDE.md.bak +0 -191
  65. package/docs/ROADMAP.md +0 -390
  66. package/docs/SPEC-rectification.md +0 -0
  67. package/docs/SPEC.md +0 -324
  68. package/docs/US-001-plugin-loading-verification.md +0 -152
  69. package/docs/adr/ADR-005-implementation-plan.md +0 -655
  70. package/docs/adr/ADR-005-pipeline-re-architecture.md +0 -464
  71. package/docs/architecture-analysis.md +0 -1076
  72. package/docs/bugs/BUG-21-escalation-null-attempts.md +0 -48
  73. package/docs/bugs-from-dogfood-run-c.md +0 -243
  74. package/docs/code-review-20260228.md +0 -612
  75. package/docs/code-review-v0.15.0.md +0 -629
  76. package/docs/hook-lifecycle-test-plan.md +0 -149
  77. package/docs/releases/v0.11.0-and-earlier.md +0 -20
  78. package/docs/releases/v0.12.0.md +0 -15
  79. package/docs/releases/v0.13.0.md +0 -14
  80. package/docs/releases/v0.14.0.md +0 -20
  81. package/docs/releases/v0.14.1.md +0 -36
  82. package/docs/releases/v0.14.2.md +0 -51
  83. package/docs/releases/v0.14.3.md +0 -174
  84. package/docs/releases/v0.14.4.md +0 -94
  85. package/docs/releases/v0.15.0.md +0 -502
  86. package/docs/releases/v0.15.1.md +0 -170
  87. package/docs/releases/v0.15.3.md +0 -193
  88. package/docs/specs/bug-039-orphan-processes.md +0 -131
  89. package/docs/specs/bug-040-review-rectification.md +0 -82
  90. package/docs/specs/bug-041-cross-story-test-isolation.md +0 -88
  91. package/docs/specs/bug-042-verifier-failure-capture.md +0 -117
  92. package/docs/specs/bun-pty-migration.md +0 -171
  93. package/docs/specs/central-run-registry.md +0 -116
  94. package/docs/specs/feat-010-smart-runner-git-history.md +0 -96
  95. package/docs/specs/feat-011-file-context-strategy.md +0 -73
  96. package/docs/specs/feat-012-tdd-writer-tier.md +0 -79
  97. package/docs/specs/feat-013-test-after-review.md +0 -89
  98. package/docs/specs/feat-014-heartbeat-observability.md +0 -127
  99. package/docs/specs/status-file-consolidation.md +0 -93
  100. package/docs/specs/status-file-v0.10.1.md +0 -812
  101. package/docs/specs/trigger-completion.md +0 -145
  102. package/docs/specs/verification-architecture-v2.md +0 -343
  103. package/docs/tdd/strategies.md +0 -97
  104. package/docs/v0.10-global-config.md +0 -206
  105. package/docs/v0.10-plugin-system.md +0 -415
  106. package/docs/v0.10-prompt-optimizer.md +0 -234
  107. package/docs/v0.3-spec.md +0 -244
  108. package/docs/v0.4-spec.md +0 -140
  109. package/docs/v0.5-spec.md +0 -237
  110. package/docs/v0.6-spec.md +0 -371
  111. package/docs/v0.7-spec.md +0 -177
  112. package/docs/v0.8-llm-routing.md +0 -206
  113. package/docs/v0.8-structured-logging.md +0 -132
  114. package/docs/v0.9.3-prompt-audit.md +0 -112
  115. package/examples/plugins/console-reporter/index.test.ts +0 -207
  116. package/examples/plugins/console-reporter/index.ts +0 -110
  117. package/memory/topic/feat-010-baseref.md +0 -28
  118. package/memory/topic/feat-013-test-after-deprecation.md +0 -22
  119. package/nax/config.json +0 -154
  120. package/nax/features/bug-039-medium/prd.json +0 -45
  121. package/nax/features/bugfix-v0171/prd.json +0 -52
  122. package/nax/features/central-run-registry/prd.json +0 -105
  123. package/nax/features/config-management/prd.json +0 -108
  124. package/nax/features/config-management/progress.txt +0 -5
  125. package/nax/features/diagnose/acceptance.test.ts +0 -414
  126. package/nax/features/diagnose/prd.json +0 -41
  127. package/nax/features/nax-compliance/prd.json +0 -52
  128. package/nax/features/nax-compliance/progress.txt +0 -1
  129. package/nax/features/orchestration-fixes/prd.json +0 -89
  130. package/nax/features/orchestration-fixes/progress.txt +0 -1
  131. package/nax/features/plugin-integration/US-007-VERIFICATION.md +0 -259
  132. package/nax/features/plugin-integration/prd.json +0 -208
  133. package/nax/features/plugin-integration/progress.txt +0 -5
  134. package/nax/features/post-rearch-bugfix/prd.json +0 -137
  135. package/nax/features/precheck/prd.json +0 -205
  136. package/nax/features/precheck/progress.txt +0 -15
  137. package/nax/features/prompt-builder/prd.json +0 -152
  138. package/nax/features/prompt-builder/progress.txt +0 -3
  139. package/nax/features/review-quality/prd.json +0 -55
  140. package/nax/features/routing-persistence/prd.json +0 -104
  141. package/nax/features/routing-persistence/progress.txt +0 -1
  142. package/nax/features/smart-test-runner/plan.md +0 -7
  143. package/nax/features/smart-test-runner/prd.json +0 -203
  144. package/nax/features/smart-test-runner/progress.txt +0 -13
  145. package/nax/features/smart-test-runner/spec.md +0 -7
  146. package/nax/features/smart-test-runner/tasks.md +0 -8
  147. package/nax/features/status-file-consolidation/prd.json +0 -106
  148. package/nax/features/structured-logging/prd.json +0 -199
  149. package/nax/features/trigger-completion/prd.json +0 -150
  150. package/nax/features/trigger-completion/progress.txt +0 -7
  151. package/nax/features/unlock/prd.json +0 -36
  152. package/nax/features/v0.18.3-execution-reliability/prd.json +0 -80
  153. package/nax/features/v0.18.3-execution-reliability/progress.txt +0 -3
  154. package/nax/features/v0.19.0-hardening/plan.md +0 -7
  155. package/nax/features/v0.19.0-hardening/prd.json +0 -84
  156. package/nax/features/v0.19.0-hardening/progress.txt +0 -7
  157. package/nax/features/v0.19.0-hardening/spec.md +0 -18
  158. package/nax/features/v0.19.0-hardening/tasks.md +0 -8
  159. package/nax/features/verify-v2/prd.json +0 -79
  160. package/nax/features/verify-v2/progress.txt +0 -3
  161. package/nax/status.json +0 -36
  162. package/test/COVERAGE-GAPS.md +0 -333
  163. package/test/e2e/cm-003-default-view.test.ts +0 -195
  164. package/test/e2e/plan-analyze-run.test.ts +0 -902
  165. package/test/helpers/helpers.test.ts +0 -295
  166. package/test/helpers/timeout.ts +0 -42
  167. package/test/integration/US-002-TEST-SUMMARY.md +0 -107
  168. package/test/integration/US-003-TEST-SUMMARY.md +0 -149
  169. package/test/integration/US-004-TEST-SUMMARY.md +0 -106
  170. package/test/integration/US-005-TEST-SUMMARY.md +0 -138
  171. package/test/integration/US-007-TEST-SUMMARY.md +0 -100
  172. package/test/integration/cli/agent-validation.test.ts +0 -439
  173. package/test/integration/cli/cli-config-default-edge-cases.test.ts +0 -223
  174. package/test/integration/cli/cli-config-default-view.test.ts +0 -230
  175. package/test/integration/cli/cli-config-diff.test.ts +0 -461
  176. package/test/integration/cli/cli-config-prompts-explain.test.ts +0 -74
  177. package/test/integration/cli/cli-config.test.ts +0 -737
  178. package/test/integration/cli/cli-diagnose.test.ts +0 -595
  179. package/test/integration/cli/cli-logs.test.ts +0 -346
  180. package/test/integration/cli/cli-plugins.test.ts +0 -679
  181. package/test/integration/cli/cli-precheck.test.ts +0 -372
  182. package/test/integration/cli/cli-run-headless.test.ts +0 -174
  183. package/test/integration/cli/cli.test.ts +0 -76
  184. package/test/integration/cli/precheck-integration.test.ts +0 -476
  185. package/test/integration/cli/precheck-orchestrator.test.ts +0 -247
  186. package/test/integration/cli/precheck.test.ts +0 -806
  187. package/test/integration/config/config-loader.test.ts +0 -266
  188. package/test/integration/config/config.test.ts +0 -444
  189. package/test/integration/config/merger.test.ts +0 -466
  190. package/test/integration/config/paths.test.ts +0 -52
  191. package/test/integration/config/security-loader.test.ts +0 -83
  192. package/test/integration/context/context-integration.test.ts +0 -703
  193. package/test/integration/context/context-path-security.test.ts +0 -173
  194. package/test/integration/context/context-provider-injection.test.ts +0 -507
  195. package/test/integration/context/context-verification-integration.test.ts +0 -296
  196. package/test/integration/context/s5-greenfield-fallback.test.ts +0 -298
  197. package/test/integration/execution/execution-isolation.test.ts +0 -143
  198. package/test/integration/execution/execution.test.ts +0 -634
  199. package/test/integration/execution/feature-status-write.test.ts +0 -302
  200. package/test/integration/execution/parallel.test.ts +0 -251
  201. package/test/integration/execution/prd-pause.test.ts +0 -205
  202. package/test/integration/execution/prd-resolvers.test.ts +0 -186
  203. package/test/integration/execution/progress.test.ts +0 -34
  204. package/test/integration/execution/runner-batching.test.ts +0 -682
  205. package/test/integration/execution/runner-config-plugins.test.ts +0 -462
  206. package/test/integration/execution/runner-escalation.test.ts +0 -561
  207. package/test/integration/execution/runner-fixes.test.ts +0 -400
  208. package/test/integration/execution/runner-plugin-integration.test.ts +0 -544
  209. package/test/integration/execution/runner-queue-and-attempts.test.ts +0 -476
  210. package/test/integration/execution/status-file-integration.test.ts +0 -289
  211. package/test/integration/execution/status-file.test.ts +0 -380
  212. package/test/integration/execution/status-writer.test.ts +0 -447
  213. package/test/integration/execution/story-id-in-events.test.ts +0 -274
  214. package/test/integration/interaction/interaction-chain-pipeline.test.ts +0 -476
  215. package/test/integration/pipeline/hooks.test.ts +0 -363
  216. package/test/integration/pipeline/pipeline-acceptance.test.ts +0 -303
  217. package/test/integration/pipeline/pipeline-events.test.ts +0 -476
  218. package/test/integration/pipeline/pipeline.test.ts +0 -660
  219. package/test/integration/pipeline/reporter-lifecycle.test.ts +0 -862
  220. package/test/integration/pipeline/verify-stage.test.ts +0 -286
  221. package/test/integration/plan/analyze-integration.test.ts +0 -262
  222. package/test/integration/plan/analyze-scanner.test.ts +0 -132
  223. package/test/integration/plan/logger.test.ts +0 -461
  224. package/test/integration/plan/plan.test.ts +0 -157
  225. package/test/integration/plugins/config-integration.test.ts +0 -173
  226. package/test/integration/plugins/config-resolution.test.ts +0 -523
  227. package/test/integration/plugins/loader.test.ts +0 -644
  228. package/test/integration/plugins/plugins-registry.test.ts +0 -747
  229. package/test/integration/plugins/validator.test.ts +0 -564
  230. package/test/integration/prompts/pb-004-migration.test.ts +0 -523
  231. package/test/integration/review/review-config-commands.test.ts +0 -320
  232. package/test/integration/review/review-config-schema.test.ts +0 -117
  233. package/test/integration/review/review-plugin-integration.test.ts +0 -729
  234. package/test/integration/review/review.test.ts +0 -150
  235. package/test/integration/routing/plugin-routing-advanced.test.ts +0 -461
  236. package/test/integration/routing/plugin-routing-core.test.ts +0 -527
  237. package/test/integration/routing/routing-stage-bug-021.test.ts +0 -275
  238. package/test/integration/routing/routing-stage-greenfield.test.ts +0 -287
  239. package/test/integration/tdd/tdd-cleanup.test.ts +0 -246
  240. package/test/integration/tdd/tdd-orchestrator-core.test.ts +0 -565
  241. package/test/integration/tdd/tdd-orchestrator-failureCategory.test.ts +0 -355
  242. package/test/integration/tdd/tdd-orchestrator-fallback.test.ts +0 -311
  243. package/test/integration/tdd/tdd-orchestrator-lite.test.ts +0 -289
  244. package/test/integration/tdd/tdd-orchestrator-prompts.test.ts +0 -260
  245. package/test/integration/tdd/tdd-orchestrator-verdict.test.ts +0 -536
  246. package/test/integration/tmp/headless-test/test.jsonl +0 -30
  247. package/test/integration/verification/test-scanner.test.ts +0 -403
  248. package/test/integration/verification/verification-asset-check.test.ts +0 -143
  249. package/test/integration/worktree/manager.test.ts +0 -218
  250. package/test/integration/worktree/worktree-merge.test.ts +0 -341
  251. package/test/manual/logging-formatter-demo.ts +0 -158
  252. package/test/ui/tui-agent-panel.test.tsx +0 -99
  253. package/test/ui/tui-pty-integration.test.tsx +0 -146
  254. package/test/unit/acceptance.test.ts +0 -187
  255. package/test/unit/agent-stderr-capture.test.ts +0 -147
  256. package/test/unit/agents/claude.test.ts +0 -107
  257. package/test/unit/analyze-classifier.test.ts +0 -216
  258. package/test/unit/analyze.test.ts +0 -224
  259. package/test/unit/auto-detect.test.ts +0 -250
  260. package/test/unit/cli-status-project-level.test.ts +0 -283
  261. package/test/unit/cli-status.test.ts +0 -418
  262. package/test/unit/commands/common.test.ts +0 -321
  263. package/test/unit/commands/logs.test.ts +0 -458
  264. package/test/unit/commands/runs.test.ts +0 -303
  265. package/test/unit/commands/unlock.test.ts +0 -320
  266. package/test/unit/config/defaults.test.ts +0 -70
  267. package/test/unit/config/quality-commands-schema.test.ts +0 -72
  268. package/test/unit/config/regression-gate-schema.test.ts +0 -160
  269. package/test/unit/config/smart-runner-flag.test.ts +0 -250
  270. package/test/unit/constitution-generators.test.ts +0 -161
  271. package/test/unit/constitution.test.ts +0 -210
  272. package/test/unit/context/context-autodetect.test.ts +0 -297
  273. package/test/unit/context/context-build.test.ts +0 -575
  274. package/test/unit/context/context-coverage.test.ts +0 -236
  275. package/test/unit/context/context-error.test.ts +0 -93
  276. package/test/unit/context/context-estimate-tokens.test.ts +0 -201
  277. package/test/unit/context/context-format.test.ts +0 -302
  278. package/test/unit/context/context-isolation.test.ts +0 -267
  279. package/test/unit/context/context-sort.test.ts +0 -93
  280. package/test/unit/context/context-story.test.ts +0 -108
  281. package/test/unit/context/prior-failures.test.ts +0 -463
  282. package/test/unit/context.test.ts +0 -1726
  283. package/test/unit/cost.test.ts +0 -231
  284. package/test/unit/crash-recovery.test.ts +0 -309
  285. package/test/unit/escalation.test.ts +0 -127
  286. package/test/unit/execution/lifecycle/run-completion.test.ts +0 -240
  287. package/test/unit/execution/lifecycle/run-regression.test.ts +0 -420
  288. package/test/unit/execution/pid-registry.test.ts +0 -241
  289. package/test/unit/execution/sequential-executor.test.ts +0 -235
  290. package/test/unit/execution/sfc-004-dead-code-cleanup.test.ts +0 -89
  291. package/test/unit/execution/structured-failure.test.ts +0 -415
  292. package/test/unit/execution-logging-stderr.test.ts +0 -157
  293. package/test/unit/execution-stage.test.ts +0 -123
  294. package/test/unit/fix-generator.test.ts +0 -276
  295. package/test/unit/formatters.test.ts +0 -468
  296. package/test/unit/greenfield.test.ts +0 -180
  297. package/test/unit/hooks/shell-security.test.ts +0 -40
  298. package/test/unit/interaction/auto-plugin.test.ts +0 -162
  299. package/test/unit/interaction/human-review-trigger.test.ts +0 -165
  300. package/test/unit/interaction-network-failures.test.ts +0 -390
  301. package/test/unit/interaction-plugins.test.ts +0 -472
  302. package/test/unit/logging/formatter.test.ts +0 -456
  303. package/test/unit/merge.test.ts +0 -269
  304. package/test/unit/metrics/aggregator.test.ts +0 -164
  305. package/test/unit/metrics/tracker.test.ts +0 -186
  306. package/test/unit/metrics.test.ts +0 -276
  307. package/test/unit/optimizer/noop.optimizer.test.ts +0 -125
  308. package/test/unit/optimizer/rule-based.optimizer.test.ts +0 -358
  309. package/test/unit/pipeline/event-bus.test.ts +0 -105
  310. package/test/unit/pipeline/routing-partial-override.test.ts +0 -121
  311. package/test/unit/pipeline/runner-retry.test.ts +0 -89
  312. package/test/unit/pipeline/stages/autofix.test.ts +0 -97
  313. package/test/unit/pipeline/stages/completion-review-gate.test.ts +0 -218
  314. package/test/unit/pipeline/stages/execution-ambiguity.test.ts +0 -311
  315. package/test/unit/pipeline/stages/execution-merge-conflict.test.ts +0 -218
  316. package/test/unit/pipeline/stages/rectify.test.ts +0 -101
  317. package/test/unit/pipeline/stages/regression-stage.test.ts +0 -69
  318. package/test/unit/pipeline/stages/review.test.ts +0 -201
  319. package/test/unit/pipeline/stages/routing-idempotence.test.ts +0 -139
  320. package/test/unit/pipeline/stages/routing-initial-complexity.test.ts +0 -321
  321. package/test/unit/pipeline/stages/routing-persistence.test.ts +0 -380
  322. package/test/unit/pipeline/stages/verify.test.ts +0 -267
  323. package/test/unit/pipeline/subscribers/events-writer.test.ts +0 -227
  324. package/test/unit/pipeline/subscribers/hooks.test.ts +0 -84
  325. package/test/unit/pipeline/subscribers/interaction.test.ts +0 -313
  326. package/test/unit/pipeline/subscribers/registry.test.ts +0 -149
  327. package/test/unit/pipeline/subscribers/reporters.test.ts +0 -90
  328. package/test/unit/pipeline/verify-smart-runner.test.ts +0 -345
  329. package/test/unit/prd-auto-default.test.ts +0 -291
  330. package/test/unit/prd-failure-category.test.ts +0 -177
  331. package/test/unit/prd-get-next-story.test.ts +0 -215
  332. package/test/unit/precheck/checks-warnings.test.ts +0 -114
  333. package/test/unit/precheck-checks.test.ts +0 -841
  334. package/test/unit/precheck-story-size-gate.test.ts +0 -288
  335. package/test/unit/precheck-types.test.ts +0 -143
  336. package/test/unit/prompts/builder.test.ts +0 -258
  337. package/test/unit/prompts/loader.test.ts +0 -355
  338. package/test/unit/prompts/sections/conventions.test.ts +0 -30
  339. package/test/unit/prompts/sections/isolation.test.ts +0 -35
  340. package/test/unit/prompts/sections/role-task.test.ts +0 -40
  341. package/test/unit/prompts/sections/sections.test.ts +0 -238
  342. package/test/unit/prompts/sections/story.test.ts +0 -45
  343. package/test/unit/prompts/sections/verdict.test.ts +0 -58
  344. package/test/unit/prompts.test.ts +0 -476
  345. package/test/unit/queue.test.ts +0 -237
  346. package/test/unit/rectification.test.ts +0 -285
  347. package/test/unit/registry.test.ts +0 -288
  348. package/test/unit/review/runner.test.ts +0 -117
  349. package/test/unit/routing/content-hash.test.ts +0 -99
  350. package/test/unit/routing/routing-stability.test.ts +0 -208
  351. package/test/unit/routing/strategies/llm.test.ts +0 -306
  352. package/test/unit/routing-advanced.test.ts +0 -313
  353. package/test/unit/routing-core.test.ts +0 -341
  354. package/test/unit/routing-strategies.test.ts +0 -440
  355. package/test/unit/storyid-events.test.ts +0 -213
  356. package/test/unit/tdd-verdict.test.ts +0 -492
  357. package/test/unit/test-output-parser.test.ts +0 -377
  358. package/test/unit/ui/tui-controls.test.ts +0 -335
  359. package/test/unit/ui/tui-cost-and-pty.test.ts +0 -190
  360. package/test/unit/ui/tui-layout.test.ts +0 -379
  361. package/test/unit/ui/tui-stories.test.ts +0 -333
  362. package/test/unit/unit-isolation.test.ts +0 -135
  363. package/test/unit/utils/git.test.ts +0 -50
  364. package/test/unit/utils/path-security.test.ts +0 -47
  365. package/test/unit/utils-helpers.test.ts +0 -318
  366. package/test/unit/verdict.test.ts +0 -325
  367. package/test/unit/verification/orchestrator-types.test.ts +0 -54
  368. package/test/unit/verification/orchestrator.test.ts +0 -66
  369. package/test/unit/verification/smart-runner-config.test.ts +0 -163
  370. package/test/unit/verification/smart-runner-discovery.test.ts +0 -354
  371. package/test/unit/verification/smart-runner.test.ts +0 -262
  372. package/test/unit/verification/strategies/acceptance.test.ts +0 -33
  373. package/test/unit/verification/strategies/regression.test.ts +0 -87
  374. package/test/unit/verification/strategies/scoped.test.ts +0 -100
  375. package/test/unit/worktree-manager.test.ts +0 -159
  376. package/tsconfig.json +0 -27
@@ -1,135 +0,0 @@
- # Post-Fix Code Review: BUG-20, BUG-21, BUG-22
-
- **Date:** 2026-02-19
- **Reviewer:** Subrina (AI)
- **Scope:** `fix/bug-20-22-tdd-orchestrator` branch (3 commits, 7 files)
- **Depth:** Standard (post-fix verification)
- **Files:** 7 (lib: ~170 LOC added, test: ~470 LOC added)
- **Baseline:** 21 new tests, 67 assertions, all passing
-
- ---
-
- ## Overall Grade: B+ (82/100)
-
- Solid bug fixes with good test coverage and clean separation. The `cleanup.ts` module is well-structured with proper JSDoc and graceful error handling. The orchestrator changes address the root causes correctly. However, there are a few issues around race conditions, missing type narrowing, and a potential security concern in the BUG-22 fix that should be addressed before merge.
-
- ---
-
- ## Scoring
-
- | Dimension | Score | Notes |
- |:---|:---|:---|
- | Security | 16/20 | `executeWithTimeout` called with config-derived command; acceptable, but no sanitization |
- | Reliability | 15/20 | SIGTERM→SIGKILL race window; no PGID validation; `reviewReason` type narrowing |
- | API Design | 18/20 | Clean module boundary; `pid` optional on `AgentResult` is correct |
- | Code Quality | 17/20 | Good JSDoc; test mocking pattern is verbose but thorough |
- | Best Practices | 16/20 | Bun.sleep mock in tests is fragile; `@ts-ignore` used for mocking |
-
- ---
-
- ## Findings
-
- ### 🟡 MEDIUM
-
- #### BUG-1: Race condition in SIGTERM→SIGKILL cleanup
- **Severity:** MEDIUM | **Category:** Bug
- ```typescript
- // src/tdd/cleanup.ts:73-76
- process.kill(-pgid, "SIGTERM");
- await Bun.sleep(3000); // ← Fixed 3s delay
- process.kill(-pgid, "SIGKILL");
- ```
- **Risk:** If the process group exits cleanly in <3s and the PGID is reassigned to a new process group (unlikely but possible on busy systems), SIGKILL hits the wrong group. Also, 3s is hardcoded with no configurability.
- **Fix:** Check if processes still exist before SIGKILL:
- ```typescript
- const stillAlive = await getPgid(pid);
- if (stillAlive === pgid) {
-   process.kill(-pgid, "SIGKILL");
- }
- ```
-
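The guarded escalation suggested in BUG-1 can be sketched as follows. This is a minimal illustration, not the actual `cleanup.ts` code, which this diff does not show: the `CleanupDeps` interface and its `sendSignal`, `isGroupAlive`, and `sleep` members are hypothetical helpers, injected so the logic can run without spawning real processes. The `gracePeriodMs` parameter mirrors the configurable grace period proposed in ENH-2.

```typescript
// Hypothetical dependencies (not part of the reviewed code), injected so the
// sketch can be exercised with fakes instead of real process groups.
interface CleanupDeps {
  sendSignal: (pgid: number, signal: "SIGTERM" | "SIGKILL") => void;
  isGroupAlive: (pgid: number) => Promise<boolean>;
  sleep: (ms: number) => Promise<void>;
}

// Sends SIGTERM, waits for the grace period, then escalates to SIGKILL
// only if the group still reports as alive. Returns the signals sent.
async function terminateGroup(
  pgid: number,
  deps: CleanupDeps,
  gracePeriodMs = 3000,
): Promise<string[]> {
  const sent: string[] = [];
  deps.sendSignal(pgid, "SIGTERM");
  sent.push("SIGTERM");
  await deps.sleep(gracePeriodMs);
  if (await deps.isGroupAlive(pgid)) {
    deps.sendSignal(pgid, "SIGKILL");
    sent.push("SIGKILL");
  }
  return sent;
}
```

Note that a check-then-kill sequence narrows the window in which a recycled PGID could be hit, but does not close it entirely, since the group can still exit between the check and the signal.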
- #### BUG-2: BUG-22 post-verification runs `bun test` without workdir context
- **Severity:** MEDIUM | **Category:** Bug
- ```typescript
- // src/tdd/orchestrator.ts:430-432
- const testCmd = config.quality?.commands?.test ?? "bun test";
- const timeoutSeconds = config.quality?.verificationTimeoutSeconds ?? 120;
- const postVerify = await executeWithTimeout(testCmd, timeoutSeconds);
- ```
- **Risk:** `executeWithTimeout` may not inherit the correct working directory. If the orchestrator's cwd differs from the project workdir, post-verification runs tests against the wrong codebase or fails with "no tests found."
- **Fix:** Pass `workdir` to `executeWithTimeout`:
- ```typescript
- const postVerify = await executeWithTimeout(testCmd, timeoutSeconds, { cwd: workdir });
- ```
- *(Verify that `executeWithTimeout` accepts a `cwd` option; if not, it needs a small change.)*
-
- #### ENH-1: `reviewReason` type could be `undefined | string` but is set via `let`
- **Severity:** MEDIUM | **Category:** Type Safety
- ```typescript
- // src/tdd/orchestrator.ts:441
- reviewReason = undefined; // ← assignment to undefined in success path
- ```
- **Risk:** The `reviewReason` variable is declared with `let` higher up. The `undefined` assignment works but the type should be explicit to avoid accidental string checks downstream.
- **Fix:** Declare as `let reviewReason: string | undefined;` if not already.
-
- ### 🟢 LOW
-
- #### STYLE-1: `@ts-ignore` comments in tests instead of proper typing
- **Severity:** LOW | **Category:** Style
- ```typescript
- // test/tdd-cleanup.test.ts:24
- // @ts-ignore (mocking global)
- Bun.spawn = mock((cmd: string[], spawnOpts?: any) => {
- ```
- **Risk:** If `Bun.spawn` signature changes, tests won't catch the type mismatch at compile time.
- **Fix:** Consider a wrapper function pattern:
- ```typescript
- const mockSpawn = mock((...args: Parameters<typeof Bun.spawn>) => { ... });
- Object.defineProperty(Bun, 'spawn', { value: mockSpawn, writable: true });
- ```
-
- #### STYLE-2: Verbose mock setup repeated across tests
- **Severity:** LOW | **Category:** Style
- The `Bun.spawn` mock setup for git commands is duplicated across `tdd-cleanup.test.ts` and `tdd-orchestrator.test.ts` with slightly different patterns.
- **Fix:** Extract a `createGitMock()` helper into a shared test utility (e.g., `test/helpers/git-mock.ts`).
-
- #### ENH-2: `cleanupProcessTree` hardcoded 3s grace period
- **Severity:** LOW | **Category:** Enhancement
- ```typescript
- await Bun.sleep(3000);
- ```
- **Fix:** Accept optional `gracePeriodMs` parameter with 3000 default:
- ```typescript
- export async function cleanupProcessTree(pid: number, gracePeriodMs = 3000): Promise<void> {
- ```
-
- #### PERF-1: BUG-20 test file detection uses regex on every file per session
- **Severity:** LOW | **Category:** Performance
- ```typescript
- const testFilePatterns = /\.(test|spec)\.(ts|js|tsx|jsx)$/;
- const testFilesCreated = session1.filesChanged.filter((f) => testFilePatterns.test(f));
- ```
- **Risk:** Negligible perf impact (typically <50 files), but the regex is recompiled each call.
- **Fix:** Move `testFilePatterns` to module scope as a constant. Minor.
-
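A minimal sketch of the PERF-1 suggestion, assuming a hypothetical helper named `findTestFiles` (the reviewed code filters inline inside the orchestrator). The pattern itself is copied from the snippet above; only its placement at module scope changes.

```typescript
// Module-scope constant: the pattern object is created once when the module
// loads instead of on every call.
const TEST_FILE_PATTERN = /\.(test|spec)\.(ts|js|tsx|jsx)$/;

// Hypothetical helper: returns the changed paths that look like test files.
function findTestFiles(filesChanged: readonly string[]): string[] {
  return filesChanged.filter((file) => TEST_FILE_PATTERN.test(file));
}

// Illustrative input; these paths are examples, not taken from a real run.
const changedFiles = [
  "src/tdd/cleanup.ts",
  "test/tdd-cleanup.test.ts",
  "src/ui/panel.spec.tsx",
  "docs/test.md",
];
console.log(findTestFiles(changedFiles));
```

A shared pattern like this should stay free of the `g` flag, because `RegExp.prototype.test` on a global regex keeps state in `lastIndex` between calls.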
- ---
-
- ## Priority Fix Order
-
- | Priority | ID | Effort | Description |
- |:---|:---|:---|:---|
- | P1 | BUG-2 | S | Pass workdir to post-TDD verification `executeWithTimeout` |
- | P2 | BUG-1 | S | Verify process still alive before SIGKILL |
- | P3 | ENH-1 | S | Explicit type for `reviewReason` |
- | P3 | ENH-2 | S | Configurable grace period in `cleanupProcessTree` |
- | P4 | STYLE-1 | M | Replace `@ts-ignore` with proper mock typing |
- | P4 | STYLE-2 | M | Extract shared git mock helper |
- | P5 | PERF-1 | S | Move test file regex to module scope |
-
- ---
-
- ## Verdict
-
- **Ship with P1 fix.** BUG-2 (missing workdir in post-verification) is the only functional risk. The SIGKILL race (BUG-1) is theoretical on macOS but worth a quick fix. Everything else is polish.
-
- The test coverage is thorough: 21 tests covering happy path, failure modes, isolation violations, dry-run, and all 3 bug-specific scenarios. The `cleanup.ts` module is clean, well-documented, and properly handles edge cases (dead processes, ESRCH, unexpected errors).
@@ -1,63 +0,0 @@
- # Keyword Routing Baseline: Config-Loader Dogfood
-
- > Recorded from Run D + Run D2 (2026-02-19) for comparison with LLM routing.
-
- ## Run D (US-001 to US-007)
-
- | Story | Title | Classified | Model | Test Strategy | Routing Reason | Cost | Status |
- |:---|:---|:---|:---|:---|:---|:---|:---|
- | US-001 | Define core types and error handling | simple | balanced | test-after | simple task (medium) | ~$0.08 | ✅ |
- | US-002 | Implement environment variable interpolation | medium | balanced | test-after | simple task (medium) | ~$0.08 | ✅ |
- | US-003 | Implement deep merge utility | medium | balanced | test-after | simple task (medium) | ~$0.08 | ✅ |
- | US-004 | Implement config file discovery | simple | balanced | test-after | simple task (medium) | ~$0.08 | ✅ |
- | US-005 | Implement synchronous config loader | medium | balanced | test-after | simple task (medium) | ~$0.10 | ✅ |
- | US-006 | Implement async config loader | medium | balanced | test-after | simple task (medium) | ~$0.10 | ✅ |
- | US-007 | Implement config file watcher | complex | powerful | three-session-tdd | complexity:expert | n/a | ❌ TDD failure (BUG-20) |
-
- **Run D total: $0.65, 6/9 passed, 174 tests, 9.8 min**
-
- ## Run D2 (US-007 to US-009, resumed)
-
- | Story | Title | Classified | Model | Test Strategy | Routing Reason | Cost | Status |
- |:---|:---|:---|:---|:---|:---|:---|:---|
- | US-007 | Implement config file watcher | complex | powerful | three-session-tdd | complexity:expert | $1.74 | ✅ |
- | US-008 (attempt 1) | Export public API and create barrel exports | simple | powerful | three-session-tdd | public-api, complexity:complex | $1.26 | ✅ but ASSET_CHECK failed |
- | US-008 (attempt 2) | Export public API and create barrel exports | simple | powerful | three-session-tdd | public-api, complexity:complex | $1.21 | ✅ |
- | US-009 | Comprehensive integration tests and documentation | medium | powerful | three-session-tdd | complexity:complex | $4.95 | ⏸ Human review (verifier issues) |
-
- **Run D2 total: $4.20 (completed) + $4.95 (US-009 paused) = ~$9.15, 41.4 min**
-
30
- ## Misroute Analysis
31
-
32
- | Story | Keyword Route | Ideal Route | Wasted Cost |
33
- |:---|:---|:---|:---|
34
- | US-008 | powerful + three-session-tdd ($2.47 over 2 attempts) | fast + test-after (~$0.10) | **~$2.37** |
35
- | US-009 | powerful + three-session-tdd ($4.95) | balanced + test-after (~$0.20) | **~$4.75** |
36
-
37
- **Total misroute waste: ~$7.12** (~78% of Run D2 spend)
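The waste figures above are plain subtraction over the table rows, so they can be re-derived. The sketch below does that with the costs copied from the Run D2 and Misroute Analysis tables. The `Misroute` type and `wastedCost` helper are illustrative names, not part of nax.

```typescript
// Costs in USD, copied from the Run D2 and Misroute Analysis tables.
interface Misroute {
  story: string;
  actualCost: number; // what keyword routing actually spent
  idealCost: number; // estimated cost of the ideal route
}

const misroutes: Misroute[] = [
  { story: "US-008", actualCost: 1.26 + 1.21, idealCost: 0.1 }, // two attempts
  { story: "US-009", actualCost: 4.95, idealCost: 0.2 },
];

// Hypothetical helper: waste is actual spend minus the ideal-route estimate.
function wastedCost(m: Misroute): number {
  return m.actualCost - m.idealCost;
}

const totalWaste = misroutes.reduce((sum, m) => sum + wastedCost(m), 0);

// Sum of all four Run D2 rows (US-007, US-008 twice, US-009).
const runD2Spend = 1.74 + 1.26 + 1.21 + 4.95;
const wasteShare = totalWaste / runD2Spend;

console.log(`total waste: ~$${totalWaste.toFixed(2)}`);
console.log(`share of Run D2 spend: ${(wasteShare * 100).toFixed(1)}%`);
```

The ideal-route costs are the document's own estimates, so the result is only as reliable as those guesses.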
38
-
39
- ### Why Keyword Routing Failed
40
-
41
- **US-008:** Title "Export **public API** and create barrel exports" matches `PUBLIC_API_KEYWORDS` → forces TDD. But this is just creating `index.ts` barrel files — no logic, no contracts, no breaking changes. A 2-minute task got 3-session TDD with Opus.
42
-
43
- **US-009:** "Comprehensive **integration tests** and documentation" — classified as medium by AC count, but routing reason says `complexity:complex`. The word "comprehensive" + AC count likely pushed it. Also got TDD despite the story literally being "write tests" — TDD for writing tests is circular.
44
-
45
- ### Expected LLM Routing (to validate later)
46
-
47
- | Story | Expected LLM Route | Expected Cost |
48
- |:---|:---|:---|
49
- | US-001 | fast / test-after | ~$0.05 |
50
- | US-002 | fast / test-after | ~$0.05 |
51
- | US-003 | fast / test-after | ~$0.05 |
52
- | US-004 | fast / test-after | ~$0.05 |
53
- | US-005 | balanced / test-after | ~$0.10 |
54
- | US-006 | balanced / test-after | ~$0.10 |
55
- | US-007 | powerful / three-session-tdd | ~$1.50 |
56
- | US-008 | fast / test-after | ~$0.05 |
57
- | US-009 | balanced / test-after | ~$0.15 |
58
-
59
- **Expected total with LLM routing: ~$2.10** vs actual $9.80 (Run D + D2)
60
-
61
- ---
62
-
63
- *Recorded 2026-02-19 for A/B comparison with v0.8 LLM routing.*
@@ -1,80 +0,0 @@
1
- # Fix Plan: nax v0.8 Structured Logging — Phase 1
2
- **Date:** 2026-02-20
3
- **Branch:** `feat/v0.8-structured-logging`
4
-
5
- ## Scope
6
- Phase 1: Logger core, CLI flags, JSONL file output, debug mode.
7
- Covers AC-1, AC-2, AC-3, AC-4, AC-5, AC-8.
8
- Does NOT touch existing console.log calls (Phase 2).
9
-
10
- ## Phase 1A: Logger Core
11
- **Commit:** `feat(logger): implement structured Logger with level gating and JSONL output`
12
-
13
- ### File: `src/logger/index.ts` (NEW)
14
- Export barrel.
15
-
16
- ### File: `src/logger/logger.ts` (NEW)
17
- - `Logger` class with `error`, `warn`, `info`, `debug` methods
18
- - Each method signature: `(stage: string, message: string, data?: Record<string, unknown>)`
19
- - `withStory(storyId: string)` returns a `StoryLogger` with storyId auto-injected
20
- - Constructor: `{ level: LogLevel, filePath?: string, useChalk?: boolean }`
21
- - Console output: chalk-formatted, filtered by level
22
- - File output: JSON Lines, all levels written regardless of console level
23
- - Singleton pattern: `getLogger()` / `initLogger(opts)`
24
-
25
- ### File: `src/logger/types.ts` (NEW)
26
- - `LogLevel` type: `"error" | "warn" | "info" | "debug"`
27
- - `LogEntry` interface: `{ timestamp, level, stage, storyId?, message, data? }`
28
- - `LoggerOptions` interface
29
-
30
- ### File: `src/logger/formatters.ts` (NEW)
31
- - `formatConsole(entry: LogEntry): string` — chalk-formatted human-readable
32
- - `formatJsonl(entry: LogEntry): string` — JSON.stringify one-liner
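The Phase 1A bullets describe two independent outputs: a level-gated console and a JSONL file that receives every entry. A minimal sketch of that split follows, under several assumptions: sinks are injected instead of using `fs` and chalk, `withStory` returns another `Logger` rather than a separate `StoryLogger`, and the console line format is invented for illustration.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";
type LogData = Record<string, unknown>;

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  stage: string;
  storyId?: string;
  message: string;
  data?: LogData;
}

// Lower rank = more severe. An entry reaches the console when its rank
// is less than or equal to the configured level's rank.
const LEVEL_RANK: Record<LogLevel, number> = { error: 0, warn: 1, info: 2, debug: 3 };

function formatJsonl(entry: LogEntry): string {
  return JSON.stringify(entry); // one JSON object per line
}

// Assumption: injected sinks keep the sketch free of fs and chalk.
interface Sinks {
  console: (line: string) => void;
  file: (line: string) => void;
}

class Logger {
  private level: LogLevel;
  private sinks: Sinks;
  private storyId?: string;

  constructor(level: LogLevel, sinks: Sinks, storyId?: string) {
    this.level = level;
    this.sinks = sinks;
    this.storyId = storyId;
  }

  private log(level: LogLevel, stage: string, message: string, data?: LogData): void {
    const entry: LogEntry = { timestamp: new Date().toISOString(), level, stage, message };
    if (this.storyId !== undefined) entry.storyId = this.storyId;
    if (data !== undefined) entry.data = data;
    this.sinks.file(formatJsonl(entry)); // file output is never gated
    if (LEVEL_RANK[level] <= LEVEL_RANK[this.level]) {
      this.sinks.console(`[${level}] [${stage}] ${message}`); // console output is gated
    }
  }

  error(stage: string, message: string, data?: LogData): void { this.log("error", stage, message, data); }
  warn(stage: string, message: string, data?: LogData): void { this.log("warn", stage, message, data); }
  info(stage: string, message: string, data?: LogData): void { this.log("info", stage, message, data); }
  debug(stage: string, message: string, data?: LogData): void { this.log("debug", stage, message, data); }

  // Returns a logger that stamps every entry with the given storyId.
  withStory(storyId: string): Logger {
    return new Logger(this.level, this.sinks, storyId);
  }
}
```

Injecting the sinks also makes the "file always gets all levels" requirement easy to assert in a unit test without touching the filesystem.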
33
-
34
- ## Phase 1B: CLI Integration
35
- **Commit:** `feat(cli): add --verbose, --quiet, --silent flags and run directory`
36
-
37
- ### File: `bin/nax.ts`
38
- - Add `--verbose` flag → sets log level to `debug`
39
- - Add `--quiet` flag → sets log level to `warn`
40
- - Add `--silent` flag → sets log level to `error`
41
- - Add `NAX_LOG_LEVEL` env var support (overrides flags)
42
- - Create run directory: `nax/features/<name>/runs/`
43
- - Generate run ID: ISO timestamp `YYYY-MM-DDTHH-mm-ssZ`
44
- - Pass `filePath` to logger init: `nax/features/<name>/runs/<run-id>.jsonl`
45
- - After run, create/update `latest.jsonl` symlink
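The flag and run-ID rules above can be sketched as two small pure functions. The plan only says that `NAX_LOG_LEVEL` overrides the flags. The precedence among the flags themselves, the `info` default, and ignoring an invalid env value are assumptions here, and `resolveLogLevel` and `makeRunId` are hypothetical names.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

interface LevelFlags {
  verbose?: boolean;
  quiet?: boolean;
  silent?: boolean;
}

const VALID_LEVELS: Record<string, LogLevel> = {
  error: "error",
  warn: "warn",
  info: "info",
  debug: "debug",
};

// envValue is the raw NAX_LOG_LEVEL string, if set.
function resolveLogLevel(flags: LevelFlags, envValue?: string): LogLevel {
  // The env var overrides flags; an unrecognized value is ignored (assumption).
  if (envValue !== undefined && Object.prototype.hasOwnProperty.call(VALID_LEVELS, envValue)) {
    return VALID_LEVELS[envValue];
  }
  // Assumed precedence when several flags are passed: most restrictive wins.
  if (flags.silent) return "error";
  if (flags.quiet) return "warn";
  if (flags.verbose) return "debug";
  return "info"; // assumed default
}

// Produces the YYYY-MM-DDTHH-mm-ssZ form: drop milliseconds and swap the
// colons for dashes so the ID is safe to use as a file name.
function makeRunId(now: Date = new Date()): string {
  return now.toISOString().replace(/\.\d{3}Z$/, "Z").replace(/:/g, "-");
}

const runId = makeRunId();
console.log(`nax/features/<name>/runs/${runId}.jsonl`);
```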
46
-
47
- ### File: `src/config/schema.ts`
48
- - Add `logging` section to NaxConfig: `{ level: LogLevel, verbose: boolean }`
49
-
50
- ## Phase 1C: Stage Events
51
- **Commit:** `feat(logger): emit structured stage lifecycle events`
52
-
53
- ### File: `src/execution/runner.ts`
54
- - Add logger calls at key lifecycle points (alongside existing console.log, not replacing):
55
- - `run.start`, `iteration.start`, `context.built`
56
- - `agent.start`, `agent.complete`
57
- - `story.complete`, `run.complete`
58
- - These are written to the JSONL file even when the console level is `info`
59
-
60
- ### File: `src/pipeline/stages/routing.ts`
61
- - Add logger call for routing decision
62
-
63
- ## Phase 1D: Tests
64
- **Commit:** `test(logger): add unit tests for Logger, formatters, and JSONL output`
65
-
66
- ### Test targets:
67
- - `test/logger.test.ts` — Logger class, level gating, withStory, file output
68
- - `test/formatters.test.ts` — console and JSONL formatters
69
- - Verify: JSONL lines are valid JSON with required fields
70
- - Verify: level gating (debug hidden at info level, etc.)
71
- - Verify: file always gets all levels regardless of console setting
72
-
73
- ## Test Strategy
74
- - Mode: test-after (implementing against spec)
75
- - Run: `bun test`
76
-
77
- ## Notes
78
- - Do NOT replace any existing `console.log` calls (Phase 2)
79
- - Logger runs alongside existing output in Phase 1
80
- - Console formatter should closely match current chalk output style
@@ -1,37 +0,0 @@
1
- # Fix Plan: nax v0.8 Structured Logging — Phase 2
2
- **Date:** 2026-02-20
3
- **Covers:** AC-6 (runs list/show), AC-7 (per-story metrics), AC-9 (console.log migration)
4
-
5
- ## Migration Rules
6
- 1. Import `getLogger` from `../logger` (adjust path as needed)
7
- 2. Get logger instance: `const logger = getLogger()`
8
- 3. Replace `console.log(chalk.X(...))` → `logger.info(stage, message, data?)`
9
- 4. Replace `console.warn(...)` → `logger.warn(stage, message, data?)`
10
- 5. Replace `console.error(...)` → `logger.error(stage, message, data?)`
11
- 6. Replace verbose/debug output → `logger.debug(stage, message, data?)`
12
- 7. `stage` should be the module/concern: "routing", "context", "agent", "tdd", "pipeline", "config", "cli", etc.
13
- 8. Keep chalk formatting in the logger's console formatter — do NOT use chalk in the migrated calls
14
- 9. For `data` objects, include structured fields (storyId, cost, duration, etc.) not string interpolation
15
- 10. Do NOT change test files — only src/ files
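As an illustration of rules 3, 4, 8 and 9, the sketch below shows two migrated call sites next to the `console` calls they might replace. The `logger` object is a recording stand-in for whatever `getLogger()` returns, and the function names and messages are hypothetical.

```typescript
type LogData = Record<string, unknown>;

interface RecordedCall {
  level: "info" | "warn";
  stage: string;
  message: string;
  data?: LogData;
}

// Stand-in for the instance returned by getLogger(); it only records calls.
const recorded: RecordedCall[] = [];
const logger = {
  info(stage: string, message: string, data?: LogData): void {
    recorded.push({ level: "info", stage, message, data });
  },
  warn(stage: string, message: string, data?: LogData): void {
    recorded.push({ level: "warn", stage, message, data });
  },
};

// Before: console.log(chalk.green(`${storyId} done in ${durationMs}ms ($${cost})`));
// After: no chalk at the call site, values passed as structured fields.
function reportStoryComplete(storyId: string, cost: number, durationMs: number): void {
  logger.info("pipeline", "Story complete", { storyId, cost, durationMs });
}

// Before: console.warn(chalk.yellow(`LLM batch routing failed: ${err.message}`));
function reportBatchRoutingFailure(err: Error): void {
  logger.warn("routing", "LLM batch routing failed", { error: err.message });
}

reportStoryComplete("US-001", 0.08, 1200);
reportBatchRoutingFailure(new Error("timeout"));
console.log(JSON.stringify(recorded, null, 2));
```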
16
-
17
- ## Phase 2A: Core execution pipeline (highest impact)
18
- **Files:** src/execution/runner.ts, src/execution/helpers.ts, src/execution/post-verify.ts, src/execution/queue-handler.ts
19
- **Commit:** `refactor(execution): migrate console.log to structured logger`
20
-
21
- ## Phase 2B: Pipeline stages
22
- **Files:** src/pipeline/runner.ts, src/pipeline/events.ts, src/pipeline/stages/*.ts (acceptance, completion, constitution, execution, prompt, review, routing, verification)
23
- **Commit:** `refactor(pipeline): migrate console.log to structured logger`
24
-
25
- ## Phase 2C: Agents, routing, context, config
26
- **Files:** src/agents/claude.ts, src/agents/cost.ts, src/agents/validation.ts, src/routing/strategies/*.ts, src/context/builder.ts, src/config/loader.ts, src/analyze/*.ts
27
- **Commit:** `refactor(agents): migrate console.log to structured logger`
28
-
29
- ## Phase 2D: CLI, TDD, hooks, metrics + runs list/show commands
30
- **Files:** src/cli/*.ts, src/tdd/*.ts, src/hooks/*.ts, src/metrics/*.ts, src/review/*.ts, src/acceptance/*.ts
31
- **Also:** Implement `nax runs list -f <feature>` and `nax runs show <run-id> -f <feature>` commands in bin/nax.ts
32
- **Also:** Add per-story metrics summary table to run.complete event
33
- **Commit:** `feat(cli): add nax runs commands and migrate remaining console.log`
34
-
35
- ## Verification
36
- After all phases: `grep -rn "console\.\(log\|warn\|error\)" src/ | grep -v "logger\.ts\|formatters\.ts" | wc -l` should be 0
37
- Run: `bun test` — all tests must pass
@@ -1,180 +0,0 @@
1
- # Code Review: nax v0.8 LLM-Enhanced Routing
2
-
3
- **Date:** 2026-02-20
4
- **Reviewer:** Subrina (AI)
5
- **Branch:** `feat/v0.8-llm-routing` (7 commits, LLM routing scope)
6
- **Files:** 12 changed (src: ~450 LOC, test: ~700 LOC)
7
- **Baseline:** 633 pass, 0 fail, 2 skip
8
-
9
- ---
10
-
11
- ## Overall Grade: B+ (83/100)
12
-
13
- Solid implementation of LLM-based routing with good test coverage (532-line test file), clean separation from keyword strategy, proper fallback chain, and batch mode support. Two notable issues: a **process leak on timeout** (P1) and **duplicate batch routing code** (P2). The async strategy refactor is clean and non-breaking.
14
-
15
- | Dimension | Score | Notes |
16
- |:---|:---|:---|
17
- | Security | 16/20 | Process leak on timeout; prompt injection surface (low risk — internal tool) |
18
- | Reliability | 15/20 | Timeout doesn't kill process; no retry on transient failures |
19
- | API Design | 18/20 | Clean strategy interface, good batch/cache separation |
20
- | Code Quality | 17/20 | Well-documented, good JSDoc. Some duplication in runner.ts |
21
- | Best Practices | 17/20 | Proper fallback chain, zod validation, backward compat via `routeTask` |
22
-
23
- ---
24
-
25
- ## Findings
26
-
27
- ### 🔴 CRITICAL
28
-
29
- *(none)*
30
-
31
- ### 🟠 HIGH
32
-
33
- #### BUG-1: Process leak on LLM timeout
34
- **Severity:** HIGH | **Category:** Memory/Resource
35
- **File:** `src/routing/strategies/llm.ts:131-149`
36
-
37
- ```typescript
38
- const timeoutPromise = new Promise<never>((_, reject) => {
39
- setTimeout(() => reject(new Error(`LLM call timeout after ${timeoutMs}ms`)), timeoutMs);
40
- });
41
- // ...
42
- return await Promise.race([outputPromise, timeoutPromise]);
43
- ```
44
-
45
- When the timeout fires, `Promise.race` rejects but the spawned `claude` process continues running. There's no `proc.kill()` on timeout. This leaks a process that could run for minutes.
46
-
47
- **Risk:** Orphaned `claude` processes accumulating on Mac01, consuming memory and API credits.
48
-
49
- **Fix:**
50
- ```typescript
51
- const controller = new AbortController();
52
- const timeoutId = setTimeout(() => {
53
- proc.kill();
54
- controller.abort();
55
- }, timeoutMs);
56
-
57
- try {
58
- const output = await outputPromise;
59
- clearTimeout(timeoutId);
60
- return output;
61
- } catch (err) {
62
- proc.kill();
63
- clearTimeout(timeoutId);
64
- throw err;
65
- }
66
- ```
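One caveat with awaiting `outputPromise` alone: if a killed process makes that promise resolve with partial output instead of rejecting, the caller never sees a timeout error. A variant that keeps `Promise.race`, kills the process in the timer callback, and always clears the timer is sketched below. `Killable` and `withKillOnTimeout` are hypothetical names, and the process shape is reduced to a single `kill()` method.

```typescript
// Hypothetical minimal shape of the spawned process.
interface Killable {
  kill(): void;
}

async function withKillOnTimeout<T>(
  proc: Killable,
  outputPromise: Promise<T>,
  timeoutMs: number,
): Promise<T> {
  let timeoutId: ReturnType<typeof setTimeout> | undefined;

  const timeoutPromise = new Promise<never>((_, reject) => {
    timeoutId = setTimeout(() => {
      proc.kill(); // stop the child so it cannot outlive the call
      reject(new Error(`LLM call timeout after ${timeoutMs}ms`));
    }, timeoutMs);
  });

  try {
    // Whichever settles first wins: the real output or the timeout rejection.
    return await Promise.race([outputPromise, timeoutPromise]);
  } finally {
    // Runs on success and on failure, so no timer is left pending.
    clearTimeout(timeoutId);
  }
}
```

This sketch kills the process only on timeout. Killing it again when `outputPromise` itself rejects, as the fix above does, is harmless but redundant if the process has already exited.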
67
-
68
- #### BUG-2: `setTimeout` in `timeoutPromise` is never cleared on success
69
- **Severity:** HIGH | **Category:** Bug
70
- **File:** `src/routing/strategies/llm.ts:135`
71
-
72
- Even when the LLM responds quickly, the timer is never cleared, so the `setTimeout` callback still fires after `timeoutMs` and rejects `timeoutPromise`. Because `Promise.race` has already settled, that late rejection is ignored, but the pending timer keeps the event loop alive for up to `timeoutMs` and does needless work.
73
-
74
- **Fix:** Use the `clearTimeout` pattern (see BUG-1 fix above).
75
-
76
- ---
77
-
78
- ### 🟡 MEDIUM
79
-
80
- #### ENH-1: Duplicate batch routing logic in `runner.ts`
81
- **Severity:** MEDIUM | **Category:** Enhancement
82
- **File:** `src/execution/runner.ts:140-154` and `src/execution/runner.ts:183-193`
83
-
84
- The LLM batch routing block (check strategy, call `llmRouteBatch`, catch and warn) is duplicated verbatim for initial routing and re-routing after dependency resolution. Extract to a helper.
85
-
86
- **Fix:**
87
- ```typescript
88
- async function tryBatchRoute(config: NaxConfig, stories: UserStory[]): Promise<void> {
89
- if (config.routing.strategy !== "llm" || !config.routing.llm?.batchMode || stories.length === 0) return;
90
- try {
91
- console.log(chalk.dim(` LLM batch routing: routing ${stories.length} stories...`));
92
- await llmRouteBatch(stories, { config });
93
- } catch (err) {
94
- console.warn(chalk.yellow(` LLM batch routing failed: ${(err as Error).message}`));
95
- }
96
- }
97
- ```
98
-
99
- #### ENH-2: Duplicate cached-routing override blocks in `runner.ts`
100
- **Severity:** MEDIUM | **Category:** Enhancement
101
- **File:** `src/execution/runner.ts:228-237` and `src/execution/runner.ts:258-267`
102
-
103
- The `if (story.routing)` override block (complexity + modelTier + testStrategy) is duplicated for batch vs single-story paths. Same fix: extract helper.
104
-
105
- #### PERF-1: `buildStrategyChain` called per-story in `routeStory`
106
- **Severity:** MEDIUM | **Category:** Performance
107
- **File:** `src/routing/router.ts`
108
-
109
- ```typescript
110
- export async function routeStory(...): Promise<RoutingDecision> {
111
- const chain = await buildStrategyChain(context.config, workdir);
112
- return await chain.route(story, context);
113
- }
114
- ```
115
-
116
- The chain is rebuilt for every story call from the pipeline routing stage. For keyword/manual strategies this is cheap, but `buildStrategyChain` could load a custom strategy file each time. Consider caching the chain per-run.
117
-
118
- **Risk:** Low for current usage (pipeline already uses batch routing for LLM). But `routeStory` is the public API.
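Per-run caching could look like the sketch below. It memoizes the builder's promise per `workdir`, which assumes the config does not change during a run. `createChainCache` is a hypothetical helper, and the real `buildStrategyChain(config, workdir)` would be wrapped in a closure that supplies the config.

```typescript
// Wraps an async builder so each workdir is built at most once per cache.
function createChainCache<C>(
  build: (workdir: string) => Promise<C>,
): (workdir: string) => Promise<C> {
  const cache = new Map<string, Promise<C>>();

  return (workdir: string): Promise<C> => {
    let chain = cache.get(workdir);
    if (chain === undefined) {
      chain = build(workdir);
      cache.set(workdir, chain);
      // Drop failed builds so a later call can retry instead of
      // receiving the same rejection forever.
      chain.catch(() => {
        cache.delete(workdir);
      });
    }
    return chain;
  };
}

// Usage with a stand-in builder that counts how often it runs.
let builds = 0;
const getChain = createChainCache(async (workdir: string) => {
  builds += 1;
  return { workdir };
});

Promise.all([getChain("/repo"), getChain("/repo")]).then(() => {
  console.log(`builds: ${builds}`);
});
```

Caching the promise instead of the resolved chain also deduplicates concurrent calls that arrive before the first build finishes.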
119
-
120
- #### TYPE-1: `parseBatchResponse` re-serializes then re-parses each entry
121
- **Severity:** MEDIUM | **Category:** Performance/Style
122
- **File:** `src/routing/strategies/llm.ts:239`
123
-
124
- ```typescript
125
- const decision = parseRoutingResponse(JSON.stringify(entry), story, config);
126
- ```
127
-
128
- Each batch entry is `JSON.stringify`'d then immediately `JSON.parse`'d inside `parseRoutingResponse`. This works but is wasteful. Consider extracting validation into a shared function that accepts an object.
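A shared validator that accepts an already-parsed object might look like the sketch below. The allowed values are the ones that appear in this repository's routing tables and may not be the full set nax supports. The signatures are simplified for illustration, since the real functions also take `story` and `config`.

```typescript
// Assumed value sets, taken from the routing tables in these docs.
const COMPLEXITIES = ["simple", "medium", "complex", "expert"] as const;
const MODEL_TIERS = ["fast", "balanced", "powerful"] as const;
const TEST_STRATEGIES = ["test-after", "three-session-tdd"] as const;

interface RoutingDecision {
  complexity: (typeof COMPLEXITIES)[number];
  modelTier: (typeof MODEL_TIERS)[number];
  testStrategy: (typeof TEST_STRATEGIES)[number];
}

function isOneOf<T extends string>(allowed: readonly T[], value: unknown): value is T {
  return typeof value === "string" && (allowed as readonly string[]).includes(value);
}

// Validates a parsed object; no serialization round trip is needed.
function validateRoutingDecision(parsed: unknown): RoutingDecision {
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Routing response is not an object");
  }
  const p = parsed as Record<string, unknown>;
  const { complexity, modelTier, testStrategy } = p;
  if (!isOneOf(COMPLEXITIES, complexity)) throw new Error("Invalid complexity");
  if (!isOneOf(MODEL_TIERS, modelTier)) throw new Error("Invalid modelTier");
  if (!isOneOf(TEST_STRATEGIES, testStrategy)) throw new Error("Invalid testStrategy");
  return { complexity, modelTier, testStrategy };
}

// Single-story path: parse once, then validate.
function parseRoutingResponse(text: string): RoutingDecision {
  return validateRoutingDecision(JSON.parse(text));
}

// Batch path: parse once, then validate each entry object directly.
function parseBatchResponse(text: string): RoutingDecision[] {
  const entries: unknown = JSON.parse(text);
  if (!Array.isArray(entries)) throw new Error("Batch response is not an array");
  return entries.map((entry) => validateRoutingDecision(entry));
}
```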
129
-
130
- ---
131
-
132
- ### 🟢 LOW
133
-
134
- #### ENH-3: `maxInputTokens` config field is unused
135
- **Severity:** LOW | **Category:** Enhancement
136
- **File:** `src/config/schema.ts` (LlmRoutingConfig)
137
-
138
- `maxInputTokens` is defined in the schema and has a default of 2000, but nothing in `llm.ts` reads or enforces it. Either implement truncation of story context or remove the field to avoid config confusion.
139
-
140
- #### STYLE-1: `console.log`/`console.warn` for routing logs
141
- **Severity:** LOW | **Category:** Style
142
- **File:** `src/routing/strategies/llm.ts` (multiple)
143
-
144
- Uses raw `console.log`/`console.warn` with `[routing]` prefix. This will be addressed by the v0.8 structured logging feature, so noting for tracking only.
145
-
146
- #### ENH-4: No validation that `strategy: "llm"` has `routing.llm` config
147
- **Severity:** LOW | **Category:** Enhancement
148
- **File:** `src/config/schema.ts`
149
-
150
- When `strategy` is `"llm"`, there's a zod refinement for `customStrategyPath` on `"custom"` but no refinement requiring `llm` config when `strategy` is `"llm"`. The runtime handles it gracefully (falls through to keyword), but a config validation error would be more user-friendly.
151
-
152
- #### STYLE-2: Adaptive strategy now has unnecessary null guards
153
- **Severity:** LOW | **Category:** Style
154
- **File:** `src/routing/strategies/adaptive.ts:170,193`
155
-
156
- ```typescript
157
- const decision = await keywordStrategy.route(story, context);
158
- if (!decision) return null; // keyword never returns null
159
- ```
160
-
161
- The keyword strategy **never** returns null (it always produces a decision). The null guard is defensive but misleading — it suggests keyword might return null when it can't.
162
-
163
- ---
164
-
165
- ## Priority Fix Order
166
-
167
- | Priority | ID | Effort | Description |
168
- |:---|:---|:---|:---|
169
- | P1 | BUG-1 + BUG-2 | S | Kill process on timeout, clear setTimeout on success |
170
- | P2 | ENH-1 + ENH-2 | S | Extract duplicate batch routing + override helpers in runner.ts |
171
- | P3 | TYPE-1 | S | Avoid re-serializing batch entries for validation |
172
- | P4 | PERF-1 | M | Cache strategy chain per-run (optional) |
173
- | P5 | ENH-3 | S | Remove or implement `maxInputTokens` |
174
- | — | ENH-4, STYLE-1, STYLE-2 | S | Low priority / deferred to structured logging |
175
-
176
- ---
177
-
178
- ## Verdict
179
-
180
- **Ship after P1 fix.** The process leak on timeout is the only blocker — it could accumulate orphaned `claude` processes costing real API credits. P2-P5 are quality improvements that can land in a follow-up.
@@ -1,70 +0,0 @@
1
- Code Review Report — Post-fix LLM routing
2
- Date: 2026-02-20
3
- Branch: feat/v0.8-llm-routing
4
- Commit: a70d4f61d29ce1e528fd1e3d82ec9987b4737a79
5
-
6
- Summary
7
- - Reviewed only the latest commit (the P1–P5 fix commit).
8
- - Verified code changes for P1, P2, P3, P5 and test mock updates in the diff between HEAD~1 and HEAD.
9
- - Could not complete test run: 'bun test' appeared to hang in this environment within the allotted time. (See note at the end.)
10
-
11
- What I accomplished / found
12
- 1) P1 (BUG-1+2): callLlm timeout, process kill and clearTimeout
13
- - Changes made in src/routing/strategies/llm.ts:
14
- - Introduced timeoutId variable, setTimeout now kills the spawned process (proc.kill()) and rejects on timeout.
15
- - Promise.race is wrapped in try/catch and both success and error paths clearTimeout(timeoutId) and ensure proc.kill() is called on the error path.
16
- - Assessment: Good improvements. This addresses the resource leak (leftover child) on timeout and ensures the timer is cleared on both success and failure.
17
- - Notes / minor suggestions:
18
- - clearTimeout is called with timeoutId which is possibly undefined in the narrow window before setTimeout assigned it; that's safe because clearTimeout(undefined) is benign in Node.
19
- - proc.kill() is invoked both inside the timeout handler and again in catch — double-kill is usually safe but could be redundant; acceptable.
20
- - If proc.exited already resolved, kill() is a no-op. No dangling promises observed in this snippet.
21
-
22
- 2) P2 (ENH-1+2): tryLlmBatchRoute and applyCachedRouting
23
- - Changes made in src/execution/runner.ts:
24
- - Extracted the LLM batch routing logic into tryLlmBatchRoute that logs and swallows errors and returns early when not applicable.
25
- - Extracted cached-routing override into applyCachedRouting(routing, story, config) and replaced inline blocks with the helper.
26
- - Assessment: Behavior preserved. The helpers are straightforward extractions with identical logic (I compared the code before/after in the diff). applyCachedRouting reproduces the previous override logic for complexity -> modelTier mapping and testStrategy.
27
- - Minor suggestion: applyCachedRouting accepts routing produced by routeTask; that function's return type is used via ReturnType<> which is OK. Consider adding a narrow explicit type alias for clarity in future.
28
-
29
- 3) P3 (TYPE-1): validateRoutingDecision and stripCodeFences
30
- - Changes made in src/routing/strategies/llm.ts:
31
- - Extracted validateRoutingDecision(parsed, config) which validates parsed object fields and returns typed RoutingDecision.
32
- - Extracted stripCodeFences(text) to remove markdown/json code fences.
33
- - parseRoutingResponse now strips fences, JSON.parse once, and validates via validateRoutingDecision.
34
- - parseBatchResponse uses validateRoutingDecision(entry, config) directly instead of re-serializing entry->string->parse again.
35
- - Assessment: Correct and type-safer.
36
- - validateRoutingDecision uses type assertions (as Complexity/TestStrategy/ModelTier) before returning typed fields — appropriate.
37
- - Using direct validation on batch entries avoids unnecessary serialization/parsing and fixes the prior issue where batch entries were re-serialized.
38
- - Minor caution: validateRoutingDecision checks `if (!parsed.complexity || !parsed.modelTier || ...)` — if any field is present but falsy (empty string) it will still be caught; that likely matches intended validation.
39
-
40
- 4) P5 (ENH-3): Removed maxInputTokens from config schema
41
- - Changes made in src/config/schema.ts: removed maxInputTokens from LlmRoutingConfig interface, schema and DEFAULT_CONFIG.
42
- - Assessment: Removal in the three spots shown in diff is correct. I searched the diff for remaining references and did not see other references in the commit diff. (A full repo-wide search was not performed in this review scope.)
43
-
44
- 5) Test mocks: mock spawn kill handlers
45
- - Changes made in test/routing/llm-strategy.test.ts: added kill: () => {} to all 9 mock spawn objects.
46
- - Assessment: Matches the code changes in callLlm that call proc.kill() on timeout and in catch — tests needed mocks to provide a kill function to avoid runtime errors. Good.
47
-
48
- Tests
49
- - I attempted to run tests with `bun test` but the runner in this environment did not complete within the quick-review time window (process appeared to hang). I aborted attempts to repeatedly poll.
50
- - Because tests did not finish here, I could not validate runtime behavior across the suite. Local/CI test run is recommended.
51
-
52
- Grade (per requested areas)
53
- - P1 (BUG-1+2): PASS — timeout handling and process cleanup implemented; no obvious resource leaks or dangling timers.
54
- - P2 (ENH-1+2): PASS — helpers extracted; logic preserved.
55
- - P3 (TYPE-1): PASS — validation and fence-stripping extracted; batch entry validation fixed to avoid re-serialization.
56
- - P5 (ENH-3): PASS (within commit diff) — removed config field; no remaining references in diff.
57
- - Test mocks: PASS — all mocked spawns in the tested file now include kill() handler.
58
-
59
- Recommendations / Follow-ups
60
- - Run the full test suite in CI or locally to confirm all tests pass. The new timeout/process kill behavior and the added kill mocks should make tests stable, but I couldn't confirm here.
61
- - Optional: avoid the redundant double kill by checking the proc.killed/exited state before calling kill(), if the platform provides it; this is not required.
62
- - Optional: add unit tests for validateRoutingDecision and stripCodeFences to assert correct behavior on edge cases (fenced JSON, invalid fields, batch entries).
63
-
64
- Note about test run
65
- - The environment's `bun test` invocation did not complete in this review session; please run tests in CI or locally and paste the last 20 lines of output if you want me to re-check test failures (if any).
66
-
67
- Artifacts
68
- - Saved report at: docs/20260220-review-post-fix-llm-routing.md
69
-
70
- End of report.
@@ -1,101 +0,0 @@
1
- # Fix Plan: Split relevantFiles into contextFiles + expectedFiles
2
-
3
- **Date:** 2026-02-21
4
- **Branch:** `feat/v0.9-relevantfiles-split`
5
- **Issue:** #1
6
- **Base:** `master` (`b459e9f`)
7
-
8
- ## Context
9
-
10
- `relevantFiles` conflates two purposes:
11
- 1. **Context injection** — files loaded into agent prompt before execution
12
- 2. **Asset verification** — files that must exist after execution (pre-flight gate)
13
-
14
- This causes false negatives: LLM-predicted filenames fail asset check even when code is correct and tests pass. Observed in dogfood Runs F and H.
15
-
16
- ## Phase 1: Type Changes + Resolver Functions
17
-
18
- ### Fix 1.1: Add new fields to UserStory type
19
- **File:** `src/prd/types.ts`
20
- **Change:** Add `contextFiles?: string[]` and `expectedFiles?: string[]` to `UserStory` interface. Keep `relevantFiles?: string[]` as deprecated.
21
-
22
- ### Fix 1.2: Add resolver functions
23
- **File:** `src/prd/types.ts` (or new `src/prd/helpers.ts`)
24
- **Change:** Create two helper functions:
25
- ```typescript
26
- export function getContextFiles(story: UserStory): string[] {
27
- return story.contextFiles ?? story.relevantFiles ?? [];
28
- }
29
- export function getExpectedFiles(story: UserStory): string[] {
30
- return story.expectedFiles ?? [];
31
- }
32
- ```
33
- **Key:** `getExpectedFiles` does NOT fall back to `relevantFiles`. Asset check is opt-in only.
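To make the fallback rules concrete, the sketch below repeats the two resolvers against a reduced `UserStory` shape (only the fields relevant here, which is an assumption about the real interface) and runs them on a legacy story and a migrated one.

```typescript
// Reduced shape: only the fields the resolvers read.
interface UserStory {
  id: string;
  relevantFiles?: string[]; // deprecated
  contextFiles?: string[];
  expectedFiles?: string[];
}

function getContextFiles(story: UserStory): string[] {
  return story.contextFiles ?? story.relevantFiles ?? [];
}

function getExpectedFiles(story: UserStory): string[] {
  return story.expectedFiles ?? [];
}

// Legacy PRD entry: only the deprecated field is set.
const legacy: UserStory = { id: "US-001", relevantFiles: ["src/a.ts"] };

// Migrated entry: the new fields take precedence over relevantFiles.
const migrated: UserStory = {
  id: "US-002",
  relevantFiles: ["src/old.ts"],
  contextFiles: ["src/b.ts"],
  expectedFiles: ["src/index.ts"],
};

console.log(getContextFiles(legacy)); // falls back to relevantFiles
console.log(getExpectedFiles(legacy)); // empty: the asset check is opt-in
console.log(getContextFiles(migrated)); // contextFiles wins
console.log(getExpectedFiles(migrated));
```

Because `??` only falls through on `null` or `undefined`, a story with `contextFiles: []` resolves to an empty list and does not fall back to `relevantFiles`.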
34
-
35
- ### Fix 1.3: Export helpers from prd index
36
- **File:** `src/prd/index.ts`
37
- **Change:** Export `getContextFiles` and `getExpectedFiles`.
38
-
39
- **Commit:** `refactor(prd): add contextFiles + expectedFiles types and resolvers`
40
-
41
- ## Phase 2: Wire Context Builder
42
-
43
- ### Fix 2.1: Use getContextFiles in context builder
44
- **File:** `src/context/builder.ts`
45
- **Change:** Replace `currentStory.relevantFiles` with `getContextFiles(currentStory)` at line ~296. Import from prd.
46
-
47
- **Commit:** `refactor(context): use getContextFiles for prompt injection`
48
-
49
- ## Phase 3: Wire Verification
50
-
51
- ### Fix 3.1: Use getExpectedFiles in post-verify
52
- **File:** `src/execution/post-verify.ts`
53
- **Change:** Replace `story.relevantFiles` (line ~73) with `getExpectedFiles(story)`. Import from prd.
54
-
55
- ### Fix 3.2: Update verification function signature
56
- **File:** `src/execution/verification.ts`
57
- **Change:** Rename parameter `relevantFiles` to `expectedFiles` in `runVerification()` and `verifyAssets()` for clarity.
58
-
59
- **Commit:** `refactor(verification): use getExpectedFiles for asset check (opt-in only)`
60
-
61
- ## Phase 4: Wire Analyze + Decompose Output
62
-
63
- ### Fix 4.1: Update classifier output
64
- **File:** `src/analyze/classifier.ts`
65
- **Change:** Map LLM output `relevantFiles` -> `contextFiles` in parsed result.
66
-
67
- ### Fix 4.2: Update analyze types
68
- **File:** `src/analyze/types.ts`
69
- **Change:** Add `contextFiles` field alongside `relevantFiles`.
70
-
71
- ### Fix 4.3: Update decompose prompt
72
- **File:** `src/agents/claude.ts`
73
- **Change:** In decompose prompt (~line 455), rename field 8 from `relevantFiles` to `contextFiles`.
74
-
75
- ### Fix 4.4: Update CLI analyze output
76
- **File:** `src/cli/analyze.ts`
77
- **Change:** Map `relevantFiles` -> `contextFiles` in feature creation output.
78
-
79
- ### Fix 4.5: Update acceptance fix-generator
80
- **File:** `src/acceptance/fix-generator.ts`
81
- **Change:** Replace `relevantFiles: []` with `contextFiles: []`.
82
-
83
- **Commit:** `refactor(analyze): output contextFiles instead of relevantFiles`
84
-
85
- ## Phase 5: Tests
86
-
87
- ### Fix 5.1: Update verification tests
88
- **Change:** Add test: story with `relevantFiles` but no `expectedFiles` -> asset check PASSES. Add test: story with `expectedFiles` set -> asset check verifies those files.
89
-
90
- ### Fix 5.2: Update context builder tests
91
- **Change:** Test `contextFiles` used when present. Test `relevantFiles` fallback. Test empty when neither set.
92
-
93
- ### Fix 5.3: Update classifier/analyze tests
94
- **Change:** Verify output uses `contextFiles` field.
95
-
96
- **Commit:** `test: update tests for contextFiles/expectedFiles split`
97
-
98
- ## Test Strategy
99
- - Mode: test-after
100
- - Run `bun test` after each phase
101
- - All existing test files should continue passing (backward compat)