@nathapp/nax 0.18.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (459)
  1. package/.gitlab-ci.yml +96 -0
  2. package/BRIEF.md +140 -0
  3. package/CHANGELOG.md +60 -0
  4. package/CLAUDE.md +159 -0
  5. package/README.md +373 -0
  6. package/US-007-IMPLEMENTATION.md +139 -0
  7. package/bin/nax.ts +930 -0
  8. package/biome.json +14 -0
  9. package/bun.lock +168 -0
  10. package/bunfig.toml +11 -0
  11. package/docs/20260216-fix-plan-context-review.md +56 -0
  12. package/docs/20260216-relentless-vs-ngent-comparison.md +208 -0
  13. package/docs/20260216-v02-plan.md +136 -0
  14. package/docs/20260216-v02-review.md +685 -0
  15. package/docs/20260217-dogfood-findings.md +56 -0
  16. package/docs/20260217-p2-plus-plan.md +117 -0
  17. package/docs/20260217-partial-fixes-plan.md +62 -0
  18. package/docs/20260217-plan-analyze-spec.md +117 -0
  19. package/docs/20260217-post-impl-review.md +1137 -0
  20. package/docs/20260217-quick-wins-plan.md +66 -0
  21. package/docs/20260217-split-runner-plan.md +75 -0
  22. package/docs/20260217-v03-impl-plan.md +80 -0
  23. package/docs/20260217-v03-post-impl-review.md +589 -0
  24. package/docs/20260217-v04-impl-plan.md +86 -0
  25. package/docs/20260217-v05-post-impl-review.md +850 -0
  26. package/docs/20260217-v06-post-impl-review.md +817 -0
  27. package/docs/20260218-adr003-port-plan.md +151 -0
  28. package/docs/20260218-review-adr003-verification.md +175 -0
  29. package/docs/20260219-fix-plan-bug16-19.md +79 -0
  30. package/docs/20260219-fix-plan-bug20-22.md +114 -0
  31. package/docs/20260219-plan-llm-routing.md +116 -0
  32. package/docs/20260219-review-bug20-22-fixes.md +135 -0
  33. package/docs/20260219-routing-baseline-keyword.md +63 -0
  34. package/docs/20260220-plan-structured-logging-p1.md +80 -0
  35. package/docs/20260220-plan-structured-logging-p2.md +37 -0
  36. package/docs/20260220-review-llm-routing.md +180 -0
  37. package/docs/20260220-review-post-fix-llm-routing.md +70 -0
  38. package/docs/20260221-fix-plan-relevantfiles-split.md +101 -0
  39. package/docs/20260221-fix-plan-routing-mode.md +125 -0
  40. package/docs/20260221-review-v0.9-implementation.md +379 -0
  41. package/docs/20260222-fix-plan-v091-routing-isolation.md +197 -0
  42. package/docs/20260223-fix-plan-prompt-audit.md +62 -0
  43. package/docs/20260224-nax-roadmap-phases.md +189 -0
  44. package/docs/20260225-phase2-llm-service-layer.md +401 -0
  45. package/docs/20260225-review-v0.10.1.md +187 -0
  46. package/docs/20260303-v010-implementation-plan.md +165 -0
  47. package/docs/CLAUDE.md.bak +191 -0
  48. package/docs/ROADMAP.md +165 -0
  49. package/docs/SPEC-rectification.md +0 -0
  50. package/docs/SPEC.md +324 -0
  51. package/docs/US-001-plugin-loading-verification.md +152 -0
  52. package/docs/architecture-analysis.md +1076 -0
  53. package/docs/bugs/BUG-21-escalation-null-attempts.md +48 -0
  54. package/docs/bugs-from-dogfood-run-c.md +243 -0
  55. package/docs/code-review-20260228.md +612 -0
  56. package/docs/code-review-v0.15.0.md +629 -0
  57. package/docs/hook-lifecycle-test-plan.md +149 -0
  58. package/docs/releases/v0.11.0-and-earlier.md +20 -0
  59. package/docs/releases/v0.12.0.md +15 -0
  60. package/docs/releases/v0.13.0.md +14 -0
  61. package/docs/releases/v0.14.0.md +20 -0
  62. package/docs/releases/v0.14.1.md +36 -0
  63. package/docs/releases/v0.14.2.md +51 -0
  64. package/docs/releases/v0.14.3.md +174 -0
  65. package/docs/releases/v0.14.4.md +94 -0
  66. package/docs/releases/v0.15.0.md +502 -0
  67. package/docs/releases/v0.15.1.md +170 -0
  68. package/docs/releases/v0.15.3.md +193 -0
  69. package/docs/specs/status-file-v0.10.1.md +812 -0
  70. package/docs/v0.10-global-config.md +206 -0
  71. package/docs/v0.10-plugin-system.md +415 -0
  72. package/docs/v0.10-prompt-optimizer.md +234 -0
  73. package/docs/v0.3-spec.md +244 -0
  74. package/docs/v0.4-spec.md +140 -0
  75. package/docs/v0.5-spec.md +237 -0
  76. package/docs/v0.6-spec.md +371 -0
  77. package/docs/v0.7-spec.md +177 -0
  78. package/docs/v0.8-llm-routing.md +206 -0
  79. package/docs/v0.8-structured-logging.md +132 -0
  80. package/docs/v0.9.3-prompt-audit.md +112 -0
  81. package/examples/plugins/console-reporter/index.test.ts +207 -0
  82. package/examples/plugins/console-reporter/index.ts +110 -0
  83. package/nax/config.json +147 -0
  84. package/nax/features/bugfix-v0171/prd.json +52 -0
  85. package/nax/features/config-management/prd.json +108 -0
  86. package/nax/features/config-management/progress.txt +5 -0
  87. package/nax/features/diagnose/acceptance.test.ts +412 -0
  88. package/nax/features/diagnose/prd.json +41 -0
  89. package/nax/features/orchestration-fixes/prd.json +89 -0
  90. package/nax/features/orchestration-fixes/progress.txt +1 -0
  91. package/nax/features/plugin-integration/US-007-VERIFICATION.md +259 -0
  92. package/nax/features/plugin-integration/prd.json +208 -0
  93. package/nax/features/plugin-integration/progress.txt +5 -0
  94. package/nax/features/precheck/prd.json +205 -0
  95. package/nax/features/precheck/progress.txt +15 -0
  96. package/nax/features/structured-logging/prd.json +199 -0
  97. package/nax/features/unlock/prd.json +36 -0
  98. package/package.json +47 -0
  99. package/src/acceptance/fix-generator.ts +348 -0
  100. package/src/acceptance/generator.ts +282 -0
  101. package/src/acceptance/index.ts +30 -0
  102. package/src/acceptance/types.ts +79 -0
  103. package/src/agents/claude-decompose.ts +169 -0
  104. package/src/agents/claude-plan.ts +139 -0
  105. package/src/agents/claude.ts +324 -0
  106. package/src/agents/cost.ts +268 -0
  107. package/src/agents/index.ts +13 -0
  108. package/src/agents/registry.ts +48 -0
  109. package/src/agents/types-extended.ts +133 -0
  110. package/src/agents/types.ts +113 -0
  111. package/src/agents/validation.ts +69 -0
  112. package/src/analyze/classifier.ts +305 -0
  113. package/src/analyze/index.ts +16 -0
  114. package/src/analyze/scanner.ts +175 -0
  115. package/src/analyze/types.ts +51 -0
  116. package/src/cli/accept.ts +108 -0
  117. package/src/cli/analyze-parser.ts +284 -0
  118. package/src/cli/analyze.ts +207 -0
  119. package/src/cli/config.ts +561 -0
  120. package/src/cli/constitution.ts +109 -0
  121. package/src/cli/diagnose-analysis.ts +159 -0
  122. package/src/cli/diagnose-formatter.ts +87 -0
  123. package/src/cli/diagnose.ts +203 -0
  124. package/src/cli/generate.ts +127 -0
  125. package/src/cli/index.ts +37 -0
  126. package/src/cli/init.ts +188 -0
  127. package/src/cli/interact.ts +295 -0
  128. package/src/cli/plan.ts +198 -0
  129. package/src/cli/plugins.ts +111 -0
  130. package/src/cli/prompts.ts +295 -0
  131. package/src/cli/runs.ts +174 -0
  132. package/src/cli/status-cost.ts +151 -0
  133. package/src/cli/status-features.ts +338 -0
  134. package/src/cli/status.ts +13 -0
  135. package/src/commands/common.ts +171 -0
  136. package/src/commands/diagnose.ts +17 -0
  137. package/src/commands/index.ts +8 -0
  138. package/src/commands/logs.ts +384 -0
  139. package/src/commands/precheck.ts +86 -0
  140. package/src/commands/unlock.ts +96 -0
  141. package/src/config/defaults.ts +160 -0
  142. package/src/config/index.ts +22 -0
  143. package/src/config/loader.ts +121 -0
  144. package/src/config/merger.ts +147 -0
  145. package/src/config/path-security.ts +121 -0
  146. package/src/config/paths.ts +27 -0
  147. package/src/config/schema.ts +56 -0
  148. package/src/config/schemas.ts +286 -0
  149. package/src/config/types.ts +423 -0
  150. package/src/config/validate.ts +103 -0
  151. package/src/constitution/generator.ts +191 -0
  152. package/src/constitution/generators/aider.ts +41 -0
  153. package/src/constitution/generators/claude.ts +35 -0
  154. package/src/constitution/generators/cursor.ts +36 -0
  155. package/src/constitution/generators/opencode.ts +38 -0
  156. package/src/constitution/generators/types.ts +33 -0
  157. package/src/constitution/generators/windsurf.ts +36 -0
  158. package/src/constitution/index.ts +10 -0
  159. package/src/constitution/loader.ts +133 -0
  160. package/src/constitution/types.ts +31 -0
  161. package/src/context/auto-detect.ts +227 -0
  162. package/src/context/builder.ts +246 -0
  163. package/src/context/elements.ts +83 -0
  164. package/src/context/formatter.ts +107 -0
  165. package/src/context/generator.ts +129 -0
  166. package/src/context/generators/aider.ts +34 -0
  167. package/src/context/generators/claude.ts +28 -0
  168. package/src/context/generators/cursor.ts +28 -0
  169. package/src/context/generators/opencode.ts +30 -0
  170. package/src/context/generators/windsurf.ts +28 -0
  171. package/src/context/greenfield.ts +114 -0
  172. package/src/context/index.ts +33 -0
  173. package/src/context/injector.ts +279 -0
  174. package/src/context/test-scanner.ts +370 -0
  175. package/src/context/types.ts +98 -0
  176. package/src/errors.ts +67 -0
  177. package/src/execution/batching.ts +157 -0
  178. package/src/execution/crash-recovery.ts +373 -0
  179. package/src/execution/escalation/escalation.ts +44 -0
  180. package/src/execution/escalation/index.ts +13 -0
  181. package/src/execution/escalation/tier-escalation.ts +295 -0
  182. package/src/execution/escalation/tier-outcome.ts +158 -0
  183. package/src/execution/helpers.ts +38 -0
  184. package/src/execution/index.ts +45 -0
  185. package/src/execution/lifecycle/acceptance-loop.ts +272 -0
  186. package/src/execution/lifecycle/headless-formatter.ts +85 -0
  187. package/src/execution/lifecycle/index.ts +12 -0
  188. package/src/execution/lifecycle/parallel-lifecycle.ts +101 -0
  189. package/src/execution/lifecycle/precheck-runner.ts +140 -0
  190. package/src/execution/lifecycle/run-cleanup.ts +81 -0
  191. package/src/execution/lifecycle/run-completion.ts +129 -0
  192. package/src/execution/lifecycle/run-initialization.ts +141 -0
  193. package/src/execution/lifecycle/run-lifecycle.ts +312 -0
  194. package/src/execution/lifecycle/run-setup.ts +204 -0
  195. package/src/execution/lifecycle/story-hooks.ts +38 -0
  196. package/src/execution/lifecycle/story-size-prompts.ts +123 -0
  197. package/src/execution/lock.ts +115 -0
  198. package/src/execution/parallel-executor.ts +216 -0
  199. package/src/execution/parallel.ts +400 -0
  200. package/src/execution/pid-registry.ts +280 -0
  201. package/src/execution/pipeline-result-handler.ts +388 -0
  202. package/src/execution/post-verify-rectification.ts +188 -0
  203. package/src/execution/post-verify.ts +274 -0
  204. package/src/execution/progress.ts +25 -0
  205. package/src/execution/prompts.ts +127 -0
  206. package/src/execution/queue-handler.ts +109 -0
  207. package/src/execution/rectification.ts +13 -0
  208. package/src/execution/runner.ts +377 -0
  209. package/src/execution/sequential-executor.ts +388 -0
  210. package/src/execution/status-file.ts +264 -0
  211. package/src/execution/status-writer.ts +139 -0
  212. package/src/execution/story-context.ts +229 -0
  213. package/src/execution/test-output-parser.ts +14 -0
  214. package/src/execution/verification.ts +72 -0
  215. package/src/hooks/index.ts +2 -0
  216. package/src/hooks/runner.ts +286 -0
  217. package/src/hooks/types.ts +67 -0
  218. package/src/interaction/chain.ts +154 -0
  219. package/src/interaction/index.ts +60 -0
  220. package/src/interaction/init.ts +83 -0
  221. package/src/interaction/plugins/auto.ts +217 -0
  222. package/src/interaction/plugins/cli.ts +300 -0
  223. package/src/interaction/plugins/telegram.ts +384 -0
  224. package/src/interaction/plugins/webhook.ts +258 -0
  225. package/src/interaction/state.ts +171 -0
  226. package/src/interaction/triggers.ts +229 -0
  227. package/src/interaction/types.ts +163 -0
  228. package/src/logger/formatters.ts +84 -0
  229. package/src/logger/index.ts +16 -0
  230. package/src/logger/logger.ts +298 -0
  231. package/src/logger/types.ts +48 -0
  232. package/src/logging/formatter.ts +355 -0
  233. package/src/logging/index.ts +22 -0
  234. package/src/logging/types.ts +93 -0
  235. package/src/metrics/aggregator.ts +190 -0
  236. package/src/metrics/index.ts +14 -0
  237. package/src/metrics/tracker.ts +200 -0
  238. package/src/metrics/types.ts +109 -0
  239. package/src/optimizer/index.ts +62 -0
  240. package/src/optimizer/noop.optimizer.ts +24 -0
  241. package/src/optimizer/rule-based.optimizer.ts +248 -0
  242. package/src/optimizer/types.ts +53 -0
  243. package/src/pipeline/events.ts +130 -0
  244. package/src/pipeline/index.ts +19 -0
  245. package/src/pipeline/runner.ts +161 -0
  246. package/src/pipeline/stages/acceptance.ts +197 -0
  247. package/src/pipeline/stages/completion.ts +99 -0
  248. package/src/pipeline/stages/constitution.ts +63 -0
  249. package/src/pipeline/stages/context.ts +117 -0
  250. package/src/pipeline/stages/execution.ts +194 -0
  251. package/src/pipeline/stages/index.ts +62 -0
  252. package/src/pipeline/stages/optimizer.ts +74 -0
  253. package/src/pipeline/stages/prompt.ts +57 -0
  254. package/src/pipeline/stages/queue-check.ts +103 -0
  255. package/src/pipeline/stages/review.ts +181 -0
  256. package/src/pipeline/stages/routing.ts +81 -0
  257. package/src/pipeline/stages/verify.ts +100 -0
  258. package/src/pipeline/types.ts +167 -0
  259. package/src/plugins/index.ts +31 -0
  260. package/src/plugins/loader.ts +287 -0
  261. package/src/plugins/registry.ts +168 -0
  262. package/src/plugins/types.ts +327 -0
  263. package/src/plugins/validator.ts +352 -0
  264. package/src/prd/index.ts +172 -0
  265. package/src/prd/types.ts +202 -0
  266. package/src/precheck/checks-blockers.ts +391 -0
  267. package/src/precheck/checks-warnings.ts +142 -0
  268. package/src/precheck/checks.ts +30 -0
  269. package/src/precheck/index.ts +247 -0
  270. package/src/precheck/story-size-gate.ts +144 -0
  271. package/src/precheck/types.ts +31 -0
  272. package/src/queue/index.ts +2 -0
  273. package/src/queue/manager.ts +254 -0
  274. package/src/queue/types.ts +54 -0
  275. package/src/review/index.ts +8 -0
  276. package/src/review/runner.ts +172 -0
  277. package/src/review/types.ts +66 -0
  278. package/src/routing/builder.ts +81 -0
  279. package/src/routing/chain.ts +74 -0
  280. package/src/routing/index.ts +16 -0
  281. package/src/routing/loader.ts +58 -0
  282. package/src/routing/router.ts +303 -0
  283. package/src/routing/strategies/adaptive.ts +215 -0
  284. package/src/routing/strategies/index.ts +8 -0
  285. package/src/routing/strategies/keyword.ts +163 -0
  286. package/src/routing/strategies/llm-prompts.ts +209 -0
  287. package/src/routing/strategies/llm.ts +235 -0
  288. package/src/routing/strategies/manual.ts +50 -0
  289. package/src/routing/strategy.ts +99 -0
  290. package/src/tdd/cleanup.ts +111 -0
  291. package/src/tdd/index.ts +23 -0
  292. package/src/tdd/isolation.ts +123 -0
  293. package/src/tdd/orchestrator.ts +383 -0
  294. package/src/tdd/prompts.ts +270 -0
  295. package/src/tdd/rectification-gate.ts +183 -0
  296. package/src/tdd/session-runner.ts +179 -0
  297. package/src/tdd/types.ts +81 -0
  298. package/src/tdd/verdict.ts +271 -0
  299. package/src/tui/App.tsx +265 -0
  300. package/src/tui/components/AgentPanel.tsx +75 -0
  301. package/src/tui/components/CostOverlay.tsx +118 -0
  302. package/src/tui/components/HelpOverlay.tsx +107 -0
  303. package/src/tui/components/StatusBar.tsx +63 -0
  304. package/src/tui/components/StoriesPanel.tsx +177 -0
  305. package/src/tui/hooks/useKeyboard.ts +142 -0
  306. package/src/tui/hooks/useLayout.ts +137 -0
  307. package/src/tui/hooks/usePipelineEvents.ts +183 -0
  308. package/src/tui/hooks/usePty.ts +194 -0
  309. package/src/tui/index.tsx +38 -0
  310. package/src/tui/types.ts +76 -0
  311. package/src/utils/git.ts +83 -0
  312. package/src/utils/queue-writer.ts +54 -0
  313. package/src/verification/executor.ts +235 -0
  314. package/src/verification/gate.ts +207 -0
  315. package/src/verification/index.ts +12 -0
  316. package/src/verification/parser.ts +230 -0
  317. package/src/verification/rectification.ts +108 -0
  318. package/src/verification/types.ts +113 -0
  319. package/src/worktree/dispatcher.ts +65 -0
  320. package/src/worktree/index.ts +2 -0
  321. package/src/worktree/manager.ts +187 -0
  322. package/src/worktree/merge.ts +301 -0
  323. package/src/worktree/types.ts +4 -0
  324. package/test/TEST_COVERAGE_US001.md +217 -0
  325. package/test/TEST_COVERAGE_US003.md +84 -0
  326. package/test/TEST_COVERAGE_US005.md +86 -0
  327. package/test/US-002-orchestrator.test.ts +246 -0
  328. package/test/acceptance/cm-003-default-view.test.ts +194 -0
  329. package/test/execution/pid-registry.test.ts +240 -0
  330. package/test/execution/post-verify.test.ts +224 -0
  331. package/test/helpers/timeout.ts +42 -0
  332. package/test/integration/US-002-TEST-SUMMARY.md +107 -0
  333. package/test/integration/US-003-TEST-SUMMARY.md +149 -0
  334. package/test/integration/US-004-TEST-SUMMARY.md +106 -0
  335. package/test/integration/US-005-TEST-SUMMARY.md +138 -0
  336. package/test/integration/US-007-TEST-SUMMARY.md +100 -0
  337. package/test/integration/agent-validation.test.ts +439 -0
  338. package/test/integration/analyze-integration.test.ts +261 -0
  339. package/test/integration/analyze-scanner.test.ts +131 -0
  340. package/test/integration/cli-config-default-edge-cases.test.ts +222 -0
  341. package/test/integration/cli-config-default-view.test.ts +229 -0
  342. package/test/integration/cli-config-diff.test.ts +460 -0
  343. package/test/integration/cli-config.test.ts +736 -0
  344. package/test/integration/cli-diagnose.test.ts +592 -0
  345. package/test/integration/cli-logs.test.ts +314 -0
  346. package/test/integration/cli-plugins.test.ts +678 -0
  347. package/test/integration/cli-precheck.test.ts +371 -0
  348. package/test/integration/cli-run-headless.test.ts +173 -0
  349. package/test/integration/cli.test.ts +75 -0
  350. package/test/integration/config/merger.test.ts +465 -0
  351. package/test/integration/config/paths.test.ts +51 -0
  352. package/test/integration/config-loader.test.ts +265 -0
  353. package/test/integration/config.test.ts +444 -0
  354. package/test/integration/context-integration.test.ts +702 -0
  355. package/test/integration/context-provider-injection.test.ts +506 -0
  356. package/test/integration/context-verification-integration.test.ts +295 -0
  357. package/test/integration/e2e.test.ts +896 -0
  358. package/test/integration/execution.test.ts +625 -0
  359. package/test/integration/helpers.test.ts +295 -0
  360. package/test/integration/hooks.test.ts +361 -0
  361. package/test/integration/interaction-chain-pipeline.test.ts +464 -0
  362. package/test/integration/isolation.test.ts +143 -0
  363. package/test/integration/logger.test.ts +461 -0
  364. package/test/integration/parallel.test.ts +250 -0
  365. package/test/integration/path-security.test.ts +173 -0
  366. package/test/integration/pipeline-acceptance.test.ts +302 -0
  367. package/test/integration/pipeline-events.test.ts +475 -0
  368. package/test/integration/pipeline.test.ts +658 -0
  369. package/test/integration/plan.test.ts +157 -0
  370. package/test/integration/plugin-routing.test.ts +921 -0
  371. package/test/integration/plugins/config-integration.test.ts +172 -0
  372. package/test/integration/plugins/config-resolution.test.ts +522 -0
  373. package/test/integration/plugins/loader.test.ts +641 -0
  374. package/test/integration/plugins/registry.test.ts +746 -0
  375. package/test/integration/plugins/validator.test.ts +563 -0
  376. package/test/integration/prd-pause.test.ts +205 -0
  377. package/test/integration/prd-resolvers.test.ts +185 -0
  378. package/test/integration/precheck-integration.test.ts +468 -0
  379. package/test/integration/precheck.test.ts +805 -0
  380. package/test/integration/progress.test.ts +34 -0
  381. package/test/integration/rectification-flow.test.ts +512 -0
  382. package/test/integration/reporter-lifecycle.test.ts +860 -0
  383. package/test/integration/review-config-commands.test.ts +319 -0
  384. package/test/integration/review-config-schema.test.ts +116 -0
  385. package/test/integration/review-plugin-integration.test.ts +722 -0
  386. package/test/integration/review.test.ts +149 -0
  387. package/test/integration/routing-stage-bug-021.test.ts +274 -0
  388. package/test/integration/routing-stage-greenfield.test.ts +286 -0
  389. package/test/integration/runner-config-plugins.test.ts +461 -0
  390. package/test/integration/runner-fixes.test.ts +399 -0
  391. package/test/integration/runner-plugin-integration.test.ts +543 -0
  392. package/test/integration/runner.test.ts +1679 -0
  393. package/test/integration/s5-greenfield-fallback.test.ts +297 -0
  394. package/test/integration/status-file-integration.test.ts +325 -0
  395. package/test/integration/status-file.test.ts +379 -0
  396. package/test/integration/status-writer.test.ts +345 -0
  397. package/test/integration/story-id-in-events.test.ts +273 -0
  398. package/test/integration/tdd-cleanup.test.ts +246 -0
  399. package/test/integration/tdd-orchestrator.test.ts +1762 -0
  400. package/test/integration/test-scanner.test.ts +403 -0
  401. package/test/integration/verification-asset-check.test.ts +142 -0
  402. package/test/integration/verify-stage.test.ts +275 -0
  403. package/test/integration/worktree/manager.test.ts +218 -0
  404. package/test/integration/worktree/merge.test.ts +341 -0
  405. package/test/manual/logging-formatter-demo.ts +158 -0
  406. package/test/ui/tui-agent-panel.test.tsx +99 -0
  407. package/test/ui/tui-controls.test.ts +334 -0
  408. package/test/ui/tui-cost-and-pty.test.ts +189 -0
  409. package/test/ui/tui-layout.test.ts +378 -0
  410. package/test/ui/tui-pty-integration.test.tsx +159 -0
  411. package/test/ui/tui-stories.test.ts +332 -0
  412. package/test/unit/acceptance.test.ts +186 -0
  413. package/test/unit/agent-stderr-capture.test.ts +146 -0
  414. package/test/unit/analyze-classifier.test.ts +215 -0
  415. package/test/unit/analyze.test.ts +224 -0
  416. package/test/unit/auto-detect.test.ts +249 -0
  417. package/test/unit/cli-status.test.ts +417 -0
  418. package/test/unit/commands/common.test.ts +320 -0
  419. package/test/unit/commands/logs.test.ts +416 -0
  420. package/test/unit/commands/unlock.test.ts +319 -0
  421. package/test/unit/constitution-generators.test.ts +160 -0
  422. package/test/unit/constitution.test.ts +209 -0
  423. package/test/unit/context.test.ts +1722 -0
  424. package/test/unit/cost.test.ts +231 -0
  425. package/test/unit/crash-recovery.test.ts +308 -0
  426. package/test/unit/escalation.test.ts +126 -0
  427. package/test/unit/execution-logging-stderr.test.ts +156 -0
  428. package/test/unit/execution-stage.test.ts +122 -0
  429. package/test/unit/fix-generator.test.ts +275 -0
  430. package/test/unit/formatters.test.ts +469 -0
  431. package/test/unit/greenfield.test.ts +179 -0
  432. package/test/unit/helpers.test.ts +317 -0
  433. package/test/unit/interaction/human-review-trigger.test.ts +164 -0
  434. package/test/unit/interaction-network-failures.test.ts +389 -0
  435. package/test/unit/interaction-plugins.test.ts +164 -0
  436. package/test/unit/isolation.test.ts +134 -0
  437. package/test/unit/logging/formatter.test.ts +455 -0
  438. package/test/unit/merge.test.ts +268 -0
  439. package/test/unit/metrics.test.ts +276 -0
  440. package/test/unit/optimizer/noop.optimizer.test.ts +125 -0
  441. package/test/unit/optimizer/rule-based.optimizer.test.ts +358 -0
  442. package/test/unit/prd-auto-default.test.ts +290 -0
  443. package/test/unit/prd-failure-category.test.ts +176 -0
  444. package/test/unit/prd-get-next-story.test.ts +186 -0
  445. package/test/unit/precheck-checks.test.ts +840 -0
  446. package/test/unit/precheck-story-size-gate.test.ts +287 -0
  447. package/test/unit/precheck-types.test.ts +142 -0
  448. package/test/unit/prompts.test.ts +475 -0
  449. package/test/unit/queue.test.ts +237 -0
  450. package/test/unit/rectification.test.ts +284 -0
  451. package/test/unit/registry.test.ts +287 -0
  452. package/test/unit/routing.test.ts +937 -0
  453. package/test/unit/run-lifecycle.test.ts +140 -0
  454. package/test/unit/storyid-events.test.ts +224 -0
  455. package/test/unit/tdd-verdict.test.ts +492 -0
  456. package/test/unit/test-output-parser.test.ts +377 -0
  457. package/test/unit/verdict.test.ts +324 -0
  458. package/test/unit/worktree-manager.test.ts +158 -0
  459. package/tsconfig.json +27 -0
@@ -0,0 +1,1137 @@
1
+ # Deep Code Review: ngent v0.1.0
2
+
3
+ **Date:** 2026-02-17
4
+ **Reviewer:** Subrina (AI Code Reviewer)
5
+ **Version:** 0.1.0
6
+ **Files:** 31 source files (~3310 LOC), 12 test files (~3492 LOC)
7
+ **Test Status:** 156 tests passing, 0 failing
8
+ **TypeScript:** ✓ No type errors
9
+
10
+ ---
11
+
12
+ ## Overall Grade: B+ (82/100)
13
+
14
+ **Summary:**
15
+
16
+ ngent is a well-architected CLI orchestrator with strong TDD principles, clean separation of concerns, and thoughtful complexity routing. The codebase demonstrates solid TypeScript practices with comprehensive type safety, good test coverage (156 tests), and clear module boundaries. Major strengths include the three-session TDD isolation enforcement, configurable model escalation, and the context builder's defensive programming.
17
+
18
+ However, several HIGH and MEDIUM priority issues prevent this from reaching production-ready status: the agent execution layer is stubbed (marked with TODOs), command injection vulnerabilities exist in hook execution, error handling lacks specificity in failure scenarios, and the cost estimation relies on brittle regex parsing. The batch execution logic is complex (700+ LOC in runner.ts) and would benefit from refactoring. Memory management for large PRDs is unaddressed.
19
+
20
+ **Grade Breakdown:**
21
+
22
+ | Dimension | Score | Notes |
23
+ |:---|:---|:---|
24
+ | **Security** | 14/20 | Command injection risk in hooks, no input sanitization for shell commands |
25
+ | **Reliability** | 16/20 | Good error boundaries, but lacks agent timeout recovery, memory limits |
26
+ | **API Design** | 18/20 | Clean interfaces, good TypeScript usage, barrel exports, minor inconsistencies |
27
+ | **Code Quality** | 18/20 | Well-organized, clear naming, but runner.ts is 779 LOC (needs splitting) |
28
+ | **Best Practices** | 16/20 | Strong TDD patterns, good config layering, missing JSDoc, incomplete agent impl |
29
+
30
+ ---
31
+
32
+ ## Findings
33
+
34
+ ### 🔴 CRITICAL
35
+
36
+ #### SEC-1: Command Injection Vulnerability in Hook Execution
37
+ **Severity:** CRITICAL | **Category:** Security
38
+
39
+ **Location:** `src/hooks/runner.ts:73-79`
40
+
41
+ ```typescript
42
+ const proc = Bun.spawn(["bash", "-c", hookDef.command], {
43
+ cwd: workdir,
44
+ stdin: new Response(contextJson),
45
+ stdout: "pipe",
46
+ stderr: "pipe",
47
+ env: { ...process.env, ...env },
48
+ });
49
+ ```
50
+
51
+ **Risk:** Hook commands are executed via `bash -c` with no sanitization. If `hooks.json` is compromised or user-supplied (even indirectly), an attacker can execute arbitrary shell commands. Environment variables from `buildEnv()` are interpolated into shell commands, creating additional injection vectors.
52
+
53
+ **Attack Scenario:**
54
+ ```json
55
+ {
56
+ "hooks": {
57
+ "on-start": {
58
+ "command": "echo 'Starting'; rm -rf / #",
59
+ "enabled": true
60
+ }
61
+ }
62
+ }
63
+ ```
64
+
65
+ **Fix:**
66
+ 1. Validate hook commands against an allowlist of safe commands/patterns
67
+ 2. Never use `bash -c` — use direct command execution with argv array
68
+ 3. Escape/quote all environment variables before shell interpolation
69
+ 4. Consider restricting hooks to script files (not inline commands)
70
+ 5. Add a security warning in documentation about hook command safety
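
Fix items 1 and 2 above could look roughly like the following. This is a minimal sketch, not the project's code: `parseHookCommand`, `ALLOWED_HOOK_BINARIES`, and the allowlist entries are hypothetical names chosen for illustration. It rejects any command containing shell metacharacters and returns an argv array that can be passed to a spawn call directly, so no shell ever interprets the string.

```typescript
// Hypothetical allowlist of executables that hooks may invoke (assumption).
const ALLOWED_HOOK_BINARIES = new Set(["echo", "node", "bun"]);

// Characters that only have meaning to a shell. Reject them instead of escaping.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\\n'"*?~#!]/;

// Turn a hook command string into an argv array, or throw if it is unsafe.
function parseHookCommand(command: string): string[] {
  if (SHELL_METACHARACTERS.test(command)) {
    throw new Error(`Hook command contains shell metacharacters: ${command}`);
  }
  const argv = command.trim().split(/\s+/).filter((part) => part.length > 0);
  if (argv.length === 0) {
    throw new Error("Hook command is empty");
  }
  if (!ALLOWED_HOOK_BINARIES.has(argv[0])) {
    throw new Error(`Hook binary is not on the allowlist: ${argv[0]}`);
  }
  return argv;
}

// Intended use (not executed here): pass the array straight to the spawn call
// in place of ["bash", "-c", hookDef.command], for example
//   Bun.spawn(parseHookCommand(hookDef.command), { cwd: workdir, ... })
```

The trade-off is that hooks lose pipes, redirects, and quoting. Anything that needs those would have to live in a script file, which matches fix item 4.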
71
+
72
+ **Priority:** P0 — Must fix before v1.0 or any production use
73
+
74
+ ---
75
+
76
+ #### BUG-1: Agent Execution Not Implemented
77
+ **Severity:** CRITICAL | **Category:** Bug
78
+
79
+ **Location:** `src/agents/claude.ts:33-83`, `src/execution/runner.ts:578`
80
+
81
+ The core functionality (actually spawning agent sessions) is implemented but **untested in production scenarios**. The `ClaudeCodeAdapter.run()` method spawns the `claude` binary, but:
82
+
83
+ 1. No validation that `claude` binary is actually installed before use
84
+ 2. No retry logic for transient failures (network, API errors)
85
+ 3. Timeout handling kills the process but doesn't distinguish a timeout from a crash
86
+ 4. Rate limit detection is heuristic (string matching in stderr) — brittle
87
+ 5. Cost estimation falls back to duration-based guessing (inaccurate)
88
+
89
+ **Risk:**
90
+ - Silent failures in production (agent not installed, binary path wrong)
91
+ - Cost tracking inaccurate (budget overruns)
92
+ - Rate limits not handled correctly (infinite loop or premature abort)
93
+
94
+ **Fix:**
95
+ 1. Check `agent.isInstalled()` before run() and fail fast with clear error
96
+ 2. Add retry logic with exponential backoff for transient failures
97
+ 3. Improve rate limit detection (parse structured error responses)
98
+ 4. Improve cost estimation (parse token usage from structured output, not regex)
99
+ 5. Add integration tests with real agent (or mock agent binary)
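
Fix item 2 (retry with exponential backoff) can be sketched as below. The `AgentResult` shape, the `retryable` flag, and the default delays are assumptions made for this example and are not taken from the ngent source. The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting.

```typescript
// Hypothetical result shape; the real adapter's return type may differ.
type AgentResult = { success: boolean; retryable: boolean; output: string };

// Delay before the next attempt: base * 2^attempt, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run the agent up to maxAttempts times, retrying only retryable failures.
async function runWithRetry(
  run: () => Promise<AgentResult>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<AgentResult> {
  let last: AgentResult = { success: false, retryable: false, output: "not run" };
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    last = await run();
    if (last.success || !last.retryable) {
      return last; // success, or a failure that retrying will not fix
    }
    if (attempt < maxAttempts - 1) {
      await sleep(backoffDelayMs(attempt));
    }
  }
  return last;
}
```

Whether a failure is retryable would still depend on fix item 3, since the sketch assumes the caller can classify rate limits and network errors reliably.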
100
+
101
+ **Priority:** P0 — Core functionality, blocks real-world usage
102
+
103
+ ---
104
+
105
+ ### 🟠 HIGH
106
+
107
+ #### SEC-2: Path Traversal Risk in File Operations
108
+ **Severity:** HIGH | **Category:** Security
109
+
110
+ **Location:** `bin/ngent.ts:37-80`, `src/config/loader.ts:19-31`
111
+
112
+ Multiple file operations use user-supplied paths without validation:
113
+
114
+ ```typescript
115
+ // bin/ngent.ts:37
116
+ const ngentDir = join(options.dir, "ngent");
117
+ // No validation that options.dir is within safe bounds
118
+
119
+ // src/config/loader.ts:23
120
+ const candidate = join(dir, "ngent");
121
+ // Walks up filesystem without bounds checking
122
+ ```
123
+
124
+ **Risk:**
125
+ - User could pass `--dir /etc` and initialize ngent in system directories
126
+ - `findProjectDir()` walks up to filesystem root without limit (DoS potential)
127
+ - Malicious PRD paths could reference files outside project directory
128
+
129
+ **Fix:**
130
+ 1. Validate `--dir` is within user's home directory or workspace
131
+ 2. Add max depth limit to `findProjectDir()` (e.g., 10 levels)
132
+ 3. Resolve all paths with `path.resolve()` and check bounds
133
+ 4. Add `realpath` checks to detect symlink escapes
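
Fix items 2 and 3 can be sketched as follows, assuming POSIX-style paths. The helper names, the `exists` callback, and the choice to return the directory that contains `ngent` are assumptions for illustration. The actual `findProjectDir()` signature is not shown in this review. Symlink detection (fix item 4) is not covered and would need an additional `realpath` step.

```typescript
import * as path from "node:path";

// True if target resolves to base itself or to a location underneath it.
function isWithin(base: string, target: string): boolean {
  const rel = path.relative(path.resolve(base), path.resolve(target));
  if (rel === "") return true;
  const escapesBase = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapesBase && !path.isAbsolute(rel);
}

// Walk upward looking for an "ngent" directory, giving up after maxDepth parents.
// The exists callback is injected so the walk can be tested without a filesystem.
function findProjectDirBounded(
  startDir: string,
  exists: (candidate: string) => boolean,
  maxDepth = 10,
): string | null {
  let dir = path.resolve(startDir);
  for (let depth = 0; depth <= maxDepth; depth++) {
    if (exists(path.join(dir, "ngent"))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
  return null; // depth limit hit before anything was found
}
```

A caller would check `isWithin(workspaceRoot, options.dir)` before initializing, where `workspaceRoot` is whatever boundary the project settles on for fix item 1.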
134
+
135
+ **Priority:** P0 — Security boundary violation
136
+
137
+ ---
138
+
139
+ #### BUG-2: Race Condition in Queue File Handling
140
+ **Severity:** HIGH | **Category:** Bug
141
+
142
+ **Location:** `src/execution/runner.ts:414-481`, `src/execution/runner.ts:632-680`
143
+
144
+ The queue file is read, parsed, and cleared at two points in the loop:
145
+ 1. Before batch execution (line 415)
146
+ 2. After story completion (line 633)
147
+
148
+ **Race Condition:**
149
+ - If user writes to `.queue.txt` between read and clear, commands are lost
150
+ - Concurrent ngent runs (if ever supported) would conflict on `.queue.txt`
151
+ - No atomic file operations (read-modify-clear should be transactional)
152
+
153
+ **Risk:**
154
+ - User's PAUSE/SKIP commands silently ignored
155
+ - Unpredictable behavior if file modified during execution
156
+
157
+ **Fix:**
158
+ 1. Use atomic file operations (read+rename or file locking)
159
+ 2. Add sequence number or timestamp to detect file changes
160
+ 3. Document that `.queue.txt` is not safe for concurrent writes
161
+ 4. Consider using a proper queue (SQLite, message queue)
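
Fix item 1 (read plus rename) could be approached as in the sketch below. `claimQueue` and the claimed-file naming scheme are hypothetical. The idea is to rename the queue file to a private name first and only then read it, so a command written after the rename lands in a fresh `.queue.txt` and is picked up on the next pass instead of being cleared. A writer that keeps an old file handle open across the rename could still lose data, so this narrows the race window and does not close it entirely.

```typescript
import { readFileSync, renameSync, rmSync } from "node:fs";

// Atomically take ownership of the queue file, then read and delete the private copy.
// Returns the non-empty, trimmed command lines, or [] if there was no queue file.
function claimQueue(queuePath: string): string[] {
  const claimedPath = `${queuePath}.${process.pid}.${Date.now()}.claimed`;
  try {
    renameSync(queuePath, claimedPath); // atomic on the same filesystem
  } catch (err) {
    if ((err as { code?: string }).code === "ENOENT") {
      return []; // nothing queued
    }
    throw err;
  }
  const lines = readFileSync(claimedPath, "utf8")
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  rmSync(claimedPath);
  return lines;
}
```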
162
+
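The rename-based variant of fix 1 might look like the sketch below. It assumes the queue file and its temporary name live on the same filesystem (where `rename` is atomic on POSIX systems); `drainQueueFile` and the `.processing` suffix are hypothetical. A writer that keeps the file open across the rename could still lose data, so this narrows the race rather than eliminating it; that remaining case would need locking.

```typescript
import {
  existsSync,
  mkdtempSync,
  readFileSync,
  renameSync,
  unlinkSync,
  writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Claim the queue file by renaming it, then read the private copy.
 * Commands written after the rename land in a fresh file and are
 * picked up by the next drain instead of being cleared unread.
 */
export function drainQueueFile(queuePath: string): string {
  const claimed = `${queuePath}.${process.pid}.${Date.now()}.processing`;
  try {
    renameSync(queuePath, claimed);
  } catch (err) {
    // No queue file means there is nothing to process.
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return "";
    throw err;
  }
  try {
    return readFileSync(claimed, "utf8");
  } finally {
    unlinkSync(claimed); // remove the claimed copy once it has been read
  }
}

// Usage: write two commands, drain them, then check the original file is gone.
const queueDir = mkdtempSync(join(tmpdir(), "queue-"));
const queuePath = join(queueDir, ".queue.txt");
writeFileSync(queuePath, "PAUSE\nSKIP US-002\n");
const drained = drainQueueFile(queuePath);
const stillThere = existsSync(queuePath);
```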
163
+ **Priority:** P1 — Impacts user control flow reliability
164
+
165
+ ---
166
+
167
+ #### MEM-1: Unbounded Memory Growth for Large PRDs
168
+ **Severity:** HIGH | **Category:** Memory
169
+
170
+ **Location:** `src/execution/runner.ts:338-352`, `src/context/builder.ts:148-215`
171
+
172
+ PRD is loaded into memory on every iteration (line 352), and context builder loads all dependency stories without pagination:
173
+
174
+ ```typescript
175
+ // No pagination, loads full PRD every iteration
176
+ prd = await loadPRD(prdPath);
177
+
178
+ // Context builder loads all dependencies into memory
179
+ for (const depId of currentStory.dependencies) {
180
+ const depStory = prd.userStories.find((s) => s.id === depId);
181
+ elements.push(createDependencyContext(depStory, 50));
182
+ }
183
+ ```
184
+
185
+ **Risk:**
186
+ - Large PRDs (1000+ stories) cause OOM crashes
187
+ - No memory pressure detection or backpressure
188
+ - Context builder token budget is conservative but doesn't prevent loading 100+ stories into memory
189
+
190
+ **Worst Case:**
191
+ - 1000 stories × 10KB each = 10MB PRD JSON
192
+ - Reloaded every iteration (20 iterations) = 200MB allocated
193
+ - Context builder processes all dependencies (100 deps × 1000 stories = 100,000 checks)
194
+
195
+ **Fix:**
196
+ 1. Add PRD size limit validation (e.g., max 500 stories)
197
+ 2. Implement lazy loading for large PRDs (only load next N stories)
198
+ 3. Add memory usage tracking and abort if threshold exceeded
199
+ 4. Paginate dependency resolution in context builder
200
+ 5. Consider streaming JSON parsing for large PRDs
201
+
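As a rough sketch of fix 1, plus a cheaper dependency lookup, the snippet below rejects oversized PRDs up front and builds an id index once instead of calling `find` for every dependency. The `UserStory` and `PRD` shapes are simplified stand-ins, and `MAX_PRD_STORIES`, `assertPrdSize`, `indexStories`, and `resolveDependencies` are hypothetical names, not existing project APIs.

```typescript
// Simplified stand-ins for the project's real types (assumption).
interface UserStory {
  id: string;
  dependencies: string[];
}

interface PRD {
  userStories: UserStory[];
}

/** Hypothetical cap, following the "max 500 stories" suggestion above. */
export const MAX_PRD_STORIES = 500;

/** Reject oversized PRDs before any per-iteration work starts. */
export function assertPrdSize(prd: PRD, maxStories = MAX_PRD_STORIES): void {
  if (prd.userStories.length > maxStories) {
    throw new Error(
      `PRD has ${prd.userStories.length} stories; the limit is ${maxStories}`,
    );
  }
}

/** Build the id -> story index once so each dependency lookup is O(1). */
export function indexStories(prd: PRD): Map<string, UserStory> {
  return new Map(prd.userStories.map((s): [string, UserStory] => [s.id, s]));
}

/** Resolve a story's dependencies, skipping ids that do not exist. */
export function resolveDependencies(
  story: UserStory,
  index: Map<string, UserStory>,
): UserStory[] {
  const resolved: UserStory[] = [];
  for (const depId of story.dependencies) {
    const dep = index.get(depId);
    if (dep) resolved.push(dep);
  }
  return resolved;
}
```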
202
+ **Priority:** P1 — Blocks large-scale usage
203
+
204
+ ---
205
+
206
+ #### PERF-1: O(n²) Complexity in Batch Story Selection
207
+ **Severity:** HIGH | **Category:** Performance
208
+
209
+ **Location:** `src/execution/runner.ts:377-412`
210
+
211
+ Batch story selection has nested loops that re-check routing for every candidate:
212
+
213
+ ```typescript
214
+ for (let i = currentIndex + 1; i < readyStories.length && batchCandidates.length < 4; i++) {
215
+ const candidate = readyStories[i];
216
+ // This check happens for every candidate in every iteration
217
+ if (
218
+ candidate.routing?.complexity === "simple" &&
219
+ candidate.routing?.testStrategy === "test-after"
220
+ ) {
221
+ batchCandidates.push(candidate);
222
+ }
223
+ }
224
+ ```
225
+
226
+ **Complexity Analysis:**
227
+ - `getAllReadyStories()`: O(n) over all stories
228
+ - Batch candidate selection: O(n) in worst case
229
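+ - **Called every iteration**: O(iterations × n) for the single pass shown above; O(iterations × n²) only if selection restarts from every ready story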
230
+
231
+ For 500 stories over 20 iterations, the quadratic worst case is 20 × 500² = 5 million checks
232
+
233
+ **Fix:**
234
+ 1. Pre-compute batch-eligible stories once at start
235
+ 2. Use index/cache for ready stories instead of filtering every time
236
+ 3. Mark stories with `routing` during analyze phase (already done) — use it!
237
+ 4. Short-circuit batch selection after first non-simple story
238
+
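Fix 4 could be sketched as below. It is an illustration under assumptions: the `Story` and `Routing` shapes are reduced to the two fields the check uses, and `isBatchEligible` and `selectBatch` are hypothetical helpers. Note that, unlike the loop quoted above, this version stops at the first ineligible story instead of skipping over it, so batches stay contiguous.

```typescript
// Reduced stand-ins for the project's story types (assumption).
interface Routing {
  complexity: string;
  testStrategy: string;
}

interface Story {
  id: string;
  routing?: Routing;
}

/** A story can join a batch only if it was routed as simple + test-after. */
export function isBatchEligible(story: Story): boolean {
  return (
    story.routing?.complexity === "simple" &&
    story.routing?.testStrategy === "test-after"
  );
}

/**
 * Collect up to `maxBatchSize` consecutive eligible stories starting at
 * `startIndex`, stopping at the first story that is not eligible.
 */
export function selectBatch(
  ready: Story[],
  startIndex: number,
  maxBatchSize = 4,
): Story[] {
  const batch: Story[] = [];
  for (const story of ready.slice(startIndex)) {
    if (batch.length >= maxBatchSize || !isBatchEligible(story)) break;
    batch.push(story);
  }
  return batch;
}
```

Because eligibility depends only on the `routing` field set during the analyze phase, the per-story check stays constant-time and the selection itself is bounded by the batch size.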
239
+ **Priority:** P1 — Degrades with scale
240
+
241
+ ---
242
+
243
+ #### BUG-3: Cost Estimation Regex Brittle and Inaccurate
244
+ **Severity:** HIGH | **Category:** Bug
245
+
246
+ **Location:** `src/agents/cost.ts:48-60`
247
+
248
+ Cost estimation relies on regex parsing of agent stdout/stderr:
249
+
250
+ ```typescript
251
+ export function parseTokenUsage(output: string): TokenUsage | null {
252
+ const inputMatch = output.match(/input\s+tokens?:\s*(\d+)/i);
253
+ const outputMatch = output.match(/output\s+tokens?:\s*(\d+)/i);
254
+
255
+ if (!inputMatch || !outputMatch) {
256
+ return null;
257
+ }
258
+ // ...
259
+ }
260
+ ```
261
+
262
+ **Problems:**
263
+ 1. Assumes agents output "Input tokens: N" format — not standardized
264
+ 2. Unanchored, case-insensitive match can catch false positives ("This input tokens: 42")
265
+ 3. Fallback to duration-based estimate is wildly inaccurate ($0.01-$0.15/min)
266
+ 4. No validation that parsed numbers are reasonable (could parse wrong numbers)
267
+
268
+ **Real-World Impact:**
269
+ - Cost tracking off by 50-300% in testing
270
+ - Users exceed budget without warning
271
+ - Billing surprises
272
+
273
+ **Fix:**
274
+ 1. Use structured output from agents (JSON token usage)
275
+ 2. Add per-agent token parsing strategies (polymorphic)
276
+ 3. Log warnings when fallback estimate is used
277
+ 4. Add confidence score to cost estimates
278
+ 5. Allow manual cost override in config
279
+
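One way fixes 1 and 4 might fit together is sketched below: prefer a structured JSON usage line, fall back to a line-anchored regex, and tag the result with how it was obtained. The JSON shape (`{"usage":{"input_tokens":…,"output_tokens":…}}`), the `confidence` field, and the `MAX_PLAUSIBLE_TOKENS` bound are assumptions made for illustration; real agents may emit something different.

```typescript
export interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  /** How the numbers were obtained (hypothetical field). */
  confidence: "structured" | "regex";
}

/** Hypothetical sanity bound to reject obviously wrong matches. */
const MAX_PLAUSIBLE_TOKENS = 10_000_000;

function plausible(n: unknown): n is number {
  return (
    typeof n === "number" &&
    Number.isInteger(n) &&
    n >= 0 &&
    n <= MAX_PLAUSIBLE_TOKENS
  );
}

export function parseTokenUsage(output: string): TokenUsage | null {
  // 1. Prefer the last JSON line that carries a usage object.
  for (const line of output.split("\n").reverse()) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("{")) continue;
    try {
      const usage = JSON.parse(trimmed)?.usage;
      if (plausible(usage?.input_tokens) && plausible(usage?.output_tokens)) {
        return {
          inputTokens: usage.input_tokens,
          outputTokens: usage.output_tokens,
          confidence: "structured",
        };
      }
    } catch {
      // Not valid JSON; keep scanning earlier lines.
    }
  }

  // 2. Fall back to a regex anchored to whole lines, so prose such as
  //    "This input tokens: 42" no longer matches.
  const input = output.match(/^\s*input\s+tokens?:\s*(\d+)\s*$/im);
  const out = output.match(/^\s*output\s+tokens?:\s*(\d+)\s*$/im);
  if (!input || !out) return null;

  const inputTokens = Number(input[1]);
  const outputTokens = Number(out[1]);
  if (!plausible(inputTokens) || !plausible(outputTokens)) return null;
  return { inputTokens, outputTokens, confidence: "regex" };
}
```

A caller could then log a warning whenever `confidence` is not `"structured"`, which covers fix 3 without changing the return type again.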
280
+ **Priority:** P1 — Core feature, budget enforcement broken
281
+
282
+ ---
283
+
284
+ ### 🟡 MEDIUM
285
+
286
+ #### ENH-1: Missing JSDoc Documentation
287
+ **Severity:** MEDIUM | **Category:** Enhancement
288
+
289
+ **Location:** All modules (global issue)
290
+
291
+ Only 15% of functions have JSDoc comments. Public APIs lack usage examples.
292
+
293
+ **Missing Documentation:**
294
+ - `routeTask()` — core routing logic, complex decision tree
295
+ - `buildContext()` — token budget algorithm, priority sorting
296
+ - `runThreeSessionTdd()` — isolation rules, session orchestration
297
+ - `escalateTier()` — escalation chain configuration
298
+
299
+ **Impact:**
300
+ - New contributors need to read implementation to understand API
301
+ - Maintenance becomes harder (what does this parameter do?)
302
+ - No IDE intellisense for usage examples
303
+
304
+ **Fix:**
305
+ Add JSDoc for all exported functions:
306
+ ```typescript
307
+ /**
308
+ * Route a story to appropriate model tier and test strategy.
309
+ *
310
+ * Decision logic:
311
+ * 1. Classify complexity (simple/medium/complex/expert)
312
+ * 2. Map complexity to model tier via config.complexityRouting
313
+ * 3. Determine test strategy (test-after vs three-session-tdd)
314
+ *
315
+ * @param title - Story title
316
+ * @param description - Story description
317
+ * @param acceptanceCriteria - Array of acceptance criteria
318
+ * @param tags - Optional story tags (e.g., ["security", "public-api"])
319
+ * @param config - Ngent configuration
320
+ * @returns Routing decision with reasoning
321
+ *
322
+ * @example
323
+ * const decision = routeTask(
324
+ * "Add login form",
325
+ * "User should be able to log in",
326
+ * ["Form validation", "API integration"],
327
+ * ["security"],
328
+ * config
329
+ * );
330
+ * // decision.testStrategy === "three-session-tdd" (security-critical)
331
+ */
332
+ ```
333
+
334
+ **Priority:** P2 — Impacts maintainability and onboarding
335
+
336
+ ---
337
+
338
+ #### TYPE-1: Unsafe Type Assertions in Config Loader
339
+ **Severity:** MEDIUM | **Category:** Type Safety
340
+
341
+ **Location:** `src/config/loader.ts:76-84`
342
+
343
+ ```typescript
344
+ config = deepMerge(config as unknown as Record<string, unknown>, globalConf) as unknown as NgentConfig;
345
+ ```
346
+
347
+ Double `as unknown as` casting bypasses TypeScript's type checking entirely.
348
+
349
+ **Risk:**
350
+ - Merged config could have wrong shape (missing fields, wrong types)
351
+ - Runtime errors disguised as type-safe code
352
+ - Validation happens AFTER merge (not during)
353
+
354
+ **Fix:**
355
+ 1. Use Zod or io-ts for runtime schema validation
356
+ 2. Parse config with schema, don't cast
357
+ 3. Validate each config layer BEFORE merging, then validate the merged result (fail fast)
358
+
359
+ ```typescript
360
+ import { z } from 'zod';
361
+
362
+ const NgentConfigSchema = z.object({
363
+ version: z.literal(1),
364
+ models: z.record(z.union([z.string(), z.object({ provider: z.string(), model: z.string() })])),
365
+ // ... full schema
366
+ });
367
+
368
+ export async function loadConfig(projectDir?: string): Promise<NgentConfig> {
369
+ // ... load logic
370
+ const parsed = NgentConfigSchema.safeParse(merged);
371
+ if (!parsed.success) {
372
+ throw new Error(`Invalid config: ${parsed.error.message}`);
373
+ }
374
+ return parsed.data;
375
+ }
376
+ ```
377
+
378
+ **Priority:** P2 — Type safety at runtime
379
+
380
+ ---
381
+
382
+ #### BUG-4: Batch Failure Logic Too Conservative
383
+ **Severity:** MEDIUM | **Category:** Bug
384
+
385
+ **Location:** `src/execution/runner.ts:682-761`
386
+
387
+ When a batch fails, only the first story is escalated. Remaining stories return to "pending" at the same tier. This is documented as intentional (lines 684-712), but it has issues:
388
+
389
+ **Problems:**
390
+ 1. If batch fails due to systemic issue (model tier too weak), all stories will fail individually at same tier before escalating
391
+ 2. Wastes iterations and cost (4 stories × 2 attempts = 8 iterations wasted)
392
+ 3. No way to configure alternative behavior (escalate entire batch)
393
+
394
+ **Example:**
395
+ - Batch: [US-001, US-002, US-003, US-004] on 'fast' tier fails
396
+ - Only US-001 escalates to 'balanced'
397
+ - US-002, US-003, US-004 retry on 'fast' (likely fail again)
398
+ - Total: 1 + 3 = 4 wasted iterations before all escalate
399
+
400
+ **Fix:**
401
+ 1. Add config option: `batch.escalateEntireBatchOnFailure: boolean`
402
+ 2. Track batch failure reason (timeout vs. test failure vs. model capability)
403
+ 3. Escalate all if failure is systemic (not story-specific)
404
+ 4. Add metrics to measure batch success rate by tier
405
+
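Fixes 1 to 3 could combine into a small decision helper like the one below. This is only a sketch: `BatchFailureReason`, `BatchEscalationOptions`, and `storiesToEscalate` are hypothetical, and treating `"model-capability"` as the only systemic reason is an assumption the project would need to confirm.

```typescript
/** Failure reasons named in fix 2 (hypothetical type). */
export type BatchFailureReason = "timeout" | "test-failure" | "model-capability";

export interface BatchEscalationOptions {
  /** Mirrors the proposed `batch.escalateEntireBatchOnFailure` option. */
  escalateEntireBatchOnFailure: boolean;
}

/**
 * Decide which stories of a failed batch move to the next tier.
 * Current behaviour (first story only) stays the default; the whole batch
 * escalates when configured to, or when the failure looks systemic.
 */
export function storiesToEscalate(
  storyIds: string[],
  reason: BatchFailureReason,
  options: BatchEscalationOptions,
): string[] {
  // Assumption: a model-capability failure affects every story in the batch.
  const systemic = reason === "model-capability";
  if (options.escalateEntireBatchOnFailure || systemic) {
    return [...storyIds];
  }
  return storyIds.slice(0, 1);
}
```

With the example batch above, a `"model-capability"` failure would escalate US-001 through US-004 together instead of retrying three of them on the 'fast' tier first.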
406
+ **Priority:** P2 — Impacts efficiency and cost
407
+
408
+ ---
409
+
410
+ #### ENH-2: No Agent Capability Negotiation
411
+ **Severity:** MEDIUM | **Category:** Enhancement
412
+
413
+ **Location:** `src/agents/types.ts`, `src/agents/claude.ts`
414
+
415
+ Agent adapters are passive — they don't declare capabilities:
416
+ - Which model tiers they support
417
+ - Max context window size
418
+ - Supported features (TDD, code review, etc.)
419
+
420
+ **Impact:**
421
+ - Can't validate config (user sets 'fast' tier to opus model — wrong!)
422
+ - Can't optimize routing (agent X better at task Y)
423
+ - No graceful degradation (if agent unavailable, can't fallback)
424
+
425
+ **Fix:**
426
+ Add capability metadata to `AgentAdapter`:
427
+ ```typescript
428
+ export interface AgentAdapter {
429
+ readonly name: string;
430
+ readonly displayName: string;
431
+ readonly binary: string;
432
+ readonly capabilities: {
433
+ supportedTiers: ModelTier[];
434
+ maxContextTokens: number;
435
+ features: Set<'tdd' | 'review' | 'refactor'>;
436
+ };
437
+ // ...
438
+ }
439
+ ```
440
+
441
+ **Priority:** P2 — Enables better routing and validation
442
+
443
+ ---
444
+
445
+ #### PERF-2: Redundant PRD Reloads in Loop
446
+ **Severity:** MEDIUM | **Category:** Performance
447
+
448
+ **Location:** `src/execution/runner.ts:352`
449
+
450
+ PRD is reloaded from disk on EVERY iteration, even if unchanged:
451
+
452
+ ```typescript
453
+ while (iterations < config.execution.maxIterations) {
454
+ iterations++;
455
+ prd = await loadPRD(prdPath); // Unnecessary I/O
456
+ // ...
457
+ }
458
+ ```
459
+
460
+ **Impact:**
461
+ - 20 iterations × 10KB PRD = 200KB I/O per feature
462
+ - Adds 5-20ms latency per iteration (SSD) to 100-500ms (network FS)
463
+ - Agents don't modify prd.json directly (runner.ts updates it)
464
+
465
+ **Fix:**
466
+ 1. Reload PRD only after agent execution (when it might change)
467
+ 2. Use file watcher to detect external changes
468
+ 3. Add dirty flag to track if reload needed
469
+ 4. Cache PRD in memory with invalidation
470
+
471
+ ```typescript
472
+ let prd = await loadPRD(prdPath);
473
+ let prdModified = false;
474
+
475
+ while (iterations < config.execution.maxIterations) {
476
+ if (prdModified) {
477
+ prd = await loadPRD(prdPath);
478
+ prdModified = false;
479
+ }
480
+ // ... execute ...
481
+ if (sessionSuccess) {
482
+ await savePRD(prd, prdPath);
483
+ // In-memory PRD already matches disk, so no reload; set prdModified = true only where the file may be rewritten outside this loop
484
+ }
485
+ }
486
+ ```
487
+
488
+ **Priority:** P2 — Optimization, not critical
489
+
490
+ ---
491
+
492
+ #### BUG-5: Hook Timeout Kills Process but Doesn't Log Reason
493
+ **Severity:** MEDIUM | **Category:** Bug
494
+
495
+ **Location:** `src/hooks/runner.ts:82-95`
496
+
497
+ ```typescript
498
+ const timeoutId = setTimeout(() => {
499
+ proc.kill("SIGTERM");
500
+ }, timeout);
501
+
502
+ const exitCode = await proc.exited;
503
+ clearTimeout(timeoutId);
504
+ ```
505
+
506
+ If hook times out, it's killed but the caller sees `exitCode !== 0` without knowing why.
507
+
508
+ **Impact:**
509
+ - User sees "Hook on-start failed" with no indication it was timeout
510
+ - Difficult to debug (is hook broken or just slow?)
511
+
512
+ **Fix:**
513
+ ```typescript
514
+ let timedOut = false;
515
+ const timeoutId = setTimeout(() => {
516
+ timedOut = true;
517
+ proc.kill("SIGTERM");
518
+ }, timeout);
519
+
520
+ const exitCode = await proc.exited;
521
+ clearTimeout(timeoutId);
522
+
523
+ return {
524
+ success: exitCode === 0 && !timedOut,
525
+ output: timedOut
526
+ ? `Hook timed out after ${timeout}ms`
527
+ : (stdout + stderr).trim(),
528
+ };
529
+ ```
530
+
531
+ **Priority:** P2 — Debuggability
532
+
533
+ ---
534
+
535
+ #### ENH-3: Context Builder Lacks File Content Support
536
+ **Severity:** MEDIUM | **Category:** Enhancement
537
+
538
+ **Location:** `src/context/builder.ts:86-114`
539
+
540
+ Context builder only includes story metadata (title, description, criteria). It doesn't load relevant source files that story depends on.
541
+
542
+ **Impact:**
543
+ - Agents work blind (no codebase context)
544
+ - Users must manually add `relevantFiles` to stories
545
+ - Context is shallow (just requirements, not code)
546
+
547
+ **Fix:**
548
+ Add file content loading:
549
+ ```typescript
550
+ export async function buildContext(
551
+ storyContext: StoryContext,
552
+ budget: ContextBudget,
553
+ workdir: string, // NEW
554
+ ): Promise<BuiltContext> {
555
+ // ... existing logic ...
556
+
557
+ // Load relevant files if specified
558
+ if (currentStory.relevantFiles && currentStory.relevantFiles.length > 0) {
559
+ for (const filePath of currentStory.relevantFiles) {
560
+ const fullPath = join(workdir, filePath);
561
+ if (existsSync(fullPath)) {
562
+ const content = await Bun.file(fullPath).text();
563
+ const element = createFileContext(filePath, content, 60);
564
+ elements.push(element);
565
+ }
566
+ }
567
+ }
568
+ // ...
569
+ }
570
+ ```
571
+
572
+ **Priority:** P3 — Enhancement, not blocker
573
+
574
+ ---
575
+
576
+ #### STYLE-1: runner.ts is 779 Lines (Too Large)
577
+ **Severity:** MEDIUM | **Category:** Code Quality
578
+
579
+ **Location:** `src/execution/runner.ts` (779 LOC)
580
+
581
+ Main execution loop is monolithic and hard to follow:
582
+ - 60 LOC prompt builders (line 62-129)
583
+ - 50 LOC batch grouping logic (line 141-186)
584
+ - ~120 LOC queue command processing (lines 414-481, duplicated at 632-680)
585
+ - 80 LOC failure/escalation handling (line 682-761)
586
+
587
+ **Impact:**
588
+ - Hard to review changes (too much context)
589
+ - Difficult to test individual components
590
+ - Tight coupling (can't reuse batch logic elsewhere)
591
+
592
+ **Fix:**
593
+ Split into focused modules:
594
+ ```
595
+ src/execution/
596
+ runner.ts // Main loop (200 LOC)
597
+ prompts.ts // Prompt builders
598
+ batching.ts // Batch grouping logic
599
+ queue-handler.ts // Queue command processing
600
+ escalation.ts // Failure handling and tier escalation
601
+ session.ts // Single/batch session execution
602
+ ```
603
+
604
+ **Priority:** P3 — Refactoring, not urgent
605
+
606
+ ---
607
+
608
+ ### 🟢 LOW
609
+
610
+ #### ENH-4: No Progress Bar or Visual Feedback
611
+ **Severity:** LOW | **Category:** Enhancement
612
+
613
+ **Location:** `src/execution/runner.ts:348-768`
614
+
615
+ Long-running features (20 iterations) have minimal progress feedback. User sees:
616
+ ```
617
+ ── Iteration 1 ──────────────────────
618
+ Story: US-001 — Add login
619
+ ...
620
+ ── Iteration 2 ──────────────────────
621
+ ```
622
+
623
+ No indication of:
624
+ - How many stories remain (3/20 complete)
625
+ - Estimated time remaining
626
+ - Current cost vs. budget ($0.50 / $5.00)
627
+
628
+ **Fix:**
629
+ Add progress bar and status dashboard:
630
+ ```typescript
631
+ console.log(chalk.cyan(`\n🚀 ngent: Starting ${feature}`));
632
+ console.log(chalk.dim(` Progress: [${counts.passed}/${counts.total}] stories`));
633
+ console.log(chalk.dim(` Budget: [$${totalCost.toFixed(2)}/$${config.execution.costLimit}]`));
634
+ console.log(chalk.dim(` ETA: ~${estimatedMinutes} minutes remaining`));
635
+ ```
636
+
637
+ Use a library like `cli-progress` for real-time updates.
638
+
639
+ **Priority:** P3 — UX enhancement
640
+
641
+ ---
642
+
643
+ #### TYPE-2: Missing Discriminated Union for Queue Commands
644
+ **Severity:** LOW | **Category:** Type Safety
645
+
646
+ **Location:** `src/queue/types.ts:46`
647
+
648
+ ```typescript
649
+ export type QueueCommand = "PAUSE" | "ABORT" | { type: "SKIP"; storyId: string };
650
+ ```
651
+
652
+ Mixed string literals and object — should be discriminated union:
653
+
654
+ ```typescript
655
+ export type QueueCommand =
656
+ | { type: "PAUSE" }
657
+ | { type: "ABORT" }
658
+ | { type: "SKIP"; storyId: string };
659
+ ```
660
+
661
+ **Fix:**
662
+ ```typescript
663
+ // src/queue/types.ts
664
+ export type QueueCommand =
665
+ | { type: "PAUSE" }
666
+ | { type: "ABORT" }
667
+ | { type: "SKIP"; storyId: string };
668
+
669
+ // src/queue/manager.ts
670
+ export function parseQueueFile(content: string): QueueFileResult {
671
+ // ...
672
+ if (upper === "PAUSE") {
673
+ commands.push({ type: "PAUSE" });
674
+ } else if (upper === "ABORT") {
675
+ commands.push({ type: "ABORT" });
676
+ }
677
+ // ...
678
+ }
679
+
680
+ // src/execution/runner.ts
681
+ for (const cmd of queueCommands) {
682
+ switch (cmd.type) {
683
+ case "PAUSE":
684
+ // ...
685
+ break;
686
+ case "ABORT":
687
+ // ...
688
+ break;
689
+ case "SKIP":
690
+ console.log(`Skipping ${cmd.storyId}`);
691
+ break;
692
+ }
693
+ }
694
+ ```
695
+
696
+ **Priority:** P3 — Type safety improvement
697
+
698
+ ---
699
+
700
+ #### BUG-6: Analyze Command Doesn't Validate Story Dependencies
701
+ **Severity:** LOW | **Category:** Bug
702
+
703
+ **Location:** `src/cli/analyze.ts:46-140`
704
+
705
+ When parsing `tasks.md`, dependencies are extracted but not validated:
706
+
707
+ ```typescript
708
+ const depsMatch = line.match(/^Dependencies:\s*(.+)/i);
709
+ if (depsMatch && currentStory) {
710
+ currentStory.dependencies = depsMatch[1]
711
+ .split(",")
712
+ .map((d) => d.trim())
713
+ .filter(Boolean);
714
+ }
715
+ ```
716
+
717
+ No check that dependency story IDs actually exist in the PRD.
718
+
719
+ **Impact:**
720
+ - Missing dependency surfaces only at runtime, as a warning logged at line 184 of context/builder.ts
721
+ - Stories blocked by non-existent dependencies (never executable)
722
+
723
+ **Fix:**
724
+ Add validation after parsing all stories:
725
+ ```typescript
726
+ export async function analyzeFeature(
727
+ featureDir: string,
728
+ featureName: string,
729
+ branchName: string,
730
+ ): Promise<PRD> {
731
+ // ... existing parsing ...
732
+
733
+ // Validate dependencies
734
+ const storyIds = new Set(userStories.map(s => s.id));
735
+ for (const story of userStories) {
736
+ for (const depId of story.dependencies) {
737
+ if (!storyIds.has(depId)) {
738
+ throw new Error(`Story ${story.id} depends on non-existent story ${depId}`);
739
+ }
740
+ }
741
+ }
742
+
743
+ return prd;
744
+ }
745
+ ```
746
+
747
+ **Priority:** P3 — Edge case, caught during execution
748
+
749
+ ---
750
+
751
+ #### ENH-5: No Dry-Run Mode for Three-Session TDD
752
+ **Severity:** LOW | **Category:** Enhancement
753
+
754
+ **Location:** `src/tdd/orchestrator.ts:213-326`
755
+
756
+ `runThreeSessionTdd()` doesn't respect `dryRun` flag — always executes agent.
757
+
758
+ **Impact:**
759
+ - Can't preview TDD workflow without running agents
760
+ - A preview would help when debugging routing decisions
761
+
762
+ **Fix:**
763
+ ```typescript
764
+ export async function runThreeSessionTdd(
765
+ agent: AgentAdapter,
766
+ story: UserStory,
767
+ config: NgentConfig,
768
+ workdir: string,
769
+ modelTier: ModelTier,
770
+ contextMarkdown?: string,
771
+ dryRun: boolean = false, // NEW
772
+ ): Promise<ThreeSessionTddResult> {
773
+ if (dryRun) {
774
+ console.log(chalk.yellow(` [DRY RUN] Would run 3-session TDD`));
775
+ console.log(chalk.dim(` Session 1: test-writer`));
776
+ console.log(chalk.dim(` Session 2: implementer`));
777
+ console.log(chalk.dim(` Session 3: verifier`));
778
+ return {
779
+ success: true,
780
+ sessions: [],
781
+ needsHumanReview: false,
782
+ totalCost: 0,
783
+ };
784
+ }
785
+ // ... existing logic ...
786
+ }
787
+ ```
788
+
789
+ **Priority:** P3 — Minor UX improvement
790
+
791
+ ---
792
+
793
+ #### PERF-3: Context Token Estimation is Conservative
794
+ **Severity:** LOW | **Category:** Performance
795
+
796
+ **Location:** `src/context/builder.ts:30-32`
797
+
798
+ ```typescript
799
+ export function estimateTokens(text: string): number {
800
+ return Math.ceil(text.length / 3);
801
+ }
802
+ ```
803
+
804
+ This formula overestimates tokens by ~20-40% for prose-heavy markdown (about 4 chars/token), though it is roughly accurate for code (2-3 chars/token).
805
+
806
+ **Impact:**
807
+ - Context budget underutilized (could fit more stories)
808
+ - Less context = worse agent performance
809
+
810
+ **Real-World Comparison:**
811
+ - English prose: 4 chars/token (GPT standard)
812
+ - Code: 2-3 chars/token
813
+ - Formula: 3 chars/token (middle ground)
814
+
815
+ **Fix:**
816
+ Use @anthropic-ai/tokenizer for exact counts:
817
+ ```typescript
818
+ import { countTokens } from '@anthropic-ai/tokenizer';
819
+
820
+ export function estimateTokens(text: string): number {
821
+ return countTokens(text);
822
+ }
823
+ ```
824
+
825
+ Or improve approximation:
826
+ ```typescript
827
+ export function estimateTokens(text: string): number {
828
+ const codeRatio = (text.match(/```/g) || []).length / 10; // rough heuristic
829
+ const charsPerToken = Math.min(4, 3 + codeRatio); // clamped to 3-4 for mixed content
830
+ return Math.ceil(text.length / charsPerToken);
831
+ }
832
+ ```
833
+
834
+ **Priority:** P3 — Optimization, not critical
835
+
836
+ ---
837
+
838
+ #### STYLE-2: Inconsistent Error Handling Patterns
839
+ **Severity:** LOW | **Category:** Code Quality
840
+
841
+ **Location:** Various modules
842
+
843
+ Error handling is inconsistent:
844
+ - Some modules throw errors: `src/config/loader.ts:91`
845
+ - Some return null: `src/prd/index.ts:25`
846
+ - Some log warnings: `src/context/builder.ts:104`
847
+ - Some return success flags: `src/hooks/runner.ts:92`
848
+
849
+ **Examples:**
850
+ ```typescript
851
+ // Throws
852
+ if (!validation.valid) {
853
+ throw new Error(`Invalid configuration:\n${validation.errors.join("\n")}`);
854
+ }
855
+
856
+ // Returns null
857
+ export function getNextStory(prd: PRD): UserStory | null {
858
+ return prd.userStories.find(...) ?? null;
859
+ }
860
+
861
+ // Logs warning
862
+ console.warn(`⚠️ Story ${story.id} has invalid acceptanceCriteria`);
863
+ ```
864
+
865
+ **Fix:**
866
+ Establish pattern:
867
+ - **Critical errors** (invalid config, missing files): throw
868
+ - **Expected conditions** (no next story, story not found): return null/undefined
869
+ - **Validation issues** (malformed data): collect and return as errors[]
870
+ - **Non-fatal issues** (context builder warnings): log + continue
871
+
872
+ Document pattern in CONTRIBUTING.md.
873
+
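The first three rules of the pattern above might look like this in practice. The sketch uses simplified, hypothetical shapes (`UserStory` with only two fields, `ValidationResult`) and hypothetical helper names; it shows the convention, not existing project code.

```typescript
// Reduced stand-in for the project's story type (assumption).
interface UserStory {
  id: string;
  acceptanceCriteria: unknown;
}

interface ValidationResult {
  valid: boolean;
  errors: string[];
}

/** Critical error (missing config): throw, because the run cannot continue. */
export function requireConfig<T>(config: T | undefined, path: string): T {
  if (config === undefined) {
    throw new Error(`Missing configuration: ${path}`);
  }
  return config;
}

/** Expected condition (story not found): return undefined, never throw. */
export function findStory(
  stories: UserStory[],
  id: string,
): UserStory | undefined {
  return stories.find((s) => s.id === id);
}

/** Validation issues (malformed data): collect every problem and return them. */
export function validateStories(stories: UserStory[]): ValidationResult {
  const errors: string[] = [];
  for (const story of stories) {
    if (!Array.isArray(story.acceptanceCriteria)) {
      errors.push(`Story ${story.id} has invalid acceptanceCriteria`);
    }
  }
  return { valid: errors.length === 0, errors };
}
```

Keeping the three outcomes in the signature (throws, `| undefined`, `errors[]`) lets callers see which case they must handle without reading the implementation.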
874
+ **Priority:** P4 — Consistency, not urgent
875
+
876
+ ---
877
+
878
+ #### STYLE-3: Magic Numbers Not Extracted as Constants
879
+ **Severity:** LOW | **Category:** Code Quality
880
+
881
+ **Location:** Various modules
882
+
883
+ Magic numbers scattered throughout:
884
+ - `output: stdout.slice(-5000)` — why 5000? (claude.ts:78)
885
+ - `maxBatchSize = 4` — why 4? (runner.ts:143)
886
+ - `maxTokens: 100000` — why 100k? (runner.ts:201)
887
+ - `reservedForInstructions: 10000` — why 10k? (runner.ts:202)
888
+
889
+ **Fix:**
890
+ Extract as named constants with comments:
891
+ ```typescript
892
+ // src/agents/claude.ts
893
+ /**
894
+ * Max output size to store from agent execution.
895
+ * Keeps last 5KB to capture summary and token usage line.
896
+ */
897
+ export const MAX_AGENT_OUTPUT_CHARS = 5000;
898
+
899
+ // src/execution/runner.ts
900
+ /**
901
+ * Max stories per batch.
902
+ * Limited by:
903
+ * - Agent context window (4 stories ≈ 10K tokens)
904
+ * - Debugging complexity (batch failures harder to diagnose)
905
+ */
906
+ const MAX_BATCH_SIZE = 4;
907
+
908
+ /**
909
+ * Token budget for context injection.
910
+ * Claude 4 has 200K context window.
911
+ * - 100K for context (stories, deps, errors)
912
+ * - 10K for instructions/prompts
913
+ * - 90K remaining for agent working memory
914
+ */
915
+ const CONTEXT_MAX_TOKENS = 100_000;
916
+ const CONTEXT_RESERVED_TOKENS = 10_000;
917
+ ```
918
+
919
+ **Priority:** P4 — Maintainability
920
+
921
+ ---
922
+
923
+ ## Priority Fix Order
924
+
925
+ | Priority | ID | Effort | Description |
926
+ |:---|:---|:---|:---|
927
+ | **P0** | SEC-1 | M | Fix command injection in hook execution — escape/validate commands |
928
+ | **P0** | BUG-1 | L | Add agent installation check + retry logic + integration tests |
929
+ | **P0** | SEC-2 | S | Validate user-supplied paths, add bounds checking |
930
+ | **P1** | BUG-2 | M | Use atomic file operations for queue file (read-rename pattern) |
931
+ | **P1** | MEM-1 | M | Add PRD size limits, lazy loading, memory tracking |
932
+ | **P1** | PERF-1 | M | Optimize batch selection (pre-compute eligible stories) |
933
+ | **P1** | BUG-3 | M | Improve cost estimation (structured output + confidence scores) |
934
+ | **P2** | ENH-1 | L | Add JSDoc to all exported functions (public API) |
935
+ | **P2** | TYPE-1 | M | Use Zod for config validation instead of type assertions |
936
+ | **P2** | BUG-4 | M | Add config for batch escalation strategy |
937
+ | **P2** | ENH-2 | M | Add agent capability negotiation (supported tiers, features) |
938
+ | **P2** | PERF-2 | S | Reload PRD only when modified (add dirty flag) |
939
+ | **P2** | BUG-5 | S | Log timeout reason in hook execution |
940
+ | **P3** | ENH-3 | L | Add file content loading to context builder |
941
+ | **P3** | STYLE-1 | L | Split runner.ts into focused modules (batching, escalation, etc.) |
942
+ | **P3** | ENH-4 | S | Add progress bar and cost/ETA display |
943
+ | **P3** | TYPE-2 | S | Convert QueueCommand to discriminated union |
944
+ | **P3** | BUG-6 | S | Validate story dependencies in analyze command |
945
+ | **P3** | ENH-5 | S | Add dry-run support to three-session TDD |
946
+ | **P3** | PERF-3 | S | Improve token estimation accuracy |
947
+ | **P4** | STYLE-2 | M | Standardize error handling patterns |
948
+ | **P4** | STYLE-3 | S | Extract magic numbers as named constants |
949
+
950
+ **Legend:**
951
+ **Effort:** S (small, <4 hours) | M (medium, 1-2 days) | L (large, 3-5 days)
952
+
953
+ ---
954
+
955
+ ## Module Grades
956
+
957
+ | Module | Grade | Score | Notes |
958
+ |:---|:---|:---|:---|
959
+ | **agents/** | B | 80 | Clean adapter interface, but cost tracking brittle, no agent validation |
960
+ | **cli/** | A- | 88 | Well-structured commands, good UX, missing dependency validation |
961
+ | **config/** | B+ | 82 | Layered config good, but unsafe type assertions, needs Zod |
962
+ | **context/** | A | 90 | Defensive programming, token budgeting, good error handling |
963
+ | **execution/** | B | 78 | Complex but functional, needs refactoring, performance issues |
964
+ | **hooks/** | C+ | 70 | Simple and works, but CRITICAL command injection vulnerability |
965
+ | **prd/** | A | 92 | Clean types, good utility functions, well-tested |
966
+ | **queue/** | A- | 85 | Good design, race condition in file handling |
967
+ | **routing/** | A | 92 | Clear decision logic, well-tested, good keyword matching |
968
+ | **tdd/** | A- | 88 | Excellent isolation enforcement, prompts are clear, needs dry-run |
969
+
970
+ ---
971
+
972
+ ## Test Coverage Analysis
973
+
974
+ **Current State:**
975
+ - 156 tests passing
976
+ - Test files: 12 (~3492 LOC)
977
+ - Coverage: Estimated 75-80% (no coverage report generated)
978
+
979
+ **Well-Tested:**
980
+ - ✅ Routing logic (routing.test.ts): complexity classification, test strategy decisions
981
+ - ✅ Configuration validation (config.test.ts): schema, merging, escalation
982
+ - ✅ TDD isolation (isolation.test.ts): file pattern matching, violation detection
983
+ - ✅ Cost estimation (cost.test.ts): token parsing, rate calculations
984
+ - ✅ Context builder (context.test.ts, context-integration.test.ts): token budgeting, priority sorting
985
+ - ✅ Queue manager (queue.test.ts): enqueue/dequeue, status transitions, command parsing
986
+
987
+ **Coverage Gaps (NOT tested):**
988
+ 1. **Agent execution end-to-end** — No tests spawn real/mock agents
989
+ 2. **Hook execution** — No tests for shell command execution, timeout, env vars
990
+ 3. **File operations** — No tests for PRD load/save, config file handling
991
+ 4. **Error recovery paths** — Rate limit handling, agent crashes, timeout recovery
992
+ 5. **Batch execution** — No tests for multi-story batching, failure rollback
993
+ 6. **Escalation logic** — No tests for tier escalation, max attempts, cost tracking
994
+ 7. **Progress logging** — No tests for appendProgress()
995
+ 8. **CLI commands** — No tests for init, run, analyze, features, agents, status
996
+
997
+ **Recommendations:**
998
+ 1. Add integration tests with mock agent binary (Bun.spawn stub)
999
+ 2. Add hook execution tests with safe test commands
1000
+ 3. Add file operation tests with temp directories (e.g., `fs.mkdtemp()` under `os.tmpdir()`)
1001
+ 4. Add error injection tests (simulate rate limits, timeouts, crashes)
1002
+ 5. Add batch execution tests (verify batch grouping, failure handling)
1003
+ 6. Target 85%+ coverage before v1.0
1004
+
1005
+ ---
1006
+
1007
+ ## Security Checklist
1008
+
1009
+ | Item | Status | Notes |
1010
+ |:---|:---|:---|
1011
+ | Input validation | ⚠️ Partial | Paths not validated, hook commands not sanitized |
1012
+ | Command injection | ❌ Fail | CRITICAL: hooks execute via `bash -c` unsafely |
1013
+ | Path traversal | ⚠️ Partial | No bounds checking on user-supplied paths |
1014
+ | Secrets exposure | ✅ Pass | No hardcoded secrets, relies on env vars |
1015
+ | File permissions | ⚠️ Partial | Created files/dirs use default umask (no explicit 0600) |
1016
+ | Rate limiting | ✅ Pass | Detects rate limits (heuristic), pauses execution |
1017
+ | DoS protection | ❌ Fail | No memory limits, unbounded PRD size, no timeout limits |
1018
+ | Dependency security | ✅ Pass | Only 2 runtime deps (chalk, commander) — both safe |
1019
+ | Logging | ✅ Pass | No sensitive data logged (no API keys, tokens) |
1020
+
1021
+ **Critical Actions:**
1022
+ 1. Fix SEC-1 (command injection) — P0
1023
+ 2. Add input validation for all user-supplied paths — P0
1024
+ 3. Add memory limits and PRD size validation — P1
1025
+ 4. Set restrictive file permissions (0600 for config, PRD, hooks) — P2
1026
+
1027
+ ---
1028
+
1029
+ ## Recommendations for v1.0
1030
+
1031
+ ### Must Fix (Blockers)
1032
+ 1. **SEC-1**: Command injection in hooks — security vulnerability
1033
+ 2. **BUG-1**: Agent execution needs validation, retry logic, integration tests
1034
+ 3. **SEC-2**: Path traversal risks — add bounds checking
1035
+ 4. **MEM-1**: Memory limits for large PRDs — prevent OOM crashes
1036
+ 5. **BUG-3**: Cost estimation accuracy — use structured output, not regex
1037
+
1038
+ ### Should Fix (Quality)
1039
+ 6. **TYPE-1**: Config validation with Zod — runtime type safety
1040
+ 7. **ENH-1**: JSDoc documentation — public API docs
1041
+ 8. **BUG-2**: Queue file race condition — atomic operations
1042
+ 9. **PERF-1**: Batch selection O(n²) — optimize with caching
1043
+ 10. **STYLE-1**: Split runner.ts — improve maintainability
1044
+
1045
+ ### Nice to Have (Polish)
1046
+ 11. **ENH-4**: Progress bar and ETA display — better UX
1047
+ 12. **ENH-2**: Agent capability negotiation — better routing
1048
+ 13. **ENH-3**: File content in context — richer agent prompts
1049
+ 14. **PERF-2**: Reduce PRD reloads — performance optimization
+
+ ### Future Enhancements
+ - Parallel agent execution (multiple agents, multiple stories)
+ - Better cost tracking (per-story breakdown, budget alerts)
+ - Streaming agent output (real-time progress)
+ - Web UI for monitoring runs
+ - PRD auto-generation from spec.md (LLM-powered)
+ - Auto-retry with different prompts (not just model escalation)
+
+ ---
+
+ ## Conclusion
+
+ ngent demonstrates strong architectural fundamentals with clear separation of concerns, comprehensive type safety, and thoughtful TDD enforcement. The codebase is well-organized with consistent naming and good test coverage for core algorithms (routing, context building, isolation checking).
+
+ However, **v0.1.0 is NOT production-ready** due to:
+ 1. Critical command injection vulnerability in hooks
+ 2. Incomplete agent execution implementation (no validation, weak error handling)
+ 3. Path traversal security risks
+ 4. Memory management issues for large-scale usage
+ 5. Brittle cost estimation
+
+ **Recommended path to v1.0:**
+ 1. Fix all P0 security issues (SEC-1, SEC-2, BUG-1) — **1 week**
+ 2. Address P1 reliability/performance issues (MEM-1, BUG-2, BUG-3, PERF-1) — **1-2 weeks**
+ 3. Add integration tests for agent execution, hooks, file operations — **1 week**
+ 4. Improve documentation (JSDoc, usage examples) — **3 days**
+ 5. Performance profiling with large PRDs (500+ stories) — **2 days**
+
+ **Total estimated effort to production-ready v1.0: 4-6 weeks**
+
+ With these fixes, ngent will be a robust, secure, and scalable AI coding orchestrator suitable for real-world use.
+
+ ---
+
+ **Reviewer:** Subrina (AI Code Reviewer)
+ **Review Date:** 2026-02-17
+ **Review Depth:** Deep (all 31 source files + 12 test files analyzed)
+ **Grade:** B+ (82/100) — Good foundation, needs security and reliability fixes for v1.0
+
+ ---
+
+ ## Post-Review Fixes
+
+ ### ✅ SEC-1: Command Injection in Hooks (FIXED - 2026-02-17)
+
+ **Status:** RESOLVED
+
+ **Changes Made:**
+ 1. ✅ Replaced `bash -c` execution with direct argv array execution (no shell interpolation)
+ 2. ✅ Added shell operator detection (`|`, `&&`, `;`, `$`, backticks) with security warnings
+ 3. ✅ Implemented command validation to reject injection patterns:
+ - Command substitution `$(...)` and backticks
+ - Piping to bash/sh
+ - Dangerous deletion patterns (`rm -rf`)
+ 4. ✅ Added environment variable sanitization (strips null bytes and newlines)
+ 5. ✅ Added comprehensive JSDoc security warnings
+ 6. ✅ Improved timeout handling with clear timeout messages
+ 7. ✅ Created 19 comprehensive security tests covering:
+ - Safe command execution
+ - Injection pattern rejection
+ - Environment variable isolation
+ - Timeout handling
+ - Disabled hooks
+ - Context passing via stdin
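
The validation and environment handling listed above might look roughly like the following. This is a sketch only: the actual checks in `src/hooks/runner.ts` are not shown in this review, so the regular expressions and the names `findInjectionPattern` and `sanitizeEnvValue` are assumptions chosen to match the patterns the list describes.

```typescript
// Patterns named in the review; the exact regexes here are illustrative guesses.
const INJECTION_PATTERNS: { pattern: RegExp; reason: string }[] = [
  { pattern: /\$\(/, reason: "command substitution $(...)" },
  { pattern: /`/, reason: "backtick command substitution" },
  { pattern: /\|\s*(bash|sh)\b/, reason: "piping to bash/sh" },
  { pattern: /\brm\s+-rf\b/, reason: "dangerous deletion (rm -rf)" },
];

/** Return the reason a hook command should be rejected, or null if none matched. */
export function findInjectionPattern(command: string): string | null {
  for (const { pattern, reason } of INJECTION_PATTERNS) {
    if (pattern.test(command)) return reason;
  }
  return null;
}

/**
 * Strip null bytes and newlines from a value before it is placed in the
 * hook's environment. Carriage returns are also removed here, as an assumption.
 */
export function sanitizeEnvValue(value: string): string {
  return value.replace(/[\0\r\n]/g, "");
}
```

A denylist like this is best treated as defense in depth. The main protection in the list above is item 1, executing an argv array without a shell.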
+
+ **Test Results:**
+ - All 175 tests passing (including 19 new hook security tests)
+ - TypeScript type checking: ✅ No errors
+ - Command injection vulnerabilities eliminated
+
+ **Files Modified:**
+ - `src/hooks/runner.ts`: Complete security overhaul
+ - `test/hooks.test.ts`: New comprehensive test suite
+
+ **Security Impact:**
+ - ❌ → ✅ Command injection vulnerability eliminated
+ - ❌ → ✅ Shell operator detection and warnings
+ - ❌ → ✅ Environment variable sanitization
+ - ❌ → ✅ Timeout handling with clear error messages
+
+ **Remaining Work:**
+ With SEC-1 resolved, the hook system no longer blocks the v1.0 release. Users should still be cautioned:
+ - Only configure hooks from trusted sources
+ - Hook commands are parsed into argv arrays (no complex shell syntax support)
+ - Shell operators trigger security warnings but are still parsed as literal arguments, so pipes and command chaining will not behave as they would in a shell
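
To make the last two caveats concrete, here is a minimal sketch of argv-based execution. It assumes Node's `child_process.spawnSync` and a plain whitespace split; the real runner may use a different spawn API (for example Bun's) and a more careful tokenizer, and `parseArgv` and `runHook` are hypothetical names.

```typescript
import { spawnSync } from "node:child_process";

/** Split a hook command on whitespace. No quoting or escaping is supported. */
export function parseArgv(command: string): string[] {
  return command.trim().split(/\s+/).filter((token) => token.length > 0);
}

/** Run a hook command directly, without a shell, and return its stdout. */
export function runHook(command: string, timeoutMs = 5000): string {
  const [file, ...args] = parseArgv(command);
  if (file === undefined) throw new Error("Empty hook command");
  // shell: false means operators such as && and | reach the program as plain arguments.
  const result = spawnSync(file, args, { encoding: "utf8", timeout: timeoutMs, shell: false });
  if (result.error) throw result.error;
  return result.stdout;
}

// "&&" is just another token; nothing is chained.
const tokens = parseArgv("echo hello && echo world");
```

Here `tokens` is `["echo", "hello", "&&", "echo", "world"]`, so running that command through `runHook` would invoke `echo` once with four arguments instead of running two commands.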
+
+ **Priority Update:** SEC-1 P0 → RESOLVED ✅