@nathapp/nax 0.18.1
- package/.gitlab-ci.yml +96 -0
- package/BRIEF.md +140 -0
- package/CHANGELOG.md +60 -0
- package/CLAUDE.md +159 -0
- package/README.md +373 -0
- package/US-007-IMPLEMENTATION.md +139 -0
- package/bin/nax.ts +930 -0
- package/biome.json +14 -0
- package/bun.lock +168 -0
- package/bunfig.toml +11 -0
- package/docs/20260216-fix-plan-context-review.md +56 -0
- package/docs/20260216-relentless-vs-ngent-comparison.md +208 -0
- package/docs/20260216-v02-plan.md +136 -0
- package/docs/20260216-v02-review.md +685 -0
- package/docs/20260217-dogfood-findings.md +56 -0
- package/docs/20260217-p2-plus-plan.md +117 -0
- package/docs/20260217-partial-fixes-plan.md +62 -0
- package/docs/20260217-plan-analyze-spec.md +117 -0
- package/docs/20260217-post-impl-review.md +1137 -0
- package/docs/20260217-quick-wins-plan.md +66 -0
- package/docs/20260217-split-runner-plan.md +75 -0
- package/docs/20260217-v03-impl-plan.md +80 -0
- package/docs/20260217-v03-post-impl-review.md +589 -0
- package/docs/20260217-v04-impl-plan.md +86 -0
- package/docs/20260217-v05-post-impl-review.md +850 -0
- package/docs/20260217-v06-post-impl-review.md +817 -0
- package/docs/20260218-adr003-port-plan.md +151 -0
- package/docs/20260218-review-adr003-verification.md +175 -0
- package/docs/20260219-fix-plan-bug16-19.md +79 -0
- package/docs/20260219-fix-plan-bug20-22.md +114 -0
- package/docs/20260219-plan-llm-routing.md +116 -0
- package/docs/20260219-review-bug20-22-fixes.md +135 -0
- package/docs/20260219-routing-baseline-keyword.md +63 -0
- package/docs/20260220-plan-structured-logging-p1.md +80 -0
- package/docs/20260220-plan-structured-logging-p2.md +37 -0
- package/docs/20260220-review-llm-routing.md +180 -0
- package/docs/20260220-review-post-fix-llm-routing.md +70 -0
- package/docs/20260221-fix-plan-relevantfiles-split.md +101 -0
- package/docs/20260221-fix-plan-routing-mode.md +125 -0
- package/docs/20260221-review-v0.9-implementation.md +379 -0
- package/docs/20260222-fix-plan-v091-routing-isolation.md +197 -0
- package/docs/20260223-fix-plan-prompt-audit.md +62 -0
- package/docs/20260224-nax-roadmap-phases.md +189 -0
- package/docs/20260225-phase2-llm-service-layer.md +401 -0
- package/docs/20260225-review-v0.10.1.md +187 -0
- package/docs/20260303-v010-implementation-plan.md +165 -0
- package/docs/CLAUDE.md.bak +191 -0
- package/docs/ROADMAP.md +165 -0
- package/docs/SPEC-rectification.md +0 -0
- package/docs/SPEC.md +324 -0
- package/docs/US-001-plugin-loading-verification.md +152 -0
- package/docs/architecture-analysis.md +1076 -0
- package/docs/bugs/BUG-21-escalation-null-attempts.md +48 -0
- package/docs/bugs-from-dogfood-run-c.md +243 -0
- package/docs/code-review-20260228.md +612 -0
- package/docs/code-review-v0.15.0.md +629 -0
- package/docs/hook-lifecycle-test-plan.md +149 -0
- package/docs/releases/v0.11.0-and-earlier.md +20 -0
- package/docs/releases/v0.12.0.md +15 -0
- package/docs/releases/v0.13.0.md +14 -0
- package/docs/releases/v0.14.0.md +20 -0
- package/docs/releases/v0.14.1.md +36 -0
- package/docs/releases/v0.14.2.md +51 -0
- package/docs/releases/v0.14.3.md +174 -0
- package/docs/releases/v0.14.4.md +94 -0
- package/docs/releases/v0.15.0.md +502 -0
- package/docs/releases/v0.15.1.md +170 -0
- package/docs/releases/v0.15.3.md +193 -0
- package/docs/specs/status-file-v0.10.1.md +812 -0
- package/docs/v0.10-global-config.md +206 -0
- package/docs/v0.10-plugin-system.md +415 -0
- package/docs/v0.10-prompt-optimizer.md +234 -0
- package/docs/v0.3-spec.md +244 -0
- package/docs/v0.4-spec.md +140 -0
- package/docs/v0.5-spec.md +237 -0
- package/docs/v0.6-spec.md +371 -0
- package/docs/v0.7-spec.md +177 -0
- package/docs/v0.8-llm-routing.md +206 -0
- package/docs/v0.8-structured-logging.md +132 -0
- package/docs/v0.9.3-prompt-audit.md +112 -0
- package/examples/plugins/console-reporter/index.test.ts +207 -0
- package/examples/plugins/console-reporter/index.ts +110 -0
- package/nax/config.json +147 -0
- package/nax/features/bugfix-v0171/prd.json +52 -0
- package/nax/features/config-management/prd.json +108 -0
- package/nax/features/config-management/progress.txt +5 -0
- package/nax/features/diagnose/acceptance.test.ts +412 -0
- package/nax/features/diagnose/prd.json +41 -0
- package/nax/features/orchestration-fixes/prd.json +89 -0
- package/nax/features/orchestration-fixes/progress.txt +1 -0
- package/nax/features/plugin-integration/US-007-VERIFICATION.md +259 -0
- package/nax/features/plugin-integration/prd.json +208 -0
- package/nax/features/plugin-integration/progress.txt +5 -0
- package/nax/features/precheck/prd.json +205 -0
- package/nax/features/precheck/progress.txt +15 -0
- package/nax/features/structured-logging/prd.json +199 -0
- package/nax/features/unlock/prd.json +36 -0
- package/package.json +47 -0
- package/src/acceptance/fix-generator.ts +348 -0
- package/src/acceptance/generator.ts +282 -0
- package/src/acceptance/index.ts +30 -0
- package/src/acceptance/types.ts +79 -0
- package/src/agents/claude-decompose.ts +169 -0
- package/src/agents/claude-plan.ts +139 -0
- package/src/agents/claude.ts +324 -0
- package/src/agents/cost.ts +268 -0
- package/src/agents/index.ts +13 -0
- package/src/agents/registry.ts +48 -0
- package/src/agents/types-extended.ts +133 -0
- package/src/agents/types.ts +113 -0
- package/src/agents/validation.ts +69 -0
- package/src/analyze/classifier.ts +305 -0
- package/src/analyze/index.ts +16 -0
- package/src/analyze/scanner.ts +175 -0
- package/src/analyze/types.ts +51 -0
- package/src/cli/accept.ts +108 -0
- package/src/cli/analyze-parser.ts +284 -0
- package/src/cli/analyze.ts +207 -0
- package/src/cli/config.ts +561 -0
- package/src/cli/constitution.ts +109 -0
- package/src/cli/diagnose-analysis.ts +159 -0
- package/src/cli/diagnose-formatter.ts +87 -0
- package/src/cli/diagnose.ts +203 -0
- package/src/cli/generate.ts +127 -0
- package/src/cli/index.ts +37 -0
- package/src/cli/init.ts +188 -0
- package/src/cli/interact.ts +295 -0
- package/src/cli/plan.ts +198 -0
- package/src/cli/plugins.ts +111 -0
- package/src/cli/prompts.ts +295 -0
- package/src/cli/runs.ts +174 -0
- package/src/cli/status-cost.ts +151 -0
- package/src/cli/status-features.ts +338 -0
- package/src/cli/status.ts +13 -0
- package/src/commands/common.ts +171 -0
- package/src/commands/diagnose.ts +17 -0
- package/src/commands/index.ts +8 -0
- package/src/commands/logs.ts +384 -0
- package/src/commands/precheck.ts +86 -0
- package/src/commands/unlock.ts +96 -0
- package/src/config/defaults.ts +160 -0
- package/src/config/index.ts +22 -0
- package/src/config/loader.ts +121 -0
- package/src/config/merger.ts +147 -0
- package/src/config/path-security.ts +121 -0
- package/src/config/paths.ts +27 -0
- package/src/config/schema.ts +56 -0
- package/src/config/schemas.ts +286 -0
- package/src/config/types.ts +423 -0
- package/src/config/validate.ts +103 -0
- package/src/constitution/generator.ts +191 -0
- package/src/constitution/generators/aider.ts +41 -0
- package/src/constitution/generators/claude.ts +35 -0
- package/src/constitution/generators/cursor.ts +36 -0
- package/src/constitution/generators/opencode.ts +38 -0
- package/src/constitution/generators/types.ts +33 -0
- package/src/constitution/generators/windsurf.ts +36 -0
- package/src/constitution/index.ts +10 -0
- package/src/constitution/loader.ts +133 -0
- package/src/constitution/types.ts +31 -0
- package/src/context/auto-detect.ts +227 -0
- package/src/context/builder.ts +246 -0
- package/src/context/elements.ts +83 -0
- package/src/context/formatter.ts +107 -0
- package/src/context/generator.ts +129 -0
- package/src/context/generators/aider.ts +34 -0
- package/src/context/generators/claude.ts +28 -0
- package/src/context/generators/cursor.ts +28 -0
- package/src/context/generators/opencode.ts +30 -0
- package/src/context/generators/windsurf.ts +28 -0
- package/src/context/greenfield.ts +114 -0
- package/src/context/index.ts +33 -0
- package/src/context/injector.ts +279 -0
- package/src/context/test-scanner.ts +370 -0
- package/src/context/types.ts +98 -0
- package/src/errors.ts +67 -0
- package/src/execution/batching.ts +157 -0
- package/src/execution/crash-recovery.ts +373 -0
- package/src/execution/escalation/escalation.ts +44 -0
- package/src/execution/escalation/index.ts +13 -0
- package/src/execution/escalation/tier-escalation.ts +295 -0
- package/src/execution/escalation/tier-outcome.ts +158 -0
- package/src/execution/helpers.ts +38 -0
- package/src/execution/index.ts +45 -0
- package/src/execution/lifecycle/acceptance-loop.ts +272 -0
- package/src/execution/lifecycle/headless-formatter.ts +85 -0
- package/src/execution/lifecycle/index.ts +12 -0
- package/src/execution/lifecycle/parallel-lifecycle.ts +101 -0
- package/src/execution/lifecycle/precheck-runner.ts +140 -0
- package/src/execution/lifecycle/run-cleanup.ts +81 -0
- package/src/execution/lifecycle/run-completion.ts +129 -0
- package/src/execution/lifecycle/run-initialization.ts +141 -0
- package/src/execution/lifecycle/run-lifecycle.ts +312 -0
- package/src/execution/lifecycle/run-setup.ts +204 -0
- package/src/execution/lifecycle/story-hooks.ts +38 -0
- package/src/execution/lifecycle/story-size-prompts.ts +123 -0
- package/src/execution/lock.ts +115 -0
- package/src/execution/parallel-executor.ts +216 -0
- package/src/execution/parallel.ts +400 -0
- package/src/execution/pid-registry.ts +280 -0
- package/src/execution/pipeline-result-handler.ts +388 -0
- package/src/execution/post-verify-rectification.ts +188 -0
- package/src/execution/post-verify.ts +274 -0
- package/src/execution/progress.ts +25 -0
- package/src/execution/prompts.ts +127 -0
- package/src/execution/queue-handler.ts +109 -0
- package/src/execution/rectification.ts +13 -0
- package/src/execution/runner.ts +377 -0
- package/src/execution/sequential-executor.ts +388 -0
- package/src/execution/status-file.ts +264 -0
- package/src/execution/status-writer.ts +139 -0
- package/src/execution/story-context.ts +229 -0
- package/src/execution/test-output-parser.ts +14 -0
- package/src/execution/verification.ts +72 -0
- package/src/hooks/index.ts +2 -0
- package/src/hooks/runner.ts +286 -0
- package/src/hooks/types.ts +67 -0
- package/src/interaction/chain.ts +154 -0
- package/src/interaction/index.ts +60 -0
- package/src/interaction/init.ts +83 -0
- package/src/interaction/plugins/auto.ts +217 -0
- package/src/interaction/plugins/cli.ts +300 -0
- package/src/interaction/plugins/telegram.ts +384 -0
- package/src/interaction/plugins/webhook.ts +258 -0
- package/src/interaction/state.ts +171 -0
- package/src/interaction/triggers.ts +229 -0
- package/src/interaction/types.ts +163 -0
- package/src/logger/formatters.ts +84 -0
- package/src/logger/index.ts +16 -0
- package/src/logger/logger.ts +298 -0
- package/src/logger/types.ts +48 -0
- package/src/logging/formatter.ts +355 -0
- package/src/logging/index.ts +22 -0
- package/src/logging/types.ts +93 -0
- package/src/metrics/aggregator.ts +190 -0
- package/src/metrics/index.ts +14 -0
- package/src/metrics/tracker.ts +200 -0
- package/src/metrics/types.ts +109 -0
- package/src/optimizer/index.ts +62 -0
- package/src/optimizer/noop.optimizer.ts +24 -0
- package/src/optimizer/rule-based.optimizer.ts +248 -0
- package/src/optimizer/types.ts +53 -0
- package/src/pipeline/events.ts +130 -0
- package/src/pipeline/index.ts +19 -0
- package/src/pipeline/runner.ts +161 -0
- package/src/pipeline/stages/acceptance.ts +197 -0
- package/src/pipeline/stages/completion.ts +99 -0
- package/src/pipeline/stages/constitution.ts +63 -0
- package/src/pipeline/stages/context.ts +117 -0
- package/src/pipeline/stages/execution.ts +194 -0
- package/src/pipeline/stages/index.ts +62 -0
- package/src/pipeline/stages/optimizer.ts +74 -0
- package/src/pipeline/stages/prompt.ts +57 -0
- package/src/pipeline/stages/queue-check.ts +103 -0
- package/src/pipeline/stages/review.ts +181 -0
- package/src/pipeline/stages/routing.ts +81 -0
- package/src/pipeline/stages/verify.ts +100 -0
- package/src/pipeline/types.ts +167 -0
- package/src/plugins/index.ts +31 -0
- package/src/plugins/loader.ts +287 -0
- package/src/plugins/registry.ts +168 -0
- package/src/plugins/types.ts +327 -0
- package/src/plugins/validator.ts +352 -0
- package/src/prd/index.ts +172 -0
- package/src/prd/types.ts +202 -0
- package/src/precheck/checks-blockers.ts +391 -0
- package/src/precheck/checks-warnings.ts +142 -0
- package/src/precheck/checks.ts +30 -0
- package/src/precheck/index.ts +247 -0
- package/src/precheck/story-size-gate.ts +144 -0
- package/src/precheck/types.ts +31 -0
- package/src/queue/index.ts +2 -0
- package/src/queue/manager.ts +254 -0
- package/src/queue/types.ts +54 -0
- package/src/review/index.ts +8 -0
- package/src/review/runner.ts +172 -0
- package/src/review/types.ts +66 -0
- package/src/routing/builder.ts +81 -0
- package/src/routing/chain.ts +74 -0
- package/src/routing/index.ts +16 -0
- package/src/routing/loader.ts +58 -0
- package/src/routing/router.ts +303 -0
- package/src/routing/strategies/adaptive.ts +215 -0
- package/src/routing/strategies/index.ts +8 -0
- package/src/routing/strategies/keyword.ts +163 -0
- package/src/routing/strategies/llm-prompts.ts +209 -0
- package/src/routing/strategies/llm.ts +235 -0
- package/src/routing/strategies/manual.ts +50 -0
- package/src/routing/strategy.ts +99 -0
- package/src/tdd/cleanup.ts +111 -0
- package/src/tdd/index.ts +23 -0
- package/src/tdd/isolation.ts +123 -0
- package/src/tdd/orchestrator.ts +383 -0
- package/src/tdd/prompts.ts +270 -0
- package/src/tdd/rectification-gate.ts +183 -0
- package/src/tdd/session-runner.ts +179 -0
- package/src/tdd/types.ts +81 -0
- package/src/tdd/verdict.ts +271 -0
- package/src/tui/App.tsx +265 -0
- package/src/tui/components/AgentPanel.tsx +75 -0
- package/src/tui/components/CostOverlay.tsx +118 -0
- package/src/tui/components/HelpOverlay.tsx +107 -0
- package/src/tui/components/StatusBar.tsx +63 -0
- package/src/tui/components/StoriesPanel.tsx +177 -0
- package/src/tui/hooks/useKeyboard.ts +142 -0
- package/src/tui/hooks/useLayout.ts +137 -0
- package/src/tui/hooks/usePipelineEvents.ts +183 -0
- package/src/tui/hooks/usePty.ts +194 -0
- package/src/tui/index.tsx +38 -0
- package/src/tui/types.ts +76 -0
- package/src/utils/git.ts +83 -0
- package/src/utils/queue-writer.ts +54 -0
- package/src/verification/executor.ts +235 -0
- package/src/verification/gate.ts +207 -0
- package/src/verification/index.ts +12 -0
- package/src/verification/parser.ts +230 -0
- package/src/verification/rectification.ts +108 -0
- package/src/verification/types.ts +113 -0
- package/src/worktree/dispatcher.ts +65 -0
- package/src/worktree/index.ts +2 -0
- package/src/worktree/manager.ts +187 -0
- package/src/worktree/merge.ts +301 -0
- package/src/worktree/types.ts +4 -0
- package/test/TEST_COVERAGE_US001.md +217 -0
- package/test/TEST_COVERAGE_US003.md +84 -0
- package/test/TEST_COVERAGE_US005.md +86 -0
- package/test/US-002-orchestrator.test.ts +246 -0
- package/test/acceptance/cm-003-default-view.test.ts +194 -0
- package/test/execution/pid-registry.test.ts +240 -0
- package/test/execution/post-verify.test.ts +224 -0
- package/test/helpers/timeout.ts +42 -0
- package/test/integration/US-002-TEST-SUMMARY.md +107 -0
- package/test/integration/US-003-TEST-SUMMARY.md +149 -0
- package/test/integration/US-004-TEST-SUMMARY.md +106 -0
- package/test/integration/US-005-TEST-SUMMARY.md +138 -0
- package/test/integration/US-007-TEST-SUMMARY.md +100 -0
- package/test/integration/agent-validation.test.ts +439 -0
- package/test/integration/analyze-integration.test.ts +261 -0
- package/test/integration/analyze-scanner.test.ts +131 -0
- package/test/integration/cli-config-default-edge-cases.test.ts +222 -0
- package/test/integration/cli-config-default-view.test.ts +229 -0
- package/test/integration/cli-config-diff.test.ts +460 -0
- package/test/integration/cli-config.test.ts +736 -0
- package/test/integration/cli-diagnose.test.ts +592 -0
- package/test/integration/cli-logs.test.ts +314 -0
- package/test/integration/cli-plugins.test.ts +678 -0
- package/test/integration/cli-precheck.test.ts +371 -0
- package/test/integration/cli-run-headless.test.ts +173 -0
- package/test/integration/cli.test.ts +75 -0
- package/test/integration/config/merger.test.ts +465 -0
- package/test/integration/config/paths.test.ts +51 -0
- package/test/integration/config-loader.test.ts +265 -0
- package/test/integration/config.test.ts +444 -0
- package/test/integration/context-integration.test.ts +702 -0
- package/test/integration/context-provider-injection.test.ts +506 -0
- package/test/integration/context-verification-integration.test.ts +295 -0
- package/test/integration/e2e.test.ts +896 -0
- package/test/integration/execution.test.ts +625 -0
- package/test/integration/helpers.test.ts +295 -0
- package/test/integration/hooks.test.ts +361 -0
- package/test/integration/interaction-chain-pipeline.test.ts +464 -0
- package/test/integration/isolation.test.ts +143 -0
- package/test/integration/logger.test.ts +461 -0
- package/test/integration/parallel.test.ts +250 -0
- package/test/integration/path-security.test.ts +173 -0
- package/test/integration/pipeline-acceptance.test.ts +302 -0
- package/test/integration/pipeline-events.test.ts +475 -0
- package/test/integration/pipeline.test.ts +658 -0
- package/test/integration/plan.test.ts +157 -0
- package/test/integration/plugin-routing.test.ts +921 -0
- package/test/integration/plugins/config-integration.test.ts +172 -0
- package/test/integration/plugins/config-resolution.test.ts +522 -0
- package/test/integration/plugins/loader.test.ts +641 -0
- package/test/integration/plugins/registry.test.ts +746 -0
- package/test/integration/plugins/validator.test.ts +563 -0
- package/test/integration/prd-pause.test.ts +205 -0
- package/test/integration/prd-resolvers.test.ts +185 -0
- package/test/integration/precheck-integration.test.ts +468 -0
- package/test/integration/precheck.test.ts +805 -0
- package/test/integration/progress.test.ts +34 -0
- package/test/integration/rectification-flow.test.ts +512 -0
- package/test/integration/reporter-lifecycle.test.ts +860 -0
- package/test/integration/review-config-commands.test.ts +319 -0
- package/test/integration/review-config-schema.test.ts +116 -0
- package/test/integration/review-plugin-integration.test.ts +722 -0
- package/test/integration/review.test.ts +149 -0
- package/test/integration/routing-stage-bug-021.test.ts +274 -0
- package/test/integration/routing-stage-greenfield.test.ts +286 -0
- package/test/integration/runner-config-plugins.test.ts +461 -0
- package/test/integration/runner-fixes.test.ts +399 -0
- package/test/integration/runner-plugin-integration.test.ts +543 -0
- package/test/integration/runner.test.ts +1679 -0
- package/test/integration/s5-greenfield-fallback.test.ts +297 -0
- package/test/integration/status-file-integration.test.ts +325 -0
- package/test/integration/status-file.test.ts +379 -0
- package/test/integration/status-writer.test.ts +345 -0
- package/test/integration/story-id-in-events.test.ts +273 -0
- package/test/integration/tdd-cleanup.test.ts +246 -0
- package/test/integration/tdd-orchestrator.test.ts +1762 -0
- package/test/integration/test-scanner.test.ts +403 -0
- package/test/integration/verification-asset-check.test.ts +142 -0
- package/test/integration/verify-stage.test.ts +275 -0
- package/test/integration/worktree/manager.test.ts +218 -0
- package/test/integration/worktree/merge.test.ts +341 -0
- package/test/manual/logging-formatter-demo.ts +158 -0
- package/test/ui/tui-agent-panel.test.tsx +99 -0
- package/test/ui/tui-controls.test.ts +334 -0
- package/test/ui/tui-cost-and-pty.test.ts +189 -0
- package/test/ui/tui-layout.test.ts +378 -0
- package/test/ui/tui-pty-integration.test.tsx +159 -0
- package/test/ui/tui-stories.test.ts +332 -0
- package/test/unit/acceptance.test.ts +186 -0
- package/test/unit/agent-stderr-capture.test.ts +146 -0
- package/test/unit/analyze-classifier.test.ts +215 -0
- package/test/unit/analyze.test.ts +224 -0
- package/test/unit/auto-detect.test.ts +249 -0
- package/test/unit/cli-status.test.ts +417 -0
- package/test/unit/commands/common.test.ts +320 -0
- package/test/unit/commands/logs.test.ts +416 -0
- package/test/unit/commands/unlock.test.ts +319 -0
- package/test/unit/constitution-generators.test.ts +160 -0
- package/test/unit/constitution.test.ts +209 -0
- package/test/unit/context.test.ts +1722 -0
- package/test/unit/cost.test.ts +231 -0
- package/test/unit/crash-recovery.test.ts +308 -0
- package/test/unit/escalation.test.ts +126 -0
- package/test/unit/execution-logging-stderr.test.ts +156 -0
- package/test/unit/execution-stage.test.ts +122 -0
- package/test/unit/fix-generator.test.ts +275 -0
- package/test/unit/formatters.test.ts +469 -0
- package/test/unit/greenfield.test.ts +179 -0
- package/test/unit/helpers.test.ts +317 -0
- package/test/unit/interaction/human-review-trigger.test.ts +164 -0
- package/test/unit/interaction-network-failures.test.ts +389 -0
- package/test/unit/interaction-plugins.test.ts +164 -0
- package/test/unit/isolation.test.ts +134 -0
- package/test/unit/logging/formatter.test.ts +455 -0
- package/test/unit/merge.test.ts +268 -0
- package/test/unit/metrics.test.ts +276 -0
- package/test/unit/optimizer/noop.optimizer.test.ts +125 -0
- package/test/unit/optimizer/rule-based.optimizer.test.ts +358 -0
- package/test/unit/prd-auto-default.test.ts +290 -0
- package/test/unit/prd-failure-category.test.ts +176 -0
- package/test/unit/prd-get-next-story.test.ts +186 -0
- package/test/unit/precheck-checks.test.ts +840 -0
- package/test/unit/precheck-story-size-gate.test.ts +287 -0
- package/test/unit/precheck-types.test.ts +142 -0
- package/test/unit/prompts.test.ts +475 -0
- package/test/unit/queue.test.ts +237 -0
- package/test/unit/rectification.test.ts +284 -0
- package/test/unit/registry.test.ts +287 -0
- package/test/unit/routing.test.ts +937 -0
- package/test/unit/run-lifecycle.test.ts +140 -0
- package/test/unit/storyid-events.test.ts +224 -0
- package/test/unit/tdd-verdict.test.ts +492 -0
- package/test/unit/test-output-parser.test.ts +377 -0
- package/test/unit/verdict.test.ts +324 -0
- package/test/unit/worktree-manager.test.ts +158 -0
- package/tsconfig.json +27 -0
@@ -0,0 +1,1137 @@
# Deep Code Review: ngent v0.1.0

**Date:** 2026-02-17
**Reviewer:** Subrina (AI Code Reviewer)
**Version:** 0.1.0
**Files:** 31 source files (~3310 LOC), 12 test files (~3492 LOC)
**Test Status:** 156 tests passing, 0 failing
**TypeScript:** ✓ No type errors

---
## Overall Grade: B+ (82/100)

**Summary:**

ngent is a well-architected CLI orchestrator with strong TDD principles, clean separation of concerns, and thoughtful complexity routing. The codebase demonstrates solid TypeScript practices with comprehensive type safety, good test coverage (156 tests), and clear module boundaries. Major strengths include the three-session TDD isolation enforcement, configurable model escalation, and the context builder's defensive programming.

However, several CRITICAL, HIGH, and MEDIUM priority issues prevent this from reaching production-ready status: the agent execution layer is stubbed (marked with TODOs), command injection vulnerabilities exist in hook execution, error handling lacks specificity in failure scenarios, and the cost estimation relies on brittle regex parsing. The batch execution logic is complex (779 LOC in `runner.ts`) and would benefit from refactoring. Memory management for large PRDs is unaddressed.

**Grade Breakdown:**

| Dimension | Score | Notes |
|:---|:---|:---|
| **Security** | 14/20 | Command injection risk in hooks, no input sanitization for shell commands |
| **Reliability** | 16/20 | Good error boundaries, but lacks agent timeout recovery, memory limits |
| **API Design** | 18/20 | Clean interfaces, good TypeScript usage, barrel exports, minor inconsistencies |
| **Code Quality** | 18/20 | Well-organized, clear naming, but runner.ts is 779 LOC (needs splitting) |
| **Best Practices** | 16/20 | Strong TDD patterns, good config layering, missing JSDoc, incomplete agent impl |

---
## Findings

### 🔴 CRITICAL

#### SEC-1: Command Injection Vulnerability in Hook Execution
**Severity:** CRITICAL | **Category:** Security

**Location:** `src/hooks/runner.ts:73-79`

```typescript
const proc = Bun.spawn(["bash", "-c", hookDef.command], {
  cwd: workdir,
  stdin: new Response(contextJson),
  stdout: "pipe",
  stderr: "pipe",
  env: { ...process.env, ...env },
});
```

**Risk:** Hook commands are executed via `bash -c` with no sanitization. If `hooks.json` is compromised or user-supplied (even indirectly), an attacker can execute arbitrary shell commands. Environment variables from `buildEnv()` are interpolated into shell commands, creating additional injection vectors.

**Attack Scenario:**
```json
{
  "hooks": {
    "on-start": {
      "command": "echo 'Starting'; rm -rf / #",
      "enabled": true
    }
  }
}
```

**Fix:**
1. Validate hook commands against an allowlist of safe commands/patterns
2. Never use `bash -c`; use direct command execution with an argv array
3. Escape/quote all environment variables before shell interpolation
4. Consider restricting hooks to script files (not inline commands)
5. Add a security warning in documentation about hook command safety
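
Fixes 1 and 2 could be combined roughly as follows. This is a minimal sketch, not the project's implementation: `ALLOWED_HOOK_BINARIES`, `SHELL_METACHARACTERS`, `toHookArgv`, and the whitespace-only tokenization are assumptions made for illustration. An allowlist of binaries is also not a complete defense on its own, since an allowed binary may still accept dangerous arguments.

```typescript
// Hypothetical allowlist; which binaries hooks may run is a project decision.
const ALLOWED_HOOK_BINARIES = new Set<string>(["echo", "node", "bun", "git"]);

// Characters that only a shell would interpret. Rejecting them means the
// command can be treated as a plain argv array.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\'"\n#!*?~]/;

/**
 * Converts a hook command string into an argv array. Throws if the command
 * would need a shell or names a binary outside the allowlist.
 */
function toHookArgv(command: string): string[] {
  if (SHELL_METACHARACTERS.test(command)) {
    throw new Error(`Hook command needs a shell, which is not allowed: ${command}`);
  }
  const argv = command.trim().split(/\s+/).filter((part) => part.length > 0);
  const binary = argv[0];
  if (binary === undefined) {
    throw new Error("Hook command is empty");
  }
  if (!ALLOWED_HOOK_BINARIES.has(binary)) {
    throw new Error(`Hook binary is not allowlisted: ${binary}`);
  }
  return argv;
}

// With an argv array, the spawn call from the finding no longer involves bash:
//   Bun.spawn(toHookArgv(hookDef.command), { cwd: workdir, stdout: "pipe", stderr: "pipe" });
```

With this shape, the attack scenario above is rejected before anything is spawned, because the command contains quotes and a semicolon. Storing hooks as argv arrays in `hooks.json` would avoid the string parsing entirely.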

**Priority:** P0: must fix before v1.0 or any production use

---

#### BUG-1: Agent Execution Not Implemented
**Severity:** CRITICAL | **Category:** Bug

**Location:** `src/agents/claude.ts:33-83`, `src/execution/runner.ts:578`

The core functionality (actually spawning agent sessions) is implemented but **untested in production scenarios**. The `ClaudeCodeAdapter.run()` method spawns the `claude` binary, but:

1. No validation that `claude` binary is actually installed before use
2. No retry logic for transient failures (network, API errors)
3. Timeout handling kills the process but doesn't distinguish a timeout from a crash
4. Rate limit detection is a brittle heuristic (string matching in stderr)
5. Cost estimation falls back to duration-based guessing (inaccurate)

**Risk:**
- Silent failures in production (agent not installed, binary path wrong)
- Cost tracking inaccurate (budget overruns)
- Rate limits not handled correctly (infinite loop or premature abort)

**Fix:**
1. Check `agent.isInstalled()` before run() and fail fast with clear error
2. Add retry logic with exponential backoff for transient failures
3. Improve rate limit detection (parse structured error responses)
4. Improve cost estimation (parse token usage from structured output, not regex)
5. Add integration tests with real agent (or mock agent binary)
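
The retry logic in fix 2 might be sketched as below. The names `RetryOptions`, `backoffDelayMs`, and `retryWithBackoff` are hypothetical and not part of the reviewed code; how an error is classified as transient is left to the caller through `isTransient`, since the review does not specify it.

```typescript
/** Options for the hypothetical retry helper. */
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // upper bound for any single delay
  isTransient: (error: unknown) => boolean; // true if a retry makes sense
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

/** Delay before the given retry (1-based): doubles each time, capped at maxDelayMs. */
function backoffDelayMs(retry: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** (retry - 1));
}

/** Runs `operation`, retrying transient failures with exponential backoff. */
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new Error("maxAttempts must be at least 1");
  }
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or when the failure is not worth retrying.
      if (attempt >= options.maxAttempts || !options.isTransient(error)) {
        throw error;
      }
      await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs));
    }
  }
}
```

A caller would wrap the agent invocation in `retryWithBackoff` and pass an `isTransient` predicate that recognizes network and rate-limit failures, ideally from structured error output as fix 3 suggests.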

**Priority:** P0: core functionality, blocks real-world usage

---

### 🟠 HIGH

#### SEC-2: Path Traversal Risk in File Operations
**Severity:** HIGH | **Category:** Security

**Location:** `bin/ngent.ts:37-80`, `src/config/loader.ts:19-31`

Multiple file operations use user-supplied paths without validation:

```typescript
// bin/ngent.ts:37
const ngentDir = join(options.dir, "ngent");
// No validation that options.dir is within safe bounds

// src/config/loader.ts:23
const candidate = join(dir, "ngent");
// Walks up filesystem without bounds checking
```

**Risk:**
- User could pass `--dir /etc` and initialize ngent in system directories
- `findProjectDir()` walks up to filesystem root without limit (DoS potential)
- Malicious PRD paths could reference files outside project directory

**Fix:**
1. Validate `--dir` is within user's home directory or workspace
2. Add max depth limit to `findProjectDir()` (e.g., 10 levels)
3. Resolve all paths with `path.resolve()` and check bounds
4. Add `realpath` checks to detect symlink escapes
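
Fixes 2 through 4 could look roughly like this. The helper names `isWithin`, `assertInsideRoot`, and `candidateDirs` are hypothetical; the 10-level default comes from the suggestion above, and which root directory counts as "safe" is left to the caller.

```typescript
import { realpathSync } from "node:fs";
import { dirname, isAbsolute, relative, resolve, sep } from "node:path";

/** True when `target` is `root` itself or lies beneath it (lexical check only). */
function isWithin(root: string, target: string): boolean {
  const rel = relative(resolve(root), resolve(target));
  if (rel === "") return true; // same directory
  // A relative path that climbs out of root starts with ".." as a full segment;
  // an absolute result means a different drive on Windows.
  return rel !== ".." && !rel.startsWith(`..${sep}`) && !isAbsolute(rel);
}

/**
 * Resolves symlinks on both paths before the bounds check, so a link inside
 * the root that points elsewhere is rejected. realpathSync throws if either
 * path does not exist.
 */
function assertInsideRoot(root: string, target: string): string {
  const realRoot = realpathSync(root);
  const realTarget = realpathSync(target);
  if (!isWithin(realRoot, realTarget)) {
    throw new Error(`Path escapes allowed root: ${target}`);
  }
  return realTarget;
}

/** Lists `start` and its ancestors, nearest first, visiting at most `maxDepth` directories. */
function candidateDirs(start: string, maxDepth = 10): string[] {
  const dirs: string[] = [];
  let current = resolve(start);
  for (let depth = 0; depth < maxDepth; depth++) {
    dirs.push(current);
    const parent = dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return dirs;
}
```

A bounded `findProjectDir()` could then iterate over `candidateDirs(cwd)` and test each entry for an `ngent` directory, instead of walking up without a limit.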

**Priority:** P0: security boundary violation

---

#### BUG-2: Race Condition in Queue File Handling
**Severity:** HIGH | **Category:** Bug

**Location:** `src/execution/runner.ts:414-481`, `src/execution/runner.ts:632-680`

The queue file is read, parsed, and cleared at two points in the loop:
1. Before batch execution (line 415)
2. After story completion (line 633)

**Race Condition:**
- If user writes to `.queue.txt` between read and clear, commands are lost
- Concurrent ngent runs (if ever supported) would conflict on `.queue.txt`
- No atomic file operations (read-modify-clear should be transactional)

**Risk:**
- User's PAUSE/SKIP commands silently ignored
- Unpredictable behavior if file modified during execution

**Fix:**
1. Use atomic file operations (read+rename or file locking)
2. Add sequence number or timestamp to detect file changes
3. Document that `.queue.txt` is not safe for concurrent writes
4. Consider using a proper queue (SQLite, message queue)
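
The rename-based variant of fix 1 might look like the sketch below. `drainQueueFile` and the `.claimed` suffix are hypothetical names, the one-command-per-line format and the `SKIP US-003` syntax are assumed for illustration, and the approach assumes writers open, append, and close the file for each command rather than holding it open.

```typescript
import { mkdtempSync, readFileSync, renameSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

/**
 * Claims the queue file by renaming it, then reads and deletes the claimed copy.
 * rename() is atomic within one filesystem, so a command written after the
 * rename lands in a fresh queue file and is picked up by the next drain.
 */
function drainQueueFile(queuePath: string): string[] {
  const claimedPath = `${queuePath}.${process.pid}.${Date.now()}.claimed`;
  try {
    renameSync(queuePath, claimedPath);
  } catch (error) {
    // No queue file means no pending commands.
    if ((error as { code?: string }).code === "ENOENT") return [];
    throw error;
  }
  const content = readFileSync(claimedPath, "utf8");
  unlinkSync(claimedPath);
  return content
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Usage: queue two commands in a temporary directory, then drain them.
const demoQueue = join(mkdtempSync(join(tmpdir(), "queue-demo-")), ".queue.txt");
writeFileSync(demoQueue, "PAUSE\nSKIP US-003\n");
const drained = drainQueueFile(demoQueue);
console.log(drained);
```

Because the read happens on the claimed copy, there is no window between "read" and "clear" in which a newly written command can be deleted unseen. Concurrent runs would each claim a distinct file name, although coordinating which run acts on a command is a separate problem.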
|
|
162
|
+
|
|
163
|
+
**Priority:** P1 — Impacts user control flow reliability
|
|
164
|
+
|
|
165
|
+
---
|
|
166
|
+
|
|
167
|
+
#### MEM-1: Unbounded Memory Growth for Large PRDs
|
|
168
|
+
**Severity:** HIGH | **Category:** Memory
|
|
169
|
+
|
|
170
|
+
**Location:** `src/execution/runner.ts:338-352`, `src/context/builder.ts:148-215`
|
|
171
|
+
|
|
172
|
+
PRD is loaded into memory on every iteration (line 352), and context builder loads all dependency stories without pagination:
|
|
173
|
+
|
|
174
|
+
```typescript
|
|
175
|
+
// No pagination, loads full PRD every iteration
|
|
176
|
+
prd = await loadPRD(prdPath);
|
|
177
|
+
|
|
178
|
+
// Context builder loads all dependencies into memory
|
|
179
|
+
for (const depId of currentStory.dependencies) {
|
|
180
|
+
const depStory = prd.userStories.find((s) => s.id === depId);
|
|
181
|
+
elements.push(createDependencyContext(depStory, 50));
|
|
182
|
+
}
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
**Risk:**
|
|
186
|
+
- Large PRDs (1000+ stories) cause OOM crashes
|
|
187
|
+
- No memory pressure detection or backpressure
|
|
188
|
+
- Context builder token budget is conservative but doesn't prevent loading 100+ stories into memory
|
|
189
|
+
|
|
190
|
+
**Worst Case:**
|
|
191
|
+
- 1000 stories × 10KB each = 10MB PRD JSON
|
|
192
|
+
- Reloaded every iteration (20 iterations) = 200MB allocated in total over the run
|
|
193
|
+
- Context builder processes all dependencies (100 deps × 1000 stories = 100,000 checks)
|
|
194
|
+
|
|
195
|
+
**Fix:**
|
|
196
|
+
1. Add PRD size limit validation (e.g., max 500 stories)
|
|
197
|
+
2. Implement lazy loading for large PRDs (only load next N stories)
|
|
198
|
+
3. Add memory usage tracking and abort if threshold exceeded
|
|
199
|
+
4. Paginate dependency resolution in context builder
|
|
200
|
+
5. Consider streaming JSON parsing for large PRDs
|
|
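Step 1 (size limit validation) is the cheapest of these to add. A minimal sketch, assuming a PRD object with a `userStories` array as in the snippet above; `assertPrdWithinLimits`, `PrdLike`, and the default of 500 are illustrative names and values, not existing code:

```typescript
// Minimal shapes for the sketch; the real PRD type has more fields.
interface StoryLike {
  id: string;
}
interface PrdLike {
  userStories: StoryLike[];
}

// Example limit taken from the "max 500 stories" suggestion above.
const MAX_STORIES = 500;

// Hypothetical guard: fail fast before the execution loop starts.
export function assertPrdWithinLimits(prd: PrdLike, maxStories: number = MAX_STORIES): void {
  const count = prd.userStories.length;
  if (count > maxStories) {
    throw new Error(`PRD has ${count} stories; the limit is ${maxStories}. Split the feature into smaller PRDs.`);
  }
}

// Usage: a small PRD passes silently.
const smallPrd: PrdLike = { userStories: [{ id: "US-001" }] };
assertPrdWithinLimits(smallPrd);
```

Calling this once after the initial load keeps the check out of the per-iteration path.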
201
|
+
|
|
202
|
+
**Priority:** P1 — Blocks large-scale usage
|
|
203
|
+
|
|
204
|
+
---
|
|
205
|
+
|
|
206
|
+
#### PERF-1: O(n²) Complexity in Batch Story Selection
|
|
207
|
+
**Severity:** HIGH | **Category:** Performance
|
|
208
|
+
|
|
209
|
+
**Location:** `src/execution/runner.ts:377-412`
|
|
210
|
+
|
|
211
|
+
Batch story selection has nested loops that re-check routing for every candidate:
|
|
212
|
+
|
|
213
|
+
```typescript
|
|
214
|
+
for (let i = currentIndex + 1; i < readyStories.length && batchCandidates.length < 4; i++) {
|
|
215
|
+
const candidate = readyStories[i];
|
|
216
|
+
// This check happens for every candidate in every iteration
|
|
217
|
+
if (
|
|
218
|
+
candidate.routing?.complexity === "simple" &&
|
|
219
|
+
candidate.routing?.testStrategy === "test-after"
|
|
220
|
+
) {
|
|
221
|
+
batchCandidates.push(candidate);
|
|
222
|
+
}
|
|
223
|
+
}
|
|
224
|
+
```
|
|
225
|
+
|
|
226
|
+
**Complexity Analysis:**
|
|
227
|
+
- `getAllReadyStories()`: O(n) over all stories
|
|
228
|
+
- Batch candidate selection: O(n) in worst case
|
|
229
|
+
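- **Called every iteration**: O(iterations × n) overall, which reaches O(n²) only when the iteration count grows with the number of stories
|
|
230
|
+
|
|
231
|
+
For 500 stories over 20 iterations: about 10,000 checks per pass (20 × 500)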
|
|
232
|
+
|
|
233
|
+
**Fix:**
|
|
234
|
+
1. Pre-compute batch-eligible stories once at start
|
|
235
|
+
2. Use index/cache for ready stories instead of filtering every time
|
|
236
|
+
3. Mark stories with `routing` during analyze phase (already done) — use it!
|
|
237
|
+
4. Short-circuit batch selection after first non-simple story
|
|
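Steps 1 and 4 can be combined as sketched below. The `Story` and `Routing` shapes are reduced to the fields used in the snippet above, `buildBatchEligibleIds` and `selectBatch` are hypothetical names, and the sketch assumes routing does not change after the analyze phase. Unlike the current loop, it stops at the first ineligible story rather than skipping over it.

```typescript
interface Routing {
  complexity: string;
  testStrategy: string;
}
interface Story {
  id: string;
  routing?: Routing;
}

// Computed once up front instead of re-checking routing on every iteration.
export function buildBatchEligibleIds(stories: Story[]): Set<string> {
  const eligible = new Set<string>();
  for (const story of stories) {
    if (story.routing?.complexity === "simple" && story.routing?.testStrategy === "test-after") {
      eligible.add(story.id);
    }
  }
  return eligible;
}

// Takes consecutive eligible stories from startIndex and stops at the first
// ineligible story or when maxBatchSize is reached.
export function selectBatch(ready: Story[], startIndex: number, eligible: Set<string>, maxBatchSize = 4): Story[] {
  const batch: Story[] = [];
  for (let i = startIndex; i < ready.length && batch.length < maxBatchSize; i++) {
    if (!eligible.has(ready[i].id)) break;
    batch.push(ready[i]);
  }
  return batch;
}

// Usage
const simple: Routing = { complexity: "simple", testStrategy: "test-after" };
const readyStories: Story[] = [
  { id: "US-001", routing: simple },
  { id: "US-002", routing: simple },
  { id: "US-003", routing: { complexity: "complex", testStrategy: "three-session-tdd" } },
  { id: "US-004", routing: simple },
];
const eligibleIds = buildBatchEligibleIds(readyStories);
const firstBatch = selectBatch(readyStories, 0, eligibleIds);
```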
238
|
+
|
|
239
|
+
**Priority:** P1 — Degrades with scale
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
243
|
+
#### BUG-3: Cost Estimation Regex Brittle and Inaccurate
|
|
244
|
+
**Severity:** HIGH | **Category:** Bug
|
|
245
|
+
|
|
246
|
+
**Location:** `src/agents/cost.ts:48-60`
|
|
247
|
+
|
|
248
|
+
Cost estimation relies on regex parsing of agent stdout/stderr:
|
|
249
|
+
|
|
250
|
+
```typescript
|
|
251
|
+
export function parseTokenUsage(output: string): TokenUsage | null {
|
|
252
|
+
const inputMatch = output.match(/input\s+tokens?:\s*(\d+)/i);
|
|
253
|
+
const outputMatch = output.match(/output\s+tokens?:\s*(\d+)/i);
|
|
254
|
+
|
|
255
|
+
if (!inputMatch || !outputMatch) {
|
|
256
|
+
return null;
|
|
257
|
+
}
|
|
258
|
+
// ...
|
|
259
|
+
}
|
|
260
|
+
```
|
|
261
|
+
|
|
262
|
+
**Problems:**
|
|
263
|
+
1. Assumes agents output "Input tokens: N" format — not standardized
|
|
264
|
+
2. Case-insensitive match can catch false positives ("This input tokens: 42")
|
|
265
|
+
3. Fallback to duration-based estimate is wildly inaccurate ($0.01-$0.15/min)
|
|
266
|
+
4. No validation that parsed numbers are reasonable (could parse wrong numbers)
|
|
267
|
+
|
|
268
|
+
**Real-World Impact:**
|
|
269
|
+
- Cost tracking off by 50-300% in testing
|
|
270
|
+
- Users exceed budget without warning
|
|
271
|
+
- Billing surprises
|
|
272
|
+
|
|
273
|
+
**Fix:**
|
|
274
|
+
1. Use structured output from agents (JSON token usage)
|
|
275
|
+
2. Add per-agent token parsing strategies (polymorphic)
|
|
276
|
+
3. Log warnings when fallback estimate is used
|
|
277
|
+
4. Add confidence score to cost estimates
|
|
278
|
+
5. Allow manual cost override in config
|
|
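Steps 1, 3 and 4 could fit together roughly as below. This is a sketch only: the JSON shape `{"usage":{"input_tokens":…,"output_tokens":…}}` is an assumed format, not something any particular agent is known to emit, and `ParsedUsage` with its `confidence` field is a hypothetical extension of `TokenUsage`.

```typescript
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}
interface ParsedUsage {
  usage: TokenUsage;
  confidence: "high" | "low";
}

// Preferred path: a JSON line with a usage object (assumed shape, see lead-in).
function parseStructured(output: string): TokenUsage | null {
  for (const line of output.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("{")) continue;
    try {
      const usage = JSON.parse(trimmed)?.usage;
      if (Number.isInteger(usage?.input_tokens) && Number.isInteger(usage?.output_tokens)) {
        return { inputTokens: usage.input_tokens, outputTokens: usage.output_tokens };
      }
    } catch {
      // Not valid JSON; keep scanning.
    }
  }
  return null;
}

// Fallback: the regex from cost.ts, anchored to whole lines so that prose such as
// "This input tokens: 42" no longer matches.
function parseWithRegex(output: string): TokenUsage | null {
  const input = output.match(/^\s*input\s+tokens?:\s*(\d+)\s*$/im);
  const out = output.match(/^\s*output\s+tokens?:\s*(\d+)\s*$/im);
  if (!input || !out) return null;
  return { inputTokens: Number(input[1]), outputTokens: Number(out[1]) };
}

export function parseTokenUsage(output: string): ParsedUsage | null {
  const structured = parseStructured(output);
  if (structured) return { usage: structured, confidence: "high" };
  const fallback = parseWithRegex(output);
  return fallback ? { usage: fallback, confidence: "low" } : null;
}
```

A caller can log a warning whenever `confidence` is `"low"` or the result is `null`, which covers step 3 without changing the cost calculation itself.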
279
|
+
|
|
280
|
+
**Priority:** P1 — Core feature, budget enforcement broken
|
|
281
|
+
|
|
282
|
+
---
|
|
283
|
+
|
|
284
|
+
### 🟡 MEDIUM
|
|
285
|
+
|
|
286
|
+
#### ENH-1: Missing JSDoc Documentation
|
|
287
|
+
**Severity:** MEDIUM | **Category:** Enhancement
|
|
288
|
+
|
|
289
|
+
**Location:** All modules (global issue)
|
|
290
|
+
|
|
291
|
+
Only 15% of functions have JSDoc comments. Public APIs lack usage examples.
|
|
292
|
+
|
|
293
|
+
**Missing Documentation:**
|
|
294
|
+
- `routeTask()` — core routing logic, complex decision tree
|
|
295
|
+
- `buildContext()` — token budget algorithm, priority sorting
|
|
296
|
+
- `runThreeSessionTdd()` — isolation rules, session orchestration
|
|
297
|
+
- `escalateTier()` — escalation chain configuration
|
|
298
|
+
|
|
299
|
+
**Impact:**
|
|
300
|
+
- New contributors need to read implementation to understand API
|
|
301
|
+
- Maintenance becomes harder (what does this parameter do?)
|
|
302
|
+
- No IDE intellisense for usage examples
|
|
303
|
+
|
|
304
|
+
**Fix:**
|
|
305
|
+
Add JSDoc for all exported functions:
|
|
306
|
+
```typescript
|
|
307
|
+
/**
|
|
308
|
+
* Route a story to appropriate model tier and test strategy.
|
|
309
|
+
*
|
|
310
|
+
* Decision logic:
|
|
311
|
+
* 1. Classify complexity (simple/medium/complex/expert)
|
|
312
|
+
* 2. Map complexity to model tier via config.complexityRouting
|
|
313
|
+
* 3. Determine test strategy (test-after vs three-session-tdd)
|
|
314
|
+
*
|
|
315
|
+
* @param title - Story title
|
|
316
|
+
* @param description - Story description
|
|
317
|
+
* @param acceptanceCriteria - Array of acceptance criteria
|
|
318
|
+
* @param tags - Optional story tags (e.g., ["security", "public-api"])
|
|
319
|
+
* @param config - Ngent configuration
|
|
320
|
+
* @returns Routing decision with reasoning
|
|
321
|
+
*
|
|
322
|
+
* @example
|
|
323
|
+
* const decision = routeTask(
|
|
324
|
+
* "Add login form",
|
|
325
|
+
* "User should be able to log in",
|
|
326
|
+
* ["Form validation", "API integration"],
|
|
327
|
+
* ["security"],
|
|
328
|
+
* config
|
|
329
|
+
* );
|
|
330
|
+
* // decision.testStrategy === "three-session-tdd" (security-critical)
|
|
331
|
+
*/
|
|
332
|
+
```
|
|
333
|
+
|
|
334
|
+
**Priority:** P2 — Impacts maintainability and onboarding
|
|
335
|
+
|
|
336
|
+
---
|
|
337
|
+
|
|
338
|
+
#### TYPE-1: Unsafe Type Assertions in Config Loader
|
|
339
|
+
**Severity:** MEDIUM | **Category:** Type Safety
|
|
340
|
+
|
|
341
|
+
**Location:** `src/config/loader.ts:76-84`
|
|
342
|
+
|
|
343
|
+
```typescript
|
|
344
|
+
config = deepMerge(config as unknown as Record<string, unknown>, globalConf) as unknown as NgentConfig;
|
|
345
|
+
```
|
|
346
|
+
|
|
347
|
+
Double `as unknown as` casting bypasses TypeScript's type checking entirely.
|
|
348
|
+
|
|
349
|
+
**Risk:**
|
|
350
|
+
- Merged config could have wrong shape (missing fields, wrong types)
|
|
351
|
+
- Runtime errors disguised as type-safe code
|
|
352
|
+
- Validation happens AFTER merge (not during)
|
|
353
|
+
|
|
354
|
+
**Fix:**
|
|
355
|
+
1. Use Zod or io-ts for runtime schema validation
|
|
356
|
+
2. Parse config with schema, don't cast
|
|
357
|
+
3. Validate BEFORE merging (fail fast)
|
|
358
|
+
|
|
359
|
+
```typescript
|
|
360
|
+
import { z } from 'zod';
|
|
361
|
+
|
|
362
|
+
const NgentConfigSchema = z.object({
|
|
363
|
+
version: z.literal(1),
|
|
364
|
+
models: z.record(z.union([z.string(), z.object({ provider: z.string(), model: z.string() })])),
|
|
365
|
+
// ... full schema
|
|
366
|
+
});
|
|
367
|
+
|
|
368
|
+
export async function loadConfig(projectDir?: string): Promise<NgentConfig> {
|
|
369
|
+
// ... load logic
|
|
370
|
+
const parsed = NgentConfigSchema.safeParse(merged);
|
|
371
|
+
if (!parsed.success) {
|
|
372
|
+
throw new Error(`Invalid config: ${parsed.error.message}`);
|
|
373
|
+
}
|
|
374
|
+
return parsed.data;
|
|
375
|
+
}
|
|
376
|
+
```
|
|
377
|
+
|
|
378
|
+
**Priority:** P2 — Type safety at runtime
|
|
379
|
+
|
|
380
|
+
---
|
|
381
|
+
|
|
382
|
+
#### BUG-4: Batch Failure Logic Too Conservative
|
|
383
|
+
**Severity:** MEDIUM | **Category:** Bug
|
|
384
|
+
|
|
385
|
+
**Location:** `src/execution/runner.ts:682-761`
|
|
386
|
+
|
|
387
|
+
When a batch fails, only the first story is escalated. The remaining stories return to "pending" at the same tier. This is documented as intentional (lines 684-712), but it causes problems:
|
|
388
|
+
|
|
389
|
+
**Problems:**
|
|
390
|
+
1. If batch fails due to systemic issue (model tier too weak), all stories will fail individually at same tier before escalating
|
|
391
|
+
2. Wastes iterations and cost (4 stories × 2 attempts = 8 iterations wasted)
|
|
392
|
+
3. No way to configure alternative behavior (escalate entire batch)
|
|
393
|
+
|
|
394
|
+
**Example:**
|
|
395
|
+
- Batch: [US-001, US-002, US-003, US-004] on 'fast' tier fails
|
|
396
|
+
- Only US-001 escalates to 'balanced'
|
|
397
|
+
- US-002, US-003, US-004 retry on 'fast' (likely fail again)
|
|
398
|
+
- Total: 1 + 3 = 4 wasted iterations before all escalate
|
|
399
|
+
|
|
400
|
+
**Fix:**
|
|
401
|
+
1. Add config option: `batch.escalateEntireBatchOnFailure: boolean`
|
|
402
|
+
2. Track batch failure reason (timeout vs. test failure vs. model capability)
|
|
403
|
+
3. Escalate all if failure is systemic (not story-specific)
|
|
404
|
+
4. Add metrics to measure batch success rate by tier
|
|
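Steps 1 to 3 could be expressed as a small decision function. A sketch under assumptions: `FailureKind` mirrors the three reasons named in step 2, treating only `"model-capability"` as systemic is an illustrative choice, and `storiesToEscalate` is a hypothetical name.

```typescript
type FailureKind = "timeout" | "test-failure" | "model-capability";

interface BatchEscalationConfig {
  // Proposed option from step 1.
  escalateEntireBatchOnFailure: boolean;
}

// Returns the story IDs that should move to the next tier after a batch failure.
export function storiesToEscalate(batchIds: string[], failure: FailureKind, config: BatchEscalationConfig): string[] {
  // Assumption: only a capability failure says something about the tier itself.
  const systemic = failure === "model-capability";
  if (config.escalateEntireBatchOnFailure || systemic) {
    return [...batchIds];
  }
  // Current behavior: escalate the first story, retry the rest at the same tier.
  return batchIds.slice(0, 1);
}

// Usage
const failedBatch = ["US-001", "US-002", "US-003", "US-004"];
const conservative = storiesToEscalate(failedBatch, "test-failure", { escalateEntireBatchOnFailure: false });
const systemicCase = storiesToEscalate(failedBatch, "model-capability", { escalateEntireBatchOnFailure: false });
```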
405
|
+
|
|
406
|
+
**Priority:** P2 — Impacts efficiency and cost
|
|
407
|
+
|
|
408
|
+
---
|
|
409
|
+
|
|
410
|
+
#### ENH-2: No Agent Capability Negotiation
|
|
411
|
+
**Severity:** MEDIUM | **Category:** Enhancement
|
|
412
|
+
|
|
413
|
+
**Location:** `src/agents/types.ts`, `src/agents/claude.ts`
|
|
414
|
+
|
|
415
|
+
Agent adapters are passive — they don't declare capabilities:
|
|
416
|
+
- Which model tiers they support
|
|
417
|
+
- Max context window size
|
|
418
|
+
- Supported features (TDD, code review, etc.)
|
|
419
|
+
|
|
420
|
+
**Impact:**
|
|
421
|
+
- Can't validate config (user sets 'fast' tier to opus model — wrong!)
|
|
422
|
+
- Can't optimize routing (agent X better at task Y)
|
|
423
|
+
- No graceful degradation (if agent unavailable, can't fallback)
|
|
424
|
+
|
|
425
|
+
**Fix:**
|
|
426
|
+
Add capability metadata to `AgentAdapter`:
|
|
427
|
+
```typescript
|
|
428
|
+
export interface AgentAdapter {
|
|
429
|
+
readonly name: string;
|
|
430
|
+
readonly displayName: string;
|
|
431
|
+
readonly binary: string;
|
|
432
|
+
readonly capabilities: {
|
|
433
|
+
supportedTiers: ModelTier[];
|
|
434
|
+
maxContextTokens: number;
|
|
435
|
+
features: Set<'tdd' | 'review' | 'refactor'>;
|
|
436
|
+
};
|
|
437
|
+
// ...
|
|
438
|
+
}
|
|
439
|
+
```
|
|
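With capability metadata in place, config validation becomes a simple check. The sketch below redeclares a reduced `AgentCapabilities` type so it stands alone; `ModelTier` is typed as `string` because the full tier list is not shown here, and `checkAdapterSupports` is a hypothetical helper.

```typescript
type ModelTier = string;

interface AgentCapabilities {
  supportedTiers: ModelTier[];
  maxContextTokens: number;
  features: Set<"tdd" | "review" | "refactor">;
}

// Collects every mismatch instead of throwing on the first one.
export function checkAdapterSupports(
  agentName: string,
  capabilities: AgentCapabilities,
  tier: ModelTier,
  contextTokens: number,
): string[] {
  const errors: string[] = [];
  if (!capabilities.supportedTiers.includes(tier)) {
    errors.push(`${agentName} does not support tier "${tier}"`);
  }
  if (contextTokens > capabilities.maxContextTokens) {
    errors.push(`${agentName} context limit is ${capabilities.maxContextTokens} tokens, requested ${contextTokens}`);
  }
  return errors;
}

// Usage with made-up capability values
const demoCapabilities: AgentCapabilities = {
  supportedTiers: ["fast", "balanced"],
  maxContextTokens: 200_000,
  features: new Set(["tdd", "review"]),
};
const capabilityErrors = checkAdapterSupports("demo-agent", demoCapabilities, "expert", 250_000);
```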
440
|
+
|
|
441
|
+
**Priority:** P2 — Enables better routing and validation
|
|
442
|
+
|
|
443
|
+
---
|
|
444
|
+
|
|
445
|
+
#### PERF-2: Redundant PRD Reloads in Loop
|
|
446
|
+
**Severity:** MEDIUM | **Category:** Performance
|
|
447
|
+
|
|
448
|
+
**Location:** `src/execution/runner.ts:352`
|
|
449
|
+
|
|
450
|
+
PRD is reloaded from disk on EVERY iteration, even if unchanged:
|
|
451
|
+
|
|
452
|
+
```typescript
|
|
453
|
+
while (iterations < config.execution.maxIterations) {
|
|
454
|
+
iterations++;
|
|
455
|
+
prd = await loadPRD(prdPath); // Unnecessary I/O
|
|
456
|
+
// ...
|
|
457
|
+
}
|
|
458
|
+
```
|
|
459
|
+
|
|
460
|
+
**Impact:**
|
|
461
|
+
- 20 iterations × 10KB PRD = 200KB I/O per feature
|
|
462
|
+
- Adds 5-20ms latency per iteration (SSD) to 100-500ms (network FS)
|
|
463
|
+
- Agents don't modify prd.json directly (runner.ts updates it)
|
|
464
|
+
|
|
465
|
+
**Fix:**
|
|
466
|
+
1. Reload PRD only after agent execution (when it might change)
|
|
467
|
+
2. Use file watcher to detect external changes
|
|
468
|
+
3. Add dirty flag to track if reload needed
|
|
469
|
+
4. Cache PRD in memory with invalidation
|
|
470
|
+
|
|
471
|
+
```typescript
|
|
472
|
+
let prd = await loadPRD(prdPath);
|
|
473
|
+
let prdModified = false;
|
|
474
|
+
|
|
475
|
+
while (iterations < config.execution.maxIterations) {
|
|
476
|
+
if (prdModified) {
|
|
477
|
+
prd = await loadPRD(prdPath);
|
|
478
|
+
prdModified = false;
|
|
479
|
+
}
|
|
480
|
+
// ... execute; set prdModified = true only if something outside this loop may have rewritten prd.json ...
|
|
481
|
+
if (sessionSuccess) {
|
|
482
|
+
await savePRD(prd, prdPath);
|
|
483
|
+
// PRD is up-to-date, no reload needed
|
|
484
|
+
}
|
|
485
|
+
}
|
|
486
|
+
```
|
|
487
|
+
|
|
488
|
+
**Priority:** P2 — Optimization, not critical
|
|
489
|
+
|
|
490
|
+
---
|
|
491
|
+
|
|
492
|
+
#### BUG-5: Hook Timeout Kills Process but Doesn't Log Reason
|
|
493
|
+
**Severity:** MEDIUM | **Category:** Bug
|
|
494
|
+
|
|
495
|
+
**Location:** `src/hooks/runner.ts:82-95`
|
|
496
|
+
|
|
497
|
+
```typescript
|
|
498
|
+
const timeoutId = setTimeout(() => {
|
|
499
|
+
proc.kill("SIGTERM");
|
|
500
|
+
}, timeout);
|
|
501
|
+
|
|
502
|
+
const exitCode = await proc.exited;
|
|
503
|
+
clearTimeout(timeoutId);
|
|
504
|
+
```
|
|
505
|
+
|
|
506
|
+
If the hook times out, it is killed, but the caller only sees `exitCode !== 0` and cannot tell why.
|
|
507
|
+
|
|
508
|
+
**Impact:**
|
|
509
|
+
- User sees "Hook on-start failed" with no indication that it was a timeout
|
|
510
|
+
- Difficult to debug (is hook broken or just slow?)
|
|
511
|
+
|
|
512
|
+
**Fix:**
|
|
513
|
+
```typescript
|
|
514
|
+
let timedOut = false;
|
|
515
|
+
const timeoutId = setTimeout(() => {
|
|
516
|
+
timedOut = true;
|
|
517
|
+
proc.kill("SIGTERM");
|
|
518
|
+
}, timeout);
|
|
519
|
+
|
|
520
|
+
const exitCode = await proc.exited;
|
|
521
|
+
clearTimeout(timeoutId);
|
|
522
|
+
|
|
523
|
+
return {
|
|
524
|
+
success: exitCode === 0 && !timedOut,
|
|
525
|
+
output: timedOut
|
|
526
|
+
? `Hook timed out after ${timeout}ms`
|
|
527
|
+
: (stdout + stderr).trim(),
|
|
528
|
+
};
|
|
529
|
+
```
|
|
530
|
+
|
|
531
|
+
**Priority:** P2 — Debuggability
|
|
532
|
+
|
|
533
|
+
---
|
|
534
|
+
|
|
535
|
+
#### ENH-3: Context Builder Lacks File Content Support
|
|
536
|
+
**Severity:** MEDIUM | **Category:** Enhancement
|
|
537
|
+
|
|
538
|
+
**Location:** `src/context/builder.ts:86-114`
|
|
539
|
+
|
|
540
|
+
The context builder includes only story metadata (title, description, criteria). It does not load the source files that the story depends on.
|
|
541
|
+
|
|
542
|
+
**Impact:**
|
|
543
|
+
- Agents work blind (no codebase context)
|
|
544
|
+
- Users must manually add `relevantFiles` to stories
|
|
545
|
+
- Context is shallow (just requirements, not code)
|
|
546
|
+
|
|
547
|
+
**Fix:**
|
|
548
|
+
Add file content loading:
|
|
549
|
+
```typescript
|
|
550
|
+
export async function buildContext(
|
|
551
|
+
storyContext: StoryContext,
|
|
552
|
+
budget: ContextBudget,
|
|
553
|
+
workdir: string, // NEW
|
|
554
|
+
): Promise<BuiltContext> {
|
|
555
|
+
// ... existing logic ...
|
|
556
|
+
|
|
557
|
+
// Load relevant files if specified
|
|
558
|
+
if (currentStory.relevantFiles && currentStory.relevantFiles.length > 0) {
|
|
559
|
+
for (const filePath of currentStory.relevantFiles) {
|
|
560
|
+
const fullPath = join(workdir, filePath);
|
|
561
|
+
if (existsSync(fullPath)) {
|
|
562
|
+
const content = await Bun.file(fullPath).text();
|
|
563
|
+
const element = createFileContext(filePath, content, 60);
|
|
564
|
+
elements.push(element);
|
|
565
|
+
}
|
|
566
|
+
}
|
|
567
|
+
}
|
|
568
|
+
// ...
|
|
569
|
+
}
|
|
570
|
+
```
|
|
571
|
+
|
|
572
|
+
**Priority:** P3 — Enhancement, not blocker
|
|
573
|
+
|
|
574
|
+
---
|
|
575
|
+
|
|
576
|
+
#### STYLE-1: runner.ts is 779 Lines (Too Large)
|
|
577
|
+
**Severity:** MEDIUM | **Category:** Code Quality
|
|
578
|
+
|
|
579
|
+
**Location:** `src/execution/runner.ts` (779 LOC)
|
|
580
|
+
|
|
581
|
+
Main execution loop is monolithic and hard to follow:
|
|
582
|
+
- 60 LOC prompt builders (line 62-129)
|
|
583
|
+
- 50 LOC batch grouping logic (line 141-186)
|
|
584
|
+
- 200 LOC queue command processing (line 414-481, duplicated at 632-680)
|
|
585
|
+
- 80 LOC failure/escalation handling (line 682-761)
|
|
586
|
+
|
|
587
|
+
**Impact:**
|
|
588
|
+
- Hard to review changes (too much context)
|
|
589
|
+
- Difficult to test individual components
|
|
590
|
+
- Tight coupling (can't reuse batch logic elsewhere)
|
|
591
|
+
|
|
592
|
+
**Fix:**
|
|
593
|
+
Split into focused modules:
|
|
594
|
+
```
|
|
595
|
+
src/execution/
|
|
596
|
+
runner.ts // Main loop (200 LOC)
|
|
597
|
+
prompts.ts // Prompt builders
|
|
598
|
+
batching.ts // Batch grouping logic
|
|
599
|
+
queue-handler.ts // Queue command processing
|
|
600
|
+
escalation.ts // Failure handling and tier escalation
|
|
601
|
+
session.ts // Single/batch session execution
|
|
602
|
+
```
|
|
603
|
+
|
|
604
|
+
**Priority:** P3 — Refactoring, not urgent
|
|
605
|
+
|
|
606
|
+
---
|
|
607
|
+
|
|
608
|
+
### 🟢 LOW
|
|
609
|
+
|
|
610
|
+
#### ENH-4: No Progress Bar or Visual Feedback
|
|
611
|
+
**Severity:** LOW | **Category:** Enhancement
|
|
612
|
+
|
|
613
|
+
**Location:** `src/execution/runner.ts:348-768`
|
|
614
|
+
|
|
615
|
+
Long-running features (20 iterations) give minimal progress feedback. The user sees:
|
|
616
|
+
```
|
|
617
|
+
── Iteration 1 ──────────────────────
|
|
618
|
+
Story: US-001 — Add login
|
|
619
|
+
...
|
|
620
|
+
── Iteration 2 ──────────────────────
|
|
621
|
+
```
|
|
622
|
+
|
|
623
|
+
No indication of:
|
|
624
|
+
- How many stories remain (3/20 complete)
|
|
625
|
+
- Estimated time remaining
|
|
626
|
+
- Current cost vs. budget ($0.50 / $5.00)
|
|
627
|
+
|
|
628
|
+
**Fix:**
|
|
629
|
+
Add progress bar and status dashboard:
|
|
630
|
+
```typescript
|
|
631
|
+
console.log(chalk.cyan(`\n🚀 ngent: Starting ${feature}`));
|
|
632
|
+
console.log(chalk.dim(` Progress: [${counts.passed}/${counts.total}] stories`));
|
|
633
|
+
console.log(chalk.dim(` Budget: [$${totalCost.toFixed(2)}/$${config.execution.costLimit}]`));
|
|
634
|
+
console.log(chalk.dim(` ETA: ~${estimatedMinutes} minutes remaining`));
|
|
635
|
+
```
|
|
636
|
+
|
|
637
|
+
Use a library like `cli-progress` for real-time updates.
|
|
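The `estimatedMinutes` value in the snippet above has to come from somewhere. One simple option, sketched here as an assumption rather than existing code, is to extrapolate from the average time per completed story; `estimateMinutesRemaining` is a hypothetical helper.

```typescript
// Returns null until at least one story has completed, since there is no data to extrapolate from.
export function estimateMinutesRemaining(elapsedMs: number, completed: number, total: number): number | null {
  if (completed >= total) return 0;
  if (completed <= 0) return null;
  const averageMsPerStory = elapsedMs / completed;
  const remainingMs = averageMsPerStory * (total - completed);
  return Math.ceil(remainingMs / 60_000);
}

// Usage: 3 of 20 stories done after 6 minutes
const etaMinutes = estimateMinutesRemaining(6 * 60_000, 3, 20);
```

The estimate ignores tier escalation and retries, so it is best displayed as approximate.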
638
|
+
|
|
639
|
+
**Priority:** P3 — UX enhancement
|
|
640
|
+
|
|
641
|
+
---
|
|
642
|
+
|
|
643
|
+
#### TYPE-2: Missing Discriminated Union for Queue Commands
|
|
644
|
+
**Severity:** LOW | **Category:** Type Safety
|
|
645
|
+
|
|
646
|
+
**Location:** `src/queue/types.ts:46`
|
|
647
|
+
|
|
648
|
+
```typescript
|
|
649
|
+
export type QueueCommand = "PAUSE" | "ABORT" | { type: "SKIP"; storyId: string };
|
|
650
|
+
```
|
|
651
|
+
|
|
652
|
+
Mixed string literals and object — should be discriminated union:
|
|
653
|
+
|
|
654
|
+
```typescript
|
|
655
|
+
export type QueueCommand =
|
|
656
|
+
| { type: "PAUSE" }
|
|
657
|
+
| { type: "ABORT" }
|
|
658
|
+
| { type: "SKIP"; storyId: string };
|
|
659
|
+
```
|
|
660
|
+
|
|
661
|
+
**Fix:**
|
|
662
|
+
```typescript
|
|
663
|
+
// src/queue/types.ts
|
|
664
|
+
export type QueueCommand =
|
|
665
|
+
| { type: "PAUSE" }
|
|
666
|
+
| { type: "ABORT" }
|
|
667
|
+
| { type: "SKIP"; storyId: string };
|
|
668
|
+
|
|
669
|
+
// src/queue/manager.ts
|
|
670
|
+
export function parseQueueFile(content: string): QueueFileResult {
|
|
671
|
+
// ...
|
|
672
|
+
if (upper === "PAUSE") {
|
|
673
|
+
commands.push({ type: "PAUSE" });
|
|
674
|
+
} else if (upper === "ABORT") {
|
|
675
|
+
commands.push({ type: "ABORT" });
|
|
676
|
+
}
|
|
677
|
+
// ...
|
|
678
|
+
}
|
|
679
|
+
|
|
680
|
+
// src/execution/runner.ts
|
|
681
|
+
for (const cmd of queueCommands) {
|
|
682
|
+
switch (cmd.type) {
|
|
683
|
+
case "PAUSE":
|
|
684
|
+
// ...
|
|
685
|
+
break;
|
|
686
|
+
case "ABORT":
|
|
687
|
+
// ...
|
|
688
|
+
break;
|
|
689
|
+
case "SKIP":
|
|
690
|
+
console.log(`Skipping ${cmd.storyId}`);
|
|
691
|
+
break;
|
|
692
|
+
}
|
|
693
|
+
}
|
|
694
|
+
```
|
|
695
|
+
|
|
696
|
+
**Priority:** P3 — Type safety improvement
|
|
697
|
+
|
|
698
|
+
---
|
|
699
|
+
|
|
700
|
+
#### BUG-6: Analyze Command Doesn't Validate Story Dependencies
|
|
701
|
+
**Severity:** LOW | **Category:** Bug
|
|
702
|
+
|
|
703
|
+
**Location:** `src/cli/analyze.ts:46-140`
|
|
704
|
+
|
|
705
|
+
When parsing `tasks.md`, dependencies are extracted but not validated:
|
|
706
|
+
|
|
707
|
+
```typescript
|
|
708
|
+
const depsMatch = line.match(/^Dependencies:\s*(.+)/i);
|
|
709
|
+
if (depsMatch && currentStory) {
|
|
710
|
+
currentStory.dependencies = depsMatch[1]
|
|
711
|
+
.split(",")
|
|
712
|
+
.map((d) => d.trim())
|
|
713
|
+
.filter(Boolean);
|
|
714
|
+
}
|
|
715
|
+
```
|
|
716
|
+
|
|
717
|
+
No check that dependency story IDs actually exist in the PRD.
|
|
718
|
+
|
|
719
|
+
**Impact:**
|
|
720
|
+
- Warning at runtime when a dependency is not found (logged at line 184 of `context/builder.ts`)
|
|
721
|
+
- Stories blocked by non-existent dependencies (never executable)
|
|
722
|
+
|
|
723
|
+
**Fix:**
|
|
724
|
+
Add validation after parsing all stories:
|
|
725
|
+
```typescript
|
|
726
|
+
export async function analyzeFeature(
|
|
727
|
+
featureDir: string,
|
|
728
|
+
featureName: string,
|
|
729
|
+
branchName: string,
|
|
730
|
+
): Promise<PRD> {
|
|
731
|
+
// ... existing parsing ...
|
|
732
|
+
|
|
733
|
+
// Validate dependencies
|
|
734
|
+
const storyIds = new Set(userStories.map(s => s.id));
|
|
735
|
+
for (const story of userStories) {
|
|
736
|
+
for (const depId of story.dependencies) {
|
|
737
|
+
if (!storyIds.has(depId)) {
|
|
738
|
+
throw new Error(`Story ${story.id} depends on non-existent story ${depId}`);
|
|
739
|
+
}
|
|
740
|
+
}
|
|
741
|
+
}
|
|
742
|
+
|
|
743
|
+
return prd;
|
|
744
|
+
}
|
|
745
|
+
```
|
|
746
|
+
|
|
747
|
+
**Priority:** P3 — Edge case, caught during execution
|
|
748
|
+
|
|
749
|
+
---
|
|
750
|
+
|
|
751
|
+
#### ENH-5: No Dry-Run Mode for Three-Session TDD
|
|
752
|
+
**Severity:** LOW | **Category:** Enhancement
|
|
753
|
+
|
|
754
|
+
**Location:** `src/tdd/orchestrator.ts:213-326`
|
|
755
|
+
|
|
756
|
+
`runThreeSessionTdd()` doesn't respect `dryRun` flag — always executes agent.
|
|
757
|
+
|
|
758
|
+
**Impact:**
|
|
759
|
+
- Can't preview TDD workflow without running agents
|
|
760
|
+
- A dry run would also help when debugging routing decisions
|
|
761
|
+
|
|
762
|
+
**Fix:**
|
|
763
|
+
```typescript
|
|
764
|
+
export async function runThreeSessionTdd(
|
|
765
|
+
agent: AgentAdapter,
|
|
766
|
+
story: UserStory,
|
|
767
|
+
config: NgentConfig,
|
|
768
|
+
workdir: string,
|
|
769
|
+
modelTier: ModelTier,
|
|
770
|
+
contextMarkdown?: string,
|
|
771
|
+
dryRun: boolean = false, // NEW
|
|
772
|
+
): Promise<ThreeSessionTddResult> {
|
|
773
|
+
if (dryRun) {
|
|
774
|
+
console.log(chalk.yellow(` [DRY RUN] Would run 3-session TDD`));
|
|
775
|
+
console.log(chalk.dim(` Session 1: test-writer`));
|
|
776
|
+
console.log(chalk.dim(` Session 2: implementer`));
|
|
777
|
+
console.log(chalk.dim(` Session 3: verifier`));
|
|
778
|
+
return {
|
|
779
|
+
success: true,
|
|
780
|
+
sessions: [],
|
|
781
|
+
needsHumanReview: false,
|
|
782
|
+
totalCost: 0,
|
|
783
|
+
};
|
|
784
|
+
}
|
|
785
|
+
// ... existing logic ...
|
|
786
|
+
}
|
|
787
|
+
```
|
|
788
|
+
|
|
789
|
+
**Priority:** P3 — Minor UX improvement
|
|
790
|
+
|
|
791
|
+
---
|
|
792
|
+
|
|
793
|
+
#### PERF-3: Context Token Estimation is Conservative
|
|
794
|
+
**Severity:** LOW | **Category:** Performance
|
|
795
|
+
|
|
796
|
+
**Location:** `src/context/builder.ts:30-32`
|
|
797
|
+
|
|
798
|
+
```typescript
|
|
799
|
+
export function estimateTokens(text: string): number {
|
|
800
|
+
return Math.ceil(text.length / 3);
|
|
801
|
+
}
|
|
802
|
+
```
|
|
803
|
+
|
|
804
|
+
This formula overestimates tokens by ~20-40% for typical code/markdown mix.
|
|
805
|
+
|
|
806
|
+
**Impact:**
|
|
807
|
+
- Context budget underutilized (could fit more stories)
|
|
808
|
+
- Less context = worse agent performance
|
|
809
|
+
|
|
810
|
+
**Real-World Comparison:**
|
|
811
|
+
- English prose: 4 chars/token (GPT standard)
|
|
812
|
+
- Code: 2-3 chars/token
|
|
813
|
+
- Formula: 3 chars/token (middle ground)
|
|
814
|
+
|
|
815
|
+
**Fix:**
|
|
816
|
+
Use @anthropic-ai/tokenizer for tokenizer-based counts (still an approximation for newer Claude models):
|
|
817
|
+
```typescript
|
|
818
|
+
import { countTokens } from '@anthropic-ai/tokenizer';
|
|
819
|
+
|
|
820
|
+
export function estimateTokens(text: string): number {
|
|
821
|
+
return countTokens(text);
|
|
822
|
+
}
|
|
823
|
+
```
|
|
824
|
+
|
|
825
|
+
Or improve approximation:
|
|
826
|
+
```typescript
|
|
827
|
+
export function estimateTokens(text: string): number {
|
|
828
|
+
const fenceCount = (text.match(/```/g) || []).length; // rough proxy for how code-heavy the text is
|
|
829
|
+
const charsPerToken = Math.max(3, 4 - fenceCount / 10); // 4 for prose, down to 3 for code-heavy content
|
|
830
|
+
return Math.ceil(text.length / charsPerToken);
|
|
831
|
+
}
|
|
832
|
+
```
|
|
833
|
+
|
|
834
|
+
**Priority:** P3 — Optimization, not critical
|
|
835
|
+
|
|
836
|
+
---
|
|
837
|
+
|
|
838
|
+
#### STYLE-2: Inconsistent Error Handling Patterns
|
|
839
|
+
**Severity:** LOW | **Category:** Code Quality
|
|
840
|
+
|
|
841
|
+
**Location:** Various modules
|
|
842
|
+
|
|
843
|
+
Error handling is inconsistent:
|
|
844
|
+
- Some modules throw errors: `src/config/loader.ts:91`
|
|
845
|
+
- Some return null: `src/prd/index.ts:25`
|
|
846
|
+
- Some log warnings: `src/context/builder.ts:104`
|
|
847
|
+
- Some return success flags: `src/hooks/runner.ts:92`
|
|
848
|
+
|
|
849
|
+
**Examples:**
|
|
850
|
+
```typescript
|
|
851
|
+
// Throws
|
|
852
|
+
if (!validation.valid) {
|
|
853
|
+
throw new Error(`Invalid configuration:\n${validation.errors.join("\n")}`);
|
|
854
|
+
}
|
|
855
|
+
|
|
856
|
+
// Returns null
|
|
857
|
+
export function getNextStory(prd: PRD): UserStory | null {
|
|
858
|
+
return prd.userStories.find(...) ?? null;
|
|
859
|
+
}
|
|
860
|
+
|
|
861
|
+
// Logs warning
|
|
862
|
+
console.warn(`⚠️ Story ${story.id} has invalid acceptanceCriteria`);
|
|
863
|
+
```
|
|
864
|
+
|
|
865
|
+
**Fix:**
|
|
866
|
+
Establish pattern:
|
|
867
|
+
- **Critical errors** (invalid config, missing files): throw
|
|
868
|
+
- **Expected conditions** (no next story, story not found): return null/undefined
|
|
869
|
+
- **Validation issues** (malformed data): collect and return as errors[]
|
|
870
|
+
- **Non-fatal issues** (context builder warnings): log + continue
|
|
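Three of the four categories above can be shown side by side. This is an illustrative sketch: the `Story` shape is reduced to two fields, and `validateStories`, `findStory`, and `requireValid` are hypothetical names chosen for the example.

```typescript
interface Story {
  id: string;
  acceptanceCriteria: unknown;
}
interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// Validation issues: collect everything and return errors[] instead of stopping at the first.
export function validateStories(stories: Story[]): ValidationResult {
  const errors: string[] = [];
  for (const story of stories) {
    if (!Array.isArray(story.acceptanceCriteria)) {
      errors.push(`Story ${story.id} has invalid acceptanceCriteria`);
    }
  }
  return { valid: errors.length === 0, errors };
}

// Expected condition: a missing story is not exceptional, so return null.
export function findStory(stories: Story[], id: string): Story | null {
  return stories.find((story) => story.id === id) ?? null;
}

// Critical error: the caller cannot continue, so throw.
export function requireValid(result: ValidationResult): void {
  if (!result.valid) {
    throw new Error(`Invalid PRD:\n${result.errors.join("\n")}`);
  }
}

// Usage
const sampleStories: Story[] = [
  { id: "US-001", acceptanceCriteria: ["Form validation"] },
  { id: "US-002", acceptanceCriteria: "not an array" },
];
const validation = validateStories(sampleStories);
```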
871
|
+
|
|
872
|
+
Document pattern in CONTRIBUTING.md.
|
|
873
|
+
|
|
874
|
+
**Priority:** P4 — Consistency, not urgent
|
|
875
|
+
|
|
876
|
+
---
|
|
877
|
+
|
|
878
|
+
#### STYLE-3: Magic Numbers Not Extracted as Constants
|
|
879
|
+
**Severity:** LOW | **Category:** Code Quality
|
|
880
|
+
|
|
881
|
+
**Location:** Various modules
|
|
882
|
+
|
|
883
|
+
Magic numbers scattered throughout:
|
|
884
|
+
- `output: stdout.slice(-5000)` — why 5000? (claude.ts:78)
|
|
885
|
+
- `maxBatchSize = 4` — why 4? (runner.ts:143)
|
|
886
|
+
- `maxTokens: 100000` — why 100k? (runner.ts:201)
|
|
887
|
+
- `reservedForInstructions: 10000` — why 10k? (runner.ts:202)
|
|
888
|
+
|
|
889
|
+
**Fix:**
|
|
890
|
+
Extract as named constants with comments:
|
|
891
|
+
```typescript
|
|
892
|
+
// src/agents/claude.ts
|
|
893
|
+
/**
|
|
894
|
+
* Max output size to store from agent execution.
|
|
895
|
+
* Keeps last 5KB to capture summary and token usage line.
|
|
896
|
+
*/
|
|
897
|
+
export const MAX_AGENT_OUTPUT_CHARS = 5000;
|
|
898
|
+
|
|
899
|
+
// src/execution/runner.ts
|
|
900
|
+
/**
|
|
901
|
+
* Max stories per batch.
|
|
902
|
+
* Limited by:
|
|
903
|
+
* - Agent context window (4 stories ≈ 10K tokens)
|
|
904
|
+
* - Debugging complexity (batch failures harder to diagnose)
|
|
905
|
+
*/
|
|
906
|
+
const MAX_BATCH_SIZE = 4;
|
|
907
|
+
|
|
908
|
+
/**
|
|
909
|
+
* Token budget for context injection.
|
|
910
|
+
* Claude 4 has 200K context window.
|
|
911
|
+
* - 100K for context (stories, deps, errors)
|
|
912
|
+
* - 10K for instructions/prompts
|
|
913
|
+
* - 90K remaining for agent working memory
|
|
914
|
+
*/
|
|
915
|
+
const CONTEXT_MAX_TOKENS = 100_000;
|
|
916
|
+
const CONTEXT_RESERVED_TOKENS = 10_000;
|
|
917
|
+
```
|
|
918
|
+
|
|
919
|
+
**Priority:** P4 — Maintainability
|
|
920
|
+
|
|
921
|
+
---
|
|
922
|
+
|
|
923
|
+
## Priority Fix Order
|
|
924
|
+
|
|
925
|
+
| Priority | ID | Effort | Description |
|
|
926
|
+
|:---|:---|:---|:---|
|
|
927
|
+
| **P0** | SEC-1 | M | Fix command injection in hook execution — escape/validate commands |
|
|
928
|
+
| **P0** | BUG-1 | L | Add agent installation check + retry logic + integration tests |
|
|
929
|
+
| **P0** | SEC-2 | S | Validate user-supplied paths, add bounds checking |
|
|
930
|
+
| **P1** | BUG-2 | M | Use atomic file operations for queue file (read-rename pattern) |
|
|
931
|
+
| **P1** | MEM-1 | M | Add PRD size limits, lazy loading, memory tracking |
|
|
932
|
+
| **P1** | PERF-1 | M | Optimize batch selection (pre-compute eligible stories) |
|
|
933
|
+
| **P1** | BUG-3 | M | Improve cost estimation (structured output + confidence scores) |
|
|
934
|
+
| **P2** | ENH-1 | L | Add JSDoc to all exported functions (public API) |
|
|
935
|
+
| **P2** | TYPE-1 | M | Use Zod for config validation instead of type assertions |
|
|
936
|
+
| **P2** | BUG-4 | M | Add config for batch escalation strategy |
|
|
937
|
+
| **P2** | ENH-2 | M | Add agent capability negotiation (supported tiers, features) |
|
|
938
|
+
| **P2** | PERF-2 | S | Reload PRD only when modified (add dirty flag) |
|
|
939
|
+
| **P2** | BUG-5 | S | Log timeout reason in hook execution |
|
|
940
|
+
| **P2** | ENH-3 | L | Add file content loading to context builder |
|
|
941
|
+
| **P3** | STYLE-1 | L | Split runner.ts into focused modules (batching, escalation, etc.) |
|
|
942
|
+
| **P3** | ENH-4 | S | Add progress bar and cost/ETA display |
|
|
943
|
+
| **P3** | TYPE-2 | S | Convert QueueCommand to discriminated union |
|
|
944
|
+
| **P3** | BUG-6 | S | Validate story dependencies in analyze command |
|
|
945
|
+
| **P3** | ENH-5 | S | Add dry-run support to three-session TDD |
|
|
946
|
+
| **P3** | PERF-3 | S | Improve token estimation accuracy |
|
|
947
|
+
| **P4** | STYLE-2 | M | Standardize error handling patterns |
|
|
948
|
+
| **P4** | STYLE-3 | S | Extract magic numbers as named constants |
|
|
949
|
+
|
|
950
|
+
**Legend:**
|
|
951
|
+
**Effort:** S (small, <4 hours) | M (medium, 1-2 days) | L (large, 3-5 days)
|
|
952
|
+
|
|
953
|
+
---
|
|
954
|
+
|
|
955
|
+
## Module Grades
|
|
956
|
+
|
|
957
|
+
| Module | Grade | Score | Notes |
|
|
958
|
+
|:---|:---|:---|:---|
|
|
959
|
+
| **agents/** | B | 80 | Clean adapter interface, but cost tracking brittle, no agent validation |
|
|
960
|
+
| **cli/** | A- | 88 | Well-structured commands, good UX, missing dependency validation |
|
|
961
|
+
| **config/** | B+ | 82 | Layered config good, but unsafe type assertions, needs Zod |
|
|
962
|
+
| **context/** | A | 90 | Defensive programming, token budgeting, good error handling |
|
|
963
|
+
| **execution/** | B | 78 | Complex but functional, needs refactoring, performance issues |
|
|
964
|
+
| **hooks/** | C+ | 70 | Simple and works, but CRITICAL command injection vulnerability |
|
|
965
|
+
| **prd/** | A | 92 | Clean types, good utility functions, well-tested |
|
|
966
|
+
| **queue/** | A- | 85 | Good design, race condition in file handling |
|
|
967
|
+
| **routing/** | A | 92 | Clear decision logic, well-tested, good keyword matching |
|
|
968
|
+
| **tdd/** | A- | 88 | Excellent isolation enforcement, prompts are clear, needs dry-run |
|
|
969
|
+
|
|
970
|
+
---

## Test Coverage Analysis

**Current State:**
- 156 tests passing
- Test files: 12 (~3492 LOC)
- Coverage: Estimated 75-80% (no coverage report generated)

**Well-Tested:**
- ✅ Routing logic (routing.test.ts): complexity classification, test strategy decisions
- ✅ Configuration validation (config.test.ts): schema, merging, escalation
- ✅ TDD isolation (isolation.test.ts): file pattern matching, violation detection
- ✅ Cost estimation (cost.test.ts): token parsing, rate calculations
- ✅ Context builder (context.test.ts, context-integration.test.ts): token budgeting, priority sorting
- ✅ Queue manager (queue.test.ts): enqueue/dequeue, status transitions, command parsing

**Coverage Gaps (NOT tested):**
1. **Agent execution end-to-end**: No tests spawn real or mock agents
2. **Hook execution**: No tests for shell command execution, timeout, env vars
3. **File operations**: No tests for PRD load/save, config file handling
4. **Error recovery paths**: No tests for rate limit handling, agent crashes, timeout recovery
5. **Batch execution**: No tests for multi-story batching, failure rollback
6. **Escalation logic**: No tests for tier escalation, max attempts, cost tracking
7. **Progress logging**: No tests for appendProgress()
8. **CLI commands**: No tests for init, run, analyze, features, agents, status

**Recommendations:**
1. Add integration tests with mock agent binary (Bun.spawn stub)
2. Add hook execution tests with safe test commands
3. Add file operation tests with temp directories (e.g. `fs.mkdtemp` under `os.tmpdir()`)
4. Add error injection tests (simulate rate limits, timeouts, crashes)
5. Add batch execution tests (verify batch grouping, failure handling)
6. Target 85%+ coverage before v1.0
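
As a rough illustration of recommendation 3, the sketch below round-trips a PRD through a throwaway temp directory. The `Prd` shape and the `savePrd`/`loadPrd` helpers are hypothetical stand-ins, not the project's actual functions; only the Node-compatible `fs`, `os`, and `path` calls are real APIs.

```typescript
import { mkdtempSync, readFileSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical minimal PRD shape, for illustration only.
interface Story { id: string; title: string; passes: boolean }
interface Prd { feature: string; userStories: Story[] }

// Hypothetical stand-ins for the project's PRD save/load functions.
function savePrd(path: string, prd: Prd): void {
  writeFileSync(path, JSON.stringify(prd, null, 2), "utf8");
}

function loadPrd(path: string): Prd {
  return JSON.parse(readFileSync(path, "utf8")) as Prd;
}

// Run `fn` against a fresh temp directory and always remove it afterwards,
// so a failing test never leaves files behind.
function withTempDir<T>(fn: (dir: string) => T): T {
  const dir = mkdtempSync(join(tmpdir(), "prd-test-"));
  try {
    return fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

// Save a PRD, load it back, and hand the result to the assertions.
const roundTripped = withTempDir((dir) => {
  const file = join(dir, "prd.json");
  savePrd(file, {
    feature: "demo",
    userStories: [{ id: "US-001", title: "First story", passes: false }],
  });
  return loadPrd(file);
});
```

Inside a real test file the body of the last statement would sit in a `test(...)` callback, with the assertions expressed through the test runner's `expect`.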

---

## Security Checklist

| Item | Status | Notes |
|:---|:---|:---|
| Input validation | ⚠️ Partial | Paths not validated, hook commands not sanitized |
| Command injection | ❌ Fail | CRITICAL: hooks execute via `bash -c` unsafely |
| Path traversal | ⚠️ Partial | No bounds checking on user-supplied paths |
| Secrets exposure | ✅ Pass | No hardcoded secrets, relies on env vars |
| File permissions | ⚠️ Partial | Created files/dirs use default umask (no explicit 0600) |
| Rate limiting | ✅ Pass | Detects rate limits (heuristic), pauses execution |
| DoS protection | ❌ Fail | No memory limits, unbounded PRD size, no timeout limits |
| Dependency security | ✅ Pass | Only 2 runtime deps (chalk, commander), both safe |
| Logging | ✅ Pass | No sensitive data logged (no API keys, tokens) |

**Critical Actions:**
1. Fix SEC-1 (command injection): P0
2. Add input validation for all user-supplied paths: P0
3. Add memory limits and PRD size validation: P1
4. Set restrictive file permissions (0600 for config, PRD, hooks): P2
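
Critical action 2 could look something like the following sketch. `resolveWithin` and `isWithin` are hypothetical names, and the check is purely lexical: it does not resolve symlinks, so a stricter version might also compare `fs.realpathSync` results.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Resolve `userPath` against `baseDir` and reject anything that lands outside it.
// Note: lexical check only; symlinks inside baseDir are not followed here.
function resolveWithin(baseDir: string, userPath: string): string {
  const base = resolve(baseDir);
  const target = resolve(base, userPath);
  const rel = relative(base, target);
  // `rel` starts with ".." when target is above base; on Windows it can be
  // absolute when the two paths are on different drives.
  const escapes = rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes ${base}: ${userPath}`);
  }
  return target;
}

// Convenience wrapper that reports the result as a boolean.
function isWithin(baseDir: string, userPath: string): boolean {
  try {
    resolveWithin(baseDir, userPath);
    return true;
  } catch {
    return false;
  }
}
```

Comparing against `".." + sep` rather than a bare `".."` prefix avoids rejecting legitimate names such as `..hidden/file`.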

---

## Recommendations for v1.0

### Must Fix (Blockers)
1. **SEC-1**: Command injection in hooks (security vulnerability)
2. **BUG-1**: Agent execution needs validation, retry logic, integration tests
3. **SEC-2**: Path traversal risks (add bounds checking)
4. **MEM-1**: Memory limits for large PRDs (prevent OOM crashes)
5. **BUG-3**: Cost estimation accuracy (use structured output, not regex)
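
For MEM-1, one possible guard is to check the file size before reading and the story count after parsing. The limits, the `userStories` field name, and both function names below are assumptions made for this sketch; the review does not prescribe concrete values.

```typescript
import { readFileSync, statSync } from "node:fs";

// Assumed limits, for illustration only.
const MAX_PRD_BYTES = 5 * 1024 * 1024;
const MAX_STORIES = 1000;

interface PrdLimits { maxBytes: number; maxStories: number }

// Pure check so the limits can be unit-tested without touching the disk.
// Returns an error message, or null when both values are within bounds.
function checkPrdLimits(
  sizeBytes: number,
  storyCount: number,
  limits: PrdLimits = { maxBytes: MAX_PRD_BYTES, maxStories: MAX_STORIES },
): string | null {
  if (sizeBytes > limits.maxBytes) {
    return `PRD file is ${sizeBytes} bytes; limit is ${limits.maxBytes}`;
  }
  if (storyCount > limits.maxStories) {
    return `PRD has ${storyCount} stories; limit is ${limits.maxStories}`;
  }
  return null;
}

// Hypothetical loader: rejects oversized files before reading them into memory,
// then rejects PRDs with too many stories once the JSON is parsed.
function loadPrdWithLimits(path: string): { userStories: unknown[] } {
  const size = statSync(path).size;
  const sizeError = checkPrdLimits(size, 0);
  if (sizeError !== null) throw new Error(sizeError);

  const prd = JSON.parse(readFileSync(path, "utf8")) as { userStories?: unknown[] };
  const stories = Array.isArray(prd.userStories) ? prd.userStories : [];
  const countError = checkPrdLimits(size, stories.length);
  if (countError !== null) throw new Error(countError);

  return { ...prd, userStories: stories };
}
```

Checking the size with `statSync` first means an oversized file is never loaded, which is the part that actually protects against out-of-memory failures.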

### Should Fix (Quality)
6. **TYPE-1**: Config validation with Zod (runtime type safety)
7. **ENH-1**: JSDoc documentation (public API docs)
8. **BUG-2**: Queue file race condition (atomic operations)
9. **PERF-1**: Batch selection O(n²) (optimize with caching)
10. **STYLE-1**: Split runner.ts (improve maintainability)
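
For BUG-2, a common way to make file updates atomic is to write a temp file in the same directory and rename it over the target. The sketch below assumes that approach; `writeFileAtomic`, the file name, and the queue commands in the usage lines are all hypothetical. It keeps readers from seeing a half-written file, but it does not by itself prevent two writers from overwriting each other's updates.

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, dirname, join } from "node:path";

// Write to a temp file next to the target, then rename it into place.
// A rename within one filesystem replaces the target in a single step on POSIX,
// so a concurrent reader sees either the old or the new contents, never a mix.
function writeFileAtomic(path: string, contents: string): void {
  const tmp = join(dirname(path), `.${basename(path)}.${process.pid}.${Date.now()}.tmp`);
  try {
    // mode 0o600 also covers the "restrictive file permissions" recommendation.
    writeFileSync(tmp, contents, { encoding: "utf8", mode: 0o600 });
    renameSync(tmp, path);
  } catch (err) {
    rmSync(tmp, { force: true }); // best-effort cleanup of the temp file
    throw err;
  }
}

// Usage: two successive writes; the file always holds one complete version.
const dir = mkdtempSync(join(tmpdir(), "queue-demo-"));
const queueFile = join(dir, "queue.txt");
writeFileAtomic(queueFile, "PAUSE\n");
writeFileAtomic(queueFile, "SKIP US-003\n");
const finalContents = readFileSync(queueFile, "utf8");
rmSync(dir, { recursive: true, force: true });
```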

### Nice to Have (Polish)
11. **ENH-4**: Progress bar and ETA display (better UX)
12. **ENH-2**: Agent capability negotiation (better routing)
13. **ENH-3**: File content in context (richer agent prompts)
14. **PERF-2**: Reduce PRD reloads (performance optimization)

### Future Enhancements
- Parallel agent execution (multiple agents, multiple stories)
- Better cost tracking (per-story breakdown, budget alerts)
- Streaming agent output (real-time progress)
- Web UI for monitoring runs
- PRD auto-generation from spec.md (LLM-powered)
- Auto-retry with different prompts (not just model escalation)

---

## Conclusion

ngent demonstrates strong architectural fundamentals with clear separation of concerns, comprehensive type safety, and thoughtful TDD enforcement. The codebase is well-organized with consistent naming and good test coverage for core algorithms (routing, context building, isolation checking).

However, **v0.1.0 is NOT production-ready** due to:
1. Critical command injection vulnerability in hooks
2. Incomplete agent execution implementation (no validation, weak error handling)
3. Path traversal security risks
4. Memory management issues for large-scale usage
5. Brittle cost estimation

**Recommended path to v1.0:**
1. Fix all P0 security and agent-execution issues (SEC-1, SEC-2, BUG-1): **1 week**
2. Address P1 reliability/performance issues (MEM-1, BUG-2, BUG-3, PERF-1): **1-2 weeks**
3. Add integration tests for agent execution, hooks, file operations: **1 week**
4. Improve documentation (JSDoc, usage examples): **3 days**
5. Performance profiling with large PRDs (500+ stories): **2 days**

**Total estimated effort to production-ready v1.0: 4-6 weeks**

With these fixes, ngent will be a robust, secure, and scalable AI coding orchestrator suitable for real-world use.

---

**Reviewer:** Subrina (AI Code Reviewer)
**Review Date:** 2026-02-17
**Review Depth:** Deep (all 31 source files + 12 test files analyzed)
**Grade:** B+ (82/100). Good foundation, needs security and reliability fixes for v1.0

---

## Post-Review Fixes

### ✅ SEC-1: Command Injection in Hooks (FIXED - 2026-02-17)

**Status:** RESOLVED

**Changes Made:**
1. ✅ Replaced `bash -c` execution with direct argv array execution (no shell interpolation)
2. ✅ Added shell operator detection (`|`, `&&`, `;`, `$`, backticks) with security warnings
3. ✅ Implemented command validation to reject injection patterns:
   - Command substitution `$(...)` and backticks
   - Piping to bash/sh
   - Dangerous deletion patterns (`rm -rf`)
4. ✅ Added environment variable escaping (removes null bytes, newlines)
5. ✅ Added comprehensive JSDoc security warnings
6. ✅ Improved timeout handling with clear timeout messages
7. ✅ Created 19 comprehensive security tests covering:
   - Safe command execution
   - Injection pattern rejection
   - Environment variable isolation
   - Timeout handling
   - Disabled hooks
   - Context passing via stdin
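
The validation and escaping steps listed above (changes 2 to 4) might be sketched roughly as follows. The regular expressions, function names, and return shape are assumptions for illustration; they are not taken from `src/hooks/runner.ts`.

```typescript
// Patterns mirroring the rejected categories listed above.
// The exact regexes are assumptions, not the project's real rules.
const REJECT_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "command substitution", pattern: /\$\(.*\)|`.*`/ },
  { name: "pipe to shell", pattern: /\|\s*(bash|sh)\b/ },
  { name: "dangerous deletion", pattern: /\brm\s+-rf\b/ },
];

// Operators that only trigger a warning: | && ; $ and backticks.
const SHELL_OPERATORS = /[|;$`]|&&/;

interface ValidationResult { ok: boolean; reason?: string; warnings: string[] }

// Reject known injection patterns outright; warn on other shell operators.
function validateHookCommand(command: string): ValidationResult {
  for (const { name, pattern } of REJECT_PATTERNS) {
    if (pattern.test(command)) {
      return { ok: false, reason: `rejected: ${name}`, warnings: [] };
    }
  }
  const warnings = SHELL_OPERATORS.test(command)
    ? ["command contains shell operators, which are not interpreted without a shell"]
    : [];
  return { ok: true, warnings };
}

// Strip null bytes and newlines from a value before placing it in the hook's
// environment. Carriage returns are removed too, as an extra assumption.
function sanitizeEnvValue(value: string): string {
  return value.replace(/[\u0000\r\n]/g, "");
}
```

Pattern-based rejection is a secondary defence here: the main protection comes from not invoking a shell at all, since a deny-list can always miss a variant.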

**Test Results:**
- All 175 tests passing (including 19 new hook security tests)
- TypeScript type checking: ✅ No errors
- Command injection vulnerabilities eliminated

**Files Modified:**
- `src/hooks/runner.ts`: Complete security overhaul
- `test/hooks.test.ts`: New comprehensive test suite

**Security Impact:**
- ❌ → ✅ Command injection vulnerability eliminated
- ❌ → ✅ Shell operator detection and warnings
- ❌ → ✅ Environment variable escaping
- ❌ → ✅ Timeout handling with clear error messages

**Remaining Work:**
The hook system is now secure for v1.0 release. However, users should still be cautioned:
- Only configure hooks from trusted sources
- Hook commands are parsed into argv arrays (no complex shell syntax support)
- Shell operators trigger security warnings but are still parsed into argv; because no shell interprets them, they may not work as expected
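
To make the last two cautions concrete, here is a simplified sketch of argv parsing and shell-free execution. `parseArgv` and `runHook` are hypothetical names, and the tokenizer handles only whitespace and plain single or double quotes (no escapes, globbing, or variable expansion); `spawnSync` and its `shell`, `input`, `timeout`, and `encoding` options are standard Node-compatible APIs.

```typescript
import { spawnSync } from "node:child_process";

// Split a command string into argv. Quoted segments become one argument;
// everything else is split on whitespace. Deliberately minimal.
function parseArgv(command: string): string[] {
  const args: string[] = [];
  const tokenPattern = /"([^"]*)"|'([^']*)'|(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = tokenPattern.exec(command)) !== null) {
    args.push(match[1] ?? match[2] ?? match[3] ?? "");
  }
  return args;
}

// Run a hook without a shell, passing the context as JSON on stdin.
function runHook(command: string, context: unknown, timeoutMs = 5000) {
  const [file, ...args] = parseArgv(command);
  if (file === undefined) throw new Error("Empty hook command");
  const result = spawnSync(file, args, {
    shell: false, // argv goes straight to the program; nothing is interpolated
    input: JSON.stringify(context),
    timeout: timeoutMs,
    encoding: "utf8",
  });
  const timedOut = (result.error as NodeJS.ErrnoException | undefined)?.code === "ETIMEDOUT";
  return { status: result.status, stdout: result.stdout, timedOut };
}

// "&&" is just another argument here, so the second command never runs as such.
const chained = parseArgv("npm test && npm run lint");
const quoted = parseArgv('echo "hello world"');
```

With this kind of parsing, `npm test && npm run lint` hands `&&`, `npm`, `run`, and `lint` to the first `npm` process as literal arguments, which is why chained hooks behave differently from what a shell user would expect.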

**Priority Update:** SEC-1 P0 → RESOLVED ✅
|