@nathapp/nax 0.18.1
- package/.gitlab-ci.yml +96 -0
- package/BRIEF.md +140 -0
- package/CHANGELOG.md +60 -0
- package/CLAUDE.md +159 -0
- package/README.md +373 -0
- package/US-007-IMPLEMENTATION.md +139 -0
- package/bin/nax.ts +930 -0
- package/biome.json +14 -0
- package/bun.lock +168 -0
- package/bunfig.toml +11 -0
- package/docs/20260216-fix-plan-context-review.md +56 -0
- package/docs/20260216-relentless-vs-ngent-comparison.md +208 -0
- package/docs/20260216-v02-plan.md +136 -0
- package/docs/20260216-v02-review.md +685 -0
- package/docs/20260217-dogfood-findings.md +56 -0
- package/docs/20260217-p2-plus-plan.md +117 -0
- package/docs/20260217-partial-fixes-plan.md +62 -0
- package/docs/20260217-plan-analyze-spec.md +117 -0
- package/docs/20260217-post-impl-review.md +1137 -0
- package/docs/20260217-quick-wins-plan.md +66 -0
- package/docs/20260217-split-runner-plan.md +75 -0
- package/docs/20260217-v03-impl-plan.md +80 -0
- package/docs/20260217-v03-post-impl-review.md +589 -0
- package/docs/20260217-v04-impl-plan.md +86 -0
- package/docs/20260217-v05-post-impl-review.md +850 -0
- package/docs/20260217-v06-post-impl-review.md +817 -0
- package/docs/20260218-adr003-port-plan.md +151 -0
- package/docs/20260218-review-adr003-verification.md +175 -0
- package/docs/20260219-fix-plan-bug16-19.md +79 -0
- package/docs/20260219-fix-plan-bug20-22.md +114 -0
- package/docs/20260219-plan-llm-routing.md +116 -0
- package/docs/20260219-review-bug20-22-fixes.md +135 -0
- package/docs/20260219-routing-baseline-keyword.md +63 -0
- package/docs/20260220-plan-structured-logging-p1.md +80 -0
- package/docs/20260220-plan-structured-logging-p2.md +37 -0
- package/docs/20260220-review-llm-routing.md +180 -0
- package/docs/20260220-review-post-fix-llm-routing.md +70 -0
- package/docs/20260221-fix-plan-relevantfiles-split.md +101 -0
- package/docs/20260221-fix-plan-routing-mode.md +125 -0
- package/docs/20260221-review-v0.9-implementation.md +379 -0
- package/docs/20260222-fix-plan-v091-routing-isolation.md +197 -0
- package/docs/20260223-fix-plan-prompt-audit.md +62 -0
- package/docs/20260224-nax-roadmap-phases.md +189 -0
- package/docs/20260225-phase2-llm-service-layer.md +401 -0
- package/docs/20260225-review-v0.10.1.md +187 -0
- package/docs/20260303-v010-implementation-plan.md +165 -0
- package/docs/CLAUDE.md.bak +191 -0
- package/docs/ROADMAP.md +165 -0
- package/docs/SPEC-rectification.md +0 -0
- package/docs/SPEC.md +324 -0
- package/docs/US-001-plugin-loading-verification.md +152 -0
- package/docs/architecture-analysis.md +1076 -0
- package/docs/bugs/BUG-21-escalation-null-attempts.md +48 -0
- package/docs/bugs-from-dogfood-run-c.md +243 -0
- package/docs/code-review-20260228.md +612 -0
- package/docs/code-review-v0.15.0.md +629 -0
- package/docs/hook-lifecycle-test-plan.md +149 -0
- package/docs/releases/v0.11.0-and-earlier.md +20 -0
- package/docs/releases/v0.12.0.md +15 -0
- package/docs/releases/v0.13.0.md +14 -0
- package/docs/releases/v0.14.0.md +20 -0
- package/docs/releases/v0.14.1.md +36 -0
- package/docs/releases/v0.14.2.md +51 -0
- package/docs/releases/v0.14.3.md +174 -0
- package/docs/releases/v0.14.4.md +94 -0
- package/docs/releases/v0.15.0.md +502 -0
- package/docs/releases/v0.15.1.md +170 -0
- package/docs/releases/v0.15.3.md +193 -0
- package/docs/specs/status-file-v0.10.1.md +812 -0
- package/docs/v0.10-global-config.md +206 -0
- package/docs/v0.10-plugin-system.md +415 -0
- package/docs/v0.10-prompt-optimizer.md +234 -0
- package/docs/v0.3-spec.md +244 -0
- package/docs/v0.4-spec.md +140 -0
- package/docs/v0.5-spec.md +237 -0
- package/docs/v0.6-spec.md +371 -0
- package/docs/v0.7-spec.md +177 -0
- package/docs/v0.8-llm-routing.md +206 -0
- package/docs/v0.8-structured-logging.md +132 -0
- package/docs/v0.9.3-prompt-audit.md +112 -0
- package/examples/plugins/console-reporter/index.test.ts +207 -0
- package/examples/plugins/console-reporter/index.ts +110 -0
- package/nax/config.json +147 -0
- package/nax/features/bugfix-v0171/prd.json +52 -0
- package/nax/features/config-management/prd.json +108 -0
- package/nax/features/config-management/progress.txt +5 -0
- package/nax/features/diagnose/acceptance.test.ts +412 -0
- package/nax/features/diagnose/prd.json +41 -0
- package/nax/features/orchestration-fixes/prd.json +89 -0
- package/nax/features/orchestration-fixes/progress.txt +1 -0
- package/nax/features/plugin-integration/US-007-VERIFICATION.md +259 -0
- package/nax/features/plugin-integration/prd.json +208 -0
- package/nax/features/plugin-integration/progress.txt +5 -0
- package/nax/features/precheck/prd.json +205 -0
- package/nax/features/precheck/progress.txt +15 -0
- package/nax/features/structured-logging/prd.json +199 -0
- package/nax/features/unlock/prd.json +36 -0
- package/package.json +47 -0
- package/src/acceptance/fix-generator.ts +348 -0
- package/src/acceptance/generator.ts +282 -0
- package/src/acceptance/index.ts +30 -0
- package/src/acceptance/types.ts +79 -0
- package/src/agents/claude-decompose.ts +169 -0
- package/src/agents/claude-plan.ts +139 -0
- package/src/agents/claude.ts +324 -0
- package/src/agents/cost.ts +268 -0
- package/src/agents/index.ts +13 -0
- package/src/agents/registry.ts +48 -0
- package/src/agents/types-extended.ts +133 -0
- package/src/agents/types.ts +113 -0
- package/src/agents/validation.ts +69 -0
- package/src/analyze/classifier.ts +305 -0
- package/src/analyze/index.ts +16 -0
- package/src/analyze/scanner.ts +175 -0
- package/src/analyze/types.ts +51 -0
- package/src/cli/accept.ts +108 -0
- package/src/cli/analyze-parser.ts +284 -0
- package/src/cli/analyze.ts +207 -0
- package/src/cli/config.ts +561 -0
- package/src/cli/constitution.ts +109 -0
- package/src/cli/diagnose-analysis.ts +159 -0
- package/src/cli/diagnose-formatter.ts +87 -0
- package/src/cli/diagnose.ts +203 -0
- package/src/cli/generate.ts +127 -0
- package/src/cli/index.ts +37 -0
- package/src/cli/init.ts +188 -0
- package/src/cli/interact.ts +295 -0
- package/src/cli/plan.ts +198 -0
- package/src/cli/plugins.ts +111 -0
- package/src/cli/prompts.ts +295 -0
- package/src/cli/runs.ts +174 -0
- package/src/cli/status-cost.ts +151 -0
- package/src/cli/status-features.ts +338 -0
- package/src/cli/status.ts +13 -0
- package/src/commands/common.ts +171 -0
- package/src/commands/diagnose.ts +17 -0
- package/src/commands/index.ts +8 -0
- package/src/commands/logs.ts +384 -0
- package/src/commands/precheck.ts +86 -0
- package/src/commands/unlock.ts +96 -0
- package/src/config/defaults.ts +160 -0
- package/src/config/index.ts +22 -0
- package/src/config/loader.ts +121 -0
- package/src/config/merger.ts +147 -0
- package/src/config/path-security.ts +121 -0
- package/src/config/paths.ts +27 -0
- package/src/config/schema.ts +56 -0
- package/src/config/schemas.ts +286 -0
- package/src/config/types.ts +423 -0
- package/src/config/validate.ts +103 -0
- package/src/constitution/generator.ts +191 -0
- package/src/constitution/generators/aider.ts +41 -0
- package/src/constitution/generators/claude.ts +35 -0
- package/src/constitution/generators/cursor.ts +36 -0
- package/src/constitution/generators/opencode.ts +38 -0
- package/src/constitution/generators/types.ts +33 -0
- package/src/constitution/generators/windsurf.ts +36 -0
- package/src/constitution/index.ts +10 -0
- package/src/constitution/loader.ts +133 -0
- package/src/constitution/types.ts +31 -0
- package/src/context/auto-detect.ts +227 -0
- package/src/context/builder.ts +246 -0
- package/src/context/elements.ts +83 -0
- package/src/context/formatter.ts +107 -0
- package/src/context/generator.ts +129 -0
- package/src/context/generators/aider.ts +34 -0
- package/src/context/generators/claude.ts +28 -0
- package/src/context/generators/cursor.ts +28 -0
- package/src/context/generators/opencode.ts +30 -0
- package/src/context/generators/windsurf.ts +28 -0
- package/src/context/greenfield.ts +114 -0
- package/src/context/index.ts +33 -0
- package/src/context/injector.ts +279 -0
- package/src/context/test-scanner.ts +370 -0
- package/src/context/types.ts +98 -0
- package/src/errors.ts +67 -0
- package/src/execution/batching.ts +157 -0
- package/src/execution/crash-recovery.ts +373 -0
- package/src/execution/escalation/escalation.ts +44 -0
- package/src/execution/escalation/index.ts +13 -0
- package/src/execution/escalation/tier-escalation.ts +295 -0
- package/src/execution/escalation/tier-outcome.ts +158 -0
- package/src/execution/helpers.ts +38 -0
- package/src/execution/index.ts +45 -0
- package/src/execution/lifecycle/acceptance-loop.ts +272 -0
- package/src/execution/lifecycle/headless-formatter.ts +85 -0
- package/src/execution/lifecycle/index.ts +12 -0
- package/src/execution/lifecycle/parallel-lifecycle.ts +101 -0
- package/src/execution/lifecycle/precheck-runner.ts +140 -0
- package/src/execution/lifecycle/run-cleanup.ts +81 -0
- package/src/execution/lifecycle/run-completion.ts +129 -0
- package/src/execution/lifecycle/run-initialization.ts +141 -0
- package/src/execution/lifecycle/run-lifecycle.ts +312 -0
- package/src/execution/lifecycle/run-setup.ts +204 -0
- package/src/execution/lifecycle/story-hooks.ts +38 -0
- package/src/execution/lifecycle/story-size-prompts.ts +123 -0
- package/src/execution/lock.ts +115 -0
- package/src/execution/parallel-executor.ts +216 -0
- package/src/execution/parallel.ts +400 -0
- package/src/execution/pid-registry.ts +280 -0
- package/src/execution/pipeline-result-handler.ts +388 -0
- package/src/execution/post-verify-rectification.ts +188 -0
- package/src/execution/post-verify.ts +274 -0
- package/src/execution/progress.ts +25 -0
- package/src/execution/prompts.ts +127 -0
- package/src/execution/queue-handler.ts +109 -0
- package/src/execution/rectification.ts +13 -0
- package/src/execution/runner.ts +377 -0
- package/src/execution/sequential-executor.ts +388 -0
- package/src/execution/status-file.ts +264 -0
- package/src/execution/status-writer.ts +139 -0
- package/src/execution/story-context.ts +229 -0
- package/src/execution/test-output-parser.ts +14 -0
- package/src/execution/verification.ts +72 -0
- package/src/hooks/index.ts +2 -0
- package/src/hooks/runner.ts +286 -0
- package/src/hooks/types.ts +67 -0
- package/src/interaction/chain.ts +154 -0
- package/src/interaction/index.ts +60 -0
- package/src/interaction/init.ts +83 -0
- package/src/interaction/plugins/auto.ts +217 -0
- package/src/interaction/plugins/cli.ts +300 -0
- package/src/interaction/plugins/telegram.ts +384 -0
- package/src/interaction/plugins/webhook.ts +258 -0
- package/src/interaction/state.ts +171 -0
- package/src/interaction/triggers.ts +229 -0
- package/src/interaction/types.ts +163 -0
- package/src/logger/formatters.ts +84 -0
- package/src/logger/index.ts +16 -0
- package/src/logger/logger.ts +298 -0
- package/src/logger/types.ts +48 -0
- package/src/logging/formatter.ts +355 -0
- package/src/logging/index.ts +22 -0
- package/src/logging/types.ts +93 -0
- package/src/metrics/aggregator.ts +190 -0
- package/src/metrics/index.ts +14 -0
- package/src/metrics/tracker.ts +200 -0
- package/src/metrics/types.ts +109 -0
- package/src/optimizer/index.ts +62 -0
- package/src/optimizer/noop.optimizer.ts +24 -0
- package/src/optimizer/rule-based.optimizer.ts +248 -0
- package/src/optimizer/types.ts +53 -0
- package/src/pipeline/events.ts +130 -0
- package/src/pipeline/index.ts +19 -0
- package/src/pipeline/runner.ts +161 -0
- package/src/pipeline/stages/acceptance.ts +197 -0
- package/src/pipeline/stages/completion.ts +99 -0
- package/src/pipeline/stages/constitution.ts +63 -0
- package/src/pipeline/stages/context.ts +117 -0
- package/src/pipeline/stages/execution.ts +194 -0
- package/src/pipeline/stages/index.ts +62 -0
- package/src/pipeline/stages/optimizer.ts +74 -0
- package/src/pipeline/stages/prompt.ts +57 -0
- package/src/pipeline/stages/queue-check.ts +103 -0
- package/src/pipeline/stages/review.ts +181 -0
- package/src/pipeline/stages/routing.ts +81 -0
- package/src/pipeline/stages/verify.ts +100 -0
- package/src/pipeline/types.ts +167 -0
- package/src/plugins/index.ts +31 -0
- package/src/plugins/loader.ts +287 -0
- package/src/plugins/registry.ts +168 -0
- package/src/plugins/types.ts +327 -0
- package/src/plugins/validator.ts +352 -0
- package/src/prd/index.ts +172 -0
- package/src/prd/types.ts +202 -0
- package/src/precheck/checks-blockers.ts +391 -0
- package/src/precheck/checks-warnings.ts +142 -0
- package/src/precheck/checks.ts +30 -0
- package/src/precheck/index.ts +247 -0
- package/src/precheck/story-size-gate.ts +144 -0
- package/src/precheck/types.ts +31 -0
- package/src/queue/index.ts +2 -0
- package/src/queue/manager.ts +254 -0
- package/src/queue/types.ts +54 -0
- package/src/review/index.ts +8 -0
- package/src/review/runner.ts +172 -0
- package/src/review/types.ts +66 -0
- package/src/routing/builder.ts +81 -0
- package/src/routing/chain.ts +74 -0
- package/src/routing/index.ts +16 -0
- package/src/routing/loader.ts +58 -0
- package/src/routing/router.ts +303 -0
- package/src/routing/strategies/adaptive.ts +215 -0
- package/src/routing/strategies/index.ts +8 -0
- package/src/routing/strategies/keyword.ts +163 -0
- package/src/routing/strategies/llm-prompts.ts +209 -0
- package/src/routing/strategies/llm.ts +235 -0
- package/src/routing/strategies/manual.ts +50 -0
- package/src/routing/strategy.ts +99 -0
- package/src/tdd/cleanup.ts +111 -0
- package/src/tdd/index.ts +23 -0
- package/src/tdd/isolation.ts +123 -0
- package/src/tdd/orchestrator.ts +383 -0
- package/src/tdd/prompts.ts +270 -0
- package/src/tdd/rectification-gate.ts +183 -0
- package/src/tdd/session-runner.ts +179 -0
- package/src/tdd/types.ts +81 -0
- package/src/tdd/verdict.ts +271 -0
- package/src/tui/App.tsx +265 -0
- package/src/tui/components/AgentPanel.tsx +75 -0
- package/src/tui/components/CostOverlay.tsx +118 -0
- package/src/tui/components/HelpOverlay.tsx +107 -0
- package/src/tui/components/StatusBar.tsx +63 -0
- package/src/tui/components/StoriesPanel.tsx +177 -0
- package/src/tui/hooks/useKeyboard.ts +142 -0
- package/src/tui/hooks/useLayout.ts +137 -0
- package/src/tui/hooks/usePipelineEvents.ts +183 -0
- package/src/tui/hooks/usePty.ts +194 -0
- package/src/tui/index.tsx +38 -0
- package/src/tui/types.ts +76 -0
- package/src/utils/git.ts +83 -0
- package/src/utils/queue-writer.ts +54 -0
- package/src/verification/executor.ts +235 -0
- package/src/verification/gate.ts +207 -0
- package/src/verification/index.ts +12 -0
- package/src/verification/parser.ts +230 -0
- package/src/verification/rectification.ts +108 -0
- package/src/verification/types.ts +113 -0
- package/src/worktree/dispatcher.ts +65 -0
- package/src/worktree/index.ts +2 -0
- package/src/worktree/manager.ts +187 -0
- package/src/worktree/merge.ts +301 -0
- package/src/worktree/types.ts +4 -0
- package/test/TEST_COVERAGE_US001.md +217 -0
- package/test/TEST_COVERAGE_US003.md +84 -0
- package/test/TEST_COVERAGE_US005.md +86 -0
- package/test/US-002-orchestrator.test.ts +246 -0
- package/test/acceptance/cm-003-default-view.test.ts +194 -0
- package/test/execution/pid-registry.test.ts +240 -0
- package/test/execution/post-verify.test.ts +224 -0
- package/test/helpers/timeout.ts +42 -0
- package/test/integration/US-002-TEST-SUMMARY.md +107 -0
- package/test/integration/US-003-TEST-SUMMARY.md +149 -0
- package/test/integration/US-004-TEST-SUMMARY.md +106 -0
- package/test/integration/US-005-TEST-SUMMARY.md +138 -0
- package/test/integration/US-007-TEST-SUMMARY.md +100 -0
- package/test/integration/agent-validation.test.ts +439 -0
- package/test/integration/analyze-integration.test.ts +261 -0
- package/test/integration/analyze-scanner.test.ts +131 -0
- package/test/integration/cli-config-default-edge-cases.test.ts +222 -0
- package/test/integration/cli-config-default-view.test.ts +229 -0
- package/test/integration/cli-config-diff.test.ts +460 -0
- package/test/integration/cli-config.test.ts +736 -0
- package/test/integration/cli-diagnose.test.ts +592 -0
- package/test/integration/cli-logs.test.ts +314 -0
- package/test/integration/cli-plugins.test.ts +678 -0
- package/test/integration/cli-precheck.test.ts +371 -0
- package/test/integration/cli-run-headless.test.ts +173 -0
- package/test/integration/cli.test.ts +75 -0
- package/test/integration/config/merger.test.ts +465 -0
- package/test/integration/config/paths.test.ts +51 -0
- package/test/integration/config-loader.test.ts +265 -0
- package/test/integration/config.test.ts +444 -0
- package/test/integration/context-integration.test.ts +702 -0
- package/test/integration/context-provider-injection.test.ts +506 -0
- package/test/integration/context-verification-integration.test.ts +295 -0
- package/test/integration/e2e.test.ts +896 -0
- package/test/integration/execution.test.ts +625 -0
- package/test/integration/helpers.test.ts +295 -0
- package/test/integration/hooks.test.ts +361 -0
- package/test/integration/interaction-chain-pipeline.test.ts +464 -0
- package/test/integration/isolation.test.ts +143 -0
- package/test/integration/logger.test.ts +461 -0
- package/test/integration/parallel.test.ts +250 -0
- package/test/integration/path-security.test.ts +173 -0
- package/test/integration/pipeline-acceptance.test.ts +302 -0
- package/test/integration/pipeline-events.test.ts +475 -0
- package/test/integration/pipeline.test.ts +658 -0
- package/test/integration/plan.test.ts +157 -0
- package/test/integration/plugin-routing.test.ts +921 -0
- package/test/integration/plugins/config-integration.test.ts +172 -0
- package/test/integration/plugins/config-resolution.test.ts +522 -0
- package/test/integration/plugins/loader.test.ts +641 -0
- package/test/integration/plugins/registry.test.ts +746 -0
- package/test/integration/plugins/validator.test.ts +563 -0
- package/test/integration/prd-pause.test.ts +205 -0
- package/test/integration/prd-resolvers.test.ts +185 -0
- package/test/integration/precheck-integration.test.ts +468 -0
- package/test/integration/precheck.test.ts +805 -0
- package/test/integration/progress.test.ts +34 -0
- package/test/integration/rectification-flow.test.ts +512 -0
- package/test/integration/reporter-lifecycle.test.ts +860 -0
- package/test/integration/review-config-commands.test.ts +319 -0
- package/test/integration/review-config-schema.test.ts +116 -0
- package/test/integration/review-plugin-integration.test.ts +722 -0
- package/test/integration/review.test.ts +149 -0
- package/test/integration/routing-stage-bug-021.test.ts +274 -0
- package/test/integration/routing-stage-greenfield.test.ts +286 -0
- package/test/integration/runner-config-plugins.test.ts +461 -0
- package/test/integration/runner-fixes.test.ts +399 -0
- package/test/integration/runner-plugin-integration.test.ts +543 -0
- package/test/integration/runner.test.ts +1679 -0
- package/test/integration/s5-greenfield-fallback.test.ts +297 -0
- package/test/integration/status-file-integration.test.ts +325 -0
- package/test/integration/status-file.test.ts +379 -0
- package/test/integration/status-writer.test.ts +345 -0
- package/test/integration/story-id-in-events.test.ts +273 -0
- package/test/integration/tdd-cleanup.test.ts +246 -0
- package/test/integration/tdd-orchestrator.test.ts +1762 -0
- package/test/integration/test-scanner.test.ts +403 -0
- package/test/integration/verification-asset-check.test.ts +142 -0
- package/test/integration/verify-stage.test.ts +275 -0
- package/test/integration/worktree/manager.test.ts +218 -0
- package/test/integration/worktree/merge.test.ts +341 -0
- package/test/manual/logging-formatter-demo.ts +158 -0
- package/test/ui/tui-agent-panel.test.tsx +99 -0
- package/test/ui/tui-controls.test.ts +334 -0
- package/test/ui/tui-cost-and-pty.test.ts +189 -0
- package/test/ui/tui-layout.test.ts +378 -0
- package/test/ui/tui-pty-integration.test.tsx +159 -0
- package/test/ui/tui-stories.test.ts +332 -0
- package/test/unit/acceptance.test.ts +186 -0
- package/test/unit/agent-stderr-capture.test.ts +146 -0
- package/test/unit/analyze-classifier.test.ts +215 -0
- package/test/unit/analyze.test.ts +224 -0
- package/test/unit/auto-detect.test.ts +249 -0
- package/test/unit/cli-status.test.ts +417 -0
- package/test/unit/commands/common.test.ts +320 -0
- package/test/unit/commands/logs.test.ts +416 -0
- package/test/unit/commands/unlock.test.ts +319 -0
- package/test/unit/constitution-generators.test.ts +160 -0
- package/test/unit/constitution.test.ts +209 -0
- package/test/unit/context.test.ts +1722 -0
- package/test/unit/cost.test.ts +231 -0
- package/test/unit/crash-recovery.test.ts +308 -0
- package/test/unit/escalation.test.ts +126 -0
- package/test/unit/execution-logging-stderr.test.ts +156 -0
- package/test/unit/execution-stage.test.ts +122 -0
- package/test/unit/fix-generator.test.ts +275 -0
- package/test/unit/formatters.test.ts +469 -0
- package/test/unit/greenfield.test.ts +179 -0
- package/test/unit/helpers.test.ts +317 -0
- package/test/unit/interaction/human-review-trigger.test.ts +164 -0
- package/test/unit/interaction-network-failures.test.ts +389 -0
- package/test/unit/interaction-plugins.test.ts +164 -0
- package/test/unit/isolation.test.ts +134 -0
- package/test/unit/logging/formatter.test.ts +455 -0
- package/test/unit/merge.test.ts +268 -0
- package/test/unit/metrics.test.ts +276 -0
- package/test/unit/optimizer/noop.optimizer.test.ts +125 -0
- package/test/unit/optimizer/rule-based.optimizer.test.ts +358 -0
- package/test/unit/prd-auto-default.test.ts +290 -0
- package/test/unit/prd-failure-category.test.ts +176 -0
- package/test/unit/prd-get-next-story.test.ts +186 -0
- package/test/unit/precheck-checks.test.ts +840 -0
- package/test/unit/precheck-story-size-gate.test.ts +287 -0
- package/test/unit/precheck-types.test.ts +142 -0
- package/test/unit/prompts.test.ts +475 -0
- package/test/unit/queue.test.ts +237 -0
- package/test/unit/rectification.test.ts +284 -0
- package/test/unit/registry.test.ts +287 -0
- package/test/unit/routing.test.ts +937 -0
- package/test/unit/run-lifecycle.test.ts +140 -0
- package/test/unit/storyid-events.test.ts +224 -0
- package/test/unit/tdd-verdict.test.ts +492 -0
- package/test/unit/test-output-parser.test.ts +377 -0
- package/test/unit/verdict.test.ts +324 -0
- package/test/unit/worktree-manager.test.ts +158 -0
- package/tsconfig.json +27 -0
@@ -0,0 +1,401 @@

# Phase 2: LLM Service Layer — Merged Architecture Design

*Date: 2026-02-25*
*Status: Proposed (pending decision)*
*Supersedes: Original issue #3 design + 2026-02-25 architecture analysis*

---

## Problem

nax v0.10.0 has two coupling issues:

1. **All LLM calls go through Claude Code CLI** — the routing, review, and acceptance stages spawn `claude -p` just for text reasoning, which wastes a full CLI process on each call.
2. **All coding goes through a CLI subprocess** — ~350MB RAM each, which blocks parallelism.

## Solution: Unified LLM Service Layer + Lightweight Agent Loop

Two execution paths, one provider abstraction:

```
LlmProvider (interface — normalized across providers)
├── AnthropicProvider (Messages API)
├── GoogleProvider (GenerateContent API)
└── OpenAiCompatProvider (Chat Completions — covers OpenAI, Moonshot, DeepSeek, OpenRouter, Groq, etc.)

Used by:
├── LLM Mode (text in → text out) — routing, analyze, review, acceptance
│   └── llm/client.ts → callLlm(prompt, tier, config)
│
└── Agent Mode (text + tools) — coding, TDD
    ├── DirectApiAdapter — LlmProvider + tool loop (~5MB per session)
    └── ClaudeCodeAdapter — CLI subprocess (~350MB, for TDD/interactive)
```

## Architecture

```
src/
├── llm/                        # LLM Service Layer (shared by both modes)
│   ├── types.ts                # LlmProvider interface, Message, ToolCall types
│   ├── client.ts               # callLlm() with fallback chain logic
│   ├── registry.ts             # Create provider from config
│   └── providers/
│       ├── anthropic.ts        # Anthropic Messages API
│       ├── openai-compat.ts    # OpenAI-compatible (configurable baseUrl)
│       └── google.ts           # Google Gemini API
│
├── llm/tools/                  # Minimal tool set for Direct API coding
│   ├── types.ts                # ToolDefinition, ToolResult
│   ├── read-file.ts            # Read file contents
│   ├── write-file.ts           # Write/create file
│   ├── list-files.ts           # List directory
│   ├── search-files.ts         # Grep/ripgrep
│   └── run-command.ts          # Shell exec (tests, git)
│
├── llm/agent-loop.ts           # Tool use cycle: prompt → chat() → execute tools → loop
│
├── agents/                     # Agent adapters (implement AgentAdapter interface)
│   ├── types.ts                # AgentAdapter, AgentResult (unchanged)
│   ├── claude.ts               # ClaudeCodeAdapter (current — subprocess)
│   ├── direct-api.ts           # DirectApiAdapter (new — wraps llm/ + tools)
│   ├── registry.ts             # Resolve backend config → adapter instance
│   └── cost.ts                 # Cost estimation (unchanged for CLI, exact for API)
│
├── pipeline/stages/            # Each stage declares its execution mode
│   ├── routing.ts              # LLM Mode → llm/client.ts
│   ├── analyze.ts              # LLM Mode → llm/client.ts
│   ├── coding.ts               # Agent Mode → agents/registry.ts
│   ├── tdd.ts                  # Agent Mode → agents/registry.ts
│   ├── review.ts               # LLM Mode → llm/client.ts
│   └── acceptance.ts           # LLM Mode → llm/client.ts
│
└── config/schema.ts            # Extended with providers, routing, pipeline overrides
```

## LlmProvider Interface

```typescript
interface LlmProvider {
  readonly name: string;

  chat(options: {
    model: string;
    messages: Message[];
    tools?: ToolDefinition[];  // Optional — omit for LLM Mode (reasoning only)
    maxTokens?: number;
    temperature?: number;
    timeoutMs?: number;
  }): Promise<LlmResponse>;
}

interface LlmResponse {
  content: string;
  toolCalls: ToolCall[];       // Normalized regardless of provider format
  stopReason: "end_turn" | "tool_use" | "max_tokens";
  usage: { inputTokens: number; outputTokens: number };
}
```

Single interface serves both modes:
- **LLM Mode** (routing, review): `chat()` without `tools` → text response
- **Agent Mode** (coding): `chat()` with `tools` → tool calls → agent loop iterates

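To make the two modes concrete, here is a minimal sketch of a stub provider that satisfies the interface. The shapes of `Message`, `ToolDefinition`, and `ToolCall` are assumptions (the design only names them), and `StubProvider` and `ChatOptions` are hypothetical names used for illustration, not part of nax.

```typescript
// Assumed shapes: the design names these types but does not define them here.
interface Message { role: "user" | "assistant"; content: string }
interface ToolDefinition { name: string; description: string }
interface ToolCall { id: string; name: string; input: Record<string, unknown> }

interface LlmResponse {
  content: string;
  toolCalls: ToolCall[];
  stopReason: "end_turn" | "tool_use" | "max_tokens";
  usage: { inputTokens: number; outputTokens: number };
}

// Hypothetical alias for the options object accepted by chat().
interface ChatOptions {
  model: string;
  messages: Message[];
  tools?: ToolDefinition[];
  maxTokens?: number;
  temperature?: number;
  timeoutMs?: number;
}

interface LlmProvider {
  readonly name: string;
  chat(options: ChatOptions): Promise<LlmResponse>;
}

// Hypothetical in-memory provider: it requests a tool call only when tools are offered.
class StubProvider implements LlmProvider {
  readonly name = "stub";

  async chat(options: ChatOptions): Promise<LlmResponse> {
    const usage = { inputTokens: options.messages.length, outputTokens: 1 };
    const firstTool = options.tools?.[0];
    if (firstTool) {
      // Agent Mode: the caller's agent loop would execute this call and continue.
      return {
        content: "",
        toolCalls: [{ id: "call_1", name: firstTool.name, input: {} }],
        stopReason: "tool_use",
        usage,
      };
    }
    // LLM Mode: plain text in, plain text out.
    return { content: "simple", toolCalls: [], stopReason: "end_turn", usage };
  }
}
```

A stage in LLM Mode would call `chat()` without `tools` and read `content`; an Agent Mode caller would pass `tools` and keep looping while `stopReason` is `"tool_use"`.
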
## Provider Implementations

| Implementation | Covers | API Format | Tool Use Format |
|:---------------|:-------|:-----------|:---------------|
| `AnthropicProvider` | Anthropic (Claude) | Messages API | `tool_use` content blocks |
| `OpenAiCompatProvider` | OpenAI, Moonshot, DeepSeek, OpenRouter, Groq, Together | Chat Completions | `tool_calls` in message |
| `GoogleProvider` | Google Gemini | GenerateContent | `functionCall` in parts |

`OpenAiCompatProvider` takes `baseUrl` + `apiKey`, so any OpenAI-compatible provider works with zero new code.

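The three tool-use formats in the table could be normalized into one `ToolCall` shape roughly as follows. This is a sketch under assumptions: the `ToolCall` fields and the function names `fromAnthropic`, `fromOpenAi`, and `fromGoogle` are hypothetical, and the input types model only the fields needed here rather than the full provider responses.

```typescript
// Assumed normalized shape; the design names ToolCall without defining it here.
interface ToolCall { id: string; name: string; input: Record<string, unknown> }

// Anthropic Messages API: tool requests arrive as `tool_use` content blocks.
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };

function fromAnthropic(blocks: AnthropicBlock[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of blocks) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, input: block.input });
    }
  }
  return calls;
}

// Chat Completions: `tool_calls` on the assistant message, arguments as a JSON string.
interface OpenAiToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

function fromOpenAi(toolCalls: OpenAiToolCall[] | undefined): ToolCall[] {
  return (toolCalls ?? []).map((call) => ({
    id: call.id,
    name: call.function.name,
    input: JSON.parse(call.function.arguments) as Record<string, unknown>,
  }));
}

// Gemini GenerateContent: `functionCall` inside parts. This sketch synthesizes
// an id from the part index because it does not model an id on the part.
interface GeminiPart {
  text?: string;
  functionCall?: { name: string; args?: Record<string, unknown> };
}

function fromGoogle(parts: GeminiPart[]): ToolCall[] {
  const calls: ToolCall[] = [];
  parts.forEach((part, index) => {
    if (part.functionCall) {
      calls.push({
        id: `call_${index}`,
        name: part.functionCall.name,
        input: part.functionCall.args ?? {},
      });
    }
  });
  return calls;
}
```

Keeping each mapping as a pure function means the agent loop only ever sees `ToolCall[]`, whichever provider produced the response.
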
## Tier-Based Fallback Chains

Each tier is an ordered list of providers. On 429/error, try next in chain:

```
Stage needs "balanced" tier
  → Try anthropic/sonnet
  → Rate limited (429)? → Try openai/gpt-5
  → Also limited? → Try next in list
  → All exhausted? → Stage fails with clear error
```

Both LLM Mode and Agent Mode use the same fallback logic via `llm/client.ts`:

```
config.models["balanced"] → [anthropic/sonnet, openai/gpt-5]
                    │
  ┌─────────────────┴──────────────────┐
  │                                    │
LLM Mode stages                Agent Mode stages
(routing, review)              (coding, TDD)
  │                                    │
llm/client.ts                  DirectApiAdapter
tries providers                tries providers
in order                       in order (with tools)
```

Single `ModelDef` (not array) is treated as array of one — backward compatible, no fallback.

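The fallback behaviour described above can be sketched as a small generic helper. `ModelDef` is assumed to carry `provider` and `model` fields (as in the config examples), and `toChain` and `callWithFallback` are hypothetical names, not the actual `llm/client.ts` API. For brevity the sketch moves on after any error, not only a 429.

```typescript
// Assumed shape, matching the { provider, model } entries in the config examples.
interface ModelDef { provider: string; model: string }

// A single ModelDef is treated as a chain of one (no fallback).
function toChain(entry: ModelDef | ModelDef[]): ModelDef[] {
  return Array.isArray(entry) ? entry : [entry];
}

// Try each provider in order; fail with one combined error once all are exhausted.
async function callWithFallback<T>(
  entry: ModelDef | ModelDef[],
  attempt: (def: ModelDef) => Promise<T>,
): Promise<T> {
  const failures: string[] = [];
  for (const def of toChain(entry)) {
    try {
      return await attempt(def);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${def.provider}/${def.model}: ${reason}`);
    }
  }
  throw new Error(`All providers exhausted: ${failures.join("; ")}`);
}
```

Passing the attempt as a callback lets the same helper serve both modes: an LLM Mode caller would wrap a plain `chat()` call, while an Agent Mode caller would wrap a `chat()` call that includes tools.
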
## Backend Routing

Three-level resolution:

### 1. Per-Stage Pipeline Override (most specific)

```json
{
  "pipeline": {
    "routing": {
      "primary": { "provider": "google", "model": "gemini-flash", "via": "api" },
      "fallback": [
        { "provider": "anthropic", "model": "haiku", "via": "api" },
        { "via": "keyword" }
      ]
    },
    "implementation": {
      "primary": { "provider": "anthropic", "model": "sonnet", "via": "api" },
      "fallback": [
        { "via": "claude-cli" }
      ]
    }
  }
}
```

The `via` field determines execution path:
- `"api"` → Direct API (LLM Mode or DirectApiAdapter depending on stage)
- `"claude-cli"` → Claude Code CLI subprocess
- `"keyword"` → built-in keyword strategy (routing only)

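One way to model the `via` field is a discriminated union, so that each execution path is handled explicitly. This is only a sketch: `BackendRef`, `ExecutionPath`, `AGENT_STAGES`, and `executionPath` are hypothetical names, and the set of Agent Mode stage names is an assumption based on the stages mentioned in this design.

```typescript
// Hypothetical union over the three `via` values described above.
type BackendRef =
  | { via: "api"; provider: string; model: string }
  | { via: "claude-cli" }
  | { via: "keyword" };

type ExecutionPath =
  | "llm-client"
  | "direct-api-adapter"
  | "claude-code-adapter"
  | "keyword-strategy";

// Assumption: stages that need tools (Agent Mode). Everything else is LLM Mode.
const AGENT_STAGES = new Set<string>(["coding", "tdd", "implementation"]);

function executionPath(ref: BackendRef, stage: string): ExecutionPath {
  switch (ref.via) {
    case "api":
      // Direct API: LLM Mode or DirectApiAdapter depending on the stage.
      return AGENT_STAGES.has(stage) ? "direct-api-adapter" : "llm-client";
    case "claude-cli":
      return "claude-code-adapter";
    case "keyword":
      // The keyword strategy is only meaningful for the routing stage.
      if (stage !== "routing") {
        throw new Error(`"keyword" is only valid for the routing stage, got "${stage}"`);
      }
      return "keyword-strategy";
  }
}
```

With a union like this, adding a fourth `via` value becomes a compile-time error in the `switch` under strict settings until it is handled.
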
### 2. Strategy Override (tdd/interactive → force backend)

```json
{
  "agents": {
    "overrides": {
      "tdd": "claude-cli",
      "interactive": "claude-cli"
    }
  }
}
```

### 3. Tier Routing (default)

```json
{
  "agents": {
    "routing": {
      "fast": { "provider": "gemini", "model": "gemini-2.5-flash" },
      "balanced": { "provider": "anthropic", "model": "claude-sonnet-4-5" },
      "powerful": { "provider": "anthropic", "model": "claude-opus-4" }
    }
  }
}
```

### Resolution Logic

```typescript
function resolveBackend(
  tier: ModelTier,
  stage: string,
  context: { tdd: boolean; interactive: boolean }
): BackendConfig {
  const config = loadConfig();

  // 1. Per-stage pipeline override
  if (config.pipeline?.[stage]?.primary) return config.pipeline[stage];

  // 2. Strategy override
  if (context.tdd && config.agents?.overrides?.tdd) return config.agents.overrides.tdd;
  if (context.interactive && config.agents?.overrides?.interactive) return config.agents.overrides.interactive;

  // 3. Tier routing
  return config.agents?.routing?.[tier] ?? "claude-cli";
}
```

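The three branches in the resolution logic return differently shaped values (a stage override object, a backend name string, and a tier entry). A possible way to give callers one shape is a tagged result, sketched below. Everything here is an assumption for illustration: `RoutingConfig`, `StageOverride`, `ResolvedBackend`, and `resolveBackendSketch` are hypothetical, and the config is passed in as a parameter instead of being loaded, to keep the function pure and testable.

```typescript
type ModelTier = "fast" | "balanced" | "powerful";

// Assumed shapes, inferred from the JSON examples in this section.
interface ModelDef { provider: string; model: string }
interface StageOverride {
  primary: ModelDef & { via: string };
  fallback?: Array<{ via: string; provider?: string; model?: string }>;
}
interface RoutingConfig {
  pipeline?: Record<string, StageOverride | undefined>;
  agents?: {
    overrides?: { tdd?: string; interactive?: string };
    routing?: Partial<Record<ModelTier, ModelDef | ModelDef[]>>;
  };
}

// Tagged result so callers can tell which level produced the answer.
type ResolvedBackend =
  | { level: "pipeline"; override: StageOverride }
  | { level: "strategy"; backend: string }
  | { level: "tier"; chain: ModelDef[] }
  | { level: "default"; backend: "claude-cli" };

function resolveBackendSketch(
  config: RoutingConfig,
  tier: ModelTier,
  stage: string,
  context: { tdd: boolean; interactive: boolean },
): ResolvedBackend {
  // 1. Per-stage pipeline override (most specific)
  const override = config.pipeline?.[stage];
  if (override?.primary) return { level: "pipeline", override };

  // 2. Strategy override
  const strategy = config.agents?.overrides;
  if (context.tdd && strategy?.tdd) return { level: "strategy", backend: strategy.tdd };
  if (context.interactive && strategy?.interactive) {
    return { level: "strategy", backend: strategy.interactive };
  }

  // 3. Tier routing, with a single ModelDef treated as a chain of one
  const entry = config.agents?.routing?.[tier];
  if (entry) return { level: "tier", chain: Array.isArray(entry) ? entry : [entry] };

  return { level: "default", backend: "claude-cli" };
}
```

The order of the checks mirrors the three levels above: a pipeline override wins over a strategy override, which wins over tier routing.
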
## Full Config Example

```json
{
  "agents": {
    "providers": {
      "anthropic": {
        "type": "anthropic",
        "apiKey": "${ANTHROPIC_API_KEY}"
      },
      "openai": {
        "type": "openai-compat",
        "baseUrl": "https://api.openai.com/v1",
        "apiKey": "${OPENAI_API_KEY}"
      },
      "gemini": {
        "type": "google",
        "apiKey": "${GOOGLE_API_KEY}"
      },
      "moonshot": {
        "type": "openai-compat",
        "baseUrl": "https://api.moonshot.cn/v1",
        "apiKey": "${MOONSHOT_API_KEY}"
      },
      "deepseek": {
        "type": "openai-compat",
        "baseUrl": "https://api.deepseek.com/v1",
        "apiKey": "${DEEPSEEK_API_KEY}"
      }
    },
    "routing": {
      "fast": { "provider": "gemini", "model": "gemini-2.5-flash" },
      "balanced": [
        { "provider": "anthropic", "model": "claude-sonnet-4-5" },
        { "provider": "openai", "model": "gpt-5" }
      ],
      "powerful": { "provider": "anthropic", "model": "claude-opus-4" }
    },
    "overrides": {
      "tdd": "claude-cli",
      "interactive": "claude-cli"
    }
  },
  "pipeline": {
    "routing": {
      "primary": { "provider": "gemini", "model": "gemini-flash", "via": "api" },
      "fallback": [{ "via": "keyword" }]
    }
  }
}
```

|
|
275
|
+
## Minimal Tool Set (for DirectApiAdapter)

| Tool | What | Lines |
|:-----|:-----|:------|
| `read_file` | Read file contents (with line range) | ~15 |
| `write_file` | Write/create file (with mkdir -p) | ~15 |
| `list_files` | List directory (recursive option) | ~15 |
| `search_files` | Grep/ripgrep pattern search | ~20 |
| `run_command` | Shell exec with timeout + cwd | ~30 |

~95 lines total. Each tool is sandboxed to the project workdir.

## Agent Loop

```typescript
async function agentLoop(
  provider: LlmProvider,
  model: string,
  prompt: string,
  workdir: string,
  maxIterations: number = 50,
): Promise<AgentResult> {
  const tools = getToolDefinitions();
  let messages: Message[] = [{ role: "user", content: prompt }];
  let totalCost = { input: 0, output: 0 };

  for (let i = 0; i < maxIterations; i++) {
    const response = await provider.chat({ model, messages, tools });
    totalCost.input += response.usage.inputTokens;
    totalCost.output += response.usage.outputTokens;

    if (response.stopReason === "end_turn") {
      return { success: true, output: response.content, cost: totalCost };
    }

    // Execute tool calls
    const toolResults = await Promise.all(
      response.toolCalls.map(tc => executeTool(tc, workdir))
    );

    messages.push({ role: "assistant", content: response.content, toolCalls: response.toolCalls });
    messages.push({ role: "tool", results: toolResults });
  }

  return { success: false, output: "Max iterations reached", cost: totalCost };
}
```

~150 lines with error handling, logging, and token budget checks.

## Comparison: CLI vs Direct API

| Factor | Claude Code CLI | Direct API |
|:-------|:---------------|:-----------|
| RAM per session | ~350MB | ~5MB |
| Parallel stories | OOMs at 3 | 10+ concurrent |
| Cost tracking | Estimated from duration | Exact token counts from API |
| Provider flexibility | Anthropic only | Any provider with tool_use |
| Tool access | ~50 tools (overkill) | 5 tools (minimal, sandboxed) |
| CLAUDE.md support | ✅ Auto-loaded | ❌ Must inject into prompt |
| TDD isolation | ✅ PTY-based session isolation | ⚠️ Possible but needs validation |
| Interactive/TUI | ✅ PTY handle | ❌ Not supported |
| Dependencies | `claude` binary installed | Just HTTP (fetch) |

## Backward Compatibility

- No `agents` section in config → everything uses `claude-cli` (current behavior)
- No `pipeline` section → stages inherit from tier routing
- Single ModelDef (not array) → treated as array of one, no fallback
- Zero breaking changes

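The "single ModelDef is treated as an array of one" rule can be sketched as follows. This is a minimal illustration, not the planned `llm/client.ts`: the `ModelDef` shape is reduced to the two fields shown in the config examples, and `toChain` and `callWithFallback` are hypothetical names.

```typescript
// Hypothetical, reduced shape; the real ModelDef may carry more fields.
interface ModelDef {
  provider: string;
  model: string;
}

type TierRoute = ModelDef | ModelDef[];

// A single ModelDef becomes a one-element chain, so it has no fallback.
function toChain(route: TierRoute): ModelDef[] {
  return Array.isArray(route) ? route : [route];
}

// Try each entry in order and return the first successful result.
async function callWithFallback<T>(
  route: TierRoute,
  call: (def: ModelDef) => Promise<T>,
): Promise<T> {
  const errors: unknown[] = [];
  for (const def of toChain(route)) {
    try {
      return await call(def);
    } catch (err) {
      errors.push(err); // remember the failure and move on to the next entry
    }
  }
  throw new Error(`All ${errors.length} model(s) in the chain failed`);
}
```

Normalizing at the edge keeps the calling code identical for both config shapes, which is what makes the array form a non-breaking addition.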
## Component Breakdown

| Component | Est. Lines | What |
|:----------|:-----------|:-----|
| `llm/types.ts` | ~60 | LlmProvider, Message, ToolCall, LlmResponse |
| `llm/providers/anthropic.ts` | ~80 | Messages API + tool_use normalization |
| `llm/providers/openai-compat.ts` | ~80 | Chat Completions + configurable baseUrl |
| `llm/providers/google.ts` | ~100 | GenerateContent + functionCall normalization |
| `llm/registry.ts` | ~40 | Provider factory from config |
| `llm/client.ts` | ~80 | callLlm() with fallback chain + retry |
| `llm/tools/*.ts` (5 tools) | ~95 | read, write, list, search, exec |
| `llm/agent-loop.ts` | ~150 | Tool use cycle with iteration limit |
| `agents/direct-api.ts` | ~80 | DirectApiAdapter wrapping llm/ layer |
| `agents/registry.ts` (update) | ~30 | Resolve backend config → adapter |
| `config/schema.ts` (update) | ~100 | providers, routing, overrides, pipeline |
| **Total** | **~895** | |

## Implementation Phases

| Phase | Scope | Effort | Enables |
|:------|:------|:-------|:--------|
| P1 | LlmProvider interface + AnthropicProvider + callLlm() | Small | LLM Mode for routing/review |
| P2 | OpenAiCompatProvider + GoogleProvider | Small | Multi-provider support |
| P3 | Fallback chain logic in client.ts | Medium | Rate limit resilience |
| P4 | Tool definitions + agent loop + DirectApiAdapter | Medium | API-based coding |
| P5 | Per-stage pipeline config | Medium | Fine-grained stage control |
| P6 | Wire LLM Mode into routing, review, acceptance stages | Medium | Remove CLI dependency for reasoning |

P1-P2 can ship independently as a quick win (LLM Mode only). P4 is the big unlock for Phase 3 parallelism.

## Auth/Key Management

Provider keys flow from config with env var expansion:

```json
{
  "providers": {
    "anthropic": { "type": "anthropic", "apiKey": "${ANTHROPIC_API_KEY}" }
  }
}
```

Each provider reads `apiKey` from its config entry. If no `apiKey` is configured, the provider falls back to `process.env` for backward compatibility.
Per-model env overrides via `ModelDef.env` still work (existing behavior).

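A minimal sketch of the `${VAR}` expansion and the environment fallback described above. The helper names (`expandEnv`, `resolveApiKey`), the choice to leave unset placeholders untouched, and the `fallbackVar` parameter are all assumptions made for illustration; the plan does not specify them.

```typescript
type Env = Record<string, string | undefined>;

// Replace each ${NAME} placeholder with the value from `env`.
// Assumption: unset variables are left as-is so validation can report them later.
function expandEnv(value: string, env: Env): string {
  return value.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (match: string, name: string) => env[name] ?? match,
  );
}

// Prefer the configured key; fall back to a conventional env var (hypothetical name).
function resolveApiKey(
  configured: string | undefined,
  fallbackVar: string,
  env: Env,
): string | undefined {
  if (configured !== undefined) {
    const expanded = expandEnv(configured, env);
    // Only accept the configured value if every placeholder was resolved.
    if (!expanded.includes("${")) return expanded;
  }
  return env[fallbackVar];
}
```

In real use the `env` argument would be `process.env`; passing it explicitly keeps the helpers easy to test.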
## Enables Phase 3 (Parallelism)

With DirectApiAdapter (~5MB each), Phase 3 becomes feasible:
- N stories execute concurrently via parallel HTTP calls
- Each story gets its own git worktree (from dev-orchestrator pattern)
- No OOM risk: 10 concurrent stories ≈ 50MB total vs 3.5GB with CLI
- Exact cost tracking per story from API token counts

---

*Decision pending. This doc captures the merged architecture for future implementation.*

@@ -0,0 +1,187 @@

# Deep Code Review: @nathapp/nax v0.10.1

**Date:** 2026-02-25
**Reviewer:** Subrina (AI)
**Scope:** Status File, Failure Categories, Verifier Verdicts (31 files changed, ~5,200 lines added)
**Commit Range:** v0.10.0..v0.10.1

---

## Overall Grade: A- (87/100)

| Category | Score | Notes |
|:---|:---:|:---|
| Security | 17/20 | Solid input validation on verdict; minor path traversal concern |
| Reliability | 18/20 | Atomic writes, graceful fallbacks, comprehensive error handling |
| API Design | 18/20 | Clean types, good separation of concerns, extensible verdict schema |
| Code Quality | 17/20 | Well-structured, good test coverage; runner.ts exceeds 400-line guideline |
| Best Practices | 17/20 | Good patterns; minor DRY and cleanup opportunities |

---

## Findings

### CRITICAL

*None found.*

### HIGH

#### BUG-1: runner.ts exceeds 400-line guideline at 1,310 lines (HIGH)

**File:** `src/execution/runner.ts`
**Risk:** Cognitive complexity, harder to test individual paths, merge conflicts.
**Snippet:** `wc -l src/execution/runner.ts → 1310`
**Fix:** Extract the status-file write orchestration, the story pipeline dispatch, and the escalation logic into separate modules (as was done with `post-verify.ts`). The `writeStatus` closure and its state vars could become a `StatusFileWriter` class.

#### SEC-1: Status file path not validated for path traversal (HIGH)

**File:** `src/execution/status-file.ts:170`
```typescript
export async function writeStatusFile(filePath: string, status: NaxStatusFile): Promise<void> {
  const tmpPath = `${filePath}.tmp`;
  await Bun.write(tmpPath, JSON.stringify(status, null, 2));
  await rename(tmpPath, filePath);
}
```
**Risk:** If the `statusFile` option were ever populated from untrusted input, arbitrary file overwrite would be possible. The risk is currently mitigated because the path comes only from the operator-controlled `--status-file` CLI arg, but the path itself is never validated.
**Fix:** Call `path.resolve()` on the path and verify that the result lies within the workdir or a known safe directory.

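One way the SEC-1 fix could look, as a sketch only. `resolveInside` is a hypothetical helper name; it uses Node's `path.resolve` and `path.relative`, and it does not follow symlinks, so a stricter check would also need `fs.realpath`.

```typescript
import * as path from "node:path";

// Resolve `candidate` against `workdir` and reject anything that escapes it.
function resolveInside(workdir: string, candidate: string): string {
  const root = path.resolve(workdir);
  const resolved = path.resolve(root, candidate);
  const rel = path.relative(root, resolved);

  const escapes =
    rel === "" || // the workdir itself is not a valid file target
    rel === ".." ||
    rel.startsWith(`..${path.sep}`) ||
    path.isAbsolute(rel); // e.g. a different drive on Windows

  if (escapes) {
    throw new Error(`Status file path escapes workdir: ${candidate}`);
  }
  return resolved;
}
```

Comparing the relative path rather than using a plain string prefix check avoids accepting sibling directories such as `/work/project-evil` when the workdir is `/work/project`.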
### MEDIUM

#### ENH-1: `getSafeLogger()` duplicated across files (MEDIUM)

**Files:** `src/execution/runner.ts:71`, `src/execution/post-verify.ts:80`
```typescript
function getSafeLogger() {
  try { return getLogger(); }
  catch { return null; }
}
```
**Risk:** DRY violation. If logger initialization changes, both copies need updating.
**Fix:** Export `getSafeLogger()` from `src/logger.ts` or a shared utility.

#### TYPE-1: `captureGitRef()` duplicated with different signatures (MEDIUM)

**Files:** `src/execution/post-verify.ts:17` (returns `string | undefined`), `src/tdd/orchestrator.ts:30` (returns `string`, throws on failure)
**Risk:** Inconsistent error handling for the same operation. The orchestrator version will throw if git isn't available.
**Fix:** Consolidate into a single shared function in a git utility module. Prefer the `string | undefined` signature for resilience.

#### BUG-2: `writeStatus` swallows errors silently during critical state transitions (MEDIUM)

**File:** `src/execution/runner.ts:188-193`
```typescript
catch (err) {
  safeLogger?.warn("status-file", "Failed to write status file (non-fatal)", {
    path: statusFile,
    error: (err as Error).message,
  });
}
```
**Risk:** If the status file write fails repeatedly (e.g., disk full), the only signal is a warn log that may be missed. External tooling polling the file would see stale data.
**Fix:** Consider a counter; after N consecutive failures, emit a more prominent error or set a flag on the run state.

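The counter suggested in the BUG-2 fix could be as small as the sketch below. `WriteFailureTracker` and the default threshold of 3 are assumptions for illustration; the review does not name a class or a value for N.

```typescript
// Tracks consecutive status-file write failures and escalates after a threshold.
class WriteFailureTracker {
  private consecutive = 0;
  private readonly threshold: number;

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Any successful write resets the streak.
  recordSuccess(): void {
    this.consecutive = 0;
  }

  // Returns the log level the caller should use for this failure.
  recordFailure(): "warn" | "error" {
    this.consecutive += 1;
    return this.consecutive >= this.threshold ? "error" : "warn";
  }
}
```

The `catch` block would call `recordFailure()` and log at the returned level, while the success path calls `recordSuccess()`, so an isolated hiccup stays a warning and a persistent fault becomes visible.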
#### ENH-2: Verdict validation could use a schema library (MEDIUM)

**File:** `src/tdd/verdict.ts:85-120`; `isValidVerdict()` is 35 lines of manual validation.
**Risk:** Verbose and error-prone as the schema evolves. New fields require manual validation additions.
**Fix:** Consider using `zod` or `valibot` for declarative schema validation. However, the current zero-dependency approach is acceptable for a CLI tool, so this is marked as an enhancement, not a bug.

#### MEM-1: Atomic write leaves orphan `.tmp` file on crash between write and rename (MEDIUM)

**File:** `src/execution/status-file.ts:172-175`
**Risk:** If the process crashes after `Bun.write()` but before `rename()`, a `.tmp` file persists. Not a memory leak but can cause confusion.
**Fix:** Add cleanup of stale `.tmp` files at runner startup, or use `try/finally` to attempt cleanup.

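A sketch of the in-process half of the MEM-1 fix. It swaps `Bun.write` for `writeFile` from `node:fs/promises` so the example runs on plain Node as well, and `writeJsonAtomic` is a hypothetical name. It only covers failures that surface as exceptions; a hard crash between the two calls would still need the startup sweep mentioned above.

```typescript
import { rename, rm, writeFile } from "node:fs/promises";

// Write to a temp file, then rename it into place.
// If either step throws, remove the temp file before re-throwing.
async function writeJsonAtomic(filePath: string, data: unknown): Promise<void> {
  const tmpPath = `${filePath}.tmp`;
  try {
    await writeFile(tmpPath, JSON.stringify(data, null, 2));
    await rename(tmpPath, filePath);
  } catch (err) {
    // Best-effort cleanup; `force: true` ignores a temp file that was never created.
    await rm(tmpPath, { force: true });
    throw err;
  }
}
```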
### LOW

#### STYLE-1: Inconsistent `as FailureCategory` casts in orchestrator (LOW)

**File:** `src/tdd/orchestrator.ts:293, 312`
```typescript
failureCategory: "session-failure" as FailureCategory,
failureCategory: "isolation-violation" as FailureCategory,
```
**Risk:** The type is already `FailureCategory`, so the cast is redundant and adds noise.
**Fix:** Remove the `as FailureCategory` casts; TypeScript already infers the string literal correctly.

#### STYLE-2: Test file duplication: `tdd-verdict.test.ts` and `verdict.test.ts` overlap significantly (LOW)

**Files:** `test/verdict.test.ts` (339 lines), `test/tdd-verdict.test.ts` (290 lines)
**Risk:** Both test `readVerdict`, `cleanupVerdict`, and `categorizeVerdict` with very similar test cases. Maintenance burden doubles.
**Fix:** Consolidate into a single test file. If both were generated by different subtasks, merge the more thorough assertions from each.

#### ENH-3: `countProgress` iterates stories 4 times (LOW)

**File:** `src/execution/status-file.ts:101-106`
```typescript
const passed = stories.filter((s) => s.status === "passed").length;
const failed = stories.filter((s) => s.status === "failed").length;
const paused = stories.filter((s) => s.status === "paused").length;
const blocked = stories.filter((s) => s.status === "blocked").length;
```
**Risk:** Negligible performance impact (PRDs have <50 stories), but could be a single loop.
**Fix:** Single `reduce()` pass. Low priority; readability is fine as-is.

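For reference, the single-pass variant might look like the sketch below. The `Story` and `ProgressCounts` shapes and the `"pending"` status are assumptions; only the four counted statuses appear in the reviewed snippet.

```typescript
// Assumed status union: "pending" is hypothetical, the other four come from the snippet.
type StoryStatus = "pending" | "passed" | "failed" | "paused" | "blocked";

interface Story {
  status: StoryStatus;
}

interface ProgressCounts {
  passed: number;
  failed: number;
  paused: number;
  blocked: number;
}

// Count all four statuses in one pass over the stories.
function countProgress(stories: readonly Story[]): ProgressCounts {
  return stories.reduce<ProgressCounts>(
    (acc, story) => {
      // Statuses that are not tracked (here: "pending") are skipped.
      if (story.status !== "pending") acc[story.status] += 1;
      return acc;
    },
    { passed: 0, failed: 0, paused: 0, blocked: 0 },
  );
}
```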
#### PERF-1: `buildStatusSnapshot` calls `Date.now()` once but creates `new Date()` separately (LOW)

**File:** `src/execution/status-file.ts:154`
```typescript
const now = Date.now();
// ...
updatedAt: new Date(now).toISOString(),
```
**Risk:** None. This is actually correct and efficient. No finding here, just noting the pattern is clean.

#### ENH-4: `resolveMaxAttemptsOutcome` could benefit from exhaustive matching (LOW)

**File:** `src/execution/runner.ts:59-64`
```typescript
export function resolveMaxAttemptsOutcome(failureCategory?: FailureCategory): "pause" | "fail" {
  if (failureCategory === "isolation-violation" || failureCategory === "verifier-rejected") {
    return "pause";
  }
  return "fail";
}
```
**Risk:** If new `FailureCategory` values are added, this function silently defaults to "fail".
**Fix:** Add a `satisfies never` exhaustive check or use a switch statement.

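The switch-based variant of the ENH-4 fix could be sketched as follows. The `FailureCategory` union here is an assumption that lists only the three values quoted in this review; the real type may have more members, and each would need its own `case`.

```typescript
// Assumed union: only the values that appear in this review.
type FailureCategory = "isolation-violation" | "verifier-rejected" | "session-failure";

function resolveMaxAttemptsOutcome(failureCategory?: FailureCategory): "pause" | "fail" {
  switch (failureCategory) {
    case "isolation-violation":
    case "verifier-rejected":
      return "pause";
    case "session-failure":
    case undefined:
      return "fail";
    default: {
      // Compile-time guard: adding a new FailureCategory member without a case
      // makes this assignment fail to type-check.
      const unhandled: never = failureCategory;
      throw new Error(`Unhandled failure category: ${String(unhandled)}`);
    }
  }
}
```

The runtime `throw` only fires if a value bypasses the type system (for example, data parsed from JSON), which is exactly the case where a silent default to "fail" would be misleading.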
---

## What Was Done Well

1. **Atomic writes** for the status file: write-to-tmp-then-rename prevents partial reads.
2. **Verdict validation** is thorough: `isValidVerdict()` checks every required field, returns null on failure (never throws).
3. **Clean separation**: verdict reading, categorization, and cleanup are separate functions with single responsibilities.
4. **Fallback paths**: when no verdict file exists, the orchestrator gracefully falls back to independent test verification.
5. **Test coverage** is excellent: ~6 test files covering all three features with edge cases, error paths, and priority ordering.
6. **Type design**: `FailureCategory` as a union type, `NaxStatusFile` with version field for forward compat, `ThreeSessionTddResult.verdict` using `null | undefined` distinction.
7. **Documentation**: JSDoc on all public APIs with clear parameter descriptions.
8. **`markStoryFailed` backward compatibility**: the `failureCategory` parameter is optional; existing callers don't break.

---

## Priority Fix Order

| Priority | ID | Severity | Effort | Description |
|:---:|:---|:---:|:---:|:---|
| 1 | BUG-1 | HIGH | L | Extract runner.ts into smaller modules |
| 2 | SEC-1 | HIGH | S | Validate status file path |
| 3 | ENH-1 | MEDIUM | S | Deduplicate `getSafeLogger()` |
| 4 | TYPE-1 | MEDIUM | S | Consolidate `captureGitRef()` |
| 5 | STYLE-2 | LOW | M | Merge duplicate verdict test files |
| 6 | ENH-4 | LOW | S | Exhaustive match in `resolveMaxAttemptsOutcome` |
| 7 | STYLE-1 | LOW | S | Remove redundant `as FailureCategory` casts |

*Effort: S = <30min, M = 1-2h, L = 2-4h*

---

## Summary

v0.10.1 is a **solid implementation** of three well-scoped features. The code demonstrates good defensive programming (never-throw readers, atomic writes, graceful fallbacks) and strong type design. Test coverage is comprehensive with both happy-path and error-path cases.

The main concern is **runner.ts growing to 1,310 lines**: the status-file integration added more state and write points to an already large file. The next refactoring pass should extract the status-file writer and the story pipeline dispatch into separate modules.

No critical security or reliability issues found. The codebase is production-ready.