@nathapp/nax 0.18.1
- package/.gitlab-ci.yml +96 -0
- package/BRIEF.md +140 -0
- package/CHANGELOG.md +60 -0
- package/CLAUDE.md +159 -0
- package/README.md +373 -0
- package/US-007-IMPLEMENTATION.md +139 -0
- package/bin/nax.ts +930 -0
- package/biome.json +14 -0
- package/bun.lock +168 -0
- package/bunfig.toml +11 -0
- package/docs/20260216-fix-plan-context-review.md +56 -0
- package/docs/20260216-relentless-vs-ngent-comparison.md +208 -0
- package/docs/20260216-v02-plan.md +136 -0
- package/docs/20260216-v02-review.md +685 -0
- package/docs/20260217-dogfood-findings.md +56 -0
- package/docs/20260217-p2-plus-plan.md +117 -0
- package/docs/20260217-partial-fixes-plan.md +62 -0
- package/docs/20260217-plan-analyze-spec.md +117 -0
- package/docs/20260217-post-impl-review.md +1137 -0
- package/docs/20260217-quick-wins-plan.md +66 -0
- package/docs/20260217-split-runner-plan.md +75 -0
- package/docs/20260217-v03-impl-plan.md +80 -0
- package/docs/20260217-v03-post-impl-review.md +589 -0
- package/docs/20260217-v04-impl-plan.md +86 -0
- package/docs/20260217-v05-post-impl-review.md +850 -0
- package/docs/20260217-v06-post-impl-review.md +817 -0
- package/docs/20260218-adr003-port-plan.md +151 -0
- package/docs/20260218-review-adr003-verification.md +175 -0
- package/docs/20260219-fix-plan-bug16-19.md +79 -0
- package/docs/20260219-fix-plan-bug20-22.md +114 -0
- package/docs/20260219-plan-llm-routing.md +116 -0
- package/docs/20260219-review-bug20-22-fixes.md +135 -0
- package/docs/20260219-routing-baseline-keyword.md +63 -0
- package/docs/20260220-plan-structured-logging-p1.md +80 -0
- package/docs/20260220-plan-structured-logging-p2.md +37 -0
- package/docs/20260220-review-llm-routing.md +180 -0
- package/docs/20260220-review-post-fix-llm-routing.md +70 -0
- package/docs/20260221-fix-plan-relevantfiles-split.md +101 -0
- package/docs/20260221-fix-plan-routing-mode.md +125 -0
- package/docs/20260221-review-v0.9-implementation.md +379 -0
- package/docs/20260222-fix-plan-v091-routing-isolation.md +197 -0
- package/docs/20260223-fix-plan-prompt-audit.md +62 -0
- package/docs/20260224-nax-roadmap-phases.md +189 -0
- package/docs/20260225-phase2-llm-service-layer.md +401 -0
- package/docs/20260225-review-v0.10.1.md +187 -0
- package/docs/20260303-v010-implementation-plan.md +165 -0
- package/docs/CLAUDE.md.bak +191 -0
- package/docs/ROADMAP.md +165 -0
- package/docs/SPEC-rectification.md +0 -0
- package/docs/SPEC.md +324 -0
- package/docs/US-001-plugin-loading-verification.md +152 -0
- package/docs/architecture-analysis.md +1076 -0
- package/docs/bugs/BUG-21-escalation-null-attempts.md +48 -0
- package/docs/bugs-from-dogfood-run-c.md +243 -0
- package/docs/code-review-20260228.md +612 -0
- package/docs/code-review-v0.15.0.md +629 -0
- package/docs/hook-lifecycle-test-plan.md +149 -0
- package/docs/releases/v0.11.0-and-earlier.md +20 -0
- package/docs/releases/v0.12.0.md +15 -0
- package/docs/releases/v0.13.0.md +14 -0
- package/docs/releases/v0.14.0.md +20 -0
- package/docs/releases/v0.14.1.md +36 -0
- package/docs/releases/v0.14.2.md +51 -0
- package/docs/releases/v0.14.3.md +174 -0
- package/docs/releases/v0.14.4.md +94 -0
- package/docs/releases/v0.15.0.md +502 -0
- package/docs/releases/v0.15.1.md +170 -0
- package/docs/releases/v0.15.3.md +193 -0
- package/docs/specs/status-file-v0.10.1.md +812 -0
- package/docs/v0.10-global-config.md +206 -0
- package/docs/v0.10-plugin-system.md +415 -0
- package/docs/v0.10-prompt-optimizer.md +234 -0
- package/docs/v0.3-spec.md +244 -0
- package/docs/v0.4-spec.md +140 -0
- package/docs/v0.5-spec.md +237 -0
- package/docs/v0.6-spec.md +371 -0
- package/docs/v0.7-spec.md +177 -0
- package/docs/v0.8-llm-routing.md +206 -0
- package/docs/v0.8-structured-logging.md +132 -0
- package/docs/v0.9.3-prompt-audit.md +112 -0
- package/examples/plugins/console-reporter/index.test.ts +207 -0
- package/examples/plugins/console-reporter/index.ts +110 -0
- package/nax/config.json +147 -0
- package/nax/features/bugfix-v0171/prd.json +52 -0
- package/nax/features/config-management/prd.json +108 -0
- package/nax/features/config-management/progress.txt +5 -0
- package/nax/features/diagnose/acceptance.test.ts +412 -0
- package/nax/features/diagnose/prd.json +41 -0
- package/nax/features/orchestration-fixes/prd.json +89 -0
- package/nax/features/orchestration-fixes/progress.txt +1 -0
- package/nax/features/plugin-integration/US-007-VERIFICATION.md +259 -0
- package/nax/features/plugin-integration/prd.json +208 -0
- package/nax/features/plugin-integration/progress.txt +5 -0
- package/nax/features/precheck/prd.json +205 -0
- package/nax/features/precheck/progress.txt +15 -0
- package/nax/features/structured-logging/prd.json +199 -0
- package/nax/features/unlock/prd.json +36 -0
- package/package.json +47 -0
- package/src/acceptance/fix-generator.ts +348 -0
- package/src/acceptance/generator.ts +282 -0
- package/src/acceptance/index.ts +30 -0
- package/src/acceptance/types.ts +79 -0
- package/src/agents/claude-decompose.ts +169 -0
- package/src/agents/claude-plan.ts +139 -0
- package/src/agents/claude.ts +324 -0
- package/src/agents/cost.ts +268 -0
- package/src/agents/index.ts +13 -0
- package/src/agents/registry.ts +48 -0
- package/src/agents/types-extended.ts +133 -0
- package/src/agents/types.ts +113 -0
- package/src/agents/validation.ts +69 -0
- package/src/analyze/classifier.ts +305 -0
- package/src/analyze/index.ts +16 -0
- package/src/analyze/scanner.ts +175 -0
- package/src/analyze/types.ts +51 -0
- package/src/cli/accept.ts +108 -0
- package/src/cli/analyze-parser.ts +284 -0
- package/src/cli/analyze.ts +207 -0
- package/src/cli/config.ts +561 -0
- package/src/cli/constitution.ts +109 -0
- package/src/cli/diagnose-analysis.ts +159 -0
- package/src/cli/diagnose-formatter.ts +87 -0
- package/src/cli/diagnose.ts +203 -0
- package/src/cli/generate.ts +127 -0
- package/src/cli/index.ts +37 -0
- package/src/cli/init.ts +188 -0
- package/src/cli/interact.ts +295 -0
- package/src/cli/plan.ts +198 -0
- package/src/cli/plugins.ts +111 -0
- package/src/cli/prompts.ts +295 -0
- package/src/cli/runs.ts +174 -0
- package/src/cli/status-cost.ts +151 -0
- package/src/cli/status-features.ts +338 -0
- package/src/cli/status.ts +13 -0
- package/src/commands/common.ts +171 -0
- package/src/commands/diagnose.ts +17 -0
- package/src/commands/index.ts +8 -0
- package/src/commands/logs.ts +384 -0
- package/src/commands/precheck.ts +86 -0
- package/src/commands/unlock.ts +96 -0
- package/src/config/defaults.ts +160 -0
- package/src/config/index.ts +22 -0
- package/src/config/loader.ts +121 -0
- package/src/config/merger.ts +147 -0
- package/src/config/path-security.ts +121 -0
- package/src/config/paths.ts +27 -0
- package/src/config/schema.ts +56 -0
- package/src/config/schemas.ts +286 -0
- package/src/config/types.ts +423 -0
- package/src/config/validate.ts +103 -0
- package/src/constitution/generator.ts +191 -0
- package/src/constitution/generators/aider.ts +41 -0
- package/src/constitution/generators/claude.ts +35 -0
- package/src/constitution/generators/cursor.ts +36 -0
- package/src/constitution/generators/opencode.ts +38 -0
- package/src/constitution/generators/types.ts +33 -0
- package/src/constitution/generators/windsurf.ts +36 -0
- package/src/constitution/index.ts +10 -0
- package/src/constitution/loader.ts +133 -0
- package/src/constitution/types.ts +31 -0
- package/src/context/auto-detect.ts +227 -0
- package/src/context/builder.ts +246 -0
- package/src/context/elements.ts +83 -0
- package/src/context/formatter.ts +107 -0
- package/src/context/generator.ts +129 -0
- package/src/context/generators/aider.ts +34 -0
- package/src/context/generators/claude.ts +28 -0
- package/src/context/generators/cursor.ts +28 -0
- package/src/context/generators/opencode.ts +30 -0
- package/src/context/generators/windsurf.ts +28 -0
- package/src/context/greenfield.ts +114 -0
- package/src/context/index.ts +33 -0
- package/src/context/injector.ts +279 -0
- package/src/context/test-scanner.ts +370 -0
- package/src/context/types.ts +98 -0
- package/src/errors.ts +67 -0
- package/src/execution/batching.ts +157 -0
- package/src/execution/crash-recovery.ts +373 -0
- package/src/execution/escalation/escalation.ts +44 -0
- package/src/execution/escalation/index.ts +13 -0
- package/src/execution/escalation/tier-escalation.ts +295 -0
- package/src/execution/escalation/tier-outcome.ts +158 -0
- package/src/execution/helpers.ts +38 -0
- package/src/execution/index.ts +45 -0
- package/src/execution/lifecycle/acceptance-loop.ts +272 -0
- package/src/execution/lifecycle/headless-formatter.ts +85 -0
- package/src/execution/lifecycle/index.ts +12 -0
- package/src/execution/lifecycle/parallel-lifecycle.ts +101 -0
- package/src/execution/lifecycle/precheck-runner.ts +140 -0
- package/src/execution/lifecycle/run-cleanup.ts +81 -0
- package/src/execution/lifecycle/run-completion.ts +129 -0
- package/src/execution/lifecycle/run-initialization.ts +141 -0
- package/src/execution/lifecycle/run-lifecycle.ts +312 -0
- package/src/execution/lifecycle/run-setup.ts +204 -0
- package/src/execution/lifecycle/story-hooks.ts +38 -0
- package/src/execution/lifecycle/story-size-prompts.ts +123 -0
- package/src/execution/lock.ts +115 -0
- package/src/execution/parallel-executor.ts +216 -0
- package/src/execution/parallel.ts +400 -0
- package/src/execution/pid-registry.ts +280 -0
- package/src/execution/pipeline-result-handler.ts +388 -0
- package/src/execution/post-verify-rectification.ts +188 -0
- package/src/execution/post-verify.ts +274 -0
- package/src/execution/progress.ts +25 -0
- package/src/execution/prompts.ts +127 -0
- package/src/execution/queue-handler.ts +109 -0
- package/src/execution/rectification.ts +13 -0
- package/src/execution/runner.ts +377 -0
- package/src/execution/sequential-executor.ts +388 -0
- package/src/execution/status-file.ts +264 -0
- package/src/execution/status-writer.ts +139 -0
- package/src/execution/story-context.ts +229 -0
- package/src/execution/test-output-parser.ts +14 -0
- package/src/execution/verification.ts +72 -0
- package/src/hooks/index.ts +2 -0
- package/src/hooks/runner.ts +286 -0
- package/src/hooks/types.ts +67 -0
- package/src/interaction/chain.ts +154 -0
- package/src/interaction/index.ts +60 -0
- package/src/interaction/init.ts +83 -0
- package/src/interaction/plugins/auto.ts +217 -0
- package/src/interaction/plugins/cli.ts +300 -0
- package/src/interaction/plugins/telegram.ts +384 -0
- package/src/interaction/plugins/webhook.ts +258 -0
- package/src/interaction/state.ts +171 -0
- package/src/interaction/triggers.ts +229 -0
- package/src/interaction/types.ts +163 -0
- package/src/logger/formatters.ts +84 -0
- package/src/logger/index.ts +16 -0
- package/src/logger/logger.ts +298 -0
- package/src/logger/types.ts +48 -0
- package/src/logging/formatter.ts +355 -0
- package/src/logging/index.ts +22 -0
- package/src/logging/types.ts +93 -0
- package/src/metrics/aggregator.ts +190 -0
- package/src/metrics/index.ts +14 -0
- package/src/metrics/tracker.ts +200 -0
- package/src/metrics/types.ts +109 -0
- package/src/optimizer/index.ts +62 -0
- package/src/optimizer/noop.optimizer.ts +24 -0
- package/src/optimizer/rule-based.optimizer.ts +248 -0
- package/src/optimizer/types.ts +53 -0
- package/src/pipeline/events.ts +130 -0
- package/src/pipeline/index.ts +19 -0
- package/src/pipeline/runner.ts +161 -0
- package/src/pipeline/stages/acceptance.ts +197 -0
- package/src/pipeline/stages/completion.ts +99 -0
- package/src/pipeline/stages/constitution.ts +63 -0
- package/src/pipeline/stages/context.ts +117 -0
- package/src/pipeline/stages/execution.ts +194 -0
- package/src/pipeline/stages/index.ts +62 -0
- package/src/pipeline/stages/optimizer.ts +74 -0
- package/src/pipeline/stages/prompt.ts +57 -0
- package/src/pipeline/stages/queue-check.ts +103 -0
- package/src/pipeline/stages/review.ts +181 -0
- package/src/pipeline/stages/routing.ts +81 -0
- package/src/pipeline/stages/verify.ts +100 -0
- package/src/pipeline/types.ts +167 -0
- package/src/plugins/index.ts +31 -0
- package/src/plugins/loader.ts +287 -0
- package/src/plugins/registry.ts +168 -0
- package/src/plugins/types.ts +327 -0
- package/src/plugins/validator.ts +352 -0
- package/src/prd/index.ts +172 -0
- package/src/prd/types.ts +202 -0
- package/src/precheck/checks-blockers.ts +391 -0
- package/src/precheck/checks-warnings.ts +142 -0
- package/src/precheck/checks.ts +30 -0
- package/src/precheck/index.ts +247 -0
- package/src/precheck/story-size-gate.ts +144 -0
- package/src/precheck/types.ts +31 -0
- package/src/queue/index.ts +2 -0
- package/src/queue/manager.ts +254 -0
- package/src/queue/types.ts +54 -0
- package/src/review/index.ts +8 -0
- package/src/review/runner.ts +172 -0
- package/src/review/types.ts +66 -0
- package/src/routing/builder.ts +81 -0
- package/src/routing/chain.ts +74 -0
- package/src/routing/index.ts +16 -0
- package/src/routing/loader.ts +58 -0
- package/src/routing/router.ts +303 -0
- package/src/routing/strategies/adaptive.ts +215 -0
- package/src/routing/strategies/index.ts +8 -0
- package/src/routing/strategies/keyword.ts +163 -0
- package/src/routing/strategies/llm-prompts.ts +209 -0
- package/src/routing/strategies/llm.ts +235 -0
- package/src/routing/strategies/manual.ts +50 -0
- package/src/routing/strategy.ts +99 -0
- package/src/tdd/cleanup.ts +111 -0
- package/src/tdd/index.ts +23 -0
- package/src/tdd/isolation.ts +123 -0
- package/src/tdd/orchestrator.ts +383 -0
- package/src/tdd/prompts.ts +270 -0
- package/src/tdd/rectification-gate.ts +183 -0
- package/src/tdd/session-runner.ts +179 -0
- package/src/tdd/types.ts +81 -0
- package/src/tdd/verdict.ts +271 -0
- package/src/tui/App.tsx +265 -0
- package/src/tui/components/AgentPanel.tsx +75 -0
- package/src/tui/components/CostOverlay.tsx +118 -0
- package/src/tui/components/HelpOverlay.tsx +107 -0
- package/src/tui/components/StatusBar.tsx +63 -0
- package/src/tui/components/StoriesPanel.tsx +177 -0
- package/src/tui/hooks/useKeyboard.ts +142 -0
- package/src/tui/hooks/useLayout.ts +137 -0
- package/src/tui/hooks/usePipelineEvents.ts +183 -0
- package/src/tui/hooks/usePty.ts +194 -0
- package/src/tui/index.tsx +38 -0
- package/src/tui/types.ts +76 -0
- package/src/utils/git.ts +83 -0
- package/src/utils/queue-writer.ts +54 -0
- package/src/verification/executor.ts +235 -0
- package/src/verification/gate.ts +207 -0
- package/src/verification/index.ts +12 -0
- package/src/verification/parser.ts +230 -0
- package/src/verification/rectification.ts +108 -0
- package/src/verification/types.ts +113 -0
- package/src/worktree/dispatcher.ts +65 -0
- package/src/worktree/index.ts +2 -0
- package/src/worktree/manager.ts +187 -0
- package/src/worktree/merge.ts +301 -0
- package/src/worktree/types.ts +4 -0
- package/test/TEST_COVERAGE_US001.md +217 -0
- package/test/TEST_COVERAGE_US003.md +84 -0
- package/test/TEST_COVERAGE_US005.md +86 -0
- package/test/US-002-orchestrator.test.ts +246 -0
- package/test/acceptance/cm-003-default-view.test.ts +194 -0
- package/test/execution/pid-registry.test.ts +240 -0
- package/test/execution/post-verify.test.ts +224 -0
- package/test/helpers/timeout.ts +42 -0
- package/test/integration/US-002-TEST-SUMMARY.md +107 -0
- package/test/integration/US-003-TEST-SUMMARY.md +149 -0
- package/test/integration/US-004-TEST-SUMMARY.md +106 -0
- package/test/integration/US-005-TEST-SUMMARY.md +138 -0
- package/test/integration/US-007-TEST-SUMMARY.md +100 -0
- package/test/integration/agent-validation.test.ts +439 -0
- package/test/integration/analyze-integration.test.ts +261 -0
- package/test/integration/analyze-scanner.test.ts +131 -0
- package/test/integration/cli-config-default-edge-cases.test.ts +222 -0
- package/test/integration/cli-config-default-view.test.ts +229 -0
- package/test/integration/cli-config-diff.test.ts +460 -0
- package/test/integration/cli-config.test.ts +736 -0
- package/test/integration/cli-diagnose.test.ts +592 -0
- package/test/integration/cli-logs.test.ts +314 -0
- package/test/integration/cli-plugins.test.ts +678 -0
- package/test/integration/cli-precheck.test.ts +371 -0
- package/test/integration/cli-run-headless.test.ts +173 -0
- package/test/integration/cli.test.ts +75 -0
- package/test/integration/config/merger.test.ts +465 -0
- package/test/integration/config/paths.test.ts +51 -0
- package/test/integration/config-loader.test.ts +265 -0
- package/test/integration/config.test.ts +444 -0
- package/test/integration/context-integration.test.ts +702 -0
- package/test/integration/context-provider-injection.test.ts +506 -0
- package/test/integration/context-verification-integration.test.ts +295 -0
- package/test/integration/e2e.test.ts +896 -0
- package/test/integration/execution.test.ts +625 -0
- package/test/integration/helpers.test.ts +295 -0
- package/test/integration/hooks.test.ts +361 -0
- package/test/integration/interaction-chain-pipeline.test.ts +464 -0
- package/test/integration/isolation.test.ts +143 -0
- package/test/integration/logger.test.ts +461 -0
- package/test/integration/parallel.test.ts +250 -0
- package/test/integration/path-security.test.ts +173 -0
- package/test/integration/pipeline-acceptance.test.ts +302 -0
- package/test/integration/pipeline-events.test.ts +475 -0
- package/test/integration/pipeline.test.ts +658 -0
- package/test/integration/plan.test.ts +157 -0
- package/test/integration/plugin-routing.test.ts +921 -0
- package/test/integration/plugins/config-integration.test.ts +172 -0
- package/test/integration/plugins/config-resolution.test.ts +522 -0
- package/test/integration/plugins/loader.test.ts +641 -0
- package/test/integration/plugins/registry.test.ts +746 -0
- package/test/integration/plugins/validator.test.ts +563 -0
- package/test/integration/prd-pause.test.ts +205 -0
- package/test/integration/prd-resolvers.test.ts +185 -0
- package/test/integration/precheck-integration.test.ts +468 -0
- package/test/integration/precheck.test.ts +805 -0
- package/test/integration/progress.test.ts +34 -0
- package/test/integration/rectification-flow.test.ts +512 -0
- package/test/integration/reporter-lifecycle.test.ts +860 -0
- package/test/integration/review-config-commands.test.ts +319 -0
- package/test/integration/review-config-schema.test.ts +116 -0
- package/test/integration/review-plugin-integration.test.ts +722 -0
- package/test/integration/review.test.ts +149 -0
- package/test/integration/routing-stage-bug-021.test.ts +274 -0
- package/test/integration/routing-stage-greenfield.test.ts +286 -0
- package/test/integration/runner-config-plugins.test.ts +461 -0
- package/test/integration/runner-fixes.test.ts +399 -0
- package/test/integration/runner-plugin-integration.test.ts +543 -0
- package/test/integration/runner.test.ts +1679 -0
- package/test/integration/s5-greenfield-fallback.test.ts +297 -0
- package/test/integration/status-file-integration.test.ts +325 -0
- package/test/integration/status-file.test.ts +379 -0
- package/test/integration/status-writer.test.ts +345 -0
- package/test/integration/story-id-in-events.test.ts +273 -0
- package/test/integration/tdd-cleanup.test.ts +246 -0
- package/test/integration/tdd-orchestrator.test.ts +1762 -0
- package/test/integration/test-scanner.test.ts +403 -0
- package/test/integration/verification-asset-check.test.ts +142 -0
- package/test/integration/verify-stage.test.ts +275 -0
- package/test/integration/worktree/manager.test.ts +218 -0
- package/test/integration/worktree/merge.test.ts +341 -0
- package/test/manual/logging-formatter-demo.ts +158 -0
- package/test/ui/tui-agent-panel.test.tsx +99 -0
- package/test/ui/tui-controls.test.ts +334 -0
- package/test/ui/tui-cost-and-pty.test.ts +189 -0
- package/test/ui/tui-layout.test.ts +378 -0
- package/test/ui/tui-pty-integration.test.tsx +159 -0
- package/test/ui/tui-stories.test.ts +332 -0
- package/test/unit/acceptance.test.ts +186 -0
- package/test/unit/agent-stderr-capture.test.ts +146 -0
- package/test/unit/analyze-classifier.test.ts +215 -0
- package/test/unit/analyze.test.ts +224 -0
- package/test/unit/auto-detect.test.ts +249 -0
- package/test/unit/cli-status.test.ts +417 -0
- package/test/unit/commands/common.test.ts +320 -0
- package/test/unit/commands/logs.test.ts +416 -0
- package/test/unit/commands/unlock.test.ts +319 -0
- package/test/unit/constitution-generators.test.ts +160 -0
- package/test/unit/constitution.test.ts +209 -0
- package/test/unit/context.test.ts +1722 -0
- package/test/unit/cost.test.ts +231 -0
- package/test/unit/crash-recovery.test.ts +308 -0
- package/test/unit/escalation.test.ts +126 -0
- package/test/unit/execution-logging-stderr.test.ts +156 -0
- package/test/unit/execution-stage.test.ts +122 -0
- package/test/unit/fix-generator.test.ts +275 -0
- package/test/unit/formatters.test.ts +469 -0
- package/test/unit/greenfield.test.ts +179 -0
- package/test/unit/helpers.test.ts +317 -0
- package/test/unit/interaction/human-review-trigger.test.ts +164 -0
- package/test/unit/interaction-network-failures.test.ts +389 -0
- package/test/unit/interaction-plugins.test.ts +164 -0
- package/test/unit/isolation.test.ts +134 -0
- package/test/unit/logging/formatter.test.ts +455 -0
- package/test/unit/merge.test.ts +268 -0
- package/test/unit/metrics.test.ts +276 -0
- package/test/unit/optimizer/noop.optimizer.test.ts +125 -0
- package/test/unit/optimizer/rule-based.optimizer.test.ts +358 -0
- package/test/unit/prd-auto-default.test.ts +290 -0
- package/test/unit/prd-failure-category.test.ts +176 -0
- package/test/unit/prd-get-next-story.test.ts +186 -0
- package/test/unit/precheck-checks.test.ts +840 -0
- package/test/unit/precheck-story-size-gate.test.ts +287 -0
- package/test/unit/precheck-types.test.ts +142 -0
- package/test/unit/prompts.test.ts +475 -0
- package/test/unit/queue.test.ts +237 -0
- package/test/unit/rectification.test.ts +284 -0
- package/test/unit/registry.test.ts +287 -0
- package/test/unit/routing.test.ts +937 -0
- package/test/unit/run-lifecycle.test.ts +140 -0
- package/test/unit/storyid-events.test.ts +224 -0
- package/test/unit/tdd-verdict.test.ts +492 -0
- package/test/unit/test-output-parser.test.ts +377 -0
- package/test/unit/verdict.test.ts +324 -0
- package/test/unit/worktree-manager.test.ts +158 -0
- package/tsconfig.json +27 -0
@@ -0,0 +1,177 @@

# nax v0.7 Specification
**Date:** 2026-02-17
**Status:** Draft

## Theme: Test Context Injection

v0.7 addresses test redundancy caused by isolated story sessions. Each agent session currently writes tests without knowing what prior stories already covered, leading to duplicate coverage.

## Problem

During dogfooding (bun-kv-store, 8 stories), we observed:
- 6 tests for "name is required" (missing, empty, whitespace, not string, null, undefined)
- Each story independently writes "comprehensive tests" for its area
- Validation stories re-test what CRUD stories already covered
- **Root cause:** Each session's prompt has zero visibility into existing test files

## Solution: Test Context Injection

Inject a summary of existing test files into each story's prompt so the agent knows what's already covered and avoids duplication.

### How It Works

Before the **prompt stage**, scan the project's test directory and generate a concise summary:

```
## Existing Test Coverage

### test/store.test.ts (45 tests)
- CRUD operations: create, read, update, delete
- Error handling: missing key, invalid value
- Batch operations: getMany, setMany

### test/validation.test.ts (12 tests)
- Input validation: name required, type checking
- Size limits: max key length, max value size
```

This summary is injected into the prompt alongside the story context, constitution, and other context elements.

### Context Element

```typescript
interface TestCoverageContext {
  type: 'test-coverage';
  priority: 85; // Below constitution (95), above file context (50)
  content: string; // Formatted test summary
  tokens: number;
  source: string; // e.g., "test/*.test.ts"
}
```

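
A minimal sketch of how a priority value like this might be consumed, assuming the context builder sorts elements by descending priority and skips whatever no longer fits the token budget. `ContextElement`, `selectWithinBudget`, and the sample token counts are hypothetical; only the priority values 95, 85, and 50 come from the spec above.

```typescript
// Hypothetical generic element; the spec only defines the test-coverage variant.
interface ContextElement {
  type: string;
  priority: number;
  content: string;
  tokens: number;
}

// Keep the highest-priority elements that fit within the token budget.
function selectWithinBudget(elements: ContextElement[], budget: number): ContextElement[] {
  const selected: ContextElement[] = [];
  let used = 0;
  // Copy before sorting so the caller's array is not mutated.
  for (const el of [...elements].sort((a, b) => b.priority - a.priority)) {
    if (used + el.tokens <= budget) {
      selected.push(el);
      used += el.tokens;
    }
  }
  return selected;
}

// Sample data (token counts are made up for illustration).
const elements: ContextElement[] = [
  { type: "file", priority: 50, content: "src/store.ts ...", tokens: 400 },
  { type: "test-coverage", priority: 85, content: "### test/store.test.ts (45 tests)", tokens: 300 },
  { type: "constitution", priority: 95, content: "Project rules ...", tokens: 200 },
];

const kept = selectWithinBudget(elements, 600).map((el) => el.type);
console.log(kept);
```

With a 600-token budget the file context is the element that gets dropped here, which is the behavior a priority of 85 is meant to buy for the test summary.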
### Summary Generation

```typescript
interface TestSummaryOptions {
  /** Test directory to scan (default: auto-detect from config or common patterns) */
  testDir?: string;
  // Glob pattern for test files (default: "**/*.test.{ts,js,tsx,jsx}").
  // Line comment on purpose: the glob contains "*/", which would close a block comment.
  testPattern?: string;
  /** Max tokens for the summary (default: 500) */
  maxTokens?: number;
  /** Summary detail level */
  detail: 'names-only' | 'names-and-counts' | 'describe-blocks';
}
```

**Detail levels:**
- `names-only` — Just file names and test count: `test/store.test.ts (45 tests)`
- `names-and-counts` — File names + top-level describe blocks with counts
- `describe-blocks` — File names + describe blocks + test names (most expensive but most useful)

**Default:** `names-and-counts` (good balance of info vs tokens)

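
The three detail levels can be illustrated with a small renderer. This is a sketch under stated assumptions, not the shipped formatter: `ScannedFile` and `renderFile` are hypothetical names, and the heading-style file line follows the sample summary shown earlier in this spec.

```typescript
type DetailLevel = "names-only" | "names-and-counts" | "describe-blocks";

// Hypothetical scan result for a single test file.
interface ScannedFile {
  path: string;
  describes: { name: string; tests: string[] }[];
}

// Render one file at the requested detail level (default mirrors the spec default).
function renderFile(file: ScannedFile, detail: DetailLevel = "names-and-counts"): string {
  const total = file.describes.reduce((sum, d) => sum + d.tests.length, 0);
  const lines = [`### ${file.path} (${total} tests)`];
  if (detail === "names-only") return lines.join("\n");
  for (const d of file.describes) {
    if (detail === "names-and-counts") {
      lines.push(`- ${d.name} (${d.tests.length} tests)`);
    } else {
      // describe-blocks: list every test name under its describe block.
      lines.push(`- ${d.name}`);
      for (const t of d.tests) lines.push(`  - ${t}`);
    }
  }
  return lines.join("\n");
}

const example: ScannedFile = {
  path: "test/store.test.ts",
  describes: [
    { name: "CRUD operations", tests: ["create", "read"] },
    { name: "Error handling", tests: ["missing key"] },
  ],
};

console.log(renderFile(example, "describe-blocks"));
```

Each level is a strict superset of the previous one, so the token cost grows from one line per file to one line per test.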
### Scanning Approach

1. Find test files matching pattern
2. For each file, extract `describe()` and `test()`/`it()` block names via regex (no AST parsing needed)
3. Format as markdown summary
4. Truncate to token budget

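
Steps 2 to 4 above can be sketched as follows. The code works on source text that has already been read, so file discovery (step 1) is left out. All names (`extractBlocks`, `formatSummary`, `truncateToBudget`) are hypothetical, the regex is one possible pattern rather than the one in `src/context/test-scanner.ts`, and the 4-characters-per-token estimate is an assumption.

```typescript
// Hypothetical shape for one describe block found in a test file.
interface ScannedDescribe {
  name: string;
  tests: string[];
}

// Matches describe("..."), test('...'), it(`...`) and their .only/.skip/.todo variants.
const BLOCK_RE = /\b(describe|test|it)(?:\.(?:only|skip|todo))?\s*\(\s*(["'`])((?:\\.|(?!\2).)*)\2/g;

// Regex-only extraction: nesting is flattened, so a test is attributed to the
// most recent describe() seen before it in the file.
function extractBlocks(source: string): ScannedDescribe[] {
  const blocks: ScannedDescribe[] = [];
  let current: ScannedDescribe | null = null;
  for (const match of source.matchAll(BLOCK_RE)) {
    const [, kind, , name] = match;
    if (kind === "describe") {
      current = { name, tests: [] };
      blocks.push(current);
    } else {
      if (!current) {
        current = { name: "(top level)", tests: [] };
        blocks.push(current);
      }
      current.tests.push(name);
    }
  }
  return blocks;
}

function formatSummary(file: string, blocks: ScannedDescribe[]): string {
  const total = blocks.reduce((n, b) => n + b.tests.length, 0);
  const lines = [`### ${file} (${total} tests)`];
  for (const b of blocks) lines.push(`- ${b.name} (${b.tests.length} tests)`);
  return lines.join("\n");
}

// Drops whole lines once the estimated budget is reached (assumes ~4 chars per token).
// The trailing "..." marker is appended on top of the kept lines.
function truncateToBudget(summary: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  if (summary.length <= maxChars) return summary;
  const keptLines: string[] = [];
  let used = 0;
  for (const line of summary.split("\n")) {
    if (used + line.length + 1 > maxChars) break;
    keptLines.push(line);
    used += line.length + 1;
  }
  return [...keptLines, "..."].join("\n");
}

const sample = `
describe("CRUD operations", () => {
  test("creates a key", () => {});
  it('reads a key', () => {});
});
describe("Validation", () => {
  test.skip("rejects empty name", () => {});
});
`;

const blocks = extractBlocks(sample);
console.log(truncateToBudget(formatSummary("test/store.test.ts", blocks), 500));
```

A regex pass like this cannot see block boundaries, which is the trade-off accepted by skipping AST parsing: counts per describe block are approximate when describes are nested.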
### Prompt Injection

Add to the story prompt:

```
## Existing Test Coverage

The following tests already exist. DO NOT duplicate this coverage.
Focus only on testing NEW behavior introduced by this story.

### test/store.test.ts (45 tests)
- CRUD operations (12 tests): create, read, update, delete, upsert
- Validation (8 tests): required fields, type checks, size limits
- Error handling (6 tests): not found, duplicate key, connection error
...

## Your Story

US-007: Add input sanitization
...
```

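
One way the prompt layout above could be assembled is sketched below. `StoryInput` and `buildStoryPrompt` are hypothetical names, and omitting the coverage section when no tests exist yet is an assumption, not something the spec states.

```typescript
// Hypothetical minimal story shape for illustration.
interface StoryInput {
  id: string;
  title: string;
  description: string;
}

// Guidance text taken from the prompt template above.
const DEDUP_GUIDANCE = [
  "The following tests already exist. DO NOT duplicate this coverage.",
  "Focus only on testing NEW behavior introduced by this story.",
].join("\n");

function buildStoryPrompt(story: StoryInput, coverageSummary: string): string {
  const sections: string[] = [];
  const summary = coverageSummary.trim();
  // Assumption: skip the section entirely when there is nothing to report.
  if (summary.length > 0) {
    sections.push(`## Existing Test Coverage\n\n${DEDUP_GUIDANCE}\n\n${summary}`);
  }
  sections.push(`## Your Story\n\n${story.id}: ${story.title}\n${story.description}`);
  return sections.join("\n\n");
}

const storyPrompt = buildStoryPrompt(
  { id: "US-007", title: "Add input sanitization", description: "..." },
  "### test/store.test.ts (45 tests)\n- CRUD operations (12 tests)",
);
console.log(storyPrompt);
```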
### Config

```json
{
  "context": {
    "testCoverage": {
      "enabled": true,
      "detail": "names-and-counts",
      "maxTokens": 500,
      "testDir": "test",
      "testPattern": "**/*.test.{ts,js}"
    }
  }
}
```

## Acceptance Criteria

- [ ] Test files scanned and summarized before each story prompt
- [ ] Summary injected into prompt at priority 85
- [ ] `describe()` and `test()` names extracted via regex
- [ ] Summary respects maxTokens budget
- [ ] Config allows enabling/disabling and adjusting detail level
- [ ] Summary updates between stories (reflects tests added by prior stories)
- [ ] No performance regression (scanning should be <100ms for typical projects)

## Implementation Plan

### Phase 1: Test Scanner
**Files:** `src/context/test-scanner.ts`
- Scan test directory for test files
- Extract describe/test block names via regex
- Format as markdown summary
- Respect token budget

**Commit:** `feat(context): add test file scanner for coverage summary`

### Phase 2: Context Integration
**Files:** `src/context/builder.ts`, `src/pipeline/stages/context.ts`, `src/config/schema.ts`
- Add TestCoverageContext element type
- Wire scanner into context stage (runs before prompt assembly)
- Add config schema for testCoverage settings
- Summary refreshes between stories

**Commit:** `feat(context): inject test coverage summary into story prompts`

### Phase 3: Prompt Guidance
**Files:** `src/execution/prompts.ts`
- Add "DO NOT duplicate" instruction in prompt template
- Reference existing coverage summary
- Reinforce constitution test guidance

**Commit:** `feat(prompts): add test dedup guidance referencing coverage summary`

## Test Strategy
- Mode: test-after
- Unit tests: scanner regex extraction, summary formatting, token truncation
- Integration: context builder includes test coverage element
- Run `bun test && bun run typecheck` after each phase

## Estimated Effort
~300-400 LOC across 3 phases.

## Measurement

Compare v0.5.0 (no dedup) vs v0.7.0 (context injection) on the same dogfood project:

| Metric | v0.5.0 | v0.7.0 |
|:---|:---|:---|
| Total tests generated | ? | ? |
| Redundant tests | ? | ? |
| Code quality grade | ? | ? |
| Acceptance rate | ? | ? |
| Total cost | ? | ? |
| Total time | ? | ? |

@@ -0,0 +1,206 @@

# v0.8 — LLM-Enhanced Routing

> Priority: **HIGH** — keyword routing causes costly misroutes (e.g., US-008 simple barrel exports → powerful + TDD due to "public api" keyword match).

## Problem

Keyword-based routing is brittle and expensive:
1. **False positives**: "public api" in title → three-session-tdd even for simple barrel exports ($1.25 wasted)
2. **False negatives**: Complex integration work without magic keywords → test-after on fast tier
3. **No semantic understanding**: Can't assess *actual* implementation effort from acceptance criteria
4. **Keyword overlap**: Security keywords fire on simple "add auth header to request" stories

Evidence from dogfood runs:
- Run D2: US-008 ("Export public API and create barrel exports") — simple task, keyword matched "public api" → powerful + three-session-tdd. Cost: $1.25. Should have been: fast + test-after (~$0.10).

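
The false-positive mechanism can be reproduced with a toy substring matcher. This is an illustration only: the keyword table and the `keywordRoute` function are hypothetical and do not mirror the real keyword strategy, beyond the two cases named above ("public api" and auth-related stories).

```typescript
// Hypothetical keyword table; the real list may differ.
const TDD_KEYWORDS = ["public api", "security", "auth"];

// Naive routing: any keyword hit in the title escalates to the most expensive path.
function keywordRoute(title: string) {
  const lower = title.toLowerCase();
  const hit = TDD_KEYWORDS.find((k) => lower.includes(k));
  return hit
    ? { modelTier: "powerful", testStrategy: "three-session-tdd", matched: hit }
    : { modelTier: "fast", testStrategy: "test-after", matched: null };
}

// Both stories are simple, yet both escalate because of a substring match.
console.log(keywordRoute("Export public API and create barrel exports"));
console.log(keywordRoute("Add auth header to request"));
```

Substring matching has no notion of effort, so the only signal it can act on is vocabulary, which is the gap the LLM strategy is meant to close.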
## Design

### Config

```json
{
  "routing": {
    "strategy": "llm",
    "llm": {
      "model": "fast",
      "fallbackToKeywords": true,
      "maxInputTokens": 2000,
      "cacheDecisions": true
    }
  }
}
```

- `model`: Tier used for the routing LLM call itself (default: `fast` — routing should be cheap)
- `fallbackToKeywords`: If LLM call fails (timeout, parse error), fall back to keyword strategy (default: `true`)
- `maxInputTokens`: Token budget for story context sent to LLM (default: `2000`)
- `cacheDecisions`: Cache routing decisions per story ID within a run (default: `true`)

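
How `fallbackToKeywords` and `cacheDecisions` might interact is sketched below. `createRouter`, `Router`, and `LlmRoutingConfig` are hypothetical names, and caching a fallback decision (so the LLM is not retried for the same story within the run) is an assumption rather than documented behavior.

```typescript
interface RoutingDecision {
  complexity: "simple" | "medium" | "complex" | "expert";
  modelTier: "fast" | "balanced" | "powerful";
  testStrategy: "test-after" | "three-session-tdd";
  reasoning: string;
}

// Only the two flags this sketch needs from the "llm" config block.
interface LlmRoutingConfig {
  fallbackToKeywords: boolean;
  cacheDecisions: boolean;
}

type Router = (storyId: string) => Promise<RoutingDecision>;

// Wraps an LLM router with an in-memory, per-run cache and a keyword fallback.
function createRouter(config: LlmRoutingConfig, llmRoute: Router, keywordRoute: Router): Router {
  const cache = new Map<string, RoutingDecision>();
  return async (storyId) => {
    if (config.cacheDecisions) {
      const cached = cache.get(storyId);
      if (cached) return cached;
    }
    let decision: RoutingDecision;
    try {
      decision = await llmRoute(storyId);
    } catch (err) {
      // Timeout or parse error: rethrow unless the fallback is enabled.
      if (!config.fallbackToKeywords) throw err;
      decision = await keywordRoute(storyId);
    }
    // Assumption: fallback decisions are cached too.
    if (config.cacheDecisions) cache.set(storyId, decision);
    return decision;
  };
}
```

Keeping the cache inside the closure scopes it to one router instance, which matches the "within a run" wording if a router is created per run.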
|
|
39
|
+
### LLM Prompt
|
|
40
|
+
|
|
41
|
+
The LLM receives a structured prompt with the story and must return a JSON decision:
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
You are a code task router. Given a user story, classify its complexity and select the appropriate execution strategy.
|
|
45
|
+
|
|
46
|
+
## Story
|
|
47
|
+
Title: {title}
|
|
48
|
+
Description: {description}
|
|
49
|
+
Acceptance Criteria:
|
|
50
|
+
{acceptanceCriteria as numbered list}
|
|
51
|
+
Tags: {tags}
|
|
52
|
+
|
|
53
|
+
## Available Tiers
|
|
54
|
+
- fast: Simple changes, typos, config updates, boilerplate. <30 min of coding.
|
|
55
|
+
- balanced: Standard features, moderate logic, straightforward tests. 30-90 min.
|
|
56
|
+
- powerful: Complex architecture, security-critical, multi-file refactors, novel algorithms. >90 min.
|
|
57
|
+
|
|
58
|
+
## Available Test Strategies
|
|
59
|
+
- test-after: Write implementation first, add tests after. For straightforward work.
|
|
60
|
+
- three-session-tdd: Separate test-writer → implementer → verifier sessions. For complex/critical work where test design matters.
|
|
61
|
+
|
|
62
|
+
## Rules
|
|
63
|
+
- Default to the CHEAPEST option that will succeed.
|
|
64
|
+
- three-session-tdd ONLY when: (a) security/auth logic, (b) complex algorithms, (c) public API contracts that consumers depend on.
|
|
65
|
+
- Simple barrel exports, re-exports, or index files are ALWAYS test-after + fast, regardless of keywords.
|
|
66
|
+
- A story touching many files doesn't automatically mean complex — copy-paste refactors are simple.
|
|
67
|
+
|
|
68
|
+
Respond with ONLY this JSON (no markdown, no explanation):
|
|
69
|
+
{"complexity":"simple|medium|complex|expert","modelTier":"fast|balanced|powerful","testStrategy":"test-after|three-session-tdd","reasoning":"<one line>"}
|
|
70
|
+
```
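
The `## Story` section of the prompt above could be assembled along these lines. This is a sketch under assumptions: the `UserStory` shape shown here is inferred from the placeholders in the template, and `buildStorySection` is a hypothetical helper, not the `buildRoutingPrompt` function itself.

```typescript
// Sketch only: the UserStory shape is assumed from the template placeholders.
interface UserStory {
  id: string;
  title: string;
  description: string;
  acceptanceCriteria: string[];
  tags: string[];
}

// Render the story block, with acceptance criteria as a numbered list.
function buildStorySection(story: UserStory): string {
  const criteria = story.acceptanceCriteria
    .map((criterion, index) => `${index + 1}. ${criterion}`)
    .join("\n");
  return [
    "## Story",
    `Title: ${story.title}`,
    `Description: ${story.description}`,
    "Acceptance Criteria:",
    criteria,
    `Tags: ${story.tags.join(", ")}`,
  ].join("\n");
}
```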
|
|
71
|
+
|
|
72
|
+
### Implementation
|
|
73
|
+
|
|
74
|
+
Modify `src/routing/strategies/llm.ts`:
|
|
75
|
+
|
|
76
|
+
```typescript
|
|
77
|
+
export const llmStrategy: RoutingStrategy = {
|
|
78
|
+
name: "llm",
|
|
79
|
+
|
|
80
|
+
async route(story: UserStory, context: RoutingContext): Promise<RoutingDecision | null> {
|
|
81
|
+
const config = context.config;
|
|
82
|
+
const llmConfig = config.routing.llm;
|
|
83
|
+
if (!llmConfig) return null;
|
|
84
|
+
|
|
85
|
+
// Check cache
|
|
86
|
+
if (llmConfig.cacheDecisions && cachedDecisions.has(story.id)) {
|
|
87
|
+
return cachedDecisions.get(story.id)!;
|
|
88
|
+
}
|
|
89
|
+
|
|
90
|
+
try {
|
|
91
|
+
const prompt = buildRoutingPrompt(story, config);
|
|
92
|
+
const modelId = config.models[llmConfig.model ?? "fast"];
|
|
93
|
+
|
|
94
|
+
const result = await callLlm(modelId, prompt, {
|
|
95
|
+
maxTokens: 200,
|
|
96
|
+
timeout: 15_000, // 15s hard limit — routing shouldn't block
|
|
97
|
+
});
|
|
98
|
+
|
|
99
|
+
const decision = parseRoutingResponse(result, story, config);
|
|
100
|
+
|
|
101
|
+
if (llmConfig.cacheDecisions) {
|
|
102
|
+
cachedDecisions.set(story.id, decision);
|
|
103
|
+
}
|
|
104
|
+
|
|
105
|
+
return decision;
|
|
106
|
+
} catch (err) {
|
|
107
|
+
logger.warn(`LLM routing failed for ${story.id}: ${err instanceof Error ? err.message : String(err)}`);
|
|
108
|
+
return null; // Falls through to keyword strategy
|
|
109
|
+
}
|
|
110
|
+
},
|
|
111
|
+
};
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
### LLM Call Mechanism
|
|
115
|
+
|
|
116
|
+
nax already spawns Claude Code via `Bun.spawn`. For the routing LLM call, we need a **lightweight** approach:
|
|
117
|
+
|
|
118
|
+
**Option A — Claude Code one-shot**: `claude -p "..." --model <model>` with 15s timeout.
|
|
119
|
+
- Pro: Reuses existing infra, model aliases work.
|
|
120
|
+
- Con: ~3-5s startup overhead per call. For 9 stories = 27-45s total.
|
|
121
|
+
|
|
122
|
+
**Option B — Direct API call**: HTTP request to provider API (Anthropic/OpenAI/Google).
|
|
123
|
+
- Pro: <1s per call, batch-friendly.
|
|
124
|
+
- Con: Needs API key handling, provider-specific code.
|
|
125
|
+
|
|
126
|
+
**Recommendation: Option A** for v0.8 (simplicity), with config option to batch all stories in one call:
|
|
127
|
+
|
|
128
|
+
```
|
|
129
|
+
// Single LLM call for all pending stories (batch mode)
|
|
130
|
+
"Route these 9 stories:\n1. US-001: ...\n2. US-002: ...\n\nRespond with JSON array: [{id, complexity, modelTier, testStrategy, reasoning}]"
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
Batch mode cuts 9 calls → 1 call. ~5s total routing overhead.
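
A batch round trip might look like the sketch below. The prompt wording follows the example above; `StorySummary`, `BatchEntry`, `buildBatchPrompt`, and `indexBatchResponse` are hypothetical names, and treating stories that are missing from the reply as simply unrouted (so they can fall through to keywords) is an assumption.

```typescript
// Sketch only: all names here are illustrative, not nax's actual API.
interface StorySummary {
  id: string;
  title: string;
}

interface BatchEntry {
  id: string;
  complexity: string;
  modelTier: string;
  testStrategy: string;
  reasoning: string;
}

// Build one prompt that lists every pending story.
function buildBatchPrompt(stories: StorySummary[]): string {
  const lines = stories.map((s, i) => `${i + 1}. ${s.id}: ${s.title}`);
  return (
    `Route these ${stories.length} stories:\n${lines.join("\n")}\n\n` +
    "Respond with JSON array: [{id, complexity, modelTier, testStrategy, reasoning}]"
  );
}

// Index the reply by story ID; unknown IDs and malformed items are skipped.
function indexBatchResponse(
  raw: string,
  stories: StorySummary[],
): Map<string, BatchEntry> {
  const byId = new Map<string, BatchEntry>();
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return byId; // unparseable reply: nothing routed
  }
  if (!Array.isArray(parsed)) return byId;
  const wanted = new Set(stories.map((s) => s.id));
  for (const item of parsed as unknown[]) {
    if (typeof item !== "object" || item === null) continue;
    const entry = item as BatchEntry;
    if (typeof entry.id === "string" && wanted.has(entry.id)) {
      byId.set(entry.id, entry);
    }
  }
  return byId;
}
```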
|
|
134
|
+
|
|
135
|
+
### Strategy Interface Change
|
|
136
|
+
|
|
137
|
+
The current `RoutingStrategy.route()` is synchronous. LLM routing needs async:
|
|
138
|
+
|
|
139
|
+
```typescript
|
|
140
|
+
export interface RoutingStrategy {
|
|
141
|
+
readonly name: string;
|
|
142
|
+
route(story: UserStory, context: RoutingContext): RoutingDecision | null | Promise<RoutingDecision | null>;
|
|
143
|
+
}
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
`StrategyChain.route()` becomes async (already called with `await` in `routeStory()`).
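
One way the chain could accept both sync and async strategies is sketched below. The types are simplified (the `context` argument is dropped) and `routeThroughChain` is a hypothetical stand-in for `StrategyChain.route()`, so treat this as an illustration rather than the planned implementation.

```typescript
// Sketch only: simplified types; the real signatures also take a RoutingContext.
interface UserStory {
  id: string;
  title: string;
}

interface RoutingDecision {
  modelTier: string;
  testStrategy: string;
  reasoning: string;
}

type MaybePromise<T> = T | Promise<T>;

interface RoutingStrategy {
  readonly name: string;
  route(story: UserStory): MaybePromise<RoutingDecision | null>;
}

// `await` on a plain value resolves immediately, so existing sync strategies
// (such as keyword routing) keep working unchanged.
async function routeThroughChain(
  strategies: RoutingStrategy[],
  story: UserStory,
): Promise<RoutingDecision | null> {
  for (const strategy of strategies) {
    const decision = await strategy.route(story);
    if (decision !== null) return decision; // first non-null decision wins
  }
  return null;
}
```

With this shape, an LLM strategy that returns `null` on failure falls through to the next strategy in the list without any special casing.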
|
|
147
|
+
|
|
148
|
+
### Error Handling
|
|
149
|
+
|
|
150
|
+
| Failure | Behavior |
|
|
151
|
+
|:---|:---|
|
|
152
|
+
| LLM timeout (>15s) | Return null → keyword fallback |
|
|
153
|
+
| JSON parse error | Return null → keyword fallback |
|
|
154
|
+
| Invalid field values | Return null → keyword fallback |
|
|
155
|
+
| LLM returns unknown complexity | Clamp to nearest valid value |
|
|
156
|
+
| Any of the above | Logged via `logger.warn()` with the story ID |
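
The validation rules in this table could be sketched as follows. This is an illustration under assumptions: the function takes only the raw string (the real `parseRoutingResponse` also receives the story and config), and clamping an unknown complexity to `"medium"` is an assumed reading of "nearest valid value".

```typescript
// Sketch only: types and the "medium" clamp are assumptions, not nax's actual code.
type Complexity = "simple" | "medium" | "complex" | "expert";
type ModelTier = "fast" | "balanced" | "powerful";
type TestStrategy = "test-after" | "three-session-tdd";

interface RoutingDecision {
  complexity: Complexity;
  modelTier: ModelTier;
  testStrategy: TestStrategy;
  reasoning: string;
}

const COMPLEXITIES: readonly string[] = ["simple", "medium", "complex", "expert"];
const TIERS: readonly string[] = ["fast", "balanced", "powerful"];
const STRATEGIES: readonly string[] = ["test-after", "three-session-tdd"];

// Returns null for every case the table maps to "keyword fallback".
function parseRoutingResponse(raw: string): RoutingDecision | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.trim());
  } catch {
    return null; // JSON parse error
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const { modelTier, testStrategy, complexity, reasoning } =
    parsed as Record<string, unknown>;

  // Invalid tier or test strategy: reject the whole decision.
  if (typeof modelTier !== "string" || !TIERS.includes(modelTier)) return null;
  if (typeof testStrategy !== "string" || !STRATEGIES.includes(testStrategy)) return null;

  // Unknown complexity is clamped rather than rejected (assumed default: "medium").
  const safeComplexity =
    typeof complexity === "string" && COMPLEXITIES.includes(complexity)
      ? complexity
      : "medium";

  return {
    complexity: safeComplexity as Complexity,
    modelTier: modelTier as ModelTier,
    testStrategy: testStrategy as TestStrategy,
    reasoning: typeof reasoning === "string" ? reasoning : "",
  };
}
```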
|
|
157
|
+
|
|
158
|
+
### Logging
|
|
159
|
+
|
|
160
|
+
```
|
|
161
|
+
[routing] LLM classified US-008 as simple/fast/test-after: "Barrel export file, no logic to test"
|
|
162
|
+
[routing] LLM routing failed for US-003: timeout after 15000ms, falling back to keyword
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
## Cost Analysis
|
|
166
|
+
|
|
167
|
+
| Scenario | Keyword Cost | LLM Routing Cost | Savings |
|
|
168
|
+
|:---|:---|:---|:---|
|
|
169
|
+
| US-008 (barrel exports) | $1.25 (powerful+TDD) | $0.10 (fast+test-after) + $0.01 routing | **$1.14 saved** |
|
|
170
|
+
| 9-story run (batch) | Variable | ~$0.02 routing overhead | Net positive if prevents 1+ misroute |
|
|
171
|
+
|
|
172
|
+
LLM routing call: ~500 input tokens + 100 output tokens per story = ~$0.001 on Flash.
|
|
173
|
+
Batch mode (9 stories): ~2000 input + 400 output = ~$0.003 total.
|
|
174
|
+
|
|
175
|
+
**ROI: One prevented misroute (~$1.14) pays for roughly 400 batch routing calls at ~$0.003 each.**
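
The arithmetic behind these figures can be checked with a few lines. The inputs are the numbers quoted in this section; the variable names are illustrative, and using the batch-call cost as the break-even unit is an assumption about which routing cost the ROI line refers to.

```typescript
// Figures are the ones quoted in the table and text of this section.
const keywordCost = 1.25; // powerful + three-session-tdd (US-008)
const llmRoutedCost = 0.1; // fast + test-after
const routingCost = 0.01; // per-story routing estimate from the table
const batchCallCost = 0.003; // 9-story batch call on Flash

// Savings on the US-008 misroute, net of the routing call itself.
const saved = keywordCost - (llmRoutedCost + routingCost);

// Number of batch routing calls one prevented misroute pays for.
const breakEvenBatchCalls = Math.round(saved / batchCallCost);

console.log(saved.toFixed(2)); // "1.14"
console.log(breakEvenBatchCalls); // 380
```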
|
|
176
|
+
|
|
177
|
+
## Acceptance Criteria
|
|
178
|
+
|
|
179
|
+
1. `config.routing.strategy = "llm"` activates LLM routing
|
|
180
|
+
2. LLM strategy returns structured `RoutingDecision` with reasoning
|
|
181
|
+
3. Falls back to keyword strategy on any LLM failure
|
|
182
|
+
4. Batch mode: single LLM call routes all pending stories
|
|
183
|
+
5. Routing decisions cached per story ID within a run
|
|
184
|
+
6. Strategy chain async support (non-breaking — keyword still sync)
|
|
185
|
+
7. Routing overhead < 10s for batch of 10 stories
|
|
186
|
+
8. Logging: every LLM routing decision logged with reasoning
|
|
187
|
+
|
|
188
|
+
## Files to Modify
|
|
189
|
+
|
|
190
|
+
- `src/routing/strategies/llm.ts` — Main implementation
|
|
191
|
+
- `src/routing/strategy.ts` — Make interface async-compatible
|
|
192
|
+
- `src/routing/chain.ts` — `route()` becomes async
|
|
193
|
+
- `src/config/schema.ts` — Add `LlmRoutingConfig` type
|
|
194
|
+
- `src/config/defaults.ts` — Add LLM routing defaults
|
|
195
|
+
- `test/routing/llm-strategy.test.ts` — Unit tests
|
|
196
|
+
- `test/routing/chain.test.ts` — Update for async
|
|
197
|
+
|
|
198
|
+
## Non-Goals (v0.8)
|
|
199
|
+
|
|
200
|
+
- Direct API calls (Option B) — defer to v0.9+
|
|
201
|
+
- Adaptive routing (learning from historical data) — existing stub, separate feature
|
|
202
|
+
- Custom routing prompts — hardcoded prompt is fine for now
|
|
203
|
+
|
|
204
|
+
---
|
|
205
|
+
|
|
206
|
+
*Created 2026-02-19*
|
|
@@ -0,0 +1,132 @@
|
|
|
1
|
+
# Feature: Structured Logging for nax v0.8
|
|
2
|
+
|
|
3
|
+
## Problem
|
|
4
|
+
|
|
5
|
+
nax currently uses raw `console.log` with chalk formatting throughout the codebase. Developers running `nax run` in headless mode have no way to:
|
|
6
|
+
- Control verbosity (only see errors vs full debug output)
|
|
7
|
+
- Get timing data per story/stage for performance analysis
|
|
8
|
+
- Review token counts and API costs per story
|
|
9
|
+
- Debug failures with full prompt/response dumps
|
|
10
|
+
- Parse logs programmatically for CI/CD integration
|
|
11
|
+
|
|
12
|
+
## Requirements
|
|
13
|
+
|
|
14
|
+
**REQ-1: Log Levels**
|
|
15
|
+
- Support 4 levels: `error`, `warn`, `info`, `debug`
|
|
16
|
+
- Default level: `info`
|
|
17
|
+
- CLI flags: `--verbose` (debug), `--quiet` (error+warn only), `--silent` (error only)
|
|
18
|
+
- Environment variable override: `NAX_LOG_LEVEL=debug`
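
The flag-to-level mapping could be resolved along these lines. This is a sketch, not the planned CLI code: `LevelFlags` and `resolveLogLevel` are hypothetical names, and letting a valid `NAX_LOG_LEVEL` win over CLI flags is an assumed reading of "override".

```typescript
// Sketch only: names and precedence are assumptions.
type LogLevel = "error" | "warn" | "info" | "debug";

interface LevelFlags {
  verbose?: boolean;
  quiet?: boolean;
  silent?: boolean;
}

const LEVELS: readonly string[] = ["error", "warn", "info", "debug"];

function resolveLogLevel(flags: LevelFlags, envLevel?: string): LogLevel {
  // Assumption: a valid environment value takes precedence over flags.
  if (envLevel !== undefined && LEVELS.includes(envLevel)) {
    return envLevel as LogLevel;
  }
  if (flags.silent) return "error"; // error only
  if (flags.quiet) return "warn"; // error + warn
  if (flags.verbose) return "debug";
  return "info"; // documented default
}
```

In a CLI entry point this would typically be called once with the parsed flags and the value of `NAX_LOG_LEVEL` from the environment.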
|
|
19
|
+
|
|
20
|
+
**REQ-2: Structured Log Format**
|
|
21
|
+
- Each log entry includes: `timestamp`, `level`, `stage`, `storyId`, `message`
|
|
22
|
+
- Console output: human-readable with chalk (current style, but level-gated)
|
|
23
|
+
- File output: JSON Lines (`.jsonl`) for machine parsing
|
|
24
|
+
- File location: `nax/features/<name>/runs/<run-id>.jsonl`
|
|
25
|
+
|
|
26
|
+
**REQ-3: Stage Lifecycle Events**
|
|
27
|
+
- Emit structured events at each stage transition:
|
|
28
|
+
- `run.start` — feature name, story count, config
|
|
29
|
+
- `iteration.start` — iteration number, story id, complexity
|
|
30
|
+
- `context.built` — file count, token estimate
|
|
31
|
+
- `agent.start` — model, prompt size (chars/tokens), TDD strategy
|
|
32
|
+
- `agent.complete` — exit code, duration, stdout size, cost estimate
|
|
33
|
+
- `test.start` — test command
|
|
34
|
+
- `test.complete` — pass/fail, test count, duration
|
|
35
|
+
- `verification.start` — verification strategy
|
|
36
|
+
- `verification.complete` — pass/fail, issues found
|
|
37
|
+
- `story.complete` — story id, status, attempts, duration, cost
|
|
38
|
+
- `iteration.complete` — stories done this iteration
|
|
39
|
+
- `run.complete` — total stories, passed, failed, cost, duration
|
|
40
|
+
|
|
41
|
+
**REQ-4: Per-Story Metrics**
|
|
42
|
+
- Track and report per story: duration, API cost, token count (in/out), attempts, test count
|
|
43
|
+
- Include in `prd.json` story metadata after run completes
|
|
44
|
+
- Summary table at end of run (visible at `info` level)
|
|
45
|
+
|
|
46
|
+
**REQ-5: Debug Mode**
|
|
47
|
+
- `--verbose` or `NAX_LOG_LEVEL=debug` enables:
|
|
48
|
+
- Full prompt text logged to file (not console)
|
|
49
|
+
- Full agent response logged to file
|
|
50
|
+
- Claude CLI command logged
|
|
51
|
+
- Environment variables passed to agent (sanitized — mask tokens)
|
|
52
|
+
|
|
53
|
+
**REQ-6: Run History**
|
|
54
|
+
- Each `nax run` creates a unique run ID (ISO timestamp or UUID)
|
|
55
|
+
- Log file persisted at `nax/features/<name>/runs/<run-id>.jsonl`
|
|
56
|
+
- Latest run symlinked as `nax/features/<name>/runs/latest.jsonl`
|
|
57
|
+
- `nax runs list -f <feature>` lists past runs with summary
|
|
58
|
+
- `nax runs show <run-id> -f <feature>` shows detailed run report
|
|
59
|
+
|
|
60
|
+
**REQ-7: Logger API (Internal)**
|
|
61
|
+
- Singleton logger instance, configured once at CLI entry
|
|
62
|
+
- API: `logger.info(stage, message, data?)`, `logger.debug(...)`, etc.; `logger.withStory(storyId)` returns a story-scoped logger
|
|
63
|
+
- Replace all `console.log` calls with logger calls
|
|
64
|
+
- Logger writes to both console (filtered by level) and file (all levels)
|
|
65
|
+
|
|
66
|
+
## Acceptance Criteria
|
|
67
|
+
|
|
68
|
+
**AC-1:** `nax run -f foo --headless` shows same output as today at `info` level
|
|
69
|
+
**AC-2:** `nax run -f foo --verbose` shows agent timing, token counts, and prompt sizes
|
|
70
|
+
**AC-3:** `nax run -f foo --quiet` shows only warnings, errors, and final summary
|
|
71
|
+
**AC-4:** After a run, `nax/features/foo/runs/latest.jsonl` contains structured events
|
|
72
|
+
**AC-5:** Each JSONL line is valid JSON with `timestamp`, `level`, `stage`, `storyId` fields
|
|
73
|
+
**AC-6:** `nax runs list -f foo` shows past runs with date, stories, cost, status
|
|
74
|
+
**AC-7:** Per-story metrics (duration, cost, attempts) appear in the run summary table
|
|
75
|
+
**AC-8:** Debug mode logs full prompts to file without printing to console
|
|
76
|
+
**AC-9:** No `console.log` calls remain in src/ (all replaced with logger)
|
|
77
|
+
|
|
78
|
+
## Technical Notes
|
|
79
|
+
|
|
80
|
+
### Logger Implementation
|
|
81
|
+
```typescript
|
|
82
|
+
// src/logger.ts
|
|
83
|
+
export type LogLevel = "error" | "warn" | "info" | "debug";
|
|
84
|
+
|
|
85
|
+
export interface LogEntry {
|
|
86
|
+
timestamp: string;
|
|
87
|
+
level: LogLevel;
|
|
88
|
+
stage: string;
|
|
89
|
+
storyId?: string;
|
|
90
|
+
message: string;
|
|
91
|
+
data?: Record<string, unknown>;
|
|
92
|
+
}
|
|
93
|
+
|
|
94
|
+
export declare class Logger {
|
|
95
|
+
constructor(options: { level: LogLevel; filePath?: string });
|
|
96
|
+
error(stage: string, message: string, data?: Record<string, unknown>): void;
|
|
97
|
+
warn(stage: string, message: string, data?: Record<string, unknown>): void;
|
|
98
|
+
info(stage: string, message: string, data?: Record<string, unknown>): void;
|
|
99
|
+
debug(stage: string, message: string, data?: Record<string, unknown>): void;
|
|
100
|
+
withStory(storyId: string): StoryLogger; // scoped logger
|
|
101
|
+
}
|
|
102
|
+
```
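
The REQ-7 behaviour (console filtered by level, file receives all levels) could be prototyped as below. This is a sketch under assumptions: `MemoryLogger` is a hypothetical in-memory stand-in that collects lines in arrays instead of writing to the console or a `.jsonl` file, so the gating logic can be read in isolation.

```typescript
// Sketch only: an in-memory stand-in, not the Logger class specified above.
type LogLevel = "error" | "warn" | "info" | "debug";

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  stage: string;
  storyId?: string;
  message: string;
}

// Lower number = more severe; an entry prints if it is at or above the threshold.
const SEVERITY: Record<LogLevel, number> = { error: 0, warn: 1, info: 2, debug: 3 };

class MemoryLogger {
  readonly consoleLines: string[] = [];
  readonly fileLines: string[] = [];
  private readonly threshold: LogLevel;

  constructor(threshold: LogLevel) {
    this.threshold = threshold;
  }

  log(level: LogLevel, stage: string, message: string, storyId?: string): void {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      stage,
      message,
    };
    if (storyId !== undefined) entry.storyId = storyId;

    // File sink: every level, one JSON object per line (JSON Lines).
    this.fileLines.push(JSON.stringify(entry));

    // Console sink: gated by the configured level.
    if (SEVERITY[level] <= SEVERITY[this.threshold]) {
      this.consoleLines.push(`[${stage}] ${message}`);
    }
  }
}
```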
|
|
103
|
+
|
|
104
|
+
### Migration Strategy
|
|
105
|
+
1. Create `src/logger.ts` with Logger class
|
|
106
|
+
2. Add `--verbose`, `--quiet`, `--silent` flags to `bin/nax.ts`
|
|
107
|
+
3. Replace `console.log` calls one module at a time
|
|
108
|
+
4. Add stage events to orchestrator loop
|
|
109
|
+
5. Add run history commands
|
|
110
|
+
|
|
111
|
+
### Dependencies
|
|
112
|
+
- None (use Bun built-in fs for file writing)
|
|
113
|
+
- chalk remains for console formatting
|
|
114
|
+
|
|
115
|
+
### File Structure
|
|
116
|
+
```
|
|
117
|
+
nax/features/<name>/runs/
|
|
118
|
+
├── 2026-02-19T10-30-00Z.jsonl
|
|
119
|
+
├── 2026-02-20T14-15-00Z.jsonl
|
|
120
|
+
└── latest.jsonl -> 2026-02-20T14-15-00Z.jsonl
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
## Out of Scope
|
|
124
|
+
- Remote log shipping (Datadog, Sentry, etc.)
|
|
125
|
+
- Log rotation or cleanup policies
|
|
126
|
+
- Real-time log streaming via WebSocket
|
|
127
|
+
- Custom log formatters or plugins
|
|
128
|
+
- Metrics dashboard or visualization
|
|
129
|
+
|
|
130
|
+
---
|
|
131
|
+
|
|
132
|
+
*Spec created 2026-02-19 for nax v0.8*
|
|
@@ -0,0 +1,112 @@
|
|
|
1
|
+
# v0.9.3 — Prompt Audit & Context Isolation
|
|
2
|
+
|
|
3
|
+
**Status:** Draft
|
|
4
|
+
**Author:** Subrina
|
|
5
|
+
**Date:** 2026-02-23
|
|
6
|
+
**Base:** v0.9.2
|
|
7
|
+
|
|
8
|
+
## Overview
|
|
9
|
+
|
|
10
|
+
Add tooling to inspect and verify story-scoped prompt isolation. Ensures each story's agent prompt contains only context relevant to that story — no cross-story leakage.
|
|
11
|
+
|
|
12
|
+
## Motivation
|
|
13
|
+
|
|
14
|
+
- No way to inspect what prompts agents actually receive without running a full `nax run`
|
|
15
|
+
- `generateTestCoverageSummary` scans the entire repo's test files, leaking context from other stories into unrelated story prompts
|
|
16
|
+
- No automated test verifying that `buildContext()` properly isolates per-story context
|
|
17
|
+
- Prompt inspection is critical for debugging routing, context, and the upcoming v0.10 prompt optimizer
|
|
18
|
+
|
|
19
|
+
## Deliverables
|
|
20
|
+
|
|
21
|
+
### 1. `nax prompts` CLI Command
|
|
22
|
+
|
|
23
|
+
New subcommand that assembles prompts for all stories in a feature without executing agents.
|
|
24
|
+
|
|
25
|
+
```bash
|
|
26
|
+
# Dump all story prompts to stdout
|
|
27
|
+
nax prompts -f core
|
|
28
|
+
|
|
29
|
+
# Write to directory
|
|
30
|
+
nax prompts -f core --out ./prompt-dump/
|
|
31
|
+
|
|
32
|
+
# Single story
|
|
33
|
+
nax prompts -f core --story US-003
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
**Pipeline stages executed:** routing → context → constitution → prompt (stops before execution).
|
|
37
|
+
|
|
38
|
+
**Output per story:**
|
|
39
|
+
```
|
|
40
|
+
prompt-dump/
|
|
41
|
+
US-001.prompt.md # Final assembled prompt
|
|
42
|
+
US-001.context.md # Context markdown only (for isolation audit)
|
|
43
|
+
US-002.prompt.md
|
|
44
|
+
US-002.context.md
|
|
45
|
+
...
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
Each file includes a YAML frontmatter header:
|
|
49
|
+
```yaml
|
|
50
|
+
---
|
|
51
|
+
storyId: US-003
|
|
52
|
+
title: "Create health indicator interface"
|
|
53
|
+
testStrategy: test-after
|
|
54
|
+
modelTier: balanced
|
|
55
|
+
contextTokens: 2450
|
|
56
|
+
promptTokens: 3800
|
|
57
|
+
dependencies: [US-001]
|
|
58
|
+
contextElements:
|
|
59
|
+
- { type: progress, tokens: 45 }
|
|
60
|
+
- { type: story, storyId: US-003, tokens: 890 }
|
|
61
|
+
- { type: dependency, storyId: US-001, tokens: 720 }
|
|
62
|
+
- { type: test-coverage, tokens: 795 }
|
|
63
|
+
---
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
**For three-session-tdd stories:** outputs `US-001.test-writer.md`, `US-001.implementer.md`, `US-001.verifier.md`.
|
|
67
|
+
|
|
68
|
+
### 2. `buildContext` Isolation Unit Tests
|
|
69
|
+
|
|
70
|
+
Test that `buildContext()` for a given story only includes:
|
|
71
|
+
- The current story
|
|
72
|
+
- Declared dependency stories
|
|
73
|
+
- Progress summary (counts only, no story details)
|
|
74
|
+
- Test coverage (to be scoped — see #3)
|
|
75
|
+
- Error context from current story only
|
|
76
|
+
- File context from current story's `contextFiles` only
|
|
77
|
+
|
|
78
|
+
**Negative assertions:**
|
|
79
|
+
- No story IDs from non-dependency stories appear in output
|
|
80
|
+
- No acceptance criteria from unrelated stories leak through
|
|
81
|
+
- Progress summary contains only aggregate counts, not story titles
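
The first negative assertion could be backed by a small helper like the one below. It is a sketch under assumptions: `findLeakedStoryIds` is a hypothetical name, and it presumes story IDs always follow the `US-<digits>` pattern seen in the examples.

```typescript
// Sketch only: assumes story IDs always look like "US-<digits>".
function findLeakedStoryIds(contextMarkdown: string, allowedIds: string[]): string[] {
  const allowed = new Set(allowedIds);
  const found = contextMarkdown.match(/\bUS-\d+\b/g) ?? [];
  // De-duplicate, then keep only IDs that are neither the story nor a dependency.
  return Array.from(new Set(found)).filter((id) => !allowed.has(id));
}
```

A test for story US-003 with dependency US-001 would then assert that the result is empty for `["US-003", "US-001"]`.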
|
|
82
|
+
|
|
83
|
+
### 3. Scoped Test Coverage Scanner
|
|
84
|
+
|
|
85
|
+
Fix `generateTestCoverageSummary` to scope results to story-relevant files:
|
|
86
|
+
|
|
87
|
+
**Current behavior:** Scans all test files in `testDir` → agent sees coverage from every story.
|
|
88
|
+
|
|
89
|
+
**New behavior:**
|
|
90
|
+
1. If story has `contextFiles` → derive test file patterns from source paths (e.g., `src/health.service.ts` → `test/health.service.spec.ts`)
|
|
91
|
+
2. If no `contextFiles` → fall back to full scan (current behavior) with a warning logged
|
|
92
|
+
3. Add `context.testCoverage.scopeToStory` config option (default: `true`)
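
Step 1 could be sketched as a path mapping like the following. The `src/` → `test/` layout and the `.spec.ts` suffix are taken from the example in step 1; `deriveTestPath` and `scopedTestFiles` are hypothetical names, and returning `null` for files outside `src/` is an assumption.

```typescript
// Sketch only: directory layout and suffix are assumed from the example above.
function deriveTestPath(
  sourcePath: string,
  srcDir = "src",
  testDir = "test",
): string | null {
  const prefix = `${srcDir}/`;
  if (!sourcePath.startsWith(prefix) || !sourcePath.endsWith(".ts")) return null;
  // Strip the source directory and the ".ts" extension, keep nested folders.
  const relative = sourcePath.slice(prefix.length, -".ts".length);
  return `${testDir}/${relative}.spec.ts`;
}

// Map a story's contextFiles to candidate test files, dropping non-source entries.
function scopedTestFiles(contextFiles: string[]): string[] {
  return contextFiles
    .map((file) => deriveTestPath(file))
    .filter((path): path is string => path !== null);
}
```

An empty result from `scopedTestFiles` would correspond to the fallback case in step 2.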
|
|
93
|
+
|
|
94
|
+
## User Stories
|
|
95
|
+
|
|
96
|
+
| # | Title | Complexity | Test Strategy | Dependencies |
|
|
97
|
+
|:--|:------|:-----------|:--------------|:-------------|
|
|
98
|
+
| US-001 | `nax prompts` CLI command with file output | medium | test-after | — |
|
|
99
|
+
| US-002 | `buildContext` isolation unit tests | simple | test-after | — |
|
|
100
|
+
| US-003 | Scope test coverage scanner to story-relevant files | medium | test-after | — |
|
|
101
|
+
|
|
102
|
+
## Non-Goals
|
|
103
|
+
|
|
104
|
+
- No changes to prompt assembly logic (that's v0.10 optimizer territory)
|
|
105
|
+
- No `--optimized` flag yet (depends on v0.10)
|
|
106
|
+
- No changes to TDD orchestrator prompt builders (just audit them)
|
|
107
|
+
|
|
108
|
+
## Compatibility
|
|
109
|
+
|
|
110
|
+
- `nax prompts` is additive — new CLI command, no existing behavior changed
|
|
111
|
+
- Test coverage scoping is behind config flag with backward-compatible default
|
|
112
|
+
- No breaking changes
|