sofia-cli 0.1.1
- package/.github/agents/copilot-instructions.md +39 -0
- package/.github/agents/speckit.analyze.agent.md +184 -0
- package/.github/agents/speckit.checklist.agent.md +294 -0
- package/.github/agents/speckit.clarify.agent.md +181 -0
- package/.github/agents/speckit.constitution.agent.md +84 -0
- package/.github/agents/speckit.implement.agent.md +135 -0
- package/.github/agents/speckit.plan.agent.md +90 -0
- package/.github/agents/speckit.specify.agent.md +258 -0
- package/.github/agents/speckit.tasks.agent.md +137 -0
- package/.github/agents/speckit.taskstoissues.agent.md +30 -0
- package/.github/copilot-instructions.md +257 -0
- package/.github/prompts/speckit.analyze.prompt.md +3 -0
- package/.github/prompts/speckit.checklist.prompt.md +3 -0
- package/.github/prompts/speckit.clarify.prompt.md +3 -0
- package/.github/prompts/speckit.constitution.prompt.md +3 -0
- package/.github/prompts/speckit.implement.prompt.md +3 -0
- package/.github/prompts/speckit.plan.prompt.md +3 -0
- package/.github/prompts/speckit.specify.prompt.md +3 -0
- package/.github/prompts/speckit.tasks.prompt.md +3 -0
- package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
- package/.github/workflows/ci.yml +38 -0
- package/.prettierrc +6 -0
- package/.specify/memory/constitution.md +181 -0
- package/.specify/scripts/bash/check-prerequisites.sh +166 -0
- package/.specify/scripts/bash/common.sh +156 -0
- package/.specify/scripts/bash/create-new-feature.sh +297 -0
- package/.specify/scripts/bash/setup-plan.sh +61 -0
- package/.specify/scripts/bash/update-agent-context.sh +810 -0
- package/.specify/templates/agent-file-template.md +28 -0
- package/.specify/templates/checklist-template.md +40 -0
- package/.specify/templates/constitution-template.md +50 -0
- package/.specify/templates/plan-template.md +113 -0
- package/.specify/templates/spec-template.md +115 -0
- package/.specify/templates/tasks-template.md +251 -0
- package/.vscode/mcp.json +42 -0
- package/.vscode/settings.json +19 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/LICENSE +21 -0
- package/README.md +213 -0
- package/dist/src/cli/developCommand.js +240 -0
- package/dist/src/cli/directCommands.js +143 -0
- package/dist/src/cli/envLoader.js +16 -0
- package/dist/src/cli/exportCommand.js +53 -0
- package/dist/src/cli/index.js +203 -0
- package/dist/src/cli/ioContext.js +109 -0
- package/dist/src/cli/preflight.js +57 -0
- package/dist/src/cli/statusCommand.js +110 -0
- package/dist/src/cli/workshopCommand.js +400 -0
- package/dist/src/develop/checkpointState.js +86 -0
- package/dist/src/develop/codeGenerator.js +319 -0
- package/dist/src/develop/dynamicScaffolder.js +226 -0
- package/dist/src/develop/githubMcpAdapter.js +122 -0
- package/dist/src/develop/index.js +15 -0
- package/dist/src/develop/mcpContextEnricher.js +195 -0
- package/dist/src/develop/pocScaffolder.js +542 -0
- package/dist/src/develop/ralphLoop.js +659 -0
- package/dist/src/develop/templateRegistry.js +364 -0
- package/dist/src/develop/testRunner.js +202 -0
- package/dist/src/logging/logger.js +58 -0
- package/dist/src/loop/conversationLoop.js +227 -0
- package/dist/src/loop/phaseSummarizer.js +87 -0
- package/dist/src/mcp/mcpManager.js +267 -0
- package/dist/src/mcp/mcpTransport.js +391 -0
- package/dist/src/mcp/retryPolicy.js +47 -0
- package/dist/src/mcp/webSearch.js +254 -0
- package/dist/src/phases/contextSummarizer.js +101 -0
- package/dist/src/phases/discoveryEnricher.js +156 -0
- package/dist/src/phases/phaseExtractors.js +222 -0
- package/dist/src/phases/phaseHandlers.js +328 -0
- package/dist/src/prompts/design.md +51 -0
- package/dist/src/prompts/develop-boundary.md +51 -0
- package/dist/src/prompts/develop.md +111 -0
- package/dist/src/prompts/discover.md +58 -0
- package/dist/src/prompts/ideate.md +56 -0
- package/dist/src/prompts/plan.md +51 -0
- package/dist/src/prompts/promptLoader.js +167 -0
- package/dist/src/prompts/promptLoader.ts +198 -0
- package/dist/src/prompts/select.md +47 -0
- package/dist/src/prompts/summarize/README.md +8 -0
- package/dist/src/prompts/summarize/design-summary.md +37 -0
- package/dist/src/prompts/summarize/develop-summary.md +25 -0
- package/dist/src/prompts/summarize/ideate-summary.md +27 -0
- package/dist/src/prompts/summarize/plan-summary.md +27 -0
- package/dist/src/prompts/summarize/select-summary.md +21 -0
- package/dist/src/prompts/system.md +28 -0
- package/dist/src/sessions/exportPaths.js +22 -0
- package/dist/src/sessions/exportWriter.js +406 -0
- package/dist/src/sessions/sessionManager.js +81 -0
- package/dist/src/sessions/sessionStore.js +65 -0
- package/dist/src/shared/activitySpinner.js +91 -0
- package/dist/src/shared/copilotClient.js +129 -0
- package/dist/src/shared/data/cards.json +1249 -0
- package/dist/src/shared/data/cardsLoader.js +51 -0
- package/dist/src/shared/errorClassifier.js +120 -0
- package/dist/src/shared/events.js +28 -0
- package/dist/src/shared/markdownRenderer.js +34 -0
- package/dist/src/shared/schemas/session.js +265 -0
- package/dist/src/shared/tableRenderer.js +20 -0
- package/dist/src/vendor/chalk.js +2 -0
- package/dist/src/vendor/cli-table3.js +3 -0
- package/dist/src/vendor/commander.js +2 -0
- package/dist/src/vendor/marked-terminal.js +3 -0
- package/dist/src/vendor/marked.js +2 -0
- package/dist/src/vendor/ora.js +2 -0
- package/dist/src/vendor/pino.js +2 -0
- package/dist/src/vendor/zod.js +2 -0
- package/dist/tests/e2e/developE2e.spec.js +126 -0
- package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
- package/dist/tests/e2e/developPty.spec.js +75 -0
- package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
- package/dist/tests/e2e/harness.spec.js +83 -0
- package/dist/tests/e2e/mcpLive.spec.js +120 -0
- package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
- package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
- package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
- package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
- package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
- package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
- package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
- package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
- package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
- package/dist/tests/integration/autoStartConversation.spec.js +138 -0
- package/dist/tests/integration/defaultCommand.spec.js +147 -0
- package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
- package/dist/tests/integration/directCommandTty.spec.js +151 -0
- package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
- package/dist/tests/integration/exportArtifacts.spec.js +202 -0
- package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
- package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
- package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
- package/dist/tests/integration/newSessionFlow.spec.js +343 -0
- package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
- package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
- package/dist/tests/integration/pocScaffold.spec.js +163 -0
- package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
- package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
- package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
- package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
- package/dist/tests/integration/summarizationFlow.spec.js +115 -0
- package/dist/tests/integration/testRunnerReal.spec.js +52 -0
- package/dist/tests/integration/webSearchAgent.spec.js +128 -0
- package/dist/tests/live/copilotSdkLive.spec.js +107 -0
- package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
- package/dist/tests/setup/loadEnv.js +3 -0
- package/dist/tests/unit/cli/developCommand.spec.js +567 -0
- package/dist/tests/unit/cli/directCommands.spec.js +279 -0
- package/dist/tests/unit/cli/envLoader.spec.js +58 -0
- package/dist/tests/unit/cli/ioContext.spec.js +119 -0
- package/dist/tests/unit/cli/preflight.spec.js +108 -0
- package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
- package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
- package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
- package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
- package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
- package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
- package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
- package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
- package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
- package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
- package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
- package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
- package/dist/tests/unit/develop/testRunner.spec.js +249 -0
- package/dist/tests/unit/infraBicep.spec.js +92 -0
- package/dist/tests/unit/infraDeploy.spec.js +82 -0
- package/dist/tests/unit/infraTeardown.spec.js +63 -0
- package/dist/tests/unit/logging/logger.spec.js +43 -0
- package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
- package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
- package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
- package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
- package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
- package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
- package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
- package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
- package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
- package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
- package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
- package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
- package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
- package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
- package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
- package/dist/tests/unit/schemas/session.spec.js +257 -0
- package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
- package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
- package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
- package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
- package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
- package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
- package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
- package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
- package/dist/tests/unit/shared/events.spec.js +55 -0
- package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
- package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
- package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
- package/dist/vitest.config.js +14 -0
- package/dist/vitest.live.config.js +18 -0
- package/docs/README.md +35 -0
- package/docs/architecture.md +169 -0
- package/docs/cli-usage.md +207 -0
- package/docs/environment.md +66 -0
- package/docs/export-format.md +146 -0
- package/docs/session-model.md +113 -0
- package/eslint.config.js +35 -0
- package/infra/deploy.sh +193 -0
- package/infra/gather-env.sh +211 -0
- package/infra/main.bicep +90 -0
- package/infra/main.bicepparam +18 -0
- package/infra/resources.bicep +134 -0
- package/infra/teardown.sh +114 -0
- package/package.json +63 -0
- package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
- package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
- package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
- package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
- package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
- package/specs/001-cli-workshop-rebuild/plan.md +361 -0
- package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
- package/specs/001-cli-workshop-rebuild/research.md +116 -0
- package/specs/001-cli-workshop-rebuild/spec.md +240 -0
- package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
- package/specs/002-poc-generation/contracts/poc-output.md +172 -0
- package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
- package/specs/002-poc-generation/data-model.md +172 -0
- package/specs/002-poc-generation/plan.md +109 -0
- package/specs/002-poc-generation/quickstart.md +97 -0
- package/specs/002-poc-generation/research.md +786 -0
- package/specs/002-poc-generation/spec.md +81 -0
- package/specs/002-poc-generation/tasks-fix.md +198 -0
- package/specs/002-poc-generation/tasks.md +252 -0
- package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
- package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
- package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
- package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
- package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
- package/specs/003-mcp-transport-integration/data-model.md +326 -0
- package/specs/003-mcp-transport-integration/plan.md +114 -0
- package/specs/003-mcp-transport-integration/quickstart.md +311 -0
- package/specs/003-mcp-transport-integration/research.md +395 -0
- package/specs/003-mcp-transport-integration/spec.md +234 -0
- package/specs/003-mcp-transport-integration/tasks.md +324 -0
- package/specs/003-next-spec-gaps.md +150 -0
- package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
- package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
- package/specs/004-dev-resume-hardening/data-model.md +321 -0
- package/specs/004-dev-resume-hardening/plan.md +107 -0
- package/specs/004-dev-resume-hardening/quickstart.md +115 -0
- package/specs/004-dev-resume-hardening/research.md +142 -0
- package/specs/004-dev-resume-hardening/spec.md +221 -0
- package/specs/004-dev-resume-hardening/tasks.md +333 -0
- package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
- package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
- package/specs/005-ai-search-deploy/data-model.md +130 -0
- package/specs/005-ai-search-deploy/plan.md +93 -0
- package/specs/005-ai-search-deploy/quickstart.md +96 -0
- package/specs/005-ai-search-deploy/research.md +187 -0
- package/specs/005-ai-search-deploy/spec.md +143 -0
- package/specs/005-ai-search-deploy/tasks.md +284 -0
- package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
- package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
- package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
- package/specs/006-workshop-extraction-fixes/plan.md +123 -0
- package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
- package/specs/006-workshop-extraction-fixes/research.md +143 -0
- package/specs/006-workshop-extraction-fixes/spec.md +210 -0
- package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
- package/src/cli/developCommand.ts +308 -0
- package/src/cli/directCommands.ts +195 -0
- package/src/cli/envLoader.ts +17 -0
- package/src/cli/exportCommand.ts +65 -0
- package/src/cli/index.ts +249 -0
- package/src/cli/ioContext.ts +139 -0
- package/src/cli/preflight.ts +86 -0
- package/src/cli/statusCommand.ts +118 -0
- package/src/cli/workshopCommand.ts +496 -0
- package/src/develop/checkpointState.ts +121 -0
- package/src/develop/codeGenerator.ts +402 -0
- package/src/develop/dynamicScaffolder.ts +284 -0
- package/src/develop/githubMcpAdapter.ts +199 -0
- package/src/develop/index.ts +34 -0
- package/src/develop/mcpContextEnricher.ts +279 -0
- package/src/develop/pocScaffolder.ts +646 -0
- package/src/develop/ralphLoop.ts +1044 -0
- package/src/develop/templateRegistry.ts +427 -0
- package/src/develop/testRunner.ts +276 -0
- package/src/logging/logger.ts +73 -0
- package/src/loop/conversationLoop.ts +355 -0
- package/src/loop/phaseSummarizer.ts +114 -0
- package/src/mcp/mcpManager.ts +365 -0
- package/src/mcp/mcpTransport.ts +562 -0
- package/src/mcp/retryPolicy.ts +87 -0
- package/src/mcp/webSearch.ts +388 -0
- package/src/originalPrompts/design_thinking.md +178 -0
- package/src/originalPrompts/design_thinking_persona.md +76 -0
- package/src/originalPrompts/document_generator_example.md +77 -0
- package/src/originalPrompts/document_generator_persona.md +47 -0
- package/src/originalPrompts/facilitator_persona.md +125 -0
- package/src/originalPrompts/guardrails.md +47 -0
- package/src/phases/contextSummarizer.ts +154 -0
- package/src/phases/discoveryEnricher.ts +223 -0
- package/src/phases/phaseExtractors.ts +247 -0
- package/src/phases/phaseHandlers.ts +450 -0
- package/src/prompts/design.md +51 -0
- package/src/prompts/develop-boundary.md +51 -0
- package/src/prompts/develop.md +111 -0
- package/src/prompts/discover.md +58 -0
- package/src/prompts/ideate.md +56 -0
- package/src/prompts/plan.md +51 -0
- package/src/prompts/promptLoader.ts +198 -0
- package/src/prompts/select.md +47 -0
- package/src/prompts/summarize/README.md +8 -0
- package/src/prompts/summarize/design-summary.md +37 -0
- package/src/prompts/summarize/develop-summary.md +25 -0
- package/src/prompts/summarize/ideate-summary.md +27 -0
- package/src/prompts/summarize/plan-summary.md +27 -0
- package/src/prompts/summarize/select-summary.md +21 -0
- package/src/prompts/system.md +28 -0
- package/src/sessions/exportPaths.ts +28 -0
- package/src/sessions/exportWriter.ts +490 -0
- package/src/sessions/sessionManager.ts +119 -0
- package/src/sessions/sessionStore.ts +69 -0
- package/src/shared/activitySpinner.ts +108 -0
- package/src/shared/copilotClient.ts +291 -0
- package/src/shared/data/cards.json +1249 -0
- package/src/shared/data/cardsLoader.ts +70 -0
- package/src/shared/errorClassifier.ts +160 -0
- package/src/shared/events.ts +103 -0
- package/src/shared/markdownRenderer.ts +44 -0
- package/src/shared/schemas/session.ts +346 -0
- package/src/shared/tableRenderer.ts +28 -0
- package/src/types/marked-terminal.d.ts +5 -0
- package/src/vendor/chalk.ts +2 -0
- package/src/vendor/cli-table3.ts +3 -0
- package/src/vendor/commander.ts +2 -0
- package/src/vendor/marked-terminal.ts +3 -0
- package/src/vendor/marked.ts +2 -0
- package/src/vendor/ora.ts +2 -0
- package/src/vendor/pino.ts +3 -0
- package/src/vendor/zod.ts +3 -0
- package/tests/e2e/developE2e.spec.ts +152 -0
- package/tests/e2e/developFailureE2e.spec.ts +289 -0
- package/tests/e2e/developPty.spec.ts +86 -0
- package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
- package/tests/e2e/harness.spec.ts +104 -0
- package/tests/e2e/mcpLive.spec.ts +149 -0
- package/tests/e2e/newSession.e2e.spec.ts +245 -0
- package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
- package/tests/e2e/workiqEnrichment.spec.ts +72 -0
- package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
- package/tests/e2e/zava-assessment/company-profile.md +98 -0
- package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
- package/tests/e2e/zavaSimulation.spec.ts +511 -0
- package/tests/fixtures/completedSession.json +141 -0
- package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
- package/tests/fixtures/test-fixture-project/package.json +12 -0
- package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
- package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
- package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
- package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
- package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
- package/tests/integration/autoStartConversation.spec.ts +168 -0
- package/tests/integration/defaultCommand.spec.ts +179 -0
- package/tests/integration/directCommandNonTty.spec.ts +260 -0
- package/tests/integration/directCommandTty.spec.ts +185 -0
- package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
- package/tests/integration/exportArtifacts.spec.ts +232 -0
- package/tests/integration/exportFallbackFlow.spec.ts +115 -0
- package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
- package/tests/integration/mcpTransportFlow.spec.ts +178 -0
- package/tests/integration/newSessionFlow.spec.ts +406 -0
- package/tests/integration/pocGithubMcp.spec.ts +224 -0
- package/tests/integration/pocLocalFallback.spec.ts +205 -0
- package/tests/integration/pocScaffold.spec.ts +220 -0
- package/tests/integration/ralphLoopFlow.spec.ts +430 -0
- package/tests/integration/ralphLoopPartial.spec.ts +416 -0
- package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
- package/tests/integration/spinnerLifecycle.spec.ts +270 -0
- package/tests/integration/summarizationFlow.spec.ts +135 -0
- package/tests/integration/testRunnerReal.spec.ts +63 -0
- package/tests/integration/webSearchAgent.spec.ts +155 -0
- package/tests/live/copilotSdkLive.spec.ts +149 -0
- package/tests/live/zavaFullWorkshop.spec.ts +515 -0
- package/tests/setup/loadEnv.ts +5 -0
- package/tests/unit/cli/developCommand.spec.ts +679 -0
- package/tests/unit/cli/directCommands.spec.ts +325 -0
- package/tests/unit/cli/envLoader.spec.ts +73 -0
- package/tests/unit/cli/ioContext.spec.ts +148 -0
- package/tests/unit/cli/preflight.spec.ts +125 -0
- package/tests/unit/cli/statusCommand.spec.ts +134 -0
- package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
- package/tests/unit/cli/workshopCommand.spec.ts +378 -0
- package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
- package/tests/unit/develop/checkpointState.spec.ts +378 -0
- package/tests/unit/develop/codeGenerator.spec.ts +447 -0
- package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
- package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
- package/tests/unit/develop/outputValidator.spec.ts +134 -0
- package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
- package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
- package/tests/unit/develop/templateRegistry.spec.ts +106 -0
- package/tests/unit/develop/testRunner.spec.ts +294 -0
- package/tests/unit/infraBicep.spec.ts +116 -0
- package/tests/unit/infraDeploy.spec.ts +102 -0
- package/tests/unit/infraTeardown.spec.ts +77 -0
- package/tests/unit/logging/logger.spec.ts +50 -0
- package/tests/unit/loop/conversationLoop.spec.ts +719 -0
- package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
- package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
- package/tests/unit/mcp/mcpManager.spec.ts +336 -0
- package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
- package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
- package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
- package/tests/unit/mcp/webSearch.spec.ts +718 -0
- package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
- package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
- package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
- package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
- package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
- package/tests/unit/prompts/promptLoader.spec.ts +144 -0
- package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
- package/tests/unit/schemas/session.spec.ts +328 -0
- package/tests/unit/sessions/exportPaths.spec.ts +38 -0
- package/tests/unit/sessions/exportWriter.spec.ts +737 -0
- package/tests/unit/sessions/sessionManager.spec.ts +174 -0
- package/tests/unit/sessions/sessionStore.spec.ts +136 -0
- package/tests/unit/shared/activitySpinner.spec.ts +211 -0
- package/tests/unit/shared/cardsLoader.spec.ts +89 -0
- package/tests/unit/shared/copilotClient.spec.ts +185 -0
- package/tests/unit/shared/errorClassifier.spec.ts +152 -0
- package/tests/unit/shared/events.spec.ts +71 -0
- package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
- package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
- package/tests/unit/shared/tableRenderer.spec.ts +38 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +15 -0
- package/vitest.live.config.ts +19 -0
@@ -0,0 +1,115 @@
# Quickstart: Dev Resume & Hardening

**Feature**: 004-dev-resume-hardening
**Date**: 2026-03-01

## Prerequisites

- Node.js >= 20 LTS
- npm (bundled with Node.js)
- sofIA CLI installed (`npm run build && npm link`)
- A workshop session that has completed the Plan phase

## Quick Verification

### 1. Resume a session after interruption

```bash
# Start a dev session
sofia dev --session abc123

# Interrupt with Ctrl+C after 2 iterations complete
# The CLI displays: "Use `sofia dev --session abc123` to resume"

# Resume — should skip scaffold, re-run npm install, resume from iteration 3
sofia dev --session abc123

# Expected output:
# ℹ Resuming session abc123 from iteration 3 (2 completed iterations found)
# ℹ Skipping scaffold — output directory and .sofia-metadata.json present
# ℹ Re-running dependency installation (npm install)
# Iteration 3/10: Running tests…
```

### 2. Force-restart a session

```bash
# Force restart — clears both output directory and session state
sofia dev --session abc123 --force

# Expected output:
# ℹ Cleared existing output directory and session state (--force)
# Scaffolding PoC project…
# Iteration 1/10: Running tests…
```

### 3. Template selection

```bash
# Create a plan with Python/FastAPI architecture notes
# Then run dev — should auto-select python-pytest template
sofia dev --session python-plan-123

# Expected output:
# ℹ Selected template: python-pytest (matched 'python' in architecture notes)
# Scaffolding PoC project…
```

## Development Setup

```bash
# Clone and install
git clone <repo-url>
cd sofia-cli
git checkout 004-dev-resume-hardening
npm install

# Run tests (targeted)
npm test -- tests/unit/develop/ralphLoop.spec.ts
npm test -- tests/unit/develop/templateRegistry.spec.ts
npm test -- tests/unit/cli/developCommand.spec.ts
npm test -- tests/integration/testRunnerReal.spec.ts

# Run all tests
npm test

# Type check
npm run typecheck

# Lint
npm run lint
```

## Key Files to Modify

| File                              | Change                                      |
| --------------------------------- | ------------------------------------------- |
| `src/develop/ralphLoop.ts`        | Resume iteration seeding in `run()`         |
| `src/cli/developCommand.ts`       | Resume detection, `--force` session reset   |
| `src/develop/templateRegistry.ts` | **New**: template registry + selection      |
| `src/develop/pocScaffolder.ts`    | Use registry, extract template into entries |
| `src/develop/testRunner.ts`       | Make test command configurable              |
| `src/phases/phaseHandlers.ts`     | Workshop→dev transition guidance            |
| `src/cli/workshopCommand.ts`      | Display `sofia dev` command after Plan      |

## TDD Workflow Reminder

Per constitution (Principle V):

1. **Red**: Write failing tests first
2. **Green**: Implement minimum code to pass
3. **Review**: Run Test Review Checklist, add tests for gaps

```bash
# Example: adding resume test
# 1. Write test in tests/unit/develop/ralphLoop.spec.ts
# 2. Run it — should fail
npm test -- tests/unit/develop/ralphLoop.spec.ts --testNamePattern "resumes from"
# 3. Implement resume logic in src/develop/ralphLoop.ts
# 4. Run again — should pass
npm test -- tests/unit/develop/ralphLoop.spec.ts
# 5. Full suite
npm test
# 6. Type + lint
npm run typecheck && npm run lint
```
@@ -0,0 +1,142 @@
# Research: Dev Resume & Hardening

**Feature**: 004-dev-resume-hardening
**Date**: 2026-03-01
**Status**: Complete — all unknowns resolved

## R1: Resume Iteration Seeding Strategy

**Decision**: Seed `iterations` from `session.poc.iterations` at `ralphLoop.ts` L183, derive `iterNum = iterations.length + 1`, conditionally skip scaffold/install.

**Rationale**: The `run()` method always initializes `iterations = []` (L183) and starts the iteration loop at `iterNum = 2` (L280). The session already persists `poc.iterations` via `onSessionUpdate` → `store.save()` after every iteration. The data needed for resume is already being written — it's just never read back. Seeding from session state is the minimal change with maximum correctness.

**Key insertion points**:

- L183: After `const iterations: PocIteration[] = []`, push from `session.poc.iterations` if present and `finalStatus` is unset
- L280: Change loop start from `iterNum = 2` to `iterNum = iterations.length + 1`
- L190-L271: Wrap scaffold + npm install in `if (iterations.length === 0)` guard
- L278-L279: Seed `prevFailingTests` from last iteration's `testResults.failures` on resume
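
The insertion points above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `ralphLoop.ts` code: the `SessionLike` and `IterationLike` shapes are trimmed-down guesses, `failures` is assumed to be a list of test names, the `outcome` strings are placeholder values, and `seedIterations` is a hypothetical helper name.

```typescript
// Trimmed-down, assumed shapes; the real schemas live in the session module.
interface IterationLike {
  iteration: number;
  outcome?: string; // placeholder values below, not the real enum
  testResults?: { failures: string[] }; // assumed: failing test names
}

interface SessionLike {
  poc?: { iterations?: IterationLike[]; finalStatus?: string };
}

// Hypothetical helper: return previously persisted iterations while the PoC
// is unfinished (finalStatus unset), otherwise an empty list for a fresh run.
function seedIterations(session: SessionLike): IterationLike[] {
  const poc = session.poc;
  if (!poc || poc.finalStatus !== undefined || !poc.iterations) {
    return [];
  }
  return poc.iterations.slice(); // copy so the loop does not alias session state
}

const session: SessionLike = {
  poc: {
    iterations: [
      { iteration: 1, outcome: "scaffold" },
      { iteration: 2, outcome: "tests-failing", testResults: { failures: ["adds numbers"] } },
    ],
  },
};

const iterations = seedIterations(session);
const needsScaffold = iterations.length === 0; // the guard described for L190-L271
const startIterNum = iterations.length + 1; // replaces the hard-coded loop start
const last = iterations[iterations.length - 1];
const prevFailingTests = last?.testResults?.failures ?? [];

console.log({ needsScaffold, startIterNum, prevFailingTests });
```

With two persisted iterations the sketch yields a start of iteration 3 and skips the scaffold guard, which lines up with the resume scenario in the quickstart.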

**Alternatives considered**:

- **New `resume()` method on RalphLoop**: Rejected — would duplicate significant logic from `run()`. Better to make `run()` resume-aware.
- **Checkpoint file on disk**: Rejected — session JSON already contains all state needed. Adding a secondary checkpoint source creates consistency risks.

**Open design question resolved**: `maxIterations` counts _total_ iterations (not additional from resume point). If `maxIterations=10` and 3 completed, the loop runs iterations 4-10 (7 more). This matches the semantic "max iterations for this PoC" and prevents open-ended runs.
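
As a quick check of that arithmetic, here is a small sketch that lists the iteration numbers still to run when the cap applies to the total count. The function name `iterationsToRun` is made up for illustration and does not appear in the codebase.

```typescript
// Hypothetical helper: iteration numbers still to run when maxIterations
// caps the total count rather than the count since the resume point.
function iterationsToRun(completed: number, maxIterations: number): number[] {
  const remaining: number[] = [];
  for (let iterNum = completed + 1; iterNum <= maxIterations; iterNum++) {
    remaining.push(iterNum);
  }
  return remaining;
}

const afterThree = iterationsToRun(3, 10); // iterations 4 through 10, 7 in total
const atCap = iterationsToRun(10, 10); // nothing left to run

console.log(afterThree.length, atCap.length);
```

A session that has already reached the cap gets an empty list, so resuming it cannot extend the run.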

**Incomplete iteration handling** (FR-001a): If the last iteration in `session.poc.iterations` has no `testResults` (indicating mid-execution interruption), pop it from the seeded iterations so it gets re-run. Only fully completed iterations (with `testResults` or `outcome` set) are preserved.
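
One way to read that rule is sketched below. It treats an iteration as complete when either `testResults` or `outcome` is set, following the parenthetical above. The `SeededIteration` shape and the `dropIncompleteTail` name are assumptions for illustration only.

```typescript
// Assumed minimal shape; only the fields relevant to the completeness check.
interface SeededIteration {
  iteration: number;
  outcome?: string;
  testResults?: unknown;
}

// Hypothetical helper: drop a trailing iteration that was interrupted
// mid-execution so that the loop re-runs it.
function dropIncompleteTail(seeded: SeededIteration[]): SeededIteration[] {
  const kept = seeded.slice();
  const last = kept[kept.length - 1];
  if (last !== undefined && last.testResults === undefined && last.outcome === undefined) {
    kept.pop();
  }
  return kept;
}

const seeded: SeededIteration[] = [
  { iteration: 1, outcome: "scaffold" }, // placeholder outcome value
  { iteration: 2, testResults: {} },
  { iteration: 3 }, // interrupted before any results were recorded
];

const resumable = dropIncompleteTail(seeded);
console.log(resumable.length + 1); // next iteration to run: 3
```

Only the last entry is inspected, because an interruption can leave at most the in-flight iteration half-written.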

## R2: `--force` Session State Reset

**Decision**: Direct mutation `session.poc = undefined` in `developCommand.ts` after `rmSync()`, followed by `store.save()`. Do not use `backtrackSession`.

**Rationale**: `backtrackSession(session, 'Develop')` is a no-op when `session.phase` is already `'Develop'` (same-phase check at sessionManager.ts L80-L84 returns without changes). The `--force` reset is a single-field operation within the current phase — directly clearing `session.poc` is simpler, more explicit, and avoids coupling to the backtrack function's cross-phase navigation semantics.

**Alternatives considered**:

- **`backtrackSession` with `clearCurrentPhase` option**: Rejected — adds complexity to a generic function for a specific use case. Backtrack is designed for phase navigation, not in-phase resets.
- **Delete and recreate session**: Rejected — would lose all workshop phases (Discover, Ideate, Design, Select, Plan). `--force` should only reset the PoC, preserving all prior work.
|
|
39
|
+
|
|
40
|
+
## R3: Template Registry Architecture
|
|
41
|
+
|
|
42
|
+
**Decision**: Create a `TemplateRegistry` map in a new `src/develop/templateRegistry.ts` module. `PocScaffolder` constructor already accepts `template?: TemplateFile[]` — the registry provides the lookup layer.
|
|
43
|
+
|
|
44
|
+
**Rationale**: The scaffolder's template injection point exists (`constructor(template?)`), `TemplateFile` interface is stable, and `TechStack` schema already supports the fields needed. The registry formalizes what's already implicit (hardcoded template selection) into an extensible pattern.
|
|
45
|
+
|
|
46
|
+
**Template entry shape**:
|
|
47
|
+
|
|
48
|
+
```typescript
export interface TemplateEntry {
  id: string; // e.g., 'node-ts-vitest', 'python-pytest'
  displayName: string; // e.g., 'TypeScript + Node.js + Vitest'
  files: TemplateFile[]; // scaffold file list
  techStack: TechStack; // includes language, runtime, testRunner, buildCommand
  installCommand: string; // e.g., 'npm install', 'pip install -r requirements.txt'
  testCommand: string; // e.g., 'npm test -- --reporter=json'
  matchPatterns: string[]; // keywords to match from architectureNotes
}
```
|
|
59
|
+
|
|
60
|
+
**Selection logic**: Scan `plan.architectureNotes` + `plan.dependencies` for `matchPatterns`. First match wins. Default: `node-ts-vitest`.
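The first-match-wins lookup could look roughly like the sketch below. `TemplateMatchEntry` is a trimmed, hypothetical stand-in for `TemplateEntry` (only the fields selection needs), and case-insensitive substring matching is an assumption; the notes above do not say how patterns are compared.

```typescript
// Trimmed, hypothetical version of TemplateEntry: only the fields selection needs.
interface TemplateMatchEntry {
  id: string;
  matchPatterns: string[];
}

const DEFAULT_TEMPLATE_ID = "node-ts-vitest";

function selectTemplateId(
  registry: TemplateMatchEntry[],
  architectureNotes: string,
  dependencies: string[],
): string {
  // Assumption: patterns are matched as case-insensitive substrings.
  const haystack = [architectureNotes, ...dependencies].join(" ").toLowerCase();
  for (const entry of registry) {
    if (entry.matchPatterns.some((p) => haystack.includes(p.toLowerCase()))) {
      return entry.id; // first match wins, so registry order matters
    }
  }
  return DEFAULT_TEMPLATE_ID;
}
```

Note that with plain first-match-wins, a plan mentioning several languages resolves to whichever entry is listed first in the registry. Any separate handling of ambiguous plans is outside what this sketch covers.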
|
|
61
|
+
|
|
62
|
+
**`python-pytest` template files**: `.gitignore`, `requirements.txt`, `pytest.ini`, `README.md`, `src/__init__.py`, `src/main.py`, `tests/test_main.py`, `.sofia-metadata.json`.
|
|
63
|
+
|
|
64
|
+
**TechStack for Python**: `{ language: 'Python', runtime: 'Python 3.11', testRunner: 'pytest --tb=short -q --json-report', buildCommand: undefined, framework: undefined }`
|
|
65
|
+
|
|
66
|
+
**Alternatives considered**:
|
|
67
|
+
|
|
68
|
+
- **Auto-detection from plan (no registry)**: Rejected — fragile pattern matching without a structured lookup. Registry makes template addition declarative.
|
|
69
|
+
- **User-selectable template (CLI flag)**: Deferred — out of scope per spec. Registry enables this later without code changes.
|
|
70
|
+
|
|
71
|
+
## R4: TestRunner Command Configurability
|
|
72
|
+
|
|
73
|
+
**Decision**: Make the test command configurable via `TestRunnerOptions.testCommand` (default: `'npm test -- --reporter=json'`). The `RalphLoop` passes the command from `TechStack.testRunner` or from the selected `TemplateEntry`.
|
|
74
|
+
|
|
75
|
+
**Rationale**: `spawnTests()` currently hardcodes `spawn('npm', ['test', '--', '--reporter=json'])`. For Python templates, the command would be `pytest --tb=short -q --json-report`. Rather than building separate parsers for each runner, make the command configurable and keep the JSON parsing generic — both Vitest and pytest can produce JSON output.
|
|
76
|
+
|
|
77
|
+
**Test strategy for coverage hardening**:
|
|
78
|
+
|
|
79
|
+
- Make `extractJson` and `buildErrorResult` `protected` (like `parseOutput` already is)
|
|
80
|
+
- Create test fixture files with sample Vitest JSON output (passing, failing, mixed, garbled)
|
|
81
|
+
- Use `child_process.spawn` mocking OR a real minimal project in `tests/fixtures/` for integration tests
|
|
82
|
+
- FR-019 requires real fixture — create `tests/fixtures/test-fixture-project/` with a minimal Vitest project
|
|
83
|
+
|
|
84
|
+
**Alternatives considered**:
|
|
85
|
+
|
|
86
|
+
- **Strategy pattern per runner**: Rejected for now — over-engineering. JSON output parsing can be generic. If pytest JSON format differs significantly, add a `parseStrategy` option later.
|
|
87
|
+
- **Test `spawnTests` via shell script fixture**: Rejected — too fragile across platforms. Real Vitest project is more reliable.
|
|
88
|
+
|
|
89
|
+
## R5: Workshop → Dev Transition
|
|
90
|
+
|
|
91
|
+
**Decision**: Insert guidance message in `workshopCommand.ts` when `getNextPhase(phase)` returns `'Develop'`, after the Plan decision gate. Show exact command `sofia dev --session <id>`. Optionally offer auto-transition in interactive mode (FR-021, SHOULD).
|
|
92
|
+
|
|
93
|
+
**Rationale**: The Plan → Develop boundary is where the workshop's conversational flow hands off to the RalphLoop's iterative code generation. The boundary handler at phaseHandlers.ts L283-L286 already comments that PoC generation uses `sofia dev`. Making this guidance explicit helps users complete the workflow.
|
|
94
|
+
|
|
95
|
+
**Insertion point**: `workshopCommand.ts` inside the `case 'continue'` block, when `next === 'Develop'`.
|
|
96
|
+
|
|
97
|
+
**Alternatives considered**:
|
|
98
|
+
|
|
99
|
+
- **Auto-transition always**: Rejected — breaks the two-command separation of concerns. Users may want to review the plan before generating code.
|
|
100
|
+
- **Display in phase handler instead of workshop command**: Rejected — phase handlers don't have access to the IO context for rich terminal rendering. The workshop command is the right orchestration layer.
|
|
101
|
+
|
|
102
|
+
## R6: `.sofia-metadata.json` TODO Tracking
|
|
103
|
+
|
|
104
|
+
**Decision**: Extend metadata JSON schema with a `todos` section. Scan template files at scaffold time for `TODO:` markers. After each RalphLoop iteration, rescan and update counts.
|
|
105
|
+
|
|
106
|
+
**Rationale**: The metadata file is already written at scaffold time (pocScaffolder.ts L243-L260), excluded from code generation (codeGenerator.ts L112), and used as a resume marker (developCommand.ts L160). Extending it with TODO tracking is a natural fit.
|
|
107
|
+
|
|
108
|
+
**Schema extension**:
|
|
109
|
+
|
|
110
|
+
```json
{
  "todos": {
    "totalInitial": 3,
    "remaining": 1,
    "markers": ["src/main.py:12: TODO: Implement business logic"]
  }
}
```
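As a rough illustration of how the `todos` section might be produced, the sketch below scans in-memory file contents for `TODO:` markers and formats them as `path:line: TODO: text`, following the example entry in the schema. `ScannedFile`, `collectTodoMarkers`, and `updateTodos` are hypothetical names, and keeping `totalInitial` fixed after the first scan is an assumption based on the field name.

```typescript
interface TodoSummary {
  totalInitial: number;
  remaining: number;
  markers: string[];
}

// Hypothetical input shape: relative path plus file content already read into memory.
interface ScannedFile {
  path: string;
  content: string;
}

function collectTodoMarkers(files: ScannedFile[]): string[] {
  const markers: string[] = [];
  for (const file of files) {
    file.content.split(/\r?\n/).forEach((line, index) => {
      const at = line.indexOf("TODO:");
      if (at !== -1) {
        // Line numbers are 1-based; keep the text from the marker onward.
        markers.push(`${file.path}:${index + 1}: ${line.slice(at).trim()}`);
      }
    });
  }
  return markers;
}

function updateTodos(previous: TodoSummary | undefined, files: ScannedFile[]): TodoSummary {
  const markers = collectTodoMarkers(files);
  return {
    // Assumption: totalInitial is set once at scaffold time and never changes on rescans.
    totalInitial: previous?.totalInitial ?? markers.length,
    remaining: markers.length,
    markers,
  };
}
```

Calling `updateTodos(undefined, files)` at scaffold time and `updateTodos(previous, files)` after each iteration would keep the counts in step with the files on disk.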
|
|
119
|
+
|
|
120
|
+
**Alternatives considered**:
|
|
121
|
+
|
|
122
|
+
- **Separate `.sofia-todos.json` file**: Rejected — adds another file to manage. Metadata is already the canonical per-PoC state file.
|
|
123
|
+
- **Track in session JSON instead**: Rejected — TODOs are file-system artifacts, not session-level state. Metadata file is co-located with the scaffold output.
|
|
124
|
+
|
|
125
|
+
## R7: Existing Test Infrastructure
|
|
126
|
+
|
|
127
|
+
**Decision**: Follow existing test patterns: unit tests in `tests/unit/develop/`, integration in `tests/integration/`, E2E with `node-pty` in `tests/e2e/`. Use Vitest `vi.mock()` at module boundaries.
|
|
128
|
+
|
|
129
|
+
**Rationale**: 48 test files already establish clear conventions. Unit tests mock at module boundaries using `vi.mock()`. Integration tests use fake IO contexts and deterministic session objects. E2E tests use `node-pty` for PTY simulation (already a dev dependency).
|
|
130
|
+
|
|
131
|
+
**Key test files to extend**:
|
|
132
|
+
|
|
133
|
+
- `tests/unit/develop/ralphLoop.spec.ts` — add resume iteration seeding tests
|
|
134
|
+
- `tests/unit/cli/developCommand.spec.ts` — add --force reset tests
|
|
135
|
+
- `tests/integration/ralphLoopPartial.spec.ts` — add full resume flow tests
|
|
136
|
+
|
|
137
|
+
**New test files**:
|
|
138
|
+
|
|
139
|
+
- `tests/unit/develop/templateRegistry.spec.ts` — registry selection logic
|
|
140
|
+
- `tests/integration/testRunnerReal.spec.ts` — fixture-based testRunner tests
|
|
141
|
+
- `tests/e2e/developPty.spec.ts` — PTY-based interactive E2E
|
|
142
|
+
- `tests/fixtures/test-fixture-project/` — minimal Vitest project for testRunner tests
|
|
@@ -0,0 +1,221 @@
|
|
|
1
|
+
# Feature Specification: Dev Resume & Hardening
|
|
2
|
+
|
|
3
|
+
**Feature Branch**: `004-dev-resume-hardening`
|
|
4
|
+
**Created**: 2026-03-01
|
|
5
|
+
**Status**: Draft
|
|
6
|
+
**Upstream Dependency**: specs/002-poc-generation/spec.md (Ralph Loop, `--force`, testRunner, scaffolder), specs/003-mcp-transport-integration/spec.md (MCP transport layer)
|
|
7
|
+
**Input**: User description: "Implement dev command resume/checkpoint, --force flag, testRunner coverage hardening, PoC template selection, and other deferred P2/P3 gaps from Feature 003 spec"
|
|
8
|
+
|
|
9
|
+
## Overview
|
|
10
|
+
|
|
11
|
+
Feature 002 built the PoC generation pipeline, and Feature 003 wires it to real MCP servers. This feature hardens the `sofia dev` command for production use by implementing the resume/checkpoint flow, honoring the `--force` flag, expanding test coverage for `testRunner.ts`, introducing a template registry for multi-language PoC scaffolding, and adding interactive E2E tests.
|
|
12
|
+
|
|
13
|
+
Currently, running `sofia dev --session X` a second time re-scaffolds everything from scratch despite the CLI displaying a "Resume" suggestion. The `--force` flag deletes the output directory but does not reset session state. The test runner has significant untested code paths at 45% coverage. The scaffolder is locked to a single TypeScript/Vitest template regardless of the plan's architecture notes.
|
|
14
|
+
|
|
15
|
+
**Gaps addressed**: GAP-006 (P2, resume/checkpoint), GAP-007 (P2, `--force`), GAP-008 (P2, testRunner coverage), GAP-009 (P2, template selection), GAP-009 (P3, scaffold TODOs), GAP-010 (P3, PTY E2E), GAP-011 (P3, workshop→develop transition) from `specs/003-next-spec-gaps.md`.
|
|
16
|
+
|
|
17
|
+
## Clarifications
|
|
18
|
+
|
|
19
|
+
### Session 2026-03-01
|
|
20
|
+
|
|
21
|
+
- Q: When resuming after an interruption mid-iteration, should the system re-run the incomplete iteration or skip to N+1? → A: Re-run the last iteration if it has no test results (was interrupted mid-execution); skip to N+1 only if the iteration completed fully.
|
|
22
|
+
- Q: Should npm install be skipped on resume if node_modules exists? → A: Always re-run npm install on resume — it's idempotent and avoids stale dependency issues from mid-iteration interruptions.
|
|
23
|
+
- Q: Should the template define the test command or should the test runner auto-detect it? → A: Template defines both install and test commands in TechStack — single source of truth, no auto-detection.
|
|
24
|
+
- Q: Should resume decisions (skip scaffold, re-run iteration, re-run install) be logged? → A: Log all resume decisions at info level (visible by default) for user confidence and debugging.
|
|
25
|
+
- Q: What adjacent concerns should be explicitly out of scope? → A: Multi-session dev, cloud-based resume, template marketplace, and Python test runner integration are all out of scope.
|
|
26
|
+
|
|
27
|
+
## Out of Scope
|
|
28
|
+
|
|
29
|
+
The following concerns are explicitly excluded from this feature:
|
|
30
|
+
|
|
31
|
+
- **Multi-session development** — Running `sofia dev` on multiple sessions simultaneously is not supported; resume is single-session only.
|
|
32
|
+
- **Cloud-based resume** — Checkpoint state is local to the machine; syncing resume state across machines (e.g., via GitHub or cloud storage) is deferred.
|
|
33
|
+
- **Template marketplace** — User-contributed or externally hosted templates are not supported; the template registry is internal and code-defined.
|
|
34
|
+
- **Python test runner integration** — While the `python-pytest` scaffold template is in scope, adapting `testRunner.ts` to parse pytest's JSON output format is deferred. The Python template will use a test command format compatible with the existing JSON parser (e.g., pytest with `--json-report` plugin producing a compatible shape).
|
|
35
|
+
|
|
36
|
+
## User Scenarios & Testing _(mandatory)_
|
|
37
|
+
|
|
38
|
+
### User Story 1 — Resume an Interrupted PoC Session (Priority: P1)
|
|
39
|
+
|
|
40
|
+
As a facilitator who ran `sofia dev` and it was interrupted (Ctrl+C, network failure, LLM error), I want to run `sofia dev --session X` again and have it continue from where it left off (skipping scaffolding, re-running only the idempotent npm install, and resuming from the next iteration number) so that I don't lose progress and can reach a working PoC faster.
|
|
41
|
+
|
|
42
|
+
**Why this priority**: The CLI already advertises "Resume: sofia dev --session X" in its recovery message, but the command doesn't actually resume. This is the largest usability gap — users who encounter any interruption lose all iteration progress.
|
|
43
|
+
|
|
44
|
+
**Independent Test**: Run `sofia dev` on a session, interrupt after 2 completed iterations, re-run `sofia dev --session X`, and verify it starts from iteration 3 without re-scaffolding (npm install is re-run per FR-003).
|
|
45
|
+
|
|
46
|
+
**Acceptance Scenarios**:
|
|
47
|
+
|
|
48
|
+
1. **Given** a session with `poc.iterations` containing 2 completed iterations and `poc.finalStatus` unset, **When** the user runs `sofia dev --session X`, **Then** the Ralph Loop detects the existing iterations, skips scaffolding, re-runs npm install (FR-003), and begins iteration 3 from the last known test results.
|
|
49
|
+
2. **Given** a session with `poc.finalStatus` set to `'success'`, **When** the user runs `sofia dev --session X`, **Then** the CLI displays a message indicating the PoC is already complete and exits without re-running the Ralph Loop.
|
|
50
|
+
3. **Given** a session with `poc.finalStatus` set to `'failed'` or `'partial'`, **When** the user runs `sofia dev --session X`, **Then** the CLI offers to resume from the last iteration or start fresh, defaulting to resume.
|
|
51
|
+
4. **Given** a session with existing iterations but the output directory is missing, **When** the user runs `sofia dev --session X`, **Then** the system re-scaffolds (using the original plan context) but preserves the iteration history for LLM context continuity.
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
### User Story 2 — Force-Restart a PoC Session (Priority: P1)
|
|
56
|
+
|
|
57
|
+
As a facilitator who wants to discard a previous PoC attempt and start completely fresh, I want to run `sofia dev --session X --force` and have it delete all prior output and reset PoC state, so that I get a clean slate without needing to create a new session.
|
|
58
|
+
|
|
59
|
+
**Why this priority**: The `--force` flag is already declared in the CLI and referenced in the recovery message, but it only partially works (deletes output directory without resetting session state). This creates a confusing state where files are gone but the session still references old iterations.
|
|
60
|
+
|
|
61
|
+
**Independent Test**: Run `sofia dev --session X` to create output, then run `sofia dev --session X --force`, and verify both the output directory and session's `poc.iterations` are reset to empty.
|
|
62
|
+
|
|
63
|
+
**Acceptance Scenarios**:
|
|
64
|
+
|
|
65
|
+
1. **Given** a session with existing `poc.iterations` and an output directory, **When** the user runs `sofia dev --session X --force`, **Then** the output directory is deleted, `poc.iterations` is reset to an empty array, `poc.finalStatus` is cleared, and the Ralph Loop starts fresh from iteration 1.
|
|
66
|
+
2. **Given** a session with no prior PoC state, **When** the user runs `sofia dev --session X --force`, **Then** it behaves identically to a first run (no error, no special message).
|
|
67
|
+
3. **Given** the `--force` flag is used on a session with `poc.finalStatus` set to `'success'`, **When** the command runs, **Then** it clears the success state and starts fresh without prompting for confirmation.
|
|
68
|
+
|
|
69
|
+
---
|
|
70
|
+
|
|
71
|
+
### User Story 3 — PoC Template Selection Based on Plan (Priority: P2)
|
|
72
|
+
|
|
73
|
+
As a facilitator whose plan specifies Python/FastAPI architecture, I want the scaffolder to generate a Python project with pytest instead of always generating TypeScript/Vitest, so that the PoC matches the planned technology stack.
|
|
74
|
+
|
|
75
|
+
**Why this priority**: Currently the scaffolder is hardcoded to TypeScript/Vitest regardless of the plan's `architectureNotes` or `dependencies`. This limits the PoC's usefulness when the plan targets a different technology. However, the core Ralph Loop works with any single template, making this an enhancement rather than a blocker.
|
|
76
|
+
|
|
77
|
+
**Independent Test**: Create a session with a plan specifying Python + FastAPI in its architecture notes, run `sofia dev`, and verify the scaffolder generates `requirements.txt`, `main.py`, `test_main.py` with pytest instead of `package.json` and TypeScript files.
|
|
78
|
+
|
|
79
|
+
**Acceptance Scenarios**:
|
|
80
|
+
|
|
81
|
+
1. **Given** a plan with `architectureNotes` mentioning "Python" or "FastAPI", **When** the scaffolder runs, **Then** it selects the `python-pytest` template and generates a Python project structure.
|
|
82
|
+
2. **Given** a plan with `architectureNotes` mentioning "TypeScript" or "Node.js" or no specific language, **When** the scaffolder runs, **Then** it uses the default `node-ts-vitest` template (current behavior preserved).
|
|
83
|
+
3. **Given** a plan with ambiguous architecture notes (e.g., "could be Python or TypeScript"), **When** the scaffolder runs, **Then** it defaults to `node-ts-vitest` and logs which template was selected and why.
|
|
84
|
+
4. **Given** a template registry with registered templates, **When** a new template is added, **Then** it only requires adding a new entry to the registry — no changes to the scaffolder's core logic.
|
|
85
|
+
|
|
86
|
+
---
|
|
87
|
+
|
|
88
|
+
### User Story 4 — TestRunner Coverage Hardening (Priority: P2)
|
|
89
|
+
|
|
90
|
+
As a developer maintaining sofIA, I want the test runner's critical code paths (subprocess spawning, output parsing, timeout handling) to be covered by integration tests, so that regressions in the test execution pipeline are caught early.
|
|
91
|
+
|
|
92
|
+
**Why this priority**: The test runner is at 45% coverage with critical untested paths including the child process spawning mechanism, output parsing fallbacks, and timeout error handling. These paths are exercised in every Ralph Loop iteration, making regressions high-impact but currently invisible.
|
|
93
|
+
|
|
94
|
+
**Independent Test**: Run the test runner integration tests against a tiny Vitest/pytest project fixture and verify all code paths are exercised including timeout, SIGTERM/SIGKILL, malformed output, and mixed stdout/JSON scenarios.
|
|
95
|
+
|
|
96
|
+
**Acceptance Scenarios**:
|
|
97
|
+
|
|
98
|
+
1. **Given** a test fixture project with passing tests, **When** the test runner executes, **Then** it correctly parses the JSON reporter output and returns accurate pass/fail/skip counts.
|
|
99
|
+
2. **Given** a test fixture project with a test that hangs indefinitely, **When** the test runner's timeout fires, **Then** it sends SIGTERM, waits 5 seconds, sends SIGKILL if needed, and returns a timeout-classified error result.
|
|
100
|
+
3. **Given** test output containing mixed console logs and JSON, **When** `extractJson()` parses the output, **Then** the fallback path (first `{` to last `}`) successfully extracts the JSON report.
|
|
101
|
+
4. **Given** test output containing no valid JSON at all, **When** `extractJson()` is called, **Then** it returns null and the caller produces a zero-count result with raw output preserved.
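Scenarios 3 and 4 describe a two-pass extraction: try each line on its own, then fall back to the first-`{`-to-last-`}` slice, and return null if neither parses. The sketch below illustrates that behavior under those stated assumptions; `extractJsonSketch` is a hypothetical name and is not the actual `extractJson()` from `testRunner.ts`.

```typescript
// Illustrative sketch only; not the actual testRunner.ts implementation.
function extractJsonSketch(output: string): unknown {
  // Pass 1: a JSON reporter often prints the whole report on one line.
  for (const line of output.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("{")) continue;
    try {
      return JSON.parse(trimmed);
    } catch {
      // Not a complete JSON document on this line; keep looking.
    }
  }
  // Pass 2 (fallback): slice from the first "{" to the last "}".
  const start = output.indexOf("{");
  const end = output.lastIndexOf("}");
  if (start !== -1 && end > start) {
    try {
      return JSON.parse(output.slice(start, end + 1));
    } catch {
      // Fall through to null.
    }
  }
  return null;
}
```

A fixture with pretty-printed (multi-line) JSON surrounded by console logs exercises the fallback path, since no single line parses on its own.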
|
|
102
|
+
|
|
103
|
+
---
|
|
104
|
+
|
|
105
|
+
### User Story 5 — PTY-Based Interactive E2E Tests (Priority: P3)
|
|
106
|
+
|
|
107
|
+
As a developer, I want PTY-based E2E tests for the `sofia dev` command that verify interactive behavior (Ctrl+C handling, spinner display, progress output), so that the user's terminal experience is validated in CI.
|
|
108
|
+
|
|
109
|
+
**Why this priority**: Interactive behavior bugs (hanging spinners, swallowed Ctrl+C, garbled progress output) are invisible to the current E2E tests which use function calls. This is a quality-of-life improvement for developers but doesn't block production functionality.
|
|
110
|
+
|
|
111
|
+
**Independent Test**: Run PTY-based tests that spawn `sofia dev` as a subprocess, send Ctrl+C, and verify the process exits cleanly with the expected recovery message.
|
|
112
|
+
|
|
113
|
+
**Acceptance Scenarios**:
|
|
114
|
+
|
|
115
|
+
1. **Given** a PTY-spawned `sofia dev` process, **When** Ctrl+C is sent during an iteration, **Then** the process exits with the recovery message and a zero exit code.
|
|
116
|
+
2. **Given** a PTY-spawned `sofia dev` process, **When** the Ralph Loop progresses through iterations, **Then** the terminal displays iteration progress (e.g., "Iteration 2/10: Running tests…") readable from the PTY output buffer.
|
|
117
|
+
|
|
118
|
+
---
|
|
119
|
+
|
|
120
|
+
### User Story 6 — Workshop-to-Dev Transition Clarity (Priority: P3)
|
|
121
|
+
|
|
122
|
+
As a facilitator completing the Plan phase in `sofia workshop`, I want a clear indication of how to proceed to PoC development — whether via an automatic transition or explicit guidance to run `sofia dev` — so that the workflow feels intentional rather than abandoned after planning.
|
|
123
|
+
|
|
124
|
+
**Why this priority**: The current boundary prompt in the workshop only captures PoC intent without invoking the Ralph Loop. Users may not realize they need to run a separate command. However, the two-command workflow may be intentional for separation of concerns.
|
|
125
|
+
|
|
126
|
+
**Independent Test**: Complete all workshop phases through Plan, verify the workshop provides clear next-step guidance including the exact `sofia dev` command to run with the session ID.
|
|
127
|
+
|
|
128
|
+
**Acceptance Scenarios**:
|
|
129
|
+
|
|
130
|
+
1. **Given** a workshop session completing the Plan phase, **When** the plan is finalized, **Then** the workshop displays the exact `sofia dev --session <id>` command to run next, along with a brief explanation of what it does.
|
|
131
|
+
2. **Given** a workshop session completing the Plan phase, **When** the user is in interactive mode, **Then** the workshop offers: (a) automatically start development, or (b) save the session and exit with the `sofia dev` command displayed.
|
|
132
|
+
|
|
133
|
+
---
|
|
134
|
+
|
|
135
|
+
### Edge Cases
|
|
136
|
+
|
|
137
|
+
- What if the output directory exists but has been manually modified? Resume should detect file integrity via the `.sofia-metadata.json` marker and warn if unexpected changes are found.
|
|
138
|
+
- What if `poc.iterations` is corrupted or has invalid entries? The resume logic should validate iteration data and fall back to starting fresh if integrity checks fail.
|
|
139
|
+
- What if the user interrupts during npm install on a resumed session? The system should handle a partial `node_modules` gracefully. Per the 2026-03-01 clarification and FR-003, npm install is always re-run on resume, so an incomplete install is repaired rather than detected.
|
|
140
|
+
- How should the template registry handle unknown plan architectures? Fall back to the default `node-ts-vitest` template with a logged warning.
|
|
141
|
+
- What if a PTY E2E test environment doesn't support PTY allocation (e.g., some CI runners)? Tests must skip gracefully with a clear skip message.
|
|
142
|
+
|
|
143
|
+
## Requirements _(mandatory)_
|
|
144
|
+
|
|
145
|
+
### Functional Requirements
|
|
146
|
+
|
|
147
|
+
#### Resume/Checkpoint (GAP-006)
|
|
148
|
+
|
|
149
|
+
- **FR-001**: `RalphLoop.run()` MUST check `session.poc.iterations` at startup. If iterations exist and `session.poc.finalStatus` is unset, it MUST resume from the next iteration number rather than starting from scratch.
|
|
150
|
+
- **FR-001a**: When resuming, if the last recorded iteration has no test results (indicating it was interrupted mid-execution), the system MUST re-run that iteration from the last known-good state. Only fully completed iterations (with test results recorded) are considered done.
|
|
151
|
+
- **FR-002**: When resuming, the Ralph Loop MUST skip scaffolding if the output directory exists and contains a valid `.sofia-metadata.json` marker.
|
|
152
|
+
- **FR-003**: When resuming, the Ralph Loop MUST always re-run the dependency installation step (e.g., `npm install`). This is idempotent when dependencies haven't changed and avoids stale dependency issues when a prior iteration added packages before an interruption.
|
|
153
|
+
- **FR-004**: When resuming, the Ralph Loop MUST include prior iteration history (test results, applied changes) in the LLM prompt context so the model understands what has already been tried.
|
|
154
|
+
- **FR-005**: If `session.poc.finalStatus` is `'success'`, the CLI MUST display a completion message and exit without invoking the Ralph Loop.
|
|
155
|
+
- **FR-006**: If `session.poc.finalStatus` is `'failed'` or `'partial'`, the CLI MUST default to resuming from the last iteration and allow the user to override with `--force`.
|
|
156
|
+
- **FR-007**: If the output directory is missing but iterations exist in the session, the system MUST re-scaffold (using the original plan context) and resume iteration numbering from where it left off.
|
|
157
|
+
- **FR-007a**: All resume decisions MUST be logged at info level (visible by default), including: which iteration is being resumed from, whether an incomplete iteration is being re-run, whether scaffolding is being skipped, and that npm install is being re-run. These messages MUST be visible to the user without requiring `--debug`.
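The startup rules in FR-001 through FR-007a can be read as a single decision function. The sketch below is one possible encoding, assuming the caller has already counted completed iterations and validated the metadata marker; `StartupState`, `StartupDecision`, and `decideStartup` are hypothetical names, and the log strings are placeholders rather than the CLI's real messages.

```typescript
type FinalStatus = "success" | "failed" | "partial";

// Hypothetical inputs; a real handler would derive these from the session and disk.
interface StartupState {
  force: boolean;
  completedIterations: number; // iterations with test results recorded
  finalStatus?: FinalStatus;
  hasValidMetadata: boolean; // output dir exists with a valid .sofia-metadata.json
}

interface StartupDecision {
  action: "exit-complete" | "fresh" | "resume";
  scaffold: boolean;
  install: boolean;
  startIteration: number;
  log: string[]; // info-level messages (FR-007a)
}

function decideStartup(s: StartupState): StartupDecision {
  if (s.force) {
    // --force overrides any prior status, including success.
    return { action: "fresh", scaffold: true, install: true, startIteration: 1,
      log: ["--force: clearing PoC state, starting from iteration 1"] };
  }
  if (s.finalStatus === "success") {
    // FR-005: nothing to run.
    return { action: "exit-complete", scaffold: false, install: false, startIteration: 0,
      log: ["PoC already complete, nothing to do"] };
  }
  if (s.completedIterations === 0) {
    return { action: "fresh", scaffold: true, install: true, startIteration: 1, log: [] };
  }
  const next = s.completedIterations + 1;
  return {
    action: "resume",
    scaffold: !s.hasValidMetadata, // FR-002 skip, FR-007 re-scaffold
    install: true, // FR-003: always re-run
    startIteration: next,
    log: [
      `Resuming from iteration ${next}`,
      s.hasValidMetadata ? "Skipping scaffold (metadata marker found)" : "Output directory missing, re-scaffolding",
      "Re-running dependency install",
    ],
  };
}
```

Keeping the decision pure (no I/O) would make each requirement testable in isolation, with the command handler responsible only for gathering the inputs and acting on the result.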
|
|
158
|
+
|
|
159
|
+
#### `--force` Flag (GAP-007)
|
|
160
|
+
|
|
161
|
+
- **FR-008**: When `--force` is set, the command handler MUST delete the existing output directory AND reset `session.poc.iterations` to an empty array AND clear `session.poc.finalStatus`.
|
|
162
|
+
- **FR-009**: After a force-reset, the Ralph Loop MUST start fresh from iteration 1 as if the session had never been developed.
|
|
163
|
+
- **FR-010**: The `--force` flag MUST work regardless of the current `poc.finalStatus` value (including `'success'`).
|
|
164
|
+
|
|
165
|
+
#### Template Registry (GAP-009)
|
|
166
|
+
|
|
167
|
+
- **FR-011**: The scaffolder MUST use a template registry that maps plan characteristics (language, framework) to scaffold templates.
|
|
168
|
+
- **FR-012**: The template registry MUST include at least two templates: `node-ts-vitest` (TypeScript/Node.js/Vitest, current default) and `python-pytest` (Python/pytest).
|
|
169
|
+
- **FR-013**: Template selection MUST be automatic based on the plan's `architectureNotes` and `dependencies`, with `node-ts-vitest` as the fallback default.
|
|
170
|
+
- **FR-014**: Each template MUST define: file list, `TechStack` configuration (language, runtime, test runner command, build command, dependency install command), and test execution command. Both the install and test commands are part of the template — the test runner MUST NOT auto-detect them.
|
|
171
|
+
- **FR-015**: Adding a new template MUST only require adding a registry entry — no changes to `PocScaffolder`'s core logic or `RalphLoop`'s iteration logic.
|
|
172
|
+
|
|
173
|
+
#### TestRunner Coverage (GAP-008)
|
|
174
|
+
|
|
175
|
+
- **FR-016**: Integration tests MUST cover the `spawnTests()` method including: successful test execution, timeout handling (SIGTERM then SIGKILL), and stderr collection.
|
|
176
|
+
- **FR-017**: Integration tests MUST cover the `extractJson()` fallback path where line-by-line parsing fails and the first-`{`-to-last-`}` slice is used.
|
|
177
|
+
- **FR-018**: Integration tests MUST cover the `buildErrorResult()` timeout path.
|
|
178
|
+
- **FR-019**: TestRunner integration tests MUST use a real test fixture project (minimal Vitest/pytest project) rather than mocking the subprocess.
|
|
179
|
+
|
|
180
|
+
#### Workshop→Dev Transition (GAP-011)
|
|
181
|
+
|
|
182
|
+
- **FR-020**: When the workshop Plan phase completes, the system MUST display the exact `sofia dev --session <id>` command needed to start PoC development.
|
|
183
|
+
- **FR-021**: In interactive mode, the workshop SHOULD offer to automatically transition to the development phase.
|
|
184
|
+
|
|
185
|
+
#### Scaffold TODO Tracking (GAP-009, P3)
|
|
186
|
+
|
|
187
|
+
- **FR-022**: Generated scaffold files containing intentional TODO markers MUST be tracked via `.sofia-metadata.json` so that the Ralph Loop can report how many TODOs remain at the end of each iteration.
|
|
188
|
+
|
|
189
|
+
### Key Entities
|
|
190
|
+
|
|
191
|
+
- **CheckpointState**: Represents the resume context derived from existing `session.poc`. Includes the last iteration number, whether scaffolding can be skipped (dependency install always re-runs per FR-003), and prior iteration history for LLM context.
|
|
192
|
+
- **TemplateRegistry**: Maps plan characteristics to scaffold templates. Contains named template entries with file lists, tech stack configuration, and installation commands.
|
|
193
|
+
- **TemplateEntry**: A single scaffold template definition — includes template name (e.g., `node-ts-vitest`, `python-pytest`), file generators, TechStack shape, and install command.
|
|
194
|
+
- **TestFixtureProject**: A minimal project used by testRunner integration tests — contains a `package.json`, a passing test, a failing test, and a hanging test for timeout validation.
|
|
195
|
+
|
|
196
|
+
## Success Criteria _(mandatory)_
|
|
197
|
+
|
|
198
|
+
### Measurable Outcomes
|
|
199
|
+
|
|
200
|
+
- **SC-004-001**: A `sofia dev --session X` run after an interruption resumes from the correct iteration number (e.g., iteration 3 if 2 were completed), measured by verifying the iteration counter in session state and the absence of scaffolding logs.
|
|
201
|
+
- **SC-004-002**: `sofia dev --session X --force` resets both the output directory and session `poc.iterations`/`poc.finalStatus`, measured by verifying empty iteration state after force.
|
|
202
|
+
- **SC-004-003**: The scaffolder produces a valid Python/pytest project when the plan specifies Python/FastAPI, measured by generating `requirements.txt`, `main.py`, and `pytest`-based tests that pass basic syntax validation.
|
|
203
|
+
- **SC-004-004**: `testRunner.ts` test coverage increases from 45% to at least 80%, measured by the coverage report.
|
|
204
|
+
- **SC-004-005**: A resumed Ralph Loop session reaches the same or better PoC quality (pass rate) as a fresh run, measured by comparing test pass counts between resumed and fresh runs on the same plan.
|
|
205
|
+
- **SC-004-006**: The workshop displays actionable next-step guidance (including the exact command) when the Plan phase completes, measured by verifying the output contains the session ID and `sofia dev` command.
|
|
206
|
+
- **SC-004-007**: Resume detection adds less than 500ms overhead to the `sofia dev` startup time, measured by comparing startup times with and without existing iterations.
|
|
207
|
+
|
|
208
|
+
## Assumptions
|
|
209
|
+
|
|
210
|
+
- Feature 002 session schema (`poc.iterations`, `poc.finalStatus`) is stable and does not require migration for resume support — only reading existing fields that are currently written but never read back.
|
|
211
|
+
- The `.sofia-metadata.json` file written by the scaffolder is a reliable marker for detecting existing scaffold output.
|
|
212
|
+
- npm install is always re-run on resume since it's idempotent (fast no-op when dependencies match) and avoids hard-to-diagnose stale dependency issues from interrupted iterations.
|
|
213
|
+
- Python/FastAPI is the highest-value second template based on user demand and workshop feedback.
|
|
214
|
+
- PTY allocation is available in the CI environment for E2E tests; tests skip gracefully if PTY is unavailable.
|
|
215
|
+
- The two-command workflow (`workshop` then `dev`) is the intentional default; auto-transition in interactive mode is optional behavior.
|
|
216
|
+
|
|
217
|
+
## Dependencies
|
|
218
|
+
|
|
219
|
+
- **Feature 001**: Session model, workshop phases
|
|
220
|
+
- **Feature 002**: Ralph Loop, `--force` CLI option, testRunner, PocScaffolder, session schemas
|
|
221
|
+
- **Feature 003**: MCP transport layer (resume should work with both stub and real MCP; template registry should support templates for MCP-enabled vs local-only PoCs)
|