sofia-cli 0.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.github/agents/copilot-instructions.md +39 -0
- package/.github/agents/speckit.analyze.agent.md +184 -0
- package/.github/agents/speckit.checklist.agent.md +294 -0
- package/.github/agents/speckit.clarify.agent.md +181 -0
- package/.github/agents/speckit.constitution.agent.md +84 -0
- package/.github/agents/speckit.implement.agent.md +135 -0
- package/.github/agents/speckit.plan.agent.md +90 -0
- package/.github/agents/speckit.specify.agent.md +258 -0
- package/.github/agents/speckit.tasks.agent.md +137 -0
- package/.github/agents/speckit.taskstoissues.agent.md +30 -0
- package/.github/copilot-instructions.md +257 -0
- package/.github/prompts/speckit.analyze.prompt.md +3 -0
- package/.github/prompts/speckit.checklist.prompt.md +3 -0
- package/.github/prompts/speckit.clarify.prompt.md +3 -0
- package/.github/prompts/speckit.constitution.prompt.md +3 -0
- package/.github/prompts/speckit.implement.prompt.md +3 -0
- package/.github/prompts/speckit.plan.prompt.md +3 -0
- package/.github/prompts/speckit.specify.prompt.md +3 -0
- package/.github/prompts/speckit.tasks.prompt.md +3 -0
- package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
- package/.github/workflows/ci.yml +38 -0
- package/.prettierrc +6 -0
- package/.specify/memory/constitution.md +181 -0
- package/.specify/scripts/bash/check-prerequisites.sh +166 -0
- package/.specify/scripts/bash/common.sh +156 -0
- package/.specify/scripts/bash/create-new-feature.sh +297 -0
- package/.specify/scripts/bash/setup-plan.sh +61 -0
- package/.specify/scripts/bash/update-agent-context.sh +810 -0
- package/.specify/templates/agent-file-template.md +28 -0
- package/.specify/templates/checklist-template.md +40 -0
- package/.specify/templates/constitution-template.md +50 -0
- package/.specify/templates/plan-template.md +113 -0
- package/.specify/templates/spec-template.md +115 -0
- package/.specify/templates/tasks-template.md +251 -0
- package/.vscode/mcp.json +42 -0
- package/.vscode/settings.json +19 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/LICENSE +21 -0
- package/README.md +213 -0
- package/dist/src/cli/developCommand.js +240 -0
- package/dist/src/cli/directCommands.js +143 -0
- package/dist/src/cli/envLoader.js +16 -0
- package/dist/src/cli/exportCommand.js +53 -0
- package/dist/src/cli/index.js +203 -0
- package/dist/src/cli/ioContext.js +109 -0
- package/dist/src/cli/preflight.js +57 -0
- package/dist/src/cli/statusCommand.js +110 -0
- package/dist/src/cli/workshopCommand.js +400 -0
- package/dist/src/develop/checkpointState.js +86 -0
- package/dist/src/develop/codeGenerator.js +319 -0
- package/dist/src/develop/dynamicScaffolder.js +226 -0
- package/dist/src/develop/githubMcpAdapter.js +122 -0
- package/dist/src/develop/index.js +15 -0
- package/dist/src/develop/mcpContextEnricher.js +195 -0
- package/dist/src/develop/pocScaffolder.js +542 -0
- package/dist/src/develop/ralphLoop.js +659 -0
- package/dist/src/develop/templateRegistry.js +364 -0
- package/dist/src/develop/testRunner.js +202 -0
- package/dist/src/logging/logger.js +58 -0
- package/dist/src/loop/conversationLoop.js +227 -0
- package/dist/src/loop/phaseSummarizer.js +87 -0
- package/dist/src/mcp/mcpManager.js +267 -0
- package/dist/src/mcp/mcpTransport.js +391 -0
- package/dist/src/mcp/retryPolicy.js +47 -0
- package/dist/src/mcp/webSearch.js +254 -0
- package/dist/src/phases/contextSummarizer.js +101 -0
- package/dist/src/phases/discoveryEnricher.js +156 -0
- package/dist/src/phases/phaseExtractors.js +222 -0
- package/dist/src/phases/phaseHandlers.js +328 -0
- package/dist/src/prompts/design.md +51 -0
- package/dist/src/prompts/develop-boundary.md +51 -0
- package/dist/src/prompts/develop.md +111 -0
- package/dist/src/prompts/discover.md +58 -0
- package/dist/src/prompts/ideate.md +56 -0
- package/dist/src/prompts/plan.md +51 -0
- package/dist/src/prompts/promptLoader.js +167 -0
- package/dist/src/prompts/promptLoader.ts +198 -0
- package/dist/src/prompts/select.md +47 -0
- package/dist/src/prompts/summarize/README.md +8 -0
- package/dist/src/prompts/summarize/design-summary.md +37 -0
- package/dist/src/prompts/summarize/develop-summary.md +25 -0
- package/dist/src/prompts/summarize/ideate-summary.md +27 -0
- package/dist/src/prompts/summarize/plan-summary.md +27 -0
- package/dist/src/prompts/summarize/select-summary.md +21 -0
- package/dist/src/prompts/system.md +28 -0
- package/dist/src/sessions/exportPaths.js +22 -0
- package/dist/src/sessions/exportWriter.js +406 -0
- package/dist/src/sessions/sessionManager.js +81 -0
- package/dist/src/sessions/sessionStore.js +65 -0
- package/dist/src/shared/activitySpinner.js +91 -0
- package/dist/src/shared/copilotClient.js +129 -0
- package/dist/src/shared/data/cards.json +1249 -0
- package/dist/src/shared/data/cardsLoader.js +51 -0
- package/dist/src/shared/errorClassifier.js +120 -0
- package/dist/src/shared/events.js +28 -0
- package/dist/src/shared/markdownRenderer.js +34 -0
- package/dist/src/shared/schemas/session.js +265 -0
- package/dist/src/shared/tableRenderer.js +20 -0
- package/dist/src/vendor/chalk.js +2 -0
- package/dist/src/vendor/cli-table3.js +3 -0
- package/dist/src/vendor/commander.js +2 -0
- package/dist/src/vendor/marked-terminal.js +3 -0
- package/dist/src/vendor/marked.js +2 -0
- package/dist/src/vendor/ora.js +2 -0
- package/dist/src/vendor/pino.js +2 -0
- package/dist/src/vendor/zod.js +2 -0
- package/dist/tests/e2e/developE2e.spec.js +126 -0
- package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
- package/dist/tests/e2e/developPty.spec.js +75 -0
- package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
- package/dist/tests/e2e/harness.spec.js +83 -0
- package/dist/tests/e2e/mcpLive.spec.js +120 -0
- package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
- package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
- package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
- package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
- package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
- package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
- package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
- package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
- package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
- package/dist/tests/integration/autoStartConversation.spec.js +138 -0
- package/dist/tests/integration/defaultCommand.spec.js +147 -0
- package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
- package/dist/tests/integration/directCommandTty.spec.js +151 -0
- package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
- package/dist/tests/integration/exportArtifacts.spec.js +202 -0
- package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
- package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
- package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
- package/dist/tests/integration/newSessionFlow.spec.js +343 -0
- package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
- package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
- package/dist/tests/integration/pocScaffold.spec.js +163 -0
- package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
- package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
- package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
- package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
- package/dist/tests/integration/summarizationFlow.spec.js +115 -0
- package/dist/tests/integration/testRunnerReal.spec.js +52 -0
- package/dist/tests/integration/webSearchAgent.spec.js +128 -0
- package/dist/tests/live/copilotSdkLive.spec.js +107 -0
- package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
- package/dist/tests/setup/loadEnv.js +3 -0
- package/dist/tests/unit/cli/developCommand.spec.js +567 -0
- package/dist/tests/unit/cli/directCommands.spec.js +279 -0
- package/dist/tests/unit/cli/envLoader.spec.js +58 -0
- package/dist/tests/unit/cli/ioContext.spec.js +119 -0
- package/dist/tests/unit/cli/preflight.spec.js +108 -0
- package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
- package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
- package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
- package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
- package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
- package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
- package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
- package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
- package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
- package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
- package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
- package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
- package/dist/tests/unit/develop/testRunner.spec.js +249 -0
- package/dist/tests/unit/infraBicep.spec.js +92 -0
- package/dist/tests/unit/infraDeploy.spec.js +82 -0
- package/dist/tests/unit/infraTeardown.spec.js +63 -0
- package/dist/tests/unit/logging/logger.spec.js +43 -0
- package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
- package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
- package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
- package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
- package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
- package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
- package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
- package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
- package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
- package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
- package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
- package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
- package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
- package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
- package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
- package/dist/tests/unit/schemas/session.spec.js +257 -0
- package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
- package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
- package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
- package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
- package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
- package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
- package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
- package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
- package/dist/tests/unit/shared/events.spec.js +55 -0
- package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
- package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
- package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
- package/dist/vitest.config.js +14 -0
- package/dist/vitest.live.config.js +18 -0
- package/docs/README.md +35 -0
- package/docs/architecture.md +169 -0
- package/docs/cli-usage.md +207 -0
- package/docs/environment.md +66 -0
- package/docs/export-format.md +146 -0
- package/docs/session-model.md +113 -0
- package/eslint.config.js +35 -0
- package/infra/deploy.sh +193 -0
- package/infra/gather-env.sh +211 -0
- package/infra/main.bicep +90 -0
- package/infra/main.bicepparam +18 -0
- package/infra/resources.bicep +134 -0
- package/infra/teardown.sh +114 -0
- package/package.json +63 -0
- package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
- package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
- package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
- package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
- package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
- package/specs/001-cli-workshop-rebuild/plan.md +361 -0
- package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
- package/specs/001-cli-workshop-rebuild/research.md +116 -0
- package/specs/001-cli-workshop-rebuild/spec.md +240 -0
- package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
- package/specs/002-poc-generation/contracts/poc-output.md +172 -0
- package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
- package/specs/002-poc-generation/data-model.md +172 -0
- package/specs/002-poc-generation/plan.md +109 -0
- package/specs/002-poc-generation/quickstart.md +97 -0
- package/specs/002-poc-generation/research.md +786 -0
- package/specs/002-poc-generation/spec.md +81 -0
- package/specs/002-poc-generation/tasks-fix.md +198 -0
- package/specs/002-poc-generation/tasks.md +252 -0
- package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
- package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
- package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
- package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
- package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
- package/specs/003-mcp-transport-integration/data-model.md +326 -0
- package/specs/003-mcp-transport-integration/plan.md +114 -0
- package/specs/003-mcp-transport-integration/quickstart.md +311 -0
- package/specs/003-mcp-transport-integration/research.md +395 -0
- package/specs/003-mcp-transport-integration/spec.md +234 -0
- package/specs/003-mcp-transport-integration/tasks.md +324 -0
- package/specs/003-next-spec-gaps.md +150 -0
- package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
- package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
- package/specs/004-dev-resume-hardening/data-model.md +321 -0
- package/specs/004-dev-resume-hardening/plan.md +107 -0
- package/specs/004-dev-resume-hardening/quickstart.md +115 -0
- package/specs/004-dev-resume-hardening/research.md +142 -0
- package/specs/004-dev-resume-hardening/spec.md +221 -0
- package/specs/004-dev-resume-hardening/tasks.md +333 -0
- package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
- package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
- package/specs/005-ai-search-deploy/data-model.md +130 -0
- package/specs/005-ai-search-deploy/plan.md +93 -0
- package/specs/005-ai-search-deploy/quickstart.md +96 -0
- package/specs/005-ai-search-deploy/research.md +187 -0
- package/specs/005-ai-search-deploy/spec.md +143 -0
- package/specs/005-ai-search-deploy/tasks.md +284 -0
- package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
- package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
- package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
- package/specs/006-workshop-extraction-fixes/plan.md +123 -0
- package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
- package/specs/006-workshop-extraction-fixes/research.md +143 -0
- package/specs/006-workshop-extraction-fixes/spec.md +210 -0
- package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
- package/src/cli/developCommand.ts +308 -0
- package/src/cli/directCommands.ts +195 -0
- package/src/cli/envLoader.ts +17 -0
- package/src/cli/exportCommand.ts +65 -0
- package/src/cli/index.ts +249 -0
- package/src/cli/ioContext.ts +139 -0
- package/src/cli/preflight.ts +86 -0
- package/src/cli/statusCommand.ts +118 -0
- package/src/cli/workshopCommand.ts +496 -0
- package/src/develop/checkpointState.ts +121 -0
- package/src/develop/codeGenerator.ts +402 -0
- package/src/develop/dynamicScaffolder.ts +284 -0
- package/src/develop/githubMcpAdapter.ts +199 -0
- package/src/develop/index.ts +34 -0
- package/src/develop/mcpContextEnricher.ts +279 -0
- package/src/develop/pocScaffolder.ts +646 -0
- package/src/develop/ralphLoop.ts +1044 -0
- package/src/develop/templateRegistry.ts +427 -0
- package/src/develop/testRunner.ts +276 -0
- package/src/logging/logger.ts +73 -0
- package/src/loop/conversationLoop.ts +355 -0
- package/src/loop/phaseSummarizer.ts +114 -0
- package/src/mcp/mcpManager.ts +365 -0
- package/src/mcp/mcpTransport.ts +562 -0
- package/src/mcp/retryPolicy.ts +87 -0
- package/src/mcp/webSearch.ts +388 -0
- package/src/originalPrompts/design_thinking.md +178 -0
- package/src/originalPrompts/design_thinking_persona.md +76 -0
- package/src/originalPrompts/document_generator_example.md +77 -0
- package/src/originalPrompts/document_generator_persona.md +47 -0
- package/src/originalPrompts/facilitator_persona.md +125 -0
- package/src/originalPrompts/guardrails.md +47 -0
- package/src/phases/contextSummarizer.ts +154 -0
- package/src/phases/discoveryEnricher.ts +223 -0
- package/src/phases/phaseExtractors.ts +247 -0
- package/src/phases/phaseHandlers.ts +450 -0
- package/src/prompts/design.md +51 -0
- package/src/prompts/develop-boundary.md +51 -0
- package/src/prompts/develop.md +111 -0
- package/src/prompts/discover.md +58 -0
- package/src/prompts/ideate.md +56 -0
- package/src/prompts/plan.md +51 -0
- package/src/prompts/promptLoader.ts +198 -0
- package/src/prompts/select.md +47 -0
- package/src/prompts/summarize/README.md +8 -0
- package/src/prompts/summarize/design-summary.md +37 -0
- package/src/prompts/summarize/develop-summary.md +25 -0
- package/src/prompts/summarize/ideate-summary.md +27 -0
- package/src/prompts/summarize/plan-summary.md +27 -0
- package/src/prompts/summarize/select-summary.md +21 -0
- package/src/prompts/system.md +28 -0
- package/src/sessions/exportPaths.ts +28 -0
- package/src/sessions/exportWriter.ts +490 -0
- package/src/sessions/sessionManager.ts +119 -0
- package/src/sessions/sessionStore.ts +69 -0
- package/src/shared/activitySpinner.ts +108 -0
- package/src/shared/copilotClient.ts +291 -0
- package/src/shared/data/cards.json +1249 -0
- package/src/shared/data/cardsLoader.ts +70 -0
- package/src/shared/errorClassifier.ts +160 -0
- package/src/shared/events.ts +103 -0
- package/src/shared/markdownRenderer.ts +44 -0
- package/src/shared/schemas/session.ts +346 -0
- package/src/shared/tableRenderer.ts +28 -0
- package/src/types/marked-terminal.d.ts +5 -0
- package/src/vendor/chalk.ts +2 -0
- package/src/vendor/cli-table3.ts +3 -0
- package/src/vendor/commander.ts +2 -0
- package/src/vendor/marked-terminal.ts +3 -0
- package/src/vendor/marked.ts +2 -0
- package/src/vendor/ora.ts +2 -0
- package/src/vendor/pino.ts +3 -0
- package/src/vendor/zod.ts +3 -0
- package/tests/e2e/developE2e.spec.ts +152 -0
- package/tests/e2e/developFailureE2e.spec.ts +289 -0
- package/tests/e2e/developPty.spec.ts +86 -0
- package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
- package/tests/e2e/harness.spec.ts +104 -0
- package/tests/e2e/mcpLive.spec.ts +149 -0
- package/tests/e2e/newSession.e2e.spec.ts +245 -0
- package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
- package/tests/e2e/workiqEnrichment.spec.ts +72 -0
- package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
- package/tests/e2e/zava-assessment/company-profile.md +98 -0
- package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
- package/tests/e2e/zavaSimulation.spec.ts +511 -0
- package/tests/fixtures/completedSession.json +141 -0
- package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
- package/tests/fixtures/test-fixture-project/package.json +12 -0
- package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
- package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
- package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
- package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
- package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
- package/tests/integration/autoStartConversation.spec.ts +168 -0
- package/tests/integration/defaultCommand.spec.ts +179 -0
- package/tests/integration/directCommandNonTty.spec.ts +260 -0
- package/tests/integration/directCommandTty.spec.ts +185 -0
- package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
- package/tests/integration/exportArtifacts.spec.ts +232 -0
- package/tests/integration/exportFallbackFlow.spec.ts +115 -0
- package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
- package/tests/integration/mcpTransportFlow.spec.ts +178 -0
- package/tests/integration/newSessionFlow.spec.ts +406 -0
- package/tests/integration/pocGithubMcp.spec.ts +224 -0
- package/tests/integration/pocLocalFallback.spec.ts +205 -0
- package/tests/integration/pocScaffold.spec.ts +220 -0
- package/tests/integration/ralphLoopFlow.spec.ts +430 -0
- package/tests/integration/ralphLoopPartial.spec.ts +416 -0
- package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
- package/tests/integration/spinnerLifecycle.spec.ts +270 -0
- package/tests/integration/summarizationFlow.spec.ts +135 -0
- package/tests/integration/testRunnerReal.spec.ts +63 -0
- package/tests/integration/webSearchAgent.spec.ts +155 -0
- package/tests/live/copilotSdkLive.spec.ts +149 -0
- package/tests/live/zavaFullWorkshop.spec.ts +515 -0
- package/tests/setup/loadEnv.ts +5 -0
- package/tests/unit/cli/developCommand.spec.ts +679 -0
- package/tests/unit/cli/directCommands.spec.ts +325 -0
- package/tests/unit/cli/envLoader.spec.ts +73 -0
- package/tests/unit/cli/ioContext.spec.ts +148 -0
- package/tests/unit/cli/preflight.spec.ts +125 -0
- package/tests/unit/cli/statusCommand.spec.ts +134 -0
- package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
- package/tests/unit/cli/workshopCommand.spec.ts +378 -0
- package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
- package/tests/unit/develop/checkpointState.spec.ts +378 -0
- package/tests/unit/develop/codeGenerator.spec.ts +447 -0
- package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
- package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
- package/tests/unit/develop/outputValidator.spec.ts +134 -0
- package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
- package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
- package/tests/unit/develop/templateRegistry.spec.ts +106 -0
- package/tests/unit/develop/testRunner.spec.ts +294 -0
- package/tests/unit/infraBicep.spec.ts +116 -0
- package/tests/unit/infraDeploy.spec.ts +102 -0
- package/tests/unit/infraTeardown.spec.ts +77 -0
- package/tests/unit/logging/logger.spec.ts +50 -0
- package/tests/unit/loop/conversationLoop.spec.ts +719 -0
- package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
- package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
- package/tests/unit/mcp/mcpManager.spec.ts +336 -0
- package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
- package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
- package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
- package/tests/unit/mcp/webSearch.spec.ts +718 -0
- package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
- package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
- package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
- package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
- package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
- package/tests/unit/prompts/promptLoader.spec.ts +144 -0
- package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
- package/tests/unit/schemas/session.spec.ts +328 -0
- package/tests/unit/sessions/exportPaths.spec.ts +38 -0
- package/tests/unit/sessions/exportWriter.spec.ts +737 -0
- package/tests/unit/sessions/sessionManager.spec.ts +174 -0
- package/tests/unit/sessions/sessionStore.spec.ts +136 -0
- package/tests/unit/shared/activitySpinner.spec.ts +211 -0
- package/tests/unit/shared/cardsLoader.spec.ts +89 -0
- package/tests/unit/shared/copilotClient.spec.ts +185 -0
- package/tests/unit/shared/errorClassifier.spec.ts +152 -0
- package/tests/unit/shared/events.spec.ts +71 -0
- package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
- package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
- package/tests/unit/shared/tableRenderer.spec.ts +38 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +15 -0
- package/vitest.live.config.ts +19 -0
@@ -0,0 +1,454 @@
# Zava Industries — Expected Results & Assessment Checklist

This document defines what each phase of the sofIA workshop should produce when run with the Zava Industries company profile and agent interaction script. After the test run, each check should be marked as PASS, FAIL, or PARTIAL with notes.

---

## Test Metadata

| Field                     | Value                                                            |
| ------------------------- | ---------------------------------------------------------------- |
| **Test Date**             | _(to be filled)_                                                 |
| **sofIA Version**         | 0.1.0                                                            |
| **Node.js Version**       | _(to be filled)_                                                 |
| **Environment**           | _(local / CI)_                                                   |
| **Copilot SDK Token**     | _(configured / missing)_                                         |
| **MCP Servers Available** | _(list which ones: github, context7, azure, workiq, playwright)_ |
| **Web Search (Foundry)**  | _(configured / not configured — FOUNDRY_PROJECT_ENDPOINT set?)_  |
| **WorkIQ**                | _(configured / not configured — EULA accepted?)_                 |
| **Overall Result**        | _(PASS / FAIL / PARTIAL)_                                        |
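
The **Overall Result** field has to be derived from the individual check results somehow. The checklist does not say how, so the sketch below assumes one plausible roll-up rule: any FAIL gives FAIL, otherwise any PARTIAL gives PARTIAL, otherwise PASS. The `CheckEntry` type and the `overallResult` function are hypothetical names for illustration and are not part of sofIA.

```typescript
// Hypothetical types: the checklist only names the three outcomes.
type CheckResult = "PASS" | "FAIL" | "PARTIAL";

interface CheckEntry {
  id: string; // check identifier from the tables, e.g. "0.1" or "2.T5"
  result: CheckResult;
  notes?: string;
}

// Assumed roll-up rule (not stated in the checklist):
// any FAIL -> FAIL; otherwise any PARTIAL -> PARTIAL; otherwise PASS.
// An empty list therefore yields PASS, which a real run may want to reject.
function overallResult(entries: CheckEntry[]): CheckResult {
  if (entries.some((e) => e.result === "FAIL")) return "FAIL";
  if (entries.some((e) => e.result === "PARTIAL")) return "PARTIAL";
  return "PASS";
}
```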

---

## 0. Pre-flight & Environment

### Spec References: FR-051, FR-017 (005 spec)

| #   | Check                    | Expected Behavior                                                                              | Result | Notes |
| --- | ------------------------ | ---------------------------------------------------------------------------------------------- | ------ | ----- |
| 0.1 | `.env` loaded at startup | sofIA loads `.env` without error; if missing, proceeds normally (FR-017 005)                   | ☐      |       |
| 0.2 | Pre-flight check         | sofIA validates Copilot connectivity before starting (FR-051)                                  | ☐      |       |
| 0.3 | MCP readiness            | Pre-flight reports which MCP servers are reachable                                             | ☐      |       |
| 0.4 | Web search configured    | `FOUNDRY_PROJECT_ENDPOINT` + `FOUNDRY_MODEL_DEPLOYMENT_NAME` detected (or absent with warning) | ☐      |       |
| 0.5 | Legacy env rejected      | If `SOFIA_FOUNDRY_AGENT_ENDPOINT` is set, a migration error is shown (FR-016 005)              | ☐      |       |
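
Checks 0.4 and 0.5 describe a three-way outcome for the web search environment: configured, absent with a warning, or rejected because the legacy variable is set. A minimal sketch of that decision follows. Only the three environment variable names come from the table; the `WebSearchStatus` type, the `checkWebSearchEnv` function, and the message texts are assumptions for illustration and do not reflect sofIA's actual pre-flight code.

```typescript
// Hypothetical result type; sofIA's real pre-flight may report this differently.
type WebSearchStatus =
  | { kind: "configured"; endpoint: string; deployment: string }
  | { kind: "not-configured"; warning: string }
  | { kind: "legacy-error"; message: string };

// Classifies the web search environment as described in checks 0.4 and 0.5.
function checkWebSearchEnv(env: Record<string, string | undefined>): WebSearchStatus {
  // Check 0.5: the legacy variable is rejected outright.
  if (env.SOFIA_FOUNDRY_AGENT_ENDPOINT) {
    return {
      kind: "legacy-error",
      message: "SOFIA_FOUNDRY_AGENT_ENDPOINT is no longer supported; migrate to FOUNDRY_PROJECT_ENDPOINT.",
    };
  }
  // Check 0.4: both variables must be present for web search to count as configured.
  const endpoint = env.FOUNDRY_PROJECT_ENDPOINT;
  const deployment = env.FOUNDRY_MODEL_DEPLOYMENT_NAME;
  if (endpoint && deployment) {
    return { kind: "configured", endpoint, deployment };
  }
  return { kind: "not-configured", warning: "Web search is not configured; continuing without it." };
}
```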

---

## 1. CLI Startup & Session Creation

### Spec References: FR-004, FR-009, FR-015a, FR-023, FR-023a

| #   | Check                       | Expected Behavior                                                                                               | Result | Notes |
| --- | --------------------------- | --------------------------------------------------------------------------------------------------------------- | ------ | ----- |
| 1.1 | CLI starts without errors   | `npm run start -- workshop --new-session` launches successfully                                                 | ☐      |       |
| 1.2 | Auto-start greeting         | LLM produces a greeting introducing the Discover phase and asks the first question without requiring user input | ☐      |       |
| 1.3 | Greeting timeout            | First token arrives within 10 seconds                                                                           | ☐      |       |
| 1.4 | Markdown rendering          | Streamed output is rendered as formatted markdown (not raw text) in TTY mode                                    | ☐      |       |
| 1.5 | Spinner display             | A "Thinking..." spinner is shown before the first token arrives                                                 | ☐      |       |
| 1.6 | Session created             | A session JSON file is created in `.sofia/sessions/`                                                            | ☐      |       |
| 1.7 | Session name auto-generated | After first Discover exchange, session gets a short auto-generated name (e.g., "Zava Trend Intelligence")       | ☐      |       |

---

## 2. Discover Phase (Steps 1–4)

### Spec References: FR-019, FR-020, FR-021, FR-022, FR-023, FR-009a

| #    | Check                       | Expected Behavior                                                                                   | Result | Notes |
| ---- | --------------------------- | --------------------------------------------------------------------------------------------------- | ------ | ----- |
| 2.1  | Business context collection | sofIA asks about the business and accepts the company description                                   | ☐      |       |
| 2.2  | Follow-up probes            | sofIA asks clarifying questions about team, process, or pain points                                 | ☐      |       |
| 2.3  | Web search offer            | sofIA offers to search the web for company/industry context (FR-021)                                | ☐      |       |
| 2.4  | Web search execution        | If web search is available, sofIA executes a search and reports results                             | ☐      |       |
| 2.5  | Web search degradation      | If web search is unavailable, sofIA continues gracefully (FR-022)                                   | ☐      |       |
| 2.6  | WorkIQ permission prompt    | sofIA asks the user for explicit permission before querying WorkIQ (FR-020)                         | ☐      |       |
| 2.7  | WorkIQ graceful skip        | When user declines WorkIQ (or WorkIQ unavailable), sofIA continues normally (FR-022)                | ☐      |       |
| 2.8  | Topic selection             | sofIA suggests focus areas and accepts user's choice of "Trend Intelligence and Signal Aggregation" | ☐      |       |
| 2.9  | Activity brainstorming      | sofIA helps brainstorm activities and accepts the list of 8 activities                              | ☐      |       |
| 2.10 | Workflow diagram            | sofIA produces a Mermaid diagram of the activity flow                                               | ☐      |       |
| 2.11 | Critical step voting        | sofIA accepts business/human value scores and key metrics                                           | ☐      |       |
| 2.12 | Phase summary               | sofIA produces a summary covering: business context, topic, activities, workflow, critical steps    | ☐      |       |
| 2.13 | JSON extraction             | `businessContext` is extracted and stored in session JSON                                           | ☐      |       |
| 2.14 | Session persistence         | Session is persisted after every user turn (FR-039a)                                                | ☐      |       |
| 2.15 | Decision gate               | After Discover, sofIA shows a decision gate with options (continue, refine, exit, etc.)             | ☐      |       |
| 2.16 | No auto-advance             | System does NOT auto-advance to Ideate (FR-018, FR-060)                                             | ☐      |       |
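
Checks 2.15 and 2.16 together describe a gate: the phase changes only on an explicit user decision. The sketch below models that behavior as a pure function. The phase names are taken from the prompt files in this package (`discover.md`, `ideate.md`, and so on), but their order here is an assumption, as are the `GateDecision` type and the `applyDecision` function. The gate's other options (the "etc." in 2.15) are not modeled.

```typescript
// Assumed phase order; only the names are visible in the package's prompt files.
const PHASES = ["discover", "ideate", "design", "select", "plan", "develop"] as const;
type Phase = (typeof PHASES)[number];

// Subset of the gate options named in check 2.15.
type GateDecision = "continue" | "refine" | "exit";

// Returns the phase to run next. Without an explicit "continue" the phase
// never changes, which is the "no auto-advance" property from check 2.16.
function applyDecision(current: Phase, decision: GateDecision | undefined): Phase | "exited" {
  if (decision === "exit") return "exited";
  if (decision !== "continue") return current; // "refine" or no decision yet: stay put
  const next = PHASES[PHASES.indexOf(current) + 1];
  return next ?? current; // the last phase has nothing to advance to
}
```

Under these assumptions, `applyDecision("discover", undefined)` stays on `discover`, and only `applyDecision("discover", "continue")` moves to `ideate`.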
|
|
75
|
+
|
|
76
|
+
### Discover — MCP Tool Invocation Audit
|
|
77
|
+
|
|
78
|
+
| # | Check | Expected Behavior | Result | Notes |
|
|
79
|
+
| ----- | -------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------ | ----------- |
|
|
80
|
+
| 2.T1 | `web.search` — company news query | Enricher searches for "Zava Industries recent news" (or similar) | ☐ | Query used: |
|
|
81
|
+
| 2.T2 | `web.search` — competitor query | Enricher searches for competitor/market context | ☐ | Query used: |
|
|
82
|
+
| 2.T3 | `web.search` — industry trends query | Enricher searches for fashion/AI industry trends | ☐ | Query used: |
|
|
83
|
+
| 2.T4 | `web.search` — results surfaced | Search results are shown to the user (one-line summaries or inline in LLM text) | ☐ | |
|
|
84
|
+
| 2.T5 | `web.search` — results stored | `session.discoveryEnrichment.webSearchResults` is populated | ☐ | |
|
|
85
|
+
| 2.T6 | `web.search` — citations | Search results include source URLs (FR-014 005) | ☐ | |
|
|
86
|
+
| 2.T7 | `web.search` — spinner | Spinner shows "Searching..." or similar during web search calls | ☐ | |
|
|
87
|
+
| 2.T8 | `web.search` — summary line | After search completes, a one-line summary is shown (e.g., "✓ Web search: 3 results for ...") (FR-043b) | ☐ | |
|
|
88
|
+
| 2.T9 | `workiq.analyze_team` — consent prompt | sofIA asks "May sofIA access WorkIQ for team insights? (y/N)" before calling WorkIQ | ☐ | |
|
|
89
|
+
| 2.T10 | `workiq` — NOT called without consent | If user says "no" or "skip", WorkIQ is NOT invoked | ☐ | |
|
|
90
|
+
| 2.T11 | `workiq` — called with consent | If user says "yes", `analyze_team` is called with the company summary | ☐ | |
|
|
91
|
+
| 2.T12 | `workiq` — insights stored | If WorkIQ is called, `session.discoveryEnrichment.workiqInsights` is populated (teamExpertise, collaborationPatterns, documentationGaps) | ☐ | |
| 2.T13 | `workiq` — degradation | If WorkIQ is unavailable/errors, sofIA continues without crashing | ☐ | |
| 2.T14 | `enrichment.sourcesUsed` | Session records which sources were actually used (e.g., `["websearch"]` or `["websearch","workiq"]`) | ☐ | |
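
Checks 2.T9 through 2.T14 describe a consent gate around WorkIQ. The sketch below shows one way that flow could look, assuming web search has already run. The names `hasConsent`, `enrichDiscovery`, and the `Enrichment` shape are hypothetical and only illustrate the expected behavior; they are not sofIA's actual API.

```typescript
// Hypothetical shapes for illustration; the real session schema may differ.
type EnrichmentSource = "websearch" | "workiq";

interface WorkIQInsights {
  teamExpertise: string[];
  collaborationPatterns: string[];
  documentationGaps: string[];
}

interface Enrichment {
  sourcesUsed: EnrichmentSource[];
  workiqInsights?: WorkIQInsights;
}

// The prompt is "(y/N)", so anything other than an explicit yes counts as no.
function hasConsent(answer: string): boolean {
  const normalized = answer.trim().toLowerCase();
  return normalized === "y" || normalized === "yes";
}

// Assumes web search already ran; callWorkIQ stands in for the MCP call.
function enrichDiscovery(
  consentAnswer: string,
  callWorkIQ: () => WorkIQInsights,
): Enrichment {
  const result: Enrichment = { sourcesUsed: ["websearch"] };
  if (!hasConsent(consentAnswer)) {
    return result; // 2.T10: WorkIQ is never invoked without consent
  }
  try {
    result.workiqInsights = callWorkIQ(); // 2.T11 and 2.T12
    result.sourcesUsed.push("workiq"); // 2.T14
  } catch (_err) {
    // 2.T13: a WorkIQ failure must not end the session
  }
  return result;
}
```

Because the function records a source only after the call succeeds, `sourcesUsed` reflects what was actually used rather than what was attempted.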
---
## 3. Ideate Phase (Steps 5–9)
### Spec References: FR-024, FR-025, FR-026, FR-027, FR-028
| # | Check | Expected Behavior | Result | Notes |
| ---- | ------------------------- | ---------------------------------------------------------------------------------------- | ------ | ----- |
| 3.1 | Card presentation | sofIA presents AI Discovery Cards, organized by category | ☐ | |
| 3.2 | Card explanation | Each card includes: capability description, workflow application, examples | ☐ | |
| 3.3 | Card scoring | sofIA accepts user scores (Relevance/Feasibility/Impact) for each card | ☐ | |
| 3.4 | Top card selection | sofIA selects top-scoring cards (up to 15) | ☐ | |
| 3.5 | Card aggregation | sofIA aggregates similar cards into themes when requested | ☐ | |
| 3.6 | Card–workflow mapping | sofIA creates a mapping of cards to workflow steps with metrics | ☐ | |
| 3.7 | Idea generation | sofIA generates ideas using Design Thinking techniques (HMW, SCAMPER) | ☐ | |
| 3.8 | Idea cards | At least 3–5 distinct ideas are generated with title, description, workflow steps, scope | ☐ | |
| 3.9 | Ideas match context | Generated ideas are relevant to fashion trend analysis (not generic) | ☐ | |
| 3.10 | Discovery enrichment used | Ideation references web search or WorkIQ insights from the Discover phase | ☐ | |
| 3.11 | Decision gate | After Ideate, sofIA shows a decision gate | ☐ | |
| 3.12 | Session persistence | Ideate artifacts are persisted to session JSON | ☐ | |
### Ideate — MCP Tool Invocation Audit
| # | Check | Expected Behavior | Result | Notes |
| ---- | ------------------------ | ------------------------------------------------------------------------------------------------------------- | ------ | -------------------- |
| 3.T1 | Cards dataset loaded | sofIA uses the built-in `cards.json` dataset (FR-024) — not an MCP call, but verify cards come from data file | ☐ | |
| 3.T2 | No unexpected tool calls | Ideate should not call MCP tools (Context7, Azure, web search are not expected in this phase) | ☐ | Tool calls observed: |
---
## 4. Design Phase (Steps 10–12)
### Spec References: FR-029, FR-030, FR-031, FR-032, FR-033
| # | Check | Expected Behavior | Result | Notes |
| --- | ------------------------- | --------------------------------------------------------------------------------------- | ------ | ----- |
| 4.1 | Idea card refinement | sofIA refines ideas into complete Idea Cards with assumptions and data requirements | ☐ | |
| 4.2 | Feasibility/Value matrix | sofIA creates a scoring matrix and accepts user scores | ☐ | |
| 4.3 | Impact assessment | BXT framework assessment is produced for each idea | ☐ | |
| 4.4 | Architecture sketch | A Mermaid architecture diagram is generated for top idea(s) | ☐ | |
| 4.5 | Documentation grounding | If Context7/MS Learn is available, recommendations are grounded with real documentation | ☐ | |
| 4.6 | User feedback integration | sofIA incorporates user additions (risks, notes) into the output | ☐ | |
| 4.7 | Decision gate | After Design, sofIA shows a decision gate | ☐ | |
| 4.8 | Session persistence | Design artifacts are persisted | ☐ | |
### Design — MCP Tool Invocation Audit
| # | Check | Expected Behavior | Result | Notes |
| ---- | ---------------------------------------- | -------------------------------------------------------------------------------------------------------------- | ------ | ------------------ |
| 4.T1 | Context7 — library docs queried | If Context7 is available and ideas reference specific libraries, Context7 is called for documentation (FR-031) | ☐ | Libraries queried: |
| 4.T2 | Context7 — `resolve-library-id` | For each queried library, `resolve-library-id` is called first | ☐ | |
| 4.T3 | Context7 — `query-docs` | After resolving, `query-docs` is called with the resolved ID | ☐ | |
| 4.T4 | Context7 — results in output | Documentation results are referenced in architecture or feasibility assessment | ☐ | |
| 4.T5 | MS Learn / Azure MCP — called | If Azure services are mentioned (Cognitive Services, Cosmos DB, etc.), Azure docs MCP or MS Learn is queried | ☐ | Services queried: |
| 4.T6 | MS Learn / Azure MCP — results in output | Azure architecture guidance appears in the architecture sketch or recommendations | ☐ | |
| 4.T7 | Tool call spinner | Spinner shows tool-specific text during each MCP call (FR-043a) | ☐ | |
| 4.T8 | Tool call summary lines | One-line summary after each tool completes (FR-043b) | ☐ | |
| 4.T9 | Degradation if tools unavailable | If Context7/Azure is unavailable, sofIA still produces reasonable output (FR-056) | ☐ | |
---
## 5. Select Phase
### Spec References: FR-032, FR-033, FR-034
| # | Check | Expected Behavior | Result | Notes |
| --- | ------------------ | ------------------------------------------------------------------------------------------- | ------ | ----- |
| 5.1 | Ranking | sofIA ranks ideas by composite score (Feasibility 30%, Business Value 40%, Human Value 30%) | ☐ | |
| 5.2 | Recommendation | sofIA recommends a top idea with clear rationale | ☐ | |
| 5.3 | User confirmation | sofIA asks for explicit user confirmation of selection | ☐ | |
| 5.4 | Selection recorded | Selected idea + rationale + `confirmedByUser` visible in session | ☐ | |
| 5.5 | Correct selection | Selection is "TrendPulse Dashboard with integrated TrendLens" (or similar combined idea) | ☐ | |
| 5.6 | Decision gate | After Select, sofIA shows a decision gate | ☐ | |
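
Check 5.1 fixes the weights of the composite score. A short sketch of that ranking follows, assuming all three dimensions are scored on the same numeric scale (1 to 5 here, which is an assumption). The names `IdeaScores`, `compositeScore`, and `rankIdeas` are hypothetical.

```typescript
// Hypothetical idea shape; the 1-5 scale is an assumption, not from the spec.
interface IdeaScores {
  title: string;
  feasibility: number;
  businessValue: number;
  humanValue: number;
}

// Weights from check 5.1: Feasibility 30%, Business Value 40%, Human Value 30%.
const WEIGHTS = { feasibility: 0.3, businessValue: 0.4, humanValue: 0.3 };

function compositeScore(idea: IdeaScores): number {
  return (
    idea.feasibility * WEIGHTS.feasibility +
    idea.businessValue * WEIGHTS.businessValue +
    idea.humanValue * WEIGHTS.humanValue
  );
}

// Returns a new array sorted from highest to lowest composite score.
function rankIdeas(ideas: IdeaScores[]): IdeaScores[] {
  return ideas.slice().sort((a, b) => compositeScore(b) - compositeScore(a));
}
```

For example, scores of 4 (feasibility), 5 (business value), and 3 (human value) give 0.3 × 4 + 0.4 × 5 + 0.3 × 3 = 4.1, which is a quick way to cross-check a ranking observed during the test run.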
### Select — MCP Tool Invocation Audit
| # | Check | Expected Behavior | Result | Notes |
| ---- | --------------------- | ------------------------------------------------------------- | ------ | -------------------- |
| 5.T1 | No MCP tools expected | Select phase is LLM-only analysis; no MCP tool calls expected | ☐ | Tool calls observed: |
---
## 6. Plan Phase
### Spec References: FR-035
| # | Check | Expected Behavior | Result | Notes |
| --- | -------------------- | ------------------------------------------------------------------------------------------------ | ------ | ----- |
| 6.1 | Milestones | Plan includes 3–6 milestones with IDs, titles, and deliverables | ☐ | |
| 6.2 | Architecture notes | High-level architecture is documented with technology choices | ☐ | |
| 6.3 | Architecture diagram | A Mermaid diagram showing components and data flow is produced | ☐ | |
| 6.4 | Dependencies list | External dependencies (APIs, data sources, skills) are listed | ☐ | |
| 6.5 | PoC definition | PoC scope is defined with: minimum functionality, data needs, success criteria, timeline | ☐ | |
| 6.6 | Tech stack captured | Plan references the requested tech stack (Azure Functions, Cognitive Services, Cosmos DB, React) | ☐ | |
| 6.7 | Decision gate | After Plan, sofIA shows a decision gate | ☐ | |
| 6.8 | Dev command guidance | sofIA displays the exact `sofia dev --session <id>` command to run next | ☐ | |
### Plan — MCP Tool Invocation Audit
| # | Check | Expected Behavior | Result | Notes |
| ---- | -------------------------------------------- | ------------------------------------------------------------------------------- | ------ | ------------------ |
| 6.T1 | MS Learn / Azure MCP — architecture guidance | If Azure services are in the plan, Azure docs may be queried for best practices | ☐ | Services queried: |
| 6.T2 | Context7 — dependency docs | If specific npm/pip packages are in the plan, Context7 may be queried | ☐ | Libraries queried: |
| 6.T3 | Tool call feedback visible | Any tool calls during Plan show spinner + summary line | ☐ | |
---
## 7. Develop Phase (Boundary — workshop side)
### Spec References: FR-036, FR-037, FR-038
| # | Check | Expected Behavior | Result | Notes |
| --- | ------------------- | -------------------------------------------------------------------------- | ------ | ----- |
| 7.1 | PoC intent captured | Target stack, key scenarios, constraints are stored in session `poc` field | ☐ | |
| 7.2 | Structured data | Enough structured data exists for `sofia dev` to consume | ☐ | |
| 7.3 | Summary provided | User-visible summary of PoC scope/decisions is shown | ☐ | |
---
## 8. Cross-Cutting Concerns
### UX & Output Quality
| # | Check | Expected Behavior | Result | Notes |
| --- | --------------------- | ------------------------------------------------------------------------------------------------- | ------ | ----- |
| 8.1 | No raw JSON in output | No SDK JSON events appear in user-facing output (FR-013, FR-058) | ☐ | |
| 8.2 | Streaming works | Responses stream incrementally (not all-at-once block) (FR-009) | ☐ | |
| 8.3 | Tool call summaries | Tool calls show one-line summaries (FR-043b) | ☐ | |
| 8.4 | Spinner behavior | Spinners appear during waits and clear properly (FR-043a, FR-043c) | ☐ | |
| 8.5 | Thinking spinner | "Thinking..." spinner appears during silent gaps (after user input, after tool results) (FR-043c) | ☐ | |
| 8.6 | Ctrl+C handling | If Ctrl+C is pressed mid-session, session is persisted and recovery info shown | ☐ | |
### Session Integrity
| # | Check | Expected Behavior | Result | Notes |
| ---- | ------------------------------ | -------------------------------------------------------------------------------------------- | ------ | ----- |
| 8.7 | Session JSON valid | Session file is valid JSON and contains all expected fields | ☐ | |
| 8.8 | Turn history preserved | Conversation turns for each phase are stored in session | ☐ | |
| 8.9 | Phase progression correct | Session `phase` field matches the actual phase completed | ☐ | |
| 8.10 | No data corruption | Session JSON matches the session schema without extra or missing required fields | ☐ | |
| 8.11 | Discovery enrichment persisted | `session.discoveryEnrichment` contains web search / WorkIQ results (if tools were available) | ☐ | |
### Error Handling & Recovery
| # | Check | Expected Behavior | Result | Notes |
| ---- | ------------------------- | ------------------------------------------------------------------------ | ------ | ----- |
| 8.12 | Error classification | Errors are classified (auth, connection, MCP, timeout, unknown) (FR-047) | ☐ | |
| 8.13 | Actionable suggestions | Error messages include remediation guidance (FR-047) | ☐ | |
| 8.14 | Interactive recovery | Interactive failures return to a recovery decision flow (FR-048) | ☐ | |
| 8.15 | Original errors preserved | Underlying error messages are not swallowed (FR-046, FR-059) | ☐ | |
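
Checks 8.12, 8.13, and 8.15 can be read as one contract: classify the error, attach remediation guidance, and keep the original error. The sketch below illustrates that contract. The message-matching heuristics, the suggestion texts, and the names `classifyError` and `ClassifiedError` are all assumptions made for illustration, not sofIA's implementation.

```typescript
// Categories listed in check 8.12.
type ErrorCategory = "auth" | "connection" | "mcp" | "timeout" | "unknown";

interface ClassifiedError {
  category: ErrorCategory;
  suggestion: string; // remediation guidance (8.13)
  original: Error; // underlying error is kept, not swallowed (8.15)
}

// Illustrative remediation texts, not sofIA's actual messages.
const SUGGESTIONS: Record<ErrorCategory, string> = {
  auth: "Check your credentials and sign in again.",
  connection: "Check network access and that the server is running.",
  mcp: "Check the MCP server configuration.",
  timeout: "Retry, or raise the configured timeout.",
  unknown: "Re-run with --debug and inspect the original error.",
};

// Heuristic matching on the message text; a real classifier would more
// likely inspect error codes or error classes.
function classifyError(err: Error): ClassifiedError {
  const msg = err.message.toLowerCase();
  let category: ErrorCategory = "unknown";
  if (/unauthorized|forbidden|\b401\b|\b403\b/.test(msg)) category = "auth";
  else if (/timed out|timeout|etimedout/.test(msg)) category = "timeout";
  else if (/econnrefused|econnreset|enotfound/.test(msg)) category = "connection";
  else if (/\bmcp\b/.test(msg)) category = "mcp";
  return { category, suggestion: SUGGESTIONS[category], original: err };
}
```

Returning the original `Error` alongside the category makes check 8.15 easy to verify: the tester can compare the displayed message against `original.message`.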
### Export (post-workshop)
| # | Check | Expected Behavior | Result | Notes |
| ---- | ------------------------------ | ---------------------------------------------------------------------------------------------- | ------ | ----- |
| 8.16 | Export executes | `sofia export --session <id>` runs without error | ☐ | |
| 8.17 | Export files present | Export directory contains: summary.json, discover.md, ideate.md, design.md, select.md, plan.md | ☐ | |
| 8.18 | Summary JSON valid | summary.json contains sessionId, exportedAt, phase, status, files, highlights | ☐ | |
| 8.19 | Markdown content quality | Exported markdown files contain meaningful content (not empty or stub text) | ☐ | |
| 8.20 | Discovery enrichment in export | Discover export includes web search findings and/or WorkIQ insights (if they were available) | ☐ | |
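
Check 8.18 can be automated with a small helper instead of inspecting `summary.json` by eye. The sketch below takes the file contents as a string and reports which required fields are absent. The field names come from check 8.18; the helper name `missingSummaryFields` is hypothetical, and the sketch checks only for presence, not for value types.

```typescript
// Field names come from check 8.18.
const REQUIRED_SUMMARY_FIELDS = [
  "sessionId",
  "exportedAt",
  "phase",
  "status",
  "files",
  "highlights",
];

// Returns the required fields that are absent. JSON.parse throws on invalid
// JSON, which by itself fails the "Summary JSON valid" check.
function missingSummaryFields(rawJson: string): string[] {
  const parsed: unknown = JSON.parse(rawJson);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return REQUIRED_SUMMARY_FIELDS.slice(); // not a JSON object at all
  }
  const record = parsed as Record<string, unknown>;
  return REQUIRED_SUMMARY_FIELDS.filter((field) => !(field in record));
}
```

An empty result means every field named in 8.18 is present; any returned names can be copied straight into the Notes column.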
---
## 9. MCP Tool Invocation Master Audit
This section provides a consolidated view of all MCP tool calls across the entire session. Fill in after the test run by reviewing debug logs or `--debug` output.
### 9.1 Web Search (`web.search` / Foundry Agent)
| # | Check | Expected Behavior | Result | Notes |
| ----- | ---------------------------- | ----------------------------------------------------------------------------------- | ------ | -------------- |
| 9.1.1 | Configured | `isWebSearchConfigured()` returns true (Foundry endpoint + model set in env) | ☐ | |
| 9.1.2 | Ephemeral agent created | Foundry web search agent is created at session start (FR-015 005) | ☐ | |
| 9.1.3 | Ephemeral agent destroyed | Foundry web search agent is destroyed at session end (FR-015 005) | ☐ | |
| 9.1.4 | Called during Discover | At least 1–3 web search queries executed during Discover enrichment | ☐ | Total queries: |
| 9.1.5 | Called during Dev (if stuck) | If Ralph Loop gets stuck 2+ iterations, web.search is called with failing test info | ☐ | |
| 9.1.6 | NOT called unnecessarily     | web.search is NOT called when it is not needed (e.g., during Select)                | ☐      |                |
| 9.1.7 | Results have citations | All web search results include URLs for source verification (FR-014 005) | ☐ | |
| 9.1.8 | Latency acceptable | Search queries return within 10 seconds (SC-003 005) | ☐ | Avg latency: |
| 9.1.9 | Graceful failure | If Foundry is down, sofIA degrades without crash and warns the user | ☐ | |
### 9.2 WorkIQ (`workiq.*`)
| # | Check | Expected Behavior | Result | Notes |
| ----- | ------------------------ | -------------------------------------------------------------------------------------- | ------ | ----------------- |
| 9.2.1 | Availability detected | `mcpManager.isAvailable('workiq')` correctly reports status | ☐ | Available: yes/no |
| 9.2.2 | Consent before call | WorkIQ is NEVER called without explicit user consent (FR-020) | ☐ | |
| 9.2.3 | `analyze_team` called | If consent given, `analyze_team` is invoked with company summary | ☐ | |
| 9.2.4 | Response parsed | WorkIQ response is parsed into teamExpertise, collaborationPatterns, documentationGaps | ☐ | |
| 9.2.5 | Insights used downstream | WorkIQ insights appear in Ideate or Design phase context | ☐ | |
| 9.2.6 | Timeout respected | WorkIQ call respects 30s timeout | ☐ | |
| 9.2.7 | Graceful failure | If WorkIQ errors or times out, sofIA continues without crashing | ☐ | |
### 9.3 Context7 (`context7.*`)
| # | Check | Expected Behavior | Result | Notes |
| ----- | --------------------------- | --------------------------------------------------------------------------------------------- | ------ | ------------------- |
| 9.3.1 | Availability detected | `mcpManager.isAvailable('context7')` correctly reports status | ☐ | Available: yes/no |
| 9.3.2 | `resolve-library-id` called | For each dependency, `resolve-library-id` is called first | ☐ | Libraries resolved: |
| 9.3.3 | `query-docs` called | After resolving, `query-docs` is called with resolved ID + topic | ☐ | |
| 9.3.4 | Max 5 deps queried | Context7 limits to first 5 dependencies (skips @types/\*, typescript, vitest) | ☐ | |
| 9.3.5 | Results in LLM prompt | Context7 docs appear in the Ralph Loop iteration prompt under "Library Documentation" section | ☐ | |
| 9.3.6 | Fallback on failure | If a single dep query fails, it falls back to an npm link (not a crash) | ☐ | |
| 9.3.7 | Graceful if unavailable | If Context7 server is down, enricher returns empty docs (no crash) | ☐ | |
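
Check 9.3.4 describes a filter followed by a limit. The sketch below shows one reading of that rule, assuming the skipped packages are removed before the limit of five is applied (the checklist does not say which comes first). The name `selectDependenciesForDocs` is hypothetical.

```typescript
// Packages named in check 9.3.4 as skipped, in addition to @types/*.
const SKIPPED_PACKAGES = ["typescript", "vitest"];

// Assumption: skipped packages are filtered out before the limit is applied,
// so the result holds up to `limit` documentable dependencies.
function selectDependenciesForDocs(dependencies: string[], limit = 5): string[] {
  return dependencies
    .filter((name) => name.indexOf("@types/") !== 0)
    .filter((name) => SKIPPED_PACKAGES.indexOf(name) === -1)
    .slice(0, limit);
}
```

When auditing a run, the "Libraries resolved" note for 9.3.2 should then list at most five names, none of them `typescript`, `vitest`, or an `@types/*` package.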
### 9.4 Azure MCP / Microsoft Learn (`azure.*`)
| # | Check | Expected Behavior | Result | Notes |
| ----- | --------------------------- | --------------------------------------------------------------------------------------- | ------ | ----------------- |
| 9.4.1 | Availability detected | `mcpManager.isAvailable('azure')` correctly reports status | ☐ | Available: yes/no |
| 9.4.2 | Azure keywords detected | Plan mentions Azure services → `mentionsAzure()` returns true | ☐ | Keywords found: |
| 9.4.3 | `documentation` tool called | Azure MCP `documentation` tool is called with architecture notes | ☐ | |
| 9.4.4 | Results in LLM prompt | Azure guidance appears in Ralph Loop prompt under "Azure Architecture Guidance" section | ☐ | |
| 9.4.5 | Used in Design phase | Azure/MS Learn docs used to ground architecture sketches in Design (FR-031) | ☐ | |
| 9.4.6 | Graceful if unavailable | If Azure MCP is down, enricher returns empty guidance | ☐ | |
### 9.5 GitHub MCP (`github.*`)
| # | Check | Expected Behavior | Result | Notes |
| ----- | -------------------------- | -------------------------------------------------------------------------------------- | ------ | ----------------- |
| 9.5.1 | Availability detected | `mcpManager.isAvailable('github')` correctly reports status | ☐ | Available: yes/no |
| 9.5.2 | `create_repository` called | During Dev, GitHub MCP is called to create the PoC repo (or fallback to local) | ☐ | |
| 9.5.3 | `push_files` called | After each iteration, files are pushed to GitHub repo (with actual content, not empty) | ☐ | |
| 9.5.4 | File content not empty | Pushed files contain actual content read from disk (not empty strings) | ☐ | |
| 9.5.5 | Repo URL recorded | Repository URL is stored in session `poc` state | ☐ | |
| 9.5.6 | Local fallback | If GitHub MCP unavailable, PoC is generated locally under `./poc/<sessionId>/` | ☐ | |
| 9.5.7 | Fallback logged | Local fallback is clearly logged so user knows it's local-only (D-003) | ☐ | |
### 9.6 MCP Transport Layer
| # | Check | Expected Behavior | Result | Notes |
| ----- | ------------------------- | ------------------------------------------------------------------------------ | ------ | ----- |
| 9.6.1 | Retry on transient errors | Connection refused / timeout gets one automatic retry with backoff | ☐ | |
| 9.6.2 | No retry on auth errors | Auth/validation errors are NOT retried | ☐ | |
| 9.6.3 | Timeout per server | Each MCP server type respects its configured timeout (30s for context7/workiq) | ☐ | |
| 9.6.4 | Error preserves detail | MCP errors include server name, tool name, and original error message | ☐ | |
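
The retry rules in 9.6.1 and 9.6.2 reduce to a small decision function, and 9.6.4 to a message format. The sketch below is one possible shape under stated assumptions: the error kinds, the 500 ms default delay, the message layout, and the names `retryDelayMs` and `describeMcpError` are all hypothetical, not taken from the sofIA transport layer.

```typescript
// Hypothetical error kinds; the real transport layer may classify differently.
type McpErrorKind = "connection" | "timeout" | "auth" | "validation";

// Returns the delay before the next attempt, or null when no retry is allowed.
// `failedAttempts` counts attempts that have already failed (1 after the first).
function retryDelayMs(
  kind: McpErrorKind,
  failedAttempts: number,
  baseDelayMs = 500, // illustrative backoff value, not taken from the spec
): number | null {
  const transient = kind === "connection" || kind === "timeout";
  if (!transient) return null; // 9.6.2: auth/validation errors are not retried
  if (failedAttempts > 1) return null; // 9.6.1: only one automatic retry
  return baseDelayMs;
}

// 9.6.4: keep server name, tool name, and the original message together.
function describeMcpError(server: string, tool: string, original: Error): string {
  return `MCP error [${server}/${tool}]: ${original.message}`;
}
```

In debug logs this would show up as at most two attempts per transient failure and exactly one attempt for auth or validation failures.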
---
## 10. Dev Command (PoC Generation) — if running end-to-end
### Spec References: D-001 through D-005, FR-001 through FR-010 (004 spec)
| # | Check | Expected Behavior | Result | Notes |
| ----- | ---------------------- | ------------------------------------------------------------------------- | ------ | ----- |
| 10.1 | Dev command starts | `sofia dev --session <id>` starts without errors | ☐ | |
| 10.2 | Session validation | Dev validates that selection and plan are populated | ☐ | |
| 10.3 | Scaffolding | Initial PoC project is created with README, package.json, tsconfig, tests | ☐ | |
| 10.4 | npm install | Dependencies are installed in the PoC directory | ☐ | |
| 10.5 | Iteration 1 | First Ralph Loop iteration runs (generate code → run tests) | ☐ | |
| 10.6 | Test feedback | Test failures are fed back to the LLM for correction | ☐ | |
| 10.7 | Iteration progress | Multiple iterations show progress toward passing tests | ☐ | |
| 10.8 | Termination | Loop terminates (tests pass, max iterations, or user stop) | ☐ | |
| 10.9 | PoC output | Output directory contains working or partially working PoC code | ☐ | |
| 10.10 | Session updated | Session `poc.iterations` and `poc.finalStatus` are updated | ☐ | |
| 10.11 | GitHub MCP or fallback | Either pushes to GitHub (if MCP available) or falls back to local (D-003) | ☐ | |
| 10.12 | Recovery message | On non-success, shows recovery options with resume command | ☐ | |
### Dev — MCP Tool Invocation Audit
| # | Check | Expected Behavior | Result | Notes |
| ----- | ----------------------------- | -------------------------------------------------------------------------- | ------ | --------------------------- |
| 10.T1 | Context7 called per iteration | Context7 queried for PoC dependencies at each iteration (if available) | ☐ | Iterations with Context7: |
| 10.T2 | Azure MCP called | Azure guidance fetched because plan mentions Azure services | ☐ | |
| 10.T3 | web.search called when stuck | After 2+ stuck iterations (same failures), web.search is used for research | ☐ | Iteration # triggered: |
| 10.T4 | GitHub MCP — create repo | `create_repository` called at scaffold time | ☐ | |
| 10.T5 | GitHub MCP — push files | `push_files` called after each iteration with non-empty file content | ☐ | Files pushed per iteration: |
| 10.T6 | Tool results in prompt | MCP-fetched context appears in the iteration prompt to the LLM | ☐ | |
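
Check 10.T3 depends on what counts as "stuck". The sketch below uses one plausible definition, stated here as an assumption: the loop is stuck when the most recent two iterations report the same non-empty set of failing tests, regardless of order. The names `failureKey` and `isStuck` are hypothetical.

```typescript
// Builds an order-independent key for a set of failing test names.
function failureKey(failures: string[]): string {
  return failures.slice().sort().join("\n");
}

// `history` holds the failing test names per iteration, oldest first.
// Assumption: "stuck" means the last `window` iterations failed identically.
function isStuck(history: string[][], window = 2): boolean {
  if (window < 1 || history.length < window) return false;
  const keys = history.slice(-window).map(failureKey);
  if (keys[0] === "") return false; // no failures means the loop is done, not stuck
  return keys.every((key) => key === keys[0]);
}
```

With this definition, the "Iteration # triggered" note for 10.T3 would be the first iteration whose failures match those of the iteration before it.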
---
## 11. Infrastructure (005 spec) — if testing deployment
### Spec References: FR-001 through FR-018 (005 spec)
| # | Check | Expected Behavior | Result | Notes |
| ---- | -------------------- | --------------------------------------------------------------------------------------- | ------ | ----- |
| 11.1 | Deploy script exists | `infra/deploy.sh` is present and executable | ☐ | |
| 11.2 | Bicep template valid | `infra/main.bicep` passes `az bicep build` without errors | ☐ | |
| 11.3 | Parameterized | Template accepts resource group, region, model deployment overrides | ☐ | |
| 11.4 | Teardown exists | `infra/teardown.sh` exists and works | ☐ | |
| 11.5 | .env output | Deploy script writes FOUNDRY_PROJECT_ENDPOINT and FOUNDRY_MODEL_DEPLOYMENT_NAME to .env | ☐ | |
| 11.6 | Web search works | After deployment, web.search tool returns results with citations | ☐ | |
---
## Summary Scorecard
| Section | Total Checks | Pass | Fail | Partial | Score |
| ---------------------------- | ------------ | ---- | ---- | ------- | ----- |
| 0. Pre-flight | 5 | | | | |
| 1. CLI Startup | 7 | | | | |
| 2. Discover (functional) | 16 | | | | |
| 2. Discover (MCP audit) | 14 | | | | |
| 3. Ideate (functional) | 12 | | | | |
| 3. Ideate (MCP audit) | 2 | | | | |
| 4. Design (functional) | 8 | | | | |
| 4. Design (MCP audit) | 9 | | | | |
| 5. Select (functional) | 6 | | | | |
| 5. Select (MCP audit) | 1 | | | | |
| 6. Plan (functional) | 8 | | | | |
| 6. Plan (MCP audit) | 3 | | | | |
| 7. Develop Boundary | 3 | | | | |
| 8. Cross-Cutting | 20 | | | | |
| 9. MCP Master Audit          | 40           |      |      |         |       |
| 10. Dev Command (functional) | 12           |      |      |         |       |
| 10. Dev Command (MCP audit)  | 6            |      |      |         |       |
| 11. Infrastructure           | 6            |      |      |         |       |
| **TOTAL**                    | **178**      |      |      |         |       |
---
## Test Notes & Observations
_(To be filled during/after the test run)_
### Environment Issues
-
### MCP Availability Summary
| Server | Available? | Called? | Successful? | Degraded Gracefully? |
| -------------------- | ---------- | ------- | ----------- | -------------------- |
| web.search (Foundry) | | | | |
| WorkIQ | | | | |
| Context7 | | | | |
| Azure / MS Learn | | | | |
| GitHub MCP | | | | |
### Bugs Found
-
### Unexpected Behaviors
-
### Positive Surprises
-
### Content Quality Assessment
| Phase | Output quality (1–5) | Key observation |
| -------- | -------------------- | --------------- |
| Discover | | |
| Ideate | | |
| Design | | |
| Select | | |
| Plan | | |
| Develop | | |
### Recommendations
-
---
_This checklist is designed to validate sofIA against specs 001–005 using the Zava Industries scenario, with particular attention to MCP tool invocation, web search, WorkIQ, Context7, Azure MCP, and GitHub MCP usage across all phases._