sofia-cli 0.1.1
- package/.github/agents/copilot-instructions.md +39 -0
- package/.github/agents/speckit.analyze.agent.md +184 -0
- package/.github/agents/speckit.checklist.agent.md +294 -0
- package/.github/agents/speckit.clarify.agent.md +181 -0
- package/.github/agents/speckit.constitution.agent.md +84 -0
- package/.github/agents/speckit.implement.agent.md +135 -0
- package/.github/agents/speckit.plan.agent.md +90 -0
- package/.github/agents/speckit.specify.agent.md +258 -0
- package/.github/agents/speckit.tasks.agent.md +137 -0
- package/.github/agents/speckit.taskstoissues.agent.md +30 -0
- package/.github/copilot-instructions.md +257 -0
- package/.github/prompts/speckit.analyze.prompt.md +3 -0
- package/.github/prompts/speckit.checklist.prompt.md +3 -0
- package/.github/prompts/speckit.clarify.prompt.md +3 -0
- package/.github/prompts/speckit.constitution.prompt.md +3 -0
- package/.github/prompts/speckit.implement.prompt.md +3 -0
- package/.github/prompts/speckit.plan.prompt.md +3 -0
- package/.github/prompts/speckit.specify.prompt.md +3 -0
- package/.github/prompts/speckit.tasks.prompt.md +3 -0
- package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
- package/.github/workflows/ci.yml +38 -0
- package/.prettierrc +6 -0
- package/.specify/memory/constitution.md +181 -0
- package/.specify/scripts/bash/check-prerequisites.sh +166 -0
- package/.specify/scripts/bash/common.sh +156 -0
- package/.specify/scripts/bash/create-new-feature.sh +297 -0
- package/.specify/scripts/bash/setup-plan.sh +61 -0
- package/.specify/scripts/bash/update-agent-context.sh +810 -0
- package/.specify/templates/agent-file-template.md +28 -0
- package/.specify/templates/checklist-template.md +40 -0
- package/.specify/templates/constitution-template.md +50 -0
- package/.specify/templates/plan-template.md +113 -0
- package/.specify/templates/spec-template.md +115 -0
- package/.specify/templates/tasks-template.md +251 -0
- package/.vscode/mcp.json +42 -0
- package/.vscode/settings.json +19 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/LICENSE +21 -0
- package/README.md +213 -0
- package/dist/src/cli/developCommand.js +240 -0
- package/dist/src/cli/directCommands.js +143 -0
- package/dist/src/cli/envLoader.js +16 -0
- package/dist/src/cli/exportCommand.js +53 -0
- package/dist/src/cli/index.js +203 -0
- package/dist/src/cli/ioContext.js +109 -0
- package/dist/src/cli/preflight.js +57 -0
- package/dist/src/cli/statusCommand.js +110 -0
- package/dist/src/cli/workshopCommand.js +400 -0
- package/dist/src/develop/checkpointState.js +86 -0
- package/dist/src/develop/codeGenerator.js +319 -0
- package/dist/src/develop/dynamicScaffolder.js +226 -0
- package/dist/src/develop/githubMcpAdapter.js +122 -0
- package/dist/src/develop/index.js +15 -0
- package/dist/src/develop/mcpContextEnricher.js +195 -0
- package/dist/src/develop/pocScaffolder.js +542 -0
- package/dist/src/develop/ralphLoop.js +659 -0
- package/dist/src/develop/templateRegistry.js +364 -0
- package/dist/src/develop/testRunner.js +202 -0
- package/dist/src/logging/logger.js +58 -0
- package/dist/src/loop/conversationLoop.js +227 -0
- package/dist/src/loop/phaseSummarizer.js +87 -0
- package/dist/src/mcp/mcpManager.js +267 -0
- package/dist/src/mcp/mcpTransport.js +391 -0
- package/dist/src/mcp/retryPolicy.js +47 -0
- package/dist/src/mcp/webSearch.js +254 -0
- package/dist/src/phases/contextSummarizer.js +101 -0
- package/dist/src/phases/discoveryEnricher.js +156 -0
- package/dist/src/phases/phaseExtractors.js +222 -0
- package/dist/src/phases/phaseHandlers.js +328 -0
- package/dist/src/prompts/design.md +51 -0
- package/dist/src/prompts/develop-boundary.md +51 -0
- package/dist/src/prompts/develop.md +111 -0
- package/dist/src/prompts/discover.md +58 -0
- package/dist/src/prompts/ideate.md +56 -0
- package/dist/src/prompts/plan.md +51 -0
- package/dist/src/prompts/promptLoader.js +167 -0
- package/dist/src/prompts/promptLoader.ts +198 -0
- package/dist/src/prompts/select.md +47 -0
- package/dist/src/prompts/summarize/README.md +8 -0
- package/dist/src/prompts/summarize/design-summary.md +37 -0
- package/dist/src/prompts/summarize/develop-summary.md +25 -0
- package/dist/src/prompts/summarize/ideate-summary.md +27 -0
- package/dist/src/prompts/summarize/plan-summary.md +27 -0
- package/dist/src/prompts/summarize/select-summary.md +21 -0
- package/dist/src/prompts/system.md +28 -0
- package/dist/src/sessions/exportPaths.js +22 -0
- package/dist/src/sessions/exportWriter.js +406 -0
- package/dist/src/sessions/sessionManager.js +81 -0
- package/dist/src/sessions/sessionStore.js +65 -0
- package/dist/src/shared/activitySpinner.js +91 -0
- package/dist/src/shared/copilotClient.js +129 -0
- package/dist/src/shared/data/cards.json +1249 -0
- package/dist/src/shared/data/cardsLoader.js +51 -0
- package/dist/src/shared/errorClassifier.js +120 -0
- package/dist/src/shared/events.js +28 -0
- package/dist/src/shared/markdownRenderer.js +34 -0
- package/dist/src/shared/schemas/session.js +265 -0
- package/dist/src/shared/tableRenderer.js +20 -0
- package/dist/src/vendor/chalk.js +2 -0
- package/dist/src/vendor/cli-table3.js +3 -0
- package/dist/src/vendor/commander.js +2 -0
- package/dist/src/vendor/marked-terminal.js +3 -0
- package/dist/src/vendor/marked.js +2 -0
- package/dist/src/vendor/ora.js +2 -0
- package/dist/src/vendor/pino.js +2 -0
- package/dist/src/vendor/zod.js +2 -0
- package/dist/tests/e2e/developE2e.spec.js +126 -0
- package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
- package/dist/tests/e2e/developPty.spec.js +75 -0
- package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
- package/dist/tests/e2e/harness.spec.js +83 -0
- package/dist/tests/e2e/mcpLive.spec.js +120 -0
- package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
- package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
- package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
- package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
- package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
- package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
- package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
- package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
- package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
- package/dist/tests/integration/autoStartConversation.spec.js +138 -0
- package/dist/tests/integration/defaultCommand.spec.js +147 -0
- package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
- package/dist/tests/integration/directCommandTty.spec.js +151 -0
- package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
- package/dist/tests/integration/exportArtifacts.spec.js +202 -0
- package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
- package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
- package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
- package/dist/tests/integration/newSessionFlow.spec.js +343 -0
- package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
- package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
- package/dist/tests/integration/pocScaffold.spec.js +163 -0
- package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
- package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
- package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
- package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
- package/dist/tests/integration/summarizationFlow.spec.js +115 -0
- package/dist/tests/integration/testRunnerReal.spec.js +52 -0
- package/dist/tests/integration/webSearchAgent.spec.js +128 -0
- package/dist/tests/live/copilotSdkLive.spec.js +107 -0
- package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
- package/dist/tests/setup/loadEnv.js +3 -0
- package/dist/tests/unit/cli/developCommand.spec.js +567 -0
- package/dist/tests/unit/cli/directCommands.spec.js +279 -0
- package/dist/tests/unit/cli/envLoader.spec.js +58 -0
- package/dist/tests/unit/cli/ioContext.spec.js +119 -0
- package/dist/tests/unit/cli/preflight.spec.js +108 -0
- package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
- package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
- package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
- package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
- package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
- package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
- package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
- package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
- package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
- package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
- package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
- package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
- package/dist/tests/unit/develop/testRunner.spec.js +249 -0
- package/dist/tests/unit/infraBicep.spec.js +92 -0
- package/dist/tests/unit/infraDeploy.spec.js +82 -0
- package/dist/tests/unit/infraTeardown.spec.js +63 -0
- package/dist/tests/unit/logging/logger.spec.js +43 -0
- package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
- package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
- package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
- package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
- package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
- package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
- package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
- package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
- package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
- package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
- package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
- package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
- package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
- package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
- package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
- package/dist/tests/unit/schemas/session.spec.js +257 -0
- package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
- package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
- package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
- package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
- package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
- package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
- package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
- package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
- package/dist/tests/unit/shared/events.spec.js +55 -0
- package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
- package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
- package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
- package/dist/vitest.config.js +14 -0
- package/dist/vitest.live.config.js +18 -0
- package/docs/README.md +35 -0
- package/docs/architecture.md +169 -0
- package/docs/cli-usage.md +207 -0
- package/docs/environment.md +66 -0
- package/docs/export-format.md +146 -0
- package/docs/session-model.md +113 -0
- package/eslint.config.js +35 -0
- package/infra/deploy.sh +193 -0
- package/infra/gather-env.sh +211 -0
- package/infra/main.bicep +90 -0
- package/infra/main.bicepparam +18 -0
- package/infra/resources.bicep +134 -0
- package/infra/teardown.sh +114 -0
- package/package.json +63 -0
- package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
- package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
- package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
- package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
- package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
- package/specs/001-cli-workshop-rebuild/plan.md +361 -0
- package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
- package/specs/001-cli-workshop-rebuild/research.md +116 -0
- package/specs/001-cli-workshop-rebuild/spec.md +240 -0
- package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
- package/specs/002-poc-generation/contracts/poc-output.md +172 -0
- package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
- package/specs/002-poc-generation/data-model.md +172 -0
- package/specs/002-poc-generation/plan.md +109 -0
- package/specs/002-poc-generation/quickstart.md +97 -0
- package/specs/002-poc-generation/research.md +786 -0
- package/specs/002-poc-generation/spec.md +81 -0
- package/specs/002-poc-generation/tasks-fix.md +198 -0
- package/specs/002-poc-generation/tasks.md +252 -0
- package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
- package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
- package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
- package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
- package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
- package/specs/003-mcp-transport-integration/data-model.md +326 -0
- package/specs/003-mcp-transport-integration/plan.md +114 -0
- package/specs/003-mcp-transport-integration/quickstart.md +311 -0
- package/specs/003-mcp-transport-integration/research.md +395 -0
- package/specs/003-mcp-transport-integration/spec.md +234 -0
- package/specs/003-mcp-transport-integration/tasks.md +324 -0
- package/specs/003-next-spec-gaps.md +150 -0
- package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
- package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
- package/specs/004-dev-resume-hardening/data-model.md +321 -0
- package/specs/004-dev-resume-hardening/plan.md +107 -0
- package/specs/004-dev-resume-hardening/quickstart.md +115 -0
- package/specs/004-dev-resume-hardening/research.md +142 -0
- package/specs/004-dev-resume-hardening/spec.md +221 -0
- package/specs/004-dev-resume-hardening/tasks.md +333 -0
- package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
- package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
- package/specs/005-ai-search-deploy/data-model.md +130 -0
- package/specs/005-ai-search-deploy/plan.md +93 -0
- package/specs/005-ai-search-deploy/quickstart.md +96 -0
- package/specs/005-ai-search-deploy/research.md +187 -0
- package/specs/005-ai-search-deploy/spec.md +143 -0
- package/specs/005-ai-search-deploy/tasks.md +284 -0
- package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
- package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
- package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
- package/specs/006-workshop-extraction-fixes/plan.md +123 -0
- package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
- package/specs/006-workshop-extraction-fixes/research.md +143 -0
- package/specs/006-workshop-extraction-fixes/spec.md +210 -0
- package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
- package/src/cli/developCommand.ts +308 -0
- package/src/cli/directCommands.ts +195 -0
- package/src/cli/envLoader.ts +17 -0
- package/src/cli/exportCommand.ts +65 -0
- package/src/cli/index.ts +249 -0
- package/src/cli/ioContext.ts +139 -0
- package/src/cli/preflight.ts +86 -0
- package/src/cli/statusCommand.ts +118 -0
- package/src/cli/workshopCommand.ts +496 -0
- package/src/develop/checkpointState.ts +121 -0
- package/src/develop/codeGenerator.ts +402 -0
- package/src/develop/dynamicScaffolder.ts +284 -0
- package/src/develop/githubMcpAdapter.ts +199 -0
- package/src/develop/index.ts +34 -0
- package/src/develop/mcpContextEnricher.ts +279 -0
- package/src/develop/pocScaffolder.ts +646 -0
- package/src/develop/ralphLoop.ts +1044 -0
- package/src/develop/templateRegistry.ts +427 -0
- package/src/develop/testRunner.ts +276 -0
- package/src/logging/logger.ts +73 -0
- package/src/loop/conversationLoop.ts +355 -0
- package/src/loop/phaseSummarizer.ts +114 -0
- package/src/mcp/mcpManager.ts +365 -0
- package/src/mcp/mcpTransport.ts +562 -0
- package/src/mcp/retryPolicy.ts +87 -0
- package/src/mcp/webSearch.ts +388 -0
- package/src/originalPrompts/design_thinking.md +178 -0
- package/src/originalPrompts/design_thinking_persona.md +76 -0
- package/src/originalPrompts/document_generator_example.md +77 -0
- package/src/originalPrompts/document_generator_persona.md +47 -0
- package/src/originalPrompts/facilitator_persona.md +125 -0
- package/src/originalPrompts/guardrails.md +47 -0
- package/src/phases/contextSummarizer.ts +154 -0
- package/src/phases/discoveryEnricher.ts +223 -0
- package/src/phases/phaseExtractors.ts +247 -0
- package/src/phases/phaseHandlers.ts +450 -0
- package/src/prompts/design.md +51 -0
- package/src/prompts/develop-boundary.md +51 -0
- package/src/prompts/develop.md +111 -0
- package/src/prompts/discover.md +58 -0
- package/src/prompts/ideate.md +56 -0
- package/src/prompts/plan.md +51 -0
- package/src/prompts/promptLoader.ts +198 -0
- package/src/prompts/select.md +47 -0
- package/src/prompts/summarize/README.md +8 -0
- package/src/prompts/summarize/design-summary.md +37 -0
- package/src/prompts/summarize/develop-summary.md +25 -0
- package/src/prompts/summarize/ideate-summary.md +27 -0
- package/src/prompts/summarize/plan-summary.md +27 -0
- package/src/prompts/summarize/select-summary.md +21 -0
- package/src/prompts/system.md +28 -0
- package/src/sessions/exportPaths.ts +28 -0
- package/src/sessions/exportWriter.ts +490 -0
- package/src/sessions/sessionManager.ts +119 -0
- package/src/sessions/sessionStore.ts +69 -0
- package/src/shared/activitySpinner.ts +108 -0
- package/src/shared/copilotClient.ts +291 -0
- package/src/shared/data/cards.json +1249 -0
- package/src/shared/data/cardsLoader.ts +70 -0
- package/src/shared/errorClassifier.ts +160 -0
- package/src/shared/events.ts +103 -0
- package/src/shared/markdownRenderer.ts +44 -0
- package/src/shared/schemas/session.ts +346 -0
- package/src/shared/tableRenderer.ts +28 -0
- package/src/types/marked-terminal.d.ts +5 -0
- package/src/vendor/chalk.ts +2 -0
- package/src/vendor/cli-table3.ts +3 -0
- package/src/vendor/commander.ts +2 -0
- package/src/vendor/marked-terminal.ts +3 -0
- package/src/vendor/marked.ts +2 -0
- package/src/vendor/ora.ts +2 -0
- package/src/vendor/pino.ts +3 -0
- package/src/vendor/zod.ts +3 -0
- package/tests/e2e/developE2e.spec.ts +152 -0
- package/tests/e2e/developFailureE2e.spec.ts +289 -0
- package/tests/e2e/developPty.spec.ts +86 -0
- package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
- package/tests/e2e/harness.spec.ts +104 -0
- package/tests/e2e/mcpLive.spec.ts +149 -0
- package/tests/e2e/newSession.e2e.spec.ts +245 -0
- package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
- package/tests/e2e/workiqEnrichment.spec.ts +72 -0
- package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
- package/tests/e2e/zava-assessment/company-profile.md +98 -0
- package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
- package/tests/e2e/zavaSimulation.spec.ts +511 -0
- package/tests/fixtures/completedSession.json +141 -0
- package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
- package/tests/fixtures/test-fixture-project/package.json +12 -0
- package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
- package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
- package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
- package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
- package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
- package/tests/integration/autoStartConversation.spec.ts +168 -0
- package/tests/integration/defaultCommand.spec.ts +179 -0
- package/tests/integration/directCommandNonTty.spec.ts +260 -0
- package/tests/integration/directCommandTty.spec.ts +185 -0
- package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
- package/tests/integration/exportArtifacts.spec.ts +232 -0
- package/tests/integration/exportFallbackFlow.spec.ts +115 -0
- package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
- package/tests/integration/mcpTransportFlow.spec.ts +178 -0
- package/tests/integration/newSessionFlow.spec.ts +406 -0
- package/tests/integration/pocGithubMcp.spec.ts +224 -0
- package/tests/integration/pocLocalFallback.spec.ts +205 -0
- package/tests/integration/pocScaffold.spec.ts +220 -0
- package/tests/integration/ralphLoopFlow.spec.ts +430 -0
- package/tests/integration/ralphLoopPartial.spec.ts +416 -0
- package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
- package/tests/integration/spinnerLifecycle.spec.ts +270 -0
- package/tests/integration/summarizationFlow.spec.ts +135 -0
- package/tests/integration/testRunnerReal.spec.ts +63 -0
- package/tests/integration/webSearchAgent.spec.ts +155 -0
- package/tests/live/copilotSdkLive.spec.ts +149 -0
- package/tests/live/zavaFullWorkshop.spec.ts +515 -0
- package/tests/setup/loadEnv.ts +5 -0
- package/tests/unit/cli/developCommand.spec.ts +679 -0
- package/tests/unit/cli/directCommands.spec.ts +325 -0
- package/tests/unit/cli/envLoader.spec.ts +73 -0
- package/tests/unit/cli/ioContext.spec.ts +148 -0
- package/tests/unit/cli/preflight.spec.ts +125 -0
- package/tests/unit/cli/statusCommand.spec.ts +134 -0
- package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
- package/tests/unit/cli/workshopCommand.spec.ts +378 -0
- package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
- package/tests/unit/develop/checkpointState.spec.ts +378 -0
- package/tests/unit/develop/codeGenerator.spec.ts +447 -0
- package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
- package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
- package/tests/unit/develop/outputValidator.spec.ts +134 -0
- package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
- package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
- package/tests/unit/develop/templateRegistry.spec.ts +106 -0
- package/tests/unit/develop/testRunner.spec.ts +294 -0
- package/tests/unit/infraBicep.spec.ts +116 -0
- package/tests/unit/infraDeploy.spec.ts +102 -0
- package/tests/unit/infraTeardown.spec.ts +77 -0
- package/tests/unit/logging/logger.spec.ts +50 -0
- package/tests/unit/loop/conversationLoop.spec.ts +719 -0
- package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
- package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
- package/tests/unit/mcp/mcpManager.spec.ts +336 -0
- package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
- package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
- package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
- package/tests/unit/mcp/webSearch.spec.ts +718 -0
- package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
- package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
- package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
- package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
- package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
- package/tests/unit/prompts/promptLoader.spec.ts +144 -0
- package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
- package/tests/unit/schemas/session.spec.ts +328 -0
- package/tests/unit/sessions/exportPaths.spec.ts +38 -0
- package/tests/unit/sessions/exportWriter.spec.ts +737 -0
- package/tests/unit/sessions/sessionManager.spec.ts +174 -0
- package/tests/unit/sessions/sessionStore.spec.ts +136 -0
- package/tests/unit/shared/activitySpinner.spec.ts +211 -0
- package/tests/unit/shared/cardsLoader.spec.ts +89 -0
- package/tests/unit/shared/copilotClient.spec.ts +185 -0
- package/tests/unit/shared/errorClassifier.spec.ts +152 -0
- package/tests/unit/shared/events.spec.ts +71 -0
- package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
- package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
- package/tests/unit/shared/tableRenderer.spec.ts +38 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +15 -0
- package/vitest.live.config.ts +19 -0
@@ -0,0 +1,143 @@
# Feature Specification: AI Foundry Search Service Deployment

**Feature Branch**: `005-ai-search-deploy`
**Created**: 2026-03-01
**Status**: Draft
**Input**: User description: "Create the AI Foundry Search service as a bicep file and make it easily deployable using a script. This Search service will be the one used as a tool for web search, especially during the first step where we may need to search about the company, competitors and project specific information."

## User Scenarios & Testing _(mandatory)_

### User Story 1 - One-Command Search Service Deployment (Priority: P1)

A developer or workshop facilitator wants to provision the Azure AI Foundry infrastructure that powers sofIA's `web.search` tool. They run a single deployment script, provide a resource group name (and optionally an Azure subscription ID), and receive a fully deployed Foundry project with a web-search-enabled agent. The script writes the project endpoint URL and model deployment name to a `.env` file in the workspace root. Since the Foundry Agent Service uses the caller's Azure login credentials for authentication (no separate API key), and the sofIA CLI automatically loads the `.env` file at startup, the user has web search capabilities immediately, with no manual environment variable configuration needed.

**Why this priority**: Without the deployed Search service, the `web.search` tool has no backend. This is the foundational story: every other story depends on a working deployment.

**Independent Test**: Can be tested by running the deployment script against an Azure subscription and verifying the Search service is provisioned, accessible, and returns valid responses to a test query.

**Acceptance Scenarios**:

1. **Given** a user with an active Azure subscription and Owner/Contributor permissions, **When** they run the deployment script providing a resource group name (and optionally a subscription ID), **Then** all required Azure AI Foundry resources are provisioned, and the script writes the project endpoint URL and model deployment name to a `.env` file in the workspace root.
2. **Given** the deployment script has completed successfully, **When** the user starts the sofIA CLI (which automatically loads the `.env` file) and is logged in to Azure, **Then** the sofIA CLI's `web.search` tool is enabled and returns search results with inline citations for a test query.
3. **Given** a user provides an Azure region that does not support the required services, **When** they run the deployment script, **Then** the script fails with a clear error message explaining which services are unavailable in that region and suggests supported alternatives.
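
Acceptance scenario 3 can be pictured as a small pre-flight check that runs before any resources are created. The sketch below is illustrative only: `regionError` and the `supported` list passed to it are hypothetical names, and the real set of supported regions would have to come from Azure rather than from a hard-coded array.

```typescript
// Hypothetical pre-flight check: returns an error message for an unsupported
// region, or null when the region is acceptable. The caller supplies the list
// of supported regions; this sketch does not know which regions Azure supports.
function regionError(region: string, supported: readonly string[]): string | null {
  const normalized = region.trim().toLowerCase();
  if (supported.indexOf(normalized) !== -1) {
    return null;
  }
  return (
    "Region '" + region + "' does not support the required services. " +
    "Supported alternatives: " + supported.join(", ") + "."
  );
}
```

Returning a message instead of throwing keeps the check easy to unit test; a deployment script could print the message and exit with a non-zero status.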

---

### User Story 2 - Infrastructure-as-Code Reproducibility (Priority: P2)

A DevOps engineer or contributor wants to review, customize, and version-control the Azure infrastructure definition. They inspect the infrastructure template, understand the resources being created, modify parameters (such as region, naming conventions, or model selection), and deploy a customized version. The template is self-documenting with parameter descriptions and follows Azure best practices.

**Why this priority**: Reproducibility and transparency are essential for team collaboration, auditing, and compliance. Without a readable, parameterized template, each deployment becomes a one-off manual effort.

**Independent Test**: Can be tested by opening the infrastructure template files, verifying all parameters have descriptions and defaults, and deploying with customized parameter values.

**Acceptance Scenarios**:

1. **Given** a contributor opens the infrastructure template, **When** they review the file, **Then** every parameter has a description, a sensible default (where applicable), and the resources to be created are clearly documented.
2. **Given** a user wants to deploy in a different Azure region, **When** they override the region parameter, **Then** the deployment succeeds in the new region (assuming service availability) without modifying the template itself.
3. **Given** a user wants to use a different model for the Search agent, **When** they override the model deployment parameter, **Then** the deployment provisions the specified model instead of the default.
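
The override behaviour in scenarios 2 and 3 amounts to layering user-supplied values over template defaults. A minimal sketch of that idea follows, assuming a hypothetical `DeployParams` shape; only the `swedencentral` default comes from this spec (FR-004), and the other default values are placeholders rather than the template's real ones.

```typescript
// Hypothetical parameter set; the real template's parameter names may differ.
interface DeployParams {
  location: string;
  deploymentName: string;
  modelName: string;
}

const DEFAULT_PARAMS: DeployParams = {
  location: "swedencentral", // default region required by FR-004
  deploymentName: "example-deployment", // placeholder, not the real default
  modelName: "example-model", // placeholder, not the real default
};

// Overrides win over defaults; omitted or undefined fields keep the default.
function resolveParams(overrides: Partial<DeployParams>): DeployParams {
  const resolved: DeployParams = { ...DEFAULT_PARAMS };
  if (overrides.location !== undefined) resolved.location = overrides.location;
  if (overrides.deploymentName !== undefined) resolved.deploymentName = overrides.deploymentName;
  if (overrides.modelName !== undefined) resolved.modelName = overrides.modelName;
  return resolved;
}
```

Copying the defaults before applying overrides leaves `DEFAULT_PARAMS` untouched, which mirrors the requirement that customization never modifies the template itself.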

---

### User Story 3 - Seamless Integration with sofIA CLI (Priority: P3)

After deployment, a workshop facilitator launches sofIA and begins the Discover phase (Step 1). When the AI asks about the user's business and industry, the facilitator expects sofIA to automatically use the web search tool to research the company, its competitors, and industry trends. The search results are grounded with citations and enrich the discovery conversation.

**Why this priority**: This story validates the end-to-end value: deployment is only useful if the CLI can consume the service. However, the CLI integration layer already exists (the `web.search` tool in `webSearch.ts`); this story confirms it works with the newly deployed Foundry agent.

**Independent Test**: Can be tested by deploying the service, configuring environment variables, starting a sofIA workshop session, and verifying that the Discover phase uses web search to retrieve real-time information about a named company.

**Acceptance Scenarios**:

1. **Given** the Foundry Search agent is deployed and environment variables are configured, **When** a user starts a sofIA workshop and describes their business, **Then** the `web.search` tool is invoked to research the company and returns results with citations.
2. **Given** the Foundry Search agent endpoint becomes temporarily unavailable, **When** the sofIA CLI tries to use the `web.search` tool, **Then** the CLI degrades gracefully (no crash) and the workshop continues without web search capabilities, with a warning to the user.
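
The graceful degradation in scenario 2 could be approached by wrapping the search call so that a failure becomes a warning instead of an exception. This is a sketch under stated assumptions, not the code in `webSearch.ts`: `SearchResult`, `SearchOutcome`, and `searchWithFallback` are hypothetical names, and the search function is injected by the caller.

```typescript
// Hypothetical shapes; the real `web.search` tool in webSearch.ts may differ.
interface SearchResult {
  answer: string;
  citations: string[];
}

interface SearchOutcome {
  status: "ok" | "degraded";
  result?: SearchResult;
  warning?: string;
}

// Runs the injected search function and converts any failure into a
// "degraded" outcome carrying a user-facing warning, so the caller can
// continue the workshop without web search.
async function searchWithFallback(
  query: string,
  search: (q: string) => Promise<SearchResult>,
): Promise<SearchOutcome> {
  try {
    const result = await search(query);
    return { status: "ok", result };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      status: "degraded",
      warning: "Web search is unavailable (" + reason + "); continuing without it.",
    };
  }
}
```

A caller would check `status`, show `warning` to the user when it is `"degraded"`, and carry on with the conversation either way.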

---

### User Story 4 - Teardown and Cost Management (Priority: P4)

A user who has finished a workshop or testing session wants to remove all deployed Azure resources to avoid ongoing costs. They run a teardown command that cleanly removes the resource group and all contained resources.

**Why this priority**: Azure resources incur costs when idle. Workshop and PoC scenarios are often short-lived, so easy teardown is important for cost management, though it doesn't block core functionality.

**Independent Test**: Can be tested by deploying the infrastructure, running the teardown command, and verifying the resource group and all resources are deleted.

**Acceptance Scenarios**:

1. **Given** a previously deployed Foundry Search infrastructure, **When** the user runs the teardown command, **Then** the resource group and all contained resources are deleted within 10 minutes.
2. **Given** a resource group that doesn't exist, **When** the user runs the teardown command for it, **Then** the script exits with a clear informational message (not an error).
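
Both teardown scenarios reduce to one branch: check whether the resource group exists, then either delete it or report that there is nothing to do. The sketch below illustrates that control flow with injected dependencies; `TeardownDeps` and `teardown` are hypothetical and make no claim about how `infra/teardown.sh` is actually written.

```typescript
// Hypothetical dependencies, injected so the control flow can be exercised
// without touching Azure.
interface TeardownDeps {
  groupExists: (name: string) => boolean;
  deleteGroup: (name: string) => void;
  log: (message: string) => void;
}

// Returns a process exit code. A missing resource group is treated as an
// informational outcome (exit code 0), not as an error.
function teardown(resourceGroup: string, deps: TeardownDeps): number {
  if (!deps.groupExists(resourceGroup)) {
    deps.log("Resource group '" + resourceGroup + "' does not exist; nothing to delete.");
    return 0;
  }
  deps.deleteGroup(resourceGroup);
  deps.log("Deleted resource group '" + resourceGroup + "' and its resources.");
  return 0;
}
```

Treating the missing group as success makes the command safe to run repeatedly, for example at the end of every workshop.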

---

### Edge Cases

- What happens when the user's Azure subscription has reached its quota for Cognitive Services resources?
- How does the system handle a deployment where the specified resource group already contains a Foundry resource with the same name?
- What happens if the user loses network connectivity mid-deployment?
- How does the system handle Azure regions where the Foundry Agent Service or web search capability is not yet available?
- What happens when the user's Azure account has insufficient permissions (e.g., Reader role instead of Owner/Contributor)?
- How does the system handle concurrent deployments to the same resource group?
|
|
80
|
+
|
|
81
|
+
## Requirements _(mandatory)_
|
|
82
|
+
|
|
83
|
+
### Functional Requirements
|
|
84
|
+
|
|
85
|
+
- **FR-001**: The project MUST include an infrastructure template that defines all Azure resources needed for the AI Foundry web search capability (Foundry account, project, model deployment, and agent capability).
|
|
86
|
+
- **FR-002**: The project MUST include a deployment script that provisions the infrastructure with a single command, accepting the resource group as a required input and the Azure subscription as an optional input (defaults to the current az CLI subscription). If the specified resource group does not exist, the script MUST create it automatically.
|
|
87
|
+
- **FR-003**: The deployment script MUST output the Foundry project endpoint URL and model deployment name needed to configure the sofIA CLI (`FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`), and MUST write them to a `.env` file in the workspace root (creating or updating the file as needed). Authentication uses the caller's Azure login credentials — no separate API key is needed.
|
|
88
|
+
- **FR-004**: The infrastructure template MUST be parameterized, allowing users to customize the deployment name, region, and model selection without modifying the template. The default region MUST be `swedencentral`.
|
|
89
|
+
- **FR-005**: The deployment script MUST validate prerequisites before attempting deployment (Azure CLI installed, user logged in, correct subscription selected, sufficient permissions).
|
|
90
|
+
- **FR-006**: The deployment script MUST provide clear, actionable error messages when a deployment fails, including the specific failure reason and suggested remediation.
|
|
91
|
+
- **FR-007**: The project MUST include a teardown command that removes all deployed resources by deleting the resource group.
|
|
92
|
+
- **FR-008**: The infrastructure template MUST follow the basic agent setup pattern (Microsoft-managed resources) to minimize complexity and cost for workshop/PoC scenarios.
|
|
93
|
+
- **FR-009**: The deployed agent MUST support the `web_search_preview` tool type to provide real-time web search grounded with citations.
|
|
94
|
+
- **FR-010**: The deployment script MUST be executable from common development environments (Linux, macOS, Windows via WSL or Git Bash).
|
|
95
|
+
- **FR-011**: The infrastructure template MUST include documentation (parameter descriptions, comments) explaining each resource and its purpose.
|
|
96
|
+
- **FR-012**: The deployment MUST configure the Foundry agent with web search enabled and an appropriate model deployment for handling search queries. The default model deployment MUST be `gpt-4.1-mini`.
|
|
97
|
+
- **FR-013**: The sofIA CLI MUST authenticate to the Foundry Agent Service using the user's Azure login credentials (e.g., via Azure Identity), eliminating the need for separate API key management.
|
|
98
|
+
- **FR-014**: Search responses MUST include inline URL citations so the user can verify the sources of information surfaced during the workshop.
|
|
99
|
+
- **FR-015**: The web search agent MUST follow an ephemeral lifecycle — created when a sofIA CLI session starts and automatically deleted when the session ends. The Bicep template provisions only the Foundry account, project, and model deployment; agent creation/deletion is handled at runtime by the CLI.
|
|
100
|
+
- **FR-016**: If the CLI detects the legacy environment variables (`SOFIA_FOUNDRY_AGENT_ENDPOINT` or `SOFIA_FOUNDRY_AGENT_KEY`), it MUST display a clear error message instructing the user to migrate to the new variables (`FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`). The old variables MUST NOT be used for authentication.
|
|
101
|
+
- **FR-017**: The sofIA CLI MUST load environment variables from a `.env` file in the project root at startup, without overwriting variables already present in `process.env`. If the `.env` file does not exist, the CLI MUST proceed normally without error.
|
|
102
|
+
- **FR-018**: The deployment script MUST write the output environment variables (`FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`) to a `.env` file in the workspace root. If the file already exists, only the relevant entries MUST be updated; other entries MUST be preserved.
|
|
103
|
+
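The update rule in FR-018 (replace only the relevant entries, preserve everything else) can be illustrated with a small pure function. The deployment script itself is a shell script, so this TypeScript version is only a sketch of the logic; `upsertEnvEntries` is a hypothetical name, and the sketch assumes simple `KEY=value` lines with `\n` line endings.

```typescript
// Returns new .env content where the given keys are set to the given values.
// Lines for other keys, comments, and blank lines are kept unchanged.
function upsertEnvEntries(
  content: string,
  updates: Record<string, string>,
): string {
  const has = (key: string): boolean =>
    Object.prototype.hasOwnProperty.call(updates, key);
  const replaced: string[] = [];

  const lines = content.split("\n");
  // Drop the single empty element produced by a trailing newline.
  if (lines[lines.length - 1] === "") lines.pop();

  const rewritten = lines.map((line) => {
    const match = /^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    const key = match ? match[1] : undefined;
    if (key !== undefined && has(key)) {
      replaced.push(key);
      return `${key}=${updates[key]}`;
    }
    return line;
  });

  // Keys that were not already present are appended at the end.
  for (const key of Object.keys(updates)) {
    if (replaced.indexOf(key) === -1) {
      rewritten.push(`${key}=${updates[key]}`);
    }
  }
  return rewritten.join("\n") + "\n";
}
```

Calling it with empty content covers the "creating the file" case, since every key is then appended.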
|
|
104
|
+
### Key Entities
|
|
105
|
+
|
|
106
|
+
- **Foundry Account**: The top-level Azure AI Foundry resource that hosts projects and model deployments. Identified by a unique name within a resource group.
|
|
107
|
+
- **Foundry Project**: A project within the Foundry account where the web search agent operates. Provides the endpoint URL used by the sofIA CLI.
|
|
108
|
+
- **Model Deployment**: A deployed language model (default: `gpt-4.1-mini`) within the Foundry account that processes search queries and generates grounded responses with citations.
|
|
109
|
+
- **Web Search Agent**: A Foundry agent configured with the `web_search_preview` tool that performs real-time web searches. The agent is ephemeral — created when a sofIA CLI session starts and deleted when the session ends. No agent resource is managed in the Bicep template.
|
|
110
|
+
- **Deployment Configuration**: The set of parameters (subscription, resource group, region, naming, model choice) that define a specific infrastructure deployment.
|
|
111
|
+
|
|
112
|
+
## Success Criteria _(mandatory)_
|
|
113
|
+
|
|
114
|
+
### Measurable Outcomes
|
|
115
|
+
|
|
116
|
+
- **SC-001**: A new user can deploy the complete web search infrastructure in under 15 minutes, including prerequisite checks and resource provisioning.
|
|
117
|
+
- **SC-002**: The deployment script succeeds on the first attempt for 95% of users who have a valid Azure subscription with Owner/Contributor permissions.
|
|
118
|
+
- **SC-003**: After deployment, the sofIA CLI can perform a web search query and receive grounded results with citations within 10 seconds.
|
|
119
|
+
- **SC-004**: Teardown removes all deployed resources and stops all associated billing within 10 minutes of execution.
|
|
120
|
+
- **SC-005**: The infrastructure template is fully self-documented — a user can understand every resource and parameter by reading the template file alone, without external documentation.
|
|
121
|
+
- **SC-006**: The deployment is reproducible — running the deployment script twice with the same parameters on different machines produces functionally equivalent environments.
|
|
122
|
+
|
|
123
|
+
## Clarifications
|
|
124
|
+
|
|
125
|
+
### Session 2026-03-01
|
|
126
|
+
|
|
127
|
+
- Q: What should the default Azure region be for the deployment template? → A: `swedencentral`
|
|
128
|
+
- Q: What should the default model deployment for the web search agent be? → A: `gpt-4.1-mini`
|
|
129
|
+
- Q: Should the Foundry agent be persistent or ephemeral? → A: Ephemeral per CLI session (created on session start, deleted on session end)
|
|
130
|
+
- Q: Should the deployment script auto-create the resource group if it doesn't exist? → A: Yes, auto-create it (subscription-level deployment)
|
|
131
|
+
- Q: Should the CLI support both old and new env vars during migration? → A: Clean break — only new env vars, with error message if old ones detected
|
|
132
|
+
|
|
133
|
+
## Assumptions
|
|
134
|
+
|
|
135
|
+
- Users have an active Azure subscription with Owner or Contributor permissions on the target resource group.
|
|
136
|
+
- The Azure CLI is installed and the user is already authenticated (`az login`). This same Azure login is used for both deploying the infrastructure and authenticating the sofIA CLI to the Foundry Agent Service at runtime.
|
|
137
|
+
- The target Azure region supports Azure AI Foundry and the Grounding with Bing Search capability, which the web search tool relies on behind the scenes and which is governed by the [Grounding with Bing terms of use](https://www.microsoft.com/bing/apis/grounding-legal-enterprise).
|
|
138
|
+
- The basic agent setup (Microsoft-managed infrastructure) is sufficient for workshop and PoC use cases — standard agent setup with BYO resources is out of scope for this feature.
|
|
139
|
+
- The sofIA CLI's existing `webSearch.ts` module uses raw HTTP POST with a bearer token (`SOFIA_FOUNDRY_AGENT_ENDPOINT` + `SOFIA_FOUNDRY_AGENT_KEY`). This feature will migrate the web search integration to the Foundry Agent Service SDK pattern, which authenticates via Azure Identity credentials and uses the `web_search_preview` tool type. The environment variables will change from `SOFIA_FOUNDRY_AGENT_ENDPOINT`/`SOFIA_FOUNDRY_AGENT_KEY` to `FOUNDRY_PROJECT_ENDPOINT`/`FOUNDRY_MODEL_DEPLOYMENT_NAME`. This is a clean break — old env vars will trigger an error message guiding migration; they will not be silently ignored or used as fallback.
|
|
140
|
+
- The Foundry web search agent returns responses with inline URL citations (annotations of type `url_citation`), which the sofIA CLI should surface to the user.
|
|
141
|
+
- Cost for Grounding with Bing Search is usage-based and acceptable for workshop scenarios (typically a small number of queries per session). See [pricing](https://www.microsoft.com/bing/apis/grounding-pricing).
|
|
142
|
+
- The infrastructure files will live in a new `infra/` directory at the project root, following Azure conventions.
|
|
143
|
+
- An Azure admin may need to ensure web search is not disabled at the subscription level (see `az feature unregister --name OpenAI.BlockedTools.web_search --namespace Microsoft.CognitiveServices`).
|
|
@@ -0,0 +1,284 @@
|
|
|
1
|
+
# Tasks: AI Foundry Search Service Deployment
|
|
2
|
+
|
|
3
|
+
**Input**: Design documents from `/specs/005-ai-search-deploy/`
|
|
4
|
+
**Prerequisites**: plan.md, spec.md, research.md, data-model.md, contracts/web-search-tool.md, quickstart.md
|
|
5
|
+
|
|
6
|
+
**Tests**: Tests are REQUIRED for new behavior in this repository (Red → Green → Review). Include test tasks for each user story and write them first.
|
|
7
|
+
|
|
8
|
+
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
|
|
9
|
+
|
|
10
|
+
## Format: `[ID] [P?] [Story] Description`
|
|
11
|
+
|
|
12
|
+
- **[P]**: Can run in parallel (different files, no dependencies)
|
|
13
|
+
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3, US4)
|
|
14
|
+
- Include exact file paths in descriptions
|
|
15
|
+
|
|
16
|
+
## Path Conventions
|
|
17
|
+
|
|
18
|
+
- **Infrastructure**: `infra/` at repository root
|
|
19
|
+
- **Source code**: `src/` at repository root
|
|
20
|
+
- **Tests**: `tests/unit/`, `tests/integration/` at repository root
|
|
21
|
+
- **Documentation**: `docs/` at repository root
|
|
22
|
+
|
|
23
|
+
---
|
|
24
|
+
|
|
25
|
+
## Phase 1: Setup (Shared Infrastructure)
|
|
26
|
+
|
|
27
|
+
**Purpose**: Install new dependencies, create directory structure, establish foundational config
|
|
28
|
+
|
|
29
|
+
- [x] T001 Install `@azure/ai-projects@beta` and `@azure/identity` as production dependencies in package.json
|
|
30
|
+
- [x] T002 [P] Create `infra/` directory structure with placeholder files: `infra/main.bicep`, `infra/main.bicepparam`, `infra/deploy.sh`, `infra/teardown.sh`
|
|
31
|
+
- [x] T003 [P] Add TypeScript ambient module declarations for `@azure/ai-projects` and `@azure/identity` if needed in src/types/
|
|
32
|
+
|
|
33
|
+
**Checkpoint**: Dependencies installed, directory structure ready, `npm run typecheck` passes
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Phase 2: Foundational (Blocking Prerequisites)
|
|
38
|
+
|
|
39
|
+
**Purpose**: Core changes that MUST be complete before ANY user story can be implemented
|
|
40
|
+
|
|
41
|
+
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
|
|
42
|
+
|
|
43
|
+
- [x] T004 Update `isWebSearchConfigured()` in src/mcp/webSearch.ts to check new env vars (`FOUNDRY_PROJECT_ENDPOINT`, `FOUNDRY_MODEL_DEPLOYMENT_NAME`) instead of legacy vars
|
|
44
|
+
- [x] T005 [P] Add legacy env var detection to preflight checks in src/cli/preflight.ts — fail with migration error message if `SOFIA_FOUNDRY_AGENT_ENDPOINT` or `SOFIA_FOUNDRY_AGENT_KEY` are set (FR-016)
|
|
45
|
+
- [x] T006 [P] Update docs/environment.md — replace legacy env var documentation with new `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME` vars, document `DefaultAzureCredential` auth model
|
|
46
|
+
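The two checks in T004 and T005 can be sketched as pure functions over an environment map. This is a minimal sketch under assumptions: the real `isWebSearchConfigured()` may read `process.env` directly rather than take a parameter, `checkLegacyEnvVars` is a hypothetical name, and treating an empty string as "not set" is a choice made here, not something the spec states.

```typescript
type Env = Record<string, string | undefined>;

const LEGACY_VARS = ["SOFIA_FOUNDRY_AGENT_ENDPOINT", "SOFIA_FOUNDRY_AGENT_KEY"];
const REQUIRED_VARS = ["FOUNDRY_PROJECT_ENDPOINT", "FOUNDRY_MODEL_DEPLOYMENT_NAME"];

// Returns a migration message if any legacy variable is set, otherwise null.
// Empty strings are treated as "not set" (an assumption of this sketch).
function checkLegacyEnvVars(env: Env): string | null {
  const found = LEGACY_VARS.filter((key) => Boolean(env[key]));
  if (found.length === 0) return null;
  return (
    `Legacy environment variable(s) detected: ${found.join(", ")}. ` +
    `Set ${REQUIRED_VARS.join(" and ")} instead.`
  );
}

// Web search counts as configured only when both new variables are non-empty.
// Legacy variables never make this return true.
function isWebSearchConfigured(env: Env): boolean {
  return REQUIRED_VARS.every((key) => Boolean(env[key]));
}
```

Passing the environment in as an argument keeps both checks easy to unit test without mutating `process.env`.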
|
|
47
|
+
**Checkpoint**: Foundation ready — legacy env vars rejected, new env var pattern established, `npm run typecheck` and `npm run lint` pass
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## Phase 3: User Story 1 — One-Command Search Service Deployment (Priority: P1) 🎯 MVP
|
|
52
|
+
|
|
53
|
+
**Goal**: A developer runs `./infra/deploy.sh` with a resource group (and optionally a subscription) and gets a fully deployed Foundry project with web-search-enabled agent infrastructure. The script outputs the configuration values.
|
|
54
|
+
|
|
55
|
+
**Independent Test**: Run the deployment script against an Azure subscription and verify that all 5 resources are provisioned and that the script outputs the correct env var values.
|
|
56
|
+
|
|
57
|
+
**FR Coverage**: FR-001, FR-002, FR-003, FR-005, FR-006, FR-008, FR-009, FR-010, FR-012
|
|
58
|
+
|
|
59
|
+
### Tests for User Story 1 (REQUIRED) ⚠️
|
|
60
|
+
|
|
61
|
+
> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
|
|
62
|
+
|
|
63
|
+
- [x] T007 [P] [US1] Unit test for deploy script prerequisite validation (az CLI check, login check) in tests/unit/infraDeploy.spec.ts — test parameter parsing, missing required args, default values
|
|
64
|
+
- [x] T008 [P] [US1] Unit test verifying Bicep template structure (validate JSON compilation output has all 5 expected resource types) in tests/unit/infraBicep.spec.ts
|
|
65
|
+
|
|
66
|
+
### Implementation for User Story 1
|
|
67
|
+
|
|
68
|
+
- [x] T009 [US1] Create Bicep template `infra/main.bicep` with all 5 resources: `Microsoft.CognitiveServices/accounts` (kind: AIServices, allowProjectManagement: true, customSubDomainName), `accounts/deployments` (gpt-4.1-mini, GlobalStandard), `accounts/projects`, `accounts/capabilityHosts` (Agents), `accounts/projects/capabilityHosts` (Agents). Use `targetScope = 'subscription'` with resource group creation per research.md R1
|
|
69
|
+
- [x] T010 [US1] Create Bicep parameter file `infra/main.bicepparam` with defaults: location=swedencentral, modelDeploymentName=gpt-4.1-mini, modelName=gpt-4.1-mini, modelVersion=2025-04-14, modelSkuName=GlobalStandard per data-model.md FoundryDeploymentConfig
|
|
70
|
+
- [x] T011 [US1] Implement deployment script `infra/deploy.sh` — parse CLI flags (--subscription, --resource-group, --location, --account-name, --model), validate prerequisites (az CLI installed, user logged in, subscription accessible), run `az deployment sub create`, query Bicep outputs (projectEndpoint, modelDeploymentName), write env vars to `.env` file per contracts/web-search-tool.md deploy.sh contract
|
|
71
|
+
- [x] T012 [US1] Make `infra/deploy.sh` executable (chmod +x) and add shebang `#!/usr/bin/env bash`, set error handling (`set -euo pipefail`), add clear error messages with exit codes (0=success, 1=prereq fail, 2=deploy fail) per FR-006 and contracts
|
|
72
|
+
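The defaults from T010 and the flag overrides from T011 amount to "defaults merged with user-supplied values". The sketch below models that in TypeScript for illustration only: the actual mechanism is Bicep parameters plus shell flags, the interface is a guess at what data-model.md calls `FoundryDeploymentConfig`, and `resolveDeploymentConfig` is a hypothetical helper.

```typescript
// Field names follow the Bicep parameters named in the tasks; the interface
// itself is an assumption about the shape of FoundryDeploymentConfig.
interface FoundryDeploymentConfig {
  resourceGroupName: string;
  location: string;
  modelDeploymentName: string;
  modelName: string;
  modelVersion: string;
  modelSkuName: string;
}

// Default values as listed for infra/main.bicepparam.
const DEFAULTS = {
  location: "swedencentral",
  modelDeploymentName: "gpt-4.1-mini",
  modelName: "gpt-4.1-mini",
  modelVersion: "2025-04-14",
  modelSkuName: "GlobalStandard",
};

// The resource group is the only required input; everything else falls back
// to DEFAULTS unless an override is supplied.
function resolveDeploymentConfig(
  resourceGroupName: string,
  overrides: Partial<Omit<FoundryDeploymentConfig, "resourceGroupName">> = {},
): FoundryDeploymentConfig {
  if (resourceGroupName.trim() === "") {
    throw new Error("A resource group name is required.");
  }
  return { ...DEFAULTS, ...overrides, resourceGroupName };
}
```

For example, `resolveDeploymentConfig("rg-demo", { location: "eastus" })` changes only the region and keeps the default model settings.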
|
|
73
|
+
**Checkpoint**: User Story 1 complete — `./infra/deploy.sh -g <rg>` provisions all resources and writes env vars to `.env`. Tests in T007/T008 pass.
|
|
74
|
+
|
|
75
|
+
---
|
|
76
|
+
|
|
77
|
+
## Phase 4: User Story 2 — Infrastructure-as-Code Reproducibility (Priority: P2)
|
|
78
|
+
|
|
79
|
+
**Goal**: Every Bicep parameter has a description and sensible default. Users can customize region, model, and naming without modifying the template. Template is self-documenting.
|
|
80
|
+
|
|
81
|
+
**Independent Test**: Open `infra/main.bicep` and verify that every parameter has a `@description()` decorator. Deploy with `--location eastus --model gpt-4o-mini` and verify it succeeds with those overrides.
|
|
82
|
+
|
|
83
|
+
**FR Coverage**: FR-004, FR-011
|
|
84
|
+
|
|
85
|
+
### Tests for User Story 2 (REQUIRED) ⚠️
|
|
86
|
+
|
|
87
|
+
> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
|
|
88
|
+
|
|
89
|
+
- [x] T013 [P] [US2] Unit test verifying all Bicep parameters have `@description()` decorators in tests/unit/infraBicep.spec.ts — parse main.bicep and confirm every `param` has a preceding `@description` annotation
|
|
90
|
+
- [x] T014 [P] [US2] Unit test verifying Bicep parameters have defaults where specified in data-model.md FoundryDeploymentConfig — location defaults to swedencentral, model to gpt-4.1-mini
|
|
91
|
+
|
|
92
|
+
### Implementation for User Story 2
|
|
93
|
+
|
|
94
|
+
- [x] T015 [US2] Add `@description()` decorators to all Bicep parameters in infra/main.bicep — location, accountName, projectName, modelDeploymentName, modelName, modelVersion, modelSkuName, modelSkuCapacity, resourceGroupName per FR-011
|
|
95
|
+
- [x] T016 [US2] Add inline comments to each Bicep resource in infra/main.bicep explaining its purpose (Foundry account, model deployment, project, account capability host, project capability host) per FR-011 and SC-005
|
|
96
|
+
- [x] T017 [US2] Ensure deploy.sh passes parameter overrides through to Bicep deployment — `--location`, `--account-name`, and `--model` flags map to Bicep parameter overrides via `az deployment sub create --parameters` per FR-004
|
|
97
|
+
|
|
98
|
+
**Checkpoint**: User Story 2 complete — template is self-documenting (SC-005), customizable region/model/naming (FR-004) works via CLI flag overrides. Tests in T013/T014 pass.
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## Phase 5: User Story 3 — Seamless Integration with sofIA CLI (Priority: P3)
|
|
103
|
+
|
|
104
|
+
**Goal**: The sofIA CLI's `web.search` tool uses `@azure/ai-projects` SDK with `DefaultAzureCredential` auth (no API key). Ephemeral agent created lazily on first search call, deleted on session end. Search responses include URL citations. Graceful degradation on failure.
|
|
105
|
+
|
|
106
|
+
**Independent Test**: Set `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME`, start a sofIA workshop, describe a company — web search returns grounded results with citations. Unset env vars — workshop proceeds without web search, no crash.
|
|
107
|
+
|
|
108
|
+
**FR Coverage**: FR-013, FR-014, FR-015, FR-016
|
|
109
|
+
|
|
110
|
+
### Tests for User Story 3 (REQUIRED) ⚠️
|
|
111
|
+
|
|
112
|
+
> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
|
|
113
|
+
|
|
114
|
+
- [x] T018 [P] [US3] Unit test for updated `WebSearchConfig` validation (projectEndpoint format, modelDeploymentName non-empty) in tests/unit/webSearch.spec.ts per data-model.md validation rules
|
|
115
|
+
- [x] T019 [P] [US3] Unit test for legacy env var detection — `isWebSearchConfigured()` returns false when only old vars set, returns true when new vars set in tests/unit/webSearch.spec.ts
|
|
116
|
+
- [x] T020 [P] [US3] Unit test for `createWebSearchTool()` graceful degradation — returns `{ results: [], degraded: true, error }` when credential fails, agent creation fails, or network error in tests/unit/webSearch.spec.ts per contracts/web-search-tool.md degradation scenarios
|
|
117
|
+
- [x] T021 [P] [US3] Unit test for citation extraction — parses `url_citation` annotations from Foundry response into `WebSearchResultItem[]` with title, url, snippet and deduplicates sources in tests/unit/webSearch.spec.ts per contracts/web-search-tool.md output format
|
|
118
|
+
- [x] T022 [P] [US3] Integration test for ephemeral agent lifecycle (create → query → cleanup) using faked `AIProjectClient` in tests/integration/webSearchAgent.spec.ts — verify agent created on first call, reused on second call, deleted on `destroyWebSearchSession()` per data-model.md AgentSession state transitions
|
|
119
|
+
- [x] T023 [P] [US3] Unit test for preflight legacy env var check — verify preflight fails with clear migration message when `SOFIA_FOUNDRY_AGENT_ENDPOINT` or `SOFIA_FOUNDRY_AGENT_KEY` are set in tests/unit/preflight.spec.ts per data-model.md LegacyEnvVarError
|
|
120
|
+
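The citation extraction that T021 tests can be sketched as follows. The input types are a simplified assumption about the response shape (output items containing content parts with `url_citation` annotations), not the SDK's own type definitions. Using the annotated text span as the snippet and deduplicating by URL are choices made for this sketch; `extractCitations` is a hypothetical name.

```typescript
// Simplified, assumed shapes; the real SDK types are richer than this.
interface Annotation {
  type: string;
  url?: string;
  title?: string;
  start_index?: number;
  end_index?: number;
}
interface OutputPart {
  type: string;
  text?: string;
  annotations?: Annotation[];
}
interface OutputItem {
  type: string;
  content?: OutputPart[];
}
interface WebSearchResultItem {
  title: string;
  url: string;
  snippet: string;
}

// Collects url_citation annotations, keeping the first occurrence of each URL.
function extractCitations(output: OutputItem[]): WebSearchResultItem[] {
  const seenUrls: string[] = [];
  const items: WebSearchResultItem[] = [];
  for (const item of output) {
    for (const part of item.content ?? []) {
      for (const ann of part.annotations ?? []) {
        const url = ann.url;
        if (ann.type !== "url_citation" || typeof url !== "string") continue;
        if (seenUrls.indexOf(url) !== -1) continue;
        seenUrls.push(url);
        // Snippet is the cited span of the text when indices are present.
        const snippet =
          ann.start_index !== undefined && ann.end_index !== undefined
            ? (part.text ?? "").slice(ann.start_index, ann.end_index)
            : "";
        items.push({ title: ann.title ?? url, url, snippet });
      }
    }
  }
  return items;
}
```

Items without content (for example a tool-call entry) and annotations of other types are skipped rather than treated as errors.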
|
|
121
|
+
### Implementation for User Story 3
|
|
122
|
+
|
|
123
|
+
- [x] T024 [US3] Update `WebSearchConfig` interface in src/mcp/webSearch.ts — replace `endpoint`/`apiKey`/`fetchFn` with `projectEndpoint`/`modelDeploymentName` per data-model.md entity 3
|
|
124
|
+
- [x] T025 [US3] Implement `AgentSession` class in src/mcp/webSearch.ts — lazy initialization with `AIProjectClient` + `DefaultAzureCredential`, `agents.createVersion()` for web_search_preview agent, conversation creation, state tracking (uninitialized → initialized → cleaned up) per data-model.md entity 4 and research.md R4/R8
|
|
125
|
+
- [x] T026 [US3] Implement citation extraction in src/mcp/webSearch.ts — parse `response.output` for `url_citation` annotations, map to `WebSearchResultItem[]` (title, url, snippet), deduplicate into `sources[]` per contracts/web-search-tool.md output format and FR-014
|
|
126
|
+
- [x] T027 [US3] Rewrite `createWebSearchTool()` in src/mcp/webSearch.ts — replace raw HTTP POST with `AgentSession.initialize()` on first call, `openAIClient.responses.create()` for queries, return structured `WebSearchResult` with citations. Handle all degradation scenarios per contracts/web-search-tool.md degradation table
|
|
127
|
+
- [x] T028 [US3] Implement `destroyWebSearchSession()` in src/mcp/webSearch.ts — delete conversation and agent version, register `process.on('beforeExit', ...)` handler, log warnings on cleanup failure (no throw) per research.md R8 lifecycle contract
|
|
128
|
+
- [x] T029 [US3] Update `isWebSearchConfigured()` in src/mcp/webSearch.ts — check `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME` (not legacy vars)
|
|
129
|
+
- [x] T030 [US3] Wire `destroyWebSearchSession()` cleanup into workshop session teardown in src/cli/workshopCommand.ts — call on workshop exit/completion to ensure ephemeral agent is deleted per FR-015
|
|
130
|
+
- [x] T031 [US3] Update src/develop/mcpContextEnricher.ts — ensure `isWebSearchConfigured()` import path is correct and behavior aligns with new env var check
|
|
131
|
+
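The lifecycle described in T025 and T028 (create lazily, reuse, delete on teardown, never throw from cleanup) can be sketched against a fake backend. `AgentBackend` here is a hypothetical two-method interface invented for the sketch; it is not the `@azure/ai-projects` API surface, and the real class also manages a conversation, which is omitted. A failed creation is not retried in this sketch.

```typescript
// Hypothetical minimal backend; NOT the @azure/ai-projects API.
interface AgentBackend {
  createAgent(model: string): Promise<string>; // resolves to an agent id
  deleteAgent(agentId: string): Promise<void>;
}

type SessionState = "uninitialized" | "initialized" | "cleaned-up";

class AgentSession {
  private state: SessionState = "uninitialized";
  private agentId: string | null = null;
  private pending: Promise<string> | null = null;
  private readonly backend: AgentBackend;
  private readonly model: string;
  private readonly warn: (message: string) => void;

  constructor(backend: AgentBackend, model: string, warn: (message: string) => void) {
    this.backend = backend;
    this.model = model;
    this.warn = warn;
  }

  getState(): SessionState {
    return this.state;
  }

  // Creates the agent on first use; later (and concurrent) calls share it.
  ensureAgent(): Promise<string> {
    if (this.state === "cleaned-up") {
      return Promise.reject(new Error("Session already cleaned up."));
    }
    if (this.pending === null) {
      this.pending = this.backend.createAgent(this.model).then((id) => {
        this.agentId = id;
        this.state = "initialized";
        return id;
      });
    }
    return this.pending;
  }

  // Deletes the agent if one exists. Cleanup failures are logged, not thrown.
  async destroy(): Promise<void> {
    if (this.pending !== null) {
      // Wait for an in-flight creation so the agent is not leaked.
      await this.pending.catch(() => undefined);
    }
    const id = this.agentId;
    this.agentId = null;
    this.state = "cleaned-up";
    if (id === null) return;
    try {
      await this.backend.deleteAgent(id);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      this.warn(`Agent cleanup failed: ${message}`);
    }
  }
}
```

Injecting the backend is what lets a test use a fake client, along the lines of the faked `AIProjectClient` mentioned in T022.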
|
|
132
|
+
**Checkpoint**: User Story 3 complete — sofIA CLI uses Foundry Agent Service SDK for web search, ephemeral agent lifecycle works, citations displayed, graceful degradation on failure. All tests in T018-T023 pass. `npm run typecheck` and `npm run lint` pass.
|
|
133
|
+
|
|
134
|
+
---
|
|
135
|
+
|
|
136
|
+
## Phase 6: User Story 4 — Teardown and Cost Management (Priority: P4)
|
|
137
|
+
|
|
138
|
+
**Goal**: User runs `./infra/teardown.sh --resource-group <name>` to delete all deployed resources. Clean exit when resource group doesn't exist.
|
|
139
|
+
|
|
140
|
+
**Independent Test**: Deploy infrastructure, run teardown, verify resource group deleted. Run teardown on non-existent group — clean informational message, exit 0.
|
|
141
|
+
|
|
142
|
+
**FR Coverage**: FR-007
|
|
143
|
+
|
|
144
|
+
### Tests for User Story 4 (REQUIRED) ⚠️
|
|
145
|
+
|
|
146
|
+
> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
|
|
147
|
+
|
|
148
|
+
- [x] T032 [P] [US4] Unit test for teardown script parameter validation — required --resource-group flag, exit code 0 when group not found, exit code 1 on prereq failure in tests/unit/infraTeardown.spec.ts per contracts/web-search-tool.md teardown.sh contract
|
|
149
|
+
|
|
150
|
+
### Implementation for User Story 4
|
|
151
|
+
|
|
152
|
+
- [x] T033 [US4] Implement teardown script `infra/teardown.sh` — parse --resource-group flag, check az CLI prerequisites, verify resource group exists (informational exit 0 if not), prompt for confirmation (unless --yes), execute `az group delete --yes --no-wait`, print confirmation per contracts/web-search-tool.md teardown.sh contract
|
|
153
|
+
- [x] T034 [US4] Make `infra/teardown.sh` executable (chmod +x) and add shebang, set `set -euo pipefail`, handle exit codes (0=success/not-found, 1=prereq fail, 2=deletion fail)
|
|
154
|
+
|
|
155
|
+
**Checkpoint**: User Story 4 complete — teardown script deletes resource group cleanly, handles non-existent groups gracefully. Test in T032 passes.
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## Phase 7: Polish & Cross-Cutting Concerns
|
|
160
|
+
|
|
161
|
+
**Purpose**: Documentation, validation, cleanup across all stories
|
|
162
|
+
|
|
163
|
+
- [x] T035 [P] Update README.md — add "Web Search Setup" section with link to quickstart.md and brief deployment instructions
|
|
164
|
+
- [x] T036 [P] Verify quickstart.md end-to-end flow matches final implementation — deploy → configure → verify → teardown in specs/005-ai-search-deploy/quickstart.md
|
|
165
|
+
- [x] T037 Run `npm run typecheck` and fix any remaining TypeScript errors across all modified files
|
|
166
|
+
- [x] T038 Run `npm run lint` and fix any ESLint `import/order` warnings across all modified files
|
|
167
|
+
- [x] T039 Run full test suite (`npm test`) and ensure no regressions in existing tests
|
|
168
|
+
- [x] T040 [P] Install `dotenv` as a production dependency and create `src/cli/envLoader.ts` — load `.env` file at startup without overwriting existing env vars (FR-017)
|
|
169
|
+
- [x] T041 [P] Wire `loadEnvFile()` call into `src/cli/index.ts` before CLI setup (FR-017)
|
|
170
|
+
- [x] T042 [P] Create unit tests for envLoader in `tests/unit/cli/envLoader.spec.ts` — test loading, no-overwrite, and missing file scenarios
|
|
171
|
+
- [x] T043 [P] Update `infra/deploy.sh` to write `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME` to `.env` file in workspace root (FR-018)
|
|
172
|
+
- [x] T044 Make `--subscription` flag optional in `infra/deploy.sh` — defaults to current az CLI subscription
|
|
173
|
+
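The no-overwrite loading behavior from FR-017 (T040 to T042) can be sketched without touching the file system. The project uses the `dotenv` package for this; the hand-rolled parser below is a deliberate simplification so the example stays self-contained, and `parseEnvText` and `applyEnvFile` are hypothetical names. A missing file is modeled as `null`.

```typescript
type EnvMap = Record<string, string | undefined>;

// Minimal KEY=VALUE parser: skips blank lines and # comments and strips one
// pair of surrounding quotes. Real .env parsers handle more cases.
function parseEnvText(text: string): Record<string, string> {
  const parsed: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = /^(["'])(.*)\1$/.exec(value);
    if (quoted && quoted[2] !== undefined) value = quoted[2];
    parsed[key] = value;
  }
  return parsed;
}

// Copies parsed values into env only where env has no value yet.
// fileText === null models a missing .env file and is not an error.
function applyEnvFile(env: EnvMap, fileText: string | null): string[] {
  if (fileText === null) return [];
  const parsed = parseEnvText(fileText);
  const added: string[] = [];
  for (const key of Object.keys(parsed)) {
    if (env[key] === undefined) {
      env[key] = parsed[key];
      added.push(key);
    }
  }
  return added;
}
```

Returning the list of keys that were actually added gives tests a direct way to assert that existing variables were left alone.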
|
|
174
|
+
---
|
|
175
|
+
|
|
176
|
+
## Dependencies & Execution Order
|
|
177
|
+
|
|
178
|
+
### Phase Dependencies
|
|
179
|
+
|
|
180
|
+
- **Setup (Phase 1)**: No dependencies — start immediately
|
|
181
|
+
- **Foundational (Phase 2)**: Depends on Setup completion — BLOCKS all user stories
|
|
182
|
+
- **US1 (Phase 3)**: Depends on Foundational — creates Bicep + deploy script
|
|
183
|
+
- **US2 (Phase 4)**: Depends on US1 (enhances the same Bicep template + deploy script created in US1)
|
|
184
|
+
- **US3 (Phase 5)**: Depends on Foundational — independent of US1/US2 (uses fakes for testing; only needs real deployment for live validation)
|
|
185
|
+
- **US4 (Phase 6)**: Depends on Foundational — independent of US1-US3
|
|
186
|
+
- **Polish (Phase 7)**: Depends on all desired user stories being complete
|
|
187
|
+
|
|
188
|
+
### User Story Dependencies
|
|
189
|
+
|
|
190
|
+
- **User Story 1 (P1)**: Can start after Foundational (Phase 2) — No dependencies on other stories
|
|
191
|
+
- **User Story 2 (P2)**: Depends on US1 (enhances the Bicep template and deploy script created in US1)
|
|
192
|
+
- **User Story 3 (P3)**: Can start after Foundational (Phase 2) — Independent of US1/US2 (uses faked SDK clients for tests)
|
|
193
|
+
- **User Story 4 (P4)**: Can start after Foundational (Phase 2) — Independent of US1-US3
|
|
194
|
+
|
|
195
|
+
### Within Each User Story
|
|
196
|
+
|
|
197
|
+
- Tests MUST be written and FAIL before implementation
|
|
198
|
+
- Models/interfaces before services
|
|
199
|
+
- Services before integration
|
|
200
|
+
- Core implementation before edge-case handling
|
|
201
|
+
- Story complete before moving to next priority
|
|
202
|
+
|
|
203
|
+
### Parallel Opportunities
|
|
204
|
+
|
|
205
|
+
- T002 and T003 can run in parallel (Phase 1 setup)
|
|
206
|
+
- T004, T005, T006 can run in parallel (Phase 2 foundational — different files)
|
|
207
|
+
- T007 and T008 can run in parallel (US1 tests — different test files)
|
|
208
|
+
- **US1 and US3 can run in parallel** after Foundational (different file sets: infra/ vs src/mcp/)
|
|
209
|
+
- **US1 and US4 can run in parallel** after Foundational (deploy.sh vs teardown.sh)
|
|
210
|
+
- **US3 and US4 can run in parallel** after Foundational (src/ vs infra/)
|
|
211
|
+
- All US3 tests (T018-T023) can run in parallel (different test files or independent test cases)
|
|
212
|
+
- T035 and T036 can run in parallel (different docs)
|
|
213
|
+
|
|
214
|
+
---
|
|
215
|
+
|
|
216
|
+
## Parallel Example: User Story 3
|
|
217
|
+
|
|
218
|
+
```bash
|
|
219
|
+
# Launch all tests for US3 together (6 parallel test tasks):
|
|
220
|
+
T018: Unit test for WebSearchConfig validation in tests/unit/webSearch.spec.ts
|
|
221
|
+
T019: Unit test for legacy env var detection in tests/unit/webSearch.spec.ts
|
|
222
|
+
T020: Unit test for graceful degradation in tests/unit/webSearch.spec.ts
|
|
223
|
+
T021: Unit test for citation extraction in tests/unit/webSearch.spec.ts
|
|
224
|
+
T022: Integration test for agent lifecycle in tests/integration/webSearchAgent.spec.ts
|
|
225
|
+
T023: Unit test for preflight legacy check in tests/unit/preflight.spec.ts
|
|
226
|
+
|
|
227
|
+
# After tests fail (Red), implement in order:
|
|
228
|
+
T024: Update WebSearchConfig interface
|
|
229
|
+
T025: Implement AgentSession class
|
|
230
|
+
T026: Implement citation extraction
|
|
231
|
+
T027: Rewrite createWebSearchTool()
|
|
232
|
+
T028: Implement destroyWebSearchSession()
|
|
233
|
+
T029: Update isWebSearchConfigured()
|
|
234
|
+
T030: Wire cleanup into workshop session
|
|
235
|
+
T031: Update mcpContextEnricher import
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
---
|
|
239
|
+
|
|
240
|
+
## Implementation Strategy
|
|
241
|
+
|
|
242
|
+
### MVP First (User Story 1 Only)
|
|
243
|
+
|
|
244
|
+
1. Complete Phase 1: Setup (T001-T003)
|
|
245
|
+
2. Complete Phase 2: Foundational (T004-T006)
|
|
246
|
+
3. Complete Phase 3: User Story 1 (T007-T012)
|
|
247
|
+
4. **STOP and VALIDATE**: Deploy to Azure, then verify that all 5 resources are created and the env var output is correct
|
|
248
|
+
5. Deploy/demo if ready
|
|
249
|
+
|
|
250
|
+
### Incremental Delivery
|
|
251
|
+
|
|
252
|
+
1. Setup + Foundational → Foundation ready (new deps, env var migration)
|
|
253
|
+
2. Add US1 → Test deploy script → Deploy to Azure (MVP — infrastructure works!)
|
|
254
|
+
3. Add US2 → Verify template quality (parameterization, documentation)
|
|
255
|
+
4. Add US3 → Test CLI integration → Workshop uses web search with citations
|
|
256
|
+
5. Add US4 → Test teardown → Full lifecycle (deploy → use → teardown)
|
|
257
|
+
6. Polish → Docs, lint, typecheck, full suite
|
|
258
|
+
|
|
259
|
+
### Parallel Team Strategy
|
|
260
|
+
|
|
261
|
+
With multiple developers after Foundational phase:
|
|
262
|
+
|
|
263
|
+
1. Team completes Setup + Foundational together
|
|
264
|
+
2. Once Foundational is done:
|
|
265
|
+
- Developer A: US1 (deploy) then US2 (template quality)
|
|
266
|
+
- Developer B: US3 (CLI integration — largest story, 14 tasks)
|
|
267
|
+
- Developer C: US4 (teardown — smallest, 3 tasks)
|
|
268
|
+
3. All merge and Polish together
|
|
269
|
+
|
|
270
|
+
---
|
|
271
|
+
|
|
272
|
+
## Notes
|
|
273
|
+
|
|
274
|
+
- Total tasks: **44**
|
|
275
|
+
- US1: 6 tasks (2 test + 4 impl)
|
|
276
|
+
- US2: 5 tasks (2 test + 3 impl)
|
|
277
|
+
- US3: 14 tasks (6 test + 8 impl) — largest story, core SDK migration
|
|
278
|
+
- US4: 3 tasks (1 test + 2 impl) — smallest story
|
|
279
|
+
- Setup: 3 tasks
|
|
280
|
+
- Foundational: 3 tasks
|
|
281
|
+
- Polish: 10 tasks (includes dotenv integration T040-T042, .env writing T043, subscription-optional T044)
|
|
282
|
+
- Parallel opportunities: US1 ‖ US3 ‖ US4 after Foundational; numerous within-story [P] tasks
|
|
283
|
+
- MVP: US1 alone delivers deployable infrastructure (12 tasks through Phase 3)
|
|
284
|
+
- Each story is independently testable per spec acceptance scenarios
|
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
# Specification Quality Checklist: Workshop Phase Extraction & Tool Wiring Fixes
|
|
2
|
+
|
|
3
|
+
**Purpose**: Validate specification completeness and quality before proceeding to planning
|
|
4
|
+
**Created**: 2026-03-04
|
|
5
|
+
**Feature**: [specs/006-workshop-extraction-fixes/spec.md](../spec.md)
|
|
6
|
+
|
|
7
|
+
## Content Quality
|
|
8
|
+
|
|
9
|
+
- [x] No implementation details (languages, frameworks, APIs)
|
|
10
|
+
- [x] Focused on user value and business needs
|
|
11
|
+
- [x] Written for non-technical stakeholders
|
|
12
|
+
- [x] All mandatory sections completed
|
|
13
|
+
|
|
14
|
+
## Requirement Completeness
|
|
15
|
+
|
|
16
|
+
- [x] No [NEEDS CLARIFICATION] markers remain
|
|
17
|
+
- [x] Requirements are testable and unambiguous
|
|
18
|
+
- [x] Success criteria are measurable
|
|
19
|
+
- [x] Success criteria are technology-agnostic (no implementation details)
|
|
20
|
+
- [x] All acceptance scenarios are defined
|
|
21
|
+
- [x] Edge cases are identified
|
|
22
|
+
- [x] Scope is clearly bounded
|
|
23
|
+
- [x] Dependencies and assumptions identified
|
|
24
|
+
|
|
25
|
+
## Feature Readiness
|
|
26
|
+
|
|
27
|
+
- [x] All functional requirements have clear acceptance criteria
|
|
28
|
+
- [x] User scenarios cover primary flows
|
|
29
|
+
- [x] Feature meets measurable outcomes defined in Success Criteria
|
|
30
|
+
- [x] No implementation details leak into specification
|
|
31
|
+
|
|
32
|
+
## Bug Traceability
|
|
33
|
+
|
|
34
|
+
- [x] Each bug from the assessment is mapped to at least one FR
|
|
35
|
+
- [x] BUG-001 (lazy web search config) → FR-008, FR-009, FR-010
|
|
36
|
+
- [x] BUG-002 (extraction failures) → FR-001 through FR-007, FR-007a
|
|
37
|
+
- [x] BUG-003 (context window timeout) → FR-016 through FR-019, FR-019a
|
|
38
|
+
- [x] BUG-004 (export incompleteness) → FR-020 through FR-024
|
|
39
|
+
- [x] BUG-005 (MCP tools not wired) → FR-011 through FR-015, FR-012a
|
|
40
|
+
- [x] Each bug maps to at least one success criterion
|
|
41
|
+
- [x] Assessment source linked (Zava Industries assessment results)
|
|
42
|
+
|
|
43
|
+
## Cross-Spec Gap Coverage
|
|
44
|
+
|
|
45
|
+
- [x] GAP-A (LLM phase drift) → FR-007b, FR-007c (phase boundary enforcement)
|
|
46
|
+
- [x] GAP-C (WorkIQ explicit wiring) → FR-012a (WorkIQ consent and enrichment storage)
|
|
47
|
+
- [x] GAP-D (enrichment downstream use) → FR-017 (explicitly lists discoveryEnrichment)
|
|
48
|
+
- [x] GAP-E (Mermaid architecture diagrams) → FR-007a (Design summarization includes Mermaid)
|
|
49
|
+
- [x] GAP-G (Select timeout fallback after summarization) → FR-019a (minimal-context retry + user fallback)
|
|
50
|
+
- [x] GAP-B (input-counting test harness) → Noted in Edge Cases as test infrastructure concern
|
|
51
|
+
- [x] GAP-F (cards dataset fidelity) → Out of scope (prompt tuning, not extraction/wiring)
|
|
52
|
+
- [x] GAP-H (search results UX) → Already covered by spec 001 FR-043a/FR-043b
|
|
53
|
+
|
|
54
|
+
## Notes
|
|
55
|
+
|
|
56
|
+
- Spec was generated directly from the Zava Industries full-session assessment (tests/e2e/zava-assessment/results/assessment-results.md)
|
|
57
|
+
- All 5 bugs scored and prioritized by severity
|
|
58
|
+
- 29 functional requirements across 6 categories (original 24 + 5 gap additions: FR-007a, FR-007b, FR-007c, FR-012a, FR-019a)
|
|
59
|
+
- 6 measurable success criteria including the meta-criterion SC-006 (Zava assessment score improvement from 53% to 75%+)
|
|
60
|
+
- Cross-referenced against specs 001, 003, 005 and the `003-next-spec-gaps.md` gap tracker
|
|
61
|
+
- The Out of Scope section explicitly defers prose-based NLP extraction, retry logic, template selection, PTY testing, and cards dataset fidelity
|
|
@@ -0,0 +1,131 @@
|
|
|
1
|
+
# Contract: Post-Phase Summarization Call
|
|
2
|
+
|
|
3
|
+
**Feature**: 006-workshop-extraction-fixes
|
|
4
|
+
**FRs**: FR-001 through FR-007, FR-007a
|
|
5
|
+
|
|
6
|
+
## Purpose
|
|
7
|
+
|
|
8
|
+
After each phase's conversation loop completes, if the expected structured session field is still `null`, make a one-shot LLM call to extract structured data from the full conversation transcript.
|
|
9
|
+
|
|
10
|
+
## Interface
|
|
11
|
+
|
|
12
|
+
### Input
|
|
13
|
+
|
|
14
|
+
```typescript
|
|
15
|
+
phaseSummarize(
|
|
16
|
+
client: CopilotClient,
|
|
17
|
+
phase: PhaseValue,
|
|
18
|
+
session: WorkshopSession,
|
|
19
|
+
handler: PhaseHandler
|
|
20
|
+
): Promise<Partial<WorkshopSession>>
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
### Behavior
|
|
24
|
+
|
|
25
|
+
1. Check if the relevant session field for `phase` is still null:
|
|
26
|
+
- Ideate: `session.ideas`
|
|
27
|
+
- Design: `session.evaluation`
|
|
28
|
+
- Select: `session.selection`
|
|
29
|
+
- Plan: `session.plan`
|
|
30
|
+
- Develop: `session.poc`
|
|
31
|
+
- Discover: `session.businessContext` (unlikely to be null, but included for completeness)
|
|
32
|
+
|
|
33
|
+
2. If the field is already populated, return `{}` (no-op).
|
|
34
|
+
|
|
35
|
+
3. Build the phase transcript from `session.turns` filtered by `t.phase === phase`.
|
|
36
|
+
|
|
37
|
+
4. Create a new `ConversationSession` with a phase-specific summarization system prompt loaded from `src/prompts/summarize/{phase}-summary.md`.
|
|
38
|
+
|
|
39
|
+
5. Send the transcript as a single user message. Collect the full response.
|
|
40
|
+
|
|
41
|
+
6. Run `handler.extractResult(session, response)` on the summarization response.
|
|
42
|
+
|
|
43
|
+
7. Return the extracted updates (may be `{}` if extraction still fails).
|
|
44
|
+
|
|
45
|
+
### Error Handling
|
|
46
|
+
|
|
47
|
+
- If `client.createSession()` or `send()` throws, log a warning and return `{}`.
|
|
48
|
+
- If the response doesn't contain valid JSON matching the schema, log a warning and return `{}`.
|
|
49
|
+
- Never throw — the summarization call is a best-effort fallback.
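
The Behavior and Error Handling rules above can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the type shapes (`Turn`, `ConversationSession`, the `createSession` options), the capitalized phase names, and the extra `loadPrompt` parameter are all assumptions made so the example is self-contained. A real version would presumably read the prompt from `src/prompts/summarize/{phase}-summary.md`.

```typescript
// Local stand-ins: the real sofia-cli types are not shown in this contract.
type PhaseValue = "Discover" | "Ideate" | "Design" | "Select" | "Plan" | "Develop";

interface Turn {
  phase: PhaseValue;
  role: "user" | "assistant";
  content: string;
}

interface WorkshopSession {
  turns: Turn[];
  businessContext: unknown;
  ideas: unknown;
  evaluation: unknown;
  selection: unknown;
  plan: unknown;
  poc: unknown;
}

interface ConversationSession {
  send(message: string): Promise<string>;
}

interface CopilotClient {
  createSession(options: { systemPrompt: string }): Promise<ConversationSession>;
}

interface PhaseHandler {
  extractResult(session: WorkshopSession, response: string): Partial<WorkshopSession>;
}

// Phase -> structured session field, as listed in step 1.
const PHASE_FIELD: Record<PhaseValue, Exclude<keyof WorkshopSession, "turns">> = {
  Discover: "businessContext",
  Ideate: "ideas",
  Design: "evaluation",
  Select: "selection",
  Plan: "plan",
  Develop: "poc",
};

// Hypothetical prompt loader; the default returns a placeholder instead of reading a file.
type PromptLoader = (phase: PhaseValue) => Promise<string>;
const placeholderPrompt: PromptLoader = async () => "You are a data extraction assistant.";

async function phaseSummarize(
  client: CopilotClient,
  phase: PhaseValue,
  session: WorkshopSession,
  handler: PhaseHandler,
  loadPrompt: PromptLoader = placeholderPrompt,
): Promise<Partial<WorkshopSession>> {
  // Steps 1-2: no-op when the structured field is already populated.
  if (session[PHASE_FIELD[phase]] != null) return {};
  try {
    // Step 3: transcript restricted to this phase's turns.
    const transcript = session.turns
      .filter((t) => t.phase === phase)
      .map((t) => `${t.role}: ${t.content}`)
      .join("\n");
    // Steps 4-5: fresh conversation, one user message, full response.
    const conversation = await client.createSession({ systemPrompt: await loadPrompt(phase) });
    const response = await conversation.send(transcript);
    // Steps 6-7: reuse the phase handler's extractor; it may return {}.
    return handler.extractResult(session, response);
  } catch (error) {
    // Best-effort fallback: warn and return {} instead of throwing.
    console.warn(`phaseSummarize(${phase}) failed; continuing without structured data`, error);
    return {};
  }
}
```

Wrapping the transcript build, the session creation, and the extraction in one `try` keeps the "never throw" guarantee in a single place, at the cost of less specific warnings.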
|
|
50
|
+
|
|
51
|
+
### Summarization Prompt Shape (per phase)
|
|
52
|
+
|
|
53
|
+
Each prompt in `src/prompts/summarize/{phase}-summary.md` MUST include:
|
|
54
|
+
|
|
55
|
+
1. A role instruction: "You are a data extraction assistant."
|
|
56
|
+
2. The exact JSON schema shape expected (with field names, types, and constraints).
|
|
57
|
+
3. An instruction to output ONLY a fenced JSON code block.
|
|
58
|
+
4. For Design phase: additionally request a Mermaid architecture diagram (FR-007a).
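
Because each prompt asks for ONLY a fenced JSON code block, the consuming side needs a tolerant way to pull that block out of the reply. The helper below is a hypothetical sketch of one approach (`extractFencedJson` is not a name taken from the package); it returns `null` instead of throwing, so a caller can log a warning and fall back to `{}`.

```typescript
// Build the three-backtick fence marker programmatically so this snippet
// can itself live inside a fenced code block.
const FENCE = "`".repeat(3);

// Matches the first fenced block, with or without a "json" info string.
const FENCED_JSON = new RegExp(FENCE + "(?:json)?[ \\t]*\\r?\\n([\\s\\S]*?)" + FENCE);

// Hypothetical helper: returns the parsed JSON value from the first fenced
// block, or null when no block is found or its body is not valid JSON.
function extractFencedJson(response: string): unknown {
  const match = FENCED_JSON.exec(response);
  if (match === null) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null;
  }
}
```

Schema validation (field names, types, constraints) would still have to happen after parsing; this sketch only covers locating and parsing the block.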
|
|
59
|
+
|
|
60
|
+
---
|
|
61
|
+
|
|
62
|
+
# Contract: Export Writer Conversation Fallback
|
|
63
|
+
|
|
64
|
+
**Feature**: 006-workshop-extraction-fixes
|
|
65
|
+
**FRs**: FR-020 through FR-024
|
|
66
|
+
|
|
67
|
+
## Purpose
|
|
68
|
+
|
|
69
|
+
Export Markdown files for all phases that had conversation data, even if structured session fields are null.
|
|
70
|
+
|
|
71
|
+
## Interface
|
|
72
|
+
|
|
73
|
+
Each `generate{Phase}Markdown(session: WorkshopSession): string | null` function follows this contract:
|
|
74
|
+
|
|
75
|
+
### Output Structure (when both structured data and turns exist)
|
|
76
|
+
|
|
77
|
+
```markdown
|
|
78
|
+
# {Phase} Phase
|
|
79
|
+
|
|
80
|
+
## {Structured Section — phase-specific}
|
|
81
|
+
|
|
82
|
+
{Rendered from session field if non-null}
|
|
83
|
+
|
|
84
|
+
## Conversation
|
|
85
|
+
|
|
86
|
+
**user**: {turn content}
|
|
87
|
+
|
|
88
|
+
**assistant**: {turn content}
|
|
89
|
+
...
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
### Output Structure (when only turns exist)
|
|
93
|
+
|
|
94
|
+
```markdown
|
|
95
|
+
# {Phase} Phase
|
|
96
|
+
|
|
97
|
+
## Conversation
|
|
98
|
+
|
|
99
|
+
**user**: {turn content}
|
|
100
|
+
|
|
101
|
+
**assistant**: {turn content}
|
|
102
|
+
...
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
### Returns `null` Only When
|
|
106
|
+
|
|
107
|
+
- No structured data AND no conversation turns exist for the phase.
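
As a rough illustration of the three cases above (structured data plus turns, turns only, neither), a shared renderer that each `generate{Phase}Markdown` function could delegate to might look like this. The names `renderPhaseMarkdown`, `ExportTurn`, and `StructuredSection` are hypothetical, and the behavior when structured data exists without turns is an assumption, since the contract does not spell it out.

```typescript
// Hypothetical shapes for illustration only.
interface ExportTurn {
  role: "user" | "assistant";
  content: string;
}

interface StructuredSection {
  heading: string; // phase-specific section title
  body: string; // already-rendered Markdown for the session field
}

function renderPhaseMarkdown(
  phaseName: string,
  structured: StructuredSection | null,
  turns: ExportTurn[],
): string | null {
  // Return null only when there is nothing at all to export.
  if (structured === null && turns.length === 0) return null;

  const blocks: string[] = [`# ${phaseName} Phase`];

  // Structured section comes first when the session field is non-null.
  if (structured !== null) {
    blocks.push(`## ${structured.heading}`, structured.body);
  }

  // Conversation fallback: one paragraph per turn, role in bold.
  if (turns.length > 0) {
    blocks.push("## Conversation");
    for (const turn of turns) {
      blocks.push(`**${turn.role}**: ${turn.content}`);
    }
  }

  return blocks.join("\n\n") + "\n";
}
```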
|
|
108
|
+
|
|
109
|
+
### `summary.json` Enhancement
|
|
110
|
+
|
|
111
|
+
```json
|
|
112
|
+
{
|
|
113
|
+
"files": [
|
|
114
|
+
{ "path": "discover.md", "type": "markdown" },
|
|
115
|
+
{ "path": "ideate.md", "type": "markdown" },
|
|
116
|
+
{ "path": "design.md", "type": "markdown" },
|
|
117
|
+
{ "path": "select.md", "type": "markdown" },
|
|
118
|
+
{ "path": "plan.md", "type": "markdown" },
|
|
119
|
+
{ "path": "develop.md", "type": "markdown" }
|
|
120
|
+
],
|
|
121
|
+
"highlights": [
|
|
122
|
+
"Business: {from businessContext}",
|
|
123
|
+
"Ideas: {count} ideas generated",
|
|
124
|
+
"Selection: {ideaId or 'pending'}",
|
|
125
|
+
"Plan: {milestone count} milestones",
|
|
126
|
+
...
|
|
127
|
+
]
|
|
128
|
+
}
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
Highlights include at least one entry for each phase that has turns, even if the structured field is null. Fallback highlights use the opening sentence of the first assistant turn (truncated to 100 characters).
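
The fallback rule can be sketched as a small pure function. This is an assumption-laden illustration: `fallbackHighlight` and `HighlightTurn` are hypothetical names, "opening sentence" is approximated as the text up to the first `.`, `!`, or `?` followed by whitespace or end of input, and any phase label prefix is left to the caller.

```typescript
// Hypothetical turn shape for illustration only.
interface HighlightTurn {
  role: "user" | "assistant";
  content: string;
}

const MAX_HIGHLIGHT_LENGTH = 100;

// Returns the first assistant turn's opening sentence, truncated to 100
// characters, or null when the phase has no assistant turn.
function fallbackHighlight(turns: HighlightTurn[]): string | null {
  const firstAssistant = turns.find((t) => t.role === "assistant");
  if (firstAssistant === undefined) return null;

  const text = firstAssistant.content.trim();
  // Shortest prefix ending in sentence punctuation; whole text if none.
  const sentence = text.match(/^[\s\S]*?[.!?](?=\s|$)/);
  return (sentence !== null ? sentence[0] : text).slice(0, MAX_HIGHLIGHT_LENGTH);
}
```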
|