sofia-cli 0.1.1
- package/.github/agents/copilot-instructions.md +39 -0
- package/.github/agents/speckit.analyze.agent.md +184 -0
- package/.github/agents/speckit.checklist.agent.md +294 -0
- package/.github/agents/speckit.clarify.agent.md +181 -0
- package/.github/agents/speckit.constitution.agent.md +84 -0
- package/.github/agents/speckit.implement.agent.md +135 -0
- package/.github/agents/speckit.plan.agent.md +90 -0
- package/.github/agents/speckit.specify.agent.md +258 -0
- package/.github/agents/speckit.tasks.agent.md +137 -0
- package/.github/agents/speckit.taskstoissues.agent.md +30 -0
- package/.github/copilot-instructions.md +257 -0
- package/.github/prompts/speckit.analyze.prompt.md +3 -0
- package/.github/prompts/speckit.checklist.prompt.md +3 -0
- package/.github/prompts/speckit.clarify.prompt.md +3 -0
- package/.github/prompts/speckit.constitution.prompt.md +3 -0
- package/.github/prompts/speckit.implement.prompt.md +3 -0
- package/.github/prompts/speckit.plan.prompt.md +3 -0
- package/.github/prompts/speckit.specify.prompt.md +3 -0
- package/.github/prompts/speckit.tasks.prompt.md +3 -0
- package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
- package/.github/workflows/ci.yml +38 -0
- package/.prettierrc +6 -0
- package/.specify/memory/constitution.md +181 -0
- package/.specify/scripts/bash/check-prerequisites.sh +166 -0
- package/.specify/scripts/bash/common.sh +156 -0
- package/.specify/scripts/bash/create-new-feature.sh +297 -0
- package/.specify/scripts/bash/setup-plan.sh +61 -0
- package/.specify/scripts/bash/update-agent-context.sh +810 -0
- package/.specify/templates/agent-file-template.md +28 -0
- package/.specify/templates/checklist-template.md +40 -0
- package/.specify/templates/constitution-template.md +50 -0
- package/.specify/templates/plan-template.md +113 -0
- package/.specify/templates/spec-template.md +115 -0
- package/.specify/templates/tasks-template.md +251 -0
- package/.vscode/mcp.json +42 -0
- package/.vscode/settings.json +19 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/LICENSE +21 -0
- package/README.md +213 -0
- package/dist/src/cli/developCommand.js +240 -0
- package/dist/src/cli/directCommands.js +143 -0
- package/dist/src/cli/envLoader.js +16 -0
- package/dist/src/cli/exportCommand.js +53 -0
- package/dist/src/cli/index.js +203 -0
- package/dist/src/cli/ioContext.js +109 -0
- package/dist/src/cli/preflight.js +57 -0
- package/dist/src/cli/statusCommand.js +110 -0
- package/dist/src/cli/workshopCommand.js +400 -0
- package/dist/src/develop/checkpointState.js +86 -0
- package/dist/src/develop/codeGenerator.js +319 -0
- package/dist/src/develop/dynamicScaffolder.js +226 -0
- package/dist/src/develop/githubMcpAdapter.js +122 -0
- package/dist/src/develop/index.js +15 -0
- package/dist/src/develop/mcpContextEnricher.js +195 -0
- package/dist/src/develop/pocScaffolder.js +542 -0
- package/dist/src/develop/ralphLoop.js +659 -0
- package/dist/src/develop/templateRegistry.js +364 -0
- package/dist/src/develop/testRunner.js +202 -0
- package/dist/src/logging/logger.js +58 -0
- package/dist/src/loop/conversationLoop.js +227 -0
- package/dist/src/loop/phaseSummarizer.js +87 -0
- package/dist/src/mcp/mcpManager.js +267 -0
- package/dist/src/mcp/mcpTransport.js +391 -0
- package/dist/src/mcp/retryPolicy.js +47 -0
- package/dist/src/mcp/webSearch.js +254 -0
- package/dist/src/phases/contextSummarizer.js +101 -0
- package/dist/src/phases/discoveryEnricher.js +156 -0
- package/dist/src/phases/phaseExtractors.js +222 -0
- package/dist/src/phases/phaseHandlers.js +328 -0
- package/dist/src/prompts/design.md +51 -0
- package/dist/src/prompts/develop-boundary.md +51 -0
- package/dist/src/prompts/develop.md +111 -0
- package/dist/src/prompts/discover.md +58 -0
- package/dist/src/prompts/ideate.md +56 -0
- package/dist/src/prompts/plan.md +51 -0
- package/dist/src/prompts/promptLoader.js +167 -0
- package/dist/src/prompts/promptLoader.ts +198 -0
- package/dist/src/prompts/select.md +47 -0
- package/dist/src/prompts/summarize/README.md +8 -0
- package/dist/src/prompts/summarize/design-summary.md +37 -0
- package/dist/src/prompts/summarize/develop-summary.md +25 -0
- package/dist/src/prompts/summarize/ideate-summary.md +27 -0
- package/dist/src/prompts/summarize/plan-summary.md +27 -0
- package/dist/src/prompts/summarize/select-summary.md +21 -0
- package/dist/src/prompts/system.md +28 -0
- package/dist/src/sessions/exportPaths.js +22 -0
- package/dist/src/sessions/exportWriter.js +406 -0
- package/dist/src/sessions/sessionManager.js +81 -0
- package/dist/src/sessions/sessionStore.js +65 -0
- package/dist/src/shared/activitySpinner.js +91 -0
- package/dist/src/shared/copilotClient.js +129 -0
- package/dist/src/shared/data/cards.json +1249 -0
- package/dist/src/shared/data/cardsLoader.js +51 -0
- package/dist/src/shared/errorClassifier.js +120 -0
- package/dist/src/shared/events.js +28 -0
- package/dist/src/shared/markdownRenderer.js +34 -0
- package/dist/src/shared/schemas/session.js +265 -0
- package/dist/src/shared/tableRenderer.js +20 -0
- package/dist/src/vendor/chalk.js +2 -0
- package/dist/src/vendor/cli-table3.js +3 -0
- package/dist/src/vendor/commander.js +2 -0
- package/dist/src/vendor/marked-terminal.js +3 -0
- package/dist/src/vendor/marked.js +2 -0
- package/dist/src/vendor/ora.js +2 -0
- package/dist/src/vendor/pino.js +2 -0
- package/dist/src/vendor/zod.js +2 -0
- package/dist/tests/e2e/developE2e.spec.js +126 -0
- package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
- package/dist/tests/e2e/developPty.spec.js +75 -0
- package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
- package/dist/tests/e2e/harness.spec.js +83 -0
- package/dist/tests/e2e/mcpLive.spec.js +120 -0
- package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
- package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
- package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
- package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
- package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
- package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
- package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
- package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
- package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
- package/dist/tests/integration/autoStartConversation.spec.js +138 -0
- package/dist/tests/integration/defaultCommand.spec.js +147 -0
- package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
- package/dist/tests/integration/directCommandTty.spec.js +151 -0
- package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
- package/dist/tests/integration/exportArtifacts.spec.js +202 -0
- package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
- package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
- package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
- package/dist/tests/integration/newSessionFlow.spec.js +343 -0
- package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
- package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
- package/dist/tests/integration/pocScaffold.spec.js +163 -0
- package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
- package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
- package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
- package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
- package/dist/tests/integration/summarizationFlow.spec.js +115 -0
- package/dist/tests/integration/testRunnerReal.spec.js +52 -0
- package/dist/tests/integration/webSearchAgent.spec.js +128 -0
- package/dist/tests/live/copilotSdkLive.spec.js +107 -0
- package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
- package/dist/tests/setup/loadEnv.js +3 -0
- package/dist/tests/unit/cli/developCommand.spec.js +567 -0
- package/dist/tests/unit/cli/directCommands.spec.js +279 -0
- package/dist/tests/unit/cli/envLoader.spec.js +58 -0
- package/dist/tests/unit/cli/ioContext.spec.js +119 -0
- package/dist/tests/unit/cli/preflight.spec.js +108 -0
- package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
- package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
- package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
- package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
- package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
- package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
- package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
- package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
- package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
- package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
- package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
- package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
- package/dist/tests/unit/develop/testRunner.spec.js +249 -0
- package/dist/tests/unit/infraBicep.spec.js +92 -0
- package/dist/tests/unit/infraDeploy.spec.js +82 -0
- package/dist/tests/unit/infraTeardown.spec.js +63 -0
- package/dist/tests/unit/logging/logger.spec.js +43 -0
- package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
- package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
- package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
- package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
- package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
- package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
- package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
- package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
- package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
- package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
- package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
- package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
- package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
- package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
- package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
- package/dist/tests/unit/schemas/session.spec.js +257 -0
- package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
- package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
- package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
- package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
- package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
- package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
- package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
- package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
- package/dist/tests/unit/shared/events.spec.js +55 -0
- package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
- package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
- package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
- package/dist/vitest.config.js +14 -0
- package/dist/vitest.live.config.js +18 -0
- package/docs/README.md +35 -0
- package/docs/architecture.md +169 -0
- package/docs/cli-usage.md +207 -0
- package/docs/environment.md +66 -0
- package/docs/export-format.md +146 -0
- package/docs/session-model.md +113 -0
- package/eslint.config.js +35 -0
- package/infra/deploy.sh +193 -0
- package/infra/gather-env.sh +211 -0
- package/infra/main.bicep +90 -0
- package/infra/main.bicepparam +18 -0
- package/infra/resources.bicep +134 -0
- package/infra/teardown.sh +114 -0
- package/package.json +63 -0
- package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
- package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
- package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
- package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
- package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
- package/specs/001-cli-workshop-rebuild/plan.md +361 -0
- package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
- package/specs/001-cli-workshop-rebuild/research.md +116 -0
- package/specs/001-cli-workshop-rebuild/spec.md +240 -0
- package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
- package/specs/002-poc-generation/contracts/poc-output.md +172 -0
- package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
- package/specs/002-poc-generation/data-model.md +172 -0
- package/specs/002-poc-generation/plan.md +109 -0
- package/specs/002-poc-generation/quickstart.md +97 -0
- package/specs/002-poc-generation/research.md +786 -0
- package/specs/002-poc-generation/spec.md +81 -0
- package/specs/002-poc-generation/tasks-fix.md +198 -0
- package/specs/002-poc-generation/tasks.md +252 -0
- package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
- package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
- package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
- package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
- package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
- package/specs/003-mcp-transport-integration/data-model.md +326 -0
- package/specs/003-mcp-transport-integration/plan.md +114 -0
- package/specs/003-mcp-transport-integration/quickstart.md +311 -0
- package/specs/003-mcp-transport-integration/research.md +395 -0
- package/specs/003-mcp-transport-integration/spec.md +234 -0
- package/specs/003-mcp-transport-integration/tasks.md +324 -0
- package/specs/003-next-spec-gaps.md +150 -0
- package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
- package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
- package/specs/004-dev-resume-hardening/data-model.md +321 -0
- package/specs/004-dev-resume-hardening/plan.md +107 -0
- package/specs/004-dev-resume-hardening/quickstart.md +115 -0
- package/specs/004-dev-resume-hardening/research.md +142 -0
- package/specs/004-dev-resume-hardening/spec.md +221 -0
- package/specs/004-dev-resume-hardening/tasks.md +333 -0
- package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
- package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
- package/specs/005-ai-search-deploy/data-model.md +130 -0
- package/specs/005-ai-search-deploy/plan.md +93 -0
- package/specs/005-ai-search-deploy/quickstart.md +96 -0
- package/specs/005-ai-search-deploy/research.md +187 -0
- package/specs/005-ai-search-deploy/spec.md +143 -0
- package/specs/005-ai-search-deploy/tasks.md +284 -0
- package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
- package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
- package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
- package/specs/006-workshop-extraction-fixes/plan.md +123 -0
- package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
- package/specs/006-workshop-extraction-fixes/research.md +143 -0
- package/specs/006-workshop-extraction-fixes/spec.md +210 -0
- package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
- package/src/cli/developCommand.ts +308 -0
- package/src/cli/directCommands.ts +195 -0
- package/src/cli/envLoader.ts +17 -0
- package/src/cli/exportCommand.ts +65 -0
- package/src/cli/index.ts +249 -0
- package/src/cli/ioContext.ts +139 -0
- package/src/cli/preflight.ts +86 -0
- package/src/cli/statusCommand.ts +118 -0
- package/src/cli/workshopCommand.ts +496 -0
- package/src/develop/checkpointState.ts +121 -0
- package/src/develop/codeGenerator.ts +402 -0
- package/src/develop/dynamicScaffolder.ts +284 -0
- package/src/develop/githubMcpAdapter.ts +199 -0
- package/src/develop/index.ts +34 -0
- package/src/develop/mcpContextEnricher.ts +279 -0
- package/src/develop/pocScaffolder.ts +646 -0
- package/src/develop/ralphLoop.ts +1044 -0
- package/src/develop/templateRegistry.ts +427 -0
- package/src/develop/testRunner.ts +276 -0
- package/src/logging/logger.ts +73 -0
- package/src/loop/conversationLoop.ts +355 -0
- package/src/loop/phaseSummarizer.ts +114 -0
- package/src/mcp/mcpManager.ts +365 -0
- package/src/mcp/mcpTransport.ts +562 -0
- package/src/mcp/retryPolicy.ts +87 -0
- package/src/mcp/webSearch.ts +388 -0
- package/src/originalPrompts/design_thinking.md +178 -0
- package/src/originalPrompts/design_thinking_persona.md +76 -0
- package/src/originalPrompts/document_generator_example.md +77 -0
- package/src/originalPrompts/document_generator_persona.md +47 -0
- package/src/originalPrompts/facilitator_persona.md +125 -0
- package/src/originalPrompts/guardrails.md +47 -0
- package/src/phases/contextSummarizer.ts +154 -0
- package/src/phases/discoveryEnricher.ts +223 -0
- package/src/phases/phaseExtractors.ts +247 -0
- package/src/phases/phaseHandlers.ts +450 -0
- package/src/prompts/design.md +51 -0
- package/src/prompts/develop-boundary.md +51 -0
- package/src/prompts/develop.md +111 -0
- package/src/prompts/discover.md +58 -0
- package/src/prompts/ideate.md +56 -0
- package/src/prompts/plan.md +51 -0
- package/src/prompts/promptLoader.ts +198 -0
- package/src/prompts/select.md +47 -0
- package/src/prompts/summarize/README.md +8 -0
- package/src/prompts/summarize/design-summary.md +37 -0
- package/src/prompts/summarize/develop-summary.md +25 -0
- package/src/prompts/summarize/ideate-summary.md +27 -0
- package/src/prompts/summarize/plan-summary.md +27 -0
- package/src/prompts/summarize/select-summary.md +21 -0
- package/src/prompts/system.md +28 -0
- package/src/sessions/exportPaths.ts +28 -0
- package/src/sessions/exportWriter.ts +490 -0
- package/src/sessions/sessionManager.ts +119 -0
- package/src/sessions/sessionStore.ts +69 -0
- package/src/shared/activitySpinner.ts +108 -0
- package/src/shared/copilotClient.ts +291 -0
- package/src/shared/data/cards.json +1249 -0
- package/src/shared/data/cardsLoader.ts +70 -0
- package/src/shared/errorClassifier.ts +160 -0
- package/src/shared/events.ts +103 -0
- package/src/shared/markdownRenderer.ts +44 -0
- package/src/shared/schemas/session.ts +346 -0
- package/src/shared/tableRenderer.ts +28 -0
- package/src/types/marked-terminal.d.ts +5 -0
- package/src/vendor/chalk.ts +2 -0
- package/src/vendor/cli-table3.ts +3 -0
- package/src/vendor/commander.ts +2 -0
- package/src/vendor/marked-terminal.ts +3 -0
- package/src/vendor/marked.ts +2 -0
- package/src/vendor/ora.ts +2 -0
- package/src/vendor/pino.ts +3 -0
- package/src/vendor/zod.ts +3 -0
- package/tests/e2e/developE2e.spec.ts +152 -0
- package/tests/e2e/developFailureE2e.spec.ts +289 -0
- package/tests/e2e/developPty.spec.ts +86 -0
- package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
- package/tests/e2e/harness.spec.ts +104 -0
- package/tests/e2e/mcpLive.spec.ts +149 -0
- package/tests/e2e/newSession.e2e.spec.ts +245 -0
- package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
- package/tests/e2e/workiqEnrichment.spec.ts +72 -0
- package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
- package/tests/e2e/zava-assessment/company-profile.md +98 -0
- package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
- package/tests/e2e/zavaSimulation.spec.ts +511 -0
- package/tests/fixtures/completedSession.json +141 -0
- package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
- package/tests/fixtures/test-fixture-project/package.json +12 -0
- package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
- package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
- package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
- package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
- package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
- package/tests/integration/autoStartConversation.spec.ts +168 -0
- package/tests/integration/defaultCommand.spec.ts +179 -0
- package/tests/integration/directCommandNonTty.spec.ts +260 -0
- package/tests/integration/directCommandTty.spec.ts +185 -0
- package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
- package/tests/integration/exportArtifacts.spec.ts +232 -0
- package/tests/integration/exportFallbackFlow.spec.ts +115 -0
- package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
- package/tests/integration/mcpTransportFlow.spec.ts +178 -0
- package/tests/integration/newSessionFlow.spec.ts +406 -0
- package/tests/integration/pocGithubMcp.spec.ts +224 -0
- package/tests/integration/pocLocalFallback.spec.ts +205 -0
- package/tests/integration/pocScaffold.spec.ts +220 -0
- package/tests/integration/ralphLoopFlow.spec.ts +430 -0
- package/tests/integration/ralphLoopPartial.spec.ts +416 -0
- package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
- package/tests/integration/spinnerLifecycle.spec.ts +270 -0
- package/tests/integration/summarizationFlow.spec.ts +135 -0
- package/tests/integration/testRunnerReal.spec.ts +63 -0
- package/tests/integration/webSearchAgent.spec.ts +155 -0
- package/tests/live/copilotSdkLive.spec.ts +149 -0
- package/tests/live/zavaFullWorkshop.spec.ts +515 -0
- package/tests/setup/loadEnv.ts +5 -0
- package/tests/unit/cli/developCommand.spec.ts +679 -0
- package/tests/unit/cli/directCommands.spec.ts +325 -0
- package/tests/unit/cli/envLoader.spec.ts +73 -0
- package/tests/unit/cli/ioContext.spec.ts +148 -0
- package/tests/unit/cli/preflight.spec.ts +125 -0
- package/tests/unit/cli/statusCommand.spec.ts +134 -0
- package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
- package/tests/unit/cli/workshopCommand.spec.ts +378 -0
- package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
- package/tests/unit/develop/checkpointState.spec.ts +378 -0
- package/tests/unit/develop/codeGenerator.spec.ts +447 -0
- package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
- package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
- package/tests/unit/develop/outputValidator.spec.ts +134 -0
- package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
- package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
- package/tests/unit/develop/templateRegistry.spec.ts +106 -0
- package/tests/unit/develop/testRunner.spec.ts +294 -0
- package/tests/unit/infraBicep.spec.ts +116 -0
- package/tests/unit/infraDeploy.spec.ts +102 -0
- package/tests/unit/infraTeardown.spec.ts +77 -0
- package/tests/unit/logging/logger.spec.ts +50 -0
- package/tests/unit/loop/conversationLoop.spec.ts +719 -0
- package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
- package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
- package/tests/unit/mcp/mcpManager.spec.ts +336 -0
- package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
- package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
- package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
- package/tests/unit/mcp/webSearch.spec.ts +718 -0
- package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
- package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
- package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
- package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
- package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
- package/tests/unit/prompts/promptLoader.spec.ts +144 -0
- package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
- package/tests/unit/schemas/session.spec.ts +328 -0
- package/tests/unit/sessions/exportPaths.spec.ts +38 -0
- package/tests/unit/sessions/exportWriter.spec.ts +737 -0
- package/tests/unit/sessions/sessionManager.spec.ts +174 -0
- package/tests/unit/sessions/sessionStore.spec.ts +136 -0
- package/tests/unit/shared/activitySpinner.spec.ts +211 -0
- package/tests/unit/shared/cardsLoader.spec.ts +89 -0
- package/tests/unit/shared/copilotClient.spec.ts +185 -0
- package/tests/unit/shared/errorClassifier.spec.ts +152 -0
- package/tests/unit/shared/events.spec.ts +71 -0
- package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
- package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
- package/tests/unit/shared/tableRenderer.spec.ts +38 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +15 -0
- package/vitest.live.config.ts +19 -0
@@ -0,0 +1,257 @@
# sofIA - Copilot Instructions

sofIA is an agentic system built with the **GitHub Copilot SDK** that implements the AI Discovery Cards workshop process. It guides users through discovering, ideating, designing, planning, and developing AI solutions for business needs.

## Project Overview

This project reimagines [Microsoft's AI Discovery Agent (AIDA)](https://github.com/microsoft/ai-discovery-agent/) using the GitHub Copilot SDK (`@github/copilot-sdk`) instead of Python/LangGraph. It extends AIDA's workshop facilitation by adding:

1. **Idea Selection** - Automatically selects the most feasible AI use case from generated ideas
2. **Planning** - Creates implementation plans for the selected idea
3. **PoC Development** - Generates working proof-of-concept code to demonstrate the idea
4. **Discovery Cards** - from [AI Discovery Cards Agent](https://github.com/microsoft-partner-solutions-ai/agent-guides/tree/main/ai-discovery-cards-agent)

## Architecture

### Agent Flow

```
User Input → Discovery Agent → Ideation Agent → Design Agent → Selection Agent → Planning Agent → Development Agent
```

Examples for these agents can be found in the [src/originalPrompts/](../src/originalPrompts/) directory.
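
The agent flow above is a fixed, linear sequence of phases. As a rough sketch, it could be modeled as an ordered tuple with a helper that advances to the next phase. The phase identifiers and `nextPhase` are illustrative assumptions (they echo the prompt file names in this package, such as `discover.md` and `ideate.md`), not the project's actual API.

```typescript
// Hypothetical phase identifiers mirroring the agent flow; not taken from the sofIA source.
const PHASES = ["discover", "ideate", "design", "select", "plan", "develop"] as const;
type Phase = (typeof PHASES)[number];

// Returns the phase that follows `current`, or null once the flow is complete.
function nextPhase(current: Phase): Phase | null {
  const index = PHASES.indexOf(current);
  return PHASES[index + 1] ?? null;
}
```

Keeping the order in one tuple means the type and the runtime sequence cannot drift apart.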

### Core Components

- **Copilot Enabled App** - Main entry point handling GitHub Copilot chat interactions via `@github/copilot-sdk`
- **MCP Integrations** - External services for gathering context and generating solutions:
  - **WorkIQ** - Process analysis and task discovery
  - **Context7** - Documentation and context retrieval
  - **Microsoft Learn** - Azure/AI documentation
  - **GitHub MCP** - Repository search, code examples, and issue tracking

### AI Discovery Workshop Process

The system implements the 12-step AI Discovery Cards methodology, then goes further by selecting the best idea, creating a plan, and generating PoC code:

### Phase 1: AI discovery and ideation

For each step:

- Ask for required input.
- Summarize or reflect back what was shared.
- Propose moving to the next step only when ready.

#### Step 1: Understand the Business

- Ask for a description of the business and its challenges.
- Store this information for later use.

#### Step 2: Choose a Topic

- Identify areas to work on.
- Prioritize and define today’s focus.

#### Step 3: Ideate Activities

- Brainstorm key activities in the focus area.
- Identify what’s not being done due to difficulty.

#### Step 4: Map Workflow

- Visualize the activity flow.
- Vote on critical steps based on business and human value.
- Identify key metrics (e.g., hours/week, NSAT).

#### Step 5: Explore AI Envisioning Cards

- Ask the AI Discovery Expert to present cards to attendees.

#### Step 6: Score Cards

- Ask which cards were selected and how they were scored.

#### Step 7: Review Top Cards

- Select up to 15 cards.
- Aggregate similar ones.
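
One way to picture Step 7 is as a sort, a cut-off at 15, and a merge of near-duplicates. The sketch below assumes each card carries a numeric `score` from Step 6 and a `theme` label used to decide which cards count as similar; both fields and the function name are hypothetical.

```typescript
interface ScoredCard {
  name: string;
  theme: string; // assumption: cards sharing a theme count as "similar"
  score: number; // assumption: higher is better
}

interface CardGroup {
  theme: string;
  cards: string[];
  bestScore: number;
}

const MAX_CARDS = 15;

function reviewTopCards(cards: ScoredCard[]): CardGroup[] {
  // Keep the highest-scoring cards, up to the limit of 15.
  const top = [...cards].sort((a, b) => b.score - a.score).slice(0, MAX_CARDS);

  // Merge the kept cards by theme. Because `top` is sorted, the first card
  // seen for a theme carries that theme's best score.
  const groups = new Map<string, CardGroup>();
  for (const card of top) {
    const group = groups.get(card.theme);
    if (group) {
      group.cards.push(card.name);
    } else {
      groups.set(card.theme, { theme: card.theme, cards: [card.name], bestScore: card.score });
    }
  }
  return Array.from(groups.values());
}
```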

#### Step 8: Map Cards to Workflow

- Align cards to workflow steps.
- Ensure key metrics are clear.

#### Step 9: Generate Ideas

- Ask the Design Thinking Expert to help ideate for each step.

#### Step 10: Create Idea Cards

For each idea, capture:

- **Title**
- **Description**
- **Workflow Steps Covered**
- **Aspirational Solution Scope**
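
The four fields above map naturally onto a small record type. This is a minimal sketch; the property names and the `missingFields` helper are assumptions for illustration, not the session schema shipped in the package.

```typescript
interface IdeaCard {
  title: string;
  description: string;
  workflowStepsCovered: string[];
  aspirationalSolutionScope: string;
}

// Lists the fields that still need to be captured before the card is complete.
function missingFields(card: Partial<IdeaCard>): (keyof IdeaCard)[] {
  const missing: (keyof IdeaCard)[] = [];
  if (!card.title?.trim()) missing.push("title");
  if (!card.description?.trim()) missing.push("description");
  if (!card.workflowStepsCovered?.length) missing.push("workflowStepsCovered");
  if (!card.aspirationalSolutionScope?.trim()) missing.push("aspirationalSolutionScope");
  return missing;
}
```

A facilitator agent could use such a check to decide which question to ask next before moving on to Step 11.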

#### Step 11: Evaluate Ideas

- Use a feasibility/value matrix.
- Consider KPIs and metrics.
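
A feasibility/value matrix is usually read as four quadrants. The sketch below assumes both axes are scored from 1 to 5 and split at a midpoint of 3; the scale, the threshold, and the quadrant labels are illustrative choices, not something the workshop method prescribes.

```typescript
type Quadrant = "quick win" | "strategic bet" | "low priority" | "avoid";

// Places an idea in one of four quadrants based on two scores.
function classify(feasibility: number, value: number, threshold = 3): Quadrant {
  const feasible = feasibility >= threshold;
  const valuable = value >= threshold;
  if (feasible && valuable) return "quick win";
  if (!feasible && valuable) return "strategic bet";
  if (feasible && !valuable) return "low priority";
  return "avoid";
}
```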

#### Step 12: Assess Impact

For each idea, evaluate:

- Data needed
- Risks
- Business impact
- Human value
- Key metrics influenced

### Phase 2: Idea Selection

After generating and evaluating ideas, the system automatically selects the most promising one based on feasibility, business impact, and human value. It explains why that idea was chosen and lets the user confirm or override the selection before proceeding to planning and development.
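
The selection step can be sketched as a ranking with an explanation and an optional user override. The example below assumes each criterion is scored from 1 to 5 and weighted equally; the types, the scoring rule, and `selectIdea` are hypothetical and only illustrate the confirm-or-override flow.

```typescript
interface EvaluatedIdea {
  title: string;
  feasibility: number; // assumption: 1 (low) to 5 (high)
  businessImpact: number;
  humanValue: number;
}

interface Selection {
  idea: EvaluatedIdea;
  rationale: string;
  overriddenByUser: boolean;
}

function totalScore(idea: EvaluatedIdea): number {
  return idea.feasibility + idea.businessImpact + idea.humanValue;
}

function selectIdea(ideas: EvaluatedIdea[], userChoice?: string): Selection {
  if (ideas.length === 0) throw new Error("No ideas to select from");

  // Automatic pick: highest combined score; the earliest idea wins ties.
  const best = ideas.reduce((a, b) => (totalScore(b) > totalScore(a) ? b : a));

  // The user may override the automatic pick by naming another idea.
  const override = userChoice ? ideas.find((i) => i.title === userChoice) : undefined;
  if (override && override !== best) {
    return {
      idea: override,
      rationale: `Chosen by the user instead of "${best.title}".`,
      overriddenByUser: true,
    };
  }
  return {
    idea: best,
    rationale: `"${best.title}" has the highest combined score (${totalScore(best)}).`,
    overriddenByUser: false,
  };
}
```

Returning the rationale alongside the idea keeps the choice transparent, whichever path was taken.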

### Phase 3: Planning and PoC Development

Once an idea is selected, the system will create a high-level implementation plan, breaking down the idea into actionable steps. It will then generate proof-of-concept code to demonstrate the core functionality of the idea, using best practices for modularity and maintainability. The generated code will be designed to be easily extendable for full implementation after the workshop.

This code will be generated by a [Ralph Loop](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop) agent that iteratively refines the implementation until the PoC meets the defined functionality, then stops. The system will also provide guidance on next steps for full implementation after the workshop concludes.
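
The iterate-until-done behavior described above can be sketched as a bounded loop: run the tests, stop if they pass, otherwise ask for a fix and try again. The callback names, the report shape, and the iteration cap are assumptions for illustration, and the callbacks are synchronous only to keep the sketch short; this is not the package's `ralphLoop` implementation.

```typescript
interface TestReport {
  passed: boolean;
  failures: string[];
}

interface LoopResult {
  iterations: number;
  passed: boolean;
}

// Hypothetical refine loop: test, stop when green, otherwise fix and retry.
function refineUntilGreen(
  runTests: () => TestReport,
  applyFix: (failures: string[]) => void,
  maxIterations = 5,
): LoopResult {
  let iterations = 0;
  while (iterations < maxIterations) {
    iterations++;
    const report = runTests();
    if (report.passed) return { iterations, passed: true };
    // Skip the fix on the final iteration: nothing would re-test it.
    if (iterations < maxIterations) applyFix(report.failures);
  }
  return { iterations, passed: false };
}
```

The cap matters as much as the success check: without it, a PoC that never turns green would keep the loop running indefinitely.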

## Tech Stack

- **Runtime**: Node.js / TypeScript
- **SDK**: `@github/copilot-sdk`
- **MCP Protocol**: Model Context Protocol for tool integrations
- **Deployment**: GitHub App (Copilot SDK enabled App)

> **Note**: This repository currently contains specifications, workshop prompts, and
> governance/templates (under `.specify/`). If/when implementation code is added,
> it must follow the constitution and the stack expectations above.

## Development Commands (when implementation code exists)

If this repo contains a `package.json`, prefer the scripts defined there
(`npm test`, `npm run lint`, etc.). Do not invent commands that do not exist.
|
|
136
|
+
|
|
137
|
+
## Key Conventions
|
|
138
|
+
|
|
139
|
+
### All Changes — Test-Driven Development (TDD) + Lint/Typecheck Loop
|
|
140
|
+
|
|
141
|
+
All new behavior and bug fixes **must** follow a TDD workflow (Red → Green → Lint → Review).
|
|
142
|
+
Do not modify production code until a failing test proves the change is needed.
|
|
143
|
+
**Code must never be committed if `npm run lint` or `npm run typecheck` report errors.**
|
|
144
|
+
|
|
145
|
+
1. **Reproduce** — Identify the root cause and the exact code path that fails.
|
|
146
|
+
2. **Write a failing test first** — Create a test (unit or integration) that exercises the buggy behaviour and fails on the current code. The test name should describe the symptom (e.g., `"captures LLM text after tool-calling loop"`).
|
|
147
|
+
3. **Run the test** — Confirm it fails for the expected reason (`npm test -- <test-file>`).
|
|
148
|
+
4. **Fix the production code** — Make the minimal change needed to make the test pass.
|
|
149
|
+
5. **Run lint + typecheck** — `npm run lint && npm run typecheck` must both pass. Fix any errors before continuing.
|
|
150
|
+
6. **Run the test again** — Confirm it passes after the lint/typecheck fixes.
|
|
151
|
+
7. **Run the full suite** — Ensure the full test suite remains green (no regressions).
|
|
152
|
+
8. **Run lint + typecheck one final time** — Confirm both still pass after the full suite run. **Do not commit until they do.**
|
|
153
|
+
|
|
154
|
+
> **Loop invariant:** After every code change — no matter how small — run `npm run lint && npm run typecheck` before moving on. Treat a lint or typecheck failure the same as a failing test: stop, fix, re-run.
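The loop invariant above can be sketched as a small gate that runs the checks in order and stops at the first failure. This is an illustrative sketch only: `Check`, `GateResult`, and `runGate` are hypothetical names that do not come from this repository, and in practice each `run` callback would shell out to the corresponding `npm` script.

```typescript
// Hypothetical sketch: a commit gate that treats lint, typecheck, and test
// failures identically. The `run` callbacks are stand-ins for real commands.
type Check = { name: string; run: () => boolean };

type GateResult =
  | { ok: true }
  | { ok: false; failed: string };

function runGate(checks: Check[]): GateResult {
  for (const check of checks) {
    // Stop at the first failing check: fix it, then re-run the whole gate.
    if (!check.run()) {
      return { ok: false, failed: check.name };
    }
  }
  return { ok: true };
}

// Example wiring; each callback returns true when its command would exit with 0.
const gate = runGate([
  { name: 'npm run lint', run: () => true },
  { name: 'npm run typecheck', run: () => false },
  { name: 'npm test', run: () => true },
]);

console.log(gate); // { ok: false, failed: 'npm run typecheck' }
```

Stopping at the first failure keeps the feedback loop short: the later checks are only meaningful once the earlier ones pass.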
|
|
155
|
+
|
|
156
|
+
**Test placement guidelines (when a test suite exists):**
|
|
157
|
+
| Scope | Directory | When to use |
|
|
158
|
+
|-------|-----------|-------------|
|
|
159
|
+
| Unit | `tests/unit/` | Pure-function logic, renderers, models, single-module behaviour |
|
|
160
|
+
| Integration | `tests/integration/` | Multi-module flows, CLI subprocess, conversation loops |
|
|
161
|
+
| Contract | `tests/contract/` | CLI command shapes, error formats, public API surface |
|
|
162
|
+
|
|
163
|
+
**If tests do not exist yet:** The first implementation work for a feature must include setting up a minimal test runner and a first failing test, then proceed with implementation.
|
|
164
|
+
|
|
165
|
+
**Mock boundaries:** Mock at the module boundary (e.g., Vitest `vi.mock()`), not inside functions. For Copilot SDK tests, prefer deterministic fakes for streaming/tool-calling sessions.
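A deterministic fake for a streaming/tool-calling session might look like the sketch below. The event shapes and the `FakeSession` class are assumptions made for illustration; they are not the Copilot SDK's actual types, which are not shown in this document.

```typescript
// Hypothetical event shapes for illustration; the real Copilot SDK types may differ.
type SessionEvent =
  | { type: 'text'; content: string }
  | { type: 'tool_call'; tool: string }
  | { type: 'idle' };

// A deterministic fake: replays a scripted event sequence, with no network or LLM.
class FakeSession {
  private readonly script: SessionEvent[];

  constructor(script: SessionEvent[]) {
    this.script = script;
  }

  send(_prompt: string, onEvent: (event: SessionEvent) => void): void {
    for (const event of this.script) {
      onEvent(event);
    }
  }
}

// Code under test: collects all text emitted during a turn, including text
// that arrives after a tool call.
function collectText(session: FakeSession, prompt: string): string {
  let text = '';
  session.send(prompt, (event) => {
    if (event.type === 'text') {
      text += event.content;
    }
  });
  return text;
}

const fakeSession = new FakeSession([
  { type: 'text', content: 'Looking that up. ' },
  { type: 'tool_call', tool: 'search' },
  { type: 'text', content: 'Here is the answer.' },
  { type: 'idle' },
]);

console.log(collectText(fakeSession, 'hello')); // Looking that up. Here is the answer.
```

Because the script is fixed, a test built on this fake documents the expected event sequence and fails the same way on every run.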
|
|
166
|
+
|
|
167
|
+
> **Rationale:** Several bugs in the streaming pipeline (resolveIdle, onComplete fallback) were fixed without tests initially, requiring rework. Writing the test first catches regressions immediately and documents the expected SDK event sequence. Lint and typecheck failures have similarly caused hidden breakage (unused imports, removed interface properties still referenced in tests) that only surfaces at CI time — running them in every iteration prevents that drift.
|
|
168
|
+
|
|
169
|
+
|
|
170
|
+
### Agent State Management
|
|
171
|
+
Each agent phase should:
|
|
172
|
+
- Accept context from previous phases
|
|
173
|
+
- Emit structured output for the next phase
|
|
174
|
+
- Support checkpointing for long-running sessions
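As a minimal sketch of the checkpointing requirement, state can be serialized after each phase and validated when it is restored. The `Checkpoint` shape and both function names are hypothetical; only the phase names are taken from this package's constitution.

```typescript
// Hypothetical checkpoint format; only the phase names come from the constitution.
type PhaseName = 'Discover' | 'Ideate' | 'Design' | 'Select' | 'Plan' | 'Develop';

interface Checkpoint {
  phase: PhaseName;
  // Structured output of every phase completed so far, keyed by phase name.
  outputs: Partial<Record<PhaseName, unknown>>;
}

// Serialize state so a long-running session can be resumed later.
function saveCheckpoint(checkpoint: Checkpoint): string {
  return JSON.stringify(checkpoint);
}

// Restore state, failing fast if the payload is not a usable checkpoint.
function loadCheckpoint(raw: string): Checkpoint {
  const parsed = JSON.parse(raw) as Partial<Checkpoint> | null;
  if (
    parsed === null ||
    typeof parsed !== 'object' ||
    parsed.phase === undefined ||
    parsed.outputs === undefined
  ) {
    throw new Error('Invalid checkpoint payload');
  }
  return { phase: parsed.phase, outputs: parsed.outputs };
}

const savedCheckpoint = saveCheckpoint({
  phase: 'Ideate',
  outputs: { Discover: { goals: ['reduce churn'] } },
});
const restored = loadCheckpoint(savedCheckpoint);

console.log(restored.phase); // Ideate
```

A production version would likely validate the payload against a schema rather than checking two fields by hand.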
|
|
175
|
+
|
|
176
|
+
### Prompt Engineering
|
|
177
|
+
|
|
178
|
+
- Canonical prompts used by the runtime should live in `src/prompts/`.
|
|
179
|
+
- `src/originalPrompts/` is for inspiration/legacy examples; it is not the source of truth for production prompts.
|
|
180
|
+
- Use markdown format for prompt templates
|
|
181
|
+
- Include few-shot examples where applicable
|
|
182
|
+
- Reference the [AI Discovery Cards Agent Guide](https://github.com/microsoft-partner-solutions-ai/agent-guides/tree/main/ai-discovery-cards-agent) for:
  - System instructions and workshop methodology
  - Knowledge sources and suggested prompts
  - Uploaded reference files (workshop materials, card decks)
|
|
186
|
+
|
|
187
|
+
### Import Ordering & Linting
|
|
188
|
+
|
|
189
|
+
The project uses ESLint with `eslint-plugin-import` and the `import/order` rule set to `warn`. Imports **must** be separated into groups with a blank line between each group:
|
|
190
|
+
|
|
191
|
+
1. **Built-in / External** — Node.js built-ins and `node_modules` packages (e.g., `commander`, `vitest`, `pino`)
|
|
192
|
+
2. **Internal / Parent / Sibling** — Project-relative imports (e.g., `../shared/schemas/session.js`)
|
|
193
|
+
|
|
194
|
+
```typescript
// ✅ Correct — blank line between groups
import { Command } from 'commander';

import type { PhaseValue } from '../shared/schemas/session.js';

// ❌ Wrong — no blank line between external and internal
import { Command } from 'commander';
import type { PhaseValue } from '../shared/schemas/session.js';
```
|
|
204
|
+
|
|
205
|
+
**Run `npm run lint` after every code change — not just at the end.** Lint failures block commits the same way failing tests do. If the linter reports `import/order` warnings, add blank lines between the import groups.
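One rule configuration that would produce the grouping described above is sketched here as a plain object. The repository's actual ESLint config is not shown in this document, so treat the exact options as an assumption; `groups` and `newlines-between` are options of the `import/order` rule in `eslint-plugin-import`.

```typescript
// Assumed rule options; the repository's real ESLint config is not shown here.
const importOrderRule = [
  'warn',
  {
    // Two groups, matching the convention described above.
    groups: [
      ['builtin', 'external'],
      ['internal', 'parent', 'sibling'],
    ],
    // Require a blank line between the two groups.
    'newlines-between': 'always',
  },
] as const;

// Would sit under `rules` in the ESLint config, keyed by the rule name.
const rules = { 'import/order': importOrderRule };

console.log(JSON.stringify(rules, null, 2));
```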
|
|
206
|
+
|
|
207
|
+
### Typecheck
|
|
208
|
+
|
|
209
|
+
The project enforces strict TypeScript checking via `npm run typecheck` (`tsc --noEmit`). **Run `npm run typecheck` after every code change — not just at the end.** Typecheck failures block commits the same way failing tests do.
|
|
210
|
+
|
|
211
|
+
1. Run `npm run typecheck` and fix all errors before proceeding to the next step.
|
|
212
|
+
2. Never suppress errors with `@ts-ignore` or `any` — use proper types, type narrowing, or Vitest's `Mock<>` generic.
|
|
213
|
+
3. For third-party packages without `@types`, add an ambient module declaration in `src/types/<package>.d.ts`.
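Rule 2 can be illustrated with a type guard that narrows `unknown` instead of reaching for `any`. The `ExportEntry` shape below is hypothetical and exists only to demonstrate the technique.

```typescript
// Hypothetical payload shape, used only to illustrate narrowing from `unknown`.
interface ExportEntry {
  title: string;
  score: number;
}

// A type guard narrows `unknown` safely, so no `any` or `@ts-ignore` is needed.
function isExportEntry(value: unknown): value is ExportEntry {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return typeof record['title'] === 'string' && typeof record['score'] === 'number';
}

function describeEntry(value: unknown): string {
  if (!isExportEntry(value)) {
    return 'invalid entry';
  }
  // Inside this branch the compiler knows `value` is an ExportEntry.
  return `${value.title}: ${value.score}`;
}

console.log(describeEntry({ title: 'Churn model', score: 4 })); // Churn model: 4
console.log(describeEntry({ title: 42 })); // invalid entry
```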
|
|
214
|
+
|
|
215
|
+
> **Rationale:** Type mismatches between production code and Zod schemas (e.g., wrong property names in `exportWriter.ts`) were only caught by strict typechecking, not by tests. Unused imports left behind after interface refactoring (e.g., removing `githubAdapter` from `RalphLoopOptions`) caused silent CI failures that would have been caught immediately by running `npm run lint && npm run typecheck` in every loop iteration.
|
|
216
|
+
|
|
217
|
+
## MCP Server Configuration
|
|
218
|
+
|
|
219
|
+
The project uses Model Context Protocol servers for external integrations. Configuration is in:
|
|
220
|
+
- `.vscode/settings.json` - VS Code / GitHub Copilot integration
|
|
221
|
+
- `.vscode/mcp.json` - MCP server configuration
|
|
222
|
+
|
|
223
|
+
Use these servers whenever the agent flow needs context retrieval or tool calls.
|
|
224
|
+
|
|
225
|
+
### Available MCP Servers
|
|
226
|
+
|
|
227
|
+
| Server | Config | Purpose |
|
|
228
|
+
|--------|--------|---------|
|
|
229
|
+
| `workiq` | `@microsoft/workiq` | Microsoft 365 data - emails, meetings, documents, Teams messages |
|
|
230
|
+
| `github` | Remote: `https://api.githubcopilot.com/mcp/` | Repository search, code, issues, PRs, Actions workflows |
|
|
231
|
+
| `microsoftdocs` | Remote: `https://learn.microsoft.com/api/mcp` | Official Microsoft Learn documentation - Azure and other Microsoft products and services |
|
|
232
|
+
| `context7` | `@upstash/context7-mcp` | Up-to-date library/framework documentation |
|
|
233
|
+
| `playwright` | `@playwright/mcp@latest` | Browser automation for web research and PoC testing |
|
|
234
|
+
|
|
235
|
+
> **Note**: Web search is built into GitHub Copilot - no additional MCP server needed for researching companies, industry trends, or public information.
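For orientation, the servers in the table could be expressed in a shape like the one below. This is a sketch under assumptions: the `servers`, `type`, `url`, `command`, and `args` fields are modeled on the VS Code `mcp.json` layout and should be checked against the editor documentation. The package's real `.vscode/mcp.json` is not shown here, and `workiq` is left out because its launch arguments are not given in this document.

```typescript
// Assumed shape, modeled on the VS Code `mcp.json` layout; verify the field
// names against the editor documentation before relying on them.
type McpServer =
  | { type: 'http'; url: string }
  | { type: 'stdio'; command: string; args: string[] };

const mcpConfig: { servers: Record<string, McpServer> } = {
  servers: {
    github: { type: 'http', url: 'https://api.githubcopilot.com/mcp/' },
    microsoftdocs: { type: 'http', url: 'https://learn.microsoft.com/api/mcp' },
    context7: { type: 'stdio', command: 'npx', args: ['-y', '@upstash/context7-mcp'] },
    playwright: { type: 'stdio', command: 'npx', args: ['-y', '@playwright/mcp@latest'] },
  },
};

// Serialize to the JSON that would be written to the config file.
console.log(JSON.stringify(mcpConfig, null, 2));
```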
|
|
236
|
+
|
|
237
|
+
### WorkIQ Setup
|
|
238
|
+
|
|
239
|
+
WorkIQ requires Microsoft 365 tenant access and admin consent. On first use:
|
|
240
|
+
1. Run `npx -y @microsoft/workiq accept-eula` to accept the EULA
|
|
241
|
+
2. Sign in when prompted - admin consent may be required
|
|
242
|
+
3. See [WorkIQ Admin Instructions](https://github.com/microsoft/work-iq-mcp/blob/main/ADMIN-INSTRUCTIONS.md) for tenant setup
|
|
243
|
+
|
|
244
|
+
## Terminal Command Safety
|
|
245
|
+
|
|
246
|
+
The CLI solution may hang or get stuck during development (e.g., interactive prompts waiting for input, infinite loops, watch modes). To avoid blocking the agent:
|
|
247
|
+
|
|
248
|
+
- **Always use a timeout** when running `npm test`, `npm run build`, `npm run dev`, or any command that could hang. Use the `timeout` parameter on terminal calls (e.g., 30000ms for tests, 60000ms for builds).
|
|
249
|
+
- **Never run `npm run dev` without a timeout** — it starts a watch process that never exits on its own.
|
|
250
|
+
- **Prefer targeted test runs** (`npm test -- <specific-file>`) over full suite runs when iterating on a single module.
|
|
251
|
+
- **If a command hangs**, kill it and investigate the root cause rather than waiting indefinitely.
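The same discipline can be applied inside code that awaits long-running work. The helper below is a hypothetical sketch, not part of this package: it stops waiting after a deadline but does not terminate the underlying work, so a real command runner would also need to kill the child process.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Timeout budgets taken from the guidance above; the keys are illustrative.
const TIMEOUTS_MS = { test: 30000, build: 60000 };

// A stand-in for a command that never finishes, such as a watch process.
const neverSettles = new Promise<string>(() => {});

withTimeout(neverSettles, 50, 'npm run dev').catch((error: Error) => {
  console.log(error.message); // npm run dev timed out after 50ms
});
```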
|
|
252
|
+
|
|
253
|
+
## Security Considerations
|
|
254
|
+
|
|
255
|
+
- Never log or expose user tokens
|
|
256
|
+
- Use the SDK's built-in request verification rather than custom implementations
|
|
257
|
+
- Follow [AIDA's RAI principles](https://github.com/microsoft/ai-discovery-agent/blob/main/docs/RESPONSIBLE_AI_PRINCIPLES.md) for AI transparency
|
|
@@ -0,0 +1,38 @@
|
|
|
1
|
+
name: CI

on:
  push:
  pull_request:
    branches: [main]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [22.x]

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Type check
        run: npm run typecheck

      - name: Run tests
        env:
          COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_KEY }}
          GITHUB_TOKEN: ${{ secrets.COPILOT_KEY }}
        run: npm test
|
package/.prettierrc
ADDED
|
@@ -0,0 +1,181 @@
|
|
|
1
|
+
<!--
|
|
2
|
+
Sync Impact Report
|
|
3
|
+
|
|
4
|
+
- Version change: 1.1.1 → 1.1.2
|
|
5
|
+
- Modified principles: none
|
|
6
|
+
- Clarifications: added explicit mapping between 12-step process and Discover/Ideate/Design/Select/Plan/Develop wording
|
|
7
|
+
- Added sections: Sync Impact Report (this comment)
|
|
8
|
+
- Removed sections: none
|
|
9
|
+
- Templates requiring updates:
|
|
10
|
+
- ✅ .specify/templates/plan-template.md (Constitution Check gates clarified)
|
|
11
|
+
- ✅ .specify/templates/tasks-template.md (tests/TDD requirements aligned)
|
|
12
|
+
- ✅ .github/copilot-instructions.md (repo structure + TDD guidance aligned)
|
|
13
|
+
- ⚠ .specify/templates/commands/*.md (folder not present in this repo)
|
|
14
|
+
- Deferred TODOs: none
|
|
15
|
+
-->
|
|
16
|
+
|
|
17
|
+
# sofIA Copilot CLI Constitution
|
|
18
|
+
|
|
19
|
+
This constitution governs the design, development, and operation of the sofIA Copilot CLI solution. The system helps organizations **analyze, ideate, design, generate, and select** high‑quality AI‑enabled project ideas using the AI Discovery Cards methodology, implemented with the GitHub Copilot SDK for Node.js.
|
|
20
|
+
|
|
21
|
+
## Core Principles
|
|
22
|
+
|
|
23
|
+
### I. Outcome‑First AI Discovery
|
|
24
|
+
|
|
25
|
+
- The primary goal is to help users discover, refine, and prioritize valuable, feasible AI use cases – **not** to generate code for its own sake.
|
|
26
|
+
- The agent always keeps the AI Discovery Cards workshop phases in view: Discover → Ideate → Design → Select → Plan → Develop.
|
|
27
|
+
- All outputs (text, plans, code suggestions) must explicitly tie back to business goals, users, processes, and measurable impact.
|
|
28
|
+
|
|
29
|
+
### II. Secure‑by‑Default & Privacy‑Respecting
|
|
30
|
+
|
|
31
|
+
- Follow **least privilege**: only request and use the minimum data, scopes, repos, and MCP capabilities required for the current task.
|
|
32
|
+
- Never log, echo, or persist secrets, access tokens, PII, or customer‑sensitive data.
|
|
33
|
+
- When using external MCP services or web fetch, prefer **anonymized, aggregate, or redacted** context; avoid copying proprietary content into prompts unless the user explicitly provides it.
|
|
34
|
+
- Default to local execution where possible; remote calls must be transparent and justifiable.
|
|
35
|
+
|
|
36
|
+
### III. Node.js + TypeScript, SDK‑Aligned
|
|
37
|
+
|
|
38
|
+
- The solution is implemented in **Node.js (LTS) with TypeScript**, using the GitHub Copilot SDK for Node.js as the primary integration surface.
|
|
39
|
+
- All core behavior (request parsing, streaming events, MCP calls, orchestration) must follow SDK best practices and patterns established in this repository.
|
|
40
|
+
- Public contracts (CLI flags, JSON schemas, extension event formats) are treated as **APIs** and evolved carefully.
|
|
41
|
+
|
|
42
|
+
### IV. MCP‑First Context & Tools
|
|
43
|
+
|
|
44
|
+
- Prefer **MCP servers** over ad‑hoc HTTP calls whenever suitable tools exist (Context7, Playwright, WorkIQ, Microsoft Docs, GitHub MCP, filesystem, fetch, etc.).
|
|
45
|
+
- Each MCP call must have a **clear purpose** that supports the current workshop phase (e.g., research technology options, inspect existing codebases, understand documentation, analyze processes).
|
|
46
|
+
- Tool use is **explainable**: the agent should be able to state what tool it used, why, and how the result influenced its recommendation.
|
|
47
|
+
|
|
48
|
+
### V. Test‑First, Regressions‑Last (Red → Green → Lint → Review)
|
|
49
|
+
|
|
50
|
+
- New behavior must be covered by **automated tests** (unit where possible, integration/e2e where necessary).
|
|
51
|
+
- **No implementation starts before tests**: for every task phase, the first committed change MUST be failing tests that describe the target behavior.
|
|
52
|
+
- The project follows a **phase‑level TDD cycle** for every feature implemented:
  1. **Red**: Write all tests for the current task phase **before** writing any implementation code. All new tests MUST fail initially, confirming they test real behavior that does not yet exist.
  2. **Green**: Implement the minimum code needed to make all failing tests pass. Do not move to the next phase until every test is green.
  3. **Lint**: After green, run `npm run lint && npm run typecheck`. **Both must pass before any commit.** Lint and typecheck failures are treated the same as failing tests — stop, fix, re‑verify. This step is mandatory in every iteration, not just at the end of a task.
  4. **Review**: After lint passes, perform a **mandatory self‑review** using the Test Review Checklist (see Development Workflow below) to identify gaps. Add new tests if gaps are found and repeat the red → green → lint cycle for those additions.
|
|
57
|
+
- **Loop invariant for every code change**: run `npm run lint && npm run typecheck` immediately after editing any file. Never commit code that fails either check.
|
|
58
|
+
- Critical flows – idea evaluation, scoring, selection, and project planning – require **stable, deterministic tests** that avoid flaky external dependencies.
|
|
59
|
+
- Interactive CLI flows (menus, phase gates, follow-up prompts, retries, resume, and Ctrl+C paths) MUST be testable from day one through deterministic automation, not only manual checks.
|
|
60
|
+
- Test strategy: pure logic → unit tests; orchestration and tool integration → integration tests with fakes/mocks and a minimal set of live MCP smoke tests.
|
|
61
|
+
- Generated PoC repositories MUST include **basic smoke/happy‑path tests** that validate the code runs successfully. Full TDD is not required for PoC code, but generated tests serve as a quality signal for the Ralph loop.
|
|
62
|
+
- Each **Ralph loop iteration** MUST be test‑driven: every refinement cycle starts with a failing test or captured error that proves a defect exists, refines code until the failure is resolved, runs lint + typecheck, and then checks whether new issues were introduced.
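The test-driven refinement cycle described for the Ralph loop can be sketched as a bounded loop. Everything here is a stand-in: `runChecks` and `refine` are injected callbacks, the iteration cap is an assumption added so the sketch always terminates, and none of this is the actual Ralph loop implementation.

```typescript
// Hypothetical sketch of a test-driven refinement loop. `runChecks` and `refine`
// are injected stand-ins for running tests/lint/typecheck and editing code.
interface CheckReport {
  failures: string[];
}

interface LoopResult {
  iterations: number;
  passed: boolean;
}

function refineUntilGreen(
  runChecks: () => CheckReport,
  refine: (failures: string[]) => void,
  maxIterations: number,
): LoopResult {
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // Each cycle starts from captured failures that prove a defect exists.
    const report = runChecks();
    if (report.failures.length === 0) {
      return { iterations: iteration, passed: true };
    }
    refine(report.failures);
  }
  // Final check after the last refinement, so the result reflects current state.
  return { iterations: maxIterations, passed: runChecks().failures.length === 0 };
}

// Simulated PoC with two defects; each refinement resolves one of them.
const defects = ['smoke test fails', 'unused import'];
const loopResult = refineUntilGreen(
  () => ({ failures: defects.slice() }),
  () => {
    defects.shift();
  },
  5,
);

console.log(loopResult); // { iterations: 3, passed: true }
```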
|
|
63
|
+
|
|
64
|
+
### VI. Interactive CLI Testability by Design
|
|
65
|
+
|
|
66
|
+
- Every interactive feature MUST define a machine-testable contract up front: expected prompts, user choices, streamed activity signals, decision gates, and terminal end states.
|
|
67
|
+
- The project MUST maintain an automated interactive harness capable of validating full workshop behavior (including phase transitions and governed progression) in a pseudo-terminal environment.
|
|
68
|
+
- For LLM-involved interactive validation, tests MUST use a layered approach:
  1. deterministic assertions on structure and control flow (menus, summaries, decisions, transitions),
  2. optional semantic validation (e.g., LLM-as-judge) for quality-sensitive phases.
|
|
71
|
+
- When interaction complexity requires it, the harness MAY use Copilot SDK-generated answer banks or tool-assisted inputs, but runs MUST remain reproducible via saved transcripts/reports and explicit pass/fail checks.
|
|
72
|
+
- A feature is not complete unless at least one automated end-to-end interactive scenario validates the happy path and one validates a failure/recovery path.
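The layered approach above might be sketched as follows. The `Transcript` shape and `validateRun` are hypothetical, and running the semantic judge only after the deterministic layer passes is a design choice of this sketch rather than something the constitution prescribes.

```typescript
// Hypothetical transcript shape and checks, for illustration only.
interface Transcript {
  phasesVisited: string[];
  finalState: 'completed' | 'aborted';
  summary: string;
}

type SemanticJudge = (summary: string) => boolean;

function validateRun(
  transcript: Transcript,
  expectedPhases: string[],
  judge?: SemanticJudge,
): string[] {
  const problems: string[] = [];

  // Layer 1: deterministic assertions on structure and control flow.
  if (transcript.phasesVisited.join('>') !== expectedPhases.join('>')) {
    problems.push('phase transitions differ from the expected order');
  }
  if (transcript.finalState !== 'completed') {
    problems.push('run did not reach the completed end state');
  }

  // Layer 2: optional semantic validation, only when layer 1 is clean.
  if (problems.length === 0 && judge !== undefined && !judge(transcript.summary)) {
    problems.push('semantic judge rejected the summary');
  }
  return problems;
}

const runProblems = validateRun(
  { phasesVisited: ['Discover', 'Ideate'], finalState: 'completed', summary: 'Two ideas ranked.' },
  ['Discover', 'Ideate'],
  (summary) => summary.length > 0, // trivial stand-in for an LLM-as-judge call
);

console.log(runProblems); // []
```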
|
|
73
|
+
|
|
74
|
+
### VII. Deterministic, Auditable Agent Behavior
|
|
75
|
+
|
|
76
|
+
- Prompts, system instructions, and agent flows must be **versioned and reviewable** (stored under `src/prompts` or equivalent, not embedded ad‑hoc everywhere).
|
|
77
|
+
- For the same inputs and configuration, the system should aim for **predictable, reproducible** outputs (within the limits of LLM variability), achieved via structured prompts, stable scoring rubrics, and constrained response schemas.
|
|
78
|
+
- Significant decisions (e.g., why an AI idea was ranked #1 vs #2) should be accompanied by **structured rationale** suitable for audit and stakeholder review.
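A stable scoring rubric with structured rationale could look like the sketch below. The criteria, the integer weights, and the idea names are illustrative assumptions; the point is that the same inputs always produce the same ranking and that every score can be traced back to its inputs.

```typescript
// Hypothetical rubric: criteria names and weights are illustrative assumptions.
interface Idea {
  name: string;
  scores: { impact: number; feasibility: number; risk: number };
}

// Integer weights avoid floating-point surprises when comparing totals.
const WEIGHTS = { impact: 5, feasibility: 3, risk: -2 };

interface RankedIdea {
  name: string;
  total: number;
  rationale: string;
}

function rankIdeas(ideas: Idea[]): RankedIdea[] {
  return ideas
    .map((idea) => {
      const { impact, feasibility, risk } = idea.scores;
      const total =
        impact * WEIGHTS.impact + feasibility * WEIGHTS.feasibility + risk * WEIGHTS.risk;
      return {
        name: idea.name,
        total,
        // Structured rationale: every input to the score is stated explicitly.
        rationale: `impact=${impact}, feasibility=${feasibility}, risk=${risk}`,
      };
    })
    // Ties break on name so the same inputs always give the same order.
    .sort((a, b) => b.total - a.total || a.name.localeCompare(b.name));
}

const ranked = rankIdeas([
  { name: 'FAQ bot', scores: { impact: 3, feasibility: 3, risk: 1 } },
  { name: 'Invoice triage', scores: { impact: 4, feasibility: 3, risk: 2 } },
  { name: 'Churn alerts', scores: { impact: 5, feasibility: 2, risk: 3 } },
]);

console.log(ranked.map((idea) => idea.name).join(' > ')); // Churn alerts > Invoice triage > FAQ bot
```

Here the first two ideas both total 25, so the name-based tie-break decides their order deterministically.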
|
|
79
|
+
|
|
80
|
+
### VIII. CLI‑First UX & Transparency
|
|
81
|
+
|
|
82
|
+
- The CLI interface is a **first‑class product**: clear commands, help text, and progress reporting are mandatory.
|
|
83
|
+
- All long‑running operations (multi‑phase workshops, MCP orchestrations) must **stream progress**, not leave users idle.
|
|
84
|
+
- Users MUST always see the current execution state (current phase, waiting for input, running tool/action, retry/recovery state) during interactive and long-running operations.
|
|
85
|
+
- On failures, user-facing output MUST include: what failed, why it likely failed, what was already completed, and the next actionable recovery options.
|
|
86
|
+
- The agent must be honest about limitations, uncertainty, and trade‑offs, avoiding over‑confident claims.
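The four required parts of a failure message can be captured in a small formatter. `FailureReport` and `formatFailure` are hypothetical names, and the example values are invented for illustration.

```typescript
// Hypothetical report shape covering the four required parts of a failure message.
interface FailureReport {
  whatFailed: string;
  likelyCause: string;
  completed: string[];
  recoveryOptions: string[];
}

function formatFailure(report: FailureReport): string {
  const completed = report.completed.length > 0 ? report.completed.join(', ') : 'nothing yet';
  const lines = [
    `Failed: ${report.whatFailed}`,
    `Likely cause: ${report.likelyCause}`,
    `Already completed: ${completed}`,
    'Next steps:',
    // Number the options so users can refer to them unambiguously.
    ...report.recoveryOptions.map((option, index) => `  ${index + 1}. ${option}`),
  ];
  return lines.join('\n');
}

const failureMessage = formatFailure({
  whatFailed: 'Select phase could not score ideas',
  likelyCause: 'the scoring tool call timed out',
  completed: ['Discover', 'Ideate', 'Design'],
  recoveryOptions: ['Retry the Select phase', 'Resume later from the last checkpoint'],
});

console.log(failureMessage);
```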
|
|
87
|
+
|
|
88
|
+
## Architecture & Scope
|
|
89
|
+
|
|
90
|
+
- The system implements the AI Discovery Cards process as a **multi‑phase agentic pipeline**:
  - Phase 1: the AI Discovery Cards 12-step process.
  - Phase 2: idea selection.
  - Phase 3: planning and development. Outline milestones, dependencies, and PoC scope, then generate PoC‑level code examples and scaffolding.
|
|
94
|
+
|
|
95
|
+
Mapping: the 12-step workshop covers **Discover/Ideate/Design**; Phase 2 maps to **Select**; Phase 3 maps to **Plan/Develop**.
|
|
96
|
+
|
|
97
|
+
- Each step is implemented as a **composable agent/module** with:
  - A narrow responsibility and input/output contract.
  - A clear hand‑off format to the next phase.
  - Optional checkpointing/check‑in with the user (especially at selection & planning).
|
|
101
|
+
- The Copilot CLI acts as an **orchestrator**, not a monolith: orchestration code wires agents, prompts, and MCP tools together.
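The orchestrator-versus-monolith idea can be sketched as a list of stages that share a hand-off format. The stage names and the phases each one covers follow the mapping stated above; the `HandOff` and `Stage` contracts, `runPipeline`, and the stub logic inside each stage are hypothetical.

```typescript
// Hypothetical contracts; only the phase mapping comes from the text above.
type WorkshopPhase = 'Discover' | 'Ideate' | 'Design' | 'Select' | 'Plan' | 'Develop';

interface HandOff {
  ideas: string[];
  selected?: string;
  plan?: string[];
}

interface Stage {
  name: string;
  covers: WorkshopPhase[];
  run: (input: HandOff) => HandOff;
}

// Each stage has a narrow responsibility; the bodies here are stubs.
const stages: Stage[] = [
  {
    name: '12-step workshop',
    covers: ['Discover', 'Ideate', 'Design'],
    run: (input) => ({ ...input, ideas: ['Invoice triage', 'FAQ bot'] }),
  },
  {
    name: 'Idea selection',
    covers: ['Select'],
    run: (input) => ({ ...input, selected: input.ideas[0] }),
  },
  {
    name: 'Planning and development',
    covers: ['Plan', 'Develop'],
    run: (input) => ({ ...input, plan: [`Scope a PoC for ${input.selected}`] }),
  },
];

function runPipeline(stageList: Stage[], initial: HandOff): HandOff {
  // The orchestrator only wires stages together; each stage owns its logic.
  return stageList.reduce((state, stage) => stage.run(state), initial);
}

const finalState = runPipeline(stages, { ideas: [] });

console.log(finalState.plan); // [ 'Scope a PoC for Invoice triage' ]
```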
|
|
102
|
+
|
|
103
|
+
## Security & Compliance
|
|
104
|
+
|
|
105
|
+
- Always validate Copilot SDK session payloads before processing.
|
|
106
|
+
- Enforce strict **input validation** on CLI arguments, configuration files, and environment variables; fail fast on invalid or unsafe values.
|
|
107
|
+
- Use secure defaults:
  - HTTPS‑only when calling remote services.
  - TLS verification enabled; no blanket `NODE_TLS_REJECT_UNAUTHORIZED=0`.
  - Timeouts and retries configured to avoid hanging processes.
|
|
111
|
+
- Access to GitHub, Azure, WorkIQ, or other enterprise systems must respect **organization policies** and least‑privilege scopes.
|
|
112
|
+
- Sensitive outputs (like architecture diagrams or PoC code that touches regulated data) should include **disclaimers and risk notes** when appropriate.
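The fail-fast and HTTPS-only rules above can be combined in a small validator for remote endpoints. `assertHttpsUrl` is a hypothetical helper, not part of this package; it relies only on the standard `URL` class.

```typescript
// Hypothetical validator: fail fast on remote endpoints that are not HTTPS.
function assertHttpsUrl(raw: string): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    // Malformed input is rejected before any network call is attempted.
    throw new Error(`Invalid URL: ${raw}`);
  }
  if (url.protocol !== 'https:') {
    throw new Error(`Refusing non-HTTPS endpoint: ${raw}`);
  }
  return url;
}

console.log(assertHttpsUrl('https://learn.microsoft.com/api/mcp').hostname); // learn.microsoft.com
```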
|
|
113
|
+
|
|
114
|
+
## MCP Services Usage
|
|
115
|
+
|
|
116
|
+
- **Context7**
|
|
117
|
+
- Use to fetch **up‑to‑date documentation and best practices** for libraries, frameworks, and platforms relevant to a proposed AI idea.
|
|
118
|
+
- Use when evaluating technical feasibility, comparing implementation options, or generating PoC scaffolding.
|
|
119
|
+
- Prefer official or high‑trust sources; clearly separate factual documentation from generated interpretation.
|
|
120
|
+
|
|
121
|
+
- **Playwright MCP**
|
|
122
|
+
- Use for **browser automation and validation** when ideas involve web UX, customer journeys, or site workflows.
|
|
123
|
+
- Suitable tasks: walking through existing user flows, capturing page structure, or validating that an AI augmentation can integrate into a target UI.
|
|
124
|
+
- Avoid using it to capture or persist sensitive user data; respect robots.txt and customer security guidelines.
|
|
125
|
+
|
|
126
|
+
- **WorkIQ / M365 MCP** (when enabled)
|
|
127
|
+
- Use for **process discovery** and empirical analysis of how work is currently performed (emails, meetings, documents, Teams, etc.).
|
|
128
|
+
- Only access tenants and scopes explicitly authorized; never assume cross‑tenant access.
|
|
129
|
+
- Summaries and suggestions must preserve confidentiality and avoid exposing individual‑level behavioral analytics unless policy allows.
|
|
130
|
+
|
|
131
|
+
- **GitHub MCP**
|
|
132
|
+
- Use to analyze existing repos and workflows when ideas involve **developer productivity, DevOps, or code quality**.
|
|
133
|
+
- Prefer light‑touch analysis (metadata, structure, high‑level patterns) over raw code dumps unless the user explicitly requests deeper review.
|
|
134
|
+
|
|
135
|
+
- **Microsoft Docs / Azure MCP**
|
|
136
|
+
- Use for authoritative **cloud architecture, security, and compliance** guidance when proposing Azure‑based or Microsoft‑based solutions.
|
|
137
|
+
- When generating Azure/AI solution ideas, ground recommendations in official docs where feasible.
|
|
138
|
+
|
|
139
|
+
## Development Workflow & Quality Gates
|
|
140
|
+
|
|
141
|
+
- **Branching & Reviews**
|
|
142
|
+
- All substantial changes (logic, prompts, workflows) go through PRs and human review.
|
|
143
|
+
- PR descriptions must state which workshop phases are affected and which tests were run.
|
|
144
|
+
|
|
145
|
+
- **Testing Requirements (Red → Green → Lint → Review)**
  - A change is not done until there are passing tests covering the new behavior.
  - The first implementation commit for a task phase MUST include failing tests before production-code changes.
  - Core scoring and selection logic must have **high‑signal unit tests** (no reliance on live LLMs or MCP tools for correctness).
  - End‑to‑end tests may stub LLMs/MCPs while validating orchestration and CLI UX.
  - The GitHub Copilot SDK can help generate non-stub integration tests that exercise a live LLM; keep in mind that their results are non-deterministic.
  - Interactive CLI changes MUST include automated terminal-flow tests that verify prompts, user decisions, transitions, and persistence/resume behavior.
  - LLM-dependent behavior MUST expose deterministic checks first (schema/control flow/required signals), with optional semantic checks layered on top.
  - Every task phase follows the **phase‑level TDD cycle**: tests written first → all must fail → implement until green → lint + typecheck → self‑review.
  - After reaching green, the implementer MUST run through the **Test Review Checklist**:
    - [ ] Are all edge cases covered (empty inputs, nulls, boundary values)?
    - [ ] Are negative/error paths tested (invalid data, missing dependencies, permission failures)?
    - [ ] Are boundary conditions verified (max/min values, empty collections, large payloads)?
    - [ ] Are new integration points exercised (new MCP calls, Copilot SDK interactions)?
    - [ ] Do existing tests still pass without modification (no silent regressions)?

    If any gaps are found, add tests and repeat the red → green → lint cycle before proceeding.
|
|
161
|
+
|
|
162
|
+
- **Observability & Diagnostics**
|
|
163
|
+
- Use structured, leveled logging with a clear separation between **debug**, **info**, **warn**, and **error**.
|
|
164
|
+
- Logging MUST be extensive enough to reconstruct interactive failures end-to-end: include session ID, phase, turn number, tool/action, timing, and transition decisions.
|
|
165
|
+
- Logs must never contain secrets or sensitive data; link to resource identifiers or hashes instead.
|
|
166
|
+
- For CLI users, provide concise error messages plus an optional `--verbose` or `--debug` mode.
|
|
167
|
+
- Interactive UX MUST surface real-time operational events to users (progress, tool activity, state changes) and provide explicit failure reasons with recovery guidance.
|
|
168
|
+
- Automated interactive runs MUST persist artifacts (for example: transcript + structured report) so regressions are diagnosable and reproducible.
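A structured log line carrying the fields listed above, with sensitive values redacted, might be built as in this sketch. The `LogEntry` shape and the key pattern are assumptions; the pattern in particular is not a complete redaction policy.

```typescript
// Hypothetical log entry shape, based on the fields listed above.
interface LogEntry {
  level: 'debug' | 'info' | 'warn' | 'error';
  sessionId: string;
  phase: string;
  turn: number;
  action: string;
  durationMs: number;
  details: Record<string, string>;
}

// Keys treated as sensitive; this list is an assumption, not a complete policy.
const SENSITIVE_KEY = /token|secret|password|authorization/i;

function toLogLine(entry: LogEntry): string {
  const safeDetails: Record<string, string> = {};
  for (const [key, value] of Object.entries(entry.details)) {
    // Replace sensitive values instead of logging them.
    safeDetails[key] = SENSITIVE_KEY.test(key) ? '[REDACTED]' : value;
  }
  return JSON.stringify({ ...entry, details: safeDetails });
}

const logLine = toLogLine({
  level: 'info',
  sessionId: 's-1',
  phase: 'Select',
  turn: 3,
  action: 'tool:search',
  durationMs: 120,
  details: { query: 'pricing', authToken: 'abc123' },
});

console.log(logLine);
```

Redacting by key name is only a first line of defense; secrets embedded in free-text values would need separate handling.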
|
|
169
|
+
|
|
170
|
+
## Governance
|
|
171
|
+
|
|
172
|
+
- This constitution **supersedes ad‑hoc practices** for the sofIA Copilot CLI and related agents.
|
|
173
|
+
- Any feature, design, or prompt that conflicts with this document must be revised or justified via a documented exception.
|
|
174
|
+
- Amendments require:
  - A proposal documenting the motivation, risks, and migration/rollout plan.
  - Review and approval via the project’s standard PR process.
  - A version bump and date update in this file.
|
|
178
|
+
- All PR reviews should include an explicit, light‑weight check against this constitution: security, testing, MCP usage, and AI Discovery alignment.
|
|
179
|
+
- Runtime guidance (coding style, prompts, agent composition) should be kept in the project’s developer docs and referenced from here as needed.
|
|
180
|
+
|
|
181
|
+
**Version**: 1.1.2 | **Ratified**: 2026-02-24 | **Last Amended**: 2026-02-26
|