sofia-cli 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (435)
  1. package/.github/agents/copilot-instructions.md +39 -0
  2. package/.github/agents/speckit.analyze.agent.md +184 -0
  3. package/.github/agents/speckit.checklist.agent.md +294 -0
  4. package/.github/agents/speckit.clarify.agent.md +181 -0
  5. package/.github/agents/speckit.constitution.agent.md +84 -0
  6. package/.github/agents/speckit.implement.agent.md +135 -0
  7. package/.github/agents/speckit.plan.agent.md +90 -0
  8. package/.github/agents/speckit.specify.agent.md +258 -0
  9. package/.github/agents/speckit.tasks.agent.md +137 -0
  10. package/.github/agents/speckit.taskstoissues.agent.md +30 -0
  11. package/.github/copilot-instructions.md +257 -0
  12. package/.github/prompts/speckit.analyze.prompt.md +3 -0
  13. package/.github/prompts/speckit.checklist.prompt.md +3 -0
  14. package/.github/prompts/speckit.clarify.prompt.md +3 -0
  15. package/.github/prompts/speckit.constitution.prompt.md +3 -0
  16. package/.github/prompts/speckit.implement.prompt.md +3 -0
  17. package/.github/prompts/speckit.plan.prompt.md +3 -0
  18. package/.github/prompts/speckit.specify.prompt.md +3 -0
  19. package/.github/prompts/speckit.tasks.prompt.md +3 -0
  20. package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
  21. package/.github/workflows/ci.yml +38 -0
  22. package/.prettierrc +6 -0
  23. package/.specify/memory/constitution.md +181 -0
  24. package/.specify/scripts/bash/check-prerequisites.sh +166 -0
  25. package/.specify/scripts/bash/common.sh +156 -0
  26. package/.specify/scripts/bash/create-new-feature.sh +297 -0
  27. package/.specify/scripts/bash/setup-plan.sh +61 -0
  28. package/.specify/scripts/bash/update-agent-context.sh +810 -0
  29. package/.specify/templates/agent-file-template.md +28 -0
  30. package/.specify/templates/checklist-template.md +40 -0
  31. package/.specify/templates/constitution-template.md +50 -0
  32. package/.specify/templates/plan-template.md +113 -0
  33. package/.specify/templates/spec-template.md +115 -0
  34. package/.specify/templates/tasks-template.md +251 -0
  35. package/.vscode/mcp.json +42 -0
  36. package/.vscode/settings.json +19 -0
  37. package/CODE_OF_CONDUCT.md +128 -0
  38. package/LICENSE +21 -0
  39. package/README.md +213 -0
  40. package/dist/src/cli/developCommand.js +240 -0
  41. package/dist/src/cli/directCommands.js +143 -0
  42. package/dist/src/cli/envLoader.js +16 -0
  43. package/dist/src/cli/exportCommand.js +53 -0
  44. package/dist/src/cli/index.js +203 -0
  45. package/dist/src/cli/ioContext.js +109 -0
  46. package/dist/src/cli/preflight.js +57 -0
  47. package/dist/src/cli/statusCommand.js +110 -0
  48. package/dist/src/cli/workshopCommand.js +400 -0
  49. package/dist/src/develop/checkpointState.js +86 -0
  50. package/dist/src/develop/codeGenerator.js +319 -0
  51. package/dist/src/develop/dynamicScaffolder.js +226 -0
  52. package/dist/src/develop/githubMcpAdapter.js +122 -0
  53. package/dist/src/develop/index.js +15 -0
  54. package/dist/src/develop/mcpContextEnricher.js +195 -0
  55. package/dist/src/develop/pocScaffolder.js +542 -0
  56. package/dist/src/develop/ralphLoop.js +659 -0
  57. package/dist/src/develop/templateRegistry.js +364 -0
  58. package/dist/src/develop/testRunner.js +202 -0
  59. package/dist/src/logging/logger.js +58 -0
  60. package/dist/src/loop/conversationLoop.js +227 -0
  61. package/dist/src/loop/phaseSummarizer.js +87 -0
  62. package/dist/src/mcp/mcpManager.js +267 -0
  63. package/dist/src/mcp/mcpTransport.js +391 -0
  64. package/dist/src/mcp/retryPolicy.js +47 -0
  65. package/dist/src/mcp/webSearch.js +254 -0
  66. package/dist/src/phases/contextSummarizer.js +101 -0
  67. package/dist/src/phases/discoveryEnricher.js +156 -0
  68. package/dist/src/phases/phaseExtractors.js +222 -0
  69. package/dist/src/phases/phaseHandlers.js +328 -0
  70. package/dist/src/prompts/design.md +51 -0
  71. package/dist/src/prompts/develop-boundary.md +51 -0
  72. package/dist/src/prompts/develop.md +111 -0
  73. package/dist/src/prompts/discover.md +58 -0
  74. package/dist/src/prompts/ideate.md +56 -0
  75. package/dist/src/prompts/plan.md +51 -0
  76. package/dist/src/prompts/promptLoader.js +167 -0
  77. package/dist/src/prompts/promptLoader.ts +198 -0
  78. package/dist/src/prompts/select.md +47 -0
  79. package/dist/src/prompts/summarize/README.md +8 -0
  80. package/dist/src/prompts/summarize/design-summary.md +37 -0
  81. package/dist/src/prompts/summarize/develop-summary.md +25 -0
  82. package/dist/src/prompts/summarize/ideate-summary.md +27 -0
  83. package/dist/src/prompts/summarize/plan-summary.md +27 -0
  84. package/dist/src/prompts/summarize/select-summary.md +21 -0
  85. package/dist/src/prompts/system.md +28 -0
  86. package/dist/src/sessions/exportPaths.js +22 -0
  87. package/dist/src/sessions/exportWriter.js +406 -0
  88. package/dist/src/sessions/sessionManager.js +81 -0
  89. package/dist/src/sessions/sessionStore.js +65 -0
  90. package/dist/src/shared/activitySpinner.js +91 -0
  91. package/dist/src/shared/copilotClient.js +129 -0
  92. package/dist/src/shared/data/cards.json +1249 -0
  93. package/dist/src/shared/data/cardsLoader.js +51 -0
  94. package/dist/src/shared/errorClassifier.js +120 -0
  95. package/dist/src/shared/events.js +28 -0
  96. package/dist/src/shared/markdownRenderer.js +34 -0
  97. package/dist/src/shared/schemas/session.js +265 -0
  98. package/dist/src/shared/tableRenderer.js +20 -0
  99. package/dist/src/vendor/chalk.js +2 -0
  100. package/dist/src/vendor/cli-table3.js +3 -0
  101. package/dist/src/vendor/commander.js +2 -0
  102. package/dist/src/vendor/marked-terminal.js +3 -0
  103. package/dist/src/vendor/marked.js +2 -0
  104. package/dist/src/vendor/ora.js +2 -0
  105. package/dist/src/vendor/pino.js +2 -0
  106. package/dist/src/vendor/zod.js +2 -0
  107. package/dist/tests/e2e/developE2e.spec.js +126 -0
  108. package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
  109. package/dist/tests/e2e/developPty.spec.js +75 -0
  110. package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
  111. package/dist/tests/e2e/harness.spec.js +83 -0
  112. package/dist/tests/e2e/mcpLive.spec.js +120 -0
  113. package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
  114. package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
  115. package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
  116. package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
  117. package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
  118. package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
  119. package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
  120. package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
  121. package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
  122. package/dist/tests/integration/autoStartConversation.spec.js +138 -0
  123. package/dist/tests/integration/defaultCommand.spec.js +147 -0
  124. package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
  125. package/dist/tests/integration/directCommandTty.spec.js +151 -0
  126. package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
  127. package/dist/tests/integration/exportArtifacts.spec.js +202 -0
  128. package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
  129. package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
  130. package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
  131. package/dist/tests/integration/newSessionFlow.spec.js +343 -0
  132. package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
  133. package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
  134. package/dist/tests/integration/pocScaffold.spec.js +163 -0
  135. package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
  136. package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
  137. package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
  138. package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
  139. package/dist/tests/integration/summarizationFlow.spec.js +115 -0
  140. package/dist/tests/integration/testRunnerReal.spec.js +52 -0
  141. package/dist/tests/integration/webSearchAgent.spec.js +128 -0
  142. package/dist/tests/live/copilotSdkLive.spec.js +107 -0
  143. package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
  144. package/dist/tests/setup/loadEnv.js +3 -0
  145. package/dist/tests/unit/cli/developCommand.spec.js +567 -0
  146. package/dist/tests/unit/cli/directCommands.spec.js +279 -0
  147. package/dist/tests/unit/cli/envLoader.spec.js +58 -0
  148. package/dist/tests/unit/cli/ioContext.spec.js +119 -0
  149. package/dist/tests/unit/cli/preflight.spec.js +108 -0
  150. package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
  151. package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
  152. package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
  153. package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
  154. package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
  155. package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
  156. package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
  157. package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
  158. package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
  159. package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
  160. package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
  161. package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
  162. package/dist/tests/unit/develop/testRunner.spec.js +249 -0
  163. package/dist/tests/unit/infraBicep.spec.js +92 -0
  164. package/dist/tests/unit/infraDeploy.spec.js +82 -0
  165. package/dist/tests/unit/infraTeardown.spec.js +63 -0
  166. package/dist/tests/unit/logging/logger.spec.js +43 -0
  167. package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
  168. package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
  169. package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
  170. package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
  171. package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
  172. package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
  173. package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
  174. package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
  175. package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
  176. package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
  177. package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
  178. package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
  179. package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
  180. package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
  181. package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
  182. package/dist/tests/unit/schemas/session.spec.js +257 -0
  183. package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
  184. package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
  185. package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
  186. package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
  187. package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
  188. package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
  189. package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
  190. package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
  191. package/dist/tests/unit/shared/events.spec.js +55 -0
  192. package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
  193. package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
  194. package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
  195. package/dist/vitest.config.js +14 -0
  196. package/dist/vitest.live.config.js +18 -0
  197. package/docs/README.md +35 -0
  198. package/docs/architecture.md +169 -0
  199. package/docs/cli-usage.md +207 -0
  200. package/docs/environment.md +66 -0
  201. package/docs/export-format.md +146 -0
  202. package/docs/session-model.md +113 -0
  203. package/eslint.config.js +35 -0
  204. package/infra/deploy.sh +193 -0
  205. package/infra/gather-env.sh +211 -0
  206. package/infra/main.bicep +90 -0
  207. package/infra/main.bicepparam +18 -0
  208. package/infra/resources.bicep +134 -0
  209. package/infra/teardown.sh +114 -0
  210. package/package.json +63 -0
  211. package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
  212. package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
  213. package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
  214. package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
  215. package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
  216. package/specs/001-cli-workshop-rebuild/plan.md +361 -0
  217. package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
  218. package/specs/001-cli-workshop-rebuild/research.md +116 -0
  219. package/specs/001-cli-workshop-rebuild/spec.md +240 -0
  220. package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
  221. package/specs/002-poc-generation/contracts/poc-output.md +172 -0
  222. package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
  223. package/specs/002-poc-generation/data-model.md +172 -0
  224. package/specs/002-poc-generation/plan.md +109 -0
  225. package/specs/002-poc-generation/quickstart.md +97 -0
  226. package/specs/002-poc-generation/research.md +786 -0
  227. package/specs/002-poc-generation/spec.md +81 -0
  228. package/specs/002-poc-generation/tasks-fix.md +198 -0
  229. package/specs/002-poc-generation/tasks.md +252 -0
  230. package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
  231. package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
  232. package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
  233. package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
  234. package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
  235. package/specs/003-mcp-transport-integration/data-model.md +326 -0
  236. package/specs/003-mcp-transport-integration/plan.md +114 -0
  237. package/specs/003-mcp-transport-integration/quickstart.md +311 -0
  238. package/specs/003-mcp-transport-integration/research.md +395 -0
  239. package/specs/003-mcp-transport-integration/spec.md +234 -0
  240. package/specs/003-mcp-transport-integration/tasks.md +324 -0
  241. package/specs/003-next-spec-gaps.md +150 -0
  242. package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
  243. package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
  244. package/specs/004-dev-resume-hardening/data-model.md +321 -0
  245. package/specs/004-dev-resume-hardening/plan.md +107 -0
  246. package/specs/004-dev-resume-hardening/quickstart.md +115 -0
  247. package/specs/004-dev-resume-hardening/research.md +142 -0
  248. package/specs/004-dev-resume-hardening/spec.md +221 -0
  249. package/specs/004-dev-resume-hardening/tasks.md +333 -0
  250. package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
  251. package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
  252. package/specs/005-ai-search-deploy/data-model.md +130 -0
  253. package/specs/005-ai-search-deploy/plan.md +93 -0
  254. package/specs/005-ai-search-deploy/quickstart.md +96 -0
  255. package/specs/005-ai-search-deploy/research.md +187 -0
  256. package/specs/005-ai-search-deploy/spec.md +143 -0
  257. package/specs/005-ai-search-deploy/tasks.md +284 -0
  258. package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
  259. package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
  260. package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
  261. package/specs/006-workshop-extraction-fixes/plan.md +123 -0
  262. package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
  263. package/specs/006-workshop-extraction-fixes/research.md +143 -0
  264. package/specs/006-workshop-extraction-fixes/spec.md +210 -0
  265. package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
  266. package/src/cli/developCommand.ts +308 -0
  267. package/src/cli/directCommands.ts +195 -0
  268. package/src/cli/envLoader.ts +17 -0
  269. package/src/cli/exportCommand.ts +65 -0
  270. package/src/cli/index.ts +249 -0
  271. package/src/cli/ioContext.ts +139 -0
  272. package/src/cli/preflight.ts +86 -0
  273. package/src/cli/statusCommand.ts +118 -0
  274. package/src/cli/workshopCommand.ts +496 -0
  275. package/src/develop/checkpointState.ts +121 -0
  276. package/src/develop/codeGenerator.ts +402 -0
  277. package/src/develop/dynamicScaffolder.ts +284 -0
  278. package/src/develop/githubMcpAdapter.ts +199 -0
  279. package/src/develop/index.ts +34 -0
  280. package/src/develop/mcpContextEnricher.ts +279 -0
  281. package/src/develop/pocScaffolder.ts +646 -0
  282. package/src/develop/ralphLoop.ts +1044 -0
  283. package/src/develop/templateRegistry.ts +427 -0
  284. package/src/develop/testRunner.ts +276 -0
  285. package/src/logging/logger.ts +73 -0
  286. package/src/loop/conversationLoop.ts +355 -0
  287. package/src/loop/phaseSummarizer.ts +114 -0
  288. package/src/mcp/mcpManager.ts +365 -0
  289. package/src/mcp/mcpTransport.ts +562 -0
  290. package/src/mcp/retryPolicy.ts +87 -0
  291. package/src/mcp/webSearch.ts +388 -0
  292. package/src/originalPrompts/design_thinking.md +178 -0
  293. package/src/originalPrompts/design_thinking_persona.md +76 -0
  294. package/src/originalPrompts/document_generator_example.md +77 -0
  295. package/src/originalPrompts/document_generator_persona.md +47 -0
  296. package/src/originalPrompts/facilitator_persona.md +125 -0
  297. package/src/originalPrompts/guardrails.md +47 -0
  298. package/src/phases/contextSummarizer.ts +154 -0
  299. package/src/phases/discoveryEnricher.ts +223 -0
  300. package/src/phases/phaseExtractors.ts +247 -0
  301. package/src/phases/phaseHandlers.ts +450 -0
  302. package/src/prompts/design.md +51 -0
  303. package/src/prompts/develop-boundary.md +51 -0
  304. package/src/prompts/develop.md +111 -0
  305. package/src/prompts/discover.md +58 -0
  306. package/src/prompts/ideate.md +56 -0
  307. package/src/prompts/plan.md +51 -0
  308. package/src/prompts/promptLoader.ts +198 -0
  309. package/src/prompts/select.md +47 -0
  310. package/src/prompts/summarize/README.md +8 -0
  311. package/src/prompts/summarize/design-summary.md +37 -0
  312. package/src/prompts/summarize/develop-summary.md +25 -0
  313. package/src/prompts/summarize/ideate-summary.md +27 -0
  314. package/src/prompts/summarize/plan-summary.md +27 -0
  315. package/src/prompts/summarize/select-summary.md +21 -0
  316. package/src/prompts/system.md +28 -0
  317. package/src/sessions/exportPaths.ts +28 -0
  318. package/src/sessions/exportWriter.ts +490 -0
  319. package/src/sessions/sessionManager.ts +119 -0
  320. package/src/sessions/sessionStore.ts +69 -0
  321. package/src/shared/activitySpinner.ts +108 -0
  322. package/src/shared/copilotClient.ts +291 -0
  323. package/src/shared/data/cards.json +1249 -0
  324. package/src/shared/data/cardsLoader.ts +70 -0
  325. package/src/shared/errorClassifier.ts +160 -0
  326. package/src/shared/events.ts +103 -0
  327. package/src/shared/markdownRenderer.ts +44 -0
  328. package/src/shared/schemas/session.ts +346 -0
  329. package/src/shared/tableRenderer.ts +28 -0
  330. package/src/types/marked-terminal.d.ts +5 -0
  331. package/src/vendor/chalk.ts +2 -0
  332. package/src/vendor/cli-table3.ts +3 -0
  333. package/src/vendor/commander.ts +2 -0
  334. package/src/vendor/marked-terminal.ts +3 -0
  335. package/src/vendor/marked.ts +2 -0
  336. package/src/vendor/ora.ts +2 -0
  337. package/src/vendor/pino.ts +3 -0
  338. package/src/vendor/zod.ts +3 -0
  339. package/tests/e2e/developE2e.spec.ts +152 -0
  340. package/tests/e2e/developFailureE2e.spec.ts +289 -0
  341. package/tests/e2e/developPty.spec.ts +86 -0
  342. package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
  343. package/tests/e2e/harness.spec.ts +104 -0
  344. package/tests/e2e/mcpLive.spec.ts +149 -0
  345. package/tests/e2e/newSession.e2e.spec.ts +245 -0
  346. package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
  347. package/tests/e2e/workiqEnrichment.spec.ts +72 -0
  348. package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
  349. package/tests/e2e/zava-assessment/company-profile.md +98 -0
  350. package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
  351. package/tests/e2e/zavaSimulation.spec.ts +511 -0
  352. package/tests/fixtures/completedSession.json +141 -0
  353. package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
  354. package/tests/fixtures/test-fixture-project/package.json +12 -0
  355. package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
  356. package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
  357. package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
  358. package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
  359. package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
  360. package/tests/integration/autoStartConversation.spec.ts +168 -0
  361. package/tests/integration/defaultCommand.spec.ts +179 -0
  362. package/tests/integration/directCommandNonTty.spec.ts +260 -0
  363. package/tests/integration/directCommandTty.spec.ts +185 -0
  364. package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
  365. package/tests/integration/exportArtifacts.spec.ts +232 -0
  366. package/tests/integration/exportFallbackFlow.spec.ts +115 -0
  367. package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
  368. package/tests/integration/mcpTransportFlow.spec.ts +178 -0
  369. package/tests/integration/newSessionFlow.spec.ts +406 -0
  370. package/tests/integration/pocGithubMcp.spec.ts +224 -0
  371. package/tests/integration/pocLocalFallback.spec.ts +205 -0
  372. package/tests/integration/pocScaffold.spec.ts +220 -0
  373. package/tests/integration/ralphLoopFlow.spec.ts +430 -0
  374. package/tests/integration/ralphLoopPartial.spec.ts +416 -0
  375. package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
  376. package/tests/integration/spinnerLifecycle.spec.ts +270 -0
  377. package/tests/integration/summarizationFlow.spec.ts +135 -0
  378. package/tests/integration/testRunnerReal.spec.ts +63 -0
  379. package/tests/integration/webSearchAgent.spec.ts +155 -0
  380. package/tests/live/copilotSdkLive.spec.ts +149 -0
  381. package/tests/live/zavaFullWorkshop.spec.ts +515 -0
  382. package/tests/setup/loadEnv.ts +5 -0
  383. package/tests/unit/cli/developCommand.spec.ts +679 -0
  384. package/tests/unit/cli/directCommands.spec.ts +325 -0
  385. package/tests/unit/cli/envLoader.spec.ts +73 -0
  386. package/tests/unit/cli/ioContext.spec.ts +148 -0
  387. package/tests/unit/cli/preflight.spec.ts +125 -0
  388. package/tests/unit/cli/statusCommand.spec.ts +134 -0
  389. package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
  390. package/tests/unit/cli/workshopCommand.spec.ts +378 -0
  391. package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
  392. package/tests/unit/develop/checkpointState.spec.ts +378 -0
  393. package/tests/unit/develop/codeGenerator.spec.ts +447 -0
  394. package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
  395. package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
  396. package/tests/unit/develop/outputValidator.spec.ts +134 -0
  397. package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
  398. package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
  399. package/tests/unit/develop/templateRegistry.spec.ts +106 -0
  400. package/tests/unit/develop/testRunner.spec.ts +294 -0
  401. package/tests/unit/infraBicep.spec.ts +116 -0
  402. package/tests/unit/infraDeploy.spec.ts +102 -0
  403. package/tests/unit/infraTeardown.spec.ts +77 -0
  404. package/tests/unit/logging/logger.spec.ts +50 -0
  405. package/tests/unit/loop/conversationLoop.spec.ts +719 -0
  406. package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
  407. package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
  408. package/tests/unit/mcp/mcpManager.spec.ts +336 -0
  409. package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
  410. package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
  411. package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
  412. package/tests/unit/mcp/webSearch.spec.ts +718 -0
  413. package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
  414. package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
  415. package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
  416. package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
  417. package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
  418. package/tests/unit/prompts/promptLoader.spec.ts +144 -0
  419. package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
  420. package/tests/unit/schemas/session.spec.ts +328 -0
  421. package/tests/unit/sessions/exportPaths.spec.ts +38 -0
  422. package/tests/unit/sessions/exportWriter.spec.ts +737 -0
  423. package/tests/unit/sessions/sessionManager.spec.ts +174 -0
  424. package/tests/unit/sessions/sessionStore.spec.ts +136 -0
  425. package/tests/unit/shared/activitySpinner.spec.ts +211 -0
  426. package/tests/unit/shared/cardsLoader.spec.ts +89 -0
  427. package/tests/unit/shared/copilotClient.spec.ts +185 -0
  428. package/tests/unit/shared/errorClassifier.spec.ts +152 -0
  429. package/tests/unit/shared/events.spec.ts +71 -0
  430. package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
  431. package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
  432. package/tests/unit/shared/tableRenderer.spec.ts +38 -0
  433. package/tsconfig.json +20 -0
  434. package/vitest.config.ts +15 -0
  435. package/vitest.live.config.ts +19 -0
@@ -0,0 +1,257 @@
+ # sofIA - Copilot Instructions
+
+ sofIA is an agentic system built with the **GitHub Copilot SDK** that implements the AI Discovery Cards workshop process. It guides users through discovering, ideating, designing, planning, and developing AI solutions for business needs.
+
+ ## Project Overview
+
+ This project reimagines [Microsoft's AI Discovery Agent (AIDA)](https://github.com/microsoft/ai-discovery-agent/) using the GitHub Copilot SDK (`@github/copilot-sdk`) instead of Python/LangGraph. It extends AIDA's workshop facilitation by adding:
+
+ 1. **Idea Selection** - Automatically selects the most feasible AI use case from generated ideas
+ 2. **Planning** - Creates implementation plans for the selected idea
+ 3. **PoC Development** - Generates working proof-of-concept code to demonstrate the idea
+ 4. **Discovery Cards** - from [AI Discovery Cards Agent](https://github.com/microsoft-partner-solutions-ai/agent-guides/tree/main/ai-discovery-cards-agent)
+
+ ## Architecture
+
+ ### Agent Flow
+ ```
+ User Input → Discovery Agent → Ideation Agent → Design Agent → Selection Agent → Planning Agent → Development Agent
+ ```
+
+ Examples for these agents can be found in the [src/originalPrompts/](../src/originalPrompts/) directory.
+
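As an illustration of the agent flow above, the following TypeScript sketch chains six phase handlers in the documented order. The `Phase` names, the `WorkshopContext` shape, and `runWorkshop` are hypothetical and are not taken from the package; the real agents presumably call the Copilot SDK, which this sketch replaces with injected functions.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the package's API.
type Phase = "Discovery" | "Ideation" | "Design" | "Selection" | "Planning" | "Development";

// Order taken from the Agent Flow diagram.
const PHASE_ORDER: readonly Phase[] = [
  "Discovery", "Ideation", "Design", "Selection", "Planning", "Development",
];

// Context that accumulates as each agent finishes (assumed shape).
interface WorkshopContext {
  userInput: string;
  completed: Phase[];
}

// An agent takes the context so far and returns an updated copy.
type Agent = (ctx: WorkshopContext) => Promise<WorkshopContext>;

// Returns the phase after `current`, or null when the flow is finished.
function nextPhase(current: Phase): Phase | null {
  const index = PHASE_ORDER.indexOf(current);
  return index < PHASE_ORDER.length - 1 ? PHASE_ORDER[index + 1] : null;
}

// Runs every agent in order, handing each one the previous agent's output.
async function runWorkshop(
  userInput: string,
  agents: Record<Phase, Agent>,
): Promise<WorkshopContext> {
  let ctx: WorkshopContext = { userInput, completed: [] };
  for (const phase of PHASE_ORDER) {
    ctx = await agents[phase](ctx);
  }
  return ctx;
}

// Stub agents that only record that their phase ran.
function makeStubAgents(): Record<Phase, Agent> {
  const agents = {} as Record<Phase, Agent>;
  for (const phase of PHASE_ORDER) {
    agents[phase] = async (ctx) => ({ ...ctx, completed: [...ctx.completed, phase] });
  }
  return agents;
}
```

Passing the agents in as a record keeps the ordering logic separate from whatever each agent does, so a single phase can be swapped for a stub in tests.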
+ ### Core Components
+
+ - **Copilot Enabled App** - Main entry point handling GitHub Copilot chat interactions via `@github/copilot-sdk`
+ - **MCP Integrations** - External services for gathering context and generating solutions:
+   - **WorkIQ** - Process analysis and task discovery
+   - **Context7** - Documentation and context retrieval
+   - **Microsoft Learn** - Azure/AI documentation
+   - **GitHub MCP** - Repository search, code examples, and issue tracking
+
+ ### AI Discovery Workshop Process
+
+ The system implements the 12-step AI Discovery Cards methodology, then goes further by selecting the best idea, creating a plan, and generating PoC code.
+
+ ### Phase 1: AI Discovery and Ideation
+
+ For each step:
+
+ - Ask for required input.
+ - Summarize or reflect back what was shared.
+ - Propose moving to the next step only when ready.
+
+ #### Step 1: Understand the Business
+
+ - Ask for a description of the business and its challenges.
+ - Store this information for later use.
+
+ #### Step 2: Choose a Topic
+
+ - Identify areas to work on.
+ - Prioritize and define today’s focus.
+
+ #### Step 3: Ideate Activities
+
+ - Brainstorm key activities in the focus area.
+ - Identify what’s not being done due to difficulty.
+
+ #### Step 4: Map Workflow
+
+ - Visualize the activity flow.
+ - Vote on critical steps based on business and human value.
+ - Identify key metrics (e.g., hours/week, NSAT).
+
+ #### Step 5: Explore AI Envisioning Cards
+
+ - Ask the AI Discovery Expert to present cards to attendees.
+
+ #### Step 6: Score Cards
+
+ - Ask which cards were selected and how they were scored.
+
+ #### Step 7: Review Top Cards
+
+ - Select up to 15 cards.
+ - Aggregate similar ones.
+
+ #### Step 8: Map Cards to Workflow
+
+ - Align cards to workflow steps.
+ - Ensure key metrics are clear.
+
+ #### Step 9: Generate Ideas
+
+ - Ask the Design Thinking Expert to help ideate for each step.
+
+ #### Step 10: Create Idea Cards
+
+ For each idea, capture:
+
+ - **Title**
+ - **Description**
+ - **Workflow Steps Covered**
+ - **Aspirational Solution Scope**
+
+ #### Step 11: Evaluate Ideas
+
+ - Use a feasibility/value matrix.
+ - Consider KPIs and metrics.
+
+ #### Step 12: Assess Impact
+
+ For each idea, evaluate:
+
+ - Data needed
+ - Risks
+ - Business impact
+ - Human value
+ - Key metrics influenced
+
+ ### Phase 2: Idea Selection
+
+ After generating and evaluating ideas, the system automatically selects the most promising one based on feasibility, business impact, and human value. It explains why that idea was chosen and lets the user confirm or override the selection before proceeding to planning and development.
+
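The selection step described above could be sketched as follows. The 1 to 5 scale, the equal weighting of the three criteria, and every name (`IdeaScore`, `selectIdea`, `totalScore`) are assumptions for illustration; the source does not say how the criteria are combined.

```typescript
// Hypothetical shapes: field names and the 1-5 scale are assumptions.
interface IdeaScore {
  title: string;
  feasibility: number;    // 1 (hard) to 5 (easy)
  businessImpact: number; // 1 (low) to 5 (high)
  humanValue: number;     // 1 (low) to 5 (high)
}

interface Selection {
  chosen: IdeaScore;
  rationale: string;   // explanation shown to the user
  overridden: boolean; // true when the user replaced the automatic pick
}

// Equal weighting is an assumption; the source does not define weights.
function totalScore(idea: IdeaScore): number {
  return idea.feasibility + idea.businessImpact + idea.humanValue;
}

// Picks the highest-scoring idea unless the user names a different one.
function selectIdea(ideas: IdeaScore[], userOverrideTitle?: string): Selection {
  if (ideas.length === 0) {
    throw new Error("No ideas to select from");
  }
  const override = userOverrideTitle
    ? ideas.find((idea) => idea.title === userOverrideTitle)
    : undefined;
  if (override) {
    return {
      chosen: override,
      rationale: "Selected by the user, overriding the automatic pick.",
      overridden: true,
    };
  }
  // Highest total wins; on a tie the earlier idea is kept.
  const best = ideas.reduce((a, b) => (totalScore(b) > totalScore(a) ? b : a));
  return {
    chosen: best,
    rationale:
      `"${best.title}" scored ${totalScore(best)}/15 ` +
      `(feasibility ${best.feasibility}, business impact ${best.businessImpact}, ` +
      `human value ${best.humanValue}).`,
    overridden: false,
  };
}
```

Returning the rationale alongside the choice keeps the explanation and the confirm-or-override decision in one value that a CLI prompt can display.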
+ ### Phase 3: Planning and PoC Development
+
+ Once an idea is selected, the system will create a high-level implementation plan, breaking down the idea into actionable steps. It will then generate proof-of-concept code to demonstrate the core functionality of the idea, using best practices for modularity and maintainability. The generated code will be designed to be easily extendable for full implementation after the workshop.
+
+ This code will be generated by a [Ralph Loop](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/ralph-loop) agent that iteratively refines the implementation until the PoC meets the defined functionality, then stops. The system will also provide guidance on next steps for full implementation after the workshop concludes.
+
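A minimal sketch of the iterate-until-passing idea, assuming the loop alternates code generation with a test run and feeds the failures back into the next attempt. The `generate` and `runTests` callbacks and the `maxIterations` cap are hypothetical; the text above only says the loop stops once the PoC meets the defined functionality. Real generation and test runs would be asynchronous; the sketch is synchronous to keep it short.

```typescript
// Hypothetical interfaces: the package's actual loop types are not shown here.
interface TestResult {
  passed: boolean;
  failures: string[]; // messages fed back into the next generation attempt
}

interface LoopDeps {
  generate: (feedback: string[]) => string; // returns a new code revision
  runTests: (code: string) => TestResult;
}

interface LoopOutcome {
  code: string;
  iterations: number;
  success: boolean;
}

// Generates, tests, and retries with the failure messages until the tests
// pass or the iteration cap (an assumed safety limit) is reached.
function refineUntilPassing(deps: LoopDeps, maxIterations: number): LoopOutcome {
  let feedback: string[] = [];
  let code = "";
  for (let i = 1; i <= maxIterations; i++) {
    code = deps.generate(feedback);
    const result = deps.runTests(code);
    if (result.passed) {
      return { code, iterations: i, success: true };
    }
    feedback = result.failures;
  }
  return { code, iterations: maxIterations, success: false };
}
```

The cap is a design choice for the sketch: without it, a PoC that never passes would loop forever.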
+ ## Tech Stack
+
+ - **Runtime**: Node.js / TypeScript
+ - **SDK**: `@github/copilot-sdk`
+ - **MCP Protocol**: Model Context Protocol for tool integrations
+ - **Deployment**: GitHub App (Copilot SDK enabled App)
+
+ > **Note**: This repository currently contains specifications, workshop prompts, and
+ > governance/templates (under `.specify/`). If/when implementation code is added,
+ > it must follow the constitution and the stack expectations above.
+
+ ## Development Commands (when implementation code exists)
+
+ If this repo contains a `package.json`, prefer the scripts defined there
+ (`npm test`, `npm run lint`, etc.). Do not invent commands that do not exist.
+
137
+ ## Key Conventions
138
+
139
+ ### All Changes — Test-Driven Development (TDD) + Lint/Typecheck Loop
140
+
141
+ All new behavior and bug fixes **must** follow a TDD workflow (Red → Green → Lint → Review).
142
+ Do not modify production code until a failing test proves the change is needed.
143
+ **Code must never be committed if `npm run lint` or `npm run typecheck` report errors.**
144
+
145
+ 1. **Reproduce** — Identify the root cause and the exact code path that fails.
146
+ 2. **Write a failing test first** — Create a test (unit or integration) that exercises the buggy behaviour and fails on the current code. The test name should describe the symptom (e.g., `"captures LLM text after tool-calling loop"`).
147
+ 3. **Run the test** — Confirm it fails for the expected reason (`npm test -- <test-file>`).
148
+ 4. **Fix the production code** — Make the minimal change needed to make the test pass.
149
+ 5. **Run lint + typecheck** — `npm run lint && npm run typecheck` must both pass. Fix any errors before continuing.
150
+ 6. **Run the test again** — Confirm it passes after the lint/typecheck fixes.
151
+ 7. **Run the full suite** — Ensure the full test suite remains green (no regressions).
152
+ 8. **Run lint + typecheck one final time** — Confirm both still pass after the full suite run. **Do not commit until they do.**
153
+
154
+ > **Loop invariant:** After every code change — no matter how small — run `npm run lint && npm run typecheck` before moving on. Treat a lint or typecheck failure the same as a failing test: stop, fix, re-run.
155
+
156
+ **Test placement guidelines (when a test suite exists):**
157
+ | Scope | Directory | When to use |
158
+ |-------|-----------|-------------|
159
+ | Unit | `tests/unit/` | Pure-function logic, renderers, models, single-module behaviour |
160
+ | Integration | `tests/integration/` | Multi-module flows, CLI subprocess, conversation loops |
161
+ | Contract | `tests/contract/` | CLI command shapes, error formats, public API surface |
162
+
163
+ **If tests do not exist yet:** The first implementation work for a feature must include setting up a minimal test runner and a first failing test, then proceed with implementation.
164
+
165
+ **Mock boundaries:** Mock at the module boundary (e.g., Vitest `vi.mock()`), not inside functions. For Copilot SDK tests, prefer deterministic fakes for streaming/tool-calling sessions.
166
+
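The "deterministic fakes" idea can be sketched as follows. The event shape and the `FakeSession` class are hypothetical stand-ins, not the real `@github/copilot-sdk` API; the point is only that replaying a scripted event sequence makes streaming and tool-calling tests repeatable.

```typescript
// Hypothetical event shape; the real SDK's events may differ.
type SessionEvent =
  | { type: 'text-delta'; text: string }
  | { type: 'tool-call'; name: string }
  | { type: 'idle' };

// A fake session that replays a fixed script of events in order.
class FakeSession {
  private readonly script: SessionEvent[];

  constructor(script: SessionEvent[]) {
    this.script = script;
  }

  async *send(_prompt: string): AsyncGenerator<SessionEvent> {
    for (const event of this.script) {
      yield event;
    }
  }
}

// Code under test: collect all streamed text until the session goes idle.
async function collectText(session: FakeSession, prompt: string): Promise<string> {
  let text = '';
  for await (const event of session.send(prompt)) {
    if (event.type === 'text-delta') text += event.text;
    if (event.type === 'idle') break;
  }
  return text;
}
```

A test can then script a tool call between two text deltas and assert that the text on both sides of the tool call is captured, without any network access.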
167
+ > **Rationale:** Several bugs in the streaming pipeline (resolveIdle, onComplete fallback) were fixed without tests initially, requiring rework. Writing the test first catches regressions immediately and documents the expected SDK event sequence. Lint and typecheck failures have similarly caused hidden breakage (unused imports, removed interface properties still referenced in tests) that only surfaces at CI time — running them in every iteration prevents that drift.
168
+
169
+
170
+ ### Agent State Management
171
+ Each agent phase should:
172
+ - Accept context from previous phases
173
+ - Emit structured output for the next phase
174
+ - Support checkpointing for long-running sessions
175
+
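One way to read these three requirements together is sketched below. The `PhaseContext` shape, the phase names, and the helper functions are assumptions made for illustration and do not come from the project source.

```typescript
// Hypothetical shapes; phase names are illustrative only.
type Phase = 'discover' | 'select' | 'plan';

interface PhaseContext {
  sessionId: string;
  completed: Phase[];
  outputs: Partial<Record<Phase, unknown>>;
}

// Run one phase: accept prior context, record its structured output,
// and return a new context for the next phase (the input is not mutated).
function runPhase<T>(
  ctx: PhaseContext,
  phase: Phase,
  work: (ctx: PhaseContext) => T,
): PhaseContext {
  const output = work(ctx);
  return {
    ...ctx,
    completed: [...ctx.completed, phase],
    outputs: { ...ctx.outputs, [phase]: output },
  };
}

// Checkpointing as plain JSON so a long-running session can be resumed later.
function saveCheckpoint(ctx: PhaseContext): string {
  return JSON.stringify(ctx);
}

function loadCheckpoint(serialized: string): PhaseContext {
  return JSON.parse(serialized) as PhaseContext;
}
```

Keeping the context JSON-serializable is the design choice that makes checkpointing cheap; outputs that cannot round-trip through JSON would need a different persistence format.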
176
+ ### Prompt Engineering
177
+
178
+ - Canonical prompts used by the runtime should live in `src/prompts/`.
179
+ - `src/originalPrompts/` is for inspiration/legacy examples; it is not the source of truth for production prompts.
180
+ - Use markdown format for prompt templates
181
+ - Include few-shot examples where applicable
182
+ - Reference the [AI Discovery Cards Agent Guide](https://github.com/microsoft-partner-solutions-ai/agent-guides/tree/main/ai-discovery-cards-agent) for:
183
+ - System instructions and workshop methodology
184
+ - Knowledge sources and suggested prompts
185
+ - Uploaded reference files (workshop materials, card decks)
186
+
187
+ ### Import Ordering & Linting
188
+
189
+ The project uses ESLint with `eslint-plugin-import` and the `import/order` rule set to `warn`. Imports **must** be separated into groups with a blank line between each group:
190
+
191
+ 1. **Built-in / External** — Node.js built-ins and `node_modules` packages (e.g., `commander`, `vitest`, `pino`)
192
+ 2. **Internal / Parent / Sibling** — Project-relative imports (e.g., `../shared/schemas/session.js`)
193
+
194
+ ```typescript
195
+ // ✅ Correct — blank line between groups
196
+ import { Command } from 'commander';
197
+
198
+ import type { PhaseValue } from '../shared/schemas/session.js';
199
+
200
+ // ❌ Wrong — no blank line between external and internal
201
+ import { Command } from 'commander';
202
+ import type { PhaseValue } from '../shared/schemas/session.js';
203
+ ```
204
+
205
+ **Run `npm run lint` after every code change — not just at the end.** Lint failures block commits the same way failing tests do. If the linter reports `import/order` warnings, add blank lines between the import groups.
206
+
207
+ ### Typecheck
208
+
209
+ The project enforces strict TypeScript checking via `npm run typecheck` (`tsc --noEmit`). **Run `npm run typecheck` after every code change — not just at the end.** Typecheck failures block commits the same way failing tests do.
210
+
211
+ 1. Run `npm run typecheck` and fix all errors before proceeding to the next step.
212
+ 2. Never suppress errors with `@ts-ignore` or `any` — use proper types, type narrowing, or Vitest's `Mock<>` generic.
213
+ 3. For third-party packages without `@types`, add an ambient module declaration in `src/types/<package>.d.ts`.
214
+
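As an illustration of rule 2, a type guard can narrow an `unknown` value without resorting to `any` or `@ts-ignore`. The `ExportPayload` shape below is hypothetical and is used only to demonstrate the pattern.

```typescript
// Hypothetical payload shape, used only to illustrate narrowing.
interface ExportPayload {
  sessionId: string;
  phase: string;
}

// Type guard: narrows `unknown` to ExportPayload without `any` or `@ts-ignore`.
function isExportPayload(value: unknown): value is ExportPayload {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record['sessionId'] === 'string' && typeof record['phase'] === 'string';
}

function describePayload(raw: string): string {
  // Annotating as `unknown` forces a check before any property access.
  const parsed: unknown = JSON.parse(raw);
  if (!isExportPayload(parsed)) {
    return 'invalid payload';
  }
  // Inside this branch the compiler knows `parsed` is an ExportPayload.
  return `${parsed.sessionId}:${parsed.phase}`;
}
```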
215
+ > **Rationale:** Type mismatches between production code and Zod schemas (e.g., wrong property names in `exportWriter.ts`) were only caught by strict typechecking, not by tests. Unused imports left behind after interface refactoring (e.g., removing `githubAdapter` from `RalphLoopOptions`) caused silent CI failures that would have been caught immediately by running `npm run lint && npm run typecheck` in every loop iteration.
216
+
217
+ ## MCP Server Configuration
218
+
219
+ The project uses Model Context Protocol servers for external integrations. Configuration is in:
220
+ - `.vscode/settings.json` - VS Code / GitHub Copilot integration
221
+ - `.vscode/mcp.json` - MCP server configuration
222
+
223
+ Use these servers whenever the agent flow needs context retrieval or tool calls.
224
+
225
+ ### Available MCP Servers
226
+
227
+ | Server | Config | Purpose |
228
+ |--------|--------|---------|
229
+ | `workiq` | `@microsoft/workiq` | Microsoft 365 data - emails, meetings, documents, Teams messages |
230
+ | `github` | Remote: `https://api.githubcopilot.com/mcp/` | Repository search, code, issues, PRs, Actions workflows |
231
+ | `microsoftdocs` | Remote: `https://learn.microsoft.com/api/mcp` | Official Microsoft and Azure documentation from Microsoft Learn |
232
+ | `context7` | `@upstash/context7-mcp` | Up-to-date library/framework documentation |
233
+ | `playwright` | `@playwright/mcp@latest` | Browser automation for web research and PoC testing |
234
+
235
+ > **Note**: Web search is built into GitHub Copilot - no additional MCP server needed for researching companies, industry trends, or public information.
236
+
237
+ ### WorkIQ Setup
238
+
239
+ WorkIQ requires Microsoft 365 tenant access and admin consent. On first use:
240
+ 1. Run `npx -y @microsoft/workiq accept-eula` to accept the EULA
241
+ 2. Sign in when prompted - admin consent may be required
242
+ 3. See [WorkIQ Admin Instructions](https://github.com/microsoft/work-iq-mcp/blob/main/ADMIN-INSTRUCTIONS.md) for tenant setup
243
+
244
+ ## Terminal Command Safety
245
+
246
+ The CLI may hang during development (e.g., interactive prompts waiting for input, infinite loops, watch modes). To avoid blocking the agent:
247
+
248
+ - **Always use a timeout** when running `npm test`, `npm run build`, `npm run dev`, or any command that could hang. Use the `timeout` parameter on terminal calls (e.g., 30000ms for tests, 60000ms for builds).
249
+ - **Never run `npm run dev` without a timeout** — it starts a watch process that never exits on its own.
250
+ - **Prefer targeted test runs** (`npm test -- <specific-file>`) over full suite runs when iterating on a single module.
251
+ - **If a command hangs**, kill it and investigate the root cause rather than waiting indefinitely.
252
+
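The same guard can be applied inside code that awaits an operation which might never settle. This is a minimal sketch built around a hypothetical `withTimeout` helper; it is not part of the project and it does not cancel the underlying work, it only stops waiting for it.

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settle normally and release the timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

Because the original operation keeps running after the timeout fires, a real caller would still need to kill the underlying process or request, as the last bullet above advises.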
253
+ ## Security Considerations
254
+
255
+ - Never log or expose user tokens
256
+ - Use the SDK's built-in request verification rather than custom implementations
257
+ - Follow [AIDA's RAI principles](https://github.com/microsoft/ai-discovery-agent/blob/main/docs/RESPONSIBLE_AI_PRINCIPLES.md) for AI transparency
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.analyze
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.checklist
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.clarify
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.constitution
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.implement
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.plan
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.specify
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.tasks
3
+ ---
@@ -0,0 +1,3 @@
1
+ ---
2
+ agent: speckit.taskstoissues
3
+ ---
@@ -0,0 +1,38 @@
1
+ name: CI
2
+
3
+ on:
4
+ push:
5
+ pull_request:
6
+ branches: [main]
7
+
8
+ jobs:
9
+ build-and-test:
10
+ runs-on: ubuntu-latest
11
+ strategy:
12
+ matrix:
13
+ node-version: [22.x]
14
+
15
+ steps:
16
+ - name: Checkout repository
17
+ uses: actions/checkout@v4
18
+
19
+ - name: Setup Node.js ${{ matrix.node-version }}
20
+ uses: actions/setup-node@v4
21
+ with:
22
+ node-version: ${{ matrix.node-version }}
23
+ cache: 'npm'
24
+
25
+ - name: Install dependencies
26
+ run: npm ci
27
+
28
+ - name: Lint
29
+ run: npm run lint
30
+
31
+ - name: Type check
32
+ run: npm run typecheck
33
+
34
+ - name: Run tests
35
+ env:
36
+ COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_KEY }}
37
+ GITHUB_TOKEN: ${{ secrets.COPILOT_KEY }}
38
+ run: npm test
package/.prettierrc ADDED
@@ -0,0 +1,6 @@
1
+ {
2
+ "singleQuote": true,
3
+ "trailingComma": "all",
4
+ "printWidth": 100,
5
+ "semi": true
6
+ }
@@ -0,0 +1,181 @@
1
+ <!--
2
+ Sync Impact Report
3
+
4
+ - Version change: 1.1.1 → 1.1.2
5
+ - Modified principles: none
6
+ - Clarifications: added explicit mapping between 12-step process and Discover/Ideate/Design/Select/Plan/Develop wording
7
+ - Added sections: Sync Impact Report (this comment)
8
+ - Removed sections: none
9
+ - Templates requiring updates:
10
+ - ✅ .specify/templates/plan-template.md (Constitution Check gates clarified)
11
+ - ✅ .specify/templates/tasks-template.md (tests/TDD requirements aligned)
12
+ - ✅ .github/copilot-instructions.md (repo structure + TDD guidance aligned)
13
+ - ⚠ .specify/templates/commands/*.md (folder not present in this repo)
14
+ - Deferred TODOs: none
15
+ -->
16
+
17
+ # sofIA Copilot CLI Constitution
18
+
19
+ This constitution governs the design, development, and operation of the sofIA Copilot CLI solution. The system helps organizations **analyze, ideate, design, generate, and select** high‑quality AI‑enabled project ideas using the AI Discovery Cards methodology, implemented with the GitHub Copilot SDK for Node.js.
20
+
21
+ ## Core Principles
22
+
23
+ ### I. Outcome‑First AI Discovery
24
+
25
+ - The primary goal is to help users discover, refine, and prioritize valuable, feasible AI use cases – **not** to generate code for its own sake.
26
+ - The agent always keeps the AI Discovery Cards workshop phases in view: Discover → Ideate → Design → Select → Plan → Develop.
27
+ - All outputs (text, plans, code suggestions) must explicitly tie back to business goals, users, processes, and measurable impact.
28
+
29
+ ### II. Secure‑by‑Default & Privacy‑Respecting
30
+
31
+ - Follow **least privilege**: only request and use the minimum data, scopes, repos, and MCP capabilities required for the current task.
32
+ - Never log, echo, or persist secrets, access tokens, PII, or customer‑sensitive data.
33
+ - When using external MCP services or web fetch, prefer **anonymized, aggregate, or redacted** context; avoid copying proprietary content into prompts unless the user explicitly provides it.
34
+ - Default to local execution where possible; remote calls must be transparent and justifiable.
35
+
36
+ ### III. Node.js + TypeScript, SDK‑Aligned
37
+
38
+ - The solution is implemented in **Node.js (LTS) with TypeScript**, using the GitHub Copilot SDK for Node.js as the primary integration surface.
39
+ - All core behavior (request parsing, streaming events, MCP calls, orchestration) must follow SDK best practices and patterns established in this repository.
40
+ - Public contracts (CLI flags, JSON schemas, extension event formats) are treated as **APIs** and evolved carefully.
41
+
42
+ ### IV. MCP‑First Context & Tools
43
+
44
+ - Prefer **MCP servers** over ad‑hoc HTTP calls whenever suitable tools exist (Context7, Playwright, WorkIQ, Microsoft Docs, GitHub MCP, filesystem, fetch, etc.).
45
+ - Each MCP call must have a **clear purpose** that supports the current workshop phase (e.g., research technology options, inspect existing codebases, understand documentation, analyze processes).
46
+ - Tool use is **explainable**: the agent should be able to state what tool it used, why, and how the result influenced its recommendation.
47
+
48
+ ### V. Test‑First, Regressions‑Last (Red → Green → Lint → Review)
49
+
50
+ - New behavior must be covered by **automated tests** (unit where possible, integration/e2e where necessary).
51
+ - **No implementation starts before tests**: for every task phase, the first committed change MUST be failing tests that describe the target behavior.
52
+ - The project follows a **phase‑level TDD cycle** for every feature implemented:
53
+ 1. **Red**: Write all tests for the current task phase **before** writing any implementation code. All new tests MUST fail initially, confirming they test real behavior that does not yet exist.
54
+ 2. **Green**: Implement the minimum code needed to make all failing tests pass. Do not move to the next phase until every test is green.
55
+ 3. **Lint**: After green, run `npm run lint && npm run typecheck`. **Both must pass before any commit.** Lint and typecheck failures are treated the same as failing tests — stop, fix, re‑verify. This step is mandatory in every iteration, not just at the end of a task.
56
+ 4. **Review**: After lint passes, perform a **mandatory self‑review** using the Test Review Checklist (see Development Workflow below) to identify gaps. Add new tests if gaps are found and repeat the red → green → lint cycle for those additions.
57
+ - **Loop invariant for every code change**: run `npm run lint && npm run typecheck` immediately after editing any file. Never commit code that fails either check.
58
+ - Critical flows – idea evaluation, scoring, selection, and project planning – require **stable, deterministic tests** that avoid flaky external dependencies.
59
+ - Interactive CLI flows (menus, phase gates, follow-up prompts, retries, resume, and Ctrl+C paths) MUST be testable from day one through deterministic automation, not only manual checks.
60
+ - Test strategy: pure logic → unit tests; orchestration and tool integration → integration tests with fakes/mocks and a minimal set of live MCP smoke tests.
61
+ - Generated PoC repositories MUST include **basic smoke/happy‑path tests** that validate the code runs successfully. Full TDD is not required for PoC code, but generated tests serve as a quality signal for the Ralph loop.
62
+ - Each **Ralph loop iteration** MUST be test‑driven: every refinement cycle starts with a failing test or captured error that proves a defect exists, refines code until the failure is resolved, runs lint + typecheck, and then checks whether new issues were introduced.
63
+
64
+ ### VI. Interactive CLI Testability by Design
65
+
66
+ - Every interactive feature MUST define a machine-testable contract up front: expected prompts, user choices, streamed activity signals, decision gates, and terminal end states.
67
+ - The project MUST maintain an automated interactive harness capable of validating full workshop behavior (including phase transitions and governed progression) in a pseudo-terminal environment.
68
+ - For LLM-involved interactive validation, tests MUST use a layered approach:
69
+ 1. deterministic assertions on structure and control flow (menus, summaries, decisions, transitions),
70
+ 2. optional semantic validation (e.g., LLM-as-judge) for quality-sensitive phases.
71
+ - When interaction complexity requires it, the harness MAY use Copilot SDK-generated answer banks or tool-assisted inputs, but runs MUST remain reproducible via saved transcripts/reports and explicit pass/fail checks.
72
+ - A feature is not complete unless at least one automated end-to-end interactive scenario validates the happy path and one validates a failure/recovery path.
73
+
74
+ ### VII. Deterministic, Auditable Agent Behavior
75
+
76
+ - Prompts, system instructions, and agent flows must be **versioned and reviewable** (stored under `src/prompts` or equivalent, not embedded ad‑hoc everywhere).
77
+ - For the same inputs and configuration, the system should aim for **predictable, reproducible** outputs (within the limits of LLM variability), achieved via structured prompts, stable scoring rubrics, and constrained response schemas.
78
+ - Significant decisions (e.g., why an AI idea was ranked #1 vs #2) should be accompanied by **structured rationale** suitable for audit and stakeholder review.
79
+
80
+ ### VIII. CLI‑First UX & Transparency
81
+
82
+ - The CLI interface is a **first‑class product**: clear commands, help text, and progress reporting are mandatory.
83
+ - All long‑running operations (multi‑phase workshops, MCP orchestrations) must **stream progress**, not leave users idle.
84
+ - Users MUST always see the current execution state (current phase, waiting for input, running tool/action, retry/recovery state) during interactive and long-running operations.
85
+ - On failures, user-facing output MUST include: what failed, why it likely failed, what was already completed, and the next actionable recovery options.
86
+ - The agent must be honest about limitations, uncertainty, and trade‑offs, avoiding over‑confident claims.
87
+
88
+ ## Architecture & Scope
89
+
90
+ - The system implements the AI Discovery Cards process as a **multi‑phase agentic pipeline**:
91
+ - Phase 1: the AI Discovery Cards 12-step process
92
+ - Phase 2: idea selection
93
+ - Phase 3: planning and development. Outline milestones, dependencies, and PoC scope, then generate PoC‑level code examples and scaffolding.
94
+
95
+ Mapping: the 12-step workshop covers **Discover/Ideate/Design**; Phase 2 maps to **Select**; Phase 3 maps to **Plan/Develop**.
96
+
97
+ - Each step is implemented as a **composable agent/module** with:
98
+ - A narrow responsibility and input/output contract.
99
+ - A clear hand‑off format to the next phase.
100
+ - Optional checkpointing/check‑in with the user (especially at selection & planning).
101
+ - The Copilot CLI acts as an **orchestrator**, not a monolith: orchestration code wires agents, prompts, and MCP tools together.
102
+
103
+ ## Security & Compliance
104
+
105
+ - Always validate CLI arguments, configuration files, and Copilot SDK session payloads before processing.
106
+ - Enforce strict **input validation** on CLI arguments, configuration files, and environment variables; fail fast on invalid or unsafe values.
107
+ - Use secure defaults:
108
+ - HTTPS‑only when calling remote services.
109
+ - TLS verification enabled; no blanket `NODE_TLS_REJECT_UNAUTHORIZED=0`.
110
+ - Timeouts and retries configured to avoid hanging processes.
111
+ - Access to GitHub, Azure, WorkIQ, or other enterprise systems must respect **organization policies** and least‑privilege scopes.
112
+ - Sensitive outputs (like architecture diagrams or PoC code that touches regulated data) should include **disclaimers and risk notes** when appropriate.
113
+
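A fail-fast validation step along these lines might look like the sketch below. The option names (`session`, `timeoutMs`), their limits, and the error messages are hypothetical; the actual sofia-cli flags are not shown in this document.

```typescript
// Hypothetical CLI options; the real sofia-cli flags are not shown in this document.
interface CliOptions {
  session: string;
  timeoutMs: number;
}

// Validate raw string inputs and throw immediately on anything invalid or unsafe.
function parseOptions(raw: Record<string, string | undefined>): CliOptions {
  const session = raw['session'];
  // Restricting the character set also rules out path traversal such as "../".
  if (session === undefined || !/^[A-Za-z0-9_-]{1,64}$/.test(session)) {
    throw new Error('Invalid session: expected 1-64 letters, digits, "_" or "-"');
  }
  // Assumed default of 30000ms when the option is omitted.
  const timeoutMs = Number(raw['timeoutMs'] ?? '30000');
  if (!Number.isInteger(timeoutMs) || timeoutMs <= 0 || timeoutMs > 600000) {
    throw new Error('Invalid timeoutMs: expected an integer between 1 and 600000');
  }
  return { session, timeoutMs };
}
```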
114
+ ## MCP Services Usage
115
+
116
+ - **Context7**
117
+ - Use to fetch **up‑to‑date documentation and best practices** for libraries, frameworks, and platforms relevant to a proposed AI idea.
118
+ - Use when evaluating technical feasibility, comparing implementation options, or generating PoC scaffolding.
119
+ - Prefer official or high‑trust sources; clearly separate factual documentation from generated interpretation.
120
+
121
+ - **Playwright MCP**
122
+ - Use for **browser automation and validation** when ideas involve web UX, customer journeys, or site workflows.
123
+ - Suitable tasks: walking through existing user flows, capturing page structure, or validating that an AI augmentation can integrate into a target UI.
124
+ - Avoid using it to capture or persist sensitive user data; respect robots.txt and customer security guidelines.
125
+
126
+ - **WorkIQ / M365 MCP** (when enabled)
127
+ - Use for **process discovery** and empirical analysis of how work is currently performed (emails, meetings, documents, Teams, etc.).
128
+ - Only access tenants and scopes explicitly authorized; never assume cross‑tenant access.
129
+ - Summaries and suggestions must preserve confidentiality and avoid exposing individual‑level behavioral analytics unless policy allows.
130
+
131
+ - **GitHub MCP**
132
+ - Use to analyze existing repos and workflows when ideas involve **developer productivity, DevOps, or code quality**.
133
+ - Prefer light‑touch analysis (metadata, structure, high‑level patterns) over raw code dumps unless the user explicitly requests deeper review.
134
+
135
+ - **Microsoft Docs / Azure MCP**
136
+ - Use for authoritative **cloud architecture, security, and compliance** guidance when proposing Azure‑based or Microsoft‑based solutions.
137
+ - When generating Azure/AI solution ideas, ground recommendations in official docs where feasible.
138
+
139
+ ## Development Workflow & Quality Gates
140
+
141
+ - **Branching & Reviews**
142
+ - All substantial changes (logic, prompts, workflows) go through PRs and human review.
143
+ - PR descriptions must state which workshop phases are affected and which tests were run.
144
+
145
+ - **Testing Requirements (Red → Green → Lint → Review)**
146
+ - A change is not done until there are passing tests covering the new behavior.
147
+ - The first implementation commit for a task phase MUST include failing tests before production-code changes.
148
+ - Core scoring and selection logic must have **high‑signal unit tests** (no reliance on live LLMs or MCP tools for correctness).
149
+ - End‑to‑end tests may stub LLMs/MCPs while validating orchestration and CLI UX.
150
+ - The GitHub Copilot SDK can help generate non-stub integration tests against a live LLM; keep in mind that their results are non-deterministic.
151
+ - Interactive CLI changes MUST include automated terminal-flow tests that verify prompts, user decisions, transitions, and persistence/resume behavior.
152
+ - LLM-dependent behavior MUST expose deterministic checks first (schema/control flow/required signals), with optional semantic checks layered on top.
153
+ - Every task phase follows the **phase‑level TDD cycle**: tests written first → all must fail → implement until green → lint and typecheck → self‑review.
154
+ - After reaching green, the implementer MUST run through the **Test Review Checklist**:
155
+ - [ ] Are all edge cases covered (empty inputs, nulls, boundary values)?
156
+ - [ ] Are negative/error paths tested (invalid data, missing dependencies, permission failures)?
157
+ - [ ] Are boundary conditions verified (max/min values, empty collections, large payloads)?
158
+ - [ ] Are new integration points exercised (new MCP calls, Copilot SDK interactions)?
159
+ - [ ] Do existing tests still pass without modification (no silent regressions)?
160
+ If any gaps are found, add tests and repeat the red → green → lint cycle before proceeding.
161
+
162
+ - **Observability & Diagnostics**
163
+ - Use structured, leveled logging with a clear separation between **debug**, **info**, **warn**, and **error**.
164
+ - Logging MUST be extensive enough to reconstruct interactive failures end-to-end: include session ID, phase, turn number, tool/action, timing, and transition decisions.
165
+ - Logs must never contain secrets or sensitive data; link to resource identifiers or hashes instead.
166
+ - For CLI users, provide concise error messages plus an optional `--verbose` or `--debug` mode.
167
+ - Interactive UX MUST surface real-time operational events to users (progress, tool activity, state changes) and provide explicit failure reasons with recovery guidance.
168
+ - Automated interactive runs MUST persist artifacts (for example: transcript + structured report) so regressions are diagnosable and reproducible.
169
+
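The logging rules above (structured fields, no secrets) can be sketched as a small redaction step applied before a log line is written. The key list and function names are assumptions made for illustration; a real project would more likely configure this in its logging library.

```typescript
// Hypothetical list of sensitive keys; a real project would maintain its own.
const SENSITIVE_KEYS = new Set(['token', 'authorization', 'password', 'secret']);

type LogFields = Record<string, unknown>;
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

// Return a copy of the fields with sensitive values replaced,
// recursing into nested plain objects (arrays are copied as-is).
function redact(fields: LogFields): LogFields {
  const safe: LogFields = {};
  for (const [key, value] of Object.entries(fields)) {
    if (SENSITIVE_KEYS.has(key.toLowerCase())) {
      safe[key] = '[REDACTED]';
    } else if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      safe[key] = redact(value as LogFields);
    } else {
      safe[key] = value;
    }
  }
  return safe;
}

// Build one structured JSON log line carrying session and phase context.
function formatLog(level: LogLevel, message: string, fields: LogFields): string {
  return JSON.stringify({ level, message, ...redact(fields) });
}
```

Matching on key names is a coarse filter: it will not catch a secret embedded in a free-text message, so it complements rather than replaces the rule of never passing secrets to the logger.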
170
+ ## Governance
171
+
172
+ - This constitution **supersedes ad‑hoc practices** for the sofIA Copilot CLI and related agents.
173
+ - Any feature, design, or prompt that conflicts with this document must be revised or justified via a documented exception.
174
+ - Amendments require:
175
+ - A proposal documenting the motivation, risks, and migration/rollout plan.
176
+ - Review and approval via the project’s standard PR process.
177
+ - A version bump and date update in this file.
178
+ - All PR reviews should include an explicit, light‑weight check against this constitution: security, testing, MCP usage, and AI Discovery alignment.
179
+ - Runtime guidance (coding style, prompts, agent composition) should be kept in the project’s developer docs and referenced from here as needed.
180
+
181
+ **Version**: 1.1.2 | **Ratified**: 2026-02-24 | **Last Amended**: 2026-02-26