sofia-cli 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (435)
  1. package/.github/agents/copilot-instructions.md +39 -0
  2. package/.github/agents/speckit.analyze.agent.md +184 -0
  3. package/.github/agents/speckit.checklist.agent.md +294 -0
  4. package/.github/agents/speckit.clarify.agent.md +181 -0
  5. package/.github/agents/speckit.constitution.agent.md +84 -0
  6. package/.github/agents/speckit.implement.agent.md +135 -0
  7. package/.github/agents/speckit.plan.agent.md +90 -0
  8. package/.github/agents/speckit.specify.agent.md +258 -0
  9. package/.github/agents/speckit.tasks.agent.md +137 -0
  10. package/.github/agents/speckit.taskstoissues.agent.md +30 -0
  11. package/.github/copilot-instructions.md +257 -0
  12. package/.github/prompts/speckit.analyze.prompt.md +3 -0
  13. package/.github/prompts/speckit.checklist.prompt.md +3 -0
  14. package/.github/prompts/speckit.clarify.prompt.md +3 -0
  15. package/.github/prompts/speckit.constitution.prompt.md +3 -0
  16. package/.github/prompts/speckit.implement.prompt.md +3 -0
  17. package/.github/prompts/speckit.plan.prompt.md +3 -0
  18. package/.github/prompts/speckit.specify.prompt.md +3 -0
  19. package/.github/prompts/speckit.tasks.prompt.md +3 -0
  20. package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
  21. package/.github/workflows/ci.yml +38 -0
  22. package/.prettierrc +6 -0
  23. package/.specify/memory/constitution.md +181 -0
  24. package/.specify/scripts/bash/check-prerequisites.sh +166 -0
  25. package/.specify/scripts/bash/common.sh +156 -0
  26. package/.specify/scripts/bash/create-new-feature.sh +297 -0
  27. package/.specify/scripts/bash/setup-plan.sh +61 -0
  28. package/.specify/scripts/bash/update-agent-context.sh +810 -0
  29. package/.specify/templates/agent-file-template.md +28 -0
  30. package/.specify/templates/checklist-template.md +40 -0
  31. package/.specify/templates/constitution-template.md +50 -0
  32. package/.specify/templates/plan-template.md +113 -0
  33. package/.specify/templates/spec-template.md +115 -0
  34. package/.specify/templates/tasks-template.md +251 -0
  35. package/.vscode/mcp.json +42 -0
  36. package/.vscode/settings.json +19 -0
  37. package/CODE_OF_CONDUCT.md +128 -0
  38. package/LICENSE +21 -0
  39. package/README.md +213 -0
  40. package/dist/src/cli/developCommand.js +240 -0
  41. package/dist/src/cli/directCommands.js +143 -0
  42. package/dist/src/cli/envLoader.js +16 -0
  43. package/dist/src/cli/exportCommand.js +53 -0
  44. package/dist/src/cli/index.js +203 -0
  45. package/dist/src/cli/ioContext.js +109 -0
  46. package/dist/src/cli/preflight.js +57 -0
  47. package/dist/src/cli/statusCommand.js +110 -0
  48. package/dist/src/cli/workshopCommand.js +400 -0
  49. package/dist/src/develop/checkpointState.js +86 -0
  50. package/dist/src/develop/codeGenerator.js +319 -0
  51. package/dist/src/develop/dynamicScaffolder.js +226 -0
  52. package/dist/src/develop/githubMcpAdapter.js +122 -0
  53. package/dist/src/develop/index.js +15 -0
  54. package/dist/src/develop/mcpContextEnricher.js +195 -0
  55. package/dist/src/develop/pocScaffolder.js +542 -0
  56. package/dist/src/develop/ralphLoop.js +659 -0
  57. package/dist/src/develop/templateRegistry.js +364 -0
  58. package/dist/src/develop/testRunner.js +202 -0
  59. package/dist/src/logging/logger.js +58 -0
  60. package/dist/src/loop/conversationLoop.js +227 -0
  61. package/dist/src/loop/phaseSummarizer.js +87 -0
  62. package/dist/src/mcp/mcpManager.js +267 -0
  63. package/dist/src/mcp/mcpTransport.js +391 -0
  64. package/dist/src/mcp/retryPolicy.js +47 -0
  65. package/dist/src/mcp/webSearch.js +254 -0
  66. package/dist/src/phases/contextSummarizer.js +101 -0
  67. package/dist/src/phases/discoveryEnricher.js +156 -0
  68. package/dist/src/phases/phaseExtractors.js +222 -0
  69. package/dist/src/phases/phaseHandlers.js +328 -0
  70. package/dist/src/prompts/design.md +51 -0
  71. package/dist/src/prompts/develop-boundary.md +51 -0
  72. package/dist/src/prompts/develop.md +111 -0
  73. package/dist/src/prompts/discover.md +58 -0
  74. package/dist/src/prompts/ideate.md +56 -0
  75. package/dist/src/prompts/plan.md +51 -0
  76. package/dist/src/prompts/promptLoader.js +167 -0
  77. package/dist/src/prompts/promptLoader.ts +198 -0
  78. package/dist/src/prompts/select.md +47 -0
  79. package/dist/src/prompts/summarize/README.md +8 -0
  80. package/dist/src/prompts/summarize/design-summary.md +37 -0
  81. package/dist/src/prompts/summarize/develop-summary.md +25 -0
  82. package/dist/src/prompts/summarize/ideate-summary.md +27 -0
  83. package/dist/src/prompts/summarize/plan-summary.md +27 -0
  84. package/dist/src/prompts/summarize/select-summary.md +21 -0
  85. package/dist/src/prompts/system.md +28 -0
  86. package/dist/src/sessions/exportPaths.js +22 -0
  87. package/dist/src/sessions/exportWriter.js +406 -0
  88. package/dist/src/sessions/sessionManager.js +81 -0
  89. package/dist/src/sessions/sessionStore.js +65 -0
  90. package/dist/src/shared/activitySpinner.js +91 -0
  91. package/dist/src/shared/copilotClient.js +129 -0
  92. package/dist/src/shared/data/cards.json +1249 -0
  93. package/dist/src/shared/data/cardsLoader.js +51 -0
  94. package/dist/src/shared/errorClassifier.js +120 -0
  95. package/dist/src/shared/events.js +28 -0
  96. package/dist/src/shared/markdownRenderer.js +34 -0
  97. package/dist/src/shared/schemas/session.js +265 -0
  98. package/dist/src/shared/tableRenderer.js +20 -0
  99. package/dist/src/vendor/chalk.js +2 -0
  100. package/dist/src/vendor/cli-table3.js +3 -0
  101. package/dist/src/vendor/commander.js +2 -0
  102. package/dist/src/vendor/marked-terminal.js +3 -0
  103. package/dist/src/vendor/marked.js +2 -0
  104. package/dist/src/vendor/ora.js +2 -0
  105. package/dist/src/vendor/pino.js +2 -0
  106. package/dist/src/vendor/zod.js +2 -0
  107. package/dist/tests/e2e/developE2e.spec.js +126 -0
  108. package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
  109. package/dist/tests/e2e/developPty.spec.js +75 -0
  110. package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
  111. package/dist/tests/e2e/harness.spec.js +83 -0
  112. package/dist/tests/e2e/mcpLive.spec.js +120 -0
  113. package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
  114. package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
  115. package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
  116. package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
  117. package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
  118. package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
  119. package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
  120. package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
  121. package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
  122. package/dist/tests/integration/autoStartConversation.spec.js +138 -0
  123. package/dist/tests/integration/defaultCommand.spec.js +147 -0
  124. package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
  125. package/dist/tests/integration/directCommandTty.spec.js +151 -0
  126. package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
  127. package/dist/tests/integration/exportArtifacts.spec.js +202 -0
  128. package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
  129. package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
  130. package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
  131. package/dist/tests/integration/newSessionFlow.spec.js +343 -0
  132. package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
  133. package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
  134. package/dist/tests/integration/pocScaffold.spec.js +163 -0
  135. package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
  136. package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
  137. package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
  138. package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
  139. package/dist/tests/integration/summarizationFlow.spec.js +115 -0
  140. package/dist/tests/integration/testRunnerReal.spec.js +52 -0
  141. package/dist/tests/integration/webSearchAgent.spec.js +128 -0
  142. package/dist/tests/live/copilotSdkLive.spec.js +107 -0
  143. package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
  144. package/dist/tests/setup/loadEnv.js +3 -0
  145. package/dist/tests/unit/cli/developCommand.spec.js +567 -0
  146. package/dist/tests/unit/cli/directCommands.spec.js +279 -0
  147. package/dist/tests/unit/cli/envLoader.spec.js +58 -0
  148. package/dist/tests/unit/cli/ioContext.spec.js +119 -0
  149. package/dist/tests/unit/cli/preflight.spec.js +108 -0
  150. package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
  151. package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
  152. package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
  153. package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
  154. package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
  155. package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
  156. package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
  157. package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
  158. package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
  159. package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
  160. package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
  161. package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
  162. package/dist/tests/unit/develop/testRunner.spec.js +249 -0
  163. package/dist/tests/unit/infraBicep.spec.js +92 -0
  164. package/dist/tests/unit/infraDeploy.spec.js +82 -0
  165. package/dist/tests/unit/infraTeardown.spec.js +63 -0
  166. package/dist/tests/unit/logging/logger.spec.js +43 -0
  167. package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
  168. package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
  169. package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
  170. package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
  171. package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
  172. package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
  173. package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
  174. package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
  175. package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
  176. package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
  177. package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
  178. package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
  179. package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
  180. package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
  181. package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
  182. package/dist/tests/unit/schemas/session.spec.js +257 -0
  183. package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
  184. package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
  185. package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
  186. package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
  187. package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
  188. package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
  189. package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
  190. package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
  191. package/dist/tests/unit/shared/events.spec.js +55 -0
  192. package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
  193. package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
  194. package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
  195. package/dist/vitest.config.js +14 -0
  196. package/dist/vitest.live.config.js +18 -0
  197. package/docs/README.md +35 -0
  198. package/docs/architecture.md +169 -0
  199. package/docs/cli-usage.md +207 -0
  200. package/docs/environment.md +66 -0
  201. package/docs/export-format.md +146 -0
  202. package/docs/session-model.md +113 -0
  203. package/eslint.config.js +35 -0
  204. package/infra/deploy.sh +193 -0
  205. package/infra/gather-env.sh +211 -0
  206. package/infra/main.bicep +90 -0
  207. package/infra/main.bicepparam +18 -0
  208. package/infra/resources.bicep +134 -0
  209. package/infra/teardown.sh +114 -0
  210. package/package.json +63 -0
  211. package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
  212. package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
  213. package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
  214. package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
  215. package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
  216. package/specs/001-cli-workshop-rebuild/plan.md +361 -0
  217. package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
  218. package/specs/001-cli-workshop-rebuild/research.md +116 -0
  219. package/specs/001-cli-workshop-rebuild/spec.md +240 -0
  220. package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
  221. package/specs/002-poc-generation/contracts/poc-output.md +172 -0
  222. package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
  223. package/specs/002-poc-generation/data-model.md +172 -0
  224. package/specs/002-poc-generation/plan.md +109 -0
  225. package/specs/002-poc-generation/quickstart.md +97 -0
  226. package/specs/002-poc-generation/research.md +786 -0
  227. package/specs/002-poc-generation/spec.md +81 -0
  228. package/specs/002-poc-generation/tasks-fix.md +198 -0
  229. package/specs/002-poc-generation/tasks.md +252 -0
  230. package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
  231. package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
  232. package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
  233. package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
  234. package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
  235. package/specs/003-mcp-transport-integration/data-model.md +326 -0
  236. package/specs/003-mcp-transport-integration/plan.md +114 -0
  237. package/specs/003-mcp-transport-integration/quickstart.md +311 -0
  238. package/specs/003-mcp-transport-integration/research.md +395 -0
  239. package/specs/003-mcp-transport-integration/spec.md +234 -0
  240. package/specs/003-mcp-transport-integration/tasks.md +324 -0
  241. package/specs/003-next-spec-gaps.md +150 -0
  242. package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
  243. package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
  244. package/specs/004-dev-resume-hardening/data-model.md +321 -0
  245. package/specs/004-dev-resume-hardening/plan.md +107 -0
  246. package/specs/004-dev-resume-hardening/quickstart.md +115 -0
  247. package/specs/004-dev-resume-hardening/research.md +142 -0
  248. package/specs/004-dev-resume-hardening/spec.md +221 -0
  249. package/specs/004-dev-resume-hardening/tasks.md +333 -0
  250. package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
  251. package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
  252. package/specs/005-ai-search-deploy/data-model.md +130 -0
  253. package/specs/005-ai-search-deploy/plan.md +93 -0
  254. package/specs/005-ai-search-deploy/quickstart.md +96 -0
  255. package/specs/005-ai-search-deploy/research.md +187 -0
  256. package/specs/005-ai-search-deploy/spec.md +143 -0
  257. package/specs/005-ai-search-deploy/tasks.md +284 -0
  258. package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
  259. package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
  260. package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
  261. package/specs/006-workshop-extraction-fixes/plan.md +123 -0
  262. package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
  263. package/specs/006-workshop-extraction-fixes/research.md +143 -0
  264. package/specs/006-workshop-extraction-fixes/spec.md +210 -0
  265. package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
  266. package/src/cli/developCommand.ts +308 -0
  267. package/src/cli/directCommands.ts +195 -0
  268. package/src/cli/envLoader.ts +17 -0
  269. package/src/cli/exportCommand.ts +65 -0
  270. package/src/cli/index.ts +249 -0
  271. package/src/cli/ioContext.ts +139 -0
  272. package/src/cli/preflight.ts +86 -0
  273. package/src/cli/statusCommand.ts +118 -0
  274. package/src/cli/workshopCommand.ts +496 -0
  275. package/src/develop/checkpointState.ts +121 -0
  276. package/src/develop/codeGenerator.ts +402 -0
  277. package/src/develop/dynamicScaffolder.ts +284 -0
  278. package/src/develop/githubMcpAdapter.ts +199 -0
  279. package/src/develop/index.ts +34 -0
  280. package/src/develop/mcpContextEnricher.ts +279 -0
  281. package/src/develop/pocScaffolder.ts +646 -0
  282. package/src/develop/ralphLoop.ts +1044 -0
  283. package/src/develop/templateRegistry.ts +427 -0
  284. package/src/develop/testRunner.ts +276 -0
  285. package/src/logging/logger.ts +73 -0
  286. package/src/loop/conversationLoop.ts +355 -0
  287. package/src/loop/phaseSummarizer.ts +114 -0
  288. package/src/mcp/mcpManager.ts +365 -0
  289. package/src/mcp/mcpTransport.ts +562 -0
  290. package/src/mcp/retryPolicy.ts +87 -0
  291. package/src/mcp/webSearch.ts +388 -0
  292. package/src/originalPrompts/design_thinking.md +178 -0
  293. package/src/originalPrompts/design_thinking_persona.md +76 -0
  294. package/src/originalPrompts/document_generator_example.md +77 -0
  295. package/src/originalPrompts/document_generator_persona.md +47 -0
  296. package/src/originalPrompts/facilitator_persona.md +125 -0
  297. package/src/originalPrompts/guardrails.md +47 -0
  298. package/src/phases/contextSummarizer.ts +154 -0
  299. package/src/phases/discoveryEnricher.ts +223 -0
  300. package/src/phases/phaseExtractors.ts +247 -0
  301. package/src/phases/phaseHandlers.ts +450 -0
  302. package/src/prompts/design.md +51 -0
  303. package/src/prompts/develop-boundary.md +51 -0
  304. package/src/prompts/develop.md +111 -0
  305. package/src/prompts/discover.md +58 -0
  306. package/src/prompts/ideate.md +56 -0
  307. package/src/prompts/plan.md +51 -0
  308. package/src/prompts/promptLoader.ts +198 -0
  309. package/src/prompts/select.md +47 -0
  310. package/src/prompts/summarize/README.md +8 -0
  311. package/src/prompts/summarize/design-summary.md +37 -0
  312. package/src/prompts/summarize/develop-summary.md +25 -0
  313. package/src/prompts/summarize/ideate-summary.md +27 -0
  314. package/src/prompts/summarize/plan-summary.md +27 -0
  315. package/src/prompts/summarize/select-summary.md +21 -0
  316. package/src/prompts/system.md +28 -0
  317. package/src/sessions/exportPaths.ts +28 -0
  318. package/src/sessions/exportWriter.ts +490 -0
  319. package/src/sessions/sessionManager.ts +119 -0
  320. package/src/sessions/sessionStore.ts +69 -0
  321. package/src/shared/activitySpinner.ts +108 -0
  322. package/src/shared/copilotClient.ts +291 -0
  323. package/src/shared/data/cards.json +1249 -0
  324. package/src/shared/data/cardsLoader.ts +70 -0
  325. package/src/shared/errorClassifier.ts +160 -0
  326. package/src/shared/events.ts +103 -0
  327. package/src/shared/markdownRenderer.ts +44 -0
  328. package/src/shared/schemas/session.ts +346 -0
  329. package/src/shared/tableRenderer.ts +28 -0
  330. package/src/types/marked-terminal.d.ts +5 -0
  331. package/src/vendor/chalk.ts +2 -0
  332. package/src/vendor/cli-table3.ts +3 -0
  333. package/src/vendor/commander.ts +2 -0
  334. package/src/vendor/marked-terminal.ts +3 -0
  335. package/src/vendor/marked.ts +2 -0
  336. package/src/vendor/ora.ts +2 -0
  337. package/src/vendor/pino.ts +3 -0
  338. package/src/vendor/zod.ts +3 -0
  339. package/tests/e2e/developE2e.spec.ts +152 -0
  340. package/tests/e2e/developFailureE2e.spec.ts +289 -0
  341. package/tests/e2e/developPty.spec.ts +86 -0
  342. package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
  343. package/tests/e2e/harness.spec.ts +104 -0
  344. package/tests/e2e/mcpLive.spec.ts +149 -0
  345. package/tests/e2e/newSession.e2e.spec.ts +245 -0
  346. package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
  347. package/tests/e2e/workiqEnrichment.spec.ts +72 -0
  348. package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
  349. package/tests/e2e/zava-assessment/company-profile.md +98 -0
  350. package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
  351. package/tests/e2e/zavaSimulation.spec.ts +511 -0
  352. package/tests/fixtures/completedSession.json +141 -0
  353. package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
  354. package/tests/fixtures/test-fixture-project/package.json +12 -0
  355. package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
  356. package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
  357. package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
  358. package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
  359. package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
  360. package/tests/integration/autoStartConversation.spec.ts +168 -0
  361. package/tests/integration/defaultCommand.spec.ts +179 -0
  362. package/tests/integration/directCommandNonTty.spec.ts +260 -0
  363. package/tests/integration/directCommandTty.spec.ts +185 -0
  364. package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
  365. package/tests/integration/exportArtifacts.spec.ts +232 -0
  366. package/tests/integration/exportFallbackFlow.spec.ts +115 -0
  367. package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
  368. package/tests/integration/mcpTransportFlow.spec.ts +178 -0
  369. package/tests/integration/newSessionFlow.spec.ts +406 -0
  370. package/tests/integration/pocGithubMcp.spec.ts +224 -0
  371. package/tests/integration/pocLocalFallback.spec.ts +205 -0
  372. package/tests/integration/pocScaffold.spec.ts +220 -0
  373. package/tests/integration/ralphLoopFlow.spec.ts +430 -0
  374. package/tests/integration/ralphLoopPartial.spec.ts +416 -0
  375. package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
  376. package/tests/integration/spinnerLifecycle.spec.ts +270 -0
  377. package/tests/integration/summarizationFlow.spec.ts +135 -0
  378. package/tests/integration/testRunnerReal.spec.ts +63 -0
  379. package/tests/integration/webSearchAgent.spec.ts +155 -0
  380. package/tests/live/copilotSdkLive.spec.ts +149 -0
  381. package/tests/live/zavaFullWorkshop.spec.ts +515 -0
  382. package/tests/setup/loadEnv.ts +5 -0
  383. package/tests/unit/cli/developCommand.spec.ts +679 -0
  384. package/tests/unit/cli/directCommands.spec.ts +325 -0
  385. package/tests/unit/cli/envLoader.spec.ts +73 -0
  386. package/tests/unit/cli/ioContext.spec.ts +148 -0
  387. package/tests/unit/cli/preflight.spec.ts +125 -0
  388. package/tests/unit/cli/statusCommand.spec.ts +134 -0
  389. package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
  390. package/tests/unit/cli/workshopCommand.spec.ts +378 -0
  391. package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
  392. package/tests/unit/develop/checkpointState.spec.ts +378 -0
  393. package/tests/unit/develop/codeGenerator.spec.ts +447 -0
  394. package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
  395. package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
  396. package/tests/unit/develop/outputValidator.spec.ts +134 -0
  397. package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
  398. package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
  399. package/tests/unit/develop/templateRegistry.spec.ts +106 -0
  400. package/tests/unit/develop/testRunner.spec.ts +294 -0
  401. package/tests/unit/infraBicep.spec.ts +116 -0
  402. package/tests/unit/infraDeploy.spec.ts +102 -0
  403. package/tests/unit/infraTeardown.spec.ts +77 -0
  404. package/tests/unit/logging/logger.spec.ts +50 -0
  405. package/tests/unit/loop/conversationLoop.spec.ts +719 -0
  406. package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
  407. package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
  408. package/tests/unit/mcp/mcpManager.spec.ts +336 -0
  409. package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
  410. package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
  411. package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
  412. package/tests/unit/mcp/webSearch.spec.ts +718 -0
  413. package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
  414. package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
  415. package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
  416. package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
  417. package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
  418. package/tests/unit/prompts/promptLoader.spec.ts +144 -0
  419. package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
  420. package/tests/unit/schemas/session.spec.ts +328 -0
  421. package/tests/unit/sessions/exportPaths.spec.ts +38 -0
  422. package/tests/unit/sessions/exportWriter.spec.ts +737 -0
  423. package/tests/unit/sessions/sessionManager.spec.ts +174 -0
  424. package/tests/unit/sessions/sessionStore.spec.ts +136 -0
  425. package/tests/unit/shared/activitySpinner.spec.ts +211 -0
  426. package/tests/unit/shared/cardsLoader.spec.ts +89 -0
  427. package/tests/unit/shared/copilotClient.spec.ts +185 -0
  428. package/tests/unit/shared/errorClassifier.spec.ts +152 -0
  429. package/tests/unit/shared/events.spec.ts +71 -0
  430. package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
  431. package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
  432. package/tests/unit/shared/tableRenderer.spec.ts +38 -0
  433. package/tsconfig.json +20 -0
  434. package/vitest.config.ts +15 -0
  435. package/vitest.live.config.ts +19 -0
@@ -0,0 +1,333 @@
+ # Tasks: Dev Resume & Hardening
+
+ **Input**: Design documents from `/specs/004-dev-resume-hardening/`
+ **Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/
+
+ **Tests**: Tests are REQUIRED for new behavior in this repository (Red → Green → Review). Include test tasks for each user story and write them first.
+
+ **Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
+
+ ## Format: `[ID] [P?] [Story] Description`
+
+ - **[P]**: Can run in parallel (different files, no dependencies)
+ - **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
+ - Include exact file paths in descriptions
+
+ ## Phase 1: Setup (Shared Infrastructure)
+
+ **Purpose**: Project initialization, shared types, and test infrastructure used by multiple stories
+
+ - [x] T001 Create `CheckpointState` interface and `deriveCheckpointState()` function in `src/develop/checkpointState.ts` per data-model.md derivation logic
+ - [x] T002 [P] Create `TemplateEntry` interface and `TemplateRegistry` type in `src/develop/templateRegistry.ts` (types only — no template content yet)
+ - [x] T003 [P] Create test fixture project in `tests/fixtures/test-fixture-project/` with `package.json`, `vitest.config.ts`, `src/add.ts`, `tests/passing.test.ts`, `tests/failing.test.ts`, and `tests/hanging.test.ts` per data-model.md TestFixtureProject spec
+ - [x] T004 Run `npm install` in `tests/fixtures/test-fixture-project/` and add `tests/fixtures/test-fixture-project/node_modules` to `.gitignore`
+
+ ---
+
+ ## Phase 2: Foundational (Blocking Prerequisites)
+
+ **Purpose**: Core changes shared across multiple user stories — MUST complete before story work
+
+ **⚠️ CRITICAL**: No user story work can begin until this phase is complete
+
+ - [x] T005 Make `extractJson()` and `buildErrorResult()` methods `protected` in `src/develop/testRunner.ts` (like `parseOutput` already is) to enable subclass testing
+ - [x] T006 Add `testCommand` optional parameter to `TestRunnerOptions` interface in `src/develop/testRunner.ts` and use it in `spawnTests()` instead of hardcoded `npm test -- --reporter=json`
+ - [x] T007 Extract `NODE_TS_VITEST_TEMPLATE` from `src/develop/pocScaffolder.ts` into `src/develop/templateRegistry.ts` as the first `TemplateEntry`, including `techStack`, `installCommand`, `testCommand`, and `matchPatterns`
+ - [x] T008 Add `selectTemplate()` function to `src/develop/templateRegistry.ts` implementing first-match-wins logic per contracts/cli.md Template Selection rules
+ - [x] T009 Add `PYTHON_PYTEST_TEMPLATE` entry to `src/develop/templateRegistry.ts` with files (`.gitignore`, `requirements.txt`, `pytest.ini`, `README.md`, `src/__init__.py`, `src/main.py`, `tests/test_main.py`, `.sofia-metadata.json`), `techStack`, `installCommand`, `testCommand`, and `matchPatterns` per data-model.md TemplateEntry table
+ - [x] T010 Update `PocScaffolder.buildContext()` in `src/develop/pocScaffolder.ts` to accept an optional `TemplateEntry` parameter and use its `techStack` instead of the hardcoded default
+ - [x] T011 Update `PocScaffolder` constructor in `src/develop/pocScaffolder.ts` to accept `TemplateEntry` (using `entry.files`) instead of raw `TemplateFile[]`, preserving backward compatibility
+
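The first-match-wins selection named in T008 and the optional `testCommand` from T006 can be illustrated with a small sketch. This is not the package's code: the field names `files`, `techStack`, `installCommand`, `testCommand`, and `matchPatterns` come from the tasks above, while the `id` field, the shape of `techStack`, the keyword-substring matching rule, the Python commands, and the helper `resolveTestCommand` are assumptions made for illustration. The actual rules live in contracts/cli.md, which is not shown in this hunk.

```typescript
// Hypothetical shapes; only the field names listed in T007/T009 are taken from the tasks.
interface TemplateFile {
  path: string;
  content: string;
}

interface TemplateEntry {
  id: string; // assumed identifier, not named in the tasks
  files: TemplateFile[];
  techStack: { language: string; testRunner: string }; // assumed structure
  installCommand: string;
  testCommand: string;
  matchPatterns: string[]; // assumed: keywords matched case-insensitively against plan text
}

type TemplateRegistry = TemplateEntry[];

const pythonTemplate: TemplateEntry = {
  id: 'python-pytest',
  files: [],
  techStack: { language: 'Python', testRunner: 'pytest' },
  installCommand: 'pip install -r requirements.txt', // assumed command
  testCommand: 'pytest', // assumed command
  matchPatterns: ['python', 'pytest'],
};

const nodeTemplate: TemplateEntry = {
  id: 'node-ts-vitest',
  files: [],
  techStack: { language: 'TypeScript', testRunner: 'Vitest' },
  installCommand: 'npm install',
  testCommand: 'npm test -- --reporter=json',
  matchPatterns: ['typescript', 'node', 'vitest'],
};

// Registry order matters: the first entry with any matching pattern wins.
const registry: TemplateRegistry = [pythonTemplate, nodeTemplate];

function selectTemplate(
  planText: string,
  entries: TemplateRegistry,
  fallback: TemplateEntry,
): TemplateEntry {
  const haystack = planText.toLowerCase();
  for (const entry of entries) {
    const matches = entry.matchPatterns.some((p) => haystack.includes(p.toLowerCase()));
    if (matches) return entry; // stop at the first match
  }
  return fallback; // nothing matched
}

// T006: an optional testCommand replaces the previously hardcoded command.
interface TestRunnerOptions {
  testCommand?: string;
}

const DEFAULT_TEST_COMMAND = 'npm test -- --reporter=json';

function resolveTestCommand(options: TestRunnerOptions = {}): string {
  return options.testCommand ?? DEFAULT_TEST_COMMAND;
}
```

With this ordering, a plan that mentions both Python and Node resolves to the Python entry, because it appears first in the registry. Passing the selected entry's `testCommand` into `resolveTestCommand` keeps the old command as the default when no template overrides it.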
+ **Checkpoint**: Foundation ready — shared types, test fixtures, and registry exist. User story implementation can begin.
+
+ ---
+
+ ## Phase 3: User Story 1 — Resume an Interrupted PoC Session (Priority: P1) 🎯 MVP
+
+ **Goal**: Running `sofia dev --session X` on an interrupted session resumes from the last completed iteration, skipping the scaffold step but still re-running `npm install`.
+
+ **Independent Test**: Interrupt after 2 iterations, re-run, verify iteration 3 starts without re-scaffolding.
+
+ **FRs covered**: FR-001, FR-001a, FR-002, FR-003, FR-004, FR-005, FR-006, FR-007, FR-007a
+
+ ### Tests for User Story 1 (REQUIRED) ⚠️
+
+ > **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
+
+ - [x] T012 [P] [US1] Unit test: `deriveCheckpointState` returns correct state for no-poc, completed, partial, interrupted sessions in `tests/unit/develop/checkpointState.spec.ts`
+ - [x] T013 [P] [US1] Unit test: `RalphLoop.run()` seeds `iterations` from `session.poc.iterations` and starts from correct `iterNum` in `tests/unit/develop/ralphLoop.spec.ts` (add describe block "resume iteration seeding")
+ - [x] T014 [P] [US1] Unit test: `RalphLoop.run()` skips scaffold when checkpoint says `canSkipScaffold=true` in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T015 [P] [US1] Unit test: `RalphLoop.run()` pops incomplete last iteration (no testResults) and re-runs it per FR-001a in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T016 [P] [US1] Unit test: `developCommand` exits with completion message when `poc.finalStatus === 'success'` per FR-005 in `tests/unit/cli/developCommand.spec.ts`
+ - [x] T017 [P] [US1] Unit test: `developCommand` defaults to resume when `poc.finalStatus === 'failed'|'partial'` per FR-006 in `tests/unit/cli/developCommand.spec.ts`
+ - [x] T018 [P] [US1] Unit test: resume re-scaffolds when output directory is missing but iterations exist per FR-007 in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T019 [US1] Integration test: full resume flow — create session with 2 completed iterations, run `RalphLoop`, verify starts at iteration 3 in `tests/integration/ralphLoopPartial.spec.ts` (add describe block "resume from interrupted session")
+ - [x] T065 [P] [US1] Unit test: resume ALWAYS re-runs dependency install step even when scaffolding is skipped per FR-003 in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T066 [P] [US1] Unit test: resume includes prior iteration history in LLM prompt context (test results + applied changes summary) per FR-004 in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T067 [P] [US1] Unit test: resume decision logging emits info-level messages for iteration number, skip scaffold, incomplete-iteration rerun, and re-run install per FR-007a in `tests/unit/develop/ralphLoop.spec.ts`
+ - [x] T068 [P] [US1] Unit test: corrupted/invalid `poc.iterations` causes safe fallback to fresh run (and warning log) per Edge Cases in `spec.md` in `tests/unit/develop/checkpointState.spec.ts`
+ - [x] T069 [P] [US1] Unit test: output directory present but `.sofia-metadata.json` integrity mismatch triggers warning and forces re-scaffold (do not skip scaffold) per Edge Cases in `spec.md` in `tests/unit/develop/checkpointState.spec.ts`
+
+ ### Implementation for User Story 1
+
+ - [x] T020 [US1] Implement `deriveCheckpointState()` logic in `src/develop/checkpointState.ts` per data-model.md derivation rules
+ - [x] T021 [US1] Update `developCommand()` in `src/cli/developCommand.ts` to call `deriveCheckpointState()` before creating RalphLoop and handle FR-005 (success exit) and FR-006 (failed/partial default resume)
+ - [x] T022 [US1] Modify `RalphLoop.run()` in `src/develop/ralphLoop.ts` to seed `iterations` from `session.poc.iterations`, derive `iterNum = iterations.length + 1`, and pop incomplete last iteration per FR-001/FR-001a
+ - [x] T023 [US1] Modify `RalphLoop.run()` in `src/develop/ralphLoop.ts` to skip scaffold when output dir + `.sofia-metadata.json` exist per FR-002, and always re-run install per FR-003
+ - [x] T024 [US1] Modify `RalphLoop.run()` in `src/develop/ralphLoop.ts` to include prior iteration history in LLM prompt context per FR-004 (include prior test results and a concise summary of applied changes across prior iterations; not just last failing tests)
78
+ - [x] T025 [US1] Modify `RalphLoop.run()` in `src/develop/ralphLoop.ts` to re-scaffold when output dir is missing but iterations exist per FR-007
79
+ - [x] T026 [US1] Add info-level resume decision logging in `src/develop/ralphLoop.ts` and `src/cli/developCommand.ts` per FR-007a (iteration number, skip scaffold, re-run install, incomplete iteration re-run)
80
+ - [x] T070 [US1] Harden `deriveCheckpointState()` in `src/develop/checkpointState.ts` to validate iteration entries (missing/invalid shapes) and safely fall back to fresh run + warning log per Edge Cases in `spec.md`
81
+ - [x] T071 [US1] Extend `deriveCheckpointState()` in `src/develop/checkpointState.ts` to validate `.sofia-metadata.json` integrity (at minimum: sessionId match; if Phase 9 adds `templateId`, validate that too) and disable `canSkipScaffold` + warn if mismatch per Edge Cases in `spec.md`
82
+
83
+ **Checkpoint**: Resume works end-to-end. `sofia dev --session X` resumes from correct iteration after interruption. All resume decisions are logged at info level.
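
The resume rules in this phase (FR-001, FR-001a, FR-002, FR-007) can be sketched as one pure function. This is a minimal illustration, not the code in `src/develop/checkpointState.ts`: the `IterationRecord`, `PocState`, and `CheckpointState` shapes and the two boolean inputs are assumptions made for the example, and only `canSkipScaffold`, `testResults`, `finalStatus`, and the `iterations.length + 1` rule come from the task list.

```typescript
// Hypothetical shapes for illustration; the real session schema may differ.
interface IterationRecord {
  iteration: number;
  testResults?: { passed: number; failed: number };
}

interface PocState {
  iterations: IterationRecord[];
  finalStatus?: 'success' | 'failed' | 'partial';
}

interface CheckpointState {
  completedIterations: IterationRecord[];
  resumeFromIteration: number;
  canSkipScaffold: boolean;
  alreadySucceeded: boolean;
}

function deriveCheckpointState(
  poc: PocState | undefined,
  outputDirExists: boolean,
  metadataValid: boolean,
): CheckpointState {
  const iterations = poc?.iterations ?? [];
  const last = iterations[iterations.length - 1];
  // FR-001a: a last iteration without test results was interrupted, so drop it and re-run it.
  const completed =
    last !== undefined && last.testResults === undefined ? iterations.slice(0, -1) : iterations;
  return {
    completedIterations: completed,
    // FR-001: the next iteration number is the count of completed iterations plus one.
    resumeFromIteration: completed.length + 1,
    // FR-002 / FR-007: skip scaffold only when prior state, output dir, and valid metadata all exist.
    canSkipScaffold: poc !== undefined && outputDirExists && metadataValid,
    // FR-005: a successful session should exit with a completion message instead of resuming.
    alreadySucceeded: poc?.finalStatus === 'success',
  };
}
```

Keeping the derivation free of file-system access (the caller passes `outputDirExists` and `metadataValid`) is one way to make the unit tests in T012, T068, and T069 cheap to write.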
84
+
85
+ ---
86
+
87
+ ## Phase 4: User Story 2 — Force-Restart a PoC Session (Priority: P1)
88
+
89
+ **Goal**: `sofia dev --session X --force` deletes output directory AND resets `session.poc` state, starting completely fresh.
90
+
91
+ **Independent Test**: Create output via `sofia dev`, then `--force`, verify both directory and `poc.iterations` reset.
92
+
93
+ **FRs covered**: FR-008, FR-009, FR-010
94
+
95
+ ### Tests for User Story 2 (REQUIRED) ⚠️
96
+
97
+ - [x] T027 [P] [US2] Unit test: `developCommand` with `--force` clears `session.poc` and calls `store.save()` before creating RalphLoop per FR-008 in `tests/unit/cli/developCommand.spec.ts`
98
+ - [x] T028 [P] [US2] Unit test: `developCommand` with `--force` on a `poc.finalStatus === 'success'` session clears status and starts fresh per FR-010 in `tests/unit/cli/developCommand.spec.ts`
99
+ - [x] T029 [P] [US2] Unit test: `developCommand` with `--force` on a session with no prior poc state behaves identically to first run in `tests/unit/cli/developCommand.spec.ts`
100
+ - [x] T030 [US2] Integration test: force-restart flow — create session with iterations, run with `--force`, verify empty iterations and fresh scaffold in `tests/integration/ralphLoopFlow.spec.ts` (add describe block "force restart")
101
+
102
+ ### Implementation for User Story 2
103
+
104
+ - [x] T031 [US2] Update `developCommand()` in `src/cli/developCommand.ts` to clear `session.poc = undefined` and call `store.save(session)` when `--force` is set per FR-008, before creating RalphLoop
105
+ - [x] T032 [US2] Ensure `--force` path logs info-level message "Cleared existing output directory and session state (--force)" in `src/cli/developCommand.ts`
106
+
107
+ **Checkpoint**: `--force` resets both output directory and session state. Works on any `finalStatus` value including `'success'`.
108
+
109
+ ---
110
+
111
+ ## Phase 5: User Story 3 — PoC Template Selection Based on Plan (Priority: P2)
112
+
113
+ **Goal**: Scaffolder auto-selects template based on plan's `architectureNotes` — Python plan gets `python-pytest`, TypeScript plan gets `node-ts-vitest`.
114
+
115
+ **Independent Test**: Session with Python/FastAPI plan generates Python project structure.
116
+
117
+ **FRs covered**: FR-011, FR-012, FR-013, FR-014, FR-015
118
+
119
+ ### Tests for User Story 3 (REQUIRED) ⚠️
120
+
121
+ - [x] T033 [P] [US3] Unit test: `selectTemplate()` returns `python-pytest` for plans mentioning "Python" or "FastAPI" in `tests/unit/develop/templateRegistry.spec.ts`
122
+ - [x] T034 [P] [US3] Unit test: `selectTemplate()` returns `node-ts-vitest` for plans mentioning "TypeScript" or with no architecture notes in `tests/unit/develop/templateRegistry.spec.ts`
123
+ - [x] T035 [P] [US3] Unit test: `selectTemplate()` returns default `node-ts-vitest` for ambiguous plans in `tests/unit/develop/templateRegistry.spec.ts`
124
+ - [x] T036 [P] [US3] Unit test: `PocScaffolder` uses `TemplateEntry.files` when constructed with a template entry in `tests/unit/develop/pocScaffolder.spec.ts`
125
+ - [x] T037 [US3] Integration test: scaffold with `python-pytest` template generates expected file structure (`requirements.txt`, `src/main.py`, `tests/test_main.py`) in `tests/integration/pocScaffold.spec.ts` (add describe block "python-pytest template")
126
+
127
+ ### Implementation for User Story 3
128
+
129
+ - [x] T038 [US3] Wire template selection into `developCommand.ts`: call `selectTemplate(registry, plan.architectureNotes, plan.dependencies)` and pass result to `PocScaffolder` and `RalphLoop`
130
+ - [x] T039 [US3] Update `RalphLoop` to use `TemplateEntry.installCommand` for dependency installation instead of hardcoded `npm install` in `src/develop/ralphLoop.ts`
131
+ - [x] T040 [US3] Update `RalphLoop` to pass `TemplateEntry.testCommand` to `TestRunner` constructor in `src/develop/ralphLoop.ts`
132
+ - [x] T041 [US3] Add info-level log "Selected template: {id} (matched '{pattern}' in architecture notes)" in `src/cli/developCommand.ts`
133
+
134
+ **Checkpoint**: Python plans produce Python scaffold. TypeScript plans preserve current behavior. Adding a new template requires only a registry entry.
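
A registry-driven selection along the lines of T033 to T035 might look like the sketch below. The `matchPatterns` field, the regular expressions, and the concrete install and test command strings are assumptions chosen for the example; only the template ids, the `installCommand` and `testCommand` field names, and the "default to `node-ts-vitest` when nothing or several things match" behaviour are taken from the tasks above.

```typescript
// Illustrative registry entry; `matchPatterns` is a hypothetical field name.
interface TemplateEntry {
  id: string;
  matchPatterns: RegExp[];
  installCommand: string;
  testCommand: string;
}

const DEFAULT_TEMPLATE_ID = 'node-ts-vitest';

const registry: TemplateEntry[] = [
  {
    id: 'python-pytest',
    matchPatterns: [/\bpython\b/i, /\bfastapi\b/i],
    installCommand: 'pip install -r requirements.txt', // assumed command
    testCommand: 'pytest', // assumed command
  },
  {
    id: 'node-ts-vitest',
    matchPatterns: [/\btypescript\b/i],
    installCommand: 'npm install',
    testCommand: 'npx vitest run', // assumed command
  },
];

function selectTemplate(
  entries: TemplateEntry[],
  architectureNotes?: string,
  dependencies: string[] = [],
): TemplateEntry {
  const haystack = [architectureNotes ?? '', ...dependencies].join('\n');
  const matches = entries.filter((entry) =>
    entry.matchPatterns.some((pattern) => pattern.test(haystack)),
  );
  // Exactly one matching template wins; no match or an ambiguous match falls back to the default.
  const [only, ...rest] = matches;
  if (only !== undefined && rest.length === 0) return only;
  const fallback = entries.find((entry) => entry.id === DEFAULT_TEMPLATE_ID);
  if (fallback === undefined) throw new Error(`Default template ${DEFAULT_TEMPLATE_ID} is not registered`);
  return fallback;
}
```

With this shape, adding a language means appending one entry to `registry`, which matches the checkpoint's claim that a new template requires only a registry entry.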
135
+
136
+ ---
137
+
138
+ ## Phase 6: User Story 4 — TestRunner Coverage Hardening (Priority: P2)
139
+
140
+ **Goal**: `testRunner.ts` test coverage increases from 45% to 80%+ via real fixture-based integration tests.
141
+
142
+ **Independent Test**: Run fixture-based tests covering spawn, parse, timeout, and malformed output.
143
+
144
+ **FRs covered**: FR-016, FR-017, FR-018, FR-019
145
+
146
+ ### Tests for User Story 4 (REQUIRED) ⚠️
147
+
148
+ > **NOTE**: These are the deliverable for this story — the tests themselves ARE the feature
149
+
150
+ - [x] T042 [P] [US4] Integration test: `testRunner.run()` against fixture project with passing tests, verify correct pass/fail/skip counts in `tests/integration/testRunnerReal.spec.ts`
151
+ - [x] T043 [P] [US4] Integration test: `testRunner.run()` against fixture project with failing tests, verify failure details parsed correctly in `tests/integration/testRunnerReal.spec.ts`
152
+ - [x] T044 [US4] Integration test: `testRunner.run()` with short timeout against hanging test fixture, verify SIGTERM→SIGKILL and timeout error result per FR-016/FR-018 in `tests/integration/testRunnerReal.spec.ts`
153
+ - [x] T045 [US4] Unit test: `extractJson()` fallback path (first-`{`-to-last-`}`) with mixed console+JSON output per FR-017 in `tests/unit/develop/testRunner.spec.ts` (use `TestableTestRunner` subclass)
154
+ - [x] T046 [US4] Unit test: `extractJson()` returns null for output with no valid JSON per FR-017 in `tests/unit/develop/testRunner.spec.ts`
155
+ - [x] T047 [US4] Unit test: `buildErrorResult()` produces correct zero-count result with error message per FR-018 in `tests/unit/develop/testRunner.spec.ts`
156
+
157
+ ### Implementation for User Story 4
158
+
159
+ - [x] T048 [US4] If any coverage gaps remain after writing the above tests, add targeted unit tests to reach 80%+ coverage for `src/develop/testRunner.ts` (run `npm test -- --coverage` to verify)
160
+
161
+ **Checkpoint**: `testRunner.ts` coverage is at or above 80%. All critical code paths (spawn, parse, timeout, fallback) have automated tests using real fixtures.
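
The first-`{`-to-last-`}` fallback exercised by T045 and T046 can be illustrated as follows. This is a sketch of the technique under the assumption that the runner first tries to parse the whole output; the actual `extractJson()` in `src/develop/testRunner.ts` may be structured differently.

```typescript
// Sketch of a JSON extraction with a brace-slicing fallback (not the package's implementation).
function extractJson(output: string): unknown {
  // Fast path: the whole output is already valid JSON.
  try {
    return JSON.parse(output.trim());
  } catch {
    // Fall through to the brace-slicing fallback.
  }
  // Fallback: take everything from the first '{' to the last '}' and try again.
  const start = output.indexOf('{');
  const end = output.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(output.slice(start, end + 1));
  } catch {
    return null;
  }
}
```

The fallback tolerates console noise before and after a single JSON report, but it still returns `null` when stray braces in the log output make the sliced span invalid.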
162
+
163
+ ---
164
+
165
+ ## Phase 7: User Story 5 — PTY-Based Interactive E2E Tests (Priority: P3)
166
+
167
+ **Goal**: PTY-based E2E tests validate Ctrl+C handling, progress output, and clean exit behavior for `sofia dev`.
168
+
169
+ **Independent Test**: Spawn `sofia dev` in PTY, send Ctrl+C, verify recovery message and exit code.
170
+
171
+ **FRs covered**: (implicit quality requirement from spec)
172
+
173
+ ### Tests for User Story 5 (REQUIRED) ⚠️
174
+
175
+ > **NOTE**: The tests ARE the deliverable — this story is test-only
176
+
177
+ - [x] T049 [P] [US5] E2E test: PTY-spawn `sofia dev`, send Ctrl+C during iteration, verify exit code 0 and recovery message in `tests/e2e/developPty.spec.ts`
178
+ - [x] T050 [P] [US5] E2E test: PTY-spawn `sofia dev`, verify iteration progress lines ("Iteration N/M") appear in PTY output buffer in `tests/e2e/developPty.spec.ts`
179
+ - [x] T051 [US5] Add PTY availability guard to `tests/e2e/developPty.spec.ts` — skip gracefully if `node-pty` allocation fails (e.g., CI without TTY)
180
+
181
+ **Checkpoint**: Interactive behaviors (Ctrl+C, progress output) are validated in CI via PTY simulation.
182
+
183
+ ---
184
+
185
+ ## Phase 8: User Story 6 — Workshop-to-Dev Transition Clarity (Priority: P3)
186
+
187
+ **Goal**: Workshop displays actionable `sofia dev --session <id>` command after Plan phase completes.
188
+
189
+ **Independent Test**: Complete Plan phase, verify output contains exact `sofia dev` command with session ID.
190
+
191
+ **FRs covered**: FR-020, FR-021
192
+
193
+ ### Tests for User Story 6 (REQUIRED) ⚠️
194
+
195
+ - [x] T052 [P] [US6] Unit test: workshop command displays "sofia dev --session {id}" after Plan phase completes per FR-020 in `tests/unit/cli/workshopCommand.spec.ts`
196
+ - [x] T053 [P] [US6] Unit test: workshop command offers auto-transition prompt in interactive mode per FR-021 in `tests/unit/cli/workshopCommand.spec.ts`
197
+
198
+ ### Implementation for User Story 6
199
+
200
+ - [x] T054 [US6] Add transition guidance message in `src/cli/workshopCommand.ts` when `getNextPhase(phase) === 'Develop'` — display exact `sofia dev --session ${session.sessionId}` command per contracts/cli.md
201
+ - [x] T055 [US6] Add interactive mode offer ("Would you like to start PoC development now?") in `src/cli/workshopCommand.ts` per FR-021 (SHOULD — use `@inquirer/prompts` confirm)
202
+
203
+ **Checkpoint**: Workshop users see clear next-step guidance including the exact command to run after Plan phase.
204
+
205
+ ---
206
+
207
+ ## Phase 9: Polish & Cross-Cutting Concerns
208
+
209
+ **Purpose**: Improvements that affect multiple user stories
210
+
211
+ **TDD note**: Complete T072–T074 (tests) before implementing T056–T058 (FR-022) to satisfy the constitution's Red → Green requirement.
212
+
213
+ - [x] T072 [P] Unit test: scaffold TODO marker scanning records `totalInitial`, `remaining`, and `markers` in `.sofia-metadata.json` per FR-022 in `tests/unit/develop/pocScaffolder.spec.ts`
214
+ - [x] T073 [P] Unit test: TODO marker rescan after an iteration updates `.sofia-metadata.json.todos.remaining` per FR-022 in `tests/unit/develop/ralphLoop.spec.ts`
215
+ - [x] T074 [P] Integration test: TODO tracking writes and updates `.sofia-metadata.json` in a real scaffold output directory per FR-022 in `tests/integration/ralphLoopFlow.spec.ts` (new describe block "todo tracking")
216
+ - [x] T075 Validation task: compare fresh vs resumed run PoC quality (test pass counts) on the same plan/session to satisfy SC-004-005; capture results in test output or quickstart notes
217
+ - [x] T076 [P] Benchmark/validation task: measure resume detection overhead (derive checkpoint + metadata checks) and ensure <500ms per SC-004-007 (can be a small integration test with timing guard or a quickstart step)
218
+ - [x] T056 [P] Extend `.sofia-metadata.json` schema in `src/develop/pocScaffolder.ts` to include `templateId` and `todos` fields per FR-022 and contracts/cli.md extended schema
219
+ - [x] T057 [P] Add TODO marker scanning logic to `src/develop/pocScaffolder.ts` — scan scaffold files at scaffold time for `TODO:` markers, record in `.sofia-metadata.json`
220
+ - [x] T058 Add TODO marker rescan after each iteration in `src/develop/ralphLoop.ts` — update `.sofia-metadata.json` with remaining TODO count per FR-022
221
+ - [x] T059 [P] Update `src/develop/index.ts` barrel export to include `checkpointState.ts` and `templateRegistry.ts`
222
+ - [x] T060 Run `npm run typecheck` and fix any type errors across all modified files
223
+ - [x] T061 Run `npm run lint` and fix any lint warnings (especially `import/order`) across all modified files
224
+ - [x] T062 Run full test suite `npm test` and verify all tests pass (no regressions)
225
+ - [x] T063 Run `npm test -- --coverage` on `src/develop/testRunner.ts` and verify coverage ≥ 80% per SC-004-004
226
+ - [x] T064 Run quickstart.md validation — execute the quick verification steps from `specs/004-dev-resume-hardening/quickstart.md`
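
The TODO tracking described in T057, T058, and T072 to T074 can be sketched with two small functions over in-memory file contents. The `TodoMarker` shape and the function names are hypothetical; `totalInitial`, `remaining`, and `markers` are the field names given in T072.

```typescript
// Hypothetical marker shape; only the TodoSummary field names come from the task list.
interface TodoMarker {
  file: string;
  line: number;
  text: string;
}

interface TodoSummary {
  totalInitial: number;
  remaining: number;
  markers: TodoMarker[];
}

// Find every line containing a `TODO:` marker in a map of path -> file content.
function scanTodoMarkers(files: Record<string, string>): TodoMarker[] {
  const markers: TodoMarker[] = [];
  for (const [file, content] of Object.entries(files)) {
    content.split('\n').forEach((text, index) => {
      if (text.includes('TODO:')) markers.push({ file, line: index + 1, text: text.trim() });
    });
  }
  return markers;
}

// At scaffold time, the initial and remaining counts are the same.
function initialTodoSummary(files: Record<string, string>): TodoSummary {
  const markers = scanTodoMarkers(files);
  return { totalInitial: markers.length, remaining: markers.length, markers };
}

// After an iteration, keep the initial count and refresh the remaining markers.
function rescanTodoSummary(previous: TodoSummary, files: Record<string, string>): TodoSummary {
  const markers = scanTodoMarkers(files);
  return { totalInitial: previous.totalInitial, remaining: markers.length, markers };
}
```

A caller would read the scaffold files from disk, pass them in, and write the returned summary into the `todos` field of `.sofia-metadata.json`.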
227
+
228
+ ---
229
+
230
+ ## Dependencies & Execution Order
231
+
232
+ ### Phase Dependencies
233
+
234
+ - **Setup (Phase 1)**: No dependencies — can start immediately
235
+ - **Foundational (Phase 2)**: Depends on Setup completion — BLOCKS all user stories
236
+ - **User Story 1 (Phase 3)**: Depends on Foundational (Phase 2) — P1, MVP
237
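+ - **User Story 2 (Phase 4)**: Depends on Foundational (Phase 2); P1, can run in parallel with US1 (US2 changes are confined to `developCommand.ts`, which US1 also touches, so coordinate edits there; US1 otherwise works in `ralphLoop.ts` and `checkpointState.ts`)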
238
+ - **User Story 3 (Phase 5)**: Depends on Foundational (Phase 2) — P2, uses `templateRegistry.ts` from Phase 2
239
+ - **User Story 4 (Phase 6)**: Depends on Phase 2 T005 only (protected methods + test fixture) — P2, independent of all other stories
240
+ - **User Story 5 (Phase 7)**: Depends on US1 completion (resume behavior must work for Ctrl+C test) — P3
241
+ - **User Story 6 (Phase 8)**: Depends on Foundational only — P3, independent of other stories
242
+ - **Polish (Phase 9)**: Depends on all desired user stories being complete
243
+
244
+ ### User Story Dependencies
245
+
246
+ - **User Story 1 (P1)**: Can start after Phase 2. No dependencies on other stories. 🎯 **MVP target**
247
+ - **User Story 2 (P1)**: Can start after Phase 2. Shares `developCommand.ts` with US1 — coordinate edits but independently testable
248
+ - **User Story 3 (P2)**: Can start after Phase 2. Uses registry from Phase 2. Independent of US1/US2
249
+ - **User Story 4 (P2)**: Can start after T005 (protected methods). Fully independent — test-only story
250
+ - **User Story 5 (P3)**: Needs US1 resume working. Tests resume+Ctrl+C interaction
251
+ - **User Story 6 (P3)**: After Phase 2 only. Fully independent — workshop command changes
252
+
253
+ ### Within Each User Story
254
+
255
+ - Tests MUST be written and FAIL before implementation
256
+ - Implementation follows test order (entity → service → endpoint)
257
+ - Story complete before moving to next priority
258
+
259
+ ### Parallel Opportunities
260
+
261
+ - T001, T002, T003 can run in parallel (different files)
262
+ - T005, T006, T007, T008, T009, T010, T011: some can run in parallel (T005 touches a different file from T007-T011)
263
+ - T012-T018 (US1 tests) can all run in parallel (test file additions)
264
+ - T027-T029 (US2 tests) can run in parallel
265
+ - T033-T036 (US3 tests) can run in parallel
266
+ - T042-T047 (US4 tests) can run in parallel (different test files)
267
+ - US4 (Phase 6) and US6 (Phase 8) can run in parallel with US1/US2/US3 after Phase 2
268
+
269
+ ---
270
+
271
+ ## Parallel Example: User Story 1
272
+
273
+ ```bash
274
+ # Launch all tests for US1 together (different test files/blocks):
275
+ T012: "Unit test: deriveCheckpointState in tests/unit/develop/checkpointState.spec.ts"
276
+ T013: "Unit test: RalphLoop resume seeding in tests/unit/develop/ralphLoop.spec.ts"
277
+ T014: "Unit test: RalphLoop skip scaffold in tests/unit/develop/ralphLoop.spec.ts"
278
+ T015: "Unit test: RalphLoop pop incomplete in tests/unit/develop/ralphLoop.spec.ts"
279
+ T016: "Unit test: developCommand success exit in tests/unit/cli/developCommand.spec.ts"
280
+ T017: "Unit test: developCommand resume default in tests/unit/cli/developCommand.spec.ts"
281
+ T018: "Unit test: resuming re-scaffolds on missing dir in tests/unit/develop/ralphLoop.spec.ts"
282
+
283
+ # Then sequential implementation (shared files):
284
+ T020 → T021 → T022 → T023 → T024 → T025 → T026
285
+ ```
286
+
287
+ ---
288
+
289
+ ## Implementation Strategy
290
+
291
+ ### MVP First (User Story 1 Only)
292
+
293
+ 1. Complete Phase 1: Setup (T001-T004)
294
+ 2. Complete Phase 2: Foundational (T005-T011)
295
+ 3. Complete Phase 3: User Story 1, Resume (T012-T026 and T065-T071)
296
+ 4. **STOP and VALIDATE**: Test resume independently
297
+ 5. All 583+ existing tests still pass + new resume tests green
298
+
299
+ ### Incremental Delivery
300
+
301
+ 1. Setup + Foundational → Foundation ready
302
+ 2. ✅ User Story 1 (Resume) → Test independently → **MVP!** (core usability fix)
303
+ 3. ✅ User Story 2 (Force) → Test independently → Resume + Force both work
304
+ 4. ✅ User Story 3 (Templates) → Test independently → Multi-language scaffold
305
+ 5. ✅ User Story 4 (TestRunner) → Coverage verified → Quality gate met
306
+ 6. ✅ User Story 5 (PTY E2E) → Interactive validation in CI
307
+ 7. ✅ User Story 6 (Transition) → Full workshop→dev UX
308
+ 8. Polish → Ship
309
+
310
+ ### Parallel Team Strategy
311
+
312
+ With multiple developers:
313
+
314
+ 1. Team completes Setup + Foundational together
315
+ 2. Once Foundational is done:
316
+ - Developer A: US1 (Resume) + US2 (Force) — related, same area
317
+ - Developer B: US3 (Templates) — independent area
318
+ - Developer C: US4 (TestRunner coverage) — fully independent
319
+ - Developer D: US6 (Workshop transition) — independent area
320
+ 3. After US1: Developer A picks up US5 (PTY E2E, needs resume)
321
+
322
+ ---
323
+
324
+ ## Notes
325
+
326
+ - [P] tasks = different files, no dependencies
327
+ - [Story] label maps task to specific user story for traceability
328
+ - Each user story should be independently completable and testable
329
+ - Verify tests fail before implementing
330
+ - Commit after each task or logical group
331
+ - Stop at any checkpoint to validate story independently
332
+ - `maxIterations` counts total iterations (not additional from resume) — e.g., 10 max with 3 done → runs 4-10
333
+ - Existing session schema supports resume as-is — no migration needed
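
The `maxIterations` note above can be made concrete with a tiny helper. The function name is hypothetical; it only illustrates that the cap counts total iterations, so a resumed run gets the remainder rather than a fresh budget.

```typescript
// Hypothetical helper: which iteration numbers a resumed run will execute.
function remainingIterations(maxIterations: number, completed: number): number[] {
  const result: number[] = [];
  for (let n = completed + 1; n <= maxIterations; n++) {
    result.push(n);
  }
  return result;
}

// 10 max with 3 done: iterations 4 through 10 remain (7 iterations).
const remaining = remainingIterations(10, 3);
```

If the session already reached the cap, the list is empty and no further iterations run.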
@@ -0,0 +1,39 @@
1
+ # Specification Quality Checklist: AI Foundry Search Service Deployment
2
+
3
+ **Purpose**: Validate specification completeness and quality before proceeding to planning
4
+ **Created**: 2026-03-01
5
+ **Feature**: [spec.md](../spec.md)
6
+
7
+ ## Content Quality
8
+
9
+ - [x] No implementation details (languages, frameworks, APIs)
10
+ - [x] Focused on user value and business needs
11
+ - [x] Written for non-technical stakeholders
12
+ - [x] All mandatory sections completed
13
+
14
+ ## Requirement Completeness
15
+
16
+ - [x] No [NEEDS CLARIFICATION] markers remain
17
+ - [x] Requirements are testable and unambiguous
18
+ - [x] Success criteria are measurable
19
+ - [x] Success criteria are technology-agnostic (no implementation details)
20
+ - [x] All acceptance scenarios are defined
21
+ - [x] Edge cases are identified
22
+ - [x] Scope is clearly bounded
23
+ - [x] Dependencies and assumptions identified
24
+
25
+ ## Feature Readiness
26
+
27
+ - [x] All functional requirements have clear acceptance criteria
28
+ - [x] User scenarios cover primary flows
29
+ - [x] Feature meets measurable outcomes defined in Success Criteria
30
+ - [x] No implementation details leak into specification
31
+
32
+ ## Notes
33
+
34
+ - All items pass validation. Spec is ready for `/speckit.clarify` or `/speckit.plan`.
35
+ - The Assumptions section documents that basic agent setup is chosen over standard setup, scoping the feature to workshop/PoC complexity levels.
36
+ - FR-009 references the `web_search_preview` tool type name as defined in the Azure AI Foundry Agent Service documentation — this is a capability name, not an implementation detail.
37
+ - Key Entities mention "GPT-4o" as an example model; the actual model choice is parameterized per FR-004.
38
+ - Authentication uses Azure Identity (the user's `az login` credentials) rather than a separate API key, per the Foundry Agent Service SDK pattern documented at https://learn.microsoft.com/en-us/azure/foundry/agents/how-to/tools/web-search?pivots=typescript.
39
+ - Environment variables align with Foundry conventions: `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME` (replacing the previous `SOFIA_FOUNDRY_AGENT_ENDPOINT` / `SOFIA_FOUNDRY_AGENT_KEY` pattern).
@@ -0,0 +1,241 @@
1
+ # Contract: `web.search` Copilot SDK Tool
2
+
3
+ **Feature**: 005-ai-search-deploy
4
+ **Date**: 2026-03-01
5
+ **Interface type**: Copilot SDK custom tool (registered via `ToolDefinition`)
6
+
7
+ ## Overview
8
+
9
+ The `web.search` tool is exposed to the LLM through the GitHub Copilot SDK's tool registration system. When invoked, it queries the Azure AI Foundry Agent Service (with `web_search_preview` enabled) and returns structured results with URL citations.
10
+
11
+ ## Tool Definition
12
+
13
+ ```typescript
14
+ const WEB_SEARCH_TOOL_DEFINITION: ToolDefinition = {
15
+ name: 'web.search',
16
+ description:
17
+ 'Search the web for information about companies, industries, technologies, and trends. ' +
18
+ 'Returns structured results with title, URL, and snippet.',
19
+ parameters: {
20
+ type: 'object',
21
+ properties: {
22
+ query: {
23
+ type: 'string',
24
+ description: 'The search query string.',
25
+ },
26
+ },
27
+ required: ['query'],
28
+ },
29
+ };
30
+ ```
31
+
32
+ **Contract stability**: The tool name (`web.search`), parameter schema, and return format are stable contracts referenced by multiple prompts (`discover.md`, `develop.md`) and the `mcpContextEnricher.ts` module. Changes require updating all consumers.
33
+
34
+ ## Input
35
+
36
+ | Parameter | Type | Required | Description |
37
+ | --------- | ------ | -------- | ------------------------------------------------------------------- |
38
+ | `query` | string | yes | Web search query (e.g., "Contoso Ltd competitors in healthcare AI") |
39
+
40
+ ## Output
41
+
42
+ ### Success response
43
+
44
+ ```typescript
45
+ interface WebSearchResult {
46
+ results: WebSearchResultItem[];
47
+ sources?: string[]; // Deduplicated citation URLs
48
+ degraded?: false;
49
+ }
50
+
51
+ interface WebSearchResultItem {
52
+ title: string; // Page title from citation
53
+ url: string; // Source URL (from url_citation annotation)
54
+ snippet: string; // Relevant text excerpt
55
+ }
56
+ ```
57
+
58
+ **Example**:
59
+
60
+ ```json
61
+ {
62
+ "results": [
63
+ {
64
+ "title": "Contoso Ltd - Healthcare AI Solutions",
65
+ "url": "https://contoso.com/about",
66
+ "snippet": "Contoso Ltd is a leading provider of AI-powered healthcare solutions..."
67
+ },
68
+ {
69
+ "title": "Top Healthcare AI Companies 2026 - TechReview",
70
+ "url": "https://techreview.com/healthcare-ai-2026",
71
+ "snippet": "The healthcare AI market is dominated by... Contoso ranks #3..."
72
+ }
73
+ ],
74
+ "sources": ["https://contoso.com/about", "https://techreview.com/healthcare-ai-2026"]
75
+ }
76
+ ```
77
+
78
+ ### Degraded response
79
+
80
+ Returned when the Foundry Agent Service is unavailable, misconfigured, or returns an error. The workshop continues without web search capabilities.
81
+
82
+ ```json
83
+ {
84
+ "results": [],
85
+ "degraded": true,
86
+ "error": "Foundry agent returned 401 Unauthorized — run `az login` to refresh credentials"
87
+ }
88
+ ```
89
+
90
+ ### Degradation scenarios
91
+
92
+ | Condition | `degraded` | `error` message |
93
+ | ---------------------------------- | ---------- | -------------------------------------------------------------------------------------------- |
94
+ | `FOUNDRY_PROJECT_ENDPOINT` not set | `true` | "Web search not configured — set FOUNDRY_PROJECT_ENDPOINT and FOUNDRY_MODEL_DEPLOYMENT_NAME" |
95
+ | `DefaultAzureCredential` fails | `true` | "Azure authentication failed — run `az login`" |
96
+ | Agent creation fails | `true` | "Failed to create web search agent: {details}" |
97
+ | Query returns error | `true` | "Web search query failed: {status} {message}" |
98
+ | Network error | `true` | "Network error: {message}" |
99
+ | Rate limited (429) | `true` | "Web search rate limited — retry in {seconds}s" |
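
One way to keep the degradation table and the code in sync is to model each row as a variant of a discriminated union and build the degraded response from it. The `SearchFailure` type, its `kind` labels, and the function names below are assumptions made for this sketch; the message texts are copied from the table.

```typescript
// Degraded response shape from the contract; the failure union is hypothetical.
interface WebSearchDegraded {
  results: [];
  degraded: true;
  error: string;
}

type SearchFailure =
  | { kind: 'not-configured' }
  | { kind: 'auth' }
  | { kind: 'agent-creation'; details: string }
  | { kind: 'query'; status: number; message: string }
  | { kind: 'network'; message: string }
  | { kind: 'rate-limited'; retryAfterSeconds: number };

const DASH = '\u2014'; // the dash character used in the contract's messages

function degradedMessage(failure: SearchFailure): string {
  switch (failure.kind) {
    case 'not-configured':
      return 'Web search not configured ' + DASH + ' set FOUNDRY_PROJECT_ENDPOINT and FOUNDRY_MODEL_DEPLOYMENT_NAME';
    case 'auth':
      return 'Azure authentication failed ' + DASH + ' run `az login`';
    case 'agent-creation':
      return 'Failed to create web search agent: ' + failure.details;
    case 'query':
      return 'Web search query failed: ' + failure.status + ' ' + failure.message;
    case 'network':
      return 'Network error: ' + failure.message;
    case 'rate-limited':
      return 'Web search rate limited ' + DASH + ' retry in ' + failure.retryAfterSeconds + 's';
  }
}

// Every failure maps to the same degraded shape: empty results plus an actionable error.
function toDegraded(failure: SearchFailure): WebSearchDegraded {
  return { results: [], degraded: true, error: degradedMessage(failure) };
}
```

Because the `switch` is exhaustive over `kind`, adding a new degradation scenario to the union without a matching message becomes a compile-time error under strict TypeScript settings.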
100
+
101
+ ## Integration Points
102
+
103
+ ### Consuming prompts
104
+
105
+ - [src/prompts/discover.md](../../src/prompts/discover.md): `**web.search**: Research the user's industry, competitors, and trends`
106
+ - [src/prompts/develop.md](../../src/prompts/develop.md): `**web.search** — Use when stuck on an implementation pattern`
107
+
108
+ ### Consuming code
109
+
110
+ - `src/develop/mcpContextEnricher.ts`: Calls `isWebSearchConfigured()` to conditionally query web search when stuck for 2+ iterations
111
+ - `src/cli/preflight.ts` (new): Legacy env var detection check (FR-016)
112
+
113
+ ## Configuration Contract
114
+
115
+ ### Required environment variables
116
+
117
+ | Variable | Example | Description |
118
+ | ------------------------------- | ------------------------------------------------------------------------------- | ---------------------------------------- |
119
+ | `FOUNDRY_PROJECT_ENDPOINT` | `https://sofia-foundry-abc123.services.ai.azure.com/api/projects/sofia-project` | Foundry project endpoint URL |
120
+ | `FOUNDRY_MODEL_DEPLOYMENT_NAME` | `gpt-4.1-mini` | Model deployment name for agent creation |
121
+
122
+ ### Authentication
123
+
124
+ Uses `DefaultAzureCredential`; no API key environment variables are needed. The user must be logged in via `az login` (local development) or have a managed identity (Azure-hosted).
125
+
126
+ ### Legacy env var rejection (FR-016)
127
+
128
+ If either `SOFIA_FOUNDRY_AGENT_ENDPOINT` or `SOFIA_FOUNDRY_AGENT_KEY` is set:
129
+
130
+ - Preflight check fails with `required: true`
131
+ - Error message: `"Legacy web search env vars detected. Migrate: replace SOFIA_FOUNDRY_AGENT_ENDPOINT with FOUNDRY_PROJECT_ENDPOINT and remove SOFIA_FOUNDRY_AGENT_KEY (API key auth is no longer used). See docs/environment.md"`
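
The legacy-variable rejection can be sketched as a pure check over an environment map. The `PreflightCheckResult` shape, the check name, and the function name are hypothetical, and treating an empty string as "not set" is an assumption; the two variable names, `required: true`, and the error message come from this contract.

```typescript
// Hypothetical result shape; the real preflight types in src/cli/preflight.ts may differ.
interface PreflightCheckResult {
  name: string;
  passed: boolean;
  required: boolean;
  message?: string;
}

const LEGACY_WEB_SEARCH_VARS = ['SOFIA_FOUNDRY_AGENT_ENDPOINT', 'SOFIA_FOUNDRY_AGENT_KEY'];

const LEGACY_ENV_MESSAGE =
  'Legacy web search env vars detected. Migrate: replace SOFIA_FOUNDRY_AGENT_ENDPOINT with ' +
  'FOUNDRY_PROJECT_ENDPOINT and remove SOFIA_FOUNDRY_AGENT_KEY (API key auth is no longer used). ' +
  'See docs/environment.md';

function checkLegacyWebSearchEnv(env: Record<string, string | undefined>): PreflightCheckResult {
  // Assumption: an empty string counts as "not set".
  const found = LEGACY_WEB_SEARCH_VARS.filter((name) => {
    const value = env[name];
    return value !== undefined && value !== '';
  });
  if (found.length === 0) {
    return { name: 'legacy-web-search-env', passed: true, required: true };
  }
  return { name: 'legacy-web-search-env', passed: false, required: true, message: LEGACY_ENV_MESSAGE };
}
```

In a Node.js CLI the caller would typically pass `process.env`; taking the map as a parameter keeps the check testable without mutating the real environment.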
132
+
133
+ ## Lifecycle Contract
134
+
135
+ ### Initialization (lazy)
136
+
137
+ ```
138
+ Session starts → web.search NOT invoked → no agent created (zero overhead)
139
+ Session starts → web.search invoked → agent + conversation created → reused for session
140
+ ```
141
+
142
+ ### Cleanup
143
+
144
+ ```
145
+ Session ends → destroyWebSearchSession() called → conversation deleted → agent version deleted
146
+ Process exit → process.beforeExit handler → same cleanup
147
+ Cleanup fails → warning logged → no throw (stale agent must be cleaned up manually)
148
+ ```
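
The lazy initialization and non-throwing cleanup described above can be sketched independently of the Foundry SDK. Everything here is illustrative: `AgentSession`, `createLazySession`, and the injected `factory` and `warn` callbacks are hypothetical stand-ins, not part of the package's public API or of any Azure library.

```typescript
// Hypothetical stand-in for the agent + conversation pair created on first use.
interface AgentSession {
  query(q: string): Promise<string>;
  dispose(): Promise<void>;
}

function createLazySession(
  factory: () => Promise<AgentSession>,
  warn: (message: string) => void,
) {
  // Stays undefined until the first search, so unused sessions cost nothing.
  let pending: Promise<AgentSession> | undefined;

  return {
    async search(q: string): Promise<string> {
      // Create the session on first use and reuse it for later calls.
      pending ??= factory();
      const session = await pending;
      return session.query(q);
    },
    async destroy(): Promise<void> {
      if (pending === undefined) return; // never initialized: nothing to clean up
      const current = pending;
      pending = undefined;
      try {
        await (await current).dispose();
      } catch (err) {
        // Cleanup failures are logged, never thrown.
        warn(`Web search cleanup failed: ${String(err)}`);
      }
    },
  };
}
```

Storing the promise rather than the resolved session means two concurrent first calls share one creation attempt instead of racing to create two agents.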
149
+
150
+ ### Public API
151
+
152
+ ```typescript
153
+ // Check if web search can be used (env vars present)
154
+ function isWebSearchConfigured(): boolean;
155
+
156
+ // Create the tool handler function
157
+ function createWebSearchTool(config: WebSearchConfig): (query: string) => Promise<WebSearchResult>;
158
+
159
+ // Explicitly clean up the ephemeral agent and conversation
160
+ function destroyWebSearchSession(): Promise<void>;
161
+ ```
162
+
163
+ ---
164
+
165
+ # Contract: Deployment Script CLI
166
+
167
+ **Interface type**: Shell script (Bash)
168
+
169
+ ## `deploy.sh`
170
+
171
+ ### Usage
172
+
173
+ ```bash
174
+ ./infra/deploy.sh \
175
+ --resource-group <resource-group-name> \
176
+ [--subscription <subscription-id>] \
177
+ [--location <azure-region>] \
178
+ [--account-name <foundry-account-name>] \
179
+ [--model <model-deployment-name>]
180
+ ```
181
+
182
+ ### Parameters
183
+
184
+ | Flag | Required | Default | Description |
185
+ | ------------------------ | -------- | --------------------------- | ---------------------------------------- |
186
+ | `--resource-group`, `-g` | yes | — | Resource group name (created if missing) |
187
+ | `--subscription`, `-s` | no | current az CLI subscription | Azure subscription ID |
188
+ | `--location`, `-l` | no | `swedencentral` | Azure region |
189
+ | `--account-name`, `-n` | no | `sofia-foundry` | Foundry account name |
190
+ | `--model`, `-m` | no | `gpt-4.1-mini` | Model deployment name |
191
+
192
+ ### Exit codes
193
+
194
+ | Code | Meaning |
195
+ | ---- | ------------------------------------------------------- |
196
+ | 0 | Deployment succeeded |
197
+ | 1 | Prerequisite check failed (az CLI, login, subscription) |
198
+ | 2 | Deployment failed (Bicep error) |
199
+
200
+ ### Output (stdout on success)
201
+
202
+ The script writes `FOUNDRY_PROJECT_ENDPOINT` and `FOUNDRY_MODEL_DEPLOYMENT_NAME` to a `.env` file in the workspace root (creating or updating it), then prints:
203
+
204
+ ```
205
+ ✅ Deployment complete!
206
+
207
+ Environment variables written to /path/to/workspace/.env:
208
+
209
+ FOUNDRY_PROJECT_ENDPOINT="https://sofia-foundry-abc123.services.ai.azure.com/api/projects/sofia-project"
210
+ FOUNDRY_MODEL_DEPLOYMENT_NAME="gpt-4.1-mini"
211
+
212
+ To tear down: ./infra/teardown.sh --resource-group <resource-group-name>
213
+ ```
214
+
215
+ ## `teardown.sh`
216
+
217
+ ### Usage
218
+
219
+ ```bash
220
+ ./infra/teardown.sh --resource-group <resource-group-name>
221
+ ```
222
+
223
+ ### Parameters
224
+
225
+ | Flag | Required | Default | Description |
226
+ | ------------------------ | -------- | ------- | ------------------------ |
227
+ | `--resource-group`, `-g` | yes | — | Resource group to delete |
228
+
229
+ ### Exit codes
230
+
231
+ | Code | Meaning |
232
+ | ---- | -------------------------------------------------- |
233
+ | 0 | Teardown succeeded or resource group doesn't exist |
234
+ | 1 | Prerequisite check failed |
235
+ | 2 | Deletion failed |
236
+
237
+ ### Behavior
238
+
239
+ - If the resource group doesn't exist: prints an informational message, exits 0
240
+ - Prompts for confirmation before deletion (unless the `--yes` flag is passed)
241
+ - Uses `az group delete --yes --no-wait` for non-blocking deletion