sofia-cli 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (435)
  1. package/.github/agents/copilot-instructions.md +39 -0
  2. package/.github/agents/speckit.analyze.agent.md +184 -0
  3. package/.github/agents/speckit.checklist.agent.md +294 -0
  4. package/.github/agents/speckit.clarify.agent.md +181 -0
  5. package/.github/agents/speckit.constitution.agent.md +84 -0
  6. package/.github/agents/speckit.implement.agent.md +135 -0
  7. package/.github/agents/speckit.plan.agent.md +90 -0
  8. package/.github/agents/speckit.specify.agent.md +258 -0
  9. package/.github/agents/speckit.tasks.agent.md +137 -0
  10. package/.github/agents/speckit.taskstoissues.agent.md +30 -0
  11. package/.github/copilot-instructions.md +257 -0
  12. package/.github/prompts/speckit.analyze.prompt.md +3 -0
  13. package/.github/prompts/speckit.checklist.prompt.md +3 -0
  14. package/.github/prompts/speckit.clarify.prompt.md +3 -0
  15. package/.github/prompts/speckit.constitution.prompt.md +3 -0
  16. package/.github/prompts/speckit.implement.prompt.md +3 -0
  17. package/.github/prompts/speckit.plan.prompt.md +3 -0
  18. package/.github/prompts/speckit.specify.prompt.md +3 -0
  19. package/.github/prompts/speckit.tasks.prompt.md +3 -0
  20. package/.github/prompts/speckit.taskstoissues.prompt.md +3 -0
  21. package/.github/workflows/ci.yml +38 -0
  22. package/.prettierrc +6 -0
  23. package/.specify/memory/constitution.md +181 -0
  24. package/.specify/scripts/bash/check-prerequisites.sh +166 -0
  25. package/.specify/scripts/bash/common.sh +156 -0
  26. package/.specify/scripts/bash/create-new-feature.sh +297 -0
  27. package/.specify/scripts/bash/setup-plan.sh +61 -0
  28. package/.specify/scripts/bash/update-agent-context.sh +810 -0
  29. package/.specify/templates/agent-file-template.md +28 -0
  30. package/.specify/templates/checklist-template.md +40 -0
  31. package/.specify/templates/constitution-template.md +50 -0
  32. package/.specify/templates/plan-template.md +113 -0
  33. package/.specify/templates/spec-template.md +115 -0
  34. package/.specify/templates/tasks-template.md +251 -0
  35. package/.vscode/mcp.json +42 -0
  36. package/.vscode/settings.json +19 -0
  37. package/CODE_OF_CONDUCT.md +128 -0
  38. package/LICENSE +21 -0
  39. package/README.md +213 -0
  40. package/dist/src/cli/developCommand.js +240 -0
  41. package/dist/src/cli/directCommands.js +143 -0
  42. package/dist/src/cli/envLoader.js +16 -0
  43. package/dist/src/cli/exportCommand.js +53 -0
  44. package/dist/src/cli/index.js +203 -0
  45. package/dist/src/cli/ioContext.js +109 -0
  46. package/dist/src/cli/preflight.js +57 -0
  47. package/dist/src/cli/statusCommand.js +110 -0
  48. package/dist/src/cli/workshopCommand.js +400 -0
  49. package/dist/src/develop/checkpointState.js +86 -0
  50. package/dist/src/develop/codeGenerator.js +319 -0
  51. package/dist/src/develop/dynamicScaffolder.js +226 -0
  52. package/dist/src/develop/githubMcpAdapter.js +122 -0
  53. package/dist/src/develop/index.js +15 -0
  54. package/dist/src/develop/mcpContextEnricher.js +195 -0
  55. package/dist/src/develop/pocScaffolder.js +542 -0
  56. package/dist/src/develop/ralphLoop.js +659 -0
  57. package/dist/src/develop/templateRegistry.js +364 -0
  58. package/dist/src/develop/testRunner.js +202 -0
  59. package/dist/src/logging/logger.js +58 -0
  60. package/dist/src/loop/conversationLoop.js +227 -0
  61. package/dist/src/loop/phaseSummarizer.js +87 -0
  62. package/dist/src/mcp/mcpManager.js +267 -0
  63. package/dist/src/mcp/mcpTransport.js +391 -0
  64. package/dist/src/mcp/retryPolicy.js +47 -0
  65. package/dist/src/mcp/webSearch.js +254 -0
  66. package/dist/src/phases/contextSummarizer.js +101 -0
  67. package/dist/src/phases/discoveryEnricher.js +156 -0
  68. package/dist/src/phases/phaseExtractors.js +222 -0
  69. package/dist/src/phases/phaseHandlers.js +328 -0
  70. package/dist/src/prompts/design.md +51 -0
  71. package/dist/src/prompts/develop-boundary.md +51 -0
  72. package/dist/src/prompts/develop.md +111 -0
  73. package/dist/src/prompts/discover.md +58 -0
  74. package/dist/src/prompts/ideate.md +56 -0
  75. package/dist/src/prompts/plan.md +51 -0
  76. package/dist/src/prompts/promptLoader.js +167 -0
  77. package/dist/src/prompts/promptLoader.ts +198 -0
  78. package/dist/src/prompts/select.md +47 -0
  79. package/dist/src/prompts/summarize/README.md +8 -0
  80. package/dist/src/prompts/summarize/design-summary.md +37 -0
  81. package/dist/src/prompts/summarize/develop-summary.md +25 -0
  82. package/dist/src/prompts/summarize/ideate-summary.md +27 -0
  83. package/dist/src/prompts/summarize/plan-summary.md +27 -0
  84. package/dist/src/prompts/summarize/select-summary.md +21 -0
  85. package/dist/src/prompts/system.md +28 -0
  86. package/dist/src/sessions/exportPaths.js +22 -0
  87. package/dist/src/sessions/exportWriter.js +406 -0
  88. package/dist/src/sessions/sessionManager.js +81 -0
  89. package/dist/src/sessions/sessionStore.js +65 -0
  90. package/dist/src/shared/activitySpinner.js +91 -0
  91. package/dist/src/shared/copilotClient.js +129 -0
  92. package/dist/src/shared/data/cards.json +1249 -0
  93. package/dist/src/shared/data/cardsLoader.js +51 -0
  94. package/dist/src/shared/errorClassifier.js +120 -0
  95. package/dist/src/shared/events.js +28 -0
  96. package/dist/src/shared/markdownRenderer.js +34 -0
  97. package/dist/src/shared/schemas/session.js +265 -0
  98. package/dist/src/shared/tableRenderer.js +20 -0
  99. package/dist/src/vendor/chalk.js +2 -0
  100. package/dist/src/vendor/cli-table3.js +3 -0
  101. package/dist/src/vendor/commander.js +2 -0
  102. package/dist/src/vendor/marked-terminal.js +3 -0
  103. package/dist/src/vendor/marked.js +2 -0
  104. package/dist/src/vendor/ora.js +2 -0
  105. package/dist/src/vendor/pino.js +2 -0
  106. package/dist/src/vendor/zod.js +2 -0
  107. package/dist/tests/e2e/developE2e.spec.js +126 -0
  108. package/dist/tests/e2e/developFailureE2e.spec.js +247 -0
  109. package/dist/tests/e2e/developPty.spec.js +75 -0
  110. package/dist/tests/e2e/discoveryWebSearchRelevance.spec.js +84 -0
  111. package/dist/tests/e2e/harness.spec.js +83 -0
  112. package/dist/tests/e2e/mcpLive.spec.js +120 -0
  113. package/dist/tests/e2e/newSession.e2e.spec.js +177 -0
  114. package/dist/tests/e2e/ralphLoopEnrichmentComparison.spec.js +62 -0
  115. package/dist/tests/e2e/workiqEnrichment.spec.js +56 -0
  116. package/dist/tests/e2e/zavaSimulation.spec.js +452 -0
  117. package/dist/tests/fixtures/test-fixture-project/src/add.js +3 -0
  118. package/dist/tests/fixtures/test-fixture-project/tests/failing.test.js +6 -0
  119. package/dist/tests/fixtures/test-fixture-project/tests/hanging.test.js +8 -0
  120. package/dist/tests/fixtures/test-fixture-project/tests/passing.test.js +10 -0
  121. package/dist/tests/fixtures/test-fixture-project/vitest.config.js +6 -0
  122. package/dist/tests/integration/autoStartConversation.spec.js +138 -0
  123. package/dist/tests/integration/defaultCommand.spec.js +147 -0
  124. package/dist/tests/integration/directCommandNonTty.spec.js +224 -0
  125. package/dist/tests/integration/directCommandTty.spec.js +151 -0
  126. package/dist/tests/integration/discoveryEnrichmentFlow.spec.js +175 -0
  127. package/dist/tests/integration/exportArtifacts.spec.js +202 -0
  128. package/dist/tests/integration/exportFallbackFlow.spec.js +99 -0
  129. package/dist/tests/integration/mcpDegradationFlow.spec.js +190 -0
  130. package/dist/tests/integration/mcpTransportFlow.spec.js +139 -0
  131. package/dist/tests/integration/newSessionFlow.spec.js +343 -0
  132. package/dist/tests/integration/pocGithubMcp.spec.js +186 -0
  133. package/dist/tests/integration/pocLocalFallback.spec.js +171 -0
  134. package/dist/tests/integration/pocScaffold.spec.js +163 -0
  135. package/dist/tests/integration/ralphLoopFlow.spec.js +359 -0
  136. package/dist/tests/integration/ralphLoopPartial.spec.js +368 -0
  137. package/dist/tests/integration/resumeAndBacktrack.spec.js +247 -0
  138. package/dist/tests/integration/spinnerLifecycle.spec.js +220 -0
  139. package/dist/tests/integration/summarizationFlow.spec.js +115 -0
  140. package/dist/tests/integration/testRunnerReal.spec.js +52 -0
  141. package/dist/tests/integration/webSearchAgent.spec.js +128 -0
  142. package/dist/tests/live/copilotSdkLive.spec.js +107 -0
  143. package/dist/tests/live/zavaFullWorkshop.spec.js +392 -0
  144. package/dist/tests/setup/loadEnv.js +3 -0
  145. package/dist/tests/unit/cli/developCommand.spec.js +567 -0
  146. package/dist/tests/unit/cli/directCommands.spec.js +279 -0
  147. package/dist/tests/unit/cli/envLoader.spec.js +58 -0
  148. package/dist/tests/unit/cli/ioContext.spec.js +119 -0
  149. package/dist/tests/unit/cli/preflight.spec.js +108 -0
  150. package/dist/tests/unit/cli/statusCommand.spec.js +111 -0
  151. package/dist/tests/unit/cli/workshopClientFallback.spec.js +80 -0
  152. package/dist/tests/unit/cli/workshopCommand.spec.js +329 -0
  153. package/dist/tests/unit/config/vitestEnvSetup.spec.js +13 -0
  154. package/dist/tests/unit/develop/checkpointState.spec.js +315 -0
  155. package/dist/tests/unit/develop/codeGenerator.spec.js +355 -0
  156. package/dist/tests/unit/develop/githubMcpAdapter.spec.js +231 -0
  157. package/dist/tests/unit/develop/mcpContextEnricher.spec.js +433 -0
  158. package/dist/tests/unit/develop/outputValidator.spec.js +119 -0
  159. package/dist/tests/unit/develop/pocScaffolder.spec.js +353 -0
  160. package/dist/tests/unit/develop/ralphLoop.spec.js +1248 -0
  161. package/dist/tests/unit/develop/templateRegistry.spec.js +85 -0
  162. package/dist/tests/unit/develop/testRunner.spec.js +249 -0
  163. package/dist/tests/unit/infraBicep.spec.js +92 -0
  164. package/dist/tests/unit/infraDeploy.spec.js +82 -0
  165. package/dist/tests/unit/infraTeardown.spec.js +63 -0
  166. package/dist/tests/unit/logging/logger.spec.js +43 -0
  167. package/dist/tests/unit/loop/conversationLoop.spec.js +592 -0
  168. package/dist/tests/unit/loop/phaseSummarizer.spec.js +141 -0
  169. package/dist/tests/unit/loop/streamingMarkdown.spec.js +147 -0
  170. package/dist/tests/unit/mcp/mcpManager.spec.js +279 -0
  171. package/dist/tests/unit/mcp/mcpTransport.spec.js +529 -0
  172. package/dist/tests/unit/mcp/retryPolicy.spec.js +218 -0
  173. package/dist/tests/unit/mcp/timeoutValidation.spec.js +46 -0
  174. package/dist/tests/unit/mcp/webSearch.spec.js +567 -0
  175. package/dist/tests/unit/phases/contextSummarizer.spec.js +140 -0
  176. package/dist/tests/unit/phases/discoveryEnricher.repeatCalls.spec.js +93 -0
  177. package/dist/tests/unit/phases/discoveryEnricher.spec.js +411 -0
  178. package/dist/tests/unit/phases/phaseExtractors.spec.js +352 -0
  179. package/dist/tests/unit/phases/phaseHandlers.spec.js +425 -0
  180. package/dist/tests/unit/prompts/promptLoader.spec.js +118 -0
  181. package/dist/tests/unit/schemas/pocSchemas.spec.js +412 -0
  182. package/dist/tests/unit/schemas/session.spec.js +257 -0
  183. package/dist/tests/unit/sessions/exportPaths.spec.js +31 -0
  184. package/dist/tests/unit/sessions/exportWriter.spec.js +655 -0
  185. package/dist/tests/unit/sessions/sessionManager.spec.js +151 -0
  186. package/dist/tests/unit/sessions/sessionStore.spec.js +116 -0
  187. package/dist/tests/unit/shared/activitySpinner.spec.js +175 -0
  188. package/dist/tests/unit/shared/cardsLoader.spec.js +76 -0
  189. package/dist/tests/unit/shared/copilotClient.spec.js +155 -0
  190. package/dist/tests/unit/shared/errorClassifier.spec.js +131 -0
  191. package/dist/tests/unit/shared/events.spec.js +55 -0
  192. package/dist/tests/unit/shared/markdownRenderer.spec.js +35 -0
  193. package/dist/tests/unit/shared/markdownRendererChunks.spec.js +70 -0
  194. package/dist/tests/unit/shared/tableRenderer.spec.js +34 -0
  195. package/dist/vitest.config.js +14 -0
  196. package/dist/vitest.live.config.js +18 -0
  197. package/docs/README.md +35 -0
  198. package/docs/architecture.md +169 -0
  199. package/docs/cli-usage.md +207 -0
  200. package/docs/environment.md +66 -0
  201. package/docs/export-format.md +146 -0
  202. package/docs/session-model.md +113 -0
  203. package/eslint.config.js +35 -0
  204. package/infra/deploy.sh +193 -0
  205. package/infra/gather-env.sh +211 -0
  206. package/infra/main.bicep +90 -0
  207. package/infra/main.bicepparam +18 -0
  208. package/infra/resources.bicep +134 -0
  209. package/infra/teardown.sh +114 -0
  210. package/package.json +63 -0
  211. package/specs/001-cli-workshop-rebuild/checklists/requirements.md +35 -0
  212. package/specs/001-cli-workshop-rebuild/contracts/cli.md +59 -0
  213. package/specs/001-cli-workshop-rebuild/contracts/export-summary-json.md +23 -0
  214. package/specs/001-cli-workshop-rebuild/contracts/session-json.md +30 -0
  215. package/specs/001-cli-workshop-rebuild/data-model.md +210 -0
  216. package/specs/001-cli-workshop-rebuild/plan.md +361 -0
  217. package/specs/001-cli-workshop-rebuild/quickstart.md +83 -0
  218. package/specs/001-cli-workshop-rebuild/research.md +116 -0
  219. package/specs/001-cli-workshop-rebuild/spec.md +240 -0
  220. package/specs/001-cli-workshop-rebuild/tasks.md +476 -0
  221. package/specs/002-poc-generation/contracts/poc-output.md +172 -0
  222. package/specs/002-poc-generation/contracts/ralph-loop.md +113 -0
  223. package/specs/002-poc-generation/data-model.md +172 -0
  224. package/specs/002-poc-generation/plan.md +109 -0
  225. package/specs/002-poc-generation/quickstart.md +97 -0
  226. package/specs/002-poc-generation/research.md +786 -0
  227. package/specs/002-poc-generation/spec.md +81 -0
  228. package/specs/002-poc-generation/tasks-fix.md +198 -0
  229. package/specs/002-poc-generation/tasks.md +252 -0
  230. package/specs/003-mcp-transport-integration/checklists/requirements.md +37 -0
  231. package/specs/003-mcp-transport-integration/contracts/context-enricher.md +220 -0
  232. package/specs/003-mcp-transport-integration/contracts/discovery-enricher.md +267 -0
  233. package/specs/003-mcp-transport-integration/contracts/github-adapter.md +149 -0
  234. package/specs/003-mcp-transport-integration/contracts/mcp-transport.md +288 -0
  235. package/specs/003-mcp-transport-integration/data-model.md +326 -0
  236. package/specs/003-mcp-transport-integration/plan.md +114 -0
  237. package/specs/003-mcp-transport-integration/quickstart.md +311 -0
  238. package/specs/003-mcp-transport-integration/research.md +395 -0
  239. package/specs/003-mcp-transport-integration/spec.md +234 -0
  240. package/specs/003-mcp-transport-integration/tasks.md +324 -0
  241. package/specs/003-next-spec-gaps.md +150 -0
  242. package/specs/004-dev-resume-hardening/checklists/requirements.md +37 -0
  243. package/specs/004-dev-resume-hardening/contracts/cli.md +160 -0
  244. package/specs/004-dev-resume-hardening/data-model.md +321 -0
  245. package/specs/004-dev-resume-hardening/plan.md +107 -0
  246. package/specs/004-dev-resume-hardening/quickstart.md +115 -0
  247. package/specs/004-dev-resume-hardening/research.md +142 -0
  248. package/specs/004-dev-resume-hardening/spec.md +221 -0
  249. package/specs/004-dev-resume-hardening/tasks.md +333 -0
  250. package/specs/005-ai-search-deploy/checklists/requirements.md +39 -0
  251. package/specs/005-ai-search-deploy/contracts/web-search-tool.md +241 -0
  252. package/specs/005-ai-search-deploy/data-model.md +130 -0
  253. package/specs/005-ai-search-deploy/plan.md +93 -0
  254. package/specs/005-ai-search-deploy/quickstart.md +96 -0
  255. package/specs/005-ai-search-deploy/research.md +187 -0
  256. package/specs/005-ai-search-deploy/spec.md +143 -0
  257. package/specs/005-ai-search-deploy/tasks.md +284 -0
  258. package/specs/006-workshop-extraction-fixes/checklists/requirements.md +61 -0
  259. package/specs/006-workshop-extraction-fixes/contracts/summarization-and-export.md +131 -0
  260. package/specs/006-workshop-extraction-fixes/data-model.md +149 -0
  261. package/specs/006-workshop-extraction-fixes/plan.md +123 -0
  262. package/specs/006-workshop-extraction-fixes/quickstart.md +101 -0
  263. package/specs/006-workshop-extraction-fixes/research.md +143 -0
  264. package/specs/006-workshop-extraction-fixes/spec.md +210 -0
  265. package/specs/006-workshop-extraction-fixes/tasks.md +316 -0
  266. package/src/cli/developCommand.ts +308 -0
  267. package/src/cli/directCommands.ts +195 -0
  268. package/src/cli/envLoader.ts +17 -0
  269. package/src/cli/exportCommand.ts +65 -0
  270. package/src/cli/index.ts +249 -0
  271. package/src/cli/ioContext.ts +139 -0
  272. package/src/cli/preflight.ts +86 -0
  273. package/src/cli/statusCommand.ts +118 -0
  274. package/src/cli/workshopCommand.ts +496 -0
  275. package/src/develop/checkpointState.ts +121 -0
  276. package/src/develop/codeGenerator.ts +402 -0
  277. package/src/develop/dynamicScaffolder.ts +284 -0
  278. package/src/develop/githubMcpAdapter.ts +199 -0
  279. package/src/develop/index.ts +34 -0
  280. package/src/develop/mcpContextEnricher.ts +279 -0
  281. package/src/develop/pocScaffolder.ts +646 -0
  282. package/src/develop/ralphLoop.ts +1044 -0
  283. package/src/develop/templateRegistry.ts +427 -0
  284. package/src/develop/testRunner.ts +276 -0
  285. package/src/logging/logger.ts +73 -0
  286. package/src/loop/conversationLoop.ts +355 -0
  287. package/src/loop/phaseSummarizer.ts +114 -0
  288. package/src/mcp/mcpManager.ts +365 -0
  289. package/src/mcp/mcpTransport.ts +562 -0
  290. package/src/mcp/retryPolicy.ts +87 -0
  291. package/src/mcp/webSearch.ts +388 -0
  292. package/src/originalPrompts/design_thinking.md +178 -0
  293. package/src/originalPrompts/design_thinking_persona.md +76 -0
  294. package/src/originalPrompts/document_generator_example.md +77 -0
  295. package/src/originalPrompts/document_generator_persona.md +47 -0
  296. package/src/originalPrompts/facilitator_persona.md +125 -0
  297. package/src/originalPrompts/guardrails.md +47 -0
  298. package/src/phases/contextSummarizer.ts +154 -0
  299. package/src/phases/discoveryEnricher.ts +223 -0
  300. package/src/phases/phaseExtractors.ts +247 -0
  301. package/src/phases/phaseHandlers.ts +450 -0
  302. package/src/prompts/design.md +51 -0
  303. package/src/prompts/develop-boundary.md +51 -0
  304. package/src/prompts/develop.md +111 -0
  305. package/src/prompts/discover.md +58 -0
  306. package/src/prompts/ideate.md +56 -0
  307. package/src/prompts/plan.md +51 -0
  308. package/src/prompts/promptLoader.ts +198 -0
  309. package/src/prompts/select.md +47 -0
  310. package/src/prompts/summarize/README.md +8 -0
  311. package/src/prompts/summarize/design-summary.md +37 -0
  312. package/src/prompts/summarize/develop-summary.md +25 -0
  313. package/src/prompts/summarize/ideate-summary.md +27 -0
  314. package/src/prompts/summarize/plan-summary.md +27 -0
  315. package/src/prompts/summarize/select-summary.md +21 -0
  316. package/src/prompts/system.md +28 -0
  317. package/src/sessions/exportPaths.ts +28 -0
  318. package/src/sessions/exportWriter.ts +490 -0
  319. package/src/sessions/sessionManager.ts +119 -0
  320. package/src/sessions/sessionStore.ts +69 -0
  321. package/src/shared/activitySpinner.ts +108 -0
  322. package/src/shared/copilotClient.ts +291 -0
  323. package/src/shared/data/cards.json +1249 -0
  324. package/src/shared/data/cardsLoader.ts +70 -0
  325. package/src/shared/errorClassifier.ts +160 -0
  326. package/src/shared/events.ts +103 -0
  327. package/src/shared/markdownRenderer.ts +44 -0
  328. package/src/shared/schemas/session.ts +346 -0
  329. package/src/shared/tableRenderer.ts +28 -0
  330. package/src/types/marked-terminal.d.ts +5 -0
  331. package/src/vendor/chalk.ts +2 -0
  332. package/src/vendor/cli-table3.ts +3 -0
  333. package/src/vendor/commander.ts +2 -0
  334. package/src/vendor/marked-terminal.ts +3 -0
  335. package/src/vendor/marked.ts +2 -0
  336. package/src/vendor/ora.ts +2 -0
  337. package/src/vendor/pino.ts +3 -0
  338. package/src/vendor/zod.ts +3 -0
  339. package/tests/e2e/developE2e.spec.ts +152 -0
  340. package/tests/e2e/developFailureE2e.spec.ts +289 -0
  341. package/tests/e2e/developPty.spec.ts +86 -0
  342. package/tests/e2e/discoveryWebSearchRelevance.spec.ts +103 -0
  343. package/tests/e2e/harness.spec.ts +104 -0
  344. package/tests/e2e/mcpLive.spec.ts +149 -0
  345. package/tests/e2e/newSession.e2e.spec.ts +245 -0
  346. package/tests/e2e/ralphLoopEnrichmentComparison.spec.ts +70 -0
  347. package/tests/e2e/workiqEnrichment.spec.ts +72 -0
  348. package/tests/e2e/zava-assessment/agent-interaction-script.md +258 -0
  349. package/tests/e2e/zava-assessment/company-profile.md +98 -0
  350. package/tests/e2e/zava-assessment/expected-results-checklist.md +454 -0
  351. package/tests/e2e/zavaSimulation.spec.ts +511 -0
  352. package/tests/fixtures/completedSession.json +141 -0
  353. package/tests/fixtures/test-fixture-project/package-lock.json +1585 -0
  354. package/tests/fixtures/test-fixture-project/package.json +12 -0
  355. package/tests/fixtures/test-fixture-project/src/add.ts +3 -0
  356. package/tests/fixtures/test-fixture-project/tests/failing.test.ts +7 -0
  357. package/tests/fixtures/test-fixture-project/tests/hanging.test.ts +9 -0
  358. package/tests/fixtures/test-fixture-project/tests/passing.test.ts +13 -0
  359. package/tests/fixtures/test-fixture-project/vitest.config.ts +7 -0
  360. package/tests/integration/autoStartConversation.spec.ts +168 -0
  361. package/tests/integration/defaultCommand.spec.ts +179 -0
  362. package/tests/integration/directCommandNonTty.spec.ts +260 -0
  363. package/tests/integration/directCommandTty.spec.ts +185 -0
  364. package/tests/integration/discoveryEnrichmentFlow.spec.ts +209 -0
  365. package/tests/integration/exportArtifacts.spec.ts +232 -0
  366. package/tests/integration/exportFallbackFlow.spec.ts +115 -0
  367. package/tests/integration/mcpDegradationFlow.spec.ts +231 -0
  368. package/tests/integration/mcpTransportFlow.spec.ts +178 -0
  369. package/tests/integration/newSessionFlow.spec.ts +406 -0
  370. package/tests/integration/pocGithubMcp.spec.ts +224 -0
  371. package/tests/integration/pocLocalFallback.spec.ts +205 -0
  372. package/tests/integration/pocScaffold.spec.ts +220 -0
  373. package/tests/integration/ralphLoopFlow.spec.ts +430 -0
  374. package/tests/integration/ralphLoopPartial.spec.ts +416 -0
  375. package/tests/integration/resumeAndBacktrack.spec.ts +278 -0
  376. package/tests/integration/spinnerLifecycle.spec.ts +270 -0
  377. package/tests/integration/summarizationFlow.spec.ts +135 -0
  378. package/tests/integration/testRunnerReal.spec.ts +63 -0
  379. package/tests/integration/webSearchAgent.spec.ts +155 -0
  380. package/tests/live/copilotSdkLive.spec.ts +149 -0
  381. package/tests/live/zavaFullWorkshop.spec.ts +515 -0
  382. package/tests/setup/loadEnv.ts +5 -0
  383. package/tests/unit/cli/developCommand.spec.ts +679 -0
  384. package/tests/unit/cli/directCommands.spec.ts +325 -0
  385. package/tests/unit/cli/envLoader.spec.ts +73 -0
  386. package/tests/unit/cli/ioContext.spec.ts +148 -0
  387. package/tests/unit/cli/preflight.spec.ts +125 -0
  388. package/tests/unit/cli/statusCommand.spec.ts +134 -0
  389. package/tests/unit/cli/workshopClientFallback.spec.ts +100 -0
  390. package/tests/unit/cli/workshopCommand.spec.ts +378 -0
  391. package/tests/unit/config/vitestEnvSetup.spec.ts +24 -0
  392. package/tests/unit/develop/checkpointState.spec.ts +378 -0
  393. package/tests/unit/develop/codeGenerator.spec.ts +447 -0
  394. package/tests/unit/develop/githubMcpAdapter.spec.ts +283 -0
  395. package/tests/unit/develop/mcpContextEnricher.spec.ts +564 -0
  396. package/tests/unit/develop/outputValidator.spec.ts +134 -0
  397. package/tests/unit/develop/pocScaffolder.spec.ts +451 -0
  398. package/tests/unit/develop/ralphLoop.spec.ts +1439 -0
  399. package/tests/unit/develop/templateRegistry.spec.ts +106 -0
  400. package/tests/unit/develop/testRunner.spec.ts +294 -0
  401. package/tests/unit/infraBicep.spec.ts +116 -0
  402. package/tests/unit/infraDeploy.spec.ts +102 -0
  403. package/tests/unit/infraTeardown.spec.ts +77 -0
  404. package/tests/unit/logging/logger.spec.ts +50 -0
  405. package/tests/unit/loop/conversationLoop.spec.ts +719 -0
  406. package/tests/unit/loop/phaseSummarizer.spec.ts +169 -0
  407. package/tests/unit/loop/streamingMarkdown.spec.ts +180 -0
  408. package/tests/unit/mcp/mcpManager.spec.ts +336 -0
  409. package/tests/unit/mcp/mcpTransport.spec.ts +689 -0
  410. package/tests/unit/mcp/retryPolicy.spec.ts +278 -0
  411. package/tests/unit/mcp/timeoutValidation.spec.ts +55 -0
  412. package/tests/unit/mcp/webSearch.spec.ts +718 -0
  413. package/tests/unit/phases/contextSummarizer.spec.ts +158 -0
  414. package/tests/unit/phases/discoveryEnricher.repeatCalls.spec.ts +125 -0
  415. package/tests/unit/phases/discoveryEnricher.spec.ts +512 -0
  416. package/tests/unit/phases/phaseExtractors.spec.ts +406 -0
  417. package/tests/unit/phases/phaseHandlers.spec.ts +483 -0
  418. package/tests/unit/prompts/promptLoader.spec.ts +144 -0
  419. package/tests/unit/schemas/pocSchemas.spec.ts +457 -0
  420. package/tests/unit/schemas/session.spec.ts +328 -0
  421. package/tests/unit/sessions/exportPaths.spec.ts +38 -0
  422. package/tests/unit/sessions/exportWriter.spec.ts +737 -0
  423. package/tests/unit/sessions/sessionManager.spec.ts +174 -0
  424. package/tests/unit/sessions/sessionStore.spec.ts +136 -0
  425. package/tests/unit/shared/activitySpinner.spec.ts +211 -0
  426. package/tests/unit/shared/cardsLoader.spec.ts +89 -0
  427. package/tests/unit/shared/copilotClient.spec.ts +185 -0
  428. package/tests/unit/shared/errorClassifier.spec.ts +152 -0
  429. package/tests/unit/shared/events.spec.ts +71 -0
  430. package/tests/unit/shared/markdownRenderer.spec.ts +42 -0
  431. package/tests/unit/shared/markdownRendererChunks.spec.ts +83 -0
  432. package/tests/unit/shared/tableRenderer.spec.ts +38 -0
  433. package/tsconfig.json +20 -0
  434. package/vitest.config.ts +15 -0
  435. package/vitest.live.config.ts +19 -0
@@ -0,0 +1,115 @@
1
+ # Quickstart: Dev Resume & Hardening
2
+
3
+ **Feature**: 004-dev-resume-hardening
4
+ **Date**: 2026-03-01
5
+
6
+ ## Prerequisites
7
+
8
+ - Node.js >= 20 LTS
9
+ - npm (bundled with Node.js)
10
+ - sofIA CLI installed (`npm run build && npm link`)
11
+ - A workshop session that has completed the Plan phase
12
+
13
+ ## Quick Verification
14
+
15
+ ### 1. Resume a session after interruption
16
+
17
+ ```bash
18
+ # Start a dev session
19
+ sofia dev --session abc123
20
+
21
+ # Interrupt with Ctrl+C after 2 iterations complete
22
+ # The CLI displays: "Use `sofia dev --session abc123` to resume"
23
+
24
+ # Resume — should skip scaffold, re-run npm install, resume from iteration 3
25
+ sofia dev --session abc123
26
+
27
+ # Expected output:
28
+ # ℹ Resuming session abc123 from iteration 3 (2 completed iterations found)
29
+ # ℹ Skipping scaffold — output directory and .sofia-metadata.json present
30
+ # ℹ Re-running dependency installation (npm install)
31
+ # Iteration 3/10: Running tests…
32
+ ```
33
+
34
+ ### 2. Force-restart a session
35
+
36
+ ```bash
37
+ # Force restart — clears both output directory and session state
38
+ sofia dev --session abc123 --force
39
+
40
+ # Expected output:
41
+ # ℹ Cleared existing output directory and session state (--force)
42
+ # Scaffolding PoC project…
43
+ # Iteration 1/10: Running tests…
44
+ ```
45
+
46
+ ### 3. Template selection
47
+
48
+ ```bash
49
+ # Create a plan with Python/FastAPI architecture notes
50
+ # Then run dev — should auto-select python-pytest template
51
+ sofia dev --session python-plan-123
52
+
53
+ # Expected output:
54
+ # ℹ Selected template: python-pytest (matched 'python' in architecture notes)
55
+ # Scaffolding PoC project…
56
+ ```
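The expected output above implies a keyword match against the plan's architecture notes. The following is a minimal sketch of how such a match could work, not the package's actual implementation. `TemplateRule`, `selectTemplate`, the `fastapi` keyword, and the `fallback-template` id are all hypothetical; only the `python-pytest` id and the `python` match come from the text above.

```typescript
// Hypothetical rule shape: a template id plus the keywords that select it.
interface TemplateRule {
  id: string;
  keywords: string[];
}

// Only "python-pytest" and the "python" keyword appear in the quickstart;
// the "fastapi" keyword is an assumption added for illustration.
const RULES: TemplateRule[] = [
  { id: "python-pytest", keywords: ["python", "fastapi"] },
];

// Placeholder id used when nothing matches (an assumption, not a real template name).
const FALLBACK_ID = "fallback-template";

interface Selection {
  id: string;
  matched?: string; // the keyword that triggered the selection, if any
}

function selectTemplate(architectureNotes: string, rules: TemplateRule[] = RULES): Selection {
  const notes = architectureNotes.toLowerCase();
  for (const rule of rules) {
    // The first keyword found in the notes wins; rules are checked in order.
    const matched = rule.keywords.find((keyword) => notes.includes(keyword));
    if (matched !== undefined) {
      return { id: rule.id, matched };
    }
  }
  return { id: FALLBACK_ID };
}
```

Under these assumptions, `selectTemplate("Backend: Python/FastAPI service")` selects `python-pytest` with `python` as the matched keyword, which lines up with the log line shown in the expected output.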
57
+
58
+ ## Development Setup
59
+
60
+ ```bash
61
+ # Clone and install
62
+ git clone <repo-url>
63
+ cd sofia-cli
64
+ git checkout 004-dev-resume-hardening
65
+ npm install
66
+
67
+ # Run tests (targeted)
68
+ npm test -- tests/unit/develop/ralphLoop.spec.ts
69
+ npm test -- tests/unit/develop/templateRegistry.spec.ts
70
+ npm test -- tests/unit/cli/developCommand.spec.ts
71
+ npm test -- tests/integration/testRunnerReal.spec.ts
72
+
73
+ # Run all tests
74
+ npm test
75
+
76
+ # Type check
77
+ npm run typecheck
78
+
79
+ # Lint
80
+ npm run lint
81
+ ```
82
+
83
+ ## Key Files to Modify
84
+
85
+ | File | Change |
86
+ | --------------------------------- | ------------------------------------------- |
87
+ | `src/develop/ralphLoop.ts` | Resume iteration seeding in `run()` |
88
+ | `src/cli/developCommand.ts` | Resume detection, `--force` session reset |
89
+ | `src/develop/templateRegistry.ts` | **New**: template registry + selection |
90
+ | `src/develop/pocScaffolder.ts` | Use registry, extract template into entries |
91
+ | `src/develop/testRunner.ts` | Make test command configurable |
92
+ | `src/phases/phaseHandlers.ts` | Workshop→dev transition guidance |
93
+ | `src/cli/workshopCommand.ts` | Display `sofia dev` command after Plan |
94
+
95
+ ## TDD Workflow Reminder
96
+
97
+ Per constitution (Principle V):
98
+
99
+ 1. **Red**: Write failing tests first
100
+ 2. **Green**: Implement minimum code to pass
101
+ 3. **Review**: Run Test Review Checklist, add tests for gaps
102
+
103
+ ```bash
104
+ # Example: adding resume test
105
+ # 1. Write test in tests/unit/develop/ralphLoop.spec.ts
106
+ # 2. Run it — should fail
107
+ npm test -- tests/unit/develop/ralphLoop.spec.ts --testNamePattern "resumes from"
108
+ # 3. Implement resume logic in src/develop/ralphLoop.ts
109
+ # 4. Run again — should pass
110
+ npm test -- tests/unit/develop/ralphLoop.spec.ts
111
+ # 5. Full suite
112
+ npm test
113
+ # 6. Type + lint
114
+ npm run typecheck && npm run lint
115
+ ```
@@ -0,0 +1,142 @@
1
+ # Research: Dev Resume & Hardening
2
+
3
+ **Feature**: 004-dev-resume-hardening
4
+ **Date**: 2026-03-01
5
+ **Status**: Complete — all unknowns resolved
6
+
7
+ ## R1: Resume Iteration Seeding Strategy
8
+
9
+ **Decision**: Seed `iterations` from `session.poc.iterations` at `ralphLoop.ts` L183, derive `iterNum = iterations.length + 1`, conditionally skip scaffold/install.
10
+
11
+ **Rationale**: The `run()` method always initializes `iterations = []` (L183) and starts the iteration loop at `iterNum = 2` (L280). The session already persists `poc.iterations` via `onSessionUpdate` → `store.save()` after every iteration. The data needed for resume is already being written — it's just never read back. Seeding from session state is the minimal change with maximum correctness.
12
+
13
+ **Key insertion points**:
14
+
15
+ - L183: After `const iterations: PocIteration[] = []`, push from `session.poc.iterations` if present and `finalStatus` is unset
16
+ - L280: Change loop start from `iterNum = 2` to `iterNum = iterations.length + 1`
17
+ - L190-L271: Wrap scaffold + npm install in `if (iterations.length === 0)` guard
18
+ - L278-L279: Seed `prevFailingTests` from last iteration's `testResults.failures` on resume
19
+
20
+ **Alternatives considered**:
21
+
22
+ - **New `resume()` method on RalphLoop**: Rejected — would duplicate significant logic from `run()`. Better to make `run()` resume-aware.
23
+ - **Checkpoint file on disk**: Rejected — session JSON already contains all state needed. Adding a secondary checkpoint source creates consistency risks.
24
+
25
+ **Open design question resolved**: `maxIterations` counts _total_ iterations (not additional from resume point). If `maxIterations=10` and 3 completed, the loop runs iterations 4-10 (7 more). This matches the semantic "max iterations for this PoC" and prevents open-ended runs.
26
+
27
+ **Incomplete iteration handling** (FR-001a): If the last iteration in `session.poc.iterations` has no `testResults` (indicating mid-execution interruption), pop it from the seeded iterations so it gets re-run. Only fully completed iterations (with `testResults` or `outcome` set) are preserved.
28
+
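A short sketch of the two rules above: dropping an interrupted last iteration, and treating `maxIterations` as a total. The helper names and the `IterationRecord` shape are hypothetical, and treating "no `testResults` and no `outcome`" as the incompleteness test is one reading of the text.

```typescript
// Hypothetical record shape; only the fields the rule depends on.
interface IterationRecord {
  iteration: number;
  testResults?: unknown;
  outcome?: string;
}

// Drop the last iteration if it was interrupted before recording a result,
// so that it gets re-run on resume. The input array is not mutated.
function dropIncompleteTail(iterations: IterationRecord[]): IterationRecord[] {
  const seeded = [...iterations];
  const last = seeded[seeded.length - 1];
  if (last !== undefined && last.testResults === undefined && last.outcome === undefined) {
    seeded.pop();
  }
  return seeded;
}

// maxIterations counts total iterations, not additional ones after resume.
function remainingIterations(completed: number, maxIterations: number): number {
  return Math.max(0, maxIterations - completed);
}
```

For example, with `maxIterations = 10` and 3 completed iterations, `remainingIterations(3, 10)` is 7, matching the "iterations 4-10" case described above.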
29
+ ## R2: `--force` Session State Reset
30
+
31
+ **Decision**: Direct mutation `session.poc = undefined` in `developCommand.ts` after `rmSync()`, followed by `store.save()`. Do not use `backtrackSession`.
32
+
33
+ **Rationale**: `backtrackSession(session, 'Develop')` is a no-op when `session.phase` is already `'Develop'` (same-phase check at sessionManager.ts L80-L84 returns without changes). The `--force` reset is a single-field operation within the current phase — directly clearing `session.poc` is simpler, more explicit, and avoids coupling to the backtrack function's cross-phase navigation semantics.
34
+
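The reset order described above (delete output, clear `session.poc`, save) might look like the following sketch. File-system and store access are injected through a hypothetical `ForceResetDeps` interface so the example stays self-contained; the real handler calls `rmSync()` and `store.save()` directly, and the real `save` may be asynchronous.

```typescript
// Hypothetical, simplified session shape.
interface PocState {
  iterations: unknown[];
  finalStatus?: string;
}
interface Session {
  sessionId: string;
  phase: string;
  poc?: PocState | undefined;
}

// Assumed dependency seam standing in for rmSync() and store.save().
interface ForceResetDeps {
  removeDir(path: string): void;
  save(session: Session): void;
}

function forceReset(session: Session, outputDir: string, deps: ForceResetDeps): void {
  deps.removeDir(outputDir); // 1. delete prior PoC output
  session.poc = undefined;   // 2. clear only the PoC state; workshop phases stay intact
  deps.save(session);        // 3. persist so a later run sees a clean slate
}
```

Saving after the mutation is what keeps the session and the file system consistent; clearing only `poc` preserves the earlier workshop phases.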
35
+ **Alternatives considered**:
36
+
37
+ - **`backtrackSession` with `clearCurrentPhase` option**: Rejected — adds complexity to a generic function for a specific use case. Backtrack is designed for phase navigation, not in-phase resets.
38
+ - **Delete and recreate session**: Rejected — would lose all workshop phases (Discover, Ideate, Design, Select, Plan). `--force` should only reset the PoC, preserving all prior work.
39
+
40
+ ## R3: Template Registry Architecture
41
+
42
+ **Decision**: Create a `TemplateRegistry` map in a new `src/develop/templateRegistry.ts` module. `PocScaffolder` constructor already accepts `template?: TemplateFile[]` — the registry provides the lookup layer.
43
+
44
+ **Rationale**: The scaffolder's template injection point exists (`constructor(template?)`), `TemplateFile` interface is stable, and `TechStack` schema already supports the fields needed. The registry formalizes what's already implicit (hardcoded template selection) into an extensible pattern.
45
+
46
+ **Template entry shape**:
47
+
48
+ ```typescript
49
+ export interface TemplateEntry {
50
+ id: string; // e.g., 'node-ts-vitest', 'python-pytest'
51
+ displayName: string; // e.g., 'TypeScript + Node.js + Vitest'
52
+ files: TemplateFile[]; // scaffold file list
53
+ techStack: TechStack; // includes language, runtime, testRunner, buildCommand
54
+ installCommand: string; // e.g., 'npm install', 'pip install -r requirements.txt'
55
+ testCommand: string; // e.g., 'npm test -- --reporter=json'
56
+ matchPatterns: string[]; // keywords to match from architectureNotes
57
+ }
58
+ ```
59
+
60
+ **Selection logic**: Scan `plan.architectureNotes` + `plan.dependencies` for `matchPatterns`. First match wins. Default: `node-ts-vitest`.
61
+
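A minimal sketch of the "first match wins" selection, assuming case-insensitive substring matching (the research note does not specify the matching rule). `PlanLike`, `RegistryEntry`, and `selectTemplateId` are hypothetical names; only `matchPatterns`, `architectureNotes`, `dependencies`, and the `node-ts-vitest` default come from the text.

```typescript
// Hypothetical minimal shapes for illustration.
interface PlanLike {
  architectureNotes?: string;
  dependencies?: string[];
}
interface RegistryEntry {
  id: string;
  matchPatterns: string[];
}

const DEFAULT_TEMPLATE_ID = 'node-ts-vitest';

function selectTemplateId(entries: RegistryEntry[], plan: PlanLike): string {
  // Combine the two plan fields into one lowercase search string.
  const haystack = [plan.architectureNotes ?? '', ...(plan.dependencies ?? [])]
    .join(' ')
    .toLowerCase();
  // Entries are checked in registry order; the first matching entry wins.
  for (const entry of entries) {
    if (entry.matchPatterns.some((p) => haystack.includes(p.toLowerCase()))) {
      return entry.id;
    }
  }
  return DEFAULT_TEMPLATE_ID;
}
```

Because the first match wins, registry order decides ambiguous plans: with `node-ts-vitest` registered first, notes mentioning both Python and TypeScript resolve to the default template.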
62
+ **`python-pytest` template files**: `.gitignore`, `requirements.txt`, `pytest.ini`, `README.md`, `src/__init__.py`, `src/main.py`, `tests/test_main.py`, `.sofia-metadata.json`.
63
+
64
+ **TechStack for Python**: `{ language: 'Python', runtime: 'Python 3.11', testRunner: 'pytest --tb=short -q --json-report', buildCommand: undefined, framework: undefined }`
65
+
66
+ **Alternatives considered**:
67
+
68
+ - **Auto-detection from plan (no registry)**: Rejected — fragile pattern matching without a structured lookup. Registry makes template addition declarative.
69
+ - **User-selectable template (CLI flag)**: Deferred — out of scope per spec. Registry enables this later without code changes.
70
+
71
+ ## R4: TestRunner Command Configurability
72
+
73
+ **Decision**: Make the test command configurable via `TestRunnerOptions.testCommand` (default: `'npm test -- --reporter=json'`). `RalphLoop` passes the command from `TechStack.testRunner` or from the selected `TemplateEntry`'s `testCommand`.
74
+
75
+ **Rationale**: `spawnTests()` currently hardcodes `spawn('npm', ['test', '--', '--reporter=json'])`. For Python templates, the command would be `pytest --tb=short -q --json-report`. Rather than building separate parsers for each runner, make the command configurable and keep the JSON parsing generic — both Vitest and pytest can produce JSON output.
76
+
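One way to turn a configurable command string into the command/arguments pair that `spawn` expects is sketched below. `toSpawnArgs` is a hypothetical helper, and plain whitespace splitting is an assumption that does not handle quoted arguments.

```typescript
interface TestRunnerOptions {
  testCommand?: string;
}

const DEFAULT_TEST_COMMAND = 'npm test -- --reporter=json';

// Split the configured command into the executable and its arguments.
// Assumption: arguments contain no quoted whitespace.
function toSpawnArgs(options: TestRunnerOptions = {}): { command: string; args: string[] } {
  const parts = (options.testCommand ?? DEFAULT_TEST_COMMAND).trim().split(/\s+/);
  const [command = '', ...args] = parts;
  if (command === '') {
    throw new Error('testCommand must not be empty');
  }
  return { command, args };
}
```

With no options, the result is `npm` with `['test', '--', '--reporter=json']`, which matches the currently hardcoded call; a Python template would pass its own command string instead.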
77
+ **Test strategy for coverage hardening**:
78
+
79
+ - Make `extractJson` and `buildErrorResult` `protected` (like `parseOutput` already is)
80
+ - Create test fixture files with sample Vitest JSON output (passing, failing, mixed, garbled)
81
+ - Use `child_process.spawn` mocking OR a real minimal project in `tests/fixtures/` for integration tests
82
+ - FR-019 requires real fixture — create `tests/fixtures/test-fixture-project/` with a minimal Vitest project
83
+
84
+ **Alternatives considered**:
85
+
86
+ - **Strategy pattern per runner**: Rejected for now — over-engineering. JSON output parsing can be generic. If pytest JSON format differs significantly, add a `parseStrategy` option later.
87
+ - **Test `spawnTests` via shell script fixture**: Rejected — too fragile across platforms. Real Vitest project is more reliable.
88
+
89
+ ## R5: Workshop → Dev Transition
90
+
91
+ **Decision**: Insert guidance message in `workshopCommand.ts` when `getNextPhase(phase)` returns `'Develop'`, after the Plan decision gate. Show exact command `sofia dev --session <id>`. Optionally offer auto-transition in interactive mode (FR-021, SHOULD).
92
+
93
+ **Rationale**: The Plan → Develop boundary is where the workshop's conversational flow hands off to the RalphLoop's iterative code generation. The boundary handler at phaseHandlers.ts L283-L286 already comments that PoC generation uses `sofia dev`. Making this guidance explicit helps users complete the workflow.
94
+
95
+ **Insertion point**: `workshopCommand.ts` inside the `case 'continue'` block, when `next === 'Develop'`.
96
+
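As a rough illustration of the guidance step, the message could be produced by a small pure function that the `case 'continue'` block calls. `developGuidance` and the message wording are hypothetical; only the `sofia dev --session <id>` command comes from the decision above.

```typescript
// Returns the guidance text when the next phase is 'Develop', otherwise null.
function developGuidance(next: string | null, sessionId: string): string | null {
  if (next !== 'Develop') {
    return null;
  }
  return [
    'Plan complete. To generate the PoC, run:',
    `  sofia dev --session ${sessionId}`,
  ].join('\n');
}
```

Keeping the message in a pure function makes it easy to assert in a unit test that the output contains both the session ID and the command.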
97
+ **Alternatives considered**:
98
+
99
+ - **Auto-transition always**: Rejected — breaks the two-command separation of concerns. Users may want to review the plan before generating code.
100
+ - **Display in phase handler instead of workshop command**: Rejected — phase handlers don't have access to the IO context for rich terminal rendering. The workshop command is the right orchestration layer.
101
+
102
+ ## R6: `.sofia-metadata.json` TODO Tracking
103
+
104
+ **Decision**: Extend metadata JSON schema with a `todos` section. Scan template files at scaffold time for `TODO:` markers. After each RalphLoop iteration, rescan and update counts.
105
+
106
+ **Rationale**: The metadata file is already written at scaffold time (pocScaffolder.ts L243-L260), excluded from code generation (codeGenerator.ts L112), and used as a resume marker (developCommand.ts L160). Extending it with TODO tracking is a natural fit.
107
+
108
+ **Schema extension**:
109
+
110
+ ```json
111
+ {
112
+ "todos": {
113
+ "totalInitial": 3,
114
+ "remaining": 1,
115
+ "markers": ["src/main.py:12: TODO: Implement business logic"]
116
+ }
117
+ }
118
+ ```
119
+
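The scan-and-update cycle for the `todos` section can be sketched over in-memory file contents. `ScannedFile`, `findTodoMarkers`, and `updateTodos` are hypothetical names, and the marker format is inferred from the single example in the schema above.

```typescript
// In-memory stand-in for files read from the scaffold output directory.
interface ScannedFile {
  path: string;
  content: string;
}
interface TodoSummary {
  totalInitial: number;
  remaining: number;
  markers: string[];
}

// Collect "path:line: TODO: text" markers from every file.
function findTodoMarkers(files: ScannedFile[]): string[] {
  const markers: string[] = [];
  for (const file of files) {
    file.content.split('\n').forEach((line, index) => {
      const at = line.indexOf('TODO:');
      if (at >= 0) {
        markers.push(`${file.path}:${index + 1}: ${line.slice(at).trim()}`);
      }
    });
  }
  return markers;
}

// At scaffold time, call without `previous`; after each iteration, pass the
// prior summary so that totalInitial stays fixed while remaining is recounted.
function updateTodos(files: ScannedFile[], previous?: TodoSummary): TodoSummary {
  const markers = findTodoMarkers(files);
  return {
    totalInitial: previous?.totalInitial ?? markers.length,
    remaining: markers.length,
    markers,
  };
}
```

Passing the previous summary on each rescan is what lets the loop report progress as "remaining out of totalInitial".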
120
+ **Alternatives considered**:
121
+
122
+ - **Separate `.sofia-todos.json` file**: Rejected — adds another file to manage. Metadata is already the canonical per-PoC state file.
123
+ - **Track in session JSON instead**: Rejected — TODOs are file-system artifacts, not session-level state. Metadata file is co-located with the scaffold output.
124
+
125
+ ## R7: Existing Test Infrastructure
126
+
127
+ **Decision**: Follow existing test patterns: unit tests in `tests/unit/develop/`, integration in `tests/integration/`, E2E with `node-pty` in `tests/e2e/`. Use Vitest `vi.mock()` at module boundaries.
128
+
129
+ **Rationale**: 48 test files already establish clear conventions. Unit tests mock at module boundaries using `vi.mock()`. Integration tests use fake IO contexts and deterministic session objects. E2E tests use `node-pty` for PTY simulation (already a dev dependency).
130
+
131
+ **Key test files to extend**:
132
+
133
+ - `tests/unit/develop/ralphLoop.spec.ts` — add resume iteration seeding tests
134
+ - `tests/unit/cli/developCommand.spec.ts` — add --force reset tests
135
+ - `tests/integration/ralphLoopPartial.spec.ts` — add full resume flow tests
136
+
137
+ **New test files**:
138
+
139
+ - `tests/unit/develop/templateRegistry.spec.ts` — registry selection logic
140
+ - `tests/integration/testRunnerReal.spec.ts` — fixture-based testRunner tests
141
+ - `tests/e2e/developPty.spec.ts` — PTY-based interactive E2E
142
+ - `tests/fixtures/test-fixture-project/` — minimal Vitest project for testRunner tests
@@ -0,0 +1,221 @@
1
+ # Feature Specification: Dev Resume & Hardening
2
+
3
+ **Feature Branch**: `004-dev-resume-hardening`
4
+ **Created**: 2026-03-01
5
+ **Status**: Draft
6
+ **Upstream Dependency**: specs/002-poc-generation/spec.md (Ralph Loop, `--force`, testRunner, scaffolder), specs/003-mcp-transport-integration/spec.md (MCP transport layer)
7
+ **Input**: User description: "Implement dev command resume/checkpoint, --force flag, testRunner coverage hardening, PoC template selection, and other deferred P2/P3 gaps from Feature 003 spec"
8
+
9
+ ## Overview
10
+
11
+ Feature 002 built the PoC generation pipeline, and Feature 003 wires it to real MCP servers. This feature hardens the `sofia dev` command for production use by implementing the resume/checkpoint flow, honoring the `--force` flag, expanding test coverage for `testRunner.ts`, introducing a template registry for multi-language PoC scaffolding, and adding interactive E2E tests.
12
+
13
+ Currently, running `sofia dev --session X` a second time re-scaffolds everything from scratch despite the CLI displaying a "Resume" suggestion. The `--force` flag deletes the output directory but does not reset session state. The test runner has significant untested code paths at 45% coverage. The scaffolder is locked to a single TypeScript/Vitest template regardless of the plan's architecture notes.
14
+
15
+ **Gaps addressed**: GAP-006 (P2, resume/checkpoint), GAP-007 (P2, `--force`), GAP-008 (P2, testRunner coverage), GAP-009 (P2, template selection), GAP-009 (P3, scaffold TODOs), GAP-010 (P3, PTY E2E), GAP-011 (P3, workshop→develop transition) from `specs/003-next-spec-gaps.md`.
16
+
17
+ ## Clarifications
18
+
19
+ ### Session 2026-03-01
20
+
21
+ - Q: When resuming after an interruption mid-iteration, should the system re-run the incomplete iteration or skip to N+1? → A: Re-run the last iteration if it has no test results (was interrupted mid-execution); skip to N+1 only if the iteration completed fully.
22
+ - Q: Should npm install be skipped on resume if node_modules exists? → A: Always re-run npm install on resume — it's idempotent and avoids stale dependency issues from mid-iteration interruptions.
23
+ - Q: Should the template define the test command or should the test runner auto-detect it? → A: Template defines both install and test commands in TechStack — single source of truth, no auto-detection.
24
+ - Q: Should resume decisions (skip scaffold, re-run iteration, re-run install) be logged? → A: Log all resume decisions at info level (visible by default) for user confidence and debugging.
25
+ - Q: What adjacent concerns should be explicitly out of scope? → A: Multi-session dev, cloud-based resume, template marketplace, and Python test runner integration are all out of scope.
26
+
27
+ ## Out of Scope
28
+
29
+ The following concerns are explicitly excluded from this feature:
30
+
31
+ - **Multi-session development** — Running `sofia dev` on multiple sessions simultaneously is not supported; resume is single-session only.
32
+ - **Cloud-based resume** — Checkpoint state is local to the machine; syncing resume state across machines (e.g., via GitHub or cloud storage) is deferred.
33
+ - **Template marketplace** — User-contributed or externally hosted templates are not supported; the template registry is internal and code-defined.
34
+ - **Python test runner integration** — While the `python-pytest` scaffold template is in scope, adapting `testRunner.ts` to parse pytest's JSON output format is deferred. The Python template will use a test command format compatible with the existing JSON parser (e.g., pytest with `--json-report` plugin producing a compatible shape).
35
+
36
+ ## User Scenarios & Testing _(mandatory)_
37
+
38
+ ### User Story 1 — Resume an Interrupted PoC Session (Priority: P1)
39
+
40
+ As a facilitator who ran `sofia dev` and it was interrupted (Ctrl+C, network failure, LLM error), I want to run `sofia dev --session X` again and have it continue from where it left off, skipping scaffolding and resuming from the next iteration number, so that I don't lose progress and can reach a working PoC faster.
41
+
42
+ **Why this priority**: The CLI already advertises "Resume: sofia dev --session X" in its recovery message, but the command doesn't actually resume. This is the largest usability gap — users who encounter any interruption lose all iteration progress.
43
+
44
+ **Independent Test**: Run `sofia dev` on a session, interrupt after 2 iterations, re-run `sofia dev --session X`, and verify it starts from iteration 3 without re-scaffolding (npm install is re-run on resume, per FR-003).
45
+
46
+ **Acceptance Scenarios**:
47
+
48
+ 1. **Given** a session with `poc.iterations` containing 2 completed iterations and `poc.finalStatus` unset, **When** the user runs `sofia dev --session X`, **Then** the Ralph Loop detects the existing iterations, skips scaffolding, re-runs npm install (FR-003), and begins iteration 3 from the last known test results.
49
+ 2. **Given** a session with `poc.finalStatus` set to `'success'`, **When** the user runs `sofia dev --session X`, **Then** the CLI displays a message indicating the PoC is already complete and exits without re-running the Ralph Loop.
50
+ 3. **Given** a session with `poc.finalStatus` set to `'failed'` or `'partial'`, **When** the user runs `sofia dev --session X`, **Then** the CLI offers to resume from the last iteration or start fresh, defaulting to resume.
51
+ 4. **Given** a session with existing iterations but the output directory is missing, **When** the user runs `sofia dev --session X`, **Then** the system re-scaffolds (using the original plan context) but preserves the iteration history for LLM context continuity.
52
+
53
+ ---
54
+
55
+ ### User Story 2 — Force-Restart a PoC Session (Priority: P1)
56
+
57
+ As a facilitator who wants to discard a previous PoC attempt and start completely fresh, I want to run `sofia dev --session X --force` and have it delete all prior output and reset PoC state, so that I get a clean slate without needing to create a new session.
58
+
59
+ **Why this priority**: The `--force` flag is already declared in the CLI and referenced in the recovery message, but it only partially works (deletes output directory without resetting session state). This creates a confusing state where files are gone but the session still references old iterations.
60
+
61
+ **Independent Test**: Run `sofia dev --session X` to create output, then run `sofia dev --session X --force`, and verify both the output directory and session's `poc.iterations` are reset to empty.
62
+
63
+ **Acceptance Scenarios**:
64
+
65
+ 1. **Given** a session with existing `poc.iterations` and an output directory, **When** the user runs `sofia dev --session X --force`, **Then** the output directory is deleted, `poc.iterations` is reset to an empty array, `poc.finalStatus` is cleared, and the Ralph Loop starts fresh from iteration 1.
66
+ 2. **Given** a session with no prior PoC state, **When** the user runs `sofia dev --session X --force`, **Then** it behaves identically to a first run (no error, no special message).
67
+ 3. **Given** the `--force` flag is used on a session with `poc.finalStatus` set to `'success'`, **When** the command runs, **Then** it clears the success state and starts fresh without prompting for confirmation.
68
+
69
+ ---
70
+
71
+ ### User Story 3 — PoC Template Selection Based on Plan (Priority: P2)
72
+
73
+ As a facilitator whose plan specifies Python/FastAPI architecture, I want the scaffolder to generate a Python project with pytest instead of always generating TypeScript/Vitest, so that the PoC matches the planned technology stack.
74
+
75
+ **Why this priority**: Currently the scaffolder is hardcoded to TypeScript/Vitest regardless of the plan's `architectureNotes` or `dependencies`. This limits the PoC's usefulness when the plan targets a different technology. However, the core Ralph Loop works with any single template, making this an enhancement rather than a blocker.
76
+
77
+ **Independent Test**: Create a session with a plan specifying Python + FastAPI in its architecture notes, run `sofia dev`, and verify the scaffolder generates `requirements.txt`, `main.py`, `test_main.py` with pytest instead of `package.json` and TypeScript files.
78
+
79
+ **Acceptance Scenarios**:
80
+
81
+ 1. **Given** a plan with `architectureNotes` mentioning "Python" or "FastAPI", **When** the scaffolder runs, **Then** it selects the `python-pytest` template and generates a Python project structure.
82
+ 2. **Given** a plan with `architectureNotes` mentioning "TypeScript" or "Node.js" or no specific language, **When** the scaffolder runs, **Then** it uses the default `node-ts-vitest` template (current behavior preserved).
83
+ 3. **Given** a plan with ambiguous architecture notes (e.g., "could be Python or TypeScript"), **When** the scaffolder runs, **Then** it defaults to `node-ts-vitest` and logs which template was selected and why.
84
+ 4. **Given** a template registry with registered templates, **When** a new template is added, **Then** it only requires adding a new entry to the registry — no changes to the scaffolder's core logic.
85
+
86
+ ---
87
+
88
+ ### User Story 4 — TestRunner Coverage Hardening (Priority: P2)
89
+
90
+ As a developer maintaining sofIA, I want the test runner's critical code paths (subprocess spawning, output parsing, timeout handling) to be covered by integration tests, so that regressions in the test execution pipeline are caught early.
91
+
92
+ **Why this priority**: The test runner is at 45% coverage with critical untested paths including the child process spawning mechanism, output parsing fallbacks, and timeout error handling. These paths are exercised in every Ralph Loop iteration, making regressions high-impact but currently invisible.
93
+
94
+ **Independent Test**: Run the test runner integration tests against a tiny Vitest/pytest project fixture and verify all code paths are exercised including timeout, SIGTERM/SIGKILL, malformed output, and mixed stdout/JSON scenarios.
95
+
96
+ **Acceptance Scenarios**:
97
+
98
+ 1. **Given** a test fixture project with passing tests, **When** the test runner executes, **Then** it correctly parses the JSON reporter output and returns accurate pass/fail/skip counts.
99
+ 2. **Given** a test fixture project with a test that hangs indefinitely, **When** the test runner's timeout fires, **Then** it sends SIGTERM, waits 5 seconds, sends SIGKILL if needed, and returns a timeout-classified error result.
100
+ 3. **Given** test output containing mixed console logs and JSON, **When** `extractJson()` parses the output, **Then** the fallback path (first `{` to last `}`) successfully extracts the JSON report.
101
+ 4. **Given** test output containing no valid JSON at all, **When** `extractJson()` is called, **Then** it returns null and the caller produces a zero-count result with raw output preserved.
102
+
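Scenarios 3 and 4 describe a two-stage extraction: try each line first, then fall back to the slice from the first `{` to the last `}`. The sketch below is a hypothetical standalone version of that behaviour, not the actual `extractJson()` in `testRunner.ts`, whose signature and return type may differ.

```typescript
type JsonObject = Record<string, unknown>;

// Parse text as a JSON object; return null on failure or non-object values.
function tryParseObject(text: string): JsonObject | null {
  try {
    const value: unknown = JSON.parse(text);
    return typeof value === 'object' && value !== null ? (value as JsonObject) : null;
  } catch {
    return null;
  }
}

function extractJson(output: string): JsonObject | null {
  // Stage 1: look for a single line that is a complete JSON object.
  for (const line of output.split('\n')) {
    const trimmed = line.trim();
    if (trimmed.startsWith('{')) {
      const parsed = tryParseObject(trimmed);
      if (parsed !== null) {
        return parsed;
      }
    }
  }
  // Stage 2 (fallback): slice from the first '{' to the last '}'.
  const start = output.indexOf('{');
  const end = output.lastIndexOf('}');
  if (start === -1 || end <= start) {
    return null;
  }
  return tryParseObject(output.slice(start, end + 1));
}
```

The fallback is what handles a pretty-printed report surrounded by console logs; when neither stage yields an object, the caller is expected to build a zero-count result and keep the raw output.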
103
+ ---
104
+
105
+ ### User Story 5 — PTY-Based Interactive E2E Tests (Priority: P3)
106
+
107
+ As a developer, I want PTY-based E2E tests for the `sofia dev` command that verify interactive behavior (Ctrl+C handling, spinner display, progress output), so that the user's terminal experience is validated in CI.
108
+
109
+ **Why this priority**: Interactive behavior bugs (hanging spinners, swallowed Ctrl+C, garbled progress output) are invisible to the current E2E tests which use function calls. This is a quality-of-life improvement for developers but doesn't block production functionality.
110
+
111
+ **Independent Test**: Run PTY-based tests that spawn `sofia dev` as a subprocess, send Ctrl+C, and verify the process exits cleanly with the expected recovery message.
112
+
113
+ **Acceptance Scenarios**:
114
+
115
+ 1. **Given** a PTY-spawned `sofia dev` process, **When** Ctrl+C is sent during an iteration, **Then** the process exits with the recovery message and a zero exit code.
116
+ 2. **Given** a PTY-spawned `sofia dev` process, **When** the Ralph Loop progresses through iterations, **Then** the terminal displays iteration progress (e.g., "Iteration 2/10: Running tests…") readable from the PTY output buffer.
117
+
118
+ ---
119
+
120
+ ### User Story 6 — Workshop-to-Dev Transition Clarity (Priority: P3)
121
+
122
+ As a facilitator completing the Plan phase in `sofia workshop`, I want a clear indication of how to proceed to PoC development — whether via an automatic transition or explicit guidance to run `sofia dev` — so that the workflow feels intentional rather than abandoned after planning.
123
+
124
+ **Why this priority**: The current boundary prompt in the workshop only captures PoC intent without invoking the Ralph Loop. Users may not realize they need to run a separate command. However, the two-command workflow may be intentional for separation of concerns.
125
+
126
+ **Independent Test**: Complete all workshop phases through Plan, verify the workshop provides clear next-step guidance including the exact `sofia dev` command to run with the session ID.
127
+
128
+ **Acceptance Scenarios**:
129
+
130
+ 1. **Given** a workshop session completing the Plan phase, **When** the plan is finalized, **Then** the workshop displays the exact `sofia dev --session <id>` command to run next, along with a brief explanation of what it does.
131
+ 2. **Given** a workshop session completing the Plan phase, **When** the user is in interactive mode, **Then** the workshop offers: (a) automatically start development, or (b) save the session and exit with the `sofia dev` command displayed.
132
+
133
+ ---
134
+
135
+ ### Edge Cases
136
+
137
+ - What if the output directory exists but has been manually modified? Resume should detect file integrity via the `.sofia-metadata.json` marker and warn if unexpected changes are found.
138
+ - What if `poc.iterations` is corrupted or has invalid entries? The resume logic should validate iteration data and fall back to starting fresh if integrity checks fail.
139
+ - What if the user interrupts during npm install on a resumed session? The system should handle a partial `node_modules` gracefully by always re-running npm install on resume (FR-003) rather than trying to detect an incomplete install.
140
+ - How should the template registry handle unknown plan architectures? Fall back to the default `node-ts-vitest` template with a logged warning.
141
+ - What if a PTY E2E test environment doesn't support PTY allocation (e.g., some CI runners)? Tests must skip gracefully with a clear skip message.
142
+
143
+ ## Requirements _(mandatory)_
144
+
145
+ ### Functional Requirements
146
+
147
+ #### Resume/Checkpoint (GAP-006)
148
+
149
+ - **FR-001**: `RalphLoop.run()` MUST check `session.poc.iterations` at startup. If iterations exist and `session.poc.finalStatus` is unset, it MUST resume from the next iteration number rather than starting from scratch.
150
+ - **FR-001a**: When resuming, if the last recorded iteration has no test results (indicating it was interrupted mid-execution), the system MUST re-run that iteration from the last known-good state. Only fully completed iterations (with test results recorded) are considered done.
151
+ - **FR-002**: When resuming, the Ralph Loop MUST skip scaffolding if the output directory exists and contains a valid `.sofia-metadata.json` marker.
152
+ - **FR-003**: When resuming, the Ralph Loop MUST always re-run the dependency installation step (e.g., `npm install`). This is idempotent when dependencies haven't changed and avoids stale dependency issues when a prior iteration added packages before an interruption.
153
+ - **FR-004**: When resuming, the Ralph Loop MUST include prior iteration history (test results, applied changes) in the LLM prompt context so the model understands what has already been tried.
154
+ - **FR-005**: If `session.poc.finalStatus` is `'success'`, the CLI MUST display a completion message and exit without invoking the Ralph Loop.
155
+ - **FR-006**: If `session.poc.finalStatus` is `'failed'` or `'partial'`, the CLI MUST default to resuming from the last iteration and allow the user to override with `--force`.
156
+ - **FR-007**: If the output directory is missing but iterations exist in the session, the system MUST re-scaffold (using the original plan context) and resume iteration numbering from where it left off.
157
+ - **FR-007a**: All resume decisions MUST be logged at info level (visible by default), including: which iteration is being resumed from, whether an incomplete iteration is being re-run, whether scaffolding is being skipped, and that npm install is being re-run. These messages MUST be visible to the user without requiring `--debug`.
158
+
159
+ #### `--force` Flag (GAP-007)
160
+
161
+ - **FR-008**: When `--force` is set, the command handler MUST delete the existing output directory AND reset `session.poc.iterations` to an empty array AND clear `session.poc.finalStatus`.
162
+ - **FR-009**: After a force-reset, the Ralph Loop MUST start fresh from iteration 1 as if the session had never been developed.
163
+ - **FR-010**: The `--force` flag MUST work regardless of the current `poc.finalStatus` value (including `'success'`).
164
+
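The startup rules in FR-001 through FR-010 can be read as a small decision function. This is one possible encoding, not the package's implementation: `decideDevAction`, the `DevAction` labels, and the `outputDirValid` flag (output directory exists with a valid `.sofia-metadata.json`) are hypothetical. The interactive "resume or start fresh" prompt from the acceptance scenarios is left out; resume is simply the default.

```typescript
type FinalStatus = 'success' | 'failed' | 'partial';

interface PocSnapshot {
  iterations: unknown[];
  finalStatus?: FinalStatus;
}

type DevAction = 'fresh' | 'resume' | 'rescaffold-and-resume' | 'already-complete';

function decideDevAction(
  poc: PocSnapshot | undefined,
  opts: { force: boolean; outputDirValid: boolean },
): DevAction {
  // FR-008 to FR-010: --force always restarts, whatever the final status.
  if (opts.force) return 'fresh';
  // FR-005: a successful PoC exits without invoking the Ralph Loop.
  if (poc?.finalStatus === 'success') return 'already-complete';
  // No recorded iterations: nothing to resume.
  if (poc === undefined || poc.iterations.length === 0) return 'fresh';
  // FR-001, FR-002, FR-006: resume when the scaffold is still in place.
  // FR-007: otherwise re-scaffold but keep the iteration numbering.
  return opts.outputDirValid ? 'resume' : 'rescaffold-and-resume';
}
```

In either resume branch, FR-003 still applies, so dependency installation is re-run before the next iteration.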
165
+ #### Template Registry (GAP-009)
166
+
167
+ - **FR-011**: The scaffolder MUST use a template registry that maps plan characteristics (language, framework) to scaffold templates.
168
+ - **FR-012**: The template registry MUST include at least two templates: `node-ts-vitest` (TypeScript/Node.js/Vitest, current default) and `python-pytest` (Python/pytest).
169
+ - **FR-013**: Template selection MUST be automatic based on the plan's `architectureNotes` and `dependencies`, with `node-ts-vitest` as the fallback default.
170
+ - **FR-014**: Each template MUST define: file list, `TechStack` configuration (language, runtime, test runner command, build command, dependency install command), and test execution command. Both the install and test commands are part of the template — the test runner MUST NOT auto-detect them.
171
+ - **FR-015**: Adding a new template MUST only require adding a registry entry — no changes to `PocScaffolder`'s core logic or `RalphLoop`'s iteration logic.
172
+
173
+ #### TestRunner Coverage (GAP-008)
174
+
175
+ - **FR-016**: Integration tests MUST cover the `spawnTests()` method including: successful test execution, timeout handling (SIGTERM then SIGKILL), and stderr collection.
176
+ - **FR-017**: Integration tests MUST cover the `extractJson()` fallback path where line-by-line parsing fails and the first-`{`-to-last-`}` slice is used.
177
+ - **FR-018**: Integration tests MUST cover the `buildErrorResult()` timeout path.
178
+ - **FR-019**: TestRunner integration tests MUST use a real test fixture project (minimal Vitest/pytest project) rather than mocking the subprocess.
179
+
180
+ #### Workshop→Dev Transition (GAP-011)
181
+
182
+ - **FR-020**: When the workshop Plan phase completes, the system MUST display the exact `sofia dev --session <id>` command needed to start PoC development.
183
+ - **FR-021**: In interactive mode, the workshop SHOULD offer to automatically transition to the development phase.
184
+
185
+ #### Scaffold TODO Tracking (GAP-009, P3)
186
+
187
+ - **FR-022**: Generated scaffold files containing intentional TODO markers MUST be tracked via `.sofia-metadata.json` so that the Ralph Loop can report how many TODOs remain at the end of each iteration.
188
+
189
+ ### Key Entities
190
+
191
+ - **CheckpointState**: Represents the resume context derived from existing `session.poc`. It includes the last iteration number, whether scaffolding can be skipped (dependency installation always re-runs, per FR-003), and prior iteration history for LLM context.
192
+ - **TemplateRegistry**: Maps plan characteristics to scaffold templates. Contains named template entries with file lists, tech stack configuration, and installation commands.
193
+ - **TemplateEntry**: A single scaffold template definition — includes template name (e.g., `node-ts-vitest`, `python-pytest`), file generators, TechStack shape, and install command.
194
+ - **TestFixtureProject**: A minimal project used by testRunner integration tests — contains a `package.json`, a passing test, a failing test, and a hanging test for timeout validation.
195
+
196
+ ## Success Criteria _(mandatory)_
197
+
198
+ ### Measurable Outcomes
199
+
200
+ - **SC-004-001**: A `sofia dev --session X` run after an interruption resumes from the correct iteration number (e.g., iteration 3 if 2 were completed), measured by verifying the iteration counter in session state and the absence of scaffolding logs.
201
+ - **SC-004-002**: `sofia dev --session X --force` resets both the output directory and session `poc.iterations`/`poc.finalStatus`, measured by verifying empty iteration state after force.
202
+ - **SC-004-003**: The scaffolder produces a valid Python/pytest project when the plan specifies Python/FastAPI, measured by generating `requirements.txt`, `main.py`, and `pytest`-based tests that pass basic syntax validation.
203
+ - **SC-004-004**: `testRunner.ts` test coverage increases from 45% to at least 80%, measured by the coverage report.
204
+ - **SC-004-005**: A resumed Ralph Loop session reaches the same or better PoC quality (pass rate) as a fresh run, measured by comparing test pass counts between resumed and fresh runs on the same plan.
205
+ - **SC-004-006**: The workshop displays actionable next-step guidance (including the exact command) when the Plan phase completes, measured by verifying the output contains the session ID and `sofia dev` command.
206
+ - **SC-004-007**: Resume detection adds less than 500ms overhead to the `sofia dev` startup time, measured by comparing startup times with and without existing iterations.
207
+
208
+ ## Assumptions
209
+
210
+ - Feature 002 session schema (`poc.iterations`, `poc.finalStatus`) is stable and does not require migration for resume support — only reading existing fields that are currently written but never read back.
211
+ - The `.sofia-metadata.json` file written by the scaffolder is a reliable marker for detecting existing scaffold output.
212
+ - npm install is always re-run on resume since it's idempotent (fast no-op when dependencies match) and avoids hard-to-diagnose stale dependency issues from interrupted iterations.
213
+ - Python/FastAPI is the highest-value second template based on user demand and workshop feedback.
214
+ - PTY allocation is available in the CI environment for E2E tests; tests skip gracefully if PTY is unavailable.
215
+ - The two-command workflow (`workshop` then `dev`) is the intentional default; auto-transition in interactive mode is optional behavior.
216
+
217
+ ## Dependencies
218
+
219
+ - **Feature 001**: Session model, workshop phases
220
+ - **Feature 002**: Ralph Loop, `--force` CLI option, testRunner, PocScaffolder, session schemas
221
+ - **Feature 003**: MCP transport layer (resume should work with both stub and real MCP; template registry should support templates for MCP-enabled vs local-only PoCs)