specweave 0.4.1 → 0.6.0

Files changed (392)
  1. package/.claude-plugin/README.md +325 -0
  2. package/.claude-plugin/marketplace.json +210 -0
  3. package/CLAUDE.md +871 -596
  4. package/README.md +188 -137
  5. package/bin/install-agents.sh +1 -1
  6. package/bin/install-commands.sh +66 -14
  7. package/bin/install-hooks.sh +1 -1
  8. package/bin/install-skills.sh +1 -1
  9. package/bin/specweave.js +2 -0
  10. package/dist/adapters/claude/adapter.d.ts +49 -11
  11. package/dist/adapters/claude/adapter.d.ts.map +1 -1
  12. package/dist/adapters/claude/adapter.js +175 -42
  13. package/dist/adapters/claude/adapter.js.map +1 -1
  14. package/dist/adapters/copilot/adapter.d.ts +20 -2
  15. package/dist/adapters/copilot/adapter.d.ts.map +1 -1
  16. package/dist/adapters/copilot/adapter.js +117 -7
  17. package/dist/adapters/copilot/adapter.js.map +1 -1
  18. package/dist/adapters/cursor/adapter.d.ts +18 -0
  19. package/dist/adapters/cursor/adapter.d.ts.map +1 -1
  20. package/dist/adapters/cursor/adapter.js +55 -3
  21. package/dist/adapters/cursor/adapter.js.map +1 -1
  22. package/dist/adapters/generic/adapter.d.ts +18 -0
  23. package/dist/adapters/generic/adapter.d.ts.map +1 -1
  24. package/dist/adapters/generic/adapter.js +55 -3
  25. package/dist/adapters/generic/adapter.js.map +1 -1
  26. package/dist/cli/commands/init.d.ts +1 -0
  27. package/dist/cli/commands/init.d.ts.map +1 -1
  28. package/dist/cli/commands/init.js +346 -124
  29. package/dist/cli/commands/init.js.map +1 -1
  30. package/dist/cli/commands/install.d.ts +2 -0
  31. package/dist/cli/commands/install.d.ts.map +1 -1
  32. package/dist/cli/commands/install.js +28 -25
  33. package/dist/cli/commands/install.js.map +1 -1
  34. package/dist/cli/commands/list.d.ts +2 -0
  35. package/dist/cli/commands/list.d.ts.map +1 -1
  36. package/dist/cli/commands/list.js +26 -24
  37. package/dist/cli/commands/list.js.map +1 -1
  38. package/dist/cli/commands/plugin.d.ts +7 -1
  39. package/dist/cli/commands/plugin.d.ts.map +1 -1
  40. package/dist/cli/commands/plugin.js +72 -61
  41. package/dist/cli/commands/plugin.js.map +1 -1
  42. package/dist/core/i18n/language-detector.d.ts +29 -0
  43. package/dist/core/i18n/language-detector.d.ts.map +1 -0
  44. package/dist/core/i18n/language-detector.js +143 -0
  45. package/dist/core/i18n/language-detector.js.map +1 -0
  46. package/dist/core/i18n/language-manager.d.ts +101 -0
  47. package/dist/core/i18n/language-manager.d.ts.map +1 -0
  48. package/dist/core/i18n/language-manager.js +232 -0
  49. package/dist/core/i18n/language-manager.js.map +1 -0
  50. package/dist/core/i18n/language-registry.d.ts +44 -0
  51. package/dist/core/i18n/language-registry.d.ts.map +1 -0
  52. package/dist/core/i18n/language-registry.js +234 -0
  53. package/dist/core/i18n/language-registry.js.map +1 -0
  54. package/dist/core/i18n/locale-manager.d.ts +62 -0
  55. package/dist/core/i18n/locale-manager.d.ts.map +1 -0
  56. package/dist/core/i18n/locale-manager.js +137 -0
  57. package/dist/core/i18n/locale-manager.js.map +1 -0
  58. package/dist/core/i18n/system-prompt-injector.d.ts +33 -0
  59. package/dist/core/i18n/system-prompt-injector.d.ts.map +1 -0
  60. package/dist/core/i18n/system-prompt-injector.js +131 -0
  61. package/dist/core/i18n/system-prompt-injector.js.map +1 -0
  62. package/dist/core/i18n/types.d.ts +151 -0
  63. package/dist/core/i18n/types.d.ts.map +1 -0
  64. package/dist/core/i18n/types.js +11 -0
  65. package/dist/core/i18n/types.js.map +1 -0
  66. package/dist/core/increment-status.d.ts +72 -0
  67. package/dist/core/increment-status.d.ts.map +1 -0
  68. package/dist/core/increment-status.js +227 -0
  69. package/dist/core/increment-status.js.map +1 -0
  70. package/dist/core/plugin-loader.d.ts +33 -13
  71. package/dist/core/plugin-loader.d.ts.map +1 -1
  72. package/dist/core/plugin-loader.js +145 -43
  73. package/dist/core/plugin-loader.js.map +1 -1
  74. package/dist/core/types/config.d.ts +51 -0
  75. package/dist/core/types/config.d.ts.map +1 -0
  76. package/dist/core/types/config.js +21 -0
  77. package/dist/core/types/config.js.map +1 -0
  78. package/dist/core/types/plugin.d.ts +73 -42
  79. package/dist/core/types/plugin.d.ts.map +1 -1
  80. package/dist/core/types/plugin.js +4 -3
  81. package/dist/core/types/plugin.js.map +1 -1
  82. package/dist/hooks/lib/sync-living-docs.d.ts +27 -0
  83. package/dist/hooks/lib/sync-living-docs.d.ts.map +1 -0
  84. package/dist/hooks/lib/sync-living-docs.js +116 -0
  85. package/dist/hooks/lib/sync-living-docs.js.map +1 -0
  86. package/dist/hooks/lib/translate-living-docs.d.ts +13 -0
  87. package/dist/hooks/lib/translate-living-docs.d.ts.map +1 -0
  88. package/dist/hooks/lib/translate-living-docs.js +166 -0
  89. package/dist/hooks/lib/translate-living-docs.js.map +1 -0
  90. package/dist/hooks/lib/update-tasks-md.d.ts +29 -0
  91. package/dist/hooks/lib/update-tasks-md.d.ts.map +1 -0
  92. package/dist/hooks/lib/update-tasks-md.js +203 -0
  93. package/dist/hooks/lib/update-tasks-md.js.map +1 -0
  94. package/dist/integrations/jira/jira-incremental-mapper.js.map +1 -1
  95. package/dist/integrations/jira/jira-mapper.js.map +1 -1
  96. package/dist/locales/de/.gitkeep +0 -0
  97. package/dist/locales/de/cli.json +108 -0
  98. package/dist/locales/en/cli.json +269 -0
  99. package/dist/locales/en/errors.json +7 -0
  100. package/dist/locales/en/templates.json +6 -0
  101. package/dist/locales/es/.gitkeep +0 -0
  102. package/dist/locales/es/cli.json +41 -0
  103. package/dist/locales/fr/.gitkeep +0 -0
  104. package/dist/locales/fr/cli.json +108 -0
  105. package/dist/locales/ja/.gitkeep +0 -0
  106. package/dist/locales/ja/cli.json +108 -0
  107. package/dist/locales/ko/.gitkeep +0 -0
  108. package/dist/locales/ko/cli.json +108 -0
  109. package/dist/locales/pt/.gitkeep +0 -0
  110. package/dist/locales/pt/cli.json +108 -0
  111. package/dist/locales/ru/.gitkeep +0 -0
  112. package/dist/locales/ru/cli.json +269 -0
  113. package/dist/locales/zh/.gitkeep +0 -0
  114. package/dist/locales/zh/cli.json +108 -0
  115. package/dist/plugins/specweave-github/lib/github-client.d.ts +86 -0
  116. package/dist/plugins/specweave-github/lib/github-client.d.ts.map +1 -0
  117. package/dist/plugins/specweave-github/lib/github-client.js +275 -0
  118. package/dist/plugins/specweave-github/lib/github-client.js.map +1 -0
  119. package/dist/plugins/specweave-github/lib/index.d.ts +10 -0
  120. package/dist/plugins/specweave-github/lib/index.d.ts.map +1 -0
  121. package/dist/plugins/specweave-github/lib/index.js +10 -0
  122. package/dist/plugins/specweave-github/lib/index.js.map +1 -0
  123. package/dist/plugins/specweave-github/lib/subtask-sync.d.ts +51 -0
  124. package/dist/plugins/specweave-github/lib/subtask-sync.d.ts.map +1 -0
  125. package/dist/plugins/specweave-github/lib/subtask-sync.js +147 -0
  126. package/dist/plugins/specweave-github/lib/subtask-sync.js.map +1 -0
  127. package/dist/plugins/specweave-github/lib/task-parser.d.ts +37 -0
  128. package/dist/plugins/specweave-github/lib/task-parser.d.ts.map +1 -0
  129. package/dist/plugins/specweave-github/lib/task-parser.js +211 -0
  130. package/dist/plugins/specweave-github/lib/task-parser.js.map +1 -0
  131. package/dist/plugins/specweave-github/lib/task-sync.d.ts +51 -0
  132. package/dist/plugins/specweave-github/lib/task-sync.d.ts.map +1 -0
  133. package/dist/plugins/specweave-github/lib/task-sync.js +332 -0
  134. package/dist/plugins/specweave-github/lib/task-sync.js.map +1 -0
  135. package/dist/plugins/specweave-github/lib/types.d.ts +80 -0
  136. package/dist/plugins/specweave-github/lib/types.d.ts.map +1 -0
  137. package/dist/plugins/specweave-github/lib/types.js +5 -0
  138. package/dist/plugins/specweave-github/lib/types.js.map +1 -0
  139. package/dist/utils/agents-md-compiler.d.ts +68 -0
  140. package/dist/utils/agents-md-compiler.d.ts.map +1 -0
  141. package/dist/utils/agents-md-compiler.js +420 -0
  142. package/dist/utils/agents-md-compiler.js.map +1 -0
  143. package/dist/utils/generate-skills-index.js +4 -4
  144. package/dist/utils/generate-skills-index.js.map +1 -1
  145. package/package.json +12 -13
  146. package/plugins/specweave-ado/.claude-plugin/plugin.json +8 -0
  147. package/plugins/specweave-alternatives/.claude-plugin/plugin.json +8 -0
  148. package/plugins/specweave-alternatives/skills/bmad-method-expert/SKILL.md +626 -0
  149. package/plugins/specweave-alternatives/skills/bmad-method-expert/scripts/analyze-project.js +318 -0
  150. package/plugins/specweave-alternatives/skills/bmad-method-expert/scripts/check-setup.js +208 -0
  151. package/plugins/specweave-alternatives/skills/bmad-method-expert/scripts/generate-template.js +1149 -0
  152. package/plugins/specweave-alternatives/skills/bmad-method-expert/scripts/validate-documents.js +340 -0
  153. package/plugins/specweave-alternatives/skills/spec-kit-expert/SKILL.md +1010 -0
  154. package/plugins/specweave-backend/.claude-plugin/plugin.json +8 -0
  155. package/plugins/specweave-core/.claude-plugin/plugin.json +25 -0
  156. package/{src → plugins/specweave-core}/agents/pm/AGENT.md +80 -0
  157. package/plugins/specweave-core/agents/translator/AGENT.md +282 -0
  158. package/{src → plugins/specweave-core}/commands/README.md +11 -11
  159. package/{src → plugins/specweave-core}/commands/specweave.costs.md +7 -7
  160. package/{src → plugins/specweave-core}/commands/specweave.do.md +34 -7
  161. package/{src → plugins/specweave-core}/commands/specweave.increment.md +83 -18
  162. package/{src → plugins/specweave-core}/commands/specweave.md +49 -17
  163. package/{src → plugins/specweave-core}/commands/specweave.sync-docs.md +5 -5
  164. package/plugins/specweave-core/commands/specweave.translate.md +425 -0
  165. package/{src → plugins/specweave-core}/commands/specweave.validate.md +1 -1
  166. package/plugins/specweave-core/hooks/hooks.json +13 -0
  167. package/plugins/specweave-core/hooks/post-task-completion.sh +265 -0
  168. package/plugins/specweave-core/skills/SKILLS-INDEX.md +229 -0
  169. package/{src → plugins/specweave-core}/skills/brownfield-analyzer/SKILL.md +66 -24
  170. package/{src → plugins/specweave-core}/skills/context-loader/SKILL.md +1 -1
  171. package/plugins/specweave-core/skills/context-optimizer/SKILL.md +588 -0
  172. package/plugins/specweave-core/skills/docs-updater/SKILL.md +0 -0
  173. package/{src → plugins/specweave-core}/skills/increment-planner/SKILL.md +81 -4
  174. package/plugins/specweave-core/skills/plugin-detector/SKILL.md +211 -0
  175. package/{src → plugins/specweave-core}/skills/project-kickstarter/SKILL.md +7 -7
  176. package/plugins/specweave-core/skills/rfc-generator/SKILL.md +369 -0
  177. package/{src → plugins/specweave-core}/skills/specweave-detector/SKILL.md +2 -2
  178. package/plugins/specweave-core/skills/specweave-framework/SKILL.md +498 -0
  179. package/plugins/specweave-core/skills/specweave-framework/test-cases/test-1-increment-naming.yaml +11 -0
  180. package/plugins/specweave-core/skills/specweave-framework/test-cases/test-2-source-of-truth.yaml +11 -0
  181. package/plugins/specweave-core/skills/specweave-framework/test-cases/test-3-increment-discipline.yaml +12 -0
  182. package/plugins/specweave-core/skills/specweave-framework/test-cases/test-4-file-placement.yaml +11 -0
  183. package/{src → plugins/specweave-core}/skills/tdd-workflow/SKILL.md +20 -20
  184. package/plugins/specweave-core/skills/translator/SKILL.md +172 -0
  185. package/plugins/specweave-cost-optimizer/.claude-plugin/plugin.json +8 -0
  186. package/plugins/specweave-diagrams/.claude-plugin/plugin.json +8 -0
  187. package/plugins/specweave-docs/.claude-plugin/plugin.json +8 -0
  188. package/plugins/specweave-docs/skills/docusaurus/SKILL.md +526 -0
  189. package/plugins/specweave-figma/.claude-plugin/.mcp.json +12 -0
  190. package/plugins/specweave-figma/.claude-plugin/plugin.json +8 -0
  191. package/plugins/specweave-figma/ARCHITECTURE.md +453 -0
  192. package/plugins/specweave-figma/README.md +728 -0
  193. package/plugins/specweave-figma/skills/figma-to-code/SKILL.md +632 -0
  194. package/plugins/specweave-figma/skills/figma-to-code/test-1-token-generation.yaml +29 -0
  195. package/plugins/specweave-figma/skills/figma-to-code/test-2-component-generation.yaml +27 -0
  196. package/plugins/specweave-figma/skills/figma-to-code/test-3-typescript-generation.yaml +28 -0
  197. package/plugins/specweave-frontend/.claude-plugin/plugin.json +8 -0
  198. package/plugins/specweave-github/.claude-plugin/plugin.json +8 -0
  199. package/plugins/specweave-github/agents/github-manager/AGENT.md +651 -0
  200. package/plugins/specweave-github/commands/github-close-issue.md +418 -0
  201. package/plugins/specweave-github/commands/github-create-issue.md +307 -0
  202. package/plugins/specweave-github/commands/github-status.md +533 -0
  203. package/plugins/specweave-github/commands/github-sync-tasks.md +530 -0
  204. package/plugins/specweave-github/commands/github-sync.md +443 -0
  205. package/plugins/specweave-github/lib/github-client.ts +330 -0
  206. package/plugins/specweave-github/lib/index.ts +10 -0
  207. package/plugins/specweave-github/lib/subtask-sync.ts +225 -0
  208. package/plugins/specweave-github/lib/task-parser.ts +246 -0
  209. package/plugins/specweave-github/lib/task-sync.ts +402 -0
  210. package/plugins/specweave-github/lib/types.ts +86 -0
  211. package/plugins/specweave-github/skills/github-issue-tracker/SKILL.md +497 -0
  212. package/plugins/specweave-github/skills/github-sync/SKILL.md +461 -0
  213. package/plugins/specweave-infrastructure/.claude-plugin/plugin.json +8 -0
  214. package/plugins/specweave-jira/.claude-plugin/plugin.json +8 -0
  215. package/{src → plugins/specweave-jira}/commands/specweave.sync-jira.md +18 -18
  216. package/plugins/specweave-kubernetes/.claude-plugin/plugin.json +8 -0
  217. package/plugins/specweave-ml/.claude-plugin/plugin.json +39 -0
  218. package/plugins/specweave-ml/README.md +885 -0
  219. package/plugins/specweave-ml/agents/ml-engineer/AGENT.md +402 -0
  220. package/plugins/specweave-ml/commands/ml-deploy.md +116 -0
  221. package/plugins/specweave-ml/commands/ml-evaluate.md +87 -0
  222. package/plugins/specweave-ml/commands/ml-explain.md +83 -0
  223. package/plugins/specweave-ml/skills/anomaly-detector/SKILL.md +559 -0
  224. package/plugins/specweave-ml/skills/automl-optimizer/SKILL.md +485 -0
  225. package/plugins/specweave-ml/skills/cv-pipeline-builder/SKILL.md +157 -0
  226. package/plugins/specweave-ml/skills/data-visualizer/SKILL.md +521 -0
  227. package/plugins/specweave-ml/skills/experiment-tracker/SKILL.md +535 -0
  228. package/plugins/specweave-ml/skills/feature-engineer/SKILL.md +566 -0
  229. package/plugins/specweave-ml/skills/ml-deployment-helper/SKILL.md +345 -0
  230. package/plugins/specweave-ml/skills/ml-pipeline-orchestrator/SKILL.md +518 -0
  231. package/plugins/specweave-ml/skills/model-evaluator/SKILL.md +155 -0
  232. package/plugins/specweave-ml/skills/model-explainer/SKILL.md +227 -0
  233. package/plugins/specweave-ml/skills/model-registry/SKILL.md +541 -0
  234. package/plugins/specweave-ml/skills/nlp-pipeline-builder/SKILL.md +180 -0
  235. package/plugins/specweave-ml/skills/time-series-forecaster/SKILL.md +569 -0
  236. package/plugins/specweave-payments/.claude-plugin/plugin.json +8 -0
  237. package/plugins/specweave-testing/.claude-plugin/plugin.json +8 -0
  238. package/plugins/specweave-tooling/.claude-plugin/plugin.json +8 -0
  239. package/plugins/specweave-ui/.claude-plugin/plugin.json +106 -0
  240. package/plugins/specweave-ui/.mcp.json +14 -0
  241. package/plugins/specweave-ui/README.md +386 -0
  242. package/src/adapters/claude/adapter.ts +193 -46
  243. package/src/adapters/copilot/adapter.ts +132 -7
  244. package/src/adapters/cursor/adapter.ts +62 -3
  245. package/src/adapters/generic/adapter.ts +62 -3
  246. package/src/templates/AGENTS.md.template +170 -1
  247. package/src/templates/CLAUDE.md.template +122 -24
  248. package/src/templates/tasks.md.template +261 -0
  249. package/src/agents/ml-engineer/AGENT.md +0 -150
  250. package/src/commands/specweave.sync-github.md +0 -269
  251. package/src/hooks/post-task-completion.sh +0 -121
  252. package/src/skills/SKILLS-INDEX.md +0 -444
  253. package/src/skills/github-sync/SKILL.md +0 -234
  254. /package/{src → plugins/specweave-ado}/skills/ado-sync/README.md +0 -0
  255. /package/{src → plugins/specweave-ado}/skills/ado-sync/SKILL.md +0 -0
  256. /package/{src → plugins/specweave-ado}/skills/specweave-ado-mapper/SKILL.md +0 -0
  257. /package/{src → plugins/specweave-backend}/agents/database-optimizer/AGENT.md +0 -0
  258. /package/{src → plugins/specweave-backend}/skills/dotnet-backend/SKILL.md +0 -0
  259. /package/{src → plugins/specweave-backend}/skills/nodejs-backend/SKILL.md +0 -0
  260. /package/{src → plugins/specweave-backend}/skills/python-backend/SKILL.md +0 -0
  261. /package/{src → plugins/specweave-core}/agents/architect/AGENT.md +0 -0
  262. /package/{src → plugins/specweave-core}/agents/code-reviewer.md +0 -0
  263. /package/{src → plugins/specweave-core}/agents/docs-writer/AGENT.md +0 -0
  264. /package/{src → plugins/specweave-core}/agents/performance/AGENT.md +0 -0
  265. /package/{src → plugins/specweave-core}/agents/qa-lead/AGENT.md +0 -0
  266. /package/{src → plugins/specweave-core}/agents/security/AGENT.md +0 -0
  267. /package/{src → plugins/specweave-core}/agents/tdd-orchestrator/AGENT.md +0 -0
  268. /package/{src → plugins/specweave-core}/agents/tech-lead/AGENT.md +0 -0
  269. /package/{src → plugins/specweave-core}/commands/specweave.done.md +0 -0
  270. /package/{src → plugins/specweave-core}/commands/specweave.inc.md +0 -0
  271. /package/{src → plugins/specweave-core}/commands/specweave.list-increments.md +0 -0
  272. /package/{src → plugins/specweave-core}/commands/specweave.next.md +0 -0
  273. /package/{src → plugins/specweave-core}/commands/specweave.progress.md +0 -0
  274. /package/{src → plugins/specweave-core}/commands/specweave.tdd-cycle.md +0 -0
  275. /package/{src → plugins/specweave-core}/commands/specweave.tdd-green.md +0 -0
  276. /package/{src → plugins/specweave-core}/commands/specweave.tdd-red.md +0 -0
  277. /package/{src → plugins/specweave-core}/commands/specweave.tdd-refactor.md +0 -0
  278. /package/{src → plugins/specweave-core}/hooks/README.md +0 -0
  279. /package/{src → plugins/specweave-core}/hooks/docs-changed.sh +0 -0
  280. /package/{src → plugins/specweave-core}/hooks/human-input-required.sh +0 -0
  281. /package/{src → plugins/specweave-core}/hooks/post-increment-plugin-detect.sh +0 -0
  282. /package/{src → plugins/specweave-core}/hooks/pre-implementation.sh +0 -0
  283. /package/{src → plugins/specweave-core}/hooks/pre-task-plugin-detect.sh +0 -0
  284. /package/{src → plugins/specweave-core}/skills/brownfield-onboarder/SKILL.md +0 -0
  285. /package/{src → plugins/specweave-core}/skills/docs-updater/README.md +0 -0
  286. /package/{src → plugins/specweave-core}/skills/increment-planner/scripts/feature-utils.js +0 -0
  287. /package/{src → plugins/specweave-core}/skills/increment-quality-judge/SKILL.md +0 -0
  288. /package/{src → plugins/specweave-core}/skills/project-kickstarter/test-cases/test-1-high-confidence-full-product.yaml +0 -0
  289. /package/{src → plugins/specweave-core}/skills/project-kickstarter/test-cases/test-2-medium-confidence-partial.yaml +0 -0
  290. /package/{src → plugins/specweave-core}/skills/project-kickstarter/test-cases/test-3-low-confidence-technical-question.yaml +0 -0
  291. /package/{src → plugins/specweave-core}/skills/project-kickstarter/test-cases/test-4-opt-out-explicit.yaml +0 -0
  292. /package/{src → plugins/specweave-core}/skills/role-orchestrator/README.md +0 -0
  293. /package/{src → plugins/specweave-core}/skills/role-orchestrator/SKILL.md +0 -0
  294. /package/{src → plugins/specweave-core}/skills/task-builder/README.md +0 -0
  295. /package/{src → plugins/specweave-cost-optimizer}/skills/cost-optimizer/SKILL.md +0 -0
  296. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/AGENT.md +0 -0
  297. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/c4-component-template.mmd +0 -0
  298. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/c4-container-template.mmd +0 -0
  299. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/c4-context-template.mmd +0 -0
  300. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/deployment-template.mmd +0 -0
  301. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/er-diagram-template.mmd +0 -0
  302. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/templates/sequence-template.mmd +0 -0
  303. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/test-cases/test-1-c4-context.yaml +0 -0
  304. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/test-cases/test-2-sequence.yaml +0 -0
  305. /package/{src → plugins/specweave-diagrams}/agents/diagrams-architect/test-cases/test-3-er-diagram.yaml +0 -0
  306. /package/{src → plugins/specweave-diagrams}/skills/diagrams-architect/SKILL.md +0 -0
  307. /package/{src → plugins/specweave-diagrams}/skills/diagrams-generator/SKILL.md +0 -0
  308. /package/{src → plugins/specweave-docs}/skills/spec-driven-brainstorming/README.md +0 -0
  309. /package/{src → plugins/specweave-docs}/skills/spec-driven-brainstorming/SKILL.md +0 -0
  310. /package/{src → plugins/specweave-docs}/skills/spec-driven-debugging/README.md +0 -0
  311. /package/{src → plugins/specweave-docs}/skills/spec-driven-debugging/SKILL.md +0 -0
  312. /package/{src → plugins/specweave-frontend}/skills/design-system-architect/SKILL.md +0 -0
  313. /package/{src → plugins/specweave-frontend}/skills/frontend/SKILL.md +0 -0
  314. /package/{src → plugins/specweave-frontend}/skills/nextjs/SKILL.md +0 -0
  315. /package/{src → plugins/specweave-infrastructure}/agents/devops/AGENT.md +0 -0
  316. /package/{src → plugins/specweave-infrastructure}/agents/network-engineer/AGENT.md +0 -0
  317. /package/{src → plugins/specweave-infrastructure}/agents/observability-engineer/AGENT.md +0 -0
  318. /package/{src → plugins/specweave-infrastructure}/agents/performance-engineer/AGENT.md +0 -0
  319. /package/{src → plugins/specweave-infrastructure}/agents/sre/AGENT.md +0 -0
  320. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/backend-diagnostics.md +0 -0
  321. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/database-diagnostics.md +0 -0
  322. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/infrastructure.md +0 -0
  323. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/monitoring.md +0 -0
  324. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/security-incidents.md +0 -0
  325. /package/{src → plugins/specweave-infrastructure}/agents/sre/modules/ui-diagnostics.md +0 -0
  326. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/01-high-cpu-usage.md +0 -0
  327. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/02-database-deadlock.md +0 -0
  328. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/03-memory-leak.md +0 -0
  329. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/04-slow-api-response.md +0 -0
  330. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/05-ddos-attack.md +0 -0
  331. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/06-disk-full.md +0 -0
  332. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/07-service-down.md +0 -0
  333. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/08-data-corruption.md +0 -0
  334. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/09-cascade-failure.md +0 -0
  335. /package/{src → plugins/specweave-infrastructure}/agents/sre/playbooks/10-rate-limit-exceeded.md +0 -0
  336. /package/{src → plugins/specweave-infrastructure}/agents/sre/scripts/health-check.sh +0 -0
  337. /package/{src → plugins/specweave-infrastructure}/agents/sre/scripts/log-analyzer.py +0 -0
  338. /package/{src → plugins/specweave-infrastructure}/agents/sre/scripts/metrics-collector.sh +0 -0
  339. /package/{src → plugins/specweave-infrastructure}/agents/sre/scripts/trace-analyzer.js +0 -0
  340. /package/{src → plugins/specweave-infrastructure}/agents/sre/templates/incident-report.md +0 -0
  341. /package/{src → plugins/specweave-infrastructure}/agents/sre/templates/mitigation-plan.md +0 -0
  342. /package/{src → plugins/specweave-infrastructure}/agents/sre/templates/post-mortem.md +0 -0
  343. /package/{src → plugins/specweave-infrastructure}/agents/sre/templates/runbook-template.md +0 -0
  344. /package/{src → plugins/specweave-infrastructure}/commands/specweave.monitor-setup.md +0 -0
  345. /package/{src → plugins/specweave-infrastructure}/commands/specweave.slo-implement.md +0 -0
  346. /package/{src → plugins/specweave-infrastructure}/skills/distributed-tracing/SKILL.md +0 -0
  347. /package/{src → plugins/specweave-infrastructure}/skills/grafana-dashboards/SKILL.md +0 -0
  348. /package/{src → plugins/specweave-infrastructure}/skills/hetzner-provisioner/README.md +0 -0
  349. /package/{src → plugins/specweave-infrastructure}/skills/hetzner-provisioner/SKILL.md +0 -0
  350. /package/{src → plugins/specweave-infrastructure}/skills/prometheus-configuration/SKILL.md +0 -0
  351. /package/{src → plugins/specweave-infrastructure}/skills/slo-implementation/SKILL.md +0 -0
  352. /package/{src → plugins/specweave-jira}/skills/jira-sync/README.md +0 -0
  353. /package/{src → plugins/specweave-jira}/skills/jira-sync/SKILL.md +0 -0
  354. /package/{src → plugins/specweave-jira}/skills/specweave-jira-mapper/SKILL.md +0 -0
  355. /package/{src → plugins/specweave-kubernetes}/agents/kubernetes-architect/AGENT.md +0 -0
  356. /package/{src → plugins/specweave-kubernetes}/skills/gitops-workflow/SKILL.md +0 -0
  357. /package/{src → plugins/specweave-kubernetes}/skills/gitops-workflow/references/argocd-setup.md +0 -0
  358. /package/{src → plugins/specweave-kubernetes}/skills/gitops-workflow/references/sync-policies.md +0 -0
  359. /package/{src → plugins/specweave-kubernetes}/skills/helm-chart-scaffolding/SKILL.md +0 -0
  360. /package/{src → plugins/specweave-kubernetes}/skills/helm-chart-scaffolding/assets/Chart.yaml.template +0 -0
  361. /package/{src → plugins/specweave-kubernetes}/skills/helm-chart-scaffolding/assets/values.yaml.template +0 -0
  362. /package/{src → plugins/specweave-kubernetes}/skills/helm-chart-scaffolding/references/chart-structure.md +0 -0
  363. /package/{src → plugins/specweave-kubernetes}/skills/helm-chart-scaffolding/scripts/validate-chart.sh +0 -0
  364. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/SKILL.md +0 -0
  365. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/assets/configmap-template.yaml +0 -0
  366. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/assets/deployment-template.yaml +0 -0
  367. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/assets/service-template.yaml +0 -0
  368. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/references/deployment-spec.md +0 -0
  369. /package/{src → plugins/specweave-kubernetes}/skills/k8s-manifest-generator/references/service-spec.md +0 -0
  370. /package/{src → plugins/specweave-kubernetes}/skills/k8s-security-policies/SKILL.md +0 -0
  371. /package/{src → plugins/specweave-kubernetes}/skills/k8s-security-policies/assets/network-policy-template.yaml +0 -0
  372. /package/{src → plugins/specweave-kubernetes}/skills/k8s-security-policies/references/rbac-patterns.md +0 -0
  373. /package/{src → plugins/specweave-ml}/agents/data-scientist/AGENT.md +0 -0
  374. /package/{src → plugins/specweave-ml}/agents/mlops-engineer/AGENT.md +0 -0
  375. /package/{src → plugins/specweave-ml}/commands/specweave.ml-pipeline.md +0 -0
  376. /package/{src → plugins/specweave-ml}/skills/ml-pipeline-workflow/SKILL.md +0 -0
  377. /package/{src → plugins/specweave-payments}/agents/payment-integration/AGENT.md +0 -0
  378. /package/{src → plugins/specweave-payments}/skills/billing-automation/SKILL.md +0 -0
  379. /package/{src → plugins/specweave-payments}/skills/paypal-integration/SKILL.md +0 -0
  380. /package/{src → plugins/specweave-payments}/skills/pci-compliance/SKILL.md +0 -0
  381. /package/{src → plugins/specweave-payments}/skills/stripe-integration/SKILL.md +0 -0
  382. /package/{src → plugins/specweave-testing}/skills/e2e-playwright/README.md +0 -0
  383. /package/{src → plugins/specweave-testing}/skills/e2e-playwright/SKILL.md +0 -0
  384. /package/{src → plugins/specweave-testing}/skills/e2e-playwright/execute.js +0 -0
  385. /package/{src → plugins/specweave-testing}/skills/e2e-playwright/lib/utils.js +0 -0
  386. /package/{src → plugins/specweave-testing}/skills/e2e-playwright/package.json +0 -0
  387. /package/{src → plugins/specweave-tooling}/skills/skill-creator/LICENSE.txt +0 -0
  388. /package/{src → plugins/specweave-tooling}/skills/skill-creator/SKILL.md +0 -0
  389. /package/{src → plugins/specweave-tooling}/skills/skill-creator/scripts/init_skill.py +0 -0
  390. /package/{src → plugins/specweave-tooling}/skills/skill-creator/scripts/package_skill.py +0 -0
  391. /package/{src → plugins/specweave-tooling}/skills/skill-creator/scripts/quick_validate.py +0 -0
  392. /package/{src → plugins/specweave-tooling}/skills/skill-router/SKILL.md +0 -0
@@ -0,0 +1,518 @@
+ ---
+ name: ml-pipeline-orchestrator
+ description: |
+   Orchestrates complete machine learning pipelines within SpecWeave increments. Activates when users request "ML pipeline", "train model", "build ML system", "end-to-end ML", "ML workflow", "model training pipeline", or similar. Guides users through data preprocessing, feature engineering, model training, evaluation, and deployment using SpecWeave's spec-driven approach. Integrates with increment lifecycle for reproducible ML development.
+ ---
+
+ # ML Pipeline Orchestrator
+
+ ## Overview
+
+ This skill transforms ML development into a SpecWeave increment-based workflow, ensuring every ML project follows the same disciplined approach: spec → plan → tasks → implement → validate. It orchestrates the complete ML lifecycle from data exploration to model deployment, with full traceability and living documentation.
+
+ ## Core Philosophy
+
+ **SpecWeave + ML = Disciplined Data Science**
+
+ Traditional ML development often lacks structure:
+ - ❌ Jupyter notebooks with no version control
+ - ❌ Experiments without documentation
+ - ❌ Models deployed with no reproducibility
+ - ❌ Team knowledge trapped in individual notebooks
+
+ SpecWeave brings discipline:
+ - ✅ Every ML feature is an increment (with spec, plan, tasks)
+ - ✅ Experiments tracked and documented automatically
+ - ✅ Model versions tied to increments
+ - ✅ Living docs capture learnings and decisions
+
+ ## How It Works
+
+ ### Phase 1: ML Increment Planning
+
+ When you request "build a recommendation model", the skill:
+
+ 1. **Creates ML increment structure**:
+ ```
+ .specweave/increments/0042-recommendation-model/
+ ├── spec.md # ML requirements, success metrics
+ ├── plan.md # Pipeline architecture
+ ├── tasks.md # Implementation tasks
+ ├── tests.md # Evaluation criteria
+ ├── experiments/ # Experiment tracking
+ │   ├── exp-001-baseline/
+ │   ├── exp-002-xgboost/
+ │   └── exp-003-neural-net/
+ ├── data/ # Data samples, schemas
+ │   ├── schema.yaml
+ │   └── sample.csv
+ ├── models/ # Trained models
+ │   ├── model-v1.pkl
+ │   └── model-v2.pkl
+ └── notebooks/ # Exploratory notebooks
+     ├── 01-eda.ipynb
+     └── 02-feature-engineering.ipynb
+ ```
+
57
+ 2. **Generates ML-specific spec** (spec.md):
58
+ ```markdown
59
+ ## ML Problem Definition
60
+ - Problem type: Recommendation (collaborative filtering)
61
+ - Input: User behavior history
62
+ - Output: Top-N product recommendations
63
+ - Success metrics: Precision@10 > 0.25, Recall@10 > 0.15
64
+
65
+ ## Data Requirements
66
+ - Training data: 6 months user interactions
67
+ - Validation: Last month
68
+ - Features: User profile, product attributes, interaction history
69
+
70
+ ## Model Requirements
71
+ - Latency: <100ms inference
72
+ - Throughput: 1000 req/sec
73
+ - Accuracy: Better than random baseline by 3x
74
+ - Explainability: Must explain top-3 recommendations
75
+ ```
76
+
77
+ 3. **Creates ML-specific tasks** (tasks.md):
78
+ ```markdown
79
+ - [ ] T-001: Data exploration and quality analysis
80
+ - [ ] T-002: Feature engineering pipeline
81
+ - [ ] T-003: Train baseline model (random/popularity)
82
+ - [ ] T-004: Train candidate models (3 algorithms)
83
+ - [ ] T-005: Hyperparameter tuning (best model)
84
+ - [ ] T-006: Model evaluation (all metrics)
85
+ - [ ] T-007: Model explainability (SHAP/LIME)
86
+ - [ ] T-008: Production deployment preparation
87
+ - [ ] T-009: A/B test plan
88
+ ```
89
+
90
+ ### Phase 2: Pipeline Execution
91
+
92
+ The skill guides you through each task, applying ML best practices:
93
+
94
+ #### Task 1: Data Exploration
95
+ ```python
96
+ # Generated template with SpecWeave integration
97
+ import pandas as pd
98
+ import mlflow
99
+ from specweave import track_experiment
100
+
101
+ # Auto-logs to .specweave/increments/0042.../experiments/
102
+ with track_experiment("exp-001-eda") as exp:
103
+     df = pd.read_csv("data/interactions.csv")
104
+
105
+     # EDA
106
+     exp.log_param("dataset_size", len(df))
107
+     exp.log_metric("missing_values", df.isnull().sum().sum())
108
+
109
+     # Auto-generates report in increment folder
110
+     exp.save_report("eda-summary.md")
111
+ ```
112
+
113
+ #### Task 3: Train Baseline
114
+ ```python
115
+ from sklearn.dummy import DummyClassifier
116
+ from specweave import track_model
117
+
118
+ with track_model("baseline-random", increment="0042") as model:
119
+     clf = DummyClassifier(strategy="uniform")
120
+     clf.fit(X_train, y_train)
121
+
122
+     # Automatically logged to increment (metric values shown are illustrative)
123
+     model.log_metrics({
124
+         "accuracy": 0.12,
125
+         "precision@10": 0.08
126
+     })
127
+     model.save_artifact(clf, "baseline.pkl")
128
+ ```
129
+
130
+ #### Task 4: Train Candidate Models
131
+ ```python
132
+ from xgboost import XGBClassifier
133
+ from specweave import ModelExperiment, run_experiments
134
+
135
+ # Parallel experiments with auto-tracking (LGBMClassifier, KerasModel, params_* defined elsewhere)
136
+ experiments = [
137
+     ModelExperiment("xgboost", XGBClassifier, params_xgb),
138
+     ModelExperiment("lightgbm", LGBMClassifier, params_lgbm),
139
+     ModelExperiment("neural-net", KerasModel, params_nn)
140
+ ]
141
+
142
+ results = run_experiments(
143
+     experiments,
144
+     increment="0042",
145
+     save_to="experiments/"
146
+ )
147
+
148
+ # Auto-generates comparison table in increment docs
149
+ ```
150
+
151
+ ### Phase 3: Increment Completion
152
+
153
+ When `/specweave:done 0042` runs:
154
+
155
+ 1. **Validates ML-specific criteria**:
156
+ - ✅ All experiments logged
157
+ - ✅ Best model saved
158
+ - ✅ Evaluation metrics documented
159
+ - ✅ Model explainability artifacts present
160
+
161
+ 2. **Generates completion summary**:
162
+ ```markdown
163
+ ## Recommendation Model - COMPLETE
164
+
165
+ ### Experiments Run: 7
166
+ 1. exp-001-baseline (random): precision@10=0.08
167
+ 2. exp-002-popularity: precision@10=0.18
168
+ 3. exp-003-xgboost: precision@10=0.26 ✅ BEST
169
+ 4. exp-004-lightgbm: precision@10=0.24
170
+ 5. exp-005-neural-net: precision@10=0.22
171
+ ...
172
+
173
+ ### Best Model
174
+ - Algorithm: XGBoost
175
+ - Version: model-v3.pkl
176
+ - Metrics: precision@10=0.26, recall@10=0.16
177
+ - Training time: 45 min
178
+ - Model size: 12 MB
179
+
180
+ ### Deployment Ready
181
+ - ✅ Inference latency: 35ms (target: <100ms)
182
+ - ✅ Explainability: SHAP values computed
183
+ - ✅ A/B test plan documented
184
+ ```
185
+
186
+ 3. **Syncs living docs** (via `/specweave:sync-docs`):
187
+ - Updates architecture docs with model design
188
+ - Adds ADR for algorithm selection
189
+ - Documents learnings in runbooks
190
+
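The completion criteria listed above can be approximated with a plain file-system check. The sketch below is not SpecWeave's implementation: `check_ml_increment` is a hypothetical helper, and it assumes the folder layout shown elsewhere in this document (`experiments/exp-*/metrics.json`, `models/*.pkl`, and SHAP values stored as `shap_values.pkl`).

```python
from pathlib import Path


def check_ml_increment(increment_dir: Path) -> dict[str, bool]:
    """Return a pass/fail flag per completion criterion (hypothetical helper)."""
    experiments = [
        p for p in (increment_dir / "experiments").glob("exp-*") if p.is_dir()
    ]
    return {
        # Every experiment folder must carry its logged metrics.
        "experiments_logged": bool(experiments)
        and all((e / "metrics.json").is_file() for e in experiments),
        # At least one serialized model must exist.
        "best_model_saved": any((increment_dir / "models").glob("*.pkl")),
        # Assumption: explainability artifacts are SHAP values saved per experiment.
        "explainability_present": any(
            (e / "shap_values.pkl").is_file() for e in experiments
        ),
    }
```

A completion command could refuse to close the increment while any flag is `False` and print the failing keys.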
191
+ ## When to Use This Skill
192
+
193
+ Activate this skill when you need to:
194
+
195
+ - **Build ML features end-to-end** - From idea to deployed model
196
+ - **Ensure reproducibility** - Every experiment tracked and documented
197
+ - **Follow ML best practices** - Baseline comparison, proper validation, explainability
198
+ - **Integrate ML with software engineering** - ML as increments, not isolated notebooks
199
+ - **Maintain team knowledge** - Living docs capture why decisions were made
200
+
201
+ ## ML Pipeline Stages
202
+
203
+ ### 1. Data Stage
204
+ - Data exploration (EDA)
205
+ - Data quality assessment
206
+ - Schema validation
207
+ - Sample data documentation
208
+
209
+ ### 2. Feature Stage
210
+ - Feature engineering
211
+ - Feature selection
212
+ - Feature importance analysis
213
+ - Feature store integration (optional)
214
+
215
+ ### 3. Training Stage
216
+ - Baseline model (random, rule-based)
217
+ - Candidate models (3+ algorithms)
218
+ - Hyperparameter tuning
219
+ - Cross-validation
220
+
221
+ ### 4. Evaluation Stage
222
+ - Comprehensive metrics (accuracy, precision, recall, F1, AUC)
223
+ - Business metrics (latency, throughput)
224
+ - Model comparison (vs baseline, vs previous version)
225
+ - Error analysis
226
+
227
+ ### 5. Explainability Stage
228
+ - Feature importance
229
+ - SHAP values
230
+ - LIME explanations
231
+ - Example predictions with rationale
232
+
233
+ ### 6. Deployment Stage
234
+ - Model packaging
235
+ - Inference pipeline
236
+ - A/B test plan
237
+ - Monitoring setup
238
+
239
+ ## Integration with SpecWeave Workflow
240
+
241
+ ### With Experiment Tracking
242
+ ```bash
243
+ # Start ML increment
244
+ /specweave:inc "0042-recommendation-model"
245
+
246
+ # Automatically integrates experiment tracking
247
+ # All MLflow/W&B logs saved to increment folder
248
+ ```
249
+
250
+ ### With Living Docs
251
+ ```bash
252
+ # After training best model
253
+ /specweave:sync-docs update
254
+
255
+ # Automatically:
256
+ # - Updates architecture/ml-models.md
257
+ # - Adds ADR for algorithm choice
258
+ # - Documents hyperparameters in runbooks
259
+ ```
260
+
261
+ ### With GitHub Sync
262
+ ```bash
263
+ # Create GitHub issue for model retraining
264
+ /specweave:github:create-issue "Retrain recommendation model with new data"
265
+
266
+ # Linked to increment 0042
267
+ # Issue tracks model performance over time
268
+ ```
269
+
270
+ ## Best Practices
271
+
272
+ ### 1. Always Start with Baseline
273
+ ```python
274
+ # Before training complex models, establish baseline
275
+ baseline_results = train_baseline_model(
276
+     strategies=["random", "popularity", "rule-based"]
277
+ )
278
+ # Requirement: New model must beat best baseline by 20%+
279
+ ```
280
+
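The "beat the best baseline by 20%+" requirement can be made explicit as a small gate. This is a minimal sketch under the assumption that the metric is "higher is better" and that the gain is measured relative to the baseline; `beats_baseline` is a hypothetical name, not part of any SpecWeave API.

```python
def beats_baseline(
    candidate: float, baseline: float, min_relative_gain: float = 0.20
) -> bool:
    """True if `candidate` exceeds `baseline` by at least `min_relative_gain`."""
    if baseline <= 0:
        raise ValueError("baseline metric must be positive for a relative comparison")
    return (candidate - baseline) / baseline >= min_relative_gain


# precision@10 of the XGBoost model vs. the popularity baseline,
# using the figures from the completion summary in this document
print(beats_baseline(candidate=0.26, baseline=0.18))
```

For error metrics such as RMSE, where lower is better, the comparison would need to be inverted.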
281
+ ### 2. Use Cross-Validation
282
+ ```python
283
+ # Never trust a single train/test split
284
+ cv_scores = cross_val_score(model, X, y, cv=5)  # from sklearn.model_selection
285
+ exp.log_metric("cv_mean", cv_scores.mean())
286
+ exp.log_metric("cv_std", cv_scores.std())
287
+ ```
288
+
289
+ ### 3. Track Everything
290
+ ```python
291
+ # Hyperparameters, metrics, artifacts, environment
292
+ exp.log_params(model.get_params())
293
+ exp.log_metrics({"accuracy": acc, "f1": f1})
294
+ exp.log_artifact("model.pkl")
295
+ exp.log_artifact("requirements.txt") # Reproducibility
296
+ ```
297
+
298
+ ### 4. Document Failures
299
+ ```python
300
+ # Failed experiments are valuable learnings
301
+ with track_experiment("exp-006-failed-lstm") as exp:
302
+     # ... training fails ...
303
+     exp.log_note("FAILED: LSTM overfits badly, needs regularization")
304
+     exp.set_status("failed")
305
+     # This documents why LSTM wasn't chosen
306
+ ```
307
+
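One way to make failure documentation automatic is a context manager that records the outcome even when training raises. The sketch below is an assumption about how such tracking could work, not the behavior of `track_experiment`; `record_experiment` and the `status.json` file name are hypothetical.

```python
import json
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def record_experiment(name: str, root: Path):
    """Write status.json for the experiment, marking it failed if the body raises."""
    exp_dir = root / name
    exp_dir.mkdir(parents=True, exist_ok=True)
    record = {"name": name, "status": "running", "notes": []}
    try:
        yield record
        if record["status"] == "running":
            record["status"] = "completed"
    except Exception as err:
        # Keep the reason for the failure alongside the experiment.
        record["status"] = "failed"
        record["notes"].append(f"FAILED: {err}")
        raise
    finally:
        # Persist the record whether the run succeeded or not.
        (exp_dir / "status.json").write_text(json.dumps(record, indent=2))
```

Because the exception is re-raised, the caller still sees the failure; the record on disk simply keeps the reason available for later review.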
308
+ ### 5. Model Versioning
309
+ ```python
310
+ # Tie model versions to increments
311
+ model_version = f"0042-v{iteration}"
312
+ mlflow.register_model(
313
+     f"runs:/{run_id}/model",
314
+     f"recommendation-model-{model_version}"
315
+ )
316
+ ```
317
+
318
+ ## Examples
319
+
320
+ ### Example 1: Classification Pipeline
321
+ ```bash
322
+ User: "Build a fraud detection model for transactions"
323
+
324
+ Skill creates increment 0051-fraud-detection with:
325
+ - spec.md: Binary classification, 99% precision target
326
+ - plan.md: Imbalanced data handling, threshold tuning
327
+ - tasks.md: 9 tasks from EDA to deployment
328
+ - experiments/: exp-001-baseline, exp-002-xgboost, etc.
329
+
330
+ Guides through:
331
+ 1. EDA → identify class imbalance (0.1% fraud)
332
+ 2. Baseline → random/majority (terrible results)
333
+ 3. Candidates → XGBoost, LightGBM, Neural Net
334
+ 4. Threshold tuning → optimize for precision
335
+ 5. SHAP → explain high-risk predictions
336
+ 6. Deploy → model + threshold + explainer
337
+ ```
338
+
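The threshold-tuning step in Example 1 can be sketched as a search for the lowest score threshold that still meets the precision target, which keeps recall as high as the target allows. This is an illustrative pure-Python version; the function name and the toy data are assumptions.

```python
def lowest_threshold_for_precision(scores, labels, target_precision):
    """Return the smallest score threshold whose precision meets the target, or None."""
    best = None
    # Try each distinct score as a threshold, from highest to lowest.
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        # tp + fp >= 1 because the threshold itself is one of the scores.
        if tp / (tp + fp) >= target_precision:
            best = threshold
    return best


# Toy fraud scores (1 = fraud); real data would come from a validation set.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]
print(lowest_threshold_for_precision(scores, labels, target_precision=0.75))
```

The threshold should be chosen on validation data and then confirmed on a separate test set, since tuning it on the test set inflates the reported precision.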
339
+ ### Example 2: Regression Pipeline
340
+ ```bash
341
+ User: "Predict customer lifetime value"
342
+
343
+ Skill creates increment 0063-ltv-prediction with:
344
+ - spec.md: Regression, RMSE < $50 target
345
+ - plan.md: Time-based validation, feature engineering
346
+ - tasks.md: Customer cohort analysis, feature importance
347
+
348
+ Key difference: Regression-specific evaluation (RMSE, MAE, R²)
349
+ ```
350
+
351
+ ### Example 3: Time Series Forecasting
352
+ ```bash
353
+ User: "Forecast weekly sales for next 12 weeks"
354
+
355
+ Skill creates increment 0072-sales-forecasting with:
356
+ - spec.md: Time series, MAPE < 10% target
357
+ - plan.md: Seasonal decomposition, ARIMA vs Prophet
358
+ - tasks.md: Stationarity tests, residual analysis
359
+
360
+ Key difference: Time series validation (no random split)
361
+ ```
362
+
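"No random split" means every validation fold must come after the data used to train on it. A minimal expanding-window splitter, written here in plain Python as an illustration (the function name is an assumption), looks like this:

```python
def expanding_window_splits(n_samples: int, n_splits: int):
    """Yield (train_indices, test_indices); each test fold follows its training data."""
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for i in range(1, n_splits + 1):
        train_end = fold * i
        # The last fold absorbs any remainder so no observation is dropped.
        test_end = n_samples if i == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


for train_idx, test_idx in expanding_window_splits(n_samples=12, n_splits=3):
    print(f"train=0..{train_idx[-1]}  test={test_idx[0]}..{test_idx[-1]}")
```

scikit-learn's `TimeSeriesSplit` provides the same idea as a ready-made cross-validator.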
363
+ ## Framework Support
364
+
365
+ This skill works with all major ML frameworks:
366
+
367
+ ### Scikit-Learn
368
+ ```python
369
+ from sklearn.ensemble import RandomForestClassifier
370
+ from specweave import track_sklearn_model
371
+
372
+ model = RandomForestClassifier(n_estimators=100)
373
+ with track_sklearn_model(model, increment="0042") as tracked:
374
+     tracked.fit(X_train, y_train)
375
+     tracked.evaluate(X_test, y_test)
376
+ ```
377
+
378
+ ### PyTorch
379
+ ```python
380
+ import torch
381
+ from specweave import track_pytorch_model
382
+
383
+ model = NeuralNet()
384
+ with track_pytorch_model(model, increment="0042") as tracked:
385
+     for epoch in range(epochs):
386
+         loss = tracked.train_epoch(train_loader)  # assumes train_epoch returns the epoch loss
387
+         tracked.log_metric(f"loss_epoch_{epoch}", loss)
388
+ ```
389
+
390
+ ### TensorFlow/Keras
391
+ ```python
392
+ from tensorflow import keras
393
+ from specweave import KerasCallback
394
+
395
+ model = keras.Sequential([...])
396
+ model.fit(
397
+     X_train, y_train,
398
+     callbacks=[KerasCallback(increment="0042")]
399
+ )
400
+ ```
401
+
402
+ ### XGBoost/LightGBM
403
+ ```python
404
+ import xgboost as xgb
405
+ from specweave import track_boosting_model
406
+
407
+ dtrain = xgb.DMatrix(X_train, label=y_train)
408
+ with track_boosting_model("xgboost", increment="0042") as tracked:
409
+     model = xgb.train(params, dtrain, callbacks=[tracked.callback])
410
+ ```
411
+
412
+ ## Integration Points
413
+
414
+ ### With `experiment-tracker` skill
415
+ - Auto-detects MLflow/W&B in project
416
+ - Configures tracking URI to increment folder
417
+ - Syncs experiment metadata to increment docs
418
+
419
+ ### With `model-evaluator` skill
420
+ - Generates comprehensive evaluation reports
421
+ - Compares models across experiments
422
+ - Highlights best model with confidence intervals
423
+
424
+ ### With `feature-engineer` skill
425
+ - Generates feature engineering pipeline
426
+ - Documents feature importance
427
+ - Creates feature store schemas
428
+
429
+ ### With `ml-engineer` agent
430
+ - Delegates complex ML decisions to specialized agent
431
+ - Reviews model architecture
432
+ - Suggests improvements based on results
433
+
434
+ ## Skill Outputs
435
+
436
+ After running `/specweave:do` on an ML increment, you get:
437
+
438
+ ```
439
+ .specweave/increments/0042-recommendation-model/
440
+ ├── spec.md ✅
441
+ ├── plan.md ✅
442
+ ├── tasks.md ✅ (all completed)
443
+ ├── COMPLETION-SUMMARY.md ✅
444
+ ├── experiments/
445
+ │   ├── exp-001-baseline/
446
+ │   │   ├── metrics.json
447
+ │   │   ├── params.json
448
+ │   │   └── logs/
449
+ │   ├── exp-002-xgboost/ ✅ BEST
450
+ │   │   ├── metrics.json
451
+ │   │   ├── params.json
452
+ │   │   ├── model.pkl
453
+ │   │   └── shap_values.pkl
454
+ │   └── comparison.md
455
+ ├── models/
456
+ │   ├── model-v3.pkl (best)
457
+ │   └── model-v3.metadata.json
458
+ ├── data/
459
+ │   ├── schema.yaml
460
+ │   └── sample.parquet
461
+ └── notebooks/
462
+     ├── 01-eda.ipynb
463
+     ├── 02-feature-engineering.ipynb
464
+     └── 03-model-analysis.ipynb
465
+ ```
466
+
467
+ ## Commands
468
+
469
+ This skill integrates with SpecWeave commands:
470
+
471
+ ```bash
472
+ # Create ML increment
473
+ /specweave:inc "build recommendation model"
474
+ → Activates ml-pipeline-orchestrator
475
+ → Creates ML-specific increment structure
476
+
477
+ # Execute ML tasks
478
+ /specweave:do
479
+ → Guides through data → train → eval workflow
480
+ → Auto-tracks experiments
481
+
482
+ # Validate ML increment
483
+ /specweave:validate 0042
484
+ → Checks: experiments logged, model saved, metrics documented
485
+ → Validates: model meets success criteria
486
+
487
+ # Complete ML increment
488
+ /specweave:done 0042
489
+ → Generates ML completion summary
490
+ → Syncs model metadata to living docs
491
+ ```
492
+
493
+ ## Tips
494
+
495
+ 1. **Start simple** - Always begin with baseline, then iterate
496
+ 2. **Track failures** - Document why approaches didn't work
497
+ 3. **Version data** - Use DVC or similar for data versioning
498
+ 4. **Reproducibility** - Log environment (requirements.txt, conda env)
499
+ 5. **Incremental improvement** - Each increment improves on previous model
500
+ 6. **Team collaboration** - Living docs make ML decisions visible to all
501
+
502
+ ## Advanced: Multi-Increment ML Projects
503
+
504
+ For complex ML systems (e.g., recommendation system with multiple models):
505
+
506
+ ```
507
+ 0042-recommendation-data-pipeline
508
+ 0043-recommendation-candidate-generation
509
+ 0044-recommendation-ranking-model
510
+ 0045-recommendation-reranking
511
+ 0046-recommendation-ab-test
512
+ ```
513
+
514
+ Each increment:
515
+ - Has its own spec, plan, tasks
516
+ - Builds on previous increments
517
+ - Documents model interactions
518
+ - Maintains system-level living docs
@@ -0,0 +1,155 @@
1
+ ---
2
+ name: model-evaluator
3
+ description: |
4
+ Comprehensive ML model evaluation with multiple metrics, cross-validation, and statistical testing. Activates for "evaluate model", "model metrics", "model performance", "compare models", "validation metrics", "test accuracy", "precision recall", "ROC AUC". Generates detailed evaluation reports with visualizations and statistical significance tests, integrated with SpecWeave increment documentation.
5
+ ---
6
+
7
+ # Model Evaluator
8
+
9
+ ## Overview
10
+
11
+ This skill provides comprehensive, unbiased model evaluation that follows ML best practices. It goes beyond simple accuracy and evaluates models across multiple dimensions to support confident deployment decisions.
12
+
13
+ ## Core Evaluation Framework
14
+
15
+ ### 1. Classification Metrics
16
+ - Accuracy, Precision, Recall, F1-score
17
+ - ROC AUC, PR AUC
18
+ - Confusion matrix
19
+ - Per-class metrics (for multi-class)
20
+ - Class imbalance handling
21
+
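As a reference for the per-class metrics listed above, precision, recall, and F1 follow directly from confusion-matrix counts. The helper below is an illustrative sketch (its name is an assumption) that returns 0.0 where a ratio is undefined.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for one class from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# 8 true positives, 2 false positives, 8 false negatives
print(precision_recall_f1(tp=8, fp=2, fn=8))
```

On imbalanced data these per-class numbers are more informative than accuracy, which can stay high while the minority class is mostly missed.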
22
+ ### 2. Regression Metrics
23
+ - RMSE, MAE, MAPE
24
+ - R² score, Adjusted R²
25
+ - Residual analysis
26
+ - Prediction interval coverage
27
+
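The core regression metrics can be computed with the standard library alone. This is a minimal sketch for illustration (the function name is an assumption); it does not cover MAPE, adjusted R², or prediction intervals.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for paired sequences of equal length."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant
    return {"rmse": rmse, "mae": mae, "r2": r2}


print(regression_metrics([3.0, 5.0], [2.0, 6.0]))
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a quick signal about outliers in the residuals.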
28
+ ### 3. Ranking Metrics (Recommendations)
29
+ - Precision@K, Recall@K
30
+ - NDCG@K, MAP@K
31
+ - MRR (Mean Reciprocal Rank)
32
+ - Coverage, Diversity
33
+
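For a single user, the simpler ranking metrics above reduce to a few lines. The sketch below uses one common convention (Precision@K divides by K even when fewer than K items are recommended); the function names are illustrative, and MRR is the mean of `reciprocal_rank` over all users.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


def recall_at_k(recommended, relevant, k):
    """Fraction of all relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)


def reciprocal_rank(recommended, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is recommended."""
    for rank, item in enumerate(recommended, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


recommended = ["a", "b", "c", "d"]
relevant = {"b", "d", "z"}
print(precision_at_k(recommended, relevant, k=2))
print(recall_at_k(recommended, relevant, k=4))
print(reciprocal_rank(recommended, relevant))
```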
34
+ ### 4. Statistical Validation
35
+ - Cross-validation (K-fold, stratified, time-series)
36
+ - Confidence intervals
37
+ - Statistical significance testing
38
+ - Calibration curves
39
+
40
+ ## Usage
41
+
42
+ ```python
43
+ from specweave import ModelEvaluator
44
+
45
+ evaluator = ModelEvaluator(
46
+     model=trained_model,
47
+     X_test=X_test,
48
+     y_test=y_test,
49
+     increment="0042"
50
+ )
51
+
52
+ # Comprehensive evaluation
53
+ report = evaluator.evaluate_all()
54
+
55
+ # Generates:
56
+ # - .specweave/increments/0042.../evaluation-report.md
57
+ # - Visualizations (confusion matrix, ROC curves, etc.)
58
+ # - Statistical tests
59
+ ```
60
+
61
+ ## Evaluation Report Structure
62
+
63
+ ```markdown
64
+ # Model Evaluation Report: XGBoost Classifier
65
+
66
+ ## Overall Performance
67
+ - **Accuracy**: 0.87 ± 0.02 (95% CI: [0.85, 0.89])
68
+ - **ROC AUC**: 0.92 ± 0.01
69
+ - **F1 Score**: 0.85 ± 0.02
70
+
71
+ ## Per-Class Performance
72
+ | Class | Precision | Recall | F1 | Support |
73
+ |---------|-----------|--------|------|---------|
74
+ | Class 0 | 0.88 | 0.85 | 0.86 | 1000 |
75
+ | Class 1 | 0.84 | 0.87 | 0.86 | 800 |
76
+
77
+ ## Confusion Matrix
78
+ [Visualization embedded]
79
+
80
+ ## Cross-Validation Results
81
+ - 5-fold CV accuracy: 0.86 ± 0.03
82
+ - Fold scores: [0.85, 0.88, 0.84, 0.87, 0.86]
83
+ - No overfitting detected (train=0.89, val=0.86, gap=0.03)
84
+
85
+ ## Statistical Tests
86
+ - Comparison vs baseline: p=0.001 (highly significant)
87
+ - Comparison vs previous model: p=0.042 (significant)
88
+
89
+ ## Recommendations
90
+ ✅ Deploy: Model meets accuracy threshold (>0.85)
91
+ ✅ Stable: Low variance across folds
92
+ ⚠️ Monitor: Class 1 precision slightly lower (0.84)
93
+ ```
94
+
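The cross-validation summary in the report can be reproduced from the fold scores. The report does not say which convention its ± uses, so the sketch below makes its own assumptions explicit: sample standard deviation and a 95% t-interval for the mean. Fold scores are not fully independent, so the interval is only approximate.

```python
import math
import statistics

fold_scores = [0.85, 0.88, 0.84, 0.87, 0.86]  # fold scores from the report above

mean = statistics.mean(fold_scores)
std = statistics.stdev(fold_scores)  # sample standard deviation (n - 1)

# 95% t-interval for the mean; 2.776 is the two-sided critical value
# of Student's t distribution with 4 degrees of freedom (5 folds).
T_CRIT_DF4 = 2.776
half_width = T_CRIT_DF4 * std / math.sqrt(len(fold_scores))
ci = (mean - half_width, mean + half_width)

print(f"CV accuracy: {mean:.3f} (std {std:.3f}), 95% CI [{ci[0]:.3f}, {ci[1]:.3f}]")
```

With a different number of folds the critical value changes, so it should be looked up for the matching degrees of freedom.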
95
+ ## Model Comparison
96
+
97
+ ```python
98
+ from specweave import compare_models
99
+
100
+ models = {
101
+     "baseline": baseline_model,
102
+     "xgboost": xgb_model,
103
+     "lightgbm": lgbm_model,
104
+     "neural-net": nn_model
105
+ }
106
+
107
+ comparison = compare_models(
108
+     models,
109
+     X_test,
110
+     y_test,
111
+     metrics=["accuracy", "auc", "f1"],
112
+     increment="0042"
113
+ )
114
+ ```
115
+
116
+ **Output**:
117
+ ```
118
+ Model Comparison Report
119
+ =======================
120
+
121
+ | Model | Accuracy | ROC AUC | F1 | Inference Time | Model Size |
122
+ |------------|----------|---------|------|----------------|------------|
123
+ | baseline | 0.65 | 0.70 | 0.62 | 1ms | 10KB |
124
+ | xgboost | 0.87 | 0.92 | 0.85 | 35ms | 12MB |
125
+ | lightgbm | 0.86 | 0.91 | 0.84 | 28ms | 8MB |
126
+ | neural-net | 0.85 | 0.90 | 0.83 | 120ms | 45MB |
127
+
128
+ Recommendation: XGBoost
129
+ - Best accuracy and AUC
130
+ - Acceptable inference time (<50ms requirement)
131
+ - Good size/performance tradeoff
132
+ ```
133
+
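The recommendation above amounts to "most accurate model within the latency budget". A minimal sketch of that selection rule, using the figures from the comparison table (the function name and tuple layout are assumptions):

```python
candidates = [
    # (name, accuracy, inference_ms) taken from the comparison table above
    ("baseline", 0.65, 1),
    ("xgboost", 0.87, 35),
    ("lightgbm", 0.86, 28),
    ("neural-net", 0.85, 120),
]


def pick_model(rows, max_latency_ms):
    """Return the most accurate model whose inference time is under the budget."""
    eligible = [row for row in rows if row[2] < max_latency_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda row: row[1])[0]


print(pick_model(candidates, max_latency_ms=50))
```

Tightening the budget changes the answer, which is why the latency requirement belongs in the spec rather than being decided after training.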
134
+ ## Best Practices
135
+
136
+ 1. **Always compare to baseline** - Random, majority, rule-based
137
+ 2. **Use cross-validation** - Never trust a single split
138
+ 3. **Check calibration** - Are probabilities meaningful?
139
+ 4. **Analyze errors** - What types of mistakes?
140
+ 5. **Test statistical significance** - Is improvement real?
141
+
142
+ ## Integration with SpecWeave
143
+
144
+ ```bash
145
+ # Evaluate model in increment
146
+ /ml:evaluate-model 0042
147
+
148
+ # Compare all models in increment
149
+ /ml:compare-models 0042
150
+
151
+ # Generate full evaluation report
152
+ /ml:evaluation-report 0042
153
+ ```
154
+
155
+ Evaluation results are automatically included in the increment's COMPLETION-SUMMARY.md.