compact-agent 1.1.0

Files changed (324)
  1. package/README.md +394 -0
  2. package/bin/anycode.js +2 -0
  3. package/bin/crowcoder.js +19 -0
  4. package/bin/ecc-hooks.cjs +138 -0
  5. package/dist/agents.d.ts +17 -0
  6. package/dist/agents.js +1603 -0
  7. package/dist/agents.js.map +1 -0
  8. package/dist/api.d.ts +16 -0
  9. package/dist/api.js +115 -0
  10. package/dist/api.js.map +1 -0
  11. package/dist/autonomous-loops.d.ts +108 -0
  12. package/dist/autonomous-loops.js +526 -0
  13. package/dist/autonomous-loops.js.map +1 -0
  14. package/dist/codemaps.d.ts +53 -0
  15. package/dist/codemaps.js +325 -0
  16. package/dist/codemaps.js.map +1 -0
  17. package/dist/compaction.d.ts +30 -0
  18. package/dist/compaction.js +125 -0
  19. package/dist/compaction.js.map +1 -0
  20. package/dist/config.d.ts +5 -0
  21. package/dist/config.js +79 -0
  22. package/dist/config.js.map +1 -0
  23. package/dist/content-engine.d.ts +97 -0
  24. package/dist/content-engine.js +721 -0
  25. package/dist/content-engine.js.map +1 -0
  26. package/dist/cost-tracker.d.ts +49 -0
  27. package/dist/cost-tracker.js +150 -0
  28. package/dist/cost-tracker.js.map +1 -0
  29. package/dist/counter-button.d.ts +35 -0
  30. package/dist/counter-button.js +48 -0
  31. package/dist/counter-button.js.map +1 -0
  32. package/dist/counter.d.ts +21 -0
  33. package/dist/counter.js +31 -0
  34. package/dist/counter.js.map +1 -0
  35. package/dist/coverage.d.ts +23 -0
  36. package/dist/coverage.js +215 -0
  37. package/dist/coverage.js.map +1 -0
  38. package/dist/docs-sync.d.ts +23 -0
  39. package/dist/docs-sync.js +266 -0
  40. package/dist/docs-sync.js.map +1 -0
  41. package/dist/ecc.d.ts +41 -0
  42. package/dist/ecc.js +644 -0
  43. package/dist/ecc.js.map +1 -0
  44. package/dist/evaluation.d.ts +24 -0
  45. package/dist/evaluation.js +412 -0
  46. package/dist/evaluation.js.map +1 -0
  47. package/dist/export.d.ts +22 -0
  48. package/dist/export.js +109 -0
  49. package/dist/export.js.map +1 -0
  50. package/dist/git-workflow.d.ts +22 -0
  51. package/dist/git-workflow.js +197 -0
  52. package/dist/git-workflow.js.map +1 -0
  53. package/dist/hook-controls.d.ts +34 -0
  54. package/dist/hook-controls.js +90 -0
  55. package/dist/hook-controls.js.map +1 -0
  56. package/dist/hooks.d.ts +30 -0
  57. package/dist/hooks.js +130 -0
  58. package/dist/hooks.js.map +1 -0
  59. package/dist/html-parser.d.ts +18 -0
  60. package/dist/html-parser.js +101 -0
  61. package/dist/html-parser.js.map +1 -0
  62. package/dist/index.d.ts +12 -0
  63. package/dist/index.js +1230 -0
  64. package/dist/index.js.map +1 -0
  65. package/dist/learning.d.ts +35 -0
  66. package/dist/learning.js +238 -0
  67. package/dist/learning.js.map +1 -0
  68. package/dist/login.d.ts +37 -0
  69. package/dist/login.js +191 -0
  70. package/dist/login.js.map +1 -0
  71. package/dist/memory.d.ts +39 -0
  72. package/dist/memory.js +183 -0
  73. package/dist/memory.js.map +1 -0
  74. package/dist/model-router.d.ts +23 -0
  75. package/dist/model-router.js +145 -0
  76. package/dist/model-router.js.map +1 -0
  77. package/dist/modes.d.ts +17 -0
  78. package/dist/modes.js +217 -0
  79. package/dist/modes.js.map +1 -0
  80. package/dist/orchestration.d.ts +37 -0
  81. package/dist/orchestration.js +139 -0
  82. package/dist/orchestration.js.map +1 -0
  83. package/dist/package-detect.d.ts +36 -0
  84. package/dist/package-detect.js +529 -0
  85. package/dist/package-detect.js.map +1 -0
  86. package/dist/permissions.d.ts +25 -0
  87. package/dist/permissions.js +50 -0
  88. package/dist/permissions.js.map +1 -0
  89. package/dist/pm2-manager.d.ts +40 -0
  90. package/dist/pm2-manager.js +127 -0
  91. package/dist/pm2-manager.js.map +1 -0
  92. package/dist/query.d.ts +15 -0
  93. package/dist/query.js +278 -0
  94. package/dist/query.js.map +1 -0
  95. package/dist/refactor.d.ts +22 -0
  96. package/dist/refactor.js +226 -0
  97. package/dist/refactor.js.map +1 -0
  98. package/dist/retry.d.ts +20 -0
  99. package/dist/retry.js +88 -0
  100. package/dist/retry.js.map +1 -0
  101. package/dist/rules.d.ts +34 -0
  102. package/dist/rules.js +942 -0
  103. package/dist/rules.js.map +1 -0
  104. package/dist/schema.d.ts +23 -0
  105. package/dist/schema.js +12 -0
  106. package/dist/schema.js.map +1 -0
  107. package/dist/search-first.d.ts +17 -0
  108. package/dist/search-first.js +301 -0
  109. package/dist/search-first.js.map +1 -0
  110. package/dist/security.d.ts +10 -0
  111. package/dist/security.js +145 -0
  112. package/dist/security.js.map +1 -0
  113. package/dist/sessions.d.ts +21 -0
  114. package/dist/sessions.js +112 -0
  115. package/dist/sessions.js.map +1 -0
  116. package/dist/skill-create.d.ts +38 -0
  117. package/dist/skill-create.js +389 -0
  118. package/dist/skill-create.js.map +1 -0
  119. package/dist/skills.d.ts +34 -0
  120. package/dist/skills.js +161 -0
  121. package/dist/skills.js.map +1 -0
  122. package/dist/strategic-compaction.d.ts +24 -0
  123. package/dist/strategic-compaction.js +144 -0
  124. package/dist/strategic-compaction.js.map +1 -0
  125. package/dist/system-prompt.d.ts +3 -0
  126. package/dist/system-prompt.js +101 -0
  127. package/dist/system-prompt.js.map +1 -0
  128. package/dist/theme.d.ts +60 -0
  129. package/dist/theme.js +220 -0
  130. package/dist/theme.js.map +1 -0
  131. package/dist/tools/bash.d.ts +2 -0
  132. package/dist/tools/bash.js +49 -0
  133. package/dist/tools/bash.js.map +1 -0
  134. package/dist/tools/edit.d.ts +2 -0
  135. package/dist/tools/edit.js +76 -0
  136. package/dist/tools/edit.js.map +1 -0
  137. package/dist/tools/glob.d.ts +2 -0
  138. package/dist/tools/glob.js +54 -0
  139. package/dist/tools/glob.js.map +1 -0
  140. package/dist/tools/grep.d.ts +2 -0
  141. package/dist/tools/grep.js +64 -0
  142. package/dist/tools/grep.js.map +1 -0
  143. package/dist/tools/index.d.ts +5 -0
  144. package/dist/tools/index.js +27 -0
  145. package/dist/tools/index.js.map +1 -0
  146. package/dist/tools/list-dir.d.ts +2 -0
  147. package/dist/tools/list-dir.js +51 -0
  148. package/dist/tools/list-dir.js.map +1 -0
  149. package/dist/tools/read.d.ts +2 -0
  150. package/dist/tools/read.js +56 -0
  151. package/dist/tools/read.js.map +1 -0
  152. package/dist/tools/types.d.ts +45 -0
  153. package/dist/tools/types.js +2 -0
  154. package/dist/tools/types.js.map +1 -0
  155. package/dist/tools/web-fetch.d.ts +2 -0
  156. package/dist/tools/web-fetch.js +41 -0
  157. package/dist/tools/web-fetch.js.map +1 -0
  158. package/dist/tools/web-search.d.ts +27 -0
  159. package/dist/tools/web-search.js +139 -0
  160. package/dist/tools/web-search.js.map +1 -0
  161. package/dist/tools/write.d.ts +2 -0
  162. package/dist/tools/write.js +36 -0
  163. package/dist/tools/write.js.map +1 -0
  164. package/dist/types.d.ts +28 -0
  165. package/dist/types.js +57 -0
  166. package/dist/types.js.map +1 -0
  167. package/dist/users.d.ts +51 -0
  168. package/dist/users.js +193 -0
  169. package/dist/users.js.map +1 -0
  170. package/dist/verification.d.ts +73 -0
  171. package/dist/verification.js +269 -0
  172. package/dist/verification.js.map +1 -0
  173. package/dist/walkthrough.d.ts +10 -0
  174. package/dist/walkthrough.js +121 -0
  175. package/dist/walkthrough.js.map +1 -0
  176. package/package.json +58 -0
  177. package/resources/ecc/agents/architect.json +16 -0
  178. package/resources/ecc/agents/architect.md +212 -0
  179. package/resources/ecc/agents/build-error-resolver.json +17 -0
  180. package/resources/ecc/agents/build-error-resolver.md +116 -0
  181. package/resources/ecc/agents/chief-of-staff.json +17 -0
  182. package/resources/ecc/agents/chief-of-staff.md +153 -0
  183. package/resources/ecc/agents/code-reviewer.json +16 -0
  184. package/resources/ecc/agents/code-reviewer.md +238 -0
  185. package/resources/ecc/agents/database-reviewer.json +16 -0
  186. package/resources/ecc/agents/database-reviewer.md +92 -0
  187. package/resources/ecc/agents/doc-updater.json +16 -0
  188. package/resources/ecc/agents/doc-updater.md +108 -0
  189. package/resources/ecc/agents/e2e-runner.json +17 -0
  190. package/resources/ecc/agents/e2e-runner.md +109 -0
  191. package/resources/ecc/agents/go-build-resolver.json +17 -0
  192. package/resources/ecc/agents/go-build-resolver.md +96 -0
  193. package/resources/ecc/agents/go-reviewer.json +16 -0
  194. package/resources/ecc/agents/go-reviewer.md +77 -0
  195. package/resources/ecc/agents/harness-optimizer.json +15 -0
  196. package/resources/ecc/agents/harness-optimizer.md +34 -0
  197. package/resources/ecc/agents/loop-operator.json +16 -0
  198. package/resources/ecc/agents/loop-operator.md +36 -0
  199. package/resources/ecc/agents/planner.json +15 -0
  200. package/resources/ecc/agents/planner.md +212 -0
  201. package/resources/ecc/agents/python-reviewer.json +16 -0
  202. package/resources/ecc/agents/python-reviewer.md +99 -0
  203. package/resources/ecc/agents/refactor-cleaner.json +17 -0
  204. package/resources/ecc/agents/refactor-cleaner.md +87 -0
  205. package/resources/ecc/agents/security-reviewer.json +16 -0
  206. package/resources/ecc/agents/security-reviewer.md +109 -0
  207. package/resources/ecc/agents/tdd-guide.json +17 -0
  208. package/resources/ecc/agents/tdd-guide.md +93 -0
  209. package/resources/ecc/commands/add-language-rules.md +39 -0
  210. package/resources/ecc/commands/database-migration.md +36 -0
  211. package/resources/ecc/commands/feature-development.md +38 -0
  212. package/resources/ecc/prompts/build-fix.prompt.md +47 -0
  213. package/resources/ecc/prompts/code-review.prompt.md +56 -0
  214. package/resources/ecc/prompts/plan.prompt.md +52 -0
  215. package/resources/ecc/prompts/refactor.prompt.md +50 -0
  216. package/resources/ecc/prompts/security-review.prompt.md +70 -0
  217. package/resources/ecc/prompts/tdd.prompt.md +47 -0
  218. package/resources/ecc/rules/common-agents.md +53 -0
  219. package/resources/ecc/rules/common-coding-style.md +52 -0
  220. package/resources/ecc/rules/common-development-workflow.md +33 -0
  221. package/resources/ecc/rules/common-git-workflow.md +28 -0
  222. package/resources/ecc/rules/common-hooks.md +34 -0
  223. package/resources/ecc/rules/common-patterns.md +35 -0
  224. package/resources/ecc/rules/common-performance.md +59 -0
  225. package/resources/ecc/rules/common-security.md +33 -0
  226. package/resources/ecc/rules/common-testing.md +33 -0
  227. package/resources/ecc/rules/golang-coding-style.md +31 -0
  228. package/resources/ecc/rules/golang-hooks.md +16 -0
  229. package/resources/ecc/rules/golang-patterns.md +44 -0
  230. package/resources/ecc/rules/golang-security.md +33 -0
  231. package/resources/ecc/rules/golang-testing.md +30 -0
  232. package/resources/ecc/rules/kotlin-coding-style.md +39 -0
  233. package/resources/ecc/rules/kotlin-hooks.md +16 -0
  234. package/resources/ecc/rules/kotlin-patterns.md +50 -0
  235. package/resources/ecc/rules/kotlin-security.md +58 -0
  236. package/resources/ecc/rules/kotlin-testing.md +38 -0
  237. package/resources/ecc/rules/php-coding-style.md +25 -0
  238. package/resources/ecc/rules/php-hooks.md +21 -0
  239. package/resources/ecc/rules/php-patterns.md +23 -0
  240. package/resources/ecc/rules/php-security.md +24 -0
  241. package/resources/ecc/rules/php-testing.md +26 -0
  242. package/resources/ecc/rules/python-coding-style.md +42 -0
  243. package/resources/ecc/rules/python-hooks.md +19 -0
  244. package/resources/ecc/rules/python-patterns.md +39 -0
  245. package/resources/ecc/rules/python-security.md +30 -0
  246. package/resources/ecc/rules/python-testing.md +38 -0
  247. package/resources/ecc/rules/swift-coding-style.md +47 -0
  248. package/resources/ecc/rules/swift-hooks.md +20 -0
  249. package/resources/ecc/rules/swift-patterns.md +66 -0
  250. package/resources/ecc/rules/swift-security.md +33 -0
  251. package/resources/ecc/rules/swift-testing.md +45 -0
  252. package/resources/ecc/rules/typescript-coding-style.md +63 -0
  253. package/resources/ecc/rules/typescript-hooks.md +20 -0
  254. package/resources/ecc/rules/typescript-patterns.md +50 -0
  255. package/resources/ecc/rules/typescript-security.md +26 -0
  256. package/resources/ecc/rules/typescript-testing.md +16 -0
  257. package/resources/ecc/skills/agent-introspection-debugging/SKILL.md +152 -0
  258. package/resources/ecc/skills/agent-introspection-debugging/agents/openai.yaml +7 -0
  259. package/resources/ecc/skills/agent-sort/SKILL.md +214 -0
  260. package/resources/ecc/skills/agent-sort/agents/openai.yaml +7 -0
  261. package/resources/ecc/skills/api-design/SKILL.md +522 -0
  262. package/resources/ecc/skills/api-design/agents/openai.yaml +7 -0
  263. package/resources/ecc/skills/article-writing/SKILL.md +78 -0
  264. package/resources/ecc/skills/article-writing/agents/openai.yaml +7 -0
  265. package/resources/ecc/skills/backend-patterns/SKILL.md +597 -0
  266. package/resources/ecc/skills/backend-patterns/agents/openai.yaml +7 -0
  267. package/resources/ecc/skills/brand-voice/SKILL.md +96 -0
  268. package/resources/ecc/skills/brand-voice/agents/openai.yaml +7 -0
  269. package/resources/ecc/skills/brand-voice/references/voice-profile-schema.md +55 -0
  270. package/resources/ecc/skills/bun-runtime/SKILL.md +83 -0
  271. package/resources/ecc/skills/bun-runtime/agents/openai.yaml +7 -0
  272. package/resources/ecc/skills/coding-standards/SKILL.md +548 -0
  273. package/resources/ecc/skills/coding-standards/agents/openai.yaml +7 -0
  274. package/resources/ecc/skills/content-engine/SKILL.md +130 -0
  275. package/resources/ecc/skills/content-engine/agents/openai.yaml +7 -0
  276. package/resources/ecc/skills/crosspost/SKILL.md +110 -0
  277. package/resources/ecc/skills/crosspost/agents/openai.yaml +7 -0
  278. package/resources/ecc/skills/deep-research/SKILL.md +154 -0
  279. package/resources/ecc/skills/deep-research/agents/openai.yaml +7 -0
  280. package/resources/ecc/skills/dmux-workflows/SKILL.md +143 -0
  281. package/resources/ecc/skills/dmux-workflows/agents/openai.yaml +7 -0
  282. package/resources/ecc/skills/documentation-lookup/SKILL.md +89 -0
  283. package/resources/ecc/skills/documentation-lookup/agents/openai.yaml +7 -0
  284. package/resources/ecc/skills/e2e-testing/SKILL.md +325 -0
  285. package/resources/ecc/skills/e2e-testing/agents/openai.yaml +7 -0
  286. package/resources/ecc/skills/eval-harness/SKILL.md +235 -0
  287. package/resources/ecc/skills/eval-harness/agents/openai.yaml +7 -0
  288. package/resources/ecc/skills/everything-claude-code/SKILL.md +442 -0
  289. package/resources/ecc/skills/everything-claude-code/agents/openai.yaml +7 -0
  290. package/resources/ecc/skills/exa-search/SKILL.md +169 -0
  291. package/resources/ecc/skills/exa-search/agents/openai.yaml +7 -0
  292. package/resources/ecc/skills/fal-ai-media/SKILL.md +276 -0
  293. package/resources/ecc/skills/fal-ai-media/agents/openai.yaml +7 -0
  294. package/resources/ecc/skills/frontend-patterns/SKILL.md +647 -0
  295. package/resources/ecc/skills/frontend-patterns/agents/openai.yaml +7 -0
  296. package/resources/ecc/skills/frontend-slides/SKILL.md +183 -0
  297. package/resources/ecc/skills/frontend-slides/STYLE_PRESETS.md +330 -0
  298. package/resources/ecc/skills/frontend-slides/agents/openai.yaml +7 -0
  299. package/resources/ecc/skills/investor-materials/SKILL.md +95 -0
  300. package/resources/ecc/skills/investor-materials/agents/openai.yaml +7 -0
  301. package/resources/ecc/skills/investor-outreach/SKILL.md +90 -0
  302. package/resources/ecc/skills/investor-outreach/agents/openai.yaml +7 -0
  303. package/resources/ecc/skills/market-research/SKILL.md +74 -0
  304. package/resources/ecc/skills/market-research/agents/openai.yaml +7 -0
  305. package/resources/ecc/skills/mcp-server-patterns/SKILL.md +66 -0
  306. package/resources/ecc/skills/mcp-server-patterns/agents/openai.yaml +7 -0
  307. package/resources/ecc/skills/mle-workflow/SKILL.md +346 -0
  308. package/resources/ecc/skills/mle-workflow/agents/openai.yaml +7 -0
  309. package/resources/ecc/skills/nextjs-turbopack/SKILL.md +43 -0
  310. package/resources/ecc/skills/nextjs-turbopack/agents/openai.yaml +7 -0
  311. package/resources/ecc/skills/product-capability/SKILL.md +140 -0
  312. package/resources/ecc/skills/product-capability/agents/openai.yaml +7 -0
  313. package/resources/ecc/skills/security-review/SKILL.md +494 -0
  314. package/resources/ecc/skills/security-review/agents/openai.yaml +7 -0
  315. package/resources/ecc/skills/strategic-compact/SKILL.md +102 -0
  316. package/resources/ecc/skills/strategic-compact/agents/openai.yaml +7 -0
  317. package/resources/ecc/skills/tdd-workflow/SKILL.md +409 -0
  318. package/resources/ecc/skills/tdd-workflow/agents/openai.yaml +7 -0
  319. package/resources/ecc/skills/verification-loop/SKILL.md +125 -0
  320. package/resources/ecc/skills/verification-loop/agents/openai.yaml +7 -0
  321. package/resources/ecc/skills/video-editing/SKILL.md +307 -0
  322. package/resources/ecc/skills/video-editing/agents/openai.yaml +7 -0
  323. package/resources/ecc/skills/x-api/SKILL.md +229 -0
  324. package/resources/ecc/skills/x-api/agents/openai.yaml +7 -0
@@ -0,0 +1,7 @@
+ interface:
+   display_name: "Investor Outreach"
+   short_description: "Personalized investor outreach and follow-ups"
+   brand_color: "#059669"
+   default_prompt: "Use $investor-outreach to write concise personalized investor outreach."
+ policy:
+   allow_implicit_invocation: true
@@ -0,0 +1,74 @@
+ ---
+ name: market-research
+ description: Conduct market research, competitive analysis, investor due diligence, and industry intelligence with source attribution and decision-oriented summaries. Use when the user wants market sizing, competitor comparisons, fund research, technology scans, or research that informs business decisions.
+ ---
+
+ # Market Research
+
+ Produce research that supports decisions, not research theater.
+
+ ## When to Activate
+
+ - researching a market, category, company, investor, or technology trend
+ - building TAM/SAM/SOM estimates
+ - comparing competitors or adjacent products
+ - preparing investor dossiers before outreach
+ - pressure-testing a thesis before building, funding, or entering a market
+
+ ## Research Standards
+
+ 1. Every important claim needs a source.
+ 2. Prefer recent data and call out stale data.
+ 3. Include contrarian evidence and downside cases.
+ 4. Translate findings into a decision, not just a summary.
+ 5. Separate fact, inference, and recommendation clearly.
+
+ ## Common Research Modes
+
+ ### Investor / Fund Diligence
+ Collect:
+ - fund size, stage, and typical check size
+ - relevant portfolio companies
+ - public thesis and recent activity
+ - reasons the fund is or is not a fit
+ - any obvious red flags or mismatches
+
+ ### Competitive Analysis
+ Collect:
+ - product reality, not marketing copy
+ - funding and investor history if public
+ - traction metrics if public
+ - distribution and pricing clues
+ - strengths, weaknesses, and positioning gaps
+
+ ### Market Sizing
+ Use:
+ - top-down estimates from reports or public datasets
+ - bottom-up sanity checks from realistic customer acquisition assumptions
+ - explicit assumptions for every leap in logic
+
+ ### Technology / Vendor Research
+ Collect:
+ - how it works
+ - trade-offs and adoption signals
+ - integration complexity
+ - lock-in, security, compliance, and operational risk
+
+ ## Output Format
+
+ Default structure:
+ 1. executive summary
+ 2. key findings
+ 3. implications
+ 4. risks and caveats
+ 5. recommendation
+ 6. sources
+
+ ## Quality Gate
+
+ Before delivering:
+ - all numbers are sourced or labeled as estimates
+ - old data is flagged
+ - the recommendation follows from the evidence
+ - risks and counterarguments are included
+ - the output makes a decision easier
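
Illustrative sketch, not part of the package contents: the Market Sizing mode in the SKILL.md above asks for a top-down estimate, a bottom-up sanity check, and explicit assumptions. One way this might look in TypeScript is shown below. The names `Assumption`, `topDown`, `bottomUp`, and `crossCheck`, the 2x tolerance, and every figure are hypothetical placeholders, not real market data and not anything the package defines.

```typescript
// Hypothetical sketch: every number here is a placeholder, not real market data.
interface Assumption {
  label: string;  // what is being assumed
  value: number;  // the assumed figure
  source: string; // a citation, or "estimate" when unsourced
}

// Top-down: start from a reported market total and narrow it by share factors.
function topDown(marketTotal: Assumption, shares: Assumption[]): number {
  return shares.reduce((acc, share) => acc * share.value, marketTotal.value);
}

// Bottom-up: reachable customers x conversion rate x annual revenue per customer.
function bottomUp(
  customers: Assumption,
  conversion: Assumption,
  revenuePerCustomer: Assumption,
): number {
  return customers.value * conversion.value * revenuePerCustomer.value;
}

// Flag the pair when the larger estimate exceeds the smaller by more than `tolerance` times.
function crossCheck(
  top: number,
  bottom: number,
  tolerance = 2,
): { ratio: number; consistent: boolean } {
  const ratio = Math.max(top, bottom) / Math.min(top, bottom);
  return { ratio, consistent: ratio <= tolerance };
}

const top = topDown(
  { label: "category spend", value: 1_000_000_000, source: "estimate" },
  [
    { label: "serviceable segment share", value: 0.1, source: "estimate" },
    { label: "obtainable share", value: 0.05, source: "estimate" },
  ],
);
const bottom = bottomUp(
  { label: "reachable accounts", value: 20_000, source: "estimate" },
  { label: "conversion rate", value: 0.02, source: "estimate" },
  { label: "annual contract value", value: 10_000, source: "estimate" },
);
const check = crossCheck(top, bottom);
console.log({ top, bottom, ...check });
```

Carrying a `source` field on every input keeps the skill's "sourced or labeled as estimates" gate checkable: any assumption still marked `"estimate"` is visible at review time.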
@@ -0,0 +1,7 @@
+ interface:
+   display_name: "Market Research"
+   short_description: "Source-attributed market research"
+   brand_color: "#2563EB"
+   default_prompt: "Use $market-research to research markets with source-attributed findings."
+ policy:
+   allow_implicit_invocation: true
@@ -0,0 +1,66 @@
+ ---
+ name: mcp-server-patterns
+ description: Build MCP servers with Node/TypeScript SDK — tools, resources, prompts, Zod validation, stdio vs Streamable HTTP. Use Context7 or official MCP docs for latest API.
+ ---
+
+ # MCP Server Patterns
+
+ The Model Context Protocol (MCP) lets AI assistants call tools, read resources, and use prompts from your server. Use this skill when building or maintaining MCP servers. The SDK API evolves; check Context7 (query-docs for "MCP") or the official MCP documentation for current method names and signatures.
+
+ ## When to Use
+
+ Use when: implementing a new MCP server, adding tools or resources, choosing stdio vs HTTP, upgrading the SDK, or debugging MCP registration and transport issues.
+
+ ## How It Works
+
+ ### Core concepts
+
+ - **Tools**: Actions the model can invoke (e.g. search, run a command). Register with `registerTool()` or `tool()` depending on SDK version.
+ - **Resources**: Read-only data the model can fetch (e.g. file contents, API responses). Register with `registerResource()` or `resource()`. Handlers typically receive a `uri` argument.
+ - **Prompts**: Reusable, parameterised prompt templates the client can surface (e.g. in Claude Desktop). Register with `registerPrompt()` or equivalent.
+ - **Transport**: stdio for local clients (e.g. Claude Desktop); Streamable HTTP is preferred for remote (Cursor, cloud). Legacy HTTP/SSE is for backward compatibility.
+
+ The Node/TypeScript SDK may expose `tool()` / `resource()` or `registerTool()` / `registerResource()`; the official SDK has changed over time. Always verify against the current [MCP docs](https://modelcontextprotocol.io) or Context7.
+
+ ### Connecting with stdio
+
+ For local clients, create a stdio transport and pass it to your server’s connect method. The exact API varies by SDK version (e.g. constructor vs factory). See the official MCP documentation or query Context7 for "MCP stdio server" for the current pattern.
+
+ Keep server logic (tools + resources) independent of transport so you can plug in stdio or HTTP in the entrypoint.
+
+ ### Remote (Streamable HTTP)
+
+ For Cursor, cloud, or other remote clients, use **Streamable HTTP** (single MCP HTTP endpoint per current spec). Support legacy HTTP/SSE only when backward compatibility is required.
+
+ ## Examples
+
+ ### Install and server setup
+
+ ```bash
+ npm install @modelcontextprotocol/sdk zod
+ ```
+
+ ```typescript
+ import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
+ import { z } from "zod";
+
+ const server = new McpServer({ name: "my-server", version: "1.0.0" });
+ ```
+
+ Register tools and resources using the API your SDK version provides: some versions use `server.tool(name, description, schema, handler)` (positional args), others use `server.tool({ name, description, inputSchema }, handler)` or `registerTool()`. Same for resources — include a `uri` in the handler when the API provides it. Check the official MCP docs or Context7 for the current `@modelcontextprotocol/sdk` signatures to avoid copy-paste errors.
+
+ Use **Zod** (or the SDK’s preferred schema format) for input validation.
+
+ ## Best Practices
+
+ - **Schema first**: Define input schemas for every tool; document parameters and return shape.
+ - **Errors**: Return structured errors or messages the model can interpret; avoid raw stack traces.
+ - **Idempotency**: Prefer idempotent tools where possible so retries are safe.
+ - **Rate and cost**: For tools that call external APIs, consider rate limits and cost; document in the tool description.
+ - **Versioning**: Pin SDK version in package.json; check release notes when upgrading.
+
+ ## Official SDKs and Docs
+
+ - **JavaScript/TypeScript**: `@modelcontextprotocol/sdk` (npm). Use Context7 with library name "MCP" for current registration and transport patterns.
+ - **Go**: Official Go SDK on GitHub (`modelcontextprotocol/go-sdk`).
+ - **C#**: Official C# SDK for .NET.
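
Illustrative sketch, not part of the package contents: the SKILL.md above recommends keeping tool logic independent of the transport and returning structured errors rather than raw stack traces. The TypeScript below shows one way those two points could fit together. It deliberately avoids the `@modelcontextprotocol/sdk` API, whose signatures the skill says vary by version. `ToolRegistry`, `ToolHandler`, and `callTool` are hypothetical names, and the `ToolResult` type is a local stand-in loosely modeled on an MCP tool result, so check it against the current spec before relying on it.

```typescript
// Hypothetical sketch: local types stand in for the SDK; no real SDK calls are made.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// Transport-agnostic registry: a stdio or HTTP entrypoint would both call `callTool`.
class ToolRegistry {
  private tools = new Map<string, ToolHandler>();

  register(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
  }

  // Runs a tool and converts any failure into a structured result, never a stack trace.
  async callTool(name: string, args: Record<string, unknown>): Promise<ToolResult> {
    const handler = this.tools.get(name);
    if (!handler) {
      return { content: [{ type: "text", text: `Unknown tool: ${name}` }], isError: true };
    }
    try {
      const text = await handler(args);
      return { content: [{ type: "text", text }] };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { content: [{ type: "text", text: `Tool ${name} failed: ${message}` }], isError: true };
    }
  }
}

const registry = new ToolRegistry();
registry.register("echo", async (args) => {
  // Validate input before doing any work; a Zod schema could replace this manual check.
  const text = args["text"];
  if (typeof text !== "string") throw new Error("`text` must be a string");
  return text;
});

// A bad call yields a structured error the model can read.
registry
  .callTool("echo", { text: 42 })
  .then((result) => console.log(JSON.stringify(result)));
```

Because the registry knows nothing about stdio or HTTP, the same handlers can be wired to either transport in the entrypoint, which is the separation the skill describes.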
@@ -0,0 +1,7 @@
+ interface:
+   display_name: "MCP Server Patterns"
+   short_description: "MCP server tools, resources, and prompts"
+   brand_color: "#0EA5E9"
+   default_prompt: "Use $mcp-server-patterns to build MCP tools, resources, and prompts."
+ policy:
+   allow_implicit_invocation: true
@@ -0,0 +1,346 @@
+ ---
+ name: mle-workflow
+ description: Production machine-learning engineering workflow for data contracts, reproducible training, model evaluation, deployment, monitoring, and rollback. Use when building, reviewing, or hardening ML systems beyond one-off notebooks.
+ allowed-tools: Read, Write, Edit, Bash, Grep, Glob
+ ---
+
+ # Machine Learning Engineering Workflow
+
+ Use this skill to turn model work into a production ML system with clear data contracts, repeatable training, measurable quality gates, deployable artifacts, and operational monitoring.
+
+ ## When to Activate
+
+ - Planning or reviewing a production ML feature, model refresh, ranking system, recommender, classifier, embedding workflow, or forecasting pipeline
+ - Converting notebook code into a reusable training, evaluation, batch inference, or online inference pipeline
+ - Designing model promotion criteria, offline/online evals, experiment tracking, or rollback paths
+ - Debugging failures caused by data drift, label leakage, stale features, artifact mismatch, or inconsistent training and serving logic
+ - Adding model monitoring, canary rollout, shadow traffic, or post-deploy quality checks
+
+ ## Scope Calibration
+
+ Use only the lanes that fit the system in front of you. This skill is useful for ranking, search, recommendations, classifiers, forecasting, embeddings, LLM workflows, anomaly detection, and batch analytics, but it should not force one architecture onto all of them.
+
+ - Do not assume every model has supervised labels, online serving, a feature store, PyTorch, GPUs, human review, A/B tests, or real-time feedback.
+ - Do not add heavyweight MLOps machinery when a data contract, baseline, eval script, and rollback note would make the change reviewable.
+ - Do make assumptions explicit when the project lacks labels, delayed outcomes, slice definitions, production traffic, or monitoring ownership.
+ - Treat examples as interchangeable scaffolds. Replace metrics, serving mode, data stores, and rollout mechanics with the project-native equivalents.
+
+ ## Related Skills
+
+ - `python-patterns` and `python-testing` for Python implementation and pytest coverage
+ - `pytorch-patterns` for deep learning models, data loaders, device handling, and training loops
+ - `eval-harness` and `ai-regression-testing` for promotion gates and agent-assisted regression checks
+ - `database-migrations`, `postgres-patterns`, and `clickhouse-io` for data storage and analytics surfaces
+ - `deployment-patterns`, `docker-patterns`, and `security-review` for serving, secrets, containers, and production hardening
+
+ ## Reuse the SWE Surface
+
+ Do not treat MLE as separate from software engineering. Most ECC SWE workflows apply directly to ML systems, often with stricter failure modes:
+
+ The recommended `minimal --with capability:machine-learning` install keeps the core agent surface available alongside this skill. For skill-only or agent-limited harnesses, pair `skill:mle-workflow` with `agent:mle-reviewer` where the target supports agents.
+
+ | SWE surface | MLE use |
+ |-------------|---------|
+ | `product-capability` / `architecture-decision-records` | Turn model work into explicit product contracts and record irreversible data, model, and rollout choices |
+ | `repo-scan` / `codebase-onboarding` / `code-tour` | Find existing training, feature, serving, eval, and monitoring paths before introducing a parallel ML stack |
+ | `plan` / `feature-dev` | Scope model changes as product capabilities with data, eval, serving, and rollback phases |
+ | `tdd-workflow` / `python-testing` | Test feature transforms, split logic, metric calculations, artifact loading, and inference schemas before implementation |
+ | `code-reviewer` / `mle-reviewer` | Review code quality plus ML-specific leakage, reproducibility, promotion, and monitoring risks |
+ | `build-fix` / `pr-test-analyzer` | Diagnose broken CI, flaky evals, missing fixtures, and environment-specific model or dependency failures |
+ | `quality-gate` / `test-coverage` | Require automated evidence for transforms, metrics, inference contracts, promotion gates, and rollback behavior |
+ | `eval-harness` / `verification-loop` | Turn offline metrics, slice checks, latency budgets, and rollback drills into repeatable gates |
+ | `ai-regression-testing` | Preserve every production bug as a regression: missing feature, stale label, bad artifact, schema drift, or serving mismatch |
+ | `api-design` / `backend-patterns` | Design prediction APIs, batch jobs, idempotent retraining endpoints, and response envelopes |
+ | `database-migrations` / `postgres-patterns` / `clickhouse-io` | Version labels, feature snapshots, prediction logs, experiment metrics, and drift analytics |
+ | `deployment-patterns` / `docker-patterns` | Package reproducible training and serving images with health checks, resource limits, and rollback |
+ | `canary-watch` / `dashboard-builder` | Make rollout health visible with model-version, slice, drift, latency, cost, and delayed-label dashboards |
+ | `security-review` / `security-scan` | Check model artifacts, notebooks, prompts, datasets, and logs for secrets, PII, unsafe deserialization, and supply-chain risk |
+ | `e2e-testing` / `browser-qa` / `accessibility` | Test critical product flows that consume predictions, including explainability and fallback UI states |
+ | `benchmark` / `performance-optimizer` | Measure throughput, p95 latency, memory, GPU utilization, and cost per prediction or retrain |
+ | `cost-aware-llm-pipeline` / `token-budget-advisor` | Route LLM/embedding workloads by quality, latency, and budget instead of defaulting to the largest model |
+ | `documentation-lookup` / `search-first` | Verify current library behavior for model serving, feature stores, vector DBs, and eval tooling before coding |
+ | `git-workflow` / `github-ops` / `opensource-pipeline` | Package MLE changes for review with crisp scope, generated artifacts excluded, and reproducible test evidence |
+ | `strategic-compact` / `dmux-workflows` | Split long ML work into parallel tracks: data contract, eval harness, serving path, monitoring, and docs |
+
+ ## Ten MLE Task Simulations
+
+ Use these simulations as coverage checks when planning or reviewing MLE work. A strong MLE workflow should reduce each task to explicit contracts, reusable SWE surfaces, automated evidence, and a reviewable artifact.
+
+ | ID | Common MLE task | Streamlined ECC path | Required output | Pipeline lanes covered |
+ |----|-----------------|----------------------|-----------------|------------------------|
+ | MLE-01 | Frame an ambiguous prediction, ranking, recommender, classifier, embedding, or forecast capability | `product-capability`, `plan`, `architecture-decision-records`, `mle-workflow` | Iteration Compact naming who cares, decision owner, success metric, unacceptable mistakes, assumptions, constraints, and first experiment | product contract, stakeholder loss, risk, rollout |
+ | MLE-02 | Define metric goals, labels, data sources, and the mistake budget | `repo-scan`, `database-reviewer`, `database-migrations`, `postgres-patterns`, `clickhouse-io` | Data and metric contract with entity grain, label timing, label confidence, feature timing, point-in-time joins, split policy, and dataset snapshot | data contract, metric design, leakage, reproducibility |
+ | MLE-03 | Build a baseline model and scoring path before adding complexity | `tdd-workflow`, `python-testing`, `python-patterns`, `code-reviewer` | Baseline scorer with confusion matrix, calibration notes, latency/cost estimate, known weaknesses, and tests for score shape and determinism | baseline, scoring, testing, serving parity |
+ | MLE-04 | Generate features from hypotheses about what separates outcomes | `python-patterns`, `pytorch-patterns`, `docker-patterns`, `deployment-patterns` | Feature plan and transform module covering signal source, missing values, outliers, correlations, leakage checks, and train/serve equivalence | feature pipeline, leakage, training, artifacts |
+ | MLE-05 | Tune thresholds, configs, and model complexity under tradeoffs | `eval-harness`, `ai-regression-testing`, `quality-gate`, `test-coverage` | Threshold/config report comparing precision, recall, F1, AUC, calibration, group slices, latency, cost, complexity, and acceptable error classes | evaluation, threshold, promotion, regression |
+ | MLE-06 | Run error analysis and turn mistakes into the next experiment | `eval-harness`, `ai-regression-testing`, `mle-reviewer`, `silent-failure-hunter` | Error cluster report for false positives, false negatives, ambiguous labels, stale features, missing signals, and bug traces with lessons captured | error analysis, bug trace, iteration, regression |
77
+ | MLE-07 | Package a model artifact for batch or online inference | `api-design`, `backend-patterns`, `security-review`, `security-scan` | Versioned artifact bundle with preprocessing, config, dependency constraints, schema validation, safe loading, and PII-safe logs | artifact, security, inference contract |
78
+ | MLE-08 | Ship online serving or batch scoring with feedback capture | `api-design`, `backend-patterns`, `e2e-testing`, `browser-qa`, `accessibility` | Prediction endpoint or batch job with response envelope, timeout, batching, fallback, model version, confidence, feedback logging, and product-flow tests | serving, batch inference, fallback, user workflow |
79
+ | MLE-09 | Roll out a model with shadow traffic, canary, A/B test, or rollback | `canary-watch`, `dashboard-builder`, `verification-loop`, `performance-optimizer` | Rollout plan naming traffic split, dashboards, p95 latency, cost, quality guardrails, rollback artifact, and rollback trigger | deployment, canary, rollback |
80
+ | MLE-10 | Operate, debug, and refresh a production model after launch | `silent-failure-hunter`, `dashboard-builder`, `mle-reviewer`, `doc-updater`, `github-ops` | Observation ledger and refresh plan with drift checks, delayed-label health, alert owners, runbook updates, retrain criteria, and PR evidence | monitoring, incident response, retraining |
81
+
82
+ ## Iteration Compact
83
+
84
+ Before touching model code, compress the work into one reviewable artifact. This should be short enough to fit in a PR description and precise enough that another engineer can challenge the tradeoffs.
85
+
86
+ ```text
87
+ Goal:
88
+ Who cares:
89
+ Decision owner:
90
+ User or system action changed by the model:
91
+ Success metric:
92
+ Guardrail metrics:
93
+ Mistake budget:
94
+ Unacceptable mistakes:
95
+ Acceptable mistakes:
96
+ Assumptions:
97
+ Constraints:
98
+ Labels and data snapshot:
99
+ Baseline:
100
+ Candidate signals:
101
+ Threshold or config plan:
102
+ Eval slices:
103
+ Known risks:
104
+ Next experiment:
105
+ Rollback or fallback:
106
+ ```
107
+
108
+ This compact is the MLE equivalent of a strong SWE design note. It keeps the team from optimizing a metric no one trusts, adding features that do not address the real error mode, or shipping complexity without a rollback.
109
+
110
+ ## Decision Brain
111
+
112
+ Use this loop whenever the task is ambiguous, high-impact, or metric-heavy:
113
+
114
+ 1. Start from the decision, not the model. Name the action that changes downstream behavior.
115
+ 2. Name who cares and why. Different stakeholders pay different costs for false positives, false negatives, latency, compute spend, opacity, or missed opportunities.
116
+ 3. Convert ambiguity into hypotheses. Ask what signal would separate outcomes, what evidence would disprove it, and what simple baseline should be hard to beat.
117
+ 4. Research prior art or a nearby known problem before inventing a bespoke system.
118
+ 5. Score each choice as likelihood times consequence: `(probability, confidence) x (cost, severity, importance, impact)`.
119
+ 6. Consider adversarial behavior, incentives, selective disclosure, distribution shift, and feedback loops.
120
+ 7. Prefer the simplest change that reduces the most important mistake. Simplicity is not laziness; it is a way to minimize blunders while preserving iteration speed.
121
+ 8. Capture the decision, evidence, counterargument, and next reversible step.
122
+
123
+ ## Metric and Mistake Economics
124
+
125
+ Choose metrics from failure costs, not habit:
126
+
127
+ - Use a confusion matrix early so the team can discuss concrete false positives and false negatives instead of abstract accuracy.
128
+ - Favor precision when the cost of an incorrect positive decision dominates.
129
+ - Favor recall when the cost of a missed positive dominates.
130
+ - Use F1 only when the precision/recall tradeoff is genuinely balanced and explainable.
131
+ - Use AUC or ranking metrics when ordering quality matters more than a single threshold.
132
+ - Track latency, throughput, memory, and cost as first-class metrics because they shape feasible model complexity.
133
+ - Compare against a baseline and the current production model before celebrating an offline gain.
134
+ - Treat real-world feedback signals as delayed labels with bias, lag, and coverage gaps; do not treat them as ground truth without analysis.
135
+
136
+ Every metric choice should state which mistake it makes cheaper, which mistake it makes more likely, and who absorbs that cost.
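One way to make the mistake economics above concrete is to pick a threshold by total mistake cost rather than by a default metric. The following is a minimal sketch, not part of the package: `MistakeCosts`, `confusion_counts`, `total_cost`, and `cheapest_threshold` are hypothetical names, and the per-mistake costs are assumed to be agreed with the decision owner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MistakeCosts:
    # Hypothetical per-mistake costs, in whatever unit the team agrees on.
    false_positive: float
    false_negative: float


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def total_cost(scores, labels, threshold, costs):
    """Sum the cost of false positives and false negatives at one threshold."""
    _, fp, fn, _ = confusion_counts(scores, labels, threshold)
    return fp * costs.false_positive + fn * costs.false_negative


def cheapest_threshold(scores, labels, candidates, costs):
    """Pick the candidate threshold with the lowest total mistake cost."""
    return min(candidates, key=lambda t: total_cost(scores, labels, t, costs))
```

With the same scores and labels, raising the false-negative cost pushes the chosen threshold down (favoring recall), and raising the false-positive cost pushes it up (favoring precision).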
137
+
138
+ ## Data and Feature Hypotheses
139
+
140
+ Features should come from a theory of separation:
141
+
142
+ - Text, categorical fields, numeric histories, graph relationships, recency, frequency, and aggregates are candidate signal families, not automatic features.
143
+ - For every feature family, state why it should separate outcomes and how it could leak future information.
144
+ - For noisy labels, consider adjudication, label confidence, soft targets, or confidence weighting.
145
+ - For class imbalance, compare weighted loss, resampling, threshold movement, and calibrated decision rules.
146
+ - For missing values, decide whether absence is informative, imputable, or a reason to abstain.
147
+ - For outliers, decide whether to clip, bucket, investigate, or preserve them as rare but important signal.
148
+ - For correlated features, check whether they are redundant, unstable, or proxies for unavailable future state.
149
+
150
+ Do not add model complexity until error analysis shows that the baseline is failing for a reason additional signal or capacity can plausibly fix.
151
+
152
+ ## Error Analysis Loop
153
+
154
+ After each baseline, training run, threshold change, or config change:
155
+
156
+ 1. Split mistakes into false positives, false negatives, abstentions, low-confidence cases, and system failures.
157
+ 2. Cluster errors by shared traits: language, entity type, source, time, geography, device, sparsity, recency, feature freshness, label source, or model version.
158
+ 3. Separate model mistakes from data bugs, label ambiguity, product ambiguity, instrumentation gaps, and serving mismatches.
159
+ 4. Trace each major cluster to one of four moves: better labels, better features, better threshold/config, or better product fallback.
160
+ 5. Preserve every important mistake as a regression test, eval slice, dashboard panel, or runbook entry.
161
+ 6. Write the next iteration as a falsifiable experiment, not a vague "improve model" task.
162
+
163
+ The strongest MLE loop is not train -> metric -> ship. It is mistake -> cluster -> hypothesis -> experiment -> evidence -> simpler system.
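Steps 1 and 2 of the loop above can be sketched as a small grouping helper. This is an illustration under assumptions, not the package's implementation: the record keys `predicted` and `actual`, the trait name, and both function names are hypothetical.

```python
from collections import Counter


def outcome_of(predicted, actual):
    """Name the outcome of one binary prediction."""
    if predicted and actual:
        return "true_positive"
    if predicted:
        return "false_positive"
    if actual:
        return "false_negative"
    return "true_negative"


def cluster_mistakes(records, trait):
    """Count false positives and false negatives per value of `trait`."""
    clusters = Counter()
    for record in records:
        outcome = outcome_of(record["predicted"], record["actual"])
        if outcome in ("false_positive", "false_negative"):
            clusters[(outcome, record[trait])] += 1
    return clusters
```

Calling `cluster_mistakes(records, "source").most_common()` lists the largest clusters first, which gives a starting order for tracing each one to labels, features, threshold, or fallback.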
164
+
165
+ ## Observation Ledger
166
+
167
+ Keep a compact decision and evidence trail beside the code, PR, experiment report, or runbook:
168
+
169
+ ```text
170
+ Iteration:
171
+ Change:
172
+ Why this mattered:
173
+ Metric movement:
174
+ Slice movement:
175
+ False positives:
176
+ False negatives:
177
+ Unexpected errors:
178
+ Decision:
179
+ Tradeoff accepted:
180
+ Lesson captured:
181
+ Regression added:
182
+ Debt created:
183
+ Next iteration:
184
+ ```
185
+
186
+ Use the ledger to make model work cumulative. The goal is for each iteration to make the next decision easier, not merely to produce another artifact.
187
+
188
+ ## Core Workflow
189
+
190
+ ### 1. Define the Prediction Contract
191
+
192
+ Capture the product-level contract before writing model code:
193
+
194
+ - Prediction target and decision owner
195
+ - Input entity, output schema, confidence/calibration fields, and allowed latency
196
+ - Batch, online, streaming, or hybrid serving mode
197
+ - Fallback behavior when the model, feature store, or dependency is unavailable
198
+ - Human review or override path for high-impact decisions
199
+ - Privacy, retention, and audit requirements for inputs, predictions, and labels
200
+
201
+ Do not accept "improve the model" as a requirement. Tie the model to an observable product behavior and a measurable acceptance gate.
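A prediction contract like the one above can be expressed as a small response envelope plus an explicit fallback path. The sketch below is hypothetical: the `Prediction` fields, the `"fallback"` version label, and `predict_with_fallback` are illustrative names, and `model` is assumed to be any callable that maps a feature dict to a number.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    entity_id: str
    score: Optional[float]
    model_version: str
    used_fallback: bool


def predict_with_fallback(entity_id, features, model, model_version, fallback_score):
    """Score one entity, returning a labeled fallback if the model call fails."""
    try:
        score = float(model(features))
    except Exception:
        # Broad on purpose for the sketch; a real service would narrow this and log it.
        return Prediction(entity_id, fallback_score, "fallback", True)
    return Prediction(entity_id, score, model_version, False)
```

Because the envelope always carries `model_version` and `used_fallback`, downstream logs and tests can tell a real prediction from a degraded one.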
202
+
203
+ ### 2. Lock the Data Contract
204
+
205
+ Every ML task needs an explicit data contract:
206
+
207
+ - Entity grain and primary key
208
+ - Label definition, label timestamp, and label availability delay
209
+ - Feature timestamp, freshness SLA, and point-in-time join rules
210
+ - Train, validation, test, and backtest split policy
211
+ - Required columns, allowed nulls, ranges, categories, and units
212
+ - PII or sensitive fields that must not enter training artifacts or logs
213
+ - Dataset version or snapshot ID for reproducibility
214
+
215
+ Guard against leakage first. If a feature is not available at prediction time, or is joined using future information, remove it or move it to an analysis-only path.
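A point-in-time join can be sketched with the standard library alone. The example assumes each feature has a history of `(event_time, value)` pairs sorted by time; `latest_feature_as_of` is a hypothetical helper, not an API from the package.

```python
from bisect import bisect_right


def latest_feature_as_of(history, prediction_time):
    """Return the newest value whose event_time <= prediction_time, else None.

    `history` must be a list of (event_time, value) pairs sorted ascending.
    Whether a value stamped exactly at prediction time counts as available
    is a policy choice; this sketch allows it.
    """
    times = [event_time for event_time, _ in history]
    index = bisect_right(times, prediction_time)
    if index == 0:
        return None  # nothing was known yet, so the feature is missing
    return history[index - 1][1]
```

Any value recorded after `prediction_time` is never returned, which is the property a leakage check needs to verify.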
216
+
217
+ ### 3. Build a Reproducible Pipeline
218
+
219
+ Training code should be runnable by another engineer without hidden notebook state:
220
+
221
+ - Use typed config files or dataclasses for all hyperparameters and paths
222
+ - Pin package and model dependencies
223
+ - Set random seeds and document any nondeterministic GPU behavior
224
+ - Record dataset version, code SHA, config hash, metrics, and artifact URI
225
+ - Save preprocessing logic with the model artifact, not separately in a notebook
226
+ - Keep train, eval, and inference transformations shared or generated from one source
227
+ - Make every step idempotent so retries do not corrupt artifacts or metrics
228
+
229
+ Prefer immutable values and pure transformation functions. Avoid mutating shared data frames or global config during feature generation.
230
+
231
+ ```python
232
+ import hashlib
233
+ from dataclasses import dataclass
234
+ from pathlib import Path
235
+
236
+
237
+ @dataclass(frozen=True)
238
+ class TrainingConfig:
239
+     dataset_uri: str
239
+     model_dir: Path
240
+     seed: int
241
+     learning_rate: float
242
+     batch_size: int
243
+
244
+
245
+ def artifact_name(config: TrainingConfig, code_sha: str) -> str:
246
+     config_key = f"{config.dataset_uri}:{config.seed}:{config.learning_rate}:{config.batch_size}"
247
+     config_hash = hashlib.sha256(config_key.encode("utf-8")).hexdigest()[:12]
248
+     return f"{code_sha[:12]}-{config_hash}"
250
+ ```
251
+
252
+ ### 4. Evaluate Before Promotion
253
+
254
+ Promotion criteria should be declared before training finishes:
255
+
256
+ - Baseline model and current production model comparison
257
+ - Primary metric aligned to product behavior
258
+ - Guardrail metrics for latency, calibration, fairness slices, cost, and error concentration
259
+ - Slice metrics for important cohorts, geographies, devices, languages, or data sources
260
+ - Confidence intervals or repeated-run variance when metrics are noisy
261
+ - Failure examples reviewed by a human for high-impact models
262
+ - Explicit "do not ship" thresholds
263
+
264
+ ```python
265
+ PROMOTION_GATES = {
266
+     "auc": ("min", 0.82),
266
+     "calibration_error": ("max", 0.04),
267
+     "p95_latency_ms": ("max", 80),
268
+ }
269
+
270
+
271
+ def assert_promotion_ready(metrics: dict[str, float]) -> None:
272
+     missing = sorted(name for name in PROMOTION_GATES if name not in metrics)
273
+     if missing:
274
+         raise ValueError(f"Model promotion metrics missing required gates: {missing}")
275
+
276
+     failures = {
277
+         name: value
278
+         for name, (direction, threshold) in PROMOTION_GATES.items()
279
+         for value in [metrics[name]]
280
+         if (direction == "min" and value < threshold)
281
+         or (direction == "max" and value > threshold)
282
+     }
283
+     if failures:
284
+         raise ValueError(f"Model failed promotion gates: {failures}")
286
+ ```
287
+
288
+ Use offline metrics as gates, not guarantees. When the model changes product behavior, plan shadow evaluation, canary rollout, or A/B testing before full rollout.
289
+
290
+ ### 5. Package for Serving
291
+
292
+ An ML artifact is production-ready only when the serving contract is testable:
293
+
294
+ - Model artifact includes version, training data reference, config, and preprocessing
295
+ - Input schema rejects invalid, stale, or out-of-range features
296
+ - Output schema includes model version and confidence or explanation fields when useful
297
+ - Serving path has timeout, batching, resource limits, and fallback behavior
298
+ - CPU/GPU requirements are explicit and tested
299
+ - Prediction logs avoid PII and include enough identifiers for debugging and label joins
300
+ - Integration tests cover missing features, stale features, bad types, empty batches, and fallback path
301
+
302
+ Never let training-only feature code diverge from serving feature code without a test that proves equivalence.
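An equivalence test of the kind described above might look like the following sketch. Both transforms are hypothetical stand-ins for real training and serving code, and `find_transform_mismatches` is an illustrative helper name.

```python
import math


def training_transform(row):
    # Hypothetical training-side feature code.
    return {
        "log_amount": math.log1p(row["amount"]),
        "is_weekend": row["day_of_week"] >= 5,
    }


def serving_transform(row):
    # Hypothetical serving-side reimplementation of the same features.
    return {
        "log_amount": math.log(1.0 + row["amount"]),
        "is_weekend": row["day_of_week"] in (5, 6),
    }


def find_transform_mismatches(rows, tolerance=1e-9):
    """Return (row, feature_name) pairs where the two transforms disagree."""
    mismatches = []
    for row in rows:
        expected = training_transform(row)
        actual = serving_transform(row)
        if expected.keys() != actual.keys():
            mismatches.append((row, "feature names differ"))
            continue
        for name, want in expected.items():
            got = actual[name]
            if isinstance(want, float):
                same = math.isclose(want, got, rel_tol=tolerance, abs_tol=tolerance)
            else:
                same = want == got
            if not same:
                mismatches.append((row, name))
    return mismatches
```

The two versions agree on ordinary rows but diverge on an out-of-range `day_of_week` such as 7, which is the kind of silent drift the test is meant to surface before deployment.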
303
+
304
+ ### 6. Operate the Model
305
+
306
+ Model monitoring needs both system and quality signals:
307
+
308
+ - Availability, error rate, timeout rate, queue depth, and p50/p95/p99 latency
309
+ - Feature null rate, range drift, categorical drift, and freshness drift
310
+ - Prediction distribution drift and confidence distribution drift
311
+ - Label arrival health and delayed quality metrics
312
+ - Business KPI guardrails and rollback triggers
313
+ - Per-version dashboards for canaries and rollbacks
314
+
315
+ Every deployment should have a rollback plan that names the previous artifact, config, data dependency, and traffic-switch mechanism.
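Feature or prediction drift can be quantified in many ways. The sketch below uses the population stability index (PSI) over pre-binned counts as one common option; it is a substitute illustration, not a method the source prescribes. The function name and the `epsilon` floor for empty bins are assumptions, and the alert threshold is left to the team.

```python
import math


def population_stability_index(expected_counts, actual_counts, epsilon=1e-6):
    """Compute PSI between two histograms that share the same bins."""
    if len(expected_counts) != len(actual_counts):
        raise ValueError("histograms must use the same bins")
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    if expected_total <= 0 or actual_total <= 0:
        raise ValueError("histograms must not be empty")

    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        # Floor each proportion so empty bins do not produce log(0).
        e = max(expected / expected_total, epsilon)
        a = max(actual / actual_total, epsilon)
        psi += (a - e) * math.log(a / e)
    return psi
```

Identical distributions score 0.0, and the value grows as the serving histogram moves away from the training reference, so it can back a per-feature dashboard panel or a rollback trigger.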
316
+
317
+ ## Review Checklist
318
+
319
+ - [ ] Prediction contract is explicit and testable
320
+ - [ ] Data contract defines entity grain, label timing, feature timing, and snapshot/version
321
+ - [ ] Leakage risks were checked against prediction-time availability
322
+ - [ ] Training is reproducible from code, config, data version, and seed
323
+ - [ ] Metrics compare against baseline and current production model
324
+ - [ ] Slice metrics and guardrails are included for high-risk cohorts
325
+ - [ ] Promotion gates are automated and fail closed
326
+ - [ ] Training and serving transformations are shared or equivalence-tested
327
+ - [ ] Model artifact carries version, config, dataset reference, and preprocessing
328
+ - [ ] Serving path validates inputs and has timeout, fallback, and rollback behavior
329
+ - [ ] Monitoring covers system health, feature drift, prediction drift, and delayed labels
330
+ - [ ] Sensitive data is excluded from artifacts, logs, prompts, and examples
331
+
332
+ ## Anti-Patterns
333
+
334
+ - Notebook state is required to reproduce the model
335
+ - Random split leaks future data into validation or test sets
336
+ - Feature joins ignore event time and label availability
337
+ - Offline metric improves while important slices regress
338
+ - Thresholds are tuned on the test set repeatedly
339
+ - Training preprocessing is copied manually into serving code
340
+ - Model version is missing from prediction logs
341
+ - Monitoring only checks service uptime, not data or prediction quality
342
+ - Rollback requires retraining instead of switching to a known-good artifact
343
+
344
+ ## Output Expectations
345
+
346
+ When using this skill, return concrete artifacts: data contract, promotion gates, pipeline steps, test plan, deployment plan, or review findings. Call out unknowns that block production readiness instead of filling them with assumptions.
@@ -0,0 +1,7 @@
1
+ interface:
2
+ display_name: "MLE Workflow"
3
+ short_description: "Production ML workflow and review gates"
4
+ brand_color: "#2563EB"
5
+ default_prompt: "Use $mle-workflow to plan or review a production ML pipeline."
6
+ policy:
7
+ allow_implicit_invocation: true
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: nextjs-turbopack
3
+ description: Next.js 16+ and Turbopack — incremental bundling, FS caching, dev speed, and when to use Turbopack vs webpack.
4
+ ---
5
+
6
+ # Next.js and Turbopack
7
+
8
+ Next.js 16+ uses Turbopack by default for local development: an incremental bundler written in Rust that significantly speeds up dev startup and hot updates.
9
+
10
+ ## When to Use
11
+
12
+ - **Turbopack (default dev)**: Use for day-to-day development. Faster cold start and HMR, especially in large apps.
13
+ - **Webpack (legacy dev)**: Use only if you hit a Turbopack bug or rely on a webpack-only plugin in dev. Switch back to webpack with `--webpack` (or `--no-turbopack`, depending on your Next.js version; check the docs for your release).
14
+ - **Production**: `next build` may use Turbopack or webpack depending on your Next.js version; check the official Next.js docs for your release.
15
+
16
+ Use when: developing or debugging Next.js 16+ apps, diagnosing slow dev startup or HMR, or optimizing production bundles.
17
+
18
+ ## How It Works
19
+
20
+ - **Turbopack**: Incremental bundler for Next.js dev. Uses file-system caching so restarts are much faster (e.g. 5–14x on large projects).
21
+ - **Default in dev**: From Next.js 16, `next dev` runs with Turbopack unless disabled.
22
+ - **File-system caching**: Restarts reuse previous work. The cache typically lives under `.next`, and basic use needs no extra config.
23
+ - **Bundle Analyzer (Next.js 16.1+)**: Experimental Bundle Analyzer to inspect output and find heavy dependencies; enable via config or experimental flag (see Next.js docs for your version).
24
+
25
+ ## Examples
26
+
27
+ ### Commands
28
+
29
+ ```bash
30
+ next dev
31
+ next build
32
+ next start
33
+ ```
34
+
35
+ ### Usage
36
+
37
+ Run `next dev` for local development with Turbopack. Use the Bundle Analyzer (see Next.js docs) to optimize code-splitting and trim large dependencies. Prefer the App Router and Server Components where possible.
38
+
39
+ ## Best Practices
40
+
41
+ - Stay on a recent Next.js 16.x for stable Turbopack and caching behavior.
42
+ - If dev is slow, ensure you're on Turbopack (default) and that the cache isn't being cleared unnecessarily.
43
+ - For production bundle size issues, use the official Next.js bundle analysis tooling for your version.
@@ -0,0 +1,7 @@
1
+ interface:
2
+ display_name: "Next.js Turbopack"
3
+ short_description: "Next.js and Turbopack workflow guidance"
4
+ brand_color: "#000000"
5
+ default_prompt: "Use $nextjs-turbopack to work through Next.js and Turbopack decisions."
6
+ policy:
7
+ allow_implicit_invocation: true