@rubix0270/arboris 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (451) hide show
  1. package/package.json +8 -19
  2. package/run.mjs +10 -0
  3. package/dist/cli.mjs +0 -382
  4. package/manifest.json +0 -323
  5. package/prisma/skills/accessibility/SKILL.md +0 -147
  6. package/prisma/skills/agent-architecture-audit/SKILL.md +0 -257
  7. package/prisma/skills/agent-eval/SKILL.md +0 -146
  8. package/prisma/skills/agent-harness-construction/SKILL.md +0 -74
  9. package/prisma/skills/agent-introspection-debugging/SKILL.md +0 -154
  10. package/prisma/skills/agent-payment-x402/SKILL.md +0 -225
  11. package/prisma/skills/agent-self-evaluation/SKILL.md +0 -182
  12. package/prisma/skills/agent-self-evaluation/examples/high-score-example.md +0 -87
  13. package/prisma/skills/agent-self-evaluation/examples/low-score-example.md +0 -86
  14. package/prisma/skills/agent-self-evaluation/references/evaluation-criteria.md +0 -71
  15. package/prisma/skills/agent-self-evaluation/references/hook-integration.md +0 -64
  16. package/prisma/skills/agent-self-evaluation/scripts/evaluate.py +0 -408
  17. package/prisma/skills/agent-self-evaluation/templates/evaluation-report.md +0 -86
  18. package/prisma/skills/agent-sort/SKILL.md +0 -216
  19. package/prisma/skills/agentic-engineering/SKILL.md +0 -64
  20. package/prisma/skills/agentic-os/SKILL.md +0 -388
  21. package/prisma/skills/ai-first-engineering/SKILL.md +0 -52
  22. package/prisma/skills/ai-regression-testing/SKILL.md +0 -386
  23. package/prisma/skills/android-clean-architecture/SKILL.md +0 -340
  24. package/prisma/skills/angular-developer/SKILL.md +0 -155
  25. package/prisma/skills/angular-developer/references/angular-animations.md +0 -160
  26. package/prisma/skills/angular-developer/references/angular-aria.md +0 -410
  27. package/prisma/skills/angular-developer/references/cli.md +0 -86
  28. package/prisma/skills/angular-developer/references/component-harnesses.md +0 -59
  29. package/prisma/skills/angular-developer/references/component-styling.md +0 -91
  30. package/prisma/skills/angular-developer/references/components.md +0 -117
  31. package/prisma/skills/angular-developer/references/creating-services.md +0 -97
  32. package/prisma/skills/angular-developer/references/data-resolvers.md +0 -69
  33. package/prisma/skills/angular-developer/references/define-routes.md +0 -67
  34. package/prisma/skills/angular-developer/references/defining-providers.md +0 -72
  35. package/prisma/skills/angular-developer/references/di-fundamentals.md +0 -120
  36. package/prisma/skills/angular-developer/references/e2e-testing.md +0 -56
  37. package/prisma/skills/angular-developer/references/effects.md +0 -83
  38. package/prisma/skills/angular-developer/references/hierarchical-injectors.md +0 -43
  39. package/prisma/skills/angular-developer/references/host-elements.md +0 -80
  40. package/prisma/skills/angular-developer/references/injection-context.md +0 -63
  41. package/prisma/skills/angular-developer/references/inputs.md +0 -101
  42. package/prisma/skills/angular-developer/references/linked-signal.md +0 -59
  43. package/prisma/skills/angular-developer/references/loading-strategies.md +0 -61
  44. package/prisma/skills/angular-developer/references/mcp.md +0 -108
  45. package/prisma/skills/angular-developer/references/navigate-to-routes.md +0 -69
  46. package/prisma/skills/angular-developer/references/outputs.md +0 -86
  47. package/prisma/skills/angular-developer/references/reactive-forms.md +0 -122
  48. package/prisma/skills/angular-developer/references/rendering-strategies.md +0 -44
  49. package/prisma/skills/angular-developer/references/resource.md +0 -77
  50. package/prisma/skills/angular-developer/references/route-animations.md +0 -56
  51. package/prisma/skills/angular-developer/references/route-guards.md +0 -52
  52. package/prisma/skills/angular-developer/references/router-lifecycle.md +0 -45
  53. package/prisma/skills/angular-developer/references/router-testing.md +0 -87
  54. package/prisma/skills/angular-developer/references/show-routes-with-outlets.md +0 -68
  55. package/prisma/skills/angular-developer/references/signal-forms.md +0 -795
  56. package/prisma/skills/angular-developer/references/signals-overview.md +0 -94
  57. package/prisma/skills/angular-developer/references/tailwind-css.md +0 -69
  58. package/prisma/skills/angular-developer/references/template-driven-forms.md +0 -114
  59. package/prisma/skills/angular-developer/references/testing-fundamentals.md +0 -65
  60. package/prisma/skills/api-connector-builder/SKILL.md +0 -121
  61. package/prisma/skills/api-design/SKILL.md +0 -524
  62. package/prisma/skills/architecture-decision-records/SKILL.md +0 -180
  63. package/prisma/skills/article-writing/SKILL.md +0 -80
  64. package/prisma/skills/automation-audit-ops/SKILL.md +0 -143
  65. package/prisma/skills/autonomous-agent-harness/SKILL.md +0 -274
  66. package/prisma/skills/autonomous-loops/SKILL.md +0 -611
  67. package/prisma/skills/backend-patterns/SKILL.md +0 -562
  68. package/prisma/skills/benchmark/SKILL.md +0 -94
  69. package/prisma/skills/benchmark-methodology/SKILL.md +0 -190
  70. package/prisma/skills/benchmark-optimization-loop/SKILL.md +0 -70
  71. package/prisma/skills/blender-motion-state-inspection/SKILL.md +0 -165
  72. package/prisma/skills/blueprint/SKILL.md +0 -106
  73. package/prisma/skills/brand-discovery/SKILL.md +0 -145
  74. package/prisma/skills/brand-discovery/references/10_purpose-why.md +0 -40
  75. package/prisma/skills/brand-discovery/references/20_positioning.md +0 -44
  76. package/prisma/skills/brand-discovery/references/30_audience-niche.md +0 -52
  77. package/prisma/skills/brand-discovery/references/40_personality-archetype.md +0 -57
  78. package/prisma/skills/brand-discovery/references/50_voice-tone.md +0 -59
  79. package/prisma/skills/brand-discovery/references/60_narrative-story.md +0 -50
  80. package/prisma/skills/brand-discovery/references/70_founder-tension.md +0 -49
  81. package/prisma/skills/brand-discovery/references/90_SYNTHESIS.md +0 -133
  82. package/prisma/skills/brand-voice/SKILL.md +0 -98
  83. package/prisma/skills/brand-voice/references/voice-profile-schema.md +0 -55
  84. package/prisma/skills/browser-qa/SKILL.md +0 -105
  85. package/prisma/skills/bun-runtime/SKILL.md +0 -85
  86. package/prisma/skills/canary-watch/SKILL.md +0 -108
  87. package/prisma/skills/carrier-relationship-management/SKILL.md +0 -212
  88. package/prisma/skills/cisco-ios-patterns/SKILL.md +0 -164
  89. package/prisma/skills/ck/SKILL.md +0 -148
  90. package/prisma/skills/ck/commands/forget.mjs +0 -44
  91. package/prisma/skills/ck/commands/info.mjs +0 -24
  92. package/prisma/skills/ck/commands/init.mjs +0 -143
  93. package/prisma/skills/ck/commands/list.mjs +0 -40
  94. package/prisma/skills/ck/commands/migrate.mjs +0 -202
  95. package/prisma/skills/ck/commands/resume.mjs +0 -36
  96. package/prisma/skills/ck/commands/save.mjs +0 -210
  97. package/prisma/skills/ck/commands/shared.mjs +0 -387
  98. package/prisma/skills/ck/hooks/session-start.mjs +0 -224
  99. package/prisma/skills/claude-devfleet/SKILL.md +0 -112
  100. package/prisma/skills/click-path-audit/SKILL.md +0 -245
  101. package/prisma/skills/clickhouse-io/SKILL.md +0 -440
  102. package/prisma/skills/code-tour/SKILL.md +0 -254
  103. package/prisma/skills/codebase-onboarding/SKILL.md +0 -234
  104. package/prisma/skills/codehealth-mcp/SKILL.md +0 -167
  105. package/prisma/skills/coding-standards/SKILL.md +0 -551
  106. package/prisma/skills/competitive-platform-analysis/SKILL.md +0 -214
  107. package/prisma/skills/competitive-report-structure/SKILL.md +0 -162
  108. package/prisma/skills/compose-multiplatform-patterns/SKILL.md +0 -300
  109. package/prisma/skills/config-gc/SKILL.md +0 -120
  110. package/prisma/skills/configure-ecc/SKILL.md +0 -385
  111. package/prisma/skills/connections-optimizer/SKILL.md +0 -190
  112. package/prisma/skills/content-engine/SKILL.md +0 -132
  113. package/prisma/skills/content-hash-cache-pattern/SKILL.md +0 -162
  114. package/prisma/skills/context-budget/SKILL.md +0 -136
  115. package/prisma/skills/continuous-agent-loop/SKILL.md +0 -46
  116. package/prisma/skills/continuous-learning/SKILL.md +0 -132
  117. package/prisma/skills/continuous-learning/config.json +0 -18
  118. package/prisma/skills/continuous-learning/evaluate-session.sh +0 -69
  119. package/prisma/skills/continuous-learning-v2/SKILL.md +0 -361
  120. package/prisma/skills/continuous-learning-v2/agents/observer-loop.sh +0 -359
  121. package/prisma/skills/continuous-learning-v2/agents/observer.md +0 -189
  122. package/prisma/skills/continuous-learning-v2/agents/session-guardian.sh +0 -150
  123. package/prisma/skills/continuous-learning-v2/agents/start-observer.sh +0 -248
  124. package/prisma/skills/continuous-learning-v2/config.json +0 -8
  125. package/prisma/skills/continuous-learning-v2/hooks/observe.sh +0 -585
  126. package/prisma/skills/continuous-learning-v2/scripts/detect-project.sh +0 -322
  127. package/prisma/skills/continuous-learning-v2/scripts/instinct-cli.py +0 -1956
  128. package/prisma/skills/continuous-learning-v2/scripts/lib/homunculus-dir.sh +0 -31
  129. package/prisma/skills/continuous-learning-v2/scripts/migrate-homunculus.sh +0 -68
  130. package/prisma/skills/continuous-learning-v2/scripts/test_parse_instinct.py +0 -1421
  131. package/prisma/skills/cost-aware-llm-pipeline/SKILL.md +0 -184
  132. package/prisma/skills/cost-tracking/SKILL.md +0 -97
  133. package/prisma/skills/council/SKILL.md +0 -204
  134. package/prisma/skills/cpp-coding-standards/SKILL.md +0 -724
  135. package/prisma/skills/cpp-testing/SKILL.md +0 -325
  136. package/prisma/skills/crosspost/SKILL.md +0 -112
  137. package/prisma/skills/csharp-testing/SKILL.md +0 -322
  138. package/prisma/skills/customer-billing-ops/SKILL.md +0 -141
  139. package/prisma/skills/customs-trade-compliance/SKILL.md +0 -263
  140. package/prisma/skills/dart-flutter-patterns/SKILL.md +0 -564
  141. package/prisma/skills/dashboard-builder/SKILL.md +0 -109
  142. package/prisma/skills/data-scraper-agent/SKILL.md +0 -765
  143. package/prisma/skills/data-throughput-accelerator/SKILL.md +0 -73
  144. package/prisma/skills/database-migrations/SKILL.md +0 -430
  145. package/prisma/skills/deep-research/SKILL.md +0 -160
  146. package/prisma/skills/defi-amm-security/SKILL.md +0 -167
  147. package/prisma/skills/delivery-gate/SKILL.md +0 -126
  148. package/prisma/skills/delivery-gate/hooks/quality-gate.py +0 -220
  149. package/prisma/skills/deployment-patterns/SKILL.md +0 -428
  150. package/prisma/skills/design-system/SKILL.md +0 -83
  151. package/prisma/skills/django-celery/SKILL.md +0 -458
  152. package/prisma/skills/django-patterns/SKILL.md +0 -735
  153. package/prisma/skills/django-security/SKILL.md +0 -644
  154. package/prisma/skills/django-tdd/SKILL.md +0 -730
  155. package/prisma/skills/django-verification/SKILL.md +0 -470
  156. package/prisma/skills/dmux-workflows/SKILL.md +0 -192
  157. package/prisma/skills/docker-patterns/SKILL.md +0 -365
  158. package/prisma/skills/documentation-lookup/SKILL.md +0 -91
  159. package/prisma/skills/dotnet-patterns/SKILL.md +0 -322
  160. package/prisma/skills/dynamic-workflow-mode/SKILL.md +0 -124
  161. package/prisma/skills/e2e-testing/SKILL.md +0 -327
  162. package/prisma/skills/ecc-guide/SKILL.md +0 -190
  163. package/prisma/skills/ecc-recipes/SKILL.md +0 -149
  164. package/prisma/skills/ecc-tools-cost-audit/SKILL.md +0 -161
  165. package/prisma/skills/email-ops/SKILL.md +0 -122
  166. package/prisma/skills/energy-procurement/SKILL.md +0 -228
  167. package/prisma/skills/enterprise-agent-ops/SKILL.md +0 -51
  168. package/prisma/skills/error-handling/SKILL.md +0 -377
  169. package/prisma/skills/eval-harness/SKILL.md +0 -271
  170. package/prisma/skills/evm-token-decimals/SKILL.md +0 -131
  171. package/prisma/skills/exa-search/SKILL.md +0 -108
  172. package/prisma/skills/fal-ai-media/SKILL.md +0 -289
  173. package/prisma/skills/fastapi-patterns/SKILL.md +0 -514
  174. package/prisma/skills/finance-billing-ops/SKILL.md +0 -128
  175. package/prisma/skills/flox-environments/SKILL.md +0 -497
  176. package/prisma/skills/flutter-dart-code-review/SKILL.md +0 -436
  177. package/prisma/skills/foundation-models-on-device/SKILL.md +0 -243
  178. package/prisma/skills/frontend-a11y/SKILL.md +0 -446
  179. package/prisma/skills/frontend-design-direction/SKILL.md +0 -93
  180. package/prisma/skills/frontend-patterns/SKILL.md +0 -657
  181. package/prisma/skills/frontend-slides/SKILL.md +0 -185
  182. package/prisma/skills/frontend-slides/STYLE_PRESETS.md +0 -330
  183. package/prisma/skills/frontend-slides/animation-patterns.md +0 -122
  184. package/prisma/skills/frontend-slides/html-template.md +0 -419
  185. package/prisma/skills/frontend-slides/scripts/export-pdf.sh +0 -418
  186. package/prisma/skills/frontend-slides/scripts/extract-pptx.py +0 -96
  187. package/prisma/skills/frontend-slides/viewport-base.css +0 -153
  188. package/prisma/skills/fsharp-testing/SKILL.md +0 -281
  189. package/prisma/skills/gan-style-harness/SKILL.md +0 -279
  190. package/prisma/skills/gateguard/SKILL.md +0 -133
  191. package/prisma/skills/generating-python-installer/SKILL.md +0 -820
  192. package/prisma/skills/git-workflow/SKILL.md +0 -716
  193. package/prisma/skills/github-ops/SKILL.md +0 -145
  194. package/prisma/skills/golang-patterns/SKILL.md +0 -675
  195. package/prisma/skills/golang-testing/SKILL.md +0 -721
  196. package/prisma/skills/google-workspace-ops/SKILL.md +0 -96
  197. package/prisma/skills/growth-log/SKILL.md +0 -128
  198. package/prisma/skills/healthcare-cdss-patterns/SKILL.md +0 -246
  199. package/prisma/skills/healthcare-emr-patterns/SKILL.md +0 -160
  200. package/prisma/skills/healthcare-eval-harness/SKILL.md +0 -208
  201. package/prisma/skills/healthcare-phi-compliance/SKILL.md +0 -146
  202. package/prisma/skills/hermes-imports/SKILL.md +0 -89
  203. package/prisma/skills/hexagonal-architecture/SKILL.md +0 -277
  204. package/prisma/skills/hipaa-compliance/SKILL.md +0 -79
  205. package/prisma/skills/homelab-network-readiness/SKILL.md +0 -170
  206. package/prisma/skills/homelab-network-setup/SKILL.md +0 -130
  207. package/prisma/skills/homelab-pihole-dns/SKILL.md +0 -275
  208. package/prisma/skills/homelab-vlan-segmentation/SKILL.md +0 -312
  209. package/prisma/skills/homelab-wireguard-vpn/SKILL.md +0 -306
  210. package/prisma/skills/hookify-rules/SKILL.md +0 -128
  211. package/prisma/skills/inherit-legacy-style/SKILL.md +0 -157
  212. package/prisma/skills/intent-driven-development/SKILL.md +0 -360
  213. package/prisma/skills/inventory-demand-planning/SKILL.md +0 -247
  214. package/prisma/skills/investor-materials/SKILL.md +0 -97
  215. package/prisma/skills/investor-outreach/SKILL.md +0 -92
  216. package/prisma/skills/ios-icon-gen/SKILL.md +0 -158
  217. package/prisma/skills/ios-icon-gen/scripts/generate_icons.swift +0 -258
  218. package/prisma/skills/ios-icon-gen/scripts/iconify_gen.sh +0 -235
  219. package/prisma/skills/iterative-retrieval/SKILL.md +0 -212
  220. package/prisma/skills/ito-basket-compare/SKILL.md +0 -64
  221. package/prisma/skills/ito-data-atlas-agent/SKILL.md +0 -64
  222. package/prisma/skills/ito-market-intelligence/SKILL.md +0 -61
  223. package/prisma/skills/ito-trade-planner/SKILL.md +0 -68
  224. package/prisma/skills/java-coding-standards/SKILL.md +0 -384
  225. package/prisma/skills/jira-integration/SKILL.md +0 -303
  226. package/prisma/skills/jpa-patterns/SKILL.md +0 -152
  227. package/prisma/skills/knowledge-ops/SKILL.md +0 -155
  228. package/prisma/skills/kotlin-coroutines-flows/SKILL.md +0 -285
  229. package/prisma/skills/kotlin-exposed-patterns/SKILL.md +0 -720
  230. package/prisma/skills/kotlin-ktor-patterns/SKILL.md +0 -690
  231. package/prisma/skills/kotlin-patterns/SKILL.md +0 -712
  232. package/prisma/skills/kotlin-testing/SKILL.md +0 -825
  233. package/prisma/skills/kubernetes-patterns/SKILL.md +0 -756
  234. package/prisma/skills/laravel-patterns/SKILL.md +0 -416
  235. package/prisma/skills/laravel-plugin-discovery/SKILL.md +0 -230
  236. package/prisma/skills/laravel-security/SKILL.md +0 -948
  237. package/prisma/skills/laravel-tdd/SKILL.md +0 -675
  238. package/prisma/skills/laravel-verification/SKILL.md +0 -180
  239. package/prisma/skills/latency-critical-systems/SKILL.md +0 -74
  240. package/prisma/skills/lead-intelligence/SKILL.md +0 -322
  241. package/prisma/skills/lead-intelligence/agents/enrichment-agent.md +0 -85
  242. package/prisma/skills/lead-intelligence/agents/mutual-mapper.md +0 -75
  243. package/prisma/skills/lead-intelligence/agents/outreach-drafter.md +0 -98
  244. package/prisma/skills/lead-intelligence/agents/signal-scorer.md +0 -60
  245. package/prisma/skills/liquid-glass-design/SKILL.md +0 -279
  246. package/prisma/skills/llm-trading-agent-security/SKILL.md +0 -147
  247. package/prisma/skills/logistics-exception-management/SKILL.md +0 -222
  248. package/prisma/skills/loop-design-check/SKILL.md +0 -143
  249. package/prisma/skills/mailtrap-email-integration/SKILL.md +0 -77
  250. package/prisma/skills/make-interfaces-feel-better/SKILL.md +0 -152
  251. package/prisma/skills/manim-video/SKILL.md +0 -90
  252. package/prisma/skills/manim-video/assets/network_graph_scene.py +0 -52
  253. package/prisma/skills/market-research/SKILL.md +0 -76
  254. package/prisma/skills/marketing-campaign/SKILL.md +0 -114
  255. package/prisma/skills/mcp-server-patterns/SKILL.md +0 -70
  256. package/prisma/skills/messages-ops/SKILL.md +0 -105
  257. package/prisma/skills/ml-adoption-playbook/SKILL.md +0 -57
  258. package/prisma/skills/mle-workflow/SKILL.md +0 -347
  259. package/prisma/skills/motion-advanced/SKILL.md +0 -596
  260. package/prisma/skills/motion-foundations/SKILL.md +0 -299
  261. package/prisma/skills/motion-patterns/SKILL.md +0 -434
  262. package/prisma/skills/motion-ui/SKILL.md +0 -576
  263. package/prisma/skills/mysql-patterns/SKILL.md +0 -413
  264. package/prisma/skills/nanoclaw-repl/SKILL.md +0 -34
  265. package/prisma/skills/nestjs-patterns/SKILL.md +0 -231
  266. package/prisma/skills/netmiko-ssh-automation/SKILL.md +0 -174
  267. package/prisma/skills/network-bgp-diagnostics/SKILL.md +0 -168
  268. package/prisma/skills/network-config-validation/SKILL.md +0 -211
  269. package/prisma/skills/network-interface-health/SKILL.md +0 -153
  270. package/prisma/skills/nextjs-turbopack/SKILL.md +0 -58
  271. package/prisma/skills/nodejs-keccak256/SKILL.md +0 -103
  272. package/prisma/skills/nutrient-document-processing/SKILL.md +0 -168
  273. package/prisma/skills/nuxt4-patterns/SKILL.md +0 -101
  274. package/prisma/skills/openclaw-persona-forge/SKILL.md +0 -289
  275. package/prisma/skills/openclaw-persona-forge/gacha.py +0 -224
  276. package/prisma/skills/openclaw-persona-forge/gacha.sh +0 -5
  277. package/prisma/skills/openclaw-persona-forge/references/avatar-style.md +0 -124
  278. package/prisma/skills/openclaw-persona-forge/references/boundary-rules.md +0 -53
  279. package/prisma/skills/openclaw-persona-forge/references/error-handling.md +0 -53
  280. package/prisma/skills/openclaw-persona-forge/references/identity-tension.md +0 -48
  281. package/prisma/skills/openclaw-persona-forge/references/naming-system.md +0 -39
  282. package/prisma/skills/openclaw-persona-forge/references/output-template.md +0 -166
  283. package/prisma/skills/opensource-pipeline/SKILL.md +0 -256
  284. package/prisma/skills/orch-add-feature/SKILL.md +0 -45
  285. package/prisma/skills/orch-build-mvp/SKILL.md +0 -49
  286. package/prisma/skills/orch-change-feature/SKILL.md +0 -43
  287. package/prisma/skills/orch-fix-defect/SKILL.md +0 -43
  288. package/prisma/skills/orch-pipeline/SKILL.md +0 -121
  289. package/prisma/skills/orch-refine-code/SKILL.md +0 -44
  290. package/prisma/skills/parallel-execution-optimizer/SKILL.md +0 -73
  291. package/prisma/skills/perl-patterns/SKILL.md +0 -505
  292. package/prisma/skills/perl-security/SKILL.md +0 -504
  293. package/prisma/skills/perl-testing/SKILL.md +0 -476
  294. package/prisma/skills/plan-orchestrate/SKILL.md +0 -263
  295. package/prisma/skills/plankton-code-quality/SKILL.md +0 -237
  296. package/prisma/skills/postgres-patterns/SKILL.md +0 -148
  297. package/prisma/skills/prediction-market-oracle-research/SKILL.md +0 -64
  298. package/prisma/skills/prediction-market-risk-review/SKILL.md +0 -61
  299. package/prisma/skills/prisma-patterns/SKILL.md +0 -401
  300. package/prisma/skills/product-capability/SKILL.md +0 -142
  301. package/prisma/skills/product-lens/SKILL.md +0 -93
  302. package/prisma/skills/production-audit/SKILL.md +0 -207
  303. package/prisma/skills/production-scheduling/SKILL.md +0 -238
  304. package/prisma/skills/project-flow-ops/SKILL.md +0 -112
  305. package/prisma/skills/prompt-optimizer/SKILL.md +0 -398
  306. package/prisma/skills/python-patterns/SKILL.md +0 -751
  307. package/prisma/skills/python-testing/SKILL.md +0 -817
  308. package/prisma/skills/pytorch-patterns/SKILL.md +0 -397
  309. package/prisma/skills/quality-nonconformance/SKILL.md +0 -260
  310. package/prisma/skills/quarkus-patterns/SKILL.md +0 -723
  311. package/prisma/skills/quarkus-security/SKILL.md +0 -468
  312. package/prisma/skills/quarkus-tdd/SKILL.md +0 -812
  313. package/prisma/skills/quarkus-verification/SKILL.md +0 -480
  314. package/prisma/skills/ralphinho-rfc-pipeline/SKILL.md +0 -68
  315. package/prisma/skills/react-native-patterns/SKILL.md +0 -326
  316. package/prisma/skills/react-patterns/SKILL.md +0 -342
  317. package/prisma/skills/react-performance/SKILL.md +0 -575
  318. package/prisma/skills/react-testing/SKILL.md +0 -424
  319. package/prisma/skills/recsys-pipeline-architect/SKILL.md +0 -115
  320. package/prisma/skills/recursive-decision-ledger/SKILL.md +0 -80
  321. package/prisma/skills/redis-patterns/SKILL.md +0 -404
  322. package/prisma/skills/regex-vs-llm-structured-text/SKILL.md +0 -221
  323. package/prisma/skills/remotion-video-creation/SKILL.md +0 -43
  324. package/prisma/skills/remotion-video-creation/rules/3d.md +0 -86
  325. package/prisma/skills/remotion-video-creation/rules/animations.md +0 -29
  326. package/prisma/skills/remotion-video-creation/rules/assets/charts-bar-chart.tsx +0 -173
  327. package/prisma/skills/remotion-video-creation/rules/assets/text-animations-typewriter.tsx +0 -100
  328. package/prisma/skills/remotion-video-creation/rules/assets/text-animations-word-highlight.tsx +0 -108
  329. package/prisma/skills/remotion-video-creation/rules/assets.md +0 -78
  330. package/prisma/skills/remotion-video-creation/rules/audio.md +0 -172
  331. package/prisma/skills/remotion-video-creation/rules/calculate-metadata.md +0 -104
  332. package/prisma/skills/remotion-video-creation/rules/can-decode.md +0 -75
  333. package/prisma/skills/remotion-video-creation/rules/charts.md +0 -58
  334. package/prisma/skills/remotion-video-creation/rules/compositions.md +0 -146
  335. package/prisma/skills/remotion-video-creation/rules/display-captions.md +0 -126
  336. package/prisma/skills/remotion-video-creation/rules/extract-frames.md +0 -229
  337. package/prisma/skills/remotion-video-creation/rules/fonts.md +0 -152
  338. package/prisma/skills/remotion-video-creation/rules/get-audio-duration.md +0 -58
  339. package/prisma/skills/remotion-video-creation/rules/get-video-dimensions.md +0 -68
  340. package/prisma/skills/remotion-video-creation/rules/get-video-duration.md +0 -58
  341. package/prisma/skills/remotion-video-creation/rules/gifs.md +0 -138
  342. package/prisma/skills/remotion-video-creation/rules/images.md +0 -130
  343. package/prisma/skills/remotion-video-creation/rules/import-srt-captions.md +0 -67
  344. package/prisma/skills/remotion-video-creation/rules/lottie.md +0 -67
  345. package/prisma/skills/remotion-video-creation/rules/measuring-dom-nodes.md +0 -34
  346. package/prisma/skills/remotion-video-creation/rules/measuring-text.md +0 -143
  347. package/prisma/skills/remotion-video-creation/rules/sequencing.md +0 -106
  348. package/prisma/skills/remotion-video-creation/rules/tailwind.md +0 -11
  349. package/prisma/skills/remotion-video-creation/rules/text-animations.md +0 -20
  350. package/prisma/skills/remotion-video-creation/rules/timing.md +0 -179
  351. package/prisma/skills/remotion-video-creation/rules/transcribe-captions.md +0 -19
  352. package/prisma/skills/remotion-video-creation/rules/transitions.md +0 -122
  353. package/prisma/skills/remotion-video-creation/rules/trimming.md +0 -52
  354. package/prisma/skills/remotion-video-creation/rules/videos.md +0 -171
  355. package/prisma/skills/repo-scan/SKILL.md +0 -79
  356. package/prisma/skills/research-ops/SKILL.md +0 -113
  357. package/prisma/skills/returns-reverse-logistics/SKILL.md +0 -240
  358. package/prisma/skills/rules-distill/SKILL.md +0 -265
  359. package/prisma/skills/rules-distill/scripts/scan-rules.sh +0 -58
  360. package/prisma/skills/rules-distill/scripts/scan-skills.sh +0 -129
  361. package/prisma/skills/rust-patterns/SKILL.md +0 -500
  362. package/prisma/skills/rust-testing/SKILL.md +0 -501
  363. package/prisma/skills/safety-guard/SKILL.md +0 -76
  364. package/prisma/skills/santa-method/SKILL.md +0 -307
  365. package/prisma/skills/scientific-db-pubmed-database/SKILL.md +0 -176
  366. package/prisma/skills/scientific-db-uspto-database/SKILL.md +0 -178
  367. package/prisma/skills/scientific-pkg-gget/SKILL.md +0 -167
  368. package/prisma/skills/scientific-thinking-literature-review/SKILL.md +0 -193
  369. package/prisma/skills/scientific-thinking-scholar-evaluation/SKILL.md +0 -161
  370. package/prisma/skills/search-first/SKILL.md +0 -183
  371. package/prisma/skills/security-bounty-hunter/SKILL.md +0 -100
  372. package/prisma/skills/security-review/SKILL.md +0 -504
  373. package/prisma/skills/security-review/cloud-infrastructure-security.md +0 -361
  374. package/prisma/skills/security-scan/SKILL.md +0 -166
  375. package/prisma/skills/seo/SKILL.md +0 -155
  376. package/prisma/skills/skill-comply/SKILL.md +0 -59
  377. package/prisma/skills/skill-comply/fixtures/compliant_trace.jsonl +0 -5
  378. package/prisma/skills/skill-comply/fixtures/noncompliant_trace.jsonl +0 -3
  379. package/prisma/skills/skill-comply/fixtures/tdd_spec.yaml +0 -44
  380. package/prisma/skills/skill-comply/prompts/classifier.md +0 -24
  381. package/prisma/skills/skill-comply/prompts/scenario_generator.md +0 -62
  382. package/prisma/skills/skill-comply/prompts/spec_generator.md +0 -42
  383. package/prisma/skills/skill-comply/pyproject.toml +0 -15
  384. package/prisma/skills/skill-comply/scripts/__init__.py +0 -0
  385. package/prisma/skills/skill-comply/scripts/classifier.py +0 -85
  386. package/prisma/skills/skill-comply/scripts/grader.py +0 -124
  387. package/prisma/skills/skill-comply/scripts/parser.py +0 -107
  388. package/prisma/skills/skill-comply/scripts/report.py +0 -170
  389. package/prisma/skills/skill-comply/scripts/run.py +0 -127
  390. package/prisma/skills/skill-comply/scripts/runner.py +0 -194
  391. package/prisma/skills/skill-comply/scripts/scenario_generator.py +0 -70
  392. package/prisma/skills/skill-comply/scripts/spec_generator.py +0 -72
  393. package/prisma/skills/skill-comply/scripts/utils.py +0 -13
  394. package/prisma/skills/skill-comply/tests/test_grader.py +0 -197
  395. package/prisma/skills/skill-comply/tests/test_parser.py +0 -90
  396. package/prisma/skills/skill-comply/tests/test_runner.py +0 -172
  397. package/prisma/skills/skill-scout/SKILL.md +0 -141
  398. package/prisma/skills/skill-stocktake/SKILL.md +0 -195
  399. package/prisma/skills/skill-stocktake/scripts/quick-diff.sh +0 -87
  400. package/prisma/skills/skill-stocktake/scripts/save-results.sh +0 -56
  401. package/prisma/skills/skill-stocktake/scripts/scan.sh +0 -170
  402. package/prisma/skills/social-graph-ranker/SKILL.md +0 -155
  403. package/prisma/skills/social-publisher/SKILL.md +0 -130
  404. package/prisma/skills/springboot-patterns/SKILL.md +0 -315
  405. package/prisma/skills/springboot-security/SKILL.md +0 -273
  406. package/prisma/skills/springboot-tdd/SKILL.md +0 -159
  407. package/prisma/skills/springboot-verification/SKILL.md +0 -232
  408. package/prisma/skills/strategic-compact/SKILL.md +0 -136
  409. package/prisma/skills/swift-actor-persistence/SKILL.md +0 -144
  410. package/prisma/skills/swift-concurrency-6-2/SKILL.md +0 -216
  411. package/prisma/skills/swift-protocol-di-testing/SKILL.md +0 -191
  412. package/prisma/skills/swiftui-patterns/SKILL.md +0 -259
  413. package/prisma/skills/taste/SKILL.md +0 -264
  414. package/prisma/skills/taste/references/genre-taxonomy.md +0 -87
  415. package/prisma/skills/tdd-workflow/SKILL.md +0 -583
  416. package/prisma/skills/team-agent-orchestration/SKILL.md +0 -111
  417. package/prisma/skills/team-builder/SKILL.md +0 -169
  418. package/prisma/skills/terminal-ops/SKILL.md +0 -110
  419. package/prisma/skills/tinystruct-patterns/SKILL.md +0 -279
  420. package/prisma/skills/tinystruct-patterns/references/architecture.md +0 -90
  421. package/prisma/skills/tinystruct-patterns/references/data-handling.md +0 -60
  422. package/prisma/skills/tinystruct-patterns/references/database.md +0 -99
  423. package/prisma/skills/tinystruct-patterns/references/routing.md +0 -64
  424. package/prisma/skills/tinystruct-patterns/references/system-usage.md +0 -97
  425. package/prisma/skills/tinystruct-patterns/references/testing.md +0 -72
  426. package/prisma/skills/token-budget-advisor/SKILL.md +0 -134
  427. package/prisma/skills/ui-demo/SKILL.md +0 -466
  428. package/prisma/skills/ui-to-vue/SKILL.md +0 -135
  429. package/prisma/skills/uncloud/SKILL.md +0 -344
  430. package/prisma/skills/unified-notifications-ops/SKILL.md +0 -188
  431. package/prisma/skills/verification-loop/SKILL.md +0 -127
  432. package/prisma/skills/video-editing/SKILL.md +0 -311
  433. package/prisma/skills/videodb/SKILL.md +0 -375
  434. package/prisma/skills/videodb/reference/api-reference.md +0 -550
  435. package/prisma/skills/videodb/reference/capture-reference.md +0 -407
  436. package/prisma/skills/videodb/reference/capture.md +0 -101
  437. package/prisma/skills/videodb/reference/editor.md +0 -443
  438. package/prisma/skills/videodb/reference/generative.md +0 -331
  439. package/prisma/skills/videodb/reference/rtstream-reference.md +0 -564
  440. package/prisma/skills/videodb/reference/rtstream.md +0 -65
  441. package/prisma/skills/videodb/reference/search.md +0 -230
  442. package/prisma/skills/videodb/reference/streaming.md +0 -406
  443. package/prisma/skills/videodb/reference/use-cases.md +0 -118
  444. package/prisma/skills/videodb/scripts/ws_listener.py +0 -282
  445. package/prisma/skills/visa-doc-translate/README.md +0 -86
  446. package/prisma/skills/visa-doc-translate/SKILL.md +0 -117
  447. package/prisma/skills/vite-patterns/SKILL.md +0 -450
  448. package/prisma/skills/vue-patterns/SKILL.md +0 -471
  449. package/prisma/skills/windows-desktop-e2e/SKILL.md +0 -888
  450. package/prisma/skills/workspace-surface-audit/SKILL.md +0 -126
  451. package/prisma/skills/x-api/SKILL.md +0 -235
@@ -1,347 +0,0 @@
1
- ---
2
- name: mle-workflow
3
- description: Production machine-learning engineering workflow for data contracts, reproducible training, model evaluation, deployment, monitoring, and rollback. Use when building, reviewing, or hardening ML systems beyond one-off notebooks.
4
- metadata:
5
- origin: ECC
6
- ---
7
-
8
- # Machine Learning Engineering Workflow
9
-
10
- Use this skill to turn model work into a production ML system with clear data contracts, repeatable training, measurable quality gates, deployable artifacts, and operational monitoring.
11
-
12
- ## When to Activate
13
-
14
- - Planning or reviewing a production ML feature, model refresh, ranking system, recommender, classifier, embedding workflow, or forecasting pipeline
15
- - Converting notebook code into a reusable training, evaluation, batch inference, or online inference pipeline
16
- - Designing model promotion criteria, offline/online evals, experiment tracking, or rollback paths
17
- - Debugging failures caused by data drift, label leakage, stale features, artifact mismatch, or inconsistent training and serving logic
18
- - Adding model monitoring, canary rollout, shadow traffic, or post-deploy quality checks
19
-
20
- ## Scope Calibration
21
-
22
- Use only the lanes that fit the system in front of you. This skill is useful for ranking, search, recommendations, classifiers, forecasting, embeddings, LLM workflows, anomaly detection, and batch analytics, but it should not force one architecture onto all of them.
23
-
24
- - Do not assume every model has supervised labels, online serving, a feature store, PyTorch, GPUs, human review, A/B tests, or real-time feedback.
25
- - Do not add heavyweight MLOps machinery when a data contract, baseline, eval script, and rollback note would make the change reviewable.
26
- - Do make assumptions explicit when the project lacks labels, delayed outcomes, slice definitions, production traffic, or monitoring ownership.
27
- - Treat examples as interchangeable scaffolds. Replace metrics, serving mode, data stores, and rollout mechanics with the project-native equivalents.
28
-
29
- ## Related Skills
30
-
31
- - `python-patterns` and `python-testing` for Python implementation and pytest coverage
32
- - `pytorch-patterns` for deep learning models, data loaders, device handling, and training loops
33
- - `eval-harness` and `ai-regression-testing` for promotion gates and agent-assisted regression checks
34
- - `database-migrations`, `postgres-patterns`, and `clickhouse-io` for data storage and analytics surfaces
35
- - `deployment-patterns`, `docker-patterns`, and `security-review` for serving, secrets, containers, and production hardening
36
-
37
- ## Reuse the SWE Surface
38
-
39
- Do not treat MLE as separate from software engineering. Most ECC SWE workflows apply directly to ML systems, often with stricter failure modes:
40
-
41
- The recommended `minimal --with capability:machine-learning` install keeps the core agent surface available alongside this skill. For skill-only or agent-limited harnesses, pair `skill:mle-workflow` with `agent:mle-reviewer` where the target supports agents.
42
-
43
- | SWE surface | MLE use |
44
- |-------------|---------|
45
- | `product-capability` / `architecture-decision-records` | Turn model work into explicit product contracts and record irreversible data, model, and rollout choices |
46
- | `repo-scan` / `codebase-onboarding` / `code-tour` | Find existing training, feature, serving, eval, and monitoring paths before introducing a parallel ML stack |
47
- | `plan` / `feature-dev` | Scope model changes as product capabilities with data, eval, serving, and rollback phases |
48
- | `tdd-workflow` / `python-testing` | Test feature transforms, split logic, metric calculations, artifact loading, and inference schemas before implementation |
49
- | `code-reviewer` / `mle-reviewer` | Review code quality plus ML-specific leakage, reproducibility, promotion, and monitoring risks |
50
- | `build-fix` / `pr-test-analyzer` | Diagnose broken CI, flaky evals, missing fixtures, and environment-specific model or dependency failures |
51
- | `quality-gate` / `test-coverage` | Require automated evidence for transforms, metrics, inference contracts, promotion gates, and rollback behavior |
52
- | `eval-harness` / `verification-loop` | Turn offline metrics, slice checks, latency budgets, and rollback drills into repeatable gates |
53
- | `ai-regression-testing` | Preserve every production bug as a regression: missing feature, stale label, bad artifact, schema drift, or serving mismatch |
54
- | `api-design` / `backend-patterns` | Design prediction APIs, batch jobs, idempotent retraining endpoints, and response envelopes |
55
- | `database-migrations` / `postgres-patterns` / `clickhouse-io` | Version labels, feature snapshots, prediction logs, experiment metrics, and drift analytics |
56
- | `deployment-patterns` / `docker-patterns` | Package reproducible training and serving images with health checks, resource limits, and rollback |
57
- | `canary-watch` / `dashboard-builder` | Make rollout health visible with model-version, slice, drift, latency, cost, and delayed-label dashboards |
58
- | `security-review` / `security-scan` | Check model artifacts, notebooks, prompts, datasets, and logs for secrets, PII, unsafe deserialization, and supply-chain risk |
59
- | `e2e-testing` / `browser-qa` / `accessibility` | Test critical product flows that consume predictions, including explainability and fallback UI states |
60
- | `benchmark` / `performance-optimizer` | Measure throughput, p95 latency, memory, GPU utilization, and cost per prediction or retrain |
61
- | `cost-aware-llm-pipeline` / `token-budget-advisor` | Route LLM/embedding workloads by quality, latency, and budget instead of defaulting to the largest model |
62
- | `documentation-lookup` / `search-first` | Verify current library behavior for model serving, feature stores, vector DBs, and eval tooling before coding |
63
- | `git-workflow` / `github-ops` / `opensource-pipeline` | Package MLE changes for review with crisp scope, generated artifacts excluded, and reproducible test evidence |
64
- | `strategic-compact` / `dmux-workflows` | Split long ML work into parallel tracks: data contract, eval harness, serving path, monitoring, and docs |
65
-
66
- ## Ten MLE Task Simulations
67
-
68
- Use these simulations as coverage checks when planning or reviewing MLE work. A strong MLE workflow should reduce each task to explicit contracts, reusable SWE surfaces, automated evidence, and a reviewable artifact.
69
-
70
- | ID | Common MLE task | Streamlined ECC path | Required output | Pipeline lanes covered |
71
- |----|-----------------|----------------------|-----------------|------------------------|
72
- | MLE-01 | Frame an ambiguous prediction, ranking, recommender, classifier, embedding, or forecast capability | `product-capability`, `plan`, `architecture-decision-records`, `mle-workflow` | Iteration Compact naming who cares, decision owner, success metric, unacceptable mistakes, assumptions, constraints, and first experiment | product contract, stakeholder loss, risk, rollout |
73
- | MLE-02 | Define metric goals, labels, data sources, and the mistake budget | `repo-scan`, `database-reviewer`, `database-migrations`, `postgres-patterns`, `clickhouse-io` | Data and metric contract with entity grain, label timing, label confidence, feature timing, point-in-time joins, split policy, and dataset snapshot | data contract, metric design, leakage, reproducibility |
74
- | MLE-03 | Build a baseline model and scoring path before adding complexity | `tdd-workflow`, `python-testing`, `python-patterns`, `code-reviewer` | Baseline scorer with confusion matrix, calibration notes, latency/cost estimate, known weaknesses, and tests for score shape and determinism | baseline, scoring, testing, serving parity |
75
- | MLE-04 | Generate features from hypotheses about what separates outcomes | `python-patterns`, `pytorch-patterns`, `docker-patterns`, `deployment-patterns` | Feature plan and transform module covering signal source, missing values, outliers, correlations, leakage checks, and train/serve equivalence | feature pipeline, leakage, training, artifacts |
76
- | MLE-05 | Tune thresholds, configs, and model complexity under tradeoffs | `eval-harness`, `ai-regression-testing`, `quality-gate`, `test-coverage` | Threshold/config report comparing precision, recall, F1, AUC, calibration, group slices, latency, cost, complexity, and acceptable error classes | evaluation, threshold, promotion, regression |
77
- | MLE-06 | Run error analysis and turn mistakes into the next experiment | `eval-harness`, `ai-regression-testing`, `mle-reviewer`, `silent-failure-hunter` | Error cluster report for false positives, false negatives, ambiguous labels, stale features, missing signals, and bug traces with lessons captured | error analysis, bug trace, iteration, regression |
78
- | MLE-07 | Package a model artifact for batch or online inference | `api-design`, `backend-patterns`, `security-review`, `security-scan` | Versioned artifact bundle with preprocessing, config, dependency constraints, schema validation, safe loading, and PII-safe logs | artifact, security, inference contract |
79
- | MLE-08 | Ship online serving or batch scoring with feedback capture | `api-design`, `backend-patterns`, `e2e-testing`, `browser-qa`, `accessibility` | Prediction endpoint or batch job with response envelope, timeout, batching, fallback, model version, confidence, feedback logging, and product-flow tests | serving, batch inference, fallback, user workflow |
80
- | MLE-09 | Roll out a model with shadow traffic, canary, A/B test, or rollback | `canary-watch`, `dashboard-builder`, `verification-loop`, `performance-optimizer` | Rollout plan naming traffic split, dashboards, p95 latency, cost, quality guardrails, rollback artifact, and rollback trigger | deployment, canary, rollback |
81
- | MLE-10 | Operate, debug, and refresh a production model after launch | `silent-failure-hunter`, `dashboard-builder`, `mle-reviewer`, `doc-updater`, `github-ops` | Observation ledger and refresh plan with drift checks, delayed-label health, alert owners, runbook updates, retrain criteria, and PR evidence | monitoring, incident response, retraining |
82
-
83
- ## Iteration Compact
84
-
85
- Before touching model code, compress the work into one reviewable artifact. This should be short enough to fit in a PR description and precise enough that another engineer can challenge the tradeoffs.
86
-
87
- ```text
88
- Goal:
89
- Who cares:
90
- Decision owner:
91
- User or system action changed by the model:
92
- Success metric:
93
- Guardrail metrics:
94
- Mistake budget:
95
- Unacceptable mistakes:
96
- Acceptable mistakes:
97
- Assumptions:
98
- Constraints:
99
- Labels and data snapshot:
100
- Baseline:
101
- Candidate signals:
102
- Threshold or config plan:
103
- Eval slices:
104
- Known risks:
105
- Next experiment:
106
- Rollback or fallback:
107
- ```
108
-
109
- This compact is the MLE equivalent of a strong SWE design note. It keeps the team from optimizing a metric no one trusts, adding features that do not address the real error mode, or shipping complexity without a rollback.
110
-
111
- ## Decision Brain
112
-
113
- Use this loop whenever the task is ambiguous, high-impact, or metric-heavy:
114
-
115
- 1. Start from the decision, not the model. Name the action that changes downstream behavior.
116
- 2. Name who cares and why. Different stakeholders pay different costs for false positives, false negatives, latency, compute spend, opacity, or missed opportunities.
117
- 3. Convert ambiguity into hypotheses. Ask what signal would separate outcomes, what evidence would disprove it, and what simple baseline should be hard to beat.
118
- 4. Research prior art or a nearby known problem before inventing a bespoke system.
119
- 5. Score choices with `(probability, confidence) x (cost, severity, importance, impact)`.
120
- 6. Consider adversarial behavior, incentives, selective disclosure, distribution shift, and feedback loops.
121
- 7. Prefer the simplest change that reduces the most important mistake. Simplicity is not laziness; it is a way to minimize blunders while preserving iteration speed.
122
- 8. Capture the decision, evidence, counterargument, and next reversible step.
123
-
124
- ## Metric and Mistake Economics
125
-
126
- Choose metrics from failure costs, not habit:
127
-
128
- - Use a confusion matrix early so the team can discuss concrete false positives and false negatives instead of abstract accuracy.
129
- - Favor precision when the cost of an incorrect positive decision dominates.
130
- - Favor recall when the cost of a missed positive dominates.
131
- - Use F1 only when the precision/recall tradeoff is genuinely balanced and explainable.
132
- - Use AUC or ranking metrics when ordering quality matters more than a single threshold.
133
- - Track latency, throughput, memory, and cost as first-class metrics because they shape feasible model complexity.
134
- - Compare against a baseline and the current production model before celebrating an offline gain.
135
- - Treat real-world feedback signals as delayed labels with bias, lag, and coverage gaps; do not treat them as ground truth without analysis.
136
-
137
- Every metric choice should state which mistake it makes cheaper, which mistake it makes more likely, and who absorbs that cost.
138
-
139
- ## Data and Feature Hypotheses
140
-
141
- Features should come from a theory of separation:
142
-
143
- - Text, categorical fields, numeric histories, graph relationships, recency, frequency, and aggregates are candidate signal families, not automatic features.
144
- - For every feature family, state why it should separate outcomes and how it could leak future information.
145
- - For noisy labels, consider adjudication, label confidence, soft targets, or confidence weighting.
146
- - For class imbalance, compare weighted loss, resampling, threshold movement, and calibrated decision rules.
147
- - For missing values, decide whether absence is informative, imputable, or a reason to abstain.
148
- - For outliers, decide whether to clip, bucket, investigate, or preserve them as rare but important signal.
149
- - For correlated features, check whether they are redundant, unstable, or proxies for unavailable future state.
150
-
151
- Do not add model complexity until error analysis shows that the baseline is failing for a reason additional signal or capacity can plausibly fix.
152
-
153
- ## Error Analysis Loop
154
-
155
- After each baseline, training run, threshold change, or config change:
156
-
157
- 1. Split mistakes into false positives, false negatives, abstentions, low-confidence cases, and system failures.
158
- 2. Cluster errors by shared traits: language, entity type, source, time, geography, device, sparsity, recency, feature freshness, label source, or model version.
159
- 3. Separate model mistakes from data bugs, label ambiguity, product ambiguity, instrumentation gaps, and serving mismatches.
160
- 4. Trace each major cluster to one of four moves: better labels, better features, better threshold/config, or better product fallback.
161
- 5. Preserve every important mistake as a regression test, eval slice, dashboard panel, or runbook entry.
162
- 6. Write the next iteration as a falsifiable experiment, not a vague "improve model" task.
163
-
164
- The strongest MLE loop is not train -> metric -> ship. It is mistake -> cluster -> hypothesis -> experiment -> evidence -> simpler system.
165
-
166
- ## Observation Ledger
167
-
168
- Keep a compact decision and evidence trail beside the code, PR, experiment report, or runbook:
169
-
170
- ```text
171
- Iteration:
172
- Change:
173
- Why this mattered:
174
- Metric movement:
175
- Slice movement:
176
- False positives:
177
- False negatives:
178
- Unexpected errors:
179
- Decision:
180
- Tradeoff accepted:
181
- Lesson captured:
182
- Regression added:
183
- Debt created:
184
- Next iteration:
185
- ```
186
-
187
- Use the ledger to make model work cumulative. The goal is for each iteration to make the next decision easier, not merely to produce another artifact.
188
-
189
- ## Core Workflow
190
-
191
- ### 1. Define the Prediction Contract
192
-
193
- Capture the product-level contract before writing model code:
194
-
195
- - Prediction target and decision owner
196
- - Input entity, output schema, confidence/calibration fields, and allowed latency
197
- - Batch, online, streaming, or hybrid serving mode
198
- - Fallback behavior when the model, feature store, or dependency is unavailable
199
- - Human review or override path for high-impact decisions
200
- - Privacy, retention, and audit requirements for inputs, predictions, and labels
201
-
202
- Do not accept "improve the model" as a requirement. Tie the model to an observable product behavior and a measurable acceptance gate.
203
-
204
- ### 2. Lock the Data Contract
205
-
206
- Every ML task needs an explicit data contract:
207
-
208
- - Entity grain and primary key
209
- - Label definition, label timestamp, and label availability delay
210
- - Feature timestamp, freshness SLA, and point-in-time join rules
211
- - Train, validation, test, and backtest split policy
212
- - Required columns, allowed nulls, ranges, categories, and units
213
- - PII or sensitive fields that must not enter training artifacts or logs
214
- - Dataset version or snapshot ID for reproducibility
215
-
216
- Guard against leakage first. If a feature is not available at prediction time, or is joined using future information, remove it or move it to an analysis-only path.
217
-
218
- ### 3. Build a Reproducible Pipeline
219
-
220
- Training code should be runnable by another engineer without hidden notebook state:
221
-
222
- - Use typed config files or dataclasses for all hyperparameters and paths
223
- - Pin package and model dependencies
224
- - Set random seeds and document any nondeterministic GPU behavior
225
- - Record dataset version, code SHA, config hash, metrics, and artifact URI
226
- - Save preprocessing logic with the model artifact, not separately in a notebook
227
- - Keep train, eval, and inference transformations shared or generated from one source
228
- - Make every step idempotent so retries do not corrupt artifacts or metrics
229
-
230
- Prefer immutable values and pure transformation functions. Avoid mutating shared data frames or global config during feature generation.
231
-
232
- ```python
233
- import hashlib
234
- from dataclasses import dataclass
235
- from pathlib import Path
236
-
237
-
238
- @dataclass(frozen=True)
239
- class TrainingConfig:
240
- dataset_uri: str
241
- model_dir: Path
242
- seed: int
243
- learning_rate: float
244
- batch_size: int
245
-
246
-
247
- def artifact_name(config: TrainingConfig, code_sha: str) -> str:
248
- config_key = f"{config.dataset_uri}:{config.seed}:{config.learning_rate}:{config.batch_size}"
249
- config_hash = hashlib.sha256(config_key.encode("utf-8")).hexdigest()[:12]
250
- return f"{code_sha[:12]}-{config_hash}"
251
- ```
252
-
253
- ### 4. Evaluate Before Promotion
254
-
255
- Promotion criteria should be declared before training finishes:
256
-
257
- - Baseline model and current production model comparison
258
- - Primary metric aligned to product behavior
259
- - Guardrail metrics for latency, calibration, fairness slices, cost, and error concentration
260
- - Slice metrics for important cohorts, geographies, devices, languages, or data sources
261
- - Confidence intervals or repeated-run variance when metrics are noisy
262
- - Failure examples reviewed by a human for high-impact models
263
- - Explicit "do not ship" thresholds
264
-
265
- ```python
266
- PROMOTION_GATES = {
267
- "auc": ("min", 0.82),
268
- "calibration_error": ("max", 0.04),
269
- "p95_latency_ms": ("max", 80),
270
- }
271
-
272
-
273
- def assert_promotion_ready(metrics: dict[str, float]) -> None:
274
- missing = sorted(name for name in PROMOTION_GATES if name not in metrics)
275
- if missing:
276
- raise ValueError(f"Model promotion metrics missing required gates: {missing}")
277
-
278
- failures = {
279
- name: value
280
- for name, (direction, threshold) in PROMOTION_GATES.items()
281
- for value in [metrics[name]]
282
- if (direction == "min" and value < threshold)
283
- or (direction == "max" and value > threshold)
284
- }
285
- if failures:
286
- raise ValueError(f"Model failed promotion gates: {failures}")
287
- ```
288
-
289
- Use offline metrics as gates, not guarantees. When the model changes product behavior, plan shadow evaluation, canary rollout, or A/B testing before full rollout.
290
-
291
- ### 5. Package for Serving
292
-
293
- An ML artifact is production-ready only when the serving contract is testable:
294
-
295
- - Model artifact includes version, training data reference, config, and preprocessing
296
- - Input schema rejects invalid, stale, or out-of-range features
297
- - Output schema includes model version and confidence or explanation fields when useful
298
- - Serving path has timeout, batching, resource limits, and fallback behavior
299
- - CPU/GPU requirements are explicit and tested
300
- - Prediction logs avoid PII and include enough identifiers for debugging and label joins
301
- - Integration tests cover missing features, stale features, bad types, empty batches, and fallback path
302
-
303
- Never let training-only feature code diverge from serving feature code without a test that proves equivalence.
304
-
305
- ### 6. Operate the Model
306
-
307
- Model monitoring needs both system and quality signals:
308
-
309
- - Availability, error rate, timeout rate, queue depth, and p50/p95/p99 latency
310
- - Feature null rate, range drift, categorical drift, and freshness drift
311
- - Prediction distribution drift and confidence distribution drift
312
- - Label arrival health and delayed quality metrics
313
- - Business KPI guardrails and rollback triggers
314
- - Per-version dashboards for canaries and rollbacks
315
-
316
- Every deployment should have a rollback plan that names the previous artifact, config, data dependency, and traffic-switch mechanism.
317
-
318
- ## Review Checklist
319
-
320
- - [ ] Prediction contract is explicit and testable
321
- - [ ] Data contract defines entity grain, label timing, feature timing, and snapshot/version
322
- - [ ] Leakage risks were checked against prediction-time availability
323
- - [ ] Training is reproducible from code, config, data version, and seed
324
- - [ ] Metrics compare against baseline and current production model
325
- - [ ] Slice metrics and guardrails are included for high-risk cohorts
326
- - [ ] Promotion gates are automated and fail closed
327
- - [ ] Training and serving transformations are shared or equivalence-tested
328
- - [ ] Model artifact carries version, config, dataset reference, and preprocessing
329
- - [ ] Serving path validates inputs and has timeout, fallback, and rollback behavior
330
- - [ ] Monitoring covers system health, feature drift, prediction drift, and delayed labels
331
- - [ ] Sensitive data is excluded from artifacts, logs, prompts, and examples
332
-
333
- ## Anti-Patterns
334
-
335
- - Notebook state is required to reproduce the model
336
- - Random split leaks future data into validation or test sets
337
- - Feature joins ignore event time and label availability
338
- - Offline metric improves while important slices regress
339
- - Thresholds are tuned on the test set repeatedly
340
- - Training preprocessing is copied manually into serving code
341
- - Model version is missing from prediction logs
342
- - Monitoring only checks service uptime, not data or prediction quality
343
- - Rollback requires retraining instead of switching to a known-good artifact
344
-
345
- ## Output Expectations
346
-
347
- When using this skill, return concrete artifacts: data contract, promotion gates, pipeline steps, test plan, deployment plan, or review findings. Call out unknowns that block production readiness instead of filling them with assumptions.