claude-turing 4.6.0 → 4.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (333)
  1. package/.claude-plugin/plugin.json +2 -2
  2. package/README.md +1 -1
  3. package/commands/ablate.md +0 -1
  4. package/commands/annotate.md +0 -1
  5. package/commands/archive.md +0 -1
  6. package/commands/audit.md +0 -1
  7. package/commands/baseline.md +0 -1
  8. package/commands/brief.md +0 -1
  9. package/commands/budget.md +0 -1
  10. package/commands/calibrate.md +0 -1
  11. package/commands/card.md +0 -1
  12. package/commands/changelog.md +0 -1
  13. package/commands/checkpoint.md +0 -1
  14. package/commands/cite.md +0 -1
  15. package/commands/compare.md +0 -1
  16. package/commands/counterfactual.md +0 -1
  17. package/commands/curriculum.md +0 -1
  18. package/commands/design.md +0 -1
  19. package/commands/diagnose.md +0 -1
  20. package/commands/diff.md +0 -1
  21. package/commands/distill.md +0 -1
  22. package/commands/doctor.md +0 -1
  23. package/commands/ensemble.md +0 -1
  24. package/commands/explore.md +0 -1
  25. package/commands/export.md +0 -1
  26. package/commands/feature.md +0 -1
  27. package/commands/flashback.md +0 -1
  28. package/commands/fork.md +0 -1
  29. package/commands/frontier.md +0 -1
  30. package/commands/init.md +0 -1
  31. package/commands/leak.md +0 -1
  32. package/commands/lit.md +0 -1
  33. package/commands/logbook.md +0 -1
  34. package/commands/merge.md +0 -1
  35. package/commands/mode.md +0 -1
  36. package/commands/onboard.md +0 -1
  37. package/commands/paper.md +0 -1
  38. package/commands/plan.md +0 -1
  39. package/commands/poster.md +0 -1
  40. package/commands/postmortem.md +0 -1
  41. package/commands/preflight.md +0 -1
  42. package/commands/present.md +0 -1
  43. package/commands/profile.md +0 -1
  44. package/commands/prune.md +0 -1
  45. package/commands/quantize.md +0 -1
  46. package/commands/queue.md +0 -1
  47. package/commands/registry.md +0 -1
  48. package/commands/regress.md +0 -1
  49. package/commands/replay.md +0 -1
  50. package/commands/report.md +0 -1
  51. package/commands/reproduce.md +0 -1
  52. package/commands/retry.md +0 -1
  53. package/commands/review.md +0 -1
  54. package/commands/sanity.md +0 -1
  55. package/commands/scale.md +0 -1
  56. package/commands/search.md +0 -1
  57. package/commands/seed.md +0 -1
  58. package/commands/sensitivity.md +0 -1
  59. package/commands/share.md +0 -1
  60. package/commands/simulate.md +0 -1
  61. package/commands/status.md +0 -1
  62. package/commands/stitch.md +0 -1
  63. package/commands/suggest.md +0 -1
  64. package/commands/surgery.md +0 -1
  65. package/commands/sweep.md +0 -1
  66. package/commands/template.md +0 -1
  67. package/commands/train.md +0 -1
  68. package/commands/transfer.md +0 -1
  69. package/commands/trend.md +0 -1
  70. package/commands/try.md +0 -1
  71. package/commands/turing.md +3 -3
  72. package/commands/update.md +0 -1
  73. package/commands/validate.md +0 -1
  74. package/commands/warm.md +0 -1
  75. package/commands/watch.md +0 -1
  76. package/commands/whatif.md +0 -1
  77. package/commands/xray.md +0 -1
  78. package/config/commands.yaml +74 -74
  79. package/package.json +10 -3
  80. package/skills/turing/SKILL.md +180 -0
  81. package/skills/turing/ablate/SKILL.md +46 -0
  82. package/skills/turing/annotate/SKILL.md +22 -0
  83. package/skills/turing/archive/SKILL.md +22 -0
  84. package/skills/turing/audit/SKILL.md +55 -0
  85. package/skills/turing/baseline/SKILL.md +44 -0
  86. package/skills/turing/brief/SKILL.md +94 -0
  87. package/skills/turing/budget/SKILL.md +51 -0
  88. package/skills/turing/calibrate/SKILL.md +46 -0
  89. package/skills/turing/card/SKILL.md +35 -0
  90. package/skills/turing/changelog/SKILL.md +21 -0
  91. package/skills/turing/checkpoint/SKILL.md +46 -0
  92. package/skills/turing/cite/SKILL.md +22 -0
  93. package/skills/turing/compare/SKILL.md +23 -0
  94. package/skills/turing/counterfactual/SKILL.md +26 -0
  95. package/skills/turing/curriculum/SKILL.md +42 -0
  96. package/skills/turing/design/SKILL.md +96 -0
  97. package/skills/turing/diagnose/SKILL.md +51 -0
  98. package/skills/turing/diff/SKILL.md +47 -0
  99. package/skills/turing/distill/SKILL.md +55 -0
  100. package/skills/turing/doctor/SKILL.md +30 -0
  101. package/skills/turing/ensemble/SKILL.md +53 -0
  102. package/skills/turing/explore/SKILL.md +106 -0
  103. package/skills/turing/export/SKILL.md +47 -0
  104. package/skills/turing/feature/SKILL.md +41 -0
  105. package/skills/turing/flashback/SKILL.md +21 -0
  106. package/skills/turing/fork/SKILL.md +39 -0
  107. package/skills/turing/frontier/SKILL.md +44 -0
  108. package/skills/turing/init/SKILL.md +153 -0
  109. package/skills/turing/leak/SKILL.md +46 -0
  110. package/skills/turing/lit/SKILL.md +46 -0
  111. package/skills/turing/logbook/SKILL.md +50 -0
  112. package/skills/turing/merge/SKILL.md +23 -0
  113. package/skills/turing/mode/SKILL.md +42 -0
  114. package/skills/turing/onboard/SKILL.md +19 -0
  115. package/skills/turing/paper/SKILL.md +43 -0
  116. package/skills/turing/plan/SKILL.md +26 -0
  117. package/skills/turing/poster/SKILL.md +88 -0
  118. package/skills/turing/postmortem/SKILL.md +27 -0
  119. package/skills/turing/preflight/SKILL.md +74 -0
  120. package/skills/turing/present/SKILL.md +22 -0
  121. package/skills/turing/profile/SKILL.md +42 -0
  122. package/skills/turing/prune/SKILL.md +25 -0
  123. package/skills/turing/quantize/SKILL.md +23 -0
  124. package/skills/turing/queue/SKILL.md +47 -0
  125. package/skills/turing/registry/SKILL.md +30 -0
  126. package/skills/turing/regress/SKILL.md +52 -0
  127. package/skills/turing/replay/SKILL.md +22 -0
  128. package/skills/turing/report/SKILL.md +96 -0
  129. package/skills/turing/reproduce/SKILL.md +47 -0
  130. package/skills/turing/retry/SKILL.md +40 -0
  131. package/skills/turing/review/SKILL.md +19 -0
  132. package/skills/turing/rules/loop-protocol.md +91 -0
  133. package/skills/turing/sanity/SKILL.md +47 -0
  134. package/skills/turing/scale/SKILL.md +54 -0
  135. package/skills/turing/search/SKILL.md +21 -0
  136. package/skills/turing/seed/SKILL.md +46 -0
  137. package/skills/turing/sensitivity/SKILL.md +40 -0
  138. package/skills/turing/share/SKILL.md +19 -0
  139. package/skills/turing/simulate/SKILL.md +27 -0
  140. package/skills/turing/status/SKILL.md +23 -0
  141. package/skills/turing/stitch/SKILL.md +48 -0
  142. package/skills/turing/suggest/SKILL.md +158 -0
  143. package/skills/turing/surgery/SKILL.md +26 -0
  144. package/skills/turing/sweep/SKILL.md +44 -0
  145. package/skills/turing/template/SKILL.md +21 -0
  146. package/skills/turing/train/SKILL.md +74 -0
  147. package/skills/turing/transfer/SKILL.md +53 -0
  148. package/skills/turing/trend/SKILL.md +20 -0
  149. package/skills/turing/try/SKILL.md +62 -0
  150. package/skills/turing/update/SKILL.md +26 -0
  151. package/skills/turing/validate/SKILL.md +33 -0
  152. package/skills/turing/warm/SKILL.md +52 -0
  153. package/skills/turing/watch/SKILL.md +59 -0
  154. package/skills/turing/whatif/SKILL.md +30 -0
  155. package/skills/turing/xray/SKILL.md +42 -0
  156. package/src/command-registry.js +21 -0
  157. package/src/install.js +4 -3
  158. package/src/sync-commands-layout.js +149 -0
  159. package/src/sync-skills-layout.js +20 -0
  160. package/templates/__pycache__/evaluate.cpython-312.pyc +0 -0
  161. package/templates/__pycache__/evaluate.cpython-314.pyc +0 -0
  162. package/templates/__pycache__/prepare.cpython-312.pyc +0 -0
  163. package/templates/__pycache__/prepare.cpython-314.pyc +0 -0
  164. package/templates/features/__pycache__/__init__.cpython-312.pyc +0 -0
  165. package/templates/features/__pycache__/__init__.cpython-314.pyc +0 -0
  166. package/templates/features/__pycache__/featurizers.cpython-312.pyc +0 -0
  167. package/templates/features/__pycache__/featurizers.cpython-314.pyc +0 -0
  168. package/templates/scripts/__pycache__/__init__.cpython-312.pyc +0 -0
  169. package/templates/scripts/__pycache__/__init__.cpython-314.pyc +0 -0
  170. package/templates/scripts/__pycache__/ablation_study.cpython-312.pyc +0 -0
  171. package/templates/scripts/__pycache__/ablation_study.cpython-314.pyc +0 -0
  172. package/templates/scripts/__pycache__/architecture_surgery.cpython-312.pyc +0 -0
  173. package/templates/scripts/__pycache__/architecture_surgery.cpython-314.pyc +0 -0
  174. package/templates/scripts/__pycache__/budget_manager.cpython-312.pyc +0 -0
  175. package/templates/scripts/__pycache__/budget_manager.cpython-314.pyc +0 -0
  176. package/templates/scripts/__pycache__/build_ensemble.cpython-312.pyc +0 -0
  177. package/templates/scripts/__pycache__/build_ensemble.cpython-314.pyc +0 -0
  178. package/templates/scripts/__pycache__/calibration.cpython-312.pyc +0 -0
  179. package/templates/scripts/__pycache__/calibration.cpython-314.pyc +0 -0
  180. package/templates/scripts/__pycache__/check_convergence.cpython-312.pyc +0 -0
  181. package/templates/scripts/__pycache__/check_convergence.cpython-314.pyc +0 -0
  182. package/templates/scripts/__pycache__/checkpoint_manager.cpython-312.pyc +0 -0
  183. package/templates/scripts/__pycache__/checkpoint_manager.cpython-314.pyc +0 -0
  184. package/templates/scripts/__pycache__/citation_manager.cpython-312.pyc +0 -0
  185. package/templates/scripts/__pycache__/citation_manager.cpython-314.pyc +0 -0
  186. package/templates/scripts/__pycache__/cost_frontier.cpython-312.pyc +0 -0
  187. package/templates/scripts/__pycache__/cost_frontier.cpython-314.pyc +0 -0
  188. package/templates/scripts/__pycache__/counterfactual_explanation.cpython-312.pyc +0 -0
  189. package/templates/scripts/__pycache__/counterfactual_explanation.cpython-314.pyc +0 -0
  190. package/templates/scripts/__pycache__/critique_hypothesis.cpython-312.pyc +0 -0
  191. package/templates/scripts/__pycache__/critique_hypothesis.cpython-314.pyc +0 -0
  192. package/templates/scripts/__pycache__/curriculum_optimizer.cpython-312.pyc +0 -0
  193. package/templates/scripts/__pycache__/curriculum_optimizer.cpython-314.pyc +0 -0
  194. package/templates/scripts/__pycache__/diagnose_errors.cpython-312.pyc +0 -0
  195. package/templates/scripts/__pycache__/diagnose_errors.cpython-314.pyc +0 -0
  196. package/templates/scripts/__pycache__/draft_paper_sections.cpython-312.pyc +0 -0
  197. package/templates/scripts/__pycache__/draft_paper_sections.cpython-314.pyc +0 -0
  198. package/templates/scripts/__pycache__/equivalence_checker.cpython-312.pyc +0 -0
  199. package/templates/scripts/__pycache__/equivalence_checker.cpython-314.pyc +0 -0
  200. package/templates/scripts/__pycache__/experiment_annotations.cpython-312.pyc +0 -0
  201. package/templates/scripts/__pycache__/experiment_annotations.cpython-314.pyc +0 -0
  202. package/templates/scripts/__pycache__/experiment_archive.cpython-312.pyc +0 -0
  203. package/templates/scripts/__pycache__/experiment_archive.cpython-314.pyc +0 -0
  204. package/templates/scripts/__pycache__/experiment_diff.cpython-312.pyc +0 -0
  205. package/templates/scripts/__pycache__/experiment_diff.cpython-314.pyc +0 -0
  206. package/templates/scripts/__pycache__/experiment_index.cpython-312.pyc +0 -0
  207. package/templates/scripts/__pycache__/experiment_index.cpython-314.pyc +0 -0
  208. package/templates/scripts/__pycache__/experiment_queue.cpython-312.pyc +0 -0
  209. package/templates/scripts/__pycache__/experiment_queue.cpython-314.pyc +0 -0
  210. package/templates/scripts/__pycache__/experiment_replay.cpython-312.pyc +0 -0
  211. package/templates/scripts/__pycache__/experiment_replay.cpython-314.pyc +0 -0
  212. package/templates/scripts/__pycache__/experiment_search.cpython-312.pyc +0 -0
  213. package/templates/scripts/__pycache__/experiment_search.cpython-314.pyc +0 -0
  214. package/templates/scripts/__pycache__/experiment_simulator.cpython-312.pyc +0 -0
  215. package/templates/scripts/__pycache__/experiment_simulator.cpython-314.pyc +0 -0
  216. package/templates/scripts/__pycache__/experiment_templates.cpython-312.pyc +0 -0
  217. package/templates/scripts/__pycache__/experiment_templates.cpython-314.pyc +0 -0
  218. package/templates/scripts/__pycache__/export_card.cpython-312.pyc +0 -0
  219. package/templates/scripts/__pycache__/export_card.cpython-314.pyc +0 -0
  220. package/templates/scripts/__pycache__/export_formats.cpython-312.pyc +0 -0
  221. package/templates/scripts/__pycache__/export_formats.cpython-314.pyc +0 -0
  222. package/templates/scripts/__pycache__/failure_postmortem.cpython-312.pyc +0 -0
  223. package/templates/scripts/__pycache__/failure_postmortem.cpython-314.pyc +0 -0
  224. package/templates/scripts/__pycache__/feature_intelligence.cpython-312.pyc +0 -0
  225. package/templates/scripts/__pycache__/feature_intelligence.cpython-314.pyc +0 -0
  226. package/templates/scripts/__pycache__/fork_experiment.cpython-312.pyc +0 -0
  227. package/templates/scripts/__pycache__/fork_experiment.cpython-314.pyc +0 -0
  228. package/templates/scripts/__pycache__/generate_baselines.cpython-312.pyc +0 -0
  229. package/templates/scripts/__pycache__/generate_baselines.cpython-314.pyc +0 -0
  230. package/templates/scripts/__pycache__/generate_brief.cpython-312.pyc +0 -0
  231. package/templates/scripts/__pycache__/generate_brief.cpython-314.pyc +0 -0
  232. package/templates/scripts/__pycache__/generate_changelog.cpython-312.pyc +0 -0
  233. package/templates/scripts/__pycache__/generate_changelog.cpython-314.pyc +0 -0
  234. package/templates/scripts/__pycache__/generate_figures.cpython-312.pyc +0 -0
  235. package/templates/scripts/__pycache__/generate_figures.cpython-314.pyc +0 -0
  236. package/templates/scripts/__pycache__/generate_logbook.cpython-312.pyc +0 -0
  237. package/templates/scripts/__pycache__/generate_logbook.cpython-314.pyc +0 -0
  238. package/templates/scripts/__pycache__/generate_model_card.cpython-312.pyc +0 -0
  239. package/templates/scripts/__pycache__/generate_model_card.cpython-314.pyc +0 -0
  240. package/templates/scripts/__pycache__/generate_onboarding.cpython-312.pyc +0 -0
  241. package/templates/scripts/__pycache__/generate_onboarding.cpython-314.pyc +0 -0
  242. package/templates/scripts/__pycache__/harness_doctor.cpython-312.pyc +0 -0
  243. package/templates/scripts/__pycache__/harness_doctor.cpython-314.pyc +0 -0
  244. package/templates/scripts/__pycache__/incremental_update.cpython-312.pyc +0 -0
  245. package/templates/scripts/__pycache__/incremental_update.cpython-314.pyc +0 -0
  246. package/templates/scripts/__pycache__/knowledge_transfer.cpython-312.pyc +0 -0
  247. package/templates/scripts/__pycache__/knowledge_transfer.cpython-314.pyc +0 -0
  248. package/templates/scripts/__pycache__/latency_benchmark.cpython-312.pyc +0 -0
  249. package/templates/scripts/__pycache__/latency_benchmark.cpython-314.pyc +0 -0
  250. package/templates/scripts/__pycache__/leakage_detector.cpython-312.pyc +0 -0
  251. package/templates/scripts/__pycache__/leakage_detector.cpython-314.pyc +0 -0
  252. package/templates/scripts/__pycache__/literature_search.cpython-312.pyc +0 -0
  253. package/templates/scripts/__pycache__/literature_search.cpython-314.pyc +0 -0
  254. package/templates/scripts/__pycache__/log_experiment.cpython-312.pyc +0 -0
  255. package/templates/scripts/__pycache__/log_experiment.cpython-314.pyc +0 -0
  256. package/templates/scripts/__pycache__/manage_hypotheses.cpython-312.pyc +0 -0
  257. package/templates/scripts/__pycache__/manage_hypotheses.cpython-314.pyc +0 -0
  258. package/templates/scripts/__pycache__/methodology_audit.cpython-312.pyc +0 -0
  259. package/templates/scripts/__pycache__/methodology_audit.cpython-314.pyc +0 -0
  260. package/templates/scripts/__pycache__/model_distiller.cpython-312.pyc +0 -0
  261. package/templates/scripts/__pycache__/model_distiller.cpython-314.pyc +0 -0
  262. package/templates/scripts/__pycache__/model_lifecycle.cpython-312.pyc +0 -0
  263. package/templates/scripts/__pycache__/model_lifecycle.cpython-314.pyc +0 -0
  264. package/templates/scripts/__pycache__/model_merger.cpython-312.pyc +0 -0
  265. package/templates/scripts/__pycache__/model_merger.cpython-314.pyc +0 -0
  266. package/templates/scripts/__pycache__/model_pruning.cpython-312.pyc +0 -0
  267. package/templates/scripts/__pycache__/model_pruning.cpython-314.pyc +0 -0
  268. package/templates/scripts/__pycache__/model_quantization.cpython-312.pyc +0 -0
  269. package/templates/scripts/__pycache__/model_quantization.cpython-314.pyc +0 -0
  270. package/templates/scripts/__pycache__/model_xray.cpython-312.pyc +0 -0
  271. package/templates/scripts/__pycache__/model_xray.cpython-314.pyc +0 -0
  272. package/templates/scripts/__pycache__/novelty_guard.cpython-312.pyc +0 -0
  273. package/templates/scripts/__pycache__/novelty_guard.cpython-314.pyc +0 -0
  274. package/templates/scripts/__pycache__/package_experiments.cpython-312.pyc +0 -0
  275. package/templates/scripts/__pycache__/package_experiments.cpython-314.pyc +0 -0
  276. package/templates/scripts/__pycache__/pareto_frontier.cpython-312.pyc +0 -0
  277. package/templates/scripts/__pycache__/pareto_frontier.cpython-314.pyc +0 -0
  278. package/templates/scripts/__pycache__/parse_metrics.cpython-312.pyc +0 -0
  279. package/templates/scripts/__pycache__/parse_metrics.cpython-314.pyc +0 -0
  280. package/templates/scripts/__pycache__/pipeline_manager.cpython-312.pyc +0 -0
  281. package/templates/scripts/__pycache__/pipeline_manager.cpython-314.pyc +0 -0
  282. package/templates/scripts/__pycache__/profile_training.cpython-312.pyc +0 -0
  283. package/templates/scripts/__pycache__/profile_training.cpython-314.pyc +0 -0
  284. package/templates/scripts/__pycache__/regression_gate.cpython-312.pyc +0 -0
  285. package/templates/scripts/__pycache__/regression_gate.cpython-314.pyc +0 -0
  286. package/templates/scripts/__pycache__/reproduce_experiment.cpython-312.pyc +0 -0
  287. package/templates/scripts/__pycache__/reproduce_experiment.cpython-314.pyc +0 -0
  288. package/templates/scripts/__pycache__/research_planner.cpython-312.pyc +0 -0
  289. package/templates/scripts/__pycache__/research_planner.cpython-314.pyc +0 -0
  290. package/templates/scripts/__pycache__/sanity_checks.cpython-312.pyc +0 -0
  291. package/templates/scripts/__pycache__/sanity_checks.cpython-314.pyc +0 -0
  292. package/templates/scripts/__pycache__/scaffold.cpython-312.pyc +0 -0
  293. package/templates/scripts/__pycache__/scaffold.cpython-314.pyc +0 -0
  294. package/templates/scripts/__pycache__/scaling_estimator.cpython-312.pyc +0 -0
  295. package/templates/scripts/__pycache__/scaling_estimator.cpython-314.pyc +0 -0
  296. package/templates/scripts/__pycache__/seed_runner.cpython-312.pyc +0 -0
  297. package/templates/scripts/__pycache__/seed_runner.cpython-314.pyc +0 -0
  298. package/templates/scripts/__pycache__/sensitivity_analysis.cpython-312.pyc +0 -0
  299. package/templates/scripts/__pycache__/sensitivity_analysis.cpython-314.pyc +0 -0
  300. package/templates/scripts/__pycache__/session_flashback.cpython-312.pyc +0 -0
  301. package/templates/scripts/__pycache__/session_flashback.cpython-314.pyc +0 -0
  302. package/templates/scripts/__pycache__/show_experiment_tree.cpython-312.pyc +0 -0
  303. package/templates/scripts/__pycache__/show_experiment_tree.cpython-314.pyc +0 -0
  304. package/templates/scripts/__pycache__/show_families.cpython-312.pyc +0 -0
  305. package/templates/scripts/__pycache__/show_families.cpython-314.pyc +0 -0
  306. package/templates/scripts/__pycache__/simulate_review.cpython-312.pyc +0 -0
  307. package/templates/scripts/__pycache__/simulate_review.cpython-314.pyc +0 -0
  308. package/templates/scripts/__pycache__/smart_retry.cpython-312.pyc +0 -0
  309. package/templates/scripts/__pycache__/smart_retry.cpython-314.pyc +0 -0
  310. package/templates/scripts/__pycache__/statistical_compare.cpython-312.pyc +0 -0
  311. package/templates/scripts/__pycache__/statistical_compare.cpython-314.pyc +0 -0
  312. package/templates/scripts/__pycache__/suggest_next.cpython-312.pyc +0 -0
  313. package/templates/scripts/__pycache__/suggest_next.cpython-314.pyc +0 -0
  314. package/templates/scripts/__pycache__/sweep.cpython-312.pyc +0 -0
  315. package/templates/scripts/__pycache__/sweep.cpython-314.pyc +0 -0
  316. package/templates/scripts/__pycache__/synthesize_decision.cpython-312.pyc +0 -0
  317. package/templates/scripts/__pycache__/synthesize_decision.cpython-314.pyc +0 -0
  318. package/templates/scripts/__pycache__/training_monitor.cpython-312.pyc +0 -0
  319. package/templates/scripts/__pycache__/training_monitor.cpython-314.pyc +0 -0
  320. package/templates/scripts/__pycache__/treequest_suggest.cpython-312.pyc +0 -0
  321. package/templates/scripts/__pycache__/treequest_suggest.cpython-314.pyc +0 -0
  322. package/templates/scripts/__pycache__/trend_analysis.cpython-312.pyc +0 -0
  323. package/templates/scripts/__pycache__/trend_analysis.cpython-314.pyc +0 -0
  324. package/templates/scripts/__pycache__/turing_io.cpython-312.pyc +0 -0
  325. package/templates/scripts/__pycache__/turing_io.cpython-314.pyc +0 -0
  326. package/templates/scripts/__pycache__/update_state.cpython-312.pyc +0 -0
  327. package/templates/scripts/__pycache__/update_state.cpython-314.pyc +0 -0
  328. package/templates/scripts/__pycache__/verify_placeholders.cpython-312.pyc +0 -0
  329. package/templates/scripts/__pycache__/verify_placeholders.cpython-314.pyc +0 -0
  330. package/templates/scripts/__pycache__/warm_start.cpython-312.pyc +0 -0
  331. package/templates/scripts/__pycache__/warm_start.cpython-314.pyc +0 -0
  332. package/templates/scripts/__pycache__/whatif_engine.cpython-312.pyc +0 -0
  333. package/templates/scripts/__pycache__/whatif_engine.cpython-314.pyc +0 -0
package/skills/turing/feature/SKILL.md
@@ -0,0 +1,41 @@
+ ---
+ name: feature
+ description: Automated feature selection — multi-method importance consensus, redundancy detection, and interaction feature generation.
+ argument-hint: "[--method all|importance] [--top-k 20]"
+ allowed-tools: Read, Bash(*), Grep, Glob
+ ---
+
+ Systematically evaluate which features matter and which are noise.
+
+ ## Steps
+
+ 1. **Activate environment:**
+ ```bash
+ source .venv/bin/activate
+ ```
+
+ 2. **Parse arguments from `$ARGUMENTS`:**
+ - `--method all|importance|selection|generation` — analysis type (default: all)
+ - `--top-k 20` — number of top features to consider
+ - `--json` — raw JSON output
+
+ 3. **Run feature analysis:**
+ ```bash
+ python scripts/feature_intelligence.py $ARGUMENTS
+ ```
+
+ 4. **Report includes:**
+ - Consensus ranking: features ranked by number of methods placing them in top-K
+ - Per-method ranks: mutual information, L1, tree-based
+ - Redundant pairs: features with |r| > 0.95
+ - Candidate interaction features from top consensus set
+ - Drop recommendation for zero-consensus features
+
+ 5. **Saved output:** report in `experiments/features/features-*.yaml`
+
+ ## Examples
+
+ ```
+ /turing:feature # Full analysis
+ /turing:feature --top-k 10 # Top-10 consensus
+ ```
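The consensus ranking and redundancy detection that the `feature` skill's report describes can be sketched in a few lines. This is an editor's illustrative sketch only: the real `feature_intelligence.py` is not shown in this diff, and the function names below are hypothetical.

```python
import numpy as np

def consensus_ranking(method_ranks: dict[str, list[str]], top_k: int) -> list[tuple[str, int]]:
    """Rank features by how many methods place them in their top-K.

    `method_ranks` maps a method name (e.g. mutual information, L1,
    tree-based) to its feature ranking, best first.
    """
    votes: dict[str, int] = {}
    for ranking in method_ranks.values():
        for feat in ranking[:top_k]:
            votes[feat] = votes.get(feat, 0) + 1
    # Most votes first; name as a tiebreaker for stable output.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

def redundant_pairs(X: np.ndarray, names: list[str], thresh: float = 0.95):
    """Feature pairs whose absolute Pearson correlation exceeds `thresh`."""
    corr = np.corrcoef(X, rowvar=False)  # columns are features
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > thresh:
                pairs.append((names[i], names[j], round(float(corr[i, j]), 3)))
    return pairs
```

A feature that no method votes into its top-K simply never appears in the consensus list, which is the "zero-consensus, recommend dropping" case the report mentions.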
package/skills/turing/flashback/SKILL.md
@@ -0,0 +1,21 @@
+ ---
+ name: flashback
+ description: Session context restoration — "where was I?" summary after days away. Current best, pending hypotheses, last session, annotations.
+ argument-hint: "[--days 7] [--last 10]"
+ allowed-tools: Read, Bash(*), Grep, Glob
+ ---
+
+ Come back to a project after a week and start working in 10 seconds instead of 30 minutes.
+
+ ## Steps
+ 1. **Activate environment:** `source .venv/bin/activate`
+ 2. **Run:** `python scripts/session_flashback.py $ARGUMENTS`
+ 3. **Report:** current best, last session experiments, pending hypotheses, annotations, budget, suggested next action
+ 4. **Saved output:** `experiments/flashbacks/flashback-*.yaml`
+
+ ## Examples
+ ```
+ /turing:flashback # Default: last 7 days
+ /turing:flashback --days 14 # 2-week lookback
+ /turing:flashback --last 5 # Last 5 experiments
+ ```
package/skills/turing/fork/SKILL.md
@@ -0,0 +1,39 @@
+ ---
+ name: fork
+ description: Branch an experiment into parallel tracks — run both A and B, report the winner.
+ argument-hint: "<exp-id> --branches \"approach A\" \"approach B\" [--auto-promote]"
+ allowed-tools: Read, Bash(*), Grep, Glob
+ ---
+
+ Fork an experiment into parallel branches and compare results.
+
+ ## Steps
+
+ 1. **Activate environment:**
+ ```bash
+ source .venv/bin/activate
+ ```
+
+ 2. **Parse arguments from `$ARGUMENTS`:**
+ - First argument is the parent experiment ID
+ - `--branches "A" "B" "C"` — branch descriptions (2+ required)
+ - `--auto-promote` — automatically keep the winning branch
+
+ 3. **Run fork:**
+ ```bash
+ python scripts/fork_experiment.py $ARGUMENTS
+ ```
+
+ 4. **Report results:**
+ - Comparison tree showing each branch's metric
+ - Winner identified and marked
+ - Recommendation: promote winner, abandon rest
+
+ 5. **Saved output:** report written to `experiments/forks/exp-NNN-fork.yaml`
+
+ ## Examples
+
+ ```
+ /turing:fork exp-042 --branches "LightGBM with dart" "XGBoost deeper trees"
+ /turing:fork exp-042 --branches "A" "B" "C" --auto-promote
+ ```
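The winner identification in the `fork` skill's step 4 reduces to a max (or min) over branch metrics, honoring the project's metric direction. A hypothetical sketch, not the actual `fork_experiment.py` logic (which this diff does not show):

```python
def pick_winner(branches: dict[str, float], higher_is_better: bool = True) -> str:
    """Return the branch whose metric wins, honoring metric direction.

    `branches` maps a branch name to its final metric value.
    """
    best = max if higher_is_better else min
    return best(branches, key=branches.get)
```

For a metric like mae or loss, the caller would pass `higher_is_better=False`, matching the metric-direction choice made at `/turing:init` time.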
package/skills/turing/frontier/SKILL.md
@@ -0,0 +1,44 @@
+ ---
+ name: frontier
+ description: Visualize Pareto frontier across multiple objectives — answers "which model is actually best?" when there are tradeoffs.
+ argument-hint: "[--metrics \"accuracy,train_seconds,n_params\"] [--ascii]"
+ allowed-tools: Read, Bash(*), Grep, Glob
+ ---
+
+ Visualize the Pareto frontier across multiple objectives from experiment history.
+
+ ## Steps
+
+ 1. **Activate environment:**
+ ```bash
+ source .venv/bin/activate
+ ```
+
+ 2. **Parse arguments from `$ARGUMENTS`:**
+ - `--metrics "accuracy,train_seconds,n_params"` specifies metrics to analyze
+ - Without `--metrics`, uses primary metric + train_seconds from config
+ - `--ascii` generates an ASCII scatter plot (2D projection)
+
+ 3. **Run Pareto analysis:**
+ ```bash
+ python scripts/pareto_frontier.py $ARGUMENTS
+ ```
+
+ 4. **Report results:**
+ - **Pareto-optimal experiments:** table with all metrics and what each is best at
+ - **Dominated experiments:** with their nearest Pareto neighbor
+ - **ASCII scatter plot** (if `--ascii`): 2D projection with * for Pareto, · for dominated
+ - Summary: "N Pareto-optimal of M experiments across K metrics"
+
+ 5. **Saved output:** results written to `experiments/frontiers/frontier-YYYY-MM-DD.yaml`
+
+ 6. **If no experiments have all requested metrics:** suggest which metrics are available.
+
+ ## Examples
+
+ ```
+ /turing:frontier # Default: metric vs time
+ /turing:frontier --metrics "accuracy,train_seconds" # 2D frontier
+ /turing:frontier --metrics "accuracy,train_seconds,n_params" # 3D frontier
+ /turing:frontier --ascii # With scatter plot
+ ```
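The Pareto-optimal vs. dominated split in the `frontier` skill's step 4 follows standard Pareto dominance: one experiment dominates another if it is at least as good on every metric and strictly better on at least one. A minimal sketch with per-metric direction flags (the real `pareto_frontier.py` is not shown in this diff):

```python
def pareto_front(points: dict[str, tuple[float, ...]], maximize: tuple[bool, ...]) -> list[str]:
    """Return the ids of points not dominated by any other point.

    `points` maps an experiment id to its metric vector; `maximize[i]`
    says whether metric i is higher-is-better.
    """
    def dominates(a: tuple[float, ...], b: tuple[float, ...]) -> bool:
        at_least_as_good = all(
            (x >= y) if mx else (x <= y) for x, y, mx in zip(a, b, maximize)
        )
        strictly_better = any(
            (x > y) if mx else (x < y) for x, y, mx in zip(a, b, maximize)
        )
        return at_least_as_good and strictly_better

    return [
        pid for pid, p in points.items()
        if not any(dominates(q, p) for qid, q in points.items() if qid != pid)
    ]
```

With accuracy maximized and train_seconds minimized, a fast-but-slightly-worse model and a slow-but-better model can both be Pareto-optimal, which is exactly the tradeoff the frontier report surfaces.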
@@ -0,0 +1,153 @@
1
+ ---
2
+ name: init
3
+ description: Initialize a new ML project with the Turing autoresearch harness. Scaffolds the full experiment infrastructure — immutable evaluation pipeline, agent-editable training code, structured logging, convergence detection hooks, and a Python virtual environment. Use --plan to generate a research plan.
4
+ argument-hint: "[project_name] [--plan]"
5
+ allowed-tools: Read, Write, Edit, Bash(*), Grep, Glob, WebSearch, WebFetch
6
+ ---
7
+
8
+ Scaffold a new ML project with the Turing autoresearch harness. This creates the separation between the measurement apparatus (READ-ONLY) and the hypothesis space (AGENT-EDITABLE) that makes autonomous experimentation trustworthy.
9
+
10
+ ## Interactive Setup
11
+
12
+ Ask the user for the following (or accept from `$ARGUMENTS` if provided as JSON):
13
+
14
+ 1. **Project name** (`{{PROJECT_NAME}}`): Name of the ML project (e.g., "sentiment", "churn", "fraud-detection")
15
+ 2. **Target metric** (`{{TARGET_METRIC}}`): Primary metric to optimize (e.g., "accuracy", "f1", "mae", "mse", "auc")
16
+ 3. **Metric direction**: Is lower better (mae, mse, loss) or higher better (accuracy, f1, auc)?
17
+ 4. **Task description** (`{{TASK_DESCRIPTION}}`): What the model does (e.g., "Predict customer churn from usage data")
18
+ 5. **ML directory** (`{{ML_DIR}}`): Where ML files go relative to project root (e.g., "ml/sentiment")
19
+ 6. **Data source** (`{{DATA_SOURCE}}`): Where training data comes from (e.g., "data/reviews.csv")
20
+
21
+ ## Scaffolding
22
+
23
+ Once you have all 6 values, delegate to the unified scaffolding script:
24
+
25
+ ```bash
26
+ python3 <templates_dir>/scripts/scaffold.py \
27
+ --project-name "<project_name>" \
28
+ --target-metric "<target_metric>" \
29
+ --metric-direction "<metric_direction>" \
30
+ --task-description "<task_description>" \
31
+ --ml-dir "<ml_dir>" \
32
+ --data-source "<data_source>" \
33
+ --templates-dir "<templates_dir>"
34
+ ```
35
+
36
+ The scaffold script handles everything in a single atomic operation:
37
+ - Copies all template files with placeholder substitution
38
+ - Creates data/, experiments/, models/ directories
39
+ - Sets up agent memory at `.claude/agent-memory/ml-researcher-{project_name}/MEMORY.md`
40
+ - Configures Claude Code hooks in `.claude/settings.local.json`
41
+ - Creates Python virtual environment and installs requirements
42
+ - Verifies all placeholders were replaced (fails loudly if any remain)
43
+
44
+ ## Locating Templates
45
+
46
+ Use the installed command-pack templates directory first:
47
+ ```
48
+ .claude/commands/turing/templates/
49
+ ~/.claude/commands/turing/templates/
50
+ ```
51
+ Then fall back to plugin or npm locations:
52
+ ```
53
+ ~/.claude/plugins/*/templates/
54
+ node_modules/claude-turing/templates/
55
+ ```
56
+
57
+ Example command:
58
+
59
+ ```bash
60
+ python3 ~/.claude/commands/turing/templates/scripts/scaffold.py \
61
+ --project-name "<project_name>" \
62
+ --target-metric "<target_metric>" \
63
+ --metric-direction "<metric_direction>" \
64
+ --task-description "<task_description>" \
65
+ --ml-dir "<ml_dir>" \
66
+ --data-source "<data_source>" \
67
+ --templates-dir ~/.claude/commands/turing/templates
68
+ ```
69
+
70
+ ## After Scaffolding
71
+
72
+ Report what was created:
73
+ - The separation: READ-ONLY (`prepare.py`, `evaluate.py`) vs AGENT-EDITABLE (`train.py`)
74
+ - Next steps: add data to the configured data source path, run `python prepare.py`, then `/turing:train`
75
+ - The taste-leverage loop: `/turing:try` to inject hypotheses, `/turing:brief` for intelligence reports
76
+
77
+ ## Research Plan Generation (--plan flag)
78
+
79
+ If `$ARGUMENTS` contains `--plan`, generate a research plan AFTER scaffolding. This gives the agent strategic direction for its first 5-10 experiments rather than ad-hoc exploration.
80
+
81
+ ### Steps:
82
+
83
+ 1. **Read the task context** from the just-created `config.yaml`: task description, model type, target metric, data source.
84
+
85
+ 2. **Search literature** with `WebSearch` for the task domain:
86
+ - "state of the art <task description> machine learning 2024 2025"
87
+ - "best model <target metric> <data type> benchmark"
88
+ - "<task description> common approaches survey"
89
+
90
+ Use `WebFetch` on top 2-3 results to extract: dominant model families, typical metric ranges, known challenges.
91
+
92
+ 3. **Generate `RESEARCH_PLAN.md`** in the ML project directory with this structure:
93
+
```markdown
# Research Plan: <task description>

Generated: <date>

## Task Summary
<one paragraph describing the task, data, and success criteria>

## Model Families to Explore
Ordered by expected relevance based on literature:
1. **<family 1>** — <why, with citation>
2. **<family 2>** — <why, with citation>
3. **<family 3>** — <why, with citation>

## Evaluation Strategy
- Primary metric: <metric> (<higher/lower> is better)
- Multi-run recommendation: <yes/no, based on expected variance>
- Baseline target: <realistic first-pass metric from literature>

## Search Budget
- <N> experiments per model family before moving on
- Total budget: <N> experiments before first convergence check

## Success Criteria
- Target metric: <value from literature benchmarks>
- Convergence: <patience> consecutive non-improvements

## Known Challenges
- <challenge 1 from literature, e.g., "class imbalance common in this domain">
- <challenge 2>

## Sources
- <citation 1>
- <citation 2>
```

4. **Self-critique the plan** (one round):
   - Are the model families ordered by evidence strength?
   - Is the budget realistic?
   - Are the success criteria grounded in benchmark data?

   Revise if any section is vague or unsupported.

5. **Report:** "Research plan generated at `<ml_dir>/RESEARCH_PLAN.md`. The agent will read this during `/turing:train` for strategic direction."

### Integration

The agent's `program.md` OBSERVE step reads `RESEARCH_PLAN.md` (if it exists) for strategic direction. The plan is advisory — the agent can deviate but should note why in `experiment_state.yaml`.

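The conditional read above can be sketched in a few lines of Python; this is a minimal illustration, not the plugin's actual implementation, and the helper name is invented:

```python
from pathlib import Path
from typing import Optional

def load_research_plan(ml_dir: str) -> Optional[str]:
    """Return the advisory plan text if RESEARCH_PLAN.md exists, else None.

    The plan is advisory: callers may deviate, but should record the
    reason in experiment_state.yaml.
    """
    plan_path = Path(ml_dir) / "RESEARCH_PLAN.md"
    return plan_path.read_text() if plan_path.exists() else None
```

Returning `None` (rather than raising) keeps the OBSERVE step cheap when no plan was generated.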
## Multiple Projects

You can scaffold multiple ML projects in the same repository:

```bash
/turing:init   # First project: prompts for ml_dir (e.g., ml/sentiment)
/turing:init   # Second project: prompts for ml_dir (e.g., ml/churn)
```

Each project gets its own directory with independent config, data, experiments, and models. `/turing:train ml/sentiment` or `/turing:train ml/churn` targets a specific project. If you `cd ml/sentiment` first, `/turing:train` auto-detects the project from the cwd.

Agent memory is scoped per project: `.claude/agent-memory/ml-researcher-{project_name}/MEMORY.md`
@@ -0,0 +1,46 @@
---
name: leak
description: Targeted leakage detection — probe for data leakage with single-feature tests, correlation checks, and train/test overlap detection.
argument-hint: "[--deep] [--features feature_1,feature_2] [--json]"
allowed-tools: Read, Bash(*), Grep, Glob
---

Actively probe for data leakage, the #1 cause of "too good to be true" results.

## Steps

1. **Activate environment:**
   ```bash
   source .venv/bin/activate
   ```

2. **Parse arguments from `$ARGUMENTS`:**
   - `--deep` — run the full single-feature analysis (slow but thorough)
   - `--features "feat_1,feat_2"` — check only the named features
   - `--json` — raw JSON output

3. **Run the leakage scan:**
   ```bash
   python scripts/leakage_detector.py $ARGUMENTS
   ```

4. **Checks performed:**
   - **Feature-target correlation:** flag features with >0.95 correlation to the target
   - **Single-feature predictiveness (`--deep`):** train on each feature alone; flag any that achieves >80% of full-model performance
   - **Train/test overlap:** hash-based deduplication across splits

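The first and third checks can be sketched in dependency-free Python. The real `leakage_detector.py` is not shown here, so function names are illustrative; only the thresholds come from this doc:

```python
import hashlib
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def correlation_flags(features, target, threshold=0.95):
    """Flag features whose |correlation| with the target exceeds the threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) > threshold]

def overlap_rows(train_rows, test_rows):
    """Hash-based duplicate detection across splits."""
    train_hashes = {hashlib.sha256(repr(r).encode()).hexdigest()
                    for r in train_rows}
    return [r for r in test_rows
            if hashlib.sha256(repr(r).encode()).hexdigest() in train_hashes]
```

A feature that is (nearly) a copy of the target trips `correlation_flags`; any test row whose serialized form hashes into the training set trips `overlap_rows`.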
5. **Verdicts:**
   - **CLEAN** — no leakage detected
   - **SUSPICIOUS** — warnings to review
   - **LEAKAGE DETECTED** — critical flags found

6. **Integration:** satisfies the "data leakage" check in `/turing:audit`

7. **Saved output:** report in `experiments/leakage/leak-*.yaml`

## Examples

```
/turing:leak          # Standard correlation + overlap checks
/turing:leak --deep   # Full single-feature analysis
```
@@ -0,0 +1,46 @@
---
name: lit
description: Literature search scoped to the current experiment domain — find papers, SOTA baselines, and related work without leaving the terminal.
argument-hint: "<query> | --baseline | --related <exp-id> [--auto-queue] [--limit N]"
allowed-tools: Read, Bash(*), Grep, Glob, WebSearch
---

Search the literature for papers, baselines, and related work.

## Steps

1. **Activate environment:**
   ```bash
   source .venv/bin/activate
   ```

2. **Parse arguments from `$ARGUMENTS`:**
   - **Free query:** `"gradient boosting for tabular data"` — searches Semantic Scholar
   - **Baseline:** `--baseline` — finds SOTA results for the current task and compares them against your best
   - **Related:** `--related exp-042` — finds papers using methods similar to a specific experiment's
   - `--auto-queue` — auto-queues hypotheses from the literature with `source: "literature"`
   - `--limit 10` — maximum number of results

3. **Run the literature search:**
   ```bash
   python scripts/literature_search.py $ARGUMENTS
   ```

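The script's internals aren't shown here; as a sketch, a free-query search might build a request against the public Semantic Scholar Graph API (endpoint and field names below are assumptions based on that API, not taken from this plugin):

```python
from urllib.parse import urlencode

def build_search_url(query: str, limit: int = 10) -> str:
    """Build a paper-search request URL for the Semantic Scholar Graph API."""
    base = "https://api.semanticscholar.org/graph/v1/paper/search"
    fields = "title,authors,year,venue,citationCount,abstract,url"
    return f"{base}?{urlencode({'query': query, 'limit': limit, 'fields': fields})}"
```

`urlencode` handles spaces and punctuation in the free query, so the same helper serves all three query modes.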
4. **Report results:**
   - **Papers:** title, authors, year, venue, citations, abstract snippet, URL
   - **Baseline mode:** SOTA comparison with gap analysis against the current best
   - **Related mode:** methodological differences worth investigating
   - **Hypotheses:** if `--auto-queue`, shows the experiments queued from the findings

5. **Saved output:** results written to `experiments/literature/query-YYYY-MM-DD-HHMMSS.md`

6. **If the API is unavailable:** report the error and suggest a manual search.

## Examples

```
/turing:lit "gradient boosting missing values"   # Free query
/turing:lit --baseline                           # SOTA comparison
/turing:lit --related exp-042                    # Related work
/turing:lit --auto-queue "ensemble methods"      # Queue hypotheses
```
@@ -0,0 +1,50 @@
---
name: logbook
description: Generate a research logbook showing the full experiment narrative — hypotheses proposed, experiments run, decisions made, and progress over time. Outputs HTML (with interactive chart) or markdown.
argument-hint: "[--since YYYY-MM-DD] [--format html|markdown] [--output path]"
allowed-tools: Read, Bash(python scripts/*:*, source .venv/bin/activate:*, mkdir:*), Grep, Glob
---

Generate a research logbook that captures the full narrative of the experiment campaign.

## Steps

1. **Generate the logbook:**
   ```bash
   source .venv/bin/activate && python scripts/generate_logbook.py
   ```

   **With options from `$ARGUMENTS`:**
   - `--since 2026-03-15` — only include events after this date
   - `--format markdown` — output as markdown instead of HTML
   - `--output logbook.html` — write to a file instead of stdout

   **Common usage:**
   ```bash
   # HTML logbook with interactive trajectory chart
   source .venv/bin/activate && python scripts/generate_logbook.py --output logbook.html

   # Markdown for embedding in docs or READMEs
   source .venv/bin/activate && python scripts/generate_logbook.py --format markdown --output logbook.md

   # Last week's activity
   source .venv/bin/activate && python scripts/generate_logbook.py --since 2026-03-24 --output logbook.html
   ```

2. **Present the result:**
   - If HTML: tell the user to open the file in their browser. The logbook includes an interactive Chart.js trajectory visualization.
   - If markdown: display inline or note the output file location.

## What the Logbook Contains

- **Campaign summary:** total experiments, keep rate, best metric, hypothesis count
- **Improvement trajectory:** interactive line chart showing metric progression and the best-so-far envelope
- **Experiment log:** every experiment with ID, description, metric value, status (kept/discarded), and date
- **Hypothesis queue:** every hypothesis with source (human/agent/literature), status, and priority

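The best-so-far envelope on the trajectory chart is just a running optimum over the metric series. A minimal sketch (the function name is illustrative, not from the generator script):

```python
def best_so_far(values, higher_is_better=True):
    """Running best over a metric series: the chart's envelope line."""
    pick = max if higher_is_better else min
    envelope, best = [], None
    for v in values:
        best = v if best is None else pick(best, v)
        envelope.append(best)
    return envelope
```

Plotting `values` and `best_so_far(values)` together shows raw experiment noise against monotone progress.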
46
+
47
+ - To share progress with collaborators
48
+ - Before and after meetings to show what was tried
49
+ - To archive a completed research campaign
50
+ - To track progress over a specific time period
@@ -0,0 +1,23 @@
---
name: merge
description: Model merging — average weights from multiple checkpoints into a single model (soups, TIES, DARE). Free accuracy, zero latency cost.
argument-hint: "<exp-ids...> [--method uniform|greedy|ties|dare]"
allowed-tools: Read, Bash(*), Grep, Glob
---

Combine model weights (not predictions) into a single, better model with no latency overhead.

## Steps

1. **Activate environment:** `source .venv/bin/activate`
2. **Run:** `python scripts/model_merger.py $ARGUMENTS`
3. **Methods:** uniform soup (simple average), greedy soup (include a model only if it improves the metric), TIES (trim + elect + merge), DARE (drop + rescale)
4. **Report:** compatibility check, per-model metrics, method comparison, improvement delta
5. **Saved output:** `experiments/merges/merge-*.yaml`

+ ## Examples
19
+
20
+ ```
21
+ /turing:merge exp-042 exp-053 exp-067 # All methods
22
+ /turing:merge exp-042 exp-053 --method greedy # Greedy soup only
23
+ ```
@@ -0,0 +1,42 @@
---
name: mode
description: Set the research strategy mode — explore (try new things), exploit (refine what works), or replicate (verify results). Drives novelty guard policy and agent behavior.
argument-hint: "<explore|exploit|replicate>"
---

Set the research mode for the current project. The mode determines how the novelty guard filters proposed experiments and how the agent prioritizes its work.

## Modes

| Mode | Novelty Guard Policy | Agent Behavior |
|------|---------------------|----------------|
| **explore** | Allow novel ideas, block repeats and follow-ups | Try fundamentally different approaches |
| **exploit** | Allow follow-ups and known successes, block repeats | Refine the current best configuration |
| **replicate** | Allow duplicate runs, block novel ideas | Re-run best experiments with different seeds |

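The table amounts to a small policy lookup. A sketch of one possible reading (where the table leaves a cell unspecified, e.g. novel ideas under exploit, this assumes "block"; names and structure are illustrative, not the plugin's internals):

```python
# mode -> which proposal kinds the novelty guard lets through
POLICIES = {
    "explore":   {"novel": True,  "followup": False, "repeat": False},
    "exploit":   {"novel": False, "followup": True,  "repeat": False},
    "replicate": {"novel": False, "followup": False, "repeat": True},
}

def guard_allows(mode: str, proposal_kind: str) -> bool:
    """proposal_kind is 'novel', 'followup', or 'repeat'."""
    return POLICIES[mode][proposal_kind]
```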
## Steps

1. **Parse the mode** from `$ARGUMENTS`. It must be one of `explore`, `exploit`, or `replicate`; otherwise stop and list the valid modes.

2. **Update experiment state:**
   ```bash
   source .venv/bin/activate
   python -c "
   import yaml
   from pathlib import Path
   # the mode string below is substituted by the shell before Python runs
   path = Path('experiment_state.yaml')
   state = (yaml.safe_load(path.read_text()) or {}) if path.exists() else {}
   state['research_mode'] = '$ARGUMENTS'
   path.write_text(yaml.dump(state, default_flow_style=False))
   print('Research mode set to: $ARGUMENTS')
   "
   ```

3. **Confirm** with guidance:
   - `explore`: "The agent will prioritize novel ideas and avoid follow-ups. Best when the current approach feels exhausted."
   - `exploit`: "The agent will refine the current best. Best when you have a promising direction."
   - `replicate`: "The agent will re-run experiments for statistical verification. Best before declaring a winner."

## Default

The default mode is `exploit` (refine what works). Switch to `explore` when progress plateaus, and to `replicate` before final decisions.
@@ -0,0 +1,19 @@
---
name: onboard
description: Project onboarding — generate a walkthrough for new collaborators. Task, history, decisions, next steps.
argument-hint: "[--audience researcher|engineer|stakeholder] [--depth brief|full]"
allowed-tools: Read, Bash(*), Grep, Glob
---

A 5-minute read that replaces a 1-hour onboarding meeting.

## Steps
1. `source .venv/bin/activate`
2. `python scripts/generate_onboarding.py $ARGUMENTS`
3. **Saved:** `ONBOARDING.md`

## Examples
```
/turing:onboard
/turing:onboard --audience engineer --depth brief
```
@@ -0,0 +1,43 @@
---
name: paper
description: Draft mechanical paper sections (setup, results, ablation, hyperparameters) from experiment logs. LaTeX and markdown output.
argument-hint: "[--sections setup,results,ablation] [--format latex|markdown]"
allowed-tools: Read, Bash(*), Grep, Glob
---

Draft paper sections directly from experiment data.

## Steps

1. **Activate environment:**
   ```bash
   source .venv/bin/activate
   ```

2. **Parse arguments from `$ARGUMENTS`:**
   - `--sections setup,results,ablation,hyperparameters` — which sections to draft (default: all)
   - `--format latex|markdown` — output format (default: latex)

3. **Run paper drafting:**
   ```bash
   python scripts/draft_paper_sections.py $ARGUMENTS
   ```

4. **Report results:**
   - **setup:** experimental setup prose (dataset, metrics, split, seed methodology)
   - **results:** comparison table with all model types, best value bolded, seed study stats
   - **ablation:** ablation table from `/turing:ablate` results
   - **hyperparameters:** appendix-style parameter table per model

5. **Output:** each section saved to `paper/sections/` as `.tex` or `.md`

6. **Numbers are pulled directly from experiment logs** — no manual transcription needed.

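Drafting the results table is mechanical once metrics are in hand. A markdown-flavored sketch (column layout and function name are illustrative, not the script's actual output format):

```python
def results_table(rows, higher_is_better=True):
    """Render a markdown comparison table with the best metric value bolded.

    rows: list of (model_name, metric_value) pairs pulled from experiment logs.
    """
    best = (max if higher_is_better else min)(value for _, value in rows)
    lines = ["| Model | Metric |", "|-------|--------|"]
    for name, value in rows:
        cell = f"**{value:.4f}**" if value == best else f"{value:.4f}"
        lines.append(f"| {name} | {cell} |")
    return "\n".join(lines)
```

The LaTeX variant is the same loop with `&`/`\\` separators and `\textbf{}` instead of `**`.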
## Examples

```
/turing:paper                                    # All sections, LaTeX
/turing:paper --format markdown                  # All sections, markdown
/turing:paper --sections setup,results           # Just setup + results
/turing:paper --sections ablation --format latex # Just ablation table
```
@@ -0,0 +1,26 @@
---
name: plan
description: Research planning assistant — design a strategic experiment campaign with budget-aware ROI allocation.
argument-hint: "[--budget 20] [--goal \"maximize F1 for production\"] [--json]"
allowed-tools: Read, Bash(*), Grep, Glob
---

Design the next N experiments strategically, not randomly. Allocates budget by expected ROI.

## Steps
1. `source .venv/bin/activate`
2. `python scripts/research_planner.py $ARGUMENTS`
3. **Saved:** `experiments/plans/`

## How it works
- Analyzes experiment history to compute per-family ROI
- Adjusts strategy priorities based on project state and goal
- Allocates budget across: feature engineering, model search, ensemble, calibration, verification
- Generates a phased plan with specific experiment descriptions

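Per-family ROI can be as simple as mean metric improvement per experiment. A sketch of that idea (the planner's actual scoring is not shown in this doc, so the record shape is an assumption):

```python
def family_roi(experiments):
    """Mean metric improvement per experiment, grouped by model family.

    experiments: list of dicts with 'family' and 'improvement' keys,
    where improvement is the metric delta vs. the best score before the run.
    """
    totals = {}
    for exp in experiments:
        fam = exp["family"]
        count, total = totals.get(fam, (0, 0.0))
        totals[fam] = (count + 1, total + exp["improvement"])
    return {fam: total / count for fam, (count, total) in totals.items()}
```

A budget allocator would then weight each family's share of the N experiments by its ROI.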
## Examples
```
/turing:plan --budget 20
/turing:plan --budget 10 --goal "maximize F1 for production deployment"
/turing:plan --budget 30 --json
```