@researai/deepscientist 1.5.17 → 1.6.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (894)
  1. package/AGENTS.md +309 -130
  2. package/AISB/catalog/aisb.b1.agentic_coding.yaml +244 -0
  3. package/AISB/catalog/aisb.b10.climate_earth.yaml +235 -0
  4. package/AISB/catalog/aisb.b11.model_efficiency.yaml +231 -0
  5. package/AISB/catalog/aisb.b12.embodied_ai.yaml +238 -0
  6. package/AISB/catalog/aisb.b2.agent_systems.yaml +229 -0
  7. package/AISB/catalog/aisb.b3.self_evolving_rl.yaml +237 -0
  8. package/AISB/catalog/aisb.b4.lm_reasoning.yaml +240 -0
  9. package/AISB/catalog/aisb.b5.math_proof.yaml +235 -0
  10. package/AISB/catalog/aisb.b6.research_process.yaml +243 -0
  11. package/AISB/catalog/aisb.b7.multimodal_fusion.yaml +232 -0
  12. package/AISB/catalog/aisb.b8.lifesci_drug.yaml +275 -0
  13. package/AISB/catalog/aisb.b9.material_science.yaml +237 -0
  14. package/AISB/catalog/aisb.t3.001_savvy.yaml +159 -0
  15. package/AISB/catalog/aisb.t3.001_savvy.zh.yaml +121 -0
  16. package/AISB/catalog/aisb.t3.002_pinet.yaml +189 -0
  17. package/AISB/catalog/aisb.t3.002_pinet.zh.yaml +130 -0
  18. package/AISB/catalog/aisb.t3.004_decentralattn.yaml +184 -0
  19. package/AISB/catalog/aisb.t3.004_decentralattn.zh.yaml +153 -0
  20. package/AISB/catalog/aisb.t3.005_tsae.yaml +193 -0
  21. package/AISB/catalog/aisb.t3.005_tsae.zh.yaml +139 -0
  22. package/AISB/catalog/aisb.t3.006_physense.yaml +194 -0
  23. package/AISB/catalog/aisb.t3.006_physense.zh.yaml +118 -0
  24. package/AISB/catalog/aisb.t3.007_reasoningiqa.yaml +169 -0
  25. package/AISB/catalog/aisb.t3.007_reasoningiqa.zh.yaml +133 -0
  26. package/AISB/catalog/aisb.t3.008_meanflows.yaml +188 -0
  27. package/AISB/catalog/aisb.t3.008_meanflows.zh.yaml +140 -0
  28. package/AISB/catalog/aisb.t3.009_scoremissing.yaml +179 -0
  29. package/AISB/catalog/aisb.t3.009_scoremissing.zh.yaml +119 -0
  30. package/AISB/catalog/aisb.t3.010_suitabilityfilter.yaml +221 -0
  31. package/AISB/catalog/aisb.t3.010_suitabilityfilter.zh.yaml +141 -0
  32. package/AISB/catalog/aisb.t3.011_osd.yaml +206 -0
  33. package/AISB/catalog/aisb.t3.011_osd.zh.yaml +163 -0
  34. package/AISB/catalog/aisb.t3.012_efficientqat.yaml +206 -0
  35. package/AISB/catalog/aisb.t3.012_efficientqat.zh.yaml +159 -0
  36. package/AISB/catalog/aisb.t3.013_appl.yaml +152 -0
  37. package/AISB/catalog/aisb.t3.013_appl.zh.yaml +126 -0
  38. package/AISB/catalog/aisb.t3.014_piguard.yaml +207 -0
  39. package/AISB/catalog/aisb.t3.014_piguard.zh.yaml +164 -0
  40. package/AISB/catalog/aisb.t3.015_frspec.yaml +209 -0
  41. package/AISB/catalog/aisb.t3.015_frspec.zh.yaml +163 -0
  42. package/AISB/catalog/aisb.t3.016_mathfusion.yaml +166 -0
  43. package/AISB/catalog/aisb.t3.016_mathfusion.zh.yaml +145 -0
  44. package/AISB/catalog/aisb.t3.017_multimodalglp.yaml +171 -0
  45. package/AISB/catalog/aisb.t3.017_multimodalglp.zh.yaml +122 -0
  46. package/AISB/catalog/aisb.t3.018_cotsynth.yaml +206 -0
  47. package/AISB/catalog/aisb.t3.018_cotsynth.zh.yaml +162 -0
  48. package/AISB/catalog/aisb.t3.019_dyscaleut.yaml +211 -0
  49. package/AISB/catalog/aisb.t3.019_dyscaleut.zh.yaml +148 -0
  50. package/AISB/catalog/aisb.t3.020_aristotle.yaml +173 -0
  51. package/AISB/catalog/aisb.t3.020_aristotle.zh.yaml +119 -0
  52. package/AISB/catalog/aisb.t3.021_tokenrecycling.yaml +160 -0
  53. package/AISB/catalog/aisb.t3.021_tokenrecycling.zh.yaml +129 -0
  54. package/AISB/catalog/aisb.t3.022_chainofreasoning.yaml +204 -0
  55. package/AISB/catalog/aisb.t3.022_chainofreasoning.zh.yaml +161 -0
  56. package/AISB/catalog/aisb.t3.023_guidedembed.yaml +211 -0
  57. package/AISB/catalog/aisb.t3.023_guidedembed.zh.yaml +189 -0
  58. package/AISB/catalog/aisb.t3.024_outputcentric.yaml +148 -0
  59. package/AISB/catalog/aisb.t3.024_outputcentric.zh.yaml +131 -0
  60. package/AISB/catalog/aisb.t3.025_deeper.yaml +143 -0
  61. package/AISB/catalog/aisb.t3.025_deeper.zh.yaml +116 -0
  62. package/AISB/catalog/aisb.t3.026_gartkg.yaml +195 -0
  63. package/AISB/catalog/aisb.t3.026_gartkg.zh.yaml +127 -0
  64. package/AISB/catalog/aisb.t3.027_citeeval.yaml +182 -0
  65. package/AISB/catalog/aisb.t3.027_citeeval.zh.yaml +135 -0
  66. package/AISB/catalog/aisb.t3.028_sbam.yaml +206 -0
  67. package/AISB/catalog/aisb.t3.028_sbam.zh.yaml +166 -0
  68. package/AISB/catalog/aisb.t3.029_cdqgeoembed.yaml +224 -0
  69. package/AISB/catalog/aisb.t3.029_cdqgeoembed.zh.yaml +142 -0
  70. package/AISB/catalog/aisb.t3.030_processrm.yaml +211 -0
  71. package/AISB/catalog/aisb.t3.030_processrm.zh.yaml +166 -0
  72. package/AISB/catalog/aisb.t3.031_circuitstability.yaml +172 -0
  73. package/AISB/catalog/aisb.t3.031_circuitstability.zh.yaml +134 -0
  74. package/AISB/catalog/aisb.t3.032_ptsolver.yaml +169 -0
  75. package/AISB/catalog/aisb.t3.032_ptsolver.zh.yaml +135 -0
  76. package/AISB/catalog/aisb.t3.033_gcse.yaml +144 -0
  77. package/AISB/catalog/aisb.t3.033_gcse.zh.yaml +126 -0
  78. package/AISB/catalog/aisb.t3.034_ensemblewm.yaml +183 -0
  79. package/AISB/catalog/aisb.t3.034_ensemblewm.zh.yaml +146 -0
  80. package/AISB/catalog/aisb.t3.035_moralvalueswa.yaml +207 -0
  81. package/AISB/catalog/aisb.t3.035_moralvalueswa.zh.yaml +165 -0
  82. package/AISB/catalog/aisb.t3.036_weakstrongpref.yaml +210 -0
  83. package/AISB/catalog/aisb.t3.036_weakstrongpref.zh.yaml +194 -0
  84. package/AISB/catalog/aisb.t3.037_dementiamask.yaml +172 -0
  85. package/AISB/catalog/aisb.t3.037_dementiamask.zh.yaml +132 -0
  86. package/AISB/catalog/aisb.t3.038_tinysam.yaml +284 -0
  87. package/AISB/catalog/aisb.t3.038_tinysam.zh.yaml +240 -0
  88. package/AISB/catalog/aisb.t3.039_calf.yaml +224 -0
  89. package/AISB/catalog/aisb.t3.039_calf.zh.yaml +194 -0
  90. package/AISB/catalog/aisb.t3.040_graniteguardian.yaml +199 -0
  91. package/AISB/catalog/aisb.t3.040_graniteguardian.zh.yaml +174 -0
  92. package/AISB/catalog/aisb.t3.041_amdm.yaml +149 -0
  93. package/AISB/catalog/aisb.t3.041_amdm.zh.yaml +137 -0
  94. package/AISB/catalog/aisb.t3.042_xpatch.yaml +216 -0
  95. package/AISB/catalog/aisb.t3.042_xpatch.zh.yaml +182 -0
  96. package/AISB/catalog/aisb.t3.043_vhm.yaml +268 -0
  97. package/AISB/catalog/aisb.t3.043_vhm.zh.yaml +193 -0
  98. package/AISB/catalog/aisb.t3.044_rgvi.yaml +224 -0
  99. package/AISB/catalog/aisb.t3.044_rgvi.zh.yaml +176 -0
  100. package/AISB/catalog/aisb.t3.045_pslstm.yaml +203 -0
  101. package/AISB/catalog/aisb.t3.045_pslstm.zh.yaml +179 -0
  102. package/AISB/catalog/aisb.t3.046_nonstatts.yaml +208 -0
  103. package/AISB/catalog/aisb.t3.046_nonstatts.zh.yaml +194 -0
  104. package/AISB/catalog/aisb.t3.047_timepfn.yaml +156 -0
  105. package/AISB/catalog/aisb.t3.047_timepfn.zh.yaml +124 -0
  106. package/AISB/catalog/aisb.t3.048_proxyspex.yaml +148 -0
  107. package/AISB/catalog/aisb.t3.048_proxyspex.zh.yaml +125 -0
  108. package/AISB/catalog/aisb.t3.049_hogwildinference.yaml +183 -0
  109. package/AISB/catalog/aisb.t3.049_hogwildinference.zh.yaml +138 -0
  110. package/AISB/catalog/aisb.t3.050_causalpfn.yaml +214 -0
  111. package/AISB/catalog/aisb.t3.050_causalpfn.zh.yaml +190 -0
  112. package/AISB/catalog/aisb.t3.051_flashtp.yaml +169 -0
  113. package/AISB/catalog/aisb.t3.051_flashtp.zh.yaml +124 -0
  114. package/AISB/catalog/aisb.t3.052_nsdiff.yaml +155 -0
  115. package/AISB/catalog/aisb.t3.052_nsdiff.zh.yaml +138 -0
  116. package/AISB/catalog/aisb.t3.053_k2vae.yaml +158 -0
  117. package/AISB/catalog/aisb.t3.053_k2vae.zh.yaml +132 -0
  118. package/AISB/catalog/aisb.t3.054_timebase.yaml +178 -0
  119. package/AISB/catalog/aisb.t3.054_timebase.zh.yaml +158 -0
  120. package/AISB/catalog/aisb.t3.055_csbrain.yaml +238 -0
  121. package/AISB/catalog/aisb.t3.055_csbrain.zh.yaml +184 -0
  122. package/AISB/catalog/aisb.t3.056_infosam.yaml +224 -0
  123. package/AISB/catalog/aisb.t3.056_infosam.zh.yaml +189 -0
  124. package/AISB/catalog/aisb.t3.057_mdreid.yaml +129 -0
  125. package/AISB/catalog/aisb.t3.057_mdreid.zh.yaml +117 -0
  126. package/AISB/catalog/aisb.t3.058_mindglitch.yaml +171 -0
  127. package/AISB/catalog/aisb.t3.058_mindglitch.zh.yaml +145 -0
  128. package/AISB/catalog/aisb.t3.059_selfsupervised.yaml +154 -0
  129. package/AISB/catalog/aisb.t3.059_selfsupervised.zh.yaml +125 -0
  130. package/AISB/catalog/aisb.t3.060_iaggad.yaml +121 -0
  131. package/AISB/catalog/aisb.t3.060_iaggad.zh.yaml +100 -0
  132. package/AISB/catalog/aisb.t3.061_hsgkn.yaml +136 -0
  133. package/AISB/catalog/aisb.t3.061_hsgkn.zh.yaml +113 -0
  134. package/AISB/catalog/aisb.t3.062_visionts.yaml +237 -0
  135. package/AISB/catalog/aisb.t3.062_visionts.zh.yaml +216 -0
  136. package/AISB/catalog/aisb.t3.063_tsrag.yaml +162 -0
  137. package/AISB/catalog/aisb.t3.063_tsrag.zh.yaml +138 -0
  138. package/AISB/catalog/aisb.t3.064_pir.yaml +221 -0
  139. package/AISB/catalog/aisb.t3.064_pir.zh.yaml +197 -0
  140. package/AISB/catalog/aisb.t3.065_proteinbinding.yaml +234 -0
  141. package/AISB/catalog/aisb.t3.065_proteinbinding.zh.yaml +167 -0
  142. package/AISB/catalog/aisb.t3.066_tropicalattention.yaml +267 -0
  143. package/AISB/catalog/aisb.t3.066_tropicalattention.zh.yaml +229 -0
  144. package/AISB/catalog/aisb.t3.067_kanad.yaml +193 -0
  145. package/AISB/catalog/aisb.t3.067_kanad.zh.yaml +167 -0
  146. package/AISB/catalog/aisb.t3.068_sempo.yaml +187 -0
  147. package/AISB/catalog/aisb.t3.068_sempo.zh.yaml +148 -0
  148. package/AISB/catalog/aisb.t3.069_treehfd.yaml +129 -0
  149. package/AISB/catalog/aisb.t3.069_treehfd.zh.yaml +111 -0
  150. package/AISB/catalog/aisb.t3.070_certifiedunlearning.yaml +224 -0
  151. package/AISB/catalog/aisb.t3.070_certifiedunlearning.zh.yaml +171 -0
  152. package/AISB/catalog/aisb.t3.071_neuralmjd.yaml +142 -0
  153. package/AISB/catalog/aisb.t3.071_neuralmjd.zh.yaml +120 -0
  154. package/AISB/catalog/aisb.t3.072_fedgmt.yaml +181 -0
  155. package/AISB/catalog/aisb.t3.072_fedgmt.zh.yaml +158 -0
  156. package/AISB/catalog/aisb.t3.073_rld.yaml +161 -0
  157. package/AISB/catalog/aisb.t3.073_rld.zh.yaml +129 -0
  158. package/AISB/catalog/aisb.t3.074_lsvi.yaml +163 -0
  159. package/AISB/catalog/aisb.t3.074_lsvi.zh.yaml +129 -0
  160. package/AISB/catalog/aisb.t3.075_treeslicedentropy.yaml +201 -0
  161. package/AISB/catalog/aisb.t3.075_treeslicedentropy.zh.yaml +148 -0
  162. package/AISB/catalog/aisb.t3.076_aanet.yaml +169 -0
  163. package/AISB/catalog/aisb.t3.076_aanet.zh.yaml +129 -0
  164. package/AISB/catalog/aisb.t3.077_cmnn.yaml +199 -0
  165. package/AISB/catalog/aisb.t3.077_cmnn.zh.yaml +165 -0
  166. package/AISB/catalog/aisb.t3.078_conformalanomaly.yaml +146 -0
  167. package/AISB/catalog/aisb.t3.078_conformalanomaly.zh.yaml +117 -0
  168. package/AISB/catalog/aisb.t3.079_dpfkmeans.yaml +131 -0
  169. package/AISB/catalog/aisb.t3.079_dpfkmeans.zh.yaml +104 -0
  170. package/AISB/catalog/aisb.t3.080_latentscorereweight.yaml +169 -0
  171. package/AISB/catalog/aisb.t3.080_latentscorereweight.zh.yaml +123 -0
  172. package/AISB/catalog/aisb.t3.081_qmamba.yaml +150 -0
  173. package/AISB/catalog/aisb.t3.081_qmamba.zh.yaml +117 -0
  174. package/AISB/catalog/aisb.t3.082_onlinellmrouting.yaml +160 -0
  175. package/AISB/catalog/aisb.t3.082_onlinellmrouting.zh.yaml +133 -0
  176. package/AISB/catalog/aisb.t3.083_starformer.yaml +178 -0
  177. package/AISB/catalog/aisb.t3.083_starformer.zh.yaml +140 -0
  178. package/AISB/catalog/aisb.t3.084_ift.yaml +139 -0
  179. package/AISB/catalog/aisb.t3.084_ift.zh.yaml +111 -0
  180. package/AISB/catalog/aisb.t3.085_neuralsurv.yaml +183 -0
  181. package/AISB/catalog/aisb.t3.085_neuralsurv.zh.yaml +143 -0
  182. package/AISB/catalog/aisb.t3.086_stella.yaml +197 -0
  183. package/AISB/catalog/aisb.t3.086_stella.zh.yaml +142 -0
  184. package/AISB/catalog/aisb.t3.087_moses.yaml +167 -0
  185. package/AISB/catalog/aisb.t3.087_moses.zh.yaml +132 -0
  186. package/AISB/catalog/aisb.t3.088_channelnorm.yaml +140 -0
  187. package/AISB/catalog/aisb.t3.088_channelnorm.zh.yaml +109 -0
  188. package/AISB/catalog/aisb.t3.089_causalvelocity.yaml +730 -0
  189. package/AISB/catalog/aisb.t3.089_causalvelocity.zh.yaml +668 -0
  190. package/AISB/catalog/aisb.t3.090_rstib.yaml +144 -0
  191. package/AISB/catalog/aisb.t3.090_rstib.zh.yaml +109 -0
  192. package/AISB/catalog/aisb.t3.091_timeawarecausal.yaml +132 -0
  193. package/AISB/catalog/aisb.t3.091_timeawarecausal.zh.yaml +107 -0
  194. package/AISB/catalog/aisb.t3.092_kmeanslocalopt.yaml +138 -0
  195. package/AISB/catalog/aisb.t3.092_kmeanslocalopt.zh.yaml +110 -0
  196. package/AISB/catalog/aisb.t3.093_fedwmsam.yaml +134 -0
  197. package/AISB/catalog/aisb.t3.093_fedwmsam.zh.yaml +106 -0
  198. package/AISB/catalog/aisb.t3.094_boundre.yaml +147 -0
  199. package/AISB/catalog/aisb.t3.094_boundre.zh.yaml +114 -0
  200. package/AISB/catalog/aisb.t3.095_fastfeaturecp.yaml +153 -0
  201. package/AISB/catalog/aisb.t3.095_fastfeaturecp.zh.yaml +118 -0
  202. package/AISB/catalog/aisb.t3.096_m3svm.yaml +189 -0
  203. package/AISB/catalog/aisb.t3.096_m3svm.zh.yaml +149 -0
  204. package/AISB/catalog/aisb.t3.097_wassersteintl.yaml +212 -0
  205. package/AISB/catalog/aisb.t3.097_wassersteintl.zh.yaml +169 -0
  206. package/AISB/catalog/aisb.t3.098_xmahalanobis.yaml +171 -0
  207. package/AISB/catalog/aisb.t3.098_xmahalanobis.zh.yaml +127 -0
  208. package/AISB/catalog/aisb.t3.099_ollalanding.yaml +248 -0
  209. package/AISB/catalog/aisb.t3.099_ollalanding.zh.yaml +182 -0
  210. package/AISB/catalog/aisb.t3.100_invmissingdata.yaml +179 -0
  211. package/AISB/catalog/aisb.t3.100_invmissingdata.zh.yaml +150 -0
  212. package/AISB/catalog/aisb.t3.101_acia.yaml +164 -0
  213. package/AISB/catalog/aisb.t3.101_acia.zh.yaml +109 -0
  214. package/AISB/catalog/aisb.t3.102_stochasticff.yaml +178 -0
  215. package/AISB/catalog/aisb.t3.102_stochasticff.zh.yaml +130 -0
  216. package/AISB/catalog/aisb.t3.103_qdcp.yaml +150 -0
  217. package/AISB/catalog/aisb.t3.103_qdcp.zh.yaml +116 -0
  218. package/AISB/catalog/aisb.t3.104_balancedactiveinf.yaml +137 -0
  219. package/AISB/catalog/aisb.t3.104_balancedactiveinf.zh.yaml +104 -0
  220. package/AISB/catalog/aisb.t3.105_binaryclasseval.yaml +161 -0
  221. package/AISB/catalog/aisb.t3.105_binaryclasseval.zh.yaml +130 -0
  222. package/AISB/image/001_aisb.t3.001_savvy.jpg +0 -0
  223. package/AISB/image/002_aisb.t3.002_pinet.jpg +0 -0
  224. package/AISB/image/003_aisb.t3.003_dmsqd.jpg +0 -0
  225. package/AISB/image/004_aisb.t3.004_decentralattn.jpg +0 -0
  226. package/AISB/image/005_aisb.t3.005_tsae.jpg +0 -0
  227. package/AISB/image/006_aisb.t3.006_physense.jpg +0 -0
  228. package/AISB/image/007_aisb.t3.007_reasoningiqa.jpg +0 -0
  229. package/AISB/image/008_aisb.t3.008_meanflows.jpg +0 -0
  230. package/AISB/image/009_aisb.t3.009_scoremissing.jpg +0 -0
  231. package/AISB/image/010_aisb.t3.010_suitabilityfilter.jpg +0 -0
  232. package/AISB/image/011_aisb.t3.011_osd.jpg +0 -0
  233. package/AISB/image/012_aisb.t3.012_efficientqat.jpg +0 -0
  234. package/AISB/image/013_aisb.t3.013_appl.jpg +0 -0
  235. package/AISB/image/014_aisb.t3.014_piguard.jpg +0 -0
  236. package/AISB/image/015_aisb.t3.015_frspec.jpg +0 -0
  237. package/AISB/image/016_aisb.t3.016_mathfusion.jpg +0 -0
  238. package/AISB/image/017_aisb.t3.017_multimodalglp.jpg +0 -0
  239. package/AISB/image/018_aisb.t3.018_cotsynth.jpg +0 -0
  240. package/AISB/image/019_aisb.t3.019_dyscaleut.jpg +0 -0
  241. package/AISB/image/020_aisb.t3.020_aristotle.jpg +0 -0
  242. package/AISB/image/021_aisb.t3.021_tokenrecycling.jpg +0 -0
  243. package/AISB/image/022_aisb.t3.022_chainofreasoning.jpg +0 -0
  244. package/AISB/image/023_aisb.t3.023_guidedembed.jpg +0 -0
  245. package/AISB/image/024_aisb.t3.024_outputcentric.jpg +0 -0
  246. package/AISB/image/025_aisb.t3.025_deeper.jpg +0 -0
  247. package/AISB/image/026_aisb.t3.026_gartkg.jpg +0 -0
  248. package/AISB/image/027_aisb.t3.027_citeeval.jpg +0 -0
  249. package/AISB/image/028_aisb.t3.028_sbam.jpg +0 -0
  250. package/AISB/image/029_aisb.t3.029_cdqgeoembed.jpg +0 -0
  251. package/AISB/image/030_aisb.t3.030_processrm.jpg +0 -0
  252. package/AISB/image/031_aisb.t3.031_circuitstability.jpg +0 -0
  253. package/AISB/image/032_aisb.t3.032_ptsolver.jpg +0 -0
  254. package/AISB/image/033_aisb.t3.033_gcse.jpg +0 -0
  255. package/AISB/image/034_aisb.t3.034_ensemblewm.jpg +0 -0
  256. package/AISB/image/035_aisb.t3.035_moralvalueswa.jpg +0 -0
  257. package/AISB/image/036_aisb.t3.036_weakstrongpref.jpg +0 -0
  258. package/AISB/image/037_aisb.t3.037_dementiamask.jpg +0 -0
  259. package/AISB/image/038_aisb.t3.038_tinysam.jpg +0 -0
  260. package/AISB/image/039_aisb.t3.039_calf.jpg +0 -0
  261. package/AISB/image/040_aisb.t3.040_graniteguardian.jpg +0 -0
  262. package/AISB/image/041_aisb.t3.041_amdm.jpg +0 -0
  263. package/AISB/image/042_aisb.t3.042_xpatch.jpg +0 -0
  264. package/AISB/image/043_aisb.t3.043_vhm.jpg +0 -0
  265. package/AISB/image/044_aisb.t3.044_rgvi.jpg +0 -0
  266. package/AISB/image/045_aisb.t3.045_pslstm.jpg +0 -0
  267. package/AISB/image/046_aisb.t3.046_nonstatts.jpg +0 -0
  268. package/AISB/image/047_aisb.t3.047_timepfn.jpg +0 -0
  269. package/AISB/image/048_aisb.t3.048_proxyspex.jpg +0 -0
  270. package/AISB/image/049_aisb.t3.049_hogwildinference.jpg +0 -0
  271. package/AISB/image/050_aisb.t3.050_causalpfn.jpg +0 -0
  272. package/AISB/image/051_aisb.t3.051_flashtp.jpg +0 -0
  273. package/AISB/image/052_aisb.t3.052_nsdiff.jpg +0 -0
  274. package/AISB/image/053_aisb.t3.053_k2vae.jpg +0 -0
  275. package/AISB/image/054_aisb.t3.054_timebase.jpg +0 -0
  276. package/AISB/image/055_aisb.t3.055_csbrain.jpg +0 -0
  277. package/AISB/image/056_aisb.t3.056_infosam.jpg +0 -0
  278. package/AISB/image/057_aisb.t3.057_mdreid.jpg +0 -0
  279. package/AISB/image/058_aisb.t3.058_mindglitch.jpg +0 -0
  280. package/AISB/image/059_aisb.t3.059_selfsupervised.jpg +0 -0
  281. package/AISB/image/060_aisb.t3.060_iaggad.jpg +0 -0
  282. package/AISB/image/061_aisb.t3.061_hsgkn.jpg +0 -0
  283. package/AISB/image/062_aisb.t3.062_visionts.jpg +0 -0
  284. package/AISB/image/063_aisb.t3.063_tsrag.jpg +0 -0
  285. package/AISB/image/064_aisb.t3.064_pir.jpg +0 -0
  286. package/AISB/image/065_aisb.t3.065_proteinbinding.jpg +0 -0
  287. package/AISB/image/066_aisb.t3.066_tropicalattention.jpg +0 -0
  288. package/AISB/image/067_aisb.t3.067_kanad.jpg +0 -0
  289. package/AISB/image/068_aisb.t3.068_sempo.jpg +0 -0
  290. package/AISB/image/069_aisb.t3.069_treehfd.jpg +0 -0
  291. package/AISB/image/070_aisb.t3.070_certifiedunlearning.jpg +0 -0
  292. package/AISB/image/071_aisb.t3.071_neuralmjd.jpg +0 -0
  293. package/AISB/image/072_aisb.t3.072_fedgmt.jpg +0 -0
  294. package/AISB/image/073_aisb.t3.073_rld.jpg +0 -0
  295. package/AISB/image/074_aisb.t3.074_lsvi.jpg +0 -0
  296. package/AISB/image/075_aisb.t3.075_treeslicedentropy.jpg +0 -0
  297. package/AISB/image/076_aisb.t3.076_aanet.jpg +0 -0
  298. package/AISB/image/077_aisb.t3.077_cmnn.jpg +0 -0
  299. package/AISB/image/078_aisb.t3.078_conformalanomaly.jpg +0 -0
  300. package/AISB/image/079_aisb.t3.079_dpfkmeans.jpg +0 -0
  301. package/AISB/image/080_aisb.t3.080_latentscorereweight.jpg +0 -0
  302. package/AISB/image/081_aisb.t3.081_qmamba.jpg +0 -0
  303. package/AISB/image/082_aisb.t3.082_onlinellmrouting.jpg +0 -0
  304. package/AISB/image/083_aisb.t3.083_starformer.jpg +0 -0
  305. package/AISB/image/084_aisb.t3.084_ift.jpg +0 -0
  306. package/AISB/image/085_aisb.t3.085_neuralsurv.jpg +0 -0
  307. package/AISB/image/086_aisb.t3.086_stella.jpg +0 -0
  308. package/AISB/image/087_aisb.t3.087_moses.jpg +0 -0
  309. package/AISB/image/088_aisb.t3.088_channelnorm.jpg +0 -0
  310. package/AISB/image/089_aisb.t3.089_causalvelocity.jpg +0 -0
  311. package/AISB/image/090_aisb.t3.090_rstib.jpg +0 -0
  312. package/AISB/image/091_aisb.t3.091_timeawarecausal.jpg +0 -0
  313. package/AISB/image/092_aisb.t3.092_kmeanslocalopt.jpg +0 -0
  314. package/AISB/image/093_aisb.t3.093_fedwmsam.jpg +0 -0
  315. package/AISB/image/094_aisb.t3.094_boundre.jpg +0 -0
  316. package/AISB/image/095_aisb.t3.095_fastfeaturecp.jpg +0 -0
  317. package/AISB/image/096_aisb.t3.096_m3svm.jpg +0 -0
  318. package/AISB/image/097_aisb.t3.097_wassersteintl.jpg +0 -0
  319. package/AISB/image/098_aisb.t3.098_xmahalanobis.jpg +0 -0
  320. package/AISB/image/099_aisb.t3.099_ollalanding.jpg +0 -0
  321. package/AISB/image/100_aisb.t3.100_invmissingdata.jpg +0 -0
  322. package/AISB/image/101_aisb.t3.101_acia.jpg +0 -0
  323. package/AISB/image/102_aisb.t3.102_stochasticff.jpg +0 -0
  324. package/AISB/image/103_aisb.t3.103_qdcp.jpg +0 -0
  325. package/AISB/image/104_aisb.t3.104_balancedactiveinf.jpg +0 -0
  326. package/AISB/image/105_aisb.t3.105_binaryclasseval.jpg +0 -0
  327. package/AISB/image/106_aisb.t1.reasoning_lite.jpg +0 -0
  328. package/AISB/image/107_aisb.t2.paper_audit.jpg +0 -0
  329. package/AISB/image/108_aisb.t3.multi_gpu_search.jpg +0 -0
  330. package/AISB/image/109_aisb.t3.tdc_admet.jpg +0 -0
  331. package/AISB/image/aisb.b1.agentic_coding.svg +16 -0
  332. package/AISB/image/aisb.b10.climate_earth.svg +16 -0
  333. package/AISB/image/aisb.b11.model_efficiency.svg +16 -0
  334. package/AISB/image/aisb.b12.embodied_ai.svg +16 -0
  335. package/AISB/image/aisb.b2.agent_systems.svg +16 -0
  336. package/AISB/image/aisb.b3.self_evolving_rl.svg +16 -0
  337. package/AISB/image/aisb.b4.lm_reasoning.svg +16 -0
  338. package/AISB/image/aisb.b5.math_proof.svg +16 -0
  339. package/AISB/image/aisb.b6.research_process.svg +16 -0
  340. package/AISB/image/aisb.b7.multimodal_fusion.svg +16 -0
  341. package/AISB/image/aisb.b8.lifesci_drug.svg +16 -0
  342. package/AISB/image/aisb.b9.material_science.svg +16 -0
  343. package/README.md +132 -11
  344. package/bin/ds.js +376 -49
  345. package/docs/en/00_QUICK_START.md +135 -18
  346. package/docs/en/01_SETTINGS_REFERENCE.md +468 -96
  347. package/docs/en/02_START_RESEARCH_GUIDE.md +26 -5
  348. package/docs/en/03_QQ_CONNECTOR_GUIDE.md +14 -3
  349. package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +2 -0
  350. package/docs/en/05_TUI_GUIDE.md +171 -2
  351. package/docs/en/07_MEMORY_AND_MCP.md +38 -2
  352. package/docs/en/09_DOCTOR.md +64 -4
  353. package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +38 -1
  354. package/docs/en/11_LICENSE_AND_RISK.md +4 -0
  355. package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +15 -0
  356. package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
  357. package/docs/en/15_CODEX_PROVIDER_SETUP.md +622 -187
  358. package/docs/en/16_TELEGRAM_CONNECTOR_GUIDE.md +14 -0
  359. package/docs/en/17_WHATSAPP_CONNECTOR_GUIDE.md +14 -0
  360. package/docs/en/18_FEISHU_CONNECTOR_GUIDE.md +14 -0
  361. package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +105 -2
  362. package/docs/en/22_BENCHSTORE_YAML_REFERENCE.md +469 -0
  363. package/docs/en/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +316 -0
  364. package/docs/en/24_CLAUDE_CODE_PROVIDER_SETUP.md +469 -0
  365. package/docs/en/25_OPENCODE_PROVIDER_SETUP.md +653 -0
  366. package/docs/en/26_CITATION_AND_ATTRIBUTION.md +119 -0
  367. package/docs/en/27_KIMI_CODE_PROVIDER_SETUP.md +180 -0
  368. package/docs/en/28_DISCORD_CONNECTOR_GUIDE.md +61 -0
  369. package/docs/en/29_SLACK_CONNECTOR_GUIDE.md +60 -0
  370. package/docs/en/30_SETTINGS_CONTROL_CENTER_GUIDE.md +371 -0
  371. package/docs/en/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
  372. package/docs/en/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +273 -0
  373. package/docs/en/33_WORKSPACE_EXPLORER_QA.md +121 -0
  374. package/docs/en/91_DEVELOPMENT.md +29 -0
  375. package/docs/en/99_ACKNOWLEDGEMENTS.md +24 -19
  376. package/docs/en/README.md +44 -7
  377. package/docs/images/admin/admin-connectors-health-en.png +0 -0
  378. package/docs/images/admin/admin-controllers-en.png +0 -0
  379. package/docs/images/admin/admin-diagnostics-en.png +0 -0
  380. package/docs/images/admin/admin-errors-en.png +0 -0
  381. package/docs/images/admin/admin-issues-en.png +0 -0
  382. package/docs/images/admin/admin-logs-en.png +0 -0
  383. package/docs/images/admin/admin-quest-detail-en.png +0 -0
  384. package/docs/images/admin/admin-quests-en.png +0 -0
  385. package/docs/images/admin/admin-repairs-en.png +0 -0
  386. package/docs/images/admin/admin-runtime-en.png +0 -0
  387. package/docs/images/admin/admin-search-en.png +0 -0
  388. package/docs/images/admin/admin-stats-en.png +0 -0
  389. package/docs/images/admin/admin-summary-en.png +0 -0
  390. package/docs/images/connectors/connector-discord-en.png +0 -0
  391. package/docs/images/connectors/connector-feishu-en.png +0 -0
  392. package/docs/images/connectors/connector-lingzhu-en.png +0 -0
  393. package/docs/images/connectors/connector-qq-en.png +0 -0
  394. package/docs/images/connectors/connector-slack-en.png +0 -0
  395. package/docs/images/connectors/connector-telegram-en.png +0 -0
  396. package/docs/images/connectors/connector-weixin-en.png +0 -0
  397. package/docs/images/connectors/connector-whatsapp-en.png +0 -0
  398. package/docs/images/settings/settings-baselines-en.png +0 -0
  399. package/docs/images/settings/settings-config-en.png +0 -0
  400. package/docs/images/settings/settings-connectors-overview-en.png +0 -0
  401. package/docs/images/settings/settings-deepxiv-en.png +0 -0
  402. package/docs/images/settings/settings-mcp-servers-en.png +0 -0
  403. package/docs/images/settings/settings-plugins-en.png +0 -0
  404. package/docs/images/settings/settings-runners-en.png +0 -0
  405. package/docs/zh/00_QUICK_START.md +92 -17
  406. package/docs/zh/01_SETTINGS_REFERENCE.md +219 -98
  407. package/docs/zh/02_START_RESEARCH_GUIDE.md +26 -5
  408. package/docs/zh/05_TUI_GUIDE.md +171 -2
  409. package/docs/zh/07_MEMORY_AND_MCP.md +29 -2
  410. package/docs/zh/09_DOCTOR.md +39 -4
  411. package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +24 -1
  412. package/docs/zh/11_LICENSE_AND_RISK.md +4 -0
  413. package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +15 -0
  414. package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
  415. package/docs/zh/15_CODEX_PROVIDER_SETUP.md +550 -188
  416. package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +105 -2
  417. package/docs/zh/22_BENCHSTORE_YAML_REFERENCE.md +459 -0
  418. package/docs/zh/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +287 -0
  419. package/docs/zh/23_CLAUDE_RUNNER_GUIDE.md +103 -0
  420. package/docs/zh/24_CLAUDE_CODE_PROVIDER_SETUP.md +460 -0
  421. package/docs/zh/25_OPENCODE_PROVIDER_SETUP.md +660 -0
  422. package/docs/zh/26_CITATION_AND_ATTRIBUTION.md +102 -0
  423. package/docs/zh/27_KIMI_CODE_PROVIDER_SETUP.md +51 -0
  424. package/docs/zh/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
  425. package/docs/zh/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +264 -0
  426. package/docs/zh/33_WORKSPACE_EXPLORER_QA.md +127 -0
  427. package/docs/zh/99_ACKNOWLEDGEMENTS.md +23 -19
  428. package/docs/zh/README.md +29 -7
  429. package/install.sh +122 -16
  430. package/package.json +4 -1
  431. package/pyproject.toml +2 -1
  432. package/src/deepscientist/__init__.py +1 -1
  433. package/src/deepscientist/acp/envelope.py +13 -0
  434. package/src/deepscientist/admin/__init__.py +3 -0
  435. package/src/deepscientist/admin/charts.py +681 -0
  436. package/src/deepscientist/admin/logs.py +119 -0
  437. package/src/deepscientist/admin/repairs.py +217 -0
  438. package/src/deepscientist/admin/service.py +1310 -0
  439. package/src/deepscientist/admin/system_info.py +700 -0
  440. package/src/deepscientist/admin/tasks.py +465 -0
  441. package/src/deepscientist/admin/tool_metrics.py +600 -0
  442. package/src/deepscientist/artifact/guidance.py +8 -4
  443. package/src/deepscientist/artifact/schemas.py +115 -0
  444. package/src/deepscientist/artifact/service.py +4268 -260
  445. package/src/deepscientist/bash_exec/monitor.py +30 -3
  446. package/src/deepscientist/bash_exec/service.py +134 -1
  447. package/src/deepscientist/benchstore/__init__.py +4 -0
  448. package/src/deepscientist/benchstore/prompt_builder.py +224 -0
  449. package/src/deepscientist/benchstore/service.py +1716 -0
  450. package/src/deepscientist/channels/weixin_ilink.py +8 -1
  451. package/src/deepscientist/cli.py +92 -17
  452. package/src/deepscientist/codex_cli_compat.py +2 -2
  453. package/src/deepscientist/config/models.py +82 -11
  454. package/src/deepscientist/config/service.py +927 -91
  455. package/src/deepscientist/connector/weixin_support.py +48 -17
  456. package/src/deepscientist/daemon/api/handlers.py +697 -210
  457. package/src/deepscientist/daemon/api/router.py +76 -1
  458. package/src/deepscientist/daemon/app.py +1054 -51
  459. package/src/deepscientist/diagnostics/runner_failures.py +147 -0
  460. package/src/deepscientist/doctor.py +212 -65
  461. package/src/deepscientist/evidence_packets.py +590 -0
  462. package/src/deepscientist/home.py +52 -4
  463. package/src/deepscientist/kimi_cli_compat.py +50 -0
  464. package/src/deepscientist/latex_runtime.py +2 -2
  465. package/src/deepscientist/mcp/context.py +2 -0
  466. package/src/deepscientist/mcp/schemas.py +114 -0
  467. package/src/deepscientist/mcp/server.py +1566 -126
  468. package/src/deepscientist/memory/service.py +203 -16
  469. package/src/deepscientist/process_control.py +8 -1
  470. package/src/deepscientist/prompts/builder.py +836 -92
  471. package/src/deepscientist/quest/__init__.py +2 -2
  472. package/src/deepscientist/quest/layout.py +12 -1
  473. package/src/deepscientist/quest/node_traces.py +10 -0
  474. package/src/deepscientist/quest/service.py +1430 -139
  475. package/src/deepscientist/quest/stage_views.py +1 -1
  476. package/src/deepscientist/runners/__init__.py +18 -0
  477. package/src/deepscientist/runners/base.py +89 -1
  478. package/src/deepscientist/runners/builtins.py +13 -1
  479. package/src/deepscientist/runners/claude.py +391 -0
  480. package/src/deepscientist/runners/codex.py +421 -21
  481. package/src/deepscientist/runners/codex_telemetry.py +127 -0
  482. package/src/deepscientist/runners/kimi.py +334 -0
  483. package/src/deepscientist/runners/metadata.py +68 -0
  484. package/src/deepscientist/runners/opencode.py +414 -0
  485. package/src/deepscientist/runners/runtime_overrides.py +100 -0
  486. package/src/deepscientist/runners/simple_cli.py +538 -0
  487. package/src/deepscientist/runtime_storage.py +303 -0
  488. package/src/deepscientist/shared.py +61 -16
  489. package/src/deepscientist/skills/installer.py +37 -0
  490. package/src/deepscientist/skills/registry.py +2 -0
  491. package/src/deepscientist/tinytex.py +2 -2
  492. package/src/deepscientist/tui.py +10 -3
  493. package/src/prompts/benchstore/system.md +77 -0
  494. package/src/prompts/connectors/qq.md +33 -2
  495. package/src/prompts/connectors/weixin.md +208 -23
  496. package/src/prompts/contracts/admin_ops.md +74 -0
  497. package/src/prompts/contracts/admin_ops_knowledge.md +138 -0
  498. package/src/prompts/contracts/shared_interaction.md +5 -11
  499. package/src/prompts/start_setup/system.md +422 -0
  500. package/src/prompts/system.md +409 -315
  501. package/src/prompts/system_copilot.md +88 -12
  502. package/src/skills/analysis-campaign/SKILL.md +239 -578
  503. package/src/skills/analysis-campaign/references/artifact-flow-examples.md +102 -0
  504. package/src/skills/analysis-campaign/references/boundary-cases.md +98 -0
  505. package/src/skills/analysis-campaign/references/campaign-checklist-template.md +39 -24
  506. package/src/skills/analysis-campaign/references/campaign-design.md +26 -10
  507. package/src/skills/analysis-campaign/references/campaign-plan-template.md +53 -54
  508. package/src/skills/analysis-campaign/references/operational-guidance.md +97 -0
  509. package/src/skills/analysis-campaign/references/writing-facing-slice-examples.md +10 -20
  510. package/src/skills/baseline/SKILL.md +183 -461
  511. package/src/skills/baseline/references/artifact-flow-examples.md +106 -0
  512. package/src/skills/baseline/references/artifact-payload-examples.md +1 -1
  513. package/src/skills/baseline/references/baseline-checklist-template.md +27 -35
  514. package/src/skills/baseline/references/baseline-plan-template.md +37 -76
  515. package/src/skills/baseline/references/boundary-cases.md +86 -0
  516. package/src/skills/baseline/references/codebase-audit-checklist.md +2 -6
  517. package/src/skills/baseline/references/comparability-contract.md +7 -12
  518. package/src/skills/baseline/references/operational-guidance.md +56 -0
  519. package/src/skills/baseline/references/route-selection.md +5 -25
  520. package/src/skills/decision/SKILL.md +113 -306
  521. package/src/skills/decision/references/checkpoint-memory-template.md +47 -0
  522. package/src/skills/decision/references/operational-guidance.md +94 -0
  523. package/src/skills/decision/references/research-route-criteria.md +7 -8
  524. package/src/skills/decision/references/strategic-decision-template.md +13 -26
  525. package/src/skills/experiment/SKILL.md +132 -670
  526. package/src/skills/experiment/references/execution-playbook.md +374 -0
  527. package/src/skills/experiment/references/main-experiment-checklist-template.md +26 -2
  528. package/src/skills/experiment/references/main-experiment-plan-template.md +28 -17
  529. package/src/skills/experiment/references/operational-guidance.md +108 -0
  530. package/src/skills/finalize/SKILL.md +62 -0
  531. package/src/skills/finalize/references/checkpoint-memory-template.md +49 -0
  532. package/src/skills/finalize/references/resume-packet-template.md +7 -0
  533. package/src/skills/idea/SKILL.md +228 -15
  534. package/src/skills/idea/references/controlled-brainstorming-playbook.md +78 -0
  535. package/src/skills/idea/references/current-board-packet-template.md +61 -0
  536. package/src/skills/idea/references/high-value-idea-sourcing.md +119 -0
  537. package/src/skills/idea/references/idea-generation-playbook.md +21 -0
  538. package/src/skills/idea/references/idea-thinking-flow.md +6 -0
  539. package/src/skills/idea/references/literature-survey-template.md +3 -0
  540. package/src/skills/idea/references/objective-contract-template.md +54 -0
  541. package/src/skills/idea/references/outline-seeding-example.md +56 -0
  542. package/src/skills/idea/references/pre-idea-draft-template.md +105 -0
  543. package/src/skills/idea/references/related-work-playbook.md +75 -2
  544. package/src/skills/idea/references/research-history-playbook.md +114 -0
  545. package/src/skills/idea/references/selection-gate.md +58 -6
  546. package/src/skills/intake-audit/SKILL.md +43 -2
  547. package/src/skills/intake-audit/references/state-audit-template.md +10 -0
  548. package/src/skills/nature-data/SKILL.md +128 -0
  549. package/src/skills/nature-data/UPSTREAM_LICENSE.txt +21 -0
  550. package/src/skills/nature-data/agents/openai.yaml +4 -0
  551. package/src/skills/nature-data/references/chinese-author-alignment.md +84 -0
  552. package/src/skills/nature-data/references/fair-metadata-checklist.md +105 -0
  553. package/src/skills/nature-data/references/policy-principles.md +103 -0
  554. package/src/skills/nature-data/references/repository-and-identifiers.md +96 -0
  555. package/src/skills/nature-data/references/source-basis.md +54 -0
  556. package/src/skills/nature-data/references/statement-patterns.md +153 -0
  557. package/src/skills/nature-figure/SKILL.md +197 -0
  558. package/src/skills/nature-figure/UPSTREAM_LICENSE.txt +21 -0
  559. package/src/skills/nature-figure/agents/openai.yaml +4 -0
  560. package/src/skills/nature-figure/evals/evals.json +37 -0
  561. package/src/skills/nature-figure/references/api.md +428 -0
  562. package/src/skills/nature-figure/references/backend-selection.md +100 -0
  563. package/src/skills/nature-figure/references/chart-types.md +281 -0
  564. package/src/skills/nature-figure/references/common-patterns.md +349 -0
  565. package/src/skills/nature-figure/references/design-theory.md +436 -0
  566. package/src/skills/nature-figure/references/figure-contract.md +93 -0
  567. package/src/skills/nature-figure/references/nature-2026-observations.md +112 -0
  568. package/src/skills/nature-figure/references/qa-contract.md +119 -0
  569. package/src/skills/nature-figure/references/r-template-index.md +66 -0
  570. package/src/skills/nature-figure/references/r-workflow.md +161 -0
  571. package/src/skills/nature-figure/references/tutorials.md +250 -0
  572. package/src/skills/nature-paper2ppt/SKILL.md +507 -0
  573. package/src/skills/nature-paper2ppt/UPSTREAM_LICENSE.txt +21 -0
  574. package/src/skills/nature-paper2ppt/agents/openai.yaml +4 -0
  575. package/src/skills/nature-polishing/SKILL.md +385 -0
  576. package/src/skills/nature-polishing/UPSTREAM_LICENSE.txt +21 -0
  577. package/src/skills/nature-polishing/agents/openai.yaml +4 -0
  578. package/src/skills/nature-polishing/references/phrasebank-playbook.md +162 -0
  579. package/src/skills/nature-polishing/references/section-moves.md +240 -0
  580. package/src/skills/nature-polishing/references/style-guardrails.md +94 -0
  581. package/src/skills/nature-polishing/references/writing-strategy.md +148 -0
  582. package/src/skills/optimize/SKILL.md +177 -1568
  583. package/src/skills/optimize/references/brief-shaping-playbook.md +95 -0
  584. package/src/skills/optimize/references/candidate-board-template.md +13 -0
  585. package/src/skills/optimize/references/candidate-ranking-template.md +51 -0
  586. package/src/skills/optimize/references/codegen-route-playbook.md +50 -0
  587. package/src/skills/optimize/references/debug-response-template.md +29 -0
  588. package/src/skills/optimize/references/frontier-review-template.md +32 -0
  589. package/src/skills/optimize/references/fusion-playbook.md +36 -0
  590. package/src/skills/optimize/references/method-brief-template.md +73 -0
  591. package/src/skills/optimize/references/operational-guidance.md +621 -0
  592. package/src/skills/optimize/references/optimization-memory-template.md +30 -0
  593. package/src/skills/optimize/references/optimize-checklist-template.md +18 -0
  594. package/src/skills/optimize/references/plateau-response-playbook.md +28 -0
  595. package/src/skills/optimize/references/prompt-patterns.md +49 -0
  596. package/src/skills/paper-outline/SKILL.md +227 -0
  597. package/src/skills/paper-outline/references/outline-patterns.md +87 -0
  598. package/src/skills/paper-plot/SKILL.md +79 -0
  599. package/src/skills/paper-plot/agents/openai.yaml +4 -0
  600. package/src/skills/paper-plot/references/bar_grouped_hatch.md +96 -0
  601. package/src/skills/paper-plot/references/bar_paired_delta.md +72 -0
  602. package/src/skills/paper-plot/references/line_confidence_band.md +75 -0
  603. package/src/skills/paper-plot/references/line_loss_with_inset.md +65 -0
  604. package/src/skills/paper-plot/references/line_training_curve.md +44 -0
  605. package/src/skills/paper-plot/references/radar_dual_series.md +59 -0
  606. package/src/skills/paper-plot/references/scatter_broken_axis.md +59 -0
  607. package/src/skills/paper-plot/references/scatter_tsne_cluster.md +72 -0
  608. package/src/skills/paper-plot/scripts/bar_memevolve.py +109 -0
  609. package/src/skills/paper-plot/scripts/bar_spice.py +166 -0
  610. package/src/skills/paper-plot/scripts/line_aime.py +94 -0
  611. package/src/skills/paper-plot/scripts/line_loss_inset.py +157 -0
  612. package/src/skills/paper-plot/scripts/line_selfdistill.py +168 -0
  613. package/src/skills/paper-plot/scripts/radar_dora.py +151 -0
  614. package/src/skills/paper-plot/scripts/scatter_break.py +169 -0
  615. package/src/skills/paper-plot/scripts/scatter_tsne.py +133 -0
  616. package/src/skills/rebuttal/SKILL.md +9 -0
  617. package/src/skills/references/tool-usage-by-stage.md +438 -0
  618. package/src/skills/review/SKILL.md +105 -7
  619. package/src/skills/science/PROVENANCE.md +44 -0
  620. package/src/skills/science/SKILL.md +137 -0
  621. package/src/skills/science/references/artifact-science-tool.md +110 -0
  622. package/src/skills/science/references/claim-type-discipline.md +56 -0
  623. package/src/skills/science/references/domain-index.md +422 -0
  624. package/src/skills/science/references/hpc-via-bash-exec.md +42 -0
  625. package/src/skills/science/references/package-check-playbook.md +64 -0
  626. package/src/skills/science/references/package-index.min.json +3616 -0
  627. package/src/skills/science/references/packages/abinit.md +80 -0
  628. package/src/skills/science/references/packages/acts.md +73 -0
  629. package/src/skills/science/references/packages/aiida-core.md +80 -0
  630. package/src/skills/science/references/packages/alamode.md +80 -0
  631. package/src/skills/science/references/packages/amuse.md +88 -0
  632. package/src/skills/science/references/packages/anndata.md +88 -0
  633. package/src/skills/science/references/packages/arbor.md +80 -0
  634. package/src/skills/science/references/packages/arc.md +73 -0
  635. package/src/skills/science/references/packages/astropy.md +88 -0
  636. package/src/skills/science/references/packages/astroquery.md +88 -0
  637. package/src/skills/science/references/packages/atomate2.md +80 -0
  638. package/src/skills/science/references/packages/atomsmltr.md +73 -0
  639. package/src/skills/science/references/packages/awkward.md +73 -0
  640. package/src/skills/science/references/packages/batman.md +88 -0
  641. package/src/skills/science/references/packages/biopython.md +88 -0
  642. package/src/skills/science/references/packages/bloqade.md +73 -0
  643. package/src/skills/science/references/packages/brian2.md +73 -0
  644. package/src/skills/science/references/packages/bullet3.md +73 -0
  645. package/src/skills/science/references/packages/calculix.md +80 -0
  646. package/src/skills/science/references/packages/cantera.md +73 -0
  647. package/src/skills/science/references/packages/cavity-md-ipi.md +80 -0
  648. package/src/skills/science/references/packages/ccdproc.md +88 -0
  649. package/src/skills/science/references/packages/celerite2.md +88 -0
  650. package/src/skills/science/references/packages/cellrank.md +73 -0
  651. package/src/skills/science/references/packages/cesm.md +80 -0
  652. package/src/skills/science/references/packages/chemicals.md +73 -0
  653. package/src/skills/science/references/packages/chempy.md +73 -0
  654. package/src/skills/science/references/packages/cirq.md +73 -0
  655. package/src/skills/science/references/packages/coffea.md +73 -0
  656. package/src/skills/science/references/packages/cp2k.md +88 -0
  657. package/src/skills/science/references/packages/custodian.md +80 -0
  658. package/src/skills/science/references/packages/dart.md +73 -0
  659. package/src/skills/science/references/packages/datamol.md +88 -0
  660. package/src/skills/science/references/packages/dd4hep.md +73 -0
  661. package/src/skills/science/references/packages/dealii.md +80 -0
  662. package/src/skills/science/references/packages/deepchem.md +88 -0
  663. package/src/skills/science/references/packages/delphes.md +73 -0
  664. package/src/skills/science/references/packages/devito.md +80 -0
  665. package/src/skills/science/references/packages/dftb.md +88 -0
  666. package/src/skills/science/references/packages/dftd4.md +88 -0
  667. package/src/skills/science/references/packages/dftk-jl.md +80 -0
  668. package/src/skills/science/references/packages/dolfinx.md +80 -0
  669. package/src/skills/science/references/packages/drake.md +73 -0
  670. package/src/skills/science/references/packages/dumux.md +73 -0
  671. package/src/skills/science/references/packages/elk.md +80 -0
  672. package/src/skills/science/references/packages/elmerfem.md +80 -0
  673. package/src/skills/science/references/packages/enzo-e.md +88 -0
  674. package/src/skills/science/references/packages/espresso.md +80 -0
  675. package/src/skills/science/references/packages/exoplanet.md +88 -0
  676. package/src/skills/science/references/packages/fairroot.md +73 -0
  677. package/src/skills/science/references/packages/fbpic.md +80 -0
  678. package/src/skills/science/references/packages/fdtdbath-meep.md +80 -0
  679. package/src/skills/science/references/packages/geant4.md +73 -0
  680. package/src/skills/science/references/packages/geosx.md +80 -0
  681. package/src/skills/science/references/packages/gprmax.md +80 -0
  682. package/src/skills/science/references/packages/gromacs.md +80 -0
  683. package/src/skills/science/references/packages/gwaslab.md +73 -0
  684. package/src/skills/science/references/packages/gz-sim.md +73 -0
  685. package/src/skills/science/references/packages/hail.md +88 -0
  686. package/src/skills/science/references/packages/hiphive.md +80 -0
  687. package/src/skills/science/references/packages/hoomd-blue.md +80 -0
  688. package/src/skills/science/references/packages/itensor.md +73 -0
  689. package/src/skills/science/references/packages/itensors-jl.md +73 -0
  690. package/src/skills/science/references/packages/jdftx.md +73 -0
  691. package/src/skills/science/references/packages/jobflow.md +80 -0
  692. package/src/skills/science/references/packages/kadanoffbaym-jl.md +73 -0
  693. package/src/skills/science/references/packages/kite.md +80 -0
  694. package/src/skills/science/references/packages/kratos.md +80 -0
  695. package/src/skills/science/references/packages/kwant.md +73 -0
  696. package/src/skills/science/references/packages/lammps.md +80 -0
  697. package/src/skills/science/references/packages/lightkurve.md +88 -0
  698. package/src/skills/science/references/packages/limix.md +73 -0
  699. package/src/skills/science/references/packages/maxwelllink.md +80 -0
  700. package/src/skills/science/references/packages/mcdc.md +73 -0
  701. package/src/skills/science/references/packages/meep.md +80 -0
  702. package/src/skills/science/references/packages/mfem.md +80 -0
  703. package/src/skills/science/references/packages/mitgcm.md +73 -0
  704. package/src/skills/science/references/packages/modflow6.md +73 -0
  705. package/src/skills/science/references/packages/molecool.md +73 -0
  706. package/src/skills/science/references/packages/mom6.md +73 -0
  707. package/src/skills/science/references/packages/moose.md +80 -0
  708. package/src/skills/science/references/packages/mpas-model.md +73 -0
  709. package/src/skills/science/references/packages/mujoco.md +73 -0
  710. package/src/skills/science/references/packages/mumax3.md +73 -0
  711. package/src/skills/science/references/packages/nekrs.md +80 -0
  712. package/src/skills/science/references/packages/nessi.md +73 -0
  713. package/src/skills/science/references/packages/nest-simulator.md +73 -0
  714. package/src/skills/science/references/packages/netket.md +73 -0
  715. package/src/skills/science/references/packages/neuron.md +73 -0
  716. package/src/skills/science/references/packages/nextflow.md +88 -0
  717. package/src/skills/science/references/packages/nwchem.md +88 -0
  718. package/src/skills/science/references/packages/openbabel.md +88 -0
  719. package/src/skills/science/references/packages/openems.md +80 -0
  720. package/src/skills/science/references/packages/openff-toolkit.md +88 -0
  721. package/src/skills/science/references/packages/openfoam-dev.md +80 -0
  722. package/src/skills/science/references/packages/openmc.md +73 -0
  723. package/src/skills/science/references/packages/openmm.md +80 -0
  724. package/src/skills/science/references/packages/openmoc.md +73 -0
  725. package/src/skills/science/references/packages/openmx.md +80 -0
  726. package/src/skills/science/references/packages/opensees.md +80 -0
  727. package/src/skills/science/references/packages/opensn.md +80 -0
  728. package/src/skills/science/references/packages/opm-simulators.md +73 -0
  729. package/src/skills/science/references/packages/oqupy.md +73 -0
  730. package/src/skills/science/references/packages/packmol.md +80 -0
  731. package/src/skills/science/references/packages/palabos.md +80 -0
  732. package/src/skills/science/references/packages/parflow.md +80 -0
  733. package/src/skills/science/references/packages/pennylane.md +88 -0
  734. package/src/skills/science/references/packages/perceval.md +73 -0
  735. package/src/skills/science/references/packages/phono3py.md +73 -0
  736. package/src/skills/science/references/packages/phonopy.md +73 -0
  737. package/src/skills/science/references/packages/photutils.md +88 -0
  738. package/src/skills/science/references/packages/picongpu.md +80 -0
  739. package/src/skills/science/references/packages/plink-ng.md +88 -0
  740. package/src/skills/science/references/packages/precice.md +73 -0
  741. package/src/skills/science/references/packages/psc.md +80 -0
  742. package/src/skills/science/references/packages/psi4.md +88 -0
  743. package/src/skills/science/references/packages/pybinding.md +73 -0
  744. package/src/skills/science/references/packages/pyfr.md +80 -0
  745. package/src/skills/science/references/packages/pyhf.md +73 -0
  746. package/src/skills/science/references/packages/pyiron_base.md +80 -0
  747. package/src/skills/science/references/packages/pylcp.md +73 -0
  748. package/src/skills/science/references/packages/pylith.md +80 -0
  749. package/src/skills/science/references/packages/pynbody.md +88 -0
  750. package/src/skills/science/references/packages/pysam.md +88 -0
  751. package/src/skills/science/references/packages/pyscf.md +88 -0
  752. package/src/skills/science/references/packages/q-e.md +73 -0
  753. package/src/skills/science/references/packages/qibo.md +73 -0
  754. package/src/skills/science/references/packages/qiskit.md +73 -0
  755. package/src/skills/science/references/packages/quantica-jl.md +73 -0
  756. package/src/skills/science/references/packages/quantumoptics-jl.md +73 -0
  757. package/src/skills/science/references/packages/quimb.md +73 -0
  758. package/src/skills/science/references/packages/qulacs.md +73 -0
  759. package/src/skills/science/references/packages/qutip.md +73 -0
  760. package/src/skills/science/references/packages/rdkit.md +88 -0
  761. package/src/skills/science/references/packages/rmg-py.md +73 -0
  762. package/src/skills/science/references/packages/root.md +73 -0
  763. package/src/skills/science/references/packages/scanpy.md +88 -0
  764. package/src/skills/science/references/packages/scikit-allel.md +88 -0
  765. package/src/skills/science/references/packages/scikit-bio.md +88 -0
  766. package/src/skills/science/references/packages/scqubits.md +73 -0
  767. package/src/skills/science/references/packages/scuff-em.md +80 -0
  768. package/src/skills/science/references/packages/scvi-tools.md +73 -0
  769. package/src/skills/science/references/packages/seissol.md +73 -0
  770. package/src/skills/science/references/packages/sfepy.md +80 -0
  771. package/src/skills/science/references/packages/sisl.md +73 -0
  772. package/src/skills/science/references/packages/smilei.md +80 -0
  773. package/src/skills/science/references/packages/snakemake.md +88 -0
  774. package/src/skills/science/references/packages/specfem3d-globe.md +80 -0
  775. package/src/skills/science/references/packages/specutils.md +88 -0
  776. package/src/skills/science/references/packages/spglib.md +80 -0
  777. package/src/skills/science/references/packages/squidpy.md +88 -0
  778. package/src/skills/science/references/packages/starry.md +88 -0
  779. package/src/skills/science/references/packages/strawberryfields.md +73 -0
  780. package/src/skills/science/references/packages/su2.md +80 -0
  781. package/src/skills/science/references/packages/sunny-jl.md +73 -0
  782. package/src/skills/science/references/packages/sw4.md +73 -0
  783. package/src/skills/science/references/packages/swift.md +88 -0
  784. package/src/skills/science/references/packages/tdnegf.md +73 -0
  785. package/src/skills/science/references/packages/tenpy.md +73 -0
  786. package/src/skills/science/references/packages/thermo.md +73 -0
  787. package/src/skills/science/references/packages/tkwant.md +73 -0
  788. package/src/skills/science/references/packages/tvb-root.md +73 -0
  789. package/src/skills/science/references/packages/uproot5.md +73 -0
  790. package/src/skills/science/references/packages/vampire.md +80 -0
  791. package/src/skills/science/references/packages/wannier_tools.md +73 -0
  792. package/src/skills/science/references/packages/warpx.md +80 -0
  793. package/src/skills/science/references/packages/wrf.md +73 -0
  794. package/src/skills/science/references/packages/xtb.md +88 -0
  795. package/src/skills/science/references/packages/yt.md +73 -0
  796. package/src/skills/science/references/science-task-brief-template.md +71 -0
  797. package/src/skills/scout/SKILL.md +83 -425
  798. package/src/skills/scout/references/literature-scout-template.md +5 -24
  799. package/src/skills/scout/references/operational-guidance.md +191 -0
  800. package/src/skills/scout/references/paper-triage-playbook.md +11 -35
  801. package/src/skills/write/SKILL.md +744 -1246
  802. package/src/skills/write/references/experiments_analysis_patterns.md +129 -0
  803. package/src/skills/write/references/oral_package_patterns.md +252 -0
  804. package/src/skills/write/references/oral_writing_principles.md +291 -0
  805. package/src/skills/write/references/section_rewrite_checklist.md +234 -0
  806. package/src/tui/dist/app/AppContainer.js +1314 -27
  807. package/src/tui/dist/components/Composer.js +26 -1
  808. package/src/tui/dist/components/ConfigScreen.js +2 -1
  809. package/src/tui/dist/components/InputPrompt.js +25 -9
  810. package/src/tui/dist/components/MainContent.js +18 -3
  811. package/src/tui/dist/components/QuestScreen.js +3 -2
  812. package/src/tui/dist/components/UtilityScreen.js +37 -0
  813. package/src/tui/dist/hooks/useSafeInput.js +10 -0
  814. package/src/tui/dist/index.js +13 -1
  815. package/src/tui/dist/layouts/DefaultAppLayout.js +11 -8
  816. package/src/tui/dist/lib/api.js +89 -1
  817. package/src/tui/package.json +1 -1
  818. package/src/ui/dist/assets/{AnalysisPlugin-BCKAfjba.js → AnalysisPlugin-CA94NGmI.js} +1 -1
  819. package/src/ui/dist/assets/CliPlugin-DHBzphZU.js +79 -0
  820. package/src/ui/dist/assets/CodeEditorPlugin-BOFwD2rn.js +2 -0
  821. package/src/ui/dist/assets/{CodeViewerPlugin-CbaFRrUU.js → CodeViewerPlugin-CqDpgjik.js} +4 -4
  822. package/src/ui/dist/assets/{DocViewerPlugin-DAjLVeQD.js → DocViewerPlugin-UDBgt8-4.js} +3 -3
  823. package/src/ui/dist/assets/GitCommitViewerPlugin-BmHtZ0bZ.js +6 -0
  824. package/src/ui/dist/assets/{GitDiffViewerPlugin-CQACjoAA.js → GitDiffViewerPlugin-CAxjNorQ.js} +2 -2
  825. package/src/ui/dist/assets/{GitSnapshotViewer-0r4nLPke.js → GitSnapshotViewer-CweA6VON.js} +2 -2
  826. package/src/ui/dist/assets/{ImageViewerPlugin-nBOmI2v_.js → ImageViewerPlugin-C8wHGvGN.js} +5 -5
  827. package/src/ui/dist/assets/LabPlugin-COyyLUol.js +32 -0
  828. package/src/ui/dist/assets/{LatexPlugin-ZwtV8pIp.js → LatexPlugin-BQjAaA5J.js} +4 -4
  829. package/src/ui/dist/assets/{MarkdownViewerPlugin-DKqVfKyW.js → MarkdownViewerPlugin-Dy1NE2dI.js} +3 -3
  830. package/src/ui/dist/assets/{MarketplacePlugin-BwxStZ9D.js → MarketplacePlugin-DMIZtEJ2.js} +2 -2
  831. package/src/ui/dist/assets/NotebookEditor-CFHMq_Qt.js +91 -0
  832. package/src/ui/dist/assets/{NotebookEditor-DB9N_T9q.js → NotebookEditor-WFyd8Ybt.js} +3 -3
  833. package/src/ui/dist/assets/{PdfLoader-eWBONbQP.js → PdfLoader-CLE5u5TS.js} +3 -3
  834. package/src/ui/dist/assets/{PdfMarkdownPlugin-D22YOZL3.js → PdfMarkdownPlugin-_iNK_H83.js} +1 -1
  835. package/src/ui/dist/assets/PdfViewerPlugin-DgWsbInT.js +22 -0
  836. package/src/ui/dist/assets/SearchPlugin-DrZmn5iw.js +11 -0
  837. package/src/ui/dist/assets/{TextViewerPlugin-C5xqeeUH.js → TextViewerPlugin-D1-T3aC7.js} +4 -4
  838. package/src/ui/dist/assets/branding/runner-claude.svg +107 -0
  839. package/src/ui/dist/assets/branding/runner-codex.svg +10 -0
  840. package/src/ui/dist/assets/branding/runner-kimi.svg +14 -0
  841. package/src/ui/dist/assets/branding/runner-opencode.svg +7 -0
  842. package/src/ui/dist/assets/cli-store-CoZ-x5Ip.js +1 -0
  843. package/src/ui/dist/assets/{code-WlFHE7z_.js → code-DbsmSd3Y.js} +1 -1
  844. package/src/ui/dist/assets/file-diff-panel-DsvyRz47.js +1 -0
  845. package/src/ui/dist/assets/{wrap-text-BC-Hltpd.js → file-jump-queue-DeQBikaw.js} +3 -3
  846. package/src/ui/dist/assets/{file-socket-CfQPKQKj.js → file-socket-DA5XIx88.js} +1 -1
  847. package/src/ui/dist/assets/fonts/ds-fonts.css +50 -4
  848. package/src/ui/dist/assets/images/deepxiv/register-guide.png +0 -0
  849. package/src/ui/dist/assets/index-39vY9LmZ.js +1 -0
  850. package/src/ui/dist/assets/{index-CwNu1aH4.js → index-BsO46tJA.js} +1 -1
  851. package/src/ui/dist/assets/index-CHzJ2xtB.js +3530 -0
  852. package/src/ui/dist/assets/index-DH-zxoZ3.css +33 -0
  853. package/src/ui/dist/assets/{plugin-notebook-HbW2K-1c.js → plugin-notebook-JRhysCqj.js} +2 -2
  854. package/src/ui/dist/assets/{project-sync-C9IdzdZW.js → project-sync-DPmWKmKD.js} +1 -1
  855. package/src/ui/dist/assets/{zoom-out-E_gaeAxL.js → zoom-out-DAukFWen.js} +3 -3
  856. package/src/ui/dist/index.html +3 -3
  857. package/src/skills/analysis-campaign/references/artifact-orchestration.md +0 -58
  858. package/src/skills/baseline/references/memory-playbook.md +0 -40
  859. package/src/skills/baseline/references/publishable-baseline-package.md +0 -30
  860. package/src/skills/write/references/outline-evidence-contract-example.md +0 -107
  861. package/src/skills/write/references/paper-experiment-matrix-template.md +0 -131
  862. package/src/skills/write/references/paper-section-playbook.md +0 -64
  863. package/src/skills/write/references/reviewer-first-writing.md +0 -64
  864. package/src/skills/write/references/revision-checklist.md +0 -70
  865. package/src/skills/write/references/section-contracts.md +0 -82
  866. package/src/skills/write/references/sentence-level-proofing.md +0 -49
  867. package/src/ui/dist/assets/AiManusChatView-Bv-Z8YpU.js +0 -204
  868. package/src/ui/dist/assets/CliPlugin-BCKcpc35.js +0 -109
  869. package/src/ui/dist/assets/CodeEditorPlugin-DbOfSJ8K.js +0 -2
  870. package/src/ui/dist/assets/GitCommitViewerPlugin-CIUqbUDO.js +0 -1
  871. package/src/ui/dist/assets/LabCopilotPanel-BHxOxF4z.js +0 -14
  872. package/src/ui/dist/assets/LabPlugin-BKoZGs95.js +0 -22
  873. package/src/ui/dist/assets/NotebookEditor-BEQhaQbt.js +0 -81
  874. package/src/ui/dist/assets/PdfViewerPlugin-c-RK9DLM.js +0 -17
  875. package/src/ui/dist/assets/SearchPlugin-CxF9ytAx.js +0 -16
  876. package/src/ui/dist/assets/VNCViewer-BoLGLnHz.js +0 -11
  877. package/src/ui/dist/assets/bot-DREQOxzP.js +0 -6
  878. package/src/ui/dist/assets/chevron-up-C9Qpx4DE.js +0 -6
  879. package/src/ui/dist/assets/file-content-BZMz3RYp.js +0 -1
  880. package/src/ui/dist/assets/file-diff-panel-CQhw0jS2.js +0 -1
  881. package/src/ui/dist/assets/file-jump-queue-DA-SdG__.js +0 -1
  882. package/src/ui/dist/assets/git-commit-horizontal-DxZ8DCZh.js +0 -6
  883. package/src/ui/dist/assets/image-Bgl4VIyx.js +0 -6
  884. package/src/ui/dist/assets/index-BpV6lusQ.css +0 -33
  885. package/src/ui/dist/assets/index-CBNVuWcP.js +0 -2496
  886. package/src/ui/dist/assets/index-DrUnlf6K.js +0 -1
  887. package/src/ui/dist/assets/index-NW-h8VzN.js +0 -1
  888. package/src/ui/dist/assets/pdf-effect-queue-J8OnM0jE.js +0 -6
  889. package/src/ui/dist/assets/popover-CLc0pPP8.js +0 -1
  890. package/src/ui/dist/assets/select-Cs2PmzwL.js +0 -11
  891. package/src/ui/dist/assets/sigma-ClKcHAXm.js +0 -6
  892. package/src/ui/dist/assets/trash-DwpbFr3w.js +0 -11
  893. package/src/ui/dist/assets/useCliAccess-NQ8m0Let.js +0 -1
  894. package/src/ui/dist/assets/useFileDiffOverlay-FuhcnKiw.js +0 -1
@@ -7,427 +7,108 @@ skill_role: stage
  # Optimize
 
  Use this skill for algorithm-first quests where the goal is the strongest justified optimization result rather than paper packaging.
+ The goal is to move the frontier by one justified step at a time, not to generate a large pile of low-information candidates.
 
- This skill is the lightweight optimization control layer for DeepScientist.
- It does not replace the normal quest runtime. It tells you how to use the existing DeepScientist artifact, memory, bash_exec, Git, and worktree mechanisms as an optimization system.
+ ## Match signals
 
- ## Interaction discipline
-
- - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
- - Ordinary candidate creation, smoke checks, and route updates should stay concise.
- - Use richer milestone updates only when a candidate is promoted, a strong run finishes, the frontier shifts materially, or a fusion/debug route becomes the new main path.
- - When the user asks for the current optimization state, answer from the frontier and durable artifacts rather than from chat memory.
- - Hard execution rule: every terminal command in this stage must go through `bash_exec`; do not use any other terminal path for smoke checks, quick validations, long runs, Git, Python, package-manager, or file-inspection commands.
-
- ## Stage purpose
-
- The optimize stage should do four things:
-
- 1. turn loose ideas into candidate briefs
- 2. rank and promote only the strongest briefs into durable lines
- 3. manage candidate attempts within a durable line
- 4. choose when to explore, exploit, fuse, debug, or stop
-
- This skill is especially appropriate when `startup_contract.need_research_paper = false`.
-
- Treat `optimize` as one stable stage skill with six internal submodes:
-
- - `brief`
- - `rank`
- - `seed`
- - `loop`
- - `fusion`
- - `debug`
-
- Do not treat these as separate public skills.
- Treat them as internal execution modes inside one optimize workflow.
-
- InternAgent maps most naturally onto the `brief` and `rank` side of this stage.
- MLEvolve maps most naturally onto the `seed`, `loop`, `fusion`, and `debug` side of this stage.
- Do not collapse those two layers into one vague "optimize more" loop.
-
- ## Required working files
-
- Before broad optimization search or candidate management becomes substantial, maintain these quest-visible control files:
-
- - `OPTIMIZE_CHECKLIST.md`
- - `CANDIDATE_BOARD.md`
-
- Use:
-
- - the integrated `optimize checklist template` appendix section
- - the integrated `candidate board template` appendix section
-
- `OPTIMIZE_CHECKLIST.md` is the execution control surface.
- It should track:
-
- - current frontier mode
- - current optimize submode
- - candidate brief count
- - promoted line count
- - current smoke queue
- - current full-eval queue
- - stagnation / fusion checks
- - next concrete action
-
- `CANDIDATE_BOARD.md` is the compact candidate ledger.
- It should track:
-
- - candidate id
- - candidate type: brief or implementation attempt
- - parent line or parent candidate
- - strategy: explore / exploit / fusion / debug
- - status
- - expected gain
- - observed result
- - promote / archive recommendation
-
- ## Required MCP-driven workflow
-
- Treat this as the concrete optimize workflow. Do not skip these steps just because the quest is algorithm-first.
-
- ### 1. Recover the optimization state first
-
- At the start of each meaningful optimize pass, use this order unless a stronger local reason exists:
-
- 1. `artifact.get_optimization_frontier(...)`
- 2. `memory.list_recent(scope='quest', limit=5)`
- 3. `memory.search(...)`
- 4. `artifact.get_quest_state(detail='summary')`
- 5. `artifact.read_quest_documents(...)` when exact durable wording matters
-
- Do not create new candidates before the frontier, recent optimization lessons, and current runtime refs are checked.
- If the frontier is missing or obviously stale, recover that state before proposing more work.
-
- ### 2. Shape candidate briefs before branch promotion
-
- When the next direction is still fuzzy, do not jump straight into code or branch creation.
- First turn the direction into a compact candidate brief.
-
- The brief-shaping sequence is:
-
- 1. clarify the bottleneck, constraints, and comparability boundary
- 2. identify the incumbent or baseline that this brief must beat or complement
- 3. generate a small differentiated slate, usually `2-3` serious approaches
- 4. compare them on one shared surface
- 5. recommend exactly one lead brief
- 6. self-check the recommended brief before submission
-
- Every serious brief should answer:
-
- - bottleneck
- - why_current_line_is_limited
- - mechanism
- - why_now
- - keep_unchanged
- - expected_gain
- - implementation_surface
- - main_risks
-
- The durable call for this step is usually:
-
- - `artifact.submit_idea(mode='create', submission_mode='candidate', ...)`
-
- Use `idea` when the mechanism family itself is still unresolved.
- Use `optimize` when the family is already chosen and the work is now branchless brief shaping, ranking, or within-line search.
-
- ### 3. Rank candidate briefs on one explicit surface
-
- Before promoting a line, compare the serious briefs on one shared ranking surface.
- At minimum evaluate:
-
- - expected information gain
- - feasibility in current repo
- - comparability against baseline
- - implementation surface
- - novelty or distinctiveness
145
- - family diversity
146
- - change-layer diversity
147
- - incumbent-improvement potential
148
- - failure risk
149
-
150
- Then state:
151
-
152
- - winner justification
153
- - non-winner defer / reject reasons
154
- - promotion cap: how many lines should actually be promoted now
155
-
156
- Do not promote every plausible brief.
157
- Default rule: promote only `1-3` candidate briefs, and usually fewer.
158
-
159
- The durable call for this step is one of:
160
-
161
- - `artifact.submit_idea(mode='create', submission_mode='line', source_candidate_id=..., ...)`
162
- - `artifact.record(payload={'kind': 'decision', 'action': 'branch'|'continue'|'stop', ...})`
163
-
164
- ### 4. Hand off promoted lines into experiment cleanly
165
-
166
- Once a brief is promoted, the next main work belongs to `experiment`, not to vague optimize chatter.
167
- Before substantial implementation or compute:
168
-
169
- - activate or confirm the intended durable line
170
- - update `OPTIMIZE_CHECKLIST.md`
171
- - update `CANDIDATE_BOARD.md`
172
- - create or revise `PLAN.md`
173
- - create or revise `CHECKLIST.md`
174
- - define the smoke queue and full-eval queue explicitly
175
-
176
- Then hand off into `experiment` for:
177
-
178
- - one clean implementation pass
179
- - one bounded smoke or pilot run
180
- - one real measured main run
181
-
182
- Do not keep reshaping the method after the run contract is already concrete.
183
-
184
- ### 5. Record every meaningful result durably
185
-
186
- Use these artifact forms consistently:
187
-
188
- - candidate brief:
189
- - `artifact.submit_idea(..., submission_mode='candidate')`
190
- - durable optimization line:
191
- - `artifact.submit_idea(..., submission_mode='line')`
192
- - implementation-level candidate attempt inside one line:
193
- - `artifact.record(payload={'kind': 'report', 'report_type': 'optimization_candidate', ...})`
194
- - real measured main result:
195
- - `artifact.record_main_experiment(...)`
196
- - route change after the result:
197
- - `artifact.record(payload={'kind': 'decision', 'action': 'iterate'|'branch'|'continue'|'stop', ...})`
198
-
199
- Do not treat chat summaries as substitutes for these durable records.
200
-
201
- ### 6. Manage process lifecycle explicitly
202
-
203
- Optimize uses the same long-run process discipline as `experiment`.
204
-
205
- - Use `bash_exec` for smoke checks, quick validations, and long runs.
206
- - Before launching a new run, inspect current managed sessions first.
207
- - Do not start a duplicate process for the same purpose if a valid live session already exists.
208
- - Use bounded smoke before long runs unless direct quick validation is already cheap and equally informative.
209
- - Use `bash_exec(mode='detach', ...)` for long runs and monitor with `list/read/await`.
210
- - Read logs before retrying a failed or suspicious run; do not relaunch blindly.
211
- - Kill only on explicit invalidity, supersession, or checked no-progress conditions.
212
- - After pause, resume, or daemon recovery, recover session state before spawning new runs.
14
+ Use `optimize` when:
213
15
 
214
- ### 7. Route from evidence, not from momentum
215
-
216
- After every real measured result:
217
-
218
- 1. refresh the frontier
219
- 2. compare the result against the incumbent and backlog
220
- 3. choose exactly one dominant next action:
221
- - explore
222
- - exploit
223
- - fusion
224
- - debug
225
- - stop
226
- 4. record that route durably
227
-
228
- Do not treat one candidate creation, one smoke pass, or one detached launch as stage completion.
229
-
230
- ## Integrated templates and playbooks
231
-
232
- Use the following integrated structures directly inside this skill. They replace the old optimize reference files conceptually, even if those files still exist on disk.
233
-
234
- ### Candidate brief template
235
-
236
- Every serious candidate brief should include:
237
-
238
- - title
239
- - bottleneck
240
- - why_current_line_is_limited
241
- - mechanism
242
- - mechanism_family
243
- - change_layer: `Tier1` / `Tier2` / `Tier3`
244
- - source_lens
245
- - keep_unchanged
246
- - expected_gain
247
- - implementation_surface
248
- - risks
249
- - foundation
250
- - promote_now
251
- - next_target
252
-
253
- ### Brief-shaping playbook
254
-
255
- Use this when a candidate direction is still fuzzy and needs to become a ranking-ready brief.
256
-
257
- - clarify the concrete bottleneck before widening
258
- - resolve the evaluation or comparability boundary
259
- - identify the main hard constraint
260
- - identify the current incumbent
261
- - generate only a small differentiated slate
262
- - compare on one shared surface
263
- - recommend exactly one lead brief
264
- - self-check for ambiguity, overlap, and weak justification
265
-
266
- ### Candidate ranking template
267
-
268
- When several briefs compete, produce:
269
-
270
- - candidate set
271
- - ranking scope
272
- - comparison surface
273
- - ranked candidates with score summary, why each ranks there, and promote / hold / reject
274
- - winner justification
275
- - non-winner notes
276
- - promotion cap
277
-
278
- ### Candidate board template
279
-
280
- `CANDIDATE_BOARD.md` should expose at least these columns:
281
-
282
- - candidate id
283
- - level: `brief` or `implementation`
284
- - parent
285
- - strategy
286
- - status
287
- - expected gain
288
- - observed result
289
- - promote / archive recommendation
290
-
291
- ### Optimize checklist template
292
-
293
- `OPTIMIZE_CHECKLIST.md` should track at least:
294
-
295
- - frontier has been refreshed
296
- - primary optimize submode chosen
297
- - current route mode chosen
298
- - recent optimization memory reviewed
299
- - brief slate checked for family diversity
300
- - candidate briefs updated or confirmed
301
- - candidate ranking updated
302
- - promotion decision made
303
- - current implementation pool recorded
304
- - smoke queue defined
305
- - full-eval queue defined
306
- - failures classified
307
- - stagnation check performed
308
- - fusion eligibility checked
309
- - next concrete action written
310
-
311
- ### Frontier review template
312
-
313
- Whenever route choice is unclear, write down:
314
-
315
- - current frontier
316
- - evidence summary
317
- - route choice
318
- - active optimize submode
319
- - immediate next action
320
-
321
- ### Code-generation route playbook
322
-
323
- Choose one route deliberately:
324
-
325
- - brief-only when the direction is still unclear
326
- - stepwise generation for first substantial implementation of a new line
327
- - diff / patch generation for improve / exploit / debug / most fusion work
328
- - full rewrite only when the current implementation is structurally broken or mismatched
329
-
330
- Do not jump to a rewrite merely because one local patch failed.
331
-
332
- ### Debug response template
333
-
334
- When a candidate fails but still looks strategically valuable, record:
335
-
336
- - error
337
- - retrieved memory
338
- - root cause
339
- - minimal fix
340
- - keep unchanged
341
- - next check
342
- - archive threshold
343
-
344
- ### Fusion playbook
16
+ - the quest is algorithm-first
17
+ - the baseline gate is already confirmed or waived
18
+ - the task has at least one plausible optimization direction
19
+ - multiple candidate directions exist and the system should rank them before promotion
20
+ - a durable line exists and the next step is to manage explore, exploit, fusion, debug, or stop
345
21
 
346
- Before opening a fusion candidate, answer:
22
+ Do not use `optimize` when:
347
23
 
348
- - what exactly is being fused?
349
- - why are the source strengths complementary rather than redundant?
350
- - what remains unchanged for comparability?
351
- - what bounded evidence would prove the fusion worthwhile?
352
- - what bounded first validation step should run before any broad rollout?
24
+ - the baseline gate is unresolved
25
+ - the main need is a paper draft, rebuttal, review, or finalize task
26
+ - the quest is still in broad literature scouting with no concrete optimization handle
27
+ - the real blocker is still idea-family selection rather than bounded optimization search inside an accepted family
353
28
 
354
- Do not fuse two weak lines or two same-mechanism lines under different names.
29
+ ## One-sentence summary
355
30
 
356
- ### Optimization memory template
31
+ Recover the current frontier, choose one optimize submode, advance one justified move, then record the new frontier or explicit stop condition.
357
32
 
358
- When writing reusable optimization lessons, capture:
33
+ ## Control workflow
359
34
 
360
- - type
361
- - context
362
- - observation
363
- - why it matters
364
- - retrieval hint
365
- - reuse hint
35
+ 1. Recover the current frontier and recent durable optimization state.
36
+ Read the frontier, recent memory, and current quest state before creating or promoting anything.
37
+ 2. Choose exactly one primary optimize submode for this pass.
38
+ Keep the pass legible: one dominant optimize move, not several unrelated route changes.
39
+ 3. Keep the candidate slate or active pool small and differentiated.
40
+ If the direction is still fuzzy, shape and rank branchless candidate briefs; if a durable line already exists, manage a bounded implementation pool inside that line.
41
+ 4. Promote or execute only bounded candidates with explicit evidence criteria.
42
+ Promote only the strongest briefs into durable lines, and record implementation-level attempts separately from durable line creation.
43
+ 5. Route from evidence to exactly one dominant next action.
44
+ End in `explore`, `exploit`, `fusion`, `debug`, or `stop`, and record that route durably.
366
45
 
367
- ### Plateau response playbook
46
+ ## AVOID / pitfalls
368
47
 
369
- If one line keeps producing non-improving results:
48
+ - Do not treat every patch or micro-attempt as a new durable idea line.
49
+ - Do not create a new Git branch or worktree for every implementation-level candidate.
51
+ - Do not promote every plausible brief.
52
+ - Do not keep widening the frontier once a small serious slate already exists.
53
+ - Do not let one optimize pass mix multiple major route changes.
54
+ - Do not keep selecting the same familiar mechanism family after repeated non-improving results.
55
+ - Do not drift into paper-outline, bundle, or finalize work by default while this stage is active.
56
+ - Do not treat one candidate creation or one smoke pass as stage completion.
370
57
 
371
- 1. state that the line is plateauing
372
- 2. identify the most likely root cause
373
- 3. choose one larger route change:
374
- - widen search
375
- - promote a stronger alternative
376
- - fuse
377
- - debug
378
- - stop
379
- 4. record one explicit non-repeat rule
58
+ ## Constraints
380
59
 
381
- Do not hide plateau under a sequence of tiny "one more tweak" loops.
60
+ - Use these three object levels consistently:
61
+ - candidate brief
62
+ - durable optimization line
63
+ - implementation-level candidate attempt
64
+ - Keep exactly one primary optimize submode active for the current meaningful pass.
65
+ - Keep only one bottom-layer optimize move truly in progress at a time.
66
+ - Before deciding the next route, call `artifact.get_optimization_frontier(...)` when available and use it as the primary optimization-state summary.
67
+ - Candidate briefs should use `artifact.submit_idea(..., submission_mode='candidate')`.
68
+ - Durable lines should use `artifact.submit_idea(..., submission_mode='line')`.
69
+ - Promote a candidate brief into a durable line only when it has enough expected value, differentiation, and execution-path clarity to deserve branch/worktree state.
70
+ - Implementation-level candidate attempts inside one durable line should use `artifact.record(... report_type='optimization_candidate' ...)`.
71
+ - Real measured line results should use `artifact.record_main_experiment(...)`.
72
+ - All terminal work in this stage must go through `bash_exec(...)`.
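The mapping from object level to durable call described in these constraints can be written as a lookup table. The call strings are copied from the constraints; the dictionary keys and the function name `durable_call_for` are hypothetical labels chosen for this sketch.

```python
# Keys are illustrative labels; values are the call forms quoted in the constraints.
DURABLE_CALLS = {
    "candidate_brief": "artifact.submit_idea(..., submission_mode='candidate')",
    "durable_line": "artifact.submit_idea(..., submission_mode='line')",
    "implementation_attempt": "artifact.record(... report_type='optimization_candidate' ...)",
    "main_result": "artifact.record_main_experiment(...)",
}


def durable_call_for(kind: str) -> str:
    """Return the durable call form for one object level, or fail loudly."""
    try:
        return DURABLE_CALLS[kind]
    except KeyError:
        known = ", ".join(sorted(DURABLE_CALLS))
        raise ValueError(f"unknown object level {kind!r}; expected one of: {known}") from None
```

Failing on an unknown level, rather than defaulting to one of the calls, reflects the rule that an implementation-level attempt should never be recorded as a new durable line by accident.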
382
73
 
383
- ### Prompt patterns worth preserving
74
+ ## Validation
384
75
 
385
- For candidate-brief, improve, fusion, and debug prompts, preserve:
76
+ Before `optimize` can end, all applicable checks should be true:
386
77
 
387
- - introduction
388
- - task description
389
- - memory
390
- - previous solution or previous line
391
- - instructions
392
- - explicit response format
78
+ - the frontier was refreshed
79
+ - the active optimize submode is explicit
80
+ - the candidate board and optimize checklist reflect the current state
81
+ - promoted lines are justified and bounded
82
+ - every live candidate has status and next action
83
+ - every major success, failure, promotion, or route change is durably recorded
84
+ - the pass ends with one durable next action or stop condition
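The exit checks above can be sketched as a gate that lists whichever checks are still unmet. The field names and the `not_applicable` parameter are assumptions made for illustration; the package does not define this structure.

```python
from dataclasses import dataclass, fields


@dataclass
class PassState:
    """Hypothetical flags, one per validation check in the list above."""
    frontier_refreshed: bool = False
    submode_explicit: bool = False
    board_and_checklist_current: bool = False
    promoted_lines_justified: bool = False
    live_candidates_have_status_and_next_action: bool = False
    major_events_recorded: bool = False
    ends_with_next_action_or_stop: bool = False


def unmet_checks(state: PassState, not_applicable: frozenset[str] = frozenset()) -> list[str]:
    """Return the names of applicable checks that are still false."""
    return [
        f.name
        for f in fields(state)
        if f.name not in not_applicable and not getattr(state, f.name)
    ]
```

A pass would be allowed to end only when `unmet_checks(...)` returns an empty list. Checks that do not apply, for example promotion checks in a pass that promoted nothing, can be passed in `not_applicable`.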
393
85
 
394
- Preserve these reasoning contracts whenever possible:
86
+ ## Interaction discipline
395
87
 
396
- - WHAT is changing?
397
- - WHY is the current line limited?
398
- - HOW should the change address the limitation?
399
- - KEEP UNCHANGED
400
- - NEXT ACTION
88
+ - Follow the shared interaction contract injected by the system prompt.
89
+ - For ordinary active work, send a concise progress update once work has crossed roughly 6 tool calls and produced a human-meaningful delta, and never go beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
90
+ - Ordinary candidate creation, smoke checks, and route updates should stay concise.
91
+ - Use richer milestone updates only when a candidate is promoted, a strong run finishes, the frontier shifts materially, or a fusion/debug route becomes the new main path.
92
+ - When the user asks for the current optimization state, answer from the frontier and durable artifacts rather than from chat memory.
93
+ - Every terminal command in this stage must go through `bash_exec`; do not use any other terminal path for smoke checks, quick validations, long runs, Git, Python, package-manager, or file-inspection commands.
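The update cadence in this list can be sketched as a single predicate. The text gives the thresholds as approximate ("roughly"), so treating them as exact cutoffs is an assumption of this sketch, as is the function name `update_due`.

```python
SOFT_TOOL_CALLS = 6    # update once there is also a human-meaningful delta
HARD_TOOL_CALLS = 12   # never exceed this without a user-visible update
HARD_MINUTES = 8.0     # nor this much elapsed time


def update_due(tool_calls: int, minutes_elapsed: float, meaningful_delta: bool) -> bool:
    """Decide whether a user-visible progress update should be sent now."""
    # Hard limits apply regardless of whether anything meaningful changed.
    if tool_calls >= HARD_TOOL_CALLS or minutes_elapsed >= HARD_MINUTES:
        return True
    # Soft limit applies only when there is something worth reporting.
    return tool_calls >= SOFT_TOOL_CALLS and meaningful_delta
```

The counters are assumed to reset after each update, so the predicate measures work done since the last user-visible message.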
401
94
 
402
- ## Non-negotiable rules
95
+ ## Working surfaces
403
96
 
404
- - Do not treat every patch or micro-attempt as a new durable idea line.
405
- - Do not create a new Git branch/worktree for every implementation-level candidate.
406
- - Use `artifact.submit_idea(..., submission_mode='candidate')` for candidate briefs that should be ranked before promotion.
407
- - Use `artifact.submit_idea(..., submission_mode='line')` only for directions that deserve a durable optimization line and branch/worktree.
408
- - Use `artifact.record(payload={'kind': 'report', 'report_type': 'optimization_candidate', ...})` for implementation-level candidate attempts inside one durable line.
409
- - Before deciding the next route, call `artifact.get_optimization_frontier(...)` when available and use it as the primary optimization-state summary.
410
- - Keep all major optimization successes and failures durable through artifacts and memory.
411
- - Do not drift into paper-outline, bundle, or finalize work by default while this stage is active.
412
- - Do not convert ranking uncertainty into premature branch creation.
413
- - Do not treat an implementation-level candidate report as a new durable optimization line.
414
- - Do not keep widening the frontier once a small serious slate already exists.
415
- - Do not let one optimize pass mix multiple major route changes.
416
- One pass may inspect several possibilities, but it should finish with one dominant next action.
97
+ Before broad optimization search or candidate management becomes substantial, maintain these quest-visible control files:
417
98
 
418
- ## When to use
99
+ - quest-root `plan.md` as the research map and loop tracker for the whole quest
100
+ - workspace `PLAN.md` as the active optimize-node contract
101
+ - `OPTIMIZE_CHECKLIST.md` as the optimize-specific execution frontier
102
+ - workspace `CHECKLIST.md` as a mirror of the immediate next move when it exists
103
+ - `CANDIDATE_BOARD.md` as the compact candidate ledger
419
104
 
420
- - the quest is algorithm-first
421
- - the baseline gate is already confirmed or waived
422
- - the task has at least one plausible optimization direction
423
- - multiple candidate directions exist and the system should rank them before promotion
424
- - a durable line exists and the next step is to manage explore / exploit / fuse / debug
105
+ Use these templates:
425
106
 
426
- ## Do not use when
107
+ - `references/optimize-checklist-template.md`
108
+ - `references/candidate-board-template.md`
427
109
 
428
- - the baseline gate is unresolved
429
- - the main need is a paper draft, rebuttal, or review task
430
- - the quest is still in broad literature scouting with no concrete optimization handle
110
+ `optimize` is the looped search controller for algorithm-first quests, not a replacement for the quest-level roadmap.
111
+ When a result becomes the new incumbent, plateaus, or stops, update quest-root `plan.md` so the next loop edge is explicit.
431
112
 
432
113
  ## Core object model
433
114
 
@@ -435,230 +116,42 @@ Use these three object levels consistently:
435
116
 
436
117
  1. candidate brief
437
118
  `artifact.submit_idea(mode='create', submission_mode='candidate', ...)`
438
- This records a possible direction or method brief without opening a branch yet.
439
-
119
+ Record a possible direction or method brief without opening a branch yet.
440
120
  2. durable optimization line
441
121
  `artifact.submit_idea(mode='create', submission_mode='line', ...)`
442
- This opens a real branch/worktree and becomes a formal optimization path.
443
-
122
+ Open a real branch or worktree and make it a formal optimization path.
444
123
  3. implementation-level candidate attempt
445
124
  `artifact.record(payload={'kind': 'report', 'report_type': 'optimization_candidate', ...})`
446
- This is a within-line attempt such as one patch, one smoke candidate, one debug candidate, or one fusion candidate.
447
-
448
- ## Recommended workflow
449
-
450
- 1. Read the current frontier and recent durable state.
451
- 2. If only loose candidate directions exist, create or refine candidate briefs first.
452
- 3. Rank the candidate briefs and promote only the best `1-3` into durable lines.
453
- 4. Inside a durable line, generate a small candidate pool, then run bounded smoke checks before full evaluations.
454
- 5. Record each implementation-level attempt durably with status, change plan, and result.
455
- 6. After each real result, decide whether to explore, exploit, fuse, debug, or stop.
456
- 7. Write optimization lessons to memory before leaving the stage.
457
-
458
- At the start of each meaningful optimize pass, update `OPTIMIZE_CHECKLIST.md` before spending significant code or compute.
459
-
460
- ## Mandatory first-call sequence
125
+ Record one within-line attempt such as one patch, one smoke candidate, one debug candidate, or one fusion candidate.
461
126
 
462
- At the start of a meaningful optimize pass, use this order unless a stronger local reason exists:
127
+ Use `artifact.record(payload={'kind': 'decision', ...})` when the frontier route changes, a line is promoted, a line is stopped, or the next optimize submode is selected.
463
128
 
464
- 1. `artifact.get_optimization_frontier(...)`
465
- 2. `memory.search(...)`
466
- 3. `artifact.get_quest_state(detail='summary')`
467
- 4. `artifact.read_quest_documents(...)` when exact durable wording matters
129
+ ## Optimize submodes
468
130
 
469
- Do not start generating new candidates before the frontier and recent optimization lessons are checked.
470
-
471
- ## Stage-start requirement
472
-
473
- Stage-start requirement:
474
-
475
- - run `memory.list_recent(scope='quest', limit=5)`
476
- - run at least one `memory.search(...)`
477
- - read `artifact.get_optimization_frontier(...)`
478
- - update `OPTIMIZE_CHECKLIST.md`
479
-
480
- If the frontier is missing or obviously stale, recover that state before proposing more work.
131
+ Treat `optimize` as one stable stage skill with six internal submodes:
481
132
 
482
- ## Internal submode selection
133
+ - `brief`: turn loose directions into compact candidate briefs
134
+ - `rank`: compare briefs on one shared surface and choose promotion candidates
135
+ - `seed`: create a small implementation-level pool inside one durable line
136
+ - `loop`: advance one durable line with bounded smoke, full-eval, and record actions
137
+ - `fusion`: combine complementary strengths from multiple lines
138
+ - `debug`: rescue a strategically valuable candidate blocked by a concrete failure mode
483
139
 
484
- Choose exactly one primary optimize submode for the current meaningful pass.
140
+ Do not treat these as separate public skills.
141
+ Treat them as internal execution modes inside one optimize workflow.
485
142
 
486
143
  Default selection order:
487
144
 
488
- 1. `fusion`
489
- - when the frontier explicitly says `fusion`
490
- 2. `debug`
491
- - when a strategically valuable candidate failed for a concrete and likely fixable reason
492
- 3. `rank`
493
- - when several candidate briefs already exist and promotion is the main unresolved question
494
- 4. `brief`
495
- - when the candidate-brief slate is too thin or too weak
496
- 5. `seed`
497
- - when a durable line exists but there is no live implementation-candidate pool
498
- 6. `loop`
499
- - when a live candidate pool or leading durable line already exists and the main need is bounded execution progress
500
-
501
- Do not bounce among submodes repeatedly in one pass.
502
- If the best submode changes after new evidence appears, record that route shift explicitly.
503
-
504
- ## Candidate brief protocol
505
-
506
- When a direction is interesting but not yet worthy of a new branch:
507
-
508
- - create a candidate brief with `submission_mode='candidate'`
509
- - keep it branchless
510
- - record enough structure that later ranking or promotion is possible
511
-
512
- Good candidate-brief fields include:
513
-
514
- - title
515
- - problem
516
- - hypothesis
517
- - mechanism
518
- - mechanism_family
519
- - change_layer
520
- - source_lens
521
- - expected_gain
522
- - risks
523
- - decision_reason
524
- - foundation_ref
525
- - lineage_intent
526
-
527
- Do not promote every candidate automatically.
528
-
529
- Use the integrated `method brief template` section for the minimum acceptable candidate-brief structure.
530
- Use the integrated `brief shaping playbook` section when the brief is still too vague, too implementation-first, or too collapsed onto one familiar mechanism.
531
-
532
- Candidate briefs should explicitly answer:
533
-
534
- - WHAT bottleneck is being targeted?
535
- - WHY is the current line limited?
536
- - HOW does this mechanism address the limitation?
537
- - WHAT must remain unchanged for comparability?
538
-
539
- If the brief cannot answer those four questions clearly, it is not ready for promotion or implementation.
540
-
541
- Treat a candidate brief as the DeepScientist form of a method brief.
542
- It should sit between "idea intuition" and "code implementation".
543
-
544
- Preserve this brief-shaping discipline:
545
-
546
- 1. clarify the bottleneck, constraints, and comparability boundary first
547
- 2. generate a small differentiated slate, usually `2-3` serious approaches
548
- 3. recommend one approach with explicit tradeoffs against the alternatives
549
- 4. self-check the winning brief for ambiguity, overlap, and weak justification before submission
550
-
551
- Do not jump from "interesting intuition" to branch creation.
552
- Do not jump from "I know how to code this" to "this deserves promotion."
553
-
554
- When running the `brief` submode:
555
-
556
- - produce only `2-4` serious candidate briefs by default
557
- - ask or answer the minimum clarifying questions needed to remove ambiguity around bottleneck, constraint fit, and comparability
558
- - explicitly keep one incumbent-compatible refinement when possible
559
- - explicitly keep one orthogonal alternative when possible
560
- - explicitly keep one broader lens or paradigm shift candidate when possible
561
- - avoid generating several renamed variants of the same mechanism
562
- - prefer mechanism-level distinctness over volume
563
- - present the differentiated slate on one shared comparison surface before choosing a recommended brief
564
- - keep the questioning bounded and execution-oriented rather than open-ended brainstorming
145
+ 1. `fusion` when the frontier explicitly says `fusion`
146
+ 2. `debug` when a strategically valuable candidate failed for a concrete and likely fixable reason
147
+ 3. `rank` when several candidate briefs already exist and promotion is the main unresolved question
148
+ 4. `brief` when the candidate-brief slate is too thin or too weak
149
+ 5. `seed` when a durable line exists but there is no live implementation-candidate pool
150
+ 6. `loop` when a live candidate pool or leading durable line already exists and the main need is bounded execution progress
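The default selection order above amounts to a first-match priority chain. A minimal sketch follows, assuming a hypothetical `FrontierSnapshot` whose fields stand in for whatever the frontier and quest state actually report. The final fallback to `brief` is also an assumption, since the list does not say what to do when no condition holds.

```python
from dataclasses import dataclass


@dataclass
class FrontierSnapshot:
    """Hypothetical summary of optimization state; field names are illustrative."""
    recommended_mode: str = ""
    fixable_failure_on_valuable_candidate: bool = False
    candidate_brief_count: int = 0
    promotion_unresolved: bool = False
    brief_slate_too_thin: bool = False
    durable_line_exists: bool = False
    live_implementation_pool: bool = False


def choose_submode(s: FrontierSnapshot) -> str:
    """Return exactly one primary submode, checking conditions in the listed order."""
    if s.recommended_mode == "fusion":
        return "fusion"
    if s.fixable_failure_on_valuable_candidate:
        return "debug"
    if s.candidate_brief_count >= 2 and s.promotion_unresolved:
        return "rank"
    if s.brief_slate_too_thin:
        return "brief"
    if s.durable_line_exists and not s.live_implementation_pool:
        return "seed"
    if s.live_implementation_pool or s.durable_line_exists:
        return "loop"
    # Assumption: with nothing else to go on, start by shaping briefs.
    return "brief"
```

Because the function returns on the first match, a pass can never end up with two primary submodes, which is the property the constraints ask for.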
565
151
 
566
- Use a coverage contract for every serious brief slate:
152
+ ## Frontier route meanings
567
153
 
568
- - one `incumbent-deepening` direction when justified
569
- - one `orthogonal-mechanism` direction when justified
570
- - one `paradigm/objective/data-view shift` direction when justified
571
-
572
- If all serious briefs belong to the same mechanism family, do one widening pass before ranking.
573
- Do not treat a same-family slate as sufficient merely because the local scores look good.
574
-
575
- For each serious brief, record at least:
576
-
577
- - bottleneck
578
- - why_current_line_is_limited
579
- - mechanism
580
- - why_now
581
- - mechanism_family
582
- - change_layer: `Tier1` / `Tier2` / `Tier3`
583
- - source_lens
584
- - keep_unchanged
585
- - expected_gain
586
- - implementation_surface
587
- - main_risks
588
- - promote_now: yes or no
589
-
590
- InternAgent-style behavior to preserve here:
591
-
592
- - generate candidate methods first
593
- - critique them before promotion
594
- - express them as method-layer objects rather than code patches
595
- - defer branch creation until the candidate is actually chosen
596
- - prefer one-question-at-a-time clarification when one missing assumption would otherwise contaminate the whole brief slate
597
-
598
- Do not require a paper-style literature hard gate inside this submode unless the quest explicitly moved back toward paper work.
599
-
600
- ## Promotion protocol
601
-
602
- Only promote a candidate brief into a durable line when at least one of the following is true:
603
-
604
- - it clearly dominates the nearby alternatives
605
- - it is top-ranked and sufficiently distinct
606
- - the user explicitly asked to pursue it
607
- - the current frontier indicates the line is the strongest next move
608
-
609
- Promotion should use:
610
-
611
- `artifact.submit_idea(mode='create', submission_mode='line', source_candidate_id=..., ...)`
612
-
613
- When several candidate briefs are plausible, rank them explicitly before promotion.
614
- Use the integrated `candidate ranking template` section for the minimum acceptable ranking record.
615
-
616
- Default promotion rule:
617
-
618
- - promote only `1-3` candidate briefs into durable lines
619
- - if one candidate clearly dominates, promote only that one
620
- - if the frontier is still structurally uncertain, promote at most two sufficiently distinct lines
621
-
622
- When running the `rank` submode:
623
-
624
- - compare the current serious briefs on one explicit shared surface
625
- - score or rank them with written reasons
626
- - state why the winner is better now
627
- - state why the main alternatives are deferred rather than erased
628
- - never treat "all seem promising" as a sufficient reason to promote them all
629
-
630
- Use a distinct promotion policy:
631
-
632
- - default rule: each mechanism family should contribute at most one promoted line
633
- - do not let one familiar family fill the whole promoted slate
634
- - only override that family cap when one candidate clearly dominates the whole field
635
-
636
- When ranking, explicitly check:
637
-
638
- - family diversity
639
- - change-layer diversity
640
- - whether the brief slate is collapsing into one familiar lens
641
-
642
- If the top briefs are all same-family, either:
643
-
644
- - keep only the strongest one
645
- - or return to `brief` for a widening pass
646
-
647
- The output of `rank` should be promotion-ready.
648
- The output of `brief` should be candidate-ready.
649
-
650
- ## Frontier protocol
651
-
652
- At meaningful route boundaries, inspect:
653
-
654
- - best branch
655
- - best recent run
656
- - stagnant branches
657
- - candidate backlog
658
- - possible fusion opportunities
659
- - recommended mode
660
-
661
- Prefer these route meanings:
154
+ At meaningful route boundaries, choose exactly one dominant route meaning:
662
155
 
663
156
  - `explore`: widen search with fresh candidate directions
664
157
  - `exploit`: focus on the strongest current line
@@ -666,980 +159,96 @@ Prefer these route meanings:
666
159
  - `debug`: rescue a candidate or line blocked by a concrete failure mode
667
160
  - `stop`: the current frontier is saturated or the remaining routes are not justified
668
161
 
669
- Use the integrated `frontier review template` section when the next route is unclear.
670
-
671
- Interpret frontier state with these default heuristics:
672
-
673
- - `explore`
674
- - use when no line is clearly dominant
675
- - use when current lines are too similar
676
- - use when the search has not yet established a strong incumbent
677
-
678
- - `exploit`
679
- - use when one line clearly leads on evidence and comparability
680
- - use when smoke results already narrowed the candidate pool
681
-
682
- - `fusion`
683
- - use when at least two lines have meaningful strengths
684
- - use when one line is strong but another line contributes a complementary mechanism
685
- - use when the current incumbent is stagnating but the broader frontier is still promising
686
-
687
- - `debug`
688
- - use when a candidate failed for a concrete and likely fixable reason
689
- - use when the candidate is still strategically valuable after the failure
162
+ Default heuristics:
690
163
 
691
- - `stop`
692
- - use when the frontier is saturated
693
- - use when remaining routes are low-value, redundant, or too weak relative to cost
164
+ - choose `explore` when no line is clearly dominant or the current lines are too similar
165
+ - choose `exploit` when one line clearly leads on evidence and comparability
166
+ - choose `fusion` when at least two lines have meaningful complementary strengths
167
+ - choose `debug` when a strategically valuable candidate failed for a concrete and likely fixable reason
168
+ - choose `stop` when the frontier is saturated or the remaining routes are low-value relative to cost
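The default heuristics can be read as a small decision function. The sketch below is one possible encoding, assuming a hypothetical `FrontierState` summary; the field names and the precedence order between the checks are assumptions, since the source lists the conditions without ranking them.

```python
from dataclasses import dataclass


@dataclass
class FrontierState:
    # Hypothetical summary of the frontier; not a structure defined by the package.
    saturated: bool = False
    fixable_failure_on_valuable_candidate: bool = False
    has_clear_leader: bool = False
    complementary_strong_lines: int = 0


def choose_route(state):
    """Pick exactly one dominant route meaning (precedence is an assumption)."""
    if state.saturated:
        return "stop"
    if state.fixable_failure_on_valuable_candidate:
        return "debug"
    if state.has_clear_leader:
        return "exploit"
    if state.complementary_strong_lines >= 2:
        return "fusion"
    # No dominant line and nothing to fuse or rescue: widen the search.
    return "explore"
```

Returning a single string mirrors the requirement to choose exactly one dominant route meaning at each route boundary.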
694
169
 
695
- When the frontier says `explore`, the default optimize submode is `brief`.
696
- When the frontier says `exploit`, the default optimize submode is `seed` or `loop`.
697
- When the frontier says `fusion`, the default optimize submode is `fusion`.
698
- When a candidate failure dominates the next move, the default optimize submode is `debug` even if the frontier does not yet say so explicitly.
699
-
700
- ## Seed protocol
701
-
702
- Use `seed` after a durable line exists and before a broad execution loop begins.
703
-
704
- The goal is not to launch a full run immediately.
705
- The goal is to generate a small within-line candidate pool that can be smoke-tested and triaged.
706
-
707
- When running `seed`:
708
-
709
- - generate only `2-3` implementation-level candidates by default
710
- - make each candidate meaningfully different in mechanism, implementation path, or risk profile
711
- - prefer plan-first candidates over immediate large edits
712
- - record each candidate as `report_type='optimization_candidate'`
713
- - define which candidates enter smoke first
714
- - for a newly promoted line, keep at least one `simple-first` candidate in the initial seed batch
715
- - do not start a fresh line with ensemble stacking, broad HPO, or a heavy multi-stage pipeline unless durable evidence already proves the simple route is insufficient
716
-
717
- For each seed candidate, record at least:
718
-
719
- - candidate_id
720
- - parent line
721
- - strategy
722
- - mechanism_family
723
- - change_layer
724
- - change_plan
725
- - expected_gain
726
- - keep_unchanged
727
- - first validation step
728
- - archive condition
729
-
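The minimum seed-candidate record listed above could be captured as a small data class. This is a sketch only: the snake_case spellings of `parent_line`, `first_validation_step`, and `archive_condition`, the `status` key, and the flat payload layout are assumptions, and the helper builds a plain dictionary rather than calling any package API.

```python
from dataclasses import dataclass, asdict


@dataclass
class SeedCandidate:
    # Fields follow the list above; identifier spellings are assumed.
    candidate_id: str
    parent_line: str
    strategy: str
    mechanism_family: str
    change_layer: str
    change_plan: str
    expected_gain: str
    keep_unchanged: str
    first_validation_step: str
    archive_condition: str


def to_report_payload(candidate, status="proposed"):
    """Build a dict shaped like an optimization_candidate report (layout assumed)."""
    payload = {
        "kind": "report",
        "report_type": "optimization_candidate",
        "status": status,  # hypothetical key name
    }
    payload.update(asdict(candidate))
    return payload
```

Keeping every field required makes an incomplete candidate fail at construction time rather than at triage.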
730
- MLEvolve-style behavior to preserve here:
731
-
732
- - one durable line may produce multiple candidate attempts
733
- - candidate generation is bounded
734
- - smoke comes before full evaluation unless the task is explicitly `fast-check` and direct quick validation is cheaper and equally informative
735
-
736
- Use a validation-cost-aware seed policy:
737
-
738
- - `fast-check`: the first objective smoke signal is likely under about `20` minutes
739
- - `slow-check`: the first objective smoke signal is likely over about `20` minutes or expensive enough that broad probing is wasteful
740
-
741
- For `fast-check` seed work:
742
-
743
- - widen a bit more aggressively inside the line
744
- - a seed batch of `3-5` candidates can be justified when they are genuinely differentiated
745
- - prefer multiple orthogonal quick tests over one over-discussed candidate
746
- - a separate smoke stage is optional; direct submission into quick parallel validation is acceptable when the first check is already cheap
747
- - only skip smoke when the parallel quick validations are expected to produce distinguishable conclusions rather than repeated near-duplicate outcomes
748
-
749
- For `slow-check` seed work:
750
-
751
- - keep the initial seed batch tighter, usually `1-2` candidates and rarely `3`
752
- - insist on a stronger reason for every candidate entering smoke
753
- - prefer one dominant hypothesis plus one hedge candidate over a broad exploratory pool
754
- - do not spend long runs to discover that the brief itself was weak
755
-
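The validation-cost-aware seed policy above amounts to a lookup from expected check time to batch size. A minimal sketch follows, assuming the caller supplies an estimate in minutes; the function names are hypothetical, and treating exactly `20` minutes as `slow-check` is an assumption because the source only says "about".

```python
FAST_CHECK_MINUTES = 20  # threshold taken from the text; the boundary case is assumed


def classify_check(first_signal_minutes):
    """Label a task by how long the first objective smoke signal takes."""
    if first_signal_minutes < FAST_CHECK_MINUTES:
        return "fast-check"
    return "slow-check"


def seed_batch_range(first_signal_minutes, differentiated=True):
    """Return the (min, max) seed batch size suggested by the policy."""
    if classify_check(first_signal_minutes) == "fast-check":
        # Wider batches are justified only for genuinely differentiated candidates.
        return (3, 5) if differentiated else (2, 3)
    # Slow checks: usually 1-2 candidates, rarely 3.
    return (1, 2)
```

For example, `seed_batch_range(5)` suggests three to five candidates, while `seed_batch_range(90)` keeps the batch at one or two.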
756
- Do not keep a live implementation pool dominated by the same mechanism family.
757
- Default active-pool rule:
758
-
759
- - at most `1-2` live candidates from the same family
760
- - if one family already fills the live pool, new same-family candidates do not enter smoke by default
761
-
762
- ## Loop protocol
763
-
764
- Use `loop` when a durable line and implementation-candidate pool already exist and the main need is bounded forward motion.
765
-
766
- Before changing code in `loop`, inspect the same-line local attempt memory for the current line.
767
- Treat recent sibling attempts on the same line as the first memory surface, ahead of broader quest memory.
768
-
769
- When running `loop`, choose one primary action:
770
-
771
- - `smoke`
772
- - `promote_to_full_eval`
773
- - `archive`
774
- - `record_main_result`
775
- - `switch_to_fusion`
776
- - `switch_to_debug`
777
- - `stop`
778
-
779
- Every loop pass should end with:
780
-
781
- - one updated candidate status
782
- - one updated next action
783
- - one frontier review trigger
784
-
785
- Do not leave the line with several half-started directions and no dominant next move.
786
-
787
- Default exploit rule: one atomic improvement per pass.
788
- Do not bundle several unrelated changes into one exploit candidate unless:
789
-
790
- - the changes are one tightly coupled design package
791
- - or the pass is explicitly a fusion route
792
-
793
- MLEvolve-style behavior to preserve here:
794
-
795
- - bounded parallelism
796
- - small live candidate pool
797
- - explicit move from draft -> smoke -> full eval -> archive or result
798
- - measured frontier review after real evidence
799
-
800
- Use a validation-cost-aware loop policy:
801
-
802
- - for `fast-check` tasks, it is acceptable to run more quick, different tests before converging
803
- - for `fast-check` tasks, direct quick validation may replace a separate smoke stage if that saves time without losing decision quality
804
- - for `slow-check` tasks, use fewer but sharper passes, and require objective gain before widening or evolving further
805
- - if the validation loop is slow, do not keep paying for frontier uncertainty that could have been reduced in `brief`
806
- - if the validation loop is fast, prefer resolving uncertainty with evidence instead of over-arguing in chat
807
-
808
- Use a branch/family diversity cap during exploitation:
809
-
810
- - do not keep selecting only the locally familiar family because it is easiest to elaborate
811
- - when several strong candidates are close, prefer the one that preserves frontier diversity
812
- - if one branch or family already dominates recent attempts, require stronger evidence before selecting another near-duplicate attempt
813
-
814
- ## Memory protocol
815
-
816
- Before broad new search, run at least one `memory.search(...)` using:
817
-
818
- - the current task name
819
- - the active idea id
820
- - a method keyword
821
- - the most recent failure mode or successful mechanism
822
-
823
- When the search appears too narrow, also retrieve one of:
824
-
825
- - a similar failure pattern
826
- - an orthogonal success pattern
827
- - a deliberately dissimilar but high-value prior attempt
828
-
829
- For `seed`, `loop`, and `debug`, also inspect the same-line local attempt memory from the current leading line before widening to broader quest memory.
830
-
831
- Write at least one quest memory card when you learn something reusable, such as:
832
-
833
- - a successful optimization pattern
834
- - a repeated failure pattern
835
- - a fusion lesson
836
- - a reason a candidate should not be retried
837
-
838
- Use the integrated `optimization memory template` section for the minimum acceptable memory-card shape.
839
-
840
- Do not write generic "we tried some optimization" memory cards.
841
- Each card should be retrieval-friendly and decision-relevant.
842
-
843
- ## Artifact protocol
170
+ ## Non-negotiable rules
844
171
 
845
- Use:
172
+ - Keep all major optimization successes and failures durable through artifacts and memory.
173
+ - Do not convert ranking uncertainty into premature branch creation.
174
+ - Do not treat an implementation-level candidate report as a new durable optimization line.
175
+ - Before broad new search, inspect recent optimization memory and the same-line local attempt memory when relevant.
176
+ - If the same line stalls repeatedly, switch route instead of pretending more of the same is new evidence.
177
+ - Plateau is a route signal, not a reason to keep issuing tiny tweaks.
178
+
179
+ ## Operational guidance
180
+
181
+ The main skill keeps the control surface in front.
182
+ For the longer playbooks, templates, and protocol details, read the references:
183
+
184
+ - `references/operational-guidance.md`
185
+ - `references/brief-shaping-playbook.md`
186
+ - `references/candidate-ranking-template.md`
187
+ - `references/frontier-review-template.md`
188
+ - `references/method-brief-template.md`
189
+ - `references/codegen-route-playbook.md`
190
+ - `references/debug-response-template.md`
191
+ - `references/fusion-playbook.md`
192
+ - `references/optimization-memory-template.md`
193
+ - `references/optimize-checklist-template.md`
194
+ - `references/plateau-response-playbook.md`
195
+ - `references/prompt-patterns.md`
196
+
197
+ Use them when:
198
+
199
+ - the candidate brief is still fuzzy
200
+ - explicit ranking or promotion notes are needed
201
+ - the frontier route is unclear
202
+ - implementation-route choice, debug, fusion, or plateau handling needs the full playbook
203
+ - memory writing, checklist maintenance, or prompt shaping materially affect the route
846
204
 
847
- - `artifact.submit_idea(..., submission_mode='candidate')` for candidate briefs
848
- - `artifact.submit_idea(..., submission_mode='line')` for durable promoted lines
849
- - `artifact.record(payload={'kind': 'report', 'report_type': 'optimization_candidate', ...})` for within-line attempts
850
- - `artifact.record(payload={'kind': 'decision', 'action': 'iterate'|'branch'|'continue'|'stop', ...})` for route changes
851
- - `artifact.record_main_experiment(...)` for real measured line results
205
+ ## Integrated reference appendix
852
206
 
853
- When the optimize pass is about ranking or promotion, also record one durable decision explaining:
207
+ Use these reference sections as needed without copying them into chat:
854
208
 
855
- - which briefs were compared
856
- - which one won
857
- - why promotion was justified now
858
- - why the others were held, fused, or rejected
209
+ ### optimize-checklist-template.md
210
+ ### candidate-board-template.md
211
+ ### method-brief-template.md
212
+ ### brief-shaping-playbook.md
213
+ ### candidate-ranking-template.md
214
+ ### frontier-review-template.md
215
+ ### optimization-memory-template.md
216
+ ### fusion-playbook.md
217
+ ### codegen-route-playbook.md
218
+ ### debug-response-template.md
219
+ ### prompt-patterns.md
220
+ ### plateau-response-playbook.md
859
221
 
860
- When recording implementation-level candidates, prefer these status values:
222
+ Codegen route choices should stay explicit: stepwise generation for incremental edits, diff / patch generation for contained changes, and full rewrite only when the old surface is genuinely the blocker.
223
+ Mandatory first-call sequence: refresh `artifact.get_optimization_frontier(...)`, recover quest state, then choose `brief`, `rank`, `seed`, `loop`, `fusion`, `debug`, or `stop`.
224
+ Use `memory.search(...)` for same-line local attempt memory before repeating a known failure or reopening stale frontier assumptions.
861
225
 
862
- - `proposed`
863
- - `smoke_running`
864
- - `smoke_passed`
865
- - `smoke_failed`
866
- - `promoted`
867
- - `full_eval_running`
868
- - `succeeded`
869
- - `failed`
870
- - `archived`
226
+ Stall-recovery protocol: if a line stops improving, decide whether the issue is mechanism family, change-layer diversity, validation-cost-aware seed policy, validation-cost-aware loop policy, or execution noise.
227
+ Internal submode selection should preserve a coverage contract and a distinct promotion policy for each route.
228
+ InternAgent maps most naturally to codegen-route and execution-surface optimization; MLEvolve maps most naturally to search-loop, mutation, and validation orchestration.
871
229
 
872
- Use `report_type='optimization_candidate'` consistently for implementation-level attempts so they can later be summarized into the frontier.
230
+ Brief shaping should clarify the bottleneck, constraints, and comparability boundary first, then generate a small differentiated slate, usually `2-3` serious approaches.
231
+ Recommend one approach with explicit tradeoffs against the alternatives, and self-check the winning brief for ambiguity, overlap, and weak justification before submission.
233
+ Candidate briefs should expose `why_now`.
873
234
 
874
- ## Execution protocol
235
+ For seed mode, use a validation-cost-aware seed policy: if the first check is likely to take under about `20` minutes, a separate smoke stage is optional and direct submission into quick parallel validation is acceptable.
236
+ Only skip smoke when the parallel quick validations are expected to produce distinguishable conclusions.
238
+ Use a smoke test or direct quick validation according to uncertainty; you may skip a separate smoke stage and submit several quick validations in parallel when the hypotheses are separable.
239
+ For loop mode, use a validation-cost-aware loop policy; if the validation loop is slow, do not keep paying for frontier uncertainty that could have been reduced in `brief`.
240
+ Gate evolution on clear objective signal rather than small local preference.
875
242
 
876
- - Use `bash_exec` for smoke checks and full runs.
877
- - Prefer bounded smoke before full evaluation unless `fast-check` direct validation is cheaper and equally informative.
878
- - Do not keep rerunning the same unchanged candidate.
879
- - If a candidate fails with a clear root cause, either debug it deliberately or archive it.
880
- - If the same line stalls repeatedly, switch to exploit or fusion rather than pretending more of the same is new evidence.
243
+ Family-shift trigger: when repeated same-family edits stall, revisit the mechanism family.
244
+ Task-category primer: prefer simple-first changes, one atomic improvement per pass, and bugfix-only passes when the failure is localized.
881
245
 
882
- Use this execution order by default:
246
+ ## Exit criteria
883
247
 
884
- 1. candidate brief selection
885
- 2. implementation-level candidate generation
886
- 3. smoke test or direct quick validation
887
- 4. promotion to fuller evaluation when justified
888
- 5. durable result recording
889
- 6. frontier review
248
+ Exit `optimize` only when one of these is durably true:
890
249
 
891
- Prefer only a small active pool at once:
892
-
893
- - usually `2-4` candidate briefs before promotion
894
- - usually `2-3` live implementation candidates in smoke
895
- - usually `1-2` full evaluations running at once unless the environment clearly supports more
896
-
897
- Validation-cost-aware override:
898
-
899
- - if first-pass validation is under about `20` minutes, it is reasonable to increase smoke breadth modestly and compare more alternatives early
900
- - if first-pass validation is under about `20` minutes, you may skip a separate smoke stage and submit several quick validations in parallel
901
- - only do that when the validations are likely to yield different conclusions such as clear win / tie / fail / instability, rather than redundant repeats
902
- - if first-pass validation is slower than that, keep the active pool narrow and gate evolution on clear objective signal
903
- - for slow validation, do not promote a candidate into heavier resource investment until smoke or pilot evidence shows a real performance improvement, stability improvement, or comparability-preserving advantage
904
-
905
- ## Code-generation route selection
906
-
907
- Do not use the same code-generation route for every optimization step.
908
-
909
- Prefer:
910
-
911
- 1. brief-first, no code yet
912
- - when the direction is still unclear
913
- - stay at candidate-brief level
914
-
915
- 2. stepwise generation
916
- - for the first substantial implementation of a new durable line
917
- - especially when the line touches multiple subsystems such as data processing, model design, and training/evaluation
918
-
919
- 3. diff / patch generation
920
- - when a strong current implementation already exists
921
- - for improve, exploit, debug, and most fusion work
922
-
923
- 4. full rewrite
924
- - only when the current implementation is too broken or too structurally mismatched for diff patching to remain safe
925
-
926
- Use the integrated `codegen route playbook` section before committing to a larger rewrite.
927
-
928
- ## Debug protocol
929
-
930
- Use `debug` when a candidate failed but still looks strategically valuable.
931
-
932
- `debug` is bugfix-only.
933
- Do not use a debug pass to sneak in a new performance-improvement idea.
934
- If the proposed change goes beyond the minimal fix and becomes a new mechanism, stop and route back to `brief` or `loop` instead.
935
-
936
- When a candidate fails:
937
-
938
- - classify whether the failure is structural, local, or environmental
939
- - retrieve similar failure patterns from memory before changing code
940
- - prefer targeted fixes over broad rewrites
941
- - define the exact post-fix bounded check before editing
942
-
943
- Good debug prompts should make these explicit:
944
-
945
- - the concrete error
946
- - the likely root cause
947
- - the minimal fix
948
- - what must remain unchanged
949
-
950
- Use the integrated `debug response template` section for the minimum acceptable debug response shape.
951
-
952
- Archive rather than debug when:
953
-
954
- - the failure is mostly strategic rather than local
955
- - the candidate no longer looks better than the nearby alternatives
956
- - the fix would effectively turn it into a different candidate anyway
957
-
958
- ## Fusion protocol
959
-
960
- Use `fusion` only when the frontier justifies cross-line combination.
961
-
962
- Before opening a fusion candidate:
963
-
964
- - identify the real strength of each source line
965
- - identify the real weakness of each source line
966
- - explain why the strengths are complementary rather than redundant
967
- - define what remains unchanged for comparability
968
- - define the bounded evidence that would prove the fusion was worthwhile
969
-
970
- Use the integrated `fusion playbook` section before launching cross-line fusion.
971
-
972
- Do not fuse:
973
-
974
- - two lines with the same mechanism under different names
975
- - two weak lines that lack a clear strength
976
- - merely because multiple branches exist
977
-
978
- If the fusion hypothesis is still underspecified, return to `brief` instead of pretending fusion is ready.
979
-
980
- ## Prompt patterns worth preserving
981
-
982
- For candidate-brief, improve, fusion, and debug prompts, preserve these recurring structures:
983
-
984
- - Introduction
985
- - Task description
986
- - Memory
987
- - Previous solution or previous line
988
- - Instructions
989
- - assistant_prefix when a stable response lead-in reduces drift
990
- - explicit response format
991
-
992
- And preserve these recurring reasoning contracts:
993
-
994
- - root cause first
995
- - WHAT / WHY / HOW
996
- - KEEP UNCHANGED
997
- - explicit next action
998
-
999
- Use the integrated `prompt patterns` section as the canonical optimization prompt crib sheet.
1000
-
1001
- ## Plateau and fusion protocol
1002
-
1003
- Treat repeated local edits without evidence gain as a search failure mode.
1004
-
1005
- If one line shows repeated non-improving results:
1006
-
1007
- - stop issuing near-duplicate attempts
1008
- - record the stagnation explicitly
1009
- - either widen the search or fuse with another line
1010
-
1011
- Use the integrated `fusion playbook` section before launching cross-line fusion.
1012
- Use the integrated `plateau response playbook` section when deciding how to respond to repeated non-improving results.
1013
-
1014
- Good fusion candidates usually satisfy both:
1015
-
1016
- - each source line has at least one real strength
1017
- - the strengths are complementary rather than redundant
1018
-
1019
- Do not fuse merely because two lines both exist.
1020
-
1021
- When a line plateaus:
1022
-
1023
- - stop issuing near-duplicate low-information attempts
1024
- - say explicitly that the line is plateauing
1025
- - force one larger route change:
1026
- - widen the brief slate
1027
- - promote a stronger alternative
1028
- - fuse
1029
- - debug one blocked but valuable candidate
1030
- - stop
1031
-
1032
- Do not hide plateau under a sequence of tiny "one more tweak" loops.
1033
-
1034
- Family-shift trigger:
1035
-
1036
- - if recent attempts stay inside one mechanism family and there is no meaningful improvement
1037
- - or if `success_patience >= 2`
1038
- - or if `total_patience >= 5`
1039
- - the next pass must not be another same-family Tier1 tweak
1040
- - instead choose one of:
1041
- - orthogonal family
1042
- - Tier2 or Tier3 shift
1043
- - fusion
1044
- - stop
1045
-
1046
- This is the default anti-collapse rule for optimize.
1047
-
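The family-shift trigger above is a disjunction of three conditions, which can be sketched as a single predicate. The source does not define how `success_patience` and `total_patience` are counted, so this sketch treats them as counters supplied by the caller; the function name and the requirement of at least two recent attempts are assumptions.

```python
# Allowed follow-up moves once the trigger fires, as listed in the text.
SHIFT_OPTIONS = ("orthogonal family", "Tier2 or Tier3 shift", "fusion", "stop")


def family_shift_required(recent_families, improved, success_patience, total_patience):
    """Return True when the next pass must not be another same-family Tier1 tweak."""
    stuck_in_one_family = (
        len(recent_families) >= 2          # assumed minimum history
        and len(set(recent_families)) == 1
        and not improved
    )
    return stuck_in_one_family or success_patience >= 2 or total_patience >= 5
```

When the predicate is true, the next pass should pick one entry from `SHIFT_OPTIONS` instead of another local edit in the same family.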
1048
- ## Task-category primer
1049
-
1050
- Before widening a stale frontier, classify the task briefly into one or more dominant structures:
1051
-
1052
- - tabular
1053
- - vision / spatial
1054
- - sequence / language
1055
- - graph / topology
1056
- - systems / optimization
1057
- - mixed
1058
-
1059
- Then ask whether the current brief slate overfits one familiar method family for that task.
1060
- If it does, require at least one serious candidate from a different plausible family or lens before promotion.
1061
-
1062
- ## Stall-recovery protocol
1063
-
1064
- If the optimize stage appears to stall, diagnose the stall explicitly instead of idling.
1065
-
1066
- Common stall classes:
1067
-
1068
- - no frontier information
1069
- - no candidate clearly worth promotion
1070
- - candidate pool is too similar
1071
- - repeated failures on one line
1072
- - no active runs and no next action recorded
1073
-
1074
- Preferred recovery order:
1075
-
1076
- 1. refresh the frontier
1077
- 2. inspect the current candidate board
1078
- 3. inspect recent optimization memory
1079
- 4. record one explicit route decision
1080
- 5. continue with exactly one concrete next action
1081
-
1082
- Do not leave the stage parked without a recorded reason and a concrete reopen condition.
1083
-
1084
- ## Stage-end requirement
1085
-
1086
- Stage-end requirement:
1087
-
1088
- - write at least one `memory.write(...)` when the pass produced a reusable success pattern, repeated failure pattern, fusion lesson, or explicit non-retry rule
1089
- - update `OPTIMIZE_CHECKLIST.md`
1090
- - update `CANDIDATE_BOARD.md` when the candidate pool changed
1091
- - leave one durable next action or stop condition
1092
-
1093
- If nothing reusable was learned, record why this pass was still necessary instead of writing a fake memory card.
1094
-
1095
- ## Completion rule
1096
-
1097
- This stage is complete only when one of these is durably true:
1098
-
1099
- - a stronger line was promoted and the next anchor is clear
1100
- - the current line produced a real measured result and the next route is recorded
1101
- - the optimization frontier says stop and that stop decision is durably recorded
250
+ - a stronger line was promoted and the next anchor is clear
251
+ - the current line produced a real measured result and the next route is recorded
252
+ - the optimization frontier says stop and that stop decision is durably recorded
1102
253
 
1103
254
  Do not treat one candidate creation or one smoke pass as stage completion.
1104
-
1105
- ## Integrated reference appendix
1106
-
1107
- This appendix inlines the former `optimize/references/*.md` material so the skill remains self-contained.
1108
-
1109
- ### brief-shaping-playbook.md
1110
-
1111
- # Brief Shaping Playbook
1112
-
1113
- Use this reference when a candidate direction is still fuzzy and needs to become a structured, ranking-ready brief.
1114
-
1115
- This playbook borrows the useful part of product-style brainstorming without importing a full software-spec workflow.
1116
- The goal is not a long design document.
1117
- The goal is a compact candidate brief that is clear enough to compare, rank, and either submit as `submission_mode='candidate'` or reject.
1118
-
1119
- ## 1. Clarify before widening
1120
-
1121
- Before generating more variants, resolve the minimum ambiguity around:
1122
-
1123
- - the concrete bottleneck
1124
- - the evaluation or comparability boundary
1125
- - the main hard constraint: data, metric, compute, latency, memory, interface, or training budget
1126
- - the current incumbent or baseline that this brief must beat or complement
1127
-
1128
- If one unknown would materially change every candidate, clarify it first instead of generating a noisy slate.
1129
- Prefer one question at a time when clarification is genuinely needed.
1130
- If the answer is already available from durable state, use that instead of asking.
1131
-
1132
- ## 2. Generate a small differentiated slate
1133
-
1134
- Default target: `2-3` serious approaches.
1135
-
1136
- The slate should usually include:
1137
-
1138
- - one incumbent-deepening refinement
1139
- - one orthogonal mechanism
1140
- - one broader shift candidate when justified
1141
-
1142
- Do not produce several renamed variants of the same mechanism family.
1143
- If two variants differ only by parameter choice or patch detail, keep only the sharper one.
1144
-
1145
- For each candidate, write:
1146
-
1147
- - bottleneck
1148
- - why_current_line_is_limited
1149
- - mechanism
1150
- - why_now
1151
- - keep_unchanged
1152
- - expected_gain
1153
- - main_risks
1154
-
1155
- ## 3. Compare on one shared surface
1156
-
1157
- Before recommending a winner, compare the serious candidates on the same dimensions:
1158
-
1159
- - expected upside
1160
- - comparability safety
1161
- - implementation surface
1162
- - mechanism distinctness
1163
- - failure risk
1164
- - reason this route is better now than the nearby alternatives
1165
-
1166
- Do not let each candidate justify itself with a different scoring story.
1167
- Use one comparison surface so ranking is auditable.
1168
-
1169
- ## 4. Recommend exactly one lead brief
1170
-
1171
- After comparison, recommend one lead brief and explain:
1172
-
1173
- - why it is the best next move now
1174
- - why the main alternatives are deferred instead of promoted
1175
- - what evidence would quickly disconfirm the lead brief
1176
-
1177
- Do not say "all are promising" and promote everything.
1178
- If the slate is still too close to call, return to widening once or narrow the slate further.
1179
-
1180
- ## 5. Self-check before submission
1181
-
1182
- Before calling `artifact.submit_idea(..., submission_mode='candidate', ...)`, check:
1183
-
1184
- - Is the bottleneck concrete rather than generic?
1185
- - Does `why_current_line_is_limited` explain a real gap instead of restating the mechanism?
1186
- - Does `why_now` explain what changed in evidence, failure pattern, or frontier state?
1187
- - Is the comparability boundary explicit?
1188
- - Is the recommendation based on tradeoffs rather than implementation convenience?
1189
- - Would the brief still make sense if handed to another agent with no chat context?
1190
-
1191
- If any answer is no, refine the brief before submission.
1192
-
1193
- ## 6. Output shape
1194
-
1195
- A good final brief package is short and structured:
1196
-
1197
- 1. brief title
1198
- 2. one-paragraph bottleneck and constraint summary
1199
- 3. a `2-3` candidate comparison table or bullet slate
1200
- 4. recommended brief with tradeoff summary
1201
- 5. self-check outcome
1202
- 6. fields ready for the integrated `method-brief-template.md` section
1203
-
1204
- Keep it compact.
1205
- This is a shaping pass for optimization candidates, not a paper draft or engineering spec.
1206
-
1207
- ### candidate-board-template.md
1208
-
1209
- # CANDIDATE_BOARD.md
1210
-
1211
- | Candidate ID | Level | Parent | Strategy | Status | Expected Gain | Observed Result | Promote / Archive |
1212
- | --- | --- | --- | --- | --- | --- | --- | --- |
1213
- | cand-001 | brief | current-head | explore | proposed | Better tail accuracy | n/a | pending |
1214
- | cand-002 | impl | cand-001 | exploit | smoke_passed | Faster convergence | smoke ok | consider promote |
1215
-
1216
- Notes:
1217
-
1218
- - `Level` should be `brief` or `implementation`
1219
- - `Parent` may be a branch, idea id, run id, or candidate id
1220
- - `Strategy` should usually be one of `explore`, `exploit`, `fusion`, `debug`
1221
- - `Promote / Archive` should be a clear recommendation, not an empty placeholder
1222
-
1223
- ### candidate-ranking-template.md
1224
-
1225
- # Candidate Ranking Template
1226
-
1227
- ## Candidate Set
1228
-
1229
- - Candidate IDs:
1230
- - Ranking scope:
1231
- - Comparison surface:
1232
-
1233
- ## Criteria
1234
-
1235
- - expected information gain
1236
- - feasibility in current repo
1237
- - comparability against baseline
1238
- - implementation surface
1239
- - likely novelty or distinctiveness
1240
- - risk of redundant overlap
1241
- - incumbent-improvement potential
1242
- - distinctness from other candidates
1243
- - mechanism-family diversity
1244
- - change-layer diversity
1245
-
1246
- ## Ranked Candidates
1247
-
1248
- 1. `candidate_id`
1249
- Score summary:
1250
- Why it ranks here:
1251
- Promote / hold / reject:
1252
-
1253
- 2. `candidate_id`
1254
- Score summary:
1255
- Why it ranks here:
1256
- Promote / hold / reject:
1257
-
1258
- 3. `candidate_id`
1259
- Score summary:
1260
- Why it ranks here:
1261
- Promote / hold / reject:
1262
-
1263
- ## Winner Justification
1264
-
1265
- Why the selected candidate should become a durable line now.
1266
-
1267
- ## Non-Winner Notes
1268
-
1269
- Why the other candidates were deferred, fused, or rejected.
1270
-
1271
- ## Promotion Cap
1272
-
1273
- - how many candidates should be promoted now:
1274
- - why more promotion would dilute the frontier:
1275
- - same-family cap override justification:
1276
-
1277
- ### codegen-route-playbook.md
1278
-
1279
- # Codegen Route Playbook
1280
-
1281
- Choose the code-generation route deliberately.
1282
-
1283
- ## Use brief-only
1284
-
1285
- Use no-code candidate briefs when:
1286
-
1287
- - the direction is still underspecified
1288
- - multiple distinct directions still need ranking
1289
- - a new line should not be promoted yet
1290
-
1291
- ## Use stepwise generation
1292
-
1293
- Prefer stepwise generation when:
1294
-
1295
- - a new durable line is being implemented for the first time
1296
- - the change spans data processing, model design, and training/evaluation
1297
- - a modular decomposition will reduce large integrated errors
1298
- - a plan -> refine -> implement sequence is safer than one monolithic edit
1299
-
1300
- ## Use diff / patch generation
1301
-
1302
- Prefer diff / patch generation when:
1303
-
1304
- - a strong current implementation already exists
1305
- - the current change is local enough to preserve most of the line
1306
- - the task is improve, exploit, debug, or most fusion work
1307
- - the desired change can be described as a bounded delta from the current solution
1308
-
1309
- ## Use full rewrite
1310
-
1311
- Use a full rewrite only when:
1312
-
1313
- - the existing implementation is structurally broken
1314
- - the desired architecture no longer matches the current codebase shape
1315
- - diff patching would be more fragile than replacement
1316
-
1317
- Do not jump to a rewrite merely because one local patch failed.
1318
-
1319
- ## Response shape
1320
-
1321
- For non-trivial codegen work, prefer this shape:
1322
-
1323
- 1. short plan
1324
- 2. bounded implementation surface
1325
- 3. keep-unchanged contract
1326
- 4. validation step
1327
-
1328
- Do not go from a vague idea directly into a large patch with no intermediate plan.
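
The route rules in this playbook can be read as a small decision function. The sketch below is one possible encoding: the field names, the precedence order between rules, and the fallback to stepwise generation are assumptions, since the playbook does not rank the routes against each other.

```python
from dataclasses import dataclass


@dataclass
class CodegenContext:
    # All fields are hypothetical flags summarizing the playbook's conditions.
    direction_underspecified: bool = False
    directions_need_ranking: bool = False
    first_implementation_of_line: bool = False
    spans_multiple_stages: bool = False
    strong_implementation_exists: bool = False
    change_is_bounded_delta: bool = False
    structurally_broken: bool = False
    architecture_mismatch: bool = False


def choose_codegen_route(ctx):
    """Return one of: brief-only, full-rewrite, diff-patch, stepwise."""
    if ctx.direction_underspecified or ctx.directions_need_ranking:
        return "brief-only"
    if ctx.structurally_broken or ctx.architecture_mismatch:
        return "full-rewrite"
    if ctx.strong_implementation_exists and ctx.change_is_bounded_delta:
        return "diff-patch"
    if ctx.first_implementation_of_line or ctx.spans_multiple_stages:
        return "stepwise"
    # Assumed fallback: prefer the route that forces an intermediate plan.
    return "stepwise"
```

Note that a single failed patch sets none of the rewrite flags, so the function never escalates to a full rewrite for that reason alone.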
1329
-
1330
- ### debug-response-template.md
1331
-
1332
- # Debug Response Template
1333
-
1334
- ## Error
1335
-
1336
- What concrete error or failure occurred?
1337
-
1338
- ## Retrieved Memory
1339
-
1340
- What similar failure pattern or repair lesson should be reused before changing code?
1341
-
1342
- ## Root Cause
1343
-
1344
- What is the most likely underlying cause?
1345
-
1346
- ## Minimal Fix
1347
-
1348
- What is the smallest plausible fix?
1349
-
1350
- ## Keep Unchanged
1351
-
1352
- What parts of the line must remain unchanged for comparability and stability?
1353
-
1354
- ## Next Check
1355
-
1356
- What bounded smoke or validation check should confirm the fix?
1357
-
1358
- ## Archive Threshold
1359
-
1360
- What outcome would prove this candidate should be archived instead of debugged again?
1361
-
1362
- ### frontier-review-template.md
1363
-
1364
- # Frontier Review Template
1365
-
1366
- ## Current Frontier
1367
-
1368
- - mode:
1369
- - best branch:
1370
- - best run:
1371
- - stagnant branches:
1372
- - candidate backlog:
1373
- - fusion candidates:
1374
-
1375
- ## Evidence Summary
1376
-
1377
- - strongest support:
1378
- - strongest contradiction:
1379
- - biggest unresolved risk:
1380
-
1381
- ## Route Choice
1382
-
1383
- - explore / exploit / fusion / debug / stop:
1384
- - why this is the best next move:
1385
-
1386
- ## Active Optimize Submode
1387
-
1388
- - brief / rank / seed / loop / fusion / debug:
1389
- - why this submode is dominant now:
1390
-
1391
- ## Immediate Next Action
1392
-
1393
- - exact next step:
1394
- - what result will trigger another frontier review:
1395
- - what result would force a different mode:
1396
-
1397
- ### fusion-playbook.md
1398
-
1399
- # Fusion Playbook
1400
-
1401
- Use fusion only when:
1402
-
1403
- - at least two lines have real strengths
1404
- - the strengths are complementary
1405
- - one line alone is no longer improving fast enough
1406
-
1407
- Before fusion, write down:
1408
-
1409
- - source line A:
1410
- strongest mechanism:
1411
- strongest evidence:
1412
- main weakness:
1413
- what must survive the fusion:
1414
-
1415
- - source line B:
1416
- strongest mechanism:
1417
- strongest evidence:
1418
- main weakness:
1419
- what must survive the fusion:
1420
-
1421
- Then answer:
1422
-
1423
- - what exactly is being fused?
1424
- - why does this combination address a real bottleneck?
1425
- - why are the source strengths complementary rather than redundant?
1426
- - what remains unchanged for comparability?
1427
- - what evidence would prove the fusion was worth it?
1428
- - what bounded first validation step should run before any broad rollout?
1429
-
1430
- Do not fuse:
1431
-
1432
- - two lines with the same mechanism under different names
1433
- - two weak lines with no clear strengths
1434
- - merely because multiple branches exist
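
A rough pre-check for the fusion rules above might look like the following. It is a sketch under stated assumptions: `SourceLine` and `fusion_eligible` are hypothetical, complementarity is approximated by comparing mechanism-family labels, and "no longer improving fast enough" is reduced to a boolean. The written answers the playbook asks for still need human or agent judgment.

```python
from dataclasses import dataclass


@dataclass
class SourceLine:
    name: str
    mechanism_family: str
    has_real_strength: bool
    still_improving: bool  # assumed proxy for "improving fast enough"


def fusion_eligible(a, b):
    """Return (eligible, reasons_against) for fusing lines a and b."""
    reasons = []
    if not (a.has_real_strength and b.has_real_strength):
        reasons.append("at least one line has no clear strength")
    if a.mechanism_family == b.mechanism_family:
        reasons.append("same mechanism under different names")
    if a.still_improving and b.still_improving:
        reasons.append("each line is still improving on its own")
    return (not reasons, reasons)
```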
1435
-
1436
- ### method-brief-template.md
1437
-
1438
- # Method Brief Template
1439
-
1440
- ## Title
1441
-
1442
- One short line naming the candidate direction.
1443
-
1444
- ## Bottleneck
1445
-
1446
- What concrete bottleneck or limitation does this target?
1447
-
1448
- ## Why Current Line Is Limited
1449
-
1450
- Why is the current best line or baseline not already solving this?
1451
-
1452
- ## Mechanism
1453
-
1454
- What specific intervention or design change is proposed?
1455
-
1456
- ## Mechanism Family
1457
-
1458
- Name the family explicitly, for example `adapter`, `loss`, `architecture`, `augmentation`, `ensemble`, `retrieval`, `objective-shift`.
1459
-
1460
- ## Change Layer
1461
-
1462
- One of:
1463
-
1464
- - `Tier1`: local optimization / training detail
1465
- - `Tier2`: representation or component change
1466
- - `Tier3`: paradigm or system-level shift
1467
-
1468
- ## Source Lens
1469
-
1470
- Where did this candidate come from?
1471
-
1472
- - baseline_refinement
1473
- - orthogonal_mechanism
1474
- - failure_repair
1475
- - cross_domain_transfer
1476
- - objective_shift
1477
- - search_widening
1478
-
1479
- ## Keep Unchanged
1480
-
1481
- What must remain stable for comparability?
1482
-
1483
- ## Expected Gain
1484
-
1485
- What evidence should improve if this works?
1486
-
1487
- ## Implementation Surface
1488
-
1489
- - main files or modules likely involved:
1490
- - likely change scope: local / moderate / broad
1491
-
1492
- ## Risks
1493
-
1494
- - Main failure mode
1495
- - Comparability risk
1496
- - Implementation risk
1497
-
1498
- ## Foundation
1499
-
1500
- - Source branch / run / baseline:
1501
- - Why this foundation is the right starting point:
1502
-
1503
- ## Promote Now
1504
-
1505
- - yes / no
1506
- - why:
1507
-
1508
- ## Next Target
1509
-
1510
- Usually `optimize` or `experiment`.
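
If briefs are stored as structured records, the enumerated fields in this template can be checked automatically. The sketch below assumes a brief is a plain dictionary with snake_case keys; `validate_brief` and the choice of required fields are hypothetical, while the allowed values for change layer and source lens are taken from the template.

```python
CHANGE_LAYERS = {"Tier1", "Tier2", "Tier3"}
SOURCE_LENSES = {
    "baseline_refinement", "orthogonal_mechanism", "failure_repair",
    "cross_domain_transfer", "objective_shift", "search_widening",
}
# Assumed minimum set of fields; the template itself marks none as optional.
REQUIRED_FIELDS = (
    "title", "bottleneck", "mechanism", "mechanism_family",
    "change_layer", "source_lens", "keep_unchanged",
)


def validate_brief(brief):
    """Return a list of problems; an empty list means the brief passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if not brief.get(name)]
    layer = brief.get("change_layer")
    if layer and layer not in CHANGE_LAYERS:
        problems.append(f"unknown change layer: {layer}")
    lens = brief.get("source_lens")
    if lens and lens not in SOURCE_LENSES:
        problems.append(f"unknown source lens: {lens}")
    return problems
```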
1511
-
1512
- ### optimization-memory-template.md
1513
-
1514
- # Optimization Memory Template
1515
-
1516
- ## Type
1517
-
1518
- - success pattern / failure pattern / fusion lesson
1519
-
1520
- ## Context
1521
-
1522
- - task:
1523
- - branch or idea:
1524
- - candidate id:
1525
- - strategy:
1526
-
1527
- ## Observation
1528
-
1529
- What actually happened?
1530
-
1531
- ## Why It Matters
1532
-
1533
- Why should a later optimization pass retrieve this?
1534
-
1535
- ## Retrieval Hint
1536
-
1537
- - query keywords:
1538
- - closest line or mechanism family:
1539
- - when this should be recalled first:
1540
-
1541
- ## Reuse Hint
1542
-
1543
- When should this lesson be reused, and when should it be avoided?
1544
-
1545
- ### optimize-checklist-template.md
1546
-
1547
- # OPTIMIZE_CHECKLIST.md
1548
-
1549
- - [ ] Read `artifact.get_optimization_frontier(...)` or an equivalent durable frontier summary
1550
- - [ ] Select the primary optimize submode: `brief`, `rank`, `seed`, `loop`, `fusion`, or `debug`
1551
- - [ ] Confirm whether the current pass is `explore`, `exploit`, `fusion`, `debug`, or `stop`
1552
- - [ ] Review recent optimization memory before generating new candidates
1553
- - [ ] Check whether the current brief slate covers more than one mechanism family
1554
- - [ ] Candidate briefs updated or confirmed
1555
- - [ ] Candidate ranking updated
1556
- - [ ] Promote only the strongest brief(s) into durable line(s) if justified
1557
- - [ ] Current implementation candidate pool recorded
1558
- - [ ] Smoke queue defined
1559
- - [ ] Full-eval queue defined
1560
- - [ ] Recent failures classified and either debugged or archived
1561
- - [ ] Stagnation check performed
1562
- - [ ] Family-shift trigger checked
1563
- - [ ] Fusion eligibility checked
1564
- - [ ] Next concrete action written
1565
-
1566
- ### plateau-response-playbook.md
1567
-
1568
- # Plateau Response Playbook
1569
-
1570
- Use this when one line keeps producing non-improving results.
1571
-
1572
- ## Plateau indicators
1573
-
1574
- - repeated non-improving results on the same line
1575
- - repeated "small tweak" proposals with no structural change
1576
- - candidate queue filled with near-duplicate mechanisms
1577
-
1578
- ## Required response
1579
-
1580
- 1. state that the line is plateauing
1581
- 2. identify the most likely root cause of the plateau
1582
- 3. choose one of:
1583
- - widen search
1584
- - promote a stronger alternative
1585
- - fuse with another line
1586
- - debug a strategically valuable blocked candidate
1587
- - stop the line
1588
- 4. record one explicit non-repeat rule so the next pass does not retry the same low-information move
1589
-
1590
- ## Do not do
1591
-
1592
- - keep proposing near-identical local tweaks
1593
- - rerun the same unchanged candidate
1594
- - fuse without a clear complementary mechanism
1595
- - hide a plateau under a sequence of tiny "one more tweak" edits
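
The first plateau indicator, repeated non-improving results on the same line, can be approximated numerically. The sketch below assumes a higher-is-better score history; `is_plateauing`, `window`, and `min_delta` are hypothetical names and defaults, not values defined by the playbook.

```python
def is_plateauing(scores, window=3, min_delta=0.0):
    """True when none of the last `window` results beats the best
    earlier score by more than `min_delta` (higher is better)."""
    if len(scores) <= window:
        return False  # not enough history to judge
    best_before = max(scores[:-window])
    return all(s <= best_before + min_delta for s in scores[-window:])
```

A positive result only covers step 1 of the required response; naming the root cause and choosing the next move remain separate decisions.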
1596
-
1597
- ### prompt-patterns.md
1598
-
1599
- # Optimization Prompt Patterns
1600
-
1601
- These prompt structures are worth preserving across optimize subroutines.
1602
-
1603
- ## Common skeleton
1604
-
1605
- - Introduction
1606
- - Task description
1607
- - Memory
1608
- - Previous solution or previous line
1609
- - Instructions
1610
- - `assistant_prefix` when a stable response lead-in reduces drift
1611
- - Explicit response format
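
One way to keep this skeleton stable across subroutines is to assemble prompts from named sections in a fixed order. This is a minimal sketch: `build_prompt`, the `##` heading rendering, and returning the `assistant_prefix` separately from the prompt body are all assumptions about how a caller might use the skeleton.

```python
SKELETON_ORDER = [
    "Introduction",
    "Task description",
    "Memory",
    "Previous solution or previous line",
    "Instructions",
    "Explicit response format",
]


def build_prompt(sections, assistant_prefix=None):
    """Join the provided sections in skeleton order, skipping empty ones.

    Returns (prompt_text, assistant_prefix); the prefix is kept separate
    so the caller can seed the response with it.
    """
    parts = []
    for title in SKELETON_ORDER:
        body = sections.get(title)
        if body:
            parts.append(f"## {title}\n{body.strip()}")
    return "\n\n".join(parts), (assistant_prefix or "")
```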
1612
-
1613
- ## Common reasoning contract
1614
-
1615
- - WHAT is changing?
1616
- - WHY is the current line limited?
1617
- - HOW should the change address the limitation?
1618
- - KEEP UNCHANGED: what must remain stable for comparability?
1619
- - NEXT ACTION: what concrete step follows this prompt?
1620
-
1621
- ## Plateau pattern
1622
-
1623
- When the line is stagnating:
1624
-
1625
- - explicitly state that the current approach has plateaued
1626
- - forbid trivial hyperparameter-only tweaks when a deeper change is needed
1627
- - require a larger representational or architectural shift
1628
-
1629
- ## Fusion pattern
1630
-
1631
- When combining lines:
1632
-
1633
- - identify the real strength of each source line
1634
- - explain why those strengths are complementary
1635
- - avoid combining everything
1636
- - preserve the comparison surface
1637
-
1638
- ## Debug pattern
1639
-
1640
- For debugging:
1641
-
1642
- - restate the concrete error
1643
- - state the likely root cause
1644
- - require the minimal targeted fix
1645
- - preserve the original solution intent unless the bug proves the design invalid