@researai/deepscientist 1.5.16 → 1.6.0
This diff shows the changes between the two publicly released package versions as they appear in their public registries. It is provided for informational purposes only.
- package/AGENTS.md +309 -130
- package/AISB/catalog/aisb.b1.agentic_coding.yaml +244 -0
- package/AISB/catalog/aisb.b10.climate_earth.yaml +235 -0
- package/AISB/catalog/aisb.b11.model_efficiency.yaml +231 -0
- package/AISB/catalog/aisb.b12.embodied_ai.yaml +238 -0
- package/AISB/catalog/aisb.b2.agent_systems.yaml +229 -0
- package/AISB/catalog/aisb.b3.self_evolving_rl.yaml +237 -0
- package/AISB/catalog/aisb.b4.lm_reasoning.yaml +240 -0
- package/AISB/catalog/aisb.b5.math_proof.yaml +235 -0
- package/AISB/catalog/aisb.b6.research_process.yaml +243 -0
- package/AISB/catalog/aisb.b7.multimodal_fusion.yaml +232 -0
- package/AISB/catalog/aisb.b8.lifesci_drug.yaml +275 -0
- package/AISB/catalog/aisb.b9.material_science.yaml +237 -0
- package/AISB/catalog/aisb.t3.001_savvy.yaml +159 -0
- package/AISB/catalog/aisb.t3.001_savvy.zh.yaml +121 -0
- package/AISB/catalog/aisb.t3.002_pinet.yaml +189 -0
- package/AISB/catalog/aisb.t3.002_pinet.zh.yaml +130 -0
- package/AISB/catalog/aisb.t3.004_decentralattn.yaml +184 -0
- package/AISB/catalog/aisb.t3.004_decentralattn.zh.yaml +153 -0
- package/AISB/catalog/aisb.t3.005_tsae.yaml +193 -0
- package/AISB/catalog/aisb.t3.005_tsae.zh.yaml +139 -0
- package/AISB/catalog/aisb.t3.006_physense.yaml +194 -0
- package/AISB/catalog/aisb.t3.006_physense.zh.yaml +118 -0
- package/AISB/catalog/aisb.t3.007_reasoningiqa.yaml +169 -0
- package/AISB/catalog/aisb.t3.007_reasoningiqa.zh.yaml +133 -0
- package/AISB/catalog/aisb.t3.008_meanflows.yaml +188 -0
- package/AISB/catalog/aisb.t3.008_meanflows.zh.yaml +140 -0
- package/AISB/catalog/aisb.t3.009_scoremissing.yaml +179 -0
- package/AISB/catalog/aisb.t3.009_scoremissing.zh.yaml +119 -0
- package/AISB/catalog/aisb.t3.010_suitabilityfilter.yaml +221 -0
- package/AISB/catalog/aisb.t3.010_suitabilityfilter.zh.yaml +141 -0
- package/AISB/catalog/aisb.t3.011_osd.yaml +206 -0
- package/AISB/catalog/aisb.t3.011_osd.zh.yaml +163 -0
- package/AISB/catalog/aisb.t3.012_efficientqat.yaml +206 -0
- package/AISB/catalog/aisb.t3.012_efficientqat.zh.yaml +159 -0
- package/AISB/catalog/aisb.t3.013_appl.yaml +152 -0
- package/AISB/catalog/aisb.t3.013_appl.zh.yaml +126 -0
- package/AISB/catalog/aisb.t3.014_piguard.yaml +207 -0
- package/AISB/catalog/aisb.t3.014_piguard.zh.yaml +164 -0
- package/AISB/catalog/aisb.t3.015_frspec.yaml +209 -0
- package/AISB/catalog/aisb.t3.015_frspec.zh.yaml +163 -0
- package/AISB/catalog/aisb.t3.016_mathfusion.yaml +166 -0
- package/AISB/catalog/aisb.t3.016_mathfusion.zh.yaml +145 -0
- package/AISB/catalog/aisb.t3.017_multimodalglp.yaml +171 -0
- package/AISB/catalog/aisb.t3.017_multimodalglp.zh.yaml +122 -0
- package/AISB/catalog/aisb.t3.018_cotsynth.yaml +206 -0
- package/AISB/catalog/aisb.t3.018_cotsynth.zh.yaml +162 -0
- package/AISB/catalog/aisb.t3.019_dyscaleut.yaml +211 -0
- package/AISB/catalog/aisb.t3.019_dyscaleut.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.020_aristotle.yaml +173 -0
- package/AISB/catalog/aisb.t3.020_aristotle.zh.yaml +119 -0
- package/AISB/catalog/aisb.t3.021_tokenrecycling.yaml +160 -0
- package/AISB/catalog/aisb.t3.021_tokenrecycling.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.022_chainofreasoning.yaml +204 -0
- package/AISB/catalog/aisb.t3.022_chainofreasoning.zh.yaml +161 -0
- package/AISB/catalog/aisb.t3.023_guidedembed.yaml +211 -0
- package/AISB/catalog/aisb.t3.023_guidedembed.zh.yaml +189 -0
- package/AISB/catalog/aisb.t3.024_outputcentric.yaml +148 -0
- package/AISB/catalog/aisb.t3.024_outputcentric.zh.yaml +131 -0
- package/AISB/catalog/aisb.t3.025_deeper.yaml +143 -0
- package/AISB/catalog/aisb.t3.025_deeper.zh.yaml +116 -0
- package/AISB/catalog/aisb.t3.026_gartkg.yaml +195 -0
- package/AISB/catalog/aisb.t3.026_gartkg.zh.yaml +127 -0
- package/AISB/catalog/aisb.t3.027_citeeval.yaml +182 -0
- package/AISB/catalog/aisb.t3.027_citeeval.zh.yaml +135 -0
- package/AISB/catalog/aisb.t3.028_sbam.yaml +206 -0
- package/AISB/catalog/aisb.t3.028_sbam.zh.yaml +166 -0
- package/AISB/catalog/aisb.t3.029_cdqgeoembed.yaml +224 -0
- package/AISB/catalog/aisb.t3.029_cdqgeoembed.zh.yaml +142 -0
- package/AISB/catalog/aisb.t3.030_processrm.yaml +211 -0
- package/AISB/catalog/aisb.t3.030_processrm.zh.yaml +166 -0
- package/AISB/catalog/aisb.t3.031_circuitstability.yaml +172 -0
- package/AISB/catalog/aisb.t3.031_circuitstability.zh.yaml +134 -0
- package/AISB/catalog/aisb.t3.032_ptsolver.yaml +169 -0
- package/AISB/catalog/aisb.t3.032_ptsolver.zh.yaml +135 -0
- package/AISB/catalog/aisb.t3.033_gcse.yaml +144 -0
- package/AISB/catalog/aisb.t3.033_gcse.zh.yaml +126 -0
- package/AISB/catalog/aisb.t3.034_ensemblewm.yaml +183 -0
- package/AISB/catalog/aisb.t3.034_ensemblewm.zh.yaml +146 -0
- package/AISB/catalog/aisb.t3.035_moralvalueswa.yaml +207 -0
- package/AISB/catalog/aisb.t3.035_moralvalueswa.zh.yaml +165 -0
- package/AISB/catalog/aisb.t3.036_weakstrongpref.yaml +210 -0
- package/AISB/catalog/aisb.t3.036_weakstrongpref.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.037_dementiamask.yaml +172 -0
- package/AISB/catalog/aisb.t3.037_dementiamask.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.038_tinysam.yaml +284 -0
- package/AISB/catalog/aisb.t3.038_tinysam.zh.yaml +240 -0
- package/AISB/catalog/aisb.t3.039_calf.yaml +224 -0
- package/AISB/catalog/aisb.t3.039_calf.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.040_graniteguardian.yaml +199 -0
- package/AISB/catalog/aisb.t3.040_graniteguardian.zh.yaml +174 -0
- package/AISB/catalog/aisb.t3.041_amdm.yaml +149 -0
- package/AISB/catalog/aisb.t3.041_amdm.zh.yaml +137 -0
- package/AISB/catalog/aisb.t3.042_xpatch.yaml +216 -0
- package/AISB/catalog/aisb.t3.042_xpatch.zh.yaml +182 -0
- package/AISB/catalog/aisb.t3.043_vhm.yaml +268 -0
- package/AISB/catalog/aisb.t3.043_vhm.zh.yaml +193 -0
- package/AISB/catalog/aisb.t3.044_rgvi.yaml +224 -0
- package/AISB/catalog/aisb.t3.044_rgvi.zh.yaml +176 -0
- package/AISB/catalog/aisb.t3.045_pslstm.yaml +203 -0
- package/AISB/catalog/aisb.t3.045_pslstm.zh.yaml +179 -0
- package/AISB/catalog/aisb.t3.046_nonstatts.yaml +208 -0
- package/AISB/catalog/aisb.t3.046_nonstatts.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.047_timepfn.yaml +156 -0
- package/AISB/catalog/aisb.t3.047_timepfn.zh.yaml +124 -0
- package/AISB/catalog/aisb.t3.048_proxyspex.yaml +148 -0
- package/AISB/catalog/aisb.t3.048_proxyspex.zh.yaml +125 -0
- package/AISB/catalog/aisb.t3.049_hogwildinference.yaml +183 -0
- package/AISB/catalog/aisb.t3.049_hogwildinference.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.050_causalpfn.yaml +214 -0
- package/AISB/catalog/aisb.t3.050_causalpfn.zh.yaml +190 -0
- package/AISB/catalog/aisb.t3.051_flashtp.yaml +169 -0
- package/AISB/catalog/aisb.t3.051_flashtp.zh.yaml +124 -0
- package/AISB/catalog/aisb.t3.052_nsdiff.yaml +155 -0
- package/AISB/catalog/aisb.t3.052_nsdiff.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.053_k2vae.yaml +158 -0
- package/AISB/catalog/aisb.t3.053_k2vae.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.054_timebase.yaml +178 -0
- package/AISB/catalog/aisb.t3.054_timebase.zh.yaml +158 -0
- package/AISB/catalog/aisb.t3.055_csbrain.yaml +238 -0
- package/AISB/catalog/aisb.t3.055_csbrain.zh.yaml +184 -0
- package/AISB/catalog/aisb.t3.056_infosam.yaml +224 -0
- package/AISB/catalog/aisb.t3.056_infosam.zh.yaml +189 -0
- package/AISB/catalog/aisb.t3.057_mdreid.yaml +129 -0
- package/AISB/catalog/aisb.t3.057_mdreid.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.058_mindglitch.yaml +171 -0
- package/AISB/catalog/aisb.t3.058_mindglitch.zh.yaml +145 -0
- package/AISB/catalog/aisb.t3.059_selfsupervised.yaml +154 -0
- package/AISB/catalog/aisb.t3.059_selfsupervised.zh.yaml +125 -0
- package/AISB/catalog/aisb.t3.060_iaggad.yaml +121 -0
- package/AISB/catalog/aisb.t3.060_iaggad.zh.yaml +100 -0
- package/AISB/catalog/aisb.t3.061_hsgkn.yaml +136 -0
- package/AISB/catalog/aisb.t3.061_hsgkn.zh.yaml +113 -0
- package/AISB/catalog/aisb.t3.062_visionts.yaml +237 -0
- package/AISB/catalog/aisb.t3.062_visionts.zh.yaml +216 -0
- package/AISB/catalog/aisb.t3.063_tsrag.yaml +162 -0
- package/AISB/catalog/aisb.t3.063_tsrag.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.064_pir.yaml +221 -0
- package/AISB/catalog/aisb.t3.064_pir.zh.yaml +197 -0
- package/AISB/catalog/aisb.t3.065_proteinbinding.yaml +234 -0
- package/AISB/catalog/aisb.t3.065_proteinbinding.zh.yaml +167 -0
- package/AISB/catalog/aisb.t3.066_tropicalattention.yaml +267 -0
- package/AISB/catalog/aisb.t3.066_tropicalattention.zh.yaml +229 -0
- package/AISB/catalog/aisb.t3.067_kanad.yaml +193 -0
- package/AISB/catalog/aisb.t3.067_kanad.zh.yaml +167 -0
- package/AISB/catalog/aisb.t3.068_sempo.yaml +187 -0
- package/AISB/catalog/aisb.t3.068_sempo.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.069_treehfd.yaml +129 -0
- package/AISB/catalog/aisb.t3.069_treehfd.zh.yaml +111 -0
- package/AISB/catalog/aisb.t3.070_certifiedunlearning.yaml +224 -0
- package/AISB/catalog/aisb.t3.070_certifiedunlearning.zh.yaml +171 -0
- package/AISB/catalog/aisb.t3.071_neuralmjd.yaml +142 -0
- package/AISB/catalog/aisb.t3.071_neuralmjd.zh.yaml +120 -0
- package/AISB/catalog/aisb.t3.072_fedgmt.yaml +181 -0
- package/AISB/catalog/aisb.t3.072_fedgmt.zh.yaml +158 -0
- package/AISB/catalog/aisb.t3.073_rld.yaml +161 -0
- package/AISB/catalog/aisb.t3.073_rld.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.074_lsvi.yaml +163 -0
- package/AISB/catalog/aisb.t3.074_lsvi.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.075_treeslicedentropy.yaml +201 -0
- package/AISB/catalog/aisb.t3.075_treeslicedentropy.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.076_aanet.yaml +169 -0
- package/AISB/catalog/aisb.t3.076_aanet.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.077_cmnn.yaml +199 -0
- package/AISB/catalog/aisb.t3.077_cmnn.zh.yaml +165 -0
- package/AISB/catalog/aisb.t3.078_conformalanomaly.yaml +146 -0
- package/AISB/catalog/aisb.t3.078_conformalanomaly.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.079_dpfkmeans.yaml +131 -0
- package/AISB/catalog/aisb.t3.079_dpfkmeans.zh.yaml +104 -0
- package/AISB/catalog/aisb.t3.080_latentscorereweight.yaml +169 -0
- package/AISB/catalog/aisb.t3.080_latentscorereweight.zh.yaml +123 -0
- package/AISB/catalog/aisb.t3.081_qmamba.yaml +150 -0
- package/AISB/catalog/aisb.t3.081_qmamba.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.082_onlinellmrouting.yaml +160 -0
- package/AISB/catalog/aisb.t3.082_onlinellmrouting.zh.yaml +133 -0
- package/AISB/catalog/aisb.t3.083_starformer.yaml +178 -0
- package/AISB/catalog/aisb.t3.083_starformer.zh.yaml +140 -0
- package/AISB/catalog/aisb.t3.084_ift.yaml +139 -0
- package/AISB/catalog/aisb.t3.084_ift.zh.yaml +111 -0
- package/AISB/catalog/aisb.t3.085_neuralsurv.yaml +183 -0
- package/AISB/catalog/aisb.t3.085_neuralsurv.zh.yaml +143 -0
- package/AISB/catalog/aisb.t3.086_stella.yaml +197 -0
- package/AISB/catalog/aisb.t3.086_stella.zh.yaml +142 -0
- package/AISB/catalog/aisb.t3.087_moses.yaml +167 -0
- package/AISB/catalog/aisb.t3.087_moses.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.088_channelnorm.yaml +140 -0
- package/AISB/catalog/aisb.t3.088_channelnorm.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.089_causalvelocity.yaml +730 -0
- package/AISB/catalog/aisb.t3.089_causalvelocity.zh.yaml +668 -0
- package/AISB/catalog/aisb.t3.090_rstib.yaml +144 -0
- package/AISB/catalog/aisb.t3.090_rstib.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.091_timeawarecausal.yaml +132 -0
- package/AISB/catalog/aisb.t3.091_timeawarecausal.zh.yaml +107 -0
- package/AISB/catalog/aisb.t3.092_kmeanslocalopt.yaml +138 -0
- package/AISB/catalog/aisb.t3.092_kmeanslocalopt.zh.yaml +110 -0
- package/AISB/catalog/aisb.t3.093_fedwmsam.yaml +134 -0
- package/AISB/catalog/aisb.t3.093_fedwmsam.zh.yaml +106 -0
- package/AISB/catalog/aisb.t3.094_boundre.yaml +147 -0
- package/AISB/catalog/aisb.t3.094_boundre.zh.yaml +114 -0
- package/AISB/catalog/aisb.t3.095_fastfeaturecp.yaml +153 -0
- package/AISB/catalog/aisb.t3.095_fastfeaturecp.zh.yaml +118 -0
- package/AISB/catalog/aisb.t3.096_m3svm.yaml +189 -0
- package/AISB/catalog/aisb.t3.096_m3svm.zh.yaml +149 -0
- package/AISB/catalog/aisb.t3.097_wassersteintl.yaml +212 -0
- package/AISB/catalog/aisb.t3.097_wassersteintl.zh.yaml +169 -0
- package/AISB/catalog/aisb.t3.098_xmahalanobis.yaml +171 -0
- package/AISB/catalog/aisb.t3.098_xmahalanobis.zh.yaml +127 -0
- package/AISB/catalog/aisb.t3.099_ollalanding.yaml +248 -0
- package/AISB/catalog/aisb.t3.099_ollalanding.zh.yaml +182 -0
- package/AISB/catalog/aisb.t3.100_invmissingdata.yaml +179 -0
- package/AISB/catalog/aisb.t3.100_invmissingdata.zh.yaml +150 -0
- package/AISB/catalog/aisb.t3.101_acia.yaml +164 -0
- package/AISB/catalog/aisb.t3.101_acia.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.102_stochasticff.yaml +178 -0
- package/AISB/catalog/aisb.t3.102_stochasticff.zh.yaml +130 -0
- package/AISB/catalog/aisb.t3.103_qdcp.yaml +150 -0
- package/AISB/catalog/aisb.t3.103_qdcp.zh.yaml +116 -0
- package/AISB/catalog/aisb.t3.104_balancedactiveinf.yaml +137 -0
- package/AISB/catalog/aisb.t3.104_balancedactiveinf.zh.yaml +104 -0
- package/AISB/catalog/aisb.t3.105_binaryclasseval.yaml +161 -0
- package/AISB/catalog/aisb.t3.105_binaryclasseval.zh.yaml +130 -0
- package/AISB/image/001_aisb.t3.001_savvy.jpg +0 -0
- package/AISB/image/002_aisb.t3.002_pinet.jpg +0 -0
- package/AISB/image/003_aisb.t3.003_dmsqd.jpg +0 -0
- package/AISB/image/004_aisb.t3.004_decentralattn.jpg +0 -0
- package/AISB/image/005_aisb.t3.005_tsae.jpg +0 -0
- package/AISB/image/006_aisb.t3.006_physense.jpg +0 -0
- package/AISB/image/007_aisb.t3.007_reasoningiqa.jpg +0 -0
- package/AISB/image/008_aisb.t3.008_meanflows.jpg +0 -0
- package/AISB/image/009_aisb.t3.009_scoremissing.jpg +0 -0
- package/AISB/image/010_aisb.t3.010_suitabilityfilter.jpg +0 -0
- package/AISB/image/011_aisb.t3.011_osd.jpg +0 -0
- package/AISB/image/012_aisb.t3.012_efficientqat.jpg +0 -0
- package/AISB/image/013_aisb.t3.013_appl.jpg +0 -0
- package/AISB/image/014_aisb.t3.014_piguard.jpg +0 -0
- package/AISB/image/015_aisb.t3.015_frspec.jpg +0 -0
- package/AISB/image/016_aisb.t3.016_mathfusion.jpg +0 -0
- package/AISB/image/017_aisb.t3.017_multimodalglp.jpg +0 -0
- package/AISB/image/018_aisb.t3.018_cotsynth.jpg +0 -0
- package/AISB/image/019_aisb.t3.019_dyscaleut.jpg +0 -0
- package/AISB/image/020_aisb.t3.020_aristotle.jpg +0 -0
- package/AISB/image/021_aisb.t3.021_tokenrecycling.jpg +0 -0
- package/AISB/image/022_aisb.t3.022_chainofreasoning.jpg +0 -0
- package/AISB/image/023_aisb.t3.023_guidedembed.jpg +0 -0
- package/AISB/image/024_aisb.t3.024_outputcentric.jpg +0 -0
- package/AISB/image/025_aisb.t3.025_deeper.jpg +0 -0
- package/AISB/image/026_aisb.t3.026_gartkg.jpg +0 -0
- package/AISB/image/027_aisb.t3.027_citeeval.jpg +0 -0
- package/AISB/image/028_aisb.t3.028_sbam.jpg +0 -0
- package/AISB/image/029_aisb.t3.029_cdqgeoembed.jpg +0 -0
- package/AISB/image/030_aisb.t3.030_processrm.jpg +0 -0
- package/AISB/image/031_aisb.t3.031_circuitstability.jpg +0 -0
- package/AISB/image/032_aisb.t3.032_ptsolver.jpg +0 -0
- package/AISB/image/033_aisb.t3.033_gcse.jpg +0 -0
- package/AISB/image/034_aisb.t3.034_ensemblewm.jpg +0 -0
- package/AISB/image/035_aisb.t3.035_moralvalueswa.jpg +0 -0
- package/AISB/image/036_aisb.t3.036_weakstrongpref.jpg +0 -0
- package/AISB/image/037_aisb.t3.037_dementiamask.jpg +0 -0
- package/AISB/image/038_aisb.t3.038_tinysam.jpg +0 -0
- package/AISB/image/039_aisb.t3.039_calf.jpg +0 -0
- package/AISB/image/040_aisb.t3.040_graniteguardian.jpg +0 -0
- package/AISB/image/041_aisb.t3.041_amdm.jpg +0 -0
- package/AISB/image/042_aisb.t3.042_xpatch.jpg +0 -0
- package/AISB/image/043_aisb.t3.043_vhm.jpg +0 -0
- package/AISB/image/044_aisb.t3.044_rgvi.jpg +0 -0
- package/AISB/image/045_aisb.t3.045_pslstm.jpg +0 -0
- package/AISB/image/046_aisb.t3.046_nonstatts.jpg +0 -0
- package/AISB/image/047_aisb.t3.047_timepfn.jpg +0 -0
- package/AISB/image/048_aisb.t3.048_proxyspex.jpg +0 -0
- package/AISB/image/049_aisb.t3.049_hogwildinference.jpg +0 -0
- package/AISB/image/050_aisb.t3.050_causalpfn.jpg +0 -0
- package/AISB/image/051_aisb.t3.051_flashtp.jpg +0 -0
- package/AISB/image/052_aisb.t3.052_nsdiff.jpg +0 -0
- package/AISB/image/053_aisb.t3.053_k2vae.jpg +0 -0
- package/AISB/image/054_aisb.t3.054_timebase.jpg +0 -0
- package/AISB/image/055_aisb.t3.055_csbrain.jpg +0 -0
- package/AISB/image/056_aisb.t3.056_infosam.jpg +0 -0
- package/AISB/image/057_aisb.t3.057_mdreid.jpg +0 -0
- package/AISB/image/058_aisb.t3.058_mindglitch.jpg +0 -0
- package/AISB/image/059_aisb.t3.059_selfsupervised.jpg +0 -0
- package/AISB/image/060_aisb.t3.060_iaggad.jpg +0 -0
- package/AISB/image/061_aisb.t3.061_hsgkn.jpg +0 -0
- package/AISB/image/062_aisb.t3.062_visionts.jpg +0 -0
- package/AISB/image/063_aisb.t3.063_tsrag.jpg +0 -0
- package/AISB/image/064_aisb.t3.064_pir.jpg +0 -0
- package/AISB/image/065_aisb.t3.065_proteinbinding.jpg +0 -0
- package/AISB/image/066_aisb.t3.066_tropicalattention.jpg +0 -0
- package/AISB/image/067_aisb.t3.067_kanad.jpg +0 -0
- package/AISB/image/068_aisb.t3.068_sempo.jpg +0 -0
- package/AISB/image/069_aisb.t3.069_treehfd.jpg +0 -0
- package/AISB/image/070_aisb.t3.070_certifiedunlearning.jpg +0 -0
- package/AISB/image/071_aisb.t3.071_neuralmjd.jpg +0 -0
- package/AISB/image/072_aisb.t3.072_fedgmt.jpg +0 -0
- package/AISB/image/073_aisb.t3.073_rld.jpg +0 -0
- package/AISB/image/074_aisb.t3.074_lsvi.jpg +0 -0
- package/AISB/image/075_aisb.t3.075_treeslicedentropy.jpg +0 -0
- package/AISB/image/076_aisb.t3.076_aanet.jpg +0 -0
- package/AISB/image/077_aisb.t3.077_cmnn.jpg +0 -0
- package/AISB/image/078_aisb.t3.078_conformalanomaly.jpg +0 -0
- package/AISB/image/079_aisb.t3.079_dpfkmeans.jpg +0 -0
- package/AISB/image/080_aisb.t3.080_latentscorereweight.jpg +0 -0
- package/AISB/image/081_aisb.t3.081_qmamba.jpg +0 -0
- package/AISB/image/082_aisb.t3.082_onlinellmrouting.jpg +0 -0
- package/AISB/image/083_aisb.t3.083_starformer.jpg +0 -0
- package/AISB/image/084_aisb.t3.084_ift.jpg +0 -0
- package/AISB/image/085_aisb.t3.085_neuralsurv.jpg +0 -0
- package/AISB/image/086_aisb.t3.086_stella.jpg +0 -0
- package/AISB/image/087_aisb.t3.087_moses.jpg +0 -0
- package/AISB/image/088_aisb.t3.088_channelnorm.jpg +0 -0
- package/AISB/image/089_aisb.t3.089_causalvelocity.jpg +0 -0
- package/AISB/image/090_aisb.t3.090_rstib.jpg +0 -0
- package/AISB/image/091_aisb.t3.091_timeawarecausal.jpg +0 -0
- package/AISB/image/092_aisb.t3.092_kmeanslocalopt.jpg +0 -0
- package/AISB/image/093_aisb.t3.093_fedwmsam.jpg +0 -0
- package/AISB/image/094_aisb.t3.094_boundre.jpg +0 -0
- package/AISB/image/095_aisb.t3.095_fastfeaturecp.jpg +0 -0
- package/AISB/image/096_aisb.t3.096_m3svm.jpg +0 -0
- package/AISB/image/097_aisb.t3.097_wassersteintl.jpg +0 -0
- package/AISB/image/098_aisb.t3.098_xmahalanobis.jpg +0 -0
- package/AISB/image/099_aisb.t3.099_ollalanding.jpg +0 -0
- package/AISB/image/100_aisb.t3.100_invmissingdata.jpg +0 -0
- package/AISB/image/101_aisb.t3.101_acia.jpg +0 -0
- package/AISB/image/102_aisb.t3.102_stochasticff.jpg +0 -0
- package/AISB/image/103_aisb.t3.103_qdcp.jpg +0 -0
- package/AISB/image/104_aisb.t3.104_balancedactiveinf.jpg +0 -0
- package/AISB/image/105_aisb.t3.105_binaryclasseval.jpg +0 -0
- package/AISB/image/106_aisb.t1.reasoning_lite.jpg +0 -0
- package/AISB/image/107_aisb.t2.paper_audit.jpg +0 -0
- package/AISB/image/108_aisb.t3.multi_gpu_search.jpg +0 -0
- package/AISB/image/109_aisb.t3.tdc_admet.jpg +0 -0
- package/AISB/image/aisb.b1.agentic_coding.svg +16 -0
- package/AISB/image/aisb.b10.climate_earth.svg +16 -0
- package/AISB/image/aisb.b11.model_efficiency.svg +16 -0
- package/AISB/image/aisb.b12.embodied_ai.svg +16 -0
- package/AISB/image/aisb.b2.agent_systems.svg +16 -0
- package/AISB/image/aisb.b3.self_evolving_rl.svg +16 -0
- package/AISB/image/aisb.b4.lm_reasoning.svg +16 -0
- package/AISB/image/aisb.b5.math_proof.svg +16 -0
- package/AISB/image/aisb.b6.research_process.svg +16 -0
- package/AISB/image/aisb.b7.multimodal_fusion.svg +16 -0
- package/AISB/image/aisb.b8.lifesci_drug.svg +16 -0
- package/AISB/image/aisb.b9.material_science.svg +16 -0
- package/README.md +196 -32
- package/bin/ds.js +924 -66
- package/docs/en/00_QUICK_START.md +195 -18
- package/docs/en/01_SETTINGS_REFERENCE.md +468 -96
- package/docs/en/02_START_RESEARCH_GUIDE.md +26 -5
- package/docs/en/03_QQ_CONNECTOR_GUIDE.md +14 -3
- package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +2 -0
- package/docs/en/05_TUI_GUIDE.md +171 -2
- package/docs/en/07_MEMORY_AND_MCP.md +38 -2
- package/docs/en/09_DOCTOR.md +78 -7
- package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +38 -1
- package/docs/en/11_LICENSE_AND_RISK.md +4 -0
- package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +15 -0
- package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
- package/docs/en/15_CODEX_PROVIDER_SETUP.md +624 -180
- package/docs/en/16_TELEGRAM_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/17_WHATSAPP_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/18_FEISHU_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +386 -0
- package/docs/en/22_BENCHSTORE_YAML_REFERENCE.md +469 -0
- package/docs/en/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +316 -0
- package/docs/en/24_CLAUDE_CODE_PROVIDER_SETUP.md +469 -0
- package/docs/en/25_OPENCODE_PROVIDER_SETUP.md +653 -0
- package/docs/en/26_CITATION_AND_ATTRIBUTION.md +119 -0
- package/docs/en/27_KIMI_CODE_PROVIDER_SETUP.md +180 -0
- package/docs/en/28_DISCORD_CONNECTOR_GUIDE.md +61 -0
- package/docs/en/29_SLACK_CONNECTOR_GUIDE.md +60 -0
- package/docs/en/30_SETTINGS_CONTROL_CENTER_GUIDE.md +371 -0
- package/docs/en/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
- package/docs/en/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +273 -0
- package/docs/en/33_WORKSPACE_EXPLORER_QA.md +121 -0
- package/docs/en/91_DEVELOPMENT.md +266 -0
- package/docs/en/99_ACKNOWLEDGEMENTS.md +24 -19
- package/docs/en/README.md +48 -7
- package/docs/images/admin/admin-connectors-health-en.png +0 -0
- package/docs/images/admin/admin-controllers-en.png +0 -0
- package/docs/images/admin/admin-diagnostics-en.png +0 -0
- package/docs/images/admin/admin-errors-en.png +0 -0
- package/docs/images/admin/admin-issues-en.png +0 -0
- package/docs/images/admin/admin-logs-en.png +0 -0
- package/docs/images/admin/admin-quest-detail-en.png +0 -0
- package/docs/images/admin/admin-quests-en.png +0 -0
- package/docs/images/admin/admin-repairs-en.png +0 -0
- package/docs/images/admin/admin-runtime-en.png +0 -0
- package/docs/images/admin/admin-search-en.png +0 -0
- package/docs/images/admin/admin-stats-en.png +0 -0
- package/docs/images/admin/admin-summary-en.png +0 -0
- package/docs/images/connectors/connector-discord-en.png +0 -0
- package/docs/images/connectors/connector-feishu-en.png +0 -0
- package/docs/images/connectors/connector-lingzhu-en.png +0 -0
- package/docs/images/connectors/connector-qq-en.png +0 -0
- package/docs/images/connectors/connector-slack-en.png +0 -0
- package/docs/images/connectors/connector-telegram-en.png +0 -0
- package/docs/images/connectors/connector-weixin-en.png +0 -0
- package/docs/images/connectors/connector-whatsapp-en.png +0 -0
- package/docs/images/settings/settings-baselines-en.png +0 -0
- package/docs/images/settings/settings-config-en.png +0 -0
- package/docs/images/settings/settings-connectors-overview-en.png +0 -0
- package/docs/images/settings/settings-deepxiv-en.png +0 -0
- package/docs/images/settings/settings-mcp-servers-en.png +0 -0
- package/docs/images/settings/settings-plugins-en.png +0 -0
- package/docs/images/settings/settings-runners-en.png +0 -0
- package/docs/zh/00_QUICK_START.md +142 -18
- package/docs/zh/01_SETTINGS_REFERENCE.md +219 -98
- package/docs/zh/02_START_RESEARCH_GUIDE.md +26 -5
- package/docs/zh/05_TUI_GUIDE.md +171 -2
- package/docs/zh/07_MEMORY_AND_MCP.md +29 -2
- package/docs/zh/09_DOCTOR.md +54 -8
- package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +24 -1
- package/docs/zh/11_LICENSE_AND_RISK.md +4 -0
- package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +15 -0
- package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
- package/docs/zh/15_CODEX_PROVIDER_SETUP.md +552 -181
- package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +384 -0
- package/docs/zh/22_BENCHSTORE_YAML_REFERENCE.md +459 -0
- package/docs/zh/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +287 -0
- package/docs/zh/23_CLAUDE_RUNNER_GUIDE.md +103 -0
- package/docs/zh/24_CLAUDE_CODE_PROVIDER_SETUP.md +460 -0
- package/docs/zh/25_OPENCODE_PROVIDER_SETUP.md +660 -0
- package/docs/zh/26_CITATION_AND_ATTRIBUTION.md +102 -0
- package/docs/zh/27_KIMI_CODE_PROVIDER_SETUP.md +51 -0
- package/docs/zh/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
- package/docs/zh/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +264 -0
- package/docs/zh/33_WORKSPACE_EXPLORER_QA.md +127 -0
- package/docs/zh/99_ACKNOWLEDGEMENTS.md +23 -19
- package/docs/zh/README.md +33 -7
- package/install.sh +168 -20
- package/package.json +5 -1
- package/pyproject.toml +2 -1
- package/src/deepscientist/__init__.py +1 -1
- package/src/deepscientist/acp/envelope.py +13 -0
- package/src/deepscientist/admin/__init__.py +3 -0
- package/src/deepscientist/admin/charts.py +681 -0
- package/src/deepscientist/admin/logs.py +119 -0
- package/src/deepscientist/admin/repairs.py +217 -0
- package/src/deepscientist/admin/service.py +1310 -0
- package/src/deepscientist/admin/system_info.py +700 -0
- package/src/deepscientist/admin/tasks.py +465 -0
- package/src/deepscientist/admin/tool_metrics.py +600 -0
- package/src/deepscientist/artifact/guidance.py +8 -4
- package/src/deepscientist/artifact/schemas.py +115 -0
- package/src/deepscientist/artifact/service.py +4268 -260
- package/src/deepscientist/bash_exec/monitor.py +30 -3
- package/src/deepscientist/bash_exec/service.py +134 -1
- package/src/deepscientist/benchstore/__init__.py +4 -0
- package/src/deepscientist/benchstore/prompt_builder.py +224 -0
- package/src/deepscientist/benchstore/service.py +1716 -0
- package/src/deepscientist/bridges/connectors.py +8 -2
- package/src/deepscientist/channels/weixin_ilink.py +8 -1
- package/src/deepscientist/cli.py +92 -17
- package/src/deepscientist/codex_cli_compat.py +187 -74
- package/src/deepscientist/config/models.py +82 -11
- package/src/deepscientist/config/service.py +1077 -93
- package/src/deepscientist/connector/weixin_support.py +48 -17
- package/src/deepscientist/daemon/api/handlers.py +827 -235
- package/src/deepscientist/daemon/api/router.py +81 -1
- package/src/deepscientist/daemon/app.py +1512 -85
- package/src/deepscientist/diagnostics/__init__.py +6 -0
- package/src/deepscientist/diagnostics/runner_failures.py +277 -0
- package/src/deepscientist/doctor.py +407 -56
- package/src/deepscientist/evidence_packets.py +590 -0
- package/src/deepscientist/home.py +52 -4
- package/src/deepscientist/kimi_cli_compat.py +50 -0
- package/src/deepscientist/latex_runtime.py +2 -2
- package/src/deepscientist/mcp/context.py +2 -0
- package/src/deepscientist/mcp/schemas.py +114 -0
- package/src/deepscientist/mcp/server.py +1566 -126
- package/src/deepscientist/memory/service.py +203 -16
- package/src/deepscientist/process_control.py +8 -1
- package/src/deepscientist/prompts/builder.py +850 -88
- package/src/deepscientist/quest/__init__.py +2 -2
- package/src/deepscientist/quest/layout.py +12 -1
- package/src/deepscientist/quest/node_traces.py +10 -0
- package/src/deepscientist/quest/service.py +1852 -161
- package/src/deepscientist/quest/stage_views.py +1 -1
- package/src/deepscientist/runners/__init__.py +18 -0
- package/src/deepscientist/runners/base.py +89 -1
- package/src/deepscientist/runners/builtins.py +13 -1
- package/src/deepscientist/runners/claude.py +391 -0
- package/src/deepscientist/runners/codex.py +480 -35
- package/src/deepscientist/runners/codex_telemetry.py +127 -0
- package/src/deepscientist/runners/kimi.py +334 -0
- package/src/deepscientist/runners/metadata.py +68 -0
- package/src/deepscientist/runners/opencode.py +414 -0
- package/src/deepscientist/runners/runtime_overrides.py +100 -0
- package/src/deepscientist/runners/simple_cli.py +538 -0
- package/src/deepscientist/runtime_storage.py +303 -0
- package/src/deepscientist/shared.py +80 -16
- package/src/deepscientist/skills/installer.py +37 -0
- package/src/deepscientist/skills/registry.py +2 -0
- package/src/deepscientist/tinytex.py +2 -2
- package/src/deepscientist/tui.py +10 -3
- package/src/prompts/benchstore/system.md +77 -0
- package/src/prompts/connectors/qq.md +33 -2
- package/src/prompts/connectors/weixin.md +208 -23
- package/src/prompts/contracts/admin_ops.md +74 -0
- package/src/prompts/contracts/admin_ops_knowledge.md +138 -0
- package/src/prompts/contracts/shared_interaction.md +5 -10
- package/src/prompts/start_setup/system.md +422 -0
- package/src/prompts/system.md +411 -304
- package/src/prompts/system_copilot.md +89 -0
- package/src/skills/analysis-campaign/SKILL.md +239 -578
- package/src/skills/analysis-campaign/references/artifact-flow-examples.md +102 -0
- package/src/skills/analysis-campaign/references/boundary-cases.md +98 -0
- package/src/skills/analysis-campaign/references/campaign-checklist-template.md +39 -24
- package/src/skills/analysis-campaign/references/campaign-design.md +26 -10
- package/src/skills/analysis-campaign/references/campaign-plan-template.md +53 -54
- package/src/skills/analysis-campaign/references/operational-guidance.md +97 -0
- package/src/skills/analysis-campaign/references/writing-facing-slice-examples.md +10 -20
- package/src/skills/baseline/SKILL.md +183 -461
- package/src/skills/baseline/references/artifact-flow-examples.md +106 -0
- package/src/skills/baseline/references/artifact-payload-examples.md +1 -1
- package/src/skills/baseline/references/baseline-checklist-template.md +27 -35
- package/src/skills/baseline/references/baseline-plan-template.md +37 -76
- package/src/skills/baseline/references/boundary-cases.md +86 -0
- package/src/skills/baseline/references/codebase-audit-checklist.md +2 -6
- package/src/skills/baseline/references/comparability-contract.md +7 -12
- package/src/skills/baseline/references/operational-guidance.md +56 -0
- package/src/skills/baseline/references/route-selection.md +5 -25
- package/src/skills/decision/SKILL.md +113 -306
- package/src/skills/decision/references/checkpoint-memory-template.md +47 -0
- package/src/skills/decision/references/operational-guidance.md +94 -0
- package/src/skills/decision/references/research-route-criteria.md +7 -8
- package/src/skills/decision/references/strategic-decision-template.md +13 -26
- package/src/skills/experiment/SKILL.md +132 -670
- package/src/skills/experiment/references/execution-playbook.md +374 -0
- package/src/skills/experiment/references/main-experiment-checklist-template.md +26 -2
- package/src/skills/experiment/references/main-experiment-plan-template.md +28 -17
- package/src/skills/experiment/references/operational-guidance.md +108 -0
- package/src/skills/finalize/SKILL.md +62 -0
- package/src/skills/finalize/references/checkpoint-memory-template.md +49 -0
- package/src/skills/finalize/references/resume-packet-template.md +7 -0
- package/src/skills/idea/SKILL.md +228 -15
- package/src/skills/idea/references/controlled-brainstorming-playbook.md +78 -0
- package/src/skills/idea/references/current-board-packet-template.md +61 -0
- package/src/skills/idea/references/high-value-idea-sourcing.md +119 -0
- package/src/skills/idea/references/idea-generation-playbook.md +21 -0
- package/src/skills/idea/references/idea-thinking-flow.md +6 -0
- package/src/skills/idea/references/literature-survey-template.md +3 -0
- package/src/skills/idea/references/objective-contract-template.md +54 -0
- package/src/skills/idea/references/outline-seeding-example.md +56 -0
- package/src/skills/idea/references/pre-idea-draft-template.md +105 -0
- package/src/skills/idea/references/related-work-playbook.md +75 -2
- package/src/skills/idea/references/research-history-playbook.md +114 -0
- package/src/skills/idea/references/selection-gate.md +58 -6
- package/src/skills/intake-audit/SKILL.md +43 -2
- package/src/skills/intake-audit/references/state-audit-template.md +10 -0
- package/src/skills/nature-data/SKILL.md +128 -0
- package/src/skills/nature-data/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-data/agents/openai.yaml +4 -0
- package/src/skills/nature-data/references/chinese-author-alignment.md +84 -0
- package/src/skills/nature-data/references/fair-metadata-checklist.md +105 -0
- package/src/skills/nature-data/references/policy-principles.md +103 -0
- package/src/skills/nature-data/references/repository-and-identifiers.md +96 -0
- package/src/skills/nature-data/references/source-basis.md +54 -0
- package/src/skills/nature-data/references/statement-patterns.md +153 -0
- package/src/skills/nature-figure/SKILL.md +197 -0
- package/src/skills/nature-figure/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-figure/agents/openai.yaml +4 -0
- package/src/skills/nature-figure/evals/evals.json +37 -0
- package/src/skills/nature-figure/references/api.md +428 -0
- package/src/skills/nature-figure/references/backend-selection.md +100 -0
- package/src/skills/nature-figure/references/chart-types.md +281 -0
- package/src/skills/nature-figure/references/common-patterns.md +349 -0
- package/src/skills/nature-figure/references/design-theory.md +436 -0
- package/src/skills/nature-figure/references/figure-contract.md +93 -0
- package/src/skills/nature-figure/references/nature-2026-observations.md +112 -0
- package/src/skills/nature-figure/references/qa-contract.md +119 -0
- package/src/skills/nature-figure/references/r-template-index.md +66 -0
- package/src/skills/nature-figure/references/r-workflow.md +161 -0
- package/src/skills/nature-figure/references/tutorials.md +250 -0
- package/src/skills/nature-paper2ppt/SKILL.md +507 -0
- package/src/skills/nature-paper2ppt/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-paper2ppt/agents/openai.yaml +4 -0
- package/src/skills/nature-polishing/SKILL.md +385 -0
- package/src/skills/nature-polishing/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-polishing/agents/openai.yaml +4 -0
- package/src/skills/nature-polishing/references/phrasebank-playbook.md +162 -0
- package/src/skills/nature-polishing/references/section-moves.md +240 -0
- package/src/skills/nature-polishing/references/style-guardrails.md +94 -0
- package/src/skills/nature-polishing/references/writing-strategy.md +148 -0
- package/src/skills/optimize/SKILL.md +177 -1568
- package/src/skills/optimize/references/brief-shaping-playbook.md +95 -0
- package/src/skills/optimize/references/candidate-board-template.md +13 -0
- package/src/skills/optimize/references/candidate-ranking-template.md +51 -0
- package/src/skills/optimize/references/codegen-route-playbook.md +50 -0
- package/src/skills/optimize/references/debug-response-template.md +29 -0
- package/src/skills/optimize/references/frontier-review-template.md +32 -0
- package/src/skills/optimize/references/fusion-playbook.md +36 -0
- package/src/skills/optimize/references/method-brief-template.md +73 -0
- package/src/skills/optimize/references/operational-guidance.md +621 -0
- package/src/skills/optimize/references/optimization-memory-template.md +30 -0
- package/src/skills/optimize/references/optimize-checklist-template.md +18 -0
- package/src/skills/optimize/references/plateau-response-playbook.md +28 -0
- package/src/skills/optimize/references/prompt-patterns.md +49 -0
- package/src/skills/paper-outline/SKILL.md +227 -0
- package/src/skills/paper-outline/references/outline-patterns.md +87 -0
- package/src/skills/paper-plot/SKILL.md +79 -0
- package/src/skills/paper-plot/agents/openai.yaml +4 -0
- package/src/skills/paper-plot/references/bar_grouped_hatch.md +96 -0
- package/src/skills/paper-plot/references/bar_paired_delta.md +72 -0
- package/src/skills/paper-plot/references/line_confidence_band.md +75 -0
- package/src/skills/paper-plot/references/line_loss_with_inset.md +65 -0
- package/src/skills/paper-plot/references/line_training_curve.md +44 -0
- package/src/skills/paper-plot/references/radar_dual_series.md +59 -0
- package/src/skills/paper-plot/references/scatter_broken_axis.md +59 -0
- package/src/skills/paper-plot/references/scatter_tsne_cluster.md +72 -0
- package/src/skills/paper-plot/scripts/bar_memevolve.py +109 -0
- package/src/skills/paper-plot/scripts/bar_spice.py +166 -0
- package/src/skills/paper-plot/scripts/line_aime.py +94 -0
- package/src/skills/paper-plot/scripts/line_loss_inset.py +157 -0
- package/src/skills/paper-plot/scripts/line_selfdistill.py +168 -0
- package/src/skills/paper-plot/scripts/radar_dora.py +151 -0
- package/src/skills/paper-plot/scripts/scatter_break.py +169 -0
- package/src/skills/paper-plot/scripts/scatter_tsne.py +133 -0
- package/src/skills/rebuttal/SKILL.md +9 -0
- package/src/skills/references/tool-usage-by-stage.md +438 -0
- package/src/skills/review/SKILL.md +105 -7
- package/src/skills/science/PROVENANCE.md +44 -0
- package/src/skills/science/SKILL.md +137 -0
- package/src/skills/science/references/artifact-science-tool.md +110 -0
- package/src/skills/science/references/claim-type-discipline.md +56 -0
- package/src/skills/science/references/domain-index.md +422 -0
- package/src/skills/science/references/hpc-via-bash-exec.md +42 -0
- package/src/skills/science/references/package-check-playbook.md +64 -0
- package/src/skills/science/references/package-index.min.json +3616 -0
- package/src/skills/science/references/packages/abinit.md +80 -0
- package/src/skills/science/references/packages/acts.md +73 -0
- package/src/skills/science/references/packages/aiida-core.md +80 -0
- package/src/skills/science/references/packages/alamode.md +80 -0
- package/src/skills/science/references/packages/amuse.md +88 -0
- package/src/skills/science/references/packages/anndata.md +88 -0
- package/src/skills/science/references/packages/arbor.md +80 -0
- package/src/skills/science/references/packages/arc.md +73 -0
- package/src/skills/science/references/packages/astropy.md +88 -0
- package/src/skills/science/references/packages/astroquery.md +88 -0
- package/src/skills/science/references/packages/atomate2.md +80 -0
- package/src/skills/science/references/packages/atomsmltr.md +73 -0
- package/src/skills/science/references/packages/awkward.md +73 -0
- package/src/skills/science/references/packages/batman.md +88 -0
- package/src/skills/science/references/packages/biopython.md +88 -0
- package/src/skills/science/references/packages/bloqade.md +73 -0
- package/src/skills/science/references/packages/brian2.md +73 -0
- package/src/skills/science/references/packages/bullet3.md +73 -0
- package/src/skills/science/references/packages/calculix.md +80 -0
- package/src/skills/science/references/packages/cantera.md +73 -0
- package/src/skills/science/references/packages/cavity-md-ipi.md +80 -0
- package/src/skills/science/references/packages/ccdproc.md +88 -0
- package/src/skills/science/references/packages/celerite2.md +88 -0
- package/src/skills/science/references/packages/cellrank.md +73 -0
- package/src/skills/science/references/packages/cesm.md +80 -0
- package/src/skills/science/references/packages/chemicals.md +73 -0
- package/src/skills/science/references/packages/chempy.md +73 -0
- package/src/skills/science/references/packages/cirq.md +73 -0
- package/src/skills/science/references/packages/coffea.md +73 -0
- package/src/skills/science/references/packages/cp2k.md +88 -0
- package/src/skills/science/references/packages/custodian.md +80 -0
- package/src/skills/science/references/packages/dart.md +73 -0
- package/src/skills/science/references/packages/datamol.md +88 -0
- package/src/skills/science/references/packages/dd4hep.md +73 -0
- package/src/skills/science/references/packages/dealii.md +80 -0
- package/src/skills/science/references/packages/deepchem.md +88 -0
- package/src/skills/science/references/packages/delphes.md +73 -0
- package/src/skills/science/references/packages/devito.md +80 -0
- package/src/skills/science/references/packages/dftb.md +88 -0
- package/src/skills/science/references/packages/dftd4.md +88 -0
- package/src/skills/science/references/packages/dftk-jl.md +80 -0
- package/src/skills/science/references/packages/dolfinx.md +80 -0
- package/src/skills/science/references/packages/drake.md +73 -0
- package/src/skills/science/references/packages/dumux.md +73 -0
- package/src/skills/science/references/packages/elk.md +80 -0
- package/src/skills/science/references/packages/elmerfem.md +80 -0
- package/src/skills/science/references/packages/enzo-e.md +88 -0
- package/src/skills/science/references/packages/espresso.md +80 -0
- package/src/skills/science/references/packages/exoplanet.md +88 -0
- package/src/skills/science/references/packages/fairroot.md +73 -0
- package/src/skills/science/references/packages/fbpic.md +80 -0
- package/src/skills/science/references/packages/fdtdbath-meep.md +80 -0
- package/src/skills/science/references/packages/geant4.md +73 -0
- package/src/skills/science/references/packages/geosx.md +80 -0
- package/src/skills/science/references/packages/gprmax.md +80 -0
- package/src/skills/science/references/packages/gromacs.md +80 -0
- package/src/skills/science/references/packages/gwaslab.md +73 -0
- package/src/skills/science/references/packages/gz-sim.md +73 -0
- package/src/skills/science/references/packages/hail.md +88 -0
- package/src/skills/science/references/packages/hiphive.md +80 -0
- package/src/skills/science/references/packages/hoomd-blue.md +80 -0
- package/src/skills/science/references/packages/itensor.md +73 -0
- package/src/skills/science/references/packages/itensors-jl.md +73 -0
- package/src/skills/science/references/packages/jdftx.md +73 -0
- package/src/skills/science/references/packages/jobflow.md +80 -0
- package/src/skills/science/references/packages/kadanoffbaym-jl.md +73 -0
- package/src/skills/science/references/packages/kite.md +80 -0
- package/src/skills/science/references/packages/kratos.md +80 -0
- package/src/skills/science/references/packages/kwant.md +73 -0
- package/src/skills/science/references/packages/lammps.md +80 -0
- package/src/skills/science/references/packages/lightkurve.md +88 -0
- package/src/skills/science/references/packages/limix.md +73 -0
- package/src/skills/science/references/packages/maxwelllink.md +80 -0
- package/src/skills/science/references/packages/mcdc.md +73 -0
- package/src/skills/science/references/packages/meep.md +80 -0
- package/src/skills/science/references/packages/mfem.md +80 -0
- package/src/skills/science/references/packages/mitgcm.md +73 -0
- package/src/skills/science/references/packages/modflow6.md +73 -0
- package/src/skills/science/references/packages/molecool.md +73 -0
- package/src/skills/science/references/packages/mom6.md +73 -0
- package/src/skills/science/references/packages/moose.md +80 -0
- package/src/skills/science/references/packages/mpas-model.md +73 -0
- package/src/skills/science/references/packages/mujoco.md +73 -0
- package/src/skills/science/references/packages/mumax3.md +73 -0
- package/src/skills/science/references/packages/nekrs.md +80 -0
- package/src/skills/science/references/packages/nessi.md +73 -0
- package/src/skills/science/references/packages/nest-simulator.md +73 -0
- package/src/skills/science/references/packages/netket.md +73 -0
- package/src/skills/science/references/packages/neuron.md +73 -0
- package/src/skills/science/references/packages/nextflow.md +88 -0
- package/src/skills/science/references/packages/nwchem.md +88 -0
- package/src/skills/science/references/packages/openbabel.md +88 -0
- package/src/skills/science/references/packages/openems.md +80 -0
- package/src/skills/science/references/packages/openff-toolkit.md +88 -0
- package/src/skills/science/references/packages/openfoam-dev.md +80 -0
- package/src/skills/science/references/packages/openmc.md +73 -0
- package/src/skills/science/references/packages/openmm.md +80 -0
- package/src/skills/science/references/packages/openmoc.md +73 -0
- package/src/skills/science/references/packages/openmx.md +80 -0
- package/src/skills/science/references/packages/opensees.md +80 -0
- package/src/skills/science/references/packages/opensn.md +80 -0
- package/src/skills/science/references/packages/opm-simulators.md +73 -0
- package/src/skills/science/references/packages/oqupy.md +73 -0
- package/src/skills/science/references/packages/packmol.md +80 -0
- package/src/skills/science/references/packages/palabos.md +80 -0
- package/src/skills/science/references/packages/parflow.md +80 -0
- package/src/skills/science/references/packages/pennylane.md +88 -0
- package/src/skills/science/references/packages/perceval.md +73 -0
- package/src/skills/science/references/packages/phono3py.md +73 -0
- package/src/skills/science/references/packages/phonopy.md +73 -0
- package/src/skills/science/references/packages/photutils.md +88 -0
- package/src/skills/science/references/packages/picongpu.md +80 -0
- package/src/skills/science/references/packages/plink-ng.md +88 -0
- package/src/skills/science/references/packages/precice.md +73 -0
- package/src/skills/science/references/packages/psc.md +80 -0
- package/src/skills/science/references/packages/psi4.md +88 -0
- package/src/skills/science/references/packages/pybinding.md +73 -0
- package/src/skills/science/references/packages/pyfr.md +80 -0
- package/src/skills/science/references/packages/pyhf.md +73 -0
- package/src/skills/science/references/packages/pyiron_base.md +80 -0
- package/src/skills/science/references/packages/pylcp.md +73 -0
- package/src/skills/science/references/packages/pylith.md +80 -0
- package/src/skills/science/references/packages/pynbody.md +88 -0
- package/src/skills/science/references/packages/pysam.md +88 -0
- package/src/skills/science/references/packages/pyscf.md +88 -0
- package/src/skills/science/references/packages/q-e.md +73 -0
- package/src/skills/science/references/packages/qibo.md +73 -0
- package/src/skills/science/references/packages/qiskit.md +73 -0
- package/src/skills/science/references/packages/quantica-jl.md +73 -0
- package/src/skills/science/references/packages/quantumoptics-jl.md +73 -0
- package/src/skills/science/references/packages/quimb.md +73 -0
- package/src/skills/science/references/packages/qulacs.md +73 -0
- package/src/skills/science/references/packages/qutip.md +73 -0
- package/src/skills/science/references/packages/rdkit.md +88 -0
- package/src/skills/science/references/packages/rmg-py.md +73 -0
- package/src/skills/science/references/packages/root.md +73 -0
- package/src/skills/science/references/packages/scanpy.md +88 -0
- package/src/skills/science/references/packages/scikit-allel.md +88 -0
- package/src/skills/science/references/packages/scikit-bio.md +88 -0
- package/src/skills/science/references/packages/scqubits.md +73 -0
- package/src/skills/science/references/packages/scuff-em.md +80 -0
- package/src/skills/science/references/packages/scvi-tools.md +73 -0
- package/src/skills/science/references/packages/seissol.md +73 -0
- package/src/skills/science/references/packages/sfepy.md +80 -0
- package/src/skills/science/references/packages/sisl.md +73 -0
- package/src/skills/science/references/packages/smilei.md +80 -0
- package/src/skills/science/references/packages/snakemake.md +88 -0
- package/src/skills/science/references/packages/specfem3d-globe.md +80 -0
- package/src/skills/science/references/packages/specutils.md +88 -0
- package/src/skills/science/references/packages/spglib.md +80 -0
- package/src/skills/science/references/packages/squidpy.md +88 -0
- package/src/skills/science/references/packages/starry.md +88 -0
- package/src/skills/science/references/packages/strawberryfields.md +73 -0
- package/src/skills/science/references/packages/su2.md +80 -0
- package/src/skills/science/references/packages/sunny-jl.md +73 -0
- package/src/skills/science/references/packages/sw4.md +73 -0
- package/src/skills/science/references/packages/swift.md +88 -0
- package/src/skills/science/references/packages/tdnegf.md +73 -0
- package/src/skills/science/references/packages/tenpy.md +73 -0
- package/src/skills/science/references/packages/thermo.md +73 -0
- package/src/skills/science/references/packages/tkwant.md +73 -0
- package/src/skills/science/references/packages/tvb-root.md +73 -0
- package/src/skills/science/references/packages/uproot5.md +73 -0
- package/src/skills/science/references/packages/vampire.md +80 -0
- package/src/skills/science/references/packages/wannier_tools.md +73 -0
- package/src/skills/science/references/packages/warpx.md +80 -0
- package/src/skills/science/references/packages/wrf.md +73 -0
- package/src/skills/science/references/packages/xtb.md +88 -0
- package/src/skills/science/references/packages/yt.md +73 -0
- package/src/skills/science/references/science-task-brief-template.md +71 -0
- package/src/skills/scout/SKILL.md +83 -425
- package/src/skills/scout/references/literature-scout-template.md +5 -24
- package/src/skills/scout/references/operational-guidance.md +191 -0
- package/src/skills/scout/references/paper-triage-playbook.md +11 -35
- package/src/skills/write/SKILL.md +744 -1246
- package/src/skills/write/references/experiments_analysis_patterns.md +129 -0
- package/src/skills/write/references/oral_package_patterns.md +252 -0
- package/src/skills/write/references/oral_writing_principles.md +291 -0
- package/src/skills/write/references/section_rewrite_checklist.md +234 -0
- package/src/tui/dist/app/AppContainer.js +1314 -27
- package/src/tui/dist/components/Composer.js +26 -1
- package/src/tui/dist/components/ConfigScreen.js +2 -1
- package/src/tui/dist/components/InputPrompt.js +25 -9
- package/src/tui/dist/components/MainContent.js +18 -3
- package/src/tui/dist/components/QuestScreen.js +3 -2
- package/src/tui/dist/components/UtilityScreen.js +37 -0
- package/src/tui/dist/hooks/useSafeInput.js +10 -0
- package/src/tui/dist/index.js +13 -1
- package/src/tui/dist/layouts/DefaultAppLayout.js +11 -8
- package/src/tui/dist/lib/api.js +89 -1
- package/src/tui/package.json +1 -1
- package/src/ui/dist/assets/{AnalysisPlugin-DnSm0GZn.js → AnalysisPlugin-CA94NGmI.js} +1 -1
- package/src/ui/dist/assets/CliPlugin-DHBzphZU.js +79 -0
- package/src/ui/dist/assets/CodeEditorPlugin-BOFwD2rn.js +2 -0
- package/src/ui/dist/assets/{CodeViewerPlugin-itb0tltR.js → CodeViewerPlugin-CqDpgjik.js} +4 -4
- package/src/ui/dist/assets/{DocViewerPlugin-DqKkiCI6.js → DocViewerPlugin-UDBgt8-4.js} +3 -3
- package/src/ui/dist/assets/GitCommitViewerPlugin-BmHtZ0bZ.js +6 -0
- package/src/ui/dist/assets/{GitDiffViewerPlugin-DxL2ezFG.js → GitDiffViewerPlugin-CAxjNorQ.js} +2 -2
- package/src/ui/dist/assets/{GitSnapshotViewer-B_RQm1YZ.js → GitSnapshotViewer-CweA6VON.js} +2 -2
- package/src/ui/dist/assets/{ImageViewerPlugin-tHqlXY3n.js → ImageViewerPlugin-C8wHGvGN.js} +5 -5
- package/src/ui/dist/assets/LabPlugin-COyyLUol.js +32 -0
- package/src/ui/dist/assets/{LatexPlugin-B495DTXC.js → LatexPlugin-BQjAaA5J.js} +4 -4
- package/src/ui/dist/assets/{MarkdownViewerPlugin-DG28-61B.js → MarkdownViewerPlugin-Dy1NE2dI.js} +3 -3
- package/src/ui/dist/assets/{MarketplacePlugin-BiOGT-Kj.js → MarketplacePlugin-DMIZtEJ2.js} +2 -2
- package/src/ui/dist/assets/NotebookEditor-CFHMq_Qt.js +91 -0
- package/src/ui/dist/assets/{NotebookEditor-CVsj8h_T.js → NotebookEditor-WFyd8Ybt.js} +23 -23
- package/src/ui/dist/assets/{PdfLoader-CASDQmxJ.js → PdfLoader-CLE5u5TS.js} +3 -3
- package/src/ui/dist/assets/{PdfMarkdownPlugin-BFhwoKsY.js → PdfMarkdownPlugin-_iNK_H83.js} +1 -1
- package/src/ui/dist/assets/PdfViewerPlugin-DgWsbInT.js +22 -0
- package/src/ui/dist/assets/SearchPlugin-DrZmn5iw.js +11 -0
- package/src/ui/dist/assets/{TextViewerPlugin-CB4DYfWO.js → TextViewerPlugin-D1-T3aC7.js} +4 -4
- package/src/ui/dist/assets/branding/runner-claude.svg +107 -0
- package/src/ui/dist/assets/branding/runner-codex.svg +10 -0
- package/src/ui/dist/assets/branding/runner-kimi.svg +14 -0
- package/src/ui/dist/assets/branding/runner-opencode.svg +7 -0
- package/src/ui/dist/assets/cli-store-CoZ-x5Ip.js +1 -0
- package/src/ui/dist/assets/{code-DLC6G24T.js → code-DbsmSd3Y.js} +1 -1
- package/src/ui/dist/assets/file-diff-panel-DsvyRz47.js +1 -0
- package/src/ui/dist/assets/{wrap-text-CwMn-iqb.js → file-jump-queue-DeQBikaw.js} +3 -3
- package/src/ui/dist/assets/{file-socket-Cu4Qln7Y.js → file-socket-DA5XIx88.js} +1 -1
- package/src/ui/dist/assets/fonts/ds-fonts.css +50 -4
- package/src/ui/dist/assets/images/deepxiv/register-guide.png +0 -0
- package/src/ui/dist/assets/index-39vY9LmZ.js +1 -0
- package/src/ui/dist/assets/{index-wQ7RIIRd.js → index-BsO46tJA.js} +1 -1
- package/src/ui/dist/assets/index-CHzJ2xtB.js +3530 -0
- package/src/ui/dist/assets/index-DH-zxoZ3.css +33 -0
- package/src/ui/dist/assets/{plugin-notebook-HbW2K-1c.js → plugin-notebook-JRhysCqj.js} +2 -2
- package/src/ui/dist/assets/{project-sync-CsX08Qno.js → project-sync-DPmWKmKD.js} +1 -1
- package/src/ui/dist/assets/{zoom-out-R-GWEhzS.js → zoom-out-DAukFWen.js} +3 -3
- package/src/ui/dist/index.html +3 -3
- package/src/skills/analysis-campaign/references/artifact-orchestration.md +0 -58
- package/src/skills/baseline/references/memory-playbook.md +0 -40
- package/src/skills/baseline/references/publishable-baseline-package.md +0 -30
- package/src/skills/write/references/outline-evidence-contract-example.md +0 -107
- package/src/skills/write/references/paper-experiment-matrix-template.md +0 -131
- package/src/skills/write/references/paper-section-playbook.md +0 -64
- package/src/skills/write/references/reviewer-first-writing.md +0 -64
- package/src/skills/write/references/revision-checklist.md +0 -70
- package/src/skills/write/references/section-contracts.md +0 -82
- package/src/skills/write/references/sentence-level-proofing.md +0 -49
- package/src/ui/dist/assets/AiManusChatView-COFACy7V.js +0 -204
- package/src/ui/dist/assets/CliPlugin-CvwCmDQ5.js +0 -109
- package/src/ui/dist/assets/CodeEditorPlugin-cOqSa0xq.js +0 -2
- package/src/ui/dist/assets/GitCommitViewerPlugin-DVgNHBCS.js +0 -1
- package/src/ui/dist/assets/LabCopilotPanel-ClMbq5Yu.js +0 -14
- package/src/ui/dist/assets/LabPlugin-L_SuE8ow.js +0 -22
- package/src/ui/dist/assets/NotebookEditor-C-4Kt1p9.js +0 -81
- package/src/ui/dist/assets/PdfViewerPlugin-DcOzU9vd.js +0 -17
- package/src/ui/dist/assets/SearchPlugin-CHj7M58O.js +0 -16
- package/src/ui/dist/assets/VNCViewer-CjlbyCB3.js +0 -11
- package/src/ui/dist/assets/bot-CFkZY-JP.js +0 -6
- package/src/ui/dist/assets/chevron-up-Dq5ofbht.js +0 -6
- package/src/ui/dist/assets/file-content-Dv4LoZec.js +0 -1
- package/src/ui/dist/assets/file-diff-panel-Denq-lC3.js +0 -1
- package/src/ui/dist/assets/file-jump-queue-DA-SdG__.js +0 -1
- package/src/ui/dist/assets/git-commit-horizontal-BUh6G52n.js +0 -6
- package/src/ui/dist/assets/image-B9HUUddG.js +0 -6
- package/src/ui/dist/assets/index-B2B1sg-M.js +0 -1
- package/src/ui/dist/assets/index-Cgla8biy.css +0 -33
- package/src/ui/dist/assets/index-DRyx7vAc.js +0 -1
- package/src/ui/dist/assets/index-Gbl53BNp.js +0 -2496
- package/src/ui/dist/assets/pdf-effect-queue-ZtnHFCAi.js +0 -6
- package/src/ui/dist/assets/popover-DL6h35vr.js +0 -1
- package/src/ui/dist/assets/select-DvmXt1yY.js +0 -11
- package/src/ui/dist/assets/sigma-7jpXazui.js +0 -6
- package/src/ui/dist/assets/trash-xA7kFt8i.js +0 -11
- package/src/ui/dist/assets/useCliAccess-DsMwDjOp.js +0 -1
- package/src/ui/dist/assets/useFileDiffOverlay-FuhcnKiw.js +0 -1
@@ -0,0 +1,374 @@
+# Execution Playbook
+
+Use this reference when the experiment route needs the full execution checklist rather than the short control surface in `SKILL.md`.
+
+## 1. Define the run contract
+
+Before implementation or execution, state:
+
+- `run_id`
+- experiment tier: `auxiliary/dev` or `main/test`
+- research question
+- null hypothesis
+- alternative hypothesis
+- hypothesis
+- baseline id or variant
+- metric targets
+- expected changed files
+- expected outputs
+- stop condition
+- compute or runtime budget
+- minimal experiment
+- abandonment condition
+- strongest alternative hypothesis
+- exact metric keys that will decide success or failure
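As an illustration only, the contract fields above could be held in a small structure that reports what is still unstated before a run starts. `RunContract`, its field subset, and `contract_gaps` are hypothetical names chosen for this sketch; they are not part of the package.

```python
from dataclasses import dataclass, fields

# Tiers named in the playbook.
VALID_TIERS = ("auxiliary/dev", "main/test")


@dataclass
class RunContract:
    """Hypothetical container for a subset of the run-contract fields."""
    run_id: str = ""
    tier: str = ""
    research_question: str = ""
    null_hypothesis: str = ""
    alternative_hypothesis: str = ""
    baseline: str = ""
    metric_keys: tuple = ()
    stop_condition: str = ""
    abandonment_condition: str = ""


def contract_gaps(contract):
    """Return the names of fields that are still empty or invalid."""
    gaps = [f.name for f in fields(contract) if not getattr(contract, f.name)]
    # A non-empty tier must still be one of the two known tiers.
    if contract.tier and contract.tier not in VALID_TIERS:
        gaps.append("tier")
    return gaps
```

A run would only proceed once `contract_gaps(...)` returns an empty list; the remaining contract fields could be added in the same way.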
+
+Prefer to write this contract first in `PLAN.md` using `references/main-experiment-plan-template.md`, then keep the current execution state visible in `CHECKLIST.md` using `references/main-experiment-checklist-template.md`.
+
+For substantial runs, also record the following seven experiment fields early and keep them updated during execution:
+
+1. research question
+2. research type
+3. research objective
+4. experimental setup
+5. experimental results
+6. experimental analysis
+7. experimental conclusions
+
+If the run contract changes materially later, record the change durably.
+
+Treat the run contract as a research-question contract, not only an execution checklist.
+Before coding, be able to explain:
+
+- why this run is the best current route rather than the main alternatives
+- what observation would count as a real answer to the research question
+- what result would force a downgrade, retry, or route change
+- what confounder would make the run non-comparable even if it finishes successfully
+
+If multiple candidate experiment packages exist, prefer the one with the best balance of technical feasibility, research importance, and methodological rigor.
+Do not choose a package only because it sounds ambitious.
+
+For paper-facing lines, default to this evidence ladder:
+
+- `auxiliary/dev`
+  - clarify parameters, settings, mechanisms, or diagnostics
+- `main/test`
+  - carry the core comparison the paper will rely on
+- `minimum -> solid -> maximum`
+  - first make the result executable and comparable
+  - then make it strong enough to carry the claim
+  - only then spend effort on broader supporting polish
+
+## 2. Run a preflight check
+
+Before editing or executing:
+
+- confirm the dataset path, version, and split contract
+- confirm the baseline metrics reference
+- if durable state exposes `active_baseline_metric_contract_json`, read that JSON file before planning commands or comparisons
+- treat `active_baseline_metric_contract_json` as the default authoritative baseline comparison contract unless you record a concrete reason to override it
+- confirm the selected idea claim and code-level plan
+- look up prior incidents or repeated failure patterns when available
+- confirm output directories and naming
+- confirm that the intended run still matches the current quest decision
+
+If a repeated failure pattern already exists, apply the mitigation first and record that choice.
+
+Also confirm before comparison work:
+
+- the baseline verification is trustworthy enough
+- the planned comparison still uses the same metric contract
+- the metric keys and primary metric still match `active_baseline_metric_contract_json` when that file is available
+- every main experiment submission still covers all required baseline metric ids from `active_baseline_metric_contract_json`; extra metrics are allowed, but missing required metrics are not
+- the required baseline metrics still use the same evaluation code and metric definitions; if an extra evaluator is genuinely necessary, record it as supplementary output rather than replacing the canonical comparator
+- if the run is `main/test` and superiority is likely to be claimed, define the significance-testing plan before execution rather than after seeing the numbers
+- if `Result/metric.md` was used during the run, treat it as optional scratch memory only and reconcile it against the final submitted metrics before `artifact.record_main_experiment(...)`
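The coverage rule above (extra metrics allowed, missing required metrics not) can be sketched as a plain set comparison. The layout of `active_baseline_metric_contract_json` is not shown in this playbook, so the sketch assumes the required metric ids have already been read out of it into a list; both function names are hypothetical.

```python
def missing_required_metrics(required_ids, submitted_metrics):
    """Return required baseline metric ids absent from the submission, sorted."""
    return sorted(set(required_ids) - set(submitted_metrics))


def check_submission(required_ids, submitted_metrics):
    """Raise if a required metric is missing; return the extra metric ids."""
    missing = missing_required_metrics(required_ids, submitted_metrics)
    if missing:
        raise ValueError(f"missing required baseline metrics: {missing}")
    # Extra metrics are allowed, so they are reported rather than rejected.
    return sorted(set(submitted_metrics) - set(required_ids))
```

For example, `check_submission(["acc", "f1"], {"acc": 0.9, "f1": 0.8, "latency_ms": 12})` would pass and report `latency_ms` as a supplementary metric.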
+
+Before you begin a substantial run, send a concise threaded `artifact.interact(kind='progress', ...)` update naming:
+
+- the run contract you are about to execute
+- the main evidence it is testing
+- the expected durable outputs
+- the next checkpoint for reporting back
+
+## 2.1 Diagnostic mode trigger
+
+Switch from ordinary execution mode into diagnosis mode when any of the following becomes true:
+
+- two retries in a row add no new evidence or produce no interpretable delta
+- the baseline gap is much larger than expected and the cause is unclear
+- the metrics are suspiciously strong, suspiciously identical to baseline, or highly unstable
+- logs, checkpoints, or intermediate outputs conflict with the claimed behavior
+
+In diagnosis mode:
+
+- stop brute-force retrying
+- prefer the smallest discriminative test that can separate competing hypotheses
+- resolve obvious environment or data-contract issues before launching another comparison run
+- make the diagnosis goal explicit: explain the behavior, not just "try something else"
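One possible way to encode the trigger conditions above is a small predicate over the retry history. This is a sketch under stated assumptions: the function name, the boolean history list, and the three flags are all hypothetical, and judging whether a gap is "unexplained" or a metric "suspicious" is still left to the operator.

```python
def should_enter_diagnosis(retry_had_new_evidence, *, gap_unexplained=False,
                           metrics_suspicious=False, outputs_conflict=False):
    """Return True when any listed diagnosis trigger holds.

    retry_had_new_evidence: one boolean per retry, oldest first, stating
    whether that retry added new evidence or an interpretable delta.
    """
    recent = retry_had_new_evidence[-2:]
    # Trigger 1: the last two retries in a row were both uninformative.
    two_empty_retries = len(recent) == 2 and not any(recent)
    return (two_empty_retries or gap_unexplained
            or metrics_suspicious or outputs_conflict)
```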
|
|
109
|
+
|
|
110
|
+
## 3. Confirm the execution workspace
|
|
111
|
+
|
|
112
|
+
The normal experiment workspace is the current active idea worktree returned by `artifact.submit_idea(...)`.
|
|
113
|
+
|
|
114
|
+
- do not create a fresh manual branch for the main experiment unless recovery or debugging truly requires it
|
|
115
|
+
- implement and run inside the current active idea workspace
|
|
116
|
+
- if the idea package changes materially before execution, submit a new durable idea branch with `artifact.submit_idea(mode='create', lineage_intent='continue_line'|'branch_alternative', ...)` instead of silently mutating the old node
|
|
117
|
+
- after a real main run finishes, record it with `artifact.record_main_experiment(...)` before moving to analysis or writing
|
|
118
|
+
- once that durable main result exists, treat the branch as a fixed round node; a later new optimization round should usually compare foundations and create a new `continue_line` child branch or `branch_alternative` sibling-like branch
|
|
119
|
+
- after `artifact.record_main_experiment(...)`, if QQ milestone media is enabled and the metrics are stable enough to summarize honestly, prefer one concise summary PNG over multiple attachments
|
|
120
|
+
|
|
121
|
+
## 4. Implement the minimum required change
|
|
122
|
+
|
|
123
|
+
Implementation rules:
|
|
124
|
+
|
|
125
|
+
- keep the change hypothesis-bound
|
|
126
|
+
- prefer small, explainable edits
|
|
127
|
+
- avoid unrelated cleanup during a main run
|
|
128
|
+
- record which files matter for later review
|
|
129
|
+
- preserve theory fidelity between the idea claim and the code change
|
|
130
|
+
- add robustness checks when the mechanism risks NaN, inf, or unstable behavior
|
|
131
|
+
- implement according to the current `PLAN.md` instead of repeatedly improvising a new method after each small observation
|
|
132
|
+
- avoid repeated code churn between the smoke test and the real run unless the smoke test exposes a specific problem that the next change is meant to fix
|
|
133
|
+
|
|
134
|
+
Prefer to complete one experiment cleanly before expanding to the next, unless parallel execution is explicitly justified and isolated.
|
|
135
|
+
For substantial experiment packages, the default is one experiment at a time, with each one reaching a recoverable recorded state before the next begins.
|
|
136
|
+
|
|
137
|
+
Retry-delta discipline:
|
|
138
|
+
|
|
139
|
+
- unless the current state is completely non-executable, change only one major variable per retry
|
|
140
|
+
- if broader recovery is unavoidable, record exactly which layer changed: data, preprocessing, model, objective, optimization, evaluation, or environment
|
|
141
|
+
- before each retry, state the expected effect and the fastest falsification signal
|
|
142
|
+
- if the retry produced no interpretable delta, do not treat it as meaningful evidence about the underlying research hypothesis
|
|
143
|
+
- if the retry does not change the hypothesis, code path, command path, or evidence surface, stop rerunning and route through `decision`
|
|
144
|
+
- if the same failure class appears again without a real route or evidence change, stop looping and route through `decision`
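
The retry-delta rules above can be sketched as a small pre-retry check. This is only an illustration: the layer names come from the list above, while `RetryPlan`, `check_retry`, and the `broad_recovery` flag are hypothetical names, and treating an empty layer list as "nothing changed" is a simplifying assumption.

```python
from dataclasses import dataclass

# Layer names are taken from the retry-delta rule; all other names are hypothetical.
LAYERS = {"data", "preprocessing", "model", "objective",
          "optimization", "evaluation", "environment"}


@dataclass
class RetryPlan:
    changed_layers: list           # layers this retry modifies
    expected_effect: str           # what the retry should change
    falsification_signal: str      # fastest signal that would show it failed
    broad_recovery: bool = False   # True only when the prior state was non-executable


def check_retry(plan):
    """Return a list of rule violations; an empty list means the retry is well-formed."""
    problems = []
    unknown = [layer for layer in plan.changed_layers if layer not in LAYERS]
    if unknown:
        problems.append(f"unknown layers: {unknown}")
    if not plan.changed_layers:
        problems.append("no layer changed: stop rerunning and route through decision")
    if len(plan.changed_layers) > 1 and not plan.broad_recovery:
        problems.append("more than one major variable changed without broad recovery")
    if not plan.expected_effect.strip():
        problems.append("missing expected effect")
    if not plan.falsification_signal.strip():
        problems.append("missing falsification signal")
    return problems
```

A plan that changes a single layer and states both the expected effect and the falsification signal passes with an empty list; anything else returns the reasons to pause before rerunning.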
|
|
145
|
+
|
|
146
|
+
## 5. Execute the run
|
|
147
|
+
|
|
148
|
+
Run with auditable commands and durable outputs.
|
|
149
|
+
|
|
150
|
+
Execution rules:
|
|
151
|
+
|
|
152
|
+
- use non-interactive commands
|
|
153
|
+
- prefer `bash_exec` over ephemeral shell invocations
|
|
154
|
+
- use the intended dataset and split
|
|
155
|
+
- keep logs durable
|
|
156
|
+
- report progress for long runs
|
|
157
|
+
- avoid silent metric-definition changes
|
|
158
|
+
- do not drift away from `active_baseline_metric_contract_json` silently when that file exists
|
|
159
|
+
- avoid silently changing the baseline comparison recipe
|
|
160
|
+
- run the full agreed evaluation, not only a smoke test
|
|
161
|
+
|
|
162
|
+
You may do a quick sanity run first, but if the stage goal is a real experiment you must continue to the real evaluation unless the run is blocked and recorded.
|
|
163
|
+
|
|
164
|
+
Pilot-before-scale rule:
|
|
165
|
+
|
|
166
|
+
- start with a bounded pilot only when the modification is non-trivial and that pilot resolves a real execution uncertainty
|
|
167
|
+
- use the pilot to catch implementation mistakes early
|
|
168
|
+
- record pilot outcomes explicitly
|
|
169
|
+
- do not mistake pilot success for final evidence
|
|
170
|
+
|
|
171
|
+
Incremental-recording rule:
|
|
172
|
+
|
|
173
|
+
- do not wait until the end to reconstruct the run from memory
|
|
174
|
+
- update the durable run note after:
|
|
175
|
+
- contract definition
|
|
176
|
+
- important code changes
|
|
177
|
+
- pilot validation
|
|
178
|
+
- full execution checkpoints
|
|
179
|
+
- post-run analysis
|
|
180
|
+
- update `CHECKLIST.md` alongside those durable notes so the current execution frontier is obvious without replaying the whole log
|
|
181
|
+
- include timestamps when they materially help reconstruction
|
|
182
|
+
- preserve failed attempts, anomalies, and partial outcomes rather than overwriting them
|
|
183
|
+
- a durable run memory or note should explicitly record whether the current state is `success`, `partial`, or `failure`
|
|
184
|
+
- when available, include `idea_id`, `branch`, and `run_id`
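
One way to follow the incremental-recording rule is an append-only note file, one entry per checkpoint. The sketch below assumes a JSON Lines file; `make_note_entry`, `append_note`, and the `checkpoint` and `summary` field names are hypothetical, while the three state values and the `idea_id`, `branch`, and `run_id` keys come from the rule above.

```python
import json
from datetime import datetime, timezone

STATES = ("success", "partial", "failure")  # state values named in the rule above


def make_note_entry(checkpoint, state, summary, idea_id=None, branch=None, run_id=None):
    """Build one note entry; field names other than the three ids are hypothetical."""
    if state not in STATES:
        raise ValueError(f"state must be one of {STATES}, got {state!r}")
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "checkpoint": checkpoint,
        "state": state,
        "summary": summary,
    }
    # Include identifiers only when they are actually known.
    for key, value in (("idea_id", idea_id), ("branch", branch), ("run_id", run_id)):
        if value is not None:
            entry[key] = value
    return entry


def append_note(path, entry):
    """Append one JSON line so earlier entries, including failures, are never overwritten."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
```

Because the file is opened in append mode, failed attempts and partial outcomes stay in the record instead of being replaced by the latest state.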
|
|
185
|
+
|
|
186
|
+
Last-known-good rule:
|
|
187
|
+
|
|
188
|
+
- keep track of the most recent state that was executable, comparable, and explainable
|
|
189
|
+
- when a new attempt breaks that state, debug forward from the last-known-good point instead of stacking more speculative edits on top of the broken state
|
|
190
|
+
- if the last-known-good state is unclear, reconstruct it before spending more budget on new hypotheses
|
|
191
|
+
|
|
192
|
+
## 5.1 Long-running command protocol
|
|
193
|
+
|
|
194
|
+
For commands that may run longer than a few minutes:
|
|
195
|
+
|
|
196
|
+
- if command paths, outputs, or basic metrics are still unverified, execute one bounded smoke test or pilot first
|
|
197
|
+
- keep smoke or pilot budget at `0-2` for the current experiment pass
|
|
198
|
+
- treat smoke work as a `0-2` budget rather than as a mandatory separate phase
|
|
199
|
+
- allow a second smoke or pilot only after a real code, command, environment, or evaluator change
|
|
200
|
+
- once the path is verified, launch the real run with `bash_exec(mode='detach', ...)` and normally leave `timeout_seconds` unset for that long run
|
|
201
|
+
- monitor through durable logs rather than only live terminal output
|
|
202
|
+
- `bash_exec(mode='read', id=...)` returns the full rendered log when it is 2000 lines or fewer; for longer logs it returns the first 500 lines plus the last 1500 lines and a hint to inspect omitted sections with `start` and `tail`
|
|
203
|
+
- if the middle of a long saved log matters, inspect that omitted region with `bash_exec(mode='read', id=..., start=..., tail=...)`
|
|
204
|
+
- use `bash_exec(mode='list')` and `bash_exec(mode='read', id=..., tail_limit=..., order='desc')` to monitor or revisit managed commands while focusing on the newest evidence first
|
|
205
|
+
- after the first read, prefer `bash_exec(mode='read', id=..., after_seq=last_seen_seq, tail_limit=..., order='asc')` so later checks only fetch new evidence
|
|
206
|
+
- if you need to recover ids or sanity-check the active session ordering, use `bash_exec(mode='history')`
|
|
207
|
+
- launch important runs with a structured `comment` such as `{stage, goal, action, expected_signal, next_check}`
|
|
208
|
+
- use `silent_seconds`, `progress_age_seconds`, `signal_age_seconds`, and `watchdog_overdue` from `bash_exec(mode='list'|'read', ...)` as your default watchdog signals
|
|
209
|
+
- use an explicit wait-and-check loop such as:
|
|
210
|
+
- wait about `60s`, then inspect logs
|
|
211
|
+
- wait about `120s`, then inspect logs
|
|
212
|
+
- wait about `300s`, then inspect logs
|
|
213
|
+
- wait about `600s`, then inspect logs
|
|
214
|
+
- wait about `1800s`, then inspect logs
|
|
215
|
+
- then keep checking about every `1800s` while the run is still active
|
|
216
|
+
- if needed, use an explicit bounded wait such as `bash_exec(command='sleep 60', mode='await', timeout_seconds=70)` or `bash_exec(mode='await', id=..., wait_timeout_seconds=1800)` between checks
|
|
217
|
+
- canonical sleep choice:
|
|
218
|
+
- if you only need wall-clock waiting between checks, use `bash_exec(command='sleep N', mode='await', timeout_seconds=N+buffer, ...)`
|
|
219
|
+
- keep a real buffer on that sleep timeout; do not set `timeout_seconds` exactly equal to `N`
|
|
220
|
+
- if you are waiting on an already running managed session, prefer `bash_exec(mode='await', id=..., wait_timeout_seconds=1800)` instead of starting a new sleep command
|
|
221
|
+
- if that bounded await returns while the session is still `running`, treat that as expected managed-monitoring behavior; read the log, judge forward progress, and then decide whether another `1800s` wait is justified
|
|
222
|
+
- after every completed sleep or await cycle, inspect logs first; only send `artifact.interact(kind='progress', ...)` when the user-visible state, frontier, blocker status, or ETA materially changed
|
|
223
|
+
- after the first meaningful signal, keep sending those progress updates at real checkpoints such as completion, recovery, a blocker, or a materially widened comparable surface rather than waiting silently
|
|
224
|
+
- if the run is clearly invalid, wedged, or superseded, stop it with `bash_exec(mode='kill', id=..., wait=true, timeout_seconds=...)`, adding `force=true` if it must die immediately; then record the reason, fix the issue, and relaunch cleanly
|
|
225
|
+
- do not report completion until logs and output files both confirm completion
|
|
226
|
+
- when you control the run code, prefer a throttled `tqdm` progress reporter and concise structured progress markers when feasible
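
The numeric parts of the protocol above (the wait cadence, the sleep-timeout buffer, and the log read window) can be sketched as plain helpers. None of these functions call `bash_exec`; `wait_schedule`, `sleep_timeout`, and `log_preview` are hypothetical names, the default buffer of 10 seconds is an assumption modeled on the `sleep 60` / `timeout_seconds=70` example, and `log_preview` only mirrors the described windowing rather than the tool's actual implementation.

```python
from itertools import chain, islice, repeat

INITIAL_WAITS = (60, 120, 300, 600, 1800)  # seconds, from the cadence above
STEADY_WAIT = 1800                         # repeated while the run is still active


def wait_schedule():
    """Yield the ramp-up waits once, then 1800s indefinitely."""
    return chain(INITIAL_WAITS, repeat(STEADY_WAIT))


def sleep_timeout(seconds, buffer=10):
    """Timeout for a plain sleep command: always strictly larger than the sleep."""
    if buffer <= 0:
        raise ValueError("buffer must be positive")
    return seconds + buffer


def log_preview(lines, full_limit=2000, head=500, tail=1500):
    """Return (shown_lines, omitted_count) using the described read window."""
    if len(lines) <= full_limit:
        return list(lines), 0
    omitted = len(lines) - head - tail
    return list(lines[:head]) + list(lines[-tail:]), omitted


# First seven checks of a long run.
first_waits = list(islice(wait_schedule(), 7))
```

Since `wait_schedule()` never terminates on its own, the caller is expected to stop iterating once the run finishes or is killed.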
|
|
227
|
+
|
|
228
|
+
Always preserve the managed `bash_exec` log and export it into the experiment artifact directory when the run artifact is written.
|
|
229
|
+
|
|
230
|
+
## 5.2 Progress marker protocol
|
|
231
|
+
|
|
232
|
+
If the run emits progress markers, keep them concise and machine-readable instead of narrating every low-level update in chat.
|
|
233
|
+
When a real checkpoint is reached, include the estimated next reply time and `next_reply_at` when that is honestly knowable.
|
|
234
|
+
|
|
235
|
+
## 6. Validate the outputs
|
|
236
|
+
|
|
237
|
+
After the run, verify:
|
|
238
|
+
|
|
239
|
+
- outputs correspond to the intended code and config
|
|
240
|
+
- metrics are complete and interpretable
|
|
241
|
+
- comparison to baseline is fair
|
|
242
|
+
- any failure mode or confounder is visible
|
|
243
|
+
- required metric keys are present and finite
|
|
244
|
+
- the result can be mapped back to the original claim
|
|
245
|
+
- the summary states a clear go or no-go recommendation
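
The "required metric keys are present and finite" check can be automated with a few lines. This is a minimal sketch that assumes metrics arrive as a flat mapping of key to number; `check_metrics` is a hypothetical helper, not part of the package.

```python
import math


def check_metrics(metrics, required_keys):
    """Return (missing, non_finite) key lists for a flat metrics mapping."""
    missing = [key for key in required_keys if key not in metrics]
    non_finite = []
    for key in required_keys:
        if key in metrics:
            value = metrics[key]
            # bool is excluded on purpose: True/False is not a metric value.
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not math.isfinite(value):
                non_finite.append(key)
    return missing, non_finite
```

Both lists should be empty before any delta against the baseline is claimed; a NaN, an infinity, or a string-typed value lands in the second list.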
|
|
246
|
+
|
|
247
|
+
Create a durable claim-validation record that maps:
|
|
248
|
+
|
|
249
|
+
- claim
|
|
250
|
+
- metric key
|
|
251
|
+
- expected direction
|
|
252
|
+
- observed result
|
|
253
|
+
- verdict:
|
|
254
|
+
- `supported`
|
|
255
|
+
- `refuted`
|
|
256
|
+
- `inconclusive`
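
A claim-validation record with the five mapped fields could look like the sketch below. The verdict values come from the list above; the `ClaimValidation` class, its attribute names, and the Markdown rendering are assumptions for illustration, not the package's own schema.

```python
from dataclasses import dataclass

VERDICTS = ("supported", "refuted", "inconclusive")


@dataclass(frozen=True)
class ClaimValidation:
    claim: str
    metric_key: str
    expected_direction: str
    observed_result: str
    verdict: str

    def __post_init__(self):
        # Reject anything outside the three verdicts listed above.
        if self.verdict not in VERDICTS:
            raise ValueError(f"verdict must be one of {VERDICTS}, got {self.verdict!r}")


def to_markdown(records):
    """Render records as a Markdown table with one row per claim."""
    header = "| claim | metric key | expected direction | observed result | verdict |"
    rule = "|---|---|---|---|---|"
    rows = [
        f"| {r.claim} | {r.metric_key} | {r.expected_direction} "
        f"| {r.observed_result} | {r.verdict} |"
        for r in records
    ]
    return "\n".join([header, rule, *rows])
```

The rendered table is one possible body for a `claim_validation.md` file.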
|
|
257
|
+
|
|
258
|
+
Also verify baseline comparability before claiming deltas:
|
|
259
|
+
|
|
260
|
+
- was the baseline verification stable?
|
|
261
|
+
- was the evaluation path the same?
|
|
262
|
+
- are the compared metric keys identical?
|
|
263
|
+
- if the run is claim-carrying, are the significance results or uncertainty estimates strong enough for main-text use?
|
|
264
|
+
- do known caveats make the delta weaker than it first appears?
|
|
265
|
+
|
|
266
|
+
## 7. Record the run
|
|
267
|
+
|
|
268
|
+
Every meaningful main run must be recorded through `artifact.record_main_experiment(...)`.
|
|
269
|
+
|
|
270
|
+
That call is responsible for writing:
|
|
271
|
+
|
|
272
|
+
- `experiments/main/<run_id>/RUN.md`
|
|
273
|
+
- `experiments/main/<run_id>/RESULT.json`
|
|
274
|
+
- the durable `run` artifact payload
|
|
275
|
+
- baseline comparisons
|
|
276
|
+
- breakthrough status derived by the system
|
|
277
|
+
|
|
278
|
+
`artifact.record_main_experiment(...)` should include at least:
|
|
279
|
+
|
|
280
|
+
- `run_id`
|
|
281
|
+
- title
|
|
282
|
+
- hypothesis
|
|
283
|
+
- setup
|
|
284
|
+
- execution
|
|
285
|
+
- results
|
|
286
|
+
- conclusion
|
|
287
|
+
- baseline reference
|
|
288
|
+
- `metrics_summary`
|
|
289
|
+
- `metric_rows` when available
|
|
290
|
+
- the metric contract actually used
|
|
291
|
+
- verdict
|
|
292
|
+
- evidence paths
|
|
293
|
+
- changed files
|
|
294
|
+
- relevant config paths when applicable
|
|
295
|
+
- `evaluation_summary` with exactly these six fields:
|
|
296
|
+
- `takeaway`
|
|
297
|
+
- `claim_update`
|
|
298
|
+
- `baseline_relation`
|
|
299
|
+
- `comparability`
|
|
300
|
+
- `failure_mode`
|
|
301
|
+
- `next_action`
|
|
302
|
+
|
|
303
|
+
Use `evaluation_summary` as the short structured judgment layer on top of the longer narrative fields:
|
|
304
|
+
|
|
305
|
+
- `takeaway`: one sentence the next reader can reuse directly
|
|
306
|
+
- `claim_update`: `strengthens`, `weakens`, `narrows`, or `neutral`
|
|
307
|
+
- `baseline_relation`: `better`, `worse`, `mixed`, or `not_comparable`
|
|
308
|
+
- `comparability`: `high`, `medium`, or `low`
|
|
309
|
+
- `failure_mode`: `none`, `implementation`, `evaluation`, `environment`, or `direction`
|
|
310
|
+
- `next_action`: the immediate route such as `continue`, `revise_idea`, `analysis_campaign`, `write`, or `stop`
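
Before calling `artifact.record_main_experiment(...)`, the `evaluation_summary` payload can be checked locally against the six fields and the enumerated values above. The validator below is a hypothetical sketch; it leaves `next_action` open-ended because the list above introduces its values with "such as".

```python
FIELDS = ("takeaway", "claim_update", "baseline_relation",
          "comparability", "failure_mode", "next_action")

# Closed value sets, copied from the field descriptions above.
ALLOWED = {
    "claim_update": {"strengthens", "weakens", "narrows", "neutral"},
    "baseline_relation": {"better", "worse", "mixed", "not_comparable"},
    "comparability": {"high", "medium", "low"},
    "failure_mode": {"none", "implementation", "evaluation", "environment", "direction"},
}


def validate_evaluation_summary(summary):
    """Return a list of problems; an empty list means the summary is well-formed."""
    errors = []
    missing = [name for name in FIELDS if name not in summary]
    extra = sorted(set(summary) - set(FIELDS))
    if missing:
        errors.append(f"missing fields: {missing}")
    if extra:
        errors.append(f"unexpected fields: {extra}")
    for name, allowed in ALLOWED.items():
        if name in summary and summary[name] not in allowed:
            errors.append(f"{name} must be one of {sorted(allowed)}")
    # Free-text fields only need to be non-empty.
    for name in ("takeaway", "next_action"):
        if name in summary and not str(summary[name]).strip():
            errors.append(f"{name} must not be empty")
    return errors
```

Whether the real call rejects malformed summaries is not stated here, so running a check like this first is a cheap safeguard rather than a documented requirement.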
|
|
311
|
+
|
|
312
|
+
After `artifact.record_main_experiment(...)` succeeds, do not assume the same branch should absorb the next round by default.
|
|
313
|
+
Interpret the measured result first, then either:
|
|
314
|
+
|
|
315
|
+
- launch analysis from this branch, or
|
|
316
|
+
- compare candidate foundations and create the next child research branch
|
|
317
|
+
|
|
318
|
+
Use `artifact.create_analysis_campaign(...)` only when the extra slices have clear academic or claim-level value relative to their resource cost.
|
|
319
|
+
If the main need is simply to continue optimization from a measured result, prefer a new durable child idea branch instead of an expensive analysis package by reflex.
|
|
320
|
+
If the extra work should happen on an older durable branch rather than the current head, first switch the runtime back there with `artifact.activate_branch(...)`, then launch the analysis campaign from that activated workspace.
|
|
321
|
+
|
|
322
|
+
When `artifact.record_main_experiment(...)` succeeds, send a richer threaded `artifact.interact(kind='milestone', ...)` update rather than a generic one-line progress ping.
|
|
323
|
+
Lead that milestone with a concise `1-2` sentence outcome summary before expanding into more detail.
|
|
324
|
+
That milestone should state:
|
|
325
|
+
|
|
326
|
+
- the research question that was tested
|
|
327
|
+
- the primary result and baseline delta
|
|
328
|
+
- whether the run supports the idea, weakens it, or leaves it inconclusive
|
|
329
|
+
- the main caveat or confidence note that still matters
|
|
330
|
+
- the exact recommended next move
|
|
331
|
+
|
|
332
|
+
Do not treat a main run as durably complete until `artifact.record_main_experiment(...)` succeeds.
|
|
333
|
+
|
|
334
|
+
Recommended per-run documentation fields:
|
|
335
|
+
|
|
336
|
+
1. research question
|
|
337
|
+
2. research type
|
|
338
|
+
3. research objective
|
|
339
|
+
4. experimental setup
|
|
340
|
+
5. experimental results
|
|
341
|
+
6. experimental analysis
|
|
342
|
+
7. experimental conclusions
|
|
343
|
+
|
|
344
|
+
For durable main runs, these seven fields should be progressively filled as the run advances, not only at final packaging time.
|
|
345
|
+
For lightweight runs, a shorter summary is acceptable if the route remains obvious and the result is still durably recorded.
|
|
346
|
+
|
|
347
|
+
`RUN.md` should make it easy for later stages to answer:
|
|
348
|
+
|
|
349
|
+
- what changed?
|
|
350
|
+
- how can this run be reproduced?
|
|
351
|
+
- what are the main results?
|
|
352
|
+
- why did it work or fail?
|
|
353
|
+
- what should happen next?
|
|
354
|
+
|
|
355
|
+
Recording rules:
|
|
356
|
+
|
|
357
|
+
- record results incrementally, not only at the end
|
|
358
|
+
- include timestamps when helpful
|
|
359
|
+
- include failed attempts, partial runs, and unexpected outcomes
|
|
360
|
+
- do not leave placeholder sections for later if the information is already known
|
|
361
|
+
- report exactly what happened, not what you hoped would happen
|
|
362
|
+
|
|
363
|
+
## 8. Decide the next move
|
|
364
|
+
|
|
365
|
+
The experiment stage should normally end with one of:
|
|
366
|
+
|
|
367
|
+
- continue the current line
|
|
368
|
+
- branch a new line
|
|
369
|
+
- launch an analysis campaign
|
|
370
|
+
- move to writing
|
|
371
|
+
- reset or stop
|
|
372
|
+
|
|
373
|
+
Do not let the stage end without an explicit next direction.
|
|
374
|
+
If analysis is selected, record why the expected information gain is strong enough to justify the added compute, time, or annotation budget.
|
|
@@ -1,13 +1,34 @@
|
|
|
1
1
|
# Main Experiment Checklist Template
|
|
2
2
|
|
|
3
3
|
Update this while planning, modifying code, running pilots, monitoring the full run, and validating the result.
|
|
4
|
+
For a lightweight run, keep only the core planning, validation, and closeout items active.
|
|
4
5
|
|
|
5
6
|
## Identity
|
|
6
7
|
|
|
8
|
+
- parent_map_node:
|
|
9
|
+
- loop_id:
|
|
7
10
|
- run id:
|
|
8
11
|
- idea id:
|
|
9
12
|
- stage:
|
|
10
13
|
|
|
14
|
+
## In Progress
|
|
15
|
+
|
|
16
|
+
- [ ] one concrete experiment frontier item is actively in progress
|
|
17
|
+
|
|
18
|
+
## Next
|
|
19
|
+
|
|
20
|
+
- [ ] next code / run / validation step is explicit
|
|
21
|
+
- [ ] next map transition is explicit
|
|
22
|
+
- [ ] next reporting checkpoint is explicit
|
|
23
|
+
|
|
24
|
+
## Later
|
|
25
|
+
|
|
26
|
+
- [ ] deferred but still relevant items live here
|
|
27
|
+
|
|
28
|
+
## Blocked
|
|
29
|
+
|
|
30
|
+
- [ ] blockers or unresolved dependencies are recorded here
|
|
31
|
+
|
|
11
32
|
## Planning
|
|
12
33
|
|
|
13
34
|
- [ ] selected idea summarized in `1-2` sentences
|
|
@@ -34,8 +55,7 @@ Update this while planning, modifying code, running pilots, monitoring the full
|
|
|
34
55
|
## Main Run
|
|
35
56
|
|
|
36
57
|
- [ ] real run launched
|
|
37
|
-
- [ ]
|
|
38
|
-
- [ ] health signals confirmed
|
|
58
|
+
- [ ] health signals confirmed when the run is long enough to need monitoring
|
|
39
59
|
- [ ] major runtime deviations reflected in `PLAN.md`
|
|
40
60
|
|
|
41
61
|
## Validation
|
|
@@ -46,6 +66,10 @@ Update this while planning, modifying code, running pilots, monitoring the full
|
|
|
46
66
|
- [ ] main claim is classified as supported / refuted / inconclusive
|
|
47
67
|
- [ ] result recorded durably
|
|
48
68
|
|
|
69
|
+
## Done
|
|
70
|
+
|
|
71
|
+
- [ ] completed frontier items are moved here instead of staying mixed into `Next`
|
|
72
|
+
|
|
49
73
|
## Closeout
|
|
50
74
|
|
|
51
75
|
- [ ] main experiment summarized in `1-2` sentences
|
|
@@ -2,8 +2,20 @@
|
|
|
2
2
|
|
|
3
3
|
Use this before substantial code edits or the real main run.
|
|
4
4
|
Treat it as the implementation-and-execution plan for the selected idea, not just a metadata form.
|
|
5
|
+
For lightweight runs, a one-screen plan is enough if it preserves the route, comparability boundary, command path, outputs, and fallback.
|
|
5
6
|
|
|
6
|
-
## 1.
|
|
7
|
+
## 1. Map Link
|
|
8
|
+
|
|
9
|
+
- parent_map_node:
|
|
10
|
+
- loop_id:
|
|
11
|
+
- node_objective:
|
|
12
|
+
- node_deliverable:
|
|
13
|
+
- success_condition:
|
|
14
|
+
- abandonment_condition:
|
|
15
|
+
- next_on_success:
|
|
16
|
+
- next_on_failure:
|
|
17
|
+
|
|
18
|
+
## 2. Objective
|
|
7
19
|
|
|
8
20
|
- run id:
|
|
9
21
|
- selected idea in `1-2` sentences:
|
|
@@ -13,7 +25,15 @@ Treat it as the implementation-and-execution plan for the selected idea, not jus
|
|
|
13
25
|
- null hypothesis:
|
|
14
26
|
- alternative hypothesis:
|
|
15
27
|
|
|
16
|
-
##
|
|
28
|
+
## 3. Current Node Tasks
|
|
29
|
+
|
|
30
|
+
- [ ] sync the experiment node status and current incumbent context
|
|
31
|
+
- [ ] confirm comparability and code translation plan
|
|
32
|
+
- [ ] run the smoke or pilot path
|
|
33
|
+
- [ ] launch or validate the main run
|
|
34
|
+
- [ ] classify the result and update the next map edge
|
|
35
|
+
|
|
36
|
+
## 4. Baseline And Comparability
|
|
17
37
|
|
|
18
38
|
- baseline id:
|
|
19
39
|
- baseline variant:
|
|
@@ -22,7 +42,7 @@ Treat it as the implementation-and-execution plan for the selected idea, not jus
|
|
|
22
42
|
- required metric keys:
|
|
23
43
|
- comparability risks:
|
|
24
44
|
|
|
25
|
-
##
|
|
45
|
+
## 5. Code Translation Plan
|
|
26
46
|
|
|
27
47
|
Map the idea into concrete code changes.
|
|
28
48
|
|
|
@@ -30,7 +50,7 @@ Map the idea into concrete code changes.
|
|
|
30
50
|
|---|---|---|---|---|
|
|
31
51
|
| | | | | |
|
|
32
52
|
|
|
33
|
-
##
|
|
53
|
+
## 6. Execution Design
|
|
34
54
|
|
|
35
55
|
- minimal experiment:
|
|
36
56
|
- smoke / pilot plan:
|
|
@@ -40,7 +60,7 @@ Map the idea into concrete code changes.
|
|
|
40
60
|
- abandonment condition:
|
|
41
61
|
- strongest alternative hypothesis:
|
|
42
62
|
|
|
43
|
-
##
|
|
63
|
+
## 7. Runtime Strategy
|
|
44
64
|
|
|
45
65
|
- command for smoke:
|
|
46
66
|
- command for main run:
|
|
@@ -48,31 +68,22 @@ Map the idea into concrete code changes.
|
|
|
48
68
|
- log / artifact locations:
|
|
49
69
|
- safe efficiency levers to use first:
|
|
50
70
|
- how existing tooling will be used efficiently:
|
|
51
|
-
|
|
52
|
-
Monitoring and sleep plan:
|
|
53
|
-
|
|
54
|
-
- wait cadence:
|
|
55
|
-
- `60s`
|
|
56
|
-
- `120s`
|
|
57
|
-
- `300s`
|
|
58
|
-
- `600s`
|
|
59
|
-
- `1800s`
|
|
60
71
|
- health signals that justify continuing to monitor:
|
|
61
72
|
- conditions that trigger kill / relaunch:
|
|
62
73
|
|
|
63
|
-
##
|
|
74
|
+
## 8. Fallbacks And Recovery
|
|
64
75
|
|
|
65
76
|
- if the intended model / endpoint / download path fails:
|
|
66
77
|
- if hardware or memory is tighter than expected:
|
|
67
78
|
- if the code path is wrong after smoke:
|
|
68
79
|
- if the first full run becomes non-comparable:
|
|
69
80
|
|
|
70
|
-
##
|
|
81
|
+
## 9. Checklist Link
|
|
71
82
|
|
|
72
83
|
- checklist path:
|
|
73
84
|
- next unchecked item:
|
|
74
85
|
|
|
75
|
-
##
|
|
86
|
+
## 10. Revision Log
|
|
76
87
|
|
|
77
88
|
| Time | What changed | Why it changed | Impact on comparability or runtime |
|
|
78
89
|
|---|---|---|---|
|
|
@@ -0,0 +1,108 @@
|
|
|
1
|
+
# Operational Guidance
|
|
2
|
+
|
|
3
|
+
Use this reference when the experiment route needs the longer planning, environment, artifact, memory, or charting notes rather than the main control surface in `SKILL.md`.
|
|
4
|
+
|
|
5
|
+
## Planning surfaces
|
|
6
|
+
|
|
7
|
+
Use quest or workspace planning files only when they help control a non-trivial run; otherwise keep the run contract small and move to the first decisive execution step.
|
|
8
|
+
|
|
9
|
+
## Required plan and checklist
|
|
10
|
+
|
|
11
|
+
Before substantial implementation work or a real main run, create a quest-visible `PLAN.md` and `CHECKLIST.md`.
|
|
12
|
+
|
|
13
|
+
- Use `references/main-experiment-plan-template.md` as the canonical structure for `PLAN.md`.
|
|
14
|
+
- Use `references/main-experiment-checklist-template.md` as the canonical structure for `CHECKLIST.md`.
|
|
15
|
+
- `PLAN.md` and `CHECKLIST.md` are the canonical planning-and-control surface before and during execution.
|
|
16
|
+
- `PLAN.md` should lead with the selected idea summarized in `1-2` sentences and include the baseline and comparability rules, safe efficiency levers, minimal code-change map, smoke or pilot path, full-run path, fallback options, monitoring and sleep rules, expected outputs, and a revision log.
|
|
17
|
+
- Once the route is concrete, implement according to the current `PLAN.md`.
|
|
18
|
+
- If the code path, comparability contract, runtime strategy, or execution route changes materially, revise `PLAN.md` before spending more code or compute.
|
|
19
|
+
|
|
20
|
+
## Working-boundary rules
|
|
21
|
+
|
|
22
|
+
Only modify the active quest workspace for this experiment line.
|
|
23
|
+
|
|
24
|
+
- treat the accepted baseline workspace as read-only
|
|
25
|
+
- do not derive branch or worktree assumptions from guesswork
|
|
26
|
+
- keep all durable outputs inside the quest
|
|
27
|
+
- if the runtime gives an explicit worktree path, use it exactly
|
|
28
|
+
|
|
29
|
+
## Resource note
|
|
30
|
+
|
|
31
|
+
Respect explicit resource limits and record real environment or dependency constraints, but do not stop the run early just to over-document them.
|
|
32
|
+
|
|
33
|
+
## Resource and environment rules
|
|
34
|
+
|
|
35
|
+
- Follow the explicit resource assignment if one exists.
|
|
36
|
+
- If GPU assignment is explicit, respect it exactly and record it in the run manifest.
|
|
37
|
+
- Do not silently consume extra GPUs or broaden resource scope.
|
|
38
|
+
- Capture enough environment information that the run can later be reconstructed.
|
|
39
|
+
- If a new dependency appears necessary, record it as a risk and prefer a fallback if possible.
|
|
40
|
+
|
|
41
|
+
## Required durable outputs
|
|
42
|
+
|
|
43
|
+
A meaningful experiment pass should leave behind:
|
|
44
|
+
|
|
45
|
+
- a run directory under `artifacts/experiment/<run_id>/` or the quest-equivalent canonical location
|
|
46
|
+
- `artifact_manifest.json`, `run_manifest.json`, `metrics.json`, and `summary.md`
|
|
47
|
+
- `metrics.md` and `runlog.summary.md` for durable main runs
|
|
48
|
+
- durable command, config, and log pointers
|
|
49
|
+
- exported shell log, typically `bash.log`
|
|
50
|
+
- a run artifact with explicit deltas versus baseline
|
|
51
|
+
- a decision about what should happen next
|
|
52
|
+
|
|
53
|
+
Recommended additional files:
|
|
54
|
+
|
|
55
|
+
- `claim_validation.md`
|
|
56
|
+
- environment snapshot files such as Python version, package freeze, and GPU info when applicable
|
|
57
|
+
- a live execution note or rolling run log when the experiment spans multiple implementation or execution steps
|
|
58
|
+
|
|
59
|
+
`run_manifest.json` should capture at least:
|
|
60
|
+
|
|
61
|
+
- `run_id`
|
|
62
|
+
- quest or branch context
|
|
63
|
+
- baseline reference or commit
|
|
64
|
+
- full commands
|
|
65
|
+
- config paths and key resolved hyperparameters
|
|
66
|
+
- dataset identifier or version
|
|
67
|
+
- seeds
|
|
68
|
+
- environment snapshot paths
|
|
69
|
+
- start time, end time, and final status
|
|
70
|
+
|
|
71
|
+
If a command needed for environment capture is unavailable, record that gap in the manifest and summary.
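
A minimal sketch of assembling `run_manifest.json` follows, including the gap-recording behavior described above. Apart from `run_id`, the key names are assumptions derived from the list, the helper names are hypothetical, and the sketch inlines the environment snapshot instead of pointing to snapshot files.

```python
import json
import platform
import shutil
import subprocess
import sys
from datetime import datetime, timezone


def capture_environment():
    """Return (snapshot, gaps); gaps lists anything that could not be captured."""
    snapshot = {"python": sys.version.split()[0], "platform": platform.platform()}
    gaps = []
    if shutil.which("nvidia-smi") is None:
        gaps.append("nvidia-smi not found: GPU info not captured")
    else:
        listing = subprocess.run(["nvidia-smi", "-L"],
                                 capture_output=True, text=True, check=False)
        if listing.returncode != 0:
            gaps.append("nvidia-smi failed: GPU info not captured")
        else:
            snapshot["gpus"] = listing.stdout.strip().splitlines()
    return snapshot, gaps


def build_manifest(run_id, context, baseline_ref, commands, config_paths,
                   hyperparameters, dataset, seeds, start_time, final_status):
    """Assemble the manifest dict; end_time is stamped when this is called."""
    snapshot, gaps = capture_environment()
    return {
        "run_id": run_id,
        "context": context,                  # quest or branch context
        "baseline_reference": baseline_ref,
        "commands": list(commands),          # full commands, not abbreviations
        "config_paths": list(config_paths),
        "hyperparameters": dict(hyperparameters),
        "dataset": dataset,
        "seeds": list(seeds),
        "environment": snapshot,
        "capture_gaps": gaps,                # recorded instead of silently skipped
        "start_time": start_time,
        "end_time": datetime.now(timezone.utc).isoformat(),
        "final_status": final_status,
    }
```

The result serializes directly with `json.dumps(manifest, indent=2)`; the same `capture_gaps` entries should also be mentioned in the run summary.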
|
|
72
|
+
|
|
73
|
+
## Memory rules
|
|
74
|
+
|
|
75
|
+
Stage-start requirement:
|
|
76
|
+
|
|
77
|
+
- begin every experiment pass with `memory.list_recent(scope='quest', limit=5)`
|
|
78
|
+
- then run at least one experiment-relevant `memory.search(...)` before reopening a previously tested command path or retrying an old run
|
|
79
|
+
|
|
80
|
+
Stage-end requirement:
|
|
81
|
+
|
|
82
|
+
- if the run produced a durable lesson, incident pattern, comparability caveat, or route-changing outcome, write at least one `memory.write(...)` before leaving the stage
|
|
83
|
+
|
|
84
|
+
## Memory note
|
|
85
|
+
|
|
86
|
+
Use memory only to avoid repeating known failures or to preserve reusable experiment lessons; the canonical run record belongs in `artifact`.
|
|
87
|
+
|
|
88
|
+
## Artifact rules
|
|
89
|
+
|
|
90
|
+
- use `progress` for long-running execution updates
|
|
91
|
+
- use `artifact.record_main_experiment(...)` for each meaningful completed main experiment
|
|
92
|
+
- use `report` for suspicious-result investigations or analysis-rich summaries when they materially help the next route
|
|
93
|
+
- use `decision` for continue / branch / analysis / write / reset / stop
|
|
94
|
+
- use `approval` when an explicit user approval is captured for an expensive or risky run change
|
|
95
|
+
- use `artifact.checkpoint(...)` when code evolution is meaningful and should be preserved in Git
|
|
96
|
+
- after a meaningful experiment checkpoint or completion, emit `artifact.interact(kind='progress' | 'milestone', ...)` so the user sees the concrete result and next step
|
|
97
|
+
|
|
98
|
+
## Connector-facing chart requirements
|
|
99
|
+
|
|
100
|
+
When this stage produces connector-facing charts or milestone-facing visuals, keep the palette aligned with the system prompt Morandi plotting template.
|
|
101
|
+
|
|
102
|
+
- `sage-clay` should be the primary positive or accepted-result color
|
|
103
|
+
- `mist-stone` should be the neutral comparison or baseline color
|
|
104
|
+
- `dust-rose` should be the restrained caution or limitation accent when needed
|
|
105
|
+
- keep light paper-style backgrounds close to `#F3EEE8`
|
|
106
|
+
- use calm connector-safe palettes such as `sage-clay`, `mist-stone`, and `dust-rose`
|
|
107
|
+
- highlight only the decisive delta; do not color every series as if all were equally important
|
|
108
|
+
- stay aligned with the system prompt rather than inventing a new local visual language for each chart
|