npm - @researai/deepscientist - Versions diffs - 1.5.16 → 1.6.0 - Mend

@researai/deepscientist 1.5.16 → 1.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (896)

package/AGENTS.md +309 -130
package/AISB/catalog/aisb.b1.agentic_coding.yaml +244 -0
package/AISB/catalog/aisb.b10.climate_earth.yaml +235 -0
package/AISB/catalog/aisb.b11.model_efficiency.yaml +231 -0
package/AISB/catalog/aisb.b12.embodied_ai.yaml +238 -0
package/AISB/catalog/aisb.b2.agent_systems.yaml +229 -0
package/AISB/catalog/aisb.b3.self_evolving_rl.yaml +237 -0
package/AISB/catalog/aisb.b4.lm_reasoning.yaml +240 -0
package/AISB/catalog/aisb.b5.math_proof.yaml +235 -0
package/AISB/catalog/aisb.b6.research_process.yaml +243 -0
package/AISB/catalog/aisb.b7.multimodal_fusion.yaml +232 -0
package/AISB/catalog/aisb.b8.lifesci_drug.yaml +275 -0
package/AISB/catalog/aisb.b9.material_science.yaml +237 -0
package/AISB/catalog/aisb.t3.001_savvy.yaml +159 -0
package/AISB/catalog/aisb.t3.001_savvy.zh.yaml +121 -0
package/AISB/catalog/aisb.t3.002_pinet.yaml +189 -0
package/AISB/catalog/aisb.t3.002_pinet.zh.yaml +130 -0
package/AISB/catalog/aisb.t3.004_decentralattn.yaml +184 -0
package/AISB/catalog/aisb.t3.004_decentralattn.zh.yaml +153 -0
package/AISB/catalog/aisb.t3.005_tsae.yaml +193 -0
package/AISB/catalog/aisb.t3.005_tsae.zh.yaml +139 -0
package/AISB/catalog/aisb.t3.006_physense.yaml +194 -0
package/AISB/catalog/aisb.t3.006_physense.zh.yaml +118 -0
package/AISB/catalog/aisb.t3.007_reasoningiqa.yaml +169 -0
package/AISB/catalog/aisb.t3.007_reasoningiqa.zh.yaml +133 -0
package/AISB/catalog/aisb.t3.008_meanflows.yaml +188 -0
package/AISB/catalog/aisb.t3.008_meanflows.zh.yaml +140 -0
package/AISB/catalog/aisb.t3.009_scoremissing.yaml +179 -0
package/AISB/catalog/aisb.t3.009_scoremissing.zh.yaml +119 -0
package/AISB/catalog/aisb.t3.010_suitabilityfilter.yaml +221 -0
package/AISB/catalog/aisb.t3.010_suitabilityfilter.zh.yaml +141 -0
package/AISB/catalog/aisb.t3.011_osd.yaml +206 -0
package/AISB/catalog/aisb.t3.011_osd.zh.yaml +163 -0
package/AISB/catalog/aisb.t3.012_efficientqat.yaml +206 -0
package/AISB/catalog/aisb.t3.012_efficientqat.zh.yaml +159 -0
package/AISB/catalog/aisb.t3.013_appl.yaml +152 -0
package/AISB/catalog/aisb.t3.013_appl.zh.yaml +126 -0
package/AISB/catalog/aisb.t3.014_piguard.yaml +207 -0
package/AISB/catalog/aisb.t3.014_piguard.zh.yaml +164 -0
package/AISB/catalog/aisb.t3.015_frspec.yaml +209 -0
package/AISB/catalog/aisb.t3.015_frspec.zh.yaml +163 -0
package/AISB/catalog/aisb.t3.016_mathfusion.yaml +166 -0
package/AISB/catalog/aisb.t3.016_mathfusion.zh.yaml +145 -0
package/AISB/catalog/aisb.t3.017_multimodalglp.yaml +171 -0
package/AISB/catalog/aisb.t3.017_multimodalglp.zh.yaml +122 -0
package/AISB/catalog/aisb.t3.018_cotsynth.yaml +206 -0
package/AISB/catalog/aisb.t3.018_cotsynth.zh.yaml +162 -0
package/AISB/catalog/aisb.t3.019_dyscaleut.yaml +211 -0
package/AISB/catalog/aisb.t3.019_dyscaleut.zh.yaml +148 -0
package/AISB/catalog/aisb.t3.020_aristotle.yaml +173 -0
package/AISB/catalog/aisb.t3.020_aristotle.zh.yaml +119 -0
package/AISB/catalog/aisb.t3.021_tokenrecycling.yaml +160 -0
package/AISB/catalog/aisb.t3.021_tokenrecycling.zh.yaml +129 -0
package/AISB/catalog/aisb.t3.022_chainofreasoning.yaml +204 -0
package/AISB/catalog/aisb.t3.022_chainofreasoning.zh.yaml +161 -0
package/AISB/catalog/aisb.t3.023_guidedembed.yaml +211 -0
package/AISB/catalog/aisb.t3.023_guidedembed.zh.yaml +189 -0
package/AISB/catalog/aisb.t3.024_outputcentric.yaml +148 -0
package/AISB/catalog/aisb.t3.024_outputcentric.zh.yaml +131 -0
package/AISB/catalog/aisb.t3.025_deeper.yaml +143 -0
package/AISB/catalog/aisb.t3.025_deeper.zh.yaml +116 -0
package/AISB/catalog/aisb.t3.026_gartkg.yaml +195 -0
package/AISB/catalog/aisb.t3.026_gartkg.zh.yaml +127 -0
package/AISB/catalog/aisb.t3.027_citeeval.yaml +182 -0
package/AISB/catalog/aisb.t3.027_citeeval.zh.yaml +135 -0
package/AISB/catalog/aisb.t3.028_sbam.yaml +206 -0
package/AISB/catalog/aisb.t3.028_sbam.zh.yaml +166 -0
package/AISB/catalog/aisb.t3.029_cdqgeoembed.yaml +224 -0
package/AISB/catalog/aisb.t3.029_cdqgeoembed.zh.yaml +142 -0
package/AISB/catalog/aisb.t3.030_processrm.yaml +211 -0
package/AISB/catalog/aisb.t3.030_processrm.zh.yaml +166 -0
package/AISB/catalog/aisb.t3.031_circuitstability.yaml +172 -0
package/AISB/catalog/aisb.t3.031_circuitstability.zh.yaml +134 -0
package/AISB/catalog/aisb.t3.032_ptsolver.yaml +169 -0
package/AISB/catalog/aisb.t3.032_ptsolver.zh.yaml +135 -0
package/AISB/catalog/aisb.t3.033_gcse.yaml +144 -0
package/AISB/catalog/aisb.t3.033_gcse.zh.yaml +126 -0
package/AISB/catalog/aisb.t3.034_ensemblewm.yaml +183 -0
package/AISB/catalog/aisb.t3.034_ensemblewm.zh.yaml +146 -0
package/AISB/catalog/aisb.t3.035_moralvalueswa.yaml +207 -0
package/AISB/catalog/aisb.t3.035_moralvalueswa.zh.yaml +165 -0
package/AISB/catalog/aisb.t3.036_weakstrongpref.yaml +210 -0
package/AISB/catalog/aisb.t3.036_weakstrongpref.zh.yaml +194 -0
package/AISB/catalog/aisb.t3.037_dementiamask.yaml +172 -0
package/AISB/catalog/aisb.t3.037_dementiamask.zh.yaml +132 -0
package/AISB/catalog/aisb.t3.038_tinysam.yaml +284 -0
package/AISB/catalog/aisb.t3.038_tinysam.zh.yaml +240 -0
package/AISB/catalog/aisb.t3.039_calf.yaml +224 -0
package/AISB/catalog/aisb.t3.039_calf.zh.yaml +194 -0
package/AISB/catalog/aisb.t3.040_graniteguardian.yaml +199 -0
package/AISB/catalog/aisb.t3.040_graniteguardian.zh.yaml +174 -0
package/AISB/catalog/aisb.t3.041_amdm.yaml +149 -0
package/AISB/catalog/aisb.t3.041_amdm.zh.yaml +137 -0
package/AISB/catalog/aisb.t3.042_xpatch.yaml +216 -0
package/AISB/catalog/aisb.t3.042_xpatch.zh.yaml +182 -0
package/AISB/catalog/aisb.t3.043_vhm.yaml +268 -0
package/AISB/catalog/aisb.t3.043_vhm.zh.yaml +193 -0
package/AISB/catalog/aisb.t3.044_rgvi.yaml +224 -0
package/AISB/catalog/aisb.t3.044_rgvi.zh.yaml +176 -0
package/AISB/catalog/aisb.t3.045_pslstm.yaml +203 -0
package/AISB/catalog/aisb.t3.045_pslstm.zh.yaml +179 -0
package/AISB/catalog/aisb.t3.046_nonstatts.yaml +208 -0
package/AISB/catalog/aisb.t3.046_nonstatts.zh.yaml +194 -0
package/AISB/catalog/aisb.t3.047_timepfn.yaml +156 -0
package/AISB/catalog/aisb.t3.047_timepfn.zh.yaml +124 -0
package/AISB/catalog/aisb.t3.048_proxyspex.yaml +148 -0
package/AISB/catalog/aisb.t3.048_proxyspex.zh.yaml +125 -0
package/AISB/catalog/aisb.t3.049_hogwildinference.yaml +183 -0
package/AISB/catalog/aisb.t3.049_hogwildinference.zh.yaml +138 -0
package/AISB/catalog/aisb.t3.050_causalpfn.yaml +214 -0
package/AISB/catalog/aisb.t3.050_causalpfn.zh.yaml +190 -0
package/AISB/catalog/aisb.t3.051_flashtp.yaml +169 -0
package/AISB/catalog/aisb.t3.051_flashtp.zh.yaml +124 -0
package/AISB/catalog/aisb.t3.052_nsdiff.yaml +155 -0
package/AISB/catalog/aisb.t3.052_nsdiff.zh.yaml +138 -0
package/AISB/catalog/aisb.t3.053_k2vae.yaml +158 -0
package/AISB/catalog/aisb.t3.053_k2vae.zh.yaml +132 -0
package/AISB/catalog/aisb.t3.054_timebase.yaml +178 -0
package/AISB/catalog/aisb.t3.054_timebase.zh.yaml +158 -0
package/AISB/catalog/aisb.t3.055_csbrain.yaml +238 -0
package/AISB/catalog/aisb.t3.055_csbrain.zh.yaml +184 -0
package/AISB/catalog/aisb.t3.056_infosam.yaml +224 -0
package/AISB/catalog/aisb.t3.056_infosam.zh.yaml +189 -0
package/AISB/catalog/aisb.t3.057_mdreid.yaml +129 -0
package/AISB/catalog/aisb.t3.057_mdreid.zh.yaml +117 -0
package/AISB/catalog/aisb.t3.058_mindglitch.yaml +171 -0
package/AISB/catalog/aisb.t3.058_mindglitch.zh.yaml +145 -0
package/AISB/catalog/aisb.t3.059_selfsupervised.yaml +154 -0
package/AISB/catalog/aisb.t3.059_selfsupervised.zh.yaml +125 -0
package/AISB/catalog/aisb.t3.060_iaggad.yaml +121 -0
package/AISB/catalog/aisb.t3.060_iaggad.zh.yaml +100 -0
package/AISB/catalog/aisb.t3.061_hsgkn.yaml +136 -0
package/AISB/catalog/aisb.t3.061_hsgkn.zh.yaml +113 -0
package/AISB/catalog/aisb.t3.062_visionts.yaml +237 -0
package/AISB/catalog/aisb.t3.062_visionts.zh.yaml +216 -0
package/AISB/catalog/aisb.t3.063_tsrag.yaml +162 -0
package/AISB/catalog/aisb.t3.063_tsrag.zh.yaml +138 -0
package/AISB/catalog/aisb.t3.064_pir.yaml +221 -0
package/AISB/catalog/aisb.t3.064_pir.zh.yaml +197 -0
package/AISB/catalog/aisb.t3.065_proteinbinding.yaml +234 -0
package/AISB/catalog/aisb.t3.065_proteinbinding.zh.yaml +167 -0
package/AISB/catalog/aisb.t3.066_tropicalattention.yaml +267 -0
package/AISB/catalog/aisb.t3.066_tropicalattention.zh.yaml +229 -0
package/AISB/catalog/aisb.t3.067_kanad.yaml +193 -0
package/AISB/catalog/aisb.t3.067_kanad.zh.yaml +167 -0
package/AISB/catalog/aisb.t3.068_sempo.yaml +187 -0
package/AISB/catalog/aisb.t3.068_sempo.zh.yaml +148 -0
package/AISB/catalog/aisb.t3.069_treehfd.yaml +129 -0
package/AISB/catalog/aisb.t3.069_treehfd.zh.yaml +111 -0
package/AISB/catalog/aisb.t3.070_certifiedunlearning.yaml +224 -0
package/AISB/catalog/aisb.t3.070_certifiedunlearning.zh.yaml +171 -0
package/AISB/catalog/aisb.t3.071_neuralmjd.yaml +142 -0
package/AISB/catalog/aisb.t3.071_neuralmjd.zh.yaml +120 -0
package/AISB/catalog/aisb.t3.072_fedgmt.yaml +181 -0
package/AISB/catalog/aisb.t3.072_fedgmt.zh.yaml +158 -0
package/AISB/catalog/aisb.t3.073_rld.yaml +161 -0
package/AISB/catalog/aisb.t3.073_rld.zh.yaml +129 -0
package/AISB/catalog/aisb.t3.074_lsvi.yaml +163 -0
package/AISB/catalog/aisb.t3.074_lsvi.zh.yaml +129 -0
package/AISB/catalog/aisb.t3.075_treeslicedentropy.yaml +201 -0
package/AISB/catalog/aisb.t3.075_treeslicedentropy.zh.yaml +148 -0
package/AISB/catalog/aisb.t3.076_aanet.yaml +169 -0
package/AISB/catalog/aisb.t3.076_aanet.zh.yaml +129 -0
package/AISB/catalog/aisb.t3.077_cmnn.yaml +199 -0
package/AISB/catalog/aisb.t3.077_cmnn.zh.yaml +165 -0
package/AISB/catalog/aisb.t3.078_conformalanomaly.yaml +146 -0
package/AISB/catalog/aisb.t3.078_conformalanomaly.zh.yaml +117 -0
package/AISB/catalog/aisb.t3.079_dpfkmeans.yaml +131 -0
package/AISB/catalog/aisb.t3.079_dpfkmeans.zh.yaml +104 -0
package/AISB/catalog/aisb.t3.080_latentscorereweight.yaml +169 -0
package/AISB/catalog/aisb.t3.080_latentscorereweight.zh.yaml +123 -0
package/AISB/catalog/aisb.t3.081_qmamba.yaml +150 -0
package/AISB/catalog/aisb.t3.081_qmamba.zh.yaml +117 -0
package/AISB/catalog/aisb.t3.082_onlinellmrouting.yaml +160 -0
package/AISB/catalog/aisb.t3.082_onlinellmrouting.zh.yaml +133 -0
package/AISB/catalog/aisb.t3.083_starformer.yaml +178 -0
package/AISB/catalog/aisb.t3.083_starformer.zh.yaml +140 -0
package/AISB/catalog/aisb.t3.084_ift.yaml +139 -0
package/AISB/catalog/aisb.t3.084_ift.zh.yaml +111 -0
package/AISB/catalog/aisb.t3.085_neuralsurv.yaml +183 -0
package/AISB/catalog/aisb.t3.085_neuralsurv.zh.yaml +143 -0
package/AISB/catalog/aisb.t3.086_stella.yaml +197 -0
package/AISB/catalog/aisb.t3.086_stella.zh.yaml +142 -0
package/AISB/catalog/aisb.t3.087_moses.yaml +167 -0
package/AISB/catalog/aisb.t3.087_moses.zh.yaml +132 -0
package/AISB/catalog/aisb.t3.088_channelnorm.yaml +140 -0
package/AISB/catalog/aisb.t3.088_channelnorm.zh.yaml +109 -0
package/AISB/catalog/aisb.t3.089_causalvelocity.yaml +730 -0
package/AISB/catalog/aisb.t3.089_causalvelocity.zh.yaml +668 -0
package/AISB/catalog/aisb.t3.090_rstib.yaml +144 -0
package/AISB/catalog/aisb.t3.090_rstib.zh.yaml +109 -0
package/AISB/catalog/aisb.t3.091_timeawarecausal.yaml +132 -0
package/AISB/catalog/aisb.t3.091_timeawarecausal.zh.yaml +107 -0
package/AISB/catalog/aisb.t3.092_kmeanslocalopt.yaml +138 -0
package/AISB/catalog/aisb.t3.092_kmeanslocalopt.zh.yaml +110 -0
package/AISB/catalog/aisb.t3.093_fedwmsam.yaml +134 -0
package/AISB/catalog/aisb.t3.093_fedwmsam.zh.yaml +106 -0
package/AISB/catalog/aisb.t3.094_boundre.yaml +147 -0
package/AISB/catalog/aisb.t3.094_boundre.zh.yaml +114 -0
package/AISB/catalog/aisb.t3.095_fastfeaturecp.yaml +153 -0
package/AISB/catalog/aisb.t3.095_fastfeaturecp.zh.yaml +118 -0
package/AISB/catalog/aisb.t3.096_m3svm.yaml +189 -0
package/AISB/catalog/aisb.t3.096_m3svm.zh.yaml +149 -0
package/AISB/catalog/aisb.t3.097_wassersteintl.yaml +212 -0
package/AISB/catalog/aisb.t3.097_wassersteintl.zh.yaml +169 -0
package/AISB/catalog/aisb.t3.098_xmahalanobis.yaml +171 -0
package/AISB/catalog/aisb.t3.098_xmahalanobis.zh.yaml +127 -0
package/AISB/catalog/aisb.t3.099_ollalanding.yaml +248 -0
package/AISB/catalog/aisb.t3.099_ollalanding.zh.yaml +182 -0
package/AISB/catalog/aisb.t3.100_invmissingdata.yaml +179 -0
package/AISB/catalog/aisb.t3.100_invmissingdata.zh.yaml +150 -0
package/AISB/catalog/aisb.t3.101_acia.yaml +164 -0
package/AISB/catalog/aisb.t3.101_acia.zh.yaml +109 -0
package/AISB/catalog/aisb.t3.102_stochasticff.yaml +178 -0
package/AISB/catalog/aisb.t3.102_stochasticff.zh.yaml +130 -0
package/AISB/catalog/aisb.t3.103_qdcp.yaml +150 -0
package/AISB/catalog/aisb.t3.103_qdcp.zh.yaml +116 -0
package/AISB/catalog/aisb.t3.104_balancedactiveinf.yaml +137 -0
package/AISB/catalog/aisb.t3.104_balancedactiveinf.zh.yaml +104 -0
package/AISB/catalog/aisb.t3.105_binaryclasseval.yaml +161 -0
package/AISB/catalog/aisb.t3.105_binaryclasseval.zh.yaml +130 -0
package/AISB/image/001_aisb.t3.001_savvy.jpg +0 -0
package/AISB/image/002_aisb.t3.002_pinet.jpg +0 -0
package/AISB/image/003_aisb.t3.003_dmsqd.jpg +0 -0
package/AISB/image/004_aisb.t3.004_decentralattn.jpg +0 -0
package/AISB/image/005_aisb.t3.005_tsae.jpg +0 -0
package/AISB/image/006_aisb.t3.006_physense.jpg +0 -0
package/AISB/image/007_aisb.t3.007_reasoningiqa.jpg +0 -0
package/AISB/image/008_aisb.t3.008_meanflows.jpg +0 -0
package/AISB/image/009_aisb.t3.009_scoremissing.jpg +0 -0
package/AISB/image/010_aisb.t3.010_suitabilityfilter.jpg +0 -0
package/AISB/image/011_aisb.t3.011_osd.jpg +0 -0
package/AISB/image/012_aisb.t3.012_efficientqat.jpg +0 -0
package/AISB/image/013_aisb.t3.013_appl.jpg +0 -0
package/AISB/image/014_aisb.t3.014_piguard.jpg +0 -0
package/AISB/image/015_aisb.t3.015_frspec.jpg +0 -0
package/AISB/image/016_aisb.t3.016_mathfusion.jpg +0 -0
package/AISB/image/017_aisb.t3.017_multimodalglp.jpg +0 -0
package/AISB/image/018_aisb.t3.018_cotsynth.jpg +0 -0
package/AISB/image/019_aisb.t3.019_dyscaleut.jpg +0 -0
package/AISB/image/020_aisb.t3.020_aristotle.jpg +0 -0
package/AISB/image/021_aisb.t3.021_tokenrecycling.jpg +0 -0
package/AISB/image/022_aisb.t3.022_chainofreasoning.jpg +0 -0
package/AISB/image/023_aisb.t3.023_guidedembed.jpg +0 -0
package/AISB/image/024_aisb.t3.024_outputcentric.jpg +0 -0
package/AISB/image/025_aisb.t3.025_deeper.jpg +0 -0
package/AISB/image/026_aisb.t3.026_gartkg.jpg +0 -0
package/AISB/image/027_aisb.t3.027_citeeval.jpg +0 -0
package/AISB/image/028_aisb.t3.028_sbam.jpg +0 -0
package/AISB/image/029_aisb.t3.029_cdqgeoembed.jpg +0 -0
package/AISB/image/030_aisb.t3.030_processrm.jpg +0 -0
package/AISB/image/031_aisb.t3.031_circuitstability.jpg +0 -0
package/AISB/image/032_aisb.t3.032_ptsolver.jpg +0 -0
package/AISB/image/033_aisb.t3.033_gcse.jpg +0 -0
package/AISB/image/034_aisb.t3.034_ensemblewm.jpg +0 -0
package/AISB/image/035_aisb.t3.035_moralvalueswa.jpg +0 -0
package/AISB/image/036_aisb.t3.036_weakstrongpref.jpg +0 -0
package/AISB/image/037_aisb.t3.037_dementiamask.jpg +0 -0
package/AISB/image/038_aisb.t3.038_tinysam.jpg +0 -0
package/AISB/image/039_aisb.t3.039_calf.jpg +0 -0
package/AISB/image/040_aisb.t3.040_graniteguardian.jpg +0 -0
package/AISB/image/041_aisb.t3.041_amdm.jpg +0 -0
package/AISB/image/042_aisb.t3.042_xpatch.jpg +0 -0
package/AISB/image/043_aisb.t3.043_vhm.jpg +0 -0
package/AISB/image/044_aisb.t3.044_rgvi.jpg +0 -0
package/AISB/image/045_aisb.t3.045_pslstm.jpg +0 -0
package/AISB/image/046_aisb.t3.046_nonstatts.jpg +0 -0
package/AISB/image/047_aisb.t3.047_timepfn.jpg +0 -0
package/AISB/image/048_aisb.t3.048_proxyspex.jpg +0 -0
package/AISB/image/049_aisb.t3.049_hogwildinference.jpg +0 -0
package/AISB/image/050_aisb.t3.050_causalpfn.jpg +0 -0
package/AISB/image/051_aisb.t3.051_flashtp.jpg +0 -0
package/AISB/image/052_aisb.t3.052_nsdiff.jpg +0 -0
package/AISB/image/053_aisb.t3.053_k2vae.jpg +0 -0
package/AISB/image/054_aisb.t3.054_timebase.jpg +0 -0
package/AISB/image/055_aisb.t3.055_csbrain.jpg +0 -0
package/AISB/image/056_aisb.t3.056_infosam.jpg +0 -0
package/AISB/image/057_aisb.t3.057_mdreid.jpg +0 -0
package/AISB/image/058_aisb.t3.058_mindglitch.jpg +0 -0
package/AISB/image/059_aisb.t3.059_selfsupervised.jpg +0 -0
package/AISB/image/060_aisb.t3.060_iaggad.jpg +0 -0
package/AISB/image/061_aisb.t3.061_hsgkn.jpg +0 -0
package/AISB/image/062_aisb.t3.062_visionts.jpg +0 -0
package/AISB/image/063_aisb.t3.063_tsrag.jpg +0 -0
package/AISB/image/064_aisb.t3.064_pir.jpg +0 -0
package/AISB/image/065_aisb.t3.065_proteinbinding.jpg +0 -0
package/AISB/image/066_aisb.t3.066_tropicalattention.jpg +0 -0
package/AISB/image/067_aisb.t3.067_kanad.jpg +0 -0
package/AISB/image/068_aisb.t3.068_sempo.jpg +0 -0
package/AISB/image/069_aisb.t3.069_treehfd.jpg +0 -0
package/AISB/image/070_aisb.t3.070_certifiedunlearning.jpg +0 -0
package/AISB/image/071_aisb.t3.071_neuralmjd.jpg +0 -0
package/AISB/image/072_aisb.t3.072_fedgmt.jpg +0 -0
package/AISB/image/073_aisb.t3.073_rld.jpg +0 -0
package/AISB/image/074_aisb.t3.074_lsvi.jpg +0 -0
package/AISB/image/075_aisb.t3.075_treeslicedentropy.jpg +0 -0
package/AISB/image/076_aisb.t3.076_aanet.jpg +0 -0
package/AISB/image/077_aisb.t3.077_cmnn.jpg +0 -0
package/AISB/image/078_aisb.t3.078_conformalanomaly.jpg +0 -0
package/AISB/image/079_aisb.t3.079_dpfkmeans.jpg +0 -0
package/AISB/image/080_aisb.t3.080_latentscorereweight.jpg +0 -0
package/AISB/image/081_aisb.t3.081_qmamba.jpg +0 -0
package/AISB/image/082_aisb.t3.082_onlinellmrouting.jpg +0 -0
package/AISB/image/083_aisb.t3.083_starformer.jpg +0 -0
package/AISB/image/084_aisb.t3.084_ift.jpg +0 -0
package/AISB/image/085_aisb.t3.085_neuralsurv.jpg +0 -0
package/AISB/image/086_aisb.t3.086_stella.jpg +0 -0
package/AISB/image/087_aisb.t3.087_moses.jpg +0 -0
package/AISB/image/088_aisb.t3.088_channelnorm.jpg +0 -0
package/AISB/image/089_aisb.t3.089_causalvelocity.jpg +0 -0
package/AISB/image/090_aisb.t3.090_rstib.jpg +0 -0
package/AISB/image/091_aisb.t3.091_timeawarecausal.jpg +0 -0
package/AISB/image/092_aisb.t3.092_kmeanslocalopt.jpg +0 -0
package/AISB/image/093_aisb.t3.093_fedwmsam.jpg +0 -0
package/AISB/image/094_aisb.t3.094_boundre.jpg +0 -0
package/AISB/image/095_aisb.t3.095_fastfeaturecp.jpg +0 -0
package/AISB/image/096_aisb.t3.096_m3svm.jpg +0 -0
package/AISB/image/097_aisb.t3.097_wassersteintl.jpg +0 -0
package/AISB/image/098_aisb.t3.098_xmahalanobis.jpg +0 -0
package/AISB/image/099_aisb.t3.099_ollalanding.jpg +0 -0
package/AISB/image/100_aisb.t3.100_invmissingdata.jpg +0 -0
package/AISB/image/101_aisb.t3.101_acia.jpg +0 -0
package/AISB/image/102_aisb.t3.102_stochasticff.jpg +0 -0
package/AISB/image/103_aisb.t3.103_qdcp.jpg +0 -0
package/AISB/image/104_aisb.t3.104_balancedactiveinf.jpg +0 -0
package/AISB/image/105_aisb.t3.105_binaryclasseval.jpg +0 -0
package/AISB/image/106_aisb.t1.reasoning_lite.jpg +0 -0
package/AISB/image/107_aisb.t2.paper_audit.jpg +0 -0
package/AISB/image/108_aisb.t3.multi_gpu_search.jpg +0 -0
package/AISB/image/109_aisb.t3.tdc_admet.jpg +0 -0
package/AISB/image/aisb.b1.agentic_coding.svg +16 -0
package/AISB/image/aisb.b10.climate_earth.svg +16 -0
package/AISB/image/aisb.b11.model_efficiency.svg +16 -0
package/AISB/image/aisb.b12.embodied_ai.svg +16 -0
package/AISB/image/aisb.b2.agent_systems.svg +16 -0
package/AISB/image/aisb.b3.self_evolving_rl.svg +16 -0
package/AISB/image/aisb.b4.lm_reasoning.svg +16 -0
package/AISB/image/aisb.b5.math_proof.svg +16 -0
package/AISB/image/aisb.b6.research_process.svg +16 -0
package/AISB/image/aisb.b7.multimodal_fusion.svg +16 -0
package/AISB/image/aisb.b8.lifesci_drug.svg +16 -0
package/AISB/image/aisb.b9.material_science.svg +16 -0
package/README.md +196 -32
package/bin/ds.js +924 -66
package/docs/en/00_QUICK_START.md +195 -18
package/docs/en/01_SETTINGS_REFERENCE.md +468 -96
package/docs/en/02_START_RESEARCH_GUIDE.md +26 -5
package/docs/en/03_QQ_CONNECTOR_GUIDE.md +14 -3
package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +2 -0
package/docs/en/05_TUI_GUIDE.md +171 -2
package/docs/en/07_MEMORY_AND_MCP.md +38 -2
package/docs/en/09_DOCTOR.md +78 -7
package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +38 -1
package/docs/en/11_LICENSE_AND_RISK.md +4 -0
package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +15 -0
package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
package/docs/en/15_CODEX_PROVIDER_SETUP.md +624 -180
package/docs/en/16_TELEGRAM_CONNECTOR_GUIDE.md +14 -0
package/docs/en/17_WHATSAPP_CONNECTOR_GUIDE.md +14 -0
package/docs/en/18_FEISHU_CONNECTOR_GUIDE.md +14 -0
package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +386 -0
package/docs/en/22_BENCHSTORE_YAML_REFERENCE.md +469 -0
package/docs/en/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +316 -0
package/docs/en/24_CLAUDE_CODE_PROVIDER_SETUP.md +469 -0
package/docs/en/25_OPENCODE_PROVIDER_SETUP.md +653 -0
package/docs/en/26_CITATION_AND_ATTRIBUTION.md +119 -0
package/docs/en/27_KIMI_CODE_PROVIDER_SETUP.md +180 -0
package/docs/en/28_DISCORD_CONNECTOR_GUIDE.md +61 -0
package/docs/en/29_SLACK_CONNECTOR_GUIDE.md +60 -0
package/docs/en/30_SETTINGS_CONTROL_CENTER_GUIDE.md +371 -0
package/docs/en/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
package/docs/en/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +273 -0
package/docs/en/33_WORKSPACE_EXPLORER_QA.md +121 -0
package/docs/en/91_DEVELOPMENT.md +266 -0
package/docs/en/99_ACKNOWLEDGEMENTS.md +24 -19
package/docs/en/README.md +48 -7
package/docs/images/admin/admin-connectors-health-en.png +0 -0
package/docs/images/admin/admin-controllers-en.png +0 -0
package/docs/images/admin/admin-diagnostics-en.png +0 -0
package/docs/images/admin/admin-errors-en.png +0 -0
package/docs/images/admin/admin-issues-en.png +0 -0
package/docs/images/admin/admin-logs-en.png +0 -0
package/docs/images/admin/admin-quest-detail-en.png +0 -0
package/docs/images/admin/admin-quests-en.png +0 -0
package/docs/images/admin/admin-repairs-en.png +0 -0
package/docs/images/admin/admin-runtime-en.png +0 -0
package/docs/images/admin/admin-search-en.png +0 -0
package/docs/images/admin/admin-stats-en.png +0 -0
package/docs/images/admin/admin-summary-en.png +0 -0
package/docs/images/connectors/connector-discord-en.png +0 -0
package/docs/images/connectors/connector-feishu-en.png +0 -0
package/docs/images/connectors/connector-lingzhu-en.png +0 -0
package/docs/images/connectors/connector-qq-en.png +0 -0
package/docs/images/connectors/connector-slack-en.png +0 -0
package/docs/images/connectors/connector-telegram-en.png +0 -0
package/docs/images/connectors/connector-weixin-en.png +0 -0
package/docs/images/connectors/connector-whatsapp-en.png +0 -0
package/docs/images/settings/settings-baselines-en.png +0 -0
package/docs/images/settings/settings-config-en.png +0 -0
package/docs/images/settings/settings-connectors-overview-en.png +0 -0
package/docs/images/settings/settings-deepxiv-en.png +0 -0
package/docs/images/settings/settings-mcp-servers-en.png +0 -0
package/docs/images/settings/settings-plugins-en.png +0 -0
package/docs/images/settings/settings-runners-en.png +0 -0
package/docs/zh/00_QUICK_START.md +142 -18
package/docs/zh/01_SETTINGS_REFERENCE.md +219 -98
package/docs/zh/02_START_RESEARCH_GUIDE.md +26 -5
package/docs/zh/05_TUI_GUIDE.md +171 -2
package/docs/zh/07_MEMORY_AND_MCP.md +29 -2
package/docs/zh/09_DOCTOR.md +54 -8
package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +24 -1
package/docs/zh/11_LICENSE_AND_RISK.md +4 -0
package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +15 -0
package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
package/docs/zh/15_CODEX_PROVIDER_SETUP.md +552 -181
package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +384 -0
package/docs/zh/22_BENCHSTORE_YAML_REFERENCE.md +459 -0
package/docs/zh/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +287 -0
package/docs/zh/23_CLAUDE_RUNNER_GUIDE.md +103 -0
package/docs/zh/24_CLAUDE_CODE_PROVIDER_SETUP.md +460 -0
package/docs/zh/25_OPENCODE_PROVIDER_SETUP.md +660 -0
package/docs/zh/26_CITATION_AND_ATTRIBUTION.md +102 -0
package/docs/zh/27_KIMI_CODE_PROVIDER_SETUP.md +51 -0
package/docs/zh/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
package/docs/zh/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +264 -0
package/docs/zh/33_WORKSPACE_EXPLORER_QA.md +127 -0
package/docs/zh/99_ACKNOWLEDGEMENTS.md +23 -19
package/docs/zh/README.md +33 -7
package/install.sh +168 -20
package/package.json +5 -1
package/pyproject.toml +2 -1
package/src/deepscientist/__init__.py +1 -1
package/src/deepscientist/acp/envelope.py +13 -0
package/src/deepscientist/admin/__init__.py +3 -0
package/src/deepscientist/admin/charts.py +681 -0
package/src/deepscientist/admin/logs.py +119 -0
package/src/deepscientist/admin/repairs.py +217 -0
package/src/deepscientist/admin/service.py +1310 -0
package/src/deepscientist/admin/system_info.py +700 -0
package/src/deepscientist/admin/tasks.py +465 -0
package/src/deepscientist/admin/tool_metrics.py +600 -0
package/src/deepscientist/artifact/guidance.py +8 -4
package/src/deepscientist/artifact/schemas.py +115 -0
package/src/deepscientist/artifact/service.py +4268 -260
package/src/deepscientist/bash_exec/monitor.py +30 -3
package/src/deepscientist/bash_exec/service.py +134 -1
package/src/deepscientist/benchstore/__init__.py +4 -0
package/src/deepscientist/benchstore/prompt_builder.py +224 -0
package/src/deepscientist/benchstore/service.py +1716 -0
package/src/deepscientist/bridges/connectors.py +8 -2
package/src/deepscientist/channels/weixin_ilink.py +8 -1
package/src/deepscientist/cli.py +92 -17
package/src/deepscientist/codex_cli_compat.py +187 -74
package/src/deepscientist/config/models.py +82 -11
package/src/deepscientist/config/service.py +1077 -93
package/src/deepscientist/connector/weixin_support.py +48 -17
package/src/deepscientist/daemon/api/handlers.py +827 -235
package/src/deepscientist/daemon/api/router.py +81 -1
package/src/deepscientist/daemon/app.py +1512 -85
package/src/deepscientist/diagnostics/__init__.py +6 -0
package/src/deepscientist/diagnostics/runner_failures.py +277 -0
package/src/deepscientist/doctor.py +407 -56
package/src/deepscientist/evidence_packets.py +590 -0
package/src/deepscientist/home.py +52 -4
package/src/deepscientist/kimi_cli_compat.py +50 -0
package/src/deepscientist/latex_runtime.py +2 -2
package/src/deepscientist/mcp/context.py +2 -0
package/src/deepscientist/mcp/schemas.py +114 -0
package/src/deepscientist/mcp/server.py +1566 -126
package/src/deepscientist/memory/service.py +203 -16
package/src/deepscientist/process_control.py +8 -1
package/src/deepscientist/prompts/builder.py +850 -88
package/src/deepscientist/quest/__init__.py +2 -2
package/src/deepscientist/quest/layout.py +12 -1
package/src/deepscientist/quest/node_traces.py +10 -0
package/src/deepscientist/quest/service.py +1852 -161
package/src/deepscientist/quest/stage_views.py +1 -1
package/src/deepscientist/runners/__init__.py +18 -0
package/src/deepscientist/runners/base.py +89 -1
package/src/deepscientist/runners/builtins.py +13 -1
package/src/deepscientist/runners/claude.py +391 -0
package/src/deepscientist/runners/codex.py +480 -35
package/src/deepscientist/runners/codex_telemetry.py +127 -0
package/src/deepscientist/runners/kimi.py +334 -0
package/src/deepscientist/runners/metadata.py +68 -0
package/src/deepscientist/runners/opencode.py +414 -0
package/src/deepscientist/runners/runtime_overrides.py +100 -0
package/src/deepscientist/runners/simple_cli.py +538 -0
package/src/deepscientist/runtime_storage.py +303 -0
package/src/deepscientist/shared.py +80 -16
package/src/deepscientist/skills/installer.py +37 -0
package/src/deepscientist/skills/registry.py +2 -0
package/src/deepscientist/tinytex.py +2 -2
package/src/deepscientist/tui.py +10 -3
package/src/prompts/benchstore/system.md +77 -0
package/src/prompts/connectors/qq.md +33 -2
package/src/prompts/connectors/weixin.md +208 -23
package/src/prompts/contracts/admin_ops.md +74 -0
package/src/prompts/contracts/admin_ops_knowledge.md +138 -0
package/src/prompts/contracts/shared_interaction.md +5 -10
package/src/prompts/start_setup/system.md +422 -0
package/src/prompts/system.md +411 -304
package/src/prompts/system_copilot.md +89 -0
package/src/skills/analysis-campaign/SKILL.md +239 -578
package/src/skills/analysis-campaign/references/artifact-flow-examples.md +102 -0
package/src/skills/analysis-campaign/references/boundary-cases.md +98 -0
package/src/skills/analysis-campaign/references/campaign-checklist-template.md +39 -24
package/src/skills/analysis-campaign/references/campaign-design.md +26 -10
package/src/skills/analysis-campaign/references/campaign-plan-template.md +53 -54
package/src/skills/analysis-campaign/references/operational-guidance.md +97 -0
package/src/skills/analysis-campaign/references/writing-facing-slice-examples.md +10 -20
package/src/skills/baseline/SKILL.md +183 -461
package/src/skills/baseline/references/artifact-flow-examples.md +106 -0
package/src/skills/baseline/references/artifact-payload-examples.md +1 -1
package/src/skills/baseline/references/baseline-checklist-template.md +27 -35
package/src/skills/baseline/references/baseline-plan-template.md +37 -76
package/src/skills/baseline/references/boundary-cases.md +86 -0
package/src/skills/baseline/references/codebase-audit-checklist.md +2 -6
package/src/skills/baseline/references/comparability-contract.md +7 -12
package/src/skills/baseline/references/operational-guidance.md +56 -0
package/src/skills/baseline/references/route-selection.md +5 -25
package/src/skills/decision/SKILL.md +113 -306
package/src/skills/decision/references/checkpoint-memory-template.md +47 -0
package/src/skills/decision/references/operational-guidance.md +94 -0
package/src/skills/decision/references/research-route-criteria.md +7 -8
package/src/skills/decision/references/strategic-decision-template.md +13 -26
package/src/skills/experiment/SKILL.md +132 -670
package/src/skills/experiment/references/execution-playbook.md +374 -0
package/src/skills/experiment/references/main-experiment-checklist-template.md +26 -2
package/src/skills/experiment/references/main-experiment-plan-template.md +28 -17
package/src/skills/experiment/references/operational-guidance.md +108 -0
package/src/skills/finalize/SKILL.md +62 -0
package/src/skills/finalize/references/checkpoint-memory-template.md +49 -0
package/src/skills/finalize/references/resume-packet-template.md +7 -0
package/src/skills/idea/SKILL.md +228 -15
package/src/skills/idea/references/controlled-brainstorming-playbook.md +78 -0
package/src/skills/idea/references/current-board-packet-template.md +61 -0
package/src/skills/idea/references/high-value-idea-sourcing.md +119 -0
package/src/skills/idea/references/idea-generation-playbook.md +21 -0
package/src/skills/idea/references/idea-thinking-flow.md +6 -0
package/src/skills/idea/references/literature-survey-template.md +3 -0
package/src/skills/idea/references/objective-contract-template.md +54 -0
package/src/skills/idea/references/outline-seeding-example.md +56 -0
package/src/skills/idea/references/pre-idea-draft-template.md +105 -0
package/src/skills/idea/references/related-work-playbook.md +75 -2
package/src/skills/idea/references/research-history-playbook.md +114 -0
package/src/skills/idea/references/selection-gate.md +58 -6
package/src/skills/intake-audit/SKILL.md +43 -2
package/src/skills/intake-audit/references/state-audit-template.md +10 -0
package/src/skills/nature-data/SKILL.md +128 -0
package/src/skills/nature-data/UPSTREAM_LICENSE.txt +21 -0
package/src/skills/nature-data/agents/openai.yaml +4 -0
package/src/skills/nature-data/references/chinese-author-alignment.md +84 -0
package/src/skills/nature-data/references/fair-metadata-checklist.md +105 -0
package/src/skills/nature-data/references/policy-principles.md +103 -0
package/src/skills/nature-data/references/repository-and-identifiers.md +96 -0
package/src/skills/nature-data/references/source-basis.md +54 -0
package/src/skills/nature-data/references/statement-patterns.md +153 -0
package/src/skills/nature-figure/SKILL.md +197 -0
package/src/skills/nature-figure/UPSTREAM_LICENSE.txt +21 -0
package/src/skills/nature-figure/agents/openai.yaml +4 -0
package/src/skills/nature-figure/evals/evals.json +37 -0
package/src/skills/nature-figure/references/api.md +428 -0
package/src/skills/nature-figure/references/backend-selection.md +100 -0
package/src/skills/nature-figure/references/chart-types.md +281 -0
package/src/skills/nature-figure/references/common-patterns.md +349 -0
package/src/skills/nature-figure/references/design-theory.md +436 -0
package/src/skills/nature-figure/references/figure-contract.md +93 -0
package/src/skills/nature-figure/references/nature-2026-observations.md +112 -0
package/src/skills/nature-figure/references/qa-contract.md +119 -0
package/src/skills/nature-figure/references/r-template-index.md +66 -0
package/src/skills/nature-figure/references/r-workflow.md +161 -0
package/src/skills/nature-figure/references/tutorials.md +250 -0
package/src/skills/nature-paper2ppt/SKILL.md +507 -0
package/src/skills/nature-paper2ppt/UPSTREAM_LICENSE.txt +21 -0
package/src/skills/nature-paper2ppt/agents/openai.yaml +4 -0
package/src/skills/nature-polishing/SKILL.md +385 -0
package/src/skills/nature-polishing/UPSTREAM_LICENSE.txt +21 -0
package/src/skills/nature-polishing/agents/openai.yaml +4 -0
package/src/skills/nature-polishing/references/phrasebank-playbook.md +162 -0
package/src/skills/nature-polishing/references/section-moves.md +240 -0
package/src/skills/nature-polishing/references/style-guardrails.md +94 -0
package/src/skills/nature-polishing/references/writing-strategy.md +148 -0
package/src/skills/optimize/SKILL.md +177 -1568
package/src/skills/optimize/references/brief-shaping-playbook.md +95 -0
package/src/skills/optimize/references/candidate-board-template.md +13 -0
package/src/skills/optimize/references/candidate-ranking-template.md +51 -0
package/src/skills/optimize/references/codegen-route-playbook.md +50 -0
package/src/skills/optimize/references/debug-response-template.md +29 -0
package/src/skills/optimize/references/frontier-review-template.md +32 -0
package/src/skills/optimize/references/fusion-playbook.md +36 -0
package/src/skills/optimize/references/method-brief-template.md +73 -0
package/src/skills/optimize/references/operational-guidance.md +621 -0
package/src/skills/optimize/references/optimization-memory-template.md +30 -0
package/src/skills/optimize/references/optimize-checklist-template.md +18 -0
package/src/skills/optimize/references/plateau-response-playbook.md +28 -0
package/src/skills/optimize/references/prompt-patterns.md +49 -0
package/src/skills/paper-outline/SKILL.md +227 -0
package/src/skills/paper-outline/references/outline-patterns.md +87 -0
package/src/skills/paper-plot/SKILL.md +79 -0
package/src/skills/paper-plot/agents/openai.yaml +4 -0
package/src/skills/paper-plot/references/bar_grouped_hatch.md +96 -0
package/src/skills/paper-plot/references/bar_paired_delta.md +72 -0
package/src/skills/paper-plot/references/line_confidence_band.md +75 -0
package/src/skills/paper-plot/references/line_loss_with_inset.md +65 -0
package/src/skills/paper-plot/references/line_training_curve.md +44 -0
package/src/skills/paper-plot/references/radar_dual_series.md +59 -0
package/src/skills/paper-plot/references/scatter_broken_axis.md +59 -0
package/src/skills/paper-plot/references/scatter_tsne_cluster.md +72 -0
package/src/skills/paper-plot/scripts/bar_memevolve.py +109 -0
package/src/skills/paper-plot/scripts/bar_spice.py +166 -0
package/src/skills/paper-plot/scripts/line_aime.py +94 -0
package/src/skills/paper-plot/scripts/line_loss_inset.py +157 -0
package/src/skills/paper-plot/scripts/line_selfdistill.py +168 -0
package/src/skills/paper-plot/scripts/radar_dora.py +151 -0
package/src/skills/paper-plot/scripts/scatter_break.py +169 -0
package/src/skills/paper-plot/scripts/scatter_tsne.py +133 -0
package/src/skills/rebuttal/SKILL.md +9 -0
package/src/skills/references/tool-usage-by-stage.md +438 -0
package/src/skills/review/SKILL.md +105 -7
package/src/skills/science/PROVENANCE.md +44 -0
package/src/skills/science/SKILL.md +137 -0
package/src/skills/science/references/artifact-science-tool.md +110 -0
package/src/skills/science/references/claim-type-discipline.md +56 -0
package/src/skills/science/references/domain-index.md +422 -0
package/src/skills/science/references/hpc-via-bash-exec.md +42 -0
package/src/skills/science/references/package-check-playbook.md +64 -0
package/src/skills/science/references/package-index.min.json +3616 -0
package/src/skills/science/references/packages/abinit.md +80 -0
package/src/skills/science/references/packages/acts.md +73 -0
package/src/skills/science/references/packages/aiida-core.md +80 -0
package/src/skills/science/references/packages/alamode.md +80 -0
package/src/skills/science/references/packages/amuse.md +88 -0
package/src/skills/science/references/packages/anndata.md +88 -0
package/src/skills/science/references/packages/arbor.md +80 -0
package/src/skills/science/references/packages/arc.md +73 -0
package/src/skills/science/references/packages/astropy.md +88 -0
package/src/skills/science/references/packages/astroquery.md +88 -0
package/src/skills/science/references/packages/atomate2.md +80 -0
package/src/skills/science/references/packages/atomsmltr.md +73 -0
package/src/skills/science/references/packages/awkward.md +73 -0
package/src/skills/science/references/packages/batman.md +88 -0
package/src/skills/science/references/packages/biopython.md +88 -0
package/src/skills/science/references/packages/bloqade.md +73 -0
package/src/skills/science/references/packages/brian2.md +73 -0
package/src/skills/science/references/packages/bullet3.md +73 -0
package/src/skills/science/references/packages/calculix.md +80 -0
package/src/skills/science/references/packages/cantera.md +73 -0
package/src/skills/science/references/packages/cavity-md-ipi.md +80 -0
package/src/skills/science/references/packages/ccdproc.md +88 -0
package/src/skills/science/references/packages/celerite2.md +88 -0
package/src/skills/science/references/packages/cellrank.md +73 -0
package/src/skills/science/references/packages/cesm.md +80 -0
package/src/skills/science/references/packages/chemicals.md +73 -0
package/src/skills/science/references/packages/chempy.md +73 -0
package/src/skills/science/references/packages/cirq.md +73 -0
package/src/skills/science/references/packages/coffea.md +73 -0
package/src/skills/science/references/packages/cp2k.md +88 -0
package/src/skills/science/references/packages/custodian.md +80 -0
package/src/skills/science/references/packages/dart.md +73 -0
package/src/skills/science/references/packages/datamol.md +88 -0
package/src/skills/science/references/packages/dd4hep.md +73 -0
package/src/skills/science/references/packages/dealii.md +80 -0
package/src/skills/science/references/packages/deepchem.md +88 -0
package/src/skills/science/references/packages/delphes.md +73 -0
package/src/skills/science/references/packages/devito.md +80 -0
package/src/skills/science/references/packages/dftb.md +88 -0
package/src/skills/science/references/packages/dftd4.md +88 -0
package/src/skills/science/references/packages/dftk-jl.md +80 -0
package/src/skills/science/references/packages/dolfinx.md +80 -0
package/src/skills/science/references/packages/drake.md +73 -0
package/src/skills/science/references/packages/dumux.md +73 -0
package/src/skills/science/references/packages/elk.md +80 -0
package/src/skills/science/references/packages/elmerfem.md +80 -0
package/src/skills/science/references/packages/enzo-e.md +88 -0
package/src/skills/science/references/packages/espresso.md +80 -0
package/src/skills/science/references/packages/exoplanet.md +88 -0
package/src/skills/science/references/packages/fairroot.md +73 -0
package/src/skills/science/references/packages/fbpic.md +80 -0
package/src/skills/science/references/packages/fdtdbath-meep.md +80 -0
package/src/skills/science/references/packages/geant4.md +73 -0
package/src/skills/science/references/packages/geosx.md +80 -0
package/src/skills/science/references/packages/gprmax.md +80 -0
package/src/skills/science/references/packages/gromacs.md +80 -0
package/src/skills/science/references/packages/gwaslab.md +73 -0
package/src/skills/science/references/packages/gz-sim.md +73 -0
package/src/skills/science/references/packages/hail.md +88 -0
package/src/skills/science/references/packages/hiphive.md +80 -0
package/src/skills/science/references/packages/hoomd-blue.md +80 -0
package/src/skills/science/references/packages/itensor.md +73 -0
package/src/skills/science/references/packages/itensors-jl.md +73 -0
package/src/skills/science/references/packages/jdftx.md +73 -0
package/src/skills/science/references/packages/jobflow.md +80 -0
package/src/skills/science/references/packages/kadanoffbaym-jl.md +73 -0
package/src/skills/science/references/packages/kite.md +80 -0
package/src/skills/science/references/packages/kratos.md +80 -0
package/src/skills/science/references/packages/kwant.md +73 -0
package/src/skills/science/references/packages/lammps.md +80 -0
package/src/skills/science/references/packages/lightkurve.md +88 -0
package/src/skills/science/references/packages/limix.md +73 -0
package/src/skills/science/references/packages/maxwelllink.md +80 -0
package/src/skills/science/references/packages/mcdc.md +73 -0
package/src/skills/science/references/packages/meep.md +80 -0
package/src/skills/science/references/packages/mfem.md +80 -0
package/src/skills/science/references/packages/mitgcm.md +73 -0
package/src/skills/science/references/packages/modflow6.md +73 -0
package/src/skills/science/references/packages/molecool.md +73 -0
package/src/skills/science/references/packages/mom6.md +73 -0
package/src/skills/science/references/packages/moose.md +80 -0
package/src/skills/science/references/packages/mpas-model.md +73 -0
package/src/skills/science/references/packages/mujoco.md +73 -0
package/src/skills/science/references/packages/mumax3.md +73 -0
package/src/skills/science/references/packages/nekrs.md +80 -0
package/src/skills/science/references/packages/nessi.md +73 -0
package/src/skills/science/references/packages/nest-simulator.md +73 -0
package/src/skills/science/references/packages/netket.md +73 -0
package/src/skills/science/references/packages/neuron.md +73 -0
package/src/skills/science/references/packages/nextflow.md +88 -0
package/src/skills/science/references/packages/nwchem.md +88 -0
package/src/skills/science/references/packages/openbabel.md +88 -0
package/src/skills/science/references/packages/openems.md +80 -0
package/src/skills/science/references/packages/openff-toolkit.md +88 -0
package/src/skills/science/references/packages/openfoam-dev.md +80 -0
package/src/skills/science/references/packages/openmc.md +73 -0
package/src/skills/science/references/packages/openmm.md +80 -0
package/src/skills/science/references/packages/openmoc.md +73 -0
package/src/skills/science/references/packages/openmx.md +80 -0
package/src/skills/science/references/packages/opensees.md +80 -0
package/src/skills/science/references/packages/opensn.md +80 -0
package/src/skills/science/references/packages/opm-simulators.md +73 -0
package/src/skills/science/references/packages/oqupy.md +73 -0
package/src/skills/science/references/packages/packmol.md +80 -0
package/src/skills/science/references/packages/palabos.md +80 -0
package/src/skills/science/references/packages/parflow.md +80 -0
package/src/skills/science/references/packages/pennylane.md +88 -0
package/src/skills/science/references/packages/perceval.md +73 -0
package/src/skills/science/references/packages/phono3py.md +73 -0
package/src/skills/science/references/packages/phonopy.md +73 -0
package/src/skills/science/references/packages/photutils.md +88 -0
package/src/skills/science/references/packages/picongpu.md +80 -0
package/src/skills/science/references/packages/plink-ng.md +88 -0
package/src/skills/science/references/packages/precice.md +73 -0
package/src/skills/science/references/packages/psc.md +80 -0
package/src/skills/science/references/packages/psi4.md +88 -0
package/src/skills/science/references/packages/pybinding.md +73 -0
package/src/skills/science/references/packages/pyfr.md +80 -0
package/src/skills/science/references/packages/pyhf.md +73 -0
package/src/skills/science/references/packages/pyiron_base.md +80 -0
package/src/skills/science/references/packages/pylcp.md +73 -0
package/src/skills/science/references/packages/pylith.md +80 -0
package/src/skills/science/references/packages/pynbody.md +88 -0
package/src/skills/science/references/packages/pysam.md +88 -0
package/src/skills/science/references/packages/pyscf.md +88 -0
package/src/skills/science/references/packages/q-e.md +73 -0
package/src/skills/science/references/packages/qibo.md +73 -0
package/src/skills/science/references/packages/qiskit.md +73 -0
package/src/skills/science/references/packages/quantica-jl.md +73 -0
package/src/skills/science/references/packages/quantumoptics-jl.md +73 -0
package/src/skills/science/references/packages/quimb.md +73 -0
package/src/skills/science/references/packages/qulacs.md +73 -0
package/src/skills/science/references/packages/qutip.md +73 -0
package/src/skills/science/references/packages/rdkit.md +88 -0
package/src/skills/science/references/packages/rmg-py.md +73 -0
package/src/skills/science/references/packages/root.md +73 -0
package/src/skills/science/references/packages/scanpy.md +88 -0
package/src/skills/science/references/packages/scikit-allel.md +88 -0
package/src/skills/science/references/packages/scikit-bio.md +88 -0
package/src/skills/science/references/packages/scqubits.md +73 -0
package/src/skills/science/references/packages/scuff-em.md +80 -0
package/src/skills/science/references/packages/scvi-tools.md +73 -0
package/src/skills/science/references/packages/seissol.md +73 -0
package/src/skills/science/references/packages/sfepy.md +80 -0
package/src/skills/science/references/packages/sisl.md +73 -0
package/src/skills/science/references/packages/smilei.md +80 -0
package/src/skills/science/references/packages/snakemake.md +88 -0
package/src/skills/science/references/packages/specfem3d-globe.md +80 -0
package/src/skills/science/references/packages/specutils.md +88 -0
package/src/skills/science/references/packages/spglib.md +80 -0
package/src/skills/science/references/packages/squidpy.md +88 -0
package/src/skills/science/references/packages/starry.md +88 -0
package/src/skills/science/references/packages/strawberryfields.md +73 -0
package/src/skills/science/references/packages/su2.md +80 -0
package/src/skills/science/references/packages/sunny-jl.md +73 -0
package/src/skills/science/references/packages/sw4.md +73 -0
package/src/skills/science/references/packages/swift.md +88 -0
package/src/skills/science/references/packages/tdnegf.md +73 -0
package/src/skills/science/references/packages/tenpy.md +73 -0
package/src/skills/science/references/packages/thermo.md +73 -0
package/src/skills/science/references/packages/tkwant.md +73 -0
package/src/skills/science/references/packages/tvb-root.md +73 -0
package/src/skills/science/references/packages/uproot5.md +73 -0
package/src/skills/science/references/packages/vampire.md +80 -0
package/src/skills/science/references/packages/wannier_tools.md +73 -0
package/src/skills/science/references/packages/warpx.md +80 -0
package/src/skills/science/references/packages/wrf.md +73 -0
package/src/skills/science/references/packages/xtb.md +88 -0
package/src/skills/science/references/packages/yt.md +73 -0
package/src/skills/science/references/science-task-brief-template.md +71 -0
package/src/skills/scout/SKILL.md +83 -425
package/src/skills/scout/references/literature-scout-template.md +5 -24
package/src/skills/scout/references/operational-guidance.md +191 -0
package/src/skills/scout/references/paper-triage-playbook.md +11 -35
package/src/skills/write/SKILL.md +744 -1246
package/src/skills/write/references/experiments_analysis_patterns.md +129 -0
package/src/skills/write/references/oral_package_patterns.md +252 -0
package/src/skills/write/references/oral_writing_principles.md +291 -0
package/src/skills/write/references/section_rewrite_checklist.md +234 -0
package/src/tui/dist/app/AppContainer.js +1314 -27
package/src/tui/dist/components/Composer.js +26 -1
package/src/tui/dist/components/ConfigScreen.js +2 -1
package/src/tui/dist/components/InputPrompt.js +25 -9
package/src/tui/dist/components/MainContent.js +18 -3
package/src/tui/dist/components/QuestScreen.js +3 -2
package/src/tui/dist/components/UtilityScreen.js +37 -0
package/src/tui/dist/hooks/useSafeInput.js +10 -0
package/src/tui/dist/index.js +13 -1
package/src/tui/dist/layouts/DefaultAppLayout.js +11 -8
package/src/tui/dist/lib/api.js +89 -1
package/src/tui/package.json +1 -1
package/src/ui/dist/assets/{AnalysisPlugin-DnSm0GZn.js → AnalysisPlugin-CA94NGmI.js} +1 -1
package/src/ui/dist/assets/CliPlugin-DHBzphZU.js +79 -0
package/src/ui/dist/assets/CodeEditorPlugin-BOFwD2rn.js +2 -0
package/src/ui/dist/assets/{CodeViewerPlugin-itb0tltR.js → CodeViewerPlugin-CqDpgjik.js} +4 -4
package/src/ui/dist/assets/{DocViewerPlugin-DqKkiCI6.js → DocViewerPlugin-UDBgt8-4.js} +3 -3
package/src/ui/dist/assets/GitCommitViewerPlugin-BmHtZ0bZ.js +6 -0
package/src/ui/dist/assets/{GitDiffViewerPlugin-DxL2ezFG.js → GitDiffViewerPlugin-CAxjNorQ.js} +2 -2
package/src/ui/dist/assets/{GitSnapshotViewer-B_RQm1YZ.js → GitSnapshotViewer-CweA6VON.js} +2 -2
package/src/ui/dist/assets/{ImageViewerPlugin-tHqlXY3n.js → ImageViewerPlugin-C8wHGvGN.js} +5 -5
package/src/ui/dist/assets/LabPlugin-COyyLUol.js +32 -0
package/src/ui/dist/assets/{LatexPlugin-B495DTXC.js → LatexPlugin-BQjAaA5J.js} +4 -4
package/src/ui/dist/assets/{MarkdownViewerPlugin-DG28-61B.js → MarkdownViewerPlugin-Dy1NE2dI.js} +3 -3
package/src/ui/dist/assets/{MarketplacePlugin-BiOGT-Kj.js → MarketplacePlugin-DMIZtEJ2.js} +2 -2
package/src/ui/dist/assets/NotebookEditor-CFHMq_Qt.js +91 -0
package/src/ui/dist/assets/{NotebookEditor-CVsj8h_T.js → NotebookEditor-WFyd8Ybt.js} +23 -23
package/src/ui/dist/assets/{PdfLoader-CASDQmxJ.js → PdfLoader-CLE5u5TS.js} +3 -3
package/src/ui/dist/assets/{PdfMarkdownPlugin-BFhwoKsY.js → PdfMarkdownPlugin-_iNK_H83.js} +1 -1
package/src/ui/dist/assets/PdfViewerPlugin-DgWsbInT.js +22 -0
package/src/ui/dist/assets/SearchPlugin-DrZmn5iw.js +11 -0
package/src/ui/dist/assets/{TextViewerPlugin-CB4DYfWO.js → TextViewerPlugin-D1-T3aC7.js} +4 -4
package/src/ui/dist/assets/branding/runner-claude.svg +107 -0
package/src/ui/dist/assets/branding/runner-codex.svg +10 -0
package/src/ui/dist/assets/branding/runner-kimi.svg +14 -0
package/src/ui/dist/assets/branding/runner-opencode.svg +7 -0
package/src/ui/dist/assets/cli-store-CoZ-x5Ip.js +1 -0
package/src/ui/dist/assets/{code-DLC6G24T.js → code-DbsmSd3Y.js} +1 -1
package/src/ui/dist/assets/file-diff-panel-DsvyRz47.js +1 -0
package/src/ui/dist/assets/{wrap-text-CwMn-iqb.js → file-jump-queue-DeQBikaw.js} +3 -3
package/src/ui/dist/assets/{file-socket-Cu4Qln7Y.js → file-socket-DA5XIx88.js} +1 -1
package/src/ui/dist/assets/fonts/ds-fonts.css +50 -4
package/src/ui/dist/assets/images/deepxiv/register-guide.png +0 -0
package/src/ui/dist/assets/index-39vY9LmZ.js +1 -0
package/src/ui/dist/assets/{index-wQ7RIIRd.js → index-BsO46tJA.js} +1 -1
package/src/ui/dist/assets/index-CHzJ2xtB.js +3530 -0
package/src/ui/dist/assets/index-DH-zxoZ3.css +33 -0
package/src/ui/dist/assets/{plugin-notebook-HbW2K-1c.js → plugin-notebook-JRhysCqj.js} +2 -2
package/src/ui/dist/assets/{project-sync-CsX08Qno.js → project-sync-DPmWKmKD.js} +1 -1
package/src/ui/dist/assets/{zoom-out-R-GWEhzS.js → zoom-out-DAukFWen.js} +3 -3
package/src/ui/dist/index.html +3 -3
package/src/skills/analysis-campaign/references/artifact-orchestration.md +0 -58
package/src/skills/baseline/references/memory-playbook.md +0 -40
package/src/skills/baseline/references/publishable-baseline-package.md +0 -30
package/src/skills/write/references/outline-evidence-contract-example.md +0 -107
package/src/skills/write/references/paper-experiment-matrix-template.md +0 -131
package/src/skills/write/references/paper-section-playbook.md +0 -64
package/src/skills/write/references/reviewer-first-writing.md +0 -64
package/src/skills/write/references/revision-checklist.md +0 -70
package/src/skills/write/references/section-contracts.md +0 -82
package/src/skills/write/references/sentence-level-proofing.md +0 -49
package/src/ui/dist/assets/AiManusChatView-COFACy7V.js +0 -204
package/src/ui/dist/assets/CliPlugin-CvwCmDQ5.js +0 -109
package/src/ui/dist/assets/CodeEditorPlugin-cOqSa0xq.js +0 -2
package/src/ui/dist/assets/GitCommitViewerPlugin-DVgNHBCS.js +0 -1
package/src/ui/dist/assets/LabCopilotPanel-ClMbq5Yu.js +0 -14
package/src/ui/dist/assets/LabPlugin-L_SuE8ow.js +0 -22
package/src/ui/dist/assets/NotebookEditor-C-4Kt1p9.js +0 -81
package/src/ui/dist/assets/PdfViewerPlugin-DcOzU9vd.js +0 -17
package/src/ui/dist/assets/SearchPlugin-CHj7M58O.js +0 -16
package/src/ui/dist/assets/VNCViewer-CjlbyCB3.js +0 -11
package/src/ui/dist/assets/bot-CFkZY-JP.js +0 -6
package/src/ui/dist/assets/chevron-up-Dq5ofbht.js +0 -6
package/src/ui/dist/assets/file-content-Dv4LoZec.js +0 -1
package/src/ui/dist/assets/file-diff-panel-Denq-lC3.js +0 -1
package/src/ui/dist/assets/file-jump-queue-DA-SdG__.js +0 -1
package/src/ui/dist/assets/git-commit-horizontal-BUh6G52n.js +0 -6
package/src/ui/dist/assets/image-B9HUUddG.js +0 -6
package/src/ui/dist/assets/index-B2B1sg-M.js +0 -1
package/src/ui/dist/assets/index-Cgla8biy.css +0 -33
package/src/ui/dist/assets/index-DRyx7vAc.js +0 -1
package/src/ui/dist/assets/index-Gbl53BNp.js +0 -2496
package/src/ui/dist/assets/pdf-effect-queue-ZtnHFCAi.js +0 -6
package/src/ui/dist/assets/popover-DL6h35vr.js +0 -1
package/src/ui/dist/assets/select-DvmXt1yY.js +0 -11
package/src/ui/dist/assets/sigma-7jpXazui.js +0 -6
package/src/ui/dist/assets/trash-xA7kFt8i.js +0 -11
package/src/ui/dist/assets/useCliAccess-DsMwDjOp.js +0 -1
package/src/ui/dist/assets/useFileDiffOverlay-FuhcnKiw.js +0 -1

package/src/skills/baseline/SKILL.md CHANGED

@@ -6,446 +6,226 @@ skill_role: stage
 # Baseline
-This skill establishes the reference system the quest will compare against.
-The target is one trustworthy baseline line, not an endless reproduction diary.
+Use this skill to secure one trustworthy comparator and then get out of the way.
+The target is one accepted baseline line, not an endless reproduction diary.
-## Interaction discipline
-- Follow the shared interaction contract injected by the system prompt.
-- Keep ordinary setup and debugging updates concise.
-- Use richer milestone updates only when the baseline becomes trusted, caveated, blocked, waived, or route-changing.
-- Hard execution rule: every terminal command in this stage must go through `bash_exec`; do not use any other terminal path for setup, reproduction, monitoring, verification, Git, Python, package-manager, or file-inspection commands.
-- Prefer `bash_exec` for setup, reproduction, monitoring, and verification commands so the baseline line stays durable and auditable.
-## Tool discipline
-- **Do not use native `shell_command` / `command_execution` in this skill.**
-- **All shell, CLI, Python, bash, node, git, npm, uv, and environment work must go through `bash_exec(...)`.**
-- **For git work inside the current quest repository or worktree, prefer `artifact.git(...)` before raw shell git commands.**
-- **If a generic git smoke test is needed outside the quest repo, use `bash_exec(...)` in an isolated scratch repository.**
-## Non-negotiable rules
-- no fabricated metrics, logs, run status, or success claims
-- do not skip baseline steps or silently simplify the route when that would change trust or comparability
-- do not claim a baseline is ready before verification is complete
-- do not infer missing commands, scripts, or parameters when the uncertainty could change the result
-- any unavoidable guess must be written down explicitly with expected impact
-- use web search for discovering papers or repos, but use `artifact.arxiv(paper_id=..., full_text=False)` for actually reading a source arXiv paper when it exists
-- set `full_text=True` only when the short form is insufficient
-- for Python baselines, environment setup should be standardized around `uv`
-## Stage purpose
-The baseline stage should produce a usable reference point through one of four routes:
-1. attach an existing reusable baseline
-2. import a reusable baseline package
-3. reproduce a baseline from source
-4. repair a broken or stale baseline
-Keep the classic control flow:
-1. analysis
-2. setup
-3. execution
-4. verification
-These are control gates, not paperwork walls.
-## Quick workflow
-1. Read the source paper and source repo first, or record exactly what is missing and why.
-2. Choose the lightest trustworthy route: attach, import, reproduce, or repair.
-3. Start with the fast path whenever the current baseline object, command path, and acceptance target are already clear enough to validate cheaply.
-4. Before substantial baseline setup, code edits, or a real baseline run, create `PLAN.md` and `CHECKLIST.md`; short-form files are enough for simple fast-path work.
-5. Keep one dominant phase visible: analysis -> setup -> execution -> verification.
-6. Prefer one clean implementation pass, one smoke test, and then one normal baseline run.
-7. Retry only when smoke, verification, or runtime evidence shows a concrete failure or incompatibility.
-8. Close the stage by confirming or waiving the gate, then hand off with a concise `1-2` sentence summary of trust status and next anchor.
-## Fast-path first
-Default to the lightest baseline path that can still establish a trustworthy comparison.
-Default to a fast path when it can establish trust with less work.
-Fast path is the default when any of the following is true:
-- `requested_baseline_ref` or `confirmed_baseline_ref` already points to the active baseline object
-- the route is clearly `attach` or `import`
-- the repo entrypoint, dataset or split, and metric contract are already concrete enough to validate cheaply
-- reproduction requires no meaningful code changes and the main uncertainty is only whether the command still runs
-Fast path means:
-- do not restart broad baseline discovery by default
-- do not front-load a full codebase audit when the entrypoint is already concrete
-- use a minimal `PLAN.md`, a minimal `CHECKLIST.md`, one bounded smoke test when needed, and then one real validation or run
-- default to reuse-and-verify when runtime already attached a concrete baseline
-Escalate from fast path to fuller audit only when:
-- the paper and repo disagree materially
-- the real run or eval entrypoint is unclear
-- code changes are likely required
-- the contract spans multiple metrics, datasets, subtasks, or splits that still need interpretation
-- the same failure class reappears after one documented autonomous fix
-- the quest is trying to publish a reusable global baseline rather than only clear the current gate
+## Match signals
-## Use when
+Use `baseline` when:
 - no credible baseline exists yet
 - the current baseline is unverified or stale
 - the user already has a baseline package that should be attached or imported
+- a local code path or local service should be verified as the comparator
 - a reproduction failed earlier and now needs repair
 - the quest resumed and the baseline trust state is unclear
-## Do not use when
-- the quest already has a verified active baseline and the next move is ideation or execution
-- the user explicitly waived the baseline gate and that waiver is durably recorded
-## Stage gate
-Do not proceed to comparison-heavy downstream work unless one of the following is durably true:
-- a baseline has been attached and accepted
-- a baseline has been imported and accepted
-- a baseline reproduction has completed and been verified
-- an explicit waiver decision exists with a clear reason
-Operationally:
-- call `artifact.confirm_baseline(...)` once the accepted baseline root and trusted comparison contract are clear
-- call `artifact.waive_baseline(...)` when the quest must continue without a baseline
-- attach, import, or publish alone do not open the downstream gate
-## Required plan and checklist
-Before substantial baseline setup, code edits, or a real baseline run, create a quest-visible `PLAN.md` and `CHECKLIST.md`.
-- Use `references/baseline-plan-template.md` as the canonical structure for `PLAN.md`.
-- Use `references/baseline-checklist-template.md` as the canonical structure for `CHECKLIST.md`.
-- `analysis_plan.md` and `REPRO_CHECKLIST.md` remain acceptable compatibility alias files when an older quest already depends on them.
-- For fast-path attach/import/prebound validation or a simple reproduce path with no expected code changes, short-form `PLAN.md` and `CHECKLIST.md` are enough.
-- The plan should put the user's explicit requirements and non-negotiable constraints first.
-- Then record the chosen route, source identity, command path, expected outputs, acceptance condition, safe efficiency levers, main risks, and fallback.
-- If the route, commands, source package, fallback path, or trust judgment changes materially, revise `PLAN.md` before continuing.
-- Once the route is concrete, stop reshaping code and commands speculatively.
-Default retry discipline:
-- do not rerun the same unchanged smoke command just to reconfirm the same fact
-- treat one autonomous retry for the same failure class as the normal upper bound
-- if the same failure class appears again, switch explicitly into `repair`, record `blocked`, or route through `decision`
-## Required durable outputs
-The baseline stage should usually leave behind:
-- a baseline directory under `baselines/local/` or `baselines/imported/`
-- `PLAN.md` and `CHECKLIST.md`
-- a verification note or report
-- command, config, environment, and metrics pointers
-- a baseline artifact
-- a confirmed baseline gate via `artifact.confirm_baseline(...)`, or an explicit waiver via `artifact.waive_baseline(...)`
-- an optional registry publication if the baseline is reusable beyond this quest
-For simple attach/import flows or a straightforward reproduce flow, do not stall just to precreate every optional note file.
-Useful optional notes:
-- `setup.md`
-- `execution.md`
-- `verification.md`
-- `STRUCTURE.md` when the layout is non-obvious
-## File-by-file contract
-- `PLAN.md` or compatibility alias `analysis_plan.md` is the required route contract before substantial setup, code edits, or a real run; it should state the route, source identity, command path, expected outputs, acceptance condition, main risks, and fallback.
-- `CHECKLIST.md` or compatibility alias `REPRO_CHECKLIST.md` is the required living state tracker; it should show whether the baseline object, smoke decision, real run decision, and final accept / block / waive outcome are explicit.
-- `setup.md` is optional unless environment or layout choices are non-trivial; if used, record the working directory, environment route, important config paths, source revision, and notable setup deviations.
-- `execution.md` is optional unless the run is long, multi-step, or rerun-heavy; if used, record the launched commands, durable log paths, checkpoints, exit state, and any reruns or repairs.
-- `verification.md` is optional as a filename but required in substance before acceptance or blocked closeout; either this file or an equivalent report should record trusted metrics, expected-versus-observed comparison, caveats, canonical output paths, and the next anchor.
-- `STRUCTURE.md` becomes required when the workspace layout, mounts, symlinks, or generated outputs are non-obvious or meant for reuse; it should map the important directories and say which paths are canonical.
-- `attachment.yaml` is required for attached or imported baselines under `baselines/imported/`; preserve source identity, selected variant when relevant, and attachment provenance there.
-- `<baseline_root>/json/metric_contract.json` is the canonical accepted comparison contract; once the baseline is accepted, do not leave the authoritative metric surface only in chat, memory, or prose.
-- `Result/metric.md` is scratch-only; it may help during execution, but it is never the final source of truth.
-Minimum stability rules:
-- before the first real run, leave one durable note with the chosen route, expected command path, target outputs, and main risks
-- after each smoke test or real run, record what actually happened and whether the route still looks viable
-- before acceptance, leave a clear verification note and baseline gate decision
-- every accepted baseline should leave one accepted baseline artifact
-- every blocked baseline line should leave one blocked report and one next-step decision
-- if one rolling note is enough for a simple baseline line, use it
-## Durable path contract
-Use the real runtime paths consistently.
-Quest-local paths:
-- reproduced baseline root: `<quest_root>/baselines/local/<baseline_id>/`
-- attached or imported baseline root: `<quest_root>/baselines/imported/<baseline_id>/`
-- attachment record: `<quest_root>/baselines/imported/<baseline_id>/attachment.yaml`
-- canonical baseline metric contract JSON: `<baseline_root>/json/metric_contract.json`
-- baseline artifact record: `<quest_root>/artifacts/baselines/<artifact_id>.json`
-- baseline reports: `<quest_root>/artifacts/reports/<artifact_id>.json`
-- confirmed baseline reference: `quest.yaml -> confirmed_baseline_ref`
-Global reusable registry paths:
-- baseline registry index: `~/DeepScientist/config/baselines/index.jsonl`
-- canonical baseline entry: `~/DeepScientist/config/baselines/entries/<baseline_id>.yaml`
+Do not use `baseline` when:
-## Baseline id and variant rules
-- `baseline_id` should be short, stable, and filesystem-safe
-- use letters, digits, `.`, `_`, or `-`
-- do not use spaces, `/`, `\\`, or `..`
-- if one codebase contains multiple comparable baselines, prefer one `baseline_id` with structured variants instead of inventing many near-duplicate entries
-- when variants exist, keep `default_variant_id`, `baseline_variants`, and per-variant metric summaries stable enough that later `experiment` and `write` stages can cite them directly
-Do not invent parallel durable locations when these runtime contracts already exist.
-Do not leave the authoritative metric contract only in chat, memory, or prose once the baseline is accepted.
-If a baseline is reproduced only because an analysis campaign needs an extra comparator:
+- a verified active baseline already exists and the next move is obviously `idea`, `experiment`, `write`, or `finalize`
+- the baseline gate was already explicitly waived for the current route
-- still place it under the normal baseline roots
-- treat it as a supplementary analysis baseline unless the quest explicitly promotes it into the canonical gate
-- do not call `artifact.confirm_baseline(...)` for that supplementary case unless the quest truly intends to replace the canonical baseline
+## One-sentence summary
-## Multi-baseline policy
+Secure the lightest trustworthy comparator, make the comparison contract explicit, then confirm, waive, or block the baseline and stop.
-One quest may legitimately need more than one baseline.
+## Control workflow
-- explicitly mark which baseline is the primary downstream comparator
-- distinguish primary comparison baselines from fallback or infrastructure baselines
-- if several baselines are credible, record why the chosen primary baseline is the fairest paper-facing comparator
-- do not leave later stages guessing which baseline is authoritative
+1. Choose the current acceptance target and the lightest route that can satisfy it.
+   Prefer `attach`, `import`, or `verify-local-existing` before full reproduction.
+2. Make the comparator identity and core metric contract explicit.
+   Record task, dataset, split, evaluation path, required metric ids, metric directions, source identity, and known deviations.
+3. Collect only the evidence needed to establish comparability.
+   Do not widen into broad codebase audit or heavy reruns unless the lighter route cannot be trusted.
+4. Verify before acceptance.
+   Check that outputs are real, metrics trace to real evidence, and the intended dataset/split and metric definitions match the contract.
+   Explicitly verify the comparator and metric contract before treating the baseline gate as open.
+5. Close the gate explicitly.
+   Call `artifact.confirm_baseline(...)`, call `artifact.waive_baseline(...)`, or record an explicit blocker and next route.
+   When an already accepted baseline needs a deliberate second-pass refresh after verified code, variant, or canonical metric changes, prefer `artifact.overwrite_baseline(...)` over pretending the update is just a first confirmation.
-## Route order
+## AVOID / pitfalls
-Prefer this order:
+- Do not default to full source reproduction when reuse or verify-local-existing is already sufficient.
+- Do not treat attach, import, or publish alone as baseline acceptance.
+- Do not accept metrics that are fabricated, copied from the paper, or not traceable to real outputs, logs, or service responses.
+- Do not silently normalize away deviations in dataset, split, metric definition, evaluation path, or source identity.
+- Do not keep doing baseline work after the current acceptance target is already satisfied.
+- Do not repeat the same failure class without new evidence, code changes, environment changes, or a route change.
-1. attach
-2. import
-3. reproduce
-4. repair
+## Constraints
-Prefer reuse over redundant reproduction.
+- Routes, templates, filenames, smoke tests, and environment choices are tactics; the hard requirement is objective evidence sufficient to accept, waive, block, or switch the route.
+- Do not treat templates, filenames, `uv`, smoke tests, detached runs, or the phase order as required paths.
+- Durable records are required in substance, not in fixed filenames.
+- `PLAN.md`, `CHECKLIST.md`, `setup.md`, `execution.md`, `verification.md`, `analysis_plan.md`, and `REPRO_CHECKLIST.md` are allowed compatibility surfaces, not mandatory success paths.
+- `<baseline_root>/json/metric_contract.json` is the canonical accepted comparison contract.
+- Accepted baselines still require `artifact.confirm_baseline(...)`.
+- Waived baselines still require `artifact.waive_baseline(...)`.
+- Attach/import/publish alone do not open the downstream gate.
+- Later stages must not need to guess the active comparator, trusted metrics, or main caveats.
-## Workflow
+## Validation
-### Phase 1. Analysis
+Before `baseline` can end, all applicable checks should be true:
-Before running anything substantial, determine:
+- comparator identity is explicit and stable enough to cite later
+- task, dataset, split, evaluation path, required metric ids, metric directions, source identity, and known deviations are durably recorded
+- trusted metric values or trusted output pointers trace to real files, logs, service responses, or source artifacts
+- verification checked the intended dataset/split and metric definitions
+- the accepted comparison contract exists at `<baseline_root>/json/metric_contract.json`
+- the route ended in `artifact.confirm_baseline(...)`, `artifact.waive_baseline(...)`, or an explicit blocked state with next-step routing
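
The "all applicable checks should be true" rule above can be sketched as a small gate function. This is a minimal illustration, not part of the package: the function names and the check keys are hypothetical, and `None` is an assumed convention for "not applicable".

```python
from typing import List, Mapping, Optional


def baseline_can_end(checks: Mapping[str, Optional[bool]]) -> bool:
    """Return True when every applicable check passed.

    None marks a check as not applicable; False marks a failed check.
    """
    applicable = {name: ok for name, ok in checks.items() if ok is not None}
    # Refuse to end on an empty checklist: nothing was actually verified.
    return bool(applicable) and all(applicable.values())


def failed_checks(checks: Mapping[str, Optional[bool]]) -> List[str]:
    """List the names of checks that are applicable and explicitly failed."""
    return sorted(name for name, ok in checks.items() if ok is False)
```

Keeping "not applicable" distinct from "failed" mirrors the wording of the list: a check that does not apply to the chosen route should not block the gate, but a failed one always should.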
-- exact task
-- dataset and split contract
-- metric contract
-- source baseline identity
-- source code path
-- expected run command or evaluation path
-- expected paper or repo numbers when they exist
-- local resource constraints
-Default analysis discipline:
+## Interaction discipline
-- read the source paper and source repo first
-- if runtime already exposes a matching `requested_baseline_ref` or `confirmed_baseline_ref`, validate that concrete object before restarting broad discovery
-- identify the real run or evaluation entrypoint
-- identify the dataset or split and metric contract
-- identify likely environment blockers
-- define the cheapest credible smoke test
+Follow the shared interaction contract injected by the system prompt.
+Keep baseline updates brief unless trust state, blocker state, route, cost, or user-facing risk changed materially.
-Escalate to a fuller audit only when the command path is unclear, the repo is large or confusing, repair mode is active, or custom code changes look likely.
+## Tool discipline
-When the fuller audit is necessary, capture only what later stages truly need:
+- **Do not use native `shell_command` / `command_execution` in this skill.**
+- **All shell, CLI, Python, bash, node, git, npm, uv, and environment work must go through `bash_exec(...)`.**
+- **For git work inside the current quest repository or worktree, prefer `artifact.git(...)` before raw shell git commands.**
+- **If a generic git smoke test is needed outside the quest repo, use `bash_exec(...)` in an isolated scratch repository.**
+- Use web search for discovering papers or repos, but use `artifact.arxiv(paper_id=..., full_text=False)` for actually reading a source arXiv paper when it exists.
+- Set `full_text=True` only when the short form is insufficient.
-- major entry scripts, configs, and modules
-- end-to-end data flow
-- evaluation path and metric computation path
-- obvious environment assumptions
-- obvious bottlenecks or incompatibilities
+## Authority and freedom
-If the source paper is available, record:
+The agent owns the execution path.
+It may choose the workspace layout, environment manager, command order, debugging route, smoke strategy, local paths, and whether the best route is attach, import, verify-local-existing, reproduce, or repair.
-- the core algorithm in compact, implementation-faithful form
-- the main reported numbers
-- the main weaknesses or bottlenecks likely to matter for this quest
+Ask the user only when the next move depends on a real scope, cost, permission, data-access, or scientific-preference decision that cannot be inferred from the quest contract.
+Ordinary route, path, environment, and debugging choices are autonomous unless they change the accepted comparison meaning.
-You may inspect local feasibility with shell-based checks for OS, GPU, CPU, RAM, disk, Python version, and whether `uv` is available.
+## Comparator-first rule
-The analysis phase should leave behind a concrete plan rather than only conversational intent.
+The baseline stage is comparator-first, not reproduction-first.
+For `comparison_ready`, the default question is:
-## Phase 2. Setup
+- what is the lightest trustworthy comparator?
-Prepare the selected route:
+not:
-- attach: validate the selected baseline id and variant
-- import: place the imported baseline metadata under the quest and confirm the package is readable
-- reproduce: prepare the baseline work directory, commands, config pointers, and environment notes
-- repair: identify the precise broken point before rerunning blindly
+- how do I reproduce the whole source package most completely?
-For Python baselines, standardize environment setup around `uv`.
+Default to the lightest baseline path that can still support a fair downstream comparison.
+Default to a fast path when it can establish trust with less work.
+Do not restart broad baseline discovery or front-load a full codebase audit when the comparator, command path, and metric contract are already concrete.
+Do not require a fresh memory pass for every fast-path validation; use memory when it prevents repeated work or clarifies stale route state.
+A bounded smoke test is usually helpful only when command path, environment viability, evaluator wiring, or output schema is still unclear.
+Treat `0-2` as the default budget for smoke/pilot work, and do not repeat an unchanged check without new evidence.
+When resuming a previously blocked or ambiguous route, recover the relevant memory before trusting the old path again.
-### Python environment rule: use `uv`
+If runtime already exposes `requested_baseline_ref` or a matching `confirmed_baseline_ref`, default to reuse-and-verify.
+Escalate to fuller audit, reproduction, or repair only when no concrete comparator, command path, or core comparability surface can be trusted yet.
-- if the repo already contains `uv.lock` or a solid `pyproject.toml`, use `uv sync`
-- otherwise create a local virtual environment with `uv venv`
-- install dependencies with `uv pip install ...`
-- run setup, smoke tests, and real commands through `uv run ...`
+For route examples and boundary cases, read `references/route-selection.md`, `references/artifact-flow-examples.md`, and `references/boundary-cases.md`.
+Use `references/baseline-plan-template.md` and `references/baseline-checklist-template.md` when a baseline route is complex enough to need durable planning surfaces.
-Practical rules:
+## Acceptance targets
-- prefer a quest-local or baseline-local `.venv`
-- prefer `uv run python ...` or `uv run bash ...` over relying on shell activation state
-- if a specific interpreter is required, make it explicit with `uv venv --python 3.11` or `uv run --python 3.11 ...`
-- if CUDA, PyTorch, JAX, or custom wheels require a special index URL, keep that install under `uv pip`
-- only accept a non-`uv` route when there is a concrete blocker that cannot be resolved locally
+- `comparison_ready`: the default target; one comparator is trustworthy enough for downstream comparison, and the core metric contract is durably recorded
+- `paper_repro_ready`: the baseline is strong enough to support paper-facing reproduction or comparison claims
+- `registry_publishable`: the baseline package is reusable and clean enough to publish as a durable baseline package
+- `blocked`: the current route cannot clear the gate cleanly, and the next move is explicit
+- `waived`: the quest must continue without a baseline, and the reason is durably recorded
-Common `uv` patterns:
+Not every baseline needs paper-grade exact reproduction.
+A verified attached, imported, or local-existing comparator can be enough when the acceptance target is only `comparison_ready`.
-- `uv sync`
-- `uv venv --python 3.11`
-- `uv pip install -r requirements.txt`
-- `uv run python scripts/smoke_test.py`
-- `uv run python train.py --config ...`
+## Hard acceptance gates
-Setup should record:
+Baseline success means later stages can compare against one accepted comparator without guessing task, data, split, metric, source, command or evaluation path, provenance, or caveats.
-- baseline id and source identity
-- working directory
-- config files
-- command template
-- expected outputs
-- known deviations from paper or source
-- the chosen `uv` route and Python version
+A baseline is successful only when all applicable gates are true:
-Fallbacks:
+- the comparator identity is explicit and stable enough for later stages to cite
+- the task, dataset, split, evaluation path, required metric ids, metric directions, source identity, and known deviations are durably recorded
+- trusted metric values or trusted output pointers are traceable to real files, logs, service responses, source artifacts, or an accepted registry/package record
+- verification checked that the evidence came from the intended dataset/split and metric definitions
+- the accepted comparison contract is written to `<baseline_root>/json/metric_contract.json`
+- the baseline gate is opened with `artifact.confirm_baseline(...)`, or intentionally bypassed with `artifact.waive_baseline(...)`
-- if Hugging Face access is blocked, record and try an approved local mirror such as ModelScope when that does not change the comparison meaning
-- if a quest already depends on `analysis_plan.md` or `REPRO_CHECKLIST.md`, keep the compatibility alias explicit rather than splitting truth across two active plans
+Once a comparison-ready baseline is durably confirmed, baseline should usually stop immediately and hand off to the next scientific step.
+Any extra baseline work after that must name one explicit unresolved comparison risk it is meant to remove.
-## Phase 3. Execution
+## Route success criteria
-Run only the work required to establish the baseline credibly.
+Choose the route that maximizes trust per unit time and compute; do not follow a fixed ritual.
+Keep one dominant baseline route active at a time.
+If a lighter route already satisfies the current acceptance target, stop there.
-Execution rules:
+- `attach` succeeds when baseline identity, provenance, trusted outputs pointer, core metric contract, and accepted baseline artifact are explicit
+- `import` succeeds when the package is materialized/readable inside the quest, `attachment.yaml` or equivalent provenance exists, and trusted outputs or metrics are traceable
+- `verify-local-existing` succeeds when the concrete local path or service, exact command or evaluation endpoint, output location, required metrics, and core metric contract are verified
+- `reproduce` succeeds when source identity, command or evaluation path, expected outputs, verification evidence, deviations, and metric contract are explicit
+- `repair` succeeds when the broken point is identified, a bounded fix or route change is made, rerun or re-read evidence supports the new trust state, and the result is accepted or blocked
-- keep commands auditable
-- keep logs durable
-- avoid uncontrolled side experiments during baseline establishment
-- checkpoint only explainable, minimal code changes
-- prefer equivalence-preserving efficiency gains such as larger safe batch size, cache reuse, checkpoint resume, and parallel downloads or workers
-- do not use an efficiency lever if it changes accepted baseline meaning, effective evaluation contract, or trust judgment
+Prefer reuse over redundant reproduction, but prefer reproduction or repair when reuse would still leave the baseline incomparable.
+Do not replace a working comparison-ready comparator with a heavier route merely because the heavier route feels cleaner or more complete.
-Long-running execution discipline:
+## Objective evidence requirements
-- run one bounded smoke test before a substantial baseline reproduction
-- once the smoke test passes, launch the real baseline reproduction with `bash_exec(mode='detach', ...)`
-- monitor by forward progress instead of by short-window completion anxiety
-- do not report final success until the command actually finished and the expected result files exist
-- if you need to recover ids or inspect session state, use `bash_exec(mode='history')` or `bash_exec(mode='list')`
-- `bash_exec(mode='read', id=...)` returns the full saved log when it is `2000 lines or fewer`; for longer logs, inspect omitted middle windows with `start` and `tail`
-- during monitoring, prefer `bash_exec(mode='read', id=..., tail_limit=..., order='desc')`, and after the first read prefer incremental checks with `after_seq=last_seen_seq`
-- use `silent_seconds`, `progress_age_seconds`, `signal_age_seconds`, and `watchdog_overdue` as the default staleness clues
-- if a run is clearly invalid, wedged, or superseded, stop it with `bash_exec(mode='kill', id=..., wait=true, timeout_seconds=...)`, document why, and relaunch cleanly
-- do not let more than the `30-minute visibility bound` pass without a real inspection and a `next expected update time`
-- when the baseline code is under your control, prefer a throttled `tqdm` progress reporter and periodic `__DS_PROGRESS__` markers when feasible
+The final evidence should cover these facts before acceptance:
-Keep retries bounded:
+- comparator candidate and baseline id
+- source paper, source repo, source commit/version/tag, local service identity, or registry/package identity as applicable
+- task identity
+- dataset identity and split contract
+- evaluation script, evaluation endpoint, or evaluation path
+- required metric keys for the current downstream comparison
+- metric directions
+- metric values or trusted output pointers
+- environment and hardware facts that materially affect comparability
+- known deviations from the paper, source package, local reference, or selected target
+- verification verdict and caveats
-- one smoke test is the default
-- one autonomous fix-and-retry for the same failure class is the normal upper bound
-- if the same failure class returns, stop looping
+Unless the user explicitly specifies otherwise, treat the original paper's evaluation protocol as the canonical starting point.
+If later `experiment` work would still have to guess the comparison contract, the baseline is not ready.
+For a compact verdict rubric, read `references/comparability-contract.md`.
-## Phase 4. Verification
+## Verification
 Verification is mandatory before baseline acceptance.
 Verify:
-- the run actually finished
+- the run, service call, package import, or trusted-output inspection actually finished
 - the reported metrics came from the intended dataset and split
-- the metric definitions match the quest contract
-- the result is comparable to the paper, source repo, or selected target
-- any deviations are explicitly stated
+- metric definitions and directions match the quest contract
+- the result is comparable to the paper, source repo, local comparator, registry package, or selected target
+- deviations are explicitly stated rather than silently normalized away
 Classify the outcome as one of:
 - `verified_match`
 - `verified_close`
 - `verified_diverged`
+- `trusted_with_caveats`
 - `broken`
-Verification must explicitly separate:
+Verification should explicitly separate likely implementation mismatch, environment mismatch, data or split mismatch, expected stochastic variance, and unexplained divergence when those distinctions matter.
-- likely implementation mismatch
-- environment mismatch
-- data or split mismatch
-- expected stochastic variance
-- unexplained divergence
+## Core metric contract
-Verification should answer:
-- whether the baseline is trustworthy enough for downstream comparison
-- whether the result is reusable beyond this quest
-- whether another repair or rerun is justified
-- whether the line should stop here and hand off
-A verification report should be self-contained enough that a later stage can answer:
-- what was used
-- how it was obtained: attach, import, reproduce, or repair
-- what commands and configs were used
-- what metrics are trusted
-- what caveats remain
-- whether the result is reusable beyond this quest
-## Baseline comparability contract
-The baseline stage is not complete just because something ran.
-It is complete when later stages can compare against it fairly.
-Before declaring a baseline usable, make the comparability contract explicit:
+Before declaring a baseline usable, make the core comparison contract explicit:
 - task identity
-- dataset identity and version
-- split contract
-- preprocessing boundary
+- dataset identity and split contract
 - evaluation script or evaluation path
-- required metric keys
+- required metric keys for the current downstream comparison
 - metric directions
-- seed policy when relevant
 - source commit or source package identity
 - known deviations from the source reference
-Unless the user explicitly specifies otherwise, treat the original paper's evaluation protocol as the canonical baseline contract.
-If any of these fields are still materially unknown, do not pretend the baseline is a clean downstream reference.
-For the fuller checklist and verdict meanings, read `references/comparability-contract.md`.
-## Feasibility and trust classes
-Before acceptance, classify feasibility as one of:
-- `full_reproducible`
-- `degraded_but_acceptable`
-- `blocked`
-And classify downstream trust as one of:
-- `verified`
-- `partially_verified`
-- `operational_but_incomparable`
-- `failed`
-Do not silently upgrade a degraded or merely operational result into a normal trusted baseline.
-## Minimum baseline artifact content
+`<baseline_root>/json/metric_contract.json` is the canonical accepted comparison contract.
+The comparison-ready minimum still requires `<baseline_root>/json/metric_contract.json`.
+A core contract is enough to confirm a `comparison_ready` baseline; expand it later when paper claims, registry publication, or variant-heavy comparison need more coverage.
 The accepted baseline artifact should include at least:
@@ -460,111 +240,46 @@ The accepted baseline artifact should include at least:
 - `source`
 - `summary`
-If variants exist, also include:
-- `default_variant_id`
-- `baseline_variants`
 Metric-contract rules:
-- if the accepted baseline contract includes multiple metrics, datasets, subtasks, or splits, record all of them in `<baseline_root>/json/metric_contract.json`
 - keep `primary_metric` as the headline metric only; do not let it erase the rest of the comparison surface
-- when confirming a baseline, submit the canonical `metrics_summary` as a flat top-level dictionary keyed by the paper-facing metric ids
+- submit canonical `metrics_summary` as a flat top-level dictionary keyed by the paper-facing metric ids
 - every canonical baseline metric entry should include `description`, either `derivation` or `origin_path`, and `source_ref`
+- mark only the currently required canonical metrics as required; additional metrics can be added later or kept supplementary
+- if the accepted baseline contract already needs multiple metrics, datasets, subtasks, or splits, record them in `<baseline_root>/json/metric_contract.json`
 - if the paper reports both aggregate and per-dataset or per-task results, preserve both whenever feasible through `metrics_summary` plus structured rows rather than one cherry-picked scalar
 - if the source package already has a richer leaderboard table, structured result file, or `json/metric_contract.json`, reuse that richer contract instead of hand-writing a thinner one that keeps only one averaged scalar
 - `Result/metric.md` is optional temporary scratch memory only; reconcile against it before calling `artifact.confirm_baseline(...)`, but do not treat it as a required durable file
+- for stable accepted payload shapes, read `references/artifact-payload-examples.md`
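
The metric-contract rules above lend themselves to a quick pre-submission check. The sketch below is illustrative only: the function name, the split into `metrics_summary` (metric id to value) and `metric_entries` (metric id to metadata), and the reading of "flat" as "numeric scalar values" are assumptions, not the package's documented payload shape. For the real shape, the rules point to `references/artifact-payload-examples.md`.

```python
from numbers import Real


def check_metric_contract(metrics_summary, metric_entries):
    """Return a list of problems; an empty list means this sketch found none."""
    problems = []
    for metric_id, value in metrics_summary.items():
        # "Flat" is read here as: every top-level value is a numeric scalar.
        if isinstance(value, bool) or not isinstance(value, Real):
            problems.append(f"{metric_id}: summary value is not a flat numeric scalar")
        if metric_id not in metric_entries:
            problems.append(f"{metric_id}: no metric entry")
    for metric_id, entry in metric_entries.items():
        # Each entry needs a description and a source reference.
        for field in ("description", "source_ref"):
            if not entry.get(field):
                problems.append(f"{metric_id}: missing {field}")
        # Either a derivation or an origin path must be present.
        if not (entry.get("derivation") or entry.get("origin_path")):
            problems.append(f"{metric_id}: needs derivation or origin_path")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to record every gap in a single verification note.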
-## Publication and reuse
-Use the registry deliberately, not as an afterthought.
-If the result is reusable beyond the current quest:
-- publish it through `artifact.publish_baseline(...)`
-- ensure the payload includes identity, provenance, trusted metrics, and any variant structure
-- set `publish_global: true` only when verification is complete and reuse is justified
-If the current quest should reuse an existing baseline:
-- attach it through `artifact.attach_baseline(...)`
-- preserve the selected `baseline_id`
-- preserve the selected `variant_id` when one is used
-- keep the attachment durable under `baselines/imported/`
-If runtime state already includes `requested_baseline_ref` or a matching `confirmed_baseline_ref`:
-- default to reuse-and-verify, not rediscovery
-- treat a creation-time pre-bound baseline as the active starting point unless you find a concrete incompatibility
-- do not rerun broad baseline scouting or full reproduction just because the stage name is `baseline`
-For a clearer attach/import/reproduce/repair rubric, read `references/route-selection.md`.
-For reusable-package expectations, read `references/publishable-baseline-package.md`.
-## Workspace and branch rules
-- treat the baseline workspace as a system-managed reproduction surface, not an unrelated sandbox
-- avoid creating a nested authoritative Git lifecycle inside the baseline workspace
-- use the quest branch unless isolation is genuinely needed
-- if baseline setup is risky or intrusive, prepare an isolated branch or worktree first and record why
-- do not proliferate branches without a reason
-## Memory rules
-Stage-start requirement:
-- by default, begin every baseline pass with `memory.list_recent(scope='quest', limit=5)`
-- then run at least one baseline-relevant `memory.search(...)` before new baseline analysis, repair, or rerun work
-- fast-path exception: if the quest already exposes a clear `requested_baseline_ref` or `confirmed_baseline_ref` and the immediate task is only to validate or reattach that concrete baseline, you may skip broad retrieval
+## Operational guidance
-Write memory only for reusable lessons such as:
+The main skill keeps the control surface in front.
+For the longer operational notes, read `references/operational-guidance.md`.
-- paper-to-code mismatch notes
-- environment incidents
-- dataset quirks
-- verification caveats
-- attach vs import vs reproduce vs repair rationale
+- use it when you need the exact durable route record shape
+- use it when you need detailed execution tactics or environment tactics
+- use it when reuse or memory handling materially affects the route
-When calling `memory.write(...)`, pass `tags` as an array like `["stage:baseline", "baseline:<baseline_id>", "type:repro-lesson"]`, not as one comma-joined string.
+## Negative cases and stop rules
-Stage-end requirement:
+Do not accept a baseline when:
-- if baseline work produced a durable reproduction lesson, verification caveat, environment incident, or route rationale, write at least one `memory.write(...)` before leaving the stage
-## Artifact rules
-Typical artifact sequence:
-- `progress` for long-running setup or execution checkpoints
-- `report` for analysis notes or verification notes
-- `decision` for route choice, blocked routing, or accept/reject/rerun/repair calls
-- `baseline` only for an accepted baseline record
-For stable field shapes, read `references/artifact-payload-examples.md`.
-The baseline handoff should make these items obvious:
-- `baseline_id`
-- `baseline_variant_id` when relevant
-- route used: attach, import, reproduce, or repair
-- trusted metrics
-- canonical metric contract JSON path
-- verification outcome
-- reusable or quest-local only
-- canonical output paths
-- main caveats
-- recommended next anchor
-If this packet is not obvious from the accepted artifact plus verification note, the baseline line is not stable enough yet.
-## Failure and blocked handling
+- metrics are fabricated, copied, or paraphrased without provenance
+- metrics are copied from a paper while the acceptance target requires local verification
+- dataset, split, metric direction, or evaluation path is materially unknown
+- outputs exist but cannot be tied to the intended command, source, comparator, package, or service
+- a local run completed but used a materially different protocol without a recorded caveat
+- source code was modified in a way that changes baseline scope without recording the deviation
+- a package imports but trusted metrics or outputs are not traceable
+- later experiment work would still need to guess the required baseline metric ids
+- the same failure class reappears without new evidence, code changes, environment changes, or a route change
+If the same failure class appears again without new evidence, code changes, environment changes, or a route change, stop looping and route through `repair`, `decision`, `blocked`, `waive`, or one bounded clarification.
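
The stop rule above can be expressed as a guard over the attempt history. This is a hedged sketch: `Attempt`, `something_changed`, and `should_stop_looping` are hypothetical names, and collapsing "new evidence, code changes, environment changes, or a route change" into one boolean is a simplification.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Attempt:
    failure_class: str       # e.g. "command_unknown"
    something_changed: bool  # new evidence, code, environment, or route since the previous attempt


def should_stop_looping(attempts: Sequence[Attempt]) -> bool:
    """True when the latest attempt repeats the previous failure class with nothing new."""
    if len(attempts) < 2:
        return False
    previous, latest = attempts[-2], attempts[-1]
    return latest.failure_class == previous.failure_class and not latest.something_changed
```

When the guard returns `True`, the text above says to route through `repair`, `decision`, `blocked`, `waive`, or one bounded clarification instead of retrying.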
 Do not hide failures.
-If blocked, record the class explicitly:
+If blocked, record the class explicitly when possible:
 - `missing_source`
-- `missing_code`
 - `missing_metric_contract`
 - `environment_infeasible`
 - `command_unknown`
@@ -576,29 +291,36 @@ A blocked result must state:
 - what failed
 - what was tried
 - which paths or logs show the issue
-- whether the next best move is attach, import, retry, repair, reset, or ask the user
+- whether the next best move is attach, import, retry, repair, reset, waive, or ask the user
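
A blocked result with the fields visible in this hunk could be modelled as below. The class name, field names, and the `ask_user` spelling are hypothetical. The hunk header may hide earlier items of the "must state" list, so this sketch covers only what is shown. The blocker class is optional because the text asks for it "when possible".

```python
from dataclasses import dataclass
from typing import List, Optional

# Next moves named in the list above; "ask_user" is an assumed spelling of "ask the user".
NEXT_MOVES = {"attach", "import", "retry", "repair", "reset", "waive", "ask_user"}


@dataclass
class BlockedReport:
    what_failed: str
    what_was_tried: List[str]
    evidence_paths: List[str]  # paths or logs that show the issue
    next_move: str
    blocker_class: Optional[str] = None  # e.g. "missing_source", recorded when possible

    def problems(self) -> List[str]:
        """Return the missing or invalid parts of the report."""
        issues = []
        if not self.what_failed.strip():
            issues.append("what_failed is empty")
        if not self.what_was_tried:
            issues.append("what_was_tried is empty")
        if not self.evidence_paths:
            issues.append("evidence_paths is empty")
        if self.next_move not in NEXT_MOVES:
            issues.append(f"unknown next_move: {self.next_move}")
        return issues
```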
-Reasonable autonomous fixes before escalation:
+Bounded autonomous fixes are acceptable only when they do not change confirmed scope, metrics, permissions, resource assumptions, or scientific meaning.
+Reasonable bounded fixes include missing dependency installs, wrong dataset paths, permission fixes on scripts, obvious environment activation mistakes, and conservative batch-size reductions for OOM.
-- missing module or dependency
-- wrong dataset path
-- permission errors on scripts
-- reasonable batch-size reductions for OOM
-- obvious environment activation mistakes
+## Baseline id and variant rules
+Keep baseline identifiers and variant names stable enough that later stages can cite the same comparator without guesswork.
-If a fix would change confirmed scope, metrics, permissions, or resource assumptions, stop and return to analysis rather than applying it silently.
+- keep `baseline_id` short, stable, and filesystem-safe
+- prefer one baseline id with stable variant names over many near-duplicate ids
+- if multiple comparators exist, mark which one is the primary downstream baseline
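
The new text only asks for a short, stable, filesystem-safe `baseline_id`. The removed 1.5.16 text spelled out the character rules: letters, digits, `.`, `_`, or `-`, with no spaces, `/`, `\`, or `..`. A minimal validator under those older rules might look like this. The function name, the 64-character limit, and the rejection of a bare `.` are assumptions added for illustration.

```python
import re

# Allowed characters per the removed 1.5.16 rules: letters, digits, ".", "_", "-".
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")
MAX_LEN = 64  # assumption: the source says "short" without giving a limit


def is_safe_baseline_id(baseline_id: str) -> bool:
    """Check a baseline id against the character rules quoted above."""
    if not baseline_id or len(baseline_id) > MAX_LEN:
        return False
    # ".." is forbidden explicitly; a bare "." is rejected here as an extra precaution.
    if ".." in baseline_id or baseline_id == ".":
        return False
    # Spaces, "/", and "\" fail because they are outside the allowed set.
    return _ALLOWED.fullmatch(baseline_id) is not None
```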
 ## Exit criteria
-Exit the baseline stage once one of the following is durably true:
+Exit once one of these is durably true:
 - a baseline is attached and accepted
 - an imported baseline is accepted
+- a verified local-existing comparator is accepted
 - a reproduced baseline is verified and accepted
+- a repaired baseline is verified and accepted
 - a broken route has been declared blocked and a next decision is recorded
+- a waiver decision is recorded and the baseline gate is explicitly bypassed
+- a route change is recorded because the previous route is no longer the best trust-per-cost path
 Typical next anchors:
 - `idea`
 - `experiment` in tightly scoped follow-on cases
 - `decision` if the baseline line remains contested
+A good baseline pass leaves one trusted comparator, one explicit blocker, or one explicit route change, not a vague promise to keep rechecking the baseline.