@researai/deepscientist 1.5.16 → 1.6.0
This diff shows the changes between two publicly released versions of the package, as they appear in the public registry where they were published. It is provided for informational purposes only.
- package/AGENTS.md +309 -130
- package/AISB/catalog/aisb.b1.agentic_coding.yaml +244 -0
- package/AISB/catalog/aisb.b10.climate_earth.yaml +235 -0
- package/AISB/catalog/aisb.b11.model_efficiency.yaml +231 -0
- package/AISB/catalog/aisb.b12.embodied_ai.yaml +238 -0
- package/AISB/catalog/aisb.b2.agent_systems.yaml +229 -0
- package/AISB/catalog/aisb.b3.self_evolving_rl.yaml +237 -0
- package/AISB/catalog/aisb.b4.lm_reasoning.yaml +240 -0
- package/AISB/catalog/aisb.b5.math_proof.yaml +235 -0
- package/AISB/catalog/aisb.b6.research_process.yaml +243 -0
- package/AISB/catalog/aisb.b7.multimodal_fusion.yaml +232 -0
- package/AISB/catalog/aisb.b8.lifesci_drug.yaml +275 -0
- package/AISB/catalog/aisb.b9.material_science.yaml +237 -0
- package/AISB/catalog/aisb.t3.001_savvy.yaml +159 -0
- package/AISB/catalog/aisb.t3.001_savvy.zh.yaml +121 -0
- package/AISB/catalog/aisb.t3.002_pinet.yaml +189 -0
- package/AISB/catalog/aisb.t3.002_pinet.zh.yaml +130 -0
- package/AISB/catalog/aisb.t3.004_decentralattn.yaml +184 -0
- package/AISB/catalog/aisb.t3.004_decentralattn.zh.yaml +153 -0
- package/AISB/catalog/aisb.t3.005_tsae.yaml +193 -0
- package/AISB/catalog/aisb.t3.005_tsae.zh.yaml +139 -0
- package/AISB/catalog/aisb.t3.006_physense.yaml +194 -0
- package/AISB/catalog/aisb.t3.006_physense.zh.yaml +118 -0
- package/AISB/catalog/aisb.t3.007_reasoningiqa.yaml +169 -0
- package/AISB/catalog/aisb.t3.007_reasoningiqa.zh.yaml +133 -0
- package/AISB/catalog/aisb.t3.008_meanflows.yaml +188 -0
- package/AISB/catalog/aisb.t3.008_meanflows.zh.yaml +140 -0
- package/AISB/catalog/aisb.t3.009_scoremissing.yaml +179 -0
- package/AISB/catalog/aisb.t3.009_scoremissing.zh.yaml +119 -0
- package/AISB/catalog/aisb.t3.010_suitabilityfilter.yaml +221 -0
- package/AISB/catalog/aisb.t3.010_suitabilityfilter.zh.yaml +141 -0
- package/AISB/catalog/aisb.t3.011_osd.yaml +206 -0
- package/AISB/catalog/aisb.t3.011_osd.zh.yaml +163 -0
- package/AISB/catalog/aisb.t3.012_efficientqat.yaml +206 -0
- package/AISB/catalog/aisb.t3.012_efficientqat.zh.yaml +159 -0
- package/AISB/catalog/aisb.t3.013_appl.yaml +152 -0
- package/AISB/catalog/aisb.t3.013_appl.zh.yaml +126 -0
- package/AISB/catalog/aisb.t3.014_piguard.yaml +207 -0
- package/AISB/catalog/aisb.t3.014_piguard.zh.yaml +164 -0
- package/AISB/catalog/aisb.t3.015_frspec.yaml +209 -0
- package/AISB/catalog/aisb.t3.015_frspec.zh.yaml +163 -0
- package/AISB/catalog/aisb.t3.016_mathfusion.yaml +166 -0
- package/AISB/catalog/aisb.t3.016_mathfusion.zh.yaml +145 -0
- package/AISB/catalog/aisb.t3.017_multimodalglp.yaml +171 -0
- package/AISB/catalog/aisb.t3.017_multimodalglp.zh.yaml +122 -0
- package/AISB/catalog/aisb.t3.018_cotsynth.yaml +206 -0
- package/AISB/catalog/aisb.t3.018_cotsynth.zh.yaml +162 -0
- package/AISB/catalog/aisb.t3.019_dyscaleut.yaml +211 -0
- package/AISB/catalog/aisb.t3.019_dyscaleut.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.020_aristotle.yaml +173 -0
- package/AISB/catalog/aisb.t3.020_aristotle.zh.yaml +119 -0
- package/AISB/catalog/aisb.t3.021_tokenrecycling.yaml +160 -0
- package/AISB/catalog/aisb.t3.021_tokenrecycling.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.022_chainofreasoning.yaml +204 -0
- package/AISB/catalog/aisb.t3.022_chainofreasoning.zh.yaml +161 -0
- package/AISB/catalog/aisb.t3.023_guidedembed.yaml +211 -0
- package/AISB/catalog/aisb.t3.023_guidedembed.zh.yaml +189 -0
- package/AISB/catalog/aisb.t3.024_outputcentric.yaml +148 -0
- package/AISB/catalog/aisb.t3.024_outputcentric.zh.yaml +131 -0
- package/AISB/catalog/aisb.t3.025_deeper.yaml +143 -0
- package/AISB/catalog/aisb.t3.025_deeper.zh.yaml +116 -0
- package/AISB/catalog/aisb.t3.026_gartkg.yaml +195 -0
- package/AISB/catalog/aisb.t3.026_gartkg.zh.yaml +127 -0
- package/AISB/catalog/aisb.t3.027_citeeval.yaml +182 -0
- package/AISB/catalog/aisb.t3.027_citeeval.zh.yaml +135 -0
- package/AISB/catalog/aisb.t3.028_sbam.yaml +206 -0
- package/AISB/catalog/aisb.t3.028_sbam.zh.yaml +166 -0
- package/AISB/catalog/aisb.t3.029_cdqgeoembed.yaml +224 -0
- package/AISB/catalog/aisb.t3.029_cdqgeoembed.zh.yaml +142 -0
- package/AISB/catalog/aisb.t3.030_processrm.yaml +211 -0
- package/AISB/catalog/aisb.t3.030_processrm.zh.yaml +166 -0
- package/AISB/catalog/aisb.t3.031_circuitstability.yaml +172 -0
- package/AISB/catalog/aisb.t3.031_circuitstability.zh.yaml +134 -0
- package/AISB/catalog/aisb.t3.032_ptsolver.yaml +169 -0
- package/AISB/catalog/aisb.t3.032_ptsolver.zh.yaml +135 -0
- package/AISB/catalog/aisb.t3.033_gcse.yaml +144 -0
- package/AISB/catalog/aisb.t3.033_gcse.zh.yaml +126 -0
- package/AISB/catalog/aisb.t3.034_ensemblewm.yaml +183 -0
- package/AISB/catalog/aisb.t3.034_ensemblewm.zh.yaml +146 -0
- package/AISB/catalog/aisb.t3.035_moralvalueswa.yaml +207 -0
- package/AISB/catalog/aisb.t3.035_moralvalueswa.zh.yaml +165 -0
- package/AISB/catalog/aisb.t3.036_weakstrongpref.yaml +210 -0
- package/AISB/catalog/aisb.t3.036_weakstrongpref.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.037_dementiamask.yaml +172 -0
- package/AISB/catalog/aisb.t3.037_dementiamask.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.038_tinysam.yaml +284 -0
- package/AISB/catalog/aisb.t3.038_tinysam.zh.yaml +240 -0
- package/AISB/catalog/aisb.t3.039_calf.yaml +224 -0
- package/AISB/catalog/aisb.t3.039_calf.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.040_graniteguardian.yaml +199 -0
- package/AISB/catalog/aisb.t3.040_graniteguardian.zh.yaml +174 -0
- package/AISB/catalog/aisb.t3.041_amdm.yaml +149 -0
- package/AISB/catalog/aisb.t3.041_amdm.zh.yaml +137 -0
- package/AISB/catalog/aisb.t3.042_xpatch.yaml +216 -0
- package/AISB/catalog/aisb.t3.042_xpatch.zh.yaml +182 -0
- package/AISB/catalog/aisb.t3.043_vhm.yaml +268 -0
- package/AISB/catalog/aisb.t3.043_vhm.zh.yaml +193 -0
- package/AISB/catalog/aisb.t3.044_rgvi.yaml +224 -0
- package/AISB/catalog/aisb.t3.044_rgvi.zh.yaml +176 -0
- package/AISB/catalog/aisb.t3.045_pslstm.yaml +203 -0
- package/AISB/catalog/aisb.t3.045_pslstm.zh.yaml +179 -0
- package/AISB/catalog/aisb.t3.046_nonstatts.yaml +208 -0
- package/AISB/catalog/aisb.t3.046_nonstatts.zh.yaml +194 -0
- package/AISB/catalog/aisb.t3.047_timepfn.yaml +156 -0
- package/AISB/catalog/aisb.t3.047_timepfn.zh.yaml +124 -0
- package/AISB/catalog/aisb.t3.048_proxyspex.yaml +148 -0
- package/AISB/catalog/aisb.t3.048_proxyspex.zh.yaml +125 -0
- package/AISB/catalog/aisb.t3.049_hogwildinference.yaml +183 -0
- package/AISB/catalog/aisb.t3.049_hogwildinference.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.050_causalpfn.yaml +214 -0
- package/AISB/catalog/aisb.t3.050_causalpfn.zh.yaml +190 -0
- package/AISB/catalog/aisb.t3.051_flashtp.yaml +169 -0
- package/AISB/catalog/aisb.t3.051_flashtp.zh.yaml +124 -0
- package/AISB/catalog/aisb.t3.052_nsdiff.yaml +155 -0
- package/AISB/catalog/aisb.t3.052_nsdiff.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.053_k2vae.yaml +158 -0
- package/AISB/catalog/aisb.t3.053_k2vae.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.054_timebase.yaml +178 -0
- package/AISB/catalog/aisb.t3.054_timebase.zh.yaml +158 -0
- package/AISB/catalog/aisb.t3.055_csbrain.yaml +238 -0
- package/AISB/catalog/aisb.t3.055_csbrain.zh.yaml +184 -0
- package/AISB/catalog/aisb.t3.056_infosam.yaml +224 -0
- package/AISB/catalog/aisb.t3.056_infosam.zh.yaml +189 -0
- package/AISB/catalog/aisb.t3.057_mdreid.yaml +129 -0
- package/AISB/catalog/aisb.t3.057_mdreid.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.058_mindglitch.yaml +171 -0
- package/AISB/catalog/aisb.t3.058_mindglitch.zh.yaml +145 -0
- package/AISB/catalog/aisb.t3.059_selfsupervised.yaml +154 -0
- package/AISB/catalog/aisb.t3.059_selfsupervised.zh.yaml +125 -0
- package/AISB/catalog/aisb.t3.060_iaggad.yaml +121 -0
- package/AISB/catalog/aisb.t3.060_iaggad.zh.yaml +100 -0
- package/AISB/catalog/aisb.t3.061_hsgkn.yaml +136 -0
- package/AISB/catalog/aisb.t3.061_hsgkn.zh.yaml +113 -0
- package/AISB/catalog/aisb.t3.062_visionts.yaml +237 -0
- package/AISB/catalog/aisb.t3.062_visionts.zh.yaml +216 -0
- package/AISB/catalog/aisb.t3.063_tsrag.yaml +162 -0
- package/AISB/catalog/aisb.t3.063_tsrag.zh.yaml +138 -0
- package/AISB/catalog/aisb.t3.064_pir.yaml +221 -0
- package/AISB/catalog/aisb.t3.064_pir.zh.yaml +197 -0
- package/AISB/catalog/aisb.t3.065_proteinbinding.yaml +234 -0
- package/AISB/catalog/aisb.t3.065_proteinbinding.zh.yaml +167 -0
- package/AISB/catalog/aisb.t3.066_tropicalattention.yaml +267 -0
- package/AISB/catalog/aisb.t3.066_tropicalattention.zh.yaml +229 -0
- package/AISB/catalog/aisb.t3.067_kanad.yaml +193 -0
- package/AISB/catalog/aisb.t3.067_kanad.zh.yaml +167 -0
- package/AISB/catalog/aisb.t3.068_sempo.yaml +187 -0
- package/AISB/catalog/aisb.t3.068_sempo.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.069_treehfd.yaml +129 -0
- package/AISB/catalog/aisb.t3.069_treehfd.zh.yaml +111 -0
- package/AISB/catalog/aisb.t3.070_certifiedunlearning.yaml +224 -0
- package/AISB/catalog/aisb.t3.070_certifiedunlearning.zh.yaml +171 -0
- package/AISB/catalog/aisb.t3.071_neuralmjd.yaml +142 -0
- package/AISB/catalog/aisb.t3.071_neuralmjd.zh.yaml +120 -0
- package/AISB/catalog/aisb.t3.072_fedgmt.yaml +181 -0
- package/AISB/catalog/aisb.t3.072_fedgmt.zh.yaml +158 -0
- package/AISB/catalog/aisb.t3.073_rld.yaml +161 -0
- package/AISB/catalog/aisb.t3.073_rld.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.074_lsvi.yaml +163 -0
- package/AISB/catalog/aisb.t3.074_lsvi.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.075_treeslicedentropy.yaml +201 -0
- package/AISB/catalog/aisb.t3.075_treeslicedentropy.zh.yaml +148 -0
- package/AISB/catalog/aisb.t3.076_aanet.yaml +169 -0
- package/AISB/catalog/aisb.t3.076_aanet.zh.yaml +129 -0
- package/AISB/catalog/aisb.t3.077_cmnn.yaml +199 -0
- package/AISB/catalog/aisb.t3.077_cmnn.zh.yaml +165 -0
- package/AISB/catalog/aisb.t3.078_conformalanomaly.yaml +146 -0
- package/AISB/catalog/aisb.t3.078_conformalanomaly.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.079_dpfkmeans.yaml +131 -0
- package/AISB/catalog/aisb.t3.079_dpfkmeans.zh.yaml +104 -0
- package/AISB/catalog/aisb.t3.080_latentscorereweight.yaml +169 -0
- package/AISB/catalog/aisb.t3.080_latentscorereweight.zh.yaml +123 -0
- package/AISB/catalog/aisb.t3.081_qmamba.yaml +150 -0
- package/AISB/catalog/aisb.t3.081_qmamba.zh.yaml +117 -0
- package/AISB/catalog/aisb.t3.082_onlinellmrouting.yaml +160 -0
- package/AISB/catalog/aisb.t3.082_onlinellmrouting.zh.yaml +133 -0
- package/AISB/catalog/aisb.t3.083_starformer.yaml +178 -0
- package/AISB/catalog/aisb.t3.083_starformer.zh.yaml +140 -0
- package/AISB/catalog/aisb.t3.084_ift.yaml +139 -0
- package/AISB/catalog/aisb.t3.084_ift.zh.yaml +111 -0
- package/AISB/catalog/aisb.t3.085_neuralsurv.yaml +183 -0
- package/AISB/catalog/aisb.t3.085_neuralsurv.zh.yaml +143 -0
- package/AISB/catalog/aisb.t3.086_stella.yaml +197 -0
- package/AISB/catalog/aisb.t3.086_stella.zh.yaml +142 -0
- package/AISB/catalog/aisb.t3.087_moses.yaml +167 -0
- package/AISB/catalog/aisb.t3.087_moses.zh.yaml +132 -0
- package/AISB/catalog/aisb.t3.088_channelnorm.yaml +140 -0
- package/AISB/catalog/aisb.t3.088_channelnorm.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.089_causalvelocity.yaml +730 -0
- package/AISB/catalog/aisb.t3.089_causalvelocity.zh.yaml +668 -0
- package/AISB/catalog/aisb.t3.090_rstib.yaml +144 -0
- package/AISB/catalog/aisb.t3.090_rstib.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.091_timeawarecausal.yaml +132 -0
- package/AISB/catalog/aisb.t3.091_timeawarecausal.zh.yaml +107 -0
- package/AISB/catalog/aisb.t3.092_kmeanslocalopt.yaml +138 -0
- package/AISB/catalog/aisb.t3.092_kmeanslocalopt.zh.yaml +110 -0
- package/AISB/catalog/aisb.t3.093_fedwmsam.yaml +134 -0
- package/AISB/catalog/aisb.t3.093_fedwmsam.zh.yaml +106 -0
- package/AISB/catalog/aisb.t3.094_boundre.yaml +147 -0
- package/AISB/catalog/aisb.t3.094_boundre.zh.yaml +114 -0
- package/AISB/catalog/aisb.t3.095_fastfeaturecp.yaml +153 -0
- package/AISB/catalog/aisb.t3.095_fastfeaturecp.zh.yaml +118 -0
- package/AISB/catalog/aisb.t3.096_m3svm.yaml +189 -0
- package/AISB/catalog/aisb.t3.096_m3svm.zh.yaml +149 -0
- package/AISB/catalog/aisb.t3.097_wassersteintl.yaml +212 -0
- package/AISB/catalog/aisb.t3.097_wassersteintl.zh.yaml +169 -0
- package/AISB/catalog/aisb.t3.098_xmahalanobis.yaml +171 -0
- package/AISB/catalog/aisb.t3.098_xmahalanobis.zh.yaml +127 -0
- package/AISB/catalog/aisb.t3.099_ollalanding.yaml +248 -0
- package/AISB/catalog/aisb.t3.099_ollalanding.zh.yaml +182 -0
- package/AISB/catalog/aisb.t3.100_invmissingdata.yaml +179 -0
- package/AISB/catalog/aisb.t3.100_invmissingdata.zh.yaml +150 -0
- package/AISB/catalog/aisb.t3.101_acia.yaml +164 -0
- package/AISB/catalog/aisb.t3.101_acia.zh.yaml +109 -0
- package/AISB/catalog/aisb.t3.102_stochasticff.yaml +178 -0
- package/AISB/catalog/aisb.t3.102_stochasticff.zh.yaml +130 -0
- package/AISB/catalog/aisb.t3.103_qdcp.yaml +150 -0
- package/AISB/catalog/aisb.t3.103_qdcp.zh.yaml +116 -0
- package/AISB/catalog/aisb.t3.104_balancedactiveinf.yaml +137 -0
- package/AISB/catalog/aisb.t3.104_balancedactiveinf.zh.yaml +104 -0
- package/AISB/catalog/aisb.t3.105_binaryclasseval.yaml +161 -0
- package/AISB/catalog/aisb.t3.105_binaryclasseval.zh.yaml +130 -0
- package/AISB/image/001_aisb.t3.001_savvy.jpg +0 -0
- package/AISB/image/002_aisb.t3.002_pinet.jpg +0 -0
- package/AISB/image/003_aisb.t3.003_dmsqd.jpg +0 -0
- package/AISB/image/004_aisb.t3.004_decentralattn.jpg +0 -0
- package/AISB/image/005_aisb.t3.005_tsae.jpg +0 -0
- package/AISB/image/006_aisb.t3.006_physense.jpg +0 -0
- package/AISB/image/007_aisb.t3.007_reasoningiqa.jpg +0 -0
- package/AISB/image/008_aisb.t3.008_meanflows.jpg +0 -0
- package/AISB/image/009_aisb.t3.009_scoremissing.jpg +0 -0
- package/AISB/image/010_aisb.t3.010_suitabilityfilter.jpg +0 -0
- package/AISB/image/011_aisb.t3.011_osd.jpg +0 -0
- package/AISB/image/012_aisb.t3.012_efficientqat.jpg +0 -0
- package/AISB/image/013_aisb.t3.013_appl.jpg +0 -0
- package/AISB/image/014_aisb.t3.014_piguard.jpg +0 -0
- package/AISB/image/015_aisb.t3.015_frspec.jpg +0 -0
- package/AISB/image/016_aisb.t3.016_mathfusion.jpg +0 -0
- package/AISB/image/017_aisb.t3.017_multimodalglp.jpg +0 -0
- package/AISB/image/018_aisb.t3.018_cotsynth.jpg +0 -0
- package/AISB/image/019_aisb.t3.019_dyscaleut.jpg +0 -0
- package/AISB/image/020_aisb.t3.020_aristotle.jpg +0 -0
- package/AISB/image/021_aisb.t3.021_tokenrecycling.jpg +0 -0
- package/AISB/image/022_aisb.t3.022_chainofreasoning.jpg +0 -0
- package/AISB/image/023_aisb.t3.023_guidedembed.jpg +0 -0
- package/AISB/image/024_aisb.t3.024_outputcentric.jpg +0 -0
- package/AISB/image/025_aisb.t3.025_deeper.jpg +0 -0
- package/AISB/image/026_aisb.t3.026_gartkg.jpg +0 -0
- package/AISB/image/027_aisb.t3.027_citeeval.jpg +0 -0
- package/AISB/image/028_aisb.t3.028_sbam.jpg +0 -0
- package/AISB/image/029_aisb.t3.029_cdqgeoembed.jpg +0 -0
- package/AISB/image/030_aisb.t3.030_processrm.jpg +0 -0
- package/AISB/image/031_aisb.t3.031_circuitstability.jpg +0 -0
- package/AISB/image/032_aisb.t3.032_ptsolver.jpg +0 -0
- package/AISB/image/033_aisb.t3.033_gcse.jpg +0 -0
- package/AISB/image/034_aisb.t3.034_ensemblewm.jpg +0 -0
- package/AISB/image/035_aisb.t3.035_moralvalueswa.jpg +0 -0
- package/AISB/image/036_aisb.t3.036_weakstrongpref.jpg +0 -0
- package/AISB/image/037_aisb.t3.037_dementiamask.jpg +0 -0
- package/AISB/image/038_aisb.t3.038_tinysam.jpg +0 -0
- package/AISB/image/039_aisb.t3.039_calf.jpg +0 -0
- package/AISB/image/040_aisb.t3.040_graniteguardian.jpg +0 -0
- package/AISB/image/041_aisb.t3.041_amdm.jpg +0 -0
- package/AISB/image/042_aisb.t3.042_xpatch.jpg +0 -0
- package/AISB/image/043_aisb.t3.043_vhm.jpg +0 -0
- package/AISB/image/044_aisb.t3.044_rgvi.jpg +0 -0
- package/AISB/image/045_aisb.t3.045_pslstm.jpg +0 -0
- package/AISB/image/046_aisb.t3.046_nonstatts.jpg +0 -0
- package/AISB/image/047_aisb.t3.047_timepfn.jpg +0 -0
- package/AISB/image/048_aisb.t3.048_proxyspex.jpg +0 -0
- package/AISB/image/049_aisb.t3.049_hogwildinference.jpg +0 -0
- package/AISB/image/050_aisb.t3.050_causalpfn.jpg +0 -0
- package/AISB/image/051_aisb.t3.051_flashtp.jpg +0 -0
- package/AISB/image/052_aisb.t3.052_nsdiff.jpg +0 -0
- package/AISB/image/053_aisb.t3.053_k2vae.jpg +0 -0
- package/AISB/image/054_aisb.t3.054_timebase.jpg +0 -0
- package/AISB/image/055_aisb.t3.055_csbrain.jpg +0 -0
- package/AISB/image/056_aisb.t3.056_infosam.jpg +0 -0
- package/AISB/image/057_aisb.t3.057_mdreid.jpg +0 -0
- package/AISB/image/058_aisb.t3.058_mindglitch.jpg +0 -0
- package/AISB/image/059_aisb.t3.059_selfsupervised.jpg +0 -0
- package/AISB/image/060_aisb.t3.060_iaggad.jpg +0 -0
- package/AISB/image/061_aisb.t3.061_hsgkn.jpg +0 -0
- package/AISB/image/062_aisb.t3.062_visionts.jpg +0 -0
- package/AISB/image/063_aisb.t3.063_tsrag.jpg +0 -0
- package/AISB/image/064_aisb.t3.064_pir.jpg +0 -0
- package/AISB/image/065_aisb.t3.065_proteinbinding.jpg +0 -0
- package/AISB/image/066_aisb.t3.066_tropicalattention.jpg +0 -0
- package/AISB/image/067_aisb.t3.067_kanad.jpg +0 -0
- package/AISB/image/068_aisb.t3.068_sempo.jpg +0 -0
- package/AISB/image/069_aisb.t3.069_treehfd.jpg +0 -0
- package/AISB/image/070_aisb.t3.070_certifiedunlearning.jpg +0 -0
- package/AISB/image/071_aisb.t3.071_neuralmjd.jpg +0 -0
- package/AISB/image/072_aisb.t3.072_fedgmt.jpg +0 -0
- package/AISB/image/073_aisb.t3.073_rld.jpg +0 -0
- package/AISB/image/074_aisb.t3.074_lsvi.jpg +0 -0
- package/AISB/image/075_aisb.t3.075_treeslicedentropy.jpg +0 -0
- package/AISB/image/076_aisb.t3.076_aanet.jpg +0 -0
- package/AISB/image/077_aisb.t3.077_cmnn.jpg +0 -0
- package/AISB/image/078_aisb.t3.078_conformalanomaly.jpg +0 -0
- package/AISB/image/079_aisb.t3.079_dpfkmeans.jpg +0 -0
- package/AISB/image/080_aisb.t3.080_latentscorereweight.jpg +0 -0
- package/AISB/image/081_aisb.t3.081_qmamba.jpg +0 -0
- package/AISB/image/082_aisb.t3.082_onlinellmrouting.jpg +0 -0
- package/AISB/image/083_aisb.t3.083_starformer.jpg +0 -0
- package/AISB/image/084_aisb.t3.084_ift.jpg +0 -0
- package/AISB/image/085_aisb.t3.085_neuralsurv.jpg +0 -0
- package/AISB/image/086_aisb.t3.086_stella.jpg +0 -0
- package/AISB/image/087_aisb.t3.087_moses.jpg +0 -0
- package/AISB/image/088_aisb.t3.088_channelnorm.jpg +0 -0
- package/AISB/image/089_aisb.t3.089_causalvelocity.jpg +0 -0
- package/AISB/image/090_aisb.t3.090_rstib.jpg +0 -0
- package/AISB/image/091_aisb.t3.091_timeawarecausal.jpg +0 -0
- package/AISB/image/092_aisb.t3.092_kmeanslocalopt.jpg +0 -0
- package/AISB/image/093_aisb.t3.093_fedwmsam.jpg +0 -0
- package/AISB/image/094_aisb.t3.094_boundre.jpg +0 -0
- package/AISB/image/095_aisb.t3.095_fastfeaturecp.jpg +0 -0
- package/AISB/image/096_aisb.t3.096_m3svm.jpg +0 -0
- package/AISB/image/097_aisb.t3.097_wassersteintl.jpg +0 -0
- package/AISB/image/098_aisb.t3.098_xmahalanobis.jpg +0 -0
- package/AISB/image/099_aisb.t3.099_ollalanding.jpg +0 -0
- package/AISB/image/100_aisb.t3.100_invmissingdata.jpg +0 -0
- package/AISB/image/101_aisb.t3.101_acia.jpg +0 -0
- package/AISB/image/102_aisb.t3.102_stochasticff.jpg +0 -0
- package/AISB/image/103_aisb.t3.103_qdcp.jpg +0 -0
- package/AISB/image/104_aisb.t3.104_balancedactiveinf.jpg +0 -0
- package/AISB/image/105_aisb.t3.105_binaryclasseval.jpg +0 -0
- package/AISB/image/106_aisb.t1.reasoning_lite.jpg +0 -0
- package/AISB/image/107_aisb.t2.paper_audit.jpg +0 -0
- package/AISB/image/108_aisb.t3.multi_gpu_search.jpg +0 -0
- package/AISB/image/109_aisb.t3.tdc_admet.jpg +0 -0
- package/AISB/image/aisb.b1.agentic_coding.svg +16 -0
- package/AISB/image/aisb.b10.climate_earth.svg +16 -0
- package/AISB/image/aisb.b11.model_efficiency.svg +16 -0
- package/AISB/image/aisb.b12.embodied_ai.svg +16 -0
- package/AISB/image/aisb.b2.agent_systems.svg +16 -0
- package/AISB/image/aisb.b3.self_evolving_rl.svg +16 -0
- package/AISB/image/aisb.b4.lm_reasoning.svg +16 -0
- package/AISB/image/aisb.b5.math_proof.svg +16 -0
- package/AISB/image/aisb.b6.research_process.svg +16 -0
- package/AISB/image/aisb.b7.multimodal_fusion.svg +16 -0
- package/AISB/image/aisb.b8.lifesci_drug.svg +16 -0
- package/AISB/image/aisb.b9.material_science.svg +16 -0
- package/README.md +196 -32
- package/bin/ds.js +924 -66
- package/docs/en/00_QUICK_START.md +195 -18
- package/docs/en/01_SETTINGS_REFERENCE.md +468 -96
- package/docs/en/02_START_RESEARCH_GUIDE.md +26 -5
- package/docs/en/03_QQ_CONNECTOR_GUIDE.md +14 -3
- package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +2 -0
- package/docs/en/05_TUI_GUIDE.md +171 -2
- package/docs/en/07_MEMORY_AND_MCP.md +38 -2
- package/docs/en/09_DOCTOR.md +78 -7
- package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +38 -1
- package/docs/en/11_LICENSE_AND_RISK.md +4 -0
- package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +15 -0
- package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
- package/docs/en/15_CODEX_PROVIDER_SETUP.md +624 -180
- package/docs/en/16_TELEGRAM_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/17_WHATSAPP_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/18_FEISHU_CONNECTOR_GUIDE.md +14 -0
- package/docs/en/21_LOCAL_MODEL_BACKENDS_GUIDE.md +386 -0
- package/docs/en/22_BENCHSTORE_YAML_REFERENCE.md +469 -0
- package/docs/en/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +316 -0
- package/docs/en/24_CLAUDE_CODE_PROVIDER_SETUP.md +469 -0
- package/docs/en/25_OPENCODE_PROVIDER_SETUP.md +653 -0
- package/docs/en/26_CITATION_AND_ATTRIBUTION.md +119 -0
- package/docs/en/27_KIMI_CODE_PROVIDER_SETUP.md +180 -0
- package/docs/en/28_DISCORD_CONNECTOR_GUIDE.md +61 -0
- package/docs/en/29_SLACK_CONNECTOR_GUIDE.md +60 -0
- package/docs/en/30_SETTINGS_CONTROL_CENTER_GUIDE.md +371 -0
- package/docs/en/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
- package/docs/en/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +273 -0
- package/docs/en/33_WORKSPACE_EXPLORER_QA.md +121 -0
- package/docs/en/91_DEVELOPMENT.md +266 -0
- package/docs/en/99_ACKNOWLEDGEMENTS.md +24 -19
- package/docs/en/README.md +48 -7
- package/docs/images/admin/admin-connectors-health-en.png +0 -0
- package/docs/images/admin/admin-controllers-en.png +0 -0
- package/docs/images/admin/admin-diagnostics-en.png +0 -0
- package/docs/images/admin/admin-errors-en.png +0 -0
- package/docs/images/admin/admin-issues-en.png +0 -0
- package/docs/images/admin/admin-logs-en.png +0 -0
- package/docs/images/admin/admin-quest-detail-en.png +0 -0
- package/docs/images/admin/admin-quests-en.png +0 -0
- package/docs/images/admin/admin-repairs-en.png +0 -0
- package/docs/images/admin/admin-runtime-en.png +0 -0
- package/docs/images/admin/admin-search-en.png +0 -0
- package/docs/images/admin/admin-stats-en.png +0 -0
- package/docs/images/admin/admin-summary-en.png +0 -0
- package/docs/images/connectors/connector-discord-en.png +0 -0
- package/docs/images/connectors/connector-feishu-en.png +0 -0
- package/docs/images/connectors/connector-lingzhu-en.png +0 -0
- package/docs/images/connectors/connector-qq-en.png +0 -0
- package/docs/images/connectors/connector-slack-en.png +0 -0
- package/docs/images/connectors/connector-telegram-en.png +0 -0
- package/docs/images/connectors/connector-weixin-en.png +0 -0
- package/docs/images/connectors/connector-whatsapp-en.png +0 -0
- package/docs/images/settings/settings-baselines-en.png +0 -0
- package/docs/images/settings/settings-config-en.png +0 -0
- package/docs/images/settings/settings-connectors-overview-en.png +0 -0
- package/docs/images/settings/settings-deepxiv-en.png +0 -0
- package/docs/images/settings/settings-mcp-servers-en.png +0 -0
- package/docs/images/settings/settings-plugins-en.png +0 -0
- package/docs/images/settings/settings-runners-en.png +0 -0
- package/docs/zh/00_QUICK_START.md +142 -18
- package/docs/zh/01_SETTINGS_REFERENCE.md +219 -98
- package/docs/zh/02_START_RESEARCH_GUIDE.md +26 -5
- package/docs/zh/05_TUI_GUIDE.md +171 -2
- package/docs/zh/07_MEMORY_AND_MCP.md +29 -2
- package/docs/zh/09_DOCTOR.md +54 -8
- package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +24 -1
- package/docs/zh/11_LICENSE_AND_RISK.md +4 -0
- package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +15 -0
- package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +9 -0
- package/docs/zh/15_CODEX_PROVIDER_SETUP.md +552 -181
- package/docs/zh/21_LOCAL_MODEL_BACKENDS_GUIDE.md +384 -0
- package/docs/zh/22_BENCHSTORE_YAML_REFERENCE.md +459 -0
- package/docs/zh/23_BENCHSTORE_GITHUB_RELEASES_SPEC.md +287 -0
- package/docs/zh/23_CLAUDE_RUNNER_GUIDE.md +103 -0
- package/docs/zh/24_CLAUDE_CODE_PROVIDER_SETUP.md +460 -0
- package/docs/zh/25_OPENCODE_PROVIDER_SETUP.md +660 -0
- package/docs/zh/26_CITATION_AND_ATTRIBUTION.md +102 -0
- package/docs/zh/27_KIMI_CODE_PROVIDER_SETUP.md +51 -0
- package/docs/zh/{19_LOCAL_BROWSER_AUTH.md → 31_LOCAL_BROWSER_AUTH.md} +1 -1
- package/docs/zh/32_WINDOWS_WSL2_DEPLOYMENT_GUIDE.md +264 -0
- package/docs/zh/33_WORKSPACE_EXPLORER_QA.md +127 -0
- package/docs/zh/99_ACKNOWLEDGEMENTS.md +23 -19
- package/docs/zh/README.md +33 -7
- package/install.sh +168 -20
- package/package.json +5 -1
- package/pyproject.toml +2 -1
- package/src/deepscientist/__init__.py +1 -1
- package/src/deepscientist/acp/envelope.py +13 -0
- package/src/deepscientist/admin/__init__.py +3 -0
- package/src/deepscientist/admin/charts.py +681 -0
- package/src/deepscientist/admin/logs.py +119 -0
- package/src/deepscientist/admin/repairs.py +217 -0
- package/src/deepscientist/admin/service.py +1310 -0
- package/src/deepscientist/admin/system_info.py +700 -0
- package/src/deepscientist/admin/tasks.py +465 -0
- package/src/deepscientist/admin/tool_metrics.py +600 -0
- package/src/deepscientist/artifact/guidance.py +8 -4
- package/src/deepscientist/artifact/schemas.py +115 -0
- package/src/deepscientist/artifact/service.py +4268 -260
- package/src/deepscientist/bash_exec/monitor.py +30 -3
- package/src/deepscientist/bash_exec/service.py +134 -1
- package/src/deepscientist/benchstore/__init__.py +4 -0
- package/src/deepscientist/benchstore/prompt_builder.py +224 -0
- package/src/deepscientist/benchstore/service.py +1716 -0
- package/src/deepscientist/bridges/connectors.py +8 -2
- package/src/deepscientist/channels/weixin_ilink.py +8 -1
- package/src/deepscientist/cli.py +92 -17
- package/src/deepscientist/codex_cli_compat.py +187 -74
- package/src/deepscientist/config/models.py +82 -11
- package/src/deepscientist/config/service.py +1077 -93
- package/src/deepscientist/connector/weixin_support.py +48 -17
- package/src/deepscientist/daemon/api/handlers.py +827 -235
- package/src/deepscientist/daemon/api/router.py +81 -1
- package/src/deepscientist/daemon/app.py +1512 -85
- package/src/deepscientist/diagnostics/__init__.py +6 -0
- package/src/deepscientist/diagnostics/runner_failures.py +277 -0
- package/src/deepscientist/doctor.py +407 -56
- package/src/deepscientist/evidence_packets.py +590 -0
- package/src/deepscientist/home.py +52 -4
- package/src/deepscientist/kimi_cli_compat.py +50 -0
- package/src/deepscientist/latex_runtime.py +2 -2
- package/src/deepscientist/mcp/context.py +2 -0
- package/src/deepscientist/mcp/schemas.py +114 -0
- package/src/deepscientist/mcp/server.py +1566 -126
- package/src/deepscientist/memory/service.py +203 -16
- package/src/deepscientist/process_control.py +8 -1
- package/src/deepscientist/prompts/builder.py +850 -88
- package/src/deepscientist/quest/__init__.py +2 -2
- package/src/deepscientist/quest/layout.py +12 -1
- package/src/deepscientist/quest/node_traces.py +10 -0
- package/src/deepscientist/quest/service.py +1852 -161
- package/src/deepscientist/quest/stage_views.py +1 -1
- package/src/deepscientist/runners/__init__.py +18 -0
- package/src/deepscientist/runners/base.py +89 -1
- package/src/deepscientist/runners/builtins.py +13 -1
- package/src/deepscientist/runners/claude.py +391 -0
- package/src/deepscientist/runners/codex.py +480 -35
- package/src/deepscientist/runners/codex_telemetry.py +127 -0
- package/src/deepscientist/runners/kimi.py +334 -0
- package/src/deepscientist/runners/metadata.py +68 -0
- package/src/deepscientist/runners/opencode.py +414 -0
- package/src/deepscientist/runners/runtime_overrides.py +100 -0
- package/src/deepscientist/runners/simple_cli.py +538 -0
- package/src/deepscientist/runtime_storage.py +303 -0
- package/src/deepscientist/shared.py +80 -16
- package/src/deepscientist/skills/installer.py +37 -0
- package/src/deepscientist/skills/registry.py +2 -0
- package/src/deepscientist/tinytex.py +2 -2
- package/src/deepscientist/tui.py +10 -3
- package/src/prompts/benchstore/system.md +77 -0
- package/src/prompts/connectors/qq.md +33 -2
- package/src/prompts/connectors/weixin.md +208 -23
- package/src/prompts/contracts/admin_ops.md +74 -0
- package/src/prompts/contracts/admin_ops_knowledge.md +138 -0
- package/src/prompts/contracts/shared_interaction.md +5 -10
- package/src/prompts/start_setup/system.md +422 -0
- package/src/prompts/system.md +411 -304
- package/src/prompts/system_copilot.md +89 -0
- package/src/skills/analysis-campaign/SKILL.md +239 -578
- package/src/skills/analysis-campaign/references/artifact-flow-examples.md +102 -0
- package/src/skills/analysis-campaign/references/boundary-cases.md +98 -0
- package/src/skills/analysis-campaign/references/campaign-checklist-template.md +39 -24
- package/src/skills/analysis-campaign/references/campaign-design.md +26 -10
- package/src/skills/analysis-campaign/references/campaign-plan-template.md +53 -54
- package/src/skills/analysis-campaign/references/operational-guidance.md +97 -0
- package/src/skills/analysis-campaign/references/writing-facing-slice-examples.md +10 -20
- package/src/skills/baseline/SKILL.md +183 -461
- package/src/skills/baseline/references/artifact-flow-examples.md +106 -0
- package/src/skills/baseline/references/artifact-payload-examples.md +1 -1
- package/src/skills/baseline/references/baseline-checklist-template.md +27 -35
- package/src/skills/baseline/references/baseline-plan-template.md +37 -76
- package/src/skills/baseline/references/boundary-cases.md +86 -0
- package/src/skills/baseline/references/codebase-audit-checklist.md +2 -6
- package/src/skills/baseline/references/comparability-contract.md +7 -12
- package/src/skills/baseline/references/operational-guidance.md +56 -0
- package/src/skills/baseline/references/route-selection.md +5 -25
- package/src/skills/decision/SKILL.md +113 -306
- package/src/skills/decision/references/checkpoint-memory-template.md +47 -0
- package/src/skills/decision/references/operational-guidance.md +94 -0
- package/src/skills/decision/references/research-route-criteria.md +7 -8
- package/src/skills/decision/references/strategic-decision-template.md +13 -26
- package/src/skills/experiment/SKILL.md +132 -670
- package/src/skills/experiment/references/execution-playbook.md +374 -0
- package/src/skills/experiment/references/main-experiment-checklist-template.md +26 -2
- package/src/skills/experiment/references/main-experiment-plan-template.md +28 -17
- package/src/skills/experiment/references/operational-guidance.md +108 -0
- package/src/skills/finalize/SKILL.md +62 -0
- package/src/skills/finalize/references/checkpoint-memory-template.md +49 -0
- package/src/skills/finalize/references/resume-packet-template.md +7 -0
- package/src/skills/idea/SKILL.md +228 -15
- package/src/skills/idea/references/controlled-brainstorming-playbook.md +78 -0
- package/src/skills/idea/references/current-board-packet-template.md +61 -0
- package/src/skills/idea/references/high-value-idea-sourcing.md +119 -0
- package/src/skills/idea/references/idea-generation-playbook.md +21 -0
- package/src/skills/idea/references/idea-thinking-flow.md +6 -0
- package/src/skills/idea/references/literature-survey-template.md +3 -0
- package/src/skills/idea/references/objective-contract-template.md +54 -0
- package/src/skills/idea/references/outline-seeding-example.md +56 -0
- package/src/skills/idea/references/pre-idea-draft-template.md +105 -0
- package/src/skills/idea/references/related-work-playbook.md +75 -2
- package/src/skills/idea/references/research-history-playbook.md +114 -0
- package/src/skills/idea/references/selection-gate.md +58 -6
- package/src/skills/intake-audit/SKILL.md +43 -2
- package/src/skills/intake-audit/references/state-audit-template.md +10 -0
- package/src/skills/nature-data/SKILL.md +128 -0
- package/src/skills/nature-data/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-data/agents/openai.yaml +4 -0
- package/src/skills/nature-data/references/chinese-author-alignment.md +84 -0
- package/src/skills/nature-data/references/fair-metadata-checklist.md +105 -0
- package/src/skills/nature-data/references/policy-principles.md +103 -0
- package/src/skills/nature-data/references/repository-and-identifiers.md +96 -0
- package/src/skills/nature-data/references/source-basis.md +54 -0
- package/src/skills/nature-data/references/statement-patterns.md +153 -0
- package/src/skills/nature-figure/SKILL.md +197 -0
- package/src/skills/nature-figure/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-figure/agents/openai.yaml +4 -0
- package/src/skills/nature-figure/evals/evals.json +37 -0
- package/src/skills/nature-figure/references/api.md +428 -0
- package/src/skills/nature-figure/references/backend-selection.md +100 -0
- package/src/skills/nature-figure/references/chart-types.md +281 -0
- package/src/skills/nature-figure/references/common-patterns.md +349 -0
- package/src/skills/nature-figure/references/design-theory.md +436 -0
- package/src/skills/nature-figure/references/figure-contract.md +93 -0
- package/src/skills/nature-figure/references/nature-2026-observations.md +112 -0
- package/src/skills/nature-figure/references/qa-contract.md +119 -0
- package/src/skills/nature-figure/references/r-template-index.md +66 -0
- package/src/skills/nature-figure/references/r-workflow.md +161 -0
- package/src/skills/nature-figure/references/tutorials.md +250 -0
- package/src/skills/nature-paper2ppt/SKILL.md +507 -0
- package/src/skills/nature-paper2ppt/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-paper2ppt/agents/openai.yaml +4 -0
- package/src/skills/nature-polishing/SKILL.md +385 -0
- package/src/skills/nature-polishing/UPSTREAM_LICENSE.txt +21 -0
- package/src/skills/nature-polishing/agents/openai.yaml +4 -0
- package/src/skills/nature-polishing/references/phrasebank-playbook.md +162 -0
- package/src/skills/nature-polishing/references/section-moves.md +240 -0
- package/src/skills/nature-polishing/references/style-guardrails.md +94 -0
- package/src/skills/nature-polishing/references/writing-strategy.md +148 -0
- package/src/skills/optimize/SKILL.md +177 -1568
- package/src/skills/optimize/references/brief-shaping-playbook.md +95 -0
- package/src/skills/optimize/references/candidate-board-template.md +13 -0
- package/src/skills/optimize/references/candidate-ranking-template.md +51 -0
- package/src/skills/optimize/references/codegen-route-playbook.md +50 -0
- package/src/skills/optimize/references/debug-response-template.md +29 -0
- package/src/skills/optimize/references/frontier-review-template.md +32 -0
- package/src/skills/optimize/references/fusion-playbook.md +36 -0
- package/src/skills/optimize/references/method-brief-template.md +73 -0
- package/src/skills/optimize/references/operational-guidance.md +621 -0
- package/src/skills/optimize/references/optimization-memory-template.md +30 -0
- package/src/skills/optimize/references/optimize-checklist-template.md +18 -0
- package/src/skills/optimize/references/plateau-response-playbook.md +28 -0
- package/src/skills/optimize/references/prompt-patterns.md +49 -0
- package/src/skills/paper-outline/SKILL.md +227 -0
- package/src/skills/paper-outline/references/outline-patterns.md +87 -0
- package/src/skills/paper-plot/SKILL.md +79 -0
- package/src/skills/paper-plot/agents/openai.yaml +4 -0
- package/src/skills/paper-plot/references/bar_grouped_hatch.md +96 -0
- package/src/skills/paper-plot/references/bar_paired_delta.md +72 -0
- package/src/skills/paper-plot/references/line_confidence_band.md +75 -0
- package/src/skills/paper-plot/references/line_loss_with_inset.md +65 -0
- package/src/skills/paper-plot/references/line_training_curve.md +44 -0
- package/src/skills/paper-plot/references/radar_dual_series.md +59 -0
- package/src/skills/paper-plot/references/scatter_broken_axis.md +59 -0
- package/src/skills/paper-plot/references/scatter_tsne_cluster.md +72 -0
- package/src/skills/paper-plot/scripts/bar_memevolve.py +109 -0
- package/src/skills/paper-plot/scripts/bar_spice.py +166 -0
- package/src/skills/paper-plot/scripts/line_aime.py +94 -0
- package/src/skills/paper-plot/scripts/line_loss_inset.py +157 -0
- package/src/skills/paper-plot/scripts/line_selfdistill.py +168 -0
- package/src/skills/paper-plot/scripts/radar_dora.py +151 -0
- package/src/skills/paper-plot/scripts/scatter_break.py +169 -0
- package/src/skills/paper-plot/scripts/scatter_tsne.py +133 -0
- package/src/skills/rebuttal/SKILL.md +9 -0
- package/src/skills/references/tool-usage-by-stage.md +438 -0
- package/src/skills/review/SKILL.md +105 -7
- package/src/skills/science/PROVENANCE.md +44 -0
- package/src/skills/science/SKILL.md +137 -0
- package/src/skills/science/references/artifact-science-tool.md +110 -0
- package/src/skills/science/references/claim-type-discipline.md +56 -0
- package/src/skills/science/references/domain-index.md +422 -0
- package/src/skills/science/references/hpc-via-bash-exec.md +42 -0
- package/src/skills/science/references/package-check-playbook.md +64 -0
- package/src/skills/science/references/package-index.min.json +3616 -0
- package/src/skills/science/references/packages/abinit.md +80 -0
- package/src/skills/science/references/packages/acts.md +73 -0
- package/src/skills/science/references/packages/aiida-core.md +80 -0
- package/src/skills/science/references/packages/alamode.md +80 -0
- package/src/skills/science/references/packages/amuse.md +88 -0
- package/src/skills/science/references/packages/anndata.md +88 -0
- package/src/skills/science/references/packages/arbor.md +80 -0
- package/src/skills/science/references/packages/arc.md +73 -0
- package/src/skills/science/references/packages/astropy.md +88 -0
- package/src/skills/science/references/packages/astroquery.md +88 -0
- package/src/skills/science/references/packages/atomate2.md +80 -0
- package/src/skills/science/references/packages/atomsmltr.md +73 -0
- package/src/skills/science/references/packages/awkward.md +73 -0
- package/src/skills/science/references/packages/batman.md +88 -0
- package/src/skills/science/references/packages/biopython.md +88 -0
- package/src/skills/science/references/packages/bloqade.md +73 -0
- package/src/skills/science/references/packages/brian2.md +73 -0
- package/src/skills/science/references/packages/bullet3.md +73 -0
- package/src/skills/science/references/packages/calculix.md +80 -0
- package/src/skills/science/references/packages/cantera.md +73 -0
- package/src/skills/science/references/packages/cavity-md-ipi.md +80 -0
- package/src/skills/science/references/packages/ccdproc.md +88 -0
- package/src/skills/science/references/packages/celerite2.md +88 -0
- package/src/skills/science/references/packages/cellrank.md +73 -0
- package/src/skills/science/references/packages/cesm.md +80 -0
- package/src/skills/science/references/packages/chemicals.md +73 -0
- package/src/skills/science/references/packages/chempy.md +73 -0
- package/src/skills/science/references/packages/cirq.md +73 -0
- package/src/skills/science/references/packages/coffea.md +73 -0
- package/src/skills/science/references/packages/cp2k.md +88 -0
- package/src/skills/science/references/packages/custodian.md +80 -0
- package/src/skills/science/references/packages/dart.md +73 -0
- package/src/skills/science/references/packages/datamol.md +88 -0
- package/src/skills/science/references/packages/dd4hep.md +73 -0
- package/src/skills/science/references/packages/dealii.md +80 -0
- package/src/skills/science/references/packages/deepchem.md +88 -0
- package/src/skills/science/references/packages/delphes.md +73 -0
- package/src/skills/science/references/packages/devito.md +80 -0
- package/src/skills/science/references/packages/dftb.md +88 -0
- package/src/skills/science/references/packages/dftd4.md +88 -0
- package/src/skills/science/references/packages/dftk-jl.md +80 -0
- package/src/skills/science/references/packages/dolfinx.md +80 -0
- package/src/skills/science/references/packages/drake.md +73 -0
- package/src/skills/science/references/packages/dumux.md +73 -0
- package/src/skills/science/references/packages/elk.md +80 -0
- package/src/skills/science/references/packages/elmerfem.md +80 -0
- package/src/skills/science/references/packages/enzo-e.md +88 -0
- package/src/skills/science/references/packages/espresso.md +80 -0
- package/src/skills/science/references/packages/exoplanet.md +88 -0
- package/src/skills/science/references/packages/fairroot.md +73 -0
- package/src/skills/science/references/packages/fbpic.md +80 -0
- package/src/skills/science/references/packages/fdtdbath-meep.md +80 -0
- package/src/skills/science/references/packages/geant4.md +73 -0
- package/src/skills/science/references/packages/geosx.md +80 -0
- package/src/skills/science/references/packages/gprmax.md +80 -0
- package/src/skills/science/references/packages/gromacs.md +80 -0
- package/src/skills/science/references/packages/gwaslab.md +73 -0
- package/src/skills/science/references/packages/gz-sim.md +73 -0
- package/src/skills/science/references/packages/hail.md +88 -0
- package/src/skills/science/references/packages/hiphive.md +80 -0
- package/src/skills/science/references/packages/hoomd-blue.md +80 -0
- package/src/skills/science/references/packages/itensor.md +73 -0
- package/src/skills/science/references/packages/itensors-jl.md +73 -0
- package/src/skills/science/references/packages/jdftx.md +73 -0
- package/src/skills/science/references/packages/jobflow.md +80 -0
- package/src/skills/science/references/packages/kadanoffbaym-jl.md +73 -0
- package/src/skills/science/references/packages/kite.md +80 -0
- package/src/skills/science/references/packages/kratos.md +80 -0
- package/src/skills/science/references/packages/kwant.md +73 -0
- package/src/skills/science/references/packages/lammps.md +80 -0
- package/src/skills/science/references/packages/lightkurve.md +88 -0
- package/src/skills/science/references/packages/limix.md +73 -0
- package/src/skills/science/references/packages/maxwelllink.md +80 -0
- package/src/skills/science/references/packages/mcdc.md +73 -0
- package/src/skills/science/references/packages/meep.md +80 -0
- package/src/skills/science/references/packages/mfem.md +80 -0
- package/src/skills/science/references/packages/mitgcm.md +73 -0
- package/src/skills/science/references/packages/modflow6.md +73 -0
- package/src/skills/science/references/packages/molecool.md +73 -0
- package/src/skills/science/references/packages/mom6.md +73 -0
- package/src/skills/science/references/packages/moose.md +80 -0
- package/src/skills/science/references/packages/mpas-model.md +73 -0
- package/src/skills/science/references/packages/mujoco.md +73 -0
- package/src/skills/science/references/packages/mumax3.md +73 -0
- package/src/skills/science/references/packages/nekrs.md +80 -0
- package/src/skills/science/references/packages/nessi.md +73 -0
- package/src/skills/science/references/packages/nest-simulator.md +73 -0
- package/src/skills/science/references/packages/netket.md +73 -0
- package/src/skills/science/references/packages/neuron.md +73 -0
- package/src/skills/science/references/packages/nextflow.md +88 -0
- package/src/skills/science/references/packages/nwchem.md +88 -0
- package/src/skills/science/references/packages/openbabel.md +88 -0
- package/src/skills/science/references/packages/openems.md +80 -0
- package/src/skills/science/references/packages/openff-toolkit.md +88 -0
- package/src/skills/science/references/packages/openfoam-dev.md +80 -0
- package/src/skills/science/references/packages/openmc.md +73 -0
- package/src/skills/science/references/packages/openmm.md +80 -0
- package/src/skills/science/references/packages/openmoc.md +73 -0
- package/src/skills/science/references/packages/openmx.md +80 -0
- package/src/skills/science/references/packages/opensees.md +80 -0
- package/src/skills/science/references/packages/opensn.md +80 -0
- package/src/skills/science/references/packages/opm-simulators.md +73 -0
- package/src/skills/science/references/packages/oqupy.md +73 -0
- package/src/skills/science/references/packages/packmol.md +80 -0
- package/src/skills/science/references/packages/palabos.md +80 -0
- package/src/skills/science/references/packages/parflow.md +80 -0
- package/src/skills/science/references/packages/pennylane.md +88 -0
- package/src/skills/science/references/packages/perceval.md +73 -0
- package/src/skills/science/references/packages/phono3py.md +73 -0
- package/src/skills/science/references/packages/phonopy.md +73 -0
- package/src/skills/science/references/packages/photutils.md +88 -0
- package/src/skills/science/references/packages/picongpu.md +80 -0
- package/src/skills/science/references/packages/plink-ng.md +88 -0
- package/src/skills/science/references/packages/precice.md +73 -0
- package/src/skills/science/references/packages/psc.md +80 -0
- package/src/skills/science/references/packages/psi4.md +88 -0
- package/src/skills/science/references/packages/pybinding.md +73 -0
- package/src/skills/science/references/packages/pyfr.md +80 -0
- package/src/skills/science/references/packages/pyhf.md +73 -0
- package/src/skills/science/references/packages/pyiron_base.md +80 -0
- package/src/skills/science/references/packages/pylcp.md +73 -0
- package/src/skills/science/references/packages/pylith.md +80 -0
- package/src/skills/science/references/packages/pynbody.md +88 -0
- package/src/skills/science/references/packages/pysam.md +88 -0
- package/src/skills/science/references/packages/pyscf.md +88 -0
- package/src/skills/science/references/packages/q-e.md +73 -0
- package/src/skills/science/references/packages/qibo.md +73 -0
- package/src/skills/science/references/packages/qiskit.md +73 -0
- package/src/skills/science/references/packages/quantica-jl.md +73 -0
- package/src/skills/science/references/packages/quantumoptics-jl.md +73 -0
- package/src/skills/science/references/packages/quimb.md +73 -0
- package/src/skills/science/references/packages/qulacs.md +73 -0
- package/src/skills/science/references/packages/qutip.md +73 -0
- package/src/skills/science/references/packages/rdkit.md +88 -0
- package/src/skills/science/references/packages/rmg-py.md +73 -0
- package/src/skills/science/references/packages/root.md +73 -0
- package/src/skills/science/references/packages/scanpy.md +88 -0
- package/src/skills/science/references/packages/scikit-allel.md +88 -0
- package/src/skills/science/references/packages/scikit-bio.md +88 -0
- package/src/skills/science/references/packages/scqubits.md +73 -0
- package/src/skills/science/references/packages/scuff-em.md +80 -0
- package/src/skills/science/references/packages/scvi-tools.md +73 -0
- package/src/skills/science/references/packages/seissol.md +73 -0
- package/src/skills/science/references/packages/sfepy.md +80 -0
- package/src/skills/science/references/packages/sisl.md +73 -0
- package/src/skills/science/references/packages/smilei.md +80 -0
- package/src/skills/science/references/packages/snakemake.md +88 -0
- package/src/skills/science/references/packages/specfem3d-globe.md +80 -0
- package/src/skills/science/references/packages/specutils.md +88 -0
- package/src/skills/science/references/packages/spglib.md +80 -0
- package/src/skills/science/references/packages/squidpy.md +88 -0
- package/src/skills/science/references/packages/starry.md +88 -0
- package/src/skills/science/references/packages/strawberryfields.md +73 -0
- package/src/skills/science/references/packages/su2.md +80 -0
- package/src/skills/science/references/packages/sunny-jl.md +73 -0
- package/src/skills/science/references/packages/sw4.md +73 -0
- package/src/skills/science/references/packages/swift.md +88 -0
- package/src/skills/science/references/packages/tdnegf.md +73 -0
- package/src/skills/science/references/packages/tenpy.md +73 -0
- package/src/skills/science/references/packages/thermo.md +73 -0
- package/src/skills/science/references/packages/tkwant.md +73 -0
- package/src/skills/science/references/packages/tvb-root.md +73 -0
- package/src/skills/science/references/packages/uproot5.md +73 -0
- package/src/skills/science/references/packages/vampire.md +80 -0
- package/src/skills/science/references/packages/wannier_tools.md +73 -0
- package/src/skills/science/references/packages/warpx.md +80 -0
- package/src/skills/science/references/packages/wrf.md +73 -0
- package/src/skills/science/references/packages/xtb.md +88 -0
- package/src/skills/science/references/packages/yt.md +73 -0
- package/src/skills/science/references/science-task-brief-template.md +71 -0
- package/src/skills/scout/SKILL.md +83 -425
- package/src/skills/scout/references/literature-scout-template.md +5 -24
- package/src/skills/scout/references/operational-guidance.md +191 -0
- package/src/skills/scout/references/paper-triage-playbook.md +11 -35
- package/src/skills/write/SKILL.md +744 -1246
- package/src/skills/write/references/experiments_analysis_patterns.md +129 -0
- package/src/skills/write/references/oral_package_patterns.md +252 -0
- package/src/skills/write/references/oral_writing_principles.md +291 -0
- package/src/skills/write/references/section_rewrite_checklist.md +234 -0
- package/src/tui/dist/app/AppContainer.js +1314 -27
- package/src/tui/dist/components/Composer.js +26 -1
- package/src/tui/dist/components/ConfigScreen.js +2 -1
- package/src/tui/dist/components/InputPrompt.js +25 -9
- package/src/tui/dist/components/MainContent.js +18 -3
- package/src/tui/dist/components/QuestScreen.js +3 -2
- package/src/tui/dist/components/UtilityScreen.js +37 -0
- package/src/tui/dist/hooks/useSafeInput.js +10 -0
- package/src/tui/dist/index.js +13 -1
- package/src/tui/dist/layouts/DefaultAppLayout.js +11 -8
- package/src/tui/dist/lib/api.js +89 -1
- package/src/tui/package.json +1 -1
- package/src/ui/dist/assets/{AnalysisPlugin-DnSm0GZn.js → AnalysisPlugin-CA94NGmI.js} +1 -1
- package/src/ui/dist/assets/CliPlugin-DHBzphZU.js +79 -0
- package/src/ui/dist/assets/CodeEditorPlugin-BOFwD2rn.js +2 -0
- package/src/ui/dist/assets/{CodeViewerPlugin-itb0tltR.js → CodeViewerPlugin-CqDpgjik.js} +4 -4
- package/src/ui/dist/assets/{DocViewerPlugin-DqKkiCI6.js → DocViewerPlugin-UDBgt8-4.js} +3 -3
- package/src/ui/dist/assets/GitCommitViewerPlugin-BmHtZ0bZ.js +6 -0
- package/src/ui/dist/assets/{GitDiffViewerPlugin-DxL2ezFG.js → GitDiffViewerPlugin-CAxjNorQ.js} +2 -2
- package/src/ui/dist/assets/{GitSnapshotViewer-B_RQm1YZ.js → GitSnapshotViewer-CweA6VON.js} +2 -2
- package/src/ui/dist/assets/{ImageViewerPlugin-tHqlXY3n.js → ImageViewerPlugin-C8wHGvGN.js} +5 -5
- package/src/ui/dist/assets/LabPlugin-COyyLUol.js +32 -0
- package/src/ui/dist/assets/{LatexPlugin-B495DTXC.js → LatexPlugin-BQjAaA5J.js} +4 -4
- package/src/ui/dist/assets/{MarkdownViewerPlugin-DG28-61B.js → MarkdownViewerPlugin-Dy1NE2dI.js} +3 -3
- package/src/ui/dist/assets/{MarketplacePlugin-BiOGT-Kj.js → MarketplacePlugin-DMIZtEJ2.js} +2 -2
- package/src/ui/dist/assets/NotebookEditor-CFHMq_Qt.js +91 -0
- package/src/ui/dist/assets/{NotebookEditor-CVsj8h_T.js → NotebookEditor-WFyd8Ybt.js} +23 -23
- package/src/ui/dist/assets/{PdfLoader-CASDQmxJ.js → PdfLoader-CLE5u5TS.js} +3 -3
- package/src/ui/dist/assets/{PdfMarkdownPlugin-BFhwoKsY.js → PdfMarkdownPlugin-_iNK_H83.js} +1 -1
- package/src/ui/dist/assets/PdfViewerPlugin-DgWsbInT.js +22 -0
- package/src/ui/dist/assets/SearchPlugin-DrZmn5iw.js +11 -0
- package/src/ui/dist/assets/{TextViewerPlugin-CB4DYfWO.js → TextViewerPlugin-D1-T3aC7.js} +4 -4
- package/src/ui/dist/assets/branding/runner-claude.svg +107 -0
- package/src/ui/dist/assets/branding/runner-codex.svg +10 -0
- package/src/ui/dist/assets/branding/runner-kimi.svg +14 -0
- package/src/ui/dist/assets/branding/runner-opencode.svg +7 -0
- package/src/ui/dist/assets/cli-store-CoZ-x5Ip.js +1 -0
- package/src/ui/dist/assets/{code-DLC6G24T.js → code-DbsmSd3Y.js} +1 -1
- package/src/ui/dist/assets/file-diff-panel-DsvyRz47.js +1 -0
- package/src/ui/dist/assets/{wrap-text-CwMn-iqb.js → file-jump-queue-DeQBikaw.js} +3 -3
- package/src/ui/dist/assets/{file-socket-Cu4Qln7Y.js → file-socket-DA5XIx88.js} +1 -1
- package/src/ui/dist/assets/fonts/ds-fonts.css +50 -4
- package/src/ui/dist/assets/images/deepxiv/register-guide.png +0 -0
- package/src/ui/dist/assets/index-39vY9LmZ.js +1 -0
- package/src/ui/dist/assets/{index-wQ7RIIRd.js → index-BsO46tJA.js} +1 -1
- package/src/ui/dist/assets/index-CHzJ2xtB.js +3530 -0
- package/src/ui/dist/assets/index-DH-zxoZ3.css +33 -0
- package/src/ui/dist/assets/{plugin-notebook-HbW2K-1c.js → plugin-notebook-JRhysCqj.js} +2 -2
- package/src/ui/dist/assets/{project-sync-CsX08Qno.js → project-sync-DPmWKmKD.js} +1 -1
- package/src/ui/dist/assets/{zoom-out-R-GWEhzS.js → zoom-out-DAukFWen.js} +3 -3
- package/src/ui/dist/index.html +3 -3
- package/src/skills/analysis-campaign/references/artifact-orchestration.md +0 -58
- package/src/skills/baseline/references/memory-playbook.md +0 -40
- package/src/skills/baseline/references/publishable-baseline-package.md +0 -30
- package/src/skills/write/references/outline-evidence-contract-example.md +0 -107
- package/src/skills/write/references/paper-experiment-matrix-template.md +0 -131
- package/src/skills/write/references/paper-section-playbook.md +0 -64
- package/src/skills/write/references/reviewer-first-writing.md +0 -64
- package/src/skills/write/references/revision-checklist.md +0 -70
- package/src/skills/write/references/section-contracts.md +0 -82
- package/src/skills/write/references/sentence-level-proofing.md +0 -49
- package/src/ui/dist/assets/AiManusChatView-COFACy7V.js +0 -204
- package/src/ui/dist/assets/CliPlugin-CvwCmDQ5.js +0 -109
- package/src/ui/dist/assets/CodeEditorPlugin-cOqSa0xq.js +0 -2
- package/src/ui/dist/assets/GitCommitViewerPlugin-DVgNHBCS.js +0 -1
- package/src/ui/dist/assets/LabCopilotPanel-ClMbq5Yu.js +0 -14
- package/src/ui/dist/assets/LabPlugin-L_SuE8ow.js +0 -22
- package/src/ui/dist/assets/NotebookEditor-C-4Kt1p9.js +0 -81
- package/src/ui/dist/assets/PdfViewerPlugin-DcOzU9vd.js +0 -17
- package/src/ui/dist/assets/SearchPlugin-CHj7M58O.js +0 -16
- package/src/ui/dist/assets/VNCViewer-CjlbyCB3.js +0 -11
- package/src/ui/dist/assets/bot-CFkZY-JP.js +0 -6
- package/src/ui/dist/assets/chevron-up-Dq5ofbht.js +0 -6
- package/src/ui/dist/assets/file-content-Dv4LoZec.js +0 -1
- package/src/ui/dist/assets/file-diff-panel-Denq-lC3.js +0 -1
- package/src/ui/dist/assets/file-jump-queue-DA-SdG__.js +0 -1
- package/src/ui/dist/assets/git-commit-horizontal-BUh6G52n.js +0 -6
- package/src/ui/dist/assets/image-B9HUUddG.js +0 -6
- package/src/ui/dist/assets/index-B2B1sg-M.js +0 -1
- package/src/ui/dist/assets/index-Cgla8biy.css +0 -33
- package/src/ui/dist/assets/index-DRyx7vAc.js +0 -1
- package/src/ui/dist/assets/index-Gbl53BNp.js +0 -2496
- package/src/ui/dist/assets/pdf-effect-queue-ZtnHFCAi.js +0 -6
- package/src/ui/dist/assets/popover-DL6h35vr.js +0 -1
- package/src/ui/dist/assets/select-DvmXt1yY.js +0 -11
- package/src/ui/dist/assets/sigma-7jpXazui.js +0 -6
- package/src/ui/dist/assets/trash-xA7kFt8i.js +0 -11
- package/src/ui/dist/assets/useCliAccess-DsMwDjOp.js +0 -1
- package/src/ui/dist/assets/useFileDiffOverlay-FuhcnKiw.js +0 -1
@@ -6,634 +6,295 @@ skill_role: stage
 
 # Analysis Campaign
 
-Use this skill when
+Use this skill when follow-up evidence is needed after a durable result.
+The goal is to answer a bounded, resource-aware evidence question, not to keep opening more slices just because they are imaginable.
 
-
-Use the same route for:
+## Match signals
 
--
-- review-driven evidence gaps
-- rebuttal-driven extra experiments
-- writing-driven evidence gaps
+Use `analysis-campaign` when:
 
-
+- a durable main result already exists and follow-up evidence is needed
+- the quest needs ablations, robustness checks, sensitivity checks, failure analysis, error analysis, efficiency or cost checks, or limitation-boundary checks
+- writing, review, or rebuttal pressure exposed an evidence gap that should be answered by bounded follow-up slices
 
-
-- also ablations, sensitivity checks, robustness checks, efficiency or cost checks, highlight-validation runs, and limitation-boundary work beyond the main result
+Do not use `analysis-campaign` when:
 
-
-
-
+- the quest still lacks a credible main run or accepted baseline and the proposed work depends on that missing reference
+- the next step is obviously another main experiment rather than follow-up evidence work
+- the proposed slice does not connect to a parent claim, parent result, paper gap, reviewer item, or route decision
+
+## One-sentence summary
+
+Answer the smallest evidence question that changes, confirms, or blocks a parent claim, then stop when the next route is clear.
+
+## Control workflow
+
+1. Lock the parent object, evidence question, comparison target, and stop condition.
+Make explicit what claim, failure mode, or route decision is actually being tested.
+2. Audit the real execution envelope before designing the slice set.
+Make explicit the current device and runtime limits: available GPU or CPU class, memory, wall-clock budget, storage, concurrency, required dependencies, and any queue or service constraints that materially limit what can run now.
+3. Choose the lightest analysis route and the smallest slice set that can answer the question within that envelope.
+Prefer slices with the highest soundness gain per unit of compute, time, or engineering effort.
+Run claim-critical slices first and mark infeasible slices explicitly instead of quietly keeping them in scope.
+4. Keep slices isolated and comparable.
+Record exactly what changed, what stayed fixed, and whether apples-to-apples comparison still holds.
+5. Record slice-level evidence before making any campaign-level claim.
+Every meaningful slice should leave a durable outcome and a claim update.
+6. Aggregate only the decision-relevant findings and route the next step.
+End in continue, write, experiment, idea, decision, blocker, or stop.
+
+## Paper-facing analysis quantity reminder
+
+For manuscript-support campaigns, first audit `artifact.get_paper_contract(detail='full')` and, when a draft exists, `artifact.validate_manuscript_coverage(detail='full')`.
+
+- A mature empirical manuscript usually needs 5-10 ready paper-facing experiment/analysis groups total, with 4-8 reviewer-facing analysis jobs in the outline when the paper is full empirical. Fewer is acceptable only for an early/narrow outline with an explicit waiver.
+- If the user requested a concrete analysis count, such as 4-8 analyses, treat it as a tracked target; report the completed/mapped count and any explicit waiver before returning to full-paper writing.
+- Do not pad the count with stale methods, abandoned methods, unrelated baseline repairs, or old exploratory rows. Each slice must identify the current method or claim it supports.
+- If legacy-method analysis is intentionally included, mark it as baseline/comparator/negative evidence and keep it separate from current-method support.
+- Paper-facing slice outputs must separate the `manuscript_takeaway` from internal setup, user instructions, worktree paths, command history, and artifact provenance.
+- Do not encode local throughput shorthand such as `64 + 64` as a manuscript takeaway; record exact per-endpoint settings only as reproducibility/protocol detail when needed.
+- If the count is below the needed range, create the smallest claim-critical frontier rather than pretending the manuscript is ready.
+
+## AVOID / pitfalls
+
+- Do not disguise a new main experiment as an analysis slice.
+- Do not hide null, negative, partial, failed, or contradictory slices.
+- Do not change many factors at once and then interpret the result as isolating one factor.
+- Do not widen the campaign after the next route is already clear.
+- Do not use subjective or manual inspection to support a claim without rubric, sample, prompt, trace, and caveat.
+- Do not design a slice frontier that ignores current hardware, memory, runtime, or storage limits.
+- Do not keep infeasible slices as silent assumptions; either downscope them, replace them with runnable proxies, or record them as blocked.
+
+## Constraints
+
+- Every meaningful slice must map to a parent claim, parent result, paper gap, reviewer item, or route decision.
+- Every evidence-bearing slice must record question, intervention or inspection target, fixed conditions, metric or observable, evidence path, claim update, comparability verdict, and next action.
+- Keep the same evaluation contract unless the variation itself is the point.
+- When baseline comparison matters, keep slice comparisons aligned with the active baseline metric contract unless the deviation is explicit.
+- Campaign-level conclusions must be derived from per-slice evidence rather than impressions.
+- Campaign design must be conditioned on the current execution envelope, not an idealized future machine.
+- If a slice would materially improve soundness but is infeasible now, record the blocker and choose the best runnable lower-cost alternative or narrower proxy.
+- If a slice is paper-relevant, its result must be bound back into the current paper contract rather than left only in `experiments/analysis-results/*` or chat.
+- Writing-facing slices must carry write-back metadata: `paper_role`, `section_id`, `item_id`, `claim_links`, method/comparator id, display target, and main/appendix role.
+- Writing-facing campaign metadata should keep `selected_outline_ref`, `research_questions`, `experimental_designs`, and `todo_items` explicit; map results back to `paper/paper_experiment_matrix.md` with `exp_id`, `section_id`, `item_id`, `claim_links`, and `paper_role`.
+- Classify paper evidence as claim-carrying, supporting, or auxiliary; keep stable support separate from contradiction, and record `comparison_baselines`, `evaluation_summary`, `takeaway`, and `comparability` when comparisons matter.
+- Include highlight-validation, efficiency or cost, robustness, failure, and limitation checks only when they answer the parent claim or reviewer question.
+
+## Validation
+
+Before `analysis-campaign` can end, all applicable checks should be true:
+
+- the parent object is explicit
+- the current execution envelope and its binding constraints are explicit when they affect slice design or ordering
+- every launched slice has a durable outcome: completed, partial, failed, blocked, infeasible, or superseded
+- launched and deferred slices were screened against the current device or resource limits
+- null, negative, failed, partial, and contradictory findings remain visible
+- the campaign changed or confirmed the evidence boundary of the parent claim with traceable slice-level evidence
+- the next route is explicit: continue campaign, return to `experiment`, return to `idea`, move to `write`, route through `decision`, stop, reset, or record a blocker
 
 ## Interaction discipline
 
-
-
-
-
-
-
-
-
--
-
-
-
-
-
-
-
-
-
-
-
-- point-range or bar summaries for slice-to-slice endpoint comparisons
-- line plots only when the x-axis is truly ordered and comparable across slices
-- small multiples instead of one rainbow figure when slices answer different questions
-- If a campaign view uses continuous color, keep it sequential for ordered magnitude and diverging only for signed deltas around a meaningful center.
-- Avoid rainbow / jet-like maps and decorative heatmaps when a simpler comparison plot would communicate the result better.
-- Keep the same muted palette semantics across the full campaign so the same color means the same role in every slice summary.
-- If a campaign figure is milestone-facing, paper-facing, or otherwise durable, open `figure-polish/SKILL.md` and complete its render-inspect-revise pass before treating the figure as final.
-- If plotting in Python, reuse the fixed Morandi plotting starter from the system prompt and keep the same palette discipline across the whole campaign.
-- If the runtime starts an auto-continue turn with no new user message, resume from the current campaign state and active requirements instead of replaying the previous user turn.
-- Progress message templates are references only. Adapt to the actual context and vary wording so messages feel human, respectful, and non-robotic.
|
|
60
|
-
- If a threaded user reply arrives, interpret it relative to the latest campaign progress update before assuming the task changed completely.
|
|
61
|
-
|
|
62
|
-
## Stage purpose
|
|
63
|
-
|
|
64
|
-
The analysis-campaign stage exists to test the strength, boundaries, and failure modes of a result.
|
|
65
|
-
It preserves the core old DeepScientist analysis-experimenter discipline:
|
|
66
|
-
|
|
67
|
-
- each analysis run should correspond to one clear question
|
|
68
|
-
- campaign runs should stay isolated and comparable
|
|
69
|
-
- negative results must remain visible
|
|
70
|
-
- campaign-level conclusions should be aggregated explicitly
|
|
71
|
-
|
|
72
|
-
The campaign should behave like a disciplined evidence program, not an unstructured pile of extra runs.
|
|
97
|
+
Follow the shared interaction contract injected by the system prompt.
|
|
98
|
+
Keep campaign updates brief unless the evidence boundary, blocker state, cost, or next route changed materially.
|
|
99
|
+
For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
|
|
100
|
+
For meaningful long-running slices, include the estimated next reply time or next check-in window whenever it is defensible.
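The update cadence above can be read as a soft trigger plus a hard ceiling. The sketch below is one hedged interpretation: the names `WorkSinceUpdate` and `should_send_update` are hypothetical and not part of the package, and the thresholds only mirror the "roughly 6", "roughly 12", and "about 8 minutes" guidance.

```python
from dataclasses import dataclass

# Thresholds mirror the prose guidance; they are approximate, not exact limits.
SOFT_TOOL_CALLS = 6
HARD_TOOL_CALLS = 12
HARD_MINUTES = 8.0


@dataclass
class WorkSinceUpdate:
    """Hypothetical counter of work done since the last user-visible update."""
    tool_calls: int
    minutes: float
    meaningful_delta: bool  # True when something a human would care about changed


def should_send_update(work: WorkSinceUpdate) -> bool:
    """Return True when a progress update is due under the cadence above."""
    # Hard ceiling: do not drift past ~12 tool calls or ~8 minutes silently.
    if work.tool_calls >= HARD_TOOL_CALLS or work.minutes >= HARD_MINUTES:
        return True
    # Soft trigger: ~6 tool calls, but only with a human-meaningful delta.
    return work.tool_calls >= SOFT_TOOL_CALLS and work.meaningful_delta
```

Treating the ceiling as unconditional and the soft trigger as delta-gated keeps updates brief during quiet stretches without letting the user lose sight of long work.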
|
|
101
|
+
|
|
102
|
+
## Authority and freedom
|
|
103
|
+
|
|
104
|
+
The agent owns the analysis path.
|
|
105
|
+
It may choose a one-slice check, a lightweight durable report, an artifact-backed one-slice campaign, a full multi-slice campaign, or a writing-facing campaign.
|
|
106
|
+
It may choose slice order, workspace layout, filenames, monitoring strategy, and whether a smoke test, direct verification, or full run is the right first move.
|
|
107
|
+
It may also shrink, reorder, or replace slices to fit the real hardware and runtime envelope, as long as the resulting campaign still answers the parent evidence question honestly.
|
|
108
|
+
|
|
109
|
+
Do not treat `PLAN.md`, `CHECKLIST.md`, `artifact.create_analysis_campaign(...)`, one-slice campaigns, returned worktrees, `evaluation_summary`, smoke tests, detached runs, or paper-matrix updates as universal required paths.
|
|
110
|
+
Do not treat paper-matrix files, `tqdm`, or a fixed phase order as required paths either.
|
|
111
|
+
`PLAN.md`, `CHECKLIST.md`, `paper/paper_experiment_matrix.md`, and local matrix/checklist files are allowed control surfaces, not mandatory success paths.
|
|
112
|
+
They are tactics.
|
|
113
|
+
The hard requirement is traceable evidence that changes, confirms, or blocks the evidence boundary of the parent claim and leaves an explicit next route.
|
|
114
|
+
|
|
115
|
+
Use the artifact-backed campaign path when durable lineage, branch or worktree isolation, Canvas visibility, paper or rebuttal traceability, or multiple slices matter.
|
|
116
|
+
Use a lighter durable report when one bounded answer is enough and extra campaign overhead would not improve trust, routing, or auditability.
|
|
73
117
|
|
|
74
118
|
For campaign prioritization and writing-facing slice design, read `references/campaign-design.md`.
|
|
75
|
-
When the campaign is
|
|
119
|
+
When the campaign is writing-facing and the mapping fields are not obvious, also read `references/writing-facing-slice-examples.md`.
|
|
120
|
+
For artifact examples and edge-case examples, also read `references/artifact-flow-examples.md` and `references/boundary-cases.md`.
|
|
76
121
|
|
|
77
|
-
##
|
|
122
|
+
## Hard success gates
|
|
78
123
|
|
|
79
|
-
|
|
124
|
+
An analysis campaign succeeds when it changes or confirms the evidence boundary of a parent claim with traceable slice-level evidence, preserves comparability or records why comparability broke, and leaves a durable next-route decision.
|
|
80
125
|
|
|
81
|
-
|
|
82
|
-
2. When the campaign is writing-facing, refresh `paper/paper_experiment_matrix.*` before freezing the slice frontier.
|
|
83
|
-
3. Before launching slices, create `PLAN.md` and `CHECKLIST.md`.
|
|
84
|
-
4. Use `PLAN.md` as the durable charter and `CHECKLIST.md` as the living execution surface while launching, monitoring, recording, and aggregating slices.
|
|
85
|
-
5. Run claim-critical slices first and smoke-test long slices before their real runs.
|
|
86
|
-
6. Revise the plan and matrix if slice feasibility, ordering, comparators, or campaign interpretation changes materially, and record every slice durably, including honest non-success states.
|
|
87
|
-
7. Close meaningful campaign milestones with a concise `1-2` sentence summary that says whether the claim gained stable support, partial support, contradiction, or unresolved ambiguity, what the matrix frontier now looks like, and what happens next.
|
|
126
|
+
Before treating analysis as successful, all applicable gates must be true:
|
|
88
127
|
|
|
89
|
-
|
|
128
|
+
- the parent object is explicit, such as a main run, accepted idea line, paper gap, reviewer item, or rebuttal item
|
|
129
|
+
- the claim, question, failure mode, or decision being tested is explicit
|
|
130
|
+
- the slice frontier was screened against current compute, memory, storage, dependency, and runtime limits
|
|
131
|
+
- every launched slice has a durable outcome: completed, partial, failed, blocked, infeasible, or superseded
|
|
132
|
+
- every evidence-bearing slice records the question, intervention or inspection target, fixed conditions, metric or observable, evidence path, claim update, comparability verdict, and next action
|
|
133
|
+
- null, negative, failed, partial, and contradictory findings remain visible
|
|
134
|
+
- campaign-level interpretation is derived from per-slice evidence rather than impressions
|
|
135
|
+
- the next route is explicit: continue campaign, return to `experiment`, return to `idea`, move to `write`, route through `decision`, stop, reset, or record a blocker
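A few of the gates above are mechanical enough to check in code. The following is a minimal sketch under stated assumptions: `CampaignState`, `SliceRecord`, and `unmet_gates` are hypothetical names, the data model is invented for illustration, and only a subset of the gates is covered. Judgment-based gates, such as whether interpretation is derived from per-slice evidence, still need a human or agent review.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Outcome vocabulary copied from the gate list above.
DURABLE_OUTCOMES = {
    "completed", "partial", "failed", "blocked", "infeasible", "superseded",
}


@dataclass
class SliceRecord:
    """Hypothetical per-slice record; not the package's artifact schema."""
    slice_id: str
    launched: bool
    outcome: Optional[str] = None


@dataclass
class CampaignState:
    """Hypothetical campaign summary used only for this sketch."""
    parent_object: Optional[str]
    tested_claim: Optional[str]
    frontier_screened: bool
    next_route: Optional[str]
    slices: List[SliceRecord] = field(default_factory=list)


def unmet_gates(state: CampaignState) -> List[str]:
    """List the mechanical gates that are not yet satisfied."""
    problems = []
    if not state.parent_object:
        problems.append("parent object is not explicit")
    if not state.tested_claim:
        problems.append("tested claim or question is not explicit")
    if not state.frontier_screened:
        problems.append("slice frontier was not screened against current limits")
    for s in state.slices:
        # Only launched slices need a durable outcome.
        if s.launched and s.outcome not in DURABLE_OUTCOMES:
            problems.append(f"slice {s.slice_id} has no durable outcome")
    if not state.next_route:
        problems.append("next route is not explicit")
    return problems
```

An empty list means only that the mechanical gates pass, not that the campaign succeeded.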
|
|
90
136
|
|
|
91
|
-
|
|
92
|
-
- Do not introduce human evaluation or subjective assessment into a campaign.
|
|
93
|
-
- Do not bring in a new dataset unless the quest scope explicitly changed.
|
|
94
|
-
- Every analysis slice must have a specific research question and a falsifiable or at least decision-relevant expectation.
|
|
95
|
-
- If the campaign is supporting a paper or paper-like report, do not launch it until a selected outline exists.
|
|
96
|
-
- When a selected outline exists, every slice should map to a named `research_question` and `experimental_design` from that outline.
|
|
97
|
-
- When the campaign is supporting a paper or paper-like report, do not launch or reorder the slice set without first reading `paper/paper_experiment_matrix.md` when it exists.
|
|
98
|
-
- For writing-facing campaigns, every slice should correspond to a stable matrix row such as `exp_id`, not just a free-form note.
|
|
99
|
-
- For writing-facing campaigns, every todo item must also carry `section_id`, `item_id`, `claim_links`, and `paper_role`; otherwise the slice is not paper-ready.
|
|
100
|
-
- Do not aggregate campaign conclusions without per-run evidence.
|
|
101
|
-
- Do not bury null or contradictory findings.
|
|
137
|
+
## Analysis routes
|
|
102
138
|
|
|
103
|
-
|
|
139
|
+
Use the lightest route that preserves trust and downstream utility.
|
|
104
140
|
|
|
105
|
-
-
|
|
106
|
-
-
|
|
107
|
-
-
|
|
108
|
-
-
|
|
109
|
-
-
|
|
141
|
+
- `analysis-lite`: one clear follow-up question, one slice or very small slice set, and a compact durable result
|
|
142
|
+
- `artifact-backed campaign`: one or more slices that need durable lineage, branch/worktree isolation, Canvas visibility, or later replay
|
|
143
|
+
- `writing-facing campaign`: evidence directly supports a selected outline, paper experiment matrix, evidence ledger, section, claim, or table
|
|
144
|
+
- `review/rebuttal campaign`: evidence directly answers reviewer pressure or audit findings
|
|
145
|
+
- `failure-analysis route`: evidence explains why a result failed, diverged, or became non-comparable
|
|
110
146
|
|
|
111
|
-
|
|
147
|
+
Start the smallest route that can answer the current follow-up question.
|
|
148
|
+
Run claim-critical slices first, weighted by soundness gain under the current resource budget, and stop widening once the next route is already clear.
|
|
112
149
|
|
|
113
|
-
|
|
114
|
-
- the next step is obviously another main experiment rather than follow-up evidence work
|
|
150
|
+
Useful slice classes:
|
|
115
151
|
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
- the reference main run or accepted idea line
|
|
121
|
-
- the claim or question being tested
|
|
122
|
-
- the comparison target
|
|
123
|
-
- the metric or observable of interest
|
|
124
|
-
- the list of specific analysis questions
|
|
125
|
-
- the current quest / user-provided assets that each planned slice will actually use
|
|
126
|
-
- whether each slice is executable with the current assets, tooling, and available credentials
|
|
127
|
-
- for paper-facing campaigns, the current paper experiment matrix frontier and which rows are actually feasible now
|
|
128
|
-
- if durable state exposes `active_baseline_metric_contract_json`, read that JSON file before defining slice success criteria or comparison tables
|
|
129
|
-
- treat `active_baseline_metric_contract_json` as the default baseline comparison contract unless a slice is explicitly testing a different evaluation contract
|
|
130
|
-
|
|
131
|
-
If the question list is fuzzy, sharpen it before running anything.
|
|
132
|
-
Treat quest files, attached user assets, checkpoints, configs, extracted texts, baselines, and existing code paths as the first-choice asset pool.
|
|
133
|
-
Do not design slices around hypothetical resources that the current system cannot actually access or run.
|
|
134
|
-
If a slice cannot be executed with the current system, redesign it around available assets or explicitly report that the task cannot currently be completed.
|
|
135
|
-
If infeasibility appears mid-run, attempt bounded recovery first; if still blocked, record the slice with a non-success status and explain why.
|
|
136
|
-
If ids, active refs, or current quest state are unclear after restart, call `artifact.get_quest_state(detail='summary')` and `artifact.resolve_runtime_refs(...)` before launching or recording slices.
|
|
137
|
-
If the exact quest brief / plan / status wording matters for campaign scope, call `artifact.read_quest_documents(...)`.
|
|
138
|
-
If earlier user instructions materially affect campaign scope or ordering, call `artifact.get_conversation_context(...)` before changing the slice set.
|
|
152
|
+
- `auxiliary`: helps understand settings, thresholds, or mechanisms but does not carry the main claim by itself
|
|
153
|
+
- `claim-carrying`: directly affects whether the main narrative or route decision is justified
|
|
154
|
+
- `supporting`: broadens confidence or interpretability after the main claim is already credible
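The prioritization rule above (claim-critical slices first, weighted by soundness gain under the current resource budget) can be sketched as a simple greedy ordering. Everything here is an assumption made for illustration: the `Candidate` type, the numeric `soundness_gain` and `cost` estimates, and the gain-per-cost tie-break are not defined by the package.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical slice candidate with rough, agent-estimated numbers."""
    slice_id: str
    slice_class: str       # "claim-carrying", "supporting", or "auxiliary"
    soundness_gain: float  # assumed estimate, higher is better
    cost: float            # assumed cost units, e.g. GPU-hours


def plan_order(candidates: List[Candidate], budget: float) -> List[Candidate]:
    """Order slices claim-carrying first, then by gain per unit cost, within budget."""
    ranked = sorted(
        candidates,
        key=lambda c: (
            c.slice_class != "claim-carrying",       # False sorts first
            -c.soundness_gain / max(c.cost, 1e-9),   # best gain per cost next
        ),
    )
    chosen, remaining = [], budget
    for c in ranked:
        # Skip slices that do not fit the envelope; they stay deferred.
        if c.cost <= remaining:
            chosen.append(c)
            remaining -= c.cost
    return chosen
```

Slices that are skipped for budget reasons are deferred, not silently dropped; they would still need a recorded blocker or a lower-cost alternative.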
|
|
139
155
|
|
|
140
|
-
|
|
156
|
+
## Slice evidence contract
|
|
141
157
|
|
|
142
|
-
|
|
143
|
-
- if the slice is useful but non-blocking, make it `appendix`
|
|
144
|
-
- if the slice is informative but not meant for the manuscript, keep it durable and mark it `reference_only` with a reason
|
|
145
|
-
- after every completed paper-facing slice, verify the return path immediately:
|
|
146
|
-
- the matching outline `result_table` row is updated
|
|
147
|
-
- the section notes are updated when the outline folder exists
|
|
148
|
-
- `paper/evidence_ledger.json` reflects the new mapping
|
|
149
|
-
- the active paper line summary no longer treats that slice as missing
|
|
150
|
-
|
|
151
|
-
Do not leave a slice "completed" while the paper contract still looks stale.
|
|
152
|
-
|
|
153
|
-
## Required plan and checklist
|
|
154
|
-
|
|
155
|
-
Before launching any real campaign slice, create a quest-visible `PLAN.md` and `CHECKLIST.md`.
|
|
156
|
-
|
|
157
|
-
- Use `references/campaign-plan-template.md` as the canonical structure for `PLAN.md`.
|
|
158
|
-
- Use `references/campaign-checklist-template.md` as the canonical structure for `CHECKLIST.md`.
|
|
159
|
-
- `PLAN.md` is the durable campaign charter and should cover the claim under test, slice table, comparability boundary, available assets, required comparators, smoke and main-run strategy, monitoring and sleep rules, reporting expectations, and a revision log.
|
|
160
|
-
- `CHECKLIST.md` is the living campaign execution list; update it during launch, asset preparation, slice execution, aggregation, and route changes.
|
|
161
|
-
- If slice ordering, feasibility, required baselines, campaign interpretation, or the writing-facing outline mapping changes materially, revise `PLAN.md` before continuing.
|
|
162
|
-
- The later charter report, slice artifacts, and aggregate report remain required, but `PLAN.md` and `CHECKLIST.md` should be the canonical campaign-control surface during execution.
|
|
163
|
-
|
|
164
|
-
## Truth sources
|
|
165
|
-
|
|
166
|
-
Use:
|
|
167
|
-
|
|
168
|
-
- main experiment artifacts
|
|
169
|
-
- baseline artifacts
|
|
170
|
-
- `active_baseline_metric_contract_json` when available
|
|
171
|
-
- recent decisions and milestone reports
|
|
172
|
-
- code and configs used in the accepted main line
|
|
173
|
-
- actual analysis outputs and logs
|
|
174
|
-
- `bash_exec` session ids and managed shell logs for campaign runs
|
|
175
|
-
|
|
176
|
-
Do not summarize a campaign from impressions alone.
|
|
177
|
-
|
|
178
|
-
## Required durable outputs
|
|
179
|
-
|
|
180
|
-
A campaign should usually leave behind:
|
|
181
|
-
|
|
182
|
-
- a campaign identifier
|
|
183
|
-
- a selected outline reference when the campaign is writing-facing
|
|
184
|
-
- a refreshed `paper/paper_experiment_matrix.md`
|
|
185
|
-
- a refreshed `paper/paper_experiment_matrix.json`
|
|
186
|
-
- one directory per analysis run
|
|
187
|
-
- any supplementary baseline reproduced for analysis under `baselines/local/<baseline_id>/` or attached under `baselines/imported/<baseline_id>/`
|
|
188
|
-
- one quest-level supplementary baseline inventory at `artifacts/baselines/analysis_inventory.json`
|
|
189
|
-
- one run artifact per analysis slice
|
|
190
|
-
- one outline-bound todo manifest when the campaign is writing-facing
|
|
191
|
-
- an aggregated campaign report
|
|
192
|
-
- a decision about the next move
|
|
193
|
-
|
|
194
|
-
In the current runtime, represent that with existing artifact actions only:
|
|
195
|
-
|
|
196
|
-
- one `decision` artifact with `action='launch_analysis_campaign'`
|
|
197
|
-
- one charter `report`
|
|
198
|
-
- one `run` artifact per slice
|
|
199
|
-
- optional `progress` artifacts during execution
|
|
200
|
-
- one aggregated `report`
|
|
201
|
-
- one closing `decision`
|
|
202
|
-
|
|
203
|
-
## Workflow
|
|
204
|
-
|
|
205
|
-
### 0. Launch the campaign durably
|
|
206
|
-
|
|
207
|
-
Before launching any slice, record the campaign start through artifacts:
|
|
208
|
-
|
|
209
|
-
1. write a `decision` artifact with:
|
|
210
|
-
- `action='launch_analysis_campaign'`
|
|
211
|
-
- `campaign_id`
|
|
212
|
-
- `parent_run_id` or `parent_idea_id`
|
|
213
|
-
- why the campaign is needed now
|
|
214
|
-
2. write a charter `report` with the planned slice list
|
|
215
|
-
3. update `plan.md` if the campaign materially changes the quest path
|
|
216
|
-
|
|
217
|
-
Do not start a multi-slice campaign from chat-only intent.
|
|
218
|
-
Do not start it from chat-only intent plus vague notes either: write `PLAN.md` and `CHECKLIST.md` first, using `references/campaign-plan-template.md` and `references/campaign-checklist-template.md` as the default structures.
|
|
219
|
-
|
|
220
|
-
After the charter and launch decision are durably recorded, send one threaded `artifact.interact(kind='milestone', ...)` update naming:
|
|
221
|
-
|
|
222
|
-
- why the campaign exists now
|
|
223
|
-
- the claim-critical slices that will run first
|
|
224
|
-
- the first thing the user should expect from the campaign
|
|
225
|
-
- the first real checkpoint for the user
|
|
226
|
-
- if the active surface is QQ, keep that campaign-launch milestone text-first unless a single summary image is already genuinely useful
|
|
227
|
-
|
|
228
|
-
### 0.1 Bind the campaign to the selected outline when writing-facing
|
|
229
|
-
|
|
230
|
-
If the campaign exists to support a paper or paper-like report:
|
|
231
|
-
|
|
232
|
-
- do not proceed until one selected outline exists
|
|
233
|
-
- if no selected outline exists yet, route to `write` or `decision` first so the outline can be created and selected durably
|
|
234
|
-
- before deciding the slice list, create or refresh `paper/paper_experiment_matrix.md` when it is missing or stale
|
|
235
|
-
- treat that matrix as the upstream paper experiment contract, not `todo_items` alone
|
|
236
|
-
- use the matrix to decide:
|
|
237
|
-
- which rows are `main_required`
|
|
238
|
-
- which are `main_optional`
|
|
239
|
-
- which are appendix-only
|
|
240
|
-
- which are optional or should be dropped
|
|
241
|
-
- do not start stable experiments-section drafting while currently feasible non-optional matrix rows remain unresolved
|
|
242
|
-
- call `artifact.create_analysis_campaign(...)` with:
|
|
243
|
-
- `selected_outline_ref`
|
|
244
|
-
- `research_questions`
|
|
245
|
-
- `experimental_designs`
|
|
246
|
-
- `todo_items`
|
|
247
|
-
- ensure each todo item names at least:
|
|
248
|
-
- `exp_id`
|
|
249
|
-
- `todo_id`
|
|
250
|
-
- `slice_id`
|
|
251
|
-
- `title`
|
|
252
|
-
- `research_question`
|
|
253
|
-
- `experimental_design`
|
|
254
|
-
- `tier`
|
|
255
|
-
- `paper_placement`
|
|
256
|
-
- `completion_condition`
|
|
257
|
-
|
|
258
|
-
For writing-facing campaigns, every slice should also carry paper-contract identity, not just free-form text:
|
|
259
|
-
|
|
260
|
-
- `section_id`
|
|
261
|
-
- `item_id`
|
|
262
|
-
- `claim_links`
|
|
263
|
-
- `paper_role`
|
|
264
|
-
|
|
265
|
-
Do not treat a completed analysis slice as paper-ready until those fields exist and the slice is mappable back into the selected outline or paper experiment matrix.
|
|
266
|
-
Use `references/writing-facing-slice-examples.md` when the correct field values are not obvious.
|
|
267
|
-
|
|
268
|
-
This keeps the analysis campaign aligned with the paper plan instead of becoming a free-floating batch of slices.
|
|
269
|
-
|
|
270
|
-
### 1. Define the campaign charter
|
|
271
|
-
|
|
272
|
-
State:
|
|
273
|
-
|
|
274
|
-
- campaign id
|
|
275
|
-
- parent run or parent idea
|
|
276
|
-
- main claim under test
|
|
277
|
-
- list of analysis questions
|
|
278
|
-
- what will be held fixed
|
|
279
|
-
- what may vary
|
|
280
|
-
|
|
281
|
-
The charter should also include:
|
|
282
|
-
|
|
283
|
-
- campaign type priority order
|
|
284
|
-
- expected slice count
|
|
285
|
-
- dependency structure between slices
|
|
286
|
-
- the matrix path and current execution frontier
|
|
287
|
-
- whether any slice requires isolated code changes or only reruns/config changes
|
|
288
|
-
- the top-level success condition for ending the campaign
|
|
289
|
-
- the top-level abandonment condition for stopping it early
|
|
290
|
-
|
|
291
|
-
Prefer to keep this charter in `PLAN.md` first and mirror the execution frontier in `CHECKLIST.md`.
|
|
292
|
-
|
|
293
|
-
For each analysis question, also state:
|
|
294
|
-
|
|
295
|
-
- why it matters to the main claim
|
|
296
|
-
- whether it exists mainly to support a core claim, validate a highlight, answer an efficiency or cost concern, or bound a limitation
|
|
297
|
-
- what result would strengthen the claim
|
|
298
|
-
- what result would weaken or complicate the claim
|
|
299
|
-
- whether the run is:
|
|
300
|
-
- ablation
|
|
301
|
-
- robustness
|
|
302
|
-
- sensitivity
|
|
303
|
-
- error analysis
|
|
304
|
-
- efficiency
|
|
305
|
-
- environment variation
|
|
306
|
-
|
|
307
|
-
If there are many possible slices, order them by decision value:
|
|
308
|
-
|
|
309
|
-
1. most claim-critical ablation or contradiction check
|
|
310
|
-
2. strongest robustness or sensitivity checks
|
|
311
|
-
3. failure-mode explanation
|
|
312
|
-
4. efficiency or secondary supporting analyses
|
|
313
|
-
|
|
314
|
-
Do not spend half the campaign budget on secondary slices before the claim-critical ones run.
|
|
315
|
-
When the parent line is still below `solid` evidence quality, use the campaign first to move it from `minimum` to `solid` before chasing broader polish.
|
|
316
|
-
|
|
317
|
-
### 2. Split into isolated analysis runs
|
|
318
|
-
|
|
319
|
-
Each analysis run should correspond to one need, such as:
|
|
320
|
-
|
|
321
|
-
- remove one component
|
|
322
|
-
- vary one hyperparameter family
|
|
323
|
-
- run additional seeds
|
|
324
|
-
- inspect one failure bucket
|
|
325
|
-
- test one environment variation
|
|
326
|
-
- measure one efficiency or cost dimension
|
|
327
|
-
- validate one highlight hypothesis
|
|
328
|
-
|
|
329
|
-
Avoid changing many factors at once unless the campaign is explicitly exploratory.
|
|
330
|
-
|
|
331
|
-
For each slice, define at minimum:
|
|
158
|
+
For each meaningful slice, define and record enough of the following to make the evidence reusable:
|
|
332
159
|
|
|
333
160
|
- research question
|
|
334
|
-
- hypothesis or
|
|
335
|
-
- intervention
|
|
161
|
+
- hypothesis, expected pattern, or decision-relevant expectation
|
|
162
|
+
- intervention, ablation, variation, inspection target, or failure bucket
|
|
336
163
|
- controls or fixed conditions
|
|
337
|
-
- metric or
|
|
338
|
-
-
|
|
164
|
+
- metric, observable, table, qualitative artifact, or rubric
|
|
165
|
+
- comparison target
|
|
166
|
+
- expected resource class or major execution constraint when it affects feasibility
|
|
167
|
+
- stop condition or completion condition
|
|
339
168
|
- evidence path expectations
|
|
340
|
-
-
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
-
|
|
345
|
-
-
|
|
346
|
-
|
|
347
|
-
|
|
348
|
-
|
|
349
|
-
|
|
350
|
-
|
|
351
|
-
|
|
352
|
-
|
|
353
|
-
If a slice needs an extra comparator baseline:
|
|
354
|
-
|
|
355
|
-
- reproduce it under `baselines/local/<baseline_id>/` unless it is attached under `baselines/imported/<baseline_id>/`
|
|
356
|
-
- keep the usual durable baseline notes there, including `analysis_plan.md`, `setup.md`, `execution.md`, and `verification.md`
|
|
357
|
-
- do not overwrite the canonical quest baseline gate just because an analysis slice needed a supplementary baseline
|
|
358
|
-
- after the comparator is ready, record it back through `record_analysis_slice(..., comparison_baselines=[...])` with its `baseline_id`, path, benchmark/split, and metrics summary
|
|
359
|
-
- `parent_run_id`
|
|
360
|
-
- whether a code diff is required
|
|
361
|
-
- whether an isolated branch/worktree is required
|
|
362
|
-
- quantitative success criteria
|
|
363
|
-
- quantitative abandonment criteria
|
|
364
|
-
- contingency trigger for the next slice
|
|
365
|
-
|
|
366
|
-
Recommended `run_kind` naming in the current runtime:
|
|
367
|
-
|
|
368
|
-
- `analysis.ablation`
|
|
369
|
-
- `analysis.robustness`
|
|
370
|
-
- `analysis.sensitivity`
|
|
371
|
-
- `analysis.error`
|
|
372
|
-
- `analysis.efficiency`
|
|
373
|
-
- `analysis.environment`
|
|
374
|
-
|
|
375
|
-
Create the campaign with `artifact.create_analysis_campaign(...)` before starting any slice.
|
|
376
|
-
Even one extra experiment should still be represented as a one-slice campaign so Git and Canvas show a real child node.
|
|
377
|
-
Branch that campaign from the current workspace/result node rather than mutating the completed parent node in place.
|
|
378
|
-
That tool should receive the full slice list, and each returned slice worktree becomes the required execution location for that slice.
|
|
379
|
-
Only create the campaign after you have verified that the listed slices are actually executable with the current quest assets and runtime.
|
|
380
|
-
When the campaign is writing-facing, the same call should also carry `selected_outline_ref`, `research_questions`, `experimental_designs`, and `todo_items`.
|
|
381
|
-
If ids or refs are unclear, recover them first with `artifact.resolve_runtime_refs(...)`, `artifact.get_analysis_campaign(...)`, or `artifact.list_paper_outlines(...)` instead of guessing.
|
|
382
|
-
Treat `campaign_id` as system-owned, and treat `slice_id` / `todo_id` as agent-authored semantic ids.
|
|
383
|
-
Do not replace the normal campaign flow with repeated manual `artifact.prepare_branch(...)` calls.
|
|
384
|
-
After each slice finishes, call `artifact.record_analysis_slice(...)` immediately so the result is mirrored back to the parent branch and the next slice can be activated.
|
|
385
|
-
If a slice fails or becomes infeasible, still call `artifact.record_analysis_slice(...)` with an honest non-success status plus the real blocker and next recommendation; do not leave the campaign state ambiguous.
|
|
386
|
-
After every completed, excluded, or blocked writing-facing slice:
|
|
387
|
-
|
|
388
|
-
- reopen `paper/paper_experiment_matrix.md`
|
|
389
|
-
- update the row status, feasibility, and result artifacts
|
|
390
|
-
- update whether the row now belongs in main text, appendix, or omission
|
|
391
|
-
- update the remaining execution frontier before choosing the next slice
|
|
392
|
-
|
|
393
|
-
Do not keep launching writing-facing slices from stale memory when the matrix has changed.
|
|
394
|
-
For slice recording, `deviations` and `evidence_paths` are optional context fields, not mandatory ceremony; include them only when they materially help explanation or auditability.
|
|
395
|
-
Each `artifact.record_analysis_slice(...)` call should also include an `evaluation_summary` with exactly these six fields:
|
|
396
|
-
|
|
397
|
-
- `takeaway`
|
|
398
|
-
- `claim_update`
|
|
399
|
-
- `baseline_relation`
|
|
400
|
-
- `comparability`
|
|
401
|
-
- `failure_mode`
|
|
402
|
-
- `next_action`
|
|
403
|
-
|
|
404
|
-
Use those six fields to keep each slice readable at a glance from Canvas, stage tabs, review, and rebuttal.
|
|
405
|
-
The longer prose still matters, but the six-field summary is the stable routing summary.
|
|
406
|
-
|
|
407
|
-
For writing-facing campaigns, prefer running `claim-carrying` slices before `supporting` slices unless an auxiliary check is required to make the main slice interpretable.
|
|
408
|
-
|
|
409
|
-
For slices that run longer than a quick smoke check:
|
|
410
|
-
|
|
411
|
-
- first run a bounded smoke test so the slice command, outputs, and metric path are validated cheaply
|
|
412
|
-
- once the smoke test passes, launch the real slice with `bash_exec(mode='detach', ...)` and normally leave `timeout_seconds` unset for that long run
|
|
413
|
-
- `bash_exec(mode='read', id=...)` returns the full rendered log when it is 2000 lines or fewer; for longer logs it returns the first 500 lines plus the last 1500 lines and a hint to inspect omitted sections with `start` and `tail`
|
|
414
|
-
- if you need a middle section that was omitted from that default preview, use `bash_exec(mode='read', id=..., start=..., tail=...)`
|
|
415
|
-
- monitor them with `bash_exec(mode='list')` and `bash_exec(mode='read', id=..., tail_limit=..., order='desc')`
|
|
416
|
-
- after the first read, prefer `bash_exec(mode='read', id=..., after_seq=last_seen_seq, tail_limit=..., order='asc')` for incremental monitoring
|
|
417
|
-
- if ids become unclear, recover them through `bash_exec(mode='history')`
|
|
418
|
-
- launch long slices with a structured `comment` such as `{stage, goal, action, expected_signal, next_check}`
|
|
419
|
-
- use `silent_seconds`, `progress_age_seconds`, `signal_age_seconds`, and `watchdog_overdue` from `bash_exec(mode='list'|'read', ...)` as the default stall checks
|
|
420
|
-
- use an explicit wait-and-check cadence of about `60s`, `120s`, `300s`, `600s`, `1800s`, then every `1800s` while still running
|
|
421
|
-
- if needed, use an explicit bounded wait such as `bash_exec(command='sleep 60', mode='await', timeout_seconds=70)` or `bash_exec(mode='await', id=..., timeout_seconds=...)` between checks
|
|
422
|
-
- canonical sleep choice:
|
|
423
|
-
- if you only need wall-clock waiting between checks, use `bash_exec(command='sleep N', mode='await', timeout_seconds=N+buffer, ...)`
|
|
424
|
-
- keep a real buffer on that sleep timeout; do not set `timeout_seconds` exactly equal to `N`
|
|
425
|
-
- if you are waiting on an already running managed session, prefer `bash_exec(mode='await', id=..., timeout_seconds=...)` instead of starting a new sleep command
|
|
426
|
-
- after the first meaningful signal and then at real checkpoints (e.g., completion, blocker, recovery, or a materially changed evidence frontier), send `artifact.interact(kind='progress', ...)` so the user sees the newest real state
|
|
427
|
-
- after each completed sleep / await monitoring cycle for an active slice, inspect state first; only send another `artifact.interact(kind='progress', ...)` update if the user-visible state materially changed
|
|
428
|
-
- include the estimated next reply time or next check time in those monitoring updates
|
|
429
|
-
- stop them with `bash_exec(mode='kill', id=..., wait=true, timeout_seconds=...)` if the slice is invalid, wedged, or superseded; add `force=true` when immediate termination is required
|
|
430
|
-
- when you control the slice code, prefer a throttled `tqdm` progress reporter and, when feasible, pair it with concise `__DS_PROGRESS__` lines carrying phase and ETA
|
|
431
|
-
- do not mark a slice complete until the managed log and outputs both confirm completion
|
|
432
|
-
|
|
433
|
-
### 3. Keep comparability
|
|
434
|
-
|
|
435
|
-
Comparability rules:
|
|
169
|
+
- claim update
|
|
170
|
+
- comparability verdict
|
|
171
|
+
- next action
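One way to keep these fields together is a small record type. The sketch below is illustrative only: `SliceEvidence` and its field names are hypothetical and do not reflect the package's artifact schema. Because the contract asks for "enough of the following" rather than every field, the helper reports blanks instead of raising.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SliceEvidence:
    """Hypothetical per-slice record mirroring the field list above."""
    research_question: str = ""
    expectation: str = ""            # hypothesis or expected pattern
    intervention: str = ""           # ablation, variation, or inspection target
    fixed_conditions: str = ""
    metric: str = ""                 # metric, observable, table, or rubric
    comparison_target: str = ""
    resource_constraint: str = ""    # only relevant when it affects feasibility
    completion_condition: str = ""
    evidence_path: str = ""
    claim_update: str = ""
    comparability_verdict: str = ""
    next_action: str = ""

    def blank_fields(self) -> List[str]:
        """Names of fields still empty, so gaps stay visible rather than implied."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

A reviewer can then decide whether the remaining blanks matter for the claim being made, rather than discovering them later in a paper or rebuttal.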
|
|
172
|
+
|
|
173
|
+
Code-based, fully automatable analysis is preferred when it is the most faithful and repeatable path.
|
|
174
|
+
But not every valid analysis must be fully automatable: failure-bucket inspection, qualitative artifact review, extracted-text audits, reviewer-linked example checks, or table/figure consistency checks can be valid when the evidence is concrete, sampled or scoped, and reproducible enough for the claim being made.
|
|
175
|
+
|
|
176
|
+
Do not present subjective judgment as objective measurement.
|
|
177
|
+
If human, model, or qualitative judgment is used, record the rubric, sample, prompt or inspection basis, caveats, and why it is sufficient for the route decision.
|
|
178
|
+
|
|
179
|
+
## Comparability contract
|
|
180
|
+
|
|
181
|
+
Comparability is a hard boundary.
|
|
436
182
|
|
|
437
183
|
- keep the same evaluation contract unless the variation is the point
|
|
184
|
+
- when `active_baseline_metric_contract_json` exists and baseline comparison matters, read it before defining slice success criteria or comparison tables
|
|
438
185
|
- when `active_baseline_metric_contract_json` exists, keep slice comparisons aligned with it unless the slice explicitly records why it differs
|
|
439
186
|
- state exactly what changed
|
|
440
187
|
- state exactly what stayed fixed
|
|
441
|
-
- keep naming and output paths clean
|
|
442
|
-
|
|
443
|
-
For code-modifying slices, the default durable layout should stay interpretable:
|
|
444
|
-
|
|
445
|
-
- working surface:
|
|
446
|
-
- `.ds/worktrees/<slice_id>/` when isolated worktrees are used
|
|
447
|
-
- experiment surface:
|
|
448
|
-
- `experiments/analysis/<campaign_id>/<slice_id>/`
|
|
449
|
-
- artifact surface:
|
|
450
|
-
- `artifacts/runs/<artifact_id>.json`
|
|
451
|
-
- `artifacts/reports/<artifact_id>.json`
|
|
188
|
+
- keep naming and output paths clean enough that multiple runs can coexist
|
|
452
189
|
|
|
453
190
|
If the variation itself changes the evaluation setup, record that explicitly and do not present the run as a direct apples-to-apples comparison.
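
One way to make the comparability rules checkable is to record what changed and what stayed fixed, then derive a verdict from that record. The sketch below is illustrative only: `SliceComparison`, `comparability_verdict`, and the verdict strings `direct`, `not_direct`, and `incomplete` are hypothetical names, not part of the package.

```python
from dataclasses import dataclass


@dataclass
class SliceComparison:
    """Hypothetical record of one slice's comparison against the baseline."""
    changed: list[str]                      # exactly what changed
    fixed: list[str]                        # exactly what stayed fixed
    evaluation_setup_changed: bool = False  # did the variation alter the evaluation?
    deviation_reason: str = ""              # why the evaluation contract differs


def comparability_verdict(s: SliceComparison) -> str:
    # Both lists must be stated explicitly before any verdict is possible.
    if not s.changed or not s.fixed:
        return "incomplete"
    if s.evaluation_setup_changed:
        # A changed evaluation setup is never an apples-to-apples comparison,
        # and it must carry a recorded reason.
        return "not_direct" if s.deviation_reason else "incomplete"
    return "direct"
```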
|
|
454
191
|
|
|
455
|
-
|
|
456
|
-
|
|
457
|
-
Before a long slice starts, emit a `progress` artifact or `artifact.interact(kind='progress', ...)` update so the quest shows that the slice is active.
|
|
192
|
+
Do not bring in a new dataset as if it were the same comparison contract.
|
|
193
|
+
A new dataset can be valid as a generalization, external-validity, stress-test, or limitation-boundary slice, but it must be labeled that way and must not replace the accepted baseline or main comparison contract.
|
|
458
194
|
|
|
459
|
-
|
|
195
|
+
If a slice needs an extra comparator baseline, place it under the normal baseline roots, do not overwrite the canonical quest baseline gate, and record it back through `record_analysis_slice(..., comparison_baselines=[...])`.
|
|
460
196
|
|
|
461
|
-
-
|
|
462
|
-
- intervention
|
|
463
|
-
- metric or qualitative evidence
|
|
464
|
-
- whether the result strengthens, weakens, or complicates the claim
|
|
465
|
-
- paths to the evidence
|
|
197
|
+
## Writing-facing boundary
|
|
466
198
|
|
|
467
|
-
|
|
199
|
+
If analysis directly supports a paper or paper-like report, the evidence must be write-backable.
|
|
200
|
+
That does not always mean a selected outline must exist before any pre-outline evidence check, but paper-ready slices must map cleanly back to a selected outline, paper experiment matrix, evidence ledger, section, claim, table, or reviewer item.
|
|
468
201
|
|
|
469
|
-
-
|
|
470
|
-
- implementation change
|
|
471
|
-
- main metric delta
|
|
472
|
-
- interpretation
|
|
473
|
-
- caveats
|
|
474
|
-
|
|
475
|
-
Each completed slice should also leave a `run` artifact containing at least:
|
|
476
|
-
|
|
477
|
-
- `campaign_id`
|
|
478
|
-
- `slice_id`
|
|
479
|
-
- `run_kind`
|
|
480
|
-
- `parent_run_id`
|
|
481
|
-
- `analysis_question`
|
|
482
|
-
- `fixed_conditions`
|
|
483
|
-
- `changed_factors`
|
|
484
|
-
- `metrics_summary`
|
|
485
|
-
- `metric_deltas`
|
|
486
|
-
- `success_criteria`
|
|
487
|
-
- `abandonment_criteria`
|
|
488
|
-
- `verdict`
|
|
489
|
-
- `reason`
|
|
490
|
-
- `paths`
|
|
491
|
-
|
|
492
|
-
If a slice fails before producing evidence, still record it as a failed or partial `run` artifact rather than silently skipping it.
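
As a rough sketch of that rule, a failed slice can still be written out with every field from the list above present, leaving unknown values empty. The helper name `failed_run_artifact` and the literal verdict value `"failed"` are assumptions for illustration; the text does not fix the exact verdict vocabulary.

```python
# Field names taken from the `run` artifact list above.
RUN_ARTIFACT_FIELDS = (
    "campaign_id", "slice_id", "run_kind", "parent_run_id",
    "analysis_question", "fixed_conditions", "changed_factors",
    "metrics_summary", "metric_deltas", "success_criteria",
    "abandonment_criteria", "verdict", "reason", "paths",
)


def failed_run_artifact(known: dict, reason: str) -> dict:
    """Build a complete `run` artifact for a slice that produced no evidence.

    Hypothetical helper: fields that are not known yet are kept as None
    instead of being dropped, so the failure stays visible and auditable.
    """
    unknown = set(known) - set(RUN_ARTIFACT_FIELDS)
    if unknown:
        raise KeyError(f"unexpected fields: {sorted(unknown)}")
    artifact = {name: known.get(name) for name in RUN_ARTIFACT_FIELDS}
    artifact["verdict"] = "failed"  # assumed label; "partial" would be the other case
    artifact["reason"] = reason
    return artifact
```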
|
|
202
|
+
For concrete paper-facing cases:
|
|
493
203
|
|
|
494
|
-
|
|
495
|
-
|
|
204
|
+
- if the slice is the only thing keeping a main-text section unsupported, make it `main_required` or `main_text`
|
|
205
|
+
- if the slice is useful but non-blocking, make it `appendix`
|
|
206
|
+
- if the slice is informative but not meant for the manuscript, keep it durable and mark it `reference_only` with a reason
|
|
207
|
+
- if a selected outline exists, map paper-ready slices to named `research_question` and `experimental_design` fields when those fields exist
|
|
208
|
+
- if `paper/paper_experiment_matrix.md` exists and the campaign is directly supporting the paper, read it before launching or reordering the slice set
|
|
209
|
+
- for writing-facing campaigns, prefer stable ids such as `exp_id`, `todo_id`, or `slice_id` over free-form notes
|
|
210
|
+
- paper-ready slices should carry the available write-back fields such as `paper_role`, `section_id`, `item_id`, `claim_links`, `analysis_role`, `reviewer_question`, `target_display`, `main_or_appendix`, and `failure_interpretation` when those fields exist in the paper contract
|
|
211
|
+
- paper-ready slices should record whether they support the latest method, an older comparator, a failure mode, or an appendix-only sanity check
|
|
212
|
+
- paper-ready slices should label implementation/setup details as `reproducibility_detail` or `internal_only` when they should not become main-text prose
|
|
213
|
+
- after every completed paper-ready slice, update or verify the relevant paper experiment matrix, section notes, evidence ledger, or active paper-line summary
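
The first three cases in the list above amount to a small decision rule. The sketch below is one possible reading, not the package's implementation: `paper_role` and its boolean parameters are hypothetical, and it returns `main_required` for the blocking case even though the text allows either `main_required` or `main_text`.

```python
def paper_role(blocks_main_text: bool, for_manuscript: bool, reason: str = "") -> str:
    """Pick a paper-facing role for a completed slice (illustrative only)."""
    if blocks_main_text:
        # The slice is the only thing keeping a main-text section unsupported.
        return "main_required"
    if for_manuscript:
        # Useful but non-blocking.
        return "appendix"
    if not reason:
        # reference_only slices stay durable but must say why they are excluded.
        raise ValueError("reference_only requires a reason")
    return "reference_only"
```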
|
|
214
|
+
|
|
215
|
+
Do not leave a slice "completed" while the paper contract still looks stale and that slice is meant to unblock the paper.
|
|
216
|
+
If no selected outline exists yet but the evidence question is needed to decide whether writing is worthwhile, run it as pre-outline analysis and route to `write` or `decision` afterward.
|
|
217
|
+
|
|
218
|
+
## Durable route records
|
|
219
|
+
|
|
220
|
+
Durable records are required in substance, not in fixed filenames.
|
|
221
|
+
The agent may choose the shortest durable form that lets a later turn resume without guessing.
|
|
222
|
+
|
|
223
|
+
For multi-slice, writing-facing, route-changing, expensive, unstable, or long-running analysis, leave a route record that states:
|
|
224
|
+
|
|
225
|
+
- parent object and parent claim
|
|
226
|
+
- acceptance or stop condition
|
|
227
|
+
- slice list or first slice frontier
|
|
228
|
+
- comparability boundary
|
|
229
|
+
- execution envelope and the slices ruled infeasible under it
|
|
230
|
+
- available assets and required comparators
|
|
231
|
+
- evidence paths or expected outputs
|
|
232
|
+
- current blocker or fallback
|
|
233
|
+
- next route after success or failure
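
Because the record is required in substance rather than in a fixed file, a simple completeness check can be applied to whatever form is chosen. The sketch below assumes the record has been loaded into a dict; the key names are hypothetical snake_case renderings of the bullets above, not a schema defined by the package.

```python
# Hypothetical keys, one per bullet in the route-record list above.
ROUTE_RECORD_FIELDS = (
    "parent_object_and_claim",
    "acceptance_or_stop_condition",
    "slice_list_or_first_frontier",
    "comparability_boundary",
    "execution_envelope",
    "assets_and_comparators",
    "evidence_paths_or_expected_outputs",
    "current_blocker_or_fallback",
    "next_route",
)


def missing_route_fields(record: dict) -> list[str]:
    """Return the route-record fields that are absent or empty."""
    return [name for name in ROUTE_RECORD_FIELDS if not record.get(name)]
```

An empty result means a later turn can resume from the record without guessing; a non-empty result names exactly what still has to be written down.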
|
|
234
|
+
|
|
235
|
+
`PLAN.md`, `CHECKLIST.md`, `paper/paper_experiment_matrix.md`, and local matrix or checklist files are allowed control surfaces, not mandatory success paths.
|
|
236
|
+
Use `references/campaign-plan-template.md` and `references/campaign-checklist-template.md` when they help, but do not expand them into paperwork for its own sake.
|
|
237
|
+
|
|
238
|
+
If slice feasibility, ordering, comparators, or campaign interpretation changes materially, revise the durable route record before spending more compute.
|
|
239
|
+
|
|
240
|
+
## Operational guidance
|
|
241
|
+
|
|
242
|
+
The main skill keeps the control surface in front.
|
|
243
|
+
For the longer operational notes, read `references/operational-guidance.md`.
|
|
244
|
+
|
|
245
|
+
- use it when the route needs the exact artifact-backed campaign tactics
|
|
246
|
+
- use it when execution monitoring, stall handling, or slice recording details matter
|
|
247
|
+
- use it when memory handling or connector-facing chart notes materially affect the route
|
|
248
|
+
|
|
249
|
+
## Negative cases and stop rules
|
|
250
|
+
|
|
251
|
+
Do not treat analysis as successful when:
|
|
252
|
+
|
|
253
|
+
- slices do not map to a parent claim, parent result, paper gap, reviewer item, or decision
|
|
254
|
+
- a summary claims stable support without per-slice evidence
|
|
255
|
+
- negative, null, contradictory, failed, or partial slices are hidden
|
|
256
|
+
- an ablation changes many factors but is interpreted as isolating one factor
|
|
257
|
+
- a robustness slice changes dataset, split, or evaluation protocol but is reported as direct apples-to-apples comparison
|
|
258
|
+
- subjective or manual inspection supports a claim without rubric, sample, prompt, trace, or caveat
|
|
259
|
+
- a writing-facing slice is called paper-ready but cannot be mapped back to the paper matrix, evidence ledger, outline, claim, section, or reviewer item
|
|
260
|
+
- a completed paper-relevant slice remains visible only as a free-floating analysis result and is not bound back into the current paper contract
|
|
261
|
+
- a failed slice is silently skipped and replaced by a different slice
|
|
262
|
+
- the campaign keeps expanding after the next route is already clear
|
|
263
|
+
- the campaign scope assumes hardware, memory, or runtime that is not actually available in the current environment
|
|
264
|
+
- a new comparator overwrites the canonical quest baseline gate instead of being recorded as analysis-local comparison evidence
|
|
265
|
+
- the underlying main result is still untrusted and the proposed work is really baseline recovery or a new main experiment
|
|
266
|
+
- a new main experiment is disguised as an analysis slice to bypass the main-experiment gate
|
|
267
|
+
|
|
268
|
+
If two slices in a row fail to change the claim boundary, matrix frontier, or next route, stop widening the campaign and route through `decision`, `write`, `experiment`, or an explicit blocker.
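
The two-in-a-row stop rule can be expressed as a check over the campaign's slice history. This is a minimal sketch: `should_stop_widening` is a hypothetical name, and each boolean is assumed to record whether that slice changed the claim boundary, matrix frontier, or next route.

```python
def should_stop_widening(slice_changed_route: list[bool], window: int = 2) -> bool:
    """Return True when the last `window` slices all failed to move the campaign.

    `slice_changed_route` is ordered oldest to newest.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(slice_changed_route) < window:
        # Not enough history yet to apply the rule.
        return False
    return not any(slice_changed_route[-window:])
```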
|
|
269
|
+
|
|
270
|
+
Record blocked or failed campaign states explicitly, such as missing parent run, under-specified analysis question, run failure before evidence, non-comparable metrics, missing assets, missing credentials, or still-ambiguous campaign conclusion.
|
|
271
|
+
A blocked campaign should still name the next best action.
|
|
496
272
|
|
|
497
|
-
|
|
273
|
+
## Aggregation and reporting
|
|
498
274
|
|
|
499
|
-
|
|
275
|
+
Campaign reporting should explain:
|
|
500
276
|
|
|
501
277
|
- which findings are stable
|
|
502
278
|
- which findings are fragile
|
|
503
279
|
- what changed the interpretation of the main result
|
|
504
280
|
- which open questions remain
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
|
|
508
|
-
|
|
509
|
-
|
|
510
|
-
|
|
511
|
-
-
|
|
512
|
-
|
|
513
|
-
- partial support
|
|
514
|
-
- contradiction
|
|
515
|
-
- unresolved ambiguity
|
|
516
|
-
|
|
281
|
+
- whether the main claim should be strengthened, weakened, narrowed, abandoned, or left ambiguous
|
|
282
|
+
- which slice changed the interpretation most
|
|
283
|
+
- which planned slices were intentionally skipped because earlier results made them low value
|
|
284
|
+
|
|
285
|
+
Focus on the highest-impact findings first.
|
|
286
|
+
Results matter more than process narration.
|
|
287
|
+
If using tables, show only the most decision-relevant rows.
|
|
288
|
+
Separate stable support, partial support, contradiction, and unresolved ambiguity.
|
|
517
289
|
When there are many slices, summarize the top `3-5` most important ones first, then point to the full evidence paths.
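
A sketch of that selection step, assuming each slice summary is a dict with a numeric `impact` score. Both the `impact` key and the `top_slices` helper are hypothetical; the text does not define how importance is scored.

```python
def top_slices(slices: list[dict], k: int = 5) -> list[dict]:
    """Return the k highest-impact slices, most important first.

    Ties keep their original order because Python's sort is stable.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return sorted(slices, key=lambda s: s["impact"], reverse=True)[:k]
```

The remaining slices are not dropped from the record; the report would still point to their full evidence paths.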
|
|
518
290
|
|
|
519
|
-
The aggregated report should also answer:
|
|
520
|
-
|
|
521
|
-
- should the main claim be strengthened, weakened, narrowed, or abandoned?
|
|
522
|
-
- which slice changed the interpretation most?
|
|
523
|
-
- which slice is still worth rerunning, and why?
|
|
524
|
-
- which planned slices were intentionally skipped because earlier results made them low value?
|
|
525
|
-
|
|
526
|
-
When the aggregated campaign report is complete, send a richer threaded `artifact.interact(kind='milestone', ...)` update.
|
|
527
|
-
Lead that milestone with a concise `1-2` sentence campaign outcome summary before expanding into slice-level detail.
|
|
528
|
-
|
|
529
|
-
If QQ milestone media is enabled and the aggregated report materially changes the claim boundary, you may attach one campaign summary PNG to that closing milestone update.
|
|
530
|
-
That update should explicitly classify the campaign outcome in the same language as the report:
|
|
531
|
-
|
|
532
|
-
- stable support
|
|
533
|
-
- partial support
|
|
534
|
-
- contradiction
|
|
535
|
-
- unresolved ambiguity
|
|
536
|
-
|
|
537
|
-
### 6. Route the next step
|
|
538
|
-
|
|
539
|
-
A campaign should end with an explicit next move:
|
|
540
|
-
|
|
541
|
-
- continue the campaign
|
|
542
|
-
- return to `experiment`
|
|
543
|
-
- move to `write`
|
|
544
|
-
- stop or reset the current line
|
|
545
|
-
|
|
546
|
-
Record the post-campaign route as a `decision` artifact.
|
|
547
|
-
When helpful, include a reflection block with:
|
|
548
|
-
|
|
549
|
-
- `what_worked`
|
|
550
|
-
- `what_failed`
|
|
551
|
-
- `learned_constraints`
|
|
552
|
-
|
|
553
|
-
and a `next_direction` block that states:
|
|
554
|
-
|
|
555
|
-
- objective
|
|
556
|
-
- key steps
|
|
557
|
-
- success criteria
|
|
558
|
-
- abandonment criteria
|
|
559
|
-
|
|
560
|
-
This makes the next stage executable without guesswork.
|
|
561
|
-
|
|
562
|
-
## Analysis-quality rules
|
|
563
|
-
|
|
564
|
-
Good campaign behavior:
|
|
565
|
-
|
|
566
|
-
- one clear question per run
|
|
567
|
-
- one-factor-at-a-time changes when possible
|
|
568
|
-
- clear comparison against the accepted reference line
|
|
569
|
-
- visibility of null and negative findings
|
|
570
|
-
- a logically ordered suite rather than a random batch
|
|
571
|
-
|
|
572
|
-
Strong campaign ordering usually looks like:
|
|
573
|
-
|
|
574
|
-
1. most claim-critical ablation or comparison
|
|
575
|
-
2. strongest robustness or sensitivity checks
|
|
576
|
-
3. failure-mode or error analysis
|
|
577
|
-
4. efficiency or secondary analysis
|
|
578
|
-
|
|
579
|
-
The exact order can vary, but the most claim-relevant evidence should appear first.
|
|
580
|
-
|
|
581
|
-
Weak campaign behavior:
|
|
582
|
-
|
|
583
|
-
- hidden scope expansion
|
|
584
|
-
- many untracked simultaneous changes
|
|
585
|
-
- campaign summary without per-run evidence
|
|
586
|
-
- ignoring contradictory analysis results
|
|
587
|
-
- reporting every minor slice with equal weight instead of prioritizing the important ones
|
|
588
|
-
|
|
589
|
-
## Memory rules
|
|
590
|
-
|
|
591
|
-
Stage-start requirement:
|
|
592
|
-
|
|
593
|
-
- begin every analysis campaign pass with `memory.list_recent(scope='quest', limit=5)`
|
|
594
|
-
- then run at least one analysis-relevant `memory.search(...)` before launching or resuming slices
|
|
595
|
-
- if several campaigns, parent runs, or idea lines exist, narrow retrieval to the current `campaign_id`, `parent_run_id`, `idea_id`, or `branch` instead of mixing unrelated slice memory
|
|
596
|
-
|
|
597
|
-
Write to memory only when the campaign yields reusable lessons, such as:
|
|
598
|
-
|
|
599
|
-
- robust failure patterns
|
|
600
|
-
- evaluation caveats
|
|
601
|
-
- reproducible sensitivity findings
|
|
602
|
-
|
|
603
|
-
Stage-end requirement:
|
|
604
|
-
|
|
605
|
-
- if the campaign produced a durable cross-slice lesson, failure pattern, or comparability caveat, write at least one `memory.write(...)` before leaving the stage
|
|
606
|
-
|
|
607
|
-
The campaign’s main record belongs in run artifacts and the aggregated report.
|
|
608
|
-
When synthesizing the campaign, read the per-slice `evaluation_summary` fields first, then expand into longer evidence only where the short summaries are still ambiguous.
|
|
609
|
-
|
|
610
|
-
## Artifact rules
|
|
611
|
-
|
|
612
|
-
Typical artifact sequence:
|
|
613
|
-
|
|
614
|
-
- decision artifact to launch the campaign
|
|
615
|
-
- report artifact for the charter
|
|
616
|
-
- progress artifacts during long campaigns
|
|
617
|
-
- run artifacts per analysis slice
|
|
618
|
-
- report artifact for the aggregated campaign summary
|
|
619
|
-
- decision artifact for the next anchor
|
|
620
|
-
|
|
621
|
-
## Failure and blocked handling
|
|
622
|
-
|
|
623
|
-
Record blocked or failed campaign states explicitly, such as:
|
|
624
|
-
|
|
625
|
-
- missing parent run
|
|
626
|
-
- analysis question under-specified
|
|
627
|
-
- campaign run failed before evidence was produced
|
|
628
|
-
- metrics not comparable
|
|
629
|
-
- campaign conclusion still ambiguous
|
|
630
|
-
|
|
631
|
-
A blocked campaign should still name the next best action.
|
|
632
|
-
|
|
633
291
|
## Exit criteria
|
|
634
292
|
|
|
635
|
-
Exit
|
|
293
|
+
Exit once one of these is durably true:
|
|
636
294
|
|
|
637
295
|
- the campaign produced enough evidence for writing or decision-making
|
|
638
|
-
- the campaign exposed a problem that requires returning to `experiment` or `
|
|
296
|
+
- the campaign exposed a problem that requires returning to `experiment`, `idea`, baseline recovery, or `decision`
|
|
639
297
|
- the campaign is blocked and the blocker is durably recorded
|
|
298
|
+
- the campaign route changed because the original slice set is no longer the best evidence-per-cost path
|
|
299
|
+
|
|
300
|
+
A good campaign closes when the claim got stronger, weaker, narrower, abandoned, or clearly stuck, not when more slice ideas merely remain possible.
|