@researai/deepscientist 1.5.9 → 1.5.12

Files changed (165)
  1. package/README.md +112 -99
  2. package/assets/branding/connector-qq.png +0 -0
  3. package/assets/branding/connector-rokid.png +0 -0
  4. package/assets/branding/connector-weixin.png +0 -0
  5. package/assets/branding/projects.png +0 -0
  6. package/bin/ds.js +519 -63
  7. package/docs/assets/branding/projects.png +0 -0
  8. package/docs/en/00_QUICK_START.md +338 -68
  9. package/docs/en/01_SETTINGS_REFERENCE.md +14 -0
  10. package/docs/en/02_START_RESEARCH_GUIDE.md +180 -4
  11. package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +62 -179
  12. package/docs/en/09_DOCTOR.md +66 -5
  13. package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +137 -0
  14. package/docs/en/11_LICENSE_AND_RISK.md +256 -0
  15. package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +446 -0
  16. package/docs/en/13_CORE_ARCHITECTURE_GUIDE.md +297 -0
  17. package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +506 -0
  18. package/docs/en/15_CODEX_PROVIDER_SETUP.md +284 -0
  19. package/docs/en/99_ACKNOWLEDGEMENTS.md +4 -1
  20. package/docs/en/README.md +83 -0
  21. package/docs/images/lingzhu/rokid-agent-platform-create.png +0 -0
  22. package/docs/images/weixin/weixin-plugin-entry.png +0 -0
  23. package/docs/images/weixin/weixin-plugin-entry.svg +33 -0
  24. package/docs/images/weixin/weixin-qr-confirm.svg +30 -0
  25. package/docs/images/weixin/weixin-quest-media-flow.svg +44 -0
  26. package/docs/images/weixin/weixin-settings-bind.svg +57 -0
  27. package/docs/zh/00_QUICK_START.md +345 -72
  28. package/docs/zh/01_SETTINGS_REFERENCE.md +14 -0
  29. package/docs/zh/02_START_RESEARCH_GUIDE.md +181 -3
  30. package/docs/zh/04_LINGZHU_CONNECTOR_GUIDE.md +62 -193
  31. package/docs/zh/09_DOCTOR.md +68 -5
  32. package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +144 -0
  33. package/docs/zh/11_LICENSE_AND_RISK.md +256 -0
  34. package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +442 -0
  35. package/docs/zh/13_CORE_ARCHITECTURE_GUIDE.md +296 -0
  36. package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +506 -0
  37. package/docs/zh/15_CODEX_PROVIDER_SETUP.md +285 -0
  38. package/docs/zh/99_ACKNOWLEDGEMENTS.md +4 -1
  39. package/docs/zh/README.md +129 -0
  40. package/install.sh +0 -34
  41. package/package.json +2 -2
  42. package/pyproject.toml +1 -1
  43. package/src/deepscientist/__init__.py +1 -1
  44. package/src/deepscientist/annotations.py +343 -0
  45. package/src/deepscientist/artifact/arxiv.py +484 -37
  46. package/src/deepscientist/artifact/service.py +574 -108
  47. package/src/deepscientist/arxiv_library.py +275 -0
  48. package/src/deepscientist/bash_exec/monitor.py +7 -5
  49. package/src/deepscientist/bash_exec/service.py +93 -21
  50. package/src/deepscientist/bridges/builtins.py +2 -0
  51. package/src/deepscientist/bridges/connectors.py +447 -0
  52. package/src/deepscientist/channels/__init__.py +2 -0
  53. package/src/deepscientist/channels/builtins.py +3 -1
  54. package/src/deepscientist/channels/local.py +3 -3
  55. package/src/deepscientist/channels/qq.py +8 -8
  56. package/src/deepscientist/channels/qq_gateway.py +1 -1
  57. package/src/deepscientist/channels/relay.py +14 -8
  58. package/src/deepscientist/channels/weixin.py +59 -0
  59. package/src/deepscientist/channels/weixin_ilink.py +388 -0
  60. package/src/deepscientist/config/models.py +23 -2
  61. package/src/deepscientist/config/service.py +539 -67
  62. package/src/deepscientist/connector/__init__.py +4 -0
  63. package/src/deepscientist/connector/connector_profiles.py +481 -0
  64. package/src/deepscientist/connector/lingzhu_support.py +668 -0
  65. package/src/deepscientist/connector/qq_profiles.py +206 -0
  66. package/src/deepscientist/connector/weixin_support.py +663 -0
  67. package/src/deepscientist/connector_profiles.py +1 -374
  68. package/src/deepscientist/connector_runtime.py +2 -0
  69. package/src/deepscientist/daemon/api/handlers.py +165 -5
  70. package/src/deepscientist/daemon/api/router.py +13 -1
  71. package/src/deepscientist/daemon/app.py +1444 -67
  72. package/src/deepscientist/doctor.py +4 -5
  73. package/src/deepscientist/gitops/diff.py +120 -29
  74. package/src/deepscientist/lingzhu_support.py +1 -182
  75. package/src/deepscientist/mcp/server.py +135 -7
  76. package/src/deepscientist/prompts/builder.py +128 -11
  77. package/src/deepscientist/qq_profiles.py +1 -196
  78. package/src/deepscientist/quest/node_traces.py +23 -0
  79. package/src/deepscientist/quest/service.py +359 -74
  80. package/src/deepscientist/quest/stage_views.py +71 -5
  81. package/src/deepscientist/runners/codex.py +170 -19
  82. package/src/deepscientist/runners/runtime_overrides.py +6 -0
  83. package/src/deepscientist/shared.py +33 -14
  84. package/src/deepscientist/weixin_support.py +1 -0
  85. package/src/prompts/connectors/lingzhu.md +3 -1
  86. package/src/prompts/connectors/qq.md +2 -1
  87. package/src/prompts/connectors/weixin.md +231 -0
  88. package/src/prompts/contracts/shared_interaction.md +4 -1
  89. package/src/prompts/system.md +61 -9
  90. package/src/skills/analysis-campaign/SKILL.md +46 -6
  91. package/src/skills/analysis-campaign/references/campaign-plan-template.md +21 -8
  92. package/src/skills/baseline/SKILL.md +1 -1
  93. package/src/skills/decision/SKILL.md +1 -1
  94. package/src/skills/experiment/SKILL.md +1 -1
  95. package/src/skills/finalize/SKILL.md +1 -1
  96. package/src/skills/idea/SKILL.md +1 -1
  97. package/src/skills/intake-audit/SKILL.md +1 -1
  98. package/src/skills/rebuttal/SKILL.md +74 -1
  99. package/src/skills/rebuttal/references/response-letter-template.md +55 -11
  100. package/src/skills/review/SKILL.md +118 -1
  101. package/src/skills/review/references/experiment-todo-template.md +23 -0
  102. package/src/skills/review/references/review-report-template.md +16 -0
  103. package/src/skills/review/references/revision-log-template.md +4 -0
  104. package/src/skills/scout/SKILL.md +1 -1
  105. package/src/skills/write/SKILL.md +168 -7
  106. package/src/skills/write/references/paper-experiment-matrix-template.md +131 -0
  107. package/src/tui/package.json +1 -1
  108. package/src/ui/dist/assets/{AiManusChatView-BKZ103sn.js → AiManusChatView-CnJcXynW.js} +156 -48
  109. package/src/ui/dist/assets/{AnalysisPlugin-mTTzGAlK.js → AnalysisPlugin-DeyzPEhV.js} +1 -1
  110. package/src/ui/dist/assets/{CliPlugin-BH58n3GY.js → CliPlugin-CB1YODQn.js} +164 -9
  111. package/src/ui/dist/assets/{CodeEditorPlugin-BKGRUH7e.js → CodeEditorPlugin-B-xicq1e.js} +8 -8
  112. package/src/ui/dist/assets/{CodeViewerPlugin-BMADwFWJ.js → CodeViewerPlugin-DT54ysXa.js} +5 -5
  113. package/src/ui/dist/assets/{DocViewerPlugin-ZOnTIHLN.js → DocViewerPlugin-DQtKT-VD.js} +3 -3
  114. package/src/ui/dist/assets/{GitDiffViewerPlugin-CQ7h1Djm.js → GitDiffViewerPlugin-hqHbCfnv.js} +20 -21
  115. package/src/ui/dist/assets/{ImageViewerPlugin-GVS5MsnC.js → ImageViewerPlugin-OcVo33jV.js} +5 -5
  116. package/src/ui/dist/assets/{LabCopilotPanel-BZNv1JML.js → LabCopilotPanel-DdGwhEUV.js} +11 -11
  117. package/src/ui/dist/assets/{LabPlugin-TWcJsdQA.js → LabPlugin-Ciz1gDaX.js} +2 -1
  118. package/src/ui/dist/assets/{LatexPlugin-DIjHiR2x.js → LatexPlugin-BhmjNQRC.js} +37 -11
  119. package/src/ui/dist/assets/{MarkdownViewerPlugin-D3ooGAH0.js → MarkdownViewerPlugin-BzdVH9Bx.js} +4 -4
  120. package/src/ui/dist/assets/{MarketplacePlugin-DfVfE9hN.js → MarketplacePlugin-DmyHspXt.js} +3 -3
  121. package/src/ui/dist/assets/{NotebookEditor-DDl0_Mc0.js → NotebookEditor-BMXKrDRk.js} +1 -1
  122. package/src/ui/dist/assets/{NotebookEditor-s8JhzuX1.js → NotebookEditor-BTVYRGkm.js} +12 -12
  123. package/src/ui/dist/assets/{PdfLoader-C2Sf6SJM.js → PdfLoader-CvcjJHXv.js} +14 -7
  124. package/src/ui/dist/assets/{PdfMarkdownPlugin-CXFLoIsa.js → PdfMarkdownPlugin-DW2ej8Vk.js} +73 -6
  125. package/src/ui/dist/assets/{PdfViewerPlugin-BYTmz2fK.js → PdfViewerPlugin-CmlDxbhU.js} +103 -34
  126. package/src/ui/dist/assets/PdfViewerPlugin-DQ11QcSf.css +3627 -0
  127. package/src/ui/dist/assets/{SearchPlugin-CjWBI1O9.js → SearchPlugin-DAjQZPSv.js} +1 -1
  128. package/src/ui/dist/assets/{TextViewerPlugin-DdOBU3-S.js → TextViewerPlugin-C-nVAZb_.js} +5 -4
  129. package/src/ui/dist/assets/{VNCViewer-B8HGgLwQ.js → VNCViewer-D7-dIYon.js} +10 -10
  130. package/src/ui/dist/assets/bot-C_G4WtNI.js +21 -0
  131. package/src/ui/dist/assets/branding/logo-rokid.png +0 -0
  132. package/src/ui/dist/assets/browser-BAcuE0Xj.js +2895 -0
  133. package/src/ui/dist/assets/{code-BWAY76JP.js → code-Cd7WfiWq.js} +1 -1
  134. package/src/ui/dist/assets/{file-content-C1NwU5oQ.js → file-content-B57zsL9y.js} +1 -1
  135. package/src/ui/dist/assets/{file-diff-panel-CywslwB9.js → file-diff-panel-DVoheLFq.js} +1 -1
  136. package/src/ui/dist/assets/{file-socket-B4kzuOBQ.js → file-socket-B5kXFxZP.js} +1 -1
  137. package/src/ui/dist/assets/{image-D-NZM-6P.js → image-LLOjkMHF.js} +1 -1
  138. package/src/ui/dist/assets/{index-DGIYDuTv.css → index-BQG-1s2o.css} +40 -13
  139. package/src/ui/dist/assets/{index-DHZJ_0TI.js → index-C3r2iGrp.js} +12 -12
  140. package/src/ui/dist/assets/{index-7Chr1g9c.js → index-CLQauncb.js} +15050 -9561
  141. package/src/ui/dist/assets/index-Dxa2eYMY.js +25 -0
  142. package/src/ui/dist/assets/{index-BdM1Gqfr.js → index-hOUOWbW2.js} +2 -2
  143. package/src/ui/dist/assets/{monaco-Cb2uKKe6.js → monaco-BGGAEii3.js} +1 -1
  144. package/src/ui/dist/assets/{pdf-effect-queue-DSw_D3RV.js → pdf-effect-queue-DlEr1_y5.js} +16 -1
  145. package/src/ui/dist/assets/pdf.worker.min-yatZIOMy.mjs +21 -0
  146. package/src/ui/dist/assets/{popover-Bg72DGgT.js → popover-CWJbJuYY.js} +1 -1
  147. package/src/ui/dist/assets/{project-sync-Ce_0BglY.js → project-sync-CRJiucYO.js} +18 -77
  148. package/src/ui/dist/assets/select-CoHB7pvH.js +1690 -0
  149. package/src/ui/dist/assets/{sigma-DPaACDrh.js → sigma-D5aJWR8J.js} +1 -1
  150. package/src/ui/dist/assets/{index-CDxNdQdz.js → square-check-big-DUK_mnkS.js} +2 -13
  151. package/src/ui/dist/assets/{trash-BvTgE5__.js → trash-ChU3SEE3.js} +1 -1
  152. package/src/ui/dist/assets/{useCliAccess-CgPeMOwP.js → useCliAccess-BrJBV3tY.js} +1 -1
  153. package/src/ui/dist/assets/{useFileDiffOverlay-xPhz7P5B.js → useFileDiffOverlay-C2OQaVWc.js} +1 -1
  154. package/src/ui/dist/assets/{wrap-text-C3Un3YQr.js → wrap-text-C7Qqh-om.js} +1 -1
  155. package/src/ui/dist/assets/{zoom-out-BgxLa0Ri.js → zoom-out-rtX0FKya.js} +1 -1
  156. package/src/ui/dist/index.html +2 -2
  157. package/src/ui/dist/assets/AutoFigurePlugin-BGxN8Umr.css +0 -3056
  158. package/src/ui/dist/assets/AutoFigurePlugin-C_wWw4AP.js +0 -8149
  159. package/src/ui/dist/assets/PdfViewerPlugin-BJXtIwj_.css +0 -260
  160. package/src/ui/dist/assets/Stepper-B0Dd8CxK.js +0 -158
  161. package/src/ui/dist/assets/bibtex-CKaefIN2.js +0 -189
  162. package/src/ui/dist/assets/file-utils-H2fjA46S.js +0 -109
  163. package/src/ui/dist/assets/message-square-BzjLiXir.js +0 -16
  164. package/src/ui/dist/assets/pdfjs-DU1YE8WO.js +0 -3
  165. package/src/ui/dist/assets/tooltip-C_mA6R0w.js +0 -108
@@ -20,7 +20,7 @@ This skill intentionally absorbs the strongest old DeepScientist writing discipl
  ## Interaction discipline

  - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
  - Prefer `bash_exec` for durable document-build commands such as LaTeX compilation, figure regeneration, and scripted export steps so logs remain quest-local and reviewable.
  - Keep ordinary subtask completions concise. When a paper/draft milestone is actually completed, upgrade to a richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` report instead of another short progress update.
  - That richer writing-stage milestone report should normally cover: which draft, section, or outline milestone finished, what is now supportable, what is still missing, and the exact recommended next revision or route decision.
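The tightened cadence in this hunk (roughly 6 tool calls with a meaningful delta, and no more than roughly 12 tool calls or about 8 minutes without an update) can be read as a simple decision rule. The sketch below is only an illustration of that reading: `ProgressState`, `update_due`, and the constant names are hypothetical and are not part of DeepScientist, and the skill text treats the numbers as approximate guidance rather than hard limits.

```python
from dataclasses import dataclass

# Thresholds taken from the new wording of the skill; the text calls them "roughly".
SOFT_TOOL_CALLS = 6    # prefer an update once this many calls produced a meaningful delta
HARD_TOOL_CALLS = 12   # do not drift beyond about this many calls without an update
HARD_MINUTES = 8.0     # ... or about this many minutes without an update


@dataclass
class ProgressState:
    """Hypothetical bookkeeping since the last user-visible update."""
    tool_calls_since_update: int
    minutes_since_update: float
    has_meaningful_delta: bool


def update_due(state: ProgressState) -> bool:
    """Return True when a concise progress update should be sent."""
    # Hard ceilings apply even without a human-meaningful delta.
    if state.tool_calls_since_update >= HARD_TOOL_CALLS:
        return True
    if state.minutes_since_update >= HARD_MINUTES:
        return True
    # The soft threshold only triggers when there is something worth reporting.
    return (
        state.tool_calls_since_update >= SOFT_TOOL_CALLS
        and state.has_meaningful_delta
    )
```

Under this reading, six calls with nothing new to report do not force an update, while twelve calls or eight minutes always do.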
@@ -146,6 +146,8 @@ The write stage should usually produce most of the following:

  - `paper/outline.md` or equivalent outline
  - `paper/selected_outline.json`
+ - `paper/paper_experiment_matrix.md`
+ - `paper/paper_experiment_matrix.json`
  - `paper/outline_selection.md`
  - `paper/reviewer_first_pass.md`
  - `paper/section_contracts.md`
@@ -202,6 +204,144 @@ At minimum, repeatedly verify:
  - figure and table provenance
  - file inclusion integrity for the draft or bundle

+ ## Paper experiment matrix contract
+
+ For any paper-like writing line that has more than a trivial single-result story, create and maintain:
+
+ - `paper/paper_experiment_matrix.md`
+ - `paper/paper_experiment_matrix.json`
+
+ Use `references/paper-experiment-matrix-template.md` when helpful.
+
+ The paper experiment matrix is the durable experiment-control surface for the paper line.
+ It exists to prevent two common failures:
+
+ - an outline that overweights post-hoc analysis and under-specifies paper-typical experiments
+ - a drifting supplementary-experiment queue where runs are launched ad hoc without a full paper-facing plan
+
+ The matrix is not just an “analysis list”.
+ It should cover the full paper-facing experiment program beyond the already-finished main run, including:
+
+ - main comparison surfaces that still need packaging or extension
+ - component ablations
+ - sensitivity / hyperparameter checks
+ - robustness or stress checks
+ - efficiency / cost / latency / token-overhead checks when the method may have a strong deployment or efficiency story
+ - highlight-validation experiments that test the method's most likely reader-facing strengths rather than merely assuming those strengths
+ - failure-boundary or limitation-surface analyses
+ - case study or trace walkthrough rows as optional supporting material rather than mandatory core evidence
+
+ Case study is usually optional.
+ Do not let it displace stronger quantitative evidence.
+ Efficiency or cost experiments are not mandatory in every paper, but they should be added whenever:
+
+ - the method may be attractive partly because it is lightweight or prompt-level
+ - the overhead skepticism from reviewers is easy to anticipate
+ - a performance-over-cost tradeoff could become part of the paper's practical contribution
+
+ Highlight-validation rule:
+
+ - do not assume the method's strongest selling point is already obvious from the aggregate metric
+ - explicitly write down `highlight hypotheses`
+ - plan at least one experiment that could confirm or falsify each serious highlight hypothesis
+
+ Typical highlight hypotheses include:
+
+ - the method is more selective rather than merely more conservative
+ - the gain comes from a named mechanism rather than from generic stubbornness or scale
+ - the improvement concentrates on the intended failure regime
+ - the method keeps a strong performance / overhead tradeoff
+
+ Each matrix row should normally record at least:
+
+ - `exp_id`
+ - `title`
+ - `tier`
+   - `main_required`
+   - `main_optional`
+   - `appendix`
+   - `optional`
+   - `dropped`
+ - `experiment_type`
+   - `main_comparison`
+   - `component_ablation`
+   - `sensitivity`
+   - `robustness`
+   - `efficiency_cost`
+   - `highlight_validation`
+   - `failure_boundary`
+   - `case_study_optional`
+ - `status`
+   - `proposed`
+   - `planned`
+   - `ready`
+   - `running`
+   - `completed`
+   - `analyzed`
+   - `written`
+   - `excluded`
+   - `blocked`
+ - `feasibility_now`
+   - whether the row is runnable with current assets or still blocked
+ - `claim_ids`
+ - `highlight_ids`
+ - `research_question`
+ - `hypothesis`
+ - `why_this_matters`
+ - `comparators`
+ - `fixed_conditions`
+ - `changed_variables`
+ - `metrics`
+ - `cost_budget`
+ - `minimal_success_criterion`
+ - `promotion_rule`
+   - what result would move the row into main text
+   - what result keeps it appendix-only
+   - what result should exclude it
+ - `paper_placement`
+   - `main_text`
+   - `appendix`
+   - `maybe`
+   - `omit`
+ - `result_artifacts`
+ - `next_action`
+
+ The matrix should also contain:
+
+ - core paper claims
+ - highlight hypotheses
+ - a short experiment taxonomy summary
+ - the current execution frontier
+ - an explicit main-text gate
+ - a refresh log that records how priorities changed after new evidence arrived
+
+ Main-text drafting gate:
+
+ - do not treat the main experiments section as stable while any row that is both:
+   - currently feasible
+   - and not marked `optional` or `dropped`
+   remains unaddressed
+ - before the experiments section becomes stable, every currently feasible row should be:
+   - `completed`
+   - `analyzed`
+   - `excluded` with a real reason
+   - or `blocked` with a real reason
+
+ This does not forbid drafting the introduction, method, or placeholders early.
+ It does forbid pretending the paper's experimental story is settled while the feasible experiment frontier is still open.
+
+ After every meaningful experiment outcome, even a null result or exclusion:
+
+ - reopen the matrix first
+ - update the row status and feasibility
+ - update `paper_placement`
+ - update the claim and highlight impact
+ - update the priority order of the remaining rows
+ - then decide the next experiment or writing move
+
+ Do not decide the next supplementary experiment from memory alone when the matrix exists.
+ The matrix should be the authoritative experiment-routing surface for the paper line, and the selected outline's `experimental_designs` should stay consistent with that matrix rather than drifting away from it.
+
  ## Venue template selection

  For paper-like writing, use a real venue template rather than improvising a blank LaTeX tree.
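The row fields listed under "Paper experiment matrix contract" in this hunk include four enumerated fields (`tier`, `experiment_type`, `status`, `paper_placement`). A minimal sketch of a row check follows, assuming each row in `paper/paper_experiment_matrix.json` is a flat JSON object keyed by those field names. The diff does not specify the JSON layout, so that shape, and the helper name `validate_row`, are assumptions made only for illustration.

```python
from __future__ import annotations

# Allowed values copied from the contract text in the hunk above.
ENUM_FIELDS = {
    "tier": {"main_required", "main_optional", "appendix", "optional", "dropped"},
    "experiment_type": {
        "main_comparison", "component_ablation", "sensitivity", "robustness",
        "efficiency_cost", "highlight_validation", "failure_boundary",
        "case_study_optional",
    },
    "status": {
        "proposed", "planned", "ready", "running", "completed",
        "analyzed", "written", "excluded", "blocked",
    },
    "paper_placement": {"main_text", "appendix", "maybe", "omit"},
}

# Hypothetical minimal set of identifying fields; the contract lists many more.
REQUIRED_TEXT_FIELDS = ("exp_id", "title")


def validate_row(row: dict) -> list[str]:
    """Return human-readable problems for one matrix row (empty list if none)."""
    problems = []
    for field in REQUIRED_TEXT_FIELDS:
        if not row.get(field):
            problems.append(f"missing {field}")
    for field, allowed in ENUM_FIELDS.items():
        value = row.get(field)
        if value not in allowed:
            problems.append(f"{field}={value!r} is not one of {sorted(allowed)}")
    return problems
```

A check like this only catches vocabulary drift between the Markdown and JSON files; it says nothing about whether a row's hypothesis or promotion rule is sound.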
@@ -246,18 +386,20 @@ For paper-like deliverables, the safest default order is:
  3. choose the venue template from `templates/`, copy it into `paper/latex/`, and default general ML work to `templates/iclr2026/` unless a stronger venue target exists
  4. if the line benefits from an explicit outline contract, record one or more outline candidates with `artifact.submit_paper_outline(mode='candidate', ...)`
  5. if one outline should become the durable paper contract, select or revise it with `artifact.submit_paper_outline(mode='select'|'revise', ...)`
- 6. if the selected outline still exposes evidence gaps, launch an outline-bound `artifact.create_analysis_campaign(...)` before drafting
- 7. plan and generate decisive figures or tables
- 8. draft sections directly from the evidence and the current working outline; do not force extra outline rounds when direct drafting is clearer and safer
- 9. run harsh review and revision cycles
- 10. proof, package, submit `artifact.submit_paper_bundle(...)` when the bundle is ready, and then pass to `finalize`
- 11. if the final paper PDF exists and QQ milestone media is enabled in config, the bundle-ready milestone may attach that PDF once
+ 6. create or refresh `paper/paper_experiment_matrix.md` and `paper/paper_experiment_matrix.json` before stabilizing the experiments section
+ 7. if the selected outline or matrix still exposes evidence gaps, launch an outline-bound and matrix-bound `artifact.create_analysis_campaign(...)` before drafting the experiments section as if it were settled
+ 8. plan and generate decisive figures or tables
+ 9. draft sections directly from the evidence and the current working outline; do not force extra outline rounds when direct drafting is clearer and safer
+ 10. run harsh review and revision cycles
+ 11. proof, package, submit `artifact.submit_paper_bundle(...)` when the bundle is ready, and then pass to `finalize`
+ 12. if the final paper PDF exists and QQ milestone media is enabled in config, the bundle-ready milestone may attach that PDF once

  Before real drafting, force one explicit planning pass that stabilizes at least:

  - the current claim inventory
  - the claim-evidence map skeleton
  - the outline or outline candidates
+ - the paper experiment matrix
  - the figure/table plan
  - the main evidence gaps

@@ -273,6 +415,7 @@ For substantial paper-like writing, the durable writing plan should usually incl

  - section goals
  - paragraph or subsection intent when it materially affects correctness
+ - paper experiment matrix status and execution frontier
  - experiment-to-section mapping
  - figure/table-to-data-source mapping
  - citation/search plan
@@ -284,6 +427,7 @@ Do not let drafting quietly outrun the current evidence inventory.

  For reviewer-facing structure and section-level drafting contracts, read these references when the line needs sharper paper craft:

+ - `references/paper-experiment-matrix-template.md`
  - `references/reviewer-first-writing.md`
  - `references/section-contracts.md`
  - `references/sentence-level-proofing.md`
@@ -306,6 +450,21 @@ Also build an experiment inventory before outlining:
  - appendix-only evidence
  - unusable or too-weak evidence
  - verify that each planned main claim has at least one durable evidence path
+ - convert that inventory into the paper experiment matrix instead of leaving it as loose notes
+
+ When building the matrix, do not reduce the candidate pool to “analysis experiments”.
+ The inventory should explicitly consider:
+
+ - ablations
+ - robustness checks
+ - sensitivity or hyperparameter checks
+ - efficiency / cost / latency / token-overhead checks
+ - experiments aimed at validating likely highlights
+ - limitation-boundary analyses
+ - optional case studies
+
+ If the method appears to have a likely practical or deployment-facing strength, test it directly instead of burying that possibility in prose.
+ If the method appears to have a likely conceptual highlight, write the corresponding `highlight hypothesis` and treat it as something that still needs evidence rather than something to assume.

  If an experiment is too weak, too tiny, or poorly comparable, do not let it silently anchor a main claim.
  As a strong default, experiments with very small evaluation support, such as `<=10` effective examples or similarly fragile sample counts, should not carry a main-text claim unless the user explicitly accepts that limitation and the caveat is written next to the claim.
@@ -1083,3 +1242,5 @@ Exit the write stage only when one of the following is durably true:
  - the current draft is evidence-complete enough for `finalize`, including a selected outline and a durable paper bundle manifest when the deliverable is paper-like
  - a clear evidence gap has been recorded and the quest is routed backward
  - a packaging or proofing blocker has been recorded and the next action is explicit
+
+ For paper-like writing, do not treat the draft as evidence-complete enough for `finalize` while `paper/paper_experiment_matrix.*` still contains currently feasible non-optional rows that remain unresolved.
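The exit condition added here, like the main-text drafting gate earlier in the file, amounts to a filter over matrix rows. The sketch below shows one possible reading. It makes several assumptions that the diff does not confirm: rows are dictionaries with `exp_id`, `tier`, `status`, and `feasibility_now` keys, "currently feasible" means `feasibility_now == "feasible_now"`, and only the four statuses named by the gate count as resolved. `unresolved_rows` is a hypothetical helper name.

```python
# Statuses the gate accepts for a currently feasible row (per the skill text).
# Note: `written` is not in that list, so this sketch treats it as unresolved;
# adjust if the project intends otherwise.
RESOLVED_STATUSES = {"completed", "analyzed", "excluded", "blocked"}

# Tiers that the gate explicitly exempts.
NON_BLOCKING_TIERS = {"optional", "dropped"}

# Assumption: this feasibility value is what "currently feasible" refers to.
FEASIBLE_VALUE = "feasible_now"


def unresolved_rows(rows):
    """Return exp_ids of rows that still keep the experiments section open."""
    blocking = []
    for row in rows:
        if row.get("feasibility_now") != FEASIBLE_VALUE:
            continue  # not currently feasible, so the gate does not apply
        if row.get("tier") in NON_BLOCKING_TIERS:
            continue  # optional or dropped rows never block
        if row.get("status") in RESOLVED_STATUSES:
            continue  # already completed, analyzed, excluded, or blocked
        blocking.append(row.get("exp_id"))
    return blocking


def gate_open(rows):
    """True when no currently feasible, non-optional row remains unresolved."""
    return not unresolved_rows(rows)
```

The sketch does not verify that an `excluded` or `blocked` row carries a real reason, which the gate also asks for; that judgment stays with the author of the matrix.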
@@ -0,0 +1,131 @@
+ # Paper Experiment Matrix Template
+
+ Use this template when a paper-like line needs a durable experiment-control surface beyond the selected outline.
+
+ Create and maintain both:
+
+ - `paper/paper_experiment_matrix.md`
+ - `paper/paper_experiment_matrix.json`
+
+ The Markdown file is the human-facing control surface.
+ The JSON file is the machine-facing mirror.
+
+ ## 1. Current Judgment
+
+ - current judgment:
+ - why the matrix is needed now:
+ - what would make the experiments section stable:
+ - what still blocks stable paper writing:
+
+ ## 2. Core Claims
+
+ - `C1`:
+   - one-line claim:
+   - current support status:
+   - strongest current evidence:
+   - still-missing evidence:
+
+ - `C2`:
+   - one-line claim:
+   - current support status:
+   - strongest current evidence:
+   - still-missing evidence:
+
+ - `C3`:
+   - one-line claim:
+   - current support status:
+   - strongest current evidence:
+   - still-missing evidence:
+
+ ## 3. Highlight Hypotheses
+
+ Write only serious hypotheses that could matter to the paper's reader-facing value.
+ Do not assume the highlight is already true just because it sounds attractive.
+
+ - `H1`:
+   - one-line highlight:
+   - why it is plausible:
+   - validation rows:
+   - fallback if unsupported:
+
+ - `H2`:
+   - one-line highlight:
+   - why it is plausible:
+   - validation rows:
+   - fallback if unsupported:
+
+ ## 4. Taxonomy Summary
+
+ Check every category deliberately.
+ Do not collapse the matrix into “analysis only”.
+
+ - main comparison:
+ - component ablation:
+ - sensitivity:
+ - robustness:
+ - efficiency / cost:
+ - highlight validation:
+ - failure boundary:
+ - case study optional:
+
+ ## 5. Matrix Table
+
+ | Exp id | Title | Tier | Experiment type | Status | Feasibility now | Claim ids | Highlight ids | Research question | Metrics | Paper placement | Next action |
+ |---|---|---|---|---|---|---|---|---|---|---|---|
+ | | | main_required / main_optional / appendix / optional / dropped | main_comparison / component_ablation / sensitivity / robustness / efficiency_cost / highlight_validation / failure_boundary / case_study_optional | proposed / planned / ready / running / completed / analyzed / written / excluded / blocked | feasible_now / light_setup / blocked / uncertain | | | | | main_text / appendix / maybe / omit | |
+
+ ## 6. Detail Cards
+
+ Repeat one card per meaningful row.
+
+ ### EXP-001
+
+ - title:
+ - tier:
+ - experiment type:
+ - current status:
+ - feasibility now:
+ - why this row exists:
+ - research question:
+ - hypothesis:
+ - comparators:
+ - fixed conditions:
+ - changed variables:
+ - required metric(s):
+ - minimal success criterion:
+ - cost / runtime budget:
+ - promotion rule:
+   - main text if:
+   - appendix if:
+   - omit if:
+ - expected figure or table:
+ - result artifact paths:
+ - dependencies:
+ - next action:
+
+ ## 7. Execution Frontier
+
+ - rows ready now:
+ - rows blocked now:
+ - rows that must finish before the experiments section is stable:
+ - rows that are appendix-only and can wait:
+ - rows that are optional and should not block:
+
+ ## 8. Main-Text Gate
+
+ Do not treat the experiments section as stable while any currently feasible row that is not merely `optional` or `dropped` remains unresolved.
+
+ Every currently feasible non-optional row should be one of:
+
+ - completed
+ - analyzed
+ - excluded with reason
+ - blocked with reason
+
+ ## 9. Refresh Log
+
+ After every completed, excluded, or blocked slice, reopen the matrix first and update it before selecting the next run.
+
+ | Time | Exp id | What changed | Claim/highlight impact | Priority change | New next action |
+ |---|---|---|---|---|---|
+ | | | | | | |
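The template describes the Markdown file as the human-facing surface and the JSON file as its machine-facing mirror. One way to keep the two in step is to regenerate the section 5 table from the JSON rows. The sketch below assumes the JSON rows are objects keyed by the snake_case field names from the write skill (`exp_id`, `tier`, and so on); the template does not state the JSON schema, so that mapping and the function name `render_matrix_table` are illustrative assumptions.

```python
import json

# (json_key, table_header) pairs in the column order of the template's section 5 table.
# The snake_case keys are an assumed mapping onto the template's column titles.
COLUMNS = [
    ("exp_id", "Exp id"),
    ("title", "Title"),
    ("tier", "Tier"),
    ("experiment_type", "Experiment type"),
    ("status", "Status"),
    ("feasibility_now", "Feasibility now"),
    ("claim_ids", "Claim ids"),
    ("highlight_ids", "Highlight ids"),
    ("research_question", "Research question"),
    ("metrics", "Metrics"),
    ("paper_placement", "Paper placement"),
    ("next_action", "Next action"),
]


def _cell(value):
    """Format one value as table cell text, escaping pipes so the row stays intact."""
    if isinstance(value, (list, tuple)):
        value = ", ".join(str(item) for item in value)
    text = "" if value is None else str(value)
    return text.replace("|", "\\|")


def render_matrix_table(rows):
    """Render matrix rows (a list of dicts) as a Markdown table string."""
    header = "| " + " | ".join(label for _, label in COLUMNS) + " |"
    divider = "|" + "---|" * len(COLUMNS)
    body = [
        "| " + " | ".join(_cell(row.get(key)) for key, _ in COLUMNS) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


if __name__ == "__main__":
    # Example input shaped like the assumed JSON mirror.
    example = json.loads(
        '[{"exp_id": "EXP-001", "title": "Component ablation", '
        '"tier": "main_required", "claim_ids": ["C1", "C2"]}]'
    )
    print(render_matrix_table(example))
```

Generating the table from the JSON side keeps a single source of truth for row state, while the detail cards and refresh log remain hand-written in the Markdown file.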
@@ -1,6 +1,6 @@
  {
    "name": "deepscientist-tui",
-   "version": "1.5.9",
+   "version": "1.5.12",
    "private": true,
    "type": "module",
    "main": "dist/index.js",