@researai/deepscientist 1.5.9 → 1.5.12

Files changed (165)
  1. package/README.md +112 -99
  2. package/assets/branding/connector-qq.png +0 -0
  3. package/assets/branding/connector-rokid.png +0 -0
  4. package/assets/branding/connector-weixin.png +0 -0
  5. package/assets/branding/projects.png +0 -0
  6. package/bin/ds.js +519 -63
  7. package/docs/assets/branding/projects.png +0 -0
  8. package/docs/en/00_QUICK_START.md +338 -68
  9. package/docs/en/01_SETTINGS_REFERENCE.md +14 -0
  10. package/docs/en/02_START_RESEARCH_GUIDE.md +180 -4
  11. package/docs/en/04_LINGZHU_CONNECTOR_GUIDE.md +62 -179
  12. package/docs/en/09_DOCTOR.md +66 -5
  13. package/docs/en/10_WEIXIN_CONNECTOR_GUIDE.md +137 -0
  14. package/docs/en/11_LICENSE_AND_RISK.md +256 -0
  15. package/docs/en/12_GUIDED_WORKFLOW_TOUR.md +446 -0
  16. package/docs/en/13_CORE_ARCHITECTURE_GUIDE.md +297 -0
  17. package/docs/en/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +506 -0
  18. package/docs/en/15_CODEX_PROVIDER_SETUP.md +284 -0
  19. package/docs/en/99_ACKNOWLEDGEMENTS.md +4 -1
  20. package/docs/en/README.md +83 -0
  21. package/docs/images/lingzhu/rokid-agent-platform-create.png +0 -0
  22. package/docs/images/weixin/weixin-plugin-entry.png +0 -0
  23. package/docs/images/weixin/weixin-plugin-entry.svg +33 -0
  24. package/docs/images/weixin/weixin-qr-confirm.svg +30 -0
  25. package/docs/images/weixin/weixin-quest-media-flow.svg +44 -0
  26. package/docs/images/weixin/weixin-settings-bind.svg +57 -0
  27. package/docs/zh/00_QUICK_START.md +345 -72
  28. package/docs/zh/01_SETTINGS_REFERENCE.md +14 -0
  29. package/docs/zh/02_START_RESEARCH_GUIDE.md +181 -3
  30. package/docs/zh/04_LINGZHU_CONNECTOR_GUIDE.md +62 -193
  31. package/docs/zh/09_DOCTOR.md +68 -5
  32. package/docs/zh/10_WEIXIN_CONNECTOR_GUIDE.md +144 -0
  33. package/docs/zh/11_LICENSE_AND_RISK.md +256 -0
  34. package/docs/zh/12_GUIDED_WORKFLOW_TOUR.md +442 -0
  35. package/docs/zh/13_CORE_ARCHITECTURE_GUIDE.md +296 -0
  36. package/docs/zh/14_PROMPT_SKILLS_AND_MCP_GUIDE.md +506 -0
  37. package/docs/zh/15_CODEX_PROVIDER_SETUP.md +285 -0
  38. package/docs/zh/99_ACKNOWLEDGEMENTS.md +4 -1
  39. package/docs/zh/README.md +129 -0
  40. package/install.sh +0 -34
  41. package/package.json +2 -2
  42. package/pyproject.toml +1 -1
  43. package/src/deepscientist/__init__.py +1 -1
  44. package/src/deepscientist/annotations.py +343 -0
  45. package/src/deepscientist/artifact/arxiv.py +484 -37
  46. package/src/deepscientist/artifact/service.py +574 -108
  47. package/src/deepscientist/arxiv_library.py +275 -0
  48. package/src/deepscientist/bash_exec/monitor.py +7 -5
  49. package/src/deepscientist/bash_exec/service.py +93 -21
  50. package/src/deepscientist/bridges/builtins.py +2 -0
  51. package/src/deepscientist/bridges/connectors.py +447 -0
  52. package/src/deepscientist/channels/__init__.py +2 -0
  53. package/src/deepscientist/channels/builtins.py +3 -1
  54. package/src/deepscientist/channels/local.py +3 -3
  55. package/src/deepscientist/channels/qq.py +8 -8
  56. package/src/deepscientist/channels/qq_gateway.py +1 -1
  57. package/src/deepscientist/channels/relay.py +14 -8
  58. package/src/deepscientist/channels/weixin.py +59 -0
  59. package/src/deepscientist/channels/weixin_ilink.py +388 -0
  60. package/src/deepscientist/config/models.py +23 -2
  61. package/src/deepscientist/config/service.py +539 -67
  62. package/src/deepscientist/connector/__init__.py +4 -0
  63. package/src/deepscientist/connector/connector_profiles.py +481 -0
  64. package/src/deepscientist/connector/lingzhu_support.py +668 -0
  65. package/src/deepscientist/connector/qq_profiles.py +206 -0
  66. package/src/deepscientist/connector/weixin_support.py +663 -0
  67. package/src/deepscientist/connector_profiles.py +1 -374
  68. package/src/deepscientist/connector_runtime.py +2 -0
  69. package/src/deepscientist/daemon/api/handlers.py +165 -5
  70. package/src/deepscientist/daemon/api/router.py +13 -1
  71. package/src/deepscientist/daemon/app.py +1444 -67
  72. package/src/deepscientist/doctor.py +4 -5
  73. package/src/deepscientist/gitops/diff.py +120 -29
  74. package/src/deepscientist/lingzhu_support.py +1 -182
  75. package/src/deepscientist/mcp/server.py +135 -7
  76. package/src/deepscientist/prompts/builder.py +128 -11
  77. package/src/deepscientist/qq_profiles.py +1 -196
  78. package/src/deepscientist/quest/node_traces.py +23 -0
  79. package/src/deepscientist/quest/service.py +359 -74
  80. package/src/deepscientist/quest/stage_views.py +71 -5
  81. package/src/deepscientist/runners/codex.py +170 -19
  82. package/src/deepscientist/runners/runtime_overrides.py +6 -0
  83. package/src/deepscientist/shared.py +33 -14
  84. package/src/deepscientist/weixin_support.py +1 -0
  85. package/src/prompts/connectors/lingzhu.md +3 -1
  86. package/src/prompts/connectors/qq.md +2 -1
  87. package/src/prompts/connectors/weixin.md +231 -0
  88. package/src/prompts/contracts/shared_interaction.md +4 -1
  89. package/src/prompts/system.md +61 -9
  90. package/src/skills/analysis-campaign/SKILL.md +46 -6
  91. package/src/skills/analysis-campaign/references/campaign-plan-template.md +21 -8
  92. package/src/skills/baseline/SKILL.md +1 -1
  93. package/src/skills/decision/SKILL.md +1 -1
  94. package/src/skills/experiment/SKILL.md +1 -1
  95. package/src/skills/finalize/SKILL.md +1 -1
  96. package/src/skills/idea/SKILL.md +1 -1
  97. package/src/skills/intake-audit/SKILL.md +1 -1
  98. package/src/skills/rebuttal/SKILL.md +74 -1
  99. package/src/skills/rebuttal/references/response-letter-template.md +55 -11
  100. package/src/skills/review/SKILL.md +118 -1
  101. package/src/skills/review/references/experiment-todo-template.md +23 -0
  102. package/src/skills/review/references/review-report-template.md +16 -0
  103. package/src/skills/review/references/revision-log-template.md +4 -0
  104. package/src/skills/scout/SKILL.md +1 -1
  105. package/src/skills/write/SKILL.md +168 -7
  106. package/src/skills/write/references/paper-experiment-matrix-template.md +131 -0
  107. package/src/tui/package.json +1 -1
  108. package/src/ui/dist/assets/{AiManusChatView-BKZ103sn.js → AiManusChatView-CnJcXynW.js} +156 -48
  109. package/src/ui/dist/assets/{AnalysisPlugin-mTTzGAlK.js → AnalysisPlugin-DeyzPEhV.js} +1 -1
  110. package/src/ui/dist/assets/{CliPlugin-BH58n3GY.js → CliPlugin-CB1YODQn.js} +164 -9
  111. package/src/ui/dist/assets/{CodeEditorPlugin-BKGRUH7e.js → CodeEditorPlugin-B-xicq1e.js} +8 -8
  112. package/src/ui/dist/assets/{CodeViewerPlugin-BMADwFWJ.js → CodeViewerPlugin-DT54ysXa.js} +5 -5
  113. package/src/ui/dist/assets/{DocViewerPlugin-ZOnTIHLN.js → DocViewerPlugin-DQtKT-VD.js} +3 -3
  114. package/src/ui/dist/assets/{GitDiffViewerPlugin-CQ7h1Djm.js → GitDiffViewerPlugin-hqHbCfnv.js} +20 -21
  115. package/src/ui/dist/assets/{ImageViewerPlugin-GVS5MsnC.js → ImageViewerPlugin-OcVo33jV.js} +5 -5
  116. package/src/ui/dist/assets/{LabCopilotPanel-BZNv1JML.js → LabCopilotPanel-DdGwhEUV.js} +11 -11
  117. package/src/ui/dist/assets/{LabPlugin-TWcJsdQA.js → LabPlugin-Ciz1gDaX.js} +2 -1
  118. package/src/ui/dist/assets/{LatexPlugin-DIjHiR2x.js → LatexPlugin-BhmjNQRC.js} +37 -11
  119. package/src/ui/dist/assets/{MarkdownViewerPlugin-D3ooGAH0.js → MarkdownViewerPlugin-BzdVH9Bx.js} +4 -4
  120. package/src/ui/dist/assets/{MarketplacePlugin-DfVfE9hN.js → MarketplacePlugin-DmyHspXt.js} +3 -3
  121. package/src/ui/dist/assets/{NotebookEditor-DDl0_Mc0.js → NotebookEditor-BMXKrDRk.js} +1 -1
  122. package/src/ui/dist/assets/{NotebookEditor-s8JhzuX1.js → NotebookEditor-BTVYRGkm.js} +12 -12
  123. package/src/ui/dist/assets/{PdfLoader-C2Sf6SJM.js → PdfLoader-CvcjJHXv.js} +14 -7
  124. package/src/ui/dist/assets/{PdfMarkdownPlugin-CXFLoIsa.js → PdfMarkdownPlugin-DW2ej8Vk.js} +73 -6
  125. package/src/ui/dist/assets/{PdfViewerPlugin-BYTmz2fK.js → PdfViewerPlugin-CmlDxbhU.js} +103 -34
  126. package/src/ui/dist/assets/PdfViewerPlugin-DQ11QcSf.css +3627 -0
  127. package/src/ui/dist/assets/{SearchPlugin-CjWBI1O9.js → SearchPlugin-DAjQZPSv.js} +1 -1
  128. package/src/ui/dist/assets/{TextViewerPlugin-DdOBU3-S.js → TextViewerPlugin-C-nVAZb_.js} +5 -4
  129. package/src/ui/dist/assets/{VNCViewer-B8HGgLwQ.js → VNCViewer-D7-dIYon.js} +10 -10
  130. package/src/ui/dist/assets/bot-C_G4WtNI.js +21 -0
  131. package/src/ui/dist/assets/branding/logo-rokid.png +0 -0
  132. package/src/ui/dist/assets/browser-BAcuE0Xj.js +2895 -0
  133. package/src/ui/dist/assets/{code-BWAY76JP.js → code-Cd7WfiWq.js} +1 -1
  134. package/src/ui/dist/assets/{file-content-C1NwU5oQ.js → file-content-B57zsL9y.js} +1 -1
  135. package/src/ui/dist/assets/{file-diff-panel-CywslwB9.js → file-diff-panel-DVoheLFq.js} +1 -1
  136. package/src/ui/dist/assets/{file-socket-B4kzuOBQ.js → file-socket-B5kXFxZP.js} +1 -1
  137. package/src/ui/dist/assets/{image-D-NZM-6P.js → image-LLOjkMHF.js} +1 -1
  138. package/src/ui/dist/assets/{index-DGIYDuTv.css → index-BQG-1s2o.css} +40 -13
  139. package/src/ui/dist/assets/{index-DHZJ_0TI.js → index-C3r2iGrp.js} +12 -12
  140. package/src/ui/dist/assets/{index-7Chr1g9c.js → index-CLQauncb.js} +15050 -9561
  141. package/src/ui/dist/assets/index-Dxa2eYMY.js +25 -0
  142. package/src/ui/dist/assets/{index-BdM1Gqfr.js → index-hOUOWbW2.js} +2 -2
  143. package/src/ui/dist/assets/{monaco-Cb2uKKe6.js → monaco-BGGAEii3.js} +1 -1
  144. package/src/ui/dist/assets/{pdf-effect-queue-DSw_D3RV.js → pdf-effect-queue-DlEr1_y5.js} +16 -1
  145. package/src/ui/dist/assets/pdf.worker.min-yatZIOMy.mjs +21 -0
  146. package/src/ui/dist/assets/{popover-Bg72DGgT.js → popover-CWJbJuYY.js} +1 -1
  147. package/src/ui/dist/assets/{project-sync-Ce_0BglY.js → project-sync-CRJiucYO.js} +18 -77
  148. package/src/ui/dist/assets/select-CoHB7pvH.js +1690 -0
  149. package/src/ui/dist/assets/{sigma-DPaACDrh.js → sigma-D5aJWR8J.js} +1 -1
  150. package/src/ui/dist/assets/{index-CDxNdQdz.js → square-check-big-DUK_mnkS.js} +2 -13
  151. package/src/ui/dist/assets/{trash-BvTgE5__.js → trash-ChU3SEE3.js} +1 -1
  152. package/src/ui/dist/assets/{useCliAccess-CgPeMOwP.js → useCliAccess-BrJBV3tY.js} +1 -1
  153. package/src/ui/dist/assets/{useFileDiffOverlay-xPhz7P5B.js → useFileDiffOverlay-C2OQaVWc.js} +1 -1
  154. package/src/ui/dist/assets/{wrap-text-C3Un3YQr.js → wrap-text-C7Qqh-om.js} +1 -1
  155. package/src/ui/dist/assets/{zoom-out-BgxLa0Ri.js → zoom-out-rtX0FKya.js} +1 -1
  156. package/src/ui/dist/index.html +2 -2
  157. package/src/ui/dist/assets/AutoFigurePlugin-BGxN8Umr.css +0 -3056
  158. package/src/ui/dist/assets/AutoFigurePlugin-C_wWw4AP.js +0 -8149
  159. package/src/ui/dist/assets/PdfViewerPlugin-BJXtIwj_.css +0 -260
  160. package/src/ui/dist/assets/Stepper-B0Dd8CxK.js +0 -158
  161. package/src/ui/dist/assets/bibtex-CKaefIN2.js +0 -189
  162. package/src/ui/dist/assets/file-utils-H2fjA46S.js +0 -109
  163. package/src/ui/dist/assets/message-square-BzjLiXir.js +0 -16
  164. package/src/ui/dist/assets/pdfjs-DU1YE8WO.js +0 -3
  165. package/src/ui/dist/assets/tooltip-C_mA6R0w.js +0 -108
@@ -10,7 +10,7 @@ Use this skill to close or pause a quest responsibly.
  ## Interaction discipline
 
  - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
  - If the runtime starts an auto-continue turn with no new user message, keep finalizing from the durable quest state and active requirements instead of replaying the previous user turn.
  - If a threaded user reply arrives, interpret it relative to the latest finalize progress update before assuming the task changed completely.
  - When finalize reaches a real closure state, pause-ready packet, or route-back decision, send one threaded `artifact.interact(kind='milestone', ...)` update that names the recommendation, why it is the right call, and any reopen condition that still matters.
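The tightened update cadence in the hunk above can be read as a soft threshold plus two hard ceilings. The following is a minimal sketch of that reading, not the package's implementation: the constant names, the `WorkSinceLastUpdate` type, and the use of `>=` for "roughly" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical names; only the numbers come from the updated SKILL.md text.
SOFT_TOOL_CALLS = 6    # "roughly 6 tool calls with a human-meaningful delta"
HARD_TOOL_CALLS = 12   # "do not drift beyond roughly 12 tool calls"
HARD_MINUTES = 8.0     # "or about 8 minutes"


@dataclass
class WorkSinceLastUpdate:
    """Work done since the last user-visible update (illustrative type)."""
    tool_calls: int
    minutes: float
    meaningful_delta: bool


def update_due(work: WorkSinceLastUpdate) -> bool:
    """Return True when a user-visible progress update is due."""
    # Hard ceilings apply even without a meaningful delta.
    if work.tool_calls >= HARD_TOOL_CALLS or work.minutes >= HARD_MINUTES:
        return True
    # The soft threshold only fires when there is something worth reporting.
    return work.tool_calls >= SOFT_TOOL_CALLS and work.meaningful_delta
```

Under the previous wording the same sketch would use 10, 20, and 15.0, so the change roughly halves the longest silent stretch.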
@@ -10,7 +10,7 @@ Use this skill to turn the current baseline and problem frame into concrete, lit
  ## Interaction discipline
 
  - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
  - Keep ordinary subtask completions concise. When the idea stage actually finishes a meaningful deliverable such as a selected idea package, a rejected-ideas summary, or a route-shaping ideation checkpoint, upgrade to a richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` report.
  - That richer idea-stage milestone report should normally cover: the final selected or rejected direction, why it won or lost, the main remaining risk, and the exact recommended next stage or experiment.
  - That richer milestone report is still normally non-blocking. If the next experiment or route is already clear from durable evidence, continue automatically after reporting instead of waiting.
@@ -10,7 +10,7 @@ Use this skill when the quest already has meaningful state and the first job is
  ## Interaction discipline
 
  - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
  - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
  - If a threaded user reply arrives, interpret it relative to the latest intake-audit progress update before assuming the task changed completely.
  - When the audit reaches a durable route recommendation, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what state is trusted, what still needs work, and which anchor should run next.
@@ -14,7 +14,7 @@ The task is “respond to concrete reviewer pressure with the smallest honest se
  ## Interaction discipline
 
  - Follow the shared interaction contract injected by the system prompt.
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
  - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
  - If a threaded user reply arrives, interpret it relative to the latest rebuttal progress update before assuming the task changed completely.
  - When the rebuttal plan, the main supplementary-evidence package, or the final response bundle becomes durable, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what reviewer concerns are now addressed, what still remains open, and what happens next.
@@ -73,6 +73,16 @@ First decide whether the issue is actually:
  - Do not run supplementary experiments without first mapping them to named reviewer concerns.
  - Do not keep the original claim scope if the new evidence no longer supports it.
  - If a reviewer request cannot be fully satisfied, say so clearly and explain the honest limitation.
+ - If `startup_contract.baseline_execution_policy` is present, honor it:
+   - `must_reproduce_or_verify`
+     - verify or recover the rebuttal-critical baseline/comparator before reviewer-linked follow-up work
+   - `reuse_existing_only`
+     - trust the current baseline/results unless you find concrete inconsistency, corruption, or missing-evidence problems
+   - `skip_unless_blocking`
+     - do not spend time rerunning baselines unless a named reviewer item truly depends on a missing comparator
+ - If `startup_contract.manuscript_edit_mode = latex_required`, treat the provided LaTeX tree or `paper/latex/` as the preferred writing surface when manuscript revision is needed.
+ - If LaTeX source is unavailable while `latex_required` is requested, do not pretend the manuscript was edited; produce LaTeX-ready replacement text and an explicit blocker note instead.
+ - Accept review inputs from URLs, local file paths, local directories, or current-turn attachments; do not assume the review packet must already be neatly structured.
 
  ## Primary inputs
 
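The three `baseline_execution_policy` values and the `latex_required` fallback added in the hunk above amount to a small dispatch on the startup contract. The sketch below illustrates that, assuming the contract is available as a plain mapping; the function names, the guidance strings (paraphrased from the hunk), and the choice to reject unknown values are hypothetical and not taken from the package.

```python
from typing import Mapping, Optional

# Policy names are quoted from the hunk above; the guidance strings paraphrase it.
BASELINE_POLICY_GUIDANCE = {
    "must_reproduce_or_verify": "verify or recover the rebuttal-critical baseline/comparator first",
    "reuse_existing_only": "trust current baseline/results unless a concrete problem is found",
    "skip_unless_blocking": "rerun baselines only when a named reviewer item needs a missing comparator",
}


def baseline_guidance(startup_contract: Mapping[str, object]) -> Optional[str]:
    """Return guidance for the configured policy, or None when the key is absent."""
    policy = startup_contract.get("baseline_execution_policy")
    if policy is None:
        return None  # the hunk only says to honor the policy "if present"
    if not isinstance(policy, str) or policy not in BASELINE_POLICY_GUIDANCE:
        # Assumption: unknown values are rejected; the hunk does not define this case.
        raise ValueError(f"unknown baseline_execution_policy: {policy!r}")
    return BASELINE_POLICY_GUIDANCE[policy]


def needs_latex_blocker_note(startup_contract: Mapping[str, object], latex_available: bool) -> bool:
    """True when `latex_required` is requested but no LaTeX source can be edited."""
    return startup_contract.get("manuscript_edit_mode") == "latex_required" and not latex_available
```

When `needs_latex_blocker_note` is true, the skill text asks for LaTeX-ready replacement text plus an explicit blocker note rather than a claimed manuscript edit.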
@@ -81,6 +91,7 @@ Use, in roughly this order:
  - the current paper or draft
  - the selected outline if one exists
  - review comments, meta-review, or editor letter
+ - current-turn attachments and user-provided local paths / directories / URLs for the manuscript or review packet
  - the six-field `evaluation_summary` blocks from recent main experiments and analysis slices
  - recent main and analysis experiment results
  - prior decision and writing memory
@@ -88,6 +99,7 @@ Use, in roughly this order:
 
  If the current paper/result state is still unclear, open `intake-audit` first before continuing the rebuttal workflow.
  Before launching any new supplementary experiment, read those structured `evaluation_summary` blocks first so the rebuttal plan starts from the already-recorded evidence state rather than from raw narrative memory.
+ If the user provided manuscript files or review-packet files directly, first normalize them into durable quest-visible paths under `paper/` or `paper/rebuttal/input/` before planning reviewer-linked experiments or draft replies.
 
  ## Core outputs
 
@@ -98,6 +110,8 @@ The rebuttal pass should usually leave behind:
  - `paper/rebuttal/response_letter.md`
  - `paper/rebuttal/text_deltas.md`
  - `paper/rebuttal/evidence_update.md`
+ - `paper/paper_experiment_matrix.md` when reviewer concerns materially change the paper experiment plan
+ - `paper/paper_experiment_matrix.json` when reviewer concerns materially change the paper experiment plan
 
  Use the templates in `references/` when needed:
 
@@ -212,6 +226,7 @@ For each reviewer issue, decide whether the right answer is:
 
  Then write one durable rebuttal plan in `paper/rebuttal/action_plan.md`.
  That plan should explicitly include the analysis-experiment TODO list for reviewer-linked follow-up work.
+ If reviewer concerns materially change the paper's experiment story, also create or revise `paper/paper_experiment_matrix.*` so the rebuttal experiment package stays consistent with the paper-facing plan rather than drifting into a reviewer-only side list.
 
  The action plan should be the main thinking draft before execution.
  For each serious item, record:
@@ -237,6 +252,18 @@ Write at least:
  For novelty / comparison / positioning complaints, do not default to experiments.
  First decide whether the issue is better answered by a focused literature audit and clearer paper positioning.
 
+ When a reviewer concern really does imply experimental follow-up, map it into the same paper experiment taxonomy used by the writing line:
+
+ - `component_ablation`
+ - `sensitivity`
+ - `robustness`
+ - `efficiency_cost`
+ - `highlight_validation`
+ - `failure_boundary`
+ - `case_study_optional`
+
+ Case study remains optional unless the reviewer concern is specifically qualitative and cannot be addressed better with quantitative evidence.
+
  ### 3. Route experiments only when genuinely needed
 
  If one or more comments truly require new runs:
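The seven taxonomy labels added in the hunk above form a closed set, so one way to keep rebuttal follow-ups from inventing ad hoc categories is to parse labels through an enumeration. This is a sketch under that assumption: the class name `PaperExperimentType` and the helper `classify` are hypothetical, and only the string values come from the hunk.

```python
from enum import Enum


class PaperExperimentType(str, Enum):
    """Taxonomy values quoted from the hunk above; the class name is hypothetical."""
    COMPONENT_ABLATION = "component_ablation"
    SENSITIVITY = "sensitivity"
    ROBUSTNESS = "robustness"
    EFFICIENCY_COST = "efficiency_cost"
    HIGHLIGHT_VALIDATION = "highlight_validation"
    FAILURE_BOUNDARY = "failure_boundary"
    CASE_STUDY_OPTIONAL = "case_study_optional"


def classify(label: str) -> PaperExperimentType:
    """Parse a free-text label into the taxonomy, rejecting unknown categories."""
    try:
        return PaperExperimentType(label.strip().lower())
    except ValueError:
        raise ValueError(f"not in the paper experiment taxonomy: {label!r}") from None
```

For example, `classify(" Robustness ")` yields `PaperExperimentType.ROBUSTNESS`, while a label such as `"ablation"` is rejected instead of being silently accepted as a new category.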
@@ -252,9 +279,18 @@ If one or more comments truly require new runs:
  Do not launch a free-floating ablation batch.
  Every supplementary run should answer a named reviewer issue.
  Every slice should reference one or more stable reviewer item ids.
+ Every rebuttal-linked slice should also reference the corresponding `exp_id` from `paper/paper_experiment_matrix.*` when that matrix exists.
  After each completed reviewer-linked slice, record the result, the implication for the manuscript, and the concrete modification advice in `paper/rebuttal/evidence_update.md`.
  Use the same shared supplementary-experiment protocol as ordinary analysis work; do not invent a rebuttal-only experiment system.
  If ids or refs are unclear, recover them first with `artifact.resolve_runtime_refs(...)`, `artifact.get_analysis_campaign(...)`, or `artifact.list_paper_outlines(...)`.
+ After each completed, excluded, or blocked reviewer-linked slice:
+
+ - reopen `paper/paper_experiment_matrix.*`
+ - update the affected `exp_id`
+ - update whether the result now belongs in main text, appendix, or omission
+ - update which reviewer items are now fully answered
+
+ Do not finalize the rebuttal package while reviewer-critical and currently feasible matrix rows remain unresolved without an explicit blocker note.
 
  ### 4. Route manuscript changes explicitly
 
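The finalize gate stated in the hunk above (no reviewer-critical, currently feasible matrix row left unresolved without a blocker note) can be expressed as a filter over matrix rows. The diff does not show the schema of `paper/paper_experiment_matrix.json`, so every field name in this sketch other than `exp_id` is a hypothetical stand-in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatrixRow:
    """Illustrative row shape; only `exp_id` is named in the source text."""
    exp_id: str
    reviewer_critical: bool
    feasible: bool
    resolved: bool
    blocker_note: str = ""


def rows_blocking_finalize(rows: List[MatrixRow]) -> List[str]:
    """Return the exp_ids that should stop the rebuttal package from being finalized."""
    return [
        row.exp_id
        for row in rows
        if row.reviewer_critical
        and row.feasible
        and not row.resolved
        and not row.blocker_note.strip()  # an explicit blocker note lifts the gate
    ]
```

An empty result means the gate is open; a non-empty result lists the rows that still need a run, an exclusion decision, or a blocker note.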
@@ -279,6 +315,14 @@ If a reviewer request forces a narrower story, revise the outline before polishi
 
  Use `references/response-letter-template.md` when helpful.
 
+ Before treating the response letter as final:
+
+ - first complete every feasible reviewer-linked experiment or analysis slice that the current plan marked as necessary
+ - ensure the necessary rows in `paper/paper_experiment_matrix.*` have been refreshed after those runs
+ - use real completed experiment results directly in the reply wherever the concern is genuinely experimental
+ - for non-experimental items, do not wait for unnecessary experiments; answer as strongly as the current manuscript, literature, and analysis already allow
+ - if one experimental item cannot be completed in time, keep the reply honest and explicit about the remaining limitation or fallback wording
+
  The response should be:
 
  - professional
@@ -290,6 +334,8 @@ The response should be:
  Good response structure:
 
  - short appreciation / acknowledgement
+ - overall response that summarizes the revision strategy and the strongest strengths acknowledged across reviewers
+ - strengths recognized across reviewers
  - direct answer to the reviewer concern
  - keep stable item ids visible when helpful
  - restate reviewer wording faithfully before answering
@@ -300,6 +346,28 @@ Good response structure:
  - claim scope
  - if not fully addressed, why not and what honest limitation remains
 
+ Drafting style rules for the actual author reply body:
+
+ - Treat `response_letter.md` as rebuttal-ready author text, not as internal coaching notes.
+ - Write in a calm, direct, precise author voice.
+ - Sound like authors clarifying the record, not authors asking for approval.
+ - Brief professional courtesy is allowed, but keep it short and move to substance immediately.
+ - Avoid sycophancy, flattery, excessive gratitude, or approval-seeking language.
+ - Do not default to conceding fault.
+ - Use selective concede, selective clarify, and selective defend.
+ - Answer the reviewer concern directly in the first 1 to 2 sentences.
+ - For non-experimental items, reduce reviewer uncertainty as much as the real evidence allows; the goal is to make a score improvement reasonable for an honest reviewer, not to persuade through rhetoric alone.
+ - Write strongly enough that a neutral reviewer or AC can judge the concern substantially addressed from the rebuttal text alone.
+ - After the literal answer, address the underlying doubt about validity, novelty, scope, fairness, or completeness.
+ - If the answer already exists in the manuscript, restate it in the rebuttal and then point to the manuscript change; do not only say “we will clarify”.
+ - If the issue is about wording, interpretation, or claim strength, include the revised sentence or close paraphrase that should appear in the manuscript.
+ - Keep the main response body for each item as 1 to 2 full paragraphs of polished prose.
+ - Do not use bullets, numbered lists, bold labels, or checklist fragments inside the actual response paragraphs.
+ - Do not narrate rebuttal strategy inside the author reply.
+ - Do not rely on future edits alone when you can already give the clarification, argument, or wording now.
+ - When pushing back, lead with evidence, scope, or feasibility constraints before intuition.
+ - If `startup_contract.manuscript_edit_mode = latex_required`, keep manuscript-facing replacement text LaTeX-ready.
+
  If details are still genuinely unknown, use explicit placeholders such as `[[AUTHOR TO FILL]]` rather than inventing specifics.
 
  Avoid:
@@ -319,6 +387,8 @@ When the rebuttal package is durably ready:
 
  If a combined rebuttal note is useful, make sure the total package still covers:
 
+ - overall response
+ - strengths recognized across reviewers
  - overview and revision strategy
  - draft responses to reviewers
  - point-to-point triage
@@ -398,6 +468,9 @@ Useful tags include:
  - supplementary experiments, if needed, are routed cleanly
  - manuscript deltas are explicit
  - the response letter is evidence-backed and honest
+ - the final package contains both:
+   - reviewer-specific replies
+   - one overall response that makes the paper strengths, the main resolved concerns, and the remaining limitations legible to a neutral reader or AC
 
  The goal is not just “write a nicer response”.
  The goal is to convert review pressure into a durable, auditable revision workflow.
@@ -1,9 +1,55 @@
  # Response Letter Template
 
+ ## Drafting rules
+
+ - Treat this file as rebuttal-ready author text, not as private coaching notes.
+ - Write in a calm, direct, precise author voice.
+ - Brief professional courtesy is allowed, but keep it short and move to substance immediately.
+ - Avoid sycophancy, flattery, excessive gratitude, or approval-seeking language.
+ - Do not default to conceding fault.
+ - Use selective concede, selective clarify, and selective defend.
+ - Answer the reviewer concern directly in the first 1 to 2 sentences.
+ - Keep the actual response body for each item as 1 to 2 full paragraphs of polished prose.
+ - If the issue is about wording, interpretation, or claim strength, include the revised sentence or close paraphrase that should appear in the manuscript.
+ - Do not use bullets, numbered lists, or label-value schemas inside the actual response paragraphs.
+ - Do not rely on future edits alone when you can already give the clarification, argument, or draft wording now.
+ - If a concrete number, setup detail, or result is still unknown, use `[[AUTHOR TO FILL]]`.
+
  ## Cover note
 
  We thank the reviewers for the careful reading and constructive feedback. Below we respond point by point and indicate the corresponding manuscript changes and supplementary evidence when applicable.
 
+ ## Overview & Revision Strategy
+
+ - main reviewer risks:
+ - current strongest evidence:
+ - current weakest evidence:
+ - baseline handling decision:
+ - response strategy:
+ - manuscript_edit_mode:
+
+ ## Overall Response
+
+ - strongest strengths recognized across reviewers:
+ - overall revision strategy:
+ - biggest concerns now addressed:
+ - concerns still partially open:
+ - claim-scope changes:
+ - remaining limitation:
+
+ ## Strengths Recognized Across Reviewers
+
+ - strength 1:
+ - strength 2:
+ - why these strengths still matter after revision:
+
+ ## Resolution Snapshot
+
+ | Item ID | Status | What changed | Evidence basis | Manuscript delta |
+ | --- | --- | --- | --- | --- |
+ | R1-C1 | | | | |
+ | R1-C2 | | | | |
+
  ## Reviewer 1
 
  ### Item R1-C1
@@ -20,9 +66,11 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 
  - agree / partially_agree / clarify / respectful_disagree
 
- **Response**
+ **Response Draft**
 
- -
+ Write 1 to 2 full paragraphs of rebuttal-ready prose here.
+ The first 1 to 2 sentences should answer the concern directly.
+ Then explain the evidence, manuscript rationale, and the exact clarification or wording that should appear in the revision.
 
  **What changed**
 
@@ -30,6 +78,7 @@ We thank the reviewers for the careful reading and constructive feedback. Below
  - evidence basis:
  - claim-scope effect:
  - remaining limitation:
+ - latex-ready manuscript text:
 
  **If an experiment is still pending**
 
@@ -51,9 +100,9 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 
  - agree / partially_agree / clarify / respectful_disagree
 
- **Response**
+ **Response Draft**
 
- -
+ Write 1 to 2 full paragraphs of rebuttal-ready prose here.
 
  **What changed**
 
@@ -84,9 +133,9 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 
  - agree / partially_agree / clarify / respectful_disagree
 
- **Response**
+ **Response Draft**
 
- -
+ Write 1 to 2 full paragraphs of rebuttal-ready prose here.
 
  **What changed**
 
@@ -106,8 +155,3 @@ We thank the reviewers for the careful reading and constructive feedback. Below
  - what could not be fully addressed:
  - why:
  - how the manuscript now reflects that limitation:
-
- ## Author placeholders
-
- - If a concrete number, setup detail, or result is still unknown, use `[[AUTHOR TO FILL]]`.
- - Do not fabricate missing details just to make the letter sound complete.
@@ -17,7 +17,7 @@ It is also not the same as `rebuttal`.
17
17
  ## Interaction discipline
18
18
 
19
19
  - Follow the shared interaction contract injected by the system prompt.
20
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
20
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
21
21
  - When the review report, revision plan, or follow-up experiment TODO list becomes durable, send a richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what the main risks are, what should be fixed next, and whether the next route is writing, experiment, or claim downgrade.
22
22
 
23
23
  ## Purpose
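The revised update cadence in the hunk above (roughly 6 tool calls with a meaningful delta, and no more than roughly 12 tool calls or about 8 minutes of silence) can be read as a small decision rule. The sketch below is only one possible reading: `WorkState`, `update_urgency`, and the constant names are hypothetical and do not appear in the package, and the source says "roughly", so the hard comparisons are a simplification.

```python
from dataclasses import dataclass

# Thresholds copied from the revised guidance; the names are hypothetical.
SOFT_TOOL_CALLS = 6    # "roughly 6 tool calls with a human-meaningful delta"
HARD_TOOL_CALLS = 12   # "do not drift beyond roughly 12 tool calls"
HARD_MINUTES = 8.0     # "or about 8 minutes without a user-visible update"


@dataclass
class WorkState:
    tool_calls_since_update: int
    minutes_since_update: float
    has_meaningful_delta: bool


def update_urgency(state: WorkState) -> str:
    """Return 'overdue', 'due', or 'wait' for a user-visible progress update."""
    # Hard ceiling: either limit alone is enough to require an update.
    if (state.tool_calls_since_update >= HARD_TOOL_CALLS
            or state.minutes_since_update >= HARD_MINUTES):
        return "overdue"
    # Soft trigger: enough work AND something worth reporting.
    if (state.tool_calls_since_update >= SOFT_TOOL_CALLS
            and state.has_meaningful_delta):
        return "due"
    return "wait"
```

Under this reading the soft trigger never fires without a meaningful delta, while the hard ceiling fires regardless of whether there is anything new to report.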
@@ -63,6 +63,16 @@ Do not treat “looks polished” as “is defensible”.
63
63
  - Do not recommend rhetoric when the real problem is missing evidence.
64
64
  - If novelty or positioning is uncertain, treat that as a literature-audit question first, not an automatic experiment request.
65
65
  - If a claim is too broad for the evidence, prefer narrowing or downgrading the claim over defending it with style.
66
+ - If `startup_contract.review_followup_policy` is present, honor it:
67
+ - `audit_only`
68
+ - stop after durable review artifacts and a clear route recommendation
69
+ - `auto_execute_followups`
70
+ - do not stop at the audit if the next route is already clear; continue into the required experiments and manuscript deltas
71
+ - `user_gated_followups`
72
+ - finish the audit first, then package the next expensive follow-up step into one structured decision
73
+ - If `startup_contract.manuscript_edit_mode = latex_required`, treat the provided LaTeX tree or `paper/latex/` as the writing surface when manuscript revision is needed.
74
+ - If LaTeX source is unavailable while `latex_required` is requested, do not pretend the manuscript was edited; produce LaTeX-ready replacement text and an explicit blocker note instead.
75
+ - Accept manuscript and review inputs from URLs, local file paths, local directories, or current-turn attachments; do not assume the draft is already perfectly normalized.
66
76
 
67
77
  ## Primary inputs
68
78
 
@@ -74,11 +84,13 @@ Use, in roughly this order:
74
84
  - the six-field `evaluation_summary` blocks from recent main experiments and analysis slices
75
85
  - recent main and analysis experiment results
76
86
  - figures, tables, and captions
87
+ - current-turn attachments and user-provided local paths / directories / URLs for the manuscript bundle or review packet
77
88
  - prior self-review or reviewer-first notes as low-trust auxiliary input
78
89
  - nearby papers when novelty or comparison is unclear
79
90
 
80
91
  If the draft/result state is still unclear, open `intake-audit` first before continuing the review workflow.
81
92
  Before proposing extra experiments, read those structured `evaluation_summary` blocks first so you do not request work that the recorded evidence already resolved.
93
+ If the user provided draft files or manuscript bundles directly, first normalize them into durable quest-visible paths before planning experiments or section-level revisions.
82
94
 
83
95
  ## Core outputs
84
96
 
@@ -87,6 +99,8 @@ The review pass should usually leave behind:
87
99
  - `paper/review/review.md`
88
100
  - `paper/review/revision_log.md`
89
101
  - `paper/review/experiment_todo.md`
102
+ - `paper/paper_experiment_matrix.md` when more evidence is still needed
103
+ - `paper/paper_experiment_matrix.json` when more evidence is still needed
90
104
 
91
105
  Use the templates in `references/` when needed:
92
106
 
@@ -175,14 +189,25 @@ For each serious issue, record:
175
189
  - what should change
176
190
  - whether the fix is writing-only, evidence-only, or experiment-dependent
177
191
  - whether the issue blocks `finalize`
192
+ - one copy-ready replacement sentence / paragraph when feasible
193
+ - one LaTeX-ready replacement block when `startup_contract.manuscript_edit_mode = latex_required`
178
194
 
179
195
  ### 5. Produce the follow-up experiment TODO list
180
196
 
181
197
  Only if more evidence is truly needed, write `paper/review/experiment_todo.md` using `references/experiment-todo-template.md`.
182
198
 
199
+ When the paper still lacks experimental support, also create or revise:
200
+
201
+ - `paper/paper_experiment_matrix.md`
202
+ - `paper/paper_experiment_matrix.json`
203
+
204
+ Treat the matrix as the paper-facing master plan and `paper/review/experiment_todo.md` as only the current execution frontier or review-facing subset.
205
+
183
206
  Each TODO item should include:
184
207
 
185
208
  - the review issue it answers
209
+ - the matrix exp id
210
+ - the corresponding `exp_id` in the paper experiment matrix
186
211
  - why existing evidence is still insufficient
187
212
  - the minimum experiment or analysis needed
188
213
  - required metric(s)
@@ -195,6 +220,50 @@ Each TODO item should include:
195
220
 
196
221
  Do not write a vague “run more ablations” list.
197
222
  Each TODO item should be concrete enough to turn into `analysis-campaign` slices or a `baseline` recovery task.
223
+ The matrix should be broader than the TODO list and should classify the full paper-facing experiment space, not just analysis work.
224
+ When building or revising that matrix, explicitly consider:
225
+
226
+ - main comparison packaging or extension
227
+ - component ablations
228
+ - sensitivity / hyperparameter checks
229
+ - robustness checks
230
+ - efficiency / cost / latency / token-overhead checks when relevant
231
+ - highlight-validation experiments that test the likely strengths of the method
232
+ - limitation-boundary analyses
233
+ - case study rows as optional rather than mandatory evidence
234
+
235
+ Do not assume the paper only needs “analysis experiments”.
236
+ Do not assume case studies belong in the required set.
237
+ If efficiency or cost could become a reviewer-facing strength or concern, put that into the matrix explicitly.
238
+
239
+ For the matrix, each row should usually record:
240
+
241
+ - `exp_id`
242
+ - `tier`
243
+ - `experiment_type`
244
+ - `status`
245
+ - `feasibility_now`
246
+ - `claim_ids`
247
+ - `highlight_ids`
248
+ - `research_question`
249
+ - `hypothesis`
250
+ - `comparators`
251
+ - `metrics`
252
+ - `minimal_success_criterion`
253
+ - `paper_placement`
254
+ - `promotion_rule`
255
+ - `next_action`
256
+
257
+ The matrix should also keep a short `highlight hypotheses` block.
258
+ Do not rely on prose intuition for the method's best selling point; if a likely highlight matters, it should have a corresponding validation row in the matrix.
259
+
260
+ Before treating the experiments section as stable, require that every currently feasible matrix row that is not merely `optional` or `dropped` is either:
261
+
262
+ - completed
263
+ - analyzed
264
+ - excluded with a real reason
265
+ - or blocked with a real reason
266
+
198
267
  When extra evidence is truly needed, use the shared supplementary-experiment protocol:
199
268
 
200
269
  - recover ids / refs first if needed
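The stability gate added in this hunk (every currently feasible matrix row that is not `optional` or `dropped` must be completed, analyzed, excluded with a reason, or blocked with a reason) could be checked mechanically against `paper/paper_experiment_matrix.json`. The following is a minimal sketch under stated assumptions: the field names `exp_id`, `tier`, `status`, and `feasibility_now` come from the row list above, but the status vocabulary, the boolean type of `feasibility_now`, and the `reason` field are assumptions, since the diff does not define the JSON schema.

```python
from dataclasses import dataclass

# Assumed status values, mirroring the four accepted outcomes listed above.
RESOLVED_STATUSES = {"completed", "analyzed", "excluded", "blocked"}
NEEDS_REASON = {"excluded", "blocked"}
# The source does not say whether `optional` / `dropped` are tiers or statuses,
# so this sketch accepts either placement.
SKIPPED = {"optional", "dropped"}


@dataclass
class MatrixRow:
    exp_id: str
    tier: str
    status: str
    feasibility_now: bool  # assumed boolean
    reason: str = ""       # hypothetical field holding the "real reason"


def unresolved_rows(rows):
    """Return exp_ids that still block treating the experiments section as stable."""
    pending = []
    for row in rows:
        if not row.feasibility_now:
            continue  # only currently feasible rows are gated
        if row.tier in SKIPPED or row.status in SKIPPED:
            continue
        if row.status not in RESOLVED_STATUSES:
            pending.append(row.exp_id)
        elif row.status in NEEDS_REASON and not row.reason.strip():
            pending.append(row.exp_id)  # excluded / blocked without a reason
    return pending
```

An empty return value would mean the gate passes. Whether a stated reason counts as "real" remains a human judgment that this check cannot make.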
@@ -216,6 +285,54 @@ After the review artifacts are durable:
216
285
 
217
286
  Do not stop immediately after writing the review if the next route is already clear.
218
287
 
288
+ ### 7. Auto follow-up execution contract
289
+
290
+ When `startup_contract.review_followup_policy = auto_execute_followups`:
291
+
292
+ - treat the review as a gate, not as the endpoint
293
+ - immediately turn the accepted follow-up route into action:
294
+ - `analysis-campaign`
295
+ - when new evidence is truly required
296
+ - `baseline`
297
+ - when a missing comparator baseline blocks fair review
298
+ - `write`
299
+ - when the issues are mostly text, outline, claim-scope, figure, or framing revisions
300
+ - after each completed follow-up step, update:
301
+ - `paper/review/revision_log.md`
302
+ - `paper/review/experiment_todo.md`
303
+ - the draft or manuscript-facing revision package
304
+ - only treat the review line as truly closed after the follow-up route has either completed or been downgraded / blocked explicitly
305
+
306
+ When `startup_contract.review_followup_policy = user_gated_followups`:
307
+
308
+ - stop after the durable audit artifacts
309
+ - turn the next expensive follow-up package into one structured decision instead of continuing silently
310
+
311
+ When `startup_contract.review_followup_policy = audit_only`:
312
+
313
+ - stop after the durable audit artifacts and route recommendation
314
+
315
+ ### 8. Manuscript revision delivery contract
316
+
317
+ If manuscript revision is required, make the delta explicit:
318
+
319
+ - section
320
+ - old claim / weakness
321
+ - new wording
322
+ - evidence basis
323
+ - remaining limitation
324
+
325
+ If `startup_contract.manuscript_edit_mode = copy_ready_text`:
326
+
327
+ - provide copy-ready replacement wording in `paper/review/revision_log.md` or a nearby revision note
328
+ - keep the wording directly usable by the user or downstream `write`
329
+
330
+ If `startup_contract.manuscript_edit_mode = latex_required`:
331
+
332
+ - prefer editing the actual LaTeX sources when they are available
333
+ - otherwise provide LaTeX-ready replacement text blocks with explicit insertion targets
334
+ - preserve labels, citations, figure/table refs, and section structure in the suggested replacements
335
+
219
336
  ## Companion skill routing
220
337
 
221
338
  Open additional skills only when the review workflow requires them:
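Sections 7 and 8 in the hunk above describe a dispatch on `startup_contract.review_followup_policy` followed by a route choice. A minimal sketch of that logic is shown below. The function names, return labels, and boolean inputs are hypothetical. Two behaviors are assumptions because the source does not specify them: what happens under `auto_execute_followups` when the route is not yet clear, and the precedence of `baseline` over `analysis-campaign` when both conditions hold.

```python
def next_step_after_audit(policy: str, route_is_clear: bool) -> str:
    """Map the follow-up policy to what happens once audit artifacts are durable."""
    if policy == "audit_only":
        return "stop_with_route_recommendation"
    if policy == "user_gated_followups":
        # Package the next expensive step into one structured decision.
        return "request_structured_decision"
    if policy == "auto_execute_followups":
        # Assumption: with no clear route, fall back to a recommendation.
        return "execute_followups" if route_is_clear else "stop_with_route_recommendation"
    raise ValueError(f"unknown review_followup_policy: {policy!r}")


def pick_route(needs_new_evidence: bool, missing_comparator_baseline: bool) -> str:
    """Choose the follow-up skill named in the auto-execution contract."""
    if missing_comparator_baseline:
        return "baseline"           # a missing comparator blocks fair review
    if needs_new_evidence:
        return "analysis-campaign"  # new evidence is truly required
    return "write"                  # text, outline, claim-scope, figure, or framing fixes
```

Raising on an unknown value, rather than defaulting silently, keeps a misspelled policy from being mistaken for `audit_only`.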
@@ -5,25 +5,48 @@
5
5
  ### TODO EXP-001
6
6
 
7
7
  - source review issue:
8
+ - matrix exp id:
8
9
  - why current evidence is insufficient:
9
10
  - route type:
10
11
  - existing-result analysis
11
12
  - comparator baseline
12
13
  - supplementary experiment
13
14
  - figure / table regeneration
15
+ - experiment type:
16
+ - component_ablation
17
+ - sensitivity
18
+ - robustness
19
+ - efficiency_cost
20
+ - highlight_validation
21
+ - failure_boundary
22
+ - case_study_optional
23
+ - tier:
24
+ - main_required
25
+ - main_optional
26
+ - appendix
27
+ - optional
14
28
  - minimum task:
15
29
  - required metric(s):
16
30
  - minimal success criterion:
31
+ - likely paper placement:
32
+ - main_text
33
+ - appendix
34
+ - maybe
35
+ - omit
17
36
  - expected manuscript impact:
18
37
  - owner / next step:
19
38
 
20
39
  ### TODO EXP-002
21
40
 
22
41
  - source review issue:
42
+ - matrix exp id:
23
43
  - why current evidence is insufficient:
24
44
  - route type:
45
+ - experiment type:
46
+ - tier:
25
47
  - minimum task:
26
48
  - required metric(s):
27
49
  - minimal success criterion:
50
+ - likely paper placement:
28
51
  - expected manuscript impact:
29
52
  - owner / next step:
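The enumerated options added to the TODO template above (experiment type, tier, likely paper placement) lend themselves to a simple validity check before a TODO item is linked to a matrix row. The sketch below assumes a TODO item has been parsed into a plain dict. The dict keys and the `invalid_fields` helper are hypothetical. The allowed values are copied from the template.

```python
# Allowed values copied from the revised experiment TODO template.
ALLOWED = {
    "experiment_type": {
        "component_ablation", "sensitivity", "robustness", "efficiency_cost",
        "highlight_validation", "failure_boundary", "case_study_optional",
    },
    "tier": {"main_required", "main_optional", "appendix", "optional"},
    "paper_placement": {"main_text", "appendix", "maybe", "omit"},
}


def invalid_fields(item: dict) -> list:
    """Return the sorted names of fields that are missing or hold an unlisted value."""
    return sorted(
        name for name, allowed in ALLOWED.items()
        if item.get(name) not in allowed
    )
```

An empty list means the three enumerated fields are all filled with template values. Free-text fields such as the minimum task or success criterion are outside this check.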
@@ -1,5 +1,11 @@
1
1
  # Review Report Template
2
2
 
3
+ ## Review mode
4
+
5
+ - review_followup_policy:
6
+ - manuscript_edit_mode:
7
+ - manuscript_source_status:
8
+
3
9
  ## Summary
4
10
 
5
11
  - paper / draft:
@@ -36,6 +42,8 @@
36
42
  - cause:
37
43
  - actionable fix:
38
44
  - acceptance criterion:
45
+ - copy-ready revision text:
46
+ - latex-ready revision text:
39
47
 
40
48
  ## Storyline Options + Writing Outlines
41
49
 
@@ -49,6 +57,14 @@
49
57
  2.
50
58
  3.
51
59
 
60
+ ## Manuscript Revision Package
61
+
62
+ - section:
63
+ - old wording / weakness:
64
+ - new wording:
65
+ - evidence basis:
66
+ - latex-ready replacement block:
67
+
52
68
  ## Experiment Inventory & Research Experiment Plan
53
69
 
54
70
  - what existing experiments already cover:
@@ -3,6 +3,8 @@
3
3
  ## Revision Summary
4
4
 
5
5
  - current draft state:
6
+ - review_followup_policy:
7
+ - manuscript_edit_mode:
6
8
  - highest-priority fixes:
7
9
  - blockers:
8
10
 
@@ -20,6 +22,8 @@
20
22
  - supplementary experiment
21
23
  - claim downgrade
22
24
  - concrete change:
25
+ - copy-ready revision text:
26
+ - latex-ready revision text:
23
27
  - status:
24
28
  - blocks finalize:
25
29
 
@@ -10,7 +10,7 @@ Use this skill when the quest does not yet have a stable research frame.
10
10
  ## Interaction discipline
11
11
 
12
12
  - Follow the shared interaction contract injected by the system prompt.
13
- - For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
13
+ - For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
14
14
  - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
15
15
  - If a threaded user reply arrives, interpret it relative to the latest scout progress update before assuming the task changed completely.
16
16
  - When scouting actually resolves the framing ambiguity, locks the evaluation contract, or makes the next anchor obvious, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what is now clear, why it matters, and which stage should come next.