npm - @researai/deepscientist - Versions diffs - 1.5.9 → 1.5.12 - Mend

@researai/deepscientist 1.5.9 → 1.5.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (165)

package/src/skills/finalize/SKILL.md CHANGED

@@ -10,7 +10,7 @@ Use this skill to close or pause a quest responsibly.
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - If the runtime starts an auto-continue turn with no new user message, keep finalizing from the durable quest state and active requirements instead of replaying the previous user turn.
 - If a threaded user reply arrives, interpret it relative to the latest finalize progress update before assuming the task changed completely.
 - When finalize reaches a real closure state, pause-ready packet, or route-back decision, send one threaded `artifact.interact(kind='milestone', ...)` update that names the recommendation, why it is the right call, and any reopen condition that still matters.
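The cadence line changed in this hunk (and repeated in the other skills below) can be read as a soft trigger plus a hard ceiling. The following is a minimal Python sketch under that reading; `UpdateCadence` and `should_send_update` are hypothetical names that do not come from the package, and the numbers are the "roughly" values quoted in the diff, not enforced constants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateCadence:
    """Hypothetical container for the thresholds quoted in the 1.5.12 text."""
    soft_tool_calls: int = 6    # "roughly 6 tool calls with a human-meaningful delta"
    hard_tool_calls: int = 12   # "roughly 12 tool calls"
    hard_minutes: float = 8.0   # "about 8 minutes"


# The values the 1.5.9 text used, kept for comparison.
OLD_CADENCE = UpdateCadence(soft_tool_calls=10, hard_tool_calls=20, hard_minutes=15.0)


def should_send_update(tool_calls: int, minutes: float, meaningful_delta: bool,
                       cadence: UpdateCadence = UpdateCadence()) -> bool:
    """Return True when a user-visible progress update is due."""
    # Hard ceiling: do not drift past either limit without an update.
    if tool_calls >= cadence.hard_tool_calls or minutes >= cadence.hard_minutes:
        return True
    # Soft trigger: enough work done and something worth reporting.
    return tool_calls >= cadence.soft_tool_calls and meaningful_delta
```

Under these assumptions, six tool calls with a meaningful delta trigger an update with the new defaults but not with `OLD_CADENCE`, which is the practical effect of the change.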

package/src/skills/idea/SKILL.md CHANGED

@@ -10,7 +10,7 @@ Use this skill to turn the current baseline and problem frame into concrete, lit
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - Keep ordinary subtask completions concise. When the idea stage actually finishes a meaningful deliverable such as a selected idea package, a rejected-ideas summary, or a route-shaping ideation checkpoint, upgrade to a richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` report.
 - That richer idea-stage milestone report should normally cover: the final selected or rejected direction, why it won or lost, the main remaining risk, and the exact recommended next stage or experiment.
 - That richer milestone report is still normally non-blocking. If the next experiment or route is already clear from durable evidence, continue automatically after reporting instead of waiting.

package/src/skills/intake-audit/SKILL.md CHANGED

@@ -10,7 +10,7 @@ Use this skill when the quest already has meaningful state and the first job is
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
 - If a threaded user reply arrives, interpret it relative to the latest intake-audit progress update before assuming the task changed completely.
 - When the audit reaches a durable route recommendation, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what state is trusted, what still needs work, and which anchor should run next.

package/src/skills/rebuttal/SKILL.md CHANGED

@@ -14,7 +14,7 @@ The task is “respond to concrete reviewer pressure with the smallest honest se
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
 - If a threaded user reply arrives, interpret it relative to the latest rebuttal progress update before assuming the task changed completely.
 - When the rebuttal plan, the main supplementary-evidence package, or the final response bundle becomes durable, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what reviewer concerns are now addressed, what still remains open, and what happens next.
@@ -73,6 +73,16 @@ First decide whether the issue is actually:
 - Do not run supplementary experiments without first mapping them to named reviewer concerns.
 - Do not keep the original claim scope if the new evidence no longer supports it.
 - If a reviewer request cannot be fully satisfied, say so clearly and explain the honest limitation.
+- If `startup_contract.baseline_execution_policy` is present, honor it:
+  - `must_reproduce_or_verify`
+    - verify or recover the rebuttal-critical baseline/comparator before reviewer-linked follow-up work
+  - `reuse_existing_only`
+    - trust the current baseline/results unless you find concrete inconsistency, corruption, or missing-evidence problems
+  - `skip_unless_blocking`
+    - do not spend time rerunning baselines unless a named reviewer item truly depends on a missing comparator
+- If `startup_contract.manuscript_edit_mode = latex_required`, treat the provided LaTeX tree or `paper/latex/` as the preferred writing surface when manuscript revision is needed.
+- If LaTeX source is unavailable while `latex_required` is requested, do not pretend the manuscript was edited; produce LaTeX-ready replacement text and an explicit blocker note instead.
+- Accept review inputs from URLs, local file paths, local directories, or current-turn attachments; do not assume the review packet must already be neatly structured.
 ## Primary inputs
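The three `baseline_execution_policy` values added in this hunk amount to a small decision rule. A hedged Python sketch follows; it assumes `startup_contract` is available as a plain mapping, and the function names and the two boolean signals (`evidence_problem`, `comparator_missing`) are illustrative inventions, not part of the package.

```python
from typing import Mapping, Optional


def needs_baseline_work(policy: str, *, evidence_problem: bool = False,
                        comparator_missing: bool = False) -> bool:
    """Decide whether baseline work should precede reviewer-linked follow-up.

    evidence_problem: a concrete inconsistency, corruption, or missing-evidence
        problem was found in the current baseline/results (hypothetical signal).
    comparator_missing: a named reviewer item depends on a missing comparator
        (hypothetical signal).
    """
    if policy == "must_reproduce_or_verify":
        return True
    if policy == "reuse_existing_only":
        return evidence_problem
    if policy == "skip_unless_blocking":
        return comparator_missing
    raise ValueError(f"unknown baseline_execution_policy: {policy!r}")


def baseline_decision(startup_contract: Mapping[str, object],
                      **signals: bool) -> Optional[bool]:
    """Return None when the field is absent, since the rule only applies if present."""
    policy = startup_contract.get("baseline_execution_policy")
    if policy is None:
        return None
    return needs_baseline_work(str(policy), **signals)
```

Raising on an unknown value rather than falling back to a default is a design choice of this sketch; the diff does not say how unrecognized policies are handled.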
@@ -81,6 +91,7 @@ Use, in roughly this order:
 - the current paper or draft
 - the selected outline if one exists
 - review comments, meta-review, or editor letter
+- current-turn attachments and user-provided local paths / directories / URLs for the manuscript or review packet
 - the six-field `evaluation_summary` blocks from recent main experiments and analysis slices
 - recent main and analysis experiment results
 - prior decision and writing memory
@@ -88,6 +99,7 @@ Use, in roughly this order:
 If the current paper/result state is still unclear, open `intake-audit` first before continuing the rebuttal workflow.
 Before launching any new supplementary experiment, read those structured `evaluation_summary` blocks first so the rebuttal plan starts from the already-recorded evidence state rather than from raw narrative memory.
+If the user provided manuscript files or review-packet files directly, first normalize them into durable quest-visible paths under `paper/` or `paper/rebuttal/input/` before planning reviewer-linked experiments or draft replies.
 ## Core outputs
@@ -98,6 +110,8 @@ The rebuttal pass should usually leave behind:
 - `paper/rebuttal/response_letter.md`
 - `paper/rebuttal/text_deltas.md`
 - `paper/rebuttal/evidence_update.md`
+- `paper/paper_experiment_matrix.md` when reviewer concerns materially change the paper experiment plan
+- `paper/paper_experiment_matrix.json` when reviewer concerns materially change the paper experiment plan
 Use the templates in `references/` when needed:
@@ -212,6 +226,7 @@ For each reviewer issue, decide whether the right answer is:
 Then write one durable rebuttal plan in `paper/rebuttal/action_plan.md`.
 That plan should explicitly include the analysis-experiment TODO list for reviewer-linked follow-up work.
+If reviewer concerns materially change the paper's experiment story, also create or revise `paper/paper_experiment_matrix.*` so the rebuttal experiment package stays consistent with the paper-facing plan rather than drifting into a reviewer-only side list.
 The action plan should be the main thinking draft before execution.
 For each serious item, record:
@@ -237,6 +252,18 @@ Write at least:
 For novelty / comparison / positioning complaints, do not default to experiments.
 First decide whether the issue is better answered by a focused literature audit and clearer paper positioning.
+When a reviewer concern really does imply experimental follow-up, map it into the same paper experiment taxonomy used by the writing line:
+- `component_ablation`
+- `sensitivity`
+- `robustness`
+- `efficiency_cost`
+- `highlight_validation`
+- `failure_boundary`
+- `case_study_optional`
+Case study remains optional unless the reviewer concern is specifically qualitative and cannot be addressed better with quantitative evidence.
 ### 3. Route experiments only when genuinely needed
 If one or more comments truly require new runs:
@@ -252,9 +279,18 @@ If one or more comments truly require new runs:
 Do not launch a free-floating ablation batch.
 Every supplementary run should answer a named reviewer issue.
 Every slice should reference one or more stable reviewer item ids.
+Every rebuttal-linked slice should also reference the corresponding `exp_id` from `paper/paper_experiment_matrix.*` when that matrix exists.
 After each completed reviewer-linked slice, record the result, the implication for the manuscript, and the concrete modification advice in `paper/rebuttal/evidence_update.md`.
 Use the same shared supplementary-experiment protocol as ordinary analysis work; do not invent a rebuttal-only experiment system.
 If ids or refs are unclear, recover them first with `artifact.resolve_runtime_refs(...)`, `artifact.get_analysis_campaign(...)`, or `artifact.list_paper_outlines(...)`.
+After each completed, excluded, or blocked reviewer-linked slice:
+- reopen `paper/paper_experiment_matrix.*`
+- update the affected `exp_id`
+- update whether the result now belongs in main text, appendix, or omission
+- update which reviewer items are now fully answered
+Do not finalize the rebuttal package while reviewer-critical and currently feasible matrix rows remain unresolved without an explicit blocker note.
 ### 4. Route manuscript changes explicitly
@@ -279,6 +315,14 @@ If a reviewer request forces a narrower story, revise the outline before polishi
 Use `references/response-letter-template.md` when helpful.
+Before treating the response letter as final:
+- first complete every feasible reviewer-linked experiment or analysis slice that the current plan marked as necessary
+- ensure the necessary rows in `paper/paper_experiment_matrix.*` have been refreshed after those runs
+- use real completed experiment results directly in the reply wherever the concern is genuinely experimental
+- for non-experimental items, do not wait for unnecessary experiments; answer as strongly as the current manuscript, literature, and analysis already allow
+- if one experimental item cannot be completed in time, keep the reply honest and explicit about the remaining limitation or fallback wording
 The response should be:
 - professional
@@ -290,6 +334,8 @@ The response should be:
 Good response structure:
 - short appreciation / acknowledgement
+- overall response that summarizes the revision strategy and the strongest strengths acknowledged across reviewers
+- strengths recognized across reviewers
 - direct answer to the reviewer concern
 - keep stable item ids visible when helpful
 - restate reviewer wording faithfully before answering
@@ -300,6 +346,28 @@ Good response structure:
   - claim scope
 - if not fully addressed, why not and what honest limitation remains
+Drafting style rules for the actual author reply body:
+- Treat `response_letter.md` as rebuttal-ready author text, not as internal coaching notes.
+- Write in a calm, direct, precise author voice.
+- Sound like authors clarifying the record, not authors asking for approval.
+- Brief professional courtesy is allowed, but keep it short and move to substance immediately.
+- Avoid sycophancy, flattery, excessive gratitude, or approval-seeking language.
+- Do not default to conceding fault.
+- Use selective concede, selective clarify, and selective defend.
+- Answer the reviewer concern directly in the first 1 to 2 sentences.
+- For non-experimental items, reduce reviewer uncertainty as much as the real evidence allows; the goal is to make a score improvement reasonable for an honest reviewer, not to persuade through rhetoric alone.
+- Write strongly enough that a neutral reviewer or AC can judge the concern substantially addressed from the rebuttal text alone.
+- After the literal answer, address the underlying doubt about validity, novelty, scope, fairness, or completeness.
+- If the answer already exists in the manuscript, restate it in the rebuttal and then point to the manuscript change; do not only say “we will clarify”.
+- If the issue is about wording, interpretation, or claim strength, include the revised sentence or close paraphrase that should appear in the manuscript.
+- Keep the main response body for each item as 1 to 2 full paragraphs of polished prose.
+- Do not use bullets, numbered lists, bold labels, or checklist fragments inside the actual response paragraphs.
+- Do not narrate rebuttal strategy inside the author reply.
+- Do not rely on future edits alone when you can already give the clarification, argument, or wording now.
+- When pushing back, lead with evidence, scope, or feasibility constraints before intuition.
+- If `startup_contract.manuscript_edit_mode = latex_required`, keep manuscript-facing replacement text LaTeX-ready.
 If details are still genuinely unknown, use explicit placeholders such as `[[AUTHOR TO FILL]]` rather than inventing specifics.
 Avoid:
@@ -319,6 +387,8 @@ When the rebuttal package is durably ready:
 If a combined rebuttal note is useful, make sure the total package still covers:
+- overall response
+- strengths recognized across reviewers
 - overview and revision strategy
 - draft responses to reviewers
 - point-to-point triage
@@ -398,6 +468,9 @@ Useful tags include:
 - supplementary experiments, if needed, are routed cleanly
 - manuscript deltas are explicit
 - the response letter is evidence-backed and honest
+- the final package contains both:
+  - reviewer-specific replies
+  - one overall response that makes the paper strengths, the main resolved concerns, and the remaining limitations legible to a neutral reader or AC
 The goal is not just “write a nicer response”.
 The goal is to convert review pressure into a durable, auditable revision workflow.

package/src/skills/rebuttal/references/response-letter-template.md CHANGED

@@ -1,9 +1,55 @@
 # Response Letter Template
+## Drafting rules
+- Treat this file as rebuttal-ready author text, not as private coaching notes.
+- Write in a calm, direct, precise author voice.
+- Brief professional courtesy is allowed, but keep it short and move to substance immediately.
+- Avoid sycophancy, flattery, excessive gratitude, or approval-seeking language.
+- Do not default to conceding fault.
+- Use selective concede, selective clarify, and selective defend.
+- Answer the reviewer concern directly in the first 1 to 2 sentences.
+- Keep the actual response body for each item as 1 to 2 full paragraphs of polished prose.
+- If the issue is about wording, interpretation, or claim strength, include the revised sentence or close paraphrase that should appear in the manuscript.
+- Do not use bullets, numbered lists, or label-value schemas inside the actual response paragraphs.
+- Do not rely on future edits alone when you can already give the clarification, argument, or draft wording now.
+- If a concrete number, setup detail, or result is still unknown, use `[[AUTHOR TO FILL]]`.
 ## Cover note
 We thank the reviewers for the careful reading and constructive feedback. Below we respond point by point and indicate the corresponding manuscript changes and supplementary evidence when applicable.
+## Overview & Revision Strategy
+- main reviewer risks:
+- current strongest evidence:
+- current weakest evidence:
+- baseline handling decision:
+- response strategy:
+- manuscript_edit_mode:
+## Overall Response
+- strongest strengths recognized across reviewers:
+- overall revision strategy:
+- biggest concerns now addressed:
+- concerns still partially open:
+- claim-scope changes:
+- remaining limitation:
+## Strengths Recognized Across Reviewers
+- strength 1:
+- strength 2:
+- why these strengths still matter after revision:
+## Resolution Snapshot
+| Item ID | Status | What changed | Evidence basis | Manuscript delta |
+| --- | --- | --- | --- | --- |
+| R1-C1 |  |  |  |  |
+| R1-C2 |  |  |  |  |
 ## Reviewer 1
 ### Item R1-C1
@@ -20,9 +66,11 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 - agree / partially_agree / clarify / respectful_disagree
-**Response**
+**Response Draft**
--
+Write 1 to 2 full paragraphs of rebuttal-ready prose here.
+The first 1 to 2 sentences should answer the concern directly.
+Then explain the evidence, manuscript rationale, and the exact clarification or wording that should appear in the revision.
 **What changed**
@@ -30,6 +78,7 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 - evidence basis:
 - claim-scope effect:
 - remaining limitation:
+- latex-ready manuscript text:
 **If an experiment is still pending**
@@ -51,9 +100,9 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 - agree / partially_agree / clarify / respectful_disagree
-**Response**
+**Response Draft**
--
+Write 1 to 2 full paragraphs of rebuttal-ready prose here.
 **What changed**
@@ -84,9 +133,9 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 - agree / partially_agree / clarify / respectful_disagree
-**Response**
+**Response Draft**
--
+Write 1 to 2 full paragraphs of rebuttal-ready prose here.
 **What changed**
@@ -106,8 +155,3 @@ We thank the reviewers for the careful reading and constructive feedback. Below
 - what could not be fully addressed:
 - why:
 - how the manuscript now reflects that limitation:
-## Author placeholders
-- If a concrete number, setup detail, or result is still unknown, use `[[AUTHOR TO FILL]]`.
-- Do not fabricate missing details just to make the letter sound complete.
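Several of the drafting rules added to this template are mechanical enough to check automatically: 1 to 2 paragraphs per response body, no bullets, numbered lists, or label-style fragments inside it, and `[[AUTHOR TO FILL]]` for unknown specifics. The sketch below is one possible lint pass in Python; the function names and the regular expression are assumptions for illustration, and the package is not known to ship any such checker.

```python
import re

PLACEHOLDER = "[[AUTHOR TO FILL]]"

# Lines that start with a bullet, a numbered-list marker, or a bold label.
_LIST_OR_LABEL = re.compile(r"^\s*(?:[-*+]\s+|\d+[.)]\s+|\*\*[^*]+\*\*)")


def lint_response_body(body: str) -> list[str]:
    """Return rule violations for one item's response body (empty list if clean)."""
    problems = []
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", body.strip()) if p.strip()]
    if not 1 <= len(paragraphs) <= 2:
        problems.append(f"expected 1 to 2 paragraphs, found {len(paragraphs)}")
    for line in body.splitlines():
        if _LIST_OR_LABEL.match(line):
            problems.append(f"list or label formatting: {line.strip()!r}")
    return problems


def count_placeholders(body: str) -> int:
    """Count unresolved author placeholders that still need a human."""
    return body.count(PLACEHOLDER)
```

A check like this only covers form. Rules such as "answer the concern directly in the first 1 to 2 sentences" or "do not default to conceding fault" still need a reader's judgment.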

package/src/skills/review/SKILL.md CHANGED

@@ -17,7 +17,7 @@ It is also not the same as `rebuttal`.
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - When the review report, revision plan, or follow-up experiment TODO list becomes durable, send a richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what the main risks are, what should be fixed next, and whether the next route is writing, experiment, or claim downgrade.
 ## Purpose
@@ -63,6 +63,16 @@ Do not treat “looks polished” as “is defensible”.
 - Do not recommend rhetoric when the real problem is missing evidence.
 - If novelty or positioning is uncertain, treat that as a literature-audit question first, not an automatic experiment request.
 - If a claim is too broad for the evidence, prefer narrowing or downgrading the claim over defending it with style.
+- If `startup_contract.review_followup_policy` is present, honor it:
+  - `audit_only`
+    - stop after durable review artifacts and a clear route recommendation
+  - `auto_execute_followups`
+    - do not stop at the audit if the next route is already clear; continue into the required experiments and manuscript deltas
+  - `user_gated_followups`
+    - finish the audit first, then package the next expensive follow-up step into one structured decision
+- If `startup_contract.manuscript_edit_mode = latex_required`, treat the provided LaTeX tree or `paper/latex/` as the writing surface when manuscript revision is needed.
+- If LaTeX source is unavailable while `latex_required` is requested, do not pretend the manuscript was edited; produce LaTeX-ready replacement text and an explicit blocker note instead.
+- Accept manuscript and review inputs from URLs, local file paths, local directories, or current-turn attachments; do not assume the draft is already perfectly normalized.
 ## Primary inputs
@@ -74,11 +84,13 @@ Use, in roughly this order:
 - the six-field `evaluation_summary` blocks from recent main experiments and analysis slices
 - recent main and analysis experiment results
 - figures, tables, and captions
+- current-turn attachments and user-provided local paths / directories / URLs for the manuscript bundle or review packet
 - prior self-review or reviewer-first notes as low-trust auxiliary input
 - nearby papers when novelty or comparison is unclear
 If the draft/result state is still unclear, open `intake-audit` first before continuing the review workflow.
 Before proposing extra experiments, read those structured `evaluation_summary` blocks first so you do not request work that the recorded evidence already resolved.
+If the user provided draft files or manuscript bundles directly, first normalize them into durable quest-visible paths before planning experiments or section-level revisions.
 ## Core outputs
@@ -87,6 +99,8 @@ The review pass should usually leave behind:
 - `paper/review/review.md`
 - `paper/review/revision_log.md`
 - `paper/review/experiment_todo.md`
+- `paper/paper_experiment_matrix.md` when more evidence is still needed
+- `paper/paper_experiment_matrix.json` when more evidence is still needed
 Use the templates in `references/` when needed:
@@ -175,14 +189,25 @@ For each serious issue, record:
 - what should change
 - whether the fix is writing-only, evidence-only, or experiment-dependent
 - whether the issue blocks `finalize`
+- one copy-ready replacement sentence / paragraph when feasible
+- one LaTeX-ready replacement block when `startup_contract.manuscript_edit_mode = latex_required`
 ### 5. Produce the follow-up experiment TODO list
 Only if more evidence is truly needed, write `paper/review/experiment_todo.md` using `references/experiment-todo-template.md`.
+When the paper still lacks experimental support, also create or revise:
+- `paper/paper_experiment_matrix.md`
+- `paper/paper_experiment_matrix.json`
+Treat the matrix as the paper-facing master plan and `paper/review/experiment_todo.md` as only the current execution frontier or review-facing subset.
 Each TODO item should include:
 - the review issue it answers
+- the matrix exp id
+- the corresponding `exp_id` in the paper experiment matrix
 - why existing evidence is still insufficient
 - the minimum experiment or analysis needed
 - required metric(s)
@@ -195,6 +220,50 @@ Each TODO item should include:
 Do not write a vague “run more ablations” list.
 Each TODO item should be concrete enough to turn into `analysis-campaign` slices or a `baseline` recovery task.
+The matrix should be broader than the TODO list and should classify the full paper-facing experiment space, not just analysis work.
+When building or revising that matrix, explicitly consider:
+- main comparison packaging or extension
+- component ablations
+- sensitivity / hyperparameter checks
+- robustness checks
+- efficiency / cost / latency / token-overhead checks when relevant
+- highlight-validation experiments that test the likely strengths of the method
+- limitation-boundary analyses
+- case study rows as optional rather than mandatory evidence
+Do not assume the paper only needs “analysis experiments”.
+Do not assume case studies belong in the required set.
+If efficiency or cost could become a reviewer-facing strength or concern, put that into the matrix explicitly.
+For the matrix, each row should usually record:
+- `exp_id`
+- `tier`
+- `experiment_type`
+- `status`
+- `feasibility_now`
+- `claim_ids`
+- `highlight_ids`
+- `research_question`
+- `hypothesis`
+- `comparators`
+- `metrics`
+- `minimal_success_criterion`
+- `paper_placement`
+- `promotion_rule`
+- `next_action`
+The matrix should also keep a short `highlight hypotheses` block.
+Do not rely on prose intuition for the method's best selling point; if a likely highlight matters, it should have a corresponding validation row in the matrix.
+Before treating the experiments section as stable, require that every currently feasible matrix row that is not merely `optional` or `dropped` is either:
+- completed
+- analyzed
+- excluded with a real reason
+- or blocked with a real reason
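The stability rule above can be sketched as a filter over matrix rows. This Python example models only a subset of the listed row fields and makes several assumptions that the diff does not settle: `feasibility_now` is treated as a boolean, `optional` / `dropped` are checked in both `tier` and `status` because the source does not say which field carries them, and `reason` is a hypothetical field standing in for "a real reason".

```python
from dataclasses import dataclass
from typing import Iterable

SKIPPABLE = {"optional", "dropped"}
DONE = {"completed", "analyzed"}
NEEDS_REASON = {"excluded", "blocked"}


@dataclass
class MatrixRow:
    """Subset of the row fields listed in the diff, plus a hypothetical `reason`."""
    exp_id: str
    tier: str
    experiment_type: str
    status: str
    feasibility_now: bool   # assumed boolean; the diff does not give a type
    reason: str = ""        # hypothetical: why the row was excluded or blocked


def unresolved_rows(rows: Iterable[MatrixRow]) -> list[str]:
    """Return exp_ids that still block treating the experiments section as stable."""
    pending = []
    for row in rows:
        if not row.feasibility_now:
            continue  # only currently feasible rows are required
        if row.tier in SKIPPABLE or row.status in SKIPPABLE:
            continue
        if row.status in DONE:
            continue
        if row.status in NEEDS_REASON and row.reason.strip():
            continue  # excluded or blocked, with a recorded reason
        pending.append(row.exp_id)
    return pending
```

An empty result would correspond to the condition stated above; a non-empty one names the rows that still need to be run, analyzed, or explicitly excluded or blocked.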
 When extra evidence is truly needed, use the shared supplementary-experiment protocol:
 - recover ids / refs first if needed
@@ -216,6 +285,54 @@ After the review artifacts are durable:
 Do not stop immediately after writing the review if the next route is already clear.
+### 7. Auto follow-up execution contract
+When `startup_contract.review_followup_policy = auto_execute_followups`:
+- treat the review as a gate, not as the endpoint
+- immediately turn the accepted follow-up route into action:
+  - `analysis-campaign`
+    - when new evidence is truly required
+  - `baseline`
+    - when a missing comparator baseline blocks fair review
+  - `write`
+    - when the issues are mostly text, outline, claim-scope, figure, or framing revisions
+- after each completed follow-up step, update:
+  - `paper/review/revision_log.md`
+  - `paper/review/experiment_todo.md`
+  - the draft or manuscript-facing revision package
+- only treat the review line as truly closed after the follow-up route has either completed or been downgraded / blocked explicitly
+When `startup_contract.review_followup_policy = user_gated_followups`:
+- stop after the durable audit artifacts
+- turn the next expensive follow-up package into one structured decision instead of continuing silently
+When `startup_contract.review_followup_policy = audit_only`:
+- stop after the durable audit artifacts and route recommendation
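The three `review_followup_policy` branches above can be summarized as a dispatch on what happens once the audit artifacts are durable. A minimal Python sketch follows, assuming the policy arrives as a string; the enum, the function name, and the returned action strings are hypothetical, while the policy values and route names are quoted from the diff.

```python
from enum import Enum


class ReviewFollowupPolicy(str, Enum):
    AUDIT_ONLY = "audit_only"
    AUTO_EXECUTE_FOLLOWUPS = "auto_execute_followups"
    USER_GATED_FOLLOWUPS = "user_gated_followups"


# Follow-up routes named in the auto-execute branch.
FOLLOWUP_ROUTES = {"analysis-campaign", "baseline", "write"}


def next_step_after_audit(policy: str, route: str) -> str:
    """Return a hypothetical action label for the step after the durable audit."""
    parsed = ReviewFollowupPolicy(policy)  # raises ValueError on unknown values
    if parsed is ReviewFollowupPolicy.AUTO_EXECUTE_FOLLOWUPS:
        if route not in FOLLOWUP_ROUTES:
            raise ValueError(f"unknown follow-up route: {route!r}")
        return f"execute:{route}"
    if parsed is ReviewFollowupPolicy.USER_GATED_FOLLOWUPS:
        # Package the next expensive step into one structured decision.
        return "request_decision"
    # audit_only: stop after artifacts and the route recommendation.
    return "stop"
```

Note that only `auto_execute_followups` treats the review as a gate; the other two policies end the pass after the audit, differing in whether a decision request is sent.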
+### 8. Manuscript revision delivery contract
+If manuscript revision is required, make the delta explicit:
+- section
+- old claim / weakness
+- new wording
+- evidence basis
+- remaining limitation
+If `startup_contract.manuscript_edit_mode = copy_ready_text`:
+- provide copy-ready replacement wording in `paper/review/revision_log.md` or a nearby revision note
+- keep the wording directly usable by the user or downstream `write`
+If `startup_contract.manuscript_edit_mode = latex_required`:
+- prefer editing the actual LaTeX sources when they are available
+- otherwise provide LaTeX-ready replacement text blocks with explicit insertion targets
+- preserve labels, citations, figure/table refs, and section structure in the suggested replacements
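The delivery contract above, combined with the earlier rule about missing LaTeX sources, suggests a small decision table. The Python sketch below is illustrative only: `revision_delivery` and the keys of the returned dictionary are hypothetical, and the target for LaTeX-ready blocks is left as `None` because the diff asks for "explicit insertion targets" without naming a file.

```python
from typing import Optional, TypedDict


class Delivery(TypedDict):
    deliver: str             # what kind of revision output to produce
    target: Optional[str]    # where it goes, when the diff names a location
    blocker_note: bool       # whether an explicit blocker note is required


def revision_delivery(mode: str, latex_available: bool) -> Delivery:
    """Map manuscript_edit_mode and LaTeX availability to a delivery plan."""
    if mode == "copy_ready_text":
        return {"deliver": "copy_ready_text",
                "target": "paper/review/revision_log.md",
                "blocker_note": False}
    if mode == "latex_required":
        if latex_available:
            # Prefer editing the provided LaTeX tree or paper/latex/.
            return {"deliver": "edit_latex_sources",
                    "target": "paper/latex/",
                    "blocker_note": False}
        # Do not pretend the manuscript was edited: LaTeX-ready text plus a blocker note.
        return {"deliver": "latex_ready_blocks",
                "target": None,
                "blocker_note": True}
    raise ValueError(f"unknown manuscript_edit_mode: {mode!r}")
```

Only the two mode values that appear in the diff are handled; any other value raises rather than guessing a default.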
 ## Companion skill routing
 Open additional skills only when the review workflow requires them:

package/src/skills/review/references/experiment-todo-template.md CHANGED

@@ -5,25 +5,48 @@
 ### TODO EXP-001
 - source review issue:
+- matrix exp id:
 - why current evidence is insufficient:
 - route type:
   - existing-result analysis
   - comparator baseline
   - supplementary experiment
   - figure / table regeneration
+- experiment type:
+  - component_ablation
+  - sensitivity
+  - robustness
+  - efficiency_cost
+  - highlight_validation
+  - failure_boundary
+  - case_study_optional
+- tier:
+  - main_required
+  - main_optional
+  - appendix
+  - optional
 - minimum task:
 - required metric(s):
 - minimal success criterion:
+- likely paper placement:
+  - main_text
+  - appendix
+  - maybe
+  - omit
 - expected manuscript impact:
 - owner / next step:
 ### TODO EXP-002
 - source review issue:
+- matrix exp id:
 - why current evidence is insufficient:
 - route type:
+- experiment type:
+- tier:
 - minimum task:
 - required metric(s):
 - minimal success criterion:
+- likely paper placement:
 - expected manuscript impact:
 - owner / next step:

package/src/skills/review/references/review-report-template.md CHANGED

@@ -1,5 +1,11 @@
 # Review Report Template
+## Review mode
+- review_followup_policy:
+- manuscript_edit_mode:
+- manuscript_source_status:
 ## Summary
 - paper / draft:
@@ -36,6 +42,8 @@
 - cause:
 - actionable fix:
 - acceptance criterion:
+- copy-ready revision text:
+- latex-ready revision text:
 ## Storyline Options + Writing Outlines
@@ -49,6 +57,14 @@
 2.
 3.
+## Manuscript Revision Package
+- section:
+- old wording / weakness:
+- new wording:
+- evidence basis:
+- latex-ready replacement block:
 ## Experiment Inventory & Research Experiment Plan
 - what existing experiments already cover:

package/src/skills/review/references/revision-log-template.md CHANGED

@@ -3,6 +3,8 @@
 ## Revision Summary
 - current draft state:
+- review_followup_policy:
+- manuscript_edit_mode:
 - highest-priority fixes:
 - blockers:
@@ -20,6 +22,8 @@
   - supplementary experiment
   - claim downgrade
 - concrete change:
+- copy-ready revision text:
+- latex-ready revision text:
 - status:
 - blocks finalize:

package/src/skills/scout/SKILL.md CHANGED

@@ -10,7 +10,7 @@ Use this skill when the quest does not yet have a stable research frame.
 ## Interaction discipline
 - Follow the shared interaction contract injected by the system prompt.
-- For ordinary active work, prefer a concise progress update once work has crossed roughly 10 tool calls with a human-meaningful delta, and do not drift beyond roughly 20 tool calls or about 15 minutes without a user-visible update.
+- For ordinary active work, prefer a concise progress update once work has crossed roughly 6 tool calls with a human-meaningful delta, and do not drift beyond roughly 12 tool calls or about 8 minutes without a user-visible update.
 - Message templates are references only. Adapt to the actual context and vary wording so updates feel natural and non-robotic.
 - If a threaded user reply arrives, interpret it relative to the latest scout progress update before assuming the task changed completely.
 - When scouting actually resolves the framing ambiguity, locks the evaluation contract, or makes the next anchor obvious, send one richer `artifact.interact(kind='milestone', reply_mode='threaded', ...)` update that says what is now clear, why it matters, and which stage should come next.