npm - superlab - Versions diffs - 0.1.74 → 0.1.76 - Mend

superlab 0.1.74 → 0.1.76

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/lib/i18n.cjs CHANGED

@@ -1974,7 +1974,7 @@ for (const [relativePath, content] of Object.entries(ZH_SKILL_FILES)) {
 const zhRebuttalModeReference = `# Rebuttal Mode
-本文件是 reviewer panel 和外部 rebuttal intake 的唯一共享合同。review、write、auto 阶段只引用本文件，不复制四审稿人逻辑。
+本文件是 reviewer panel 和外部 rebuttal intake 的唯一共享合同。review、write、auto 阶段只引用本文件，不复制五审稿人逻辑。
 ## 触发条件
@@ -1985,6 +1985,19 @@ const zhRebuttalModeReference = `# Rebuttal Mode
 普通路径修复、依赖安装、实验轮询不触发本模式，除非它们会影响 paper-facing claim。
+## Light Read Set / 轻量读取范围
+Rebuttal 是批评和路由，不是全仓审计。默认只读最小证据集合：
+- active LaTeX / 现役 LaTeX：\`main.tex\`、\`sections/*.tex\`、\`tables/*.tex\`、\`figures/*.tex\`、\`analysis/*.tex\`
+- result summaries / 结果摘要：\`summary.csv\`、\`summary.tsv\`、\`summary.json\`、\`score_effect_summary.json\`、\`metric_summary.*\`、\`run_table.*\`
+- 受管索引：evidence index、evaluation protocol、paper plan、metric glossary、terminology glossary、artifact status、active topology
+- 用户提供的外部 rebuttal、批评或审稿意见
+Do not run a whole-repository scan / 不要默认全仓扫描。不要默认读取 raw datasets / 原始数据集、full logs / 完整日志、完整 outputs 树、源码、notebook 或无关旧稿。
+只有在 LaTeX claim 与结果摘要冲突、表格数值无来源、validator 指向具体文件、或用户明确要求 deep audit 时，才扩大读取范围。扩大时必须在 rebuttal panel 记录原因和额外路径。
 ## 外部 Rebuttal Intake
 外部批评必须先转成内部 issue，再进入改稿或实验。
@@ -1994,7 +2007,7 @@ const zhRebuttalModeReference = `# Rebuttal Mode
 - 来源：reviewer id、AC、meta-review、同事或用户
 - 批评摘要
 - 影响对象：claim、section、table、figure、protocol、metric、threat model、experiment 或 wording
-- 审稿轴：R1、R2、R3 或 R4
+- 审稿轴：R1、R2、R3、R4 或 R5
 - 严重性：fatal、major、minor 或 clarification
 - 路由：\`write\`、\`iterate\`、\`report\`、\`framing\`、\`data\`、\`spec\` 或 \`ask-user\`
 - 接受检查：什么证据或稿件状态算修完
@@ -2013,7 +2026,11 @@ const zhRebuttalModeReference = `# Rebuttal Mode
 检查消融、鲁棒性、泛化、失败案例、替代解释和指标解释是否完整。
-### R4 Presentation / Clarity
+### R4 Results / Tables / Numeric Evidence / 结果、表格与数值证据
+检查实验数值、差值、表格设计、指标方向、split 数、统计支持、加粗规则、caption 和表注是否可审计；每张主表是否说明评估什么、指标如何解释、协议如何产生行，以及哪些比较边界不能跨越。
+### R5 Presentation / Clarity
 检查叙事线、术语、图表自解释、引用、LaTeX 和 section flow 是否清楚。
@@ -2037,13 +2054,25 @@ ZH_CONTENT[path.join(".lab", ".managed", "templates", "rebuttal-panel.md")] = `#
 - 证据基础：
 - 外部 rebuttal 来源（如果有）：
+## Read-scope audit / 读取范围审计
+- 是否从 Light Read Set / 轻量读取范围开始：
+- Active LaTeX / 现役 LaTeX 文件：
+- Result summaries / 结果摘要：
+- 受管索引：
+- 额外读取路径：
+- 如有扩大范围，原因：
+- 是否避免 whole-repository scan / 全仓扫描：
+- 是否避免 raw datasets / 原始数据集：
+- 是否避免 full logs / 完整日志：
 ## 外部 Rebuttal Intake
 | 来源 | 批评摘要 | 影响对象 | 审稿轴 | 严重性 | 路由 | 接受检查 |
 | --- | --- | --- | --- | --- | --- | --- |
 |  |  |  |  |  |  |  |
-## 四类审稿视角
+## 五类审稿视角
 ### R1 Significance / Originality / Insight
@@ -2069,7 +2098,15 @@ ZH_CONTENT[path.join(".lab", ".managed", "templates", "rebuttal-panel.md")] = `#
 - 路由：
 - 接受检查：
-### R4 Presentation / Clarity
+### R4 Results / Tables / Numeric Evidence / 结果、表格与数值证据
+- 问题：
+- 为什么重要：
+- 必要修复：
+- 路由：
+- 接受检查：
+### R5 Presentation / Clarity
 - 问题：
 - 为什么重要：
@@ -2172,9 +2209,10 @@ const zhReviewRebuttalMode = `
 ## Rebuttal 模式
 - 当目标是论文、section、表、图、report、claim set 或外部 rebuttal 批评时，必须读取 \`skills/lab/references/rebuttal-mode.md\`。
-- 不要在 review 阶段复制四审稿人逻辑；使用 \`.lab/.managed/templates/rebuttal-panel.md\` 写持久 reviewer panel 工件。
+- 不要在 review 阶段复制五审稿人逻辑；使用 \`.lab/.managed/templates/rebuttal-panel.md\` 写持久 reviewer panel 工件。
+- 对“rebuttal 一下看有什么缺点”这类快速审查，默认只用 Light Read Set / 轻量读取范围：active LaTeX / 现役 LaTeX、result summaries / 结果摘要、受管索引和用户提供的批评。不要默认 whole-repository scan / 全仓扫描。
 - 外部 reviewer、AC、meta-review、同事或用户批评必须先转成内部可执行 issue，再进入改稿或 response draft。
-- Reviewer Panel 按 R1 Significance / Originality / Insight、R2 Soundness / Technical Quality、R3 Evaluation / Analysis、R4 Presentation / Clarity 四类审稿视角分类。
+- Reviewer Panel 按 R1 Significance / Originality / Insight、R2 Soundness / Technical Quality、R3 Evaluation / Analysis、R4 Results / Tables / Numeric Evidence、R5 Presentation / Clarity 五类审稿视角分类。
 - L1/L2 默认把核心变更当作批准边界；L3 通过共享核心变更台账策略处理核心 claim、协议、指标、threat model、数据集范围、benchmark 范围或 framing 变化。
 `;
@@ -2184,8 +2222,9 @@ const zhWriteRebuttalMode = `
 - 当用户提供外部 reviewer、AC、meta-review、rebuttal、同事或用户自己的批评时，起草前必须读取 \`skills/lab/references/rebuttal-mode.md\`。
 - 非平凡 paper-facing 写作轮次应把 rebuttal mode 当成 reviewer acceptance gate，并用 \`.lab/.managed/templates/rebuttal-panel.md\` 写 critique artifact。
+- write 的 rebuttal gate 必须先用 Light Read Set / 轻量读取范围：active LaTeX / 现役 LaTeX、result summaries / 结果摘要、受管索引和用户提供的批评；除非 rebuttal panel 记录具体扩大原因，否则不要 whole-repository scan / 全仓扫描。
 - 不要实现 write-only rebuttal workflow；共享 rebuttal-mode 负责审稿轴、外部 rebuttal intake、issue routing 和核心变更策略。
-- fatal 或 major 的 R1/R2/R3 issue 未解决前，不要进入 prose polish；先修复、路由到 \`iterate\` / \`report\` / \`framing\` / \`spec\`，或用证据显式 waive。
+- fatal 或 major 的 R1/R2/R3/R4 issue 未解决前，不要进入 prose polish；先修复、路由到 \`iterate\` / \`report\` / \`framing\` / \`spec\`，或用证据显式 waive。
 - L3 或显式授权的写作 campaign 可以改 paper-level claim、协议、指标、threat model、数据集范围、benchmark 范围或 framing，但必须通过 \`skills/lab/references/rebuttal-mode.md\` 里的 Core Mutation Ledger 策略。
 - 在 write iteration artifact 里记录 rebuttal panel 路径、核心变更台账路径和未解决 issue id。
 `;
@@ -2196,6 +2235,7 @@ const zhAutoRebuttalMode = `
 - 当 auto campaign 包含 paper-facing \`report\`、\`write\`、外部 rebuttal repair 或 reviewer-driven paper revision 时，必须读取 \`skills/lab/references/rebuttal-mode.md\`。
 - 使用 \`.lab/.managed/templates/rebuttal-panel.md\` 写持久 Reviewer Panel 工件，不要在 auto mode 里复制一套 reviewer workflow。
+- reviewer-driven repair 先用 Light Read Set / 轻量读取范围：active LaTeX / 现役 LaTeX、result summaries / 结果摘要、受管索引和用户批评。除非 rebuttal panel 记录扩大原因，否则不要 whole-repository scan / 全仓扫描、raw datasets / 原始数据集或 full logs / 完整日志。
 - 外部 rebuttal 批评必须先转成内部 issue、route 和 acceptance check，再开始 \`run\`、\`iterate\`、\`report\` 或 \`write\`。
 - L1/L2 默认把核心变更当作批准边界；L3 可以在已批准 envelope 内修改 paper-level claim、协议、指标、threat model、reviewer profile、数据集范围、benchmark 范围或 framing。
 - L3 执行核心变更前，必须用 \`.lab/.managed/templates/core-mutation-ledger.md\` 写或更新 \`.lab/writing/core-mutation-ledger.md\`。
@@ -2703,6 +2743,8 @@ const zhAutoPriorityCodexLine =
   "显式的 `/lab:auto` 或 `/lab-auto` 请求，其优先级高于 brainstorming、spec review 这类更宽的创作或审阅技能路径。";
 const zhAutoPriorityClaudeLine =
   "显式的 `/lab auto` 或 `/lab-auto` 请求，其优先级高于 brainstorming、spec review 这类更宽的创作或审阅技能路径。";
+const zhAutoVisibleCloseoutLine =
+  "最终可见收尾必须直接消费已通过校验的 stage report：展示请求交付物或目标的状态、核心说明表的关键行、证据路径、验证命令和验证结果、已知缺口，以及下一步动作和原因。不能只用“已完成”“已推送”或流水账命令日志结束。";
 ZH_CONTENT[path.join(".codex", "prompts", "lab.md")] = ZH_CONTENT[
   path.join(".codex", "prompts", "lab.md")
@@ -2716,6 +2758,9 @@ ZH_CONTENT[path.join(".codex", "prompts", "lab-auto.md")] = ZH_CONTENT[
 ].replace(
   "已批准的 `L2` 和 `L3` 执行 campaign 默认进入执行模式。",
   `${zhAutoPriorityCodexLine}\n已批准的 \`L2\` 和 \`L3\` 执行 campaign 默认进入执行模式。`
+).replace(
+  "不要用 `sleep 30`、单次 `pgrep` 或一次性的 `metrics.json` 探针来代替真实长任务命令；当真实实验进程还活着时，只允许在出现有意义变化时发进度更新，并继续等待。没有新变化时，也只按保活节奏汇报，不要让用户触发下一次轮询。",
+  "不要用 `sleep 30`、单次 `pgrep` 或一次性的 `metrics.json` 探针来代替真实长任务命令；当真实实验进程还活着时，只允许在出现有意义变化时发进度更新，并继续等待。没有新变化时，也只按保活节奏汇报，不要让用户触发下一次轮询。\n\n" + zhAutoVisibleCloseoutLine
 );
 ZH_CONTENT[path.join(".claude", "commands", "lab.md")] = ZH_CONTENT[
@@ -2730,6 +2775,9 @@ ZH_CONTENT[path.join(".claude", "commands", "lab-auto.md")] = ZH_CONTENT[
 ].replace(
   "已批准的 `L2` 和 `L3` 执行 campaign 默认进入执行模式。",
   `${zhAutoPriorityClaudeLine}\n已批准的 \`L2\` 和 \`L3\` 执行 campaign 默认进入执行模式。`
+).replace(
+  "不要用 `sleep 30`、单次 `pgrep` 或一次性的 `metrics.json` 探针来代替真实长任务命令；当真实实验进程还活着时，只允许在出现有意义变化时发进度更新，并继续等待。没有新变化时，也只按保活节奏汇报，不要让用户触发下一次轮询。",
+  "不要用 `sleep 30`、单次 `pgrep` 或一次性的 `metrics.json` 探针来代替真实长任务命令；当真实实验进程还活着时，只允许在出现有意义变化时发进度更新，并继续等待。没有新变化时，也只按保活节奏汇报，不要让用户触发下一次轮询。\n\n" + zhAutoVisibleCloseoutLine
 );
 const zhRecipeQuickPathLine =
@@ -2761,6 +2809,12 @@ ZH_CONTENT[path.join(".claude", "commands", "lab.md")] = ZH_CONTENT[
   "- 用户只要显式调用某个 stage，无论写成 `/lab:<stage>`、`/lab: <stage>`、`/lab <stage>`、`/lab-<stage>` 还是 `/lab：<stage>`，都要立刻执行该 stage，而不是只推荐别的阶段。\n- 如果输入看起来像 stage 请求，但又不属于上述受支持写法，就必须停下并要求用户用精确的 stage 名重述，而不是自己猜。\n"
 );
+for (const rootPromptKey of [path.join(".codex", "prompts", "lab.md"), path.join(".claude", "commands", "lab.md")]) {
+  if (ZH_CONTENT[rootPromptKey] && !ZH_CONTENT[rootPromptKey].includes("最终可见收尾")) {
+    ZH_CONTENT[rootPromptKey] += `\n\n${zhAutoVisibleCloseoutLine}\n`;
+  }
+}
 ZH_CONTENT[path.join(".codex", "skills", "lab", "SKILL.md")] = `---
 name: lab
 description: 严格研究工作流，覆盖 idea、data、auto、framing、spec、run、iterate、review、report 和 paper-writing。
@@ -3405,6 +3459,18 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")] = ZH_CONTE
   "- 只有当级别本身真的有歧义时，才停下来追问，例如 \\`第三层\\`、\\`phase 3\\`、\\`table 3\\`。",
   "- 只有当级别本身真的有歧义时，才停下来追问，例如 \\`第三层\\`、\\`phase 3\\`、\\`table 3\\`。\n- 如果用户显式调用 \\`/lab:auto\\` 或 \\`/lab-auto\\`，就保持在 auto 执行路径里；只要请求仍在已批准 execution envelope 内，即使目标听起来像 feature selection、baseline selection、离散化或 candidate sweep，也不要重新路由到 brainstorming 或 spec review。"
 );
+const zhAutoStageVisibleCloseout = `
+## 最终可见收尾
+- 最终可见收尾必须在 stage report 校验通过后给出，不能只写“已完成”“已推送”或命令流水账。
+- 最终可见收尾必须直接来自已校验的阶段报告，而不是另起一套临场叙述。
+- 最终可见收尾至少包含：请求交付物或目标及状态、核心说明表关键行、证据路径、验证命令和验证结果、已知缺口、下一步动作和为什么这样做。
+- 如果说“已完成”，也必须同时写明仍然存在的 handoff 边界，例如 PDF 编译、版面检查、外部审批、预算耗尽、冻结核心风险或环境缺失。
+`;
+if (!ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")].includes("最终可见收尾")) {
+  ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")] += zhAutoStageVisibleCloseout;
+}
 ZH_CONTENT[path.join(".claude", "skills", "lab", "stages", "auto.md")] =
   ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")];
 ZH_CONTENT[path.join(".claude", "skills", "lab", "stages", "report.md")] =
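The hunks above patch prompt text in two ways: chained `.replace(...)` calls on an exact sentence, and a guarded `+=` that appends only when the marker text is absent. A string-pattern `replace` returns its input unchanged when the pattern is not found, which may be why the guarded append exists as a fallback. The following is a minimal sketch of that pattern, assuming a hypothetical helper `appendOnce` and a made-up content map; neither is part of superlab.

```javascript
// Hypothetical helper: `appendOnce` and the sample content map are
// illustrative assumptions, not code from the package.
function appendOnce(content, key, marker, addition) {
  const current = content[key];
  // Skip missing entries and entries that already contain the marker text.
  if (typeof current !== "string" || current.includes(marker)) {
    return false;
  }
  content[key] = current + "\n\n" + addition + "\n";
  return true;
}

const content = { "lab.md": "# lab\n" };
const closeoutLine = "最终可见收尾 must consume the validated stage report.";

// With a string pattern, replace() leaves the input untouched if there is no match.
const replaced = content["lab.md"].replace("sentence that is not present", "x");
console.log(replaced === content["lab.md"]); // true

console.log(appendOnce(content, "lab.md", "最终可见收尾", closeoutLine)); // true
console.log(appendOnce(content, "lab.md", "最终可见收尾", closeoutLine)); // false
```

Checking for the marker before appending keeps the patch idempotent, so running it twice does not duplicate the closeout line.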

package/package-assets/claude/commands/lab.md CHANGED

@@ -100,6 +100,7 @@ Treat all of these as equivalent stage requests:
 - While the loop is alive, `/lab auto` should keep `.lab/context/auto-ledger.md` updated with the active owner, observed state, and resume boundary.
 - Separate internal polling from user-facing progress reports.
 - While the loop is healthy, `/lab auto` should report to the user only on a meaningful change or at the keepalive cadence recorded in the current contract or runtime state, and it should not ask the user to trigger the next poll.
+- Final visible closeout must consume the validated stage report: show requested deliverable statuses, Core Explanation Table rows, evidence paths, validation/verification commands and results, known gaps, and the next action. Do not end with only "done", "pushed", or a chronological command log.
 - Treat `Autonomy level L1/L2/L3` as the execution privilege level, not as a paper layer, phase, or table number.
 - Treat `paper layer`, `phase`, and `table` as experiment targets. For example, `paper layer 3` or `Phase 1` should not be interpreted as `Autonomy level L3`.

package/package-assets/codex/prompts/lab/auto.md CHANGED

@@ -27,3 +27,4 @@ If the preflight block cannot be completed because any required field is missing
 When the repository workflow language is Chinese, summaries, checklist items, task labels, and progress updates should be written in Chinese unless a literal identifier must stay unchanged.
 Treat `Layer 3`, `Phase 1`, or `Table 2` as paper-scope targets. Treat `Autonomy level L3` as the execution permission level.
 Do not replace the real long-running experiment command with a short watcher such as `sleep 30`, `pgrep`, or a one-shot `metrics.json` probe. While the real experiment process is still alive, emit only a progress update and keep waiting.
+Final visible closeout is mandatory when `/lab:auto` reaches stop, failure, escalation, or handoff. After validating the stage report, the final answer must consume that report directly: list the requested deliverables or objectives with status, summarize the Core Explanation Table rows, provide evidence paths, show validation/verification commands and validation results, name known gaps or commands that could not run, and state the next action plus why it is appropriate. Do not end with only `done`, `pushed`, `completed`, or a chronological command log.

package/package-assets/codex/prompts/lab-auto.md CHANGED

@@ -27,3 +27,4 @@ If the preflight block cannot be completed because any required field is missing
 When the repository workflow language is Chinese, summaries, checklist items, task labels, and progress updates should be written in Chinese unless a literal identifier must stay unchanged.
 Treat `Layer 3`, `Phase 1`, or `Table 2` as paper-scope targets. Treat `Autonomy level L3` as the execution permission level.
 Do not replace the real long-running experiment command with a short watcher such as `sleep 30`, `pgrep`, or a one-shot `metrics.json` probe. While the real experiment process is still alive, emit only a progress update and keep waiting.
+Final visible closeout is mandatory when `/lab:auto` reaches stop, failure, escalation, or handoff. After validating the stage report, the final answer must consume that report directly: list the requested deliverables or objectives with status, summarize the Core Explanation Table rows, provide evidence paths, show validation/verification commands and validation results, name known gaps or commands that could not run, and state the next action plus why it is appropriate. Do not end with only `done`, `pushed`, `completed`, or a chronological command log.

package/package-assets/codex/prompts/lab.md CHANGED

@@ -94,6 +94,7 @@ Treat all of these as equivalent stage requests:
 - While the loop is alive, `/lab:auto` should keep `.lab/context/auto-ledger.md` updated with the active owner, observed state, and resume boundary.
 - Separate internal polling from user-facing progress reports.
 - While the loop is healthy, `/lab:auto` should report to the user only on a meaningful change or at the keepalive cadence recorded in the current contract or runtime state, and it should not ask the user to trigger the next poll.
+- Final visible closeout must consume the validated stage report: show requested deliverable statuses, Core Explanation Table rows, evidence paths, validation/verification commands and results, known gaps, and the next action. Do not end with only "done", "pushed", or a chronological command log.
 - Treat `Autonomy level L1/L2/L3` as the execution privilege level, not as a paper layer, phase, or table number.
 - Treat `paper layer`, `phase`, and `table` as experiment targets. For example, `paper layer 3` or `Phase 1` should not be interpreted as `Autonomy level L3`.

package/package-assets/codex/prompts/lab:auto.md CHANGED

@@ -27,3 +27,4 @@ If the preflight block cannot be completed because any required field is missing
 When the repository workflow language is Chinese, summaries, checklist items, task labels, and progress updates should be written in Chinese unless a literal identifier must stay unchanged.
 Treat `Layer 3`, `Phase 1`, or `Table 2` as paper-scope targets. Treat `Autonomy level L3` as the execution permission level.
 Do not replace the real long-running experiment command with a short watcher such as `sleep 30`, `pgrep`, or a one-shot `metrics.json` probe. While the real experiment process is still alive, emit only a progress update and keep waiting.
+Final visible closeout is mandatory when `/lab:auto` reaches stop, failure, escalation, or handoff. After validating the stage report, the final answer must consume that report directly: list the requested deliverables or objectives with status, summarize the Core Explanation Table rows, provide evidence paths, show validation/verification commands and validation results, name known gaps or commands that could not run, and state the next action plus why it is appropriate. Do not end with only `done`, `pushed`, `completed`, or a chronological command log.

package/package-assets/codex/prompts/lab：auto.md CHANGED

@@ -27,3 +27,4 @@ If the preflight block cannot be completed because any required field is missing
 When the repository workflow language is Chinese, summaries, checklist items, task labels, and progress updates should be written in Chinese unless a literal identifier must stay unchanged.
 Treat `Layer 3`, `Phase 1`, or `Table 2` as paper-scope targets. Treat `Autonomy level L3` as the execution permission level.
 Do not replace the real long-running experiment command with a short watcher such as `sleep 30`, `pgrep`, or a one-shot `metrics.json` probe. While the real experiment process is still alive, emit only a progress update and keep waiting.
+Final visible closeout is mandatory when `/lab:auto` reaches stop, failure, escalation, or handoff. After validating the stage report, the final answer must consume that report directly: list the requested deliverables or objectives with status, summarize the Core Explanation Table rows, provide evidence paths, show validation/verification commands and validation results, name known gaps or commands that could not run, and state the next action plus why it is appropriate. Do not end with only `done`, `pushed`, `completed`, or a chronological command log.

package/package-assets/shared/lab/.managed/templates/rebuttal-panel.md CHANGED

@@ -8,6 +8,18 @@
 - Evidence base:
 - External rebuttal source, if any:
+## Read-scope audit
+- Started from the Light Read Set:
+- active LaTeX files read:
+- result summaries read:
+- Managed indices read:
+- Extra paths read:
+- Why scope was expanded, if any:
+- Whole-repository scan avoided:
+- Raw datasets avoided:
+- Full logs avoided:
 ## External Rebuttal Intake
 | Source | Raw criticism summary | Affected unit | Reviewer axis | Severity | Route | Acceptance check |
@@ -40,7 +52,15 @@
 - Route:
 - Acceptance check:
-### R4 Presentation / Clarity
+### R4 Results / Tables / Numeric Evidence
+- Finding:
+- Why it matters:
+- Required fix:
+- Route:
+- Acceptance check:
+### R5 Presentation / Clarity
 - Finding:
 - Why it matters:
@@ -68,4 +88,3 @@
 - Next route:
 - Blocking issue, if any:
 - Handoff note:

package/package-assets/shared/skills/lab/SKILL.md CHANGED

@@ -49,11 +49,12 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - If the stage says improvement is needed, do not choose `stop` unless the next action states a concrete terminal boundary such as budget exhaustion, frozen-core risk, safety or integrity failure, impossible target, or a required approval boundary. Otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Stage reports are closeout and handoff artifacts, not a new user command and not a replacement for stage-specific artifacts such as idea memos, iteration reports, final reports, or write-iteration records.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage <stage>` before claiming the stage is complete, and include the stage-report path plus validation result in the final user-facing summary.
+- For `/lab:auto`, the final user-facing answer must visibly consume the validated stage report: summarize requested deliverable statuses, Core Explanation Table rows, evidence paths, validation/verification commands and results, known gaps, and the next action. A chat-only chronological result list is not a valid closeout.
 - Final paper output should default to LaTeX, and its manuscript language should be decided separately from the workflow language.
 - Separate sourced facts from model-generated hypotheses.
 - Preserve failed runs, failed ideas, and limitations.
 - Use `skills/lab/references/recipes.md` as the quick path for common stage chains without inventing new commands.
-- Use `.codex/skills/lab/references/rebuttal-mode.md` or `.claude/skills/lab/references/rebuttal-mode.md` as the single shared reviewer-panel and external rebuttal intake contract. Do not copy four-reviewer logic into `review`, `write`, or `auto` stage guides.
+- Use `.codex/skills/lab/references/rebuttal-mode.md` or `.claude/skills/lab/references/rebuttal-mode.md` as the single shared reviewer-panel and external rebuttal intake contract. Do not copy five-reviewer logic into `review`, `write`, or `auto` stage guides.
 ## Stage Contract

package/package-assets/shared/skills/lab/references/rebuttal-mode.md CHANGED

@@ -23,6 +23,28 @@ Do not trigger rebuttal mode for routine implementation reviews, path fixes, dep
 - external rebuttal text when provided
 - active autonomy level when the stage is `/lab:auto`
+## Light Read Set
+Rebuttal mode is a criticism and routing pass, not a full repository audit. Start with the smallest evidence bundle that can support reviewer-style findings.
+Default read set:
+- active LaTeX manuscript files: `main.tex`, `sections/*.tex`, `tables/*.tex`, `figures/*.tex`, and `analysis/*.tex` when they are part of the active paper topology
+- result summaries: `summary.csv`, `summary.tsv`, `summary.json`, `score_effect_summary.json`, `metric_summary.*`, `run_table.*`, selected aggregate tables, and already-rendered table inputs
+- managed paper indices when present: evidence index, evaluation protocol, paper plan, metric glossary, terminology glossary, artifact status, and active topology file
+- the specific external rebuttal text, user criticism, or reviewer comments supplied for the pass
+Do not run a whole-repository scan by default. Do not read raw datasets, full logs, full output trees, source code, notebooks, or unrelated drafts unless a specific issue cannot be resolved from the light read set.
+Expand the read set only when one of these conditions holds:
+- a LaTeX claim names a result whose summary file is missing or contradictory
+- a table value cannot be traced to any result summary
+- a validator points to a specific source file or generated artifact
+- the user explicitly asks for a deep audit instead of a rebuttal pass
+When expanding scope, record the reason and extra paths in the rebuttal panel. If the pass stays within the light read set, record that as well.
 ## External Rebuttal Intake
 External criticism must be converted into internal issues before any rewrite.
@@ -32,7 +54,7 @@ For each external comment, record:
 - source: reviewer id, AC, meta-review, colleague, or user
 - raw criticism summary
 - affected paper unit: claim, section, table, figure, protocol, metric, threat model, experiment, or wording
-- reviewer axis: R1, R2, R3, or R4
+- reviewer axis: R1, R2, R3, R4, or R5
 - severity: fatal, major, minor, or clarification
 - route: `write`, `iterate`, `report`, `framing`, `data`, `spec`, or `ask-user`
 - acceptance check: concrete evidence or manuscript condition that resolves the issue
@@ -41,7 +63,7 @@ Do not answer external criticism with prose-only reassurance. If the issue is va
 ## Reviewer Panel
-Run four independent review lenses. Each lens must produce actionable issues, not vague advice.
+Run five independent review lenses. Each lens must produce actionable issues, not vague advice.
 ### R1 Significance / Originality / Insight
@@ -61,7 +83,13 @@ Ask whether evaluation covers ablations, robustness, generalization, failure cas
 Typical fixes route to `iterate`, `report`, or `write`.
-### R4 Presentation / Clarity
+### R4 Results / Tables / Numeric Evidence
+Ask whether reported numbers, deltas, table design, metric directions, split counts, statistical support, bolding, captions, and table notes make the evidence auditable. Check whether each major table states what it evaluates, how metrics are computed or interpreted, what protocol generated the rows, and what can or cannot be compared.
+Typical fixes route to `report`, `iterate`, or `write`.
+### R5 Presentation / Clarity
 Ask whether the storyline, terminology, figure/table semantics, citations, LaTeX, and section flow are readable and self-contained.
@@ -121,7 +149,7 @@ If old evidence remains usable under a narrower interpretation, say exactly wher
 `/lab:review` uses rebuttal mode as its reviewer-panel operating mode when the target is paper-facing or when external criticism is supplied.
-`/lab:write` uses rebuttal mode as an acceptance gate for nontrivial section or manuscript rounds. A write round may not proceed to prose polish while a fatal or major R1/R2/R3 issue remains unresolved.
+`/lab:write` uses rebuttal mode as an acceptance gate for nontrivial section or manuscript rounds. A write round may not proceed to prose polish while a fatal or major R1/R2/R3/R4 issue remains unresolved.
 `/lab:auto` uses rebuttal mode as a promotion guard when the campaign includes paper-facing `report`, `write`, or external rebuttal repair. In L3, auto may execute core mutation after ledger entry and impact audit.
@@ -132,4 +160,3 @@ If old evidence remains usable under a narrower interpretation, say exactly wher
 - Revise when the fix is manuscript-only.
 - Escalate when the issue requires a decision outside the current autonomy level.
 - Stop only when the remaining issue is terminal, already waived with evidence, or outside the campaign boundary.
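The Light Read Set rule added in this file amounts to a small decision: stay on the light set unless one of four named conditions holds, and record the reason and extra paths when expanding. Below is a rough sketch of that decision, assuming hypothetical flag names (`summaryMissingOrContradictory` and so on) and a hypothetical function `readScopeDecision`; the package expresses this rule as prose for the agent, not as code.

```javascript
// Hypothetical sketch: flag names and the function are assumptions made for
// illustration. The reason strings paraphrase the four expansion conditions.
const EXPANSION_CONDITIONS = [
  ["summaryMissingOrContradictory",
    "a LaTeX claim names a result whose summary file is missing or contradictory"],
  ["tableValueUntraceable",
    "a table value cannot be traced to any result summary"],
  ["validatorPointsToFile",
    "a validator points to a specific source file or generated artifact"],
  ["userRequestedDeepAudit",
    "the user explicitly asks for a deep audit instead of a rebuttal pass"],
];

function readScopeDecision(signals, extraPaths = []) {
  // Collect the reason text for every condition that is flagged true.
  const reasons = EXPANSION_CONDITIONS
    .filter(([flag]) => signals[flag] === true)
    .map(([, reason]) => reason);
  const expand = reasons.length > 0;
  return {
    startedFromLightReadSet: true,
    expand,
    reasons,
    // Extra paths are only recorded when at least one condition justifies them.
    extraPaths: expand ? extraPaths : [],
  };
}

// The path below is a made-up example.
console.log(readScopeDecision({ tableValueUntraceable: true }, ["outputs/example/metrics.json"]));
console.log(readScopeDecision({}));
```

The returned object mirrors the "Read-scope audit" fields of the rebuttal panel template, so either outcome (expanded or not) leaves something to record.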

package/package-assets/shared/skills/lab/stages/auto.md CHANGED

@@ -129,6 +129,7 @@
 - When an auto campaign includes paper-facing `report`, `write`, external rebuttal repair, or reviewer-driven paper revision, load the shared rebuttal procedure in `skills/lab/references/rebuttal-mode.md`.
 - Use `.lab/.managed/templates/rebuttal-panel.md` for the durable Reviewer Panel artifact instead of embedding a separate reviewer workflow in auto mode.
+- Start reviewer-driven repair from the rebuttal Light Read Set: active LaTeX, result summaries, managed indices, and supplied criticism. Do not perform whole-repository scans, raw dataset reads, or full log sweeps unless the rebuttal panel records a concrete expansion reason.
 - External rebuttal criticism must be converted into internal issues, routes, and acceptance checks before `run`, `iterate`, `report`, or `write` work starts.
 - In L1/L2, core mutation remains an approval boundary unless explicitly authorized by the auto contract.
 - In L3, auto may change paper-level claim, protocol, metric, threat model, reviewer profile, dataset scope, benchmark scope, or framing inside the approved campaign envelope. It must first write or update `.lab/writing/core-mutation-ledger.md` from `.lab/.managed/templates/core-mutation-ledger.md`.
@@ -230,3 +231,13 @@
 - Fill the `Core Explanation Table` in plain language: background, why now, what ran, how the loop ran, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
 - If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage auto` and include the report path plus validation result in the final user-facing summary.
+- Final visible closeout is mandatory after validation. Do not end `/lab:auto` with only "done", "pushed", "completed", or a chronological command log.
+- The final visible closeout must be derived from the validated stage report, not from a separate improvised narrative.
+- The final visible closeout must include:
+  - the user's requested deliverables or objectives and their status: completed, repaired, failed-gate, not promoted, blocked, or handoff
+  - the key Core Explanation Table rows: what was done, how it was done, what worked, what did not work, what was verified, what remains unverified, whether improvement is needed and why, how to improve and why
+  - evidence paths and primary artifacts
+  - validation/verification commands and validation result, including commands that could not run
+  - known gaps or compile/runtime limitations
+  - next action and why that action is appropriate
+- If the final answer says the work is "completed", it must still name any remaining handoff boundary such as PDF compile, layout check, external approval, budget exhaustion, frozen-core risk, or missing environment.
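The closeout requirements added above read like a checklist, so one way to picture them is a completeness check over a closeout draft. This is only a sketch under stated assumptions: the field names and `missingCloseoutFields` are hypothetical, and the package's actual validator is the Python script `validate_stage_report.py`, whose behavior is not shown in this diff.

```javascript
// Hypothetical field names, one per required closeout item listed in the diff.
const REQUIRED_CLOSEOUT_FIELDS = [
  "deliverableStatuses",
  "coreExplanationRows",
  "evidencePaths",
  "validationCommandsAndResults",
  "knownGaps",
  "nextActionAndWhy",
];

function missingCloseoutFields(closeout) {
  return REQUIRED_CLOSEOUT_FIELDS.filter((field) => {
    const value = closeout[field];
    // Arrays must be non-empty; anything else must be a non-blank string.
    if (Array.isArray(value)) return value.length === 0;
    return typeof value !== "string" || value.trim() === "";
  });
}

// A bare "done" style closeout fails every item on the checklist.
console.log(missingCloseoutFields({ summary: "done" }).length); // 6
```

In this sketch an explicit string such as "none known" satisfies `knownGaps`, which matches the idea that gaps must be stated rather than silently omitted.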

package/package-assets/shared/skills/lab/stages/review.md CHANGED

@@ -44,9 +44,10 @@
 ## Rebuttal Mode
 - When the target is a paper, paper section, table, figure, report, claim set, or external rebuttal criticism, run the shared reviewer-panel procedure in `skills/lab/references/rebuttal-mode.md`.
-- Do not duplicate the four-reviewer logic in this stage file. Use `.lab/.managed/templates/rebuttal-panel.md` for the durable critique artifact.
+- Do not duplicate the five-reviewer logic in this stage file. Use `.lab/.managed/templates/rebuttal-panel.md` for the durable critique artifact.
+- For quick prompts such as "rebuttal一下看有什么缺点", start with the rebuttal Light Read Set only: active LaTeX, result summaries, managed indices, and supplied criticism. Do not run a whole-repository scan unless the panel records a specific escalation reason.
 - External rebuttal, AC, meta-review, colleague, or user criticism must be converted into internal actionable issues before any rewrite or response draft.
-- The Reviewer Panel must classify issues across R1 Significance / Originality / Insight, R2 Soundness / Technical Quality, R3 Evaluation / Analysis, and R4 Presentation / Clarity.
+- The Reviewer Panel must classify issues across R1 Significance / Originality / Insight, R2 Soundness / Technical Quality, R3 Evaluation / Analysis, R4 Results / Tables / Numeric Evidence, and R5 Presentation / Clarity.
 - Each issue must include severity, affected artifact, required fix, route, acceptance check, and whether core mutation is required.
 - In L1/L2, core mutation remains an approval boundary unless explicitly authorized. In L3, route core mutation through the shared ledger policy instead of treating it as a reviewer-stage blocker.

package/package-assets/shared/skills/lab/stages/write.md CHANGED

@@ -75,8 +75,9 @@ Run these on every round:
 - When the user provides external reviewer, AC, meta-review, rebuttal, colleague, or user criticism, load `skills/lab/references/rebuttal-mode.md` before drafting.
 - For nontrivial paper-facing write rounds, use rebuttal mode as the reviewer acceptance gate and write the critique artifact from `.lab/.managed/templates/rebuttal-panel.md`.
+- Rebuttal gating in write mode must start from the rebuttal Light Read Set. Read active LaTeX, result summaries, managed indices, and supplied criticism first; avoid whole-repository scans unless the rebuttal panel records a concrete expansion reason.
 - Do not implement a separate write-only rebuttal workflow. The shared rebuttal-mode reference owns reviewer axes, external rebuttal intake, issue routing, and core mutation policy.
-- Fatal or major R1/R2/R3 issues block prose polish until they are repaired, routed to `iterate`/`report`/`framing`/`spec`, or explicitly waived with evidence.
+- Fatal or major R1/R2/R3/R4 issues block prose polish until they are repaired, routed to `iterate`/`report`/`framing`/`spec`, or explicitly waived with evidence.
 - In L3 or an explicitly core-authorized write campaign, paper-level claim, protocol, metric, threat model, dataset scope, benchmark scope, or framing changes are allowed only through the shared Core Mutation Ledger policy in `skills/lab/references/rebuttal-mode.md`.
 - Record the rebuttal panel path, any core mutation ledger path, and unresolved issue ids in the write-iteration artifact.
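The gate change in this hunk extends the blocking axes from R1/R2/R3 to R1/R2/R3/R4, while R5 (Presentation / Clarity) still does not block prose polish. A minimal sketch of that rule follows, assuming a hypothetical issue shape with `axis`, `severity`, and `status` fields; the status values are illustrative stand-ins for "repaired, routed, or explicitly waived with evidence".

```javascript
// Hypothetical issue shape; superlab records issues in a markdown panel, not objects.
const BLOCKING_AXES = new Set(["R1", "R2", "R3", "R4"]);
const BLOCKING_SEVERITIES = new Set(["fatal", "major"]);
const CLEARED_STATUSES = new Set(["repaired", "routed", "waived"]);

function blockingIssues(issues) {
  // An issue blocks prose polish when it is fatal or major, sits on R1 to R4,
  // and has not been repaired, routed, or waived.
  return issues.filter((issue) =>
    BLOCKING_AXES.has(issue.axis) &&
    BLOCKING_SEVERITIES.has(issue.severity) &&
    !CLEARED_STATUSES.has(issue.status));
}

const issues = [
  { id: "I1", axis: "R4", severity: "major", status: "open" },
  { id: "I2", axis: "R5", severity: "major", status: "open" },
  { id: "I3", axis: "R2", severity: "fatal", status: "waived" },
  { id: "I4", axis: "R1", severity: "minor", status: "open" },
];
console.log(blockingIssues(issues).map((issue) => issue.id)); // [ 'I1' ]
```

Under the 0.1.74 wording, I1 would not have blocked because R4 was outside the gate; under 0.1.76 it does.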

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlab",
-  "version": "0.1.74",
+  "version": "0.1.76",
   "description": "Strict /lab research workflow installer for Codex and Claude",
   "keywords": [
     "codex",