npm - superlab - Versions diffs - 0.1.31 → 0.1.33 - Mend

superlab 0.1.31 → 0.1.33


Files changed (18)

package/lib/i18n.cjs CHANGED

@@ -387,10 +387,13 @@ const ZH_SKILL_FILES = {
 - 为什么我们的想法优于现有方法
 - 用白话解释我们准备怎么做
 - 三个一眼就有意义的点
-- 第一轮脑暴：先列 2-4 个候选方向
+- 第一轮脑暴：先列 3-4 个候选方向
+- 每个候选方向都要讲清：它是什么、为什么值得研究、大致怎么做、解决了什么问题、主要风险是什么
 - 第一轮文献检索：每个方向补 3-5 篇最接近前作
 - 第二轮脑暴：收敛到 1-2 个幸存方向
+- 第二轮脑暴还要讲清：为什么保留这些方向、为什么淘汰其他方向、为什么当前推荐项更强
 - 第二轮文献检索：把幸存方向扩展成完整文献范围包
+- 用于最终推荐的文献摘要：明确写出最接近前作、近期强相关论文，以及现有工作仍未解决什么
 - 文献范围包，默认目标约 20 篇来源；如果领域太窄不足 20 篇，必须明确解释
 - idea 来源清单：记录两轮检索实际用到的查询、来源分桶和最终来源数
 - 进入 \`/lab:spec\` 前的 approval gate
@@ -403,8 +406,10 @@ const ZH_SKILL_FILES = {
 - 在谈 novelty 前先做文献范围梳理。默认目标是约 20 篇来源，覆盖最接近前作、近期强相关论文、benchmark 或评测论文、survey 或 taxonomy，以及必要的相邻领域工作。
 - 必须维护单独的 idea 来源清单，把两轮检索真正使用过的查询、来源分桶和最终来源数记下来。
 - 第一轮脑暴只负责打开空间，不负责直接下 novelty 结论。
+- 第一轮脑暴至少覆盖三个候选方向，并为每个方向写清楚核心理由。
 - 第一轮文献检索负责快速淘汰撞题或没有缺口的方向。
 - 第二轮脑暴必须明确哪些方向被保留、哪些被否掉、为什么。
+- 最终推荐前要有一个用户可见的文献摘要，而不是只把来源藏在 source log 里。
 - 第二轮文献检索完成前，不要把方向写成最终 paper fit 或 final recommendation。
 - 如果真实领域过窄，达不到默认数量，也必须把原因写清楚，而不是静默跳过。
 - \`idea\` 工件必须跟随仓库的 \`workflow_language\`，不能自己换成别的语言。
@@ -1881,6 +1886,8 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "write.md")] = `# \`/l
 - 如果当前是最终导出或最终定稿轮次，且 \`workflow_language\` 与 \`paper_language\` 不一致、\`paper_language_finalization_decision\` 还是 \`unconfirmed\`，就必须追问一次：保持当前 workflow language，还是把最终稿转换成 \`paper_language\`。
 - 如果用户选择保持当前 workflow language，就把 \`paper_language_finalization_decision\` 持久化成 \`keep-workflow-language\`。
 - 如果用户选择转换成最终论文语言，就把 \`paper_language_finalization_decision\` 持久化成 \`convert-to-paper-language\`。
+- 如果 \`workflow_language\` 与 \`paper_language\` 不一致，就不要在确认前先把最终论文正文改写成 \`paper_language\`；必须先追问、再持久化、再改稿。
+- 如果 \`workflow_language\` 与 \`paper_language\` 不一致，就要在最新 write iteration artifact 里记录工作流语言、论文语言、最终语言决定，以及为什么这样决定。
 - 如果 \`paper_language_finalization_decision\` 是 \`convert-to-paper-language\`，在接受最终定稿前必须把最终稿转换到 \`paper_language\`。
 - 只加载当前 section guide，不要一次加载全部章节参考。
 - 如果当前 section 是 \`abstract\`、\`introduction\` 或 \`method\`，还必须继续读取本地 example bank：\`references/paper-writing/examples/index.md\`、对应的 examples index，以及 1-2 个具体 example 文件。
@@ -1909,6 +1916,7 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "write.md")] = `# \`/l
 - 最终定稿或导出轮次结束前，必须先通过 \`.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode final\` 和 \`.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode final\`。
 - 如果最终轮次的 section 或 claim 校验失败，就继续改正文；不能因为资产齐全就停下。
 - 在最终定稿或导出轮次结束前，必须运行 \`.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper\`；如果失败，就继续补表、图、引用和正文。
+- 如果 \`workflow_language\` 与 \`paper_language\` 不一致，\`validate_manuscript_delivery.py\` 还会检查最新 write iteration 里是否存在语言决策审计；缺这块就不能算最终稿合格。
 - 如果本地有 LaTeX 工具链，就再跑一次 compile smoke test；如果没有，也要在 write iteration artifact 里记录该限制。
 - 如果缺少 framing artifact，不要继续写作，直接回到 \`/lab:framing\`。
 - 如果某个 claim 没有证据支撑，就削弱或删除。
@@ -2003,7 +2011,7 @@ ZH_CONTENT[path.join(".lab", ".managed", "templates", "framing.md")] = `# 论文
 ZH_CONTENT[path.join(".codex", "prompts", "lab.md")] = codexPrompt(
   "查看 /lab 研究工作流总览并选择合适阶段",
   "workflow question 或 stage choice",
-  "# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n  先做两轮脑暴和两轮文献检索，再定义问题与 failure case、对比最接近前作，并输出带 approval gate 的 source-backed recommendation。\n\n- `/lab:data`\n  把已批准的 idea 转成数据集与 benchmark 方案，记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制，以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由，和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n  在不改变 mission、framing 和核心 claims 的前提下，读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`，必要时扩展数据集、benchmark 和 comparison methods，并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal，并显式批准契约。\n\n- `/lab:framing`\n  通过审计当前领域与相邻领域的术语，锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets，并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n  把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录，并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n  执行最小有意义验证运行，登记 run，并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n  在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n  以 reviewer mode 审查文档或结果，先给短摘要，再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n  从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n  使用已安装 `lab` skill 下 vendored 的 paper-writing references，把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时，要立刻执行该 stage，而不是只推荐别的 `/lab` stage。\n- 先给简洁摘要，再决定是否写工件，最后回报输出路径和下一步。\n- 如果歧义会影响结论，一次只问一个问题；如果有多条可行路径，先给 2-3 个方案再收敛。\n- `/lab:spec` 前应已有经批准的数据集与 benchmark 方案。\n- `/lab:run`、`/lab:iterate`、`/lab:auto`、`/lab:report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `.lab/context/eval-protocol.md` 不只定义主指标和主表，也应定义指标释义、实验阶梯，以及指标和对比实现的来源。\n- `/lab:auto` 只编排已批准边界内的执行阶段，不替代手动的 idea/data/framing/spec 决策。\n- `/lab:write` 前必须已有经批准的 `/lab:framing` 工件。\n\n## 如何输入 `/lab:auto`\n\n## `/lab:auto` 层级指南\n\n- `L1`：适合安全验证、一轮 bounded 真实运行，或简单 report 刷新。\n- `L2`：默认推荐级别，适合冻结核心边界内的常规实验迭代。\n- `L3`：激进 campaign 级别，只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定，默认推荐 `L2`。\n- 如果用户输入没写级别，或者把级别和 `paper layer`、`phase`、`table` 混用了，就应先停下来，要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别，不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab:auto` 输入应至少说清：objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文，摘要、清单条目、任务标签和进度更新都应使用中文，除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例：`/lab:auto 自治级别 L2。目标：推进 paper layer 3。终止条件：完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改：配置、数据接入、评估脚本。`\n"
+  "# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n  先做两轮脑暴和两轮文献检索，再定义问题与 failure case、对比最接近前作，并输出带 approval gate 的 source-backed recommendation。\n\n- `/lab:data`\n  把已批准的 idea 转成数据集与 benchmark 方案，记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制，以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由，和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n  在不改变 mission、framing 和核心 claims 的前提下，读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`，必要时扩展数据集、benchmark 和 comparison methods，并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal，并显式批准契约。\n\n- `/lab:framing`\n  通过审计当前领域与相邻领域的术语，锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets，并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n  把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录，并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n  执行最小有意义验证运行，登记 run，并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n  在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n  以 reviewer mode 审查文档或结果，先给短摘要，再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n  从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n  使用已安装 `lab` skill 下 vendored 的 paper-writing references，把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时，要立刻执行该 stage，而不是只推荐别的 `/lab` stage。\n- 先给简洁的阶段摘要；只要 stage 合同要求受管工件，就应立刻落盘，再回报输出路径和下一步。\n- 如果歧义会影响结论，一次只问一个问题；如果有多条可行路径，先给 2-3 个方案再收敛。\n- `/lab:spec` 前应已有经批准的数据集与 benchmark 方案。\n- `/lab:run`、`/lab:iterate`、`/lab:auto`、`/lab:report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `.lab/context/eval-protocol.md` 不只定义主指标和主表，也应定义指标释义、实验阶梯，以及指标和对比实现的来源。\n- `/lab:auto` 只编排已批准边界内的执行阶段，不替代手动的 idea/data/framing/spec 决策。\n- `/lab:write` 前必须已有经批准的 `/lab:framing` 工件。\n\n## 如何输入 `/lab:auto`\n\n## `/lab:auto` 层级指南\n\n- `L1`：适合安全验证、一轮 bounded 真实运行，或简单 report 刷新。\n- `L2`：默认推荐级别，适合冻结核心边界内的常规实验迭代。\n- `L3`：激进 campaign 级别，只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定，默认推荐 `L2`。\n- 如果用户输入没写级别，或者把级别和 `paper layer`、`phase`、`table` 混用了，就应先停下来，要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别，不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab:auto` 输入应至少说清：objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文，摘要、清单条目、任务标签和进度更新都应使用中文，除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例：`/lab:auto 自治级别 L2。目标：推进 paper layer 3。终止条件：完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改：配置、数据接入、评估脚本。`\n"
 );
 ZH_CONTENT[path.join(".codex", "prompts", "lab-data.md")] = codexPrompt(
@@ -2022,7 +2030,7 @@ ZH_CONTENT[path.join(".claude", "commands", "lab.md")] = claudeCommand(
   "lab",
   "查看 /lab 研究工作流总览并选择合适阶段",
   "[stage] [target]",
-  "# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种：\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时，要立刻执行该 stage，而不是只推荐别的阶段。\n- 先给简洁摘要，再决定是否写工件，最后回报输出路径和下一步。\n- 如果歧义会影响结论，一次只问一个问题；如果有多条可行路径，先给 2-3 个方案再收敛。\n- `spec` 前应已有经批准的数据集与 benchmark 方案。\n- `run`、`iterate`、`auto`、`report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `auto` 只编排已批准边界内的执行阶段，不替代手动的 idea/data/framing/spec 决策。\n- `write` 前必须已有经批准的 `framing` 工件。\n\n## 如何输入 `/lab auto`\n\n## `/lab auto` 层级指南\n\n- `L1`：适合安全验证、一轮 bounded 真实运行，或简单 report 刷新。\n- `L2`：默认推荐级别，适合冻结核心边界内的常规实验迭代。\n- `L3`：激进 campaign 级别，只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定，默认推荐 `L2`。\n- 如果用户输入没写级别，或者把级别和 `paper layer`、`phase`、`table` 混用了，就应先停下来，要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别，不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab auto` 输入应至少说清：objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文，摘要、清单条目、任务标签和进度更新都应使用中文，除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例：`/lab auto 自治级别 L2。目标：推进 paper layer 3。终止条件：完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改：配置、数据接入、评估脚本。`\n"
+  "# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种：\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时，要立刻执行该 stage，而不是只推荐别的阶段。\n- 先给简洁的阶段摘要；只要 stage 合同要求受管工件，就应立刻落盘，再回报输出路径和下一步。\n- 如果歧义会影响结论，一次只问一个问题；如果有多条可行路径，先给 2-3 个方案再收敛。\n- `spec` 前应已有经批准的数据集与 benchmark 方案。\n- `run`、`iterate`、`auto`、`report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `auto` 只编排已批准边界内的执行阶段，不替代手动的 idea/data/framing/spec 决策。\n- `write` 前必须已有经批准的 `framing` 工件。\n\n## 如何输入 `/lab auto`\n\n## `/lab auto` 层级指南\n\n- `L1`：适合安全验证、一轮 bounded 真实运行，或简单 report 刷新。\n- `L2`：默认推荐级别，适合冻结核心边界内的常规实验迭代。\n- `L3`：激进 campaign 级别，只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定，默认推荐 `L2`。\n- 如果用户输入没写级别，或者把级别和 `paper layer`、`phase`、`table` 混用了，就应先停下来，要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别，不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab auto` 输入应至少说清：objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文，摘要、清单条目、任务标签和进度更新都应使用中文，除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例：`/lab auto 自治级别 L2。目标：推进 paper layer 3。终止条件：完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改：配置、数据接入、评估脚本。`\n"
 );
 ZH_CONTENT[path.join(".claude", "commands", "lab-data.md")] = claudeCommand(
@@ -2055,7 +2063,7 @@ description: 严格研究工作流，覆盖 idea、data、auto、framing、spec
 - 论文写作阶段要与实验执行阶段分离。
 - 用户显式调用某个 \`/lab:stage\` 时，要直接执行该 stage，而不是只推荐别的 stage。
 - 关键决策必须落盘，不能只留在聊天里。
-- 每个 stage 都要先给用户一个简洁简介，再决定是否落盘；如果落盘，最后必须回报路径和下一步。
+- 每个 stage 都要先给用户一个简洁简介；如果 stage 合同要求受管工件，就应立刻落盘，最后必须回报路径和下一步。
 - 如果缺少的前提会改变结论，一次只追问一个问题。
 - 如果存在多条可行路径，先给 2-3 个方案、trade-offs 和推荐项，再收敛。
 - 如果某个 stage 会决定后续方向，就要保留明确的 approval gate。
@@ -2617,6 +2625,8 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")] = `# \`/la
 - 如果当前写作目标是最终导出、\`workflow_language\` 与 \`paper_language\` 不一致，且 \`paper_language_finalization_decision\` 还是 \`unconfirmed\`，就在最终定稿前追问一次：保持当前 workflow language，还是转换成 \`paper_language\`
 - 如果用户选择保持当前语言，就持久化 \`paper_language_finalization_decision: keep-workflow-language\`
 - 如果用户选择转换，就持久化 \`paper_language_finalization_decision: convert-to-paper-language\`
+- 如果当前写作目标是最终导出，且语言不一致，就不要在追问前先把最终论文正文改成 \`paper_language\`；先问、先持久化，再改稿
+- 如果当前写作目标是最终导出，且语言不一致，就在最新 write iteration 里记录语言决策审计：工作流语言、论文语言、最终语言决定，以及为什么这样决定
 - 不要把 \`sleep 30\`、单次 \`pgrep\` 或一次性的 \`metrics.json\` 探针当成 rung 主命令；这些只能算进度检查。
 - 当真实实验进程还活着时，只允许发进度更新并继续等待，不能把这一 rung 当作已经完成。
 `;
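
The language rules added to `write.md` and `auto.md` in this file amount to a fixed ordering for final export rounds: ask, persist, record the audit, and only then edit. A minimal sketch of that ordering follows. It is an illustration under assumptions, not code from the package: the function name, the state field names, and the returned step labels are hypothetical, while the three decision values are the ones quoted in the diff.

```javascript
// Hypothetical sketch of the "ask -> persist -> audit -> edit" ordering.
// Only the decision values ("unconfirmed", "keep-workflow-language",
// "convert-to-paper-language") come from the diff; everything else is assumed.
function nextLanguageStep(state) {
  const { workflowLanguage, paperLanguage, decision, auditRecorded } = state;

  // Languages match: the gate does not apply.
  if (workflowLanguage === paperLanguage) {
    return "edit-final-manuscript";
  }
  // No persisted decision yet: ask the user before touching the final text.
  if (decision === "unconfirmed") {
    return "ask-user-and-persist";
  }
  // Decision persisted, but the latest write iteration lacks the audit entry.
  if (!auditRecorded) {
    return "record-language-audit";
  }
  // Only now may the final manuscript be edited in the chosen language.
  return decision === "convert-to-paper-language"
    ? "convert-and-edit-final-manuscript"
    : "edit-final-manuscript";
}
```

Calling it with mismatched languages and an `unconfirmed` decision returns the ask step, which reflects the rule that the final manuscript body is not rewritten before confirmation.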

package/lib/lab_idea_contract.json CHANGED

@@ -1,8 +1,8 @@
 {
   "stage_prompt": {
-    "codex_en": "This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.",
-    "claude_en": "This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.",
-    "codex_zh": "本命令运行 `/lab:idea` 阶段。把 `.codex/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验和 approval gate 的单一来源。先做第一轮脑暴，产出 2-4 个候选方向；再做第一轮文献检索，为每个方向补最接近前作；然后用第二轮脑暴淘汰或收敛方向；最后做第二轮文献检索，补齐最终来源包，再输出协作者可读的推荐结论。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么，以及用户下一步该决定什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇；如果领域确实很窄，必须显式解释为什么合理地低于这个目标。",
-    "claude_zh": "本命令运行 lab workflow 的 `idea` 阶段。把 `.claude/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验和 approval gate 的单一来源。先做第一轮脑暴，产出 2-4 个候选方向；再做第一轮文献检索，为每个方向补最接近前作；然后用第二轮脑暴淘汰或收敛方向；最后做第二轮文献检索，补齐最终来源包，再输出协作者可读的推荐结论。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么，以及用户下一步该决定什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇；如果领域确实很窄，必须显式解释为什么合理地低于这个目标。"
+    "codex_en": "This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 3-4 candidate directions. For each candidate direction, explain what it is, why it matters, roughly how it would work, what problem it solves, and its main risk. Run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2 to 1-2 surviving directions, explain why each survivor remains, why each rejected direction was dropped, and why the narrowed recommendation is stronger now, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. It must also include a user-visible literature summary naming the closest prior found, the recent strong papers found, and what existing work still does not solve. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.",
+    "claude_en": "This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 3-4 candidate directions. For each candidate direction, explain what it is, why it matters, roughly how it would work, what problem it solves, and its main risk. Run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2 to 1-2 surviving directions, explain why each survivor remains, why each rejected direction was dropped, and why the narrowed recommendation is stronger now, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. It must also include a user-visible literature summary naming the closest prior found, the recent strong papers found, and what existing work still does not solve. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.",
+    "codex_zh": "本命令运行 `/lab:idea` 阶段。把 `.codex/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验、收敛状态和 approval gate 的单一来源。先做第一轮脑暴，产出 3-4 个候选方向。每个候选方向都要说明：它是什么、为什么值得研究、大致怎么做、解决了什么问题、主要风险是什么。再做第一轮文献检索，为每个方向补最接近前作；然后用第二轮脑暴把范围收敛到 1-2 个幸存方向，并写清每个幸存方向为什么保留、每个被淘汰方向为什么淘汰、为什么当前 narrowed recommendation 更强；最后做第二轮文献检索，补齐最终来源包，再输出协作者可读的推荐结论。任何 final recommendation、paper fit 判断或 mission 写回之前，都必须先 materialize or update `.lab/writing/idea.md` 和 `.lab/writing/idea-source-log.md`。不要以纯聊天脑暴收尾；如果当前还没收敛，就明确写出还缺什么，并停在未收敛状态。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么、哪些部分已经 source-backed、哪些还只是 hypothesis，以及用户下一步该决定什么。它还必须包含一个用户可见的文献摘要，明确写出：找到的最接近前作、找到的近期强相关论文、以及现有工作仍未解决什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇；如果领域确实很窄，必须显式解释为什么合理地低于这个目标。只有在 `.lab/.managed/scripts/validate_idea_artifact.py` 通过之后，才可以把最终推荐当成已收敛结论输出。",
+    "claude_zh": "本命令运行 lab workflow 的 `idea` 阶段。把 `.claude/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验、收敛状态和 approval gate 的单一来源。先做第一轮脑暴，产出 3-4 个候选方向。每个候选方向都要说明：它是什么、为什么值得研究、大致怎么做、解决了什么问题、主要风险是什么。再做第一轮文献检索，为每个方向补最接近前作；然后用第二轮脑暴把范围收敛到 1-2 个幸存方向，并写清每个幸存方向为什么保留、每个被淘汰方向为什么淘汰、为什么当前 narrowed recommendation 更强；最后做第二轮文献检索，补齐最终来源包，再输出协作者可读的推荐结论。任何 final recommendation、paper fit 判断或 mission 写回之前，都必须先 materialize or update `.lab/writing/idea.md` 和 `.lab/writing/idea-source-log.md`。不要以纯聊天脑暴收尾；如果当前还没收敛，就明确写出还缺什么，并停在未收敛状态。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么、哪些部分已经 source-backed、哪些还只是 hypothesis，以及用户下一步该决定什么。它还必须包含一个用户可见的文献摘要，明确写出：找到的最接近前作、找到的近期强相关论文、以及现有工作仍未解决什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇；如果领域确实很窄，必须显式解释为什么合理地低于这个目标。只有在 `.lab/.managed/scripts/validate_idea_artifact.py` 通过之后，才可以把最终推荐当成已收敛结论输出。"
   }
 }
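
The updated idea-stage prompt tightens several countable requirements: 3-4 candidate directions with five explained items each, 1-2 surviving directions, a user-visible literature summary, and a passing validator before a recommendation is presented as converged. The sketch below shows one way those conditions could be listed as gaps. It is only an illustration: the memo shape, field names, and messages are assumptions, and it does not reproduce what `validate_idea_artifact.py` actually checks.

```javascript
// Hypothetical field names for the five per-candidate items named in the prompt:
// what it is, why it matters, roughly how it would work, problem solved, main risk.
const CANDIDATE_FIELDS = ["what", "whyItMatters", "roughApproach", "problemSolved", "mainRisk"];

// Returns a list of human-readable gaps; an empty list means nothing is missing
// under the assumptions of this sketch.
function ideaConvergenceGaps(memo) {
  const gaps = [];
  const candidates = memo.candidates ?? [];
  const survivors = memo.survivors ?? [];

  if (candidates.length < 3 || candidates.length > 4) {
    gaps.push("brainstorm pass 1 should cover 3-4 candidate directions");
  }
  for (const candidate of candidates) {
    const missing = CANDIDATE_FIELDS.filter((field) => !candidate[field]);
    if (missing.length > 0) {
      gaps.push(`candidate "${candidate.name}" is missing: ${missing.join(", ")}`);
    }
  }
  if (survivors.length < 1 || survivors.length > 2) {
    gaps.push("brainstorm pass 2 should narrow to 1-2 surviving directions");
  }
  if (!memo.literatureSummary) {
    gaps.push("user-visible literature summary is missing");
  }
  if (!memo.validatorPassed) {
    gaps.push("validate_idea_artifact.py has not passed yet");
  }
  return gaps;
}
```

A non-empty result corresponds to the "still unconverged" case in the prompt, where the stage is expected to list what is missing and stop rather than present a final recommendation.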

package/lib/lab_write_contract.json CHANGED

@@ -1,8 +1,8 @@
 {
   "stage_prompt": {
-    "codex_en": "This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`.",
-    "claude_en": "This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`.",
-    "codex_zh": "本命令运行 `/lab:write` 阶段。把 `.codex/skills/lab/stages/write.md` 当成模板选择、paper-plan、section 参考、校验 gate、资产覆盖和最终 manuscript 规则的单一来源。读取与当前 section 对应的 paper-writing reference 和 bundled example-bank 文件，一次只修改一个 section；普通草稿轮次把写作校验当 warning，最终定稿或导出轮次必须满足 write-stage 的接受 gate。普通起草轮次先跟随 `workflow_language`。如果当前稿件将从托管默认 scaffold 开始，且还没有模板决定，就先追问一次：继续使用默认 scaffold，还是先接入模板目录。如果进入最终定稿时 `workflow_language` 与 `paper_language` 不一致，就再追问一次：保持当前语言，还是把最终稿转换成 `paper_language`。",
-    "claude_zh": "本命令运行 lab workflow 的 `write` 阶段。把 `.claude/skills/lab/stages/write.md` 当成模板选择、paper-plan、section 参考、校验 gate、资产覆盖和最终 manuscript 规则的单一来源。读取与当前 section 对应的 paper-writing reference 和 bundled example-bank 文件，一次只修改一个 section；普通草稿轮次把写作校验当 warning，最终定稿或导出轮次必须满足 write-stage 的接受 gate。普通起草轮次先跟随 `workflow_language`。如果当前稿件将从托管默认 scaffold 开始，且还没有模板决定，就先追问一次：继续使用默认 scaffold，还是先接入模板目录。如果进入最终定稿时 `workflow_language` 与 `paper_language` 不一致，就再追问一次：保持当前语言，还是把最终稿转换成 `paper_language`。"
+    "codex_en": "This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`, persist that answer, record the language decision in the latest write iteration, and only then edit the final manuscript in the chosen language.",
+    "claude_en": "This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`, persist that answer, record the language decision in the latest write iteration, and only then edit the final manuscript in the chosen language.",
+    "codex_zh": "本命令运行 `/lab:write` 阶段。把 `.codex/skills/lab/stages/write.md` 当成模板选择、paper-plan、section 参考、校验 gate、资产覆盖和最终 manuscript 规则的单一来源。读取与当前 section 对应的 paper-writing reference 和 bundled example-bank 文件，一次只修改一个 section；普通草稿轮次把写作校验当 warning，最终定稿或导出轮次必须满足 write-stage 的接受 gate。普通起草轮次先跟随 `workflow_language`。如果当前稿件将从托管默认 scaffold 开始，且还没有模板决定，就先追问一次：继续使用默认 scaffold，还是先接入模板目录。如果进入最终定稿时 `workflow_language` 与 `paper_language` 不一致，就再追问一次：保持当前语言，还是把最终稿转换成 `paper_language`；先持久化这个决定，再在最新 write iteration 里记录语言决策，最后才允许按该语言修改最终稿。",
+    "claude_zh": "本命令运行 lab workflow 的 `write` 阶段。把 `.claude/skills/lab/stages/write.md` 当成模板选择、paper-plan、section 参考、校验 gate、资产覆盖和最终 manuscript 规则的单一来源。读取与当前 section 对应的 paper-writing reference 和 bundled example-bank 文件，一次只修改一个 section；普通草稿轮次把写作校验当 warning，最终定稿或导出轮次必须满足 write-stage 的接受 gate。普通起草轮次先跟随 `workflow_language`。如果当前稿件将从托管默认 scaffold 开始，且还没有模板决定，就先追问一次：继续使用默认 scaffold，还是先接入模板目录。如果进入最终定稿时 `workflow_language` 与 `paper_language` 不一致，就再追问一次：保持当前语言，还是把最终稿转换成 `paper_language`；先持久化这个决定，再在最新 write iteration 里记录语言决策，最后才允许按该语言修改最终稿。"
   }
 }
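
The write contract now requires the language decision to be recorded in the latest write iteration, and the `write.md` rules in this release list four items for that record: workflow language, paper language, the final language decision, and the reason for it. A minimal sketch of a completeness check over such a record is shown here. The object shape and property names are assumptions for illustration; the real artifact format and the checks in `validate_manuscript_delivery.py` are not shown in this diff.

```javascript
// Hypothetical property names for the four audit items listed in the rules.
const AUDIT_FIELDS = ["workflowLanguage", "paperLanguage", "finalDecision", "rationale"];

// Returns the audit fields that are absent or blank in a write iteration record.
function missingAuditFields(iteration) {
  const audit = iteration.languageDecisionAudit ?? {};
  return AUDIT_FIELDS.filter(
    (field) => typeof audit[field] !== "string" || audit[field].trim() === ""
  );
}

// Example record using a decision value quoted in the diff.
const exampleIteration = {
  languageDecisionAudit: {
    workflowLanguage: "zh",
    paperLanguage: "en",
    finalDecision: "convert-to-paper-language",
    rationale: "",
  },
};
console.log(missingAuditFields(exampleIteration)); // [ 'rationale' ]
```

Treating a blank rationale as missing follows the rule that the record must say why the decision was made, not only what it was.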

package/package-assets/claude/commands/lab-idea.md CHANGED

@@ -7,4 +7,4 @@ argument-hint: idea or research problem
 Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
 Execute the requested `/lab-idea` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
-This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.
+This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 3-4 candidate directions. For each candidate direction, explain what it is, why it matters, roughly how it would work, what problem it solves, and its main risk. Run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2 to 1-2 surviving directions, explain why each survivor remains, why each rejected direction was dropped, and why the narrowed recommendation is stronger now, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. It must also include a user-visible literature summary naming the closest prior found, the recent strong papers found, and what existing work still does not solve. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.

package/package-assets/claude/commands/lab-write.md CHANGED

@@ -7,4 +7,4 @@ argument-hint: section or writing target
 Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
 Execute the requested `/lab-write` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
-This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`.
+This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`, persist that answer, record the language decision in the latest write iteration, and only then edit the final manuscript in the chosen language.

package/package-assets/claude/commands/lab.md CHANGED

@@ -50,7 +50,7 @@ Use the same repository artifacts and stage boundaries every time.
 - Always use `skills/lab/SKILL.md` as the workflow contract.
 - When the user explicitly invokes `/lab <stage> ...` or a direct `/lab-<stage>` alias, execute that stage now against the provided argument instead of only recommending another lab stage.
-- Start by giving the user a concise summary, then decide whether to write artifacts, then report the output path and next step.
+- Start by giving the user a concise stage summary. Materialize managed artifacts immediately when the stage contract requires them, then report the output path and next step.
 - When ambiguity matters, ask one clarifying question at a time; when multiple paths are viable, present 2-3 approaches before converging.
 - `spec` is not complete until the approved change is frozen under `.lab/changes/<change-id>/`.
 - `spec` should inherit the approved dataset package from `.lab/context/data-decisions.md`.

package/package-assets/codex/prompts/lab-idea.md CHANGED

@@ -6,4 +6,4 @@ argument-hint: idea or research problem
 Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
 Execute the requested `/lab:idea` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
-This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.
+This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 3-4 candidate directions. For each candidate direction, explain what it is, why it matters, roughly how it would work, what problem it solves, and its main risk. Run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2 to 1-2 surviving directions, explain why each survivor remains, why each rejected direction was dropped, and why the narrowed recommendation is stronger now, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. It must also include a user-visible literature summary naming the closest prior found, the recent strong papers found, and what existing work still does not solve. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.

package/package-assets/codex/prompts/lab-write.md CHANGED

@@ -6,4 +6,4 @@ argument-hint: section or writing target
 Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
 Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
-This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`.
+This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, ask once whether to keep the draft language or convert the final manuscript to `paper_language`, persist that answer, record the language decision in the latest write iteration, and only then edit the final manuscript in the chosen language.

package/package-assets/codex/prompts/lab.md CHANGED

@@ -44,7 +44,7 @@ argument-hint: workflow question or stage choice
 - Always use `skills/lab/SKILL.md` as the workflow contract.
 - When the user explicitly invokes `/lab:<stage>`, execute that stage now against the provided argument instead of only recommending another `/lab` stage.
-- Start by giving the user a concise summary, then decide whether to write artifacts, then report the output path and next step.
+- Start by giving the user a concise stage summary. Materialize managed artifacts immediately when the stage contract requires them, then report the output path and next step.
 - When ambiguity matters, ask one clarifying question at a time; when multiple paths are viable, present 2-3 approaches before converging.
 - `/lab:spec` is not complete until the approved change is frozen under `.lab/changes/<change-id>/`.
 - `/lab:spec` should inherit the approved dataset package from `.lab/context/data-decisions.md`.

package/package-assets/shared/lab/.managed/scripts/validate_idea_artifact.py CHANGED

@@ -17,16 +17,52 @@ REQUIRED_SECTIONS = {
     "Closest Prior Work Comparison": [r"^##\s+Closest Prior Work Comparison\s*$", r"^##\s+最接近前作对照\s*$"],
     "Brainstorm Pass 2": [r"^##\s+Brainstorm Pass 2\s*$", r"^##\s+第二轮脑暴\s*$"],
     "Literature Sweep 2": [r"^##\s+Literature Sweep 2\s*$", r"^##\s+第二轮文献(?:检索|收敛)?\s*$"],
+    "Literature Summary for Recommendation": [
+        r"^##\s+Literature Summary for Recommendation\s*$",
+        r"^##\s+用于最终推荐的文献摘要\s*$",
+    ],
     "Rough Approach": [r"^##\s+Rough Approach\s*$", r"^##\s+我们准备怎么做\s*$"],
     "Problem Solved": [r"^##\s+Problem Solved\s*$", r"^##\s+解决了什么问题\s*$"],
     "Evaluation Sketch": [r"^##\s+Evaluation Sketch\s*$", r"^##\s+评测草图\s*$"],
     "Tentative Contributions": [r"^##\s+Tentative Contributions\s*$", r"^##\s+暂定贡献\s*$"],
     "Candidate Experiment": [r"^##\s+Candidate Experiment\s*$", r"^##\s+(?:最小实验|候选实验)\s*$"],
     "Falsifiable Hypothesis": [r"^##\s+Falsifiable Hypothesis\s*$", r"^##\s+可证伪假设\s*$"],
+    "Convergence Status": [r"^##\s+Convergence Status\s*$", r"^##\s+收敛状态\s*$"],
     "Final Recommendation": [r"^##\s+Final Recommendation\s*$", r"^##\s+最终推荐\s*$"],
     "User Guidance": [r"^##\s+User Guidance\s*$", r"^##\s+用户引导\s*$"],
 }
+CANDIDATE_DIRECTION_STARTS = (
+    r"^-\s*Candidate direction \d+\s*[:：]",
+    r"^-\s*候选方向 \d+\s*[:：]",
+)
+SURVIVING_DIRECTION_STARTS = (
+    r"^-\s*Surviving direction \d+\s*[:：]",
+    r"^-\s*幸存方向 \d+\s*[:：]",
+)
+CANDIDATE_DIRECTION_FIELDS = {
+    "what it is": ("What it is", "是什么"),
+    "why it matters": ("Why it matters", "为什么重要", "为什么值得研究"),
+    "rough how": ("Rough how", "大致怎么做", "粗略做法"),
+    "problem solved": ("Problem solved", "解决了什么问题"),
+    "main risk": ("Main risk", "主要风险"),
+}
+SURVIVING_DIRECTION_FIELDS = {
+    "why it survived": ("Why it survived", "为什么保留", "为什么留下"),
+}
+LITERATURE_SUMMARY_FIELDS = {
+    "closest prior found": ("Closest prior found", "最接近前作"),
+    "recent strong papers found": ("Recent strong papers found", "近期强相关论文", "近期强论文"),
+    "what existing work still does not solve": (
+        "What existing work still does not solve",
+        "现有工作仍未解决什么",
+    ),
+}
 SOURCE_LOG_SECTIONS = {
     "Search Intent": [r"^##\s+Search Intent\s*$", r"^##\s+检索意图\s*$"],
     "Sweep 1 Log": [r"^##\s+Sweep 1 Log\s*$", r"^##\s+第一轮检索记录\s*$"],
@@ -170,6 +206,34 @@ def extract_bucket_body(body: str, labels: tuple[str, ...]) -> str:
     return "\n".join(captured).strip()
+def extract_labeled_blocks(body: str, start_patterns: tuple[str, ...]) -> list[str]:
+    lines = body.splitlines()
+    blocks: list[str] = []
+    current: list[str] = []
+    current_indent: int | None = None
+    for line in lines:
+        stripped = line.lstrip()
+        indent = len(line) - len(stripped)
+        is_start = any(re.match(pattern, stripped, flags=re.IGNORECASE) for pattern in start_patterns)
+        if is_start:
+            if current:
+                blocks.append("\n".join(current).strip())
+            current = [line]
+            current_indent = indent
+            continue
+        if current and stripped.startswith("- ") and current_indent is not None and indent <= current_indent:
+            blocks.append("\n".join(current).strip())
+            current = []
+            current_indent = None
+        if current:
+            current.append(line)
+    if current:
+        blocks.append("\n".join(current).strip())
+    return blocks
 def validate_content(text: str) -> list[str]:
     issues: list[str] = []
     scenario = extract_section_body(text, REQUIRED_SECTIONS["Scenario"])
@@ -187,6 +251,13 @@ def validate_content(text: str) -> list[str]:
     brainstorm_1 = extract_section_body(text, REQUIRED_SECTIONS["Brainstorm Pass 1"])
     if not contains_any(brainstorm_1, ("candidate direction", "候选方向", "worth checking", "值得检查")):
         issues.append("idea artifact is missing a real brainstorm pass 1 shortlist")
+    candidate_blocks = extract_labeled_blocks(brainstorm_1, CANDIDATE_DIRECTION_STARTS)
+    if len(candidate_blocks) < 3:
+        issues.append("idea artifact brainstorm pass 1 must include at least three candidate directions")
+    for index, block in enumerate(candidate_blocks[:3], start=1):
+        for field_name, labels in CANDIDATE_DIRECTION_FIELDS.items():
+            if not has_field_value(block, labels):
+                issues.append(f"idea artifact candidate direction {index} is missing {field_name}")
     sweep_1 = extract_section_body(text, REQUIRED_SECTIONS["Literature Sweep 1"])
     if count_references(sweep_1) < 3:
@@ -221,11 +292,36 @@ def validate_content(text: str) -> list[str]:
     brainstorm_2 = extract_section_body(text, REQUIRED_SECTIONS["Brainstorm Pass 2"])
     if not contains_any(brainstorm_2, ("surviving direction", "recommended narrowed direction", "surviving", "幸存方向", "推荐收敛方向", "淘汰")):
         issues.append("idea artifact is missing a real brainstorm pass 2 narrowing step")
+    surviving_blocks = extract_labeled_blocks(brainstorm_2, SURVIVING_DIRECTION_STARTS)
+    if not surviving_blocks:
+        issues.append("idea artifact brainstorm pass 2 must include at least one surviving direction")
+    if len(surviving_blocks) > 2:
+        issues.append("idea artifact brainstorm pass 2 must narrow to at most two surviving directions")
+    for index, block in enumerate(surviving_blocks, start=1):
+        for field_name, labels in SURVIVING_DIRECTION_FIELDS.items():
+            if not has_field_value(block, labels):
+                issues.append(f"idea artifact surviving direction {index} is missing {field_name}")
+    if not has_field_value(brainstorm_2, ("Rejected directions and why", "被淘汰的方向和原因")):
+        issues.append("idea artifact brainstorm pass 2 is missing rejected directions and why")
+    if not has_field_value(brainstorm_2, ("Recommended narrowed direction", "推荐收敛方向")):
+        issues.append("idea artifact brainstorm pass 2 is missing the recommended narrowed direction")
+    if not has_field_value(brainstorm_2, ("Why this is stronger now", "为什么它现在更强")):
+        issues.append("idea artifact brainstorm pass 2 is missing why the surviving recommendation is stronger now")
     sweep_2 = extract_section_body(text, REQUIRED_SECTIONS["Literature Sweep 2"])
     if count_references(sweep_2) < 5:
         issues.append("idea artifact is missing literature sweep 2 with real references")
+    literature_summary = extract_section_body(text, REQUIRED_SECTIONS["Literature Summary for Recommendation"])
+    if not literature_summary:
+        issues.append("idea artifact is missing a literature summary for recommendation")
+    else:
+        for field_name, labels in LITERATURE_SUMMARY_FIELDS.items():
+            if not has_field_value(literature_summary, labels):
+                issues.append(f"idea artifact literature summary for recommendation is missing {field_name}")
+        if count_references(literature_summary) < 2:
+            issues.append("idea artifact literature summary for recommendation is missing real references")
     rough_approach = extract_section_body(text, REQUIRED_SECTIONS["Rough Approach"])
     if not contains_any(rough_approach, ("plain-language", "how this would work", "粗略做法", "怎么做", "why this design", "为什么")):
         issues.append("idea artifact is missing a rough plain-language approach")
@@ -254,6 +350,16 @@ def validate_content(text: str) -> list[str]:
     if not contains_any(experiment, ("minimum viable experiment", "minimum experiment", "dataset", "metric", "最小实验", "主指标", "次指标")):
         issues.append("idea artifact is missing a minimum experiment")
+    convergence_status = extract_section_body(text, REQUIRED_SECTIONS["Convergence Status"])
+    if not has_field_value(convergence_status, ("Current status", "当前状态")):
+        issues.append("idea artifact is missing a convergence status with the current stage state")
+    if not has_field_value(convergence_status, ("What is already source-backed", "哪些部分已经 source-backed", "已经有来源支撑的部分")):
+        issues.append("idea artifact is missing a convergence status section that states what is already source-backed")
+    if not has_field_value(convergence_status, ("What is still hypothesis-only", "哪些部分还只是 hypothesis", "还只是生成性假设的部分")):
+        issues.append("idea artifact is missing a convergence status section that states what is still hypothesis-only")
+    if not has_field_value(convergence_status, ("Can this round end with a final recommendation", "这一轮能否结束并给出 final recommendation", "这一轮能否给出最终推荐")):
+        issues.append("idea artifact is missing a convergence status section that states whether the stage can end with a final recommendation")
     final_recommendation = extract_section_body(text, REQUIRED_SECTIONS["Final Recommendation"])
     if not contains_any(final_recommendation, ("recommended direction", "paper-worthy", "推荐方向", "值得做论文")):
         issues.append("idea artifact is missing a final recommendation after the second sweep")
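
The new `extract_labeled_blocks` helper is what lets the validator count candidate directions and inspect each one's sub-bullets. The following is a minimal sketch of that behaviour, assuming the function body and `CANDIDATE_DIRECTION_STARTS` exactly as they appear in the hunks above; the `SAMPLE` text is hypothetical placeholder content, not taken from the package.

```python
from __future__ import annotations

import re

# Start patterns copied from the validator diff above.
CANDIDATE_DIRECTION_STARTS = (
    r"^-\s*Candidate direction \d+\s*[:：]",
    r"^-\s*候选方向 \d+\s*[:：]",
)


def extract_labeled_blocks(body: str, start_patterns: tuple[str, ...]) -> list[str]:
    """Group each labeled top-level bullet with its more-indented sub-bullets."""
    blocks: list[str] = []
    current: list[str] = []
    current_indent: int | None = None
    for line in body.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        is_start = any(re.match(p, stripped, flags=re.IGNORECASE) for p in start_patterns)
        if is_start:
            if current:
                blocks.append("\n".join(current).strip())
            current = [line]
            current_indent = indent
            continue
        # A sibling bullet at the same or shallower indent closes the open block.
        if current and stripped.startswith("- ") and current_indent is not None and indent <= current_indent:
            blocks.append("\n".join(current).strip())
            current = []
            current_indent = None
        if current:
            current.append(line)
    if current:
        blocks.append("\n".join(current).strip())
    return blocks


# Hypothetical "Brainstorm Pass 1" body with only two candidate directions.
SAMPLE = """\
- Candidate direction 1: hypothetical idea A
  - What it is: placeholder
  - Main risk: placeholder
- Candidate direction 2: hypothetical idea B
  - What it is: placeholder
- Why these directions are worth checking: placeholder
"""

blocks = extract_labeled_blocks(SAMPLE, CANDIDATE_DIRECTION_STARTS)
print(len(blocks))
print(blocks[0])
```

With this sample the helper yields two blocks, and the trailing "Why these directions are worth checking" bullet is excluded because it sits at the same indent as the candidate bullets. Since `validate_content` requires `len(candidate_blocks) >= 3`, an artifact shaped like this sample would be reported as missing a third candidate direction.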

package/package-assets/shared/lab/.managed/scripts/validate_manuscript_delivery.py CHANGED

@@ -1,5 +1,6 @@
 #!/usr/bin/env python3
 import argparse
+import json
 import re
 import sys
 from pathlib import Path
@@ -8,6 +9,10 @@ from pathlib import Path
 ABSOLUTE_PATH_MARKERS = ("/Users/", "/home/", "/tmp/", "/private/tmp/")
 REQUIRED_TABLE_FILES = ("main-results.tex", "ablations.tex")
 REQUIRED_FIGURE_FILES = ("problem-setting.tex", "method-overview.tex", "results-overview.tex")
+REF_PATTERN_TEMPLATE = r"\\(?:auto|c|C)?ref\{%s\}"
+WRITE_ITERATION_LANGUAGE_SECTION = {
+    "Language Decision": (r"^##\s+Language Decision\s*$", r"^##\s+语言决策\s*$"),
+}
 def parse_args():
@@ -22,6 +27,73 @@ def read_text(path: Path) -> str:
     return path.read_text(encoding="utf-8")
+def find_workflow_config(start_path: Path) -> Path | None:
+    search_roots = [start_path, *start_path.parents]
+    for root in search_roots:
+        for relative in ("config/workflow.json", ".lab/config/workflow.json"):
+            candidate = root / relative
+            if candidate.exists():
+                return candidate
+    return None
+def load_workflow_config(path: Path) -> dict:
+    return json.loads(path.read_text(encoding="utf-8"))
+def extract_section_body(text: str, patterns: tuple[str, ...]) -> str:
+    for pattern in patterns:
+        match = re.search(pattern, text, flags=re.MULTILINE)
+        if not match:
+            continue
+        start = match.end()
+        next_heading = re.search(r"^##\s+", text[start:], flags=re.MULTILINE)
+        end = start + next_heading.start() if next_heading else len(text)
+        return text[start:end].strip()
+    return ""
+def has_field_value(body: str, labels: tuple[str, ...]) -> bool:
+    for label in labels:
+        pattern = re.compile(rf"^\s*(?:-|\d+\.)\s*{re.escape(label)}[:：][ \t]*([^\n]+?)\s*$", flags=re.MULTILINE)
+        for match in pattern.finditer(body):
+            value = match.group(1).strip()
+            if value and value not in {"TBD", "TODO", "待补", "待定", "unclear"}:
+                return True
+    return False
+def find_latest_write_iteration(paper_dir: Path) -> Path | None:
+    search_roots = [paper_dir, *paper_dir.parents]
+    for root in search_roots:
+        iterations_dir = root / ".lab" / "writing" / "iterations"
+        if not iterations_dir.exists():
+            continue
+        candidates = sorted(path for path in iterations_dir.glob("*.md") if path.is_file())
+        if candidates:
+            return candidates[-1]
+    return None
+def text_looks_like_language(text: str, language: str) -> bool:
+    cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
+    latin_chars = len(re.findall(r"[A-Za-z]", text))
+    if language == "zh":
+        return cjk_chars >= 20
+    if language == "en":
+        return latin_chars >= 80 and cjk_chars < 20
+    return True
+def extract_label(text: str) -> str | None:
+    match = re.search(r"\\label\{([^}]+)\}", text)
+    return match.group(1) if match else None
+def section_references_label(text: str, label: str) -> bool:
+    return bool(re.search(REF_PATTERN_TEMPLATE % re.escape(label), text))
 def check_exists(path: Path, issues: list[str], label: str):
     if not path.exists():
         issues.append(f"missing required file: {label} ({path})")
@@ -100,6 +172,13 @@ def check_analysis_asset(path: Path, issues: list[str]):
         issues.append("analysis/analysis-asset.tex must explain asset intent")
+def require_section_reference(section_text: str, label: str | None, issues: list[str], section_name: str, asset_name: str):
+    if not label:
+        return
+    if not section_references_label(section_text, label):
+        issues.append(f"{section_name} must explicitly reference the {asset_name} via \\ref{{{label}}}")
 def check_introduction_section(paper_dir: Path, issues: list[str]):
     introduction = paper_dir / "sections" / "introduction.tex"
     check_exists(introduction, issues, "sections/introduction.tex")
@@ -153,6 +232,41 @@ def check_experiments_section(paper_dir: Path, issues: list[str]):
         issues.append("experiments section is missing an analysis asset")
+def check_asset_consumption(paper_dir: Path, issues: list[str]):
+    intro_path = paper_dir / "sections" / "introduction.tex"
+    method_path = paper_dir / "sections" / "method.tex"
+    experiments_path = paper_dir / "sections" / "experiments.tex"
+    if not intro_path.exists() or not method_path.exists() or not experiments_path.exists():
+        return
+    introduction_text = read_text(intro_path)
+    method_text = read_text(method_path)
+    experiments_text = read_text(experiments_path)
+    figures_dir = paper_dir / "figures"
+    tables_dir = paper_dir / "tables"
+    analysis_dir = paper_dir / "analysis"
+    problem_label = extract_label(read_text(figures_dir / "problem-setting.tex")) if (figures_dir / "problem-setting.tex").exists() else None
+    method_label = extract_label(read_text(figures_dir / "method-overview.tex")) if (figures_dir / "method-overview.tex").exists() else None
+    results_label = extract_label(read_text(figures_dir / "results-overview.tex")) if (figures_dir / "results-overview.tex").exists() else None
+    main_table_label = extract_label(read_text(tables_dir / "main-results.tex")) if (tables_dir / "main-results.tex").exists() else None
+    ablation_label = extract_label(read_text(tables_dir / "ablations.tex")) if (tables_dir / "ablations.tex").exists() else None
+    analysis_label = extract_label(read_text(analysis_dir / "analysis-asset.tex")) if (analysis_dir / "analysis-asset.tex").exists() else None
+    require_section_reference(introduction_text, problem_label, issues, "introduction section", "problem-setting figure")
+    require_section_reference(method_text, method_label, issues, "method section", "method-overview figure")
+    require_section_reference(experiments_text, main_table_label, issues, "experiments section", "main results table")
+    require_section_reference(experiments_text, ablation_label, issues, "experiments section", "ablation table")
+    require_section_reference(experiments_text, results_label, issues, "experiments section", "results-overview figure")
+    require_section_reference(experiments_text, analysis_label, issues, "experiments section", "analysis asset")
+    main_results_text = read_text(tables_dir / "main-results.tex") if (tables_dir / "main-results.tex").exists() else ""
+    prose_for_ranking_claims = "\n".join((introduction_text, experiments_text))
+    if re.search(r"\bQini\b", prose_for_ranking_claims, flags=re.IGNORECASE) and "Qini" not in main_results_text:
+        issues.append("manuscript prose mentions Qini but tables/main-results.tex does not expose it")
 def check_method_section(paper_dir: Path, issues: list[str]):
     method = paper_dir / "sections" / "method.tex"
     check_exists(method, issues, "sections/method.tex")
@@ -180,6 +294,67 @@ def check_main_tex(paper_dir: Path, issues: list[str]):
         issues.append("main.tex must include the references bibliography")
+def check_language_layers(paper_dir: Path, issues: list[str]):
+    workflow_config = find_workflow_config(paper_dir)
+    if workflow_config is None:
+        return
+    config = load_workflow_config(workflow_config)
+    workflow_language = config.get("workflow_language")
+    paper_language = config.get("paper_language", workflow_language)
+    finalization_decision = config.get("paper_language_finalization_decision", "unconfirmed")
+    if not workflow_language or workflow_language == paper_language:
+        return
+    if finalization_decision == "unconfirmed":
+        issues.append(
+            "workflow_language and paper_language differ; confirm paper_language_finalization_decision before finalizing the manuscript"
+        )
+        return
+    latest_iteration = find_latest_write_iteration(paper_dir)
+    if latest_iteration is None:
+        issues.append(
+            "workflow_language and paper_language differ; record the language decision in the latest write iteration before finalizing the manuscript"
+        )
+        return
+    iteration_text = read_text(latest_iteration)
+    language_section = extract_section_body(iteration_text, WRITE_ITERATION_LANGUAGE_SECTION["Language Decision"])
+    if not language_section:
+        issues.append(
+            "latest write iteration is missing a language decision section for the final manuscript language choice"
+        )
+    else:
+        if not has_field_value(language_section, ("Workflow language", "workflow_language", "工作流语言")):
+            issues.append("latest write iteration is missing the workflow language in its language decision audit")
+        if not has_field_value(language_section, ("Paper language", "paper_language", "论文语言")):
+            issues.append("latest write iteration is missing the paper language in its language decision audit")
+        if not has_field_value(
+            language_section,
+            ("Finalization decision", "paper_language_finalization_decision", "最终语言决定"),
+        ):
+            issues.append("latest write iteration is missing the finalization decision in its language decision audit")
+        if not has_field_value(
+            language_section,
+            ("Why this decision was chosen", "Why this language was chosen", "为什么这样决定"),
+        ):
+            issues.append("latest write iteration is missing why the language decision was chosen")
+        if workflow_language == "zh" and not text_looks_like_language(iteration_text, workflow_language):
+            issues.append("latest write iteration should follow workflow_language=zh instead of drifting into another language")
+    sections_dir = paper_dir / "sections"
+    section_text = "\n".join(
+        read_text(path) for path in sorted(sections_dir.glob("*.tex")) if path.is_file()
+    )
+    target_language = workflow_language if finalization_decision == "keep-workflow-language" else paper_language
+    if section_text and not text_looks_like_language(section_text, target_language):
+        issues.append(
+            f"final manuscript sections should follow {target_language} after paper_language_finalization_decision={finalization_decision}"
+        )
 def main():
     args = parse_args()
     paper_dir = Path(args.paper_dir)
@@ -195,6 +370,8 @@ def main():
     check_introduction_section(paper_dir, issues)
     check_method_section(paper_dir, issues)
     check_experiments_section(paper_dir, issues)
+    check_asset_consumption(paper_dir, issues)
+    check_language_layers(paper_dir, issues)
     tables_dir = paper_dir / "tables"
     check_table_file(tables_dir / REQUIRED_TABLE_FILES[0], issues, "tables/main-results.tex")
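
The language-decision audit in `check_language_layers` depends on two helpers added in this file: `extract_section_body` and `has_field_value`. The sketch below assumes both helpers as shown in the hunks above and runs them against a hypothetical write-iteration note (`ITERATION` is illustrative only). It shows that a placeholder value such as `TBD` counts as a missing field.

```python
from __future__ import annotations

import re

# Heading patterns and field labels copied from the validator diff above.
LANGUAGE_DECISION = (r"^##\s+Language Decision\s*$", r"^##\s+语言决策\s*$")
CHECKS = {
    "workflow language": ("Workflow language", "workflow_language", "工作流语言"),
    "paper language": ("Paper language", "paper_language", "论文语言"),
    "finalization decision": ("Finalization decision", "paper_language_finalization_decision", "最终语言决定"),
    "decision rationale": ("Why this decision was chosen", "Why this language was chosen", "为什么这样决定"),
}


def extract_section_body(text: str, patterns: tuple[str, ...]) -> str:
    """Return the text between a matching `##` heading and the next `##` heading."""
    for pattern in patterns:
        match = re.search(pattern, text, flags=re.MULTILINE)
        if not match:
            continue
        start = match.end()
        next_heading = re.search(r"^##\s+", text[start:], flags=re.MULTILINE)
        end = start + next_heading.start() if next_heading else len(text)
        return text[start:end].strip()
    return ""


def has_field_value(body: str, labels: tuple[str, ...]) -> bool:
    """True when a `- Label: value` bullet exists and the value is not a placeholder."""
    for label in labels:
        pattern = re.compile(rf"^\s*(?:-|\d+\.)\s*{re.escape(label)}[:：][ \t]*([^\n]+?)\s*$", flags=re.MULTILINE)
        for match in pattern.finditer(body):
            value = match.group(1).strip()
            if value and value not in {"TBD", "TODO", "待补", "待定", "unclear"}:
                return True
    return False


# Hypothetical write-iteration note; the finalization decision is still a placeholder.
ITERATION = """\
# Write iteration (hypothetical)

## Language Decision
- Workflow language: zh
- Paper language: en
- Finalization decision: TBD
- Why this decision was chosen: venue requires English

## Next Steps
- none
"""

section = extract_section_body(ITERATION, LANGUAGE_DECISION)
missing = [name for name, labels in CHECKS.items() if not has_field_value(section, labels)]
print(missing)
```

Here `missing` contains only the finalization decision, so the validator would append the corresponding "latest write iteration is missing the finalization decision" issue.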

package/package-assets/shared/lab/.managed/scripts/validate_paper_plan.py CHANGED

@@ -1,5 +1,6 @@
 #!/usr/bin/env python3
 import argparse
+import json
 import re
 import sys
 from pathlib import Path
@@ -144,6 +145,31 @@ def read_text(path: Path) -> str:
     return path.read_text(encoding="utf-8")
+def find_workflow_config(start_path: Path) -> Path | None:
+    search_roots = [start_path.parent, *start_path.parents]
+    for root in search_roots:
+        for relative in ("config/workflow.json", ".lab/config/workflow.json"):
+            candidate = root / relative
+            if candidate.exists():
+                return candidate
+    return None
+def load_workflow_language(path: Path) -> str:
+    data = json.loads(path.read_text(encoding="utf-8"))
+    return data.get("workflow_language", "en")
+def text_looks_like_language(text: str, language: str) -> bool:
+    cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
+    latin_chars = len(re.findall(r"[A-Za-z]", text))
+    if language == "zh":
+        return cjk_chars >= 20
+    if language == "en":
+        return latin_chars >= 80 and cjk_chars < 20
+    return True
 def extract_section_body(text: str, patterns: list[str]) -> str:
     for pattern in patterns:
         match = re.search(pattern, text, flags=re.MULTILINE)
@@ -250,6 +276,14 @@ def main():
         issues.append(f"paper plan is missing required sections: {', '.join(missing)}")
     issues.extend(validate_filled_fields(text))
+    workflow_config = find_workflow_config(plan_path)
+    if workflow_config is not None:
+        workflow_language = load_workflow_language(workflow_config)
+        if not text_looks_like_language(text, workflow_language):
+            issues.append(
+                f"paper plan should follow workflow_language={workflow_language} instead of drifting into another language"
+            )
     if issues:
         for issue in issues:
             print(issue, file=sys.stderr)
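
The language heuristic added in this hunk can be exercised on its own. The sketch below copies `text_looks_like_language` from the diff and runs it on a few sample strings. The sample strings and variable names are illustrative assumptions, not part of the package.

```python
import re


def text_looks_like_language(text: str, language: str) -> bool:
    # Copied from the hunk above: count CJK ideographs and ASCII letters.
    cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
    latin_chars = len(re.findall(r"[A-Za-z]", text))
    if language == "zh":
        return cjk_chars >= 20
    if language == "en":
        return latin_chars >= 80 and cjk_chars < 20
    return True  # any other language code is never flagged


# Illustrative inputs (assumptions, not taken from the package).
english_plan = "paper plan " * 20  # 180 ASCII letters, no CJK
chinese_plan = "论文计划" * 5  # 20 CJK ideographs, no ASCII letters
short_english = "Short plan."  # only 9 ASCII letters

print(text_looks_like_language(english_plan, "en"))   # True
print(text_looks_like_language(english_plan, "zh"))   # False
print(text_looks_like_language(chinese_plan, "zh"))   # True
print(text_looks_like_language(chinese_plan, "en"))   # False
print(text_looks_like_language(short_english, "en"))  # False
print(text_looks_like_language(short_english, "fr"))  # True
```

Under these thresholds, an English plan with fewer than 80 ASCII letters is reported as drifting, and any text with at least 20 CJK ideographs passes the `zh` check regardless of how much English surrounds it.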

package/package-assets/shared/lab/.managed/templates/idea.md CHANGED

@@ -67,9 +67,29 @@ Suggested levels:
 ## Brainstorm Pass 1
 - Candidate direction 1:
+  - What it is:
+  - Why it matters:
+  - Rough how:
+  - Problem solved:
+  - Main risk:
 - Candidate direction 2:
+  - What it is:
+  - Why it matters:
+  - Rough how:
+  - Problem solved:
+  - Main risk:
 - Candidate direction 3:
+  - What it is:
+  - Why it matters:
+  - Rough how:
+  - Problem solved:
+  - Main risk:
 - Candidate direction 4:
+  - What it is:
+  - Why it matters:
+  - Rough how:
+  - Problem solved:
+  - Main risk:
 - Why these directions are worth checking:
 ## Literature Sweep 1
@@ -109,9 +129,12 @@ Suggested levels:
 ## Brainstorm Pass 2
 - Surviving direction 1:
+  - Why it survived:
 - Surviving direction 2:
+  - Why it survived:
 - Rejected directions and why:
 - Recommended narrowed direction:
+- Why this is stronger now:
 ## Literature Sweep 2
@@ -121,6 +144,12 @@ Suggested levels:
 - Adjacent-field papers:
 - Final literature takeaway:
+## Literature Summary for Recommendation
+- Closest prior found:
+- Recent strong papers found:
+- What existing work still does not solve:
 ## Why Ours Is Different
 - Existing methods rely on:
@@ -180,6 +209,13 @@ Suggested levels:
 - If the idea is correct:
 - If the idea is wrong:
+## Convergence Status
+- Current status:
+- What is already source-backed:
+- What is still hypothesis-only:
+- Can this round end with a final recommendation:
 ## Candidate Experiment
 - Baseline:

package/package-assets/shared/lab/.managed/templates/write-iteration.md CHANGED

@@ -27,6 +27,13 @@
 - Terminology consistency:
 - Five-dimension self-review outcome:
+## Language Decision
+- Workflow language:
+- Paper language:
+- Finalization decision:
+- Why this decision was chosen:
 ## Decision
 - Continue or stop:

package/package-assets/shared/skills/lab/SKILL.md CHANGED

@@ -14,8 +14,8 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Keep dataset and benchmark selection separate from idea generation and specification.
 - Keep paper writing separate from experiment execution and reporting.
 - When the user explicitly invokes `/lab:<stage>`, execute that stage against the provided target instead of replying with a recommendation for another `/lab` stage.
-- Start each stage with a concise user-facing summary before writing artifacts.
-- After that summary, decide whether immediate artifact creation is useful; if yes, write the artifact and then report the output path plus next step.
+- Start each stage with a concise user-facing summary.
+- If the stage contract requires managed artifacts, materialize them immediately instead of treating artifact creation as optional. Then report the output path plus next step.
 - When a missing assumption would materially change the stage outcome, ask one clarifying question at a time.
 - When there are multiple viable paths, present 2-3 approaches with trade-offs and a recommendation before converging.
 - When a stage materially sets downstream direction, keep an explicit approval gate before proceeding.
@@ -37,11 +37,14 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 ### `/lab:idea`
 - Search relevant literature, baselines, datasets, and evaluation metrics before proposing a plan.
-- Start with brainstorm pass 1 over 2-4 candidate directions instead of locking the first idea immediately.
+- Start with brainstorm pass 1 over 3-4 candidate directions instead of locking the first idea immediately.
+- For each brainstorm pass 1 candidate direction, explain what it is, why it matters, roughly how it would work, what problem it solves, and its main risk.
 - Run literature sweep 1 with closest-prior references for each candidate direction before narrowing.
-- Use brainstorm pass 2 to keep only the strongest 1-2 directions and explain what was rejected.
+- Use brainstorm pass 2 to keep only the strongest 1-2 directions, explain why each surviving direction remains, explain what was rejected, and say why the narrowed recommendation is stronger now.
 - Run literature sweep 2 before making a final recommendation or novelty claim.
+- Include a user-visible literature summary that names the closest prior found, the recent strong papers found, and what existing work still does not solve before giving the final recommendation.
 - Build a literature-scoping bundle before claiming novelty. The default target is 20 relevant sources unless the field is genuinely too narrow and that exception is written down.
+- Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before giving any final recommendation, paper-fit judgment, or mission writeback.
 - Read `.lab/context/mission.md` and `.lab/context/open-questions.md` before drafting.
 - Read `.lab/config/workflow.json` before drafting and follow its `workflow_language` for idea artifacts.
 - Ask one clarifying question at a time when critical ambiguity remains.
@@ -58,6 +61,8 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Write idea artifacts with the template in `.lab/.managed/templates/idea.md`.
 - Keep `.lab/writing/idea-source-log.md` as the source-backed search manifest for the two literature sweeps.
 - Run `.lab/.managed/scripts/validate_idea_artifact.py --idea <idea-artifact> --source-log .lab/writing/idea-source-log.md --workflow-config .lab/config/workflow.json` before treating the idea as converged.
+- Do not end the stage with a chat-only brainstorm. If only one brainstorm pass or literature sweep is complete, mark the idea as unconverged, list what is missing, and stop without pretending that the stage has converged.
+- Only after the validator passes may the stage update `.lab/context/mission.md`, `.lab/context/decisions.md`, and `.lab/context/open-questions.md` with a final recommendation.
 - Update `.lab/context/mission.md`, `.lab/context/decisions.md`, and `.lab/context/open-questions.md` after convergence.
 - Do not leave `.lab/context/mission.md` as a template shell once the problem statement and approved direction are known.
 - Do not implement code in this stage.
@@ -182,7 +187,9 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Read `.lab/context/mission.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
 - Write one paper section or one explicit subproblem per round.
 - Ordinary manuscript drafting rounds should follow `workflow_language`.
-- If `workflow_language` and `paper_language` differ, the first final-draft or export round must ask once whether to keep the draft language or convert the final manuscript to `paper_language`, then persist that choice.
+- If `workflow_language` and `paper_language` differ, the first final-draft or export round must ask once whether to keep the draft language or convert the final manuscript to `paper_language`.
+- When the languages differ, do not rewrite final manuscript sections in `paper_language` before that question has been answered; ask first, persist the choice, then edit the final manuscript in the chosen language.
+- When the languages differ, record the workflow language, paper language, finalization decision, and why the decision was chosen in the latest write iteration artifact.
 - Bind each claim to evidence from `report`, iteration reports, or normalized summaries.
 - Use the write-stage contract in `.codex/skills/lab/stages/write.md` or `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section-specific references, validator calls, asset coverage, and final manuscript gates.
 - Use the vendored paper-writing references under `skills/lab/references/paper-writing/` and the matching example-bank files under `skills/lab/references/paper-writing/examples/`.
@@ -209,6 +216,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - No final report without validated normalized results.
 - No paper-writing round without stable report artifacts, an approved framing artifact, evidence links, and LaTeX manuscript output.
 - No final-draft or export round without passing section-quality, claim-safety, and manuscript-delivery validation.
+- No final-draft or export round with mismatched `workflow_language` and `paper_language` unless the latest write iteration records the language decision audit that justified the final manuscript language.
 ## References

package/package-assets/shared/skills/lab/stages/idea.md CHANGED

@@ -16,11 +16,15 @@
 - rough plain-language approach description
 - evaluation sketch with the evaluation subject, any proxy or simulator, the main outcome to observe, and the main validity risk
 - tentative contributions stated at idea level, not final paper-facing wording
+- convergence status that says what is already source-backed, what is still hypothesis-only, and whether the stage may end with a final recommendation
 - three meaningful points
-- brainstorm pass 1 with 2-4 candidate directions
+- brainstorm pass 1 with 3-4 candidate directions
+- each brainstorm pass 1 candidate direction explained with what it is, why it matters, rough how it would work, what problem it solves, and its main risk
 - literature sweep 1 with 3-5 closest-prior references per direction
 - brainstorm pass 2 that narrows to 1-2 surviving directions
+- each brainstorm pass 2 surviving direction explained with why it survived, plus rejected directions and why, and why the narrowed recommendation is stronger now
 - literature sweep 2 that expands the surviving directions into the full source bundle
+- literature summary for recommendation with closest prior found, recent strong papers found, and what existing work still does not solve
 - literature scoping bundle with a default target of 20 sources, or an explicit explanation for a smaller scoped field
 - literature-backed framing
 - sourced datasets and metrics
@@ -42,10 +46,16 @@
 - Build a source bundle before claiming novelty. The default target is 20 relevant sources split across closest prior work, recent strong papers, benchmark or evaluation papers, surveys or taxonomies, and adjacent-field work when useful.
 - Treat closest prior work, recent strong papers, benchmark or evaluation papers, and survey or taxonomy papers as mandatory coverage buckets. Do not leave those buckets empty in the final source bundle.
 - Keep a separate idea source log that records the actual search queries, bucketed sources, and final source count for both literature sweeps.
+- Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before giving any final recommendation, paper-fit judgment, or mission writeback.
 - Use the first brainstorm pass only to generate candidate directions. Treat it as hypothesis generation, not as a novelty judgment.
+- Brainstorm pass 1 should cover at least three candidate directions before narrowing.
+- For each brainstorm pass 1 candidate, explain what it is, why it matters, rough how it would work, what problem it solves, and its main risk.
 - After brainstorm pass 1, run a first literature sweep that gathers 3-5 closest-prior references per direction before narrowing the idea.
 - After literature sweep 1, run a second brainstorm pass that explicitly kills, merges, or narrows directions.
+- Brainstorm pass 2 must narrow the field to one or two surviving directions, explain why each survived, explain why the others were rejected, and say why the narrowed recommendation is stronger now.
 - Only after literature sweep 2 may the artifact give a final recommendation, paper fit, or novelty claim.
+- Before the final recommendation, add a user-visible literature summary that names the closest prior found, the recent strong papers found, and what existing work still does not solve.
+- Do not end the stage with a chat-only brainstorm. If only a brainstorm pass or literature sweep is complete, mark the stage as unconverged, list what is still missing, and stop without pretending that the idea has converged.
 - If the field is genuinely too narrow to support that target, say so explicitly in both the idea artifact and the idea source log, and justify the smaller literature bundle instead of silently skipping the search.
 - The idea artifact must follow the repository `workflow_language`, not whichever language is easiest locally.
 - Before writing the full artifact, give the user a short summary with the one-sentence problem, why current methods fail, and the three meaningful points.
@@ -69,6 +79,7 @@
 - idea artifact derived from `.lab/.managed/templates/idea.md`
 - idea source log at `.lab/writing/idea-source-log.md`, derived from `.lab/.managed/templates/idea-source-log.md`
+- the working idea artifact should live at `.lab/writing/idea.md`
 ## Recommended Structure
@@ -84,21 +95,23 @@
 10. Closest-prior-work comparison
 11. Brainstorm pass 2
 12. Literature sweep 2
-13. Rough approach in plain language
-14. Problem solved in plain language
-15. Why the proposed idea is better
-16. Evaluation sketch
-17. Tentative contributions
-18. Three meaningful points
-19. Candidate approaches and recommendation
-20. Dataset, baseline, and metric candidates
-21. Falsifiable hypothesis
-22. Expert critique
-23. Revised proposal or final recommendation
-24. User guidance
-25. Approval gate
-26. Minimum viable experiment
-27. Idea source log aligned with the two literature sweeps
+13. Literature summary for recommendation
+14. Rough approach in plain language
+15. Problem solved in plain language
+16. Why the proposed idea is better
+17. Evaluation sketch
+18. Tentative contributions
+19. Three meaningful points
+20. Candidate approaches and recommendation
+21. Dataset, baseline, and metric candidates
+22. Falsifiable hypothesis
+23. Convergence status
+24. Expert critique
+25. Revised proposal or final recommendation
+26. User guidance
+27. Approval gate
+28. Minimum viable experiment
+29. Idea source log aligned with the two literature sweeps
 ## Writing Standard
@@ -106,13 +119,17 @@
 - Explain the scenario, target user or beneficiary, and why the problem matters before talking about novelty.
 - State why the target problem matters before talking about the method.
 - Use brainstorm pass 1 to open the space, not to declare a winner.
+- Give brainstorm pass 1 at least three candidate directions, and explain each one with what it is, why it matters, rough how it would work, what problem it solves, and its main risk.
 - Use literature sweep 1 to test candidate directions against real papers before narrowing them.
-- Use brainstorm pass 2 to explain what survived, what was rejected, and why.
+- Use brainstorm pass 2 to explain what survived, why it survived, what was rejected, why it was rejected, and why the narrowed recommendation is stronger now.
 - Use literature sweep 2 to support the final recommendation with real references across the required buckets.
+- Add a short literature summary for recommendation so the final output shows the closest prior found, the recent strong papers found, and what existing work still does not solve.
 - Compare against existing methods explicitly, not by vague novelty language.
 - Do not call something new without a literature-scoping bundle and a closest-prior comparison.
 - Do not call something paper-worthy or novel after only one brainstorm pass or one literature sweep.
 - Do not treat the idea artifact itself as the only evidence record; keep `.lab/writing/idea-source-log.md` synchronized with the actual searches and source buckets used in both literature sweeps.
+- Do not leave the working result only in chat. Update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before returning a stage conclusion.
+- Do not present a final recommendation, paper fit, or next-stage approval unless the working idea artifact is validator-clean and its convergence status says the stage is ready for that recommendation.
 - Explain what current methods do, why they fall short, and roughly how the proposed idea would work in plain language.
 - Explain what problem the idea actually solves before describing tentative contributions.
 - Keep the evaluation sketch high-level: who or what is evaluated, what proxy or simulator is used if any, what outcome matters, and what the main validity risk is. Leave full protocol design to later stages.
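
The convergence rules in this stage contract amount to a checklist that must be complete before a final recommendation. The sketch below is one hypothetical way to express that checklist. It is not the package's `validate_idea_artifact.py`, whose contents are not shown in this diff, and `IdeaProgress` and `missing_for_final_recommendation` are names invented here for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IdeaProgress:
    # Hypothetical progress record; field names are assumptions.
    pass1_candidates: int = 0
    sweep1_done: bool = False
    surviving_directions: int = 0
    sweep2_done: bool = False
    literature_summary_written: bool = False
    artifacts_written: bool = False  # idea.md and idea-source-log.md updated
    validator_passed: bool = False


def missing_for_final_recommendation(p: IdeaProgress) -> List[str]:
    """List what still blocks a final recommendation (empty means ready)."""
    missing = []
    if p.pass1_candidates < 3:
        missing.append("brainstorm pass 1 needs at least three candidates")
    if not p.sweep1_done:
        missing.append("literature sweep 1")
    if not 1 <= p.surviving_directions <= 2:
        missing.append("brainstorm pass 2 must keep one or two directions")
    if not p.sweep2_done:
        missing.append("literature sweep 2")
    if not p.literature_summary_written:
        missing.append("literature summary for recommendation")
    if not p.artifacts_written:
        missing.append("idea artifact and source log on disk")
    if not p.validator_passed:
        missing.append("validator pass")
    return missing


# A chat-only first brainstorm: three candidates, nothing else done.
early = IdeaProgress(pass1_candidates=3)
print(len(missing_for_final_recommendation(early)))  # 6

# Everything complete: the stage may give a final recommendation.
done = IdeaProgress(3, True, 2, True, True, True, True)
print(missing_for_final_recommendation(done))  # []
```

A non-empty list corresponds to the "unconverged" case described above: the stage reports what is missing and stops.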

package/package-assets/shared/skills/lab/stages/write.md CHANGED

@@ -77,6 +77,7 @@ Run these on every round:
 - If the current round is a final manuscript export or final-draft pass, `paper_template_root` is still empty, `paper_template_decision` is `default-scaffold`, and `paper_template_final_reminder_acknowledged` is `false`, ask one final reminder question about switching to a template before finalizing.
 - If the user confirms staying on the default scaffold at that final reminder, persist `paper_template_final_reminder_acknowledged: true`.
 - If the current round is a final manuscript export or final-draft pass, `workflow_language` and `paper_language` differ, and `paper_language_finalization_decision` is `unconfirmed`, ask one explicit question before finalizing: keep the manuscript in `workflow_language`, or convert the final manuscript to `paper_language`.
+- Do not rewrite final manuscript sections in `paper_language` before that question has been answered. Ask first, persist the answer, then edit the final manuscript.
 - If the user chooses to keep the draft language, persist `paper_language_finalization_decision: keep-workflow-language`.
 - If the user chooses to convert, persist `paper_language_finalization_decision: convert-to-paper-language`.
 - If `paper_language_finalization_decision` is `convert-to-paper-language`, the final manuscript output should be converted to `paper_language` before accepting the final round.
@@ -92,6 +93,7 @@ Run these on every round:
   - record what each figure or analysis asset should show and why the reader needs it
   - record which citation anchors must appear in the section and why each anchor matters
 - Before drafting `introduction`, `method`, `experiments`, `related work`, or `conclusion`, run `.lab/.managed/scripts/validate_paper_plan.py --paper-plan .lab/writing/plan.md`.
+- When the repository workflow config is available, the paper-plan validator also checks that `.lab/writing/plan.md` stays in `workflow_language` instead of silently drifting into another language.
 - If the paper-plan validator fails, stop and fill `.lab/writing/plan.md` first instead of drafting prose.
 - During ordinary draft rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode draft` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode draft` after revising the active section.
 - Treat draft-round output from the section and claim validators as warnings that must be recorded and addressed in the write-iteration artifact, not as immediate stop conditions.
@@ -120,9 +122,11 @@ Run these on every round:
 - For final-draft or export rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode final` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode final` before accepting the round.
 - If the final-round section or claim validators fail, keep editing the affected section until it passes; do not stop at asset-complete but rhetorically weak or unsafe prose.
 - Run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before accepting a final-draft or export round.
+- The manuscript-delivery validator should fail if the core figures and tables are only inserted but never cited from section prose, if final manuscript acceptance tries to bypass the one-time `paper_language_finalization_decision` gate when `workflow_language` and `paper_language` differ, or if the latest write iteration does not audit that language decision.
 - If the manuscript validator fails, keep editing and asset generation until it passes; do not stop at prose-only completion.
 - Run a LaTeX compile smoke test when a local LaTeX toolchain is available; if not available, record the missing verification in the write iteration artifact.
 - Record what changed and why in a write-iteration artifact.
+- When `workflow_language` and `paper_language` differ, record the final manuscript language choice in the write-iteration artifact with the workflow language, paper language, finalization decision, and why that decision was chosen.
 - Return paragraph-level roles for the revised prose when drafting.
 - Run the five-dimension self-review checklist before accepting a round.
 - Run reviewer-style checks after every round.
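
The language-finalization rules in this file can be read as a small decision procedure: ask first, persist the answer, then edit. The sketch below is a hypothetical illustration of that ordering. The decision values come from the contract text, while `LanguageState` and `next_language_step` are names assumed here and do not appear in the package.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LanguageState:
    # Mirrors the workflow config keys named in the contract.
    workflow_language: str
    paper_language: str
    paper_language_finalization_decision: str = "unconfirmed"


def next_language_step(state: LanguageState, final_round: bool) -> Tuple[str, Optional[str]]:
    """Return ("write", language) or ("ask", None) for the current round."""
    # Ordinary drafting rounds follow workflow_language.
    if not final_round:
        return ("write", state.workflow_language)
    # No question is needed when the two languages already match.
    if state.workflow_language == state.paper_language:
        return ("write", state.workflow_language)
    decision = state.paper_language_finalization_decision
    if decision == "keep-workflow-language":
        return ("write", state.workflow_language)
    if decision == "convert-to-paper-language":
        return ("write", state.paper_language)
    # Still unconfirmed: ask before rewriting any final manuscript section.
    return ("ask", None)


state = LanguageState(workflow_language="zh", paper_language="en")
print(next_language_step(state, final_round=False))  # ('write', 'zh')
print(next_language_step(state, final_round=True))   # ('ask', None)

# After the user answers, the choice is persisted and the round proceeds.
state.paper_language_finalization_decision = "convert-to-paper-language"
print(next_language_step(state, final_round=True))   # ('write', 'en')
```

The sketch covers only the ordering of the question and the edit. The audit entry in the write-iteration artifact is a separate requirement that it does not model.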

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlab",
-  "version": "0.1.31",
+  "version": "0.1.33",
   "description": "Strict /lab research workflow installer for Codex and Claude",
   "keywords": [
     "codex",