superlab 0.1.30 → 0.1.32
- package/bin/superlab.cjs +0 -0
- package/lib/i18n.cjs +3 -3
- package/lib/lab_idea_contract.json +4 -4
- package/package-assets/claude/commands/lab-idea.md +1 -1
- package/package-assets/claude/commands/lab.md +1 -1
- package/package-assets/codex/prompts/lab-idea.md +1 -1
- package/package-assets/codex/prompts/lab.md +1 -1
- package/package-assets/shared/lab/.managed/scripts/validate_idea_artifact.py +11 -0
- package/package-assets/shared/lab/.managed/scripts/validate_manuscript_delivery.py +109 -0
- package/package-assets/shared/lab/.managed/scripts/validate_paper_plan.py +34 -0
- package/package-assets/shared/lab/.managed/templates/idea.md +7 -0
- package/package-assets/shared/skills/lab/SKILL.md +5 -2
- package/package-assets/shared/skills/lab/stages/idea.md +13 -6
- package/package-assets/shared/skills/lab/stages/write.md +2 -0
- package/package.json +1 -1
package/bin/superlab.cjs
CHANGED
|
File without changes
|
package/lib/i18n.cjs
CHANGED
|
@@ -2003,7 +2003,7 @@ ZH_CONTENT[path.join(".lab", ".managed", "templates", "framing.md")] = `# 论文
|
|
|
2003
2003
|
ZH_CONTENT[path.join(".codex", "prompts", "lab.md")] = codexPrompt(
|
|
2004
2004
|
"查看 /lab 研究工作流总览并选择合适阶段",
|
|
2005
2005
|
"workflow question 或 stage choice",
|
|
2006
|
-
"# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n 先做两轮脑暴和两轮文献检索,再定义问题与 failure case、对比最接近前作,并输出带 approval gate 的 source-backed recommendation。\n\n- `/lab:data`\n 把已批准的 idea 转成数据集与 benchmark 方案,记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制,以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由,和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n 在不改变 mission、framing 和核心 claims 的前提下,读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`,必要时扩展数据集、benchmark 和 comparison methods,并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal,并显式批准契约。\n\n- `/lab:framing`\n 通过审计当前领域与相邻领域的术语,锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets,并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n 把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录,并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n 执行最小有意义验证运行,登记 run,并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n 在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n 以 reviewer mode 审查文档或结果,先给短摘要,再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n 从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n 使用已安装 `lab` skill 下 vendored 的 paper-writing references,把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时,要立刻执行该 stage,而不是只推荐别的 `/lab` stage。\n-
|
|
2006
|
+
"# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n 先做两轮脑暴和两轮文献检索,再定义问题与 failure case、对比最接近前作,并输出带 approval gate 的 source-backed recommendation。\n\n- `/lab:data`\n 把已批准的 idea 转成数据集与 benchmark 方案,记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制,以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由,和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n 在不改变 mission、framing 和核心 claims 的前提下,读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`,必要时扩展数据集、benchmark 和 comparison methods,并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal,并显式批准契约。\n\n- `/lab:framing`\n 通过审计当前领域与相邻领域的术语,锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets,并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n 把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录,并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n 执行最小有意义验证运行,登记 run,并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n 在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n 以 reviewer mode 审查文档或结果,先给短摘要,再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n 从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n 使用已安装 `lab` skill 下 vendored 的 paper-writing references,把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时,要立刻执行该 stage,而不是只推荐别的 `/lab` stage。\n- 先给简洁的阶段摘要;只要 stage 合同要求受管工件,就应立刻落盘,再回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `/lab:spec` 前应已有经批准的数据集与 benchmark 方案。\n- `/lab:run`、`/lab:iterate`、`/lab:auto`、`/lab:report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `.lab/context/eval-protocol.md` 不只定义主指标和主表,也应定义指标释义、实验阶梯,以及指标和对比实现的来源。\n- `/lab:auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `/lab:write` 前必须已有经批准的 `/lab:framing` 工件。\n\n## 如何输入 `/lab:auto`\n\n## `/lab:auto` 层级指南\n\n- `L1`:适合安全验证、一轮 bounded 真实运行,或简单 report 刷新。\n- 
`L2`:默认推荐级别,适合冻结核心边界内的常规实验迭代。\n- `L3`:激进 campaign 级别,只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定,默认推荐 `L2`。\n- 如果用户输入没写级别,或者把级别和 `paper layer`、`phase`、`table` 混用了,就应先停下来,要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab:auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab:auto 自治级别 L2。目标:推进 paper layer 3。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:配置、数据接入、评估脚本。`\n"
|
|
2007
2007
|
);
|
|
2008
2008
|
|
|
2009
2009
|
ZH_CONTENT[path.join(".codex", "prompts", "lab-data.md")] = codexPrompt(
|
|
@@ -2022,7 +2022,7 @@ ZH_CONTENT[path.join(".claude", "commands", "lab.md")] = claudeCommand(
|
|
|
2022
2022
|
"lab",
|
|
2023
2023
|
"查看 /lab 研究工作流总览并选择合适阶段",
|
|
2024
2024
|
"[stage] [target]",
|
|
2025
|
-
"# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种:\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时,要立刻执行该 stage,而不是只推荐别的阶段。\n-
|
|
2025
|
+
"# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种:\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时,要立刻执行该 stage,而不是只推荐别的阶段。\n- 先给简洁的阶段摘要;只要 stage 合同要求受管工件,就应立刻落盘,再回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `spec` 前应已有经批准的数据集与 benchmark 方案。\n- `run`、`iterate`、`auto`、`report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `write` 前必须已有经批准的 `framing` 工件。\n\n## 如何输入 `/lab auto`\n\n## `/lab auto` 层级指南\n\n- `L1`:适合安全验证、一轮 bounded 真实运行,或简单 report 刷新。\n- `L2`:默认推荐级别,适合冻结核心边界内的常规实验迭代。\n- `L3`:激进 campaign 级别,只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定,默认推荐 `L2`。\n- 如果用户输入没写级别,或者把级别和 `paper layer`、`phase`、`table` 混用了,就应先停下来,要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1` 不是 `Autonomy level L3`。\n- 一条好的 `/lab auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab auto 自治级别 L2。目标:推进 paper layer 3。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:配置、数据接入、评估脚本。`\n"
|
|
2026
2026
|
);
|
|
2027
2027
|
|
|
2028
2028
|
ZH_CONTENT[path.join(".claude", "commands", "lab-data.md")] = claudeCommand(
|
|
@@ -2055,7 +2055,7 @@ description: 严格研究工作流,覆盖 idea、data、auto、framing、spec
|
|
|
2055
2055
|
- 论文写作阶段要与实验执行阶段分离。
|
|
2056
2056
|
- 用户显式调用某个 \`/lab:stage\` 时,要直接执行该 stage,而不是只推荐别的 stage。
|
|
2057
2057
|
- 关键决策必须落盘,不能只留在聊天里。
|
|
2058
|
-
- 每个 stage
|
|
2058
|
+
- 每个 stage 都要先给用户一个简洁简介;如果 stage 合同要求受管工件,就应立刻落盘,最后必须回报路径和下一步。
|
|
2059
2059
|
- 如果缺少的前提会改变结论,一次只追问一个问题。
|
|
2060
2060
|
- 如果存在多条可行路径,先给 2-3 个方案、trade-offs 和推荐项,再收敛。
|
|
2061
2061
|
- 如果某个 stage 会决定后续方向,就要保留明确的 approval gate。
|
|
@@ -1,8 +1,8 @@
|
|
|
1
1
|
{
|
|
2
2
|
"stage_prompt": {
|
|
3
|
-
"codex_en": "This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.",
|
|
4
|
-
"claude_en": "This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.",
|
|
5
|
-
"codex_zh": "本命令运行 `/lab:idea` 阶段。把 `.codex/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo
|
|
6
|
-
"claude_zh": "本命令运行 lab workflow 的 `idea` 阶段。把 `.claude/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo
|
|
3
|
+
"codex_en": "This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.",
|
|
4
|
+
"claude_en": "This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.",
|
|
5
|
+
"codex_zh": "本命令运行 `/lab:idea` 阶段。把 `.codex/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验、收敛状态和 approval gate 的单一来源。先做第一轮脑暴,产出 2-4 个候选方向;再做第一轮文献检索,为每个方向补最接近前作;然后用第二轮脑暴淘汰或收敛方向;最后做第二轮文献检索,补齐最终来源包,再输出协作者可读的推荐结论。任何 final recommendation、paper fit 判断或 mission 写回之前,都必须先 materialize or update `.lab/writing/idea.md` 和 `.lab/writing/idea-source-log.md`。不要以纯聊天脑暴收尾;如果当前还没收敛,就明确写出还缺什么,并停在未收敛状态。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么、哪些部分已经 source-backed、哪些还只是 hypothesis,以及用户下一步该决定什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇;如果领域确实很窄,必须显式解释为什么合理地低于这个目标。只有在 `.lab/.managed/scripts/validate_idea_artifact.py` 通过之后,才可以把最终推荐当成已收敛结论输出。",
|
|
6
|
+
"claude_zh": "本命令运行 lab workflow 的 `idea` 阶段。把 `.claude/skills/lab/stages/idea.md` 当成两轮脑暴、两轮文献检索、最接近前作对照、source-backed proposal memo、评测草图、暂定贡献、用户引导、最小可行实验、收敛状态和 approval gate 的单一来源。先做第一轮脑暴,产出 2-4 个候选方向;再做第一轮文献检索,为每个方向补最接近前作;然后用第二轮脑暴淘汰或收敛方向;最后做第二轮文献检索,补齐最终来源包,再输出协作者可读的推荐结论。任何 final recommendation、paper fit 判断或 mission 写回之前,都必须先 materialize or update `.lab/writing/idea.md` 和 `.lab/writing/idea-source-log.md`。不要以纯聊天脑暴收尾;如果当前还没收敛,就明确写出还缺什么,并停在未收敛状态。最终 idea memo 必须讲清真实场景、解决了什么问题、现有方法为什么不够、准备怎么做、大致怎么评、暂定贡献是什么、哪些部分已经 source-backed、哪些还只是 hypothesis,以及用户下一步该决定什么。`.lab/writing/idea-source-log.md` 必须和两轮检索实际用到的查询、来源分桶和最终来源数保持一致。文献来源包默认目标约 20 篇;如果领域确实很窄,必须显式解释为什么合理地低于这个目标。只有在 `.lab/.managed/scripts/validate_idea_artifact.py` 通过之后,才可以把最终推荐当成已收敛结论输出。"
|
|
7
7
|
}
|
|
8
8
|
}
|
|
@@ -7,4 +7,4 @@ argument-hint: idea or research problem
|
|
|
7
7
|
Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
|
|
8
8
|
|
|
9
9
|
Execute the requested `/lab-idea` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
|
|
10
|
-
This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.
|
|
10
|
+
This command runs the `idea` stage of the lab workflow. Use `.claude/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.
|
|
@@ -50,7 +50,7 @@ Use the same repository artifacts and stage boundaries every time.
|
|
|
50
50
|
|
|
51
51
|
- Always use `skills/lab/SKILL.md` as the workflow contract.
|
|
52
52
|
- When the user explicitly invokes `/lab <stage> ...` or a direct `/lab-<stage>` alias, execute that stage now against the provided argument instead of only recommending another lab stage.
|
|
53
|
-
- Start by giving the user a concise summary
|
|
53
|
+
- Start by giving the user a concise stage summary. Materialize managed artifacts immediately when the stage contract requires them, then report the output path and next step.
|
|
54
54
|
- When ambiguity matters, ask one clarifying question at a time; when multiple paths are viable, present 2-3 approaches before converging.
|
|
55
55
|
- `spec` is not complete until the approved change is frozen under `.lab/changes/<change-id>/`.
|
|
56
56
|
- `spec` should inherit the approved dataset package from `.lab/context/data-decisions.md`.
|
|
@@ -6,4 +6,4 @@ argument-hint: idea or research problem
|
|
|
6
6
|
Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
|
|
7
7
|
|
|
8
8
|
Execute the requested `/lab:idea` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
|
|
9
|
-
This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified.
|
|
9
|
+
This command runs the `/lab:idea` stage. Use `.codex/skills/lab/stages/idea.md` as the single source of truth for the two brainstorm passes, two literature sweeps, closest-prior comparison, source-backed proposal memo, evaluation sketch, tentative contributions, user guidance, minimum viable experiment, convergence status, and approval gate. Start with brainstorm pass 1 over 2-4 candidate directions, run literature sweep 1 with real closest-prior references for each direction, narrow the field with brainstorm pass 2, then run literature sweep 2 to build the final source bundle before producing a collaborator-readable recommendation. Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before any final recommendation, paper-fit judgment, or mission writeback. Do not end the stage with a chat-only brainstorm; if the work is still unconverged, say so explicitly, list what is still missing, and stop there. The final idea memo must explain the real-world scenario, the problem solved, why current methods fall short, roughly how the idea would work, how it would be evaluated, what the tentative contributions are, what is already source-backed, what is still hypothesis-only, and what the user should decide next. Keep `.lab/writing/idea-source-log.md` synchronized with the actual search queries, bucketed sources, and final source count used in both sweeps. The literature bundle should default to about 20 sources unless the field is genuinely narrow and that smaller bundle is explicitly justified. Only after `.lab/.managed/scripts/validate_idea_artifact.py` passes may the stage present a final recommendation as converged.
|
|
@@ -44,7 +44,7 @@ argument-hint: workflow question or stage choice
|
|
|
44
44
|
|
|
45
45
|
- Always use `skills/lab/SKILL.md` as the workflow contract.
|
|
46
46
|
- When the user explicitly invokes `/lab:<stage>`, execute that stage now against the provided argument instead of only recommending another `/lab` stage.
|
|
47
|
-
- Start by giving the user a concise summary
|
|
47
|
+
- Start by giving the user a concise stage summary. Materialize managed artifacts immediately when the stage contract requires them, then report the output path and next step.
|
|
48
48
|
- When ambiguity matters, ask one clarifying question at a time; when multiple paths are viable, present 2-3 approaches before converging.
|
|
49
49
|
- `/lab:spec` is not complete until the approved change is frozen under `.lab/changes/<change-id>/`.
|
|
50
50
|
- `/lab:spec` should inherit the approved dataset package from `.lab/context/data-decisions.md`.
|
|
@@ -23,6 +23,7 @@ REQUIRED_SECTIONS = {
|
|
|
23
23
|
"Tentative Contributions": [r"^##\s+Tentative Contributions\s*$", r"^##\s+暂定贡献\s*$"],
|
|
24
24
|
"Candidate Experiment": [r"^##\s+Candidate Experiment\s*$", r"^##\s+(?:最小实验|候选实验)\s*$"],
|
|
25
25
|
"Falsifiable Hypothesis": [r"^##\s+Falsifiable Hypothesis\s*$", r"^##\s+可证伪假设\s*$"],
|
|
26
|
+
"Convergence Status": [r"^##\s+Convergence Status\s*$", r"^##\s+收敛状态\s*$"],
|
|
26
27
|
"Final Recommendation": [r"^##\s+Final Recommendation\s*$", r"^##\s+最终推荐\s*$"],
|
|
27
28
|
"User Guidance": [r"^##\s+User Guidance\s*$", r"^##\s+用户引导\s*$"],
|
|
28
29
|
}
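The new `Convergence Status` entry has the same shape as its neighbours: one English and one Chinese heading pattern per section. Below is a minimal sketch of how such patterns could be checked against a memo, assuming line-by-line matching. `find_missing_sections` is a hypothetical helper written for illustration, not the validator's own function, and only three of the entries are reproduced.

```python
import re

# Subset of REQUIRED_SECTIONS copied from the hunk above.
REQUIRED_SECTIONS = {
    "Falsifiable Hypothesis": [r"^##\s+Falsifiable Hypothesis\s*$", r"^##\s+可证伪假设\s*$"],
    "Convergence Status": [r"^##\s+Convergence Status\s*$", r"^##\s+收敛状态\s*$"],
    "Final Recommendation": [r"^##\s+Final Recommendation\s*$", r"^##\s+最终推荐\s*$"],
}


def find_missing_sections(text: str) -> list[str]:
    """Hypothetical helper: names of sections whose heading never appears."""
    lines = text.splitlines()
    missing = []
    for name, patterns in REQUIRED_SECTIONS.items():
        # A section counts as present if any line matches any of its patterns.
        found = any(re.match(p, line) for p in patterns for line in lines)
        if not found:
            missing.append(name)
    return missing


# Illustrative memo mixing a Chinese and an English heading.
memo = "## 可证伪假设\n...\n## Convergence Status\n...\n"
print(find_missing_sections(memo))
```

Because each pattern is anchored with `^##` and `\s*$`, a heading at a different level (for example `### Convergence Status`) or one with trailing text would not count as the required section.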
|
|
@@ -254,6 +255,16 @@ def validate_content(text: str) -> list[str]:
|
|
|
254
255
|
if not contains_any(experiment, ("minimum viable experiment", "minimum experiment", "dataset", "metric", "最小实验", "主指标", "次指标")):
|
|
255
256
|
issues.append("idea artifact is missing a minimum experiment")
|
|
256
257
|
|
|
258
|
+
convergence_status = extract_section_body(text, REQUIRED_SECTIONS["Convergence Status"])
|
|
259
|
+
if not has_field_value(convergence_status, ("Current status", "当前状态")):
|
|
260
|
+
issues.append("idea artifact is missing a convergence status with the current stage state")
|
|
261
|
+
if not has_field_value(convergence_status, ("What is already source-backed", "哪些部分已经 source-backed", "已经有来源支撑的部分")):
|
|
262
|
+
issues.append("idea artifact is missing a convergence status section that states what is already source-backed")
|
|
263
|
+
if not has_field_value(convergence_status, ("What is still hypothesis-only", "哪些部分还只是 hypothesis", "还只是生成性假设的部分")):
|
|
264
|
+
issues.append("idea artifact is missing a convergence status section that states what is still hypothesis-only")
|
|
265
|
+
if not has_field_value(convergence_status, ("Can this round end with a final recommendation", "这一轮能否结束并给出 final recommendation", "这一轮能否给出最终推荐")):
|
|
266
|
+
issues.append("idea artifact is missing a convergence status section that states whether the stage can end with a final recommendation")
|
|
267
|
+
|
|
257
268
|
final_recommendation = extract_section_body(text, REQUIRED_SECTIONS["Final Recommendation"])
|
|
258
269
|
if not contains_any(final_recommendation, ("recommended direction", "paper-worthy", "推荐方向", "值得做论文")):
|
|
259
270
|
issues.append("idea artifact is missing a final recommendation after the second sweep")
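The four added checks all call `has_field_value`, which this diff does not show. The sketch below illustrates the kind of check they imply, assuming fields are written as `- <label>: <value>` lines. The `has_field_value` body and the `missing_convergence_fields` helper are hypothetical stand-ins, not the package's implementation; only the English field labels are taken from the hunk.

```python
import re

# Field labels copied from the hunk above (English variants only).
CONVERGENCE_FIELDS = (
    "Current status",
    "What is already source-backed",
    "What is still hypothesis-only",
    "Can this round end with a final recommendation",
)


def has_field_value(section: str, labels: tuple[str, ...]) -> bool:
    """Hypothetical stand-in: True if any `<label>: <value>` line has a non-empty value."""
    for label in labels:
        # [ \t]* instead of \s* so an empty field cannot borrow the next line's text.
        pattern = r"^[ \t]*[-*]?[ \t]*" + re.escape(label) + r"[ \t]*[::][ \t]*\S"
        if re.search(pattern, section, flags=re.MULTILINE):
            return True
    return False


def missing_convergence_fields(section: str) -> list[str]:
    """Hypothetical helper: labels that are absent or left blank."""
    return [label for label in CONVERGENCE_FIELDS if not has_field_value(section, (label,))]


# Illustrative section body with one field left blank.
section = """\
- Current status: unconverged
- What is already source-backed: closest-prior comparison
- What is still hypothesis-only:
- Can this round end with a final recommendation: no
"""
print(missing_convergence_fields(section))
```

Under this assumption a label that is present but left blank is reported the same way as a missing label, which matches the intent of the added error messages.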
|
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
#!/usr/bin/env python3
|
|
2
2
|
import argparse
|
|
3
|
+
import json
|
|
3
4
|
import re
|
|
4
5
|
import sys
|
|
5
6
|
from pathlib import Path
|
|
@@ -8,6 +9,7 @@ from pathlib import Path
|
|
|
8
9
|
ABSOLUTE_PATH_MARKERS = ("/Users/", "/home/", "/tmp/", "/private/tmp/")
|
|
9
10
|
REQUIRED_TABLE_FILES = ("main-results.tex", "ablations.tex")
|
|
10
11
|
REQUIRED_FIGURE_FILES = ("problem-setting.tex", "method-overview.tex", "results-overview.tex")
|
|
12
|
+
REF_PATTERN_TEMPLATE = r"\\(?:auto|c|C)?ref\{%s\}"
|
|
11
13
|
|
|
12
14
|
|
|
13
15
|
def parse_args():
|
|
@@ -22,6 +24,39 @@ def read_text(path: Path) -> str:
|
|
|
22
24
|
return path.read_text(encoding="utf-8")
|
|
23
25
|
|
|
24
26
|
|
|
27
|
+
def find_workflow_config(start_path: Path) -> Path | None:
|
|
28
|
+
search_roots = [start_path, *start_path.parents]
|
|
29
|
+
for root in search_roots:
|
|
30
|
+
for relative in ("config/workflow.json", ".lab/config/workflow.json"):
|
|
31
|
+
candidate = root / relative
|
|
32
|
+
if candidate.exists():
|
|
33
|
+
return candidate
|
|
34
|
+
return None
|
|
35
|
+
|
|
36
|
+
|
|
37
|
+
def load_workflow_config(path: Path) -> dict:
|
|
38
|
+
return json.loads(path.read_text(encoding="utf-8"))
|
|
39
|
+
|
|
40
|
+
|
|
41
|
+
def text_looks_like_language(text: str, language: str) -> bool:
|
|
42
|
+
cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
|
|
43
|
+
latin_chars = len(re.findall(r"[A-Za-z]", text))
|
|
44
|
+
if language == "zh":
|
|
45
|
+
return cjk_chars >= 20
|
|
46
|
+
if language == "en":
|
|
47
|
+
return latin_chars >= 80 and cjk_chars < 20
|
|
48
|
+
return True
|
|
49
|
+
|
|
50
|
+
|
|
51
|
+
def extract_label(text: str) -> str | None:
|
|
52
|
+
match = re.search(r"\\label\{([^}]+)\}", text)
|
|
53
|
+
return match.group(1) if match else None
|
|
54
|
+
|
|
55
|
+
|
|
56
|
+
def section_references_label(text: str, label: str) -> bool:
|
|
57
|
+
return bool(re.search(REF_PATTERN_TEMPLATE % re.escape(label), text))
|
|
58
|
+
|
|
59
|
+
|
|
25
60
|
def check_exists(path: Path, issues: list[str], label: str):
|
|
26
61
|
if not path.exists():
|
|
27
62
|
issues.append(f"missing required file: {label} ({path})")
|
|
@@ -100,6 +135,13 @@ def check_analysis_asset(path: Path, issues: list[str]):
|
|
|
100
135
|
issues.append("analysis/analysis-asset.tex must explain asset intent")
|
|
101
136
|
|
|
102
137
|
|
|
138
|
+
def require_section_reference(section_text: str, label: str | None, issues: list[str], section_name: str, asset_name: str):
|
|
139
|
+
if not label:
|
|
140
|
+
return
|
|
141
|
+
if not section_references_label(section_text, label):
|
|
142
|
+
issues.append(f"{section_name} must explicitly reference the {asset_name} via \\ref{{{label}}}")
|
|
143
|
+
|
|
144
|
+
|
|
103
145
|
def check_introduction_section(paper_dir: Path, issues: list[str]):
|
|
104
146
|
introduction = paper_dir / "sections" / "introduction.tex"
|
|
105
147
|
check_exists(introduction, issues, "sections/introduction.tex")
|
|
@@ -153,6 +195,41 @@ def check_experiments_section(paper_dir: Path, issues: list[str]):
|
|
|
153
195
|
issues.append("experiments section is missing an analysis asset")
|
|
154
196
|
|
|
155
197
|
|
|
198
|
+
def check_asset_consumption(paper_dir: Path, issues: list[str]):
|
|
199
|
+
intro_path = paper_dir / "sections" / "introduction.tex"
|
|
200
|
+
method_path = paper_dir / "sections" / "method.tex"
|
|
201
|
+
experiments_path = paper_dir / "sections" / "experiments.tex"
|
|
202
|
+
if not intro_path.exists() or not method_path.exists() or not experiments_path.exists():
|
|
203
|
+
return
|
|
204
|
+
|
|
205
|
+
introduction_text = read_text(intro_path)
|
|
206
|
+
method_text = read_text(method_path)
|
|
207
|
+
experiments_text = read_text(experiments_path)
|
|
208
|
+
|
|
209
|
+
figures_dir = paper_dir / "figures"
|
|
210
|
+
tables_dir = paper_dir / "tables"
|
|
211
|
+
analysis_dir = paper_dir / "analysis"
|
|
212
|
+
|
|
213
|
+
problem_label = extract_label(read_text(figures_dir / "problem-setting.tex")) if (figures_dir / "problem-setting.tex").exists() else None
|
|
214
|
+
method_label = extract_label(read_text(figures_dir / "method-overview.tex")) if (figures_dir / "method-overview.tex").exists() else None
|
|
215
|
+
results_label = extract_label(read_text(figures_dir / "results-overview.tex")) if (figures_dir / "results-overview.tex").exists() else None
|
|
216
|
+
main_table_label = extract_label(read_text(tables_dir / "main-results.tex")) if (tables_dir / "main-results.tex").exists() else None
|
|
217
|
+
ablation_label = extract_label(read_text(tables_dir / "ablations.tex")) if (tables_dir / "ablations.tex").exists() else None
|
|
218
|
+
analysis_label = extract_label(read_text(analysis_dir / "analysis-asset.tex")) if (analysis_dir / "analysis-asset.tex").exists() else None
|
|
219
|
+
|
|
220
|
+
require_section_reference(introduction_text, problem_label, issues, "introduction section", "problem-setting figure")
|
|
221
|
+
require_section_reference(method_text, method_label, issues, "method section", "method-overview figure")
|
|
222
|
+
require_section_reference(experiments_text, main_table_label, issues, "experiments section", "main results table")
|
|
223
|
+
require_section_reference(experiments_text, ablation_label, issues, "experiments section", "ablation table")
|
|
224
|
+
require_section_reference(experiments_text, results_label, issues, "experiments section", "results-overview figure")
|
|
225
|
+
require_section_reference(experiments_text, analysis_label, issues, "experiments section", "analysis asset")
|
|
226
|
+
|
|
227
|
+
main_results_text = read_text(tables_dir / "main-results.tex") if (tables_dir / "main-results.tex").exists() else ""
|
|
228
|
+
prose_for_ranking_claims = "\n".join((introduction_text, experiments_text))
|
|
229
|
+
if re.search(r"\bQini\b", prose_for_ranking_claims, flags=re.IGNORECASE) and "Qini" not in main_results_text:
|
|
230
|
+
issues.append("manuscript prose mentions Qini but tables/main-results.tex does not expose it")
|
|
231
|
+
|
|
232
|
+
|
|
156
233
|
def check_method_section(paper_dir: Path, issues: list[str]):
|
|
157
234
|
method = paper_dir / "sections" / "method.tex"
|
|
158
235
|
check_exists(method, issues, "sections/method.tex")
|
|
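For readers of this hunk, the label-consumption check can be sketched as follows. `extract_label` and `section_references_label` are defined outside the lines shown here, so the versions below are hypothetical stand-ins (the regexes and the accepted `\ref` variants are assumptions). Only `require_section_reference` mirrors the added lines directly.

```python
import re

# Hypothetical stand-ins: the package's real helpers are not visible in this hunk.
LABEL_RE = re.compile(r"\\label\{([^}]+)\}")


def extract_label(asset_text):
    """Return the first label key found in a figure or table file, or None."""
    match = LABEL_RE.search(asset_text)
    return match.group(1) if match else None


def section_references_label(section_text, label):
    """Assumed rule: the section must cite the label via ref, autoref, or cref."""
    pattern = r"\\(?:ref|autoref|cref|Cref)\{" + re.escape(label) + r"\}"
    return re.search(pattern, section_text) is not None


def require_section_reference(section_text, label, issues, section_name, asset_name):
    # Mirrors the diff: a missing label is silently skipped here.
    if not label:
        return
    if not section_references_label(section_text, label):
        issues.append(
            f"{section_name} must explicitly reference the {asset_name} via \\ref{{{label}}}"
        )


table_tex = r"\begin{table}\caption{Main results}\label{tab:main}\end{table}"
cited = r"Table~\ref{tab:main} reports the headline numbers."
uncited = r"\input{tables/main-results} The results are strong."

issues = []
label = extract_label(table_tex)
require_section_reference(cited, label, issues, "experiments section", "main results table")
require_section_reference(uncited, label, issues, "experiments section", "main results table")
print(issues)
```

Under these assumptions, a section that only inputs the table without citing it yields one issue, while the section that cites `tab:main` yields none.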
@@ -180,6 +257,36 @@ def check_main_tex(paper_dir: Path, issues: list[str]):
|
|
|
180
257
|
issues.append("main.tex must include the references bibliography")
|
|
181
258
|
|
|
182
259
|
|
|
260
|
+
def check_language_layers(paper_dir: Path, issues: list[str]):
|
|
261
|
+
workflow_config = find_workflow_config(paper_dir)
|
|
262
|
+
if workflow_config is None:
|
|
263
|
+
return
|
|
264
|
+
|
|
265
|
+
config = load_workflow_config(workflow_config)
|
|
266
|
+
workflow_language = config.get("workflow_language")
|
|
267
|
+
paper_language = config.get("paper_language", workflow_language)
|
|
268
|
+
finalization_decision = config.get("paper_language_finalization_decision", "unconfirmed")
|
|
269
|
+
|
|
270
|
+
if not workflow_language or workflow_language == paper_language:
|
|
271
|
+
return
|
|
272
|
+
|
|
273
|
+
if finalization_decision == "unconfirmed":
|
|
274
|
+
issues.append(
|
|
275
|
+
"workflow_language and paper_language differ; confirm paper_language_finalization_decision before finalizing the manuscript"
|
|
276
|
+
)
|
|
277
|
+
return
|
|
278
|
+
|
|
279
|
+
sections_dir = paper_dir / "sections"
|
|
280
|
+
section_text = "\n".join(
|
|
281
|
+
read_text(path) for path in sorted(sections_dir.glob("*.tex")) if path.is_file()
|
|
282
|
+
)
|
|
283
|
+
target_language = workflow_language if finalization_decision == "keep-workflow-language" else paper_language
|
|
284
|
+
if section_text and not text_looks_like_language(section_text, target_language):
|
|
285
|
+
issues.append(
|
|
286
|
+
f"final manuscript sections should follow {target_language} after paper_language_finalization_decision={finalization_decision}"
|
|
287
|
+
)
|
|
288
|
+
|
|
289
|
+
|
|
183
290
|
def main():
|
|
184
291
|
args = parse_args()
|
|
185
292
|
paper_dir = Path(args.paper_dir)
|
|
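The branching in `check_language_layers` may be easier to follow as a pure function over the config dictionary. The sketch below restates the same three outcomes; the function name `language_gate` and the decision value `"convert-to-paper-language"` are hypothetical. Only `"unconfirmed"` and `"keep-workflow-language"` appear in the diff, and any other value falls through to `paper_language`.

```python
def language_gate(config):
    """Return 'skip', 'blocked', or the language the final sections should follow."""
    workflow_language = config.get("workflow_language")
    paper_language = config.get("paper_language", workflow_language)
    decision = config.get("paper_language_finalization_decision", "unconfirmed")

    # No workflow language, or both layers agree: nothing to enforce.
    if not workflow_language or workflow_language == paper_language:
        return "skip"
    # Languages differ and nobody has confirmed the decision: report an issue.
    if decision == "unconfirmed":
        return "blocked"
    # Confirmed: pick the language the section text is then checked against.
    return workflow_language if decision == "keep-workflow-language" else paper_language


same = language_gate({"workflow_language": "en"})
pending = language_gate({"workflow_language": "zh", "paper_language": "en"})
keep = language_gate({
    "workflow_language": "zh",
    "paper_language": "en",
    "paper_language_finalization_decision": "keep-workflow-language",
})
# "convert-to-paper-language" is a placeholder value, not taken from the package.
convert = language_gate({
    "workflow_language": "zh",
    "paper_language": "en",
    "paper_language_finalization_decision": "convert-to-paper-language",
})
print(same, pending, keep, convert)
```

In the validator, the `"blocked"` case appends the "confirm paper_language_finalization_decision" issue, and the language cases feed `text_looks_like_language` on the concatenated section files.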
@@ -195,6 +302,8 @@ def main():
|
|
|
195
302
|
check_introduction_section(paper_dir, issues)
|
|
196
303
|
check_method_section(paper_dir, issues)
|
|
197
304
|
check_experiments_section(paper_dir, issues)
|
|
305
|
+
check_asset_consumption(paper_dir, issues)
|
|
306
|
+
check_language_layers(paper_dir, issues)
|
|
198
307
|
|
|
199
308
|
tables_dir = paper_dir / "tables"
|
|
200
309
|
check_table_file(tables_dir / REQUIRED_TABLE_FILES[0], issues, "tables/main-results.tex")
|
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
#!/usr/bin/env python3
|
|
2
2
|
import argparse
|
|
3
|
+
import json
|
|
3
4
|
import re
|
|
4
5
|
import sys
|
|
5
6
|
from pathlib import Path
|
|
@@ -144,6 +145,31 @@ def read_text(path: Path) -> str:
|
|
|
144
145
|
return path.read_text(encoding="utf-8")
|
|
145
146
|
|
|
146
147
|
|
|
148
|
+
def find_workflow_config(start_path: Path) -> Path | None:
|
|
149
|
+
search_roots = [start_path.parent, *start_path.parents]
|
|
150
|
+
for root in search_roots:
|
|
151
|
+
for relative in ("config/workflow.json", ".lab/config/workflow.json"):
|
|
152
|
+
candidate = root / relative
|
|
153
|
+
if candidate.exists():
|
|
154
|
+
return candidate
|
|
155
|
+
return None
|
|
156
|
+
|
|
157
|
+
|
|
158
|
+
def load_workflow_language(path: Path) -> str:
|
|
159
|
+
data = json.loads(path.read_text(encoding="utf-8"))
|
|
160
|
+
return data.get("workflow_language", "en")
|
|
161
|
+
|
|
162
|
+
|
|
163
|
+
def text_looks_like_language(text: str, language: str) -> bool:
|
|
164
|
+
cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
|
|
165
|
+
latin_chars = len(re.findall(r"[A-Za-z]", text))
|
|
166
|
+
if language == "zh":
|
|
167
|
+
return cjk_chars >= 20
|
|
168
|
+
if language == "en":
|
|
169
|
+
return latin_chars >= 80 and cjk_chars < 20
|
|
170
|
+
return True
|
|
171
|
+
|
|
172
|
+
|
|
147
173
|
def extract_section_body(text: str, patterns: list[str]) -> str:
|
|
148
174
|
for pattern in patterns:
|
|
149
175
|
match = re.search(pattern, text, flags=re.MULTILINE)
|
|
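The language heuristic added in this hunk is a character count, not a language detector. The snippet below copies `text_looks_like_language` as shown in the diff and runs it on synthetic strings (the sample strings are illustrative only) to make its thresholds visible.

```python
import re


def text_looks_like_language(text: str, language: str) -> bool:
    # Same logic as the added lines: count CJK ideographs and ASCII letters.
    cjk_chars = len(re.findall(r"[\u4e00-\u9fff]", text))
    latin_chars = len(re.findall(r"[A-Za-z]", text))
    if language == "zh":
        return cjk_chars >= 20
    if language == "en":
        return latin_chars >= 80 and cjk_chars < 20
    return True


zh_text = "研究" * 15          # 30 CJK characters
en_text = "word " * 30         # 120 ASCII letters
mixed_text = en_text + zh_text
short_en = "Too short."        # 8 ASCII letters

results = {
    "zh_as_zh": text_looks_like_language(zh_text, "zh"),
    "en_as_en": text_looks_like_language(en_text, "en"),
    "mixed_as_en": text_looks_like_language(mixed_text, "en"),
    "mixed_as_zh": text_looks_like_language(mixed_text, "zh"),
    "short_as_en": text_looks_like_language(short_en, "en"),
    "en_as_fr": text_looks_like_language(en_text, "fr"),
}
print(results)
```

Two consequences follow from the thresholds: a mostly English text with 20 or more CJK characters still passes as `zh`, and any language code other than `zh` or `en` is accepted unconditionally.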
@@ -250,6 +276,14 @@ def main():
|
|
|
250
276
|
issues.append(f"paper plan is missing required sections: {', '.join(missing)}")
|
|
251
277
|
issues.extend(validate_filled_fields(text))
|
|
252
278
|
|
|
279
|
+
workflow_config = find_workflow_config(plan_path)
|
|
280
|
+
if workflow_config is not None:
|
|
281
|
+
workflow_language = load_workflow_language(workflow_config)
|
|
282
|
+
if not text_looks_like_language(text, workflow_language):
|
|
283
|
+
issues.append(
|
|
284
|
+
f"paper plan should follow workflow_language={workflow_language} instead of drifting into another language"
|
|
285
|
+
)
|
|
286
|
+
|
|
253
287
|
if issues:
|
|
254
288
|
for issue in issues:
|
|
255
289
|
print(issue, file=sys.stderr)
|
|
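The new check only runs when `find_workflow_config` locates a `workflow.json` above the plan file. The sketch below copies that function from the diff (with `Optional` in place of the `Path | None` annotation) and exercises it on a temporary directory. The directory layout is an assumption modeled on the `.lab/writing/plan.md` and `.lab/config/workflow.json` paths mentioned in the stage docs.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_workflow_config(start_path: Path) -> Optional[Path]:
    # Walk upward from the parent of start_path; start_path itself is never a root.
    search_roots = [start_path.parent, *start_path.parents]
    for root in search_roots:
        for relative in ("config/workflow.json", ".lab/config/workflow.json"):
            candidate = root / relative
            if candidate.exists():
                return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    config = repo / ".lab" / "config" / "workflow.json"
    config.parent.mkdir(parents=True)
    config.write_text('{"workflow_language": "zh"}', encoding="utf-8")

    plan = repo / ".lab" / "writing" / "plan.md"
    plan.parent.mkdir(parents=True)
    plan.write_text("# plan", encoding="utf-8")

    found = find_workflow_config(plan)
    found_matches = found == config
    print(found_matches)
```

Because the search roots begin at `start_path.parent`, a config stored inside the start directory itself is not considered. That matters for the manuscript validator, which passes the paper directory rather than a file.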
@@ -180,6 +180,13 @@ Suggested levels:
|
|
|
180
180
|
- If the idea is correct:
|
|
181
181
|
- If the idea is wrong:
|
|
182
182
|
|
|
183
|
+
## Convergence Status
|
|
184
|
+
|
|
185
|
+
- Current status:
|
|
186
|
+
- What is already source-backed:
|
|
187
|
+
- What is still hypothesis-only:
|
|
188
|
+
- Can this round end with a final recommendation:
|
|
189
|
+
|
|
183
190
|
## Candidate Experiment
|
|
184
191
|
|
|
185
192
|
- Baseline:
|
|
@@ -14,8 +14,8 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
|
|
|
14
14
|
- Keep dataset and benchmark selection separate from idea generation and specification.
|
|
15
15
|
- Keep paper writing separate from experiment execution and reporting.
|
|
16
16
|
- When the user explicitly invokes `/lab:<stage>`, execute that stage against the provided target instead of replying with a recommendation for another `/lab` stage.
|
|
17
|
-
- Start each stage with a concise user-facing summary
|
|
18
|
-
-
|
|
17
|
+
- Start each stage with a concise user-facing summary.
|
|
18
|
+
- If the stage contract requires managed artifacts, materialize them immediately instead of treating artifact creation as optional. Then report the output path plus next step.
|
|
19
19
|
- When a missing assumption would materially change the stage outcome, ask one clarifying question at a time.
|
|
20
20
|
- When there are multiple viable paths, present 2-3 approaches with trade-offs and a recommendation before converging.
|
|
21
21
|
- When a stage materially sets downstream direction, keep an explicit approval gate before proceeding.
|
|
@@ -42,6 +42,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
|
|
|
42
42
|
- Use brainstorm pass 2 to keep only the strongest 1-2 directions and explain what was rejected.
|
|
43
43
|
- Run literature sweep 2 before making a final recommendation or novelty claim.
|
|
44
44
|
- Build a literature-scoping bundle before claiming novelty. The default target is 20 relevant sources unless the field is genuinely too narrow and that exception is written down.
|
|
45
|
+
- Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before giving any final recommendation, paper-fit judgment, or mission writeback.
|
|
45
46
|
- Read `.lab/context/mission.md` and `.lab/context/open-questions.md` before drafting.
|
|
46
47
|
- Read `.lab/config/workflow.json` before drafting and follow its `workflow_language` for idea artifacts.
|
|
47
48
|
- Ask one clarifying question at a time when critical ambiguity remains.
|
|
@@ -58,6 +59,8 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
|
|
|
58
59
|
- Write idea artifacts with the template in `.lab/.managed/templates/idea.md`.
|
|
59
60
|
- Keep `.lab/writing/idea-source-log.md` as the source-backed search manifest for the two literature sweeps.
|
|
60
61
|
- Run `.lab/.managed/scripts/validate_idea_artifact.py --idea <idea-artifact> --source-log .lab/writing/idea-source-log.md --workflow-config .lab/config/workflow.json` before treating the idea as converged.
|
|
62
|
+
- Do not end the stage with a chat-only brainstorm. If only one brainstorm pass or literature sweep is complete, mark the idea as unconverged, list what is missing, and stop without pretending that the stage has converged.
|
|
63
|
+
- Only after the validator passes may the stage update `.lab/context/mission.md`, `.lab/context/decisions.md`, and `.lab/context/open-questions.md` with a final recommendation.
|
|
61
64
|
- Update `.lab/context/mission.md`, `.lab/context/decisions.md`, and `.lab/context/open-questions.md` after convergence.
|
|
62
65
|
- Do not leave `.lab/context/mission.md` as a template shell once the problem statement and approved direction are known.
|
|
63
66
|
- Do not implement code in this stage.
|
|
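The "only after the validator passes" rule can be read as a gate on the validator's exit status. The sketch below assumes that `validate_idea_artifact.py` exits non-zero and prints issues to stderr on failure; that behavior is not shown in this diff. The helper names are hypothetical, and the flags are the ones quoted in the skill text. A stand-in command replaces the real script so the example runs without the package installed.

```python
import subprocess
import sys


def validator_passes(command):
    """Run a validator command; exit code 0 is treated as a pass (an assumption)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stderr.strip()


def idea_validator_command(idea_path):
    # Flags copied from the skill text; the script path is relative to the repo root.
    return [
        sys.executable,
        ".lab/.managed/scripts/validate_idea_artifact.py",
        "--idea", idea_path,
        "--source-log", ".lab/writing/idea-source-log.md",
        "--workflow-config", ".lab/config/workflow.json",
    ]


# Stand-in for a failing validator run.
failing = [
    sys.executable,
    "-c",
    "import sys; print('missing sweep 2', file=sys.stderr); sys.exit(1)",
]
ok, message = validator_passes(failing)
if not ok:
    print(f"idea is unconverged, context files stay untouched: {message}")
```

In a real repository the gate would call `validator_passes(idea_validator_command(".lab/writing/idea.md"))` and write to `.lab/context/` only when it returns a pass.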
@@ -16,6 +16,7 @@
|
|
|
16
16
|
- rough plain-language approach description
|
|
17
17
|
- evaluation sketch with the evaluation subject, any proxy or simulator, the main outcome to observe, and the main validity risk
|
|
18
18
|
- tentative contributions stated at idea level, not final paper-facing wording
|
|
19
|
+
- convergence status that says what is already source-backed, what is still hypothesis-only, and whether the stage may end with a final recommendation
|
|
19
20
|
- three meaningful points
|
|
20
21
|
- brainstorm pass 1 with 2-4 candidate directions
|
|
21
22
|
- literature sweep 1 with 3-5 closest-prior references per direction
|
|
@@ -42,10 +43,12 @@
|
|
|
42
43
|
- Build a source bundle before claiming novelty. The default target is 20 relevant sources split across closest prior work, recent strong papers, benchmark or evaluation papers, surveys or taxonomies, and adjacent-field work when useful.
|
|
43
44
|
- Treat closest prior work, recent strong papers, benchmark or evaluation papers, and survey or taxonomy papers as mandatory coverage buckets. Do not leave those buckets empty in the final source bundle.
|
|
44
45
|
- Keep a separate idea source log that records the actual search queries, bucketed sources, and final source count for both literature sweeps.
|
|
46
|
+
- Materialize or update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before giving any final recommendation, paper-fit judgment, or mission writeback.
|
|
45
47
|
- Use the first brainstorm pass only to generate candidate directions. Treat it as hypothesis generation, not as a novelty judgment.
|
|
46
48
|
- After brainstorm pass 1, run a first literature sweep that gathers 3-5 closest-prior references per direction before narrowing the idea.
|
|
47
49
|
- After literature sweep 1, run a second brainstorm pass that explicitly kills, merges, or narrows directions.
|
|
48
50
|
- Only after literature sweep 2 may the artifact give a final recommendation, paper fit, or novelty claim.
|
|
51
|
+
- Do not end the stage with a chat-only brainstorm. If only a brainstorm pass or literature sweep is complete, mark the stage as unconverged, list what is still missing, and stop without pretending that the idea has converged.
|
|
49
52
|
- If the field is genuinely too narrow to support that target, say so explicitly in both the idea artifact and the idea source log, and justify the smaller literature bundle instead of silently skipping the search.
|
|
50
53
|
- The idea artifact must follow the repository `workflow_language`, not whichever language is easiest locally.
|
|
51
54
|
- Before writing the full artifact, give the user a short summary with the one-sentence problem, why current methods fail, and the three meaningful points.
|
|
@@ -69,6 +72,7 @@
|
|
|
69
72
|
|
|
70
73
|
- idea artifact derived from `.lab/.managed/templates/idea.md`
|
|
71
74
|
- idea source log at `.lab/writing/idea-source-log.md`, derived from `.lab/.managed/templates/idea-source-log.md`
|
|
75
|
+
- the working idea artifact should live at `.lab/writing/idea.md`
|
|
72
76
|
|
|
73
77
|
## Recommended Structure
|
|
74
78
|
|
|
@@ -93,12 +97,13 @@
|
|
|
93
97
|
19. Candidate approaches and recommendation
|
|
94
98
|
20. Dataset, baseline, and metric candidates
|
|
95
99
|
21. Falsifiable hypothesis
|
|
96
|
-
22.
|
|
97
|
-
23.
|
|
98
|
-
24.
|
|
99
|
-
25.
|
|
100
|
-
26.
|
|
101
|
-
27.
|
|
100
|
+
22. Convergence status
|
|
101
|
+
23. Expert critique
|
|
102
|
+
24. Revised proposal or final recommendation
|
|
103
|
+
25. User guidance
|
|
104
|
+
26. Approval gate
|
|
105
|
+
27. Minimum viable experiment
|
|
106
|
+
28. Idea source log aligned with the two literature sweeps
|
|
102
107
|
|
|
103
108
|
## Writing Standard
|
|
104
109
|
|
|
@@ -113,6 +118,8 @@
|
|
|
113
118
|
- Do not call something new without a literature-scoping bundle and a closest-prior comparison.
|
|
114
119
|
- Do not call something paper-worthy or novel after only one brainstorm pass or one literature sweep.
|
|
115
120
|
- Do not treat the idea artifact itself as the only evidence record; keep `.lab/writing/idea-source-log.md` synchronized with the actual searches and source buckets used in both literature sweeps.
|
|
121
|
+
- Do not leave the working result only in chat. Update `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` before returning a stage conclusion.
|
|
122
|
+
- Do not present a final recommendation, paper fit, or next-stage approval unless the working idea artifact is validator-clean and its convergence status says the stage is ready for that recommendation.
|
|
116
123
|
- Explain what current methods do, why they fall short, and roughly how the proposed idea would work in plain language.
|
|
117
124
|
- Explain what problem the idea actually solves before describing tentative contributions.
|
|
118
125
|
- Keep the evaluation sketch high-level: who or what is evaluated, what proxy or simulator is used if any, what outcome matters, and what the main validity risk is. Leave full protocol design to later stages.
|
|
@@ -92,6 +92,7 @@ Run these on every round:
|
|
|
92
92
|
- record what each figure or analysis asset should show and why the reader needs it
|
|
93
93
|
- record which citation anchors must appear in the section and why each anchor matters
|
|
94
94
|
- Before drafting `introduction`, `method`, `experiments`, `related work`, or `conclusion`, run `.lab/.managed/scripts/validate_paper_plan.py --paper-plan .lab/writing/plan.md`.
|
|
95
|
+
- When the repository workflow config is available, the paper-plan validator also checks that `.lab/writing/plan.md` stays in `workflow_language` instead of silently drifting into another language.
|
|
95
96
|
- If the paper-plan validator fails, stop and fill `.lab/writing/plan.md` first instead of drafting prose.
|
|
96
97
|
- During ordinary draft rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode draft` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode draft` after revising the active section.
|
|
97
98
|
- Treat draft-round output from the section and claim validators as warnings that must be recorded and addressed in the write-iteration artifact, not as immediate stop conditions.
|
|
@@ -120,6 +121,7 @@ Run these on every round:
|
|
|
120
121
|
- For final-draft or export rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode final` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode final` before accepting the round.
|
|
121
122
|
- If the final-round section or claim validators fail, keep editing the affected section until it passes; do not stop at asset-complete but rhetorically weak or unsafe prose.
|
|
122
123
|
- Run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before accepting a final-draft or export round.
|
|
124
|
+
- The manuscript-delivery validator should fail if the core figures and tables are only inserted but never cited from section prose, or if final manuscript acceptance tries to bypass the one-time `paper_language_finalization_decision` gate when `workflow_language` and `paper_language` differ.
|
|
123
125
|
- If the manuscript validator fails, keep editing and asset generation until it passes; do not stop at prose-only completion.
|
|
124
126
|
- Run a LaTeX compile smoke test when a local LaTeX toolchain is available; if not available, record the missing verification in the write iteration artifact.
|
|
125
127
|
- Record what changed and why in a write-iteration artifact.
|