superlab 0.1.22 → 0.1.23
- package/README.md +9 -0
- package/README.zh-CN.md +9 -0
- package/lib/i18n.cjs +22 -2
- package/package-assets/claude/commands/lab.md +8 -0
- package/package-assets/codex/prompts/lab.md +8 -0
- package/package-assets/shared/lab/context/auto-mode.md +6 -0
- package/package-assets/shared/skills/lab/SKILL.md +1 -0
- package/package-assets/shared/skills/lab/stages/auto.md +14 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -176,6 +176,7 @@ superlab auto stop
|
|
|
176
176
|
- `L1` is a safe run envelope for `run`, `review`, and `report`
|
|
177
177
|
- `L2` is the default bounded iteration envelope for `run`, `iterate`, `review`, and `report`
|
|
178
178
|
- `L3` is an aggressive campaign envelope that may also include `write`
|
|
179
|
+
- If you are unsure, choose `L2`
|
|
179
180
|
|
|
180
181
|
- `run` and `iterate` must change persistent outputs under `results_root`
|
|
181
182
|
- `review` must update canonical review context
|
|
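The level bullets in the hunk above can be read as a per-level stage allow-list. Below is a minimal sketch of that reading in JavaScript. The names `ALLOWED_STAGES`, `RECOMMENDED_LEVEL`, and `isStageAllowed` are hypothetical and do not come from the superlab source; only the stage lists are taken from the README text, and `L3` is assumed to be the `L2` list plus `write`.

```javascript
// Hypothetical lookup table; stage lists follow the README bullets.
// Assumption: L3 is the L2 envelope plus the optional `write` stage.
const ALLOWED_STAGES = Object.freeze({
  L1: ["run", "review", "report"],
  L2: ["run", "iterate", "review", "report"],
  L3: ["run", "iterate", "review", "report", "write"],
});

// README: "If you are unsure, choose `L2`". This is a recommendation,
// not a silent default; a missing level should still be asked for.
const RECOMMENDED_LEVEL = "L2";

// Returns true when `stage` falls inside the envelope for `level`.
function isStageAllowed(level, stage) {
  const stages = ALLOWED_STAGES[level];
  if (!stages) {
    throw new Error(`Unknown autonomy level: ${level}`);
  }
  return stages.includes(stage);
}

console.log(isStageAllowed("L1", "iterate")); // false
console.log(isStageAllowed(RECOMMENDED_LEVEL, "iterate")); // true
console.log(isStageAllowed("L3", "write")); // true
```

Keeping the table frozen and throwing on unknown levels is one way to make an unrecognized level fail loudly instead of falling through to a permissive default.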
@@ -189,6 +190,14 @@ It does not replace manual `idea`, `data`, `framing`, or `spec` decisions.
|
|
|
189
190
|
|
|
190
191
|
Good `/lab:auto` input is explicit. Treat `Autonomy level L1/L2/L3` as execution privilege, and treat `paper layer`, `phase`, or `table` as experiment targets. If the workflow language is Chinese, summaries, checklist items, task labels, and progress updates should also stay in Chinese unless a literal identifier must remain unchanged.
|
|
191
192
|
|
|
193
|
+
Level Guide for `/lab:auto`:
|
|
194
|
+
|
|
195
|
+
- `L1`: use this when you want safe validation, one bounded real run, or a simple report refresh
|
|
196
|
+
- `L2`: use this for normal bounded experiment iteration inside a frozen core
|
|
197
|
+
- `L3`: use this only when you want a broader campaign with a larger search space and optional writing
|
|
198
|
+
- If the request omits the level or mixes it with a paper layer, phase, or table target, stop and ask for an explicit level before starting
|
|
199
|
+
- If you are unsure, choose `L2`
|
|
200
|
+
|
|
192
201
|
Example:
|
|
193
202
|
|
|
194
203
|
```text
|
package/README.zh-CN.md
CHANGED
|
@@ -174,6 +174,7 @@ superlab auto stop
|
|
|
174
174
|
- `L1` 是安全运行级别,只允许 `run`、`review`、`report`
|
|
175
175
|
- `L2` 是默认推荐级别,允许 `run`、`iterate`、`review`、`report`
|
|
176
176
|
- `L3` 是激进 campaign 级别,才允许额外编排 `write`
|
|
177
|
+
- 如果不确定,默认推荐 `L2`
|
|
177
178
|
|
|
178
179
|
- `run` 和 `iterate` 必须更新 `results_root` 下的持久输出
|
|
179
180
|
- `review` 必须更新规范的审查上下文
|
|
@@ -187,6 +188,14 @@ superlab auto stop
|
|
|
187
188
|
|
|
188
189
|
好的 `/lab:auto` 输入应该显式写清。把 `Autonomy level L1/L2/L3` 当成执行权限级别,把 `paper layer`、`phase`、`table` 当成实验目标,不要混用。如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新也应保持中文,除非某个字面标识符必须保持原样。
|
|
189
190
|
|
|
191
|
+
`/lab:auto` 层级指南:
|
|
192
|
+
|
|
193
|
+
- `L1`:适合安全验证、一轮有边界真实运行,或简单的 report 刷新
|
|
194
|
+
- `L2`:适合冻结核心边界内的常规实验迭代,也是默认推荐级别
|
|
195
|
+
- `L3`:只在你明确想做更大范围 campaign、允许更广探索和可选写作时使用
|
|
196
|
+
- 如果用户输入没写级别,或者把级别和 `paper layer`、`phase`、`table` 混用了,就应先停下来,给出更详细的层级说明,再要求用户明确选 `L1/L2/L3`
|
|
197
|
+
- 如果不确定,默认推荐 `L2`
|
|
198
|
+
|
|
190
199
|
示例:
|
|
191
200
|
|
|
192
201
|
```text
|
package/lib/i18n.cjs
CHANGED
|
@@ -1107,7 +1107,13 @@ const ZH_SKILL_FILES = {
|
|
|
1107
1107
|
- Objective:
|
|
1108
1108
|
- Autonomy level: L2
|
|
1109
1109
|
- Autonomy level 只表示执行权限级别,不表示论文 layer 或 table 编号。
|
|
1110
|
+
- 层级指南:
|
|
1111
|
+
- \`L1\` = 一轮安全验证或简单 report 刷新
|
|
1112
|
+
- \`L2\` = 默认推荐级别,适合冻结核心边界内的实验迭代
|
|
1113
|
+
- \`L3\` = 激进 campaign,适合更大范围探索和可选写作
|
|
1114
|
+
- 如果不确定,默认推荐 \`L2\`。
|
|
1110
1115
|
- 如果你想表达论文层、实验 phase 或主表,请明确写成 \`paper layer\`、\`phase\` 或 \`table\`。
|
|
1116
|
+
- 如果你的请求提到了论文层、实验 phase 或主表,但没写自治级别,就不要启动循环,先补明确的 \`Autonomy level\`。
|
|
1111
1117
|
- Approval status: draft
|
|
1112
1118
|
- Allowed stages: run, iterate, review, report
|
|
1113
1119
|
- Success criteria:
|
|
@@ -1572,7 +1578,7 @@ ZH_CONTENT[path.join(".lab", ".managed", "templates", "framing.md")] = `# 论文
|
|
|
1572
1578
|
ZH_CONTENT[path.join(".codex", "prompts", "lab.md")] = codexPrompt(
|
|
1573
1579
|
"查看 /lab 研究工作流总览并选择合适阶段",
|
|
1574
1580
|
"workflow question 或 stage choice",
|
|
1575
|
-
"# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n 调研 idea,定义问题与 failure case,归类 contribution 与 breakthrough level,对比现有方法,收束三个一眼就有意义的点,并在实现前保留 approval gate。\n\n- `/lab:data`\n 把已批准的 idea 转成数据集与 benchmark 方案,记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制,以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由,和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n 在不改变 mission、framing 和核心 claims 的前提下,读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`,必要时扩展数据集、benchmark 和 comparison methods,并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal,并显式批准契约。\n\n- `/lab:framing`\n 通过审计当前领域与相邻领域的术语,锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets,并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n 把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录,并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n 执行最小有意义验证运行,登记 run,并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n 在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n 以 reviewer mode 审查文档或结果,先给短摘要,再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n 从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n 使用已安装 `lab` skill 下 vendored 的 paper-writing references,把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时,要立刻执行该 stage,而不是只推荐别的 `/lab` stage。\n- 先给简洁摘要,再决定是否写工件,最后回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `/lab:spec` 前应已有经批准的数据集与 benchmark 方案。\n- `/lab:run`、`/lab:iterate`、`/lab:auto`、`/lab:report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `.lab/context/eval-protocol.md` 不只定义主指标和主表,也应定义指标释义、实验阶梯,以及指标和对比实现的来源。\n- `/lab:auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `/lab:write` 前必须已有经批准的 `/lab:framing` 工件。\n\n## 如何输入 `/lab:auto`\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper 
layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1 reviewer fidelity` 不是 `Autonomy level L3`。\n- 一条好的 `/lab:auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab:auto 自治级别 L2。目标:推进 paper layer 3 的 organizer enforcement。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:evaluator prompt registry、ingestion、parser。`\n"
|
|
1581
|
+
"# `/lab` for Codex\n\n`/lab` 是严格的研究工作流命令族。每次都使用同一套仓库工件和阶段边界。\n\n## 子命令\n\n- `/lab:idea`\n 调研 idea,定义问题与 failure case,归类 contribution 与 breakthrough level,对比现有方法,收束三个一眼就有意义的点,并在实现前保留 approval gate。\n\n- `/lab:data`\n 把已批准的 idea 转成数据集与 benchmark 方案,记录数据集年份、使用过该数据集的论文、下载来源、许可或访问限制,以及 classic-public、recent-strong-public、claim-specific 三类 benchmark 的纳入理由,和 canonical baselines、strong historical baselines、recent strong public methods、closest prior work 四类对比方法的纳入理由。\n\n- `/lab:auto`\n 在不改变 mission、framing 和核心 claims 的前提下,读取 eval-protocol 与 auto-mode 契约并自动编排 `run`、`iterate`、`review`、`report`,必要时扩展数据集、benchmark 和 comparison methods,并在满足升格策略时自动升级 primary package。启动前必须选定 autonomy level、声明 terminal goal,并显式批准契约。\n\n- `/lab:framing`\n 通过审计当前领域与相邻领域的术语,锁定 paper-facing 的方法名、模块名、论文题目和 contribution bullets,并在 section 起草前保留 approval gate。\n\n- `/lab:spec`\n 把已批准的 idea 转成 `.lab/changes/<change-id>/` 下的一个 lab change 目录,并在其中写出 `proposal`、`design`、`spec`、`tasks`。\n\n- `/lab:run`\n 执行最小有意义验证运行,登记 run,并生成第一版标准化评估摘要。\n\n- `/lab:iterate`\n 在冻结 mission、阈值、verification commands 与 `completion_promise` 的前提下执行有边界的实验迭代。\n\n- `/lab:review`\n 以 reviewer mode 审查文档或结果,先给短摘要,再输出 findings、fatal flaws、fix priority 和 residual risks。\n\n- `/lab:report`\n 从 runs 和 iterations 工件生成最终研究报告。\n\n- `/lab:write`\n 使用已安装 `lab` skill 下 vendored 的 paper-writing references,把稳定 report 工件转成论文 section。\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab:<stage>` 时,要立刻执行该 stage,而不是只推荐别的 `/lab` stage。\n- 先给简洁摘要,再决定是否写工件,最后回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `/lab:spec` 前应已有经批准的数据集与 benchmark 方案。\n- `/lab:run`、`/lab:iterate`、`/lab:auto`、`/lab:report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `.lab/context/eval-protocol.md` 不只定义主指标和主表,也应定义指标释义、实验阶梯,以及指标和对比实现的来源。\n- `/lab:auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `/lab:write` 前必须已有经批准的 `/lab:framing` 工件。\n\n## 如何输入 `/lab:auto`\n\n## `/lab:auto` 层级指南\n\n- `L1`:适合安全验证、一轮 bounded 真实运行,或简单 report 刷新。\n- 
`L2`:默认推荐级别,适合冻结核心边界内的常规实验迭代。\n- `L3`:激进 campaign 级别,只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定,默认推荐 `L2`。\n- 如果用户输入没写级别,或者把级别和 `paper layer`、`phase`、`table` 混用了,就应先停下来,要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1 reviewer fidelity` 不是 `Autonomy level L3`。\n- 一条好的 `/lab:auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab:auto 自治级别 L2。目标:推进 paper layer 3 的 organizer enforcement。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:evaluator prompt registry、ingestion、parser。`\n"
|
|
1576
1582
|
);
|
|
1577
1583
|
|
|
1578
1584
|
ZH_CONTENT[path.join(".codex", "prompts", "lab-data.md")] = codexPrompt(
|
|
@@ -1591,7 +1597,7 @@ ZH_CONTENT[path.join(".claude", "commands", "lab.md")] = claudeCommand(
|
|
|
1591
1597
|
"lab",
|
|
1592
1598
|
"查看 /lab 研究工作流总览并选择合适阶段",
|
|
1593
1599
|
"[stage] [target]",
|
|
1594
|
-
"# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种:\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时,要立刻执行该 stage,而不是只推荐别的阶段。\n- 先给简洁摘要,再决定是否写工件,最后回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `spec` 前应已有经批准的数据集与 benchmark 方案。\n- `run`、`iterate`、`auto`、`report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `write` 前必须已有经批准的 `framing` 工件。\n\n## 如何输入 `/lab auto`\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1 reviewer fidelity` 不是 `Autonomy level L3`。\n- 一条好的 `/lab auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab auto 自治级别 L2。目标:推进 paper layer 3 的 organizer enforcement。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:evaluator prompt registry、ingestion、parser。`\n"
|
|
1600
|
+
"# `/lab` for Claude\n\n`/lab` 是 Claude Code 里的 lab 工作流分发入口。调用方式有两种:\n\n- `/lab <stage> ...`\n- `/lab-idea`、`/lab-data`、`/lab-auto`、`/lab-framing`、`/lab-spec`、`/lab-run`、`/lab-iterate`、`/lab-review`、`/lab-report`、`/lab-write`\n\n## 阶段别名\n\n- `/lab idea ...` 或 `/lab-idea`\n- `/lab data ...` 或 `/lab-data`\n- `/lab auto ...` 或 `/lab-auto`\n- `/lab framing ...` 或 `/lab-framing`\n- `/lab spec ...` 或 `/lab-spec`\n- `/lab run ...` 或 `/lab-run`\n- `/lab iterate ...` 或 `/lab-iterate`\n- `/lab review ...` 或 `/lab-review`\n- `/lab report ...` 或 `/lab-report`\n- `/lab write ...` 或 `/lab-write`\n\n## 调度规则\n\n- 始终使用 `skills/lab/SKILL.md` 作为工作流合同。\n- 用户显式调用 `/lab <stage> ...` 或 `/lab-<stage>` 时,要立刻执行该 stage,而不是只推荐别的阶段。\n- 先给简洁摘要,再决定是否写工件,最后回报输出路径和下一步。\n- 如果歧义会影响结论,一次只问一个问题;如果有多条可行路径,先给 2-3 个方案再收敛。\n- `spec` 前应已有经批准的数据集与 benchmark 方案。\n- `run`、`iterate`、`auto`、`report` 都应遵循 `.lab/context/eval-protocol.md`。\n- `auto` 只编排已批准边界内的执行阶段,不替代手动的 idea/data/framing/spec 决策。\n- `write` 前必须已有经批准的 `framing` 工件。\n\n## 如何输入 `/lab auto`\n\n## `/lab auto` 层级指南\n\n- `L1`:适合安全验证、一轮 bounded 真实运行,或简单 report 刷新。\n- `L2`:默认推荐级别,适合冻结核心边界内的常规实验迭代。\n- `L3`:激进 campaign 级别,只在你明确想做更大范围探索和可选写作时使用。\n- 如果不确定,默认推荐 `L2`。\n- 如果用户输入没写级别,或者把级别和 `paper layer`、`phase`、`table` 混用了,就应先停下来,要求用户明确选 `L1/L2/L3`。\n\n- 把 `Autonomy level L1/L2/L3` 视为执行权限级别,不要和论文里的 layer、phase、table 编号混用。\n- 把 `paper layer`、`phase`、`table` 视为实验目标。例如 `paper layer 3` 或 `Phase 1 reviewer fidelity` 不是 `Autonomy level L3`。\n- 一条好的 `/lab auto` 输入应至少说清:objective、自治级别、terminal goal、scope、allowed modifications。\n- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。\n- 示例:`/lab auto 自治级别 L2。目标:推进 paper layer 3 的 organizer enforcement。终止条件:完成 bounded protocol、测试、最小实现和一轮小规模结果。允许修改:evaluator prompt registry、ingestion、parser。`\n"
|
|
1595
1601
|
);
|
|
1596
1602
|
|
|
1597
1603
|
ZH_CONTENT[path.join(".claude", "commands", "lab-data.md")] = claudeCommand(
|
|
@@ -2132,11 +2138,25 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "auto.md")] = `# \`/la
|
|
|
2132
2138
|
|
|
2133
2139
|
## 交互约束
|
|
2134
2140
|
|
|
2141
|
+
## 层级指南
|
|
2142
|
+
|
|
2143
|
+
- \`L1\` = safe validation
|
|
2144
|
+
- \`L2\` = 默认推荐的 bounded iteration
|
|
2145
|
+
- \`L3\` = aggressive campaign
|
|
2146
|
+
- 如果不确定,默认推荐 \`L2\`。
|
|
2147
|
+
|
|
2135
2148
|
- 开始前先简洁说明:objective、frozen core 和下一自动阶段。
|
|
2136
2149
|
- 如果契约本身不完整,一次只追问一个问题。
|
|
2137
2150
|
- 如果存在多个可信的下一动作,先给 2-3 个 bounded 方案和推荐项,再启动长任务。
|
|
2138
2151
|
- 只有当下一步会离开已批准的 exploration envelope、超出选定 autonomy level,或实质改变 frozen core 时,才保留人工 approval gate。
|
|
2152
|
+
- 每次进入 \`/lab:auto\` 都要先给出这份层级指南。
|
|
2139
2153
|
- 先做输入归一化:把 \`Autonomy level L1/L2/L3\` 视为执行权限级别,把 \`Layer 3\`、\`Phase 1\`、\`Table 2\` 视为论文范围目标。
|
|
2154
|
+
- 如果用户没有写自治级别,或者把自治级别和论文层、phase、table 混用了,就必须先给一版更详细的层级说明,至少解释:
|
|
2155
|
+
- \`L1/L2/L3\` 的典型适用场景
|
|
2156
|
+
- 每个级别允许改什么
|
|
2157
|
+
- 每个级别通常在什么 stop boundary 停下
|
|
2158
|
+
- 如果不确定,默认推荐 \`L2\`
|
|
2159
|
+
- 给完这版详细说明后,再追问一个明确的 \`L1/L2/L3\` 选择;在用户明确选级别前不要启动循环。
|
|
2140
2160
|
- 如果用户同时提了论文层、实验 phase 和自治级别,先用一句话重述:objective、自治级别、terminal goal、scope、allowed modifications。
|
|
2141
2161
|
- 如果 workflow language 是中文,摘要、清单条目、任务标签和进度更新都应使用中文,除非文件路径、代码标识符或字面指标名必须保持原样。
|
|
2142
2162
|
- 当循环进入 \`report\` 时,要主动给出用户可读的白话总结,解释主指标、次级指标和主表作用;不要等用户额外发一句“解释这些指标”。
|
package/package-assets/claude/commands/lab.md
CHANGED
|
@@ -62,6 +62,14 @@ Use the same repository artifacts and stage boundaries every time.
|
|
|
62
62
|
|
|
63
63
|
## How to Ask for `/lab auto`
|
|
64
64
|
|
|
65
|
+
## Level Guide for `/lab auto`
|
|
66
|
+
|
|
67
|
+
- `L1` is the safe validation level. Use it for a smoke run, one bounded real run, or a simple review/report refresh when you do not want automatic iteration.
|
|
68
|
+
- `L2` is the default recommended level. Use it for bounded experiment iteration inside a frozen core when you want auto to keep running until a gate, stop condition, or terminal goal is hit.
|
|
69
|
+
- `L3` is the aggressive campaign level. Use it only when you explicitly want broad exploration, larger search space changes, and optional manuscript-writing work.
|
|
70
|
+
- If you are unsure, choose `L2`.
|
|
71
|
+
- If the request omits the level or mixes it with a paper layer, phase, or table target, `/lab auto` should stop and ask for an explicit autonomy level before arming the loop.
|
|
72
|
+
|
|
65
73
|
- Treat `Autonomy level L1/L2/L3` as the execution privilege level, not as a paper layer, phase, or table number.
|
|
66
74
|
- Treat `paper layer`, `phase`, and `table` as experiment targets. For example, `paper layer 3` or `Phase 1 reviewer fidelity` should not be interpreted as `Autonomy level L3`.
|
|
67
75
|
- A good `/lab auto` request should name:
|
package/package-assets/codex/prompts/lab.md
CHANGED
|
@@ -56,6 +56,14 @@ argument-hint: workflow question or stage choice
|
|
|
56
56
|
|
|
57
57
|
## How to Ask for `/lab:auto`
|
|
58
58
|
|
|
59
|
+
## Level Guide for `/lab:auto`
|
|
60
|
+
|
|
61
|
+
- `L1` is the safe validation level. Use it for a smoke run, one bounded real run, or a simple review/report refresh when you do not want automatic iteration.
|
|
62
|
+
- `L2` is the default recommended level. Use it for bounded experiment iteration inside a frozen core when you want auto to keep running until a gate, stop condition, or terminal goal is hit.
|
|
63
|
+
- `L3` is the aggressive campaign level. Use it only when you explicitly want broad exploration, larger search space changes, and optional manuscript-writing work.
|
|
64
|
+
- If you are unsure, choose `L2`.
|
|
65
|
+
- If the request omits the level or mixes it with a paper layer, phase, or table target, `/lab:auto` should stop and ask for an explicit autonomy level before arming the loop.
|
|
66
|
+
|
|
59
67
|
- Treat `Autonomy level L1/L2/L3` as the execution privilege level, not as a paper layer, phase, or table number.
|
|
60
68
|
- Treat `paper layer`, `phase`, and `table` as experiment targets. For example, `paper layer 3` or `Phase 1 reviewer fidelity` should not be interpreted as `Autonomy level L3`.
|
|
61
69
|
- A good `/lab:auto` request should name:
|
package/package-assets/shared/lab/context/auto-mode.md
CHANGED
|
@@ -9,7 +9,13 @@ If `eval-protocol.md` declares structured rung entries, auto mode follows those
|
|
|
9
9
|
- Objective:
|
|
10
10
|
- Autonomy level: L2
|
|
11
11
|
- Autonomy level controls execution privilege, not paper layer or table number.
|
|
12
|
+
- Level guide:
|
|
13
|
+
- `L1` = safe validation over one bounded run/review/report cycle
|
|
14
|
+
- `L2` = default recommended bounded iteration inside a frozen core
|
|
15
|
+
- `L3` = aggressive campaign with broader exploration and optional writing
|
|
16
|
+
- If you are unsure, choose `L2`.
|
|
12
17
|
- If you mean a paper layer, phase, or table, spell it explicitly as `paper layer`, `phase`, or `table`.
|
|
18
|
+
- If your request mentions a paper layer, phase, or table but omits the autonomy level, do not arm the loop until the level is explicit.
|
|
13
19
|
- Approval status: draft
|
|
14
20
|
- Allowed stages: run, iterate, review, report
|
|
15
21
|
- Success criteria:
|
package/package-assets/shared/skills/lab/SKILL.md
CHANGED
|
@@ -87,6 +87,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
|
|
|
87
87
|
- Treat `.lab/context/auto-mode.md` as the control contract and `.lab/context/auto-status.md` as the live state file.
|
|
88
88
|
- Require `Autonomy level` and `Approval status` in `.lab/context/auto-mode.md` before execution.
|
|
89
89
|
- Treat `L1` as safe-run validation, `L2` as bounded iteration, and `L3` as aggressive campaign mode.
|
|
90
|
+
- Surface the level guide every time `/lab:auto` starts, and make the detailed guide mandatory when the user omits the level or mixes it with a paper layer, phase, or table target.
|
|
90
91
|
- Reuse `/lab:run`, `/lab:iterate`, `/lab:review`, `/lab:report`, and optional `/lab:write` instead of inventing a second workflow.
|
|
91
92
|
- Do not automatically change the research mission, paper-facing framing, or core claims.
|
|
92
93
|
- You may add exploratory datasets, benchmarks, and comparison methods inside the approved exploration envelope.
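The SKILL.md rule that `Autonomy level` and `Approval status` must be present in `.lab/context/auto-mode.md` before execution could be checked roughly as follows. This is a sketch under stated assumptions, not superlab's implementation: `readContractField` and `checkContract` are hypothetical helpers, and the `approved` status value is assumed because the diff only shows the template value `draft`.

```javascript
// Sketch only: field names come from the auto-mode.md template in this diff.
// The parser and the "approved" status value are assumptions.
function readContractField(markdown, name) {
  // Escape regex metacharacters in the field name.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Match list items such as "- Autonomy level: L2" on a single line.
  const pattern = new RegExp(`^\\s*-\\s*${escaped}:[ \\t]*(.*)$`, "m");
  const match = markdown.match(pattern);
  return match ? match[1].trim() : null;
}

function checkContract(markdown) {
  const problems = [];
  const level = readContractField(markdown, "Autonomy level");
  const status = readContractField(markdown, "Approval status");
  if (!level || !/^L[123]$/.test(level)) {
    problems.push("Autonomy level must be L1, L2, or L3");
  }
  if (!status) {
    problems.push("Approval status is missing");
  } else if (status !== "approved") {
    // Assumed approved value; the template in the diff ships as "draft".
    problems.push(`Approval status is "${status}", not approved`);
  }
  return { ok: problems.length === 0, level, status, problems };
}

// Template lines as they appear in the auto-mode.md hunk.
const template = [
  "- Objective:",
  "- Autonomy level: L2",
  "- Autonomy level controls execution privilege, not paper layer or table number.",
  "- Approval status: draft",
  "- Allowed stages: run, iterate, review, report",
].join("\n");

console.log(checkContract(template));
```

Requiring the colon directly after the field name keeps the explanatory bullet ("Autonomy level controls execution privilege...") from being mistaken for the field itself.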
|
package/package-assets/shared/skills/lab/stages/auto.md
CHANGED
|
@@ -90,7 +90,15 @@
|
|
|
90
90
|
|
|
91
91
|
## Interaction Contract
|
|
92
92
|
|
|
93
|
+
## Level Guide
|
|
94
|
+
|
|
95
|
+
- `L1` = safe validation
|
|
96
|
+
- `L2` = default recommended bounded iteration
|
|
97
|
+
- `L3` = aggressive campaign
|
|
98
|
+
- If you are unsure, choose `L2`.
|
|
99
|
+
|
|
93
100
|
- Start with a concise summary of the objective, the frozen core, and the next automatic stage.
|
|
101
|
+
- Always surface the level guide before execution.
|
|
94
102
|
- If the contract is incomplete, ask one clarifying question at a time.
|
|
95
103
|
- If multiple next actions are credible, present 2-3 bounded options with trade-offs before arming a long run.
|
|
96
104
|
- Only ask for approval when the next step would leave the approved exploration envelope, exceed the chosen autonomy level, or materially change the frozen core.
|
|
@@ -100,6 +108,12 @@
|
|
|
100
108
|
- Normalize ambiguous user requests before arming the loop.
|
|
101
109
|
- Treat `Autonomy level L1/L2/L3` as execution privilege only.
|
|
102
110
|
- Treat `Layer`, `Phase`, and `Table` references as paper-structure or experiment-scope targets, not as autonomy levels.
|
|
111
|
+
- If the user does not name an autonomy level, or mixes it with a paper layer, phase, or table target, stop and deliver a detailed level guide before execution. That detailed guide should explain:
|
|
112
|
+
- the typical use case for `L1`, `L2`, and `L3`
|
|
113
|
+
- what kinds of modifications each level allows
|
|
114
|
+
- what kind of stop boundary each level is meant for
|
|
115
|
+
- that `L2` is the default recommendation when the user is unsure
|
|
116
|
+
- Ask for one explicit level choice before arming the loop after that detailed guide.
|
|
103
117
|
- Example:
|
|
104
118
|
- `Layer 3 organizer enforcement` means a paper layer or experiment target.
|
|
105
119
|
- `Autonomy level L3` means the aggressive campaign permission envelope.
|
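The normalization rules in this hunk (an explicit `Autonomy level L1/L2/L3` is execution privilege, `Layer`, `Phase`, and `Table` references are targets, and a missing or ambiguous level means stop and ask) can be sketched as below. The function name, the regular expressions, and the return shape are illustrative assumptions; the diff specifies the behavior only in prose.

```javascript
// Hypothetical normalizer; patterns and result shape are assumptions.
// An explicit level needs the "Autonomy level" (or 自治级别) prefix.
const LEVEL_RE = /(?:autonomy level|自治级别)\s*(L[123])\b/i;
// A bare "L3" with no prefix could be shorthand for a paper layer.
const BARE_LEVEL_RE = /\bL[123]\b/i;
// Paper-structure or experiment-scope targets such as "paper layer 3".
const TARGET_RE = /\b(?:paper\s+)?(layer|phase|table)\s+(\d+)\b/gi;

function normalizeAutoRequest(text) {
  const targets = [...text.matchAll(TARGET_RE)].map(
    (m) => `${m[1].toLowerCase()} ${m[2]}`
  );
  const explicit = text.match(LEVEL_RE);
  if (explicit) {
    return { action: "proceed", level: explicit[1].toUpperCase(), targets };
  }
  const reason = BARE_LEVEL_RE.test(text)
    ? "level token is ambiguous; it may refer to a paper target"
    : "no autonomy level given";
  // Do not arm the loop: show the detailed guide, then ask for one level.
  return { action: "ask-for-level", level: null, targets, reason, recommended: "L2" };
}

console.log(normalizeAutoRequest(
  "/lab:auto Autonomy level L2. Objective: advance paper layer 3 organizer enforcement."
));
console.log(normalizeAutoRequest("/lab:auto Layer 3 organizer enforcement"));
```

In this sketch `L2` is only surfaced as a recommendation in the result; the request is never upgraded to `L2` on the user's behalf, which matches the "stop and ask for an explicit level" wording.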