superlab 0.1.24 → 0.1.26
- package/README.md +8 -2
- package/README.zh-CN.md +8 -3
- package/lib/auto_contracts.cjs +4 -2
- package/lib/context.cjs +155 -13
- package/lib/i18n.cjs +89 -15
- package/lib/install.cjs +2 -0
- package/package-assets/claude/commands/lab-write.md +1 -1
- package/package-assets/claude/commands/lab.md +2 -1
- package/package-assets/codex/prompts/lab-write.md +1 -1
- package/package-assets/codex/prompts/lab.md +2 -1
- package/package-assets/shared/lab/.managed/scripts/validate_manuscript_delivery.py +175 -0
- package/package-assets/shared/lab/.managed/templates/artifact-status.md +28 -0
- package/package-assets/shared/lab/.managed/templates/final-report.md +0 -11
- package/package-assets/shared/lab/.managed/templates/paper-figure.tex +6 -0
- package/package-assets/shared/lab/.managed/templates/paper-references.bib +9 -0
- package/package-assets/shared/lab/.managed/templates/paper-table.tex +13 -0
- package/package-assets/shared/lab/context/auto-mode.md +2 -2
- package/package-assets/shared/lab/context/session-brief.md +1 -1
- package/package-assets/shared/lab/context/state.md +19 -13
- package/package-assets/shared/lab/context/workflow-state.md +19 -0
- package/package-assets/shared/lab/system/core.md +4 -2
- package/package-assets/shared/skills/lab/SKILL.md +19 -14
- package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion/conservative-claim-boundary.md +27 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion-examples.md +16 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments/figure-placeholder-and-discussion.md +44 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments/main-results-and-ablation-latex.md +83 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments-examples.md +17 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/index.md +12 -3
- package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work/closest-prior-gap-template.md +20 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work/topic-comparison-template.md +24 -0
- package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work-examples.md +17 -0
- package/package-assets/shared/skills/lab/references/paper-writing-integration.md +19 -10
- package/package-assets/shared/skills/lab/stages/report.md +5 -2
- package/package-assets/shared/skills/lab/stages/write.md +34 -1
- package/package.json +1 -1
@@ -58,7 +58,8 @@ Use the same repository artifacts and stage boundaries every time.
 - `iterate` requires a normalized summary from `scripts/eval_report.py`.
 - `run`, `iterate`, `auto`, and `report` should all follow `.lab/context/eval-protocol.md`, including its recorded sources for metrics and comparison implementations.
 - `write` requires an approved framing artifact from the `framing` stage.
-- `write` requires stable report artifacts, a mini-outline, the active section guide, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+- `write` requires stable report artifacts, a mini-outline, the active section guide, the matching bundled examples when available, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+- Final-draft or export rounds in `write` should materialize paper-facing tables, figure placeholders, a non-empty `references.bib`, and pass `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper`.
 
 ## How to Ask for `/lab auto`
 
@@ -6,4 +6,4 @@ argument-hint: section or writing target
 Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
 
 Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
-This command runs the `/lab:write` stage. It requires an approved framing artifact from `/lab:framing`, must read the matching section reference from `.codex/skills/lab/references/paper-writing/`, and for
+This command runs the `/lab:write` stage. It requires an approved framing artifact from `/lab:framing`, must read the matching section reference from `.codex/skills/lab/references/paper-writing/`, and for any section with a bundled example bank it must also read `.codex/skills/lab/references/paper-writing/examples/index.md` plus the matching examples index and 1-2 concrete example files. Then it should run `paper-review.md` and `does-my-writing-flow-source.md`, build a mini-outline, plan the section's paper-facing tables/figures/citations, and revise only one section. Final-draft or export rounds must run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before stopping.
@@ -52,7 +52,8 @@ argument-hint: workflow question or stage choice
 - `/lab:iterate` requires a normalized summary from `scripts/eval_report.py`.
 - `/lab:run`, `/lab:iterate`, `/lab:auto`, and `/lab:report` should all follow `.lab/context/eval-protocol.md`, including its recorded sources for metrics and comparison implementations.
 - `/lab:write` requires an approved framing artifact from `/lab:framing`.
-- `/lab:write` requires stable report artifacts, a mini-outline, the active section guide, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+- `/lab:write` requires stable report artifacts, a mini-outline, the active section guide, the matching bundled examples when available, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+- Final-draft or export rounds in `/lab:write` should materialize paper-facing tables, figure placeholders, a non-empty `references.bib`, and pass `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper`.
 
 ## How to Ask for `/lab:auto`
 
@@ -0,0 +1,175 @@
+#!/usr/bin/env python3
+import argparse
+import re
+import sys
+from pathlib import Path
+
+
+ABSOLUTE_PATH_MARKERS = ("/Users/", "/home/", "/tmp/", "/private/tmp/")
+REQUIRED_TABLE_FILES = ("main-results.tex", "ablations.tex")
+REQUIRED_FIGURE_FILES = ("method-overview.tex", "results-overview.tex")
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(
+        description="Validate that a paper delivery contains basic manuscript-ready assets."
+    )
+    parser.add_argument("--paper-dir", required=True, help="Path to the paper deliverable root")
+    return parser.parse_args()
+
+
+def read_text(path: Path) -> str:
+    return path.read_text(encoding="utf-8")
+
+
+def check_exists(path: Path, issues: list[str], label: str):
+    if not path.exists():
+        issues.append(f"missing required file: {label} ({path})")
+
+
+def check_bibliography(paper_dir: Path, issues: list[str]):
+    bib_path = paper_dir / "references.bib"
+    check_exists(bib_path, issues, "references.bib")
+    if not bib_path.exists():
+        return
+    text = read_text(bib_path)
+    if "TODO" in text or "todo" in text or "Add bibliography entries" in text or "@" not in text:
+        issues.append("missing a non-empty references.bib")
+
+
+def check_global_tex(paper_dir: Path, issues: list[str]):
+    tex_files = sorted(paper_dir.rglob("*.tex"))
+    combined = "\n".join(read_text(path) for path in tex_files)
+
+    if r"\cite{" not in combined:
+        issues.append("missing citation commands in manuscript tex files")
+    if "+/-" in combined:
+        issues.append("replace '+/-' with LaTeX \\pm formatting")
+    if any(marker in combined for marker in ABSOLUTE_PATH_MARKERS):
+        issues.append("manuscript tex files must not contain absolute local paths")
+
+
+def check_table_file(path: Path, issues: list[str], label: str):
+    if not path.exists():
+        if label == "tables/main-results.tex":
+            issues.append("missing a main results table")
+        elif label == "tables/ablations.tex":
+            issues.append("missing an ablation table")
+        else:
+            issues.append(f"missing required file: {label} ({path})")
+        return
+    text = read_text(path)
+    if r"\begin{table}" not in text:
+        issues.append(f"{label} must contain a table environment")
+    if r"\caption{" not in text or r"\label{" not in text:
+        issues.append(f"{label} must contain both caption and label")
+    if not all(token in text for token in (r"\toprule", r"\midrule", r"\bottomrule")):
+        issues.append(f"{label} must use booktabs structure")
+
+
+def check_figure_file(path: Path, issues: list[str], label: str):
+    if not path.exists():
+        if label == "figures/method-overview.tex":
+            issues.append("missing a method figure placeholder")
+        elif label == "figures/results-overview.tex":
+            issues.append("missing an experiments figure placeholder")
+        else:
+            issues.append(f"missing required file: {label} ({path})")
+        return
+    text = read_text(path)
+    if r"\begin{figure}" not in text:
+        issues.append(f"{label} must contain a figure environment")
+    if r"\caption{" not in text or r"\label{" not in text:
+        issues.append(f"{label} must contain both caption and label")
+    if "Figure intent:" not in text and "图意图:" not in text:
+        issues.append(f"{label} must explain figure intent")
+
+
+def check_experiments_section(paper_dir: Path, issues: list[str]):
+    experiments = paper_dir / "sections" / "experiments.tex"
+    check_exists(experiments, issues, "sections/experiments.tex")
+    if not experiments.exists():
+        return
+    text = read_text(experiments)
+    has_table = any(
+        token in text
+        for token in (
+            r"\input{tables/main-results}",
+            r"\input{tables/ablations}",
+            r"\begin{table}",
+        )
+    )
+    has_figure = any(
+        token in text
+        for token in (
+            r"\input{figures/results-overview}",
+            r"\begin{figure}",
+        )
+    )
+    if not has_table:
+        issues.append("experiments section is missing a main results table")
+    if not has_figure:
+        issues.append("experiments section is missing an experiments figure placeholder")
+
+
+def check_method_section(paper_dir: Path, issues: list[str]):
+    method = paper_dir / "sections" / "method.tex"
+    check_exists(method, issues, "sections/method.tex")
+    if not method.exists():
+        return
+    text = read_text(method)
+    has_figure = any(
+        token in text
+        for token in (
+            r"\input{figures/method-overview}",
+            r"\begin{figure}",
+        )
+    )
+    if not has_figure:
+        issues.append("method section is missing a method figure placeholder")
+
+
+def check_main_tex(paper_dir: Path, issues: list[str]):
+    main_tex = paper_dir / "main.tex"
+    check_exists(main_tex, issues, "main.tex")
+    if not main_tex.exists():
+        return
+    text = read_text(main_tex)
+    if r"\bibliography{references}" not in text:
+        issues.append("main.tex must include the references bibliography")
+
+
+def main():
+    args = parse_args()
+    paper_dir = Path(args.paper_dir)
+    issues: list[str] = []
+
+    if not paper_dir.exists():
+        print(f"paper directory does not exist: {paper_dir}", file=sys.stderr)
+        return 1
+
+    check_main_tex(paper_dir, issues)
+    check_bibliography(paper_dir, issues)
+    check_global_tex(paper_dir, issues)
+    check_method_section(paper_dir, issues)
+    check_experiments_section(paper_dir, issues)
+
+    tables_dir = paper_dir / "tables"
+    check_table_file(tables_dir / REQUIRED_TABLE_FILES[0], issues, "tables/main-results.tex")
+    check_table_file(tables_dir / REQUIRED_TABLE_FILES[1], issues, "tables/ablations.tex")
+
+    figures_dir = paper_dir / "figures"
+    check_figure_file(figures_dir / REQUIRED_FIGURE_FILES[0], issues, "figures/method-overview.tex")
+    check_figure_file(figures_dir / REQUIRED_FIGURE_FILES[1], issues, "figures/results-overview.tex")
+
+    if issues:
+        for issue in issues:
+            print(issue, file=sys.stderr)
+        return 1
+
+    print("manuscript delivery artifacts are valid")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
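As a reading aid for the checks in `validate_manuscript_delivery.py`, the following is a minimal sketch of a paper directory that should satisfy them. Only the file paths and the tokens each file contains are taken from the script; the helper name `scaffold_minimal_paper`, the cite key `example2026`, and all placeholder wording are hypothetical.

```python
from pathlib import Path

# Hypothetical placeholder contents. Only the tokens (table/figure environments,
# \caption{, \label{, booktabs rules, "Figure intent:") mirror the validator's checks.
TABLE_TEX = r"""\begin{table}[t]
\caption{Example caption.}
\label{tab:NAME}
\centering
\begin{tabular}{lc}
\toprule
Method & Score $\uparrow$ \\
\midrule
Ours & $0.0 \pm 0.0$ \\
\bottomrule
\end{tabular}
\end{table}
"""

FIGURE_TEX = r"""\begin{figure}[t]
% Figure intent: state what the reader should take away from this figure.
\caption{Example caption.}
\label{fig:NAME}
\end{figure}
"""

MAIN_TEX = r"""\documentclass{article}
\usepackage{booktabs}
\begin{document}
\input{sections/method}
\input{sections/experiments}
\bibliographystyle{plain}
\bibliography{references}
\end{document}
"""

METHOD_TEX = r"""\section{Method}
Our approach builds on \cite{example2026}.
\input{figures/method-overview}
"""

EXPERIMENTS_TEX = r"""\section{Experiments}
\input{tables/main-results}
\input{tables/ablations}
\input{figures/results-overview}
"""

BIB = "@article{example2026,\n  title = {Example entry},\n  year = {2026}\n}\n"


def scaffold_minimal_paper(paper_dir: Path) -> list[Path]:
    """Write the smallest file set the validator looks for and return the paths."""
    files = {
        "main.tex": MAIN_TEX,
        "references.bib": BIB,
        "sections/method.tex": METHOD_TEX,
        "sections/experiments.tex": EXPERIMENTS_TEX,
        "tables/main-results.tex": TABLE_TEX.replace("NAME", "main-results"),
        "tables/ablations.tex": TABLE_TEX.replace("NAME", "ablations"),
        "figures/method-overview.tex": FIGURE_TEX.replace("NAME", "method-overview"),
        "figures/results-overview.tex": FIGURE_TEX.replace("NAME", "results-overview"),
    }
    written = []
    for relative, text in files.items():
        target = paper_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

Calling `scaffold_minimal_paper(Path("paper"))` and then running the validator with `--paper-dir paper` is one way to see the passing case before replacing the placeholders with real content.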
@@ -0,0 +1,28 @@
+# Artifact Status
+
+## Deliverable Status
+
+- Collaborator-facing report path:
+- Managed main tables path:
+- Current report mode:
+- Why this status is appropriate:
+
+## Workflow Audit
+
+- Latest completed action:
+- Latest artifact path:
+- Latest run or report id:
+- Rerun or validation notes:
+
+## Internal Provenance
+
+- Frozen result artifacts used:
+- Canonical context files refreshed:
+- Evidence index anchors:
+
+## Paper Handoff
+
+- Sections ready for `/lab:write`:
+- Evidence bundles to cite:
+- Claims that still need stronger support:
+- Paper-finishing items still open:
@@ -105,11 +105,6 @@
 - Final performance summary:
 - Table coverage:
 
-## Artifact Status
-
-- Deliverables or workflow artifacts that are ready:
-- Artifact status notes that are not scientific findings:
-
 ## Main Results
 
 Summarize validated iteration outcomes.
@@ -129,9 +124,3 @@ Describe unresolved risks and external validity limits.
 ## Next Steps
 
 List concrete follow-up actions.
-
-## Paper Handoff
-
-- Sections ready for `/lab:write`:
-- Evidence bundles to cite:
-- Claims that still need stronger support:
@@ -0,0 +1,9 @@
+% Add paper-facing bibliography entries here.
+% Keep keys stable with the manuscript's \cite{...} usage.
+
+@article{placeholder2026,
+title = {Replace with a real cited work before finalizing},
+author = {Placeholder, Example},
+journal = {Placeholder Venue},
+year = {2026}
+}
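The template's note about keeping keys stable with `\cite{...}` usage can be checked mechanically. The sketch below is not part of the package: the function names and both regular expressions are assumptions, and the patterns only cover common `\cite`-family commands and plain `@type{key,` entries.

```python
import re

# Assumed patterns: \cite, \citep, \citet, starred forms, and optional [..] arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Assumed pattern: "@article{key," style entry headers.
BIB_KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(tex: str) -> set[str]:
    """Collect every key named inside a \\cite-style command."""
    keys: set[str] = set()
    for group in CITE_RE.findall(tex):
        keys.update(key.strip() for key in group.split(",") if key.strip())
    return keys


def bib_keys(bib: str) -> set[str]:
    """Collect the entry keys declared in a .bib file."""
    return set(BIB_KEY_RE.findall(bib))


def missing_keys(tex: str, bib: str) -> list[str]:
    """Return cited keys that have no matching bibliography entry, sorted."""
    return sorted(cited_keys(tex) - bib_keys(bib))
```

An empty result from `missing_keys(tex_text, bib_text)` means every cited key has a matching entry in `references.bib`; it says nothing about whether the entries themselves are real citations.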
@@ -0,0 +1,13 @@
+\begin{table}[t]
+\caption{One-sentence message of the table and the evaluation protocol.}
+\label{tab:placeholder}
+\centering
+\begin{tabular}{lcc}
+\toprule
+Method & Metric 1 $\uparrow$ & Metric 2 $\uparrow$ \\
+\midrule
+Ours & 0.0000 & 0.0000 \\
+Baseline & 0.0000 & 0.0000 \\
+\bottomrule
+\end{tabular}
+\end{table}
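A table in this shape can be generated from result rows rather than typed by hand. The following is a rough sketch, not code from the package: `render_booktabs_table` and `format_mean_std` are hypothetical helpers, and the numbers in the usage note are illustrative rather than real results.

```python
def format_mean_std(mean: float, std: float, digits: int = 2) -> str:
    """Format mean and spread with LaTeX \\pm instead of a literal '+/-'."""
    return f"${mean:.{digits}f} \\pm {std:.{digits}f}$"


def render_booktabs_table(
    caption: str,
    label: str,
    header: list[str],
    rows: list[list[str]],
) -> str:
    """Render a table environment with caption, label, and booktabs rules."""
    # First column left-aligned, remaining columns centered, as in the template.
    column_spec = "l" + "c" * (len(header) - 1)
    lines = [
        r"\begin{table}[t]",
        rf"\caption{{{caption}}}",
        rf"\label{{{label}}}",
        r"\centering",
        rf"\begin{{tabular}}{{{column_spec}}}",
        r"\toprule",
        " & ".join(header) + r" \\",
        r"\midrule",
    ]
    lines.extend(" & ".join(row) + r" \\" for row in rows)
    lines.extend([r"\bottomrule", r"\end{tabular}", r"\end{table}"])
    return "\n".join(lines) + "\n"
```

For example, `render_booktabs_table("Main results.", "tab:main-results", ["Method", "Accuracy"], [["Ours", format_mean_std(0.9, 0.01)]])` returns text that could be written to `tables/main-results.tex`.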
@@ -51,8 +51,8 @@ If `eval-protocol.md` declares structured rung entries, auto mode follows those
 
 - Run stage contract: write persistent outputs under `results_root`.
 - Iterate stage contract: update persistent outputs under `results_root`.
-- Review stage contract: update canonical review context such as `.lab/context/decisions.md`, `state.md`, `open-questions.md`, or `evidence-index.md`.
-- Report stage contract: write
+- Review stage contract: update canonical review context such as `.lab/context/decisions.md`, `state.md`, `workflow-state.md`, `open-questions.md`, or `evidence-index.md`.
+- Report stage contract: write `<deliverables_root>/report.md`, `<deliverables_root>/main-tables.md`, and `<deliverables_root>/artifact-status.md`.
 - Write stage contract: write LaTeX output under `<deliverables_root>/paper/`.
 
 ## Promotion Policy
@@ -1,19 +1,25 @@
-#
+# Research State
 
-##
+## Approved Direction
 
--
--
--
+- One-sentence problem:
+- Approved direction:
+- Strongest supported claim:
 
-##
+## Evidence Boundary
 
--
--
--
+- What the current evidence really supports:
+- What is still outside the boundary:
+- Biggest research risk:
 
-##
+## Active Research Track
 
--
--
--
+- Current research focus:
+- Primary metric:
+- Dataset or benchmark scope:
+
+## Current Research Constraints
+
+- Hard constraints:
+- Claim boundary:
+- Conditions that require reopening the direction:
@@ -0,0 +1,19 @@
+# Workflow State
+
+## Current Stage
+
+- Active stage:
+- Current objective:
+- Next required output:
+
+## Latest Update
+
+- Last completed action:
+- Latest artifact path:
+- Latest run or report id:
+
+## Next Step
+
+- Immediate next action:
+- Blocking issue:
+- Human decision needed:
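Because this template is a flat list of `- Field:` lines under `##` headings, a tool could read it back and flag fields that are still blank. The sketch below is an assumption about how one might do that, not code shipped in the package. `parse_status_fields` and `empty_fields` are hypothetical names, and the regular expression treats the first colon followed by whitespace or end of line as the separator, so a key such as ``Sections ready for `/lab:write` `` stays intact.

```python
import re

# A field line is "- <key>:" optionally followed by whitespace and a value.
FIELD_RE = re.compile(r"^- (.+?):(?:\s+(.*))?$")


def parse_status_fields(markdown: str) -> dict[str, dict[str, str]]:
    """Group '- Key: value' lines under their '## Section' heading."""
    sections: dict[str, dict[str, str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = {}
            continue
        match = FIELD_RE.match(line)
        if match and current is not None:
            key, value = match.group(1), match.group(2) or ""
            sections[current][key.strip()] = value.strip()
    return sections


def empty_fields(markdown: str) -> list[str]:
    """List 'Section / Field' names whose value is still blank."""
    return [
        f"{section} / {key}"
        for section, fields in parse_status_fields(markdown).items()
        for key, value in fields.items()
        if not value
    ]
```

Running `empty_fields` over `.lab/context/workflow-state.md` would then list which parts of the live workflow state have not been filled in yet.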
@@ -8,7 +8,7 @@ For a new AI session, read these files in order:
 
 1. `.lab/context/session-brief.md`
 2. `.lab/context/mission.md`
-3. `.lab/context/state.md`
+3. `.lab/context/workflow-state.md`
 4. `.lab/context/evidence-index.md`
 
 Only expand to additional context when the brief points to it.
@@ -24,13 +24,15 @@ For auto-mode orchestration or long-running experiment campaigns, also read:
 
 ## Workflow Boundaries
 
-- `.lab/context/` holds durable project research state.
+- `.lab/context/` holds durable project research state plus lightweight workflow state.
 - `.lab/changes/`, `.lab/iterations/`, and `.lab/writing/` hold workflow control artifacts, lightweight manifests, and change-local harnesses.
 - `.lab/.managed/` holds tool-managed templates and scripts.
 - Durable run outputs belong under the configured `results_root`, not inside `.lab/changes/`.
 - Figures and plots belong under the configured `figures_root`, not inside `.lab/changes/`.
 - Deliverables belong under the configured `deliverables_root`, not inside `.lab/context/`.
 - Change-local `data/` directories may hold lightweight manifests or batch specs, but not the canonical dataset copy.
+- `.lab/context/state.md` holds durable research state; `.lab/context/workflow-state.md` holds live workflow state.
+- `.lab/context/summary.md` is the durable project summary; `.lab/context/session-brief.md` is the next-session startup brief.
 - `.lab/context/auto-mode.md` defines the bounded autonomous envelope; `.lab/context/auto-status.md` records live state for resume and handoff.
 - If the user provides a LaTeX template directory, validate it and attach it through `paper_template_root` before drafting.
 - Treat attached template directories as user-owned assets. Do not rewrite template files unless the user explicitly asks.
@@ -83,7 +83,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 ### `/lab:auto`
 
 - Use this stage to orchestrate approved execution stages with bounded autonomy.
-- Read `.lab/config/workflow.json`, `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/terminology-lock.md`, `.lab/context/auto-mode.md`, and `.lab/context/auto-status.md` before acting.
+- Read `.lab/config/workflow.json`, `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/terminology-lock.md`, `.lab/context/auto-mode.md`, and `.lab/context/auto-status.md` before acting.
 - Treat `.lab/context/auto-mode.md` as the control contract and `.lab/context/auto-status.md` as the live state file.
 - Require `Autonomy level` and `Approval status` in `.lab/context/auto-mode.md` before execution.
 - Treat `L1` as safe-run validation, `L2` as bounded iteration, and `L3` as aggressive campaign mode.
@@ -93,13 +93,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - You may add exploratory datasets, benchmarks, and comparison methods inside the approved exploration envelope.
 - You may promote an exploratory addition to the primary package only after the promotion policy in `auto-mode.md` is satisfied and the promotion is written back into `.lab/context/data-decisions.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, and `.lab/context/session-brief.md`.
 - Poll long-running commands until they complete, time out, or hit a stop condition.
-- Update `.lab/context/auto-status.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/session-brief.md` as the campaign advances.
+- Update `.lab/context/auto-status.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/session-brief.md` as the campaign advances.
 - Keep an explicit approval gate when a proposed action would leave the frozen core defined by the auto-mode contract.
 
 ### `/lab:spec`
 
 - Read `.lab/config/workflow.json` before drafting the change.
-- Read `.lab/context/mission.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, and `.lab/context/data-decisions.md` before drafting the change.
+- Read `.lab/context/mission.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/data-decisions.md` before drafting the change.
 - Use `.lab/changes/<change-id>/` as the canonical lab change directory.
 - Convert the approved idea into lab change artifacts using `.lab/.managed/templates/proposal.md`, `.lab/.managed/templates/design.md`, `.lab/.managed/templates/spec.md`, and `.lab/.managed/templates/tasks.md`.
 - Update `.lab/context/state.md` and `.lab/context/decisions.md` after freezing the spec.
@@ -108,12 +108,12 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 ### `/lab:run`
 
 - Start with the smallest meaningful experiment.
-- Read `.lab/context/mission.md`, `.lab/context/state.md`, and `.lab/context/data-decisions.md` before choosing the run.
+- Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/data-decisions.md` before choosing the run.
 - Register the run with `.lab/.managed/scripts/register_run.py`.
 - Normalize the result with `.lab/.managed/scripts/eval_report.py`.
 - Validate normalized output with `.lab/.managed/scripts/validate_results.py`.
 - Read `.lab/context/eval-protocol.md` before choosing the smallest run so the first experiment already targets the approved tables, metrics, and gates.
-- Update `.lab/context/state.md`, `.lab/context/evidence-index.md`, and `.lab/context/eval-protocol.md` after the run.
+- Update `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/evidence-index.md`, and `.lab/context/eval-protocol.md` after the run.
 - If the evaluation protocol is still skeletal, initialize the smallest trustworthy source-backed version before treating the run as the protocol anchor.
 
 ### `/lab:iterate`
@@ -128,13 +128,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - maximum iteration count
 - Only change implementation hypotheses within the loop.
 - Require a normalized evaluation report each round.
-- Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, and `.lab/context/evidence-index.md` at the start of each round.
+- Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, and `.lab/context/evidence-index.md` at the start of each round.
 - Read `.lab/context/data-decisions.md` before changing benchmark-facing experiments.
 - Read `.lab/context/eval-protocol.md` before changing evaluation ladders, sample sizes, or promotion gates.
 - Keep metric definitions, baseline behavior, and comparison implementations anchored to the source-backed evaluation protocol before changing thresholds, gates, or ladder transitions.
 - Switch to diagnostic mode if risk increases for two consecutive rounds.
 - Write round reports with `.lab/.managed/templates/iteration-report.md`.
-- Update `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/open-questions.md`, and `.lab/context/eval-protocol.md` each round as needed.
+- Update `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/open-questions.md`, and `.lab/context/eval-protocol.md` each round as needed.
 - Keep `.lab/context/eval-protocol.md` synchronized with accepted ladder changes, benchmark scope, and source-backed implementation deviations.
 - Stop at threshold success or iteration cap, and record blockers plus next-best actions when the campaign ends without success.
 
@@ -151,13 +151,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 ### `/lab:report`
 
 - Summarize all validated iteration summaries.
-- Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
+- Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
 - Read `.lab/context/eval-protocol.md` before choosing tables, thresholds, or final result framing.
 - Keep metric definitions, comparison semantics, and implementation references anchored to the approved evaluation protocol instead of re-deriving them during reporting.
 - Aggregate them with `.lab/.managed/scripts/summarize_iterations.py`.
-- Write the final document with `.lab/.managed/templates/final-report.md
+- Write the final document with `.lab/.managed/templates/final-report.md`, the managed table summary with `.lab/.managed/templates/main-tables.md`, and the internal handoff with `.lab/.managed/templates/artifact-status.md`.
 - Keep failed attempts and limitations visible.
-- Update `.lab/context/mission.md`, `.lab/context/eval-protocol.md`, `.lab/context/state.md`, and `.lab/context/evidence-index.md` with report-level handoff notes.
+- Update `.lab/context/mission.md`, `.lab/context/eval-protocol.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/evidence-index.md` with report-level handoff notes.
 - If canonical context is still skeletal, hydrate the smallest trustworthy version from frozen artifacts before finalizing the report.
 - If collaborator-critical fields remain missing after hydration, downgrade to an `artifact-anchored interim report` instead of presenting a final collaborator-ready report.
 
@@ -172,14 +172,19 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Write one paper section or one explicit subproblem per round.
 - Bind each claim to evidence from `report`, iteration reports, or normalized summaries.
 - Write planning artifacts with `.lab/.managed/templates/paper-plan.md`, `.lab/.managed/templates/paper-section.md`, and `.lab/.managed/templates/write-iteration.md`.
-- Write final manuscript artifacts with `.lab/.managed/templates/paper.tex
+- Write final manuscript artifacts with `.lab/.managed/templates/paper.tex`, `.lab/.managed/templates/paper-section.tex`, `.lab/.managed/templates/paper-table.tex`, `.lab/.managed/templates/paper-figure.tex`, and `.lab/.managed/templates/paper-references.bib`.
 - Use the vendored paper-writing references under `skills/lab/references/paper-writing/`.
-- For
+- For any section with a bundled example bank, also use the vendored example-bank files under `skills/lab/references/paper-writing/examples/`.
 - Load only the current section guide, the matching examples index when one exists, 1-2 matching concrete example files, plus `paper-review.md` and `does-my-writing-flow-source.md`.
 - Build a compact mini-outline before prose.
+- Build the paper asset plan before prose when the section carries method or experiments claims.
 - For each subsection, explicitly cover motivation, design, and technical advantage when applicable.
 - Keep terminology stable across rounds and sections.
 - If a claim is not supported by evidence, weaken or remove it.
+- Treat tables, figures, citations, and bibliography as core manuscript content rather than optional polish.
+- Keep paper-facing LaTeX free of absolute local paths, rerun ids, shell transcripts, and internal workflow provenance.
+- Materialize real LaTeX tables and figure placeholders instead of leaving all evidence inside prose paragraphs.
+- Run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before accepting a final-draft or export round.
 - Before finalizing a round, append and answer the five-dimension self-review checklist and revise unresolved items.
 - Apply paper-writing discipline without changing experimental truth.
 - If the evidence is insufficient, stop and route back to `review` or `iterate`.
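The rule about keeping paper-facing LaTeX free of absolute local paths, rerun ids, and shell transcripts can be illustrated with a small scan. This is a minimal sketch and not the shipped `validate_manuscript_delivery.py`, whose checks are not visible in this hunk. The names `LEAK_PATTERNS` and `find_leaks`, the path prefixes, the command list, and the `rerun-id` spelling are all assumptions made for illustration.

```python
import re

# Hypothetical heuristics for illustration only; the real validator may
# check different things in a different way.
LEAK_PATTERNS = {
    # POSIX-style absolute paths under a few common local roots (assumed list).
    "absolute local path": re.compile(r"(?<![\w.])/(?:home|Users|tmp|mnt)/\S+"),
    # A prompt followed by a common command (assumed command list).
    "shell transcript": re.compile(r"^\s*\$ (?:cd|python|bash|git|pip)\b"),
    # Assumed spelling of a rerun identifier; adjust to the project's format.
    "rerun id": re.compile(r"\brerun[-_ ]?id\b", re.IGNORECASE),
}


def find_leaks(tex_source: str) -> list[tuple[int, str]]:
    """Return (line_number, leak_kind) pairs for lines that look like leaks."""
    hits = []
    for lineno, line in enumerate(tex_source.splitlines(), start=1):
        for kind, pattern in LEAK_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, kind))
    return hits
```

A caller could run `find_leaks` over each `.tex` file in the paper directory and refuse a final-draft round when the list is non-empty. The lookbehind in the path pattern is there so that URL paths such as `https://example.com/home/page` are not reported.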
@@ -194,7 +199,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - No unconstrained auto mode. Every `/lab:auto` campaign must declare allowed stages, stop conditions, and a promotion policy in `.lab/context/auto-mode.md`.
 - No auto start without an explicit autonomy level and `Approval status: approved`.
 - No final report without validated normalized results.
-- No paper-writing round without stable report artifacts, an approved framing artifact, evidence links, and
+- No paper-writing round without stable report artifacts, an approved framing artifact, evidence links, LaTeX manuscript output, and a passing manuscript-delivery validation for final-draft or export rounds.
 
 ## References
 
@@ -212,7 +217,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Write stage guide: `.codex/skills/lab/stages/write.md` or `.claude/skills/lab/stages/write.md`
 - Paper-writing integration: `.codex/skills/lab/references/paper-writing-integration.md` or `.claude/skills/lab/references/paper-writing-integration.md`
 - Vendored paper-writing references: `.codex/skills/lab/references/paper-writing/{abstract,introduction,related-work,method,experiments,conclusion,paper-review,does-my-writing-flow-source}.md` or `.claude/skills/lab/references/paper-writing/{abstract,introduction,related-work,method,experiments,conclusion,paper-review,does-my-writing-flow-source}.md`
-- Vendored paper-writing example bank: `.codex/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples}.md` or `.claude/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples}.md`, plus the matching section subdirectories
+- Vendored paper-writing example bank: `.codex/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples,related-work-examples,experiments-examples,conclusion-examples}.md` or `.claude/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples,related-work-examples,experiments-examples,conclusion-examples}.md`, plus the matching section subdirectories
 - Command adapters: the installed `/lab:*` command assets
 - Shared workflow config: `.lab/config/workflow.json`
 - Shared project context: `.lab/context/{mission,state,decisions,evidence-index,open-questions,data-decisions,eval-protocol,auto-mode,auto-status}.md`
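The reference lines above use shell-style brace lists such as `examples/{index,abstract-examples,...}.md` as shorthand for several files. As a rough sketch of how that shorthand maps to concrete paths, the helper below expands such a pattern. The function name `expand_braces` is hypothetical and is not part of the package.

```python
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand every non-nested {a,b,c} group in pattern, left to right."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        # No brace group left: the pattern is already a concrete path.
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Substitute one option, then expand any later groups recursively.
        results.extend(expand_braces(head + option + tail))
    return results
```

For example, `expand_braces(".codex/skills/lab/references/paper-writing/examples/{index,conclusion-examples}.md")` yields the two concrete paths for the index and the conclusion examples file. Nested brace groups are not handled by this sketch.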
package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion/conservative-claim-boundary.md
ADDED
@@ -0,0 +1,27 @@
+# Conservative Claim-Boundary LaTeX Example
+
+Use this example to close with the strongest supported claim while keeping the
+boundary explicit.
+
+```tex
+\section{Conclusion}
+
+This paper shows that adding a structured ranking backbone together with a
+post-hoc calibration stage improves uplift ranking under the frozen benchmark
+protocol. Across the three benchmark families used in this work, the full model
+consistently matches or exceeds the strongest baselines and remains stronger
+than the key ablated variants. This makes the main claim narrower than a
+universal superiority claim but stronger than a single-dataset win.
+
+We do not claim that the current method solves uplift modeling in every domain
+or that every design choice helps equally on every benchmark. In particular, the
+calibration stage appears beneficial on some datasets and neutral on others,
+which means its value should be interpreted as setting-dependent rather than as
+a guaranteed gain. That boundary is consistent with recent benchmarking
+practice, which argues for claim discipline and protocol-specific interpretation
+rather than broad overgeneralization~\cite{carlini2019evaluating}.
+
+The most useful next step is to extend the evaluation to a broader set of
+benchmark slices and to test whether the same ranking-versus-calibration split
+remains useful when the label distribution shifts more aggressively.
+```
package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion-examples.md
ADDED
@@ -0,0 +1,16 @@
+# Conclusion Example Patterns
+
+Use these examples to end with a bounded claim, not a marketing recap. The
+referenced file is a complete LaTeX conclusion example with explicit claim
+boundary language.
+
+## Recommended Pattern
+
+1. Restate the narrow supported claim.
+2. Restate the strongest evidence in one compact sentence.
+3. State the main limitation or boundary.
+4. End with the next concrete direction, not generic future work.
+
+## Example Files
+
+- `examples/conclusion/conservative-claim-boundary.md`
package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments/figure-placeholder-and-discussion.md
ADDED
@@ -0,0 +1,44 @@
+# Figure Placeholder and Discussion Example
+
+Use complete figure placeholders when the visual asset is not finalized yet but
+the manuscript already needs a stable figure slot, caption, label, and prose
+attachment.
+
+## Method Figure Placeholder
+
+```tex
+\begin{figure}[t]
+\centering
+\fbox{\rule{0pt}{1.55in}\rule{0.92\linewidth}{0pt}}
+\caption{Method overview. Figure intent: show the full pipeline, highlight the
+boundary between the structured scoring module and the post-hoc calibration
+stage, and make the train-time versus inference-time data flow easy to inspect.}
+\label{fig:method-overview}
+\end{figure}
+```
+
+## Results Figure Placeholder
+
+```tex
+\begin{figure}[t]
+\centering
+\fbox{\rule{0pt}{1.55in}\rule{0.92\linewidth}{0pt}}
+\caption{Benchmark-level results overview. Figure intent: summarize the trend
+across datasets, show error bars or confidence intervals, and reveal whether the
+main gain is stable or dominated by one benchmark.}
+\label{fig:results-overview}
+\end{figure}
+```
+
+## Discussion Example
+
+```tex
+Figure~\ref{fig:method-overview} gives the reader the shortest path to the
+method's logic before the section moves into module details. The figure should
+make it obvious which component produces the structured signal and where the
+post-hoc calibration step changes the final ranking.
+
+Figure~\ref{fig:results-overview} should then complement the tables rather than
+repeat them. Its job is to show whether the gain is stable across datasets and
+seeds, not to claim a new effect that the tables do not already support.
+```
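The placeholder file above asks for a stable figure slot, caption, label, and prose attachment. One way to check that combination mechanically is sketched below. It is a hypothetical helper, not something this diff shows the package doing: `check_figures` and its message strings are assumptions, and the sketch only looks at unstarred `figure` environments with `fig:`-prefixed labels.

```python
import re

# Match each unstarred figure environment, including its optional [t] argument.
FIGURE_ENV = re.compile(r"\\begin\{figure\}.*?\\end\{figure\}", re.DOTALL)
LABEL = re.compile(r"\\label\{(fig:[^}]+)\}")
REF = re.compile(r"\\ref\{(fig:[^}]+)\}")


def check_figures(tex_source: str) -> list[str]:
    """Report figures lacking a caption or label, and labels never referenced."""
    problems = []
    labels = []
    for index, env in enumerate(FIGURE_ENV.findall(tex_source), start=1):
        found = LABEL.findall(env)
        if "\\caption{" not in env:
            problems.append(f"figure {index}: missing caption")
        if not found:
            problems.append(f"figure {index}: missing fig: label")
        labels.extend(found)
    # A label counts as attached to prose if any \ref in the source uses it.
    referenced = set(REF.findall(tex_source))
    for label in labels:
        if label not in referenced:
            problems.append(f"{label}: never referenced in prose")
    return problems
```

Run against a section that contains both placeholders and the discussion paragraphs from the example file, the function should return an empty list, since `fig:method-overview` and `fig:results-overview` each have a caption, a label, and a `\ref` in the prose.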