superlab 0.1.24 → 0.1.26

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (35)
  1. package/README.md +8 -2
  2. package/README.zh-CN.md +8 -3
  3. package/lib/auto_contracts.cjs +4 -2
  4. package/lib/context.cjs +155 -13
  5. package/lib/i18n.cjs +89 -15
  6. package/lib/install.cjs +2 -0
  7. package/package-assets/claude/commands/lab-write.md +1 -1
  8. package/package-assets/claude/commands/lab.md +2 -1
  9. package/package-assets/codex/prompts/lab-write.md +1 -1
  10. package/package-assets/codex/prompts/lab.md +2 -1
  11. package/package-assets/shared/lab/.managed/scripts/validate_manuscript_delivery.py +175 -0
  12. package/package-assets/shared/lab/.managed/templates/artifact-status.md +28 -0
  13. package/package-assets/shared/lab/.managed/templates/final-report.md +0 -11
  14. package/package-assets/shared/lab/.managed/templates/paper-figure.tex +6 -0
  15. package/package-assets/shared/lab/.managed/templates/paper-references.bib +9 -0
  16. package/package-assets/shared/lab/.managed/templates/paper-table.tex +13 -0
  17. package/package-assets/shared/lab/context/auto-mode.md +2 -2
  18. package/package-assets/shared/lab/context/session-brief.md +1 -1
  19. package/package-assets/shared/lab/context/state.md +19 -13
  20. package/package-assets/shared/lab/context/workflow-state.md +19 -0
  21. package/package-assets/shared/lab/system/core.md +4 -2
  22. package/package-assets/shared/skills/lab/SKILL.md +19 -14
  23. package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion/conservative-claim-boundary.md +27 -0
  24. package/package-assets/shared/skills/lab/references/paper-writing/examples/conclusion-examples.md +16 -0
  25. package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments/figure-placeholder-and-discussion.md +44 -0
  26. package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments/main-results-and-ablation-latex.md +83 -0
  27. package/package-assets/shared/skills/lab/references/paper-writing/examples/experiments-examples.md +17 -0
  28. package/package-assets/shared/skills/lab/references/paper-writing/examples/index.md +12 -3
  29. package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work/closest-prior-gap-template.md +20 -0
  30. package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work/topic-comparison-template.md +24 -0
  31. package/package-assets/shared/skills/lab/references/paper-writing/examples/related-work-examples.md +17 -0
  32. package/package-assets/shared/skills/lab/references/paper-writing-integration.md +19 -10
  33. package/package-assets/shared/skills/lab/stages/report.md +5 -2
  34. package/package-assets/shared/skills/lab/stages/write.md +34 -1
  35. package/package.json +1 -1
@@ -58,7 +58,8 @@ Use the same repository artifacts and stage boundaries every time.
  - `iterate` requires a normalized summary from `scripts/eval_report.py`.
  - `run`, `iterate`, `auto`, and `report` should all follow `.lab/context/eval-protocol.md`, including its recorded sources for metrics and comparison implementations.
  - `write` requires an approved framing artifact from the `framing` stage.
- - `write` requires stable report artifacts, a mini-outline, the active section guide, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+ - `write` requires stable report artifacts, a mini-outline, the active section guide, the matching bundled examples when available, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+ - Final-draft or export rounds in `write` should materialize paper-facing tables, figure placeholders, a non-empty `references.bib`, and pass `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper`.

  ## How to Ask for `/lab auto`

@@ -6,4 +6,4 @@ argument-hint: section or writing target
  Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.

  Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
- This command runs the `/lab:write` stage. It requires an approved framing artifact from `/lab:framing`, must read the matching section reference from `.codex/skills/lab/references/paper-writing/`, and for `abstract`, `introduction`, or `method` it must also read `.codex/skills/lab/references/paper-writing/examples/index.md` plus the matching examples index and 1-2 concrete example files. Then it should run `paper-review.md` and `does-my-writing-flow-source.md`, build a mini-outline, and revise only one section.
+ This command runs the `/lab:write` stage. It requires an approved framing artifact from `/lab:framing`, must read the matching section reference from `.codex/skills/lab/references/paper-writing/`, and for any section with a bundled example bank it must also read `.codex/skills/lab/references/paper-writing/examples/index.md` plus the matching examples index and 1-2 concrete example files. Then it should run `paper-review.md` and `does-my-writing-flow-source.md`, build a mini-outline, plan the section's paper-facing tables/figures/citations, and revise only one section. Final-draft or export rounds must run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before stopping.
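The final-round validation call required above can be sketched in Python. This is a minimal sketch, not part of the package: `deliverables` is an assumed stand-in for the configured `<deliverables_root>`, and the script path is the managed location named in this diff.

```python
from pathlib import Path

# Hypothetical final-draft round wiring: build the validator command
# the write stage is expected to run before stopping.
deliverables_root = Path("deliverables")  # assumed <deliverables_root>
cmd = [
    "python3",
    ".lab/.managed/scripts/validate_manuscript_delivery.py",
    "--paper-dir",
    str(deliverables_root / "paper"),
]
# The validator prints issues to stderr and exits nonzero on failure,
# so the round should only be accepted on a zero exit code.
print(" ".join(cmd))
```

Running this with `subprocess.run(cmd)` from the repository root would reproduce the gate the command prompt describes.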
@@ -52,7 +52,8 @@ argument-hint: workflow question or stage choice
  - `/lab:iterate` requires a normalized summary from `scripts/eval_report.py`.
  - `/lab:run`, `/lab:iterate`, `/lab:auto`, and `/lab:report` should all follow `.lab/context/eval-protocol.md`, including its recorded sources for metrics and comparison implementations.
  - `/lab:write` requires an approved framing artifact from `/lab:framing`.
- - `/lab:write` requires stable report artifacts, a mini-outline, the active section guide, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+ - `/lab:write` requires stable report artifacts, a mini-outline, the active section guide, the matching bundled examples when available, `paper-review.md`, and `does-my-writing-flow-source.md`, and should only change one section per round.
+ - Final-draft or export rounds in `/lab:write` should materialize paper-facing tables, figure placeholders, a non-empty `references.bib`, and pass `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper`.

  ## How to Ask for `/lab:auto`

@@ -0,0 +1,175 @@
+ #!/usr/bin/env python3
+ import argparse
+ import re
+ import sys
+ from pathlib import Path
+
+
+ ABSOLUTE_PATH_MARKERS = ("/Users/", "/home/", "/tmp/", "/private/tmp/")
+ REQUIRED_TABLE_FILES = ("main-results.tex", "ablations.tex")
+ REQUIRED_FIGURE_FILES = ("method-overview.tex", "results-overview.tex")
+
+
+ def parse_args():
+     parser = argparse.ArgumentParser(
+         description="Validate that a paper delivery contains basic manuscript-ready assets."
+     )
+     parser.add_argument("--paper-dir", required=True, help="Path to the paper deliverable root")
+     return parser.parse_args()
+
+
+ def read_text(path: Path) -> str:
+     return path.read_text(encoding="utf-8")
+
+
+ def check_exists(path: Path, issues: list[str], label: str):
+     if not path.exists():
+         issues.append(f"missing required file: {label} ({path})")
+
+
+ def check_bibliography(paper_dir: Path, issues: list[str]):
+     bib_path = paper_dir / "references.bib"
+     check_exists(bib_path, issues, "references.bib")
+     if not bib_path.exists():
+         return
+     text = read_text(bib_path)
+     if "TODO" in text or "todo" in text or "Add bibliography entries" in text or "@" not in text:
+         issues.append("missing a non-empty references.bib")
+
+
+ def check_global_tex(paper_dir: Path, issues: list[str]):
+     tex_files = sorted(paper_dir.rglob("*.tex"))
+     combined = "\n".join(read_text(path) for path in tex_files)
+
+     if r"\cite{" not in combined:
+         issues.append("missing citation commands in manuscript tex files")
+     if "+/-" in combined:
+         issues.append("replace '+/-' with LaTeX \\pm formatting")
+     if any(marker in combined for marker in ABSOLUTE_PATH_MARKERS):
+         issues.append("manuscript tex files must not contain absolute local paths")
+
+
+ def check_table_file(path: Path, issues: list[str], label: str):
+     if not path.exists():
+         if label == "tables/main-results.tex":
+             issues.append("missing a main results table")
+         elif label == "tables/ablations.tex":
+             issues.append("missing an ablation table")
+         else:
+             issues.append(f"missing required file: {label} ({path})")
+         return
+     text = read_text(path)
+     if r"\begin{table}" not in text:
+         issues.append(f"{label} must contain a table environment")
+     if r"\caption{" not in text or r"\label{" not in text:
+         issues.append(f"{label} must contain both caption and label")
+     if not all(token in text for token in (r"\toprule", r"\midrule", r"\bottomrule")):
+         issues.append(f"{label} must use booktabs structure")
+
+
+ def check_figure_file(path: Path, issues: list[str], label: str):
+     if not path.exists():
+         if label == "figures/method-overview.tex":
+             issues.append("missing a method figure placeholder")
+         elif label == "figures/results-overview.tex":
+             issues.append("missing an experiments figure placeholder")
+         else:
+             issues.append(f"missing required file: {label} ({path})")
+         return
+     text = read_text(path)
+     if r"\begin{figure}" not in text:
+         issues.append(f"{label} must contain a figure environment")
+     if r"\caption{" not in text or r"\label{" not in text:
+         issues.append(f"{label} must contain both caption and label")
+     if "Figure intent:" not in text and "图意图:" not in text:
+         issues.append(f"{label} must explain figure intent")
+
+
+ def check_experiments_section(paper_dir: Path, issues: list[str]):
+     experiments = paper_dir / "sections" / "experiments.tex"
+     check_exists(experiments, issues, "sections/experiments.tex")
+     if not experiments.exists():
+         return
+     text = read_text(experiments)
+     has_table = any(
+         token in text
+         for token in (
+             r"\input{tables/main-results}",
+             r"\input{tables/ablations}",
+             r"\begin{table}",
+         )
+     )
+     has_figure = any(
+         token in text
+         for token in (
+             r"\input{figures/results-overview}",
+             r"\begin{figure}",
+         )
+     )
+     if not has_table:
+         issues.append("experiments section is missing a main results table")
+     if not has_figure:
+         issues.append("experiments section is missing an experiments figure placeholder")
+
+
+ def check_method_section(paper_dir: Path, issues: list[str]):
+     method = paper_dir / "sections" / "method.tex"
+     check_exists(method, issues, "sections/method.tex")
+     if not method.exists():
+         return
+     text = read_text(method)
+     has_figure = any(
+         token in text
+         for token in (
+             r"\input{figures/method-overview}",
+             r"\begin{figure}",
+         )
+     )
+     if not has_figure:
+         issues.append("method section is missing a method figure placeholder")
+
+
+ def check_main_tex(paper_dir: Path, issues: list[str]):
+     main_tex = paper_dir / "main.tex"
+     check_exists(main_tex, issues, "main.tex")
+     if not main_tex.exists():
+         return
+     text = read_text(main_tex)
+     if r"\bibliography{references}" not in text:
+         issues.append("main.tex must include the references bibliography")
+
+
+ def main():
+     args = parse_args()
+     paper_dir = Path(args.paper_dir)
+     issues: list[str] = []
+
+     if not paper_dir.exists():
+         print(f"paper directory does not exist: {paper_dir}", file=sys.stderr)
+         return 1
+
+     check_main_tex(paper_dir, issues)
+     check_bibliography(paper_dir, issues)
+     check_global_tex(paper_dir, issues)
+     check_method_section(paper_dir, issues)
+     check_experiments_section(paper_dir, issues)
+
+     tables_dir = paper_dir / "tables"
+     check_table_file(tables_dir / REQUIRED_TABLE_FILES[0], issues, "tables/main-results.tex")
+     check_table_file(tables_dir / REQUIRED_TABLE_FILES[1], issues, "tables/ablations.tex")
+
+     figures_dir = paper_dir / "figures"
+     check_figure_file(figures_dir / REQUIRED_FIGURE_FILES[0], issues, "figures/method-overview.tex")
+     check_figure_file(figures_dir / REQUIRED_FIGURE_FILES[1], issues, "figures/results-overview.tex")
+
+     if issues:
+         for issue in issues:
+             print(issue, file=sys.stderr)
+         return 1
+
+     print("manuscript delivery artifacts are valid")
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
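The script's gates are deliberately blunt substring checks rather than a LaTeX parser. A minimal sketch of the global `.tex` hygiene pass, run here against a hypothetical one-line manuscript excerpt instead of real files:

```python
# Sketch of check_global_tex's substring gates from the script above,
# applied to a hypothetical manuscript excerpt (not a real .tex file).
combined = r"We improve over the baseline by $0.12 \pm 0.01$ \cite{placeholder2026}."

issues = []
if r"\cite{" not in combined:
    issues.append("missing citation commands in manuscript tex files")
if "+/-" in combined:
    issues.append("replace '+/-' with LaTeX \\pm formatting")
if any(marker in combined for marker in ("/Users/", "/home/", "/tmp/", "/private/tmp/")):
    issues.append("manuscript tex files must not contain absolute local paths")

print(issues)  # an excerpt with \cite, \pm, and no local paths passes: []
```

Because these are substring checks, a passing run only guarantees surface structure, not correct content.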
@@ -0,0 +1,28 @@
+ # Artifact Status
+
+ ## Deliverable Status
+
+ - Collaborator-facing report path:
+ - Managed main tables path:
+ - Current report mode:
+ - Why this status is appropriate:
+
+ ## Workflow Audit
+
+ - Latest completed action:
+ - Latest artifact path:
+ - Latest run or report id:
+ - Rerun or validation notes:
+
+ ## Internal Provenance
+
+ - Frozen result artifacts used:
+ - Canonical context files refreshed:
+ - Evidence index anchors:
+
+ ## Paper Handoff
+
+ - Sections ready for `/lab:write`:
+ - Evidence bundles to cite:
+ - Claims that still need stronger support:
+ - Paper-finishing items still open:
@@ -105,11 +105,6 @@
  - Final performance summary:
  - Table coverage:

- ## Artifact Status
-
- - Deliverables or workflow artifacts that are ready:
- - Artifact status notes that are not scientific findings:
-
  ## Main Results

  Summarize validated iteration outcomes.
@@ -129,9 +124,3 @@ Describe unresolved risks and external validity limits.
  ## Next Steps

  List concrete follow-up actions.
-
- ## Paper Handoff
-
- - Sections ready for `/lab:write`:
- - Evidence bundles to cite:
- - Claims that still need stronger support:
@@ -0,0 +1,6 @@
+ \begin{figure}[t]
+ \centering
+ \fbox{\rule{0pt}{1.2in}\rule{0.9\linewidth}{0pt}}
+ \caption{Figure title. Figure intent: explain what this figure should show and why the reader needs it.}
+ \label{fig:placeholder}
+ \end{figure}
@@ -0,0 +1,9 @@
+ % Add paper-facing bibliography entries here.
+ % Keep keys stable with the manuscript's \cite{...} usage.
+
+ @article{placeholder2026,
+   title = {Replace with a real cited work before finalizing},
+   author = {Placeholder, Example},
+   journal = {Placeholder Venue},
+   year = {2026}
+ }
@@ -0,0 +1,13 @@
+ \begin{table}[t]
+ \caption{One-sentence message of the table and the evaluation protocol.}
+ \label{tab:placeholder}
+ \centering
+ \begin{tabular}{lcc}
+ \toprule
+ Method & Metric 1 $\uparrow$ & Metric 2 $\uparrow$ \\
+ \midrule
+ Ours & 0.0000 & 0.0000 \\
+ Baseline & 0.0000 & 0.0000 \\
+ \bottomrule
+ \end{tabular}
+ \end{table}
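The bundled table template above is exactly the shape the new validator accepts. A minimal sketch of `check_table_file`'s structural gates, applied to that template text:

```python
# Sketch of the booktabs/caption/label gates from check_table_file,
# run against the bundled paper-table.tex placeholder content.
sample = r"""\begin{table}[t]
\caption{One-sentence message of the table and the evaluation protocol.}
\label{tab:placeholder}
\centering
\begin{tabular}{lcc}
\toprule
Method & Metric 1 $\uparrow$ & Metric 2 $\uparrow$ \\
\midrule
Ours & 0.0000 & 0.0000 \\
Baseline & 0.0000 & 0.0000 \\
\bottomrule
\end{tabular}
\end{table}
"""

issues = []
if r"\begin{table}" not in sample:
    issues.append("must contain a table environment")
if r"\caption{" not in sample or r"\label{" not in sample:
    issues.append("must contain both caption and label")
if not all(token in sample for token in (r"\toprule", r"\midrule", r"\bottomrule")):
    issues.append("must use booktabs structure")

print(issues)  # the placeholder already satisfies every gate: []
```

Filling in real numbers keeps the template passing, since the gates only look for the table environment, caption/label, and booktabs tokens.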
@@ -51,8 +51,8 @@ If `eval-protocol.md` declares structured rung entries, auto mode follows those

  - Run stage contract: write persistent outputs under `results_root`.
  - Iterate stage contract: update persistent outputs under `results_root`.
- - Review stage contract: update canonical review context such as `.lab/context/decisions.md`, `state.md`, `open-questions.md`, or `evidence-index.md`.
- - Report stage contract: write the final report to `<deliverables_root>/report.md`.
+ - Review stage contract: update canonical review context such as `.lab/context/decisions.md`, `state.md`, `workflow-state.md`, `open-questions.md`, or `evidence-index.md`.
+ - Report stage contract: write `<deliverables_root>/report.md`, `<deliverables_root>/main-tables.md`, and `<deliverables_root>/artifact-status.md`.
  - Write stage contract: write LaTeX output under `<deliverables_root>/paper/`.

  ## Promotion Policy
@@ -24,7 +24,7 @@ One sentence describing the active research mission.
  ## Read First

  1. `.lab/context/mission.md`
- 2. `.lab/context/state.md`
+ 2. `.lab/context/workflow-state.md`
  3. `.lab/context/evidence-index.md`

  ## Do Not Change Silently
@@ -1,19 +1,25 @@
- # Workflow State
+ # Research State

- ## Current Stage
+ ## Approved Direction

- - Active stage:
- - Current objective:
- - Next required output:
+ - One-sentence problem:
+ - Approved direction:
+ - Strongest supported claim:

- ## Latest Update
+ ## Evidence Boundary

- - Last completed action:
- - Latest artifact path:
- - Latest run or report id:
+ - What the current evidence really supports:
+ - What is still outside the boundary:
+ - Biggest research risk:

- ## Next Step
+ ## Active Research Track

- - Immediate next action:
- - Blocking issue:
- - Human decision needed:
+ - Current research focus:
+ - Primary metric:
+ - Dataset or benchmark scope:
+
+ ## Current Research Constraints
+
+ - Hard constraints:
+ - Claim boundary:
+ - Conditions that require reopening the direction:
@@ -0,0 +1,19 @@
+ # Workflow State
+
+ ## Current Stage
+
+ - Active stage:
+ - Current objective:
+ - Next required output:
+
+ ## Latest Update
+
+ - Last completed action:
+ - Latest artifact path:
+ - Latest run or report id:
+
+ ## Next Step
+
+ - Immediate next action:
+ - Blocking issue:
+ - Human decision needed:
@@ -8,7 +8,7 @@ For a new AI session, read these files in order:

  1. `.lab/context/session-brief.md`
  2. `.lab/context/mission.md`
- 3. `.lab/context/state.md`
+ 3. `.lab/context/workflow-state.md`
  4. `.lab/context/evidence-index.md`

  Only expand to additional context when the brief points to it.
@@ -24,13 +24,15 @@ For auto-mode orchestration or long-running experiment campaigns, also read:

  ## Workflow Boundaries

- - `.lab/context/` holds durable project research state.
+ - `.lab/context/` holds durable project research state plus lightweight workflow state.
  - `.lab/changes/`, `.lab/iterations/`, and `.lab/writing/` hold workflow control artifacts, lightweight manifests, and change-local harnesses.
  - `.lab/.managed/` holds tool-managed templates and scripts.
  - Durable run outputs belong under the configured `results_root`, not inside `.lab/changes/`.
  - Figures and plots belong under the configured `figures_root`, not inside `.lab/changes/`.
  - Deliverables belong under the configured `deliverables_root`, not inside `.lab/context/`.
  - Change-local `data/` directories may hold lightweight manifests or batch specs, but not the canonical dataset copy.
+ - `.lab/context/state.md` holds durable research state; `.lab/context/workflow-state.md` holds live workflow state.
+ - `.lab/context/summary.md` is the durable project summary; `.lab/context/session-brief.md` is the next-session startup brief.
  - `.lab/context/auto-mode.md` defines the bounded autonomous envelope; `.lab/context/auto-status.md` records live state for resume and handoff.
  - If the user provides a LaTeX template directory, validate it and attach it through `paper_template_root` before drafting.
  - Treat attached template directories as user-owned assets. Do not rewrite template files unless the user explicitly asks.
@@ -83,7 +83,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  ### `/lab:auto`

  - Use this stage to orchestrate approved execution stages with bounded autonomy.
- - Read `.lab/config/workflow.json`, `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/terminology-lock.md`, `.lab/context/auto-mode.md`, and `.lab/context/auto-status.md` before acting.
+ - Read `.lab/config/workflow.json`, `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/terminology-lock.md`, `.lab/context/auto-mode.md`, and `.lab/context/auto-status.md` before acting.
  - Treat `.lab/context/auto-mode.md` as the control contract and `.lab/context/auto-status.md` as the live state file.
  - Require `Autonomy level` and `Approval status` in `.lab/context/auto-mode.md` before execution.
  - Treat `L1` as safe-run validation, `L2` as bounded iteration, and `L3` as aggressive campaign mode.
@@ -93,13 +93,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  - You may add exploratory datasets, benchmarks, and comparison methods inside the approved exploration envelope.
  - You may promote an exploratory addition to the primary package only after the promotion policy in `auto-mode.md` is satisfied and the promotion is written back into `.lab/context/data-decisions.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, and `.lab/context/session-brief.md`.
  - Poll long-running commands until they complete, time out, or hit a stop condition.
- - Update `.lab/context/auto-status.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/session-brief.md` as the campaign advances.
+ - Update `.lab/context/auto-status.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/data-decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/session-brief.md` as the campaign advances.
  - Keep an explicit approval gate when a proposed action would leave the frozen core defined by the auto-mode contract.

  ### `/lab:spec`

  - Read `.lab/config/workflow.json` before drafting the change.
- - Read `.lab/context/mission.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, and `.lab/context/data-decisions.md` before drafting the change.
+ - Read `.lab/context/mission.md`, `.lab/context/decisions.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/data-decisions.md` before drafting the change.
  - Use `.lab/changes/<change-id>/` as the canonical lab change directory.
  - Convert the approved idea into lab change artifacts using `.lab/.managed/templates/proposal.md`, `.lab/.managed/templates/design.md`, `.lab/.managed/templates/spec.md`, and `.lab/.managed/templates/tasks.md`.
  - Update `.lab/context/state.md` and `.lab/context/decisions.md` after freezing the spec.
@@ -108,12 +108,12 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  ### `/lab:run`

  - Start with the smallest meaningful experiment.
- - Read `.lab/context/mission.md`, `.lab/context/state.md`, and `.lab/context/data-decisions.md` before choosing the run.
+ - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/data-decisions.md` before choosing the run.
  - Register the run with `.lab/.managed/scripts/register_run.py`.
  - Normalize the result with `.lab/.managed/scripts/eval_report.py`.
  - Validate normalized output with `.lab/.managed/scripts/validate_results.py`.
  - Read `.lab/context/eval-protocol.md` before choosing the smallest run so the first experiment already targets the approved tables, metrics, and gates.
- - Update `.lab/context/state.md`, `.lab/context/evidence-index.md`, and `.lab/context/eval-protocol.md` after the run.
+ - Update `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/evidence-index.md`, and `.lab/context/eval-protocol.md` after the run.
  - If the evaluation protocol is still skeletal, initialize the smallest trustworthy source-backed version before treating the run as the protocol anchor.

  ### `/lab:iterate`
@@ -128,13 +128,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  - maximum iteration count
  - Only change implementation hypotheses within the loop.
  - Require a normalized evaluation report each round.
- - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, and `.lab/context/evidence-index.md` at the start of each round.
+ - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, and `.lab/context/evidence-index.md` at the start of each round.
  - Read `.lab/context/data-decisions.md` before changing benchmark-facing experiments.
  - Read `.lab/context/eval-protocol.md` before changing evaluation ladders, sample sizes, or promotion gates.
  - Keep metric definitions, baseline behavior, and comparison implementations anchored to the source-backed evaluation protocol before changing thresholds, gates, or ladder transitions.
  - Switch to diagnostic mode if risk increases for two consecutive rounds.
  - Write round reports with `.lab/.managed/templates/iteration-report.md`.
- - Update `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/open-questions.md`, and `.lab/context/eval-protocol.md` each round as needed.
+ - Update `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, `.lab/context/open-questions.md`, and `.lab/context/eval-protocol.md` each round as needed.
  - Keep `.lab/context/eval-protocol.md` synchronized with accepted ladder changes, benchmark scope, and source-backed implementation deviations.
  - Stop at threshold success or iteration cap, and record blockers plus next-best actions when the campaign ends without success.
@@ -151,13 +151,13 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  ### `/lab:report`

  - Summarize all validated iteration summaries.
- - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
+ - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
  - Read `.lab/context/eval-protocol.md` before choosing tables, thresholds, or final result framing.
  - Keep metric definitions, comparison semantics, and implementation references anchored to the approved evaluation protocol instead of re-deriving them during reporting.
  - Aggregate them with `.lab/.managed/scripts/summarize_iterations.py`.
- - Write the final document with `.lab/.managed/templates/final-report.md` and the managed table summary with `.lab/.managed/templates/main-tables.md`.
+ - Write the final document with `.lab/.managed/templates/final-report.md`, the managed table summary with `.lab/.managed/templates/main-tables.md`, and the internal handoff with `.lab/.managed/templates/artifact-status.md`.
  - Keep failed attempts and limitations visible.
- - Update `.lab/context/mission.md`, `.lab/context/eval-protocol.md`, `.lab/context/state.md`, and `.lab/context/evidence-index.md` with report-level handoff notes.
+ - Update `.lab/context/mission.md`, `.lab/context/eval-protocol.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, and `.lab/context/evidence-index.md` with report-level handoff notes.
  - If canonical context is still skeletal, hydrate the smallest trustworthy version from frozen artifacts before finalizing the report.
  - If collaborator-critical fields remain missing after hydration, downgrade to an `artifact-anchored interim report` instead of presenting a final collaborator-ready report.

@@ -172,14 +172,19 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  - Write one paper section or one explicit subproblem per round.
  - Bind each claim to evidence from `report`, iteration reports, or normalized summaries.
  - Write planning artifacts with `.lab/.managed/templates/paper-plan.md`, `.lab/.managed/templates/paper-section.md`, and `.lab/.managed/templates/write-iteration.md`.
- - Write final manuscript artifacts with `.lab/.managed/templates/paper.tex` and `.lab/.managed/templates/paper-section.tex`.
+ - Write final manuscript artifacts with `.lab/.managed/templates/paper.tex`, `.lab/.managed/templates/paper-section.tex`, `.lab/.managed/templates/paper-table.tex`, `.lab/.managed/templates/paper-figure.tex`, and `.lab/.managed/templates/paper-references.bib`.
  - Use the vendored paper-writing references under `skills/lab/references/paper-writing/`.
- - For `abstract`, `introduction`, and `method`, also use the vendored example-bank files under `skills/lab/references/paper-writing/examples/`.
+ - For any section with a bundled example bank, also use the vendored example-bank files under `skills/lab/references/paper-writing/examples/`.
  - Load only the current section guide, the matching examples index when one exists, 1-2 matching concrete example files, plus `paper-review.md` and `does-my-writing-flow-source.md`.
  - Build a compact mini-outline before prose.
+ - Build the paper asset plan before prose when the section carries method or experiments claims.
  - For each subsection, explicitly cover motivation, design, and technical advantage when applicable.
  - Keep terminology stable across rounds and sections.
  - If a claim is not supported by evidence, weaken or remove it.
+ - Treat tables, figures, citations, and bibliography as core manuscript content rather than optional polish.
+ - Keep paper-facing LaTeX free of absolute local paths, rerun ids, shell transcripts, and internal workflow provenance.
+ - Materialize real LaTeX tables and figure placeholders instead of leaving all evidence inside prose paragraphs.
+ - Run `.lab/.managed/scripts/validate_manuscript_delivery.py --paper-dir <deliverables_root>/paper` before accepting a final-draft or export round.
  - Before finalizing a round, append and answer the five-dimension self-review checklist and revise unresolved items.
  - Apply paper-writing discipline without changing experimental truth.
  - If the evidence is insufficient, stop and route back to `review` or `iterate`.
@@ -194,7 +199,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
  - No unconstrained auto mode. Every `/lab:auto` campaign must declare allowed stages, stop conditions, and a promotion policy in `.lab/context/auto-mode.md`.
  - No auto start without an explicit autonomy level and `Approval status: approved`.
  - No final report without validated normalized results.
- - No paper-writing round without stable report artifacts, an approved framing artifact, evidence links, and LaTeX manuscript output.
+ - No paper-writing round without stable report artifacts, an approved framing artifact, evidence links, LaTeX manuscript output, and a passing manuscript-delivery validation for final-draft or export rounds.

  ## References

@@ -212,7 +217,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
212
217
  - Write stage guide: `.codex/skills/lab/stages/write.md` or `.claude/skills/lab/stages/write.md`
213
218
  - Paper-writing integration: `.codex/skills/lab/references/paper-writing-integration.md` or `.claude/skills/lab/references/paper-writing-integration.md`
214
219
  - Vendored paper-writing references: `.codex/skills/lab/references/paper-writing/{abstract,introduction,related-work,method,experiments,conclusion,paper-review,does-my-writing-flow-source}.md` or `.claude/skills/lab/references/paper-writing/{abstract,introduction,related-work,method,experiments,conclusion,paper-review,does-my-writing-flow-source}.md`
215
- - Vendored paper-writing example bank: `.codex/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples}.md` or `.claude/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples}.md`, plus the matching section subdirectories
220
+ - Vendored paper-writing example bank: `.codex/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples,related-work-examples,experiments-examples,conclusion-examples}.md` or `.claude/skills/lab/references/paper-writing/examples/{index,abstract-examples,introduction-examples,method-examples,related-work-examples,experiments-examples,conclusion-examples}.md`, plus the matching section subdirectories
216
221
  - Command adapters: the installed `/lab:*` command assets
217
222
  - Shared workflow config: `.lab/config/workflow.json`
218
223
  - Shared project context: `.lab/context/{mission,state,decisions,evidence-index,open-questions,data-decisions,eval-protocol,auto-mode,auto-status}.md`
@@ -0,0 +1,27 @@
+ # Conservative Claim-Boundary LaTeX Example
+
+ Use this example to close with the strongest supported claim while keeping the
+ boundary explicit.
+
+ ```tex
+ \section{Conclusion}
+
+ This paper shows that adding a structured ranking backbone together with a
+ post-hoc calibration stage improves uplift ranking under the frozen benchmark
+ protocol. Across the three benchmark families used in this work, the full model
+ consistently matches or exceeds the strongest baselines and remains stronger
+ than the key ablated variants. This makes the main claim narrower than a
+ universal superiority claim but stronger than a single-dataset win.
+
+ We do not claim that the current method solves uplift modeling in every domain
+ or that every design choice helps equally on every benchmark. In particular, the
+ calibration stage appears beneficial on some datasets and neutral on others,
+ which means its value should be interpreted as setting-dependent rather than as
+ a guaranteed gain. That boundary is consistent with recent benchmarking
+ practice, which argues for claim discipline and protocol-specific interpretation
+ rather than broad overgeneralization~\cite{carlini2019evaluating}.
+
+ The most useful next step is to extend the evaluation to a broader set of
+ benchmark slices and to test whether the same ranking-versus-calibration split
+ remains useful when the label distribution shifts more aggressively.
+ ```
@@ -0,0 +1,16 @@
+ # Conclusion Example Patterns
+
+ Use these examples to end with a bounded claim, not a marketing recap. The
+ referenced file is a complete LaTeX conclusion example with explicit claim
+ boundary language.
+
+ ## Recommended Pattern
+
+ 1. Restate the narrow supported claim.
+ 2. Restate the strongest evidence in one compact sentence.
+ 3. State the main limitation or boundary.
+ 4. End with the next concrete direction, not generic future work.
+
+ ## Example Files
+
+ - `examples/conclusion/conservative-claim-boundary.md`
@@ -0,0 +1,44 @@
+ # Figure Placeholder and Discussion Example
+
+ Use complete figure placeholders when the visual asset is not finalized yet but
+ the manuscript already needs a stable figure slot, caption, label, and prose
+ attachment.
+
+ ## Method Figure Placeholder
+
+ ```tex
+ \begin{figure}[t]
+   \centering
+   \fbox{\rule{0pt}{1.55in}\rule{0.92\linewidth}{0pt}}
+   \caption{Method overview. Figure intent: show the full pipeline, highlight the
+   boundary between the structured scoring module and the post-hoc calibration
+   stage, and make the train-time versus inference-time data flow easy to inspect.}
+   \label{fig:method-overview}
+ \end{figure}
+ ```
+
+ ## Results Figure Placeholder
+
+ ```tex
+ \begin{figure}[t]
+   \centering
+   \fbox{\rule{0pt}{1.55in}\rule{0.92\linewidth}{0pt}}
+   \caption{Benchmark-level results overview. Figure intent: summarize the trend
+   across datasets, show error bars or confidence intervals, and reveal whether the
+   main gain is stable or dominated by one benchmark.}
+   \label{fig:results-overview}
+ \end{figure}
+ ```
+
+ ## Discussion Example
+
+ ```tex
+ Figure~\ref{fig:method-overview} gives the reader the shortest path to the
+ method's logic before the section moves into module details. The figure should
+ make it obvious which component produces the structured signal and where the
+ post-hoc calibration step changes the final ranking.
+
+ Figure~\ref{fig:results-overview} should then complement the tables rather than
+ repeat them. Its job is to show whether the gain is stable across datasets and
+ seeds, not to claim a new effect that the tables do not already support.
+ ```