superlab 0.1.65 → 0.1.66

This diff shows the changes between two publicly released versions of the package, as published to a supported registry. It is provided for informational purposes only.
package/lib/i18n.cjs CHANGED
@@ -2150,6 +2150,12 @@ ZH_CONTENT[path.join(".codex", "skills", "lab", "stages", "write.md")] = `# \`/l
 - 如果当前 section 是 \`abstract\`、\`introduction\` 或 \`method\`,还必须继续读取本地 example bank:\`references/paper-writing/examples/index.md\`、对应的 examples index,以及 1-2 个具体 example 文件。
 - 如果当前 section 是 \`related work\`、\`experiments\` 或 \`conclusion\`,也要读取对应的本地 example bank:\`references/paper-writing/examples/index.md\`、对应的 examples index,以及 1-2 个具体 example 文件。
 - 例子只能复用结构、段落角色和句法逻辑,不能直接复用原句。
+- 如果用户在 \`/lab:write\` 里提供本地 PDF、PDF URL、HTML 页面或本地参考论文,起草前必须先运行 \`.lab/.managed/scripts/extract_reference_paper_structure.py --output-dir .lab/writing/reference-patterns <sources...>\`,除非 \`.lab/writing/reference-patterns/aggregate-template-playbook.md\` 已经覆盖这些完全相同的来源。
+- reference-patterns 是 \`/lab:write\` 的内部能力,不是让用户额外学习的新命令;用户仍然只需要输入 \`/lab-write\`。
+- 使用参考论文时,必须读取 \`.lab/writing/reference-patterns/aggregate-template-playbook.md\`、当前 section 对应的 \`.lab/writing/reference-patterns/section-templates/<section>.json\`,以及需要图表时的 \`.lab/writing/reference-patterns/visual-templates/experiment-assets.json\`。
+- 写 experiments 时,如果存在 \`.lab/writing/reference-patterns/section-templates/experiments-protocol.json\`,还必须读取它,并保留其中的协议槽位逻辑:数据集描述、数据集统计或 appendix 链接、split 或 sampling 协议、baseline / 对比方法设置、指标定义、implementation / tuning 细节、主结果、消融、敏感性或分析。
+- 不要把 \`setup\`、\`overall performance\` 这类粗标签当成完整的 experiments 结构抽取;如果参考论文里有单独的 dataset、baseline、metric、implementation 或 appendix-detail 段落,就必须把这些细槽位抽出来并在 mini-outline 中体现。
+- 参考论文只允许复现结构、段落功能、图表作用、放置逻辑和前后桥接;不得复制措辞、claim、指标、baseline、数据、caption 或结论。
 - 先写 mini-outline 再写 prose。
 - 如果当前 section 带引言、方法、实验、相关工作或结论 claim,先规划需要的表格、figure placeholders 和 citations,再写 prose。
 - 在起草 \`introduction\`、\`method\`、\`experiments\`、\`related work\` 或 \`conclusion\` 之前,必须先运行 \`.lab/.managed/scripts/validate_paper_plan.py --paper-plan .lab/writing/plan.md\`。
@@ -87,6 +87,53 @@ CANONICAL_HEADING_TITLES = {
     "appendix",
 }
 
+EXPERIMENT_PROTOCOL_SLOT_GUIDANCE = {
+    "dataset_description": {
+        "reader_question": "Which datasets define the evaluation scope, and why are they relevant?",
+        "placement_guidance": "place before baselines, metrics, and main results",
+    },
+    "dataset_statistics": {
+        "reader_question": "What dataset scale, feature, treatment, or split facts constrain interpretation?",
+        "placement_guidance": "place near dataset descriptions or move detailed statistics to appendix with a main-text pointer",
+    },
+    "split_protocol": {
+        "reader_question": "How are train, validation, test, seed, or sampling decisions made?",
+        "placement_guidance": "place before metrics and results so comparisons have a fixed protocol",
+    },
+    "baseline_setup": {
+        "reader_question": "Which comparator families are included, and what role does each comparator play?",
+        "placement_guidance": "place after datasets and before the main comparison table",
+    },
+    "metric_definition": {
+        "reader_question": "Which metrics decide ranking, what do they measure, and which direction is better?",
+        "placement_guidance": "place before the first result table and repeat local definitions in table notes when needed",
+    },
+    "implementation_details": {
+        "reader_question": "Which tuning, validation, training, or hardware details are needed for reproducibility?",
+        "placement_guidance": "place after metrics or in appendix when details are long",
+    },
+    "main_results": {
+        "reader_question": "What is the primary comparison result under the declared protocol?",
+        "placement_guidance": "place after setup, baselines, and metrics",
+    },
+    "ablation": {
+        "reader_question": "Which component or design choice accounts for the claimed effect?",
+        "placement_guidance": "place after the main results",
+    },
+    "sensitivity": {
+        "reader_question": "How stable is the result under relevant protocol or hyperparameter changes?",
+        "placement_guidance": "place after ablations or in an analysis subsection",
+    },
+    "appendix_dataset_statistics": {
+        "reader_question": "Which detailed dataset facts support the compact dataset setup in the main experiments?",
+        "placement_guidance": "link from the main experimental setup and keep detailed tables in appendix",
+    },
+    "appendix_baseline_metric_details": {
+        "reader_question": "Which baseline, metric, or implementation details are too long for the main setup?",
+        "placement_guidance": "link from the main setup and keep long comparator or metric definitions in appendix",
+    },
+}
+
 
 
 @dataclass
 class SectionRecord:
@@ -272,6 +319,8 @@ def normalize_heading(line: str) -> str:
 
 def section_type_for_title(title: str) -> str:
     lowered = title.lower()
+    if re.match(r"^(appendix\b|[a-z]\.\d+(?:\.\d+)*\.?\s+)", lowered.strip()):
+        return "appendix"
     lowered = re.sub(r"^\d+(?:\.\d+)*\.?\s+", "", lowered).strip()
     for section_type, aliases in SECTION_ALIASES.items():
         if any(alias in lowered for alias in aliases):
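The new appendix guard in this hunk can be exercised on its own. The sketch below lifts the regex into a hypothetical helper (`is_appendix_heading` is a name introduced here for illustration; the diff inlines the check inside `section_type_for_title`):

```python
import re

def is_appendix_heading(title: str) -> bool:
    # Mirrors the new guard: bare "Appendix ..." headings, or "B.2 "-style
    # letter-dot-number appendix headings, are classified as appendix.
    lowered = title.lower()
    return re.match(r"^(appendix\b|[a-z]\.\d+(?:\.\d+)*\.?\s+)", lowered.strip()) is not None

print(is_appendix_heading("Appendix A: Proofs"))  # → True
print(is_appendix_heading("B.2 Extra Results"))   # → True
print(is_appendix_heading("3.1 Method"))          # → False
```

Note the guard runs before the numeric-prefix strip in the diff, so "B.2 Extra Results" is caught as appendix rather than falling through to alias matching.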
@@ -424,6 +473,60 @@ def paragraph_role(section_type: str, paragraph: str) -> str:
         return "visual_or_table_anchor"
     if section_type == "abstract":
         return "abstract_summary"
+    if any(token in lowered for token in ("dataset statistics", "statistics of the dataset", "sample count", "feature dimension")):
+        return "dataset_statistics"
+    if re.search(r"\bdatasets?\s*[:.]", lowered) or any(
+        token in lowered for token in ("benchmark dataset", "public dataset", "semi-synthetic dataset")
+    ):
+        return "dataset_description"
+    if re.search(r"\bbaselines?\s*[:.]", lowered) or any(
+        token in lowered for token in ("baseline family", "comparator", "compare with")
+    ):
+        return "baseline_setup"
+    if re.search(r"\bmetrics?\s*[:.]", lowered) or any(
+        token in lowered
+        for token in (
+            "metric definition",
+            "we report",
+            "primary metric",
+            "secondary metric",
+            "higher is better",
+            "lower is better",
+            "auuc",
+            "qini",
+        )
+    ):
+        return "metric_definition"
+    if any(
+        token in lowered
+        for token in (
+            "train/test",
+            "train, validation",
+            "training split",
+            "validation split",
+            "test split",
+            "random split",
+            "repeated split",
+            "sampling",
+            "seed",
+            "protocol",
+        )
+    ):
+        return "split_protocol"
+    if any(
+        token in lowered
+        for token in (
+            "implementation detail",
+            "hyperparameter",
+            "tuning",
+            "learning rate",
+            "epoch",
+            "batch size",
+            "hardware",
+            "gpu",
+        )
+    ):
+        return "implementation_details"
     if any(token in lowered for token in ("limitation", "future work", "caveat", "drift", "局限")):
         return "limitation_boundary"
     if any(token in lowered for token in ("ablation", "component", "without", "remove")):
@@ -726,6 +829,172 @@ def merge_unique(values: list[str]) -> list[str]:
     return merged
 
 
+def normalized_section_title(title: str) -> str:
+    title = re.sub(r"^\d+(?:\.\d+)*\.?\s+", "", title.strip())
+    title = re.sub(r"^[A-Z]\.\d+(?:\.\d+)*\.?\s+", "", title)
+    title = re.sub(r"^Appendix\s+[A-Z](?:\.\d+)*\.?\s*[:\-]?\s*", "", title, flags=re.IGNORECASE)
+    return re.sub(r"\s+", " ", title).strip().lower()
+
+
+def experiment_slots_from_signal(section: SectionRecord, role: str, text: str) -> list[str]:
+    title = normalized_section_title(section.title)
+    lowered = f"{title} {text.lower()}"
+    is_appendix = section.section_type == "appendix"
+    slots: list[str] = []
+
+    if is_appendix and "dataset" in lowered and any(
+        token in lowered for token in ("statistics", "statistic", "summary", "sample", "feature")
+    ):
+        slots.append("appendix_dataset_statistics")
+    if is_appendix and any(token in lowered for token in ("baseline", "metric", "implementation", "hyperparameter")):
+        slots.append("appendix_baseline_metric_details")
+    if role == "dataset_statistics" or (
+        "dataset" in lowered and any(token in lowered for token in ("statistics", "summary", "sample", "feature"))
+    ):
+        slots.append("dataset_statistics")
+    if role == "dataset_description" or "datasets" in title or "dataset" in title:
+        slots.append("dataset_description")
+    if role == "split_protocol" or any(
+        token in lowered
+        for token in ("split", "seed", "sampling", "train/test", "validation", "protocol")
+    ):
+        slots.append("split_protocol")
+    if role == "baseline_setup" or "baseline" in lowered or "comparator" in lowered:
+        slots.append("baseline_setup")
+    if role == "metric_definition" or "metric" in lowered or "auuc" in lowered or "qini" in lowered:
+        slots.append("metric_definition")
+    if role == "implementation_details" or any(
+        token in lowered
+        for token in ("implementation", "hyperparameter", "tuning", "epoch", "learning rate", "hardware")
+    ):
+        slots.append("implementation_details")
+    if "ablation" in lowered:
+        slots.append("ablation")
+    if "sensitivity" in lowered or "shift" in lowered or "trade-off" in lowered or "tradeoff" in lowered:
+        slots.append("sensitivity")
+    if role == "result_interpretation" or any(
+        token in lowered for token in ("main result", "overall performance", "performance", "results and discussion")
+    ):
+        slots.append("main_results")
+    return list(dict.fromkeys(slots))
+
+
+def is_experiment_protocol_section(section: SectionRecord) -> bool:
+    if section.section_type in {"experiments", "discussion"}:
+        return True
+    if section.section_type != "appendix":
+        return False
+    title = normalized_section_title(section.title)
+    return any(
+        token in title
+        for token in (
+            "dataset",
+            "baseline",
+            "metric",
+            "experiment",
+            "experimental",
+            "setup",
+            "result",
+            "ablation",
+            "sensitivity",
+            "complexity",
+            "online",
+        )
+    )
+
+
+def slot_payload(
+    *,
+    source_paper: str,
+    slot: str,
+    section: SectionRecord,
+    evidence_excerpt: str,
+    paragraph_index: int | None = None,
+    asset_type: str | None = None,
+    asset_id: str | None = None,
+) -> dict:
+    guidance = EXPERIMENT_PROTOCOL_SLOT_GUIDANCE[slot]
+    payload = {
+        "slot": slot,
+        "source_paper": source_paper,
+        "source_heading": section.title,
+        "source_section_type": section.section_type,
+        "paragraph_index": paragraph_index,
+        "evidence_excerpt": short_excerpt(evidence_excerpt),
+        "reader_question": guidance["reader_question"],
+        "placement_guidance": guidance["placement_guidance"],
+        "linked_main_section": "experiments" if slot.startswith("appendix_") else "",
+        "reuse_guidance": "Reuse this protocol role and placement logic only; do not copy wording, claims, metrics, data, or conclusions.",
+    }
+    if asset_type and asset_id:
+        payload["asset_type"] = asset_type
+        payload["asset_id"] = asset_id
+    return payload
+
+
+def build_experiment_protocol_slots_for_payload(payload: dict) -> list[dict]:
+    slots: list[dict] = []
+    sections_by_title = {section.title: section for section in payload["sections"]}
+
+    for role in payload["roles"]:
+        section = sections_by_title.get(role["section_title"])
+        if not section:
+            continue
+        if not is_experiment_protocol_section(section):
+            continue
+        role_slots = experiment_slots_from_signal(section, role["role"], role["excerpt"])
+        if not role_slots:
+            continue
+        for slot in role_slots:
+            slots.append(
+                slot_payload(
+                    source_paper=payload["slug"],
+                    slot=slot,
+                    section=section,
+                    paragraph_index=role["paragraph_index"],
+                    evidence_excerpt=role["excerpt"],
+                )
+            )
+
+    for asset in payload["assets"]:
+        section = sections_by_title.get(asset.appears_in_title)
+        if not section:
+            continue
+        if not is_experiment_protocol_section(section):
+            continue
+        asset_slots = experiment_slots_from_signal(section, asset.evidence_role, asset.caption)
+        if asset.evidence_role == "dataset_or_protocol" and section.section_type == "appendix":
+            asset_slots = ["appendix_dataset_statistics"]
+        if not asset_slots:
+            continue
+        for slot in asset_slots:
+            slots.append(
+                slot_payload(
+                    source_paper=payload["slug"],
+                    slot=slot,
+                    section=section,
+                    evidence_excerpt=asset.caption,
+                    asset_type=asset.asset_type,
+                    asset_id=asset.asset_id,
+                )
+            )
+
+    seen: set[tuple[str, str, str, str]] = set()
+    unique_slots: list[dict] = []
+    for item in slots:
+        key = (
+            item["source_paper"],
+            item["slot"],
+            item["source_heading"].lower(),
+            item["evidence_excerpt"].lower(),
+        )
+        if key in seen:
+            continue
+        seen.add(key)
+        unique_slots.append(item)
+    return unique_slots
+
+
 def build_section_templates(output_dir: Path, paper_payloads: list[dict]) -> None:
     target = output_dir / "section-templates"
     target.mkdir(parents=True, exist_ok=True)
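The hunk above leans on `normalized_section_title` to match headings across papers. Extracted verbatim from the diff, it can be tested standalone to confirm the three prefix-stripping regexes compose as intended:

```python
import re

def normalized_section_title(title: str) -> str:
    # Strip "4.1 "-style numeric prefixes, "B.2 "-style appendix prefixes,
    # and "Appendix A: "-style labels, then collapse whitespace and lowercase.
    title = re.sub(r"^\d+(?:\.\d+)*\.?\s+", "", title.strip())
    title = re.sub(r"^[A-Z]\.\d+(?:\.\d+)*\.?\s+", "", title)
    title = re.sub(r"^Appendix\s+[A-Z](?:\.\d+)*\.?\s*[:\-]?\s*", "", title, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", title).strip().lower()

print(normalized_section_title("5.1 Experimental Setup"))      # → experimental setup
print(normalized_section_title("Appendix B: Dataset Details"))  # → dataset details
print(normalized_section_title("B.2 Baseline Details"))         # → baseline details
```

Downstream, `experiment_slots_from_signal` concatenates this normalized title with the paragraph text before token matching, so a heading alone (e.g. "B.2 Baseline Details") is enough to trigger the `baseline_setup` slot.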
@@ -774,10 +1043,30 @@ def build_section_templates(output_dir: Path, paper_payloads: list[dict]) -> Non
             "asset_roles": asset_roles,
             "reuse_rule": "Reuse structure only; do not copy wording, claims, metrics, or conclusions from reference papers.",
         }
+        if section_type == "experiments":
+            protocol_slots: list[dict] = []
+            for payload in paper_payloads:
+                protocol_slots.extend(build_experiment_protocol_slots_for_payload(payload))
+            template["experiment_protocol_slots"] = protocol_slots
         (target / f"{section_type}.json").write_text(
             json.dumps(template, indent=2, ensure_ascii=False),
             encoding="utf-8",
         )
+        if section_type == "experiments":
+            (target / "experiments-protocol.json").write_text(
+                json.dumps(
+                    {
+                        "section": "experiments",
+                        "template_id": "experiments-protocol-slots",
+                        "source_papers": template["source_papers"],
+                        "experiment_protocol_slots": template["experiment_protocol_slots"],
+                        "reuse_rule": "Reuse experiment setup topology only: dataset, split, baseline, metric, implementation, result, ablation, sensitivity, and appendix-link roles.",
+                    },
+                    indent=2,
+                    ensure_ascii=False,
+                ),
+                encoding="utf-8",
+            )
 
 
 def build_visual_templates(output_dir: Path, paper_payloads: list[dict]) -> None:
@@ -824,9 +1113,10 @@ def write_aggregate_playbook(output_dir: Path, paper_payloads: list[dict]) -> No
         "## Multi-Template Write Procedure",
         "",
         "1. Pick 2-3 closest section templates for the current paper section.",
-        "2. Build a mini-outline from common slots and current-paper evidence.",
-        "3. Add required table/figure assets with local before/after bridge functions.",
-        "4. Draft with current-paper terminology and evidence only.",
+        "2. For experiment sections, preserve protocol slots when present: datasets, splits, baselines, metrics, implementation details, main results, ablations, sensitivity analysis, and appendix links.",
+        "3. Build a mini-outline from common slots and current-paper evidence.",
+        "4. Add required table/figure assets with local before/after bridge functions.",
+        "5. Draft with current-paper terminology and evidence only.",
         "",
         "## Table/Figure Planning Rule",
         "",
@@ -91,6 +91,14 @@ def has_meaningful_field_value(value: str) -> bool:
     return normalized not in {"", "-", "n/a", "na", "none", "no", "not applicable", "null", "false"}
 
 
+def latest_write_iteration(project_root: Path) -> Path | None:
+    iteration_dir = project_root / ".lab" / "writing" / "iterations"
+    if not iteration_dir.exists():
+        return None
+    iteration_files = sorted(iteration_dir.glob("*.md"))
+    return iteration_files[-1] if iteration_files else None
+
+
 SECTION_STYLE_WARNINGS = {
     "abstract": [
        (
@@ -432,6 +440,32 @@ def check_active_paper_topology(section_path: Path, issues: list[str]):
     issues.extend(validate_topology_artifacts(project_root))
 
 
+def check_reference_template_intake(section_path: Path, issues: list[str]):
+    project_root = find_project_root(section_path)
+    if project_root is None:
+        return
+
+    reference_root = project_root / ".lab" / "writing" / "reference-patterns"
+    aggregate_playbook = reference_root / "aggregate-template-playbook.md"
+    legacy_notes_root = project_root / ".lab" / "writing" / "pdf-structure-notes"
+    has_legacy_notes = legacy_notes_root.exists() and any(legacy_notes_root.glob("*"))
+
+    latest_iteration = latest_write_iteration(project_root)
+    reference_sources = ""
+    if latest_iteration:
+        iteration_text = read_text(latest_iteration)
+        reference_sources = extract_markdown_field(
+            iteration_text,
+            "Reference Template Intake",
+            "Reference sources used:",
+        )
+
+    if (has_legacy_notes or has_meaningful_field_value(reference_sources)) and not aggregate_playbook.exists():
+        issues.append(
+            "reference papers appear to be used without .lab/writing/reference-patterns/aggregate-template-playbook.md; run extract_reference_paper_structure.py and use structured section/visual templates instead of legacy pdf-structure-notes"
+        )
+
+
 def check_abstract(text: str, issues: list[str]):
     numbers = re.findall(r"\b\d+(?:\.\d+)?\b", text)
     if len(numbers) > 6:
@@ -672,6 +706,7 @@ def main():
     check_workflow_language_targeting(section_path, blocking_issues)
     check_common_section_gate_risks(text, warning_issues)
     check_section_style_policy(text, args.section, warning_issues)
+    check_reference_template_intake(section_path, warning_issues)
     SECTION_CHECKS[args.section](text, warning_issues)
     check_neighbor_asset_files(args.section, section_path, warning_issues)
 
@@ -19,6 +19,16 @@
 - Related work:
 - Method:
 - Experiments:
+  - Experiments protocol slots:
+    - Dataset descriptions:
+    - Dataset statistics / appendix link:
+    - Split or sampling protocol:
+    - Baseline setup:
+    - Metric definitions:
+    - Implementation and tuning details:
+    - Main results:
+    - Ablations:
+    - Sensitivity or analysis:
 - Conclusion:
 
 ## Visual/Table Templates
@@ -120,6 +120,8 @@ Run these on every round:
 - For every reference table or figure, extract what reader question it answers, which section/subsection it supports, why it is placed there, what the prose before it should do, and what the prose after it should explain.
 - When drafting from reference templates, reproduce structure and logic only. Do not copy wording, claims, metrics, baselines, data, captions, or conclusions from reference papers.
 - Before drafting a section from reference templates, read `.lab/writing/reference-patterns/aggregate-template-playbook.md`, the matching file under `.lab/writing/reference-patterns/section-templates/`, and the matching visual/table template under `.lab/writing/reference-patterns/visual-templates/` when the section uses tables or figures.
+- For experiment sections, also read `.lab/writing/reference-patterns/section-templates/experiments-protocol.json` when it exists and preserve its protocol slot logic: dataset descriptions, dataset statistics, split/sampling protocol, baseline setup, metric definitions, implementation or tuning details, main results, ablations, sensitivity analysis, and appendix-to-main links.
+- Do not accept coarse labels such as “setup” or “overall performance” as a complete experiment-template extraction when the source papers contain explicit dataset, baseline, metric, implementation, or appendix-detail paragraphs.
 - Build a compact mini-outline before prose.
 - Academic readability standards are the same in `workflow_language` and `paper_language`; changing languages must not lower external-reader clarity.
 - If the current round introduces or revises key terms, abbreviations, metric names, mechanism names, or system labels, explain them at first mention by briefly stating what they are and why they matter here.
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "superlab",
-  "version": "0.1.65",
+  "version": "0.1.66",
   "description": "Strict /lab research workflow installer for Codex and Claude",
   "keywords": [
     "codex",