superlab 0.1.68 → 0.1.70
- package/package-assets/claude/commands/lab/write.md +1 -0
- package/package-assets/claude/commands/lab-write.md +1 -0
- package/package-assets/claude/commands/lab:write.md +1 -0
- package/package-assets/claude/commands/lab：write.md +1 -0
- package/package-assets/codex/prompts/lab/write.md +1 -0
- package/package-assets/codex/prompts/lab-write.md +1 -0
- package/package-assets/codex/prompts/lab:write.md +1 -0
- package/package-assets/codex/prompts/lab：write.md +1 -0
- package/package-assets/shared/lab/.managed/scripts/validate_collaborator_report.py +45 -1
- package/package-assets/shared/lab/.managed/scripts/validate_manuscript_delivery.py +116 -0
- package/package-assets/shared/lab/.managed/scripts/validate_metric_glossary.py +191 -0
- package/package-assets/shared/lab/.managed/scripts/validate_section_draft.py +20 -0
- package/package-assets/shared/lab/.managed/templates/final-report.md +6 -0
- package/package-assets/shared/lab/.managed/templates/main-tables.md +6 -0
- package/package-assets/shared/lab/.managed/templates/metric-glossary.md +35 -0
- package/package-assets/shared/lab/.managed/templates/paper-table.tex +3 -3
- package/package-assets/shared/lab/.managed/templates/paper.tex +2 -0
- package/package-assets/shared/lab/.managed/templates/write-iteration.md +15 -0
- package/package-assets/shared/skills/lab/SKILL.md +9 -1
- package/package-assets/shared/skills/lab/references/paper-writing/section-style-policies.md +1 -0
- package/package-assets/shared/skills/lab/stages/report.md +3 -0
- package/package-assets/shared/skills/lab/stages/write.md +18 -1
- package/package.json +1 -1
@@ -8,4 +8,5 @@ Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
Execute the requested `/lab-write` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. 
Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. 
If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. 
If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
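The added line in this hunk asks for forbidden metric aliases to be removed from reader-facing LaTeX. The sketch below illustrates one way such a scan could work. It is not the packaged `validate_metric_glossary.py`, whose contents are not shown in this part of the diff; the function name `find_forbidden_aliases`, the whole-token matching rule, and the choice to skip LaTeX comments are all assumptions made for illustration.

```python
import re


def find_forbidden_aliases(latex_text, forbidden_aliases):
    """Return (line_number, alias) pairs for forbidden aliases in LaTeX text.

    Hypothetical helper: the matching rules here are assumptions, not the
    behavior of the packaged validator.
    """
    hits = []
    for line_number, line in enumerate(latex_text.splitlines(), start=1):
        # Drop LaTeX comments: everything after the first unescaped '%'.
        visible = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for alias in forbidden_aliases:
            # Match the alias only as a whole token, ignoring case.
            pattern = (
                r"(?<![A-Za-z0-9])" + re.escape(alias) + r"(?![A-Za-z0-9])"
            )
            if re.search(pattern, visible, flags=re.IGNORECASE):
                hits.append((line_number, alias))
    return hits
```

Called with the text of a section file and the forbidden aliases from the glossary, it returns the line numbers that still need renaming. A real validator would presumably also parse the glossary file itself, which this sketch leaves out.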
@@ -8,4 +8,5 @@ Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
Execute the requested `/lab-write` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. 
Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. 
If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. 
If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
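The context line in this hunk lists the files that make up the workflow-language LaTeX mirror. A minimal sketch of a layout check follows. The function name `missing_mirror_entries` and the idea of passing the mirror root as an argument are assumptions, since the diff does not say where the `workflow-language/` directory lives or whether any packaged script checks it.

```python
import tempfile
from pathlib import Path

# Entries named in the command text, relative to the workflow-language root.
REQUIRED_FILES = ["main.tex", "references.bib", "analysis/analysis-asset.tex"]
REQUIRED_GLOBS = ["sections/*.tex", "tables/*.tex", "figures/*.tex"]


def missing_mirror_entries(mirror_root):
    """Return required files and glob patterns with no match under mirror_root.

    Hypothetical helper for illustration only.
    """
    root = Path(mirror_root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [
        pattern for pattern in REQUIRED_GLOBS if not any(root.glob(pattern))
    ]
    return missing


# Demo on a throwaway directory holding only main.tex and one section file.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp) / "workflow-language"
    (demo_root / "sections").mkdir(parents=True)
    (demo_root / "main.tex").write_text("% placeholder\n")
    (demo_root / "sections" / "method.tex").write_text("% placeholder\n")
    demo_missing = missing_mirror_entries(demo_root)
```

An empty result means every listed file and directory pattern has at least one match. The check says nothing about whether the mirrored content is current.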
@@ -8,4 +8,5 @@ Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
Execute the requested `/lab-write` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. 
Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. 
If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. 
If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
@@ -8,4 +8,5 @@ Use the installed `lab` skill at `.claude/skills/lab/SKILL.md`.
Execute the requested `/lab-write` command against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `write` stage of the lab workflow. Use `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. 
Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. 
If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. 
If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
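A minimal sketch of the metric-glossary rule added in this hunk: it checks that one glossary entry carries every field the prompt lists and scans reader-facing LaTeX for forbidden aliases. It is not the packaged `validate_metric_glossary.py`, whose behavior is not shown in this hunk. The dictionary layout, field keys, function names, and the sample entry are all assumptions made for illustration.

```python
import re

# Hypothetical field keys that mirror the fields named in the prompt text.
REQUIRED_FIELDS = [
    "paper_facing_name",
    "approved_short_name",
    "table_header_label",
    "definition",
    "calculation",
    "unit_or_denominator",
    "direction",
    "scope_or_conditions",
    "allowed_aliases",
    "forbidden_aliases",
    "first_use_location",
]


def missing_fields(entry):
    """Return required fields that are absent or empty in one glossary entry."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if field.endswith("_aliases"):
            # Alias lists may be empty, but the key itself must be present.
            if value is None:
                missing.append(field)
        elif not value:
            missing.append(field)
    return missing


def find_forbidden_aliases(latex_text, entry):
    """Return forbidden aliases that occur as whole words in the LaTeX text."""
    hits = []
    for alias in entry.get("forbidden_aliases", []):
        pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
        if re.search(pattern, latex_text, flags=re.IGNORECASE):
            hits.append(alias)
    return hits


# Made-up entry, used only to exercise the two helpers.
example_entry = {
    "paper_facing_name": "task success rate",
    "approved_short_name": "TSR",
    "table_header_label": "Success (%)",
    "definition": "Share of evaluation tasks completed correctly.",
    "calculation": "successful tasks / attempted tasks",
    "unit_or_denominator": "percent of attempted tasks",
    "direction": "higher is better",
    "scope_or_conditions": "held-out test split",
    "allowed_aliases": ["success rate"],
    "forbidden_aliases": ["win rate", "accuracy"],
    "first_use_location": "Experiments, first paragraph",
}

print(missing_fields(example_entry))
print(find_forbidden_aliases("Table 1 reports the win rate of each method.", example_entry))
```

A check of this shape would run before prose polish, so that a forbidden alias is caught while the round can still fix it.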
@@ -7,4 +7,5 @@ Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+
When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Resolve the active paper topology from `.lab/config/workflow.json` before drafting: the active canonical root is `<deliverables_root>/paper/`, and when workflow-language is active its root is `<deliverables_root>/paper/workflow-language/`. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. Classify the named target path before editing it. Only active-layer targets count as managed manuscript rounds; legacy side layers such as `review_zh`, `translation_zh`, `sections_zh`, or stale `deliverables/.../workflow-language/*.md` paths are out-of-band/legacy edits and must not silently replace the active paper topology. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. 
If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. 
Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. 
Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, the active canonical/workflow-language roots, the resolved target path role, any out-of-band justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. 
If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
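The "classify the named target path before editing it" step in the hunk above could look roughly like the following. This is a sketch under assumptions, not the package's implementation: the function name, the role strings, and the rule that any `.md` file sitting directly in a `workflow-language` directory counts as stale are all hypothetical. Only the two root layouts and the legacy directory names come from the prompt text.

```python
from pathlib import PurePosixPath

# Legacy side-layer directory names taken from the prompt text.
LEGACY_DIR_NAMES = {"review_zh", "translation_zh", "sections_zh"}


def _is_under(path, root):
    """True when `path` lies inside `root` (works on older Python versions too)."""
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def classify_target(target, deliverables_root, workflow_language_active):
    """Return a hypothetical role label for a user-named target path."""
    path = PurePosixPath(target)
    paper_root = PurePosixPath(deliverables_root) / "paper"
    workflow_root = paper_root / "workflow-language"

    # Deprecated side layers are never part of the active topology.
    if LEGACY_DIR_NAMES.intersection(path.parts):
        return "legacy"
    # Assumption: Markdown files directly inside a workflow-language
    # directory are stale, because the active layer is a LaTeX mirror.
    if path.suffix == ".md" and path.parent.name == "workflow-language":
        return "legacy"
    if _is_under(path, workflow_root):
        return "workflow-language" if workflow_language_active else "out-of-band"
    if _is_under(path, paper_root):
        return "canonical"
    return "out-of-band"


print(classify_target("out/paper/workflow-language/sections/method.tex", "out", True))
print(classify_target("docs/lab/paper/review_zh/method.tex", "out", True))
```

Only the `canonical` and `workflow-language` results would count as managed manuscript rounds; the other two labels would need an out-of-band justification in the write iteration artifact.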
@@ -7,4 +7,5 @@ Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+
When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Resolve the active paper topology from `.lab/config/workflow.json` before drafting: the active canonical root is `<deliverables_root>/paper/`, and when workflow-language is active its root is `<deliverables_root>/paper/workflow-language/`. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. Classify the named target path before editing it. Only active-layer targets count as managed manuscript rounds; legacy side layers such as `review_zh`, `translation_zh`, `sections_zh`, or stale `deliverables/.../workflow-language/*.md` paths are out-of-band/legacy edits and must not silently replace the active paper topology. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. 
If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. 
Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. 
Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, the active canonical/workflow-language roots, the resolved target path role, any out-of-band justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. 
If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
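The section-level acceptance gate described in the hunk above can be read as a small decision rule: any unresolved gate item or leftover banned expression turns the round into a fix round instead of another polish pass. The sketch below encodes that reading. The item keys, function names, and return labels are assumptions for illustration and do not come from the package.

```python
# Hypothetical keys for the six confirmations the gate must make.
GATE_ITEMS = (
    "naming_consistency",
    "adjacent_section_consistency",
    "claim_metric_ranking_consistency",
    "local_clarity",
    "local_concision",
    "section_style_compliance",
)


def unresolved_items(gate):
    """Return gate items that were not explicitly confirmed as True."""
    return [item for item in GATE_ITEMS if gate.get(item) is not True]


def next_step(gate, banned_expressions_found):
    """Decide whether the round may polish or must fix blockers first."""
    blockers = unresolved_items(gate)
    blockers += ["banned expression: " + expr for expr in banned_expressions_found]
    if blockers:
        return "fix-blockers", blockers
    return "polish-allowed", []


# A gate with one open item blocks further polishing.
gate = {item: True for item in GATE_ITEMS}
gate["local_clarity"] = False
print(next_step(gate, []))
```

Treating a missing key the same as an explicit `False` keeps the gate strict: an item counts as resolved only when it was actually confirmed.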
@@ -7,4 +7,5 @@ Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+
When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Resolve the active paper topology from `.lab/config/workflow.json` before drafting: the active canonical root is `<deliverables_root>/paper/`, and when workflow-language is active its root is `<deliverables_root>/paper/workflow-language/`. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. Classify the named target path before editing it. Only active-layer targets count as managed manuscript rounds; legacy side layers such as `review_zh`, `translation_zh`, `sections_zh`, or stale `deliverables/.../workflow-language/*.md` paths are out-of-band/legacy edits and must not silently replace the active paper topology. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. 
If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. 
Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. 
Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, the active canonical/workflow-language roots, the resolved target path role, any out-of-band justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. 
If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
@@ -7,4 +7,5 @@ Use the installed `lab` skill at `.codex/skills/lab/SKILL.md`.
Execute the requested `/lab:write` stage against the user's argument now. Do not only recommend another lab stage. If a blocking prerequisite is missing, say exactly what is missing and ask at most one clarifying question.
When the user provides reference PDFs, paper URLs, local reference-paper paths, or asks to write by reference, stay within the write stage and switch to reference-guided deep-write. Extract structure, map section/subsection slots, paragraph roles, table/figure roles, and bridge logic to the current paper, record the consumption plan, and only then draft prose. The current section must visibly realize the mapped slots; do not treat a consumption plan as enough. Reuse structure only; do not copy wording, claims, metrics, captions, or conclusions. Keep service-style or AI-assistant meta language and workflow-only placeholder language out of paper-facing prose.
+When Method, Experiments, captions, tables, or analysis assets introduce or revise reported metrics, create or update `.lab/writing/metric-glossary.md` before prose polish. Each metric must define its paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location. Use the same metric names across prose, captions, table notes, table headers, and result summaries. Run `validate_metric_glossary.py` and remove forbidden aliases from reader-facing LaTeX before finalizing the round.
This command runs the `/lab:write` stage. Use `.codex/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section references, validator gates, asset coverage, and final manuscript rules. Read the matching paper-writing reference, the current section block in `section-style-policies.md`, and any bundled example-bank files for the requested section, revise only one section, and keep draft rounds warning-only while final-draft or export rounds must satisfy the write-stage acceptance gates. Draft ordinary manuscript rounds in `workflow_language`, and ordinary `.tex` section drafts must stay in `workflow_language` instead of treating `paper_language` as the default draft language. When `workflow_language` and `paper_language` differ, treat the workflow-language paper layer as the default ordinary working layer. Resolve the active paper topology from `.lab/config/workflow.json` before drafting: the active canonical root is `<deliverables_root>/paper/`, and when workflow-language is active its root is `<deliverables_root>/paper/workflow-language/`. Ordinary write rounds should still edit one target paper layer at a time rather than silently refreshing both language layers. If the user names a concrete file or layer, treat that as the only target for the round unless they also explicitly request synchronization. Classify the named target path before editing it. Only active-layer targets count as managed manuscript rounds; legacy side layers such as `review_zh`, `translation_zh`, `sections_zh`, or stale `deliverables/.../workflow-language/*.md` paths are out-of-band/legacy edits and must not silently replace the active paper topology. If a workflow-language paper layer is active and the round still targets the canonical manuscript, record why canonical-only writing was acceptable in the write iteration artifact. 
If `paper_language_finalization_decision=convert-to-paper-language`, explicit canonical-manuscript work may target the canonical `paper_language` manuscript, but that does not make canonical the default ordinary working layer while workflow-language remains active. Treat the workflow-language paper layer as a real persisted artifact rather than a review layer, and preserve it as a full LaTeX mirror with `workflow-language/main.tex`, `workflow-language/references.bib`, `workflow-language/sections/*.tex`, `workflow-language/tables/*.tex`, `workflow-language/figures/*.tex`, and `workflow-language/analysis/analysis-asset.tex`. Do not write new workflow-language output to deprecated review-layer paths such as `docs/lab/paper/review_zh/`. Maintain `.lab/writing/terminology-glossary.md` as the write-stage glossary for full forms, approved short forms, reader-facing explanations, and aliases. Apply the same academic readability standard in every language: when the round introduces or revises key terms, abbreviations, metrics, mechanism names, or system labels, use the full form first, define any short form at first mention, explain what the term is and why it matters here, keep one natural-language paper-facing name per concept, use natural-language full names in prose, do not use labels containing `_` or `-` in reader-facing prose, apply the same first-mention rule to table headers, table captions, table notes, and figure captions or labels, do not assume a fixed drafting order such as Method before Experiments, add a local naming bridge when a section uses canonical short names before their defining section has been drafted, and reuse the canonical label instead of replacing it with a narrative alias. Follow the current section's encouraged, discouraged, and banned expression lists from `section-style-policies.md`; section-specific banned expressions take priority over prose-polish goals. 
Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first. That gate must explicitly confirm naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance. If the round changes the paper's canonical experiment or evaluation protocol, treat that change as a canonical replacement unless the user explicitly scoped it as supplementary or appendix-only, run a paper-wide impact audit before more polishing, update the highest-impact stale sections and assets first, and do not default to translation/workflow-layer sync work unless the user explicitly asked for it or the language-finalization workflow requires it. Only edit both the canonical manuscript and the workflow-language paper layer in the same round when the user explicitly asks for cross-language synchronization or when a final-draft/export language-finalization step requires both layers to be refreshed together. Do not treat a routine tighten/compress/polish request as an instruction to sync the workflow-language companion. For export or remote-publication rounds, if `paper_language_finalization_decision=convert-to-paper-language`, include the workflow-language paper layer in the exported or pushed bundle by default. Allow canonical-only export or remote publication only when the user explicitly asked for it or when the remote target forbids extra files. If any gate item is unresolved, or if a banned expression or move from the current section policy remains, spend the round fixing that blocker instead of polishing sentences further, and do not default the next-step recommendation to another polish pass. Main tables must be locally self-contained: the title, header, note, and adjacent prose should tell the reader what each row and column means, the metric direction, and any relevant unit, denominator, or event condition. 
Short headers remain allowed, but abbreviations in paper-facing tables must be expanded locally in the same table. If Method or Experiments prose promises a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why. If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that disposition explicitly in the table note instead of silently dropping it. Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table. Fit main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort, keep width changes readable, and explain the width-control rationale locally in the same table note. Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. Keep internal identifiers out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose, and record the terminology-clarity self-check, the section-level acceptance gate, section-style policy compliance, the protocol/scope impact audit, the export or remote bundle audit, the round target layer, any canonical-only justification while workflow-language was active, any cross-language sync justification, the active canonical/workflow-language roots, the resolved target path role, any out-of-band justification, and the table-semantics audit in the write iteration artifact. If the manuscript would start from the managed scaffold and no template decision is recorded yet, ask once whether to keep the default scaffold or attach a template directory first. 
If finalization reaches a round where `workflow_language` and `paper_language` differ, finish and preserve the workflow-language paper layer first, then ask once whether to keep the draft language or convert the canonical manuscript to `paper_language`, persist that answer, record both the language decision and the workflow-language paper-layer path in the latest write iteration, and only then edit the final manuscript in the chosen language.
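As a rough illustration of the validator step named in the added paragraph, the sketch below assembles a command line for `validate_metric_glossary.py` using the three flags its argument parser defines later in this diff (`--metric-glossary`, `--tex-file`, `--mode`). The script location and the `.tex` paths are assumptions made for illustration; the installed layout is not shown here.

```python
import sys


def build_glossary_check(script: str, glossary: str, tex_files: list[str], mode: str) -> list[str]:
    """Assemble argv for the metric-glossary validator (flag names taken from the diff)."""
    if mode not in ("draft", "final"):
        raise ValueError("mode must be 'draft' or 'final'")
    command = [sys.executable, script, "--metric-glossary", glossary]
    for tex_file in tex_files:
        # --tex-file is declared with action="append", so it repeats once per file.
        command += ["--tex-file", tex_file]
    return command + ["--mode", mode]


# Hypothetical paths: adjust to wherever the managed scripts and paper layers live.
argv = build_glossary_check(
    script=".lab/.managed/scripts/validate_metric_glossary.py",
    glossary=".lab/writing/metric-glossary.md",
    tex_files=["paper/sections/method.tex", "paper/tables/main.tex"],
    mode="draft",
)
print(" ".join(argv[1:]))
```

Passing `argv` to `subprocess.run` would execute the check. Per the script's `main()`, `draft` mode prints warnings and exits 0, while `final` mode exits 1 on any issue.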
@@ -51,6 +51,21 @@ SOURCE_SECTION_PATH_MARKERS = (
 SOURCE_SECTION_CITATION_MARKERS = ("Citation:", "引用:")
 SOURCE_SECTION_ROLE_MARKERS = ("What it established:", "What it does:", "What it measures:", "做了什么:", "衡量什么:")
 SOURCE_SECTION_LIMITATION_MARKERS = ("Limitation", "局限")
+METRIC_GUIDE_DETAIL_MARKERS = {
+    "evaluation target": ("Evaluation target:", "What is evaluated:", "评估对象:", "评估什么:"),
+    "test-set prediction": ("Test-set prediction used:", "Prediction used:", "测试集预测:", "预测量:"),
+    "ranking or grouping": ("Ranking or grouping step:", "Ranking step:", "Grouping step:", "排序或分组:", "排序步骤:", "分组步骤:"),
+    "calculation sketch": (
+        "Aggregation / calculation sketch:",
+        "Calculation sketch:",
+        "Approximate calculation:",
+        "大致计算:",
+        "近似公式:",
+        "聚合方式:",
+    ),
+    "direction and scale": ("Direction and scale:", "Metric direction:", "方向与尺度:", "方向:", "越高/越低:"),
+    "comparability boundary": ("Comparability boundary:", "What not to compare:", "可比性边界:", "不能比较:"),
+}
 
 
 def parse_args():
@@ -99,6 +114,35 @@ def validate_source_sections(text: str, label: str) -> list[str]:
     return issues
 
 
+def has_marker_with_value(body: str, markers: tuple[str, ...]) -> bool:
+    for line in body.splitlines():
+        stripped = line.strip()
+        for marker in markers:
+            if marker not in stripped:
+                continue
+            value = stripped.split(marker, 1)[1].strip()
+            if value and value not in {"-", "—", "TODO", "TBD", "待补", "待定"}:
+                return True
+    return False
+
+
+def validate_metric_guide_detail(text: str, label: str) -> list[str]:
+    body = extract_section_body(text, REPORT_REQUIRED_SECTIONS["Metric Guide"])
+    if not body:
+        return []
+    missing = [
+        detail_name
+        for detail_name, markers in METRIC_GUIDE_DETAIL_MARKERS.items()
+        if not has_marker_with_value(body, markers)
+    ]
+    if not missing:
+        return []
+    return [
+        f"{label} section 'Metric Guide' must explain metric computation details: "
+        f"{', '.join(missing)}"
+    ]
+
+
 def validate(path_str: str, required_sections: dict[str, list[str]], label: str) -> list[str]:
     path = Path(path_str)
     if not path.exists():
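To make the new Metric Guide check concrete, here is a small self-contained sketch of the marker-with-value test applied to a hypothetical report excerpt. The function body is restated from the hunk above; the sample text, the `missing_details` helper, and the reduced marker table are invented for illustration.

```python
# Placeholder values that do not count as a real explanation ("\u2014" is the long dash).
PLACEHOLDERS = {"-", "\u2014", "TODO", "TBD", "待补", "待定"}

# Reduced marker table for illustration; the table in the diff has six detail names.
MARKERS = {
    "evaluation target": ("Evaluation target:", "What is evaluated:"),
    "direction and scale": ("Direction and scale:", "Metric direction:"),
    "comparability boundary": ("Comparability boundary:", "What not to compare:"),
}


def has_marker_with_value(body: str, markers: tuple[str, ...]) -> bool:
    """True when any marker appears on a line and is followed by a non-placeholder value."""
    for line in body.splitlines():
        stripped = line.strip()
        for marker in markers:
            if marker not in stripped:
                continue
            value = stripped.split(marker, 1)[1].strip()
            if value and value not in PLACEHOLDERS:
                return True
    return False


def missing_details(body: str) -> list[str]:
    """List the detail names whose markers are absent or still hold placeholders."""
    return [name for name, markers in MARKERS.items() if not has_marker_with_value(body, markers)]


# Hypothetical Metric Guide excerpt.
sample = """
- Evaluation target: top-10 ranked items per query
- Direction and scale: TODO
"""
print(missing_details(sample))
```

Under these assumptions the second bullet counts as missing because `TODO` is a placeholder, and the third detail is missing because its marker never appears. The placeholder comparison is exact, so a lowercase `todo` would be accepted as a value.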
@@ -108,7 +152,7 @@ def validate(path_str: str, required_sections: dict[str, list[str]], label: str)
     if missing:
         return [f"{label} is missing required sections: {', '.join(missing)}"]
     if label == "report.md":
-        return validate_source_sections(text, label)
+        return validate_source_sections(text, label) + validate_metric_guide_detail(text, label)
     return []
 
 
@@ -38,6 +38,7 @@ REQUIRED_TABLE_NOTE_MARKERS = (
     "% Important caveat:",
 )
 WIDTH_CONTROL_NOTE_MARKER = "% Width control:"
+WIDE_PLAIN_TABULAR_COLUMN_LIMIT = 7
 TABLE_ABBREVIATION_EXCEPTIONS = {"TODO", "TBD"}
 PLACEHOLDER_TABLE_NOTE_PREFIXES = (
     "explain ",
@@ -97,6 +98,109 @@ def contains_any(text: str, needles: tuple[str, ...]) -> bool:
     return any(needle.lower() in lowered for needle in needles)
 
 
+def read_braced_group(text: str, start: int) -> tuple[str, int] | None:
+    if start >= len(text) or text[start] != "{":
+        return None
+    depth = 0
+    content_start = start + 1
+    for index in range(start, len(text)):
+        char = text[index]
+        if char == "{":
+            depth += 1
+        elif char == "}":
+            depth -= 1
+            if depth == 0:
+                return text[content_start:index], index + 1
+    return None
+
+
+def skip_whitespace(text: str, index: int) -> int:
+    while index < len(text) and text[index].isspace():
+        index += 1
+    return index
+
+
+def extract_plain_tabular_specs(text: str) -> list[str]:
+    specs: list[str] = []
+    needle = r"\begin{tabular}"
+    search_from = 0
+    while True:
+        index = text.find(needle, search_from)
+        if index == -1:
+            return specs
+        spec_start = skip_whitespace(text, index + len(needle))
+        group = read_braced_group(text, spec_start)
+        if group is not None:
+            specs.append(group[0])
+            search_from = group[1]
+        else:
+            search_from = index + len(needle)
+
+
+def count_column_spec(spec: str) -> tuple[int, bool]:
+    count = 0
+    has_width_aware_column = False
+    index = 0
+    while index < len(spec):
+        char = spec[index]
+        if char in "lcr":
+            count += 1
+            index += 1
+            continue
+        if char == "X":
+            count += 1
+            has_width_aware_column = True
+            index += 1
+            continue
+        if char in "pmb":
+            count += 1
+            has_width_aware_column = True
+            index = skip_whitespace(spec, index + 1)
+            if index < len(spec) and spec[index] == "{":
+                group = read_braced_group(spec, index)
+                index = group[1] if group is not None else index + 1
+            continue
+        if char == "*":
+            index = skip_whitespace(spec, index + 1)
+            repeat_group = read_braced_group(spec, index)
+            if repeat_group is None:
+                continue
+            repeat_text, index = repeat_group
+            index = skip_whitespace(spec, index)
+            repeated_spec_group = read_braced_group(spec, index)
+            if repeated_spec_group is None:
+                continue
+            repeated_spec, index = repeated_spec_group
+            try:
+                repeat_count = int(repeat_text.strip())
+            except ValueError:
+                repeat_count = 1
+            nested_count, nested_width_aware = count_column_spec(repeated_spec)
+            count += repeat_count * nested_count
+            has_width_aware_column = has_width_aware_column or nested_width_aware
+            continue
+        if char in "@!<>":
+            index = skip_whitespace(spec, index + 1)
+            if index < len(spec) and spec[index] == "{":
+                group = read_braced_group(spec, index)
+                index = group[1] if group is not None else index + 1
+            continue
+        index += 1
+    return count, has_width_aware_column
+
+
+def has_width_control_command(text: str) -> bool:
+    return any(
+        token in text
+        for token in (
+            r"\begin{tabularx}",
+            r"\begin{tabular*}",
+            r"\resizebox{",
+            r"\setlength{\tabcolsep}",
+        )
+    )
+
+
 def find_workflow_config(start_path: Path) -> Path | None:
     search_roots = [start_path, *start_path.parents]
     for root in search_roots:
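The column-counting helpers above can be hard to follow in diff form, so here is a condensed sketch of the same idea: walk a LaTeX column spec, count `l`/`c`/`r`/`X`/`p`/`m`/`b` columns, skip decoration groups such as `@{...}` and `>{...}`, and expand `*{n}{...}` repeats. This is a simplified restatement, not the packaged code. The names `read_group`, `count_columns`, and `is_wide_plain` are hypothetical, whitespace skipping is omitted, and the file-level width-control check is left out.

```python
from typing import Optional


def read_group(spec: str, start: int) -> Optional[tuple[str, int]]:
    """Return (content, index_after_close) for a balanced {...} group at `start`."""
    if start >= len(spec) or spec[start] != "{":
        return None
    depth = 0
    for i in range(start, len(spec)):
        if spec[i] == "{":
            depth += 1
        elif spec[i] == "}":
            depth -= 1
            if depth == 0:
                return spec[start + 1 : i], i + 1
    return None


def count_columns(spec: str) -> tuple[int, bool]:
    """Count columns in a column spec and report whether any column is width-bounded."""
    count, width_aware, i = 0, False, 0
    while i < len(spec):
        ch = spec[i]
        i += 1
        if ch in "lcr":
            count += 1
        elif ch in "Xpmb":
            count += 1
            width_aware = True
            group = read_group(spec, i)  # p{2cm}, m{...}, b{...}; X takes no argument
            if ch != "X" and group:
                i = group[1]
        elif ch in "@!<>":
            group = read_group(spec, i)  # decorations contribute no column
            if group:
                i = group[1]
        elif ch == "*":
            times = read_group(spec, i)
            body = read_group(spec, times[1]) if times else None
            if times and body:
                n = int(times[0]) if times[0].strip().isdigit() else 1
                inner, inner_aware = count_columns(body[0])
                count += n * inner
                width_aware = width_aware or inner_aware
                i = body[1]
    return count, width_aware


def is_wide_plain(spec: str, limit: int = 7) -> bool:
    """Mirror the spec-level half of the new check: many columns, none width-bounded."""
    count, width_aware = count_columns(spec)
    return count >= limit and not width_aware


for spec in ("lccc", r">{\raggedright\arraybackslash}Xcc", "l*{6}{c}", "@{}lp{2cm}r@{}"):
    print(spec, count_columns(spec), is_wide_plain(spec))
```

In the diff, a spec that trips this condition is only reported when the file also lacks `tabularx`, `tabular*`, `\resizebox{`, or a `\tabcolsep` adjustment, so the sketch is stricter than the real check.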
@@ -315,6 +419,18 @@ def check_table_file(path: Path, issues: list[str], label: str):
             continue
         if value < 3.0:
             issues.append(f"{label} sets \\tabcolsep below the safe range for paper-facing main tables")
+    for spec in extract_plain_tabular_specs(text):
+        column_count, has_width_aware_column = count_column_spec(spec)
+        if (
+            column_count >= WIDE_PLAIN_TABULAR_COLUMN_LIMIT
+            and not has_width_aware_column
+            and not has_width_control_command(text)
+        ):
+            issues.append(
+                f"{label} uses a wide plain tabular layout ({column_count} columns) without a width-aware strategy; "
+                "use tabularx or p columns, split the table, move secondary metrics to appendix, "
+                "or document last-resort width control"
+            )
 
 
 def check_figure_file(path: Path, issues: list[str], label: str):
@@ -0,0 +1,191 @@
+#!/usr/bin/env python3
+from __future__ import annotations
+
+import argparse
+import re
+import sys
+from dataclasses import dataclass
+from pathlib import Path
+
+
+REQUIRED_HEADINGS = (
+    "## Metric Rules",
+    "## Metrics",
+    "## Audit",
+)
+
+REQUIRED_METRIC_FIELDS = (
+    "Paper-facing name",
+    "Approved short name",
+    "Table/header label",
+    "Plain-language definition",
+    "Calculation",
+    "Unit or denominator",
+    "Direction",
+    "Scope / conditions",
+    "First-use location",
+)
+
+REQUIRED_ALIAS_FIELDS = (
+    "Allowed aliases",
+    "Forbidden aliases",
+)
+
+EMPTY_VALUES = {
+    "",
+    "todo",
+    "tbd",
+    "待补",
+    "待定",
+    "none",
+    "n/a",
+    "na",
+    "null",
+    "unknown",
+    "未定",
+}
+
+
+@dataclass
+class MetricEntry:
+    heading: str
+    fields: dict[str, str]
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description="Validate reader-facing metric names and definitions.")
+    parser.add_argument("--metric-glossary", required=True, help="Path to .lab/writing/metric-glossary.md")
+    parser.add_argument("--tex-file", action="append", default=[], help="Paper-facing .tex file to scan for forbidden metric aliases")
+    parser.add_argument("--mode", required=True, choices=("draft", "final"))
+    return parser.parse_args()
+
+
+def read_text(path: Path) -> str:
+    return path.read_text(encoding="utf-8")
+
+
+def is_empty_value(value: str) -> bool:
+    normalized = value.strip().strip("`").strip().lower()
+    return normalized in EMPTY_VALUES
+
+
+def has_required_heading(text: str, heading: str) -> bool:
+    return re.search(rf"^{re.escape(heading)}\s*$", text, flags=re.MULTILINE) is not None
+
+
+def section_block(text: str, heading: str) -> str:
+    heading_name = heading.removeprefix("## ").strip()
+    match = re.search(
+        rf"^##\s+{re.escape(heading_name)}\s*$([\s\S]*?)(?=^##\s+|\Z)",
+        text,
+        flags=re.MULTILINE,
+    )
+    return match.group(1) if match else ""
+
+
+def parse_metric_entries(text: str) -> list[MetricEntry]:
+    metrics = section_block(text, "## Metrics")
+    entries: list[MetricEntry] = []
+    for match in re.finditer(r"^###\s+(.+?)\s*$([\s\S]*?)(?=^###\s+|\Z)", metrics, flags=re.MULTILINE):
+        heading = match.group(1).strip()
+        body = match.group(2)
+        fields: dict[str, str] = {}
+        for line in body.splitlines():
+            field_match = re.match(r"^\s*-\s*([^::]+)[::]\s*(.*)\s*$", line)
+            if field_match:
+                fields[field_match.group(1).strip()] = field_match.group(2).strip()
+        entries.append(MetricEntry(heading=heading, fields=fields))
+    return entries
+
+
+def split_aliases(value: str) -> list[str]:
+    aliases = []
+    for alias in re.split(r"[,;,;、\n]+", value):
+        cleaned = alias.strip().strip("`").strip()
+        if cleaned and not is_empty_value(cleaned):
+            aliases.append(cleaned)
+    return aliases
+
+
+def contains_alias(text: str, alias: str) -> bool:
+    if re.search(r"[A-Za-z0-9_]", alias):
+        pattern = rf"(?<![A-Za-z0-9_]){re.escape(alias)}(?![A-Za-z0-9_])"
+        return re.search(pattern, text, flags=re.IGNORECASE) is not None
+    return alias in text
+
+
+def validate_glossary(text: str) -> tuple[list[MetricEntry], list[str]]:
+    issues: list[str] = []
+    for heading in REQUIRED_HEADINGS:
+        if not has_required_heading(text, heading):
+            issues.append(f"metric glossary missing required heading: {heading}")
+
+    entries = parse_metric_entries(text)
+    if not entries:
+        issues.append("metric glossary must define at least one metric under ## Metrics")
+        return entries, issues
+
+    for entry in entries:
+        for field in REQUIRED_METRIC_FIELDS:
+            value = entry.fields.get(field, "")
+            if is_empty_value(value):
+                issues.append(f"metric '{entry.heading}' missing required field value: {field}")
+        for field in REQUIRED_ALIAS_FIELDS:
+            if field not in entry.fields:
+                issues.append(f"metric '{entry.heading}' missing required field: {field}")
+
+    return entries, issues
+
+
+def validate_forbidden_aliases(entries: list[MetricEntry], tex_files: list[str]) -> list[str]:
+    issues: list[str] = []
+    for tex_file in tex_files:
+        tex_path = Path(tex_file)
+        if not tex_path.exists():
+            issues.append(f"tex file does not exist: {tex_path}")
+            continue
+        text = read_text(tex_path)
+        for entry in entries:
+            canonical = entry.fields.get("Paper-facing name") or entry.heading
+            table_label = entry.fields.get("Table/header label", "")
+            for alias in split_aliases(entry.fields.get("Forbidden aliases", "")):
+                if contains_alias(text, alias):
+                    issues.append(
+                        "forbidden metric alias "
+                        f"'{alias}' appears in {tex_path}; use '{canonical}'"
+                        + (f" or table/header label '{table_label}'" if table_label else "")
+                        + " instead"
+                    )
+    return issues
+
+
+def main() -> int:
+    args = parse_args()
+    glossary_path = Path(args.metric_glossary)
+    if not glossary_path.exists():
+        message = f"metric glossary does not exist: {glossary_path}"
+        if args.mode == "draft":
+            print(f"WARNING: {message}")
+            return 0
+        print(message, file=sys.stderr)
+        return 1
+
+    entries, issues = validate_glossary(read_text(glossary_path))
+    issues.extend(validate_forbidden_aliases(entries, args.tex_file))
+
+    if not issues:
+        print("metric glossary is valid")
+        return 0
+
+    if args.mode == "draft":
+        for issue in issues:
+            print(f"WARNING: {issue}")
+        return 0
+
+    for issue in issues:
+        print(issue, file=sys.stderr)
+    return 1
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
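The forbidden-alias scan in the new script hinges on `contains_alias`, which matches an alias only as a whole token when the alias contains ASCII word characters. The sketch below restates that function from the diff and runs it on a hypothetical LaTeX sentence; the sentence and the aliases are invented for illustration.

```python
import re


def contains_alias(text: str, alias: str) -> bool:
    """Match `alias` case-insensitively as a whole token when it has ASCII word characters."""
    if re.search(r"[A-Za-z0-9_]", alias):
        # Lookarounds reject matches that sit inside a longer identifier or word.
        pattern = rf"(?<![A-Za-z0-9_]){re.escape(alias)}(?![A-Za-z0-9_])"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None
    # Aliases without ASCII word characters fall back to a plain substring test.
    return alias in text


# Hypothetical reader-facing sentence.
tex = r"The attack success rate (ASR) falls, while \textbf{hit-rate} stays flat."
for alias in ("asr", "hit-rate", "AS"):
    print(alias, contains_alias(tex, alias))
```

Here `asr` and `hit-rate` are found because punctuation and braces count as token boundaries, while `AS` is not, since its only occurrence is inside `ASR`. The scan reads the raw `.tex` text, so an alias that appears only in a LaTeX comment or a label would presumably still be reported.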
@@ -241,6 +241,22 @@ INTERNAL_EXPERIMENT_PROVENANCE_PHRASES = (
     "调参运行",
     "调参轮次",
 )
+INTERNAL_EXPERIMENT_PLANNING_PATTERNS = (
+    r"current\s+[\d.]+\s+only\s+shows?.*need(?:s|ed)?\s+(?:a\s+)?(?:new\s+)?holdout",
+    r"(?:new|additional)\s+holdout\s+(?:and|or)\s+(?:more\s+)?natural(?:ized)?\s+(?:payload|attack|statement)",
+    r"(?:small[- ]batch|pilot[- ]batch).*(?:gate|gating)",
+    r"(?:freeze|freezing).*(?:payload|attack statement|trigger)",
+    r"(?:api|API).*(?:budget|cost|scale)",
+    r"新增\s*(?:holdout|外部|样本|实验).*验证",
+    r"还需要\s*新增.*验证",
+    r"后文.*边界",
+    r"当前\s*[\d.]+\s*只能说明.*不能外推.*(?:还需要|需要)",
+    r"小批量.*(?:门控|gate)",
+    r"(?:冻结|固定).*(?:payload|载荷|攻击语句|触发语句)",
+    r"(?:不能|不得).*边跑边调",
+    r"API\s*(?:规模|预算|成本)",
+    r"(?:按设计|设计上).*(?:失败|不通过).*(?:过拟合|调参)",
+)
 INTERNAL_CONFIG_LABEL_PATTERN = re.compile(
     r"\b[a-z]{1,4}\d+(?:[-_][a-z]?\d+(?:\.\d+)?){1,4}\b",
     flags=re.IGNORECASE,
@@ -265,6 +281,10 @@ def check_common_section_gate_risks(text: str, issues: list[str]):
         issues.append(
             "reader-facing prose appears to contain internal experiment provenance or tuning/config labels; move run provenance to workflow notes or map it to paper-facing diagnostic terminology"
         )
+    if any(re.search(pattern, prose_text, flags=re.IGNORECASE) for pattern in INTERNAL_EXPERIMENT_PLANNING_PATTERNS):
+        issues.append(
+            "reader-facing prose appears to contain internal experiment planning or holdout-expansion rationale; keep plans, gates, payload-freezing notes, and future validation logistics in workflow artifacts instead of the manuscript"
+        )
     if contains_any(
         prose_text,
         (
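As a quick way to see what the planning-language gate catches, the sketch below copies three of the English patterns from the hunk above and applies them with the same `re.search(..., flags=re.IGNORECASE)` call. The sample sentences and the `flags_planning_language` name are hypothetical.

```python
import re

# Three patterns copied verbatim from INTERNAL_EXPERIMENT_PLANNING_PATTERNS.
PLANNING_PATTERNS = (
    r"(?:small[- ]batch|pilot[- ]batch).*(?:gate|gating)",
    r"(?:freeze|freezing).*(?:payload|attack statement|trigger)",
    r"(?:api|API).*(?:budget|cost|scale)",
)


def flags_planning_language(prose: str) -> bool:
    """True when any planning pattern matches anywhere in the prose."""
    return any(re.search(pattern, prose, flags=re.IGNORECASE) for pattern in PLANNING_PATTERNS)


# Hypothetical manuscript sentences.
samples = (
    "We freeze the payload before the holdout run.",
    "Accuracy improves on all three benchmarks.",
    "Capital cost was not modeled.",
)
for sentence in samples:
    print(flags_planning_language(sentence), sentence)
```

The first sentence is flagged as intended and the second passes. The third is flagged as well, because the unanchored `api` alternative matches inside "Capital" before "cost". Reviewing flagged sentences by hand may therefore be worthwhile before rewriting them.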
@@ -51,6 +51,12 @@
 - Primary metric plain-language explanation:
 - Secondary metric plain-language explanation:
 - Health or support metrics and why they are not the main claim:
+- Evaluation target:
+- Test-set prediction used:
+- Ranking or grouping step:
+- Aggregation / calculation sketch:
+- Direction and scale:
+- Comparability boundary:
 
 ## Background Sources
 
@@ -17,6 +17,12 @@
 - Primary metric plain-language explanation:
 - Secondary metric plain-language explanation:
 - Health or support metrics and how to read them:
+- Evaluation target:
+- Test-set prediction used:
+- Ranking or grouping step:
+- Aggregation / calculation sketch:
+- Direction and scale:
+- Comparability boundary:
 
 ## Final Performance Summary
 
@@ -0,0 +1,35 @@
+# Metric Glossary
+
+Use this glossary during `/lab:write` to keep reported metrics understandable and consistent across prose, captions, table headers, and table notes.
+
+## Metric Rules
+
+- Define every reported metric before using a short name or table label.
+- Every metric needs a paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location.
+- Use the same metric name in Method, Experiments, captions, table notes, and result summaries.
+- If the denominator, event condition, score scale, or comparison scope changes, either define a separate metric entry or make the scope explicit in the metric definition.
+- Short table headers are allowed only when the table note or adjacent prose locally resolves the metric.
+- Forbidden aliases are deprecated reader-facing names. Remove them from prose, captions, tables, and analysis assets.
+
+## Metrics
+
+### Metric
+
+- Paper-facing name:
+- Approved short name:
+- Table/header label:
+- Plain-language definition:
+- Calculation:
+- Unit or denominator:
+- Direction:
+- Scope / conditions:
+- Allowed aliases:
+- Forbidden aliases:
+- First-use location:
+
+## Audit
+
+- Metric naming drift found this round:
+- Metrics used in prose, captions, tables, or figures but missing from this glossary:
+- Deprecated aliases removed this round:
+- Metrics that still need a clearer calculation or denominator:
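The glossary template lists eleven fields per metric entry. A minimal sketch of how a filled-in glossary could be checked for empty or absent fields is shown here. It is not the packaged `validate_metric_glossary.py`, whose contents are not visible in this part of the diff; the function name `missing_fields` and the parsing rules (one `### ` heading per metric, one `- Field: value` line per field) are assumptions made for illustration.

```python
import re

# Field labels copied from the "### Metric" block of the glossary template.
REQUIRED_FIELDS = [
    "Paper-facing name",
    "Approved short name",
    "Table/header label",
    "Plain-language definition",
    "Calculation",
    "Unit or denominator",
    "Direction",
    "Scope / conditions",
    "Allowed aliases",
    "Forbidden aliases",
    "First-use location",
]

# Assumed entry layout: "- <field label>: <value>" on a single line.
FIELD_LINE = re.compile(r"^- ([^:]+):\s*(.*)$")


def missing_fields(glossary_text):
    """Map each '### ' metric heading to the required fields left empty or absent."""
    entries = {}  # heading -> {field label: value}
    current = None
    for line in glossary_text.splitlines():
        line = line.strip()
        if line.startswith("### "):
            current = line[4:].strip()
            entries[current] = {}
        elif line.startswith("## "):
            current = None  # left the metric entries, e.g. entered "## Audit"
        elif current is not None:
            match = FIELD_LINE.match(line)
            if match:
                entries[current][match.group(1).strip()] = match.group(2).strip()
    return {
        heading: [name for name in REQUIRED_FIELDS if not fields.get(name)]
        for heading, fields in entries.items()
    }
```

Run against the unfilled template, this sketch would report all eleven fields as missing for the placeholder `### Metric` entry, which matches the intent that template text alone is not a complete glossary.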
@@ -2,18 +2,18 @@
 \caption{One-sentence message of the table and the evaluation protocol.}
 \label{tab:placeholder}
 \centering
-\begin{
+\begin{tabularx}{\linewidth}{>{\raggedright\arraybackslash}Xcc}
 \toprule
 Method & Metric 1 $\uparrow$ & Metric 2 $\uparrow$ \\
 \midrule
 Ours & 0.0000 & 0.0000 \\
 Baseline & 0.0000 & 0.0000 \\
 \bottomrule
-\end{
+\end{tabularx}
 % Rows: explain what each row represents.
 % Columns: explain what each column represents and its direction.
 % Metric definitions: expand local abbreviations, units, denominators, or event conditions.
 % Comparison scope: explain which setting, split, attack family, or benchmark scope this table covers.
 % Important caveat: state any omitted metrics, zero-valued metrics, or appendix-only reporting decision.
-% Width control: first shorten headers, move secondary metrics out of the main table, and reduce or split columns; only then adjust \setlength{\tabcolsep}{...} conservatively or use \resizebox{\linewidth}{!}{...} as a documented last resort.
+% Width control: default to bounded columns with tabularx or p{...}; first shorten headers, move secondary metrics out of the main table, and reduce or split columns; only then adjust \setlength{\tabcolsep}{...} conservatively or use \resizebox{\linewidth}{!}{...} as a documented last resort.
 \end{table}
@@ -51,6 +51,17 @@
 - Did any alias drift remain unresolved:
 - Remaining reader-facing jargon risk:
 
+## Metric Glossary
+
+- Was `.lab/writing/metric-glossary.md` required this round:
+- Metric glossary path:
+- Metrics introduced or revised:
+- Did each metric include paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location:
+- Did prose, captions, table headers, table notes, and result summaries use the same metric names:
+- Deprecated or forbidden aliases removed:
+- Metrics used in prose or tables but missing from the glossary:
+- Validator command and result:
+
 ## Section Acceptance Gate
 
 - Canonical naming consistency passed:
@@ -75,6 +86,9 @@
 - Were all abbreviations expanded at local first mention:
 - Did each main table include a local table note:
 - Can a reader interpret rows and columns without chasing Method:
+- Table width audit:
+- Did any main table use a wide plain `tabular` layout:
+- If width control was needed, was the table first shortened, split, moved partly to appendix, or converted to `tabularx` / bounded columns before using `\tabcolsep` or `\resizebox`:
 - If this section used canonical short names before their defining section, was a local naming bridge added:
 - Did model and ablation labels stay canonical instead of drifting into narrative aliases:
 
@@ -130,6 +144,7 @@
 - Did the round avoid copying reference wording, claims, metrics, captions, or conclusions:
 - Did final prose avoid service-style or AI-assistant meta language:
 - Did final prose avoid workflow-only placeholder language:
+- Did final prose avoid internal experiment planning, future-holdout logistics, gates, payload-freezing notes, API-budget notes, and automation triage language:
 - Validator command and result:
 
 ## Decision
@@ -210,6 +210,7 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Read `.lab/context/mission.md`, `.lab/context/state.md`, `.lab/context/workflow-state.md`, `.lab/context/decisions.md`, `.lab/context/evidence-index.md`, and `.lab/context/data-decisions.md` before drafting.
 - Read `.lab/context/eval-protocol.md` before choosing tables, thresholds, or final result framing.
 - Keep metric definitions, comparison semantics, and implementation references anchored to the approved evaluation protocol instead of re-deriving them during reporting.
+- In `report.md`, explain each primary metric with a computation guide: what is evaluated, which test-set predictions or scores are used, whether examples are sorted, grouped, bucketed, or paired, how the value is aggregated or approximately calculated, what direction and scale mean, and what cannot be compared across datasets, splits, or implementations.
 - Aggregate them with `.lab/.managed/scripts/summarize_iterations.py`.
 - Write the final document with `.lab/.managed/templates/final-report.md`, the managed table summary with `.lab/.managed/templates/main-tables.md`, and the internal handoff with `.lab/.managed/templates/artifact-status.md`.
 - Keep failed attempts and limitations visible.
@@ -268,11 +269,16 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - Short table headers are allowed, but any abbreviation in a paper-facing table must be expanded locally in the same table.
 - Local table notes must be filled with real reader-facing explanations; default template text such as "explain what each row represents" or "expand local abbreviations" is still incomplete.
 - If a metric is measured but omitted because it is zero everywhere, redundant, or appendix-only, state that decision explicitly in the table note instead of silently dropping it.
+- Maintain `.lab/writing/metric-glossary.md` whenever the paper reports empirical metrics. Each metric needs a paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location.
+- Use the same metric names across Method, Experiments, captions, table headers, table notes, and result summaries; remove forbidden aliases from reader-facing LaTeX instead of letting legacy metric names drift.
+- Run `.lab/.managed/scripts/validate_metric_glossary.py` in metric-bearing draft, final-draft, or export rounds and record the result in the latest write iteration artifact.
 - Do not treat `\resizebox{\linewidth}{!}{...}` as the default main-table fit strategy.
--
+- Wide plain `tabular` layouts with many columns are not manuscript-ready by default; prefer `tabularx` or bounded `p{...}` columns for text-heavy or multi-metric tables.
+- Fit paper-facing main tables by redesign first: shorten headers, move secondary metrics out of the main table, reduce or split columns, prefer `tabularx` or bounded columns, then adjust `\tabcolsep` conservatively; only use `\resizebox` as a last resort and document why.
 - Keep `\tabcolsep` adjustments conservative and avoid shrinking below a roughly readable floor for paper-facing main tables.
 - Do not rely on `\scriptsize` or `\tiny` as the default way to make a main table fit.
 - Keep internal identifiers, tuning-run labels, probe names, config strings, rerun ids, and package labels out of prose unless they are mapped once for the reader and then moved back out of prose.
+- Keep internal experiment planning out of manuscript prose: future holdout expansion, small-batch gates, payload freezing, API budgets, automation decisions, and overfitting triage logic belong in lab artifacts, not paper-facing sections.
 - Do not rely on unexplained jargon density as a substitute for academic tone.
 - Bind each claim to evidence from `report`, iteration reports, or normalized summaries.
 - Use the write-stage contract in `.codex/skills/lab/stages/write.md` or `.claude/skills/lab/stages/write.md` as the single source of truth for template choice, paper-plan requirements, section-specific references, validator calls, asset coverage, and final manuscript gates.
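The rule about removing forbidden aliases from reader-facing LaTeX can be pictured as a text scan. The following is a sketch under stated assumptions, not the behavior of the packaged validator: the function name `find_forbidden_aliases` is hypothetical, aliases are matched as whole words without regard to case, and text after an unescaped `%` is treated as a LaTeX comment and ignored.

```python
import re


def find_forbidden_aliases(tex_text, forbidden_aliases):
    """Return (line_number, alias) pairs for forbidden aliases found outside comments."""
    hits = []
    for number, line in enumerate(tex_text.splitlines(), start=1):
        # Drop LaTeX comments: everything from the first unescaped '%' onward.
        visible = re.split(r"(?<!\\)%", line, maxsplit=1)[0]
        for alias in forbidden_aliases:
            # Whole-word match so that "acc" does not fire inside "accuracy".
            pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
            if re.search(pattern, visible, flags=re.IGNORECASE):
                hits.append((number, alias))
    return hits
```

The alias list would come from the `Forbidden aliases` field of each glossary entry. Whether comments should be exempt is a design choice; the rule above only names reader-facing text.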
@@ -282,7 +288,9 @@ Use this skill when the user invokes `/lab:*` or asks for the structured researc
 - For each subsection, explicitly cover motivation, design, and technical advantage when applicable.
 - Keep terminology stable across rounds and sections.
 - Maintain `.lab/writing/terminology-glossary.md` as the write-stage source for full forms, approved short forms, reader-facing explanations, allowed aliases, and terms that should stay out of prose.
+- Maintain `.lab/writing/metric-glossary.md` as the write-stage source for reported metric names, definitions, calculations, denominators, directions, scopes, and aliases.
 - When a round introduces or revises key terms, include a compact terminology note in the user-facing write summary and record the terminology-clarity self-check in the latest write iteration artifact.
+- When a round introduces or revises metrics, include a compact metric-glossary note in the user-facing write summary and record the metric-glossary validation in the latest write iteration artifact.
 - Record the section-level acceptance gate in the latest write iteration artifact before recommending another tighten/compress/polish pass on the same section.
 - Record section-style policy compliance, any retained discouraged move, and any banned move found in the latest write iteration artifact.
 - Record the round target layer in the latest write iteration artifact as `canonical manuscript`, `workflow-language paper layer`, or `both`.
@@ -122,6 +122,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 - Self-evaluations such as "结果也很清楚", "the defense results are very clear", or "the table is self-explanatory".
 - Layout-process commentary in scientific prose, such as "由于表列较多,这里采用页宽自适应排版" or "we use page-width adaptive layout here".
 - Claims that a table "proves" something when the evidence only supports a bounded empirical result.
+- Internal experiment-planning prose, such as "还需要新增 holdout", "小批量门控", "冻结 payload", "不能边跑边调", "API 规模估计", or "if all scores are 1.0000, treat it as overfitting".
 - Service-style or AI-assistant meta language such as "用户说", "按你的要求", "我来解释", "let me explain", or "as requested by the user".
 - Workflow-only placeholder language such as "图的意图", "资产意图", "占位符", "workflow-language", or "sync this wording".
 
@@ -10,6 +10,7 @@
 - method overview
 - selected metrics summary
 - plain-language metric guide
+- metric computation guide that explains what is evaluated, which test-set predictions are used, whether examples are sorted or grouped, how values are aggregated or approximately calculated, metric direction and scale, and comparability boundaries
 - background sources
 - method and baseline sources
 - metric sources
@@ -52,6 +53,8 @@
 - Do not restate metric definitions, baseline behavior, or comparison implementations from memory; use the approved evaluation protocol and its recorded sources.
 - Carry the approved `Primary metrics`, `Secondary metrics`, and `Required terminal evidence` into both the report and the managed main-tables artifact.
 - Explain the selected primary and secondary metrics in plain language for the user: what each metric measures, whether higher or lower is better, and whether it is a main result metric or only a health/support metric.
+- For every primary metric, also explain enough of the computation for a collaborator to reproduce the idea without reading code: what is evaluated, which test-set predictions or scores are used, whether the examples are sorted, bucketed, grouped, or paired, how the resulting values are aggregated or approximately calculated, what direction and scale mean, and which comparisons are invalid across datasets, splits, or metric implementations.
+- If a metric depends on ranking, the report must name the ranking score and the order. If it depends on a contrast, the report must name the compared conditions or groups. If it depends on an average, rate, area, threshold crossing, or recovery amount, the report must give a simple calculation sketch.
 - If coverage, completeness, confidence, or similar health metrics appear, explicitly say that they describe experimental reliability rather than the main scientific effect.
 - Pull the core background references, method or baseline references, and metric references out of the approved evaluation protocol instead of hiding them in `.lab/context/*`.
 - Treat `report.md` as an external-review-ready memo. Source sections must not rely on local file paths or internal provenance notes; they must give a few human-readable anchor references instead.
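To make the "simple calculation sketch" requirement concrete, here is what such a sketch might look like for one ranking-dependent metric. Precision at k is chosen purely as an illustration; the source does not name any specific metric, and the function names are hypothetical. The ranking score is the model score, the order is descending, and the aggregation is a mean over queries.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored test examples whose label is 1."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of examples")
    # Ranking step: sort example indices by model score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Aggregation step: count positives among the top k and divide by k.
    return sum(labels[i] for i in ranked[:k]) / k


def mean_precision_at_k(queries, k):
    """Average precision-at-k over (scores, labels) pairs, one pair per query."""
    values = [precision_at_k(scores, labels, k) for scores, labels in queries]
    return sum(values) / len(values)
```

A report entry built on this sketch would state that the value lies between 0 and 1, that higher is better, and that numbers computed with different k or on different test splits are not comparable.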
@@ -5,6 +5,7 @@
 - stable `report` artifact
 - approved framing artifact at `.lab/writing/framing.md`
 - current terminology glossary at `.lab/writing/terminology-glossary.md` when it already exists
+- current metric glossary at `.lab/writing/metric-glossary.md` when the paper reports empirical metrics
 - iteration reports
 - normalized summaries
 - reviewer notes when available
@@ -28,6 +29,7 @@
 - `.lab/context/data-decisions.md`
 - `.lab/context/terminology-lock.md`
 - `.lab/writing/terminology-glossary.md` when it exists
+- `.lab/writing/metric-glossary.md` when it exists or the current section reports metrics
 - `.lab/.managed/rule-manifest.json`
 
 ## Rule Preflight
@@ -163,6 +165,8 @@ Do not enter prose polish until the current section has passed the reference-con
 - Do not use labels containing `_` or `-` in reader-facing prose.
 - Keep internal identifiers, config keys, and experiment package labels out of reader-facing prose unless they are mapped once for the reader and then moved back out of prose.
 - Keep run provenance such as tuning-run labels, probe names, internal config strings, rerun ids, and package labels out of reader-facing prose. If the evidence is useful, rewrite it as a bounded paper-facing diagnostic or move the raw provenance to workflow notes or appendix metadata.
+- Keep internal experiment planning out of reader-facing prose. Do not write paper sentences that explain future holdout expansion, small-batch gates, payload freezing, API budget, "if all scores are 1.0000 then treat as overfitting", or why a next automation round is needed.
+- When an experiment boundary matters, report only the scientific scope already supported by the evidence. Put the operational plan for collecting new attacks, new papers, new markers, or additional holdout cases into `.lab/changes/`, `.lab/iterations/`, or report artifacts, not into manuscript sections.
 - Do not use unexplained terminology density as a substitute for academic tone.
 - Keep service-style or AI-assistant meta language out of manuscript prose. Phrases such as "用户说", "按你的要求", "我来解释", "下面我", "this version", or "as requested by the user" belong in workflow notes, not in paper-facing sections, captions, table notes, or analysis assets.
 - Keep workflow-only placeholder language out of manuscript prose. Phrases such as "图的意图", "资产意图", "占位符", "workflow-language", "translation layer", or "sync this wording" belong in authoring artifacts, not in reader-facing LaTeX.
@@ -170,11 +174,18 @@ Do not enter prose polish until the current section has passed the reference-con
 - Short headers remain allowed, but they must be resolved locally through the same table's caption or table note instead of forcing the reader to chase the Method section.
 - If the Method or Experiments prose says the paper reports a metric family, the main table set must either expose those metrics directly or explicitly mark the missing ones as appendix-only and explain why.
 - If a metric is measured but omitted because it is uniformly zero, redundant, or appendix-only, state that disposition explicitly in the caption or table note instead of silently dropping it.
+- Maintain `.lab/writing/metric-glossary.md` whenever the paper reports metrics or the round introduces or revises Method metric descriptions, Experiments metrics, captions, table notes, table headers, or analysis assets.
+- Every metric glossary entry must include paper-facing name, approved short name, table/header label, plain-language definition, calculation, unit or denominator, direction, scope or conditions, allowed aliases, forbidden aliases, and first-use location.
+- Use the same metric names across prose, captions, table notes, table headers, and result summaries. Do not let one metric drift across paper-facing names, shorthand, table headers, and legacy aliases.
+- If a metric's denominator, event condition, score scale, or comparison scope differs by setting, define a separate entry or explicitly scope the metric in `.lab/writing/metric-glossary.md`.
+- Deprecated or forbidden metric aliases must be removed from reader-facing LaTeX instead of explained away locally.
 - Do not treat `\resizebox{\linewidth}{!}{...}` as the default way to fit a main table.
--
+- Wide plain `tabular` layouts with many columns are not manuscript-ready by default; final/export validation should force a width-aware table design instead of silently accepting likely overfull tables.
+- Main-table width control should follow this order: shorten headers while preserving local explanations, move secondary metrics to appendix-only, reduce or split columns, prefer `tabularx` or bounded `p{...}` columns, adjust `\tabcolsep` conservatively, and only then consider `\resizebox` as a last resort.
 - When `\tabcolsep` is adjusted for a paper-facing main table, keep it in a safe range and avoid shrinking below roughly `3pt`; prefer `4pt` or `5pt` when a small reduction is enough.
 - Do not use `\scriptsize` or `\tiny` as the default main-table fit strategy. If a table only fits after aggressive font shrinking, redesign the table instead of forcing it into the page.
 - If a paper-facing main table uses `\resizebox` or non-default width control, explain the width-control rationale in the same table note.
+- Prefer `tabularx` for paper-facing main tables whose first column or text-heavy columns need bounded line wrapping; use plain `tabular` only for compact tables with a small column count.
 - Every main table should have a short table-introduction sentence before it and a short interpretation sentence after it so the reader knows what question the table answers and how to read the result.
 - Build the paper asset plan before prose when the section carries introduction, experimental, method, related-work, or conclusion claims:
 - record the asset coverage targets and gaps for the current paper
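The "wide plain `tabular`" rule could be approximated with a column count over the column specification. This is a heuristic sketch, not the packaged validator's logic: the function names are hypothetical, the default threshold assumes the "seven or more columns" wording used for table assets in the same stage file, and `*{n}{...}` repetition in a column spec is not expanded.

```python
import re

# Matches plain tabular only; tabularx and tabular* end differently and are skipped.
BEGIN_TABULAR = re.compile(r"\\begin\{tabular\}")


def _read_group(text, start):
    """Return (contents, index after the closing brace) of the brace group opening at start."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:i], i + 1
    raise ValueError("unbalanced braces in column specification")


def count_columns(spec):
    """Count columns in a simple column spec such as 'lcc' or '@{}l p{3cm} r@{}'."""
    count, i = 0, 0
    while i < len(spec):
        ch = spec[i]
        if ch in "lcrX":
            count += 1
            i += 1
        elif ch in "pmb":
            count += 1  # bounded column; skip its width argument
            _, i = _read_group(spec, spec.index("{", i))
        elif ch in "@!<>":
            # Inter-column material or column hooks: skip the argument, add no column.
            _, i = _read_group(spec, spec.index("{", i))
        else:
            i += 1  # vertical rules, spaces, anything unrecognised
    return count


def wide_plain_tabulars(tex_text, max_columns=6):
    """Column counts of plain tabular environments with more than max_columns columns."""
    wide = []
    for match in BEGIN_TABULAR.finditer(tex_text):
        spec, _ = _read_group(tex_text, tex_text.index("{", match.end()))
        columns = count_columns(spec)
        if columns > max_columns:
            wide.append(columns)
    return wide
```

A count like this can only flag candidates for redesign. Whether a table is actually overfull depends on header length and font, which is why the rules above put redesign ahead of `\tabcolsep` or `\resizebox`.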
@@ -188,6 +199,7 @@ Do not enter prose polish until the current section has passed the reference-con
 - When the repository workflow config is available, the paper-plan validator also checks that `.lab/writing/plan.md` stays in `workflow_language` instead of silently drifting into another language.
 - If the paper-plan validator fails, stop and fill `.lab/writing/plan.md` first instead of drafting prose.
 - During ordinary draft rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode draft` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode draft` after revising the active section.
+- During ordinary draft rounds that introduce or revise metrics, also run `.lab/.managed/scripts/validate_metric_glossary.py --metric-glossary .lab/writing/metric-glossary.md --tex-file <section-or-table-file> --mode draft`.
 - If reference-guided deep-write was triggered, also run `.lab/.managed/scripts/validate_reference_consumption.py --section <section> --section-file <section-file> --mode draft` after revising the active section.
 - Treat draft-round output from the section and claim validators as warnings that must be recorded and addressed in the write-iteration artifact, not as immediate stop conditions.
 - If the active section already lives under a paper-layer `sections/` directory, the draft section validator should also warn when the neighboring required figure or analysis placeholder files are still missing from that same paper layer.
@@ -213,10 +225,12 @@ Do not enter prose polish until the current section has passed the reference-con
 - Table assets must also include a local table note that explains row meaning, column meaning, metric definitions, comparison scope, and any important caveat.
 - The local table note must contain real reader-facing explanations, not the default template phrases such as "explain what each row represents" or "expand local abbreviations".
 - Table assets must not rely on aggressive width hacks by default; if width control is still needed after table redesign, document it locally and keep it readable.
+- Table assets with seven or more columns should be split, moved partly to appendix, or written with width-aware columns such as `tabularx` or `p{...}` instead of a plain `tabular` layout.
 - Figure placeholders may record what the final figure should show and why the reader needs it in authoring comments, the paper plan, or the write-iteration artifact, but the caption itself must remain paper-facing and must not contain "Figure intent", "图的意图", "asset intent", "占位符", or similar workflow language.
 - Core asset coverage for a paper-facing final draft should include a problem-setting or teaser figure, a method overview figure, a results overview figure, a main-results table, an ablation table, and one additional analysis asset.
 - Keep `.lab/writing/plan.md` synchronized with the current table plan, figure plan, citation plan, and section-to-asset map whenever manuscript assets change.
 - For final-draft or export rounds, run `.lab/.managed/scripts/validate_section_draft.py --section <section> --section-file <section-file> --mode final` and `.lab/.managed/scripts/validate_paper_claims.py --section-file <section-file> --mode final` before accepting the round.
+- For final-draft or export rounds with reported metrics, run `.lab/.managed/scripts/validate_metric_glossary.py --metric-glossary .lab/writing/metric-glossary.md --tex-file <section-file> --mode final` before accepting the round. Add relevant table, figure, or analysis `.tex` files with repeated `--tex-file` when they contain metric names.
 - If reference-guided deep-write was triggered, run `.lab/.managed/scripts/validate_reference_consumption.py --section <section> --section-file <section-file> --mode final` before accepting the final-draft or export round.
 - If the final-round section or claim validators fail, keep editing the affected section until it passes; do not stop at asset-complete but rhetorically weak or unsafe prose.
 - Final-round section validation should fail when a section in the paper layer references required figure or analysis placeholders but the neighboring asset files are still missing from that layer.
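The repeated `--tex-file` convention can be assembled programmatically when a round touches several LaTeX files. The sketch below only builds the argument list and a printable command line; it does not run the validator. The flags `--metric-glossary`, `--tex-file`, and `--mode` are taken from the rules above, while the helper names, the `python` launcher, and the example file paths are assumptions. Only the two modes shown in the source, `draft` and `final`, are accepted here.

```python
import shlex


def metric_glossary_command(tex_files, mode, glossary=".lab/writing/metric-glossary.md"):
    """Build the argv list for one validator run, with one --tex-file flag per LaTeX file."""
    if mode not in ("draft", "final"):
        raise ValueError("mode must be 'draft' or 'final'")
    if not tex_files:
        raise ValueError("pass at least one LaTeX file")
    argv = [
        "python",  # assumption: the script is launched through a Python interpreter
        ".lab/.managed/scripts/validate_metric_glossary.py",
        "--metric-glossary", glossary,
    ]
    for tex_file in tex_files:
        argv += ["--tex-file", tex_file]
    argv += ["--mode", mode]
    return argv


def as_shell_line(argv):
    """Quote argv so the command can be pasted into the write-iteration artifact."""
    return " ".join(shlex.quote(part) for part in argv)
```

For example, `as_shell_line(metric_glossary_command(["sections/experiments.tex", "tables/main.tex"], "final"))` yields a single line that could be recorded under "Validator command and result"; both file paths are hypothetical.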
@@ -227,7 +241,9 @@ Do not enter prose polish until the current section has passed the reference-con
 - Run a LaTeX compile smoke test when a local LaTeX toolchain is available; if not available, record the missing verification in the write iteration artifact.
 - Record what changed and why in a write-iteration artifact.
 - Maintain `.lab/writing/terminology-glossary.md` as the write-stage source for full forms, approved short forms, reader-facing explanations, and aliases whenever terminology changes.
+- Maintain `.lab/writing/metric-glossary.md` as the write-stage source for metric names, calculations, denominators, directions, scopes, and aliases whenever empirical metrics are reported.
 - When a round introduces or revises key terms, include a compact terminology note in the user-facing round summary and record the terminology-clarity self-check in the write-iteration artifact.
+- When a round introduces or revises metrics, include a compact metric-glossary note in the user-facing round summary and record the metric-glossary validation in the write-iteration artifact.
 - Record the section-level acceptance gate in the write-iteration artifact before recommending further tightening on the same section.
 - Record section-style policy compliance, any retained discouraged move, and any banned move found in the write-iteration artifact.
 - Record the round target layer in the write-iteration artifact as `canonical manuscript`, `workflow-language paper layer`, or `both`.
@@ -250,6 +266,7 @@ Do not enter prose polish until the current section has passed the reference-con
 - `.lab/writing/framing.md`
 - `.lab/writing/plan.md`
 - `.lab/writing/terminology-glossary.md`
+- `.lab/writing/metric-glossary.md` when the paper reports empirical metrics
 - `.lab/writing/reference-patterns/consumption-plan/<section>.md` when reference-guided deep-write is triggered
 - `.lab/writing/iterations/<n>.md`
 - `<deliverables_root>/paper/main.tex`