superlab 0.1.71 → 0.1.72
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/lib/i18n.cjs +89 -6
- package/lib/lab_idea_contract.json +4 -4
- package/lib/lab_write_contract.json +1 -1
- package/package-assets/claude/commands/lab/idea.md +1 -1
- package/package-assets/claude/commands/lab/report.md +1 -0
- package/package-assets/claude/commands/lab/write.md +1 -0
- package/package-assets/claude/commands/lab-idea.md +1 -1
- package/package-assets/claude/commands/lab-report.md +1 -0
- package/package-assets/claude/commands/lab-write.md +1 -0
- package/package-assets/claude/commands/lab:idea.md +1 -1
- package/package-assets/claude/commands/lab:report.md +1 -0
- package/package-assets/claude/commands/lab:write.md +1 -0
- package/package-assets/claude/commands/lab：idea.md +1 -1
- package/package-assets/claude/commands/lab：report.md +1 -0
- package/package-assets/claude/commands/lab：write.md +1 -0
- package/package-assets/codex/prompts/lab/idea.md +1 -1
- package/package-assets/codex/prompts/lab/report.md +1 -0
- package/package-assets/codex/prompts/lab/write.md +1 -1
- package/package-assets/codex/prompts/lab-idea.md +1 -1
- package/package-assets/codex/prompts/lab-report.md +1 -0
- package/package-assets/codex/prompts/lab-write.md +1 -1
- package/package-assets/codex/prompts/lab:idea.md +1 -1
- package/package-assets/codex/prompts/lab:report.md +1 -0
- package/package-assets/codex/prompts/lab:write.md +1 -1
- package/package-assets/codex/prompts/lab：idea.md +1 -1
- package/package-assets/codex/prompts/lab：report.md +1 -0
- package/package-assets/codex/prompts/lab：write.md +1 -1
- package/package-assets/shared/lab/.managed/scripts/validate_collaborator_report.py +55 -1
- package/package-assets/shared/lab/.managed/scripts/validate_idea_artifact.py +75 -0
- package/package-assets/shared/lab/.managed/scripts/validate_section_draft.py +119 -0
- package/package-assets/shared/lab/.managed/scripts/validate_stage_report.py +246 -0
- package/package-assets/shared/lab/.managed/templates/final-report.md +11 -0
- package/package-assets/shared/lab/.managed/templates/idea.md +18 -0
- package/package-assets/shared/lab/.managed/templates/main-tables.md +6 -0
- package/package-assets/shared/lab/.managed/templates/paper-plan.md +9 -0
- package/package-assets/shared/lab/.managed/templates/stage-report.md +19 -0
- package/package-assets/shared/lab/.managed/templates/write-iteration.md +13 -0
- package/package-assets/shared/skills/lab/SKILL.md +18 -0
- package/package-assets/shared/skills/lab/references/paper-writing/abstract.md +14 -0
- package/package-assets/shared/skills/lab/references/paper-writing/conclusion.md +13 -0
- package/package-assets/shared/skills/lab/references/paper-writing/experiments.md +19 -0
- package/package-assets/shared/skills/lab/references/paper-writing/introduction.md +17 -2
- package/package-assets/shared/skills/lab/references/paper-writing/method.md +10 -0
- package/package-assets/shared/skills/lab/references/paper-writing/section-style-policies.md +10 -1
- package/package-assets/shared/skills/lab/stages/auto.md +20 -0
- package/package-assets/shared/skills/lab/stages/data.md +3 -0
- package/package-assets/shared/skills/lab/stages/framing.md +3 -0
- package/package-assets/shared/skills/lab/stages/idea.md +33 -19
- package/package-assets/shared/skills/lab/stages/iterate.md +3 -0
- package/package-assets/shared/skills/lab/stages/report.md +11 -0
- package/package-assets/shared/skills/lab/stages/review.md +3 -0
- package/package-assets/shared/skills/lab/stages/run.md +3 -0
- package/package-assets/shared/skills/lab/stages/spec.md +3 -0
- package/package-assets/shared/skills/lab/stages/write.md +12 -0
- package/package.json +1 -1
@@ -57,6 +57,20 @@ Introduce the technical challenge, then use one to two sentences to present the
 4. The technical term must be easy to understand; do not create a jump in reading.
 5. This ability is very important for writing a good abstract.
 
+## Insight Anchor Rule
+
+Use one mechanism-level insight sentence as the abstract hinge. The sentence should explain why the problem behaves as it does, not only name the proposed method.
+
+Good shape:
+
+1. `We observe that [surface failure], suggesting that [root mechanism].`
+2. `This motivates [method/design], which [technical effect].`
+
+Avoid:
+
+1. A standalone "Our insight is..." sentence that is disconnected from the challenge.
+2. A method-name sentence that could be deleted without changing the reader's understanding.
+
 ## Version 3: Multiple Contributions
 
 Version 3: When there are multiple technical contributions, describe each contribution together with its technical advantage.
@@ -12,6 +12,19 @@ Close the paper with clear takeaways and credible limitations.
 4. Add limitation paragraph.
 5. End with concrete future direction.
 
+## Insight Closeout
+
+The conclusion should not introduce a new insight. It should restate the same core insight anchor as a supported takeaway and turn it into a broader principle or action implication.
+
+Use this order:
+
+1. Evidence-backed takeaway.
+2. Broader principle implied by the takeaway.
+3. Boundary that prevents overclaiming.
+4. One future direction that follows from the boundary.
+
+Avoid repeating the method inventory or ending with generic impact language.
+
 ## Limitation Guidance
 
 Prefer limitations tied to task goal/setting boundaries, for example:
@@ -20,6 +20,25 @@ Convince reviewers with complete evidence on effectiveness, causality, and pract
 - Add stress-test scenarios (more complex scenes, rarer cases, noisier inputs, or stricter constraints).
 - Report both gains and failure modes to show realistic boundaries.
 
+## Insight-Diagnostic Reading
+
+Experiments should not only prove that a method is strong. They should diagnose whether the paper's insight is true.
+
+For each main result, ablation, robustness check, or failure analysis, write down:
+
+1. Which part of the insight this experiment tests.
+2. What alternative explanation the experiment rules out or weakens.
+3. What mechanism the observed pattern supports.
+4. What boundary or failure mode remains.
+
+Good interpretation shape:
+
+1. `This result supports the hypothesis that [mechanism], because [observed pattern].`
+2. `The ablation weakens the simpler explanation that [alternative], since [diagnostic contrast].`
+3. `The remaining failures indicate that [boundary], rather than [overclaim].`
+
+Avoid paragraphs that only say "Table X shows Y improves by Z." The table already contains the number; prose should explain what the number teaches.
+
 ## Experiment Planning
 
 ```mermaid
@@ -56,12 +56,27 @@ graph LR
 3. What are the benefits of our contributions, why can they solve this technical challenge, and what new insight do they bring? (important)
 4. How do we use prior methods to lead readers to our solved challenge and our new insight?
 
+### Insight-driven introduction pass
+
+Before drafting the final introduction, write one core insight anchor sentence and test whether every paragraph points toward it.
+
+Use this causal arc:
+
+1. Conventional assumption or prior explanation.
+2. Observation, anomaly, or failure that this assumption does not explain.
+3. Root mechanism or insight.
+4. Method or evaluation introduced as a way to test or exploit that insight.
+5. Boundary of what the evidence can and cannot claim.
+
+Avoid making insight a separate subsection. The introduction should let the reader feel the contrast before the method name appears.
+
 ### Forward story (write in this order)
 
 1. Introduce the paper's task.
 2. Use prior methods to lead to the technical challenge we solve.
-3.
-4.
+3. State the insight as the root explanation for that challenge.
+4. Present xx contributions to solve this technical challenge.
+5. Explain technical advantages of our contributions and explicitly express our new insight. (important)
 
 ## Section Skeleton
 
@@ -23,6 +23,15 @@ Recommended organization:
 
 3. Organize answers as a mind map or a table for clarity.
 
+Add an insight-to-design row before module details:
+
+1. What is the paper's core insight anchor?
+2. Which failure mechanism does the method need to model, block, separate, or expose?
+3. Which design choice follows from that mechanism?
+4. What prediction would be false if the mechanism were wrong?
+
+Method writing should read as "because this mechanism exists, this design is necessary", not as an inventory of modules.
+
 ## Method Writing Steps
 
 `Method writing steps: (1) draw pipeline figure sketch, (2) map subsections from the sketch, (3) plan each subsection with motivation/design/advantages, (4) write module design first, (5) then add motivation and technical advantages.`
@@ -52,6 +61,7 @@ Definition:
 
 1. Explain why this module is needed.
 2. Use problem-driven logic: because problem X exists, we design module Y.
+3. Tie the problem back to the core insight when possible, so the module feels derived rather than arbitrary.
 
 ### 3) Technical Advantages of This Module
 
@@ -19,6 +19,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 - Direct problem statements.
 - Explicit gap language tied to prior work.
 - One-sentence mechanism summaries.
+- Challenge -> insight -> contribution progression.
 - Bounded result claims with concrete scope.
 
 **Discouraged expressions**
@@ -32,6 +33,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 - Unbounded superiority claims such as "universally", "always", or "in every setting".
 - Service-style or AI-assistant meta language such as "用户说", "按你的要求", "我来解释", "let me explain", or "as requested by the user".
 - Workflow-only placeholder language such as "图的意图", "资产意图", "占位符", "workflow-language", or "sync this wording".
+- Standalone insight headings such as "Our Insights" when the insight is not woven into the abstract's challenge and contribution arc.
 
 ## Introduction
 
@@ -40,6 +42,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 
 **Encouraged expressions**
 - Problem -> gap -> challenge -> contribution progression.
+- Common assumption -> observed anomaly -> root mechanism progression.
 - Explicit prior-work limitation statements.
 - Clear contribution bullets or equivalent prose.
 - Early explanation of task setting and scope.
@@ -53,6 +56,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 - Empty macro-importance claims such as "this problem is increasingly critical" with no concrete consequence.
 - Marketing-style first-claim language such as "revolutionary", "game-changing", or "unprecedented" without evidence.
 - Paragraphs that only praise the paper instead of stating the research gap.
+- Standalone "Our Insights" sections; the insight should be part of the motivation and gap logic.
 - Service-style or AI-assistant meta language such as "用户说", "按你的要求", "我来解释", "let me explain", or "as requested by the user".
 - Workflow-only placeholder language such as "图的意图", "资产意图", "占位符", "workflow-language", or "sync this wording".
 
@@ -85,6 +89,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 
 **Encouraged expressions**
 - Motivation -> design -> technical effect progression.
+- Insight -> required design consequence progression.
 - Explicit role statements for modules or steps.
 - Concrete descriptions of information flow and interaction.
 - Local naming bridges when canonical labels appear before their defining section.
@@ -97,6 +102,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 **Banned expressions / moves**
 - Marketing-style or self-promotional wording such as "elegant", "powerful", "dramatically stronger", or "significantly outperforms prior methods" when used as prose decoration rather than evidence-backed result reporting.
 - Explaining the method by saying it is "better", "stronger", or "more advanced" without saying how it works.
+- Method subsections that read like API documentation without explaining which mechanism or insight requires the design.
 - Introducing new narrative aliases for canonical model or ablation labels after they have already been locked.
 - Service-style or AI-assistant meta language such as "用户说", "按你的要求", "我来解释", "let me explain", or "as requested by the user".
 - Workflow-only placeholder language such as "图的意图", "资产意图", "占位符", "workflow-language", or "sync this wording".
@@ -109,13 +115,15 @@ These are paper-facing defaults. They are not project-specific branding rules.
 **Encouraged expressions**
 - Direct statements of protocol, metric definition, and comparison scope.
 - Immediate result reporting with concrete numbers.
-- Short interpretation tied to the table or figure.
+- Short diagnostic interpretation tied to the mechanism tested by the table or figure.
+- Ablation prose that says which alternative explanation is weakened.
 - Explicit limitations or boundary statements after the result.
 
 **Discouraged expressions**
 - Long policy or deployment discussion after every table.
 - Re-explaining the same metric in every paragraph.
 - Paragraphs that only restate the table without synthesis.
+- Result paragraphs that say only "higher/lower/better" without explaining what the pattern teaches.
 
 **Banned expressions / moves**
 - Meta-reader guidance such as "这样读者可以……", "the reader can first...", or "this table lets the reader...".
@@ -133,6 +141,7 @@ These are paper-facing defaults. They are not project-specific branding rules.
 
 **Encouraged expressions**
 - Short recap of the paper's supported findings.
+- Broader principle implied by the supported findings.
 - Boundary or limitation statement.
 - One concrete next step or open question.
 
@@ -125,6 +125,23 @@
 - Before each rung and before each success, stop, or promotion decision, re-check the generic academic-risk questions: setting semantics, visibility/leakage, anchor or label policy, scale comparability, metric validity, comparison validity, statistical validity, claim boundary, and integrity self-check.
 - Before each success, stop, or promotion decision, also re-check the anomaly policy: whether anomaly signals fired, whether simpler explanations were ruled out, whether a cross-check was performed, and whether the current interpretation is still the narrowest supported one.
 
+## Gate Miss And Repair Loop
+
+- A gate miss is not automatically a terminal stop for `L2` or `L3` when `iterate` is allowed and the loop budget remains.
+- After any failed metric gate, classify the miss before writing a terminal outcome:
+  - recoverable: ordinary target miss, weak effect, overly strong effect, low coverage, placement or extraction mismatch, threshold mismatch, candidate-generation weakness, no-op delta, or noisy split
+  - terminal: budget exhausted, frozen-core change required, approval-required scope change, safety or integrity risk, invalid metric, impossible target, or repeated failed repair attempts
+- For a recoverable miss, run at least one bounded repair iteration inside the approved envelope before stopping. The repair must state the hypothesis, the specific knob changed, the unchanged frozen core, and the validation command.
+- Generic repair knobs include intervention strength, delivery channel or placement, detector/scoring threshold, candidate generation, sampling or stratification, baseline alignment, extraction/parser behavior, calibration, and control checks.
+- Separate ordinary engineering fixes from evidence-changing repairs. Ordinary fixes such as path repair, parser bugs, dependency setup, data loading, runner retry, logging, cache invalidation, and result serialization should be fixed directly and do not spend repair budget when they do not change evidence interpretation.
+- Evidence-changing repairs must spend repair budget and be logged: changes to intervention strength, delivery semantics, scoring thresholds, sampling, candidate generation, baseline alignment, calibration, extraction behavior that changes observed evidence, or evaluated control set.
+- Do not over-constrain problem solving: a repair may change multiple coupled knobs when the hypothesis requires it, but the report must name every changed knob and explain why the knobs are coupled.
+- Forbidden repair moves cannot be used to claim success without explicit approval: changing the primary metric definition, relaxing target thresholds, deleting hard cases, changing labels or ground truth, switching the final test split, changing paper-facing claims, or changing threat model, reviewer profile, dataset scope, or frozen core.
+- A repair pilot that passes must go through a confirmation check before promotion or final success. Valid confirmation includes a new seed, holdout, control batch, repeated run, anomaly check, or other predeclared cross-check.
+- For `L3`, prefer continuing through the repair ladder until pass, terminal boundary, or budget exhaustion. Do not pause merely because the first pilot failed.
+- If stopping after a miss, the final outcome and stage report must name the terminal boundary. "The gate failed" alone is not a sufficient stop reason when a plausible repair remains.
+- The user-facing final answer must start from the user's requested deliverables: list each requested table, artifact, or objective; mark completed, failed-gate, repaired, not promoted, or blocked; then give evidence paths and the next action.
+
 ## Minimum Procedure
 
 1. Validate the auto-mode contract
@@ -198,5 +215,8 @@
 ## Stage Report Closeout
 
 - At every stop, failure, escalation, or final handoff, write or update `.lab/stage-reports/<date>--auto--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control` with repair budget, attempts used, failure class, repair hypothesis, evidence-changing knobs, ordinary engineering fixes still allowed, unchanged frozen core, forbidden repairs avoided, and confirmation check.
 - Fill the `Core Explanation Table` in plain language: background, why now, what ran, how the loop ran, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage auto` and include the report path plus validation result in the final user-facing summary.
@@ -70,5 +70,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--data--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what changed, how the dataset package was chosen, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage data` and include the report path plus validation result in the final user-facing summary.
@@ -73,5 +73,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--framing--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what naming or framing changed, how it was checked, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage framing` and include the report path plus validation result in the final user-facing summary.
@@ -16,6 +16,8 @@
 - rough plain-language approach description
 - evaluation sketch with the evaluation subject, any proxy or simulator, the main outcome to observe, and the main validity risk
 - tentative contributions stated at idea level, not final paper-facing wording
+- explicit contribution-vs-insight separation: contribution says what the work adds, insight says what the work teaches
+- insight evidence chain: observation, why existing explanations fail, core insight, mechanism, validation tests, generalization or action implication, and prediction
 - convergence status that says what is already source-backed, what is still hypothesis-only, and whether the stage may end with a final recommendation
 - three meaningful points
 - brainstorm pass 1 with 3-4 candidate directions
@@ -67,6 +69,11 @@
 - Before ending the stage, give the user a concise decision summary that states the recommended direction, what current methods do, why they still fall short, how the proposed direction differs, the rough approach, the main risk, and where to read the full idea artifact and source log.
 - If the current evaluation plan uses a proxy, simulator, or synthetic user in place of a real subject, say that explicitly in the idea artifact and explain why it is acceptable at the idea stage.
 - Keep tentative contributions at the idea level. Do not drift into final paper-facing naming, title, or contribution wording; that belongs to `/lab:framing`.
+- Treat contribution and insight as different outputs. Contribution answers what the work adds; insight answers what the community or decision-maker learns and why the idea should generalize beyond the artifact.
+- Do not let a method name, module list, or metric gain substitute for insight. If the method name were removed, the idea should still have a clear observation, explanation, mechanism, and prediction.
+- Write a single core insight anchor sentence that downstream `/lab:write` and `/lab:report` can reuse. It should be a mechanism-level statement, not a method-name sentence.
+- Present insight as a structured evidence chain: observation -> why existing explanations fail -> core insight -> mechanism -> validation tests -> generalization or action implication -> prediction.
+- For academic ideas, the insight should explain external validity and community value. For technical or business reports, the insight should lead to an action, decision rule, or system change.
 - End the stage output with a user-guidance block that tells the user what to decide next, what information would most improve the idea, and which `/lab` stage should follow.
 
 ## Context Read Set
@@ -105,25 +112,21 @@
 14. Rough approach in plain language
 15. Problem solved in plain language
 16. Why the proposed idea is better
-17.
-18.
-19.
-20.
-21.
-22.
-23.
-24.
-25.
-26.
-27.
-28.
-29.
-
-
-
-- Before final handoff, write or update `.lab/stage-reports/<date>--idea--<target>.md` from `.lab/.managed/templates/stage-report.md`.
-- Fill the `Core Explanation Table` in plain language: background, why now, what idea work was done, how sources and brainstorm passes were used, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
-- Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage idea` and include the report path plus validation result in the final user-facing summary.
+17. Contribution vs insight
+18. Insight evidence chain
+19. Evaluation sketch
+20. Tentative contributions
+21. Three meaningful points
+22. Candidate approaches and recommendation
+23. Dataset, baseline, and metric candidates
+24. Falsifiable hypothesis
+25. Convergence status
+26. Expert critique
+27. Revised proposal or final recommendation
+28. User guidance
+29. Approval gate
+30. Minimum viable experiment
+31. Idea source log aligned with the two literature sweeps
 
 ## Writing Standard
 
@@ -147,6 +150,8 @@
 - Do not update `.lab/context/mission.md`, `.lab/context/decisions.md`, or `.lab/context/open-questions.md` from rewrite-only mode.
 - Explain what current methods do, why they fall short, and roughly how the proposed idea would work in plain language.
 - Explain what problem the idea actually solves before describing tentative contributions.
+- Before listing tentative contributions, write the insight evidence chain in plain language. It should make the reader see the anomaly or failure first, then accept the proposed explanation.
+- A valid insight should be testable. Include at least one prediction that would be expected if the insight is right and one validation test that could falsify it.
 - Keep the evaluation sketch high-level: who or what is evaluated, what proxy or simulator is used if any, what outcome matters, and what the main validity risk is. Leave full protocol design to later stages.
 - Use the idea stage to say roughly how the idea would be validated and what the minimum viable experiment looks like, but do not freeze sample size, recruitment plan, condition count, questionnaire design, or randomization protocol here.
 - Human-subject experiment design belongs to `/lab:spec`, where recruitment, assignment, measurement, and ethics details can be made explicit.
@@ -156,3 +161,12 @@
 - The final output must be short but decision-capable. Do not hide the key recommendation logic only inside `.lab/writing/idea.md`; summarize the recommended direction, current-method contrast, difference, rough approach, and main risk in the user-facing reply, then point to `.lab/writing/idea.md` and `.lab/writing/idea-source-log.md` for the full detail.
 - Before approval, run `.lab/.managed/scripts/validate_idea_artifact.py --idea <idea-artifact> --source-log .lab/writing/idea-source-log.md --workflow-config .lab/config/workflow.json`.
 - Do not leave `.lab/context/mission.md` as an empty template after convergence; write the approved problem, why it matters, the current benchmark scope, and the approved direction back into canonical context.
+
+## Stage Report Closeout
+
+- Before final handoff, write or update `.lab/stage-reports/<date>--idea--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
+- Fill the `Core Explanation Table` in plain language: background, why now, what idea work ran, how evidence was checked, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
+- Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage idea` and include the report path plus validation result in the final user-facing summary.
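The approval-gate commands named in the hunk above can be sketched as argument lists. This is illustrative only: the script path and flag names are copied from the diff bullets, while the value passed to `--idea` is a hypothetical stand-in for the elided `<idea-artifact>` placeholder.

```python
import shlex

# Sketch of the idea-stage approval gate. Flags mirror the bullet above;
# the value passed to --idea is a hypothetical stand-in for <idea-artifact>.
cmd = [
    "python", ".lab/.managed/scripts/validate_idea_artifact.py",
    "--idea", ".lab/writing/idea.md",
    "--source-log", ".lab/writing/idea-source-log.md",
    "--workflow-config", ".lab/config/workflow.json",
]
# shlex.join produces one shell-safe command line, ready to run or log.
print(shlex.join(cmd))
```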
@@ -82,5 +82,8 @@ If the loop stops without success, record:
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--iterate--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what rounds ran, how the loop evaluated them, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage iterate` and include the report path plus validation result in the final user-facing summary.
@@ -7,6 +7,8 @@
 - problem and background in plain language
 - dataset scene notes in plain language
 - contribution summary
+- core insight summary that explains what was learned beyond the produced artifact
+- decision or action implication derived from that insight
 - method overview
 - selected metrics summary
 - plain-language metric guide
@@ -60,6 +62,10 @@
 - Treat `report.md` as an external-review-ready memo. Source sections must not rely on local file paths or internal provenance notes; they must give a few human-readable anchor references instead.
 - Pull the approved method name and contribution bullets out of `.lab/context/terminology-lock.md` when that framing context exists; do not silently drop them from the collaborator-facing report.
 - Explain the method overview in collaborator language: what the method roughly does, what changed relative to the closest prior work or strongest baseline, what those prior methods do, and why they remain insufficient for the approved claim.
+- Explain the report-level insight as a mechanism, not as a slogan: observed phenomenon, why the simplest or prior explanation is insufficient, what mechanism best explains the result, what evidence supports it, what action or design implication follows, and what boundary still applies.
+- Keep insight interpretation in the reader summary, method overview, main result interpretation, ablations, and limitations. Do not hide it in a standalone inspirational paragraph.
+- When results are negative, mixed, or too strong, still write the insight honestly: what the result teaches about the mechanism, target, setup, metric, or boundary, and what action follows.
+- For technical or business reports, state the decision implication as an actionable rule, next experiment, system change, or stop boundary. Do not leave the insight as a theoretical phrase only.
 - When citing prior work or baselines in the method overview, include only the few anchor references a collaborator needs, and summarize their role and limitation in one short line each.
 - Report only the few references a collaborator needs to orient themselves quickly; do not turn `report.md` into a full bibliography dump.
 - In `Background Sources`, `Method and Baseline Sources`, and `Metric Sources`, every anchor must include a citation line, one short line about what it established or measures, and one limitation or caveat.
@@ -87,6 +93,8 @@
 - Proactively deliver a user-readable plain-language summary when the stage is reached from `/lab:auto`; do not wait for a separate follow-up request asking what the metrics or tables mean.
 - Treat `report.md` as a user-facing artifact rather than an internal dump. Prefer plain-language explanations before jargon, and explain each metric the first time it matters.
 - Treat contribution bullets as collaborator-facing claim summaries, not as internal TODOs; tie each one to the current evidence boundary.
+- Put the bottom-line insight near the top of the report: one-sentence conclusion, core insight, evidence that supports it, action implication, and biggest risk.
+- Use the main tables as diagnostic evidence for the insight, not just result containers. For each main table, state what mechanism or diagnostic question it addresses and what it does not prove.
 - If a missing assumption would change report interpretation, ask one clarifying question at a time.
 - If there are multiple defensible report framings, present 2-3 approaches with trade-offs and recommend the most evidence-faithful framing before writing.
 - Keep an approval gate when the reporting frame would materially affect what the paper later claims.
@@ -94,5 +102,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--report--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what report artifacts were produced, how evidence was carried forward, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage report` and include the report path plus validation result in the final user-facing summary.
@@ -62,5 +62,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--review--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what was reviewed, how the review was performed, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage review` and include the report path plus validation result in the final user-facing summary.
@@ -59,5 +59,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--run--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what ran, how it ran, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage run` and include the report path plus validation result in the final user-facing summary.
@@ -76,5 +76,8 @@
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--spec--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what change artifacts were created, how the spec was structured, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage spec` and include the report path plus validation result in the final user-facing summary.
@@ -148,6 +148,14 @@ Do not enter prose polish until the current section has passed the reference-con
 - If a section must use canonical short names, model labels, or ablation labels before the section that formally introduces them has been drafted, add a local naming bridge in that section that briefly maps the descriptive phrase to the canonical paper-facing labels and then reuse those labels consistently.
 - Keep one canonical natural-language paper-facing name per concept. Do not let one concept drift across paper-facing names, experiment labels, and internal identifiers.
 - Once a paper-facing model or ablation label is chosen, reuse the canonical label in later prose, tables, captions, and ranking summaries instead of replacing it with a narrative alias.
+- Treat the paper's core insight as an anchor that must be woven through section logic, not as an isolated `Our Insights` subsection.
+- Before drafting, recover the current core insight anchor from `.lab/writing/idea.md`, `.lab/writing/framing.md`, `.lab/writing/plan.md`, or the collaborator report. If no reliable anchor exists, write the best supported one in the write-iteration artifact and mark it as provisional instead of inventing a new paper claim.
+- Use the same insight anchor across Abstract, Introduction, Method, Experiments, and Conclusion unless the evidence changed and the framing artifact is revised.
+- In Introduction, create cognitive contrast: common assumption or prior explanation -> observed failure or anomaly -> root mechanism or insight -> contribution.
+- In Method, make design choices consequences of the insight: why the mechanism requires this decomposition, module, representation, loss, or protocol before explaining how it runs.
+- In Experiments, interpret results diagnostically: say which part of the insight each result, ablation, robustness check, or failure case supports, weakens, or bounds. Do not only read numbers from a table.
+- In Conclusion, state the broader principle or action implication implied by the evidence, then state the boundary. Do not introduce a new insight there.
+- Avoid paper-facing headings such as `Our Insights` or `核心洞见`; if a heading is needed, use normal section roles such as motivation, analysis, ablation, or discussion and let the insight appear in the prose.
 - Before drafting or polishing, check the current section's block in `section-style-policies.md` and follow its encouraged, discouraged, and banned expression lists.
 - Before any additional tighten, compress, or polish pass on the same section, run a section-level acceptance gate first.
 - The section-level acceptance gate is passed only when canonical naming consistency, adjacent-section consistency, claim, metric, and ranking consistency with the current evidence, local clarity, local concision, and section-style compliance are all explicitly checked and no unresolved blocker remains.
@@ -246,6 +254,7 @@ Do not enter prose polish until the current section has passed the reference-con
 - When a round introduces or revises metrics, include a compact metric-glossary note in the user-facing round summary and record the metric-glossary validation in the write-iteration artifact.
 - Record the section-level acceptance gate in the write-iteration artifact before recommending further tightening on the same section.
 - Record section-style policy compliance, any retained discouraged move, and any banned move found in the write-iteration artifact.
+- Record the insight integration audit in the write-iteration artifact: core insight anchor, section role in the insight chain, challenged assumption, mechanism explanation, diagnostic evidence, and whether the prose avoided an isolated insight section.
 - Record the round target layer in the write-iteration artifact as `canonical manuscript`, `workflow-language paper layer`, or `both`.
 - If workflow-language was active and the round still targeted the canonical manuscript, record why canonical-only writing was acceptable in the write-iteration artifact.
 - If both layers were edited, record why the cross-language sync was required and whether it was explicitly requested by the user or required by final-draft/export finalization.
@@ -307,5 +316,8 @@ Do not enter prose polish until the current section has passed the reference-con
 ## Stage Report Closeout
 
 - Before final handoff, write or update `.lab/stage-reports/<date>--write--<target>.md` from `.lab/.managed/templates/stage-report.md`.
+- Fill `Requested Outcome Mapping` before the core table so the final answer can be checked against the user's original request rather than only against internal stage state.
+- Fill `Repair Control`; if no repair loop ran, mark it as not applicable and state that ordinary drafting or evidence fixes remain allowed inside the stage contract.
 - Fill the `Core Explanation Table` in plain language: background, why now, what section or asset changed, how evidence and writing rules were applied, what worked, what did not work, what was verified, what remains unverified, what needs improvement and why, how to improve and why, key evidence, and the continue/stop/revise/rerun/escalate/handoff decision.
+- If the table says improvement is needed, the next action may be `stop` only when a terminal boundary is explicitly named; otherwise choose `continue`, `revise`, `rerun`, or `escalate`.
 - Run `.lab/.managed/scripts/validate_stage_report.py --stage-report <stage-report> --stage write` and include the report path plus validation result in the final user-facing summary.
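Every stage closeout above invokes `validate_stage_report.py` against a report named `<date>--<stage>--<target>.md`. A minimal sketch of that shared pattern, assuming the `.lab` layout shown in the diff; `build_closeout_cmd` is a hypothetical helper name, not part of the package:

```python
from datetime import date
from typing import List, Optional

def build_closeout_cmd(stage: str, target: str,
                       day: Optional[str] = None) -> List[str]:
    """Assemble the closeout validation invocation for one stage.

    The script path and flags follow the diff above; the file naming
    mirrors `.lab/stage-reports/<date>--<stage>--<target>.md`.
    """
    day = day or date.today().isoformat()
    report = f".lab/stage-reports/{day}--{stage}--{target}.md"
    return [
        "python", ".lab/.managed/scripts/validate_stage_report.py",
        "--stage-report", report,
        "--stage", stage,
    ]

# The "demo" target and fixed date are illustrative placeholders.
cmd = build_closeout_cmd("write", "demo", day="2024-06-01")
```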