EvoScientist 0.0.1.dev2__py3-none-any.whl

Files changed (107)
  1. EvoScientist/EvoScientist.py +157 -0
  2. EvoScientist/__init__.py +24 -0
  3. EvoScientist/__main__.py +4 -0
  4. EvoScientist/backends.py +392 -0
  5. EvoScientist/cli.py +1553 -0
  6. EvoScientist/middleware.py +35 -0
  7. EvoScientist/prompts.py +277 -0
  8. EvoScientist/skills/accelerate/SKILL.md +332 -0
  9. EvoScientist/skills/accelerate/references/custom-plugins.md +453 -0
  10. EvoScientist/skills/accelerate/references/megatron-integration.md +489 -0
  11. EvoScientist/skills/accelerate/references/performance.md +525 -0
  12. EvoScientist/skills/bitsandbytes/SKILL.md +411 -0
  13. EvoScientist/skills/bitsandbytes/references/memory-optimization.md +521 -0
  14. EvoScientist/skills/bitsandbytes/references/qlora-training.md +521 -0
  15. EvoScientist/skills/bitsandbytes/references/quantization-formats.md +447 -0
  16. EvoScientist/skills/find-skills/SKILL.md +133 -0
  17. EvoScientist/skills/find-skills/scripts/install_skill.py +211 -0
  18. EvoScientist/skills/flash-attention/SKILL.md +367 -0
  19. EvoScientist/skills/flash-attention/references/benchmarks.md +215 -0
  20. EvoScientist/skills/flash-attention/references/transformers-integration.md +293 -0
  21. EvoScientist/skills/llama-cpp/SKILL.md +258 -0
  22. EvoScientist/skills/llama-cpp/references/optimization.md +89 -0
  23. EvoScientist/skills/llama-cpp/references/quantization.md +213 -0
  24. EvoScientist/skills/llama-cpp/references/server.md +125 -0
  25. EvoScientist/skills/lm-evaluation-harness/SKILL.md +490 -0
  26. EvoScientist/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  27. EvoScientist/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  28. EvoScientist/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  29. EvoScientist/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  30. EvoScientist/skills/ml-paper-writing/SKILL.md +937 -0
  31. EvoScientist/skills/ml-paper-writing/references/checklists.md +361 -0
  32. EvoScientist/skills/ml-paper-writing/references/citation-workflow.md +562 -0
  33. EvoScientist/skills/ml-paper-writing/references/reviewer-guidelines.md +367 -0
  34. EvoScientist/skills/ml-paper-writing/references/sources.md +159 -0
  35. EvoScientist/skills/ml-paper-writing/references/writing-guide.md +476 -0
  36. EvoScientist/skills/ml-paper-writing/templates/README.md +251 -0
  37. EvoScientist/skills/ml-paper-writing/templates/aaai2026/README.md +534 -0
  38. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-supp.tex +144 -0
  39. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-template.tex +952 -0
  40. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bib +111 -0
  41. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bst +1493 -0
  42. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.sty +315 -0
  43. EvoScientist/skills/ml-paper-writing/templates/acl/README.md +50 -0
  44. EvoScientist/skills/ml-paper-writing/templates/acl/acl.sty +312 -0
  45. EvoScientist/skills/ml-paper-writing/templates/acl/acl_latex.tex +377 -0
  46. EvoScientist/skills/ml-paper-writing/templates/acl/acl_lualatex.tex +101 -0
  47. EvoScientist/skills/ml-paper-writing/templates/acl/acl_natbib.bst +1940 -0
  48. EvoScientist/skills/ml-paper-writing/templates/acl/anthology.bib.txt +26 -0
  49. EvoScientist/skills/ml-paper-writing/templates/acl/custom.bib +70 -0
  50. EvoScientist/skills/ml-paper-writing/templates/acl/formatting.md +326 -0
  51. EvoScientist/skills/ml-paper-writing/templates/colm2025/README.md +3 -0
  52. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bib +11 -0
  53. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bst +1440 -0
  54. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.pdf +0 -0
  55. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.sty +218 -0
  56. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.tex +305 -0
  57. EvoScientist/skills/ml-paper-writing/templates/colm2025/fancyhdr.sty +485 -0
  58. EvoScientist/skills/ml-paper-writing/templates/colm2025/math_commands.tex +508 -0
  59. EvoScientist/skills/ml-paper-writing/templates/colm2025/natbib.sty +1246 -0
  60. EvoScientist/skills/ml-paper-writing/templates/iclr2026/fancyhdr.sty +485 -0
  61. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bib +24 -0
  62. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bst +1440 -0
  63. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.pdf +0 -0
  64. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.sty +246 -0
  65. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.tex +414 -0
  66. EvoScientist/skills/ml-paper-writing/templates/iclr2026/math_commands.tex +508 -0
  67. EvoScientist/skills/ml-paper-writing/templates/iclr2026/natbib.sty +1246 -0
  68. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithm.sty +79 -0
  69. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithmic.sty +201 -0
  70. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.bib +75 -0
  71. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.pdf +0 -0
  72. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.tex +662 -0
  73. EvoScientist/skills/ml-paper-writing/templates/icml2026/fancyhdr.sty +864 -0
  74. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.bst +1443 -0
  75. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.sty +767 -0
  76. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml_numpapers.pdf +0 -0
  77. EvoScientist/skills/ml-paper-writing/templates/neurips2025/Makefile +36 -0
  78. EvoScientist/skills/ml-paper-writing/templates/neurips2025/extra_pkgs.tex +53 -0
  79. EvoScientist/skills/ml-paper-writing/templates/neurips2025/main.tex +38 -0
  80. EvoScientist/skills/ml-paper-writing/templates/neurips2025/neurips.sty +382 -0
  81. EvoScientist/skills/peft/SKILL.md +431 -0
  82. EvoScientist/skills/peft/references/advanced-usage.md +514 -0
  83. EvoScientist/skills/peft/references/troubleshooting.md +480 -0
  84. EvoScientist/skills/ray-data/SKILL.md +326 -0
  85. EvoScientist/skills/ray-data/references/integration.md +82 -0
  86. EvoScientist/skills/ray-data/references/transformations.md +83 -0
  87. EvoScientist/skills/skill-creator/LICENSE.txt +202 -0
  88. EvoScientist/skills/skill-creator/SKILL.md +356 -0
  89. EvoScientist/skills/skill-creator/references/output-patterns.md +82 -0
  90. EvoScientist/skills/skill-creator/references/workflows.md +28 -0
  91. EvoScientist/skills/skill-creator/scripts/init_skill.py +303 -0
  92. EvoScientist/skills/skill-creator/scripts/package_skill.py +110 -0
  93. EvoScientist/skills/skill-creator/scripts/quick_validate.py +95 -0
  94. EvoScientist/stream/__init__.py +53 -0
  95. EvoScientist/stream/emitter.py +94 -0
  96. EvoScientist/stream/formatter.py +168 -0
  97. EvoScientist/stream/tracker.py +115 -0
  98. EvoScientist/stream/utils.py +255 -0
  99. EvoScientist/subagent.yaml +147 -0
  100. EvoScientist/tools.py +135 -0
  101. EvoScientist/utils.py +207 -0
  102. evoscientist-0.0.1.dev2.dist-info/METADATA +227 -0
  103. evoscientist-0.0.1.dev2.dist-info/RECORD +107 -0
  104. evoscientist-0.0.1.dev2.dist-info/WHEEL +5 -0
  105. evoscientist-0.0.1.dev2.dist-info/entry_points.txt +5 -0
  106. evoscientist-0.0.1.dev2.dist-info/licenses/LICENSE +21 -0
  107. evoscientist-0.0.1.dev2.dist-info/top_level.txt +1 -0
@@ -0,0 +1,476 @@
1
+ # ML Paper Writing Philosophy & Best Practices
2
+
3
+ This reference compiles writing advice from prominent ML researchers, including Neel Nanda, Andrej Karpathy, Sebastian Farquhar, Zachary Lipton, Jacob Steinhardt, and Ethan Perez, together with the reader-expectation principles of Gopen and Swan.
4
+
5
+ ---
6
+
7
+ ## Contents
8
+
9
+ - [The Narrative Principle](#the-narrative-principle)
10
+ - [Time Allocation](#time-allocation)
11
+ - [Abstract Writing Formula](#abstract-writing-formula)
12
+ - [Introduction Structure](#introduction-structure)
13
+ - [Sentence-Level Clarity](#sentence-level-clarity)
14
+ - [Word Choice and Precision](#word-choice-and-precision)
15
+ - [Mathematical Writing](#mathematical-writing)
16
+ - [Figure Design](#figure-design)
17
+ - [Common Mistakes to Avoid](#common-mistakes-to-avoid)
18
+
19
+ ---
20
+
21
+ ## The Narrative Principle
22
+
23
+ ### From Neel Nanda
24
+
25
+ "A paper is a short, rigorous, evidence-based technical story with a takeaway readers care about."
26
+
27
+ The narrative rests on three pillars that must be crystal clear by the end of your introduction:
28
+
29
+ **The "What"**: One to three specific novel claims fitting within a cohesive theme. Vague contributions like "we study X" fail immediately—reviewers need precise, falsifiable claims.
30
+
31
+ **The "Why"**: Rigorous empirical evidence that convincingly supports those claims, including strong baselines honestly tuned and experiments that distinguish between competing hypotheses rather than merely showing "decent results."
32
+
33
+ **The "So What"**: Why readers should care, connecting your contribution to problems the community recognizes as important.
34
+
35
+ ### From Andrej Karpathy
36
+
37
+ "A paper is not a random collection of experiments you report on. The paper sells a single thing that was not obvious or present before. The entire paper is organized around this core contribution with surgical precision."
38
+
39
+ This applies whether you're presenting a new architecture, a theoretical result, or improved understanding of existing methods—NeurIPS explicitly notes that "originality does not necessarily require an entirely new method."
40
+
41
+ **Practical Implication**: If you cannot state your contribution in one sentence, you don't yet have a paper. Everything else—experiments, related work, discussion—exists only to support that core claim.
42
+
43
+ ---
44
+
45
+ ## Time Allocation
46
+
47
+ ### From Neel Nanda
48
+
49
+ Spend approximately **the same amount of time** on each of:
50
+ 1. The abstract
51
+ 2. The introduction
52
+ 3. The figures
53
+ 4. Everything else combined
54
+
55
+ This isn't hyperbole—most reviewers form preliminary judgments before reaching your methods section. Readers encounter your paper in a predictable pattern: **title → abstract → introduction → figures → maybe the rest.**
56
+
57
+ ### Reviewer Reading Patterns
58
+
59
+ Studies of reviewer behavior show:
60
+ - Abstract is read 100% of the time
61
+ - Introduction is skimmed by 90%+ of reviewers
62
+ - Figures are examined before methods by most reviewers
63
+ - Full methods are read only if interest is established
64
+
65
+ **Implication**: Front-load your paper's value. Don't bury the contribution.
66
+
67
+ ---
68
+
69
+ ## Abstract Writing Formula
70
+
71
+ ### Sebastian Farquhar's 5-Sentence Formula
72
+
73
+ 1. **What you achieved**: "We introduce...", "We prove...", "We demonstrate..."
74
+ 2. **Why this is hard and important**
75
+ 3. **How you do it** (with specialist keywords for discoverability)
76
+ 4. **What evidence you have**
77
+ 5. **Your most remarkable number/result**
78
+
79
+ ### Example (Good Abstract)
80
+
81
+ ```
82
+ We prove that gradient descent on overparameterized neural networks
83
+ converges to global minima at a linear rate. [What]
84
+ This resolves a fundamental question about why deep learning works
85
+ despite non-convex optimization landscapes. [Why hard/important]
86
+ Our proof relies on showing that the Neural Tangent Kernel remains
87
+ approximately constant during training, reducing the problem to
88
+ kernel regression. [How with keywords]
89
+ We validate our theory on CIFAR-10 and ImageNet, showing that
90
+ predicted convergence rates match experiments within 5%. [Evidence]
91
+ This is the first polynomial-time convergence guarantee for
92
+ networks with practical depth and width. [Remarkable result]
93
+ ```
94
+
95
+ ### What to Avoid
96
+
97
+ From Zachary Lipton: "If the first sentence can be pre-pended to any ML paper, delete it."
98
+
99
+ **Delete these openings**:
100
+ - "Large language models have achieved remarkable success..."
101
+ - "Deep learning has revolutionized..."
102
+ - "In recent years, neural networks have..."
103
+
104
+ **Start with your specific contribution instead.**
105
+
106
+ ---
107
+
108
+ ## Introduction Structure
109
+
110
+ ### Requirements
111
+
112
+ - **1-1.5 pages maximum** (in two-column format)
113
+ - **Methods should start by page 2-3**
114
+ - Must include a **contribution list of 2-4 bullets** (max 1-2 lines each)
115
+
116
+ ### Structure Template
117
+
118
+ ```markdown
119
+ 1. Opening Hook (2-3 sentences)
120
+ - State the problem your paper addresses
121
+ - Why it matters RIGHT NOW
122
+
123
+ 2. Background/Challenge (1 paragraph)
124
+ - What makes this problem hard?
125
+ - What have others tried? Why is it insufficient?
126
+
127
+ 3. Your Approach (1 paragraph)
128
+ - What do you do differently?
129
+ - Key insight that enables your contribution
130
+
131
+ 4. Contribution Bullets (2-4 items)
132
+ - Be specific and falsifiable
133
+ - Each bullet: 1-2 lines maximum
134
+
135
+ 5. Results Preview (2-3 sentences)
136
+ - Most impressive numbers
137
+ - Scope of evaluation
138
+
139
+ 6. Paper Organization (optional, 1-2 sentences)
140
+ - "Section 2 presents... Section 3 describes..."
141
+ ```
142
+
143
+ ### Contribution Bullets: Good vs Bad
144
+
145
+ **Good:**
146
+ - We prove that X converges in O(n log n) time under assumption Y
147
+ - We introduce Z, a 3-layer architecture that reduces memory by 40%
148
+ - We demonstrate that A outperforms B by 15% on benchmark C
149
+
150
+ **Bad:**
151
+ - We study the problem of X (not a contribution)
152
+ - We provide extensive experiments (too vague)
153
+ - We make several contributions to the field (says nothing)
154
+
155
+ ---
156
+
157
+ ## Sentence-Level Clarity
158
+
159
+ ### From Gopen & Swan: "The Science of Scientific Writing"
160
+
161
+ The seminal 1990 paper by George Gopen and Judith Swan establishes that **readers have structural expectations** about where information appears in prose. Violating these expectations forces readers to spend energy on structure rather than content.
162
+
163
+ > "If the reader is to grasp what the writer means, the writer must understand what the reader needs."
164
+
165
+ #### The 7 Principles of Reader Expectations
166
+
167
+ **Principle 1: Subject-Verb Proximity**
168
+
169
+ Keep the grammatical subject and its verb close together. Anything that intervenes reads as an interruption of lesser importance.
170
+
171
+ **Weak**: "The model, which was trained on 100M tokens and fine-tuned on domain-specific data using LoRA with rank 16, achieves state-of-the-art results"
172
+
173
+ **Strong**: "The model achieves state-of-the-art results after training on 100M tokens and fine-tuning with LoRA (rank 16)"
174
+
175
+ **Principle 2: Stress Position (Save the Best for Last)**
176
+
177
+ Readers naturally emphasize the **last words of a sentence**. Place your most important information there.
178
+
179
+ **Weak**: "Accuracy improves by 15% when using attention"
180
+ **Strong**: "When using attention, accuracy improves by **15%**"
181
+
182
+ **Principle 3: Topic Position (First Things First)**
183
+
184
+ The beginning of a sentence establishes perspective. Put the "whose story" element first—readers expect the sentence to be about whoever shows up first.
185
+
186
+ **Weak**: "A novel attention mechanism that computes alignment scores is introduced"
187
+ **Strong**: "To address the alignment problem, we introduce a novel attention mechanism"
188
+
189
+ **Principle 4: Old Information Before New**
190
+
191
+ Put familiar information (old) in the topic position for backward linkage; put new information in the stress position for emphasis.
192
+
193
+ **Weak**: "Sparse attention was introduced by Child et al. The quadratic complexity of standard attention motivates this work."
194
+ **Strong**: "Standard attention has quadratic complexity. To address this, Child et al. introduced sparse attention."
195
+
196
+ **Principle 5: One Unit, One Function**
197
+
198
+ Each unit of discourse (sentence, paragraph, section) should serve a single function. If you have two points, use two units.
199
+
200
+ **Principle 6: Articulate Action in the Verb**
201
+
202
+ Express the action of each sentence in its verb, not in nominalized nouns.
203
+
204
+ **Weak**: "We performed an analysis of the results" (nominalization)
205
+ **Strong**: "We analyzed the results" (action in verb)
206
+
207
+ **Principle 7: Context Before New Information**
208
+
209
+ Provide context before asking the reader to consider anything new. This applies at all levels—sentence, paragraph, section.
210
+
211
+ **Weak**: "Equation 3 shows that convergence is guaranteed when the learning rate satisfies..."
212
+ **Strong**: "For convergence to be guaranteed, the learning rate must satisfy the condition in Equation 3..."
213
+
214
+ #### Summary Table
215
+
216
+ | Principle | Rule | Mnemonic |
217
+ |-----------|------|----------|
218
+ | Subject-Verb Proximity | Keep subject and verb close | "Don't interrupt yourself" |
219
+ | Stress Position | Emphasis at sentence end | "Save the best for last" |
220
+ | Topic Position | Context at sentence start | "First things first" |
221
+ | Old Before New | Familiar → unfamiliar | "Build on known ground" |
222
+ | One Unit, One Function | Each sentence, paragraph, or section = one function | "One idea per container" |
223
+ | Action in Verb | Use verbs, not nominalizations | "Verbs do, nouns sit" |
224
+ | Context Before New | Explain before presenting | "Set the stage first" |
225
+
226
+ ---
229
+
230
+ ## Micro-Level Writing Tips
231
+
232
+ ### From Ethan Perez (Anthropic)
233
+
234
+ These practical micro-level tips improve clarity at the sentence and word level.
235
+
236
+ #### Pronoun Management
237
+
238
+ **Minimize pronouns** ("this," "it," "these," "that"). When pronouns are necessary, use them as adjectives with a noun:
239
+
240
+ **Weak**: "This shows that the model converges."
241
+ **Strong**: "This result shows that the model converges."
242
+
243
+ **Weak**: "It improves performance."
244
+ **Strong**: "This modification improves performance."
245
+
246
+ #### Verb Placement
247
+
248
+ **Position verbs early** in sentences for better parsing:
249
+
250
+ **Weak**: "The gradient, after being computed and normalized, updates the weights."
251
+ **Strong**: "The gradient updates the weights after being computed and normalized."
252
+
253
+ #### Apostrophe Unfolding
254
+
255
+ Transform possessive constructions for clarity:
256
+
257
+ **Original**: "X's Y" → **Unfolded**: "The Y of X"
258
+
259
+ **Before**: "The model's accuracy on the test set"
260
+ **After**: "The accuracy of the model on the test set"
261
+
262
+ This isn't always better, but when sentences feel awkward, try unfolding.
263
+
264
+ #### Words to Eliminate
265
+
266
+ Delete these filler words in almost all cases:
267
+ - "actually"
268
+ - "a bit"
269
+ - "fortunately" / "unfortunately"
270
+ - "very" / "really"
271
+ - "quite"
272
+ - "basically"
273
+ - "essentially"
274
+ - Excessive connectives ("however," "moreover," "furthermore" when not needed)
275
+
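The filler-word list above lends itself to a quick automated pass over a draft. The following is a minimal sketch, assuming the draft is available as plain text; the names `FILLER_WORDS` and `find_fillers` are hypothetical, and the word list simply mirrors the bullets above (connectives are left out because judging whether one is "needed" takes a human reader).

```python
import re

# Filler words copied from the list above; extend or trim as needed.
FILLER_WORDS = [
    "actually", "a bit", "fortunately", "unfortunately",
    "very", "really", "quite", "basically", "essentially",
]

# Case-insensitive pattern with word boundaries, so "very" does not
# match inside "every" and "a bit" does not match inside "a bitwise".
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in FILLER_WORDS) + r")\b",
    re.IGNORECASE,
)


def find_fillers(text):
    """Return (line_number, word) pairs for each filler word found in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((line_number, match.group(1).lower()))
    return hits


if __name__ == "__main__":
    draft = "We actually find that every model is very slow."
    for line_number, word in find_fillers(draft):
        print(f"line {line_number}: consider deleting '{word}'")
```

A hit is a prompt to reread the sentence, not an automatic deletion: the guidance above says "almost all cases", so some flagged words may deserve to stay.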
276
+ #### Sentence Construction Rules
277
+
278
+ 1. **One idea per sentence** - If you struggle to express an idea in one sentence, split it into two
279
+ 2. **No repeated sounds** - Avoid similar-sounding words in the same sentence
280
+ 3. **Every sentence adds information** - Delete sentences that merely restate
281
+ 4. **Active voice always** - Specify the actor ("We find..." not "It is found...")
282
+ 5. **Expand contractions** - "don't" → "do not" for formality
283
+
284
+ #### Paragraph Architecture
285
+
286
+ - **First sentence**: State the point clearly
287
+ - **Middle sentences**: Support with evidence
288
+ - **Last sentence**: Reinforce or transition
289
+
290
+ Don't bury key information in the middle of paragraphs.
291
+
292
+ ---
293
+
294
+ ## Word Choice and Precision
295
+
296
+ ### From Zachary Lipton
297
+
298
+ **Eliminate hedging** unless genuine uncertainty exists:
299
+ - Delete "may" and "can" unless necessary
300
+ - "provides *very* tight approximation" drips with insecurity
301
+ - "provides tight approximation" is confident
302
+
303
+ **Avoid vacuous intensifiers**:
304
+ - Delete: very, extremely, highly, significantly (unless reporting statistical significance)
305
+ - These words signal insecurity, not strength
306
+
307
+ ### From Jacob Steinhardt
308
+
309
+ **Precision over brevity**: Replace vague terms with specific ones.
310
+
311
+ | Vague | Specific |
312
+ |-------|----------|
313
+ | performance | accuracy, latency, throughput |
314
+ | improves | increases accuracy by X%, reduces latency by Y |
315
+ | large | 1B parameters, 100M tokens |
316
+ | fast | 3x faster, 50ms latency |
317
+ | good results | 92% accuracy, 0.85 F1 |
318
+
319
+ **Consistent terminology**: Referring to the same concept with different terms creates confusion.
320
+
321
+ **Choose one and stick with it**:
322
+ - "model" vs "network" vs "architecture"
323
+ - "training" vs "learning" vs "optimization"
324
+ - "sample" vs "example" vs "instance"
325
+
326
+ ### Vocabulary Signaling
327
+
328
+ **Avoid words signaling incremental work**:
329
+ - Never: "combine," "modify," "expand," "extend"
330
+ - Instead: "develop," "propose," "introduce"
331
+
332
+ **Why**: "We combine X and Y" sounds like you stapled two existing ideas together. "We develop a method that leverages X for Y" sounds like a genuine contribution.
333
+
334
+ ---
335
+
336
+ ## Mathematical Writing
337
+
338
+ ### From Ethan Perez
339
+
340
+ **Unfold apostrophes** for clarity:
341
+ - Weak: "X's Y"
342
+ - Strong: "The Y of X"
343
+
344
+ Example: "the model's accuracy" → "the accuracy of the model"
345
+
346
+ ### General Principles
347
+
348
+ 1. **State all assumptions formally** before theorems
349
+ 2. **Provide intuitive explanations** alongside proofs
350
+ 3. **Use consistent notation** throughout the paper
351
+ 4. **Define symbols at first use**
352
+
353
+ ### Notation Conventions
354
+
355
+ ```latex
356
+ % Scalars: lowercase italic
357
+ $x$, $y$, $\alpha$, $\beta$
358
+
359
+ % Vectors: lowercase bold
360
+ $\mathbf{x}$, $\mathbf{v}$
361
+
362
+ % Matrices: uppercase bold
363
+ $\mathbf{W}$, $\mathbf{X}$
364
+
365
+ % Sets: uppercase calligraphic
366
+ $\mathcal{X}$, $\mathcal{D}$
367
+
368
+ % Functions: roman for named functions
369
+ $\mathrm{softmax}$, $\mathrm{ReLU}$
370
+ ```
371
+
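As an illustration of how these conventions combine in one expression, the sketch below typesets a linear layer followed by softmax. The specific symbols (weights, bias, input, dataset) are chosen here for illustration only and are not taken from any particular paper.

```latex
% A linear layer followed by softmax, using the conventions above:
% bold lowercase vectors, a bold uppercase matrix, a roman function
% name, an italic scalar label, and a calligraphic dataset.
\begin{equation}
  \hat{\mathbf{y}} = \mathrm{softmax}(\mathbf{W}\mathbf{x} + \mathbf{b}),
  \qquad (\mathbf{x}, y) \in \mathcal{D}
\end{equation}
```

Because each typeface carries a type (scalar, vector, matrix, set, function), a reader can parse the equation before reading the definitions around it.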
372
+ ---
373
+
374
+ ## Figure Design
375
+
376
+ ### From Neel Nanda
377
+
378
+ Figures should tell a coherent story even if the reader skips the text. Many readers DO skip the text initially.
379
+
380
+ ### Design Principles
381
+
382
+ 1. **Figure 1 is crucial**: Often the first thing readers examine after the abstract
383
+ 2. **Self-contained captions**: A reader should understand the figure without the main text
384
+ 3. **No title inside figure**: The caption serves this function (ICML/NeurIPS rule)
385
+ 4. **Vector graphics**: PDF/EPS for plots, PNG (600 DPI) only for photographs
386
+
387
+ ### Accessibility Requirements
388
+
389
+ Roughly 8% of men have some form of color vision deficiency. Your figures must work for them.
390
+
391
+ **Solutions**:
392
+ - Use colorblind-safe palettes: Okabe-Ito or Paul Tol
393
+ - Avoid red-green combinations
394
+ - Verify figures work in grayscale
395
+ - Use different line styles (solid, dashed, dotted) in addition to colors
396
+
397
+ ### Tools
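The grayscale advice above can be checked roughly before plotting anything. The sketch below is one possible approach, not a standard tool: it converts each palette color to an approximate gray level using the Rec. 601 luma weights and reports pairs that end up close together. The function names and the `min_gap=0.05` threshold are assumptions made for illustration; the hex codes are the commonly published Okabe-Ito values.

```python
from itertools import combinations

# Okabe-Ito palette (commonly published hex values).
OKABE_ITO = {
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermillion": "#D55E00",
    "reddish purple": "#CC79A7",
    "black": "#000000",
}


def gray_level(hex_color):
    """Approximate grayscale value in [0, 1] using Rec. 601 luma weights."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return (0.299 * r + 0.587 * g + 0.114 * b) / 255.0


def gray_collisions(palette, min_gap=0.05):
    """Return name pairs whose gray levels differ by less than min_gap.

    min_gap is an arbitrary illustrative threshold, not a standard.
    """
    levels = {name: gray_level(code) for name, code in palette.items()}
    return [
        (a, b)
        for a, b in combinations(levels, 2)
        if abs(levels[a] - levels[b]) < min_gap
    ]


if __name__ == "__main__":
    for first, second in gray_collisions(OKABE_ITO):
        print(f"{first} and {second} look similar in grayscale")
```

Under this approximation, even a colorblind-safe palette contains pairs such as orange and sky blue that print at nearly the same gray, which is why the list above recommends varying line styles in addition to colors.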
398
+
399
+ ```python
400
+ # SciencePlots: Publication-ready styles
401
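+ import matplotlib.pyplot as plt
+ import scienceplots  # noqa: F401  (importing registers the 'science' styles; required since SciencePlots 2.0)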
402
+ plt.style.use(['science', 'ieee'])
403
+
404
+ # Or for Nature-style
405
+ plt.style.use(['science', 'nature'])
406
+ ```
407
+
408
+ ---
409
+
410
+ ## Common Mistakes to Avoid
411
+
412
+ ### Structure Mistakes
413
+
414
+ | Mistake | Solution |
415
+ |---------|----------|
416
+ | Introduction too long (>1.5 pages) | Move background to Related Work |
417
+ | Methods buried (after page 3) | Front-load contribution, cut intro |
418
+ | Missing contribution bullets | Add 2-4 specific, falsifiable claims |
419
+ | Experiments without explicit claims | State what each experiment tests |
420
+
421
+ ### Writing Mistakes
422
+
423
+ | Mistake | Solution |
424
+ |---------|----------|
425
+ | Generic abstract opening | Start with your specific contribution |
426
+ | Inconsistent terminology | Choose one term per concept |
427
+ | Passive voice overuse | Use active voice: "We show" not "It is shown" |
428
+ | Hedging everywhere | Be confident unless genuinely uncertain |
429
+
430
+ ### Figure Mistakes
431
+
432
+ | Mistake | Solution |
433
+ |---------|----------|
434
+ | Raster graphics for plots | Use vector (PDF/EPS) |
435
+ | Red-green color scheme | Use colorblind-safe palette |
436
+ | Title inside figure | Put title in caption |
437
+ | Captions require main text | Make captions self-contained |
438
+
439
+ ### Citation Mistakes
440
+
441
+ | Mistake | Solution |
442
+ |---------|----------|
443
+ | Paper-by-paper Related Work | Organize by method or theme |
444
+ | Missing relevant citations | Reviewers often authored the relevant papers, so cite generously |
445
+ | AI-generated citations | Always verify via APIs |
446
+ | Inconsistent citation format | Use BibTeX with the venue's `.bst` style and consistent keys |
447
+
448
+ ---
449
+
450
+ ## Pre-Submission Checklist
451
+
452
+ Before submitting, verify:
453
+
454
+ **Narrative**:
455
+ - [ ] Can state contribution in one sentence
456
+ - [ ] Three pillars (What/Why/So What) clear in intro
457
+ - [ ] Every experiment supports a specific claim
458
+
459
+ **Structure**:
460
+ - [ ] Abstract follows 5-sentence formula
461
+ - [ ] Introduction ≤1.5 pages
462
+ - [ ] Methods start by page 2-3
463
+ - [ ] 2-4 contribution bullets included
464
+ - [ ] Limitations section present
465
+
466
+ **Writing**:
467
+ - [ ] Consistent terminology throughout
468
+ - [ ] No generic opening sentences
469
+ - [ ] Hedging removed unless necessary
470
+ - [ ] All figures have self-contained captions
471
+
472
+ **Technical**:
473
+ - [ ] All citations verified via API
474
+ - [ ] Error bars included, with the method used to compute them stated
475
+ - [ ] Compute resources documented
476
+ - [ ] Code/data availability stated