EvoScientist 0.0.1.dev3__py3-none-any.whl → 0.1.0rc1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (108)
  1. EvoScientist/EvoScientist.py +17 -49
  2. EvoScientist/backends.py +0 -26
  3. EvoScientist/cli.py +1109 -255
  4. EvoScientist/middleware.py +8 -61
  5. EvoScientist/stream/__init__.py +0 -25
  6. EvoScientist/stream/utils.py +16 -23
  7. EvoScientist/tools.py +0 -64
  8. evoscientist-0.1.0rc1.dist-info/METADATA +199 -0
  9. evoscientist-0.1.0rc1.dist-info/RECORD +21 -0
  10. evoscientist-0.1.0rc1.dist-info/entry_points.txt +2 -0
  11. EvoScientist/memory.py +0 -715
  12. EvoScientist/paths.py +0 -45
  13. EvoScientist/skills/accelerate/SKILL.md +0 -332
  14. EvoScientist/skills/accelerate/references/custom-plugins.md +0 -453
  15. EvoScientist/skills/accelerate/references/megatron-integration.md +0 -489
  16. EvoScientist/skills/accelerate/references/performance.md +0 -525
  17. EvoScientist/skills/bitsandbytes/SKILL.md +0 -411
  18. EvoScientist/skills/bitsandbytes/references/memory-optimization.md +0 -521
  19. EvoScientist/skills/bitsandbytes/references/qlora-training.md +0 -521
  20. EvoScientist/skills/bitsandbytes/references/quantization-formats.md +0 -447
  21. EvoScientist/skills/find-skills/SKILL.md +0 -133
  22. EvoScientist/skills/find-skills/scripts/install_skill.py +0 -211
  23. EvoScientist/skills/flash-attention/SKILL.md +0 -367
  24. EvoScientist/skills/flash-attention/references/benchmarks.md +0 -215
  25. EvoScientist/skills/flash-attention/references/transformers-integration.md +0 -293
  26. EvoScientist/skills/llama-cpp/SKILL.md +0 -258
  27. EvoScientist/skills/llama-cpp/references/optimization.md +0 -89
  28. EvoScientist/skills/llama-cpp/references/quantization.md +0 -213
  29. EvoScientist/skills/llama-cpp/references/server.md +0 -125
  30. EvoScientist/skills/lm-evaluation-harness/SKILL.md +0 -490
  31. EvoScientist/skills/lm-evaluation-harness/references/api-evaluation.md +0 -490
  32. EvoScientist/skills/lm-evaluation-harness/references/benchmark-guide.md +0 -488
  33. EvoScientist/skills/lm-evaluation-harness/references/custom-tasks.md +0 -602
  34. EvoScientist/skills/lm-evaluation-harness/references/distributed-eval.md +0 -519
  35. EvoScientist/skills/ml-paper-writing/SKILL.md +0 -937
  36. EvoScientist/skills/ml-paper-writing/references/checklists.md +0 -361
  37. EvoScientist/skills/ml-paper-writing/references/citation-workflow.md +0 -562
  38. EvoScientist/skills/ml-paper-writing/references/reviewer-guidelines.md +0 -367
  39. EvoScientist/skills/ml-paper-writing/references/sources.md +0 -159
  40. EvoScientist/skills/ml-paper-writing/references/writing-guide.md +0 -476
  41. EvoScientist/skills/ml-paper-writing/templates/README.md +0 -251
  42. EvoScientist/skills/ml-paper-writing/templates/aaai2026/README.md +0 -534
  43. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-supp.tex +0 -144
  44. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-template.tex +0 -952
  45. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bib +0 -111
  46. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bst +0 -1493
  47. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.sty +0 -315
  48. EvoScientist/skills/ml-paper-writing/templates/acl/README.md +0 -50
  49. EvoScientist/skills/ml-paper-writing/templates/acl/acl.sty +0 -312
  50. EvoScientist/skills/ml-paper-writing/templates/acl/acl_latex.tex +0 -377
  51. EvoScientist/skills/ml-paper-writing/templates/acl/acl_lualatex.tex +0 -101
  52. EvoScientist/skills/ml-paper-writing/templates/acl/acl_natbib.bst +0 -1940
  53. EvoScientist/skills/ml-paper-writing/templates/acl/anthology.bib.txt +0 -26
  54. EvoScientist/skills/ml-paper-writing/templates/acl/custom.bib +0 -70
  55. EvoScientist/skills/ml-paper-writing/templates/acl/formatting.md +0 -326
  56. EvoScientist/skills/ml-paper-writing/templates/colm2025/README.md +0 -3
  57. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bib +0 -11
  58. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bst +0 -1440
  59. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.pdf +0 -0
  60. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.sty +0 -218
  61. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.tex +0 -305
  62. EvoScientist/skills/ml-paper-writing/templates/colm2025/fancyhdr.sty +0 -485
  63. EvoScientist/skills/ml-paper-writing/templates/colm2025/math_commands.tex +0 -508
  64. EvoScientist/skills/ml-paper-writing/templates/colm2025/natbib.sty +0 -1246
  65. EvoScientist/skills/ml-paper-writing/templates/iclr2026/fancyhdr.sty +0 -485
  66. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bib +0 -24
  67. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bst +0 -1440
  68. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.pdf +0 -0
  69. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.sty +0 -246
  70. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.tex +0 -414
  71. EvoScientist/skills/ml-paper-writing/templates/iclr2026/math_commands.tex +0 -508
  72. EvoScientist/skills/ml-paper-writing/templates/iclr2026/natbib.sty +0 -1246
  73. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithm.sty +0 -79
  74. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithmic.sty +0 -201
  75. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.bib +0 -75
  76. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.pdf +0 -0
  77. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.tex +0 -662
  78. EvoScientist/skills/ml-paper-writing/templates/icml2026/fancyhdr.sty +0 -864
  79. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.bst +0 -1443
  80. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.sty +0 -767
  81. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml_numpapers.pdf +0 -0
  82. EvoScientist/skills/ml-paper-writing/templates/neurips2025/Makefile +0 -36
  83. EvoScientist/skills/ml-paper-writing/templates/neurips2025/extra_pkgs.tex +0 -53
  84. EvoScientist/skills/ml-paper-writing/templates/neurips2025/main.tex +0 -38
  85. EvoScientist/skills/ml-paper-writing/templates/neurips2025/neurips.sty +0 -382
  86. EvoScientist/skills/peft/SKILL.md +0 -431
  87. EvoScientist/skills/peft/references/advanced-usage.md +0 -514
  88. EvoScientist/skills/peft/references/troubleshooting.md +0 -480
  89. EvoScientist/skills/ray-data/SKILL.md +0 -326
  90. EvoScientist/skills/ray-data/references/integration.md +0 -82
  91. EvoScientist/skills/ray-data/references/transformations.md +0 -83
  92. EvoScientist/skills/skill-creator/LICENSE.txt +0 -202
  93. EvoScientist/skills/skill-creator/SKILL.md +0 -356
  94. EvoScientist/skills/skill-creator/references/output-patterns.md +0 -82
  95. EvoScientist/skills/skill-creator/references/workflows.md +0 -28
  96. EvoScientist/skills/skill-creator/scripts/init_skill.py +0 -303
  97. EvoScientist/skills/skill-creator/scripts/package_skill.py +0 -110
  98. EvoScientist/skills/skill-creator/scripts/quick_validate.py +0 -95
  99. EvoScientist/skills_manager.py +0 -392
  100. EvoScientist/stream/display.py +0 -604
  101. EvoScientist/stream/events.py +0 -415
  102. EvoScientist/stream/state.py +0 -343
  103. evoscientist-0.0.1.dev3.dist-info/METADATA +0 -321
  104. evoscientist-0.0.1.dev3.dist-info/RECORD +0 -113
  105. evoscientist-0.0.1.dev3.dist-info/entry_points.txt +0 -5
  106. {evoscientist-0.0.1.dev3.dist-info → evoscientist-0.1.0rc1.dist-info}/WHEEL +0 -0
  107. {evoscientist-0.0.1.dev3.dist-info → evoscientist-0.1.0rc1.dist-info}/licenses/LICENSE +0 -0
  108. {evoscientist-0.0.1.dev3.dist-info → evoscientist-0.1.0rc1.dist-info}/top_level.txt +0 -0
EvoScientist/paths.py DELETED
@@ -1,45 +0,0 @@
- """Path resolution utilities for EvoScientist runtime directories."""
-
- from __future__ import annotations
-
- import os
- from datetime import datetime
- from pathlib import Path
-
-
- def _expand(path: str) -> Path:
-     return Path(path).expanduser()
-
-
- def _env_path(key: str) -> Path | None:
-     value = os.getenv(key)
-     if not value:
-         return None
-     return _expand(value)
-
-
- STATE_ROOT = _env_path("EVOSCIENTIST_HOME") or (Path.cwd() / ".evoscientist")
-
- WORKSPACE_ROOT = _env_path("EVOSCIENTIST_WORKSPACE_DIR") or (STATE_ROOT / "workspace")
-
- RUNS_DIR = _env_path("EVOSCIENTIST_RUNS_DIR") or (WORKSPACE_ROOT / "runs")
- MEMORY_DIR = _env_path("EVOSCIENTIST_MEMORY_DIR") or (WORKSPACE_ROOT / "memory")
- USER_SKILLS_DIR = _env_path("EVOSCIENTIST_SKILLS_DIR") or (WORKSPACE_ROOT / "skills")
-
-
- def ensure_dirs() -> None:
-     """Create runtime directories if they do not exist."""
-     for path in (WORKSPACE_ROOT, RUNS_DIR, MEMORY_DIR, USER_SKILLS_DIR):
-         path.mkdir(parents=True, exist_ok=True)
-
-
- def default_workspace_dir() -> Path:
-     """Default workspace for non-CLI usage."""
-     return WORKSPACE_ROOT
-
-
- def new_run_dir(session_id: str | None = None) -> Path:
-     """Create a new run directory name under RUNS_DIR (path only)."""
-     if session_id is None:
-         session_id = datetime.now().strftime("%Y%m%d_%H%M%S")
-     return RUNS_DIR / session_id
EvoScientist/skills/accelerate/SKILL.md DELETED
@@ -1,332 +0,0 @@
- ---
- name: accelerate
- description: Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.
- version: 1.0.0
- author: Orchestra Research
- license: MIT
- tags: [Distributed Training, HuggingFace, Accelerate, DeepSpeed, FSDP, Mixed Precision, PyTorch, DDP, Unified API, Simple]
- dependencies: [accelerate, torch, transformers]
- ---
-
- # HuggingFace Accelerate - Unified Distributed Training
-
- ## Quick start
-
- Accelerate simplifies distributed training to 4 lines of code.
-
- **Installation**:
- ```bash
- pip install accelerate
- ```
-
- **Convert PyTorch script** (4 lines):
- ```python
- import torch
- + from accelerate import Accelerator
-
- + accelerator = Accelerator()
-
- model = torch.nn.Transformer()
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset)
-
- + model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- for batch in dataloader:
-     optimizer.zero_grad()
-     loss = model(batch)
- -     loss.backward()
- +     accelerator.backward(loss)
-     optimizer.step()
- ```
-
- **Run** (single command):
- ```bash
- accelerate launch train.py
- ```
-
- ## Common workflows
-
- ### Workflow 1: From single GPU to multi-GPU
-
- **Original script**:
- ```python
- # train.py
- import torch
-
- model = torch.nn.Linear(10, 2).to('cuda')
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
-
- for epoch in range(10):
-     for batch in dataloader:
-         batch = batch.to('cuda')
-         optimizer.zero_grad()
-         loss = model(batch).mean()
-         loss.backward()
-         optimizer.step()
- ```
-
- **With Accelerate** (4 lines added):
- ```python
- # train.py
- import torch
- from accelerate import Accelerator  # +1
-
- accelerator = Accelerator()  # +2
-
- model = torch.nn.Linear(10, 2)
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)  # +3
-
- for epoch in range(10):
-     for batch in dataloader:
-         # No .to('cuda') needed - automatic!
-         optimizer.zero_grad()
-         loss = model(batch).mean()
-         accelerator.backward(loss)  # +4
-         optimizer.step()
- ```
-
- **Configure** (interactive):
- ```bash
- accelerate config
- ```
-
- **Questions**:
- - - Which machine? (single/multi GPU/TPU/CPU)
- - - How many machines? (1)
- - - Mixed precision? (no/fp16/bf16/fp8)
- - - DeepSpeed? (no/yes)
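The answers are saved to a config file that later `accelerate launch` runs pick up automatically. A sketch of what the saved file can look like for the multi-GPU answers above (the path and keys follow Accelerate's config schema, but the exact key set varies by version, so treat this as illustrative):

```yaml
# ~/.cache/huggingface/accelerate/default_config.yaml (illustrative)
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 8
use_cpu: false
```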
-
- **Launch** (works on any setup):
- ```bash
- # Single GPU
- accelerate launch train.py
-
- # Multi-GPU (8 GPUs)
- accelerate launch --multi_gpu --num_processes 8 train.py
-
- # Multi-node
- accelerate launch --multi_gpu --num_processes 16 \
-     --num_machines 2 --machine_rank 0 \
-     --main_process_ip $MASTER_ADDR \
-     train.py
- ```
-
- ### Workflow 2: Mixed precision training
-
- **Enable FP16/BF16**:
- ```python
- from accelerate import Accelerator
-
- # FP16 (with gradient scaling)
- accelerator = Accelerator(mixed_precision='fp16')
-
- # BF16 (no scaling, more stable)
- accelerator = Accelerator(mixed_precision='bf16')
-
- # FP8 (H100+)
- accelerator = Accelerator(mixed_precision='fp8')
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- # Everything else is automatic!
- for batch in dataloader:
-     with accelerator.autocast():  # Optional, done automatically
-         loss = model(batch)
-     accelerator.backward(loss)
- ```
-
- ### Workflow 3: DeepSpeed ZeRO integration
-
- **Enable DeepSpeed ZeRO-2**:
- ```python
- from accelerate import Accelerator, DeepSpeedPlugin
-
- accelerator = Accelerator(
-     mixed_precision='bf16',
-     deepspeed_plugin=DeepSpeedPlugin(
-         zero_stage=2,  # ZeRO-2
-         offload_optimizer_device="none",
-         gradient_accumulation_steps=4,
-     ),
- )
-
- # Same code as before!
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
- ```
-
- **Or via config**:
- ```bash
- accelerate config
- # Select: DeepSpeed → ZeRO-2
- ```
-
- **deepspeed_config.json**:
- ```json
- {
-   "fp16": {"enabled": false},
-   "bf16": {"enabled": true},
-   "zero_optimization": {
-     "stage": 2,
-     "offload_optimizer": {"device": "cpu"},
-     "allgather_bucket_size": 5e8,
-     "reduce_bucket_size": 5e8
-   }
- }
- ```
-
- **Launch**:
- ```bash
- accelerate launch --use_deepspeed --deepspeed_config_file deepspeed_config.json train.py
- ```
-
- ### Workflow 4: FSDP (Fully Sharded Data Parallel)
-
- **Enable FSDP**:
- ```python
- from accelerate import Accelerator, FullyShardedDataParallelPlugin
-
- fsdp_plugin = FullyShardedDataParallelPlugin(
-     sharding_strategy="FULL_SHARD",  # ZeRO-3 equivalent
-     auto_wrap_policy="transformer_based_wrap",
-     cpu_offload=False
- )
-
- accelerator = Accelerator(
-     mixed_precision='bf16',
-     fsdp_plugin=fsdp_plugin
- )
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
- ```
-
- **Or via config**:
- ```bash
- accelerate config
- # Select: FSDP → Full Shard → No CPU Offload
- ```
-
- ### Workflow 5: Gradient accumulation
-
- **Accumulate gradients**:
- ```python
- from accelerate import Accelerator
-
- accelerator = Accelerator(gradient_accumulation_steps=4)
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- for batch in dataloader:
-     with accelerator.accumulate(model):  # Handles accumulation
-         optimizer.zero_grad()
-         loss = model(batch)
-         accelerator.backward(loss)
-         optimizer.step()
- ```
-
- **Effective batch size**: `batch_size * num_gpus * gradient_accumulation_steps`
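The formula above, with the numbers used in this skill (per-device batch of 32, the 8-GPU launch example, 4 accumulation steps), works out as:

```python
# Illustrative values taken from the examples in this skill.
batch_size = 32                  # per-device DataLoader batch
num_gpus = 8                     # accelerate launch --num_processes 8
gradient_accumulation_steps = 4  # Accelerator(gradient_accumulation_steps=4)

effective_batch_size = batch_size * num_gpus * gradient_accumulation_steps
print(effective_batch_size)  # 1024
```

So each optimizer step here sees gradients from 1024 samples, even though each GPU only holds 32 at a time.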
-
- ## When to use vs alternatives
-
- **Use Accelerate when**:
- - - Want simplest distributed training
- - - Need single script for any hardware
- - - Use HuggingFace ecosystem
- - - Want flexibility (DDP/DeepSpeed/FSDP/Megatron)
- - - Need quick prototyping
-
- **Key advantages**:
- - - **4 lines**: Minimal code changes
- - - **Unified API**: Same code for DDP, DeepSpeed, FSDP, Megatron
- - - **Automatic**: Device placement, mixed precision, sharding
- - - **Interactive config**: No manual launcher setup
- - - **Single launch**: Works everywhere
-
- **Use alternatives instead**:
- - - **PyTorch Lightning**: Need callbacks, high-level abstractions
- - - **Ray Train**: Multi-node orchestration, hyperparameter tuning
- - - **DeepSpeed**: Direct API control, advanced features
- - - **Raw DDP**: Maximum control, minimal abstraction
-
- ## Common issues
-
- **Issue: Wrong device placement**
-
- Don't manually move to device:
- ```python
- # WRONG
- batch = batch.to('cuda')
-
- # CORRECT
- # Accelerate handles it automatically after prepare()
- ```
-
- **Issue: Gradient accumulation not working**
-
- Use context manager:
- ```python
- # CORRECT
- with accelerator.accumulate(model):
-     optimizer.zero_grad()
-     loss = model(batch)
-     accelerator.backward(loss)
-     optimizer.step()
- ```
-
- **Issue: Checkpointing in distributed**
-
- Use accelerator methods:
- ```python
- # Save only on main process
- if accelerator.is_main_process:
-     accelerator.save_state('checkpoint/')
-
- # Load on all processes
- accelerator.load_state('checkpoint/')
- ```
-
- **Issue: Different results with FSDP**
-
- Ensure same random seed:
- ```python
- from accelerate.utils import set_seed
- set_seed(42)
- ```
-
- ## Advanced topics
-
- **Megatron integration**: See [references/megatron-integration.md](references/megatron-integration.md) for tensor parallelism, pipeline parallelism, and sequence parallelism setup.
-
- **Custom plugins**: See [references/custom-plugins.md](references/custom-plugins.md) for creating custom distributed plugins and advanced configuration.
-
- **Performance tuning**: See [references/performance.md](references/performance.md) for profiling, memory optimization, and best practices.
-
- ## Hardware requirements
-
- - - **CPU**: Works (slow)
- - - **Single GPU**: Works
- - - **Multi-GPU**: DDP (default), DeepSpeed, or FSDP
- - - **Multi-node**: DDP, DeepSpeed, FSDP, Megatron
- - - **TPU**: Supported
- - - **Apple MPS**: Supported
-
- **Launcher requirements**:
- - - **DDP**: `torch.distributed.run` (built-in)
- - - **DeepSpeed**: `deepspeed` (pip install deepspeed)
- - - **FSDP**: PyTorch 1.12+ (built-in)
- - - **Megatron**: Custom setup
-
- ## Resources
-
- - - Docs: https://huggingface.co/docs/accelerate
- - - GitHub: https://github.com/huggingface/accelerate
- - - Version: 1.11.0+
- - - Tutorial: "Accelerate your scripts"
- - - Examples: https://github.com/huggingface/accelerate/tree/main/examples
- - - Used by: HuggingFace Transformers, TRL, PEFT, all HF libraries
-
-
-