EvoScientist 0.0.1.dev4__py3-none-any.whl → 0.1.0rc1__py3-none-any.whl

This diff shows the content changes between publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.
Files changed (113)
  1. EvoScientist/EvoScientist.py +26 -62
  2. EvoScientist/__init__.py +0 -19
  3. EvoScientist/backends.py +0 -26
  4. EvoScientist/cli.py +1111 -498
  5. EvoScientist/middleware.py +8 -61
  6. EvoScientist/stream/__init__.py +0 -25
  7. EvoScientist/stream/utils.py +16 -23
  8. EvoScientist/tools.py +2 -75
  9. evoscientist-0.1.0rc1.dist-info/METADATA +199 -0
  10. evoscientist-0.1.0rc1.dist-info/RECORD +21 -0
  11. evoscientist-0.1.0rc1.dist-info/entry_points.txt +2 -0
  12. EvoScientist/config.py +0 -274
  13. EvoScientist/llm/__init__.py +0 -21
  14. EvoScientist/llm/models.py +0 -99
  15. EvoScientist/memory.py +0 -715
  16. EvoScientist/onboard.py +0 -725
  17. EvoScientist/paths.py +0 -44
  18. EvoScientist/skills/accelerate/SKILL.md +0 -332
  19. EvoScientist/skills/accelerate/references/custom-plugins.md +0 -453
  20. EvoScientist/skills/accelerate/references/megatron-integration.md +0 -489
  21. EvoScientist/skills/accelerate/references/performance.md +0 -525
  22. EvoScientist/skills/bitsandbytes/SKILL.md +0 -411
  23. EvoScientist/skills/bitsandbytes/references/memory-optimization.md +0 -521
  24. EvoScientist/skills/bitsandbytes/references/qlora-training.md +0 -521
  25. EvoScientist/skills/bitsandbytes/references/quantization-formats.md +0 -447
  26. EvoScientist/skills/find-skills/SKILL.md +0 -133
  27. EvoScientist/skills/find-skills/scripts/install_skill.py +0 -211
  28. EvoScientist/skills/flash-attention/SKILL.md +0 -367
  29. EvoScientist/skills/flash-attention/references/benchmarks.md +0 -215
  30. EvoScientist/skills/flash-attention/references/transformers-integration.md +0 -293
  31. EvoScientist/skills/llama-cpp/SKILL.md +0 -258
  32. EvoScientist/skills/llama-cpp/references/optimization.md +0 -89
  33. EvoScientist/skills/llama-cpp/references/quantization.md +0 -213
  34. EvoScientist/skills/llama-cpp/references/server.md +0 -125
  35. EvoScientist/skills/lm-evaluation-harness/SKILL.md +0 -490
  36. EvoScientist/skills/lm-evaluation-harness/references/api-evaluation.md +0 -490
  37. EvoScientist/skills/lm-evaluation-harness/references/benchmark-guide.md +0 -488
  38. EvoScientist/skills/lm-evaluation-harness/references/custom-tasks.md +0 -602
  39. EvoScientist/skills/lm-evaluation-harness/references/distributed-eval.md +0 -519
  40. EvoScientist/skills/ml-paper-writing/SKILL.md +0 -937
  41. EvoScientist/skills/ml-paper-writing/references/checklists.md +0 -361
  42. EvoScientist/skills/ml-paper-writing/references/citation-workflow.md +0 -562
  43. EvoScientist/skills/ml-paper-writing/references/reviewer-guidelines.md +0 -367
  44. EvoScientist/skills/ml-paper-writing/references/sources.md +0 -159
  45. EvoScientist/skills/ml-paper-writing/references/writing-guide.md +0 -476
  46. EvoScientist/skills/ml-paper-writing/templates/README.md +0 -251
  47. EvoScientist/skills/ml-paper-writing/templates/aaai2026/README.md +0 -534
  48. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-supp.tex +0 -144
  49. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026-unified-template.tex +0 -952
  50. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bib +0 -111
  51. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.bst +0 -1493
  52. EvoScientist/skills/ml-paper-writing/templates/aaai2026/aaai2026.sty +0 -315
  53. EvoScientist/skills/ml-paper-writing/templates/acl/README.md +0 -50
  54. EvoScientist/skills/ml-paper-writing/templates/acl/acl.sty +0 -312
  55. EvoScientist/skills/ml-paper-writing/templates/acl/acl_latex.tex +0 -377
  56. EvoScientist/skills/ml-paper-writing/templates/acl/acl_lualatex.tex +0 -101
  57. EvoScientist/skills/ml-paper-writing/templates/acl/acl_natbib.bst +0 -1940
  58. EvoScientist/skills/ml-paper-writing/templates/acl/anthology.bib.txt +0 -26
  59. EvoScientist/skills/ml-paper-writing/templates/acl/custom.bib +0 -70
  60. EvoScientist/skills/ml-paper-writing/templates/acl/formatting.md +0 -326
  61. EvoScientist/skills/ml-paper-writing/templates/colm2025/README.md +0 -3
  62. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bib +0 -11
  63. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.bst +0 -1440
  64. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.pdf +0 -0
  65. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.sty +0 -218
  66. EvoScientist/skills/ml-paper-writing/templates/colm2025/colm2025_conference.tex +0 -305
  67. EvoScientist/skills/ml-paper-writing/templates/colm2025/fancyhdr.sty +0 -485
  68. EvoScientist/skills/ml-paper-writing/templates/colm2025/math_commands.tex +0 -508
  69. EvoScientist/skills/ml-paper-writing/templates/colm2025/natbib.sty +0 -1246
  70. EvoScientist/skills/ml-paper-writing/templates/iclr2026/fancyhdr.sty +0 -485
  71. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bib +0 -24
  72. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.bst +0 -1440
  73. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.pdf +0 -0
  74. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.sty +0 -246
  75. EvoScientist/skills/ml-paper-writing/templates/iclr2026/iclr2026_conference.tex +0 -414
  76. EvoScientist/skills/ml-paper-writing/templates/iclr2026/math_commands.tex +0 -508
  77. EvoScientist/skills/ml-paper-writing/templates/iclr2026/natbib.sty +0 -1246
  78. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithm.sty +0 -79
  79. EvoScientist/skills/ml-paper-writing/templates/icml2026/algorithmic.sty +0 -201
  80. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.bib +0 -75
  81. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.pdf +0 -0
  82. EvoScientist/skills/ml-paper-writing/templates/icml2026/example_paper.tex +0 -662
  83. EvoScientist/skills/ml-paper-writing/templates/icml2026/fancyhdr.sty +0 -864
  84. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.bst +0 -1443
  85. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml2026.sty +0 -767
  86. EvoScientist/skills/ml-paper-writing/templates/icml2026/icml_numpapers.pdf +0 -0
  87. EvoScientist/skills/ml-paper-writing/templates/neurips2025/Makefile +0 -36
  88. EvoScientist/skills/ml-paper-writing/templates/neurips2025/extra_pkgs.tex +0 -53
  89. EvoScientist/skills/ml-paper-writing/templates/neurips2025/main.tex +0 -38
  90. EvoScientist/skills/ml-paper-writing/templates/neurips2025/neurips.sty +0 -382
  91. EvoScientist/skills/peft/SKILL.md +0 -431
  92. EvoScientist/skills/peft/references/advanced-usage.md +0 -514
  93. EvoScientist/skills/peft/references/troubleshooting.md +0 -480
  94. EvoScientist/skills/ray-data/SKILL.md +0 -326
  95. EvoScientist/skills/ray-data/references/integration.md +0 -82
  96. EvoScientist/skills/ray-data/references/transformations.md +0 -83
  97. EvoScientist/skills/skill-creator/LICENSE.txt +0 -202
  98. EvoScientist/skills/skill-creator/SKILL.md +0 -356
  99. EvoScientist/skills/skill-creator/references/output-patterns.md +0 -82
  100. EvoScientist/skills/skill-creator/references/workflows.md +0 -28
  101. EvoScientist/skills/skill-creator/scripts/init_skill.py +0 -303
  102. EvoScientist/skills/skill-creator/scripts/package_skill.py +0 -110
  103. EvoScientist/skills/skill-creator/scripts/quick_validate.py +0 -95
  104. EvoScientist/skills_manager.py +0 -391
  105. EvoScientist/stream/display.py +0 -604
  106. EvoScientist/stream/events.py +0 -415
  107. EvoScientist/stream/state.py +0 -343
  108. evoscientist-0.0.1.dev4.dist-info/METADATA +0 -367
  109. evoscientist-0.0.1.dev4.dist-info/RECORD +0 -117
  110. evoscientist-0.0.1.dev4.dist-info/entry_points.txt +0 -5
  111. {evoscientist-0.0.1.dev4.dist-info → evoscientist-0.1.0rc1.dist-info}/WHEEL +0 -0
  112. {evoscientist-0.0.1.dev4.dist-info → evoscientist-0.1.0rc1.dist-info}/licenses/LICENSE +0 -0
  113. {evoscientist-0.0.1.dev4.dist-info → evoscientist-0.1.0rc1.dist-info}/top_level.txt +0 -0
EvoScientist/paths.py DELETED
@@ -1,44 +0,0 @@
- """Path resolution utilities for EvoScientist runtime directories."""
-
- from __future__ import annotations
-
- import os
- from datetime import datetime
- from pathlib import Path
-
-
- def _expand(path: str) -> Path:
-     return Path(path).expanduser()
-
-
- def _env_path(key: str) -> Path | None:
-     value = os.getenv(key)
-     if not value:
-         return None
-     return _expand(value)
-
-
- # Workspace root: directly under cwd (no hidden .evoscientist layer)
- WORKSPACE_ROOT = _env_path("EVOSCIENTIST_WORKSPACE_DIR") or (Path.cwd() / "workspace")
-
- RUNS_DIR = _env_path("EVOSCIENTIST_RUNS_DIR") or (WORKSPACE_ROOT / "runs")
- MEMORY_DIR = _env_path("EVOSCIENTIST_MEMORY_DIR") or (WORKSPACE_ROOT / "memory")
- USER_SKILLS_DIR = _env_path("EVOSCIENTIST_SKILLS_DIR") or (WORKSPACE_ROOT / "skills")
-
-
- def ensure_dirs() -> None:
-     """Create runtime directories if they do not exist."""
-     for path in (WORKSPACE_ROOT, RUNS_DIR, MEMORY_DIR, USER_SKILLS_DIR):
-         path.mkdir(parents=True, exist_ok=True)
-
-
- def default_workspace_dir() -> Path:
-     """Default workspace for non-CLI usage."""
-     return WORKSPACE_ROOT
-
-
- def new_run_dir(session_id: str | None = None) -> Path:
-     """Create a new run directory name under RUNS_DIR (path only)."""
-     if session_id is None:
-         session_id = datetime.now().strftime("%Y%m%d_%H%M%S")
-     return RUNS_DIR / session_id
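For context on what this removal takes away, here is a minimal sketch of how the deleted paths module was typically driven: its directory constants are resolved at import time from the environment variables shown above, so any override must be set before the import. The import path `EvoScientist.paths`, the override value, and the exact call sequence are illustrative assumptions, not documented usage.

```python
# Illustrative sketch only (not part of either wheel); assumes the module
# was importable as EvoScientist.paths before its removal.
import os

# Must be set before import: WORKSPACE_ROOT and friends are computed at import time.
os.environ["EVOSCIENTIST_WORKSPACE_DIR"] = "/tmp/evo-workspace"  # hypothetical location

from EvoScientist import paths

paths.ensure_dirs()            # creates workspace/, runs/, memory/, skills/
run_dir = paths.new_run_dir()  # returns <workspace>/runs/<timestamp>; path only, not created
run_dir.mkdir(parents=True, exist_ok=True)
```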
EvoScientist/skills/accelerate/SKILL.md DELETED
@@ -1,332 +0,0 @@
- ---
- name: accelerate
- description: Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). Interactive config, single launch command. HuggingFace ecosystem standard.
- version: 1.0.0
- author: Orchestra Research
- license: MIT
- tags: [Distributed Training, HuggingFace, Accelerate, DeepSpeed, FSDP, Mixed Precision, PyTorch, DDP, Unified API, Simple]
- dependencies: [accelerate, torch, transformers]
- ---
-
- # HuggingFace Accelerate - Unified Distributed Training
-
- ## Quick start
-
- Accelerate simplifies distributed training to 4 lines of code.
-
- **Installation**:
- ```bash
- pip install accelerate
- ```
-
- **Convert PyTorch script** (4 lines):
- ```python
- import torch
- + from accelerate import Accelerator
-
- + accelerator = Accelerator()
-
- model = torch.nn.Transformer()
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset)
-
- + model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- for batch in dataloader:
-     optimizer.zero_grad()
-     loss = model(batch)
- - loss.backward()
- + accelerator.backward(loss)
-     optimizer.step()
- ```
-
- **Run** (single command):
- ```bash
- accelerate launch train.py
- ```
-
- ## Common workflows
-
- ### Workflow 1: From single GPU to multi-GPU
-
- **Original script**:
- ```python
- # train.py
- import torch
-
- model = torch.nn.Linear(10, 2).to('cuda')
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
-
- for epoch in range(10):
-     for batch in dataloader:
-         batch = batch.to('cuda')
-         optimizer.zero_grad()
-         loss = model(batch).mean()
-         loss.backward()
-         optimizer.step()
- ```
-
- **With Accelerate** (4 lines added):
- ```python
- # train.py
- import torch
- from accelerate import Accelerator # +1
-
- accelerator = Accelerator() # +2
-
- model = torch.nn.Linear(10, 2)
- optimizer = torch.optim.Adam(model.parameters())
- dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader) # +3
-
- for epoch in range(10):
-     for batch in dataloader:
-         # No .to('cuda') needed - automatic!
-         optimizer.zero_grad()
-         loss = model(batch).mean()
-         accelerator.backward(loss) # +4
-         optimizer.step()
- ```
-
- **Configure** (interactive):
- ```bash
- accelerate config
- ```
-
- **Questions**:
- - Which machine? (single/multi GPU/TPU/CPU)
- - How many machines? (1)
- - Mixed precision? (no/fp16/bf16/fp8)
- - DeepSpeed? (no/yes)
-
- **Launch** (works on any setup):
- ```bash
- # Single GPU
- accelerate launch train.py
-
- # Multi-GPU (8 GPUs)
- accelerate launch --multi_gpu --num_processes 8 train.py
-
- # Multi-node
- accelerate launch --multi_gpu --num_processes 16 \
-     --num_machines 2 --machine_rank 0 \
-     --main_process_ip $MASTER_ADDR \
-     train.py
- ```
-
- ### Workflow 2: Mixed precision training
-
- **Enable FP16/BF16**:
- ```python
- from accelerate import Accelerator
-
- # FP16 (with gradient scaling)
- accelerator = Accelerator(mixed_precision='fp16')
-
- # BF16 (no scaling, more stable)
- accelerator = Accelerator(mixed_precision='bf16')
-
- # FP8 (H100+)
- accelerator = Accelerator(mixed_precision='fp8')
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- # Everything else is automatic!
- for batch in dataloader:
-     with accelerator.autocast(): # Optional, done automatically
-         loss = model(batch)
-     accelerator.backward(loss)
- ```
-
- ### Workflow 3: DeepSpeed ZeRO integration
-
- **Enable DeepSpeed ZeRO-2**:
- ```python
- from accelerate import Accelerator
-
- accelerator = Accelerator(
-     mixed_precision='bf16',
-     deepspeed_plugin={
-         "zero_stage": 2, # ZeRO-2
-         "offload_optimizer": False,
-         "gradient_accumulation_steps": 4
-     }
- )
-
- # Same code as before!
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
- ```
-
- **Or via config**:
- ```bash
- accelerate config
- # Select: DeepSpeed → ZeRO-2
- ```
-
- **deepspeed_config.json**:
- ```json
- {
-   "fp16": {"enabled": false},
-   "bf16": {"enabled": true},
-   "zero_optimization": {
-     "stage": 2,
-     "offload_optimizer": {"device": "cpu"},
-     "allgather_bucket_size": 5e8,
-     "reduce_bucket_size": 5e8
-   }
- }
- ```
-
- **Launch**:
- ```bash
- accelerate launch --config_file deepspeed_config.json train.py
- ```
-
- ### Workflow 4: FSDP (Fully Sharded Data Parallel)
-
- **Enable FSDP**:
- ```python
- from accelerate import Accelerator, FullyShardedDataParallelPlugin
-
- fsdp_plugin = FullyShardedDataParallelPlugin(
-     sharding_strategy="FULL_SHARD", # ZeRO-3 equivalent
-     auto_wrap_policy="TRANSFORMER_AUTO_WRAP",
-     cpu_offload=False
- )
-
- accelerator = Accelerator(
-     mixed_precision='bf16',
-     fsdp_plugin=fsdp_plugin
- )
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
- ```
-
- **Or via config**:
- ```bash
- accelerate config
- # Select: FSDP → Full Shard → No CPU Offload
- ```
-
- ### Workflow 5: Gradient accumulation
-
- **Accumulate gradients**:
- ```python
- from accelerate import Accelerator
-
- accelerator = Accelerator(gradient_accumulation_steps=4)
-
- model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)
-
- for batch in dataloader:
-     with accelerator.accumulate(model): # Handles accumulation
-         optimizer.zero_grad()
-         loss = model(batch)
-         accelerator.backward(loss)
-         optimizer.step()
- ```
-
- **Effective batch size**: `batch_size * num_gpus * gradient_accumulation_steps`
-
- ## When to use vs alternatives
-
- **Use Accelerate when**:
- - Want simplest distributed training
- - Need single script for any hardware
- - Use HuggingFace ecosystem
- - Want flexibility (DDP/DeepSpeed/FSDP/Megatron)
- - Need quick prototyping
-
- **Key advantages**:
- - **4 lines**: Minimal code changes
- - **Unified API**: Same code for DDP, DeepSpeed, FSDP, Megatron
- - **Automatic**: Device placement, mixed precision, sharding
- - **Interactive config**: No manual launcher setup
- - **Single launch**: Works everywhere
-
- **Use alternatives instead**:
- - **PyTorch Lightning**: Need callbacks, high-level abstractions
- - **Ray Train**: Multi-node orchestration, hyperparameter tuning
- - **DeepSpeed**: Direct API control, advanced features
- - **Raw DDP**: Maximum control, minimal abstraction
-
- ## Common issues
-
- **Issue: Wrong device placement**
-
- Don't manually move to device:
- ```python
- # WRONG
- batch = batch.to('cuda')
-
- # CORRECT
- # Accelerate handles it automatically after prepare()
- ```
-
- **Issue: Gradient accumulation not working**
-
- Use context manager:
- ```python
- # CORRECT
- with accelerator.accumulate(model):
-     optimizer.zero_grad()
-     accelerator.backward(loss)
-     optimizer.step()
- ```
-
- **Issue: Checkpointing in distributed**
-
- Use accelerator methods:
- ```python
- # Save only on main process
- if accelerator.is_main_process:
-     accelerator.save_state('checkpoint/')
-
- # Load on all processes
- accelerator.load_state('checkpoint/')
- ```
-
- **Issue: Different results with FSDP**
-
- Ensure same random seed:
- ```python
- from accelerate.utils import set_seed
- set_seed(42)
- ```
-
- ## Advanced topics
-
- **Megatron integration**: See [references/megatron-integration.md](references/megatron-integration.md) for tensor parallelism, pipeline parallelism, and sequence parallelism setup.
-
- **Custom plugins**: See [references/custom-plugins.md](references/custom-plugins.md) for creating custom distributed plugins and advanced configuration.
-
- **Performance tuning**: See [references/performance.md](references/performance.md) for profiling, memory optimization, and best practices.
-
- ## Hardware requirements
-
- - **CPU**: Works (slow)
- - **Single GPU**: Works
- - **Multi-GPU**: DDP (default), DeepSpeed, or FSDP
- - **Multi-node**: DDP, DeepSpeed, FSDP, Megatron
- - **TPU**: Supported
- - **Apple MPS**: Supported
-
- **Launcher requirements**:
- - **DDP**: `torch.distributed.run` (built-in)
- - **DeepSpeed**: `deepspeed` (pip install deepspeed)
- - **FSDP**: PyTorch 1.12+ (built-in)
- - **Megatron**: Custom setup
-
- ## Resources
-
- - Docs: https://huggingface.co/docs/accelerate
- - GitHub: https://github.com/huggingface/accelerate
- - Version: 1.11.0+
- - Tutorial: "Accelerate your scripts"
- - Examples: https://github.com/huggingface/accelerate/tree/main/examples
- - Used by: HuggingFace Transformers, TRL, PEFT, all HF libraries
-
-
-
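As a closing illustration of the patterns the deleted skill document describes (mixed precision, gradient accumulation, and main-process checkpointing), here is a hedged sketch that combines them in one loop. The model, placeholder dataset, batch size, and checkpoint path are assumptions for illustration, not content from either wheel.

```python
# Sketch combining patterns from the deleted accelerate SKILL.md; values are placeholders.
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="bf16", gradient_accumulation_steps=4)

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters())
dataset = torch.randn(1024, 10)  # placeholder data
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for batch in dataloader:
    with accelerator.accumulate(model):  # optimizer only steps every 4th batch
        optimizer.zero_grad()
        loss = model(batch).mean()
        accelerator.backward(loss)
        optimizer.step()

# Effective batch size per the skill doc: 32 per device x num_processes x 4 accumulation steps.
if accelerator.is_main_process:
    accelerator.save_state("checkpoint/")  # checkpoint only from the main process
```

Run with `accelerate launch train.py`; per the skill doc, the same script then scales from a single GPU to multi-GPU setups without further code changes.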