ato 2.0.2__py3-none-any.whl → 2.1.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



@@ -1,1261 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: ato
3
- Version: 2.0.2
4
- Summary: A minimal, composable config layer for Python and ML pipelines. Built to stay, not to impress.
5
- Author: ato contributors
6
- License: MIT
7
- Project-URL: Homepage, https://github.com/yourusername/ato
8
- Project-URL: Repository, https://github.com/yourusername/ato
9
- Project-URL: Documentation, https://github.com/yourusername/ato#readme
10
- Project-URL: Issues, https://github.com/yourusername/ato/issues
11
- Keywords: config management,experiment tracking,hyperparameter optimization,lightweight,composable,namespace isolation,machine learning
12
- Classifier: Development Status :: 4 - Beta
13
- Classifier: Intended Audience :: Developers
14
- Classifier: Intended Audience :: Science/Research
15
- Classifier: License :: OSI Approved :: MIT License
16
- Classifier: Programming Language :: Python :: 3
17
- Classifier: Programming Language :: Python :: 3.7
18
- Classifier: Programming Language :: Python :: 3.8
19
- Classifier: Programming Language :: Python :: 3.9
20
- Classifier: Programming Language :: Python :: 3.10
21
- Classifier: Programming Language :: Python :: 3.11
22
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
23
- Classifier: Topic :: Software Development :: Libraries :: Python Modules
24
- Requires-Python: >=3.7
25
- Description-Content-Type: text/markdown
26
- License-File: LICENSE
27
- Requires-Dist: pyyaml>=6.0
28
- Requires-Dist: toml>=0.10.2
29
- Requires-Dist: sqlalchemy>=2.0
30
- Requires-Dist: numpy>=1.19.0
31
- Provides-Extra: distributed
32
- Requires-Dist: torch>=1.8.0; extra == "distributed"
33
- Dynamic: license-file
34
-
35
- # Ato: A Tiny Orchestrator
36
-
37
- ## A minimal, composable config layer for Python and ML pipelines
38
-
39
- Ato is a minimal, composable config system for Python and ML pipelines.
40
- It lets you **chain**, **merge**, and **freeze** modular configs,
41
- so you can move seamlessly from **dynamic experiments** to **static production builds**.
42
-
43
- Unlike heavy frameworks, Ato keeps everything **transparent** and **Pythonic** —
44
- you can use it alongside tools like Hydra, WandB, or MLflow without friction.
45
- It’s built for people who prefer clarity over magic.
46
-
47
- After all, *Ato* was never built to impress — it was built to stay.
48
-
49
- <details>
50
- <summary><strong>Developer’s Note</strong></summary>
51
-
52
- I didn’t know there was a great tool called Hydra.
53
- So I built something a bit simpler, a bit more opinionated,
54
- and maybe a bit more compatible — something that could also work nicely
55
- with amazing tools like Hydra, WandB, or MLflow.
56
-
57
- Even though I didn’t know these tools at the time,
58
- I deliberately designed for compatibility —
59
- and later, after learning about Hydra and others,
60
- I added explicit interop layers.
61
- Because I know how tempting — and exhausting —
62
- it can be to move from a familiar environment
63
- to a new, more attractive one.
64
-
65
- I’ve been the only user so far —
66
- not because I wanted to hide it,
67
- but because I never had anyone around
68
- who could really tell me if it was good enough.
69
- Maybe this is the right time to find out.
70
-
71
- So — there’s no need to compete.
72
- Just *try it once.*
73
- This tool won’t make you tired.
74
- It might even feel a little kind.
75
-
76
- </details>
77
-
78
- ---
79
-
80
- **Ato** is designed to work *with* your existing tools — not replace them.
81
- It provides configuration management, experiment tracking, and hyperparameter optimization
82
- as a **philosophical layer** that plays nicely with Hydra, MLflow, W&B, and whatever else you use.
83
-
84
- ## Why Ato?
85
-
86
- Ato isn't trying to compete with Hydra or replace your experiment tracking platform.
87
- It's for the projects that live *before* things get complicated — or for teams that want clarity over features.
88
-
89
- **Philosophy over framework**: Ato gives you enough structure to stay organized, without imposing a rigid system.
90
- Use it standalone, or layer it on top of Hydra, MLflow, or W&B. It's a tool, not a commitment.
91
-
92
- ### Core Differentiators
93
-
94
- - **True Namespace Isolation**: MultiScope provides independent config contexts (unique to Ato!)
95
- - **Configuration Transparency**: Visualize exact config merge order - debug configs with `manual` command
96
- - **Built-in Experiment Tracking**: SQLite-based tracking with no external services required
97
- - **Structural Hashing**: Track experiment structure changes automatically
98
-
99
- ### Developer Experience
100
-
101
- - **Zero Boilerplate**: Auto-nested configs, lazy evaluation, attribute access
102
- - **CLI-first Design**: Configure experiments from command line without touching code
103
- - **Framework Agnostic**: Works with PyTorch, TensorFlow, JAX, or pure Python
104
-
105
- ## Quick Start
106
-
107
- ```bash
108
- pip install ato
109
- ```
110
-
111
- ### 30-Second Example
112
-
113
- ```python
114
- from ato.scope import Scope
115
-
116
- scope = Scope()
117
-
118
- @scope.observe(default=True)
119
- def config(cfg):
120
-     cfg.lr = 0.001
121
-     cfg.batch_size = 32
122
-     cfg.model = 'resnet50'
123
-
124
- @scope
125
- def train(cfg):
126
-     print(f"Training {cfg.model} with lr={cfg.lr}")
127
-     # Your training code here
128
-
129
- if __name__ == '__main__':
130
-     train()  # python train.py
131
-     # Override from CLI: python train.py lr=0.01 model=%resnet101%
132
- ```
133
-
134
- ---
135
-
136
- ## Table of Contents
137
-
138
- - [ADict: Enhanced Dictionary](#adict-enhanced-dictionary)
139
- - [Scope: Configuration Management](#scope-configuration-management)
140
- - [MultiScope: Namespace Isolation](#2-multiscope---multiple-configuration-contexts) ⭐ Unique to Ato
141
- - [Config Documentation & Debugging](#5-configuration-documentation--inspection) ⭐ Unique to Ato
142
- - [SQL Tracker: Experiment Tracking](#sql-tracker-experiment-tracking)
143
- - [Hyperparameter Optimization](#hyperparameter-optimization)
144
- - [Best Practices](#best-practices)
145
- - [Future Work](#future-work--optional-modular-non-intrusive)
146
- - [Working with Existing Tools](#working-with-existing-tools)
147
-
148
- ---
149
-
150
- ## ADict: Enhanced Dictionary
151
-
152
- `ADict` is an enhanced dictionary designed for managing experiment configurations. It combines the simplicity of Python dictionaries with powerful features for ML workflows.
153
-
154
- ### Core Features
155
-
156
- These are the fundamental capabilities that make ADict powerful for experiment management:
157
-
158
- | Feature | Description | Why It Matters |
159
- |---------|-------------|----------------|
160
- | **Structural Hashing** | Hash based on keys + types, not values | Track when experiment structure changes |
161
- | **Nested Access** | Dot notation for nested configs | `config.model.lr` instead of `config['model']['lr']` |
162
- | **Format Agnostic** | Load/save JSON, YAML, TOML, XYZ | Work with any config format |
163
- | **Safe Updates** | `update_if_absent()` method | Prevent accidental overwrites |
164
-
165
- ### Developer Convenience Features
166
-
167
- These utilities maximize developer productivity and reduce boilerplate:
168
-
169
- | Feature | Description | Benefit |
170
- |---------|-------------|---------|
171
- | **Auto-nested (`ADict.auto()`)** | Infinite depth lazy creation | `config.a.b.c = 1` just works - no KeyError |
172
- | **Attribute-style Assignment** | `config.lr = 0.1` | Cleaner, more readable code |
173
- | **Conditional Updates** | Only update missing keys | Merge configs safely |
174
-
175
- ### Quick Examples
176
-
177
- ```python
178
- from ato.adict import ADict
179
-
180
- # Structural hashing - track config structure changes
181
- config1 = ADict(lr=0.1, epochs=100, model='resnet50')
182
- config2 = ADict(lr=0.01, epochs=200, model='resnet101')
183
- print(config1.get_structural_hash() == config2.get_structural_hash()) # True
184
-
185
- config3 = ADict(lr=0.1, epochs='100', model='resnet50') # epochs is str!
186
- print(config1.get_structural_hash() == config3.get_structural_hash()) # False
187
-
188
- # Load/save any format
189
- config = ADict.from_file('config.json')
190
- config.dump('config.yaml')
191
-
192
- # Safe updates
193
- config.update_if_absent(lr=0.01, scheduler='cosine') # Only adds scheduler
194
- ```
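The structural-hashing idea shown above (hash the keys and value types, ignore the values) can be sketched in plain Python. This is an illustrative approximation only, not Ato's implementation: `structural_hash` is a hypothetical helper, and the choice of SHA-256 over a JSON-encoded "shape" is an assumption.

```python
import hashlib
import json


def structural_hash(config: dict) -> str:
    """Hash key names and value types of a (possibly nested) dict, ignoring values."""
    def shape(value):
        # Recurse into nested dicts; replace every leaf with its type name.
        if isinstance(value, dict):
            return {key: shape(value[key]) for key in sorted(value)}
        return type(value).__name__

    encoded = json.dumps(shape(config), sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


# Same keys and types, different values: the hashes match.
same = structural_hash({"lr": 0.1, "epochs": 100}) == structural_hash({"lr": 0.01, "epochs": 200})
# Same keys, but epochs is a str in one config: the hashes differ.
different = structural_hash({"lr": 0.1, "epochs": 100}) != structural_hash({"lr": 0.1, "epochs": "100"})
```

Under these assumptions, two runs compare equal exactly when their configs have the same key layout and leaf types, which is the property the README relies on for finding "similar" experiments.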
195
-
196
- ### Convenience Features in Detail
197
-
198
- #### Auto-nested: Zero Boilerplate Config Building
199
-
200
- The most loved feature - no more manual nesting:
201
-
202
- ```python
203
- # ❌ Traditional way
204
- config = ADict()
205
- config.model = ADict()
206
- config.model.backbone = ADict()
207
- config.model.backbone.layers = [64, 128, 256]
208
-
209
- # ✅ With ADict.auto()
210
- config = ADict.auto()
211
- config.model.backbone.layers = [64, 128, 256] # Just works!
212
- config.data.augmentation.brightness = 0.2
213
- ```
214
-
215
- **Perfect for Scope integration**:
216
-
217
- ```python
218
- from ato.scope import Scope
219
-
220
- scope = Scope()
221
-
222
- @scope.observe(default=True)
223
- def config(cfg):
224
-     # No pre-definition needed!
225
-     cfg.training.optimizer.name = 'AdamW'
226
-     cfg.training.optimizer.lr = 0.001
227
-     cfg.model.encoder.num_layers = 12
228
- ```
229
-
230
- **Works with CLI**:
231
-
232
- ```bash
233
- python train.py model.backbone.resnet.depth=50 data.batch_size=32
234
- ```
235
-
236
- #### More Convenience Utilities
237
-
238
- ```python
239
- # Attribute-style access
240
- config.lr = 0.1
241
- print(config.lr) # Instead of config['lr']
242
-
243
- # Nested access
244
- print(config.model.backbone.type) # Clean and readable
245
-
246
- # Conditional updates - merge configs safely
247
- base_config.update_if_absent(**experiment_config)
248
- ```
249
-
250
- ---
251
-
252
- ## Scope: Configuration Management
253
-
254
- Scope solves configuration complexity through **priority-based merging** and **CLI integration**. No more scattered config files or hard-coded parameters.
255
-
256
- ### Key Concepts
257
-
258
- ```
259
- Default Configs (priority=0)
260
-
261
- Named Configs (priority=0+)
262
-
263
- CLI Arguments (highest priority)
264
-
265
- Lazy Configs (computed after CLI)
266
- ```
267
-
268
- ### Basic Usage
269
-
270
- #### Simple Configuration
271
-
272
- ```python
273
- from ato.scope import Scope
274
-
275
- scope = Scope()
276
-
277
- @scope.observe()
278
- def my_config(config):
279
-     config.dataset = 'cifar10'
280
-     config.lr = 0.001
281
-     config.batch_size = 32
282
-
283
- @scope
284
- def train(config):
285
-     print(f"Training on {config.dataset}")
286
-     # Your code here
287
-
288
- if __name__ == '__main__':
289
-     train()
290
- ```
291
-
292
- #### Priority-based Merging
293
-
294
- ```python
295
- @scope.observe(default=True) # Always applied
296
- def defaults(cfg):
297
-     cfg.lr = 0.001
298
-     cfg.epochs = 100
299
-
300
- @scope.observe(priority=1)  # Applied after defaults
301
- def high_lr(cfg):
302
-     cfg.lr = 0.01
303
-
304
- @scope.observe(priority=2)  # Applied last
305
- def long_training(cfg):
306
-     cfg.epochs = 300
307
- ```
308
-
309
- ```bash
310
- python train.py # lr=0.001, epochs=100
311
- python train.py high_lr # lr=0.01, epochs=100
312
- python train.py high_lr long_training # lr=0.01, epochs=300
313
- ```
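The merge order described above (defaults, then selected named configs by ascending priority, then CLI arguments, then lazy configs) can be sketched without Ato. `View` and `resolve` are hypothetical names introduced for illustration; the real `Scope` may order or store its views differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class View:
    name: str
    apply: Callable[[dict], None]
    priority: int = 0
    default: bool = False
    lazy: bool = False


def resolve(views, selected, cli_overrides):
    """Apply default/selected views by priority, then CLI overrides, then lazy views."""
    active = [v for v in views if v.default or v.name in selected]
    eager = sorted((v for v in active if not v.lazy), key=lambda v: v.priority)
    lazy = sorted((v for v in active if v.lazy), key=lambda v: v.priority)
    config = {}
    for view in eager:
        view.apply(config)
    config.update(cli_overrides)  # CLI values win over every eager view
    for view in lazy:
        view.apply(config)  # lazy views can read the CLI-adjusted values
    return config


VIEWS = [
    View("defaults", lambda c: c.update(lr=0.001, epochs=100), default=True),
    View("high_lr", lambda c: c.update(lr=0.01), priority=1),
    View("long_training", lambda c: c.update(epochs=300), priority=2),
]
```

For example, `resolve(VIEWS, ["high_lr"], {})` mirrors `python train.py high_lr`, and passing `{"lr": 0.5}` as `cli_overrides` mirrors appending `lr=0.5` on the command line.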
314
-
315
- #### CLI Configuration
316
-
317
- Override any parameter from command line:
318
-
319
- ```bash
320
- # Simple values
321
- python train.py lr=0.01 batch_size=64
322
-
323
- # Nested configs
324
- python train.py model.backbone=%resnet101% model.depth=101
325
-
326
- # Lists and complex types
327
- python train.py layers=[64,128,256,512] dropout=0.5
328
-
329
- # Combine with named configs
330
- python train.py my_config lr=0.001 batch_size=128
331
- ```
332
-
333
- **Note**: Wrap strings with `%` (e.g., `%resnet101%`) instead of quotes.
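To make the `%`-quoting rule concrete, here is a minimal sketch of how a single `key=value` argument could be parsed. `parse_override` is a hypothetical helper, and falling back to `ast.literal_eval` for numbers and lists is an assumption; Ato's own parser may behave differently.

```python
import ast


def parse_override(argument: str):
    """Split 'key=value' and convert the value; %text% marks a string literal."""
    key, _, raw = argument.partition("=")
    if len(raw) >= 2 and raw.startswith("%") and raw.endswith("%"):
        return key, raw[1:-1]  # strip the % delimiters, keep the text as str
    try:
        return key, ast.literal_eval(raw)  # numbers, lists, tuples, booleans
    except (ValueError, SyntaxError):
        return key, raw  # anything unparseable stays a raw string
```

Using `%` as the delimiter avoids shell quoting issues: the shell would strip ordinary quotes from `model="resnet101"` before the program ever sees them.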
334
-
335
- ### Advanced Features
336
-
337
- #### 1. Lazy Evaluation - Dynamic Configuration
338
-
339
- Sometimes you need configs that depend on other values set via CLI:
340
-
341
- ```python
342
- @scope.observe()
343
- def base_config(cfg):
344
-     cfg.model = 'resnet50'
345
-     cfg.dataset = 'imagenet'
346
-
347
- @scope.observe(lazy=True)  # Evaluated AFTER CLI args
348
- def computed_config(cfg):
349
-     # Adjust based on dataset
350
-     if cfg.dataset == 'imagenet':
351
-         cfg.num_classes = 1000
352
-         cfg.image_size = 224
353
-     elif cfg.dataset == 'cifar10':
354
-         cfg.num_classes = 10
355
-         cfg.image_size = 32
356
- ```
357
-
358
- ```bash
359
- python train.py dataset=%cifar10% computed_config
360
- # Results in: num_classes=10, image_size=32
361
- ```
362
-
363
- **Python 3.11+ Context Manager**:
364
-
365
- ```python
366
- @scope.observe()
367
- def my_config(cfg):
368
-     cfg.model = 'resnet50'
369
-     cfg.num_layers = 50
370
-
371
-     with Scope.lazy():  # Evaluated after CLI
372
-         if cfg.model == 'resnet101':
373
-             cfg.num_layers = 101
374
- ```
375
-
376
- #### 2. MultiScope - Multiple Configuration Contexts
377
-
378
- **Unique to Ato**: Manage completely separate configuration namespaces. Unlike Hydra's config groups, MultiScope provides true **namespace isolation** with independent priority systems.
379
-
380
- ##### Why MultiScope?
381
-
382
- | Challenge | Hydra's Approach | Ato's MultiScope |
383
- |-----------|------------------|---------------------|
384
- | Separate model/data configs | Config groups in one namespace | **Independent scopes with own priorities** |
385
- | Avoid key collisions | Manual prefixing (`model.lr`, `train.lr`) | **Automatic namespace isolation** |
386
- | Different teams/modules | Single config file | **Each scope can be owned separately** |
387
- | Priority conflicts | Global priority system | **Per-scope priority system** |
388
-
389
- ##### Basic Usage
390
-
391
- ```python
392
- from ato.scope import Scope, MultiScope
393
-
394
- model_scope = Scope(name='model')
395
- data_scope = Scope(name='data')
396
- scope = MultiScope(model_scope, data_scope)
397
-
398
- @model_scope.observe(default=True)
399
- def model_config(model):
400
-     model.backbone = 'resnet50'
401
-     model.pretrained = True
402
-
403
- @data_scope.observe(default=True)
404
- def data_config(data):
405
-     data.dataset = 'cifar10'
406
-     data.batch_size = 32
407
-
408
- @scope
409
- def train(model, data):  # Named parameters match scope names
410
-     print(f"Training {model.backbone} on {data.dataset}")
411
- ```
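The "named parameters match scope names" convention above can be illustrated with `inspect.signature`. This is a sketch of the dispatch idea only: `inject_scopes` is a hypothetical decorator, and plain dicts stand in for the per-scope configs.

```python
import inspect


def inject_scopes(namespaces):
    """Pass each function parameter the namespace registered under the same name."""
    def decorator(fn):
        names = list(inspect.signature(fn).parameters)

        def wrapper():
            # Each parameter receives its own, independent config object.
            return fn(**{name: namespaces[name] for name in names})

        return wrapper
    return decorator


namespaces = {"model": {"lr": 0.1}, "data": {"lr": 0.001}}


@inject_scopes(namespaces)
def train(model, data):
    # Both namespaces define 'lr', but the values never collide.
    return model["lr"], data["lr"]
```

Because every scope lives under its own key, the same option name can appear in several namespaces without prefixing.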
412
-
413
- ##### Real-world: Team Collaboration
414
-
415
- Different team members can own different scopes without conflicts:
416
-
417
- ```python
418
- # team_model.py - ML team owns this
419
- model_scope = Scope(name='model')
420
-
421
- @model_scope.observe(default=True)
422
- def resnet_default(model):
423
-     model.backbone = 'resnet50'
424
-     model.lr = 0.1  # Model-specific learning rate
425
-
426
- @model_scope.observe(priority=1)
427
- def resnet101(model):
428
-     model.backbone = 'resnet101'
429
-     model.lr = 0.05  # Different lr for bigger model
430
-
431
- # team_data.py - Data team owns this
432
- data_scope = Scope(name='data')
433
-
434
- @data_scope.observe(default=True)
435
- def cifar_default(data):
436
-     data.dataset = 'cifar10'
437
-     data.lr = 0.001  # Data augmentation learning rate (no conflict!)
438
-
439
- @data_scope.observe(priority=1)
440
- def imagenet(data):
441
-     data.dataset = 'imagenet'
442
-     data.workers = 16
443
-
444
- # train.py - Integration point
445
- from team_model import model_scope
446
- from team_data import data_scope
447
-
448
- scope = MultiScope(model_scope, data_scope)
449
-
450
- @scope
451
- def train(model, data):
452
-     # Both have 'lr' but in separate namespaces!
453
-     print(f"Model LR: {model.lr}, Data LR: {data.lr}")
454
- ```
455
-
456
- **Key advantage**: `model.lr` and `data.lr` are completely independent. No need for naming conventions like `model_lr` vs `data_lr`.
457
-
458
- ##### CLI with MultiScope
459
-
460
- Override each scope independently:
461
-
462
- ```bash
463
- # Override model scope only
464
- python train.py model.backbone=%resnet101%
465
-
466
- # Override data scope only
467
- python train.py data.dataset=%imagenet%
468
-
469
- # Override both
470
- python train.py model.backbone=%resnet101% data.dataset=%imagenet%
471
-
472
- # Call named configs per scope
473
- python train.py resnet101 imagenet
474
- ```
475
-
476
- #### 3. Import/Export Configs
477
-
478
- Ato supports importing configs from multiple frameworks:
479
-
480
- ```python
481
- @scope.observe()
482
- def load_external(config):
483
-     # Load from any format
484
-     config.load('experiments/baseline.json')
485
-     config.load('models/resnet.yaml')
486
-
487
-     # Export to any format
488
-     config.dump('output/final_config.toml')
489
-
490
-     # Import OpenMMLab configs - handles _base_ inheritance automatically
491
-     config.load_mm_config('mmdet_configs/faster_rcnn.py')
492
- ```
493
-
494
- **OpenMMLab compatibility** is built-in:
495
- - Automatically resolves `_base_` inheritance chains
496
- - Supports `_delete_` keys for config overriding
497
- - Makes migration from MMDetection/MMSegmentation/etc. seamless
498
-
499
- **Hydra-style config composition** is also built-in via `compose_hierarchy`:
500
-
501
- ```python
502
- from ato.adict import ADict
503
-
504
- # Hydra-style directory structure:
505
- # configs/
506
- # ├── config.yaml # base config
507
- # ├── model/
508
- # │   ├── resnet50.yaml
509
- # │   └── resnet101.yaml
510
- # └── data/
511
- #     ├── cifar10.yaml
512
- #     └── imagenet.yaml
513
-
514
- config = ADict.compose_hierarchy(
515
-     root='configs',
516
-     config_filename='config',
517
-     select={
518
-         'model': 'resnet50',  # or ['resnet50', 'resnet101'] for multiple
519
-         'data': 'imagenet'
520
-     },
521
-     overrides={
522
-         'model.lr': 0.01,
523
-         'data.batch_size': 64
524
-     },
525
-     required=['model.backbone', 'data.dataset'],  # Validation
526
-     on_missing='warn'  # or 'error'
527
- )
527
- )
528
- ```
529
-
530
- **Key features**:
531
- - Config groups (model/, data/, optimizer/, etc.)
532
- - Automatic file discovery (tries .yaml, .json, .toml, .xyz)
533
- - Dotted overrides (`model.lr=0.01`)
534
- - Required key validation
535
- - Flexible error handling
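The dotted-override feature listed above (`model.lr=0.01`) amounts to walking a nested mapping one key segment at a time. The following sketch shows that idea with plain dicts; `apply_overrides` is a hypothetical helper and is not part of `compose_hierarchy`.

```python
def apply_overrides(config: dict, overrides: dict) -> dict:
    """Set each dotted key (e.g. 'model.lr') inside a nested dict, creating levels as needed."""
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = config
        for part in parents:
            # Descend into the existing level, or create an empty one.
            node = node.setdefault(part, {})
        node[leaf] = value
    return config


merged = apply_overrides(
    {"model": {"backbone": "resnet50"}},
    {"model.lr": 0.01, "data.batch_size": 64},
)
```

This simple version assumes every intermediate value is a dict; a more defensive implementation would raise a clear error when an override tries to descend into a scalar.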
536
-
537
- #### 4. Argparse Integration
538
-
539
- Mix Ato with existing argparse code:
540
-
541
- ```python
542
- from ato.scope import Scope
543
- import argparse
544
-
545
- scope = Scope(use_external_parser=True)
546
- parser = argparse.ArgumentParser()
547
- parser.add_argument('--gpu', type=int, default=0)
548
- parser.add_argument('--seed', type=int, default=42)
549
-
550
- @scope.observe(default=True)
551
- def config(cfg):
552
-     cfg.lr = 0.001
553
-     cfg.batch_size = 32
554
-
555
- @scope
556
- def train(cfg):
557
-     print(f"GPU: {cfg.gpu}, LR: {cfg.lr}")
558
-
559
- if __name__ == '__main__':
560
-     parser.parse_args()  # Merges argparse with scope
561
-     train()
562
- ```
563
-
564
- #### 5. Configuration Documentation & Inspection
565
-
566
- **One of Ato's most powerful features**: Auto-generate documentation AND visualize the exact order of configuration application.
567
-
568
- ##### Basic Documentation
569
-
570
- ```python
571
- @scope.manual
572
- def config_docs(cfg):
573
-     cfg.lr = 'Learning rate for optimizer'
574
-     cfg.batch_size = 'Number of samples per batch'
575
-     cfg.model = 'Model architecture (resnet50, resnet101, etc.)'
576
- ```
577
-
578
- ```bash
579
- python train.py manual
580
- ```
581
-
582
- **Output:**
583
- ```
584
- --------------------------------------------------
585
- [Scope "config"]
586
- (The Applying Order of Views)
587
- defaults → (CLI Inputs) → lazy_config → main
588
-
589
- (User Manuals)
590
- config.lr: Learning rate for optimizer
591
- config.batch_size: Number of samples per batch
592
- config.model: Model architecture (resnet50, resnet101, etc.)
593
- --------------------------------------------------
594
- ```
595
-
596
- ##### Why This Matters
597
-
598
- The **applying order visualization** shows you **exactly** how your configs are merged:
599
- - Which config functions are applied (in order)
600
- - When CLI inputs override values
601
- - Where lazy configs are evaluated
602
- - The final function that uses the config
603
-
604
- **This prevents configuration bugs** by making the merge order explicit and debuggable.
605
-
606
- ##### MultiScope Documentation
607
-
608
- For complex projects with multiple scopes, `manual` shows each scope separately:
609
-
610
- ```python
611
- from ato.scope import Scope, MultiScope
612
-
613
- model_scope = Scope(name='model')
614
- train_scope = Scope(name='train')
615
- scope = MultiScope(model_scope, train_scope)
616
-
617
- @model_scope.observe(default=True)
618
- def model_defaults(model):
619
-     model.backbone = 'resnet50'
620
-     model.num_layers = 50
621
-
622
- @model_scope.observe(priority=1)
623
- def model_advanced(model):
624
-     model.pretrained = True
625
-
626
- @model_scope.observe(lazy=True)
627
- def model_lazy(model):
628
-     if model.backbone == 'resnet101':
629
-         model.num_layers = 101
630
-
631
- @train_scope.observe(default=True)
632
- def train_defaults(train):
633
-     train.lr = 0.001
634
-     train.epochs = 100
635
-
636
- @model_scope.manual
637
- def model_docs(model):
638
-     model.backbone = 'Model backbone architecture'
639
-     model.num_layers = 'Number of layers in the model'
640
-
641
- @train_scope.manual
642
- def train_docs(train):
643
-     train.lr = 'Learning rate for optimizer'
644
-     train.epochs = 'Total training epochs'
645
-
646
- @scope
647
- def main(model, train):
648
-     print(f"Training {model.backbone} with lr={train.lr}")
649
-
650
- if __name__ == '__main__':
651
-     main()
652
- ```
653
-
654
- ```bash
655
- python train.py manual
656
- ```
657
-
658
- **Output:**
659
- ```
660
- --------------------------------------------------
661
- [Scope "model"]
662
- (The Applying Order of Views)
663
- model_defaults → model_advanced → (CLI Inputs) → model_lazy → main
664
-
665
- (User Manuals)
666
- model.backbone: Model backbone architecture
667
- model.num_layers: Number of layers in the model
668
- --------------------------------------------------
669
- [Scope "train"]
670
- (The Applying Order of Views)
671
- train_defaults → (CLI Inputs) → main
672
-
673
- (User Manuals)
674
- train.lr: Learning rate for optimizer
675
- train.epochs: Total training epochs
676
- --------------------------------------------------
677
- ```
678
-
679
- ##### Real-world Example
680
-
681
- This is especially valuable when debugging why a config value isn't what you expect:
682
-
683
- ```python
684
- @scope.observe(default=True)
685
- def defaults(cfg):
686
-     cfg.lr = 0.001
687
-
688
- @scope.observe(priority=1)
689
- def experiment_config(cfg):
690
-     cfg.lr = 0.01
691
-
692
- @scope.observe(priority=2)
693
- def another_config(cfg):
694
-     cfg.lr = 0.1
695
-
696
- @scope.observe(lazy=True)
697
- def adaptive_lr(cfg):
698
-     if cfg.batch_size > 64:
699
-         cfg.lr = cfg.lr * 2
700
- ```
701
-
702
- When you run `python train.py manual`, you see:
703
- ```
704
- (The Applying Order of Views)
705
- defaults → experiment_config → another_config → (CLI Inputs) → adaptive_lr → main
706
- ```
707
-
708
- Now it's **crystal clear** why `lr=0.1` (from `another_config`) and not `0.01`!
709
-
710
- ---
711
-
712
- ## SQL Tracker: Experiment Tracking
713
-
714
- Lightweight experiment tracking using SQLite - no external services, no setup complexity.
715
-
716
- ### Why SQL Tracker?
717
-
718
- - **Zero Setup**: Just a SQLite file, no servers
719
- - **Full History**: Track all runs, metrics, and artifacts
720
- - **Smart Search**: Find similar experiments by config structure
721
- - **Code Versioning**: Track code changes via fingerprints
722
-
723
- ### Database Schema
724
-
725
- ```
726
- Project (my_ml_project)
727
- ├── Experiment (run_1)
728
- │   ├── config: {...}
729
- │   ├── structural_hash: "abc123..."
730
- │   ├── Metrics: [loss, accuracy, ...]
731
- │   ├── Artifacts: [model.pt, plots/*, ...]
732
- │   └── Fingerprints: [model_forward, train_step, ...]
733
- ├── Experiment (run_2)
734
- └── ...
735
- ```
736
-
737
- ### Quick Start
738
-
739
- #### Logging Experiments
740
-
741
- ```python
742
- from ato.db_routers.sql.manager import SQLLogger
743
- from ato.adict import ADict
744
-
745
- # Setup config
746
- config = ADict(
747
-     experiment=ADict(
748
-         project_name='image_classification',
749
-         sql=ADict(db_path='sqlite:///experiments.db')
750
-     ),
751
-     # Your hyperparameters
752
-     lr=0.001,
753
-     batch_size=32,
754
-     model='resnet50'
755
- )
756
-
757
- # Create logger
758
- logger = SQLLogger(config)
759
-
760
- # Start experiment run
761
- run_id = logger.run(tags=['baseline', 'resnet50', 'cifar10'])
762
-
763
- # Training loop
764
- for epoch in range(100):
765
-     # Your training code
766
-     train_loss = train_one_epoch()
767
-     val_acc = validate()
768
-
769
-     # Log metrics
770
-     logger.log_metric('train_loss', train_loss, step=epoch)
771
-     logger.log_metric('val_accuracy', val_acc, step=epoch)
772
-
773
- # Log artifacts
774
- logger.log_artifact(run_id, 'checkpoints/model_best.pt',
775
-                     data_type='model',
776
-                     metadata={'epoch': best_epoch})
777
-
778
- # Finish run
779
- logger.finish(status='completed')
780
- ```
781
-
782
- #### Querying Experiments
783
-
784
- ```python
785
- from ato.db_routers.sql.manager import SQLFinder
786
-
787
- finder = SQLFinder(config)
788
-
789
- # Get all runs in project
790
- runs = finder.get_runs_in_project('image_classification')
791
- for run in runs:
792
-     print(f"Run {run.id}: {run.config.model} - {run.status}")
793
-
794
- # Find best performing run
795
- best_run = finder.find_best_run(
796
-     project_name='image_classification',
797
-     metric_key='val_accuracy',
798
-     mode='max'  # or 'min' for loss
799
- )
800
- print(f"Best config: {best_run.config}")
801
-
802
- # Find similar experiments (same config structure)
803
- similar = finder.find_similar_runs(run_id=123)
804
- print(f"Found {len(similar)} runs with similar config structure")
805
-
806
- # Trace statistics (code fingerprints)
807
- stats = finder.get_trace_statistics('image_classification', trace_id='model_forward')
808
- print(f"Model forward pass has {stats['static_trace_versions']} versions")
809
- ```
810
-
811
- ### Real-world Example: Experiment Comparison
812
-
813
- ```python
814
- # Compare hyperparameter impact
815
- finder = SQLFinder(config)
816
-
817
- runs = finder.get_runs_in_project('my_project')
818
- for run in runs:
819
-     # Get final accuracy
820
-     final_metrics = [m for m in run.metrics if m.key == 'val_accuracy']
821
-     best_acc = max(m.value for m in final_metrics) if final_metrics else 0
822
-
823
-     print(f"LR: {run.config.lr}, Batch: {run.config.batch_size} → Acc: {best_acc:.2%}")
824
- ```
825
-
826
- ### Features Summary
827
-
828
- | Feature | Description |
829
- |---------|-------------|
830
- | **Structural Hash** | Auto-track config structure changes |
831
- | **Metric Logging** | Time-series metrics with step tracking |
832
- | **Artifact Management** | Track model checkpoints, plots, data files |
833
- | **Fingerprint Tracking** | Version control for code (static & runtime) |
834
- | **Smart Search** | Find similar configs, best runs, statistics |
835
-
836
- ---
837
-
838
- ## Hyperparameter Optimization
839
-
840
- Built-in **Hyperband** algorithm for efficient hyperparameter search with early stopping.
841
-
842
- ### Extensible Design
843
-
844
- Ato's hyperopt module is built for extensibility and reusability:
845
-
846
- | Component | Purpose | Benefit |
847
- |-----------|---------|---------|
848
- | `GridSpaceMixIn` | Parameter sampling logic | Reusable across different algorithms |
849
- | `HyperOpt` | Base optimization class | Easy to implement custom strategies |
850
- | `DistributedMixIn` | Distributed training support | Optional, composable |
851
-
852
- **This design makes it trivial to implement custom search algorithms**:
853
-
854
- ```python
855
- from ato.hyperopt.base import GridSpaceMixIn, HyperOpt
856
-
857
- class RandomSearch(GridSpaceMixIn, HyperOpt):
858
-     def main(self, func):
859
-         # Reuse GridSpaceMixIn.prepare_distributions()
860
-         configs = self.prepare_distributions(self.config, self.search_spaces)
861
-
862
-         # Implement random sampling
863
-         import random
864
-         random.shuffle(configs)
865
-
866
-         results = []
867
-         for config in configs[:10]:  # Sample 10 random configs
868
-             metric = func(config)
869
-             results.append((config, metric))
870
-
871
-         return max(results, key=lambda x: x[1])
872
- ```
873
-
874
- ### How Hyperband Works
875
-
876
- Hyperband uses successive halving:
877
- 1. Start with many configs, train briefly
878
- 2. Keep top performers, discard poor ones
879
- 3. Train survivors longer
880
- 4. Repeat until one winner remains
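The four steps above can be sketched as a plain successive-halving loop. This is an illustration under stated assumptions, not Ato's `HyperBand` class: `successive_halving` and its `evaluate(config, round_index)` callback are hypothetical, and the parameter names `halving_rate`, `num_min_samples`, and `mode` are borrowed from the usage example in this README.

```python
import math


def successive_halving(configs, evaluate, halving_rate=0.3, num_min_samples=3, mode="max"):
    """Repeatedly score the surviving configs and keep only the top fraction."""
    survivors = list(configs)
    floor = max(1, num_min_samples)
    round_index = 0
    while len(survivors) > floor:
        # Later rounds are expected to train longer; evaluate() decides how.
        scored = [(evaluate(config, round_index), config) for config in survivors]
        scored.sort(key=lambda pair: pair[0], reverse=(mode == "max"))
        keep = max(1, math.ceil(len(scored) * halving_rate))
        keep = min(keep, len(scored) - 1)  # always drop at least one config
        survivors = [config for _, config in scored[:keep]]
        round_index += 1
    return survivors
```

With 20 starting configs and `halving_rate=0.3`, the pool shrinks to roughly 6 and then 2 configs before the loop stops at or below `num_min_samples`.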
881
-
882
- ### Basic Usage
883
-
884
- ```python
885
- from ato.adict import ADict
886
- from ato.hyperopt.hyperband import HyperBand
887
- from ato.scope import Scope
888
-
889
- scope = Scope()
890
-
891
- # Define search space
892
- search_spaces = ADict(
893
-     lr=ADict(
894
-         param_type='FLOAT',
895
-         param_range=(1e-5, 1e-1),
896
-         num_samples=20,
897
-         space_type='LOG'  # Logarithmic spacing
898
-     ),
899
-     batch_size=ADict(
900
-         param_type='INTEGER',
901
-         param_range=(16, 128),
902
-         num_samples=5,
903
-         space_type='LOG'
904
-     ),
905
-     model=ADict(
906
-         param_type='CATEGORY',
907
-         categories=['resnet50', 'resnet101', 'efficientnet_b0']
908
-     )
909
- )
910
-
911
- # Create Hyperband optimizer
912
- hyperband = HyperBand(
913
-     scope,
914
-     search_spaces,
915
-     halving_rate=0.3,  # Keep top 30% each round
916
-     num_min_samples=3,  # Stop when <= 3 configs remain
917
-     mode='max'  # Maximize metric (use 'min' for loss)
918
- )
919
-
920
- @hyperband.main
921
- def train(config):
922
-     # Your training code
923
-     model = create_model(config.model)
924
-     optimizer = Adam(lr=config.lr)
925
-
926
-     # Use __num_halved__ for early stopping
927
-     num_epochs = compute_epochs(config.__num_halved__)
928
-
929
-     # Train and return metric
930
-     val_acc = train_and_evaluate(model, optimizer, num_epochs)
931
-     return val_acc
932
-
933
- if __name__ == '__main__':
934
-     # Run hyperparameter search
935
-     best_result = train()
936
-     print(f"Best config: {best_result.config}")
937
-     print(f"Best metric: {best_result.metric}")
938
- ```
939
-
940
- ### Automatic Step Calculation
941
-
942
- Let Hyperband compute optimal training steps:
943
-
944
- ```python
945
- hyperband = HyperBand(scope, search_spaces, halving_rate=0.3, num_min_samples=4)
946
-
947
- max_steps = 100000
948
- steps_per_generation = hyperband.compute_optimized_initial_training_steps(max_steps)
949
- # Example output: [27, 88, 292, 972, 3240, 10800, 36000, 120000]
950
-
951
- # Use in training
952
- @hyperband.main
953
- def train(config):
954
-     generation = config.__num_halved__
955
-     num_steps = steps_per_generation[generation]
956
-
957
-     metric = train_for_n_steps(num_steps)
958
-     return metric
959
- ```
960
-
961
- ### Parameter Types
962
-
963
- | Type | Description | Example |
964
- |------|-------------|---------|
965
- | `FLOAT` | Continuous values | Learning rate, dropout |
966
- | `INTEGER` | Discrete integers | Batch size, num layers |
967
- | `CATEGORY` | Categorical choices | Model type, optimizer |
968
-
969
- Space types:
970
- - `LOG`: Logarithmic spacing (good for learning rates)
971
- - `LINEAR`: Linear spacing (default)
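The difference between the two space types can be shown with a short sketch. `sample_points` is a hypothetical helper that returns evenly spaced grid points; how Ato actually samples, and how it rounds `INTEGER` parameters, is not shown in this README and may differ.

```python
import math


def sample_points(low, high, num_samples, space_type="LINEAR"):
    """Return num_samples grid points between low and high (inclusive)."""
    if num_samples == 1:
        return [low]
    if space_type == "LOG":
        # Even steps in log space: each point is a constant factor above the last.
        log_low, log_high = math.log(low), math.log(high)
        step = (log_high - log_low) / (num_samples - 1)
        return [math.exp(log_low + i * step) for i in range(num_samples)]
    # Even steps in linear space.
    step = (high - low) / (num_samples - 1)
    return [low + i * step for i in range(num_samples)]


log_points = sample_points(1e-5, 1e-1, 5, "LOG")
linear_points = sample_points(0, 10, 3)
```

For a learning-rate range such as `(1e-5, 1e-1)`, logarithmic spacing places one point per order of magnitude, whereas linear spacing would put nearly all points above `1e-2`.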
-
- ### Distributed Hyperparameter Search
-
- Ato supports distributed hyperparameter optimization out of the box:
-
- ```python
- from ato.hyperopt.hyperband import DistributedHyperBand
- import torch.distributed as dist
- from torch.nn.parallel import DistributedDataParallel as DDP
-
- # Initialize distributed training
- dist.init_process_group(backend='nccl')
- rank = dist.get_rank()
- world_size = dist.get_world_size()
-
- # Create distributed hyperband
- hyperband = DistributedHyperBand(
-     scope,
-     search_spaces,
-     halving_rate=0.3,
-     num_min_samples=3,
-     mode='max',
-     rank=rank,
-     world_size=world_size,
-     backend='pytorch'
- )
-
- @hyperband.main
- def train(config):
-     # Your distributed training code
-     model = create_model(config)
-     # device_ids=[rank] assumes one process per GPU on a single node
-     model = DDP(model, device_ids=[rank])
-     metric = train_and_evaluate(model)
-     return metric
-
- if __name__ == '__main__':
-     result = train()
-     if rank == 0:
-         print(f"Best config: {result.config}")
- ```
-
- **Key features**:
- - Automatic work distribution across GPUs
- - Synchronized config selection via `broadcast_object_from_root`
- - Results aggregation with `all_gather_object`
- - Compatible with PyTorch DDP, FSDP, DeepSpeed
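One simple way to split candidate configs across ranks is round-robin sharding, sketched below in plain Python with the ranks simulated in a loop. This illustrates the general pattern under that assumption and is not Ato's implementation; `shard_configs` and `merge_results` are hypothetical helpers. In a real job, the per-rank results would be exchanged with a collective such as `torch.distributed.all_gather_object` rather than a local list.

```python
def shard_configs(configs, rank, world_size):
    """Return the configs assigned to `rank` under round-robin sharding."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    return configs[rank::world_size]


def merge_results(per_rank_results, mode='max'):
    """Flatten per-rank lists of (metric, config) pairs and pick the best pair."""
    flat = [pair for results in per_rank_results for pair in results]
    pick = max if mode == 'max' else min
    return pick(flat, key=lambda pair: pair[0])


configs = [{'lr': lr} for lr in (0.1, 0.03, 0.01, 0.003, 0.001)]
world_size = 2

# Simulate each rank evaluating its own shard. The toy metric is just -lr.
per_rank = [
    [(-cfg['lr'], cfg) for cfg in shard_configs(configs, rank, world_size)]
    for rank in range(world_size)
]

best_metric, best_config = merge_results(per_rank, mode='max')
print(best_config)
```

Every rank must agree on the same candidate list before sharding, which is why a broadcast from the root rank typically precedes this step.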
-
- ---
-
- ## Best Practices
-
- ### 1. Project Structure
-
- ```
- my_project/
- ├── configs/
- │   ├── default.py       # Default config with @scope.observe(default=True)
- │   ├── models.py        # Model-specific configs
- │   └── datasets.py      # Dataset configs
- ├── train.py             # Main training script
- ├── experiments.db       # SQLite experiment tracking
- └── experiments/
-     ├── run_001/
-     │   ├── checkpoints/
-     │   └── logs/
-     └── run_002/
- ```
-
- ### 2. Config Organization
-
- ```python
- # configs/default.py
- from ato.adict import ADict
- from ato.scope import Scope
-
- scope = Scope()
-
- @scope.observe(default=True)
- def defaults(cfg):
-     # Data
-     cfg.data = ADict(
-         dataset='cifar10',
-         batch_size=32,
-         num_workers=4
-     )
-
-     # Model
-     cfg.model = ADict(
-         backbone='resnet50',
-         pretrained=True
-     )
-
-     # Training
-     cfg.train = ADict(
-         lr=0.001,
-         epochs=100,
-         optimizer='adam'
-     )
-
-     # Experiment tracking
-     cfg.experiment = ADict(
-         project_name='my_project',
-         sql=ADict(db_path='sqlite:///experiments.db')
-     )
- ```
-
- ### 3. Combined Workflow
-
- ```python
- from ato.db_routers.sql.manager import SQLLogger
- from configs.default import scope
-
- @scope
- def train(cfg):
-     # Setup experiment tracking
-     logger = SQLLogger(cfg)
-     run_id = logger.run(tags=[cfg.model.backbone, cfg.data.dataset])
-
-     try:
-         # Training loop
-         for epoch in range(cfg.train.epochs):
-             loss = train_epoch()
-             acc = validate()
-
-             logger.log_metric('loss', loss, epoch)
-             logger.log_metric('accuracy', acc, epoch)
-
-         logger.finish(status='completed')
-
-     except Exception:
-         # Mark the run as failed, then re-raise with the original traceback
-         logger.finish(status='failed')
-         raise
-
- if __name__ == '__main__':
-     train()
- ```
-
- ### 4. Reproducibility Checklist
-
- - ✅ Use structural hashing to track config changes
- - ✅ Log all hyperparameters to SQLLogger
- - ✅ Tag experiments with meaningful labels
- - ✅ Track artifacts (checkpoints, plots)
- - ✅ Use lazy configs for derived parameters
- - ✅ Document configs with `@scope.manual`
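As an illustration of the first checklist item, the sketch below hashes the shape of a config (its keys and value types) rather than its values, so that a changed learning rate keeps the same hash while an added key changes it. This is an assumption about what "structural" means here, offered for intuition only. `structure_of` and `structural_hash` are hypothetical helpers, not Ato's API, and Ato's own hashing may be defined differently.

```python
import hashlib
import json


def structure_of(value):
    """Replace every leaf value with its type name, keeping the nested keys."""
    if isinstance(value, dict):
        return {key: structure_of(value[key]) for key in sorted(value)}
    return type(value).__name__


def structural_hash(config):
    """Hash the key/type skeleton of a nested dict with string keys."""
    canonical = json.dumps(structure_of(config), sort_keys=True)
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()


a = {'train': {'lr': 0.001, 'epochs': 100}}
b = {'train': {'lr': 0.01, 'epochs': 50}}                 # same structure, new values
c = {'train': {'lr': 0.001, 'epochs': 100, 'warmup': 5}}  # structure changed

print(structural_hash(a) == structural_hash(b))  # True
print(structural_hash(a) == structural_hash(c))  # False
```

Comparing such hashes across runs gives a cheap signal that two experiments are no longer configured the same way, even before looking at individual values.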
-
- ---
-
- ## Requirements
-
- - Python >= 3.7
- - SQLAlchemy (for SQL Tracker)
- - PyYAML, toml (for config serialization)
-
- See `pyproject.toml` for full dependencies.
-
- ---
-
- ## License
-
- MIT License
-
- ---
-
- ## Future Work: Optional, Modular, Non-Intrusive
-
- We're planning to add an **HTML dashboard** (as a small local daemon) for teams that want visual exploration:
-
- **Planned features:**
- - Metric comparison & trends
- - Run history & artifact browsing
- - Configuration diffs (including structural hash visualization)
- - Interactive hyperparameter analysis
-
- **Philosophy stays the same:**
- - **No hard dependency** - Ato core (Scope / ADict / SQL tracker / HyperOpt) works 100% without the dashboard
- - **No coupling** - The dashboard is a separate process that reads from SQLite/logs; it doesn't block or modify your runs
- - **Zero lock-in** - Remove the dashboard and nothing in your training code changes
- - **Fully modular** - Pick only what you need
-
- **Example workflows:**
-
- | What you need | What you use |
- |---------------|--------------|
- | Just configs | `ADict` + `Scope` only (no DB, no UI) |
- | Headless tracking | Add SQL tracker (still no UI) |
- | Visual exploration | Start dashboard daemon when you want; stop it and keep training |
- | Full stack | Use everything, or mix with MLflow/W&B dashboards |
-
- **Guiding rule:** Ato is a set of small, composable tools, not a monolith. Use what helps; ignore the rest.
-
- ---
-
- ## Contributing
-
- Contributions are welcome! Please feel free to submit issues or pull requests.
-
- ### Development Setup
-
- ```bash
- git clone https://github.com/yourusername/ato.git
- cd ato
- pip install -e .
- ```
-
- ---
-
- ## Working with Existing Tools
-
- Ato isn't meant to replace Hydra, MLflow, or W&B; it's a **composable layer** you can use alongside them.
-
- Think of Ato as a "config control surface" that gives you clarity and structure without forcing you into a framework.
- It is aimed at the large share of experiments that don't need heavy infrastructure; you can graduate to larger tools when needed.
-
- ### Ato + Hydra = Better Together
-
- Ato has **built-in Hydra compatibility** via `compose_hierarchy()`:
-
- ```python
- from ato.adict import ADict
-
- # Load Hydra-style configs directly
- config = ADict.compose_hierarchy(
-     root='configs',
-     config_filename='config',
-     select={'model': 'resnet50', 'data': 'imagenet'},
-     overrides={'model.lr': 0.01}
- )
-
- # Now add Ato's unique features on top:
- # - MultiScope for namespace isolation
- # - `manual` command for merge debugging
- # - Built-in SQL tracking
- ```
-
- **Migration from Hydra** is mostly a matter of replacing `hydra.compose()` with `ADict.compose_hierarchy()`.
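The `overrides={'model.lr': 0.01}` argument in the example above uses a dotted key to address a nested value. The sketch below shows what such an override means on a plain nested dict. `apply_overrides` is a hypothetical helper for illustration; it is not how `ADict.compose_hierarchy()` is implemented, and it assumes every intermediate node is a dict.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted-key overrides applied."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split('.')
        node = result
        for name in parents:
            # Create missing intermediate sections on the way down.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


base = {
    'model': {'name': 'resnet50', 'lr': 0.1},
    'data': {'name': 'imagenet'},
}
merged = apply_overrides(base, {'model.lr': 0.01})
print(merged['model'])  # {'name': 'resnet50', 'lr': 0.01}
print(base['model'])    # {'name': 'resnet50', 'lr': 0.1} (the input is left untouched)
```

Returning a copy rather than mutating the input keeps the base config reusable across several override sets.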
-
- ### What Makes Ato Different?
-
- Ato focuses on **three unique capabilities** that complement existing tools:
-
- | Feature | What It Solves | Why It Matters |
- |---------|----------------|----------------|
- | **MultiScope** | True namespace isolation | Multiple teams can own separate config scopes without key collisions (no `model_lr` vs `data_lr` prefixing needed) |
- | **`manual` command** | Config merge order visualization | Debug *why* a config value is set: see the exact merge order, not just the final result |
- | **Offline-first tracking** | Zero-setup SQLite tracking | Experiment tracking without servers, platforms, or external dependencies |
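To give a feel for what zero-setup SQLite tracking involves, here is a minimal sketch using only Python's standard `sqlite3` module. The table layout and the helpers `open_tracker`, `log_metric`, and `best_metric` are assumptions made for this illustration; they are not Ato's schema or API.

```python
import sqlite3


def open_tracker(path=':memory:'):
    """Open (or create) a metrics database; no server is required."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics ("
        "run_id TEXT, key TEXT, value REAL, step INTEGER)"
    )
    return conn


def log_metric(conn, run_id, key, value, step):
    """Append one metric observation for a run."""
    conn.execute(
        "INSERT INTO metrics (run_id, key, value, step) VALUES (?, ?, ?, ?)",
        (run_id, key, value, step),
    )
    conn.commit()


def best_metric(conn, run_id, key):
    """Return the highest logged value for a metric, or None if absent."""
    row = conn.execute(
        "SELECT MAX(value) FROM metrics WHERE run_id = ? AND key = ?",
        (run_id, key),
    ).fetchone()
    return row[0]


conn = open_tracker()  # pass a file path such as 'experiments.db' to persist
for step, acc in enumerate([0.62, 0.81, 0.79]):
    log_metric(conn, 'run_001', 'accuracy', acc, step)
print(best_metric(conn, 'run_001', 'accuracy'))  # 0.81
```

Because the store is a single local file, runs can be logged offline and inspected later with any SQLite client.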
-
- ### Compatibility Matrix
-
- Ato plays nicely with your existing stack:
-
- | Tool | Ato's Role | Integration |
- |------|------------|-------------|
- | **Hydra** | Extends with MultiScope + merge debugging | `compose_hierarchy()` loads Hydra configs directly |
- | **MLflow** | Lightweight alternative for simple projects | Use Ato's SQL tracker for offline work, MLflow for dashboards |
- | **W&B** | Offline-first complement | Track locally with Ato, sync to W&B when needed |
- | **OpenMMLab** | Config migration layer | `load_mm_config()` handles `_base_` inheritance |
- | **PyTorch/TF/JAX** | Framework-agnostic config + tracking | Works with any training framework |
-
- ### When to Use What
-
- **Use Ato alone** for:
- - Individual research experiments
- - Projects that don't need a dashboard
- - Teams wanting namespace isolation (MultiScope)
- - Config merge debugging (`manual` command)
-
- **Use Ato + Hydra** when:
- - You need Hydra's deep config hierarchies
- - Your team already uses Hydra YAML structure
- - You want MultiScope on top of Hydra's composition
-
- **Use Ato + MLflow/W&B** when:
- - You want local-first tracking with optional cloud sync
- - You need Ato's structural hashing + offline SQLite
- - Your team prefers MLflow/W&B dashboards for collaboration
-
- **Graduate to pure MLflow/W&B** when:
- - You need real-time dashboards and team collaboration UI
- - Model registry and dataset versioning become critical
- - Your experiments are production-facing
-
- ### What Ato Doesn't Do
-
- Ato intentionally skips features that larger tools handle better:
- - ❌ Real-time web dashboards (use MLflow/W&B)
- - ❌ Model registry (use MLflow)
- - ❌ Dataset versioning (use W&B/DVC)
- - ❌ Deep plugin ecosystems (use Hydra)
-
- Ato's philosophy: **give you enough structure to stay organized, without becoming infrastructure.**