ato-2.0.0-py3-none-any.whl

```
Metadata-Version: 2.4
Name: ato
Version: 2.0.0
Summary: A Python library for experiment tracking and hyperparameter optimization
Author: ato contributors
License: MIT
Project-URL: Homepage, https://github.com/yourusername/ato
Project-URL: Repository, https://github.com/yourusername/ato
Project-URL: Documentation, https://github.com/yourusername/ato#readme
Project-URL: Issues, https://github.com/yourusername/ato/issues
Keywords: machine learning,experiment tracking,hyperparameter optimization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyyaml>=6.0
Requires-Dist: toml>=0.10.2
Requires-Dist: sqlalchemy>=2.0
Requires-Dist: numpy>=1.19.0
Provides-Extra: distributed
Requires-Dist: torch>=1.8.0; extra == "distributed"
Dynamic: license-file
```

# Ato

Ato is intentionally small — it’s not about lines of code,
it’s about where they belong.
The core fits in a few hundred lines because it doesn’t need to fight Python — it flows with it.

---

**Ato** is a lightweight Python library for experiment management in machine learning and data science.
It provides flexible configuration management, experiment tracking, and hyperparameter optimization —
all without the complexity or overhead of heavy frameworks.

## Why Ato?

### Core Differentiators

- **True Namespace Isolation**: MultiScope provides independent config contexts (unique to Ato!)
- **Configuration Transparency**: Visualize the exact config merge order - debug configs with the `manual` command
- **Built-in Experiment Tracking**: SQLite-based tracking with no external services required
- **Structural Hashing**: Track experiment structure changes automatically

### Developer Experience

- **Zero Boilerplate**: Auto-nested configs, lazy evaluation, attribute access
- **CLI-first Design**: Configure experiments from the command line without touching code
- **Framework Agnostic**: Works with PyTorch, TensorFlow, JAX, or pure Python

## Quick Start

```bash
pip install ato-python
```

### 30-Second Example

```python
from ato.scope import Scope

scope = Scope()

@scope.observe(default=True)
def config(cfg):
    cfg.lr = 0.001
    cfg.batch_size = 32
    cfg.model = 'resnet50'

@scope
def train(cfg):
    print(f"Training {cfg.model} with lr={cfg.lr}")
    # Your training code here

if __name__ == '__main__':
    train()  # python train.py
    # Override from CLI: python train.py lr=0.01 model=%resnet101%
```

---

## Table of Contents

- [ADict: Enhanced Dictionary](#adict-enhanced-dictionary)
- [Scope: Configuration Management](#scope-configuration-management)
- [MultiScope: Namespace Isolation](#2-multiscope---multiple-configuration-contexts) ⭐ Unique to Ato
- [Config Documentation & Debugging](#5-configuration-documentation--inspection) ⭐ Unique to Ato
- [SQL Tracker: Experiment Tracking](#sql-tracker-experiment-tracking)
- [Hyperparameter Optimization](#hyperparameter-optimization)
- [Best Practices](#best-practices)
- [Comparison with Hydra](#ato-vs-hydra)

---

## ADict: Enhanced Dictionary

`ADict` is an enhanced dictionary designed for managing experiment configurations. It combines the simplicity of Python dictionaries with powerful features for ML workflows.

### Core Features

These are the fundamental capabilities that make ADict powerful for experiment management:

| Feature | Description | Why It Matters |
|---------|-------------|----------------|
| **Structural Hashing** | Hash based on keys + types, not values | Track when experiment structure changes |
| **Nested Access** | Dot notation for nested configs | `config.model.lr` instead of `config['model']['lr']` |
| **Format Agnostic** | Load/save JSON, YAML, TOML, XYZ | Work with any config format |
| **Safe Updates** | `update_if_absent()` method | Prevent accidental overwrites |

### Developer Convenience Features

These utilities maximize developer productivity and reduce boilerplate:

| Feature | Description | Benefit |
|---------|-------------|---------|
| **Auto-nested (`ADict.auto()`)** | Infinite depth lazy creation | `config.a.b.c = 1` just works - no KeyError |
| **Attribute-style Assignment** | `config.lr = 0.1` | Cleaner, more readable code |
| **Conditional Updates** | Only update missing keys | Merge configs safely |

### Quick Examples

```python
from ato.adict import ADict

# Structural hashing - track config structure changes
config1 = ADict(lr=0.1, epochs=100, model='resnet50')
config2 = ADict(lr=0.01, epochs=200, model='resnet101')
print(config1.get_structural_hash() == config2.get_structural_hash())  # True

config3 = ADict(lr=0.1, epochs='100', model='resnet50')  # epochs is str!
print(config1.get_structural_hash() == config3.get_structural_hash())  # False

# Load/save any format
config = ADict.from_file('config.json')
config.dump('config.yaml')

# Safe updates
config.update_if_absent(lr=0.01, scheduler='cosine')  # Only adds scheduler
```

### Convenience Features in Detail

#### Auto-nested: Zero Boilerplate Config Building

The most loved feature - no more manual nesting:

```python
# ❌ Traditional way
config = ADict()
config.model = ADict()
config.model.backbone = ADict()
config.model.backbone.layers = [64, 128, 256]

# ✅ With ADict.auto()
config = ADict.auto()
config.model.backbone.layers = [64, 128, 256]  # Just works!
config.data.augmentation.brightness = 0.2
```

**Perfect for Scope integration**:

```python
from ato.scope import Scope

scope = Scope()

@scope.observe(default=True)
def config(cfg):
    # No pre-definition needed!
    cfg.training.optimizer.name = 'AdamW'
    cfg.training.optimizer.lr = 0.001
    cfg.model.encoder.num_layers = 12
```

**Works with CLI**:

```bash
python train.py model.backbone.resnet.depth=50 data.batch_size=32
```

#### More Convenience Utilities

```python
# Attribute-style access
config.lr = 0.1
print(config.lr)  # Instead of config['lr']

# Nested access
print(config.model.backbone.type)  # Clean and readable

# Conditional updates - merge configs safely
base_config.update_if_absent(**experiment_config)
```

---

## Scope: Configuration Management

Scope solves configuration complexity through **priority-based merging** and **CLI integration**. No more scattered config files or hard-coded parameters.

### Key Concepts

```
Default Configs (priority=0)
        ↓
Named Configs (priority=0+)
        ↓
CLI Arguments (highest priority)
        ↓
Lazy Configs (computed after CLI)
```

### Basic Usage

#### Simple Configuration

```python
from ato.scope import Scope

scope = Scope()

@scope.observe()
def my_config(config):
    config.dataset = 'cifar10'
    config.lr = 0.001
    config.batch_size = 32

@scope
def train(config):
    print(f"Training on {config.dataset}")
    # Your code here

if __name__ == '__main__':
    train()
```

#### Priority-based Merging

```python
@scope.observe(default=True)  # Always applied
def defaults(cfg):
    cfg.lr = 0.001
    cfg.epochs = 100

@scope.observe(priority=1)  # Applied after defaults
def high_lr(cfg):
    cfg.lr = 0.01

@scope.observe(priority=2)  # Applied last
def long_training(cfg):
    cfg.epochs = 300
```

```bash
python train.py                        # lr=0.001, epochs=100
python train.py high_lr                # lr=0.01, epochs=100
python train.py high_lr long_training  # lr=0.01, epochs=300
```

#### CLI Configuration

Override any parameter from the command line:

```bash
# Simple values
python train.py lr=0.01 batch_size=64

# Nested configs
python train.py model.backbone=%resnet101% model.depth=101

# Lists and complex types
python train.py layers=[64,128,256,512] dropout=0.5

# Combine with named configs
python train.py my_config lr=0.001 batch_size=128
```

**Note**: Wrap strings with `%` (e.g., `%resnet101%`) instead of quotes.

### Advanced Features

#### 1. Lazy Evaluation - Dynamic Configuration

Sometimes you need configs that depend on other values set via CLI:

```python
@scope.observe()
def base_config(cfg):
    cfg.model = 'resnet50'
    cfg.dataset = 'imagenet'

@scope.observe(lazy=True)  # Evaluated AFTER CLI args
def computed_config(cfg):
    # Adjust based on dataset
    if cfg.dataset == 'imagenet':
        cfg.num_classes = 1000
        cfg.image_size = 224
    elif cfg.dataset == 'cifar10':
        cfg.num_classes = 10
        cfg.image_size = 32
```

```bash
python train.py dataset=%cifar10% computed_config
# Results in: num_classes=10, image_size=32
```

**Python 3.11+ Context Manager**:

```python
@scope.observe()
def my_config(cfg):
    cfg.model = 'resnet50'
    cfg.num_layers = 50

    with Scope.lazy():  # Evaluated after CLI
        if cfg.model == 'resnet101':
            cfg.num_layers = 101
```

#### 2. MultiScope - Multiple Configuration Contexts

**Unique to Ato**: Manage completely separate configuration namespaces. Unlike Hydra's config groups, MultiScope provides true **namespace isolation** with independent priority systems.

##### Why MultiScope?

| Challenge | Hydra's Approach | Ato's MultiScope |
|-----------|------------------|---------------------|
| Separate model/data configs | Config groups in one namespace | **Independent scopes with own priorities** |
| Avoid key collisions | Manual prefixing (`model.lr`, `train.lr`) | **Automatic namespace isolation** |
| Different teams/modules | Single config file | **Each scope can be owned separately** |
| Priority conflicts | Global priority system | **Per-scope priority system** |

##### Basic Usage

```python
from ato.scope import Scope, MultiScope

model_scope = Scope(name='model')
data_scope = Scope(name='data')
scope = MultiScope(model_scope, data_scope)

@model_scope.observe(default=True)
def model_config(model):
    model.backbone = 'resnet50'
    model.pretrained = True

@data_scope.observe(default=True)
def data_config(data):
    data.dataset = 'cifar10'
    data.batch_size = 32

@scope
def train(model, data):  # Named parameters match scope names
    print(f"Training {model.backbone} on {data.dataset}")
```

##### Real-world: Team Collaboration

Different team members can own different scopes without conflicts:

```python
# team_model.py - ML team owns this
model_scope = Scope(name='model')

@model_scope.observe(default=True)
def resnet_default(model):
    model.backbone = 'resnet50'
    model.lr = 0.1  # Model-specific learning rate

@model_scope.observe(priority=1)
def resnet101(model):
    model.backbone = 'resnet101'
    model.lr = 0.05  # Different lr for bigger model

# team_data.py - Data team owns this
data_scope = Scope(name='data')

@data_scope.observe(default=True)
def cifar_default(data):
    data.dataset = 'cifar10'
    data.lr = 0.001  # Data augmentation learning rate (no conflict!)

@data_scope.observe(priority=1)
def imagenet(data):
    data.dataset = 'imagenet'
    data.workers = 16

# train.py - Integration point
from team_model import model_scope
from team_data import data_scope

scope = MultiScope(model_scope, data_scope)

@scope
def train(model, data):
    # Both have 'lr' but in separate namespaces!
    print(f"Model LR: {model.lr}, Data LR: {data.lr}")
```

**Key advantage**: `model.lr` and `data.lr` are completely independent. No need for naming conventions like `model_lr` vs `data_lr`.

##### CLI with MultiScope

Override each scope independently:

```bash
# Override model scope only
python train.py model.backbone=%resnet101%

# Override data scope only
python train.py data.dataset=%imagenet%

# Override both
python train.py model.backbone=%resnet101% data.dataset=%imagenet%

# Call named configs per scope
python train.py resnet101 imagenet
```

#### 3. Import/Export Configs

Ato supports importing configs from multiple frameworks:

```python
@scope.observe()
def load_external(config):
    # Load from any format
    config.load('experiments/baseline.json')
    config.load('models/resnet.yaml')

    # Export to any format
    config.dump('output/final_config.toml')

    # Import OpenMMLab configs - handles _base_ inheritance automatically
    config.load_mm_config('mmdet_configs/faster_rcnn.py')
```

**OpenMMLab compatibility** is built-in:
- Automatically resolves `_base_` inheritance chains
- Supports `_delete_` keys for config overriding
- Makes migration from MMDetection/MMSegmentation/etc. seamless (see the sketch below)
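
For illustration, here is a minimal sketch of what that looks like with an MM-style config file. The file path and keys are hypothetical, and it assumes `load_mm_config` is available on a bare `ADict` just as it is on the `config` object above:

```python
# Suppose configs/my_model.py is a hypothetical MM-style config file containing:
#
#   _base_ = ['./_base_/faster_rcnn.py']            # inherit everything from the base file
#   model = dict(backbone=dict(depth=101))          # deep-merged over the base values
#   optimizer = dict(_delete_=True, type='AdamW')   # _delete_ replaces the base key entirely
#
# Loading it resolves the _base_ chain and applies _delete_ automatically:
from ato.adict import ADict

config = ADict()
config.load_mm_config('configs/my_model.py')
print(config.model.backbone.depth)  # expected: 101; untouched fields come from the base
```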

**Hydra-style config composition** is also built-in via `compose_hierarchy`:

```python
from ato.adict import ADict

# Hydra-style directory structure:
# configs/
# ├── config.yaml          # base config
# ├── model/
# │   ├── resnet50.yaml
# │   └── resnet101.yaml
# └── data/
#     ├── cifar10.yaml
#     └── imagenet.yaml

config = ADict.compose_hierarchy(
    root='configs',
    config_filename='config',
    select={
        'model': 'resnet50',  # or ['resnet50', 'resnet101'] for multiple
        'data': 'imagenet'
    },
    overrides={
        'model.lr': 0.01,
        'data.batch_size': 64
    },
    required=['model.backbone', 'data.dataset'],  # Validation
    on_missing='warn'  # or 'error'
)
```

**Key features**:
- Config groups (model/, data/, optimizer/, etc.)
- Automatic file discovery (tries .yaml, .json, .toml, .xyz)
- Dotted overrides (`model.lr=0.01`)
- Required key validation
- Flexible error handling

#### 4. Argparse Integration

Mix Ato with existing argparse code:

```python
from ato.scope import Scope
import argparse

scope = Scope(use_external_parser=True)
parser = argparse.ArgumentParser()
parser.add_argument('--gpu', type=int, default=0)
parser.add_argument('--seed', type=int, default=42)

@scope.observe(default=True)
def config(cfg):
    cfg.lr = 0.001
    cfg.batch_size = 32

@scope
def train(cfg):
    print(f"GPU: {cfg.gpu}, LR: {cfg.lr}")

if __name__ == '__main__':
    parser.parse_args()  # Merges argparse with scope
    train()
```

#### 5. Configuration Documentation & Inspection

**One of Ato's most powerful features**: Auto-generate documentation AND visualize the exact order of configuration application.

##### Basic Documentation

```python
@scope.manual
def config_docs(cfg):
    cfg.lr = 'Learning rate for optimizer'
    cfg.batch_size = 'Number of samples per batch'
    cfg.model = 'Model architecture (resnet50, resnet101, etc.)'
```

```bash
python train.py manual
```

**Output:**
```
--------------------------------------------------
[Scope "config"]
(The Applying Order of Views)
defaults → (CLI Inputs) → lazy_config → main

(User Manuals)
config.lr: Learning rate for optimizer
config.batch_size: Number of samples per batch
config.model: Model architecture (resnet50, resnet101, etc.)
--------------------------------------------------
```

##### Why This Matters

The **applying order visualization** shows you **exactly** how your configs are merged:
- Which config functions are applied (in order)
- When CLI inputs override values
- Where lazy configs are evaluated
- The final function that uses the config

**This prevents configuration bugs** by making the merge order explicit and debuggable.

##### MultiScope Documentation

For complex projects with multiple scopes, `manual` shows each scope separately:

```python
from ato.scope import Scope, MultiScope

model_scope = Scope(name='model')
train_scope = Scope(name='train')
scope = MultiScope(model_scope, train_scope)

@model_scope.observe(default=True)
def model_defaults(model):
    model.backbone = 'resnet50'
    model.num_layers = 50

@model_scope.observe(priority=1)
def model_advanced(model):
    model.pretrained = True

@model_scope.observe(lazy=True)
def model_lazy(model):
    if model.backbone == 'resnet101':
        model.num_layers = 101

@train_scope.observe(default=True)
def train_defaults(train):
    train.lr = 0.001
    train.epochs = 100

@model_scope.manual
def model_docs(model):
    model.backbone = 'Model backbone architecture'
    model.num_layers = 'Number of layers in the model'

@train_scope.manual
def train_docs(train):
    train.lr = 'Learning rate for optimizer'
    train.epochs = 'Total training epochs'

@scope
def main(model, train):
    print(f"Training {model.backbone} with lr={train.lr}")

if __name__ == '__main__':
    main()
```

```bash
python train.py manual
```

**Output:**
```
--------------------------------------------------
[Scope "model"]
(The Applying Order of Views)
model_defaults → model_advanced → (CLI Inputs) → model_lazy → main

(User Manuals)
model.backbone: Model backbone architecture
model.num_layers: Number of layers in the model
--------------------------------------------------
[Scope "train"]
(The Applying Order of Views)
train_defaults → (CLI Inputs) → main

(User Manuals)
train.lr: Learning rate for optimizer
train.epochs: Total training epochs
--------------------------------------------------
```

##### Real-world Example

This is especially valuable when debugging why a config value isn't what you expect:

```python
@scope.observe(default=True)
def defaults(cfg):
    cfg.lr = 0.001

@scope.observe(priority=1)
def experiment_config(cfg):
    cfg.lr = 0.01

@scope.observe(priority=2)
def another_config(cfg):
    cfg.lr = 0.1

@scope.observe(lazy=True)
def adaptive_lr(cfg):
    if cfg.batch_size > 64:
        cfg.lr = cfg.lr * 2
```

When you run `python train.py manual`, you see:
```
(The Applying Order of Views)
defaults → experiment_config → another_config → (CLI Inputs) → adaptive_lr → main
```

Now it's **crystal clear** why `lr=0.1` (from `another_config`) and not `0.01`!

---

## SQL Tracker: Experiment Tracking

Lightweight experiment tracking using SQLite - no external services, no setup complexity.

### Why SQL Tracker?

- **Zero Setup**: Just a SQLite file, no servers
- **Full History**: Track all runs, metrics, and artifacts
- **Smart Search**: Find similar experiments by config structure
- **Code Versioning**: Track code changes via fingerprints

### Database Schema

```
Project (my_ml_project)
├── Experiment (run_1)
│   ├── config: {...}
│   ├── structural_hash: "abc123..."
│   ├── Metrics: [loss, accuracy, ...]
│   ├── Artifacts: [model.pt, plots/*, ...]
│   └── Fingerprints: [model_forward, train_step, ...]
├── Experiment (run_2)
└── ...
```
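
Because the backing store is a single SQLite file, you can inspect it with nothing but the standard library. A minimal sketch, assuming the default `experiments.db` path used in the examples below; the schema is Ato's own, so the tables are listed rather than hard-coded:

```python
import sqlite3

# Open the tracking database directly - it's just a file, no server involved.
conn = sqlite3.connect('experiments.db')

# List whatever tables the tracker created (names depend on Ato's schema).
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print([name for (name,) in tables])

conn.close()
```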

### Quick Start

#### Logging Experiments

```python
from ato.db_routers.sql.manager import SQLLogger
from ato.adict import ADict

# Setup config
config = ADict(
    experiment=ADict(
        project_name='image_classification',
        sql=ADict(db_path='sqlite:///experiments.db')
    ),
    # Your hyperparameters
    lr=0.001,
    batch_size=32,
    model='resnet50'
)

# Create logger
logger = SQLLogger(config)

# Start experiment run
run_id = logger.run(tags=['baseline', 'resnet50', 'cifar10'])

# Training loop
for epoch in range(100):
    # Your training code
    train_loss = train_one_epoch()
    val_acc = validate()

    # Log metrics
    logger.log_metric('train_loss', train_loss, step=epoch)
    logger.log_metric('val_accuracy', val_acc, step=epoch)

# Log artifacts
logger.log_artifact(run_id, 'checkpoints/model_best.pt',
                    data_type='model',
                    metadata={'epoch': best_epoch})

# Finish run
logger.finish(status='completed')
```

#### Querying Experiments

```python
from ato.db_routers.sql.manager import SQLFinder

finder = SQLFinder(config)

# Get all runs in project
runs = finder.get_runs_in_project('image_classification')
for run in runs:
    print(f"Run {run.id}: {run.config.model} - {run.status}")

# Find best performing run
best_run = finder.find_best_run(
    project_name='image_classification',
    metric_key='val_accuracy',
    mode='max'  # or 'min' for loss
)
print(f"Best config: {best_run.config}")

# Find similar experiments (same config structure)
similar = finder.find_similar_runs(run_id=123)
print(f"Found {len(similar)} runs with similar config structure")

# Trace statistics (code fingerprints)
stats = finder.get_trace_statistics('image_classification', trace_id='model_forward')
print(f"Model forward pass has {stats['static_trace_versions']} versions")
```

### Real-world Example: Experiment Comparison

```python
# Compare hyperparameter impact
finder = SQLFinder(config)

runs = finder.get_runs_in_project('my_project')
for run in runs:
    # Get final accuracy
    final_metrics = [m for m in run.metrics if m.key == 'val_accuracy']
    best_acc = max(m.value for m in final_metrics) if final_metrics else 0

    print(f"LR: {run.config.lr}, Batch: {run.config.batch_size} → Acc: {best_acc:.2%}")
```

### Features Summary

| Feature | Description |
|---------|-------------|
| **Structural Hash** | Auto-track config structure changes |
| **Metric Logging** | Time-series metrics with step tracking |
| **Artifact Management** | Track model checkpoints, plots, data files |
| **Fingerprint Tracking** | Version control for code (static & runtime) |
| **Smart Search** | Find similar configs, best runs, statistics |

---

## Hyperparameter Optimization

Built-in **Hyperband** algorithm for efficient hyperparameter search with early stopping.

### Extensible Design

Ato's hyperopt module is built for extensibility and reusability:

| Component | Purpose | Benefit |
|-----------|---------|---------|
| `GridSpaceMixIn` | Parameter sampling logic | Reusable across different algorithms |
| `HyperOpt` | Base optimization class | Easy to implement custom strategies |
| `DistributedMixIn` | Distributed training support | Optional, composable |

**This design makes it trivial to implement custom search algorithms**:

```python
import random

from ato.hyperopt.base import GridSpaceMixIn, HyperOpt

class RandomSearch(GridSpaceMixIn, HyperOpt):
    def main(self, func):
        # Reuse GridSpaceMixIn.prepare_distributions()
        configs = self.prepare_distributions(self.config, self.search_spaces)

        # Implement random sampling
        random.shuffle(configs)

        results = []
        for config in configs[:10]:  # Sample 10 random configs
            metric = func(config)
            results.append((config, metric))

        return max(results, key=lambda x: x[1])
```

### How Hyperband Works

Hyperband uses successive halving (sketched below):
1. Start with many configs, train briefly
2. Keep top performers, discard poor ones
3. Train survivors longer
4. Repeat until one winner remains
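
To make the loop concrete, here is a minimal, framework-free sketch of successive halving. It is not Ato's `HyperBand` implementation, just the idea behind steps 1-4, with a hypothetical `evaluate(config, budget)` standing in for a short training run:

```python
import random

def successive_halving(configs, evaluate, halving_rate=0.3, budget=1):
    """Repeatedly train all surviving configs, then keep only the top fraction."""
    survivors = list(configs)
    while len(survivors) > 1:
        # 1) Train every surviving config with the current (small) budget.
        scored = [(cfg, evaluate(cfg, budget)) for cfg in survivors]
        # 2) Keep the top `halving_rate` fraction (at least one config).
        scored.sort(key=lambda pair: pair[1], reverse=True)
        keep = max(1, int(len(scored) * halving_rate))
        survivors = [cfg for cfg, _ in scored[:keep]]
        # 3) Give the survivors a larger budget next round.
        budget *= 3
    return survivors[0]

# Hypothetical usage with toy "training" scores:
configs = [{'lr': 10 ** random.uniform(-5, -1)} for _ in range(20)]
best = successive_halving(configs, lambda cfg, b: -abs(cfg['lr'] - 1e-3) * b)
print(best)
```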

### Basic Usage

```python
from ato.adict import ADict
from ato.hyperopt.hyperband import HyperBand
from ato.scope import Scope

scope = Scope()

# Define search space
search_spaces = ADict(
    lr=ADict(
        param_type='FLOAT',
        param_range=(1e-5, 1e-1),
        num_samples=20,
        space_type='LOG'  # Logarithmic spacing
    ),
    batch_size=ADict(
        param_type='INTEGER',
        param_range=(16, 128),
        num_samples=5,
        space_type='LOG'
    ),
    model=ADict(
        param_type='CATEGORY',
        categories=['resnet50', 'resnet101', 'efficientnet_b0']
    )
)

# Create Hyperband optimizer
hyperband = HyperBand(
    scope,
    search_spaces,
    halving_rate=0.3,    # Keep top 30% each round
    num_min_samples=3,   # Stop when <= 3 configs remain
    mode='max'           # Maximize metric (use 'min' for loss)
)

@hyperband.main
def train(config):
    # Your training code
    model = create_model(config.model)
    optimizer = Adam(lr=config.lr)

    # Use __num_halved__ for early stopping
    num_epochs = compute_epochs(config.__num_halved__)

    # Train and return metric
    val_acc = train_and_evaluate(model, optimizer, num_epochs)
    return val_acc

if __name__ == '__main__':
    # Run hyperparameter search
    best_result = train()
    print(f"Best config: {best_result.config}")
    print(f"Best metric: {best_result.metric}")
```

### Automatic Step Calculation

Let Hyperband compute optimal training steps:

```python
hyperband = HyperBand(scope, search_spaces, halving_rate=0.3, num_min_samples=4)

max_steps = 100000
steps_per_generation = hyperband.compute_optimized_initial_training_steps(max_steps)
# Example output: [27, 88, 292, 972, 3240, 10800, 36000, 120000]

# Use in training
@hyperband.main
def train(config):
    generation = config.__num_halved__
    num_steps = steps_per_generation[generation]

    metric = train_for_n_steps(num_steps)
    return metric
```

### Parameter Types

| Type | Description | Example |
|------|-------------|---------|
| `FLOAT` | Continuous values | Learning rate, dropout |
| `INTEGER` | Discrete integers | Batch size, num layers |
| `CATEGORY` | Categorical choices | Model type, optimizer |

Space types (compared in the sketch below):
- `LOG`: Logarithmic spacing (good for learning rates)
- `LINEAR`: Linear spacing (default)
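
To see why `LOG` suits learning rates, here is a small illustration of the two spacings using NumPy (already a dependency). This only demonstrates the sampling idea; it is not Ato's internal grid code:

```python
import numpy as np

low, high, num_samples = 1e-5, 1e-1, 5

# LINEAR: evenly spaced values - almost all samples land near the upper end.
linear = np.linspace(low, high, num_samples)
print(linear)  # approximately [1e-05, 0.025, 0.05, 0.075, 0.1]

# LOG: evenly spaced in log space - every order of magnitude is covered.
log = np.geomspace(low, high, num_samples)
print(log)     # [1e-05, 1e-04, 1e-03, 1e-02, 1e-01]
```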

### Distributed Hyperparameter Search

Ato supports distributed hyperparameter optimization out of the box:

```python
from ato.hyperopt.hyperband import DistributedHyperBand
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Initialize distributed training
dist.init_process_group(backend='nccl')
rank = dist.get_rank()
world_size = dist.get_world_size()

# Create distributed hyperband
hyperband = DistributedHyperBand(
    scope,
    search_spaces,
    halving_rate=0.3,
    num_min_samples=3,
    mode='max',
    rank=rank,
    world_size=world_size,
    backend='pytorch'
)

@hyperband.main
def train(config):
    # Your distributed training code
    model = create_model(config)
    model = DDP(model, device_ids=[rank])
    metric = train_and_evaluate(model)
    return metric

if __name__ == '__main__':
    result = train()
    if rank == 0:
        print(f"Best config: {result.config}")
```

**Key features**:
- Automatic work distribution across GPUs
- Synchronized config selection via `broadcast_object_from_root`
- Results aggregation with `all_gather_object`
- Compatible with PyTorch DDP, FSDP, DeepSpeed

---

## Best Practices

### 1. Project Structure

```
my_project/
├── configs/
│   ├── default.py       # Default config with @scope.observe(default=True)
│   ├── models.py        # Model-specific configs
│   └── datasets.py      # Dataset configs
├── train.py             # Main training script
├── experiments.db       # SQLite experiment tracking
└── experiments/
    ├── run_001/
    │   ├── checkpoints/
    │   └── logs/
    └── run_002/
```

### 2. Config Organization

```python
# configs/default.py
from ato.adict import ADict
from ato.scope import Scope

scope = Scope()

@scope.observe(default=True)
def defaults(cfg):
    # Data
    cfg.data = ADict(
        dataset='cifar10',
        batch_size=32,
        num_workers=4
    )

    # Model
    cfg.model = ADict(
        backbone='resnet50',
        pretrained=True
    )

    # Training
    cfg.train = ADict(
        lr=0.001,
        epochs=100,
        optimizer='adam'
    )

    # Experiment tracking
    cfg.experiment = ADict(
        project_name='my_project',
        sql=ADict(db_path='sqlite:///experiments.db')
    )
```

### 3. Combined Workflow

```python
from ato.scope import Scope
from ato.db_routers.sql.manager import SQLLogger
from configs.default import scope

@scope
def train(cfg):
    # Setup experiment tracking
    logger = SQLLogger(cfg)
    run_id = logger.run(tags=[cfg.model.backbone, cfg.data.dataset])

    try:
        # Training loop
        for epoch in range(cfg.train.epochs):
            loss = train_epoch()
            acc = validate()

            logger.log_metric('loss', loss, epoch)
            logger.log_metric('accuracy', acc, epoch)

        logger.finish(status='completed')

    except Exception as e:
        logger.finish(status='failed')
        raise e

if __name__ == '__main__':
    train()
```

### 4. Reproducibility Checklist

- ✅ Use structural hashing to track config changes
- ✅ Log all hyperparameters to SQLLogger
- ✅ Tag experiments with meaningful labels
- ✅ Track artifacts (checkpoints, plots)
- ✅ Use lazy configs for derived parameters
- ✅ Document configs with `@scope.manual`

---

## Requirements

- Python >= 3.7
- SQLAlchemy (for SQL Tracker)
- PyYAML, toml (for config serialization)

See `pyproject.toml` for full dependencies.

---

## License

MIT License

---

## Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

### Development Setup

```bash
git clone https://github.com/yourusername/ato.git
cd ato
pip install -e .
```

---

## Comparison with Other Tools

| Feature | Ato | MLflow | W&B | Hydra |
|---------|--------|--------|-----|-------|
| **Core Features** | | | | |
| Zero setup | ✅ | ❌ | ❌ | ✅ |
| Offline-first | ✅ | Partial | ❌ | ✅ |
| Config priority system | ✅ Explicit | Partial (Tags) | Partial (Run params) | ✅ Override |
| **True namespace isolation** | **✅ MultiScope** | **❌** | **❌** | **❌ Config groups only** |
| **Config merge visualization** | **✅ `manual`** | **❌** | **❌** | **Partial (`--cfg` tree)** |
| Structural hashing | ✅ | ❌ | ❌ | ❌ |
| Built-in HyperOpt | ✅ Hyperband | ❌ | ✅ Sweeps | Plugins (Optuna) |
| CLI-first design | ✅ | ❌ | ❌ | ✅ |
| **Compatibility** | | | | |
| Framework agnostic | ✅ | ✅ | ✅ | ✅ |
| Distributed training | ✅ Native + DDP/FSDP⁽¹⁾ | ✅ | ✅ | ✅ |
| Distributed HyperOpt | ✅ `DistributedHyperBand` | ❌ | Partial | Plugins |
| Hydra-style composition | ✅ `compose_hierarchy` | N/A | N/A | Native |
| OpenMMLab configs | ✅ `load_mm_config` | ❌ | ❌ | ❌ |
| **Visualization & UI** | | | | |
| Web dashboard | 🔜 Planned | ✅ | ✅ | ❌ |
| Real-time metrics | 🔜 Planned | ✅ | ✅ | ❌ |
| Interactive plots | 🔜 Planned | ✅ | ✅ | ❌ |
| Metric comparison UI | 🔜 Planned | ✅ | ✅ | ❌ |
| **Advanced Features** | | | | |
| Model registry | 🔜 Planned | ✅ | ✅ | ❌ |
| Dataset versioning | 🔜 Planned | Partial | ✅ | ❌ |
| Team collaboration | ✅ MultiScope⁽²⁾ | ✅ Platform | ✅ Platform | ❌ |

⁽¹⁾ Native distributed hyperparameter optimization via `DistributedHyperBand`. Regular training is compatible with any distributed framework (DDP, FSDP, DeepSpeed) - just integrate logging, no special code needed.

⁽²⁾ Team collaboration via MultiScope: separate config ownership per team (e.g., Team A owns the model scope, Team B owns the data scope) without naming conflicts.

**Note on config compatibility**: Ato provides built-in support for other config frameworks:
- **Hydra-style composition**: `compose_hierarchy()` supports config groups, select, overrides - full compatibility
- **OpenMMLab configs**: `load_mm_config()` handles `_base_` inheritance and `_delete_` keys
- Migration from existing projects is seamless - just import your configs and go

### Ato vs. Hydra

While Hydra is excellent for config composition, Ato provides unique features:

| Aspect | Hydra | Ato |
|--------|-------|--------|
| **Namespace isolation** | Config groups share namespace | ✅ MultiScope with independent namespaces<br/>(no key collisions) |
| **Priority system** | Single global override system | ✅ Per-scope priority + lazy evaluation |
| **Config merge debugging** | Tree view (`--cfg`)<br/>Shows final config | ✅ `manual` command<br/>Shows merge order & execution flow |
| **Experiment tracking** | Requires external tools<br/>(MLflow/W&B) | ✅ Built-in SQL tracker |
| **Team workflow** | Single config file ownership | ✅ Separate scope ownership per team⁽³⁾ |

⁽³⁾ Example: Team A defines `model_scope`, Team B defines `data_scope`, and both can use `model.lr` and `data.lr` without conflicts.

**Use Ato over Hydra when:**
- Multiple teams need independent config ownership (MultiScope)
- You want to avoid key collision issues (no manual prefixing needed)
- You need to debug why a config value was set (`manual` command)
- You want experiment tracking without adding MLflow/W&B
- You're migrating from OpenMMLab projects

**Use Hydra when:**
- You have very deep config hierarchies with complex inheritance
- You prefer YAML over Python
- You need the mature plugin ecosystem (Ray, Joblib, etc.)
- You don't need namespace isolation

**Why not both?**
- Ato has **built-in Hydra-style composition** via `compose_hierarchy()`
- You can use Hydra's directory structure and config groups directly in Ato
- Get MultiScope + experiment tracking + merge debugging on top of Hydra's composition
- Migration is essentially just replacing `hydra.compose()` with `ADict.compose_hierarchy()` (see the sketch below)
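
As a rough sketch of that migration, assuming a `configs/` tree like the one shown in the `compose_hierarchy` example above (the group and override names here are placeholders):

```python
# Before: Hydra's Compose API over configs/
from hydra import initialize, compose

with initialize(version_base=None, config_path='configs'):
    cfg = compose(config_name='config',
                  overrides=['model=resnet50', 'model.lr=0.01'])

# After: Ato composition over the same directory layout
from ato.adict import ADict

cfg = ADict.compose_hierarchy(
    root='configs',
    config_filename='config',
    select={'model': 'resnet50'},
    overrides={'model.lr': 0.01}
)
```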

**Ato is for you if:**
- You want lightweight, offline-first experiment tracking
- You need **true namespace isolation for team collaboration**
- **You want to debug config merge order visually** (unique to Ato!)
- You prefer simple Python over complex frameworks
- You want reproducibility without overhead