synapse-sdk 2025.10.5__py3-none-any.whl → 2025.10.6__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


@@ -0,0 +1,663 @@
1
+ ---
2
+ id: train-action-overview
3
+ title: Train Action Overview
4
+ sidebar_position: 1
5
+ ---
6
+
7
+ # Train Action Overview
8
+
9
+ The Train Action provides unified functionality for both model training and hyperparameter optimization (HPO) through a single interface. It supports regular training workflows and advanced hyperparameter tuning with Ray Tune integration.
10
+
11
+ ## Quick Overview
12
+
13
+ **Category:** Neural Net
14
+ **Available Actions:** `train`
15
+ **Execution Method:** Job-based execution
16
+ **Modes:** Training mode and Hyperparameter Tuning mode
17
+
18
+ ## Key Features
19
+
20
+ - **Unified Interface**: Single action for both training and hyperparameter tuning
21
+ - **Flexible Hyperparameters**: No rigid structure - plugins define their own hyperparameter schema
22
+ - **Ray Tune Integration**: Advanced HPO with multiple search algorithms and schedulers
23
+ - **Automatic Trial Tracking**: Trial IDs automatically injected into logs during tuning
24
+ - **Resource Management**: Configurable CPU/GPU allocation per trial
25
+ - **Best Model Selection**: Automatic best model checkpoint selection after tuning
26
+ - **Progress Tracking**: Real-time progress updates across training/tuning phases
27
+
28
+ ## Modes
29
+
30
+ ### Training Mode (Default)
31
+
32
+ Standard model training with fixed hyperparameters.
33
+
34
+ ```json
35
+ {
36
+ "action": "train",
37
+ "params": {
38
+ "name": "my_model",
39
+ "dataset": 123,
40
+ "checkpoint": null,
41
+ "is_tune": false,
42
+ "hyperparameter": {
43
+ "epochs": 100,
44
+ "batch_size": 32,
45
+ "learning_rate": 0.001,
46
+ "optimizer": "adam"
47
+ }
48
+ }
49
+ }
50
+ ```
51
+
52
+ ### Hyperparameter Tuning Mode
53
+
54
+ Hyperparameter optimization using Ray Tune.
55
+
56
+ ```json
57
+ {
58
+ "action": "train",
59
+ "params": {
60
+ "name": "my_tuning_job",
61
+ "dataset": 123,
62
+ "checkpoint": null,
63
+ "is_tune": true,
64
+ "tune_hyperparameter": [
65
+ {
66
+ "name": "batch_size",
67
+ "type": "choice",
68
+ "options": [16, 32, 64]
69
+ },
70
+ {
71
+ "name": "learning_rate",
72
+ "type": "loguniform",
73
+ "min": 0.0001,
74
+ "max": 0.01,
75
+ "base": 10
76
+ },
77
+ {
78
+ "name": "optimizer",
79
+ "type": "choice",
80
+ "options": ["adam", "sgd"]
81
+ }
82
+ ],
83
+ "tune_config": {
84
+ "mode": "max",
85
+ "metric": "accuracy",
86
+ "num_samples": 10,
87
+ "max_concurrent_trials": 2
88
+ }
89
+ }
90
+ }
91
+ ```
92
+
93
+ ## Configuration Parameters
94
+
95
+ ### Common Parameters (Both Modes)
96
+
97
+ | Parameter | Type | Required | Description |
98
+ | ------------ | ------------- | -------- | ------------------------------------- |
99
+ | `name` | `str` | Yes | Training/tuning job name |
100
+ | `dataset` | `int` | Yes | Dataset ID |
101
+ | `checkpoint` | `int \| None` | No | Checkpoint ID for resuming training |
102
+ | `is_tune` | `bool` | No | Enable tuning mode (default: `false`) |
103
+ | `num_cpus` | `float` | No | CPU resources per trial (tuning only) |
104
+ | `num_gpus` | `float` | No | GPU resources per trial (tuning only) |
105
+
106
+ ### Training Mode Parameters (`is_tune=false`)
107
+
108
+ | Parameter | Type | Required | Description |
109
+ | ---------------- | ------ | -------- | ---------------------------------- |
110
+ | `hyperparameter` | `dict` | Yes | Fixed hyperparameters for training |
111
+
112
+ **Note**: The structure of `hyperparameter` is completely flexible and defined by your plugin. Common fields include:
113
+
114
+ - `epochs`: Number of training epochs
115
+ - `batch_size`: Batch size for training
116
+ - `learning_rate`: Learning rate
117
+ - `optimizer`: Optimizer type (adam, sgd, etc.)
118
+ - Any custom fields your plugin needs (e.g., `dropout_rate`, `weight_decay`, `image_size`)
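+
+ Because the SDK does not validate this structure, your plugin can read its own fields straight from the dict. A minimal sketch (the field names and defaults below are illustrative, not an SDK contract):
+
+ ```python
+ # Hypothetical plugin-side access to a flexible hyperparameter dict.
+ hyperparameter = {
+     "epochs": 100,
+     "batch_size": 32,
+     "learning_rate": 0.001,
+     "optimizer": "adam",
+     "dropout_rate": 0.2,  # custom field known only to this plugin
+ }
+
+ dropout = hyperparameter.get("dropout_rate", 0.1)    # plugin-defined default
+ image_size = hyperparameter.get("image_size", 224)   # absent fields fall back safely
+ ```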
119
+
120
+ ### Tuning Mode Parameters (`is_tune=true`)
121
+
122
+ | Parameter | Type | Required | Description |
123
+ | --------------------- | ------ | -------- | ------------------------------------ |
124
+ | `tune_hyperparameter` | `list` | Yes | List of hyperparameter search spaces |
125
+ | `tune_config` | `dict` | Yes | Ray Tune configuration |
126
+
127
+ ## Hyperparameter Search Spaces
128
+
129
+ Define hyperparameter distributions for tuning:
130
+
131
+ ### Continuous Distributions
132
+
133
+ ```json
134
+ [
135
+ {
136
+ "name": "learning_rate",
137
+ "type": "uniform",
138
+ "min": 0.0001,
139
+ "max": 0.01
140
+ },
141
+ {
142
+ "name": "dropout_rate",
143
+ "type": "loguniform",
144
+ "min": 0.0001,
145
+ "max": 0.1,
146
+ "base": 10
147
+ }
148
+ ]
149
+ ```
150
+
151
+ ### Discrete Distributions
152
+
153
+ ```json
154
+ [
155
+ {
156
+ "name": "batch_size",
157
+ "type": "choice",
158
+ "options": [16, 32, 64, 128]
159
+ },
160
+ {
161
+ "name": "optimizer",
162
+ "type": "choice",
163
+ "options": ["adam", "sgd", "rmsprop"]
164
+ }
165
+ ]
166
+ ```
167
+
168
+ ### Quantized Distributions
169
+
170
+ ```json
171
+ [
172
+ {
173
+ "name": "learning_rate",
174
+ "type": "quniform",
175
+ "min": 0.0001,
176
+ "max": 0.01,
177
+ "q": 0.0001
178
+ }
179
+ ]
180
+ ```
181
+
182
+ ### Supported Distribution Types
183
+
184
+ Each hyperparameter type requires specific parameters:
185
+
186
+ | Type | Required Parameters | Description | Example |
187
+ |------|-------------------|-------------|---------|
188
+ | `uniform` | `min`, `max` | Uniform distribution between min and max | `{"name": "lr", "type": "uniform", "min": 0.0001, "max": 0.01}` |
189
+ | `quniform` | `min`, `max`, `q` | Quantized uniform distribution | `{"name": "lr", "type": "quniform", "min": 0.0001, "max": 0.01, "q": 0.0001}` |
190
+ | `loguniform` | `min`, `max`, `base` | Log-uniform distribution | `{"name": "lr", "type": "loguniform", "min": 0.0001, "max": 0.01, "base": 10}` |
191
+ | `qloguniform` | `min`, `max`, `base`, `q` | Quantized log-uniform distribution | `{"name": "lr", "type": "qloguniform", "min": 0.0001, "max": 0.01, "base": 10, "q": 0.0001}` |
192
+ | `randn` | `mean`, `sd` | Normal (Gaussian) distribution | `{"name": "noise", "type": "randn", "mean": 0.0, "sd": 1.0}` |
193
+ | `qrandn` | `mean`, `sd`, `q` | Quantized normal distribution | `{"name": "noise", "type": "qrandn", "mean": 0.0, "sd": 1.0, "q": 0.1}` |
194
+ | `randint` | `min`, `max` | Random integer between min and max | `{"name": "epochs", "type": "randint", "min": 5, "max": 15}` |
195
+ | `qrandint` | `min`, `max`, `q` | Quantized random integer | `{"name": "epochs", "type": "qrandint", "min": 5, "max": 15, "q": 5}` |
196
+ | `lograndint` | `min`, `max`, `base` | Log-random integer | `{"name": "units", "type": "lograndint", "min": 16, "max": 256, "base": 2}` |
197
+ | `qlograndint` | `min`, `max`, `base`, `q` | Quantized log-random integer | `{"name": "units", "type": "qlograndint", "min": 16, "max": 256, "base": 2, "q": 16}` |
198
+ | `choice` | `options` | Choose from a list of values | `{"name": "optimizer", "type": "choice", "options": ["adam", "sgd"]}` |
199
+ | `grid_search` | `options` | Grid search over all values | `{"name": "batch_size", "type": "grid_search", "options": [16, 32, 64]}` |
200
+
201
+ **Important Notes:**
202
+ - All hyperparameters must include `name` and `type` fields
203
+ - For `loguniform`, `qloguniform`, `lograndint`, `qlograndint`: `base` parameter is required (typically 10 or 2)
204
+ - For `choice` and `grid_search`: Use `options` (not `values`)
205
+ - For range-based types: Use `min` and `max` (not `lower` and `upper`)
206
+
207
+ ## Tune Configuration
208
+
209
+ ### Basic Configuration
210
+
211
+ ```python
212
+ {
213
+ "mode": "max", # "max" or "min"
214
+ "metric": "accuracy", # Metric to optimize
215
+ "num_samples": 10, # Number of trials
216
+ "max_concurrent_trials": 2 # Parallel trials
217
+ }
218
+ ```
219
+
220
+ ### With Search Algorithm
221
+
222
+ ```python
223
+ {
224
+ "mode": "max",
225
+ "metric": "accuracy",
226
+ "num_samples": 20,
227
+ "max_concurrent_trials": 4,
228
+ "search_alg": {
229
+ "name": "hyperoptsearch", # Search algorithm
230
+ "points_to_evaluate": [ # Optional initial points
231
+ {
232
+ "learning_rate": 0.001,
233
+ "batch_size": 32
234
+ }
235
+ ]
236
+ }
237
+ }
238
+ ```
239
+
240
+ ### With Scheduler
241
+
242
+ ```python
243
+ {
244
+ "mode": "max",
245
+ "metric": "accuracy",
246
+ "num_samples": 50,
247
+ "max_concurrent_trials": 8,
248
+ "scheduler": {
249
+ "name": "hyperband", # Scheduler type
250
+ "options": {
251
+ "max_t": 100
252
+ }
253
+ }
254
+ }
255
+ ```
256
+
257
+ ### Supported Search Algorithms
258
+
259
+ - `basicvariantgenerator` - Random search (default)
260
+ - `bayesoptsearch` - Bayesian optimization
261
+ - `hyperoptsearch` - Tree-structured Parzen Estimator
262
+
263
+ ### Supported Schedulers
264
+
265
+ - `fifo` - First-in-first-out (default)
266
+ - `hyperband` - HyperBand scheduler
267
+
268
+ ## Plugin Development
269
+
270
+ ### For Training Mode
271
+
272
+ Implement the `train()` function in your plugin:
273
+
274
+ ```python
275
+ def train(run, dataset, hyperparameter, checkpoint, **kwargs):
276
+ """
277
+ Training function for your model.
278
+
279
+ Args:
280
+ run: TrainRun object for logging
281
+ dataset: Dataset object
282
+ hyperparameter: dict with hyperparameters
283
+ checkpoint: Optional checkpoint for resuming
284
+ """
285
+ # Access hyperparameters
286
+ epochs = hyperparameter['epochs']
287
+ batch_size = hyperparameter['batch_size']
288
+ learning_rate = hyperparameter['learning_rate']
289
+
290
+ # Training loop
291
+ for epoch in range(epochs):
292
+ # Train one epoch
293
+ loss, accuracy = train_one_epoch(...)
294
+
295
+ # Log metrics
296
+ run.log_metric('training', 'loss', loss, epoch=epoch)
297
+ run.log_metric('training', 'accuracy', accuracy, epoch=epoch)
298
+
299
+ # Log visualizations
300
+ run.log_visualization('predictions', 'train', epoch, image_data)
301
+
302
+ # Save final model
303
+ save_model(model, '/path/to/model.pth')
304
+ ```
305
+
306
+ ### For Tuning Mode
307
+
308
+ Implement the `tune()` function in your plugin:
309
+
310
+ ```python
311
+ def tune(hyperparameter, run, dataset, checkpoint, **kwargs):
312
+ """
313
+ Tuning function for hyperparameter optimization.
314
+
315
+ Args:
316
+ hyperparameter: dict with current trial's hyperparameters
317
+ run: TrainRun object for logging (with is_tune=True)
318
+ dataset: Dataset object
319
+ checkpoint: Optional checkpoint for resuming
320
+ """
321
+     from pathlib import Path
+     from ray import tune
322
+
323
+ # Set checkpoint output path BEFORE training
324
+ output_path = Path('/path/to/trial/weights')
325
+ run.checkpoint_output = str(output_path)
326
+
327
+ # Training loop
328
+ for epoch in range(hyperparameter['epochs']):
329
+ loss, accuracy = train_one_epoch(...)
330
+
331
+ # Log metrics (trial_id automatically added)
332
+ run.log_metric('training', 'loss', loss, epoch=epoch)
333
+ run.log_metric('training', 'accuracy', accuracy, epoch=epoch)
334
+
335
+ # Report results to Ray Tune
336
+ results = {
337
+ "accuracy": final_accuracy,
338
+ "loss": final_loss
339
+ }
340
+
341
+ # IMPORTANT: Report with checkpoint
342
+ tune.report(
343
+ results,
344
+ checkpoint=tune.Checkpoint.from_directory(run.checkpoint_output)
345
+ )
346
+ ```
347
+
348
+ ### Parameter Order Difference
349
+
350
+ **Important**: The parameter order differs between `train()` and `tune()`:
351
+
352
+ - `train(run, dataset, hyperparameter, checkpoint, **kwargs)`
353
+ - `tune(hyperparameter, run, dataset, checkpoint, **kwargs)`
354
+
355
+ ### Automatic Trial ID Logging
356
+
357
+ When `is_tune=True`, the SDK automatically injects `trial_id` into all metric and visualization logs:
358
+
359
+ ```python
360
+ # Your plugin code
361
+ run.log_metric('training', 'loss', 0.5, epoch=10)
362
+
363
+ # Actual logged data (trial_id added automatically)
364
+ {
365
+ 'category': 'training',
366
+ 'key': 'loss',
367
+ 'value': 0.5,
368
+ 'metrics': {'epoch': 10},
369
+ 'trial_id': 'abc123' # Added automatically
370
+ }
371
+ ```
372
+
373
+ No plugin changes required - this happens transparently at the SDK level.
374
+
375
+ ## Migration from TuneAction
376
+
377
+ The standalone `TuneAction` is now **deprecated**. Migrate to `TrainAction` with `is_tune=true`:
378
+
379
+ ### Before (Deprecated)
380
+
381
+ ```json
382
+ {
383
+ "action": "tune",
384
+ "params": {
385
+ "name": "my_tuning_job",
386
+ "dataset": 123,
387
+ "hyperparameter": [...],
388
+ "tune_config": {...}
389
+ }
390
+ }
391
+ ```
392
+
393
+ ### After (Recommended)
394
+
395
+ ```json
396
+ {
397
+ "action": "train",
398
+ "params": {
399
+ "name": "my_tuning_job",
400
+ "dataset": 123,
401
+ "is_tune": true,
402
+ "tune_hyperparameter": [...],
403
+ "tune_config": {...}
404
+ }
405
+ }
406
+ ```
407
+
408
+ ### Key Changes
409
+
410
+ 1. Change `"action": "tune"` to `"action": "train"`
411
+ 2. Add `"is_tune": true`
412
+ 3. Rename `"hyperparameter"` to `"tune_hyperparameter"`
413
+
414
+ ## Examples
415
+
416
+ ### Simple Training
417
+
418
+ ```json
419
+ {
420
+ "action": "train",
421
+ "params": {
422
+ "name": "resnet50_training",
423
+ "dataset": 456,
424
+ "checkpoint": null,
425
+ "hyperparameter": {
426
+ "epochs": 100,
427
+ "batch_size": 32,
428
+ "learning_rate": 0.001,
429
+ "optimizer": "adam",
430
+ "weight_decay": 0.0001
431
+ }
432
+ }
433
+ }
434
+ ```
435
+
436
+ ### Resume from Checkpoint
437
+
438
+ ```json
439
+ {
440
+ "action": "train",
441
+ "params": {
442
+ "name": "resnet50_continued",
443
+ "dataset": 456,
444
+ "checkpoint": 789,
445
+ "hyperparameter": {
446
+ "epochs": 50,
447
+ "batch_size": 32,
448
+ "learning_rate": 0.0001,
449
+ "optimizer": "adam"
450
+ }
451
+ }
452
+ }
453
+ ```
454
+
455
+ ### Hyperparameter Tuning with Grid Search
456
+
457
+ ```json
458
+ {
459
+ "action": "train",
460
+ "params": {
461
+ "name": "resnet50_tuning",
462
+ "dataset": 456,
463
+ "is_tune": true,
464
+ "tune_hyperparameter": [
465
+ {
466
+ "name": "batch_size",
467
+ "type": "grid_search",
468
+ "options": [16, 32, 64]
469
+ },
470
+ {
471
+ "name": "learning_rate",
472
+ "type": "grid_search",
473
+ "options": [0.001, 0.0001]
474
+ },
475
+ {
476
+ "name": "optimizer",
477
+ "type": "grid_search",
478
+ "options": ["adam", "sgd"]
479
+ }
480
+ ],
481
+ "tune_config": {
482
+ "mode": "max",
483
+ "metric": "validation_accuracy",
484
+ "num_samples": 12,
485
+ "max_concurrent_trials": 4
486
+ }
487
+ }
488
+ }
489
+ ```
490
+
491
+ ### Advanced Tuning with HyperOpt and HyperBand
492
+
493
+ ```json
494
+ {
495
+ "action": "train",
496
+ "params": {
497
+ "name": "resnet50_hyperopt_tuning",
498
+ "dataset": 456,
499
+ "is_tune": true,
500
+ "num_cpus": 2,
501
+ "num_gpus": 0.5,
502
+ "tune_hyperparameter": [
503
+ {
504
+ "name": "batch_size",
505
+ "type": "choice",
506
+ "options": [16, 32, 64, 128]
507
+ },
508
+ {
509
+ "name": "learning_rate",
510
+ "type": "loguniform",
511
+ "min": 0.00001,
512
+ "max": 0.01,
513
+ "base": 10
514
+ },
515
+ {
516
+ "name": "weight_decay",
517
+ "type": "loguniform",
518
+ "min": 0.00001,
519
+ "max": 0.001,
520
+ "base": 10
521
+ },
522
+ {
523
+ "name": "optimizer",
524
+ "type": "choice",
525
+ "options": ["adam", "sgd", "rmsprop"]
526
+ }
527
+ ],
528
+ "tune_config": {
529
+ "mode": "max",
530
+ "metric": "validation_accuracy",
531
+ "num_samples": 50,
532
+ "max_concurrent_trials": 8,
533
+ "search_alg": {
534
+ "name": "hyperoptsearch"
535
+ },
536
+ "scheduler": {
537
+ "name": "hyperband",
538
+ "options": {
539
+ "max_t": 100
540
+ }
541
+ }
542
+ }
543
+ }
544
+ }
545
+ ```
546
+
547
+ ## Progress Tracking
548
+
549
+ The train action tracks progress across different phases:
550
+
551
+ ### Training Mode
552
+
553
+ | Category | Proportion | Description |
554
+ | ------------ | ---------- | -------------------- |
555
+ | `validation` | 10% | Parameter validation |
556
+ | `training` | 90% | Model training |
557
+
558
+ ### Tuning Mode
559
+
560
+ | Category | Proportion | Description |
561
+ | ------------ | ---------- | ---------------------------- |
562
+ | `validation` | 10% | Parameter validation |
563
+ | `trials` | 90% | Hyperparameter tuning trials |
564
+
565
+ ## Benefits
566
+
567
+ ### Unified Interface
568
+
569
+ - Single action for both training and tuning
570
+ - Consistent parameter handling
571
+ - Reduced code duplication
572
+
573
+ ### Flexible Hyperparameters
574
+
575
+ - No rigid structure enforced by SDK
576
+ - Plugins define their own hyperparameter schema
577
+ - Support for custom fields without validation errors
578
+
579
+ ### Advanced HPO
580
+
581
+ - Multiple search algorithms (random search, Bayesian optimization, HyperOpt)
582
+ - Multiple schedulers (FIFO, HyperBand)
583
+ - Automatic best model selection
584
+
585
+ ### Developer Experience
586
+
587
+ - Automatic trial tracking
588
+ - Transparent logging enhancements
589
+ - Clear migration path from deprecated TuneAction
590
+
591
+ ## Best Practices
592
+
593
+ ### Hyperparameter Design
594
+
595
+ - Keep hyperparameter search spaces reasonable
596
+ - Start with grid search for initial exploration
597
+ - Use Bayesian optimization (`bayesoptsearch`) or HyperOpt (`hyperoptsearch`) for efficient search
598
+ - Set appropriate `num_samples` based on search space size (see the grid-size check below)
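+
+ For pure `grid_search` spaces, the full grid is the product of the option counts; the grid-search example above covers 3 * 2 * 2 = 12 combinations, which matches its `num_samples` of 12:
+
+ ```python
+ # Quick sanity check of grid size (specs copied from the grid-search example above).
+ import math
+
+ grid_specs = [
+     {"name": "batch_size", "type": "grid_search", "options": [16, 32, 64]},
+     {"name": "learning_rate", "type": "grid_search", "options": [0.001, 0.0001]},
+     {"name": "optimizer", "type": "grid_search", "options": ["adam", "sgd"]},
+ ]
+ num_samples = math.prod(len(s["options"]) for s in grid_specs)  # -> 12
+ ```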
599
+
600
+ ### Resource Management
601
+
602
+ - Allocate `num_cpus` and `num_gpus` based on trial resource needs
603
+ - Set `max_concurrent_trials` based on available hardware
604
+ - Monitor resource usage during tuning
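+
+ As a rough capacity check (the hardware numbers below are assumptions; actual scheduling also depends on your Ray cluster), the per-trial requests from the advanced example bound how many trials can run at once:
+
+ ```python
+ # Rough concurrency bound from per-trial resource requests (illustrative hardware).
+ available_cpus, available_gpus = 16, 4
+ num_cpus, num_gpus = 2, 0.5  # per-trial requests, as in the advanced tuning example
+
+ max_parallel = int(min(available_cpus / num_cpus, available_gpus / num_gpus))  # -> 8
+ # A max_concurrent_trials of 8 would match this capacity.
+ ```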
605
+
606
+ ### Checkpoint Management
607
+
608
+ - Always set `run.checkpoint_output` before training in tune mode
609
+ - Save checkpoints at regular intervals
610
+ - Use the best checkpoint returned by tuning
611
+
612
+ ### Logging
613
+
614
+ - Log all relevant metrics for comparison
615
+ - Use consistent metric names across trials
616
+ - Include validation metrics in tune reports
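+
+ The `metric` named in `tune_config` should correspond to a key you pass to `tune.report()`, so keep the reported names and the logged names aligned. A short sketch inside the `tune()` body shown earlier (`val_acc` and `val_loss` are placeholders):
+
+ ```python
+ # Log per-epoch validation metrics, then report the same metric that tune_config optimizes
+ # ("validation_accuracy" in the examples above).
+ run.log_metric('validation', 'accuracy', val_acc, epoch=epoch)
+
+ tune.report(
+     {"validation_accuracy": val_acc, "loss": val_loss},
+     checkpoint=tune.Checkpoint.from_directory(run.checkpoint_output),
+ )
+ ```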
617
+
618
+ ## Troubleshooting
619
+
620
+ ### Common Issues
621
+
622
+ #### "hyperparameter is required when is_tune=False"
623
+
624
+ Make sure to provide `hyperparameter` in training mode:
625
+
626
+ ```json
627
+ {
628
+ "is_tune": false,
629
+ "hyperparameter": {...}
630
+ }
631
+ ```
632
+
633
+ #### "tune_hyperparameter is required when is_tune=True"
634
+
635
+ Make sure to provide `tune_hyperparameter` and `tune_config` in tuning mode:
636
+
637
+ ```json
638
+ {
639
+ "is_tune": true,
640
+ "tune_hyperparameter": [...],
641
+ "tune_config": {...}
642
+ }
643
+ ```
644
+
645
+ #### Tuning Fails Without Error
646
+
647
+ Check that your `tune()` function:
648
+
649
+ 1. Sets `run.checkpoint_output` before training
650
+ 2. Calls `tune.report()` with results and checkpoint
651
+ 3. Returns properly without exceptions
652
+
653
+ ## Next Steps
654
+
655
+ - **For Plugin Developers**: Implement `train()` and optionally `tune()` functions
656
+ - **For Users**: Start with training mode, then experiment with tuning
657
+ - **For Advanced Users**: Explore different search algorithms and schedulers
658
+
659
+ ## Support and Resources
660
+
661
+ - **API Reference**: See TrainAction class documentation
662
+ - **Examples**: Check plugin examples repository
663
+ - **Ray Tune Documentation**: https://docs.ray.io/en/latest/tune/