cifar10-tools 0.4.0__tar.gz → 0.5.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/PKG-INFO +10 -4
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/README.md +9 -3
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/pyproject.toml +12 -1
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/hyperparameter_optimization.py +49 -18
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/LICENSE +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/__init__.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/__init__.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/data.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/evaluation.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/plotting.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/training.py +0 -0
- {cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/tensorflow/__init__.py +0 -0
{cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cifar10_tools
-Version: 0.4.0
+Version: 0.5.0
 Summary: Tools for training neural networks on the CIFAR-10 task with PyTorch and TensorFlow
 License: GPLv3
 License-File: LICENSE
@@ -36,17 +36,23 @@ Description-Content-Type: text/markdown
 
 A progressive deep learning tutorial for image classification on the CIFAR-10 dataset using PyTorch. This project demonstrates the evolution from basic deep neural networks to optimized convolutional neural networks with data augmentation. It also provides a set of utility functions as a PyPI package for use in other projects.
 
-[View on PyPI](https://pypi.org/project/cifar10_tools)
+[View on PyPI](https://pypi.org/project/cifar10_tools) | [Documentation](https://gperdrizet.github.io/CIFAR10/)
 
 ## Installation
 
-Install the helper tools package locally in editable mode:
+Install the helper tools package locally in editable mode to use in this repository:
 
 ```bash
 pip install -e .
 ```
 
-
+Or install from PyPI to use in other projects:
+
+```bash
+pip install cifar10_tools
+```
+
+## Project overview
 
 This repository contains a series of Jupyter notebooks that progressively build more sophisticated neural network architectures for the CIFAR-10 image classification task. Each notebook builds upon concepts from the previous one, demonstrating key deep learning techniques.
 
{cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/README.md

@@ -2,17 +2,23 @@
 
 A progressive deep learning tutorial for image classification on the CIFAR-10 dataset using PyTorch. This project demonstrates the evolution from basic deep neural networks to optimized convolutional neural networks with data augmentation. It also provides a set of utility functions as a PyPI package for use in other projects.
 
-[View on PyPI](https://pypi.org/project/cifar10_tools)
+[View on PyPI](https://pypi.org/project/cifar10_tools) | [Documentation](https://gperdrizet.github.io/CIFAR10/)
 
 ## Installation
 
-Install the helper tools package locally in editable mode:
+Install the helper tools package locally in editable mode to use in this repository:
 
 ```bash
 pip install -e .
 ```
 
-
+Or install from PyPI to use in other projects:
+
+```bash
+pip install cifar10_tools
+```
+
+## Project overview
 
 This repository contains a series of Jupyter notebooks that progressively build more sophisticated neural network architectures for the CIFAR-10 image classification task. Each notebook builds upon concepts from the previous one, demonstrating key deep learning techniques.
 
{cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/pyproject.toml

@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"
 
 [tool.poetry]
 name = "cifar10_tools"
-version = "0.4.0"
+version = "0.5.0"
 description = "Tools for training neural networks on the CIFAR-10 task with PyTorch and TensorFlow"
 authors = ["gperdrizet <george@perdrizet.org>"]
 readme = "README.md"
@@ -33,6 +33,17 @@ torch = ">=2.0"
 torchvision = ">=0.15"
 numpy = ">=1.24"
 
+[tool.poetry.group.docs]
+optional = true
+
+[tool.poetry.group.docs.dependencies]
+sphinx = ">=7.0"
+sphinx-rtd-theme = ">=2.0"
+nbsphinx = ">=0.9"
+sphinx-autodoc-typehints = ">=1.25"
+ipykernel = ">=6.0"
+pandoc = ">=2.0"
+
 [tool.poetry.extras]
 tensorflow = ["tensorflow"]
 
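Because the new docs group is marked `optional = true`, a plain `poetry install` skips it; the documentation toolchain is only pulled in with an explicit opt-in such as `poetry install --with docs`.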
{cifar10_tools-0.4.0 → cifar10_tools-0.5.0}/src/cifar10_tools/pytorch/hyperparameter_optimization.py
RENAMED

@@ -46,6 +46,7 @@ def create_cnn(
     Returns:
         nn.Sequential model
     '''
+
     layers = []
     current_channels = in_channels
     current_size = input_size
@@ -58,6 +59,8 @@
 
         # First conv in block
         layers.append(nn.Conv2d(current_channels, out_channels, kernel_size=kernel_size, padding=padding))
+        # Update size after conv: output_size = (input_size + 2*padding - kernel_size) + 1
+        current_size = (current_size + 2 * padding - kernel_size) + 1
 
         if use_batch_norm:
             layers.append(nn.BatchNorm2d(out_channels))
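Worked through for CIFAR-10's 32×32 inputs: with kernel_size=3 and padding=1 the formula gives (32 + 2·1 − 3) + 1 = 32, so that configuration preserves spatial size, while larger sampled kernels can shrink the feature map at every convolution; that shrinkage is what the new `current_size` updates track.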
@@ -66,6 +69,7 @@
 
         # Second conv in block
         layers.append(nn.Conv2d(out_channels, out_channels, kernel_size=kernel_size, padding=padding))
+        current_size = (current_size + 2 * padding - kernel_size) + 1
 
         if use_batch_norm:
             layers.append(nn.BatchNorm2d(out_channels))
@@ -81,11 +85,10 @@
         layers.append(nn.Dropout(conv_dropout_rate))
 
         current_channels = out_channels
-        current_size //= 2
+        current_size //= 2  # Pooling halves the size
 
-    # Calculate flattened size
-
-    flattened_size = final_channels * current_size * current_size
+    # Calculate flattened size using actual current_size
+    flattened_size = current_channels * current_size * current_size
 
     # Classifier - dynamic FC layers with halving pattern
     layers.append(nn.Flatten())
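Together with the per-convolution updates above, switching from `final_channels` to `current_channels` makes `flattened_size` follow the tensor's actual shape rather than an assumed one. A minimal sketch of the bookkeeping, using illustrative values (32×32 input, two blocks of widths 32 and 64, kernel_size=3, padding=1) that are not fixed by the package:

```python
# Size bookkeeping mirroring the diff's logic; all values here are illustrative
input_size, kernel_size, padding = 32, 3, 1   # CIFAR-10 images are 32x32
block_channels = [32, 64]                     # hypothetical widths for two conv blocks

current_size = input_size
for out_channels in block_channels:
    # Two convolutions per block, each: out = (in + 2*padding - kernel_size) + 1
    current_size = (current_size + 2 * padding - kernel_size) + 1
    current_size = (current_size + 2 * padding - kernel_size) + 1
    current_size //= 2                        # pooling halves the size

flattened_size = block_channels[-1] * current_size * current_size
print(current_size, flattened_size)           # 8, 64 * 8 * 8 = 4096 with these values
```

If a sampled kernel/padding combination is not size-preserving, the old calculation, which never updated `current_size` after a convolution, would overestimate the flattened size.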
@@ -180,7 +183,8 @@ def create_objective(
     n_epochs: int,
     device: torch.device,
     num_classes: int = 10,
-    in_channels: int = 3
+    in_channels: int = 3,
+    search_space: dict = None
 ) -> Callable[[optuna.Trial], float]:
     '''Create an Optuna objective function for CNN hyperparameter optimization.
 
@@ -195,6 +199,7 @@
         device: Device to train on (cuda or cpu)
         num_classes: Number of output classes (default: 10)
         in_channels: Number of input channels (default: 3 for RGB)
+        search_space: Dictionary defining hyperparameter search space (default: None)
 
     Returns:
         Objective function for optuna.Study.optimize()
@@ -205,21 +210,43 @@
         >>> study.optimize(objective, n_trials=100)
     '''
 
+    # Default search space if none provided
+    if search_space is None:
+        search_space = {
+            'batch_size': [64, 128, 256, 512, 1024],
+            'n_conv_blocks': (1, 5),
+            'initial_filters': [8, 16, 32, 64, 128],
+            'n_fc_layers': (1, 8),
+            'base_kernel_size': (3, 7),
+            'conv_dropout_rate': (0.0, 0.5),
+            'fc_dropout_rate': (0.2, 0.75),
+            'pooling_strategy': ['max', 'avg'],
+            'use_batch_norm': [True, False],
+            'learning_rate': (1e-5, 1e-1, 'log'),
+            'optimizer': ['Adam', 'SGD', 'RMSprop'],
+            'sgd_momentum': (0.8, 0.99)
+        }
+
     def objective(trial: optuna.Trial) -> float:
         '''Optuna objective function for CNN hyperparameter optimization.'''
 
-        # Suggest hyperparameters
-        batch_size = trial.suggest_categorical('batch_size', [64, 128, 256, 512, 1024])
-        n_conv_blocks = trial.suggest_int('n_conv_blocks', 1, 5)
-        initial_filters = trial.suggest_categorical('initial_filters', [8, 16, 32, 64, 128])
-        n_fc_layers = trial.suggest_int('n_fc_layers', 1, 8)
-        base_kernel_size = trial.suggest_int('base_kernel_size', 3, 7)
-        conv_dropout_rate = trial.suggest_float('conv_dropout_rate', 0.0, 0.5)
-        fc_dropout_rate = trial.suggest_float('fc_dropout_rate', 0.2, 0.75)
-        pooling_strategy = trial.suggest_categorical('pooling_strategy', ['max', 'avg'])
-        use_batch_norm = trial.suggest_categorical('use_batch_norm', [True, False])
-        learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-1, log=True)
-        optimizer_name = trial.suggest_categorical('optimizer', ['Adam', 'SGD', 'RMSprop'])
+        # Suggest hyperparameters from search space
+        batch_size = trial.suggest_categorical('batch_size', search_space['batch_size'])
+        n_conv_blocks = trial.suggest_int('n_conv_blocks', *search_space['n_conv_blocks'])
+        initial_filters = trial.suggest_categorical('initial_filters', search_space['initial_filters'])
+        n_fc_layers = trial.suggest_int('n_fc_layers', *search_space['n_fc_layers'])
+        base_kernel_size = trial.suggest_int('base_kernel_size', *search_space['base_kernel_size'])
+        conv_dropout_rate = trial.suggest_float('conv_dropout_rate', *search_space['conv_dropout_rate'])
+        fc_dropout_rate = trial.suggest_float('fc_dropout_rate', *search_space['fc_dropout_rate'])
+        pooling_strategy = trial.suggest_categorical('pooling_strategy', search_space['pooling_strategy'])
+        use_batch_norm = trial.suggest_categorical('use_batch_norm', search_space['use_batch_norm'])
+
+        # Handle learning rate with optional log scale
+        lr_params = search_space['learning_rate']
+        learning_rate = trial.suggest_float('learning_rate', lr_params[0], lr_params[1],
+                                            log=(lr_params[2] == 'log' if len(lr_params) > 2 else False))
+
+        optimizer_name = trial.suggest_categorical('optimizer', search_space['optimizer'])
 
         # Create data loaders with suggested batch size
         train_loader, val_loader, _ = make_data_loaders(
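A sketch of how the new parameter might be called. The leading data arguments to `create_objective` are elided since the hunks above show only the tail of its signature, and the custom dictionary defines every key because the objective indexes each one unconditionally (a partial dictionary would raise `KeyError`):

```python
import optuna
import torch

from cifar10_tools.pytorch.hyperparameter_optimization import create_objective

# Hypothetical narrowed search space; all keys must be present
custom_space = {
    'batch_size': [128, 256],
    'n_conv_blocks': (2, 4),
    'initial_filters': [32, 64],
    'n_fc_layers': (1, 4),
    'base_kernel_size': (3, 5),
    'conv_dropout_rate': (0.0, 0.3),
    'fc_dropout_rate': (0.3, 0.6),
    'pooling_strategy': ['max'],
    'use_batch_norm': [True],
    'learning_rate': (1e-4, 1e-2, 'log'),  # third element 'log' enables log scale
    'optimizer': ['Adam', 'SGD'],
    'sgd_momentum': (0.9, 0.99),
}

objective = create_objective(
    # ...dataset/model arguments as in the function's docstring example...
    n_epochs=10,
    device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'),
    search_space=custom_space,
)
study = optuna.create_study(direction='maximize')  # assuming the objective returns validation accuracy
study.optimize(objective, n_trials=50)
```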
@@ -250,7 +277,7 @@
             optimizer = optim.Adam(model.parameters(), lr=learning_rate)
 
         elif optimizer_name == 'SGD':
-            momentum = trial.suggest_float('sgd_momentum', 0.8, 0.99)
+            momentum = trial.suggest_float('sgd_momentum', *search_space['sgd_momentum'])
             optimizer = optim.SGD(model.parameters(), lr=learning_rate, momentum=momentum)
 
         else: # RMSprop
@@ -270,6 +297,10 @@
                 trial=trial
             )
 
+        except RuntimeError as e:
+            # Catch architecture errors (e.g., dimension mismatches)
+            raise optuna.TrialPruned(f'RuntimeError with params: {trial.params} - {str(e)}')
+
         except torch.cuda.OutOfMemoryError:
             # Clear CUDA cache and skip this trial
             torch.cuda.empty_cache()
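The new handler converts architecture failures into pruned trials rather than crashing the study, matching the existing OOM path's "skip this trial" behavior. A small follow-on check, reusing the `study` from the sketch above:

```python
# Pruned trials (OOM or architecture RuntimeError) remain in the study history
pruned = [t for t in study.trials if t.state == optuna.trial.TrialState.PRUNED]
print(f'{len(pruned)} of {len(study.trials)} trials were pruned')
```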
The remaining eight files listed above are unchanged between versions.