py-sadl 1.0.2 (py_sadl-1.0.2-py3-none-any.whl)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
py_sadl-1.0.2.dist-info/METADATA ADDED
@@ -0,0 +1,338 @@
1
+ Metadata-Version: 2.4
2
+ Name: py-sadl
3
+ Version: 1.0.2
4
+ Summary: Simple Autograd Deep Learning: A minimal, readable deep learning framework built on NumPy
5
+ Project-URL: Homepage, https://github.com/timcares/sadl
6
+ Project-URL: Documentation, https://github.com/timcares/sadl#readme
7
+ Project-URL: Repository, https://github.com/timcares/sadl
8
+ Project-URL: Issues, https://github.com/timcares/sadl/issues
9
+ Author: Tim Cares
10
+ License: MIT
11
+ License-File: LICENSE
12
+ Keywords: autograd,automatic-differentiation,cupy,deep-learning,educational,machine-learning,neural-networks,numpy
13
+ Classifier: Development Status :: 4 - Beta
14
+ Classifier: Environment :: GPU :: NVIDIA CUDA
15
+ Classifier: Intended Audience :: Developers
16
+ Classifier: Intended Audience :: Education
17
+ Classifier: Intended Audience :: Science/Research
18
+ Classifier: License :: OSI Approved :: MIT License
19
+ Classifier: Natural Language :: English
20
+ Classifier: Operating System :: OS Independent
21
+ Classifier: Programming Language :: Python :: 3
22
+ Classifier: Programming Language :: Python :: 3.13
23
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
24
+ Classifier: Topic :: Scientific/Engineering :: Mathematics
25
+ Classifier: Typing :: Typed
26
+ Requires-Python: >=3.13
27
+ Requires-Dist: numpy>=2.4.0
28
+ Provides-Extra: dev
29
+ Requires-Dist: commitizen>=4.0.0; extra == 'dev'
30
+ Requires-Dist: ipykernel>=7.1.0; extra == 'dev'
31
+ Requires-Dist: ipywidgets>=8.1.8; extra == 'dev'
32
+ Requires-Dist: jupyter>=1.1.1; extra == 'dev'
33
+ Requires-Dist: matplotlib>=3.10.8; extra == 'dev'
34
+ Requires-Dist: mypy>=1.13.0; extra == 'dev'
35
+ Requires-Dist: notebook>=7.5.3; extra == 'dev'
36
+ Requires-Dist: pre-commit>=4.0.0; extra == 'dev'
37
+ Requires-Dist: pytest-cov>=6.0.0; extra == 'dev'
38
+ Requires-Dist: pytest>=8.3.0; extra == 'dev'
39
+ Requires-Dist: python-semantic-release>=9.0.0; extra == 'dev'
40
+ Requires-Dist: ruff>=0.8.0; extra == 'dev'
41
+ Provides-Extra: gpu
42
+ Requires-Dist: cupy-cuda12x>=13.0.0; extra == 'gpu'
43
+ Provides-Extra: gpu-cuda11
44
+ Requires-Dist: cupy-cuda11x>=13.0.0; extra == 'gpu-cuda11'
45
+ Description-Content-Type: text/markdown
46
+
47
+ <p align="center">
48
+ <img src="assets/sadl_icon_light.png" alt="SADL Logo" width="200">
49
+ </p>
50
+
51
+ <h1 align="center">SADL: Simple Autograd Deep Learning</h1>
52
+
53
+ <p align="center">
54
+ A minimal, readable deep learning framework built on NumPy and CuPy.<br>
55
+ Automatic differentiation, neural network primitives, and optimization in ~2000 lines of Python.
56
+ </p>
57
+
58
+ ## Installation
59
+
60
+ Using [uv](https://docs.astral.sh/uv/) for installation is recommended.
61
+
62
+ (I had to name the PyPI project `py-sadl` instead of `sadl`, because `sadl` was too similar to an existing project.)
63
+
64
+ ```bash
65
+ # Install with uv (recommended)
66
+ uv add py-sadl
67
+
68
+ # With GPU support (CUDA 12.x)
69
+ uv add py-sadl --extra gpu
70
+
71
+ # With GPU support (CUDA 11.x)
72
+ uv add py-sadl --extra gpu-cuda11
73
+ ```
74
+
75
+ Alternatively, using pip:
76
+
77
+ ```bash
78
+ # Install with pip
79
+ pip install py-sadl
80
+
81
+ # With GPU support
82
+ pip install "py-sadl[gpu]"
83
+ ```
84
+
85
+ ## Quick Start
86
+
87
+ ```python
88
+ import sadl
89
+
90
+ # Create tensors
91
+ x = sadl.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
92
+
93
+ # Build a model
94
+ model = sadl.Mlp([
95
+     sadl.Linear(dim_in=2, dim_out=4),
96
+     sadl.ReLU(),
97
+     sadl.Linear(dim_in=4, dim_out=1),
98
+ ])
99
+
100
+ # Forward pass
101
+ output = model(x)
102
+ loss = output.sum()
103
+
104
+ # Backward pass and optimization
105
+ optimizer = sadl.SGD(list(model.parameters), lr=0.01)
106
+ optimizer.backward(loss)
107
+ optimizer.step()
108
+ optimizer.zero_grad()
109
+ ```
110
+
111
+ ## Motivation
112
+
113
+ Modern deep learning frameworks like PyTorch and TensorFlow are powerful but complex. Their codebases span millions of lines, making it difficult to understand how automatic differentiation and neural network training actually work at a fundamental level.
114
+
115
+ SADL addresses this by providing a complete, functional deep learning framework that remains small enough to read and understand in its entirety. Every component, from tensor operations to backpropagation, is implemented transparently using standard NumPy operations.
116
+
117
+ The goal is not to replace production frameworks, but to serve as an educational resource and a foundation for experimentation. Researchers and engineers can trace exactly how gradients flow through computations without navigating layers of abstraction.
118
+
119
+ ## Related Projects
120
+
121
+ SADL joins a family of educational and minimal deep learning frameworks that have made autodiff more accessible:
122
+
123
+ **[micrograd](https://github.com/karpathy/micrograd)** by Andrej Karpathy is an elegant, minimal autograd engine operating on scalar values. In roughly 150 lines of code, it demonstrates the core concepts of backpropagation with remarkable clarity. micrograd is an excellent starting point for understanding how gradients flow through computations.
124
+
125
+ **[tinygrad](https://github.com/tinygrad/tinygrad)** by George Hotz takes a different approach, building a fully-featured deep learning framework with a focus on simplicity and hardware portability. tinygrad supports multiple backends and has grown into a serious alternative for running models on diverse hardware.
126
+
127
+ SADL takes inspiration from both projects while pursuing its own path: building directly on NumPy's ndarray infrastructure. By subclassing `numpy.ndarray` and intercepting operations via `__array_ufunc__` and `__array_function__`, SADL achieves autograd without introducing a new tensor abstraction. This means existing NumPy code works unchanged, and the mental model stays close to the numerical computing patterns that researchers already know.
128
+
129
+ ## Design Principles
130
+
131
+ ### Build on NumPy
132
+
133
+ SADL extends `numpy.ndarray` directly rather than wrapping arrays in custom containers. This means:
134
+
135
+ - All NumPy operations work out of the box
136
+ - No need to learn a new tensor API
137
+ - Seamless interoperability with the scientific Python ecosystem
138
+ - GPU support through CuPy with zero code changes
139
+
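+ As a small illustration (a sketch that assumes the `__array_ufunc__`/`__array_function__` interception described in the autodiff section below, run on the NumPy CPU backend):
+
+ ```python
+ import numpy as np
+
+ import sadl
+
+ x = sadl.tensor([1.0, 2.0, 3.0], requires_grad=True)
+
+ # Plain NumPy calls are intercepted, so the results stay sadl Tensors
+ # and keep participating in the computation graph.
+ y = np.exp(x) + np.square(x)
+ loss = y.sum()
+
+ print(isinstance(x, np.ndarray))  # True: Tensor subclasses numpy.ndarray
+ ```
+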
140
+ ### Mathematical Functions as First-Class Citizens
141
+
142
+ Neural network layers are modeled as mathematical functions, matching how they appear in research papers. The `Function` abstract base class enforces a simple contract: implement `__call__` to define the forward pass. This creates a natural bridge between mathematical notation and code.
143
+
144
+ ```python
145
+ class Sigmoid(Function):
146
+     def __call__(self, x: Tensor) -> Tensor:
147
+         return 1 / (xp.exp(-x) + 1)
148
+ ```
149
+
150
+ ### Explicit Over Implicit
151
+
152
+ SADL favors explicit behavior over magic:
153
+
154
+ - Gradients must be explicitly enabled with `requires_grad=True`
155
+ - Parameters are a distinct type that always tracks gradients
156
+ - The computation graph is visible and inspectable
157
+ - Device transfers are explicit operations
158
+
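+ A short sketch of what this looks like in code (illustrative; it assumes `no_grad`, which `sadl` exports, can be used as a context manager):
+
+ ```python
+ import sadl
+
+ x = sadl.tensor([1.0, 2.0])                      # no gradient tracking by default
+ w = sadl.tensor([3.0, 4.0], requires_grad=True)  # gradients are opted into explicitly
+
+ y = (w * x).sum()
+ print(y.src)  # the graph is inspectable through back-references
+
+ # Hypothetical usage: temporarily disable graph building, e.g. for evaluation.
+ with sadl.no_grad():
+     y_eval = (w * x).sum()
+ ```
+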
159
+ ### Minimal but Complete
160
+
161
+ The framework includes only what is necessary for training neural networks:
162
+
163
+ - Tensor with autograd support
164
+ - Parameter for learnable weights
165
+ - Function base class for layers
166
+ - Optimizer base class with SGD implementation
167
+ - Serialization for model persistence
168
+
169
+ Additional layers and optimizers can be built on these primitives without modifying core code.
170
+
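+ For example, a new activation can be defined with the same `Function` contract used by the built-in layers (a sketch; `Tanh` is not part of SADL and only reuses operations that already have registered gradients):
+
+ ```python
+ from sadl import Function, Tensor, xp
+
+
+ class Tanh(Function):
+     """Hypothetical extra layer built purely from supported ops."""
+
+     def __call__(self, x: Tensor) -> Tensor:
+         # exp, negative, subtract, add and divide all have backward functions,
+         # so gradients flow through this layer automatically.
+         e_pos = xp.exp(x)
+         e_neg = xp.exp(-x)
+         return (e_pos - e_neg) / (e_pos + e_neg)
+ ```
+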
171
+ ## How Autodiff Works
172
+
173
+ SADL implements reverse-mode automatic differentiation (backpropagation) using a dynamic computation graph, similar to PyTorch.
174
+
175
+ ### The Computation Graph
176
+
177
+ In SADL, **Tensors are the computation graph**. There is no separate graph data structure. Each Tensor stores a `src` attribute pointing to the Tensors it was created from. This forms a back-referencing graph where each node knows its parents, but parents do not know their children:
178
+
179
+ ```
180
+ Forward computation:
181
+
182
+ x ─┐
183
+    ├─► z ─► loss
184
+ y ─┘
185
+
186
+ Graph structure (src references):
187
+
188
+     loss
189
+       │
190
+       ▼
191
+       z
192
+      ╱ ╲
193
+     ▼   ▼
194
+     x   y
195
+ ```
196
+
197
+ This is intentional. Deep learning frameworks optimize for backward traversal because that is what backpropagation requires. Starting from the loss, we follow `src` references backward through the graph to compute gradients. Forward references (parent to child) are unnecessary and would only consume memory.
198
+
199
+ ### Forward Pass: Building the Graph
200
+
201
+ When operations are performed on Tensors with `requires_grad=True`, the graph builds itself automatically:
202
+
203
+ 1. The `Tensor` class overrides `__array_ufunc__` and `__array_function__` to intercept NumPy operations
204
+ 2. Each operation creates a new Tensor that stores:
205
+ - `src`: References to input tensors (the parents in the graph)
206
+ - `backward_fn`: The gradient function for this operation
207
+ - `op_ctx`: Any context needed for gradient computation (axis, masks, etc.)
208
+ 3. The graph grows dynamically as operations execute
209
+
210
+ ```python
211
+ x = sadl.tensor([1.0, 2.0], requires_grad=True) # leaf, src = ()
212
+ y = sadl.tensor([3.0, 4.0], requires_grad=True) # leaf, src = ()
213
+ z = x * y # z.src = (x, y), z.backward_fn = mul_backward
214
+ loss = z.sum() # loss.src = (z,), loss.backward_fn = sum_backward
215
+ ```
216
+
217
+ A more complex example:
218
+
219
+ ```
220
+ a = tensor(...) # leaf
221
+ b = tensor(...) # leaf
222
+ c = tensor(...) # leaf
223
+
224
+ d = a + b # d.src = (a, b)
225
+ e = d * c # e.src = (d, c)
226
+ f = e.sum() # f.src = (e,)
227
+
228
+
230
+
231
+         f
232
+         │
233
+         ▼
234
+         e
235
+        ╱ ╲
236
+       ▼   ▼
237
+       d   c
238
+      ╱ ╲
239
+     ▼   ▼
240
+     a   b
240
+ ```
241
+
242
+ ### Backward Pass: Computing Gradients
243
+
244
+ When `optimizer.backward(loss)` is called:
245
+
246
+ 1. **Topological Sort**: The graph is traversed from the loss tensor to find all nodes, ordered so that each node appears after all nodes that depend on it. This uses an iterative stack-based algorithm to avoid recursion limits on deep graphs.
247
+ 2. **Gradient Propagation**: Starting from the loss (seeded with gradient 1.0), each node's `backward_fn` is called with:
248
+ - The input tensors (`src`)
249
+ - Which inputs need gradients (`compute_grad`)
250
+ - The upstream gradient (`grad_out`)
251
+ - Operation context (`op_ctx`)
252
+ 3. **Gradient Accumulation**: Gradients flow backward through the graph. When a tensor is used in multiple operations, gradients are summed.
253
+ 4. **Graph Cleanup**: After backpropagation, the graph structure is cleared to free memory. Parameter gradients are retained for the optimizer step.
254
+
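+ A condensed sketch of steps 1 to 3 (illustrative only; SADL's actual `Optimizer.backward` also passes `op_ctx` to each `backward_fn` and performs the cleanup from step 4):
+
+ ```python
+ from sadl import ones_like
+
+
+ def topological_order(loss):
+     """Iterative DFS over src references: loss first, leaves last."""
+     order, visited, stack = [], set(), [(loss, False)]
+     while stack:
+         node, parents_done = stack.pop()
+         if parents_done:
+             order.append(node)  # appended only after everything it was computed from
+         elif id(node) not in visited:
+             visited.add(id(node))
+             stack.append((node, True))
+             for parent in node.src:  # back-references built during the forward pass
+                 stack.append((parent, False))
+     return list(reversed(order))  # loss first, so consumers come before their inputs
+
+
+ def backward(loss):
+     grads = {id(loss): ones_like(loss)}  # seed the loss with gradient 1.0
+     for node in topological_order(loss):
+         grad_out = grads.get(id(node))
+         if grad_out is None or not node.src:  # leaf, or a branch that needs no gradient
+             continue
+         compute_grad = tuple(t.requires_grad for t in node.src)
+         input_grads = node.backward_fn(
+             *node.src, compute_grad=compute_grad, grad_out=grad_out
+         )
+         for parent, grad in zip(node.src, input_grads):
+             if grad is not None:  # accumulate: gradients sum when a tensor is reused
+                 grads[id(parent)] = grads.get(id(parent), 0) + grad
+ ```
+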
255
+ ### Gradient Operations Registry
256
+
257
+ Each supported operation has a corresponding backward function registered in `grad_ops.py` with metadata (op type inspired by [tinygrad](https://github.com/tinygrad/tinygrad)):
258
+
259
+ ```python
260
+ @register_grad_op(
261
+     op_type=OpType.ELEMENTWISE,
262
+     op_inputs=OpInputs.BINARY,
263
+     forward_names=("mul", "multiply"),
264
+ )
265
+ @broadcastable
266
+ def mul_backward(*inputs, compute_grad, grad_out):
267
+     x, y = inputs
268
+     grad_x = y * grad_out if compute_grad[0] else None
269
+     grad_y = x * grad_out if compute_grad[1] else None
270
+     return grad_x, grad_y
271
+ ```
272
+
273
+ The `@broadcastable` decorator handles gradient reduction when inputs were broadcast during the forward pass.
274
+
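+ The idea behind that reduction (a generic sketch, not SADL's code): if an input of shape `(1, 3)` was broadcast against `(4, 3)` in the forward pass, its gradient arrives with shape `(4, 3)` and must be summed back down to `(1, 3)`:
+
+ ```python
+ import numpy as np
+
+
+ def reduce_grad_to_shape(grad: np.ndarray, shape: tuple[int, ...]) -> np.ndarray:
+     """Sum a gradient over the axes that broadcasting expanded."""
+     # Axes that broadcasting prepended to the front.
+     while grad.ndim > len(shape):
+         grad = grad.sum(axis=0)
+     # Axes that were size 1 in the input but expanded in the output.
+     for axis, dim in enumerate(shape):
+         if dim == 1 and grad.shape[axis] != 1:
+             grad = grad.sum(axis=axis, keepdims=True)
+     return grad
+
+
+ grad_x = reduce_grad_to_shape(np.ones((4, 3)), (1, 3))
+ print(grad_x.shape)  # (1, 3); each entry is the sum over the 4 broadcast rows
+ ```
+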
275
+ ### Supported Operations
276
+
277
+ Unary: `abs`, `negative`, `sqrt`, `square`, `exp`, `log`, `sin`, `cos`
278
+
279
+ Binary: `add`, `subtract`, `multiply`, `divide`, `power`, `matmul`, `maximum`, `minimum`
280
+
281
+ Reductions: `sum`, `mean`, `max`, `min`
282
+
283
+ ## Architecture
284
+
285
+ ```
286
+ sadl/
287
+ ├── __init__.py # Public API re-exports
288
+ ├── backend.py # NumPy/CuPy abstraction
289
+ ├── disk.py # Saving and loading data to/from disk
290
+ ├── tensor.py # Tensor, Parameter, serialization
291
+ ├── grad_ops.py # Gradient operation registry
292
+ ├── function.py # Neural network layers
293
+ ├── optimizer.py # Optimizer base class, SGD, backpropagation
294
+ ├── ops.py # Array creation and device utilities
295
+ └── utils.py # Device transfer utilities
296
+ ```
297
+
298
+ ### Key Components
299
+
300
+ **Tensor**: Subclass of `numpy.ndarray` with additional attributes for autograd. Intercepts NumPy operations to build the computation graph.
301
+
302
+ **Parameter**: Tensor subclass for learnable weights. Always requires gradients and retains them after backward pass for gradient accumulation.
303
+
304
+ **Function**: Abstract base class for neural network layers. Provides parameter traversal, device management, and train/inference mode switching.
305
+
306
+ **Optimizer**: Abstract base class that owns the backward pass. Performs topological sort, gradient computation, and graph cleanup.
307
+
308
+ **GradOp Registry**: Dictionary mapping operation names to backward functions. New operations can be registered with a decorator.
309
+
310
+ ## Serialization
311
+
312
+ SADL uses a custom binary format (`.sadl` files) for efficient tensor storage:
313
+
314
+ - 4-byte magic header for format validation
315
+ - Version byte for forward compatibility
316
+ - Compact encoding of dtype, shape, and raw data
317
+ - Support for single tensors or ordered dictionaries of tensors
318
+
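+ Typical usage of `sadl.save` and `sadl.load` (file names here are placeholders):
+
+ ```python
+ from collections import OrderedDict
+
+ import sadl
+
+ w = sadl.tensor([[0.1, 0.2], [0.3, 0.4]])
+
+ # A single tensor round-trips as a single tensor; paths must end with ".sadl".
+ sadl.save(w, "weights.sadl")
+ w_loaded = sadl.load("weights.sadl")
+
+ # An OrderedDict of tensors round-trips as an OrderedDict.
+ state = OrderedDict([("layer0.weight", w)])
+ sadl.save(state, "model.sadl")
+ state_loaded = sadl.load("model.sadl")
+ ```
+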
319
+ ## Contributing
320
+
321
+ See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, commands, and guidelines.
322
+
323
+ ## Code of Conduct
324
+
325
+ See [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for behavior guidelines. The file was created using the [Contributor Covenant](https://www.contributor-covenant.org).
326
+
327
+ ## Future Plans
328
+
329
+ - Static graph compilation for repeated computations
330
+ - Additional layers and components (convolution, batch normalization, attention)
331
+ - More optimizers (Adam, AdamW, RMSprop)
332
+ - XLA compilation backend for TPU support
333
+ - Automatic mixed precision training
334
+ - Distributed training primitives
335
+
336
+ ## See Also
337
+
338
+ - `docs/API_REFERENCE.md`: Complete API documentation
py_sadl-1.0.2.dist-info/RECORD ADDED
@@ -0,0 +1,13 @@
1
+ sadl/__init__.py,sha256=t-lx9xcwdU9egY5zUXOMjLI-Wz1t9lTzZYJnS5VWlf4,1214
2
+ sadl/backend.py,sha256=G7nZ1V7qm7myfgBSQV7hWIgk4ohVW_SfA_6BvvwvVdw,1060
3
+ sadl/disk.py,sha256=Sy9Fg6EbovTQvawUn-DMO8e_YecqLZXwtTf99K7w4WA,4807
4
+ sadl/function.py,sha256=eJgSAO91KvwNfGiL1-AUZHWiAr6P77P_uvnpV5F8rLw,14196
5
+ sadl/grad_ops.py,sha256=rRJ2HhXzmqdIBfdd_yuKUaZ_f_ZkXQeGOuZsCk9EPho,37549
6
+ sadl/ops.py,sha256=-NptddSsHAredwjcZUiGm50GLSatXYHGcXiS3qfA6Qw,1868
7
+ sadl/optimizer.py,sha256=qPscSJyqbWTNcFUIJDmaVaYFVpuqKVESDp6Dn8wSoPQ,13361
8
+ sadl/tensor.py,sha256=My-GJ1wQxclTbFjobG-hCn2VXQYBLW17NUeDOL6DKi4,18160
9
+ sadl/utils.py,sha256=M_0r0s14w3JdvA5Q_IxeRitXA8W5Fm3-1LUBGJVGMv4,1126
10
+ py_sadl-1.0.2.dist-info/METADATA,sha256=S9cULasg-M7LKX6U17OwpMcNXMiRDigqYf1SICMoONk,13065
11
+ py_sadl-1.0.2.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
12
+ py_sadl-1.0.2.dist-info/licenses/LICENSE,sha256=zOSEe9MBxdTcRzOau3ZZlnaBKkhmbLJtskkyONfSeIk,1066
13
+ py_sadl-1.0.2.dist-info/RECORD,,
py_sadl-1.0.2.dist-info/WHEEL ADDED
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: hatchling 1.28.0
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
py_sadl-1.0.2.dist-info/licenses/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Tim Cares
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
sadl/__init__.py ADDED
@@ -0,0 +1,74 @@
1
+ """SADL: Simple Autograd Deep Learning.
2
+
3
+ A minimal, readable deep learning framework built on NumPy/CuPy.
4
+ """
5
+
6
+ from importlib.metadata import PackageNotFoundError, version
7
+
8
+ from . import grad_ops
9
+
10
+ try:
11
+     __version__ = version("py-sadl")
12
+ except PackageNotFoundError:
13
+     __version__ = "0.0.0.dev0"  # Fallback for uninstalled package
14
+ from .backend import (
15
+ BACKEND,
16
+ TensorDevice,
17
+ xp,
18
+ )
19
+ from .disk import (
20
+ load,
21
+ save,
22
+ )
23
+ from .function import (
24
+ Function,
25
+ Linear,
26
+ Mlp,
27
+ ReLU,
28
+ Sigmoid,
29
+ )
30
+ from .ops import (
31
+ copy_to_device,
32
+ ones_like,
33
+ zeros_like,
34
+ )
35
+ from .optimizer import (
36
+ SGD,
37
+ Optimizer,
38
+ )
39
+ from .tensor import (
40
+ Parameter,
41
+ Tensor,
42
+ get_current_global_grad_mode,
43
+ no_grad,
44
+ no_grad_fn,
45
+ set_global_grad_mode,
46
+ tensor,
47
+ )
48
+
49
+ __all__ = [
50
+ "BACKEND",
51
+ "SGD",
52
+ "Function",
53
+ "Linear",
54
+ "Mlp",
55
+ "Optimizer",
56
+ "Parameter",
57
+ "ReLU",
58
+ "Sigmoid",
59
+ "Tensor",
60
+ "TensorDevice",
61
+ "__version__",
62
+ "copy_to_device",
63
+ "get_current_global_grad_mode",
64
+ "grad_ops",
65
+ "load",
66
+ "no_grad",
67
+ "no_grad_fn",
68
+ "ones_like",
69
+ "save",
70
+ "set_global_grad_mode",
71
+ "tensor",
72
+ "xp",
73
+ "zeros_like",
74
+ ]
sadl/backend.py ADDED
@@ -0,0 +1,45 @@
1
+ import logging
2
+ from typing import TYPE_CHECKING, Any, Literal
3
+
4
+ logger = logging.getLogger(__name__)
5
+
6
+
7
+ TensorDevice = Literal["cpu"] | int
8
+
9
+
10
+ if TYPE_CHECKING:
11
+ import numpy
12
+
13
+ ModuleType = numpy.ndarray[Any, Any] | Any # numpy or cupy module
14
+
15
+
16
+ def _validate_cupy_available() -> None:
17
+ """Validate that CuPy is available with working CUDA devices.
18
+
19
+ Raises:
20
+ RuntimeError: If CUDA is unavailable or no devices are found.
21
+ """
22
+ try:
23
+ _device_count = xp.cuda.runtime.getDeviceCount()
24
+ except Exception as exc:
25
+ raise RuntimeError("Cupy is installed but CUDA is unavailable") from exc
26
+
27
+ if _device_count < 1:
28
+ raise RuntimeError("Cupy is installed but no CUDA devices are available")
29
+
30
+
31
+ try:
32
+ import cupy as xp
33
+
34
+ _validate_cupy_available()
35
+
36
+ BACKEND = "cupy"
37
+ logger.debug("Using cupy as backend")
38
+ except (ImportError, RuntimeError):
39
+ import numpy as xp
40
+
41
+ BACKEND = "numpy"
42
+ logger.warning("Cupy backend unavailable; falling back to numpy (cpu)")
43
+
44
+
45
+ __all__ = ["BACKEND", "TensorDevice", "xp"]
sadl/disk.py ADDED
@@ -0,0 +1,147 @@
1
+ """Code for serializing and deserializing data."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import struct
6
+ from collections import OrderedDict
7
+ from typing import Any
8
+
9
+ import numpy as np
10
+
11
+ from .tensor import Tensor, tensor
12
+
13
+ _SADL_MAGIC = b"SADL"
14
+ _SADL_VERSION = 1
15
+
16
+
17
+ def _dtype_to_str(dtype: Any) -> str:
18
+     """Convert numpy/cupy dtype to string representation."""
19
+     return str(np.dtype(dtype).name)
20
+
21
+
22
+ def _str_to_dtype(dtype_str: str) -> Any:
23
+     """Convert string back to numpy dtype."""
24
+     return np.dtype(dtype_str)
25
+
26
+
27
+ def save(data: Tensor | OrderedDict[str, Tensor], file_path: str) -> None:
28
+     """Save Tensor data to disk using custom binary format.
29
+
30
+     Args:
31
+         data (Tensor | OrderedDict[str, Tensor]): The data to save,
32
+             can either be a single Tensor or an OrderedDict with strings
33
+             as keys and Tensors as values.
34
+         file_path (str): The file path to which to store the data. Must
35
+             end with ".sadl".
36
+
37
+     Raises:
38
+         ValueError: If an OrderedDict with non-Tensor values is passed.
39
+         ValueError: If file_path doesn't end with ".sadl".
40
+     """
41
+     if not file_path.endswith(".sadl"):
42
+         raise ValueError('file_path must end with ".sadl"')
43
+
44
+     # Normalize to OrderedDict
45
+     if isinstance(data, Tensor):
46
+         tensors = OrderedDict([("__single__", data)])
47
+     else:
48
+         if not all(isinstance(v, Tensor) for v in data.values()):
49
+             raise ValueError("If an OrderedDict is passed, all values must be Tensors.")
50
+         tensors = data
51
+
52
+     with open(file_path, "wb") as f:
53
+         # Write header
54
+         f.write(_SADL_MAGIC)
55
+         f.write(struct.pack("<B", _SADL_VERSION))  # uint8 version
56
+         f.write(struct.pack("<I", len(tensors)))  # uint32 num tensors
57
+
58
+         # Write each tensor
59
+         for key, tensor in tensors.items():
60
+             # Convert to numpy (CPU) for serialization
61
+             arr = np.asarray(tensor)
62
+             # Ensure C-contiguous
63
+             if not arr.flags["C_CONTIGUOUS"]:
64
+                 arr = np.ascontiguousarray(arr)
65
+
66
+             # Key
67
+             key_bytes = key.encode("utf-8")
68
+             f.write(struct.pack("<I", len(key_bytes)))  # uint32 key length
69
+             f.write(key_bytes)
70
+
71
+             # Dtype
72
+             dtype_str = _dtype_to_str(arr.dtype)
73
+             dtype_bytes = dtype_str.encode("utf-8")
74
+             f.write(struct.pack("<B", len(dtype_bytes)))  # uint8 dtype length
75
+             f.write(dtype_bytes)
76
+
77
+             # Shape
78
+             f.write(struct.pack("<B", arr.ndim))  # uint8 ndim
79
+             f.writelines(struct.pack("<Q", dim) for dim in arr.shape)  # uint64 per dimension
80
+
81
+             # Raw data
82
+             f.write(arr.tobytes())
83
+
84
+
85
+ def load(file_path: str) -> Tensor | OrderedDict[str, Tensor]:
86
+     """Load Tensor data from disk.
87
+
88
+     Args:
89
+         file_path (str): The file path from which to read the data. Must
90
+             end with ".sadl".
91
+
92
+     Raises:
93
+         ValueError: If file_path doesn't end with ".sadl".
94
+         ValueError: If file has invalid magic bytes or unsupported version.
95
+
96
+     Returns:
97
+         Tensor | OrderedDict[str, Tensor]: The loaded data. Returns a single
98
+             Tensor if one was saved, otherwise an OrderedDict.
99
+     """
100
+     if not file_path.endswith(".sadl"):
101
+         raise ValueError('file_path must end with ".sadl"')
102
+
103
+     with open(file_path, "rb") as f:
104
+         # Read and validate header
105
+         magic = f.read(4)
106
+         if magic != _SADL_MAGIC:
107
+             raise ValueError(f"Invalid file format. Expected SADL magic bytes, got {magic!r}")
108
+
109
+         version = struct.unpack("<B", f.read(1))[0]
110
+         if version != _SADL_VERSION:
111
+             raise ValueError(f"Unsupported version {version}. Expected {_SADL_VERSION}")
112
+
113
+         num_tensors = struct.unpack("<I", f.read(4))[0]
114
+
115
+         # Read tensors
116
+         tensors: OrderedDict[str, Tensor] = OrderedDict()
117
+         for _ in range(num_tensors):
118
+             # Key
119
+             key_length = struct.unpack("<I", f.read(4))[0]
120
+             key = f.read(key_length).decode("utf-8")
121
+
122
+             # Dtype
123
+             dtype_length = struct.unpack("<B", f.read(1))[0]
124
+             dtype_str = f.read(dtype_length).decode("utf-8")
125
+             dtype = _str_to_dtype(dtype_str)
126
+
127
+             # Shape
128
+             ndim = struct.unpack("<B", f.read(1))[0]
129
+             shape = tuple(struct.unpack("<Q", f.read(8))[0] for _ in range(ndim))
130
+
131
+             # Data
132
+             num_bytes = int(np.prod(shape)) * dtype.itemsize
133
+             data_bytes = f.read(num_bytes)
134
+             arr = np.frombuffer(data_bytes, dtype=dtype).reshape(shape)
135
+
136
+             tensors[key] = tensor(arr)
137
+
138
+     # Return single tensor if that's what was saved
139
+     if len(tensors) == 1 and "__single__" in tensors:
140
+         return tensors["__single__"]
141
+     return tensors
142
+
143
+
144
+ __all__ = [
145
+ "load",
146
+ "save",
147
+ ]