volresample-0.1.0.tar.gz

Files changed (36)
  1. volresample-0.1.0/LICENSE +21 -0
  2. volresample-0.1.0/MANIFEST.in +26 -0
  3. volresample-0.1.0/PKG-INFO +268 -0
  4. volresample-0.1.0/README.md +241 -0
  5. volresample-0.1.0/pyproject.toml +200 -0
  6. volresample-0.1.0/setup.cfg +4 -0
  7. volresample-0.1.0/setup.py +172 -0
  8. volresample-0.1.0/src/volresample/__init__.py +77 -0
  9. volresample-0.1.0/src/volresample/_config.py +55 -0
  10. volresample-0.1.0/src/volresample/_resample.c +26539 -0
  11. volresample-0.1.0/src/volresample/_resample.pyi +98 -0
  12. volresample-0.1.0/src/volresample/_resample.pyx +342 -0
  13. volresample-0.1.0/src/volresample/cython_src/area.pxd +14 -0
  14. volresample-0.1.0/src/volresample/cython_src/area.pyx +119 -0
  15. volresample-0.1.0/src/volresample/cython_src/grid_sample.pxd +68 -0
  16. volresample-0.1.0/src/volresample/cython_src/grid_sample.pyx +842 -0
  17. volresample-0.1.0/src/volresample/cython_src/linear.pxd +10 -0
  18. volresample-0.1.0/src/volresample/cython_src/linear.pyx +160 -0
  19. volresample-0.1.0/src/volresample/cython_src/nearest.pxd +11 -0
  20. volresample-0.1.0/src/volresample/cython_src/nearest.pyx +65 -0
  21. volresample-0.1.0/src/volresample/cython_src/utils.pxd +7 -0
  22. volresample-0.1.0/src/volresample/cython_src/utils.pyx +9 -0
  23. volresample-0.1.0/src/volresample/py.typed +0 -0
  24. volresample-0.1.0/src/volresample.egg-info/PKG-INFO +268 -0
  25. volresample-0.1.0/src/volresample.egg-info/SOURCES.txt +34 -0
  26. volresample-0.1.0/src/volresample.egg-info/dependency_links.txt +1 -0
  27. volresample-0.1.0/src/volresample.egg-info/not-zip-safe +1 -0
  28. volresample-0.1.0/src/volresample.egg-info/requires.txt +1 -0
  29. volresample-0.1.0/src/volresample.egg-info/top_level.txt +1 -0
  30. volresample-0.1.0/tests/__init__.py +20 -0
  31. volresample-0.1.0/tests/benchmark_resampling.py +529 -0
  32. volresample-0.1.0/tests/conftest.py +64 -0
  33. volresample-0.1.0/tests/test_dtype_support.py +347 -0
  34. volresample-0.1.0/tests/test_grid_sample.py +437 -0
  35. volresample-0.1.0/tests/test_resampling.py +702 -0
  36. volresample-0.1.0/tests/torch_reference.py +170 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Johannes Hofmanninger
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,26 @@
1
+ # Include Cython source files
2
+ recursive-include src *.pyx
3
+ recursive-include src *.pxd
4
+ recursive-include src *.c
5
+
6
+ # Include package data
7
+ include src/volresample/*.pyi
8
+
9
+ # Include documentation
10
+ include README.md
11
+ include LICENSE
12
+
13
+ # Include build configuration
14
+ include pyproject.toml
15
+ include setup.py
16
+
17
+ # Include tests
18
+ recursive-include tests *.py
19
+
20
+ # Exclude compiled files and caches
21
+ global-exclude *.pyc
22
+ global-exclude *.pyo
23
+ global-exclude __pycache__
24
+ global-exclude *.so
25
+ global-exclude *.dylib
26
+ global-exclude .DS_Store
@@ -0,0 +1,268 @@
1
+ Metadata-Version: 2.4
2
+ Name: volresample
3
+ Version: 0.1.0
4
+ Summary: Fast 3D volume resampling with optimized Cython
5
+ Author-email: Johannes <j.hofmanninger@gmail.com>
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/JoHof/volresample
8
+ Project-URL: Repository, https://github.com/JoHof/volresample
9
+ Keywords: volume,resampling,interpolation,3d,medical imaging,cython
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.9
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
19
+ Classifier: Programming Language :: Cython
20
+ Classifier: Topic :: Scientific/Engineering
21
+ Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
22
+ Requires-Python: >=3.9
23
+ Description-Content-Type: text/markdown
24
+ License-File: LICENSE
25
+ Requires-Dist: numpy>=2.0.0
26
+ Dynamic: license-file
27
+
28
+ # volresample
29
+
30
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
31
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
32
+
33
+ Fast 3D volume resampling with Cython and OpenMP parallelization.
34
+
35
+ volresample is implemented against PyTorch's `F.interpolate` and `F.grid_sample` as references and produces identical results. It can be used as a drop-in replacement when PyTorch is not available or when better CPU performance is desired.
36
+
37
+ ## Features
38
+
39
+ - Cython-optimized with OpenMP parallelization
40
+ - Simple API: `resample()` and `grid_sample()`
41
+ - Interpolation modes: nearest, linear and area
42
+ - Supports 3D, 4D (multi-channel), and 5D (batched multi-channel) volumes
43
+ - Supports uint8 and int16 (nearest mode only) and float32 (all modes)
44
+
45
+ ## Installation
46
+
47
+ ```bash
48
+ pip install volresample
49
+ ```
50
+
51
+ Or build from source:
52
+
53
+ ```bash
54
+ git clone https://github.com/JoHof/volresample.git
55
+ cd volresample
56
+ uv sync
57
+ ```
58
+
59
+ ## Quick Start
60
+
61
+ ### Basic Resampling
62
+
63
+ ```python
64
+ import numpy as np
65
+ import volresample
66
+
67
+ # Create a 3D volume
68
+ volume = np.random.rand(128, 128, 128).astype(np.float32)
69
+
70
+ # Resample to a different size
71
+ resampled = volresample.resample(volume, (64, 64, 64), mode='linear')
72
+ print(resampled.shape) # (64, 64, 64)
73
+ ```
74
+
75
+ ### Multi-Channel Volumes
76
+
77
+ ```python
78
+ # 4D volume with 4 channels
79
+ volume_4d = np.random.rand(4, 128, 128, 128).astype(np.float32)
80
+
81
+ # Resample all channels
82
+ resampled_4d = volresample.resample(volume_4d, (64, 64, 64), mode='linear')
83
+ print(resampled_4d.shape) # (4, 64, 64, 64)
84
+ ```
85
+
86
+ ### Batched Multi-Channel Volumes
87
+
88
+ ```python
89
+ # 5D volume with batch dimension (N, C, D, H, W)
90
+ volume_5d = np.random.rand(2, 4, 128, 128, 128).astype(np.float32)
91
+
92
+ # Resample all batches and channels
93
+ resampled_5d = volresample.resample(volume_5d, (64, 64, 64), mode='linear')
94
+ print(resampled_5d.shape) # (2, 4, 64, 64, 64)
95
+ ```
96
+
97
+ ### Grid Sampling
98
+
99
+ ```python
100
+ # Input volume: (N, C, D, H, W)
101
+ input = np.random.rand(2, 3, 32, 32, 32).astype(np.float32)
102
+
103
+ # Sampling grid with normalized coordinates in [-1, 1]
104
+ grid = np.random.uniform(-1, 1, (2, 24, 24, 24, 3)).astype(np.float32)
105
+
106
+ # Sample with linear interpolation
107
+ output = volresample.grid_sample(input, grid, mode='linear', padding_mode='zeros')
108
+ print(output.shape) # (2, 3, 24, 24, 24)
109
+ ```
110
+
111
+ ### Parallelization
112
+
113
+ ```python
114
+ import volresample
115
+
116
+ # Check default thread count (min of cpu_count and 4)
117
+ print(volresample.get_num_threads()) # e.g., 4
118
+
119
+ # Set custom thread count
120
+ volresample.set_num_threads(8)
121
+
122
+ # All subsequent operations use 8 threads (`volume` as defined under Basic Resampling)
123
+ resampled = volresample.resample(volume, (64, 64, 64), mode='linear')
124
+ ```
125
+
126
+ ## API Reference
127
+
128
+ ### `resample(data, size, mode='linear')`
129
+
130
+ Resample a 3D, 4D, or 5D volume to a new size.
131
+
132
+ **Parameters:**
133
+ - `data` (ndarray): Input volume of shape `(D, H, W)`, `(C, D, H, W)`, or `(N, C, D, H, W)`
134
+ - `size` (tuple): Target size `(D_out, H_out, W_out)`
135
+ - `mode` (str): Interpolation mode:
136
+   - `'nearest'`: Nearest neighbor (works with all dtypes)
137
+   - `'linear'`: Trilinear interpolation (float32 only)
138
+   - `'area'`: Area-based averaging (float32 only, suited for downsampling)
139
+
140
+ **PyTorch correspondence:**
141
+
142
+ | volresample | PyTorch `F.interpolate` |
143
+ |-------------|-------------------------|
144
+ | `mode='nearest'` | `mode='nearest-exact'` |
145
+ | `mode='linear'` | `mode='trilinear'` |
146
+ | `mode='area'` | `mode='area'` |
147
+
148
+ volresample does not expose an `align_corners` parameter. The behavior matches PyTorch's `align_corners=False` (the default).
149
+
150
+ **Returns:**
151
+ - Resampled array with same number of dimensions as input
152
+
153
+ **Supported Dtypes:**
154
+ - `uint8`, `int16`: Only with `mode='nearest'`
155
+ - `float32`: All modes
156
+
157
+ ### `grid_sample(input, grid, mode='linear', padding_mode='zeros')`
158
+
159
+ Sample input at arbitrary locations specified by a grid.
160
+
161
+ **Parameters:**
162
+ - `input` (ndarray): Input volume of shape `(N, C, D, H, W)`
163
+ - `grid` (ndarray): Sampling grid of shape `(N, D_out, H_out, W_out, 3)`
164
+   - Values in range `[-1, 1]`; following the `align_corners=False` convention, -1 and 1 refer to the outer edges of the first and last voxels, not their centers
165
+ - `mode` (str): `'nearest'` or `'linear'`
166
+ - `padding_mode` (str): `'zeros'`, `'border'`, or `'reflection'`
167
+
168
+ **PyTorch correspondence:**
169
+
170
+ | volresample | PyTorch `F.grid_sample` |
171
+ |-------------|-------------------------|
172
+ | `mode='nearest'` | `mode='nearest'` |
173
+ | `mode='linear'` | `mode='bilinear'` |
174
+
175
+ The behavior matches PyTorch's `grid_sample` with `align_corners=False`.
176
+
177
+ **Returns:**
178
+ - Sampled array of shape `(N, C, D_out, H_out, W_out)`
179
+
180
+ ### `set_num_threads(num_threads)`
181
+
182
+ Set the number of threads used for parallel operations.
183
+
184
+ **Parameters:**
185
+ - `num_threads` (int): Number of threads to use (must be >= 1)
186
+
187
+ ### `get_num_threads()`
188
+
189
+ Get the current number of threads used for parallel operations.
190
+
191
+ **Returns:**
192
+ - Current thread count (default: `min(cpu_count, 4)`)
193
+
194
+ ## Performance
195
+
196
+ Benchmarks on an Intel i7-8565U against PyTorch 2.8.0. Times are means over 10 iterations.
197
+
198
+ **`resample()`** — single large 3D volume:
199
+
200
+ | Operation | Mode | **Single-thread** | | | **Four-threads** | | |
201
+ | ----------- | --------------- | ----------------- | ------- | :-------: | ---------------- | ------- | :------: |
202
+ | | | volresample | PyTorch | Speedup | volresample | PyTorch | Speedup |
203
+ | 512³ → 256³ | nearest | 23.6 ms | 38.0 ms | 1.6× | 12.6 ms | 16.7 ms | 1.3× |
204
+ | 512³ → 256³ | linear | 99.9 ms | 182 ms | 1.8× | 34.3 ms | 54.6 ms | 1.6× |
205
+ | 512³ → 256³ | area | 230 ms | 611 ms | 2.7× | 64.5 ms | 613 ms | **9.5×** |
206
+ | 512³ → 256³ | nearest (uint8) | 13.7 ms | 33.8 ms | 2.5× | 4.3 ms | 10.4 ms | 2.4× |
207
+ | 512³ → 256³ | nearest (int16) | 16.5 ms | 217 ms | **13.2×** | 8.4 ms | 93.2 ms | 11.2× |
208
+
209
+
210
+ **`grid_sample()`** — single large 3D volume (128³ input):
211
+
212
+ | Mode | Padding | **Single-thread** | | | **Four-threads** | | |
213
+ | ------ | ---------- | ----------------- | ------- | :-----: | ---------------- | ------- | :-----: |
214
+ | | | volresample | PyTorch | Speedup | volresample | PyTorch | Speedup |
215
+ | linear | zeros | 118 ms | 181 ms | 1.5× | 38.1 ms | 169 ms | 4.4× |
216
+ | linear | reflection | 103 ms | 211 ms | 2.1× | 33.2 ms | 194 ms | 5.9× |
217
+
218
+
219
+ Average speedup across all benchmarks: **3.1× at 1 thread**, **6.0× at 4 threads**.
220
+
221
+ **Notes:**
222
+
223
+ - **Area mode**: At 1 thread the speedup is 2.7×; at 4 threads it reaches 9.5×. PyTorch's area interpolation does not appear to parallelize over spatial dimensions for single-image workloads — its runtime is essentially unchanged between 1 and 4 threads (611 ms vs. 613 ms). volresample parallelizes along the first spatial dimension, reducing runtime from 230 ms to 65 ms with 4 threads.
224
+ - **int16**: PyTorch does not support int16 interpolation natively and requires casting to float32, processing, then casting back. volresample operates directly on int16, eliminating two full-volume type conversions. The advantage is large even at 1 thread (13.2×) and persists at 4 threads because the conversion overhead scales with data volume, not thread count.
225
+ - **Thread scaling**: For large volumes, going from 1 to 4 threads cuts volresample's wall time by about 1.9× for nearest (23.6 ms → 12.6 ms) and 2.9× for linear (99.9 ms → 34.3 ms). Grid sample scales slightly better (118 ms → 38.1 ms, about 3.1×, for linear with zeros padding), likely because per-voxel work is higher. Since PyTorch's grid sample time barely changes (181 ms → 169 ms), the speedup over PyTorch rises from 1.5× to 4.4×. PyTorch scaling is more variable overall, and negligible for area mode.
226
+ - **These are measurements** from a single machine under light load. Actual results will vary with CPU architecture, memory bandwidth, and system conditions.
227
+
228
+ ## Development
229
+
230
+ ### Running Tests
231
+
232
+ ```bash
233
+ # Run all tests
234
+ pytest tests/
235
+
236
+ # Run with PyTorch comparison tests
237
+ pip install torch
238
+ pytest tests/ -v
239
+
240
+ # Skip PyTorch tests
241
+ pytest tests/ --skip-torch
242
+ ```
243
+
244
+
245
+ ### Running Benchmarks
246
+
247
+ ```bash
248
+ # Use default threads (min of cpu_count and 4)
249
+ python tests/benchmark_resampling.py --iterations 10
250
+
251
+ # Or specify thread count
252
+ python tests/benchmark_resampling.py --threads 4 --iterations 10
253
+ ```
254
+
255
+ ### Building from Source
256
+
257
+ ```bash
258
+ pip install -e ".[dev]"
259
+ python setup.py build_ext --inplace
260
+ ```
261
+
262
+ ## License
263
+
264
+ MIT License - see [LICENSE](LICENSE) file for details.
265
+
266
+ ## Contributing
267
+
268
+ Contributions are welcome. Please submit a Pull Request.
@@ -0,0 +1,241 @@
1
+ # volresample
2
+
3
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
5
+
6
+ Fast 3D volume resampling with Cython and OpenMP parallelization.
7
+
8
+ volresample is implemented against PyTorch's `F.interpolate` and `F.grid_sample` as references and produces identical results. It can be used as a drop-in replacement when PyTorch is not available or when better CPU performance is desired.
9
+
10
+ ## Features
11
+
12
+ - Cython-optimized with OpenMP parallelization
13
+ - Simple API: `resample()` and `grid_sample()`
14
+ - Interpolation modes: nearest, linear and area
15
+ - Supports 3D, 4D (multi-channel), and 5D (batched multi-channel) volumes
16
+ - Supports uint8 and int16 (nearest mode only) and float32 (all modes)
17
+
18
+ ## Installation
19
+
20
+ ```bash
21
+ pip install volresample
22
+ ```
23
+
24
+ Or build from source:
25
+
26
+ ```bash
27
+ git clone https://github.com/JoHof/volresample.git
28
+ cd volresample
29
+ uv sync
30
+ ```
31
+
32
+ ## Quick Start
33
+
34
+ ### Basic Resampling
35
+
36
+ ```python
37
+ import numpy as np
38
+ import volresample
39
+
40
+ # Create a 3D volume
41
+ volume = np.random.rand(128, 128, 128).astype(np.float32)
42
+
43
+ # Resample to a different size
44
+ resampled = volresample.resample(volume, (64, 64, 64), mode='linear')
45
+ print(resampled.shape) # (64, 64, 64)
46
+ ```
47
+
48
+ ### Multi-Channel Volumes
49
+
50
+ ```python
51
+ # 4D volume with 4 channels
52
+ volume_4d = np.random.rand(4, 128, 128, 128).astype(np.float32)
53
+
54
+ # Resample all channels
55
+ resampled_4d = volresample.resample(volume_4d, (64, 64, 64), mode='linear')
56
+ print(resampled_4d.shape) # (4, 64, 64, 64)
57
+ ```
58
+
59
+ ### Batched Multi-Channel Volumes
60
+
61
+ ```python
62
+ # 5D volume with batch dimension (N, C, D, H, W)
63
+ volume_5d = np.random.rand(2, 4, 128, 128, 128).astype(np.float32)
64
+
65
+ # Resample all batches and channels
66
+ resampled_5d = volresample.resample(volume_5d, (64, 64, 64), mode='linear')
67
+ print(resampled_5d.shape) # (2, 4, 64, 64, 64)
68
+ ```
69
+
70
+ ### Grid Sampling
71
+
72
+ ```python
73
+ # Input volume: (N, C, D, H, W)
74
+ input = np.random.rand(2, 3, 32, 32, 32).astype(np.float32)
75
+
76
+ # Sampling grid with normalized coordinates in [-1, 1]
77
+ grid = np.random.uniform(-1, 1, (2, 24, 24, 24, 3)).astype(np.float32)
78
+
79
+ # Sample with linear interpolation
80
+ output = volresample.grid_sample(input, grid, mode='linear', padding_mode='zeros')
81
+ print(output.shape) # (2, 3, 24, 24, 24)
82
+ ```
83
+
84
+ ### Parallelization
85
+
86
+ ```python
87
+ import volresample
88
+
89
+ # Check default thread count (min of cpu_count and 4)
90
+ print(volresample.get_num_threads()) # e.g., 4
91
+
92
+ # Set custom thread count
93
+ volresample.set_num_threads(8)
94
+
95
+ # All subsequent operations use 8 threads (`volume` as defined under Basic Resampling)
96
+ resampled = volresample.resample(volume, (64, 64, 64), mode='linear')
97
+ ```
98
+
99
+ ## API Reference
100
+
101
+ ### `resample(data, size, mode='linear')`
102
+
103
+ Resample a 3D, 4D, or 5D volume to a new size.
104
+
105
+ **Parameters:**
106
+ - `data` (ndarray): Input volume of shape `(D, H, W)`, `(C, D, H, W)`, or `(N, C, D, H, W)`
107
+ - `size` (tuple): Target size `(D_out, H_out, W_out)`
108
+ - `mode` (str): Interpolation mode:
109
+   - `'nearest'`: Nearest neighbor (works with all dtypes)
110
+   - `'linear'`: Trilinear interpolation (float32 only)
111
+   - `'area'`: Area-based averaging (float32 only, suited for downsampling)
112
+
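The `'area'` mode is described above as area-based averaging. PyTorch documents its own `mode='area'` as equivalent to adaptive average pooling, so a one-dimensional NumPy sketch of that rule may help show which input voxels each output voxel averages. The helper name `area_downsample_1d` is hypothetical and not part of volresample; the actual Cython implementation may be organized differently.

```python
import numpy as np


def area_downsample_1d(x, out_size):
    """Average each output bin over the input samples it covers.

    Illustrative only: follows the adaptive-average-pooling bin rule
    start = floor(i * in / out), end = ceil((i + 1) * in / out).
    """
    in_size = x.shape[0]
    out = np.empty(out_size, dtype=np.float32)
    for i in range(out_size):
        start = (i * in_size) // out_size
        # Integer ceiling division without going through floats
        end = -((-(i + 1) * in_size) // out_size)
        out[i] = x[start:end].mean()
    return out


signal = np.arange(8, dtype=np.float32)
halved = area_downsample_1d(signal, 4)
```

For an integer downsampling factor the bins do not overlap and the rule reduces to a plain block mean; for non-integer factors neighbouring bins share a sample.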
113
+ **PyTorch correspondence:**
114
+
115
+ | volresample | PyTorch `F.interpolate` |
116
+ |-------------|-------------------------|
117
+ | `mode='nearest'` | `mode='nearest-exact'` |
118
+ | `mode='linear'` | `mode='trilinear'` |
119
+ | `mode='area'` | `mode='area'` |
120
+
121
+ volresample does not expose an `align_corners` parameter. The behavior matches PyTorch's `align_corners=False` (the default).
122
+
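To make the `align_corners=False` convention concrete, the sketch below computes, for one axis, which input coordinate each output index reads from. It follows the half-pixel formulas PyTorch uses for `trilinear` and `nearest-exact`; the function names are hypothetical, and this is an assumption about the convention rather than a copy of volresample's code.

```python
import numpy as np


def linear_source_coords(in_size, out_size):
    """Fractional input coordinate for each output index (align_corners=False)."""
    scale = in_size / out_size
    src = (np.arange(out_size) + 0.5) * scale - 0.5
    # Coordinates outside the volume are clamped, so border voxels are repeated
    return np.clip(src, 0.0, in_size - 1)


def nearest_exact_source_index(in_size, out_size):
    """Input index selected for each output index in 'nearest-exact' style."""
    scale = in_size / out_size
    idx = np.floor((np.arange(out_size) + 0.5) * scale).astype(np.int64)
    return np.minimum(idx, in_size - 1)


down_linear = linear_source_coords(4, 2)
down_nearest = nearest_exact_source_index(4, 2)
up_linear = linear_source_coords(2, 4)
```

Linear interpolation then blends the two integer neighbours of each fractional coordinate, applied independently along D, H, and W.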
123
+ **Returns:**
124
+ - Resampled array with same number of dimensions as input
125
+
126
+ **Supported Dtypes:**
127
+ - `uint8`, `int16`: Only with `mode='nearest'`
128
+ - `float32`: All modes
129
+
130
+ ### `grid_sample(input, grid, mode='linear', padding_mode='zeros')`
131
+
132
+ Sample input at arbitrary locations specified by a grid.
133
+
134
+ **Parameters:**
135
+ - `input` (ndarray): Input volume of shape `(N, C, D, H, W)`
136
+ - `grid` (ndarray): Sampling grid of shape `(N, D_out, H_out, W_out, 3)`
137
+   - Values in range `[-1, 1]`; following the `align_corners=False` convention, -1 and 1 refer to the outer edges of the first and last voxels, not their centers
138
+ - `mode` (str): `'nearest'` or `'linear'`
139
+ - `padding_mode` (str): `'zeros'`, `'border'`, or `'reflection'`
140
+
141
+ **PyTorch correspondence:**
142
+
143
+ | volresample | PyTorch `F.grid_sample` |
144
+ |-------------|-------------------------|
145
+ | `mode='nearest'` | `mode='nearest'` |
146
+ | `mode='linear'` | `mode='bilinear'` |
147
+
148
+ The behavior matches PyTorch's `grid_sample` with `align_corners=False`.
149
+
150
+ **Returns:**
151
+ - Sampled array of shape `(N, C, D_out, H_out, W_out)`
152
+
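Because the grid uses normalized coordinates with `align_corners=False` semantics, it can help to see how a value in `[-1, 1]` maps to a voxel index. The sketch below shows that mapping and builds an identity grid that samples every voxel centre. Both helper names are hypothetical, and the `(x, y, z)` ordering of the last grid axis is assumed from PyTorch's `grid_sample`; it is not stated in this README.

```python
import numpy as np


def unnormalize(coord, size):
    """Map normalized coordinates in [-1, 1] to voxel indices (align_corners=False)."""
    return ((coord + 1.0) * size - 1.0) / 2.0


def identity_grid(depth, height, width):
    """Build a (1, D, H, W, 3) grid whose points are the voxel centres.

    Assumes the last axis is ordered (x, y, z) = (W, H, D), as in PyTorch.
    """
    def centers(n):
        # Normalized coordinate of the centre of voxel i along an axis of length n
        return (2.0 * np.arange(n) + 1.0) / n - 1.0

    z, y, x = np.meshgrid(centers(depth), centers(height), centers(width), indexing="ij")
    return np.stack([x, y, z], axis=-1)[np.newaxis].astype(np.float32)


grid = identity_grid(2, 3, 4)
```

Under these assumptions, -1 maps to index -0.5 and 1 maps to index `size - 0.5`, which is why the extremes land on the outer voxel edges rather than on voxel centres.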
153
+ ### `set_num_threads(num_threads)`
154
+
155
+ Set the number of threads used for parallel operations.
156
+
157
+ **Parameters:**
158
+ - `num_threads` (int): Number of threads to use (must be >= 1)
159
+
160
+ ### `get_num_threads()`
161
+
162
+ Get the current number of threads used for parallel operations.
163
+
164
+ **Returns:**
165
+ - Current thread count (default: `min(cpu_count, 4)`)
166
+
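The documented default of `min(cpu_count, 4)` can be reproduced with the standard library, which is handy for predicting what `get_num_threads()` should report on a given machine. This is a sketch of the stated rule only; how volresample determines the CPU count internally is not shown here, and `default_num_threads` is a hypothetical name.

```python
import os


def default_num_threads(cap=4):
    """Return min(cpu_count, cap), mirroring the documented default."""
    # os.cpu_count() may return None when the count cannot be determined
    cpu_count = os.cpu_count() or 1
    return min(cpu_count, cap)


expected_default = default_num_threads()
```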
167
+ ## Performance
168
+
169
+ Benchmarks on an Intel i7-8565U against PyTorch 2.8.0. Times are means over 10 iterations.
170
+
171
+ **`resample()`** — single large 3D volume:
172
+
173
+ | Operation | Mode | **Single-thread** | | | **Four-threads** | | |
174
+ | ----------- | --------------- | ----------------- | ------- | :-------: | ---------------- | ------- | :------: |
175
+ | | | volresample | PyTorch | Speedup | volresample | PyTorch | Speedup |
176
+ | 512³ → 256³ | nearest | 23.6 ms | 38.0 ms | 1.6× | 12.6 ms | 16.7 ms | 1.3× |
177
+ | 512³ → 256³ | linear | 99.9 ms | 182 ms | 1.8× | 34.3 ms | 54.6 ms | 1.6× |
178
+ | 512³ → 256³ | area | 230 ms | 611 ms | 2.7× | 64.5 ms | 613 ms | **9.5×** |
179
+ | 512³ → 256³ | nearest (uint8) | 13.7 ms | 33.8 ms | 2.5× | 4.3 ms | 10.4 ms | 2.4× |
180
+ | 512³ → 256³ | nearest (int16) | 16.5 ms | 217 ms | **13.2×** | 8.4 ms | 93.2 ms | 11.2× |
181
+
182
+
183
+ **`grid_sample()`** — single large 3D volume (128³ input):
184
+
185
+ | Mode | Padding | **Single-thread** | | | **Four-threads** | | |
186
+ | ------ | ---------- | ----------------- | ------- | :-----: | ---------------- | ------- | :-----: |
187
+ | | | volresample | PyTorch | Speedup | volresample | PyTorch | Speedup |
188
+ | linear | zeros | 118 ms | 181 ms | 1.5× | 38.1 ms | 169 ms | 4.4× |
189
+ | linear | reflection | 103 ms | 211 ms | 2.1× | 33.2 ms | 194 ms | 5.9× |
190
+
191
+
192
+ Average speedup across all benchmarks: **3.1× at 1 thread**, **6.0× at 4 threads**.
193
+
194
+ **Notes:**
195
+
196
+ - **Area mode**: At 1 thread the speedup is 2.7×; at 4 threads it reaches 9.5×. PyTorch's area interpolation does not appear to parallelize over spatial dimensions for single-image workloads — its runtime is essentially unchanged between 1 and 4 threads (611 ms vs. 613 ms). volresample parallelizes along the first spatial dimension, reducing runtime from 230 ms to 65 ms with 4 threads.
197
+ - **int16**: PyTorch does not support int16 interpolation natively and requires casting to float32, processing, then casting back. volresample operates directly on int16, eliminating two full-volume type conversions. The advantage is large even at 1 thread (13.2×) and persists at 4 threads because the conversion overhead scales with data volume, not thread count.
198
+ - **Thread scaling**: For large volumes, going from 1 to 4 threads cuts volresample's wall time by about 1.9× for nearest (23.6 ms → 12.6 ms) and 2.9× for linear (99.9 ms → 34.3 ms). Grid sample scales slightly better (118 ms → 38.1 ms, about 3.1×, for linear with zeros padding), likely because per-voxel work is higher. Since PyTorch's grid sample time barely changes (181 ms → 169 ms), the speedup over PyTorch rises from 1.5× to 4.4×. PyTorch scaling is more variable overall, and negligible for area mode.
199
+ - **These are measurements** from a single machine under light load. Actual results will vary with CPU architecture, memory bandwidth, and system conditions.
200
+
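The notes above mix two different ratios: speedup over PyTorch at a fixed thread count, and scaling of one library from 1 to 4 threads. The short sketch below recomputes both from the `resample()` timings in the table; the dictionary layout and function names are illustrative and not part of the benchmark script.

```python
# Mean times in ms from the resample() table: (1 thread, 4 threads)
times_ms = {
    "nearest": {"volresample": (23.6, 12.6), "pytorch": (38.0, 16.7)},
    "linear": {"volresample": (99.9, 34.3), "pytorch": (182.0, 54.6)},
    "area": {"volresample": (230.0, 64.5), "pytorch": (611.0, 613.0)},
}


def speedup_over_pytorch(mode, column):
    """PyTorch time divided by volresample time (column 0 = 1 thread, 1 = 4 threads)."""
    entry = times_ms[mode]
    return entry["pytorch"][column] / entry["volresample"][column]


def thread_scaling(mode, library):
    """1-thread time divided by 4-thread time for a single library."""
    one_thread, four_threads = times_ms[mode][library]
    return one_thread / four_threads


for mode in times_ms:
    print(
        f"{mode}: {speedup_over_pytorch(mode, 0):.1f}x vs PyTorch at 1 thread, "
        f"{speedup_over_pytorch(mode, 1):.1f}x at 4 threads, "
        f"volresample scaling {thread_scaling(mode, 'volresample'):.1f}x"
    )
```

Keeping the two ratios separate makes it clear that the large 4-thread area speedup comes mostly from PyTorch's time staying flat while volresample's drops.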
201
+ ## Development
202
+
203
+ ### Running Tests
204
+
205
+ ```bash
206
+ # Run all tests
207
+ pytest tests/
208
+
209
+ # Run with PyTorch comparison tests
210
+ pip install torch
211
+ pytest tests/ -v
212
+
213
+ # Skip PyTorch tests
214
+ pytest tests/ --skip-torch
215
+ ```
216
+
217
+
218
+ ### Running Benchmarks
219
+
220
+ ```bash
221
+ # Use default threads (min of cpu_count and 4)
222
+ python tests/benchmark_resampling.py --iterations 10
223
+
224
+ # Or specify thread count
225
+ python tests/benchmark_resampling.py --threads 4 --iterations 10
226
+ ```
227
+
228
+ ### Building from Source
229
+
230
+ ```bash
231
+ pip install -e ".[dev]"
232
+ python setup.py build_ext --inplace
233
+ ```
234
+
235
+ ## License
236
+
237
+ MIT License - see [LICENSE](LICENSE) file for details.
238
+
239
+ ## Contributing
240
+
241
+ Contributions are welcome. Please submit a Pull Request.