patchworks-0.2.0-py3-none-any.whl

@@ -0,0 +1,294 @@
1
+ Metadata-Version: 2.4
2
+ Name: patchworks
3
+ Version: 0.2.0
4
+ Summary: Tiled processing of arbitrarily large images with globally consistent labels
5
+ Project-URL: Homepage, https://github.com/imcf/patchworks
6
+ Project-URL: Issues, https://github.com/imcf/patchworks/issues
7
+ Author: IMCF Basel
8
+ License: MIT
9
+ Keywords: bioimage,chunked,dask,image processing,segmentation,tiling,zarr
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: License :: OSI Approved :: MIT License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.9
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Topic :: Scientific/Engineering :: Image Processing
19
+ Requires-Python: >=3.9
20
+ Requires-Dist: dask[array]>=2023.1.0
21
+ Requires-Dist: numpy>=1.24
22
+ Requires-Dist: scipy>=1.9
23
+ Requires-Dist: zarr>=2.14
24
+ Provides-Extra: all
25
+ Requires-Dist: nvidia-ml-py; extra == 'all'
26
+ Requires-Dist: psutil; extra == 'all'
27
+ Requires-Dist: scikit-image; extra == 'all'
28
+ Requires-Dist: tqdm; extra == 'all'
29
+ Provides-Extra: cellpose
30
+ Requires-Dist: cellpose>=3.0; extra == 'cellpose'
31
+ Provides-Extra: dev
32
+ Requires-Dist: psutil; extra == 'dev'
33
+ Requires-Dist: pytest; extra == 'dev'
34
+ Requires-Dist: pytest-cov; extra == 'dev'
35
+ Requires-Dist: scikit-image; extra == 'dev'
36
+ Requires-Dist: tqdm; extra == 'dev'
37
+ Provides-Extra: docs
38
+ Requires-Dist: mkdocs-material>=9.0; extra == 'docs'
39
+ Requires-Dist: mkdocstrings[python]>=0.24; extra == 'docs'
40
+ Provides-Extra: gpu
41
+ Requires-Dist: nvidia-ml-py; extra == 'gpu'
42
+ Provides-Extra: io
43
+ Requires-Dist: psutil; extra == 'io'
44
+ Requires-Dist: tqdm; extra == 'io'
45
+ Description-Content-Type: text/markdown
46
+
47
+ # patchworks
48
+
49
+ > Tiled processing of arbitrarily large images — any image, any function.
50
+
51
+ ```
52
+ ┌──────┬──────┬──────┐   fn(tile) → labels    ┌──────┬──────┬──────┐
53
+ │ tile │ tile │ tile │ ─────────────────────► │  1   │  2   │  3   │
54
+ ├──────┼──────┼──────┤                        ├──────┼──────┼──────┤
55
+ │ tile │ tile │ tile │                        │  4   │  5   │  6   │  globally
56
+ ├──────┼──────┼──────┤                        ├──────┼──────┼──────┤  consistent
57
+ │ tile │ tile │ tile │                        │  7   │  8   │  9   │  labels
58
+ └──────┴──────┴──────┘                        └──────┴──────┴──────┘
59
+ ```
60
+
61
+ patchworks splits a large image into tiles, runs **any callable** on each
62
+ tile in parallel, and merges the results into a globally consistent label array.
63
+ It handles terabyte-scale images without loading them into memory.
64
+
65
+ ---
66
+
67
+ ## Installation
68
+
69
+ ```bash
70
+ pip install patchworks
71
+ ```
72
+
73
+ Optional extras:
74
+
75
+ ```bash
76
+ pip install "patchworks[gpu]" # GPU VRAM querying (nvidia-ml-py)
77
+ pip install "patchworks[cellpose]" # Cellpose plugin
78
+ pip install "patchworks[all]" # nvidia-ml-py, psutil, scikit-image, tqdm (Cellpose is not included)
79
+ ```
80
+
81
+ ---
82
+
83
+ ## Quick start — 5 lines
84
+
85
+ ```python
86
+ from patchworks import tile_process
87
+
88
+ def my_fn(tile):
89
+     from skimage.filters import threshold_otsu
90
+     from skimage.measure import label
91
+     return label(tile > threshold_otsu(tile)).astype("int32")
92
+
93
+ result = tile_process("image.zarr", my_fn, compute=True)
94
+ ```
95
+
96
+ Done. `result` is a NumPy array of integer labels, same spatial shape as the
97
+ input, with globally unique IDs across all tiles.
98
+
99
+ ---
100
+
101
+ ## With Cellpose
102
+
103
+ ```python
104
+ from patchworks import tile_process
105
+ from patchworks.plugins.cellpose import cellpose_fn
106
+
107
+ fn = cellpose_fn("cyto3", gpu=True, diameter=30)
108
+
109
+ tile_process(
110
+     "image.zarr", fn,
111
+     tile_shape=(1, 2048, 2048),  # one z-slice per tile
112
+     overlap=20,                  # gives boundary cells enough context
113
+     write_to="labels.zarr",      # stream directly to disk; no RAM accumulation
114
+     progress=True,
115
+ )
116
+ ```
117
+
118
+ ---
119
+
120
+ ## With StarDist
121
+
122
+ ```python
123
+ from stardist.models import StarDist2D
124
+ from patchworks import tile_process
125
+
126
+ model = StarDist2D.from_pretrained("2D_versatile_fluo")
127
+
128
+ def stardist_fn(tile):
129
+     img = tile[0] if tile.ndim == 3 and tile.shape[0] == 1 else tile
130
+     norm = img.astype("float32") / (img.max() or 1)
131
+     labels, _ = model.predict_instances(norm)
132
+     return labels.astype("int32")[None] if tile.ndim == 3 else labels.astype("int32")
133
+
134
+ tile_process("image.zarr", stardist_fn,
135
+              tile_shape=(1, 1024, 1024), overlap=32,
136
+              write_to="labels.zarr", progress=True)
137
+ ```
138
+
139
+ ---
140
+
141
+ ## With any function
142
+
143
+ ```python
144
+ import numpy as np
145
+ from scipy.ndimage import gaussian_filter
146
+ from skimage.measure import label
147
+ from patchworks import tile_process
148
+
149
+ def my_custom_fn(tile: np.ndarray) -> np.ndarray:
150
+     smoothed = gaussian_filter(tile.astype("float32"), sigma=1.5)
151
+     binary = smoothed > smoothed.mean()
152
+     return label(binary).astype("int32")
153
+
154
+ tile_process("image.zarr", my_custom_fn, tile_shape=(1, 512, 512))
155
+ ```
156
+
157
+ ---
158
+
159
+ ## Common patterns
160
+
161
+ ### Auto-size tiles from available memory
162
+
163
+ ```python
164
+ from patchworks import tile_process
165
+
166
+ tile_process("image.zarr", fn, tile_shape="auto", use_gpu=True)
167
+ ```
168
+
169
+ ### Skip empty tiles (sparse volumes)
170
+
171
+ ```python
172
+ from patchworks import estimate_empty_tiles, tile_process
173
+
174
+ info = estimate_empty_tiles("image.zarr", tile_shape=(120, 697, 697))
175
+ print(f"{info['empty_fraction']:.0%} tiles are background — will be skipped")
176
+
177
+ tile_process("image.zarr", fn,
178
+              tile_shape=(120, 697, 697),
179
+              skip_empty=True,
180
+              empty_threshold=info["threshold"],
181
+              write_to="labels.zarr")
182
+ ```
183
+
184
+ ### Distributed cluster for GPU
185
+
186
+ ```python
187
+ from patchworks import make_local_cluster, tile_process
188
+
189
+ client, cluster = make_local_cluster(use_gpu=True)
190
+ try:
191
+     tile_process("image.zarr", fn, write_to="labels.zarr", progress=True)
192
+ finally:
193
+     client.close(); cluster.close()
194
+ ```
195
+
196
+ ### Contiguous label numbering
197
+
198
+ ```python
199
+ # Labels are globally unique by default, but may be gappy (block-encoded IDs).
200
+ # sequential_labels=True does a linear relabel O(voxels) — not O(n_tiles²).
201
+ tile_process("image.zarr", fn,
202
+              write_to="labels.zarr",
203
+              sequential_labels=True)
204
+ ```
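
A linear relabel of this kind can be sketched with `np.unique` and a sorted-ID lookup. This is a minimal in-memory illustration, not the code behind `sequential_labels=True`; `relabel_sequential` is a hypothetical helper name, and the example IDs are made up to look block-encoded.

```python
import numpy as np


def relabel_sequential(labels):
    """Map arbitrary positive label IDs to 1..N, keeping 0 as background.

    Hypothetical helper for illustration only.
    """
    ids = np.unique(labels)        # sorted unique IDs present in the array
    ids = ids[ids != 0]            # 0 stays background
    # Rank of each voxel's ID in the sorted ID list, shifted so labels start at 1.
    new = np.searchsorted(ids, labels) + 1
    new[labels == 0] = 0
    return new.astype(np.int32), len(ids)


# Gappy, globally unique IDs (made-up values).
labels = np.array([[0, 1048577, 1048577],
                   [2097153, 0, 1048580]])
new, count = relabel_sequential(labels)
```

The sorted `ids` array acts as a sparse lookup table, so memory scales with the number of objects rather than with the largest ID. No per-tile graph is built at any point.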
205
+
206
+ ### Use only the merge step (bring your own tiling)
207
+
208
+ If you already have per-tile labels from your own pipeline, just call the
209
+ merge step directly:
210
+
211
+ ```python
212
+ import dask.array as da
213
+ import numpy as np
214
+ from patchworks import merge_tile_labels
215
+
216
+ # Your own tiling + segmentation
217
+ image = da.from_zarr("image.zarr").rechunk((1, 1024, 1024))
218
+ labeled = image.map_blocks(my_segment_fn, dtype="int32",
219
+                            meta=np.empty((0,) * image.ndim, dtype="int32"))
220
+
221
+ merged = merge_tile_labels(labeled, write_to="labels.zarr", progress=True)
222
+ ```
223
+
224
+ Or merge from a zarr store your pipeline already wrote:
225
+
226
+ ```python
227
+ from patchworks import merge_tile_labels
228
+
229
+ merged = merge_tile_labels(
230
+     "my_staged_labels.zarr",
231
+     input_component="raw_labels",
232
+     write_to="merged.zarr",
233
+     sequential_labels=True,
234
+ )
235
+ ```
236
+
237
+ ---
238
+
239
+ ## How tiling and merging work
240
+
241
+ See [docs/how-it-works.md](docs/how-it-works.md) for a full explanation.
242
+ Short version:
243
+
244
+ 1. Image is split into tiles (with optional overlap for boundary context).
245
+ 2. Your function is called independently on each tile. Dask handles parallelism
246
+ and streaming — tiles are never all in memory at once.
247
+ 3. Each tile's labels are written to a temp zarr exactly once (the staging
248
+ step — this prevents your function being called 3-4× per tile during merge).
249
+ 4. Thin slabs at each tile boundary are scanned for touching label pairs.
250
+ 5. scipy connected components on the pairs → relabeling lookup table.
251
+ 6. LUT applied to every tile in parallel → globally consistent labels.
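
Steps 1 and 2 above can be sketched in plain NumPy. This is an illustrative sketch under stated assumptions, not patchworks' implementation: `iter_tiles` and `process_tiled` are hypothetical helpers, the loop is serial and in-memory (patchworks relies on Dask for parallelism and streaming), and the labels it returns are still local to each tile.

```python
import numpy as np


def iter_tiles(shape, tile_shape, overlap):
    """Yield (core, padded) slice tuples for a regular tile grid.

    core   -> region of the output this tile is responsible for
    padded -> core grown by `overlap` voxels per side, clipped to the image
    """
    grid = [range(0, s, t) for s, t in zip(shape, tile_shape)]
    for index in np.ndindex(*[len(g) for g in grid]):
        core, padded = [], []
        for axis, i in enumerate(index):
            lo = grid[axis][i]
            hi = min(lo + tile_shape[axis], shape[axis])
            core.append(slice(lo, hi))
            padded.append(slice(max(lo - overlap, 0), min(hi + overlap, shape[axis])))
        yield tuple(core), tuple(padded)


def process_tiled(image, fn, tile_shape, overlap=0):
    """Run fn on each padded tile and keep only the core of its result."""
    out = np.zeros(image.shape, dtype=np.int32)
    for core, padded in iter_tiles(image.shape, tile_shape, overlap):
        result = fn(image[padded])
        # Position of the core region inside the padded tile.
        trim = tuple(
            slice(c.start - p.start, c.stop - p.start) for c, p in zip(core, padded)
        )
        out[core] = result[trim]
    return out


# Usage: a 10x10 image cut into 4x4 tiles with 2 voxels of context per side.
image = np.arange(100).reshape(10, 10)
out = process_tiled(image, lambda t: (t > 50).astype(np.int32), (4, 4), overlap=2)
```

The overlap only gives `fn` extra context near tile edges; each output voxel is still written by exactly one tile.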
252
+
253
+ The merge is **zarr-native** (no dask task graph), so it scales to thousands of
254
+ tiles where the dask-image approach stalls.
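
The boundary scan, connected components, and lookup-table steps (4 to 6) can be illustrated on a small in-memory array. This is a sketch only: `boundary_pairs` and `merge_lut` are hypothetical names, it assumes labels are already globally unique across tiles with 0 as background, and it works on a NumPy array rather than on zarr chunks as the real merge does.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components


def boundary_pairs(labels, tile_shape):
    """Collect (a, b) label pairs that touch across any tile boundary."""
    pairs = []
    for axis, step in enumerate(tile_shape):
        for cut in range(step, labels.shape[axis], step):
            # One-voxel-thick slabs on either side of the boundary.
            left = np.take(labels, cut - 1, axis=axis).ravel()
            right = np.take(labels, cut, axis=axis).ravel()
            touching = (left > 0) & (right > 0) & (left != right)
            pairs.append(np.stack([left[touching], right[touching]], axis=1))
    stacked = np.concatenate(pairs) if pairs else np.empty((0, 2), dtype=labels.dtype)
    return np.unique(stacked, axis=0) if len(stacked) else stacked


def merge_lut(pairs, n_labels):
    """Build a lookup table that maps every label to one ID per component."""
    size = n_labels + 1
    graph = coo_matrix(
        (np.ones(len(pairs), dtype=np.int8), (pairs[:, 0], pairs[:, 1])),
        shape=(size, size),
    )
    _, component = connected_components(graph, directed=False)
    # Represent each component by its smallest member label; 0 stays background.
    smallest = np.full(size, size, dtype=np.int64)
    np.minimum.at(smallest, component, np.arange(size))
    return smallest[component]


# Two 4x4 tiles side by side; one object straddles the boundary at column 4.
labels = np.zeros((4, 8), dtype=np.int64)
labels[1:3, 2:4] = 1   # fragment in the left tile
labels[1:3, 4:6] = 2   # fragment of the same object in the right tile
labels[0, 7] = 3       # unrelated object in the right tile

pairs = boundary_pairs(labels, tile_shape=(4, 4))
lut = merge_lut(pairs, n_labels=int(labels.max()))
merged = lut[labels]   # labels 1 and 2 now share ID 1; label 3 is unchanged
```

Only the thin boundary slabs are read to build the table, and applying it is an independent per-tile indexing operation, which is what allows the final step to run tile by tile.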
255
+
256
+ ---
257
+
258
+ ## Known pitfalls (and how patchworks avoids them)
259
+
260
+ | Pitfall | Symptom | How patchworks handles it |
261
+ |---|---|---|
262
+ | In-process Dask client | `FutureCancelledError: lost dependencies` | Detected at startup, raises immediately with fix instructions |
263
+ | 3-4× fn recompute during merge | Cellpose runs 3× per tile | Staging writes labels once, merge reads from disk |
264
+ | O(n²) sequential relabelling | Graph construction hangs at 1000+ tiles | Linear post-pass O(voxels) via `np.unique` + LUT |
265
+ | Wrong overlap boundary | Output shape mismatch | Always uses `boundary="none"` |
266
+ | Persisting large arrays | Worker OOM | Never persists; keeps dask graph lazy and streams |
267
+
268
+ ---
269
+
270
+ ## Documentation
271
+
272
+ - [Quick Start](docs/quickstart.md)
273
+ - [API Reference](docs/api-reference.md)
274
+ - [How It Works](docs/how-it-works.md)
275
+ - [Examples](docs/examples/)
276
+
277
+ ---
278
+
279
+ ## Requirements
280
+
281
+ - Python ≥ 3.9
282
+ - dask[array], numpy, zarr, scipy
283
+
284
+ Optional:
285
+ - `psutil` — accurate RAM sizing for `tile_shape="auto"`
286
+ - `nvidia-ml-py` — accurate GPU VRAM sizing
287
+ - `tqdm` — progress bars
288
+ - `cellpose` — Cellpose plugin
289
+
290
+ ---
291
+
292
+ ## License
293
+
294
+ MIT
@@ -0,0 +1,12 @@
1
+ patchworks/__init__.py,sha256=3aSxbe0hGP13pLiQRoQhd2W5whH8cSAHUoviuntjrvw,1516
2
+ patchworks/_chunks.py,sha256=hkmAqirC9rO2rvdrLVrm0ROnbgwfVVPA9j0Vf-tT4m8,8368
3
+ patchworks/_cluster.py,sha256=T5MuhKqv0ScjhKkTcPzjpaNwUc6CsT4ouQP8keRHru0,2828
4
+ patchworks/_core.py,sha256=-Roe95H7wa6SxpRsjGzusdXmfRwsQWQjUUNvwVCmuPA,14332
5
+ patchworks/_io.py,sha256=fZvlPSot2JHK7Fi7_8NPvfJoJroBWFOOBYxOJubwBUI,7548
6
+ patchworks/_merge.py,sha256=LHEqt2AYqX9z1YKROc-ZBaFsubjiMKl-oxsJF_V1IR4,15529
7
+ patchworks/_relabel.py,sha256=xFC3OiB1_yxwcepdf_h0V4M1Bml8kTDiQmvgI1rffZc,3262
8
+ patchworks/plugins/__init__.py,sha256=NBc7lisET3JpK2SjtC6EA72z8kKLigq4NWZM0cXtNis,21
9
+ patchworks/plugins/cellpose.py,sha256=EcC56x-RrDtuGhY0v7ZmOCoTn7A1lqgDfwX7i9iHj1Q,6058
10
+ patchworks-0.2.0.dist-info/METADATA,sha256=S29Ey54jVC1YHESf28FhNV0NG-F0b425touWRYG09Z0,9253
11
+ patchworks-0.2.0.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
12
+ patchworks-0.2.0.dist-info/RECORD,,
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: hatchling 1.30.1
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any