cfdb 0.1.0__tar.gz → 0.1.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
cfdb-0.1.1/PKG-INFO ADDED
@@ -0,0 +1,204 @@
1
+ Metadata-Version: 2.4
2
+ Name: cfdb
3
+ Version: 0.1.1
4
+ Summary: CF conventions multi-dimensional array storage on top of Booklet
5
+ Project-URL: Documentation, https://mullenkamp.github.io/cfdb/
6
+ Project-URL: Source, https://github.com/mullenkamp/cfdb
7
+ Author-email: mullenkamp <mullenkamp1@gmail.com>
8
+ License-File: LICENSE
9
+ Classifier: Programming Language :: Python :: 3 :: Only
10
+ Requires-Python: >=3.10
11
+ Requires-Dist: booklet>=0.9.2
12
+ Requires-Dist: cftime
13
+ Requires-Dist: lz4
14
+ Requires-Dist: msgspec
15
+ Requires-Dist: numpy
16
+ Requires-Dist: rechunkit>=0.1.0
17
+ Requires-Dist: zstandard
18
+ Provides-Extra: ebooklet
19
+ Requires-Dist: ebooklet>=0.5.10; extra == 'ebooklet'
20
+ Provides-Extra: netcdf4
21
+ Requires-Dist: h5netcdf; extra == 'netcdf4'
22
+ Description-Content-Type: text/markdown
23
+
24
+ # cfdb
25
+
26
+ <p align="center">
27
+ <em>CF conventions multi-dimensional array storage on top of Booklet</em>
28
+ </p>
29
+
30
+ [![build](https://github.com/mullenkamp/cfdb/workflows/Build/badge.svg)](https://github.com/mullenkamp/cfdb/actions)
31
+ [![codecov](https://codecov.io/gh/mullenkamp/cfdb/branch/master/graph/badge.svg)](https://codecov.io/gh/mullenkamp/cfdb)
32
+ [![PyPI version](https://badge.fury.io/py/cfdb.svg)](https://badge.fury.io/py/cfdb)
33
+
34
+ ---
35
+
36
+ **Source Code**: <a href="https://github.com/mullenkamp/cfdb" target="_blank">https://github.com/mullenkamp/cfdb</a>
37
+
38
+ ---
39
+ ## Introduction
40
+ cfdb is a pure Python database for managing labeled multi-dimensional arrays that mostly follows the [CF conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html). It is an alternative to netcdf4 and [xarray](https://docs.xarray.dev/). It builds upon [Booklet](https://github.com/mullenkamp/booklet) for the underlying local file storage and [EBooklet](https://github.com/mullenkamp/ebooklet) to sync and share on any S3 system. It has been designed to follow the programming style of opening a file, iteratively reading data, iteratively writing data, then closing the file.
41
+ It is thread-safe on reads and writes (using thread locks) and multiprocessing-safe (using file locks), including on the S3 remote (using object locking).
42
+
43
+ When an error occurs, cfdb will try to properly close the file and remove the file (object) locks. This will not sync any changes, so the user will lose any changes that were not synced. There will be circumstances in which the file cannot be closed properly, so care still needs to be taken.
44
+
45
+
46
+ ## Installation
47
+
48
+ Install via pip:
49
+
50
+ ```
51
+ pip install cfdb
52
+ ```
53
+
54
+ I'll probably put it on conda-forge once I feel appropriately motivated...
55
+
56
+ ## Usage
57
+ ### Opening a file/dataset
58
+ Usage starts off by opening the file (and closing the file when done):
59
+ ```python
60
+ import cfdb
61
+ import numpy as np
62
+
63
+ file_path = '/path/to/file.cfdb'
64
+
65
+ ds = cfdb.open_dataset(file_path, flag='n')
66
+ # Do fancy stuff
67
+ ds.close()
68
+ ```
69
+
70
+ By default, files are opened read-only, so we need to specify that we want to write (in this case, 'n' opens the file for writing and replaces any existing file with a new one). There are also some compression options, described in the doc strings. Other kwargs from [Booklet](https://github.com/mullenkamp/booklet?tab=readme-ov-file#usage) can be passed to open_dataset.
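+
+ As a rough sketch of the common flags (following Booklet's dbm-style flags; check the doc strings for the authoritative list):
+ ```python
+ ds = cfdb.open_dataset(file_path)            # flag='r' (default): read-only
+ ds = cfdb.open_dataset(file_path, flag='w')  # read/write an existing file
+ ds = cfdb.open_dataset(file_path, flag='n')  # create a new file, replacing any existing one
+ ```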
71
+
72
+ The dataset can also be opened with the context manager like so:
73
+ ```python
74
+ with cfdb.open_dataset(file_path, flag='n') as ds:
75
+ print(ds)
76
+ ```
77
+ This is generally encouraged, as it ensures that the file is closed properly and the file locks are removed.
78
+
79
+ ### Variables
80
+ In the [CF conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html#dimensions), variables are the objects that store data. These can be one-dimensional or multi-dimensional. The dimensions are labeled by 1-D variables (like latitude or time); these 1-D variables are called coordinate variables (or coordinates) and share the name of their associated dimension. All variables that use these coordinates as their dimension labels are called data variables. The combination of multiple data variables with their coordinates in a single file is called a dataset.
81
+
82
+ #### Coordinates
83
+ Since all data variables must have coordinates, the coordinates must be created before data variables are created.
84
+
85
+ Coordinates in cfdb are closer to the definition in the earlier [COARDS conventions](https://ferret.pmel.noaa.gov/Ferret/documentation/coards-netcdf-conventions) than in the later CF conventions. Coordinate values must be unique, sorted in ascending order (partly a consequence of using np.sort), and cannot contain null (np.nan) values. The CF conventions do not impose these limitations, but they are good ones! Coordinates must also be 1-D only.
86
+
87
+ Coordinates can be created using the generic creation method, or templates can be used for some of the more common dimensions (like latitude, longitude, and time):
88
+ ```python
89
+ lat_data = np.linspace(0, 19.9, 200, dtype='float32')
90
+
91
+ with cfdb.open_dataset(file_path, flag='n') as ds:
92
+ lat_coord = ds.create.coord.latitude(data=lat_data, chunk_shape=(20,))
93
+ print(lat_coord)
94
+ ```
95
+ When creating coordinates, the user can pass an np.ndarray as data and cfdb will figure out the rest (especially when using a creation template). Otherwise, a coordinate can be created without any data input and the data can be appended later:
96
+ ```python
97
+ with cfdb.open_dataset(file_path, flag='n') as ds:
98
+ lat_coord = ds.create.coord.latitude(chunk_shape=(20,))
99
+ lat_coord.append(lat_data)
100
+ print(lat_coord.data)
101
+ ```
102
+ Coordinate data can either be appended or prepended, but keep in mind the limitations described above! And once assigned, coordinate values cannot be changed. At some point, I'll implement the ability to shrink the size of coordinates, but for now they can only be expanded. As seen in the above example, the .data property returns the entire variable data as a single np.ndarray. Coordinates always hold their entire data in memory, while data variables never do. On disk, all data are stored as chunks, whether they belong to coordinates or data variables.
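+
+ For example, prepending works like appending; a minimal sketch, assuming the prepend method mirrors append:
+ ```python
+ more_lats = np.linspace(-5, -0.1, 50, dtype='float32')
+
+ with cfdb.open_dataset(file_path, flag='w') as ds:
+     lat_coord = ds['latitude']
+     lat_coord.prepend(more_lats)  # prepended values must keep the coordinate unique and ascending
+     print(lat_coord.data)
+ ```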
103
+
104
+ Let's add another coordinate for fun:
105
+ ```python
106
+ time_data = np.linspace(0, 199, 200, dtype='datetime64[D]')
107
+
108
+ with cfdb.open_dataset(file_path, flag='w') as ds:
109
+ time_coord = ds.create.coord.time(data=time_data, dtype_decoded=time_data.dtype, dtype_encoded='int32')
110
+ print(time_coord)
111
+ ```
112
+ A time variable works similarly to other numpy dtypes, but you can assign the precision of the datetime object within the brackets (shown as [D] for days). Look at the [numpy datetime reference page](https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units) for all of the frequency codes. Do not use a frequency code finer than "ns". Encoding a datetime64 dtype to an int32 is possible down to the "m" (minute) resolution (with a max year of 6053), but all finer frequency codes should use int64.
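+
+ The int32 limit is easy to verify with plain numpy; a quick sanity check (illustrative only):
+ ```python
+ # 2**31 - 1 minutes after the Unix epoch lands in the year 6053
+ print(np.datetime64(0, 'm') + np.timedelta64(2**31 - 1, 'm'))
+
+ # a datetime64[D] array round-trips losslessly through an integer encoding
+ encoded = time_data.astype('int32')        # days since 1970-01-01
+ decoded = encoded.astype('datetime64[D]')
+ print(np.array_equal(decoded, time_data))  # True
+ ```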
113
+
114
+ #### Data Variables
115
+ Data variables are created in a similar way to coordinates, except that you cannot pass data on creation, and you must pass a tuple of coordinate names to link the coordinates to the data variable:
116
+ ```python
117
+ data_var_data = np.linspace(0, 3999.9, 40000, dtype='float64').reshape(200, 200)
118
+ name = 'data_var'
119
+ coords = ('latitude', 'time')
120
+ dtype_encoded = 'int32'
121
+ scale_factor = 0.1
122
+
123
+ with cfdb.open_dataset(file_path, flag='w') as ds:
124
+ data_var = ds.create.data_var.generic(name, coords, data_var_data.dtype, dtype_encoded, scale_factor=scale_factor)
125
+ data_var[:] = data_var_data
126
+ data_var.attrs['test'] = ['test attributes']
127
+ print(data_var)
128
+ ```
129
+ Since there are no data variable templates (yet), we need to use the generic creation method. If no fillvalue or chunk_shape is passed, then cfdb figures them out for you.
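+
+ Under the hood this follows the usual CF packing scheme: decoded = encoded * scale_factor (+ add_offset), so a scale_factor of 0.1 with an int32 encoding stores one decimal place. A minimal sketch of the round trip:
+ ```python
+ values = np.array([12.34, 56.78])
+ encoded = np.round(values / scale_factor).astype('int32')  # -> [123, 568]
+ decoded = encoded * scale_factor                           # -> [12.3, 56.8]
+ ```
+ cfdb also exports a compute_scale_and_offset helper for picking these values.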
130
+
131
+ Assigning data to data variables works differently than for coordinates: data variables can only be expanded via the coordinates themselves. Assignment and selection use [basic numpy indexing](https://numpy.org/doc/stable/user/basics.indexing.html#basic-indexing), but not [advanced indexing](https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing).
132
+
133
+ The example shown above is the simplest way of assigning data to a data variable, but it's not the preferred method when datasets are very large. The recommended way to write (and read) data is to iterate over the chunks:
134
+
135
+ ```python
136
+ with cfdb.open_dataset(file_path, flag='w') as ds:
137
+ data_var = ds[name]
138
+ for chunk_slices in data_var.iter_chunks(include_data=False):
139
+ data_var[chunk_slices] = data_var_data[chunk_slices]
140
+ ```
141
+
142
+ This is a bit of a contrived example, given that data_var_data is a single in-memory numpy array, but in many cases your data source will be much larger or in many pieces. The chunk_slices object is a tuple of index slices that covers the data chunk; it is the same kind of indexing that can be passed to a numpy ndarray.
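+
+ To make that concrete, a chunk_slices tuple for a 2-D variable looks something like this (illustrative shape only):
+ ```python
+ chunk_slices = (slice(0, 20), slice(0, 41))
+ chunk = data_var_data[chunk_slices]  # equivalent to data_var_data[0:20, 0:41]
+ ```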
143
+
144
+ Reading data uses the same "iter_chunks" method. This ensures that memory usage is kept to a minimum:
145
+
146
+ ```python
147
+ with cfdb.open_dataset(file_path, flag='r') as ds:
148
+ data_var = ds[name]
149
+ for chunk_slices, data in data_var.iter_chunks():
150
+ print(chunk_slices)
151
+ print(data.shape)
152
+ ```
153
+
154
+ There's a groupby method that works similarly to the iter_chunks method except that it requires one or more coordinate names (like pandas or xarray):
155
+
156
+ ```python
157
+ with cfdb.open_dataset(file_path, flag='r') as ds:
158
+ data_var = ds[name]
159
+ for slices, data in data_var.groupby('latitude'):
160
+ print(slices)
161
+ print(data.shape)
162
+ ```
163
+
164
+ #### Rechunking
165
+ All data for variables are stored as chunks of data. For example, the shape of your data may be 2000 x 2000, but the data are stored in 100 x 100 chunks. This is done for a variety of reasons, including the ability to compress data. When a variable is created, the user can either define their own chunk shape or let cfdb determine one automatically.
166
+
167
+ The chunk shape defined in the variable might be good for some use cases but not others. The user might have specific use cases where they want a specific chunking; for example, the groupby operation shown above, where the user iterates over each latitude while taking the full length of the other coordinates (in this case, the full time coordinate). A groupby operation is a common rechunking example, but the user might need chunks in many different shapes.
168
+
169
+ The [rechunkit package](https://github.com/mullenkamp/rechunkit) is used under the hood to rechunk the data in cfdb. It is exposed via the "rechunker" method on a variable. The Rechunker class has several methods to help the user decide on a chunk shape.
170
+
171
+ ```python
172
+ new_chunk_shape = (41, 41)
173
+
174
+ with cfdb.open_dataset(file_path) as ds:
175
+ data_var = ds[name]
176
+ rechunker = data_var.rechunker()
177
+ alt_chunk_shape = rechunker.guess_chunk_shape(2**8)
178
+ n_chunks = rechunker.calc_n_chunks()
179
+ print(n_chunks)
180
+ n_reads, n_writes = rechunker.calc_n_reads_rechunker(new_chunk_shape)
181
+ print(n_reads, n_writes)
182
+ rechunk = rechunker.rechunk(new_chunk_shape)
183
+
184
+ for slices, data in rechunk:
185
+ print(slices)
186
+ print(data.shape)
187
+ ```
188
+
189
+ #### Serializers
190
+ Datasets can be serialized to netcdf4 via the to_netcdf4 method; this requires the [h5netcdf package](https://h5netcdf.org/). A dataset can also be copied to another cfdb file.
191
+
192
+ ```python
193
+ with cfdb.open_dataset(file_path) as ds:
194
+ new_ds = ds.copy(new_file_path)
195
+ print(new_ds)
196
+ new_ds.close()
197
+ ds.to_netcdf4(nc_file_path)
198
+ ```
199
+
200
+
201
+
202
+ ## License
203
+
204
+ This project is licensed under the terms of the Apache Software License 2.0.
cfdb-0.1.1/README.md ADDED
@@ -0,0 +1,181 @@
1
+ # cfdb
2
+
3
+ <p align="center">
4
+ <em>CF conventions multi-dimensional array storage on top of Booklet</em>
5
+ </p>
6
+
7
+ [![build](https://github.com/mullenkamp/cfdb/workflows/Build/badge.svg)](https://github.com/mullenkamp/cfdb/actions)
8
+ [![codecov](https://codecov.io/gh/mullenkamp/cfdb/branch/master/graph/badge.svg)](https://codecov.io/gh/mullenkamp/cfdb)
9
+ [![PyPI version](https://badge.fury.io/py/cfdb.svg)](https://badge.fury.io/py/cfdb)
10
+
11
+ ---
12
+
13
+ **Source Code**: <a href="https://github.com/mullenkamp/cfdb" target="_blank">https://github.com/mullenkamp/cfdb</a>
14
+
15
+ ---
16
+ ## Introduction
17
+ cfdb is a pure Python database for managing labeled multi-dimensional arrays that mostly follows the [CF conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html). It is an alternative to netcdf4 and [xarray](https://docs.xarray.dev/). It builds upon [Booklet](https://github.com/mullenkamp/booklet) for the underlying local file storage and [EBooklet](https://github.com/mullenkamp/ebooklet) to sync and share on any S3 system. It has been designed to follow the programming style of opening a file, iteratively reading data, iteratively writing data, then closing the file.
18
+ It is thread-safe on reads and writes (using thread locks) and multiprocessing-safe (using file locks), including on the S3 remote (using object locking).
19
+
20
+ When an error occurs, cfdb will try to properly close the file and remove the file (object) locks. This will not sync any changes, so the user will lose any changes that were not synced. There will be circumstances in which the file cannot be closed properly, so care still needs to be taken.
21
+
22
+
23
+ ## Installation
24
+
25
+ Install via pip:
26
+
27
+ ```
28
+ pip install cfdb
29
+ ```
30
+
31
+ I'll probably put it on conda-forge once I feel appropriately motivated...
32
+
33
+ ## Usage
34
+ ### Opening a file/dataset
35
+ Usage starts off by opening the file (and closing the file when done):
36
+ ```python
37
+ import cfdb
38
+ import numpy as np
39
+
40
+ file_path = '/path/to/file.cfdb'
41
+
42
+ ds = cfdb.open_dataset(file_path, flag='n')
43
+ # Do fancy stuff
44
+ ds.close()
45
+ ```
46
+
47
+ By default, files are opened read-only, so we need to specify that we want to write (in this case, 'n' opens the file for writing and replaces any existing file with a new one). There are also some compression options, described in the doc strings. Other kwargs from [Booklet](https://github.com/mullenkamp/booklet?tab=readme-ov-file#usage) can be passed to open_dataset.
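+
+ As a rough sketch of the common flags (following Booklet's dbm-style flags; check the doc strings for the authoritative list):
+ ```python
+ ds = cfdb.open_dataset(file_path)            # flag='r' (default): read-only
+ ds = cfdb.open_dataset(file_path, flag='w')  # read/write an existing file
+ ds = cfdb.open_dataset(file_path, flag='n')  # create a new file, replacing any existing one
+ ```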
48
+
49
+ The dataset can also be opened with the context manager like so:
50
+ ```python
51
+ with cfdb.open_dataset(file_path, flag='n') as ds:
52
+ print(ds)
53
+ ```
54
+ This is generally encouraged, as it ensures that the file is closed properly and the file locks are removed.
55
+
56
+ ### Variables
57
+ In the [CF conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.12/cf-conventions.html#dimensions), variables are the objects that store data. These can be one-dimensional or multi-dimensional. The dimensions are labeled by 1-D variables (like latitude or time); these 1-D variables are called coordinate variables (or coordinates) and share the name of their associated dimension. All variables that use these coordinates as their dimension labels are called data variables. The combination of multiple data variables with their coordinates in a single file is called a dataset.
58
+
59
+ #### Coordinates
60
+ Since all data variables must have coordinates, the coordinates must be created before data variables are created.
61
+
62
+ Coordinates in cfdb are closer to the definition in the earlier [COARDS conventions](https://ferret.pmel.noaa.gov/Ferret/documentation/coards-netcdf-conventions) than in the later CF conventions. Coordinate values must be unique, sorted in ascending order (partly a consequence of using np.sort), and cannot contain null (np.nan) values. The CF conventions do not impose these limitations, but they are good ones! Coordinates must also be 1-D only.
63
+
64
+ Coordinates can be created using the generic creation method, or templates can be used for some of the more common dimensions (like latitude, longitude, and time):
65
+ ```python
66
+ lat_data = np.linspace(0, 19.9, 200, dtype='float32')
67
+
68
+ with cfdb.open_dataset(file_path, flag='n') as ds:
69
+ lat_coord = ds.create.coord.latitude(data=lat_data, chunk_shape=(20,))
70
+ print(lat_coord)
71
+ ```
72
+ When creating coordinates, the user can pass an np.ndarray as data and cfdb will figure out the rest (especially when using a creation template). Otherwise, a coordinate can be created without any data input and the data can be appended later:
73
+ ```python
74
+ with cfdb.open_dataset(file_path, flag='n') as ds:
75
+ lat_coord = ds.create.coord.latitude(chunk_shape=(20,))
76
+ lat_coord.append(lat_data)
77
+ print(lat_coord.data)
78
+ ```
79
+ Coordinate data can either be appended or prepended, but keep in mind the limitations described above! And once assigned, coordinate values cannot be changed. At some point, I'll implement the ability to shrink the size of coordinates, but for now they can only be expanded. As seen in the above example, the .data property returns the entire variable data as a single np.ndarray. Coordinates always hold their entire data in memory, while data variables never do. On disk, all data are stored as chunks, whether they belong to coordinates or data variables.
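+
+ For example, prepending works like appending; a minimal sketch, assuming the prepend method mirrors append:
+ ```python
+ more_lats = np.linspace(-5, -0.1, 50, dtype='float32')
+
+ with cfdb.open_dataset(file_path, flag='w') as ds:
+     lat_coord = ds['latitude']
+     lat_coord.prepend(more_lats)  # prepended values must keep the coordinate unique and ascending
+     print(lat_coord.data)
+ ```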
80
+
81
+ Let's add another coordinate for fun:
82
+ ```python
83
+ time_data = np.linspace(0, 199, 200, dtype='datetime64[D]')
84
+
85
+ with cfdb.open_dataset(file_path, flag='w') as ds:
86
+ time_coord = ds.create.coord.time(data=time_data, dtype_decoded=time_data.dtype, dtype_encoded='int32')
87
+ print(time_coord)
88
+ ```
89
+ A time variable works similarly to other numpy dtypes, but you can assign the precision of the datetime object within the brackets (shown as [D] for days). Look at the [numpy datetime reference page](https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units) for all of the frequency codes. Do not use a frequency code finer than "ns". Encoding a datetime64 dtype to an int32 is possible down to the "m" (minute) resolution (with a max year of 6053), but all finer frequency codes should use int64.
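+
+ The int32 limit is easy to verify with plain numpy; a quick sanity check (illustrative only):
+ ```python
+ # 2**31 - 1 minutes after the Unix epoch lands in the year 6053
+ print(np.datetime64(0, 'm') + np.timedelta64(2**31 - 1, 'm'))
+
+ # a datetime64[D] array round-trips losslessly through an integer encoding
+ encoded = time_data.astype('int32')        # days since 1970-01-01
+ decoded = encoded.astype('datetime64[D]')
+ print(np.array_equal(decoded, time_data))  # True
+ ```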
90
+
91
+ #### Data Variables
92
+ Data variables are created in a similar way to coordinates, except that you cannot pass data on creation, and you must pass a tuple of coordinate names to link the coordinates to the data variable:
93
+ ```python
94
+ data_var_data = np.linspace(0, 3999.9, 40000, dtype='float64').reshape(200, 200)
95
+ name = 'data_var'
96
+ coords = ('latitude', 'time')
97
+ dtype_encoded = 'int32'
98
+ scale_factor = 0.1
99
+
100
+ with cfdb.open_dataset(file_path, flag='w') as ds:
101
+ data_var = ds.create.data_var.generic(name, coords, data_var_data.dtype, dtype_encoded, scale_factor=scale_factor)
102
+ data_var[:] = data_var_data
103
+ data_var.attrs['test'] = ['test attributes']
104
+ print(data_var)
105
+ ```
106
+ Since there are no data variable templates (yet), we need to use the generic creation method. If no fillvalue or chunk_shape is passed, then cfdb figures them out for you.
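+
+ Under the hood this follows the usual CF packing scheme: decoded = encoded * scale_factor (+ add_offset), so a scale_factor of 0.1 with an int32 encoding stores one decimal place. A minimal sketch of the round trip:
+ ```python
+ values = np.array([12.34, 56.78])
+ encoded = np.round(values / scale_factor).astype('int32')  # -> [123, 568]
+ decoded = encoded * scale_factor                           # -> [12.3, 56.8]
+ ```
+ cfdb also exports a compute_scale_and_offset helper for picking these values.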
107
+
108
+ Assigning data to data variables works differently than for coordinates: data variables can only be expanded via the coordinates themselves. Assignment and selection use [basic numpy indexing](https://numpy.org/doc/stable/user/basics.indexing.html#basic-indexing), but not [advanced indexing](https://numpy.org/doc/stable/user/basics.indexing.html#advanced-indexing).
109
+
110
+ The example shown above is the simplest way of assigning data to a data variable, but it's not the preferred method when datasets are very large. The recommended way to write (and read) data is to iterate over the chunks:
111
+
112
+ ```python
113
+ with cfdb.open_dataset(file_path, flag='w') as ds:
114
+ data_var = ds[name]
115
+ for chunk_slices in data_var.iter_chunks(include_data=False):
116
+ data_var[chunk_slices] = data_var_data[chunk_slices]
117
+ ```
118
+
119
+ This is a bit of a contrived example, given that data_var_data is a single in-memory numpy array, but in many cases your data source will be much larger or in many pieces. The chunk_slices object is a tuple of index slices that covers the data chunk; it is the same kind of indexing that can be passed to a numpy ndarray.
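+
+ To make that concrete, a chunk_slices tuple for a 2-D variable looks something like this (illustrative shape only):
+ ```python
+ chunk_slices = (slice(0, 20), slice(0, 41))
+ chunk = data_var_data[chunk_slices]  # equivalent to data_var_data[0:20, 0:41]
+ ```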
120
+
121
+ Reading data uses the same "iter_chunks" method. This ensures that memory usage is kept to a minimum:
122
+
123
+ ```python
124
+ with cfdb.open_dataset(file_path, flag='r') as ds:
125
+ data_var = ds[name]
126
+ for chunk_slices, data in data_var.iter_chunks():
127
+ print(chunk_slices)
128
+ print(data.shape)
129
+ ```
130
+
131
+ There's a groupby method that works similarly to the iter_chunks method except that it requires one or more coordinate names (like pandas or xarray):
132
+
133
+ ```python
134
+ with cfdb.open_dataset(file_path, flag='r') as ds:
135
+ data_var = ds[name]
136
+ for slices, data in data_var.groupby('latitude'):
137
+ print(slices)
138
+ print(data.shape)
139
+ ```
140
+
141
+ #### Rechunking
142
+ All data for variables are stored as chunks of data. For example, the shape of your data may be 2000 x 2000, but the data are stored in 100 x 100 chunks. This is done for a variety of reasons, including the ability to compress data. When a variable is created, the user can either define their own chunk shape or let cfdb determine one automatically.
143
+
144
+ The chunk shape defined in the variable might be good for some use cases but not others. The user might have specific use cases where they want a specific chunking; for example, the groupby operation shown above, where the user iterates over each latitude while taking the full length of the other coordinates (in this case, the full time coordinate). A groupby operation is a common rechunking example, but the user might need chunks in many different shapes.
145
+
146
+ The [rechunkit package](https://github.com/mullenkamp/rechunkit) is used under the hood to rechunk the data in cfdb. It is exposed via the "rechunker" method on a variable. The Rechunker class has several methods to help the user decide on a chunk shape.
147
+
148
+ ```python
149
+ new_chunk_shape = (41, 41)
150
+
151
+ with cfdb.open_dataset(file_path) as ds:
152
+ data_var = ds[name]
153
+ rechunker = data_var.rechunker()
154
+ alt_chunk_shape = rechunker.guess_chunk_shape(2**8)
155
+ n_chunks = rechunker.calc_n_chunks()
156
+ print(n_chunks)
157
+ n_reads, n_writes = rechunker.calc_n_reads_rechunker(new_chunk_shape)
158
+ print(n_reads, n_writes)
159
+ rechunk = rechunker.rechunk(new_chunk_shape)
160
+
161
+ for slices, data in rechunk:
162
+ print(slices)
163
+ print(data.shape)
164
+ ```
165
+
166
+ #### Serializers
167
+ Datasets can be serialized to netcdf4 via the to_netcdf4 method; this requires the [h5netcdf package](https://h5netcdf.org/). A dataset can also be copied to another cfdb file.
168
+
169
+ ```python
170
+ with cfdb.open_dataset(file_path) as ds:
171
+ new_ds = ds.copy(new_file_path)
172
+ print(new_ds)
173
+ new_ds.close()
174
+ ds.to_netcdf4(nc_file_path)
175
+ ```
176
+
177
+
178
+
179
+ ## License
180
+
181
+ This project is licensed under the terms of the Apache Software License 2.0.
@@ -1,6 +1,7 @@
1
1
  """CF conventions multi-dimensional array database on top of Booklet"""
2
2
  from cfdb.main import open_dataset, open_edataset
3
3
  from cfdb.utils import compute_scale_and_offset
4
+ from cfdb.tools import netcdf4_to_cfdb, cfdb_to_netcdf4
4
5
  from rechunkit import guess_chunk_shape
5
6
 
6
- __version__ = '0.1.0'
7
+ __version__ = '0.1.1'
@@ -287,11 +287,8 @@ def slices_to_chunks_keys(slices, var_name, var_chunk_shape, clip_ends=True):
287
287
  """
288
288
  starts = tuple(s.start for s in slices)
289
289
  stops = tuple(s.stop for s in slices)
290
- # chunk_iter1 = rechunkit.chunk_range(starts, stops, var_chunk_shape, clip_ends=False)
291
290
  chunk_iter2 = rechunkit.chunk_range(starts, stops, var_chunk_shape, clip_ends=clip_ends)
292
- # for full_chunk, partial_chunk in zip(chunk_iter1, chunk_iter2):
293
291
  for partial_chunk in chunk_iter2:
294
- # starts_chunk = tuple(s.start for s in full_chunk)
295
292
  starts_chunk = tuple((pc.start//cs) * cs for cs, pc in zip(var_chunk_shape, partial_chunk))
296
293
  new_key = utils.make_var_chunk_key(var_name, starts_chunk)
297
294