eo-tides 0.1.1__tar.gz → 0.2.0__tar.gz

@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: eo-tides
3
- Version: 0.1.1
3
+ Version: 0.2.0
4
4
  Summary: Tide modelling tools for large-scale satellite earth observation analysis
5
5
  Author-email: Robbi Bishop-Taylor <Robbi.BishopTaylor@ga.gov.au>
6
6
  Project-URL: Homepage, https://GeoscienceAustralia.github.io/eo-tides/
@@ -29,7 +29,7 @@ Requires-Dist: numpy>=1.26.0
29
29
  Requires-Dist: odc-geo>=0.4.7
30
30
  Requires-Dist: pandas>=2.2.0
31
31
  Requires-Dist: pyproj>=3.6.1
32
- Requires-Dist: pyTMD==2.1.6
32
+ Requires-Dist: pyTMD==2.1.7
33
33
  Requires-Dist: scikit-learn>=1.4.0
34
34
  Requires-Dist: scipy>=1.11.2
35
35
  Requires-Dist: shapely>=2.0.6
@@ -68,9 +68,9 @@ These tools can be applied to petabytes of freely available satellite data (e.g.
68
68
 
69
69
  ## Highlights
70
70
 
71
- - 🌊 Model tides from multiple global ocean tide models in parallel, and return tide heights in standardised `pandas.DataFrame` format for further analysis
71
+ - 🌊 Model tide heights and phases (e.g. high, low, ebb, flow) from multiple global ocean tide models in parallel, and return a `pandas.DataFrame` for further analysis
72
72
  - 🛰️ "Tag" satellite data with tide height and stage based on the exact moment of image acquisition
73
- - 🌐 Model tides for every individual satellite pixel, producing three-dimensional "tide height" `xarray`-format datacubes that can be integrated with satellite data
73
+ - 🌐 Model tides for every individual satellite pixel through time, producing three-dimensional "tide height" `xarray`-format datacubes that can be integrated with satellite data
74
74
  - 📈 Calculate statistics describing local tide dynamics, as well as biases caused by interactions between tidal processes and satellite orbits
75
75
  - 🛠️ Validate modelled tides using measured sea levels from coastal tide gauges (e.g. [GESLA Global Extreme Sea Level Analysis](https://gesla.org/))
76
76
  <!-- - 🎯 Combine multiple tide models into a single locally-optimised "ensemble" model informed by satellite altimetry and satellite-observed patterns of tidal inundation -->
@@ -103,6 +103,12 @@ To cite `eo-tides` in your work, please use the following citation:
103
103
  Bishop-Taylor, R., Sagar, S., Phillips, C., & Newey, V. (2024). eo-tides: Tide modelling tools for large-scale satellite earth observation analysis. https://github.com/GeoscienceAustralia/eo-tides
104
104
  ```
105
105
 
106
+ Please also consider citing the underlying [`pyTMD` Python package](https://pytmd.readthedocs.io/en/latest/), which powers the tide modelling functionality behind `eo-tides`:
107
+
108
+ ```
109
+ Sutterley, T. C., Alley, K., Brunt, K., Howard, S., Padman, L., Siegfried, M. (2017) pyTMD: Python-based tidal prediction software. 10.5281/zenodo.5555395
110
+ ```
111
+
106
112
  ## Acknowledgements
107
113
 
108
114
  For a full list of acknowledgements, refer to [Citations and Credits](https://geoscienceaustralia.github.io/eo-tides/credits/).
@@ -24,9 +24,9 @@ These tools can be applied to petabytes of freely available satellite data (e.g.
24
24
 
25
25
  ## Highlights
26
26
 
27
- - 🌊 Model tides from multiple global ocean tide models in parallel, and return tide heights in standardised `pandas.DataFrame` format for further analysis
27
+ - 🌊 Model tide heights and phases (e.g. high, low, ebb, flow) from multiple global ocean tide models in parallel, and return a `pandas.DataFrame` for further analysis
28
28
  - 🛰️ "Tag" satellite data with tide height and stage based on the exact moment of image acquisition
29
- - 🌐 Model tides for every individual satellite pixel, producing three-dimensional "tide height" `xarray`-format datacubes that can be integrated with satellite data
29
+ - 🌐 Model tides for every individual satellite pixel through time, producing three-dimensional "tide height" `xarray`-format datacubes that can be integrated with satellite data
30
30
  - 📈 Calculate statistics describing local tide dynamics, as well as biases caused by interactions between tidal processes and satellite orbits
31
31
  - 🛠️ Validate modelled tides using measured sea levels from coastal tide gauges (e.g. [GESLA Global Extreme Sea Level Analysis](https://gesla.org/))
32
32
  <!-- - 🎯 Combine multiple tide models into a single locally-optimised "ensemble" model informed by satellite altimetry and satellite-observed patterns of tidal inundation -->
@@ -59,6 +59,12 @@ To cite `eo-tides` in your work, please use the following citation:
59
59
  Bishop-Taylor, R., Sagar, S., Phillips, C., & Newey, V. (2024). eo-tides: Tide modelling tools for large-scale satellite earth observation analysis. https://github.com/GeoscienceAustralia/eo-tides
60
60
  ```
61
61
 
62
+ Please also consider citing the underlying [`pyTMD` Python package](https://pytmd.readthedocs.io/en/latest/), which powers the tide modelling functionality behind `eo-tides`:
63
+
64
+ ```
65
+ Sutterley, T. C., Alley, K., Brunt, K., Howard, S., Padman, L., Siegfried, M. (2017) pyTMD: Python-based tidal prediction software. 10.5281/zenodo.5555395
66
+ ```
67
+
62
68
  ## Acknowledgements
63
69
 
64
70
  For a full list of acknowledgements, refer to [Citations and Credits](https://geoscienceaustralia.github.io/eo-tides/credits/).
@@ -28,8 +28,8 @@ validation : Load observed tide gauge data to validate modelled tides
28
28
 
29
29
  # Import commonly used functions for convenience
30
30
  from .eo import pixel_tides, tag_tides
31
- from .model import list_models, model_tides
32
- from .stats import tide_stats
31
+ from .model import list_models, model_phases, model_tides
32
+ from .stats import pixel_stats, tide_stats
33
33
  from .utils import idw
34
34
  from .validation import eval_metrics, load_gauge_gesla
35
35
 
@@ -37,9 +37,11 @@ from .validation import eval_metrics, load_gauge_gesla
37
37
  __all__ = [
38
38
  "list_models",
39
39
  "model_tides",
40
+ "model_phases",
40
41
  "tag_tides",
41
42
  "pixel_tides",
42
43
  "tide_stats",
44
+ "pixel_stats",
43
45
  "idw",
44
46
  "eval_metrics",
45
47
  "load_gauge_gesla",
@@ -2,9 +2,11 @@
2
2
  from __future__ import annotations
3
3
 
4
4
  import os
5
+ import textwrap
5
6
  import warnings
6
7
  from typing import TYPE_CHECKING
7
8
 
9
+ import numpy as np
8
10
  import odc.geo.xr
9
11
  import pandas as pd
10
12
  import xarray as xr
@@ -12,16 +14,94 @@ from odc.geo.geobox import GeoBox
12
14
 
13
15
  # Only import if running type checking
14
16
  if TYPE_CHECKING:
15
- import numpy as np
17
+ import datetime
16
18
 
17
- from .model import model_tides
19
+ from odc.geo import Shape2d
20
+
21
+ from .model import DatetimeLike, _standardise_time, model_tides
22
+
23
+
24
+ def _resample_chunks(
25
+ data: xr.DataArray | xr.Dataset | GeoBox,
26
+ dask_chunks: tuple | None = None,
27
+ ) -> tuple | Shape2d:
28
+ """
29
+ Automatically return optimised dask chunks
30
+ for reprojection with `_pixel_tides_resample`.
31
+ Use entire image if GeoBox or if no default
32
+ chunks; use existing chunks if they exist.
33
+ """
34
+
35
+ # If dask_chunks is provided, return directly
36
+ if dask_chunks is not None:
37
+ return dask_chunks
38
+
39
+ # If data is a GeoBox, return its shape
40
+ if isinstance(data, GeoBox):
41
+ return data.shape
42
+
43
+ # if data has chunks, then return just spatial chunks
44
+ if data.chunks is not None:
45
+ y_dim, x_dim = data.odc.spatial_dims
46
+ return data.chunks[y_dim], data.chunks[x_dim]
47
+
48
+ # if data has no chunks, then return entire image shape
49
+ return data.odc.geobox.shape
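The selection order used by `_resample_chunks` can be sketched in plain Python, without xarray or odc-geo. This is only an illustration of the precedence, not the library's implementation: `resolve_chunks` and its argument names are hypothetical, `spatial_chunks` stands in for any y/x chunks the input already carries, and `grid_shape` stands in for the full (y, x) shape of the pixel grid.

```python
def resolve_chunks(spatial_chunks, grid_shape, dask_chunks=None):
    """Pick resampling chunks using the same precedence as the diff above.

    All names here are hypothetical stand-ins for illustration only.
    """
    # 1. Explicit user-supplied chunks always win
    if dask_chunks is not None:
        return dask_chunks

    # 2. Otherwise reuse the spatial chunks already present on the input
    if spatial_chunks is not None:
        return spatial_chunks

    # 3. Otherwise fall back to one chunk covering the whole grid
    return grid_shape


# Unchunked input (or a bare grid): the whole image becomes one chunk
whole_image = resolve_chunks(None, (4000, 3000))

# Chunked input: existing y/x chunks are reused
reused = resolve_chunks((1024, 1024), (4000, 3000))

# Custom chunks override everything else
custom = resolve_chunks((1024, 1024), (4000, 3000), dask_chunks=(2048, 2048))
```

Falling back to the full grid shape appears intended to avoid the many tiny chunks that the removed `"auto"` comment describes for small low-resolution inputs.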
50
+
51
+
52
+ def _standardise_inputs(
53
+ data: xr.DataArray | xr.Dataset | GeoBox,
54
+ time: DatetimeLike | None,
55
+ ) -> tuple[GeoBox, np.ndarray | None]:
56
+ """
57
+ Takes an xarray or GeoBox input and optional custom times,
58
+ and returns a standardised GeoBox and times (usually an
59
+ array, but possibly None).
60
+ """
61
+
62
+ # If `data` is an xarray object, extract its GeoBox and time
63
+ if isinstance(data, (xr.DataArray, xr.Dataset)):
64
+ # Try to extract GeoBox
65
+ try:
66
+ gbox: GeoBox = data.odc.geobox
67
+ except AttributeError:
68
+ error_msg = """
69
+ Cannot extract a valid GeoBox for `data`. This is required for
70
+ extracting details about `data`'s CRS and spatial location.
71
+
72
+ Import `odc.geo.xr` then run `data = data.odc.assign_crs(crs=...)`
73
+ to prepare your data before passing it to this function.
74
+ """
75
+ raise Exception(textwrap.dedent(error_msg).strip())
76
+
77
+ # Use custom time by default if provided; otherwise try and extract from `data`
78
+ if time is not None:
79
+ time = _standardise_time(time)
80
+ elif "time" in data.dims:
81
+ time = np.asarray(data.coords["time"].values)
82
+ else:
83
+ raise ValueError("`data` does not have a 'time' dimension, and no custom times were provided via `time`.")
84
+
85
+ # If `data` is a GeoBox, use it directly; raise an error if no time was provided
86
+ elif isinstance(data, GeoBox):
87
+ gbox = data
88
+ if time is not None:
89
+ time = _standardise_time(time)
90
+ else:
91
+ raise ValueError("If `data` is a GeoBox, custom times must be provided via `time`.")
92
+
93
+ # Raise error if no valid inputs were provided
94
+ else:
95
+ raise TypeError("`data` must be an xarray.DataArray, xarray.Dataset, or odc.geo.geobox.GeoBox.")
96
+
97
+ return gbox, time
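The precedence rules in `_standardise_inputs` (custom `time` first, then the dataset's own "time" dimension, otherwise an error) can be illustrated with a dependency-free sketch. `FakeGrid`, `FakeDataset` and `resolve_times` are hypothetical stand-ins for a GeoBox, an xarray object and the helper itself; they are assumptions made for illustration and are not part of eo-tides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass
class FakeGrid:
    """Hypothetical stand-in for a GeoBox: a spatial grid with no times."""
    shape: tuple


@dataclass
class FakeDataset:
    """Hypothetical stand-in for an xarray object with an optional time axis."""
    grid: FakeGrid
    times: Optional[Sequence[datetime]] = None


def resolve_times(data, time=None):
    """Return (grid, times) using the precedence shown in the diff above."""
    if isinstance(data, FakeDataset):
        # Custom times take priority over the dataset's own time axis
        if time is not None:
            return data.grid, list(time)
        if data.times is not None:
            return data.grid, list(data.times)
        raise ValueError("no 'time' dimension and no custom times provided")

    if isinstance(data, FakeGrid):
        # A bare grid carries no times, so they must be supplied
        if time is not None:
            return data, list(time)
        raise ValueError("a bare grid requires custom times")

    raise TypeError("unsupported input type")


grid = FakeGrid(shape=(100, 100))
own_times = [datetime(2020, 1, 1), datetime(2020, 1, 2)]
custom_times = [datetime(2021, 6, 1)]

# Dataset times are used when no custom times are passed
_, from_dataset = resolve_times(FakeDataset(grid, own_times))

# Custom times override the dataset's times
_, overridden = resolve_times(FakeDataset(grid, own_times), time=custom_times)
```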
18
98
 
19
99
 
20
100
  def _pixel_tides_resample(
21
101
  tides_lowres,
22
- ds,
102
+ gbox,
23
103
  resample_method="bilinear",
24
- dask_chunks="auto",
104
+ dask_chunks=None,
25
105
  dask_compute=True,
26
106
  ):
27
107
  """Resamples low resolution tides modelled by `pixel_tides` into the
@@ -32,56 +112,39 @@ def _pixel_tides_resample(
32
112
  ----------
33
113
  tides_lowres : xarray.DataArray
34
114
  The low resolution tide modelling data array to be resampled.
35
- ds : xarray.Dataset
36
- The dataset whose geobox will be used as the template for the
37
- resampling operation. This is typically the same satellite
38
- dataset originally passed to `pixel_tides`.
115
+ gbox : GeoBox
116
+ The GeoBox to use as the template for the resampling operation.
117
+ This typically comes from the same satellite dataset originally
118
+ passed to `pixel_tides` (e.g. `data.odc.geobox`).
39
119
  resample_method : string, optional
40
120
  The resampling method to use. Defaults to "bilinear"; valid
41
121
  options include "nearest", "cubic", "min", "max", "average" etc.
42
- dask_chunks : str or tuple, optional
122
+ dask_chunks : tuple of float, optional
43
123
  Can be used to configure custom Dask chunking for the final
44
- resampling step. The default of "auto" will automatically set
45
- x/y chunks to match those in `ds` if they exist, otherwise will
46
- set x/y chunks that cover the entire extent of the dataset.
47
- For custom chunks, provide a tuple in the form `(y, x)`, e.g.
48
- `(2048, 2048)`.
124
+ resampling step. For custom chunks, provide a tuple in the form
125
+ (y, x), e.g. (2048, 2048).
49
126
  dask_compute : bool, optional
50
127
  Whether to compute results of the resampling step using Dask.
51
- If False, this will return `tides_highres` as a Dask array.
128
+ If False, this will return `tides_highres` as a lazy loaded
129
+ Dask-enabled array.
52
130
 
53
131
  Returns
54
132
  -------
55
- tides_highres, tides_lowres : tuple of xr.DataArrays
56
- In addition to `tides_lowres` (see above), a high resolution
57
- array of tide heights will be generated matching the
58
- exact spatial resolution and extent of `ds`.
133
+ tides_highres : xr.DataArray
134
+ A high resolution array of tide heights matching the exact
135
+ spatial resolution and extent of `gbox`.
59
136
 
60
137
  """
61
138
  # Determine spatial dimensions
62
- y_dim, x_dim = ds.odc.spatial_dims
139
+ y_dim, x_dim = gbox.dimensions
63
140
 
64
141
  # Convert array to Dask, using no chunking along y and x dims,
65
142
  # and a single chunk for each timestep/quantile and tide model
66
143
  tides_lowres_dask = tides_lowres.chunk({d: None if d in [y_dim, x_dim] else 1 for d in tides_lowres.dims})
67
144
 
68
- # Automatically set Dask chunks for reprojection if set to "auto".
69
- # This will either use x/y chunks if they exist in `ds`, else
70
- # will cover the entire x and y dims) so we don't end up with
71
- # hundreds of tiny x and y chunks due to the small size of
72
- # `tides_lowres` (possible odc.geo bug?)
73
- if dask_chunks == "auto":
74
- if ds.chunks is not None:
75
- if (y_dim in ds.chunks) & (x_dim in ds.chunks):
76
- dask_chunks = (ds.chunks[y_dim], ds.chunks[x_dim])
77
- else:
78
- dask_chunks = ds.odc.geobox.shape
79
- else:
80
- dask_chunks = ds.odc.geobox.shape
81
-
82
- # Reproject into the GeoBox of `ds` using odc.geo and Dask
145
+ # Reproject into the pixel grid of `gbox` using odc.geo and Dask
83
146
  tides_highres = tides_lowres_dask.odc.reproject(
84
- how=ds.odc.geobox,
147
+ how=gbox,
85
148
  chunks=dask_chunks,
86
149
  resampling=resample_method,
87
150
  ).rename("tide_height")
@@ -94,7 +157,8 @@ def _pixel_tides_resample(
94
157
 
95
158
 
96
159
  def tag_tides(
97
- ds: xr.Dataset | xr.DataArray,
160
+ data: xr.Dataset | xr.DataArray | GeoBox,
161
+ time: DatetimeLike | None = None,
98
162
  model: str | list[str] = "EOT20",
99
163
  directory: str | os.PathLike | None = None,
100
164
  tidepost_lat: float | None = None,
@@ -104,7 +168,7 @@ def tag_tides(
104
168
  """
105
169
  Model tide heights for every timestep in a multi-dimensional
106
170
  dataset, and return a new `tide_height` array that can
107
- be used to "tag" each observation with tide data.
171
+ be used to "tag" each observation with tide heights.
108
172
 
109
173
  The function models tides at the centroid of the dataset
110
174
  by default, but a custom tidal modelling location can
@@ -122,15 +186,23 @@ def tag_tides(
122
186
 
123
187
  Parameters
124
188
  ----------
125
- ds : xarray.Dataset or xarray.DataArray
126
- A multi-dimensional dataset (e.g. "x", "y", "time") to
127
- tag with tide heights. This dataset must contain a "time"
128
- dimension.
189
+ data : xarray.Dataset or xarray.DataArray or odc.geo.geobox.GeoBox
190
+ A multi-dimensional dataset or GeoBox pixel grid that will
191
+ be used to define the tide modelling location. If `data`
192
+ is an xarray object, it should include a "time" dimension.
193
+ If no "time" dimension exists or if `data` is a GeoBox,
194
+ then times must be passed using the `time` parameter.
195
+ time : DatetimeLike, optional
196
+ By default, tides will be modelled using times from the
197
+ "time" dimension of `data`. Alternatively, this param can
198
+ be used to provide a custom set of times. Accepts any format
199
+ that can be converted by `pandas.to_datetime()`. For example:
200
+ `time=pd.date_range(start="2000", end="2001", freq="5h")`
129
201
  model : str or list of str, optional
130
- The tide model (or models) to use to model tides. If a list is
131
- provided, a new "tide_model" dimension will be added to `ds`.
132
- Defaults to "EOT20"; for a full list of available/supported
133
- models, run `eo_tides.model.list_models`.
202
+ The tide model (or models) used to model tides. If a list is
203
+ provided, a new "tide_model" dimension will be added to the
204
+ `xarray.DataArray` outputs. Defaults to "EOT20"; for a full
205
+ list of available/supported models, run `eo_tides.model.list_models`.
134
206
  directory : str, optional
135
207
  The directory containing tide model data files. If no path is
136
208
  provided, this will default to the environment variable
@@ -151,23 +223,18 @@ def tag_tides(
151
223
 
152
224
  Returns
153
225
  -------
154
- ds : xr.Dataset
155
- The original `xarray.Dataset` with a new `tide_height` variable
156
- giving the height of the tide (and optionally, its ebb-flow phase)
157
- for each timestep in the data.
158
-
226
+ tides_da : xr.DataArray
227
+ A one-dimensional tide height array. This will contain either
228
+ tide heights for every timestep in `data`, or for every time in
229
+ `time` if provided.
159
230
  """
160
- # Only datasets are supported
161
- if not isinstance(ds, xr.Dataset):
162
- raise TypeError("Input must be an xarray.Dataset, not an xarray.DataArray or other data type.")
163
-
164
- # Standardise model into a list for easy handling. and verify only one
231
+ # Standardise data inputs, time and models
232
+ gbox, time_coords = _standardise_inputs(data, time)
165
233
  model = [model] if isinstance(model, str) else model
166
234
 
167
- # If custom tide modelling locations are not provided, use the
168
- # dataset centroid
235
+ # If custom tide posts are not provided, use dataset centroid
169
236
  if tidepost_lat is None or tidepost_lon is None:
170
- lon, lat = ds.odc.geobox.geographic_extent.centroid.coords[0]
237
+ lon, lat = gbox.geographic_extent.centroid.coords[0]
171
238
  print(f"Setting tide modelling location from dataset centroid: {lon:.2f}, {lat:.2f}")
172
239
  else:
173
240
  lon, lat = tidepost_lon, tidepost_lat
@@ -177,7 +244,7 @@ def tag_tides(
177
244
  tide_df = model_tides(
178
245
  x=lon, # type: ignore
179
246
  y=lat, # type: ignore
180
- time=ds.time,
247
+ time=time_coords,
181
248
  model=model,
182
249
  directory=directory,
183
250
  crs="EPSG:4326",
@@ -195,43 +262,19 @@ def tag_tides(
195
262
  f"`tidepost_lat` and `tidepost_lon` parameters."
196
263
  )
197
264
 
198
- # # Optionally calculate the tide phase for each observation
199
- # if ebb_flow:
200
- # # Model tides for a time 15 minutes prior to each previously
201
- # # modelled satellite acquisition time. This allows us to compare
202
- # # tide heights to see if they are rising or falling.
203
- # print("Modelling tidal phase (e.g. ebb or flow)")
204
- # tide_pre_df = model_tides(
205
- # x=lon, # type: ignore
206
- # y=lat, # type: ignore
207
- # time=(ds.time - pd.Timedelta("15 min")),
208
- # model=model,
209
- # directory=directory,
210
- # crs="EPSG:4326",
211
- # **model_tides_kwargs,
212
- # )
213
-
214
- # # Compare tides computed for each timestep. If the previous tide
215
- # # was higher than the current tide, the tide is 'ebbing'. If the
216
- # # previous tide was lower, the tide is 'flowing'
217
- # tide_df["ebb_flow"] = (tide_df.tide_height < tide_pre_df.tide_height.values).replace({
218
- # True: "Ebb",
219
- # False: "Flow",
220
- # })
221
-
222
265
  # Convert to xarray format
223
- tide_xr = tide_df.reset_index().set_index(["time", "tide_model"]).drop(["x", "y"], axis=1).tide_height.to_xarray()
266
+ tides_da = tide_df.reset_index().set_index(["time", "tide_model"]).drop(["x", "y"], axis=1).tide_height.to_xarray()
224
267
 
225
268
  # If only one tidal model exists, squeeze out "tide_model" dim
226
- if len(tide_xr.tide_model) == 1:
227
- tide_xr = tide_xr.squeeze("tide_model")
269
+ if len(tides_da.tide_model) == 1:
270
+ tides_da = tides_da.squeeze("tide_model")
228
271
 
229
- return tide_xr
272
+ return tides_da
230
273
 
231
274
 
232
275
  def pixel_tides(
233
- ds: xr.Dataset | xr.DataArray,
234
- times=None,
276
+ data: xr.Dataset | xr.DataArray | GeoBox,
277
+ time: DatetimeLike | None = None,
235
278
  model: str | list[str] = "EOT20",
236
279
  directory: str | os.PathLike | None = None,
237
280
  resample: bool = True,
@@ -239,7 +282,7 @@ def pixel_tides(
239
282
  resolution: float | None = None,
240
283
  buffer: float | None = None,
241
284
  resample_method: str = "bilinear",
242
- dask_chunks: str | tuple[float, float] = "auto",
285
+ dask_chunks: tuple[float, float] | None = None,
243
286
  dask_compute: bool = True,
244
287
  **model_tides_kwargs,
245
288
  ) -> xr.DataArray:
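Callers written against 0.1.1 of `pixel_tides` may still pass `ds`, `times` or `dask_chunks="auto"`, which this diff renames to `data`, `time` and `dask_chunks=None`. A minimal sketch of a keyword translation step follows. `migrate_kwargs` is a hypothetical helper written for illustration and is not provided by eo-tides; it only rewrites a dictionary and never calls the library.

```python
def migrate_kwargs(kwargs):
    """Translate 0.1.1-style keyword names to the 0.2.0 names.

    Hypothetical helper for illustration; not part of eo-tides.
    """
    new = dict(kwargs)  # copy so the caller's dict is left untouched

    # `ds` was renamed to `data`
    if "ds" in new:
        new["data"] = new.pop("ds")

    # `times` was renamed to `time`
    if "times" in new:
        new["time"] = new.pop("times")

    # The "auto" sentinel was replaced by a default of None
    if new.get("dask_chunks") == "auto":
        new["dask_chunks"] = None

    return new


old_style = {"ds": "my_dataset", "times": ["2020-01-01"], "dask_chunks": "auto"}
new_style = migrate_kwargs(old_style)
```

Under these assumptions, an existing call could be adapted as `pixel_tides(**migrate_kwargs(old_kwargs))`. In 0.1.1, `tag_tides` accepted `ds` but not `times`, so only the first rename applies there.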
@@ -250,10 +293,9 @@ def pixel_tides(
250
293
  This function models tides into a low-resolution tide
251
294
  modelling grid covering the spatial extent of the input
252
295
  data (buffered to reduce potential edge effects). These
253
- modelled tides are then (optionally) resampled back into
254
- the original higher resolution dataset's extent and
255
- resolution - resulting in a modelled tide height for every
256
- pixel through time.
296
+ modelled tides can then be resampled back into the original
297
+ higher resolution dataset's extent and resolution to
298
+ produce a modelled tide height for every pixel through time.
257
299
 
258
300
  This function uses the parallelised `model_tides` function
259
301
  under the hood. It supports all tidal models supported by
@@ -273,15 +315,18 @@ def pixel_tides(
273
315
 
274
316
  Parameters
275
317
  ----------
276
- ds : xarray.Dataset or xarray.DataArray
277
- A multi-dimensional dataset (e.g. "x", "y", "time") that will
278
- be used to define the tide modelling grid.
279
- times : pd.DatetimeIndex or list of pd.Timestamp, optional
280
- By default, the function will model tides using the times
281
- contained in the `time` dimension of `ds`. Alternatively, this
282
- param can be used to model tides for a custom set of times
283
- instead. For example:
284
- `times=pd.date_range(start="2000", end="2001", freq="5h")`
318
+ data : xarray.Dataset or xarray.DataArray or odc.geo.geobox.GeoBox
319
+ A multi-dimensional dataset or GeoBox pixel grid that will
320
+ be used to define the spatial tide modelling grid. If `data`
321
+ is an xarray object, it should include a "time" dimension.
322
+ If no "time" dimension exists or if `data` is a GeoBox,
323
+ then times must be passed using the `time` parameter.
324
+ time : DatetimeLike, optional
325
+ By default, tides will be modelled using times from the
326
+ "time" dimension of `data`. Alternatively, this param can
327
+ be used to provide a custom set of times. Accepts any format
328
+ that can be converted by `pandas.to_datetime()`. For example:
329
+ `time=pd.date_range(start="2000", end="2001", freq="5h")`
285
330
  model : str or list of str, optional
286
331
  The tide model (or models) used to model tides. If a list is
287
332
  provided, a new "tide_model" dimension will be added to the
@@ -295,7 +340,7 @@ def pixel_tides(
295
340
  model that match the structure required by `pyTMD`
296
341
  (<https://geoscienceaustralia.github.io/eo-tides/setup/>).
297
342
  resample : bool, optional
298
- Whether to resample low resolution tides back into `ds`'s original
343
+ Whether to resample low resolution tides back into `data`'s original
299
344
  higher resolution grid. Set this to `False` if you do not want
300
345
  low resolution tides to be re-projected back to higher resolution.
301
346
  calculate_quantiles : tuple of float or numpy.ndarray, optional
@@ -307,38 +352,38 @@ def pixel_tides(
307
352
  resolution : float, optional
308
353
  The desired resolution of the low-resolution grid used for tide
309
354
  modelling. The default None will create a 5000 m resolution grid
310
- if `ds` has a projected CRS (i.e. metre units), or a 0.05 degree
311
- resolution grid if `ds` has a geographic CRS (e.g. degree units).
355
+ if `data` has a projected CRS (i.e. metre units), or a 0.05 degree
356
+ resolution grid if `data` has a geographic CRS (e.g. degree units).
312
357
  Note: higher resolutions do not necessarily provide better
313
358
  tide modelling performance, as results will be limited by the
314
359
  resolution of the underlying global tide model (e.g. 1/16th
315
360
  degree / ~5 km resolution grid for FES2014).
316
361
  buffer : float, optional
317
362
  The amount by which to buffer the higher resolution grid extent
318
- when creating the new low resolution grid. This buffering is
319
- important as it ensures that ensure pixel-based tides are seamless
320
- across dataset boundaries. This buffer will eventually be clipped
321
- away when the low-resolution data is re-projected back to the
322
- resolution and extent of the higher resolution dataset. To
323
- ensure that at least two pixels occur outside of the dataset
324
- bounds, the default None applies a 12000 m buffer if `ds` has a
363
+ when creating the new low resolution grid. This buffering
364
+ ensures that modelled tides are seamless across analysis
365
+ boundaries. This buffer is eventually clipped away when
366
+ the low-resolution modelled tides are re-projected back to the
367
+ original resolution and extent of `data`. To ensure that at least
368
+ two low-resolution grid pixels occur outside of the dataset
369
+ bounds, the default None applies a 12000 m buffer if `data` has a
325
370
  projected CRS (i.e. metre units), or a 0.12 degree buffer if
326
- `ds` has a geographic CRS (e.g. degree units).
371
+ `data` has a geographic CRS (e.g. degree units).
327
372
  resample_method : str, optional
328
373
  If resampling is requested (see `resample` above), use this
329
374
  resampling method when converting from low resolution to high
330
375
  resolution pixels. Defaults to "bilinear"; valid options include
331
376
  "nearest", "cubic", "min", "max", "average" etc.
332
- dask_chunks : str or tuple of float, optional
377
+ dask_chunks : tuple of float, optional
333
378
  Can be used to configure custom Dask chunking for the final
334
- resampling step. The default of "auto" will automatically set
335
- x/y chunks to match those in `ds` if they exist, otherwise will
336
- set x/y chunks that cover the entire extent of the dataset.
379
+ resampling step. By default, chunks will be automatically set
380
+ to match y/x chunks from `data` if they exist; otherwise chunks
381
+ will be chosen to cover the entire y/x extent of the dataset.
337
382
  For custom chunks, provide a tuple in the form `(y, x)`, e.g.
338
383
  `(2048, 2048)`.
339
384
  dask_compute : bool, optional
340
385
  Whether to compute results of the resampling step using Dask.
341
- If False, this will return `tides_highres` as a Dask array.
386
+ If False, `tides_highres` will be returned as a Dask array.
342
387
  **model_tides_kwargs :
343
388
  Optional parameters passed to the `eo_tides.model.model_tides`
344
389
  function. Important parameters include `cutoff` (used to
@@ -348,54 +393,34 @@ def pixel_tides(
348
393
  Returns
349
394
  -------
350
395
  tides_da : xr.DataArray
351
- If `resample=True` (default), a high-resolution array
352
- of tide heights matching the exact spatial resolution and
353
- extents of `ds`. This will contain either tide heights every
354
- timestep in `ds` (if `times` is None), tide heights at every
355
- time in `times` (if `times` is not None), or tide height
396
+ A three-dimensional tide height array.
397
+ If `resample=True` (default), a high-resolution array of tide
398
+ heights will be returned that matches the exact spatial resolution
399
+ and extents of `data`. This will contain either tide heights for
400
+ every timestep in `data` (or in `time` if provided), or tide height
356
401
  quantiles for every quantile provided by `calculate_quantiles`.
357
402
  If `resample=False`, results for the intermediate low-resolution
358
403
  tide modelling grid will be returned instead.
359
404
  """
360
- # First test if no time dimension and nothing passed to `times`
361
- if ("time" not in ds.dims) & (times is None):
362
- raise ValueError(
363
- "`ds` does not contain a 'time' dimension. Times are required "
364
- "for modelling tides: please pass in a set of custom tides "
365
- "using the `times` parameter. For example: "
366
- "`times=pd.date_range(start='2000', end='2001', freq='5h')`",
367
- )
368
-
369
- # If custom times are provided, convert them to a consistent
370
- # pandas.DatatimeIndex format
371
- if times is not None:
372
- if isinstance(times, list):
373
- time_coords = pd.DatetimeIndex(times)
374
- elif isinstance(times, pd.Timestamp):
375
- time_coords = pd.DatetimeIndex([times])
376
- else:
377
- time_coords = times
378
-
379
- # Otherwise, use times from `ds` directly
380
- else:
381
- time_coords = ds.coords["time"]
382
-
383
- # Standardise model into a list for easy handling
405
+ # Standardise data inputs, time and models
406
+ gbox, time_coords = _standardise_inputs(data, time)
407
+ dask_chunks = _resample_chunks(data, dask_chunks)
384
408
  model = [model] if isinstance(model, str) else model
385
409
 
386
410
  # Determine spatial dimensions
387
- y_dim, x_dim = ds.odc.spatial_dims
411
+ y_dim, x_dim = gbox.dimensions
388
412
 
389
413
  # Determine resolution and buffer, using different defaults for
390
414
  # geographic (i.e. degrees) and projected (i.e. metres) CRSs:
391
- crs_units = ds.odc.geobox.crs.units[0][0:6]
392
- if ds.odc.geobox.crs.geographic:
415
+ assert gbox.crs is not None
416
+ crs_units = gbox.crs.units[0][0:6]
417
+ if gbox.crs.geographic:
393
418
  if resolution is None:
394
419
  resolution = 0.05
395
420
  elif resolution > 360:
396
421
  raise ValueError(
397
422
  f"A resolution of greater than 360 was "
398
- f"provided, but `ds` has a geographic CRS "
423
+ f"provided, but `data` has a geographic CRS "
399
424
  f"in {crs_units} units. Did you accidentally "
400
425
  f"provide a resolution in projected "
401
426
  f"(i.e. metre) units?",
@@ -408,7 +433,7 @@ def pixel_tides(
408
433
  elif resolution < 1:
409
434
  raise ValueError(
410
435
  f"A resolution of less than 1 was provided, "
411
- f"but `ds` has a projected CRS in "
436
+ f"but `data` has a projected CRS in "
412
437
  f"{crs_units} units. Did you accidentally "
413
438
  f"provide a resolution in geographic "
414
439
  f"(degree) units?",
@@ -417,12 +442,12 @@ def pixel_tides(
417
442
  buffer = 12000
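The default resolution and buffer logic can be summarised in a small standalone function. This sketch is based on the values stated in the `pixel_tides` docstring (0.05 degrees and a 0.12 degree buffer for geographic CRSs, 5000 m and a 12000 m buffer for projected CRSs) and on the two sanity checks visible in the diff. `default_grid_params` is a hypothetical name, and the error messages are shortened for illustration.

```python
def default_grid_params(geographic, resolution=None, buffer=None):
    """Return (resolution, buffer) for the low-resolution modelling grid.

    Hypothetical helper mirroring the defaults documented for `pixel_tides`.
    """
    if geographic:
        # Degree units: default to a 0.05 degree grid
        if resolution is None:
            resolution = 0.05
        elif resolution > 360:
            raise ValueError("resolution looks like metres, but CRS is geographic")
        if buffer is None:
            buffer = 0.12
    else:
        # Metre units: default to a 5000 m grid
        if resolution is None:
            resolution = 5000
        elif resolution < 1:
            raise ValueError("resolution looks like degrees, but CRS is projected")
        if buffer is None:
            buffer = 12000

    return resolution, buffer


geographic_defaults = default_grid_params(geographic=True)
projected_defaults = default_grid_params(geographic=False)
```

In both cases the default buffer is a little over twice the default resolution, consistent with the docstring's aim of keeping at least two low-resolution pixels outside the dataset bounds.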
418
443
 
419
444
  # Raise error if resolution is less than dataset resolution
420
- dataset_res = ds.odc.geobox.resolution.x
445
+ dataset_res = gbox.resolution.x
421
446
  if resolution < dataset_res:
422
447
  raise ValueError(
423
448
  f"The resolution of the low-resolution tide "
424
449
  f"modelling grid ({resolution:.2f}) is less "
425
- f"than `ds`'s pixel resolution ({dataset_res:.2f}). "
450
+ f"than `data`'s pixel resolution ({dataset_res:.2f}). "
426
451
  f"This can cause extremely slow tide modelling "
427
452
  f"performance. Please provide a resolution "
428
453
  f"greater than {dataset_res:.2f} using "
@@ -432,20 +457,20 @@ def pixel_tides(
432
457
  # Create a new reduced resolution tide modelling grid after
433
458
  # first buffering the grid
434
459
  print(f"Creating reduced resolution {resolution} x {resolution} {crs_units} tide modelling array")
435
- buffered_geobox = ds.odc.geobox.buffered(buffer)
460
+ buffered_geobox = gbox.buffered(buffer)
436
461
  rescaled_geobox = GeoBox.from_bbox(bbox=buffered_geobox.boundingbox, resolution=resolution)
437
462
  rescaled_ds = odc.geo.xr.xr_zeros(rescaled_geobox)
438
463
 
439
464
  # Flatten grid to 1D, then add time dimension
440
465
  flattened_ds = rescaled_ds.stack(z=(x_dim, y_dim))
441
- flattened_ds = flattened_ds.expand_dims(dim={"time": time_coords.values})
466
+ flattened_ds = flattened_ds.expand_dims(dim={"time": time_coords})
442
467
 
443
468
  # Model tides in parallel, returning a pandas.DataFrame
444
469
  tide_df = model_tides(
445
470
  x=flattened_ds[x_dim],
446
471
  y=flattened_ds[y_dim],
447
472
  time=flattened_ds.time,
448
- crs=f"EPSG:{ds.odc.geobox.crs.epsg}",
473
+ crs=f"EPSG:{gbox.crs.epsg}",
449
474
  model=model,
450
475
  directory=directory,
451
476
  **model_tides_kwargs,
@@ -480,14 +505,14 @@ def pixel_tides(
480
505
  tides_lowres = tides_lowres.squeeze("tide_model")
481
506
 
482
507
  # Ensure CRS is present before we apply any resampling
483
- tides_lowres = tides_lowres.odc.assign_crs(ds.odc.geobox.crs)
508
+ tides_lowres = tides_lowres.odc.assign_crs(gbox.crs)
484
509
 
485
510
  # Reproject into original high resolution grid
486
511
  if resample:
487
512
  print("Reprojecting tides into original resolution")
488
513
  tides_highres = _pixel_tides_resample(
489
514
  tides_lowres,
490
- ds,
515
+ gbox,
491
516
  resample_method,
492
517
  dask_chunks,
493
518
  dask_compute,