deepsensor-0.3.8.tar.gz → deepsensor-0.4.1.tar.gz

Files changed (46)

{deepsensor-0.3.8 → deepsensor-0.4.1}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: deepsensor
-Version: 0.3.8
+Version: 0.4.1
 Summary: A Python package for modelling xarray and pandas data with neural processes.
 Home-page: https://github.com/alan-turing-institute/deepsensor
 Author: Tom R. Andersson
@@ -44,7 +44,7 @@ data with neural processes</p>
 -----------
-[![release](https://img.shields.io/badge/release-v0.3.8-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
+[![release](https://img.shields.io/badge/release-v0.4.1-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
 [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://alan-turing-institute.github.io/deepsensor/)
 ![Tests](https://github.com/alan-turing-institute/deepsensor/actions/workflows/tests.yml/badge.svg)
 [![Coverage Status](https://coveralls.io/repos/github/alan-turing-institute/deepsensor/badge.svg?branch=main)](https://coveralls.io/github/alan-turing-institute/deepsensor?branch=main)
@@ -240,7 +240,7 @@ if you would like to join this list!
 <table>
   <tbody>
     <tr>
-      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a></td>
+      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a> <a href="#code-acocac" title="Code">💻</a> <a href="#test-acocac" title="Tests">⚠️</a></td>
       <td align="center" valign="top" width="14.28%"><a href="https://github.com/annavaughan"><img src="https://avatars.githubusercontent.com/u/45528489?v=4?s=100" width="100px;" alt="Anna Vaughan"/><br /><sub><b>Anna Vaughan</b></sub></a><br /><a href="#research-annavaughan" title="Research">🔬</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://davidwilby.dev"><img src="https://avatars.githubusercontent.com/u/24752124?v=4?s=100" width="100px;" alt="David Wilby"/><br /><sub><b>David Wilby</b></sub></a><br /><a href="#doc-davidwilby" title="Documentation">📖</a> <a href="#test-davidwilby" title="Tests">⚠️</a> <a href="#maintenance-davidwilby" title="Maintenance">🚧</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://inconsistentrecords.co.uk"><img src="https://avatars.githubusercontent.com/u/731727?v=4?s=100" width="100px;" alt="Jim Circadian"/><br /><sub><b>Jim Circadian</b></sub></a><br /><a href="#ideas-JimCircadian" title="Ideas, Planning, & Feedback">🤔</a> <a href="#projectManagement-JimCircadian" title="Project Management">📆</a> <a href="#maintenance-JimCircadian" title="Maintenance">🚧</a></td>

{deepsensor-0.3.8 → deepsensor-0.4.1}/README.md RENAMED

@@ -11,7 +11,7 @@ data with neural processes</p>
 -----------
-[![release](https://img.shields.io/badge/release-v0.3.8-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
+[![release](https://img.shields.io/badge/release-v0.4.1-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
 [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://alan-turing-institute.github.io/deepsensor/)
 ![Tests](https://github.com/alan-turing-institute/deepsensor/actions/workflows/tests.yml/badge.svg)
 [![Coverage Status](https://coveralls.io/repos/github/alan-turing-institute/deepsensor/badge.svg?branch=main)](https://coveralls.io/github/alan-turing-institute/deepsensor?branch=main)
@@ -207,7 +207,7 @@ if you would like to join this list!
 <table>
   <tbody>
     <tr>
-      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a></td>
+      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a> <a href="#code-acocac" title="Code">💻</a> <a href="#test-acocac" title="Tests">⚠️</a></td>
       <td align="center" valign="top" width="14.28%"><a href="https://github.com/annavaughan"><img src="https://avatars.githubusercontent.com/u/45528489?v=4?s=100" width="100px;" alt="Anna Vaughan"/><br /><sub><b>Anna Vaughan</b></sub></a><br /><a href="#research-annavaughan" title="Research">🔬</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://davidwilby.dev"><img src="https://avatars.githubusercontent.com/u/24752124?v=4?s=100" width="100px;" alt="David Wilby"/><br /><sub><b>David Wilby</b></sub></a><br /><a href="#doc-davidwilby" title="Documentation">📖</a> <a href="#test-davidwilby" title="Tests">⚠️</a> <a href="#maintenance-davidwilby" title="Maintenance">🚧</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://inconsistentrecords.co.uk"><img src="https://avatars.githubusercontent.com/u/731727?v=4?s=100" width="100px;" alt="Jim Circadian"/><br /><sub><b>Jim Circadian</b></sub></a><br /><a href="#ideas-JimCircadian" title="Ideas, Planning, & Feedback">🤔</a> <a href="#projectManagement-JimCircadian" title="Project Management">📆</a> <a href="#maintenance-JimCircadian" title="Maintenance">🚧</a></td>

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/loader.py RENAMED

@@ -678,11 +678,11 @@ class TaskLoader:
         seed: Optional[int] = None,
     ) -> (np.ndarray, np.ndarray):
         """
-        Sample a DataArray according to a given strategy.
+        Sample a DataFrame according to a given strategy.
         Args:
             df (:class:`pandas.DataFrame` | :class:`pandas.Series`):
-                DataArray to sample, assumed to be time-sliced for the task
+                Dataframe to sample, assumed to be time-sliced for the task
                 already.
             sampling_strat (str | int | float | :class:`numpy:numpy.ndarray`):
                 Sampling strategy, either "all" or an integer for random grid
@@ -720,20 +720,24 @@ class TaskLoader:
             X_c = df.reset_index()[["x1", "x2"]].values.T.astype(self.dtype)
             Y_c = df.values.T
         elif isinstance(sampling_strat, np.ndarray):
+            if df.index.get_level_values("x1").dtype != sampling_strat.dtype:
+                raise InvalidSamplingStrategyError(
+                    "Passed a numpy coordinate array to sample pandas DataFrame, "
+                    "but the coordinate array has a different dtype than the DataFrame. "
+                    f"Got {sampling_strat.dtype} but expected {df.index.get_level_values('x1').dtype}."
+                )
             X_c = sampling_strat.astype(self.dtype)
-            x1match = np.in1d(df.index.get_level_values("x1"), X_c[0])
-            x2match = np.in1d(df.index.get_level_values("x2"), X_c[1])
-            num_matches = np.sum(x1match & x2match)
-            # Check that we got all the samples we asked for
-            if num_matches != X_c.shape[1]:
+            try:
+                Y_c = df.loc[pd.IndexSlice[:, X_c[0], X_c[1]]].values.T
+            except KeyError:
                 raise InvalidSamplingStrategyError(
-                    f"Passed a numpy coordinate array to sample pandas DataFrame, "
-                    f"but the DataFrame did not contain all the requested samples. "
-                    f"Requested {X_c.shape[1]} samples but only got {num_matches}."
+                    "Passed a numpy coordinate array to sample pandas DataFrame, "
+                    "but the DataFrame did not contain all the requested samples.\n"
+                    f"Indexes: {df.index}\n"
+                    f"Sampling coords: {X_c}\n"
+                    "If this is unexpected, check that your numpy sampling array matches "
+                    "the DataFrame index values *exactly*."
                 )
-            Y_c = df[x1match & x2match].values.T
         else:
             raise InvalidSamplingStrategyError(
                 f"Unknown sampling strategy {sampling_strat}"
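
The new dtype check in this hunk guards against a subtle failure: a float32 coordinate array rarely matches float64 index values exactly, so an exact-label lookup would miss. The following is a minimal sketch of that idea, not the package's implementation; `check_coord_dtype` is a hypothetical helper, and the toy DataFrame and the variable name `t2m` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def check_coord_dtype(df, coords):
    """Hypothetical helper: raise if the coordinate array's dtype differs
    from the dtype of the DataFrame's "x1" index level."""
    index_dtype = df.index.get_level_values("x1").dtype
    if index_dtype != coords.dtype:
        raise ValueError(f"Got {coords.dtype} but expected {index_dtype}.")


# Toy DataFrame indexed by (time, x1, x2) with float64 coordinates
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2020-01-01"]), [0.1, 0.2], [0.3, 0.4]],
    names=["time", "x1", "x2"],
)
df = pd.DataFrame({"t2m": np.arange(4.0)}, index=index)

coords64 = np.array([[0.1, 0.2], [0.3, 0.4]])  # float64, same as the index
coords32 = coords64.astype(np.float32)  # float32, differs from the index

check_coord_dtype(df, coords64)  # passes silently

try:
    check_coord_dtype(df, coords32)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# The float32 value is not numerically equal to the float64 value it came from
values_differ = bool(np.float64(coords32[0, 0]) != coords64[0, 0])
```

Raising early on a dtype mismatch gives a clearer message than a failed label lookup later on.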

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/sources.py RENAMED

@@ -277,7 +277,7 @@ def get_era5_reanalysis_data(
     if num_processes == 1:
         # Just download in one go
         if verbose:
-            print("Downloading ERA5 data in without parallelisation... ")
+            print("Downloading ERA5 data without parallelisation... ")
         era5_da = _get_era5_reanalysis_data_parallel(
             date_range=date_range,
             var_IDs=var_IDs,
@@ -432,7 +432,7 @@ def get_gldas_land_mask(
         with urllib.request.urlopen(req) as response:
             with open(fname, "wb") as f:
                 f.write(response.read())
-        da = xr.open_dataset(fname)["GLDAS_mask"].isel(time=0).drop("time").load()
+        da = xr.open_dataset(fname)["GLDAS_mask"].isel(time=0).drop_vars("time").load()
         if isinstance(extent, str):
             extent = extent_str_to_tuple(extent)
@@ -577,7 +577,7 @@ def get_earthenv_auxiliary_data(
             # Read data
             da = xr.open_dataset(fname).to_array().squeeze().load()
             da = da.rename({"y": "lat", "x": "lon"})
-            da = da.drop(["band", "spatial_ref", "variable"])
+            da = da.drop_vars(["band", "spatial_ref", "variable"])
             da.name = var_ID
             da = da.sel(lat=slice(lat_max, lat_min), lon=slice(lon_min, lon_max))
             da_dict[var_ID] = da
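
The hunks above replace `.drop(...)` with `.drop_vars(...)` on xarray objects. As a small illustration on made-up data (the array name and values are assumptions), selecting a single time step leaves `time` behind as a scalar coordinate, and `drop_vars` removes it:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy mask with a length-1 time dimension
da = xr.DataArray(
    np.ones((1, 2, 2)),
    dims=("time", "lat", "lon"),
    coords={
        "time": pd.to_datetime(["2000-01-01"]),
        "lat": [10.0, 20.0],
        "lon": [0.0, 5.0],
    },
    name="mask",
)

selected = da.isel(time=0)  # "time" is no longer a dimension, but stays as a scalar coord
cleaned = selected.drop_vars("time")  # removes the leftover coordinate
```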

deepsensor-0.4.1/deepsensor/eval/__init__.py ADDED

@@ -0,0 +1 @@
+from .metrics import *

deepsensor-0.4.1/deepsensor/eval/metrics.py ADDED

@@ -0,0 +1,24 @@
+import xarray as xr
+from deepsensor.model.pred import Prediction
+def compute_errors(pred: Prediction, target: xr.Dataset) -> xr.Dataset:
+    """
+    Compute errors between predictions and targets.
+    Args:
+        pred: Prediction object.
+        target: Target data.
+    Returns:
+        xr.Dataset: Dataset of pointwise differences between predictions and targets
+        at the same valid time in the predictions. Note, the difference is positive
+        when the prediction is greater than the target.
+    """
+    errors = {}
+    for var_ID, pred_var in pred.items():
+        target_var = target[var_ID]
+        error = pred_var["mean"] - target_var.sel(time=pred_var.time)
+        error.name = f"{var_ID}"
+        errors[var_ID] = error
+    return xr.Dataset(errors)
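
To see what `compute_errors` computes without installing the package, here is a self-contained sketch that mirrors its loop. It assumes a plain `dict` of `xr.Dataset` objects standing in for a `Prediction`; `compute_errors_sketch`, the variable name `t2m`, and the toy data are all hypothetical.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2020-01-01", periods=3, freq="D")
x1 = [0.0, 0.5, 1.0]
x2 = [0.0, 1.0]

# Target data: zeros on three dates
target = xr.Dataset(
    {"t2m": (("time", "x1", "x2"), np.zeros((3, 3, 2)))},
    coords={"time": times, "x1": x1, "x2": x2},
)

# Stand-in for a Prediction: a dict mapping variable ID to a Dataset holding
# a "mean" variable, here predicting ones on the first two dates only
pred = {
    "t2m": xr.Dataset(
        {"mean": (("time", "x1", "x2"), np.ones((2, 3, 2)))},
        coords={"time": times[:2], "x1": x1, "x2": x2},
    )
}


def compute_errors_sketch(pred, target):
    """Pointwise (prediction mean - target) at the prediction times."""
    errors = {}
    for var_ID, pred_var in pred.items():
        error = pred_var["mean"] - target[var_ID].sel(time=pred_var.time)
        error.name = var_ID
        errors[var_ID] = error
    return xr.Dataset(errors)


errors = compute_errors_sketch(pred, target)
```

The sign convention matches the docstring: the error is positive where the prediction exceeds the target.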

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/model.py RENAMED

@@ -348,6 +348,34 @@ class DeepSensorModel(ProbabilisticModel):
         if ar_sample and n_samples < 1:
             raise ValueError("Must pass `n_samples` > 0 to use `ar_sample`.")
+        target_delta_t = self.task_loader.target_delta_t
+        dts = [pd.Timedelta(dt) for dt in target_delta_t]
+        dts_all_zero = all([dt == pd.Timedelta(seconds=0) for dt in dts])
+        if target_delta_t is not None and dts_all_zero:
+            forecasting_mode = False
+            lead_times = None
+        elif target_delta_t is not None and not dts_all_zero:
+            target_var_IDs_set = set(self.task_loader.target_var_IDs)
+            msg = f"""
+            Got more than one set of target variables in target sets,
+            but predictions can only be made with one set of target variables
+            to simplify implementation.
+            Got {target_var_IDs_set}.
+            """
+            assert len(target_var_IDs_set) == 1, msg
+            # Repeat lead_tim for each variable in each target set
+            lead_times = []
+            for target_set_idx, dt in enumerate(target_delta_t):
+                target_set_dim = self.task_loader.target_dims[target_set_idx]
+                lead_times += [
+                    pd.Timedelta(dt, unit=self.task_loader.time_freq)
+                    for _ in range(target_set_dim)
+                ]
+            forecasting_mode = True
+        else:
+            forecasting_mode = False
+            lead_times = None
         if type(tasks) is Task:
             tasks = [tasks]
@@ -355,12 +383,14 @@ class DeepSensorModel(ProbabilisticModel):
             B.set_random_seed(seed)
             np.random.seed(seed)
-        dates = [task["time"] for task in tasks]
+        init_dates = [task["time"] for task in tasks]
         # Flatten tuple of tuples to single list
         target_var_IDs = [
             var_ID for set in self.task_loader.target_var_IDs for var_ID in set
         ]
+        if lead_times is not None:
+            assert len(lead_times) == len(target_var_IDs)
         # TODO consider removing this logic, can we just depend on the dim names in X_t?
         if not unnormalise:
@@ -385,7 +415,7 @@ class DeepSensorModel(ProbabilisticModel):
         elif isinstance(X_t, (xr.DataArray, xr.Dataset)):
             # Remove time dimension if present
             if "time" in X_t.coords:
-                X_t = X_t.isel(time=0).drop("time")
+                X_t = X_t.isel(time=0).drop_vars("time")
         if mode == "off-grid" and append_indexes is not None:
             # Check append_indexes are all same length as X_t
@@ -450,11 +480,13 @@ class DeepSensorModel(ProbabilisticModel):
         pred = Prediction(
             target_var_IDs,
             pred_params_to_store,
-            dates,
+            init_dates,
             X_t,
             X_t_mask,
             coord_names,
             n_samples=n_samples,
+            forecasting_mode=forecasting_mode,
+            lead_times=lead_times,
         )
         def unnormalise_pred_array(arr, **kwargs):
@@ -605,14 +637,22 @@ class DeepSensorModel(ProbabilisticModel):
             # Assign predictions to Prediction object
             for param, arr in prediction_arrs.items():
                 if param != "mixture_probs":
-                    pred.assign(param, task["time"], arr)
+                    pred.assign(param, task["time"], arr, lead_times=lead_times)
                 elif param == "mixture_probs":
                     assert arr.shape[0] == self.N_mixture_components, (
                         f"Number of mixture components ({arr.shape[0]}) does not match "
                         f"model attribute N_mixture_components ({self.N_mixture_components})."
                     )
                     for component_i, probs in enumerate(arr):
-                        pred.assign(f"{param}_{component_i}", task["time"], probs)
+                        pred.assign(
+                            f"{param}_{component_i}",
+                            task["time"],
+                            probs,
+                            lead_times=lead_times,
+                        )
+        if forecasting_mode:
+            pred = add_valid_time_coord_to_pred_and_move_time_dims(pred)
         if verbose:
             dur = time.time() - tic
@@ -621,6 +661,37 @@ class DeepSensorModel(ProbabilisticModel):
         return pred
+def add_valid_time_coord_to_pred_and_move_time_dims(pred: Prediction) -> Prediction:
+    """
+    Add a valid time coordinate "time" to a Prediction object based on the
+    initialisation times "init_time" and lead times "lead_time", and
+    reorder the time dims from ("lead_time", "init_time") to ("init_time", "lead_time").
+    Args:
+        pred (:class:`~.model.pred.Prediction`):
+            Prediction object to add valid time coordinate to.
+    Returns:
+        :class:`~.model.pred.Prediction`:
+            Prediction object with valid time coordinate added.
+    """
+    for var_ID in pred.keys():
+        if isinstance(pred[var_ID], pd.DataFrame):
+            x = pred[var_ID].reset_index()
+            pred[var_ID]["time"] = (x["lead_time"] + x["init_time"]).values
+            pred[var_ID] = pred[var_ID].swaplevel("init_time", "lead_time")
+            pred[var_ID] = pred[var_ID].sort_index()
+        elif isinstance(pred[var_ID], xr.Dataset):
+            x = pred[var_ID]
+            pred[var_ID] = pred[var_ID].assign_coords(
+                time=x["lead_time"] + x["init_time"]
+            )
+            pred[var_ID] = pred[var_ID].transpose("init_time", "lead_time", ...)
+        else:
+            raise ValueError(f"Unsupported prediction type {type(pred[var_ID])}.")
+    return pred
 def main():  # pragma: no cover
     import deepsensor.tensorflow
     from deepsensor.data.loader import TaskLoader
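
The xarray branch of `add_valid_time_coord_to_pred_and_move_time_dims` can be illustrated on a toy dataset. This is a sketch under assumed data (two initialisation dates, three daily lead times), not a reproduction of the package's objects: the valid time is the broadcast sum of `init_time` and `lead_time`, attached as a two-dimensional `time` coordinate before the time dimensions are reordered.

```python
import numpy as np
import pandas as pd
import xarray as xr

init_times = pd.date_range("2020-01-01", periods=2, freq="D")
lead_times = pd.to_timedelta([1, 2, 3], unit="D")

# Toy forecast with dims ordered ("lead_time", "init_time"), as before the reorder
ds = xr.Dataset(
    {"mean": (("lead_time", "init_time"), np.zeros((3, 2)))},
    coords={"lead_time": lead_times, "init_time": init_times},
)

# Valid time = initialisation time + lead time, broadcast over both dims
ds = ds.assign_coords(time=ds["lead_time"] + ds["init_time"])

# Put init_time first, keeping any remaining dims in their existing order
ds = ds.transpose("init_time", "lead_time", ...)

# Forecast initialised on 2020-01-01 with a 2-day lead is valid on 2020-01-03
valid = ds["time"].isel(init_time=0, lead_time=1).values
```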

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/pred.py RENAMED

@@ -4,6 +4,8 @@ import numpy as np
 import pandas as pd
 import xarray as xr
+Timestamp = Union[str, pd.Timestamp, np.datetime64]
 class Prediction(dict):
     """
@@ -32,13 +34,20 @@ class Prediction(dict):
         n_samples (int)
             Number of joint samples to draw from the model. If 0, will not
             draw samples. Default 0.
+        forecasting_mode (bool)
+            If True, stored forecast predictions with an init_time and lead_time dimension,
+            and a valid_time coordinate. If False, stores prediction at t=0 only
+            (i.e. spatial interpolation), with only a single time dimension. Default False.
+        lead_times (List[pd.Timedelta], optional)
+            List of lead times to store in predictions. Must be provided if
+            forecasting_mode is True. Default None.
     """
     def __init__(
         self,
         target_var_IDs: List[str],
         pred_params: List[str],
-        dates: List[Union[str, pd.Timestamp]],
+        dates: List[Timestamp],
         X_t: Union[
             xr.Dataset,
             xr.DataArray,
@@ -50,6 +59,8 @@ class Prediction(dict):
         X_t_mask: Optional[Union[xr.Dataset, xr.DataArray]] = None,
         coord_names: dict = None,
         n_samples: int = 0,
+        forecasting_mode: bool = False,
+        lead_times: Optional[List[pd.Timedelta]] = None,
     ):
         self.target_var_IDs = target_var_IDs
         self.X_t_mask = X_t_mask
@@ -58,6 +69,13 @@ class Prediction(dict):
         self.x1_name = coord_names["x1"]
         self.x2_name = coord_names["x2"]
+        self.forecasting_mode = forecasting_mode
+        if forecasting_mode:
+            assert (
+                lead_times is not None
+            ), "If forecasting_mode is True, lead_times must be provided."
+        self.lead_times = lead_times
         self.mode = infer_prediction_modality_from_X_t(X_t)
         self.pred_params = pred_params
@@ -67,15 +85,25 @@ class Prediction(dict):
                 *[f"sample_{i}" for i in range(n_samples)],
             ]
+        # Create empty xarray/pandas objects to store predictions
         if self.mode == "on-grid":
             for var_ID in self.target_var_IDs:
-                # Create empty xarray/pandas objects to store predictions
+                if self.forecasting_mode:
+                    prepend_dims = ["lead_time"]
+                    prepend_coords = {"lead_time": lead_times}
+                else:
+                    prepend_dims = None
+                    prepend_coords = None
                 self[var_ID] = create_empty_spatiotemporal_xarray(
                     X_t,
                     dates,
                     data_vars=self.pred_params,
                     coord_names=coord_names,
+                    prepend_dims=prepend_dims,
+                    prepend_coords=prepend_coords,
                 )
+                if self.forecasting_mode:
+                    self[var_ID] = self[var_ID].rename(time="init_time")
             if self.X_t_mask is None:
                 # Create 2D boolean array of True values to simplify indexing
                 self.X_t_mask = (
@@ -86,8 +114,18 @@ class Prediction(dict):
                 )
         elif self.mode == "off-grid":
             # Repeat target locs for each date to create multiindex
-            idxs = [(date, *idxs) for date in dates for idxs in X_t.index]
-            index = pd.MultiIndex.from_tuples(idxs, names=["time", *X_t.index.names])
+            if self.forecasting_mode:
+                index_names = ["lead_time", "init_time", *X_t.index.names]
+                idxs = [
+                    (lt, date, *idxs)
+                    for lt in lead_times
+                    for date in dates
+                    for idxs in X_t.index
+                ]
+            else:
+                index_names = ["time", *X_t.index.names]
+                idxs = [(date, *idxs) for date in dates for idxs in X_t.index]
+            index = pd.MultiIndex.from_tuples(idxs, names=index_names)
             for var_ID in self.target_var_IDs:
                 self[var_ID] = pd.DataFrame(index=index, columns=self.pred_params)
@@ -106,6 +144,7 @@ class Prediction(dict):
         prediction_parameter: str,
         date: Union[str, pd.Timestamp],
         data: np.ndarray,
+        lead_times: Optional[List[pd.Timedelta]] = None,
     ):
         """
@@ -117,11 +156,29 @@ class Prediction(dict):
             data (np.ndarray)
                 If off-grid: Shape (N_var, N_targets) or (N_samples, N_var, N_targets).
                 If on-grid: Shape (N_var, N_x1, N_x2) or (N_samples, N_var, N_x1, N_x2).
+            lead_time (pd.Timedelta, optional)
+                Lead time of the forecast. Required if forecasting_mode is True. Default None.
         """
+        if self.forecasting_mode:
+            assert (
+                lead_times is not None
+            ), "If forecasting_mode is True, lead_times must be provided."
+            msg = f"""
+            If forecasting_mode is True, lead_times must be of equal length to the number of
+            variables in the data (the first dimension). Got {lead_times=} of length
+            {len(lead_times)} lead times and data shape {data.shape}.
+            """
+            assert len(lead_times) == data.shape[0], msg
         if self.mode == "on-grid":
             if prediction_parameter != "samples":
-                for var_ID, pred in zip(self.target_var_IDs, data):
-                    self[var_ID][prediction_parameter].loc[date].data[
+                for i, (var_ID, pred) in enumerate(zip(self.target_var_IDs, data)):
+                    if self.forecasting_mode:
+                        index = (lead_times[i], date)
+                    else:
+                        index = date
+                    self[var_ID][prediction_parameter].loc[index].data[
                         self.X_t_mask.data
                     ] = pred.ravel()
             elif prediction_parameter == "samples":
@@ -130,28 +187,44 @@ class Prediction(dict):
                     f"have shape (N_samples, N_var, N_x1, N_x2). Got {data.shape}."
                 )
                 for sample_i, sample in enumerate(data):
-                    for var_ID, pred in zip(self.target_var_IDs, sample):
-                        self[var_ID][f"sample_{sample_i}"].loc[date].data[
+                    for i, (var_ID, pred) in enumerate(
+                        zip(self.target_var_IDs, sample)
+                    ):
+                        if self.forecasting_mode:
+                            index = (lead_times[i], date)
+                        else:
+                            index = date
+                        self[var_ID][f"sample_{sample_i}"].loc[index].data[
                             self.X_t_mask.data
                         ] = pred.ravel()
         elif self.mode == "off-grid":
             if prediction_parameter != "samples":
-                for var_ID, pred in zip(self.target_var_IDs, data):
-                    self[var_ID][prediction_parameter].loc[date] = pred
+                for i, (var_ID, pred) in enumerate(zip(self.target_var_IDs, data)):
+                    if self.forecasting_mode:
+                        index = (lead_times[i], date)
+                    else:
+                        index = date
+                    self[var_ID][prediction_parameter].loc[index] = pred
             elif prediction_parameter == "samples":
                 assert len(data.shape) == 3, (
                     f"If prediction_parameter is 'samples', and mode is 'off-grid', data must"
                     f"have shape (N_samples, N_var, N_targets). Got {data.shape}."
                 )
                 for sample_i, sample in enumerate(data):
-                    for var_ID, pred in zip(self.target_var_IDs, sample):
-                        self[var_ID][f"sample_{sample_i}"].loc[date] = pred
+                    for i, (var_ID, pred) in enumerate(
+                        zip(self.target_var_IDs, sample)
+                    ):
+                        if self.forecasting_mode:
+                            index = (lead_times[i], date)
+                        else:
+                            index = date
+                        self[var_ID][f"sample_{sample_i}"].loc[index] = pred
 def create_empty_spatiotemporal_xarray(
     X: Union[xr.Dataset, xr.DataArray],
-    dates: List,
+    dates: List[Timestamp],
     coord_names: dict = None,
     data_vars: List[str] = None,
     prepend_dims: Optional[List[str]] = None,
@@ -231,10 +304,6 @@ def create_empty_spatiotemporal_xarray(
     # Convert time coord to pandas timestamps
     pred_ds = pred_ds.assign_coords(time=pd.to_datetime(pred_ds.time.values))
-    # TODO: Convert init time to forecast time?
-    # pred_ds = pred_ds.assign_coords(
-    #     time=pred_ds['time'] + pd.Timedelta(days=task_loader.target_delta_t[0]))
     return pred_ds
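
For off-grid forecasts, the diff builds a `MultiIndex` with `lead_time` and `init_time` levels in front of the target location levels. The sketch below reproduces that construction on assumed toy inputs (two lead times, two dates, two locations); the names `locs` and `pred_params` are illustrative.

```python
import pandas as pd

lead_times = [pd.Timedelta(days=1), pd.Timedelta(days=2)]
dates = pd.to_datetime(["2020-01-01", "2020-01-02"])

# Target locations as an (x1, x2) MultiIndex, standing in for X_t.index
locs = pd.MultiIndex.from_tuples([(0.0, 0.0), (1.0, 1.0)], names=["x1", "x2"])

# Repeat every location for every (lead time, init date) pair
index_names = ["lead_time", "init_time", *locs.names]
idxs = [(lt, date, *loc) for lt in lead_times for date in dates for loc in locs]
index = pd.MultiIndex.from_tuples(idxs, names=index_names)

# Empty frame with one column per prediction parameter, filled in later
pred_params = ["mean", "std"]
df = pd.DataFrame(index=index, columns=pred_params)
```

Placing `lead_time` first means a `(lead_time, init_time)` pair selects the rows for all target locations in one lookup.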

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/plot.py RENAMED

@@ -954,6 +954,8 @@ def prediction(
             ax = axes[row_i, col_i]
             if pred.mode == "on-grid":
+                if "init_time" in pred[0].indexes:
+                    raise ValueError("Plotting forecasts not currently supported.")
                 if param == "std":
                     vmin = 0
                 else:
@@ -1000,6 +1002,8 @@ def prediction(
                     # )
             elif pred.mode == "off-grid":
+                if "init_time" in pred[0].index.names:
+                    raise ValueError("Plotting forecasts not currently supported.")
                 import seaborn as sns
                 hue = (

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: deepsensor
-Version: 0.3.8
+Version: 0.4.1
 Summary: A Python package for modelling xarray and pandas data with neural processes.
 Home-page: https://github.com/alan-turing-institute/deepsensor
 Author: Tom R. Andersson
@@ -44,7 +44,7 @@ data with neural processes</p>
 -----------
-[![release](https://img.shields.io/badge/release-v0.3.8-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
+[![release](https://img.shields.io/badge/release-v0.4.1-green?logo=github)](https://github.com/alan-turing-institute/deepsensor/releases)
 [![Latest Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://alan-turing-institute.github.io/deepsensor/)
 ![Tests](https://github.com/alan-turing-institute/deepsensor/actions/workflows/tests.yml/badge.svg)
 [![Coverage Status](https://coveralls.io/repos/github/alan-turing-institute/deepsensor/badge.svg?branch=main)](https://coveralls.io/github/alan-turing-institute/deepsensor?branch=main)
@@ -240,7 +240,7 @@ if you would like to join this list!
 <table>
   <tbody>
     <tr>
-      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a></td>
+      <td align="center" valign="top" width="14.28%"><a href="https://github.com/acocac"><img src="https://avatars.githubusercontent.com/u/13321552?v=4?s=100" width="100px;" alt="Alejandro ©"/><br /><sub><b>Alejandro ©</b></sub></a><br /><a href="#userTesting-acocac" title="User Testing">📓</a> <a href="#bug-acocac" title="Bug reports">🐛</a> <a href="#mentoring-acocac" title="Mentoring">🧑‍🏫</a> <a href="#ideas-acocac" title="Ideas, Planning, & Feedback">🤔</a> <a href="#research-acocac" title="Research">🔬</a> <a href="#code-acocac" title="Code">💻</a> <a href="#test-acocac" title="Tests">⚠️</a></td>
       <td align="center" valign="top" width="14.28%"><a href="https://github.com/annavaughan"><img src="https://avatars.githubusercontent.com/u/45528489?v=4?s=100" width="100px;" alt="Anna Vaughan"/><br /><sub><b>Anna Vaughan</b></sub></a><br /><a href="#research-annavaughan" title="Research">🔬</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://davidwilby.dev"><img src="https://avatars.githubusercontent.com/u/24752124?v=4?s=100" width="100px;" alt="David Wilby"/><br /><sub><b>David Wilby</b></sub></a><br /><a href="#doc-davidwilby" title="Documentation">📖</a> <a href="#test-davidwilby" title="Tests">⚠️</a> <a href="#maintenance-davidwilby" title="Maintenance">🚧</a></td>
       <td align="center" valign="top" width="14.28%"><a href="http://inconsistentrecords.co.uk"><img src="https://avatars.githubusercontent.com/u/731727?v=4?s=100" width="100px;" alt="Jim Circadian"/><br /><sub><b>Jim Circadian</b></sub></a><br /><a href="#ideas-JimCircadian" title="Ideas, Planning, & Feedback">🤔</a> <a href="#projectManagement-JimCircadian" title="Project Management">📆</a> <a href="#maintenance-JimCircadian" title="Maintenance">🚧</a></td>

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/SOURCES.txt RENAMED

@@ -22,6 +22,8 @@ deepsensor/data/processor.py
 deepsensor/data/sources.py
 deepsensor/data/task.py
 deepsensor/data/utils.py
+deepsensor/eval/__init__.py
+deepsensor/eval/metrics.py
 deepsensor/model/__init__.py
 deepsensor/model/convnp.py
 deepsensor/model/defaults.py

{deepsensor-0.3.8 → deepsensor-0.4.1}/setup.cfg RENAMED

@@ -1,6 +1,6 @@
 [metadata]
 name = deepsensor
-version = 0.3.8
+version = 0.4.1
 author = Tom R. Andersson
 author_email = tomandersson3@gmail.com
 description = A Python package for modelling xarray and pandas data with neural processes.

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_model.py RENAMED

@@ -18,6 +18,7 @@ from deepsensor.data.processor import DataProcessor
 from deepsensor.data.loader import TaskLoader
 from deepsensor.model.convnp import ConvNP
 from deepsensor.train.train import Trainer
+from deepsensor.eval.metrics import compute_errors
 from tests.utils import gen_random_data_xr, gen_random_data_pandas
@@ -55,8 +56,15 @@ class TestModel(unittest.TestCase):
     def setUpClass(cls):
         # super().__init__(*args, **kwargs)
         # It's safe to share data between tests because the TaskLoader does not modify data
+        cls.var_ID = "2m_temp"
         cls.da = _gen_data_xr()
+        cls.da.name = cls.var_ID
         cls.df = _gen_data_pandas()
+        cls.df.name = cls.var_ID
+        # Various tests assume we have a single target set with a single variable.
+        # If a test requires multiple target sets or variables, this is set up in the test.
+        assert isinstance(cls.da, xr.DataArray)
+        assert isinstance(cls.df, pd.Series)
         cls.dp = DataProcessor()
         _ = cls.dp([cls.da, cls.df])  # Compute normalisation parameters
@@ -417,10 +425,10 @@ class TestModel(unittest.TestCase):
         task = tl("2020-01-01")
         pred = model.predict(task, X_t=da_raw)
-        assert np.array_equal(
+        np.testing.assert_array_equal(
             pred["dummy_data"]["mean"]["latitude"], da_raw["latitude"]
         )
-        assert np.array_equal(
+        np.testing.assert_array_equal(
             pred["dummy_data"]["mean"]["longitude"], da_raw["longitude"]
         )
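
Note on the hunk above: replacing `assert np.array_equal(...)` with `np.testing.assert_array_equal(...)` keeps the same check but makes a failure report which elements differ. A minimal sketch of that difference, using made-up arrays (`a` and `b` are illustrative, not the test's prediction coordinates):

```python
import numpy as np

# Two illustrative coordinate arrays that differ in the last element.
a = np.array([50.0, 50.5, 51.0])
b = np.array([50.0, 50.5, 51.5])

# np.array_equal only returns a boolean, so a bare assert has nothing to report.
arrays_match = np.array_equal(a, b)

# np.testing.assert_array_equal raises an AssertionError whose message
# summarises the mismatching elements.
detailed_message = ""
try:
    np.testing.assert_array_equal(a, b)
except AssertionError as err:
    detailed_message = str(err)

print(arrays_match)
print(detailed_message)
```

Unlike `np.array_equal` with its default arguments, `np.testing.assert_array_equal` also treats NaNs in matching positions as equal, which can matter for gridded data with missing values.
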
@@ -493,14 +501,14 @@ class TestModel(unittest.TestCase):
             # Check that nothing breaks and the correct parameters are returned
             pred = model.predict(task, X_t=X_t, pred_params=pred_params)
             for pred_param in pred_params:
-                assert pred_param in pred["var"]
+                assert pred_param in pred[self.var_ID]
             # Test mixture probs special case
             pred_params = ["mixture_probs"]
             pred = model.predict(task, X_t=self.da, pred_params=pred_params)
             for component in range(model.N_mixture_components):
                 pred_param = f"mixture_probs_{component}"
-                assert pred_param in pred["var"]
+                assert pred_param in pred[self.var_ID]
     def test_highlevel_predict_with_pred_params_xarray(self):
         """
@@ -528,14 +536,14 @@ class TestModel(unittest.TestCase):
             # Check that nothing breaks and the correct parameters are returned
             pred = model.predict(task, X_t=self.da, pred_params=pred_params)
             for pred_param in pred_params:
-                assert pred_param in pred["var"]
+                assert pred_param in pred[self.var_ID]
             # Test mixture probs special case
             pred_params = ["mixture_probs"]
             pred = model.predict(task, X_t=self.da, pred_params=pred_params)
             for component in range(model.N_mixture_components):
                 pred_param = f"mixture_probs_{component}"
-                assert pred_param in pred["var"]
+                assert pred_param in pred[self.var_ID]
     def test_highlevel_predict_with_invalid_pred_params(self):
         """Test that passing ``pred_params`` to ``.predict`` works."""
@@ -640,6 +648,66 @@ class TestModel(unittest.TestCase):
                 ar_sample=True,
             )
+    def test_forecasting_model_predict_return_valid_times(self):
+        """Test that the times returned by a forecasting model are valid."""
+        init_dates = ["2020-01-01", "2020-01-02"]
+        expected_init_times = np.array(init_dates).astype(np.datetime64)
+        lead_times_days = [1, 2, 3]
+        expected_lead_times = np.array(
+            [np.timedelta64(lt, "D") for lt in lead_times_days]
+        )
+        expected_valid_times = np.array(
+            expected_init_times[:, None] + expected_lead_times[None, :]
+        )
+        tl = TaskLoader(
+            context=self.da,
+            target=[
+                self.da,
+            ]
+            * len(lead_times_days),
+            target_delta_t=lead_times_days,
+            time_freq="D",
+        )
+        model = ConvNP(self.dp, tl, unet_channels=(5, 5, 5), verbose=False)
+        tasks = tl(init_dates, context_sampling=10)
+        X_ts = [
+            # Gridded predictions (xarray)
+            self.da,
+            # Off-grid prediction (pandas)
+            np.array([[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]),
+        ]
+        for X_t in X_ts:
+            pred = model.predict(tasks, X_t=X_t)
+            pred_var = pred[self.var_ID]
+            if isinstance(pred_var, xr.Dataset):
+                # Check we can compute errors using the valid time coord ('time')
+                errors = compute_errors(pred, self.da.to_dataset())
+                for var_ID in errors.keys():
+                    assert tuple(errors[var_ID].dims) == (
+                        "init_time",
+                        "lead_time",
+                        "x1",
+                        "x2",
+                    )
+                    assert errors[var_ID].shape == pred[var_ID]["mean"].shape
+            elif isinstance(pred_var, pd.DataFrame):
+                # Makes coordinate checking easier by avoiding repeat values
+                pred_var = pred_var.to_xarray().isel(x1=0, x2=0)
+            np.testing.assert_array_equal(
+                pred_var.lead_time.values, expected_lead_times
+            )
+            np.testing.assert_array_equal(
+                pred_var.init_time.values, expected_init_times
+            )
+            np.testing.assert_array_equal(pred_var.time.values, expected_valid_times)
 def assert_shape(x, shape: tuple):
     """Assert that the shape of ``x`` matches ``shape``."""

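The expected valid times in `test_forecasting_model_predict_return_valid_times` come from broadcasting initialisation times against lead times. A standalone sketch of that arithmetic in plain NumPy, with no DeepSensor objects involved (the variable names are illustrative):

```python
import numpy as np

# Initialisation dates and lead times, mirroring the values used in the test.
init_times = np.array(["2020-01-01", "2020-01-02"], dtype="datetime64[D]")
lead_times = np.array([1, 2, 3], dtype="timedelta64[D]")

# Broadcasting a column of init times against a row of lead times gives a
# 2-D array: one row per init time, one column per lead time.
valid_times = init_times[:, None] + lead_times[None, :]

print(valid_times.shape)
print(valid_times)
```

Each entry `valid_times[i, j]` is `init_times[i] + lead_times[j]`, which is the layout the test compares against the `time` coordinate of the prediction.
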
{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_task_loader.py RENAMED

@@ -192,6 +192,23 @@ class TestTaskLoader(unittest.TestCase):
                 with self.assertRaises(InvalidSamplingStrategyError):
                     task = tl("2020-01-01", invalid_sampling_strategy)
+    def test_different_dtype_when_sampling_offgrid_data_at_specific_numpy_locs(self):
+        """Test different dtype when sampling off-grid data at specific numpy locations."""
+        sampling_strat = np.array(
+            [np.linspace(0, 1, 10), np.linspace(0, 1, 10)], dtype=np.float16
+        )
+        tl = TaskLoader(
+            context=self.df,
+            target=self.df,
+        )
+        assert sampling_strat.dtype != tl.context[0].index.get_level_values("x1").dtype
+        assert sampling_strat.dtype != tl.context[0].index.get_level_values("x2").dtype
+        with self.assertRaises(InvalidSamplingStrategyError):
+            task = tl("2020-01-01", sampling_strat, sampling_strat)
     def test_wrong_links(self):
         """Test link indexes out of range."""
         with self.assertRaises(ValueError):
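
The new test above expects an `InvalidSamplingStrategyError` when the sampling locations are `float16` but the DataFrame index is a wider float type. The diff does not show how `TaskLoader` detects this, so the following is only a sketch of the underlying numerical issue, assuming the index coordinates are `float64`: casting to `float16` rounds most values, so they no longer match the original coordinates exactly.

```python
import numpy as np

# Coordinates as they might appear in a float64 index (an assumption here).
x64 = np.linspace(0, 1, 10)

# The same locations stored at half precision, as in the test's sampling array.
x16 = x64.astype(np.float16)

# Compare after upcasting back to float64: only values that float16 can
# represent exactly (here the endpoints 0 and 1) still match.
exact = x16.astype(np.float64) == x64

print(x16.dtype, x64.dtype)
print(exact)
```

An exact lookup of the `float16` locations in a `float64` index would therefore miss most rows, which is one plausible reason to reject mismatched dtypes up front.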

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/active_learning/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/active_learning/acquisition_fns.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/active_learning/algorithms.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/config.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/processor.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/task.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/data/utils.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/errors.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/convnp.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/defaults.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/model/nps.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/py.typed RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/tensorflow/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/torch/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/train/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor/train/train.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/dependency_links.txt RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/not-zip-safe RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/requires.txt RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/deepsensor.egg-info/top_level.txt RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/pyproject.toml RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/setup.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/__init__.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_active_learning.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_data_processor.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_plotting.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_task.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/test_training.py RENAMED

File without changes

{deepsensor-0.3.8 → deepsensor-0.4.1}/tests/utils.py RENAMED

File without changes
