pypromice 1.3.1.tar.gz → 1.3.3.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


Files changed (56)
  1. {pypromice-1.3.1/src/pypromice.egg-info → pypromice-1.3.3}/PKG-INFO +8 -5
  2. {pypromice-1.3.1 → pypromice-1.3.3}/README.md +7 -4
  3. {pypromice-1.3.1 → pypromice-1.3.3}/setup.py +2 -1
  4. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/L0toL1.py +58 -8
  5. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/L1toL2.py +23 -359
  6. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/metadata.csv +3 -3
  7. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/value_clipping.py +2 -0
  8. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/variables.csv +11 -11
  9. pypromice-1.3.3/src/pypromice/qc/github_data_issues.py +376 -0
  10. pypromice-1.3.3/src/pypromice/qc/percentiles/__init__.py +0 -0
  11. pypromice-1.3.3/src/pypromice/qc/percentiles/compute_thresholds.py +221 -0
  12. pypromice-1.3.3/src/pypromice/qc/percentiles/outlier_detector.py +112 -0
  13. pypromice-1.3.3/src/pypromice/qc/percentiles/thresholds.csv +312 -0
  14. pypromice-1.3.3/src/pypromice/test/test_percentile.py +229 -0
  15. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/get_watsontx.py +9 -6
  16. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/tx.py +8 -5
  17. {pypromice-1.3.1 → pypromice-1.3.3/src/pypromice.egg-info}/PKG-INFO +8 -5
  18. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice.egg-info/SOURCES.txt +6 -0
  19. {pypromice-1.3.1 → pypromice-1.3.3}/LICENSE.txt +0 -0
  20. {pypromice-1.3.1 → pypromice-1.3.3}/MANIFEST.in +0 -0
  21. {pypromice-1.3.1 → pypromice-1.3.3}/setup.cfg +0 -0
  22. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/__init__.py +0 -0
  23. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/get/__init__.py +0 -0
  24. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/get/get.py +0 -0
  25. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/get/get_promice_data.py +0 -0
  26. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/postprocess/__init__.py +0 -0
  27. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/postprocess/csv2bufr.py +0 -0
  28. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/postprocess/get_bufr.py +0 -0
  29. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/postprocess/wmo_config.py +0 -0
  30. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/L2toL3.py +0 -0
  31. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/__init__.py +0 -0
  32. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/aws.py +0 -0
  33. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/get_l3.py +0 -0
  34. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/join_l3.py +0 -0
  35. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/qc/__init__.py +0 -0
  36. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/qc/persistence.py +0 -0
  37. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/qc/persistence_test.py +0 -0
  38. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_config1.toml +0 -0
  39. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_config2.toml +0 -0
  40. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_email +0 -0
  41. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_payload_formats.csv +0 -0
  42. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_payload_types.csv +0 -0
  43. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_raw1.txt +0 -0
  44. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_raw_DataTable2.txt +0 -0
  45. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_raw_SlimTableMem1.txt +0 -0
  46. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_raw_transmitted1.txt +0 -0
  47. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/test/test_raw_transmitted2.txt +0 -0
  48. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/__init__.py +0 -0
  49. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/get_l0tx.py +0 -0
  50. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/get_msg.py +0 -0
  51. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/payload_formats.csv +0 -0
  52. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/tx/payload_types.csv +0 -0
  53. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice.egg-info/dependency_links.txt +0 -0
  54. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice.egg-info/entry_points.txt +0 -0
  55. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice.egg-info/requires.txt +0 -0
  56. {pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice.egg-info/top_level.txt +0 -0
PKG-INFO

@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: pypromice
- Version: 1.3.1
+ Version: 1.3.3
  Summary: PROMICE/GC-Net data processing toolbox
  Home-page: https://github.com/GEUS-Glaciology-and-Climate/pypromice
  Author: GEUS Glaciology and Climate
@@ -29,10 +29,9 @@ Requires-Dist: netcdf4
  Requires-Dist: pyDataverse

  # pypromice
- [![PyPI version](https://badge.fury.io/py/pypromice.svg)](https://badge.fury.io/py/pypromice)
- [![](<https://img.shields.io/badge/Dataverse DOI-10.22008/FK2/3TSBF0-orange>)](https://www.doi.org/10.22008/FK2/3TSBF0) [![DOI](https://joss.theoj.org/papers/10.21105/joss.05298/status.svg)](https://doi.org/10.21105/joss.05298) [![Documentation Status](https://readthedocs.org/projects/pypromice/badge/?version=latest)](https://pypromice.readthedocs.io/en/latest/?badge=latest)
+ [![PyPI version](https://badge.fury.io/py/pypromice.svg)](https://badge.fury.io/py/pypromice) [![Anaconda-Server Badge](https://anaconda.org/conda-forge/pypromice/badges/version.svg)](https://anaconda.org/conda-forge/pypromice) [![Anaconda-Server Badge](https://anaconda.org/conda-forge/pypromice/badges/platforms.svg)](https://anaconda.org/conda-forge/pypromice) [![](<https://img.shields.io/badge/Dataverse DOI-10.22008/FK2/3TSBF0-orange>)](https://www.doi.org/10.22008/FK2/3TSBF0) [![DOI](https://joss.theoj.org/papers/10.21105/joss.05298/status.svg)](https://doi.org/10.21105/joss.05298) [![Documentation Status](https://readthedocs.org/projects/pypromice/badge/?version=latest)](https://pypromice.readthedocs.io/en/latest/?badge=latest)

- pypromice is designed for processing and handling [PROMICE](https://promice.dk) automated weather station (AWS) data.
+ pypromice is designed for processing and handling [PROMICE](https://promice.org) automated weather station (AWS) data.

  It is envisioned for pypromice to be the go-to toolbox for handling and processing [PROMICE](https://promice.dk) and [GC-Net](http://cires1.colorado.edu/steffen/gcnet/) datasets. New releases of pypromice are uploaded alongside PROMICE AWS data releases to our [Dataverse](https://dataverse.geus.dk/dataverse/PROMICE) for transparency purposes and to encourage collaboration on improving our data. Please visit the pypromice [readthedocs](https://pypromice.readthedocs.io/en/latest/?badge=latest) for more information.

@@ -48,7 +47,11 @@ If you intend to use PROMICE AWS data and/or pypromice in your work, please cite

  ### Quick install

- The latest release of pypromice can installed using pip:
+ The latest release of pypromice can installed using conda or pip:
+
+ ```
+ $ conda install pypromice -c conda-forge
+ ```

  ```
  $ pip install pypromice
{pypromice-1.3.1 → pypromice-1.3.3}/README.md

@@ -1,8 +1,7 @@
  # pypromice
- [![PyPI version](https://badge.fury.io/py/pypromice.svg)](https://badge.fury.io/py/pypromice)
- [![](<https://img.shields.io/badge/Dataverse DOI-10.22008/FK2/3TSBF0-orange>)](https://www.doi.org/10.22008/FK2/3TSBF0) [![DOI](https://joss.theoj.org/papers/10.21105/joss.05298/status.svg)](https://doi.org/10.21105/joss.05298) [![Documentation Status](https://readthedocs.org/projects/pypromice/badge/?version=latest)](https://pypromice.readthedocs.io/en/latest/?badge=latest)
+ [![PyPI version](https://badge.fury.io/py/pypromice.svg)](https://badge.fury.io/py/pypromice) [![Anaconda-Server Badge](https://anaconda.org/conda-forge/pypromice/badges/version.svg)](https://anaconda.org/conda-forge/pypromice) [![Anaconda-Server Badge](https://anaconda.org/conda-forge/pypromice/badges/platforms.svg)](https://anaconda.org/conda-forge/pypromice) [![](<https://img.shields.io/badge/Dataverse DOI-10.22008/FK2/3TSBF0-orange>)](https://www.doi.org/10.22008/FK2/3TSBF0) [![DOI](https://joss.theoj.org/papers/10.21105/joss.05298/status.svg)](https://doi.org/10.21105/joss.05298) [![Documentation Status](https://readthedocs.org/projects/pypromice/badge/?version=latest)](https://pypromice.readthedocs.io/en/latest/?badge=latest)

- pypromice is designed for processing and handling [PROMICE](https://promice.dk) automated weather station (AWS) data.
+ pypromice is designed for processing and handling [PROMICE](https://promice.org) automated weather station (AWS) data.

  It is envisioned for pypromice to be the go-to toolbox for handling and processing [PROMICE](https://promice.dk) and [GC-Net](http://cires1.colorado.edu/steffen/gcnet/) datasets. New releases of pypromice are uploaded alongside PROMICE AWS data releases to our [Dataverse](https://dataverse.geus.dk/dataverse/PROMICE) for transparency purposes and to encourage collaboration on improving our data. Please visit the pypromice [readthedocs](https://pypromice.readthedocs.io/en/latest/?badge=latest) for more information.

@@ -18,7 +17,11 @@ If you intend to use PROMICE AWS data and/or pypromice in your work, please cite

  ### Quick install

- The latest release of pypromice can installed using pip:
+ The latest release of pypromice can installed using conda or pip:
+
+ ```
+ $ conda install pypromice -c conda-forge
+ ```

  ```
  $ pip install pypromice
{pypromice-1.3.1 → pypromice-1.3.3}/setup.py

@@ -5,7 +5,7 @@ with open("README.md", "r", encoding="utf-8") as fh:

  setuptools.setup(
  name="pypromice",
- version="1.3.1",
+ version="1.3.3",
  author="GEUS Glaciology and Climate",
  description="PROMICE/GC-Net data processing toolbox",
  long_description=long_description,
@@ -30,6 +30,7 @@ setuptools.setup(
  include_package_data = True,
  packages=setuptools.find_packages(where="src"),
  python_requires=">=3.8",
+ package_data={"pypromice.qc.percentiles": ["thresholds.csv"]},
  install_requires=['numpy>=1.23.0', 'pandas>=1.5.0', 'xarray>=2022.6.0', 'toml', 'scipy>=1.9.0', 'scikit-learn>=1.1.0', 'Bottleneck', 'netcdf4', 'pyDataverse'],
  entry_points={
  'console_scripts': [
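The new `package_data` entry ensures `thresholds.csv` is bundled into built distributions alongside `include_package_data`. Below is a minimal sketch, assuming Python >= 3.9, of one way the installed file could be located with `importlib.resources`; this is illustrative only and not necessarily how pypromice itself loads it.

```python
# Hedged sketch (not pypromice's own loading code): locate the bundled
# thresholds.csv shipped via the package_data entry added above.
# importlib.resources.files/as_file require Python >= 3.9.
from importlib import resources

import pandas as pd

def load_thresholds():
    source = resources.files("pypromice.qc.percentiles") / "thresholds.csv"
    with resources.as_file(source) as path:   # yields a real filesystem path
        return pd.read_csv(path)

if __name__ == "__main__":
    print(load_thresholds().head())
```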
{pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/L0toL1.py

@@ -57,7 +57,9 @@ def toL1(L0, vars_df, T_0=273.15, tilt_threshold=-100):
  ds['ulr'] = ((ds['ulr'] * 10) / ds.attrs['ulr_eng_coef']) + 5.67E-8*(ds['t_rad'] + T_0)**4

  ds['z_boom_u'] = _reformatArray(ds['z_boom_u']) # Reformat boom height
- ds['z_boom_u'] = ds['z_boom_u'] * ((ds['t_u'] + T_0)/T_0)**0.5 # Adjust sonic ranger readings for sensitivity to air temperature
+
+ ds['t_u_interp'] = interpTemp(ds['t_u'], vars_df)
+ ds['z_boom_u'] = ds['z_boom_u'] * ((ds['t_u_interp'] + T_0)/T_0)**0.5 # Adjust sonic ranger readings for sensitivity to air temperature

  if ds['gps_lat'].dtype.kind == 'O': # Decode and reformat GPS information
  if 'NH' in ds['gps_lat'].dropna(dim='time').values[1]:
@@ -113,7 +115,8 @@ def toL1(L0, vars_df, T_0=273.15, tilt_threshold=-100):

  elif ds.attrs['number_of_booms']==2: # 2-boom processing
  ds['z_boom_l'] = _reformatArray(ds['z_boom_l']) # Reformat boom height
- ds['z_boom_l'] = ds['z_boom_l'] * ((ds['t_l'] + T_0)/T_0)**0.5 # Adjust sonic ranger readings for sensitivity to air temperature
+ ds['t_l_interp'] = interpTemp(ds['t_l'], vars_df)
+ ds['z_boom_l'] = ds['z_boom_l'] * ((ds['t_l_interp']+ T_0)/T_0)**0.5 # Adjust sonic ranger readings for sensitivity to air temperature

  ds = clip_values(ds, vars_df)
  for key in ['hygroclip_t_offset', 'dsr_eng_coef', 'usr_eng_coef',
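Both boom-height hunks apply the same physical correction: a sonic ranger measures distance via the speed of sound, which scales with the square root of absolute air temperature, so the raw reading is multiplied by ((t + T_0)/T_0)**0.5. What changes in 1.3.3 is that the temperature fed into this factor is the clipped and interpolated t_u_interp/t_l_interp rather than the raw series. A minimal standalone sketch with toy values (not PROMICE data):

```python
# Minimal sketch of the sonic-ranger air-temperature correction used above:
# the reported distance is scaled by sqrt(T_abs / T_0), with T_0 = 273.15 K.
# Toy values only; variable names mirror the diff but this is not pypromice code.
import numpy as np

T_0 = 273.15                                    # calibration temperature, kelvin
z_boom_u = np.array([2.61, 2.60, 2.62])         # raw boom height readings (m)
t_u_interp = np.array([-15.0, -14.5, np.nan])   # air temperature (deg C), gaps allowed

z_boom_u_cor = z_boom_u * ((t_u_interp + T_0) / T_0) ** 0.5
print(z_boom_u_cor)                             # NaN temperature propagates to NaN height
```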
@@ -254,6 +257,41 @@ def getPressDepth(z_pt, p, pt_antifreeze, pt_z_factor, pt_z_coef, pt_z_p_coef):

  return z_pt_cor, z_pt

+
+ def interpTemp(temp, var_configurations, max_interp=pd.Timedelta(12,'h')):
+ '''Clip and interpolate temperature dataset for use in corrections
+
+ Parameters
+ ----------
+ temp : `xarray.DataArray`
+ Array of temperature data
+ vars_df : `pandas.DataFrame`
+ Dataframe to retrieve attribute hi-lo values from for temperature clipping
+ max_interp : `pandas.Timedelta`
+ Maximum time steps to interpolate across. The default is 12 hours.
+
+ Returns
+ -------
+ temp_interp : `xarray.DataArray`
+ Array of interpolated temperature data
+ '''
+ # Determine if upper or lower temperature array
+ var = temp.name.lower()
+
+ # Find range threshold and use it to clip measurements
+ cols = ["lo", "hi", "OOL"]
+ assert set(cols) <= set(var_configurations.columns)
+ variable_limits = var_configurations[cols].dropna(how="all")
+ temp = temp.where(temp >= variable_limits.loc[var,'lo'])
+ temp = temp.where(temp <= variable_limits.loc[var, 'hi'])
+
+ # Drop duplicates and interpolate across NaN values
+ # temp_interp = temp.drop_duplicates(dim='time', keep='first')
+ temp_interp = temp.interpolate_na(dim='time', max_gap=max_interp)
+
+ return temp_interp
+
+
  def smoothTilt(tilt, win_size):
  '''Smooth tilt values using a rolling window. This is translated from the
  previous IDL/GDL smoothing algorithm:
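The new interpTemp helper first clips the temperature series to the per-variable lo/hi limits taken from the variables configuration dataframe, then fills gaps of up to 12 hours with xarray's interpolate_na, so short temperature dropouts no longer knock out the boom-height correction. A standalone sketch of the same clip-then-interpolate pattern on toy data; the variable name 't_u' and the limits below are illustrative, not values from pypromice's variables.csv:

```python
# Standalone sketch of the clip-then-interpolate pattern used by interpTemp.
# Illustrative limits and data; not a call into pypromice itself.
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2023-06-01", periods=6, freq="h")
t_u = xr.DataArray([-10.0, -80.0, np.nan, np.nan, -9.0, -8.5],
                   dims="time", coords={"time": time}, name="t_u")

limits = pd.DataFrame({"lo": [-70.0], "hi": [30.0]}, index=["t_u"])  # assumed values

t_u = t_u.where(t_u >= limits.loc["t_u", "lo"])     # clip implausible low values
t_u = t_u.where(t_u <= limits.loc["t_u", "hi"])     # clip implausible high values
t_u_interp = t_u.interpolate_na(dim="time", max_gap=pd.Timedelta(12, "h"))
print(t_u_interp.values)   # the -80 spike and the short gap are filled by interpolation
```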
@@ -361,23 +399,35 @@ def decodeGPS(ds, gps_names):
  return ds

  def reformatGPS(pos_arr, attrs):
- '''Correct position if only recorded minutes (and not degrees), and
- reformat values and attributes
+ '''Correct latitude and longitude from native format to decimal degrees.

+ v2 stations should send "NH6429.01544","WH04932.86061" (NUK_L 2022)
+ v3 stations should send coordinates as "6628.93936","04617.59187" (DY2) or 6430,4916 (NUK_Uv3)
+ decodeGPS should have decoded these strings to floats in ddmm.mmmm format
+ v1 stations however only saved decimal minutes (mm.mmmmm) as float<=60. '
+ In this case, we use the integer part of the latitude given in the config
+ file and append the gps value after it.
+
  Parameters
  ----------
  pos_arr : xr.Dataarray
- GPS position array
+ Array of latitude or longitude measured by the GPS
  attrs : dict
- Array attributes
+ The global attribute 'latitude' or 'longitude' associated with the
+ file being processed. It is the standard latitude/longitude given in the
+ config file for that station.

  Returns
  -------
  pos_arr : xr.Dataarray
- Formatted GPS position array
+ Formatted GPS position array in decimal degree
  '''
  if np.any((pos_arr <= 90) & (pos_arr > 0)):
- pos_arr = pos_arr + 100*attrs
+ # then pos_arr is in decimal minutes, so we add to it the integer
+ # part of the latitude given in the config file x100
+ # so that it reads ddmm.mmmmmm like for v2 and v3 files
+ # Note that np.sign and np.attrs handles negative longitudes.
+ pos_arr = np.sign(attrs) * (pos_arr + 100*np.floor(np.abs(attrs)))
  a = pos_arr.attrs
  pos_arr = np.floor(pos_arr / 100) + (pos_arr / 100 - np.floor(pos_arr / 100)) * 100 / 60
  pos_arr.attrs = a
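The conversion itself is floor(x/100) + (x/100 - floor(x/100)) * 100/60, i.e. integer degrees plus minutes divided by 60. A hedged numeric sketch using the DY2 latitude string quoted in the new docstring; the config-file latitude used in the v1 case below is an assumed value:

```python
# Worked example of the ddmm.mmmm -> decimal degree conversion performed by
# reformatGPS, standalone and not a call into pypromice.
import numpy as np

def ddmm_to_decimal(x):
    deg = np.floor(x / 100)
    return deg + (x / 100 - deg) * 100 / 60

print(ddmm_to_decimal(6628.93936))   # ~66.482 deg N (66 deg 28.93936')

# v1 stations only report decimal minutes (e.g. 28.93936). The config-file
# latitude (assumed here to be ~66.48) supplies the integer degrees:
gps_minutes = 28.93936
config_lat = 66.48
rebuilt = np.sign(config_lat) * (gps_minutes + 100 * np.floor(np.abs(config_lat)))
print(ddmm_to_decimal(rebuilt))      # same ~66.482 deg N
```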
{pypromice-1.3.1 → pypromice-1.3.3}/src/pypromice/process/L1toL2.py

@@ -5,17 +5,21 @@ AWS Level 1 (L1) to Level 2 (L2) data processing
  import logging

  import numpy as np
- import urllib.request
- from urllib.error import HTTPError, URLError
  import pandas as pd
- import os
  import xarray as xr

+ from pypromice.qc.github_data_issues import flagNAN, adjustTime, adjustData
+ from pypromice.qc.percentiles.outlier_detector import ThresholdBasedOutlierDetector
  from pypromice.qc.persistence import persistence_qc
  from pypromice.process.value_clipping import clip_values

+ __all__ = [
+ "toL2",
+ ]
+
  logger = logging.getLogger(__name__)

+
  def toL2(
  L1: xr.Dataset,
  vars_df: pd.DataFrame,
@@ -61,8 +65,12 @@ def toL2(
  ds = adjustData(ds) # Adjust data after a user-defined csv files
  except Exception:
  logger.exception('Flagging and fixing failed:')
+
  if ds.attrs['format'] == 'TX':
- ds = persistence_qc(ds) # Detect and filter data points that seems to be static
+ ds = persistence_qc(ds) # Flag and remove persistence outliers
+ # TODO: The configuration should be provided explicitly
+ outlier_detector = ThresholdBasedOutlierDetector.default()
+ ds = outlier_detector.filter_data(ds) # Flag and remove percentile outliers

  T_100 = _getTempK(T_0)
  ds['rh_u_cor'] = correctHumidity(ds['rh_u'], ds['t_u'],
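With 1.3.3, transmitted ('TX') data get two quality-control passes in toL2: persistence_qc plus the new percentile-based ThresholdBasedOutlierDetector, whose per-station thresholds ship in src/pypromice/qc/percentiles/thresholds.csv (see the file list above). A hedged sketch of invoking the same pair of filters outside toL2; it assumes pypromice >= 1.3.3 is installed and that ds is a Level-1 xarray.Dataset carrying the station metadata the detector expects:

```python
# Hedged usage sketch (requires pypromice >= 1.3.3). The calls mirror the
# ones shown in the hunk above; `ds` is assumed to be a Level-1 xarray.Dataset.
from pypromice.qc.percentiles.outlier_detector import ThresholdBasedOutlierDetector
from pypromice.qc.persistence import persistence_qc

def quality_control_tx(ds):
    ds = persistence_qc(ds)                              # remove static/stuck readings
    detector = ThresholdBasedOutlierDetector.default()   # thresholds from bundled thresholds.csv
    return detector.filter_data(ds)                      # remove percentile outliers
```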
@@ -81,7 +89,7 @@ def toL2(
  ds['t_surf'] = calcSurfaceTemperature(T_0, ds['ulr'], ds['dlr'], # Calculate surface temperature
  emissivity)
  if not ds.attrs['bedrock']:
- ds['t_surf'] = ds['t_surf'].where(ds['t_surf'] <= 0, other = 0)
+ ds['t_surf'] = xr.where(ds['t_surf'] > 0, 0, ds['t_surf'])

  # Determine station position relative to sun
  doy = ds['time'].to_dataframe().index.dayofyear.values # Gather variables to calculate sun pos
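The t_surf clamp is rewritten from DataArray.where(cond, other=0) to the top-level xr.where. For finite values the two are equivalent (positive surface temperatures are capped at 0 °C on non-bedrock stations), but they differ on missing data: the old form also turned NaN into 0, while the new form leaves NaN untouched, which is presumably the motivation for the change. A tiny demo on toy values:

```python
# Minimal demo of the behavioural difference between the old and new forms
# when NaNs are present (toy data, not PROMICE output).
import numpy as np
import xarray as xr

t_surf = xr.DataArray([-5.0, 2.0, np.nan])

old = t_surf.where(t_surf <= 0, other=0)   # -> [-5., 0., 0.]   (NaN becomes 0)
new = xr.where(t_surf > 0, 0, t_surf)      # -> [-5., 0., nan]  (NaN preserved)
print(old.values, new.values)
```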
@@ -167,291 +175,6 @@ def toL2(
  ds = clip_values(ds, vars_df)
  return ds

- def flagNAN(ds_in,
- flag_url='https://raw.githubusercontent.com/GEUS-Glaciology-and-Climate/PROMICE-AWS-data-issues/master/flags/',
- flag_dir='local/flags/'):
- '''Read flagged data from .csv file. For each variable, and downstream
- dependents, flag as invalid (or other) if set in the flag .csv
-
- Parameters
- ----------
- ds_in : xr.Dataset
- Level 0 dataset
- flag_url : str
- URL to directory where .csv flag files can be found
- flag_dir : str
- File directory where .csv flag files can be found
-
- Returns
- -------
- ds : xr.Dataset
- Level 0 data with flagged data
- '''
- ds = ds_in.copy()
- df = None
-
- df = _getDF(flag_url + ds.attrs["station_id"] + ".csv",
- os.path.join(flag_dir, ds.attrs["station_id"] + ".csv"),
- # download = False, # only for working on draft local flag'n'fix files
- )
-
- if isinstance(df, pd.DataFrame):
- df.t0 = pd.to_datetime(df.t0).dt.tz_localize(None)
- df.t1 = pd.to_datetime(df.t1).dt.tz_localize(None)
-
- if df.shape[0] > 0:
- for i in df.index:
- t0, t1, avar = df.loc[i,['t0','t1','variable']]
-
- if avar == '*':
- # Set to all vars if var is "*"
- varlist = list(ds.keys())
- elif '*' in avar:
- # Reads as regex if contains "*" and other characters (e.g. 't_i_.*($)')
- varlist = pd.DataFrame(columns = list(ds.keys())).filter(regex=(avar)).columns
- else:
- varlist = avar.split()
-
- if 'time' in varlist: varlist.remove("time")
-
- # Set to all times if times are "n/a"
- if pd.isnull(t0):
- t0 = ds['time'].values[0]
- if pd.isnull(t1):
- t1 = ds['time'].values[-1]
-
- for v in varlist:
- if v in list(ds.keys()):
- logger.info(f'---> flagging {t0} {t1} {v}')
- ds[v] = ds[v].where((ds['time'] < t0) | (ds['time'] > t1))
- else:
- logger.info(f'---> could not flag {v} not in dataset')
-
- return ds
-
-
- def adjustTime(ds,
- adj_url="https://raw.githubusercontent.com/GEUS-Glaciology-and-Climate/PROMICE-AWS-data-issues/master/adjustments/",
- adj_dir='local/adjustments/',
- var_list=[], skip_var=[]):
- '''Read adjustment data from .csv file. Only applies the "time_shift" adjustment
-
- Parameters
- ----------
- ds : xr.Dataset
- Level 0 dataset
- adj_url : str
- URL to directory where .csv adjustment files can be found
- adj_dir : str
- File directory where .csv adjustment files can be found
-
- Returns
- -------
- ds : xr.Dataset
- Level 0 data with flagged data
- '''
- ds_out = ds.copy()
- adj_info=None
-
- adj_info = _getDF(adj_url + ds.attrs["station_id"] + ".csv",
- os.path.join(adj_dir, ds.attrs["station_id"] + ".csv"),)
-
- if isinstance(adj_info, pd.DataFrame):
-
-
- if "time_shift" in adj_info.adjust_function.values:
- time_shifts = adj_info.loc[adj_info.adjust_function == "time_shift", :]
- # if t1 is left empty, then adjustment is applied until the end of the file
- time_shifts.loc[time_shifts.t1.isnull(), "t1"] = pd.to_datetime(ds_out.time.values[-1]).isoformat()
- time_shifts.t0 = pd.to_datetime(time_shifts.t0).dt.tz_localize(None)
- time_shifts.t1 = pd.to_datetime(time_shifts.t1).dt.tz_localize(None)
-
- for t0, t1, val in zip(
- time_shifts.t0,
- time_shifts.t1,
- time_shifts.adjust_value,
- ):
- ds_shifted = ds_out.sel(time=slice(t0,t1))
- ds_shifted['time'] = ds_shifted.time.values + pd.Timedelta(days = val)
-
- # here we concatenate what was before the shifted part, the shifted
- # part and what was after the shifted part
- # note that if any data was already present in the target period
- # (where the data lands after the shift), it is overwritten
-
- ds_out = xr.concat(
- (
- ds_out.sel(time=slice(pd.to_datetime(ds_out.time.values[0]),
- t0 + pd.Timedelta(days = val))),
- ds_shifted,
- ds_out.sel(time=slice(t1 + pd.Timedelta(days = val),
- pd.to_datetime(ds_out.time.values[-1])))
- ),
- dim = 'time',
- )
- if t0 > pd.Timestamp.now():
- ds_out = ds_out.sel(time=slice(pd.to_datetime(ds_out.time.values[0]),
- t0))
- return ds_out
-
-
- def adjustData(ds,
- adj_url="https://raw.githubusercontent.com/GEUS-Glaciology-and-Climate/PROMICE-AWS-data-issues/master/adjustments/",
- adj_dir='local/adjustments/',
- var_list=[], skip_var=[]):
- '''Read adjustment data from .csv file. For each variable, and downstream
- dependents, adjust data accordingly if set in the adjustment .csv
-
- Parameters
- ----------
- ds : xr.Dataset
- Level 0 dataset
- adj_url : str
- URL to directory where .csv adjustment files can be found
- adj_dir : str
- File directory where .csv adjustment files can be found
-
- Returns
- -------
- ds : xr.Dataset
- Level 0 data with flagged data
- '''
- ds_out = ds.copy()
- adj_info=None
- adj_info = _getDF(adj_url + ds.attrs["station_id"] + ".csv",
- os.path.join(adj_dir, ds.attrs["station_id"] + ".csv"),
- # download = False, # only for working on draft local flag'n'fix files
- )
-
- if isinstance(adj_info, pd.DataFrame):
- # removing potential time shifts from the adjustment list
- adj_info = adj_info.loc[adj_info.adjust_function != "time_shift", :]
-
- # if t1 is left empty, then adjustment is applied until the end of the file
- adj_info.loc[adj_info.t0.isnull(), "t0"] = ds_out.time.values[0]
- adj_info.loc[adj_info.t1.isnull(), "t1"] = ds_out.time.values[-1]
- # making all timestamps timezone naive (compatibility with xarray)
- adj_info.t0 = pd.to_datetime(adj_info.t0).dt.tz_localize(None)
- adj_info.t1 = pd.to_datetime(adj_info.t1).dt.tz_localize(None)
-
- # if "*" is in the variable name then we interpret it as regex
- selec = adj_info['variable'].str.contains('\*') & (adj_info['variable'] != "*")
- for ind in adj_info.loc[selec, :].index:
- line_template = adj_info.loc[ind, :].copy()
- regex = adj_info.loc[ind, 'variable']
- for var in pd.DataFrame(columns = list(ds.keys())).filter(regex=regex).columns:
- line_template.variable = var
- line_template.name = adj_info.index.max() + 1
- adj_info = pd.concat((adj_info, line_template.to_frame().transpose()),axis=0)
- adj_info = adj_info.drop(labels=ind, axis=0)
-
- adj_info = adj_info.sort_values(by=["variable", "t0"])
- adj_info.set_index(["variable", "t0"], drop=False, inplace=True)
-
- if len(var_list) == 0:
- var_list = np.unique(adj_info.variable)
- else:
- adj_info = adj_info.loc[np.isin(adj_info.variable, var_list), :]
- var_list = np.unique(adj_info.variable)
-
- if len(skip_var) > 0:
- adj_info = adj_info.loc[~np.isin(adj_info.variable, skip_var), :]
- var_list = np.unique(adj_info.variable)
-
- for var in var_list:
- if var not in list(ds_out.keys()):
- logger.info(f'could not adjust {var } not in dataset')
- continue
- for t0, t1, func, val in zip(
- adj_info.loc[var].t0,
- adj_info.loc[var].t1,
- adj_info.loc[var].adjust_function,
- adj_info.loc[var].adjust_value,
- ):
- if (t0 > pd.to_datetime(ds_out.time.values[-1])) | (t1 < pd.to_datetime(ds_out.time.values[0])):
- continue
- logger.info(f'---> {t0} {t1} {var} {func} {val}')
- if func == "add":
- ds_out[var].loc[dict(time=slice(t0, t1))] = ds_out[var].loc[dict(time=slice(t0, t1))].values + val
- # flagging adjusted values
- # if var + "_adj_flag" not in ds_out.columns:
- # ds_out[var + "_adj_flag"] = 0
- # msk = ds_out[var].loc[dict(time=slice(t0, t1))])].notnull()
- # ind = ds_out[var].loc[dict(time=slice(t0, t1))])].loc[msk].time
- # ds_out.loc[ind, var + "_adj_flag"] = 1
-
- if func == "multiply":
- ds_out[var].loc[dict(time=slice(t0, t1))] = ds_out[var].loc[dict(time=slice(t0, t1))].values * val
- if "DW" in var:
- ds_out[var].loc[dict(time=slice(t0, t1))] = ds_out[var].loc[dict(time=slice(t0, t1))] % 360
- # flagging adjusted values
- # if var + "_adj_flag" not in ds_out.columns:
- # ds_out[var + "_adj_flag"] = 0
- # msk = ds_out[var].loc[dict(time=slice(t0, t1))].notnull()
- # ind = ds_out[var].loc[dict(time=slice(t0, t1))].loc[msk].time
- # ds_out.loc[ind, var + "_adj_flag"] = 1
-
- if func == "min_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))].values
- tmp[tmp < val] = np.nan
-
- if func == "max_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))].values
- tmp[tmp > val] = np.nan
- ds_out[var].loc[dict(time=slice(t0, t1))] = tmp
-
- if func == "upper_perc_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))].copy()
- df_w = ds_out[var].loc[dict(time=slice(t0, t1))].resample("14D").quantile(1 - val / 100)
- df_w = ds_out[var].loc[dict(time=slice(t0, t1))].resample("14D").var()
- for m_start, m_end in zip(df_w.time[:-2], df_w.time[1:]):
- msk = (tmp.time >= m_start) & (tmp.time < m_end)
- values_month = tmp.loc[msk].values
- values_month[values_month < df_w.loc[m_start]] = np.nan
- tmp.loc[msk] = values_month
-
- ds_out[var].loc[dict(time=slice(t0, t1))] = tmp.values
-
- if func == "biweekly_upper_range_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))].copy()
- df_max = ds_out[var].loc[dict(time=slice(t0, t1))].resample("14D").max()
- for m_start, m_end in zip(df_max.time[:-2], df_max.time[1:]):
- msk = (tmp.time >= m_start) & (tmp.time < m_end)
- lim = df_max.loc[m_start] - val
- values_month = tmp.loc[msk].values
- values_month[values_month < lim] = np.nan
- tmp.loc[msk] = values_month
- # remaining samples following outside of the last 2 weeks window
- msk = tmp.time >= m_end
- lim = df_max.loc[m_start] - val
- values_month = tmp.loc[msk].values
- values_month[values_month < lim] = np.nan
- tmp.loc[msk] = values_month
- # updating original pandas
- ds_out[var].loc[dict(time=slice(t0, t1))] = tmp.values
-
- if func == "hampel_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))]
- tmp = _hampel(tmp, k=7 * 24, t0=val)
- ds_out[var].loc[dict(time=slice(t0, t1))] = tmp.values
-
- if func == "grad_filter":
- tmp = ds_out[var].loc[dict(time=slice(t0, t1))].copy()
- msk = ds_out[var].loc[dict(time=slice(t0, t1))].copy().diff()
- tmp[np.roll(msk.abs() > val, -1)] = np.nan
- ds_out[var].loc[dict(time=slice(t0, t1))] = tmp
-
- if "swap_with_" in func:
- var2 = func[10:]
- val_var = ds_out[var].loc[dict(time=slice(t0, t1))].values.copy()
- val_var2 = ds_out[var2].loc[dict(time=slice(t0, t1))].values.copy()
- ds_out[var2].loc[dict(time=slice(t0, t1))] = val_var
- ds_out[var].loc[dict(time=slice(t0, t1))] = val_var2
-
- if func == "rotate":
- ds_out[var].loc[dict(time=slice(t0, t1))] = (ds_out[var].loc[dict(time=slice(t0, t1))].values + val) % 360
-
- return ds_out

  def calcCloudCoverage(T, T_0, eps_overcast, eps_clear, dlr, station_id):
  '''Calculate cloud cover from T and T_0
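The flag-and-fix machinery deleted here is not dropped from the package: per the import hunk near the top of this file's diff and the new file pypromice-1.3.3/src/pypromice/qc/github_data_issues.py (+376 lines) in the file list, it has been moved out of L1toL2.py. A hedged sketch of the new import location, assuming pypromice >= 1.3.3:

```python
# Hedged sketch: the helpers removed from L1toL2.py are now imported from
# pypromice.qc.github_data_issues, as shown in this diff's import hunk.
# Requires pypromice >= 1.3.3.
from pypromice.qc.github_data_issues import flagNAN, adjustTime, adjustData

# toL2() still calls these on the Level-1 dataset (the adjustData call is
# visible in the hunk above), e.g.:
# ds = flagNAN(ds)
# ds = adjustTime(ds)
# ds = adjustData(ds)
```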
@@ -493,6 +216,7 @@ def calcCloudCoverage(T, T_0, eps_overcast, eps_clear, dlr, station_id):
  cc[cc < 0] = 0
  return cc

+
  def calcSurfaceTemperature(T_0, ulr, dlr, emissivity):
  '''Calculate surface temperature from air temperature, upwelling and
  downwelling radiation and emissivity
@@ -516,6 +240,7 @@ def calcSurfaceTemperature(T_0, ulr, dlr, emissivity):
  t_surf = ((ulr - (1 - emissivity) * dlr) / emissivity / 5.67e-8)**0.25 - T_0
  return t_surf

+
  def calcTilt(tilt_x, tilt_y, deg2rad):
  '''Calculate station tilt

@@ -557,6 +282,7 @@ def calcTilt(tilt_x, tilt_y, deg2rad):
  # theta_sensor_deg = theta_sensor_rad * rad2deg
  return phi_sensor_rad, theta_sensor_rad

+
  def correctHumidity(rh, T, T_0, T_100, ews, ei0): #TODO figure out if T replicate is needed
  '''Correct relative humidity using Groff & Gratch method, where values are
  set when freezing and remain the original values when not freezing
@@ -599,6 +325,7 @@ def correctHumidity(rh, T, T_0, T_100, ews, ei0): #TODO f
  rh_cor = rh.where(~freezing, other = rh*(e_s_wtr / e_s_ice))
  return rh_cor

+
  def correctPrecip(precip, wspd):
  '''Correct precipitation with the undercatch correction method used in
  Yang et al. (1999) and Box et al. (2022), based on Goodison et al. (1998)
@@ -654,6 +381,7 @@ def correctPrecip(precip, wspd):

  return precip_cor, precip_rate

+
  def calcDeclination(doy, hour, minute):
  '''Calculate sun declination based on time

@@ -702,6 +430,7 @@ def calcHourAngle(hour, minute, lon):
  return 2 * np.pi * (((hour + minute / 60) / 24 - 0.5) - lon/360)
  # ; - 15.*timezone/360.)

+
  def calcDirectionDeg(HourAngle_rad): #TODO remove if not plan to use this
  '''Calculate sun direction as degrees. This is an alternative to
  _calcHourAngle that is currently not implemented into the offical L0>>L3
@@ -754,6 +483,7 @@ def calcZenith(lat, Declination_rad, HourAngle_rad, deg2rad, rad2deg):
  ZenithAngle_deg = ZenithAngle_rad * rad2deg
  return ZenithAngle_rad, ZenithAngle_deg

+
  def calcAngleDiff(ZenithAngle_rad, HourAngle_rad, phi_sensor_rad,
  theta_sensor_rad):
  '''Calculate angle between sun and upper sensor (to determine when sun is
@@ -822,6 +552,7 @@ def calcAlbedo(usr, dsr_cor, AngleDif_deg, ZenithAngle_deg):
  albedo = albedo.ffill(dim='time').bfill(dim='time') #TODO remove this line and one above?
  return albedo, OKalbedos

+
  def calcTOA(ZenithAngle_deg, ZenithAngle_rad):
  '''Calculate incoming shortwave radiation at the top of the atmosphere,
  accounting for sunset periods
@@ -912,75 +643,6 @@ def calcCorrectionFactor(Declination_rad, phi_sensor_rad, theta_sensor_rad,

  return CorFac_all

- def _getDF(flag_url, flag_file, download=True):
- '''Get dataframe from flag or adjust file. First attempt to retrieve from
- URL. If this fails then attempt to retrieve from local file
-
- Parameters
- ----------
- flag_url : str
- URL address to file
- flag_file : str
- Local path to file
- download : bool
- Flag to download file from URL
-
- Returns
- -------
- df : pd.DataFrame
- Flag or adjustment dataframe
- '''
-
- # Download local copy as csv
- if download==True:
- os.makedirs(os.path.dirname(flag_file), exist_ok = True)
-
- try:
- urllib.request.urlretrieve(flag_url, flag_file)
- logger.info(f"Downloaded a {flag_file.split('/')[-2][:-1],} file to {flag_file}")
- except (HTTPError, URLError) as e:
- logger.info(f"Unable to download {flag_file.split('/')[-2][:-1],} file, using local file: {flag_file}")
- else:
- logger.info(f"Using local {flag_file.split('/')[-2][:-1],} file: {flag_file}")
-
- if os.path.isfile(flag_file):
- df = pd.read_csv(
- flag_file,
- comment="#",
- skipinitialspace=True,
- ).dropna(how='all', axis='rows')
- else:
- df=None
- logger.info(f"No {flag_file.split('/')[-2][:-1]} file to read.")
- return df
-
-
- def _hampel(vals_orig, k=7*24, t0=3):
- '''Hampel filter
-
- Parameters
- ----------
- vals : pd.DataSeries
- Series of values from which to remove outliers
- k : int
- Size of window, including the sample. For example, 7 is equal to 3 on
- either side of value. The default is 7*24.
- t0 : int
- Threshold value. The default is 3.
- '''
- #Make copy so original not edited
- vals=vals_orig.copy()
-
- #Hampel Filter
- L= 1.4826
- rolling_median=vals.rolling(k).median()
- difference=np.abs(rolling_median-vals)
- median_abs_deviation=difference.rolling(k).median()
- threshold= t0 *L * median_abs_deviation
- outlier_idx=difference>threshold
- outlier_idx[0:round(k/2)]=False
- vals.loc[outlier_idx]=np.nan
- return(vals)

  def _checkSunPos(ds, OKalbedos, sundown, sunonlowerdome, TOA_crit_nopass):
  '''Check sun position
@@ -1025,12 +687,14 @@ def _getTempK(T_0): #
  Steam point temperature in K'''
  return T_0+100

+
  def _getRotation(): #TODO same as L2toL3._getRotation()
  '''Return degrees-to-radians and radians-to-degrees'''
  deg2rad = np.pi / 180
  rad2deg = 1 / deg2rad
  return deg2rad, rad2deg

+
  if __name__ == "__main__":
  # unittest.main()
  pass