bulum 0.2.10__tar.gz → 0.3.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {bulum-0.2.10/src/bulum.egg-info → bulum-0.3.0}/PKG-INFO +7 -5
- {bulum-0.2.10 → bulum-0.3.0}/README.md +6 -4
- {bulum-0.2.10 → bulum-0.3.0}/setup.py +1 -1
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/clim/clim.py +39 -41
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/__init__.py +1 -1
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/csv_io.py +31 -19
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/general_io.py +11 -2
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/idx_io.py +22 -21
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/idx_io_native.py +56 -33
- bulum-0.3.0/src/bulum/io/iqqm_out_reader.py +337 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/lqn_io.py +24 -9
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/res_csv_io.py +25 -13
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/maps/station_maps.py +51 -27
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/altair_plots.py +274 -150
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/ensemble_altair_plots.py +50 -35
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/node_diagrams.py +3 -1
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/plot_functions.py +37 -21
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/tests/test_plot_functions.py +1 -1
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/__init__.py +2 -1
- bulum-0.3.0/src/bulum/stats/aggregate_stats.py +196 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/ensemble_stats.py +75 -30
- bulum-0.3.0/src/bulum/stats/negflo.py +573 -0
- bulum-0.3.0/src/bulum/stats/negflo_helpers.py +154 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/reliability_stats_class.py +82 -45
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/stochastic_data_check.py +9 -4
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/storage_level_assessment.py +51 -25
- bulum-0.3.0/src/bulum/stoch/generate.py +91 -0
- bulum-0.3.0/src/bulum/trans/transformers.py +100 -0
- bulum-0.3.0/src/bulum/utils/__init__.py +16 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/utils/dataframe_extensions.py +54 -71
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/utils/dataframe_functions.py +102 -111
- bulum-0.3.0/src/bulum/utils/datetime_functions.py +327 -0
- bulum-0.3.0/src/bulum/utils/interpolation.py +15 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/version.py +1 -1
- {bulum-0.2.10 → bulum-0.3.0/src/bulum.egg-info}/PKG-INFO +7 -5
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum.egg-info/SOURCES.txt +1 -1
- bulum-0.2.10/src/bulum/io/iqqm_out_reader.py +0 -127
- bulum-0.2.10/src/bulum/stats/aggregate_stats.py +0 -116
- bulum-0.2.10/src/bulum/stats/negflo.py +0 -652
- bulum-0.2.10/src/bulum/stats/negflo_scratch.gitignore.py +0 -8
- bulum-0.2.10/src/bulum/stoch/generate.py +0 -58
- bulum-0.2.10/src/bulum/trans/transformers.py +0 -83
- bulum-0.2.10/src/bulum/utils/__init__.py +0 -4
- bulum-0.2.10/src/bulum/utils/datetime_functions.py +0 -307
- bulum-0.2.10/src/bulum/utils/interpolation.py +0 -15
- {bulum-0.2.10 → bulum-0.3.0}/LICENSE.md +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/setup.cfg +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/clim/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/demo.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/maps/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/plotly_helpers.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/plots/tests/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/swflo2s/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stats/swflo2s/swflo2s.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stoch/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/stoch/analyse.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/test_circular_import.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum/trans/__init__.py +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum.egg-info/dependency_links.txt +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum.egg-info/requires.txt +0 -0
- {bulum-0.2.10 → bulum-0.3.0}/src/bulum.egg-info/top_level.txt +0 -0
{bulum-0.2.10/src/bulum.egg-info → bulum-0.3.0}/PKG-INFO

@@ -1,8 +1,8 @@
 Metadata-Version: 2.2
 Name: bulum
-Version: 0.2.10
+Version: 0.3.0
 Summary: Open source python library for assessing hydrologic model results in Queensland
-Home-page: https://
+Home-page: https://github.com/odhydrology/bulum
 Author: Chas Egan
 Author-email: chas@odhydrology.com
 Classifier: Programming Language :: Python :: 3
@@ -56,6 +56,8 @@ bulum.__version__
 bulum.hello_world()
 ```
 
+API documentation is available at [odhydrology.github.io/bulum](https://odhydrology.github.io/bulum/).
+
 ## Build and Upload to PyPi
 
 First build a distribution from an anaconda prompt in the root of your project, and then upload the dist to PyPi using Twine.
@@ -68,7 +70,7 @@ python setup.py sdist
 twine upload dist\bulum-0.0.32.tar.gz
 ```
 
-As of Nov 2023, PyPi uses an API token instead of a conventional password. You can still use Twine, but the username is "
+As of Nov 2023, PyPi uses an API token instead of a conventional password. You can still use Twine, but the username is "\_\_token__", and password is the API token which is very long string starting with "pypi-".
 
 ``` bash
 username = __token__
@@ -81,7 +83,7 @@ How do I make a new API token? Go to your PyPi account settings, and click on "A
 
 ## Unit Tests
 
-WARNING: Run unit tests from an anaconda environment with
+WARNING: Run unit tests from an anaconda environment with compatible dependencies!
 
 Install the nose2 test-runner framework.
 
@@ -89,7 +91,7 @@ Install the nose2 test-runner framework.
 pip install nose2
 ```
 
-Then from the root project folder run the nose2 module. You can do this as a python modules, or just
+Then from the root project folder run the nose2 module. You can do this as a python modules, or just directly from the anaconda prompt (both examples given below). This will automatically find and run tests in any modules named "test_*".
 
 ```bash
 python -m nose2
{bulum-0.2.10 → bulum-0.3.0}/README.md

@@ -28,6 +28,8 @@ bulum.__version__
 bulum.hello_world()
 ```
 
+API documentation is available at [odhydrology.github.io/bulum](https://odhydrology.github.io/bulum/).
+
 ## Build and Upload to PyPi
 
 First build a distribution from an anaconda prompt in the root of your project, and then upload the dist to PyPi using Twine.
@@ -40,7 +42,7 @@ python setup.py sdist
 twine upload dist\bulum-0.0.32.tar.gz
 ```
 
-As of Nov 2023, PyPi uses an API token instead of a conventional password. You can still use Twine, but the username is "
+As of Nov 2023, PyPi uses an API token instead of a conventional password. You can still use Twine, but the username is "\_\_token__", and password is the API token which is very long string starting with "pypi-".
 
 ``` bash
 username = __token__
@@ -53,7 +55,7 @@ How do I make a new API token? Go to your PyPi account settings, and click on "A
 
 ## Unit Tests
 
-WARNING: Run unit tests from an anaconda environment with
+WARNING: Run unit tests from an anaconda environment with compatible dependencies!
 
 Install the nose2 test-runner framework.
 
@@ -61,7 +63,7 @@ Install the nose2 test-runner framework.
 pip install nose2
 ```
 
-Then from the root project folder run the nose2 module. You can do this as a python modules, or just
+Then from the root project folder run the nose2 module. You can do this as a python modules, or just directly from the anaconda prompt (both examples given below). This will automatically find and run tests in any modules named "test_*".
 
 ```bash
 python -m nose2
@@ -79,4 +81,4 @@ nose2 src.bulum.stats.tests
 
 ## License
 
-Refer to LICENCE.md
+Refer to LICENCE.md
{bulum-0.2.10 → bulum-0.3.0}/setup.py

@@ -14,7 +14,7 @@ setuptools.setup(
     description="Open source python library for assessing hydrologic model results in Queensland",
     long_description=long_description,
     long_description_content_type="text/markdown",
-    url="https://
+    url="https://github.com/odhydrology/bulum",
     package_dir={'': 'src'},
     packages=setuptools.find_packages('src'),
     classifiers=[
{bulum-0.2.10 → bulum-0.3.0}/src/bulum/clim/clim.py

@@ -3,20 +3,20 @@ import pandas as pd
 
 
 def derive_transformation_curves(original_ts: pd.Series, augmented_ts: pd.Series, season_start_months=[1,2,3,4,5,6,7,8,9,10,11,12], epsilon=1e-3) -> dict:
-    """Returns a dictionary of exceedence-based transformation curves - one for
-    with the season's start month as the key. These are tables that
-    (cunnane plotting position as a fraction) to a scaling
-    be used to
-
-
+    """Returns a dictionary of exceedence-based transformation curves - one for
+    each season with the season's start month as the key. These are tables that
+    map from exceedance (cunnane plotting position as a fraction) to a scaling
+    factor. These are intended to be used to effectively summarise
+    climate-change adjustments, and allow them to be transported from one
+    timeseries to another.
 
-
-
-
-
-
-
-
+    Parameters
+    ----------
+    original_ts : pd.Series
+    augmented_ts : pd.Series
+    season_start_months : list, optional
+        Defaults to [1,2,3,4,5,6,7,8,9,10,11,12].
+
     """
     df = pd.DataFrame()
     df["x"] = original_ts
@@ -43,17 +43,19 @@ def derive_transformation_curves(original_ts: pd.Series, augmented_ts: pd.Series
 
 def apply_transformation_curves(tranformation_curves: dict, series: pd.Series) -> pd.Series:
     """Applies seasonal transformation curves to an input series.
-    Refer to the function
+    Refer to the function `derive_transformation_curves`.
 
-
-
-
+    Parameters
+    ----------
+    tranformation_curves : dict
+    series : pd.Series
 
-    Returns
-
+    Returns
+    -------
+    pd.Series
     """
     dates = series.index
-    answer = series.copy()
+    answer = series.copy()
     # Apply each transformation curves to the whole series. Splice the appropriate
     # parts (seasons) into the 'answer' series as we go.
     season_start_months = sorted(tranformation_curves.keys())
@@ -73,7 +75,7 @@ def apply_transformation_curves(tranformation_curves: dict, series: pd.Series) -
         # And get their ranks and plotting positions
         rank_starting_at_one = values.rank(ascending=True) # This function is nice because equal values are assigned the same (averaged) rank.
         n = len(values)
-        p = [(r - 0.4)/(n + 0.2) for r in rank_starting_at_one] # plotting position
+        p = [(r - 0.4)/(n + 0.2) for r in rank_starting_at_one] # Cunnane plotting position
         f = np.interp(p, xp, fp) # interpolated scaling factors
         # Calcualte new values and update the answer
         new_values = pd.Series([values.iloc[i] * f[i] for i in range(n)], index=season_dates)
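The restored docstrings describe an exceedance-curve workflow: derive per-season curves from an original/augmented pair, then transport the adjustment to another series. Below is a minimal usage sketch under stated assumptions: the functions are imported from the `bulum.clim.clim` module shown in this diff, the series use bulum's string "%Y-%m-%d" date index, and the data are invented for illustration.

```python
import numpy as np
import pandas as pd

from bulum.clim.clim import (apply_transformation_curves,
                             derive_transformation_curves)

# Invented daily series with bulum's string "%Y-%m-%d" date index.
dates = pd.date_range("2000-01-01", "2009-12-31").strftime("%Y-%m-%d")
rng = np.random.default_rng(42)
original = pd.Series(rng.gamma(2.0, 50.0, size=len(dates)), index=dates, name="Flow")
augmented = original * 0.85  # stand-in for a climate-adjusted version of the record

# One curve per season; here two six-month seasons starting in Jan and Jul.
curves = derive_transformation_curves(original, augmented, season_start_months=[1, 7])

# Transport the adjustment to a series (here the original itself).
adjusted = apply_transformation_curves(curves, original)
```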
@@ -88,14 +90,15 @@ def derive_transformation_factors(original_ts: pd.Series, augmented_ts: pd.Serie
     be used to effectively summarise climate-change adjustments, and allow them to be
     transported from one timeseries to another.
 
-
-
-
-
-
+    Parameters
+    ----------
+    original_ts : pd.Series
+    augmented_ts : pd.Series
+    season_start_months : list, optional
+        [1,2,3,4,5,6,7,8,9,10,11,12].
+    epsilon : float
+        Threshold below which values are treated as zero, and the associated factor defaults to 1.
 
-    Returns:
-        dict: _description_
     """
     # Create a map of month -> season_start_month (for all months)
     month_to_season_map = {}
@@ -116,25 +119,24 @@ def derive_transformation_factors(original_ts: pd.Series, augmented_ts: pd.Serie
     return df2['f'].to_dict()
 
 
-def apply_transformation_factors(
+def apply_transformation_factors(transformation_factors: dict, series: pd.Series) -> pd.Series:
     """Applies seasonal transformation factors to an input series.
-    Refer to the function
+    Refer to the function `derive_transformation_curves`.
 
-
-
-
+    Parameters
+    ----------
+    transformation_curves : dict
+    series : pd.Series
 
-    Returns:
-        pd.Series: _description_
     """
     # Create a map of month -> factor (containing all months)
-    season_start_months = sorted(
+    season_start_months = sorted(transformation_factors.keys())
     month_to_factor_map = {}
     key = max(season_start_months)
     for m in [1,2,3,4,5,6,7,8,9,10,11,12]:
         if m in season_start_months:
             key = m
-        month_to_factor_map[m] =
+        month_to_factor_map[m] = transformation_factors[key]
     # Apply transformation factors to the whole series. Splice the appropriate
     df = pd.DataFrame()
     df['x'] = series
@@ -144,7 +146,3 @@ def apply_transformation_factors(tranformation_factors: dict, series: pd.Series)
     answer = df['y']
     answer.name = series.name
     return answer
-
-
-
-
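The factor-based variant condenses each season to a single scaling factor rather than a full exceedance curve. A companion sketch under the same assumptions, reusing `original` and `augmented` from the curve example above:

```python
from bulum.clim.clim import (apply_transformation_factors,
                             derive_transformation_factors)

# One scaling factor per season; epsilon guards the near-zero values the
# restored docstring describes (their factor defaults to 1).
factors = derive_transformation_factors(original, augmented,
                                        season_start_months=[1, 7],
                                        epsilon=1e-3)

scaled = apply_transformation_factors(factors, original)
```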
{bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/csv_io.py

@@ -1,20 +1,36 @@
+"""
+Functions for reading CSVs, particularly time-series CSVs.
+"""
+
 import numpy as np
 import pandas as pd
 from bulum import utils
+import os
+
 na_values = ['', ' ', 'null', 'NULL', 'NAN', 'NaN', 'nan', 'NA', 'na', 'N/A' 'n/a', '#N/A', '#NA', '-NaN', '-nan']
 
 
-def read_ts_csv(filename
-
-
+def read_ts_csv(filename: str | os.PathLike, date_format=None,
+                df=None, colprefix=None, allow_nonnumeric=False,
+                assert_date=True, **kwargs) -> utils.TimeseriesDataframe:
+    """
+    Reads a daily timeseries csv into a DataFrame, and sets the index to string
+    dates in the "%Y-%m-%d" format. The method assumes the first column are
+    dates.
 
-
-
-
-
-
-
-
+    Parameters
+    ----------
+    filename : str | PathLike
+    date_format : str, optional
+        defaults to "%d/%m/%Y" as per Fors. Other common formats include "%Y-%m-%d", "%Y/%m/%d".
+    df : pd.DataFrame, optional
+        If provided, the reader will append columns to this dataframe. Defaults to None.
+    colprefix : str, optional
+        If provided, the reader will append this prefix to the start of each column name. Defaults to None.
+    allow_nonnumeric : bool, optional
+        If false, the method will assert that all columns are numerical. Defaults to False.
+    assert_date : bool, optional
+        If true, the method will assert that date index meets "%Y-%m-%d" format. Defaults to True.
 
     Returns:
         pd.DataFrame: Dataframe containing the data from the csv file.
@@ -34,7 +50,7 @@ def read_ts_csv(filename, date_format=None, df=None, colprefix=None, allow_nonnu
     # Rename columns if required
     if colprefix is not None:
         for c in new_df.columns:
-            new_df.rename(columns
+            new_df.rename(columns={c: f"{colprefix}{c}"}, inplace=True)
     # Join to existing dataframe if required
     if df is None:
         df = new_df
@@ -49,11 +65,7 @@ def read_ts_csv(filename, date_format=None, df=None, colprefix=None, allow_nonnu
     return utils.TimeseriesDataframe.from_dataframe(df)
 
 
-def write_ts_csv(df: pd.DataFrame, filename: str
-
-
-
-        df (pd.DataFrame): _description_
-        filename (str): _description_
-    """
-    df.to_csv(filename)
+def write_ts_csv(df: pd.DataFrame, filename: str,
+                 *args, **kwargs):
+    """Wrapper around ``pandas.DataFrame.to_csv()``."""
+    df.to_csv(filename, *args, **kwargs)
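The expanded `read_ts_csv` signature documents prefixing, appending, and validation, and `write_ts_csv` is now a thin `to_csv` wrapper. A round-trip sketch with an invented file name and gauge prefix, importing directly from the `csv_io` module shown in the diff:

```python
from bulum.io.csv_io import read_ts_csv, write_ts_csv

# Hypothetical daily file whose first column holds dates in the documented
# default "%d/%m/%Y" format; columns come back with the given prefix.
df = read_ts_csv("flows.csv", colprefix="GS4032_")

# df is a utils.TimeseriesDataframe indexed by "%Y-%m-%d" string dates.
print(df.columns)

# Extra positional/keyword args now pass straight through to DataFrame.to_csv().
write_ts_csv(df, "flows_out.csv")
```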
{bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/general_io.py

@@ -1,10 +1,19 @@
-
+"""
+General use IO functions.
+"""
+import re
+
 import bulum.io as bio
 from bulum import utils
-import re
 
 
 def read(filename: str, **kwargs) -> utils.TimeseriesDataframe:
+    """
+    Read the input file.
+
+    It will attempt to determine the filetype and dispatch to the appropriate
+    function in `bulum.io`.
+    """
     filename_lower = filename.lower()
     df = None
     if filename_lower.endswith(".res.csv"):
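The new docstring makes the dispatch behaviour explicit: `read` inspects the filename and routes to the matching reader. A one-line sketch, assuming `read` is exposed at the package level (as `general_io`'s own `import bulum.io as bio` suggests) and using an invented file name:

```python
import bulum.io as bio

# Dispatches on extension; ".res.csv" is the case visible in this hunk, and
# other bulum.io formats (e.g. IDX) are assumed to route similarly.
df = bio.read("scenario1.res.csv")
```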
{bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/idx_io.py

@@ -1,20 +1,20 @@
+"""
+IO functions for IDX files.
+
+See also :py:mod:`bulum.op.idx_io_native`.
+"""
 import os
-import pandas as pd
-import uuid
 import shutil
 import subprocess
-
-from .csv_io import *
+import uuid
 
+from bulum import utils
 
+from .csv_io import *
 
-def write_idx(df, filename, cleanup_tempfile=True):
-    """_summary_
 
-
-
-        filename (_type_): _description_
-    """
+def write_idx(df: pd.DataFrame, filename, cleanup_tempfile=True):
+    """Write IDX file from dataframe, requires csvidx.exe."""
     if shutil.which('csvidx') is None:
         raise Exception("This method relies on the external program 'csvidx.exe'. Please ensure it is in your path.")
     temp_filename = f"{uuid.uuid4().hex}.tempfile.csv"
@@ -26,17 +26,20 @@ def write_idx(df, filename, cleanup_tempfile=True):
         os.remove(temp_filename)
 
 
-
-def write_area_ts_csv(df, filename, units = "(mm.d^-1)"):
+def write_area_ts_csv(df, filename, units="(mm.d^-1)"):
     """_summary_
 
-
-
-
-
+    Parameters
+    ----------
+    df : DataFrame
+    filename
+    units : str, optional
+        Defaults to "(mm.d^-1)".
 
-    Raises
-
+    Raises
+    ------
+    Exception
+        If shortened field names are going to clash in output file.
     """
     # ensures dataframe adheres to standards
    utils.assert_df_format_standards(df)
@@ -45,7 +48,7 @@ def write_area_ts_csv(df, filename, units = "(mm.d^-1)"):
     for c in df.columns:
         c12 = f"{c[:12]:<12}"
         if c12 in fields.keys():
-            raise Exception(f"Field names clash when
+            raise Exception(f"Field names clash when shortened to 12 chars: {c} and {fields[c12]}")
         fields[c12] = c
     # create the header text
     header = f"{units}"
@@ -60,5 +63,3 @@ def write_area_ts_csv(df, filename, units = "(mm.d^-1)"):
     with open(filename, "w+", newline='', encoding='utf-8') as file:
         file.write(header)
         df.to_csv(file, header=False, na_rep=' NaN')
-
-
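`write_idx` still shells out to the external `csvidx.exe` via a temporary CSV, which the tidied docstring now states up front. A guarded sketch with an invented input file; the availability check mirrors the function's own:

```python
import shutil

from bulum.io.csv_io import read_ts_csv
from bulum.io.idx_io import write_idx

df = read_ts_csv("flows.csv")  # hypothetical bulum-standard daily dataframe

# write_idx round-trips through a temp CSV and calls 'csvidx', so check for
# the external program first rather than catching the raised Exception.
if shutil.which("csvidx") is not None:
    write_idx(df, "flows.idx")
else:
    print("csvidx.exe not on PATH; consider the native writer instead")
```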
{bulum-0.2.10 → bulum-0.3.0}/src/bulum/io/idx_io_native.py

@@ -1,18 +1,26 @@
+"""
+IO functions for IDX format (binary) written in native Python.
+"""
 import os
-
+from typing import Optional
+
 import numpy as np
+import pandas as pd
+
 from bulum import utils
 
 
 def _detect_header_bytes(b_data: np.ndarray) -> bool:
     """
-    Helper function for read_idx
-    a version of IQQM with an old compiler with metadata/junk data
-
-
+    Helper function for :func:`read_idx`. Detects whether the .OUT file was
+    written with a version of IQQM with an old compiler with metadata/junk data
+    as a header. Fails if the run was undertaken with only one source of data,
+    i.e. the .idx file has only one entry.
 
-
-
+    Parameters
+    ----------
+    b_data : np.ndarray
+        2d array of binary data filled with float32 data
     """
     b_data_slice: tuple[np.float32] = b_data[0]
     first_non_zero = b_data_slice[0] != 0.0
@@ -20,17 +28,22 @@ def _detect_header_bytes(b_data: np.ndarray) -> bool:
     return first_non_zero and rest_zeroes
 
 
-def read_idx(filename, skip_header_bytes=None) -> utils.TimeseriesDataframe:
-    """
+def read_idx(filename, skip_header_bytes: Optional[bool] = None) -> utils.TimeseriesDataframe:
+    """
+    Read IDX file.
 
-
-
-
-
-
+    Parameters
+    ----------
+    filename
+        Name of the IDX file.
+    skip_header_bytes : bool, optional
+        Whether to skip header bytes in the corresponding OUTs file (related to
+        the compiler used for IQQM). If set to None, attempt to detect the
+        presence of header bytes automatically.
 
-    Returns
-
+    Returns
+    -------
+    utils.TimeseriesDataframe
     """
     if not os.path.exists(filename):
         raise FileNotFoundError(f"File does not exist: {filename}")
@@ -41,7 +54,7 @@ def read_idx(filename, skip_header_bytes=None) -> utils.TimeseriesDataframe:
         # Start date, end date, date interval
         stmp = f.readline().split()
         date_start = utils.standardize_datestring_format([stmp[0]])[0]
-        date_end = utils.standardize_datestring_format([stmp[1]])[0]
+        date_end = utils.standardize_datestring_format([stmp[1]])[0]
         date_flag = int(stmp[2])
         snames = []
         for n, line in enumerate(f):
@@ -65,14 +78,14 @@ def read_idx(filename, skip_header_bytes=None) -> utils.TimeseriesDataframe:
     # Read data
     if date_flag == 0:
         daily_date_values = utils.datetime_functions.get_dates(
-            date_start, end_date=date_end, include_end_date=True)
+            date_start, end_date=date_end, include_end_date=True)
         df = pd.DataFrame.from_records(b_data, index=daily_date_values)
         df.columns = snames
         df.index.name = "Date"
         # Check data types. If not 'float64' or 'int64', convert to 'float64'
-        x = df.select_dtypes(exclude=['int64','float64']).columns
-        if x.__len__()>0:
-            df=df.astype({i: 'float64' for i in x})
+        x = df.select_dtypes(exclude=['int64', 'float64']).columns
+        if x.__len__() > 0:
+            df = df.astype({i: 'float64' for i in x})
     elif date_flag == 1:
         raise NotImplementedError("Monthly data not yet supported")
     elif date_flag == 3:
@@ -84,21 +97,29 @@ def read_idx(filename, skip_header_bytes=None) -> utils.TimeseriesDataframe:
 
 
 def write_idx_native(df: pd.DataFrame, filepath, type="None", units="None") -> None:
-    """Writer for .IDX and corresponding .OUT binary files written in native
-    Currently only supports daily data (date flag 0), as with the reader
+    """Writer for .IDX and corresponding .OUT binary files written in native
+    Python. Currently only supports daily data (date flag 0), as with the reader
+    :func:`read_idx`.
 
-    Assumes that data are homogeneous in units and type e.g. Precipitation & mm
+    Assumes that data are homogeneous in units and type e.g. Precipitation & mm
+    resp., or Flow & ML/d.
 
-
-
-
-
-
+    Parameters
+    ----------
+    df : pd.Dataframe
+        DataFrame as per the output of :func:`read_idx`.
+    filepath
+        Path to the IDX file to be written to including .IDX extension.
+    units : str, optional
+        Units for data in df.
+    type : str, optional
+        Data specifier for data in df, e.g. Gauged Flow, Precipitation, etc.
     """
     date_flag = 0
-    # TODO: When generalising to other frequencies, we may be able to simply
-    #
-    #
+    # TODO: When generalising to other frequencies, we may be able to simply
+    # read the data type off the time delta in df.index values As is, I've
+    # essentially copied what was done in the reader to flag that this should be
+    # implemented at the "same time". Verify valid date_flag
    match date_flag:
         case 0:
             pass # valid
@@ -122,7 +143,9 @@ def write_idx_native(df: pd.DataFrame, filepath, type="None", units="None") -> N
         f.write(f"{first_date} {last_date} {date_flag}\n")
         # data
         # inline fn to ensure padded string is exactly l characters long
-
+
+        def ljust_or_truncate(s, l):
+            return s.ljust(l)[0:l]
         for idx, col_name in enumerate(col_names):
             source_entry = ljust_or_truncate(f"df_col{idx+1}", 12)
             name_entry = ljust_or_truncate(f"{col_name}", 40)