PyPI - cavapy - Versions diffs - 1.1.0__tar.gz → 1.1.5__tar.gz - Mend

cavapy 1.1.0__tar.gz → 1.1.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of cavapy might be problematic.

Files changed (11)

{cavapy-1.1.0 → cavapy-1.1.5}/PKG-INFO +46 -1
{cavapy-1.1.0 → cavapy-1.1.5}/README.md +63 -18
cavapy-1.1.5/cava_bias.py +73 -0
cavapy-1.1.5/cava_config.py +65 -0
cavapy-1.1.5/cava_download.py +450 -0
cavapy-1.1.5/cava_plot.py +204 -0
cavapy-1.1.5/cava_validation.py +359 -0
cavapy-1.1.5/cavapy.py +523 -0
{cavapy-1.1.0 → cavapy-1.1.5}/pyproject.toml +2 -1
cavapy-1.1.0/cavapy.py +0 -1177
{cavapy-1.1.0 → cavapy-1.1.5}/LICENSE +0 -0

{cavapy-1.1.0 → cavapy-1.1.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cavapy
-Version: 1.1.0
+Version: 1.1.5
 Summary: CAVA Python package. Retrive climate data.
 License: MIT
 License-File: LICENSE
@@ -100,6 +100,13 @@ The get_climate_data function performs automatically:
 - Convert into a Gregorian calendar (CORDEX-CORE models do not have a full 365 days calendar) through linear interpolation
 - Bias correction using the empirical quantile mapping (optional)
+### Parallelization strategy
+- If you request a single model/RCP combination, cavapy parallelizes **across variables** (one process per variable).
+- If you request multiple models and/or RCPs, cavapy parallelizes **across combo-variable tasks** (one process per variable per model), capped globally.
+- If `num_processes <= 1` or only one variable is requested, variables run sequentially (even for a single combo).
+- By default, up to **12 total processes** are used (capped by number of combo-variable tasks).
+- Inside each process, a thread pool handles per-variable downloads and observation/model fetches concurrently.
 ## Example usage
 Depending on the interest, downloading climate data can be done in a few different ways. Note that GCM stands for General Circulation Model while RCM stands for Regional Climate Model. As the climate data comes from the CORDEX-CORE initiative, users can choose between 3 different GCMs downscaled with two RCMs. In total, there are six simulations for any given domain (except for CAS-22 where only three are available).
@@ -238,6 +245,44 @@ import cavapy
 Togo_climate_data = cavapy.get_climate_data(country="Togo", variables=["tasmax", "pr"], obs=True,  years_obs=range(1980,2019))
 ```
+### Multiple models and/or RCPs
+You can pass lists (or None) to `rcp`, `gcm`, and `rcm`. If multiple combinations are requested,
+the return structure becomes nested:
+```
+results[rcp][f"{gcm}-{rcm}"][variable] -> DataArray
+```
+Example: all models and both RCPs for Togo (AFR-22):
+```
+import cavapy
+data = cavapy.get_climate_data(
+    country="Togo",
+    cordex_domain="AFR-22",
+    rcp=None,          # all RCPs
+    gcm=None,          # all GCMs
+    rcm=None,          # all RCMs
+    years_up_to=2030,
+    historical=True,
+    dataset="CORDEX-CORE",
+)
+```
+Example: specific models and RCPs:
+```
+data = cavapy.get_climate_data(
+    country="Togo",
+    cordex_domain="AFR-22",
+    rcp=["rcp26", "rcp85"],
+    gcm=["MPI", "MOHC"],
+    rcm=["Reg", "REMO"],
+    years_up_to=2030,
+    historical=True,
+)
+```
 ## Plotting Functionality
 `cavapy` now includes built-in plotting functions to easily visualize your climate data as maps and time series. The plotting functions work seamlessly with the data returned by `get_climate_data()`. **However, if your main goal is visualisation, we strongly encourage you to check out [CAVAanalytics](https://risk-team.github.io/CAVAanalytics/), our R package**.
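
The nested return structure described in the added "Multiple models and/or RCPs" section can be walked with three nested loops. The following is a minimal sketch, not part of the package: `flatten_results` is a hypothetical helper, and plain strings stand in for the `DataArray` objects so the example runs without downloading any data.

```python
def flatten_results(results):
    """Yield (rcp, model, variable, value) for each leaf of the nested mapping."""
    for rcp, models in results.items():
        for model, variables in models.items():
            for variable, value in variables.items():
                yield rcp, model, variable, value


# Placeholder data: strings stand in for the DataArray objects (assumption).
# The model key follows the f"{gcm}-{rcm}" pattern shown in the README.
results = {
    "rcp26": {"MPI-REMO": {"tasmax": "<DataArray>", "pr": "<DataArray>"}},
    "rcp85": {"MPI-REMO": {"tasmax": "<DataArray>", "pr": "<DataArray>"}},
}

rows = list(flatten_results(results))
for rcp, model, variable, _ in rows:
    print(rcp, model, variable)
```

Flattening to tuples like this is one convenient way to iterate over every RCP, model, and variable without hard-coding the nesting depth at each call site.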

{cavapy-1.1.0 → cavapy-1.1.5}/README.md RENAMED

@@ -63,13 +63,20 @@ conda activate test
 pip install cavapy
 ```
-## Process
-The get_climate_data function performs automatically:
-- Data retrieval in parallel
-- Unit conversion
-- Convert into a Gregorian calendar (CORDEX-CORE models do not have a full 365 days calendar) through linear interpolation
-- Bias correction using the empirical quantile mapping (optional)
+## Process
+The get_climate_data function performs automatically:
+- Data retrieval in parallel
+- Unit conversion
+- Convert into a Gregorian calendar (CORDEX-CORE models do not have a full 365 days calendar) through linear interpolation
+- Bias correction using the empirical quantile mapping (optional)
+### Parallelization strategy
+- If you request a single model/RCP combination, cavapy parallelizes **across variables** (one process per variable).
+- If you request multiple models and/or RCPs, cavapy parallelizes **across combo-variable tasks** (one process per variable per model), capped globally.
+- If `num_processes <= 1` or only one variable is requested, variables run sequentially (even for a single combo).
+- By default, up to **12 total processes** are used (capped by number of combo-variable tasks).
+- Inside each process, a thread pool handles per-variable downloads and observation/model fetches concurrently.
 ## Example usage
@@ -77,9 +84,9 @@ Depending on the interest, downloading climate data can be done in a few differe
 Since bias-correction requires both the historical run of the CORDEX model and the observational dataset (in this case ERA5), even when the historical argument is set to False, the historical run will be used for learning the bias correction factor.
-### Bias-corrected climate projections
-**Option 1: Use pre-bias-corrected ISIMIP data (Recommended)**
+### Bias-corrected climate projections
+**Option 1: Use pre-bias-corrected ISIMIP data (Recommended)**
 *Example with AFR-22 domain:*
 ```
@@ -129,7 +136,7 @@ Togo_climate_data = cavapy.get_climate_data(
     dataset="CORDEX-CORE"  # Original data with on-the-fly bias correction
 )
 ```
-### Non bias-corrected climate projections (Original CORDEX-CORE data)
+### Non bias-corrected climate projections (Original CORDEX-CORE data)
 ```
 import cavapy
@@ -145,7 +152,7 @@ Togo_climate_data = cavapy.get_climate_data(
     dataset="CORDEX-CORE"  # Original data, no bias correction
 )
 ```
-### Climate projections plus historical run
+### Climate projections plus historical run
 This is useful when assessing changes from the historical period.
@@ -202,12 +209,50 @@ Togo_climate_data = cavapy.get_climate_data(
     dataset="CORDEX-CORE"
 )
 ```
-### Observations only (ERA5)
-```
-import cavapy
-Togo_climate_data = cavapy.get_climate_data(country="Togo", variables=["tasmax", "pr"], obs=True,  years_obs=range(1980,2019))
-```
+### Observations only (ERA5)
+```
+import cavapy
+Togo_climate_data = cavapy.get_climate_data(country="Togo", variables=["tasmax", "pr"], obs=True,  years_obs=range(1980,2019))
+```
+### Multiple models and/or RCPs
+You can pass lists (or None) to `rcp`, `gcm`, and `rcm`. If multiple combinations are requested,
+the return structure becomes nested:
+```
+results[rcp][f"{gcm}-{rcm}"][variable] -> DataArray
+```
+Example: all models and both RCPs for Togo (AFR-22):
+```
+import cavapy
+data = cavapy.get_climate_data(
+    country="Togo",
+    cordex_domain="AFR-22",
+    rcp=None,          # all RCPs
+    gcm=None,          # all GCMs
+    rcm=None,          # all RCMs
+    years_up_to=2030,
+    historical=True,
+    dataset="CORDEX-CORE",
+)
+```
+Example: specific models and RCPs:
+```
+data = cavapy.get_climate_data(
+    country="Togo",
+    cordex_domain="AFR-22",
+    rcp=["rcp26", "rcp85"],
+    gcm=["MPI", "MOHC"],
+    rcm=["Reg", "REMO"],
+    years_up_to=2030,
+    historical=True,
+)
+```
 ## Plotting Functionality
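
The "Parallelization strategy" bullets added in this hunk can be read as a small planning step: enumerate the combo-variable tasks, then cap the worker count. The sketch below illustrates that reading only. `plan_workers` is a hypothetical name, and the real scheduling code in `cavapy.py` is not shown in this part of the diff, so details such as how a single variable across several combos is handled are assumptions.

```python
from itertools import product

DEFAULT_MAX_PROCESSES = 12  # default cap stated in the README


def plan_workers(rcps, gcms, rcms, variables, num_processes=DEFAULT_MAX_PROCESSES):
    """Return (tasks, n_workers) for the requested combinations (illustrative only)."""
    # One task per (rcp, gcm, rcm, variable), i.e. a "combo-variable task".
    tasks = [
        (rcp, gcm, rcm, variable)
        for rcp, gcm, rcm in product(rcps, gcms, rcms)
        for variable in variables
    ]
    # Never use more workers than tasks, and fall back to 1 (sequential)
    # when num_processes <= 1 or only one task exists.
    n_workers = max(1, min(num_processes, len(tasks)))
    return tasks, n_workers


many_tasks, many_workers = plan_workers(
    ["rcp26", "rcp85"], ["MPI", "MOHC"], ["Reg", "REMO"], ["tasmax", "pr"]
)
single_tasks, single_workers = plan_workers(["rcp26"], ["MPI"], ["REMO"], ["tasmax", "pr"])
print(len(many_tasks), many_workers)
print(len(single_tasks), single_workers)
```

With two RCPs, two GCMs, two RCMs, and two variables, this yields 16 tasks capped at 12 workers, while a single combination with two variables gets one worker per variable.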

cavapy-1.1.5/cava_bias.py ADDED

@@ -0,0 +1,73 @@
+"""Bias-correction utilities for CORDEX data using xsdba."""
+import numpy as np
+import xsdba as sdba
+import xarray as xr
+def _leave_one_out_bias_correction(ref, hist, variable, log):
+    """
+    Perform leave-one-out cross-validation for bias correction to avoid overfitting.
+    Args:
+        ref: Reference (observational) data
+        hist: Historical model data
+        variable: Variable name for determining correction method
+        log: Logger instance
+    Returns:
+        xr.DataArray: Bias-corrected historical data
+    """
+    log.info("Starting leave-one-out cross-validation for bias correction")
+    # Get unique years from historical data
+    hist_years = hist.time.dt.year.values
+    unique_years = np.unique(hist_years)
+    # Initialize list to store corrected data for each year
+    corrected_years = []
+    for leave_out_year in unique_years:
+        log.info(f"Processing leave-out year: {leave_out_year}")
+        # Create masks for training (all years except leave_out_year) and testing (only leave_out_year)
+        train_mask = hist.time.dt.year != leave_out_year
+        test_mask = hist.time.dt.year == leave_out_year
+        # Get training data (all years except the current one)
+        hist_train = hist.sel(time=train_mask)
+        hist_test = hist.sel(time=test_mask)
+        # Get corresponding reference data for training period
+        ref_train_mask = ref.time.dt.year != leave_out_year
+        ref_train = ref.sel(time=ref_train_mask)
+        # Train the bias correction model on the training data
+        QM_leave_out = sdba.EmpiricalQuantileMapping.train(
+            ref_train,
+            hist_train,
+            group="time.month",
+            kind="*" if variable in ["pr", "rsds", "sfcWind"] else "+",
+        )
+        # Apply bias correction to the left-out year
+        hist_corrected_year = QM_leave_out.adjust(
+            hist_test, extrapolation="constant", interp="linear"
+        )
+        # Apply variable-specific constraints
+        if variable == "hurs":
+            hist_corrected_year = hist_corrected_year.where(
+                hist_corrected_year <= 100, 100
+            )
+            hist_corrected_year = hist_corrected_year.where(
+                hist_corrected_year >= 0, 0
+            )
+        corrected_years.append(hist_corrected_year)
+    # Concatenate all corrected years and sort by time
+    hist_bs = xr.concat(corrected_years, dim="time").sortby("time")
+    log.info("Leave-one-out cross-validation bias correction completed")
+    return hist_bs
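
The leave-one-out loop above trains on every year except one and corrects only the held-out year. The standalone sketch below shows the same control flow using the standard library only. It deliberately swaps in a simple additive mean-offset correction for the empirical quantile mapping used by the package, and the `year -> list of values` layout and the function name are assumptions made for illustration.

```python
from statistics import mean


def leave_one_year_out_mean_correction(ref, hist):
    """Correct each year of `hist` with an offset learned from all other years.

    `ref` and `hist` map year -> list of values (hypothetical layout).
    A mean offset replaces quantile mapping here to keep the sketch small.
    """
    corrected = {}
    for left_out in sorted(hist):
        # Training sets exclude the year that is about to be corrected.
        ref_train = [v for year, vals in ref.items() if year != left_out for v in vals]
        hist_train = [v for year, vals in hist.items() if year != left_out for v in vals]
        offset = mean(ref_train) - mean(hist_train)
        # Apply the offset only to the held-out year.
        corrected[left_out] = [v + offset for v in hist[left_out]]
    return corrected


# Toy data: the model run is uniformly 2 units warmer than the reference.
ref = {2000: [10.0, 12.0], 2001: [11.0, 13.0], 2002: [12.0, 14.0]}
hist = {2000: [12.0, 14.0], 2001: [13.0, 15.0], 2002: [14.0, 16.0]}
corrected = leave_one_year_out_mean_correction(ref, hist)
print(corrected)
```

Because each year is corrected by a model that never saw it, the corrected historical series gives a less optimistic picture of skill than training and adjusting on the same data.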

cavapy-1.1.5/cava_config.py ADDED

@@ -0,0 +1,65 @@
+"""Configuration constants and logging setup for cavapy."""
+import os
+import logging
+import warnings
+# Suppress cartopy download warnings for Natural Earth data
+try:
+    from cartopy.io import DownloadWarning
+    warnings.filterwarnings("ignore", category=DownloadWarning)
+except ImportError:
+    # Fallback to suppressing all UserWarnings from cartopy.io
+    warnings.filterwarnings("ignore", category=UserWarning, module="cartopy.io")
+logger = logging.getLogger("climate")
+logger.handlers = []  # Remove any existing handlers
+handler = logging.StreamHandler()
+formatter = logging.Formatter(
+    "%(asctime)s [%(levelname)s] %(name)s: %(message)s",
+    datefmt="%H:%M:%S",
+)
+handler.setFormatter(formatter)
+for hdlr in logger.handlers[:]:  # remove all old handlers
+    logger.removeHandler(hdlr)
+logger.addHandler(handler)
+logger.setLevel(logging.DEBUG)
+VARIABLES_MAP = {
+    "pr": "tp",
+    "tasmax": "t2mx",
+    "tasmin": "t2mn",
+    "hurs": "hurs",
+    "sfcWind": "sfcwind",
+    "rsds": "ssrd",
+}
+VALID_VARIABLES = list(VARIABLES_MAP)
+VALID_DOMAINS = [
+    "NAM-22",
+    "EUR-22",
+    "AFR-22",
+    "EAS-22",
+    "SEA-22",
+    "WAS-22",
+    "AUS-22",
+    "SAM-22",
+    "CAM-22",
+]
+VALID_RCPS = ["rcp26", "rcp85"]
+VALID_GCM = ["MOHC", "MPI", "NCC"]
+VALID_RCM = ["REMO", "Reg"]
+VALID_DATASETS = ["CORDEX-CORE", "CORDEX-CORE-BC"]
+INVENTORY_DATA_REMOTE_URL = (
+    "https://hub.ipcc.ifca.es/thredds/fileServer/inventories/cava.csv"
+)
+INVENTORY_DATA_LOCAL_PATH = os.path.join(
+    os.path.expanduser("~"), "shared/inventories/cava/inventory.csv"
+)
+ERA5_DATA_REMOTE_URL = (
+    "https://hub.ipcc.ifca.es/thredds/dodsC/fao/observations/ERA5/0.25/ERA5_025.ncml"
+)
+ERA5_DATA_LOCAL_PATH = os.path.join(
+    os.path.expanduser("~"), "shared/data/observations/ERA5/0.25/ERA5_025.ncml"
+)
+DEFAULT_YEARS_OBS = range(1980, 2006)
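
Constants such as `VARIABLES_MAP` and `VALID_VARIABLES` lend themselves to up-front input checking. The sketch below shows one way such a check could look. `map_variables` is a hypothetical helper that does not appear in this diff, and reading the dictionary values as the variable names used in the ERA5 dataset is an assumption.

```python
# Copied from cava_config.py as shown in the diff above.
VARIABLES_MAP = {
    "pr": "tp",
    "tasmax": "t2mx",
    "tasmin": "t2mn",
    "hurs": "hurs",
    "sfcWind": "sfcwind",
    "rsds": "ssrd",
}
VALID_VARIABLES = list(VARIABLES_MAP)


def map_variables(variables):
    """Validate requested names and return their mapped counterparts (hypothetical helper)."""
    unknown = [v for v in variables if v not in VARIABLES_MAP]
    if unknown:
        raise ValueError(
            f"Invalid variables {unknown}; valid options are {VALID_VARIABLES}"
        )
    return [VARIABLES_MAP[v] for v in variables]


mapped = map_variables(["pr", "tasmax"])
print(mapped)
```

Rejecting unknown names before any download starts keeps a typo from surfacing later as a remote-access error.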
