pydartdiags 0.0.3b0__tar.gz → 0.0.41__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



Files changed (24)
  1. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/PKG-INFO +22 -15
  2. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/README.md +11 -8
  3. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/pyproject.toml +6 -6
  4. pydartdiags-0.0.41/setup.cfg +4 -0
  5. pydartdiags-0.0.41/setup.py +26 -0
  6. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/src/pydartdiags/obs_sequence/obs_sequence.py +73 -74
  7. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/src/pydartdiags/plots/plots.py +27 -24
  8. pydartdiags-0.0.41/src/pydartdiags.egg-info/PKG-INFO +399 -0
  9. pydartdiags-0.0.41/src/pydartdiags.egg-info/SOURCES.txt +16 -0
  10. pydartdiags-0.0.41/src/pydartdiags.egg-info/dependency_links.txt +1 -0
  11. pydartdiags-0.0.41/src/pydartdiags.egg-info/requires.txt +4 -0
  12. pydartdiags-0.0.41/src/pydartdiags.egg-info/top_level.txt +1 -0
  13. pydartdiags-0.0.41/tests/test_obs_sequence.py +29 -0
  14. pydartdiags-0.0.41/tests/test_plots.py +52 -0
  15. pydartdiags-0.0.3b0/.gitignore +0 -4
  16. pydartdiags-0.0.3b0/docs/images/bias.png +0 -0
  17. pydartdiags-0.0.3b0/docs/images/rankhist.png +0 -0
  18. pydartdiags-0.0.3b0/docs/images/rmse.png +0 -0
  19. pydartdiags-0.0.3b0/src/pydartdiags/obs_sequence/composite_types.yaml +0 -35
  20. pydartdiags-0.0.3b0/src/pydartdiags/plots/tests/test_rank_histogram.py +0 -18
  21. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/LICENSE +0 -0
  22. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/src/pydartdiags/__init__.py +0 -0
  23. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/src/pydartdiags/obs_sequence/__init__.py +0 -0
  24. {pydartdiags-0.0.3b0 → pydartdiags-0.0.41}/src/pydartdiags/plots/__init__.py +0 -0
--- pydartdiags-0.0.3b0/PKG-INFO
+++ pydartdiags-0.0.41/PKG-INFO
@@ -1,29 +1,36 @@
- Metadata-Version: 2.3
+ Metadata-Version: 2.1
  Name: pydartdiags
- Version: 0.0.3b0
+ Version: 0.0.41
  Summary: Observation Sequence Diagnostics for DART
+ Home-page: https://github.com/NCAR/pyDARTdiags.git
+ Author: Helen Kershaw
+ Author-email: Helen Kershaw <hkershaw@ucar.edu>
  Project-URL: Homepage, https://github.com/NCAR/pyDARTdiags.git
  Project-URL: Issues, https://github.com/NCAR/pyDARTdiags/issues
- Author-email: Helen Kershaw <hkershaw@ucar.edu>
- License-File: LICENSE
+ Project-URL: Documentation, https://ncar.github.io/pyDARTdiags
+ Classifier: Programming Language :: Python :: 3
  Classifier: License :: OSI Approved :: Apache Software License
  Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: Python :: 3
  Requires-Python: >=3.8
- Requires-Dist: numpy>=1.26
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
  Requires-Dist: pandas>=2.2.0
+ Requires-Dist: numpy>=1.26
  Requires-Dist: plotly>=5.22.0
- Description-Content-Type: text/markdown
+ Requires-Dist: pyyaml>=6.0.2

  # pyDARTdiags

- pyDARTdiags is a python library for obsevation space diagnostics for the Data Assimilation Research Testbed ([DART](https://github.com/NCAR/DART)).
+ pyDARTdiags is a Python library for obsevation space diagnostics for the Data Assimilation Research Testbed ([DART](https://github.com/NCAR/DART)).

  pyDARTdiags is under initial development, so please use caution.
  The MATLAB [observation space diagnostics](https://docs.dart.ucar.edu/en/latest/guide/matlab-observation-space.html) are available through [DART](https://github.com/NCAR/DART).


- pyDARTdiags can be installed through pip. We recommend installing pydartdiags in a virtual enviroment:
+ pyDARTdiags can be installed through pip: https://pypi.org/project/pydartdiags/
+ Documenation : https://ncar.github.io/pyDARTdiags/
+
+ We recommend installing pydartdiags in a virtual enviroment:


  ```
@@ -35,14 +42,14 @@ pip install pydartdiags
  ## Example importing the obs\_sequence and plots modules

  ```python
- from pydartdiags.obs_sequence import obs_sequence as obs_seq
+ from pydartdiags.obs_sequence import obs_sequence as obsq
  from pydartdiags.plots import plots
  ```

  ## Examining the dataframe

  ```python
- obs_seq = obs_seq.obs_sequence('obs_seq.final.ascii')
+ obs_seq = obsq.obs_sequence('obs_seq.final.ascii')
  obs_seq.df.head()
  ```

@@ -203,7 +210,7 @@ obs_seq.df.head()
  Find the numeber of assimilated (used) observations vs. possible observations by type

  ```python
- obs_seq.possible_vs_used(obs_seq.df)
+ obsq.possible_vs_used(obs_seq.df)
  ```

  <table border="1" class="dataframe">
@@ -360,7 +367,7 @@ obs_seq.possible_vs_used(obs_seq.df)
  * plot the rank histogram

  ```python
- df_qc0 = obs_seq.select_by_dart_qc(obs_seq.df, 0)
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0)
  plots.plot_rank_histogram(df_qc0)
  ```
  ![Rank Histogram](docs/images/rankhist.png)
@@ -376,7 +383,7 @@ plots.plot_rank_histogram(df_qc0)
  hPalevels = [0.0, 100.0, 150.0, 200.0, 250.0, 300.0, 400.0, 500.0, 700, 850, 925, 1000]# float("inf")] # Pa?
  plevels = [i * 100 for i in hPalevels]

- df_qc0 = obs_seq.select_by_dart_qc(obs_seq.df, 0) # only qc 0
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0) # only qc 0
  df_profile, figrmse, figbias = plots.plot_profile(df_qc0, plevels)
  ```

@@ -389,4 +396,4 @@ Contributions are welcome! If you have a feature request, bug report, or a sugge

  ## License

- DartLabPlot is released under the Apache License 2.0. For more details, see the LICENSE file in the root directory of this source tree or visit [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+ pyDARTdiags is released under the Apache License 2.0. For more details, see the LICENSE file in the root directory of this source tree or visit [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
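The profile-plot snippet in the hunk above converts pressure bin edges from hPa to Pa by multiplying by 100 before passing them to `plot_profile`. A standalone sketch of just that conversion, using the same names as the snippet:

```python
# Pressure bin edges in hPa, as listed in the README snippet above;
# converted to Pa (1 hPa = 100 Pa) because the DataFrame's 'vertical'
# column holds pressure in Pa.
hPalevels = [0.0, 100.0, 150.0, 200.0, 250.0, 300.0, 400.0, 500.0, 700, 850, 925, 1000]
plevels = [i * 100 for i in hPalevels]

print(plevels[1], plevels[-1])  # 100 hPa -> 10000 Pa, 1000 hPa -> 100000 Pa
```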
--- pydartdiags-0.0.3b0/README.md
+++ pydartdiags-0.0.41/README.md
@@ -1,12 +1,15 @@
  # pyDARTdiags

- pyDARTdiags is a python library for obsevation space diagnostics for the Data Assimilation Research Testbed ([DART](https://github.com/NCAR/DART)).
+ pyDARTdiags is a Python library for obsevation space diagnostics for the Data Assimilation Research Testbed ([DART](https://github.com/NCAR/DART)).

  pyDARTdiags is under initial development, so please use caution.
  The MATLAB [observation space diagnostics](https://docs.dart.ucar.edu/en/latest/guide/matlab-observation-space.html) are available through [DART](https://github.com/NCAR/DART).


- pyDARTdiags can be installed through pip. We recommend installing pydartdiags in a virtual enviroment:
+ pyDARTdiags can be installed through pip: https://pypi.org/project/pydartdiags/
+ Documenation : https://ncar.github.io/pyDARTdiags/
+
+ We recommend installing pydartdiags in a virtual enviroment:


  ```
@@ -18,14 +21,14 @@ pip install pydartdiags
  ## Example importing the obs\_sequence and plots modules

  ```python
- from pydartdiags.obs_sequence import obs_sequence as obs_seq
+ from pydartdiags.obs_sequence import obs_sequence as obsq
  from pydartdiags.plots import plots
  ```

  ## Examining the dataframe

  ```python
- obs_seq = obs_seq.obs_sequence('obs_seq.final.ascii')
+ obs_seq = obsq.obs_sequence('obs_seq.final.ascii')
  obs_seq.df.head()
  ```

@@ -186,7 +189,7 @@ obs_seq.df.head()
  Find the numeber of assimilated (used) observations vs. possible observations by type

  ```python
- obs_seq.possible_vs_used(obs_seq.df)
+ obsq.possible_vs_used(obs_seq.df)
  ```

  <table border="1" class="dataframe">
@@ -343,7 +346,7 @@ obs_seq.possible_vs_used(obs_seq.df)
  * plot the rank histogram

  ```python
- df_qc0 = obs_seq.select_by_dart_qc(obs_seq.df, 0)
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0)
  plots.plot_rank_histogram(df_qc0)
  ```
  ![Rank Histogram](docs/images/rankhist.png)
@@ -359,7 +362,7 @@ plots.plot_rank_histogram(df_qc0)
  hPalevels = [0.0, 100.0, 150.0, 200.0, 250.0, 300.0, 400.0, 500.0, 700, 850, 925, 1000]# float("inf")] # Pa?
  plevels = [i * 100 for i in hPalevels]

- df_qc0 = obs_seq.select_by_dart_qc(obs_seq.df, 0) # only qc 0
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0) # only qc 0
  df_profile, figrmse, figbias = plots.plot_profile(df_qc0, plevels)
  ```

@@ -372,4 +375,4 @@ Contributions are welcome! If you have a feature request, bug report, or a sugge

  ## License

- DartLabPlot is released under the Apache License 2.0. For more details, see the LICENSE file in the root directory of this source tree or visit [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+ pyDARTdiags is released under the Apache License 2.0. For more details, see the LICENSE file in the root directory of this source tree or visit [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
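The README diff above demonstrates `possible_vs_used`, the count of assimilated ("used") versus possible observations by type. The underlying grouping can be sketched with plain pandas; the miniature DataFrame below is hypothetical, and only the column names (`type`, `observation`, `DART_quality_control`) come from this release's docstrings:

```python
import pandas as pd

# Hypothetical three-observation DataFrame; a DART qc flag > 0 means
# the observation was not assimilated.
df = pd.DataFrame({
    'type': ['ACARS_TEMPERATURE', 'ACARS_TEMPERATURE', 'ACARS_U_WIND_COMPONENT'],
    'observation': [230.16, 264.16, 18.40],
    'DART_quality_control': [0, 7, 0],
})

# possible: every observation of each type; used: possible minus failed-qc counts.
possible = df.groupby('type')['observation'].count()
failed = df[df['DART_quality_control'] > 0].groupby('type')['observation'].count()
used = possible.subtract(failed, fill_value=0).astype(int)
```

This is only a sketch of the grouping logic, not the library's implementation.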
--- pydartdiags-0.0.3b0/pyproject.toml
+++ pydartdiags-0.0.41/pyproject.toml
@@ -1,10 +1,10 @@
  [build-system]
- requires = ["hatchling"]
- build-backend = "hatchling.build"
+ requires = ["setuptools", "wheel"]
+ build-backend = "setuptools.build_meta"

  [project]
  name = "pydartdiags"
- version = "0.0.3b"
+ version = "0.0.41"
  authors = [
  { name="Helen Kershaw", email="hkershaw@ucar.edu" },
  ]
@@ -19,12 +19,12 @@ classifiers = [
  dependencies = [
  "pandas>=2.2.0",
  "numpy>=1.26",
- "plotly>=5.22.0"
+ "plotly>=5.22.0",
+ "pyyaml>=6.0.2"
  ]

  [project.urls]
  Homepage = "https://github.com/NCAR/pyDARTdiags.git"
  Issues = "https://github.com/NCAR/pyDARTdiags/issues"
+ Documentation = "https://ncar.github.io/pyDARTdiags"

- [tool.hatch.build.targets.wheel]
- packages = ["src/pydartdiags"]
--- /dev/null
+++ pydartdiags-0.0.41/setup.cfg
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
--- /dev/null
+++ pydartdiags-0.0.41/setup.py
@@ -0,0 +1,26 @@
+ from setuptools import setup, find_packages
+
+ setup(
+     name="pydartdiags",
+     version="0.0.41",
+     packages=find_packages(where="src"),
+     package_dir={"": "src"},
+     author="Helen Kershaw",
+     author_email="hkershaw@ucar.edu",
+     description="Observation Sequence Diagnostics for DART",
+     long_description=open("README.md").read(),
+     long_description_content_type="text/markdown",
+     url="https://github.com/NCAR/pyDARTdiags.git",
+     classifiers=[
+         "Programming Language :: Python :: 3",
+         "License :: OSI Approved :: Apache Software License",
+         "Operating System :: OS Independent",
+     ],
+     python_requires=">=3.8",
+     install_requires=[
+         "pandas>=2.2.0",
+         "numpy>=1.26",
+         "plotly>=5.22.0",
+         "pyyaml>=6.0.2"
+     ],
+ )
--- pydartdiags-0.0.3b0/src/pydartdiags/obs_sequence/obs_sequence.py
+++ pydartdiags-0.0.41/src/pydartdiags/obs_sequence/obs_sequence.py
@@ -5,37 +5,38 @@ import os
  import yaml

  class obs_sequence:
- """Create an obs_sequence object from an ascii observation
- sequence file.
-
- Attributes:
-
- df : pandas Dataframe containing all the observations
- all_obs : list of all observations, each observation is a list
- header : header from the ascii file
- vert : dictionary of dart vertical units
- types : dictionary of types in the observation sequence file
- copie_names : names of copies in the observation sequence file.
- Spelled copie to avoid conflict with python built-in copy function.
- Spaces are replaced with underscores in copie_names.
- file : the input observation sequence ascii file
-
- usage:
- Read the observation sequence from file:
- obs_seq = obs_sequence('/home/data/obs_seq.final.ascii.small')
- Access the resulting pandas dataFrame:
- obs_seq.df
-
- For 3D sphere models: latitude and longitude are in degrees in the DataFrame
- sq_err = (mean-obs)**2
- bias = (mean-obs)
-
- rmse = sqrt( sum((mean-obs)**2)/n )
- bias = sum((mean-obs)/n)
- spread = sum(sd)
- totalspread = sqrt(sum(sd+obs_err_var))
-
-
+ """Create an obs_sequence object from an ascii observation sequence file.
+
+ Attributes:
+     df (pandas.DataFrame): DataFrame containing all the observations.
+     all_obs (list): List of all observations, each observation is a list.
+     header (str): Header from the ascii file.
+     vert (dict): Dictionary of dart vertical units.
+     types (dict): Dictionary of types in the observation sequence file.
+     copie_names (list): Names of copies in the observation sequence file.
+         Spelled 'copie' to avoid conflict with the Python built-in copy function.
+         Spaces are replaced with underscores in copie_names.
+
+ Parameters:
+     file : the input observation sequence ascii file
+
+ Example:
+     Read the observation sequence from file:
+     ``obs_seq = obs_sequence('/home/data/obs_seq.final.ascii.small')``
+     Access the resulting pandas DataFrame:
+     ``obs_seq.df``
+
+ For 3D sphere models: latitude and longitude are in degrees in the DataFrame
+
+ Calculations:
+
+ - sq_err = (mean-obs)**2
+ - bias = (mean-obs)
+ - rmse = sqrt( sum((mean-obs)**2)/n )
+ - bias = sum((mean-obs)/n)
+ - spread = sum(sd)
+ - totalspread = sqrt(sum(sd+obs_err_var))
+
  """
  ## static variables
  # vertrical coordinate:
@@ -59,7 +60,8 @@ class obs_sequence:
  'AIRS observation',
  'GTSPP observation',
  'SST observation',
- 'observations']
+ 'observations',
+ 'WOD observation']

  def __init__(self, file):
  self.loc_mod = 'None'
@@ -101,6 +103,7 @@ class obs_sequence:

  def obs_to_list(self, obs):
  """put single observation into a list
+
  discards obs_def
  """
  data = []
@@ -170,7 +173,7 @@ class obs_sequence:
  def write_obs_seq(self, file, df=None):
  """
  Write the observation sequence to a file.
-
+
  This function writes the observation sequence to disk.
  If no DataFrame is provided, it writes the obs_sequence object to a file using the
  header and all observations stored in the object.
@@ -178,19 +181,17 @@ class obs_sequence:
  then writes the DataFrame obs to an obs_sequence file. Note the DataFrame is assumed
  to have been created from obs_sequence object.

-
+
  Parameters:
- file (str): The path to the file where the observation sequence will be written.
- df (pandas.DataFrame, optional): A DataFrame containing the observation data.
- If not provided, the function uses self.header
- and self.all_obs.
-
+ file (str): The path to the file where the observation sequence will be written.
+ df (pandas.DataFrame, optional): A DataFrame containing the observation data. If not provided, the function uses self.header and self.all_obs.
+
  Returns:
- None
-
- Usage:
- obs_seq.write_obs_seq('/path/to/output/file')
- obs_seq.write_obs_seq('/path/to/output/file', df=obs_seq.df)
+ None
+
+ Examples:
+ ``obs_seq.write_obs_seq('/path/to/output/file')``
+ ``obs_seq.write_obs_seq('/path/to/output/file', df=obs_seq.df)``
  """
  with open(file, 'w') as f:

@@ -281,14 +282,13 @@ class obs_sequence:
  """
  Extracts the names of the copies from the header of an obs_seq file.

-
  Parameters:
- header (list): A list of strings representing the lines in the header of the obs_seq file.
+ header (list): A list of strings representing the lines in the header of the obs_seq file.

  Returns:
- tuple: A tuple containing two elements:
- - copie_names (list): A list of strings representing the copy names with _ for spaces.
- - len(copie_names) (int): The number of copy names.
+ tuple: A tuple containing two elements:
+ - copie_names (list): A list of strings representing the copy names with underscores for spaces.
+ - len(copie_names) (int): The number of copy names.
  """
  for i, line in enumerate(header):
  if "num_obs:" in line and "max_num_obs:" in line:
@@ -348,15 +348,13 @@ class obs_sequence:
  components and adds them to the DataFrame.

  Parameters:
- composite_types (str, optional): The YAML configuration for composite types.
- If 'use_default', the default configuration is used.
- Otherwise, a custom YAML configuration can be provided.
+ composite_types (str, optional): The YAML configuration for composite types. If 'use_default', the default configuration is used. Otherwise, a custom YAML configuration can be provided.

  Returns:
- pd.DataFrame: The updated DataFrame with the new composite rows added.
+ pd.DataFrame: The updated DataFrame with the new composite rows added.

  Raises:
- Exception: If there are repeat values in the components.
+ Exception: If there are repeat values in the components.
  """

  if composite_types == 'use_default':
@@ -386,10 +384,10 @@ def load_yaml_to_dict(file_path):
  Load a YAML file and convert it to a dictionary.

  Parameters:
- - file_path (str): The path to the YAML file.
+ file_path (str): The path to the YAML file.

  Returns:
- - dict: The YAML file content as a dictionary.
+ dict: The YAML file content as a dictionary.
  """
  try:
  with open(file_path, 'r') as file:
@@ -402,8 +400,9 @@ def load_yaml_to_dict(file_path):
  def convert_dart_time(seconds, days):
  """covert from seconds, days after 1601 to datetime object

- base year for Gregorian calendar is 1601
- dart time is seconds, days since 1601
+ Note:
+ - base year for Gregorian calendar is 1601
+ - dart time is seconds, days since 1601
  """
  time = dt.datetime(1601,1,1) + dt.timedelta(days=days, seconds=seconds)
  return time
@@ -413,14 +412,14 @@ def select_by_dart_qc(df, dart_qc):
  Selects rows from a DataFrame based on the DART quality control flag.

  Parameters:
- df (DataFrame): A pandas DataFrame.
- dart_qc (int): The DART quality control flag to select.
+ df (DataFrame): A pandas DataFrame.
+ dart_qc (int): The DART quality control flag to select.

  Returns:
- DataFrame: A DataFrame containing only the rows with the specified DART quality control flag.
+ DataFrame: A DataFrame containing only the rows with the specified DART quality control flag.

  Raises:
- ValueError: If the DART quality control flag is not present in the DataFrame.
+ ValueError: If the DART quality control flag is not present in the DataFrame.
  """
  if dart_qc not in df['DART_quality_control'].unique():
  raise ValueError(f"DART quality control flag '{dart_qc}' not found in DataFrame.")
@@ -432,10 +431,10 @@ def select_failed_qcs(df):
  Selects rows from a DataFrame where the DART quality control flag is greater than 0.

  Parameters:
- df (DataFrame): A pandas DataFrame.
+ df (DataFrame): A pandas DataFrame.

  Returns:
- DataFrame: A DataFrame containing only the rows with a DART quality control flag greater than 0.
+ DataFrame: A DataFrame containing only the rows with a DART quality control flag greater than 0.
  """
  return df[df['DART_quality_control'] > 0]

@@ -450,14 +449,14 @@ def possible_vs_used(df):
  used observations.

  Parameters:
- - df (pd.DataFrame): A DataFrame with at least two columns: 'type' for the observation type and 'observation'
- for the observation data. It may also contain other columns required by the `select_failed_qcs` function
- to determine failed quality control checks.
+ df (pd.DataFrame): A DataFrame with at least two columns: 'type' for the observation type and 'observation'
+ for the observation data. It may also contain other columns required by the `select_failed_qcs` function
+ to determine failed quality control checks.

  Returns:
- - pd.DataFrame: A DataFrame with three columns: 'type', 'possible', and 'used'. 'type' is the observation type,
- 'possible' is the count of all observations of that type, and 'used' is the count of observations of that type
- that passed quality control checks.
+ pd.DataFrame: A DataFrame with three columns: 'type', 'possible', and 'used'. 'type' is the observation type,
+ 'possible' is the count of all observations of that type, and 'used' is the count of observations of that type
+ that passed quality control checks.

  """
  possible = df.groupby('type')['observation'].count()
@@ -476,12 +475,12 @@ def construct_composit(df_comp, composite, components):
  specified columns using the square root of the sum of squares method.

  Parameters:
- df_comp (pd.DataFrame): The DataFrame containing the component rows to be combined.
- composite (str): The type name for the new composite rows.
- components (list of str): A list containing the type names of the two components to be combined.
+ df_comp (pd.DataFrame): The DataFrame containing the component rows to be combined.
+ composite (str): The type name for the new composite rows.
+ components (list of str): A list containing the type names of the two components to be combined.

  Returns:
- merged_df (pd.DataFrame): The updated DataFrame with the new composite rows added.
+ merged_df (pd.DataFrame): The updated DataFrame with the new composite rows added.
  """
  selected_rows = df_comp[df_comp['type'] == components[0].upper()]
  selected_rows_v = df_comp[df_comp['type'] == components[1].upper()]
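The `convert_dart_time` hunk in this file only reflows the docstring; the function body itself is unchanged context. Since it is only two lines, it can be checked by hand. A self-contained sketch of the same logic, with the body copied from the diff's context lines:

```python
import datetime as dt

def convert_dart_time(seconds, days):
    # DART stores time as a (seconds, days) offset from the Gregorian
    # base date 1601-01-01, per the docstring in this release.
    return dt.datetime(1601, 1, 1) + dt.timedelta(days=days, seconds=seconds)

# The (seconds=75603, days=153005) pair shown in this release's sample
# DataFrame corresponds to 2019-12-01 21:00:03.
t = convert_dart_time(75603, 153005)
print(t)
```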
--- pydartdiags-0.0.3b0/src/pydartdiags/plots/plots.py
+++ pydartdiags-0.0.41/src/pydartdiags/plots/plots.py
@@ -6,6 +6,7 @@ import pandas as pd
  def plot_rank_histogram(df):
  """
  Plots a rank histogram colored by observation type.
+
  All histogram bars are initalized to be hidden and can be toggled visible in the plot's legend
  """
  _, _, df_hist = calculate_rank(df)
@@ -27,12 +28,12 @@ def calculate_rank(df):
  size plus one.

  Parameters:
- - df (pd.DataFrame): A DataFrame with columns for mean, standard deviation, observed values,
- ensemble size, and observation type. The DataFrame should have one row per observation.
+ df (pd.DataFrame): A DataFrame with columns for mean, standard deviation, observed values,
+ ensemble size, and observation type. The DataFrame should have one row per observation.

  Returns:
- - tuple: A tuple containing the rank array, ensemble size, and a result DataFrame. The result
- DataFrame contains columns for 'rank' and 'obstype'.
+ tuple: A tuple containing the rank array, ensemble size, and a result DataFrame. The result
+ DataFrame contains columns for 'rank' and 'obstype'.
  """
  ensemble_values = df.filter(regex='prior_ensemble_member').to_numpy().copy()
  std_dev = np.sqrt(df['obs_err_var']).to_numpy()
@@ -72,24 +73,24 @@ def plot_profile(df, levels):
  the vertical profile in the atmosphere correctly.

  Parameters:
- - df (pd.DataFrame): The input DataFrame containing at least the 'vertical' column for pressure levels,
- and other columns required by the `rmse_bias` function for calculating RMSE and Bias.
- - levels (array-like): The bin edges for categorizing the 'vertical' column values into pressure levels.
+ df (pd.DataFrame): The input DataFrame containing at least the 'vertical' column for pressure levels,
+ and other columns required by the `rmse_bias` function for calculating RMSE and Bias.
+ levels (array-like): The bin edges for categorizing the 'vertical' column values into pressure levels.

  Returns:
- - tuple: A tuple containing the DataFrame with RMSE and Bias calculations, the RMSE plot figure, and the
- Bias plot figure. The DataFrame includes a 'plevels' column representing the categorized pressure levels
- and 'hPa' column representing the midpoint of each pressure level bin.
+ tuple: A tuple containing the DataFrame with RMSE and Bias calculations, the RMSE plot figure, and the
+ Bias plot figure. The DataFrame includes a 'plevels' column representing the categorized pressure levels
+ and 'hPa' column representing the midpoint of each pressure level bin.

  Raises:
- - ValueError: If there are missing values in the 'vertical' column of the input DataFrame.
+ ValueError: If there are missing values in the 'vertical' column of the input DataFrame.

  Note:
- - The function modifies the input DataFrame by adding 'plevels' and 'hPa' columns.
- - The 'hPa' values are calculated as half the midpoint of each pressure level bin, which may need
- adjustment based on the specific requirements for pressure level representation.
- - The plots are generated using Plotly Express and are displayed inline. The y-axis of the plots is
- reversed to align with standard atmospheric pressure level representation.
+ - The function modifies the input DataFrame by adding 'plevels' and 'hPa' columns.
+ - The 'hPa' values are calculated as half the midpoint of each pressure level bin, which may need
+ adjustment based on the specific requirements for pressure level representation.
+ - The plots are generated using Plotly Express and are displayed inline. The y-axis of the plots is
+ reversed to align with standard atmospheric pressure level representation.
  """

  pd.options.mode.copy_on_write = True
@@ -116,15 +117,17 @@ def mean_then_sqrt(x):
  Calculates the mean of an array-like object and then takes the square root of the result.

  Parameters:
- arr (array-like): An array-like object (such as a list or a pandas Series).
- The elements should be numeric.
+ arr (array-like): An array-like object (such as a list or a pandas Series).
+ The elements should be numeric.

  Returns:
- float: The square root of the mean of the input array.
+ float: The square root of the mean of the input array.

  Raises:
- TypeError: If the input is not an array-like object containing numeric values.
+ TypeError: If the input is not an array-like object containing numeric values.
+ ValueError: If the input array is empty.
  """
+
  return np.sqrt(np.mean(x))

  def rmse_bias(df):
@@ -139,14 +142,14 @@ def rmse_bias_by_obs_type(df, obs_type):
  Calculate the RMSE and bias for a given observation type.

  Parameters:
- df (DataFrame): A pandas DataFrame.
- obs_type (str): The observation type for which to calculate the RMSE and bias.
+ df (DataFrame): A pandas DataFrame.
+ obs_type (str): The observation type for which to calculate the RMSE and bias.

  Returns:
- DataFrame: A DataFrame containing the RMSE and bias for the given observation type.
+ DataFrame: A DataFrame containing the RMSE and bias for the given observation type.

  Raises:
- ValueError: If the observation type is not present in the DataFrame.
+ ValueError: If the observation type is not present in the DataFrame.
  """
  if obs_type not in df['type'].unique():
  raise ValueError(f"Observation type '{obs_type}' not found in DataFrame.")
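`mean_then_sqrt`, whose docstring is reflowed in the hunks above, is the order-of-operations helper behind the RMSE calculation: mean of the squared errors first, square root second. A minimal sketch using the same one-line body shown in the diff's context lines:

```python
import numpy as np

def mean_then_sqrt(x):
    # Mean first, then square root: sqrt(mean(x)).
    # When x holds squared errors, this is exactly the RMSE.
    return float(np.sqrt(np.mean(x)))

# Squared errors [1, 4, 4] have mean 3, so the RMSE is sqrt(3) ~ 1.732.
rmse = mean_then_sqrt([1.0, 4.0, 4.0])
```

The ordering matters: taking the square root element-wise before averaging would give the mean absolute error of the underlying values, not the RMSE.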
@@ -0,0 +1,399 @@
1
+ Metadata-Version: 2.1
2
+ Name: pydartdiags
3
+ Version: 0.0.41
4
+ Summary: Observation Sequence Diagnostics for DART
5
+ Home-page: https://github.com/NCAR/pyDARTdiags.git
6
+ Author: Helen Kershaw
7
+ Author-email: Helen Kershaw <hkershaw@ucar.edu>
8
+ Project-URL: Homepage, https://github.com/NCAR/pyDARTdiags.git
9
+ Project-URL: Issues, https://github.com/NCAR/pyDARTdiags/issues
10
+ Project-URL: Documentation, https://ncar.github.io/pyDARTdiags
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: License :: OSI Approved :: Apache Software License
13
+ Classifier: Operating System :: OS Independent
14
+ Requires-Python: >=3.8
15
+ Description-Content-Type: text/markdown
16
+ License-File: LICENSE
17
+ Requires-Dist: pandas>=2.2.0
18
+ Requires-Dist: numpy>=1.26
19
+ Requires-Dist: plotly>=5.22.0
20
+ Requires-Dist: pyyaml>=6.0.2
21
+
22
+ # pyDARTdiags
23
+
24
+ pyDARTdiags is a Python library for obsevation space diagnostics for the Data Assimilation Research Testbed ([DART](https://github.com/NCAR/DART)).
25
+
26
+ pyDARTdiags is under initial development, so please use caution.
27
+ The MATLAB [observation space diagnostics](https://docs.dart.ucar.edu/en/latest/guide/matlab-observation-space.html) are available through [DART](https://github.com/NCAR/DART).
28
+
29
+
30
+ pyDARTdiags can be installed through pip: https://pypi.org/project/pydartdiags/
31
+ Documenation : https://ncar.github.io/pyDARTdiags/
32
+
33
+ We recommend installing pydartdiags in a virtual enviroment:
34
+
35
+
36
+ ```
37
+ python3 -m venv dartdiags
38
+ source dartdiags/bin/activate
39
+ pip install pydartdiags
40
+ ```
41
+
42
+ ## Example importing the obs\_sequence and plots modules
43
+
44
+ ```python
45
+ from pydartdiags.obs_sequence import obs_sequence as obsq
46
+ from pydartdiags.plots import plots
47
+ ```
48
+
49
+ ## Examining the dataframe
50
+
51
+ ```python
52
+ obs_seq = obsq.obs_sequence('obs_seq.final.ascii')
53
+ obs_seq.df.head()
54
+ ```
55
+
56
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>obs_num</th>
+ <th>observation</th>
+ <th>prior_ensemble_mean</th>
+ <th>prior_ensemble_spread</th>
+ <th>prior_ensemble_member_1</th>
+ <th>prior_ensemble_member_2</th>
+ <th>prior_ensemble_member_3</th>
+ <th>prior_ensemble_member_4</th>
+ <th>prior_ensemble_member_5</th>
+ <th>prior_ensemble_member_6</th>
+ <th>...</th>
+ <th>latitude</th>
+ <th>vertical</th>
+ <th>vert_unit</th>
+ <th>type</th>
+ <th>seconds</th>
+ <th>days</th>
+ <th>time</th>
+ <th>obs_err_var</th>
+ <th>bias</th>
+ <th>sq_err</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>0</th>
+ <td>1</td>
+ <td>230.16</td>
+ <td>231.310652</td>
+ <td>0.405191</td>
+ <td>231.304725</td>
+ <td>231.562874</td>
+ <td>231.333915</td>
+ <td>231.297690</td>
+ <td>232.081416</td>
+ <td>231.051063</td>
+ <td>...</td>
+ <td>0.012188</td>
+ <td>23950.0</td>
+ <td>pressure (Pa)</td>
+ <td>ACARS_TEMPERATURE</td>
+ <td>75603</td>
+ <td>153005</td>
+ <td>2019-12-01 21:00:03</td>
+ <td>1.00</td>
+ <td>1.150652</td>
+ <td>1.324001</td>
+ </tr>
+ <tr>
+ <th>1</th>
+ <td>2</td>
+ <td>18.40</td>
+ <td>15.720527</td>
+ <td>0.630827</td>
+ <td>14.217207</td>
+ <td>15.558196</td>
+ <td>15.805599</td>
+ <td>16.594644</td>
+ <td>14.877743</td>
+ <td>16.334438</td>
+ <td>...</td>
+ <td>0.012188</td>
+ <td>23950.0</td>
+ <td>pressure (Pa)</td>
+ <td>ACARS_U_WIND_COMPONENT</td>
+ <td>75603</td>
+ <td>153005</td>
+ <td>2019-12-01 21:00:03</td>
+ <td>6.25</td>
+ <td>-2.679473</td>
+ <td>7.179578</td>
+ </tr>
+ <tr>
+ <th>2</th>
+ <td>3</td>
+ <td>1.60</td>
+ <td>-4.932073</td>
+ <td>0.825899</td>
+ <td>-5.270562</td>
+ <td>-5.955998</td>
+ <td>-4.209766</td>
+ <td>-5.105016</td>
+ <td>-4.669405</td>
+ <td>-4.365305</td>
+ <td>...</td>
+ <td>0.012188</td>
+ <td>23950.0</td>
+ <td>pressure (Pa)</td>
+ <td>ACARS_V_WIND_COMPONENT</td>
+ <td>75603</td>
+ <td>153005</td>
+ <td>2019-12-01 21:00:03</td>
+ <td>6.25</td>
+ <td>-6.532073</td>
+ <td>42.667980</td>
+ </tr>
+ <tr>
+ <th>3</th>
+ <td>4</td>
+ <td>264.16</td>
+ <td>264.060532</td>
+ <td>0.035584</td>
+ <td>264.107192</td>
+ <td>264.097270</td>
+ <td>264.073212</td>
+ <td>264.047718</td>
+ <td>264.074140</td>
+ <td>264.019895</td>
+ <td>...</td>
+ <td>0.010389</td>
+ <td>56260.0</td>
+ <td>pressure (Pa)</td>
+ <td>ACARS_TEMPERATURE</td>
+ <td>75603</td>
+ <td>153005</td>
+ <td>2019-12-01 21:00:03</td>
+ <td>1.00</td>
+ <td>-0.099468</td>
+ <td>0.009894</td>
+ </tr>
+ <tr>
+ <th>4</th>
+ <td>5</td>
+ <td>11.60</td>
+ <td>10.134115</td>
+ <td>0.063183</td>
+ <td>10.067956</td>
+ <td>10.078798</td>
+ <td>10.120263</td>
+ <td>10.084885</td>
+ <td>10.135112</td>
+ <td>10.140610</td>
+ <td>...</td>
+ <td>0.010389</td>
+ <td>56260.0</td>
+ <td>pressure (Pa)</td>
+ <td>ACARS_U_WIND_COMPONENT</td>
+ <td>75603</td>
+ <td>153005</td>
+ <td>2019-12-01 21:00:03</td>
+ <td>6.25</td>
+ <td>-1.465885</td>
+ <td>2.148818</td>
+ </tr>
+ </tbody>
+ </table>
+ <p>5 rows × 97 columns</p>
+ </div>
208
+
+
+ Find the number of assimilated (used) observations vs. possible observations by type:
+
+ ```python
+ obsq.possible_vs_used(obs_seq.df)
+ ```
+
216
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>type</th>
+ <th>possible</th>
+ <th>used</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>0</th>
+ <td>ACARS_TEMPERATURE</td>
+ <td>175429</td>
+ <td>128040</td>
+ </tr>
+ <tr>
+ <th>1</th>
+ <td>ACARS_U_WIND_COMPONENT</td>
+ <td>176120</td>
+ <td>126946</td>
+ </tr>
+ <tr>
+ <th>2</th>
+ <td>ACARS_V_WIND_COMPONENT</td>
+ <td>176120</td>
+ <td>127834</td>
+ </tr>
+ <tr>
+ <th>3</th>
+ <td>AIRCRAFT_TEMPERATURE</td>
+ <td>21335</td>
+ <td>13663</td>
+ </tr>
+ <tr>
+ <th>4</th>
+ <td>AIRCRAFT_U_WIND_COMPONENT</td>
+ <td>21044</td>
+ <td>13694</td>
+ </tr>
+ <tr>
+ <th>5</th>
+ <td>AIRCRAFT_V_WIND_COMPONENT</td>
+ <td>21044</td>
+ <td>13642</td>
+ </tr>
+ <tr>
+ <th>6</th>
+ <td>AIRS_SPECIFIC_HUMIDITY</td>
+ <td>6781</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>7</th>
+ <td>AIRS_TEMPERATURE</td>
+ <td>19583</td>
+ <td>7901</td>
+ </tr>
+ <tr>
+ <th>8</th>
+ <td>GPSRO_REFRACTIVITY</td>
+ <td>81404</td>
+ <td>54626</td>
+ </tr>
+ <tr>
+ <th>9</th>
+ <td>LAND_SFC_ALTIMETER</td>
+ <td>21922</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>10</th>
+ <td>MARINE_SFC_ALTIMETER</td>
+ <td>9987</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>11</th>
+ <td>MARINE_SFC_SPECIFIC_HUMIDITY</td>
+ <td>4196</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>12</th>
+ <td>MARINE_SFC_TEMPERATURE</td>
+ <td>8646</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>13</th>
+ <td>MARINE_SFC_U_WIND_COMPONENT</td>
+ <td>8207</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>14</th>
+ <td>MARINE_SFC_V_WIND_COMPONENT</td>
+ <td>8207</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>15</th>
+ <td>RADIOSONDE_SPECIFIC_HUMIDITY</td>
+ <td>14272</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>16</th>
+ <td>RADIOSONDE_SURFACE_ALTIMETER</td>
+ <td>601</td>
+ <td>0</td>
+ </tr>
+ <tr>
+ <th>17</th>
+ <td>RADIOSONDE_TEMPERATURE</td>
+ <td>29275</td>
+ <td>22228</td>
+ </tr>
+ <tr>
+ <th>18</th>
+ <td>RADIOSONDE_U_WIND_COMPONENT</td>
+ <td>36214</td>
+ <td>27832</td>
+ </tr>
+ <tr>
+ <th>19</th>
+ <td>RADIOSONDE_V_WIND_COMPONENT</td>
+ <td>36214</td>
+ <td>27975</td>
+ </tr>
+ <tr>
+ <th>20</th>
+ <td>SAT_U_WIND_COMPONENT</td>
+ <td>107212</td>
+ <td>82507</td>
+ </tr>
+ <tr>
+ <th>21</th>
+ <td>SAT_V_WIND_COMPONENT</td>
+ <td>107212</td>
+ <td>82647</td>
+ </tr>
+ </tbody>
+ </table>
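The possible-vs-used counts above can be reproduced with a plain pandas `groupby`. This is a minimal sketch, not the package's actual implementation; the QC column name `DART_quality_control` and the convention that QC == 0 means "assimilated" are assumptions for illustration:

```python
import pandas as pd

def possible_vs_used_sketch(df):
    # "possible" counts every observation of each type; "used" counts the
    # subset that passed quality control (assumed here to be QC == 0).
    possible = df.groupby('type').size().rename('possible')
    used = df[df['DART_quality_control'] == 0].groupby('type').size().rename('used')
    return (pd.concat([possible, used], axis=1)
              .fillna(0).astype(int)
              .reset_index()
              .sort_values('type', ignore_index=True))

# Tiny illustrative dataframe (not real data)
df = pd.DataFrame({
    'type': ['ACARS_TEMPERATURE'] * 3 + ['GPSRO_REFRACTIVITY'] * 2,
    'DART_quality_control': [0, 0, 7, 0, 4],
})
print(possible_vs_used_sketch(df))
```

`fillna(0)` covers types where nothing was assimilated, matching the zero "used" rows in the table above.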
360
+
+
+ ## Example plotting
+
+ ### Rank histogram
+
+ * Select only observations that were assimilated (DART QC == 0).
+ * Plot the rank histogram.
+
+ ```python
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0)
+ plots.plot_rank_histogram(df_qc0)
+ ```
+ ![Rank Histogram](docs/images/rankhist.png)
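The rank itself is cheap to compute: perturb each ensemble member with a draw from the observation-error distribution, then count how many perturbed members fall below the observed value. A minimal sketch under those assumptions (not the `plots.plot_rank_histogram` internals; column names follow the dataframe shown earlier):

```python
import numpy as np
import pandas as pd

def compute_ranks(df, ens_cols, seed=0):
    rng = np.random.default_rng(seed)
    members = df[ens_cols].to_numpy()            # shape (n_obs, n_ens)
    # Add observation-error noise to each member before ranking.
    noise = rng.standard_normal(members.shape) * np.sqrt(df['obs_err_var'].to_numpy())[:, None]
    obs = df['observation'].to_numpy()[:, None]
    # Rank r means the observation exceeds r-1 perturbed members (range 1..n_ens+1).
    return (members + noise < obs).sum(axis=1) + 1

# Two-observation, two-member toy example
df = pd.DataFrame({
    'observation': [2.5, 3.0],
    'obs_err_var': [0.1, 0.2],
    'prior_ensemble_member_1': [2.3, 3.1],
    'prior_ensemble_member_2': [2.4, 2.9],
})
ranks = compute_ranks(df, ['prior_ensemble_member_1', 'prior_ensemble_member_2'])
```

A flat histogram of these ranks indicates a statistically consistent ensemble; U- or dome-shaped histograms indicate under- or over-dispersion.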
374
+
+
+ ### Plot profile of RMSE and bias
+
+ * Choose the pressure levels.
+ * Select only observations that were assimilated (DART QC == 0).
+ * Plot the profiles.
+
+ ```python
+ hPalevels = [0.0, 100.0, 150.0, 200.0, 250.0, 300.0, 400.0, 500.0, 700.0, 850.0, 925.0, 1000.0]  # levels in hPa
+ plevels = [i * 100 for i in hPalevels]  # convert hPa to Pa
+
+ df_qc0 = obsq.select_by_dart_qc(obs_seq.df, 0)  # keep only DART QC == 0
+ df_profile, figrmse, figbias = plots.plot_profile(df_qc0, plevels)
+ ```
+
+ ![RMSE Plot](docs/images/rmse.png)
+
+ ![Bias Plot](docs/images/bias.png)
+
+ ## Contributing
+ Contributions are welcome! If you have a feature request, bug report, or suggestion, please open an issue on our GitHub repository.
+
+ ## License
+
+ pyDARTdiags is released under the Apache License 2.0. For more details, see the LICENSE file in the root directory of this source tree or visit [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).
@@ -0,0 +1,16 @@
+ LICENSE
+ README.md
+ pyproject.toml
+ setup.py
+ src/pydartdiags/__init__.py
+ src/pydartdiags.egg-info/PKG-INFO
+ src/pydartdiags.egg-info/SOURCES.txt
+ src/pydartdiags.egg-info/dependency_links.txt
+ src/pydartdiags.egg-info/requires.txt
+ src/pydartdiags.egg-info/top_level.txt
+ src/pydartdiags/obs_sequence/__init__.py
+ src/pydartdiags/obs_sequence/obs_sequence.py
+ src/pydartdiags/plots/__init__.py
+ src/pydartdiags/plots/plots.py
+ tests/test_obs_sequence.py
+ tests/test_plots.py
@@ -0,0 +1,4 @@
+ pandas>=2.2.0
+ numpy>=1.26
+ plotly>=5.22.0
+ pyyaml>=6.0.2
@@ -0,0 +1 @@
+ pydartdiags
@@ -0,0 +1,29 @@
+ import datetime as dt
+
+ import pytest
+
+ from pydartdiags.obs_sequence import obs_sequence as obs_seq
+
+
+ def test_convert_dart_time():
+     # Test case 1: 0 seconds and 0 days gives the DART epoch
+     result = obs_seq.convert_dart_time(0, 0)
+     expected = dt.datetime(1601, 1, 1)
+     assert result == expected
+
+     # Test case 2: 86400 seconds (1 day) and 0 days
+     result = obs_seq.convert_dart_time(86400, 0)
+     expected = dt.datetime(1601, 1, 2)
+     assert result == expected
+
+     # Test case 3: 0 seconds and 1 day
+     result = obs_seq.convert_dart_time(0, 1)
+     expected = dt.datetime(1601, 1, 2)
+     assert result == expected
+
+     # Test case 4: 2164 seconds and 151240 days
+     result = obs_seq.convert_dart_time(2164, 151240)
+     expected = dt.datetime(2015, 1, 31, 0, 36, 4)
+     assert result == expected
+
+
+ if __name__ == '__main__':
+     pytest.main()
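These test cases pin down the convention: DART stores time as (seconds, days) counted from a Gregorian-calendar origin of 1601-01-01. A re-implementation consistent with the cases above (a sketch, not the package source):

```python
import datetime as dt

DART_EPOCH = dt.datetime(1601, 1, 1)  # time origin implied by the tests

def convert_dart_time(seconds, days):
    # DART times are (seconds, days) offsets from the 1601-01-01 epoch.
    return DART_EPOCH + dt.timedelta(days=days, seconds=seconds)

print(convert_dart_time(2164, 151240))  # 2015-01-31 00:36:04
```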
@@ -0,0 +1,52 @@
+ import pandas as pd
+ import numpy as np
+ import pytest
+ from pydartdiags.plots import plots as plts
+
+ def test_calculate_rank():
+     # Example DataFrame setup
+     data = {
+         'observation': [2.5, 3.0, 4.5],  # Actual observation values
+         'obs_err_var': [0.1, 0.2, 0.3],  # Variance of the observation error
+         'prior_ensemble_member1': [2.3, 3.1, 4.6],
+         'prior_ensemble_member2': [2.4, 2.9, 4.4],
+         'prior_ensemble_member3': [2.5, 3.2, 4.5],
+         'type': ['A', 'B', 'A']  # Observation type
+     }
+     df = pd.DataFrame(data)
+
+     # Call the function
+     rank, ens_size, df_hist = plts.calculate_rank(df)
+
+     # Assertions to check that the function works as expected
+     assert ens_size == 3  # 3 ensemble members
+     assert 'rank' in df_hist.columns
+     assert 'obstype' in df_hist.columns
+
+ def test_mean_then_sqrt():
+     # Test with a list
+     data = [1, -4, 9.1, 16]
+     result = plts.mean_then_sqrt(data)
+     expected = np.sqrt(np.mean(data))
+     assert result == expected, f"Expected {expected}, but got {result}"
+
+     # Test with a numpy array
+     data = np.array([1, -4, 9.1, 16])
+     result = plts.mean_then_sqrt(data)
+     expected = np.sqrt(np.mean(data))
+     assert result == expected, f"Expected {expected}, but got {result}"
+
+     # Test with a pandas Series (the mean is positive, so the sqrt is defined)
+     data = pd.Series([1, -4, 9.1, 16])
+     result = plts.mean_then_sqrt(data)
+     expected = np.sqrt(np.mean(data))
+     assert result == expected, f"Expected {expected}, but got {result}"
+
+     # Test with non-numeric values
+     data = ['a', 'b', 'c']
+     with pytest.raises(TypeError):
+         plts.mean_then_sqrt(data)
+
+ # Run the tests directly from this script
+ if __name__ == "__main__":
+     test_calculate_rank()
+     test_mean_then_sqrt()
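The `mean_then_sqrt` tests above describe the RMSE building block: average first, then take the square root. A sketch consistent with those tests (an assumption about the implementation, not the package source; note that NumPy's reduction raises `TypeError` for non-numeric input, which is what the `pytest.raises` check relies on):

```python
import numpy as np

def mean_then_sqrt(x):
    # Average first, then square-root: sqrt(mean(x)).
    # np.mean raises TypeError on non-numeric arrays.
    return np.sqrt(np.mean(x))

print(mean_then_sqrt([1, -4, 9.1, 16]))
```

Applied to a column of squared errors (`sq_err` in the dataframe above), this yields the RMSE.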
@@ -1,4 +0,0 @@
- # Python files
- dist
- .ipynb_checkpoints
- __pycache__/
Binary file
Binary file
@@ -1,35 +0,0 @@
- acars_horizontal_wind:
-   description: ACARS-derived horizontal wind speed
-   components:
-     - acars_u_wind_component
-     - acars_v_wind_component
-
- sat_horizontal_wind:
-   description: Satellite-derived horizontal wind speed
-   components:
-     - sat_u_wind_component
-     - sat_v_wind_component
-
- radiosonde_horizontal_wind:
-   description: Radiosonde-derived horizontal wind speed
-   components:
-     - radiosonde_u_wind_component
-     - radiosonde_v_wind_component
-
- aircraft_horizontal_wind:
-   description: Aircraft-derived horizontal wind speed
-   components:
-     - aircraft_u_wind_component
-     - aircraft_v_wind_component
-
- 10_m_horizontal_wind:
-   description: 10 meter horizontal wind speed
-   components:
-     - 10m_u_wind_component
-     - 10m_v_wind_component
-
- marine_sfc_horizontal_wind:
-   description: Marine surface horizontal wind speed
-   components:
-     - marine_sfc_u_wind_component
-     - marine_sfc_v_wind_component
@@ -1,18 +0,0 @@
- import pandas as pd
- import numpy as np
-
- # Example DataFrame setup
- data = {
-     'observation': [2.5, 3.0, 4.5],  # Actual observation values
-     'obs_err_var': [0.1, 0.2, 0.3],  # Variance of the observation error
-     'prior_ensemble_member1': [2.3, 3.1, 4.6],
-     'prior_ensemble_member2': [2.4, 2.9, 4.4],
-     'prior_ensemble_member3': [2.5, 3.2, 4.5]
- }
- df = pd.DataFrame(data)
-
- # Call the function
- rank, ens_size, _ = calculate_rank(df)
-
- print("Rank:", rank)
- print("Ensemble Size:", ens_size)
File without changes