pyfortracc 1.0.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (68)

pyfortracc-1.0.0/LICENSE ADDED

@@ -0,0 +1 @@
+DISCLAIMER: All materials contained in this computer program was developed by COPDT/CGIP/INPE (Brazil). You may download, display, print and reproduce this material for your personal, non-commercial use or use within your organisation. All other rights are reserved.

pyfortracc-1.0.0/MANIFEST.in ADDED

@@ -0,0 +1,8 @@
+# C/cython source files
+include pyfortracc/*
+include README.md
+include LICENSE
+include tests/*.*
+include requirements.txt
+recursive-include pyfortracc *.c
+include pyproject.toml

pyfortracc-1.0.0/PKG-INFO ADDED

@@ -0,0 +1,147 @@
+Metadata-Version: 2.1
+Name: pyfortracc
+Version: 1.0.0
+Summary: A Python package for tracking and forecasting configurable clusters.
+Home-page: https://github.com/fortracc/pyfortracc
+Author: Helvecio B. L. Neto, Alan J. P. Calheiros
+Author-email: fortracc.project@inpe.br
+License: LICENSE
+Classifier: Programming Language :: Python
+Classifier: Development Status :: 5 - Production/Stable
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Topic :: Scientific/Engineering :: Hydrology
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: rasterio
+Requires-Dist: geopandas
+Requires-Dist: opencv-python
+Requires-Dist: opencv_contrib_python
+Requires-Dist: xarray
+Requires-Dist: scipy
+Requires-Dist: scikit-learn
+Requires-Dist: pyarrow
+Requires-Dist: duckdb
+Requires-Dist: netCDF4
+Requires-Dist: cartopy
+Requires-Dist: shapelysmooth
+Requires-Dist: tqdm
+Requires-Dist: ipython
+Requires-Dist: ipykernel
+Requires-Dist: psutil
+# pyForTraCC - Python Library for Tracking and Forecasting Clusters
+> **Note**: `pyForTraCC` library is currently in a **release candidate** phase, meaning it is nearly finalized but may receive minor updates before the official stable release.
+<!-- badges: start -->
+[![stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://pyfortracc.readthedocs.io)
+[![pypi](https://badge.fury.io/py/pyfortracc.svg)](https://pypi.python.org/pypi/pyfortracc)
+[![Documentation](https://readthedocs.org/projects/pyfortracc/badge/?version=latest)](https://pyfortracc.readthedocs.io/)
+[![Downloads](https://img.shields.io/pypi/dm/pyfortracc.svg)](https://pypi.python.org/pypi/pyfortracc)
+[![Contributors](https://img.shields.io/github/contributors/fortracc-project/pyfortracc.svg)](https://github.com/fortracc/pyfortracc/graphs/contributors)
+[![License](https://img.shields.io/pypi/l/pyfortracc.svg)](https://github.com/fortracc/pyfortracc/blob/main/LICENSE)
+<!-- badges: end -->
+## Overview
+`pyForTraCC` is a Python library designed to identify and track clusters in diverse datasets, offering flexible integration based on user-defined needs. Its modular structure incorporates specialized methods for feature identification and tracking, allowing for compatibility with various input formats.
+### Algorithm Workflow
+The algorithm is divided into two main modules: **Track** and **Forecast**.
+1. **Track**: This module identifies and tracks clusters in a time-sequenced field. It follows four steps:
+   - **Feature Extraction**: Identifies relevant features using multi-thresholding on a time-varying field, clusters contiguous pixels above thresholds, and vectorizes clusters as geospatial objects.
+   - **Spatial Operations**: Establishes spatial relationships between features and computes vector displacements between feature centroids.
+   - **Cluster Linkage**: Links features across time steps by indexing current features with those from the previous time step, generating unique cluster identifiers, tracking trajectories, and recording the cluster lifetime.
+   - **Concatenation**: Combines all identified features and trajectories into a single Parquet file, forming a consolidated tracking table with complete tracking data.
+2. **Forecast** (Upcoming): This module will predict future cluster positions through:
+   - **Virtual Image**: A persistence-based forecast of cluster positions by shifting clusters in the current time step to a specified future position based on average vector displacement.
+   - **Track Routine**: Applies the tracking routine to the virtual image, projecting cluster identification to the anticipated time step.
+## Documentation
+For detailed instructions and usage, refer to the [pyForTraCC Documentation](https://pyfortracc.readthedocs.io/).
+## Installation
+Download the package from GitHub or clone the repository:
+```bash
+git clone https://github.com/fortracc/pyfortracc/
+```
+It is recommended to use Python 3.12 and a virtual environment (Anaconda3, Miniconda, Mamba, etc.) to avoid dependency conflicts.
+### Installing with Conda
+Create the environment with conda from the `environment.yml` file and activate it:
+```bash
+cd pyfortracc
+conda env create -f environment.yml
+conda activate pyfortracc
+```
+### Installing with Pip
+```bash
+pip3 install pyfortracc
+```
+Running pyForTraCC
+=====================================================================
+To use `pyForTraCC`, install and import the library, then write a custom data-reading function, `read_function`, tailored to your data’s format. This function must return a two-dimensional matrix. Define a dictionary, `name_list`, holding the tracking configuration: data paths, thresholds, and the time interval between frames. Finally, call the tracking function with both.
+Here is an example script:
+```python
+import pyfortracc
+import xarray as xr
+# Custom data reading function
+def read_function(path):
+    """
+    This function reads data from the given path and returns a two-dimensional matrix.
+    """
+    data = xr.open_dataarray(path).data
+    return data
+# Parameter dictionary for tracking configuration
+name_list = {
+    'input_path': 'input/',  # Path to input data
+    'output_path': 'output/',  # Path to output data
+    'thresholds': [20, 30, 45],  # Intensity thresholds
+    'min_cluster_size': [10, 5, 3],  # Minimum cluster size (in number of points)
+    'operator': '>=',  # Comparison operator (>=, <=, or ==)
+    'timestamp_pattern': '%Y%m%d_%H%M%S.nc',  # Timestamp file naming pattern
+    'delta_time': 12  # Time interval between frames, in minutes
+}
+# Execute tracking with parameters and custom reading function
+pyfortracc.track(name_list, read_function)
+```
+Example Gallery
+=====================================================================
+The example gallery below demonstrates the algorithm applied to different situations.
+The framework is under active development.
+[![01 - Introducing Example:](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/01_Introducing_Example/01_Introducing-pyFortraCC.ipynb) - 01 - Track Synthetic data (Introducing Example)
+[![02 - Track Radar Data (GoAmazon Example)](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/02_Track-Radar-Data/02_Track-Radar-Dataset.ipynb) - 02 - Track Radar Data (GoAmazon Example)
+[![03 - Track Infrared (Real Time Tracking):](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/03_Track-Infrared-Dataset/03_Track-Infrared-Dataset.ipynb) - 03 - Track GOES16-IR (Real Time Tracking from CPTEC/INPE)
+[![04 - Track Global Precipitation (Milton Hurricane):](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/04_Track-Global-Precipitation-EDA/04_Track-Global-Precipitation.ipynb) - 04 - Track GSMAP (Milton Hurricane)
+Support and Contact
+=====================================================================
+For support, contact us by e-mail:
+- fortracc.project@inpe.br
+- helvecio.neto@inpe.br
+- alan.calheiros@inpe.br
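
Note on the Feature Extraction step described above: the idea of multi-thresholding followed by grouping contiguous pixels can be sketched in a few lines. This is a minimal illustration and not pyfortracc's implementation. The helper names `label_clusters` and `extract_features`, the use of 4-connectivity, and the toy field are all assumptions made for the example; only the `thresholds` and `min_cluster_size` parameter names and the `>=` operator come from the README.

```python
from collections import deque


def label_clusters(field, threshold, min_size):
    """Group pixels with value >= threshold by 4-connectivity.

    Hypothetical helper, not part of pyfortracc. Returns a list of clusters,
    each a list of (row, col) pixels; clusters below min_size are dropped.
    """
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or field[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from this unvisited pixel
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and field[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_size:
                clusters.append(pixels)
    return clusters


def extract_features(field, thresholds, min_cluster_size):
    """Apply each threshold in turn and collect the clusters per level."""
    return {
        level: label_clusters(field, thr, size)
        for level, (thr, size) in enumerate(zip(thresholds, min_cluster_size))
    }


# Toy 2D field, invented for illustration
field = [
    [0, 0, 0, 0, 0, 0],
    [0, 25, 25, 0, 0, 0],
    [0, 25, 35, 35, 0, 0],
    [0, 0, 35, 50, 0, 22],
    [0, 0, 0, 0, 0, 0],
]
features = extract_features(field, thresholds=[20, 30, 45],
                            min_cluster_size=[3, 2, 1])
for level, clusters in features.items():
    print(level, [len(pixels) for pixels in clusters])
```

With this toy field, the isolated pixel of value 22 passes the first threshold but is discarded by the minimum-size filter, which is the role `min_cluster_size` plays in the README configuration.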

pyfortracc-1.0.0/README.md ADDED

@@ -0,0 +1,113 @@
+# pyForTraCC - Python Library for Tracking and Forecasting Clusters
+> **Note**: `pyForTraCC` library is currently in a **release candidate** phase, meaning it is nearly finalized but may receive minor updates before the official stable release.
+<!-- badges: start -->
+[![stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://pyfortracc.readthedocs.io)
+[![pypi](https://badge.fury.io/py/pyfortracc.svg)](https://pypi.python.org/pypi/pyfortracc)
+[![Documentation](https://readthedocs.org/projects/pyfortracc/badge/?version=latest)](https://pyfortracc.readthedocs.io/)
+[![Downloads](https://img.shields.io/pypi/dm/pyfortracc.svg)](https://pypi.python.org/pypi/pyfortracc)
+[![Contributors](https://img.shields.io/github/contributors/fortracc-project/pyfortracc.svg)](https://github.com/fortracc/pyfortracc/graphs/contributors)
+[![License](https://img.shields.io/pypi/l/pyfortracc.svg)](https://github.com/fortracc/pyfortracc/blob/main/LICENSE)
+<!-- badges: end -->
+## Overview
+`pyForTraCC` is a Python library designed to identify and track clusters in diverse datasets, offering flexible integration based on user-defined needs. Its modular structure incorporates specialized methods for feature identification and tracking, allowing for compatibility with various input formats.
+### Algorithm Workflow
+The algorithm is divided into two main modules: **Track** and **Forecast**.
+1. **Track**: This module identifies and tracks clusters in a time-sequenced field. It follows four steps:
+   - **Feature Extraction**: Identifies relevant features using multi-thresholding on a time-varying field, clusters contiguous pixels above thresholds, and vectorizes clusters as geospatial objects.
+   - **Spatial Operations**: Establishes spatial relationships between features and computes vector displacements between feature centroids.
+   - **Cluster Linkage**: Links features across time steps by indexing current features with those from the previous time step, generating unique cluster identifiers, tracking trajectories, and recording the cluster lifetime.
+   - **Concatenation**: Combines all identified features and trajectories into a single Parquet file, forming a consolidated tracking table with complete tracking data.
+2. **Forecast** (Upcoming): This module will predict future cluster positions through:
+   - **Virtual Image**: A persistence-based forecast of cluster positions by shifting clusters in the current time step to a specified future position based on average vector displacement.
+   - **Track Routine**: Applies the tracking routine to the virtual image, projecting cluster identification to the anticipated time step.
+## Documentation
+For detailed instructions and usage, refer to the [pyForTraCC Documentation](https://pyfortracc.readthedocs.io/).
+## Installation
+Download the package from GitHub or clone the repository:
+```bash
+git clone https://github.com/fortracc/pyfortracc/
+```
+It is recommended to use Python 3.12 and a virtual environment (Anaconda3, Miniconda, Mamba, etc.) to avoid dependency conflicts.
+### Installing with Conda
+Create the environment with conda from the `environment.yml` file and activate it:
+```bash
+cd pyfortracc
+conda env create -f environment.yml
+conda activate pyfortracc
+```
+### Installing with Pip
+```bash
+pip3 install pyfortracc
+```
+Running pyForTraCC
+=====================================================================
+To use `pyForTraCC`, install and import the library, then write a custom data-reading function, `read_function`, tailored to your data’s format. This function must return a two-dimensional matrix. Define a dictionary, `name_list`, holding the tracking configuration: data paths, thresholds, and the time interval between frames. Finally, call the tracking function with both.
+Here is an example script:
+```python
+import pyfortracc
+import xarray as xr
+# Custom data reading function
+def read_function(path):
+    """
+    This function reads data from the given path and returns a two-dimensional matrix.
+    """
+    data = xr.open_dataarray(path).data
+    return data
+# Parameter dictionary for tracking configuration
+name_list = {
+    'input_path': 'input/',  # Path to input data
+    'output_path': 'output/',  # Path to output data
+    'thresholds': [20, 30, 45],  # Intensity thresholds
+    'min_cluster_size': [10, 5, 3],  # Minimum cluster size (in number of points)
+    'operator': '>=',  # Comparison operator (>=, <=, or ==)
+    'timestamp_pattern': '%Y%m%d_%H%M%S.nc',  # Timestamp file naming pattern
+    'delta_time': 12  # Time interval between frames, in minutes
+}
+# Execute tracking with parameters and custom reading function
+pyfortracc.track(name_list, read_function)
+```
+Example Gallery
+=====================================================================
+The example gallery below demonstrates the algorithm applied to different situations.
+The framework is under active development.
+[![01 - Introducing Example:](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/01_Introducing_Example/01_Introducing-pyFortraCC.ipynb) - 01 - Track Synthetic data (Introducing Example)
+[![02 - Track Radar Data (GoAmazon Example)](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/02_Track-Radar-Data/02_Track-Radar-Dataset.ipynb) - 02 - Track Radar Data (GoAmazon Example)
+[![03 - Track Infrared (Real Time Tracking):](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/03_Track-Infrared-Dataset/03_Track-Infrared-Dataset.ipynb) - 03 - Track GOES16-IR (Real Time Tracking from CPTEC/INPE)
+[![04 - Track Global Precipitation (Milton Hurricane):](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/fortracc/pyfortracc/blob/main/examples/04_Track-Global-Precipitation-EDA/04_Track-Global-Precipitation.ipynb) - 04 - Track GSMAP (Milton Hurricane)
+Support and Contact
+=====================================================================
+For support, contact us by e-mail:
+- fortracc.project@inpe.br
+- helvecio.neto@inpe.br
+- alan.calheiros@inpe.br
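
Note on `timestamp_pattern` and `delta_time` in the README example: the pattern is a `strftime`-style format that includes the file extension, and `delta_time` is given in minutes. The sketch below uses only the Python standard library to show how such a pattern maps file names to timestamps and how frame spacing can be compared against `delta_time`. How pyfortracc parses file names internally is not shown in this part of the diff, and the file names here are hypothetical.

```python
from datetime import datetime, timedelta

timestamp_pattern = '%Y%m%d_%H%M%S.nc'   # same pattern as in the README example
delta_time = timedelta(minutes=12)       # same interval as in the README example

# Hypothetical file names, chosen only for illustration
files = ['20240101_000000.nc', '20240101_001200.nc', '20240101_003600.nc']

# Literal characters in the pattern (here "_" and ".nc") must match exactly
stamps = [datetime.strptime(name, timestamp_pattern) for name in files]

# Spacing between consecutive frames, compared with the expected interval
gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
regular = [gap == delta_time for gap in gaps]
print(regular)
```

The second gap in this example is 24 minutes, so it is flagged as irregular. A check like this can be useful before tracking, to spot missing frames in the input directory.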

pyfortracc-1.0.0/pyfortracc/.DS_Store ADDED

Binary file

pyfortracc-1.0.0/pyfortracc/__init__.py ADDED

@@ -0,0 +1,71 @@
+"""
+pyfortracc
+=====
+Provides
+    1. Tracking Non-Rigid Clusters in 2D matrix
+    2. Forecasting the movement of the Clusters by extrapolation
+    3. Plot and analysis of the results of the tracking and forecasting
+    4. Visualizing the results of the tracking, validation and forecasting
+How to use the package
+----------------------------
+Documentation is available in two forms: docstrings provided
+with the code, and a reference guide, available from
+`the project homepage <https://pyfortracc.readthedocs.io/>`.
+Available modules
+---------------------
+track
+    Tracking Non-Rigid Clusters in 2D matrix
+forecast
+    Forecasting the movement of the Clusters by extrapolation
+Available subpackages
+-----------------
+features_extraction
+    Extracting features from the input data
+spatial_operations
+    Spatial operations on the features
+cluster_linking
+    Linking the clusters
+spatial_conversions
+    Converting the results of the tracking or forecasting to spatial data
+plot
+    Visualizing the results of the tracking, validation and forecasting
+About the package
+-----------------------------
+pyForTraCC is a Python package designed to identify, track, forecast and analyze clusters moving in a time-varying field.
+It offers a modular framework that incorporates different algorithms for feature identification, tracking, and analyses.
+One of the key advantages of pyForTraCC is its versatility, as it does not depend on specific input variables or a particular grid structure.
+In its current implementation, pyForTraCC identifies individual cluster features in a 2D field by applying a specified threshold value.
+By utilizing time-varying 2D input images and a specified threshold value, pyForTraCC can determine the associated volume for these features. The
+software then establishes consistent trajectories that represent the complete lifecycle of a single cell or feature through the tracking step.
+Furthermore, pyForTraCC provides analysis and visualization methods that facilitate the utilization and display of the tracking results.
+This algorithm was initially developed and used in the publication "Impact of Multi-Thresholds and Vector Correction for Tracking Precipitating
+Systems over the Amazon Basin" (https://doi.org/10.3390/rs14215408). The methods presented in the research paper have enabled the implementation of robust techniques for extracting the motion vector
+field and trajectory of individual clusters of precipitating cells. These techniques have been applied to the Amazon Basin, where the tracking of
+precipitating systems is essential for understanding the hydrological cycle and its impacts on the environment, and are used in this algorithm.
+For further information on pyForTraCC, its modules, and the continuous development process, please refer to the official documentation and stay tuned for updates
+from the community.
+"""
+from ._version import __version__
+from .default_parameters import default_parameters
+from pyfortracc.track import track
+from pyfortracc.forecast import forecast
+from pyfortracc.features_extraction import features_extraction
+from pyfortracc.spatial_operations import spatial_operations
+from pyfortracc.cluster_linking import cluster_linking
+from .concat import concat
+from pyfortracc.plot.plot import plot
+from pyfortracc.plot.plot_animation import plot_animation
+from pyfortracc.spatial_conversions import spatial_conversions

pyfortracc-1.0.0/pyfortracc/_version.py ADDED

@@ -0,0 +1 @@
+__version__ = "1.0.0"

pyfortracc-1.0.0/pyfortracc/cluster_linking/__init__.py ADDED

@@ -0,0 +1 @@
+from .cluster_linking import *

pyfortracc-1.0.0/pyfortracc/cluster_linking/board_clusters.py ADDED

@@ -0,0 +1,29 @@
+import pandas as pd
+def board_clusters(cur_frame):
+    """
+    Copy the uid, iuid and status values from the clusters referenced by
+    board_idx to the clusters flagged as touching the board.
+    Parameters
+    ----------
+    cur_frame : pandas dataframe
+        Current dataframe.
+    Returns
+    -------
+    cur_frame : pandas dataframe
+        Updated dataframe.
+    """
+    board_idx = cur_frame[(cur_frame['board'] == True)]
+    if board_idx.empty:
+        return cur_frame
+    current_index = cur_frame.loc[board_idx.index]
+    touching_idx = pd.Index(current_index['board_idx'].values.astype(int))
+    board_uids = cur_frame.loc[touching_idx]['uid'].values
+    board_iuids = cur_frame.loc[touching_idx]['iuid'].values
+    board_status = cur_frame.loc[touching_idx]['status'].values
+    cur_frame.loc[board_idx.index,'uid'] = board_uids
+    cur_frame.loc[board_idx.index,'iuid'] = board_iuids
+    cur_frame.loc[board_idx.index,'status'] = board_status
+    return cur_frame
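
Note on `board_clusters`: the function relies on label-based indexing, where each row flagged in the `board` column stores in `board_idx` the index label of the row whose identifiers it should inherit. The toy example below reproduces that copy on a three-row frame. The column names follow the file above; the values, including the `'CON'` status string, are hypothetical and chosen only to make the effect visible.

```python
import pandas as pd

# Toy frame; column names follow board_clusters.py, values are hypothetical
frame = pd.DataFrame({
    'uid':       [1.0, 2.0, float('nan')],
    'iuid':      [1.0, 2.0, float('nan')],
    'status':    ['CON', 'NEW', None],
    'board':     [False, False, True],
    'board_idx': [float('nan'), float('nan'), 0.0],
})

# Rows flagged as touching the board
touching = frame[frame['board']]
# Index labels of the rows they point to
source_idx = pd.Index(touching['board_idx'].astype(int))

# Copy the identifiers and status from the referenced rows
for col in ('uid', 'iuid', 'status'):
    frame.loc[touching.index, col] = frame.loc[source_idx, col].values

print(frame[['uid', 'iuid', 'status']])
```

After the copy, the third row carries the same `uid`, `iuid` and `status` as the first row, while the unflagged rows are left untouched.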

pyfortracc-1.0.0/pyfortracc/cluster_linking/cluster_linking.py ADDED

@@ -0,0 +1,180 @@
+import pandas as pd
+import pathlib
+from pyfortracc.default_parameters import default_parameters
+from pyfortracc.utilities.utils import (get_feature_files, create_dirs,
+                                        get_loading_bar, get_featstamp,
+                                        set_schema, set_outputdf,
+                                        read_parquet, write_parquet,
+                                        check_operational_system)
+from .new_frame import new_frame
+from .max_uid import update_max_uid
+from .board_clusters import board_clusters
+from .refact_inside import refact_inside
+from .merge_trajectory import merge_trajectory
+def cluster_linking(name_lst):
+    """
+    The function links clusters over time, ensuring that clusters in different frames (representing different time points)
+    are identified and associated with each other based on spatial and temporal proximity.
+    Parameters
+    ----------
+    name_lst : dict
+        Dictionary with the parameters to be used.
+    """
+    print('Cluster linking:')
+    # Set default parameters
+    name_lst = default_parameters(name_lst)
+    # Check operational system
+    name_lst, _ = check_operational_system(name_lst, False)
+    # Get feature files to be processed
+    feat_path = name_lst['output_path'] + 'track/processing/spatial/'
+    output_path = name_lst['output_path'] + 'track/processing/linked/'
+    name_lst['output_spatial'] = output_path
+    feat_files = get_feature_files(feat_path)
+    create_dirs(output_path)
+    loading_bar = get_loading_bar(feat_files)
+    # Get number of prev_files to skip based on the number of prev_time
+    prev_skip = name_lst['num_prev_skip']
+    # Set delta_time
+    dt_time = pd.Timedelta(minutes=name_lst['delta_time'])
+    max_dt_time = pd.Timedelta(minutes=(name_lst['delta_time'] +
+                                        name_lst['delta_tolerance']))
+    max_dt_time = max_dt_time * (prev_skip + 1)
+    # Set initial uid if not set in the name_lst
+    if 'initial_uid' not in name_lst.keys():
+        name_lst['initial_uid'] = 1
+    # Uid iterator is used to create new uids
+    uid_iter = name_lst['initial_uid']
+    # Set schema
+    schema = set_schema('linked', name_lst)
+    # Create empty previous frame
+    prv_frame = pd.DataFrame()
+    prv_stamp = get_featstamp(feat_files[0]) - dt_time
+    # Index counter used to create cindex
+    cdx = 0
+    for feat_time, feat_file in enumerate(feat_files):
+        prv_frame, prv_stamp, uid_iter, cdx = linking((feat_time, feat_file,
+                                                prv_frame, prv_stamp,
+                                                name_lst, uid_iter,
+                                                max_dt_time, schema, cdx))
+        loading_bar.update(1)
+    loading_bar.close()
+    return
+def linking(args):
+    """
+    Links clusters between the current and previous frames, updates their unique identifiers (UIDs),
+    handles new frames, and saves the processed data.
+    Parameters
+    ----------
+    args : tuple
+        A tuple containing the following elements:
+        - time_ (int): The current time step index.
+        - cur_file (str): The path to the current frame's data file.
+        - prv_frame (pandas.DataFrame): The DataFrame of the previous frame.
+        - prv_stamp (pandas.Timestamp): The timestamp of the previous frame.
+        - nm_lst (dict): A dictionary with necessary parameters (e.g., output paths, delta times, etc.).
+        - uid_iter (int): The current UID iterator used to assign new UIDs.
+        - max_dt (pandas.Timedelta): The maximum allowed time difference between frames.
+        - schm (pandas.DataFrame): The schema for the output DataFrame.
+        - icdx (int): The current index counter, which increments with each frame processed.
+    Returns
+    -------
+    tuple
+        A tuple containing the following elements:
+        - cur_frame (pandas.DataFrame): The processed DataFrame of the current frame.
+        - cur_stamp (pandas.Timestamp): The timestamp of the current frame.
+        - uid_iter (int): The updated UID iterator.
+        - icdx (int): The updated index counter.
+    """
+    time_, cur_file, prv_frame, prv_stamp, nm_lst, uid_iter, max_dt, schm, icdx = args
+    # Read current file
+    cur_frame = read_parquet(cur_file, ['status','threshold_level',
+                                        'past_idx','inside_idx',
+                                        'board','board_idx',
+                                        'trajectory'])
+    # Set output file
+    output_file = nm_lst['output_spatial'] + pathlib.Path(cur_file).name
+    icdx += 1 # Increment cindex
+    # Check if current frame is empty
+    if cur_frame.empty:
+        cur_frame['cindex'] = []
+        cur_frame['uid'] = []
+        if len(nm_lst['thresholds']) > 1:
+            cur_frame['iuid'] = []
+        cur_frame = cur_frame.astype({'cindex': 'int64', 'uid': 'int64'})
+        # Calculate lifetime
+        cur_frame['lifetime'] = []
+        cur_frame['lifetime'] = cur_frame['lifetime'].fillna(0)
+        write_parquet(cur_frame, output_file)
+        return cur_frame, prv_stamp, uid_iter, icdx
+    # Get schema and cols
+    link_df = set_outputdf(schm)
+    linked_cols = list(link_df.columns)
+    # Create a column called cindex based on the range of the current frame and assign
+    # the value to the current frame
+    cdx_range = range(icdx, icdx + len(cur_frame))
+    cur_frame['cindex'] = cdx_range
+    cur_frame = pd.concat([cur_frame, link_df])
+    # Get current stamp
+    cur_stamp = get_featstamp(cur_file)
+    # Calculate delta time
+    dt_time = cur_stamp - prv_stamp
+    # Conditions to enter in this conditional below:
+    #   - time_ is 0
+    #   - prv_frame is empty
+    #   - dt_time is greater than max_dt
+    if time_ == 0 or prv_frame.empty or dt_time > max_dt:
+        # Classify clusters as new clusters and check board clusters
+        cur_frame = new_frame(cur_frame, uid_iter)
+        cur_frame = board_clusters(cur_frame)
+        cur_frame = refact_inside(cur_frame)
+        uid_iter = update_max_uid(cur_frame['uid'].max(), uid_iter)
+        # Calculate lifetime
+        cur_frame['lifetime'] = 0
+        write_parquet(cur_frame[linked_cols], output_file)
+        return cur_frame, cur_stamp, uid_iter, cdx_range[-1]
+    # Get previous index based on these conditions:
+    #  - past_idx is not null
+    #  - status is not NEW
+    # The association is based on past_idx
+    cur_prev_idx = cur_frame.loc[(~cur_frame['past_idx'].isnull())]
+    cur_prev_idx = cur_prev_idx[~cur_prev_idx['status'].str.contains('NEW')]
+    cur_idx = cur_prev_idx.index
+    prv_idx = pd.Index(cur_prev_idx['past_idx'].values.astype(int))
+    previous_uids = prv_frame.loc[prv_idx]['uid'].values
+    previous_iuids = prv_frame.loc[prv_idx]['iuid'].values
+    cur_frame.loc[cur_idx, 'uid'] = previous_uids
+    cur_frame.loc[cur_idx, 'iuid'] = previous_iuids
+    # Merge trajectories
+    cur_frame = merge_trajectory(cur_frame, cur_idx, prv_frame, prv_idx)
+    # New frames for base threshold
+    cur_frame = new_frame(cur_frame, uid_iter)
+    # Check board clusters
+    cur_frame = board_clusters(cur_frame)
+    # Refact inside clusters
+    cur_frame = refact_inside(cur_frame)
+    # Update max uid
+    uid_iter = update_max_uid(cur_frame['uid'].max(), uid_iter)
+    # Calculate lifetime, get previous lifetime and add to current lifetime
+    prev_lifetime = prv_frame.loc[prv_idx]['lifetime'].values
+    # Calc time interval
+    time_int = (cur_stamp - prv_stamp).total_seconds() / 60
+    cur_frame.loc[cur_idx, 'lifetime'] = prev_lifetime + time_int
+    # Split lifetime: Preserve lifetime of split clusters
+    if nm_lst['preserv_split']:
+        split_frs = cur_frame.loc[cur_frame['split_pr_idx'].notnull()]
+        if len(split_frs) > 0:
+            split_idx = split_frs['split_pr_idx'].values.astype(int)
+            lifetimes = prv_frame.loc[split_idx]['lifetime']
+            cur_frame.loc[split_frs.index, 'lifetime'] = lifetimes.values
+    # Fill NaN values to 0
+    cur_frame['lifetime'] = cur_frame['lifetime'].fillna(0)
+    # Write linked file
+    write_parquet(cur_frame[linked_cols], output_file)
+    return cur_frame, cur_stamp, uid_iter, cdx_range[-1]
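
Note on the time-gap logic in `cluster_linking` and `linking`: a frame is linked to the previous one only if the gap between their timestamps does not exceed `(delta_time + delta_tolerance) * (num_prev_skip + 1)` minutes; otherwise all clusters start new tracks with a lifetime of 0. For linked clusters, the lifetime grows by the elapsed minutes. The sketch below restates those rules with the standard library only. The helper names and the parameter values are hypothetical; the formulas are taken from the file above.

```python
from datetime import datetime, timedelta


def max_allowed_gap(delta_time, delta_tolerance, num_prev_skip):
    """Mirror of the max_dt_time computation (inputs in minutes)."""
    return timedelta(minutes=delta_time + delta_tolerance) * (num_prev_skip + 1)


def starts_new_tracks(time_index, prev_is_empty, cur_stamp, prv_stamp, max_dt):
    """True when the current frame cannot be linked to the previous one."""
    return time_index == 0 or prev_is_empty or (cur_stamp - prv_stamp) > max_dt


def next_lifetime(prev_lifetime, cur_stamp, prv_stamp):
    """Lifetime in minutes of a cluster carried over from the previous frame."""
    return prev_lifetime + (cur_stamp - prv_stamp).total_seconds() / 60


# Hypothetical parameter values, for illustration only
max_dt = max_allowed_gap(delta_time=12, delta_tolerance=3, num_prev_skip=1)
t0 = datetime(2024, 1, 1, 0, 0)

print(max_dt)
print(starts_new_tracks(1, False, t0 + timedelta(minutes=24), t0, max_dt))
print(starts_new_tracks(1, False, t0 + timedelta(minutes=36), t0, max_dt))
print(next_lifetime(12.0, t0 + timedelta(minutes=24), t0))
```

With these values the allowed gap is 30 minutes, so a 24-minute gap is still linked and a 36-minute gap restarts the tracks.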

pyfortracc-1.0.0/pyfortracc/cluster_linking/max_uid.py ADDED

@@ -0,0 +1,25 @@
+def update_max_uid(current_max_uid, global_uid):
+    """
+    Update the global unique identifier (UID) based on the current maximum UID.
+    This function compares the 'current_max_uid' with 'global_uid' and updates 'global_uid' accordingly.
+    If 'current_max_uid' is greater than or equal to 'global_uid', it sets 'global_uid' to 'current_max_uid' + 1.
+    Otherwise, it leaves 'global_uid' unchanged.
+    Parameters
+    ----------
+    current_max_uid : int
+        The current maximum UID observed.
+    global_uid : int
+        The global UID that is used and needs to be updated if necessary.
+    Returns
+    -------
+    int
+        The updated global UID.
+    """
+    if current_max_uid >= global_uid:
+        global_uid = current_max_uid + 1
+    else:
+        global_uid = global_uid
+    return global_uid
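
Note on `update_max_uid`: the behaviour is easiest to see with a few calls. The version below keeps the same logic as the file above but drops the no-op `else` branch; it is a restatement for illustration, not a proposed patch. The NaN case is included as an assumption about what happens if `cur_frame['uid'].max()` yields NaN: any comparison with NaN is false, so the counter is left unchanged.

```python
def update_max_uid(current_max_uid, global_uid):
    """Same logic as max_uid.py, with the no-op else branch dropped."""
    if current_max_uid >= global_uid:
        return current_max_uid + 1
    return global_uid


results = [
    update_max_uid(5, 3),             # frame used uids up to 5: next free uid is 6
    update_max_uid(2, 5),             # counter already ahead: unchanged
    update_max_uid(float('nan'), 5),  # NaN comparison is False: unchanged
]
print(results)
```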