PyPI - autogluon.timeseries - Versions diffs - 1.0.0b20231124__py3-none-any.whl → 1.0.0b20231125__py3-none-any.whl - Mend

autogluon.timeseries 1.0.0b20231124__py3-none-any.whl → 1.0.0b20231125__py3-none-any.whl

Potentially problematic release.

This version of autogluon.timeseries might be problematic.

Files changed (12)

autogluon/timeseries/dataset/ts_dataframe.py CHANGED

@@ -5,6 +5,7 @@ import itertools
 import logging
 import reprlib
 from collections.abc import Iterable
+from itertools import islice
 from pathlib import Path
 from typing import Any, List, Optional, Tuple, Type, Union
@@ -849,6 +850,8 @@ class TimeSeriesDataFrame(pd.DataFrame):
         freq: Union[str, pd.DateOffset],
         agg_numeric: str = "mean",
         agg_categorical: str = "first",
+        num_cpus: int = -1,
+        chunk_size: int = 100,
         **kwargs,
     ) -> TimeSeriesDataFrame:
         """Convert each time series in the data frame to the given frequency.
@@ -858,6 +861,10 @@ class TimeSeriesDataFrame(pd.DataFrame):
         1. Converting an irregularly-sampled time series to a regular time index.
         2. Aggregating time series data by downsampling (e.g., convert daily sales into weekly sales)
+        Standard ``df.groupby(...).resample(...)`` can be extremely slow for large datasets, so we parallelize this
+        operation across multiple CPU cores.
         Parameters
         ----------
         freq : Union[str, pd.DateOffset]
@@ -867,6 +874,10 @@ class TimeSeriesDataFrame(pd.DataFrame):
             Aggregation method applied to numeric columns.
         agg_categorical : {"first", "last"}, default = "first"
             Aggregation method applied to categorical columns.
+        num_cpus : int, default = -1
+            Number of CPU cores used when resampling in parallel. Set to -1 to use all cores.
+        chunk_size : int, default = 100
+            Number of time series in a chunk assigned to each parallel worker.
         **kwargs
             Additional keywords arguments that will be passed to ``pandas.DataFrameGroupBy.resample``.
@@ -928,7 +939,8 @@ class TimeSeriesDataFrame(pd.DataFrame):
         0       2020-12-31    10.0
                 2021-12-31    26.0
         """
-        if self.freq == pd.tseries.frequencies.to_offset(freq).freqstr:
+        offset = pd.tseries.frequencies.to_offset(freq)
+        if self.freq == offset.freqstr:
             return self
         # We need to aggregate categorical columns separately because .agg("mean") deletes all non-numeric columns
@@ -939,9 +951,23 @@ class TimeSeriesDataFrame(pd.DataFrame):
             else:
                 aggregation[col] = agg_categorical
-        resampled_df = TimeSeriesDataFrame(
-            self.groupby(level=ITEMID, sort=False).resample(freq, level=TIMESTAMP, **kwargs).agg(aggregation)
-        )
+        def split_into_chunks(iterable: Iterable, size: int) -> Iterable[Iterable]:
+            # Based on https://stackoverflow.com/a/22045226/5497447
+            iterable = iter(iterable)
+            return iter(lambda: tuple(islice(iterable, size)), ())
+        def resample_chunk(chunk: Iterable[Tuple[str, pd.DataFrame]]) -> pd.DataFrame:
+            resampled_dfs = []
+            for item_id, df in chunk:
+                resampled_df = df.resample(offset, level=TIMESTAMP, **kwargs).agg(aggregation)
+                resampled_dfs.append(pd.concat({item_id: resampled_df}, names=[ITEMID]))
+            return pd.concat(resampled_dfs)
+        # Resampling time for 1 item < overhead time for a single parallel job. Therefore, we group items into chunks
+        # so that the speedup from parallelization isn't dominated by the communication costs.
+        chunks = split_into_chunks(pd.DataFrame(self).groupby(level=ITEMID, sort=False), chunk_size)
+        resampled_chunks = Parallel(n_jobs=num_cpus)(delayed(resample_chunk)(chunk) for chunk in chunks)
+        resampled_df = TimeSeriesDataFrame(pd.concat(resampled_chunks))
         resampled_df.static_features = self.static_features
         return resampled_df
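
The chunked, parallel resampling introduced in the hunk above can be tried outside AutoGluon with a minimal sketch like the one below. It assumes plain pandas and joblib. The function name `resample_in_chunks`, the explicit `freq` and `aggregation` arguments of `resample_chunk`, and the hard-coded level names `item_id` and `timestamp` are illustrative choices, not part of the released code.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, Tuple

import pandas as pd
from joblib import Parallel, delayed


def split_into_chunks(iterable: Iterable, size: int) -> Iterator[tuple]:
    """Yield tuples of at most `size` consecutive items from `iterable`."""
    iterator = iter(iterable)
    # iter(callable, sentinel) calls the lambda until it returns an empty tuple.
    return iter(lambda: tuple(islice(iterator, size)), ())


def resample_chunk(
    chunk: Iterable[Tuple[str, pd.DataFrame]], freq: str, aggregation: Dict[str, str]
) -> pd.DataFrame:
    """Resample every (item_id, frame) pair in one chunk and stack the results."""
    parts = []
    for item_id, df in chunk:
        resampled = df.resample(freq, level="timestamp").agg(aggregation)
        # Re-attach the item id as the outer index level.
        parts.append(pd.concat({item_id: resampled}, names=["item_id"]))
    return pd.concat(parts)


def resample_in_chunks(
    data: pd.DataFrame,
    freq: str,
    aggregation: Dict[str, str],
    num_cpus: int = -1,
    chunk_size: int = 100,
) -> pd.DataFrame:
    """Resample each time series in `data`, one joblib task per chunk of items."""
    groups = data.groupby(level="item_id", sort=False)
    chunks = split_into_chunks(groups, chunk_size)
    results = Parallel(n_jobs=num_cpus)(
        delayed(resample_chunk)(chunk, freq, aggregation) for chunk in chunks
    )
    return pd.concat(results)
```

Calling `resample_in_chunks(data, "D", {"target": "mean"}, num_cpus=2, chunk_size=2)` on a frame indexed by (`item_id`, `timestamp`) with a numeric `target` column should return one daily row per item and day. A larger `chunk_size` means fewer, heavier tasks, which is the trade-off the comment in the diff describes.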

autogluon/timeseries/models/gluonts/abstract_gluonts.py CHANGED

@@ -283,6 +283,7 @@ class AbstractGluonTSModel(AbstractTimeSeriesModel):
         init_args = self._get_estimator_init_args()
         default_trainer_kwargs = {
+            "limit_val_batches": 3,
             "max_epochs": init_args["max_epochs"],
             "callbacks": init_args["callbacks"],
             "enable_progress_bar": False,
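
The keys in this dictionary suggest it is forwarded to a PyTorch Lightning `Trainer`, where an integer `limit_val_batches` caps the number of validation batches per validation run and a float is read as a fraction of the validation set. The hunk does not show how these defaults are combined with user-supplied settings, so the sketch below only illustrates one plausible pattern. The helper `build_trainer_kwargs` and its `user_kwargs` argument are hypothetical and do not appear in the package.

```python
from typing import Any, Dict, Optional


def build_trainer_kwargs(
    max_epochs: int, user_kwargs: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return default trainer settings, letting user-supplied keys override them."""
    trainer_kwargs: Dict[str, Any] = {
        "limit_val_batches": 3,  # default added in this release: 3 validation batches
        "max_epochs": max_epochs,
        "enable_progress_bar": False,
    }
    # Assumption: explicit user settings take precedence over the defaults.
    trainer_kwargs.update(user_kwargs or {})
    return trainer_kwargs
```

Under this assumption, passing `{"limit_val_batches": 1.0}` would restore validation on the full validation set.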

autogluon/timeseries/version.py CHANGED

@@ -1,3 +1,3 @@
 """This is the autogluon version file."""
-__version__ = '1.0.0b20231124'
+__version__ = '1.0.0b20231125'
 __lite__ = False

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/METADATA RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: autogluon.timeseries
-Version: 1.0.0b20231124
+Version: 1.0.0b20231125
 Summary: AutoML for Image, Text, and Tabular Data
 Home-page: https://github.com/autogluon/autogluon
 Author: AutoGluon Community
@@ -50,9 +50,9 @@ Requires-Dist: utilsforecast <0.0.11,>=0.0.10
 Requires-Dist: tqdm <5,>=4.38
 Requires-Dist: orjson ~=3.9
 Requires-Dist: tensorboard <3,>=2.9
-Requires-Dist: autogluon.core[raytune] ==1.0.0b20231124
-Requires-Dist: autogluon.common ==1.0.0b20231124
-Requires-Dist: autogluon.tabular[catboost,lightgbm,xgboost] ==1.0.0b20231124
+Requires-Dist: autogluon.core[raytune] ==1.0.0b20231125
+Requires-Dist: autogluon.common ==1.0.0b20231125
+Requires-Dist: autogluon.tabular[catboost,lightgbm,xgboost] ==1.0.0b20231125
 Provides-Extra: all
 Provides-Extra: tests
 Requires-Dist: pytest ; extra == 'tests'

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/RECORD RENAMED

@@ -1,14 +1,14 @@
-autogluon.timeseries-1.0.0b20231124-py3.8-nspkg.pth,sha256=cQGwpuGPqg1GXscIwt-7PmME1OnSpD-7ixkikJ31WAY,554
+autogluon.timeseries-1.0.0b20231125-py3.8-nspkg.pth,sha256=cQGwpuGPqg1GXscIwt-7PmME1OnSpD-7ixkikJ31WAY,554
 autogluon/timeseries/__init__.py,sha256=_CrLLc1fkjen7UzWoO0Os8WZoHOgvZbHKy46I8v_4k4,304
 autogluon/timeseries/evaluator.py,sha256=l642tYfTHsl8WVIq_vV6qhgAFVFr9UuZD7gLra3A_Kc,250
 autogluon/timeseries/learner.py,sha256=HVfsoWTG3dXBCc7JbPfHCCYCMwL3zlrqHwLBG33MTJ8,9633
 autogluon/timeseries/predictor.py,sha256=sohEmnK0Z-sf7zhQRR6i7zTtuTigs0QXQrzhxKx8v9o,59016
 autogluon/timeseries/splitter.py,sha256=eghGwAAN2_cxGk5aJBILgjGWtLzjxJcytMy49gg_q18,3061
-autogluon/timeseries/version.py,sha256=-k59F7BtYG5KzVCW8NlMl325YMkn2027VY6iivBRmI4,90
+autogluon/timeseries/version.py,sha256=d9yJ5IbELS1blBNVeuaYKDWdfnjJtkgwPmPd9R_0wec,90
 autogluon/timeseries/configs/__init__.py,sha256=BTtHIPCYeGjqgOcvqb8qPD4VNX-ICKOg6wnkew1cPOE,98
 autogluon/timeseries/configs/presets_configs.py,sha256=1u6tbOKJdIRULYDu41dlJwXRNswWsjBDF0aR2YhyMQs,479
 autogluon/timeseries/dataset/__init__.py,sha256=UvnhAN5tjgxXTHoZMQDy64YMDj4Xxa68yY7NP4vAw0o,81
-autogluon/timeseries/dataset/ts_dataframe.py,sha256=gbYz6kwA6DRIPw2ijuWV4CneDPKQO_Zx6ildCSMfV2E,42929
+autogluon/timeseries/dataset/ts_dataframe.py,sha256=PgOz-88hbxNnhbpp0DMJbGBdtM6wIB32YpPWdyROB1c,44424
 autogluon/timeseries/metrics/__init__.py,sha256=gzvHptT-UdvB26CLOoFIznaKT-5FDwuVO37gaYPp88o,1835
 autogluon/timeseries/metrics/abstract.py,sha256=-muJuc30zSqHYXNBYyGocL-4zT7bt4SRjW9ddWcCq9w,8069
 autogluon/timeseries/metrics/point.py,sha256=WdhUrKB0ilO_N9-jHljQBQOj8mDvlNCfwMAD0RO61kI,11277
@@ -26,7 +26,7 @@ autogluon/timeseries/models/ensemble/__init__.py,sha256=kFr11Gmt7lQJu9Rr8HuIPphQ
 autogluon/timeseries/models/ensemble/abstract_timeseries_ensemble.py,sha256=tifETwmiEGt-YtQ9eNK7ojJ3fBvtFMUJvisbfkIJ7gw,3393
 autogluon/timeseries/models/ensemble/greedy_ensemble.py,sha256=3xYzg0CIe0U4l-HScVThb-q8wfKCmNB8SwRjRBMkCMU,7369
 autogluon/timeseries/models/gluonts/__init__.py,sha256=M8PV9ZE4WpteScMobXM6RH1Udb1AZiHHtj2g5GQL3TU,329
-autogluon/timeseries/models/gluonts/abstract_gluonts.py,sha256=t6nyLTcvkLYh_xYhHQGu4UK-c7fqdYguQrqzJT2j9Oo,25563
+autogluon/timeseries/models/gluonts/abstract_gluonts.py,sha256=cdzWbJ36vnSIg5TxzRYaOedvtUipbvQLQbsUSfj43ZA,25599
 autogluon/timeseries/models/gluonts/torch/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 autogluon/timeseries/models/gluonts/torch/models.py,sha256=7ktOy6MxEzD0ykhUwcVEufSjdQNwYadtInLN6cms4Ig,18322
 autogluon/timeseries/models/local/__init__.py,sha256=JyckWWgMG1BTIWJqFTW6e1O-eb0LPPOwtXwmb1ErohQ,756
@@ -48,11 +48,11 @@ autogluon/timeseries/utils/datetime/base.py,sha256=MsqIHY14m3QMjSwwtE7Uo1oNwepWU
 autogluon/timeseries/utils/datetime/lags.py,sha256=kcU4liKbHj7KP2ajNU-KLZ8OYSU35EgT4kJjZNSw0Zg,5875
 autogluon/timeseries/utils/datetime/seasonality.py,sha256=kgK_ukw2wCviEB7CZXRVC5HZpBJZu9IsRrvCJ9E_rOE,755
 autogluon/timeseries/utils/datetime/time_features.py,sha256=pROkYyxETQ8rHKfPGhf2paB73C7rWJ2Ui0cCswLqbBg,2562
-autogluon.timeseries-1.0.0b20231124.dist-info/LICENSE,sha256=CeipvOyAZxBGUsFoaFqwkx54aPnIKEtm9a5u2uXxEws,10142
-autogluon.timeseries-1.0.0b20231124.dist-info/METADATA,sha256=S1f5aKQr741Y-QzUCMNAH6W1PEp5e7LNRSvz3g6SKbQ,13324
-autogluon.timeseries-1.0.0b20231124.dist-info/NOTICE,sha256=7nPQuj8Kp-uXsU0S5so3-2dNU5EctS5hDXvvzzehd7E,114
-autogluon.timeseries-1.0.0b20231124.dist-info/WHEEL,sha256=Xo9-1PvkuimrydujYJAjF7pCkriuXBpUPEjma1nZyJ0,92
-autogluon.timeseries-1.0.0b20231124.dist-info/namespace_packages.txt,sha256=giERA4R78OkJf2ijn5slgjURlhRPzfLr7waIcGkzYAo,10
-autogluon.timeseries-1.0.0b20231124.dist-info/top_level.txt,sha256=giERA4R78OkJf2ijn5slgjURlhRPzfLr7waIcGkzYAo,10
-autogluon.timeseries-1.0.0b20231124.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
-autogluon.timeseries-1.0.0b20231124.dist-info/RECORD,,
+autogluon.timeseries-1.0.0b20231125.dist-info/LICENSE,sha256=CeipvOyAZxBGUsFoaFqwkx54aPnIKEtm9a5u2uXxEws,10142
+autogluon.timeseries-1.0.0b20231125.dist-info/METADATA,sha256=eFmR7JN0SI0xEjL_5c6lQRxNuhHNz8ClyNr0F46Od-w,13324
+autogluon.timeseries-1.0.0b20231125.dist-info/NOTICE,sha256=7nPQuj8Kp-uXsU0S5so3-2dNU5EctS5hDXvvzzehd7E,114
+autogluon.timeseries-1.0.0b20231125.dist-info/WHEEL,sha256=Xo9-1PvkuimrydujYJAjF7pCkriuXBpUPEjma1nZyJ0,92
+autogluon.timeseries-1.0.0b20231125.dist-info/namespace_packages.txt,sha256=giERA4R78OkJf2ijn5slgjURlhRPzfLr7waIcGkzYAo,10
+autogluon.timeseries-1.0.0b20231125.dist-info/top_level.txt,sha256=giERA4R78OkJf2ijn5slgjURlhRPzfLr7waIcGkzYAo,10
+autogluon.timeseries-1.0.0b20231125.dist-info/zip-safe,sha256=AbpHGcgLb-kRsJGnwFEktk7uzpZOCcBY74-YBdrKVGs,1
+autogluon.timeseries-1.0.0b20231125.dist-info/RECORD,,

/autogluon.timeseries-1.0.0b20231124-py3.8-nspkg.pth → /autogluon.timeseries-1.0.0b20231125-py3.8-nspkg.pth RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/LICENSE RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/NOTICE RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/WHEEL RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/namespace_packages.txt RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/top_level.txt RENAMED

File without changes

{autogluon.timeseries-1.0.0b20231124.dist-info → autogluon.timeseries-1.0.0b20231125.dist-info}/zip-safe RENAMED

File without changes
