direl-ts-tool-kit 0.4.4__tar.gz → 0.4.8__tar.gz

Files changed (19)
  1. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/LICENCE +0 -0
  2. direl_ts_tool_kit-0.4.8/PKG-INFO +487 -0
  3. direl_ts_tool_kit-0.4.8/README.md +456 -0
  4. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/__init__.py +0 -0
  5. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/plot/__init__.py +0 -0
  6. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/plot/plot_style.py +0 -0
  7. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/plot/plot_ts.py +1 -1
  8. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/utilities/__init__.py +0 -0
  9. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit/utilities/data_prep.py +4 -2
  10. direl_ts_tool_kit-0.4.8/direl_ts_tool_kit.egg-info/PKG-INFO +487 -0
  11. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit.egg-info/SOURCES.txt +0 -0
  12. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit.egg-info/dependency_links.txt +0 -0
  13. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit.egg-info/requires.txt +0 -0
  14. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/direl_ts_tool_kit.egg-info/top_level.txt +0 -0
  15. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/setup.cfg +0 -0
  16. {direl_ts_tool_kit-0.4.4 → direl_ts_tool_kit-0.4.8}/setup.py +1 -1
  17. direl_ts_tool_kit-0.4.4/PKG-INFO +0 -53
  18. direl_ts_tool_kit-0.4.4/README.md +0 -22
  19. direl_ts_tool_kit-0.4.4/direl_ts_tool_kit.egg-info/PKG-INFO +0 -53
@@ -0,0 +1,487 @@
+ Metadata-Version: 2.4
+ Name: direl-ts-tool-kit
+ Version: 0.4.8
+ Summary: A toolbox for time series analysis and visualization.
+ Home-page: https://gitlab.com/direl/direl_tool_kit
+ Author: Diego Restrepo-Leal
+ Author-email: diegorestrepoleal@gmail.com
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Intended Audience :: Science/Research
+ Classifier: Topic :: Scientific/Engineering :: Visualization
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENCE
+ Requires-Dist: pandas>=1.0.0
+ Requires-Dist: numpy>=1.18.0
+ Requires-Dist: matplotlib>=3.0.0
+ Requires-Dist: openpyxl
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: classifier
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: license-file
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
+
+ # direl-ts-tool-kit
+ > A Toolbox for Time Series Analysis and Visualization
+
+ A lightweight Python library developed to streamline common tasks in time series processing, including data preparation,
+ visualization with a consistent aesthetic style, and handling irregular indices.
+
+ ## Key features and functions
+
+ The library provides the following key functionalities, primarily centered around data preparation and plotting.
+
+ ### Data preparation and index management
+ #### parse_datetime_index
+ `parse_datetime_index(df_raw, date_column="date", format=None)`
+
+ Parses a specified column into datetime objects and sets it as the DataFrame index.
+
+ This function prepares raw data for time series analysis by ensuring the
+ DataFrame is indexed by the correct datetime type.
+
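Editorial sketch (not part of the package diff): the behaviour described above can be approximated with plain pandas. The helper name `parse_datetime_index_sketch` is hypothetical, and the body is an assumption based only on the README description, not the library's actual implementation.

```python
import pandas as pd


def parse_datetime_index_sketch(df_raw, date_column="date", format=None):
    """Hypothetical stand-in: parse one column to datetime and use it as the index."""
    df = df_raw.copy()  # leave the caller's DataFrame untouched (assumption)
    df[date_column] = pd.to_datetime(df[date_column], format=format)
    return df.set_index(date_column)


raw = pd.DataFrame({"date": ["1993-01-01", "1993-02-01"], "LPUE": [0.47, 0.22]})
parsed = parse_datetime_index_sketch(raw)
```

After the call, `parsed.index` is a `DatetimeIndex` and the `date` column no longer appears among the regular columns, which matches the `df1.head(2)` output shown in Example 1.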
+ #### generate_dates
+ `generate_dates(df_ts, freq="MS")`
+
+ Generates a continuous DatetimeIndex covering the time span of the input DataFrame.
+
+ The function determines the start and end dates from the existing DataFrame index
+ and creates a new, regular date sequence based on the specified frequency.
+
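Editorial sketch (not part of the package diff): one way to obtain such a regular sequence is `pandas.date_range` between the first and last index values. The helper name `generate_dates_sketch` is hypothetical and the body is an assumption drawn from the description above.

```python
import pandas as pd


def generate_dates_sketch(df_ts, freq="MS"):
    """Hypothetical stand-in: regular dates spanning the existing index."""
    return pd.date_range(start=df_ts.index.min(), end=df_ts.index.max(), freq=freq)


# Monthly data with February and March missing.
idx = pd.to_datetime(["1993-01-01", "1993-04-01"])
df = pd.DataFrame({"LPUE": [0.47, 0.31]}, index=idx)
dates = generate_dates_sketch(df)
```

With `freq="MS"` (month start), the result contains the four month starts from January to April 1993, including the two months absent from the input.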
+ #### reindex_and_aggregate
+ `reindex_and_aggregate(df_ts, column_name, freq="MS")`
+
+ Re-indexes a time series DataFrame to a regular frequency, aggregates values,
+ and introduces NaN for missing time steps.
+
+ This function first identifies the time range from the original (potentially irregular)
+ index, aggregates data if necessary (e.g., if multiple entries exist per time step),
+ and then merges the data onto a complete date range, effectively filling gaps
+ with NaN values.
+
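Editorial sketch (not part of the package diff): the three steps described above (find the range, aggregate duplicates, align to a complete range) might look like this in pandas. The helper name is hypothetical, and using the mean as the aggregation is an assumption; the README does not say which aggregation the library applies.

```python
import pandas as pd


def reindex_and_aggregate_sketch(df_ts, column_name, freq="MS"):
    """Hypothetical stand-in; assumes timestamps already fall on the target frequency."""
    full_range = pd.date_range(df_ts.index.min(), df_ts.index.max(), freq=freq)
    # Collapse duplicate timestamps (mean is an assumption, not confirmed by the source).
    aggregated = df_ts.groupby(level=0)[column_name].mean()
    # Align to the complete range; missing time steps become NaN.
    aligned = aggregated.reindex(full_range).rename_axis(df_ts.index.name)
    return aligned.to_frame()


idx = pd.DatetimeIndex(["1993-01-01", "1993-01-01", "1993-03-01"], name="date")
df = pd.DataFrame({"LPUE": [1.0, 3.0, 0.2]}, index=idx)
out = reindex_and_aggregate_sketch(df, "LPUE")
```

Here the two January entries are averaged into one row, February appears as NaN, and March keeps its single value.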
+ #### remove_outliers_by_threshold
+ `remove_outliers_by_threshold(df_ts, column_name, lower_bound, upper_bound)`
+
+ Replaces values in a specified column with NaN if they fall outside a defined range (outlier removal).
+
+ This function identifies data points that are either below the lower
+ bound or above the upper bound and treats them as missing data.
+
+
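Editorial sketch (not part of the package diff): threshold-based masking as described above can be written with a boolean mask. The helper name is hypothetical; treating values exactly equal to a bound as valid follows the "below" / "above" wording but remains an assumption about the library.

```python
import numpy as np
import pandas as pd


def remove_outliers_sketch(df_ts, column_name, lower_bound, upper_bound):
    """Hypothetical stand-in: set out-of-range values to NaN, keep everything else."""
    df = df_ts.copy()
    outside = (df[column_name] < lower_bound) | (df[column_name] > upper_bound)
    df.loc[outside, column_name] = np.nan
    return df


df = pd.DataFrame({"Temperature": [33.6, 31.0, 29.5, 32.0]})
cleaned = remove_outliers_sketch(df, "Temperature", lower_bound=30, upper_bound=32)
```

With bounds 30 and 32, the readings 33.6 and 29.5 become NaN while 31.0 and the boundary value 32.0 are kept. The rows themselves remain, so the time index stays intact for plotting.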
+ ### Visualization and styling
+
+ #### plot_time_series
+ `plot_time_series(df_ts, variable, units="", color="BLUE_LINES", time_unit="Year", rot=90, auto_format_label=True)`
+
+ Plots a time series with custom styling and dual-level grid visibility.
+
+ This function automatically sets major and minor time-based locators
+ on the x-axis based on the specified time unit, and formats the y-axis
+ to use scientific notation.
+
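Editorial sketch (not part of the package diff): the locator and notation behaviour described above can be reproduced with standard Matplotlib calls. The choice of `YearLocator` for major ticks and `MonthLocator` for minor ticks when `time_unit="Year"` is an assumption; the README does not state which locators the library uses.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

idx = pd.date_range("1993-01-01", periods=36, freq="MS")
df = pd.DataFrame({"LPUE": range(36)}, index=idx)

fig, ax = plt.subplots()
ax.plot(df.index, df["LPUE"])

# Major ticks per year, minor ticks per month (assumed mapping for a yearly time unit).
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_minor_locator(mdates.MonthLocator())

# Scientific notation on the y-axis only; the date axis is left alone.
ax.ticklabel_format(axis="y", style="sci", scilimits=(0, 0))

# Dual-level grid: stronger major lines, fainter minor lines.
ax.grid(True, which="major")
ax.grid(True, which="minor", alpha=0.3)
ax.tick_params(axis="x", rotation=90)
```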
+ #### save_figure
+ `save_figure(fig, file_name, variable_name="", path="./")`
+
+ Saves a Matplotlib figure in three common high-quality formats (PNG, PDF, SVG).
+
+ The function creates a consistent file name structure:
+ {path}/{file_name}_{variable_name}.{extension}.
+
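Editorial sketch (not part of the package diff): the naming pattern above can be illustrated without Matplotlib. The helper `figure_paths_sketch` is hypothetical and only builds the three target paths; how the library formats the name when `variable_name` is left empty is not documented here, so the example passes both parts explicitly.

```python
from pathlib import Path


def figure_paths_sketch(file_name, variable_name, path="./"):
    """Hypothetical helper: build {path}/{file_name}_{variable_name}.{extension}."""
    stem = f"{file_name}_{variable_name}"
    return [Path(path) / f"{stem}.{ext}" for ext in ("png", "pdf", "svg")]


targets = figure_paths_sketch("catch", "LPUE", path="figures")
names = [p.name for p in targets]
```

Each resulting path could then be passed to `fig.savefig(...)`, which infers the output format from the file extension.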
+ # Examples
+ ## Example 1
+ ```python
+ import jupyter_black
+
+ jupyter_black.load()
+ ```
+
+
+ ```python
+ import pandas as pd
+ import warnings
+
+ from direl_ts_tool_kit import (
+     plot_time_series,
+     save_figure,
+     parse_datetime_index,
+     reindex_and_aggregate,
+ )
+
+ warnings.filterwarnings("ignore")
+ ```
+
+
+ ```python
+ df0 = pd.read_csv("dataset_test_01.csv")
+ ```
+
+
+ ```python
+ df0.head(2)
+ ```
+
+
+
+
+ <div>
+ <style scoped>
+ .dataframe tbody tr th:only-of-type {
+ vertical-align: middle;
+ }
+
+ .dataframe tbody tr th {
+ vertical-align: top;
+ }
+
+ .dataframe thead th {
+ text-align: right;
+ }
+ </style>
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>date</th>
+ <th>LPUE</th>
+ <th>common name</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>0</th>
+ <td>1993-01-01</td>
+ <td>0.47</td>
+ <td>camaron blanco</td>
+ </tr>
+ <tr>
+ <th>1</th>
+ <td>1993-02-01</td>
+ <td>0.22</td>
+ <td>camaron blanco</td>
+ </tr>
+ </tbody>
+ </table>
+ </div>
+
+
+
+
+ ```python
+ df1 = parse_datetime_index(df0)
+ df1.head(2)
+ ```
+
+
+
+
+ <div>
+ <style scoped>
+ .dataframe tbody tr th:only-of-type {
+ vertical-align: middle;
+ }
+
+ .dataframe tbody tr th {
+ vertical-align: top;
+ }
+
+ .dataframe thead th {
+ text-align: right;
+ }
+ </style>
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>LPUE</th>
+ <th>common name</th>
+ </tr>
+ <tr>
+ <th>date</th>
+ <th></th>
+ <th></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>1993-01-01</th>
+ <td>0.47</td>
+ <td>camaron blanco</td>
+ </tr>
+ <tr>
+ <th>1993-02-01</th>
+ <td>0.22</td>
+ <td>camaron blanco</td>
+ </tr>
+ </tbody>
+ </table>
+ </div>
+
+
+
+
+ ```python
+ fig = plot_time_series(df1, "LPUE", "$(kg/dop)$", auto_format_label=False)
+ fig.show()
+ ```
+
+
+
+ ![png](example/output_5_0.png)
+
+
+
+
+ ```python
+ save_figure(fig, file_name="LPUE_raw")
+ ```
+
+
+ ```python
+ df2 = reindex_and_aggregate(df1, column_name="LPUE")
+ df2.head(2)
+ ```
+
+
+
+
+ <div>
+ <style scoped>
+ .dataframe tbody tr th:only-of-type {
+ vertical-align: middle;
+ }
+
+ .dataframe tbody tr th {
+ vertical-align: top;
+ }
+
+ .dataframe thead th {
+ text-align: right;
+ }
+ </style>
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>LPUE</th>
+ </tr>
+ <tr>
+ <th>date</th>
+ <th></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>1993-01-01</th>
+ <td>0.47</td>
+ </tr>
+ <tr>
+ <th>1993-02-01</th>
+ <td>0.22</td>
+ </tr>
+ </tbody>
+ </table>
+ </div>
+
+
+
+
+ ```python
+ fig = plot_time_series(df2, "LPUE", "$(kg/dop)$", auto_format_label=False)
+ fig.show()
+ ```
+
+
+
+ ![png](example/output_8_0.png)
+
+
+
+
+ ```python
+ save_figure(fig, file_name="LPUE")
+ ```
+
+ ## Example 2
+ ```python
+ import jupyter_black
+
+ jupyter_black.load()
+ ```
+
+
+ ```python
+ import pandas as pd
+ import warnings
+
+ from direl_ts_tool_kit import (
+     plot_time_series,
+     parse_datetime_index,
+     remove_outliers_by_threshold,
+ )
+
+ warnings.filterwarnings("ignore")
+ ```
+
+
+ ```python
+ df0 = pd.read_csv("Data_DHT11_4.csv")
+ ```
+
+
+ ```python
+ df0 = df0.rename(columns={"Date": "date"})
+ ```
+
+
+ ```python
+ df0.head(2)
+ ```
+
+
+
+
+ <div>
+ <style scoped>
+ .dataframe tbody tr th:only-of-type {
+ vertical-align: middle;
+ }
+
+ .dataframe tbody tr th {
+ vertical-align: top;
+ }
+
+ .dataframe thead th {
+ text-align: right;
+ }
+ </style>
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>date</th>
+ <th>Temperature</th>
+ <th>Humidity</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>0</th>
+ <td>4/07/2025 15:30:46</td>
+ <td>33.6</td>
+ <td>62.0</td>
+ </tr>
+ <tr>
+ <th>1</th>
+ <td>4/07/2025 15:40:53</td>
+ <td>33.4</td>
+ <td>62.0</td>
+ </tr>
+ </tbody>
+ </table>
+ </div>
+
+
+
+
+ ```python
+ df1 = parse_datetime_index(df0, format="%d/%m/%Y %H:%M:%S")
+ df1.head(2)
+ ```
+
+
+
+
+ <div>
+ <style scoped>
+ .dataframe tbody tr th:only-of-type {
+ vertical-align: middle;
+ }
+
+ .dataframe tbody tr th {
+ vertical-align: top;
+ }
+
+ .dataframe thead th {
+ text-align: right;
+ }
+ </style>
+ <table border="1" class="dataframe">
+ <thead>
+ <tr style="text-align: right;">
+ <th></th>
+ <th>Temperature</th>
+ <th>Humidity</th>
+ </tr>
+ <tr>
+ <th>date</th>
+ <th></th>
+ <th></th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <th>2025-07-04 15:30:46</th>
+ <td>33.6</td>
+ <td>62.0</td>
+ </tr>
+ <tr>
+ <th>2025-07-04 15:40:53</th>
+ <td>33.4</td>
+ <td>62.0</td>
+ </tr>
+ </tbody>
+ </table>
+ </div>
+
+
+
+
+ ```python
+ fig = plot_time_series(
+     df1,
+     variable="Temperature",
+     units="$(^\circ C)$",
+     time_unit="Day",
+     rot=0,
+ )
+ fig.show()
+ ```
+
+
+
+ ![png](example/output_6_0.png)
+
+
+
+
+ ```python
+ df2 = remove_outliers_by_threshold(
+     df1, column_name="Temperature", lower_bound=30, upper_bound=32
+ )
+ ```
+
+
+ ```python
+ fig = plot_time_series(
+     df2,
+     variable="Temperature",
+     units="$(^\circ C)$",
+     time_unit="Day",
+     rot=0,
+     auto_format_label=False,
+ )
+ fig.show()
+ ```
+
+
+
+ ![png](example/output_9_0.png)
+