paradigma-1.0.1.tar.gz → paradigma-1.0.3.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (22)
  1. {paradigma-1.0.1 → paradigma-1.0.3}/PKG-INFO +8 -7
  2. {paradigma-1.0.1 → paradigma-1.0.3}/README.md +6 -6
  3. {paradigma-1.0.1 → paradigma-1.0.3}/pyproject.toml +4 -1
  4. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/config.py +3 -1
  5. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/gait_pipeline.py +93 -45
  6. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/tremor_pipeline.py +4 -13
  7. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/preprocessing.py +4 -4
  8. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/util.py +20 -4
  9. {paradigma-1.0.1 → paradigma-1.0.3}/LICENSE +0 -0
  10. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/__init__.py +0 -0
  11. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/assets/gait_detection_clf_package.pkl +0 -0
  12. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/assets/gait_filtering_clf_package.pkl +0 -0
  13. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/assets/ppg_quality_clf_package.pkl +0 -0
  14. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/assets/tremor_detection_clf_package.pkl +0 -0
  15. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/classification.py +0 -0
  16. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/constants.py +0 -0
  17. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/feature_extraction.py +0 -0
  18. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/__init__.py +0 -0
  19. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/pulse_rate_pipeline.py +0 -0
  20. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/pulse_rate_utils.py +0 -0
  21. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/segmenting.py +0 -0
  22. {paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/testing.py +0 -0

{paradigma-1.0.1 → paradigma-1.0.3}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: paradigma
-Version: 1.0.1
+Version: 1.0.3
 Summary: ParaDigMa - A toolbox for deriving Parkinson's disease Digital Markers from real-life wrist sensor data
 License: Apache-2.0
 Author: Erik Post
@@ -11,6 +11,7 @@ Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
+Requires-Dist: nbconvert (>=7.16.6,<8.0.0)
 Requires-Dist: pandas (>=2.1.4,<3.0.0)
 Requires-Dist: python-dateutil (>=2.9.0.post0,<3.0.0)
 Requires-Dist: pytype (>=2024.4.11,<2025.0.0)
@@ -41,7 +42,7 @@ It contains three data processing pipelines: (1) arm swing during gait, (2) trem
 and (3) pulse rate. These pipelines are scientifically validated for their
 use in persons with PD. Furthermore, the toolbox contains general functionalities for
 signal processing and feature extraction, such as filtering, peak detection, and
-spectral analysis.
+spectral analysis.
 
 The toolbox is accompanied by a set of example scripts and notebooks for
 each processing pipeline that demonstrate how to use the toolbox for extracting
@@ -111,10 +112,10 @@ We have included support for [TSDF](https://biomarkersparkinson.github.io/tsdf/)
 
 ## Scientific validation
 
-The pipelines were developed and validated using data from the Parkinson@Home Validation study [[Evers et al. 2020]](https://pmc.ncbi.nlm.nih.gov/articles/PMC7584982/) and the Personalized Parkinson Project [[Bloem et al. 2019]](https://pubmed.ncbi.nlm.nih.gov/31315608/). The following publication contains the details and validation of the arm swing during gait pipeline:
-* [Post, E. et al. - Quantifying arm swing in Parkinson's disease: a method account for arm activities during free-living gait](https://doi.org/10.1186/s12984-025-01578-z)
-
-Details and validation of the other pipelines shall be shared in upcoming scientific publications.
+The pipelines were developed and validated using data from the Parkinson@Home Validation study [[Evers et al. (2020)]](https://pmc.ncbi.nlm.nih.gov/articles/PMC7584982/) and the Personalized Parkinson Project [[Bloem et al. (2019)]](https://pubmed.ncbi.nlm.nih.gov/31315608/). The following publications contain details and validation of the pipelines:
+* [Post, E. et al. (2025) - Quantifying arm swing in Parkinson's disease: a method account for arm activities during free-living gait](https://doi.org/10.1186/s12984-025-01578-z)
+* [Timmermans, N.A. et al. (2025) - A generalizable and open-source algorithm for real-life monitoring of tremor in Parkinson's disease](https://doi.org/10.1038/s41531-025-01056-2)
+* [Veldkamp, K.I. et al. (2025) - Heart rate monitoring using wrist photoplethysmography in Parkinson disease: feasibility and relation with autonomic dysfunction](https://doi.org/10.1101/2025.08.15.25333751)
 
 ## Contributing
 
@@ -133,5 +134,5 @@ The initial release of ParaDigMa was funded by the Michael J Fox Foundation (gra
 ParaDigMa was created with [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/) and the `py-pkgs-cookiecutter` [template](https://github.com/py-pkgs/py-pkgs-cookiecutter).
 
 ## Contact
-Questions, issues or suggestions about ParaDigMa? Please reach out to erik.post@radboudumc.nl, or open an issue in the GitHub repository.
+Questions, issues or suggestions about ParaDigMa? Please reach out to paradigma@radboudumc.nl, or open an issue in the GitHub repository.
 
{paradigma-1.0.1 → paradigma-1.0.3}/README.md

@@ -21,7 +21,7 @@ It contains three data processing pipelines: (1) arm swing during gait, (2) trem
 and (3) pulse rate. These pipelines are scientifically validated for their
 use in persons with PD. Furthermore, the toolbox contains general functionalities for
 signal processing and feature extraction, such as filtering, peak detection, and
-spectral analysis.
+spectral analysis.
 
 The toolbox is accompanied by a set of example scripts and notebooks for
 each processing pipeline that demonstrate how to use the toolbox for extracting
@@ -91,10 +91,10 @@ We have included support for [TSDF](https://biomarkersparkinson.github.io/tsdf/)
 
 ## Scientific validation
 
-The pipelines were developed and validated using data from the Parkinson@Home Validation study [[Evers et al. 2020]](https://pmc.ncbi.nlm.nih.gov/articles/PMC7584982/) and the Personalized Parkinson Project [[Bloem et al. 2019]](https://pubmed.ncbi.nlm.nih.gov/31315608/). The following publication contains the details and validation of the arm swing during gait pipeline:
-* [Post, E. et al. - Quantifying arm swing in Parkinson's disease: a method account for arm activities during free-living gait](https://doi.org/10.1186/s12984-025-01578-z)
-
-Details and validation of the other pipelines shall be shared in upcoming scientific publications.
+The pipelines were developed and validated using data from the Parkinson@Home Validation study [[Evers et al. (2020)]](https://pmc.ncbi.nlm.nih.gov/articles/PMC7584982/) and the Personalized Parkinson Project [[Bloem et al. (2019)]](https://pubmed.ncbi.nlm.nih.gov/31315608/). The following publications contain details and validation of the pipelines:
+* [Post, E. et al. (2025) - Quantifying arm swing in Parkinson's disease: a method account for arm activities during free-living gait](https://doi.org/10.1186/s12984-025-01578-z)
+* [Timmermans, N.A. et al. (2025) - A generalizable and open-source algorithm for real-life monitoring of tremor in Parkinson's disease](https://doi.org/10.1038/s41531-025-01056-2)
+* [Veldkamp, K.I. et al. (2025) - Heart rate monitoring using wrist photoplethysmography in Parkinson disease: feasibility and relation with autonomic dysfunction](https://doi.org/10.1101/2025.08.15.25333751)
 
 ## Contributing
 
@@ -113,4 +113,4 @@ The initial release of ParaDigMa was funded by the Michael J Fox Foundation (gra
 ParaDigMa was created with [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/) and the `py-pkgs-cookiecutter` [template](https://github.com/py-pkgs/py-pkgs-cookiecutter).
 
 ## Contact
-Questions, issues or suggestions about ParaDigMa? Please reach out to erik.post@radboudumc.nl, or open an issue in the GitHub repository.
+Questions, issues or suggestions about ParaDigMa? Please reach out to paradigma@radboudumc.nl, or open an issue in the GitHub repository.
{paradigma-1.0.1 → paradigma-1.0.3}/pyproject.toml

@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "paradigma"
-version = "1.0.1"
+version = "1.0.3"
 description = "ParaDigMa - A toolbox for deriving Parkinson's disease Digital Markers from real-life wrist sensor data"
 authors = [ "Erik Post <erik.post@radboudumc.nl>",
     "Kars Veldkamp <kars.veldkamp@radboudumc.nl>",
@@ -22,6 +22,7 @@ pytype = "^2024.4.11"
 # for the record: pytype was installed directly with pip (in the poetry environment),
 # because poetry didn't handle the install for different CPU architectures
 python-dateutil = "^2.9.0.post0"
+nbconvert = "^7.16.6"
 
 [tool.poetry.group.testing.dependencies]
 ipykernel = "^6.27.1"
@@ -41,6 +42,8 @@ nbsphinx = "^0.9.6"
 
 [tool.poetry.group.dev.dependencies]
 pytype = "^2024.10.11"
+notebook = "^7.4.5"
+ipykernel = "^6.30.1"
 
 [build-system]
 requires = ["poetry-core>=1.0.0"]
{paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/config.py

@@ -66,6 +66,7 @@ class IMUConfig(BaseConfig):
         self.d_channels_imu = {**self.d_channels_accelerometer, **self.d_channels_gyroscope}
 
         self.sampling_frequency = 100
+        self.resampling_frequency = 100
         self.lower_cutoff_frequency = 0.2
         self.upper_cutoff_frequency = 3.5
         self.filter_order = 4
@@ -222,7 +223,8 @@ class TremorConfig(IMUConfig):
         # -----------
         # Aggregation
         # -----------
-        self.aggregates_tremor_power: List[str] = ['mode', 'median', '90p']
+        self.aggregates_tremor_power: List[str] = ['mode_binned', 'median', '90p']
+        self.evaluation_points_tremor_power: np.ndarray = np.linspace(0, 6, 301)
 
         # -----------------
         # TSDF data storage
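
The new `evaluation_points_tremor_power` grid takes over from the bin edges previously hard-coded inside `aggregate_tremor` (see the tremor_pipeline.py hunks below). As a standalone illustration, not package code, the grid resolution works out as follows:

```python
import numpy as np

# 301 points spanning [0, 6] give a step of 6 / (301 - 1) = 0.02,
# matching the np.linspace(0, 6, 301) bin edges removed from
# aggregate_tremor in tremor_pipeline.py below.
pts = np.linspace(0, 6, 301)
print(len(pts), round(float(pts[1] - pts[0]), 6))  # 301 0.02
```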
{paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/gait_pipeline.py

@@ -1,3 +1,4 @@
+import logging
 import numpy as np
 import pandas as pd
 from scipy.signal import periodogram
@@ -10,10 +11,16 @@ from paradigma.feature_extraction import pca_transform_gyroscope, compute_angle,
     extract_angle_extremes, compute_range_of_motion, compute_peak_angular_velocity, compute_statistics, \
     compute_std_euclidean_norm, compute_power_in_bandwidth, compute_dominant_frequency, compute_mfccs, \
     compute_total_power
-from paradigma.segmenting import tabulate_windows, create_segments, discard_segments, categorize_segments, WindowedDataExtractor
+from paradigma.segmenting import tabulate_windows, create_segments, discard_segments, WindowedDataExtractor
 from paradigma.util import aggregate_parameter
 
 
+logger = logging.getLogger(__name__)
+
+# Only configure basic logging if no handlers exist
+if not logger.hasHandlers():
+    logging.basicConfig(level=logging.INFO, format='%(levelname)s: %(message)s')
+
 def extract_gait_features(
     df: pd.DataFrame,
     config: GaitConfig
@@ -351,19 +358,15 @@ def quantify_arm_swing(
     """
     # Group consecutive timestamps into segments, with new segments starting after a pre-specified gap.
    # Segments are made based on predicted gait
-    df[DataColumns.SEGMENT_NR] = create_segments(
+    df['unfiltered_segment_nr'] = create_segments(
         time_array=df[DataColumns.TIME],
         max_segment_gap_s=max_segment_gap_s
     )
 
-    # Segment category is determined based on predicted gait, hence it is set
-    # before filtering the DataFrame to only include predicted no other arm activity
-    df[DataColumns.SEGMENT_CAT] = categorize_segments(df=df, fs=fs)
-
     # Remove segments that do not meet predetermined criteria
     df = discard_segments(
         df=df,
-        segment_nr_colname=DataColumns.SEGMENT_NR,
+        segment_nr_colname='unfiltered_segment_nr',
         min_segment_length_s=min_segment_length_s,
         fs=fs,
         format='timestamps'
@@ -371,7 +374,10 @@
 
     if df.empty:
         raise ValueError("No segments found in the input data after discarding segments of invalid shape.")
-
+
+    # Create dictionary of gait segment number and duration
+    gait_segment_duration_dict = {segment_nr: len(group[DataColumns.TIME]) / fs for segment_nr, group in df.groupby('unfiltered_segment_nr', sort=False)}
+
     # If no arm swing data is remaining, return an empty dictionary
     if filtered and df.loc[df[DataColumns.PRED_NO_OTHER_ARM_ACTIVITY]==1].empty:
         raise ValueError("No gait without other arm activities to quantify.")
@@ -380,7 +386,7 @@
     df = df.loc[df[DataColumns.PRED_NO_OTHER_ARM_ACTIVITY]==1].reset_index(drop=True)
 
     # Group consecutive timestamps into segments of filtered gait
-    df[DataColumns.SEGMENT_NR] = create_segments(
+    df['filtered_segment_nr'] = create_segments(
         time_array=df[DataColumns.TIME],
         max_segment_gap_s=max_segment_gap_s
     )
@@ -388,13 +394,15 @@
     # Remove segments that do not meet predetermined criteria
     df = discard_segments(
         df=df,
-        segment_nr_colname=DataColumns.SEGMENT_NR,
+        segment_nr_colname='filtered_segment_nr',
         min_segment_length_s=min_segment_length_s,
         fs=fs,
     )
 
     if df.empty:
         raise ValueError("No filtered gait segments found in the input data after discarding segments of invalid shape.")
+
+    grouping_colname = 'filtered_segment_nr' if filtered else 'unfiltered_segment_nr'
 
     arm_swing_quantified = []
     segment_meta = {
@@ -415,8 +423,22 @@
     )
 
     # Group and process segments
-    for segment_nr, group in df.groupby(DataColumns.SEGMENT_NR, sort=False):
-        segment_cat = group[DataColumns.SEGMENT_CAT].iloc[0]
+    for segment_nr, group in df.groupby(grouping_colname, sort=False):
+        if filtered:
+            gait_segment_nr = group['unfiltered_segment_nr'].iloc[0]  # Each filtered segment is contained within an unfiltered segment
+        else:
+            gait_segment_nr = segment_nr
+
+        try:
+            gait_segment_duration_s = gait_segment_duration_dict[gait_segment_nr]
+        except KeyError:
+            logger.warning(
+                "Segment %s (filtered = %s) not found in gait segment duration dictionary. Skipping this segment.",
+                gait_segment_nr, filtered
+            )
+            logger.debug("Available segments: %s", list(gait_segment_duration_dict.keys()))
+            continue
+
         time_array = group[DataColumns.TIME].to_numpy()
         velocity_array = group[DataColumns.VELOCITY].to_numpy()
 
@@ -435,10 +457,12 @@
         segment_meta['per_segment'][segment_nr] = {
             'start_time_s': time_array.min(),
             'end_time_s': time_array.max(),
-            'duration_s': len(angle_array) / fs,
-            DataColumns.SEGMENT_CAT: segment_cat
+            'duration_unfiltered_segment_s': gait_segment_duration_s,
         }
 
+        if filtered:
+            segment_meta['per_segment'][segment_nr]['duration_filtered_segment_s'] = len(time_array) / fs
+
         if angle_array.size > 0:
             angle_extrema_indices, _, _ = extract_angle_extremes(
                 angle_array=angle_array,
@@ -469,26 +493,18 @@
 
         df_params_segment = pd.DataFrame({
             DataColumns.SEGMENT_NR: segment_nr,
-            DataColumns.SEGMENT_CAT: segment_cat,
             DataColumns.RANGE_OF_MOTION: rom,
             DataColumns.PEAK_VELOCITY: pav
         })
 
         arm_swing_quantified.append(df_params_segment)
 
-    # Combine segment categories
-    segment_categories = set([segment_meta['per_segment'][x][DataColumns.SEGMENT_CAT] for x in segment_meta['per_segment'].keys()])
-    for segment_cat in segment_categories:
-        segment_meta['aggregated'][segment_cat] = {
-            'duration_s': sum([segment_meta['per_segment'][x]['duration_s'] for x in segment_meta['per_segment'].keys() if segment_meta['per_segment'][x][DataColumns.SEGMENT_CAT] == segment_cat])
-        }
-
     arm_swing_quantified = pd.concat(arm_swing_quantified, ignore_index=True)
 
     return arm_swing_quantified, segment_meta
 
 
-def aggregate_arm_swing_params(df_arm_swing_params: pd.DataFrame, segment_meta: dict, aggregates: List[str] = ['median']) -> dict:
+def aggregate_arm_swing_params(df_arm_swing_params: pd.DataFrame, segment_meta: dict, segment_cats: List[tuple], aggregates: List[str] = ['median']) -> dict:
     """
     Aggregate the quantification results for arm swing parameters.
 
@@ -499,6 +515,9 @@
 
     segment_meta : dict
         A dictionary containing metadata for each segment.
+
+    segment_cats : List[tuple]
+        A list of tuples defining the segment categories, where each tuple contains the lower and upper bounds for the segment duration.
 
     aggregates : List[str], optional
         A list of aggregation methods to apply to the quantification results.
@@ -510,29 +529,58 @@
     """
     arm_swing_parameters = [DataColumns.RANGE_OF_MOTION, DataColumns.PEAK_VELOCITY]
 
-    uq_segment_cats = set([segment_meta[x][DataColumns.SEGMENT_CAT] for x in df_arm_swing_params[DataColumns.SEGMENT_NR].unique()])
-
     aggregated_results = {}
-    for segment_cat in uq_segment_cats:
-        cat_segments = [x for x in segment_meta.keys() if segment_meta[x][DataColumns.SEGMENT_CAT] == segment_cat]
-
-        aggregated_results[segment_cat] = {
-            'duration_s': sum([segment_meta[x]['duration_s'] for x in cat_segments])
-        }
-
-        df_arm_swing_params_cat = df_arm_swing_params[df_arm_swing_params[DataColumns.SEGMENT_NR].isin(cat_segments)]
-
-        for arm_swing_parameter in arm_swing_parameters:
-            for aggregate in aggregates:
-                aggregated_results[segment_cat][f'{aggregate}_{arm_swing_parameter}'] = aggregate_parameter(df_arm_swing_params_cat[arm_swing_parameter], aggregate)
-
-    aggregated_results['all_segment_categories'] = {
-        'duration_s': sum([segment_meta[x]['duration_s'] for x in segment_meta.keys()])
-    }
-
-    for arm_swing_parameter in arm_swing_parameters:
-        for aggregate in aggregates:
-            aggregated_results['all_segment_categories'][f'{aggregate}_{arm_swing_parameter}'] = aggregate_parameter(df_arm_swing_params[arm_swing_parameter], aggregate)
+    for segment_cat_range in segment_cats:
+        segment_cat_str = f'{segment_cat_range[0]}_{segment_cat_range[1]}'
+        cat_segments = [
+            x for x in segment_meta.keys()
+            if segment_meta[x]['duration_unfiltered_segment_s'] >= segment_cat_range[0]
+            and segment_meta[x]['duration_unfiltered_segment_s'] < segment_cat_range[1]
+        ]
+
+        if len(cat_segments) > 0:
+            # For each segment, use 'duration_filtered_segment_s' if present, else 'duration_unfiltered_segment_s'
+            aggregated_results[segment_cat_str] = {
+                'duration_s': sum(
+                    [
+                        segment_meta[x]['duration_filtered_segment_s']
+                        if 'duration_filtered_segment_s' in segment_meta[x]
+                        else segment_meta[x]['duration_unfiltered_segment_s']
+                        for x in cat_segments
+                    ]
+                )}
+
+            df_arm_swing_params_cat = df_arm_swing_params.loc[df_arm_swing_params[DataColumns.SEGMENT_NR].isin(cat_segments)]
+
+            # Aggregate across all segments
+            aggregates_per_segment = ['median', 'mean']
+
+            for arm_swing_parameter in arm_swing_parameters:
+                for aggregate in aggregates:
+                    if aggregate in ['std', 'cov']:
+                        per_segment_agg = []
+                        # If the aggregate is 'cov' (coefficient of variation), we also compute the mean and standard deviation per segment
+                        segment_groups = dict(tuple(df_arm_swing_params_cat.groupby(DataColumns.SEGMENT_NR)))
+                        for segment_nr in cat_segments:
+                            segment_df = segment_groups.get(segment_nr)
+                            if segment_df is not None:
+                                per_segment_agg.append(aggregate_parameter(segment_df[arm_swing_parameter], aggregate))
+
+                        # Drop nans
+                        per_segment_agg = np.array(per_segment_agg)
+                        per_segment_agg = per_segment_agg[~np.isnan(per_segment_agg)]
+
+
+                        for segment_level_aggregate in aggregates_per_segment:
+                            aggregated_results[segment_cat_str][f'{segment_level_aggregate}_{aggregate}_{arm_swing_parameter}'] = aggregate_parameter(per_segment_agg, segment_level_aggregate)
+                    else:
+                        aggregated_results[segment_cat_str][f'{aggregate}_{arm_swing_parameter}'] = aggregate_parameter(df_arm_swing_params_cat[arm_swing_parameter], aggregate)
+
+        else:
+            # If no segments are found for this category, initialize with NaN
+            aggregated_results[segment_cat_str] = {
+                'duration_s': 0,
+            }
 
     return aggregated_results
 
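The reworked `aggregate_arm_swing_params` drops categorical segment labels in favour of duration bins passed in via `segment_cats`, and sums the filtered duration per bin when it is available. A minimal, self-contained sketch of that binning logic, with made-up segment numbers, column names, and bin edges (the package resolves column names via `DataColumns`, and callers choose the bins):

```python
import numpy as np
import pandas as pd

# Illustrative metadata: segment 1 was filtered down to 5 s within an 8 s
# unfiltered gait segment; segment 2 carries only an unfiltered duration.
segment_meta = {
    1: {'duration_unfiltered_segment_s': 8.0, 'duration_filtered_segment_s': 5.0},
    2: {'duration_unfiltered_segment_s': 45.0},
}
df_params = pd.DataFrame({'segment_nr': [1, 1, 2],
                          'range_of_motion': [22.0, 25.0, 31.0]})

# Each tuple bounds the duration of the unfiltered gait segment in seconds
segment_cats = [(0, 10), (10, 30), (30, np.inf)]

aggregated = {}
for lo, hi in segment_cats:
    cat_segments = [s for s, m in segment_meta.items()
                    if lo <= m['duration_unfiltered_segment_s'] < hi]
    rows = df_params[df_params['segment_nr'].isin(cat_segments)]
    aggregated[f'{lo}_{hi}'] = {
        # Prefer the filtered duration when present, as in the hunk above
        'duration_s': sum(m.get('duration_filtered_segment_s',
                                m['duration_unfiltered_segment_s'])
                          for s, m in segment_meta.items() if s in cat_segments),
        'median_range_of_motion': (float(rows['range_of_motion'].median())
                                   if not rows.empty else np.nan),
    }

print(aggregated)  # keys '0_10', '10_30', '30_inf'; the middle bin stays empty
```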
{paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/pipelines/tremor_pipeline.py

@@ -2,7 +2,6 @@ import pandas as pd
 import numpy as np
 from pathlib import Path
 from scipy import signal
-from scipy.stats import gaussian_kde
 
 from paradigma.classification import ClassifierPackage
 from paradigma.constants import DataColumns
@@ -191,7 +190,7 @@ def aggregate_tremor(df: pd.DataFrame, config: TremorConfig):
     if n_windows_tremor == 0: # if no tremor is detected, the tremor power measures are set to NaN
 
         aggregated_tremor_power['median_tremor_power'] = np.nan
-        aggregated_tremor_power['modal_tremor_power'] = np.nan
+        aggregated_tremor_power['mode_binned_tremor_power'] = np.nan
         aggregated_tremor_power['90p_tremor_power'] = np.nan
 
     else:
@@ -202,16 +201,8 @@
 
         for aggregate in config.aggregates_tremor_power:
             aggregate_name = f"{aggregate}_tremor_power"
-            if aggregate == 'mode':
-                # calculate modal tremor power
-                bin_edges = np.linspace(0, 6, 301)
-                kde = gaussian_kde(tremor_power)
-                kde_values = kde(bin_edges)
-                max_index = np.argmax(kde_values)
-                aggregated_tremor_power['modal_tremor_power'] = bin_edges[max_index]
-            else: # calculate te other aggregates (e.g. median and 90th percentile) of tremor power
-                aggregated_tremor_power[aggregate_name] = aggregate_parameter(tremor_power, aggregate)
-
+            aggregated_tremor_power[aggregate_name] = aggregate_parameter(tremor_power, aggregate, config.evaluation_points_tremor_power)
+
     # store aggregates in json format
     d_aggregates = {
         'metadata': {
@@ -222,7 +213,7 @@
         'aggregated_tremor_measures': {
             'perc_windows_tremor': perc_windows_tremor,
             'median_tremor_power': aggregated_tremor_power['median_tremor_power'],
-            'modal_tremor_power': aggregated_tremor_power['modal_tremor_power'],
+            'modal_tremor_power': aggregated_tremor_power['mode_binned_tremor_power'],
             '90p_tremor_power': aggregated_tremor_power['90p_tremor_power']
         }
     }
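
The inline KDE block removed here is behaviourally equivalent to the new `aggregate_parameter(tremor_power, 'mode_binned', ...)` call, with the evaluation grid now supplied by `TremorConfig`. A self-contained check on synthetic data (the gamma-distributed sample below is invented for illustration):

```python
import numpy as np
from scipy.stats import gaussian_kde

tremor_power = np.random.default_rng(0).gamma(shape=2.0, scale=0.5, size=200)
evaluation_points = np.linspace(0, 6, 301)  # as in TremorConfig

# Same computation as the removed block: evaluate the KDE on a fixed grid
# and take the grid point of maximum density as the modal tremor power.
kde_values = gaussian_kde(tremor_power)(evaluation_points)
print(float(evaluation_points[np.argmax(kde_values)]))
```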
{paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/preprocessing.py

@@ -39,7 +39,7 @@ def resample_data(
     tolerance : float, optional
         The tolerance added to the expected difference when checking
         for contiguous timestamps. If not provided, it defaults to
-        twice the expected interval.
+        three times the expected interval.
 
     Returns
     -------
@@ -57,9 +57,9 @@
     - Uses cubic interpolation for smooth resampling if there are enough points.
     - If only two timestamps are available, it falls back to linear interpolation.
     """
-    # Set default tolerance if not provided to twice the expected interval
+    # Set default tolerance if not provided to three times the expected interval
     if tolerance is None:
-        tolerance = 2 * 1 / sampling_frequency
+        tolerance = 3 * 1 / sampling_frequency
 
     # Extract time and values
     time_abs_array = np.array(df[time_column])
@@ -208,7 +208,7 @@ def preprocess_imu_data(df: pd.DataFrame, config: IMUConfig, sensor: str, watch_
         time_column=DataColumns.TIME,
         values_column_names=values_colnames,
         sampling_frequency=config.sampling_frequency,
-        resampling_frequency=config.sampling_frequency
+        resampling_frequency=config.resampling_frequency
     )
 
     # Invert the IMU data if the watch was worn on the right wrist
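
The default tolerance for the contiguity check rises from two to three expected sample intervals. Working the arithmetic at the toolbox's 100 Hz rate, under the assumption (from the docstring wording) that the tolerance is added to the expected difference between timestamps:

```python
sampling_frequency = 100                      # Hz, as in IMUConfig
expected_interval = 1 / sampling_frequency    # 0.01 s between samples
tolerance = 3 * 1 / sampling_frequency        # new default: 0.03 s (was 0.02 s)

# Assumed semantics: a gap counts as non-contiguous once it exceeds the
# expected interval plus the tolerance.
print(expected_interval, tolerance, expected_interval + tolerance)  # 0.01 0.03 0.04
```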
{paradigma-1.0.1 → paradigma-1.0.3}/src/paradigma/util.py

@@ -3,7 +3,8 @@ import numpy as np
 import pandas as pd
 from datetime import datetime, timedelta
 from dateutil import parser
-from typing import List, Tuple
+from typing import List, Tuple, Optional
+from scipy.stats import gaussian_kde
 
 import tsdf
 from tsdf import TSDFMetadata
@@ -316,7 +317,7 @@ def invert_watch_side(df: pd.DataFrame, side: str, sensor='both') -> np.ndarray:
 
     return df
 
-def aggregate_parameter(parameter: np.ndarray, aggregate: str) -> np.ndarray:
+def aggregate_parameter(parameter: np.ndarray, aggregate: str, evaluation_points: Optional[np.ndarray] = None) -> np.ndarray | int:
     """
     Aggregate a parameter based on the specified method.
 
@@ -327,7 +328,11 @@
 
     aggregate : str
         The aggregation method to apply.
-
+
+    evaluation_points : np.ndarray, optional
+        Should be specified if the mode is derived for a continuous parameter.
+        Defines the evaluation points for the kernel density estimation function, from which the maximum is derived as the mode.
+
     Returns
     -------
     np.ndarray
@@ -337,6 +342,14 @@
         return np.mean(parameter)
     elif aggregate == 'median':
         return np.median(parameter)
+    elif aggregate == 'mode_binned':
+        if evaluation_points is None:
+            raise ValueError("evaluation_points must be provided for 'mode_binned' aggregation.")
+        else:
+            kde = gaussian_kde(parameter)
+            kde_values = kde(evaluation_points)
+            max_index = np.argmax(kde_values)
+            return evaluation_points[max_index]
     elif aggregate == 'mode':
         unique_values, counts = np.unique(parameter, return_counts=True)
         return unique_values[np.argmax(counts)]
@@ -348,6 +361,9 @@
         return np.percentile(parameter, 99)
     elif aggregate == 'std':
         return np.std(parameter)
+    elif aggregate == 'cov':
+        mean_value = np.mean(parameter)
+        return np.std(parameter) / mean_value if mean_value != 0 else 0
     else:
         raise ValueError(f"Invalid aggregation method: {aggregate}")
 
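The new `'cov'` branch returns the coefficient of variation (standard deviation divided by the mean), falling back to 0 when the mean is zero. A quick standalone illustration with made-up values:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])
mean_value = np.mean(values)
# Same guard as above: return 0 rather than dividing by a zero mean
cov = np.std(values) / mean_value if mean_value != 0 else 0
print(round(float(cov), 3))  # 0.447 (population std 1.118 / mean 2.5)
```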
@@ -483,7 +499,7 @@ def select_days(df: pd.DataFrame, min_hours_per_day: int) -> pd.DataFrame:
     """
 
     min_s_per_day = min_hours_per_day * 3600
-    window_length_s = df['time_dt'].diff().dt.total_seconds()[1] # determine the length of the first window in seconds
+    window_length_s = df['time_dt'].diff().dt.total_seconds().iloc[1] # determine the length of the first window in seconds
     min_windows_per_day = min_s_per_day / window_length_s
     df_subset = df.groupby(df['time_dt'].dt.date).filter(lambda x: len(x) >= min_windows_per_day)
 
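The `select_days` fix swaps integer indexing for positional `.iloc`. On a pandas Series, `s[1]` is a label lookup that only coincides with "second element" under a default `RangeIndex`; `.iloc[1]` is unambiguously positional, as this standalone snippet shows:

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0], index=[5, 6, 7])
print(s.iloc[1])  # 20.0 (the second element, regardless of the index labels)
# s[1] would raise KeyError here: there is no label 1 in the index
```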