tsgap 0.3.0__tar.gz

@@ -0,0 +1,96 @@
1
+ # Changelog
2
+
3
+ ## Version 0.1.1 - Critical Fixes
4
+
5
+ ### Critical Bug Fixes
6
+
7
+ #### 1. Calibration Bracketing Logic (Fixed) ⚠️
8
+ - **Before**: Bound expansion logic was reversed, causing calibration to fail for extreme rates
9
+ - **After**: Bracketing is correct: bounds now expand in the proper direction
10
+ - **Impact**: Now handles extreme missing rates (1%, 90%) correctly
11
+
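The bracketing fix can be illustrated with a minimal sketch. It assumes missingness probabilities of the form sigmoid(score + offset) and a hypothetical `calibrate_offset` helper; the names, defaults, and the logistic form are assumptions for illustration, not the package's actual `_calibrate_offset()`. Because the expected rate increases with the offset, the lower bound moves down while its rate is still above the target, and the upper bound moves up while its rate is still below it.

```python
import numpy as np


def expected_rate(scores, offset):
    """Mean of sigmoid(scores + offset), written with tanh to avoid overflow."""
    return float(np.mean(0.5 * (1.0 + np.tanh(0.5 * (scores + offset)))))


def calibrate_offset(scores, target_rate, lo=-10.0, hi=10.0, tol=1e-6):
    """Hypothetical calibration: find an offset whose expected rate hits the target."""
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must lie strictly between 0 and 1")

    # expected_rate increases with the offset, so a valid bracket needs
    # expected_rate(lo) <= target_rate <= expected_rate(hi).
    width = hi - lo
    for _ in range(60):
        if expected_rate(scores, lo) > target_rate:
            lo -= width  # lower bound still too high: expand downward
        elif expected_rate(scores, hi) < target_rate:
            hi += width  # upper bound still too low: expand upward
        else:
            break
        width *= 2.0

    # Plain bisection inside the bracket.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_rate(scores, mid) < target_rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Expanding in the wrong direction leaves the target outside the interval, which is one way the reversed logic described above could fail at extreme rates.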
12
+ #### 2. MAR Normalization for 3D Data (Improved)
13
+ - **Before**: Global normalization across all participants
14
+ - **After**: Per-participant normalization for 3D data
15
+ - **Impact**: More consistent MAR behavior across subjects with different scales
16
+
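A rough sketch of per-participant normalization follows. It assumes a 3D array laid out as (participants, time, features) and z-score normalization; the changelog specifies neither the axis layout nor the statistic, so both are assumptions, and `normalize_per_participant` is a hypothetical name.

```python
import numpy as np


def normalize_per_participant(x, eps=1e-8):
    """Z-score each participant's series separately (assumed layout: P x T x F)."""
    x = np.asarray(x, dtype=float)
    if x.ndim != 3:
        raise ValueError("expected a 3D array (participants, time, features)")
    # Statistics are taken along the time axis only, so one participant's
    # scale never influences another's. NaN-aware reductions skip gaps.
    mean = np.nanmean(x, axis=1, keepdims=True)
    std = np.nanstd(x, axis=1, keepdims=True)
    # Guard against constant signals, whose standard deviation is zero.
    return (x - mean) / np.maximum(std, eps)
```

With global statistics, a participant whose values sit on a much larger scale would dominate the normalized scores; computing the statistics per participant puts every subject on the same footing.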
17
+ #### 3. Base Rate Handling (Fixed)
18
+ - **Before**: `base_rate` could conflict with low `missing_rate`
19
+ - **After**: Automatically capped at `missing_rate * 0.5`
20
+ - **Impact**: Prevents calibration issues with very low missing rates
21
+
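The cap itself is a one-liner. The sketch below uses a hypothetical helper name, `effective_base_rate`; only the `missing_rate * 0.5` rule comes from the changelog.

```python
def effective_base_rate(base_rate, missing_rate):
    """Cap the base rate at half of the requested missing rate."""
    return min(base_rate, missing_rate * 0.5)
```

For example, a `base_rate` of 0.05 combined with a `missing_rate` of 0.02 would be reduced to 0.01, while a `base_rate` already below the cap is left unchanged.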
22
+ #### 4. Probability Zeroing (Improved)
23
+ - **Before**: Non-eligible positions handled after sampling
24
+ - **After**: Probabilities zeroed before sampling for cleaner semantics
25
+ - **Impact**: Slightly faster and more explicit logic
26
+
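Zeroing before sampling might look like the sketch below. The function name `sample_missing` and its arguments are illustrative assumptions; the point is only that a position with probability 0 can never be drawn, so no clean-up pass is needed afterwards.

```python
import numpy as np


def sample_missing(prob, eligible, rng):
    """Draw positions to mask; non-eligible positions get probability 0 first."""
    p = np.where(eligible, prob, 0.0)
    # rng.random() returns values in [0, 1), so the comparison is never
    # true where p == 0.
    return rng.random(p.shape) < p
```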
27
+ ### Testing
28
+
29
+ Added extreme rate testing:
30
+ - Low rates: 1%, 2%, 5% - all within 0.4% of target
31
+ - High rates: 50%, 70%, 90% - all within 1.3% of target
32
+ - MCAR remains exact at all rates
33
+
34
+ ---
35
+
36
+ ## Version 0.1.0 - Initial Release
37
+
38
+ ### Key Improvements
39
+
40
+ #### 1. Reproducibility (Fixed)
41
+ - **Before**: Used global `np.random.seed()` and direct calls to `np.random.choice()`, `np.random.rand()`
42
+ - **After**: Uses `np.random.Generator` with seed passed through all functions
43
+ - **Impact**: Fully reproducible results with same seed across all mechanisms
44
+
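The sketch below shows the general pattern of passing a generator through a function instead of touching global state. `mcar_mask` is a hypothetical stand-in, not the package's function; `np.random.default_rng` accepts `None`, an integer seed, or an existing `Generator`, which it returns unchanged.

```python
import numpy as np


def mcar_mask(shape, missing_rate, rng=None):
    """Return a mask (True=observed) with an exact number of missing entries."""
    rng = np.random.default_rng(rng)  # seed, Generator, or None
    n = int(np.prod(shape))
    k = int(round(missing_rate * n))
    mask = np.ones(n, dtype=bool)
    # Sampling without replacement makes the missing count exact.
    mask[rng.choice(n, size=k, replace=False)] = False
    return mask.reshape(shape)
```

Two calls with the same seed produce identical masks, and nothing in the function alters the state of the legacy global `np.random` functions.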
45
+ #### 2. Consistent Mask Semantics (Fixed)
46
+ - **Before**: MCAR didn't mark existing NaNs as False in returned mask
47
+ - **After**: All mechanisms consistently set `mask[existing_nans] = False`
48
+ - **Impact**: Uniform mask interpretation across all mechanisms
49
+
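As a small illustration of the mask convention (True=observed, False=missing), the sketch below marks pre-existing NaNs as missing in a returned mask. The helper name `finalize_mask` is an assumption for illustration.

```python
import numpy as np


def finalize_mask(x, mask):
    """Return a copy of mask in which entries that are NaN in x are False."""
    out = np.array(mask, dtype=bool)  # np.array copies, so the input is untouched
    out[np.isnan(x)] = False
    return out
```

With this convention, `x[~mask]` always selects every unobserved entry, whether it was missing originally or masked by the simulation.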
50
+ #### 3. Target Dimension Support (Added)
51
+ - **Before**: Only MCAR supported `target` parameter
52
+ - **After**: MAR and MNAR now support `target` parameter
53
+ - **Impact**: Can selectively mask specific dimensions while others drive missingness
54
+
55
+ #### 4. Improved Calibration (Enhanced)
56
+ - **Before**: Fixed bounds [-10, 10] for binary search
57
+ - **After**: Automatic bound expansion with bracketing
58
+ - **Impact**: Handles extreme missing rates (e.g., 0.01, 0.99) correctly
59
+
60
+ #### 5. Eligible Position Handling (Fixed)
61
+ - **Before**: MAR/MNAR calibrated over all positions including existing NaNs
62
+ - **After**: Calibration only considers eligible (non-NaN) positions
63
+ - **Impact**: More accurate missing rate achievement on real datasets with existing NaNs
64
+
65
+ #### 6. Performance Optimization (Improved)
66
+ - **Before**: Nested loops for MNAR normalization (slow on large datasets)
67
+ - **After**: Vectorized normalization using numpy broadcasting
68
+ - **Impact**: ~10-100x faster on large 3D arrays
69
+
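The difference between the two approaches can be sketched as follows, again assuming z-score normalization along the time axis of a (participants, time, features) array; the actual statistic and layout used by the package are not stated here. Both functions compute the same result, but the second replaces the Python loops with reductions that use `keepdims=True` so the statistics broadcast back against the input.

```python
import numpy as np


def zscore_loops(x, eps=1e-8):
    """Reference version: one Python-level iteration per participant and feature."""
    out = np.empty_like(x, dtype=float)
    for p in range(x.shape[0]):
        for f in range(x.shape[2]):
            series = x[p, :, f]
            out[p, :, f] = (series - series.mean()) / max(series.std(), eps)
    return out


def zscore_broadcast(x, eps=1e-8):
    """Vectorized version: statistics of shape (P, 1, F) broadcast over time."""
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, keepdims=True)
    return (x - mean) / np.maximum(std, eps)
```

The speed-up depends on array shape and hardware, so no timing is claimed for this sketch.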
70
+ #### 7. Code Organization (Refactored)
71
+ - Added helper functions:
72
+ - `_get_eligible_mask()`: Unified eligible position logic
73
+ - `_calibrate_offset()`: Reusable calibration with auto-bracketing
74
+ - **Impact**: Cleaner code, easier to maintain and extend
75
+
76
+ ### API Changes
77
+
78
+ All changes are backward compatible. New optional parameters:
79
+ - `rng`: `np.random.Generator` for explicit RNG control
80
+ - `target`: Now supported in MAR and MNAR (was MCAR-only)
81
+
82
+ ### Testing
83
+
84
+ All 17 unit tests pass:
85
+ - MCAR exact rate control
86
+ - MAR/MNAR approximate rate control
87
+ - Reproducibility with seeds
88
+ - Edge cases (constant signals, existing NaNs)
89
+ - Block missingness patterns
90
+
91
+ ### Documentation
92
+
93
+ - Updated docstrings with clearer parameter descriptions
94
+ - Added notes about missing rate being applied to eligible entries
95
+ - Clarified mask semantics (True=observed, False=missing)
96
+ - Added mathematical formulations for each mechanism in README
@@ -0,0 +1,21 @@
1
+ cff-version: 1.2.0
2
+ message: "If you use this software, please cite it as below."
3
+ authors:
4
+ - family-names: "Oripov"
5
+ given-names: "Feruz"
6
+ orcid: "https://orcid.org/0009-0001-4303-0512"
7
+ title: "TSGap: Composable Time-Series Missingness Simulation"
8
+ version: 0.3.0
9
+ date-released: 2026-03-16
10
+ url: "https://github.com/feruzoripov/tsgap"
11
+ license: MIT
12
+ repository-code: "https://github.com/feruzoripov/tsgap"
13
+ keywords:
14
+ - time series
15
+ - missing data
16
+ - imputation
17
+ - benchmarking
18
+ - simulation
19
+ - MCAR
20
+ - MAR
21
+ - MNAR
@@ -0,0 +1,56 @@
1
+ # Contributing to TSGap
2
+
3
+ Thank you for your interest in contributing to TSGap.
4
+
5
+ ## Getting Started
6
+
7
+ 1. Fork the repository
8
+ 2. Clone your fork: `git clone https://github.com/YOUR_USERNAME/tsgap.git`
9
+ 3. Create a virtual environment: `python -m venv .env && source .env/bin/activate`
10
+ 4. Install in development mode: `pip install -e ".[dev]"`
11
+
12
+ ## Running Tests
13
+
14
+ ```bash
15
+ pytest tsgap/tests/ -v
16
+ ```
17
+
18
+ All tests must pass before submitting a pull request.
19
+
20
+ ## Adding a New Pattern
21
+
22
+ 1. Implement your pattern function in `tsgap/patterns.py` following the existing signature:
23
+ ```python
24
+ def apply_my_pattern(mask, shape, rng=None, **kwargs):
25
+ ...
26
+ return modified_mask
27
+ ```
28
+ 2. Register it in the `PATTERNS` dictionary at the bottom of the file.
29
+ 3. Add tests in `tsgap/tests/test_missingness.py`.
30
+ 4. Update the docstring in `tsgap/core.py` and `README.md`.
31
+
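As an illustration of step 1, here is a sketch of a hypothetical pattern that follows the signature shown above. The name `apply_single_gap`, the `gap_length` keyword, and the assumption that the first axis is time are all inventions for this example; the existing functions in `tsgap/patterns.py` remain the authoritative reference for how patterns should treat the mask.

```python
from __future__ import annotations

import numpy as np


def apply_single_gap(mask, shape, rng=None, gap_length: int = 5, **kwargs):
    """Hypothetical pattern: mark one contiguous run of time steps as missing.

    Parameters
    ----------
    mask : array of bool, True=observed.
    shape : tuple of int, shape of the data; the first axis is assumed to be time.
    rng : seed, np.random.Generator, or None.
    gap_length : int, number of consecutive time steps to mask.

    Returns
    -------
    np.ndarray of bool with the same shape, True=observed.
    """
    rng = np.random.default_rng(rng)
    out = np.array(mask, dtype=bool).reshape(shape)  # copy; input is not modified
    n_steps = shape[0]
    gap_length = min(gap_length, n_steps)
    # rng.integers excludes the upper bound, so the gap always fits.
    start = int(rng.integers(0, n_steps - gap_length + 1))
    out[start:start + gap_length] = False
    return out
```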
32
+ ## Adding a New Mechanism
33
+
34
+ 1. Implement in `tsgap/mechanisms.py` following the existing signature.
35
+ 2. Register it in the `MECHANISMS` dictionary.
36
+ 3. Add tests and update documentation.
37
+
38
+ ## Code Style
39
+
40
+ - Use type hints (`from __future__ import annotations`)
41
+ - Include docstrings with Parameters/Returns sections (NumPy style)
42
+ - Keep functions focused and composable
43
+
44
+ ## Reporting Issues
45
+
46
+ Open an issue on GitHub with:
47
+ - A minimal reproducible example
48
+ - Expected vs. actual behavior
49
+ - Python and NumPy versions
50
+
51
+ ## Pull Requests
52
+
53
+ - One feature per PR
54
+ - Include tests for new functionality
55
+ - Update documentation as needed
56
+ - Ensure all existing tests still pass
tsgap-0.3.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Feruz Oripov
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,6 @@
1
+ include LICENSE
2
+ include README.md
3
+ include CHANGELOG.md
4
+ include CITATION.cff
5
+ include CONTRIBUTING.md
6
+ recursive-include tsgap *.py