ssbc 0.1.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,266 @@
+ Metadata-Version: 2.4
+ Name: ssbc
+ Version: 0.1.0
+ Summary: Small Sample Beta Correction - PAC guarantees with small datasets
+ Author-email: Petrus H Zwart <phzwart@lbl.gov>
+ Maintainer-email: Petrus H Zwart <phzwart@lbl.gov>
+ License-Expression: MIT
+ Project-URL: bugs, https://github.com/phzwart/ssbc/issues
+ Project-URL: changelog, https://github.com/phzwart/ssbc/blob/master/changelog.md
+ Project-URL: homepage, https://github.com/phzwart/ssbc
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Science/Research
+ Classifier: Intended Audience :: Developers
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Scientific/Engineering
+ Classifier: Topic :: Scientific/Engineering :: Mathematics
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: matplotlib
+ Requires-Dist: numpy
+ Requires-Dist: pandas
+ Requires-Dist: plotly
+ Requires-Dist: rich
+ Requires-Dist: scipy
+ Requires-Dist: typer
+ Provides-Extra: test
+ Requires-Dist: coverage; extra == "test"
+ Requires-Dist: pytest; extra == "test"
+ Requires-Dist: pytest-cov; extra == "test"
+ Requires-Dist: ruff; extra == "test"
+ Requires-Dist: ty; extra == "test"
+ Requires-Dist: ipdb; extra == "test"
+ Provides-Extra: dev
+ Requires-Dist: pre-commit; extra == "dev"
+ Requires-Dist: bandit[toml]; extra == "dev"
+ Dynamic: license-file
+
+ # SSBC: Small-Sample Beta Correction
+
+ ![PyPI version](https://img.shields.io/pypi/v/ssbc.svg)
+ [![Documentation Status](https://readthedocs.org/projects/ssbc/badge/?version=latest)](https://ssbc.readthedocs.io/en/latest/?version=latest)
+
+ **Small-Sample Beta Correction** provides PAC (Probably Approximately Correct) guarantees for conformal prediction with small calibration sets.
+
+ * PyPI package: https://pypi.org/project/ssbc/
+ * Free software: MIT License
+ * Documentation: https://ssbc.readthedocs.io
+
+ ## Overview
+
+ SSBC addresses the challenge of constructing valid prediction sets when calibration data is limited. Standard conformal prediction guarantees coverage only on average over calibration sets, so with a small calibration set the realized coverage can fall below the target. SSBC provides a finite-sample correction with PAC guarantees.
+
+ ### Key Features
+
+ - ✅ **Small-Sample Correction**: PAC-valid conformal prediction for small calibration sets
+ - ✅ **Mondrian Conformal Prediction**: Per-class calibration for handling class imbalance
+ - ✅ **Comprehensive Statistics**: Detailed reporting with Clopper-Pearson confidence intervals
+ - ✅ **Hyperparameter Tuning**: Interactive parallel coordinates visualization for parameter optimization
+ - ✅ **Simulation Tools**: Built-in data generators for testing and validation
+
+ ## Installation
+
+ ```bash
+ pip install ssbc
+ ```
+
+ Or from source:
+
+ ```bash
+ git clone https://github.com/phzwart/ssbc.git
+ cd ssbc
+ pip install -e .
+ ```
+
+ ## Quick Start
+
+ ```python
+ import numpy as np
+ from ssbc import (
+     ssbc_correct,
+     BinaryClassifierSimulator,
+     split_by_class,
+     mondrian_conformal_calibrate,
+     report_prediction_stats,
+ )
+
+ # 1. Generate simulated data
+ sim = BinaryClassifierSimulator(
+     p_class1=0.1,
+     beta_params_class0=(2, 8),
+     beta_params_class1=(8, 2),
+     seed=42
+ )
+ labels, probs = sim.generate(n_samples=100)
+
+ # 2. Split by class for Mondrian CP
+ class_data = split_by_class(labels, probs)
+
+ # 3. Calibrate with SSBC correction
+ cal_result, pred_stats = mondrian_conformal_calibrate(
+     class_data=class_data,
+     alpha_target=0.10,  # 10% miscoverage
+     delta=0.10,         # 90% PAC guarantee
+     mode="beta"
+ )
+
+ # 4. Generate comprehensive report
+ summary = report_prediction_stats(pred_stats, cal_result, verbose=True)
+ ```
+
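For readers new to Mondrian (class-conditional) conformal prediction, here is a minimal sketch of per-class threshold calibration, without any SSBC correction. It is not the code behind `mondrian_conformal_calibrate`. The helper names, the nonconformity score `1 - p(true class)`, and the assumption that `probs` holds the predicted probability of class 1 are all illustrative.

```python
import math


def class_threshold(scores, alpha):
    """Conformal threshold for one class: the k-th smallest score,
    with k = ceil((n + 1) * (1 - alpha))."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial
        # threshold (accept every score) is valid.
        return math.inf
    return sorted(scores)[k - 1]


def mondrian_thresholds(labels, probs_class1, alpha):
    """Calibrate one threshold per class (hypothetical helper)."""
    scores = {0: [], 1: []}
    for y, p1 in zip(labels, probs_class1):
        p_true = p1 if y == 1 else 1.0 - p1
        scores[y].append(1.0 - p_true)  # assumed nonconformity score
    return {c: class_threshold(s, alpha) for c, s in scores.items()}


labels = [0, 0, 0, 0, 1, 1, 1, 1]
probs = [0.1, 0.2, 0.3, 0.6, 0.9, 0.8, 0.7, 0.4]
thresholds = mondrian_thresholds(labels, probs, alpha=0.25)
print(thresholds)
```

Calibrating each class separately is what lets a rare class (here `p_class1=0.1`) get its own coverage target instead of being swamped by the majority class.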
+ ## Core Algorithm: SSBC
+
+ The SSBC algorithm finds the optimal corrected miscoverage rate α' that satisfies:
+
+ **P(Coverage(α') ≥ 1 - α_target) ≥ 1 - δ**
+
+ ```python
+ from ssbc import ssbc_correct
+
+ result = ssbc_correct(
+     alpha_target=0.10,  # Target 10% miscoverage
+     n=50,               # Calibration set size
+     delta=0.10,         # PAC parameter (90% confidence)
+     mode="beta"         # Infinite test window
+ )
+
+ print(f"Corrected α: {result.alpha_corrected:.4f}")
+ print(f"u*: {result.u_star}")
+ ```
+
+ ### Parameters
+
+ - `alpha_target`: Target miscoverage rate (e.g., 0.10 for 90% coverage)
+ - `n`: Calibration set size
+ - `delta`: PAC risk tolerance (probability of violating guarantee)
+ - `mode`: "beta" (infinite test) or "beta-binomial" (finite test)
+
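To make the PAC condition concrete, here is a rough sketch of how a corrected rate could be found in the infinite-test ("beta") setting. It relies on the standard split-conformal result that, for exchangeable continuous scores and a threshold corresponding to α' = u/(n+1), the coverage conditional on the calibration set follows Beta(n+1-u, u). For integer parameters that Beta tail equals a binomial tail, P(Beta(n+1-u, u) ≥ 1-α) = P(Bin(n, α) ≥ u), so only the standard library is needed. The function names are hypothetical, and whether `ssbc_correct` performs exactly this search is an assumption.

```python
from math import comb


def binom_tail(n, p, u):
    """P(X >= u) for X ~ Binomial(n, p); fine for small to moderate n."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(u, n + 1))


def beta_correction_sketch(alpha_target, n, delta):
    """Largest u with P(Coverage(u / (n + 1)) >= 1 - alpha_target) >= 1 - delta.

    Returns (u_star, alpha_corrected). u_star = 0 means that only the
    trivial choice alpha' = 0 satisfies the condition.
    """
    u_star = 0
    for u in range(1, n + 1):
        # The tail probability decreases in u, so stop at the first failure.
        if binom_tail(n, alpha_target, u) >= 1 - delta:
            u_star = u
        else:
            break
    return u_star, u_star / (n + 1)


u_star, alpha_corrected = beta_correction_sketch(alpha_target=0.10, n=50, delta=0.10)
print(f"u* = {u_star}, corrected alpha = {alpha_corrected:.4f}")
```

In this sketch the corrected rate is always below the target and approaches it as `n` grows, which is the behavior a small-sample correction is meant to have.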
+ ## Module Structure
+
+ The library is organized into focused modules:
+
+ ### Core Modules
+
+ - **`ssbc.core`**: Core SSBC algorithm (`ssbc_correct`, `SSBCResult`)
+ - **`ssbc.conformal`**: Mondrian conformal prediction (`mondrian_conformal_calibrate`, `split_by_class`)
+ - **`ssbc.statistics`**: Statistical utilities (`clopper_pearson_intervals`, `cp_interval`)
+
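For background on the Clopper-Pearson interval named above, here is a self-contained sketch of the exact binomial interval, computed by bisection on the binomial CDF. It is an independent illustration and not the code in `ssbc.statistics`; the function names and the `confidence` parameter are assumptions.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iters=100):
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, confidence=0.95):
    """Exact two-sided interval for a binomial proportion, k successes in n trials."""
    a = (1 - confidence) / 2
    # Lower bound: the p at which P(X >= k) equals a.
    lower = 0.0 if k == 0 else _bisect_decreasing(
        lambda p: a - (1 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P(X <= k) equals a.
    upper = 1.0 if k == n else _bisect_decreasing(
        lambda p: binom_cdf(k, n, p) - a
    )
    return lower, upper


lo, hi = clopper_pearson(5, 10)
print(f"5/10 successes: [{lo:.4f}, {hi:.4f}]")
```

The interval is conservative by construction, which makes it a natural companion to small calibration sets where normal approximations are unreliable.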
+ ### Analysis & Visualization
+
+ - **`ssbc.visualization`**: Reporting and plotting (`report_prediction_stats`, `plot_parallel_coordinates_plotly`)
+ - **`ssbc.hyperparameter`**: Parameter tuning (`sweep_hyperparams_and_collect`, `sweep_and_plot_parallel_plotly`)
+
+ ### Testing & Simulation
+
+ - **`ssbc.simulation`**: Data generators (`BinaryClassifierSimulator`)
+
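As an illustration of what a Beta-based binary simulator might do, the sketch below draws a label with probability `p_class1` and then a predicted class-1 probability from a class-specific Beta distribution. This is a guess at the idea behind `BinaryClassifierSimulator`, not its implementation; the function name and the interpretation of the parameters are assumptions.

```python
import random


def simulate_binary_scores(n_samples, p_class1, beta_params_class0,
                           beta_params_class1, seed=None):
    """Return (labels, probs): probs[i] is a simulated P(class 1) for sample i."""
    rng = random.Random(seed)
    labels, probs = [], []
    for _ in range(n_samples):
        y = 1 if rng.random() < p_class1 else 0
        a, b = beta_params_class1 if y == 1 else beta_params_class0
        labels.append(y)
        probs.append(rng.betavariate(a, b))  # score drawn from Beta(a, b)
    return labels, probs


labels, probs = simulate_binary_scores(
    n_samples=1000,
    p_class1=0.1,
    beta_params_class0=(2, 8),  # mean 0.2: class 0 mostly scores low
    beta_params_class1=(8, 2),  # mean 0.8: class 1 mostly scores high
    seed=42,
)
print(sum(labels), "class-1 samples out of", len(labels))
```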
+ ## Examples
+
+ The `examples/` directory in the source repository contains comprehensive demonstrations:
+
+ ### 1. Core SSBC Algorithm
+ ```bash
+ python examples/ssbc_core_example.py
+ ```
+ Demonstrates the SSBC algorithm for different calibration set sizes.
+
+ ### 2. Mondrian Conformal Prediction
+ ```bash
+ python examples/mondrian_conformal_example.py
+ ```
+ Complete workflow: simulation → calibration → reporting.
+
+ ### 3. Hyperparameter Sweep
+ ```bash
+ python examples/hyperparameter_sweep_example.py
+ ```
+ Interactive parameter tuning with parallel coordinates visualization.
+
+ ## Hyperparameter Tuning
+
+ Sweep over α and δ values to find optimal configurations:
+
+ ```python
+ from ssbc import sweep_and_plot_parallel_plotly
+ import numpy as np
+
+ # Define grid
+ alpha_grid = np.arange(0.05, 0.20, 0.05)
+ delta_grid = np.arange(0.05, 0.20, 0.05)
+
+ # Run sweep and visualize
+ # (class_data is the output of split_by_class from the Quick Start)
+ df, fig = sweep_and_plot_parallel_plotly(
+     class_data=class_data,
+     alpha_0=alpha_grid, delta_0=delta_grid,
+     alpha_1=alpha_grid, delta_1=delta_grid,
+     color='err_all'  # Color by error rate
+ )
+
+ # Save interactive plot
+ fig.write_html("sweep_results.html")
+
+ # Analyze results
+ print(df[['a0', 'd0', 'cov', 'sing_rate', 'err_all']].head())
+ ```
+
+ The interactive plot allows you to:
+ - Brush (select) ranges on any axis to filter configurations
+ - Explore trade-offs between coverage, automation, and error rates
+ - Identify Pareto-optimal hyperparameter settings
+
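The size of such a sweep is the product of the four grid lengths, so it grows quickly. The sketch below enumerates the configurations with the standard library only. Because grids with a non-integer step are sensitive to floating-point rounding at the endpoint, it builds each grid from integer multiples and rounds the values. The helper `step_grid` is hypothetical, and the keys `a1` and `d1` are assumed by analogy with the `a0` and `d0` columns shown above.

```python
from itertools import product


def step_grid(start, stop, step, ndigits=2):
    """Half-open grid [start, stop) built from integer multiples of step."""
    count = round((stop - start) / step)
    return [round(start + i * step, ndigits) for i in range(count)]


alpha_grid = step_grid(0.05, 0.20, 0.05)
delta_grid = step_grid(0.05, 0.20, 0.05)

# One configuration per combination of per-class (alpha, delta) settings.
configs = [
    {"a0": a0, "d0": d0, "a1": a1, "d1": d1}
    for a0, d0, a1, d1 in product(alpha_grid, delta_grid, alpha_grid, delta_grid)
]
print(alpha_grid, len(configs))
```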
+ ## Understanding the Output
+
+ ### Per-Class Statistics (Conditioned on True Label)
+
+ For each class, the report shows:
+ - **Abstentions**: Empty prediction sets
+ - **Singletons**: Confident predictions (automated decisions)
+   - Correct: True label in singleton set
+   - Incorrect: True label not in singleton set
+ - **Doublets**: Both labels included (escalated to human review)
+
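The categories above can be illustrated with a short sketch that builds a binary prediction set from per-class thresholds and tallies the outcome by true label. The nonconformity score `1 - p(label)` and all function names are assumptions made for illustration; this is not the package's reporting code.

```python
from collections import Counter


def prediction_set(p1, thresholds):
    """Labels whose assumed score, 1 - p(label), is within that class's threshold."""
    scores = {0: p1, 1: 1.0 - p1}  # 1 - p(class 0) equals p1
    return {c for c, s in scores.items() if s <= thresholds[c]}


def categorize(pred_set, true_label):
    """Map a prediction set to one of the report categories."""
    if not pred_set:
        return "abstention"
    if len(pred_set) == 2:
        return "doublet"
    return "singleton_correct" if true_label in pred_set else "singleton_incorrect"


def per_class_counts(labels, probs_class1, thresholds):
    """Tally categories separately for each true label."""
    counts = {0: Counter(), 1: Counter()}
    for y, p1 in zip(labels, probs_class1):
        counts[y][categorize(prediction_set(p1, thresholds), y)] += 1
    return counts


counts = per_class_counts(
    labels=[0, 0, 1, 1],
    probs_class1=[0.1, 0.5, 0.9, 0.2],
    thresholds={0: 0.6, 1: 0.6},
)
print(counts)
```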
+ ### Marginal Statistics (Deployment View)
+
+ Overall performance metrics, aggregated over all samples rather than conditioned on the true label:
+ - **Coverage**: Fraction of predictions containing the true label
+ - **Singleton rate**: Fraction of confident predictions (automation level)
+ - **Escalation rate**: Fraction requiring human review
+ - **Error rates**: By predicted class and overall
+
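As a sketch of how these marginal quantities relate to the prediction sets, the function below computes them from a list of sets and true labels. The metric names are hypothetical. Counting only doublets as escalations and reporting abstentions separately is an assumption; the package may define the escalation rate differently.

```python
def marginal_stats(pred_sets, labels):
    """Aggregate deployment-style metrics over all samples."""
    n = len(labels)
    pairs = list(zip(pred_sets, labels))
    singletons = [(s, y) for s, y in pairs if len(s) == 1]
    singleton_errors = sum(y not in s for s, y in singletons)
    return {
        "coverage": sum(y in s for s, y in pairs) / n,
        "singleton_rate": len(singletons) / n,
        "escalation_rate": sum(len(s) == 2 for s, _ in pairs) / n,  # doublets only
        "abstention_rate": sum(len(s) == 0 for s, _ in pairs) / n,
        "singleton_error_rate": (
            singleton_errors / len(singletons) if singletons else 0.0
        ),
    }


stats = marginal_stats(
    pred_sets=[{0}, {0, 1}, {1}, {0}, set()],
    labels=[0, 0, 1, 1, 1],
)
print(stats)
```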
+ ### PAC Bounds
+
+ The report includes theoretical and observed singleton error rates:
+ - **α'_bound**: Theoretical upper bound from PAC analysis
+ - **α'_observed**: Observed error rate on calibration data
+ - ✓ if observed ≤ bound (PAC guarantee satisfied)
+
+ ## Citation
+
+ If you use SSBC in your research, please cite:
+
+ ```bibtex
+ @software{ssbc2024,
+   author = {Zwart, Petrus H},
+   title = {SSBC: Small-Sample Beta Correction},
+   year = {2024},
+   url = {https://github.com/phzwart/ssbc}
+ }
+ ```
+
+ ## Contributing
+
+ Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+
+ ## License
+
+ MIT License - see [LICENSE](LICENSE) file for details.
+
+ ## Credits
+
+ This package was created with [Cookiecutter](https://github.com/audreyfeldroy/cookiecutter) and the [audreyfeldroy/cookiecutter-pypackage](https://github.com/audreyfeldroy/cookiecutter-pypackage) project template.
@@ -0,0 +1,17 @@
+ ssbc/__init__.py,sha256=T5Xr2tMLW0SIsphFeAKqbjHK_cjixLA8yUgdqVK53Io,1236
+ ssbc/__main__.py,sha256=Qd-f8z2Q2vpiEP2x6PBFsJrpACWDVxFKQk820MhFmHo,59
+ ssbc/cli.py,sha256=i5PLSsXS3glAFH1yVdd0YJAQVDNLd8z3jPJiPo5IW8k,436
+ ssbc/conformal.py,sha256=4Xq6OXJ3_vm8A177_goIYPic-Bb-qzAWLnSRJhTZMik,12340
+ ssbc/core.py,sha256=2qDiMgBloevB4v62Cl_CKuGhwlYF_bUd2nI_V4dSPPo,6857
+ ssbc/hyperparameter.py,sha256=e61j6koQ_l8sFvv5XRYa0x3RRPbhGIV3FxnXn1N3v28,9096
+ ssbc/simulation.py,sha256=V0nOUoxdDr5tPTCn1A5PRPJpwINqSi8kqwwk9msUwIo,5158
+ ssbc/ssbc.py,sha256=h0hwdogXGFqerm-5ZPeT-irPn91pCcQRjiHThXsRzEk,19
+ ssbc/statistics.py,sha256=UzQe6kPBIVHsGe-NMt4nECdqqIpZRnsls1Bnl2GNBgI,4824
+ ssbc/utils.py,sha256=1RxiNQM7rpegUEPuFvOlbSGesR4gnWpXr82bZQCgELM,77
+ ssbc/visualization.py,sha256=aAjnDKYMPMFEyD965f_8DMccTrhWeEvtFvKuqy1Cflk,19647
+ ssbc-0.1.0.dist-info/licenses/LICENSE,sha256=YOKwrV5OLHoJ_e8T4lkylNpjDdEEe4vvMG0tmUxYxco,1072
+ ssbc-0.1.0.dist-info/METADATA,sha256=jFZIeDJz1p6BWVvF0985urA-2g0_jnD-ETi3z5BdWwI,8357
+ ssbc-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ ssbc-0.1.0.dist-info/entry_points.txt,sha256=hgp8rkP_J-wInF8y42DBMGmqJ1bDGMgHQGq12Y3tMs4,38
+ ssbc-0.1.0.dist-info/top_level.txt,sha256=jkM9L2hWrag3UXW32pWlFmuB_L-G1el4oUbnMTJdDv0,5
+ ssbc-0.1.0.dist-info/RECORD,,
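Each RECORD line above has the form `path,sha256=<digest>,<size in bytes>`, where the digest is the SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed; the RECORD file itself is listed with both fields empty. The sketch below shows how such an entry can be recomputed to check an unpacked file. The helper names are illustrative and not part of any packaging tool.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """SHA-256 digest as URL-safe base64 without '=' padding."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def record_line(path: str, data: bytes) -> str:
    """Format one RECORD entry for a file with the given contents."""
    return f"{path},sha256={record_digest(data)},{len(data)}"


def matches_record(line: str, data: bytes) -> bool:
    """Check file contents against a RECORD entry such as the ones above."""
    path, hash_field, size = line.rsplit(",", 2)
    return hash_field == f"sha256={record_digest(data)}" and int(size) == len(data)


example = record_line("pkg/empty.py", b"")
print(example)
```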
@@ -0,0 +1,5 @@
+ Wheel-Version: 1.0
+ Generator: setuptools (80.9.0)
+ Root-Is-Purelib: true
+ Tag: py3-none-any
+
@@ -0,0 +1,2 @@
+ [console_scripts]
+ ssbc = ssbc.cli:app
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025, Petrus H Zwart
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1 @@
+ ssbc