claimbounded 0.2.0__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 claimbounded contributors
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,340 @@
1
+ Metadata-Version: 2.4
2
+ Name: claimbounded
3
+ Version: 0.2.0
4
+ Summary: Claim-bounded monitoring of AI-enabled medical devices: profile a device, classify what postmarket evidence can substantiate, and retrieve comparable FDA precedents.
5
+ Author: Yanis Vandecasteele, Sofiane Vandecasteele
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/yanisvdc/claimbounded
8
+ Project-URL: Issues, https://github.com/yanisvdc/claimbounded/issues
9
+ Keywords: FDA,medical-devices,AI,postmarket,monitoring,regulatory-science
10
+ Classifier: Programming Language :: Python :: 3
11
+ Classifier: License :: OSI Approved :: MIT License
12
+ Classifier: Intended Audience :: Healthcare Industry
13
+ Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
14
+ Requires-Python: >=3.9
15
+ Description-Content-Type: text/markdown
16
+ License-File: LICENSE
17
+ Provides-Extra: test
18
+ Requires-Dist: pytest>=7.0; extra == "test"
19
+ Provides-Extra: ui
20
+ Requires-Dist: gradio>=6.0; extra == "ui"
21
+ Requires-Dist: python-docx>=1.1; extra == "ui"
22
+ Dynamic: license-file
23
+
24
+ ---
25
+ title: claimbounded
26
+ emoji: 🏥
27
+ colorFrom: blue
28
+ colorTo: indigo
29
+ sdk: gradio
30
+ sdk_version: "6.19.0"
31
+ app_file: app.py
32
+ pinned: false
33
+ license: mit
34
+ short_description: Claim-bounded monitoring of AI-enabled medical devices
35
+ ---
36
+
37
+ # claimbounded
38
+
39
+ **Claim-Bounded Monitoring of AI-Enabled Medical Devices**
40
+
41
+ [![Python](https://img.shields.io/badge/python-3.9%2B-blue)](https://www.python.org)
42
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green)](LICENSE)
43
+ [![Schema](https://img.shields.io/badge/schema-v4__claimbounded-teal)](claimbounded/schema.py)
44
+ [![OSF Preregistration](https://img.shields.io/badge/OSF-10.17605%2FOSF.IO%2F74WAP-blue)](https://doi.org/10.17605/OSF.IO/74WAP)
45
+ [![HuggingFace Space](https://img.shields.io/badge/🤗%20HuggingFace-Live%20Demo-orange)](https://huggingface.co/spaces/yanisvdc/claimbounded)
46
+
47
+ ## Try it now: no install required
48
+
49
+ **[→ Open the live app on HuggingFace Spaces](https://huggingface.co/spaces/yanisvdc/claimbounded)**
50
+
51
+ ---
52
+
53
+ `claimbounded` is a regulatory science Python package grounded in a structured audit of **1,400 public FDA authorization summaries** for AI-enabled medical devices (510(k) and De Novo). It answers a foundational question in AI medical device oversight:
54
+
55
+ > **What is the strongest performance claim a health system can substantiate using only the data routine deployment naturally generates, and how far does that fall short of what the device was authorized on?**
56
+
57
+ The package classifies any device along five primary variables validated against human reviewers (all κ ≥ 0.75):
58
+
59
+ | Variable | What it captures |
60
+ |---|---|
61
+ | **Postmarket evaluability class** | What *kind* of correctness signal routine deployment produces (surrogate-only, correction-evaluable, delayed-evaluable, workflow-endpoint directly auditable, or closed-loop evaluable) |
62
+ | **Authorization endpoint recoverability** | Whether the *specific* performance endpoint the device was cleared on can be recovered from routine data, and at what cost |
63
+ | **Strongest auditable postmarket claim** | The highest claim level routine evidence can support without a new study |
64
+ | **Postmarket audit burden** | The evidence work required to reconstruct the authorization endpoint |
65
+ | **Routine data claim type** | Whether routine data supports the same endpoint, a clinical proxy, a workflow proxy, or only technical monitoring |
66
+
67
+ **Key empirical findings from the 1,400-device corpus** (publicly available FDA summaries):
68
+ - **85%** of authorized AI devices produce surrogate-only evidence in deployment, with no natural correctness signal
69
+ - **62%** have a claim ceiling of *workflow performance*: alert rates and output volume, not clinical accuracy
70
+ - **51%** have *proxy-only* recoverability: the authorization endpoint cannot be recovered from routine data at all
71
+ - **Only 1 in 1,400** devices is directly auditable on its authorization endpoint from routine deployment data
72
+ - **96%** have no PCCP; **99%** have no device-specific postmarket monitoring plan
73
+
74
+ ---
75
+
76
+ ## Who Is This For?
77
+
78
+ | Audience | How `claimbounded` helps |
79
+ |---|---|
80
+ | **Regulators** | Assess whether a manufacturer's marketed postmarket monitoring claim is supportable from the evidence their routine deployment generates. Cross-reference real FDA submission numbers from the precedent table on `accessdata.fda.gov`. See what fraction of comparable authorized devices share the same recoverability class. |
81
+ | **Device manufacturers** | Know your claim ceiling before your device ships. The *Manufacturer Design Requirements* section tells you exactly which logging, export, and identifier features would raise that ceiling. The *Landscape Context* shows how your device compares to 1,400 authorized peers. |
82
+ | **Health systems** | Use the *Procurement Questions* as a vendor checklist before deployment. Know the strongest monitoring claim your routine data supports, and verify it before signing a contract. The package surfaces whether comparable authorized devices can substantiate their marketed claims. |
83
+
84
+ ---
85
+
86
+ ## Installation
87
+
88
+ ```bash
89
+ pip install claimbounded
90
+ ```
91
+
92
+ With interactive UI (adds Gradio + python-docx):
93
+ ```bash
94
+ pip install "claimbounded[ui]"
95
+ ```
96
+
97
+ From source:
98
+ ```bash
99
+ git clone https://github.com/yanisvdc/claimbounded
100
+ cd claimbounded
101
+ pip install -e ".[ui]"
102
+ ```
103
+
104
+ ---
105
+
106
+ ## Quick Start
107
+
108
+ ### Launch the interactive UI
109
+ ```bash
110
+ claimbounded ui
111
+ ```
112
+ Opens at `http://localhost:7860`. All processing runs locally; no data leaves your machine.
113
+
114
+ ### Python API
115
+ ```python
116
+ from claimbounded import (
117
+ profile_device,
118
+ classify_evaluability_class,
119
+ classify_recoverability,
120
+ generate_monitoring_package,
121
+ )
122
+
123
+ profile = profile_device({
124
+ "device_name": "Acme LVO Triage",
125
+ "device_function": "triage_notification",
126
+ "authorization_endpoint_type": "diagnostic_accuracy",
127
+ "authorization_ground_truth_modality": "expert_reader_panel",
128
+ "routine_postmarket_evidence_stream": "workflow_logs",
129
+ "endpoint_linked_to_ai_output": "possible_but_not_described",
130
+ "human_correction_available": "no",
131
+ })
132
+
133
+ # Primary V4 variables
134
+ print(classify_evaluability_class(profile))
135
+ # → "surrogate_only" (85% of authorized AI devices)
136
+
137
+ print(classify_recoverability(profile))
138
+ # → "recoverable_with_chart_review" (expert panel GT; images retained in PACS)
139
+
140
+ # Full monitoring package
141
+ pkg = generate_monitoring_package(profile, k=8)
142
+ print(pkg["claim_profile"]["routine_evidence_claim_ceiling"])
143
+ # → "workflow_performance"
144
+
145
+ print(pkg["claim_profile"]["recoverability_label"])
146
+ # → "Recoverable with chart/image review"
147
+
148
+ # Landscape context: how this device compares to the 1,400-device corpus
149
+ ctx = pkg["landscape_context"]
150
+ print(f"{ctx['ceiling_pct']}% of FDA-authorized AI devices share this claim ceiling")
151
+ # → "62.2% of FDA-authorized AI devices share this claim ceiling"
152
+ ```
153
+
154
+ ### CLI
155
+ ```bash
156
+ claimbounded report examples/example_profiles/lvo_triage.json
157
+ claimbounded precedents examples/example_profiles/lvo_triage.json --mode hybrid -k 10
158
+ claimbounded lookup K192383
159
+ claimbounded search "large vessel occlusion"
160
+ claimbounded search "oncology"
161
+ ```
162
+
163
+ ---
164
+
165
+ ## The Five Primary Variables
166
+
167
+ ### Postmarket evaluability class
168
+ What *kind* of correctness signal routine deployment naturally produces, before any additional effort.
169
+
170
+ | Class | Description | Prevalence |
171
+ |---|---|---|
172
+ | `surrogate_only` | Deployment produces outputs and logs but no natural correctness signal | **85%** of corpus |
173
+ | `correction_evaluable` | Physician edits/confirmations explicitly captured and stored | 13% |
174
+ | `delayed_evaluable` | Clinical outcome accumulates naturally over time in EHR records | 1% |
175
+ | `workflow_endpoint_directly_auditable` | Authorization endpoint is itself a workflow metric, co-logged in deployment | <1% |
176
+ | `closed_loop_evaluable` | AI output and ground truth both automatically co-logged | <1% |
177
+
178
+ ### Authorization endpoint recoverability
179
+ Whether the *specific* authorization endpoint can be recovered and re-measured.
180
+
181
+ | Class | Description | Prevalence |
182
+ |---|---|---|
183
+ | `proxy_only` | Endpoint NOT recoverable; only operational proxies available | **51%** of corpus |
184
+ | `recoverable_with_chart_review` | Endpoint recoverable but requires expert re-annotation (major effort) | 43% |
185
+ | `recoverable_with_linkage` | Endpoint recoverable via data engineering on structured records | 4% |
186
+ | `not_recoverable` | Endpoint not recoverable AND no operational proxy exists | 2% |
187
+ | `directly_auditable` | Endpoint re-measurable from routine deployment data | **<0.1%** (1 in 1,400) |
188
+
189
+ ### The Claim Hierarchy
190
+ The strongest monitoring claim routine evidence can support:
191
+
192
+ | Level | Claim | Prevalence in corpus |
193
+ |---|---|---|
194
+ | 7 | **Clinical accuracy or calibration** | 0% (no device reaches this from routine data) |
195
+ | 6 | **Output quality / measurement agreement** | 2.5% |
196
+ | 5 | **Human–machine concordance** | 11% |
197
+ | 4 | **Workflow performance** | **62%** |
198
+ | 3 | **Technical pipeline stability** | 23% |
199
+ | 2 | **Utilization only** | – |
200
+ | 1 | **No performance claim auditable** | 1% |
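
Because the levels are ordered, the hierarchy can be modeled as an ordered enum. The sketch below is illustrative only: `ClaimLevel`, `is_supportable`, and `claim_gap` are hypothetical names that are not part of the `claimbounded` API, and the rule that any claim at or below the ceiling is supportable is an assumption about how the hierarchy is meant to be read.

```python
from enum import IntEnum


class ClaimLevel(IntEnum):
    """Claim levels from the table above, ordered weakest (1) to strongest (7)."""

    NO_PERFORMANCE_CLAIM_AUDITABLE = 1
    UTILIZATION_ONLY = 2
    TECHNICAL_PIPELINE_STABILITY = 3
    WORKFLOW_PERFORMANCE = 4
    HUMAN_MACHINE_CONCORDANCE = 5
    OUTPUT_QUALITY_MEASUREMENT_AGREEMENT = 6
    CLINICAL_ACCURACY_OR_CALIBRATION = 7


def is_supportable(marketed: ClaimLevel, ceiling: ClaimLevel) -> bool:
    """Assumed rule: a marketed claim is supportable if it does not exceed the ceiling."""
    return marketed <= ceiling


def claim_gap(authorized: ClaimLevel, ceiling: ClaimLevel) -> int:
    """Number of levels by which the authorization claim exceeds the routine-data ceiling."""
    return max(0, int(authorized) - int(ceiling))


if __name__ == "__main__":
    ceiling = ClaimLevel.WORKFLOW_PERFORMANCE
    authorized = ClaimLevel.CLINICAL_ACCURACY_OR_CALIBRATION
    print(is_supportable(authorized, ceiling), claim_gap(authorized, ceiling))
```

Using `IntEnum` keeps comparisons such as `marketed <= ceiling` readable while still allowing the numeric level to be reported.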
201
+
202
+ ---
203
+
204
+ ## Precedent Retrieval
205
+
206
+ `claimbounded` retrieves comparable FDA-authorized devices using a hybrid scoring function:
207
+
208
+ | Signal | Weight | Fields |
209
+ |---|---|---|
210
+ | Regulatory identity | 35% | disease area, clinical domain, device function, submission pathway |
211
+ | Evidence structure | 30% | endpoint type, recoverability, ground truth, claim ceiling, evaluability class, audit burden |
212
+ | Text similarity (BM25) | 20% | authorization endpoint description, supporting quotes |
213
+ | Evidence-gap matching | 15% | audit burden, monitoring implication |
214
+
215
+ **Retrieval modes:**
216
+ - `hybrid`: weighted blend (recommended)
217
+ - `like_for_like`: same regulatory and clinical identity
218
+ - `adjacent`: same postmarket-evidence problem, any device type
219
+ - `claim_gap`: same divergence between authorization endpoint and ceiling
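
The weighted blend in the table above can be sketched as a weighted sum of four per-signal scores. This is a minimal illustration, not the package's scoring code: the signal keys, `hybrid_score`, and `top_k` are hypothetical names, each signal is assumed to be pre-computed in the range [0, 1], and the candidate identifiers are placeholders rather than real submission numbers.

```python
# Weights taken from the table above; the keys are illustrative names.
WEIGHTS = {
    "regulatory_identity": 0.35,
    "evidence_structure": 0.30,
    "text_similarity": 0.20,
    "evidence_gap": 0.15,
}


def hybrid_score(signals):
    """Weighted sum of per-signal scores; a missing signal counts as 0."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())


def top_k(candidates, k=10):
    """Rank (identifier, signals) pairs by hybrid score, highest first."""
    scored = [(hybrid_score(signals), identifier) for identifier, signals in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    candidates = [
        ("candidate_a", {"regulatory_identity": 1.0}),
        ("candidate_b", {"text_similarity": 1.0}),
        ("candidate_c", {name: 1.0 for name in WEIGHTS}),
    ]
    for score, identifier in top_k(candidates, k=2):
        print(identifier, round(score, 2))
```

Under this reading, the non-hybrid modes would amount to different weightings or filters over the same signals, though the README does not specify how they are implemented.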
220
+
221
+ ---
222
+
223
+ ## Interactive UI
224
+
225
+ Launch with `claimbounded ui` and navigate three tabs:
226
+
227
+ ### ① Profile & Report
227
+ Fill in a device description using structured dropdowns (V4 FDA-Panel vocabulary). Click **Generate Report** to receive:
228
+ - **Claim hierarchy**: visual ceiling and authorization gap
229
+ - **Postmarket evaluability class**: what correctness signal deployment produces, with full V4 codebook definition
230
+ - **Authorization endpoint recoverability**: whether and how the clearing endpoint can be recovered
231
+ - **Landscape context**: how this device compares to 1,400 authorized peers (% sharing same ceiling, recoverability, evaluability)
232
+ - **Minimum audit dataset**, **Manufacturer design requirements**, **Procurement questions**
233
+ - **Comparable FDA precedents**: up to 20 real 510(k)/De Novo submission numbers with scoring
235
+ - Downloadable HTML report and Word document (.docx)
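
The landscape percentages are shares of corpus records that carry the same label as the profiled device. A minimal sketch, assuming the coded labels are available as a plain list of strings; `share_pct` is a hypothetical helper and the labels below are toy data, not corpus counts:

```python
from collections import Counter


def share_pct(labels, target):
    """Percentage of records whose label equals `target`, rounded to one decimal."""
    if not labels:
        return 0.0
    return round(100.0 * Counter(labels)[target] / len(labels), 1)


if __name__ == "__main__":
    # Toy label list for illustration only.
    toy_ceilings = ["workflow_performance"] * 5 + ["technical_pipeline_stability"] * 3
    print(share_pct(toy_ceilings, "workflow_performance"))
```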
236
+
237
+ ### ② Corpus Search
238
+ Search the 1,400-device corpus by device name, manufacturer, authorization endpoint, disease area, or clinical domain. Results render as a full stakeholder report with evaluability class, recoverability, PCCP status, and monitoring plan notes.
239
+
240
+ ### ③ Submission Lookup
241
+ Enter a 510(k) or De Novo submission number to retrieve the complete coded profile, including evaluability class, recoverability, claim ceiling, supporting quotes, and PCCP/monitoring plan context.
242
+
243
+ ---
244
+
245
+ ## Validation
246
+
247
+ Five primary variables validated against two independent human reviewers on a 200-record stratified sample (pre-registered before full extraction):
248
+
249
+ | Variable | κ (R1 vs R2) | 95% CI | Gate |
250
+ |---|---|---|---|
251
+ | `authorization_endpoint_recoverability` | 0.759 | [0.68, 0.83] | ✓ PASS |
252
+ | `routine_data_claim_type` | 0.837 | [0.76, 0.91] | ✓ PASS |
253
+ | `postmarket_evaluability_class` | 0.768 | [0.63, 0.88] | ✓ PASS |
254
+ | `strongest_auditable_postmarket_claim` | 0.821 | [0.74, 0.89] | ✓ PASS |
255
+ | `postmarket_audit_burden` | 0.832 | [0.76, 0.90] | ✓ PASS |
256
+
257
+ Pre-registration: [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)
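
For reference, agreement statistics of this kind can be computed as below. The README does not say whether the reported κ is weighted or unweighted, so this sketch assumes plain (unweighted) Cohen's kappa for two raters; `cohens_kappa` is a hypothetical helper, not a function exported by the package, and the example labels are toy data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of records")
    n = len(rater_a)
    # Observed agreement: fraction of records where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    r1 = ["proxy_only", "proxy_only", "not_recoverable", "not_recoverable"]
    r2 = ["proxy_only", "proxy_only", "not_recoverable", "proxy_only"]
    print(cohens_kappa(r1, r2))
```

The confidence intervals in the table would need an additional step, such as a bootstrap over records, which is not shown here.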
258
+
259
+ ---
260
+
261
+ ## Public API Reference
262
+
263
+ ```python
264
+ from claimbounded import (
265
+ # Profile a device
266
+ profile_device, # dict → DeviceEvidenceProfile
267
+ normalize_device_record, # dict → dict (canonical field set)
268
+ load_corpus, # → list[DeviceEvidenceProfile]
269
+ find_in_corpus, # submission_number → DeviceEvidenceProfile | None
270
+ search_corpus, # text → list[DeviceEvidenceProfile]
271
+ corpus_stats, # profile → dict (corpus-level context percentages)
272
+
273
+ # Classify (primary V4 variables)
274
+ classify_evaluability_class, # profile → str
275
+ classify_recoverability, # profile → str
276
+ classify_claim_ceiling, # profile → str
277
+ classify_supportable_claims, # profile → list[str]
278
+ classify_audit_burden, # profile → dict
279
+ estimate_authorization_remeasurement, # profile → dict
280
+
281
+ # Retrieve precedents
282
+ retrieve_precedents, # (profile, mode, k) → list[dict]
283
+ build_bm25_index,
284
+ structured_similarity,
285
+ schema_similarity,
286
+ explain_precedent_match,
287
+
288
+ # Generate operational outputs
289
+ generate_claim_support_matrix,
290
+ generate_dashboard_claim_limits,
291
+ generate_minimum_audit_dataset,
292
+ generate_manufacturer_design_requirements,
293
+ generate_procurement_questions,
294
+
295
+ # Assemble complete reports
296
+ generate_monitoring_package, # (profile, mode, k) → dict
297
+ generate_monitoring_profile_report, # (profile, mode, k) → str (Markdown)
298
+ )
299
+ ```
300
+
301
+ ---
302
+
303
+ ## Design Principles
304
+
305
+ **Zero runtime dependencies**: the core package uses only the Python standard library, including a dependency-free BM25 implementation. Gradio and python-docx are optional extras.
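
For readers unfamiliar with BM25, a standard-library-only version of Okapi BM25 scoring might look like the sketch below. It is not the package's implementation: `bm25_scores`, the whitespace tokenizer, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant `log(1 + (N - df + 0.5) / (df + 0.5))` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n if n else 0.0

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization penalizes documents longer than average.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "large vessel occlusion triage",
        "skin lesion classification",
        "vessel segmentation",
    ]
    print(bm25_scores("large vessel occlusion", docs))
```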
306
+
307
+ **Empirically grounded**: every classification rule mirrors the pre-registered V4 codebook used to extract and code 1,400 public FDA authorization summaries. Classifications for new devices follow the same logic as the published audit.
308
+
309
+ **Conservative**: the codebook errs on the side of requiring more evidence work rather than overstating what routine data supports. `proxy_only` is the conservative default for recoverability; `surrogate_only` is the conservative default for evaluability.
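
The conservative-default pattern can be sketched as an ordered rule list with a fallback label. Only the default label and the field name come from this README; the single rule shown, the `"yes"` value, and the names `EVALUABILITY_RULES` and `classify_evaluability_sketch` are hypothetical and do not reproduce the V4 codebook.

```python
# Ordered (label, predicate) pairs; the first matching rule wins.
# The predicate below is an illustrative assumption, not a codebook rule.
EVALUABILITY_RULES = [
    ("correction_evaluable", lambda rec: rec.get("human_correction_available") == "yes"),
]

CONSERVATIVE_DEFAULT = "surrogate_only"


def classify_evaluability_sketch(record):
    """Return the first matching label, or the conservative default if none match."""
    for label, predicate in EVALUABILITY_RULES:
        if predicate(record):
            return label
    return CONSERVATIVE_DEFAULT


if __name__ == "__main__":
    print(classify_evaluability_sketch({"human_correction_available": "no"}))
    print(classify_evaluability_sketch({}))
```

With this structure, missing or unrecognized inputs fall through to the weakest class instead of being promoted to a stronger one.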
310
+
311
+ **Precedent-grounded**: every output cites real FDA submission numbers verifiable at `accessdata.fda.gov`. The package cannot generate a recommendation not tied to a public precedent.
312
+
313
+ **Schema-first retrieval**: structured matching over shared coded fields (endpoint type, recoverability, ground truth, evaluability class) outperforms free-text search for this regulatory science task.
314
+
315
+ ---
316
+
317
+ ## Disclaimer
318
+
319
+ This package does not determine whether a device is safe or effective and does not predict FDA decisions. It maps the evidentiary relationship between authorization claims, routine postmarket evidence, and supportable monitoring claims, grounded in public authorization precedents. All classifications are preliminary and generated from user-provided inputs under the study codebook (schema `v4_claimbounded`, pre-registered at [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)). Nothing in this package constitutes regulatory advice.
320
+
321
+ ---
322
+
323
+ ## Citation
324
+
325
+ ```bibtex
326
+ @software{claimbounded2026,
327
+ title = {claimbounded: Claim-Bounded Monitoring of AI-Enabled Medical Devices},
328
+ author = {Yanis Vandecasteele and Sofiane Vandecasteele},
329
+ year = {2026},
330
+ url = {https://github.com/yanisvdc/claimbounded},
331
+ note = {Schema version v4\_claimbounded. Grounded in 1,400 public FDA authorization
332
+ records. OSF Preregistration: doi:10.17605/OSF.IO/74WAP}
333
+ }
334
+ ```
335
+
336
+ ---
337
+
338
+ ## License
339
+
340
+ MIT © 2026 Yanis Vandecasteele & Sofiane Vandecasteele