claimbounded 0.2.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- claimbounded-0.2.0/LICENSE +21 -0
- claimbounded-0.2.0/PKG-INFO +340 -0
- claimbounded-0.2.0/README.md +317 -0
- claimbounded-0.2.0/claimbounded/__init__.py +94 -0
- claimbounded-0.2.0/claimbounded/claims.py +352 -0
- claimbounded-0.2.0/claimbounded/cli.py +129 -0
- claimbounded-0.2.0/claimbounded/data/fda_ai_device_claims.csv +2465 -0
- claimbounded-0.2.0/claimbounded/outputs.py +152 -0
- claimbounded-0.2.0/claimbounded/precedents.py +284 -0
- claimbounded-0.2.0/claimbounded/profiles.py +160 -0
- claimbounded-0.2.0/claimbounded/reports.py +188 -0
- claimbounded-0.2.0/claimbounded/schema.py +275 -0
- claimbounded-0.2.0/claimbounded/ui.py +1566 -0
- claimbounded-0.2.0/claimbounded.egg-info/PKG-INFO +340 -0
- claimbounded-0.2.0/claimbounded.egg-info/SOURCES.txt +20 -0
- claimbounded-0.2.0/claimbounded.egg-info/dependency_links.txt +1 -0
- claimbounded-0.2.0/claimbounded.egg-info/entry_points.txt +2 -0
- claimbounded-0.2.0/claimbounded.egg-info/requires.txt +7 -0
- claimbounded-0.2.0/claimbounded.egg-info/top_level.txt +1 -0
- claimbounded-0.2.0/pyproject.toml +40 -0
- claimbounded-0.2.0/setup.cfg +4 -0
- claimbounded-0.2.0/tests/test_claimbounded.py +460 -0
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 claimbounded contributors

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,340 @@
Metadata-Version: 2.4
Name: claimbounded
Version: 0.2.0
Summary: Claim-bounded monitoring of AI-enabled medical devices: profile a device, classify what postmarket evidence can substantiate, and retrieve comparable FDA precedents.
Author: Yanis Vandecasteele, Sofiane Vandecasteele
License: MIT
Project-URL: Homepage, https://github.com/yanisvdc/claimbounded
Project-URL: Issues, https://github.com/yanisvdc/claimbounded/issues
Keywords: FDA,medical-devices,AI,postmarket,monitoring,regulatory-science
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Healthcare Industry
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"
Provides-Extra: ui
Requires-Dist: gradio>=6.0; extra == "ui"
Requires-Dist: python-docx>=1.1; extra == "ui"
Dynamic: license-file

---
title: claimbounded
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "6.19.0"
app_file: app.py
pinned: false
license: mit
short_description: Claim-bounded monitoring of AI-enabled medical devices
---

# claimbounded

**Claim-Bounded Monitoring of AI-Enabled Medical Devices**

[](https://www.python.org)
[](LICENSE)
[](claimbounded/schema.py)
[](https://doi.org/10.17605/OSF.IO/74WAP)
[](https://huggingface.co/spaces/yanisvdc/claimbounded)

## Try it now: no install required

**[Open the live app on HuggingFace Spaces](https://huggingface.co/spaces/yanisvdc/claimbounded)**

---

`claimbounded` is a regulatory science Python package grounded in a structured audit of **1,400 public FDA authorization summaries** for AI-enabled medical devices (510(k) and De Novo). It answers a foundational question in AI medical device oversight:

> **What is the strongest performance claim a health system can substantiate using only the data routine deployment naturally generates, and how far does that fall short of what the device was authorized on?**

The package classifies any device along five primary variables validated against human reviewers (all κ ≥ 0.75):

| Variable | What it captures |
|---|---|
| **Postmarket evaluability class** | What *kind* of correctness signal routine deployment produces (surrogate-only, correction-evaluable, delayed-evaluable, directly auditable) |
| **Authorization endpoint recoverability** | Whether the *specific* performance endpoint the device was cleared on can be recovered from routine data, and at what cost |
| **Strongest auditable postmarket claim** | The highest claim level routine evidence can support without a new study |
| **Postmarket audit burden** | The evidence work required to reconstruct the authorization endpoint |
| **Routine data claim type** | Whether routine data supports the same endpoint, a clinical proxy, a workflow proxy, or only technical monitoring |

**Key empirical findings from the 1,400-device corpus** (publicly available FDA summaries):
- **85%** of authorized AI devices produce only surrogate-only evidence in deployment: no natural correctness signal
- **62%** have a claim ceiling of *workflow performance*: alert rates and output volume, not clinical accuracy
- **51%** have *proxy-only* recoverability: the authorization endpoint cannot be recovered from routine data at all
- **Only 1 in 1,400** devices is directly auditable on its authorization endpoint from routine deployment data
- **96%** have no PCCP; **99%** have no device-specific postmarket monitoring plan

---

## Who Is This For?

| Audience | How `claimbounded` helps |
|---|---|
| **Regulators** | Assess whether a manufacturer's marketed postmarket monitoring claim is supportable from the evidence their routine deployment generates. Cross-reference real FDA submission numbers from the precedent table on `accessdata.fda.gov`. See what fraction of comparable authorized devices share the same recoverability class. |
| **Device manufacturers** | Know your claim ceiling before your device ships. The *Manufacturer Design Requirements* section tells you exactly which logging, export, and identifier features would raise that ceiling. The *Landscape Context* shows how your device compares to 1,400 authorized peers. |
| **Health systems** | Use the *Procurement Questions* as a vendor checklist before deployment. Know the strongest monitoring claim your routine data supports, and verify it before signing a contract. The package surfaces whether comparable authorized devices can substantiate their marketed claims. |

---

## Installation

```bash
pip install claimbounded
```

With interactive UI (adds Gradio + python-docx):
```bash
pip install "claimbounded[ui]"
```

From source:
```bash
git clone https://github.com/yanisvdc/claimbounded
cd claimbounded
pip install -e ".[ui]"
```

---

## Quick Start

### Launch the interactive UI
```bash
claimbounded ui
```
Opens at `http://localhost:7860`. All processing runs locally; no data leaves your machine.

### Python API
```python
from claimbounded import (
    profile_device,
    classify_evaluability_class,
    classify_recoverability,
    generate_monitoring_package,
)

profile = profile_device({
    "device_name": "Acme LVO Triage",
    "device_function": "triage_notification",
    "authorization_endpoint_type": "diagnostic_accuracy",
    "authorization_ground_truth_modality": "expert_reader_panel",
    "routine_postmarket_evidence_stream": "workflow_logs",
    "endpoint_linked_to_ai_output": "possible_but_not_described",
    "human_correction_available": "no",
})

# Primary V4 variables
print(classify_evaluability_class(profile))
# → "surrogate_only" (85% of authorized AI devices)

print(classify_recoverability(profile))
# → "recoverable_with_chart_review" (expert panel GT; images retained in PACS)

# Full monitoring package
pkg = generate_monitoring_package(profile, k=8)
print(pkg["claim_profile"]["routine_evidence_claim_ceiling"])
# → "workflow_performance"

print(pkg["claim_profile"]["recoverability_label"])
# → "Recoverable with chart/image review"

# Landscape context: how this device compares to the 1,400-device corpus
ctx = pkg["landscape_context"]
print(f"{ctx['ceiling_pct']}% of FDA-authorized AI devices share this claim ceiling")
# → "62.2% of FDA-authorized AI devices share this claim ceiling"
```

### CLI
```bash
claimbounded report examples/example_profiles/lvo_triage.json
claimbounded precedents examples/example_profiles/lvo_triage.json --mode hybrid -k 10
claimbounded lookup K192383
claimbounded search "large vessel occlusion"
claimbounded search "oncology"
```
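
The `report` and `precedents` commands above read a device profile from a JSON file. The contents of `examples/example_profiles/lvo_triage.json` are not shown in this diff, so the following is only a sketch: it assumes the file uses the same keys as the dictionary passed to `profile_device` in the Python example, and the output file name `my_device.json` is hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumption: the CLI accepts the same keys as the profile_device() example.
profile = {
    "device_name": "Acme LVO Triage",
    "device_function": "triage_notification",
    "authorization_endpoint_type": "diagnostic_accuracy",
    "authorization_ground_truth_modality": "expert_reader_panel",
    "routine_postmarket_evidence_stream": "workflow_logs",
    "endpoint_linked_to_ai_output": "possible_but_not_described",
    "human_correction_available": "no",
}


def write_profile(path: Path, record: dict) -> Path:
    """Serialize a device record to a JSON file and return its path."""
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


# Write to a temporary directory, then read back to confirm the round trip.
out_dir = Path(tempfile.mkdtemp())
profile_path = write_profile(out_dir / "my_device.json", profile)
loaded = json.loads(profile_path.read_text(encoding="utf-8"))
print(profile_path.name, len(loaded), "fields")
```

If the assumption about the file layout holds, the resulting path could then be passed to `claimbounded report` in place of the bundled example.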

---

## The Five Primary Variables

### Postmarket evaluability class
What *kind* of correctness signal routine deployment naturally produces, before any additional effort.

| Class | Description | Prevalence |
|---|---|---|
| `surrogate_only` | Deployment produces outputs and logs but no natural correctness signal | **85%** of corpus |
| `correction_evaluable` | Physician edits/confirmations explicitly captured and stored | 13% |
| `delayed_evaluable` | Clinical outcome accumulates naturally over time in EHR records | 1% |
| `workflow_endpoint_directly_auditable` | Authorization endpoint is itself a workflow metric, co-logged in deployment | <1% |
| `closed_loop_evaluable` | AI output and ground truth both automatically co-logged | <1% |

### Authorization endpoint recoverability
Whether the *specific* authorization endpoint can be recovered and re-measured.

| Class | Description | Prevalence |
|---|---|---|
| `proxy_only` | Endpoint NOT recoverable; only operational proxies available | **51%** of corpus |
| `recoverable_with_chart_review` | Endpoint recoverable but requires expert re-annotation (major effort) | 43% |
| `recoverable_with_linkage` | Endpoint recoverable via data engineering on structured records | 4% |
| `not_recoverable` | Endpoint not recoverable AND no operational proxy exists | 2% |
| `directly_auditable` | Endpoint re-measurable from routine deployment data | **<0.1%** (1 in 1,400) |

### The Claim Hierarchy
The strongest monitoring claim routine evidence can support:

| Level | Claim | Prevalence in corpus |
|---|---|---|
| 7 | **Clinical accuracy or calibration** | 0% (no device reaches this from routine data) |
| 6 | **Output quality / measurement agreement** | 2.5% |
| 5 | **Human–machine concordance** | 11% |
| 4 | **Workflow performance** | **62%** |
| 3 | **Technical pipeline stability** | 23% |
| 2 | **Utilization only** | - |
| 1 | **No performance claim auditable** | 1% |
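
Because the hierarchy is ordinal, the distance between the claim a device was authorized on and its routine-evidence ceiling can be expressed as a number of levels. The sketch below is illustrative only: the `ClaimLevel` enum and `claim_gap` helper are hypothetical names, not part of the package API, and treating a diagnostic-accuracy authorization as level 7 is an assumption.

```python
from enum import IntEnum


class ClaimLevel(IntEnum):
    """Levels copied from the claim hierarchy table (hypothetical enum)."""
    NO_PERFORMANCE_CLAIM_AUDITABLE = 1
    UTILIZATION_ONLY = 2
    TECHNICAL_PIPELINE_STABILITY = 3
    WORKFLOW_PERFORMANCE = 4
    HUMAN_MACHINE_CONCORDANCE = 5
    OUTPUT_QUALITY_MEASUREMENT_AGREEMENT = 6
    CLINICAL_ACCURACY_OR_CALIBRATION = 7


def claim_gap(authorized: ClaimLevel, ceiling: ClaimLevel) -> int:
    """Number of levels by which routine evidence falls short (0 if none)."""
    return max(0, int(authorized) - int(ceiling))


# Assumption: a device authorized on diagnostic accuracy sits at level 7,
# while its routine data only supports workflow performance (level 4).
gap = claim_gap(
    ClaimLevel.CLINICAL_ACCURACY_OR_CALIBRATION,
    ClaimLevel.WORKFLOW_PERFORMANCE,
)
print(gap)  # 3
```

An integer enum keeps the ordering explicit, so comparisons such as `ceiling < authorized` work without a separate lookup table.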

---

## Precedent Retrieval

`claimbounded` retrieves comparable FDA-authorized devices using a hybrid scoring function:

| Signal | Weight | Fields |
|---|---|---|
| Regulatory identity | 35% | disease area, clinical domain, device function, submission pathway |
| Evidence structure | 30% | endpoint type, recoverability, ground truth, claim ceiling, evaluability class, audit burden |
| Text similarity (BM25) | 20% | authorization endpoint description, supporting quotes |
| Evidence-gap matching | 15% | audit burden, monitoring implication |
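
As a rough illustration of how the weights in this table could combine, the sketch below computes a linear weighted blend. It is not taken from `precedents.py`: the function name, the signal keys, the assumption that each signal is already normalized to the range 0 to 1, and the candidate values are all hypothetical.

```python
# Weights copied from the table above; they sum to 1.0.
WEIGHTS = {
    "regulatory_identity": 0.35,
    "evidence_structure": 0.30,
    "text_similarity": 0.20,
    "evidence_gap": 0.15,
}


def hybrid_score(signals: dict) -> float:
    """Linear blend of per-signal similarities (each assumed to be in [0, 1])."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


# Made-up similarity values for two hypothetical candidate precedents.
candidates = {
    "candidate_a": {"regulatory_identity": 1.0, "evidence_structure": 0.5,
                    "text_similarity": 0.2, "evidence_gap": 0.0},
    "candidate_b": {"regulatory_identity": 0.5, "evidence_structure": 1.0,
                    "text_similarity": 0.6, "evidence_gap": 1.0},
}

# Rank candidate names from highest to lowest blended score.
ranked = sorted(candidates, key=lambda name: hybrid_score(candidates[name]),
                reverse=True)
print(ranked)
```

Here `candidate_b` ranks first despite a weaker identity match, because the evidence-structure and evidence-gap signals together outweigh the identity difference.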

**Retrieval modes:**
- `hybrid`: weighted blend (recommended)
- `like_for_like`: same regulatory and clinical identity
- `adjacent`: same postmarket-evidence problem, any device type
- `claim_gap`: same divergence between authorization endpoint and ceiling

---

## Interactive UI

Launch with `claimbounded ui` and navigate three tabs:

### ① Profile & Report
Fill in a device description using structured dropdowns (V4 FDA-Panel vocabulary). Click **Generate Report** to receive:
- **Claim hierarchy**: visual ceiling and authorization gap
- **Postmarket evaluability class**: what correctness signal deployment produces, with full V4 codebook definition
- **Authorization endpoint recoverability**: whether/how the clearing endpoint can be recovered
- **Landscape context**: how this device compares to 1,400 authorized peers (% sharing same ceiling, recoverability, evaluability)
- **Minimum audit dataset**, **Manufacturer design requirements**, **Procurement questions**
- **Comparable FDA precedents**: up to 20 real 510(k)/De Novo submission numbers with scoring
- Downloadable HTML report and Word document (.docx)

### ② Corpus Search
Search the 1,400-device corpus by device name, manufacturer, authorization endpoint, disease area, or clinical domain. Results render as a full stakeholder report with evaluability class, recoverability, PCCP status, and monitoring plan notes.

### ③ Submission Lookup
Enter a 510(k) or De Novo submission number to retrieve the complete coded profile, including evaluability class, recoverability, claim ceiling, supporting quotes, and PCCP/monitoring plan context.

---

## Validation

Five primary variables validated against two independent human reviewers on a 200-record stratified sample (pre-registered before full extraction):

| Variable | κ (R1 vs R2) | 95% CI | Gate |
|---|---|---|---|
| `authorization_endpoint_recoverability` | 0.759 | [0.68, 0.83] | PASS |
| `routine_data_claim_type` | 0.837 | [0.76, 0.91] | PASS |
| `postmarket_evaluability_class` | 0.768 | [0.63, 0.88] | PASS |
| `strongest_auditable_postmarket_claim` | 0.821 | [0.74, 0.89] | PASS |
| `postmarket_audit_burden` | 0.832 | [0.76, 0.90] | PASS |

Pre-registration: [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)
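
For readers unfamiliar with the κ column, the sketch below computes unweighted Cohen's kappa for two raters on a toy set of ten labels. It is an assumption that the study used this unweighted form; the labels are invented for illustration and are not drawn from the 200-record sample, and the confidence-interval method is not reproduced here.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must label the same non-empty set")
    n = len(rater1)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[label] * counts2[label] for label in counts1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels: the raters disagree on exactly one of ten records.
r1 = ["proxy_only"] * 5 + ["recoverable_with_chart_review"] * 5
r2 = ["proxy_only"] * 4 + ["recoverable_with_chart_review"] * 6
kappa = cohens_kappa(r1, r2)
print(round(kappa, 3))  # 0.8
```

In this toy case observed agreement is 0.9 and chance agreement is 0.5, so κ = (0.9 − 0.5) / (1 − 0.5) = 0.8, which would clear a 0.75 gate.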

---

## Public API Reference

```python
from claimbounded import (
    # Profile a device
    profile_device,  # dict → DeviceEvidenceProfile
    normalize_device_record,  # dict → dict (canonical field set)
    load_corpus,  # → list[DeviceEvidenceProfile]
    find_in_corpus,  # submission_number → DeviceEvidenceProfile | None
    search_corpus,  # text → list[DeviceEvidenceProfile]
    corpus_stats,  # profile → dict (corpus-level context percentages)

    # Classify (primary V4 variables)
    classify_evaluability_class,  # profile → str
    classify_recoverability,  # profile → str
    classify_claim_ceiling,  # profile → str
    classify_supportable_claims,  # profile → list[str]
    classify_audit_burden,  # profile → dict
    estimate_authorization_remeasurement,  # profile → dict

    # Retrieve precedents
    retrieve_precedents,  # (profile, mode, k) → list[dict]
    build_bm25_index,
    structured_similarity,
    schema_similarity,
    explain_precedent_match,

    # Generate operational outputs
    generate_claim_support_matrix,
    generate_dashboard_claim_limits,
    generate_minimum_audit_dataset,
    generate_manufacturer_design_requirements,
    generate_procurement_questions,

    # Assemble complete reports
    generate_monitoring_package,  # (profile, mode, k) → dict
    generate_monitoring_profile_report,  # (profile, mode, k) → str (Markdown)
)
```

---

## Design Principles

**Zero runtime dependencies**: the core package uses only the Python standard library, including a dependency-free BM25 implementation. Gradio and python-docx are optional extras.

**Empirically grounded**: every classification rule mirrors the pre-registered V4 codebook used to extract and code 1,400 public FDA authorization summaries. Classifications for new devices follow the same logic as the published audit.

**Conservative**: the codebook errs on the side of requiring more evidence work rather than overstating what routine data supports. `proxy_only` is the conservative default for recoverability; `surrogate_only` is the conservative default for evaluability.

**Precedent-grounded**: every output cites real FDA submission numbers verifiable at `accessdata.fda.gov`. The package cannot generate a recommendation not tied to a public precedent.

**Schema-first retrieval**: structured matching over shared coded fields (endpoint type, recoverability, ground truth, evaluability class) outperforms free-text search for this regulatory science task.
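
To make the "dependency-free BM25" principle concrete, here is a minimal standard-library sketch of Okapi BM25 ranking. It is not the package's `build_bm25_index`: the class name, whitespace tokenizer, parameter defaults (`k1=1.5`, `b=0.75`), smoothed IDF variant, and toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 over a small, non-empty in-memory document list."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in docs]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(term for d in self.docs for term in set(d))
        # Smoothed IDF that stays positive even for very common terms.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, index):
        tf, dl = self.tfs[index], len(self.docs[index])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        total = 0.0
        for term in tokenize(query):
            f = tf.get(term, 0)
            if f:
                total += self.idf[term] * f * (self.k1 + 1) / (f + norm)
        return total

    def rank(self, query):
        """Document indices from best to worst match (ties keep input order)."""
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Toy endpoint descriptions, invented for this example.
docs = [
    "sensitivity and specificity for large vessel occlusion detection",
    "time to notification for suspected large vessel occlusion",
    "segmentation accuracy for lung nodules",
]
index = BM25(docs)
ranking = index.rank("large vessel occlusion sensitivity")
print(ranking)  # [0, 1, 2]
```

The first document wins because it matches all four query terms, and the third scores zero because it shares none of them.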

---

## Disclaimer

This package does not determine whether a device is safe or effective and does not predict FDA decisions. It maps the evidentiary relationship between authorization claims, routine postmarket evidence, and supportable monitoring claims, grounded in public authorization precedents. All classifications are preliminary and generated from user-provided inputs under the study codebook (schema `v4_claimbounded`, pre-registered at [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)). Nothing in this package constitutes regulatory advice.

---

## Citation

```bibtex
@software{claimbounded2026,
  title = {claimbounded: Claim-Bounded Monitoring of AI-Enabled Medical Devices},
  author = {Yanis Vandecasteele and Sofiane Vandecasteele},
  year = {2026},
  url = {https://github.com/yanisvdc/claimbounded},
  note = {Schema version v4\_claimbounded. Grounded in 1,400 public FDA authorization
          records. OSF Preregistration: doi:10.17605/OSF.IO/74WAP}
}
```

---

## License

MIT © 2026 Yanis Vandecasteele & Sofiane Vandecasteele
@@ -0,0 +1,317 @@
---
title: claimbounded
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: "6.19.0"
app_file: app.py
pinned: false
license: mit
short_description: Claim-bounded monitoring of AI-enabled medical devices
---

# claimbounded

**Claim-Bounded Monitoring of AI-Enabled Medical Devices**

[](https://www.python.org)
[](LICENSE)
[](claimbounded/schema.py)
[](https://doi.org/10.17605/OSF.IO/74WAP)
[](https://huggingface.co/spaces/yanisvdc/claimbounded)

## Try it now: no install required

**[Open the live app on HuggingFace Spaces](https://huggingface.co/spaces/yanisvdc/claimbounded)**

---

`claimbounded` is a regulatory science Python package grounded in a structured audit of **1,400 public FDA authorization summaries** for AI-enabled medical devices (510(k) and De Novo). It answers a foundational question in AI medical device oversight:

> **What is the strongest performance claim a health system can substantiate using only the data routine deployment naturally generates, and how far does that fall short of what the device was authorized on?**

The package classifies any device along five primary variables validated against human reviewers (all κ ≥ 0.75):

| Variable | What it captures |
|---|---|
| **Postmarket evaluability class** | What *kind* of correctness signal routine deployment produces (surrogate-only, correction-evaluable, delayed-evaluable, directly auditable) |
| **Authorization endpoint recoverability** | Whether the *specific* performance endpoint the device was cleared on can be recovered from routine data, and at what cost |
| **Strongest auditable postmarket claim** | The highest claim level routine evidence can support without a new study |
| **Postmarket audit burden** | The evidence work required to reconstruct the authorization endpoint |
| **Routine data claim type** | Whether routine data supports the same endpoint, a clinical proxy, a workflow proxy, or only technical monitoring |

**Key empirical findings from the 1,400-device corpus** (publicly available FDA summaries):
- **85%** of authorized AI devices produce only surrogate-only evidence in deployment: no natural correctness signal
- **62%** have a claim ceiling of *workflow performance*: alert rates and output volume, not clinical accuracy
- **51%** have *proxy-only* recoverability: the authorization endpoint cannot be recovered from routine data at all
- **Only 1 in 1,400** devices is directly auditable on its authorization endpoint from routine deployment data
- **96%** have no PCCP; **99%** have no device-specific postmarket monitoring plan

---

## Who Is This For?

| Audience | How `claimbounded` helps |
|---|---|
| **Regulators** | Assess whether a manufacturer's marketed postmarket monitoring claim is supportable from the evidence their routine deployment generates. Cross-reference real FDA submission numbers from the precedent table on `accessdata.fda.gov`. See what fraction of comparable authorized devices share the same recoverability class. |
| **Device manufacturers** | Know your claim ceiling before your device ships. The *Manufacturer Design Requirements* section tells you exactly which logging, export, and identifier features would raise that ceiling. The *Landscape Context* shows how your device compares to 1,400 authorized peers. |
| **Health systems** | Use the *Procurement Questions* as a vendor checklist before deployment. Know the strongest monitoring claim your routine data supports, and verify it before signing a contract. The package surfaces whether comparable authorized devices can substantiate their marketed claims. |

---

## Installation

```bash
pip install claimbounded
```

With interactive UI (adds Gradio + python-docx):
```bash
pip install "claimbounded[ui]"
```

From source:
```bash
git clone https://github.com/yanisvdc/claimbounded
cd claimbounded
pip install -e ".[ui]"
```

---

## Quick Start

### Launch the interactive UI
```bash
claimbounded ui
```
Opens at `http://localhost:7860`. All processing runs locally; no data leaves your machine.

### Python API
```python
from claimbounded import (
    profile_device,
    classify_evaluability_class,
    classify_recoverability,
    generate_monitoring_package,
)

profile = profile_device({
    "device_name": "Acme LVO Triage",
    "device_function": "triage_notification",
    "authorization_endpoint_type": "diagnostic_accuracy",
    "authorization_ground_truth_modality": "expert_reader_panel",
    "routine_postmarket_evidence_stream": "workflow_logs",
    "endpoint_linked_to_ai_output": "possible_but_not_described",
    "human_correction_available": "no",
})

# Primary V4 variables
print(classify_evaluability_class(profile))
# → "surrogate_only" (85% of authorized AI devices)

print(classify_recoverability(profile))
# → "recoverable_with_chart_review" (expert panel GT; images retained in PACS)

# Full monitoring package
pkg = generate_monitoring_package(profile, k=8)
print(pkg["claim_profile"]["routine_evidence_claim_ceiling"])
# → "workflow_performance"

print(pkg["claim_profile"]["recoverability_label"])
# → "Recoverable with chart/image review"

# Landscape context: how this device compares to the 1,400-device corpus
ctx = pkg["landscape_context"]
print(f"{ctx['ceiling_pct']}% of FDA-authorized AI devices share this claim ceiling")
# → "62.2% of FDA-authorized AI devices share this claim ceiling"
```

### CLI
```bash
claimbounded report examples/example_profiles/lvo_triage.json
claimbounded precedents examples/example_profiles/lvo_triage.json --mode hybrid -k 10
claimbounded lookup K192383
claimbounded search "large vessel occlusion"
claimbounded search "oncology"
```

---

## The Five Primary Variables

### Postmarket evaluability class
What *kind* of correctness signal routine deployment naturally produces, before any additional effort.

| Class | Description | Prevalence |
|---|---|---|
| `surrogate_only` | Deployment produces outputs and logs but no natural correctness signal | **85%** of corpus |
| `correction_evaluable` | Physician edits/confirmations explicitly captured and stored | 13% |
| `delayed_evaluable` | Clinical outcome accumulates naturally over time in EHR records | 1% |
| `workflow_endpoint_directly_auditable` | Authorization endpoint is itself a workflow metric, co-logged in deployment | <1% |
| `closed_loop_evaluable` | AI output and ground truth both automatically co-logged | <1% |

### Authorization endpoint recoverability
Whether the *specific* authorization endpoint can be recovered and re-measured.

| Class | Description | Prevalence |
|---|---|---|
| `proxy_only` | Endpoint NOT recoverable; only operational proxies available | **51%** of corpus |
| `recoverable_with_chart_review` | Endpoint recoverable but requires expert re-annotation (major effort) | 43% |
| `recoverable_with_linkage` | Endpoint recoverable via data engineering on structured records | 4% |
| `not_recoverable` | Endpoint not recoverable AND no operational proxy exists | 2% |
| `directly_auditable` | Endpoint re-measurable from routine deployment data | **<0.1%** (1 in 1,400) |
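
The prevalence figures in this table are rounded, so a quick arithmetic check can confirm they are mutually consistent. The sketch below is only that check: the variable names are hypothetical, the four rounded percentages are copied from the table, and the tolerance assumes each may be off by up to half a percentage point.

```python
CORPUS_SIZE = 1400

# Percentages as reported in the table; the last row is derived from 1 in 1,400.
recoverability_pct = {
    "proxy_only": 51.0,
    "recoverable_with_chart_review": 43.0,
    "recoverable_with_linkage": 4.0,
    "not_recoverable": 2.0,
    "directly_auditable": 100 * 1 / CORPUS_SIZE,
}

total = sum(recoverability_pct.values())
directly_auditable = recoverability_pct["directly_auditable"]

# Four rounded entries, each possibly off by 0.5 points, give a 2-point tolerance.
consistent = abs(total - 100.0) <= 2.0

print(f"directly_auditable = {directly_auditable:.2f}%")  # 0.07%
print(f"total = {total:.2f}%, consistent = {consistent}")
```

One device in 1,400 is about 0.07%, which agrees with the "<0.1%" entry, and the five classes together account for roughly the whole corpus.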

### The Claim Hierarchy
The strongest monitoring claim routine evidence can support:

| Level | Claim | Prevalence in corpus |
|---|---|---|
| 7 | **Clinical accuracy or calibration** | 0% (no device reaches this from routine data) |
| 6 | **Output quality / measurement agreement** | 2.5% |
| 5 | **Human–machine concordance** | 11% |
| 4 | **Workflow performance** | **62%** |
| 3 | **Technical pipeline stability** | 23% |
| 2 | **Utilization only** | - |
| 1 | **No performance claim auditable** | 1% |

---

## Precedent Retrieval

`claimbounded` retrieves comparable FDA-authorized devices using a hybrid scoring function:

| Signal | Weight | Fields |
|---|---|---|
| Regulatory identity | 35% | disease area, clinical domain, device function, submission pathway |
| Evidence structure | 30% | endpoint type, recoverability, ground truth, claim ceiling, evaluability class, audit burden |
| Text similarity (BM25) | 20% | authorization endpoint description, supporting quotes |
| Evidence-gap matching | 15% | audit burden, monitoring implication |

**Retrieval modes:**
- `hybrid`: weighted blend (recommended)
- `like_for_like`: same regulatory and clinical identity
- `adjacent`: same postmarket-evidence problem, any device type
- `claim_gap`: same divergence between authorization endpoint and ceiling
|
|
197
|
+
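The weights in the table above can be read as a linear blend of four per-signal similarities. Here is a minimal sketch, assuming each signal has already been scaled to the range 0 to 1. `WEIGHTS`, `hybrid_score`, and the candidate labels are hypothetical names chosen for illustration, and the package's own scoring may differ in detail.

```python
# Weights copied from the table above; the key names are assumptions.
WEIGHTS = {
    "regulatory_identity": 0.35,
    "evidence_structure": 0.30,
    "text_similarity": 0.20,
    "evidence_gap": 0.15,
}


def hybrid_score(signals: dict[str, float]) -> float:
    """Weighted sum of the four similarity signals (each assumed in [0, 1])."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    return sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)


# Two made-up candidates: one matches on identity, one on evidence structure.
candidates = {
    "candidate_a": {"regulatory_identity": 1.0, "evidence_structure": 0.5,
                    "text_similarity": 0.2, "evidence_gap": 0.0},
    "candidate_b": {"regulatory_identity": 0.25, "evidence_structure": 1.0,
                    "text_similarity": 0.5, "evidence_gap": 1.0},
}

ranked = sorted(candidates, key=lambda name: hybrid_score(candidates[name]), reverse=True)
print(ranked)  # ['candidate_b', 'candidate_a']
```

Under this reading, the non-hybrid modes would correspond to different emphasis across the same signals, for example weighting regulatory identity most heavily for `like_for_like`. That interpretation is an assumption, not a description of the shipped code.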

---

## Interactive UI

Launch with `claimbounded ui` and navigate three tabs:

### ① Profile & Report
Fill in a device description using structured dropdowns (V4 FDA-Panel vocabulary). Click **Generate Report** to receive:
- **Claim hierarchy**: visual ceiling and authorization gap
- **Postmarket evaluability class**: what correctness signal deployment produces, with full V4 codebook definition
- **Authorization endpoint recoverability**: whether and how the clearing endpoint can be recovered
- **Landscape context**: how this device compares to 1,400 authorized peers (% sharing same ceiling, recoverability, evaluability)
- **Minimum audit dataset**, **Manufacturer design requirements**, **Procurement questions**
- **Comparable FDA precedents**: up to 20 real 510(k)/De Novo submission numbers with scoring
- Downloadable HTML report and Word document (.docx)

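The landscape-context percentages in the list above amount to counting how many corpus records share a coded value with the device being profiled. The following is a rough sketch under the assumption that records are plain dictionaries keyed by field name. `share_with_same_value` and the toy corpus are hypothetical and do not reproduce the package's `corpus_stats` function.

```python
def share_with_same_value(corpus: list[dict], profile: dict, field: str) -> float:
    """Percentage of corpus records whose `field` equals the profile's value."""
    if not corpus:
        return 0.0
    matches = sum(1 for record in corpus if record.get(field) == profile.get(field))
    return 100.0 * matches / len(corpus)


# Toy corpus of four records; real data would come from the bundled CSV.
toy_corpus = [
    {"recoverability": "proxy_only"},
    {"recoverability": "proxy_only"},
    {"recoverability": "recoverable_with_chart_review"},
    {"recoverability": "recoverable_with_linkage"},
]
device = {"recoverability": "proxy_only"}

print(share_with_same_value(toy_corpus, device, "recoverability"))  # 50.0
```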
### ② Corpus Search
Search the 1,400-device corpus by device name, manufacturer, authorization endpoint, disease area, or clinical domain. Results render as a full stakeholder report with evaluability class, recoverability, PCCP status, and monitoring plan notes.

### ③ Submission Lookup
Enter a 510(k) or De Novo submission number to retrieve the complete coded profile, including evaluability class, recoverability, claim ceiling, supporting quotes, and PCCP/monitoring plan context.

---

## Validation

Five primary variables were validated against two independent human reviewers on a 200-record stratified sample (pre-registered before full extraction):

| Variable | κ (R1 vs R2) | 95% CI | Gate |
|---|---|---|---|
| `authorization_endpoint_recoverability` | 0.759 | [0.68, 0.83] | PASS |
| `routine_data_claim_type` | 0.837 | [0.76, 0.91] | PASS |
| `postmarket_evaluability_class` | 0.768 | [0.63, 0.88] | PASS |
| `strongest_auditable_postmarket_claim` | 0.821 | [0.74, 0.89] | PASS |
| `postmarket_audit_burden` | 0.832 | [0.76, 0.90] | PASS |

Pre-registration: [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)

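For readers unfamiliar with the agreement statistic in the table, the sketch below computes unweighted Cohen's κ for two raters over categorical labels. It assumes the reported values are of this form, which the table does not state explicitly. `cohens_kappa` and the four-record example are illustrative and are not taken from the validation data.

```python
from collections import Counter


def cohens_kappa(rater_1: list[str], rater_2: list[str]) -> float:
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_1)
    # Observed agreement: fraction of records coded identically.
    observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    # Chance agreement from each rater's marginal label frequencies.
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    labels = counts_1.keys() | counts_2.keys()
    expected = sum(counts_1[label] * counts_2[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


r1 = ["proxy_only", "proxy_only",
      "recoverable_with_chart_review", "recoverable_with_chart_review"]
r2 = ["proxy_only", "proxy_only",
      "recoverable_with_chart_review", "proxy_only"]
print(cohens_kappa(r1, r2))  # 0.5
```

In the example the raters agree on 3 of 4 records (0.75) while chance agreement is 0.5, giving κ = 0.5.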

---

## Public API Reference

```python
from claimbounded import (
    # Profile a device
    profile_device,                        # dict → DeviceEvidenceProfile
    normalize_device_record,               # dict → dict (canonical field set)
    load_corpus,                           # → list[DeviceEvidenceProfile]
    find_in_corpus,                        # submission_number → DeviceEvidenceProfile | None
    search_corpus,                         # text → list[DeviceEvidenceProfile]
    corpus_stats,                          # profile → dict (corpus-level context percentages)

    # Classify (primary V4 variables)
    classify_evaluability_class,           # profile → str
    classify_recoverability,               # profile → str
    classify_claim_ceiling,                # profile → str
    classify_supportable_claims,           # profile → list[str]
    classify_audit_burden,                 # profile → dict
    estimate_authorization_remeasurement,  # profile → dict

    # Retrieve precedents
    retrieve_precedents,                   # (profile, mode, k) → list[dict]
    build_bm25_index,
    structured_similarity,
    schema_similarity,
    explain_precedent_match,

    # Generate operational outputs
    generate_claim_support_matrix,
    generate_dashboard_claim_limits,
    generate_minimum_audit_dataset,
    generate_manufacturer_design_requirements,
    generate_procurement_questions,

    # Assemble complete reports
    generate_monitoring_package,           # (profile, mode, k) → dict
    generate_monitoring_profile_report,    # (profile, mode, k) → str (Markdown)
)
```

---

## Design Principles

**Zero runtime dependencies**: the core package uses only the Python standard library, including a dependency-free BM25 implementation. Gradio and python-docx are optional extras.

**Empirically grounded**: every classification rule mirrors the pre-registered V4 codebook used to extract and code 1,400 public FDA authorization summaries. Classifications for new devices follow the same logic as the published audit.

**Conservative**: the codebook errs on the side of requiring more evidence work rather than overstating what routine data supports. `proxy_only` is the conservative default for recoverability; `surrogate_only` is the conservative default for evaluability.

**Precedent-grounded**: every output cites real FDA submission numbers verifiable at `accessdata.fda.gov`. The package cannot generate a recommendation not tied to a public precedent.

**Schema-first retrieval**: structured matching over shared coded fields (endpoint type, recoverability, ground truth, evaluability class) outperforms free-text search for this regulatory science task.

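To make the "dependency-free BM25" point concrete, here is a compact Okapi BM25 scorer written against the standard library only. It is a sketch rather than the package's implementation. The whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the non-negative IDF variant, and the names `tokenize` and `bm25_scores` are all assumptions.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def bm25_scores(query: str, documents: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    if not documents:
        return []
    docs = [tokenize(doc) for doc in documents]
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    query_terms = tokenize(query)

    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = 1.0 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


endpoint_texts = [
    "sensitivity for stroke detection",
    "time to notification workflow",
    "measurement agreement with expert",
]
scores = bm25_scores("stroke detection", endpoint_texts)
print(scores.index(max(scores)))  # 0
```

Only the first text shares terms with the query, so it is the only one with a non-zero score.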

---

## Disclaimer

This package does not determine whether a device is safe or effective and does not predict FDA decisions. It maps the evidentiary relationship between authorization claims, routine postmarket evidence, and supportable monitoring claims, grounded in public authorization precedents. All classifications are preliminary and generated from user-provided inputs under the study codebook (schema `v4_claimbounded`, pre-registered at [doi:10.17605/OSF.IO/74WAP](https://doi.org/10.17605/OSF.IO/74WAP)). Nothing in this package constitutes regulatory advice.

---

## Citation

```bibtex
@software{claimbounded2026,
  title  = {claimbounded: Claim-Bounded Monitoring of AI-Enabled Medical Devices},
  author = {Yanis Vandecasteele and Sofiane Vandecasteele},
  year   = {2026},
  url    = {https://github.com/yanisvdc/claimbounded},
  note   = {Schema version v4\_claimbounded. Grounded in 1,400 public FDA authorization
            records. OSF Preregistration: doi:10.17605/OSF.IO/74WAP}
}
```

---

## License

MIT © 2026 Yanis Vandecasteele & Sofiane Vandecasteele