fhir-pyrate 0.2.3__tar.gz → 0.2.4.post1__tar.gz
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/PKG-INFO +78 -49
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/README.md +52 -25
- fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/PKG-INFO +646 -0
- fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/SOURCES.txt +20 -0
- fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/dependency_links.txt +1 -0
- fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/requires.txt +21 -0
- fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/top_level.txt +1 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/pyproject.toml +59 -36
- fhir_pyrate-0.2.4.post1/setup.cfg +4 -0
- fhir_pyrate-0.2.4.post1/tests/test_public.py +690 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/LICENSE +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/__init__.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/ahoy.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/dicom_downloader.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/miner.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/pirate.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/__init__.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/bundle_processing_templates.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/fhirobj.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/imports.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/token_auth.py +0 -0
- {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/util.py +0 -0
|
@@ -1,35 +1,40 @@
|
|
|
1
|
-
Metadata-Version: 2.
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
2
|
Name: fhir-pyrate
|
|
3
|
-
Version: 0.2.3
|
|
3
|
+
Version: 0.2.4.post1
|
|
4
4
|
Summary: FHIR-PYrate is a package that provides a high-level API to query FHIR Servers for bundles of resources and return the structured information as pandas DataFrames. It can also be used to filter resources using RegEx and SpaCy and download DICOM studies and series.
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
Keywords:
|
|
8
|
-
|
|
9
|
-
Author-email: rene.hosch@uk-essen.de
|
|
10
|
-
Requires-Python: >=3.10,<4.0
|
|
11
|
-
Classifier: License :: OSI Approved :: MIT License
|
|
12
|
-
Classifier: Programming Language :: Python :: 3
|
|
5
|
+
Author-email: Giulia Baldini <giulia.baldini@uk-essen.de>, Rene Hosch <rene.hosch@uk-essen.de>
|
|
6
|
+
Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
|
|
7
|
+
Keywords: data-science,fhir,fhirpath,healthcare,python
|
|
8
|
+
Classifier: Programming Language :: Python :: 3 :: Only
|
|
13
9
|
Classifier: Programming Language :: Python :: 3.10
|
|
14
10
|
Classifier: Programming Language :: Python :: 3.11
|
|
15
11
|
Classifier: Programming Language :: Python :: 3.12
|
|
16
12
|
Classifier: Programming Language :: Python :: 3.13
|
|
13
|
+
Classifier: Programming Language :: Python :: 3.14
|
|
14
|
+
Requires-Python: >=3.10
|
|
15
|
+
Description-Content-Type: text/markdown
|
|
16
|
+
License-File: LICENSE
|
|
17
|
+
Requires-Dist: fhirpathpy>=0.2.2
|
|
18
|
+
Requires-Dist: numpy>=2
|
|
19
|
+
Requires-Dist: pandas>=2
|
|
20
|
+
Requires-Dist: pyjwt>=2.4
|
|
21
|
+
Requires-Dist: requests>=2.31
|
|
22
|
+
Requires-Dist: requests-cache>=0.9.7
|
|
23
|
+
Requires-Dist: tqdm>=4.56
|
|
17
24
|
Provides-Extra: all
|
|
25
|
+
Requires-Dist: dicomweb-client>=0.52; extra == "all"
|
|
26
|
+
Requires-Dist: pydicom>=3.0.1; extra == "all"
|
|
27
|
+
Requires-Dist: simpleitk>=2.0.2; extra == "all"
|
|
28
|
+
Requires-Dist: spacy>=3.0.6; extra == "all"
|
|
18
29
|
Provides-Extra: downloader
|
|
30
|
+
Requires-Dist: dicomweb-client>=0.52; extra == "downloader"
|
|
31
|
+
Requires-Dist: pydicom>=2.1.2; extra == "downloader"
|
|
32
|
+
Requires-Dist: simpleitk>=2.0.2; extra == "downloader"
|
|
19
33
|
Provides-Extra: miner
|
|
20
|
-
Requires-Dist:
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
Requires-Dist: numpy (>=2.0.0,<3.0.0)
|
|
25
|
-
Requires-Dist: pandas (>=2.0.0,<3.0.0)
|
|
26
|
-
Requires-Dist: pydicom (>=2.1.2,<3.0.0) ; extra == "downloader" or extra == "all"
|
|
27
|
-
Requires-Dist: requests (>=2.31.0,<3.0.0)
|
|
28
|
-
Requires-Dist: requests-cache (>=0.9.7,<0.10.0)
|
|
29
|
-
Requires-Dist: spacy (>=3.0.6,<4.0.0) ; extra == "miner" or extra == "all"
|
|
30
|
-
Requires-Dist: tqdm (>=4.56.0,<5.0.0)
|
|
31
|
-
Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
|
|
32
|
-
Description-Content-Type: text/markdown
|
|
34
|
+
Requires-Dist: spacy>=3.0.6; extra == "miner"
|
|
35
|
+
Dynamic: license-file
|
|
36
|
+
|
|
37
|
+
# FHIR-PYrate
|
|
33
38
|
|
|
34
39
|
[](https://opensource.org/licenses/MIT)
|
|
35
40
|
[](https://www.python.org/downloads/release/python-31011/)
|
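Reviewer note on the hunk above: the removed `Requires-Dist` lines use the parenthesised form with an upper bound, while the added lines use bare lower bounds only (`Requires-Python` likewise loses its `<4.0`). The sketch below makes that visible for the five core dependencies whose old entries are fully legible in this diff. The helper names `parse_requirement` and `upper_bounds` are made up for this illustration, and the regular expression is a simplification, not a full requirement parser.

```python
import re

# Unconditional Requires-Dist values copied from the removed (-) and added (+) lines.
OLD = [
    "numpy (>=2.0.0,<3.0.0)",
    "pandas (>=2.0.0,<3.0.0)",
    "requests (>=2.31.0,<3.0.0)",
    "requests-cache (>=0.9.7,<0.10.0)",
    "tqdm (>=4.56.0,<5.0.0)",
]
NEW = [
    "numpy>=2",
    "pandas>=2",
    "requests>=2.31",
    "requests-cache>=0.9.7",
    "tqdm>=4.56",
]


def parse_requirement(value):
    """Split a Requires-Dist value into (name, [specifier, ...]), ignoring markers."""
    value = value.split(";", 1)[0].strip()
    match = re.match(r"^([A-Za-z0-9._-]+)\s*\(?([^)]*)\)?$", value)
    name, specs = match.group(1), match.group(2)
    return name.lower(), [s.strip() for s in specs.split(",") if s.strip()]


def upper_bounds(values):
    """Map each package name to its '<' specifiers (an empty list means no cap)."""
    result = {}
    for value in values:
        name, specs = parse_requirement(value)
        result[name] = [s for s in specs if s.startswith("<")]
    return result


old_caps = upper_bounds(OLD)
new_caps = upper_bounds(NEW)
for name in old_caps:
    print(f"{name}: {old_caps[name]} -> {new_caps[name]}")
```

In practice this means an environment resolved against 0.2.4.post1 may pick up newer major versions of these packages than one resolved against 0.2.3 would have allowed.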
|
@@ -46,9 +51,10 @@ pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.co
|
|
|
46
51
|
|
|
47
52
|
**If you use this package, please cite:**
|
|
48
53
|
|
|
49
|
-
Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
|
|
54
|
+
Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). <https://doi.org/10.1186/s12913-023-09498-1>
|
|
50
55
|
|
|
51
56
|
There are four main classes:
|
|
57
|
+
|
|
52
58
|
* [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
|
|
53
59
|
([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
|
|
54
60
|
[2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)),
|
|
@@ -75,37 +81,41 @@ problems with the authentication (or anything else really), please just create a
|
|
|
75
81
|
Table of Contents:
|
|
76
82
|
|
|
77
83
|
* [Install](https://github.com/UMEssen/FHIR-PYrate/#install)
|
|
78
|
-
|
|
79
|
-
|
|
84
|
+
* [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
|
|
85
|
+
* [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
|
|
80
86
|
* [Run Tests](https://github.com/UMEssen/FHIR-PYrate/#run-tests)
|
|
81
87
|
* [Explanations & Examples](https://github.com/UMEssen/FHIR-PYrate/#explanations--examples)
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
|
|
88
|
+
* [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
|
|
89
|
+
* [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
|
|
90
|
+
* [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
|
|
91
|
+
* [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
|
|
92
|
+
* [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
|
|
93
|
+
* [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
|
|
94
|
+
* [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
|
|
95
|
+
* [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
|
|
90
96
|
* [Contributing](https://github.com/UMEssen/FHIR-PYrate/#contributing)
|
|
91
97
|
* [Authors and acknowledgment](https://github.com/UMEssen/FHIR-PYrate/#authors-and-acknowledgment)
|
|
92
98
|
* [License](https://github.com/UMEssen/FHIR-PYrate/#license)
|
|
93
99
|
* [Project status](https://github.com/UMEssen/FHIR-PYrate/#project-status)
|
|
94
100
|
|
|
95
|
-
|
|
96
101
|
## Install
|
|
97
102
|
|
|
98
103
|
### Either Pip
|
|
104
|
+
|
|
99
105
|
The package can be installed using PyPi
|
|
106
|
+
|
|
100
107
|
```bash
|
|
101
108
|
pip install fhir-pyrate
|
|
102
109
|
```
|
|
110
|
+
|
|
103
111
|
or using GitHub (always the newest version).
|
|
112
|
+
|
|
104
113
|
```bash
|
|
105
114
|
pip install git+https://github.com/UMEssen/FHIR-PYrate.git
|
|
106
115
|
```
|
|
107
116
|
|
|
108
117
|
These two commands only install the packages needed for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
|
|
118
|
+
|
|
109
119
|
```bash
|
|
110
120
|
pip install "fhir-pyrate[miner]" # only for miner
|
|
111
121
|
pip install "fhir-pyrate[downloader]" # only for downloader
|
|
@@ -113,35 +123,46 @@ pip install "fhir-pyrate[all]" # for both
|
|
|
113
123
|
```
|
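Reviewer note: the `miner`, `downloader` and `all` extras used in the install commands above correspond to the `extra == "..."` markers on the `Requires-Dist` lines added to PKG-INFO in this release. The sketch below groups those lines by extra; the metadata text is copied from the added lines of this diff, and `group_by_extra` is a hypothetical helper written for this note.

```python
import re
from collections import defaultdict

# Requires-Dist lines that carry an extra marker, copied from the new PKG-INFO.
METADATA = '''\
Requires-Dist: dicomweb-client>=0.52; extra == "all"
Requires-Dist: pydicom>=3.0.1; extra == "all"
Requires-Dist: simpleitk>=2.0.2; extra == "all"
Requires-Dist: spacy>=3.0.6; extra == "all"
Requires-Dist: dicomweb-client>=0.52; extra == "downloader"
Requires-Dist: pydicom>=2.1.2; extra == "downloader"
Requires-Dist: simpleitk>=2.0.2; extra == "downloader"
Requires-Dist: spacy>=3.0.6; extra == "miner"
'''


def group_by_extra(lines):
    """Group requirement strings by the extra named in their marker (None if absent)."""
    groups = defaultdict(list)
    for line in lines:
        if not line.startswith("Requires-Dist:"):
            continue
        value = line.split(":", 1)[1].strip()
        requirement, _, marker = value.partition(";")
        match = re.search(r'extra\s*==\s*"([^"]+)"', marker)
        groups[match.group(1) if match else None].append(requirement.strip())
    return dict(groups)


extras = group_by_extra(METADATA.splitlines())
for extra_name, requirements in extras.items():
    print(extra_name, requirements)
```

Note that the `all` extra asks for `pydicom>=3.0.1` while `downloader` asks for `pydicom>=2.1.2`. For an installed distribution, `importlib.metadata.requires("fhir-pyrate")` from the standard library should return the same requirement strings without the `Requires-Dist:` prefix, so the hard-coded text here is only for illustration.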
|
114
124
|
|
|
115
125
|
### Or Within Poetry
|
|
126
|
+
|
|
116
127
|
We can also use poetry for this same purpose. Using PyPi we need to run the following commands.
|
|
128
|
+
|
|
117
129
|
```bash
|
|
118
130
|
poetry add fhir-pyrate
|
|
119
131
|
poetry install
|
|
120
132
|
```
|
|
133
|
+
|
|
121
134
|
Whereas to add it from GitHub, we have different options, because until recently
|
|
122
135
|
[poetry used to exclusively install from the master branch](https://github.com/python-poetry/poetry/issues/3366).
|
|
123
136
|
|
|
124
137
|
Poetry 1.2.0a2+:
|
|
138
|
+
|
|
125
139
|
```bash
|
|
126
140
|
poetry add git+https://github.com/UMEssen/FHIR-PYrate.git
|
|
127
141
|
poetry install
|
|
128
142
|
```
|
|
143
|
+
|
|
129
144
|
For the previous versions you need to add the following line to your `pyproject.toml` file:
|
|
145
|
+
|
|
130
146
|
```bash
|
|
131
147
|
fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main"}
|
|
132
148
|
```
|
|
149
|
+
|
|
133
150
|
and then run
|
|
151
|
+
|
|
134
152
|
```bash
|
|
135
153
|
poetry lock
|
|
136
154
|
```
|
|
137
155
|
|
|
138
156
|
Also in poetry, the above only installs the packages for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
|
|
157
|
+
|
|
139
158
|
```bash
|
|
140
159
|
poetry add "fhir-pyrate[miner]" # only for miner
|
|
141
160
|
poetry add "fhir-pyrate[downloader]" # only for downloader
|
|
142
161
|
poetry add "fhir-pyrate[all]" # for both
|
|
143
162
|
```
|
|
163
|
+
|
|
144
164
|
or by adding the following to your `pyproject.toml` file:
|
|
165
|
+
|
|
145
166
|
```bash
|
|
146
167
|
fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main", extras = ["all"]}
|
|
147
168
|
```
|
|
@@ -209,6 +230,7 @@ search = Pirate(
|
|
|
209
230
|
```
|
|
210
231
|
|
|
211
232
|
The Pirate functions do one of three things:
|
|
233
|
+
|
|
212
234
|
1. They run the query and collect the resources and store them in a generator of bundles.
|
|
213
235
|
* `steal_bundles`: single process, no timespan to specify
|
|
214
236
|
* `sail_through_search_space`: multiprocess, divide&conquer with many smaller timespans
|
|
@@ -230,7 +252,6 @@ The Pirate functions do one of three things:
|
|
|
230
252
|
| sail_through_search_space_to_dataframe | 3 | Yes | No | DataFrame |
|
|
231
253
|
| trade_rows_for_dataframe | 3 | Yes | Yes | DataFrame |
|
|
232
254
|
|
|
233
|
-
|
|
234
255
|
**CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
|
|
235
256
|
This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
|
|
236
257
|
need to download a lot of data and you are always doing the same requests.
|
|
@@ -309,7 +330,8 @@ is the column where the values that we want to search for are stored.
|
|
|
309
330
|
Additionally, a system can be used to better identify the constraints of the DataFrame.
|
|
310
331
|
For example, let us assume that we have a column of the DataFrame (called `loinc_code`) that
|
|
311
332
|
contains a bunch of different LOINC codes. Our `df_constraints` could look as follows:
|
|
312
|
-
|
|
333
|
+
|
|
334
|
+
```python
|
|
313
335
|
df_constraints={"code": ("http://loinc.org", "loinc_code")}
|
|
314
336
|
```
|
|
315
337
|
|
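Reviewer note: one way to read the `(system, column)` tuple above is that each value of the column is prefixed with the system, using the `system|code` form that FHIR token search parameters accept. The sketch below illustrates that reading with plain dictionaries instead of a DataFrame. It is an assumption about the intent, not the library's implementation; `build_token_params` is a hypothetical helper and the two LOINC codes are example values.

```python
from urllib.parse import urlencode


def build_token_params(rows, df_constraints):
    """Yield one dict of search parameters per row, as implied by df_constraints."""
    for row in rows:
        params = {}
        for search_param, constraint in df_constraints.items():
            if isinstance(constraint, tuple):
                # (system, column): join system and value as a FHIR token.
                system, column = constraint
                params[search_param] = f"{system}|{row[column]}"
            else:
                # Plain column name: use the value as is.
                params[search_param] = str(row[constraint])
        yield params


rows = [{"loinc_code": "718-7"}, {"loinc_code": "2345-7"}]
df_constraints = {"code": ("http://loinc.org", "loinc_code")}

for params in build_token_params(rows, df_constraints):
    # urlencode percent-encodes the '|' separator for use in a query string.
    print(urlencode(params))
```

Naming the system narrows the match to codes from that code system, which matters when different systems reuse the same code string.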
|
@@ -323,10 +345,12 @@ converted to a `DataFrame` using this function.
|
|
|
323
345
|
|
|
324
346
|
The `bundles_to_dataframe` has three options on how to handle and extract the relevant information
|
|
325
347
|
from the bundles:
|
|
348
|
+
|
|
326
349
|
1. Extract everything, in this case you can use the
|
|
327
350
|
[`flatten_data`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/util/bundle_processing_templates.py)
|
|
328
351
|
function, which is already the default for `process_function`, so you do not actually need to
|
|
329
352
|
specify anything.
|
|
353
|
+
|
|
330
354
|
```python
|
|
331
355
|
# Create bundles with Pirate
|
|
332
356
|
search = ...
|
|
@@ -336,10 +360,12 @@ df = search.bundles_to_dataframe(
|
|
|
336
360
|
bundles=bundles,
|
|
337
361
|
)
|
|
338
362
|
```
|
|
339
|
-
|
|
363
|
+
|
|
364
|
+
1. Use a processing function where you define exactly which attributes are needed by iterating
|
|
340
365
|
through the entries and selecting the elements. The values that will be added to the
|
|
341
366
|
dictionary represent the columns of the DataFrame. For an example of when it might make sense
|
|
342
367
|
to do this, check [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
|
|
368
|
+
|
|
343
369
|
```python
|
|
344
370
|
from typing import List, Dict
|
|
345
371
|
from fhir_pyrate.util.fhirobj import FHIRObj
|
|
@@ -364,12 +390,14 @@ df = search.bundles_to_dataframe(
|
|
|
364
390
|
process_function=get_diagnostic_text,
|
|
365
391
|
)
|
|
366
392
|
```
|
|
367
|
-
|
|
393
|
+
|
|
394
|
+
1. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
|
|
368
395
|
of strings that follow the [FHIRPath](https://hl7.org/fhirpath/) standard. For this purpose, we
|
|
369
396
|
use the [fhirpath-py](https://github.com/beda-software/fhirpath-py) package, which uses the
|
|
370
397
|
[antlr4](https://github.com/antlr/antlr4) parser. Additionally, you can use tuples like `(key,
|
|
371
398
|
fhir_path)`, where `key` will be the name of the column the information derived from that
|
|
372
399
|
FHIRPath will be stored.
|
|
400
|
+
|
|
373
401
|
```python
|
|
374
402
|
# Create bundles with Pirate
|
|
375
403
|
search = ...
|
|
@@ -380,6 +408,7 @@ df = search.bundles_to_dataframe(
|
|
|
380
408
|
fhir_paths=["id", ("code", "code.coding"), ("identifier", "identifier[0].code")],
|
|
381
409
|
)
|
|
382
410
|
```
|
|
411
|
+
|
|
383
412
|
**NOTE 1 on FHIR paths**: The standard also allows some primitive math operations such as modulus
|
|
384
413
|
(`mod`) or integer division (`div`), and this may be problematic if there are fields of the
|
|
385
414
|
resource that use these terms as attributes.
|
|
@@ -390,7 +419,8 @@ instead (as in 2.).
|
|
|
390
419
|
**NOTE 2 on FHIR paths**: Since it is possible to specify the column name with a tuple
|
|
391
420
|
`(key, fhir_path)`, it is important to know that if a key is used multiple times for different
|
|
392
421
|
pieces of information but for the same resource, the field will be only filled with the first
|
|
393
|
-
|
|
422
|
+
occurrence that is not None.
|
|
423
|
+
|
|
394
424
|
```python
|
|
395
425
|
df = search.steal_bundles_to_dataframe(
|
|
396
426
|
resource_type="DiagnosticReport",
|
|
@@ -418,6 +448,7 @@ df = search.steal_bundles_to_dataframe(
|
|
|
418
448
|
```
|
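Reviewer note: the rule in NOTE 2 (a key that is reused for several paths keeps the first value that is not `None`) can be sketched in a few lines. This is a simplified illustration and not the library's code: `lookup` is a hypothetical stand-in that only follows dotted dictionary keys, where the real package evaluates FHIRPath expressions, and the sample resource is invented.

```python
def normalize_paths(fhir_paths):
    """Turn 'path' and (key, path) entries into uniform (key, path) pairs."""
    return [p if isinstance(p, tuple) else (p, p) for p in fhir_paths]


def lookup(resource, path):
    """Hypothetical stand-in for a FHIRPath engine: dotted dictionary lookup only."""
    value = resource
    for part in path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def extract_row(resource, fhir_paths):
    """Build one row; a repeated key keeps its first value that is not None."""
    row = {}
    for key, path in normalize_paths(fhir_paths):
        value = lookup(resource, path)
        if row.get(key) is None:
            row[key] = value
    return row


report = {"id": "r1", "conclusion": None, "code": {"text": "Chest CT"}}
row = extract_row(
    report,
    ["id", ("summary", "conclusion"), ("summary", "code.text")],
)
print(row)
```

Here `summary` ends up holding the value from `code.text`, because `conclusion` is `None` for this resource. If both paths had produced a value, only the first one listed would have been kept.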
|
419
449
|
|
|
420
450
|
#### [`***_dataframe`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/pirate.py)
|
|
451
|
+
|
|
421
452
|
The `steal_bundles_to_dataframe`, `sail_through_search_space_to_dataframe` and `trade_rows_for_dataframe`
|
|
422
453
|
are facade functions which retrieve the bundles and then run `bundles_to_dataframe`.
|
|
423
454
|
|
|
@@ -437,6 +468,7 @@ More on that in the following section.
|
|
|
437
468
|
|
|
438
469
|
Not all FHIR servers allow this (at least not the public ones that we have tried),
|
|
439
470
|
but it is also possible to obtain multiple resources with just one query:
|
|
471
|
+
|
|
440
472
|
```python
|
|
441
473
|
search = ...
|
|
442
474
|
result_dfs = search.steal_bundles_to_dataframe(
|
|
@@ -464,6 +496,7 @@ result_dfs = search.steal_bundles_to_dataframe(
|
|
|
464
496
|
num_pages=1,
|
|
465
497
|
)
|
|
466
498
|
```
|
|
499
|
+
|
|
467
500
|
In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
|
|
468
501
|
You can then select a single DataFrame by doing `result_dfs["ImagingStudy"]`
|
|
469
502
|
or `result_dfs["Patient"]`.
|
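Reviewer note: the dictionary keyed by resource type described above can be pictured as a grouping step over the bundle entries. The sketch below does this for a hand-written FHIR Bundle and produces lists of resources rather than DataFrames, to stay within the standard library. `group_by_resource_type` is a hypothetical helper and the bundle content is invented for illustration.

```python
from collections import defaultdict


def group_by_resource_type(bundle):
    """Group the resources of a FHIR Bundle (as a dict) by their resourceType."""
    groups = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        groups[resource.get("resourceType")].append(resource)
    return dict(groups)


bundle = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "ImagingStudy", "id": "study-1"}},
        {"resource": {"resourceType": "Patient", "id": "patient-1"}},
        {"resource": {"resourceType": "ImagingStudy", "id": "study-2"}},
    ],
}

result = group_by_resource_type(bundle)
print(sorted(result))
print([r["id"] for r in result["ImagingStudy"]])
```

Each list could then be turned into its own table, which mirrors the `result_dfs["ImagingStudy"]` and `result_dfs["Patient"]` access pattern in the README text.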
|
@@ -485,14 +518,9 @@ such that only the ones containing
|
|
|
485
518
|
the actual resource name are kept if the resource name is specified in the path,
|
|
486
519
|
and that a column full of `None`s is obtained in case no resource type is specified.
|
|
487
520
|
|
|
488
|
-
|
|
489
521
|
### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
|
|
490
522
|
|
|
491
|
-
|
|
492
|
-
<div align="center">
|
|
493
|
-
<img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg" alt="Logo" width="718" height="230">
|
|
494
|
-
</div>
|
|
495
|
-
<br />
|
|
523
|
+

|
|
496
524
|
|
|
497
525
|
The **Miner** takes a DataFrame and searches it for a particular regular expression
|
|
498
526
|
with the help of [SpaCy](https://spacy.io/).
|
|
@@ -604,14 +632,15 @@ request. You can also simply open an issue with the tag "enhancement".
|
|
|
604
632
|
|
|
605
633
|
This package was developed by the [SHIP-AI group at the Institute for Artificial Intelligence in Medicine](https://ship-ai.ikim.nrw/).
|
|
606
634
|
|
|
607
|
-
|
|
608
|
-
|
|
635
|
+
* [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
|
|
636
|
+
* [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
|
|
609
637
|
|
|
610
638
|
We would like to thank [razorx89](https://github.com/razorx89), [butterpear](https://github.com/butterpear), [vkyprmr](https://github.com/vkyprmr), [Wizzzard93](https://github.com/Wizzzard93), [karzideh](https://github.com/karzideh) and [luckfamousa](https://github.com/luckfamousa) for their input, time and effort.
|
|
611
639
|
|
|
612
640
|
## License
|
|
641
|
+
|
|
613
642
|
This project is licenced under the [MIT Licence](LICENSE).
|
|
614
643
|
|
|
615
644
|
## Project status
|
|
616
|
-
The project is in active development.
|
|
617
645
|
|
|
646
|
+
The project is in active development.
|
|
@@ -1,3 +1,5 @@
|
|
|
1
|
+
# FHIR-PYrate
|
|
2
|
+
|
|
1
3
|
[](https://opensource.org/licenses/MIT)
|
|
2
4
|
[](https://www.python.org/downloads/release/python-31011/)
|
|
3
5
|
[](https://pypi.org/project/fhir-pyrate/)
|
|
@@ -13,9 +15,10 @@ pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.co
|
|
|
13
15
|
|
|
14
16
|
**If you use this package, please cite:**
|
|
15
17
|
|
|
16
|
-
Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
|
|
18
|
+
Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). <https://doi.org/10.1186/s12913-023-09498-1>
|
|
17
19
|
|
|
18
20
|
There are four main classes:
|
|
21
|
+
|
|
19
22
|
* [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
|
|
20
23
|
([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
|
|
21
24
|
[2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)),
|
|
@@ -42,37 +45,41 @@ problems with the authentication (or anything else really), please just create a
|
|
|
42
45
|
Table of Contents:
|
|
43
46
|
|
|
44
47
|
* [Install](https://github.com/UMEssen/FHIR-PYrate/#install)
|
|
45
|
-
|
|
46
|
-
|
|
48
|
+
* [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
|
|
49
|
+
* [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
|
|
47
50
|
* [Run Tests](https://github.com/UMEssen/FHIR-PYrate/#run-tests)
|
|
48
51
|
* [Explanations & Examples](https://github.com/UMEssen/FHIR-PYrate/#explanations--examples)
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
52
|
+
* [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
|
|
53
|
+
* [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
|
|
54
|
+
* [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
|
|
55
|
+
* [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
|
|
56
|
+
* [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
|
|
57
|
+
* [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
|
|
58
|
+
* [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
|
|
59
|
+
* [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
|
|
57
60
|
* [Contributing](https://github.com/UMEssen/FHIR-PYrate/#contributing)
|
|
58
61
|
* [Authors and acknowledgment](https://github.com/UMEssen/FHIR-PYrate/#authors-and-acknowledgment)
|
|
59
62
|
* [License](https://github.com/UMEssen/FHIR-PYrate/#license)
|
|
60
63
|
* [Project status](https://github.com/UMEssen/FHIR-PYrate/#project-status)
|
|
61
64
|
|
|
62
|
-
|
|
63
65
|
## Install
|
|
64
66
|
|
|
65
67
|
### Either Pip
|
|
68
|
+
|
|
66
69
|
The package can be installed using PyPi
|
|
70
|
+
|
|
67
71
|
```bash
|
|
68
72
|
pip install fhir-pyrate
|
|
69
73
|
```
|
|
74
|
+
|
|
70
75
|
or using GitHub (always the newest version).
|
|
76
|
+
|
|
71
77
|
```bash
|
|
72
78
|
pip install git+https://github.com/UMEssen/FHIR-PYrate.git
|
|
73
79
|
```
|
|
74
80
|
|
|
75
81
|
These two commands only install the packages needed for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
|
|
82
|
+
|
|
76
83
|
```bash
|
|
77
84
|
pip install "fhir-pyrate[miner]" # only for miner
|
|
78
85
|
pip install "fhir-pyrate[downloader]" # only for downloader
|
|
@@ -80,35 +87,46 @@ pip install "fhir-pyrate[all]" # for both
|
|
|
80
87
|
```
|
|
81
88
|
|
|
82
89
|
### Or Within Poetry
|
|
90
|
+
|
|
83
91
|
We can also use poetry for this same purpose. Using PyPi we need to run the following commands.
|
|
92
|
+
|
|
84
93
|
```bash
|
|
85
94
|
poetry add fhir-pyrate
|
|
86
95
|
poetry install
|
|
87
96
|
```
|
|
97
|
+
|
|
88
98
|
Whereas to add it from GitHub, we have different options, because until recently
|
|
89
99
|
[poetry used to exclusively install from the master branch](https://github.com/python-poetry/poetry/issues/3366).
|
|
90
100
|
|
|
91
101
|
Poetry 1.2.0a2+:
|
|
102
|
+
|
|
92
103
|
```bash
|
|
93
104
|
poetry add git+https://github.com/UMEssen/FHIR-PYrate.git
|
|
94
105
|
poetry install
|
|
95
106
|
```
|
|
107
|
+
|
|
96
108
|
For the previous versions you need to add the following line to your `pyproject.toml` file:
|
|
109
|
+
|
|
97
110
|
```bash
|
|
98
111
|
fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main"}
|
|
99
112
|
```
|
|
113
|
+
|
|
100
114
|
and then run
|
|
115
|
+
|
|
101
116
|
```bash
|
|
102
117
|
poetry lock
|
|
103
118
|
```
|
|
104
119
|
|
|
105
120
|
Also in poetry, the above only installs the packages for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
|
|
121
|
+
|
|
106
122
|
```bash
|
|
107
123
|
poetry add "fhir-pyrate[miner]" # only for miner
|
|
108
124
|
poetry add "fhir-pyrate[downloader]" # only for downloader
|
|
109
125
|
poetry add "fhir-pyrate[all]" # for both
|
|
110
126
|
```
|
|
127
|
+
|
|
111
128
|
or by adding the following to your `pyproject.toml` file:
|
|
129
|
+
|
|
112
130
|
```bash
|
|
113
131
|
fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main", extras = ["all"]}
|
|
114
132
|
```
|
|
@@ -176,6 +194,7 @@ search = Pirate(
|
|
|
176
194
|
```
|
|
177
195
|
|
|
178
196
|
The Pirate functions do one of three things:
|
|
197
|
+
|
|
179
198
|
1. They run the query and collect the resources and store them in a generator of bundles.
|
|
180
199
|
* `steal_bundles`: single process, no timespan to specify
|
|
181
200
|
* `sail_through_search_space`: multiprocess, divide&conquer with many smaller timespans
|
|
@@ -197,7 +216,6 @@ The Pirate functions do one of three things:
|
|
|
197
216
|
| sail_through_search_space_to_dataframe | 3 | Yes | No | DataFrame |
|
|
198
217
|
| trade_rows_for_dataframe | 3 | Yes | Yes | DataFrame |
|
|
199
218
|
|
|
200
|
-
|
|
201
219
|
**CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
|
|
202
220
|
This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
|
|
203
221
|
need to download a lot of data and you are always doing the same requests.
|
|
@@ -276,7 +294,8 @@ is the column where the values that we want to search for are stored.
|
|
|
276
294
|
Additionally, a system can be used to better identify the constraints of the DataFrame.
|
|
277
295
|
For example, let us assume that we have a column of the DataFrame (called `loinc_code`) that
|
|
278
296
|
contains a bunch of different LOINC codes. Our `df_constraints` could look as follows:
|
|
279
|
-
|
|
297
|
+
|
|
298
|
+
```python
|
|
280
299
|
df_constraints={"code": ("http://loinc.org", "loinc_code")}
|
|
281
300
|
```
|
|
282
301
|
|
|
@@ -290,10 +309,12 @@ converted to a `DataFrame` using this function.
|
|
|
290
309
|
|
|
291
310
|
The `bundles_to_dataframe` has three options on how to handle and extract the relevant information
|
|
292
311
|
from the bundles:
|
|
312
|
+
|
|
293
313
|
1. Extract everything, in this case you can use the
|
|
294
314
|
[`flatten_data`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/util/bundle_processing_templates.py)
|
|
295
315
|
function, which is already the default for `process_function`, so you do not actually need to
|
|
296
316
|
specify anything.
|
|
317
|
+
|
|
297
318
|
```python
|
|
298
319
|
# Create bundles with Pirate
|
|
299
320
|
search = ...
|
|
@@ -303,10 +324,12 @@ df = search.bundles_to_dataframe(
|
|
|
303
324
|
bundles=bundles,
|
|
304
325
|
)
|
|
305
326
|
```
|
|
306
|
-
|
|
327
|
+
|
|
328
|
+
1. Use a processing function where you define exactly which attributes are needed by iterating
|
|
307
329
|
through the entries and selecting the elements. The values that will be added to the
|
|
308
330
|
dictionary represent the columns of the DataFrame. For an example of when it might make sense
|
|
309
331
|
to do this, check [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
|
|
332
|
+
|
|
310
333
|
```python
|
|
311
334
|
from typing import List, Dict
|
|
312
335
|
from fhir_pyrate.util.fhirobj import FHIRObj
|
|
@@ -331,12 +354,14 @@ df = search.bundles_to_dataframe(
|
|
|
331
354
|
process_function=get_diagnostic_text,
|
|
332
355
|
)
|
|
333
356
|
```
|
|
334
|
-
|
|
357
|
+
|
|
358
|
+
1. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
|
|
335
359
|
of strings that follow the [FHIRPath](https://hl7.org/fhirpath/) standard. For this purpose, we
|
|
336
360
|
use the [fhirpath-py](https://github.com/beda-software/fhirpath-py) package, which uses the
|
|
337
361
|
[antlr4](https://github.com/antlr/antlr4) parser. Additionally, you can use tuples like `(key,
|
|
338
362
|
fhir_path)`, where `key` will be the name of the column the information derived from that
|
|
339
363
|
FHIRPath will be stored.
|
|
364
|
+
|
|
340
365
|
```python
|
|
341
366
|
# Create bundles with Pirate
|
|
342
367
|
search = ...
|
|
@@ -347,6 +372,7 @@ df = search.bundles_to_dataframe(
|
|
|
347
372
|
fhir_paths=["id", ("code", "code.coding"), ("identifier", "identifier[0].code")],
|
|
348
373
|
)
|
|
349
374
|
```
|
|
375
|
+
|
|
350
376
|
**NOTE 1 on FHIR paths**: The standard also allows some primitive math operations such as modulus
|
|
351
377
|
(`mod`) or integer division (`div`), and this may be problematic if there are fields of the
|
|
352
378
|
resource that use these terms as attributes.
|
|
@@ -357,7 +383,8 @@ instead (as in 2.).
|
|
|
357
383
|
**NOTE 2 on FHIR paths**: Since it is possible to specify the column name with a tuple
|
|
358
384
|
`(key, fhir_path)`, it is important to know that if a key is used multiple times for different
|
|
359
385
|
pieces of information but for the same resource, the field will be only filled with the first
|
|
360
|
-
|
|
386
|
+
occurrence that is not None.
|
|
387
|
+
|
|
361
388
|
```python
|
|
362
389
|
df = search.steal_bundles_to_dataframe(
|
|
363
390
|
resource_type="DiagnosticReport",
|
|
@@ -385,6 +412,7 @@ df = search.steal_bundles_to_dataframe(
|
|
|
385
412
|
```
|
|
386
413
|
|
|
387
414
|
#### [`***_dataframe`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/pirate.py)
|
|
415
|
+
|
|
388
416
|
The `steal_bundles_to_dataframe`, `sail_through_search_space_to_dataframe` and `trade_rows_for_dataframe`
|
|
389
417
|
are facade functions which retrieve the bundles and then run `bundles_to_dataframe`.
|
|
390
418
|
|
|
@@ -404,6 +432,7 @@ More on that in the following section.
|
|
|
404
432
|
|
|
405
433
|
Not all FHIR servers allow this (at least not the public ones that we have tried),
|
|
406
434
|
but it is also possible to obtain multiple resources with just one query:
|
|
435
|
+
|
|
407
436
|
```python
|
|
408
437
|
search = ...
|
|
409
438
|
result_dfs = search.steal_bundles_to_dataframe(
|
|
@@ -431,6 +460,7 @@ result_dfs = search.steal_bundles_to_dataframe(
|
|
|
431
460
|
num_pages=1,
|
|
432
461
|
)
|
|
433
462
|
```
|
|
463
|
+
|
|
434
464
|
In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
|
|
435
465
|
You can then select a single DataFrame by doing `result_dfs["ImagingStudy"]`
|
|
436
466
|
or `result_dfs["Patient"]`.
|
|
@@ -452,14 +482,9 @@ such that only the ones containing
|
|
|
452
482
|
the actual resource name are kept if the resource name is specified in the path,
|
|
453
483
|
and that a column full of `None`s is obtained in case no resource type is specified.
|
|
454
484
|
|
|
455
|
-
|
|
456
485
|
### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
|
|
457
486
|
|
|
458
|
-
|
|
459
|
-
<div align="center">
|
|
460
|
-
<img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg" alt="Logo" width="718" height="230">
|
|
461
|
-
</div>
|
|
462
|
-
<br />
|
|
487
|
+

|
|
463
488
|
|
|
464
489
|
The **Miner** takes a DataFrame and searches it for a particular regular expression
|
|
465
490
|
with the help of [SpaCy](https://spacy.io/).
|
|
@@ -571,13 +596,15 @@ request. You can also simply open an issue with the tag "enhancement".
|
|
|
571
596
|
|
|
572
597
|
This package was developed by the [SHIP-AI group at the Institute for Artificial Intelligence in Medicine](https://ship-ai.ikim.nrw/).
|
|
573
598
|
|
|
574
|
-
|
|
575
|
-
|
|
599
|
+
* [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
|
|
600
|
+
* [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
|
|
576
601
|
|
|
577
602
|
We would like to thank [razorx89](https://github.com/razorx89), [butterpear](https://github.com/butterpear), [vkyprmr](https://github.com/vkyprmr), [Wizzzard93](https://github.com/Wizzzard93), [karzideh](https://github.com/karzideh) and [luckfamousa](https://github.com/luckfamousa) for their input, time and effort.
|
|
578
603
|
|
|
579
604
|
## License
|
|
605
|
+
|
|
580
606
|
This project is licenced under the [MIT Licence](LICENSE).
|
|
581
607
|
|
|
582
608
|
## Project status
|
|
609
|
+
|
|
583
610
|
The project is in active development.
|