fhir-pyrate 0.2.3__tar.gz → 0.2.4.post1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (22)
  1. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/PKG-INFO +78 -49
  2. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/README.md +52 -25
  3. fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/PKG-INFO +646 -0
  4. fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/SOURCES.txt +20 -0
  5. fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/dependency_links.txt +1 -0
  6. fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/requires.txt +21 -0
  7. fhir_pyrate-0.2.4.post1/fhir_pyrate.egg-info/top_level.txt +1 -0
  8. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/pyproject.toml +59 -36
  9. fhir_pyrate-0.2.4.post1/setup.cfg +4 -0
  10. fhir_pyrate-0.2.4.post1/tests/test_public.py +690 -0
  11. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/LICENSE +0 -0
  12. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/__init__.py +0 -0
  13. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/ahoy.py +0 -0
  14. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/dicom_downloader.py +0 -0
  15. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/miner.py +0 -0
  16. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/pirate.py +0 -0
  17. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/__init__.py +0 -0
  18. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/bundle_processing_templates.py +0 -0
  19. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/fhirobj.py +0 -0
  20. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/imports.py +0 -0
  21. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/token_auth.py +0 -0
  22. {fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/fhir_pyrate/util/util.py +0 -0
{fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/PKG-INFO
@@ -1,35 +1,40 @@
1
- Metadata-Version: 2.1
1
+ Metadata-Version: 2.4
2
2
  Name: fhir-pyrate
3
- Version: 0.2.3
3
+ Version: 0.2.4.post1
4
4
  Summary: FHIR-PYrate is a package that provides a high-level API to query FHIR Servers for bundles of resources and return the structured information as pandas DataFrames. It can also be used to filter resources using RegEx and SpaCy and download DICOM studies and series.
5
- Home-page: https://github.com/UMEssen/FHIR-PYrate
6
- License: MIT
7
- Keywords: python,fhir,data-science,fhirpath,healthcare
8
- Author: Rene Hosch
9
- Author-email: rene.hosch@uk-essen.de
10
- Requires-Python: >=3.10,<4.0
11
- Classifier: License :: OSI Approved :: MIT License
12
- Classifier: Programming Language :: Python :: 3
5
+ Author-email: Giulia Baldini <giulia.baldini@uk-essen.de>, Rene Hosch <rene.hosch@uk-essen.de>
6
+ Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
7
+ Keywords: data-science,fhir,fhirpath,healthcare,python
8
+ Classifier: Programming Language :: Python :: 3 :: Only
13
9
  Classifier: Programming Language :: Python :: 3.10
14
10
  Classifier: Programming Language :: Python :: 3.11
15
11
  Classifier: Programming Language :: Python :: 3.12
16
12
  Classifier: Programming Language :: Python :: 3.13
13
+ Classifier: Programming Language :: Python :: 3.14
14
+ Requires-Python: >=3.10
15
+ Description-Content-Type: text/markdown
16
+ License-File: LICENSE
17
+ Requires-Dist: fhirpathpy>=0.2.2
18
+ Requires-Dist: numpy>=2
19
+ Requires-Dist: pandas>=2
20
+ Requires-Dist: pyjwt>=2.4
21
+ Requires-Dist: requests>=2.31
22
+ Requires-Dist: requests-cache>=0.9.7
23
+ Requires-Dist: tqdm>=4.56
17
24
  Provides-Extra: all
25
+ Requires-Dist: dicomweb-client>=0.52; extra == "all"
26
+ Requires-Dist: pydicom>=3.0.1; extra == "all"
27
+ Requires-Dist: simpleitk>=2.0.2; extra == "all"
28
+ Requires-Dist: spacy>=3.0.6; extra == "all"
18
29
  Provides-Extra: downloader
30
+ Requires-Dist: dicomweb-client>=0.52; extra == "downloader"
31
+ Requires-Dist: pydicom>=2.1.2; extra == "downloader"
32
+ Requires-Dist: simpleitk>=2.0.2; extra == "downloader"
19
33
  Provides-Extra: miner
20
- Requires-Dist: PyJWT (>=2.4.0,<3.0.0)
21
- Requires-Dist: SimpleITK (>=2.0.2,<3.0.0) ; extra == "downloader" or extra == "all"
22
- Requires-Dist: dicomweb-client (>=0.52.0,<0.53.0) ; extra == "downloader" or extra == "all"
23
- Requires-Dist: fhirpathpy (>=0.2.2,<0.3.0)
24
- Requires-Dist: numpy (>=2.0.0,<3.0.0)
25
- Requires-Dist: pandas (>=2.0.0,<3.0.0)
26
- Requires-Dist: pydicom (>=2.1.2,<3.0.0) ; extra == "downloader" or extra == "all"
27
- Requires-Dist: requests (>=2.31.0,<3.0.0)
28
- Requires-Dist: requests-cache (>=0.9.7,<0.10.0)
29
- Requires-Dist: spacy (>=3.0.6,<4.0.0) ; extra == "miner" or extra == "all"
30
- Requires-Dist: tqdm (>=4.56.0,<5.0.0)
31
- Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
32
- Description-Content-Type: text/markdown
34
+ Requires-Dist: spacy>=3.0.6; extra == "miner"
35
+ Dynamic: license-file
36
+
37
+ # FHIR-PYrate
33
38
 
34
39
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
35
40
  [![Supported Python version](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-31011/)
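The `Requires-Dist` lines in this PKG-INFO hunk drop the upper bounds that 0.2.3 declared (for example, `requests-cache (>=0.9.7,<0.10.0)` becomes `requests-cache>=0.9.7`). The practical effect can be sketched as follows. This is a minimal illustration that compares plain dotted release numbers as padded tuples; it is not a full PEP 440 implementation, the helper names `parse` and `allowed` are hypothetical, and the sample version strings are chosen only to show the comparison.

```python
from typing import Optional, Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn a plain dotted release such as '0.52' into a padded tuple (0, 52, 0)."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def allowed(version: str, lower: str, upper: Optional[str] = None) -> bool:
    """Check lower <= version, and version < upper when an upper bound exists."""
    v = parse(version)
    if v < parse(lower):
        return False
    return upper is None or v < parse(upper)


# (lower, upper) bounds copied from the removed and added Requires-Dist lines.
old_bounds = {"requests-cache": ("0.9.7", "0.10.0"), "fhirpathpy": ("0.2.2", "0.3.0")}
new_bounds = {"requests-cache": ("0.9.7", None), "fhirpathpy": ("0.2.2", None)}

# Sample version strings, used only to illustrate the comparison.
samples = {"requests-cache": "1.2.0", "fhirpathpy": "0.2.5"}

for name, version in samples.items():
    before = allowed(version, *old_bounds[name])
    after = allowed(version, *new_bounds[name])
    print(f"{name} {version}: allowed by 0.2.3 = {before}, by 0.2.4.post1 = {after}")
```

In this sketch the `requests-cache` sample is rejected by the old bounds and accepted by the new ones, while the `fhirpathpy` sample satisfies both. A resolver such as pip applies the real PEP 440 rules, which also cover pre-releases and post-releases like `0.2.4.post1`.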
@@ -46,9 +51,10 @@ pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.co
46
51
 
47
52
  **If you use this package, please cite:**
48
53
 
49
- Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
54
+ Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). <https://doi.org/10.1186/s12913-023-09498-1>
50
55
 
51
56
  There are four main classes:
57
+
52
58
  * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
53
59
  ([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
54
60
  [2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)),
@@ -75,37 +81,41 @@ problems with the authentication (or anything else really), please just create a
75
81
  Table of Contents:
76
82
 
77
83
  * [Install](https://github.com/UMEssen/FHIR-PYrate/#install)
78
- * [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
79
- * [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
84
+ * [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
85
+ * [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
80
86
  * [Run Tests](https://github.com/UMEssen/FHIR-PYrate/#run-tests)
81
87
  * [Explanations &amp; Examples](https://github.com/UMEssen/FHIR-PYrate/#explanations--examples)
82
- * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
83
- * [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
84
- * [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
85
- * [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
86
- * [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
87
- * [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
88
- * [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
89
- * [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
88
+ * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
89
+ * [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
90
+ * [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
91
+ * [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
92
+ * [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
93
+ * [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
94
+ * [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
95
+ * [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
90
96
  * [Contributing](https://github.com/UMEssen/FHIR-PYrate/#contributing)
91
97
  * [Authors and acknowledgment](https://github.com/UMEssen/FHIR-PYrate/#authors-and-acknowledgment)
92
98
  * [License](https://github.com/UMEssen/FHIR-PYrate/#license)
93
99
  * [Project status](https://github.com/UMEssen/FHIR-PYrate/#project-status)
94
100
 
95
-
96
101
  ## Install
97
102
 
98
103
  ### Either Pip
104
+
99
105
  The package can be installed using PyPi
106
+
100
107
  ```bash
101
108
  pip install fhir-pyrate
102
109
  ```
110
+
103
111
  or using GitHub (always the newest version).
112
+
104
113
  ```bash
105
114
  pip install git+https://github.com/UMEssen/FHIR-PYrate.git
106
115
  ```
107
116
 
108
117
  These two commands only install the packages needed for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
118
+
109
119
  ```bash
110
120
  pip install "fhir-pyrate[miner]" # only for miner
111
121
  pip install "fhir-pyrate[downloader]" # only for downloader
@@ -113,35 +123,46 @@ pip install "fhir-pyrate[all]" # for both
113
123
  ```
114
124
 
115
125
  ### Or Within Poetry
126
+
116
127
  We can also use poetry for this same purpose. Using PyPi we need to run the following commands.
128
+
117
129
  ```bash
118
130
  poetry add fhir-pyrate
119
131
  poetry install
120
132
  ```
133
+
121
134
  Whereas to add it from GitHub, we have different options, because until recently
122
135
  [poetry used to exclusively install from the master branch](https://github.com/python-poetry/poetry/issues/3366).
123
136
 
124
137
  Poetry 1.2.0a2+:
138
+
125
139
  ```bash
126
140
  poetry add git+https://github.com/UMEssen/FHIR-PYrate.git
127
141
  poetry install
128
142
  ```
143
+
129
144
  For the previous versions you need to add the following line to your `pyproject.toml` file:
145
+
130
146
  ```bash
131
147
  fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main"}
132
148
  ```
149
+
133
150
  and then run
151
+
134
152
  ```bash
135
153
  poetry lock
136
154
  ```
137
155
 
138
156
  Also in poetry, the above only installs the packages for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
157
+
139
158
  ```bash
140
159
  poetry add "fhir-pyrate[miner]" # only for miner
141
160
  poetry add "fhir-pyrate[downloader]" # only for downloader
142
161
  poetry add "fhir-pyrate[all]" # for both
143
162
  ```
163
+
144
164
  or by adding the following to your `pyproject.toml` file:
165
+
145
166
  ```bash
146
167
  fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main", extras = ["all"]}
147
168
  ```
@@ -209,6 +230,7 @@ search = Pirate(
209
230
  ```
210
231
 
211
232
  The Pirate functions do one of three things:
233
+
212
234
  1. They run the query and collect the resources and store them in a generator of bundles.
213
235
  * `steal_bundles`: single process, no timespan to specify
214
236
  * `sail_through_search_space`: multiprocess, divide&conquer with many smaller timespans
@@ -230,7 +252,6 @@ The Pirate functions do one of three things:
230
252
  | sail_through_search_space_to_dataframe | 3 | Yes | No | DataFrame |
231
253
  | trade_rows_for_dataframe | 3 | Yes | Yes | DataFrame |
232
254
 
233
-
234
255
  **CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
235
256
  This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
236
257
  need to download a lot of data and you are always doing the same requests.
@@ -309,7 +330,8 @@ is the column where the values that we want to search for are stored.
309
330
  Additionally, a system can be used to better identify the constraints of the DataFrame.
310
331
  For example, let us assume that we have a column of the DataFrame (called `loinc_code` that
311
332
  contains a bunch of different LOINC codes. Our `df_constraints` could look as follows:
312
- ```
333
+
334
+ ```python
313
335
  df_constraints={"code": ("http://loinc.org", "loinc_code")}
314
336
  ```
315
337
 
@@ -323,10 +345,12 @@ converted to a `DataFrame` using this function.
323
345
 
324
346
  The `bundles_to_dataframe` has three options on how to handle and extract the relevant information
325
347
  from the bundles:
348
+
326
349
  1. Extract everything, in this case you can use the
327
350
  [`flatten_data`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/util/bundle_processing_templates.py)
328
351
  function, which is already the default for `process_function`, so you do not actually need to
329
352
  specify anything.
353
+
330
354
  ```python
331
355
  # Create bundles with Pirate
332
356
  search = ...
@@ -336,10 +360,12 @@ df = search.bundles_to_dataframe(
336
360
  bundles=bundles,
337
361
  )
338
362
  ```
339
- 2. Use a processing function where you define exactly which attributes are needed by iterating
363
+
364
+ 1. Use a processing function where you define exactly which attributes are needed by iterating
340
365
  through the entries and selecting the elements. The values that will be added to the
341
366
  dictionary represent the columns of the DataFrame. For an example of when it might make sense
342
367
  to do this, check [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
368
+
343
369
  ```python
344
370
  from typing import List, Dict
345
371
  from fhir_pyrate.util.fhirobj import FHIRObj
@@ -364,12 +390,14 @@ df = search.bundles_to_dataframe(
364
390
  process_function=get_diagnostic_text,
365
391
  )
366
392
  ```
367
- 3. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
393
+
394
+ 1. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
368
395
  of string that follow the [FHIRPath](https://hl7.org/fhirpath/) standard. For this purpose, we
369
396
  use the [fhirpath-py](https://github.com/beda-software/fhirpath-py) package, which uses the
370
397
  [antr4](https://github.com/antlr/antlr4) parser. Additionally, you can use tuples like `(key,
371
398
  fhir_path)`, where `key` will be the name of the column the information derived from that
372
399
  FHIRPath will be stored.
400
+
373
401
  ```python
374
402
  # Create bundles with Pirate
375
403
  search = ...
@@ -380,6 +408,7 @@ df = search.bundles_to_dataframe(
380
408
  fhir_paths=["id", ("code", "code.coding"), ("identifier", "identifier[0].code")],
381
409
  )
382
410
  ```
411
+
383
412
  **NOTE 1 on FHIR paths**: The standard also allows some primitive math operations such as modulus
384
413
  (`mod`) or integer division (`div`), and this may be problematic if there are fields of the
385
414
  resource that use these terms as attributes.
@@ -390,7 +419,8 @@ instead (as in 2.).
390
419
  **NOTE 2 on FHIR paths**: Since it is possible to specify the column name with a tuple
391
420
  `(key, fhir_path)`, it is important to know that if a key is used multiple times for different
392
421
  pieces of information but for the same resource, the field will be only filled with the first
393
- occurence that is not None.
422
+ occurrence that is not None.
423
+
394
424
  ```python
395
425
  df = search.steal_bundles_to_dataframe(
396
426
  resource_type="DiagnosticReport",
@@ -418,6 +448,7 @@ df = search.steal_bundles_to_dataframe(
418
448
  ```
419
449
 
420
450
  #### [`***_dataframe`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/pirate.py)
451
+
421
452
  The `steal_bundles_to_dataframe`, `sail_through_search_space_to_dataframe` and `trade_rows_for_dataframe`
422
453
  are facade functions which retrieve the bundles and then run `bundles_to_dataframe`.
423
454
 
@@ -437,6 +468,7 @@ More on that in the following section.
437
468
 
438
469
  Not all FHIR servers allow this (at least not the public ones that we have tried),
439
470
  but it is also possible to obtain multiple resources with just one query:
471
+
440
472
  ```python
441
473
  search = ...
442
474
  result_dfs = search.steal_bundles_to_dataframe(
@@ -464,6 +496,7 @@ result_dfs = search.steal_bundles_to_dataframe(
464
496
  num_pages=1,
465
497
  )
466
498
  ```
499
+
467
500
  In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
468
501
  You can then select the single dictionary by doing `result_dfs["ImagingStudy"]`
469
502
  or `result_dfs["Patient"]`.
@@ -485,14 +518,9 @@ such that only the ones containing
485
518
  the actual resource name are kept if the resource name is specified in the path,
486
519
  and that a column full of `None`s is obtained in case no resource type is specified.
487
520
 
488
-
489
521
  ### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
490
522
 
491
- <br />
492
- <div align="center">
493
- <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg" alt="Logo" width="718" height="230">
494
- </div>
495
- <br />
523
+ ![FHIR-PYrate Logo](https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg)
496
524
 
497
525
  The **Miner** takes a DataFrame and searches it for a particular regular expression
498
526
  with the help of [SpaCy](https://spacy.io/).
@@ -604,14 +632,15 @@ request. You can also simply open an issue with the tag "enhancement".
604
632
 
605
633
  This package was developed by the [SHIP-AI group at the Institute for Artificial Intelligence in Medicine](https://ship-ai.ikim.nrw/).
606
634
 
607
- - [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
608
- - [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
635
+ * [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
636
+ * [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
609
637
 
610
638
  We would like to thank [razorx89](https://github.com/razorx89), [butterpear](https://github.com/butterpear), [vkyprmr](https://github.com/vkyprmr), [Wizzzard93](https://github.com/Wizzzard93), [karzideh](https://github.com/karzideh) and [luckfamousa](https://github.com/luckfamousa) for their input, time and effort.
611
639
 
612
640
  ## License
641
+
613
642
  This project is licenced under the [MIT Licence](LICENSE).
614
643
 
615
644
  ## Project status
616
- The project is in active development.
617
645
 
646
+ The project is in active development.
{fhir_pyrate-0.2.3 → fhir_pyrate-0.2.4.post1}/README.md
@@ -1,3 +1,5 @@
1
+ # FHIR-PYrate
2
+
1
3
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
2
4
  [![Supported Python version](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-31011/)
3
5
  [![Stable Version](https://img.shields.io/pypi/v/fhir-pyrate?label=stable)](https://pypi.org/project/fhir-pyrate/)
@@ -13,9 +15,10 @@ pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.co
13
15
 
14
16
  **If you use this package, please cite:**
15
17
 
16
- Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
18
+ Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). <https://doi.org/10.1186/s12913-023-09498-1>
17
19
 
18
20
  There are four main classes:
21
+
19
22
  * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
20
23
  ([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
21
24
  [2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)),
@@ -42,37 +45,41 @@ problems with the authentication (or anything else really), please just create a
42
45
  Table of Contents:
43
46
 
44
47
  * [Install](https://github.com/UMEssen/FHIR-PYrate/#install)
45
- * [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
46
- * [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
48
+ * [Either Pip](https://github.com/UMEssen/FHIR-PYrate/#either-pip)
49
+ * [Or Within Poetry](https://github.com/UMEssen/FHIR-PYrate/#or-within-poetry)
47
50
  * [Run Tests](https://github.com/UMEssen/FHIR-PYrate/#run-tests)
48
51
  * [Explanations &amp; Examples](https://github.com/UMEssen/FHIR-PYrate/#explanations--examples)
49
- * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
50
- * [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
51
- * [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
52
- * [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
53
- * [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
54
- * [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
55
- * [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
56
- * [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
52
+ * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/#ahoy)
53
+ * [Pirate](https://github.com/UMEssen/FHIR-PYrate/#pirate)
54
+ * [sail_through_search_space](https://github.com/UMEssen/FHIR-PYrate/#sail_through_search_space)
55
+ * [trade_rows_for_bundles](https://github.com/UMEssen/FHIR-PYrate/#trade_rows_for_bundles)
56
+ * [bundles_to_dataframe](https://github.com/UMEssen/FHIR-PYrate/#bundles_to_dataframe)
57
+ * [***_dataframe](https://github.com/UMEssen/FHIR-PYrate/#_dataframe)
58
+ * [Miner](https://github.com/UMEssen/FHIR-PYrate/#miner)
59
+ * [DicomDownloader](https://github.com/UMEssen/FHIR-PYrate/#dicomdownloader)
57
60
  * [Contributing](https://github.com/UMEssen/FHIR-PYrate/#contributing)
58
61
  * [Authors and acknowledgment](https://github.com/UMEssen/FHIR-PYrate/#authors-and-acknowledgment)
59
62
  * [License](https://github.com/UMEssen/FHIR-PYrate/#license)
60
63
  * [Project status](https://github.com/UMEssen/FHIR-PYrate/#project-status)
61
64
 
62
-
63
65
  ## Install
64
66
 
65
67
  ### Either Pip
68
+
66
69
  The package can be installed using PyPi
70
+
67
71
  ```bash
68
72
  pip install fhir-pyrate
69
73
  ```
74
+
70
75
  or using GitHub (always the newest version).
76
+
71
77
  ```bash
72
78
  pip install git+https://github.com/UMEssen/FHIR-PYrate.git
73
79
  ```
74
80
 
75
81
  These two commands only install the packages needed for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
82
+
76
83
  ```bash
77
84
  pip install "fhir-pyrate[miner]" # only for miner
78
85
  pip install "fhir-pyrate[downloader]" # only for downloader
@@ -80,35 +87,46 @@ pip install "fhir-pyrate[all]" # for both
80
87
  ```
81
88
 
82
89
  ### Or Within Poetry
90
+
83
91
  We can also use poetry for this same purpose. Using PyPi we need to run the following commands.
92
+
84
93
  ```bash
85
94
  poetry add fhir-pyrate
86
95
  poetry install
87
96
  ```
97
+
88
98
  Whereas to add it from GitHub, we have different options, because until recently
89
99
  [poetry used to exclusively install from the master branch](https://github.com/python-poetry/poetry/issues/3366).
90
100
 
91
101
  Poetry 1.2.0a2+:
102
+
92
103
  ```bash
93
104
  poetry add git+https://github.com/UMEssen/FHIR-PYrate.git
94
105
  poetry install
95
106
  ```
107
+
96
108
  For the previous versions you need to add the following line to your `pyproject.toml` file:
109
+
97
110
  ```bash
98
111
  fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main"}
99
112
  ```
113
+
100
114
  and then run
115
+
101
116
  ```bash
102
117
  poetry lock
103
118
  ```
104
119
 
105
120
  Also in poetry, the above only installs the packages for **Pirate**. If you also want to use the **Miner** or the **DicomDownloader**, then you need to install them as extra dependencies with
121
+
106
122
  ```bash
107
123
  poetry add "fhir-pyrate[miner]" # only for miner
108
124
  poetry add "fhir-pyrate[downloader]" # only for downloader
109
125
  poetry add "fhir-pyrate[all]" # for both
110
126
  ```
127
+
111
128
  or by adding the following to your `pyproject.toml` file:
129
+
112
130
  ```bash
113
131
  fhir-pyrate = {git = "https://github.com/UMEssen/FHIR-PYrate.git", branch = "main", extras = ["all"]}
114
132
  ```
@@ -176,6 +194,7 @@ search = Pirate(
176
194
  ```
177
195
 
178
196
  The Pirate functions do one of three things:
197
+
179
198
  1. They run the query and collect the resources and store them in a generator of bundles.
180
199
  * `steal_bundles`: single process, no timespan to specify
181
200
  * `sail_through_search_space`: multiprocess, divide&conquer with many smaller timespans
@@ -197,7 +216,6 @@ The Pirate functions do one of three things:
197
216
  | sail_through_search_space_to_dataframe | 3 | Yes | No | DataFrame |
198
217
  | trade_rows_for_dataframe | 3 | Yes | Yes | DataFrame |
199
218
 
200
-
201
219
  **CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
202
220
  This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
203
221
  need to download a lot of data and you are always doing the same requests.
@@ -276,7 +294,8 @@ is the column where the values that we want to search for are stored.
276
294
  Additionally, a system can be used to better identify the constraints of the DataFrame.
277
295
  For example, let us assume that we have a column of the DataFrame (called `loinc_code` that
278
296
  contains a bunch of different LOINC codes. Our `df_constraints` could look as follows:
279
- ```
297
+
298
+ ```python
280
299
  df_constraints={"code": ("http://loinc.org", "loinc_code")}
281
300
  ```
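To make the `(system, column)` form of `df_constraints` concrete, the sketch below expands each row into search parameters using the standard FHIR token syntax `system|code`. It uses plain dictionaries in place of a DataFrame, the helper `row_to_params` is hypothetical and not part of FHIR-PYrate, and the library's own request building may differ in details such as URL encoding. The LOINC codes are sample values.

```python
from typing import Dict, List, Tuple, Union

# A constraint is either a column name or a (system, column name) tuple.
Constraint = Union[str, Tuple[str, str]]


def row_to_params(
    row: Dict[str, str], df_constraints: Dict[str, Constraint]
) -> Dict[str, str]:
    """Build the search parameters for one row (hypothetical helper)."""
    params = {}
    for fhir_param, spec in df_constraints.items():
        if isinstance(spec, tuple):
            # With a system, the value is written as a FHIR token: system|code.
            system, column = spec
            params[fhir_param] = f"{system}|{row[column]}"
        else:
            # Without a system, the column value is used as-is.
            params[fhir_param] = row[spec]
    return params


# Stand-in for DataFrame rows; each dict is one row with a `loinc_code` column.
rows: List[Dict[str, str]] = [{"loinc_code": "718-7"}, {"loinc_code": "4548-4"}]
df_constraints = {"code": ("http://loinc.org", "loinc_code")}

queries = [row_to_params(row, df_constraints) for row in rows]
for query in queries:
    print(query)
```

Each row produces one set of parameters, which matches the idea of trading one row for one query.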
282
301
 
@@ -290,10 +309,12 @@ converted to a `DataFrame` using this function.
290
309
 
291
310
  The `bundles_to_dataframe` has three options on how to handle and extract the relevant information
292
311
  from the bundles:
312
+
293
313
  1. Extract everything, in this case you can use the
294
314
  [`flatten_data`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/util/bundle_processing_templates.py)
295
315
  function, which is already the default for `process_function`, so you do not actually need to
296
316
  specify anything.
317
+
297
318
  ```python
298
319
  # Create bundles with Pirate
299
320
  search = ...
@@ -303,10 +324,12 @@ df = search.bundles_to_dataframe(
303
324
  bundles=bundles,
304
325
  )
305
326
  ```
306
- 2. Use a processing function where you define exactly which attributes are needed by iterating
327
+
328
+ 1. Use a processing function where you define exactly which attributes are needed by iterating
307
329
  through the entries and selecting the elements. The values that will be added to the
308
330
  dictionary represent the columns of the DataFrame. For an example of when it might make sense
309
331
  to do this, check [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
332
+
310
333
  ```python
311
334
  from typing import List, Dict
312
335
  from fhir_pyrate.util.fhirobj import FHIRObj
@@ -331,12 +354,14 @@ df = search.bundles_to_dataframe(
331
354
  process_function=get_diagnostic_text,
332
355
  )
333
356
  ```
334
- 3. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
357
+
358
+ 1. Extract only part of the information using the `fhir_paths` argument. Here you can put a list
335
359
  of string that follow the [FHIRPath](https://hl7.org/fhirpath/) standard. For this purpose, we
336
360
  use the [fhirpath-py](https://github.com/beda-software/fhirpath-py) package, which uses the
337
361
  [antr4](https://github.com/antlr/antlr4) parser. Additionally, you can use tuples like `(key,
338
362
  fhir_path)`, where `key` will be the name of the column the information derived from that
339
363
  FHIRPath will be stored.
364
+
340
365
  ```python
341
366
  # Create bundles with Pirate
342
367
  search = ...
@@ -347,6 +372,7 @@ df = search.bundles_to_dataframe(
347
372
  fhir_paths=["id", ("code", "code.coding"), ("identifier", "identifier[0].code")],
348
373
  )
349
374
  ```
375
+
350
376
  **NOTE 1 on FHIR paths**: The standard also allows some primitive math operations such as modulus
351
377
  (`mod`) or integer division (`div`), and this may be problematic if there are fields of the
352
378
  resource that use these terms as attributes.
@@ -357,7 +383,8 @@ instead (as in 2.).
357
383
  **NOTE 2 on FHIR paths**: Since it is possible to specify the column name with a tuple
358
384
  `(key, fhir_path)`, it is important to know that if a key is used multiple times for different
359
385
  pieces of information but for the same resource, the field will be only filled with the first
360
- occurence that is not None.
386
+ occurrence that is not None.
387
+
361
388
  ```python
362
389
  df = search.steal_bundles_to_dataframe(
363
390
  resource_type="DiagnosticReport",
@@ -385,6 +412,7 @@ df = search.steal_bundles_to_dataframe(
385
412
  ```
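The rule in NOTE 2 (a repeated key keeps the first value that is not `None`) can be sketched in plain Python. This is an illustration of the described behaviour only: `collapse_first_non_none` is a hypothetical helper, not the library's implementation, and the `(key, value)` pairs stand in for the results of evaluating each FHIRPath on a single resource.

```python
from typing import Any, Dict, Iterable, Tuple


def collapse_first_non_none(pairs: Iterable[Tuple[str, Any]]) -> Dict[str, Any]:
    """Keep, for every key, the first value that is not None."""
    row: Dict[str, Any] = {}
    for key, value in pairs:
        # Overwrite only while the key is missing or still holds None.
        if row.get(key) is None:
            row[key] = value
    return row


# Three paths share the key "title"; the first one found nothing.
evaluated = [
    ("title", None),
    ("title", "Report A"),
    ("title", "Report B"),
    ("id", "123"),
]
row = collapse_first_non_none(evaluated)
print(row)
```

Here `"Report B"` is discarded because `"title"` was already filled by `"Report A"`. Using distinct keys for distinct paths avoids losing values this way.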
386
413
 
387
414
  #### [`***_dataframe`](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/pirate.py)
415
+
388
416
  The `steal_bundles_to_dataframe`, `sail_through_search_space_to_dataframe` and `trade_rows_for_dataframe`
389
417
  are facade functions which retrieve the bundles and then run `bundles_to_dataframe`.
390
418
 
@@ -404,6 +432,7 @@ More on that in the following section.
404
432
 
405
433
  Not all FHIR servers allow this (at least not the public ones that we have tried),
406
434
  but it is also possible to obtain multiple resources with just one query:
435
+
407
436
  ```python
408
437
  search = ...
409
438
  result_dfs = search.steal_bundles_to_dataframe(
@@ -431,6 +460,7 @@ result_dfs = search.steal_bundles_to_dataframe(
431
460
  num_pages=1,
432
461
  )
433
462
  ```
463
+
434
464
  In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
435
465
  You can then select the single dictionary by doing `result_dfs["ImagingStudy"]`
436
466
  or `result_dfs["Patient"]`.
@@ -452,14 +482,9 @@ such that only the ones containing
452
482
  the actual resource name are kept if the resource name is specified in the path,
453
483
  and that a column full of `None`s is obtained in case no resource type is specified.
454
484
 
455
-
456
485
  ### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
457
486
 
458
- <br />
459
- <div align="center">
460
- <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg" alt="Logo" width="718" height="230">
461
- </div>
462
- <br />
487
+ ![FHIR-PYrate Logo](https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/miner.svg)
463
488
 
464
489
  The **Miner** takes a DataFrame and searches it for a particular regular expression
465
490
  with the help of [SpaCy](https://spacy.io/).
@@ -571,13 +596,15 @@ request. You can also simply open an issue with the tag "enhancement".
571
596
 
572
597
  This package was developed by the [SHIP-AI group at the Institute for Artificial Intelligence in Medicine](https://ship-ai.ikim.nrw/).
573
598
 
574
- - [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
575
- - [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
599
+ * [goku1110](https://github.com/goku1110): initial idea, development, logo & figures
600
+ * [giuliabaldini](https://github.com/giuliabaldini): development, tests, new features
576
601
 
577
602
  We would like to thank [razorx89](https://github.com/razorx89), [butterpear](https://github.com/butterpear), [vkyprmr](https://github.com/vkyprmr), [Wizzzard93](https://github.com/Wizzzard93), [karzideh](https://github.com/karzideh) and [luckfamousa](https://github.com/luckfamousa) for their input, time and effort.
578
603
 
579
604
  ## License
605
+
580
606
  This project is licenced under the [MIT Licence](LICENSE).
581
607
 
582
608
  ## Project status
609
+
583
610
  The project is in active development.