PyPI - fhir-pyrate - Versions diffs - 0.2.0b9__tar.gz → 0.2.2__tar.gz - Mend

fhir-pyrate 0.2.0b9__tar.gz → 0.2.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

fhir_pyrate-0.2.0b9/README.md → fhir_pyrate-0.2.2/PKG-INFO RENAMED

@@ -1,20 +1,53 @@
+Metadata-Version: 2.1
+Name: fhir-pyrate
+Version: 0.2.2
+Summary: FHIR-PYrate is a package that provides a high-level API to query FHIR Servers for bundles of resources and return the structured information as pandas DataFrames. It can also be used to filter resources using RegEx and SpaCy and download DICOM studies and series.
+Home-page: https://github.com/UMEssen/FHIR-PYrate
+License: MIT
+Keywords: python,fhir,data-science,fhirpath,healthcare
+Author: Rene Hosch
+Author-email: rene.hosch@uk-essen.de
+Requires-Python: >=3.10,<4.0
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Provides-Extra: all
+Provides-Extra: downloader
+Provides-Extra: miner
+Requires-Dist: PyJWT (>=2.4.0,<3.0.0)
+Requires-Dist: SimpleITK (>=2.0.2,<3.0.0) ; extra == "downloader" or extra == "all"
+Requires-Dist: dicomweb-client (>=0.52.0,<0.53.0) ; extra == "downloader" or extra == "all"
+Requires-Dist: fhirpathpy (>=0.2.2,<0.3.0)
+Requires-Dist: numpy (>=2.0.0,<3.0.0)
+Requires-Dist: pandas (>=2.0.0,<3.0.0)
+Requires-Dist: pydicom (>=2.1.2,<3.0.0) ; extra == "downloader" or extra == "all"
+Requires-Dist: requests (>=2.28.0,<3.0.0)
+Requires-Dist: requests-cache (>=0.9.7,<0.10.0)
+Requires-Dist: spacy (>=3.0.6,<4.0.0) ; extra == "miner" or extra == "all"
+Requires-Dist: tqdm (>=4.56.0,<5.0.0)
+Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
+Description-Content-Type: text/markdown
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-[![Supported Python version](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/release/python-380/)
+[![Supported Python version](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-31011/)
 [![Stable Version](https://img.shields.io/pypi/v/fhir-pyrate?label=stable)](https://pypi.org/project/fhir-pyrate/)
 [![Pre-release Version](https://img.shields.io/github/v/release/UMEssen/fhir-pyrate?label=pre-release&include_prereleases&sort=semver)](https://pypi.org/project/fhir-pyrate/#history)
 [![DOI](https://zenodo.org/badge/456893108.svg)](https://zenodo.org/badge/latestdoi/456893108)
+[![Affiliated with RTG WisPerMed](https://img.shields.io/badge/Affiliated-RTG%202535%20WisPerMed-blue)](https://wispermed.org/)
 <!-- PROJECT LOGO -->
-<br />
-<div align="center">
-  <a href="https://github.com/UMEssen/FHIR-PYrate">
-    <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/logo.svg" alt="Logo" width="440" height="338">
-  </a>
-</div>
+![Pyrate-Banner](images/pyrate-banner.png)
 This package is meant to provide a simple abstraction to query and structure FHIR resources as
 pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.com/POLAR-fhiR/fhircrackr)!
+**If you use this package, please cite:**
+Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
 There are four main classes:
 * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
 ([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
@@ -38,11 +71,6 @@ our institute. If there is anything in the code that only applies to our server,
 problems with the authentication (or anything else really), please just create an issue or
 [email us](mailto:giulia.baldini@uk-essen.de).
-<br />
-<div align="center">
-  <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/resources.svg" alt="Resources" width="630" height="385">
-</div>
 <!-- TABLE OF CONTENTS -->
 Table of Contents:
@@ -203,10 +231,12 @@ The Pirate functions do one of three things:
 | trade_rows_for_dataframe                |  3   |       Yes       |    Yes    |      DataFrame       |
-**BETA FEATURE**: It is also possible to cache the bundles using the `bundle_caching` parameter,
-which specifies a caching folder. This has not yet been tested extensively and does not have any
-cache invalidation mechanism.
+**CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
+This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
+need to download a lot of data and you are always doing the same requests.
+You can also specify how long the cache should be valid with the `cache_expiry_time` parameter.
+Additionally, you can also specify whether the requests should be retried using the `retry_requests`
+parameter. There is an example of this in the docstrings of the Pirate class.
 A toy request for ImagingStudy:
@@ -396,7 +426,65 @@ parameters specified in `df_constraints` as columns of the final DataFrame.
 You can find an example in [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
 Additionally, you can specify the `with_columns` parameter, which can add any columns from the original
 DataFrame. The columns can be either specified as a list of columns `[col1, col2, ...]` or as a
-list of tuples `[(new_name_for_col1, col1), (new_name_for_col2, col2), ...]`
+list of tuples `[(new_name_for_col1, col1), (new_name_for_col2, col2), ...]`.
+Currently, whenever a column is completely empty (i.e., no resources
+have a corresponding value for that column), it is just removed from the DataFrame.
+This is to ensure that we output clean DataFrames when we are handling multiple resources.
+More on that in the following section.
+#### Note on Querying Multiple Resources
+Not all FHIR servers allow this (at least not the public ones that we have tried),
+but it is also possible to obtain multiple resources with just one query:
+```python
+search = ...
+result_dfs = search.steal_bundles_to_dataframe(
+    resource_type="ImagingStudy",
+    request_params={
+        "_lastUpdated": "ge2022-12",
+        "_count": "3",
+        "_include": "ImagingStudy:subject",
+    },
+    fhir_paths=[
+        "id",
+        "started",
+        ("modality", "modality.code"),
+        ("procedureCode", "procedureCode.coding.code"),
+        (
+            "study_instance_uid",
+            "identifier.where(system = 'urn:dicom:uid').value.replace('urn:oid:', '')",
+        ),
+        ("series_instance_uid", "series.uid"),
+        ("series_code", "series.modality.code"),
+        ("numberOfInstances", "series.numberOfInstances"),
+        ("family_first", "name[0].family"),
+        ("given_first", "name[0].given"),
+    ],
+    num_pages=1,
+)
+```
+In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
+You can then select a single DataFrame by doing `result_dfs["ImagingStudy"]`
+or `result_dfs["Patient"]`.
+You can find an example of this in [Example 2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)
+where the `ImagingStudy` resource is queried.
+In theory, it would be smarter to specify the resource name in front of the FHIRPaths,
+e.g. `ImagingStudy.series.uid` instead of `series.uid`, and for each DataFrame only return the
+corresponding attributes.
+However, we do not want to force the user to always specify the resource type, and in the current
+version the DataFrames
+coming from multiple resources have the same columns, because
+we cannot filter which resource was actually intended.
+Currently, we solve this by removing all columns that do not have any results.
+This means, however, that if you request an attribute for a specific resource and it
+is not found, that column will not appear.
+In the future, [we plan to do a smarter filtering of the FHIRPaths](https://github.com/UMEssen/FHIR-PYrate/issues/120),
+such that only the ones containing
+the actual resource name are kept if the resource name is specified in the path,
+and that a column full of `None`s is obtained in case no resource type is specified.
 ### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
@@ -526,3 +614,4 @@ This project is licenced under the [MIT Licence](LICENSE).
 ## Project status
 The project is in active development.
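
Note on the hunk above: the new README text says that a multi-resource query returns a dictionary of DataFrames keyed by resource type, and that columns with no values at all are dropped. The sketch below only illustrates that described behavior and is not the package's implementation. Plain lists of dicts stand in for DataFrames so that it runs without pandas, and `Row` and `split_by_resource_type` are hypothetical names.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]  # hypothetical alias: one extracted row, column name -> value


def split_by_resource_type(rows: List[Row]) -> Dict[str, List[Row]]:
    """Group rows by resource type, then drop columns that are empty in a group."""
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[row["resourceType"]].append(row)

    result: Dict[str, List[Row]] = {}
    for resource_type, group in grouped.items():
        # Collect every column seen in this group, preserving first-seen order.
        columns = list(dict.fromkeys(col for row in group for col in row))
        # Keep a column only if at least one row has a non-None value for it.
        kept = [c for c in columns if any(row.get(c) is not None for row in group)]
        result[resource_type] = [{c: row.get(c) for c in kept} for row in group]
    return result


# Every FHIRPath is evaluated against every resource, so each row starts with
# the same columns; paths that do not apply to a resource yield None.
rows = [
    {"resourceType": "ImagingStudy", "id": "study-1", "started": "2022-12-01", "family_first": None},
    {"resourceType": "Patient", "id": "patient-1", "started": None, "family_first": "Doe"},
]
result = split_by_resource_type(rows)
```

Under these assumptions, `result["ImagingStudy"]` loses the `family_first` column and `result["Patient"]` loses `started`. This also shows the caveat the README mentions: a column requested for a resource disappears whenever no row of that resource has a value for it.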

fhir_pyrate-0.2.0b9/PKG-INFO → fhir_pyrate-0.2.2/README.md RENAMED

@@ -1,53 +1,20 @@
-Metadata-Version: 2.1
-Name: fhir-pyrate
-Version: 0.2.0b9
-Summary: FHIR-PYrate is a package that provides a high-level API to query FHIR Servers for bundles of resources and return the structured information as pandas DataFrames. It can also be used to filter resources using RegEx and SpaCy and download DICOM studies and series.
-Home-page: https://github.com/UMEssen/FHIR-PYrate
-License: MIT
-Keywords: python,fhir,data-science,fhirpath,healthcare
-Author: Rene Hosch
-Author-email: rene.hosch@uk-essen.de
-Requires-Python: >=3.8,<4.0
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.8
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Provides-Extra: all
-Provides-Extra: downloader
-Provides-Extra: miner
-Requires-Dist: PyJWT (>=2.4.0,<3.0.0)
-Requires-Dist: SimpleITK (>=2.0.2,<3.0.0) ; extra == "downloader" or extra == "all"
-Requires-Dist: dicomweb-client (>=0.52.0,<0.53.0) ; extra == "downloader" or extra == "all"
-Requires-Dist: fhirpathpy (>=0.1.0,<0.2.0)
-Requires-Dist: numpy (>=1.22,<2.0)
-Requires-Dist: pandas (>=1.3.0,<2.0.0)
-Requires-Dist: pydicom (>=2.1.2,<3.0.0) ; extra == "downloader" or extra == "all"
-Requires-Dist: requests (>=2.28.0,<3.0.0)
-Requires-Dist: requests-cache (>=0.9.7,<0.10.0)
-Requires-Dist: spacy (>=3.0.6,<4.0.0) ; extra == "miner" or extra == "all"
-Requires-Dist: tqdm (>=4.56.0,<5.0.0)
-Project-URL: Repository, https://github.com/UMEssen/FHIR-PYrate
-Description-Content-Type: text/markdown
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
-[![Supported Python version](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/release/python-380/)
+[![Supported Python version](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-31011/)
 [![Stable Version](https://img.shields.io/pypi/v/fhir-pyrate?label=stable)](https://pypi.org/project/fhir-pyrate/)
 [![Pre-release Version](https://img.shields.io/github/v/release/UMEssen/fhir-pyrate?label=pre-release&include_prereleases&sort=semver)](https://pypi.org/project/fhir-pyrate/#history)
 [![DOI](https://zenodo.org/badge/456893108.svg)](https://zenodo.org/badge/latestdoi/456893108)
+[![Affiliated with RTG WisPerMed](https://img.shields.io/badge/Affiliated-RTG%202535%20WisPerMed-blue)](https://wispermed.org/)
 <!-- PROJECT LOGO -->
-<br />
-<div align="center">
-  <a href="https://github.com/UMEssen/FHIR-PYrate">
-    <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/logo.svg" alt="Logo" width="440" height="338">
-  </a>
-</div>
+![Pyrate-Banner](images/pyrate-banner.png)
 This package is meant to provide a simple abstraction to query and structure FHIR resources as
 pandas DataFrames. Want to use R instead? Try out [fhircrackr](https://github.com/POLAR-fhiR/fhircrackr)!
+**If you use this package, please cite:**
+Hosch, R., Baldini, G., Parmar, V. et al. FHIR-PYrate: a data science friendly Python package to query FHIR servers. BMC Health Serv Res 23, 734 (2023). https://doi.org/10.1186/s12913-023-09498-1
 There are four main classes:
 * [Ahoy](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/ahoy.py): Authenticate on the FHIR API
 ([Example 1](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/1-simple-json-to-df.ipynb),
@@ -71,11 +38,6 @@ our institute. If there is anything in the code that only applies to our server,
 problems with the authentication (or anything else really), please just create an issue or
 [email us](mailto:giulia.baldini@uk-essen.de).
-<br />
-<div align="center">
-  <img src="https://raw.githubusercontent.com/UMEssen/FHIR-PYrate/main/images/resources.svg" alt="Resources" width="630" height="385">
-</div>
 <!-- TABLE OF CONTENTS -->
 Table of Contents:
@@ -236,10 +198,12 @@ The Pirate functions do one of three things:
 | trade_rows_for_dataframe                |  3   |       Yes       |    Yes    |      DataFrame       |
-**BETA FEATURE**: It is also possible to cache the bundles using the `bundle_caching` parameter,
-which specifies a caching folder. This has not yet been tested extensively and does not have any
-cache invalidation mechanism.
+**CACHING**: It is also possible to cache the bundles using the `cache_folder` parameter.
+This unfortunately does not currently work with multiprocessing, but saves a lot of time if you
+need to download a lot of data and you are always doing the same requests.
+You can also specify how long the cache should be valid with the `cache_expiry_time` parameter.
+Additionally, you can also specify whether the requests should be retried using the `retry_requests`
+parameter. There is an example of this in the docstrings of the Pirate class.
 A toy request for ImagingStudy:
@@ -429,7 +393,65 @@ parameters specified in `df_constraints` as columns of the final DataFrame.
 You can find an example in [Example 3](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/3-patients-for-condition.ipynb).
 Additionally, you can specify the `with_columns` parameter, which can add any columns from the original
 DataFrame. The columns can be either specified as a list of columns `[col1, col2, ...]` or as a
-list of tuples `[(new_name_for_col1, col1), (new_name_for_col2, col2), ...]`
+list of tuples `[(new_name_for_col1, col1), (new_name_for_col2, col2), ...]`.
+Currently, whenever a column is completely empty (i.e., no resources
+have a corresponding value for that column), it is just removed from the DataFrame.
+This is to ensure that we output clean DataFrames when we are handling multiple resources.
+More on that in the following section.
+#### Note on Querying Multiple Resources
+Not all FHIR servers allow this (at least not the public ones that we have tried),
+but it is also possible to obtain multiple resources with just one query:
+```python
+search = ...
+result_dfs = search.steal_bundles_to_dataframe(
+    resource_type="ImagingStudy",
+    request_params={
+        "_lastUpdated": "ge2022-12",
+        "_count": "3",
+        "_include": "ImagingStudy:subject",
+    },
+    fhir_paths=[
+        "id",
+        "started",
+        ("modality", "modality.code"),
+        ("procedureCode", "procedureCode.coding.code"),
+        (
+            "study_instance_uid",
+            "identifier.where(system = 'urn:dicom:uid').value.replace('urn:oid:', '')",
+        ),
+        ("series_instance_uid", "series.uid"),
+        ("series_code", "series.modality.code"),
+        ("numberOfInstances", "series.numberOfInstances"),
+        ("family_first", "name[0].family"),
+        ("given_first", "name[0].given"),
+    ],
+    num_pages=1,
+)
+```
+In this case, a dictionary of DataFrames is returned, where the keys are the resource types.
+You can then select a single DataFrame by doing `result_dfs["ImagingStudy"]`
+or `result_dfs["Patient"]`.
+You can find an example of this in [Example 2](https://github.com/UMEssen/FHIR-PYrate/blob/main/examples/2-condition-to-imaging-study.ipynb)
+where the `ImagingStudy` resource is queried.
+In theory, it would be smarter to specify the resource name in front of the FHIRPaths,
+e.g. `ImagingStudy.series.uid` instead of `series.uid`, and for each DataFrame only return the
+corresponding attributes.
+However, we do not want to force the user to always specify the resource type, and in the current
+version the DataFrames
+coming from multiple resources have the same columns, because
+we cannot filter which resource was actually intended.
+Currently, we solve this by removing all columns that do not have any results.
+This means, however, that if you request an attribute for a specific resource and it
+is not found, that column will not appear.
+In the future, [we plan to do a smarter filtering of the FHIRPaths](https://github.com/UMEssen/FHIR-PYrate/issues/120),
+such that only the ones containing
+the actual resource name are kept if the resource name is specified in the path,
+and that a column full of `None`s is obtained in case no resource type is specified.
 ### [Miner](https://github.com/UMEssen/FHIR-PYrate/blob/main/fhir_pyrate/miner.py)
@@ -559,4 +581,3 @@ This project is licenced under the [MIT Licence](LICENSE).
 ## Project status
 The project is in active development.

{fhir_pyrate-0.2.0b9 → fhir_pyrate-0.2.2}/fhir_pyrate/ahoy.py RENAMED

@@ -33,18 +33,22 @@ class Ahoy:
     :param token_refresh_delta: Either a timedelta object that tells us how often the token
     should be refreshed, or a number of minutes; this does not need to be specified for JWT tokens
     that contain the expiry date
+    :param session: The session that can be used for the authentication. This is particularly
+    useful if you have some particular requirements for your authentication (e.g. you need to
+    support custom self-signed certificates).
     """
     def __init__(
         self,
-        auth_url: str = None,
+        auth_url: Optional[str] = None,
         auth_type: Optional[str] = "token",
-        refresh_url: str = None,
-        username: str = None,
+        refresh_url: Optional[str] = None,
+        username: Optional[str] = None,
         auth_method: Optional[str] = "password",
-        token: str = None,
+        token: Optional[str] = None,
         max_login_attempts: int = 5,
-        token_refresh_delta: Union[int, timedelta] = None,
+        token_refresh_delta: Optional[Union[int, timedelta]] = None,
+        session: Optional[requests.Session] = None,
     ) -> None:
         self.auth_type = auth_type
         self.auth_method = auth_method
@@ -54,7 +58,10 @@ class Ahoy:
         self._user_env_name = "FHIR_USER"
         self._pass_env_name = "FHIR_PASSWORD"
         self.token = token
-        self.session = requests.Session()
+        if session is None:
+            self.session = requests.Session()
+        else:
+            self.session = session
         self.max_login_attempts = max_login_attempts
         self.token_refresh_delta = token_refresh_delta
         if self.auth_type is not None and self.auth_method is not None:
@@ -75,7 +82,7 @@ class Ahoy:
         self.close()
     def change_environment_variable_name(
-        self, user_env: str = None, pass_env: str = None
+        self, user_env: Optional[str] = None, pass_env: Optional[str] = None
     ) -> None:
         """
         Change the name of the variables used to retrieve username and password.
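
Note on the `ahoy.py` hunks above: the constructor now accepts an optional `session` and only creates its own when none is passed. The sketch below illustrates that fallback pattern in isolation. It assumes nothing about the rest of `Ahoy`. `StubSession` and `AuthClient` are hypothetical stand-ins so that the example runs without `requests` or network access, and the certificate path is a placeholder.

```python
from typing import Optional, Union


class StubSession:
    """Hypothetical stand-in for requests.Session (which also has a `verify` attribute)."""

    def __init__(self) -> None:
        # True means "verify TLS certificates against the default CA store".
        self.verify: Union[bool, str] = True


class AuthClient:
    """Hypothetical class mirroring the session fallback added to Ahoy.__init__."""

    def __init__(self, session: Optional[StubSession] = None) -> None:
        # Explicit Optional[...] matches the annotation style introduced in the diff.
        if session is None:
            self.session = StubSession()
        else:
            self.session = session


# A caller with special TLS requirements configures a session up front.
custom_session = StubSession()
custom_session.verify = "/path/to/ca-bundle.pem"  # placeholder path to a CA bundle

client = AuthClient(session=custom_session)
default_client = AuthClient()
```

With a real `requests.Session`, the same idea would presumably apply: configure the session once (certificates, proxies, adapters) and pass it in, so the authentication code reuses those settings instead of creating a fresh session.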
