pathling 9.0.0.dev1__tar.gz → 9.1.0.dev0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- pathling-9.1.0.dev0/.gitignore +3 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/LICENSE +11 -11
- {pathling-9.0.0.dev1/pathling.egg-info → pathling-9.1.0.dev0}/PKG-INFO +7 -20
- pathling-9.1.0.dev0/examples/data/csv/conditions.csv +20 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/_version.py +2 -2
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/context.py +21 -2
- pathling-9.1.0.dev0/pyproject.toml +78 -0
- pathling-9.0.0.dev1/MANIFEST.in +0 -2
- pathling-9.0.0.dev1/PKG-INFO +0 -404
- pathling-9.0.0.dev1/pathling.egg-info/SOURCES.txt +0 -43
- pathling-9.0.0.dev1/pathling.egg-info/dependency_links.txt +0 -1
- pathling-9.0.0.dev1/pathling.egg-info/requires.txt +0 -2
- pathling-9.0.0.dev1/pathling.egg-info/top_level.txt +0 -1
- pathling-9.0.0.dev1/setup.cfg +0 -10
- pathling-9.0.0.dev1/setup.py +0 -78
- pathling-9.0.0.dev1/tests/test_bulk.py +0 -62
- pathling-9.0.0.dev1/tests/test_datasource.py +0 -309
- pathling-9.0.0.dev1/tests/test_encoders.py +0 -230
- pathling-9.0.0.dev1/tests/test_functions.py +0 -51
- pathling-9.0.0.dev1/tests/test_spark.py +0 -39
- pathling-9.0.0.dev1/tests/test_udfs.py +0 -507
- pathling-9.0.0.dev1/tests/test_view.py +0 -47
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/README.md +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/bulk.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/data/bundles/Bennett146_Swaniawski813_704c9750-f6e6-473b-ee83-fbd48e07fe3f.json +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/data/bundles/Dino214_Parisian75_40d82b80-b682-cd8b-da6d-396809878641.json +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/data/resources/Condition.ndjson +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/data/resources/Patient.ndjson +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/designation.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/display.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/encode_bundles.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/encode_resources.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/fhir_view.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/member_of.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/property_of.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/subsumes.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/examples/translate.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/__init__.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/bulk.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/coding.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/core.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/datasink.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/datasource.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/fhir.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/functions.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/spark.py +0 -0
- {pathling-9.0.0.dev1 → pathling-9.1.0.dev0}/pathling/udfs.py +0 -0
|
@@ -194,20 +194,20 @@ separate files distributed with the Software.
|
|
|
194
194
|
* (Apache License, Version 2.0) Apache Commons BeanUtils (commons-beanutils:commons-beanutils:1.11.0 - https://commons.apache.org/proper/commons-beanutils)
|
|
195
195
|
* (Apache License, Version 2.0) Apache Commons IO (commons-io:commons-io:2.16.1 - https://commons.apache.org/proper/commons-io/)
|
|
196
196
|
* (Apache License, Version 2.0) Commons Lang (commons-lang:commons-lang:2.6 - http://commons.apache.org/lang/)
|
|
197
|
-
* (Apache License, Version 2.0) delta-spark (io.delta:delta-spark_2.
|
|
197
|
+
* (Apache License, Version 2.0) delta-spark (io.delta:delta-spark_2.13:4.0.0 - https://delta.io/)
|
|
198
|
+
* (Apache License, Version 2.0) ucumate-core (io.github.fhnaumann:ucumate-core:1.0.8 - https://github.com/fhnaumann/ucumate)
|
|
199
|
+
* (EPL 2.0) (GPL2 w/ CPE) Jakarta Servlet (jakarta.servlet:jakarta.servlet-api:5.0.0 - https://projects.eclipse.org/projects/ee4j.servlet)
|
|
198
200
|
* (Apache License, Version 2.0) Jakarta Bean Validation API (jakarta.validation:jakarta.validation-api:3.0.2 - https://beanvalidation.org)
|
|
199
|
-
* (CDDL + GPLv2 with classpath exception) Java Servlet API (jakarta.servlet:jakarta.servlet-api:5.0.0 - https://javaee.github.io/servlet-spec/)
|
|
200
201
|
* (Apache License, Version 2.0) Joda-Time (joda-time:joda-time:2.12.7 - https://www.joda.org/joda-time/)
|
|
201
|
-
* (BSD License) ANTLR 4 Tool (org.antlr:antlr4:4.
|
|
202
|
+
* (BSD License) ANTLR 4 Tool (org.antlr:antlr4:4.13.1 - http://www.antlr.org)
|
|
202
203
|
* (Apache License, Version 2.0) Apache Commons Lang (org.apache.commons:commons-lang3:3.18.0 - https://commons.apache.org/proper/commons-lang/)
|
|
203
204
|
* (Apache License, Version 2.0) Apache Derby Tools (org.apache.derby:derbytools:10.16.1.1 - http://db.apache.org/derby/)
|
|
204
|
-
* (Apache License, Version 2.0) Apache Hadoop
|
|
205
|
-
* (Apache License, Version 2.0)
|
|
206
|
-
* (Apache License, Version 2.0) Spark Project
|
|
207
|
-
* (Apache License, Version 2.0) Spark Project
|
|
208
|
-
* (Apache License, Version 2.0) Spark Project
|
|
209
|
-
* (Apache License, Version 2.0) Spark Project
|
|
210
|
-
* (Apache License, Version 2.0) Spark Project Unsafe (org.apache.spark:spark-unsafe_2.12:3.5.6 - https://spark.apache.org/)
|
|
205
|
+
* (Apache License, Version 2.0) Apache Hadoop Client API (org.apache.hadoop:hadoop-client-api:3.4.1 - no url defined)
|
|
206
|
+
* (Apache License, Version 2.0) Spark Project Catalyst (org.apache.spark:spark-catalyst_2.13:4.0.1 - https://spark.apache.org/)
|
|
207
|
+
* (Apache License, Version 2.0) Spark Project Core (org.apache.spark:spark-core_2.13:4.0.1 - https://spark.apache.org/)
|
|
208
|
+
* (Apache License, Version 2.0) Spark Project Hive (org.apache.spark:spark-hive_2.13:4.0.1 - https://spark.apache.org/)
|
|
209
|
+
* (Apache License, Version 2.0) Spark Project SQL (org.apache.spark:spark-sql_2.13:4.0.1 - https://spark.apache.org/)
|
|
210
|
+
* (Apache License, Version 2.0) Spark Project Unsafe (org.apache.spark:spark-unsafe_2.13:4.0.1 - https://spark.apache.org/)
|
|
211
211
|
* (Apache License, Version 2.0) Awaitility (org.awaitility:awaitility:4.2.0 - http://awaitility.org)
|
|
212
212
|
* (Apache License, Version 2.0) Hibernate Validator Engine (org.hibernate.validator:hibernate-validator:8.0.1.Final - http://hibernate.org/validator/hibernate-validator)
|
|
213
213
|
* (Apache License, Version 2.0) Infinispan Commons (org.infinispan:infinispan-commons:15.0.3.Final - http://www.infinispan.org)
|
|
@@ -221,7 +221,7 @@ separate files distributed with the Software.
|
|
|
221
221
|
* (GNU General Public License (GPL), version 2, with the Classpath exception) JMH Core (org.openjdk.jmh:jmh-core:1.37 - http://openjdk.java.net/projects/code-tools/jmh/jmh-core/)
|
|
222
222
|
* (GNU General Public License (GPL), version 2, with the Classpath exception) JMH Generators: Annotation Processors (org.openjdk.jmh:jmh-generator-annprocess:1.37 - http://openjdk.java.net/projects/code-tools/jmh/jmh-generator-annprocess/)
|
|
223
223
|
* (MIT License) Project Lombok (org.projectlombok:lombok:1.18.38 - https://projectlombok.org)
|
|
224
|
-
* (Apache License, Version 2.0) Scala Library (org.scala-lang:scala-library:2.
|
|
224
|
+
* (Apache License, Version 2.0) Scala Library (org.scala-lang:scala-library:2.13.15 - https://www.scala-lang.org/)
|
|
225
225
|
* (Apache License, Version 2.0) JSONassert (org.skyscreamer:jsonassert:1.5.1 - https://github.com/skyscreamer/JSONassert)
|
|
226
226
|
* (MIT License) SLF4J API Module (org.slf4j:slf4j-api:2.0.17 - http://www.slf4j.org)
|
|
227
227
|
* (Apache License, Version 2.0) spring-boot-starter-test (org.springframework.boot:spring-boot-starter-test:3.3.11 - https://spring.io/projects/spring-boot)
|
|
@@ -1,34 +1,21 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: pathling
|
|
3
|
-
Version: 9.
|
|
3
|
+
Version: 9.1.0.dev0
|
|
4
4
|
Summary: Python API for Pathling
|
|
5
|
-
|
|
6
|
-
Author: Australian e-Health Research Centre, CSIRO
|
|
7
|
-
Author-email: pathling@csiro.au
|
|
5
|
+
Project-URL: Homepage, https://github.com/aehrc/pathling
|
|
6
|
+
Author-email: "Australian e-Health Research Centre, CSIRO" <pathling@csiro.au>
|
|
8
7
|
License: Apache License, version 2.0
|
|
9
|
-
|
|
8
|
+
License-File: LICENSE
|
|
9
|
+
Keywords: analytics,fhir,pathling,spark,standards,terminology
|
|
10
10
|
Classifier: Development Status :: 3 - Alpha
|
|
11
11
|
Classifier: License :: OSI Approved :: Apache Software License
|
|
12
12
|
Classifier: Programming Language :: Python :: 3.9
|
|
13
13
|
Classifier: Programming Language :: Python :: 3.10
|
|
14
14
|
Classifier: Programming Language :: Python :: 3.11
|
|
15
15
|
Requires-Python: >=3.9
|
|
16
|
-
Description-Content-Type: text/markdown
|
|
17
|
-
License-File: LICENSE
|
|
18
|
-
Requires-Dist: pyspark<4.1.0,>=4.0.0
|
|
19
16
|
Requires-Dist: deprecated>=1.2.13
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
Dynamic: classifier
|
|
23
|
-
Dynamic: description
|
|
24
|
-
Dynamic: description-content-type
|
|
25
|
-
Dynamic: home-page
|
|
26
|
-
Dynamic: keywords
|
|
27
|
-
Dynamic: license
|
|
28
|
-
Dynamic: license-file
|
|
29
|
-
Dynamic: requires-dist
|
|
30
|
-
Dynamic: requires-python
|
|
31
|
-
Dynamic: summary
|
|
17
|
+
Requires-Dist: pyspark<4.1.0,>=4.0.0
|
|
18
|
+
Description-Content-Type: text/markdown
|
|
32
19
|
|
|
33
20
|
Python API for Pathling
|
|
34
21
|
=======================
|
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION
|
|
2
|
+
2012-10-24,,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,5a0b7744-aceb-15ac-280e-8070b01f1fb4,65363002,Otitis media
|
|
3
|
+
2015-06-24,2015-10-17,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,0e105151-b37f-f62e-802f-beb66ba5fbc6,16114001,Fracture of ankle
|
|
4
|
+
2015-11-13,2015-11-22,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,ce85c3fd-7cc6-481b-6a07-0c46110197a1,444814009,Viral sinusitis (disorder)
|
|
5
|
+
2016-12-22,2017-01-08,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,0fd0b06b-1a99-7ebc-6d0f-a508f78e061f,444814009,Viral sinusitis (disorder)
|
|
6
|
+
2014-06-05,2014-06-16,866f8524-ddd3-2e28-01ac-035e0d1832be,ffb4d9fb-5cd4-bf1f-5a18-833c8ade3993,43878008,Streptococcal sore throat (disorder)
|
|
7
|
+
2019-02-24,2019-03-07,866f8524-ddd3-2e28-01ac-035e0d1832be,39b7146d-43b7-dc66-830b-f0051bd0ac1b,195662009,Acute viral pharyngitis (disorder)
|
|
8
|
+
2013-10-31,2013-11-15,bbf93c9f-90f8-10ad-19b8-9ce4a7908d19,840ed7b9-00c6-00c1-a485-57400c569bff,10509002,Acute bronchitis (disorder)
|
|
9
|
+
2004-01-15,,7316444b-558c-cf79-31d2-c4041929c344,add7491c-9521-5ed4-a161-7a4657e66341,65363002,Otitis media
|
|
10
|
+
2012-08-04,2012-08-17,7316444b-558c-cf79-31d2-c4041929c344,f373fdb2-afb7-298c-bc5e-f3b98816f233,43878008,Streptococcal sore throat (disorder)
|
|
11
|
+
2013-03-04,2013-04-05,7316444b-558c-cf79-31d2-c4041929c344,05c0cac5-ce66-7268-030c-1feb42c8bcda,403190006,First degree burn
|
|
12
|
+
2013-06-27,2013-07-05,7316444b-558c-cf79-31d2-c4041929c344,362f70ee-f8a5-19da-24cb-4e3081dbb106,10509002,Acute bronchitis (disorder)
|
|
13
|
+
2019-09-17,2019-11-21,7316444b-558c-cf79-31d2-c4041929c344,ba21ca9b-4692-2829-f333-531530e27d22,65966004,Fracture of forearm
|
|
14
|
+
2014-04-14,2014-04-26,bbf93c9f-90f8-10ad-19b8-9ce4a7908d19,1c2b48dc-718f-07e7-3f92-d4172f8b51d0,195662009,Acute viral pharyngitis (disorder)
|
|
15
|
+
2011-08-03,,f850b736-c3be-c2d0-5784-f551019956e1,1b81f386-65ae-4a26-343f-a7b2653020e1,65363002,Otitis media
|
|
16
|
+
1987-05-28,,d0237a49-0d92-7514-0613-16685a9c9a3d,da768855-91a7-54d5-839b-931c365a9437,65363002,Otitis media
|
|
17
|
+
2016-06-18,2016-07-11,d0237a49-0d92-7514-0613-16685a9c9a3d,e522ae5f-39fa-ab59-6aa7-dd15618a0312,444814009,Viral sinusitis (disorder)
|
|
18
|
+
1980-01-02,,f07f926b-d82f-3ba5-1898-7fdb82160336,258a40e0-5248-958f-ea21-24247790a245,65363002,Otitis media
|
|
19
|
+
2014-03-15,2014-10-18,f07f926b-d82f-3ba5-1898-7fdb82160336,09ccf1a1-0de0-1ecb-4cf2-3dd5fff14ec5,72892002,Normal pregnancy
|
|
20
|
+
2013-03-05,2013-03-12,f850b736-c3be-c2d0-5784-f551019956e1,3c1ac45a-9063-8eb2-9e3d-399c6d9f140d,195662009,Acute viral pharyngitis (disorder)
|
|
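The new `examples/data/csv/conditions.csv` file is a plain comma-separated table with the header `START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION`, and some rows leave `STOP` empty. As a minimal sketch of how that shape can be read with the Python standard library, the snippet below parses the header and the first three rows copied from the hunk above. The helper name `rows_without_stop` is hypothetical, and reading an empty `STOP` as "no end date recorded" is an assumption rather than something the diff states.

```python
import csv
import io

# Header and first three data rows, copied from the conditions.csv hunk above.
SAMPLE = """\
START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION
2012-10-24,,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,5a0b7744-aceb-15ac-280e-8070b01f1fb4,65363002,Otitis media
2015-06-24,2015-10-17,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,0e105151-b37f-f62e-802f-beb66ba5fbc6,16114001,Fracture of ankle
2015-11-13,2015-11-22,95d8fdbf-cb4c-7133-1b5f-ab23979febe6,ce85c3fd-7cc6-481b-6a07-0c46110197a1,444814009,Viral sinusitis (disorder)
"""


def rows_without_stop(text):
    """Return (CODE, DESCRIPTION) pairs for rows whose STOP column is empty.

    Hypothetical helper for illustration; it is not part of pathling.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [(row["CODE"], row["DESCRIPTION"]) for row in reader if not row["STOP"]]


print(rows_without_stop(SAMPLE))  # [('65363002', 'Otitis media')]
```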
@@ -2,8 +2,8 @@
|
|
|
2
2
|
# Auto generated from POM project version.
|
|
3
3
|
# Please do not modify.
|
|
4
4
|
#
|
|
5
|
-
__version__="9.
|
|
6
|
-
__java_version__="9.
|
|
5
|
+
__version__="9.1.0.dev0"
|
|
6
|
+
__java_version__="9.1.0-SNAPSHOT"
|
|
7
7
|
__scala_version__="2.13"
|
|
8
8
|
__delta_version__="4.0.0"
|
|
9
9
|
__hadoop_version__="3.4.1"
|
|
@@ -99,6 +99,8 @@ class PathlingContext:
|
|
|
99
99
|
scope: Optional[str] = None,
|
|
100
100
|
token_expiry_tolerance: Optional[int] = 120,
|
|
101
101
|
accept_language: Optional[str] = None,
|
|
102
|
+
explain_queries: Optional[bool] = False,
|
|
103
|
+
max_unbound_traversal_depth: Optional[int] = 10,
|
|
102
104
|
enable_delta=False,
|
|
103
105
|
enable_remote_debugging: Optional[bool] = False,
|
|
104
106
|
debug_port: Optional[int] = 5005,
|
|
@@ -173,6 +175,11 @@ class PathlingContext:
|
|
|
173
175
|
the header is not sent. The server can use the header to return the result in the
|
|
174
176
|
preferred language if it is able. The actual behaviour may depend on the server
|
|
175
177
|
implementation and the code systems used.
|
|
178
|
+
:param explain_queries: setting this option to `True` will enable additional logging relating
|
|
179
|
+
to the query plan used to execute queries
|
|
180
|
+
:param max_unbound_traversal_depth: maximum depth for self-referencing structure traversals
|
|
181
|
+
in repeat operations. Controls how deeply nested hierarchical data can be flattened
|
|
182
|
+
during projection.
|
|
176
183
|
:param enable_delta: enables the use of Delta for storage of FHIR data.
|
|
177
184
|
Only supported when no SparkSession is provided.
|
|
178
185
|
:param enable_remote_debugging: enables remote debugging for the JVM process.
|
|
@@ -270,8 +277,20 @@ class PathlingContext:
|
|
|
270
277
|
.build()
|
|
271
278
|
)
|
|
272
279
|
|
|
273
|
-
|
|
274
|
-
|
|
280
|
+
# Build a query configuration object from the provided parameters.
|
|
281
|
+
query_config = (
|
|
282
|
+
jvm.au.csiro.pathling.config.QueryConfiguration.builder()
|
|
283
|
+
.explainQueries(explain_queries)
|
|
284
|
+
.maxUnboundTraversalDepth(max_unbound_traversal_depth)
|
|
285
|
+
.build()
|
|
286
|
+
)
|
|
287
|
+
|
|
288
|
+
jpc: JavaObject = (
|
|
289
|
+
jvm.au.csiro.pathling.library.PathlingContext.builder(spark._jsparkSession)
|
|
290
|
+
.encodingConfiguration(encoders_config)
|
|
291
|
+
.terminologyConfiguration(terminology_config)
|
|
292
|
+
.queryConfiguration(query_config)
|
|
293
|
+
.build()
|
|
275
294
|
)
|
|
276
295
|
return PathlingContext(spark, jpc)
|
|
277
296
|
|
|
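The `context.py` hunks above add two keyword arguments, `explain_queries` (default `False`) and `max_unbound_traversal_depth` (default `10`), and pass them to a JVM-side `QueryConfiguration` builder. The following is a hedged sketch of how a caller might supply them. It assumes the changed method is `PathlingContext.create`, which is the call used in the package README. `query_options` and `create_context` are hypothetical helpers introduced only for this illustration, and the negative-depth guard is a choice made for the sketch, not documented library behaviour.

```python
from typing import Any, Dict


def query_options(
    explain_queries: bool = False,
    max_unbound_traversal_depth: int = 10,
) -> Dict[str, Any]:
    """Collect the two new keyword arguments (hypothetical helper).

    The defaults mirror the ones shown in the diff.
    """
    if max_unbound_traversal_depth < 0:
        # Guard added for this sketch only; the diff does not show any validation.
        raise ValueError("max_unbound_traversal_depth must not be negative")
    return {
        "explain_queries": explain_queries,
        "max_unbound_traversal_depth": max_unbound_traversal_depth,
    }


def create_context(**options: Any):
    """Create a PathlingContext with the given options (hypothetical wrapper).

    Needs pathling, PySpark and a JVM when called, so the import is done
    lazily and this module can be loaded without them.
    """
    from pathling import PathlingContext

    return PathlingContext.create(**options)


# Example call, left as a comment because it starts a Spark session:
# pc = create_context(**query_options(explain_queries=True,
#                                     max_unbound_traversal_depth=5))
```

According to the new docstring, `explain_queries` enables additional logging about the query plan, and `max_unbound_traversal_depth` limits how deeply self-referencing structures are traversed in repeat operations during projection.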
@@ -0,0 +1,78 @@
|
|
|
1
|
+
# Copyright © 2018-2025 Commonwealth Scientific and Industrial Research
|
|
2
|
+
# Organisation (CSIRO) ABN 41 687 119 230.
|
|
3
|
+
#
|
|
4
|
+
# Licensed under the Apache License, Version 2.0 (the "License");
|
|
5
|
+
# you may not use this file except in compliance with the License.
|
|
6
|
+
# You may obtain a copy of the License at
|
|
7
|
+
#
|
|
8
|
+
# http://www.apache.org/licenses/LICENSE-2.0
|
|
9
|
+
#
|
|
10
|
+
# Unless required by applicable law or agreed to in writing, software
|
|
11
|
+
# distributed under the License is distributed on an "AS IS" BASIS,
|
|
12
|
+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
13
|
+
# See the License for the specific language governing permissions and
|
|
14
|
+
# limitations under the License.
|
|
15
|
+
|
|
16
|
+
[build-system]
|
|
17
|
+
requires = ["hatchling"]
|
|
18
|
+
build-backend = "hatchling.build"
|
|
19
|
+
|
|
20
|
+
[project]
|
|
21
|
+
name = "pathling"
|
|
22
|
+
dynamic = ["version"]
|
|
23
|
+
description = "Python API for Pathling"
|
|
24
|
+
readme = "README.md"
|
|
25
|
+
requires-python = ">=3.9"
|
|
26
|
+
license = { text = "Apache License, version 2.0" }
|
|
27
|
+
authors = [
|
|
28
|
+
{ name = "Australian e-Health Research Centre, CSIRO", email = "pathling@csiro.au" }
|
|
29
|
+
]
|
|
30
|
+
keywords = ["pathling", "fhir", "analytics", "spark", "standards", "terminology"]
|
|
31
|
+
classifiers = [
|
|
32
|
+
"Development Status :: 3 - Alpha",
|
|
33
|
+
"License :: OSI Approved :: Apache Software License",
|
|
34
|
+
"Programming Language :: Python :: 3.9",
|
|
35
|
+
"Programming Language :: Python :: 3.10",
|
|
36
|
+
"Programming Language :: Python :: 3.11",
|
|
37
|
+
]
|
|
38
|
+
dependencies = [
|
|
39
|
+
"pyspark>=4.0.0,<4.1.0",
|
|
40
|
+
"deprecated>=1.2.13",
|
|
41
|
+
]
|
|
42
|
+
|
|
43
|
+
[project.urls]
|
|
44
|
+
Homepage = "https://github.com/aehrc/pathling"
|
|
45
|
+
|
|
46
|
+
[dependency-groups]
|
|
47
|
+
dev = [
|
|
48
|
+
"Sphinx>=1.6,<8",
|
|
49
|
+
"sphinx-rtd-theme==1.2.2",
|
|
50
|
+
"pytest==8.2.0",
|
|
51
|
+
"ipython==8.10.0",
|
|
52
|
+
"wheel==0.43.0",
|
|
53
|
+
"twine==5.0.0",
|
|
54
|
+
"build==1.2.1",
|
|
55
|
+
"pytest-cov==5.0.0",
|
|
56
|
+
"http-server-mock==1.7",
|
|
57
|
+
]
|
|
58
|
+
|
|
59
|
+
[tool.hatch.version]
|
|
60
|
+
path = "pathling/_version.py"
|
|
61
|
+
pattern = '__version__="(?P<version>[^"]+)"'
|
|
62
|
+
|
|
63
|
+
[tool.hatch.build.targets.wheel]
|
|
64
|
+
packages = ["pathling"]
|
|
65
|
+
artifacts = [
|
|
66
|
+
"target/dependency/*.jar",
|
|
67
|
+
]
|
|
68
|
+
|
|
69
|
+
[tool.hatch.build.targets.wheel.shared-data]
|
|
70
|
+
"examples" = "share/pathling/examples"
|
|
71
|
+
|
|
72
|
+
[tool.hatch.build.targets.sdist]
|
|
73
|
+
include = [
|
|
74
|
+
"/pathling",
|
|
75
|
+
"/examples",
|
|
76
|
+
"/README.md",
|
|
77
|
+
"/LICENSE",
|
|
78
|
+
]
|
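The `[tool.hatch.version]` table above points at `pathling/_version.py` and supplies a regular expression with a named `version` group, while `[project]` declares `dynamic = ["version"]`. As a rough sketch of what that pattern captures, the snippet below applies the same expression to the lines shown in the `_version.py` hunk. It only mimics the lookup with the standard `re` module and makes no claim about how hatchling implements it internally; `extract_version` is a hypothetical name.

```python
import re

# Pattern copied from the [tool.hatch.version] table in pyproject.toml.
PATTERN = r'__version__="(?P<version>[^"]+)"'

# New-side lines of pathling/_version.py as shown in the diff.
VERSION_FILE = '''\
# Auto generated from POM project version.
# Please do not modify.
#
__version__="9.1.0.dev0"
__java_version__="9.1.0-SNAPSHOT"
__scala_version__="2.13"
'''


def extract_version(text, pattern=PATTERN):
    """Return the text captured by the named group 'version' (hypothetical helper)."""
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version string found")
    return match.group("version")


print(extract_version(VERSION_FILE))  # 9.1.0.dev0
```

The pattern does not match `__java_version__` or `__scala_version__`, because each of those has only a single underscore directly before `version`.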
pathling-9.0.0.dev1/MANIFEST.in
DELETED
pathling-9.0.0.dev1/PKG-INFO
DELETED
|
@@ -1,404 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: pathling
|
|
3
|
-
Version: 9.0.0.dev1
|
|
4
|
-
Summary: Python API for Pathling
|
|
5
|
-
Home-page: https://github.com/aehrc/pathling
|
|
6
|
-
Author: Australian e-Health Research Centre, CSIRO
|
|
7
|
-
Author-email: pathling@csiro.au
|
|
8
|
-
License: Apache License, version 2.0
|
|
9
|
-
Keywords: pathling,fhir,analytics,spark,standards,terminology
|
|
10
|
-
Classifier: Development Status :: 3 - Alpha
|
|
11
|
-
Classifier: License :: OSI Approved :: Apache Software License
|
|
12
|
-
Classifier: Programming Language :: Python :: 3.9
|
|
13
|
-
Classifier: Programming Language :: Python :: 3.10
|
|
14
|
-
Classifier: Programming Language :: Python :: 3.11
|
|
15
|
-
Requires-Python: >=3.9
|
|
16
|
-
Description-Content-Type: text/markdown
|
|
17
|
-
License-File: LICENSE
|
|
18
|
-
Requires-Dist: pyspark<4.1.0,>=4.0.0
|
|
19
|
-
Requires-Dist: deprecated>=1.2.13
|
|
20
|
-
Dynamic: author
|
|
21
|
-
Dynamic: author-email
|
|
22
|
-
Dynamic: classifier
|
|
23
|
-
Dynamic: description
|
|
24
|
-
Dynamic: description-content-type
|
|
25
|
-
Dynamic: home-page
|
|
26
|
-
Dynamic: keywords
|
|
27
|
-
Dynamic: license
|
|
28
|
-
Dynamic: license-file
|
|
29
|
-
Dynamic: requires-dist
|
|
30
|
-
Dynamic: requires-python
|
|
31
|
-
Dynamic: summary
|
|
32
|
-
|
|
33
|
-
Python API for Pathling
|
|
34
|
-
=======================
|
|
35
|
-
|
|
36
|
-
This is the Python API for [Pathling](https://pathling.csiro.au). It provides a
|
|
37
|
-
set of tools that aid the use of FHIR terminology services and FHIR data within
|
|
38
|
-
Python applications and data science workflows.
|
|
39
|
-
|
|
40
|
-
[View the API documentation →](https://pathling.csiro.au/docs/python/pathling.html)
|
|
41
|
-
|
|
42
|
-
## Installation
|
|
43
|
-
|
|
44
|
-
Prerequisites:
|
|
45
|
-
|
|
46
|
-
- Python 3.9+ with pip
|
|
47
|
-
|
|
48
|
-
To install, run this command:
|
|
49
|
-
|
|
50
|
-
```bash
|
|
51
|
-
pip install pathling
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
## Encoders
|
|
55
|
-
|
|
56
|
-
The Python library features a set of encoders for converting FHIR data into
|
|
57
|
-
Spark dataframes.
|
|
58
|
-
|
|
59
|
-
### Reading in NDJSON
|
|
60
|
-
|
|
61
|
-
[NDJSON](http://ndjson.org) is a format commonly used for bulk FHIR data, and
|
|
62
|
-
consists of files (one per resource type) that contains one JSON resource per
|
|
63
|
-
line.
|
|
64
|
-
|
|
65
|
-
```python
|
|
66
|
-
from pathling import PathlingContext
|
|
67
|
-
|
|
68
|
-
pc = PathlingContext.create()
|
|
69
|
-
|
|
70
|
-
# Read each line from the NDJSON into a row within a Spark data set.
|
|
71
|
-
ndjson_dir = '/some/path/ndjson/'
|
|
72
|
-
json_resources = pc.spark.read.text(ndjson_dir)
|
|
73
|
-
|
|
74
|
-
# Convert the data set of strings into a structured FHIR data set.
|
|
75
|
-
patients = pc.encode(json_resources, 'Patient')
|
|
76
|
-
|
|
77
|
-
# Do some stuff.
|
|
78
|
-
patients.select('id', 'gender', 'birthDate').show()
|
|
79
|
-
```
|
|
80
|
-
|
|
81
|
-
### Reading in Bundles
|
|
82
|
-
|
|
83
|
-
The FHIR [Bundle](https://hl7.org/fhir/R4/bundle.html) resource can contain a
|
|
84
|
-
collection of FHIR resources. It is often used to represent a set of related
|
|
85
|
-
resources, perhaps generated as part of the same event.
|
|
86
|
-
|
|
87
|
-
```python
|
|
88
|
-
from pathling import PathlingContext
|
|
89
|
-
|
|
90
|
-
pc = PathlingContext.create()
|
|
91
|
-
|
|
92
|
-
# Read each Bundle into a row within a Spark data set.
|
|
93
|
-
bundles_dir = '/some/path/bundles/'
|
|
94
|
-
bundles = pc.spark.read.text(bundles_dir, wholetext=True)
|
|
95
|
-
|
|
96
|
-
# Convert the data set of strings into a structured FHIR data set.
|
|
97
|
-
patients = pc.encode_bundle(bundles, 'Patient')
|
|
98
|
-
|
|
99
|
-
# JSON is the default format, XML Bundles can be encoded using input type.
|
|
100
|
-
# patients = pc.encodeBundle(bundles, 'Patient', inputType=MimeType.FHIR_XML)
|
|
101
|
-
|
|
102
|
-
# Do some stuff.
|
|
103
|
-
patients.select('id', 'gender', 'birthDate').show()
|
|
104
|
-
```
|
|
105
|
-
|
|
106
|
-
## Running SQL on FHIR views
|
|
107
|
-
|
|
108
|
-
The Pathling library leverages the SQL on FHIR specification to provide a way to
|
|
109
|
-
project FHIR data into easy-to-use tabular forms.
|
|
110
|
-
|
|
111
|
-
Once you have transformed your FHIR data into tabular views, you can choose to
|
|
112
|
-
keep it in a Spark dataframe and continue to work with in Apache Spark, or
|
|
113
|
-
export it to Python or R dataframes or a variety of different file formats for
|
|
114
|
-
use in the tool of your choice.
|
|
115
|
-
|
|
116
|
-
```python
|
|
117
|
-
from pathling import PathlingContext
|
|
118
|
-
|
|
119
|
-
pc = PathlingContext.create()
|
|
120
|
-
data = pc.read.ndjson("/some/file/location")
|
|
121
|
-
|
|
122
|
-
result = data.view(
|
|
123
|
-
resource="Patient",
|
|
124
|
-
select=[
|
|
125
|
-
{"column": [{"path": "getResourceKey()", "name": "patient_id"}]},
|
|
126
|
-
{
|
|
127
|
-
"forEach": "address",
|
|
128
|
-
"column": [
|
|
129
|
-
{"path": "line.join('\\n')", "name": "street"},
|
|
130
|
-
{"path": "use", "name": "use"},
|
|
131
|
-
{"path": "city", "name": "city"},
|
|
132
|
-
{"path": "postalCode", "name": "zip"},
|
|
133
|
-
],
|
|
134
|
-
},
|
|
135
|
-
],
|
|
136
|
-
)
|
|
137
|
-
|
|
138
|
-
display(result)
|
|
139
|
-
```
|
|
140
|
-
|
|
141
|
-
The result of this query would look something like this:
|
|
142
|
-
|
|
143
|
-
| patient_id | street | use | city | zip |
|
|
144
|
-
|------------|----------------------------|------|------------|-------|
|
|
145
|
-
| 1 | 398 Kautzer Walk Suite 62 | home | Barnstable | 02675 |
|
|
146
|
-
| 1 | 186 Nitzsche Forge | work | Revere | 02151 |
|
|
147
|
-
| 2 | 1087 Quitzon Club | home | Plymouth | NULL |
|
|
148
|
-
| 3 | 442 Bruen Arcade | home | Nantucket | NULL |
|
|
149
|
-
| 4 | 858 Miller Junction Apt 61 | work | Brockton | 02301 |
|
|
150
|
-
|
|
151
|
-
For a more comprehensive example demonstrating SQL on FHIR queries with multiple
|
|
152
|
-
views, complex transformations and joins, see
|
|
153
|
-
the [SQL on FHIR example](https://pathling.csiro.au/docs/libraries/examples/prostate-cancer.md).
|
|
154
|
-
|
|
155
|
-
## Terminology functions
|
|
156
|
-
|
|
157
|
-
The library also provides a set of functions for querying a FHIR terminology
|
|
158
|
-
server from within your queries and transformations.
|
|
159
|
-
|
|
160
|
-
### Value set membership
|
|
161
|
-
|
|
162
|
-
The `member_of` function can be used to test the membership of a code within a
|
|
163
|
-
[FHIR value set](https://hl7.org/fhir/valueset.html). This can be used with both
|
|
164
|
-
explicit value sets (i.e. those that have been pre-defined and loaded into the
|
|
165
|
-
terminology server) and implicit value sets (e.g. SNOMED CT
|
|
166
|
-
[Expression Constraint Language](http://snomed.org/ecl)).
|
|
167
|
-
|
|
168
|
-
In this example, we take a list of SNOMED CT diagnosis codes and
|
|
169
|
-
create a new column which shows which are viral infections. We use an ECL
|
|
170
|
-
expression to define viral infection as a disease with a pathological process
|
|
171
|
-
of "Infectious process", and a causative agent of "Virus".
|
|
172
|
-
|
|
173
|
-
```python
|
|
174
|
-
result = pc.member_of(csv, to_coding(csv.CODE, 'http://snomed.info/sct'),
|
|
175
|
-
to_ecl_value_set("""
|
|
176
|
-
<< 64572001|Disease| : (
|
|
177
|
-
<< 370135005|Pathological process| = << 441862004|Infectious process|,
|
|
178
|
-
<< 246075003|Causative agent| = << 49872002|Virus|
|
|
179
|
-
)
|
|
180
|
-
"""), 'VIRAL_INFECTION')
|
|
181
|
-
result.select('CODE', 'DESCRIPTION', 'VIRAL_INFECTION').show()
|
|
182
|
-
```
|
|
183
|
-
|
|
184
|
-
Results in:
|
|
185
|
-
|
|
186
|
-
| CODE | DESCRIPTION | VIRAL_INFECTION |
|
|
187
|
-
|-----------|---------------------------|-----------------|
|
|
188
|
-
| 65363002 | Otitis media | false |
|
|
189
|
-
| 16114001 | Fracture of ankle | false |
|
|
190
|
-
| 444814009 | Viral sinusitis | true |
|
|
191
|
-
| 444814009 | Viral sinusitis | true |
|
|
192
|
-
| 43878008 | Streptococcal sore throat | false |
|
|
193
|
-
|
|
194
|
-
### Concept translation
|
|
195
|
-
|
|
196
|
-
The `translate` function can be used to translate codes from one code system to
|
|
197
|
-
another using maps that are known to the terminology server. In this example, we
|
|
198
|
-
translate our SNOMED CT diagnosis codes into Read CTV3.
|
|
199
|
-
|
|
200
|
-
```python
|
|
201
|
-
result = pc.translate(csv, to_coding(csv.CODE, 'http://snomed.info/sct'),
|
|
202
|
-
'http://snomed.info/sct/900000000000207008?fhir_cm='
|
|
203
|
-
'900000000000497000',
|
|
204
|
-
output_column_name='READ_CODE')
|
|
205
|
-
result = result.withColumn('READ_CODE', result.READ_CODE.code)
|
|
206
|
-
result.select('CODE', 'DESCRIPTION', 'READ_CODE').show()
|
|
207
|
-
```
|
|
208
|
-
|
|
209
|
-
Results in:
|
|
210
|
-
|
|
211
|
-
| CODE | DESCRIPTION | READ_CODE |
|
|
212
|
-
|-----------|---------------------------|-----------|
|
|
213
|
-
| 65363002 | Otitis media | X00ik |
|
|
214
|
-
| 16114001 | Fracture of ankle | S34.. |
|
|
215
|
-
| 444814009 | Viral sinusitis | XUjp0 |
|
|
216
|
-
| 444814009 | Viral sinusitis | XUjp0 |
|
|
217
|
-
| 43878008 | Streptococcal sore throat | A340. |
|
|
218
|
-
|
|
219
|
-
### Subsumption testing
|
|
220
|
-
|
|
221
|
-
Subsumption test is a fancy way of saying "is this code equal or a subtype of
|
|
222
|
-
this other code".
|
|
223
|
-
|
|
224
|
-
For example, a code representing "ankle fracture" is subsumed
|
|
225
|
-
by another code representing "fracture". The "fracture" code is more general,
|
|
226
|
-
and using it with subsumption can help us find other codes representing
|
|
227
|
-
different subtypes of fracture.
|
|
228
|
-
|
|
229
|
-
The `subsumes` function allows us to perform subsumption testing on codes within
|
|
230
|
-
our data. The order of the left and right operands can be reversed to query
|
|
231
|
-
whether a code is "subsumed by" another code.
|
|
232
|
-
|
|
233
|
-
```python
|
|
234
|
-
# 232208008 |Ear, nose and throat disorder|
|
|
235
|
-
left_coding = Coding('http://snomed.info/sct', '232208008')
|
|
236
|
-
right_coding_column = to_coding(csv.CODE, 'http://snomed.info/sct')
|
|
237
|
-
|
|
238
|
-
result = pc.subsumes(csv, 'IS_ENT',
|
|
239
|
-
left_coding=left_coding,
|
|
240
|
-
right_coding_column=right_coding_column)
|
|
241
|
-
|
|
242
|
-
result.select('CODE', 'DESCRIPTION', 'IS_ENT').show()
|
|
243
|
-
```
|
|
244
|
-
|
|
245
|
-
Results in:
|
|
246
|
-
|
|
247
|
-
| CODE | DESCRIPTION | IS_ENT |
|
|
248
|
-
|-----------|-------------------|--------|
|
|
249
|
-
| 65363002 | Otitis media | true |
|
|
250
|
-
| 16114001 | Fracture of ankle | false |
|
|
251
|
-
| 444814009 | Viral sinusitis | true |
|
|
252
|
-
|
|
253
|
-
### Retrieving properties
|
|
254
|
-
|
|
255
|
-
Some terminologies contain additional properties that are associated with codes.
|
|
256
|
-
You can query these properties using the `property_of` function.
|
|
257
|
-
|
|
258
|
-
There is also a `display` function that can be used to retrieve the preferred
|
|
259
|
-
display term for each code.
|
|
260
|
-
|
|
261
|
-
```python
|
|
262
|
-
# Get the parent codes for each code in the dataset.
|
|
263
|
-
parents = csv.withColumn(
|
|
264
|
-
"PARENTS",
|
|
265
|
-
property_of(to_snomed_coding(csv.CODE), "parent", PropertyType.CODE),
|
|
266
|
-
)
|
|
267
|
-
# Split each parent code into a separate row.
|
|
268
|
-
exploded_parents = parents.selectExpr(
|
|
269
|
-
"CODE", "DESCRIPTION", "explode_outer(PARENTS) AS PARENT"
|
|
270
|
-
)
|
|
271
|
-
# Retrieve the preferred term for each parent code.
|
|
272
|
-
with_displays = exploded_parents.withColumn(
|
|
273
|
-
"PARENT_DISPLAY", display(to_snomed_coding(exploded_parents.PARENT))
|
|
274
|
-
)
|
|
275
|
-
```
|
|
276
|
-
|
|
277
|
-
Results in:
|
|
278
|
-
|
|
279
|
-
| CODE | DESCRIPTION | PARENT | PARENT_DISPLAY |
|
|
280
|
-
|-----------|-------------------|-----------|-----------------------------------------|
|
|
281
|
-
| 65363002 | Otitis media | 43275000 | Otitis |
|
|
282
|
-
| 65363002 | Otitis media | 68996008 | Disorder of middle ear |
|
|
283
|
-
| 16114001 | Fracture of ankle | 125603006 | Injury of ankle |
|
|
284
|
-
| 16114001 | Fracture of ankle | 46866001 | Fracture of lower limb |
|
|
285
|
-
| 444814009 | Viral sinusitis | 36971009 | Sinusitis |
|
|
286
|
-
| 444814009 | Viral sinusitis | 281794004 | Viral upper respiratory tract infection |
|
|
287
|
-
| 444814009 | Viral sinusitis | 363166002 | Infective disorder of head |
|
|
288
|
-
| 444814009 | Viral sinusitis | 36971009 | Sinusitis |
|
|
289
|
-
| 444814009 | Viral sinusitis | 281794004 | Viral upper respiratory tract infection |
|
|
290
|
-
| 444814009 | Viral sinusitis | 363166002 | Infective disorder of head |

### Retrieving designations

Some terminologies contain additional display terms for codes. These can be used
for language translations, synonyms, and more. You can query these terms using
the `designation` function.

```python
# Get the synonyms for each code in the dataset.
synonyms = csv.withColumn(
    "SYNONYMS",
    designation(to_snomed_coding(csv.CODE),
                Coding.of_snomed("900000000000013009")),
)
# Split each synonym into a separate row.
exploded_synonyms = synonyms.selectExpr(
    "CODE", "DESCRIPTION", "explode_outer(SYNONYMS) AS SYNONYM"
)
```

Results in:

| CODE      | DESCRIPTION                          | SYNONYM                                    |
|-----------|--------------------------------------|--------------------------------------------|
| 65363002  | Otitis media                         | OM - Otitis media                          |
| 16114001  | Fracture of ankle                    | Ankle fracture                             |
| 16114001  | Fracture of ankle                    | Fracture of distal end of tibia and fibula |
| 444814009 | Viral sinusitis (disorder)           | NULL                                       |
| 444814009 | Viral sinusitis (disorder)           | NULL                                       |
| 43878008  | Streptococcal sore throat (disorder) | Septic sore throat                         |
| 43878008  | Streptococcal sore throat (disorder) | Strep throat                               |
| 43878008  | Streptococcal sore throat (disorder) | Strept throat                              |
| 43878008  | Streptococcal sore throat (disorder) | Streptococcal angina                       |
| 43878008  | Streptococcal sore throat (disorder) | Streptococcal pharyngitis                  |
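
The `NULL` rows for Viral sinusitis come from `explode_outer`, which keeps a row even when its array is empty or null (plain `explode` would drop it). The following is a plain-Python sketch of that behaviour, not Spark or Pathling code; the function name `explode_outer_rows` and the sample rows are illustrative assumptions.

```python
from typing import Any, Dict, Iterable, Iterator, List, Optional


def explode_outer_rows(
    rows: Iterable[Dict[str, Any]], array_col: str, out_col: str
) -> Iterator[Dict[str, Any]]:
    """Yield one row per array element, or one row with None if there are none."""
    for row in rows:
        # Copy every column except the array being exploded.
        base = {k: v for k, v in row.items() if k != array_col}
        values: Optional[List[Any]] = row.get(array_col)
        if not values:
            # Empty or missing array: keep the row, with a null output value.
            yield {**base, out_col: None}
        else:
            for value in values:
                yield {**base, out_col: value}


# Hypothetical input rows, shaped like the SYNONYMS column above.
rows = [
    {"CODE": "16114001", "SYNONYMS": ["Ankle fracture", "Fracture of distal end of tibia and fibula"]},
    {"CODE": "444814009", "SYNONYMS": []},
]
exploded = list(explode_outer_rows(rows, "SYNONYMS", "SYNONYM"))
```

Here the code without synonyms still contributes one row, with `SYNONYM` set to `None`, which is what appears as `NULL` in the table.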

### Terminology server authentication

Pathling can be configured to connect to a protected terminology server by
supplying a set of OAuth2 client credentials and a token endpoint.

Here is an example of how to authenticate to
the [NHS terminology server](https://ontology.nhs.uk/):

```python
from pathling import PathlingContext

pc = PathlingContext.create(
    terminology_server_url='https://ontology.nhs.uk/production1/fhir',
    token_endpoint='https://ontology.nhs.uk/authorisation/auth/realms/nhs-digital-terminology/protocol/openid-connect/token',
    client_id='[client ID]',
    client_secret='[client secret]'
)
```
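
To keep the client secret out of notebooks and source control, the same four parameters could be read from environment variables. This is a minimal sketch under assumptions: the helper `terminology_auth_kwargs` and the `PATHLING_TX_*` variable names are hypothetical and not part of Pathling; only the keyword names come from the example above.

```python
import os
from typing import Dict, Mapping


def terminology_auth_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build keyword arguments for PathlingContext.create from environment variables."""
    # Hypothetical variable names; use whatever your deployment defines.
    required = {
        "terminology_server_url": "PATHLING_TX_URL",
        "token_endpoint": "PATHLING_TX_TOKEN_ENDPOINT",
        "client_id": "PATHLING_TX_CLIENT_ID",
        "client_secret": "PATHLING_TX_CLIENT_SECRET",
    }
    # Fail early, naming every variable that is unset or empty.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {param: env[var] for param, var in required.items()}


# Usage, assuming pathling is installed and the variables are set:
# from pathling import PathlingContext
# pc = PathlingContext.create(**terminology_auth_kwargs())
```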

## Installation in Databricks

To make the Pathling library available within notebooks, navigate to the
"Compute" section and click on the cluster. Click on the "Libraries" tab, and
click "Install new".

Install both the `pathling` PyPI package and the
`au.csiro.pathling:library-api` Maven package. Once the cluster is restarted,
the libraries should be available for import and use within all notebooks.

By default, Databricks uses Java 8 within its clusters, while Pathling requires
Java 21. To enable Java 21 support within your cluster, navigate to __Advanced
Options > Spark > Environment Variables__ and add the following:

```bash
JNAME=zulu21-ca-amd64
```

See the Databricks documentation on
[Libraries](https://docs.databricks.com/libraries/index.html) for more
information.
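
After restarting the cluster, one way to check which Java version is active is to parse the output of `java -version` from a notebook cell. This is only a sketch: the helper names are hypothetical, and it assumes that the `java` binary on the driver's `PATH` is the one Spark uses, which may not hold in every environment.

```python
import re
import subprocess
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major version from the first quoted version string, if any."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as "1.8.0_xxx".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def current_java_major() -> Optional[int]:
    """Run `java -version`, which prints to stderr, and parse the result."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr)
```

Calling `current_java_major()` should return `21` once the `JNAME` setting has taken effect.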

## Spark cluster configuration

If you are running your own Spark cluster, or using a Docker image (such as
[jupyter/pyspark-notebook](https://hub.docker.com/r/jupyter/pyspark-notebook)),
you will need to configure Pathling as a Spark package.

You can do this by adding the following to your `spark-defaults.conf` file:

```
spark.jars.packages au.csiro.pathling:library-api:[some version]
```

See the [Configuration](https://spark.apache.org/docs/latest/configuration.html)
page of the Spark documentation for more information about `spark.jars.packages`
and other related configuration options.
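
If `spark-defaults.conf` is managed by a script, the entry can be added idempotently so that repeated runs do not duplicate it. The sketch below is an assumption-laden illustration: `ensure_spark_package` is a hypothetical helper, and it assumes the whitespace-separated `key value` form shown above (not `key=value`). Because `spark.jars.packages` is a comma-separated list, an existing entry is extended and not replaced.

```python
from pathlib import Path

KEY = "spark.jars.packages"


def ensure_spark_package(conf_path: Path, coordinate: str) -> None:
    """Add a Maven coordinate to spark.jars.packages in a spark-defaults.conf file."""
    lines = conf_path.read_text().splitlines() if conf_path.exists() else []
    for i, line in enumerate(lines):
        parts = line.split(None, 1)
        if parts and parts[0] == KEY:
            # Existing entry: extend its comma-separated list if needed.
            existing = parts[1].split(",") if len(parts) > 1 else []
            packages = [p.strip() for p in existing if p.strip()]
            if coordinate not in packages:
                packages.append(coordinate)
            lines[i] = f"{KEY} {','.join(packages)}"
            break
    else:
        # No existing entry: append a new one.
        lines.append(f"{KEY} {coordinate}")
    conf_path.write_text("\n".join(lines) + "\n")
```

For example, `ensure_spark_package(Path("/usr/local/spark/conf/spark-defaults.conf"), "au.csiro.pathling:library-api:[some version]")`, with the placeholder replaced by the version you intend to use.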

To create a Pathling notebook Docker image, your `Dockerfile` might look like
this:

```dockerfile
FROM jupyter/pyspark-notebook

USER root
RUN echo "spark.jars.packages au.csiro.pathling:library-api:[some version]" >> /usr/local/spark/conf/spark-defaults.conf

USER ${NB_UID}

RUN pip install --quiet --no-cache-dir pathling && \
    fix-permissions "${CONDA_DIR}" && \
    fix-permissions "/home/${NB_USER}"
```

Pathling is copyright © 2018-2025, Commonwealth Scientific and Industrial
Research Organisation (CSIRO) ABN 41 687 119 230. Licensed under
the [Apache License, version 2.0](https://www.apache.org/licenses/LICENSE-2.0).