PyPI - libadalina-core - Versions diffs - 1.0__tar.gz - Mend

libadalina-core 1.0__tar.gz


Files changed (52)

libadalina_core-1.0/.gitignore ADDED

@@ -0,0 +1 @@
+.idea

libadalina_core-1.0/.gitlab-ci.yml ADDED

@@ -0,0 +1,15 @@
+# The Docker image that will be used to build your app
+image: python:3.10-alpine
+create-pages:
+  pages:
+    # The folder that contains the files to be exposed at the Page URL
+    publish: public
+  rules:
+    # This ensures that only pushes to the default branch will trigger
+    # a pages deploy
+    - if: $CI_COMMIT_REF_NAME == $CI_DEFAULT_BRANCH
+  # Functions that should be executed before the build script is run
+  before_script: []
+  script:
+    - pip install sphinx==7.4.6 pydata-sphinx-theme==0.16.0
+    - sphinx-build -b html docs/source public

libadalina_core-1.0/.gitmodules ADDED

@@ -0,0 +1,3 @@
+[submodule "tests/samples"]
+	path = tests/samples
+	url = git@gitlab.com:amelia_unimi/libadalina-samples.git

libadalina_core-1.0/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 University of Milan
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

libadalina_core-1.0/MANIFEST.in ADDED

@@ -0,0 +1,3 @@
+include LICENSE
+include README.md
+recursive-include libadalina_core *.py

libadalina_core-1.0/PKG-INFO ADDED

@@ -0,0 +1,67 @@
+Metadata-Version: 2.4
+Name: libadalina-core
+Version: 1.0
+Summary: A library for spatial joins of geographic data
+Author-email: Marco Casazza <d.marcocasazza@gmail.com>, Alberto Ceselli <alberto.ceselli@unimi.it>, Marco Premoli <marco.premoli@unimi.it>
+License-Expression: MIT
+Project-URL: Homepage, https://gitlab.com/amelia_unimi/libadalina
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Requires-Python: ~=3.10
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: apache-sedona[spark]==1.7.1
+Requires-Dist: pyspark==3.3.2
+Requires-Dist: pandas==2.2.3
+Requires-Dist: geopandas==1.0.1
+Requires-Dist: shapely==2.1.1
+Requires-Dist: install-jdk==1.1.0
+Provides-Extra: dev
+Requires-Dist: pytest==8.4.1; extra == "dev"
+Requires-Dist: black; extra == "dev"
+Requires-Dist: isort; extra == "dev"
+Requires-Dist: sphinx; extra == "dev"
+Requires-Dist: pydata-sphinx-theme; extra == "dev"
+Dynamic: license-file
+# libadalina-core
+A Python library for spatial data processing.
+It makes it easier to work with geospatial data in Python by providing a high-level interface
+to Apache Sedona, a powerful geospatial processing engine, and integrates nicely with other well-known libraries
+such as *geopandas* and *pandas*.
+## Installation
+libadalina-core can be installed using pip:
+```
+pip install libadalina-core
+```
+If the `JAVA_HOME` environment variable is not set, a suitable JDK is downloaded to `$HOME/.jre` and used automatically.
+Not all JREs are supported; if you encounter issues with your own installation, try the automatically installed version.
+## Usage
+You can find the documentation and examples at [libadalina-core documentation](https://libadalinacore-6b2a95.gitlab.io/).
+## Features
+* Reading and writing geospatial data from various formats
+* Spatial joins between datasets
+* Spatial aggregations
+* Utilities for working with Apache Sedona
+* Configuration helpers for setting up Apache Sedona
+## Requirements
+- Python 3.10
+- Dependencies:
+  - apache-sedona[spark]
+  - pyspark
+  - pandas
+  - geopandas
+  - install-jdk

libadalina_core-1.0/README.md ADDED

@@ -0,0 +1,39 @@
+# libadalina-core
+A Python library for spatial data processing.
+It makes it easier to work with geospatial data in Python by providing a high-level interface
+to Apache Sedona, a powerful geospatial processing engine, and integrates nicely with other well-known libraries
+such as *geopandas* and *pandas*.
+## Installation
+libadalina-core can be installed using pip:
+```
+pip install libadalina-core
+```
+If the `JAVA_HOME` environment variable is not set, a suitable JDK is downloaded to `$HOME/.jre` and used automatically.
+Not all JREs are supported; if you encounter issues with your own installation, try the automatically installed version.
+## Usage
+You can find the documentation and examples at [libadalina-core documentation](https://libadalinacore-6b2a95.gitlab.io/).
+## Features
+* Reading and writing geospatial data from various formats
+* Spatial joins between datasets
+* Spatial aggregations
+* Utilities for working with Apache Sedona
+* Configuration helpers for setting up Apache Sedona
+## Requirements
+- Python 3.10
+- Dependencies:
+  - apache-sedona[spark]
+  - pyspark
+  - pandas
+  - geopandas
+  - install-jdk
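
The `JAVA_HOME` rule described in the README can be pictured with a small, self-contained sketch. This is not the package's own installer code: `resolve_java_home` is a hypothetical helper, and the assumption that the downloaded JDK sits directly under `$HOME/.jre` is taken only from the README wording.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_java_home(env: Mapping[str, str], home: Path) -> Path:
    """Hypothetical helper mirroring the README rule for locating a JDK."""
    java_home = env.get("JAVA_HOME", "").strip()
    if java_home:
        # An explicitly configured JDK takes precedence.
        return Path(java_home)
    # Assumption: the automatically downloaded JDK lives under $HOME/.jre.
    return home / ".jre"


if __name__ == "__main__":
    print(resolve_java_home(os.environ, Path.home()))
```

Passing the environment and home directory as arguments keeps the rule easy to check without touching the real environment.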

libadalina_core-1.0/build.sh ADDED

@@ -0,0 +1,6 @@
+#!/usr/bin/env bash
+python3 -m pip install --upgrade build twine
+python3 -m build
+python3 -m twine upload --repository pypi dist/*

libadalina_core-1.0/docs/Makefile ADDED

@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    ?=
+SPHINXBUILD   ?= sphinx-build
+SOURCEDIR     = source
+BUILDDIR      = build
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+.PHONY: help Makefile
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

libadalina_core-1.0/docs/build/.gitignore ADDED

@@ -0,0 +1 @@
+/*

libadalina_core-1.0/docs/make.bat ADDED

@@ -0,0 +1,35 @@
+@ECHO OFF
+pushd %~dp0
+REM Command file for Sphinx documentation
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=source
+set BUILDDIR=build
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+	echo.installed, then set the SPHINXBUILD environment variable to point
+	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+	echo.may add the Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.https://www.sphinx-doc.org/
+	exit /b 1
+)
+if "%1" == "" goto help
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+:end
+popd

libadalina_core-1.0/docs/source/conf.py ADDED

@@ -0,0 +1,66 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# For the full list of built-in configuration values, see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+import os
+import sys
+sys.path.insert(0, os.path.abspath('../..'))
+# -- Project information -----------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+project = 'libadalina-core'
+copyright = '2025, OptLab, University of Milan'
+author = 'Marco Casazza, Alberto Ceselli, Marco Premoli'
+release = '0.1.0'
+# -- General configuration ---------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
+extensions = [
+    'sphinx.ext.autodoc',
+    'sphinx.ext.autosummary',
+    'sphinx.ext.viewcode',
+    'sphinx.ext.napoleon',
+    'sphinx.ext.intersphinx',
+]
+# Napoleon settings
+napoleon_google_docstring = True
+napoleon_numpy_docstring = True
+napoleon_include_init_with_doc = False
+napoleon_include_private_with_doc = False
+napoleon_include_special_with_doc = True
+napoleon_use_admonition_for_examples = False
+napoleon_use_admonition_for_notes = False
+napoleon_use_admonition_for_references = False
+napoleon_use_ivar = False
+napoleon_use_param = True
+napoleon_use_rtype = True
+napoleon_preprocess_types = False
+napoleon_type_aliases = None
+napoleon_attr_annotations = True
+# Intersphinx settings
+intersphinx_mapping = {
+    'python': ('https://docs.python.org/3', None),
+    'pandas': ('https://pandas.pydata.org/pandas-docs/stable', None),
+    'geopandas': ('https://geopandas.org/en/stable/', None),
+    'pyspark': ('https://spark.apache.org/docs/latest/api/python/', None),
+}
+templates_path = ['_templates']
+exclude_patterns = []
+# -- Options for HTML output -------------------------------------------------
+# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+html_theme = 'pydata_sphinx_theme'
+html_static_path = ['_static']
+latex_elements = {
+    'fncychap': r'\usepackage[Bjornstrup]{fncychap}',
+}

libadalina_core-1.0/docs/source/examples/hospitals_in_provinces.rst ADDED

@@ -0,0 +1,25 @@
+Hospitals in Provinces
+=======================
+This example demonstrates how to find hospitals in specific provinces in Italy and aggregate their data.
+.. literalinclude:: ../../../libadalina_core/examples/example_hospitals_in_provinces.py
+   :language: python
+   :linenos:
+   :caption: Example: Finding hospitals in provinces and aggregating data
+Example Explanation
+-------------------
+This example shows how to:
+1. Load geospatial data from GeoPackage files using ``geopackage_to_dataframe``
+2. Filter data to select specific provinces (Milan and Cremona)
+3. Perform a spatial join between provinces and hospitals using ``spatial_join``
+4. Aggregate data to count hospitals and calculate total and average beds using ``spatial_aggregation``
+The example demonstrates the use of the following libadalina-core features:
+* Reading geospatial data
+* Spatial joins
+* Spatial aggregations with different aggregation functions
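
To make the count, total, and average in step 4 concrete, here is a minimal plain-Python sketch of the same arithmetic. It does not call libadalina-core or Spark; the rows and the `aggregate_beds` helper are hypothetical, and it assumes every joined row carries a matched hospital.

```python
from collections import defaultdict

# Hypothetical joined rows: (province, hospital_name, cap_beds).
joined_rows = [
    ("Milano", "Hospital A", 120),
    ("Milano", "Hospital B", 80),
    ("Cremona", "Hospital C", 50),
]


def aggregate_beds(rows):
    """Group rows by province and compute count, sum, and average of beds."""
    beds_by_province = defaultdict(list)
    for province, _hospital, beds in rows:
        beds_by_province[province].append(beds)
    return {
        province: {
            "hospitals": len(beds),
            "total_beds": sum(beds),
            "average_beds": sum(beds) / len(beds),
        }
        for province, beds in beds_by_province.items()
    }


summary = aggregate_beds(joined_rows)
print(summary)
```

The output keys reuse the aliases from the example (`hospitals`, `total_beds`, `average_beds`) only for readability.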

libadalina_core-1.0/docs/source/examples/index.rst ADDED

@@ -0,0 +1,11 @@
+Examples
+========
+This section provides examples of how to use the libadalina-core library for various geospatial data processing tasks.
+.. toctree::
+   :maxdepth: 1
+   hospitals_in_provinces
+   population_in_provinces
+   population_served_by_hospitals

libadalina_core-1.0/docs/source/examples/population_in_provinces.rst ADDED

@@ -0,0 +1,26 @@
+Population in Provinces
+=======================
+This example demonstrates how to calculate population statistics for provinces in Italy.
+.. literalinclude:: ../../../libadalina_core/examples/example_population_in_provinces.py
+   :language: python
+   :linenos:
+   :caption: Example: Calculating population statistics for provinces
+Example Explanation
+-------------------
+This example shows how to:
+1. Load geospatial data from GeoPackage files using ``geopackage_to_dataframe``
+2. Filter data to select specific provinces in Northern Italy
+3. Perform a spatial join between provinces and population data using ``spatial_join``
+4. Aggregate population data by province using ``spatial_aggregation``
+The example demonstrates the use of the following libadalina-core features:
+* Reading geospatial data
+* Filtering data based on attributes
+* Spatial joins
+* Spatial aggregations with sum aggregation function

libadalina_core-1.0/docs/source/examples/population_served_by_hospitals.rst ADDED

@@ -0,0 +1,29 @@
+Population Served by Hospitals
+==============================
+This example demonstrates how to calculate the population served by hospitals in different provinces of Italy.
+.. literalinclude:: ../../../libadalina_core/examples/example_population_served_by_hospitals.py
+   :language: python
+   :linenos:
+   :caption: Example: Calculating population served by hospitals
+Example Explanation
+-------------------
+This example shows how to:
+1. Load geospatial data from GeoPackage files using ``geopackage_to_dataframe``
+2. Filter data to select specific provinces in Northern Italy
+3. Create buffer zones around hospitals using ``polygonize``
+4. Perform a spatial join between hospital buffer zones and population data using ``spatial_join``
+5. Aggregate population data by hospital using ``spatial_aggregation``
+The example demonstrates the use of the following libadalina-core features:
+* Reading geospatial data
+* Filtering data based on attributes
+* Creating buffer zones around geometries
+* Spatial joins
+* Spatial aggregations with proportional calculations
+* Handling of complex spatial relationships
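
The source does not spell out how the "proportional calculations" are computed. One common reading, assumed here, is area weighting: each population cell contributes its value scaled by the fraction of its area that falls inside the zone. The sketch below illustrates that idea with axis-aligned rectangles in plain Python. `Rect`, `intersection`, and `proportional_sum` are hypothetical names, and whether libadalina-core uses exactly this formula is not confirmed by the diff.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a real geometry."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        # Degenerate (non-overlapping) rectangles have zero area.
        return max(0.0, self.xmax - self.xmin) * max(0.0, self.ymax - self.ymin)


def intersection(a: Rect, b: Rect) -> Rect:
    """Return the overlap of two rectangles (possibly degenerate)."""
    return Rect(max(a.xmin, b.xmin), max(a.ymin, b.ymin),
                min(a.xmax, b.xmax), min(a.ymax, b.ymax))


def proportional_sum(zone: Rect, cells: list[tuple[Rect, float]]) -> float:
    """Sum cell values weighted by the share of each cell inside the zone."""
    total = 0.0
    for cell, value in cells:
        cell_area = cell.area()
        if cell_area == 0:
            continue  # skip cells without area to avoid dividing by zero
        total += value * intersection(zone, cell).area() / cell_area
    return total


# Hypothetical data: one cell fully inside, one half inside, one outside.
zone = Rect(0, 0, 2, 2)
cells = [
    (Rect(0, 0, 1, 1), 100.0),
    (Rect(1, 0, 3, 1), 60.0),
    (Rect(5, 5, 6, 6), 40.0),
]
served = proportional_sum(zone, cells)
print(served)
```

Under this assumption, a cell that straddles a buffer boundary contributes only part of its population rather than all or nothing.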

libadalina_core-1.0/docs/source/index.rst ADDED

@@ -0,0 +1,54 @@
+.. libadalina-core documentation master file, created by
+   sphinx-quickstart on Thu Oct 26 10:00:00 2023.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+=============================
+libadalina-core documentation
+=============================
+Introduction
+************
+*libadalina-core* is a Python library for spatial data processing and analysis. It provides utilities for reading,
+writing, and processing geospatial data,
+with a focus on spatial joins and aggregations.
+It makes it easier to work with geospatial data in Python by providing a high-level interface
+to Apache Sedona, a powerful geospatial processing engine, and integrates nicely with other well-known libraries
+such as *geopandas* and *pandas*.
+*libadalina-core* is part of the `ADaLinA project <https://expertise.unimi.it/resource/project/PNRR%5FBAC24ACESE%5F01>`__
+that aims to develop a set of tools for the analysis of large-scale spatial data
+to be integrated into the `AMELIA <https://grins.it/progetto/piattaforma-amelia>`__ platform.
+Features
+--------
+* Reading and writing geospatial data from various formats
+* Spatial joins between datasets
+* Spatial aggregations
+* Utilities for working with Apache Sedona
+* Configuration helpers for setting up Apache Sedona
+Requirements
+------------
+*libadalina-core* requires Python 3.10 and depends on the following libraries:
+* apache-sedona
+* pyspark
+* pandas
+* geopandas
+* install-jdk
+*libadalina-core* has been tested with OpenJDK 17.
+.. toctree::
+   :maxdepth: 2
+   :caption: Table of contents
+   modules/index
+   examples/index

libadalina_core-1.0/docs/source/modules/index.rst ADDED

@@ -0,0 +1,13 @@
+API Reference
+=============
+This section provides detailed API documentation for all modules in the libadalina-core library.
+.. toctree::
+   :maxdepth: 2
+   readers
+   writers
+   spatial_join
+   sedona_configuration
+   sedona_utils

libadalina_core-1.0/docs/source/modules/readers.rst ADDED

@@ -0,0 +1,9 @@
+Readers
+=======
+The readers module provides functions for reading geospatial data from various sources.
+.. automodule:: libadalina_core.readers.readers
+   :members:
+   :undoc-members:
+   :show-inheritance:

libadalina_core-1.0/docs/source/modules/sedona_configuration.rst ADDED

@@ -0,0 +1,17 @@
+Sedona Configuration
+====================
+The sedona_configuration module provides functions for configuring and initializing Apache Sedona.
+.. automodule:: libadalina_core.sedona_configuration.sedona_configuration
+   :members:
+   :undoc-members:
+   :show-inheritance:
+JDK Installer
+-------------
+.. automodule:: libadalina_core.sedona_configuration.jdk_installer
+   :members:
+   :undoc-members:
+   :show-inheritance:

libadalina_core-1.0/docs/source/modules/sedona_utils.rst ADDED

@@ -0,0 +1,20 @@
+Sedona Utilities
+================
+The sedona_utils module provides utility functions for working with Apache Sedona.
+Utilities
+---------
+.. automodule:: libadalina_core.sedona_utils.utils
+   :members:
+   :undoc-members:
+   :show-inheritance:
+Coordinate Formats
+------------------
+.. automodule:: libadalina_core.sedona_utils.coordinate_formats
+   :members:
+   :undoc-members:
+   :show-inheritance:

libadalina_core-1.0/docs/source/modules/spatial_join.rst ADDED

@@ -0,0 +1,9 @@
+Spatial Join
+============
+The spatial_join module provides functions for performing spatial joins and aggregations on geospatial data.
+.. automodule:: libadalina_core.spatial_join.query_builder
+   :members:
+   :undoc-members:
+   :show-inheritance:

libadalina_core-1.0/docs/source/modules/writers.rst ADDED

@@ -0,0 +1,9 @@
+Writers
+=======
+The writers module provides functions for writing geospatial data to various formats.
+.. automodule:: libadalina_core.writers.writers
+   :members:
+   :undoc-members:
+   :show-inheritance:

libadalina_core-1.0/libadalina_core/__init__.py ADDED

File without changes

libadalina_core-1.0/libadalina_core/examples/example_hospitals_in_provinces.py ADDED

@@ -0,0 +1,46 @@
+from libadalina_core.readers.readers import geopackage_to_dataframe
+import pathlib
+import pandas as pd
+from libadalina_core.spatial_join.query_builder import spatial_join, JoinType, spatial_aggregation, AggregationType, \
+    AggregationFunction
+if __name__ == "__main__":
+    """Example of how to use libadalina to find hospitals in specific provinces in Italy and aggregate their data."""
+    # Set pandas display options
+    pd.set_option('display.max_columns', None)
+    pd.set_option('display.width', None)
+    pd.set_option('display.max_colwidth', 100)
+    hospitals = geopackage_to_dataframe(
+        str(pathlib.Path(__file__).parent.parent.parent / "tests" / "samples" / "healthcare" / "EU_healthcare.gpkg"),
+        "EU"
+    )[["hospital_name", "geometry", "city", "cap_beds"]]
+    regions = geopackage_to_dataframe(
+        str(pathlib.Path(__file__).parent.parent.parent / "tests" / "samples" / "regions" / "NUTS_RG_20M_2024_4326.gpkg"),
+        "NUTS_RG_20M_2024_4326.gpkg"
+    )[["LEVL_CODE", "NUTS_NAME", "CNTR_CODE", "geometry"]]
+    # select the provinces of Milan and Cremona
+    filtered_regions = regions[
+        (regions['LEVL_CODE'] == 3) &
+        (regions['CNTR_CODE'] == "IT") &
+        (regions['NUTS_NAME'].str.contains('Milano|Cremona', case=False))
+    ]
+    # join with hospitals table to get hospitals in these provinces
+    result = (spatial_join(filtered_regions, hospitals, join_type=JoinType.LEFT)
+              # join operator renames the geometries adding suffixes _left and _right to avoid conflicts
+              .withColumnRenamed('geometry_left', 'geometry'))
+    result.show(truncate=False)
+    # get the number of hospitals in each province along with the total and average number of beds
+    result = spatial_aggregation(result, aggregate_functions=[
+        AggregationFunction("hospital_name", AggregationType.COUNT, 'hospitals'),
+        AggregationFunction("cap_beds", AggregationType.SUM, 'total_beds'),
+        AggregationFunction("cap_beds", AggregationType.AVG, 'average_beds'),
+    ])
+    result.show(truncate=False)
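
The province filter in the example combines three conditions: NUTS level 3, country code `IT`, and a case-insensitive regular-expression match on the name. A plain-Python sketch of the same predicate is shown here so the selection logic can be checked without pandas or Spark. The sample rows and the `is_selected` helper are hypothetical; only the column names and the pattern come from the example.

```python
import re

# Same alternation as in the example, matched case-insensitively.
PROVINCE_PATTERN = re.compile(r"Milano|Cremona", re.IGNORECASE)


def is_selected(region: dict) -> bool:
    """Mirror the three filter conditions applied to the regions table."""
    return (
        region["LEVL_CODE"] == 3
        and region["CNTR_CODE"] == "IT"
        and PROVINCE_PATTERN.search(region["NUTS_NAME"]) is not None
    )


# Hypothetical rows using the column names from the example.
regions = [
    {"LEVL_CODE": 3, "CNTR_CODE": "IT", "NUTS_NAME": "Milano"},
    {"LEVL_CODE": 3, "CNTR_CODE": "IT", "NUTS_NAME": "Cremona"},
    {"LEVL_CODE": 3, "CNTR_CODE": "IT", "NUTS_NAME": "Bergamo"},
    {"LEVL_CODE": 2, "CNTR_CODE": "IT", "NUTS_NAME": "Lombardia"},
    {"LEVL_CODE": 3, "CNTR_CODE": "FR", "NUTS_NAME": "Paris"},
]
selected = [r["NUTS_NAME"] for r in regions if is_selected(r)]
print(selected)
```

Because the pattern is searched rather than fully matched, any name that merely contains "Milano" or "Cremona" would also be selected.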

libadalina_core-1.0/libadalina_core/examples/example_population_in_provinces.py ADDED

@@ -0,0 +1,42 @@
+from libadalina_core.readers.readers import geopackage_to_dataframe
+import pathlib
+import pandas as pd
+from libadalina_core.spatial_join.query_builder import spatial_join, JoinType, spatial_aggregation, AggregationType, \
+    AggregationFunction
+if __name__ == "__main__":
+    """Example of how to use libadalina to find the total amount of the population living in specific provinces in Italy."""
+    # Set pandas display options
+    pd.set_option('display.max_columns', None)
+    pd.set_option('display.width', None)
+    pd.set_option('display.max_colwidth', 100)
+    population = geopackage_to_dataframe(
+        str(pathlib.Path(__file__).parent.parent.parent / "tests" / "samples" / "population-north-italy" / "nord-italia.gpkg"),
+        "census2021"
+    )[['T', 'geometry']]
+    regions = geopackage_to_dataframe(
+        str(pathlib.Path(__file__).parent.parent.parent / "tests" / "samples" / "regions" / "NUTS_RG_20M_2024_4326.gpkg"),
+        "NUTS_RG_20M_2024_4326.gpkg"
+    )[["LEVL_CODE", "NUTS_NAME", "CNTR_CODE", "geometry"]]
+    # select the provinces of Milan and Cremona
+    filtered_regions = regions[
+        (regions['LEVL_CODE'] == 3) &
+        (regions['CNTR_CODE'] == "IT") &
+        (regions['NUTS_NAME'].str.contains('Milano|Cremona', case=False))
+    ]
+    # join with population table to get the population of these provinces
+    result = spatial_aggregation(
+        spatial_join(filtered_regions, population, join_type=JoinType.LEFT)
+              # join operator renames the geometries adding suffixes _left and _right to avoid conflicts
+              .withColumnRenamed('geometry_left', 'geometry'),
+        aggregate_functions=[
+            AggregationFunction("T", AggregationType.SUM, 'population', proportional='geometry_right'),
+    ])
+    result.show(truncate=False)