vector2dggs 0.6.3__tar.gz → 0.9.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: vector2dggs
3
- Version: 0.6.3
3
+ Version: 0.9.0
4
4
  Summary: CLI DGGS indexer for vector geospatial data
5
5
  Home-page: https://github.com/manaakiwhenua/vector2dggs
6
6
  License: LGPL-3.0-or-later
@@ -29,8 +29,10 @@ Requires-Dist: pillow (>=11.2.1,<12.0.0)
29
29
  Requires-Dist: psycopg2 (>=2.9.9,<3.0.0)
30
30
  Requires-Dist: pyarrow (>=20.0,<21.0)
31
31
  Requires-Dist: pyproj (>=3.7,<4.0)
32
- Requires-Dist: rhealpixdggs (>=0.5.5,<0.6.0)
33
- Requires-Dist: rhppandas (>=0.1.2,<0.2.0)
32
+ Requires-Dist: python-geohash (>=0.8.5,<0.9.0)
33
+ Requires-Dist: rhppandas (>=0.2.0,<0.3.0)
34
+ Requires-Dist: rusty-polygon-geohasher (>=0.2.3,<0.3.0)
35
+ Requires-Dist: s2geometry (>=0.9.0,<0.10.0)
34
36
  Requires-Dist: shapely (>=2.1,<3.0)
35
37
  Requires-Dist: sqlalchemy (>=2.0.32,<3.0.0)
36
38
  Requires-Dist: tqdm (>=4.67,<5.0)
@@ -47,8 +49,13 @@ This is the vector equivalent of [raster2dggs](https://github.com/manaakiwhenua/
47
49
 
48
50
  Currently this tool supports the following DGGSs:
49
51
 
50
- - H3 (polygons, linestrings)
51
- - rHEALPix (polygons)
52
+ - [H3](https://h3geo.org/)
53
+ - [rHEALPix](https://datastore.landcareresearch.co.nz/dataset/rhealpix-discrete-global-grid-system)
54
+ - [S2](https://s2geometry.io/)
55
+
56
+ ... and the following geocode systems:
57
+
58
+ - [Geohash](https://en.wikipedia.org/wiki/Geohash) (points, polygons)
52
59
 
53
60
  Contributions (especially for other DGGSs), suggestions, bug reports and strongly worded letters are all welcome.
54
61
 
@@ -63,7 +70,8 @@ pip install vector2dggs
63
70
  ## Usage
64
71
 
65
72
  ```bash
66
- vector2dggs --help [11:22:14]
73
+ vector2dggs --help
74
+
67
75
  Usage: vector2dggs [OPTIONS] COMMAND [ARGS]...
68
76
 
69
77
  Options:
@@ -71,8 +79,10 @@ Options:
71
79
  --help Show this message and exit.
72
80
 
73
81
  Commands:
74
- h3 Ingest a vector dataset and index it to the H3 DGGS.
75
- rhp Ingest a vector dataset and index it to the rHEALPix DGGS.
82
+ geohash Ingest a vector dataset and index it using the Geohash geocode...
83
+ h3 Ingest a vector dataset and index it to the H3 DGGS.
84
+ rhp Ingest a vector dataset and index it to the rHEALPix DGGS.
85
+ s2 Ingest a vector dataset and index it to the S2 DGGS.
76
86
  ```
77
87
 
78
88
  ```bash
@@ -119,8 +129,8 @@ Options:
119
129
  [default: 5000; required]
120
130
  -t, --threads INTEGER Number of threads used for operation
121
131
  [default: 7]
122
- -tbl, --table TEXT Name of the table to read when using a
123
- spatial database connection as input
132
+ -lyr, --layer TEXT Name of the layer or table to read when using
133
+ an input that supports layers or tables
124
134
  -g, --geom_col TEXT Column name to use when using a spatial
125
135
  database connection as input [default:
126
136
  geom]
@@ -137,9 +147,9 @@ Options:
137
147
 
138
148
  Output is in the Apache Parquet format, a directory with one file per partition.
139
149
 
140
- For a quick view of your output, you can read Apache Parquet with pandas, and then use h3-pandas and geopandas to convert this into a GeoPackage or GeoParquet for visualisation in a desktop GIS, such as QGIS. The Apache Parquet output is indexed by an ID column (which you can specify), so it should be ready for two intended use-cases:
150
+ For a quick view of your output, you can read Apache Parquet with pandas, and then use tools like h3-pandas and geopandas to convert this into a GeoPackage or GeoParquet for visualisation in a desktop GIS, such as QGIS. The Apache Parquet output is indexed by an ID column (which you can specify), so it should be ready for two intended use-cases:
141
151
  - Joining attribute data from the original feature-level data onto computed DGGS cells.
142
- - Joining other data to this output on the H3 cell ID. (The output has a column like `h3_\d{2}`, e.g. `h3_09` or `h3_12` according to the target resolution.)
152
+ - Joining other data to this output on the DGGS cell ID. (The output has a column like `{dggs}_\d{2}`, e.g. `h3_09` or `h3_12` according to the target resolution, zero-padded to account for the maximum resolution of the DGGS.)
143
153
 
144
154
  Geoparquet output (hexagon boundaries):
145
155
 
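The zero-padded cell-ID column naming described above can be sketched in a couple of lines. This is a minimal illustration (`make_cell_column` is a hypothetical helper name), but the `{dggs}_{resolution:02}` pattern matches the format string used in the package source later in this diff:

```python
# Build the DGGS cell-ID column name used in the Parquet output.
# Resolutions are zero-padded to two digits so that lexical sort
# agrees with numeric sort (e.g. "h3_09" sorts before "h3_12").
def make_cell_column(dggs: str, resolution: int) -> str:
    return f"{dggs}_{resolution:02}"
```

For example, `make_cell_column("h3", 9)` gives `"h3_09"` and `make_cell_column("s2", 18)` gives `"s2_18"`.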
@@ -166,6 +176,34 @@ h3_12
166
176
  >>> g.to_parquet('./output-data/parcels.12.geo.parquet')
167
177
  ```
168
178
 
179
+ An example for S2 output (using `s2sphere`):
180
+
181
+
182
+ ```python
183
+ import pandas as pd
184
+ import geopandas as gpd
185
+ import s2sphere
186
+ from shapely.geometry import Polygon
187
+
188
+ RES = 18
189
+ df = pd.read_parquet(f'~/output-data/ponds-with-holes.s2.{RES}.pq')
190
+ df = df.reset_index()
191
+
192
+ def s2id_to_polygon(s2_id_hex):
193
+ cell_id = s2sphere.CellId.from_token(s2_id_hex)
194
+ cell = s2sphere.Cell(cell_id)
195
+ vertices = []
196
+ for i in range(4):
197
+ vertex = cell.get_vertex(i)
198
+ lat_lng = s2sphere.LatLng.from_point(vertex)
199
+ vertices.append((lat_lng.lng().degrees, lat_lng.lat().degrees)) # (lon, lat)
200
+ return Polygon(vertices)
201
+
202
+ df['geometry'] = df[f's2_{RES}'].apply(s2id_to_polygon)
203
+ df = gpd.GeoDataFrame(df, geometry='geometry', crs='EPSG:4326') # WGS84
204
+ df.to_parquet(f'sample-{RES}.parquet')
205
+ ```
206
+
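For Geohash output there is no cell-geometry call analogous to `s2sphere`; a cell's footprint is simply the bounding box that its base-32 string encodes. The following is a self-contained sketch of the standard decoding algorithm (not the `python-geohash` API), which could replace `s2id_to_polygon` in an equivalent Geohash visualisation script:

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_bbox(gh: str):
    """Decode a geohash into its (south, north, west, east) bounding box."""
    lat, lon = [-90.0, 90.0], [-180.0, 180.0]
    is_lon = True  # geohash bits alternate between lon and lat, starting with lon
    for char in gh:
        bits = _BASE32.index(char)
        for mask in (16, 8, 4, 2, 1):
            interval = lon if is_lon else lat
            mid = (interval[0] + interval[1]) / 2
            # A set bit selects the upper half of the interval
            interval[0 if bits & mask else 1] = mid
            is_lon = not is_lon
    return lat[0], lat[1], lon[0], lon[1]

# The classic example cell "ezs42" is centred near (42.605, -5.603);
# shapely's box(west, south, east, north) would turn this into a Polygon.
south, north, west, east = geohash_bbox("ezs42")
```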
169
207
  ### For development
170
208
 
171
209
  In brief, to get started:
@@ -175,6 +213,7 @@ In brief, to get started:
175
213
  - If you're on Windows, `pip install gdal` may be necessary before running the subsequent commands.
176
214
  - On Linux, install GDAL 3.8+ according to your platform-specific instructions, including development headers, i.e. `libgdal-dev`.
177
215
  - Create the virtual environment with `poetry init`. This will install necessary dependencies.
216
+ - If the installation of `s2geometry` fails, you may need SWIG to build it (e.g. `conda install swig` or `sudo dnf install swig`, depending on your platform).
178
217
  - Subsequently, the virtual environment can be re-activated with `poetry shell`.
179
218
 
180
219
  If you run `poetry install`, the CLI tool will be aliased so you can simply use `vector2dggs` rather than `poetry run vector2dggs`, which is the alternative if you do not `poetry install`.
@@ -199,7 +238,7 @@ vector2dggs h3 -v DEBUG -id title_no -r 12 -o ~/Downloads/nz-property-titles.gpk
199
238
  With a PostgreSQL/PostGIS connection:
200
239
 
201
240
  ```bash
202
- vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -tbl topo50_lake postgresql://user:password@host:port/db ./topo50_lake.parquet
241
+ vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -lyr topo50_lake postgresql://user:password@host:port/db ./topo50_lake.parquet
203
242
  ```
204
243
 
205
244
  ## Citation
@@ -209,14 +248,14 @@ vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -tbl topo50_lake
209
248
  title={{vector2dggs}},
210
249
  author={Ardo, James and Law, Richard},
211
250
  url={https://github.com/manaakiwhenua/vector2dggs},
212
- version={0.6.3},
251
+ version={0.9.0},
213
252
  date={2023-04-20}
214
253
  }
215
254
  ```
216
255
 
217
256
  APA/Harvard
218
257
 
219
- > Ardo, J., & Law, R. (2023). vector2dggs (0.6.3) [Computer software]. https://github.com/manaakiwhenua/vector2dggs
258
+ > Ardo, J., & Law, R. (2023). vector2dggs (0.9.0) [Computer software]. https://github.com/manaakiwhenua/vector2dggs
220
259
 
221
260
  [![manaakiwhenua-standards](https://github.com/manaakiwhenua/vector2dggs/workflows/manaakiwhenua-standards/badge.svg)](https://github.com/manaakiwhenua/manaakiwhenua-standards)
222
261
 
@@ -8,8 +8,13 @@ This is the vector equivalent of [raster2dggs](https://github.com/manaakiwhenua/
8
8
 
9
9
  Currently this tool supports the following DGGSs:
10
10
 
11
- - H3 (polygons, linestrings)
12
- - rHEALPix (polygons)
11
+ - [H3](https://h3geo.org/)
12
+ - [rHEALPix](https://datastore.landcareresearch.co.nz/dataset/rhealpix-discrete-global-grid-system)
13
+ - [S2](https://s2geometry.io/)
14
+
15
+ ... and the following geocode systems:
16
+
17
+ - [Geohash](https://en.wikipedia.org/wiki/Geohash) (points, polygons)
13
18
 
14
19
  Contributions (especially for other DGGSs), suggestions, bug reports and strongly worded letters are all welcome.
15
20
 
@@ -24,7 +29,8 @@ pip install vector2dggs
24
29
  ## Usage
25
30
 
26
31
  ```bash
27
- vector2dggs --help [11:22:14]
32
+ vector2dggs --help
33
+
28
34
  Usage: vector2dggs [OPTIONS] COMMAND [ARGS]...
29
35
 
30
36
  Options:
@@ -32,8 +38,10 @@ Options:
32
38
  --help Show this message and exit.
33
39
 
34
40
  Commands:
35
- h3 Ingest a vector dataset and index it to the H3 DGGS.
36
- rhp Ingest a vector dataset and index it to the rHEALPix DGGS.
41
+ geohash Ingest a vector dataset and index it using the Geohash geocode...
42
+ h3 Ingest a vector dataset and index it to the H3 DGGS.
43
+ rhp Ingest a vector dataset and index it to the rHEALPix DGGS.
44
+ s2 Ingest a vector dataset and index it to the S2 DGGS.
37
45
  ```
38
46
 
39
47
  ```bash
@@ -80,8 +88,8 @@ Options:
80
88
  [default: 5000; required]
81
89
  -t, --threads INTEGER Number of threads used for operation
82
90
  [default: 7]
83
- -tbl, --table TEXT Name of the table to read when using a
84
- spatial database connection as input
91
+ -lyr, --layer TEXT Name of the layer or table to read when using
92
+ an input that supports layers or tables
85
93
  -g, --geom_col TEXT Column name to use when using a spatial
86
94
  database connection as input [default:
87
95
  geom]
@@ -98,9 +106,9 @@ Options:
98
106
 
99
107
  Output is in the Apache Parquet format, a directory with one file per partition.
100
108
 
101
- For a quick view of your output, you can read Apache Parquet with pandas, and then use h3-pandas and geopandas to convert this into a GeoPackage or GeoParquet for visualisation in a desktop GIS, such as QGIS. The Apache Parquet output is indexed by an ID column (which you can specify), so it should be ready for two intended use-cases:
109
+ For a quick view of your output, you can read Apache Parquet with pandas, and then use tools like h3-pandas and geopandas to convert this into a GeoPackage or GeoParquet for visualisation in a desktop GIS, such as QGIS. The Apache Parquet output is indexed by an ID column (which you can specify), so it should be ready for two intended use-cases:
102
110
  - Joining attribute data from the original feature-level data onto computed DGGS cells.
103
- - Joining other data to this output on the H3 cell ID. (The output has a column like `h3_\d{2}`, e.g. `h3_09` or `h3_12` according to the target resolution.)
111
+ - Joining other data to this output on the DGGS cell ID. (The output has a column like `{dggs}_\d{2}`, e.g. `h3_09` or `h3_12` according to the target resolution, zero-padded to account for the maximum resolution of the DGGS.)
104
112
 
105
113
  Geoparquet output (hexagon boundaries):
106
114
 
@@ -127,6 +135,34 @@ h3_12
127
135
  >>> g.to_parquet('./output-data/parcels.12.geo.parquet')
128
136
  ```
129
137
 
138
+ An example for S2 output (using `s2sphere`):
139
+
140
+
141
+ ```python
142
+ import pandas as pd
143
+ import geopandas as gpd
144
+ import s2sphere
145
+ from shapely.geometry import Polygon
146
+
147
+ RES = 18
148
+ df = pd.read_parquet(f'~/output-data/ponds-with-holes.s2.{RES}.pq')
149
+ df = df.reset_index()
150
+
151
+ def s2id_to_polygon(s2_id_hex):
152
+ cell_id = s2sphere.CellId.from_token(s2_id_hex)
153
+ cell = s2sphere.Cell(cell_id)
154
+ vertices = []
155
+ for i in range(4):
156
+ vertex = cell.get_vertex(i)
157
+ lat_lng = s2sphere.LatLng.from_point(vertex)
158
+ vertices.append((lat_lng.lng().degrees, lat_lng.lat().degrees)) # (lon, lat)
159
+ return Polygon(vertices)
160
+
161
+ df['geometry'] = df[f's2_{RES}'].apply(s2id_to_polygon)
162
+ df = gpd.GeoDataFrame(df, geometry='geometry', crs='EPSG:4326') # WGS84
163
+ df.to_parquet(f'sample-{RES}.parquet')
164
+ ```
165
+
130
166
  ### For development
131
167
 
132
168
  In brief, to get started:
@@ -136,6 +172,7 @@ In brief, to get started:
136
172
  - If you're on Windows, `pip install gdal` may be necessary before running the subsequent commands.
137
173
  - On Linux, install GDAL 3.8+ according to your platform-specific instructions, including development headers, i.e. `libgdal-dev`.
138
174
  - Create the virtual environment with `poetry init`. This will install necessary dependencies.
175
+ - If the installation of `s2geometry` fails, you may need SWIG to build it (e.g. `conda install swig` or `sudo dnf install swig`, depending on your platform).
139
176
  - Subsequently, the virtual environment can be re-activated with `poetry shell`.
140
177
 
141
178
  If you run `poetry install`, the CLI tool will be aliased so you can simply use `vector2dggs` rather than `poetry run vector2dggs`, which is the alternative if you do not `poetry install`.
@@ -160,7 +197,7 @@ vector2dggs h3 -v DEBUG -id title_no -r 12 -o ~/Downloads/nz-property-titles.gpk
160
197
  With a PostgreSQL/PostGIS connection:
161
198
 
162
199
  ```bash
163
- vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -tbl topo50_lake postgresql://user:password@host:port/db ./topo50_lake.parquet
200
+ vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -lyr topo50_lake postgresql://user:password@host:port/db ./topo50_lake.parquet
164
201
  ```
165
202
 
166
203
  ## Citation
@@ -170,13 +207,13 @@ vector2dggs h3 -v DEBUG -id ogc_fid -r 9 -p 5 -t 4 --overwrite -tbl topo50_lake
170
207
  title={{vector2dggs}},
171
208
  author={Ardo, James and Law, Richard},
172
209
  url={https://github.com/manaakiwhenua/vector2dggs},
173
- version={0.6.3},
210
+ version={0.9.0},
174
211
  date={2023-04-20}
175
212
  }
176
213
  ```
177
214
 
178
215
  APA/Harvard
179
216
 
180
- > Ardo, J., & Law, R. (2023). vector2dggs (0.6.3) [Computer software]. https://github.com/manaakiwhenua/vector2dggs
217
+ > Ardo, J., & Law, R. (2023). vector2dggs (0.9.0) [Computer software]. https://github.com/manaakiwhenua/vector2dggs
181
218
 
182
219
  [![manaakiwhenua-standards](https://github.com/manaakiwhenua/vector2dggs/workflows/manaakiwhenua-standards/badge.svg)](https://github.com/manaakiwhenua/manaakiwhenua-standards)
@@ -1,6 +1,6 @@
1
1
  [tool.poetry]
2
2
  name = "vector2dggs"
3
- version = "0.6.3"
3
+ version = "0.9.0"
4
4
  description = "CLI DGGS indexer for vector geospatial data"
5
5
  authors = ["James Ardo <ardoj@landcareresearch.co.nz>"]
6
6
  maintainers = ["Richard Law <lawr@landcareresearch.co.nz>"]
@@ -30,9 +30,11 @@ sqlalchemy = "^2.0.32"
30
30
  psycopg2 = "^2.9.9"
31
31
  shapely = "^2.1"
32
32
  numpy = "^2"
33
- rhppandas = "^0.1.2"
34
- rhealpixdggs = "^0.5.5"
33
+ rhppandas = "^0.2.0"
35
34
  pillow = "^11.2.1"
35
+ s2geometry = "^0.9.0"
36
+ rusty-polygon-geohasher = "^0.2.3"
37
+ python-geohash = "^0.8.5"
36
38
 
37
39
  [tool.poetry.group.dev.dependencies]
38
40
  pytest = "^7.2.2"
@@ -0,0 +1 @@
1
+ __version__: str = "0.9.0"
@@ -3,6 +3,8 @@ import click
3
3
  from vector2dggs import __version__
4
4
  from vector2dggs.h3 import h3
5
5
  from vector2dggs.rHP import rhp
6
+ from vector2dggs.s2 import s2
7
+ from vector2dggs.geohash import geohash
6
8
 
7
9
  # If the program does terminal interaction, make it output a short
8
10
  # notice like this when it starts in an interactive mode:
@@ -21,6 +23,8 @@ def cli():
21
23
 
22
24
  cli.add_command(h3)
23
25
  cli.add_command(rhp)
26
+ cli.add_command(s2)
27
+ cli.add_command(geohash)
24
28
 
25
29
 
26
30
  def main():
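The new subcommands are wired in via click's standard group pattern (`cli.add_command(s2)`, `cli.add_command(geohash)`). The same registration shape can be illustrated with stdlib `argparse` (a hedged analogue for illustration only; the package itself uses click):

```python
import argparse

# Each DGGS/geocode indexer registers as its own subcommand,
# mirroring the click cli.add_command(...) calls in the diff above.
parser = argparse.ArgumentParser(prog="vector2dggs")
subparsers = parser.add_subparsers(dest="command", required=True)
for name in ("h3", "rhp", "s2", "geohash"):
    sub = subparsers.add_parser(name, help=f"Index a vector dataset to {name}.")
    sub.add_argument("-r", "--resolution", type=int, required=True)

args = parser.parse_args(["s2", "-r", "18"])
```

Dispatch then branches on `args.command`, just as click routes to the registered command function.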
@@ -104,13 +104,15 @@ def drop_condition(
104
104
  _diff = _before - _after
105
105
  if _diff:
106
106
  log_method = (
107
- LOGGER.info if (_diff / float(_before)) < warning_threshold else LOGGER.warn
107
+ LOGGER.info
108
+ if (_diff / float(_before)) < warning_threshold
109
+ else LOGGER.warning
108
110
  )
109
111
  log_method(f"Dropped {_diff} rows ({_diff/float(_before)*100:.2f}%)")
110
112
  return df
111
113
 
112
114
 
113
- def get_parent_res(dggs: str, parent_res: Union[None, int], resolution: int):
115
+ def get_parent_res(dggs: str, parent_res: Union[None, str], resolution: int) -> int:
114
116
  """
115
117
  Uses a parent resolution,
116
118
  OR,
@@ -118,22 +120,17 @@ def get_parent_res(dggs: str, parent_res: Union[None, int], resolution: int):
118
120
 
119
121
  Used for intermediate re-partitioning.
120
122
  """
121
- if dggs == "h3":
122
- return (
123
- parent_res
124
- if parent_res is not None
125
- else max(const.MIN_H3, (resolution - const.DEFAULT_PARENT_OFFSET))
126
- )
127
- elif dggs == "rhp":
128
- return (
129
- parent_res
130
- if parent_res is not None
131
- else max(const.MIN_RHP, (resolution - const.DEFAULT_PARENT_OFFSET))
132
- )
133
- else:
123
+ if dggs not in const.DEFAULT_DGGS_PARENT_RES:
134
124
  raise RuntimeError(
135
- "Unknown dggs {dggs}) - must be one of [ 'h3', 'rhp' ]".format(dggs=dggs)
125
+ "Unknown dggs {dggs}) - must be one of [ {options} ]".format(
126
+ dggs=dggs, options=", ".join(const.DEFAULT_DGGS_PARENT_RES.keys())
127
+ )
136
128
  )
129
+ return (
130
+ int(parent_res)
131
+ if parent_res is not None
132
+ else const.DEFAULT_DGGS_PARENT_RES[dggs](resolution)
133
+ )
137
134
 
138
135
 
139
136
  def parent_partitioning(
@@ -141,10 +138,9 @@ def parent_partitioning(
141
138
  input_dir: Path,
142
139
  output_dir: Path,
143
140
  resolution: int,
144
- parent_res: Union[None, int],
141
+ parent_res: int,
145
142
  **kwargs,
146
143
  ) -> None:
147
- parent_res: int = get_parent_res(dggs, parent_res, resolution)
148
144
  partition_col = f"{dggs}_{parent_res:02}"
149
145
 
150
146
  with TqdmCallback(desc="Repartitioning"):
@@ -174,30 +170,29 @@ def polyfill(
174
170
  pq_in: Path,
175
171
  spatial_sort_col: str,
176
172
  resolution: int,
177
- parent_res: Union[None, int],
173
+ parent_res: int,
178
174
  output_directory: str,
179
175
  ) -> None:
180
176
  """
181
177
  Reads a geoparquet, performs polyfilling (for Polygon),
182
- linetracing (for LineString), and writes out to parquet.
178
+ linetracing (for LineString), or indexing (for Point),
179
+ and writes out to parquet.
183
180
  """
184
- df = gpd.read_parquet(pq_in).reset_index().drop(columns=[spatial_sort_col])
181
+ df = gpd.read_parquet(pq_in).reset_index()
182
+ if spatial_sort_col != "none":
183
+ df = df.drop(columns=[spatial_sort_col])
185
184
  if len(df.index) == 0:
186
- # Input is empty, nothing to polyfill
185
+ # Input is empty, nothing to convert
187
186
  return None
188
187
 
189
- # DGGS specific polyfill
188
+ # DGGS specific conversion
190
189
  df = dggsfunc(df, resolution)
191
190
 
192
191
  if len(df.index) == 0:
193
- # Polyfill resulted in empty output (e.g. large cell, small feature)
192
+ # Conversion resulted in empty output (e.g. large cell, small feature)
194
193
  return None
195
194
 
196
195
  df.index.rename(f"{dggs}_{resolution:02}", inplace=True)
197
- parent_res: int = get_parent_res(dggs, parent_res, resolution)
198
- # print(parent_res)
199
- # print(df.index)
200
- # print(df.columns)
201
196
 
202
197
  # Secondary (parent) index, used later for partitioning
203
198
  df = secondary_index_func(df, parent_res)
@@ -228,22 +223,23 @@ def index(
228
223
  id_field: str = None,
229
224
  cut_crs: pyproj.CRS = None,
230
225
  con: SQLConnectionType = None,
231
- table: str = None,
226
+ layer: str = None,
232
227
  geom_col: str = "geom",
233
228
  overwrite: bool = False,
234
229
  ) -> Path:
235
230
  """
236
- Performs multi-threaded polyfilling on (multi)polygons.
231
+ Performs multi-threaded DGGS indexing on geometries (including multipart and collections).
237
232
  """
233
+ parent_res = get_parent_res(dggs, parent_res, resolution)
238
234
 
239
- if table and con:
235
+ if layer and con:
240
236
  # Database connection
241
237
  if keep_attributes:
242
- q = sqlalchemy.text(f"SELECT * FROM {table}")
238
+ q = sqlalchemy.text(f"SELECT * FROM {layer}")
243
239
  elif id_field and not keep_attributes:
244
- q = sqlalchemy.text(f"SELECT {id_field}, {geom_col} FROM {table}")
240
+ q = sqlalchemy.text(f"SELECT {id_field}, {geom_col} FROM {layer}")
245
241
  else:
246
- q = sqlalchemy.text(f"SELECT {geom_col} FROM {table}")
242
+ q = sqlalchemy.text(f"SELECT {geom_col} FROM {layer}")
247
243
  df = gpd.read_postgis(q, con.connect(), geom_col=geom_col).rename_geometry(
248
244
  "geometry"
249
245
  )
@@ -291,7 +287,8 @@ def index(
291
287
  "index": lambda frame: frame[
292
288
  (frame.geometry.geom_type != "Polygon")
293
289
  & (frame.geometry.geom_type != "LineString")
294
- ], # NB currently points and other types are lost; in principle, these could be indexed
290
+ & (frame.geometry.geom_type != "Point")
291
+ ],
295
292
  "message": "Considering unsupported geometries",
296
293
  },
297
294
  ]
@@ -300,11 +297,12 @@ def index(
300
297
 
301
298
  ddf = dgpd.from_geopandas(df, chunksize=max(1, chunksize), sort=True)
302
299
 
303
- LOGGER.debug("Spatially sorting and partitioning (%s)", spatial_sorting)
304
- ddf = ddf.spatial_shuffle(by=spatial_sorting)
300
+ if spatial_sorting != "none":
301
+ LOGGER.debug("Spatially sorting and partitioning (%s)", spatial_sorting)
302
+ ddf = ddf.spatial_shuffle(by=spatial_sorting)
305
303
  spatial_sort_col = (
306
304
  spatial_sorting
307
- if spatial_sorting == "geohash"
305
+ if (spatial_sorting == "geohash" or spatial_sorting == "none")
308
306
  else f"{spatial_sorting}_distance"
309
307
  )
310
308
 
@@ -314,9 +312,9 @@ def index(
314
312
 
315
313
  filepaths = list(map(lambda f: f.absolute(), Path(tmpdir).glob("*")))
316
314
 
317
- # Multithreaded polyfilling
315
+ # Multithreaded DGGS indexing
318
316
  LOGGER.debug(
319
- "Indexing on spatial partitions by polyfill with resolution: %d",
317
+ "DGGS indexing by spatial partitions with resolution: %d",
320
318
  resolution,
321
319
  )
322
320
  with tempfile.TemporaryDirectory(suffix=".parquet") as tmpdir2:
@@ -344,7 +342,7 @@ def index(
344
342
 
345
343
  parent_partitioning(
346
344
  dggs,
347
- tmpdir2,
345
+ Path(tmpdir2),
348
346
  output_directory,
349
347
  resolution,
350
348
  parent_res,
@@ -0,0 +1,75 @@
1
+ import multiprocessing
2
+ import warnings
3
+ import tempfile
4
+
5
+
6
+ MIN_H3, MAX_H3 = 0, 15
7
+ MIN_RHP, MAX_RHP = 0, 15
8
+ MIN_S2, MAX_S2 = 0, 30
9
+ MIN_GEOHASH, MAX_GEOHASH = 1, 12
10
+
11
+ DEFAULTS = {
12
+ "id": None,
13
+ "k": False,
14
+ "ch": 50,
15
+ "s": "none",
16
+ "crs": None,
17
+ "c": 5000,
18
+ "t": (multiprocessing.cpu_count() - 1),
19
+ "lyr": None,
20
+ "g": "geom",
21
+ "tempdir": tempfile.tempdir,
22
+ }
23
+
24
+ SPATIAL_SORTING_METHODS = ["hilbert", "morton", "geohash", "none"]
25
+
26
+ DEFAULT_DGGS_PARENT_RES = {
27
+ "h3": lambda resolution: max(MIN_H3, (resolution - DEFAULT_PARENT_OFFSET)),
28
+ "rhp": lambda resolution: max(MIN_RHP, (resolution - DEFAULT_PARENT_OFFSET)),
29
+ "geohash": lambda resolution: max(
30
+ MIN_GEOHASH, (resolution - DEFAULT_PARENT_OFFSET)
31
+ ),
32
+ "s2": lambda resolution: max(MIN_S2, (resolution - DEFAULT_PARENT_OFFSET)),
33
+ }
34
+
35
+ DEFAULT_PARENT_OFFSET = 6
36
+
37
+ # http://s2geometry.io/resources/s2cell_statistics.html
38
+ S2_CELLS_MAX_AREA_M2_BY_LEVEL = {
39
+ 0: 85011012.19 * 1e6,
40
+ 1: 21252753.05 * 1e6,
41
+ 2: 6026521.16 * 1e6,
42
+ 3: 1646455.50 * 1e6,
43
+ 4: 413918.15 * 1e6,
44
+ 5: 104297.91 * 1e6,
45
+ 6: 26113.30 * 1e6,
46
+ 7: 6529.09 * 1e6,
47
+ 8: 1632.45 * 1e6,
48
+ 9: 408.12 * 1e6,
49
+ 10: 102.03 * 1e6,
50
+ 11: 25.51 * 1e6,
51
+ 12: 6.38 * 1e6,
52
+ 13: 1.59 * 1e6,
53
+ 14: 0.40 * 1e6,
54
+ 15: 99638.93,
55
+ 16: 24909.73,
56
+ 17: 6227.43,
57
+ 18: 1556.86,
58
+ 19: 389.22,
59
+ 20: 97.30,
60
+ 21: 24.33,
61
+ 22: 6.08,
62
+ 23: 1.52,
63
+ 24: 0.38,
64
+ 25: 950.23 * 1e-4,
65
+ 26: 237.56 * 1e-4,
66
+ 27: 59.39 * 1e-4,
67
+ 28: 14.85 * 1e-4,
68
+ 29: 3.71 * 1e-4,
69
+ 30: 0.93 * 1e-4,
70
+ }
71
+
72
+
73
+ warnings.filterwarnings(
74
+ "ignore"
75
+ ) # Suppress polyfill warnings emitted when rows fail to index at a given resolution; comment this out to surface the missing rows
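The `S2_CELLS_MAX_AREA_M2_BY_LEVEL` table maps each S2 level to the area (m²) of its largest cell. One plausible use is picking the coarsest level whose cells never exceed a target area budget. The helper below is illustrative only (not something the package exposes), and the table is abridged to a few levels from the constants above:

```python
# Abridged from S2_CELLS_MAX_AREA_M2_BY_LEVEL above (level -> max cell area, m2)
S2_CELLS_MAX_AREA_M2_BY_LEVEL = {
    16: 24909.73,
    17: 6227.43,
    18: 1556.86,
    19: 389.22,
    20: 97.30,
}

def coarsest_level_under(max_area_m2: float) -> int:
    """Coarsest (lowest-numbered) level whose largest cell fits the budget."""
    for level in sorted(S2_CELLS_MAX_AREA_M2_BY_LEVEL):
        if S2_CELLS_MAX_AREA_M2_BY_LEVEL[level] <= max_area_m2:
            return level
    raise ValueError("No listed level is fine enough for that area budget")
```

For instance, a 2000 m² budget selects level 18 (max cell ~1557 m²), since level 17 cells can reach ~6227 m².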