terrana 0.2.0__tar.gz

Files changed (49)
  1. terrana-0.2.0/.editorconfig +22 -0
  2. terrana-0.2.0/.gitignore +7 -0
  3. terrana-0.2.0/.zenodo.json +32 -0
  4. terrana-0.2.0/CHANGELOG.md +80 -0
  5. terrana-0.2.0/CITATION.cff +30 -0
  6. terrana-0.2.0/CONTRIBUTING.md +124 -0
  7. terrana-0.2.0/Cargo.lock +3879 -0
  8. terrana-0.2.0/Cargo.toml +68 -0
  9. terrana-0.2.0/LICENSE-APACHE +201 -0
  10. terrana-0.2.0/LICENSE-MIT +21 -0
  11. terrana-0.2.0/PKG-INFO +10 -0
  12. terrana-0.2.0/README.md +362 -0
  13. terrana-0.2.0/SECURITY.md +47 -0
  14. terrana-0.2.0/pyproject.toml +22 -0
  15. terrana-0.2.0/python/.gitignore +9 -0
  16. terrana-0.2.0/python/Cargo.lock +3702 -0
  17. terrana-0.2.0/python/Cargo.toml +30 -0
  18. terrana-0.2.0/python/src/lib.rs +465 -0
  19. terrana-0.2.0/python/tests/test_terrana.py +153 -0
  20. terrana-0.2.0/rust-toolchain.toml +3 -0
  21. terrana-0.2.0/src/cli.rs +48 -0
  22. terrana-0.2.0/src/config.rs +18 -0
  23. terrana-0.2.0/src/db/loader.rs +390 -0
  24. terrana-0.2.0/src/db/mod.rs +126 -0
  25. terrana-0.2.0/src/db/query.rs +395 -0
  26. terrana-0.2.0/src/error.rs +43 -0
  27. terrana-0.2.0/src/geometry/area.rs +50 -0
  28. terrana-0.2.0/src/geometry/buffer.rs +58 -0
  29. terrana-0.2.0/src/geometry/dissolve.rs +62 -0
  30. terrana-0.2.0/src/geometry/hull.rs +53 -0
  31. terrana-0.2.0/src/geometry/measure.rs +117 -0
  32. terrana-0.2.0/src/geometry/mod.rs +6 -0
  33. terrana-0.2.0/src/geometry/simplify.rs +56 -0
  34. terrana-0.2.0/src/handlers/geometry.rs +435 -0
  35. terrana-0.2.0/src/handlers/meta.rs +57 -0
  36. terrana-0.2.0/src/handlers/mod.rs +4 -0
  37. terrana-0.2.0/src/handlers/query.rs +249 -0
  38. terrana-0.2.0/src/handlers/within.rs +110 -0
  39. terrana-0.2.0/src/lib.rs +54 -0
  40. terrana-0.2.0/src/main.rs +260 -0
  41. terrana-0.2.0/src/output/csv_out.rs +49 -0
  42. terrana-0.2.0/src/output/geojson_out.rs +46 -0
  43. terrana-0.2.0/src/output/json_out.rs +10 -0
  44. terrana-0.2.0/src/output/mod.rs +24 -0
  45. terrana-0.2.0/src/server/middleware.rs +1 -0
  46. terrana-0.2.0/src/server/mod.rs +202 -0
  47. terrana-0.2.0/testdata/observations.csv +21 -0
  48. terrana-0.2.0/testdata/parks.geojson +33 -0
  49. terrana-0.2.0/tests/api.rs +365 -0
@@ -0,0 +1,22 @@
1
+ root = true
2
+
3
+ [*]
4
+ charset = utf-8
5
+ end_of_line = lf
6
+ insert_final_newline = true
7
+ trim_trailing_whitespace = true
8
+ indent_style = space
9
+ indent_size = 4
10
+
11
+ [*.rs]
12
+ indent_size = 4
13
+ max_line_length = 100
14
+
15
+ [*.{md,yml,yaml,toml,json}]
16
+ indent_size = 2
17
+
18
+ [*.py]
19
+ indent_size = 4
20
+
21
+ [Makefile]
22
+ indent_style = tab
@@ -0,0 +1,7 @@
1
+ /target
2
+ # Cargo.lock is committed: terrana ships a binary, so a locked dependency set gives
3
+ # reproducible builds. (Library crates typically gitignore it instead.)
4
+ testdata/bench_*.csv
5
+ testdata/bench_*.csv.gz
6
+ .claude/settings.local.json
7
+ /python/tests/__pycache__
@@ -0,0 +1,32 @@
1
+ {
2
+ "title": "Terrana: Zero-Config Spatial API Server",
3
+ "description": "<p>Terrana is a zero-config spatial API server written in Rust. Point it at a CSV, Parquet, or GeoJSON file containing lat/lon columns and immediately get a REST API with spatial queries (radius, bounding box, nearest neighbor, point-in-polygon) and geodesic geometry operations (area, convex hull, buffer, dissolve, simplify, distance) &mdash; no database setup, no PostGIS, no infrastructure. Built on DuckDB (bundled) with R-tree spatial indexing and the <code>geo</code> crate for WGS&nbsp;84 geodesic computations.</p>",
4
+ "upload_type": "software",
5
+ "license": "MIT",
6
+ "access_right": "open",
7
+ "creators": [
8
+ {
9
+ "name": "McMeen, John"
10
+ }
11
+ ],
12
+ "keywords": [
13
+ "spatial",
14
+ "geospatial",
15
+ "gis",
16
+ "rest-api",
17
+ "duckdb",
18
+ "rust",
19
+ "rtree",
20
+ "spatial-index",
21
+ "geojson",
22
+ "wgs84",
23
+ "geodesic"
24
+ ],
25
+ "related_identifiers": [
26
+ {
27
+ "identifier": "https://github.com/jmcmeen/terrana",
28
+ "relation": "isSupplementTo",
29
+ "scheme": "url"
30
+ }
31
+ ]
32
+ }
@@ -0,0 +1,80 @@
1
+ # Changelog
2
+
3
+ All notable changes to Terrana will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.2.0] - 2026-06-25
9
+
10
+ ### Added
11
+
12
+ - **Library crate.** Terrana is now a `lib + bin` crate. `src/lib.rs` exposes a public API so the engine can be used in-process without the HTTP server: `ingest_file`, `detect_lat_lon`, `query`, `AppError`, and the geodesic `geometry` modules.
13
+ - `db::loader::ingest_file` — load a file end to end (stage → detect lat/lon → promote → build the R-tree index) in one call, returning `IngestInfo { lat_col, lon_col, row_count }`.
14
+ - `server` Cargo feature, enabled by default. Depend on Terrana with `default-features = false` to get the pure library without pulling in axum / tokio / tower.
15
+ - **Python bindings** (`terrana` on PyPI, built with PyO3 + maturin). A single stable-ABI (`abi3`) wheel for CPython 3.9+ exposing both an in-process library mode — `load_csv`/`load_parquet`/`load_geojson`, `query_radius`/`query_bbox`/`query_nearest`, `geodesic_distance`/`geodesic_area`, `convex_hull`, `buffer` — and an embedded HTTP server managed from Python (`serve_background`/`serve`/`shutdown`, context manager). The tokio runtime runs entirely in a Rust thread, off the GIL.
16
+
17
+ ### Changed
18
+
19
+ - Geodesic math moved out of the Axum handlers into pure `terrana::geometry` modules (`area`, `buffer`, `hull`, `dissolve`, `simplify`, `measure`). Handlers are now thin glue; HTTP responses are unchanged.
20
+
21
+ ### Fixed
22
+
23
+ - `POST /geometry/buffer` reported the area as ~510,000,000 km² (the Earth's entire surface): the ring was wound clockwise, so `geodesic_area_unsigned` measured its complement. The ring is now generated counter-clockwise (GeoJSON right-hand rule), so the reported area is the buffer disk as expected.
24
+
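The winding issue behind the buffer fix can be illustrated with a small orientation check. This is only a sketch, not Terrana's code: the function names `signed_area2`, `is_ccw`, and `ensure_ccw` are hypothetical, and the shoelace sum is used purely to decide ring orientation on small rings that do not cross the antimeridian or enclose a pole. It is not a substitute for the geodesic area calculation.

```rust
/// Twice the signed planar area of a closed ring of (lon, lat) vertices,
/// computed with the shoelace sum. Positive means counter-clockwise,
/// negative means clockwise. Orientation test only, not an area measurement.
fn signed_area2(ring: &[(f64, f64)]) -> f64 {
    ring.windows(2)
        .map(|w| w[0].0 * w[1].1 - w[1].0 * w[0].1)
        .sum()
}

/// True when the closed ring is wound counter-clockwise
/// (the exterior-ring orientation of the GeoJSON right-hand rule).
fn is_ccw(ring: &[(f64, f64)]) -> bool {
    signed_area2(ring) > 0.0
}

/// Return the ring wound counter-clockwise, reversing it if necessary.
/// Reversing a closed ring keeps it closed (first vertex == last vertex).
fn ensure_ccw(mut ring: Vec<(f64, f64)>) -> Vec<(f64, f64)> {
    if !is_ccw(&ring) {
        ring.reverse();
    }
    ring
}
```

A check of this kind before calling an unsigned geodesic area routine would catch a clockwise ring, which is the case the changelog describes as measuring the complement of the intended region.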
25
+ ## [0.1.1] - 2026-06-02
26
+
27
+ ### Fixed
28
+
29
+ - `--watch` reloads no longer blank out the dataset when the source file is malformed or half-written mid-save. The new file is now staged and its lat/lon columns validated *before* the live dataset is replaced, so a failed reload leaves the previous data serving and the watcher recovers on the next good write.
30
+ - `testdata/bench.sh` no longer exits non-zero (Error 143) when its background server is terminated on an otherwise successful run, so `make bench` reports success correctly. Genuine failures still propagate.
31
+ - Packaging: `testdata/generate.py` is no longer published to crates.io — the `exclude` glob missed the renamed generator, so benchmark/dev tooling was leaking into the package.
32
+
33
+ ### Changed
34
+
35
+ - Consolidated the two benchmark data generators into a single `testdata/generate.py` (`--preset bench` for 10K/100K/1M, `--preset 250m` for the 250M-row set); removed `generate_benchdata.py` and `generate_250m.py`.
36
+ - Moved `bench.sh` into `testdata/` (now runnable from any directory) and added `make gen`, `make gen-250m`, and `make bench` targets.
37
+
38
+ ### Added
39
+
40
+ - Integration tests for `--watch` reload: new rows are reflected, and the old dataset is preserved when a reload hits a bad file.
41
+
42
+ ## [0.1.0] - 2026-06-02
43
+
44
+ ### Added
45
+
46
+ - CLI with `terrana serve` subcommand and `--lat`, `--lon`, `--port`, `--bind`, `--watch`, `--disk` options
47
+ - Auto-detection of lat/lon columns from common naming conventions
48
+ - File ingestion for CSV, Parquet, GeoJSON, and DuckDB files
49
+ - DuckDB spatial extension R-tree index on geometry column for accelerated spatial queries
50
+ - `--disk` flag for on-disk DuckDB storage — required for large datasets (250M+ rows) that exceed available RAM
51
+ - `GET /query` endpoint with radius, bounding box, and nearest neighbor modes
52
+ - `POST /query/within` for point-in-polygon queries via `ST_Contains` (R-tree accelerated)
53
+ - `select=`, `where=`, `group_by=`, `agg=`, `limit=` query parameters
54
+ - JSON, CSV, and GeoJSON output formats
55
+ - Geometry endpoints: area, convex-hull, centroid, buffer, dissolve, simplify, distance, bounds
56
+ - Geometry area/perimeter use geodesic algorithms (Karney, WGS 84 ellipsoid via `geo` crate)
57
+ - Query path distances (radius, nearest) use haversine via DuckDB `ST_Distance_Sphere`
58
+ - `GET /health`, `GET /schema`, `GET /stats` metadata endpoints
59
+ - CORS support and request tracing via tower-http
60
+ - Tracing/logging with `RUST_LOG` env filter
61
+ - GitHub Actions CI (check, clippy, fmt, cross-platform build) and release workflows
62
+ - Dockerfile and docker-compose.yml for containerized deployment
63
+ - Benchmark script (`bench.sh`) and data generators for 10K–250M row datasets
64
+ - `--watch` now re-ingests the source file and atomically swaps the served dataset on change (previously the flag was accepted but had no effect)
65
+ - Test suite: unit tests for validation, SQL builders, lat/lon detection, and geodesic area; integration tests in `tests/api.rs` exercising the live HTTP API (run with `cargo test -- --include-ignored`)
66
+ - crates.io package metadata (description, license, repository, keywords, categories, MSRV) and dual `MIT OR Apache-2.0` licensing
67
+ - Community files: `CONTRIBUTING.md`, `SECURITY.md`, GitHub issue/PR templates, `.editorconfig`, `rust-toolchain.toml`
68
+ - `Makefile` with shortcuts for build/run/test/lint/package tasks (`make help`)
69
+ - "Installing Rust" guide in the README
70
+ - CI `test` job (runs unit + integration tests); clippy now lints all targets
71
+
72
+ ### Changed
73
+
74
+ - `GET /stats` returns `null` for `bbox`/`centroid` when the dataset has no spatially-valid rows, instead of a fake `(0, 0)`
75
+
76
+ ### Security
77
+
78
+ - Column name validation for all user-supplied column names, group_by, agg, and select params
79
+ - Fixed a SQL-injection vector in the `--table` argument for `.duckdb` sources: the table identifier is now validated and quoted before interpolation
80
+ - Bounding-box query parameters are now range-validated (lat ∈ [-90, 90], lon ∈ [-180, 180], min ≤ max)
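The bounding-box rule in the last Security entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `src/db/query.rs`: the `BBox` struct, its field names, `check_range`, and `validate_bbox` are all hypothetical, and errors are plain strings rather than Terrana's `AppError`.

```rust
/// Hypothetical bounding-box type; the field names are illustrative only.
#[derive(Debug, PartialEq)]
struct BBox {
    min_lat: f64,
    min_lon: f64,
    max_lat: f64,
    max_lon: f64,
}

/// Check that `value` lies in [lo, hi]. `contains` is false for NaN,
/// so non-numeric input is rejected as well.
fn check_range(name: &str, value: f64, lo: f64, hi: f64) -> Result<(), String> {
    if (lo..=hi).contains(&value) {
        Ok(())
    } else {
        Err(format!("{} must be within [{}, {}], got {}", name, lo, hi, value))
    }
}

/// Apply the three rules from the changelog entry:
/// lat in [-90, 90], lon in [-180, 180], and min <= max on both axes.
fn validate_bbox(min_lat: f64, min_lon: f64, max_lat: f64, max_lon: f64) -> Result<BBox, String> {
    check_range("min_lat", min_lat, -90.0, 90.0)?;
    check_range("max_lat", max_lat, -90.0, 90.0)?;
    check_range("min_lon", min_lon, -180.0, 180.0)?;
    check_range("max_lon", max_lon, -180.0, 180.0)?;
    if min_lat > max_lat {
        return Err(format!("min_lat ({}) must not exceed max_lat ({})", min_lat, max_lat));
    }
    if min_lon > max_lon {
        return Err(format!("min_lon ({}) must not exceed max_lon ({})", min_lon, max_lon));
    }
    Ok(BBox { min_lat, min_lon, max_lat, max_lon })
}
```

One consequence of the `min ≤ max` rule as stated is that a box crossing the antimeridian (for example, `min_lon = 170`, `max_lon = -170`) is rejected rather than wrapped.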
@@ -0,0 +1,30 @@
1
+ cff-version: 1.2.0
2
+ message: "If you use Terrana in academic research, please cite it using the metadata from this file."
3
+ title: "Terrana: Zero-Config Spatial API Server"
4
+ abstract: >-
5
+ Terrana is a zero-config spatial API server written in Rust. Point it at a
6
+ CSV, Parquet, or GeoJSON file containing lat/lon columns and immediately get
7
+ a REST API with spatial queries and geodesic geometry operations — no database
8
+ setup, no PostGIS, no infrastructure. Built on axum and DuckDB with R-tree
9
+ spatial indexing.
10
+ type: software
11
+ authors:
12
+ - family-names: McMeen
13
+ given-names: John
14
+ email: johnmcmeen@gmail.com
15
+ version: 0.1.0
16
+ date-released: 2026-06-02
17
+ doi: 10.5281/zenodo.20515989
18
+ license:
19
+ - MIT
20
+ - Apache-2.0
21
+ repository-code: "https://github.com/jmcmeen/terrana"
22
+ url: "https://github.com/jmcmeen/terrana"
23
+ keywords:
24
+ - spatial
25
+ - geospatial
26
+ - geojson
27
+ - duckdb
28
+ - rust
29
+ - rest-api
30
+ - gis
@@ -0,0 +1,124 @@
1
+ # Contributing to Terrana
2
+
3
+ Thanks for your interest in contributing! Terrana is a zero-config spatial API
4
+ server written in Rust, and contributions of all kinds are welcome — bug reports,
5
+ documentation, new file formats, geometry operations, and performance work.
6
+
7
+ ## Ways to contribute
8
+
9
+ - **Bug reports** — Open an issue describing the problem, the file format you used,
10
+ and the exact query that triggered it. A minimal sample file helps enormously.
11
+ - **New file formats** — Terrana ingests through DuckDB; adding a format usually
12
+ means a small addition to [`src/db/loader.rs`](src/db/loader.rs).
13
+ - **Geometry operations** — The geodesic math lives in pure functions under
14
+ [`src/geometry/`](src/geometry/) (`area.rs`, `buffer.rs`, `hull.rs`, `measure.rs`, …);
15
+ the `POST /geometry/*` handlers in
16
+ [`src/handlers/geometry.rs`](src/handlers/geometry.rs) are thin glue over them. All
17
+ spatial math **must** use geodesic algorithms from the `geo` crate — never
18
+ planar/Cartesian (see [Geodesic rules](#geodesic-rules) below).
19
+ - **Python bindings** — The PyO3 bindings live in [`python/`](python/) and re-export
20
+ the same engine; new Python surface goes in [`python/src/lib.rs`](python/src/lib.rs)
21
+ with a test in [`python/tests/`](python/tests/).
22
+ - **Performance** — Benchmark with `./testdata/bench.sh` (or `make bench`), profile with `cargo flamegraph`,
23
+ and open a PR with before/after numbers.
24
+ - **Documentation** — Improvements to the README, examples, or inline doc comments
25
+ are always appreciated.
26
+
27
+ ## Development setup
28
+
29
+ You'll need a Rust toolchain (stable). If you don't have one, see
30
+ [Installing Rust](README.md#installing-rust) in the README.
31
+
32
+ ```bash
33
+ git clone https://github.com/jmcmeen/terrana.git
34
+ cd terrana
35
+ cargo build
36
+ cargo run -- serve testdata/observations.csv
37
+ ```
38
+
39
+ No system DuckDB or PostGIS is required — DuckDB is bundled. On first run, DuckDB
40
+ downloads its `spatial` extension from the network and caches it locally.
41
+
42
+ ## Before you open a pull request
43
+
44
+ Run the same checks CI runs, and make sure they all pass:
45
+
46
+ ```bash
47
+ cargo fmt --all # format
48
+ cargo clippy --all-targets -- -D warnings # lint (warnings are errors)
49
+ cargo test # unit tests (offline)
50
+ cargo test -- --include-ignored # + integration tests (needs network)
51
+ ```
52
+
53
+ Or use the [`Makefile`](Makefile) shortcuts (`make help` lists them all):
54
+
55
+ ```bash
56
+ make ci # fmt-check + lint + unit tests (the offline gate)
57
+ make test-all # unit + integration tests (needs network)
58
+ make run # run the server against testdata/observations.csv
59
+ ```
60
+
61
+ The integration tests in [`tests/api.rs`](tests/api.rs) spawn the real binary and
62
+ hit the HTTP endpoints. They are `#[ignore]`d by default because starting the server
63
+ requires the DuckDB `spatial` extension to be available; run them with
64
+ `--include-ignored` in an environment with network access.
65
+
66
+ For changes to the **Python bindings**, build and test them with
67
+ [uv](https://docs.astral.sh/uv/):
68
+
69
+ ```bash
70
+ cd python
71
+ uv venv --python 3.13 && source .venv/bin/activate
72
+ uv pip install maturin pytest
73
+ maturin develop # build the extension into the venv
74
+ uv run pytest tests/ -v # library + server-mode tests
75
+ ```
76
+
77
+ ## Pull request guidelines
78
+
79
+ - Keep PRs focused — one logical change per PR.
80
+ - Add or update tests for any behavior change.
81
+ - Update [`CHANGELOG.md`](CHANGELOG.md) under the `Unreleased` section.
82
+ - Update the README and inline docs when you change user-facing behavior.
83
+ - Write commit messages in the imperative mood ("Add buffer endpoint", not "Added").
84
+
85
+ ## Releasing & versioning
86
+
87
+ Releases are cut by the maintainer only. The Rust crate and the Python wheel ship
88
+ from a **single `vX.Y.Z` tag** and **share one version**, so:
89
+
90
+ - **Any change to the Rust crate _or_ the Python bindings needs a version bump** in
91
+ [`Cargo.toml`](Cargo.toml) at release. crates.io versions are immutable and the
92
+ wheel version is taken from the tag, so the published version must equal the tag —
93
+ CI fails the publish if they differ. Follow [SemVer](https://semver.org).
94
+ - Flow: bump `Cargo.toml` → land through `staging` → merge `staging → main` → push
95
+ `vX.Y.Z` on `main`; the tag publishes both registries.
96
+
97
+ Contributors don't bump the version in a PR — just note user-facing changes in
98
+ [`CHANGELOG.md`](CHANGELOG.md); the maintainer handles the bump and tag at release.
99
+
100
+ ## Geodesic rules
101
+
102
+ These are non-negotiable for every geometry calculation:
103
+
104
+ - **Area / perimeter** → `geo::GeodesicArea::geodesic_area_unsigned()` (Karney, WGS 84).
105
+ - **Buffer ring vertices** → `geo::Destination::geodesic_destination()`.
106
+ - **Geometry-endpoint distances** → `geo::Distance` / geodesic (ellipsoidal).
107
+ - **Query-path distances** (radius, nearest) → DuckDB `ST_Distance_Sphere` (haversine)
108
+ is acceptable.
109
+
110
+ Never use planar/Cartesian math for area, distance, or buffer calculations.
111
+
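For reference, the haversine formula that the query-path rule permits can be written in a few lines of standard-library Rust. This is a sketch for comparison only: the function name `haversine_m` and the mean Earth radius constant are assumptions of this example, the radius DuckDB uses internally may differ, and the geometry endpoints should still go through the `geo` crate's geodesic routines as the rules above require.

```rust
/// Mean Earth radius in metres. Assumed value for this sketch;
/// the constant used by DuckDB's ST_Distance_Sphere may differ.
const EARTH_RADIUS_M: f64 = 6_371_008.8;

/// Great-circle (spherical) distance in metres between two points
/// given as latitude/longitude in degrees, using the haversine formula.
fn haversine_m(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let phi1 = lat1.to_radians();
    let phi2 = lat2.to_radians();
    let d_phi = (lat2 - lat1).to_radians();
    let d_lambda = (lon2 - lon1).to_radians();

    // a = sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    let a = (d_phi / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (d_lambda / 2.0).sin().powi(2);

    // Clamp before asin to guard against rounding pushing sqrt(a) above 1.
    2.0 * EARTH_RADIUS_M * a.sqrt().min(1.0).asin()
}
```

Because the sphere ignores the flattening of the WGS 84 ellipsoid, results can differ from geodesic distances by a fraction of a percent, which is why the rule limits haversine to the query path.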
112
+ ## Scope
113
+
114
+ Terrana is intentionally small. It is **not** a PostGIS replacement, a tile server,
115
+ a distributed system, or a CRS-conversion tool (WGS 84 only). Please open an issue to
116
+ discuss before working on anything that expands this scope — see "What This Project Is
117
+ NOT" in [CLAUDE.md](CLAUDE.md).
118
+
119
+ ## Licensing
120
+
121
+ Terrana is dual-licensed under [MIT](LICENSE-MIT) or [Apache-2.0](LICENSE-APACHE).
122
+ Unless you state otherwise, any contribution you submit for inclusion in the work, as
123
+ defined in the Apache-2.0 license, shall be dual-licensed as above, without any
124
+ additional terms or conditions.