landarchetypes 0.1.1__tar.gz

Files changed (25)
  1. landarchetypes-0.1.1/LICENSE +21 -0
  2. landarchetypes-0.1.1/PKG-INFO +368 -0
  3. landarchetypes-0.1.1/README.md +342 -0
  4. landarchetypes-0.1.1/pyproject.toml +54 -0
  5. landarchetypes-0.1.1/setup.cfg +4 -0
  6. landarchetypes-0.1.1/src/land_archetypes/__init__.py +13 -0
  7. landarchetypes-0.1.1/src/land_archetypes/archetype_classes/archetype_classes.json +257 -0
  8. landarchetypes-0.1.1/src/land_archetypes/archetype_classification.py +181 -0
  9. landarchetypes-0.1.1/src/land_archetypes/archetype_profiler.py +254 -0
  10. landarchetypes-0.1.1/src/land_archetypes/climate_land_unit_classification.py +289 -0
  11. landarchetypes-0.1.1/src/land_archetypes/input_layers_mapping/__init__.py +0 -0
  12. landarchetypes-0.1.1/src/land_archetypes/input_layers_mapping/clc_mapping.csv +46 -0
  13. landarchetypes-0.1.1/src/land_archetypes/input_layers_mapping/eunis_l2_mapping.csv +45 -0
  14. landarchetypes-0.1.1/src/land_archetypes/input_layers_mapping/mapping_tools.py +47 -0
  15. landarchetypes-0.1.1/src/land_archetypes/kcs_catalogue.json +178 -0
  16. landarchetypes-0.1.1/src/land_archetypes/utilities/__init__.py +0 -0
  17. landarchetypes-0.1.1/src/land_archetypes/utilities/geoprocessing_tools.py +497 -0
  18. landarchetypes-0.1.1/src/landarchetypes.egg-info/PKG-INFO +368 -0
  19. landarchetypes-0.1.1/src/landarchetypes.egg-info/SOURCES.txt +23 -0
  20. landarchetypes-0.1.1/src/landarchetypes.egg-info/dependency_links.txt +1 -0
  21. landarchetypes-0.1.1/src/landarchetypes.egg-info/requires.txt +8 -0
  22. landarchetypes-0.1.1/src/landarchetypes.egg-info/top_level.txt +1 -0
  23. landarchetypes-0.1.1/tests/test_archetype_classification.py +213 -0
  24. landarchetypes-0.1.1/tests/test_archetype_profiler.py +153 -0
  25. landarchetypes-0.1.1/tests/test_clu_classification.py +234 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Envrio
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,368 @@
1
+ Metadata-Version: 2.4
2
+ Name: landarchetypes
3
+ Version: 0.1.1
4
+ Summary: Classify regions into landscape archetypes from multi-layer geospatial inputs.
5
+ Author-email: Ioannis Tsakmakis <itsakmak@envrio.org>, Nikolaos Kokkos <nkokkos@envrio.org>
6
+ License: MIT
7
+ Keywords: geospatial,landscape,archetypes,corine,eunis,classification
8
+ Classifier: Development Status :: 3 - Alpha
9
+ Classifier: Intended Audience :: Science/Research
10
+ Classifier: License :: OSI Approved :: MIT License
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python :: 3 :: Only
13
+ Classifier: Programming Language :: Python :: 3.12
14
+ Classifier: Topic :: Scientific/Engineering :: GIS
15
+ Requires-Python: >=3.12
16
+ Description-Content-Type: text/markdown
17
+ License-File: LICENSE
18
+ Requires-Dist: xarray>=2026.2.0
19
+ Requires-Dist: rioxarray>=0.22.0
20
+ Requires-Dist: pandas>=3.0.1
21
+ Requires-Dist: geopandas>=1.1.3
22
+ Requires-Dist: scikit-learn>=1.5.0
23
+ Provides-Extra: dev
24
+ Requires-Dist: pytest>=7.4; extra == "dev"
25
+ Dynamic: license-file
26
+
27
+ # archetype_mapper
28
+
29
+ A Python library for deriving landscape archetype and Climate-Land Unit rasters
30
+ as geospatial **exposure layers** in support of climate risk assessment.
31
+
32
+ ---
33
+
34
+ ## Conceptual framework
35
+
36
+ In climate risk assessment, **exposure** describes what is present in a location that
37
+ could be adversely affected — the landscape, its structural character, and the communities
38
+ and systems embedded within it. `landarchetypes` operationalises this by classifying
39
+ the landscape into discrete, spatially explicit exposure units.
40
+
41
+ Two levels of exposure characterisation are available:
42
+
43
+ **Land Archetype** — a structural and morphological unit defined by land cover, habitat
44
+ type, elevation, imperviousness, population density, and proximity to coastlines and river
45
+ networks. It describes *what the land is*, which community systems are most likely present within it,
46
+ and which hazards pose the highest threat to it — serving as the primary exposure layer.
47
+
48
+ **Climate-Land Unit (CLU)** — an optional refinement that adds the climatic envelope
49
+ (mean annual precipitation and temperature) under which an archetype operates. CLUs are
50
+ derived by unsupervised clustering within each archetype class and carry labels of the
51
+ form `C4-1`, `C4-2`, encoding both structural identity and climate variant.
52
+
53
+ > **Terminology note**: a land archetype is a structural/morphological descriptor.
54
+ > A CLU is a distinct concept combining structural identity with climatic context.
55
+ > Both are exposure layers. Neither is a hazard map or a vulnerability indicator.
56
+
57
+ ---
58
+
59
+ ## Installation
60
+
61
+ ```bash
62
+ pip install landarchetypes
63
+ ```
64
+
65
+ Requires Python ≥ 3.12.
66
+
67
+ ## Dependencies
68
+
69
+ - `xarray >= 2026.2.0`
70
+ - `rioxarray >= 0.22.0`
71
+ - `pandas >= 3.0.1`
72
+ - `geopandas >= 1.1.3`
73
+ - `scikit-learn >= 1.5.0`
74
+
75
+ ---
76
+
77
+ ## Stage 1 — Land Archetype map (mandatory)
78
+
79
+ The 15 built-in archetype classes span four groups:
80
+
81
+ | Group | Classes | Key discriminators |
82
+ |---|---|---|
83
+ | Coastal | A1–A4 | Coastal proximity, EUNIS marine/transitional codes |
84
+ | Urban | B1–B5 | Imperviousness, population density, elevation, river/coastal proximity |
85
+ | Rural | C1–C4 | CLC agricultural/natural codes, low imperviousness |
86
+ | Mountainous | D1–D2 | Elevation ≥ 300 m, alpine EUNIS codes |
87
+
88
+ ### Indicative input data resources
89
+
90
+ | Spatial Evidence | Indicative Dataset | Spatial Resolution | Version | Last Updated |
91
+ |---|---|---|---|---|
92
+ | CORINE Land Cover | CORINE Land Cover | 100 × 100 m | 20.01 | 2020-05-13 |
93
+ | European Nature Information System | Ecosystem Types of Europe 2012 | 100 × 100 m | 3.1 | 2019-02-26 |
94
+ | Digital Elevation Model | Copernicus GLO-30 Digital Elevation Model | 30 × 30 m | — | 2015-01-07 |
95
+ | Surface Imperviousness Density | Imperviousness Density 2021, Europe (10 m and 100 m), 3-yearly | 10 × 10 m | 1.00 | 2025-08-01 |
96
+ | River Network | HydroRIVERS | — | 1.00 | — |
97
+ | Coast Line | OpenStreetMap Coastlines | — | — | 2026-02-20 |
98
+ | Population Density | WorldPop Age and Sex Structures | 100 × 100 m | 1 | — |
99
+
100
+ Classification follows a first-match-wins precedence order over a configurable JSON rule set.
101
+ Climate constraints (mean annual precipitation and temperature) are optional — include the
102
+ corresponding rasters in `ras` to activate them; omit them and they are silently skipped.
103
+
104
+ ```python
105
+ import json
106
+ from land_archetypes.archetype_classification import ArchetypeClassification
107
+
108
+ with open("archetype_classes/archetype_classes.json") as f:
109
+     rules = json.load(f)
110
+
111
+ clf = ArchetypeClassification()
112
+ archetype_raster = clf.derive_archetype_raster_map(
113
+ output_path="outputs/",
114
+ archetype_map_name="archetypes.tif",
115
+ ras={
116
+ "clc": clc_da,
117
+ "eunis": eunis_da,
118
+ "coast_buffer": coast_da,
119
+ "river_buffer": river_da,
120
+ "imperviousness": imp_da,
121
+ "population_density": pop_da,
122
+ "dem": dem_da,
123
+ # optional — activate mean_annual_precip_constraint / mean_annual_temp_constraint
124
+ # fields defined in archetype_classes.json for selected archetypes
125
+ "mean_precip": precip_da,
126
+ "mean_temp": temp_da,
127
+ },
128
+ rules=rules,
129
+ eunis_code_map=eunis_map,
130
+ clc_code_map=clc_map,
131
+ )
132
+ ```
133
+
134
+ ### Classification precedence
135
+
136
+ Because a pixel can satisfy the rules of more than one archetype — for example, an urban
137
+ area on the coast satisfies both Coastal Urban and Inland Urban constraints — the
138
+ classifier uses a **first-match-wins** strategy: each pixel is assigned to the first
139
+ archetype in the precedence list whose rule it satisfies, and is then excluded from all
140
+ subsequent evaluations.
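
The first-match-wins strategy described above can be sketched in plain Python. This is an illustration only, not the package's implementation: the function name `first_match_wins`, the per-pixel boolean masks, and the flat list layout are assumptions made for the example, with lists standing in for raster arrays.

```python
from typing import Dict, List, Optional


def first_match_wins(
    masks: Dict[str, List[bool]],
    precedence: List[str],
) -> List[Optional[str]]:
    """Assign each pixel the first archetype in `precedence` whose mask is True.

    `masks` maps an archetype code to one boolean per pixel (hypothetical layout).
    Pixels that match no rule stay None.
    """
    n_pixels = len(next(iter(masks.values())))
    assigned: List[Optional[str]] = [None] * n_pixels
    for code in precedence:
        for i, matches in enumerate(masks[code]):
            # A pixel that is already assigned is excluded from later rules.
            if matches and assigned[i] is None:
                assigned[i] = code
    return assigned


# Pixel 0 satisfies both urban rules, pixel 1 only the inland rule, pixel 2 neither.
masks = {
    "B3": [True, False, False],  # Coastal Urban
    "B1": [True, True, False],   # Inland Urban
}
result = first_match_wins(masks, precedence=["B3", "B1"])
print(result)  # ['B3', 'B1', None]
```

Reversing the precedence to `["B1", "B3"]` would assign pixel 0 to `B1` instead, which is why the more spatially constrained class is listed first.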
141
+
142
+ The default precedence is ordered from most spatially constrained to most general,
143
+ ensuring that specialised archetypes are not absorbed by broader ones:
144
+
145
+ | Priority | Code | Name |
146
+ |---|---|---|
147
+ | 1 | A2 | Beach-Dune System |
148
+ | 2 | A3 | Transitional Coastal Water System |
149
+ | 3 | A1 | Marine/Subtidal |
150
+ | 4 | A4 | Coastal Natural Plains & Forests |
151
+ | 5 | B3 | Coastal Urban |
152
+ | 6 | B2 | Riverine Urban |
153
+ | 7 | B1 | Inland Urban |
154
+ | 8 | B4 | Suburban |
155
+ | 9 | B5 | Mountainous Urban |
156
+ | 10 | D1 | Mountainous/Forested |
157
+ | 11 | D2 | High-Altitude Meadows & Scrub |
158
+ | 12 | C2 | Inland Waterbody Systems |
159
+ | 13 | C3 | Rural Settlements |
160
+ | 14 | C1 | Agricultural Land |
161
+ | 15 | C4 | Inland Natural Plains & Forests |
162
+
163
+ The order can be changed by passing a custom list to the `precedence` parameter:
164
+
165
+ ```python
166
+ archetype_raster = clf.derive_archetype_raster_map(
167
+ ...,
168
+ precedence=["B5", "D2", "D1", "B3", "B2", "B1", "A2", "A3", "A1",
169
+ "A4", "B4", "C2", "C3", "C1", "C4"],
170
+ )
171
+ ```
172
+
173
+ ### Overriding default rule thresholds
174
+
175
+ Per-archetype constraint values (elevation, imperviousness, population density, climate
176
+ ranges) can be overridden at call time without modifying the JSON file. Only the specified
177
+ fields are updated; all others retain their default values.
178
+
179
+ ```python
180
+ archetype_raster = clf.derive_archetype_raster_map(
181
+ ...,
182
+ rule_overrides={
183
+ "C4": {"elevation_constraint": [0, 400]},
184
+ "B1": {"imperviousness_constraints": [40, 100]},
185
+ "D2": {"mean_annual_temp_constraint": [-5, 4]},
186
+ },
187
+ )
188
+ ```
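
The override behaviour described above (only the named fields change, everything else keeps its default) can be sketched as follows. This is a simplified sketch under stated assumptions, not the package's code: `apply_rule_overrides` is a hypothetical helper, the rule set is assumed to be keyed by archetype code at the top level, the merge is one level deep, and the threshold values are invented for illustration.

```python
import copy
from typing import Any, Dict


def apply_rule_overrides(
    rules: Dict[str, Dict[str, Any]],
    overrides: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Return a copy of `rules` with per-archetype fields replaced by `overrides`."""
    unknown = sorted(set(overrides) - set(rules))
    if unknown:
        raise ValueError(
            f"Unknown archetype keys {unknown}; valid keys are {sorted(rules)}"
        )
    merged = copy.deepcopy(rules)  # the caller's dict is never mutated
    for code, fields in overrides.items():
        merged[code].update(fields)  # only the named fields change
    return merged


# Illustrative values only; the real defaults live in archetype_classes.json.
rules = {
    "C4": {"elevation_constraint": [0, 300], "imperviousness_constraints": [0, 10]},
    "B1": {"elevation_constraint": [0, 300], "imperviousness_constraints": [50, 100]},
}
merged = apply_rule_overrides(rules, {"C4": {"elevation_constraint": [0, 400]}})
print(merged["C4"]["elevation_constraint"])  # [0, 400]
print(rules["C4"]["elevation_constraint"])   # [0, 300], original untouched
```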
189
+
190
+ The coastline and riverline buffer distances are controlled at preprocessing time via
191
+ `GeospatialProcessingUtilities.create_line_buffer_raster(buffer_distance=...)`.
192
+
193
+ ### Profiling a study area
194
+
195
+ Once you have an archetype raster, `ArchetypeProfiler` surfaces the domain knowledge
196
+ encoded in the rule set — which hazard layers are needed for your specific study area
197
+ and which community systems are at risk — without manual inspection of the JSON file.
198
+
199
+ ```python
200
+ from land_archetypes.archetype_profiler import ArchetypeProfiler
201
+
202
+ profiler = ArchetypeProfiler()
203
+ report = profiler.profile(archetype_raster, rules)
204
+
205
+ # Complete set of hazard maps needed for this study area
206
+ print(report["required_hazard_layers"])
207
+ # ["coastal floods", "drought", "heatwaves", "slope instability/landslides", "wildfires"]
208
+
209
+ # Community systems at risk across the study area
210
+ print(report["community_systems_at_risk"])
211
+ # ["education", "environmental & ecosystem", "food", "health", "transportation", "water"]
212
+
213
+ # Per-archetype breakdown with pixel count and coverage
214
+ for key, info in report["archetypes_present"].items():
215
+     print(f"{key} ({info['name']}): {info['coverage_pct']}% — hazards: {info['hazard_relevance']}")
216
+ ```
217
+
218
+ The same profiler works on CLU rasters, adding climate centroids per variant:
219
+
220
+ ```python
221
+ report = profiler.profile_clu(clu_raster, lookup, rules)
222
+
223
+ print(report["archetypes_present"]["C4"]["climate_variants"])
224
+ # {
225
+ # "C4-1": {"pixel_count": 8200, "centroid": {"mean_precip": 320.4, "mean_temp": 17.8}},
226
+ # "C4-2": {"pixel_count": 4250, "centroid": {"mean_precip": 680.1, "mean_temp": 11.2}},
227
+ # }
228
+ ```
229
+
230
+ ### Expanding community systems
231
+
232
+ `community_systems_at_risk` returns category names. Call `expand_community_systems`
233
+ to resolve those categories into a specific inventory of systems at risk, drawn from
234
+ the built-in KCS catalogue (32 systems across 8 categories).
235
+
236
+ ```python
237
+ details = profiler.expand_community_systems(report["community_systems_at_risk"])
238
+ # {
239
+ # "health": [
240
+ # {"id": 19, "name": "hospitals", "description": "Medical institutions ..."},
241
+ # {"id": 20, "name": "pharmacies", "description": "Facilities dispensing ..."},
242
+ # {"id": 21, "name": "emergency medical services", "description": "Rapid-response ..."},
243
+ # ],
244
+ # "water": [
245
+ # {"id": 1, "name": "drinking water distribution network", "description": "..."},
246
+ # ...
247
+ # ],
248
+ # ...
249
+ # }
250
+ ```
251
+
252
+ The KCS catalogue covers the following categories and systems:
253
+
254
+ | Category | Systems |
255
+ |---|---|
256
+ | Water | Drinking water distribution network; drinking water treatment plants; wastewater treatment plants; stormwater drainage system; irrigation water distribution system |
257
+ | Transportation | Ports/harbors; railways; airports; public transport systems; road networks |
258
+ | Energy | Power plants; transmission and distribution grid; renewable energy infrastructure; refineries |
259
+ | Food | Agricultural production; storages (e.g., silos); food processing facilities; local markets |
260
+ | Health | Hospitals; pharmacies; emergency medical services |
261
+ | Communication | Telecommunications (mobile, internet) |
262
+ | Education | Schools; universities; athletic centers |
263
+ | Environmental & Ecosystem | Wetlands, rivers, floodplains; soil; urban green spaces; dunes, reefs; forests; lagoons and freshwater lakes; groundwater resources |
264
+
265
+ ---
266
+
267
+ ## Stage 2 — Climate-Land Unit map (optional)
268
+
269
+ Requires the archetype raster from Stage 1. Within each archetype class, pixels are
270
+ clustered on mean annual precipitation and temperature to produce climate sub-types
271
+ (e.g. `C4-1`, `C4-2`). Features are z-score standardised before clustering so that
272
+ precipitation (mm) and temperature (°C) contribute equally.
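
To see why standardisation matters here, the sketch below z-scores two small feature columns using only the standard library. It is a hedged illustration: the `zscore` helper and the sample values are assumptions, and the population standard deviation is used for simplicity, which may or may not match the scaler the package applies.

```python
from statistics import mean, pstdev
from typing import List


def zscore(values: List[float]) -> List[float]:
    """Standardise values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information for clustering.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Invented sample pixels: precipitation spans hundreds of mm, temperature a few °C.
precip_mm = [300.0, 500.0, 700.0]
temp_c = [18.0, 14.0, 10.0]

precip_z = zscore(precip_mm)
temp_z = zscore(temp_c)

# After standardisation both features have the same spread, so neither one
# dominates a Euclidean distance the way raw millimetres would.
print(pstdev(precip_mm), pstdev(temp_c))
print(pstdev(precip_z), pstdev(temp_z))
```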
273
+
274
+ ```python
275
+ from land_archetypes.climate_land_unit_classification import ClimateLandUnitClassification
276
+
277
+ clf_clu = ClimateLandUnitClassification()
278
+ clu_raster, lookup = clf_clu.derive_climate_land_unit_map(
279
+ output_path="outputs/",
280
+ output_name="climate_land_units.tif",
281
+ archetype_raster=archetype_raster,
282
+ ras={
283
+ "mean_precip": precip_da, # mean annual precipitation (mm/year)
284
+ "mean_temp": temp_da, # mean annual temperature (°C)
285
+ },
286
+ target_archetypes=["C4", "D1", "D2"], # sub-type only these; others pass through
287
+ n_clusters={"C4": 3}, # fix k for C4; auto-select for the rest
288
+ method="kmeans", # "kmeans" (silhouette) or "gmm" (BIC)
289
+ )
290
+ ```
291
+
292
+ The returned `lookup` dict maps each CLU integer ID to its metadata:
293
+
294
+ ```python
295
+ {
296
+ 1: {"archetype": "C4", "cluster": 1, "label": "C4-1",
297
+ "centroid": {"mean_precip": 320.4, "mean_temp": 17.8}},
298
+ 2: {"archetype": "C4", "cluster": 2, "label": "C4-2",
299
+ "centroid": {"mean_precip": 680.1, "mean_temp": 11.2}},
300
+ ...
301
+ }
302
+ ```
303
+
304
+ ---
305
+
306
+ ## Changelog
307
+
308
+ ### 0.1.1
309
+
310
+ #### New: Climate-Land Unit classification
311
+
312
+ - Added `climate_land_unit_classification.py` — `ClimateLandUnitClassification` class
313
+ implementing Stage 2 of the two-stage workflow:
314
+ - Within-archetype unsupervised clustering on mean annual precipitation and temperature.
315
+ - Supports `"kmeans"` (silhouette-based auto-k) and `"gmm"` (BIC-based auto-k).
316
+ - Per-archetype fixed k via `n_clusters` dict; `target_archetypes` controls which
317
+ archetypes are sub-typed vs. passed through.
318
+ - Returns a UInt16 CLU raster (NoData = 65535) and a `lookup` dict with centroids.
319
+ - `scikit-learn >= 1.5.0` added as a dependency.
320
+
321
+ #### New: archetype profiler
322
+
323
+ - Added `archetype_profiler.py` — `ArchetypeProfiler` class with two methods:
324
+ - `profile(archetype_raster, rules)` — for archetype rasters: returns per-archetype
325
+ pixel count, coverage %, `hazard_relevance`, and `kcs`; plus study-area-wide
326
+ `required_hazard_layers` and `community_systems_at_risk` as sorted union sets.
327
+ - `profile_clu(clu_raster, clu_lookup, rules)` — same as above for CLU rasters,
328
+ with an additional `climate_variants` sub-dict per archetype showing per-CLU
329
+ pixel counts and climate centroids.
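
The "sorted union sets" mentioned above can be pictured with a short sketch. The helper name `study_area_union` and the per-archetype hazard and system assignments are invented for illustration; only the field names `hazard_relevance` and `kcs` come from the changelog, and the profiler's actual internals may differ.

```python
from typing import Dict, List


def study_area_union(
    archetypes_present: Dict[str, Dict[str, List[str]]],
    field: str,
) -> List[str]:
    """Collect one field across all archetypes present and return it sorted, without duplicates."""
    combined = set()
    for info in archetypes_present.values():
        combined.update(info.get(field, []))
    return sorted(combined)


# Hypothetical per-archetype entries; real values come from the JSON rule set.
present = {
    "B3": {"hazard_relevance": ["coastal floods", "heatwaves"],
           "kcs": ["health", "transportation"]},
    "C4": {"hazard_relevance": ["wildfires", "drought", "heatwaves"],
           "kcs": ["food", "water"]},
}
hazards = study_area_union(present, "hazard_relevance")
systems = study_area_union(present, "kcs")
print(hazards)  # ['coastal floods', 'drought', 'heatwaves', 'wildfires']
print(systems)  # ['food', 'health', 'transportation', 'water']
```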
330
+
331
+ #### New: per-archetype rule overrides
332
+
333
+ - `ArchetypeClassification.derive_archetype_raster_map` accepts
334
+ `rule_overrides: Dict[str, Dict[str, Any]]` — a per-archetype dict that
335
+ deep-merges with the loaded JSON rules at call time.
336
+ - Unknown archetype keys raise a `ValueError` with the list of valid keys.
337
+ - The original rules dict is never mutated (deep-copied before merging).
338
+
339
+ #### New: climate constraints in archetype rules
340
+
341
+ - `archetype_classes.json` includes `mean_annual_precip_constraint` and
342
+ `mean_annual_temp_constraint` fields for all 15 archetypes. Indicative ranges
343
+ set for B1, B3, C4, D1, D2; `null` for all others.
344
+ - `archetype_classification.py` supports optional `mean_precip` and `mean_temp`
345
+ rasters. If absent from `ras`, climate constraints are silently skipped.
346
+
347
+ #### Fix: `GeospatialProcessingUtilities` (geoprocessing_tools.py)
348
+
349
+ - **Class rename**: `GeospacialProcessingUntilities` → `GeospatialProcessingUtilities`.
350
+ - **Bug — integer overflow**: `add_two_rasters` widens integer inputs to `int32` before
351
+ addition to prevent silent `uint8` wraparound.
352
+ - **Bug — float transform comparison**: Replaced exact `Affine !=` equality with
353
+ `np.allclose` on the six transform coefficients.
354
+ - **Bug — double processing of reference raster**: `reproject_rasters` collapsed two
355
+ `reproject` calls into a single pass; reference key skipped inside the loop.
356
+ - **Pitfall — geometry repair**: `make_valid()` used consistently across all methods
357
+ (previously `buffer(0)` in two methods, `make_valid()` in one).
358
+ - **Pitfall — deprecated `unary_union`**: Replaced with `union_all()` (Shapely 2.x).
359
+ - **Pitfall — unreliable CRS equality**: `!=` replaced with `.equals()`.
360
+ - **Pitfall — missing band-dim guard**: `_as_1band` promoted to class-level `@staticmethod`.
361
+ - **Performance — eager raster loading**: `chunks="auto"` added to all
362
+ `rxr.open_rasterio` calls.
363
+ - **Performance — unnecessary `clip_box`**: Moved inside the `mask_outside_vector` branch.
364
+ - **Missing feature**: `compress` parameter added to `clip_raster_by_vector`.
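
Two of the fixes listed above, the `uint8` wraparound and the tolerance-based transform comparison, can be illustrated with the standard library alone. This is a sketch of the underlying arithmetic, not the package's code: the helper names are hypothetical, modulo 256 stands in for fixed-width `uint8` addition, `math.isclose` stands in for `np.allclose`, and the tolerances are assumptions.

```python
import math
from typing import Sequence


def add_uint8_wrapping(a: int, b: int) -> int:
    """Mimic fixed-width uint8 addition, which wraps modulo 256."""
    return (a + b) % 256


def transforms_close(
    t1: Sequence[float],
    t2: Sequence[float],
    rel_tol: float = 1e-9,
    abs_tol: float = 1e-9,
) -> bool:
    """Compare affine coefficients with a tolerance instead of exact equality."""
    return len(t1) == len(t2) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(t1, t2)
    )


# Two valid uint8 pixel values whose sum does not fit in 8 bits.
print(add_uint8_wrapping(200, 100))  # 44, the silent wraparound
print(200 + 100)                     # 300, the result after widening

# Six affine coefficients that differ only by floating-point noise.
ref = (100.0, 0.0, 4_000_000.0, 0.0, -100.0, 3_000_000.0)
other = (100.0 + 1e-12, 0.0, 4_000_000.0, 0.0, -100.0, 3_000_000.0)
print(ref == other)                  # False: exact comparison rejects the grid
print(transforms_close(ref, other))  # True: tolerance accepts it
```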
365
+
366
+ ### 0.1.0
367
+
368
+ - Initial release.