aquamatch 0.1.0__tar.gz
- aquamatch-0.1.0/PKG-INFO +322 -0
- aquamatch-0.1.0/README.md +284 -0
- aquamatch-0.1.0/aquamatch/__init__.py +0 -0
- aquamatch-0.1.0/aquamatch/acolite_spec.py +1095 -0
- aquamatch-0.1.0/aquamatch/insitu_data.py +397 -0
- aquamatch-0.1.0/aquamatch/pipeline_config.py +1110 -0
- aquamatch-0.1.0/aquamatch/scl_water.py +463 -0
- aquamatch-0.1.0/aquamatch/sentinel_data.py +513 -0
- aquamatch-0.1.0/pyproject.toml +42 -0
aquamatch-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,322 @@
Metadata-Version: 2.4
Name: aquamatch
Version: 0.1.0
Summary: Python package and script to validate, model water quality parameters with remote sensing data
License: MIT
Keywords: remote sensing,water quality,sentinel-2,acolite
Author: Felipe Sodré
Author-email: felipe.b4rros@gmail.com
Requires-Python: >=3.12,<4.0
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: GIS
Requires-Dist: boto3 (>=1.40.42,<2.0.0)
Requires-Dist: dask (>=2026.3.0,<2027.0.0)
Requires-Dist: geopandas (>=1.1.1,<2.0.0)
Requires-Dist: matplotlib (>=3.10.9,<4.0.0)
Requires-Dist: mgrs (>=1.5.4,<2.0.0)
Requires-Dist: netcdf4 (>=1.7.4,<2.0.0)
Requires-Dist: openpyxl (>=3.1.5,<4.0.0)
Requires-Dist: pandas (>=2.3.3,<3.0.0)
Requires-Dist: pystac-client (>=0.9.0,<0.10.0)
Requires-Dist: python-dotenv (>=1.2.2,<2.0.0)
Requires-Dist: rasterio (>=1.4.3,<2.0.0)
Requires-Dist: requests (>=2.32.5,<3.0.0)
Requires-Dist: rioxarray (>=0.22.0,<0.23.0)
Requires-Dist: sentinelhub (>=3.11.2,<4.0.0)
Requires-Dist: shapely (>=2.1.2,<3.0.0)
Requires-Dist: xarray (>=2024.1.0)
Requires-Dist: zarr (>=3.2.1,<4.0.0)
Project-URL: Homepage, https://github.com/FelipeSBarros/aquamatch
Project-URL: Repository, https://github.com/FelipeSBarros/aquamatch
Description-Content-Type: text/markdown

# Río Negro Matchup

Python package and scripts to match Sentinel-2 satellite imagery with in situ water quality field measurements, apply atmospheric correction, and validate remote sensing water quality products.

## Overview



**Color coding**: teal for the five pipeline steps, gray/neutral for data artifacts (CSVs, SAFE folders, outputs), amber for the YAML orchestration layer, and purple for the SCL/datacube components.

**Dashed arrows**: used for two relationships that are optional or indirect: the SCL polygon clip path (only when `use_scl=True`), and the Step 5 orchestration edges back to Steps 1–4 (since the YAML config drives the others rather than receiving data from them).

---

## Installation

**Requirements:** Python ≥ 3.12

Clone the repository and install dependencies with [Poetry](https://python-poetry.org/):

```bash
git clone https://github.com/FelipeSBarros/aquamatch.git
cd aquamatch
poetry install
```

Or with pip (this installs from the package metadata and does not read the Poetry lock file):

```bash
pip install .
```

> `pyyaml` is required for the pipeline config system. It is not listed under `Requires-Dist` in this release's metadata, so install it explicitly (`pip install pyyaml`) if it is missing from your environment.

---

## Environment Setup

Create a `.env` file in the project root with your API credentials before running any step:

```env
SH_CLIENT_ID=your_sentinelhub_client_id
SH_CLIENT_SECRET=your_sentinelhub_client_secret
DATASPACE_ACCESS_KEY=your_copernicus_dataspace_access_key
DATASPACE_SECRET_KEY=your_copernicus_dataspace_secret_key
```

See the [Copernicus Dataspace documentation](https://documentation.dataspace.copernicus.eu/APIs/S3.html#example-script-to-download-product-using-boto3) for details on obtaining your access key and secret.

---

## Step-by-step Workflow

### Step 1: Prepare in situ data

Reads field campaign data from the [OAN](https://www.ambiente.gub.uy/iSIA_OAN/), cleans measurement values, assigns each station its Sentinel-2 tile, and produces two outputs:

- `campaigns_organized.csv`: full cleaned dataset for analysis
- `campaigns_unique_data.csv`: one row per unique (date, tile) pair, used to drive the satellite search

```bash
python aquamatch/insitu_data.py --mode campaigns
```

To use files in non-default locations:

```bash
python aquamatch/insitu_data.py --mode campaigns \
  --stations data/original_data/my_stations.xlsx \
  --campaigns data/original_data/my_export.xlsx
```

> The `--skip-clean` flag is available if the OAN export has already been cleaned before download. See [OAN's documentation](https://www.ambiente.gub.uy/iSIA_OAN/guia.html).
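
To make the second output concrete, the following sketch reduces a list of field records to one row per (date, tile) pair. It is an illustration only, not the package's code: the `date` key and the station names are assumptions, while `s2_tile` is the column name this README uses later.

```python
def unique_date_tile_pairs(records):
    """Keep one entry per (date, s2_tile) pair, in order of first appearance."""
    seen = set()
    unique = []
    for record in records:
        key = (record["date"], record["s2_tile"])
        if key not in seen:
            seen.add(key)
            unique.append({"date": key[0], "s2_tile": key[1]})
    return unique


# Hypothetical field records: two stations share a tile on the same day.
records = [
    {"station": "RN1", "date": "2025-08-01", "s2_tile": "21HUD"},
    {"station": "RN2", "date": "2025-08-01", "s2_tile": "21HUD"},
    {"station": "RN3", "date": "2025-08-01", "s2_tile": "21HVD"},
]
pairs = unique_date_tile_pairs(records)
print(len(pairs))  # 2
```

Collapsing stations this way means each tile is searched once per field date, however many stations fall inside it.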

---

### Step 2: Build the satellite catalog

Searches for Sentinel-2 L1C scenes that match each field date and location from `campaigns_unique_data.csv`. Only scenes whose MGRS tile matches the station's assigned tile are kept. For each L1C scene, the corresponding L2A scene is looked up to retrieve the SCL (Scene Classification) asset URL.

The result is a `sentinel_catalog.json` file listing matched scenes per field date.

```bash
python aquamatch/sentinel_data.py --mode catalog \
  --csv data/monitoring_data/campaigns_unique_data.csv \
  --time-delta 2 \
  --cloud-cover 20
```
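
The three filters described above (date window, tile match, cloud cover) can be sketched as a single predicate. This is a simplified stand-in for the real catalog search, which queries a remote service. The dictionary keys are hypothetical, the symmetric window follows the "±2 days" wording used elsewhere in this README, and treating both thresholds as inclusive is an assumption.

```python
from datetime import date


def matches(field_date, station_tile, scene, time_delta_days=2, cloud_cover_max=20.0):
    """Return True if a scene passes the date window, tile, and cloud filters."""
    day_gap = abs((scene["date"] - field_date).days)
    return (
        day_gap <= time_delta_days
        and scene["tile"] == station_tile
        and scene["cloud_cover"] <= cloud_cover_max
    )


# Hypothetical candidate scenes for a field visit on 2025-08-01 in tile 21HUD.
scenes = [
    {"id": "a", "date": date(2025, 8, 2), "tile": "21HUD", "cloud_cover": 5.0},
    {"id": "b", "date": date(2025, 8, 5), "tile": "21HUD", "cloud_cover": 1.0},    # outside window
    {"id": "c", "date": date(2025, 7, 31), "tile": "21HVD", "cloud_cover": 3.0},   # wrong tile
    {"id": "d", "date": date(2025, 7, 30), "tile": "21HUD", "cloud_cover": 60.0},  # too cloudy
]
kept = [s["id"] for s in scenes if matches(date(2025, 8, 1), "21HUD", s)]
print(kept)  # ['a']
```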

---

### Step 3: Download imagery

Downloads the SAFE products and SCL assets listed in the catalog. Already-downloaded scenes are skipped automatically.

```bash
python aquamatch/sentinel_data.py --mode download \
  --download-scl
```

> You can run both steps (build catalog and download images) using `--mode all`.
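
One simple way to implement skipping is to look for each scene's folder before requesting it. The helper below is a hypothetical sketch of that idea, assuming a scene counts as downloaded when an entry with its name exists in the download directory; the package may use a stricter check, such as verifying file sizes.

```python
from pathlib import Path


def pending_downloads(scene_names, download_dir):
    """Return the scene names that have no matching entry in download_dir yet."""
    download_dir = Path(download_dir)
    return [name for name in scene_names if not (download_dir / name).exists()]
```

For example, `pending_downloads(names, "data/sentinel_downloads")` would return only the names still missing from that folder.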

---

### Step 4: Atmospheric correction

Runs [ACOLITE](https://github.com/acolite/acolite) on the downloaded SAFE folders to produce surface reflectance and water quality products (turbidity, SPM, chlorophyll-a, and others) as NetCDF files.

```python
from aquamatch.acolite_spec import AcoliteConfig, IOConfig

cfg = AcoliteConfig(
    acolite_executable="/path/to/acolite",
    io=IOConfig(
        inputfile="data/sentinel_downloads/S2A_MSIL1C_20170713T135111_N0500_R024_T21HUD.SAFE",
        output="data/acolite_output",
        limit=(-33.25, -58.45, -33.17, -58.33),  # S, W, N, E
    ),
)

result = cfg.run()
```

For SCL-based water masking, use `with_scl_polygon()` to restrict processing to water pixels only:

```python
result = cfg.with_scl_polygon(
    "data/sentinel_downloads/scl/S2B_MSIL1C_20200513T135109_N0500_R024_T21HVD_20230430T050652_SCL.tif"
).run()
```

---

## Run the full pipeline from a YAML config

The pipeline can also be driven entirely from a single YAML file: one file per campaign, version-controlled alongside your data.

**Generate a template:**

```bash
python -m aquamatch.pipeline_config --generate campaign_2025.yaml
```

The generated file includes every parameter at its default value, with inline comments documenting units and valid options. Edit it for your campaign, then run:

```bash
python -m aquamatch.pipeline_config --run campaign_2025.yaml
```

Individual steps can be disabled by setting `enabled: false`:

```yaml
insitu:
  enabled: false  # skip: already prepared

sentinel:
  enabled: true
  time_delta_days: 2
  cloud_cover_max: 20

acolite:
  enabled: true
  acolite_executable: /path/to/acolite/acolite.py
  scl:
    use_scl: true
    min_area_m2: 5000

tiles:
  21HUD:
    polygon: data/polygons/21HUD.geojson
  21HVD:
    limit: [-34.2, -56.8, -33.0, -55.1]
```

**Dry-run** (validate config and log steps without executing):

```bash
python -m aquamatch.pipeline_config --run campaign_2025.yaml --dry-run
```

**Force reprocess** (ignore existing outputs and reprocess all scenes):

```bash
python -m aquamatch.pipeline_config --run campaign_2025.yaml --force
```

### Per-tile spatial restrictions

The `tiles:` section of the config lets you define a spatial restriction for each Sentinel-2 MGRS tile, so the same boundary is applied consistently across every scene processed for that tile, with no need to specify it on each run.

For each tile, set either `polygon` (a GeoJSON or WKT file path) or `limit` (a `[south, west, north, east]` bounding box in decimal degrees), or omit the tile entirely to process the full scene.

```yaml
tiles:
  21HUD:
    polygon: data/polygons/21HUD.geojson  # hand-drawn or pre-processed boundary
  21HVD:
    limit: [-34.2, -56.8, -33.0, -55.1]   # bounding box [S, W, N, E]
  21HWD:
    # no entry: full scene processed
```

The restriction is resolved per scene during atmospheric correction following this precedence order:

1. **Static polygon** from `tiles:`, the highest priority. If a tile has a polygon configured, it is applied directly to ACOLITE and SCL-based clipping (`use_scl`) is suppressed for that tile, since the static polygon already defines the water boundary precisely.
2. **SCL-derived polygon** (`use_scl: true`), used when the tile has no static polygon. A water mask is extracted from the SCL asset and applied as the processing boundary.
3. **Static limit** from `tiles:`, applied when no polygon is available from either source above.
4. **No restriction**: the full scene is processed when the tile is not listed in `tiles:` and SCL clipping is disabled or unavailable.

> The tile ID is extracted automatically from the SAFE folder filename (e.g. `T21HUD` in `S2A_MSIL1C_20250801T101031_N0500_R024_T21HUD_...SAFE`), so no manual mapping between files and tiles is needed.
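
As an illustration of how such an extraction might work, the sketch below pulls the tile field out of a SAFE folder name with a regular expression. The function name and pattern are assumptions rather than the package's implementation. It returns the ID without the leading `T`, on the assumption that this is the form used as keys in `tiles:` (for example `21HUD`).

```python
import re

# "_T" + two-digit UTM zone + three letters, followed by "_" or "."
TILE_PATTERN = re.compile(r"_T(\d{2}[A-Z]{3})(?:_|\.)")


def tile_id_from_safe_name(name):
    """Return the MGRS tile ID (without the leading 'T'), or None if absent."""
    match = TILE_PATTERN.search(name)
    return match.group(1) if match else None


print(tile_id_from_safe_name("S2A_MSIL1C_20170713T135111_N0500_R024_T21HUD.SAFE"))
# 21HUD
```

Requiring the underscore before `T` keeps the pattern from matching the `T` inside the timestamp fields.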

---

## Programmatic usage

For scripting and integration into custom workflows, all pipeline steps can be called directly without a config file.

```bash
# Step 1: prepare in situ data
python aquamatch/insitu_data.py --mode campaigns

# Step 2: build catalog (±2 days, max 20% cloud cover)
python aquamatch/sentinel_data.py --mode catalog \
  --csv data/monitoring_data/campaigns_unique_data.csv \
  --time-delta 2 \
  --cloud-cover 20

# Step 3: download imagery and SCL assets
python aquamatch/sentinel_data.py --mode download \
  --download-scl
```

```python
from pathlib import Path
from aquamatch.acolite_spec import AcoliteConfig, IOConfig
from aquamatch.pipeline_config import TilesSection

cfg = AcoliteConfig(
    acolite_executable="/path/to/acolite",
    io=IOConfig(inputfile="", output=""),
)

safe_list = sorted(Path("data/sentinel_downloads").glob("*.SAFE"))
scl_dir = Path("data/sentinel_downloads/scl")

# Define per-tile spatial restrictions
tiles = TilesSection.from_dict({
    "21HUD": {"polygon": "data/polygons/21HUD.geojson"},
    "21HVD": {"limit": [-34.2, -56.8, -33.0, -55.1]},
})

results = cfg.run_batch(
    safe_list=safe_list,
    base_output="data/acolite_output",
    use_scl=True,
    scl_dir=scl_dir,
    scl_kwargs={"min_area_m2": 5000},
    tile_config=tiles,
    continue_on_error=True,
    skip_existing=True,  # set False to reprocess all scenes
)
```

You can also resolve tile restrictions per row when building configs from campaign data:

```python
from pathlib import Path

import pandas as pd

from aquamatch.acolite_spec import AcoliteConfig
from aquamatch.pipeline_config import TilesSection

tiles = TilesSection.from_dict({
    "21HUD": {"polygon": "data/polygons/21HUD.geojson"},
    "21HVD": {"limit": [-34.2, -56.8, -33.0, -55.1]},
})

# row is a pandas Series from campaigns_unique_data.csv (first row used as an example)
campaigns = pd.read_csv("data/monitoring_data/campaigns_unique_data.csv")
row = campaigns.iloc[0]

# Example SAFE folder; point this at the scene matched to the row
safe_path = Path("data/sentinel_downloads/S2A_MSIL1C_20170713T135111_N0500_R024_T21HUD.SAFE")

cfg = AcoliteConfig.from_campaigns_row(
    row=row,
    acolite_executable="/path/to/acolite",
    base_output="data/acolite_output",
    inputfile=str(safe_path),
    tile_config=tiles,
)
```

The spatial restriction is resolved automatically from `row["s2_tile"]`. If `tile_config` is omitted, the original behaviour applies: a 0.1° bounding box is derived from the row's `latitud`/`longitud` coordinates.
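
For readers unsure what the fallback box looks like, the sketch below builds one around a station coordinate. It assumes that 0.1° is the full side length of a box centred on the point; the package may instead treat it as a half-width, so check the source before relying on the exact extent. The function name is hypothetical, and the output order follows the `[S, W, N, E]` convention used above.

```python
def bbox_around_point(lat, lon, size_deg=0.1):
    """Return (south, west, north, east) for a square box centred on (lat, lon)."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: lat={lat}, lon={lon}")
    half = size_deg / 2
    return (lat - half, lon - half, lat + half, lon + half)


# Hypothetical station coordinate in decimal degrees.
south, west, north, east = bbox_around_point(-33.2, -58.4)
print(round(south, 2), round(west, 2), round(north, 2), round(east, 2))
```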
File without changes