grdl-te 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- grdl_te-0.1.0/LICENSE +19 -0
- grdl_te-0.1.0/PKG-INFO +354 -0
- grdl_te-0.1.0/README.md +332 -0
- grdl_te-0.1.0/grdl_te/__init__.py +61 -0
- grdl_te-0.1.0/grdl_te/__main__.py +135 -0
- grdl_te-0.1.0/grdl_te/benchmarking/__init__.py +65 -0
- grdl_te-0.1.0/grdl_te/benchmarking/active.py +226 -0
- grdl_te-0.1.0/grdl_te/benchmarking/base.py +154 -0
- grdl_te-0.1.0/grdl_te/benchmarking/component.py +264 -0
- grdl_te-0.1.0/grdl_te/benchmarking/models.py +656 -0
- grdl_te-0.1.0/grdl_te/benchmarking/report.py +560 -0
- grdl_te-0.1.0/grdl_te/benchmarking/source.py +308 -0
- grdl_te-0.1.0/grdl_te/benchmarking/store.py +259 -0
- grdl_te-0.1.0/grdl_te/benchmarking/suite.py +2499 -0
- grdl_te-0.1.0/grdl_te.egg-info/PKG-INFO +354 -0
- grdl_te-0.1.0/grdl_te.egg-info/SOURCES.txt +19 -0
- grdl_te-0.1.0/grdl_te.egg-info/dependency_links.txt +1 -0
- grdl_te-0.1.0/grdl_te.egg-info/requires.txt +13 -0
- grdl_te-0.1.0/grdl_te.egg-info/top_level.txt +1 -0
- grdl_te-0.1.0/pyproject.toml +96 -0
- grdl_te-0.1.0/setup.cfg +4 -0
grdl_te-0.1.0/LICENSE
ADDED
@@ -0,0 +1,19 @@
MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
grdl_te-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,354 @@
Metadata-Version: 2.4
Name: grdl-te
Version: 0.1.0
Summary: Testing and Evaluation suite for GRDL library using real-world data
Author: Steven Siebert, Ava Courtney
License: MIT
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: grdl
Requires-Dist: pytest>=7.0
Requires-Dist: pytest-cov>=4.0
Requires-Dist: numpy>=1.21
Requires-Dist: h5py>=3.0
Requires-Dist: rasterio>=1.3
Provides-Extra: benchmarking
Requires-Dist: grdl-runtime; extra == "benchmarking"
Provides-Extra: dev
Requires-Dist: pytest-benchmark>=4.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.0; extra == "dev"
Dynamic: license-file

# GRDL-TE: Testing, Evaluation & Benchmarking

GRDL-TE is the validation and benchmarking suite for [GRDL](../grdl/) (GEOINT Rapid Development Library). It serves two purposes:

1. **Validation** — tests GRDL's public API against real-world satellite data with three-level validation (format, quality, integration).
2. **Benchmarking** — profiles GRDL workflows and individual components, aggregates metrics across runs, and persists results for regression detection and cross-hardware comparison.

GRDL-TE is a *consumer* of GRDL — it only imports the public API and never modifies GRDL internals.

## Architecture

```
grdx/
├── grdl/             # Core library — readers, filters, transforms, geolocation
├── grdl-runtime/     # Workflow execution engine (DAG orchestration, YAML pipelines)
├── grdk/             # GUI toolkit (Orange3 widgets, napari viewers)
└── grdl-te/          # This package — validation tests + benchmark profiling
```

| Layer | Package | Role |
|-------|---------|------|
| **Library** | `grdl` | Modular building blocks for GEOINT image processing |
| **Runtime** | `grdl-runtime` | Headless workflow executor, YAML pipeline loader |
| **T&E** | `grdl-te` | Correctness validation and performance profiling against `grdl` |

## Setup

### Environment

GRDL-TE shares the `grdl` conda environment with all GRDX repositories:

```bash
conda activate grdl
```

### Installation

```bash
# Core — models, store, component benchmarks, all tests
pip install -e .

# With workflow benchmarking (requires grdl-runtime)
pip install -e ".[benchmarking]"

# With dev tools (pytest-benchmark, pytest-xdist)
pip install -e ".[dev]"
```

### Dependencies

| Package | Version | Purpose |
|---------|---------|---------|
| `grdl` | latest | Library under test |
| `pytest` | >=7.0 | Test framework |
| `pytest-cov` | >=4.0 | Coverage reporting |
| `numpy` | >=1.21 | Array operations |
| `h5py` | >=3.0 | HDF5 format support |
| `rasterio` | >=1.3 | GeoTIFF/NITF support (via GDAL) |

**Optional:**

| Package | Install extra | Purpose |
|---------|--------------|---------|
| `grdl-runtime` | `benchmarking` | Active workflow benchmarking |
| `pytest-benchmark` | `dev` | Benchmark comparison |
| `pytest-xdist` | `dev` | Parallel test execution (`-n auto`) |

## Validation Suite

Three-level validation against real-world satellite data (552 tests, 38 test files):

| Level | Scope | Examples |
|-------|-------|----------|
| **L1 — Format** | Reader instantiation, metadata, shape/dtype, chip reads, resource cleanup | SICD complex64 dtype, GeoTIFF COG tiling |
| **L2 — Quality** | CRS projection, value ranges, NoData masking, format-specific features | UTM zone validation, 15-bit reflectance ceilings, SAR speckle statistics |
| **L3 — Integration** | Multi-component pipelines (chip, normalize, tile, detect) | ChipExtractor → Normalizer → batch validation |

Tests skip gracefully when data is absent (`pytest.skip` with download instructions). When data is present, tests produce a genuine pass/fail — never a false pass.
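The skip-on-missing-data pattern can be sketched as a small helper. This is a hypothetical illustration, not GRDL-TE's actual fixture code; `DATA_DIR`, `require_data`, and the Landsat filename are assumptions.

```python
from pathlib import Path

import pytest

# Hypothetical location of the real-data directory
DATA_DIR = Path("data")


def require_data(relative: str) -> Path:
    """Return the dataset path, or skip the calling test with instructions."""
    path = DATA_DIR / relative
    if not path.exists():
        pytest.skip(f"Missing dataset {path}; see the dataset README for download instructions")
    return path


@pytest.mark.landsat
@pytest.mark.requires_data
def test_geotiff_band_present():
    band = require_data("landsat/B4.TIF")  # hypothetical filename
    assert band.stat().st_size > 0
```

Because `pytest.skip` raises rather than returns, a missing file can never fall through to a false pass.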

Each `data/<dataset>/README.md` contains download instructions, expected file properties, and format specifications.

### Running Tests

```bash
conda activate grdl

# Full suite (missing data files skip cleanly)
pytest

# Specific reader
pytest tests/validation/test_io_geotiff.py -v       # Landsat
pytest tests/validation/test_io_nitf.py -v          # Umbra SICD
pytest tests/validation/test_io_sentinel1.py -v     # Sentinel-1

# Geolocation tests
pytest tests/validation/test_geolocation_base.py tests/validation/test_geolocation_utils.py -v
pytest tests/validation/test_geolocation_affine_real.py -v

# Processing tests
pytest tests/validation/test_detection_cfar.py -v
pytest tests/validation/test_decomposition_halpha.py -v
pytest tests/validation/test_sar_image_formation.py -v

# Benchmarking infrastructure tests
pytest tests/benchmarking/ -v

# By marker
pytest -m landsat                # All Landsat tests
pytest -m viirs                  # All VIIRS tests
pytest -m geolocation            # All geolocation tests
pytest -m integration            # Only Level 3 integration tests
pytest -m "nitf and not slow"    # NITF tests, skip slow ones
pytest -m benchmark              # Benchmarking infrastructure tests
pytest -m sar                    # SAR processing tests
pytest -m detection              # Detection algorithm tests
pytest -m decomposition          # Polarimetric decomposition tests
pytest -m interpolation          # Interpolation tests

# Skip all data-dependent tests
pytest -m "not requires_data"
```

### Test Markers

| Marker | Purpose |
|--------|---------|
| `landsat` | Landsat 8/9 tests (GeoTIFFReader) |
| `viirs` | VIIRS VNP09GA tests (HDF5Reader) |
| `sentinel2` | Sentinel-2 tests (JP2Reader) |
| `nitf` | Umbra SICD tests (NITFReader) |
| `cphd` | CPHD format tests |
| `crsd` | CRSD format tests |
| `sidd` | SIDD format tests |
| `sentinel1` | Sentinel-1 SLC tests |
| `aster` | ASTER L1T tests |
| `biomass` | BIOMASS L1 tests |
| `terrasar` | TerraSAR-X/TanDEM-X tests |
| `geolocation` | Geolocation utility and coordinate transform tests |
| `elevation` | Elevation model tests |
| `requires_data` | Test requires real data files in `data/` |
| `slow` | Long-running test (large file reads, full pipelines) |
| `integration` | Level 3 tests (ChipExtractor, Normalizer, Tiler workflows) |
| `benchmark` | Performance benchmark tests |
| `sar` | SAR-specific processing tests |
| `image_formation` | SAR image formation tests |
| `detection` | Detection model tests |
| `cfar` | CFAR detector tests |
| `decomposition` | Polarimetric decomposition tests |
| `ortho` | Orthorectification tests |
| `coregistration` | CoRegistration tests |
| `interpolation` | Interpolation algorithm tests |

## Benchmarking

### CLI Benchmark Suite

Run the full benchmark suite from the command line:

```bash
python -m grdl_te                              # medium arrays, 10 iterations
python -m grdl_te --size small -n 5            # quick run
python -m grdl_te --size large -n 20           # thorough run
python -m grdl_te --only filters intensity     # specific benchmark groups
python -m grdl_te --skip-workflow              # component benchmarks only
python -m grdl_te --store-dir ./results        # custom output directory
python -m grdl_te --report                     # print report to terminal
python -m grdl_te --report ./reports/          # save report to directory
python -m grdl_te --report ./my_report.txt     # save report to file
```

**Array size presets:**

| Preset | Dimensions |
|--------|------------|
| `small` | 512 x 512 |
| `medium` | 2048 x 2048 |
| `large` | 4096 x 4096 |
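The preset table maps naturally onto a small generator for synthetic test arrays. A minimal sketch, assuming numpy; `PRESETS` and `synthetic_image` are illustrative names, not GRDL-TE's API:

```python
import numpy as np

# Size presets mirroring the table above (name -> square dimension)
PRESETS = {"small": 512, "medium": 2048, "large": 4096}


def synthetic_image(preset: str, complex_data: bool = False, seed: int = 0) -> np.ndarray:
    """Generate a square random test array for the named size preset."""
    n = PRESETS[preset]
    rng = np.random.default_rng(seed)
    if complex_data:
        # SAR-like complex samples: independent Gaussian I and Q channels
        iq = rng.standard_normal((2, n, n))
        return (iq[0] + 1j * iq[1]).astype(np.complex64)
    return rng.random((n, n), dtype=np.float32)
```

A fixed seed keeps timings comparable across runs, since every iteration sees statistically identical input.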

**Benchmark groups (13):**

| Group | Coverage |
|-------|----------|
| `filters` | Mean, Gaussian, Median, Min, Max, StdDev, Lee, ComplexLee, PhaseGradient |
| `intensity` | ToDecibels, PercentileStretch |
| `decomposition` | Pauli, DualPolHAlpha, SublookDecomposition |
| `detection` | CA-CFAR, GO-CFAR, SO-CFAR, OS-CFAR, DetectionSet, Fields |
| `sar` | MultilookDecomposition, CSIProcessor |
| `image_formation` | CollectionGeometry, PolarGrid, PFA, RDA, StripmapPFA, FFBP, SubaperturePartitioner |
| `ortho` | Orthorectifier, OutputGrid, OrthoPipeline, compute_output_resolution |
| `coregistration` | Affine, FeatureMatch, Projective |
| `io` | 22 readers/writers (GeoTIFF, HDF5, NITF, JP2, SICD, CPHD, CRSD, SIDD, Sentinel-1/2, ASTER, BIOMASS, TerraSAR-X) |
| `geolocation` | Affine, GCP, SICD, Sentinel-1 SLC, NoGeolocation |
| `interpolation` | Lanczos, KaiserSinc, Lagrange, Farrow, Polyphase, ThiranDelay |
| `data_prep` | ChipExtractor, Tiler, Normalizer |
| `pipeline` | Sequential Pipeline composition |

### Active Workflow Benchmarking

Run a `grdl-runtime` Workflow N times, aggregate per-step metrics, and persist results:

```python
from grdl_rt import Workflow
from grdl_rt.api import load_workflow
from grdl_te.benchmarking import ActiveBenchmarkRunner, BenchmarkSource, JSONBenchmarkStore
# SICDReader, SublookDecomposition, ToDecibels are grdl components
# (imported from grdl; exact module paths omitted in this example)

store = JSONBenchmarkStore()

# ==== Pass a declared workflow ====
wf = (
    Workflow("SAR Pipeline", modalities=["SAR"])
    .reader(SICDReader)
    .step(SublookDecomposition, num_looks=3)
    .step(ToDecibels)
)
runner = ActiveBenchmarkRunner(wf, iterations=10, warmup=2, store=store)
record = runner.run(source="image.nitf", prefer_gpu=True)

# ==== Load a YAML workflow ====
wf = load_workflow("path/to/my_workflow.yaml")
source = BenchmarkSource.synthetic("medium")

runner = ActiveBenchmarkRunner(
    workflow=wf, source=source, iterations=5, warmup=1, store=store,
)
record = runner.run()

# record.total_wall_time.mean, .stddev, .p95
# record.step_results[0].wall_time_s.mean
# record.hardware.cpu_count, .gpu_devices
```

### Component Benchmarking

Profile individual GRDL functions outside of a workflow context:

```python
import numpy as np

from grdl.data_prep import Normalizer
from grdl_te.benchmarking import ComponentBenchmark

image = np.random.rand(4096, 4096).astype(np.float32)
norm = Normalizer(method='minmax')

bench = ComponentBenchmark(
    name="Normalizer.minmax.4k",
    fn=norm.normalize,
    setup=lambda: ((image,), {}),
    iterations=20,
    warmup=3,
)
record = bench.run()
```
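Under the hood, a component benchmark of this shape reduces to a warmup-then-measure loop. A minimal stdlib-only sketch of the idea — not GRDL-TE's actual implementation:

```python
import statistics
import time


def time_component(fn, setup, iterations=20, warmup=3):
    """Warm up, then time each call with perf_counter and aggregate the samples."""
    samples = []
    for i in range(warmup + iterations):
        args, kwargs = setup()              # fresh inputs for every call
        start = time.perf_counter()
        fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        if i >= warmup:                     # warmup calls are discarded
            samples.append(elapsed)
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.fmean(samples),
        "stddev": statistics.stdev(samples),
    }
```

The warmup calls absorb one-time costs (JIT, cache population, lazy imports) so that the retained samples measure steady-state performance.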

### Benchmark Data Sources

```python
from grdl_te.benchmarking import BenchmarkSource

# Synthetic data (lazy generation with caching)
source = BenchmarkSource.synthetic("medium")   # 2048x2048
source = BenchmarkSource.synthetic("small")    # 512x512
source = BenchmarkSource.synthetic("large")    # 4096x4096

# Real data file
source = BenchmarkSource.from_file("path/to/image.nitf")

# Existing array
source = BenchmarkSource.from_array(my_array)
```

### Result Storage

Results are stored as JSON files in `.benchmarks/`:

```
.benchmarks/
  index.json          # lightweight index for fast filtering
  records/
    <uuid>.json       # full BenchmarkRecord per run
```

Each `BenchmarkRecord` captures the `HardwareSnapshot` (CPU, RAM, GPU, platform), per-step `AggregatedMetrics` (min, max, mean, median, stddev, p95), and raw per-iteration measurements for lossless post-hoc analysis.
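Because the store is plain JSON, records can be analyzed with nothing but the standard library. A sketch of post-hoc loading, assuming the layout above; the JSON field names (`step_results`, `wall_time_s`, `name`) are assumptions about the serialized schema:

```python
import json
from pathlib import Path


def load_records(store_dir: str = ".benchmarks") -> list:
    """Read every full benchmark record under records/ (one JSON file per run)."""
    records_dir = Path(store_dir) / "records"
    if not records_dir.is_dir():
        return []
    return [json.loads(p.read_text()) for p in sorted(records_dir.glob("*.json"))]


def slowest_steps(record: dict, top: int = 3) -> list:
    """Rank a record's steps by mean wall time (field names assumed)."""
    steps = record.get("step_results", [])
    return sorted(steps, key=lambda s: s["wall_time_s"]["mean"], reverse=True)[:top]
```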

### Example Workflow

The `workflows/` directory contains example `grdl-runtime` workflow definitions. `comprehensive_benchmark_workflow.yaml` defines a 28-step SAR processing pipeline covering complex speckle filtering, phase gradient analysis, amplitude conversion, CFAR detection, and conditional orthorectification.

`benchmark_examples.py` demonstrates active workflow benchmarking with `ActiveBenchmarkRunner` at multiple array scales.

## Project Status

**Component coverage: 78/78 (100%)**

All public GRDL components have both a dedicated benchmark in `suite.py` and a correctness validation test in `tests/validation/`. See [BENCHMARK_COVERAGE_GAPS.md](BENCHMARK_COVERAGE_GAPS.md) for the full inventory.

| Metric | Value |
|--------|-------|
| Benchmarked components | 78/78 |
| Benchmark groups | 13 |
| Validation test files | 32 |
| Benchmark infrastructure tests | 6 |
| YAML workflow steps | 28 |
| Array size presets | small (512), medium (2048), large (4096) |

### Active Development

- **Passive Monitoring** — `ExecutionHook` for capturing metrics from production workflows
- **Regression Detection** — cross-run comparison with configurable thresholds
- **Cross-Hardware Prediction** — collect results from different machines, predict performance on new hardware
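The planned regression detection amounts to comparing per-benchmark mean timings between two runs against a fractional threshold. A minimal sketch under that assumption (the function name and input shape are illustrative, not the planned API):

```python
def find_regressions(baseline: dict, current: dict, threshold: float = 0.10) -> dict:
    """Compare per-benchmark mean wall times (name -> seconds) between two runs.

    Returns {name: (baseline_mean, current_mean)} for every benchmark whose
    current mean exceeds the baseline by more than the fractional threshold.
    """
    return {
        name: (baseline[name], mean)
        for name, mean in current.items()
        if name in baseline and mean > baseline[name] * (1.0 + threshold)
    }
```

Comparisons are only meaningful between records with matching `HardwareSnapshot`s, which is one reason each record stores its hardware context.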

## Dependency Management

### Source of Truth: `pyproject.toml`

All dependencies are defined in `pyproject.toml`. Keep these files synchronized:

- **`pyproject.toml`** — source of truth for versions and dependencies
- **`requirements.txt`** — regenerate with `pip freeze > requirements.txt` after updating `pyproject.toml`

**Note:** GRDL-TE is a **validation suite, not a published library**, so there is no `.github/workflows/publish.yml` or PyPI versioning requirement.

### Updating Dependencies

1. Update dependencies in `pyproject.toml` (add new packages, change versions, create/rename extras)
2. Install dependencies: `pip install -e ".[all,dev]"` (or appropriate extras for your work)
3. If `requirements.txt` exists, regenerate it: `pip freeze > requirements.txt`
4. Commit both files

See [CLAUDE.md](CLAUDE.md#dependency-management) for detailed dependency management guidelines.

## License

MIT License — see [LICENSE](LICENSE) for details.
grdl_te-0.1.0/README.md
ADDED
@@ -0,0 +1,332 @@