helioqc 0.1.0__tar.gz

@@ -0,0 +1,33 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *.egg-info/
+ .eggs/
+ dist/
+ build/
+ *.egg
+
+ # Virtual environments
+ .venv/
+ venv/
+
+ # Test / tooling
+ .tox/
+ .pytest_cache/
+ .mypy_cache/
+ .ruff_cache/
+ htmlcov/
+ .coverage
+
+ # IDE / OS
+ .idea/
+ .vscode/
+ .DS_Store
+
+ # Secrets and local artefacts
+ .env
+ tmp/
+
+ # Notebook outputs
+ .ipynb_checkpoints/
+ *.png
helioqc-0.1.0/LICENSE ADDED
@@ -0,0 +1 @@
+ TBD — License text to be added.
helioqc-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,167 @@
+ Metadata-Version: 2.3
+ Name: helioqc
+ Version: 0.1.0
+ Summary: Visual quality-control diagnostics for solar irradiance measurements (HelioQC multipanel).
+ Author-email: Mines Paris - PSL <TBD@mines-paris.psl.eu>
+ License: TBD — License text to be added.
+ Keywords: irradiance,photovoltaic,quality-control,solar
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: Other/Proprietary License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Scientific/Engineering
+ Requires-Python: >=3.10
+ Requires-Dist: click<9,>=8.1
+ Requires-Dist: ipykernel>=7.3.0
+ Requires-Dist: jupyter>=1.1.1
+ Requires-Dist: matplotlib<4,>=3.8
+ Requires-Dist: netcdf4>=1.7.3
+ Requires-Dist: numpy<2,>=1.26
+ Requires-Dist: pandas<3,>=2.0
+ Requires-Dist: pvlib<1,>=0.11
+ Requires-Dist: python-dotenv<2,>=1.0
+ Requires-Dist: sg2<3,>=2.3
+ Requires-Dist: tqdm<5,>=4.66
+ Requires-Dist: xarray>=2025.6.1
+ Provides-Extra: dev
+ Requires-Dist: build<2,>=1.2; extra == 'dev'
+ Requires-Dist: netcdf4<2,>=1.6; extra == 'dev'
+ Requires-Dist: pkginfo<2,>=1.12; extra == 'dev'
+ Requires-Dist: pytest-mock<4,>=3.14; extra == 'dev'
+ Requires-Dist: pytest<9,>=8; extra == 'dev'
+ Requires-Dist: tox-uv<2,>=1; extra == 'dev'
+ Requires-Dist: tox<5,>=4.19; extra == 'dev'
+ Requires-Dist: twine<7,>=6; extra == 'dev'
+ Description-Content-Type: text/markdown
+
+ # HelioQC
+
+ HelioQC produces a unified multipanel visual diagnostic for solar irradiance quality control (GHI, DNI, DIF). It complements automated QC tests by revealing characteristic signatures of measurement anomalies across several coordinated views.
+
+ **DOI:** <TBD>
+
+ **License:** <TBD> (see [LICENSE](LICENSE))
+
+ **Authors:** Mines Paris - PSL — <TBD>
+
+ ## Installation
+
+ ```bash
+ pip install helioqc
+ ```
+
+ For a local checkout:
+
+ ```bash
+ pip install -e .
+ ```
+
+ ## Prerequisites
+
+ 1. **CAMS radiation service** — register at [soda-pro CAMS Radiation Service](https://www.soda-pro.com/web-services/radiation/cams-radiation-service), then provide your email as `CAMS_EMAIL` or `cams_email=`.
+ 2. **Google Static Maps API** (optional) — set `GOOGLE_API_KEY` for station map thumbnails. When absent, maps are skipped with a warning.
+
+ Copy [`.env.sample`](.env.sample) to `.env` and set your credentials:
+
+ ```bash
+ cp .env.sample .env
+ # edit .env — CAMS_EMAIL is required
+ ```
+
+ Integration tests and local runs read from `.env` at the project root.
+
+ ## Input data format
+
+ `visual_qc` expects a `pandas.DataFrame` with:
+
+ - a **UTC** `DatetimeIndex` (naive timestamps are assumed to be UTC)
+ - columns **`GHI`**, **`DHI`** or **`DIF`**, **`BNI`** or **`DNI`** (W/m²)
+
+ Filter the time range in Python before calling `visual_qc` (e.g. `df.loc["2021-01-01":"2025-12-31"]`).
+
+ Optional station metadata may be attached to `input_df.attrs` (e.g. after reading a NetCDF file):
+
+ | `attrs` key   | Description              |
+ |---------------|--------------------------|
+ | `latitude`    | Station latitude (°)     |
+ | `longitude`   | Station longitude (°)    |
+ | `elevation`   | Station elevation (m)    |
+ | `ClimateType` | Köppen-Geiger code       |
+ | `Country`     | Country name             |
+ | `station`     | Station name             |
+ | `ID`          | Station identifier       |
+ | `source`      | Measurement network      |
+
+ `latitude`, `longitude`, and `elevation` must be present in `attrs` unless they are passed explicitly as arguments.
+
+ ## Python API
+
+ ```python
+ from helioqc import visual_qc
+ import pandas as pd
+
+ df = pd.read_csv("measurements.csv", parse_dates=["time"], index_col="time")
+ df.index = df.index.tz_localize("UTC")
+
+ fig = visual_qc(
+     df,
+     latitude=36.624,
+     longitude=-116.019,
+     elevation=1007,
+     climate="BWh",
+     country="United States",
+     station="Desert Rock, Nevada",
+     station_id="DRA",
+     source="SURFRAD",
+     cams_email="you@example.com",
+ )
+ fig.savefig("helioqc_output.png", dpi=150, bbox_inches="tight")
+ ```
+
+ ### `visual_qc` synopsis
+
+ ```
+ visual_qc(input_df, latitude=None, longitude=None, elevation=None, cams_email=None, ...) → matplotlib.figure.Figure
+ ```
+
+ Returns a matplotlib `Figure` displayed automatically in Jupyter notebooks.
+
+ ## Command-line interface
+
+ ```bash
+ helioqc \
+   --latitude 36.624 --longitude -116.019 --elevation 1007 \
+   --climate BWh --country "United States" \
+   --station "Desert Rock, Nevada" --station-id DRA --source SURFRAD \
+   --cams-email you@example.com \
+   data/surfrad_dra_2025.csv.zip output.png
+ ```
+
+ `.csv`, `.csv.gz`, and `.csv.zip` inputs are supported. By default, the CLI analysis window spans the last five years of data, ending at the latest timestamp in the dataset; use `--start-date` / `--end-date` to override it.
+
+ Additional options: `--time-col`, `--time-format`, `--timezone`, `--ghi-col`, `--dni-col` (falls back to `BNI`), `--dif-col` (falls back to `DHI`), `--start-date`, `--end-date`.
+
+ ## Sample data
+
+ The repository includes one year of SURFRAD Desert Rock (DRA) measurements:
+
+ `data/surfrad_dra_2025.csv.zip`
+
+ Station metadata (latitude, climate, …) is passed as function/CLI parameters — see the API and CLI examples above.
+
+ ## Tests
+
+ ```bash
+ make test       # e2e (CAMS mocked) + unit tests
+ make test-unit  # fast wiring tests only
+ ```
+
+ End-to-end tests call sg2/SRTM/plotting for real (CAMS mocked) and write JPEG artefacts to `tmp/test_api.jpg` and `tmp/test_cli.jpg`.
+
+ ## Demo notebook
+
+ See [demo.ipynb](demo.ipynb) for a minimal workflow: fetch data from the Thredds server, attach metadata, and call `visual_qc()`.
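The README describes the default CLI window as the last five years of data, ending at the latest timestamp. That rule can be sketched with plain pandas as follows. The helper name `default_window` and the use of `pd.DateOffset` are assumptions made for illustration; the CLI code itself is not part of this diff, so its real implementation may differ.

```python
import pandas as pd


def default_window(df: pd.DataFrame, years: int = 5) -> pd.DataFrame:
    """Keep the last `years` years of rows, ending at the latest timestamp.

    Hypothetical helper: it illustrates the default window described in the
    README, not the CLI's actual implementation.
    """
    end = df.index.max()
    start = end - pd.DateOffset(years=years)
    # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
    return df.loc[start:end]


# Seven years of synthetic daily values on a UTC index.
index = pd.date_range("2019-01-01", "2025-12-31", freq="D", tz="UTC")
df = pd.DataFrame({"GHI": 0.0}, index=index)

window = default_window(df)
print(window.index.min(), window.index.max())
```

When calling `visual_qc` from Python, no such default is documented, so any window has to be applied by the caller before the call.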
@@ -0,0 +1,127 @@
+ # HelioQC
+
+ HelioQC produces a unified multipanel visual diagnostic for solar irradiance quality control (GHI, DNI, DIF). It complements automated QC tests by revealing characteristic signatures of measurement anomalies across several coordinated views.
+
+ **DOI:** <TBD>
+
+ **License:** <TBD> (see [LICENSE](LICENSE))
+
+ **Authors:** Mines Paris - PSL — <TBD>
+
+ ## Installation
+
+ ```bash
+ pip install helioqc
+ ```
+
+ For a local checkout:
+
+ ```bash
+ pip install -e .
+ ```
+
+ ## Prerequisites
+
+ 1. **CAMS radiation service** — register at [soda-pro CAMS Radiation Service](https://www.soda-pro.com/web-services/radiation/cams-radiation-service), then provide your email as `CAMS_EMAIL` or `cams_email=`.
+ 2. **Google Static Maps API** (optional) — set `GOOGLE_API_KEY` for station map thumbnails. When absent, maps are skipped with a warning.
+
+ Copy [`.env.sample`](.env.sample) to `.env` and set your credentials:
+
+ ```bash
+ cp .env.sample .env
+ # edit .env — CAMS_EMAIL is required
+ ```
+
+ Integration tests and local runs read from `.env` at the project root.
+
+ ## Input data format
+
+ `visual_qc` expects a `pandas.DataFrame` with:
+
+ - a **UTC** `DatetimeIndex` (naive timestamps are assumed to be UTC)
+ - columns **`GHI`**, **`DHI`** or **`DIF`**, **`BNI`** or **`DNI`** (W/m²)
+
+ Filter the time range in Python before calling `visual_qc` (e.g. `df.loc["2021-01-01":"2025-12-31"]`).
+
+ Optional station metadata may be attached to `input_df.attrs` (e.g. after reading a NetCDF file):
+
+ | `attrs` key   | Description              |
+ |---------------|--------------------------|
+ | `latitude`    | Station latitude (°)     |
+ | `longitude`   | Station longitude (°)    |
+ | `elevation`   | Station elevation (m)    |
+ | `ClimateType` | Köppen-Geiger code       |
+ | `Country`     | Country name             |
+ | `station`     | Station name             |
+ | `ID`          | Station identifier       |
+ | `source`      | Measurement network      |
+
+ `latitude`, `longitude`, and `elevation` must be present in `attrs` unless they are passed explicitly as arguments.
+
+ ## Python API
+
+ ```python
+ from helioqc import visual_qc
+ import pandas as pd
+
+ df = pd.read_csv("measurements.csv", parse_dates=["time"], index_col="time")
+ df.index = df.index.tz_localize("UTC")
+
+ fig = visual_qc(
+     df,
+     latitude=36.624,
+     longitude=-116.019,
+     elevation=1007,
+     climate="BWh",
+     country="United States",
+     station="Desert Rock, Nevada",
+     station_id="DRA",
+     source="SURFRAD",
+     cams_email="you@example.com",
+ )
+ fig.savefig("helioqc_output.png", dpi=150, bbox_inches="tight")
+ ```
+
+ ### `visual_qc` synopsis
+
+ ```
+ visual_qc(input_df, latitude=None, longitude=None, elevation=None, cams_email=None, ...) → matplotlib.figure.Figure
+ ```
+
+ Returns a matplotlib `Figure` displayed automatically in Jupyter notebooks.
+
+ ## Command-line interface
+
+ ```bash
+ helioqc \
+   --latitude 36.624 --longitude -116.019 --elevation 1007 \
+   --climate BWh --country "United States" \
+   --station "Desert Rock, Nevada" --station-id DRA --source SURFRAD \
+   --cams-email you@example.com \
+   data/surfrad_dra_2025.csv.zip output.png
+ ```
+
+ `.csv`, `.csv.gz`, and `.csv.zip` inputs are supported. By default, the CLI analysis window spans the last five years of data, ending at the latest timestamp in the dataset; use `--start-date` / `--end-date` to override it.
+
+ Additional options: `--time-col`, `--time-format`, `--timezone`, `--ghi-col`, `--dni-col` (falls back to `BNI`), `--dif-col` (falls back to `DHI`), `--start-date`, `--end-date`.
+
+ ## Sample data
+
+ The repository includes one year of SURFRAD Desert Rock (DRA) measurements:
+
+ `data/surfrad_dra_2025.csv.zip`
+
+ Station metadata (latitude, climate, …) is passed as function/CLI parameters — see the API and CLI examples above.
+
+ ## Tests
+
+ ```bash
+ make test       # e2e (CAMS mocked) + unit tests
+ make test-unit  # fast wiring tests only
+ ```
+
+ End-to-end tests call sg2/SRTM/plotting for real (CAMS mocked) and write JPEG artefacts to `tmp/test_api.jpg` and `tmp/test_cli.jpg`.
+
+ ## Demo notebook
+
+ See [demo.ipynb](demo.ipynb) for a minimal workflow: fetch data from the Thredds server, attach metadata, and call `visual_qc()`.
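The input requirements in the README (a UTC `DatetimeIndex`, the `GHI`/`DHI`/`BNI` columns, a pre-filtered time range, and optional `attrs` metadata) can be illustrated with a small synthetic frame. This is only a sketch of preparing the input: the zero-valued data is made up, the station values are copied from the README example, and `visual_qc` is not called because it needs CAMS credentials.

```python
import pandas as pd

# One day of synthetic one-minute samples with naive (timezone-free) timestamps.
index = pd.date_range("2025-06-01 00:00", periods=1440, freq="min")
df = pd.DataFrame({"GHI": 0.0, "DHI": 0.0, "BNI": 0.0}, index=index)

# State the UTC assumption explicitly instead of relying on the naive default.
df.index = df.index.tz_localize("UTC")

# Restrict the time range before handing the frame to visual_qc.
subset = df.loc["2025-06-01 06:00":"2025-06-01 18:00"]

# Attach station metadata under the attrs keys listed in the README table.
subset.attrs.update(
    {
        "latitude": 36.624,
        "longitude": -116.019,
        "elevation": 1007,
        "station": "Desert Rock, Nevada",
        "ID": "DRA",
        "source": "SURFRAD",
    }
)

print(len(subset), subset.index.tz, sorted(subset.attrs))
```

The metadata is attached after slicing because `DataFrame.attrs` is an experimental pandas feature and its propagation through indexing operations should not be relied on.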
@@ -0,0 +1,6 @@
+ """HelioQC: visual quality-control diagnostics for solar irradiance measurements."""
+
+ from helioqc.api import visual_qc
+
+ __all__ = ["visual_qc"]
+ __version__ = "0.1.0"
@@ -0,0 +1,114 @@
+ """Public HelioQC API."""
+
+ from __future__ import annotations
+
+ from typing import TYPE_CHECKING
+
+ import pandas as pd
+
+ from helioqc.config import (
+     load_env,
+     resolve_cams_email,
+     resolve_google_api_key,
+     resolve_metadata,
+ )
+ from helioqc.diagnostic import compute_plots
+ from helioqc.io import ensure_utc_index, normalize_irradiance_columns
+ from helioqc.log import configure_logging, get_logger
+ from helioqc.pipeline import prepare_qc_dataframe
+
+ import matplotlib.pyplot as plt
+
+ if TYPE_CHECKING:
+     import matplotlib.figure
+
+ logger = get_logger()
+
+
+ def visual_qc(
+     input_df: pd.DataFrame,
+     latitude: float | None = None,
+     longitude: float | None = None,
+     elevation: float | None = None,
+     cams_email: str | None = None,
+     google_api_key: str | None = None,
+     climate: str | None = None,
+     country: str | None = None,
+     station: str | None = None,
+     station_id: str | None = None,
+     source: str | None = None,
+     show_axis_nr: bool = False,
+     no_cache: bool = False,
+ ) -> matplotlib.figure.Figure:
+     """Run the HelioQC diagnostic multipanel on irradiance measurements.
+
+     Parameters
+     ----------
+     input_df
+         DataFrame with a DatetimeIndex and columns ``GHI``, ``DHI`` or ``DIF``,
+         ``BNI`` or ``DNI`` (W/m²). Timestamps without timezone are assumed UTC.
+         Filter the time range before calling this function.
+         Location metadata may be stored in ``input_df.attrs``.
+
+     latitude, longitude, elevation
+         Station coordinates. May be provided in ``input_df.attrs``.
+
+     cams_email
+         Email registered with the CAMS radiation service.
+         Falls back to ``CAMS_EMAIL`` env var.
+
+     google_api_key
+         Google Static Maps API key. Falls back to ``GOOGLE_API_KEY`` env var.
+         Map thumbnails are skipped if missing.
+
+     climate, country, station, station_id, source
+         Optional station metadata for display panels.
+
+     show_axis_nr
+         Annotate subplot panel labels when True.
+
+     no_cache
+         When True, bypass the on-disk cache for CAMS and Google Static Maps.
+
+     Returns
+     -------
+     matplotlib.figure.Figure
+         The HelioQC multipanel figure (shown once in Jupyter when returned).
+     """
+     load_env()
+     configure_logging()
+     logger.info("Starting HelioQC visual QC...")
+     df = ensure_utc_index(normalize_irradiance_columns(input_df))
+
+     metadata = resolve_metadata(
+         input_df,
+         latitude=latitude,
+         longitude=longitude,
+         elevation=elevation,
+         climate=climate,
+         country=country,
+         station=station,
+         station_id=station_id,
+         source=source,
+     )
+     email = resolve_cams_email(cams_email)
+     maps_key = resolve_google_api_key(google_api_key)
+
+     qc_df, metadata = prepare_qc_dataframe(
+         df,
+         metadata,
+         cams_email=email,
+         no_cache=no_cache,
+     )
+     logger.info("Generating multipanel figure...")
+     with plt.ioff():
+         fig = compute_plots(
+             qc_df,
+             metadata,
+             showAxisNr=show_axis_nr,
+             google_api_key=maps_key,
+             no_cache=no_cache,
+         )
+     plt.close(fig)
+     logger.info("HelioQC visual QC complete")
+     return fig
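The docstring states that `cams_email` and `google_api_key` fall back to the `CAMS_EMAIL` and `GOOGLE_API_KEY` environment variables. `helioqc.config` is not included in this diff, so the following is only a minimal sketch of that precedence rule (explicit argument first, environment variable second). The function name `resolve_setting` is hypothetical and is not the package's `resolve_cams_email` or `resolve_google_api_key`.

```python
import os
from typing import Optional


def resolve_setting(explicit: Optional[str], env_name: str) -> Optional[str]:
    """Return the explicit value if given, otherwise the environment variable.

    Hypothetical helper: helioqc.config is not shown in this diff, so this
    only illustrates the fallback order described in the docstring.
    """
    if explicit:
        return explicit
    # Treat an empty environment variable the same as a missing one.
    return os.environ.get(env_name) or None


os.environ["CAMS_EMAIL"] = "env@example.com"

from_argument = resolve_setting("you@example.com", "CAMS_EMAIL")
from_environment = resolve_setting(None, "CAMS_EMAIL")
print(from_argument, from_environment)
```

How the real resolvers react to a missing value (an exception for the required email, a warning for the optional maps key) cannot be determined from this diff, so the sketch simply returns `None`.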