PVNet 5.0.18__tar.gz → 5.0.19__tar.gz

Files changed (38)
  1. {pvnet-5.0.18 → pvnet-5.0.19}/PKG-INFO +31 -76
  2. {pvnet-5.0.18 → pvnet-5.0.19}/PVNet.egg-info/PKG-INFO +31 -76
  3. {pvnet-5.0.18 → pvnet-5.0.19}/README.md +30 -75
  4. {pvnet-5.0.18 → pvnet-5.0.19}/LICENSE +0 -0
  5. {pvnet-5.0.18 → pvnet-5.0.19}/PVNet.egg-info/SOURCES.txt +0 -0
  6. {pvnet-5.0.18 → pvnet-5.0.19}/PVNet.egg-info/dependency_links.txt +0 -0
  7. {pvnet-5.0.18 → pvnet-5.0.19}/PVNet.egg-info/requires.txt +0 -0
  8. {pvnet-5.0.18 → pvnet-5.0.19}/PVNet.egg-info/top_level.txt +0 -0
  9. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/__init__.py +0 -0
  10. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/data/__init__.py +0 -0
  11. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/data/base_datamodule.py +0 -0
  12. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/data/site_datamodule.py +0 -0
  13. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/data/uk_regional_datamodule.py +0 -0
  14. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/load_model.py +0 -0
  15. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/__init__.py +0 -0
  16. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/base_model.py +0 -0
  17. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/ensemble.py +0 -0
  18. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/__init__.py +0 -0
  19. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/basic_blocks.py +0 -0
  20. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/encoders/__init__.py +0 -0
  21. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/encoders/basic_blocks.py +0 -0
  22. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/encoders/encoders3d.py +0 -0
  23. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/late_fusion.py +0 -0
  24. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/linear_networks/__init__.py +0 -0
  25. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/linear_networks/basic_blocks.py +0 -0
  26. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/linear_networks/networks.py +0 -0
  27. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/site_encoders/__init__.py +0 -0
  28. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/site_encoders/basic_blocks.py +0 -0
  29. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/models/late_fusion/site_encoders/encoders.py +0 -0
  30. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/optimizers.py +0 -0
  31. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/training/__init__.py +0 -0
  32. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/training/lightning_module.py +0 -0
  33. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/training/plots.py +0 -0
  34. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/training/train.py +0 -0
  35. {pvnet-5.0.18 → pvnet-5.0.19}/pvnet/utils.py +0 -0
  36. {pvnet-5.0.18 → pvnet-5.0.19}/pyproject.toml +0 -0
  37. {pvnet-5.0.18 → pvnet-5.0.19}/setup.cfg +0 -0
  38. {pvnet-5.0.18 → pvnet-5.0.19}/tests/test_end2end.py +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: PVNet
- Version: 5.0.18
+ Version: 5.0.19
  Summary: PVNet
  Author-email: Peter Dudfield <info@openclimatefix.org>
  Requires-Python: >=3.11
@@ -142,120 +142,75 @@ pip install -e <PATH-TO-ocf-data-sampler-REPO>
  If you install the local version of `ocf-data-sampler` that is more recent than the version
  specified in `PVNet` it is not guarenteed to function properly with this library.
 
- ## Pre-saving samples of data for training/validation of PVNet
+ ## Streaming samples (no pre-save)
 
- PVNet contains a script for generating samples of data suitable for training the PVNet models. To run the script you will need to make some modifications to the datamodule configuration.
+ PVNet now trains and validates directly from **streamed_samples** (i.e. no pre-saving to disk).
 
- Make sure you have copied the example configs (as already stated above):
- ```
+ Make sure you have copied example configs (as already stated above):
  cp -r configs.example configs
- ```
-
- ### Set up and config example for sample creation
-
- We will use the following example config file for creating samples: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. You will need to comment out or delete the parts of `example_configuration.yaml` pertaining to the data you are not using.
 
+ ### Set up and config example for streaming
 
- When creating samples, an additional datamodule config located in `PVNet/configs/datamodule` is passed into the sample creation script: `streamed_samples.yaml`. Like before, a placeholder variable is used when specifying which configuration to use:
+ We will use the following example config file to describe your data sources: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. Delete or comment the parts for data you are not using.
 
- ```yaml
- configuration: "PLACEHOLDER.yaml"
- ```
-
- This should be given the whole path to the config on your local machine, for example:
+ At run time, the datamodule config `PVNet/configs/datamodule/streamed_samples.yaml` points to your chosen configuration file:
 
- ```yaml
  configuration: "/FULL-PATH-TO-REPO/PVNet/configs/datamodule/configuration/example_configuration.yaml"
- ```
-
- Where `FULL-PATH-TO-REPO` represent the whole path to the PVNet repo on your local machine.
 
- This is also where you can update the train, val & test periods to cover the data you have access to.
-
- ### Running the sample creation script
-
- Run the `save_samples.py` script to create samples with the parameters specified in the datamodule config (`streamed_samples.yaml` in this example):
-
- ```bash
- python scripts/save_samples.py
- ```
- PVNet uses
- [hydra](https://hydra.cc/) which enables us to pass variables via the command
- line that will override the configuration defined in the `./configs` directory, like this:
-
- ```bash
- python scripts/save_samples.py datamodule=streamed_samples datamodule.sample_output_dir="./output" datamodule.num_train_samples=10 datamodule.num_val_samples=5
- ```
-
- `scripts/save_samples.py` needs a config under `PVNet/configs/datamodule`. You can adapt `streamed_samples.yaml` or create your own in the same folder.
+ You can also update train/val/test time ranges here to match the period you have access to.
 
  If downloading private data from a GCP bucket make sure to authenticate gcloud (the public satellite data does not need authentication):
 
- ```
  gcloud auth login
- ```
-
- Files stored in multiple locations can be added as a list. For example, in the `example_configuration.yaml` file we can supply a path to satellite data stored on a bucket:
 
- ```yaml
- satellite:
- zarr_path: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr
- ```
+ You can provide multiple storage locations as a list. For example:
 
- Or to satellite data hosted by Google:
-
- ```yaml
  satellite:
- zarr_path:
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
- ```
-
- ocf-data-sampler is currently set up to use 11 channels from the satellite data, the 12th of which is HRV and is not included in these.
+ zarr_path:
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
 
+ `ocf-data-sampler` is currently set up to use 11 channels from the satellite data (the 12th, HRV, is not used).
 
  ### Training PVNet
 
- How PVNet is run is determined by the extensive configuration in the config
- files. The configs stored in `PVNet/configs.example` should work with samples created using the steps and sample creation config mentioned above.
+ How PVNet is run is determined by the configuration files. The example configs in `PVNet/configs.example` work with **streamed_samples** using `datamodule/streamed_samples.yaml`.
 
- Make sure to update the following config files before training your model:
+ Update the following before training:
 
- 1. In `configs/datamodule/presaved_samples.yaml`:
- - update `sample_dir` to point to the directory you stored your samples in during sample creation
- 2. In `configs/model/late_fusion.yaml`:
- - update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
- - `in_channels`: number of variables your NWP source supplies
- - `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP; should match `image_size_pixels_height` and/or `image_size_pixels_width` in `datamodule/configuration/site_example_configuration.yaml` for the NWP, unless transformations such as coarsening was applied (e. g. as for ECMWF data)
- 3. In `configs/trainer/default.yaml`:
- - set `accelerator: 0` if running on a system without a supported GPU
+ 1. In `configs/model/late_fusion.yaml`:
+ - Update the list of encoders to match the data sources you are using. For different NWP sources, keep the same structure but ensure:
+ - `in_channels`: the number of variables your NWP source supplies
+ - `image_size_pixels`: spatial crop matching your NWP resolution and the settings in your datamodule configuration (unless you coarsened, e.g. for ECMWF)
+ 2. In `configs/trainer/default.yaml`:
+ - Set `accelerator: 0` if running on a system without a supported GPU
+ 3. In `configs/datamodule/streamed_samples.yaml`:
+ - Point `configuration:` to your local `example_configuration.yaml` (or your custom one)
+ - Adjust the train/val/test time ranges to your available data
 
- If creating copies of the config files instead of modifying existing ones, update `defaults` in the main `./configs/config.yaml` file to use
- your customised config files:
+ If you create custom config files, update the main `./configs/config.yaml` defaults:
 
- ```yaml
  defaults:
  - trainer: default.yaml
  - model: late_fusion.yaml
- - datamodule: presaved_samples.yaml
+ - datamodule: streamed_samples.yaml
  - callbacks: null
  - experiment: null
  - hparams_search: null
  - hydra: default.yaml
- ```
 
- Assuming you ran the `save_samples.py` script to generate some presaved train and
- val data samples, you can now train PVNet by running:
+ Now train PVNet:
 
- ```
  python run.py
- ```
+
+ You can override any setting with Hydra, e.g.:
+
+ python run.py datamodule=streamed_samples datamodule.configuration="/FULL-PATH/PVNet/configs/datamodule/configuration/example_configuration.yaml"
 
  ## Backtest
 
  If you have successfully trained a PVNet model and have a saved model checkpoint you can create a backtest using this, e.g. forecasts on historical data to evaluate forecast accuracy/skill. This can be done by running one of the scripts in this repo such as [the UK GSP backtest script](scripts/backtest_uk_gsp.py) or the [the pv site backtest script](scripts/backtest_sites.py), further info on how to run these are in each backtest file.
 
-
  ## Testing
 
  You can use `python -m pytest tests` to run tests
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: PVNet
- Version: 5.0.18
+ Version: 5.0.19
  Summary: PVNet
  Author-email: Peter Dudfield <info@openclimatefix.org>
  Requires-Python: >=3.11
@@ -142,120 +142,75 @@ pip install -e <PATH-TO-ocf-data-sampler-REPO>
  If you install the local version of `ocf-data-sampler` that is more recent than the version
  specified in `PVNet` it is not guarenteed to function properly with this library.
 
- ## Pre-saving samples of data for training/validation of PVNet
+ ## Streaming samples (no pre-save)
 
- PVNet contains a script for generating samples of data suitable for training the PVNet models. To run the script you will need to make some modifications to the datamodule configuration.
+ PVNet now trains and validates directly from **streamed_samples** (i.e. no pre-saving to disk).
 
- Make sure you have copied the example configs (as already stated above):
- ```
+ Make sure you have copied example configs (as already stated above):
  cp -r configs.example configs
- ```
-
- ### Set up and config example for sample creation
-
- We will use the following example config file for creating samples: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. You will need to comment out or delete the parts of `example_configuration.yaml` pertaining to the data you are not using.
 
+ ### Set up and config example for streaming
 
- When creating samples, an additional datamodule config located in `PVNet/configs/datamodule` is passed into the sample creation script: `streamed_samples.yaml`. Like before, a placeholder variable is used when specifying which configuration to use:
+ We will use the following example config file to describe your data sources: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. Delete or comment the parts for data you are not using.
 
- ```yaml
- configuration: "PLACEHOLDER.yaml"
- ```
-
- This should be given the whole path to the config on your local machine, for example:
+ At run time, the datamodule config `PVNet/configs/datamodule/streamed_samples.yaml` points to your chosen configuration file:
 
- ```yaml
  configuration: "/FULL-PATH-TO-REPO/PVNet/configs/datamodule/configuration/example_configuration.yaml"
- ```
-
- Where `FULL-PATH-TO-REPO` represent the whole path to the PVNet repo on your local machine.
 
- This is also where you can update the train, val & test periods to cover the data you have access to.
-
- ### Running the sample creation script
-
- Run the `save_samples.py` script to create samples with the parameters specified in the datamodule config (`streamed_samples.yaml` in this example):
-
- ```bash
- python scripts/save_samples.py
- ```
- PVNet uses
- [hydra](https://hydra.cc/) which enables us to pass variables via the command
- line that will override the configuration defined in the `./configs` directory, like this:
-
- ```bash
- python scripts/save_samples.py datamodule=streamed_samples datamodule.sample_output_dir="./output" datamodule.num_train_samples=10 datamodule.num_val_samples=5
- ```
-
- `scripts/save_samples.py` needs a config under `PVNet/configs/datamodule`. You can adapt `streamed_samples.yaml` or create your own in the same folder.
+ You can also update train/val/test time ranges here to match the period you have access to.
 
  If downloading private data from a GCP bucket make sure to authenticate gcloud (the public satellite data does not need authentication):
 
- ```
  gcloud auth login
- ```
-
- Files stored in multiple locations can be added as a list. For example, in the `example_configuration.yaml` file we can supply a path to satellite data stored on a bucket:
 
- ```yaml
- satellite:
- zarr_path: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr
- ```
+ You can provide multiple storage locations as a list. For example:
 
- Or to satellite data hosted by Google:
-
- ```yaml
  satellite:
- zarr_path:
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
- ```
-
- ocf-data-sampler is currently set up to use 11 channels from the satellite data, the 12th of which is HRV and is not included in these.
+ zarr_path:
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
 
+ `ocf-data-sampler` is currently set up to use 11 channels from the satellite data (the 12th, HRV, is not used).
 
  ### Training PVNet
 
- How PVNet is run is determined by the extensive configuration in the config
- files. The configs stored in `PVNet/configs.example` should work with samples created using the steps and sample creation config mentioned above.
+ How PVNet is run is determined by the configuration files. The example configs in `PVNet/configs.example` work with **streamed_samples** using `datamodule/streamed_samples.yaml`.
 
- Make sure to update the following config files before training your model:
+ Update the following before training:
 
- 1. In `configs/datamodule/presaved_samples.yaml`:
- - update `sample_dir` to point to the directory you stored your samples in during sample creation
- 2. In `configs/model/late_fusion.yaml`:
- - update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
- - `in_channels`: number of variables your NWP source supplies
- - `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP; should match `image_size_pixels_height` and/or `image_size_pixels_width` in `datamodule/configuration/site_example_configuration.yaml` for the NWP, unless transformations such as coarsening was applied (e. g. as for ECMWF data)
- 3. In `configs/trainer/default.yaml`:
- - set `accelerator: 0` if running on a system without a supported GPU
+ 1. In `configs/model/late_fusion.yaml`:
+ - Update the list of encoders to match the data sources you are using. For different NWP sources, keep the same structure but ensure:
+ - `in_channels`: the number of variables your NWP source supplies
+ - `image_size_pixels`: spatial crop matching your NWP resolution and the settings in your datamodule configuration (unless you coarsened, e.g. for ECMWF)
+ 2. In `configs/trainer/default.yaml`:
+ - Set `accelerator: 0` if running on a system without a supported GPU
+ 3. In `configs/datamodule/streamed_samples.yaml`:
+ - Point `configuration:` to your local `example_configuration.yaml` (or your custom one)
+ - Adjust the train/val/test time ranges to your available data
 
- If creating copies of the config files instead of modifying existing ones, update `defaults` in the main `./configs/config.yaml` file to use
- your customised config files:
+ If you create custom config files, update the main `./configs/config.yaml` defaults:
 
- ```yaml
  defaults:
  - trainer: default.yaml
  - model: late_fusion.yaml
- - datamodule: presaved_samples.yaml
+ - datamodule: streamed_samples.yaml
  - callbacks: null
  - experiment: null
  - hparams_search: null
  - hydra: default.yaml
- ```
 
- Assuming you ran the `save_samples.py` script to generate some presaved train and
- val data samples, you can now train PVNet by running:
+ Now train PVNet:
 
- ```
  python run.py
- ```
+
+ You can override any setting with Hydra, e.g.:
+
+ python run.py datamodule=streamed_samples datamodule.configuration="/FULL-PATH/PVNet/configs/datamodule/configuration/example_configuration.yaml"
 
  ## Backtest
 
  If you have successfully trained a PVNet model and have a saved model checkpoint you can create a backtest using this, e.g. forecasts on historical data to evaluate forecast accuracy/skill. This can be done by running one of the scripts in this repo such as [the UK GSP backtest script](scripts/backtest_uk_gsp.py) or the [the pv site backtest script](scripts/backtest_sites.py), further info on how to run these are in each backtest file.
 
-
  ## Testing
 
  You can use `python -m pytest tests` to run tests
@@ -113,120 +113,75 @@ pip install -e <PATH-TO-ocf-data-sampler-REPO>
  If you install the local version of `ocf-data-sampler` that is more recent than the version
  specified in `PVNet` it is not guarenteed to function properly with this library.
 
- ## Pre-saving samples of data for training/validation of PVNet
+ ## Streaming samples (no pre-save)
 
- PVNet contains a script for generating samples of data suitable for training the PVNet models. To run the script you will need to make some modifications to the datamodule configuration.
+ PVNet now trains and validates directly from **streamed_samples** (i.e. no pre-saving to disk).
 
- Make sure you have copied the example configs (as already stated above):
- ```
+ Make sure you have copied example configs (as already stated above):
  cp -r configs.example configs
- ```
-
- ### Set up and config example for sample creation
-
- We will use the following example config file for creating samples: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. You will need to comment out or delete the parts of `example_configuration.yaml` pertaining to the data you are not using.
 
+ ### Set up and config example for streaming
 
- When creating samples, an additional datamodule config located in `PVNet/configs/datamodule` is passed into the sample creation script: `streamed_samples.yaml`. Like before, a placeholder variable is used when specifying which configuration to use:
+ We will use the following example config file to describe your data sources: `/PVNet/configs/datamodule/configuration/example_configuration.yaml`. Ensure that the file paths are set to the correct locations in `example_configuration.yaml`: search for `PLACEHOLDER` to find where to input the location of the files. Delete or comment the parts for data you are not using.
 
- ```yaml
- configuration: "PLACEHOLDER.yaml"
- ```
-
- This should be given the whole path to the config on your local machine, for example:
+ At run time, the datamodule config `PVNet/configs/datamodule/streamed_samples.yaml` points to your chosen configuration file:
 
- ```yaml
  configuration: "/FULL-PATH-TO-REPO/PVNet/configs/datamodule/configuration/example_configuration.yaml"
- ```
-
- Where `FULL-PATH-TO-REPO` represent the whole path to the PVNet repo on your local machine.
 
- This is also where you can update the train, val & test periods to cover the data you have access to.
-
- ### Running the sample creation script
-
- Run the `save_samples.py` script to create samples with the parameters specified in the datamodule config (`streamed_samples.yaml` in this example):
-
- ```bash
- python scripts/save_samples.py
- ```
- PVNet uses
- [hydra](https://hydra.cc/) which enables us to pass variables via the command
- line that will override the configuration defined in the `./configs` directory, like this:
-
- ```bash
- python scripts/save_samples.py datamodule=streamed_samples datamodule.sample_output_dir="./output" datamodule.num_train_samples=10 datamodule.num_val_samples=5
- ```
-
- `scripts/save_samples.py` needs a config under `PVNet/configs/datamodule`. You can adapt `streamed_samples.yaml` or create your own in the same folder.
+ You can also update train/val/test time ranges here to match the period you have access to.
 
  If downloading private data from a GCP bucket make sure to authenticate gcloud (the public satellite data does not need authentication):
 
- ```
  gcloud auth login
- ```
-
- Files stored in multiple locations can be added as a list. For example, in the `example_configuration.yaml` file we can supply a path to satellite data stored on a bucket:
 
- ```yaml
- satellite:
- zarr_path: gs://solar-pv-nowcasting-data/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr
- ```
+ You can provide multiple storage locations as a list. For example:
 
- Or to satellite data hosted by Google:
-
- ```yaml
  satellite:
- zarr_path:
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
- - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
- ```
-
- ocf-data-sampler is currently set up to use 11 channels from the satellite data, the 12th of which is HRV and is not included in these.
+ zarr_path:
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2020_nonhrv.zarr"
+ - "gs://public-datasets-eumetsat-solar-forecasting/satellite/EUMETSAT/SEVIRI_RSS/v4/2021_nonhrv.zarr"
 
+ `ocf-data-sampler` is currently set up to use 11 channels from the satellite data (the 12th, HRV, is not used).
 
  ### Training PVNet
 
- How PVNet is run is determined by the extensive configuration in the config
- files. The configs stored in `PVNet/configs.example` should work with samples created using the steps and sample creation config mentioned above.
+ How PVNet is run is determined by the configuration files. The example configs in `PVNet/configs.example` work with **streamed_samples** using `datamodule/streamed_samples.yaml`.
 
- Make sure to update the following config files before training your model:
+ Update the following before training:
 
- 1. In `configs/datamodule/presaved_samples.yaml`:
- - update `sample_dir` to point to the directory you stored your samples in during sample creation
- 2. In `configs/model/late_fusion.yaml`:
- - update the list of encoders to reflect the data sources you are using. If you are using different NWP sources, the encoders for these should follow the same structure with two important updates:
- - `in_channels`: number of variables your NWP source supplies
- - `image_size_pixels`: spatial crop of your NWP data. It depends on the spatial resolution of your NWP; should match `image_size_pixels_height` and/or `image_size_pixels_width` in `datamodule/configuration/site_example_configuration.yaml` for the NWP, unless transformations such as coarsening was applied (e. g. as for ECMWF data)
- 3. In `configs/trainer/default.yaml`:
- - set `accelerator: 0` if running on a system without a supported GPU
+ 1. In `configs/model/late_fusion.yaml`:
+ - Update the list of encoders to match the data sources you are using. For different NWP sources, keep the same structure but ensure:
+ - `in_channels`: the number of variables your NWP source supplies
+ - `image_size_pixels`: spatial crop matching your NWP resolution and the settings in your datamodule configuration (unless you coarsened, e.g. for ECMWF)
+ 2. In `configs/trainer/default.yaml`:
+ - Set `accelerator: 0` if running on a system without a supported GPU
+ 3. In `configs/datamodule/streamed_samples.yaml`:
+ - Point `configuration:` to your local `example_configuration.yaml` (or your custom one)
+ - Adjust the train/val/test time ranges to your available data
 
- If creating copies of the config files instead of modifying existing ones, update `defaults` in the main `./configs/config.yaml` file to use
- your customised config files:
+ If you create custom config files, update the main `./configs/config.yaml` defaults:
 
- ```yaml
  defaults:
  - trainer: default.yaml
  - model: late_fusion.yaml
- - datamodule: presaved_samples.yaml
+ - datamodule: streamed_samples.yaml
  - callbacks: null
  - experiment: null
  - hparams_search: null
  - hydra: default.yaml
- ```
 
- Assuming you ran the `save_samples.py` script to generate some presaved train and
- val data samples, you can now train PVNet by running:
+ Now train PVNet:
 
- ```
  python run.py
- ```
+
+ You can override any setting with Hydra, e.g.:
+
+ python run.py datamodule=streamed_samples datamodule.configuration="/FULL-PATH/PVNet/configs/datamodule/configuration/example_configuration.yaml"
 
  ## Backtest
 
  If you have successfully trained a PVNet model and have a saved model checkpoint you can create a backtest using this, e.g. forecasts on historical data to evaluate forecast accuracy/skill. This can be done by running one of the scripts in this repo such as [the UK GSP backtest script](scripts/backtest_uk_gsp.py) or the [the pv site backtest script](scripts/backtest_sites.py), further info on how to run these are in each backtest file.
 
-
  ## Testing
 
  You can use `python -m pytest tests` to run tests
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
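
Note on the README change: both the old and the new text ask you to search the copied configs for `PLACEHOLDER` before running anything. A minimal sketch of automating that check is below. It assumes the configs are `.yaml` files under a single directory such as `./configs`; `find_placeholders` is a hypothetical helper written for illustration and is not part of PVNet.

```python
from pathlib import Path


def find_placeholders(config_dir, marker="PLACEHOLDER"):
    """Return (path, line_number, stripped_line) for each line still containing the marker.

    Hypothetical helper: scans every *.yaml file below config_dir as plain text,
    so it needs no YAML parser and also reports commented-out lines.
    """
    hits = []
    for path in sorted(Path(config_dir).rglob("*.yaml")):
        text = path.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_placeholders("configs")` after `cp -r configs.example configs` and printing each tuple gives a list of paths that still need editing. An empty list suggests every placeholder has been replaced, though it does not confirm that the replacement paths exist.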