goesgcp 1.0.9__tar.gz → 2.0.1__tar.gz

goesgcp-2.0.1/PKG-INFO ADDED
@@ -0,0 +1,118 @@
1
+ Metadata-Version: 2.2
2
+ Name: goesgcp
3
+ Version: 2.0.1
4
+ Summary: A package to download and process GOES-16/17 data
5
+ Home-page: https://github.com/helvecioneto/goesgcp
6
+ Author: Helvecio B. L. Neto
7
+ Author-email: helvecioblneto@gmail.com
8
+ License: LICENSE
9
+ Classifier: Programming Language :: Python
10
+ Classifier: Development Status :: 5 - Production/Stable
11
+ Classifier: Operating System :: OS Independent
12
+ Classifier: Programming Language :: Python :: 3.10
13
+ Classifier: Programming Language :: Python :: 3.11
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Topic :: Scientific/Engineering
16
+ Classifier: Topic :: Software Development
17
+ Classifier: Topic :: Utilities
18
+ Description-Content-Type: text/markdown
19
+ License-File: LICENSE
20
+ Requires-Dist: google-cloud-storage
21
+ Requires-Dist: pyproj
22
+ Requires-Dist: xarray
23
+ Requires-Dist: netcdf4
24
+ Requires-Dist: rioxarray
25
+ Dynamic: author
26
+ Dynamic: author-email
27
+ Dynamic: classifier
28
+ Dynamic: description
29
+ Dynamic: description-content-type
30
+ Dynamic: home-page
31
+ Dynamic: license
32
+ Dynamic: requires-dist
33
+ Dynamic: summary
34
+
35
+ # goesgcp
36
+ <!-- badges: start -->
37
+ [![pypi](https://badge.fury.io/py/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
38
+ [![Downloads](https://img.shields.io/pypi/dm/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
39
+ [![Contributors](https://img.shields.io/github/contributors/helvecioneto/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/graphs/contributors)
40
+ [![License](https://img.shields.io/pypi/l/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/blob/main/LICENSE)
41
+ <!-- badges: end -->
42
+
43
+
44
+ `goesgcp` is a Python utility designed for downloading and reprojecting GOES-R satellite data. This script leverages the `google.cloud` library for accessing data from the Google Cloud Platform (GCP) and `pyproj` for reprojecting data to EPSG:4326, as well as cropping it to a user-defined bounding box.
45
+
46
+ ## Features
47
+
48
+ - **Download GOES-R satellite data**: Supports GOES-16 and GOES-17.
49
+ - **Reprojection and cropping**: Reprojects data to EPSG:4326 and crops to a specified bounding box.
50
+ - **Flexible command-line interface**: Customize download options, variables, channels, time range, and output format.
51
+ - **Efficient processing**: Handles large datasets with optimized performance.
52
+
53
+ ## Installation
54
+
55
+ Install the necessary dependencies via `pip`:
56
+
57
+ ```bash
58
+ pip install goesgcp
59
+ ```
60
+
61
+
62
+ ## Usage
63
+
64
+ ### Command-Line Arguments
65
+
66
+ The script uses the `argparse` module for handling command-line arguments. Below are the available options:
67
+
68
+ ```bash
69
+ goesgcp [OPTIONS]
70
+ ```
71
+
72
+ | Option | Description |
73
+ |----------------------|----------------------------------------------------------------------------|
74
+ | `--satellite` | Name of the satellite: `goes-16` or `goes-18` (default: `goes-16`). |
75
+ | `--product` | Name of the satellite product (e.g., ABI-L2-CMIPF). |
76
+ | `--var_name` | Variable name to extract (e.g., CMI). |
77
+ | `--channel` | Channel to use (e.g., 13). |
78
+ | `--output` | Path for saving output files (default: `output/`). |
79
+ | `--lat_min` | Minimum latitude of the bounding box (default: `-56`). |
80
+ | `--lat_max` | Maximum latitude of the bounding box (default: `35`). |
81
+ | `--lon_min` | Minimum longitude of the bounding box (default: `-116`). |
82
+ | `--lon_max` | Maximum longitude of the bounding box (default: `-25`). |
83
+ | `--resolution` | Resolution of the reprojected data, in degrees (default: `-0.045`). |
84
+ | `--recent` | Number of most recent files to download (no default; required unless `--start` and `--end` are given). |
85
+ | `--start` | Start date for downloading data (default: `None`). |
86
+ | `--end` | End date for downloading data (default: `None`). |
87
+ | `--bt_hour` | First and last hour of the day to keep, as two values (default: `0 23`). |
88
+ | `--bt_min` | First and last minute of the hour to keep, as two values (default: `0 60`). |
89
+ | `--save_format` | Directory layout for saved files: `flat`, `by_date`, or `julian` (default: `flat`). |
90
+
91
+ #### Available GOES Products
92
+ A comprehensive list of available GOES products can be found at the following link: [https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16](https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16)
93
+
94
+ ### Examples
95
+
96
+ #### Download Recent Data
97
+ In the example below, the command downloads the 3 most recent files from the GOES-16 satellite for the product ABI-L2-CMIPF. It focuses on the variable CMI (Cloud and Moisture Imagery) from channel 13, which is commonly used for infrared observations. The downloaded files are saved to the specified output directory output/.
98
+
99
+ ```bash
100
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --var_name CMI --channel 13 --recent 3 --output "output/"
101
+ ```
102
+
103
+ #### Download Data for a Specific Time Range
104
+ This command retrieves GOES-16 satellite data for the product ABI-L2-CMIPF within the date range 2022-12-15 00:00:00 to 2022-12-20 10:00:00, focusing on hours 5:00 and 6:00 AM. The data is cropped to the geographic bounds of -35° to 5° latitude and -80° to -30° longitude, reprojected with a resolution of 0.045 degrees, and saved in a by_date format for easy organization.
105
+
106
+ ```bash
107
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --start '2022-12-15 00:00:00' --end '2022-12-20 10:00:00' --bt_hour 5 6 --save_format by_date --resolution 0.045 --lat_min -35 --lat_max 5 --lon_min -80 --lon_max -30
108
+ ```
109
+
110
+ ### Contributing
111
+ Contributions are welcome! If you encounter issues or have suggestions for improvements, please submit them via GitHub issues or pull requests.
112
+
113
+ ### Credits
114
+ This project was developed and optimized by Helvecio Neto (2025).
115
+ It builds upon NOAA GOES-R data and leverages resources provided by the Google Cloud Platform.
116
+
117
+ ### License
118
+ This project is licensed under the MIT License.
@@ -0,0 +1,84 @@
1
+ # goesgcp
2
+ <!-- badges: start -->
3
+ [![pypi](https://badge.fury.io/py/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
4
+ [![Downloads](https://img.shields.io/pypi/dm/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
5
+ [![Contributors](https://img.shields.io/github/contributors/helvecioneto/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/graphs/contributors)
6
+ [![License](https://img.shields.io/pypi/l/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/blob/main/LICENSE)
7
+ <!-- badges: end -->
8
+
9
+
10
+ `goesgcp` is a Python utility designed for downloading and reprojecting GOES-R satellite data. This script leverages the `google.cloud` library for accessing data from the Google Cloud Platform (GCP) and `pyproj` for reprojecting data to EPSG:4326, as well as cropping it to a user-defined bounding box.
11
+
12
+ ## Features
13
+
14
+ - **Download GOES-R satellite data**: Supports GOES-16 and GOES-17.
15
+ - **Reprojection and cropping**: Reprojects data to EPSG:4326 and crops to a specified bounding box.
16
+ - **Flexible command-line interface**: Customize download options, variables, channels, time range, and output format.
17
+ - **Efficient processing**: Handles large datasets with optimized performance.
18
+
19
+ ## Installation
20
+
21
+ Install the necessary dependencies via `pip`:
22
+
23
+ ```bash
24
+ pip install goesgcp
25
+ ```
26
+
27
+
28
+ ## Usage
29
+
30
+ ### Command-Line Arguments
31
+
32
+ The script uses the `argparse` module for handling command-line arguments. Below are the available options:
33
+
34
+ ```bash
35
+ goesgcp [OPTIONS]
36
+ ```
37
+
38
+ | Option | Description |
39
+ |----------------------|----------------------------------------------------------------------------|
40
+ | `--satellite` | Name of the satellite: `goes-16` or `goes-18` (default: `goes-16`). |
41
+ | `--product` | Name of the satellite product (e.g., ABI-L2-CMIPF). |
42
+ | `--var_name` | Variable name to extract (e.g., CMI). |
43
+ | `--channel` | Channel to use (e.g., 13). |
44
+ | `--output` | Path for saving output files (default: `output/`). |
45
+ | `--lat_min` | Minimum latitude of the bounding box (default: `-56`). |
46
+ | `--lat_max` | Maximum latitude of the bounding box (default: `35`). |
47
+ | `--lon_min` | Minimum longitude of the bounding box (default: `-116`). |
48
+ | `--lon_max` | Maximum longitude of the bounding box (default: `-25`). |
49
+ | `--resolution` | Resolution of the reprojected data, in degrees (default: `-0.045`). |
50
+ | `--recent` | Number of most recent files to download (no default; required unless `--start` and `--end` are given). |
51
+ | `--start` | Start date for downloading data (default: `None`). |
52
+ | `--end` | End date for downloading data (default: `None`). |
53
+ | `--bt_hour` | First and last hour of the day to keep, as two values (default: `0 23`). |
54
+ | `--bt_min` | First and last minute of the hour to keep, as two values (default: `0 60`). |
55
+ | `--save_format` | Directory layout for saved files: `flat`, `by_date`, or `julian` (default: `flat`). |
56
+
57
+ #### Available GOES Products
58
+ A comprehensive list of available GOES products can be found at the following link: [https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16](https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16)
59
+
60
+ ### Examples
61
+
62
+ #### Download Recent Data
63
+ In the example below, the command downloads the 3 most recent files from the GOES-16 satellite for the product ABI-L2-CMIPF. It focuses on the variable CMI (Cloud and Moisture Imagery) from channel 13, which is commonly used for infrared observations. The downloaded files are saved to the specified output directory output/.
64
+
65
+ ```bash
66
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --var_name CMI --channel 13 --recent 3 --output "output/"
67
+ ```
68
+
69
+ #### Download Data for a Specific Time Range
70
+ This command retrieves GOES-16 satellite data for the product ABI-L2-CMIPF within the date range 2022-12-15 00:00:00 to 2022-12-20 10:00:00, focusing on hours 5:00 and 6:00 AM. The data is cropped to the geographic bounds of -35° to 5° latitude and -80° to -30° longitude, reprojected with a resolution of 0.045 degrees, and saved in a by_date format for easy organization.
71
+
72
+ ```bash
73
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --start '2022-12-15 00:00:00' --end '2022-12-20 10:00:00' --bt_hour 5 6 --save_format by_date --resolution 0.045 --lat_min -35 --lat_max 5 --lon_min -80 --lon_max -30
74
+ ```
75
+
76
+ ### Contributing
77
+ Contributions are welcome! If you encounter issues or have suggestions for improvements, please submit them via GitHub issues or pull requests.
78
+
79
+ ### Credits
80
+ This project was developed and optimized by Helvecio Neto (2025).
81
+ It builds upon NOAA GOES-R data and leverages resources provided by the Google Cloud Platform.
82
+
83
+ ### License
84
+ This project is licensed under the MIT License.
@@ -5,6 +5,7 @@ import xarray as xr
5
5
  import argparse
6
6
  import sys
7
7
  import tqdm
8
+ import pandas as pd
8
9
  from distutils.util import strtobool
9
10
  from multiprocessing import Pool
10
11
  from google.cloud import storage
@@ -28,6 +29,83 @@ def get_directory_prefix(year, julian_day, hour):
      """Generates the directory path based on year, Julian day, and hour."""
      return f"{year}/{julian_day}/{str(hour).zfill(2)}/"

+
+ def get_files_period(connection, bucket_name, base_prefix, pattern,
+                      start, end, bt_hour=[0, 23], bt_min=[0, 60], freq='10 min'):
+     """
+     Fetches files from a GCP bucket within a specified time period and returns them as a DataFrame.
+
+     :param connection: The GCP storage client connection.
+     :param bucket_name: Name of the GCP bucket.
+     :param base_prefix: Base directory prefix for the files.
+     :param pattern: Search pattern for file names.
+     :param start: Start datetime (inclusive).
+     :param end: End datetime (exclusive).
+     :return: DataFrame containing the file names and their metadata.
+     """
+
+     print(f"GOESGCP: Fetching files between {start} and {end}...")
+
+     # Ensure datetime objects
+     start = pd.to_datetime(start).tz_localize('UTC')
+     end = pd.to_datetime(end).tz_localize('UTC')
+
+     # Initialize list to store file metadata
+     files_metadata = []
+
+     # Generate the list of dates from start to end
+     current_time = start
+     while current_time < end:
+         year = current_time.year
+         julian_day = str(current_time.timetuple().tm_yday).zfill(3)  # Julian day
+         hour = current_time.hour
+
+         # Generate the directory prefix
+         prefix = f"{base_prefix}/{get_directory_prefix(year, julian_day, hour)}"
+
+         # List blobs in the bucket for the current prefix
+         blobs = list_blobs(connection, bucket_name, prefix)
+
+         # Filter blobs by pattern
+         for blob in blobs:
+             if pattern in blob.name:
+                 files_metadata.append({
+                     'file_name': blob.name,
+                     'last_modified': blob.updated
+                 })
+
+         # Move to the next hour
+         current_time += timedelta(hours=1)
+
+     # Create a DataFrame from the list of files
+     df = pd.DataFrame(files_metadata)
+
+     if df.empty:
+         print("No files found matching the pattern.")
+         return pd.DataFrame()
+
+     # Ensure 'last_modified' is in the correct datetime format without timezone
+     df['last_modified'] = pd.to_datetime(df['last_modified']).dt.tz_localize(None)
+     start = pd.to_datetime(start).tz_localize(None)
+     end = pd.to_datetime(end).tz_localize(None)
+
+     # Filter the DataFrame based on the date range
+     df = df[(df['last_modified'] >= start) & (df['last_modified'] < end)]
+
+     # Filter the DataFrame based on the hour range
+     df['hour'] = df['last_modified'].dt.hour
+     df = df[(df['hour'] >= bt_hour[0]) & (df['hour'] <= bt_hour[1])]
+
+     # Filter the DataFrame based on the minute range
+     df['minute'] = df['last_modified'].dt.minute
+     df = df[(df['minute'] >= bt_min[0]) & (df['minute'] <= bt_min[1])]
+
+     # Filter the DataFrame based on the frequency
+     df['freq'] = df['last_modified'].dt.floor(freq)
+     df = df.groupby('freq').first().reset_index()
+
+     return df['file_name'].tolist()
+
  def get_recent_files(connection, bucket_name, base_prefix, pattern, min_files):
      """
      Fetches the most recent files in a GCP bucket.
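The hour-by-hour bucket scan in `get_files_period` can be tried out without any GCP access. The following is a minimal sketch, assuming the `<base_prefix>/<year>/<day of year>/<hour>/` layout produced by `get_directory_prefix`; `hourly_prefixes` is a hypothetical helper name used only for illustration and is not part of goesgcp.

```python
from datetime import datetime, timedelta


def hourly_prefixes(base_prefix, start, end):
    """Yield one '<base>/<year>/<day-of-year>/<hour>/' prefix per step in [start, end)."""
    current = start
    while current < end:
        # Day of year, zero-padded to three digits as in the diff above
        day_of_year = str(current.timetuple().tm_yday).zfill(3)
        yield f"{base_prefix}/{current.year}/{day_of_year}/{str(current.hour).zfill(2)}/"
        # Advance in whole-hour steps, starting from `start` itself
        current += timedelta(hours=1)


prefixes = list(hourly_prefixes("ABI-L2-CMIPF",
                                datetime(2022, 12, 15, 22),
                                datetime(2022, 12, 16, 1)))
print(prefixes)
```

Because the loop steps from `start` in whole hours, a `start` that is not on the hour may leave the last partial hour unlisted: a range of 22:30 to 23:10 visits only the hour-22 directory in this sketch.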
@@ -148,12 +226,31 @@ def crop_reproject(args):
      # Add global metadata comments
      ds.attrs['comments'] = "Data processed by goesgcp, author: Helvecio B. L. Neto (helvecioblneto@gmail.com)"

-     # Save as netcdf overwriting the original file
-     ds.to_netcdf(f'{output}{file.split("/")[-1]}', mode='w', format='NETCDF4_CLASSIC')
-
-     # Close the dataset
+     if save_format == 'by_date':
+         file_datetime = datetime.strptime(ds.time_coverage_start,
+                                           "%Y-%m-%dT%H:%M:%S.%fZ")
+         year = file_datetime.strftime("%Y")
+         month = file_datetime.strftime("%m")
+         day = file_datetime.strftime("%d")
+         output_directory = f"{output}{year}/{month}/{day}/"
+     elif save_format == 'julian':
+         file_datetime = datetime.strptime(ds.time_coverage_start,
+                                           "%Y-%m-%dT%H:%M:%S.%fZ")
+         year = file_datetime.strftime("%Y")
+         julian_day = file_datetime.timetuple().tm_yday
+         output_directory = f"{output}{year}/{julian_day}/"
+     else:
+         output_directory = output
+
+     # Create the output directory
+     pathlib.Path(output_directory).mkdir(parents=True, exist_ok=True)
+
+     # Save the file
+     output_file = f"{output_directory}{file.split('/')[-1]}"
+     ds.to_netcdf(output_file, mode='w', format='NETCDF4_CLASSIC')
+
+     # Close the dataset
      ds.close()
-
      return


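The three `save_format` branches added to `crop_reproject` can be isolated as a pure function. This is a sketch only, assuming the same `time_coverage_start` timestamp format as the hunk above; `output_directory` as a function name and the sample timestamps are illustrative assumptions, not goesgcp API.

```python
from datetime import datetime


def output_directory(output, save_format, time_coverage_start):
    """Return the directory one file would be written to for a given save_format."""
    if save_format not in ("by_date", "julian"):
        # 'flat' (or anything else): write straight into the output path
        return output
    stamp = datetime.strptime(time_coverage_start, "%Y-%m-%dT%H:%M:%S.%fZ")
    if save_format == "by_date":
        return f"{output}{stamp:%Y}/{stamp:%m}/{stamp:%d}/"
    # 'julian': year plus day of year, not zero-padded, as in the hunk above
    return f"{output}{stamp:%Y}/{stamp.timetuple().tm_yday}/"


print(output_directory("output/", "by_date", "2022-12-15T05:00:20.5Z"))
print(output_directory("output/", "julian", "2023-01-05T00:00:20.5Z"))
```

Note that the `julian` layout here uses the bare integer day of year (for example `5` for 5 January), whereas the bucket prefixes built in `get_files_period` pad it to three digits.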
@@ -197,16 +294,39 @@ def main():
197
294
 
198
295
  global output_path, var_name, \
199
296
  lat_min, lat_max, lon_min, lon_max, \
200
- max_attempts, parallel, recent, resolution, storage_client
297
+ max_attempts, parallel, recent, resolution, storage_client, \
298
+ satellite, product, op_mode, channel, save_format
201
299
 
202
300
  epilog = """
203
301
  Example usage:
204
302
 
205
- - To download recent 10 files from the GOES-16 satellite for the ABI-L2-CMIPF product:
303
+ - To download recent 3 files from the GOES-16 satellite for the ABI-L2-CMIPF product:
304
+
305
+ goesgcp --satellite goes16 --product ABI-L2-CMIP --recent 3
306
+
307
+ - To download files from the GOES-16 satellite for the ABI-L2-CMIPF product between 2022-12-15 and 2022-12-20:
308
+
309
+ goesgcp --start '2022-12-15 00:00:00' --end '2022-12-20 10:00:00' --bt_hour 5 6 --save_format by_date --resolution 0.045 --lat_min -35 --lat_max 5 --lon_min -80 --lon_max -30
206
310
 
207
- goesgcp --satellite goes16 --product ABI-L2-CMIP --recent 10 --output_path "output/"
208
311
  """
209
312
 
313
+ product_names = [
314
+ "ABI-L1b-RadF", "ABI-L1b-RadC", "ABI-L1b-RadM", "ABI-L2-ACHAC", "ABI-L2-ACHAF", "ABI-L2-ACHAM",
315
+ "ABI-L2-ACHTF", "ABI-L2-ACHTM", "ABI-L2-ACMC", "ABI-L2-ACMF", "ABI-L2-ACMM", "ABI-L2-ACTPC",
316
+ "ABI-L2-ACTPF", "ABI-L2-ACTPM", "ABI-L2-ADPC", "ABI-L2-ADPF", "ABI-L2-ADPM", "ABI-L2-AICEF",
317
+ "ABI-L2-AITAF", "ABI-L2-AODC", "ABI-L2-AODF", "ABI-L2-BRFC", "ABI-L2-BRFF", "ABI-L2-BRFM",
318
+ "ABI-L2-CMIPC", "ABI-L2-CMIPF", "ABI-L2-CMIPM", "ABI-L2-CODC", "ABI-L2-CODF", "ABI-L2-CPSC",
319
+ "ABI-L2-CPSF", "ABI-L2-CPSM", "ABI-L2-CTPC", "ABI-L2-CTPF", "ABI-L2-DMWC", "ABI-L2-DMWF",
320
+ "ABI-L2-DMWM", "ABI-L2-DMWVC", "ABI-L2-DMWVF", "ABI-L2-DMWVF", "ABI-L2-DSIC", "ABI-L2-DSIF",
321
+ "ABI-L2-DSIM", "ABI-L2-DSRC", "ABI-L2-DSRF", "ABI-L2-DSRM", "ABI-L2-FDCC", "ABI-L2-FDCF",
322
+ "ABI-L2-FDCM", "ABI-L2-LSAC", "ABI-L2-LSAF", "ABI-L2-LSAM", "ABI-L2-LSTC", "ABI-L2-LSTF",
323
+ "ABI-L2-LSTM", "ABI-L2-LVMPC", "ABI-L2-LVMPF", "ABI-L2-LVMPM", "ABI-L2-LVTPC", "ABI-L2-LVTPF",
324
+ "ABI-L2-LVTPM", "ABI-L2-MCMIPC", "ABI-L2-MCMIPF", "ABI-L2-MCMIPM", "ABI-L2-RRQPEF",
325
+ "ABI-L2-RSRC", "ABI-L2-RSRF", "ABI-L2-SSTF", "ABI-L2-TPWC", "ABI-L2-TPWF", "ABI-L2-TPWM",
326
+ "ABI-L2-VAAF", "EXIS-L1b-SFEU", "EXIS-L1b-SFXR", "GLM-L2-LCFA", "MAG-L1b-GEOF", "SEIS-L1b-EHIS",
327
+ "SEIS-L1b-MPSH", "SEIS-L1b-MPSL", "SEIS-L1b-SGPS", "SUVI-L1b-Fe093", "SUVI-L1b-Fe131",
328
+ "SUVI-L1b-Fe171", "SUVI-L1b-Fe195", "SUVI-L1b-Fe284", "SUVI-L1b-He303"
329
+ ]
210
330
 
211
331
  # Set arguments
212
332
  parser = argparse.ArgumentParser(description='Converts GOES-16 L2 data to netCDF',
@@ -214,12 +334,21 @@ def main():
214
334
  formatter_class=argparse.RawDescriptionHelpFormatter)
215
335
 
216
336
  # Satellite and product settings
217
- parser.add_argument('--satellite', type=str, default='goes-16', help='Name of the satellite (e.g., goes16)')
218
- parser.add_argument('--product', type=str, default='ABI-L2-CMIP', help='Name of the satellite product')
337
+ parser.add_argument('--satellite', type=str, default='goes-16', choices=['goes-16', 'goes-18'], help='Name of the satellite (e.g., goes16)')
338
+ parser.add_argument('--product', type=str, default='ABI-L2-CMIP', help='Name of the satellite product', choices=product_names)
219
339
  parser.add_argument('--var_name', type=str, default='CMI', help='Variable name to extract (e.g., CMI)')
220
340
  parser.add_argument('--channel', type=int, default=13, help='Channel to use (e.g., 13)')
221
- parser.add_argument('--domain', type=str, default='F', help='Domain to use (e.g., F or C)')
222
- parser.add_argument('--recent', type=int, default=3, help='Number of recent files to download')
341
+ parser.add_argument('--op_mode', type=str, default='M6C', help='Operational mode to use (e.g., M6C)')
342
+
343
+ # Recent files settings
344
+ parser.add_argument('--recent', type=int, help='Number of recent files to download (e.g., 3)')
345
+
346
+ # Date and time settings
347
+ parser.add_argument('--start', type=str, help='Start date in YYYY-MM-DD format')
348
+ parser.add_argument('--end', type=str, help='End date in YYYY-MM-DD format')
349
+ parser.add_argument('--freq', type=str, default='10 min', help='Frequency for the time range (e.g., "10 min")')
350
+ parser.add_argument('--bt_hour', nargs=2, type=int, default=[0, 23], help='Filter data between these hours (e.g., 0 23)')
351
+ parser.add_argument('--bt_min', nargs=2, type=int, default=[0, 60], help='Filter data between these minutes (e.g., 0 60)')
223
352
 
224
353
  # Geographic bounding box
225
354
  parser.add_argument('--lat_min', type=float, default=-81.3282, help='Minimum latitude of the bounding box')
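The `--freq` value is used in `get_files_period` to floor each timestamp to a bin and keep the first file per bin. Below is a stdlib-only sketch of that thinning step. It swaps the pandas `dt.floor`/`groupby` calls for a dictionary, assumes the bin width divides 60 evenly, and uses the hypothetical name `first_per_bin` and made-up file names.

```python
from datetime import datetime, timedelta


def first_per_bin(records, minutes=10):
    """Keep the first (name, timestamp) record seen in each `minutes`-wide bin."""
    kept = {}
    for name, stamp in records:
        # Floor the timestamp to the start of its bin within the hour
        floored = stamp - timedelta(minutes=stamp.minute % minutes,
                                    seconds=stamp.second,
                                    microseconds=stamp.microsecond)
        # setdefault keeps the earliest-listed record for a bin
        kept.setdefault(floored, name)
    # Return names ordered by bin start time
    return [kept[key] for key in sorted(kept)]


records = [
    ("a.nc", datetime(2022, 12, 15, 5, 1, 12)),
    ("b.nc", datetime(2022, 12, 15, 5, 6, 40)),
    ("c.nc", datetime(2022, 12, 15, 5, 11, 5)),
]
print(first_per_bin(records))
```

With the default 10-minute bin, `a.nc` and `b.nc` fall into the same bin and only the first survives; a 5-minute bin would keep all three.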
@@ -233,6 +362,9 @@ def main():
233
362
  parser.add_argument('--parallel', type=lambda x: bool(strtobool(x)), default=True, help='Use parallel processing')
234
363
  parser.add_argument('--processes', type=int, default=4, help='Number of processes for parallel execution')
235
364
  parser.add_argument('--max_attempts', type=int, default=3, help='Number of attempts to download a file')
365
+ parser.add_argument('--save_format', type=str, default='flat', choices=['flat', 'by_date','julian'],
366
+ help="Save the files in a flat structure or by date")
367
+
236
368
 
237
369
  # Parse arguments
238
370
  args = parser.parse_args()
@@ -245,7 +377,7 @@ def main():
245
377
  output_path = args.output
246
378
  satellite = args.satellite
247
379
  product = args.product
248
- domain = args.domain
380
+ op_mode = args.op_mode
249
381
  channel = str(args.channel).zfill(2)
250
382
  var_name = args.var_name
251
383
  lat_min = args.lat_min
@@ -255,11 +387,22 @@ def main():
255
387
  resolution = args.resolution
256
388
  max_attempts = args.max_attempts
257
389
  parallel = args.parallel
390
+ recent = args.recent
391
+ start = args.start
392
+ end = args.end
393
+ freq = args.freq
394
+ bt_hour = args.bt_hour
395
+ bt_min = args.bt_min
396
+ save_format = args.save_format
397
+
398
+
399
+ # Check mandatory arguments
400
+ if not args.recent and not (args.start and args.end):
401
+ print("You must provide either the --recent or --start and --end arguments. Exiting...")
402
+ sys.exit(1)
258
403
 
259
404
  # Set bucket name and pattern
260
405
  bucket_name = "gcp-public-data-" + satellite
261
- pattern = "OR_"+product+domain+"-M6C"+channel+"_G" + satellite[-2:]
262
- min_files = args.recent
263
406
 
264
407
  # Create output directory
265
408
  pathlib.Path(output_path).mkdir(parents=True, exist_ok=True)
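The mandatory-argument check above, together with the later `if start and end` branch, amounts to a small decision rule. Here is a minimal sketch with a cut-down parser; `build_parser` and `selection_mode` are hypothetical names, and only the three relevant options are declared.

```python
import argparse


def build_parser():
    """Cut-down parser with only the options that select the download mode."""
    parser = argparse.ArgumentParser(prog="goesgcp-sketch")
    parser.add_argument("--recent", type=int)
    parser.add_argument("--start", type=str)
    parser.add_argument("--end", type=str)
    return parser


def selection_mode(args):
    """Return 'period', 'recent', or None when neither option set is complete."""
    if args.start and args.end:
        # A complete time range takes precedence over --recent
        return "period"
    if args.recent:
        return "recent"
    return None


parser = build_parser()
print(selection_mode(parser.parse_args(["--recent", "3"])))
print(selection_mode(parser.parse_args(["--start", "2022-12-15"])))
```

In this sketch a lone `--start` (or `--end`) yields `None`, which corresponds to the "Exiting..." path in the diff, and `--recent 0` is treated the same as omitting `--recent`.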
@@ -274,11 +417,20 @@ def main():
274
417
  print(f"Bucket {bucket_name} not found. Exiting...")
275
418
  sys.exit(1)
276
419
 
277
- # Search for recent files
278
- recent_files = get_recent_files(storage_client, bucket_name, product + domain, pattern, min_files)
420
+ # Set pattern for the files
421
+ pattern = "OR_"+product+"-"+op_mode+channel+"_G" + satellite[-2:]
422
+
423
+ # Check operational mode if is recent or specific date
424
+ if start and end:
425
+ files_list = get_files_period(storage_client, bucket_name,
426
+ product, pattern, start, end,
427
+ bt_hour, bt_min, freq)
428
+ else:
429
+ # Get recent files
430
+ files_list = get_recent_files(storage_client, bucket_name, product, pattern, recent)
279
431
 
280
432
  # Check if any files were found
281
- if not recent_files:
433
+ if not files_list:
282
434
  print(f"No files found with the pattern {pattern}. Exiting...")
283
435
  sys.exit(1)
284
436
 
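The bucket name and file-name pattern built in this hunk are plain string concatenations, so they can be checked in isolation. This is a sketch under the assumption that the inputs look like the CLI defaults; `bucket_for` and `file_pattern` are hypothetical helper names.

```python
def bucket_for(satellite):
    """Public GCP bucket name derived from the --satellite value."""
    return "gcp-public-data-" + satellite


def file_pattern(product, op_mode, channel, satellite):
    """Substring that a blob name must contain to be selected."""
    # The channel is zero-padded to two digits, as main() does with zfill(2)
    return "OR_" + product + "-" + op_mode + str(channel).zfill(2) + "_G" + satellite[-2:]


print(bucket_for("goes-16"))
print(file_pattern("ABI-L2-CMIPF", "M6C", 13, "goes-16"))
```

Since the mode and channel segment is always appended, products whose file names carry no such segment would presumably not match this pattern.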
@@ -286,14 +438,14 @@ def main():
286
438
  pathlib.Path('tmp/').mkdir(parents=True, exist_ok=True)
287
439
 
288
440
  # Download files
289
- print(f"GOESGCP: Downloading and processing {len(recent_files)} files...")
290
- loading_bar = tqdm.tqdm(total=len(recent_files), ncols=100, position=0, leave=True,
441
+ print(f"GOESGCP: Downloading and processing {len(files_list)} files...")
442
+ loading_bar = tqdm.tqdm(total=len(files_list), ncols=100, position=0, leave=True,
291
443
  bar_format='{l_bar}{bar}| {n_fmt}/{total_fmt} + \
292
444
  [Elapsed:{elapsed} Remaining:<{remaining}]')
293
445
 
294
446
  if parallel: # Run in parallel
295
447
  # Create a list of tasks
296
- tasks = [(bucket_name, file, f"tmp/{file.split('/')[-1]}") for file in recent_files]
448
+ tasks = [(bucket_name, file, f"tmp/{file.split('/')[-1]}") for file in files_list]
297
449
 
298
450
  # Download files in parallel
299
451
  with Pool(processes=args.processes) as pool:
@@ -301,7 +453,7 @@ def main():
301
453
  loading_bar.update(1)
302
454
  loading_bar.close()
303
455
  else: # Run in serial
304
- for file in recent_files:
456
+ for file in files_list:
305
457
  local_path = f"tmp/{file.split('/')[-1]}"
306
458
  process_file((bucket_name, file, local_path))
307
459
  loading_bar.update(1)
@@ -0,0 +1,118 @@
1
+ Metadata-Version: 2.2
2
+ Name: goesgcp
3
+ Version: 2.0.1
4
+ Summary: A package to download and process GOES-16/17 data
5
+ Home-page: https://github.com/helvecioneto/goesgcp
6
+ Author: Helvecio B. L. Neto
7
+ Author-email: helvecioblneto@gmail.com
8
+ License: LICENSE
9
+ Classifier: Programming Language :: Python
10
+ Classifier: Development Status :: 5 - Production/Stable
11
+ Classifier: Operating System :: OS Independent
12
+ Classifier: Programming Language :: Python :: 3.10
13
+ Classifier: Programming Language :: Python :: 3.11
14
+ Classifier: Programming Language :: Python :: 3.12
15
+ Classifier: Topic :: Scientific/Engineering
16
+ Classifier: Topic :: Software Development
17
+ Classifier: Topic :: Utilities
18
+ Description-Content-Type: text/markdown
19
+ License-File: LICENSE
20
+ Requires-Dist: google-cloud-storage
21
+ Requires-Dist: pyproj
22
+ Requires-Dist: xarray
23
+ Requires-Dist: netcdf4
24
+ Requires-Dist: rioxarray
25
+ Dynamic: author
26
+ Dynamic: author-email
27
+ Dynamic: classifier
28
+ Dynamic: description
29
+ Dynamic: description-content-type
30
+ Dynamic: home-page
31
+ Dynamic: license
32
+ Dynamic: requires-dist
33
+ Dynamic: summary
34
+
35
+ # goesgcp
36
+ <!-- badges: start -->
37
+ [![pypi](https://badge.fury.io/py/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
38
+ [![Downloads](https://img.shields.io/pypi/dm/goesgcp.svg)](https://pypi.python.org/pypi/goesgcp)
39
+ [![Contributors](https://img.shields.io/github/contributors/helvecioneto/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/graphs/contributors)
40
+ [![License](https://img.shields.io/pypi/l/goesgcp.svg)](https://github.com/helvecioneto/goesgcp/blob/main/LICENSE)
41
+ <!-- badges: end -->
42
+
43
+
44
+ `goesgcp` is a Python utility designed for downloading and reprojecting GOES-R satellite data. This script leverages the `google.cloud` library for accessing data from the Google Cloud Platform (GCP) and `pyproj` for reprojecting data to EPSG:4326, as well as cropping it to a user-defined bounding box.
45
+
46
+ ## Features
47
+
48
+ - **Download GOES-R satellite data**: Supports GOES-16 and GOES-17.
49
+ - **Reprojection and cropping**: Reprojects data to EPSG:4326 and crops to a specified bounding box.
50
+ - **Flexible command-line interface**: Customize download options, variables, channels, time range, and output format.
51
+ - **Efficient processing**: Handles large datasets with optimized performance.
52
+
53
+ ## Installation
54
+
55
+ Install the necessary dependencies via `pip`:
56
+
57
+ ```bash
58
+ pip install goesgcp
59
+ ```
60
+
61
+
62
+ ## Usage
63
+
64
+ ### Command-Line Arguments
65
+
66
+ The script uses the `argparse` module for handling command-line arguments. Below are the available options:
67
+
68
+ ```bash
69
+ goesgcp [OPTIONS]
70
+ ```
71
+
72
+ | Option | Description |
73
+ |----------------------|----------------------------------------------------------------------------|
74
+ | `--satellite` | Name of the satellite: `goes-16` or `goes-18` (default: `goes-16`). |
75
+ | `--product` | Name of the satellite product (e.g., ABI-L2-CMIPF). |
76
+ | `--var_name` | Variable name to extract (e.g., CMI). |
77
+ | `--channel` | Channel to use (e.g., 13). |
78
+ | `--output` | Path for saving output files (default: `output/`). |
79
+ | `--lat_min` | Minimum latitude of the bounding box (default: `-56`). |
80
+ | `--lat_max` | Maximum latitude of the bounding box (default: `35`). |
81
+ | `--lon_min` | Minimum longitude of the bounding box (default: `-116`). |
82
+ | `--lon_max` | Maximum longitude of the bounding box (default: `-25`). |
83
+ | `--resolution` | Resolution of the reprojected data, in degrees (default: `-0.045`). |
84
+ | `--recent` | Number of most recent files to download (no default; required unless `--start` and `--end` are given). |
85
+ | `--start` | Start date for downloading data (default: `None`). |
86
+ | `--end` | End date for downloading data (default: `None`). |
87
+ | `--bt_hour` | First and last hour of the day to keep, as two values (default: `0 23`). |
88
+ | `--bt_min` | First and last minute of the hour to keep, as two values (default: `0 60`). |
89
+ | `--save_format` | Directory layout for saved files: `flat`, `by_date`, or `julian` (default: `flat`). |
90
+
91
+ #### Available GOES Products
92
+ A comprehensive list of available GOES products can be found at the following link: [https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16](https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16)
93
+
94
+ ### Examples
95
+
96
+ #### Download Recent Data
97
+ In the example below, the command downloads the 3 most recent files from the GOES-16 satellite for the product ABI-L2-CMIPF. It focuses on the variable CMI (Cloud and Moisture Imagery) from channel 13, which is commonly used for infrared observations. The downloaded files are saved to the specified output directory output/.
98
+
99
+ ```bash
100
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --var_name CMI --channel 13 --recent 3 --output "output/"
101
+ ```
102
+
103
+ #### Download Data for a Specific Time Range
104
+ This command retrieves GOES-16 satellite data for the product ABI-L2-CMIPF within the date range 2022-12-15 00:00:00 to 2022-12-20 10:00:00, focusing on hours 5:00 and 6:00 AM. The data is cropped to the geographic bounds of -35° to 5° latitude and -80° to -30° longitude, reprojected with a resolution of 0.045 degrees, and saved in a by_date format for easy organization.
105
+
106
+ ```bash
107
+ goesgcp --satellite goes-16 --product ABI-L2-CMIPF --start '2022-12-15 00:00:00' --end '2022-12-20 10:00:00' --bt_hour 5 6 --save_format by_date --resolution 0.045 --lat_min -35 --lat_max 5 --lon_min -80 --lon_max -30
108
+ ```
109
+
110
+ ### Contributing
111
+ Contributions are welcome! If you encounter issues or have suggestions for improvements, please submit them via GitHub issues or pull requests.
112
+
113
+ ### Credits
114
+ This project was developed and optimized by Helvecio Neto (2025).
115
+ It builds upon NOAA GOES-R data and leverages resources provided by the Google Cloud Platform.
116
+
117
+ ### License
118
+ This project is licensed under the MIT License.
@@ -13,7 +13,7 @@ with open('requirements.txt') as f:
13
13
 
14
14
  setup(
15
15
  name="goesgcp",
16
- version='1.0.9',
16
+ version='2.0.1',
17
17
  author="Helvecio B. L. Neto",
18
18
  author_email="helvecioblneto@gmail.com",
19
19
  description="A package to download and process GOES-16/17 data",
goesgcp-1.0.9/PKG-INFO DELETED
@@ -1,71 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: goesgcp
3
- Version: 1.0.9
4
- Summary: A package to download and process GOES-16/17 data
5
- Home-page: https://github.com/helvecioneto/goesgcp
6
- Author: Helvecio B. L. Neto
7
- Author-email: helvecioblneto@gmail.com
8
- License: LICENSE
9
- Classifier: Programming Language :: Python
10
- Classifier: Development Status :: 5 - Production/Stable
11
- Classifier: Operating System :: OS Independent
12
- Classifier: Programming Language :: Python :: 3.10
13
- Classifier: Programming Language :: Python :: 3.11
14
- Classifier: Programming Language :: Python :: 3.12
15
- Classifier: Topic :: Scientific/Engineering
16
- Classifier: Topic :: Software Development
17
- Classifier: Topic :: Utilities
18
- Description-Content-Type: text/markdown
19
- License-File: LICENSE
20
- Requires-Dist: google-cloud-storage
21
- Requires-Dist: pyproj
22
- Requires-Dist: xarray
23
- Requires-Dist: netcdf4
24
- Requires-Dist: rioxarray
25
-
26
- # goesgcp
27
-
28
- goesgcp is a utility script for downloading and reprojecting GOES-R satellite data. The script uses the `google.cloud` library to download data from the Google Cloud Platform (GCP) and the `pyproj` library to reproject the data to EPSG:4326 and crop it to a specified bounding box.
29
-
30
-
31
- ## Installation
32
-
33
- You can install the necessary dependencies using `pip`:
34
-
35
- ```bash
36
- pip install goesgcp
37
- ```
38
-
39
- ## Usage
40
-
41
- ### Command-Line Arguments
42
-
43
- The script uses the `argparse` module for handling command-line arguments. Below are the available options:
44
-
45
- ```bash
46
- goesgcp [OPTIONS]
47
- ```
48
-
49
- | Option | Description |
50
- |----------------------|----------------------------------------------------------------------------|
51
- | `--satellite` | Name of the satellite (e.g., goes16). |
52
- | `--product` | Name of the satellite product (e.g., ABI-L2-CMIPF). |
53
- | `--var_name` | Variable name to extract (e.g., CMI). |
54
- | `--channel` | Channel to use (e.g., 13). |
55
- | `--output` | Path for saving output files (default: `output/`). |
56
- | `--lat_min` | Minimum latitude of the bounding box (default: `-56`). |
57
- | `--lat_max` | Maximum latitude of the bounding box (default: `35`). |
58
- | `--lon_min` | Minimum longitude of the bounding box (default: `-116`). |
59
- | `--lon_max` | Maximum longitude of the bounding box (default: `-25`). |
60
- | `--resolution` | Set the reprojet data resolution in degree (default: `-0.045`). |
61
-
62
- ### Examples
63
-
64
- To download most 3 recent data for the GOES-16 satellite, ABI-L2-CMIPF product, variable CMI, and channel 13, run the following command:
65
-
66
- ```bash
67
- goesgcp --satellite goes16 --product ABI-L2-CMIPF --var_name CMI --channel 13 --recent 3 --output "output/"
68
- ```
69
-
70
- ### Credits
71
- And this is a otimization by Helvecio Neto - 2025
goesgcp-1.0.9/README.md DELETED
@@ -1,46 +0,0 @@
1
- # goesgcp
2
-
3
- goesgcp is a utility script for downloading and reprojecting GOES-R satellite data. The script uses the `google.cloud` library to download data from the Google Cloud Platform (GCP) and the `pyproj` library to reproject the data to EPSG:4326 and crop it to a specified bounding box.
4
-
5
-
6
- ## Installation
7
-
8
- You can install the necessary dependencies using `pip`:
9
-
10
- ```bash
11
- pip install goesgcp
12
- ```
13
-
14
- ## Usage
15
-
16
- ### Command-Line Arguments
17
-
18
- The script uses the `argparse` module for handling command-line arguments. Below are the available options:
19
-
20
- ```bash
21
- goesgcp [OPTIONS]
22
- ```
23
-
24
- | Option | Description |
25
- |----------------------|----------------------------------------------------------------------------|
26
- | `--satellite` | Name of the satellite (e.g., goes16). |
27
- | `--product` | Name of the satellite product (e.g., ABI-L2-CMIPF). |
28
- | `--var_name` | Variable name to extract (e.g., CMI). |
29
- | `--channel` | Channel to use (e.g., 13). |
30
- | `--output` | Path for saving output files (default: `output/`). |
31
- | `--lat_min` | Minimum latitude of the bounding box (default: `-56`). |
32
- | `--lat_max` | Maximum latitude of the bounding box (default: `35`). |
33
- | `--lon_min` | Minimum longitude of the bounding box (default: `-116`). |
34
- | `--lon_max` | Maximum longitude of the bounding box (default: `-25`). |
35
- | `--resolution` | Set the reprojet data resolution in degree (default: `-0.045`). |
36
-
37
- ### Examples
38
-
39
- To download most 3 recent data for the GOES-16 satellite, ABI-L2-CMIPF product, variable CMI, and channel 13, run the following command:
40
-
41
- ```bash
42
- goesgcp --satellite goes16 --product ABI-L2-CMIPF --var_name CMI --channel 13 --recent 3 --output "output/"
43
- ```
44
-
45
- ### Credits
46
- And this is a otimization by Helvecio Neto - 2025
File without changes
File without changes
File without changes