cloudnet-api-client 0.1.3__tar.gz → 0.2.1__tar.gz

@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## 0.2.1 – 2025-04-02
+
+ - Return full paths of downloaded files
+
+ ## 0.2.0 – 2025-04-01
+
+ - Add download progress bar
+ - Add `adownload` function for asynchronous context
+ - Only retry on aiohttp errors
+ - Extend date parameters
+ - Improve type hints
+
  ## 0.1.3 – 2025-03-31
 
  - Add py.typed
@@ -0,0 +1,161 @@
+ Metadata-Version: 2.4
+ Name: cloudnet-api-client
+ Version: 0.2.1
+ Summary: Cloudnet API client
+ Author-email: Simo Tukiainen <simo.tukiainen@fmi.fi>
+ License-File: LICENSE
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Topic :: Scientific/Engineering :: Atmospheric Science
+ Requires-Python: >=3.10
+ Requires-Dist: aiohttp
+ Requires-Dist: numpy
+ Requires-Dist: requests
+ Requires-Dist: tqdm
+ Provides-Extra: dev
+ Requires-Dist: pre-commit; extra == 'dev'
+ Requires-Dist: release-version; extra == 'dev'
+ Requires-Dist: types-requests; extra == 'dev'
+ Requires-Dist: types-tqdm; extra == 'dev'
+ Provides-Extra: test
+ Requires-Dist: mypy; extra == 'test'
+ Requires-Dist: pytest; extra == 'test'
+ Description-Content-Type: text/markdown
+
+ [![CI](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml/badge.svg)](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml)
+ [![PyPI version](https://badge.fury.io/py/cloudnet-api-client.svg)](https://badge.fury.io/py/cloudnet-api-client)
+
+ # Cloudnet API client
+
+ Official Python client for the [Cloudnet data portal API](https://docs.cloudnet.fmi.fi/api/data-portal.html).
+
+ ## Installation
+
+ ```bash
+ python3 -m pip install cloudnet-api-client
+ ```
+
+ ## Quickstart
+
+ ```python
+ import cloudnet_api_client as cac
+
+ client = cac.APIClient()
+
+ sites = client.sites(type="cloudnet")
+ products = client.products()
+
+ metadata = client.metadata("hyytiala", "2021-01-01", product=["mwr", "radar"])
+ cac.download(metadata, "data/")
+
+ raw_metadata = client.raw_metadata("granada", date="2024-01", instrument_id="parsivel")
+ cac.download(raw_metadata, "data_raw/")
+ ```
+
+ ## Documentation
+
+ ### `APIClient().metadata()` and `raw_metadata()` &rarr; `list[Metadata]`
+
+ Fetch product and raw file metadata from the Cloudnet data portal.
+
+ Parameters:
+
+ | name | type | default | example |
+ | --------------- | --------------------------- | ------- | ---------------------------------------------------- |
+ | site_id | `str` | | "hyytiala" |
+ | date | `str` or `date` | `None` | "2024-01-01" |
+ | date_from | `str` or `date` | `None` | "2025-01-01" |
+ | date_to | `str` or `date` | `None` | "2025-01-01" |
+ | updated_at | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | updated_at_from | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | updated_at_to | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | instrument_id | `str` or `list[str]` | `None` | "rpg-fmcw-94" |
+ | instrument_pid | `str` or `list[str]` | `None` | "https://hdl.handle.net/21.12132/3.191564170f8a4686" |
+ | product\* | `str` or `list[str]` | `None` | "classification" |
+ | show_legacy\* | `bool` | `False` | |
+
+ \* = only in `metadata()`
+
+ **Date Handling**
+
+ The `date`, `date_from` and `date_to` parameters support:
+
+ - "YYYY-MM-DD" — a specific date
+ - "YYYY-MM" — the entire month
+ - "YYYY" — the entire year
+ - Or directly as `datetime.date` object
+
+ In addition to these, the `updated_at`, `updated_at_from` and `updated_at_to` parameters support:
+
+ - "YYYY-MM-DDTHH" — a specific hour
+ - "YYYY-MM-DDTHH:MM" — a specific minute
+ - "YYYY-MM-DDTHH:MM:SS" — a specific second
+ - "YYYY-MM-DDTHH:MM:SS.FFFFFF" — a specific microsecond
+ - Or directly as `datetime.datetime` object
+
+ **Return value**
+
+ Both methods return a list of `dataclass` instances, `ProductMetadata` and `RawMetadata`, respectively.
+
+ ### `APIClient().filter(list[Metadata])` &rarr; `list[Metadata]`
+
+ Additional filtering of fetched metadata.
+
+ Parameters:
+
+ | name | type | default |
+ | ------------------ | ---------------------------------------------- | ------- |
+ | metadata | `list[RawMetadata]` or `list[ProductMetadata]` | |
+ | include_pattern | `str` | `None` |
+ | exclude_pattern | `str` | `None` |
+ | filename_prefix | `str` | `None` |
+ | filename_suffix | `str` | `None` |
+ | include_tag_subset | `set[str]` | `None` |
+ | exclude_tag_subset | `set[str]` | `None` |
+
+ ### `APIClient().sites()` &rarr; `list[Site]`
+
+ Fetch cloudnet sites.
+
+ Parameters:
+
+ | name | type | Choices | default |
+ | ---- | -------------------- | ----------------------------------------- | ------- |
+ | type | `str` or `list[str]` | "cloudnet", "campaign", "model", "hidden" | `None` |
+
+ ### `APIClient().products()` &rarr; `list[Product]`
+
+ Fetch cloudnet products.
+
+ Parameters:
+
+ | name | type | Choices | default |
+ | ---- | -------------------- | ----------------------------------------- | ------- |
+ | type | `str` or `list[str]` | "instrument", "geophysical", "evaluation" | `None` |
+
+ ### `APIClient().instruments()` &rarr; `list[Instrument]`
+
+ Fetch cloudnet instruments.
+
+ ### `cloudnet_api_client.download(list[Metadata])` &rarr; `list[Path]`
+
+ Download files from the fetched metadata.
+
+ Parameters:
+
+ | name | type | default |
+ | ----------------- | ---------------------------------------------- | ------- |
+ | metadata | `list[RawMetadata]` or `list[ProductMetadata]` | |
+ | output_directory | `PathLike` or `str` | |
+ | concurrency_limit | `int` | 5 |
+ | progress | `bool` or `None` | `None` |
+
+ There's also an asynchronous version of this function:
+ `cloudnet_api_client.adownload`. It's useful for usage inside Jupyter notebook.
+
+ ## License
+
+ MIT
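Reviewer note on the **Date Handling** section in the hunk above: the partial date strings ("YYYY", "YYYY-MM") stand for whole periods. The following is a minimal standalone sketch of that expansion, modelled on the `_parse_date` helper that appears later in this diff. The function name `expand_date` is hypothetical and not part of the package.

```python
import calendar
import datetime


def expand_date(value: str) -> tuple[datetime.date, datetime.date]:
    """Return the first and last day covered by "YYYY", "YYYY-MM" or "YYYY-MM-DD".

    Hypothetical helper for illustration; it mirrors the idea of the
    package's private `_parse_date`, not its exact code.
    """
    # int() raises ValueError on non-numeric parts, e.g. "2024-ab".
    parts = [int(part) for part in value.split("-")]
    if len(parts) == 3:
        day = datetime.date(*parts)
        return day, day
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        return datetime.date(year, month, 1), datetime.date(year, month, last_day)
    if len(parts) == 1:
        (year,) = parts
        return datetime.date(year, 1, 1), datetime.date(year, 12, 31)
    raise ValueError(f"Invalid date format: {value}")


print(expand_date("2024-02"))  # February of a leap year: ends on the 29th
print(expand_date("2021"))
```

In the package itself, the two ends of the range are sent as the `dateFrom` and `dateTo` query parameters, as the `_add_date_params` hunk shows.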
@@ -0,0 +1,134 @@
+ [![CI](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml/badge.svg)](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml)
+ [![PyPI version](https://badge.fury.io/py/cloudnet-api-client.svg)](https://badge.fury.io/py/cloudnet-api-client)
+
+ # Cloudnet API client
+
+ Official Python client for the [Cloudnet data portal API](https://docs.cloudnet.fmi.fi/api/data-portal.html).
+
+ ## Installation
+
+ ```bash
+ python3 -m pip install cloudnet-api-client
+ ```
+
+ ## Quickstart
+
+ ```python
+ import cloudnet_api_client as cac
+
+ client = cac.APIClient()
+
+ sites = client.sites(type="cloudnet")
+ products = client.products()
+
+ metadata = client.metadata("hyytiala", "2021-01-01", product=["mwr", "radar"])
+ cac.download(metadata, "data/")
+
+ raw_metadata = client.raw_metadata("granada", date="2024-01", instrument_id="parsivel")
+ cac.download(raw_metadata, "data_raw/")
+ ```
+
+ ## Documentation
+
+ ### `APIClient().metadata()` and `raw_metadata()` &rarr; `list[Metadata]`
+
+ Fetch product and raw file metadata from the Cloudnet data portal.
+
+ Parameters:
+
+ | name | type | default | example |
+ | --------------- | --------------------------- | ------- | ---------------------------------------------------- |
+ | site_id | `str` | | "hyytiala" |
+ | date | `str` or `date` | `None` | "2024-01-01" |
+ | date_from | `str` or `date` | `None` | "2025-01-01" |
+ | date_to | `str` or `date` | `None` | "2025-01-01" |
+ | updated_at | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | updated_at_from | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | updated_at_to | `str`, `date` or `datetime` | `None` | "2025-01-01T12:00:00" |
+ | instrument_id | `str` or `list[str]` | `None` | "rpg-fmcw-94" |
+ | instrument_pid | `str` or `list[str]` | `None` | "https://hdl.handle.net/21.12132/3.191564170f8a4686" |
+ | product\* | `str` or `list[str]` | `None` | "classification" |
+ | show_legacy\* | `bool` | `False` | |
+
+ \* = only in `metadata()`
+
+ **Date Handling**
+
+ The `date`, `date_from` and `date_to` parameters support:
+
+ - "YYYY-MM-DD" — a specific date
+ - "YYYY-MM" — the entire month
+ - "YYYY" — the entire year
+ - Or directly as `datetime.date` object
+
+ In addition to these, the `updated_at`, `updated_at_from` and `updated_at_to` parameters support:
+
+ - "YYYY-MM-DDTHH" — a specific hour
+ - "YYYY-MM-DDTHH:MM" — a specific minute
+ - "YYYY-MM-DDTHH:MM:SS" — a specific second
+ - "YYYY-MM-DDTHH:MM:SS.FFFFFF" — a specific microsecond
+ - Or directly as `datetime.datetime` object
+
+ **Return value**
+
+ Both methods return a list of `dataclass` instances, `ProductMetadata` and `RawMetadata`, respectively.
+
+ ### `APIClient().filter(list[Metadata])` &rarr; `list[Metadata]`
+
+ Additional filtering of fetched metadata.
+
+ Parameters:
+
+ | name | type | default |
+ | ------------------ | ---------------------------------------------- | ------- |
+ | metadata | `list[RawMetadata]` or `list[ProductMetadata]` | |
+ | include_pattern | `str` | `None` |
+ | exclude_pattern | `str` | `None` |
+ | filename_prefix | `str` | `None` |
+ | filename_suffix | `str` | `None` |
+ | include_tag_subset | `set[str]` | `None` |
+ | exclude_tag_subset | `set[str]` | `None` |
+
+ ### `APIClient().sites()` &rarr; `list[Site]`
+
+ Fetch cloudnet sites.
+
+ Parameters:
+
+ | name | type | Choices | default |
+ | ---- | -------------------- | ----------------------------------------- | ------- |
+ | type | `str` or `list[str]` | "cloudnet", "campaign", "model", "hidden" | `None` |
+
+ ### `APIClient().products()` &rarr; `list[Product]`
+
+ Fetch cloudnet products.
+
+ Parameters:
+
+ | name | type | Choices | default |
+ | ---- | -------------------- | ----------------------------------------- | ------- |
+ | type | `str` or `list[str]` | "instrument", "geophysical", "evaluation" | `None` |
+
+ ### `APIClient().instruments()` &rarr; `list[Instrument]`
+
+ Fetch cloudnet instruments.
+
+ ### `cloudnet_api_client.download(list[Metadata])` &rarr; `list[Path]`
+
+ Download files from the fetched metadata.
+
+ Parameters:
+
+ | name | type | default |
+ | ----------------- | ---------------------------------------------- | ------- |
+ | metadata | `list[RawMetadata]` or `list[ProductMetadata]` | |
+ | output_directory | `PathLike` or `str` | |
+ | concurrency_limit | `int` | 5 |
+ | progress | `bool` or `None` | `None` |
+
+ There's also an asynchronous version of this function:
+ `cloudnet_api_client.adownload`. It's useful for usage inside Jupyter notebook.
+
+ ## License
+
+ MIT
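Reviewer note on the `download` / `adownload` pair documented above: per the `dl.py` hunk in this diff, `download` is a thin wrapper that calls `asyncio.run(adownload(...))`. `asyncio.run` raises `RuntimeError` when an event loop is already running, which is most likely why the README recommends `adownload` in Jupyter. The sketch below reproduces that wrapper pattern with hypothetical stand-ins (`afetch_all`, `fetch_all`), so it runs without the package or network access.

```python
import asyncio


async def afetch_all(names: list[str]) -> list[str]:
    """Hypothetical stand-in for an async downloader such as `adownload`."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return [f"data/{name}" for name in names]


def fetch_all(names: list[str]) -> list[str]:
    """Synchronous wrapper that starts its own event loop, like `download`."""
    return asyncio.run(afetch_all(names))


# Plain script: no event loop is running, so the synchronous wrapper works.
paths_sync = fetch_all(["a.nc", "b.nc"])


async def notebook_cell() -> list[str]:
    # Inside a running loop (as in a notebook cell), await the coroutine directly.
    return await afetch_all(["a.nc", "b.nc"])


# asyncio.run here only simulates the loop a notebook would already provide.
paths_async = asyncio.run(notebook_cell())
print(paths_sync == paths_async)
```

With the real package, the notebook equivalent would presumably be `paths = await cac.adownload(metadata, "data/")`, assuming the same arguments as `download`.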
@@ -1,2 +1,3 @@
  from .client import APIClient as APIClient
+ from .dl import adownload as adownload
  from .dl import download as download
@@ -22,6 +22,7 @@ from cloudnet_api_client.containers import (
 
  T = TypeVar("T")
  DateParam = str | datetime.date | None
+ DateTimeParam = str | datetime.datetime | datetime.date | None
  QueryParam = str | list[str] | None
 
 
@@ -72,8 +73,9 @@ class APIClient:
          date: DateParam = None,
          date_from: DateParam = None,
          date_to: DateParam = None,
-         updated_at_from: DateParam = None,
-         updated_at_to: DateParam = None,
+         updated_at: DateTimeParam = None,
+         updated_at_from: DateTimeParam = None,
+         updated_at_to: DateTimeParam = None,
          instrument_id: QueryParam = None,
          instrument_pid: QueryParam = None,
          model_id: QueryParam = None,
@@ -87,10 +89,10 @@ class APIClient:
              "product": product,
              "showLegacy": show_legacy,
          }
-         date_params = _mangle_dates(
-             date, date_from, date_to, updated_at_from, updated_at_to
+         _add_date_params(
+             params, date, date_from, date_to, updated_at, updated_at_from, updated_at_to
          )
-         params.update(date_params)
+
          files_res = self._get_response("files", params)
 
          # Add model files if requested
  # Add model files if requested
@@ -100,7 +102,7 @@ class APIClient:
              params["model"] = model_id
              files_res += self._get_response("model-files", params)
 
-         return _build_objects(files_res, ProductMetadata)
+         return _build_meta_objects(files_res)
 
      def raw_metadata(
          self,
@@ -108,8 +110,9 @@ class APIClient:
          date: DateParam = None,
          date_from: DateParam = None,
          date_to: DateParam = None,
-         updated_at_from: DateParam = None,
-         updated_at_to: DateParam = None,
+         updated_at: DateTimeParam = None,
+         updated_at_from: DateTimeParam = None,
+         updated_at_to: DateTimeParam = None,
          instrument_id: QueryParam = None,
          instrument_pid: QueryParam = None,
      ) -> list[RawMetadata]:
@@ -118,10 +121,9 @@ class APIClient:
              "instrument": instrument_id,
              "instrumentPid": instrument_pid,
          }
-         date_params = _mangle_dates(
-             date, date_from, date_to, updated_at_from, updated_at_to
+         _add_date_params(
+             params, date, date_from, date_to, updated_at, updated_at_from, updated_at_to
          )
-         params.update(date_params)
          res = self._get_response("raw-files", params)
          return _build_raw_meta_objects(res)
 
@@ -172,48 +174,107 @@ class APIClient:
          return res.json()
 
 
- def _mangle_dates(
+ def _add_date_params(
+     params: dict,
      date: DateParam,
      date_from: DateParam,
      date_to: DateParam,
-     updated_at_from: DateParam,
-     updated_at_to: DateParam,
- ) -> dict:
-     params = {}
-     if isinstance(date, datetime.date):
-         params["date"] = date
-     elif isinstance(date, str):
-         if re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
-             params["date"] = _parse_date(date)
-         elif re.fullmatch(r"\d{4}-\d{2}", date):
-             date = datetime.datetime.strptime(date, "%Y-%m")
-             last_day_number = calendar.monthrange(date.year, date.month)[1]
-             params["dateFrom"] = datetime.date(date.year, date.month, 1)
-             params["dateTo"] = datetime.date(date.year, date.month, last_day_number)
-         elif re.fullmatch(r"\d{4}", date):
-             params["dateFrom"] = datetime.date(int(date), 1, 1)
-             params["dateTo"] = datetime.date(int(date), 12, 31)
-         else:
-             raise ValueError("Invalid date format")
-     else:
-         if date_from:
-             params["dateFrom"] = _parse_date(date_from)
-         if date_to:
-             params["dateTo"] = _parse_date(date_to)
-     if updated_at_from:
-         params["updatedAtFrom"] = _parse_date(updated_at_from)
-     if updated_at_to:
-         params["updatedAtTo"] = _parse_date(updated_at_to)
-     return params
-
-
- def _parse_date(date: str | datetime.date) -> datetime.date:
+     updated_at: DateTimeParam,
+     updated_at_from: DateTimeParam,
+     updated_at_to: DateTimeParam,
+ ):
+     if date is not None and (date_from is not None or date_to is not None):
+         msg = "Cannot use 'date' with 'date_from' and 'date_to'"
+         raise ValueError(msg)
+     if date is not None:
+         start, stop = _parse_date(date)
+         params["dateFrom"] = start.isoformat()
+         params["dateTo"] = stop.isoformat()
+     if date_from is not None:
+         params["dateFrom"] = _parse_date(date_from)[0].isoformat()
+     if date_to is not None:
+         params["dateTo"] = _parse_date(date_to)[1].isoformat()
+
+     if updated_at is not None and (
+         updated_at_from is not None or updated_at_to is not None
+     ):
+         msg = "Cannot use 'updated_at' with 'updated_at_from' and 'updated_at_to'"
+         raise ValueError(msg)
+     if updated_at is not None:
+         start, stop = _parse_datetime(updated_at)
+         params["updatedAtFrom"] = start.isoformat()
+         params["updatedAtTo"] = stop.isoformat()
+     if updated_at_from is not None:
+         params["updatedAtFrom"] = _parse_datetime(updated_at_from)[0].isoformat()
+     if updated_at_to is not None:
+         params["updatedAtTo"] = _parse_datetime(updated_at_to)[1].isoformat()
+
+
+ def _parse_date(date: DateParam) -> tuple[datetime.date, datetime.date]:
      if isinstance(date, datetime.date):
-         return date
-     try:
-         return datetime.datetime.strptime(date, "%Y-%m-%d").date()
-     except ValueError as e:
-         raise ValueError(f"Invalid date format: {date}") from e
+         return date, date
+     error = ValueError(f"Invalid date format: {date}")
+     if isinstance(date, str):
+         try:
+             parts = [int(part) for part in date.split("-")]
+         except ValueError:
+             raise error from None
+         match parts:
+             case [year, month, day]:
+                 date = datetime.date(year, month, day)
+                 return date, date
+             case [year, month]:
+                 last_day_number = calendar.monthrange(year, month)[1]
+                 return datetime.date(year, month, 1), datetime.date(
+                     year, month, last_day_number
+                 )
+             case [year]:
+                 return datetime.date(year, 1, 1), datetime.date(year, 12, 31)
+     raise error
+
+
+ def _parse_datetime(dt: DateTimeParam) -> tuple[datetime.datetime, datetime.datetime]:
+     if isinstance(dt, datetime.datetime):
+         return dt, dt
+     if isinstance(dt, datetime.date):
+         return datetime.datetime.combine(
+             dt, datetime.time(0, 0, 0, 0)
+         ), datetime.datetime.combine(dt, datetime.time(23, 59, 59, 999999))
+     if isinstance(dt, str):
+         patterns = {
+             ("%Y", "years"),
+             ("%Y-%m", "months"),
+             ("%Y-%m-%d", "days"),
+             ("%Y-%m-%dT%H", "hours"),
+             ("%Y-%m-%dT%H:%M", "minutes"),
+             ("%Y-%m-%dT%H:%M:%S", "seconds"),
+             ("%Y-%m-%dT%H:%M:%S.%f", "microseconds"),
+         }
+         for fmt, unit in patterns:
+             try:
+                 start_date = datetime.datetime.strptime(dt, fmt)
+             except ValueError:
+                 continue
+             if unit == "years":
+                 end_date = start_date.replace(year=start_date.year + 1)
+             elif unit == "months":
+                 if start_date.month == 12:
+                     end_date = start_date.replace(year=start_date.year + 1, month=1)
+                 else:
+                     end_date = start_date.replace(month=start_date.month + 1)
+             elif unit == "days":
+                 end_date = start_date + datetime.timedelta(days=1)
+             elif unit == "hours":
+                 end_date = start_date + datetime.timedelta(hours=1)
+             elif unit == "minutes":
+                 end_date = start_date + datetime.timedelta(minutes=1)
+             elif unit == "seconds":
+                 end_date = start_date + datetime.timedelta(seconds=1)
+             elif unit == "microseconds":
+                 return start_date, start_date
+             return start_date, end_date - datetime.timedelta(microseconds=1)
+     msg = f"Invalid datetime format: {dt}"
+     raise ValueError(msg)
 
 
  def _build_objects(res: list[dict], object_type: type[T]) -> list[T]:
@@ -228,6 +289,22 @@ def _build_objects(res: list[dict], object_type: type[T]) -> list[T]:
      return cast(list[T], objects)
 
 
+ def _build_meta_objects(res: list[dict]) -> list[ProductMetadata]:
+     field_names = {f.name for f in fields(ProductMetadata)} - {"product"}
+     return [
+         ProductMetadata(
+             **{_to_snake(k): v for k, v in obj.items() if _to_snake(k) in field_names},
+             product=Product(
+                 id=obj["product"]["id"],
+                 human_readable_name=obj["product"]["humanReadableName"],
+                 type=[obj["product"]["type"][1:-1]],
+                 experimental=obj["product"]["experimental"],
+             ),
+         )
+         for obj in res
+     ]
+
+
  def _build_raw_meta_objects(res: list[dict]) -> list[RawMetadata]:
      field_names = {f.name for f in fields(RawMetadata)} - {"instrument"}
      return [
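Reviewer note on `_parse_datetime` in the `client.py` hunks above: a partial timestamp is turned into an inclusive range by adding one unit and subtracting one microsecond. The following standalone sketch shows that step for a few fixed-length units only. `inclusive_range` and `UNIT_STEPS` are hypothetical names, and year and month handling is omitted for brevity.

```python
import datetime

# Format string paired with the length of the period it denotes.
# Hypothetical table for illustration; months and years need calendar-aware handling.
UNIT_STEPS = [
    ("%Y-%m-%d", datetime.timedelta(days=1)),
    ("%Y-%m-%dT%H", datetime.timedelta(hours=1)),
    ("%Y-%m-%dT%H:%M", datetime.timedelta(minutes=1)),
    ("%Y-%m-%dT%H:%M:%S", datetime.timedelta(seconds=1)),
]


def inclusive_range(value: str) -> tuple[datetime.datetime, datetime.datetime]:
    """Return (start, end) such that end is the last microsecond of the period."""
    for fmt, step in UNIT_STEPS:
        try:
            start = datetime.datetime.strptime(value, fmt)
        except ValueError:
            continue  # strptime needs a full match, so try the next format
        return start, start + step - datetime.timedelta(microseconds=1)
    raise ValueError(f"Invalid datetime format: {value}")


start, end = inclusive_range("2025-01-01T12")
print(start.isoformat())  # 2025-01-01T12:00:00
print(end.isoformat())    # 2025-01-01T12:59:59.999999
```

Because `strptime` rejects input with unconverted trailing data, each string matches at most one of these formats, so the order in which they are tried does not affect the result.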
@@ -19,7 +19,7 @@ class Site:
      country: str
      country_code: str
      country_subdivision_code: str | None
-     type: SITE_TYPE
+     type: list[SITE_TYPE]
      status: Literal["active", "inactive"]
      gaw: str | None
 
@@ -28,7 +28,7 @@ class Site:
  class Product:
      id: str
      human_readable_name: str
-     type: PRODUCT_TYPE
+     type: list[PRODUCT_TYPE]
      experimental: bool
 
 
@@ -5,6 +5,8 @@ from os import PathLike
  from pathlib import Path
 
  import aiohttp
+ from tqdm import tqdm
+ from tqdm.asyncio import tqdm_asyncio
 
  from cloudnet_api_client import utils
  from cloudnet_api_client.containers import ProductMetadata, RawMetadata
@@ -13,30 +15,56 @@ MetadataList = list[ProductMetadata] | list[RawMetadata]
 
 
  def download(
-     metadata: MetadataList, output_directory: str | PathLike, concurrency_limit: int = 5
- ) -> None:
+     metadata: MetadataList,
+     output_directory: str | PathLike,
+     concurrency_limit: int = 5,
+     progress: bool | None = None,
+ ) -> list[Path]:
+     return asyncio.run(
+         adownload(metadata, output_directory, concurrency_limit, progress)
+     )
+
+
+ async def adownload(
+     metadata: MetadataList,
+     output_directory: str | PathLike,
+     concurrency_limit: int = 5,
+     progress: bool | None = None,
+ ) -> list[Path]:
+     disable_progress = not progress if progress is not None else None
+     output_directory = Path(output_directory).resolve()
      os.makedirs(output_directory, exist_ok=True)
-     asyncio.run(_download_files(metadata, output_directory, concurrency_limit))
+     return await _download_files(
+         metadata, output_directory, concurrency_limit, disable_progress
+     )
 
 
  async def _download_files(
-     metadata: MetadataList, output_path: str | PathLike, concurrency_limit: int
- ) -> None:
+     metadata: MetadataList,
+     output_path: Path,
+     concurrency_limit: int,
+     disable_progress: bool | None,
+ ) -> list[Path]:
      semaphore = asyncio.Semaphore(concurrency_limit)
+     full_paths = []
      async with aiohttp.ClientSession() as session:
          tasks = []
          for meta in metadata:
-             destination = output_path / Path(meta.download_url.split("/")[-1])
+             destination = output_path / meta.download_url.split("/")[-1]
+             full_paths.append(destination)
              if destination.exists() and _file_checksum_matches(meta, destination):
                  logging.info(f"Already downloaded: {destination}")
                  continue
              task = asyncio.create_task(
                  _download_file_with_retries(
-                     session, meta.download_url, destination, semaphore
+                     session, meta.download_url, destination, semaphore, disable_progress
                  )
              )
              tasks.append(task)
-         await asyncio.gather(*tasks)
+         await tqdm_asyncio.gather(
+             *tasks, desc="Completed files", disable=disable_progress
+         )
+     return full_paths
 
 
  async def _download_file_with_retries(
@@ -44,14 +72,15 @@ async def _download_file_with_retries(
      url: str,
      destination: Path,
      semaphore: asyncio.Semaphore,
+     disable_progress: bool | None,
      max_retries: int = 3,
  ) -> None:
      """Attempt to download a file, retrying up to max_retries times if needed."""
      for attempt in range(1, max_retries + 1):
          try:
-             await _download_file(session, url, destination, semaphore)
+             await _download_file(session, url, destination, semaphore, disable_progress)
              return
-         except Exception as e:
+         except aiohttp.ClientError as e:
              logging.warning(f"Attempt {attempt} failed for {url}: {e}")
              if attempt == max_retries:
                  logging.error(f"Giving up on {url} after {max_retries} attempts.")
@@ -65,16 +94,28 @@ async def _download_file(
      url: str,
      destination: Path,
      semaphore: asyncio.Semaphore,
+     disable_progress: bool | None,
  ) -> None:
      async with semaphore:
          async with session.get(url) as response:
              response.raise_for_status()
-             with destination.open("wb") as file_out:
+             with (
+                 destination.open("wb") as file_out,
+                 tqdm(
+                     desc=destination.name,
+                     total=response.content_length,
+                     unit="iB",
+                     unit_scale=True,
+                     unit_divisor=1024,
+                     disable=disable_progress,
+                 ) as bar,
+             ):
                  while True:
                      chunk = await response.content.read(8192)
                      if not chunk:
                          break
                      file_out.write(chunk)
+                     bar.update(len(chunk))
              logging.info(f"Downloaded: {destination}")
 
 
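Reviewer note on the `dl.py` hunks above: downloads are bounded by an `asyncio.Semaphore`, and as of 0.2.0 only `aiohttp.ClientError` triggers a retry, so other exceptions propagate on the first failure. The sketch below illustrates that combination without network access. `TransientError`, `flaky` and `with_retries` are hypothetical stand-ins, and re-raising after the last attempt is an assumption, since the diff does not show what follows the final log message.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as aiohttp.ClientError."""


attempts: dict[str, int] = {}


async def flaky(name: str, fail_times: int, error: type[Exception]) -> str:
    """Fail `fail_times` times with `error`, then succeed."""
    attempts[name] = attempts.get(name, 0) + 1
    await asyncio.sleep(0)  # placeholder for real network I/O
    if attempts[name] <= fail_times:
        raise error(name)
    return name


async def with_retries(
    name: str,
    fail_times: int,
    error: type[Exception],
    semaphore: asyncio.Semaphore,
    max_retries: int = 3,
) -> str:
    for attempt in range(1, max_retries + 1):
        try:
            async with semaphore:  # limits how many jobs run at the same time
                return await flaky(name, fail_times, error)
        except TransientError:
            if attempt == max_retries:
                raise  # assumption: give up by re-raising after the last attempt
    raise AssertionError("unreachable")


async def main() -> list:
    semaphore = asyncio.Semaphore(2)
    jobs = [
        with_retries("a", 0, TransientError, semaphore),  # succeeds immediately
        with_retries("b", 2, TransientError, semaphore),  # succeeds on attempt 3
        with_retries("c", 1, ValueError, semaphore),      # not retried
    ]
    # return_exceptions=True keeps one failure from hiding the other results.
    return await asyncio.gather(*jobs, return_exceptions=True)


results = asyncio.run(main())
print(results, attempts)
```

Narrowing the `except` clause means programming errors (here the `ValueError`) surface after a single attempt instead of being retried three times.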
@@ -0,0 +1 @@
+ __version__ = "0.2.1"
@@ -18,12 +18,12 @@ classifiers = [
      "Programming Language :: Python :: 3",
      "Topic :: Scientific/Engineering :: Atmospheric Science",
  ]
- dependencies = ["aiohttp", "numpy", "requests"]
+ dependencies = ["aiohttp", "numpy", "requests", "tqdm"]
  dynamic = ["version"]
 
  [project.optional-dependencies]
  test = ["mypy", "pytest"]
- dev = ["pre-commit", "release-version", "types-requests"]
+ dev = ["pre-commit", "release-version", "types-requests", "types-tqdm"]
 
  [tool.hatch.version]
  path = "cloudnet_api_client/version.py"
@@ -1,148 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: cloudnet-api-client
3
- Version: 0.1.3
4
- Summary: Cloudnet API client
5
- Author-email: Simo Tukiainen <simo.tukiainen@fmi.fi>
6
- License-File: LICENSE
7
- Classifier: Development Status :: 3 - Alpha
8
- Classifier: Intended Audience :: Science/Research
9
- Classifier: License :: OSI Approved :: MIT License
10
- Classifier: Operating System :: OS Independent
11
- Classifier: Programming Language :: Python :: 3
12
- Classifier: Topic :: Scientific/Engineering :: Atmospheric Science
13
- Requires-Python: >=3.10
14
- Requires-Dist: aiohttp
15
- Requires-Dist: numpy
16
- Requires-Dist: requests
17
- Provides-Extra: dev
18
- Requires-Dist: pre-commit; extra == 'dev'
19
- Requires-Dist: release-version; extra == 'dev'
20
- Requires-Dist: types-requests; extra == 'dev'
21
- Provides-Extra: test
22
- Requires-Dist: mypy; extra == 'test'
23
- Requires-Dist: pytest; extra == 'test'
24
- Description-Content-Type: text/markdown
25
-
26
- [![CI](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml/badge.svg)](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml)
27
-
28
- # cloudnet-api-client
29
-
30
- Official Python client for the [Cloudnet data portal API](https://docs.cloudnet.fmi.fi/api/data-portal.html).
31
-
32
- ## Installation
33
-
34
- ```bash
35
- python3 -m pip install cloudnet-api-client
36
- ```
37
-
38
- ## Quickstart
39
-
40
- ```python
41
- import cloudnet_api_client as cac
42
-
43
- client = cac.APIClient()
44
-
45
- sites = client.sites(type="cloudnet")
46
- products = client.products()
47
-
48
- metadata = client.metadata("hyytiala", "2021-01-01", product=["mwr", "radar"])
49
- cac.download(metadata, "data/")
50
-
51
- raw_metadata = client.raw_metadata("granada", date="2024-01", instrument_id="parsivel")
52
- cac.download(raw_metadata, "data_raw/")
53
- ```
54
-
55
- ## Documentation
56
-
57
- ### `APIClient().metadata()` and `raw_metadata()` &rarr; `[Metadata]`
58
-
59
- Fetch product and raw file metadata from the Cloudnet data portal.
60
-
61
- Parameters:
62
-
63
- | name | type | default | example |
64
- | --------------- | ------------------------ | ------- | ---------------------------------------------------- |
65
- | site_id | `str` | | "hyytiala" |
66
- | date | `str` or `datetime.date` | `None` | "2024-01-01" |
67
- | date_from | `str` or `datetime.date` | `None` | "2025-01-01" |
68
- | date_to | `str` or `datetime.date` | `None` | "2025-01-01" |
69
- | updated_at_from | `str` or `datetime.date` | `None` | "2025-01-01" |
70
- | updated_at_to | `str` or `datetime.date` | `None` | "2025-01-01" |
71
- | instrument_id | `str` or `[str]` | `None` | "rpg-fmcw-94" |
72
- | instrument_pid | `str` or `[str]` | `None` | "https://hdl.handle.net/21.12132/3.191564170f8a4686" |
73
- | product\* | `str` or `[str]` | `None` | "classification" |
74
- | show_legacy\* | `bool` | `False` | |
75
-
76
- \* = only in `metadata()`
77
-
78
- **Date Handling**
79
-
80
- The `date` parameter supports:
81
-
82
- - "YYYY-MM-DD" — a specific date
83
- - "YYYY-MM" — the entire month
84
- - "YYYY" — the entire year
85
- - Or directly as `datetime.date` object
86
-
87
- The `date_from`, `date_to`, `updated_at_from` and `updated_at_to` parameters
88
- should be of form "YYYY-MM-DD" or `datetime.date`. Note that, if `date` is defined, `date_from` and `date_to` have no effect.
89
-
90
- **Return value**
91
-
92
- Both methods return a list of `dataclass` instances, `ProductMetadata` and `RawMetadata`, respectively.
93
-
94
- ### `APIClient().filter([Metadata])` &rarr; `[Metadata]`
95
-
96
- Additional filtering of fetched metadata.
97
-
98
- Parameters:
99
-
100
- | name | type | default |
101
- | ------------------ | -------------------------------------- | ------- |
102
- | metadata | `[RawMetadata]` or `[ProductMetadata]` | |
103
- | include_pattern | `str` | `None` |
104
- | exclude_pattern | `str` | `None` |
105
- | filename_prefix | `str` | `None` |
106
- | filename_suffix | `str` | `None` |
107
- | include_tag_subset | `{str}` | `None` |
108
- | exclude_tag_subset | `{str}` | `None` |
109
-
110
- ### `APIClient().sites()` &rarr; `[Site]`
111
-
112
- Fetch cloudnet sites.
113
-
114
- Parameters:
115
-
116
- | name | type | Choices | default |
117
- | ---- | ---------------- | ----------------------------------------- | ------- |
118
- | type | `str` or `[str]` | "cloudnet", "campaign", "model", "hidden" | `None` |
119
-
120
- ### `APIClient().products()` &rarr; `[Product]`
121
-
122
- Fetch cloudnet products.
123
-
124
- Parameters:
125
-
126
- | name | type | Choices | default |
127
- | ---- | ---------------- | ----------------------------------------- | ------- |
128
- | type | `str` or `[str]` | "instrument", "geophysical", "evaluation" | `None` |
129
-
130
- ### `APIClient().instruments()` &rarr; `[Instrument]`
131
-
132
- Fetch cloudnet instruments.
133
-
134
- ### `cloudnet_api_client.download([Metadata])`
135
-
136
- Download files from the fetched metadata.
137
-
138
- Parameters:
139
-
140
- | name | type | default |
141
- | ----------------- | -------------------------------------- | ------- |
142
- | metadata | `[RawMetadata]` or `[ProductMetadata]` | |
143
- | output_directory | `PathLike` or `str` | |
144
- | concurrency_limit | `int` | 5 |
145
-
146
- ## License
147
-
148
- MIT
@@ -1,123 +0,0 @@
- [![CI](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml/badge.svg)](https://github.com/actris-cloudnet/cloudnet-api-client/actions/workflows/test.yml)
-
- # cloudnet-api-client
-
- Official Python client for the [Cloudnet data portal API](https://docs.cloudnet.fmi.fi/api/data-portal.html).
-
- ## Installation
-
- ```bash
- python3 -m pip install cloudnet-api-client
- ```
-
- ## Quickstart
-
- ```python
- import cloudnet_api_client as cac
-
- client = cac.APIClient()
-
- sites = client.sites(type="cloudnet")
- products = client.products()
-
- metadata = client.metadata("hyytiala", "2021-01-01", product=["mwr", "radar"])
- cac.download(metadata, "data/")
-
- raw_metadata = client.raw_metadata("granada", date="2024-01", instrument_id="parsivel")
- cac.download(raw_metadata, "data_raw/")
- ```
-
- ## Documentation
-
- ### `APIClient().metadata()` and `raw_metadata()` &rarr; `[Metadata]`
-
- Fetch product and raw file metadata from the Cloudnet data portal.
-
- Parameters:
-
- | name | type | default | example |
- | --------------- | ------------------------ | ------- | ---------------------------------------------------- |
- | site_id | `str` | | "hyytiala" |
- | date | `str` or `datetime.date` | `None` | "2024-01-01" |
- | date_from | `str` or `datetime.date` | `None` | "2025-01-01" |
- | date_to | `str` or `datetime.date` | `None` | "2025-01-01" |
- | updated_at_from | `str` or `datetime.date` | `None` | "2025-01-01" |
- | updated_at_to | `str` or `datetime.date` | `None` | "2025-01-01" |
- | instrument_id | `str` or `[str]` | `None` | "rpg-fmcw-94" |
- | instrument_pid | `str` or `[str]` | `None` | "https://hdl.handle.net/21.12132/3.191564170f8a4686" |
- | product\* | `str` or `[str]` | `None` | "classification" |
- | show_legacy\* | `bool` | `False` | |
-
- \* = only in `metadata()`
-
- **Date Handling**
-
- The `date` parameter supports:
-
- - "YYYY-MM-DD" — a specific date
- - "YYYY-MM" — the entire month
- - "YYYY" — the entire year
- - Or directly as `datetime.date` object
-
- The `date_from`, `date_to`, `updated_at_from` and `updated_at_to` parameters
- should be of form "YYYY-MM-DD" or `datetime.date`. Note that, if `date` is defined, `date_from` and `date_to` have no effect.
-
- **Return value**
-
- Both methods return a list of `dataclass` instances, `ProductMetadata` and `RawMetadata`, respectively.
-
- ### `APIClient().filter([Metadata])` &rarr; `[Metadata]`
-
- Additional filtering of fetched metadata.
-
- Parameters:
-
- | name | type | default |
- | ------------------ | -------------------------------------- | ------- |
- | metadata | `[RawMetadata]` or `[ProductMetadata]` | |
- | include_pattern | `str` | `None` |
- | exclude_pattern | `str` | `None` |
- | filename_prefix | `str` | `None` |
- | filename_suffix | `str` | `None` |
- | include_tag_subset | `{str}` | `None` |
- | exclude_tag_subset | `{str}` | `None` |
-
- ### `APIClient().sites()` &rarr; `[Site]`
-
- Fetch cloudnet sites.
-
- Parameters:
-
- | name | type | Choices | default |
- | ---- | ---------------- | ----------------------------------------- | ------- |
- | type | `str` or `[str]` | "cloudnet", "campaign", "model", "hidden" | `None` |
-
- ### `APIClient().products()` &rarr; `[Product]`
-
- Fetch cloudnet products.
-
- Parameters:
-
- | name | type | Choices | default |
- | ---- | ---------------- | ----------------------------------------- | ------- |
- | type | `str` or `[str]` | "instrument", "geophysical", "evaluation" | `None` |
-
- ### `APIClient().instruments()` &rarr; `[Instrument]`
-
- Fetch cloudnet instruments.
-
- ### `cloudnet_api_client.download([Metadata])`
-
- Download files from the fetched metadata.
-
- Parameters:
-
- | name | type | default |
- | ----------------- | -------------------------------------- | ------- |
- | metadata | `[RawMetadata]` or `[ProductMetadata]` | |
- | output_directory | `PathLike` or `str` | |
- | concurrency_limit | `int` | 5 |
-
- ## License
-
- MIT
@@ -1 +0,0 @@
- __version__ = "0.1.3"