anemoi-utils 0.4.22__tar.gz → 0.4.24__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


Files changed (103)
  1. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/CODEOWNERS +0 -1
  2. anemoi_utils-0.4.24/.github/pull_request_template.md +13 -0
  3. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/downstream-ci-hpc.yml +1 -1
  4. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/pr-conventional-commit.yml +1 -1
  5. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/pr-label-conventional-commits.yml +1 -1
  6. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/pr-label-file-based.yml +1 -1
  7. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/pr-label-public.yml +1 -1
  8. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/python-pull-request.yml +1 -1
  9. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/readthedocs-pr-update.yml +1 -1
  10. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.gitignore +7 -1
  11. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.pre-commit-config.yaml +3 -2
  12. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.release-please-config.json +6 -1
  13. anemoi_utils-0.4.24/.release-please-manifest.json +3 -0
  14. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/CHANGELOG.md +17 -0
  15. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/PKG-INFO +2 -2
  16. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/pyproject.toml +1 -1
  17. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/_version.py +2 -2
  18. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/provenance.py +5 -2
  19. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/remote/s3.py +149 -35
  20. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/testing.py +4 -4
  21. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/PKG-INFO +2 -2
  22. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/SOURCES.txt +0 -1
  23. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/requires.txt +1 -1
  24. anemoi_utils-0.4.22/.github/pull_request_template.md +0 -46
  25. anemoi_utils-0.4.22/.github/release.yml +0 -23
  26. anemoi_utils-0.4.22/.release-please-manifest.json +0 -3
  27. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.gitattributes +0 -0
  28. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/ci-hpc-config.yml +0 -0
  29. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/dependabot.yml +0 -0
  30. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/labeler.yml +0 -0
  31. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/python-publish.yml +0 -0
  32. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.github/workflows/release-please.yml +0 -0
  33. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/.readthedocs.yaml +0 -0
  34. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/CONTRIBUTORS.md +0 -0
  35. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/LICENSE +0 -0
  36. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/README.md +0 -0
  37. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/Makefile +0 -0
  38. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/_static/logo.png +0 -0
  39. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/_static/style.css +0 -0
  40. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/_templates/.gitkeep +0 -0
  41. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/_templates/apidoc/package.rst.jinja +0 -0
  42. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/conf.py +0 -0
  43. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/index.rst +0 -0
  44. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/installing.rst +0 -0
  45. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/checkpoints.rst +0 -0
  46. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/config.rst +0 -0
  47. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/dates.rst +0 -0
  48. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/grib.rst +0 -0
  49. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/humanize.rst +0 -0
  50. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/provenance.rst +0 -0
  51. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/s3.rst +0 -0
  52. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/testing.rst +0 -0
  53. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/modules/text.rst +0 -0
  54. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/docs/scripts/api_build.sh +0 -0
  55. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/setup.cfg +0 -0
  56. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/__init__.py +0 -0
  57. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/__main__.py +0 -0
  58. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/caching.py +0 -0
  59. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/checkpoints.py +0 -0
  60. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/cli.py +0 -0
  61. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/commands/__init__.py +0 -0
  62. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/commands/config.py +0 -0
  63. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/commands/requests.py +0 -0
  64. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/compatibility.py +0 -0
  65. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/config.py +0 -0
  66. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/dates.py +0 -0
  67. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/devtools.py +0 -0
  68. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/grib.py +0 -0
  69. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/grids.py +0 -0
  70. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/hindcasts.py +0 -0
  71. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/humanize.py +0 -0
  72. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/logs.py +0 -0
  73. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/mars/__init__.py +0 -0
  74. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/mars/mars.yaml +0 -0
  75. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/mars/requests.py +0 -0
  76. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/registry.py +0 -0
  77. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/remote/__init__.py +0 -0
  78. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/remote/ssh.py +0 -0
  79. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/rules.py +0 -0
  80. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/s3.py +0 -0
  81. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/sanitise.py +0 -0
  82. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/sanitize.py +0 -0
  83. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/schemas/__init__.py +0 -0
  84. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/schemas/errors.py +0 -0
  85. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/text.py +0 -0
  86. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi/utils/timer.py +0 -0
  87. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/dependency_links.txt +0 -0
  88. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/entry_points.txt +0 -0
  89. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/src/anemoi_utils.egg-info/top_level.txt +0 -0
  90. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test-transfer-data/directory/b/c/x +0 -0
  91. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test-transfer-data/directory/b/y +0 -0
  92. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test-transfer-data/directory/exotic filename ;^/"'[=.,#]()/303/252/303/274/303/247/303/262/342/234/205.txt" +0 -0
  93. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test-transfer-data/directory/z +0 -0
  94. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test-transfer-data/file +0 -0
  95. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_caching.py +0 -0
  96. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_compatibility.py +0 -0
  97. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_dates.py +0 -0
  98. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_frequency.py +0 -0
  99. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_grids.py +0 -0
  100. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_provenance.py +0 -0
  101. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_remote.py +0 -0
  102. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_sanetise.py +0 -0
  103. {anemoi_utils-0.4.22 → anemoi_utils-0.4.24}/tests/test_utils.py +0 -0
@@ -1,4 +1,3 @@
-
 # Workflows
 /.github/ @ecmwf/AnemoiSecurity
 
@@ -0,0 +1,13 @@
+## Description
+<!-- What issue or task does this change relate to? -->
+
+## What problem does this change solve?
+<!-- Describe if it's a bugfix, new feature, doc update, or breaking change -->
+
+## What issue or task does this change relate to?
+<!-- link to Issue Number -->
+
+## Additional notes ##
+<!-- Include any additional information, caveats, or considerations that the reviewer should be aware of. -->
+
+***As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/***
@@ -1,6 +1,6 @@
1
1
  # This workflow triggers tests on dependent packages.
2
2
  # The dependency tree itself is defined in ecmwf/downstream-ci/
3
- name: Test downstream dependent packages on HPC
3
+ name: Test downstream dependent packages
4
4
 
5
5
  on:
6
6
  # Trigger the workflow on push to main or develop, except tag creation
@@ -1,5 +1,5 @@
1
1
  # This workflow ensures that the PR title follows the Conventional Commit format.
2
- name: "[Pull Request] Ensure Conventional Commit in PR title"
2
+ name: "[PR] Ensure Conventional Commit Title"
3
3
 
4
4
  on:
5
5
  pull_request_target:
@@ -1,6 +1,6 @@
1
1
  # This workflow assigns labels to a pull request based on the Conventional Commits format.
2
2
  # This is necessary for release-please to work properly.
3
- name: "[Pull Request] Label Conventional Commits"
3
+ name: "[PR] Label Conventional Commits"
4
4
 
5
5
  on:
6
6
  pull_request:
@@ -1,6 +1,6 @@
1
1
  # This workflow assigns labels to a pull request based on the files changed in the PR.
2
2
  # The labels are defined in the `.github/labels.yml` file.
3
- name: "[Pull Request] Label File-based"
3
+ name: "[PR] Label File-based"
4
4
  on:
5
5
  pull_request_target:
6
6
  types: [opened, synchronize]
@@ -1,5 +1,5 @@
1
1
  # Manage labels of pull requests that originate from forks
2
- name: "[Pull Request] Label PRs from public forks"
2
+ name: "[PR] Label Forks"
3
3
 
4
4
  on:
5
5
  pull_request_target:
@@ -1,5 +1,5 @@
1
1
  # This workflow runs pre-commit checks and pytest tests against multiple platforms and Python versions.
2
- name: Code Quality checks and Testing
2
+ name: Code Quality and Testing
3
3
 
4
4
  on:
5
5
  pull_request:
@@ -1,6 +1,6 @@
1
1
  # This workflow adds a link to the experimental documentation build to the PR.
2
2
  # This does NOT trigger a build of the documentation, this is handled through webhooks.
3
- name: Read the Docs PR Preview
3
+ name: "[PR] Read the Docs Preview"
4
4
  on:
5
5
  pull_request_target:
6
6
  types:
@@ -94,6 +94,11 @@ dmypy.json
 *.csv
 *.xlsx
 *.xls
+*.json
+*.txt
+*.zip
+*.db
+*.tgz
 
 # ML artifacts
 wandb/
@@ -120,7 +125,8 @@ tmp/
 temp/
 logs/
 _dev/
-outputs
+_api/
+./outputs
 *tmp_data/
 
 # Project specific
@@ -39,8 +39,9 @@ repos:
39
39
  - -l 120
40
40
  - --force-single-line-imports
41
41
  - --profile black
42
+ - --project anemoi
42
43
  - repo: https://github.com/astral-sh/ruff-pre-commit
43
- rev: v0.11.4
44
+ rev: v0.11.12
44
45
  hooks:
45
46
  - id: ruff
46
47
  args:
@@ -64,7 +65,7 @@ repos:
64
65
  - id: docconvert
65
66
  args: ["numpy"]
66
67
  - repo: https://github.com/tox-dev/pyproject-fmt
67
- rev: "v2.5.1"
68
+ rev: "v2.6.0"
68
69
  hooks:
69
70
  - id: pyproject-fmt
70
71
  - repo: https://github.com/jshwi/docsig # Check docstrings against function sig
@@ -10,11 +10,16 @@
10
10
  "draft-pull-request": true,
11
11
  "pull-request-title-pattern": "chore${scope}: Release${component} ${version}",
12
12
  "pull-request-header": ":robot: Automated Release PR\n\nThis PR was created by `release-please` to prepare the next release. Once merged:\n\n1. A new version tag will be created\n2. A GitHub release will be published\n3. The changelog will be updated\n\nChanges to be included in the next release:",
13
- "pull-request-footer": "> [!IMPORTANT]\n> :warning: Merging this PR will:\n> - Create a new release\n> - Trigger deployment pipelines\n> - Update package versions\n\n **Before merging:**\n - Ensure all tests pass\n - Review the changelog carefully\n - Get required approvals\n\n [Release-please documentation](https://github.com/googleapis/release-please)",
13
+ "pull-request-footer": "> [!IMPORTANT]\n> Please do not change the PR title, manifest file, or any other automatically generated content in this PR unless you understand the implications. Changes here can break the release process.\n> :warning: Merging this PR will:\n> - Create a new release\n> - Trigger deployment pipelines\n> - Update package versions\n\n **Before merging:**\n - Ensure all tests pass\n - Review the changelog carefully\n - Get required approvals\n\n [Release-please documentation](https://github.com/googleapis/release-please)",
14
14
  "packages": {
15
15
  ".": {
16
16
  "package-name": "anemoi-utils"
17
17
  }
18
18
  },
19
+ "plugins": [
20
+ {
21
+ "type": "sentence-case"
22
+ }
23
+ ],
19
24
  "$schema": "https://raw.githubusercontent.com/googleapis/release-please/main/schemas/config.json"
20
25
  }
@@ -0,0 +1,3 @@
+{
+  ".": "0.4.24"
+}
@@ -8,6 +8,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 Please add your functional changes to the appropriate section in the PR.
 Keep it human-readable, your future self will thank you!
 
+## [0.4.24](https://github.com/ecmwf/anemoi-utils/compare/0.4.23...0.4.24) (2025-06-06)
+
+
+### Features
+
+* Add s3.object_exists() function ([#157](https://github.com/ecmwf/anemoi-utils/issues/157)) ([d898811](https://github.com/ecmwf/anemoi-utils/commit/d8988116320265dc6dfe467c57e0b6f29f76a2c1))
+* Allow wildcard in config for matching s3 buckets to end points ([#160](https://github.com/ecmwf/anemoi-utils/issues/160)) ([ab20da7](https://github.com/ecmwf/anemoi-utils/commit/ab20da7e9497435a7183705b02dcbb7317d2700b))
+
+## [0.4.23](https://github.com/ecmwf/anemoi-utils/compare/0.4.22...0.4.23) (2025-05-20)
+
+
+### Bug Fixes
+
+* fix list_folder on s3 ([#154](https://github.com/ecmwf/anemoi-utils/issues/154)) ([3ceb42c](https://github.com/ecmwf/anemoi-utils/commit/3ceb42c5185290d4c12e3fe90c3c331e3d8c7a5f))
+* Remove the requirment to have git installed ([#149](https://github.com/ecmwf/anemoi-utils/issues/149)) ([88846e8](https://github.com/ecmwf/anemoi-utils/commit/88846e80be2927050a879ff953a78aecf39c3ac5))
+* Use urllib to make _offline() aware of HTTP(s) proxies. ([#150](https://github.com/ecmwf/anemoi-utils/issues/150)) ([5c4d06f](https://github.com/ecmwf/anemoi-utils/commit/5c4d06f931590cc360eb4ffeeb8753a5d3d72bcb))
+
 ## [0.4.22](https://github.com/ecmwf/anemoi-utils/compare/0.4.21...0.4.22) (2025-04-10)
 
 
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: anemoi-utils
-Version: 0.4.22
+Version: 0.4.24
 Summary: A package to hold various functions to support training of ML models on ECMWF data.
 Author-email: "European Centre for Medium-Range Weather Forecasts (ECMWF)" <software.support@ecmwf.int>
 License: Apache License
@@ -252,7 +252,7 @@ Provides-Extra: provenance
 Requires-Dist: gitpython; extra == "provenance"
 Requires-Dist: nvsmi; extra == "provenance"
 Provides-Extra: s3
-Requires-Dist: boto3<1.36; extra == "s3"
+Requires-Dist: boto3>1.36; extra == "s3"
 Provides-Extra: tests
 Requires-Dist: pytest; extra == "tests"
 Provides-Extra: text
@@ -70,7 +70,7 @@ optional-dependencies.grib = [ "requests" ]
 optional-dependencies.provenance = [ "gitpython", "nvsmi" ]
 
 optional-dependencies.s3 = [
-  "boto3<1.36",
+  "boto3>1.36",
 ]
 
 optional-dependencies.tests = [ "pytest" ]
@@ -17,5 +17,5 @@ __version__: str
 __version_tuple__: VERSION_TUPLE
 version_tuple: VERSION_TUPLE
 
-__version__ = version = '0.4.22'
-__version_tuple__ = version_tuple = (0, 4, 22)
+__version__ = version = '0.4.24'
+__version_tuple__ = version_tuple = (0, 4, 24)
@@ -47,8 +47,11 @@ def lookup_git_repo(path: str) -> Optional[Any]:
     Repo, optional
         The git repository if found, otherwise None.
     """
-    from git import InvalidGitRepositoryError
-    from git import Repo
+    try:
+        from git import InvalidGitRepositoryError
+        from git import Repo
+    except ImportError:
+        return None
 
     while path != "/":
         try:
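Note on the hunk above: the guarded import lets `lookup_git_repo` return None when gitpython cannot be imported, which matches the changelog entry about no longer requiring git. The pattern can be sketched in isolation as follows; `optional_import` is a hypothetical helper written for this illustration and is not part of anemoi-utils.

```python
import importlib
from typing import Any, Optional


def optional_import(name: str) -> Optional[Any]:
    """Return the imported module, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(name)
    except ImportError:
        # The dependency is missing or failed to initialise:
        # callers degrade gracefully instead of crashing.
        return None


git = optional_import("git")  # the module provided by gitpython, if present
if git is None:
    print("gitpython unavailable, skipping repository lookup")
```

Returning None rather than raising keeps provenance collection best-effort, at the cost of hiding a broken installation from the caller.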
@@ -1,10 +1,13 @@
1
- # (C) Copyright 2024 European Centre for Medium-Range Weather Forecasts.
1
+ # (C) Copyright 2024-2025 Anemoi contributors.
2
+ #
2
3
  # This software is licensed under the terms of the Apache Licence Version 2.0
3
4
  # which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
5
+ #
4
6
  # In applying this licence, ECMWF does not waive the privileges and immunities
5
7
  # granted to it by virtue of its status as an intergovernmental organisation
6
8
  # nor does it submit to any jurisdiction.
7
9
 
10
+
8
11
  """This module provides functions to upload, download, list and delete files and folders on S3.
9
12
  The functions of this package expect that the AWS credentials are set up in the environment
10
13
  typicaly by setting the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables or
@@ -21,6 +24,7 @@ the `~/.config/anemoi/settings.toml`
21
24
  or `~/.config/anemoi/settings-secrets.toml` files.
22
25
  """
23
26
 
27
+ import fnmatch
24
28
  import logging
25
29
  import os
26
30
  import threading
@@ -35,15 +39,15 @@ from ..humanize import bytes_to_human
35
39
  from . import BaseDownload
36
40
  from . import BaseUpload
37
41
 
38
- LOGGER = logging.getLogger(__name__)
39
-
42
+ LOG = logging.getLogger(__name__)
43
+ SECRETS = ["aws_access_key_id", "aws_secret_access_key"]
40
44
 
41
45
  # s3_clients are not thread-safe, so we need to create a new client for each thread
42
46
 
43
47
  thread_local = threading.local()
44
48
 
45
49
 
46
- def s3_client(bucket: str, region: str = None) -> Any:
50
+ def s3_client(bucket: str, *, region: str = None, service: str = "s3") -> Any:
47
51
  """Get an S3 client for the specified bucket and region.
48
52
 
49
53
  Parameters
@@ -52,6 +56,8 @@ def s3_client(bucket: str, region: str = None) -> Any:
52
56
  The name of the S3 bucket.
53
57
  region : str, optional
54
58
  The AWS region of the S3 bucket.
59
+ service : str, optional
60
+ The AWS service to use, default is "s3".
55
61
 
56
62
  Returns
57
63
  -------
@@ -65,14 +71,16 @@ def s3_client(bucket: str, region: str = None) -> Any:
     if not hasattr(thread_local, "s3_clients"):
         thread_local.s3_clients = {}
 
-    key = f"{bucket}-{region}"
-
-    boto3_config = dict(max_pool_connections=25)
+    key = f"{bucket}-{region}-{service}"
 
     if key in thread_local.s3_clients:
         return thread_local.s3_clients[key]
 
-    boto3_config = dict(max_pool_connections=25)
+    boto3_config = dict(
+        max_pool_connections=25,
+        request_checksum_calculation="when_required",
+        response_checksum_validation="when_required",
+    )
 
     if region:
         # This is using AWS
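Note on the hunk above: the cache key now includes the service name, so a client created for "s3" is not reused for another service in the same thread. A minimal sketch of such a per-thread cache, assuming a hypothetical `cached_client` helper and a caller-supplied `factory` (neither appears in the package):

```python
import threading
from typing import Any, Callable, Tuple

_thread_local = threading.local()


def cached_client(key_parts: Tuple[Any, ...], factory: Callable[[], Any]) -> Any:
    """Return one client per thread and per key, creating it on first use."""
    cache = getattr(_thread_local, "clients", None)
    if cache is None:
        # First call in this thread: start with an empty cache.
        cache = _thread_local.clients = {}

    # Mirrors the f"{bucket}-{region}-{service}" key used in the diff.
    key = "-".join(str(part) for part in key_parts)
    if key not in cache:
        cache[key] = factory()
    return cache[key]


s3 = cached_client(("my-bucket", None, "s3"), object)
quotas = cached_client(("my-bucket", None, "service-quotas"), object)
print(s3 is quotas)  # different services get different clients
```

Here `object` stands in for the real client constructor; any zero-argument callable would do.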
@@ -91,17 +99,27 @@ def s3_client(bucket: str, region: str = None) -> Any:
         # We may be accessing a different S3 compatible service
         # Use anemoi.config to get the configuration
 
-        options = {}
-        config = load_config(secrets=["aws_access_key_id", "aws_secret_access_key"])
+        region = "unknown-region"
+
+        options = {"region_name": region}
+        config = load_config(secrets=SECRETS)
 
         cfg = config.get("object-storage", {})
+        candidate = None
         for k, v in cfg.items():
             if isinstance(v, (str, int, float, bool)):
                 options[k] = v
 
-        for k, v in cfg.get(bucket, {}).items():
-            if isinstance(v, (str, int, float, bool)):
-                options[k] = v
+            if isinstance(v, dict):
+                if fnmatch.fnmatch(bucket, k):
+                    if candidate is not None:
+                        raise ValueError(f"Multiple object storage configurations match {bucket}: {candidate} and {k}")
+                    candidate = k
+
+        if candidate is not None:
+            for k, v in cfg.get(candidate, {}).items():
+                if isinstance(v, (str, int, float, bool)):
+                    options[k] = v
 
         type = options.pop("type", "s3")
         if type != "s3":
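Note on the hunk above: section names under `object-storage` are now treated as `fnmatch` patterns, and two patterns matching the same bucket raise an error. The sketch below reproduces that resolution logic on a plain dictionary; the function name `resolve_options`, the bucket names, and the endpoint URLs are made up for illustration.

```python
import fnmatch
from typing import Any, Dict

SCALARS = (str, int, float, bool)


def resolve_options(cfg: Dict[str, Any], bucket: str) -> Dict[str, Any]:
    """Merge top-level scalar options with the one section matching `bucket`."""
    options = {k: v for k, v in cfg.items() if isinstance(v, SCALARS)}

    candidate = None
    for name, value in cfg.items():
        if isinstance(value, dict) and fnmatch.fnmatch(bucket, name):
            if candidate is not None:
                # Ambiguous configuration: refuse to guess.
                raise ValueError(f"Multiple configurations match {bucket}: {candidate} and {name}")
            candidate = name

    if candidate is not None:
        # Section values override the top-level defaults.
        options.update({k: v for k, v in cfg[candidate].items() if isinstance(v, SCALARS)})
    return options


cfg = {
    "endpoint_url": "https://default.example",
    "ml-*": {"endpoint_url": "https://ml.example"},
}
print(resolve_options(cfg, "ml-datasets")["endpoint_url"])
print(resolve_options(cfg, "archive")["endpoint_url"])
```

Because `fnmatch.fnmatch` normalises case according to the operating system, matching may be case-insensitive on Windows; `fnmatch.fnmatchcase` would avoid that if it matters.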
@@ -110,11 +128,27 @@ def s3_client(bucket: str, region: str = None) -> Any:
         if "config" in options:
             boto3_config.update(options["config"])
             del options["config"]
-            from botocore.client import Config
 
         options["config"] = Config(**boto3_config)
 
-    thread_local.s3_clients[key] = boto3.client("s3", **options)
+    def _(options):
+
+        def __(k, v):
+            if k in SECRETS:
+                return "***"
+            return v
+
+        if isinstance(options, dict):
+            return {k: __(k, v) for k, v in options.items()}
+
+        if isinstance(options, list):
+            return [_(o) for o in options]
+
+        return options
+
+    LOG.info(f"Using S3 options: {_(options)}")
+
+    thread_local.s3_clients[key] = boto3.client(service, **options)
 
     return thread_local.s3_clients[key]
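Note on the hunk above: the nested helper masks the values of the keys listed in `SECRETS` before the options are logged. As written in the diff it replaces values only in the dictionary it is given (and in dictionaries inside a list), without descending into nested dictionaries. The sketch below is a variation that recurses fully; `redact` and `SECRET_KEYS` are names chosen for this illustration, not the package's API.

```python
from typing import Any, FrozenSet

SECRET_KEYS = frozenset({"aws_access_key_id", "aws_secret_access_key"})


def redact(value: Any, secret_keys: FrozenSet[str] = SECRET_KEYS) -> Any:
    """Return a copy of `value` with secret entries replaced by '***'."""
    if isinstance(value, dict):
        return {
            k: "***" if k in secret_keys else redact(v, secret_keys)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(item, secret_keys) for item in value]
    # Scalars and other objects are returned unchanged.
    return value


options = {
    "endpoint_url": "https://s3.example",
    "aws_access_key_id": "AKIA-not-a-real-key",
    "profiles": [{"aws_secret_access_key": "not-a-real-secret"}],
}
print(redact(options))
```

The input is never modified, so the unredacted options can still be passed to the client constructor afterwards.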
 
@@ -162,7 +196,14 @@ class S3Upload(BaseUpload):
162
196
  # delete(target)
163
197
 
164
198
  def _transfer_file(
165
- self, source: str, target: str, overwrite: bool, resume: bool, verbosity: int, threads: int, config: dict = None
199
+ self,
200
+ source: str,
201
+ target: str,
202
+ overwrite: bool,
203
+ resume: bool,
204
+ verbosity: int,
205
+ threads: int,
206
+ config: dict = None,
166
207
  ) -> int:
167
208
  """Transfer a file to S3.
168
209
 
@@ -203,7 +244,7 @@ class S3Upload(BaseUpload):
203
244
  size = os.path.getsize(source)
204
245
 
205
246
  if verbosity > 0:
206
- LOGGER.info(f"{self.action} {source} to {target} ({bytes_to_human(size)})")
247
+ LOG.info(f"{self.action} {source} to {target} ({bytes_to_human(size)})")
207
248
 
208
249
  try:
209
250
  results = s3.head_object(Bucket=bucket, Key=key)
@@ -215,7 +256,7 @@ class S3Upload(BaseUpload):
215
256
 
216
257
  if remote_size is not None:
217
258
  if remote_size != size:
218
- LOGGER.warning(
259
+ LOG.warning(
219
260
  f"{target} already exists, but with different size, re-uploading (remote={remote_size}, local={size})"
220
261
  )
221
262
  elif resume:
@@ -227,7 +268,13 @@ class S3Upload(BaseUpload):
227
268
 
228
269
  if verbosity > 0:
229
270
  with tqdm.tqdm(total=size, unit="B", unit_scale=True, unit_divisor=1024, leave=False) as pbar:
230
- s3.upload_file(source, bucket, key, Callback=lambda x: pbar.update(x), Config=config)
271
+ s3.upload_file(
272
+ source,
273
+ bucket,
274
+ key,
275
+ Callback=lambda x: pbar.update(x),
276
+ Config=config,
277
+ )
231
278
  else:
232
279
  s3.upload_file(source, bucket, key, Config=config)
233
280
 
@@ -326,7 +373,14 @@ class S3Download(BaseDownload):
326
373
  return s3_object["Size"]
327
374
 
328
375
  def _transfer_file(
329
- self, source: str, target: str, overwrite: bool, resume: bool, verbosity: int, threads: int, config: dict = None
376
+ self,
377
+ source: str,
378
+ target: str,
379
+ overwrite: bool,
380
+ resume: bool,
381
+ verbosity: int,
382
+ threads: int,
383
+ config: dict = None,
330
384
  ) -> int:
331
385
  """Transfer a file from S3 to the local filesystem.
332
386
 
@@ -375,7 +429,7 @@ class S3Download(BaseDownload):
375
429
  size = int(response["ContentLength"])
376
430
 
377
431
  if verbosity > 0:
378
- LOGGER.info(f"{self.action} {source} to {target} ({bytes_to_human(size)})")
432
+ LOG.info(f"{self.action} {source} to {target} ({bytes_to_human(size)})")
379
433
 
380
434
  if overwrite:
381
435
  resume = False
@@ -384,7 +438,7 @@ class S3Download(BaseDownload):
384
438
  if os.path.exists(target):
385
439
  local_size = os.path.getsize(target)
386
440
  if local_size != size:
387
- LOGGER.warning(
441
+ LOG.warning(
388
442
  f"{target} already with different size, re-downloading (remote={size}, local={local_size})"
389
443
  )
390
444
  else:
@@ -397,7 +451,13 @@ class S3Download(BaseDownload):
397
451
 
398
452
  if verbosity > 0:
399
453
  with tqdm.tqdm(total=size, unit="B", unit_scale=True, unit_divisor=1024, leave=False) as pbar:
400
- s3.download_file(bucket, key, target, Callback=lambda x: pbar.update(x), Config=config)
454
+ s3.download_file(
455
+ bucket,
456
+ key,
457
+ target,
458
+ Callback=lambda x: pbar.update(x),
459
+ Config=config,
460
+ )
401
461
  else:
402
462
  s3.download_file(bucket, key, target, Config=config)
403
463
 
@@ -433,7 +493,7 @@ def _list_objects(target: str, batch: bool = False) -> Iterable:
433
493
  yield from objects
434
494
 
435
495
 
436
- def _delete_folder(target: str) -> None:
496
+ def delete_folder(target: str) -> None:
437
497
  """Delete a folder from S3.
438
498
 
439
499
  Parameters
@@ -446,13 +506,13 @@ def _delete_folder(target: str) -> None:
446
506
 
447
507
  total = 0
448
508
  for batch in _list_objects(target, batch=True):
449
- LOGGER.info(f"Deleting {len(batch):,} objects from {target}")
509
+ LOG.info(f"Deleting {len(batch):,} objects from {target}")
450
510
  s3.delete_objects(Bucket=bucket, Delete={"Objects": [{"Key": o["Key"]} for o in batch]})
451
511
  total += len(batch)
452
- LOGGER.info(f"Deleted {len(batch):,} objects (total={total:,})")
512
+ LOG.info(f"Deleted {len(batch):,} objects (total={total:,})")
453
513
 
454
514
 
455
- def _delete_file(target: str) -> None:
515
+ def delete_file(target: str) -> None:
456
516
  """Delete a file from S3.
457
517
 
458
518
  Parameters
@@ -474,12 +534,12 @@ def _delete_file(target: str) -> None:
474
534
  exits = False
475
535
 
476
536
  if not exits:
477
- LOGGER.warning(f"{target} does not exist. Did you mean to delete a folder? Then add a trailing '/'")
537
+ LOG.warning(f"{target} does not exist. Did you mean to delete a folder? Then add a trailing '/'")
478
538
  return
479
539
 
480
- LOGGER.info(f"Deleting {target}")
540
+ LOG.info(f"Deleting {target}")
481
541
  s3.delete_object(Bucket=bucket, Key=key)
482
- LOGGER.info(f"{target} is deleted")
542
+ LOG.info(f"{target} is deleted")
483
543
 
484
544
 
485
545
  def delete(target: str) -> None:
@@ -494,9 +554,9 @@ def delete(target: str) -> None:
494
554
  assert target.startswith("s3://")
495
555
 
496
556
  if target.endswith("/"):
497
- _delete_folder(target)
557
+ delete_folder(target)
498
558
  else:
499
- _delete_file(target)
559
+ delete_file(target)
500
560
 
501
561
 
502
562
  def list_folder(folder: str) -> Iterable:
@@ -524,7 +584,9 @@ def list_folder(folder: str) -> Iterable:
 
     for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
         if "CommonPrefixes" in page:
-            yield from [folder + _["Prefix"] for _ in page.get("CommonPrefixes")]
+            yield from [folder + _["Prefix"] for _ in page.get("CommonPrefixes") if _["Prefix"] != "/"]
+        if "Contents" in page:
+            yield from [folder + _["Key"] for _ in page.get("Contents")]
 
 
 def object_info(target: str) -> dict:
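Note on the hunk above: `list_folder` now skips a bare "/" prefix and also yields the objects listed under `Contents`, so files directly inside the folder are no longer dropped. The per-page logic can be sketched on a plain dictionary shaped like a paginated listing response; `entries_from_page` and the sample page are illustrative assumptions.

```python
from typing import Any, Dict, Iterator


def entries_from_page(folder: str, page: Dict[str, Any]) -> Iterator[str]:
    """Yield sub-folders, then objects, found in one listing page."""
    for prefix in page.get("CommonPrefixes", []):
        if prefix["Prefix"] != "/":  # ignore the degenerate root prefix
            yield folder + prefix["Prefix"]
    for obj in page.get("Contents", []):
        yield folder + obj["Key"]


page = {
    "CommonPrefixes": [{"Prefix": "/"}, {"Prefix": "data/"}],
    "Contents": [{"Key": "readme.txt"}],
}
for entry in entries_from_page("s3://bucket/", page):
    print(entry)
```

Using `page.get(..., [])` makes the two `in page` checks from the diff unnecessary while keeping the same behaviour for missing keys.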
@@ -548,7 +610,33 @@ def object_info(target: str) -> dict:
         return s3.head_object(Bucket=bucket, Key=key)
     except s3.exceptions.ClientError as e:
         if e.response["Error"]["Code"] == "404":
-            raise ValueError(f"{target} does not exist")
+            raise FileNotFoundError(f"{target} does not exist")
+        raise
+
+
+def object_exists(target: str) -> bool:
+    """Check if an object exists.
+
+    Parameters
+    ----------
+    target : str
+        The URL of a file or a folder on S3. The URL should start with 's3://'.
+
+    Returns
+    -------
+    bool
+        True if the object exists, False otherwise.
+    """
+
+    _, _, bucket, key = target.split("/", 3)
+    s3 = s3_client(bucket)
+
+    try:
+        s3.head_object(Bucket=bucket, Key=key)
+        return True
+    except s3.exceptions.ClientError as e:
+        if e.response["Error"]["Code"] == "404":
+            return False
         raise
 
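Note on the hunk above: `object_info` now raises FileNotFoundError instead of ValueError for a missing object, and the new `object_exists` turns the same 404 into a boolean. The control flow can be exercised without a network using stand-ins; `FakeClientError`, `FakeS3`, `split_s3_url` and `exists` below are hypothetical test doubles, not boto3 or anemoi-utils classes.

```python
from typing import Iterable, Tuple


class FakeClientError(Exception):
    """Stand-in for a client error carrying a response dictionary."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.response = {"Error": {"Code": code}}


class FakeS3:
    """Stand-in client that knows a fixed set of (bucket, key) pairs."""

    def __init__(self, keys: Iterable[Tuple[str, str]]) -> None:
        self._keys = set(keys)

    def head_object(self, Bucket: str, Key: str) -> dict:
        if (Bucket, Key) not in self._keys:
            raise FakeClientError("404")
        return {"ContentLength": 0}


def split_s3_url(target: str) -> Tuple[str, str]:
    # "s3://bucket/a/b" -> ["s3:", "", "bucket", "a/b"]
    _, _, bucket, key = target.split("/", 3)
    return bucket, key


def exists(client: FakeS3, target: str) -> bool:
    bucket, key = split_s3_url(target)
    try:
        client.head_object(Bucket=bucket, Key=key)
        return True
    except FakeClientError as e:
        if e.response["Error"]["Code"] == "404":
            return False
        raise  # anything else (e.g. permissions) is not "missing"


client = FakeS3({("bucket", "a/b.txt")})
print(exists(client, "s3://bucket/a/b.txt"), exists(client, "s3://bucket/missing"))
```

One consequence of the `split("/", 3)` idiom is that a URL with no key part, such as "s3://bucket", fails to unpack and raises ValueError before any request is made.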
 
@@ -567,7 +655,7 @@ def object_acl(target: str) -> dict:
567
655
  """
568
656
 
569
657
  _, _, bucket, key = target.split("/", 3)
570
- s3 = s3_client()
658
+ s3 = s3_client(bucket)
571
659
 
572
660
  return s3.get_object_acl(Bucket=bucket, Key=key)
573
661
 
@@ -610,3 +698,29 @@ def upload(source: str, target: str, *args, **kwargs) -> None:
 
     assert target.startswith("s3://"), f"target {target} should start with 's3://'"
     return transfer(source, target, *args, **kwargs)
+
+
+def quotas(target: str) -> dict:
+    """Get the quotas for an S3 bucket.
+
+    Parameters
+    ----------
+    target : str
+        The URL of a file or a folder on S3. The URL should start with 's3://'.
+
+    Returns
+    -------
+    dict
+        A dictionary with the quotas for the bucket.
+    """
+    from botocore.exceptions import ClientError
+
+    _, _, bucket, _ = target.split("/", 3)
+    s3 = s3_client(bucket, service="service-quotas")
+
+    try:
+        return s3.list_service_quotas(ServiceCode="ec2")
+    except ClientError as e:
+        if e.response["Error"]["Code"] == "404":
+            raise ValueError(f"{target} does not exist")
+        raise
@@ -261,12 +261,12 @@ def _run_slow_tests() -> bool:
 @lru_cache(maxsize=None)
 def _offline() -> bool:
     """Check if we are offline."""
-
-    import socket
+    from urllib import request
 
     try:
-        socket.create_connection(("anemoi.ecmwf.int", 443), timeout=5)
-    except OSError:
+        request.urlopen("https://anemoi.ecmwf.int", timeout=1)
+        return False
+    except request.URLError:
         return True
 
     return False
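Note on the hunk above: switching from a raw socket to `urllib.request.urlopen` means the check goes through the proxies configured in the environment, which is what the changelog entry describes. In the diff, any `URLError` counts as offline, and that includes `HTTPError` (an error status from a server that did answer). The sketch below is a variation that treats an HTTP error status as "online" and accepts an injectable opener so it can be tested without network access; `is_offline` and its `opener` parameter are assumptions made for this example.

```python
from typing import Any, Callable
from urllib import error, request


def is_offline(
    url: str,
    timeout: float = 1.0,
    opener: Callable[..., Any] = request.urlopen,
) -> bool:
    """Return True if `url` cannot be reached at all."""
    try:
        opener(url, timeout=timeout)
    except error.HTTPError:
        # The server replied, even if with an error status: we are online.
        return False
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return True
    return False


def unreachable(url: str, timeout: float) -> None:
    # Simulated failure used instead of a real network call.
    raise error.URLError("no route to host")


print(is_offline("https://example.invalid", opener=unreachable))
```

`HTTPError` has to be caught before `OSError` because it is a subclass of `URLError`, which is itself an `OSError`.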
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: anemoi-utils
3
- Version: 0.4.22
3
+ Version: 0.4.24
4
4
  Summary: A package to hold various functions to support training of ML models on ECMWF data.
5
5
  Author-email: "European Centre for Medium-Range Weather Forecasts (ECMWF)" <software.support@ecmwf.int>
6
6
  License: Apache License
@@ -252,7 +252,7 @@ Provides-Extra: provenance
252
252
  Requires-Dist: gitpython; extra == "provenance"
253
253
  Requires-Dist: nvsmi; extra == "provenance"
254
254
  Provides-Extra: s3
255
- Requires-Dist: boto3<1.36; extra == "s3"
255
+ Requires-Dist: boto3>1.36; extra == "s3"
256
256
  Provides-Extra: tests
257
257
  Requires-Dist: pytest; extra == "tests"
258
258
  Provides-Extra: text
@@ -14,7 +14,6 @@ pyproject.toml
14
14
  .github/dependabot.yml
15
15
  .github/labeler.yml
16
16
  .github/pull_request_template.md
17
- .github/release.yml
18
17
  .github/workflows/downstream-ci-hpc.yml
19
18
  .github/workflows/pr-conventional-commit.yml
20
19
  .github/workflows/pr-label-conventional-commits.yml
@@ -36,7 +36,7 @@ gitpython
36
36
  nvsmi
37
37
 
38
38
  [s3]
39
- boto3<1.36
39
+ boto3>1.36
40
40
 
41
41
  [tests]
42
42
  pytest
@@ -1,46 +0,0 @@
1
- ## Description
2
-
3
- <!-- Provide a brief summary of the changes introduced in this pull request. -->
4
-
5
- ## Type of Change
6
-
7
- - [ ] Bug fix (non-breaking change which fixes an issue)
8
- - [ ] New feature (non-breaking change which adds functionality)
9
- - [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
10
- - [ ] Documentation update
11
-
12
- ## Issue Number
13
-
14
- <!-- Link the Issue number this change addresses, ideally in one of the "magic format" such as Closes #XYZ -->
15
-
16
- <!-- Alternatively, explain the motivation behind the changes and the context in which they are being made. -->
17
-
18
- ## Code Compatibility
19
-
20
- - [ ] I have performed a self-review of my code
21
-
22
- ### Code Performance and Testing
23
-
24
- - [ ] I have added tests that prove my fix is effective or that my feature works
25
- - [ ] I ran the [complete Pytest test](https://anemoi.readthedocs.io/projects/training/en/latest/dev/testing.html) suite locally, and they pass
26
-
27
- <!-- In case this affects the model sharding or other specific components please describe these here. -->
28
-
29
- ### Dependencies
30
-
31
- - [ ] I have ensured that the code is still pip-installable after the changes and runs
32
- - [ ] I have tested that new dependencies themselves are pip-installable.
33
-
34
- <!-- List any new dependencies that are required for this change and the justification to add them. -->
35
-
36
- ### Documentation
37
-
38
- - [ ] My code follows the style guidelines of this project
39
- - [ ] I have updated the documentation and docstrings to reflect the changes
40
- - [ ] I have added comments to my code, particularly in hard-to-understand areas
41
-
42
- <!-- Describe any major updates to the documentation -->
43
-
44
- ## Additional Notes
45
-
46
- <!-- Include any additional information, caveats, or considerations that the reviewer should be aware of. -->
@@ -1,23 +0,0 @@
1
- # .github/release.yml
2
- # https://docs.github.com/en/repositories/releasing-projects-on-github/automatically-generated-release-notes
3
-
4
- changelog:
5
- exclude:
6
- labels:
7
- - ignore-for-release
8
- - no-changelog
9
- authors:
10
- - pre-commit-ci
11
- categories:
12
- - title: Breaking Changes 🛠
13
- labels:
14
- - "breaking change"
15
- - title: Exciting New Features 🎉
16
- labels:
17
- - enhancement
18
- - title: Config Changes 📑
19
- labels:
20
- - config
21
- - title: Other Changes 🔗
22
- labels:
23
- - "*"
@@ -1,3 +0,0 @@
1
- {
2
- ".": "0.4.22"
3
- }
File without changes