gfslib 0.1.0__tar.gz

gfslib-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,102 @@
+ Metadata-Version: 2.4
+ Name: gfslib
+ Version: 0.1.0
+ Summary: Python library for communication to the GeMMA Fusion Server
+ Author: Mitko Nikov
+ Author-email: mitko.nikov@student.um.si
+ Requires-Python: >=3.10
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Programming Language :: Python :: 3.14
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
+ Requires-Dist: requests (>=2.32.5,<3.0.0)
+ Project-URL: Bug Tracker, https://github.com/mitkonikov/gfslib/issues
+ Project-URL: Documentation, https://github.com/mitkonikov/gfslib
+ Project-URL: Homepage, https://github.com/mitkonikov/gfslib
+ Project-URL: Repository, https://github.com/mitkonikov/gfslib
+ Description-Content-Type: text/markdown
+
+ # GeMMA Fusion Suite Python Library (GFSLib)
+
+ GFSLib is a Python library that provides utilities for working with the GeMMA Fusion Suite Server.
+ It is currently in alpha development and only includes storage utilities for communicating with the GF Server.
+
+ ## Getting Started
+
+ To install GFSLib, use pip:
+
+ ```bash
+ pip install gfslib
+ ```
+
+ ## Example Usage
+
+ Here is a simple example of how to use the storage utilities in GFSLib:
+
+ ```python
+ from gfslib.storage import StorageServices
+
+ storage = StorageServices("https://.../api/ws/<workspace-id>/services/storage")
+ storage.set_api_key("...")
+
+ storage.upload("path/to/remote.file", "/path/to/local.file")
+ storage.download("path/to/remote.file", "/path/to/downloaded.file")
+ ```
+
+ ## Contributions
+
+ We really appreciate contributions from the community!
+ We especially welcome reports of issues and bugs.
+
+ However, note that this library is under heavy development, so
+ the API may change drastically, and projects that depend on it will have to handle
+ those changes downstream. We will, however, try to keep breaking changes to a minimum.
+
+ The main maintainer of this library is [Mitko Nikov](https://github.com/mitkonikov).
+
+ ### Developing the library
+
+ We use [Poetry](https://python-poetry.org/) to manage, build and publish the Python package.
+ We recommend installing Poetry and running `poetry install` to
+ install all of the dependencies instead of doing so manually.
+
+ To activate the virtual env created by Poetry, run `poetry env activate`, which prints the
+ command that activates the env. After activation, you can run any project command from within it.
+
+ ### Contributing to GitHub
+
+ There are three things that we are very strict about:
+ - Type-checking - powered by [mypy](https://mypy-lang.org/)
+ - Coding style - powered by [Black](https://black.readthedocs.io/en/stable/)
+ - Unit Tests - powered by [pytest](https://docs.pytest.org/en/stable/)
+
+ Run the following commands in the virtual env
+ to ensure that everything follows the guidelines:
+
+ ```sh
+ mypy . --strict
+ black .
+ pytest .
+ ```
+
+ The guidelines are checked automatically using GitHub Workflows.
+ When developing the library locally, you can install [act](https://nektosact.com/) to run
+ the GitHub workflows on your machine through Docker.
+ We also recommend installing the VSCode extension
+ [GitHub Local Actions](https://marketplace.visualstudio.com/items?itemName=SanjulaGanepola.github-local-actions)
+ to run the workflows from inside VSCode, making the process painless.
+
+ Example scenarios are also tested in GitHub Actions by running them from the CLI.
+
+ ## General Guidelines
+
+ Here are a few guidelines to follow while contributing to the library:
+ - We aim to keep this library with as few runtime dependencies as possible.
+ - Unit tests for as many functions as possible (we know that we can't cover everything).
+ - Strict static type-checking using `mypy`
+ - Strict formatting style guidelines using `black`
+ - Nicely documented functions and classes
+
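Reviewer note: the `download` call shown in the usage example also accepts a `byte_range` tuple, according to the client source included later in this diff. The sketch below mirrors how that tuple appears to be turned into an HTTP `Range` header value. `range_header` is a hypothetical helper name used only for illustration; it is not part of gfslib.

```python
from typing import Optional, Tuple


def range_header(byte_range: Tuple[int, Optional[int]]) -> str:
    """Format (start, end) as an HTTP Range header value.

    An `end` of None produces an open-ended range starting at `start`.
    """
    start, end = byte_range
    if end is None:
        return f"bytes={start}-"
    return f"bytes={start}-{end}"


# HTTP byte ranges are inclusive on both ends.
print(range_header((0, 1023)))
# Open-ended: from byte 4096 to the end of the file.
print(range_header((4096, None)))
```

Whether the server honors partial requests is not stated in the package, so treat ranged downloads as something to verify against a real workspace.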
gfslib-0.1.0/README.md ADDED
@@ -0,0 +1,80 @@
+ # GeMMA Fusion Suite Python Library (GFSLib)
+
+ GFSLib is a Python library that provides utilities for working with the GeMMA Fusion Suite Server.
+ It is currently in alpha development and only includes storage utilities for communicating with the GF Server.
+
+ ## Getting Started
+
+ To install GFSLib, use pip:
+
+ ```bash
+ pip install gfslib
+ ```
+
+ ## Example Usage
+
+ Here is a simple example of how to use the storage utilities in GFSLib:
+
+ ```python
+ from gfslib.storage import StorageServices
+
+ storage = StorageServices("https://.../api/ws/<workspace-id>/services/storage")
+ storage.set_api_key("...")
+
+ storage.upload("path/to/remote.file", "/path/to/local.file")
+ storage.download("path/to/remote.file", "/path/to/downloaded.file")
+ ```
+
+ ## Contributions
+
+ We really appreciate contributions from the community!
+ We especially welcome reports of issues and bugs.
+
+ However, note that this library is under heavy development, so
+ the API may change drastically, and projects that depend on it will have to handle
+ those changes downstream. We will, however, try to keep breaking changes to a minimum.
+
+ The main maintainer of this library is [Mitko Nikov](https://github.com/mitkonikov).
+
+ ### Developing the library
+
+ We use [Poetry](https://python-poetry.org/) to manage, build and publish the Python package.
+ We recommend installing Poetry and running `poetry install` to
+ install all of the dependencies instead of doing so manually.
+
+ To activate the virtual env created by Poetry, run `poetry env activate`, which prints the
+ command that activates the env. After activation, you can run any project command from within it.
+
+ ### Contributing to GitHub
+
+ There are three things that we are very strict about:
+ - Type-checking - powered by [mypy](https://mypy-lang.org/)
+ - Coding style - powered by [Black](https://black.readthedocs.io/en/stable/)
+ - Unit Tests - powered by [pytest](https://docs.pytest.org/en/stable/)
+
+ Run the following commands in the virtual env
+ to ensure that everything follows the guidelines:
+
+ ```sh
+ mypy . --strict
+ black .
+ pytest .
+ ```
+
+ The guidelines are checked automatically using GitHub Workflows.
+ When developing the library locally, you can install [act](https://nektosact.com/) to run
+ the GitHub workflows on your machine through Docker.
+ We also recommend installing the VSCode extension
+ [GitHub Local Actions](https://marketplace.visualstudio.com/items?itemName=SanjulaGanepola.github-local-actions)
+ to run the workflows from inside VSCode, making the process painless.
+
+ Example scenarios are also tested in GitHub Actions by running them from the CLI.
+
+ ## General Guidelines
+
+ Here are a few guidelines to follow while contributing to the library:
+ - We aim to keep this library with as few runtime dependencies as possible.
+ - Unit tests for as many functions as possible (we know that we can't cover everything).
+ - Strict static type-checking using `mypy`
+ - Strict formatting style guidelines using `black`
+ - Nicely documented functions and classes
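Reviewer note: in the client source included in this diff, `upload` returns the `requests.Response` without checking its status, while `download` calls `raise_for_status()` itself. A caller who wants failed uploads to raise could wrap the call as sketched below. `upload_checked` is a hypothetical helper, not part of gfslib, and it only assumes the object passed in has an `upload` method that returns a response with `raise_for_status()`.

```python
from typing import Any


def upload_checked(storage: Any, remote_path: str, local_path: str) -> Any:
    """Upload through a StorageServices-like object and raise on HTTP errors.

    `storage` is assumed to expose upload(remote_path, data) returning a
    requests.Response, as the client in this diff does.
    """
    resp = storage.upload(remote_path, local_path)
    # requests.Response.raise_for_status raises HTTPError for 4xx/5xx replies.
    resp.raise_for_status()
    return resp
```

With the README example, this would be called as `upload_checked(storage, "path/to/remote.file", "/path/to/local.file")`.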
gfslib-0.1.0/pyproject.toml ADDED
@@ -0,0 +1,36 @@
+ [project]
+ name = "gfslib"
+ version = "0.1.0"
+ packages = [{ include = "gfslib", from = "src" }]
+ description = "Python library for communication to the GeMMA Fusion Server"
+ authors = [
+     {name = "Mitko Nikov",email = "mitko.nikov@student.um.si"}
+ ]
+ readme = "README.md"
+ requires-python = ">=3.10"
+ dependencies = [
+     "requests (>=2.32.5,<3.0.0)"
+ ]
+ dynamic = [ "classifiers" ]
+
+ [project.urls]
+ homepage = "https://github.com/mitkonikov/gfslib"
+ repository = "https://github.com/mitkonikov/gfslib"
+ documentation = "https://github.com/mitkonikov/gfslib"
+ "Bug Tracker" = "https://github.com/mitkonikov/gfslib/issues"
+
+ [tool.poetry]
+ classifiers = [
+     "Topic :: Software Development :: Libraries :: Python Modules"
+ ]
+
+ [tool.poetry.group.dev.dependencies]
+ mypy = "^1.19.1"
+ black = "^25.12.0"
+ pytest = "^9.0.2"
+ types-requests = "^2.32.4.20260107"
+ python-dotenv = "^1.2.1"
+
+ [build-system]
+ requires = ["poetry-core>=2.0.0,<3.0.0"]
+ build-backend = "poetry.core.masonry.api"
gfslib-0.1.0/src/gfslib/__init__.py ADDED
@@ -0,0 +1,2 @@
+ __version__ = "0.1.0"
+ __author__ = "Mitko Nikov"
gfslib-0.1.0/src/gfslib/storage/__init__.py ADDED
@@ -0,0 +1,5 @@
+ """Storage utilities for gfslib."""
+
+ from .client import StorageServices
+
+ __all__ = ["StorageServices"]
gfslib-0.1.0/src/gfslib/storage/client.py ADDED
@@ -0,0 +1,237 @@
+ """High-level StorageServices for GeMMA Fusion Server.
+
+ Provides methods for listing, uploading, downloading, deleting,
+ fetching metadata and syncing local files with the remote storage service.
+
+ Usage:
+     client = StorageServices("https://fusion.gemma.feri.um.si/gf-test/api/ws/<workspace-id>/services/storage")
+     client.set_api_key("...")
+     client.ls()
+ """
+
+ from __future__ import annotations
+
+ import hashlib
+ import json
+ import os
+ from typing import Dict, Iterable, List, Optional, Tuple, Any
+ from urllib.parse import quote
+
+ import requests
+
+
+ class StorageServices:
+     """Client for the server storage API.
+
+     Initialize with the base storage service URL. Examples of base URL:
+     - https://.../api/ws/<workspace-id>/services/storage
+     """
+
+     def __init__(self, base_url: str, timeout: Optional[float] = 30.0) -> None:
+         self.base_url = base_url.rstrip("/")
+         self._api_key: Optional[str] = None
+         self.timeout = timeout
+
+     def set_api_key(self, key: str) -> None:
+         """Set the X-API-Key to use for requests."""
+         self._api_key = key
+
+     def _headers(self, extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
+         headers = {
+             "Accept": "*/*",
+         }
+         if self._api_key:
+             headers["X-API-Key"] = self._api_key
+         if extra:
+             headers.update(extra)
+         return headers
+
+     def _file_url(self, remote_path: str) -> str:
+         # Ensure we don't double the slashes; keep path parts encoded except '/'
+         rp = remote_path.lstrip("/")
+         return f"{self.base_url}/files/{quote(rp, safe='/')}"
+
+     def ls(self) -> requests.Response:
+         """List remote files (short). Returns the requests.Response object."""
+         url = f"{self.base_url}/ls"
+         return requests.get(url, headers=self._headers(), timeout=self.timeout)
+
+     def ls_long(self) -> requests.Response:
+         """List remote files (long). Returns the requests.Response object."""
+         url = f"{self.base_url}/ls/long"
+         return requests.get(url, headers=self._headers(), timeout=self.timeout)
+
+     def upload(
+         self, remote_path: str, data: bytes | str | os.PathLike[str]
+     ) -> requests.Response:
+         """Upload data to `remote_path` using PUT.
+
+         `data` can be raw bytes, a string (will be encoded as utf-8), or a path
+         to a local file to stream.
+         """
+         url = self._file_url(remote_path)
+         headers = self._headers({"Content-Type": "application/octet-stream"})
+
+         if isinstance(data, (bytes, bytearray)):
+             body = data
+             resp = requests.put(url, headers=headers, data=body, timeout=self.timeout)
+             return resp
+
+         if isinstance(data, (str, os.PathLike)) and os.path.exists(data):
+             # treat as file path
+             with open(data, "rb") as fh:
+                 return requests.put(url, headers=headers, data=fh, timeout=self.timeout)
+
+         # otherwise treat as string content
+         if isinstance(data, str):
+             body = data.encode("utf-8")
+             return requests.put(url, headers=headers, data=body, timeout=self.timeout)
+
+         raise TypeError("data must be bytes, string, or path to file")
+
+     def download(
+         self,
+         remote_path: str,
+         dest: Optional[os.PathLike[str] | str] = None,
+         byte_range: Optional[Tuple[int, Optional[int]]] = None,
+     ) -> bytes:
+         """Download a file. If `byte_range` (start, end) is given, send a Range header.
+
+         If `dest` is provided the content is written to that path and an empty bytes
+         object is returned; otherwise the file bytes are returned.
+         """
+         url = self._file_url(remote_path)
+         headers = self._headers()
+         if byte_range is not None:
+             start, end = byte_range
+             if end is None:
+                 headers["Range"] = f"bytes={start}-"
+             else:
+                 headers["Range"] = f"bytes={start}-{end}"
+
+         resp = requests.get(url, headers=headers, stream=True, timeout=self.timeout)
+         resp.raise_for_status()
+
+         if dest is not None:
+             dest_path = str(dest)
+             os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
+             with open(dest_path, "wb") as fh:
+                 for chunk in resp.iter_content(chunk_size=8192):
+                     if chunk:
+                         fh.write(chunk)
+             return b""
+
+         # collect into bytes
+         return resp.content
+
+     def delete(self, remote_path: str) -> requests.Response:
+         """Delete a remote file."""
+         url = self._file_url(remote_path)
+         return requests.delete(url, headers=self._headers(), timeout=self.timeout)
+
+     def metadata(self, filepaths: Iterable[str], ignore_sha: bool = False) -> Any:
+         """Get metadata for the provided filepaths."""
+         url = f"{self.base_url}/metadata"
+         if ignore_sha:
+             url = f"{url}?ignoreSha=true"
+
+         body = json.dumps(list(filepaths))
+         headers = self._headers(
+             {"Content-Type": "application/json", "Content-Length": str(len(body))}
+         )
+         resp = requests.post(
+             url, headers=headers, data=body.encode("utf-8"), timeout=self.timeout
+         )
+         # return parsed JSON (list/dict) for easier consumption by callers
+         try:
+             return resp.json()
+         except ValueError:
+             return {}
+
+     @staticmethod
+     def compute_sha256(path: os.PathLike[str] | str) -> str:
+         """Compute SHA-256 hex digest of a file."""
+         h = hashlib.sha256()
+         with open(str(path), "rb") as fh:
+             for chunk in iter(lambda: fh.read(8192), b""):
+                 h.update(chunk)
+         return h.hexdigest()
+
+     def sync_local_to_remote(
+         self,
+         local_dir: os.PathLike[str] | str,
+         remote_prefix: str = "",
+         ignore_sha: bool = False,
+         dry_run: bool = False,
+     ) -> Dict[str, str]:
+         """Sync files from `local_dir` to remote storage under `remote_prefix`.
+
+         Behaviour:
+         - Walk `local_dir`, collect relative file paths.
+         - Query `metadata` for those paths (prefixed by `remote_prefix`).
+         - Upload files that are missing or whose SHA differs (with `ignore_sha=True`, upload every file).
+
+         Returns a dict mapping relative path -> action ("uploaded", "uploaded (dry-run)", "skipped").
+         """
+         rem_prefix = remote_prefix.strip("/")
+
+         rel_paths: List[str] = []
+         for root, _dirs, files in os.walk(local_dir):
+             for f in files:
+                 full = os.path.join(root, f)
+                 rel = os.path.relpath(full, local_dir).replace("\\", "/")
+                 if rem_prefix:
+                     rel_paths.append(f"{rem_prefix}/{rel}")
+                 else:
+                     rel_paths.append(rel)
+
+         result: Dict[str, str] = {}
+         if not rel_paths:
+             return result
+
+         # request metadata in one batch (server should accept arrays)
+         remote_meta = self.metadata(rel_paths, ignore_sha=ignore_sha)
+         if remote_meta is None:
+             remote_meta = {}
+
+         # Build a mapping path -> metadata, keyed by each item's "Path" field
+         remote_map = {}
+
+         def _norm(s: str) -> str:
+             return s.replace("\\", "/")
+
+         if isinstance(remote_meta, list):
+             for item in remote_meta:
+                 p = item.get("Path")
+                 if isinstance(p, str):
+                     remote_map[_norm(p)] = item
+
+         # For each local file decide upload or skip
+         for rel in rel_paths:
+             # derive local file path
+             if rem_prefix:
+                 # strip prefix when mapping back to local relative path
+                 local_rel = rel[len(rem_prefix) + 1 :]
+             else:
+                 local_rel = rel
+             local_path = os.path.join(local_dir, local_rel.replace("/", os.sep))
+
+             action = "uploaded"
+             # if remote metadata contains sha, use it
+             remote_item = remote_map.get(_norm(rel))
+             if remote_item and not ignore_sha:
+                 # prefer Sha2 (server uses SHA-256)
+                 remote_sha = remote_item.get("Sha2") or remote_item.get("sha2")
+                 if remote_sha:
+                     local_sha = self.compute_sha256(local_path)
+                     if local_sha == remote_sha:
+                         action = "skipped"
+
+             if action == "uploaded":
+                 if not dry_run:
+                     self.upload(rel, local_path)
+                 result[local_rel] = "uploaded (dry-run)" if dry_run else "uploaded"
+             else:
+                 result[local_rel] = "skipped"
+
+         return result
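Reviewer note: the upload-or-skip rule inside `sync_local_to_remote` can be exercised without a server. The sketch below isolates that rule as read from the code above: a file is skipped only when the remote metadata carries a SHA-256 digest, comparison is enabled, and the digest matches the local file. `sha256_of` and `decide` are hypothetical names used for illustration; they are not part of gfslib.

```python
import hashlib
import os
import tempfile
from typing import Optional


def sha256_of(path: str) -> str:
    """Hash a file in 8 KiB chunks so large files are not read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def decide(local_path: str, remote_sha: Optional[str], ignore_sha: bool = False) -> str:
    """Return "skipped" only when a remote digest exists, is compared, and matches."""
    if remote_sha and not ignore_sha and sha256_of(local_path) == remote_sha:
        return "skipped"
    return "uploaded"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.txt")
    with open(path, "wb") as fh:
        fh.write(b"hello")
    digest = sha256_of(path)
    print(decide(path, digest))                   # remote digest matches
    print(decide(path, None))                     # no remote entry for the file
    print(decide(path, digest, ignore_sha=True))  # comparison disabled
```

One consequence worth noting for review: with `ignore_sha=True` the comparison never runs, so every local file is uploaded again.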