PyS3Uploader 0.0.0a0__py3-none-any.whl → 0.1.0__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of PyS3Uploader might be problematic.
- pys3uploader-0.1.0.dist-info/METADATA +186 -0
- pys3uploader-0.1.0.dist-info/RECORD +11 -0
- s3/__init__.py +3 -1
- s3/exceptions.py +82 -0
- s3/logger.py +45 -0
- s3/tree.py +53 -0
- s3/uploader.py +175 -0
- s3/utils.py +45 -0
- pys3uploader-0.0.0a0.dist-info/METADATA +0 -49
- pys3uploader-0.0.0a0.dist-info/RECORD +0 -6
- {pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/LICENSE +0 -0
- {pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/WHEEL +0 -0
- {pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/top_level.txt +0 -0

pys3uploader-0.1.0.dist-info/METADATA ADDED

@@ -0,0 +1,186 @@
+Metadata-Version: 2.2
+Name: PyS3Uploader
+Version: 0.1.0
+Summary: Python module to upload objects to an S3 bucket.
+Author-email: Vignesh Rao <svignesh1793@gmail.com>
+License: MIT License
+
+Copyright (c) 2025 Vignesh Rao
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+Project-URL: Homepage, https://github.com/thevickypedia/PyS3Uploader
+Project-URL: Docs, https://thevickypedia.github.io/PyS3Uploader/
+Project-URL: Source, https://github.com/thevickypedia/PyS3Uploader
+Project-URL: Bug Tracker, https://github.com/thevickypedia/PyS3Uploader/issues
+Keywords: s3
+Classifier: Development Status :: 1 - Planning
+Classifier: Intended Audience :: Information Technology
+Classifier: Operating System :: OS Independent
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Topic :: Internet :: File Transfer Protocol (FTP)
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: boto3==1.40.*
+Requires-Dist: tqdm==4.67.*
+Provides-Extra: dev
+Requires-Dist: sphinx==5.1.1; extra == "dev"
+Requires-Dist: pre-commit; extra == "dev"
+Requires-Dist: recommonmark; extra == "dev"
+
+**Versions Supported**
+
+
+
+**Language Stats**
+
+
+
+
+**Repo Stats**
+
+[][license]
+[][repo]
+[][repo]
+
+**Activity**
+
+[][repo]
+[][repo]
+[][repo]
+
+**Build Status**
+
+[![pypi-publish][gha-pypi-badge]][gha-pypi]
+[![pages-build-deployment][gha-pages-badge]][gha-pages]
+
+# PyS3Uploader
+Python module to upload an entire directory to an S3 bucket.
+
+### Installation
+```shell
+pip install PyS3Uploader
+```
+
+### Usage
+
+##### Upload objects in parallel
+```python
+import s3
+
+if __name__ == '__main__':
+    wrapper = s3.Uploader(
+        bucket_name="BUCKET_NAME",
+        upload_dir="FULL_PATH_TO_UPLOAD",
+        prefix_dir="START_DIRECTORY_IN_S3"
+    )
+    wrapper.run_in_parallel()
+```
+
+##### Upload objects in sequence
+```python
+import s3
+
+if __name__ == '__main__':
+    wrapper = s3.Uploader(
+        bucket_name="BUCKET_NAME",
+        upload_dir="FULL_PATH_TO_UPLOAD",
+        prefix_dir="START_DIRECTORY_IN_S3"
+    )
+    wrapper.run()
+```
+
+#### Mandatory arg
+- **bucket_name** - Name of the s3 bucket.
+- **upload_dir** - Directory to upload.
+
+#### Optional kwargs
+- **prefix_dir** - Start directory from ``upload_dir`` to use as root in S3. Defaults to `None`
+- **logger** - Bring your own custom pre-configured logger. Defaults to on-screen logging.
+<br><br>
+- **region_name** - AWS region name. Defaults to the env var `AWS_DEFAULT_REGION`
+- **profile_name** - AWS profile name. Defaults to the env var `PROFILE_NAME`
+- **aws_access_key_id** - AWS access key ID. Defaults to the env var `AWS_ACCESS_KEY_ID`
+- **aws_secret_access_key** - AWS secret access key. Defaults to the env var `AWS_SECRET_ACCESS_KEY`
+> AWS values are loaded from env vars or the default config at `~/.aws/config` / `~/.aws/credentials`
+
+### Coding Standards
+Docstring format: [`Google`](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) <br>
+Styling conventions: [`PEP 8`](https://www.python.org/dev/peps/pep-0008/) <br>
+Clean code with pre-commit hooks: [`flake8`](https://flake8.pycqa.org/en/latest/) and
+[`isort`](https://pycqa.github.io/isort/)
+
+## [Release Notes][release-notes]
+**Requirement**
+```shell
+python -m pip install gitverse
+```
+
+**Usage**
+```shell
+gitverse-release reverse -f release_notes.rst -t 'Release Notes'
+```
+
+## Linting
+`pre-commit` will ensure linting, run pytest, generate runbook & release notes, and validate hyperlinks in ALL
+markdown files (including Wiki pages)
+
+**Requirement**
+```shell
+pip install sphinx==5.1.1 pre-commit recommonmark
+```
+
+**Usage**
+```shell
+pre-commit run --all-files
+```
+
+## Pypi Package
+[![pypi-module][label-pypi-package]][pypi-repo]
+
+[https://pypi.org/project/PyS3Uploader/][pypi]
+
+## Runbook
+[![made-with-sphinx-doc][label-sphinx-doc]][sphinx]
+
+[https://thevickypedia.github.io/PyS3Uploader/][runbook]
+
+## License & copyright
+
+© Vignesh Rao
+
+Licensed under the [MIT License][license]
+
+[license]: https://github.com/thevickypedia/PyS3Uploader/blob/main/LICENSE
+[release-notes]: https://github.com/thevickypedia/PyS3Uploader/blob/main/release_notes.rst
+[pypi]: https://pypi.org/project/PyS3Uploader/
+[pypi-tutorials]: https://packaging.python.org/tutorials/packaging-projects/
+[pypi-logo]: https://img.shields.io/badge/Software%20Repository-pypi-1f425f.svg
+[repo]: https://api.github.com/repos/thevickypedia/PyS3Uploader
+[gha-pages-badge]: https://github.com/thevickypedia/PyS3Uploader/actions/workflows/pages/pages-build-deployment/badge.svg
+[gha-pypi-badge]: https://github.com/thevickypedia/PyS3Uploader/actions/workflows/python-publish.yml/badge.svg
+[gha-pages]: https://github.com/thevickypedia/PyS3Uploader/actions/workflows/pages/pages-build-deployment
+[gha-pypi]: https://github.com/thevickypedia/PyS3Uploader/actions/workflows/python-publish.yml
+[sphinx]: https://www.sphinx-doc.org/en/master/man/sphinx-autogen.html
+[label-sphinx-doc]: https://img.shields.io/badge/Made%20with-Sphinx-blue?style=for-the-badge&logo=Sphinx
+[runbook]: https://thevickypedia.github.io/PyS3Uploader/
+[label-pypi-package]: https://img.shields.io/badge/Pypi%20Package-PyS3Uploader-blue?style=for-the-badge&logo=Python
+[pypi-repo]: https://packaging.python.org/tutorials/packaging-projects/
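
The README embedded in the new metadata above explains the optional kwargs and their environment-variable fallbacks. As a rough sketch based only on that documented signature (the region, profile, and logger values below are placeholders, not taken from the package), passing them explicitly instead of relying on env vars could look like this:

```python
import logging

import s3

# Placeholder logger; any pre-configured logging.Logger works, otherwise the
# module falls back to its default on-screen logger.
logger = logging.getLogger("uploader-example")
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)

wrapper = s3.Uploader(
    bucket_name="BUCKET_NAME",
    upload_dir="FULL_PATH_TO_UPLOAD",
    region_name="us-east-2",   # assumed example; falls back to AWS_DEFAULT_REGION
    profile_name="default",    # assumed example; falls back to PROFILE_NAME
    logger=logger,
)
wrapper.run_in_parallel()
```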

pys3uploader-0.1.0.dist-info/RECORD ADDED

@@ -0,0 +1,11 @@
+s3/__init__.py,sha256=zSLbLrsnVD-tRtiiTcT3JDWzmpnSC9mP6uHMXt2cyEc,66
+s3/exceptions.py,sha256=hH3jlMOe8yjBatQK9EdndWZz4QESU74KSY_iDhQ37SY,2585
+s3/logger.py,sha256=oH540oq8jY723jA4lDWlgfFPLbNgGXTkDwFpB7TLO_o,1196
+s3/tree.py,sha256=DiQ2ekMMaj2m_P3-iKkEqSuJCJZ_UZxcAwHtAoPVa5c,1824
+s3/uploader.py,sha256=Z2EvtUlR5jlL1xbeQWj4XLBfhTn4yWPm9E8WhPcz6Qk,7056
+s3/utils.py,sha256=swkdwkfn43e8I3dGL9HAGZ-dba3fIeorihVAjTE07wc,1291
+pys3uploader-0.1.0.dist-info/LICENSE,sha256=8k-hEraOzyum0GvmmK65YxNRTFXK7eIFHJ0OshJXeTk,1068
+pys3uploader-0.1.0.dist-info/METADATA,sha256=gkAfOF-hEXYfW9p0PZiJhcrpLjhNorK7LFZNtU_ybrE,7188
+pys3uploader-0.1.0.dist-info/WHEEL,sha256=beeZ86-EfXScwlR_HKu4SllMC9wUEj_8Z_4FJ3egI2w,91
+pys3uploader-0.1.0.dist-info/top_level.txt,sha256=iQp4y1P58Q633gj8M08kHE4mqqT0hixuDWcniDk_RJ4,3
+pys3uploader-0.1.0.dist-info/RECORD,,

s3/__init__.py CHANGED

s3/exceptions.py ADDED

@@ -0,0 +1,82 @@
+"""Module to store all the custom exceptions and formatters.
+
+>>> S3Error
+
+"""
+
+from typing import Dict, Set
+
+
+class S3Error(Exception):
+    """Custom error for base exception to the PyS3Uploader module."""
+
+
+class BucketNotFound(S3Error):
+    """Custom error for bucket not found."""
+
+
+class NoObjectFound(S3Error):
+    """Custom error for no objects found."""
+
+
+def convert_to_folder_structure(sequence: Set[str]) -> str:
+    """Convert objects in a s3 buckets into a folder like representation.
+
+    Args:
+        sequence: Takes either a mutable or immutable sequence as an argument.
+
+    Returns:
+        str:
+        String representation of the architecture.
+    """
+    folder_structure = {}
+    for item in sequence:
+        parts = item.split("/")
+        current_level = folder_structure
+        for part in parts:
+            current_level = current_level.setdefault(part, {})
+
+    def generate_folder_structure(structure: Dict[str, dict], indent: str = "") -> str:
+        """Generates the folder like structure.
+
+        Args:
+            structure: Structure of folder objects as key-value pairs.
+            indent: Required indentation for the ASCII.
+        """
+        result = ""
+        for i, (key, value) in enumerate(structure.items()):
+            if i == len(structure) - 1:
+                result += indent + "└── " + key + "\n"
+                sub_indent = indent + "    "
+            else:
+                result += indent + "├── " + key + "\n"
+                sub_indent = indent + "│   "
+            if value:
+                result += generate_folder_structure(value, sub_indent)
+        return result
+
+    return generate_folder_structure(folder_structure)
+
+
+class InvalidPrefix(S3Error):
+    """Custom exception for invalid prefix value."""
+
+    def __init__(self, prefix: str, bucket_name: str, available: Set[str]):
+        """Initialize an instance of ``InvalidPrefix`` object inherited from ``S3Error``
+
+        Args:
+            prefix: Prefix to limit the objects.
+            bucket_name: Name of the S3 bucket.
+            available: Available objects in the s3.
+        """
+        self.prefix = prefix
+        self.bucket_name = bucket_name
+        self.available = available
+        super().__init__(self.format_error_message())
+
+    def format_error_message(self):
+        """Returns the formatter error message as a string."""
+        return (
+            f"\n\n\t{self.prefix!r} was not found in {self.bucket_name}.\n\t"
+            f"Available: {self.available}\n\n{convert_to_folder_structure(self.available)}"
+        )

s3/logger.py ADDED

@@ -0,0 +1,45 @@
+"""Loads a default logger with StreamHandler set to DEBUG mode.
+
+>>> logging.Logger
+
+"""
+
+import logging
+
+
+def default_handler() -> logging.StreamHandler:
+    """Creates a ``StreamHandler`` and assigns a default format to it.
+
+    Returns:
+        logging.StreamHandler:
+        Returns an instance of the ``StreamHandler`` object.
+    """
+    handler = logging.StreamHandler()
+    handler.setFormatter(fmt=default_format())
+    return handler
+
+
+def default_format() -> logging.Formatter:
+    """Creates a logging ``Formatter`` with a custom message and datetime format.
+
+    Returns:
+        logging.Formatter:
+        Returns an instance of the ``Formatter`` object.
+    """
+    return logging.Formatter(
+        fmt="%(asctime)s - %(levelname)s - [%(module)s:%(lineno)d] - %(funcName)s - %(message)s",
+        datefmt="%b-%d-%Y %I:%M:%S %p",
+    )
+
+
+def default_logger() -> logging.Logger:
+    """Creates a default logger with debug mode enabled.
+
+    Returns:
+        logging.Logger:
+        Returns an instance of the ``Logger`` object.
+    """
+    logger = logging.getLogger(__name__)
+    logger.addHandler(hdlr=default_handler())
+    logger.setLevel(level=logging.DEBUG)
+    return logger

s3/tree.py ADDED

@@ -0,0 +1,53 @@
+import pathlib
+from typing import List
+
+
+class Tree:
+    """Root tree formatter for a particular directory location.
+
+    This class allows the creation of a visual representation of the file
+    system hierarchy of a specified directory. It can optionally skip files
+    that start with a dot (hidden files).
+
+    >>> Tree
+
+    """
+
+    def __init__(self, skip_dot_files: bool):
+        """Instantiates the tree object.
+
+        Args:
+            skip_dot_files (bool): If True, skips files with a dot prefix (hidden files).
+        """
+        self.tree_text = []
+        self.skip_dot_files = skip_dot_files
+
+    def scan(self, path: pathlib.Path, last: bool = True, header: str = "") -> List[str]:
+        """Returns contents for a folder as a root tree.
+
+        Args:
+            path: Directory path for which the root tree is to be extracted.
+            last: Indicates if the current item is the last in the directory.
+            header: The prefix for the current level in the tree structure.
+
+        Returns:
+            List[str]:
+            A list of strings representing the directory structure.
+        """
+        elbow = "└──"
+        pipe = "│   "
+        tee = "├──"
+        blank = "    "
+        self.tree_text.append(header + (elbow if last else tee) + path.name)
+        if path.is_dir():
+            children = list(path.iterdir())
+            for idx, child in enumerate(children):
+                # Skip child file/directory when dot files are supposed to be hidden
+                if self.skip_dot_files and child.name.startswith("."):
+                    continue
+                self.scan(
+                    child,
+                    header=header + (blank if last else pipe),
+                    last=idx == len(children) - 1,
+                )
+        return self.tree_text

s3/uploader.py ADDED

@@ -0,0 +1,175 @@
+import logging
+import os
+import time
+from concurrent.futures import ThreadPoolExecutor, as_completed
+from typing import Dict
+
+import boto3.resources.factory
+from botocore.config import Config
+from botocore.exceptions import ClientError
+from tqdm import tqdm
+
+from s3.exceptions import BucketNotFound
+from s3.logger import default_logger
+from s3.utils import UploadResults, get_object_path, getenv
+
+
+class Uploader:
+    """Initiates Uploader object to upload entire directory to S3.
+
+    >>> Uploader
+
+    """
+
+    RETRY_CONFIG: Config = Config(retries={"max_attempts": 10, "mode": "standard"})
+
+    def __init__(
+        self,
+        bucket_name: str,
+        upload_dir: str,
+        prefix_dir: str = None,
+        region_name: str = None,
+        profile_name: str = None,
+        aws_access_key_id: str = None,
+        aws_secret_access_key: str = None,
+        logger: logging.Logger = None,
+    ):
+        """Initiates all the necessary args and creates a boto3 session with retry logic.
+
+        Args:
+            bucket_name: Name of the bucket.
+            upload_dir: Name of the directory to be uploaded.
+            prefix_dir: Start folder name from upload_dir.
+            region_name: Name of the AWS region.
+            profile_name: AWS profile name.
+            aws_access_key_id: AWS access key ID.
+            aws_secret_access_key: AWS secret access key.
+            logger: Bring your own logger.
+        """
+        self.session = boto3.Session(
+            profile_name=profile_name or getenv("PROFILE_NAME"),
+            region_name=region_name or getenv("AWS_DEFAULT_REGION"),
+            aws_access_key_id=aws_access_key_id or getenv("AWS_ACCESS_KEY_ID"),
+            aws_secret_access_key=aws_secret_access_key or getenv("AWS_SECRET_ACCESS_KEY"),
+        )
+        self.s3 = self.session.resource(service_name="s3", config=self.RETRY_CONFIG)
+        self.logger = logger or default_logger()
+        self.upload_dir = upload_dir or getenv("UPLOAD_DIR", "SOURCE")
+        self.prefix_dir = prefix_dir
+        self.bucket_name = bucket_name
+        # noinspection PyUnresolvedReferences
+        self.bucket: boto3.resources.factory.s3.Bucket = None
+        self.results = UploadResults()
+        self.start = time.time()
+
+    def init(self) -> None:
+        """Instantiates the bucket instance.
+
+        Raises:
+            ValueError: If no bucket name was passed.
+            BucketNotFound: If bucket name was not found.
+        """
+        self.start = time.time()
+        if self.prefix_dir and self.prefix_dir not in self.upload_dir.split(os.sep):
+            raise ValueError(
+                f"\n\n\tPrefix folder name {self.prefix_dir!r} is not a part of upload directory {self.upload_dir!r}"
+            )
+        if not self.upload_dir:
+            raise ValueError("\n\n\tCannot proceed without an upload directory.")
+        try:
+            assert os.path.exists(self.upload_dir)
+        except AssertionError:
+            raise ValueError(f"\n\n\tPath not found: {self.upload_dir}")
+        buckets = [bucket.name for bucket in self.s3.buckets.all()]
+        if not self.bucket_name:
+            raise ValueError(f"\n\n\tCannot proceed without a bucket name.\n\tAvailable: {buckets}")
+        _account_id, _alias = self.session.resource(service_name="iam").CurrentUser().arn.split("/")
+        if self.bucket_name not in buckets:
+            raise BucketNotFound(f"\n\n\t{self.bucket_name} was not found in {_alias} account.\n\tAvailable: {buckets}")
+        self.upload_dir = os.path.abspath(self.upload_dir)
+        self.logger.info("Bucket objects from '%s' will be uploaded to '%s'", self.upload_dir, self.bucket_name)
+        # noinspection PyUnresolvedReferences
+        self.bucket: boto3.resources.factory.s3.Bucket = self.s3.Bucket(self.bucket_name)
+
+    def exit(self) -> None:
+        """Exits after printing results, and run time."""
+        total = self.results.success + self.results.failed
+        self.logger.info(
+            "Total number of uploads: %d, success: %d, failed: %d", total, self.results.success, self.results.failed
+        )
+        self.logger.info("Run Time: %.2fs", time.time() - self.start)
+
+    def _uploader(self, objectpath: str, filepath: str) -> None:
+        """Uploads the filepath to the specified S3 bucket.
+
+        Args:
+            objectpath: Object path ref in S3.
+            filepath: Filepath to upload.
+        """
+        self.bucket.upload_file(filepath, objectpath)
+
+    def _get_files(self) -> Dict[str, str]:
+        """Get a mapping for all the file path and object paths in upload directory.
+
+        Returns:
+            Dict[str, str]:
+            Returns a dictionary object path and filepath.
+        """
+        files_to_upload = {}
+        for __path, __directory, __files in os.walk(self.upload_dir):
+            for file_ in __files:
+                file_path = os.path.join(__path, file_)
+                if self.prefix_dir:
+                    try:
+                        object_path = get_object_path(file_path, self.prefix_dir)
+                    except ValueError as error:
+                        self.logger.error(error)
+                        continue
+                else:
+                    object_path = self.prefix_dir
+                files_to_upload[object_path] = file_path
+        return files_to_upload
+
+    def run(self) -> None:
+        """Initiates object upload in a traditional loop."""
+        self.init()
+        keys = self._get_files()
+        self.logger.debug(keys)
+        self.logger.info("Initiating upload process.")
+        for objectpath, filepath in tqdm(
+            keys.items(), total=len(keys), unit="file", leave=True, desc=f"Uploading files from {self.upload_dir}"
+        ):
+            try:
+                self._uploader(objectpath=objectpath, filepath=filepath)
+                self.results.success += 1
+            except ClientError as error:
+                self.logger.error(error)
+                self.results.failed += 1
+        self.exit()
+
+    def run_in_parallel(self, max_workers: int = 5) -> None:
+        """Initiates upload in multi-threading.
+
+        Args:
+            max_workers: Number of maximum threads to use.
+        """
+        self.init()
+        self.logger.info(f"Number of threads: {max_workers}")
+        keys = self._get_files()
+        self.logger.info("Initiating upload process.")
+        with ThreadPoolExecutor(max_workers=max_workers) as executor:
+            futures = [executor.submit(self._uploader, *kv) for kv in keys.items()]
+            for future in tqdm(
+                iterable=as_completed(futures),
+                total=len(futures),
+                desc=f"Uploading files to {self.bucket_name}",
+                unit="files",
+                leave=True,
+            ):
+                try:
+                    future.result()
+                    self.results.success += 1
+                except ClientError as error:
+                    self.logger.error(f"Upload failed: {error}")
+                    self.results.failed += 1
+        self.exit()
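
Beyond the README examples, `run_in_parallel` exposes the `max_workers` argument shown above (default 5). A small sketch with placeholder values:

```python
import s3

uploader = s3.Uploader(
    bucket_name="BUCKET_NAME",
    upload_dir="FULL_PATH_TO_UPLOAD",
    prefix_dir="START_DIRECTORY_IN_S3",
)
# Uploads run in a ThreadPoolExecutor; 10 threads is an arbitrary example value.
uploader.run_in_parallel(max_workers=10)
```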

s3/utils.py ADDED

@@ -0,0 +1,45 @@
+import os
+
+
+class UploadResults(dict):
+    """Object to store results of S3 upload.
+
+    >>> UploadResults
+
+    """
+
+    success: int = 0
+    failed: int = 0
+
+
+def getenv(*args, default: str = None) -> str:
+    """Returns the key-ed environment variable or the default value."""
+    for key in args:
+        if value := os.environ.get(key.upper()) or os.environ.get(key.lower()):
+            return value
+    return default
+
+
+def get_object_path(filepath: str, start_folder_name: str):
+    """Construct object path without absolute path's pretext.
+
+    Args:
+        filepath: Absolute file path to upload.
+        start_folder_name: Folder name to begin object path.
+
+    Returns:
+        str:
+        Returns the object name.
+    """
+    # Split file_path into parts
+    parts = filepath.split(os.sep)
+    try:
+        # Find index of the folder to start from
+        start_index = parts.index(start_folder_name)
+    except ValueError:
+        # Folder not found in path, fallback to full path or raise error
+        raise ValueError(f"Folder '{start_folder_name}' not found in path '{filepath}'")
+    # Reconstruct path from start_folder_name onwards
+    relative_parts = parts[start_index:]
+    # Join with os.sep for system-appropriate separators
+    return os.sep.join(relative_parts)
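
To illustrate how the prefix-based object keys are built, a sketch of `get_object_path` with a hypothetical path (on POSIX systems `os.sep` is `/`):

```python
import os

from s3.utils import get_object_path

# Hypothetical absolute file path; "photos" plays the role of prefix_dir.
filepath = os.sep.join(["", "home", "user", "backup", "photos", "2025", "trip.jpg"])

# Everything before the prefix folder is dropped from the object key:
# prints "photos/2025/trip.jpg" on POSIX systems.
print(get_object_path(filepath, "photos"))

# A ValueError is raised if the prefix folder is not part of the path.
```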

pys3uploader-0.0.0a0.dist-info/METADATA DELETED

@@ -1,49 +0,0 @@
-Metadata-Version: 2.2
-Name: PyS3Uploader
-Version: 0.0.0a0
-Summary: Python module to upload objects to an S3 bucket.
-Author-email: Vignesh Rao <svignesh1793@gmail.com>
-License: MIT License
-
-Copyright (c) 2025 Vignesh Rao
-
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
-
-Project-URL: Homepage, https://github.com/thevickypedia/s3-uploader
-Project-URL: Docs, https://thevickypedia.github.io/s3-uploader/
-Project-URL: Source, https://github.com/thevickypedia/s3-uploader
-Project-URL: Bug Tracker, https://github.com/thevickypedia/s3-uploader/issues
-Keywords: s3
-Classifier: Development Status :: 1 - Planning
-Classifier: Intended Audience :: Information Technology
-Classifier: Operating System :: OS Independent
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Topic :: Internet :: File Transfer Protocol (FTP)
-Requires-Python: >=3.8
-Description-Content-Type: text/markdown
-License-File: LICENSE
-Requires-Dist: boto3
-Requires-Dist: tqdm
-Provides-Extra: dev
-Requires-Dist: sphinx==5.1.1; extra == "dev"
-Requires-Dist: pre-commit; extra == "dev"
-Requires-Dist: recommonmark; extra == "dev"
-
-# s3-uploader
-Upload objects to S3

pys3uploader-0.0.0a0.dist-info/RECORD DELETED

@@ -1,6 +0,0 @@
-s3/__init__.py,sha256=wq6ADf1YC8sccYtXrSjPGhw7W-AHqruT82iKIaMcttM,20
-pys3uploader-0.0.0a0.dist-info/LICENSE,sha256=8k-hEraOzyum0GvmmK65YxNRTFXK7eIFHJ0OshJXeTk,1068
-pys3uploader-0.0.0a0.dist-info/METADATA,sha256=nWmw-uD4ok-IyS1eXHgjINPF6DKO3cJ4xSqgGzg_f_8,2277
-pys3uploader-0.0.0a0.dist-info/WHEEL,sha256=beeZ86-EfXScwlR_HKu4SllMC9wUEj_8Z_4FJ3egI2w,91
-pys3uploader-0.0.0a0.dist-info/top_level.txt,sha256=iQp4y1P58Q633gj8M08kHE4mqqT0hixuDWcniDk_RJ4,3
-pys3uploader-0.0.0a0.dist-info/RECORD,,

{pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/LICENSE: file without changes
{pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/WHEEL: file without changes
{pys3uploader-0.0.0a0.dist-info → pys3uploader-0.1.0.dist-info}/top_level.txt: file without changes