miraj 0.1.0a1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- miraj-0.1.0a1/.gitignore +13 -0
- miraj-0.1.0a1/.rosetta/state.json +5 -0
- miraj-0.1.0a1/PKG-INFO +331 -0
- miraj-0.1.0a1/README.md +303 -0
- miraj-0.1.0a1/Technical-Spec.md +619 -0
- miraj-0.1.0a1/examples/config.py +63 -0
- miraj-0.1.0a1/examples/native_gcp_test.py +25 -0
- miraj-0.1.0a1/examples/test_gcp_list_buckets.py +17 -0
- miraj-0.1.0a1/examples/test_gcp_object_read.py +153 -0
- miraj-0.1.0a1/examples/test_gcp_put_object.py +18 -0
- miraj-0.1.0a1/examples/test_s3_create_bucket.py +17 -0
- miraj-0.1.0a1/examples/test_s3_get_object.py +21 -0
- miraj-0.1.0a1/examples/test_s3_put_object.py +18 -0
- miraj-0.1.0a1/pyproject.toml +69 -0
- miraj-0.1.0a1/pyrightconfig.json +0 -0
- miraj-0.1.0a1/src/miraj/__init__.py +23 -0
- miraj-0.1.0a1/src/miraj/cli.py +182 -0
- miraj-0.1.0a1/src/miraj/context.py +128 -0
- miraj-0.1.0a1/src/miraj/exceptions.py +157 -0
- miraj-0.1.0a1/src/miraj/models.py +59 -0
- miraj-0.1.0a1/src/miraj/providers/__init__.py +1 -0
- miraj-0.1.0a1/src/miraj/providers/aws.py +1046 -0
- miraj-0.1.0a1/src/miraj/providers/gcp.py +824 -0
- miraj-0.1.0a1/src/miraj/storage.py +434 -0
- miraj-0.1.0a1/tests/test_storage.py +28 -0
miraj-0.1.0a1/.gitignore
ADDED
miraj-0.1.0a1/PKG-INFO
ADDED
@@ -0,0 +1,331 @@
+Metadata-Version: 2.4
+Name: miraj
+Version: 0.1.0a1
+Summary: Multi-cloud operational abstraction SDK
+Project-URL: Homepage, https://github.com/<your-github-username>/miraj
+Project-URL: Repository, https://github.com/<your-github-username>/miraj
+Project-URL: Issues, https://github.com/<your-github-username>/miraj/issues
+Author: Abul Hasan
+License-Expression: MIT
+Keywords: aws,cloud,gcp,multicloud,sdk,storage
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Requires-Python: >=3.11
+Requires-Dist: boto3>=1.34.0
+Requires-Dist: botocore>=1.34.0
+Requires-Dist: google-cloud-storage>=2.16.0
+Requires-Dist: pyyaml>=6.0.0
+Provides-Extra: dev
+Requires-Dist: pyright>=1.1.409; extra == 'dev'
+Requires-Dist: pytest>=9.0.0; extra == 'dev'
+Requires-Dist: ruff>=0.15.0; extra == 'dev'
+Description-Content-Type: text/markdown
+
+# Rosetta
+
+Rosetta is a Python SDK for multi-cloud object storage operations. It provides a single operational interface for AWS S3 and Google Cloud Storage, while keeping provider-specific SDK details, authentication behavior, and exception handling inside provider adapters.
+
+Rosetta is intentionally narrow in scope. The MVP focuses on object storage because it is the smallest useful surface for proving cross-cloud operational abstraction without drifting into orchestration, provisioning, or control-plane complexity.
+
+## Project overview
+
+Modern platform teams usually end up writing separate automation for each cloud provider. AWS scripts use boto3 and AWS-native patterns; GCP scripts use google-cloud-storage and GCP-native patterns. That duplication creates repeated auth setup, repeated error handling, repeated logging, and repeated operational logic.
+
+Rosetta addresses that problem by standardizing storage operations behind one SDK interface. Application code calls Rosetta once, and Rosetta routes the operation to the correct provider adapter internally.
+
+The design goal is operational simplicity, not fake uniformity. Rosetta standardizes the workflow, but it does not pretend AWS and GCP are identical.
+
+## Problem statement
+
+Cloud teams commonly need the same operational actions across providers:
+
+- upload an object
+- download an object
+- delete an object
+- list objects
+- check whether an object exists
+- generate a signed URL
+- create or delete a bucket
+
+The problem is not the lack of cloud SDKs. The problem is that every provider implements these actions with different clients, auth chains, exceptions, and edge cases. That makes customer automation harder to maintain and harder to standardize.
+
+Rosetta exists to remove that friction by providing one operational contract for the common object-storage workflow.
+
+## Architectural philosophy
+
+Rosetta is built around these rules:
+
+- abstract operations, not resource graphs
+- keep provider-specific logic inside adapters
+- normalize exceptions into stable Rosetta exceptions
+- return typed models instead of raw provider dictionaries
+- keep the public API small and explicit
+- prefer deterministic behavior over guessing
+- keep the SDK sync-first for operational scripting
+- preserve the ability to extend into other services later without redesigning the core
+
+The important boundary is this: application code should not know whether the backend is boto3 or google-cloud-storage. It should only know how to call Rosetta.
+
+## Supported providers
+
+The MVP supports:
+
+- AWS S3
+- Google Cloud Storage
+
+Provider support is implemented through adapter classes under `src/miraj/providers/`.
+
+## Supported operations
+
+The current object-storage API supports:
+
+- `put_object`
+- `upload_file` (compatibility wrapper)
+- `get_object`
+- `download_file` (compatibility wrapper)
+- `delete_object`
+- `delete_file` (compatibility wrapper)
+- `object_exists`
+- `list_objects`
+- `copy_object`
+- `move_object`
+- `bulk_delete_objects`
+- `create_bucket`
+- `delete_bucket`
+- `list_buckets`
+- `generate_signed_url`
+
+For signed URLs, the MVP supports `GET` and `PUT`.
+
+## Installation
+
+Rosetta is a normal Python package. The project is built with `hatchling` and managed locally with `uv`.
+
+Install editable mode during development:
+
+```bash
+uv pip install -e .
+```
+
+Install development dependencies:
+
+```bash
+uv pip install -e ".[dev]"
+```
+
+## Configuration
+
+Rosetta separates provider state from provider runtime configuration.
+
+### Provider state
+
+Provider selection is managed by Rosetta itself. The current active provider is stored in `.miraj/state.json` and is controlled through the CLI.
+
+Example:
+
+```bash
+miraj set provider --gcp
+miraj list providers
+```
+
+### Provider runtime configuration
+
+Provider-specific runtime settings live in `examples/config.py` during the MVP. That file holds the configuration values required by each provider.
+
+Typical values are:
+
+- GCP: `credentials_path`, `project`
+- AWS: `region`, optional static credentials, or environment-based auth
+
+The configuration file is not the provider selector. It only stores provider settings.
+
+## Provider-state workflow
+
+Rosetta uses a simple context model:
+
+1. the user sets the active provider through the CLI
+2. Rosetta stores that selection in `.miraj/state.json`
+3. application code calls `Storage()` or `Storage(config=...)`
+4. Rosetta resolves the active provider from state
+5. Rosetta initializes the correct provider adapter
+6. the adapter executes the operation using the provider-native SDK
+
+This allows the same application script to run against AWS or GCP without rewriting the script itself.
+
+The active provider can be shown with:
+
+```bash
+miraj list providers
+```
+
+The current provider is marked with `*`.
+
+## Quickstart examples
+
+### 1. Set the active provider
+
+```bash
+miraj set provider --gcp
+```
+
+or
+
+```bash
+miraj set provider --aws
+```
+
+### 2. Verify the selected provider
+
+```bash
+miraj list providers
+```
+
+### 3. Use Rosetta in application code
+
+```python
+from pathlib import Path
+
+from miraj import Storage
+from config import STORAGE_CONFIG
+
+storage = Storage(config=STORAGE_CONFIG)
+content = storage.get_object(
+    bucket="wmg-invoice-data",
+    key="wmg invoices/WMG Invoice 2024-09-01 — 2024-09-30.csv",
+)
+Path("downloaded-object.csv").write_bytes(content)
+```
+
+### 4. Upload an object
+
+```python
+from miraj import Storage
+from config import STORAGE_CONFIG
+
+storage = Storage(config=STORAGE_CONFIG)
+storage.put_object(
+    bucket="wmg-invoice-data",
+    key="hello.txt",
+    data="hello from miraj",
+)
+```
+
+## Testing instructions
+
+The MVP was validated through direct example scripts rather than a large test harness.
+
+Recommended checks:
+
+- `miraj list providers`
+- GCP object read flow
+- GCP object write flow
+- AWS object read flow
+- AWS bucket creation flow
+- object listing flow
+- signed URL generation
+
+Run the example scripts from the `examples/` directory after selecting the target provider through the CLI.
+
+Example:
+
+```bash
+miraj set provider --gcp
+python examples/test_gcp_test_object_storage.py
+python examples/test_gcp_put_object.py
+```
+
+Example AWS flow:
+
+```bash
+miraj set provider --aws
+python examples/test_s3_get_object.py
+python examples/test_s3_put_object.py
+python examples/test_s3_create_bucket.py
+```
+
+## Validation summary
+
+The MVP proved the following:
+
+- Rosetta can route calls through one SDK interface
+- provider state can be set and listed through the CLI
+- GCP object read and list operations work through Rosetta
+- GCP upload calls are normalized through Rosetta exceptions when permissions are missing
+- AWS provider initialization and request routing are wired through Rosetta
+- AWS bucket creation and object operations follow the Rosetta interface
+- provider-specific errors are surfaced as Rosetta exceptions rather than raw provider exceptions
+
+Live validation depends on the customer’s own cloud credentials, IAM permissions, bucket existence, and region/project settings.
+
+## Known limitations
+
+This MVP does not attempt to solve everything that cloud storage can do.
+
+Current limitations include:
+
+- only AWS S3 and Google Cloud Storage are supported
+- only object storage is implemented
+- no IAM abstraction
+- no networking abstraction
+- no provisioning or reconciliation layer
+- no workflow orchestration
+- no async public API yet
+- no automatic provider guessing from resource names
+- no deep control-plane abstraction
+- no support for all advanced provider-specific features through the public API
+
+Some capabilities remain intentionally inside the native provider escape hatches rather than the Rosetta public surface.
+
+## Future extensibility overview
+
+Rosetta was designed so that additional services can be added without replacing the core architecture.
+
+The extension pattern is:
+
+1. define a public service facade
+2. define a provider contract for that service
+3. implement AWS and GCP adapters for the new service
+4. normalize exceptions and return models
+5. add contract tests
+6. add example scripts
+
+This approach is meant to make future services like EC2 / Compute Engine, SQS / Pub/Sub, or CloudWatch Logs / Cloud Logging feasible without changing how application code is written.
+
+The important rule is to keep the public API operational and explicit. New services should follow the same pattern as storage: stable facade, provider adapters, normalized errors, typed results.
+
+## Maintenance ownership boundary
+
+This MVP was delivered as the foundation for the customer’s internal use.
+
+After handoff:
+
+- the customer owns ongoing maintenance
+- the customer owns cloud credentials and IAM permissions
+- the customer owns environment setup for AWS and GCP
+- the customer owns future service expansion unless separately agreed otherwise
+- the customer owns changes to operational policy, bucket access, and runtime configuration
+
+The intent of the handoff is to provide a working, understandable starting point that the customer can continue to extend.
+
+## Repository layout
+
+Key files in the MVP:
+
+- `src/miraj/storage.py` — public SDK facade
+- `src/miraj/context.py` — active provider state storage and resolution
+- `src/miraj/cli.py` — CLI for provider selection and listing
+- `src/miraj/exceptions.py` — normalized exception hierarchy
+- `src/miraj/models.py` — typed return models
+- `src/miraj/providers/aws.py` — AWS S3 adapter
+- `src/miraj/providers/gcp.py` — GCS adapter
+- `examples/config.py` — provider runtime config used by example scripts
+- `examples/` — customer-facing usage examples
+
+## Notes for the customer
+
+Rosetta is intentionally small at the MVP stage. The value is not in breadth of services. The value is in proving that one operational interface can drive multiple clouds cleanly, with provider-specific behavior isolated behind adapters and a stable SDK boundary.
miraj-0.1.0a1/README.md
ADDED
@@ -0,0 +1,303 @@
+# Rosetta
+
+Rosetta is a Python SDK for multi-cloud object storage operations. It provides a single operational interface for AWS S3 and Google Cloud Storage, while keeping provider-specific SDK details, authentication behavior, and exception handling inside provider adapters.
+
+Rosetta is intentionally narrow in scope. The MVP focuses on object storage because it is the smallest useful surface for proving cross-cloud operational abstraction without drifting into orchestration, provisioning, or control-plane complexity.
+
+## Project overview
+
+Modern platform teams usually end up writing separate automation for each cloud provider. AWS scripts use boto3 and AWS-native patterns; GCP scripts use google-cloud-storage and GCP-native patterns. That duplication creates repeated auth setup, repeated error handling, repeated logging, and repeated operational logic.
+
+Rosetta addresses that problem by standardizing storage operations behind one SDK interface. Application code calls Rosetta once, and Rosetta routes the operation to the correct provider adapter internally.
+
+The design goal is operational simplicity, not fake uniformity. Rosetta standardizes the workflow, but it does not pretend AWS and GCP are identical.
+
+## Problem statement
+
+Cloud teams commonly need the same operational actions across providers:
+
+- upload an object
+- download an object
+- delete an object
+- list objects
+- check whether an object exists
+- generate a signed URL
+- create or delete a bucket
+
+The problem is not the lack of cloud SDKs. The problem is that every provider implements these actions with different clients, auth chains, exceptions, and edge cases. That makes customer automation harder to maintain and harder to standardize.
+
+Rosetta exists to remove that friction by providing one operational contract for the common object-storage workflow.
+
+## Architectural philosophy
+
+Rosetta is built around these rules:
+
+- abstract operations, not resource graphs
+- keep provider-specific logic inside adapters
+- normalize exceptions into stable Rosetta exceptions
+- return typed models instead of raw provider dictionaries
+- keep the public API small and explicit
+- prefer deterministic behavior over guessing
+- keep the SDK sync-first for operational scripting
+- preserve the ability to extend into other services later without redesigning the core
+
+The important boundary is this: application code should not know whether the backend is boto3 or google-cloud-storage. It should only know how to call Rosetta.
+
+## Supported providers
+
+The MVP supports:
+
+- AWS S3
+- Google Cloud Storage
+
+Provider support is implemented through adapter classes under `src/miraj/providers/`.
+
+## Supported operations
+
+The current object-storage API supports:
+
+- `put_object`
+- `upload_file` (compatibility wrapper)
+- `get_object`
+- `download_file` (compatibility wrapper)
+- `delete_object`
+- `delete_file` (compatibility wrapper)
+- `object_exists`
+- `list_objects`
+- `copy_object`
+- `move_object`
+- `bulk_delete_objects`
+- `create_bucket`
+- `delete_bucket`
+- `list_buckets`
+- `generate_signed_url`
+
+For signed URLs, the MVP supports `GET` and `PUT`.
+
+## Installation
+
+Rosetta is a normal Python package. The project is built with `hatchling` and managed locally with `uv`.
+
+Install editable mode during development:
+
+```bash
+uv pip install -e .
+```
+
+Install development dependencies:
+
+```bash
+uv pip install -e ".[dev]"
+```
+
+## Configuration
+
+Rosetta separates provider state from provider runtime configuration.
+
+### Provider state
+
+Provider selection is managed by Rosetta itself. The current active provider is stored in `.miraj/state.json` and is controlled through the CLI.
+
+Example:
+
+```bash
+miraj set provider --gcp
+miraj list providers
+```
+
+### Provider runtime configuration
+
+Provider-specific runtime settings live in `examples/config.py` during the MVP. That file holds the configuration values required by each provider.
+
+Typical values are:
+
+- GCP: `credentials_path`, `project`
+- AWS: `region`, optional static credentials, or environment-based auth
+
+The configuration file is not the provider selector. It only stores provider settings.
+
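Reviewer's sketch (not part of the released package): the body of `examples/config.py` is not shown in this part of the diff, so the layout below is an assumption. It only illustrates how the "typical values" listed in the README could be grouped per provider while keeping provider selection out of the file. The nesting by provider name and the `settings_for` helper are hypothetical; the environment variable names are the standard ones read by the Google and AWS tooling.

```python
import os

# Hypothetical shape; the real examples/config.py may use different keys.
STORAGE_CONFIG = {
    "gcp": {
        "credentials_path": os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"),
        "project": os.environ.get("GOOGLE_CLOUD_PROJECT"),
    },
    "aws": {
        # No static credentials here: this variant relies on environment-based auth.
        "region": os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
    },
}


def settings_for(provider: str, config: dict) -> dict:
    """Return the settings block for one provider, failing loudly if it is absent."""
    try:
        return config[provider]
    except KeyError:
        raise ValueError(f"no settings for provider {provider!r}") from None
```

Note that nothing in this file says which provider is active; per the README, that choice lives in the state file and is changed through the CLI.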
+## Provider-state workflow
+
+Rosetta uses a simple context model:
+
+1. the user sets the active provider through the CLI
+2. Rosetta stores that selection in `.miraj/state.json`
+3. application code calls `Storage()` or `Storage(config=...)`
+4. Rosetta resolves the active provider from state
+5. Rosetta initializes the correct provider adapter
+6. the adapter executes the operation using the provider-native SDK
+
+This allows the same application script to run against AWS or GCP without rewriting the script itself.
+
+The active provider can be shown with:
+
+```bash
+miraj list providers
+```
+
+The current provider is marked with `*`.
+
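Reviewer's sketch (not part of the released package): steps 1, 2, and 4 of the workflow above, plus the `*` marker in the provider listing, can be approximated with a small JSON state file. The `{"provider": ...}` schema and the function names are assumptions made for illustration; the package's actual logic is in `src/miraj/context.py` and `src/miraj/cli.py` and may differ.

```python
import json
import tempfile
from pathlib import Path

SUPPORTED = ("aws", "gcp")


def set_provider(state_file: Path, provider: str) -> None:
    """Persist the active provider (workflow steps 1 and 2)."""
    if provider not in SUPPORTED:
        raise ValueError(f"unsupported provider: {provider!r}")
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps({"provider": provider}), encoding="utf-8")


def resolve_provider(state_file: Path) -> str:
    """Read the active provider back from state (workflow step 4)."""
    if not state_file.exists():
        raise RuntimeError("no active provider set")
    return json.loads(state_file.read_text(encoding="utf-8"))["provider"]


def list_providers(state_file: Path) -> list[str]:
    """Render the supported providers, marking the active one with '*'."""
    active = resolve_provider(state_file) if state_file.exists() else None
    return [f"* {name}" if name == active else f"  {name}" for name in SUPPORTED]


if __name__ == "__main__":
    # Use a throwaway directory so the sketch never touches a real state file.
    with tempfile.TemporaryDirectory() as tmp:
        state = Path(tmp) / ".miraj" / "state.json"
        set_provider(state, "gcp")
        print("\n".join(list_providers(state)))
```

Failing with an explicit error when no provider has been set, rather than defaulting to one, matches the README's stated preference for deterministic behavior over guessing.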
+## Quickstart examples
+
+### 1. Set the active provider
+
+```bash
+miraj set provider --gcp
+```
+
+or
+
+```bash
+miraj set provider --aws
+```
+
+### 2. Verify the selected provider
+
+```bash
+miraj list providers
+```
+
+### 3. Use Rosetta in application code
+
+```python
+from pathlib import Path
+
+from miraj import Storage
+from config import STORAGE_CONFIG
+
+storage = Storage(config=STORAGE_CONFIG)
+content = storage.get_object(
+    bucket="wmg-invoice-data",
+    key="wmg invoices/WMG Invoice 2024-09-01 — 2024-09-30.csv",
+)
+Path("downloaded-object.csv").write_bytes(content)
+```
+
+### 4. Upload an object
+
+```python
+from miraj import Storage
+from config import STORAGE_CONFIG
+
+storage = Storage(config=STORAGE_CONFIG)
+storage.put_object(
+    bucket="wmg-invoice-data",
+    key="hello.txt",
+    data="hello from miraj",
+)
+```
+
+## Testing instructions
+
+The MVP was validated through direct example scripts rather than a large test harness.
+
+Recommended checks:
+
+- `miraj list providers`
+- GCP object read flow
+- GCP object write flow
+- AWS object read flow
+- AWS bucket creation flow
+- object listing flow
+- signed URL generation
+
+Run the example scripts from the `examples/` directory after selecting the target provider through the CLI.
+
+Example:
+
+```bash
+miraj set provider --gcp
+python examples/test_gcp_test_object_storage.py
+python examples/test_gcp_put_object.py
+```
+
+Example AWS flow:
+
+```bash
+miraj set provider --aws
+python examples/test_s3_get_object.py
+python examples/test_s3_put_object.py
+python examples/test_s3_create_bucket.py
+```
+
+## Validation summary
+
+The MVP proved the following:
+
+- Rosetta can route calls through one SDK interface
+- provider state can be set and listed through the CLI
+- GCP object read and list operations work through Rosetta
+- GCP upload calls are normalized through Rosetta exceptions when permissions are missing
+- AWS provider initialization and request routing are wired through Rosetta
+- AWS bucket creation and object operations follow the Rosetta interface
+- provider-specific errors are surfaced as Rosetta exceptions rather than raw provider exceptions
+
+Live validation depends on the customer’s own cloud credentials, IAM permissions, bucket existence, and region/project settings.
+
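Reviewer's sketch (not part of the released package): the validation summary says provider errors surface as Rosetta exceptions rather than raw provider exceptions. One common way to do that is to wrap each provider call in a translating context manager, as below. All class names here are hypothetical, and `FakeProviderError` stands in for a real SDK exception so the example runs without boto3 or google-cloud-storage; the package's actual hierarchy is in `src/miraj/exceptions.py`.

```python
from contextlib import contextmanager


class StorageError(Exception):
    """Hypothetical base class for all normalized errors."""


class PermissionDeniedError(StorageError):
    """Hypothetical: the caller lacks permission for the operation."""


class ObjectNotFoundError(StorageError):
    """Hypothetical: the bucket or key does not exist."""


class FakeProviderError(Exception):
    """Stand-in for a provider SDK exception carrying an HTTP-style status."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status


_STATUS_MAP = {403: PermissionDeniedError, 404: ObjectNotFoundError}


@contextmanager
def normalized_errors():
    """Translate provider errors into the stable hierarchy, keeping the cause."""
    try:
        yield
    except FakeProviderError as exc:
        # Unknown statuses fall back to the base class instead of leaking through.
        error_type = _STATUS_MAP.get(exc.status, StorageError)
        raise error_type(str(exc)) from exc
```

An adapter method would wrap its SDK call in `with normalized_errors():`, so callers can catch `StorageError` without importing either provider SDK. Chaining with `from exc` keeps the original provider exception available as `__cause__` for debugging.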
+## Known limitations
+
+This MVP does not attempt to solve everything that cloud storage can do.
+
+Current limitations include:
+
+- only AWS S3 and Google Cloud Storage are supported
+- only object storage is implemented
+- no IAM abstraction
+- no networking abstraction
+- no provisioning or reconciliation layer
+- no workflow orchestration
+- no async public API yet
+- no automatic provider guessing from resource names
+- no deep control-plane abstraction
+- no support for all advanced provider-specific features through the public API
+
+Some capabilities remain intentionally inside the native provider escape hatches rather than the Rosetta public surface.
+
+## Future extensibility overview
+
+Rosetta was designed so that additional services can be added without replacing the core architecture.
+
+The extension pattern is:
+
+1. define a public service facade
+2. define a provider contract for that service
+3. implement AWS and GCP adapters for the new service
+4. normalize exceptions and return models
+5. add contract tests
+6. add example scripts
+
+This approach is meant to make future services like EC2 / Compute Engine, SQS / Pub/Sub, or CloudWatch Logs / Cloud Logging feasible without changing how application code is written.
+
+The important rule is to keep the public API operational and explicit. New services should follow the same pattern as storage: stable facade, provider adapters, normalized errors, typed results.
+
+## Maintenance ownership boundary
+
+This MVP was delivered as the foundation for the customer’s internal use.
+
+After handoff:
+
+- the customer owns ongoing maintenance
+- the customer owns cloud credentials and IAM permissions
+- the customer owns environment setup for AWS and GCP
+- the customer owns future service expansion unless separately agreed otherwise
+- the customer owns changes to operational policy, bucket access, and runtime configuration
+
+The intent of the handoff is to provide a working, understandable starting point that the customer can continue to extend.
+
+## Repository layout
+
+Key files in the MVP:
+
+- `src/miraj/storage.py` — public SDK facade
+- `src/miraj/context.py` — active provider state storage and resolution
+- `src/miraj/cli.py` — CLI for provider selection and listing
+- `src/miraj/exceptions.py` — normalized exception hierarchy
+- `src/miraj/models.py` — typed return models
+- `src/miraj/providers/aws.py` — AWS S3 adapter
+- `src/miraj/providers/gcp.py` — GCS adapter
+- `examples/config.py` — provider runtime config used by example scripts
+- `examples/` — customer-facing usage examples
+
+## Notes for the customer
+
+Rosetta is intentionally small at the MVP stage. The value is not in breadth of services. The value is in proving that one operational interface can drive multiple clouds cleanly, with provider-specific behavior isolated behind adapters and a stable SDK boundary.