opendal 0.1.6.pre.rc.1-arm64-darwin-23
- checksums.yaml +7 -0
- data/.standard.yml +20 -0
- data/.tool-versions +1 -0
- data/.yardopts +1 -0
- data/DEPENDENCIES.md +9 -0
- data/DEPENDENCIES.rust.tsv +277 -0
- data/Gemfile +35 -0
- data/README.md +159 -0
- data/Rakefile +149 -0
- data/core/CHANGELOG.md +4929 -0
- data/core/CONTRIBUTING.md +61 -0
- data/core/DEPENDENCIES.md +3 -0
- data/core/DEPENDENCIES.rust.tsv +185 -0
- data/core/LICENSE +201 -0
- data/core/README.md +228 -0
- data/core/benches/README.md +18 -0
- data/core/benches/ops/README.md +26 -0
- data/core/benches/types/README.md +9 -0
- data/core/benches/vs_fs/README.md +35 -0
- data/core/benches/vs_s3/README.md +55 -0
- data/core/edge/README.md +3 -0
- data/core/edge/file_write_on_full_disk/README.md +14 -0
- data/core/edge/s3_aws_assume_role_with_web_identity/README.md +18 -0
- data/core/edge/s3_read_on_wasm/.gitignore +3 -0
- data/core/edge/s3_read_on_wasm/README.md +42 -0
- data/core/edge/s3_read_on_wasm/webdriver.json +15 -0
- data/core/examples/README.md +23 -0
- data/core/examples/basic/README.md +15 -0
- data/core/examples/concurrent-upload/README.md +15 -0
- data/core/examples/multipart-upload/README.md +15 -0
- data/core/fuzz/.gitignore +5 -0
- data/core/fuzz/README.md +68 -0
- data/core/src/docs/comparisons/vs_object_store.md +183 -0
- data/core/src/docs/performance/concurrent_write.md +101 -0
- data/core/src/docs/performance/http_optimization.md +124 -0
- data/core/src/docs/rfcs/0000_example.md +74 -0
- data/core/src/docs/rfcs/0000_foyer_integration.md +111 -0
- data/core/src/docs/rfcs/0041_object_native_api.md +185 -0
- data/core/src/docs/rfcs/0044_error_handle.md +198 -0
- data/core/src/docs/rfcs/0057_auto_region.md +160 -0
- data/core/src/docs/rfcs/0069_object_stream.md +145 -0
- data/core/src/docs/rfcs/0090_limited_reader.md +155 -0
- data/core/src/docs/rfcs/0112_path_normalization.md +79 -0
- data/core/src/docs/rfcs/0191_async_streaming_io.md +328 -0
- data/core/src/docs/rfcs/0203_remove_credential.md +96 -0
- data/core/src/docs/rfcs/0221_create_dir.md +89 -0
- data/core/src/docs/rfcs/0247_retryable_error.md +87 -0
- data/core/src/docs/rfcs/0293_object_id.md +67 -0
- data/core/src/docs/rfcs/0337_dir_entry.md +191 -0
- data/core/src/docs/rfcs/0409_accessor_capabilities.md +67 -0
- data/core/src/docs/rfcs/0413_presign.md +154 -0
- data/core/src/docs/rfcs/0423_command_line_interface.md +268 -0
- data/core/src/docs/rfcs/0429_init_from_iter.md +107 -0
- data/core/src/docs/rfcs/0438_multipart.md +163 -0
- data/core/src/docs/rfcs/0443_gateway.md +73 -0
- data/core/src/docs/rfcs/0501_new_builder.md +111 -0
- data/core/src/docs/rfcs/0554_write_refactor.md +96 -0
- data/core/src/docs/rfcs/0561_list_metadata_reuse.md +210 -0
- data/core/src/docs/rfcs/0599_blocking_api.md +157 -0
- data/core/src/docs/rfcs/0623_redis_service.md +300 -0
- data/core/src/docs/rfcs/0627_split_capabilities.md +89 -0
- data/core/src/docs/rfcs/0661_path_in_accessor.md +126 -0
- data/core/src/docs/rfcs/0793_generic_kv_services.md +209 -0
- data/core/src/docs/rfcs/0926_object_reader.md +93 -0
- data/core/src/docs/rfcs/0977_refactor_error.md +151 -0
- data/core/src/docs/rfcs/1085_object_handler.md +73 -0
- data/core/src/docs/rfcs/1391_object_metadataer.md +110 -0
- data/core/src/docs/rfcs/1398_query_based_metadata.md +125 -0
- data/core/src/docs/rfcs/1420_object_writer.md +147 -0
- data/core/src/docs/rfcs/1477_remove_object_concept.md +159 -0
- data/core/src/docs/rfcs/1735_operation_extension.md +117 -0
- data/core/src/docs/rfcs/2083_writer_sink_api.md +106 -0
- data/core/src/docs/rfcs/2133_append_api.md +88 -0
- data/core/src/docs/rfcs/2299_chain_based_operator_api.md +99 -0
- data/core/src/docs/rfcs/2602_object_versioning.md +138 -0
- data/core/src/docs/rfcs/2758_merge_append_into_write.md +79 -0
- data/core/src/docs/rfcs/2774_lister_api.md +66 -0
- data/core/src/docs/rfcs/2779_list_with_metakey.md +143 -0
- data/core/src/docs/rfcs/2852_native_capability.md +58 -0
- data/core/src/docs/rfcs/2884_merge_range_read_into_read.md +80 -0
- data/core/src/docs/rfcs/3017_remove_write_copy_from.md +94 -0
- data/core/src/docs/rfcs/3197_config.md +237 -0
- data/core/src/docs/rfcs/3232_align_list_api.md +69 -0
- data/core/src/docs/rfcs/3243_list_prefix.md +128 -0
- data/core/src/docs/rfcs/3356_lazy_reader.md +111 -0
- data/core/src/docs/rfcs/3526_list_recursive.md +59 -0
- data/core/src/docs/rfcs/3574_concurrent_stat_in_list.md +80 -0
- data/core/src/docs/rfcs/3734_buffered_reader.md +64 -0
- data/core/src/docs/rfcs/3898_concurrent_writer.md +66 -0
- data/core/src/docs/rfcs/3911_deleter_api.md +165 -0
- data/core/src/docs/rfcs/4382_range_based_read.md +213 -0
- data/core/src/docs/rfcs/4638_executor.md +215 -0
- data/core/src/docs/rfcs/5314_remove_metakey.md +120 -0
- data/core/src/docs/rfcs/5444_operator_from_uri.md +162 -0
- data/core/src/docs/rfcs/5479_context.md +140 -0
- data/core/src/docs/rfcs/5485_conditional_reader.md +112 -0
- data/core/src/docs/rfcs/5495_list_with_deleted.md +81 -0
- data/core/src/docs/rfcs/5556_write_returns_metadata.md +121 -0
- data/core/src/docs/rfcs/5871_read_returns_metadata.md +112 -0
- data/core/src/docs/rfcs/6189_remove_native_blocking.md +106 -0
- data/core/src/docs/rfcs/6209_glob_support.md +132 -0
- data/core/src/docs/rfcs/6213_options_api.md +142 -0
- data/core/src/docs/rfcs/README.md +62 -0
- data/core/src/docs/upgrade.md +1556 -0
- data/core/src/services/aliyun_drive/docs.md +61 -0
- data/core/src/services/alluxio/docs.md +45 -0
- data/core/src/services/azblob/docs.md +77 -0
- data/core/src/services/azdls/docs.md +73 -0
- data/core/src/services/azfile/docs.md +65 -0
- data/core/src/services/b2/docs.md +54 -0
- data/core/src/services/cacache/docs.md +38 -0
- data/core/src/services/cloudflare_kv/docs.md +21 -0
- data/core/src/services/cos/docs.md +55 -0
- data/core/src/services/d1/docs.md +48 -0
- data/core/src/services/dashmap/docs.md +38 -0
- data/core/src/services/dbfs/docs.md +57 -0
- data/core/src/services/dropbox/docs.md +64 -0
- data/core/src/services/etcd/docs.md +45 -0
- data/core/src/services/foundationdb/docs.md +42 -0
- data/core/src/services/fs/docs.md +49 -0
- data/core/src/services/ftp/docs.md +42 -0
- data/core/src/services/gcs/docs.md +76 -0
- data/core/src/services/gdrive/docs.md +65 -0
- data/core/src/services/ghac/docs.md +84 -0
- data/core/src/services/github/docs.md +52 -0
- data/core/src/services/gridfs/docs.md +46 -0
- data/core/src/services/hdfs/docs.md +140 -0
- data/core/src/services/hdfs_native/docs.md +35 -0
- data/core/src/services/http/docs.md +45 -0
- data/core/src/services/huggingface/docs.md +61 -0
- data/core/src/services/ipfs/docs.md +45 -0
- data/core/src/services/ipmfs/docs.md +14 -0
- data/core/src/services/koofr/docs.md +51 -0
- data/core/src/services/lakefs/docs.md +62 -0
- data/core/src/services/memcached/docs.md +47 -0
- data/core/src/services/memory/docs.md +36 -0
- data/core/src/services/mini_moka/docs.md +19 -0
- data/core/src/services/moka/docs.md +42 -0
- data/core/src/services/mongodb/docs.md +49 -0
- data/core/src/services/monoiofs/docs.md +46 -0
- data/core/src/services/mysql/docs.md +47 -0
- data/core/src/services/obs/docs.md +54 -0
- data/core/src/services/onedrive/docs.md +115 -0
- data/core/src/services/opfs/docs.md +18 -0
- data/core/src/services/oss/docs.md +74 -0
- data/core/src/services/pcloud/docs.md +51 -0
- data/core/src/services/persy/docs.md +43 -0
- data/core/src/services/postgresql/docs.md +47 -0
- data/core/src/services/redb/docs.md +41 -0
- data/core/src/services/redis/docs.md +43 -0
- data/core/src/services/rocksdb/docs.md +54 -0
- data/core/src/services/s3/compatible_services.md +126 -0
- data/core/src/services/s3/docs.md +244 -0
- data/core/src/services/seafile/docs.md +54 -0
- data/core/src/services/sftp/docs.md +49 -0
- data/core/src/services/sled/docs.md +39 -0
- data/core/src/services/sqlite/docs.md +46 -0
- data/core/src/services/surrealdb/docs.md +54 -0
- data/core/src/services/swift/compatible_services.md +53 -0
- data/core/src/services/swift/docs.md +52 -0
- data/core/src/services/tikv/docs.md +43 -0
- data/core/src/services/upyun/docs.md +51 -0
- data/core/src/services/vercel_artifacts/docs.md +40 -0
- data/core/src/services/vercel_blob/docs.md +45 -0
- data/core/src/services/webdav/docs.md +49 -0
- data/core/src/services/webhdfs/docs.md +90 -0
- data/core/src/services/yandex_disk/docs.md +45 -0
- data/core/tests/behavior/README.md +77 -0
- data/core/tests/data/normal_dir/.gitkeep +0 -0
- data/core/tests/data/normal_file.txt +1041 -0
- data/core/tests/data/special_dir !@#$%^&()_+-=;',/.gitkeep +0 -0
- data/core/tests/data/special_file !@#$%^&()_+-=;',.txt +1041 -0
- data/core/users.md +13 -0
- data/extconf.rb +24 -0
- data/lib/opendal.rb +25 -0
- data/lib/opendal_ruby/entry.rb +35 -0
- data/lib/opendal_ruby/io.rb +70 -0
- data/lib/opendal_ruby/metadata.rb +44 -0
- data/lib/opendal_ruby/opendal_ruby.bundle +0 -0
- data/lib/opendal_ruby/operator.rb +29 -0
- data/lib/opendal_ruby/operator_info.rb +26 -0
- data/opendal.gemspec +91 -0
- data/test/blocking_op_test.rb +112 -0
- data/test/capability_test.rb +42 -0
- data/test/io_test.rb +172 -0
- data/test/lister_test.rb +77 -0
- data/test/metadata_test.rb +78 -0
- data/test/middlewares_test.rb +46 -0
- data/test/operator_info_test.rb +35 -0
- data/test/test_helper.rb +36 -0
- metadata +240 -0
|
@@ -0,0 +1,89 @@
|
|
|
1
|
+
- Proposal Name: `create-dir`
|
|
2
|
+
- Start Date: 2022-04-06
|
|
3
|
+
- RFC PR: [apache/opendal#221](https://github.com/apache/opendal/pull/221)
|
|
4
|
+
- Tracking Issue: [apache/opendal#222](https://github.com/apache/opendal/issues/222)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Add support for creating dirs in OpenDAL.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
Interoperability between OpenDAL services requires dir support. Object storage services simulate dirs with object keys that end with `/`. But we can't share the same behavior with `fs`, where `mkdir` is a separate syscall.
|
|
13
|
+
|
|
14
|
+
So we need to unify dir behavior across different services.
|
|
15
|
+
|
|
16
|
+
# Guide-level explanation
|
|
17
|
+
|
|
18
|
+
After this proposal is merged, we will treat all paths that end with `/` as dirs.
|
|
19
|
+
|
|
20
|
+
For example:
|
|
21
|
+
|
|
22
|
+
- `read("abc/")` will return an `IsDir` error.
|
|
23
|
+
- `write("abc/")` will return an `IsDir` error.
|
|
24
|
+
- `stat("abc/")` will be guaranteed to return a dir or a `NotDir` error.
|
|
25
|
+
- `delete("abc/")` will be guaranteed to delete a dir or return a `NotDir` / `NotEmpty` error.
|
|
26
|
+
- `list("abc/")` will be guaranteed to list a dir or return a `NotDir` error.
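
The trailing-slash rule above can be sketched in a few lines. This is only an illustration: `ObjectMode`, `PathError`, `mode_of` and `ensure_file` are hypothetical names chosen for the sketch, not OpenDAL's actual API, and the variant spelling (`File` / `Dir`) is an assumption.

```rust
/// Hypothetical mode type; the RFC refers to it as `ObjectMode`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjectMode {
    File,
    Dir,
}

/// Hypothetical error type mirroring the `IsDir` name used above.
#[derive(Debug, PartialEq, Eq)]
pub enum PathError {
    IsDir,
}

/// A path is treated as a dir if and only if it ends with `/`.
pub fn mode_of(path: &str) -> ObjectMode {
    if path.ends_with('/') {
        ObjectMode::Dir
    } else {
        ObjectMode::File
    }
}

/// Guard for file-only operations such as `read` and `write`:
/// a dir path is rejected before any request is sent.
pub fn ensure_file(path: &str) -> Result<(), PathError> {
    match mode_of(path) {
        ObjectMode::File => Ok(()),
        ObjectMode::Dir => Err(PathError::IsDir),
    }
}
```

With such a guard, `read("abc/")` and `write("abc/")` can fail early with `IsDir` based on the path alone, while `stat`, `delete` and `list` still need to ask the underlying service.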
|
|
27
|
+
|
|
28
|
+
We will also support creating an empty object:
|
|
29
|
+
|
|
30
|
+
```rust
|
|
31
|
+
// create a dir object "abc/"
|
|
32
|
+
let _ = op.object("abc/").create().await?;
|
|
33
|
+
// create a file object "abc"
|
|
34
|
+
let _ = op.object("abc").create().await?;
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
# Reference-level explanation
|
|
38
|
+
|
|
39
|
+
We will add a new API called `create` to create an empty object.
|
|
40
|
+
|
|
41
|
+
```rust
|
|
42
|
+
struct OpCreate {
|
|
43
|
+
path: String,
|
|
44
|
+
mode: ObjectMode,
|
|
45
|
+
}
|
|
46
|
+
|
|
47
|
+
pub trait Accessor: Send + Sync + Debug {
|
|
48
|
+
async fn create(&self, args: &OpCreate) -> Result<Metadata>;
|
|
49
|
+
}
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
`Object` will expose an API like `create`, which will call `Accessor::create()` internally.
|
|
53
|
+
|
|
54
|
+
# Drawbacks
|
|
55
|
+
|
|
56
|
+
None
|
|
57
|
+
|
|
58
|
+
# Rationale and alternatives
|
|
59
|
+
|
|
60
|
+
None
|
|
61
|
+
|
|
62
|
+
# Prior art
|
|
63
|
+
|
|
64
|
+
None
|
|
65
|
+
|
|
66
|
+
# Unresolved questions
|
|
67
|
+
|
|
68
|
+
At the time of writing, [io_error_more](https://github.com/rust-lang/rust/issues/86442) is not stabilized yet, so we can't use `NotADirectory` or `IsADirectory` directly.
|
|
69
|
+
|
|
70
|
+
Using `from_raw_os_error` is unacceptable because we can't carry our error context.
|
|
71
|
+
|
|
72
|
+
```rust
|
|
73
|
+
use std::io;
|
|
74
|
+
|
|
75
|
+
let error = io::Error::from_raw_os_error(22);
|
|
76
|
+
assert_eq!(error.kind(), io::ErrorKind::InvalidInput);
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
So we will use `ErrorKind::Other` for now, which means our users can't check the following errors:
|
|
80
|
+
|
|
81
|
+
- `IsADirectory`
|
|
82
|
+
- `DirectoryNotEmpty`
|
|
83
|
+
- `NotADirectory`
|
|
84
|
+
|
|
85
|
+
Until they get stabilized.
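
As an illustration of the trade-off described above, the following sketch wraps a custom error in an `io::Error` of kind `Other`. The `DirError` type and `to_io_error` helper are hypothetical names invented for this example; the RFC does not specify how the context is carried.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical error type naming the three cases listed above,
/// each carrying the path as context.
#[derive(Debug, PartialEq, Eq)]
pub enum DirError {
    IsADirectory { path: String },
    NotADirectory { path: String },
    DirectoryNotEmpty { path: String },
}

impl fmt::Display for DirError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DirError::IsADirectory { path } => write!(f, "{} is a directory", path),
            DirError::NotADirectory { path } => write!(f, "{} is not a directory", path),
            DirError::DirectoryNotEmpty { path } => write!(f, "{} is not empty", path),
        }
    }
}

impl Error for DirError {}

/// Wrap the custom error in an `io::Error`. The kind is `Other`,
/// so callers cannot match on a dedicated `ErrorKind`, but the
/// message and context survive.
pub fn to_io_error(err: DirError) -> io::Error {
    io::Error::new(io::ErrorKind::Other, err)
}
```

Unlike `from_raw_os_error`, this keeps the error context, at the cost of a kind that users cannot distinguish from other `Other` errors.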
|
|
86
|
+
|
|
87
|
+
# Future possibilities
|
|
88
|
+
|
|
89
|
+
None
|
|
@@ -0,0 +1,87 @@
|
|
|
1
|
+
- Proposal Name: `retryable_error`
|
|
2
|
+
- Start Date: 2022-04-12
|
|
3
|
+
- RFC PR: [apache/opendal#247](https://github.com/apache/opendal/pull/247)
|
|
4
|
+
- Tracking Issue: [apache/opendal#248](https://github.com/apache/opendal/issues/248)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Treat `io::ErrorKind::Interrupted` as a retryable error.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
Supporting retry makes our users' lives easier:
|
|
13
|
+
|
|
14
|
+
> [Feature request: Custom retries for the s3 backend](https://github.com/apache/opendal/issues/196)
|
|
15
|
+
>
|
|
16
|
+
> While the reading/writing from/to s3, AWS occasionally returns errors that could be retried (at least 5xx?). Currently, in the databend, this will fail the whole execution of the statement (which may have been running for an extended time).
|
|
17
|
+
|
|
18
|
+
Most users may need this retry feature, much like `decompress`. Implementing it in OpenDAL spares users from writing their own backoff logic.
|
|
19
|
+
|
|
20
|
+
# Guide-level explanation
|
|
21
|
+
|
|
22
|
+
With the `retry` feature enabled:
|
|
23
|
+
|
|
24
|
+
```toml
|
|
25
|
+
opendal = {version="0.5.2", features=["retry"]}
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
Users can configure the retry behavior easily:
|
|
29
|
+
|
|
30
|
+
```rust
|
|
31
|
+
let backoff = ExponentialBackoff::default();
|
|
32
|
+
let op = op.with_backoff(backoff);
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
All requests sent by `op` will be automatically retried.
|
|
36
|
+
|
|
37
|
+
# Reference-level explanation
|
|
38
|
+
|
|
39
|
+
We will implement the retry feature by adding a new `Layer`.
|
|
40
|
+
|
|
41
|
+
In the retry layer, we will support retrying all operations. To do our best to keep retrying read & write, we will implement `RetryableReader` and `RetryableWriter`, which will support retry while no actual IO happens.
|
|
42
|
+
|
|
43
|
+
## Retry operations
|
|
44
|
+
|
|
45
|
+
Most operations are safe to retry, like `list`, `stat`, `delete` and `create`.
|
|
46
|
+
|
|
47
|
+
We will retry those operations using the backoff provided by the user.
|
|
48
|
+
|
|
49
|
+
## Retry IO operations
|
|
50
|
+
|
|
51
|
+
Retrying IO operations is a bit complex because IO operations have side effects, especially for HTTP-based services like s3. We can't resume an operation during the reading process without sending new requests.
|
|
52
|
+
|
|
53
|
+
This proposal will do the best we can: retry the operation if no actual IO happens.
|
|
54
|
+
|
|
55
|
+
If we meet an internal error before reading/writing the user's buffer, it's safe and cheap to retry it with precisely the same argument.
|
|
56
|
+
|
|
57
|
+
## Retryable Error
|
|
58
|
+
|
|
59
|
+
- Operator MAY retry `io::ErrorKind::Interrupted` errors.
|
|
60
|
+
- Services SHOULD return the `io::ErrorKind::Interrupted` kind if the error is retryable.
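
A minimal, synchronous sketch of this contract is shown below. It uses only the standard library (where the variant is spelled `io::ErrorKind::Interrupted`); `retry_interrupted` and its parameters are hypothetical names, and the real retry layer is asynchronous and driven by a user-supplied backoff rather than this fixed doubling scheme.

```rust
use std::io;
use std::thread;
use std::time::Duration;

/// Run `op`, retrying while it fails with `ErrorKind::Interrupted`.
/// The delay doubles after every failed attempt; any other error,
/// or running out of attempts, returns the last result as-is.
pub fn retry_interrupted<T, F>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: F,
) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Err(err)
                if err.kind() == io::ErrorKind::Interrupted && attempt < max_attempts =>
            {
                // Retryable: wait, grow the delay, and try again.
                thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
            // Success, a non-retryable error, or attempts exhausted.
            other => return other,
        }
    }
}
```

Because the decision is based only on the error kind, a service opts in to retries simply by mapping its transient failures (for example HTTP 5xx responses) to `Interrupted`.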
|
|
61
|
+
|
|
62
|
+
# Drawbacks
|
|
63
|
+
|
|
64
|
+
## Write operation can't be retried
|
|
65
|
+
|
|
66
|
+
As we return `Writer` to users, there is no way for OpenDAL to get the input data again.
|
|
67
|
+
|
|
68
|
+
# Rationale and alternatives
|
|
69
|
+
|
|
70
|
+
## Implement retry at operator level
|
|
71
|
+
|
|
72
|
+
We would need to implement retry logic for every operator function, and it still would not address the following problems:
|
|
73
|
+
|
|
74
|
+
- `Reader` / `Writer` can't be retried.
|
|
75
|
+
- Intrusive design that users cannot expand on their own
|
|
76
|
+
|
|
77
|
+
# Prior art
|
|
78
|
+
|
|
79
|
+
None
|
|
80
|
+
|
|
81
|
+
# Unresolved questions
|
|
82
|
+
|
|
83
|
+
- `read` and `write` can't be retried during IO.
|
|
84
|
+
|
|
85
|
+
# Future possibilities
|
|
86
|
+
|
|
87
|
+
None
|
|
@@ -0,0 +1,67 @@
|
|
|
1
|
+
- Proposal Name: `object_id`
|
|
2
|
+
- Start Date: 2022-05-27
|
|
3
|
+
- RFC PR: [apache/opendal#293](https://github.com/apache/opendal/pull/293)
|
|
4
|
+
- Tracking Issue: [apache/opendal#294](https://github.com/apache/opendal/issues/294)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Allow getting an id from an object.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
Getting an id from an object makes it possible to operate across different operators. Users can store objects' IDs locally and refer to them with different settings. This proposal will make tasks like backup, restore, and migration possible.
|
|
13
|
+
|
|
14
|
+
# Guide-level explanation
|
|
15
|
+
|
|
16
|
+
Users can fetch an object id via:
|
|
17
|
+
|
|
18
|
+
```rust
|
|
19
|
+
let o = op.object("test_object");
|
|
20
|
+
let id = o.id();
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
The id is unique and permanent inside the underlying storage.
|
|
24
|
+
|
|
25
|
+
For example, if we have an s3 bucket with the root `/workdir/`, the id of the object `test_object` will be `/workdir/test_object`.
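
The mapping from root and path to id can be sketched as a plain string join. `object_id` is a hypothetical free function used only for illustration; the handling of a missing or duplicated `/` at the boundary is an assumption, not something this RFC specifies.

```rust
/// Join an operator root and a relative object path into an id.
/// Assumes `root` is absolute; exactly one `/` is kept at the
/// boundary between root and path.
pub fn object_id(root: &str, path: &str) -> String {
    let root = root.trim_end_matches('/');
    let path = path.trim_start_matches('/');
    format!("{}/{}", root, path)
}
```

For the example above, `object_id("/workdir/", "test_object")` yields `/workdir/test_object`.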
|
|
26
|
+
|
|
27
|
+
# Reference-level explanation
|
|
28
|
+
|
|
29
|
+
`id()` and `path()` will be added as functions of `object`:
|
|
30
|
+
|
|
31
|
+
```rust
|
|
32
|
+
impl Object {
|
|
33
|
+
pub fn id(&self) -> String {}
|
|
34
|
+
pub fn path(&self) -> String {}
|
|
35
|
+
}
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
- `path` re-exports the result of `Metadata::path()`.
|
|
39
|
+
- `id` will be generated from the Operator's root and `Metadata::path()`.
|
|
40
|
+
|
|
41
|
+
# Drawbacks
|
|
42
|
+
|
|
43
|
+
None
|
|
44
|
+
|
|
45
|
+
# Rationale and alternatives
|
|
46
|
+
|
|
47
|
+
## Why not add a new field in `Metadata`?
|
|
48
|
+
|
|
49
|
+
Adding a new field inside `Metadata` requires every service to handle the id separately. And every metadata will need to store a complete id that includes the operator's root.
|
|
50
|
+
|
|
51
|
+
## Why not provide a full URI like `s3://path/to/object`?
|
|
52
|
+
|
|
53
|
+
Because we can't.
|
|
54
|
+
|
|
55
|
+
A full and functional URI towards an object will need the operator's endpoint and credentials. It's better to provide the mechanism and allow users to construct them based on their own business.
|
|
56
|
+
|
|
57
|
+
# Prior art
|
|
58
|
+
|
|
59
|
+
None
|
|
60
|
+
|
|
61
|
+
# Unresolved questions
|
|
62
|
+
|
|
63
|
+
None
|
|
64
|
+
|
|
65
|
+
# Future possibilities
|
|
66
|
+
|
|
67
|
+
None
|
|
@@ -0,0 +1,191 @@
|
|
|
1
|
+
- Proposal Name: `dir_entry`
|
|
2
|
+
- Start Date: 2022-06-08
|
|
3
|
+
- RFC PR: [apache/opendal#337](https://github.com/apache/opendal/pull/337)
|
|
4
|
+
- Tracking Issue: [apache/opendal#338](https://github.com/apache/opendal/issues/338)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Return `DirEntry` instead of `Object` in list.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
In [Object Stream](./0069_object_stream.md), we introduced read_dir support via:
|
|
13
|
+
|
|
14
|
+
```rust
|
|
15
|
+
pub trait ObjectStream: futures::Stream<Item = Result<Object>> + Unpin + Send {}
|
|
16
|
+
impl<T> ObjectStream for T where T: futures::Stream<Item = Result<Object>> + Unpin + Send {}
|
|
17
|
+
|
|
18
|
+
pub struct Object {
|
|
19
|
+
acc: Arc<dyn Accessor>,
|
|
20
|
+
meta: Metadata,
|
|
21
|
+
}
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
However, the `meta` inside `Object` is not well-used:
|
|
25
|
+
|
|
26
|
+
```rust
|
|
27
|
+
pub(crate) fn metadata_ref(&self) -> &Metadata {}
|
|
28
|
+
pub(crate) fn metadata_mut(&mut self) -> &mut Metadata {}
|
|
29
|
+
pub async fn metadata_cached(&mut self) -> Result<&Metadata> {}
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
Users can't learn an object's mode from the list result, so they have to request `metadata` for every object they get:
|
|
33
|
+
|
|
34
|
+
```rust
|
|
35
|
+
let o = op.object("path/to/dir/");
|
|
36
|
+
let mut obs = o.list().await?;
|
|
37
|
+
// ObjectStream implements `futures::Stream`
|
|
38
|
+
while let Some(o) = obs.next().await {
|
|
39
|
+
let mut o = o?;
|
|
40
|
+
// It's highly possible that OpenDAL already did metadata during list.
|
|
41
|
+
// Use `Object::metadata_cached()` to get cached metadata at first.
|
|
42
|
+
let meta = o.metadata_cached().await?;
|
|
43
|
+
match meta.mode() {
|
|
44
|
+
ObjectMode::FILE => {
|
|
45
|
+
println!("Handling file")
|
|
46
|
+
}
|
|
47
|
+
ObjectMode::DIR => {
|
|
48
|
+
println!("Handling dir like start a new list via meta.path()")
|
|
49
|
+
}
|
|
50
|
+
ObjectMode::Unknown => continue,
|
|
51
|
+
}
|
|
52
|
+
}
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
This behavior doesn't make sense as we already know the object's mode after the list.
|
|
56
|
+
|
|
57
|
+
Introducing a separate `DirEntry` could save an extra metadata call most of the time.
|
|
58
|
+
|
|
59
|
+
```rust
|
|
60
|
+
let o = op.object("path/to/dir/");
|
|
61
|
+
let mut ds = o.list().await?;
|
|
62
|
+
// DirStreamer implements `futures::Stream`
|
|
63
|
+
while let Some(de) = ds.try_next().await? {
|
|
64
|
+
match de.mode() {
|
|
65
|
+
ObjectMode::FILE => {
|
|
66
|
+
println!("Handling file")
|
|
67
|
+
}
|
|
68
|
+
ObjectMode::DIR => {
|
|
69
|
+
println!("Handling dir like start a new list via de.path()")
|
|
70
|
+
}
|
|
71
|
+
ObjectMode::Unknown => continue,
|
|
72
|
+
}
|
|
73
|
+
}
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
# Guide-level explanation
|
|
77
|
+
|
|
78
|
+
Within this RFC, `Object::list()` will return `DirStreamer` instead.
|
|
79
|
+
|
|
80
|
+
```rust
|
|
81
|
+
pub trait DirStream: futures::Stream<Item = Result<DirEntry>> + Unpin + Send {}
|
|
82
|
+
pub type DirStreamer = Box<dyn DirStream>;
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
`DirStreamer` will stream `DirEntry`, which carries information already known during the list. So we can:
|
|
86
|
+
|
|
87
|
+
```rust
|
|
88
|
+
let id = de.id();
|
|
89
|
+
let path = de.path();
|
|
90
|
+
let name = de.name();
|
|
91
|
+
let mode = de.mode();
|
|
92
|
+
let meta = de.metadata().await?;
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
With `DirEntry` support, we can avoid an extra `metadata` call if we only want to know the object's mode:
|
|
96
|
+
|
|
97
|
+
```rust
|
|
98
|
+
let o = op.object("path/to/dir/");
|
|
99
|
+
let mut ds = o.list().await?;
|
|
100
|
+
// DirStreamer implements `futures::Stream`
|
|
101
|
+
while let Some(de) = ds.try_next().await? {
|
|
102
|
+
match de.mode() {
|
|
103
|
+
ObjectMode::FILE => {
|
|
104
|
+
println!("Handling file")
|
|
105
|
+
}
|
|
106
|
+
ObjectMode::DIR => {
|
|
107
|
+
println!("Handling dir like start a new list via de.path()")
|
|
108
|
+
}
|
|
109
|
+
ObjectMode::Unknown => continue,
|
|
110
|
+
}
|
|
111
|
+
}
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
We can convert this `DirEntry` into `Object` without overhead:
|
|
115
|
+
|
|
116
|
+
```rust
|
|
117
|
+
let o: Object = de.into();
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
# Reference-level explanation
|
|
121
|
+
|
|
122
|
+
This proposal will introduce a new struct, `DirEntry`:
|
|
123
|
+
|
|
124
|
+
```rust
|
|
125
|
+
struct DirEntry {}
|
|
126
|
+
|
|
127
|
+
impl DirEntry {
|
|
128
|
+
pub fn id(&self) -> String {}
|
|
129
|
+
pub fn path(&self) -> &str {}
|
|
130
|
+
pub fn name(&self) -> &str {}
|
|
131
|
+
pub fn mode(&self) -> ObjectMode {}
|
|
132
|
+
pub async fn metadata(&self) -> Result<ObjectMetadata> {}
|
|
133
|
+
}
|
|
134
|
+
|
|
135
|
+
impl From<DirEntry> for Object {}
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
And use `DirStream` to replace `ObjectStream`:
|
|
139
|
+
|
|
140
|
+
```rust
|
|
141
|
+
pub trait DirStream: futures::Stream<Item = Result<DirEntry>> + Unpin + Send {}
|
|
142
|
+
pub type DirStreamer = Box<dyn DirStream>;
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
With the addition of `DirEntry`, we will remove `meta` from `Object`:
|
|
146
|
+
|
|
147
|
+
```rust
|
|
148
|
+
#[derive(Clone, Debug)]
|
|
149
|
+
pub struct Object {
|
|
150
|
+
acc: Arc<dyn Accessor>,
|
|
151
|
+
path: String,
|
|
152
|
+
}
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
After this change, `Object` will become a thin wrapper around `Accessor` plus a path. Metadata-related APIs like `metadata_ref()` and `metadata_mut()` will also be removed.
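
The relationship between the two types can be sketched as follows. This is a simplified stand-in, not the actual implementation: the `Accessor` trait here is an empty placeholder, `NoopAccessor` and `DirEntry::new` are hypothetical helpers, and the rule that a dir's `name()` keeps its trailing `/` is an assumption.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Placeholder for OpenDAL's `Accessor` trait (the real one has methods).
pub trait Accessor: Send + Sync + Debug {}

/// Hypothetical accessor used only to construct values in this sketch.
#[derive(Debug)]
pub struct NoopAccessor;
impl Accessor for NoopAccessor {}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjectMode {
    File,
    Dir,
    Unknown,
}

/// `Object` after this RFC: just an accessor handle and a path.
#[derive(Clone, Debug)]
pub struct Object {
    acc: Arc<dyn Accessor>,
    path: String,
}

impl Object {
    pub fn path(&self) -> &str {
        &self.path
    }
}

/// `DirEntry` additionally remembers what the list call already knew.
#[derive(Clone, Debug)]
pub struct DirEntry {
    acc: Arc<dyn Accessor>,
    mode: ObjectMode,
    path: String,
}

impl DirEntry {
    pub fn new(acc: Arc<dyn Accessor>, mode: ObjectMode, path: &str) -> Self {
        DirEntry { acc, mode, path: path.to_string() }
    }

    pub fn path(&self) -> &str {
        &self.path
    }

    pub fn mode(&self) -> ObjectMode {
        self.mode
    }

    /// Last path segment; a dir keeps its trailing `/` (assumed).
    pub fn name(&self) -> &str {
        let trimmed = self.path.trim_end_matches('/');
        match trimmed.rfind('/') {
            Some(idx) => &self.path[idx + 1..],
            None => &self.path,
        }
    }
}

/// The conversion only moves fields, so it needs no extra request.
impl From<DirEntry> for Object {
    fn from(de: DirEntry) -> Self {
        Object { acc: de.acc, path: de.path }
    }
}
```

The mode is answered from the entry itself, and the conversion into `Object` is a plain move of the accessor handle and the path.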
|
|
156
|
+
|
|
157
|
+
# Drawbacks
|
|
158
|
+
|
|
159
|
+
We are adding a new concept to our core logic.
|
|
160
|
+
|
|
161
|
+
# Rationale and alternatives
|
|
162
|
+
|
|
163
|
+
## Rust fs API design
|
|
164
|
+
|
|
165
|
+
Rust also provides abstractions like `File` and `DirEntry`:
|
|
166
|
+
|
|
167
|
+
```rust
|
|
168
|
+
use std::fs;
|
|
169
|
+
|
|
170
|
+
fn main() -> std::io::Result<()> {
|
|
171
|
+
for entry in fs::read_dir(".")? {
|
|
172
|
+
let dir = entry?;
|
|
173
|
+
println!("{:?}", dir.path());
|
|
174
|
+
}
|
|
175
|
+
Ok(())
|
|
176
|
+
}
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
Users can open a file with `entry.path()`.
|
|
180
|
+
|
|
181
|
+
# Prior art
|
|
182
|
+
|
|
183
|
+
None.
|
|
184
|
+
|
|
185
|
+
# Unresolved questions
|
|
186
|
+
|
|
187
|
+
None.
|
|
188
|
+
|
|
189
|
+
# Future possibilities
|
|
190
|
+
|
|
191
|
+
None.
|
|
@@ -0,0 +1,67 @@
|
|
|
1
|
+
- Proposal Name: `accessor_capabilities`
|
|
2
|
+
- Start Date: 2022-06-29
|
|
3
|
+
- RFC PR: [apache/opendal#409](https://github.com/apache/opendal/pull/409)
|
|
4
|
+
- Tracking Issue: [apache/opendal#410](https://github.com/apache/opendal/issues/410)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Add support for accessor capabilities so that users can check if a given accessor is capable of a given ability.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
Users of OpenDAL are requesting advanced features like the following:
|
|
13
|
+
|
|
14
|
+
- [Support parallel upload object](https://github.com/apache/opendal/issues/256)
|
|
15
|
+
- [Add presign url support](https://github.com/apache/opendal/issues/394)
|
|
16
|
+
|
|
17
|
+
It's meaningful for OpenDAL to support them in a unified way. Of course, not all storage services have the same feature sets. OpenDAL needs to provide a way for users to check if a given accessor is capable of a given capability.
|
|
18
|
+
|
|
19
|
+
# Guide-level explanation
|
|
20
|
+
|
|
21
|
+
Users can check an `Accessor`'s capability via `Operator::metadata()`.
|
|
22
|
+
|
|
23
|
+
```rust
|
|
24
|
+
let meta = op.metadata();
|
|
25
|
+
let _: bool = meta.can_presign();
|
|
26
|
+
let _: bool = meta.can_multipart();
|
|
27
|
+
```
|
|
28
|
+
|
|
29
|
+
`Accessor` will return [`io::ErrorKind::Unsupported`](https://doc.rust-lang.org/stable/std/io/enum.ErrorKind.html#variant.Unsupported) for unsupported operations instead of panicking via `unimplemented!()`.
|
|
30
|
+
|
|
31
|
+
Users can check capabilities before an operation, or check for the `Unsupported` error kind afterwards.
|
|
32
|
+
|
|
33
|
+
# Reference-level explanation
|
|
34
|
+
|
|
35
|
+
We will introduce a new enum called `AccessorCapability`, which will be included in `AccessorMetadata`.
|
|
36
|
+
|
|
37
|
+
This enum is private and only accessible inside OpenDAL, so it's not part of our public API. We will expose the check API via `AccessorMetadata`:
|
|
38
|
+
|
|
39
|
+
```rust
|
|
40
|
+
impl AccessorMetadata {
|
|
41
|
+
pub fn can_presign(&self) -> bool { .. }
|
|
42
|
+
pub fn can_multipart(&self) -> bool { .. }
|
|
43
|
+
}
|
|
44
|
+
```
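
One way such a check could work internally is a small bit set, sketched below. The bit layout, the `with` / `has` helpers, `presign_only_metadata` and `multipart_or_unsupported` are all hypothetical and exist only to make the sketch self-contained; the RFC does not prescribe the representation.

```rust
use std::io;

/// Crate-private capability flags (one bit each).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum AccessorCapability {
    Presign = 1 << 0,
    Multipart = 1 << 1,
}

/// Public metadata type that hides the flags behind `can_*` checks.
#[derive(Debug, Clone, Copy, Default)]
pub struct AccessorMetadata {
    capabilities: u8,
}

impl AccessorMetadata {
    fn with(mut self, cap: AccessorCapability) -> Self {
        self.capabilities |= cap as u8;
        self
    }

    fn has(&self, cap: AccessorCapability) -> bool {
        self.capabilities & (cap as u8) != 0
    }

    pub fn can_presign(&self) -> bool {
        self.has(AccessorCapability::Presign)
    }

    pub fn can_multipart(&self) -> bool {
        self.has(AccessorCapability::Multipart)
    }
}

/// Hypothetical metadata for a service that supports presign only.
pub fn presign_only_metadata() -> AccessorMetadata {
    AccessorMetadata::default().with(AccessorCapability::Presign)
}

/// Report a missing capability as `Unsupported` instead of panicking.
pub fn multipart_or_unsupported(meta: &AccessorMetadata) -> io::Result<()> {
    if meta.can_multipart() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Unsupported,
            "multipart is not supported by this accessor",
        ))
    }
}
```

Both paths described in the guide-level section are covered: callers can ask `can_multipart()` up front, or attempt the operation and match on `ErrorKind::Unsupported`.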
|
|
45
|
+
|
|
46
|
+
# Drawbacks
|
|
47
|
+
|
|
48
|
+
None.
|
|
49
|
+
|
|
50
|
+
# Rationale and alternatives
|
|
51
|
+
|
|
52
|
+
None.
|
|
53
|
+
|
|
54
|
+
# Prior art
|
|
55
|
+
|
|
56
|
+
## go-storage
|
|
57
|
+
|
|
58
|
+
- [GSP-109: Redesign Features](https://github.com/beyondstorage/go-storage/blob/master/docs/rfcs/109-redesign-features.md)
|
|
59
|
+
- [GSP-837: Support Feature Flag](https://github.com/beyondstorage/go-storage/blob/master/docs/rfcs/837-support-feature-flag.md)
|
|
60
|
+
|
|
61
|
+
# Unresolved questions
|
|
62
|
+
|
|
63
|
+
None.
|
|
64
|
+
|
|
65
|
+
# Future possibilities
|
|
66
|
+
|
|
67
|
+
None.
|
|
@@ -0,0 +1,154 @@
|
|
|
1
|
+
- Proposal Name: `presign`
|
|
2
|
+
- Start Date: 2022-06-30
|
|
3
|
+
- RFC PR: [apache/opendal#413](https://github.com/apache/opendal/pull/413)
|
|
4
|
+
- Tracking Issue: [apache/opendal#394](https://github.com/apache/opendal/issues/394)
|
|
5
|
+
|
|
6
|
+
# Summary
|
|
7
|
+
|
|
8
|
+
Add presign support in OpenDAL so users can generate a pre-signed URL without leaking `secret_key`.
|
|
9
|
+
|
|
10
|
+
# Motivation
|
|
11
|
+
|
|
12
|
+
> By default, all S3 objects are private. Only the object owner has permission to access them. However, the object owner can optionally share objects with others by creating a presigned URL, using their own security credentials, to grant time-limited permission to download the objects.
|
|
13
|
+
>
|
|
14
|
+
> From [Sharing objects using presigned URLs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html)
|
|
15
|
+
|
|
16
|
+
We can use this presigned URL for:
|
|
17
|
+
|
|
18
|
+
- Downloading the object directly from a bucket before the URL expires
|
|
19
|
+
- Uploading content to the bucket from the client side
|
|
20
|
+
|
|
21
|
+
Adding this feature to OpenDAL will make it easier for users to generate presigned URLs across different storage services.
|
|
22
|
+
|
|
23
|
+
The whole process would be:
|
|
24
|
+
|
|
25
|
+
```text
|
|
26
|
+
┌────────────┐
|
|
27
|
+
│ User ├─────────────────────┐
|
|
28
|
+
└──┬─────▲───┘ │
|
|
29
|
+
│ │ 4. Send Request to S3 Directly
|
|
30
|
+
1. Request Resource │ │
|
|
31
|
+
│ │ │
|
|
32
|
+
│ │ │
|
|
33
|
+
│ │ ▼
|
|
34
|
+
│ 3. Return Request ┌────────────┐
|
|
35
|
+
│ │ │ │
|
|
36
|
+
│ │ │ S3 │
|
|
37
|
+
┌──▼─────┴───┐ │ │
|
|
38
|
+
│ │ └────────────┘
|
|
39
|
+
│ App │
|
|
40
|
+
│ ┌────────┐ │
|
|
41
|
+
│ │ OpenDAL│ │
|
|
42
|
+
│ ├────────┤ │
|
|
43
|
+
└─┴──┼─────┴─┘
|
|
44
|
+
│ ▲
|
|
45
|
+
└──────┘
|
|
46
|
+
2. Generate Request
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
# Guide-level explanation
|
|
50
|
+
|
|
51
|
+
With this feature, our users can:
|
|
52
|
+
|
|
53
|
+
## Generate presigned URL for downloading
|
|
54
|
+
|
|
55
|
+
```rust
|
|
56
|
+
let req = op.presign_read("path/to/file")?;
|
|
57
|
+
// req.method: GET
|
|
58
|
+
// req.url: https://s3.amazonaws.com/examplebucket/test.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=access_key_id/20130721/us-east-1/s3/aws4_request&X-Amz-Date=20130721T201207Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=<signature-value>
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
Users can download this object directly from the s3 bucket. For example:
|
|
62
|
+
|
|
63
|
+
```shell
|
|
64
|
+
curl "<generated_url>" -o test.txt
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
## Generate presigned URL for uploading
|
|
68
|
+
|
|
69
|
+
```rust
|
|
70
|
+
let req = op.presign_write("path/to/file")?;
|
|
71
|
+
// req.method: PUT
|
|
72
|
+
// req.url: https://s3.amazonaws.com/examplebucket/test.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=access_key_id/20130721/us-east-1/s3/aws4_request&X-Amz-Date=20130721T201207Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=<signature-value>
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
Users can upload content directly to the s3 bucket. For example:
|
|
76
|
+
|
|
77
|
+
```shell
|
|
78
|
+
curl -X PUT <generated_url> -T "/tmp/test.txt"
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
# Reference-level explanation
|
|
82
|
+
|
|
83
|
+
`Accessor` will add a new API `presign`:
|
|
84
|
+
|
|
85
|
+
```rust
|
|
86
|
+
pub trait Accessor {
|
|
87
|
+
fn presign(&self, args: &OpPresign) -> Result<PresignedRequest> {..}
|
|
88
|
+
}
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
`presign` accepts `OpPresign` and returns `Result<PresignedRequest>`:
|
|
92
|
+
|
|
93
|
+
```rust
|
|
94
|
+
struct OpPresign {
|
|
95
|
+
path: String,
|
|
96
|
+
op: Operation,
|
|
97
|
+
expire: time::Duration,
|
|
98
|
+
}
|
|
99
|
+
|
|
100
|
+
struct PresignedRequest {}
|
|
101
|
+
|
|
102
|
+
impl PresignedRequest {
|
|
103
|
+
pub fn method(&self) -> &http::Method {..}
|
|
104
|
+
pub fn url(&self) -> &http::Uri {..}
|
|
105
|
+
}
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
We are building a new struct to avoid leaking underlying implementations like `hyper::Request<T>` to users.
|
|
109
|
+
|
|
110
|
+
This feature will be a new capability in `AccessorCapability` as described in [RFC-0409: Accessor Capabilities](./0409_accessor_capabilities.md).
|
|
111
|
+
|
|
112
|
+
Based on `Accessor::presign`, we will export public APIs in `Operator`:
|
|
113
|
+
|
|
114
|
+
```rust
|
|
115
|
+
impl Operator {
|
|
116
|
+
fn presign_read(&self, path: &str) -> Result<PresignedRequest> {}
|
|
117
|
+
fn presign_write(&self, path: &str) -> Result<PresignedRequest> {}
|
|
118
|
+
}
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
Although it's possible to generate URLs for `create`, `delete`, `stat`, and `list`, there are no obvious use-cases. So we will not add them to this proposal.
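
The shape of these types can be sketched without any HTTP dependency. Everything below is a simplified assumption: `method` and `url` are plain strings rather than `http::Method` / `http::Uri`, `OpPresign::new` and `presign_unsigned` are hypothetical helpers, and the generated URL carries only an `X-Amz-Expires` parameter, so it is NOT a valid signed request.

```rust
use std::time::Duration;

/// Operations covered by this proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operation {
    Read,
    Write,
}

pub struct OpPresign {
    path: String,
    op: Operation,
    expire: Duration,
}

impl OpPresign {
    pub fn new(path: &str, op: Operation, expire: Duration) -> Self {
        OpPresign { path: path.to_string(), op, expire }
    }
}

/// Own type for the result, so no HTTP client type leaks to users.
pub struct PresignedRequest {
    method: &'static str,
    url: String,
}

impl PresignedRequest {
    pub fn method(&self) -> &str {
        self.method
    }

    pub fn url(&self) -> &str {
        &self.url
    }
}

/// Placeholder for a service's presign logic: it picks the HTTP
/// method from the operation and appends the expiry, but computes
/// no signature (a real service would add its signing parameters).
pub fn presign_unsigned(endpoint: &str, args: &OpPresign) -> PresignedRequest {
    let method = match args.op {
        Operation::Read => "GET",
        Operation::Write => "PUT",
    };
    let url = format!(
        "{}/{}?X-Amz-Expires={}",
        endpoint.trim_end_matches('/'),
        args.path.trim_start_matches('/'),
        args.expire.as_secs()
    );
    PresignedRequest { method, url }
}
```

`presign_read` and `presign_write` then differ only in the `Operation` they put into `OpPresign`, which is why a single `Accessor::presign` entry point is enough.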
|
|
122
|
+
|
|
123
|
+
# Drawbacks
|
|
124
|
+
|
|
125
|
+
None.
|
|
126
|
+
|
|
127
|
+
# Rationale and alternatives
|
|
128
|
+
|
|
129
|
+
## Query Sign Support Status
|
|
130
|
+
|
|
131
|
+
- s3: [Authenticating Requests: Using Query Parameters (AWS Signature Version 4)](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-query-string-auth.html)
|
|
132
|
+
- azblob: [Delegate access with a shared access signature](https://docs.microsoft.com/en-us/rest/api/storageservices/delegate-access-with-shared-access-signature)
|
|
133
|
+
- gcs: [Signed URLs](https://cloud.google.com/storage/docs/access-control/signed-urls) (Only for XML API)
|
|
134
|
+
|
|
135
|
+
# Prior art
|
|
136
|
+
|
|
137
|
+
## awscli presign
|
|
138
|
+
|
|
139
|
+
AWS CLI has native presign support:
|
|
140
|
+
|
|
141
|
+
```shell
|
|
142
|
+
> aws s3 presign s3://DOC-EXAMPLE-BUCKET/test2.txt
|
|
143
|
+
https://DOC-EXAMPLE-BUCKET.s3.us-west-2.amazonaws.com/key?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAEXAMPLE123456789%2F20210621%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210621T041609Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=EXAMBLE1234494d5fba3fed607f98018e1dfc62e2529ae96d844123456
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
Refer to [AWS CLI Command Reference](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/presign.html) for more information.
|
|
147
|
+
|
|
148
|
+
# Unresolved questions
|
|
149
|
+
|
|
150
|
+
None.
|
|
151
|
+
|
|
152
|
+
# Future possibilities
|
|
153
|
+
|
|
154
|
+
- Add `stat`/`list`/`delete` support
|