opendal 0.1.6.pre.rc.1-arm64-darwin-23

Files changed (191)
  1. checksums.yaml +7 -0
  2. data/.standard.yml +20 -0
  3. data/.tool-versions +1 -0
  4. data/.yardopts +1 -0
  5. data/DEPENDENCIES.md +9 -0
  6. data/DEPENDENCIES.rust.tsv +277 -0
  7. data/Gemfile +35 -0
  8. data/README.md +159 -0
  9. data/Rakefile +149 -0
  10. data/core/CHANGELOG.md +4929 -0
  11. data/core/CONTRIBUTING.md +61 -0
  12. data/core/DEPENDENCIES.md +3 -0
  13. data/core/DEPENDENCIES.rust.tsv +185 -0
  14. data/core/LICENSE +201 -0
  15. data/core/README.md +228 -0
  16. data/core/benches/README.md +18 -0
  17. data/core/benches/ops/README.md +26 -0
  18. data/core/benches/types/README.md +9 -0
  19. data/core/benches/vs_fs/README.md +35 -0
  20. data/core/benches/vs_s3/README.md +55 -0
  21. data/core/edge/README.md +3 -0
  22. data/core/edge/file_write_on_full_disk/README.md +14 -0
  23. data/core/edge/s3_aws_assume_role_with_web_identity/README.md +18 -0
  24. data/core/edge/s3_read_on_wasm/.gitignore +3 -0
  25. data/core/edge/s3_read_on_wasm/README.md +42 -0
  26. data/core/edge/s3_read_on_wasm/webdriver.json +15 -0
  27. data/core/examples/README.md +23 -0
  28. data/core/examples/basic/README.md +15 -0
  29. data/core/examples/concurrent-upload/README.md +15 -0
  30. data/core/examples/multipart-upload/README.md +15 -0
  31. data/core/fuzz/.gitignore +5 -0
  32. data/core/fuzz/README.md +68 -0
  33. data/core/src/docs/comparisons/vs_object_store.md +183 -0
  34. data/core/src/docs/performance/concurrent_write.md +101 -0
  35. data/core/src/docs/performance/http_optimization.md +124 -0
  36. data/core/src/docs/rfcs/0000_example.md +74 -0
  37. data/core/src/docs/rfcs/0000_foyer_integration.md +111 -0
  38. data/core/src/docs/rfcs/0041_object_native_api.md +185 -0
  39. data/core/src/docs/rfcs/0044_error_handle.md +198 -0
  40. data/core/src/docs/rfcs/0057_auto_region.md +160 -0
  41. data/core/src/docs/rfcs/0069_object_stream.md +145 -0
  42. data/core/src/docs/rfcs/0090_limited_reader.md +155 -0
  43. data/core/src/docs/rfcs/0112_path_normalization.md +79 -0
  44. data/core/src/docs/rfcs/0191_async_streaming_io.md +328 -0
  45. data/core/src/docs/rfcs/0203_remove_credential.md +96 -0
  46. data/core/src/docs/rfcs/0221_create_dir.md +89 -0
  47. data/core/src/docs/rfcs/0247_retryable_error.md +87 -0
  48. data/core/src/docs/rfcs/0293_object_id.md +67 -0
  49. data/core/src/docs/rfcs/0337_dir_entry.md +191 -0
  50. data/core/src/docs/rfcs/0409_accessor_capabilities.md +67 -0
  51. data/core/src/docs/rfcs/0413_presign.md +154 -0
  52. data/core/src/docs/rfcs/0423_command_line_interface.md +268 -0
  53. data/core/src/docs/rfcs/0429_init_from_iter.md +107 -0
  54. data/core/src/docs/rfcs/0438_multipart.md +163 -0
  55. data/core/src/docs/rfcs/0443_gateway.md +73 -0
  56. data/core/src/docs/rfcs/0501_new_builder.md +111 -0
  57. data/core/src/docs/rfcs/0554_write_refactor.md +96 -0
  58. data/core/src/docs/rfcs/0561_list_metadata_reuse.md +210 -0
  59. data/core/src/docs/rfcs/0599_blocking_api.md +157 -0
  60. data/core/src/docs/rfcs/0623_redis_service.md +300 -0
  61. data/core/src/docs/rfcs/0627_split_capabilities.md +89 -0
  62. data/core/src/docs/rfcs/0661_path_in_accessor.md +126 -0
  63. data/core/src/docs/rfcs/0793_generic_kv_services.md +209 -0
  64. data/core/src/docs/rfcs/0926_object_reader.md +93 -0
  65. data/core/src/docs/rfcs/0977_refactor_error.md +151 -0
  66. data/core/src/docs/rfcs/1085_object_handler.md +73 -0
  67. data/core/src/docs/rfcs/1391_object_metadataer.md +110 -0
  68. data/core/src/docs/rfcs/1398_query_based_metadata.md +125 -0
  69. data/core/src/docs/rfcs/1420_object_writer.md +147 -0
  70. data/core/src/docs/rfcs/1477_remove_object_concept.md +159 -0
  71. data/core/src/docs/rfcs/1735_operation_extension.md +117 -0
  72. data/core/src/docs/rfcs/2083_writer_sink_api.md +106 -0
  73. data/core/src/docs/rfcs/2133_append_api.md +88 -0
  74. data/core/src/docs/rfcs/2299_chain_based_operator_api.md +99 -0
  75. data/core/src/docs/rfcs/2602_object_versioning.md +138 -0
  76. data/core/src/docs/rfcs/2758_merge_append_into_write.md +79 -0
  77. data/core/src/docs/rfcs/2774_lister_api.md +66 -0
  78. data/core/src/docs/rfcs/2779_list_with_metakey.md +143 -0
  79. data/core/src/docs/rfcs/2852_native_capability.md +58 -0
  80. data/core/src/docs/rfcs/2884_merge_range_read_into_read.md +80 -0
  81. data/core/src/docs/rfcs/3017_remove_write_copy_from.md +94 -0
  82. data/core/src/docs/rfcs/3197_config.md +237 -0
  83. data/core/src/docs/rfcs/3232_align_list_api.md +69 -0
  84. data/core/src/docs/rfcs/3243_list_prefix.md +128 -0
  85. data/core/src/docs/rfcs/3356_lazy_reader.md +111 -0
  86. data/core/src/docs/rfcs/3526_list_recursive.md +59 -0
  87. data/core/src/docs/rfcs/3574_concurrent_stat_in_list.md +80 -0
  88. data/core/src/docs/rfcs/3734_buffered_reader.md +64 -0
  89. data/core/src/docs/rfcs/3898_concurrent_writer.md +66 -0
  90. data/core/src/docs/rfcs/3911_deleter_api.md +165 -0
  91. data/core/src/docs/rfcs/4382_range_based_read.md +213 -0
  92. data/core/src/docs/rfcs/4638_executor.md +215 -0
  93. data/core/src/docs/rfcs/5314_remove_metakey.md +120 -0
  94. data/core/src/docs/rfcs/5444_operator_from_uri.md +162 -0
  95. data/core/src/docs/rfcs/5479_context.md +140 -0
  96. data/core/src/docs/rfcs/5485_conditional_reader.md +112 -0
  97. data/core/src/docs/rfcs/5495_list_with_deleted.md +81 -0
  98. data/core/src/docs/rfcs/5556_write_returns_metadata.md +121 -0
  99. data/core/src/docs/rfcs/5871_read_returns_metadata.md +112 -0
  100. data/core/src/docs/rfcs/6189_remove_native_blocking.md +106 -0
  101. data/core/src/docs/rfcs/6209_glob_support.md +132 -0
  102. data/core/src/docs/rfcs/6213_options_api.md +142 -0
  103. data/core/src/docs/rfcs/README.md +62 -0
  104. data/core/src/docs/upgrade.md +1556 -0
  105. data/core/src/services/aliyun_drive/docs.md +61 -0
  106. data/core/src/services/alluxio/docs.md +45 -0
  107. data/core/src/services/azblob/docs.md +77 -0
  108. data/core/src/services/azdls/docs.md +73 -0
  109. data/core/src/services/azfile/docs.md +65 -0
  110. data/core/src/services/b2/docs.md +54 -0
  111. data/core/src/services/cacache/docs.md +38 -0
  112. data/core/src/services/cloudflare_kv/docs.md +21 -0
  113. data/core/src/services/cos/docs.md +55 -0
  114. data/core/src/services/d1/docs.md +48 -0
  115. data/core/src/services/dashmap/docs.md +38 -0
  116. data/core/src/services/dbfs/docs.md +57 -0
  117. data/core/src/services/dropbox/docs.md +64 -0
  118. data/core/src/services/etcd/docs.md +45 -0
  119. data/core/src/services/foundationdb/docs.md +42 -0
  120. data/core/src/services/fs/docs.md +49 -0
  121. data/core/src/services/ftp/docs.md +42 -0
  122. data/core/src/services/gcs/docs.md +76 -0
  123. data/core/src/services/gdrive/docs.md +65 -0
  124. data/core/src/services/ghac/docs.md +84 -0
  125. data/core/src/services/github/docs.md +52 -0
  126. data/core/src/services/gridfs/docs.md +46 -0
  127. data/core/src/services/hdfs/docs.md +140 -0
  128. data/core/src/services/hdfs_native/docs.md +35 -0
  129. data/core/src/services/http/docs.md +45 -0
  130. data/core/src/services/huggingface/docs.md +61 -0
  131. data/core/src/services/ipfs/docs.md +45 -0
  132. data/core/src/services/ipmfs/docs.md +14 -0
  133. data/core/src/services/koofr/docs.md +51 -0
  134. data/core/src/services/lakefs/docs.md +62 -0
  135. data/core/src/services/memcached/docs.md +47 -0
  136. data/core/src/services/memory/docs.md +36 -0
  137. data/core/src/services/mini_moka/docs.md +19 -0
  138. data/core/src/services/moka/docs.md +42 -0
  139. data/core/src/services/mongodb/docs.md +49 -0
  140. data/core/src/services/monoiofs/docs.md +46 -0
  141. data/core/src/services/mysql/docs.md +47 -0
  142. data/core/src/services/obs/docs.md +54 -0
  143. data/core/src/services/onedrive/docs.md +115 -0
  144. data/core/src/services/opfs/docs.md +18 -0
  145. data/core/src/services/oss/docs.md +74 -0
  146. data/core/src/services/pcloud/docs.md +51 -0
  147. data/core/src/services/persy/docs.md +43 -0
  148. data/core/src/services/postgresql/docs.md +47 -0
  149. data/core/src/services/redb/docs.md +41 -0
  150. data/core/src/services/redis/docs.md +43 -0
  151. data/core/src/services/rocksdb/docs.md +54 -0
  152. data/core/src/services/s3/compatible_services.md +126 -0
  153. data/core/src/services/s3/docs.md +244 -0
  154. data/core/src/services/seafile/docs.md +54 -0
  155. data/core/src/services/sftp/docs.md +49 -0
  156. data/core/src/services/sled/docs.md +39 -0
  157. data/core/src/services/sqlite/docs.md +46 -0
  158. data/core/src/services/surrealdb/docs.md +54 -0
  159. data/core/src/services/swift/compatible_services.md +53 -0
  160. data/core/src/services/swift/docs.md +52 -0
  161. data/core/src/services/tikv/docs.md +43 -0
  162. data/core/src/services/upyun/docs.md +51 -0
  163. data/core/src/services/vercel_artifacts/docs.md +40 -0
  164. data/core/src/services/vercel_blob/docs.md +45 -0
  165. data/core/src/services/webdav/docs.md +49 -0
  166. data/core/src/services/webhdfs/docs.md +90 -0
  167. data/core/src/services/yandex_disk/docs.md +45 -0
  168. data/core/tests/behavior/README.md +77 -0
  169. data/core/tests/data/normal_dir/.gitkeep +0 -0
  170. data/core/tests/data/normal_file.txt +1041 -0
  171. data/core/tests/data/special_dir !@#$%^&()_+-=;',/.gitkeep +0 -0
  172. data/core/tests/data/special_file !@#$%^&()_+-=;',.txt +1041 -0
  173. data/core/users.md +13 -0
  174. data/extconf.rb +24 -0
  175. data/lib/opendal.rb +25 -0
  176. data/lib/opendal_ruby/entry.rb +35 -0
  177. data/lib/opendal_ruby/io.rb +70 -0
  178. data/lib/opendal_ruby/metadata.rb +44 -0
  179. data/lib/opendal_ruby/opendal_ruby.bundle +0 -0
  180. data/lib/opendal_ruby/operator.rb +29 -0
  181. data/lib/opendal_ruby/operator_info.rb +26 -0
  182. data/opendal.gemspec +91 -0
  183. data/test/blocking_op_test.rb +112 -0
  184. data/test/capability_test.rb +42 -0
  185. data/test/io_test.rb +172 -0
  186. data/test/lister_test.rb +77 -0
  187. data/test/metadata_test.rb +78 -0
  188. data/test/middlewares_test.rb +46 -0
  189. data/test/operator_info_test.rb +35 -0
  190. data/test/test_helper.rb +36 -0
  191. metadata +240 -0
@@ -0,0 +1,88 @@
1
+ - Proposal Name: `append_api`
2
+ - Start Date: 2023-04-26
3
+ - RFC PR: [apache/opendal#2133](https://github.com/apache/opendal/pull/2133)
4
+ - Tracking Issue: [apache/opendal#2163](https://github.com/apache/opendal/issues/2163)
5
+
6
+ # Summary
7
+
8
+ Introduce append operations for OpenDAL which allow users to add data to a file.
9
+
10
+ # Motivation
11
+
12
+ OpenDAL has a write operation that creates a file and uploads it in parts, implemented on top of the multipart API. However, the current approach has some limitations:
13
+
14
+ - Data can be lost, and is not readable, until `w.close()` returns `Ok(())`.
15
+ - The file can't be appended to again after `w.close()` has returned `Ok(())`.
16
+
17
+ To address these issues, I propose adding an append operation. Users can create an appender that provides a reentrant append operation. Each append operation will add data to the end of the file, which can be read immediately after the operation.
18
+
19
+ # Guide-level explanation
20
+
21
+ Files created by the append operation can be appended to again via the append API.
22
+
23
+ ```rust
24
+ async fn append_test(op: Operator) -> Result<()> {
25
+ // create appender (`mut` because `append` takes `&mut self`)
26
+ let mut append = op.append("path_to_file").await?;
27
+
28
+ let bs = read_from_file();
29
+ append.append(bs).await?;
30
+
31
+ let bs = read_from_another_file();
32
+ append.append(bs).await?;
33
+
34
+ // close the file and return the result
35
+ append.close().await
36
+ }
37
+ ```
38
+
39
+ Difference between the write and append operation:
40
+
41
+ - write: Always create a new file, not readable until close.
42
+ - append: Can append existing appendable file, readable after append return.
43
+
44
+ # Reference-level explanation
45
+
46
+ For the underlying API, we will make these changes in `Accessor`:
47
+
48
+ ```rust
49
+ trait Accessor {
50
+ type Appender: oio::Append;
51
+
52
+ async fn append(&self, path: &str, args: OpAppend) -> Result<Self::Appender>;
53
+ }
54
+ ```
55
+
56
+ To implement this feature, we need to add a new trait `oio::Append` that provides the `append` API.
57
+
58
+ ```rust
59
+ #[async_trait]
60
+ pub trait Append: Unpin + Send + Sync {
61
+ /// Append data to the end of file.
62
+ /// Users will call `append` multiple times. Please make sure `append` is safe to re-enter.
63
+ async fn append(&mut self, bs: Bytes) -> Result<()>;
64
+
65
+ /// Seal the file to mark it as unmodifiable.
66
+ async fn close(&mut self) -> Result<()>;
67
+ }
68
+ ```
69
+
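The contract described by the trait above can be illustrated with a minimal, synchronous sketch. Everything here is an assumption made for illustration: the trait is a simplified stand-in for the proposed `oio::Append` (no `async`, `&[u8]` instead of `Bytes`, `String` errors), `MemoryAppender` is a hypothetical in-memory implementation, and rejecting appends after `close` is one possible reading of "seal the file".

```rust
/// Simplified, synchronous stand-in for the proposed `oio::Append` trait.
pub trait Append {
    /// Append data to the end of the file; must be safe to call repeatedly.
    fn append(&mut self, bs: &[u8]) -> Result<(), String>;
    /// Seal the file to mark it as unmodifiable.
    fn close(&mut self) -> Result<(), String>;
}

/// Hypothetical in-memory appender, used only for illustration.
#[derive(Default)]
pub struct MemoryAppender {
    content: Vec<u8>,
    sealed: bool,
}

impl MemoryAppender {
    /// The data is readable as soon as each `append` call has returned.
    pub fn content(&self) -> &[u8] {
        &self.content
    }
}

impl Append for MemoryAppender {
    fn append(&mut self, bs: &[u8]) -> Result<(), String> {
        // Assumption: a sealed file rejects further appends.
        if self.sealed {
            return Err("file is sealed".to_string());
        }
        self.content.extend_from_slice(bs);
        Ok(())
    }

    fn close(&mut self) -> Result<(), String> {
        self.sealed = true;
        Ok(())
    }
}
```

Calling `append` twice on a `MemoryAppender` leaves both chunks visible through `content()` before `close` is ever called, which is the property the write API lacks.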
70
+ # Drawbacks
71
+
72
+ None.
73
+
74
+ # Rationale and alternatives
75
+
76
+ None.
77
+
78
+ # Prior art
79
+
80
+ None.
81
+
82
+ # Unresolved questions
83
+
84
+ None.
85
+
86
+ # Future possibilities
87
+
88
+ We can implement the append API for services that natively support append, such as [Azure blob](https://learn.microsoft.com/en-us/rest/api/storageservices/append-block?tabs=azure-ad) and [Alibaba cloud OSS](https://www.alibabacloud.com/help/en/object-storage-service/latest/appendobject). This will improve the performance and reliability of the append operation.
@@ -0,0 +1,99 @@
1
+ - Proposal Name: `chain_based_operator_api`
2
+ - Start Date: 2023-05-23
3
+ - RFC PR: [apache/opendal#2299](https://github.com/apache/opendal/pull/2299)
4
+ - Tracking Issue: [apache/opendal#2300](https://github.com/apache/opendal/issues/2300)
5
+
6
+ # Summary
7
+
8
+ Add a chain-based `Operator` API to replace `OpXxx`.
9
+
10
+ # Motivation
11
+
12
+ OpenDAL provides `xxx_with` API for users to add more options for requests:
13
+
14
+ ```rust
15
+ let bs = op.read_with("path/to/file", OpRead::new().with_range(0..=1024)).await?;
16
+ ```
17
+
18
+ However, the API's usability is hindered as users are required to create a new `OpXxx` struct. The API call can be excessively verbose:
19
+
20
+ ```rust
21
+ let bs = op.read_with(
22
+ "path/to/file",
23
+ OpRead::new()
24
+ .with_range(0..=1024)
25
+ .with_if_match("<etag>")
26
+ .with_if_none_match("<etag>")
27
+ .with_override_cache_control("<cache_control>")
28
+ .with_override_content_disposition("<content_disposition>")
29
+ ).await?;
30
+ ```
31
+
32
+
33
+ # Guide-level explanation
34
+
35
+ In this proposal, I plan to introduce a chain-based `Operator` API that is friendlier to use:
36
+
37
+ ```rust
38
+ let bs = op.read_with("path/to/file")
39
+ .range(0..=1024)
40
+ .if_match("<etag>")
41
+ .if_none_match("<etag>")
42
+ .override_cache_control("<cache_control>")
43
+ .override_content_disposition("<content_disposition>")
44
+ .await?;
45
+ ```
46
+
47
+ By eliminating the usage of `OpXxx`, our users can write code that is more readable.
48
+
49
+ # Reference-level explanation
50
+
51
+ To implement the chain-based API, we will change `read_with` as follows:
52
+
53
+ ```diff
54
+ - pub async fn read_with(&self, path: &str, args: OpRead) -> Result<Vec<u8>>
55
+ + pub fn read_with(&self, path: &str) -> FutureRead
56
+ ```
57
+
58
+ `FutureRead` will implement `Future<Output=Result<Vec<u8>>>`, so that users can still call `read_with` like the following:
59
+
60
+ ```rust
61
+ let bs = op.read_with("path/to/file").await?;
62
+ ```
63
+
64
+ For blocking operations, we will change `read_with` as follows:
65
+
66
+ ```diff
67
+ - pub fn read_with(&self, path: &str, args: OpRead) -> Result<Vec<u8>>
68
+ + pub fn read_with(&self, path: &str) -> FunctionRead
69
+ ```
70
+
71
+ `FunctionRead` will implement `call(self) -> Result<Vec<u8>>`, so that users can call `read_with` like the following:
72
+
73
+ ```rust
74
+ let bs = op.read_with("path/to/file").call()?;
75
+ ```
76
+
77
+ After this change, all `OpXxx` structs will be moved to the raw API.
78
+
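The blocking variant can be sketched as a plain builder whose `call` consumes it. This is an illustrative sketch only: `BlockingOperator` here is a hypothetical in-memory stand-in that serves the same bytes for every path, the error type is a `String`, and only the `range` option is modeled.

```rust
use std::ops::RangeInclusive;

/// Hypothetical stand-in for a blocking operator backed by one in-memory file.
pub struct BlockingOperator {
    data: Vec<u8>,
}

/// Sketch of `FunctionRead`: collects options, then runs when `call` is invoked.
pub struct FunctionRead<'a> {
    op: &'a BlockingOperator,
    path: String,
    range: Option<RangeInclusive<usize>>,
}

impl BlockingOperator {
    pub fn new(data: Vec<u8>) -> Self {
        BlockingOperator { data }
    }

    /// Returns a builder instead of performing the read immediately.
    pub fn read_with<'a>(&'a self, path: &str) -> FunctionRead<'a> {
        FunctionRead { op: self, path: path.to_string(), range: None }
    }
}

impl<'a> FunctionRead<'a> {
    /// Chainable option setter: takes `self` by value and returns it.
    pub fn range(mut self, range: RangeInclusive<usize>) -> Self {
        self.range = Some(range);
        self
    }

    /// Executes the read with the options collected so far.
    pub fn call(self) -> Result<Vec<u8>, String> {
        let FunctionRead { op, path, range } = self;
        match range {
            None => Ok(op.data.clone()),
            Some(r) => match op.data.get(r) {
                Some(slice) => Ok(slice.to_vec()),
                None => Err(format!("range out of bounds for {}", path)),
            },
        }
    }
}
```

Because each setter takes and returns `self`, options can be chained in any order and nothing runs until `call()`; the async `FutureRead` would follow the same shape with `.await` in place of `call()`.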
79
+ # Drawbacks
80
+
81
+ None
82
+
83
+ # Rationale and alternatives
84
+
85
+ None
86
+
87
+ # Prior art
88
+
89
+ None
90
+
91
+ # Unresolved questions
92
+
93
+ None
94
+
95
+ # Future possibilities
96
+
97
+ ## Change API after fn_traits stabilized
98
+
99
+ After [fn_traits](https://github.com/rust-lang/rust/issues/29625) is stabilized, we will implement `FnOnce` for `FunctionXxx` instead of `call`.
@@ -0,0 +1,138 @@
1
+ - Proposal Name: `object_versioning`
2
+ - Start Date: 2023-07-06
3
+ - RFC PR: [apache/opendal#2602](https://github.com/apache/opendal/pull/2602)
4
+ - Tracking Issue: [apache/opendal#2611](https://github.com/apache/opendal/issues/2611)
5
+
6
+ # Summary
7
+
8
+ This proposal describes the object versioning (or object version control) feature of OpenDAL.
9
+
10
+ # Motivation
11
+
12
+ Object storage services are a kind of storage service that
13
+ provides a simple and scalable way to store, organize, and access unstructured data.
14
+ These services store data as objects within buckets.
15
+ An object is a file together with any metadata that describes it; a bucket is a container for objects.
16
+
17
+ The object versioning provided by these services is a very useful feature.
18
+ It allows users to keep multiple versions of an object in the same bucket.
19
+ If users enable object versioning, each object will have a history of versions.
20
+ Each version will have a unique version ID, which is a string that is unique for each version of an object.
21
+
22
+ (The object, bucket,
23
+ and version ID mentioned here are all concepts of object storage services.
24
+ They may be named differently in different services,
25
+ but they refer to the same things.)
26
+
27
+ OpenDAL provides support for some of those services, such as S3, GCS, Azure Blob Storage, etc.
28
+ Now we want to add support for object versioning to OpenDAL.
29
+
30
+ # Guide-level explanation
31
+
32
+ When object versioning is enabled, the following operations will be supported:
33
+
34
+ - `stat`: Get the metadata of an object with a specific version ID.
35
+ - `read`: Read a specific version of an object.
36
+ - `delete`: Delete a specific version of an object.
37
+
38
+ Code example:
39
+
40
+ ```rust
41
+ // To get the current version ID of a file
42
+ let meta = op.stat("path/to/file").await?;
43
+ let version_id = meta.version().expect("just for example");
44
+
45
+ // To fetch the metadata of a specific version of a file
46
+ let meta = op.stat_with("path/to/file").version("version_id").await?;
47
+ let version_id = meta.version().expect("just for example"); // get the version ID
48
+
49
+ // To read a file with a specific version ID
50
+ let content = op.read_with("path/to/file").version("version_id").await?;
51
+
52
+ // To delete a file with a specific version ID
53
+ op.delete_with("path/to/file").version("version_id").await?;
54
+ ```
55
+
56
+ # Reference-level explanation
57
+
58
+ With object versioning, these operations differ from the normal operations:
59
+
60
+ - `stat`: when getting the metadata of a file, it will always get the metadata of the latest version of the file if no version ID is specified. And there will be a new field `version` in the metadata to indicate the version ID of the file.
61
+ - `read`: when reading a file, it will always read the latest version of the file if no version ID is specified.
62
+ - `delete`: when deleting a file, it will always delete the latest version of the file if no version ID is specified. After that, users will not be able to read the file unless they specify the ID of a version that has not been deleted.
63
+
64
+ With object versioning, writing an object
65
+ always creates a new version of the object rather than overwriting the old one.
66
+ This is imperceptible to the user,
67
+ because the version ID is generated by the service itself: users cannot specify it, and they cannot overwrite a historical version.
68
+
69
+ To implement object versioning, we will do the following:
70
+
71
+ - Add a new field `version` to `OpStat`, `OpRead` and `OpDelete` struct.
72
+ - Add a new field `version` to `ObjectMetadata` struct.
73
+ - Add a new property (setter) `version` to the return values of the `stat_with` and `read_with` methods.
74
+ - Add a new method `delete_with` and add a new property (setter) `version` to the return value of the `delete_with` method.
75
+
76
+ A service backend should support the following operations:
77
+
78
+ - `stat`: Get the metadata of an object with a specific version ID.
79
+ - `read`: Read a specific version of an object.
80
+ - `delete`: Delete a specific version of an object.
81
+
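The semantics described in this section can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not how any service or OpenDAL implements versioning: `VersionedStore`, its counter-based version IDs, and the `latest_deleted` flag are all hypothetical names introduced here.

```rust
use std::collections::HashMap;

/// Version history of one path (hypothetical helper type).
#[derive(Default)]
struct Versions {
    history: Vec<(String, Vec<u8>)>, // (version ID, content), oldest first
    latest_deleted: bool,            // set by an unversioned delete
}

/// Hypothetical in-memory bucket with object versioning enabled.
#[derive(Default)]
pub struct VersionedStore {
    objects: HashMap<String, Versions>,
    next_id: u64,
}

impl VersionedStore {
    /// Writing never overwrites: the store adds a version and picks its ID.
    pub fn write(&mut self, path: &str, content: &[u8]) -> String {
        self.next_id += 1;
        let id = format!("v{}", self.next_id);
        let entry = self.objects.entry(path.to_string()).or_default();
        entry.history.push((id.clone(), content.to_vec()));
        entry.latest_deleted = false;
        id
    }

    /// Reads the given version, or the latest one when `version` is `None`.
    pub fn read(&self, path: &str, version: Option<&str>) -> Option<Vec<u8>> {
        let entry = self.objects.get(path)?;
        let found = match version {
            Some(v) => entry.history.iter().find(|(id, _)| id == v),
            None if entry.latest_deleted => None,
            None => entry.history.last(),
        };
        found.map(|(_, content)| content.clone())
    }

    /// Deletes the given version, or the latest one when `version` is `None`.
    pub fn delete(&mut self, path: &str, version: Option<&str>) {
        if let Some(entry) = self.objects.get_mut(path) {
            match version {
                Some(v) => entry.history.retain(|(id, _)| id != v),
                None => {
                    entry.history.pop();
                    entry.latest_deleted = true;
                }
            }
        }
    }
}
```

After an unversioned `delete`, `read(path, None)` returns `None`, while older versions stay readable through their version IDs until they are deleted explicitly.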
82
+ # Drawbacks
83
+
84
+ None.
85
+
86
+ # Rationale and alternatives
87
+
88
+ ## What is object versioning?
89
+
90
+ Object versioning is a feature that allows users to keep multiple versions of an object in the same bucket.
91
+
92
+ It's a way to preserve, retrieve, and restore every version of every object stored in a bucket.
93
+
94
+ With object versioning, users can easily recover from both unintended user actions and application failures.
95
+
96
+ ## How does object versioning work?
97
+
98
+ When object versioning is enabled, each object will have a history of versions. Each version will have a unique version ID, which is a string that is unique for each version of an object.
99
+
100
+ The version ID is not a timestamp.
101
+ It is not guaranteed to be sequential.
102
+ Many object storage services produce object version IDs by themselves, using their own algorithms.
103
+ Users cannot specify the version ID when writing an object.
104
+
105
+ ## Will object versioning affect the existing code?
106
+
107
+ Writing an object works the same way for the user whether or not object versioning is enabled.
108
+ When versioning is enabled, the storage service creates a new version of the object instead of overwriting the old one.
109
+ But here it is imperceptible to the user.
110
+
111
+ ## What are the benefits of object versioning?
112
+
113
+ With object versioning, users can:
114
+
115
+ - Track the history of a file.
116
+ - Implement optimistic concurrency control.
117
+ - Implement a simple backup system.
118
+
119
+ ## References
120
+
121
+ - [AWS S3 Object Versioning](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html)
122
+ - [How does AWS S3 object versioning work?](https://docs.aws.amazon.com/AmazonS3/latest/userguide/versioning-workflows.html)
123
+ - [How to enable object versioning for a bucket in AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/manage-versioning-examples.html)
124
+ - [Google Cloud Storage Object Versioning](https://cloud.google.com/storage/docs/object-versioning)
125
+ - [Azure Blob Storage Object Versioning](https://docs.microsoft.com/en-us/azure/storage/blobs/versioning-overview)
126
+
127
+ # Prior art
128
+
129
+ None.
130
+
131
+ # Unresolved questions
132
+
133
+ None.
134
+
135
+ # Future possibilities
136
+
137
+ Implement a new method `list_versions` (list all versions of an object).
138
+
@@ -0,0 +1,79 @@
1
+ - Proposal Name: `merge_append_into_write`
2
+ - Start Date: 2023-08-02
3
+ - RFC PR: [apache/opendal#2758](https://github.com/apache/opendal/pull/2758)
4
+ - Tracking Issue: [apache/opendal#2760](https://github.com/apache/opendal/issues/2760)
5
+
6
+ # Summary
7
+
8
+ Merge the `appender` API into `writer` by introducing a new `writer_with.append(true)` method to enable append mode.
9
+
10
+ # Motivation
11
+
12
+ Currently OpenDAL has separate `appender` and `writer` APIs:
13
+
14
+ ```rust
15
+ let mut appender = op.appender_with("file.txt").await?;
16
+
17
+ appender.append(bs).await?;
18
+ appender.append(bs).await?;
19
+ ```
20
+
21
+ This duplication forces users to learn two different APIs for writing data.
22
+
23
+ Making this change brings the following benefits:
24
+
25
+ - Simpler API surface - users only need to learn one writing API.
26
+ - Reduce code duplication between append and write implementations.
27
+ - Atomic append semantics are handled internally in `writer`.
28
+ - Reuse the `sink` API for both `overwrite` and `append` mode.
29
+
30
+ # Guide-level explanation
31
+
32
+ The new approach is to enable append mode on `writer`:
33
+
34
+ ```rust
35
+ let mut writer = op.writer_with("file.txt").append(true).await?;
36
+
37
+ writer.write(bs).await?; // appends to file
38
+ writer.write(bs).await?; // appends again
39
+ ```
40
+
41
+ Calling `writer_with.append(true)` will start the writer in append mode. Subsequent `write()` calls will append rather than overwrite.
42
+
43
+ There is no longer a separate `appender` API.
44
+
45
+ # Reference-level explanation
46
+
47
+ We will add an `append` flag into `OpWrite`:
48
+
49
+ ```rust
50
+ impl OpWrite {
51
+ pub fn with_append(mut self, append: bool) -> Self {
52
+ self.append = append;
53
+ self
54
+ }
55
+ }
56
+ ```
57
+
58
+ All services need to check the `append` flag and handle append mode accordingly. Services that do not support append should return an `Unsupported` error instead.
59
+
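How a service might honor the flag can be sketched as follows. This is a simplified illustration: the `OpWrite` below carries only the `append` field, `WriteError` and `MemoryService` are hypothetical names, and the real write path is asynchronous and goes through a writer rather than a single call.

```rust
/// Simplified `OpWrite` carrying only the proposed `append` flag.
#[derive(Debug, Default, Clone)]
pub struct OpWrite {
    append: bool,
}

impl OpWrite {
    pub fn with_append(mut self, append: bool) -> Self {
        self.append = append;
        self
    }

    pub fn append(&self) -> bool {
        self.append
    }
}

/// Hypothetical error type; only the `Unsupported` case is modeled.
#[derive(Debug, PartialEq)]
pub enum WriteError {
    Unsupported,
}

/// Hypothetical in-memory service holding a single file.
pub struct MemoryService {
    pub supports_append: bool,
    pub content: Vec<u8>,
}

impl MemoryService {
    /// Checks the `append` flag and either appends or overwrites.
    pub fn write(&mut self, args: &OpWrite, bs: &[u8]) -> Result<(), WriteError> {
        if args.append() {
            if !self.supports_append {
                return Err(WriteError::Unsupported);
            }
            self.content.extend_from_slice(bs);
        } else {
            self.content = bs.to_vec();
        }
        Ok(())
    }
}
```

A service without append support fails fast on `OpWrite::default().with_append(true)` and leaves the existing content untouched, while the default (overwrite) mode keeps working everywhere.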
60
+ # Drawbacks
61
+
62
+ - `writer` API is more complex with the append mode flag.
63
+ - Internal implementation must handle both overwrite and append logic.
64
+
65
+ # Rationale and alternatives
66
+
67
+ None
68
+
69
+ # Prior art
70
+
71
+ Python's file open() supports an `"a"` mode flag to enable append-only writing.
72
+
73
+ # Unresolved questions
74
+
75
+ None
76
+
77
+ # Future possibilities
78
+
79
+ None
@@ -0,0 +1,66 @@
1
+ - Proposal Name: `lister_api`
2
+ - Start Date: 2023-08-04
3
+ - RFC PR: [apache/opendal#2774](https://github.com/apache/opendal/pull/2774)
4
+ - Tracking Issue: [apache/opendal#2775](https://github.com/apache/opendal/issues/2775)
5
+
6
+ # Summary
7
+
8
+ Add `lister` API to align with other OpenDAL APIs like `read`/`reader`.
9
+
10
+ # Motivation
11
+
12
+ Currently OpenDAL has `list` APIs like:
13
+
14
+ ```rust
15
+ let lister = op.list().await?;
16
+ ```
17
+
18
+ This is inconsistent with APIs like `read`/`reader` and can confuse users.
19
+
20
+ We should add a new `lister` API and change `list` in order to:
21
+
22
+ - Align with other OpenDAL APIs
23
+ - Simplify usage
24
+
25
+ # Guide-level explanation
26
+
27
+ The new APIs will be:
28
+
29
+ ```rust
30
+ let entries = op.list().await?; // Get entries directly
31
+
32
+ let lister = op.lister().await?; // Get lister
33
+ ```
34
+
35
+ - `op.list()` returns entries directly.
36
+ - `op.lister()` returns a lister that lets users list entries on demand.
37
+
38
+ # Reference-level explanation
39
+
40
+ We will:
41
+
42
+ - Rename existing `list` to `lister`
43
+ - Add new `list` method to call `lister` and return all entries
44
+ - Merge `scan` into `list_with` with `delimiter("")`
45
+
46
+ This keeps the pagination logic encapsulated in `lister`.
47
+
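The relationship between the two methods can be sketched with a synchronous iterator. This is an illustration only: the `Operator` and `Lister` below are hypothetical stand-ins, pages are pre-loaded vectors of paths rather than service responses, and the real API is asynchronous.

```rust
/// Hypothetical lister that yields entries page by page.
pub struct Lister {
    pages: std::vec::IntoIter<Vec<String>>,
    current: std::vec::IntoIter<String>,
}

impl Iterator for Lister {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        loop {
            if let Some(entry) = self.current.next() {
                return Some(entry);
            }
            // Current page is exhausted: move to the next page, or stop if none is left.
            self.current = self.pages.next()?.into_iter();
        }
    }
}

/// Hypothetical operator whose listing result is split into pages.
pub struct Operator {
    pages: Vec<Vec<String>>,
}

impl Operator {
    pub fn new(pages: Vec<Vec<String>>) -> Self {
        Operator { pages }
    }

    /// Returns a lister so that callers can pull entries on demand.
    pub fn lister(&self) -> Lister {
        Lister {
            pages: self.pages.clone().into_iter(),
            current: Vec::new().into_iter(),
        }
    }

    /// Drives `lister` to completion and returns every entry at once.
    pub fn list(&self) -> Vec<String> {
        self.lister().collect()
    }
}
```

`list` is a thin wrapper over `lister`, so page handling (including empty pages) lives in one place.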
48
+ # Drawbacks
49
+
50
+ None
51
+
52
+ # Rationale and alternatives
53
+
54
+ None
55
+
56
+ # Prior art
57
+
58
+ None
59
+
60
+ # Unresolved questions
61
+
62
+ None
63
+
64
+ # Future possibilities
65
+
66
+ None
@@ -0,0 +1,143 @@
1
+ - Proposal Name: `list_with_metakey`
2
+ - Start Date: 2023-08-04
3
+ - RFC PR: [apache/opendal#2779](https://github.com/apache/opendal/pull/2779)
4
+ - Tracking Issue: [apache/opendal#2802](https://github.com/apache/opendal/issues/2802)
5
+
6
+ # Summary
7
+
8
+ Move the `Operator` `metadata` API to `list_with().metakey()` to simplify usage.
9
+
10
+ # Motivation
11
+
12
+ The current `Entry` metadata API is:
13
+
14
+ ```rust
15
+ use opendal::Entry;
16
+ use opendal::Metakey;
17
+
18
+ let meta = op
19
+ .metadata(&entry, Metakey::ContentLength | Metakey::ContentType)
20
+ .await?;
21
+ ```
22
+
23
+ This API is difficult to understand and rarely used correctly. And in reality, users always fetch the same set of metadata during listing.
24
+
25
+ Take one of our users' code as an example:
26
+
27
+ ```rust
28
+ let stream = self
29
+ .inner
30
+ .scan(&path)
31
+ .await
32
+ .map_err(|err| format_object_store_error(err, &path))?;
33
+
34
+ let stream = stream.then(|res| async {
35
+ let entry = res.map_err(|err| format_object_store_error(err, ""))?;
36
+ let meta = self
37
+ .inner
38
+ .metadata(&entry, Metakey::ContentLength | Metakey::LastModified)
39
+ .await
40
+ .map_err(|err| format_object_store_error(err, entry.path()))?;
41
+
42
+ Ok(format_object_meta(entry.path(), &meta))
43
+ });
44
+
45
+ Ok(stream.boxed())
46
+ ```
47
+
48
+ By moving metadata to `lister`, our user code can be simplified to:
49
+
50
+ ```rust
51
+ let stream = self
52
+ .inner
53
+ .scan_with(&path)
54
+ .metakey(Metakey::ContentLength | Metakey::LastModified)
55
+ .await
56
+ .map_err(|err| format_object_store_error(err, &path))?;
57
+
58
+ let stream = stream.then(|res| async {
59
+ let entry = res.map_err(|err| format_object_store_error(err, ""))?;
60
+ let meta = entry.metadata(); // borrow, so `entry.path()` stays usable below
61
+
62
+ Ok(format_object_meta(entry.path(), &meta))
63
+ });
64
+
65
+ Ok(stream.boxed())
66
+ ```
67
+
68
+ By introducing this change:
69
+
70
+ - Users don't need to capture `Operator` in the closure.
71
+ - Users don't need to make another async call like `metadata()`.
72
+
73
+ If we don't have this change:
74
+
75
+ - Every place that could receive a `fn()` must use `Fn()` instead, which forces users to have a generic parameter in their code.
76
+ - It's harder for other language bindings to implement `op.metadata()` correctly.
77
+
78
+ # Guide-level explanation
79
+
80
+ The new API will be:
81
+
82
+ ```rust
83
+ let entries: Vec<Entry> = op
84
+ .list_with("dir")
85
+ .metakey(Metakey::ContentLength | Metakey::ContentType).await?;
86
+
87
+ let meta: &Metadata = entries[0].metadata();
88
+ ```
89
+
90
+ Metadata can be queried directly when listing entries via `metadata()`, and later extracted via `into_parts()`.
91
+
92
+ # Reference-level explanation
93
+
94
+ ## How metakey works
95
+
96
+ For every service, `stat` will return the full set of its metadata. For example, `s3` will return `ContentLength | ContentType | LastModified | ...`, and `fs` will return `ContentLength | LastModified`. Most services will return part of that metadata during `list`: `s3` will return `ContentLength` and `LastModified`, but `fs` returns none of them.
97
+
98
+ So when users use `list` to list entries, they will get a list of entries with incomplete metadata. The metadata could be in three states:
99
+
100
+ - Filled: the metadata is returned in `list`
101
+ - NotExist: the metadata is not supported by service.
102
+ - Unknown: the metadata is supported by service but not returned in `list`.
103
+
104
+ By accepting `metakey`, we can compare the returned entry's metadata with the metakey:
105
+
106
+ - Return the entry if the metakey is already met by `Filled` and `NotExist`.
107
+ - Send `stat` call to fetch the metadata if metadata is `Unknown`.
108
+
109
+ ## Changes
110
+
111
+ We will add `metakey` into `OpList`. Underlying services can use this information to try their best to fetch the metadata.
112
+
113
+ There are the following possibilities:
114
+
115
+ - The entry metadata is met: `Lister` returns the entry directly.
116
+ - The entry metadata is not met and not fully filled: `Lister` will try to send a `stat` call to fetch the metadata.
117
+ - The entry metadata is not met and fully filled: `Lister` will return the entry directly.
118
+
119
+ To make sure we can handle all metadata correctly, we will add a new capability called `stat_complete_metakey`. This capability will be used to indicate the complete set of metadata that can be fetched via a `stat` call. For example, `s3` can set this capability to `ContentLength | ContentType | LastModified | ...`, and `fs` only has `ContentLength | LastModified`. `Lister` can use this capability to decide whether to send a `stat` call or not.
120
+
121
+ Services' lister implementations should not change.
122
+
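The decision `Lister` has to make can be sketched with plain bit flags. This is a minimal sketch under assumptions: the `u32` constants stand in for `Metakey`, `needs_stat` is a hypothetical helper name, and the service values mirror the `s3` and `fs` examples given in this RFC.

```rust
/// Hypothetical bit flags standing in for `Metakey`.
pub const CONTENT_LENGTH: u32 = 1 << 0;
pub const CONTENT_TYPE: u32 = 1 << 1;
pub const LAST_MODIFIED: u32 = 1 << 2;

/// Returns true when `Lister` would need an extra `stat` call for an entry.
///
/// - `requested`: the metakey passed by the user
/// - `filled`: the metadata already returned by `list` (`Filled`)
/// - `stat_complete`: everything `stat` can return for this service
pub fn needs_stat(requested: u32, filled: u32, stat_complete: u32) -> bool {
    // Keys outside `stat_complete` are `NotExist`: no call could ever fetch them.
    let obtainable = requested & stat_complete;
    // Keys that are obtainable but not yet filled are `Unknown`.
    let unknown = obtainable & !filled;
    unknown != 0
}
```

With the `s3` values, asking for `CONTENT_LENGTH | LAST_MODIFIED` needs no extra call because `list` already fills both, while adding `CONTENT_TYPE` triggers a `stat`. With the `fs` values, asking only for `CONTENT_TYPE` never triggers a `stat`, since the service cannot provide it at all.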
123
+ # Drawbacks
124
+
125
+ None
126
+
127
+ # Rationale and alternatives
128
+
129
+ Keeping the complex standalone API has limited benefit given low usage.
130
+
131
+ # Prior art
132
+
133
+ None
134
+
135
+ # Unresolved questions
136
+
137
+ None
138
+
139
+ # Future possibilities
140
+
141
+ ## Add glob and regex support for Lister
142
+
143
+ We can add `glob` and `regex` support for `Lister` to make it more powerful.
@@ -0,0 +1,58 @@
1
+ - Proposal Name: `native_capability`
2
+ - Start Date: 2023-08-11
3
+ - RFC PR: [apache/opendal#2852](https://github.com/apache/opendal/pull/2852)
4
+ - Tracking Issue: [apache/opendal#2859](https://github.com/apache/opendal/issues/2859)
5
+
6
+ # Summary
7
+
8
+ Add `native_capability` and `full_capability` to `Operator` so that users can make more informed decisions.
9
+
10
+ # Motivation
11
+
12
+ OpenDAL adds `Capability` to inform users whether a service supports a specific feature. However, this is not enough for users to make decisions. OpenDAL doesn't simply expose the services' API directly; instead, it simulates the behavior to make it more useful.
13
+
14
+ For example, `s3` doesn't support seek operations like a local file system. But it's a quite common operation for users. So OpenDAL will try to simulate the behavior by calculating the correct offset and reading the data from that offset instead. After this simulation, the `s3` service has the `read_can_seek` capability now.
15
+
16
+ As another example, most services like `s3` don't support blocking operations. OpenDAL implements a `BlockingLayer` to make it possible. After this implementation, the `s3` service has the `blocking` capability now.
17
+
18
+ However, these capabilities alone are insufficient for users to make informed decisions. Take the `s3` service's `blocking` capability as an example. Users are unable to determine whether it is a native capability or not, which may result in them unknowingly utilizing this feature in performance-sensitive scenarios, leading to significantly poor performance.
19
+
20
+ So this proposal intends to address this issue by adding `native_capability` and `full_capability` to `OperatorInfo`. Users can use `native_capability` to determine whether a capability is native or not.
21
+
22
+ # Guide-level explanation
23
+
24
+ We will add two new APIs `native_capability()` and `full_capability()` in `OperatorInfo`, and remove the `capability()` and related `can_xxx()` API.
25
+
26
+ ```diff
27
+ + pub fn native_capability(&self) -> Capability
28
+ + pub fn full_capability(&self) -> Capability
29
+ - pub fn capability(&self) -> Capability
30
+ ```
31
+
32
+ # Reference-level explanation
33
+
34
+ We will add two new fields `native_capability` and `full_capability` in `AccessorInfo`:
35
+
36
+ - Services SHOULD only set `native_capability`, and `full_capability` will be the same as `native_capability`.
37
+ - Layers MAY change `full_capability` and MUST NOT modify `native_capability`.
38
+ - `OperatorInfo` should forward `native_capability()` and `full_capability()` to `AccessorInfo`.
39
+
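The three rules above can be sketched as follows. This is an illustration under assumptions: `Capability` is reduced to two boolean fields, and `from_service` and `with_layer` are hypothetical helper names, not the actual `AccessorInfo` API.

```rust
/// Reduced capability set with two illustrative flags.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Capability {
    pub read: bool,
    pub blocking: bool,
}

/// Sketch of `AccessorInfo` holding both capability sets.
#[derive(Debug, Clone, Copy)]
pub struct AccessorInfo {
    native: Capability,
    full: Capability,
}

impl AccessorInfo {
    /// Services set only the native capability; `full` starts out identical.
    pub fn from_service(native: Capability) -> Self {
        AccessorInfo { native, full: native }
    }

    pub fn native_capability(&self) -> Capability {
        self.native
    }

    pub fn full_capability(&self) -> Capability {
        self.full
    }

    /// Layers may extend `full`; they never get access to `native`.
    pub fn with_layer(mut self, f: impl FnOnce(&mut Capability)) -> Self {
        f(&mut self.full);
        self
    }
}
```

A blocking layer applied to an `s3`-like service would flip `blocking` in `full_capability()` only, so a user checking `native_capability().blocking` can tell that the feature is simulated.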
40
+ # Drawbacks
41
+
42
+ None
43
+
44
+ # Rationale and alternatives
45
+
46
+ None
47
+
48
+ # Prior art
49
+
50
+ None
51
+
52
+ # Unresolved questions
53
+
54
+ None
55
+
56
+ # Future possibilities
57
+
58
+ None