bucket_store 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/README.md +167 -0
- data/lib/bucket_store/configuration.rb +19 -0
- data/lib/bucket_store/disk.rb +84 -0
- data/lib/bucket_store/gcs.rb +82 -0
- data/lib/bucket_store/in_memory.rb +62 -0
- data/lib/bucket_store/key_context.rb +36 -0
- data/lib/bucket_store/key_storage.rb +158 -0
- data/lib/bucket_store/s3.rb +79 -0
- data/lib/bucket_store/timing.rb +19 -0
- data/lib/bucket_store/uri_builder.rb +15 -0
- data/lib/bucket_store/version.rb +5 -0
- data/lib/bucket_store.rb +59 -0
- metadata +182 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: c812fe38b957031d6fae686c11a3e615ac5a2575b563ebf96b8f0a019513d363
  data.tar.gz: 3165a65032fc35ca76ed983a7477f0cb1a13de62d1f2b68ad6cb24e4622a0eb4
SHA512:
  metadata.gz: 3be0d30f1857185217c5f3b461809bf436ca2306426791af4051a2c725229d246d272ffd85aaef90792102e59c8100e66422f9c43ab9354cfb01ab4b7971e6fe
  data.tar.gz: 9ea6b6baadf89e29b0b74d00a9ca5c743088c50d75c1867348ba48ce14ac304160f205b881747ccf4f54b3dc50f631e27f2042443e2ffbfc69913e4f1703bbb2
data/README.md
ADDED
@@ -0,0 +1,167 @@
# BucketStore

An abstraction layer on top of cloud file storage systems such as Google Cloud
Storage or S3. This module exposes a generic interface that allows interoperability
between different storage options. Callers don't need to worry about the specifics
of where and how a file is stored and retrieved as long as the given key is valid.

Keys within `BucketStore` are URI strings that can universally locate an object
in the given provider. A valid key example would be
`gs://a-gcs-bucket/file/path.json`.

## Usage

This library is distributed as a Ruby gem, and we recommend adding it to your Gemfile:

```ruby
gem "bucket-store"
```

Some attributes can be configured via `BucketStore.configure`. If using Rails, you'll want
to add a new initializer for `BucketStore`. Example:

```ruby
BucketStore.configure do |config|
  config.logger = Logger.new($stderr)
end
```

If using RSpec, you'll probably want to add this line to RSpec's config block (see
the *Adapters* section for more details):

```ruby
config.before { BucketStore::InMemory.reset! }
```

For our policy on compatibility with Ruby versions, see [COMPATIBILITY.md](docs/COMPATIBILITY.md).

## Design and Architecture

The main principle behind `BucketStore` is that each resource or group of resources must
be unequivocally identifiable by a URI. The URI is always composed of three parts:

- the "adapter" used to fetch the resource (see "Adapters" below)
- the "bucket" where the resource lives
- the path to the resource(s)

As an example, all of the following are valid URIs:

- `gs://gcs-bucket/path/to/file.xml`
- `inmemory://bucket/separator/file.xml`
- `disk://hello/path/to/file.json`

Even though `BucketStore`'s main goal is to be an abstraction layer on top of systems such
as S3 or Google Cloud Storage, where the "path" to a resource is in practice a unique
identifier as a whole (i.e. the `/` is not a directory separator but rather part of the
key's name), we assume that clients will want some sort of hierarchical separation of
resources, and that such separation is achieved by delimiting each part of the hierarchy
with `/`.

This means that the following are also valid URIs in `BucketStore`, but they refer to
all the resources under that specific hierarchy:

- `gs://gcs-bucket/path/subpath/`
- `inmemory://bucket/separator/`
- `disk://hello/path`
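The three-part split above can be sketched with Ruby's standard `URI` library. The helper below is hypothetical, written for illustration only (it is not the gem's API, though `BucketStore` does something similar internally):

```ruby
require "uri"

# Hypothetical helper for illustration: split a BucketStore-style key URI
# into its adapter (the scheme), bucket (the host) and key (the path).
def split_key(raw_key)
  uri = URI(raw_key)
  { adapter: uri.scheme, bucket: uri.host, key: uri.path.sub(%r{\A/}, "") }
end

split_key("gs://gcs-bucket/path/to/file.xml")
# => {:adapter=>"gs", :bucket=>"gcs-bucket", :key=>"path/to/file.xml"}
```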
## Configuration

`BucketStore` exposes some configurable attributes via `BucketStore.configure`. If
necessary, this should be called at startup time, before any other method is invoked.

- `logger`: a custom logger class. By default, logs are sent to stdout.

## Adapters

`BucketStore` comes with 4 built-in adapters:

- `gs`: the Google Cloud Storage adapter
- `s3`: the S3 adapter
- `disk`: a disk-based adapter
- `inmemory`: an in-memory store

### GS adapter

This is the adapter for Google Cloud Storage. `BucketStore` assumes that the authorisation
for accessing the resources has been set up outside of the gem.

### S3 adapter

This is the adapter for S3. `BucketStore` assumes that the authorisation for accessing
the resources has been set up outside of the gem (see also
https://docs.aws.amazon.com/sdk-for-ruby/v3/api/index.html#Configuration).

### Disk adapter

A disk-backed key-value store. This adapter will create a temporary directory where
all the files will be written to and read from. The base directory can be explicitly
defined by setting the `DISK_ADAPTER_BASE_DIR` environment variable; otherwise a
temporary directory will be created.

### In-memory adapter

An in-memory key-value store. This works just like the disk adapter, except that
the content of all the files is stored in memory, which is particularly useful for
testing. Note that content added to this adapter will persist for the lifetime of
the application, as it's not possible to create different instances of the same adapter.
In general, this is not what's expected during testing, where the content of the bucket
should be reset between tests. The adapter provides an easy way to reset the content
via a `.reset!` method. In RSpec this would translate to adding this line to the
`spec_helper`:

```ruby
config.before { BucketStore::InMemory.reset! }
```

## BucketStore vs ActiveStorage

ActiveStorage is a common framework for accessing cloud storage systems that ships
with Rails. In general, ActiveStorage provides a lot more than BucketStore does
(including many more adapters), however the two libraries have different use cases
in mind:

- ActiveStorage requires you to define every possible bucket you're planning to use
  ahead of time in a YAML file. This works well in most cases, but if you plan to
  use a lot of buckets it soon becomes impractical. We think the BucketStore approach
  works much better in this case.
- BucketStore does not provide ways to manipulate the content, whereas ActiveStorage does.
  If you plan to apply transformations to the content before uploading or after
  downloading it, then ActiveStorage is probably the library for you. With that said,
  it's still possible to do these transformations outside of BucketStore, and in fact we've
  found the explicitness of this approach a desirable property.
- The BucketStore approach makes any resource on a cloud storage system uniquely identifiable
  via a single URI, which means it's normally enough to pass that string around different
  systems to access the resource without ambiguity. As the URI also includes
  the adapter, it's possible, for example, to download a `disk://dir/input_file` and
  upload it to a `gs://bucket/output_file` all through a single interface.
  ActiveStorage is instead focused on persisting an equivalent reference on a Rails model.
  If your application does not use Rails, does not need to persist the reference, or
  just requires more flexibility in general, then BucketStore is probably the library for
  you.

## Examples

### Uploading a file to a bucket
```ruby
BucketStore.for("inmemory://bucket/path/file.xml").upload!("hello world")
=> "inmemory://bucket/path/file.xml"
```

### Accessing a file in a bucket
```ruby
BucketStore.for("inmemory://bucket/path/file.xml").download
=> {:bucket=>"bucket", :key=>"path/file.xml", :content=>"hello world"}
```

### Listing all keys under a prefix
```ruby
BucketStore.for("inmemory://bucket/path/").list
=> ["inmemory://bucket/path/file.xml"]
```

### Deleting a file
```ruby
BucketStore.for("inmemory://bucket/path/file.xml").delete!
=> true
```

## License & Contributing

* BucketStore is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
* Bug reports and pull requests are welcome on GitHub at https://github.com/gocardless/file-storage.

GoCardless ♥ open source. If you do too, come [join us](https://gocardless.com/about/careers/).
data/lib/bucket_store/configuration.rb
ADDED
@@ -0,0 +1,19 @@
# frozen_string_literal: true

require "logger"

module BucketStore
  class Configuration
    def logger
      @logger ||= Logger.new($stdout)
    end

    # Specifies a custom logger.
    #
    # Note that {BucketStore} uses structured logging; any custom logger passed must also
    # support it.
    #
    # @example Use stderr as the main output device
    #   config.logger = Logger.new($stderr)
    # @!attribute logger
    attr_writer :logger
  end
end
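The `||=` in the reader above means the default stdout logger is only built on first use and can be swapped out at any time. A standalone sketch of the same lazy-default pattern, using only stdlib `Logger` (the `LoggerConfig` class name is made up for illustration):

```ruby
require "logger"

# Minimal sketch of the Configuration pattern above: lazily default to a
# stdout logger unless a custom one has been assigned.
class LoggerConfig
  attr_writer :logger

  def logger
    @logger ||= Logger.new($stdout)
  end
end

config = LoggerConfig.new
config.logger            # builds and memoizes the stdout default
config.logger = Logger.new($stderr) # override with a custom logger
```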
data/lib/bucket_store/disk.rb
ADDED
@@ -0,0 +1,84 @@
# frozen_string_literal: true

require "fileutils"
require "pathname"
require "tmpdir"

module BucketStore
  class Disk
    def self.build(base_dir = ENV["DISK_ADAPTER_BASE_DIR"])
      base_dir ||= Dir.tmpdir
      Disk.new(base_dir)
    end

    def initialize(base_dir)
      @base_dir = File.expand_path(base_dir)
    end

    def upload!(bucket:, key:, content:)
      File.open(key_path(bucket, key), "w") do |file|
        file.write(content)
      end
      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      File.open(key_path(bucket, key), "r") do |file|
        {
          bucket: bucket,
          key: key,
          content: file.read,
        }
      end
    end

    def list(bucket:, key:, page_size:)
      root = Pathname.new(bucket_root(bucket))

      Dir["#{root}/**/*"].
        reject { |absolute_path| File.directory?(absolute_path) }.
        map { |full_path| Pathname.new(full_path).relative_path_from(root).to_s }.
        select { |f| f.start_with?(key) }.
        each_slice(page_size).
        map do |keys|
          {
            bucket: bucket,
            keys: keys,
          }
        end.to_enum
    end

    def delete!(bucket:, key:)
      File.unlink(key_path(bucket, key))

      true
    end

    private

    attr_reader :base_dir

    def bucket_root(bucket)
      path = File.join(base_dir, sanitize_filename(bucket))
      FileUtils.mkdir_p(path)
      path
    end

    def key_path(bucket, key)
      path = File.join(bucket_root(bucket), sanitize_filename(key))
      path = File.expand_path(path)

      unless path.start_with?(base_dir)
        raise ArgumentError, "Directory traversal out of bucket boundaries: #{key}"
      end

      FileUtils.mkdir_p(File.dirname(path))
      path
    end

    def sanitize_filename(filename)
      filename.gsub(%r{[^0-9A-Za-z.\-/]}, "_")
    end
  end
end
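The `key_path` guard above defends against directory traversal: it expands the joined path and refuses any result that escapes the base directory. A self-contained sketch of just that check (the sample paths are made up):

```ruby
require "tmpdir"

# Sketch of the traversal check in Disk#key_path: expand the candidate path
# and reject it if it no longer sits under the base directory.
base = File.expand_path(Dir.mktmpdir)

inside  = File.expand_path(File.join(base, "bucket", "file.txt"))
outside = File.expand_path(File.join(base, "bucket", "../../etc/passwd"))

inside.start_with?(base)   # stays under base: accepted
outside.start_with?(base)  # escapes base: the adapter raises ArgumentError
```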
data/lib/bucket_store/gcs.rb
ADDED
@@ -0,0 +1,82 @@
# frozen_string_literal: true

require "stringio"
require "uri"

require "google/cloud/storage"

module BucketStore
  class Gcs
    DEFAULT_TIMEOUT_SECONDS = 30

    def self.build(timeout_seconds = DEFAULT_TIMEOUT_SECONDS)
      Gcs.new(timeout_seconds)
    end

    def initialize(timeout_seconds)
      @storage = Google::Cloud::Storage.new(
        timeout: timeout_seconds,
      )
    end

    def upload!(bucket:, key:, content:)
      buffer = StringIO.new(content)
      get_bucket(bucket).create_file(buffer, key)

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      file = get_bucket(bucket).file(key)

      buffer = StringIO.new
      file.download(buffer)

      {
        bucket: bucket,
        key: key,
        content: buffer.string,
      }
    end

    def list(bucket:, key:, page_size:)
      Enumerator.new do |yielder|
        token = nil

        loop do
          page = get_bucket(bucket).files(prefix: key, max: page_size, token: token)
          yielder.yield({
            bucket: bucket,
            keys: page.map(&:name),
          })

          break if page.token.nil?

          token = page.token
        end
      end
    end

    def delete!(bucket:, key:)
      get_bucket(bucket).file(key).delete

      true
    end

    private

    attr_reader :storage

    def get_bucket(name)
      # A lookup only checks that the bucket actually exists before doing any work on it.
      # Unfortunately it also requires a set of extra permissions that are not necessarily
      # going to be granted for service accounts. Given that if the bucket doesn't exist
      # we'll get errors down the line anyway, we can safely skip the lookup without loss
      # of generality.
      storage.bucket(name, skip_lookup: true)
    end
  end
end
data/lib/bucket_store/in_memory.rb
ADDED
@@ -0,0 +1,62 @@
# frozen_string_literal: true

module BucketStore
  class InMemory
    def self.build
      InMemory.instance
    end

    def self.instance
      # rubocop:disable Style/ClassVars
      @@instance ||= new
      # rubocop:enable Style/ClassVars
    end

    def self.reset!
      instance.reset!
    end

    def initialize
      reset!
    end

    def reset!
      @buckets = Hash.new { |hash, key| hash[key] = {} }
    end

    def upload!(bucket:, key:, content:)
      @buckets[bucket][key] = content

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      {
        bucket: bucket,
        key: key,
        content: @buckets[bucket].fetch(key),
      }
    end

    def list(bucket:, key:, page_size:)
      @buckets[bucket].keys.
        select { |k| k.start_with?(key) }.
        each_slice(page_size).
        map do |keys|
          {
            bucket: bucket,
            keys: keys,
          }
        end.to_enum
    end

    def delete!(bucket:, key:)
      @buckets[bucket].delete(key)

      true
    end
  end
end
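The `#list` implementation above pages its results with `each_slice`. A self-contained sketch of the same prefix-filter-then-paginate step (the sample keys and `page_size` of 2 are made up):

```ruby
# Sketch of InMemory#list's pagination: keep keys matching the prefix,
# then group them into pages of `page_size` entries.
keys = ["path/a.xml", "path/b.xml", "path/c.xml", "other/d.xml"]

pages = keys.
  select { |k| k.start_with?("path/") }.
  each_slice(2).
  map { |slice| { bucket: "bucket", keys: slice } }
# => [{:bucket=>"bucket", :keys=>["path/a.xml", "path/b.xml"]},
#     {:bucket=>"bucket", :keys=>["path/c.xml"]}]
```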
data/lib/bucket_store/key_context.rb
ADDED
@@ -0,0 +1,36 @@
# frozen_string_literal: true

require "uri"

module BucketStore
  class KeyContext
    attr_reader :adapter, :bucket, :key

    class KeyParseException < RuntimeError; end

    def initialize(adapter:, bucket:, key:)
      @adapter = adapter
      @bucket = bucket
      @key = key
    end

    def to_s
      "<KeyContext adapter:#{adapter} bucket:#{bucket} key:#{key}>"
    end

    def self.parse(raw_key)
      uri = URI(raw_key)

      # A key should never be `nil` but can be empty. Depending on the operation, this may
      # or may not be a valid configuration (e.g. an empty key is likely valid on a
      # `list`, but not during a `download`).
      key = uri.path.sub!(%r{/}, "") || ""

      raise KeyParseException if [uri.scheme, uri.host, key].map(&:nil?).any?

      KeyContext.new(adapter: uri.scheme,
                     bucket: uri.host,
                     key: key)
    end
  end
end
data/lib/bucket_store/key_storage.rb
ADDED
@@ -0,0 +1,158 @@
# frozen_string_literal: true

require "bucket_store/timing"
require "bucket_store/in_memory"
require "bucket_store/gcs"
require "bucket_store/s3"
require "bucket_store/disk"

module BucketStore
  class KeyStorage
    SUPPORTED_ADAPTERS = {
      gs: Gcs,
      s3: S3,
      inmemory: InMemory,
      disk: Disk,
    }.freeze

    attr_reader :bucket, :key, :adapter_type

    def initialize(adapter:, bucket:, key:)
      @adapter_type = adapter.to_sym
      raise "Unknown adapter: #{@adapter_type}" unless SUPPORTED_ADAPTERS.include?(@adapter_type)

      @adapter = SUPPORTED_ADAPTERS.fetch(@adapter_type).build
      @bucket = bucket
      @key = key
    end

    def filename
      File.basename(key)
    end

    # Downloads the content of the reference key
    #
    # @return [Hash<Symbol, Object>]
    #   A hash that includes the download result. The hash keys reference different aspects of the
    #   download (e.g. `:key` and `:content` will include respectively the original key's name and
    #   the actual download's content)
    #
    # @example Download a key
    #   BucketStore.for("inmemory://bucket/file.xml").download
    def download
      raise ArgumentError, "Key cannot be empty" if key.empty?

      BucketStore.logger.info(event: "key_storage.download_started")

      start = BucketStore::Timing.monotonic_now
      result = adapter.download(bucket: bucket, key: key)

      BucketStore.logger.info(event: "key_storage.download_finished",
                              duration: BucketStore::Timing.monotonic_now - start)

      result
    end

    # Uploads the given content to the reference key location.
    #
    # If the `key` already exists, its content will be replaced by the one in input.
    #
    # @param [String] content The content to upload
    # @return [String] The final `key` where the content has been uploaded
    # @example Upload a file
    #   BucketStore.for("inmemory://bucket/file.xml").upload!("hello world")
    def upload!(content)
      raise ArgumentError, "Key cannot be empty" if key.empty?

      BucketStore.logger.info(event: "key_storage.upload_started",
                              **log_context)

      start = BucketStore::Timing.monotonic_now
      result = adapter.upload!(
        bucket: bucket,
        key: key,
        content: content,
      )

      BucketStore.logger.info(event: "key_storage.upload_finished",
                              duration: BucketStore::Timing.monotonic_now - start,
                              **log_context)

      "#{adapter_type}://#{result[:bucket]}/#{result[:key]}"
    end

    # Lists all keys for the current adapter that have the reference key as a prefix.
    #
    # This returns an enumerator of valid keys in the format `adapter://bucket/key`.
    # Underlying adapters will paginate through the result set as the enumerable is
    # consumed; the number of items per page can be controlled via the `page_size`
    # argument.
    #
    # @param [Integer] page_size
    #   the max number of items to fetch for each page of results
    def list(page_size: 1000)
      BucketStore.logger.info(event: "key_storage.list_started")

      start = BucketStore::Timing.monotonic_now
      pages = adapter.list(
        bucket: bucket,
        key: key,
        page_size: page_size,
      )

      page_count = 0
      Enumerator.new do |yielder|
        pages.each do |page|
          page_count += 1
          keys = page.fetch(:keys, []).map { |key| "#{adapter_type}://#{page[:bucket]}/#{key}" }

          BucketStore.logger.info(
            event: "key_storage.list_page_fetched",
            resource_count: keys.count,
            page: page_count,
            duration: BucketStore::Timing.monotonic_now - start,
          )

          keys.each do |key|
            yielder.yield(key)
          end
        end
      end
    end

    # Deletes the referenced key.
    #
    # Note that this method will always return true.
    #
    # @return [bool]
    #
    # @example Delete a file
    #   BucketStore.for("inmemory://bucket/file.txt").delete!
    def delete!
      BucketStore.logger.info(event: "key_storage.delete_started")

      start = BucketStore::Timing.monotonic_now
      adapter.delete!(bucket: bucket, key: key)

      BucketStore.logger.info(event: "key_storage.delete_finished",
                              duration: BucketStore::Timing.monotonic_now - start)

      true
    end

    private

    attr_reader :adapter

    def log_context
      {
        bucket: bucket,
        key: key,
        adapter_type: adapter_type,
      }.compact
    end
  end
end
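`#list` above turns each adapter page back into fully-qualified `adapter://bucket/key` URIs before yielding them. A standalone sketch of just that mapping step (the page data is made up):

```ruby
# Sketch of how KeyStorage#list qualifies the keys from one adapter page.
adapter_type = :inmemory
page = { bucket: "bucket", keys: ["path/a.xml", "path/b.xml"] }

uris = page.fetch(:keys, []).map { |key| "#{adapter_type}://#{page[:bucket]}/#{key}" }
# => ["inmemory://bucket/path/a.xml", "inmemory://bucket/path/b.xml"]
```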
data/lib/bucket_store/s3.rb
ADDED
@@ -0,0 +1,79 @@
# frozen_string_literal: true

require "uri"

require "aws-sdk-s3"

module BucketStore
  class S3
    DEFAULT_TIMEOUT_SECONDS = 30

    def self.build(open_timeout_seconds = DEFAULT_TIMEOUT_SECONDS,
                   read_timeout_seconds = DEFAULT_TIMEOUT_SECONDS)
      S3.new(open_timeout_seconds, read_timeout_seconds)
    end

    def initialize(open_timeout_seconds, read_timeout_seconds)
      @storage = Aws::S3::Client.new(
        http_open_timeout: open_timeout_seconds,
        http_read_timeout: read_timeout_seconds,
      )
    end

    def upload!(bucket:, key:, content:)
      storage.put_object(
        bucket: bucket,
        key: key,
        body: content,
      )

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      file = storage.get_object(
        bucket: bucket,
        key: key,
      )

      {
        bucket: bucket,
        key: key,
        content: file.body.read,
      }
    end

    def list(bucket:, key:, page_size:)
      Enumerator.new do |yielder|
        page = storage.list_objects_v2(bucket: bucket, prefix: key, max_keys: page_size)

        loop do
          yielder.yield({
            bucket: bucket,
            keys: page.contents.map(&:key),
          })

          break unless page.next_page?

          page = page.next_page
        end
      end
    end

    def delete!(bucket:, key:)
      storage.delete_object(
        bucket: bucket,
        key: key,
      )

      true
    end

    private

    attr_reader :storage
  end
end
data/lib/bucket_store/timing.rb
ADDED
@@ -0,0 +1,19 @@
# frozen_string_literal: true

module BucketStore
  module Timing
    # "Wall clock is for telling time, monotonic clock is for measuring time."
    #
    # When timing events, ensure we ask for a monotonically adjusted clock time
    # to avoid changes to the system time from being reflected in our
    # measurements.
    #
    # See this article for a good explanation and a deeper dive:
    # https://blog.dnsimple.com/2018/03/elapsed-time-with-ruby-the-right-way/
    #
    # @return [Float]
    def self.monotonic_now
      Process.clock_gettime(Process::CLOCK_MONOTONIC)
    end
  end
end
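A usage sketch of the monotonic-clock pattern above: timing a block with `CLOCK_MONOTONIC` so adjustments to the system wall clock can't skew the measurement.

```ruby
# Measure elapsed time the way BucketStore::Timing does, via a monotonic clock.
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep 0.05
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
# elapsed covers at least the sleep, regardless of wall-clock changes meanwhile
```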
data/lib/bucket_store/uri_builder.rb
ADDED
@@ -0,0 +1,15 @@
# frozen_string_literal: true

module BucketStore
  module UriBuilder
    # Sanitises the input, as not all characters are valid in either URIs or bucket keys.
    # When we encounter them, we want to replace them with something we can process.
    #
    # @param input [String] the string to sanitise
    # @param replacement [String] the replacement string for invalid characters
    # @return [String] the sanitised string
    def self.sanitize(input, replacement = "__")
      input.gsub(/[{}<>%]/, replacement)
    end
  end
end
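A quick check of the sanitisation above, mirrored as a standalone function (the sample filename is made up):

```ruby
# Mirror of UriBuilder.sanitize: swap characters that are awkward in URIs or
# bucket keys ({, }, <, >, %) for a replacement marker.
def sanitize(input, replacement = "__")
  input.gsub(/[{}<>%]/, replacement)
end

sanitize("report{2021}>final.csv")
# => "report__2021____final.csv"
```

Note that each offending character is replaced individually, so adjacent invalid characters (the `}>` above) produce back-to-back markers.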
data/lib/bucket_store.rb
ADDED
@@ -0,0 +1,59 @@
# frozen_string_literal: true

require "bucket_store/version"
require "bucket_store/configuration"
require "bucket_store/key_context"
require "bucket_store/key_storage"

# An abstraction layer on top of cloud file storage systems such as Google Cloud
# Storage or S3. This module exposes a generic interface that allows interoperability
# between different storage options. Callers don't need to worry about the specifics
# of where and how a file is stored and retrieved as long as the given key is valid.
#
# Keys within the {BucketStore} are URI strings that can universally locate an object
# in the given provider. A valid key example would be:
# `gs://gcs-bucket/file/path.json`.
module BucketStore
  class << self
    attr_writer :configuration

    def configuration
      @configuration ||= BucketStore::Configuration.new
    end

    # Yields a {BucketStore::Configuration} object that allows callers to configure
    # BucketStore's behaviour.
    #
    # @yield [BucketStore::Configuration]
    #
    # @example Configure BucketStore to use a different logger than the default
    #   BucketStore.configure do |config|
    #     config.logger = Logger.new($stderr)
    #   end
    def configure
      yield(configuration)
    end

    def logger
      configuration.logger
    end

    # Given a `key` in the format of `adapter://bucket/key`, returns the corresponding
    # adapter that allows the caller to manipulate (e.g. download, upload or list) such a key.
    #
    # Currently supported adapters are `gs` (Google Cloud Storage), `s3` (AWS S3),
    # `inmemory` (an in-memory key-value storage) and `disk` (a disk-backed key-value store).
    #
    # @param [String] key The reference key
    # @return [KeyStorage] An interface to the adapter that can handle requests on the given key
    # @example Obtain an interface for a Google Cloud Storage key
    #   BucketStore.for("gs://the_bucket/a/valid/key")
    def for(key)
      ctx = KeyContext.parse(key)

      KeyStorage.new(adapter: ctx.adapter,
                     bucket: ctx.bucket,
                     key: ctx.key)
    end
  end
end
metadata
ADDED
@@ -0,0 +1,182 @@
--- !ruby/object:Gem::Specification
name: bucket_store
version: !ruby/object:Gem::Version
  version: 0.3.0
platform: ruby
authors:
- GoCardless Engineering
autorequire:
bindir: bin
cert_chain: []
date: 2021-10-26 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: aws-sdk-s3
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1'
- !ruby/object:Gem::Dependency
  name: google-cloud-storage
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.31'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.31'
- !ruby/object:Gem::Dependency
  name: gc_ruboconfig
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.29'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.29'
- !ruby/object:Gem::Dependency
  name: pry-byebug
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.9'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.9'
- !ruby/object:Gem::Dependency
  name: rspec
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.10'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.10'
- !ruby/object:Gem::Dependency
  name: rspec_junit_formatter
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.4.1
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.4.1
- !ruby/object:Gem::Dependency
  name: rubocop
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.22'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.22'
- !ruby/object:Gem::Dependency
  name: rubocop-performance
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.11'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.11'
- !ruby/object:Gem::Dependency
  name: rubocop-rspec
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.5'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.5'
description: " A helper library to access cloud storage services such as Google
  Cloud Storage.\n"
email:
- engineering@gocardless.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- README.md
- lib/bucket_store.rb
- lib/bucket_store/configuration.rb
- lib/bucket_store/disk.rb
- lib/bucket_store/gcs.rb
- lib/bucket_store/in_memory.rb
- lib/bucket_store/key_context.rb
- lib/bucket_store/key_storage.rb
- lib/bucket_store/s3.rb
- lib/bucket_store/timing.rb
- lib/bucket_store/uri_builder.rb
- lib/bucket_store/version.rb
homepage: https://github.com/gocardless/file-storage
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '2.6'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.2.22
signing_key:
specification_version: 4
summary: A helper library to access cloud storage services
test_files: []