bucket_store 0.3.0
- checksums.yaml +7 -0
- data/README.md +167 -0
- data/lib/bucket_store/configuration.rb +19 -0
- data/lib/bucket_store/disk.rb +84 -0
- data/lib/bucket_store/gcs.rb +82 -0
- data/lib/bucket_store/in_memory.rb +62 -0
- data/lib/bucket_store/key_context.rb +36 -0
- data/lib/bucket_store/key_storage.rb +158 -0
- data/lib/bucket_store/s3.rb +79 -0
- data/lib/bucket_store/timing.rb +19 -0
- data/lib/bucket_store/uri_builder.rb +15 -0
- data/lib/bucket_store/version.rb +5 -0
- data/lib/bucket_store.rb +59 -0
- metadata +182 -0
checksums.yaml
ADDED
```yaml
---
SHA256:
  metadata.gz: c812fe38b957031d6fae686c11a3e615ac5a2575b563ebf96b8f0a019513d363
  data.tar.gz: 3165a65032fc35ca76ed983a7477f0cb1a13de62d1f2b68ad6cb24e4622a0eb4
SHA512:
  metadata.gz: 3be0d30f1857185217c5f3b461809bf436ca2306426791af4051a2c725229d246d272ffd85aaef90792102e59c8100e66422f9c43ab9354cfb01ab4b7971e6fe
  data.tar.gz: 9ea6b6baadf89e29b0b74d00a9ca5c743088c50d75c1867348ba48ce14ac304160f205b881747ccf4f54b3dc50f631e27f2042443e2ffbfc69913e4f1703bbb2
```
data/README.md
ADDED
# BucketStore

An abstraction layer on top of file cloud storage systems such as Google Cloud
Storage or S3. This module exposes a generic interface that allows interoperability
between different storage options. Callers don't need to worry about the specifics
of where and how a file is stored and retrieved as long as the given key is valid.

Keys within `BucketStore` are URI strings that can universally locate an object
in the given provider. A valid key example would be
`gs://a-gcs-bucket/file/path.json`.

## Usage

This library is distributed as a Ruby gem, and we recommend adding it to your Gemfile:

```ruby
gem "bucket_store"
```

Some attributes can be configured via `BucketStore.configure`. If using Rails, you'll
want to add a new initializer for `BucketStore`. Example:

```ruby
BucketStore.configure do |config|
  config.logger = Logger.new($stderr)
end
```

If using RSpec, you'll probably want to add this line to RSpec's config block (see
the *Adapters* section for more details):

```ruby
config.before { BucketStore::InMemory.reset! }
```

For our policy on compatibility with Ruby versions, see [COMPATIBILITY.md](docs/COMPATIBILITY.md).

## Design and Architecture

The main principle behind `BucketStore` is that each resource or group of resources must
be unequivocally identifiable by a URI. The URI is always composed of three parts:

- the "adapter" used to fetch the resource (see "Adapters" below)
- the "bucket" where the resource lives
- the path to the resource(s)

As an example, all of the following are valid URIs:

- `gs://gcs-bucket/path/to/file.xml`
- `inmemory://bucket/separator/file.xml`
- `disk://hello/path/to/file.json`

Even though `BucketStore`'s main goal is to be an abstraction layer on top of systems
such as S3 or Google Cloud Storage, where the "path" to a resource is in practice a
unique identifier as a whole (i.e. the `/` is not a directory separator but rather part
of the key's name), we assume that clients will actually want some sort of hierarchical
separation of resources, and that such separation is achieved by delimiting each part
of the hierarchy with `/`.

This means that the following are also valid URIs in `BucketStore`, but they refer to
all the resources under that specific hierarchy:

- `gs://gcs-bucket/path/subpath/`
- `inmemory://bucket/separator/`
- `disk://hello/path`
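The three-part decomposition can be sketched with Ruby's standard `URI` parser; this is an illustration of the idea, not the gem's internal API, and `split_key` is a hypothetical helper name:

```ruby
require "uri"

# Hypothetical helper: the URI scheme is the adapter, the host is the
# bucket, and the path (minus its leading slash) is the key.
def split_key(raw_key)
  uri = URI(raw_key)
  {
    adapter: uri.scheme,
    bucket: uri.host,
    key: uri.path.sub(%r{\A/}, ""),
  }
end

parts = split_key("gs://gcs-bucket/path/to/file.xml")
parts[:adapter] # => "gs"
parts[:bucket]  # => "gcs-bucket"
parts[:key]     # => "path/to/file.xml"
```

Note that a URI ending in `/` yields an empty key, which is how a "list everything in this hierarchy" reference is expressed.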
## Configuration

`BucketStore` exposes some configurable attributes via `BucketStore.configure`. If
necessary, this should be called at startup time, before any other method is invoked.

- `logger`: a custom logger class. By default, logs will be sent to stdout.

## Adapters

`BucketStore` comes with 4 built-in adapters:

- `gs`: the Google Cloud Storage adapter
- `s3`: the S3 adapter
- `disk`: a disk-based adapter
- `inmemory`: an in-memory store

### GS adapter

This is the adapter for Google Cloud Storage. `BucketStore` assumes that the authorisation
for accessing the resources has been set up outside of the gem.

### S3 adapter

This is the adapter for S3. `BucketStore` assumes that the authorisation for accessing
the resources has been set up outside of the gem (see also
https://docs.aws.amazon.com/sdk-for-ruby/v3/api/index.html#Configuration).
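As a hedged illustration of "set up outside of the gem" (your credential setup may well differ), both underlying SDKs pick up their standard environment variables, so no BucketStore-specific configuration is required:

```shell
# Google Cloud Storage: point the SDK at a service account key file.
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# S3: standard AWS SDK environment variables.
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=eu-west-1
```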
### Disk adapter

A disk-backed key-value store. This adapter will create a temporary directory where
all the files will be written to/read from. The base directory can be explicitly
defined by setting the `DISK_ADAPTER_BASE_DIR` environment variable; otherwise a
temporary directory will be created.
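For example, to pin the adapter to a fixed directory (the path shown is arbitrary):

```shell
# Keep disk-adapter files in a known location instead of a temporary directory.
export DISK_ADAPTER_BASE_DIR="$HOME/.cache/bucket_store"
```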
### In-memory adapter

An in-memory key-value store. This works just like the disk adapter, except that
the content of all the files is stored in memory, which is particularly useful for
testing. Note that content added to this adapter will persist for the lifetime of
the application, as it's not possible to create different instances of the same adapter.
In general, this is not what's expected during testing, where the content of the bucket
should be reset between different tests. The adapter provides a way to easily reset the
content via a `.reset!` method. In RSpec this translates to adding this line to the
`spec_helper`:

```ruby
config.before { BucketStore::InMemory.reset! }
```
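The behaviour described above can be sketched as a process-wide, hash-backed store with an explicit reset; this mirrors the idea, not the gem's code, and `TinyStore` is a hypothetical name:

```ruby
# A process-wide store: contents survive until reset! is called, which is
# why tests should reset it between examples.
class TinyStore
  def self.store
    @store ||= {}
  end

  def self.reset!
    store.clear
  end
end

TinyStore.store["bucket/file.xml"] = "hello world"
TinyStore.store.size # => 1
TinyStore.reset!
TinyStore.store.size # => 0
```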
## BucketStore vs ActiveStorage

ActiveStorage is a common framework for accessing cloud storage systems that ships as
part of Rails. In general, ActiveStorage provides a lot more than BucketStore does
(including many more adapters); however, the two libraries have different use cases in
mind:

- ActiveStorage requires you to define every possible bucket you're planning to use
  ahead of time in a YAML file. This works well for most cases; however, if you plan
  to use a lot of buckets, this soon becomes impractical. We think that BucketStore's
  approach works much better in this case.
- BucketStore does not provide ways to manipulate the content, whereas ActiveStorage
  does. If you plan to apply transformations to the content before uploading or after
  downloading it, then ActiveStorage is probably the library for you. With that said,
  it's still possible to do these transformations outside of BucketStore, and in fact
  we've found the explicitness of this approach to be a desirable property.
- BucketStore's approach makes any resource on a cloud storage system uniquely
  identifiable via a single URI, which means it's normally enough to pass that string
  around different systems to access the resource without ambiguity. As the URI also
  includes the adapter, it's possible, for example, to download a `disk://dir/input_file`
  and upload it to a `gs://bucket/output_file`, all through a single interface.
  ActiveStorage is instead focused on persisting an equivalent reference on a Rails
  model. If your application does not use Rails, does not need to persist the
  reference, or just requires more flexibility in general, then BucketStore is probably
  the library for you.
## Examples

### Uploading a file to a bucket

```ruby
BucketStore.for("inmemory://bucket/path/file.xml").upload!("hello world")
=> "inmemory://bucket/path/file.xml"
```

### Accessing a file in a bucket

```ruby
BucketStore.for("inmemory://bucket/path/file.xml").download
=> {:bucket=>"bucket", :key=>"path/file.xml", :content=>"hello world"}
```

### Listing all keys under a prefix

```ruby
BucketStore.for("inmemory://bucket/path/").list
=> ["inmemory://bucket/path/file.xml"]
```

### Deleting a file

```ruby
BucketStore.for("inmemory://bucket/path/file.xml").delete!
=> true
```

## License & Contributing

- BucketStore is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
- Bug reports and pull requests are welcome on GitHub at https://github.com/gocardless/file-storage.

GoCardless ♥ open source. If you do too, come [join us](https://gocardless.com/about/careers/).
data/lib/bucket_store/configuration.rb
ADDED

```ruby
# frozen_string_literal: true

module BucketStore
  class Configuration
    def logger
      @logger ||= Logger.new($stdout)
    end

    # Specifies a custom logger.
    #
    # Note that {BucketStore} uses structured logging; any custom logger passed must
    # also support it.
    #
    # @example Use stderr as main output device
    #   config.logger = Logger.new($stderr)
    # @!attribute logger
    attr_writer :logger
  end
end
```
data/lib/bucket_store/disk.rb
ADDED

```ruby
# frozen_string_literal: true

require "fileutils"

module BucketStore
  class Disk
    def self.build(base_dir = ENV["DISK_ADAPTER_BASE_DIR"])
      base_dir ||= Dir.tmpdir
      Disk.new(base_dir)
    end

    def initialize(base_dir)
      @base_dir = File.expand_path(base_dir)
    end

    def upload!(bucket:, key:, content:)
      File.open(key_path(bucket, key), "w") do |file|
        file.write(content)
      end
      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      File.open(key_path(bucket, key), "r") do |file|
        {
          bucket: bucket,
          key: key,
          content: file.read,
        }
      end
    end

    def list(bucket:, key:, page_size:)
      root = Pathname.new(bucket_root(bucket))

      Dir["#{root}/**/*"].
        reject { |absolute_path| File.directory?(absolute_path) }.
        map { |full_path| Pathname.new(full_path).relative_path_from(root).to_s }.
        select { |f| f.start_with?(key) }.
        each_slice(page_size).
        map do |keys|
          {
            bucket: bucket,
            keys: keys,
          }
        end.to_enum
    end

    def delete!(bucket:, key:)
      File.unlink(key_path(bucket, key))

      true
    end

    private

    attr_reader :base_dir

    def bucket_root(bucket)
      path = File.join(base_dir, sanitize_filename(bucket))
      FileUtils.mkdir_p(path)
      path
    end

    def key_path(bucket, key)
      path = File.join(bucket_root(bucket), sanitize_filename(key))
      path = File.expand_path(path)

      unless path.start_with?(base_dir)
        raise ArgumentError, "Directory traversal out of bucket boundaries: #{key}"
      end

      FileUtils.mkdir_p(File.dirname(path))
      path
    end

    def sanitize_filename(filename)
      filename.gsub(%r{[^0-9A-z.\-/]}, "_")
    end
  end
end
```
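The guard in `key_path` can be exercised in isolation: `File.expand_path` collapses any `..` segments, so a path that escapes the base directory no longer shares its prefix. A standalone sketch of just that check, not the gem's API:

```ruby
base_dir = File.expand_path("store")

# A well-behaved key resolves to a path under the base directory...
safe = File.expand_path(File.join(base_dir, "bucket", "path/to/file.json"))
safe.start_with?(base_dir) # => true

# ...while a traversal attempt escapes it and would be rejected.
evil = File.expand_path(File.join(base_dir, "bucket", "../../etc/passwd"))
evil.start_with?(base_dir) # => false
```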
data/lib/bucket_store/gcs.rb
ADDED

```ruby
# frozen_string_literal: true

require "stringio"
require "uri"

require "google/cloud/storage"

module BucketStore
  class Gcs
    DEFAULT_TIMEOUT_SECONDS = 30

    def self.build(timeout_seconds = DEFAULT_TIMEOUT_SECONDS)
      Gcs.new(timeout_seconds)
    end

    def initialize(timeout_seconds)
      @storage = Google::Cloud::Storage.new(
        timeout: timeout_seconds,
      )
    end

    def upload!(bucket:, key:, content:)
      buffer = StringIO.new(content)
      get_bucket(bucket).create_file(buffer, key)

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      file = get_bucket(bucket).file(key)

      buffer = StringIO.new
      file.download(buffer)

      {
        bucket: bucket,
        key: key,
        content: buffer.string,
      }
    end

    def list(bucket:, key:, page_size:)
      Enumerator.new do |yielder|
        token = nil

        loop do
          page = get_bucket(bucket).files(prefix: key, max: page_size, token: token)
          yielder.yield({
            bucket: bucket,
            keys: page.map(&:name),
          })

          break if page.token.nil?

          token = page.token
        end
      end
    end

    def delete!(bucket:, key:)
      get_bucket(bucket).file(key).delete

      true
    end

    private

    attr_reader :storage

    def get_bucket(name)
      # Lookup only checks that the bucket actually exists before doing any work on it.
      # Unfortunately it also requires a set of extra permissions that are not
      # necessarily going to be granted for service accounts. Given that if the bucket
      # doesn't exist we'll get errors down the line anyway, we can safely skip the
      # lookup without loss of generality.
      storage.bucket(name, skip_lookup: true)
    end
  end
end
```
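The token-driven loop inside `list` follows the usual cursor-pagination shape. It can be sketched against a fake paged backend (the names and page shape here are illustrative, not the GCS client's):

```ruby
# Fake paged backend: each page carries its items plus the token for the
# next page, with a nil token on the last page.
PAGES = {
  nil => { items: %w[a b], token: "t1" },
  "t1" => { items: %w[c], token: nil },
}.freeze

def each_page
  Enumerator.new do |yielder|
    token = nil
    loop do
      page = PAGES.fetch(token)
      yielder.yield(page[:items])
      break if page[:token].nil?

      token = page[:token]
    end
  end
end

each_page.to_a # => [["a", "b"], ["c"]]
```

Because the loop lives inside an `Enumerator`, a page is only fetched when the caller consumes that far into the result set.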
data/lib/bucket_store/in_memory.rb
ADDED

```ruby
# frozen_string_literal: true

module BucketStore
  class InMemory
    def self.build
      InMemory.instance
    end

    def self.instance
      # rubocop:disable Style/ClassVars
      @@instance ||= new
      # rubocop:enable Style/ClassVars
    end

    def self.reset!
      instance.reset!
    end

    def initialize
      reset!
    end

    def reset!
      @buckets = Hash.new { |hash, key| hash[key] = {} }
    end

    def upload!(bucket:, key:, content:)
      @buckets[bucket][key] = content

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      {
        bucket: bucket,
        key: key,
        content: @buckets[bucket].fetch(key),
      }
    end

    def list(bucket:, key:, page_size:)
      @buckets[bucket].keys.
        select { |k| k.start_with?(key) }.
        each_slice(page_size).
        map do |keys|
          {
            bucket: bucket,
            keys: keys,
          }
        end.to_enum
    end

    def delete!(bucket:, key:)
      @buckets[bucket].delete(key)

      true
    end
  end
end
```
data/lib/bucket_store/key_context.rb
ADDED

```ruby
# frozen_string_literal: true

require "uri"

module BucketStore
  class KeyContext
    attr_reader :adapter, :bucket, :key

    class KeyParseException < RuntimeError; end

    def initialize(adapter:, bucket:, key:)
      @adapter = adapter
      @bucket = bucket
      @key = key
    end

    def to_s
      "<KeyContext adapter:#{adapter} bucket:#{bucket} key:#{key}>"
    end

    def self.parse(raw_key)
      uri = URI(raw_key)

      # A key should never be `nil` but can be empty. Depending on the operation, this
      # may or may not be a valid configuration (e.g. an empty key is likely valid on a
      # `list`, but not during a `download`).
      key = uri.path.sub!(%r{/}, "") || ""

      raise KeyParseException if [uri.scheme, uri.host, key].map(&:nil?).any?

      KeyContext.new(adapter: uri.scheme,
                     bucket: uri.host,
                     key: key)
    end
  end
end
```
data/lib/bucket_store/key_storage.rb
ADDED

```ruby
# frozen_string_literal: true

require "bucket_store/timing"
require "bucket_store/in_memory"
require "bucket_store/gcs"
require "bucket_store/s3"
require "bucket_store/disk"

module BucketStore
  class KeyStorage
    SUPPORTED_ADAPTERS = {
      gs: Gcs,
      s3: S3,
      inmemory: InMemory,
      disk: Disk,
    }.freeze

    attr_reader :bucket, :key, :adapter_type

    def initialize(adapter:, bucket:, key:)
      @adapter_type = adapter.to_sym
      raise "Unknown adapter: #{@adapter_type}" unless SUPPORTED_ADAPTERS.include?(@adapter_type)

      @adapter = SUPPORTED_ADAPTERS.fetch(@adapter_type).build
      @bucket = bucket
      @key = key
    end

    def filename
      File.basename(key)
    end

    # Downloads the content of the reference key
    #
    # @return [Hash<Symbol, Object>]
    #   A hash that includes the download result. The hash keys reference different
    #   aspects of the download (e.g. `:key` and `:content` will include respectively
    #   the original key's name and the actual download's content)
    #
    # @example Download a key
    #   BucketStore.for("inmemory://bucket/file.xml").download
    def download
      raise ArgumentError, "Key cannot be empty" if key.empty?

      BucketStore.logger.info(event: "key_storage.download_started")

      start = BucketStore::Timing.monotonic_now
      result = adapter.download(bucket: bucket, key: key)

      BucketStore.logger.info(event: "key_storage.download_finished",
                              duration: BucketStore::Timing.monotonic_now - start)

      result
    end

    # Uploads the given content to the reference key location.
    #
    # If the `key` already exists, its content will be replaced by the one in input.
    #
    # @param [String] content The content to upload
    # @return [String] The final `key` where the content has been uploaded
    # @example Upload a file
    #   BucketStore.for("inmemory://bucket/file.xml").upload!("hello world")
    def upload!(content)
      raise ArgumentError, "Key cannot be empty" if key.empty?

      BucketStore.logger.info(event: "key_storage.upload_started",
                              **log_context)

      start = BucketStore::Timing.monotonic_now
      result = adapter.upload!(
        bucket: bucket,
        key: key,
        content: content,
      )

      BucketStore.logger.info(event: "key_storage.upload_finished",
                              duration: BucketStore::Timing.monotonic_now - start,
                              **log_context)

      "#{adapter_type}://#{result[:bucket]}/#{result[:key]}"
    end

    # Lists all keys for the current adapter that have the reference key as prefix
    #
    # Internally, this method will paginate through the result set. The default page
    # size for the underlying adapter can be controlled via the `page_size` argument.
    #
    # This will return an enumerator of valid keys in the format of
    # `adapter://bucket/key`. The keys in the list will share the reference key as a
    # prefix. Underlying adapters will paginate the result set as the enumerable is
    # consumed. The number of items per page can be controlled by the `page_size`
    # argument.
    #
    # @param [Integer] page_size
    #   the max number of items to fetch for each page of results
    def list(page_size: 1000)
      BucketStore.logger.info(event: "key_storage.list_started")

      start = BucketStore::Timing.monotonic_now
      pages = adapter.list(
        bucket: bucket,
        key: key,
        page_size: page_size,
      )

      page_count = 0
      Enumerator.new do |yielder|
        pages.each do |page|
          page_count += 1
          keys = page.fetch(:keys, []).map { |key| "#{adapter_type}://#{page[:bucket]}/#{key}" }

          BucketStore.logger.info(
            event: "key_storage.list_page_fetched",
            resource_count: keys.count,
            page: page_count,
            duration: BucketStore::Timing.monotonic_now - start,
          )

          keys.each do |key|
            yielder.yield(key)
          end
        end
      end
    end

    # Deletes the referenced key.
    #
    # Note that this method will always return true.
    #
    # @return [bool]
    #
    # @example Delete a file
    #   BucketStore.for("inmemory://bucket/file.txt").delete!
    def delete!
      BucketStore.logger.info(event: "key_storage.delete_started")

      start = BucketStore::Timing.monotonic_now
      adapter.delete!(bucket: bucket, key: key)

      BucketStore.logger.info(event: "key_storage.delete_finished",
                              duration: BucketStore::Timing.monotonic_now - start)

      true
    end

    private

    attr_reader :adapter

    def log_context
      {
        bucket: bucket,
        key: key,
        adapter_type: adapter_type,
      }.compact
    end
  end
end
```
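The enumerator inside `list` flattens adapter pages into individual URI strings. A standalone sketch of just that step, assuming the `{bucket:, keys:}` page shape the adapters return (`flatten_pages` is a hypothetical name):

```ruby
# Turn an array of adapter pages into a lazy stream of URI strings.
def flatten_pages(adapter_type, pages)
  Enumerator.new do |yielder|
    pages.each do |page|
      page.fetch(:keys, []).each do |key|
        yielder.yield("#{adapter_type}://#{page[:bucket]}/#{key}")
      end
    end
  end
end

pages = [{ bucket: "bucket", keys: %w[a b] }, { bucket: "bucket", keys: %w[c] }]
flatten_pages("inmemory", pages).to_a
# => ["inmemory://bucket/a", "inmemory://bucket/b", "inmemory://bucket/c"]
```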
data/lib/bucket_store/s3.rb
ADDED

```ruby
# frozen_string_literal: true

require "uri"

require "aws-sdk-s3"

module BucketStore
  class S3
    DEFAULT_TIMEOUT_SECONDS = 30

    def self.build(open_timeout_seconds = DEFAULT_TIMEOUT_SECONDS,
                   read_timeout_seconds = DEFAULT_TIMEOUT_SECONDS)
      S3.new(open_timeout_seconds, read_timeout_seconds)
    end

    def initialize(open_timeout_seconds, read_timeout_seconds)
      @storage = Aws::S3::Client.new(
        http_open_timeout: open_timeout_seconds,
        http_read_timeout: read_timeout_seconds,
      )
    end

    def upload!(bucket:, key:, content:)
      storage.put_object(
        bucket: bucket,
        key: key,
        body: content,
      )

      {
        bucket: bucket,
        key: key,
      }
    end

    def download(bucket:, key:)
      file = storage.get_object(
        bucket: bucket,
        key: key,
      )

      {
        bucket: bucket,
        key: key,
        content: file.body.read,
      }
    end

    def list(bucket:, key:, page_size:)
      Enumerator.new do |yielder|
        page = storage.list_objects_v2(bucket: bucket, prefix: key, max_keys: page_size)

        loop do
          yielder.yield({
            bucket: bucket,
            keys: page.contents.map(&:key),
          })

          break unless page.next_page?

          page = page.next_page
        end
      end
    end

    def delete!(bucket:, key:)
      storage.delete_object(
        bucket: bucket,
        key: key,
      )

      true
    end

    private

    attr_reader :storage
  end
end
```
data/lib/bucket_store/timing.rb
ADDED

```ruby
# frozen_string_literal: true

module BucketStore
  module Timing
    # "Wall clock is for telling time, monotonic clock is for measuring time."
    #
    # When timing events, ensure we ask for a monotonically adjusted clock time
    # to avoid changes to the system time from being reflected in our
    # measurements.
    #
    # See this article for a good explanation and a deeper dive:
    # https://blog.dnsimple.com/2018/03/elapsed-time-with-ruby-the-right-way/
    #
    # @return [Float]
    def self.monotonic_now
      Process.clock_gettime(Process::CLOCK_MONOTONIC)
    end
  end
end
```
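`monotonic_now` is always used in pairs, subtracting a start reading from an end reading to get a duration; a minimal standalone sketch of the same pattern:

```ruby
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

start = monotonic_now
sleep 0.05 # the work being measured
elapsed = monotonic_now - start
elapsed.positive? # => true
```

Unlike `Time.now`, this reading is unaffected by NTP adjustments or manual changes to the system clock, so `elapsed` can never come out negative.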
data/lib/bucket_store/uri_builder.rb
ADDED

```ruby
# frozen_string_literal: true

module BucketStore
  module UriBuilder
    # Sanitizes the input as not all characters are valid as either URIs or as bucket
    # keys. When we get them we want to replace them with something we can process.
    #
    # @param input [String] the string to sanitise
    # @param [String] replacement the replacement string for invalid characters
    # @return [String] the sanitised string
    def self.sanitize(input, replacement = "__")
      input.gsub(/[{}<>%]/, replacement)
    end
  end
end
```
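Each character in the invalid set (`{`, `}`, `<`, `>`, `%`) is replaced independently, so adjacent invalid characters each produce a full copy of the replacement string. A standalone sketch reusing the same regexp:

```ruby
# Mirrors UriBuilder.sanitize: every occurrence of {}<>% becomes "__".
def sanitize(input, replacement = "__")
  input.gsub(/[{}<>%]/, replacement)
end

sanitize("report{2021}%final.pdf") # => "report__2021____final.pdf"
```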
data/lib/bucket_store.rb
ADDED
```ruby
# frozen_string_literal: true

require "bucket_store/version"
require "bucket_store/configuration"
require "bucket_store/key_context"
require "bucket_store/key_storage"

# An abstraction layer on top of file cloud storage systems such as Google Cloud
# Storage or S3. This module exposes a generic interface that allows interoperability
# between different storage options. Callers don't need to worry about the specifics
# of where and how a file is stored and retrieved as long as the given key is valid.
#
# Keys within the {BucketStore} are URI strings that can universally locate an object
# in the given provider. A valid key example would be:
# `gs://gcs-bucket/file/path.json`.
module BucketStore
  class << self
    attr_writer :configuration

    def configuration
      @configuration ||= BucketStore::Configuration.new
    end

    # Yields a {BucketStore::Configuration} object that allows callers to configure
    # BucketStore's behaviour.
    #
    # @yield [BucketStore::Configuration]
    #
    # @example Configure BucketStore to use a different logger than the default
    #   BucketStore.configure do |config|
    #     config.logger = Logger.new($stderr)
    #   end
    def configure
      yield(configuration)
    end

    def logger
      configuration.logger
    end

    # Given a `key` in the format of `adapter://bucket/key` returns the corresponding
    # adapter that will allow to manipulate (e.g. download, upload or list) such key.
    #
    # Currently supported adapters are `gs` (Google Cloud Storage), `s3` (Amazon S3),
    # `inmemory` (an in-memory key-value storage) and `disk` (a disk-backed key-value
    # store).
    #
    # @param [String] key The reference key
    # @return [KeyStorage] An interface to the adapter that can handle requests on the
    #   given key
    # @example Configure {BucketStore} for Google Cloud Storage
    #   BucketStore.for("gs://the_bucket/a/valid/key")
    def for(key)
      ctx = KeyContext.parse(key)

      KeyStorage.new(adapter: ctx.adapter,
                     bucket: ctx.bucket,
                     key: ctx.key)
    end
  end
end
```
metadata
ADDED
```yaml
--- !ruby/object:Gem::Specification
name: bucket_store
version: !ruby/object:Gem::Version
  version: 0.3.0
platform: ruby
authors:
- GoCardless Engineering
autorequire:
bindir: bin
cert_chain: []
date: 2021-10-26 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: aws-sdk-s3
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1'
- !ruby/object:Gem::Dependency
  name: google-cloud-storage
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.31'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.31'
- !ruby/object:Gem::Dependency
  name: gc_ruboconfig
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.29'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.29'
- !ruby/object:Gem::Dependency
  name: pry-byebug
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.9'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.9'
- !ruby/object:Gem::Dependency
  name: rspec
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.10'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '3.10'
- !ruby/object:Gem::Dependency
  name: rspec_junit_formatter
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.4.1
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: 0.4.1
- !ruby/object:Gem::Dependency
  name: rubocop
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.22'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.22'
- !ruby/object:Gem::Dependency
  name: rubocop-performance
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.11'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.11'
- !ruby/object:Gem::Dependency
  name: rubocop-rspec
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.5'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '2.5'
description: " A helper library to access cloud storage services such as Google
  Cloud Storage.\n"
email:
- engineering@gocardless.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- README.md
- lib/bucket_store.rb
- lib/bucket_store/configuration.rb
- lib/bucket_store/disk.rb
- lib/bucket_store/gcs.rb
- lib/bucket_store/in_memory.rb
- lib/bucket_store/key_context.rb
- lib/bucket_store/key_storage.rb
- lib/bucket_store/s3.rb
- lib/bucket_store/timing.rb
- lib/bucket_store/uri_builder.rb
- lib/bucket_store/version.rb
homepage: https://github.com/gocardless/file-storage
licenses:
- MIT
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '2.6'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubygems_version: 3.2.22
signing_key:
specification_version: 4
summary: A helper library to access cloud storage services
test_files: []
```