activestorage-aws-record 0.1.0

data/README.md ADDED
@@ -0,0 +1,287 @@
+ # activestorage-aws-record
+
+ Run **Active Storage on Amazon DynamoDB** — via [`aws-record`](https://github.com/aws/aws-record-ruby)
+ instead of Active Record.
+
+ It is a **metadata** backend: blob *bytes* still flow through a normal Active
+ Storage `Service` (Disk, S3, Mirror, …); only the Blob, Attachment, and
+ VariantRecord **metadata** lives in DynamoDB. Everything else — analyzers,
+ previewers, variants, jobs, direct uploads, controllers — is reused from Active
+ Storage unchanged.
+
+ It implements Active Storage's generic (non-ActiveRecord) custom-backend
+ contract, and doubles as a reference **example implementation** of that contract.
+
+ ## Why
+
+ Apps built on DynamoDB + `aws-record` (no relational database) still want
+ Active Storage's attachment ergonomics. This gem provides them without
+ introducing Active Record, following **Single Table Design**: every Active
+ Storage item lives in *one* table you already own.
+
+ ## Highlights
+
+ - **Single Table Design** — Blob, Attachment, and VariantRecord items share one
+   application-provided table, keyed by `#`-separated composite keys under a
+   configurable namespace. No gem-owned tables.
+ - **Zero-config key discovery** — the partition/sort key attribute names and
+   types are auto-detected from the live table at boot. Usually the only thing
+   you configure is the table name.
+ - **Adapts to your key types** — a **String** range key needs no GSI (all reads
+   strongly consistent); a **numeric** range key is supported automatically by
+   routing the adjacency through a string-keyed GSI.
+ - **Safe shared blobs** — a strongly-consistent, transactional reference count
+   with a conditional-delete foreign-key guard (no wrongful purge of a shared
+   blob, no zombie/orphan rows).
+ - **Atomic multi-attachment changes** — clearing, replacing, or detaching a
+   `has_many` commits every row delete (and a coalesced refcount decrement per
+   blob) in one DynamoDB transaction, so it can never delete some rows and leave
+   others. A change over DynamoDB's 100-action limit fails closed rather than
+   partially.
+ - **Fiber-safe** — Falcon-ready: eager mutex, mutex-guarded client, read-only
+   post-boot schema cache.
+
+ ## Requirements
+
+ - Ruby `>= 3.4`
+ - Active Storage with **generic custom-backend support** — currently the
+   [`rails/rails#57537`](https://github.com/rails/rails/pull/57537) branch (not yet
+   released). Until it ships in a Rails release, point your `Gemfile` at that
+   branch; installing against a released `activestorage` will not work.
+ - `aws-record ~> 2.15`, `aws-sdk-dynamodb ~> 1`
+ - A DynamoDB table with a composite primary key (partition **String** + sort key)
+
+ ## Installation
+
+ ```ruby
+ # Gemfile
+ gem "activestorage-aws-record"
+ ```
+
+ ## Configuration
+
+ In a Rails app, configure via `config.activestorage_aws_record` (e.g. in
+ `config/application.rb` or an initializer):
+
+ ```ruby
+ config.activestorage_aws_record.table_name = "my_app" # the single table (required)
+ config.activestorage_aws_record.namespace = "ActiveStorage" # key prefix; change to avoid collisions
+ # Optional:
+ # config.activestorage_aws_record.separator = "#"
+ # config.activestorage_aws_record.client_options = { region: "eu-central-1" }
+ # config.activestorage_aws_record.client = Aws::DynamoDB::Client.new(...)
+ # config.activestorage_aws_record.manage_table = Rails.env.local? # create the table if missing (dev/test)
+ # config.activestorage_aws_record.index_name = "active_storage_index" # which GSI to use in Mode B
+ ```
+
+ | Setting | Default | Meaning |
+ |---|---|---|
+ | `table_name` | `"active_storage"` | The single shared table. |
+ | `namespace` | `"ActiveStorage"` | First segment of every key; isolates Active Storage items from your own. |
+ | `separator` | `"#"` | Key segment delimiter. |
+ | `client` / `client_options` | — / `{}` | Provide a client, or options for one. |
+ | `manage_table` | `false` | Create the table (and, in dev, point at it) if missing. Production tables are app-managed. |
+ | `index_name` | `"active_storage_index"` | The GSI to use when the range key is numeric (Mode B). |
+
+ The partition/sort key **attribute names** are *not* configured — they are
+ detected from the table.
+
+ ## The table
+
+ The gem stores its items in your existing single table. It only assumes a
+ composite primary key: a **String partition key** and a sort key. The sort key's
+ type selects the mode automatically.
+
+ ### Mode A — String sort key (recommended)
+
+ All access patterns live on the base table, so **every read is strongly
+ consistent and no GSI is required**. A minimal standalone table:
+
+ ```ruby
+ client.create_table(
+   table_name: "my_app",
+   attribute_definitions: [
+     { attribute_name: "pk", attribute_type: "S" },
+     { attribute_name: "sk", attribute_type: "S" }
+   ],
+   key_schema: [
+     { attribute_name: "pk", key_type: "HASH" },
+     { attribute_name: "sk", key_type: "RANGE" }
+   ],
+   billing_mode: "PAY_PER_REQUEST"
+ )
+ ```
+
+ (In development/test, `manage_table = true` creates exactly this for you.)
+
+ ### Mode B — numeric sort key (e.g. a single-table app keyed `hash_key` + `version`)
+
+ A numeric sort key cannot hold the `#`-composite strings, so the gem routes
+ listing through a **string-keyed GSI** (auto-detected; its key names can be
+ anything). Point lookups, the reference count, and the foreign-key guard remain
+ **strong** on the base table; listing (an owner's attachments, a blob's variant
+ records) is **eventually consistent** via the GSI. Your table needs a GSI named
+ by `index_name` with a `(String, String)` key projecting `ALL`:
+
+ ```ruby
+ global_secondary_indexes: [{
+   index_name: "active_storage_index",
+   key_schema: [
+     { attribute_name: "as_index_pk", key_type: "HASH" },
+     { attribute_name: "as_index_sk", key_type: "RANGE" }
+   ],
+   projection: { projection_type: "ALL" }
+ }]
+ ```
+
+ If the range key is numeric and that GSI is missing, the gem raises a
+ `ConfigurationError` describing exactly what to add (it never mutates a
+ production table's indexes itself).
+
+ ### Key layout
+
+ `ns` = `namespace`. Items are distinguished by key *values*, not separate tables:
+
+ | Entity | partition | sort |
+ |---|---|---|
+ | Blob | `ns#Blob#<id>` | `ns#Blob#<id>` |
+ | VariantRecord | `ns#Blob#<blob_id>` | `ns#VariantRecord#<digest>` |
+ | Attachment | `ns#Owner#<record_type>#<record_id>` | `ns#Attachment#<name>#<id>` |
+
+ All non-key attributes are stored under namespaced names (`as_filename`,
+ `as_blob_id`, …) so they never collide with your table's own key attributes.
+
+ ## Usage
+
+ ### Greenfield model — `Owner`
+
+ For an `aws-record` model with **no persistence of its own**, include `Owner`. It
+ provides an aws-record save/destroy wired into Active Storage's callbacks, plus
+ all the contract glue. Include `Aws::Record` **before** `Owner`:
+
+ ```ruby
+ class User
+   include Aws::Record
+   include ActiveStorage::AwsRecord::Owner
+
+   string_attr :id, hash_key: true
+   string_attr :name
+
+   has_one_attached :avatar
+   has_many_attached :documents
+ end
+ ```
+
+ ### Model with its own persistence — `Attachable`
+
+ If your model **already** defines `save`/`destroy` (its own versioning, events,
+ search — e.g. a shared `BaseModel`), do **not** include `Owner` — it would
+ override that persistence. Include `Attachable` instead: it adds only the contract
+ glue (callback chains, a `changed?` bridge, owner resolution, the attachment
+ macros) and never touches your `save`/`destroy`. Your persistence just needs to
+ run Active Storage's callback chains — wrap it with the provided helpers:
+
+ ```ruby
+ class Document < BaseModel # BaseModel already defines save/destroy
+   include ActiveStorage::AwsRecord::Attachable
+
+   has_many_attached :files
+
+   def save(*) = run_attachment_save { super }
+   def destroy(*) = run_attachment_destroy { delete! if persisted? }
+ end
+ ```
+
+ Active Storage resolves an owner from the bare id it stores; since aws-record's
+ `#find` is key-hash-based, `Attachable` supplies an `active_storage_find(id)`
+ adapter for that — without shadowing your model's own `#find`. Owners must be
+ single-hash-key. If you define `:commit` callbacks, you must run them (Active
+ Storage moves uploads to `after_commit` once they exist); with none, uploads
+ happen in `after_save`.
+
+ Then use Active Storage exactly as you would with the Active Record backend:
+
+ ```ruby
+ user.avatar.attach(io: file, filename: "me.png", content_type: "image/png")
+ user.avatar.attached? # => true
+ user.avatar.url # served by your configured Service
+ user.documents.attach(blob1, blob2)
+ user.avatar.variant(resize_to_limit: [100, 100]).processed
+ ```
+
+ Owners keep **their own** key schema — the single-table key scheme above applies
+ only to the gem's three entities.
+
+ ## Consistency
+
+ - **Mode A**: every read is strongly consistent (base table only).
+ - **Mode B**: point lookups, the reference count, and the foreign-key guard are
+   strong; *listing* (owner→attachments, blob→variant sweep) is eventually
+   consistent (GSI). Active Storage's in-memory change tracking masks this within
+   a request.
+
+ ### Atomic grouped changes
+
+ Active Storage drives a `has_many` clear/replace/detach through
+ `attachment_class.transaction`. The adapter makes that a real, fiber-local
+ DynamoDB transaction: every row delete and a single **coalesced** `ADD` per blob
+ (so the same blob attached twice produces one `-2`, not two rejected ops) commit
+ in one `transact_write_items`. Either all rows go or none do — no partial clear.
+ *Creates stay per-row and synchronous*, so Active Storage's own failed-save
+ cleanup of new blob/attachment records is unaffected. A single buffered delete
+ keeps the per-row idempotent recovery (duplicate purge, orphaned blob).
+
+ ## Known limitations
+
+ - `Blob#attachments` (the blob→attachment reverse lookup) is unsupported on a
+   persisted blob — the design is deliberately GSI-free for that direction; the
+   shared-blob count answers "is this still referenced?". Active Storage's generic
+   path only reaches it for a non-persisted blob (which has no rows).
+ - Mode B variant cleanup uses the eventually-consistent GSI, so a variant created
+   within the GSI's propagation window of a blob purge may be missed (Mode A is
+   strong).
+ - Two requests processing the *same* variant concurrently can briefly see the
+   variant record before its image is uploaded — a transient that resolves once
+   the winner finishes (inherent to create-then-upload).
+ - Blob service keys are random 145-bit tokens; like the reference backend there
+   is no DB-level unique-key constraint (collision is negligible).
+ - An atomic grouped change is capped by DynamoDB's **100-action** transaction
+   limit (`#rows + #distinct blobs`). A larger clear/replace/detach raises
+   `ActiveStorage::AwsRecord::TransactionTooLarge` **before any write** rather than
+   chunking (which would reintroduce the partial-clear bug) — split it into
+   smaller batches.
+ - A `has_one` *replace* is a synchronous create plus one buffered orphan delete.
+   It is fully atomic only on an Active Storage that carries the widened
+   `CreateOne#save` rescue (wrapping the whole `attachment_class.transaction`, so a
+   commit-time delete failure rolls back the new record) — shipped together with
+   this adapter as part of the generic-backend work. Without it, a transient
+   failure on the old-blob delete can leave the new attachment plus an unpurged old
+   blob.
+
+ ## Development
+
+ ```bash
+ bin/setup # bundle install + start DynamoDB Local (docker compose)
+ bundle exec rake test # Minitest suite (Mode A)
+ bundle exec rake smoke # standalone end-to-end smoke scripts (Mode A, Mode B, schema discovery)
+ bundle exec rake rubocop # lint (Rails Omakase); `rake rubocop:autocorrect` to format
+ bundle exec rake build # package the gem into pkg/
+ ```
+
+ The suite talks to DynamoDB Local over `DYNAMODB_ENDPOINT` (default
+ `http://localhost:8000`). To use a different instance — e.g. to avoid clashing
+ with another DynamoDB Local already on `:8000`:
+
+ ```bash
+ DYNAMODB_PORT=8002 docker compose up -d
+ DYNAMODB_ENDPOINT=http://localhost:8002 bundle exec rake test
+ ```
+
+ The suite isolates itself with a per-process table name and never touches other
+ tables on the endpoint.
+
+ See [`PLAN.md`](PLAN.md) for the full design rationale.
+
+ ## License
+
+ [MIT](LICENSE).
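
Reviewer sketch for the "Atomic grouped changes" section of the README above: one possible way to plan a grouped clear so that each blob gets a single coalesced decrement and the whole change is rejected before any write when it exceeds the 100-action limit. This is an illustration under stated assumptions, not the gem's implementation. `plan_grouped_delete`, `MAX_TRANSACT_ACTIONS`, the row hash shape, and the local `TransactionTooLarge` class are hypothetical names; the sketch builds plain hashes and never calls `transact_write_items`.

```ruby
# Hypothetical stand-in for ActiveStorage::AwsRecord::TransactionTooLarge.
class TransactionTooLarge < StandardError; end

# The 100-action transaction limit cited in the README.
MAX_TRANSACT_ACTIONS = 100

# rows: array of { id:, blob_id: } hashes, one per attachment row to delete.
# Returns the planned actions as plain hashes (not real DynamoDB requests).
def plan_grouped_delete(rows)
  # One delete per attachment row.
  deletes = rows.map { |row| { op: :delete, attachment_id: row[:id] } }

  # Coalesce: a blob attached N times gets one refcount change of -N.
  decrements = rows
    .group_by { |row| row[:blob_id] }
    .map { |blob_id, group| { op: :add, blob_id: blob_id, delta: -group.size } }

  actions = deletes + decrements

  # Fail closed: reject the whole change instead of chunking it.
  if actions.size > MAX_TRANSACT_ACTIONS
    raise TransactionTooLarge,
          "#{actions.size} actions exceed the #{MAX_TRANSACT_ACTIONS}-action limit"
  end

  actions
end
```

With three rows that reference two distinct blobs, the plan holds five actions (three deletes plus two coalesced decrements), which matches the README's `#rows + #distinct blobs` count.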
@@ -0,0 +1,104 @@
+ module ActiveStorage
+   module AwsRecord
+     # Mix into an aws-record model that **already manages its own persistence**
+     # (its own +save+/+destroy+ — versioning, events, search) to make it an Active
+     # Storage attachment owner *without* the gem taking over persistence. It
+     # contributes only the owner-*contract* glue Active Storage's generic builder
+     # needs: the callback chains it hooks, a +changed?+ bridge, owner resolution,
+     # and the +has_one_attached+/+has_many_attached+ macros.
+     #
+     #   class Document < BaseModel # BaseModel already defines save/destroy
+     #     include ActiveStorage::AwsRecord::Attachable
+     #     has_many_attached :files
+     #
+     #     def save(*) = run_attachment_save { super }
+     #     def destroy(*) = run_attachment_destroy { delete! if persisted? }
+     #   end
+     #
+     # The host keeps its own +save+/+destroy+, but those **must run** Active
+     # Storage's +:save+ (and, if defined, +:commit+) and +:destroy+ callback
+     # chains — that is how attachments flush/upload and are cleaned up. If the host
+     # already runs ActiveModel +:save+/+:destroy+ callbacks it works as-is;
+     # otherwise wrap raw persistence with {#run_attachment_save} /
+     # {#run_attachment_destroy}.
+     #
+     # For a greenfield model with **no** persistence of its own, use {Owner}, which
+     # is {Attachable} plus an aws-record save/destroy implementation.
+     #
+     # Assumes the host already +include Aws::Record+ (for +find_with_opts+/+dirty?+/
+     # +hash_key+). Owners must be single-hash-key (the contract stores one id).
+     module Attachable
+       extend ActiveSupport::Concern
+
+       included do
+         # The generic builder requires :validation + :save + :destroy callback
+         # chains and installs a :validation hook for analysis. Including these is
+         # idempotent.
+         include ActiveModel::Validations
+         include ActiveModel::Validations::Callbacks
+         extend ActiveModel::Callbacks
+
+         # Define ONLY the chains the host does not already run, so a model with its
+         # own :save/:destroy callbacks keeps them and Active Storage just hooks the
+         # existing chain. :commit is deliberately NOT auto-defined: Active Storage
+         # switches uploads to +after_commit+ the instant +_commit_callbacks+ exists
+         # (see the generic builder), so defining it without running it would write
+         # the attachment row but never upload the file. {Owner} (which controls
+         # persistence) defines and runs :commit itself.
+         define_model_callbacks :save unless respond_to?(:_save_callbacks, true)
+         define_model_callbacks :destroy unless respond_to?(:_destroy_callbacks, true)
+
+         # Active Storage uses +changed?+ to decide whether +attach+ saves the owner
+         # immediately; aws-record exposes the same state as +dirty?+. Bridge it
+         # only when the host has no +changed?+ of its own (never clobber one).
+         unless method_defined?(:changed?) || private_method_defined?(:changed?)
+           define_method(:changed?) { dirty? }
+         end
+
+         include ActiveStorage::Attached::Model
+       end
+
+       class_methods do
+         # Attachment rows store the owner under this name; override for STI-like
+         # schemes.
+         def polymorphic_name
+           name
+         end
+
+         # Active Storage resolves an owner from the bare +record_id+ it stored
+         # (+record_type.constantize.active_storage_find(record_id)+). aws-record's
+         # +#find+ is key-hash-based, so adapt it on the single hash key here rather
+         # than shadowing the host's own +#find+ (which app code may call with
+         # aws-record semantics).
+         def active_storage_find(id)
+           find_with_opts(key: { hash_key => id }) ||
+             raise(ActiveStorage::RecordNotFound, "Couldn't find #{name} with id=#{id.inspect}")
+         rescue Aws::Record::Errors::KeyMissing
+           raise ActiveStorage::RecordNotFound, "Couldn't find #{name} with id=#{id.inspect}"
+         end
+       end
+
+       # Run your model's real save inside Active Storage's +:save+ chain so its
+       # attachments flush and upload (with no +:commit+ chain the upload happens in
+       # +after_save+). Aborts the chain — so nothing flushes — if the save returns
+       # falsy. Returns the save result.
+       def run_attachment_save
+         result = nil
+         run_callbacks(:save) do
+           result = yield
+           throw :abort unless result
+           result
+         end
+         result
+       end
+
+       # Run your model's real deletion inside the +:destroy+ chain so attachments
+       # are cleaned up. The block must make the owner non-persisted (e.g. +delete!+)
+       # or Active Storage's +after_destroy+ cleanup is skipped (it only runs for an
+       # owner that was persisted and no longer is).
+       def run_attachment_destroy(&block)
+         run_callbacks(:destroy, &block)
+       end
+     end
+   end
+ end
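
Reviewer sketch of the `changed?` bridge guard used in the `included` block above, reduced to plain Ruby so it runs without Active Support or aws-record. It shows only the "define the bridge unless the host already has one" pattern. `ChangedBridge`, `PlainModel`, and `ModelWithOwnChanged` are hypothetical names, and the hand-written `dirty?` methods stand in for the one aws-record provides.

```ruby
# Hypothetical module illustrating the guard; it is not the gem's Attachable.
module ChangedBridge
  def self.included(base)
    # Bridge only when the host has no public, protected, or private changed?.
    unless base.method_defined?(:changed?) || base.private_method_defined?(:changed?)
      base.define_method(:changed?) { dirty? }
    end
  end
end

# Host without its own changed?: the bridge delegates to dirty?.
class PlainModel
  def dirty?
    true
  end

  include ChangedBridge
end

# Host that already defines changed?: the bridge leaves it untouched.
class ModelWithOwnChanged
  def dirty?
    true
  end

  def changed?
    false
  end

  include ChangedBridge
end
```

Here `PlainModel.new.changed?` follows `dirty?`, while `ModelWithOwnChanged.new.changed?` keeps the host's own answer. Because the check runs at include time, the order matters: a host that defines `changed?` after including the module would simply override the bridge.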