active_record_proxy_adapters 0.2.0 → 0.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +4 -0
- data/README.md +225 -3
- data/lib/active_record_proxy_adapters/primary_replica_proxy.rb +9 -0
- data/lib/active_record_proxy_adapters/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: f5fc4961ba1a8bce59f560d529b55eb965cd27971d92772f95f57b7a21bce30c
|
4
|
+
data.tar.gz: 9c5c16ed9546f3be6df5e191e20483bb2177473b6c934ac589035a75771189e9
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 8a97f02e6e212666c43490ebebe13957a9226103fb59ab2142de1905a10cc6357372314736f79fb86e141b5efdff0bfc1078a89e4205a52e53c80daf6df6b073
|
7
|
+
data.tar.gz: b8d947651861fbbe6227c5ef6cc670c72a73bd58126c228d1cc6fe253327531e6b1ef299b51ea69402bd986380134be5d39308fdaaa9fc075f127b22ee37ee8d
|
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,9 @@
|
|
1
1
|
## [Unreleased]
|
2
2
|
|
3
|
+
## [0.2.1] - 2025-01-02
|
4
|
+
|
5
|
+
- Fix replica connection pool getter when specific connection name is not found (847e150dd21c5bc619745ee1d9d8fcaa9b8f2eea)
|
6
|
+
|
3
7
|
## [0.2.0] - 2024-12-24
|
4
8
|
|
5
9
|
- Add custom log subscriber to tag queries based on the adapter being used (68b8c1f4191388eb957bf12e0f84289da667e940)
|
data/README.md
CHANGED
@@ -2,6 +2,21 @@
|
|
2
2
|
|
3
3
|
A set of ActiveRecord adapters that leverage Rails native multiple database setup to allow automatic connection switching from _one_ primary pool to _one_ replica pool at the database statement level.
|
4
4
|
|
5
|
+
## Why do I need this?
|
6
|
+
|
7
|
+
Maybe you don't. Since version 6.0, Rails has provided a [Rack middleware](https://guides.rubyonrails.org/active_record_multiple_databases.html#activating-automatic-role-switching) that switches between primary and replica automatically based on the HTTP request (`GET` and `HEAD` requests go to the replica, everything else goes to the primary).
|
8
|
+
|
9
|
+
The caveat is: you are not allowed to do any writes in `GET` or `HEAD` requests (including controller callbacks).
|
10
|
+
Which means, for example, your `devise` callbacks that save user metadata will now crash.
|
11
|
+
So will your `ahoy-matey` callbacks.
|
12
|
+
|
13
|
+
You will then start wrapping those callbacks in `ApplicationRecord.connected_to(role: :writing) {}` blocks as a workaround and, many months later, you have dozens of those (we had nearly 40 when we decided to build this gem).
|
14
|
+
|
15
|
+
By the way, that middleware only works at the HTTP request layer (well, duh! it's a Rack middleware).
|
16
|
+
So not good for background jobs, cron jobs or anything that happens outside the scope of an HTTP request. And, if your application needs a replica at this point, for sure you would benefit from automatic connection switching in background jobs too, wouldn't you?
|
17
|
+
|
18
|
+
This gem is heavily inspired by [Makara](https://github.com/instacart/makara), a fantastic gem built by the Instacart folks, which is [no longer maintained](https://github.com/instacart/makara/issues/393), but we took a slightly different, slimmer approach. We don't support load balancing replicas, and that is by design. We believe that should be done outside the scope of the application (using tools like `Pgpool-II`, `pgcat` or RDS Proxy).
|
19
|
+
|
5
20
|
## Installation
|
6
21
|
|
7
22
|
Install the gem and add to the application's Gemfile by executing:
|
@@ -31,6 +46,15 @@ development:
|
|
31
46
|
# your replica credentials here
|
32
47
|
```
|
33
48
|
|
49
|
+
```ruby
|
50
|
+
# app/models/application_record.rb
|
51
|
+
class ApplicationRecord < ActiveRecord::Base
|
52
|
+
self.abstract_class = true
|
53
|
+
|
54
|
+
connects_to database: { writing: :primary, reading: :primary_replica }
|
55
|
+
end
|
56
|
+
```
|
57
|
+
|
34
58
|
### Off Rails
|
35
59
|
|
36
60
|
```ruby
|
@@ -62,7 +86,7 @@ class ApplicationRecord << ActiveRecord::Base
|
|
62
86
|
end
|
63
87
|
```
|
64
88
|
|
65
|
-
|
89
|
+
## Configuration
|
66
90
|
|
67
91
|
The gem comes preconfigured out of the box. However, if the default configuration does not suit your needs, you can modify it using a `.configure` block:
|
68
92
|
|
@@ -77,7 +101,40 @@ ActiveRecordProxyAdapters.configure do |config|
|
|
77
101
|
end
|
78
102
|
```
|
79
103
|
|
80
|
-
|
104
|
+
## Logging
|
105
|
+
|
106
|
+
```ruby
|
107
|
+
# config/initializers/active_record_proxy_adapters.rb
|
108
|
+
require "active_record_proxy_adapters/log_subscriber"
|
109
|
+
|
110
|
+
ActiveRecordProxyAdapters.configure do |config|
|
111
|
+
config.log_subscriber_primary_prefix = "My primary tag" # defaults to "#{adapter_name} Primary", e.g. "PostgreSQL Primary"
|
112
|
+
config.log_subscriber_replica_prefix = "My replica tag" # defaults to "#{adapter_name} Replica", e.g. "PostgreSQL Replica"
|
113
|
+
end
|
114
|
+
|
115
|
+
# You may want to remove duplicate logs
|
116
|
+
ActiveRecord::LogSubscriber.detach_from :active_record
|
117
|
+
```
|
118
|
+
|
119
|
+
### Example:
|
120
|
+
|
121
|
+
```ruby
|
122
|
+
irb(main):001> User.count ; User.create(name: 'John Doe', email: 'john.doe@example.com') ; 3.times { User.count ; sleep(1) }
|
123
|
+
```
|
124
|
+
yields
|
125
|
+
|
126
|
+
```
|
127
|
+
D, [2024-12-24T17:18:49.151235 #328] DEBUG -- : [My replica tag] User Count (0.5ms) SELECT COUNT(*) FROM "users"
|
128
|
+
D, [2024-12-24T17:18:49.156633 #328] DEBUG -- : [My primary tag] TRANSACTION (0.1ms) BEGIN
|
129
|
+
D, [2024-12-24T17:18:49.157323 #328] DEBUG -- : [My primary tag] User Create (0.4ms) INSERT INTO "users" ("name", "email", "created_at", "updated_at") VALUES ($1, $2, $3, $4) RETURNING "id" [["name", "John Doe"], ["email", "john.doe@example.com"], ["created_at", "2024-12-24 17:18:49.156063"], ["updated_at", "2024-12-24 17:18:49.156063"]]
|
130
|
+
D, [2024-12-24T17:18:49.158305 #328] DEBUG -- : [My primary tag] TRANSACTION (0.7ms) COMMIT
|
131
|
+
D, [2024-12-24T17:18:49.159079 #328] DEBUG -- : [My primary tag] User Count (0.3ms) SELECT COUNT(*) FROM "users"
|
132
|
+
D, [2024-12-24T17:18:50.166105 #328] DEBUG -- : [My primary tag] User Count (1.9ms) SELECT COUNT(*) FROM "users"
|
133
|
+
D, [2024-12-24T17:18:51.169911 #328] DEBUG -- : [My replica tag] User Count (0.9ms) SELECT COUNT(*) FROM "users"
|
134
|
+
=> 3
|
135
|
+
```
|
136
|
+
|
137
|
+
## How it works
|
81
138
|
|
82
139
|
The proxy will analyze each SQL string, using pattern matching, to decide the appropriate connection for it (i.e. if it should go to the primary or replica).
|
83
140
|
|
@@ -89,10 +146,175 @@ The proxy will analyze each SQL string, using pattern matching, to decide the ap
|
|
89
146
|
- All sequence methods (e.g. `nextval`) go to the primary
|
90
147
|
- Everything else goes to the replica
|
91
148
|
|
92
|
-
|
149
|
+
### TL;DR
|
93
150
|
|
94
151
|
All `SELECT` queries go to the _replica_, everything else goes to _primary_.
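
The routing rule can be pictured with a small sketch. This is not the gem's actual implementation: the module name, the patterns, and the `role_for` method are hypothetical, and the real pattern list is longer than what is shown here. It only illustrates the idea of matching the SQL string to pick a connection, with sequence functions taking priority over the `SELECT` rule.

```ruby
# Hypothetical sketch of statement-level routing by pattern matching.
# Names and patterns are illustrative assumptions, not the gem's API.
module SqlRoutingSketch
  # Sequence functions change state, so they must run on the primary
  # even when they appear inside a SELECT.
  SEQUENCE_PATTERN = /\b(nextval|currval|lastval|setval)\s*\(/i
  SELECT_PATTERN   = /\A\s*select\b/i

  # Returns :primary or :replica for a given SQL string.
  def self.role_for(sql)
    return :primary if sql.match?(SEQUENCE_PATTERN)
    return :replica if sql.match?(SELECT_PATTERN)

    :primary
  end
end

SqlRoutingSketch.role_for('SELECT COUNT(*) FROM "users"')
SqlRoutingSketch.role_for("SELECT nextval('users_id_seq')")
SqlRoutingSketch.role_for('INSERT INTO "users" ("name") VALUES ($1)')
```

Checking the sequence pattern first is the design choice that matters here: a plain "starts with `SELECT`" test would send `SELECT nextval(...)` to a read-only replica.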
|
95
152
|
|
153
|
+
## Stickiness context
|
154
|
+
|
155
|
+
Similar to Rails' built-in [automatic role switching](https://guides.rubyonrails.org/active_record_multiple_databases.html#activating-automatic-role-switching) Rack middleware, the proxy guarantees read-your-own-writes consistency by keeping a contextual timestamp for each Adapter Instance (a.k.a. what you get when you call `Model.connection`).
|
156
|
+
|
157
|
+
Until `config.proxy_delay` time has been reached, all subsequent read requests _only for that connection_ will be rerouted to the primary. Once that has been reached, all following read requests will go to the replica.
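
As a rough illustration of that timestamp check, here is a minimal sketch. The class name, its methods, and the injectable clock are assumptions made for the example; the gem's internals may look quite different. Each instance stands in for one adapter instance, so a write on one connection does not affect reads on another.

```ruby
# Hypothetical sketch of per-connection stickiness; not the gem's code.
class StickinessSketch
  # proxy_delay: seconds to keep reads on the primary after a write.
  # clock: injectable time source (monotonic by default) to ease testing.
  def initialize(proxy_delay:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @proxy_delay = proxy_delay
    @clock = clock
    @last_write_at = nil
  end

  # Call whenever a statement is sent to the primary as a write.
  def record_write
    @last_write_at = @clock.call
  end

  # Reads stay on the primary until proxy_delay has elapsed since the
  # last write on this connection; otherwise they go to the replica.
  def role_for_read
    return :replica if @last_write_at.nil?

    (@clock.call - @last_write_at) < @proxy_delay ? :primary : :replica
  end
end

context = StickinessSketch.new(proxy_delay: 2.0)
context.role_for_read # no write yet on this connection
context.record_write
context.role_for_read # inside the delay window
```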
|
158
|
+
|
159
|
+
Although the gem comes configured out of the box with `config.proxy_delay = 2.seconds`, it is your responsibility to find the proper number to use here, as that is very particular to each application and may be affected by many different factors (e.g. hardware, workload, availability, fault-tolerance, etc.). **Do not use this gem** if you don't have any replication delay metrics available in your production APM. And make sure you have the proper alerts set up in case there's a spike in replication delay.
|
160
|
+
|
161
|
+
One strategy you can use to quickly disable the proxy is to set your adapter using an environment variable:
|
162
|
+
|
163
|
+
```yaml
|
164
|
+
# config/database.yml
|
165
|
+
production:
|
166
|
+
primary:
|
167
|
+
adapter: <%= ENV.fetch("PRIMARY_DATABASE_ADAPTER", "postgresql") %>
|
168
|
+
primary_replica:
|
169
|
+
adapter: postgresql
|
170
|
+
replica: true
|
171
|
+
```
|
172
|
+
Then set `PRIMARY_DATABASE_ADAPTER=postgresql_proxy` to enable the proxy.
|
173
|
+
That way you can redeploy your application disabling the proxy completely, without any code change.
|
174
|
+
|
175
|
+
### Sticking to the primary database manually
|
176
|
+
|
177
|
+
The proxy respects ActiveRecord's `#connected_to_stack` and will use it if present.
|
178
|
+
You can use that to force connection to the primary or replica and bypass the proxy entirely.
|
179
|
+
|
180
|
+
```ruby
|
181
|
+
User.create(name: 'John Doe', email: 'john.doe@example.com')
|
182
|
+
last_user = User.last # This would normally go to the primary to adhere to read-your-own-writes consistency
|
183
|
+
last_user = ApplicationRecord.connected_to(role: :reading) { User.last } # but I can override it with this block
|
184
|
+
```
|
185
|
+
|
186
|
+
This is useful when picking up a background job that could be impacted by replication delay.
|
187
|
+
|
188
|
+
```ruby
|
189
|
+
# app/models/application_record.rb
|
190
|
+
class ApplicationRecord < ActiveRecord::Base
|
191
|
+
self.abstract_class = true
|
192
|
+
|
193
|
+
connects_to database: { writing: :primary, reading: :primary_replica }
|
194
|
+
end
|
195
|
+
|
196
|
+
# app/models/user.rb
|
197
|
+
class User < ApplicationRecord
|
198
|
+
validates :name, :email, presence: true
|
199
|
+
|
200
|
+
after_commit :say_hello, on: :create
|
201
|
+
|
202
|
+
private
|
203
|
+
|
204
|
+
def say_hello
|
205
|
+
SayHelloJob.perform_later(id) # new row may not be replicated yet
|
206
|
+
end
|
207
|
+
end
|
208
|
+
|
209
|
+
# app/jobs/say_hello_job.rb
|
210
|
+
class SayHelloJob < ApplicationJob
|
211
|
+
def perform(user_id)
|
212
|
+
# so we manually reroute it to the primary
|
213
|
+
user = ApplicationRecord.connected_to(role: :writing) { User.find(user_id) }
|
214
|
+
|
215
|
+
UserMailer.welcome(user).deliver_now
|
216
|
+
end
|
217
|
+
end
|
218
|
+
```
|
219
|
+
|
220
|
+
### Thread safety
|
221
|
+
|
222
|
+
Since Rails already leases exactly one connection per thread from the pool and the adapter operates on that premise, it is safe to use it in multi-threaded servers such as Puma.
|
223
|
+
|
224
|
+
As long as you're not writing thread-unsafe code that handles connections from the pool directly, and you don't have any other gem dependencies that perform thread-unsafe pool operations, you're all set.
|
225
|
+
|
226
|
+
There is, however, an open bug in `ActiveRecord::ConnectionAdapters::PostgreSQLAdapter` for Rails versions 7.1 and greater that can cause random race conditions, but it's not caused by this gem (More info [here](https://github.com/rails/rails/issues/51780)).
|
227
|
+
Rails 7.0 works as expected.
|
228
|
+
|
229
|
+
Multi-threaded queries example:
|
230
|
+
```ruby
|
231
|
+
# app/models/application_record.rb
|
232
|
+
class ApplicationRecord < ActiveRecord::Base
|
233
|
+
self.abstract_class = true
|
234
|
+
|
235
|
+
connects_to database: { writing: :primary, reading: :primary_replica }
|
236
|
+
end
|
237
|
+
|
238
|
+
# app/models/portal.rb
|
239
|
+
class Portal < ApplicationRecord
|
240
|
+
end
|
241
|
+
|
242
|
+
# in rails console -e test
|
243
|
+
ActiveRecord::Base.logger.formatter = proc do |_severity, _time, _progname, msg|
|
244
|
+
"[#{Time.current.iso8601} THREAD #{Thread.current[:name]}] #{msg}\n"
|
245
|
+
end
|
246
|
+
|
247
|
+
def read_your_own_writes
|
248
|
+
proc do
|
249
|
+
Portal.all.count # should go to the replica
|
250
|
+
FactoryBot.create(:portal)
|
251
|
+
|
252
|
+
5.times do
|
253
|
+
Portal.all.count # first one goes to the primary, last 4 should go to the replica
|
254
|
+
sleep(3)
|
255
|
+
end
|
256
|
+
end
|
257
|
+
end
|
258
|
+
|
259
|
+
def use_replica
|
260
|
+
proc do
|
261
|
+
5.times do
|
262
|
+
Portal.all.count # should always go to the replica
|
263
|
+
sleep(1.5)
|
264
|
+
end
|
265
|
+
end
|
266
|
+
end
|
267
|
+
|
268
|
+
def executor
|
269
|
+
Rails.application.executor
|
270
|
+
end
|
271
|
+
|
272
|
+
def test_multithread_queries
|
273
|
+
ActiveRecordProxyAdapters.configure do |config|
|
274
|
+
config.proxy_delay = 2.seconds
|
275
|
+
config.checkout_timeout = 2.seconds
|
276
|
+
end
|
277
|
+
|
278
|
+
t1 = Thread.new do
|
279
|
+
Thread.current[:name] = "USE REPLICA"
|
280
|
+
executor.wrap { ActiveRecord::Base.uncached { use_replica.call } }
|
281
|
+
end
|
282
|
+
|
283
|
+
t2 = Thread.new do
|
284
|
+
Thread.current[:name] = "READ YOUR OWN WRITES"
|
285
|
+
executor.wrap { ActiveRecord::Base.uncached { read_your_own_writes.call } }
|
286
|
+
end
|
287
|
+
|
288
|
+
[t1, t2].each(&:join)
|
289
|
+
end
|
290
|
+
```
|
291
|
+
|
292
|
+
Yields:
|
293
|
+
```bash
|
294
|
+
irb(main):051:0> test_multithread_queries
|
295
|
+
[2024-12-24T13:52:40-05:00 THREAD USE REPLICA] [PostgreSQL Replica] Portal Count (1.4ms) SELECT COUNT(*) FROM "portals"
|
296
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQL Replica] Portal Count (0.4ms) SELECT COUNT(*) FROM "portals"
|
297
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] TRANSACTION (0.5ms) BEGIN
|
298
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] Portal Exists? (1.2ms) SELECT 1 AS one FROM "portals" WHERE "portals"."id" IS NOT NULL AND "portals"."slug" = $1 LIMIT $2 [["slug", "portal-e065948fbbee73d3b2c576b48c2b37e021115158edc6a92390d613640460e1d4"], ["LIMIT", 1]]
|
299
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] Portal Exists? (0.4ms) SELECT 1 AS one FROM "portals" WHERE "portals"."name" = $1 LIMIT $2 [["name", "Portal-e065948fbbee73d3b2c576b48c2b37e021115158edc6a92390d613640460e1d4"], ["LIMIT", 1]]
|
300
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] Portal Create (0.8ms) INSERT INTO "portals" ("name", "slug", "logo", "created_at", "updated_at", "visible") VALUES ($1, $2, $3, $4, $5, $6) RETURNING "id" [["name", "Portal-e065948fbbee73d3b2c576b48c2b37e021115158edc6a92390d613640460e1d4"], ["slug", "portal-e065948fbbee73d3b2c576b48c2b37e021115158edc6a92390d613640460e1d4"], ["logo", nil], ["created_at", "2024-12-24 18:52:40.428383"], ["updated_at", "2024-12-24 18:52:40.428383"], ["visible", true]]
|
301
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] TRANSACTION (0.7ms) COMMIT
|
302
|
+
[2024-12-24T13:52:40-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQLProxy Primary] Portal Count (0.6ms) SELECT COUNT(*) FROM "portals"
|
303
|
+
[2024-12-24T13:52:41-05:00 THREAD USE REPLICA] [PostgreSQL Replica] Portal Count (4.4ms) SELECT COUNT(*) FROM "portals"
|
304
|
+
[2024-12-24T13:52:43-05:00 THREAD USE REPLICA] [PostgreSQL Replica] Portal Count (3.3ms) SELECT COUNT(*) FROM "portals"
|
305
|
+
[2024-12-24T13:52:43-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQL Replica] Portal Count (2.8ms) SELECT COUNT(*) FROM "portals"
|
306
|
+
[2024-12-24T13:52:44-05:00 THREAD USE REPLICA] [PostgreSQL Replica] Portal Count (18.0ms) SELECT COUNT(*) FROM "portals"
|
307
|
+
[2024-12-24T13:52:46-05:00 THREAD USE REPLICA] [PostgreSQL Replica] Portal Count (0.9ms) SELECT COUNT(*) FROM "portals"
|
308
|
+
[2024-12-24T13:52:46-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQL Replica] Portal Count (2.3ms) SELECT COUNT(*) FROM "portals"
|
309
|
+
[2024-12-24T13:52:49-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQL Replica] Portal Count (7.2ms) SELECT COUNT(*) FROM "portals"
|
310
|
+
[2024-12-24T13:52:52-05:00 THREAD READ YOUR OWN WRITES] [PostgreSQL Replica] Portal Count (3.7ms) SELECT COUNT(*) FROM "portals"
|
311
|
+
=> [#<Thread:0x00007fffdd6c9348 (irb):38 dead>, #<Thread:0x00007fffdd6c9230 (irb):43 dead>]
|
312
|
+
```
|
313
|
+
|
314
|
+
## Building your own proxy
|
315
|
+
|
316
|
+
TODO: update instructions
|
317
|
+
|
96
318
|
## Development
|
97
319
|
|
98
320
|
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
@@ -70,9 +70,18 @@ module ActiveRecordProxyAdapters
|
|
70
70
|
end
|
71
71
|
|
72
72
|
def replica_pool
|
73
|
+
# fall back to the default (ActiveRecord::Base) pool if no pool is found for the specific connection class
|
74
|
+
specific_replica_pool || default_replica_pool
|
75
|
+
end
|
76
|
+
|
77
|
+
def specific_replica_pool
|
73
78
|
connection_handler.retrieve_connection_pool(connection_class.name, role: reading_role)
|
74
79
|
end
|
75
80
|
|
81
|
+
def default_replica_pool
|
82
|
+
connection_handler.retrieve_connection_pool(ActiveRecord::Base.name, role: reading_role)
|
83
|
+
end
|
84
|
+
|
76
85
|
def connection_class
|
77
86
|
active_record_context.connection_class_for(primary_connection)
|
78
87
|
end
|
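
The fallback introduced in `replica_pool` can be tried in isolation with a plain-Ruby sketch. The class below is a hypothetical stand-in: it replaces the connection handler with a hash keyed by owner name and role, and none of its names come from the gem or from Rails. It only shows the `specific || default` lookup order from the change above.

```ruby
# Hypothetical stand-in for the lookup order in the fix; a Hash keyed by
# [owner_name, role] plays the part of the connection handler.
class PoolLookupSketch
  DEFAULT_OWNER = "ActiveRecord::Base"

  def initialize(pools, reading_role: :reading)
    @pools = pools
    @reading_role = reading_role
  end

  # Prefer the pool registered for the specific class; fall back to the
  # default owner's pool when the lookup returns nil.
  def replica_pool(owner_name)
    specific_replica_pool(owner_name) || default_replica_pool
  end

  private

  def specific_replica_pool(owner_name)
    @pools[[owner_name, @reading_role]]
  end

  def default_replica_pool
    @pools[[DEFAULT_OWNER, @reading_role]]
  end
end

pools = {
  ["ActiveRecord::Base", :reading] => :default_replica_pool,
  ["AnalyticsRecord", :reading]    => :analytics_replica_pool
}
lookup = PoolLookupSketch.new(pools)
lookup.replica_pool("AnalyticsRecord")   # a pool registered for that class
lookup.replica_pool("ApplicationRecord") # nothing registered, uses the default
```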
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: active_record_proxy_adapters
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.2.
|
4
|
+
version: 0.2.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Matt Cruz
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2025-01-02 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activerecord
|