solid_cache 0.4.2 → 0.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: bbca784f86b0c917fa3deb2eb48da304530332424df3e74d0d7241fa4d465ef0
-   data.tar.gz: f94bcbf1aee676ef3a9a719ec3dc40ce7b0a62838cdc903686de0f61bbf29cc4
+   metadata.gz: 1045c342c381976b1af701309a06bcfc3334afb1e7f2b83e39f1064718fefeb5
+   data.tar.gz: 59a6672e613a99b4bcf3d9c4215bd1d80a1bb2bd1ffcda3a4c5a0ac00d51bade
  SHA512:
-   metadata.gz: 6634aa88669e4f64d296e84883a524e406c6692ff47c8f2a92543923cb7da57d14caf74fe3da0273dd68920316fdcaab6b596c6904882cdf508b3658e3f202ba
-   data.tar.gz: e07c191310d07621fd8056f902611e519334f70578141cdd0e0230434370819161e53600cbf839006cc1238357a65a51ea9c7e4932146a6d72906bfce6f8bc12
+   metadata.gz: a55f8ce342e3b205ba5b37230328199adb194ec138eb78a27a6a1458c257377e9bf4d36db82bccaba7da3c345d92b25f2f3ae3d5f1da3296b4dbc75fdac69000
+   data.tar.gz: 270d176c9271089141fe72e6464f5cfb09f6dd052636831a63cd862eb0fc694ad3c210d4dacc70becd34371c1710e27b0d316068964ee45729d1adc75a184ad3
data/README.md CHANGED
@@ -1,15 +1,11 @@
  # Solid Cache

- **Upgrading from v0.3.0 or earlier? Please see [upgrading to version 0.4.0](upgrading_to_version_0.4.x.md)**
+ **Upgrading from v0.3.0 or earlier? Please see [upgrading to version 0.4.x and beyond](upgrading_to_version_0.4.x.md)**

  Solid Cache is a database-backed Active Support cache store implementation.

  Using SQL databases backed by SSDs we can have caches that are much larger and cheaper than traditional memory-only Redis or Memcached backed caches.

- Testing on [HEY](https://hey.com) shows that reads and writes are 25%-50% slower than with a Redis cache (1.2ms vs 0.8-1ms per single-key read), but this is not a significant percentage of the overall request time.
-
- If cache misses are expensive (up to 50x the cost of a hit on HEY), then there are big advantages to caches that can hold months rather than days of data.
-
  ## Usage

  To set Solid Cache as your Rails cache, you should add this to your environment config:
@@ -18,7 +14,7 @@ To set Solid Cache as your Rails cache, you should add this to your environment
  config.cache_store = :solid_cache_store
  ```

- Solid Cache is a FIFO (first in, first out) cache. While this is not as efficient as an LRU cache, this is mitigated by the longer cache lifespans.
+ Solid Cache is a FIFO (first in, first out) cache. While this is not as efficient as an LRU cache, this is mitigated by the longer cache lifespan.

  A FIFO cache is much easier to manage:
  1. We don't need to track when items are read
@@ -55,23 +51,66 @@ $ bin/rails db:migrate

  ### Configuration

+ Configuration will be read from `config/solid_cache.yml`. You can change the location of the config file by setting the `SOLID_CACHE_CONFIG` env variable.
+
+ The format of the file is:
+
+ ```yml
+ default:
+   store_options: &default_store_options
+     max_age: <%= 60.days.to_i %>
+     namespace: <%= Rails.env %>
+     size_estimate_samples: 1000
+
+ development: &development
+   database: development_cache
+   store_options:
+     <<: *default_store_options
+     max_size: <%= 256.gigabytes %>
+
+ production: &production
+   databases: [production_cache1, production_cache2]
+   store_options:
+     <<: *default_store_options
+     max_size: <%= 256.gigabytes %>
+ ```
+
+ For the full list of keys for `store_options` see [Cache configuration](#cache-configuration). Any options passed to the cache lookup will override those specified here.
+
+ #### Connection configuration
+
+ You can set one of `database`, `databases` or `connects_to` in the config file. It will be used to configure the cache databases via `SolidCache::Record.connects_to`.
+
+ Setting `database` to `cache_db` will configure with:
+
+ ```ruby
+ SolidCache::Record.connects_to database: { writing: :cache_db }
+ ```
+
+ Setting `databases` to `[cache_db1, cache_db2]` is the equivalent of:
+
+ ```ruby
+ SolidCache::Record.connects_to shards: { cache_db1: { writing: :cache_db1 }, cache_db2: { writing: :cache_db2 } }
+ ```
+
+ If `connects_to` is set, it will be passed through directly.
+
+ If none of these are set, then Solid Cache will use the `ActiveRecord::Base` connection pool. This means that cache reads and writes will be part of any wrapping database transaction.
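For example, a minimal sketch of that coupling (hypothetical code, assuming no dedicated cache database is configured):

```ruby
# Hypothetical illustration: with no `database`, `databases` or `connects_to`
# set, cache writes share the app's connection and roll back with it.
ActiveRecord::Base.transaction do
  Rails.cache.write("greeting", "hello") # written inside the app transaction
  raise ActiveRecord::Rollback           # the rollback discards the cache write too
end
Rails.cache.read("greeting") # => nil
```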
+
  #### Engine configuration

- There are two options that can be set on the engine:
+ There are three options that can be set on the engine:

  - `executor` - the [Rails executor](https://guides.rubyonrails.org/threading_and_code_execution.html#executor) used to wrap asynchronous operations, defaults to the app executor
- - `connects_to` - a custom connects to value for the abstract `SolidCache::Record` active record model. Required for sharding and/or using a separate cache database to the main app.
+ - `connects_to` - a custom `connects_to` value for the abstract `SolidCache::Record` Active Record model. Required for sharding and/or using a separate cache database from the main app. This will override any value set in `config/solid_cache.yml`.
+ - `size_estimate_samples` - if `max_size` is set on the cache, the number of samples used to estimate the size.

  These can be set in your Rails configuration:

  ```ruby
  Rails.application.configure do
-   config.solid_cache.connects_to = {
-     shards: {
-       shard1: { writing: :cache_primary_shard1 },
-       shard2: { writing: :cache_primary_shard2 }
-     }
-   }
+   config.solid_cache.size_estimate_samples = 1000
  end
  ```

@@ -85,6 +124,7 @@ Solid Cache supports these options in addition to the standard `ActiveSupport::C
  - `expiry_queue` - which queue to add expiry jobs to (default: `default`)
  - `max_age` - the maximum age of entries in the cache (default: `2.weeks.to_i`). Can be set to `nil`, but this is not recommended unless using `max_entries` to limit the size of the cache.
  - `max_entries` - the maximum number of entries allowed in the cache (default: `nil`, meaning no limit)
+ - `max_size` - the maximum size of the cache entries in bytes (default: `nil`, meaning no limit)
  - `cluster` - a Hash of options for the cache database cluster, e.g. `{ shards: [:database1, :database2, :database3] }`
  - `clusters` - an Array of Hashes for multiple cache clusters (ignored if `:cluster` is set)
  - `active_record_instrumentation` - whether to instrument the cache's queries (default: `true`)
@@ -95,13 +135,15 @@ For more information on cache clusters see [Sharding the cache](#sharding-the-ca

  ### Cache expiry

- Solid Cache tracks writes to the cache. For every write it increments a counter by 1. Once the counter reaches 80% of the `expiry_batch_size` it adds a task to run on a background thread. That task will:
+ Solid Cache tracks writes to the cache. For every write it increments a counter by 1. Once the counter reaches 50% of the `expiry_batch_size` it adds a task to run on a background thread. That task will:

- 1. Check if we have exceeded the `max_entries` value (if set) by subtracting the max and min IDs from the `SolidCache::Entry` table (this is an estimate that ignores any gaps).
+ 1. Check if we have exceeded the `max_entries` or `max_size` values (if set).
+    The current entry count is estimated by subtracting the minimum ID from the maximum ID in the `SolidCache::Entry` table.
+    The current size is estimated by sampling the entry `byte_size` columns.
  2. If we have, it will delete `expiry_batch_size` entries.
  3. If not, it will delete up to `expiry_batch_size` entries, provided they are all older than `max_age`.

- Expiring when we reach 80% of the batch size allows us to expire records from the cache faster than we write to it when we need to reduce the cache size.
+ Expiring when we reach 50% of the batch size allows us to expire records from the cache faster than we write to it when we need to reduce the cache size.

  Only triggering expiry when we write means that if the cache is idle, the background thread is also idle.
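To make the arithmetic concrete, a rough sketch with illustrative numbers (the 2× deletion rate matches the `EXPIRY_MULTIPLIER = 2` constant further down in this diff; the batch size here is hypothetical):

```ruby
# Illustrative numbers only, not code from the gem.
expiry_batch_size = 100
trigger_threshold = expiry_batch_size * 0.5 # expiry task queued every ~50 writes
deleted_per_task  = expiry_batch_size       # each task deletes up to 100 entries
deleted_per_task / trigger_threshold        # => 2.0 potential deletions per write,
                                            # so a full cache can shrink while busy
```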
@@ -136,10 +178,10 @@ $ mv db/migrate/*.solid_cache.rb db/cache/migrate
  ```

  Set the engine configuration to point to the new database:
- ```
- Rails.application.configure do
-   config.solid_cache.connects_to = { database: { writing: :cache } }
- end
+ ```yaml
+ # config/solid_cache.yml
+ production:
+   database: cache
  ```

  Run migrations:
@@ -172,44 +214,28 @@ production:
    host: cache3-db
  ```

- ```ruby
- # config/environment/production.rb
- Rails.application.configure do
-   config.solid_cache.connects_to = {
-     shards: {
-       cache_shard1: { writing: :cache_shard1 },
-       cache_shard2: { writing: :cache_shard2 },
-       cache_shard3: { writing: :cache_shard3 },
-     }
-   }
-
-   config.cache_store = [ :solid_cache_store, cluster: { shards: [ :cache_shard1, :cache_shard2, :cache_shard3 ] } ]
- end
+ ```yaml
+ # config/solid_cache.yml
+ production:
+   databases: [cache_shard1, cache_shard2, cache_shard3]
  ```

  ### Secondary cache clusters

  You can add secondary cache clusters. Reads will only be sent to the primary cluster (i.e. the first one listed).

- Writes will go to all clusters. The writes to the primary cluster are synchronous, but asyncronous to the secondary clusters.
+ Writes will go to all clusters. The writes to the primary cluster are synchronous, but asynchronous to the secondary clusters.

  To specify multiple clusters you can do:

- ```ruby
- Rails.application.configure do
-   config.solid_cache.connects_to = {
-     shards: {
-       cache_primary_shard1: { writing: :cache_primary_shard1 },
-       cache_primary_shard2: { writing: :cache_primary_shard2 },
-       cache_secondary_shard1: { writing: :cache_secondary_shard1 },
-       cache_secondary_shard2: { writing: :cache_secondary_shard2 },
-     }
-   }
-
-   primary_cluster = { shards: [ :cache_primary_shard1, :cache_primary_shard2 ] }
-   secondary_cluster = { shards: [ :cache_secondary_shard1, :cache_secondary_shard2 ] }
-   config.cache_store = [ :solid_cache_store, clusters: [ primary_cluster, secondary_cluster ] ]
- end
+ ```yaml
+ # config/solid_cache.yml
+ production:
+   databases: [cache_primary_shard1, cache_primary_shard2, cache_secondary_shard1, cache_secondary_shard2]
+   store_options:
+     clusters:
+       - shards: [cache_primary_shard1, cache_primary_shard2]
+       - shards: [cache_secondary_shard1, cache_secondary_shard2]
  ```

  ### Named shard destinations
@@ -218,24 +244,19 @@ By default, the node key used for sharding is the name of the database in `datab

  It is possible to add names for the shards in the cluster config. This will allow you to shuffle or remove shards without breaking consistent hashing.

- ```ruby
- Rails.application.configure do
-   config.solid_cache.connects_to = {
-     shards: {
-       cache_primary_shard1: { writing: :cache_primary_shard1 },
-       cache_primary_shard2: { writing: :cache_primary_shard2 },
-       cache_secondary_shard1: { writing: :cache_secondary_shard1 },
-       cache_secondary_shard2: { writing: :cache_secondary_shard2 },
-     }
-   }
-
-   primary_cluster = { shards: { cache_primary_shard1: :node1, cache_primary_shard2: :node2 } }
-   secondary_cluster = { shards: { cache_primary_shard1: :node3, cache_primary_shard2: :node4 } }
-   config.cache_store = [ :solid_cache_store, clusters: [ primary_cluster, secondary_cluster ] ]
- end
+ ```yaml
+ production:
+   databases: [cache_primary_shard1, cache_primary_shard2, cache_secondary_shard1, cache_secondary_shard2]
+   store_options:
+     clusters:
+       - shards:
+           cache_primary_shard1: node1
+           cache_primary_shard2: node2
+       - shards:
+           cache_secondary_shard1: node3
+           cache_secondary_shard2: node4
  ```

-
  ### Enabling encryption

  Add this to an initializer:
@@ -254,7 +275,7 @@ The Solid Cache migrations try to create an index with 1024 byte entries. If tha

  ## Development

- Run the tests with `bin/rails test`. By default, these will run against SQLite.
+ Run the tests with `bin/rake test`. By default, these will run against SQLite.

  You can also run the tests against MySQL and PostgreSQL. First start up the databases:

@@ -273,8 +294,8 @@ $ TARGET_DB=postgres bin/rails db:setup
  Then run the tests for the target database:

  ```shell
- $ TARGET_DB=mysql bin/rails test
- $ TARGET_DB=postgres bin/rails test
+ $ TARGET_DB=mysql bin/rake test
+ $ TARGET_DB=postgres bin/rake test
  ```

  ### Testing with multiple Rails versions
@@ -285,7 +306,7 @@ multiple Rails version.
  To run a test for a specific version run:

  ```shell
- bundle exec appraisal rails-7-1 bin/rails test
+ bundle exec appraisal rails-7-1 bin/rake test
  ```

  After updating the dependencies in the `Gemfile` please run:
data/Rakefile CHANGED
@@ -8,3 +8,37 @@ load "rails/tasks/engine.rake"
  load "rails/tasks/statistics.rake"

  require "bundler/gem_tasks"
+ require "rake/testtask"
+
+ def run_without_aborting(*tasks)
+   errors = []
+
+   tasks.each do |task|
+     Rake::Task[task].invoke
+   rescue Exception
+     errors << task
+   end
+
+   abort "Errors running #{errors.join(', ')}" if errors.any?
+ end
+
+ def configs
+   [ :default, :cluster, :cluster_inferred, :clusters, :clusters_named, :database, :no_database ]
+ end
+
+ task :test do
+   tasks = configs.map { |config| "test:#{config}" }
+   run_without_aborting(*tasks)
+ end
+
+ configs.each do |config|
+   namespace :test do
+     task config do
+       if config == :default
+         sh("bin/rails test")
+       else
+         sh("SOLID_CACHE_CONFIG=config/solid_cache_#{config}.yml bin/rails test")
+       end
+     end
+   end
+ end
data/app/jobs/solid_cache/expiry_job.rb CHANGED
@@ -2,9 +2,9 @@

  module SolidCache
    class ExpiryJob < ActiveJob::Base
-     def perform(count, shard: nil, max_age:, max_entries:)
+     def perform(count, shard: nil, max_age: nil, max_entries: nil, max_size: nil)
        Record.with_shard(shard) do
-         Entry.expire(count, max_age: max_age, max_entries: max_entries)
+         Entry.expire(count, max_age: max_age, max_entries: max_entries, max_size: max_size)
        end
      end
    end
data/app/models/solid_cache/entry/expiration.rb CHANGED
@@ -6,21 +6,25 @@ module SolidCache
      extend ActiveSupport::Concern

      class_methods do
-       def id_range
-         uncached do
-           pick(Arel.sql("max(id) - min(id) + 1")) || 0
-         end
-       end
-
-       def expire(count, max_age:, max_entries:)
-         if (ids = expiry_candidate_ids(count, max_age: max_age, max_entries: max_entries)).any?
+       def expire(count, max_age:, max_entries:, max_size:)
+         if (ids = expiry_candidate_ids(count, max_age: max_age, max_entries: max_entries, max_size: max_size)).any?
            delete(ids)
          end
        end

        private
-         def expiry_candidate_ids(count, max_age:, max_entries:)
-           cache_full = max_entries && max_entries < id_range
+         def cache_full?(max_entries:, max_size:)
+           if max_entries && max_entries < id_range
+             true
+           elsif max_size && max_size < estimated_size
+             true
+           else
+             false
+           end
+         end
+
+         def expiry_candidate_ids(count, max_age:, max_entries:, max_size:)
+           cache_full = cache_full?(max_entries: max_entries, max_size: max_size)
            return [] unless cache_full || max_age

            # In the case of multiple concurrent expiry operations, it is desirable to
data/app/models/solid_cache/entry/size/estimate.rb ADDED
@@ -0,0 +1,124 @@
+ # frozen_string_literal: true
+
+ module SolidCache
+   class Entry
+     # # Cache size estimation
+     #
+     # We store the size of each cache row in the byte_size field. This allows us to estimate the size of the cache
+     # by sampling those rows.
+     #
+     # To reduce the effect of outliers though, we'll grab the N largest rows, and add their size to a sample-based
+     # estimate of the size of the remaining rows.
+     #
+     # ## Outliers
+     #
+     # There is an index on the byte_size column, so we can efficiently grab the N largest rows. We also grab the
+     # minimum byte_size of those rows, which we'll use as a cutoff for the non-outlier sampling.
+     #
+     # ## Sampling
+     #
+     # To efficiently sample the data we use the key_hash column, which is a random 64 bit integer. There's an index
+     # on key_hash and byte_size so we can grab a sum of the byte_sizes in a range of key_hash directly from that
+     # index.
+     #
+     # To decide how big the range should be, we use the difference between the smallest and largest database IDs as
+     # an estimate of the number of rows in the table. This should be a good estimate, because we delete rows in ID order.
+     #
+     # We then calculate the fraction of the rows we want to sample by dividing the sample size by the estimated number
+     # of rows.
+     #
+     # Then we grab the byte_size sum of the rows in the range of key_hash values, excluding any rows that are larger than
+     # our minimum outlier cutoff. We then divide this by the sampling fraction to get an estimate of the size of the
+     # non-outlier rows.
+     #
+     # ## Equations
+     #
+     # Given N samples and a key_hash range of Kmin..Kmax
+     #
+     #   outliers_cutoff            OC  = min(byte_size of N largest rows)
+     #   outliers_size              OS  = sum(byte_size of N largest rows)
+     #
+     #   estimated number of rows   R   = max(ID) - min(ID) + 1
+     #   sample_fraction            F   = N / R
+     #   sample_range_size          S   = (Kmax - Kmin) * F
+     #   sample range is K1..K2 where K1 = Kmin + rand(Kmax - S) and K2 = K1 + S
+     #
+     #   non_outlier_sample_size    NSS = sum(byte_size of rows in key_hash range K1..K2 where byte_size <= OC)
+     #   non_outlier_estimated_size NES = NSS / F
+     #   estimated_size             ES  = OS + NES
+     module Size
+       class Estimate
+         attr_reader :samples, :max_records
+
+         def initialize(samples:)
+           @samples = samples
+           @max_records ||= Entry.id_range
+         end
+
+         def size
+           outliers_size + non_outlier_estimated_size
+         end
+
+         def exact?
+           outliers_count < samples || sampled_fraction == 1
+         end
+
+         private
+           def outliers_size
+             outliers_size_count_and_cutoff[0]
+           end
+
+           def outliers_count
+             outliers_size_count_and_cutoff[1]
+           end
+
+           def outliers_cutoff
+             outliers_size_count_and_cutoff[2]
+           end
+
+           def outliers_size_count_and_cutoff
+             @outlier_size_and_cutoff ||= Entry.uncached do
+               sum, count, min = Entry.largest_byte_sizes(samples).pick(Arel.sql("sum(byte_size), count(*), min(byte_size)"))
+               sum ? [sum, count, min] : [0, 0, nil]
+             end
+           end
+
+           def non_outlier_estimated_size
+             @non_outlier_estimated_size ||= sampled_fraction.zero? ? 0 : (sampled_non_outlier_size / sampled_fraction).round
+           end
+
+           def sampled_fraction
+             @sampled_fraction ||=
+               if max_records <= samples
+                 0
+               else
+                 [samples.to_f / (max_records - samples), 1].min
+               end
+           end
+
+           def sampled_non_outlier_size
+             @sampled_non_outlier_size ||= Entry.uncached do
+               Entry.in_key_hash_range(sample_range).up_to_byte_size(outliers_cutoff).sum(:byte_size)
+             end
+           end
+
+           def sample_range
+             if sampled_fraction == 1
+               key_hash_range
+             else
+               start = rand(key_hash_range.begin..(key_hash_range.end - sample_range_size))
+               start..(start + sample_range_size)
+             end
+           end
+
+           def key_hash_range
+             Entry::KEY_HASH_ID_RANGE
+           end
+
+           def sample_range_size
+             @sample_range_size ||= (key_hash_range.size * sampled_fraction).to_i
+           end
+       end
+     end
+   end
+ end
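To make those equations concrete, a worked example with invented numbers (not from the gem):

```ruby
# Hypothetical numbers, for illustration only.
samples     = 1_000                                  # N
max_records = 2_000_001                              # R = max(ID) - min(ID) + 1
fraction    = samples.to_f / (max_records - samples) # F ≈ 0.0005
key_span    = 2.0**64                                # Kmax - Kmin
sample_size = key_span * fraction                    # S, width of the key_hash window
# estimated_size = outliers_size + (non_outlier_sample_size / fraction)
```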
data/app/models/solid_cache/entry/size/moving_average_estimate.rb ADDED
@@ -0,0 +1,57 @@
+ # frozen_string_literal: true
+
+ module SolidCache
+   class Entry
+     module Size
+       # Moving average cache size estimation
+       #
+       # To reduce variability in the cache size estimate, we average it with the most recent previous estimates.
+       # The estimates are stored directly in the cache, under the "__solid_cache_entry_size_moving_average_estimates" key.
+       #
+       # The number of estimates retained is chosen so that, together, they cover roughly TARGET_SAMPLED_FRACTION of
+       # the rows, capped at MAX_RETAINED_ESTIMATES.
+       class MovingAverageEstimate
+         ESTIMATES_KEY = "__solid_cache_entry_size_moving_average_estimates"
+         MAX_RETAINED_ESTIMATES = 50
+         TARGET_SAMPLED_FRACTION = 0.0005
+
+         attr_reader :samples, :size
+         delegate :exact?, to: :estimate
+
+         def initialize(samples:)
+           @samples = samples
+           @estimate = Estimate.new(samples: samples)
+           values = latest_values
+           @size = (values.sum / values.size.to_f).round
+           write_values(values)
+         end
+
+         private
+           attr_reader :estimate
+
+           def previous_values
+             Entry.read(ESTIMATES_KEY).presence&.split("|")&.map(&:to_i) || []
+           end
+
+           def latest_value
+             estimate.size
+           end
+
+           def latest_values
+             (previous_values + [latest_value]).last(retained_estimates)
+           end
+
+           def write_values(values)
+             Entry.write(ESTIMATES_KEY, values.join("|"))
+           end
+
+           def retained_estimates
+             [retained_estimates_for_target_fraction, MAX_RETAINED_ESTIMATES].min
+           end
+
+           def retained_estimates_for_target_fraction
+             (estimate.max_records / samples * TARGET_SAMPLED_FRACTION).floor + 1
+           end
+       end
+     end
+   end
+ end
data/app/models/solid_cache/entry/size.rb ADDED
@@ -0,0 +1,21 @@
+ # frozen_string_literal: true
+
+ module SolidCache
+   class Entry
+     module Size
+       extend ActiveSupport::Concern
+
+       included do
+         scope :largest_byte_sizes, -> (limit) { from(order(byte_size: :desc).limit(limit).select(:byte_size)) }
+         scope :in_key_hash_range, -> (range) { where(key_hash: range) }
+         scope :up_to_byte_size, -> (cutoff) { where("byte_size <= ?", cutoff) }
+       end
+
+       class_methods do
+         def estimated_size(samples: SolidCache.configuration.size_estimate_samples)
+           MovingAverageEstimate.new(samples: samples).size
+         end
+       end
+     end
+   end
+ end
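Based on the class method above, the estimate can be fetched directly (a hypothetical console session; `samples:` defaults to the configured `size_estimate_samples`, which is 10,000 unless overridden):

```ruby
# Hypothetical usage of the estimated_size class method defined above.
SolidCache::Entry.estimated_size               # moving-average estimate in bytes
SolidCache::Entry.estimated_size(samples: 500) # cheaper but noisier estimate
```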
data/app/models/solid_cache/entry.rb CHANGED
@@ -2,15 +2,13 @@

  module SolidCache
    class Entry < Record
-     include Expiration
+     include Expiration, Size

-     ID_BYTE_SIZE = 8
-     CREATED_AT_BYTE_SIZE = 8
-     KEY_HASH_BYTE_SIZE = 8
-     VALUE_BYTE_SIZE = 4
-     FIXED_SIZE_COLUMNS_BYTE_SIZE = ID_BYTE_SIZE + CREATED_AT_BYTE_SIZE + KEY_HASH_BYTE_SIZE + VALUE_BYTE_SIZE
-
-     self.ignored_columns += [ :key_hash, :byte_size] if SolidCache.key_hash_stage == :ignored
+     # The estimated cost of an extra row in bytes, including fixed size columns, overhead, indexes and free space.
+     # Based on experimentation on SQLite, MySQL and PostgreSQL.
+     # A bit high for SQLite (more like 90 bytes), but about right for MySQL/PostgreSQL.
+     ESTIMATED_ROW_OVERHEAD = 140
+     KEY_HASH_ID_RANGE = -(2**63)..(2**63 - 1)

      class << self
        def write(key, value)
@@ -22,23 +20,23 @@ module SolidCache
        end

        def read(key)
-         result = select_all_no_query_cache(get_sql, lookup_value(key)).first
+         result = select_all_no_query_cache(get_sql, key_hash_for(key)).first
          result[1] if result&.first == key
        end

        def read_multi(keys)
-         key_hashes = keys.map { |key| lookup_value(key) }
+         key_hashes = keys.map { |key| key_hash_for(key) }
          results = select_all_no_query_cache(get_all_sql(key_hashes), key_hashes).to_h
          results.except!(results.keys - keys)
        end

        def delete_by_key(key)
-         delete_no_query_cache(lookup_column, lookup_value(key))
+         delete_no_query_cache(:key_hash, key_hash_for(key))
        end

        def delete_multi(keys)
-         serialized_keys = keys.map { |key| lookup_value(key) }
-         delete_no_query_cache(lookup_column, serialized_keys)
+         serialized_keys = keys.map { |key| key_hash_for(key) }
+         delete_no_query_cache(:key_hash, serialized_keys)
        end

        def clear_truncate
@@ -52,7 +50,7 @@ module SolidCache
        def increment(key, amount)
          transaction do
            uncached do
-             result = lock.where(lookup_column => lookup_value(key)).pick(:key, :value)
+             result = lock.where(key_hash: key_hash_for(key)).pick(:key, :value)
              amount += result[1].to_i if result&.first == key
              write(key, amount)
              amount
@@ -64,6 +62,12 @@ module SolidCache
          increment(key, -amount)
        end

+       def id_range
+         uncached do
+           pick(Arel.sql("max(id) - min(id) + 1")) || 0
+         end
+       end
+
        private
          def upsert_all_no_query_cache(payloads)
            insert_all = ActiveRecord::InsertAll.new(
@@ -85,65 +89,34 @@ module SolidCache
          def add_key_hash_and_byte_size(payloads)
            payloads.map do |payload|
              payload.dup.tap do |payload|
-               if key_hash?
-                 payload[:key_hash] = key_hash_for(payload[:key])
-                 payload[:byte_size] = byte_size_for(payload)
-               end
+               payload[:key_hash] = key_hash_for(payload[:key])
+               payload[:byte_size] = byte_size_for(payload)
              end
            end
          end

-         def key_hash?
-           @key_hash ||= [ :indexed, :unindexed ].include?(SolidCache.key_hash_stage) &&
-             connection.column_exists?(table_name, :key_hash)
-         end
-
-         def key_hash_indexed?
-           key_hash? && SolidCache.key_hash_stage == :indexed
-         end
-
-         def lookup_column
-           key_hash_indexed? ? :key_hash : :key
-         end
-
-         def lookup_value(key)
-           key_hash_indexed? ? key_hash_for(key) : to_binary(key)
-         end
-
-         def lookup_placeholder
-           key_hash_indexed? ? 1 : "placeholder"
-         end
-
          def exec_query_method
            connection.respond_to?(:internal_exec_query) ? :internal_exec_query : :exec_query
          end

          def upsert_unique_by
-           connection.supports_insert_conflict_target? ? lookup_column : nil
+           connection.supports_insert_conflict_target? ? :key_hash : nil
          end

          def upsert_update_only
-           if key_hash_indexed?
-             [ :key, :value, :byte_size ]
-           elsif key_hash?
-             [ :value, :key_hash, :byte_size ]
-           else
-             [ :value ]
-           end
+           [ :key, :value, :byte_size ]
          end

          def get_sql
-           @get_sql ||= {}
-           @get_sql[lookup_column] ||= build_sql(where(lookup_column => lookup_placeholder).select(:key, :value))
+           @get_sql ||= build_sql(where(key_hash: 1).select(:key, :value))
          end

          def get_all_sql(key_hashes)
            if connection.prepared_statements?
              @get_all_sql_binds ||= {}
-             @get_all_sql_binds[[key_hashes.count, lookup_column]] ||= build_sql(where(lookup_column => key_hashes).select(:key, :value))
+             @get_all_sql_binds[key_hashes.count] ||= build_sql(where(key_hash: key_hashes).select(:key, :value))
            else
-             @get_all_sql_no_binds ||= {}
-             @get_all_sql_no_binds[lookup_column] ||= build_sql(where(lookup_column => [ lookup_placeholder, lookup_placeholder ]).select(:key, :value)).gsub("?, ?", "?")
+             @get_all_sql_no_binds ||= build_sql(where(key_hash: [ 1, 2 ]).select(:key, :value)).gsub("?, ?", "?")
            end
          end

@@ -192,7 +165,7 @@ module SolidCache
          end

          def byte_size_for(payload)
-           payload[:key].to_s.bytesize + payload[:value].to_s.bytesize + FIXED_SIZE_COLUMNS_BYTE_SIZE
+           payload[:key].to_s.bytesize + payload[:value].to_s.bytesize + ESTIMATED_ROW_OVERHEAD
          end
      end
    end
data/app/models/solid_cache/record.rb CHANGED
@@ -6,7 +6,7 @@ module SolidCache

    self.abstract_class = true

-   connects_to(**SolidCache.connects_to) if SolidCache.connects_to
+   connects_to(**SolidCache.configuration.connects_to) if SolidCache.configuration.connects_to

    class << self
      def disable_instrumentation(&block)
@@ -14,12 +14,24 @@ module SolidCache
      end

      def with_shard(shard, &block)
-       if shard && SolidCache.connects_to
+       if shard && SolidCache.configuration.sharded?
          connected_to(shard: shard, role: default_role, prevent_writes: false, &block)
        else
          block.call
        end
      end
+
+     def each_shard(&block)
+       return to_enum(:each_shard) unless block_given?
+
+       if SolidCache.configuration.sharded?
+         SolidCache.configuration.shard_keys.each do |shard|
+           Record.with_shard(shard, &block)
+         end
+       else
+         yield
+       end
+     end
    end
  end
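A hypothetical use of the new `each_shard` helper, which wraps the block in `with_shard` for every configured shard (or runs it once when unsharded):

```ruby
# Hypothetical: total entry count across all cache shards.
SolidCache::Record.each_shard.sum { SolidCache::Entry.count }
```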
data/lib/solid_cache/cluster/connections.rb CHANGED
@@ -3,6 +3,8 @@
  module SolidCache
    class Cluster
      module Connections
+       attr_reader :shard_options
+
        def initialize(options = {})
          super(options)
          @shard_options = options.fetch(:shards, nil)
@@ -40,14 +42,14 @@ module SolidCache
          connections.names
        end

+       def connections
+         @connections ||= SolidCache::Connections.from_config(@shard_options)
+       end
+
        private
          def setup!
            connections
          end
-
-         def connections
-           @connections ||= SolidCache::Connections.from_config(@shard_options)
-         end
      end
    end
  end
data/lib/solid_cache/cluster/execution.rb CHANGED
@@ -19,6 +19,8 @@
            instrument(&block)
          end
        end
+     rescue Exception => exception
+       error_handler&.call(method: :async, exception: exception, returning: nil)
      end
    end

data/lib/solid_cache/cluster/expiry.rb CHANGED
@@ -7,9 +7,9 @@ module SolidCache
    module Expiry
      # For every write that we do, we attempt to delete EXPIRY_MULTIPLIER times as many records.
      # This ensures there is downward pressure on the cache size while there is valid data to delete.
-     EXPIRY_MULTIPLIER = 1.25
+     EXPIRY_MULTIPLIER = 2

-     attr_reader :expiry_batch_size, :expiry_method, :expiry_queue, :expires_per_write, :max_age, :max_entries
+     attr_reader :expiry_batch_size, :expiry_method, :expiry_queue, :expires_per_write, :max_age, :max_entries, :max_size

      def initialize(options = {})
        super(options)
@@ -19,6 +19,7 @@ module SolidCache
        @expires_per_write = (1 / expiry_batch_size.to_f) * EXPIRY_MULTIPLIER
        @max_age = options.fetch(:max_age, 2.weeks.to_i)
        @max_entries = options.fetch(:max_entries, nil)
+       @max_size = options.fetch(:max_size, nil)

        raise ArgumentError, "Expiry method must be one of `:thread` or `:job`" unless [ :thread, :job ].include?(expiry_method)
      end
@@ -36,12 +37,13 @@ module SolidCache
      end

      def expire_later
+       max_options = { max_age: max_age, max_entries: max_entries, max_size: max_size }
        if expiry_method == :job
          ExpiryJob
            .set(queue: expiry_queue)
-           .perform_later(expiry_batch_size, shard: Entry.current_shard, max_age: max_age, max_entries: max_entries)
+           .perform_later(expiry_batch_size, shard: Entry.current_shard, **max_options)
        else
-         async { Entry.expire(expiry_batch_size, max_age: max_age, max_entries: max_entries) }
+         async { Entry.expire(expiry_batch_size, **max_options) }
        end
      end
    end
data/lib/solid_cache/cluster.rb CHANGED
@@ -4,7 +4,10 @@
    class Cluster
      include Connections, Execution, Expiry, Stats

+     attr_reader :error_handler
+
      def initialize(options = {})
+       @error_handler = options[:error_handler]
        super(options)
      end

data/lib/solid_cache/configuration.rb ADDED
@@ -0,0 +1,41 @@
+ # frozen_string_literal: true
+
+ module SolidCache
+   class Configuration
+     attr_reader :store_options, :connects_to, :executor, :size_estimate_samples
+
+     def initialize(store_options: {}, database: nil, databases: nil, connects_to: nil, executor: nil, size_estimate_samples: 10_000)
+       @store_options = store_options
+       @size_estimate_samples = size_estimate_samples
+       @executor = executor
+       set_connects_to(database: database, databases: databases, connects_to: connects_to)
+     end
+
+     def sharded?
+       connects_to && connects_to[:shards]
+     end
+
+     def shard_keys
+       sharded? ? connects_to[:shards].keys : []
+     end
+
+     private
+       def set_connects_to(database:, databases:, connects_to:)
+         if [database, databases, connects_to].compact.size > 1
+           raise ArgumentError, "You can only specify one of :database, :databases, or :connects_to"
+         end
+
+         @connects_to =
+           case
+           when database
+             { database: { writing: database.to_sym } }
+           when databases
+             { shards: databases.map(&:to_sym).index_with { |database| { writing: database } } }
+           when connects_to
+             connects_to
+           else
+             nil
+           end
+       end
+   end
+ end
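Based on `set_connects_to` above, the three config forms map as follows (illustrative):

```ruby
SolidCache::Configuration.new(database: "cache").connects_to
# => { database: { writing: :cache } }

SolidCache::Configuration.new(databases: ["cache1", "cache2"]).connects_to
# => { shards: { cache1: { writing: :cache1 }, cache2: { writing: :cache2 } } }

SolidCache::Configuration.new(connects_to: { shards: { one: { writing: :one } } }).connects_to
# => passed through unchanged
```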
data/lib/solid_cache/connections.rb CHANGED
@@ -3,20 +3,20 @@
  module SolidCache
    module Connections
      def self.from_config(options)
-       if options.present? || SolidCache.all_shards_config.present?
+       if options.present? || SolidCache.configuration.sharded?
          case options
          when NilClass
-           names = SolidCache.all_shard_keys
+           names = SolidCache.configuration.shard_keys
            nodes = names.to_h { |name| [ name, name ] }
          when Array
-           names = options
+           names = options.map(&:to_sym)
            nodes = names.to_h { |name| [ name, name ] }
          when Hash
-           names = options.keys
-           nodes = options.invert
+           names = options.keys.map(&:to_sym)
+           nodes = options.to_h { |names, nodes| [ nodes.to_sym, names.to_sym ] }
          end

-         if (unknown_shards = names - SolidCache.all_shard_keys).any?
+         if (unknown_shards = names - SolidCache.configuration.shard_keys).any?
            raise ArgumentError, "Unknown #{"shard".pluralize(unknown_shards)}: #{unknown_shards.join(", ")}"
          end

data/lib/solid_cache/engine.rb CHANGED
@@ -8,19 +8,28 @@ module SolidCache

    config.solid_cache = ActiveSupport::OrderedOptions.new

-   initializer "solid_cache", before: :run_prepare_callbacks do |app|
-     config.solid_cache.executor ||= app.executor
+   initializer "solid_cache.config", before: :initialize_cache do |app|
+     app.paths.add "config/solid_cache", with: ENV["SOLID_CACHE_CONFIG"] || "config/solid_cache.yml"
+
+     options = {}
+     if (config_path = Pathname.new(app.config.paths["config/solid_cache"].first)).exist?
+       options = app.config_for(config_path).to_h.deep_symbolize_keys
+     end
+
+     options[:connects_to] = config.solid_cache.connects_to if config.solid_cache.connects_to
+     options[:size_estimate_samples] = config.solid_cache.size_estimate_samples if config.solid_cache.size_estimate_samples
+
+     SolidCache.configuration = SolidCache::Configuration.new(**options)

-     SolidCache.executor = config.solid_cache.executor
-     SolidCache.connects_to = config.solid_cache.connects_to
      if config.solid_cache.key_hash_stage
-       unless [:ignored, :unindexed, :indexed].include?(config.solid_cache.key_hash_stage)
-         raise "ArgumentError, :key_hash_stage must be :ignored, :unindexed or :indexed"
-       end
-       SolidCache.key_hash_stage = config.solid_cache.key_hash_stage
+       ActiveSupport.deprecator.warn("config.solid_cache.key_hash_stage is deprecated and has no effect.")
      end
    end

+   initializer "solid_cache.app_executor", before: :run_prepare_callbacks do |app|
+     SolidCache.executor = config.solid_cache.executor || app.executor
+   end
+
    config.after_initialize do
      Rails.cache.setup! if Rails.cache.is_a?(Store)
    end
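Per the `solid_cache.config` initializer above, anything set via `config.solid_cache` takes precedence over the YAML file (a hypothetical override):

```ruby
# Hypothetical: these override the values loaded from config/solid_cache.yml.
Rails.application.configure do
  config.solid_cache.connects_to = { database: { writing: :cache } }
  config.solid_cache.size_estimate_samples = 5_000
end
```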
data/lib/solid_cache/store/clusters.rb CHANGED
@@ -11,7 +11,7 @@ module SolidCache
      clusters_options = options.fetch(:clusters) { [ options.fetch(:cluster, {}) ] }

      @clusters = clusters_options.map.with_index do |cluster_options, index|
-       Cluster.new(options.merge(cluster_options).merge(async_writes: index != 0))
+       Cluster.new(options.merge(cluster_options).merge(async_writes: index != 0, error_handler: error_handler))
      end

      @primary_cluster = clusters.first
data/lib/solid_cache/store.rb CHANGED
@@ -5,6 +5,10 @@ module SolidCache
    include Api, Clusters, Entries, Failsafe
    prepend ActiveSupport::Cache::Strategy::LocalCache

+   def initialize(options = {})
+     super(SolidCache.configuration.store_options.merge(options))
+   end
+
    def self.supports_cache_versioning?
      true
    end
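Given the new `initialize` above, per-store options merge over the configured `store_options` (a hypothetical lookup):

```ruby
# Hypothetical: max_age here overrides the value from config/solid_cache.yml,
# because options passed at lookup are merged last.
cache = ActiveSupport::Cache.lookup_store(:solid_cache_store, max_age: 1.week.to_i)
```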
data/lib/solid_cache/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module SolidCache
-   VERSION = "0.4.2"
+   VERSION = "0.5.1"
  end
data/lib/solid_cache.rb CHANGED
@@ -9,28 +9,8 @@ loader.ignore("#{__dir__}/generators")
  loader.setup

  module SolidCache
-   mattr_accessor :executor, :connects_to
-   mattr_accessor :key_hash_stage, default: :indexed
-
-   def self.all_shard_keys
-     all_shards_config&.keys || []
-   end
-
-   def self.all_shards_config
-     connects_to && connects_to[:shards]
-   end
-
-   def self.each_shard(&block)
-     return to_enum(:each_shard) unless block_given?
-
-     if (shards = all_shards_config&.keys)
-       shards.each do |shard|
-         Record.with_shard(shard, &block)
-       end
-     else
-       yield
-     end
-   end
+   mattr_accessor :executor
+   mattr_accessor :configuration, default: Configuration.new
  end

  loader.eager_load
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: solid_cache
  version: !ruby/object:Gem::Version
-   version: 0.4.2
+   version: 0.5.1
  platform: ruby
  authors:
  - Donal McBreen
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-01-29 00:00:00.000000000 Z
+ date: 2024-02-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: activerecord
@@ -93,6 +93,9 @@ files:
  - app/jobs/solid_cache/expiry_job.rb
  - app/models/solid_cache/entry.rb
  - app/models/solid_cache/entry/expiration.rb
+ - app/models/solid_cache/entry/size.rb
+ - app/models/solid_cache/entry/size/estimate.rb
+ - app/models/solid_cache/entry/size/moving_average_estimate.rb
  - app/models/solid_cache/record.rb
  - db/migrate/20230724121448_create_solid_cache_entries.rb
  - db/migrate/20240108155507_add_key_hash_and_byte_size_to_solid_cache_entries.rb
@@ -107,6 +110,7 @@ files:
  - lib/solid_cache/cluster/execution.rb
  - lib/solid_cache/cluster/expiry.rb
  - lib/solid_cache/cluster/stats.rb
+ - lib/solid_cache/configuration.rb
  - lib/solid_cache/connections.rb
  - lib/solid_cache/connections/sharded.rb
  - lib/solid_cache/connections/single.rb
@@ -127,7 +131,7 @@ metadata:
  homepage_uri: http://github.com/rails/solid_cache
  source_code_uri: http://github.com/rails/solid_cache
  post_install_message: |
-   Solid Cache v0.4 contains new database migrations.
+   Upgrading from Solid Cache v0.3 or earlier? There are new database migrations in v0.4.
    See https://github.com/rails/solid_cache/blob/main/upgrading_to_version_0.4.x.md for upgrade instructions.
  rdoc_options: []
  require_paths:
@@ -143,7 +147,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
    version: '0'
  requirements: []
- rubygems_version: 3.5.4
+ rubygems_version: 3.5.6
  signing_key:
  specification_version: 4
  summary: A database backed ActiveSupport::Cache::Store