solid_queue_autoscaler 1.0.13 → 1.0.16
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +57 -0
- data/README.md +88 -2
- data/lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_locks.rb.erb +30 -0
- data/lib/solid_queue_autoscaler/adapters/heroku.rb +38 -0
- data/lib/solid_queue_autoscaler/advisory_lock.rb +168 -12
- data/lib/solid_queue_autoscaler/version.rb +1 -1
- metadata +17 -2
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0d2ec8d0897f2312d05ccc10d075e5b90bd7c31e5e699fc34bfff13ba5be513b
+  data.tar.gz: b6e26a9f33e0c86f8809c4c52f04059ed62b4f1c3097ef6421a25cafb5eeed72
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4d9cd4937e412fded6a640c9044dfa846b7006848647a3ffa2c78a70fe3f0b03bf1b5f34dc0555f0e5489e0d3c4d23fef575c4e7f16ff421ad2385355ea6919f
+  data.tar.gz: 1ed169508ba540f4ec3dcfaf6bcea5d04812f4385c4231f0033a98d10b891d3722f2b0d46184d4bd7ffea9b860b6e70f5b3b07856533104ae06e3d31d09e2be7
data/CHANGELOG.md
CHANGED

@@ -7,6 +7,63 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.0.16] - 2025-01-30
+
+### Added
+- **Comprehensive scale-to-zero documentation** - Added dedicated "Scale to Zero" section in README:
+  - Explains how `min_workers = 0` works with Heroku formation behavior
+  - Documents the v1.0.15 fix for graceful 404 handling
+  - Includes configuration examples and cold-start latency considerations
+  - Guidance on where to run the autoscaler (web dyno vs workers)
+- Updated Features list and linked Cost-Optimized example
+
+## [1.0.15] - 2025-01-30
+
+### Fixed
+- **Fixed Heroku adapter 404 error when querying scaled-to-zero dynos** - When a dyno type is scaled to 0 and removed from Heroku's formation, the API returns 404. The adapter now handles this gracefully:
+  - `current_workers` returns 0 instead of raising an error when the formation doesn't exist
+  - `scale` falls back to the `batch_update` API to create the formation when `update` returns 404
+  - Added `create_formation` private method using Heroku's batch_update endpoint
+  - This enables full scale-to-zero support with `min_workers = 0`
+
+## [1.0.14] - 2025-01-18
+
+### Added
+- **SQLite and MySQL support for advisory locks** - AdvisoryLock now supports multiple database adapters:
+  - PostgreSQL: Uses native `pg_try_advisory_lock`/`pg_advisory_unlock`
+  - MySQL/Trilogy: Uses `GET_LOCK`/`RELEASE_LOCK`
+  - SQLite: Uses table-based locking with an auto-created locks table
+  - Other databases: Falls back to table-based locking
+  - Automatic adapter detection via `connection.adapter_name`
+  - Stale lock cleanup (locks older than 5 minutes are removed)
+  - Lock ownership tracking (`hostname:pid:thread_id`)
+
+- **Comprehensive configuration tests** - Added 100+ tests across Rails and Sinatra dummy apps:
+  - Tests for ALL configuration options (job_queue, job_priority, scaling thresholds, cooldowns, etc.)
+  - Decision engine threshold tests verifying scaling logic
+  - End-to-end tests with a mocked Heroku API verifying the full scaling workflow
+  - Queue name and priority regression tests (prevents jobs going to the wrong queue)
+
+- **GitHub Actions integration test workflow** - New CI job that runs dummy app tests:
+  - Runs Rails dummy app tests (62 tests)
+  - Runs Sinatra dummy app tests (58 tests)
+  - Ensures queue name, priority, and E2E scaling tests pass before release
+
+- **Release workflow now requires CI to pass** - Updated release.yml to use the `workflow_run` trigger:
+  - Release only runs after the CI workflow completes successfully
+  - All unit tests, integration tests, and linting must pass before publishing
+
+### Fixed
+- **Fixed test pollution in autoscale_job_spec** - Changed from RSpec's `described_class` (which caches class references) to dynamic constant lookup, preventing stale class reference issues when tests reload the AutoscaleJob class
+
+## [1.0.13] - 2025-01-17
+
+### Fixed
+- **Fixed AutoscaleJob queue_name type mismatch** - The queue name is now converted to a string when set via `apply_job_settings!`
+  - ActiveJob internally uses strings for queue names, but the configuration uses symbols
+  - This caused jobs to have symbol queue names (`:autoscaler`) instead of strings (`"autoscaler"`)
+  - `apply_job_settings!` now calls `.to_s` on the job_queue to ensure a consistent string format
+
 ## [1.0.12] - 2025-01-17
 
 ### Fixed
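
To make the 1.0.13 queue-name entry concrete, here is a minimal sketch of the described fix, assuming an ActiveJob-based `AutoscaleJob` and a config object exposing `job_queue` and `job_priority`; the method body is illustrative, not the gem's actual source:

```ruby
require 'active_job'

class AutoscaleJob < ActiveJob::Base
  # Hypothetical sketch of the v1.0.13 fix: the configured queue symbol is
  # normalized to a string before assignment, because ActiveJob (and the rows
  # Solid Queue stores) use string queue names.
  def self.apply_job_settings!(config)
    self.queue_name = config.job_queue.to_s        # :autoscaler => "autoscaler"
    self.priority   = config.job_priority if config.job_priority
  end
end
```

Without the `.to_s`, enqueued jobs carried the symbol `:autoscaler` as their queue name, which is the mismatch the entry above describes.
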
data/README.md
CHANGED

@@ -10,12 +10,98 @@ A control plane for [Solid Queue](https://github.com/rails/solid_queue) that aut
 - **Metrics-based scaling**: Scales based on queue depth, job latency, and throughput
 - **Multiple scaling strategies**: Fixed increment or proportional scaling based on load
 - **Multi-worker support**: Configure and scale different worker types independently
+- **Scale to zero**: Full support for `min_workers = 0` to eliminate costs during idle periods
 - **Platform adapters**: Native support for Heroku and Kubernetes
 - **Singleton execution**: Uses PostgreSQL advisory locks to ensure only one autoscaler runs at a time
 - **Safety features**: Cooldowns, min/max limits, dry-run mode
 - **Rails integration**: Configuration via initializer, Railtie with rake tasks
 - **Flexible execution**: Run as a recurring Solid Queue job or standalone
 
+## Scale to Zero
+
+The autoscaler fully supports scaling workers to zero (`min_workers = 0`), allowing you to eliminate worker costs during idle periods.
+
+### How It Works
+
+When you configure `min_workers = 0` and the queue becomes idle, the autoscaler will scale your workers down to zero. This is ideal for:
+
+- **Development/staging environments** with sporadic usage
+- **Batch processing workers** that only run when jobs are queued
+- **Cost-sensitive applications** with predictable idle periods
+
+### Heroku Formation Behavior
+
+On Heroku, when a dyno type is scaled to 0, it gets **removed from the formation entirely**. This means:
+
+1. `heroku ps:scale worker=0` removes the `worker` formation
+2. Subsequent API calls to get formation info return **404 Not Found**
+3. When scaling back up, the formation must be **recreated**
+
+As of **v1.0.15**, the autoscaler handles this gracefully:
+
+- When querying a non-existent formation, it returns `0` workers (instead of raising an error)
+- When scaling up a non-existent formation, it automatically creates it using Heroku's batch update API
+- This enables seamless scale-to-zero → scale-up workflows
+
+### Configuration Example
+
+```ruby
+SolidQueueAutoscaler.configure(:batch_worker) do |config|
+  config.adapter = :heroku
+  config.heroku_api_key = ENV['HEROKU_API_KEY']
+  config.heroku_app_name = ENV['HEROKU_APP_NAME']
+  config.process_type = 'batch_worker'
+
+  # Enable scale-to-zero
+  config.min_workers = 0
+  config.max_workers = 5
+
+  # Scale up immediately when any job is queued
+  config.scale_up_queue_depth = 1
+  config.scale_up_latency_seconds = 60
+
+  # Scale down when completely idle
+  config.scale_down_queue_depth = 0
+  config.scale_down_latency_seconds = 10
+
+  # Short scale-up cooldown, longer scale-down cooldown to avoid scaling to zero prematurely
+  config.scale_up_cooldown_seconds = 30
+  config.scale_down_cooldown_seconds = 300 # 5 minutes
+end
+```
+
+### Important Considerations
+
+**Cold-start latency**: When workers are at zero and a job is enqueued, there will be latency before the job is processed:
+1. The autoscaler job must run (depends on your `schedule` interval)
+2. The autoscaler must scale up workers
+3. Heroku must provision and start the dyno (~10-30 seconds)
+4. The worker must boot and start processing
+
+Total cold-start time is typically **30-90 seconds** depending on your configuration and dyno startup time.
+
+**Where to run the autoscaler**: The autoscaler job **must run on a process that's always running** (like your web dyno), NOT on the workers being scaled. If the autoscaler runs on workers and those workers scale to zero, there's nothing to scale them back up!
+
+```yaml
+# config/recurring.yml - runs on whatever process runs the dispatcher
+autoscaler_batch:
+  class: SolidQueueAutoscaler::AutoscaleJob
+  queue: autoscaler
+  schedule: every 30 seconds
+  args: [:batch_worker]
+```
+
+**Procfile setup**: Ensure your web dyno runs the Solid Queue dispatcher (or use a dedicated always-on dyno):
+
+```
+# Procfile
+web: bundle exec puma -C config/puma.rb
+worker: bundle exec rake solid_queue:start
+batch_worker: bundle exec rake solid_queue:start
+```
+
+Alternatively, run the dispatcher in a thread within your web process using `solid_queue.yml` configuration.
+
 ## Installation
 
 Add to your Gemfile:

@@ -308,7 +394,7 @@ autoscaler:
 
 ### Cost-Optimized Setup (Scale to Zero)
 
-For apps with sporadic workloads where you want to minimize costs during idle periods
+For apps with sporadic workloads where you want to minimize costs during idle periods. See the [Scale to Zero](#scale-to-zero) section for full details on how this works.
 
 ```ruby
 SolidQueueAutoscaler.configure do |config|

@@ -337,7 +423,7 @@ SolidQueueAutoscaler.configure do |config|
 end
 ```
 
-**⚠️ Note:** With `min_workers = 0`, there's cold-start latency when the first job arrives. The autoscaler must run on a web dyno or separate process, not on the workers themselves.
+**⚠️ Note:** With `min_workers = 0`, there's cold-start latency (~30-90s) when the first job arrives. The autoscaler must run on a web dyno or separate always-on process, not on the workers themselves. See [Scale to Zero](#scale-to-zero) for details.
 
 ---
 
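
Beyond the README's recurring schedule, one optional way to trim the cold-start wait is to enqueue the autoscale job opportunistically when work arrives. The controller and `GenerateReportJob` below are hypothetical; only `SolidQueueAutoscaler::AutoscaleJob` and the `:batch_worker` argument mirror the recurring.yml example above:

```ruby
class ReportsController < ApplicationController
  def create
    # Enqueue the actual work destined for the batch_worker process type.
    GenerateReportJob.perform_later(params[:report_id])

    # Nudge the autoscaler right away instead of waiting for the next
    # recurring tick, so a scaled-to-zero formation is recreated sooner.
    SolidQueueAutoscaler::AutoscaleJob.perform_later(:batch_worker)

    head :accepted
  end
end
```
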
data/lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_locks.rb.erb
ADDED

@@ -0,0 +1,30 @@
+# frozen_string_literal: true
+
+# Migration for SolidQueueAutoscaler locks table.
+# This table is used for advisory locking on databases that don't support
+# native advisory locks (SQLite, etc.).
+#
+# NOTE: This migration is OPTIONAL. The locks table is automatically created
+# when first needed. Only use this migration if you prefer to manage the
+# table schema explicitly.
+#
+# For multi-database setups (SolidQueue in separate database):
+#   This migration should be placed in db/queue_migrate/ (or your queue DB's migration path)
+#   Run with: rails db:migrate:queue
+#
+# For single-database setups:
+#   Place in db/migrate/ and run: rails db:migrate
+#
+class CreateSolidQueueAutoscalerLocks < ActiveRecord::Migration<%= migration_version %>
+  def change
+    create_table :solid_queue_autoscaler_locks, id: false do |t|
+      t.string :lock_key, null: false, primary_key: true
+      t.integer :lock_id, null: false
+      t.datetime :locked_at, null: false
+      t.string :locked_by, null: false
+    end
+
+    # Index for cleanup of stale locks
+    add_index :solid_queue_autoscaler_locks, :locked_at
+  end
+end
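
A small console sketch (not from the gem) for inspecting this table when the table-based strategy is active; the table and column names come from the migration above, the 300-second threshold matches the stale-lock cleanup described in the changelog, and `ActiveRecord::Base.connection` assumes the locks table lives in your primary database (use the queue database's connection in multi-database setups):

```ruby
conn = ActiveRecord::Base.connection
stale = conn.quote(Time.now.utc - 300) # locks older than 5 minutes count as stale

# List the lock rows currently held
conn.select_all('SELECT lock_key, locked_at, locked_by FROM solid_queue_autoscaler_locks').to_a

# Clear abandoned locks by hand (the lock strategy also does this automatically)
conn.execute("DELETE FROM solid_queue_autoscaler_locks WHERE locked_at < #{stale}")
```
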
data/lib/solid_queue_autoscaler/adapters/heroku.rb
CHANGED

@@ -33,6 +33,13 @@ module SolidQueueAutoscaler
        formation['quantity']
      end
    rescue Excon::Error => e
+     # Handle 404 gracefully - formation doesn't exist means 0 workers
+     # This happens when a dyno type is scaled to 0 and removed from formation
+     if e.respond_to?(:response) && e.response&.status == 404
+       logger&.debug("[Autoscaler] Formation '#{process_type}' not found, treating as 0 workers")
+       return 0
+     end
+
      raise HerokuAPIError.new(
        "Failed to get formation info: #{e.message}",
        status_code: e.respond_to?(:response) ? e.response&.status : nil,

@@ -51,6 +58,12 @@ module SolidQueueAutoscaler
      end
      quantity
    rescue Excon::Error => e
+     # Handle 404 by trying to create the formation via batch_update
+     # This happens when scaling up a dyno type that was previously scaled to 0
+     if e.respond_to?(:response) && e.response&.status == 404
+       return create_formation(quantity)
+     end
+
      raise HerokuAPIError.new(
        "Failed to scale #{process_type} to #{quantity}: #{e.message}",
        status_code: e.respond_to?(:response) ? e.response&.status : nil,

@@ -84,6 +97,31 @@ module SolidQueueAutoscaler
 
    private
 
+   # Creates a formation that doesn't exist using batch_update.
+   # This is needed when scaling up a dyno type that was previously scaled to 0.
+   #
+   # @param quantity [Integer] desired worker count
+   # @return [Integer] the new worker count
+   # @raise [HerokuAPIError] if the API call fails
+   def create_formation(quantity)
+     logger&.info("[Autoscaler] Formation '#{process_type}' not found, creating with quantity #{quantity}")
+
+     with_retry(RETRYABLE_ERRORS, retryable_check: method(:retryable_error?)) do
+       client.formation.batch_update(app_name, {
+         updates: [
+           { type: process_type, quantity: quantity }
+         ]
+       })
+     end
+     quantity
+   rescue Excon::Error => e
+     raise HerokuAPIError.new(
+       "Failed to create formation #{process_type} with quantity #{quantity}: #{e.message}",
+       status_code: e.respond_to?(:response) ? e.response&.status : nil,
+       response_body: e.respond_to?(:response) ? e.response&.body : nil
+     )
+   end
+
    # Determines if an error should be retried.
    # Retries timeouts and 5xx errors, but not 4xx client errors.
    def retryable_error?(error)
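
Condensing the two rescue paths above, a hypothetical sketch of the scale-to-zero round trip; `reconcile_workers` and its arguments are invented for illustration, while `current_workers`, `scale`, and the 404 handling come from the hunks above:

```ruby
# adapter: an instance of the Heroku adapter shown above
# config:  any object exposing min_workers / max_workers
def reconcile_workers(adapter, config, desired:)
  current = adapter.current_workers          # missing formation (404) now returns 0
  return current if current == desired

  clamped = desired.clamp(config.min_workers, config.max_workers)
  adapter.scale(clamped)                     # 404 on update falls back to create_formation
  clamped
end
```
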
data/lib/solid_queue_autoscaler/advisory_lock.rb
CHANGED

@@ -1,12 +1,14 @@
 # frozen_string_literal: true
 
 require 'zlib'
+require 'socket'
 
 module SolidQueueAutoscaler
-  #
+  # Advisory lock wrapper for singleton enforcement.
+  # Supports both PostgreSQL (native advisory locks) and SQLite (table-based locks).
   #
-  # IMPORTANT: PgBouncer Compatibility Warning
-  #
+  # IMPORTANT: PgBouncer Compatibility Warning (PostgreSQL only)
+  # ============================================================
   # PostgreSQL advisory locks are connection-scoped (session-level locks).
   # If you're using PgBouncer in transaction pooling mode, advisory locks
   # will NOT work correctly because:

@@ -24,6 +26,10 @@ module SolidQueueAutoscaler
   # lock acquisition always failing, PgBouncer is likely the cause.
   #
   class AdvisoryLock
+    LOCKS_TABLE_NAME = 'solid_queue_autoscaler_locks'
+    # Stale lock timeout - locks older than this are considered abandoned (5 minutes)
+    STALE_LOCK_TIMEOUT_SECONDS = 300
+
     attr_reader :lock_key, :timeout
 
     def initialize(lock_key: nil, timeout: nil, config: nil)

@@ -31,6 +37,7 @@ module SolidQueueAutoscaler
      @lock_key = lock_key || @config.lock_key
      @timeout = timeout || @config.lock_timeout_seconds
      @lock_acquired = false
+     @strategy = nil
    end
 
    def with_lock

@@ -43,20 +50,14 @@ module SolidQueueAutoscaler
    def try_lock
      return false if @lock_acquired
 
-     result = connection.select_value(
-       "SELECT pg_try_advisory_lock(#{lock_id})"
-     )
-     @lock_acquired = [true, 't'].include?(result)
+     @lock_acquired = lock_strategy.try_lock
      @lock_acquired
    end
 
    def acquire!
      return true if @lock_acquired
 
-     result = connection.select_value(
-       "SELECT pg_try_advisory_lock(#{lock_id})"
-     )
-     @lock_acquired = [true, 't'].include?(result)
+     @lock_acquired = lock_strategy.try_lock
 
      raise LockError, "Could not acquire advisory lock '#{lock_key}' (id: #{lock_id})" unless @lock_acquired
 

@@ -66,7 +67,7 @@ module SolidQueueAutoscaler
    def release
      return false unless @lock_acquired
 
-     connection.execute("SELECT pg_advisory_unlock(#{lock_id})")
+     lock_strategy.release
      @lock_acquired = false
      true
    end

@@ -87,5 +88,160 @@ module SolidQueueAutoscaler
        hash & 0x7FFFFFFF
      end
    end
+
+    def lock_strategy
+      @strategy ||= create_lock_strategy
+    end
+
+    def create_lock_strategy
+      adapter_name = connection.adapter_name.downcase
+
+      case adapter_name
+      when /postgresql/, /postgis/
+        PostgreSQLLockStrategy.new(connection: connection, lock_id: lock_id, lock_key: lock_key)
+      when /sqlite/
+        SQLiteLockStrategy.new(connection: connection, lock_id: lock_id, lock_key: lock_key)
+      when /mysql/, /trilogy/
+        MySQLLockStrategy.new(connection: connection, lock_id: lock_id, lock_key: lock_key)
+      else
+        # Fall back to table-based locking for unknown adapters
+        TableBasedLockStrategy.new(connection: connection, lock_id: lock_id, lock_key: lock_key)
+      end
+    end
+
+    # Base class for lock strategies
+    class BaseLockStrategy
+      def initialize(connection:, lock_id:, lock_key:)
+        @connection = connection
+        @lock_id = lock_id
+        @lock_key = lock_key
+      end
+
+      def try_lock
+        raise NotImplementedError, "#{self.class} must implement #try_lock"
+      end
+
+      def release
+        raise NotImplementedError, "#{self.class} must implement #release"
+      end
+
+      protected
+
+      attr_reader :connection, :lock_id, :lock_key
+    end
+
+    # PostgreSQL native advisory locks
+    class PostgreSQLLockStrategy < BaseLockStrategy
+      def try_lock
+        result = connection.select_value(
+          "SELECT pg_try_advisory_lock(#{lock_id})"
+        )
+        [true, 't'].include?(result)
+      end
+
+      def release
+        connection.execute("SELECT pg_advisory_unlock(#{lock_id})")
+        true
+      end
+    end
+
+    # MySQL named locks (GET_LOCK/RELEASE_LOCK)
+    class MySQLLockStrategy < BaseLockStrategy
+      def try_lock
+        # MySQL GET_LOCK returns 1 on success, 0 if timeout, NULL on error
+        result = connection.select_value(
+          "SELECT GET_LOCK(#{connection.quote(lock_key)}, 0)"
+        )
+        result == 1
+      end
+
+      def release
+        connection.execute("SELECT RELEASE_LOCK(#{connection.quote(lock_key)})")
+        true
+      end
+    end
+
+    # Table-based locking for databases without native advisory lock support
+    # Uses a simple locks table with INSERT/DELETE for lock management
+    class TableBasedLockStrategy < BaseLockStrategy
+      def try_lock
+        ensure_locks_table_exists!
+        cleanup_stale_locks!
+
+        # Try to insert a lock record
+        begin
+          connection.execute(<<~SQL)
+            INSERT INTO #{quoted_table_name} (lock_key, lock_id, locked_at, locked_by)
+            VALUES (#{connection.quote(lock_key)}, #{lock_id}, #{connection.quote(Time.now.utc.iso8601)}, #{connection.quote(lock_owner)})
+          SQL
+          true
+        rescue ActiveRecord::RecordNotUnique, ActiveRecord::StatementInvalid => e
+          # Lock already held by another process
+          # StatementInvalid catches SQLite's UNIQUE constraint violation
+          return false if e.message.include?('UNIQUE') || e.message.include?('duplicate')
+
+          raise
+        end
+      end
+
+      def release
+        return true unless table_exists?
+
+        connection.execute(<<~SQL)
+          DELETE FROM #{quoted_table_name}
+          WHERE lock_key = #{connection.quote(lock_key)}
+            AND locked_by = #{connection.quote(lock_owner)}
+        SQL
+        true
+      end
+
+      private
+
+      def ensure_locks_table_exists!
+        return if table_exists?
+
+        create_locks_table!
+      end
+
+      def table_exists?
+        @table_exists ||= connection.table_exists?(LOCKS_TABLE_NAME)
+      end
+
+      def create_locks_table!
+        connection.execute(<<~SQL)
+          CREATE TABLE IF NOT EXISTS #{quoted_table_name} (
+            lock_key VARCHAR(255) NOT NULL PRIMARY KEY,
+            lock_id INTEGER NOT NULL,
+            locked_at DATETIME NOT NULL,
+            locked_by VARCHAR(255) NOT NULL
+          )
+        SQL
+        @table_exists = true
+      end
+
+      def cleanup_stale_locks!
+        # Remove locks older than STALE_LOCK_TIMEOUT_SECONDS
+        stale_threshold = (Time.now.utc - STALE_LOCK_TIMEOUT_SECONDS).iso8601
+        connection.execute(<<~SQL)
+          DELETE FROM #{quoted_table_name}
+          WHERE locked_at < #{connection.quote(stale_threshold)}
+        SQL
+      end
+
+      def quoted_table_name
+        connection.quote_table_name(LOCKS_TABLE_NAME)
+      end
+
+      def lock_owner
+        # Unique identifier for this process/thread
+        @lock_owner ||= "#{Socket.gethostname}:#{Process.pid}:#{Thread.current.object_id}"
+      end
+    end
+
+    # SQLite table-based locking (SQLite doesn't have advisory locks)
+    # Defined after TableBasedLockStrategy since it inherits from it
+    class SQLiteLockStrategy < TableBasedLockStrategy
+      # Inherits all behavior from TableBasedLockStrategy
+    end
  end
 end
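
A usage sketch based on the public surface visible in this diff (`AdvisoryLock.new` keyword arguments and `#with_lock`); it assumes the gem's default configuration is already loaded, and the lock key plus the work inside the block are illustrative:

```ruby
lock = SolidQueueAutoscaler::AdvisoryLock.new(lock_key: 'solid_queue_autoscaler:autoscale')

lock.with_lock do
  # Only one process across the fleet evaluates scaling at a time. Depending on the
  # adapter this maps to pg_try_advisory_lock (PostgreSQL), GET_LOCK (MySQL/Trilogy),
  # or the solid_queue_autoscaler_locks table (SQLite and unknown adapters).
  run_scaling_evaluation # hypothetical method
end
```
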
metadata
CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: solid_queue_autoscaler
 version: !ruby/object:Gem::Version
-  version: 1.0.13
+  version: 1.0.16
 platform: ruby
 authors:
 - reillyse
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2026-01-
+date: 2026-01-31 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activerecord

@@ -122,6 +122,20 @@ dependencies:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '3.18'
+- !ruby/object:Gem::Dependency
+  name: sqlite3
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: A control plane for Solid Queue on Heroku that automatically scales worker
   dynos based on queue depth, job latency, and throughput. Uses PostgreSQL advisory
   locks for singleton behavior and the Heroku Platform API for scaling.

@@ -143,6 +157,7 @@ files:
 - lib/generators/solid_queue_autoscaler/migration_generator.rb
 - lib/generators/solid_queue_autoscaler/templates/README
 - lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_events.rb.erb
+- lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_locks.rb.erb
 - lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_state.rb.erb
 - lib/generators/solid_queue_autoscaler/templates/initializer.rb
 - lib/solid_queue_autoscaler.rb