solid_queue_autoscaler 1.0.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +189 -0
- data/LICENSE.txt +21 -0
- data/README.md +553 -0
- data/lib/generators/solid_queue_autoscaler/dashboard_generator.rb +54 -0
- data/lib/generators/solid_queue_autoscaler/install_generator.rb +21 -0
- data/lib/generators/solid_queue_autoscaler/migration_generator.rb +29 -0
- data/lib/generators/solid_queue_autoscaler/templates/README +41 -0
- data/lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_events.rb.erb +24 -0
- data/lib/generators/solid_queue_autoscaler/templates/create_solid_queue_autoscaler_state.rb.erb +15 -0
- data/lib/generators/solid_queue_autoscaler/templates/initializer.rb +58 -0
- data/lib/solid_queue_autoscaler/adapters/base.rb +102 -0
- data/lib/solid_queue_autoscaler/adapters/heroku.rb +93 -0
- data/lib/solid_queue_autoscaler/adapters/kubernetes.rb +158 -0
- data/lib/solid_queue_autoscaler/adapters.rb +57 -0
- data/lib/solid_queue_autoscaler/advisory_lock.rb +71 -0
- data/lib/solid_queue_autoscaler/autoscale_job.rb +71 -0
- data/lib/solid_queue_autoscaler/configuration.rb +269 -0
- data/lib/solid_queue_autoscaler/cooldown_tracker.rb +153 -0
- data/lib/solid_queue_autoscaler/dashboard/engine.rb +136 -0
- data/lib/solid_queue_autoscaler/dashboard/views/layouts/solid_queue_heroku_autoscaler/dashboard/application.html.erb +206 -0
- data/lib/solid_queue_autoscaler/dashboard/views/solid_queue_heroku_autoscaler/dashboard/dashboard/index.html.erb +138 -0
- data/lib/solid_queue_autoscaler/dashboard/views/solid_queue_heroku_autoscaler/dashboard/events/index.html.erb +102 -0
- data/lib/solid_queue_autoscaler/dashboard/views/solid_queue_heroku_autoscaler/dashboard/workers/index.html.erb +106 -0
- data/lib/solid_queue_autoscaler/dashboard/views/solid_queue_heroku_autoscaler/dashboard/workers/show.html.erb +209 -0
- data/lib/solid_queue_autoscaler/dashboard.rb +99 -0
- data/lib/solid_queue_autoscaler/decision_engine.rb +228 -0
- data/lib/solid_queue_autoscaler/errors.rb +44 -0
- data/lib/solid_queue_autoscaler/metrics.rb +172 -0
- data/lib/solid_queue_autoscaler/railtie.rb +179 -0
- data/lib/solid_queue_autoscaler/scale_event.rb +292 -0
- data/lib/solid_queue_autoscaler/scaler.rb +294 -0
- data/lib/solid_queue_autoscaler/version.rb +5 -0
- data/lib/solid_queue_autoscaler.rb +108 -0
- metadata +179 -0
data/README.md
ADDED
|
@@ -0,0 +1,553 @@
|
|
|
1
|
+
# Solid Queue Autoscaler
|
|
2
|
+
|
|
3
|
+
[CI](https://github.com/reillyse/solid_queue_autoscaler/actions/workflows/ci.yml)
|
|
4
|
+
[Gem Version](https://badge.fury.io/rb/solid_queue_autoscaler)
|
|
5
|
+
|
|
6
|
+
A control plane for [Solid Queue](https://github.com/rails/solid_queue) that automatically scales worker processes based on queue metrics. Supports both **Heroku** and **Kubernetes** deployments.
|
|
7
|
+
|
|
8
|
+
## Features
|
|
9
|
+
|
|
10
|
+
- **Metrics-based scaling**: Scales based on queue depth, job latency, and throughput
|
|
11
|
+
- **Multiple scaling strategies**: Fixed increment or proportional scaling based on load
|
|
12
|
+
- **Multi-worker support**: Configure and scale different worker types independently
|
|
13
|
+
- **Platform adapters**: Native support for Heroku and Kubernetes
|
|
14
|
+
- **Singleton execution**: Uses PostgreSQL advisory locks to ensure only one autoscaler runs at a time
|
|
15
|
+
- **Safety features**: Cooldowns, min/max limits, dry-run mode
|
|
16
|
+
- **Rails integration**: Configuration via initializer, Railtie with rake tasks
|
|
17
|
+
- **Flexible execution**: Run as a recurring Solid Queue job or standalone
|
|
18
|
+
|
|
19
|
+
## Installation
|
|
20
|
+
|
|
21
|
+
Add to your Gemfile:
|
|
22
|
+
|
|
23
|
+
```ruby
|
|
24
|
+
gem 'solid_queue_autoscaler'
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
Then run:
|
|
28
|
+
|
|
29
|
+
```bash
|
|
30
|
+
bundle install
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
### Database Setup (Recommended)
|
|
34
|
+
|
|
35
|
+
For persistent cooldown tracking that survives process restarts:
|
|
36
|
+
|
|
37
|
+
```bash
|
|
38
|
+
rails generate solid_queue_autoscaler:migration
|
|
39
|
+
rails db:migrate
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
This creates a `solid_queue_autoscaler_state` table to store cooldown timestamps.
|
|
43
|
+
|
|
44
|
+
### Dashboard Setup (Optional)
|
|
45
|
+
|
|
46
|
+
For a web UI to monitor autoscaler events and status:
|
|
47
|
+
|
|
48
|
+
```bash
|
|
49
|
+
rails generate solid_queue_autoscaler:dashboard
|
|
50
|
+
rails db:migrate
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
Then mount the dashboard in `config/routes.rb`:
|
|
54
|
+
|
|
55
|
+
```ruby
|
|
56
|
+
# With authentication (recommended; `authenticate` is a Devise routing helper)
|
|
57
|
+
authenticate :user, ->(u) { u.admin? } do
|
|
58
|
+
mount SolidQueueAutoscaler::Dashboard::Engine => "/autoscaler"
|
|
59
|
+
end
|
|
60
|
+
|
|
61
|
+
# Or without authentication
|
|
62
|
+
mount SolidQueueAutoscaler::Dashboard::Engine => "/autoscaler"
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
## Quick Start
|
|
66
|
+
|
|
67
|
+
### Basic Configuration (Single Worker)
|
|
68
|
+
|
|
69
|
+
Create an initializer at `config/initializers/solid_queue_autoscaler.rb`:
|
|
70
|
+
|
|
71
|
+
```ruby
|
|
72
|
+
SolidQueueAutoscaler.configure do |config|
|
|
73
|
+
# Platform: Heroku
|
|
74
|
+
config.adapter = :heroku
|
|
75
|
+
config.heroku_api_key = ENV['HEROKU_API_KEY']
|
|
76
|
+
config.heroku_app_name = ENV['HEROKU_APP_NAME']
|
|
77
|
+
config.process_type = 'worker'
|
|
78
|
+
|
|
79
|
+
# Worker limits
|
|
80
|
+
config.min_workers = 1
|
|
81
|
+
config.max_workers = 10
|
|
82
|
+
|
|
83
|
+
# Scaling thresholds
|
|
84
|
+
config.scale_up_queue_depth = 100
|
|
85
|
+
config.scale_up_latency_seconds = 300
|
|
86
|
+
config.scale_down_queue_depth = 10
|
|
87
|
+
config.scale_down_latency_seconds = 30
|
|
88
|
+
end
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
### Multi-Worker Configuration
|
|
92
|
+
|
|
93
|
+
Scale different worker types independently with named configurations:
|
|
94
|
+
|
|
95
|
+
```ruby
|
|
96
|
+
# Critical jobs worker - fast response, dedicated queue
|
|
97
|
+
SolidQueueAutoscaler.configure(:critical_worker) do |config|
|
|
98
|
+
config.adapter = :heroku
|
|
99
|
+
config.heroku_api_key = ENV['HEROKU_API_KEY']
|
|
100
|
+
config.heroku_app_name = ENV['HEROKU_APP_NAME']
|
|
101
|
+
config.process_type = 'critical_worker'
|
|
102
|
+
|
|
103
|
+
# Only monitor the critical queue
|
|
104
|
+
config.queues = ['critical']
|
|
105
|
+
|
|
106
|
+
# Aggressive scaling for critical jobs
|
|
107
|
+
config.min_workers = 2
|
|
108
|
+
config.max_workers = 20
|
|
109
|
+
config.scale_up_queue_depth = 10
|
|
110
|
+
config.scale_up_latency_seconds = 30
|
|
111
|
+
config.cooldown_seconds = 60
|
|
112
|
+
end
|
|
113
|
+
|
|
114
|
+
# Default worker - handles standard queues
|
|
115
|
+
SolidQueueAutoscaler.configure(:default_worker) do |config|
|
|
116
|
+
config.adapter = :heroku
|
|
117
|
+
config.heroku_api_key = ENV['HEROKU_API_KEY']
|
|
118
|
+
config.heroku_app_name = ENV['HEROKU_APP_NAME']
|
|
119
|
+
config.process_type = 'worker'
|
|
120
|
+
|
|
121
|
+
# Monitor default and mailers queues
|
|
122
|
+
config.queues = ['default', 'mailers']
|
|
123
|
+
|
|
124
|
+
# Conservative scaling for background jobs
|
|
125
|
+
config.min_workers = 1
|
|
126
|
+
config.max_workers = 10
|
|
127
|
+
config.scale_up_queue_depth = 100
|
|
128
|
+
config.scale_up_latency_seconds = 300
|
|
129
|
+
config.cooldown_seconds = 120
|
|
130
|
+
end
|
|
131
|
+
|
|
132
|
+
# Batch processing worker - handles long-running jobs
|
|
133
|
+
SolidQueueAutoscaler.configure(:batch_worker) do |config|
|
|
134
|
+
config.adapter = :heroku
|
|
135
|
+
config.heroku_api_key = ENV['HEROKU_API_KEY']
|
|
136
|
+
config.heroku_app_name = ENV['HEROKU_APP_NAME']
|
|
137
|
+
config.process_type = 'batch_worker'
|
|
138
|
+
|
|
139
|
+
config.queues = ['batch', 'imports', 'exports']
|
|
140
|
+
|
|
141
|
+
config.min_workers = 0
|
|
142
|
+
config.max_workers = 5
|
|
143
|
+
config.scale_up_queue_depth = 1 # Scale up when any batch job is queued
|
|
144
|
+
config.scale_down_queue_depth = 0
|
|
145
|
+
config.cooldown_seconds = 300
|
|
146
|
+
end
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
## Platform Adapters
|
|
150
|
+
|
|
151
|
+
### Heroku Adapter (Default)
|
|
152
|
+
|
|
153
|
+
```ruby
|
|
154
|
+
SolidQueueAutoscaler.configure do |config|
|
|
155
|
+
config.adapter = :heroku
|
|
156
|
+
config.heroku_api_key = ENV['HEROKU_API_KEY']
|
|
157
|
+
config.heroku_app_name = ENV['HEROKU_APP_NAME']
|
|
158
|
+
config.process_type = 'worker' # Dyno type to scale
|
|
159
|
+
end
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
Generate a Heroku API key:
|
|
163
|
+
|
|
164
|
+
```bash
|
|
165
|
+
heroku authorizations:create -d "Solid Queue Autoscaler"
|
|
166
|
+
```
|
|
167
|
+
|
|
168
|
+
### Kubernetes Adapter
|
|
169
|
+
|
|
170
|
+
```ruby
|
|
171
|
+
SolidQueueAutoscaler.configure do |config|
|
|
172
|
+
config.adapter = :kubernetes
|
|
173
|
+
config.kubernetes_namespace = ENV.fetch('KUBERNETES_NAMESPACE', 'default')
|
|
174
|
+
config.kubernetes_deployment = 'solid-queue-worker'
|
|
175
|
+
|
|
176
|
+
# Optional: Custom kubeconfig path (defaults to in-cluster config)
|
|
177
|
+
# config.kubernetes_config_path = '/path/to/kubeconfig'
|
|
178
|
+
end
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
The Kubernetes adapter uses the `kubeclient` gem and supports:
|
|
182
|
+
- In-cluster service account authentication (recommended for production)
|
|
183
|
+
- External kubeconfig file authentication (useful for development)
|
|
184
|
+
|
|
185
|
+
## Configuration Reference
|
|
186
|
+
|
|
187
|
+
### Core Settings
|
|
188
|
+
|
|
189
|
+
| Option | Type | Default | Description |
|
|
190
|
+
|--------|------|---------|-------------|
|
|
191
|
+
| `adapter` | Symbol | `:heroku` | Platform adapter (`:heroku` or `:kubernetes`) |
|
|
192
|
+
| `enabled` | Boolean | `true` | Master switch to enable/disable autoscaling |
|
|
193
|
+
| `dry_run` | Boolean | `false` | Log decisions without making changes |
|
|
194
|
+
| `queues` | Array | `nil` | Queue names to monitor (nil = all queues) |
|
|
195
|
+
| `table_prefix` | String | `'solid_queue_'` | Solid Queue table name prefix |
|
|
196
|
+
|
|
197
|
+
### Worker Limits
|
|
198
|
+
|
|
199
|
+
| Option | Type | Default | Description |
|
|
200
|
+
|--------|------|---------|-------------|
|
|
201
|
+
| `min_workers` | Integer | `1` | Minimum workers to maintain |
|
|
202
|
+
| `max_workers` | Integer | `10` | Maximum workers allowed |
|
|
203
|
+
|
|
204
|
+
### Scale-Up Thresholds
|
|
205
|
+
|
|
206
|
+
Scaling up triggers when **ANY** threshold is exceeded:
|
|
207
|
+
|
|
208
|
+
| Option | Type | Default | Description |
|
|
209
|
+
|--------|------|---------|-------------|
|
|
210
|
+
| `scale_up_queue_depth` | Integer | `100` | Jobs in queue to trigger scale up |
|
|
211
|
+
| `scale_up_latency_seconds` | Integer | `300` | Oldest job age to trigger scale up |
|
|
212
|
+
| `scale_up_increment` | Integer | `1` | Workers to add (fixed strategy) |
|
|
213
|
+
|
|
214
|
+
### Scale-Down Thresholds
|
|
215
|
+
|
|
216
|
+
Scaling down triggers when **ALL** thresholds are met:
|
|
217
|
+
|
|
218
|
+
| Option | Type | Default | Description |
|
|
219
|
+
|--------|------|---------|-------------|
|
|
220
|
+
| `scale_down_queue_depth` | Integer | `10` | Jobs in queue threshold |
|
|
221
|
+
| `scale_down_latency_seconds` | Integer | `30` | Oldest job age threshold |
|
|
222
|
+
| `scale_down_decrement` | Integer | `1` | Workers to remove |
|
|
223
|
+
|
|
224
|
+
### Scaling Strategies
|
|
225
|
+
|
|
226
|
+
| Option | Type | Default | Description |
|
|
227
|
+
|--------|------|---------|-------------|
|
|
228
|
+
| `scaling_strategy` | Symbol | `:fixed` | `:fixed` or `:proportional` |
|
|
229
|
+
| `scale_up_jobs_per_worker` | Integer | `50` | Jobs per worker (proportional) |
|
|
230
|
+
| `scale_up_latency_per_worker` | Integer | `60` | Seconds per worker (proportional) |
|
|
231
|
+
| `scale_down_jobs_per_worker` | Integer | `50` | Jobs capacity per worker |
|
|
232
|
+
|
|
233
|
+
### Cooldowns
|
|
234
|
+
|
|
235
|
+
| Option | Type | Default | Description |
|
|
236
|
+
|--------|------|---------|-------------|
|
|
237
|
+
| `cooldown_seconds` | Integer | `120` | Default cooldown for both directions |
|
|
238
|
+
| `scale_up_cooldown_seconds` | Integer | `nil` | Override for scale-up cooldown |
|
|
239
|
+
| `scale_down_cooldown_seconds` | Integer | `nil` | Override for scale-down cooldown |
|
|
240
|
+
| `persist_cooldowns` | Boolean | `true` | Save cooldowns to database |
|
|
241
|
+
|
|
242
|
+
### Heroku-Specific
|
|
243
|
+
|
|
244
|
+
| Option | Type | Default | Description |
|
|
245
|
+
|--------|------|---------|-------------|
|
|
246
|
+
| `heroku_api_key` | String | `nil` | Heroku Platform API token |
|
|
247
|
+
| `heroku_app_name` | String | `nil` | Heroku app name |
|
|
248
|
+
| `process_type` | String | `'worker'` | Dyno type to scale |
|
|
249
|
+
|
|
250
|
+
### Kubernetes-Specific
|
|
251
|
+
|
|
252
|
+
| Option | Type | Default | Description |
|
|
253
|
+
|--------|------|---------|-------------|
|
|
254
|
+
| `kubernetes_namespace` | String | `'default'` | Kubernetes namespace |
|
|
255
|
+
| `kubernetes_deployment` | String | `nil` | Deployment name to scale |
|
|
256
|
+
| `kubernetes_config_path` | String | `nil` | Path to kubeconfig (optional) |
|
|
257
|
+
|
|
258
|
+
## Usage
|
|
259
|
+
|
|
260
|
+
### Running as a Solid Queue Recurring Job (Recommended)
|
|
261
|
+
|
|
262
|
+
Add to your `config/recurring.yml`:
|
|
263
|
+
|
|
264
|
+
```yaml
|
|
265
|
+
# Single worker configuration
|
|
266
|
+
autoscaler:
|
|
267
|
+
class: SolidQueueAutoscaler::AutoscaleJob
|
|
268
|
+
queue: autoscaler
|
|
269
|
+
schedule: every 30 seconds
|
|
270
|
+
|
|
271
|
+
# Or for multi-worker: scale all workers
|
|
272
|
+
autoscaler_all:
|
|
273
|
+
class: SolidQueueAutoscaler::AutoscaleJob
|
|
274
|
+
queue: autoscaler
|
|
275
|
+
schedule: every 30 seconds
|
|
276
|
+
args: [:all]
|
|
277
|
+
|
|
278
|
+
# Or scale specific worker types on different schedules
|
|
279
|
+
autoscaler_critical:
|
|
280
|
+
class: SolidQueueAutoscaler::AutoscaleJob
|
|
281
|
+
queue: autoscaler
|
|
282
|
+
schedule: every 15 seconds
|
|
283
|
+
args: [:critical_worker]
|
|
284
|
+
|
|
285
|
+
autoscaler_default:
|
|
286
|
+
class: SolidQueueAutoscaler::AutoscaleJob
|
|
287
|
+
queue: autoscaler
|
|
288
|
+
schedule: every 60 seconds
|
|
289
|
+
args: [:default_worker]
|
|
290
|
+
```
|
|
291
|
+
|
|
292
|
+
### Running via Rake Tasks
|
|
293
|
+
|
|
294
|
+
```bash
|
|
295
|
+
# Scale the default worker
|
|
296
|
+
bundle exec rake solid_queue_autoscaler:scale
|
|
297
|
+
|
|
298
|
+
# Scale a specific worker type
|
|
299
|
+
WORKER=critical_worker bundle exec rake solid_queue_autoscaler:scale
|
|
300
|
+
|
|
301
|
+
# Scale all configured workers
|
|
302
|
+
bundle exec rake solid_queue_autoscaler:scale_all
|
|
303
|
+
|
|
304
|
+
# List all registered worker configurations
|
|
305
|
+
bundle exec rake solid_queue_autoscaler:workers
|
|
306
|
+
|
|
307
|
+
# View metrics for default worker
|
|
308
|
+
bundle exec rake solid_queue_autoscaler:metrics
|
|
309
|
+
|
|
310
|
+
# View metrics for specific worker
|
|
311
|
+
WORKER=critical_worker bundle exec rake solid_queue_autoscaler:metrics
|
|
312
|
+
|
|
313
|
+
# View current formation
|
|
314
|
+
bundle exec rake solid_queue_autoscaler:formation
|
|
315
|
+
|
|
316
|
+
# Check cooldown status
|
|
317
|
+
bundle exec rake solid_queue_autoscaler:cooldown
|
|
318
|
+
|
|
319
|
+
# Reset cooldowns
|
|
320
|
+
bundle exec rake solid_queue_autoscaler:reset_cooldown
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
### Running Programmatically
|
|
324
|
+
|
|
325
|
+
```ruby
|
|
326
|
+
# Scale the default worker
|
|
327
|
+
result = SolidQueueAutoscaler.scale!
|
|
328
|
+
|
|
329
|
+
# Scale a specific worker type
|
|
330
|
+
result = SolidQueueAutoscaler.scale!(:critical_worker)
|
|
331
|
+
|
|
332
|
+
# Scale all configured workers
|
|
333
|
+
results = SolidQueueAutoscaler.scale_all!
|
|
334
|
+
|
|
335
|
+
# Get metrics for a specific worker
|
|
336
|
+
metrics = SolidQueueAutoscaler.metrics(:critical_worker)
|
|
337
|
+
puts "Queue depth: #{metrics.queue_depth}"
|
|
338
|
+
puts "Latency: #{metrics.oldest_job_age_seconds}s"
|
|
339
|
+
|
|
340
|
+
# Get current worker count
|
|
341
|
+
workers = SolidQueueAutoscaler.current_workers(:default_worker)
|
|
342
|
+
puts "Current workers: #{workers}"
|
|
343
|
+
|
|
344
|
+
# List all registered workers
|
|
345
|
+
SolidQueueAutoscaler.registered_workers
|
|
346
|
+
# => [:critical_worker, :default_worker, :batch_worker]
|
|
347
|
+
|
|
348
|
+
# Get configuration for a specific worker
|
|
349
|
+
config = SolidQueueAutoscaler.config(:critical_worker)
|
|
350
|
+
```
|
|
351
|
+
|
|
352
|
+
## How It Works
|
|
353
|
+
|
|
354
|
+
### Metrics Collection
|
|
355
|
+
|
|
356
|
+
The autoscaler queries Solid Queue's PostgreSQL tables to collect:
|
|
357
|
+
|
|
358
|
+
- **Queue depth**: Count of jobs in `solid_queue_ready_executions`
|
|
359
|
+
- **Oldest job age**: Time since oldest job was enqueued (latency)
|
|
360
|
+
- **Throughput**: Jobs completed in the last minute
|
|
361
|
+
- **Active workers**: Workers with recent heartbeats
|
|
362
|
+
- **Per-queue breakdown**: Job counts by queue name
|
|
363
|
+
|
|
364
|
+
When `queues` is configured, metrics are filtered to only those queues.
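To make the filtering concrete, here is a minimal in-memory sketch of how queue depth, oldest job age, and the per-queue breakdown could be derived from a set of ready jobs. `ReadyJob` and `collect_metrics` are hypothetical names used only for illustration; the gem itself reads these values from the database, and its actual queries are not shown in this README.

```ruby
# Hypothetical stand-in for a row in solid_queue_ready_executions.
ReadyJob = Struct.new(:queue_name, :created_at, keyword_init: true)

# Computes illustrative metrics from in-memory rows.
# queues: nil means "all queues", mirroring the documented default.
def collect_metrics(rows, queues: nil, now: Time.now)
  selected = queues.nil? ? rows : rows.select { |row| queues.include?(row.queue_name) }
  oldest = selected.map(&:created_at).min

  {
    queue_depth: selected.size,
    # Age of the oldest ready job in seconds; 0.0 when nothing is queued.
    oldest_job_age_seconds: oldest ? (now - oldest).to_f : 0.0,
    per_queue: selected.group_by(&:queue_name).transform_values(&:size)
  }
end
```

Passing `queues: ['critical']` restricts every figure to that queue, which is the behaviour the `queues` option describes.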
|
|
365
|
+
|
|
366
|
+
### Decision Logic
|
|
367
|
+
|
|
368
|
+
**Scale Up** when ANY of these conditions are met:
|
|
369
|
+
- Queue depth >= `scale_up_queue_depth`
|
|
370
|
+
- Oldest job age >= `scale_up_latency_seconds`
|
|
371
|
+
|
|
372
|
+
**Scale Down** when ALL of these conditions are met:
|
|
373
|
+
- Queue depth <= `scale_down_queue_depth`
|
|
374
|
+
- Oldest job age <= `scale_down_latency_seconds`
|
|
375
|
+
- Alternatively, the queue is completely idle (no pending or claimed jobs), which is enough to scale down on its own
|
|
376
|
+
|
|
377
|
+
**No Change** when:
|
|
378
|
+
- Already at min/max workers
|
|
379
|
+
- Within cooldown period
|
|
380
|
+
- Metrics are in normal range
|
|
381
|
+
|
|
382
|
+
### Scaling Strategies
|
|
383
|
+
|
|
384
|
+
**Fixed Strategy** (default): Adds/removes a fixed number of workers per scaling event.
|
|
385
|
+
|
|
386
|
+
```ruby
|
|
387
|
+
config.scaling_strategy = :fixed
|
|
388
|
+
config.scale_up_increment = 2 # Add 2 workers when scaling up
|
|
389
|
+
config.scale_down_decrement = 1 # Remove 1 worker when scaling down
|
|
390
|
+
```
|
|
391
|
+
|
|
392
|
+
**Proportional Strategy**: Scales based on how far over/under thresholds you are.
|
|
393
|
+
|
|
394
|
+
```ruby
|
|
395
|
+
config.scaling_strategy = :proportional
|
|
396
|
+
config.scale_up_jobs_per_worker = 50 # Add 1 worker per 50 jobs over threshold
|
|
397
|
+
config.scale_up_latency_per_worker = 60 # Add 1 worker per 60s over threshold
|
|
398
|
+
```
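As a rough worked example of the proportional arithmetic, the sketch below computes a target worker count from the overage on each metric. `proportional_target` is a hypothetical helper, and several details are assumptions rather than documented behaviour: rounding up, taking the larger of the two results, adding at least one worker, and clamping to `max_workers`.

```ruby
# Illustrative only: how many workers to run after a proportional scale-up.
# config is a plain Hash keyed by the option names used in this README.
def proportional_target(current_workers, queue_depth:, oldest_job_age:, config:)
  jobs_over = [queue_depth - config[:scale_up_queue_depth], 0].max
  latency_over = [oldest_job_age - config[:scale_up_latency_seconds], 0].max

  # One worker per N jobs (or N seconds) over the threshold, rounded up.
  by_jobs = (jobs_over.to_f / config[:scale_up_jobs_per_worker]).ceil
  by_latency = (latency_over.to_f / config[:scale_up_latency_per_worker]).ceil

  # Assumption: use the more demanding signal and always add at least one worker.
  to_add = [by_jobs, by_latency, 1].max

  [current_workers + to_add, config[:max_workers]].min
end
```

With thresholds of 100 jobs and 50 jobs per worker, a queue depth of 250 is 150 jobs over, which works out to three additional workers under these assumptions.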
|
|
399
|
+
|
|
400
|
+
### Singleton Execution
|
|
401
|
+
|
|
402
|
+
PostgreSQL advisory locks ensure only one autoscaler instance runs at a time, even across multiple dynos/pods. Each worker configuration gets its own lock key, so different worker types can scale simultaneously.
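A minimal sketch of a per-worker advisory lock follows. `pg_try_advisory_lock` and `pg_advisory_unlock` are standard PostgreSQL functions, and `select_value` is the usual ActiveRecord connection method; everything else is an assumption for illustration, including the `WorkerLock` class, the `NAMESPACE` string, and deriving the key with CRC32. The gem's own `AdvisoryLock` may derive keys differently.

```ruby
require 'zlib'

# Hypothetical wrapper around a session-level PostgreSQL advisory lock.
class WorkerLock
  NAMESPACE = 'solid_queue_autoscaler' # assumed prefix, not taken from the gem

  attr_reader :key

  # connection: any object with select_value(sql), e.g. ActiveRecord::Base.connection
  def initialize(connection, worker_name)
    @connection = connection
    # A distinct integer key per worker name lets worker types scale concurrently.
    @key = Zlib.crc32("#{NAMESPACE}:#{worker_name}")
  end

  # Runs the block only if the lock was acquired. Returns true if it ran.
  def with_lock
    return false unless acquired?(@connection.select_value("SELECT pg_try_advisory_lock(#{@key})"))

    begin
      yield
      true
    ensure
      @connection.select_value("SELECT pg_advisory_unlock(#{@key})")
    end
  end

  private

  # Some drivers return the string 't' instead of a boolean.
  def acquired?(value)
    [true, 't'].include?(value)
  end
end
```

Session-level locks belong to the database connection that took them, so the lock and unlock calls need to run on the same connection.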
|
|
403
|
+
|
|
404
|
+
### Cooldowns
|
|
405
|
+
|
|
406
|
+
After each scaling event, a cooldown period prevents additional scaling:
|
|
407
|
+
- Prevents "flapping" between states
|
|
408
|
+
- Gives the platform time to spin up new workers
|
|
409
|
+
- Allows queue to stabilize after scaling
|
|
410
|
+
|
|
411
|
+
Cooldowns are tracked per-worker type, so scaling one worker doesn't block scaling another.
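The per-worker behaviour can be sketched with a small in-memory tracker. `CooldownSketch` is a hypothetical class, not the gem's `CooldownTracker`. The fallback from the per-direction overrides to `cooldown_seconds` follows the configuration table, while tracking each direction separately is an assumption.

```ruby
# Illustrative in-memory cooldown tracker keyed by worker name and direction.
class CooldownSketch
  def initialize(cooldown_seconds:, scale_up_cooldown_seconds: nil, scale_down_cooldown_seconds: nil)
    # A nil override falls back to the shared default.
    @durations = {
      scale_up: scale_up_cooldown_seconds || cooldown_seconds,
      scale_down: scale_down_cooldown_seconds || cooldown_seconds
    }
    @last_event_at = {}
  end

  # Remember when a worker type was last scaled in a given direction.
  def record(worker, direction, at: Time.now)
    @last_event_at[[worker, direction]] = at
  end

  # True while the cooldown window for this worker and direction is still open.
  def active?(worker, direction, now: Time.now)
    last = @last_event_at[[worker, direction]]
    !last.nil? && (now - last) < @durations.fetch(direction)
  end
end
```

Because the key includes the worker name, recording an event for `:critical_worker` leaves `:default_worker` free to scale immediately. A persistent version would store the same timestamps in the `solid_queue_autoscaler_state` table.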
|
|
412
|
+
|
|
413
|
+
## Environment Variables
|
|
414
|
+
|
|
415
|
+
### Heroku
|
|
416
|
+
|
|
417
|
+
| Variable | Description | Required |
|
|
418
|
+
|----------|-------------|----------|
|
|
419
|
+
| `HEROKU_API_KEY` | Heroku Platform API token | Yes |
|
|
420
|
+
| `HEROKU_APP_NAME` | Name of your Heroku app | Yes |
|
|
421
|
+
|
|
422
|
+
### Kubernetes
|
|
423
|
+
|
|
424
|
+
| Variable | Description | Required |
|
|
425
|
+
|----------|-------------|----------|
|
|
426
|
+
| `KUBERNETES_NAMESPACE` | Kubernetes namespace | No (defaults to 'default') |
|
|
427
|
+
|
|
428
|
+
## Dry Run Mode
|
|
429
|
+
|
|
430
|
+
Test the autoscaler without making actual changes:
|
|
431
|
+
|
|
432
|
+
```ruby
|
|
433
|
+
SolidQueueAutoscaler.configure do |config|
|
|
434
|
+
config.dry_run = true
|
|
435
|
+
# ... other config
|
|
436
|
+
end
|
|
437
|
+
```
|
|
438
|
+
|
|
439
|
+
In dry-run mode, all decisions are logged but no platform API calls are made.
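The dry-run guard can be pictured as a thin wrapper around the platform call. This is a sketch under assumptions: `apply_scaling` and the adapter's `scale` method are hypothetical names, and the log wording is invented for illustration.

```ruby
# Applies a scaling decision, or only logs it when dry_run is true.
# adapter: any object with scale(count); logger: any object with info(message).
def apply_scaling(adapter, target_workers, dry_run:, logger:)
  if dry_run
    # No platform API call is made on this path.
    logger.info("[DRY RUN] Would scale to #{target_workers} workers")
    return :skipped
  end

  adapter.scale(target_workers)
  logger.info("Scaled to #{target_workers} workers")
  :scaled
end
```

In a Rails app, `Rails.logger` would satisfy the `logger` argument, since it responds to `info`.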
|
|
440
|
+
|
|
441
|
+
## Dashboard
|
|
442
|
+
|
|
443
|
+
The optional dashboard provides a web UI for monitoring the autoscaler:
|
|
444
|
+
|
|
445
|
+
### Features
|
|
446
|
+
|
|
447
|
+
- **Overview Dashboard**: Real-time metrics, worker status, and recent events
|
|
448
|
+
- **Workers View**: Detailed status for each worker type with configuration and cooldowns
|
|
449
|
+
- **Events Log**: Historical record of all scaling decisions with filtering
|
|
450
|
+
- **Manual Scaling**: Trigger scale operations directly from the UI
|
|
451
|
+
|
|
452
|
+
### Setup
|
|
453
|
+
|
|
454
|
+
1. Generate the dashboard migration:
|
|
455
|
+
|
|
456
|
+
```bash
|
|
457
|
+
rails generate solid_queue_autoscaler:dashboard
|
|
458
|
+
rails db:migrate
|
|
459
|
+
```
|
|
460
|
+
|
|
461
|
+
2. Mount the engine in `config/routes.rb`:
|
|
462
|
+
|
|
463
|
+
```ruby
|
|
464
|
+
authenticate :user, ->(u) { u.admin? } do
|
|
465
|
+
mount SolidQueueAutoscaler::Dashboard::Engine => "/autoscaler"
|
|
466
|
+
end
|
|
467
|
+
```
|
|
468
|
+
|
|
469
|
+
3. Visit `/autoscaler` in your browser
|
|
470
|
+
|
|
471
|
+
### Event Recording
|
|
472
|
+
|
|
473
|
+
By default, all scaling events are recorded to the database. Configure in your initializer:
|
|
474
|
+
|
|
475
|
+
```ruby
|
|
476
|
+
SolidQueueAutoscaler.configure do |config|
|
|
477
|
+
# Record scale_up, scale_down, skipped, and error events (default: true)
|
|
478
|
+
config.record_events = true
|
|
479
|
+
|
|
480
|
+
# Also record no_change events (verbose, default: false)
|
|
481
|
+
config.record_all_events = false
|
|
482
|
+
end
|
|
483
|
+
```
|
|
484
|
+
|
|
485
|
+
### Rake Tasks for Events
|
|
486
|
+
|
|
487
|
+
```bash
|
|
488
|
+
# View recent scale events
|
|
489
|
+
bundle exec rake solid_queue_autoscaler:events
|
|
490
|
+
|
|
491
|
+
# View events for a specific worker
|
|
492
|
+
WORKER=critical_worker bundle exec rake solid_queue_autoscaler:events
|
|
493
|
+
|
|
494
|
+
# Cleanup old events (default: keep 30 days)
|
|
495
|
+
bundle exec rake solid_queue_autoscaler:cleanup_events
|
|
496
|
+
KEEP_DAYS=7 bundle exec rake solid_queue_autoscaler:cleanup_events
|
|
497
|
+
```
|
|
498
|
+
|
|
499
|
+
## Troubleshooting
|
|
500
|
+
|
|
501
|
+
### "Could not acquire advisory lock"
|
|
502
|
+
|
|
503
|
+
Another autoscaler instance is currently running. This is expected behavior — only one instance should run at a time per worker type.
|
|
504
|
+
|
|
505
|
+
### "Cooldown active"
|
|
506
|
+
|
|
507
|
+
A recent scaling event triggered the cooldown. Wait for the cooldown to expire or adjust `cooldown_seconds`.
|
|
508
|
+
|
|
509
|
+
### Workers not scaling
|
|
510
|
+
|
|
511
|
+
1. Check that `enabled` is `true`
|
|
512
|
+
2. Verify platform credentials are set correctly
|
|
513
|
+
3. Check metrics with `rake solid_queue_autoscaler:metrics`
|
|
514
|
+
4. Enable dry-run to see what decisions would be made
|
|
515
|
+
5. Check the logs for error messages
|
|
516
|
+
|
|
517
|
+
### Kubernetes authentication issues
|
|
518
|
+
|
|
519
|
+
1. Ensure the service account has permissions to patch deployments
|
|
520
|
+
2. Check namespace is correct
|
|
521
|
+
3. Verify deployment name matches exactly
|
|
522
|
+
|
|
523
|
+
## Architecture Notes
|
|
524
|
+
|
|
525
|
+
This gem acts as a **control plane** for Solid Queue:
|
|
526
|
+
|
|
527
|
+
- **External to workers**: The autoscaler must not depend on the workers it's scaling
|
|
528
|
+
- **Singleton**: Advisory locks ensure only one instance runs globally per worker type
|
|
529
|
+
- **Dedicated queue**: Runs on its own queue to avoid competing with business jobs
|
|
530
|
+
- **Conservative**: Defaults to gradual scaling with cooldowns
|
|
531
|
+
- **Multi-tenant**: Each worker configuration is independent
|
|
532
|
+
|
|
533
|
+
## License
|
|
534
|
+
|
|
535
|
+
MIT License. See [LICENSE.txt](LICENSE.txt) for details.
|
|
536
|
+
|
|
537
|
+
## Contributing
|
|
538
|
+
|
|
539
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup and contribution guidelines.
|
|
540
|
+
|
|
541
|
+
1. Fork the repository
|
|
542
|
+
2. Create your feature branch (`git checkout -b feature/my-feature`)
|
|
543
|
+
3. Write tests for your changes
|
|
544
|
+
4. Ensure all tests pass (`bundle exec rspec`)
|
|
545
|
+
5. Ensure RuboCop passes (`bundle exec rubocop`)
|
|
546
|
+
6. Submit a pull request
|
|
547
|
+
|
|
548
|
+
## Links
|
|
549
|
+
|
|
550
|
+
- [GitHub Repository](https://github.com/reillyse/solid_queue_autoscaler)
|
|
551
|
+
- [RubyGems](https://rubygems.org/gems/solid_queue_autoscaler)
|
|
552
|
+
- [Changelog](CHANGELOG.md)
|
|
553
|
+
|
|
data/lib/generators/solid_queue_autoscaler/dashboard_generator.rb
ADDED
|
@@ -0,0 +1,54 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require 'rails/generators'
|
|
4
|
+
require 'rails/generators/active_record'
|
|
5
|
+
|
|
6
|
+
module SolidQueueAutoscaler
|
|
7
|
+
module Generators
|
|
8
|
+
# Generator for the dashboard migrations.
|
|
9
|
+
# Creates the scale events table for tracking autoscaler history.
|
|
10
|
+
#
|
|
11
|
+
# @example Run the generator
|
|
12
|
+
# rails generate solid_queue_autoscaler:dashboard
|
|
13
|
+
class DashboardGenerator < Rails::Generators::Base
|
|
14
|
+
include ActiveRecord::Generators::Migration
|
|
15
|
+
|
|
16
|
+
source_root File.expand_path('templates', __dir__)
|
|
17
|
+
|
|
18
|
+
desc 'Creates migrations for SolidQueueAutoscaler dashboard (events table)'
|
|
19
|
+
|
|
20
|
+
def create_migration_file
|
|
21
|
+
migration_template 'create_solid_queue_autoscaler_events.rb.erb',
|
|
22
|
+
'db/migrate/create_solid_queue_autoscaler_events.rb'
|
|
23
|
+
end
|
|
24
|
+
|
|
25
|
+
def show_post_install
|
|
26
|
+
say ''
|
|
27
|
+
say '=== Solid Queue Autoscaler Dashboard Setup ==='
|
|
28
|
+
say ''
|
|
29
|
+
say 'Next steps:'
|
|
30
|
+
say ' 1. Run migrations: rails db:migrate'
|
|
31
|
+
say ' 2. Mount the dashboard in config/routes.rb:'
|
|
32
|
+
say ''
|
|
33
|
+
say ' mount SolidQueueAutoscaler::Dashboard::Engine => "/autoscaler"'
|
|
34
|
+
say ''
|
|
35
|
+
say ' 3. For authentication, wrap in a constraint:'
|
|
36
|
+
say ''
|
|
37
|
+
say ' authenticate :user, ->(u) { u.admin? } do'
|
|
38
|
+
say ' mount SolidQueueAutoscaler::Dashboard::Engine => "/autoscaler"'
|
|
39
|
+
say ' end'
|
|
40
|
+
say ''
|
|
41
|
+
say 'View the dashboard at: /autoscaler'
|
|
42
|
+
say ''
|
|
43
|
+
end
|
|
44
|
+
|
|
45
|
+
private
|
|
46
|
+
|
|
47
|
+
def migration_version
|
|
48
|
+
return unless defined?(ActiveRecord::VERSION)
|
|
49
|
+
|
|
50
|
+
"[#{ActiveRecord::VERSION::MAJOR}.#{ActiveRecord::VERSION::MINOR}]"
|
|
51
|
+
end
|
|
52
|
+
end
|
|
53
|
+
end
|
|
54
|
+
end
|
|
data/lib/generators/solid_queue_autoscaler/install_generator.rb
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require 'rails/generators'
|
|
4
|
+
|
|
5
|
+
module SolidQueueAutoscaler
|
|
6
|
+
module Generators
|
|
7
|
+
class InstallGenerator < Rails::Generators::Base
|
|
8
|
+
source_root File.expand_path('templates', __dir__)
|
|
9
|
+
|
|
10
|
+
desc 'Creates a SolidQueueAutoscaler initializer'
|
|
11
|
+
|
|
12
|
+
def copy_initializer
|
|
13
|
+
template 'initializer.rb', 'config/initializers/solid_queue_autoscaler.rb'
|
|
14
|
+
end
|
|
15
|
+
|
|
16
|
+
def show_readme
|
|
17
|
+
readme 'README' if behavior == :invoke
|
|
18
|
+
end
|
|
19
|
+
end
|
|
20
|
+
end
|
|
21
|
+
end
|
|
data/lib/generators/solid_queue_autoscaler/migration_generator.rb
ADDED
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require 'rails/generators'
|
|
4
|
+
require 'rails/generators/active_record'
|
|
5
|
+
|
|
6
|
+
module SolidQueueAutoscaler
|
|
7
|
+
module Generators
|
|
8
|
+
class MigrationGenerator < Rails::Generators::Base
|
|
9
|
+
include ActiveRecord::Generators::Migration
|
|
10
|
+
|
|
11
|
+
source_root File.expand_path('templates', __dir__)
|
|
12
|
+
|
|
13
|
+
desc 'Creates the migration for SolidQueueAutoscaler state table'
|
|
14
|
+
|
|
15
|
+
def create_migration_file
|
|
16
|
+
migration_template 'create_solid_queue_autoscaler_state.rb.erb',
|
|
17
|
+
'db/migrate/create_solid_queue_autoscaler_state.rb'
|
|
18
|
+
end
|
|
19
|
+
|
|
20
|
+
private
|
|
21
|
+
|
|
22
|
+
def migration_version
|
|
23
|
+
return unless defined?(ActiveRecord::VERSION)
|
|
24
|
+
|
|
25
|
+
"[#{ActiveRecord::VERSION::MAJOR}.#{ActiveRecord::VERSION::MINOR}]"
|
|
26
|
+
end
|
|
27
|
+
end
|
|
28
|
+
end
|
|
29
|
+
end
|
|
data/lib/generators/solid_queue_autoscaler/templates/README
ADDED
|
@@ -0,0 +1,41 @@
|
|
|
1
|
+
===============================================================================
|
|
2
|
+
|
|
3
|
+
SolidQueueAutoscaler has been installed!
|
|
4
|
+
|
|
5
|
+
Next steps:
|
|
6
|
+
|
|
7
|
+
1. Set environment variables:
|
|
8
|
+
- HEROKU_API_KEY: Generate with `heroku authorizations:create -d "Solid Queue Autoscaler"`
|
|
9
|
+
- HEROKU_APP_NAME: Your Heroku app name
|
|
10
|
+
|
|
11
|
+
2. Run the migration generator for persistent cooldown tracking:
|
|
12
|
+
|
|
13
|
+
rails generate solid_queue_autoscaler:migration
|
|
14
|
+
rails db:migrate
|
|
15
|
+
|
|
16
|
+
3. Review config/initializers/solid_queue_autoscaler.rb and adjust thresholds
|
|
17
|
+
|
|
18
|
+
4. Add the recurring job to config/recurring.yml:
|
|
19
|
+
|
|
20
|
+
autoscaler:
|
|
21
|
+
class: SolidQueueAutoscaler::AutoscaleJob
|
|
22
|
+
queue: autoscaler
|
|
23
|
+
schedule: every 30 seconds
|
|
24
|
+
|
|
25
|
+
5. Configure a dedicated queue in config/queue.yml:
|
|
26
|
+
|
|
27
|
+
queues:
|
|
28
|
+
- autoscaler
|
|
29
|
+
- default
|
|
30
|
+
|
|
31
|
+
workers:
|
|
32
|
+
- queues: [autoscaler]
|
|
33
|
+
threads: 1
|
|
34
|
+
- queues: [default]
|
|
35
|
+
threads: 5
|
|
36
|
+
|
|
37
|
+
6. Test with dry_run mode before enabling in production
|
|
38
|
+
|
|
39
|
+
For more information, see: https://github.com/reillyse/solid_queue_autoscaler
|
|
40
|
+
|
|
41
|
+
===============================================================================
|