dispatch_policy 0.2.0 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/MIT-LICENSE +16 -17
- data/README.md +433 -388
- data/app/assets/stylesheets/dispatch_policy/application.css +157 -0
- data/app/controllers/dispatch_policy/application_controller.rb +45 -1
- data/app/controllers/dispatch_policy/dashboard_controller.rb +91 -0
- data/app/controllers/dispatch_policy/partitions_controller.rb +122 -0
- data/app/controllers/dispatch_policy/policies_controller.rb +94 -267
- data/app/controllers/dispatch_policy/staged_jobs_controller.rb +9 -0
- data/app/models/dispatch_policy/adaptive_concurrency_stats.rb +11 -81
- data/app/models/dispatch_policy/inflight_job.rb +12 -0
- data/app/models/dispatch_policy/partition.rb +21 -0
- data/app/models/dispatch_policy/staged_job.rb +4 -97
- data/app/models/dispatch_policy/tick_sample.rb +11 -0
- data/app/views/dispatch_policy/dashboard/index.html.erb +109 -0
- data/app/views/dispatch_policy/partitions/index.html.erb +63 -0
- data/app/views/dispatch_policy/partitions/show.html.erb +106 -0
- data/app/views/dispatch_policy/policies/index.html.erb +15 -37
- data/app/views/dispatch_policy/policies/show.html.erb +139 -223
- data/app/views/dispatch_policy/shared/_capacity.html.erb +67 -0
- data/app/views/dispatch_policy/shared/_hints.html.erb +13 -0
- data/app/views/dispatch_policy/shared/_partition_row.html.erb +12 -0
- data/app/views/dispatch_policy/staged_jobs/show.html.erb +31 -0
- data/app/views/layouts/dispatch_policy/application.html.erb +95 -238
- data/config/routes.rb +18 -2
- data/db/migrate/20260501000001_create_dispatch_policy_tables.rb +103 -0
- data/lib/dispatch_policy/bypass.rb +23 -0
- data/lib/dispatch_policy/config.rb +85 -0
- data/lib/dispatch_policy/context.rb +50 -0
- data/lib/dispatch_policy/cursor_pagination.rb +121 -0
- data/lib/dispatch_policy/decision.rb +22 -0
- data/lib/dispatch_policy/engine.rb +4 -27
- data/lib/dispatch_policy/forwarder.rb +63 -0
- data/lib/dispatch_policy/gate.rb +10 -38
- data/lib/dispatch_policy/gates/adaptive_concurrency.rb +99 -97
- data/lib/dispatch_policy/gates/concurrency.rb +45 -26
- data/lib/dispatch_policy/gates/throttle.rb +65 -41
- data/lib/dispatch_policy/inflight_tracker.rb +174 -0
- data/lib/dispatch_policy/job_extension.rb +155 -0
- data/lib/dispatch_policy/operator_hints.rb +126 -0
- data/lib/dispatch_policy/pipeline.rb +48 -0
- data/lib/dispatch_policy/policy.rb +61 -59
- data/lib/dispatch_policy/policy_dsl.rb +120 -0
- data/lib/dispatch_policy/railtie.rb +35 -0
- data/lib/dispatch_policy/registry.rb +46 -0
- data/lib/dispatch_policy/repository.rb +723 -0
- data/lib/dispatch_policy/serializer.rb +36 -0
- data/lib/dispatch_policy/tick.rb +260 -256
- data/lib/dispatch_policy/tick_loop.rb +59 -26
- data/lib/dispatch_policy/version.rb +1 -1
- data/lib/dispatch_policy.rb +71 -52
- data/lib/generators/dispatch_policy/install/install_generator.rb +70 -0
- data/lib/generators/dispatch_policy/install/templates/create_dispatch_policy_tables.rb.tt +95 -0
- data/lib/generators/dispatch_policy/install/templates/dispatch_tick_loop_job.rb.tt +53 -0
- data/lib/generators/dispatch_policy/install/templates/initializer.rb.tt +11 -0
- metadata +101 -43
- data/CHANGELOG.md +0 -43
- data/app/models/dispatch_policy/partition_inflight_count.rb +0 -42
- data/app/models/dispatch_policy/partition_observation.rb +0 -76
- data/app/models/dispatch_policy/throttle_bucket.rb +0 -41
- data/db/migrate/20260424000001_create_dispatch_policy_tables.rb +0 -80
- data/db/migrate/20260424000002_create_adaptive_concurrency_stats.rb +0 -22
- data/db/migrate/20260424000003_create_adaptive_concurrency_samples.rb +0 -25
- data/db/migrate/20260424000004_rename_samples_to_partition_observations.rb +0 -32
- data/db/migrate/20260425000001_add_duration_to_partition_observations.rb +0 -8
- data/lib/dispatch_policy/active_job_perform_all_later_patch.rb +0 -32
- data/lib/dispatch_policy/dispatch_context.rb +0 -53
- data/lib/dispatch_policy/dispatchable.rb +0 -123
- data/lib/dispatch_policy/gates/fair_interleave.rb +0 -32
- data/lib/dispatch_policy/gates/global_cap.rb +0 -26
data/README.md
CHANGED
|
@@ -1,550 +1,595 @@
 # DispatchPolicy
 
-> **⚠️ Experimental
-> 
-> 
-> 
-> 
+> **⚠️ Experimental v2 branch.** This is the `v2` branch of
+> [ceritium/dispatch_policy](https://github.com/ceritium/dispatch_policy)
+> — an alternative cut: TX-atomic admission, in-tick fairness as a
+> layer (not a gate), and a single canonical partition scope per
+> policy. API, schema, and defaults can change between any two
+> commits. The `master` branch of the same repo is the original
+> design and is what the published gem (when one ships) tracks.
 >
-> **PostgreSQL only
-> 
-> 
-> 
-> 
-> (shadow columns for `jsonb`, full indexes instead of partial, a
-> different batch-fetch strategy for fairness). Contributions welcome.
+> **PostgreSQL only.** Staging, admission, and adaptive stats lean on
+> `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`, `ON CONFLICT`,
+> and the adapter sharing `ActiveRecord::Base.connection` so the
+> admit + adapter INSERT can join one transaction. Tested against
+> good_job and solid_queue.
 
 Per-partition admission control for ActiveJob. Stages `perform_later`
 into a dedicated table, runs a tick loop that admits jobs through
-declared gates (throttle
-
+declared gates (`throttle`, `concurrency`, `adaptive_concurrency`),
+then forwards survivors to the real adapter. The admission and the
+adapter INSERT happen inside one Postgres transaction, so a worker
+crash mid-tick can't lose a job.
 
 Use it when you need:
 
-- **Per-tenant / per-endpoint throttle**
-
-- **Per-partition concurrency**
-completion
+- **Per-tenant / per-endpoint throttle** — token bucket per partition,
+  refreshed lazily on read.
+- **Per-partition concurrency** — fixed cap on in-flight jobs with a
+  release hook on completion and a heartbeat-based reaper for crashes.
 - **Adaptive concurrency** — a cap that shrinks under queue pressure
-and grows back when workers keep up,
-- **
-
-tenant's burst can't starve the others
-
-
+  and grows back when workers keep up, no manual tuning per tenant.
+- **In-tick fairness** — within a single tick, partitions are reordered
+  by recent activity (EWMA) and an optional global cap is shared
+  fairly across them. So one tenant's burst can't starve the others.
+- **Sharding** — split a policy across N queues so independent tick
+  workers admit in parallel.
 
 ## Demo
 
-
-
-
-
-
+The demo lives in `test/dummy/` — a tiny Rails app inside this repo.
+Run it locally to play with every gate and the admin UI:
+
+```bash
+bin/dummy setup good_job # creates the DB and migrates
+DUMMY_ADAPTER=good_job bundle exec foreman start
+```
+
+Then open:
+
+- `http://localhost:3000/` — playground with one card per job and a
+  storm form that exercises the adaptive cap and fairness reorder
+  across many tenants.
+- `http://localhost:3000/dispatch_policy` — admin UI: live throughput,
+  partition state, denial reasons, capacity hints.
+
+The dummy ships ten purpose-built jobs covering throttle, concurrency,
+mixed gates, scheduling, retries, stress tests, sharding, fairness, and
+adaptive concurrency. See `test/dummy/app/jobs/`.
 
 ## Install
 
 Add to your `Gemfile`:
 
 ```ruby
-gem "dispatch_policy"
+gem "dispatch_policy",
+    git: "https://github.com/ceritium/dispatch_policy",
+    branch: "v2"
 ```
 
-
+Generate the install bundle (migration + initializer + tick loop job):
 
-```
-
-
+```bash
+bin/rails generate dispatch_policy:install
+bin/rails db:migrate
 ```
 
-Mount the admin UI
+Mount the admin UI (optional but recommended):
 
 ```ruby
-mount DispatchPolicy::Engine
+mount DispatchPolicy::Engine, at: "/dispatch_policy"
 ```
 
-
+Then schedule the tick loop. The generator wrote a
+`DispatchTickLoopJob` in `app/jobs/`; kick it off once and it
+re-enqueues itself:
 
 ```ruby
-
-c.enabled = ENV.fetch("DISPATCH_POLICY_ENABLED", "true") != "false"
-c.lease_duration = 15.minutes
-c.batch_size = 500
-c.round_robin_quantum = 50
-c.tick_sleep = 1 # idle
-c.tick_sleep_busy = 0.05 # after productive ticks
-end
+DispatchTickLoopJob.perform_later
 ```
 
 ## Flow
 
 ```
 ActiveJob#perform_later
-→
-→
+→ JobExtension.around_enqueue_for
+→ Repository.stage! (INSERT staged + UPSERT partition; ctx refreshed)
 
 (tick loop, periodically)
-→
-→
-→
-
+→ claim_partitions (FOR UPDATE SKIP LOCKED, ordered by last_checked_at)
+→ reorder by decayed_admits ASC (in-tick fairness)
+→ for each: pipeline.call(ctx, partition, fair_share)
+→ gates evaluate; admit_count = min(allowed)
+→ ONE TX: claim_staged_jobs! + insert_inflight! + Forwarder.dispatch
+  (the adapter INSERT shares the TX; rollback if anything raises)
+→ bulk-flush deny-state in one UPDATE ... FROM (VALUES ...)
 
 (worker runs perform)
-→
+→ InflightTracker.track (around_perform)
+→ INSERT inflight_jobs ON CONFLICT DO NOTHING
+→ spawn heartbeat thread
 → block.call
-→
+→ record_observation on adaptive gates (queue_lag → AIMD update)
+→ DELETE inflight_jobs
 ```
 
 ## Declaring a policy
 
 ```ruby
-class
-
+class FetchEndpointJob < ApplicationJob
+  dispatch_policy_inflight_tracking # only required if a concurrency gate is used
 
-  dispatch_policy do
-    # Persisted in the staged row so gates can read it without touching AR.
+  dispatch_policy :endpoints do
     context ->(args) {
       event = args.first
-      {
+      {
+        endpoint_id: event.endpoint_id,
+        rate_limit: event.endpoint.rate_limit,
+        max_per_account: event.account.dispatch_concurrency
+      }
     }
 
-    #
-
-
-    # Tenant fairness — see the "Round-robin" section below.
-    round_robin_by ->(args) { args.first.account_id }
+    # Required: every gate in the policy enforces against this scope.
+    partition_by ->(ctx) { ctx[:endpoint_id] }
 
     gate :throttle,
-      rate:
-      per:
-
+      rate: ->(ctx) { ctx[:rate_limit] },
+      per: 1.minute
+
+    gate :concurrency,
+      max: ->(ctx) { ctx[:max_per_account] || 5 }
 
-
+    retry_strategy :restage # default; alternative: :bypass
   end
 
-  def perform(event)
+  def perform(event)
+    # ... call the rate-limited HTTP endpoint
+  end
 end
 ```
 
 `perform_later` stages the job; the tick admits it when its gates pass.
+With multiple gates the actual `admit_count` per tick comes out as
+`min(allowed)` across all of them.
 
-
-a shared pool) skip ahead to [Recipes](#multi-tenant-webhook-delivery)
-— `round_robin_by weight: :time` plus `:adaptive_concurrency` covers
-it without an explicit throttle.
+## Choosing the partition scope
 
-
+`partition_by` is the most consequential decision in a policy and the
+only required field. It tells the gem **what counts as one logical
+partition** — what scope each gate enforces against, and what the
+in-tick fairness reorder operates over.
 
-
-
-`
+A policy with `partition_by` and **no gates** is also valid: the
+pipeline passes the full budget through, and the Tick caps it via
+`admission_batch_size` (or `tick_admission_budget` if set). Useful
+for "balance N tenants evenly" without rate-limiting any of them.
 
-
+If you need genuinely different scopes per gate (throttle by endpoint
+AND concurrency by account, each enforced at its own scope), **split
+into two policies** and chain them: the staging policy admits, its
+worker enqueues into the second.
 
-
-partition. Tracks in-flight counts in
-`dispatch_policy_partition_counts`; decremented by the `around_perform`
-hook when the job finishes, or by the reaper when a lease expires
-(worker crashed).
-
-```ruby
-gate :concurrency,
-  max: ->(ctx) { ctx[:max_per_account] || 5 },
-  partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
-```
+## Gates
 
-
-
-
+Gates run in declared order; each narrows the survivor count. Every
+option that takes a value can alternatively take a lambda receiving
+the `ctx` hash, so parameters can depend on per-job data.
 
 ### `:throttle` — token-bucket rate limit per partition
 
-Refills `rate` tokens every `per` seconds, capped at `
-
-pending for the next tick.
+Refills `rate` tokens every `per` seconds, capped at `rate` (no
+separate burst). Admits jobs while tokens are available; leaves the
+rest pending for the next tick. State is persisted in
+`partitions.gate_state.throttle`.
 
 ```ruby
 gate :throttle,
-  rate:
-  per:
-  burst: 100, # bucket cap (optional, defaults to rate)
-  partition_by: ->(ctx) { "host:#{ctx[:host]}" }
+  rate: ->(ctx) { ctx[:rate_limit] },
+  per: 1.minute
 ```
 
-
-
+Throttle does **not** release tokens on completion — tokens refill
+only with elapsed time.
+
+### `:concurrency` — in-flight cap per partition
+
+Caps the number of admitted-but-not-yet-completed jobs per partition.
+Counts rows in `dispatch_policy_inflight_jobs` keyed by the policy's
+canonical partition. Decremented by `InflightTracker.track`'s
+`around_perform`; reaped by a periodic sweeper if a worker crashes.
 
 ```ruby
-gate :
-
-  per: 1.minute,
-  partition_by: ->(ctx) { ctx[:endpoint_id] }
+gate :concurrency,
+  max: ->(ctx) { ctx[:max_per_account] || 5 }
 ```
 
-
-
+When the cap is full, the gate returns `retry_after = full_backoff`
+(default 1s) so the partition skips the next ticks instead of
+hammering `count(*)` every iteration.
 
-### `:
+### `:adaptive_concurrency` — per-partition cap that self-tunes
 
-
-
-
+Like `:concurrency` but the cap (`current_max`) shrinks when the
+adapter queue backs up and grows when workers drain it quickly.
+AIMD loop on a per-partition stats row in
+`dispatch_policy_adaptive_concurrency_stats`.
 
 ```ruby
-gate :
-
+gate :adaptive_concurrency,
+  initial_max: 3,
+  target_lag_ms: 1000, # acceptable queue wait before backoff
+  min: 1               # floor; a partition can't lock out
 ```
 
-
+- **Feedback signal**: `admitted_at → perform_start` (queue wait in
+  the real adapter). Pure saturation signal — slow performs in the
+  downstream service don't punish admissions if workers still drain
+  the queue quickly.
+- **Growth**: `current_max += 1` per fast success.
+- **Slow shrink**: `current_max *= 0.95` when EWMA lag > target.
+- **Failure shrink**: `current_max *= 0.5` when `perform` raises.
+- **Safety valve**: when `in_flight == 0` the gate floors `remaining`
+  at `initial_max` so a partition that AIMD shrunk to `min` during
+  a past burst can re-grow when it idles.
 
-
+#### Choosing `target_lag_ms`
 
-
-partition and interleaves, so no single partition can starve others
-even if it has many pending jobs.
+It's the knob that trades latency for throughput. Rough guide:
 
-
-
-
-
+- **Too low** (10–50 ms): the gate reacts to every tiny bump in
+  queue wait and shrinks aggressively. Workers idle while jobs sit
+  pending — overshoot.
+- **Too high** (30 s+): the gate barely pushes back; throughput is
+  near-max but new admissions wait seconds before a worker picks
+  them up.
+- **Reasonable starting point**: `≈ worker_threads × avg_perform_ms`.
+  E.g. 5 workers × 200 ms perform = 1000 ms means "queue depth up
+  to ~1 s is fine".
 
-
-off the first partition a row picked up.
+## Fairness within a tick
 
-
+When several partitions compete for admission inside the same tick,
+the gem reorders them by **least-recently-active first** so a hot
+partition with thousands of pending jobs cannot starve a cold one
+that just woke up.
 
-The
-
-
-per-partition stats row (`dispatch_policy_adaptive_concurrency_stats`).
+The mechanism has two knobs: an EWMA half-life (controls *how* the
+order is decided) and an optional global tick cap (controls *how
+much* each partition is allowed in one tick).
 
-
-
-
-
-
-
-
+### `fairness half_life:`
+
+Each partition keeps `decayed_admits` and `decayed_admits_at`,
+updated atomically inside the admit transaction:
+
+```
+decayed_admits := decayed_admits * exp(-Δt / τ) + admitted
+where τ = half_life / ln(2)
 ```
 
-
-
-
-the queue quickly.
-- **Growth**: +1 per fast success. No hard ceiling; the algorithm
-  self-limits via `target_lag_ms`. If the queue builds up, the cap
-  shrinks multiplicatively.
-- **Failure**: `current_max *= 0.5` (halve) when `perform` raises.
-- **Slow**: `current_max *= 0.95` when EWMA lag > target.
+After `half_life` seconds without admitting, the value halves. The
+Tick sorts the claimed batch by current `decayed_admits` ASC, so the
+under-admitted go first.
 
-
+| Value | Behaviour |
+|-------|-----------|
+| 5–10 s | Reacts to brief pauses. Bursty workloads where short stalls deserve a head start. |
+| **60 s** (default) | Stable steady-state. Hot partitions stay "hot" through normal latency variation. |
+| 5–15 min | Long memory. Burst on partition A penalises A for many minutes. |
 
-
+Set `c.fairness_half_life_seconds = nil` to disable the reorder
+entirely — partitions are processed in `claim_partitions` order
+(last-checked-first).
 
-
-
-
-
--
-
-
-up.
-- **Reasonable starting point**: `≈ worker_max_threads × avg_perform_ms`.
-  If you run 5 workers at ~200 ms/perform, `target_lag_ms: 1000`
-  means "it's OK if the adapter queue stays at most ~1 second
-  deep". You'll want to tune from there based on what your
-  downstream tolerates and how fast you want bursts to drain.
-
-Pair it with `round_robin_by` for multi-tenant systems that want
-automatic backpressure without hand-tuned caps per tenant:
+### `tick_admission_budget`
+
+Without this, each partition admits up to `admission_batch_size`.
+With it set, the per-partition ceiling becomes `fair_share = ceil(cap
+/ claimed_partitions)`. Pass-1 walks the (decay-sorted) partitions
+giving each up to `fair_share`; pass-2 redistributes any leftover to
+those that filled their share.
 
 ```ruby
-
-
-
-
-
+DispatchPolicy.configure do |c|
+  c.fairness_half_life_seconds = 60
+  c.tick_admission_budget = nil # default — no global cap
+end
+
+# Per-policy override:
+dispatch_policy :endpoints do
+  partition_by ->(c) { c[:endpoint_id] }
+  fairness half_life: 30.seconds
+  tick_admission_budget 200
+  gate :throttle, rate: 100, per: 60
+end
 ```
 
-
+When the cap is hit before all partitions admit, the rest are denied
+with reason `tick_cap_exhausted`. They were still observed
+(`last_checked_at` bumped), so they're at the front of the next
+tick's order.
+
+### Anti-stagnation
 
-
-
-
-
-
-one concurrency cap.
+The decay-based reorder only applies to partitions already claimed.
+Selection (`Repository.claim_partitions`) still orders by
+`last_checked_at NULLS FIRST, id`. Every active partition with
+pending jobs is visited in at most ⌈N / partition_batch_size⌉ ticks
+regardless of how hot or cold it is.
 
-
+### Mixing `:adaptive_concurrency` with fairness
 
-
-
-urgent work should jump ahead, set a lower ActiveJob `priority`
-(the admission SELECT is `ORDER BY priority, staged_at`) — or split
-into a subclass with its own policy.
-- `dedupe_key` is queue-agnostic: the same key enqueued to
-  `:urgent` and `:low` dedupes to one row.
+Adaptive and fairness operate at different layers and compose
+without sharing state:
 
-
+- **Fairness** writes `partitions.decayed_admits` inside the
+  per-partition admit TX.
+- **Adaptive** writes `dispatch_policy_adaptive_concurrency_stats`
+  from the worker's `around_perform` via `record_observation`.
 
-
-
+Different tables, different locks. Each tick the actual admit_count
+becomes `min(fair_share, current_max - in_flight)` (with the
+adaptive safety valve when `in_flight == 0`). Fairness picks order +
+budget per tick; adaptive shapes how aggressively each partition
+consumes its share.
 
 ```ruby
-
-
+dispatch_policy :tenants do
+  partition_by ->(c) { c[:tenant] }
 
-
-
+  gate :adaptive_concurrency,
+    initial_max: 5,
+    target_lag_ms: 1000,
+    min: 1
 
-
-
-  gate :throttle,
-    rate: 100,
-    per: 1.minute,
-    partition_by: ->(ctx) { "#{ctx[:queue_name]}:#{ctx[:account_id]}" }
-end
+  fairness half_life: 30.seconds
+  tick_admission_budget 60
 end
-
-SendEmailJob.set(queue: :urgent).perform_later(user)
-SendEmailJob.set(queue: :default).perform_later(user)
-# → two partitions, each with its own bucket.
 ```
 
-
-
+The dummy `AdaptiveDemoJob` declares both; the storm form drives it
+across many tenants with a triangular weight distribution so you can
+watch the EWMA reorder hot tenants AND the AIMD shrink their cap.
+Integration test: `test/integration/adaptive_with_fairness_test.rb`.
+
+## Sharding a policy across worker pools
+
+Shards partition the gem horizontally: each tick worker sees only
+the partitions on its own shard, so multiple workers can admit in
+parallel for the same policy. Declare a `shard_by`:
 
 ```ruby
-
-
-
-
-
-
+dispatch_policy :events do
+  context ->(args) { { account_id: args.first[:account_id] } }
+  partition_by ->(c) { "acct:#{c[:account_id]}" }
+  shard_by ->(c) { "events-shard-#{c[:account_id].hash.abs % 4}" }
+
+  gate :concurrency, max: 50
 end
 ```
 
-
+Run one `DispatchTickLoopJob` per shard:
+
+```ruby
+4.times { |i| DispatchTickLoopJob.perform_later("events", "events-shard-#{i}") }
+```
+
+The generated `DispatchTickLoopJob` template uses
+`queue_as { arguments[1] }` so each tick is enqueued on the same
+queue it monitors. Workers listening on `events-shard-*` queues run
+both the tick loops and the admitted jobs from one pool per shard.
 
-
-`
+The gem's automatic context enrichment puts `:queue_name` into the
+ctx hash so `shard_by` can use it directly without your `context`
+proc having to know about it.
 
-
-
-
-- Returning `nil` from the lambda → no dedup for that enqueue.
+**`shard_by` must be ≥ as coarse as the most restrictive throttle's
+scope.** If not, the bucket duplicates across shards and the
+effective rate becomes `rate × N_shards`.
 
-
-`"event:abc123"`). Keep it stable for the duration of a logical unit
-of work.
+## Atomic admission
 
-
+`Forwarder.dispatch` runs inside the per-partition admission
+transaction. The adapter (good_job, solid_queue) uses
+`ActiveRecord::Base.connection`, so its `INSERT INTO good_jobs`
+joins the same TX as the `DELETE FROM staged_jobs` and the `INSERT
+INTO inflight_jobs`. Any exception (deserialize, adapter error,
+network) rolls everything back atomically — no window where staged
+is gone but the adapter never received the job.
 
-
-
-
-
+The trade-off: the gem requires a PG-backed adapter for
+at-least-once. The railtie warns at boot if the adapter doesn't
+look PG-shared (Sidekiq, Resque, async, …) but doesn't hard-fail —
+a custom PG-backed adapter we don't recognise can still work.
+
+For Rails multi-DB (e.g. solid_queue on a separate `:queue` role):
 
 ```ruby
-
-
-round_robin_by ->(args) { args.first.account_id }
+DispatchPolicy.configure do |c|
+  c.database_role = :queue
 end
 ```
 
-
-`
-
+`Repository.with_connection` wraps the admission TX in
+`connected_to(role:)` when set. Staging tables and the adapter's
+table must live in the same DB for atomicity to hold.
 
-
-   Guarantees each active tenant gets at least `quantum` rows per
-   tick, so a tenant with 10 pending is served in the same tick as
-   a tenant with 50k pending.
-2. **Top-up** — if the fairness floor doesn't fill `batch_size`, the
-   remaining slots go to the oldest pending (excluding the ids
-   already locked). Keeps single-tenant throughput at full capacity.
-
-Cost per tick is O(`quantum × active_keys`), not O(backlog) — so the
-admin stays snappy even with thousands of distinct tenants.
-
-### Time-weighted variant
+## Running the tick
 
-
-
-
-
-
+`DispatchPolicy::TickLoop.run(policy_name:, shard:, stop_when:)` is
+the entry point. It claims partitions under `FOR UPDATE SKIP
+LOCKED`, evaluates gates, atomically admits, and updates partition
+state. The install generator scaffolds a `DispatchTickLoopJob` you
+schedule like any other ActiveJob:
 
 ```ruby
-
+DispatchTickLoopJob.perform_later                          # all policies
+DispatchTickLoopJob.perform_later("endpoints")             # one policy
+DispatchTickLoopJob.perform_later("endpoints", "shard-2")
 ```
 
-
-
-
-
-sourced from `dispatch_policy_partition_observations`. So if `slow`
-has burned 20 s of perform time recently and `fast` has burned 200 ms,
-this tick `fast` claims ~99% of `batch_size` while `slow` gets the
-floor — total compute per minute stays balanced and you don't need a
-throttle on top.
+Each job uses `good_job_control_concurrency_with` (or solid_queue's
+`limits_concurrency`) so only one tick is active per
+(policy, shard) combination at a time. The job re-enqueues itself
+with a 1-second tail wait, so the loop survives normal restarts.
 
-##
## Admin UI

Mount the engine and visit `/dispatch_policy`:

- **Dashboard** — totals, throughput windows, round-trip stats,
  capacity gauges (admit rate vs adapter ceiling, avg tick vs
  `tick_max_duration`), a pending trend with an up/down arrow, and
  auto-hints ("avg tick at 88% of tick_max_duration — shard or lower
  admission_batch_size").
- **Policies** — per-policy throughput, a denial-reasons breakdown,
  top partitions by lifetime/pending, pause/resume/drain.
- **Partitions** — searchable list; detail view with gate state,
  decayed_admits plus an admits/min estimate, recent staged jobs,
  force-admit, drain.
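The `decayed_admits` figure is an exponentially decayed counter: each
admit initially counts as 1 and halves in weight every half-life. A
standalone sketch of the idea (the half-life value and the names are
illustrative, not the gem's internals):

```ruby
# Exponentially decayed counter: an admit recorded `t` seconds ago
# contributes 0.5 ** (t / half_life) to the current value.
class DecayedCounter
  def initialize(half_life: 60.0)
    @half_life = half_life
    @value = 0.0
    @updated_at = nil
  end

  # Decay the stored value to `now`, then count one admit.
  def record(now)
    decay_to(now)
    @value += 1.0
  end

  def value_at(now)
    decay_to(now)
    @value
  end

  private

  def decay_to(now)
    @value *= 0.5**((now - @updated_at) / @half_life) if @updated_at
    @updated_at = now
  end
end

counter = DecayedCounter.new(half_life: 60.0)
counter.record(0.0)
counter.record(0.0)
counter.value_at(60.0) # => 1.0 (two admits, one half-life later)
```

An admits/min estimate can then be derived from the decayed value and
the window the half-life implies.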
The UI auto-refreshes via Turbo morph, with a refresh-interval picker
(off / 2s / 5s / 10s) stored in `sessionStorage`; it preserves scroll
position and skips a refresh while a previous Turbo visit is in
flight, so a slow page doesn't stack visits.
CSRF and forgery protection use the host app's settings. The UI ships
unauthenticated; wrap the `mount` in a constraint or add a
`before_action` for auth in production.
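For example, a lambda routing constraint can gate the engine behind an
admin check (a sketch; `DispatchPolicy::Engine` follows the usual
Rails engine naming, and the session check stands in for your real
auth):

```ruby
# config/routes.rb
Rails.application.routes.draw do
  # Only requests passing the check can reach the dashboard.
  admin_check = ->(request) { request.session[:admin] == true }

  constraints admin_check do
    mount DispatchPolicy::Engine, at: "/dispatch_policy"
  end
end
```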
## Configuration

```ruby
# config/initializers/dispatch_policy.rb
DispatchPolicy.configure do |c|
  c.tick_max_duration = 25            # seconds the tick job stays admitting
  c.partition_batch_size = 50         # partitions claimed per tick iteration
  c.admission_batch_size = 100        # max jobs admitted per partition per iteration
  c.idle_pause = 0.5                  # seconds slept when a tick admits nothing
  c.partition_inactive_after = 86_400 # GC partitions idle this long
  c.inflight_stale_after = 300        # GC inflight rows whose worker stopped heartbeating
  c.inflight_heartbeat_interval = 30  # how often the worker bumps heartbeat_at
  c.sweep_every_ticks = 50            # sweeper cadence (in tick iterations)
  c.metrics_retention = 86_400        # tick_samples kept this long
  c.fairness_half_life_seconds = 60   # EWMA half-life for in-tick reorder; nil disables
  c.tick_admission_budget = nil       # global cap on admissions per tick; nil = none
  c.adapter_throughput_target = nil   # jobs/sec; UI shows admit rate as % of this
  c.database_role = nil               # AR role for the admission TX (multi-DB)
end
```
You can override `admission_batch_size`, `fairness_half_life_seconds`,
and `tick_admission_budget` per policy via the DSL.
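A sketch of such an override (hypothetical values; the DSL method
names are assumed to mirror the config keys above):

```ruby
dispatch_policy :endpoints do
  # Per-policy overrides of the global configuration.
  admission_batch_size 25
  fairness_half_life_seconds 30
  tick_admission_budget 500
end
```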
## `partitions.context` is refreshed on every enqueue
When you call `perform_later`, the gem evaluates your `context` proc
and upserts the partition row with the resulting hash:
```sql
INSERT INTO dispatch_policy_partitions (..., context, context_updated_at, ...)
VALUES (...)
ON CONFLICT (policy_name, partition_key) DO UPDATE
  SET context            = EXCLUDED.context,
      context_updated_at = EXCLUDED.context_updated_at,
      pending_count      = dispatch_policy_partitions.pending_count + 1,
      ...
```
Gates evaluate against `partition.context`, **not** the per-job
snapshot in `staged_jobs.context`. So if a tenant bumps their
`dispatch_concurrency` from 5 to 20 and a new job arrives, the next
admission uses the new value — no need to drain the partition first.
If a partition sees no new traffic, its context stays at the value
seen by the last enqueue.
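For instance, a `context` proc can read a per-tenant setting at
enqueue time, so a settings change takes effect with the tenant's next
job. A sketch (the `Account` model, its `dispatch_concurrency` column,
and the `gate :concurrency` option shape are assumptions about your
app, not the gem's documented API):

```ruby
class WebhookJob < ApplicationJob
  dispatch_policy do
    # Evaluated on every enqueue; the result is upserted into
    # partitions.context, so gates see fresh values.
    context ->(args) {
      account = Account.find(args.first[:account_id])
      { account_id: account.id,
        dispatch_concurrency: account.dispatch_concurrency }
    }
    gate :concurrency, max: ->(ctx) { ctx[:dispatch_concurrency] }
  end

  def perform(account_id:)
    # deliver the webhook...
  end
end
```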
## Retry strategies
By default, a retry produced by `retry_on` re-enters the policy and
is staged again, so throttle/concurrency gates apply equally to first
attempts and retries. Use `retry_strategy :bypass` if you want
retries to skip the gem and go straight to the adapter:
```ruby
dispatch_policy :foo do
  partition_by ->(_c) { "k" }
  gate :throttle, rate: 5, per: 60
  retry_strategy :bypass
end
```
## Compatibility
- Rails 7.1+ (developed against 8.1).
- PostgreSQL 12+ (uses `FOR UPDATE SKIP LOCKED`, `JSONB`, `ON CONFLICT`).
- `good_job` ≥ 4.0 or `solid_queue` ≥ 1.0.
- Sidekiq / Resque are NOT supported — the at-least-once guarantee
  needs the adapter to share Postgres with the gem.
## Testing
```bash
bundle exec rake test         # 124 runs / 284 assertions
bundle exec rake bench        # manual benchmark suite (creates dispatch_policy_bench DB)
bundle exec rake bench:real   # end-to-end against good_job on the dummy DB
bundle exec rake bench:limits # stretches every path to its breaking point
```
Integration tests skip when no Postgres is reachable (default DB
`dispatch_policy_test`; override via `DB_NAME`, `DB_HOST`, `DB_USER`,
`DB_PASS`).
## Releasing
Cutting a new version is driven by `bin/release`. Steps:

1. Bump `DispatchPolicy::VERSION` in `lib/dispatch_policy/version.rb`.
2. Add a `## <VERSION>` section to `CHANGELOG.md` describing the
   release. The script extracts that section verbatim as the GitHub
   release notes, so anything missing here will be missing on GitHub.
3. Commit both on `master` and push so `origin/master` matches local.
4. Run the script from the repo root:

   ```bash
   bin/release
   ```

The script:

- Refuses to run unless you are on `master`, the working tree is
  clean, the local branch matches `origin/master`, and the tag
  `v<VERSION>` does not yet exist.
- Asks for a `y` confirmation before doing anything.
- Hands off to `bundle exec rake release`, which builds the gem,
  creates the `v<VERSION>` tag, pushes the tag to GitHub, and pushes
  the gem to RubyGems.org.
- Creates a GitHub release for `v<VERSION>` using the matching
  CHANGELOG section as the body. This requires the `gh` CLI; if it is
  missing, the gem still ships, but you'll need to create the GitHub
  release manually with `gh release create v<VERSION> --notes-file
  CHANGELOG.md`.

Prerequisites: a configured `~/.gem/credentials` for the RubyGems
push and `gh auth login` for the GitHub release.

## Status

Published on RubyGems. The API may still shift between minors until
1.0. The set of features that ship today:

- Gates: `:throttle`, `:concurrency`, `:adaptive_concurrency`.
- Fairness: in-tick EWMA reorder plus an optional
  `tick_admission_budget`.
- Sharding: `shard_by` plus per-shard tick loops.
- Bulk handoff: `ActiveJob.perform_all_later` collapses to one
  adapter `INSERT` per tick when admissible.
- Admin UI with capacity hints, pending trend, denial reasons.
- Manual benchmark suite.

Deferred ideas (with rationale) live in [`IDEAS.md`](IDEAS.md):
`gate :global_cap`, smarter sweeper defaults, `sweep_every_seconds`
instead of `sweep_every_ticks`.
## License