dispatch_policy 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. checksums.yaml +4 -4
  2. data/MIT-LICENSE +16 -17
  3. data/README.md +449 -288
  4. data/app/assets/stylesheets/dispatch_policy/application.css +157 -0
  5. data/app/controllers/dispatch_policy/application_controller.rb +45 -1
  6. data/app/controllers/dispatch_policy/dashboard_controller.rb +91 -0
  7. data/app/controllers/dispatch_policy/partitions_controller.rb +122 -0
  8. data/app/controllers/dispatch_policy/policies_controller.rb +94 -241
  9. data/app/controllers/dispatch_policy/staged_jobs_controller.rb +9 -0
  10. data/app/models/dispatch_policy/adaptive_concurrency_stats.rb +11 -81
  11. data/app/models/dispatch_policy/inflight_job.rb +12 -0
  12. data/app/models/dispatch_policy/partition.rb +21 -0
  13. data/app/models/dispatch_policy/staged_job.rb +4 -97
  14. data/app/models/dispatch_policy/tick_sample.rb +11 -0
  15. data/app/views/dispatch_policy/dashboard/index.html.erb +109 -0
  16. data/app/views/dispatch_policy/partitions/index.html.erb +63 -0
  17. data/app/views/dispatch_policy/partitions/show.html.erb +106 -0
  18. data/app/views/dispatch_policy/policies/index.html.erb +15 -37
  19. data/app/views/dispatch_policy/policies/show.html.erb +140 -216
  20. data/app/views/dispatch_policy/shared/_capacity.html.erb +67 -0
  21. data/app/views/dispatch_policy/shared/_hints.html.erb +13 -0
  22. data/app/views/dispatch_policy/shared/_partition_row.html.erb +12 -0
  23. data/app/views/dispatch_policy/staged_jobs/show.html.erb +31 -0
  24. data/app/views/layouts/dispatch_policy/application.html.erb +95 -238
  25. data/config/routes.rb +18 -2
  26. data/db/migrate/20260501000001_create_dispatch_policy_tables.rb +103 -0
  27. data/lib/dispatch_policy/bypass.rb +23 -0
  28. data/lib/dispatch_policy/config.rb +85 -0
  29. data/lib/dispatch_policy/context.rb +50 -0
  30. data/lib/dispatch_policy/cursor_pagination.rb +121 -0
  31. data/lib/dispatch_policy/decision.rb +22 -0
  32. data/lib/dispatch_policy/engine.rb +4 -27
  33. data/lib/dispatch_policy/forwarder.rb +63 -0
  34. data/lib/dispatch_policy/gate.rb +10 -38
  35. data/lib/dispatch_policy/gates/adaptive_concurrency.rb +99 -97
  36. data/lib/dispatch_policy/gates/concurrency.rb +45 -26
  37. data/lib/dispatch_policy/gates/throttle.rb +65 -37
  38. data/lib/dispatch_policy/inflight_tracker.rb +174 -0
  39. data/lib/dispatch_policy/job_extension.rb +155 -0
  40. data/lib/dispatch_policy/operator_hints.rb +126 -0
  41. data/lib/dispatch_policy/pipeline.rb +48 -0
  42. data/lib/dispatch_policy/policy.rb +62 -47
  43. data/lib/dispatch_policy/policy_dsl.rb +120 -0
  44. data/lib/dispatch_policy/railtie.rb +35 -0
  45. data/lib/dispatch_policy/registry.rb +46 -0
  46. data/lib/dispatch_policy/repository.rb +723 -0
  47. data/lib/dispatch_policy/serializer.rb +36 -0
  48. data/lib/dispatch_policy/tick.rb +263 -172
  49. data/lib/dispatch_policy/tick_loop.rb +59 -26
  50. data/lib/dispatch_policy/version.rb +1 -1
  51. data/lib/dispatch_policy.rb +71 -46
  52. data/lib/generators/dispatch_policy/install/install_generator.rb +70 -0
  53. data/lib/generators/dispatch_policy/install/templates/create_dispatch_policy_tables.rb.tt +95 -0
  54. data/lib/generators/dispatch_policy/install/templates/dispatch_tick_loop_job.rb.tt +53 -0
  55. data/lib/generators/dispatch_policy/install/templates/initializer.rb.tt +11 -0
  56. metadata +101 -43
  57. data/CHANGELOG.md +0 -12
  58. data/app/models/dispatch_policy/partition_inflight_count.rb +0 -42
  59. data/app/models/dispatch_policy/partition_observation.rb +0 -49
  60. data/app/models/dispatch_policy/throttle_bucket.rb +0 -41
  61. data/db/migrate/20260424000001_create_dispatch_policy_tables.rb +0 -80
  62. data/db/migrate/20260424000002_create_adaptive_concurrency_stats.rb +0 -22
  63. data/db/migrate/20260424000003_create_adaptive_concurrency_samples.rb +0 -25
  64. data/db/migrate/20260424000004_rename_samples_to_partition_observations.rb +0 -32
  65. data/lib/dispatch_policy/active_job_perform_all_later_patch.rb +0 -32
  66. data/lib/dispatch_policy/dispatch_context.rb +0 -53
  67. data/lib/dispatch_policy/dispatchable.rb +0 -120
  68. data/lib/dispatch_policy/gates/fair_interleave.rb +0 -32
  69. data/lib/dispatch_policy/gates/global_cap.rb +0 -26
  70. data/lib/dispatch_policy/install_generator.rb +0 -23
data/README.md CHANGED
@@ -1,434 +1,595 @@
1
1
  # DispatchPolicy
2
2
 
3
- > **⚠️ Experimental.** The API, schema, and defaults can change between
4
- > minor releases without notice. DispatchPolicy is currently running in
5
- > production on [pulso.run](https://pulso.run); that's how we learn
6
- > what breaks. If you pick it up for your own project, pin the exact
7
- > version and expect to follow the changelog.
3
+ > **⚠️ Experimental v2 branch.** This is the `v2` branch of
4
+ > [ceritium/dispatch_policy](https://github.com/ceritium/dispatch_policy),
5
+ > an alternative cut: TX-atomic admission, in-tick fairness as a
6
+ > layer (not a gate), and a single canonical partition scope per
7
+ > policy. API, schema, and defaults can change between any two
8
+ > commits. The `master` branch of the same repo is the original
9
+ > design and is what the published gem (when one ships) tracks.
8
10
  >
9
- > **PostgreSQL only (11+).** The staging, admission, and fairness
10
- > machinery lean on `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`,
11
- > `ON CONFLICT`, and `CROSS JOIN LATERAL`. MySQL/SQLite support isn't
12
- > closed off as a goal; being drop-in across every ActiveJob backend
13
- > is the long-term direction — but it would take meaningful rework
14
- > (shadow columns for `jsonb`, full indexes instead of partial, a
15
- > different batch-fetch strategy for fairness). Contributions welcome.
11
+ > **PostgreSQL only.** Staging, admission, and adaptive stats lean on
12
+ > `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`, `ON CONFLICT`,
13
+ > and the adapter sharing `ActiveRecord::Base.connection` so the
14
+ > admit + adapter INSERT can join one transaction. Tested against
15
+ > good_job and solid_queue.
16
16
 
17
17
  Per-partition admission control for ActiveJob. Stages `perform_later`
18
18
  into a dedicated table, runs a tick loop that admits jobs through
19
- declared gates (throttle, concurrency, global_cap, fair_interleave,
20
- adaptive_concurrency), then forwards survivors to the real adapter.
19
+ declared gates (`throttle`, `concurrency`, `adaptive_concurrency`),
20
+ then forwards survivors to the real adapter. The admission and the
21
+ adapter INSERT happen inside one Postgres transaction, so a worker
22
+ crash mid-tick can't lose a job.
21
23
 
22
24
  Use it when you need:
23
25
 
24
- - **Per-tenant / per-endpoint throttle** that's exact (token bucket)
25
- instead of best-effort enqueue-side.
26
- - **Per-partition concurrency** with a proper release hook on job
27
- completion (and lease-expiry recovery if the worker dies mid-perform).
26
+ - **Per-tenant / per-endpoint throttle**: a token bucket per partition,
27
+ refreshed lazily on read.
28
+ - **Per-partition concurrency**: a fixed cap on in-flight jobs with a
29
+ release hook on completion and a heartbeat-based reaper for crashes.
28
30
  - **Adaptive concurrency** — a cap that shrinks under queue pressure
29
- and grows back when workers keep up, without manual tuning.
30
- - **Dedupe** against a partial unique index, not an in-memory key.
31
- - **Round-robin fairness across tenants** (LATERAL batch fetch) so one
32
- tenant's burst can't starve the others.
31
+ and grows back when workers keep up, no manual tuning per tenant.
32
+ - **In-tick fairness**: within a single tick, partitions are reordered
33
+ by recent activity (EWMA) and an optional global cap is shared
34
+ fairly across them, so one tenant's burst can't starve the others.
35
+ - **Sharding** — split a policy across N queues so independent tick
36
+ workers admit in parallel.
37
+
38
+ ## Demo
39
+
40
+ The demo lives in `test/dummy/` — a tiny Rails app inside this repo.
41
+ Run it locally to play with every gate and the admin UI:
42
+
43
+ ```bash
44
+ bin/dummy setup good_job # creates the DB and migrates
45
+ DUMMY_ADAPTER=good_job bundle exec foreman start
46
+ ```
47
+
48
+ Then open:
49
+
50
+ - `http://localhost:3000/` — playground with one card per job and a
51
+ storm form that exercises the adaptive cap and fairness reorder
52
+ across many tenants.
53
+ - `http://localhost:3000/dispatch_policy` — admin UI: live throughput,
54
+ partition state, denial reasons, capacity hints.
55
+
56
+ The dummy ships ten purpose-built jobs covering throttle, concurrency,
57
+ mixed gates, scheduling, retries, stress tests, sharding, fairness, and
58
+ adaptive concurrency. See `test/dummy/app/jobs/`.
33
59
 
34
60
  ## Install
35
61
 
36
62
  Add to your `Gemfile`:
37
63
 
38
64
  ```ruby
39
- gem "dispatch_policy"
65
+ gem "dispatch_policy",
66
+ git: "https://github.com/ceritium/dispatch_policy",
67
+ branch: "v2"
40
68
  ```
41
69
 
42
- Copy the migration and run it:
70
+ Generate the install bundle (migration + initializer + tick loop job):
43
71
 
44
- ```
45
- bundle exec rails dispatch_policy:install:migrations
46
- bundle exec rails db:migrate
72
+ ```bash
73
+ bin/rails generate dispatch_policy:install
74
+ bin/rails db:migrate
47
75
  ```
48
76
 
49
- Mount the admin UI in `config/routes.rb` (optional):
77
+ Mount the admin UI (optional but recommended):
50
78
 
51
79
  ```ruby
52
- mount DispatchPolicy::Engine => "/admin/dispatch_policy"
80
+ mount DispatchPolicy::Engine, at: "/dispatch_policy"
53
81
  ```
54
82
 
55
- Configure in `config/initializers/dispatch_policy.rb`:
83
+ Then schedule the tick loop. The generator wrote a
84
+ `DispatchTickLoopJob` in `app/jobs/`; kick it off once and it
85
+ re-enqueues itself:
56
86
 
57
87
  ```ruby
58
- DispatchPolicy.configure do |c|
59
- c.enabled = ENV.fetch("DISPATCH_POLICY_ENABLED", "true") != "false"
60
- c.lease_duration = 15.minutes
61
- c.batch_size = 500
62
- c.round_robin_quantum = 50
63
- c.tick_sleep = 1 # idle
64
- c.tick_sleep_busy = 0.05 # after productive ticks
65
- end
88
+ DispatchTickLoopJob.perform_later
66
89
  ```
67
90
 
68
91
  ## Flow
69
92
 
70
93
  ```
71
94
  ActiveJob#perform_later
72
- Dispatchable#enqueue
73
- StagedJob.stage! (insert into dispatch_policy_staged_jobs, pending)
95
+ JobExtension.around_enqueue_for
96
+ Repository.stage! (INSERT staged + UPSERT partition; ctx refreshed)
74
97
 
75
98
  (tick loop, periodically)
76
- SELECT pending FOR UPDATE SKIP LOCKED
77
- Run gates in declared order; survivors are the admitted set
78
- StagedJob#mark_admitted! (increment counters, set admitted_at)
79
- job.enqueue(_bypass_staging: true) (hand off to the real adapter)
99
+ claim_partitions (FOR UPDATE SKIP LOCKED, ordered by last_checked_at)
100
+ reorder by decayed_admits ASC (in-tick fairness)
101
+ for each: pipeline.call(ctx, partition, fair_share)
102
+ gates evaluate; admit_count = min(allowed)
103
+ → ONE TX: claim_staged_jobs! + insert_inflight! + Forwarder.dispatch
104
+ (the adapter INSERT shares the TX; rollback if anything raises)
105
+ → bulk-flush deny-state in one UPDATE ... FROM (VALUES ...)
80
106
 
81
107
  (worker runs perform)
82
- Dispatchable#around_perform
108
+ InflightTracker.track (around_perform)
109
+ → INSERT inflight_jobs ON CONFLICT DO NOTHING
110
+ → spawn heartbeat thread
83
111
  → block.call
84
- release counters, mark StagedJob completed_at, record observation
112
+ record_observation on adaptive gates (queue_lag AIMD update)
113
+ → DELETE inflight_jobs
85
114
  ```
86
115
 
87
116
  ## Declaring a policy
88
117
 
89
118
  ```ruby
90
- class SendWebhookJob < ApplicationJob
91
- include DispatchPolicy::Dispatchable
119
+ class FetchEndpointJob < ApplicationJob
120
+ dispatch_policy_inflight_tracking # only required if a concurrency gate is used
92
121
 
93
- dispatch_policy do
94
- # Persisted in the staged row so gates can read it without touching AR.
122
+ dispatch_policy :endpoints do
95
123
  context ->(args) {
96
124
  event = args.first
97
- { endpoint_id: event.endpoint_id, rate_limit: event.endpoint.rate_limit }
125
+ {
126
+ endpoint_id: event.endpoint_id,
127
+ rate_limit: event.endpoint.rate_limit,
128
+ max_per_account: event.account.dispatch_concurrency
129
+ }
98
130
  }
99
131
 
100
- # Partial unique index dedupes identical keys while the previous is pending.
101
- dedupe_key ->(args) { "event:#{args.first.id}" }
102
-
103
- # Tenant fairness — see the "Round-robin" section below.
104
- round_robin_by ->(args) { args.first.account_id }
132
+ # Required: every gate in the policy enforces against this scope.
133
+ partition_by ->(ctx) { ctx[:endpoint_id] }
105
134
 
106
135
  gate :throttle,
107
- rate: ->(ctx) { ctx[:rate_limit] },
108
- per: 1.minute,
109
- partition_by: ->(ctx) { ctx[:endpoint_id] }
136
+ rate: ->(ctx) { ctx[:rate_limit] },
137
+ per: 1.minute
110
138
 
111
- gate :fair_interleave
139
+ gate :concurrency,
140
+ max: ->(ctx) { ctx[:max_per_account] || 5 }
141
+
142
+ retry_strategy :restage # default; alternative: :bypass
112
143
  end
113
144
 
114
- def perform(event) = event.deliver!
145
+ def perform(event)
146
+ # ... call the rate-limited HTTP endpoint
147
+ end
115
148
  end
116
149
  ```
117
150
 
118
151
  `perform_later` stages the job; the tick admits it when its gates pass.
152
+ With multiple gates the actual `admit_count` per tick comes out as
153
+ `min(allowed)` across all of them.
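+ For example, if the throttle would allow 7 more jobs this tick but the
+ concurrency gate only has 3 free slots, 3 jobs are admitted.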
119
154
 
120
- ## Gates
155
+ ## Choosing the partition scope
121
156
 
122
- Gates run in declared order, each narrowing the survivor set. Any option
123
- that takes a value can alternatively take a lambda that receives the
124
- `ctx` hash, so parameters can depend on per-job data.
157
+ `partition_by` is the most consequential decision in a policy and the
158
+ only required field. It tells the gem **what counts as one logical
159
+ partition**: what scope each gate enforces against, and what the
160
+ in-tick fairness reorder operates over.
125
161
 
126
- ### `:concurrency` in-flight cap per partition
162
+ A policy with `partition_by` and **no gates** is also valid: the
163
+ pipeline passes the full budget through, and the Tick caps it via
164
+ `admission_batch_size` (or `tick_admission_budget` if set). Useful
165
+ for "balance N tenants evenly" without rate-limiting any of them.
127
166
 
128
- Caps the number of admitted-but-not-yet-completed jobs in each
129
- partition. Tracks in-flight counts in
130
- `dispatch_policy_partition_counts`; decremented by the `around_perform`
131
- hook when the job finishes, or by the reaper when a lease expires
132
- (worker crashed).
167
+ If you need genuinely different scopes per gate (throttle by endpoint
168
+ AND concurrency by account, each enforced at its own scope), **split
169
+ into two policies** and chain them: the staging policy admits, its
170
+ worker enqueues into the second.
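+
+ A rough sketch of that chaining, with hypothetical job and policy names
+ (only DSL calls shown elsewhere in this README are used):
+
+ ```ruby
+ class StageEndpointFetchJob < ApplicationJob
+   dispatch_policy :endpoints do
+     context ->(args) { { endpoint_id: args.first.endpoint_id } }
+     partition_by ->(ctx) { ctx[:endpoint_id] }
+     gate :throttle, rate: 100, per: 60
+   end
+
+   # Once this job is admitted and runs, it enqueues into the second policy.
+   def perform(event) = AccountDeliverJob.perform_later(event)
+ end
+
+ class AccountDeliverJob < ApplicationJob
+   dispatch_policy_inflight_tracking
+
+   dispatch_policy :accounts do
+     context ->(args) { { account_id: args.first.account_id } }
+     partition_by ->(ctx) { ctx[:account_id] }
+     gate :concurrency, max: 5
+   end
+
+   def perform(event) = event.deliver!
+ end
+ ```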
133
171
 
134
- ```ruby
135
- gate :concurrency,
136
- max: ->(ctx) { ctx[:max_per_account] || 5 },
137
- partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
138
- ```
172
+ ## Gates
139
173
 
140
- When to reach for it: external APIs with per-tenant concurrency limits,
141
- database-heavy jobs you don't want to pile up per customer, anything
142
- where "at most N running at once for this key" matters.
174
+ Gates run in declared order; each narrows the survivor count. Every
175
+ option that takes a value can alternatively take a lambda receiving
176
+ the `ctx` hash, so parameters can depend on per-job data.
143
177
 
144
178
  ### `:throttle` — token-bucket rate limit per partition
145
179
 
146
- Refills `rate` tokens every `per` seconds, capped at `burst` (defaults
147
- to `rate`). Admits jobs while tokens are available; leaves the rest
148
- pending for the next tick.
180
+ Refills `rate` tokens every `per` seconds, capped at `rate` (no
181
+ separate burst). Admits jobs while tokens are available; leaves the
182
+ rest pending for the next tick. State is persisted in
183
+ `partitions.gate_state.throttle`.
149
184
 
150
185
  ```ruby
151
186
  gate :throttle,
152
- rate: 100, # tokens
153
- per: 1.minute, # refill window
154
- burst: 100, # bucket cap (optional, defaults to rate)
155
- partition_by: ->(ctx) { "host:#{ctx[:host]}" }
187
+ rate: ->(ctx) { ctx[:rate_limit] },
188
+ per: 1.minute
156
189
  ```
157
190
 
158
- `rate` and `burst` accept lambdas, so the limit can come from
159
- configuration stored alongside the thing being rate-limited:
191
+ Throttle does **not** release tokens on completion; tokens refill
192
+ only with elapsed time.
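+
+ As a sketch, the lazy refill at read time amounts to the following
+ (names illustrative only; the real gate lives in
+ `lib/dispatch_policy/gates/throttle.rb`):
+
+ ```ruby
+ def throttle_allowance(state, rate:, per:)
+   elapsed = Time.current - state[:refilled_at]
+   tokens  = [state[:tokens] + elapsed * (rate / per.to_f), rate].min  # capped at rate
+   tokens.floor  # this gate's contribution to min(allowed)
+ end
+ ```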
193
+
194
+ ### `:concurrency` — in-flight cap per partition
195
+
196
+ Caps the number of admitted-but-not-yet-completed jobs per partition.
197
+ Counts rows in `dispatch_policy_inflight_jobs` keyed by the policy's
198
+ canonical partition. Decremented by `InflightTracker.track`'s
199
+ `around_perform`; reaped by a periodic sweeper if a worker crashes.
160
200
 
161
201
  ```ruby
162
- gate :throttle,
163
- rate: ->(ctx) { ctx[:rate_limit] },
164
- per: 1.minute,
165
- partition_by: ->(ctx) { ctx[:endpoint_id] }
202
+ gate :concurrency,
203
+ max: ->(ctx) { ctx[:max_per_account] || 5 }
166
204
  ```
167
205
 
168
- Unlike `:concurrency`, throttle does **not** release tokens on job
169
- completion; tokens refill only with elapsed time.
206
+ When the cap is full, the gate returns `retry_after = full_backoff`
207
+ (default 1s) so the partition skips the next ticks instead of
208
+ hammering `count(*)` every iteration.
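+
+ Conceptually the gate's check is just (an illustrative helper, not the
+ gem's API):
+
+ ```ruby
+ def concurrency_allowance(policy_name, partition_key, max:, full_backoff: 1.0)
+   in_flight = DispatchPolicy::InflightJob.where(policy_name: policy_name,
+                                                 partition_key: partition_key).count
+   remaining = max - in_flight
+   remaining.positive? ? { allow: remaining } : { allow: 0, retry_after: full_backoff }
+ end
+ ```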
170
209
 
171
- ### `:global_cap` — single cap across all partitions
210
+ ### `:adaptive_concurrency` — per-partition cap that self-tunes
172
211
 
173
- A global version of `:concurrency`: at most `max` jobs admitted
174
- simultaneously across the whole policy, regardless of partition.
175
- Useful as a safety ceiling on top of per-partition limits.
212
+ Like `:concurrency` but the cap (`current_max`) shrinks when the
213
+ adapter queue backs up and grows when workers drain it quickly.
214
+ AIMD loop on a per-partition stats row in
215
+ `dispatch_policy_adaptive_concurrency_stats`.
176
216
 
177
217
  ```ruby
178
- gate :concurrency, max: 10, partition_by: ->(ctx) { ctx[:tenant] }
179
- gate :global_cap, max: 200
218
+ gate :adaptive_concurrency,
219
+ initial_max: 3,
220
+ target_lag_ms: 1000, # acceptable queue wait before backoff
221
+ min: 1 # floor; a partition can't be locked out
180
222
  ```
181
223
 
182
- Reads: "up to 10 in flight per tenant, but never more than 200 total".
224
+ - **Feedback signal**: `admitted_at → perform_start` (queue wait in
225
+ the real adapter). Pure saturation signal — slow performs in the
226
+ downstream service don't punish admissions if workers still drain
227
+ the queue quickly.
228
+ - **Growth**: `current_max += 1` per fast success.
229
+ - **Slow shrink**: `current_max *= 0.95` when EWMA lag > target.
230
+ - **Failure shrink**: `current_max *= 0.5` when `perform` raises.
231
+ - **Safety valve**: when `in_flight == 0` the gate floors `remaining`
232
+ at `initial_max` so a partition that AIMD shrunk to `min` during
233
+ a past burst can re-grow when it idles.
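+
+ Put together, one observation updates the stats row roughly like this
+ (a sketch: the field names, smoothing factor, and method shape are
+ assumptions; see `lib/dispatch_policy/gates/adaptive_concurrency.rb`):
+
+ ```ruby
+ def record_observation(stats, lag_ms:, failed:, target_lag_ms:, min: 1)
+   stats.ewma_lag_ms = 0.8 * stats.ewma_lag_ms + 0.2 * lag_ms  # smoothing factor assumed
+   stats.current_max =
+     if failed
+       [stats.current_max * 0.5, min].max       # failure shrink
+     elsif stats.ewma_lag_ms > target_lag_ms
+       [stats.current_max * 0.95, min].max      # slow shrink
+     else
+       stats.current_max + 1                    # additive growth
+     end
+ end
+ ```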
183
234
 
184
- ### `:fair_interleave` — round-robin ordering across partitions
235
+ #### Choosing `target_lag_ms`
185
236
 
186
- Not a filter but a reordering step. Groups the batch by its primary
187
- partition and interleaves, so no single partition can starve others
188
- even if it has many pending jobs.
237
+ It's the knob that trades latency for throughput. Rough guide:
189
238
 
190
- ```ruby
191
- gate :concurrency, max: 10, partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
192
- gate :fair_interleave
193
- ```
239
+ - **Too low** (10–50 ms): the gate reacts to every tiny bump in
240
+ queue wait and shrinks aggressively. Workers idle while jobs sit
241
+ pending — overshoot.
242
+ - **Too high** (30 s+): the gate barely pushes back; throughput is
243
+ near-max but new admissions wait seconds before a worker picks
244
+ them up.
245
+ - **Reasonable starting point**: `≈ worker_threads × avg_perform_ms`.
246
+ E.g. 5 workers × 200 ms perform = 1000 ms means "queue depth up
247
+ to ~1 s is fine".
194
248
 
195
- Place it after a gate that assigned partitions; interleaving is keyed
196
- off the first partition a row picked up.
249
+ ## Fairness within a tick
197
250
 
198
- ### `:adaptive_concurrency` per-partition cap that self-tunes
251
+ When several partitions compete for admission inside the same tick,
252
+ the gem reorders them by **least-recently-active first** so a hot
253
+ partition with thousands of pending jobs cannot starve a cold one
254
+ that just woke up.
199
255
 
200
- The cap per partition (`current_max`) shrinks when the adapter queue
201
- backs up (EWMA of queue lag > `target_lag_ms`) or when performs raise;
202
- grows back by +1 when lag stays under target. AIMD loop on a
203
- per-partition stats row (`dispatch_policy_adaptive_concurrency_stats`).
256
+ The mechanism has two knobs: an EWMA half-life (controls *how* the
257
+ order is decided) and an optional global tick cap (controls *how
258
+ much* each partition is allowed in one tick).
204
259
 
205
- ```ruby
206
- gate :adaptive_concurrency,
207
- partition_by: ->(ctx) { ctx[:account_id] },
208
- initial_max: 3,
209
- target_lag_ms: 1000, # acceptable queue wait before admission
210
- min: 1 # floor so a partition can't lock out
211
- end
260
+ ### `fairness half_life:`
261
+
262
+ Each partition keeps `decayed_admits` and `decayed_admits_at`,
263
+ updated atomically inside the admit transaction:
264
+
265
+ ```
266
+ decayed_admits := decayed_admits * exp(-Δt / τ) + admitted
267
+ where τ = half_life / ln(2)
212
268
  ```
213
269
 
214
- - **Feedback signal**: `admitted_at → perform_start` (queue wait in the
215
- real adapter). Pure saturation signal; slow performs in the
216
- downstream service don't punish admissions if workers still drain
217
- the queue quickly.
218
- - **Growth**: +1 per fast success. No hard ceiling; the algorithm
219
- self-limits via `target_lag_ms`. If the queue builds up, the cap
220
- shrinks multiplicatively.
221
- - **Failure**: `current_max *= 0.5` (halve) when `perform` raises.
222
- - **Slow**: `current_max *= 0.95` when EWMA lag > target.
270
+ After `half_life` seconds without admitting, the value halves. The
271
+ Tick sorts the claimed batch by current `decayed_admits` ASC, so the
272
+ under-admitted go first.
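+
+ In Ruby terms, the in-TX update is roughly (column names as above;
+ `half_life` and `admitted` come from the surrounding tick, the rest is
+ assumed):
+
+ ```ruby
+ tau = half_life / Math.log(2)
+ dt  = Time.current - partition.decayed_admits_at
+ partition.update!(
+   decayed_admits:    partition.decayed_admits * Math.exp(-dt / tau) + admitted,
+   decayed_admits_at: Time.current
+ )
+ ```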
223
273
 
224
- ### Choosing `target_lag_ms`
274
+ | Value | Behaviour |
275
+ |-----------|------------------------------------------------------------------------------|
276
+ | 5–10 s | Reacts to brief pauses. Bursty workloads where short stalls deserve a head start. |
277
+ | **60 s** (default) | Stable steady-state. Hot partitions stay "hot" through normal latency variation. |
278
+ | 5–15 min | Long memory. Burst on partition A penalises A for many minutes. |
225
279
 
226
- It's the knob that trades latency for throughput. Rough guide:
280
+ Set `c.fairness_half_life_seconds = nil` to disable the reorder
281
+ entirely — partitions are processed in `claim_partitions` order
282
+ (last-checked-first).
227
283
 
228
- - **Too low** (e.g. 10-50 ms). The gate reacts to every tiny bump in
229
- queue wait and shrinks the cap aggressively. Workers can end up
230
- idle with jobs still pending admission because the cap is
231
- overcorrecting: classic contention / overshoot.
232
- - **Too high** (e.g. 30 s). The gate barely ever pushes back, so
233
- you get near-maximum throughput at the cost of real queue buildup;
234
- newly admitted jobs may wait seconds before a worker picks them
235
- up.
236
- - **Reasonable starting point**: `≈ worker_max_threads × avg_perform_ms`.
237
- If you run 5 workers at ~200 ms/perform, `target_lag_ms: 1000`
238
- means "it's OK if the adapter queue stays at most ~1 second
239
- deep". You'll want to tune from there based on what your
240
- downstream tolerates and how fast you want bursts to drain.
241
-
242
- Pair it with `round_robin_by` for multi-tenant systems that want
243
- automatic backpressure without hand-tuned caps per tenant:
284
+ ### `tick_admission_budget`
285
+
286
+ Without this, each partition admits up to `admission_batch_size`.
287
+ With it set, the per-partition ceiling becomes `fair_share = ceil(cap
288
+ / claimed_partitions)`. Pass-1 walks the (decay-sorted) partitions
289
+ giving each up to `fair_share`; pass-2 redistributes any leftover to
290
+ those that filled their share.
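+
+ For example, with `tick_admission_budget 200` and 5 claimed partitions,
+ `fair_share = ceil(200 / 5) = 40`; if two partitions only have 10
+ pending each, the 60 unused slots are re-offered in pass-2 to the
+ partitions that used their full 40.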
244
291
 
245
292
  ```ruby
246
- round_robin_by ->(args) { args.first[:account_id] }
247
- gate :adaptive_concurrency,
248
- partition_by: ->(ctx) { ctx[:account_id] },
249
- initial_max: 3,
250
- target_lag_ms: 1000
293
+ DispatchPolicy.configure do |c|
294
+ c.fairness_half_life_seconds = 60
295
+ c.tick_admission_budget = nil # default — no global cap
296
+ end
297
+
298
+ # Per-policy override:
299
+ dispatch_policy :endpoints do
300
+ partition_by ->(c) { c[:endpoint_id] }
301
+ fairness half_life: 30.seconds
302
+ tick_admission_budget 200
303
+ gate :throttle, rate: 100, per: 60
304
+ end
251
305
  ```
252
306
 
253
- ## Queues and partitioning
307
+ When the cap is hit before all partitions admit, the rest are denied
308
+ with reason `tick_cap_exhausted`. They were still observed
309
+ (`last_checked_at` bumped), so they're at the front of the next
310
+ tick's order.
311
+
312
+ ### Anti-stagnation
254
313
 
255
- DispatchPolicy operates at the **policy** (class) level. A job's
256
- ActiveJob `queue` and `priority` travel through staging into admission
257
- and on to the real adapter (workers of each queue pick up their jobs
- normally), but neither affects which staged rows the gates see. All
259
- enqueues of the same job class share one policy, one throttle bucket,
260
- one concurrency cap.
314
+ The decay-based reorder only applies to partitions already claimed.
315
+ Selection (`Repository.claim_partitions`) still orders by
316
+ `last_checked_at NULLS FIRST, id`. Every active partition with
317
+ pending jobs is visited in at most ⌈N / partition_batch_size⌉ ticks
318
+ regardless of how hot or cold it is.
261
319
 
262
- Two consequences to be aware of:
320
+ ### Mixing `:adaptive_concurrency` with fairness
263
321
 
264
- - Enqueuing the same job to different queues does **not** give one
265
- queue priority at admission; they share the policy's gates. If
266
- urgent work should jump ahead, set a lower ActiveJob `priority`
267
- (the admission SELECT is `ORDER BY priority, staged_at`) — or split
268
- into a subclass with its own policy.
269
- - `dedupe_key` is queue-agnostic: the same key enqueued to
270
- `:urgent` and `:low` dedupes to one row.
322
+ Adaptive and fairness operate at different layers and compose
323
+ without sharing state:
271
324
 
272
- ### Using queue as a partition
325
+ - **Fairness** writes `partitions.decayed_admits` inside the
326
+ per-partition admit TX.
327
+ - **Adaptive** writes `dispatch_policy_adaptive_concurrency_stats`
328
+ from the worker's `around_perform` via `record_observation`.
273
329
 
274
- The context hash has `queue_name` and `priority` injected automatically
275
- at stage time (user-supplied keys win). Use them in any `partition_by`:
330
+ Different tables, different locks. Each tick the actual admit_count
331
+ becomes `min(fair_share, current_max - in_flight)` (with the
332
+ adaptive safety valve when `in_flight == 0`). Fairness picks order +
333
+ budget per tick; adaptive shapes how aggressively each partition
334
+ consumes its share.
276
335
 
277
336
  ```ruby
278
- class SendEmailJob < ApplicationJob
279
- include DispatchPolicy::Dispatchable
337
+ dispatch_policy :tenants do
338
+ partition_by ->(c) { c[:tenant] }
280
339
 
281
- dispatch_policy do
282
- context ->(args) { { account_id: args.first.account_id } }
340
+ gate :adaptive_concurrency,
341
+ initial_max: 5,
342
+ target_lag_ms: 1000,
343
+ min: 1
283
344
 
284
- # Separate throttle bucket per (queue, account) — urgent and default
285
- # don't share rate tokens.
286
- gate :throttle,
287
- rate: 100,
288
- per: 1.minute,
289
- partition_by: ->(ctx) { "#{ctx[:queue_name]}:#{ctx[:account_id]}" }
290
- end
345
+ fairness half_life: 30.seconds
346
+ tick_admission_budget 60
291
347
  end
292
-
293
- SendEmailJob.set(queue: :urgent).perform_later(user)
294
- SendEmailJob.set(queue: :default).perform_later(user)
295
- # → two partitions, each with its own bucket.
296
348
  ```
297
349
 
298
- If you'd rather keep the two streams fully isolated (separate policies,
299
- admin rows, and dedupe scopes), subclass:
350
+ The dummy `AdaptiveDemoJob` declares both; the storm form drives it
351
+ across many tenants with a triangular weight distribution so you can
352
+ watch the EWMA reorder hot tenants AND the AIMD shrink their cap.
353
+ Integration test: `test/integration/adaptive_with_fairness_test.rb`.
354
+
355
+ ## Sharding a policy across worker pools
356
+
357
+ Shards split the tick work horizontally: each tick worker sees only
358
+ the partitions on its own shard, so multiple workers can admit in
359
+ parallel for the same policy. Declare a `shard_by`:
300
360
 
301
361
  ```ruby
302
- class UrgentEmailJob < SendEmailJob
303
- queue_as :urgent
304
- dispatch_policy do
305
- context ->(args) { { account_id: args.first.account_id } }
306
- gate :throttle, rate: 500, per: 1.minute, partition_by: ->(ctx) { ctx[:account_id] }
307
- end
362
+ dispatch_policy :events do
363
+ context ->(args) { { account_id: args.first[:account_id] } }
364
+ partition_by ->(c) { "acct:#{c[:account_id]}" }
365
+ shard_by ->(c) { "events-shard-#{c[:account_id].hash.abs % 4}" }
366
+
367
+ gate :concurrency, max: 50
308
368
  end
309
369
  ```
310
370
 
311
- ## Dedupe
371
+ Run one `DispatchTickLoopJob` per shard:
372
+
373
+ ```ruby
374
+ 4.times { |i| DispatchTickLoopJob.perform_later("events", "events-shard-#{i}") }
375
+ ```
376
+
377
+ The generated `DispatchTickLoopJob` template uses
378
+ `queue_as { arguments[1] }` so each tick is enqueued on the same
379
+ queue it monitors. Workers listening on `events-shard-*` queues run
380
+ both the tick loops and the admitted jobs from one pool per shard.
312
381
 
313
- `dedupe_key` is enforced by a partial unique index on
314
- `(policy_name, dedupe_key) WHERE completed_at IS NULL`. Semantics:
382
+ The gem's automatic context enrichment puts `:queue_name` into the
383
+ ctx hash so `shard_by` can use it directly without your `context`
384
+ proc having to know about it.
315
385
 
316
- - Re-enqueuing while a previous staged row is pending or admitted →
317
- silently dropped.
318
- - Re-enqueuing after the previous completes → fresh staged row.
319
- - Returning `nil` from the lambda → no dedup for that enqueue.
386
+ **`shard_by` must be as coarse as the most restrictive throttle's
387
+ scope.** If not, the bucket duplicates across shards and the
388
+ effective rate becomes `rate × N_shards`.
320
389
 
321
- Typical pattern: `"<domain>:<entity>:<id>"` (`"monitor:42"`,
322
- `"event:abc123"`). Keep it stable for the duration of a logical unit
323
- of work.
390
+ ## Atomic admission
324
391
 
325
- ## Round-robin batching (tenant fairness)
392
+ `Forwarder.dispatch` runs inside the per-partition admission
393
+ transaction. The adapter (good_job, solid_queue) uses
394
+ `ActiveRecord::Base.connection`, so its `INSERT INTO good_jobs`
395
+ joins the same TX as the `DELETE FROM staged_jobs` and the `INSERT
396
+ INTO inflight_jobs`. Any exception (deserialize, adapter error,
397
+ network) rolls everything back atomically — no window where staged
398
+ is gone but the adapter never received the job.
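+
+ The shape of that transaction, as a sketch (method names from the flow
+ diagram above; argument shapes are assumptions):
+
+ ```ruby
+ ActiveRecord::Base.transaction do
+   jobs = Repository.claim_staged_jobs!(partition, limit: admit_count)
+   Repository.insert_inflight!(jobs)
+   Forwarder.dispatch(jobs)  # the adapter INSERT runs on the same connection, same TX
+ end                         # any raise above rolls all three writes back
+ ```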
326
399
 
327
- For policies where every tenant should keep making progress even
328
- when one suddenly enqueues 100× its normal volume, neither throttle
329
- nor concurrency is a good fit; you want max throughput, just
330
- fairness. `round_robin_by` solves it at the batch SELECT layer:
400
+ The trade-off: the gem requires a PG-backed adapter for
401
+ at-least-once. The railtie warns at boot if the adapter doesn't
402
+ look PG-shared (Sidekiq, Resque, async, …) but doesn't hard-fail;
403
+ a custom PG-backed adapter we don't recognise can still work.
404
+
405
+ For Rails multi-DB (e.g. solid_queue on a separate `:queue` role):
331
406
 
332
407
  ```ruby
333
- dispatch_policy do
334
- context ->(args) { { account_id: args.first.account_id } }
335
- round_robin_by ->(args) { args.first.account_id }
408
+ DispatchPolicy.configure do |c|
409
+ c.database_role = :queue
336
410
  end
337
411
  ```
338
412
 
339
- At stage time the lambda's result is written into the dedicated
340
- `round_robin_key` column (indexed). `Tick.run` then uses a two-phase
341
- fetch:
413
+ `Repository.with_connection` wraps the admission TX in
414
+ `connected_to(role:)` when set. Staging tables and the adapter's
415
+ table must live in the same DB for atomicity to hold.
342
416
 
343
- 1. **LATERAL join** — distinct keys × per-key `LIMIT round_robin_quantum`.
344
- Guarantees each active tenant gets at least `quantum` rows per
345
- tick, so a tenant with 10 pending is served in the same tick as
346
- a tenant with 50k pending.
347
- 2. **Top-up** — if the fairness floor doesn't fill `batch_size`, the
348
- remaining slots go to the oldest pending (excluding the ids
349
- already locked). Keeps single-tenant throughput at full capacity.
417
+ ## Running the tick
350
418
 
351
- Cost per tick is O(`quantum × active_keys`), not O(backlog) — so the
352
- admin stays snappy even with thousands of distinct tenants.
419
+ `DispatchPolicy::TickLoop.run(policy_name:, shard:, stop_when:)` is
420
+ the entry point. It claims partitions under `FOR UPDATE SKIP
421
+ LOCKED`, evaluates gates, atomically admits, and updates partition
422
+ state. The install generator scaffolds a `DispatchTickLoopJob` you
423
+ schedule like any other ActiveJob:
353
424
 
354
- ## Running the tick
425
+ ```ruby
426
+ DispatchTickLoopJob.perform_later # all policies
427
+ DispatchTickLoopJob.perform_later("endpoints") # one policy
428
+ DispatchTickLoopJob.perform_later("endpoints", "shard-2")
429
+ ```
430
+
431
+ Each job uses `good_job_control_concurrency_with` (or solid_queue's
432
+ `limits_concurrency`) so only one tick is active per
433
+ (policy, shard) combination at a time. The job re-enqueues itself
434
+ with a 1-second tail wait, so the loop survives normal restarts.
435
+
436
+ ## Admin UI
437
+
438
+ Mount the engine and visit `/dispatch_policy`:
439
+
440
+ - **Dashboard** — totals, throughput windows, round-trip stats,
441
+ capacity gauges (admit rate vs adapter ceiling, avg tick vs
442
+ `tick_max_duration`), pending trend with up/down arrow, auto-hints
443
+ ("avg tick at 88% of tick_max_duration — shard or lower
444
+ admission_batch_size").
445
+ - **Policies** — per-policy throughput, denial reasons breakdown,
446
+ top partitions by lifetime/pending, pause/resume/drain.
447
+ - **Partitions** — searchable list, detail view with gate state,
448
+ decayed_admits + admits/min estimate, recent staged jobs,
449
+ force-admit, drain.
355
450
 
356
- The gem exposes `DispatchPolicy::TickLoop.run(policy_name:, stop_when:)`
357
- but **does not ship a tick job**: concurrency semantics are
358
- queue-adapter specific (GoodJob's `total_limit`, Sidekiq Enterprise
359
- uniqueness, etc.), so you write a small job in your app that wraps
360
- the loop with whatever dedup your adapter provides. Example for
361
- GoodJob:
451
+ The UI auto-refreshes via Turbo morph + a controllable picker
452
+ (off / 2s / 5s / 10s) stored in sessionStorage; preserves scroll
453
+ position; and skips a refresh while a previous Turbo visit is in
454
+ flight so a slow page doesn't stack visits.
455
+
456
+ CSRF and forgery protection use the host app's settings. The UI
457
+ ships unauthenticated; wrap the `mount` with a constraint or
458
+ `before_action` for auth in production.
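+
+ One way to gate it, assuming a Warden/Devise-backed session (adapt to
+ your own auth):
+
+ ```ruby
+ # config/routes.rb
+ constraints ->(request) { request.env["warden"]&.user&.admin? } do
+   mount DispatchPolicy::Engine, at: "/dispatch_policy"
+ end
+ ```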
459
+
460
+ ## Configuration
362
461
 
363
462
  ```ruby
364
- # app/jobs/dispatch_tick_loop_job.rb
365
- class DispatchTickLoopJob < ApplicationJob
366
- include GoodJob::ActiveJobExtensions::Concurrency
367
- good_job_control_concurrency_with(
368
- total_limit: 1,
369
- key: -> { "dispatch_tick_loop:#{arguments.first || 'all'}" }
370
- )
371
-
372
- def perform(policy_name = nil)
373
- deadline = Time.current + DispatchPolicy.config.tick_max_duration
374
- DispatchPolicy::TickLoop.run(
375
- policy_name: policy_name,
376
- stop_when: -> {
377
- GoodJob.current_thread_shutting_down? || Time.current >= deadline
378
- }
379
- )
380
- # Self-chain so the next run starts immediately; cron below is a safety net.
381
- DispatchTickLoopJob.set(wait: 1.second).perform_later(policy_name)
382
- end
463
+ # config/initializers/dispatch_policy.rb
464
+ DispatchPolicy.configure do |c|
465
+ c.tick_max_duration = 25 # seconds the tick job stays admitting
466
+ c.partition_batch_size = 50 # partitions claimed per tick iteration
467
+ c.admission_batch_size = 100 # max jobs admitted per partition per iteration
468
+ c.idle_pause = 0.5 # seconds slept when a tick admits nothing
469
+ c.partition_inactive_after = 86_400 # GC partitions idle this long
470
+ c.inflight_stale_after = 300 # GC inflight rows whose worker stopped heartbeating
471
+ c.inflight_heartbeat_interval = 30 # how often the worker bumps heartbeat_at
472
+ c.sweep_every_ticks = 50 # sweeper cadence (in tick iterations)
473
+ c.metrics_retention = 86_400 # tick_samples kept this long
474
+ c.fairness_half_life_seconds = 60 # EWMA half-life for in-tick reorder; nil disables
475
+ c.tick_admission_budget = nil # global cap on admissions per tick; nil = none
476
+ c.adapter_throughput_target = nil # jobs/sec; UI shows admit rate as % of this
477
+ c.database_role = nil # AR role for the admission TX (multi-DB)
383
478
  end
384
479
  ```
385
480
 
386
- Schedule it (every 10s as a safety net — the self-chain keeps one
387
- alive under normal operation):
481
+ You can override `admission_batch_size`, `fairness_half_life_seconds`,
482
+ and `tick_admission_budget` per policy via the DSL.
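+
+ For instance (the DSL call forms mirror the ones above; values are
+ hypothetical):
+
+ ```ruby
+ dispatch_policy :endpoints do
+   partition_by ->(c) { c[:endpoint_id] }
+   admission_batch_size 25       # this policy admits at most 25 per partition per tick
+   fairness half_life: 2.minutes
+ end
+ ```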
388
483
 
389
- ```ruby
390
- # config/application.rb
391
- config.good_job.cron = {
392
- dispatch_tick_loop: {
393
- cron: "*/10 * * * * *",
394
- class: "DispatchTickLoopJob"
395
- }
396
- }
484
+ ## `partitions.context` is refreshed on every enqueue
485
+
486
+ When you call `perform_later`, the gem evaluates your `context` proc
487
+ and upserts the partition row with the resulting hash:
488
+
489
+ ```sql
490
+ INSERT INTO dispatch_policy_partitions (..., context, context_updated_at, ...) VALUES (...)
491
+ ON CONFLICT (policy_name, partition_key) DO UPDATE
492
+ SET context = EXCLUDED.context,
493
+ context_updated_at = EXCLUDED.context_updated_at,
494
+ pending_count = dispatch_policy_partitions.pending_count + 1,
495
+ ...
397
496
  ```
398
497
 
399
- For adapters without a first-class dedup mechanism, implement it
400
- yourself (e.g. `pg_try_advisory_lock` inside `perform`) before calling
401
- `DispatchPolicy::TickLoop.run`.
498
+ Gates evaluate against `partition.context`, **not** the per-job
499
+ snapshot in `staged_jobs.context`. So if a tenant bumps their
500
+ `dispatch_concurrency` from 5 to 20 and a new job arrives, the next
501
+ admission uses the new value — no need to drain the partition
502
+ first. If a partition has no new traffic, the context stays at the
503
+ value seen by the last enqueue.
402
504
 
403
- ## Admin UI
505
+ ## Retry strategies
404
506
 
405
- `DispatchPolicy::Engine` ships a read-only admin mounted wherever
406
- you like. Features:
407
-
408
- - Policy index with pending / admitted / completed-24h totals.
409
- - Per-policy page with a **partition breakdown** (watched + searchable
410
- list) showing pending-eligible / pending-scheduled / in-flight /
411
- completed / adaptive cap / EWMA latency / last enqueue / last
412
- dispatch per partition.
413
- - Line chart of avg EWMA queue lag (last hour, per minute) with
414
- completions-per-minute bars behind it.
415
- - Per-partition sparkline with the same overlay; click to watch /
416
- unwatch. Watched set is persisted in `localStorage` and synced into
417
- the URL so reloading keeps your view.
418
- - Opt-in auto-refresh (off / 2s / 5s / 15s) stored in `localStorage`.
419
- Page updates via Turbo morph — scroll position and tooltips survive.
507
+ By default a retry produced by `retry_on` re-enters the policy and
508
+ is staged again, so throttle/concurrency apply equally to first
509
+ attempts and retries. Use `retry_strategy :bypass` if you want
510
+ retries to skip the gem and go straight to the adapter:
511
+
512
+ ```ruby
513
+ dispatch_policy :foo do
514
+ partition_by ->(_c) { "k" }
515
+ gate :throttle, rate: 5, per: 60
516
+ retry_strategy :bypass
517
+ end
518
+ ```
519
+
520
+ ## Compatibility
521
+
522
+ - Rails 7.1+ (developed against 8.1).
523
+ - PostgreSQL 12+ (uses `FOR UPDATE SKIP LOCKED`, `JSONB`, `ON CONFLICT`).
524
+ - `good_job` ≥ 4.0 or `solid_queue` ≥ 1.0.
525
+ - Sidekiq / Resque are NOT supported — the at-least-once guarantee
526
+ needs the adapter to share Postgres with the gem.
420
527
 
421
528
  ## Testing
422
529
 
423
- ```
424
- bundle install
425
- bundle exec rake test
530
+ ```bash
531
+ bundle exec rake test # 124 runs / 284 assertions
532
+ bundle exec rake bench # manual benchmark suite (creates dispatch_policy_bench DB)
533
+ bundle exec rake bench:real # end-to-end against good_job on the dummy DB
534
+ bundle exec rake bench:limits # stretches every path to its breaking point
426
535
  ```
427
536
 
428
- Tests require a PostgreSQL instance (uses `ON CONFLICT`, partial
429
- indexes, `FOR UPDATE SKIP LOCKED`, `jsonb`). `PGUSER` / `PGHOST` /
430
- `PGPASSWORD` env vars override the defaults in
431
- `test/dummy/config/database.yml`.
537
+ Integration tests skip when no Postgres is reachable (default DB
538
+ `dispatch_policy_test`; override via `DB_NAME`, `DB_HOST`,
539
+ `DB_USER`, `DB_PASS`).
540
+
541
+ ## Releasing
542
+
543
+ Cutting a new version is driven by `bin/release`. Steps:
544
+
545
+ 1. Bump `DispatchPolicy::VERSION` in
546
+ `lib/dispatch_policy/version.rb`.
547
+ 2. Add a `## <VERSION>` section in `CHANGELOG.md` describing the
548
+ release. The script extracts that section verbatim as the
549
+ GitHub release notes, so anything missing here will be missing
550
+ on GitHub.
551
+ 3. Commit both on `master` and push so `origin/master` matches
552
+ local.
553
+ 4. Run the script from the repo root:
554
+
555
+ ```bash
556
+ bin/release
557
+ ```
558
+
559
+ The script:
560
+
561
+ - Refuses to run unless you are on `master`, the working tree is
562
+ clean, the local branch matches `origin/master`, and the tag
563
+ `v<VERSION>` does not yet exist.
564
+ - Asks for a `y` confirmation before doing anything.
565
+ - Hands off to `bundle exec rake release` (builds the gem, creates
566
+ the `v<VERSION>` tag, pushes the tag to GitHub, pushes the gem to
567
+ RubyGems.org).
568
+ - Creates a GitHub release for `v<VERSION>` using the matching
569
+ CHANGELOG section as the body. Requires the `gh` CLI; if it is
570
+ missing, the gem ships but you'll need to create the GitHub
571
+ release manually with `gh release create v<VERSION> --notes-file
572
+ CHANGELOG.md`.
573
+
574
+ Prerequisites: a configured `~/.gem/credentials` for RubyGems push
575
+ and `gh auth login` for the GitHub release.
576
+
577
+ ## Status
578
+
579
+ Published on RubyGems. API may still shift between minors until
580
+ 1.0. The set of features that ship today:
581
+
582
+ - Gates: `:throttle`, `:concurrency`, `:adaptive_concurrency`.
583
+ - Fairness: in-tick EWMA reorder + optional `tick_admission_budget`.
584
+ - Sharding: `shard_by` + per-shard tick loops.
585
+ - Bulk handoff: `ActiveJob.perform_all_later` collapses to one
586
+ adapter `INSERT` per tick when admissible.
587
+ - Admin UI with capacity hints, pending trend, denial reasons.
588
+ - Manual benchmark suite.
589
+
590
+ Deferred ideas (with rationale) live in [`IDEAS.md`](IDEAS.md):
591
+ `gate :global_cap`, smarter sweeper defaults, `sweep_every_seconds`
592
+ instead of `sweep_every_ticks`.
432
593
 
433
594
  ## License
434
595