dispatch_policy 0.2.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. checksums.yaml +4 -4
  2. data/MIT-LICENSE +16 -17
  3. data/README.md +433 -388
  4. data/app/assets/stylesheets/dispatch_policy/application.css +157 -0
  5. data/app/controllers/dispatch_policy/application_controller.rb +45 -1
  6. data/app/controllers/dispatch_policy/dashboard_controller.rb +91 -0
  7. data/app/controllers/dispatch_policy/partitions_controller.rb +122 -0
  8. data/app/controllers/dispatch_policy/policies_controller.rb +94 -267
  9. data/app/controllers/dispatch_policy/staged_jobs_controller.rb +9 -0
  10. data/app/models/dispatch_policy/adaptive_concurrency_stats.rb +11 -81
  11. data/app/models/dispatch_policy/inflight_job.rb +12 -0
  12. data/app/models/dispatch_policy/partition.rb +21 -0
  13. data/app/models/dispatch_policy/staged_job.rb +4 -97
  14. data/app/models/dispatch_policy/tick_sample.rb +11 -0
  15. data/app/views/dispatch_policy/dashboard/index.html.erb +109 -0
  16. data/app/views/dispatch_policy/partitions/index.html.erb +63 -0
  17. data/app/views/dispatch_policy/partitions/show.html.erb +106 -0
  18. data/app/views/dispatch_policy/policies/index.html.erb +15 -37
  19. data/app/views/dispatch_policy/policies/show.html.erb +139 -223
  20. data/app/views/dispatch_policy/shared/_capacity.html.erb +67 -0
  21. data/app/views/dispatch_policy/shared/_hints.html.erb +13 -0
  22. data/app/views/dispatch_policy/shared/_partition_row.html.erb +12 -0
  23. data/app/views/dispatch_policy/staged_jobs/show.html.erb +31 -0
  24. data/app/views/layouts/dispatch_policy/application.html.erb +95 -238
  25. data/config/routes.rb +18 -2
  26. data/db/migrate/20260501000001_create_dispatch_policy_tables.rb +103 -0
  27. data/lib/dispatch_policy/bypass.rb +23 -0
  28. data/lib/dispatch_policy/config.rb +85 -0
  29. data/lib/dispatch_policy/context.rb +50 -0
  30. data/lib/dispatch_policy/cursor_pagination.rb +121 -0
  31. data/lib/dispatch_policy/decision.rb +22 -0
  32. data/lib/dispatch_policy/engine.rb +4 -27
  33. data/lib/dispatch_policy/forwarder.rb +63 -0
  34. data/lib/dispatch_policy/gate.rb +10 -38
  35. data/lib/dispatch_policy/gates/adaptive_concurrency.rb +99 -97
  36. data/lib/dispatch_policy/gates/concurrency.rb +45 -26
  37. data/lib/dispatch_policy/gates/throttle.rb +65 -41
  38. data/lib/dispatch_policy/inflight_tracker.rb +174 -0
  39. data/lib/dispatch_policy/job_extension.rb +155 -0
  40. data/lib/dispatch_policy/operator_hints.rb +126 -0
  41. data/lib/dispatch_policy/pipeline.rb +48 -0
  42. data/lib/dispatch_policy/policy.rb +61 -59
  43. data/lib/dispatch_policy/policy_dsl.rb +120 -0
  44. data/lib/dispatch_policy/railtie.rb +35 -0
  45. data/lib/dispatch_policy/registry.rb +46 -0
  46. data/lib/dispatch_policy/repository.rb +723 -0
  47. data/lib/dispatch_policy/serializer.rb +36 -0
  48. data/lib/dispatch_policy/tick.rb +260 -256
  49. data/lib/dispatch_policy/tick_loop.rb +59 -26
  50. data/lib/dispatch_policy/version.rb +1 -1
  51. data/lib/dispatch_policy.rb +71 -52
  52. data/lib/generators/dispatch_policy/install/install_generator.rb +70 -0
  53. data/lib/generators/dispatch_policy/install/templates/create_dispatch_policy_tables.rb.tt +95 -0
  54. data/lib/generators/dispatch_policy/install/templates/dispatch_tick_loop_job.rb.tt +53 -0
  55. data/lib/generators/dispatch_policy/install/templates/initializer.rb.tt +11 -0
  56. metadata +101 -43
  57. data/CHANGELOG.md +0 -43
  58. data/app/models/dispatch_policy/partition_inflight_count.rb +0 -42
  59. data/app/models/dispatch_policy/partition_observation.rb +0 -76
  60. data/app/models/dispatch_policy/throttle_bucket.rb +0 -41
  61. data/db/migrate/20260424000001_create_dispatch_policy_tables.rb +0 -80
  62. data/db/migrate/20260424000002_create_adaptive_concurrency_stats.rb +0 -22
  63. data/db/migrate/20260424000003_create_adaptive_concurrency_samples.rb +0 -25
  64. data/db/migrate/20260424000004_rename_samples_to_partition_observations.rb +0 -32
  65. data/db/migrate/20260425000001_add_duration_to_partition_observations.rb +0 -8
  66. data/lib/dispatch_policy/active_job_perform_all_later_patch.rb +0 -32
  67. data/lib/dispatch_policy/dispatch_context.rb +0 -53
  68. data/lib/dispatch_policy/dispatchable.rb +0 -123
  69. data/lib/dispatch_policy/gates/fair_interleave.rb +0 -32
  70. data/lib/dispatch_policy/gates/global_cap.rb +0 -26
data/README.md CHANGED
@@ -1,550 +1,595 @@
  # DispatchPolicy
 
- > **⚠️ Experimental.** The API, schema, and defaults can change between
- > minor releases without notice. DispatchPolicy is currently running in
- > production on [pulso.run](https://pulso.run); that's how we learn
- > what breaks. If you pick it up for your own project, pin the exact
- > version and expect to follow the changelog.
+ > **⚠️ Experimental v2 branch.** This is the `v2` branch of
+ > [ceritium/dispatch_policy](https://github.com/ceritium/dispatch_policy),
+ > an alternative cut: TX-atomic admission, in-tick fairness as a
+ > layer (not a gate), and a single canonical partition scope per
+ > policy. API, schema, and defaults can change between any two
+ > commits. The `master` branch of the same repo is the original
+ > design and is what the published gem (when one ships) tracks.
  >
- > **PostgreSQL only (11+).** The staging, admission, and fairness
- > machinery leans on `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`,
- > `ON CONFLICT`, and `CROSS JOIN LATERAL`. MySQL/SQLite support isn't
- > closed off as a goal; being drop-in across every ActiveJob backend
- > is the long-term direction, but it would take meaningful rework
- > (shadow columns for `jsonb`, full indexes instead of partial, a
- > different batch-fetch strategy for fairness). Contributions welcome.
+ > **PostgreSQL only.** Staging, admission, and adaptive stats lean on
+ > `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`, `ON CONFLICT`,
+ > and the adapter sharing `ActiveRecord::Base.connection`, so the
+ > admit + adapter INSERT can join one transaction. Tested against
+ > good_job and solid_queue.
 
  Per-partition admission control for ActiveJob. Stages `perform_later`
  into a dedicated table, runs a tick loop that admits jobs through
- declared gates (throttle, concurrency, global_cap, fair_interleave,
- adaptive_concurrency), then forwards survivors to the real adapter.
+ declared gates (`throttle`, `concurrency`, `adaptive_concurrency`),
+ then forwards survivors to the real adapter. The admission and the
+ adapter INSERT happen inside one Postgres transaction, so a worker
+ crash mid-tick can't lose a job.
 
  Use it when you need:
 
- - **Per-tenant / per-endpoint throttle** that's exact (token bucket)
-   instead of best-effort enqueue-side.
- - **Per-partition concurrency** with a proper release hook on job
-   completion (and lease-expiry recovery if the worker dies mid-perform).
+ - **Per-tenant / per-endpoint throttle** — a token bucket per partition,
+   refreshed lazily on read.
+ - **Per-partition concurrency** — a fixed cap on in-flight jobs with a
+   release hook on completion and a heartbeat-based reaper for crashes.
  - **Adaptive concurrency** — a cap that shrinks under queue pressure
-   and grows back when workers keep up, without manual tuning.
- - **Dedupe** against a partial unique index, not an in-memory key.
- - **Round-robin fairness across tenants** (LATERAL batch fetch) so one
-   tenant's burst can't starve the others — including a **time-weighted
-   variant** that balances total compute time per tenant when their
-   performs have very different durations.
+   and grows back when workers keep up, with no manual tuning per tenant.
+ - **In-tick fairness** — within a single tick, partitions are reordered
+   by recent activity (EWMA) and an optional global cap is shared
+   fairly across them, so one tenant's burst can't starve the others.
+ - **Sharding** — split a policy across N queues so independent tick
+   workers admit in parallel.
 
  ## Demo
 
- A runnable playground that exercises every gate and the admin UI lives
- at [ceritium/dispatch_policy-demo](https://github.com/ceritium/dispatch_policy-demo).
- Clone it, `bundle && rails db:setup`, and use the in-browser forms to
- fire jobs through throttle / concurrency / adaptive / round-robin
- policies while the admin UI updates in real time.
+ The demo lives in `test/dummy/`, a tiny Rails app inside this repo.
+ Run it locally to play with every gate and the admin UI:
+
+ ```bash
+ bin/dummy setup good_job # creates the DB and migrates
+ DUMMY_ADAPTER=good_job bundle exec foreman start
+ ```
+
+ Then open:
+
+ - `http://localhost:3000/` — playground with one card per job and a
+   storm form that exercises the adaptive cap and fairness reorder
+   across many tenants.
+ - `http://localhost:3000/dispatch_policy` — admin UI: live throughput,
+   partition state, denial reasons, capacity hints.
+
+ The dummy ships ten purpose-built jobs covering throttle, concurrency,
+ mixed gates, scheduling, retries, stress tests, sharding, fairness, and
+ adaptive concurrency. See `test/dummy/app/jobs/`.
 
  ## Install
 
  Add to your `Gemfile`:
 
  ```ruby
- gem "dispatch_policy"
+ gem "dispatch_policy",
+   git: "https://github.com/ceritium/dispatch_policy",
+   branch: "v2"
  ```
 
- Copy the migration and run it:
+ Generate the install bundle (migration + initializer + tick loop job):
 
- ```
- bundle exec rails dispatch_policy:install:migrations
- bundle exec rails db:migrate
+ ```bash
+ bin/rails generate dispatch_policy:install
+ bin/rails db:migrate
  ```
 
- Mount the admin UI in `config/routes.rb` (optional):
+ Mount the admin UI (optional but recommended):
 
  ```ruby
- mount DispatchPolicy::Engine => "/admin/dispatch_policy"
+ mount DispatchPolicy::Engine, at: "/dispatch_policy"
  ```
 
- Configure in `config/initializers/dispatch_policy.rb`:
+ Then schedule the tick loop. The generator wrote a
+ `DispatchTickLoopJob` in `app/jobs/`; kick it off once and it
+ re-enqueues itself:
 
  ```ruby
- DispatchPolicy.configure do |c|
-   c.enabled = ENV.fetch("DISPATCH_POLICY_ENABLED", "true") != "false"
-   c.lease_duration = 15.minutes
-   c.batch_size = 500
-   c.round_robin_quantum = 50
-   c.tick_sleep = 1         # idle
-   c.tick_sleep_busy = 0.05 # after productive ticks
- end
+ DispatchTickLoopJob.perform_later
  ```
 
  ## Flow
 
  ```
  ActiveJob#perform_later
-   Dispatchable#enqueue
-     StagedJob.stage! (insert into dispatch_policy_staged_jobs, pending)
+   JobExtension.around_enqueue_for
+     Repository.stage! (INSERT staged + UPSERT partition; ctx refreshed)
 
  (tick loop, periodically)
-   SELECT pending FOR UPDATE SKIP LOCKED
-   Run gates in declared order; survivors are the admitted set
-   StagedJob#mark_admitted! (increment counters, set admitted_at)
-   job.enqueue(_bypass_staging: true) (hand off to the real adapter)
+   claim_partitions (FOR UPDATE SKIP LOCKED, ordered by last_checked_at)
+   reorder by decayed_admits ASC (in-tick fairness)
+   for each: pipeline.call(ctx, partition, fair_share)
+     gates evaluate; admit_count = min(allowed)
+     → ONE TX: claim_staged_jobs! + insert_inflight! + Forwarder.dispatch
+       (the adapter INSERT shares the TX; rollback if anything raises)
+     → bulk-flush deny-state in one UPDATE ... FROM (VALUES ...)
 
  (worker runs perform)
-   Dispatchable#around_perform
+   InflightTracker.track (around_perform)
+     → INSERT inflight_jobs ON CONFLICT DO NOTHING
+     → spawn heartbeat thread
      → block.call
-     release counters, mark StagedJob completed_at, record observation
+     record_observation on adaptive gates (queue_lag AIMD update)
+     → DELETE inflight_jobs
  ```
 
  ## Declaring a policy
 
  ```ruby
- class SendWebhookJob < ApplicationJob
-   include DispatchPolicy::Dispatchable
+ class FetchEndpointJob < ApplicationJob
+   dispatch_policy_inflight_tracking # only required if a concurrency gate is used
 
-   dispatch_policy do
-     # Persisted in the staged row so gates can read it without touching AR.
+   dispatch_policy :endpoints do
      context ->(args) {
        event = args.first
-       { endpoint_id: event.endpoint_id, rate_limit: event.endpoint.rate_limit }
+       {
+         endpoint_id: event.endpoint_id,
+         rate_limit: event.endpoint.rate_limit,
+         max_per_account: event.account.dispatch_concurrency
+       }
      }
 
-     # Partial unique index dedupes identical keys while the previous is pending.
-     dedupe_key ->(args) { "event:#{args.first.id}" }
-
-     # Tenant fairness — see the "Round-robin" section below.
-     round_robin_by ->(args) { args.first.account_id }
+     # Required: every gate in the policy enforces against this scope.
+     partition_by ->(ctx) { ctx[:endpoint_id] }
 
      gate :throttle,
-       rate: ->(ctx) { ctx[:rate_limit] },
-       per: 1.minute,
-       partition_by: ->(ctx) { ctx[:endpoint_id] }
+       rate: ->(ctx) { ctx[:rate_limit] },
+       per: 1.minute
+
+     gate :concurrency,
+       max: ->(ctx) { ctx[:max_per_account] || 5 }
 
-     gate :fair_interleave
+     retry_strategy :restage # default; alternative: :bypass
    end
 
-   def perform(event) = event.deliver!
+   def perform(event)
+     # ... call the rate-limited HTTP endpoint
+   end
  end
  ```
 
  `perform_later` stages the job; the tick admits it when its gates pass.
+ With multiple gates the actual `admit_count` per tick comes out as
+ `min(allowed)` across all of them.
 
- For the common multi-tenant webhook case (mixed-latency tenants behind
- a shared pool) skip ahead to [Recipes](#multi-tenant-webhook-delivery)
- — `round_robin_by weight: :time` plus `:adaptive_concurrency` covers
- it without an explicit throttle.
+ ## Choosing the partition scope
 
- ## Gates
+ `partition_by` is the most consequential decision in a policy and the
+ only required field. It tells the gem **what counts as one logical
+ partition** — what scope each gate enforces against, and what the
+ in-tick fairness reorder operates over.
 
- Gates run in declared order, each narrowing the survivor set. Any option
- that takes a value can alternatively take a lambda that receives the
- `ctx` hash, so parameters can depend on per-job data.
+ A policy with `partition_by` and **no gates** is also valid: the
+ pipeline passes the full budget through, and the Tick caps it via
+ `admission_batch_size` (or `tick_admission_budget` if set). Useful
+ for "balance N tenants evenly" without rate-limiting any of them.
 
- ### `:concurrency` — in-flight cap per partition
+ If you need genuinely different scopes per gate (throttle by endpoint
+ AND concurrency by account, each enforced at its own scope), **split
+ into two policies** and chain them: the staging policy admits, and its
+ worker enqueues into the second, as in the sketch below.
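+
+ A minimal sketch of that two-policy chain, using the DSL from this
+ README (the job names and the `fetch!` call are hypothetical):
+
+ ```ruby
+ class StageFetchJob < ApplicationJob
+   # Policy 1: throttle per endpoint, then hand off.
+   dispatch_policy :endpoint_throttle do
+     context ->(args) { { endpoint_id: args.first.endpoint_id } }
+     partition_by ->(ctx) { ctx[:endpoint_id] }
+     gate :throttle, rate: 100, per: 1.minute
+   end
+
+   def perform(event) = RunFetchJob.perform_later(event)
+ end
+
+ class RunFetchJob < ApplicationJob
+   dispatch_policy_inflight_tracking
+
+   # Policy 2: cap in-flight jobs per account.
+   dispatch_policy :account_concurrency do
+     context ->(args) { { account_id: args.first.account_id } }
+     partition_by ->(ctx) { "acct:#{ctx[:account_id]}" }
+     gate :concurrency, max: 5
+   end
+
+   def perform(event) = event.fetch!
+ end
+ ```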
 
- Caps the number of admitted-but-not-yet-completed jobs in each
- partition. Tracks in-flight counts in
- `dispatch_policy_partition_counts`; decremented by the `around_perform`
- hook when the job finishes, or by the reaper when a lease expires
- (worker crashed).
-
- ```ruby
- gate :concurrency,
-   max: ->(ctx) { ctx[:max_per_account] || 5 },
-   partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
- ```
+ ## Gates
 
- When to reach for it: external APIs with per-tenant concurrency limits,
- database-heavy jobs you don't want to pile up per customer, anything
- where "at most N running at once for this key" matters.
+ Gates run in declared order; each narrows the survivor count. Every
+ option that takes a value can alternatively take a lambda receiving
+ the `ctx` hash, so parameters can depend on per-job data.
 
  ### `:throttle` — token-bucket rate limit per partition
 
- Refills `rate` tokens every `per` seconds, capped at `burst` (defaults
- to `rate`). Admits jobs while tokens are available; leaves the rest
- pending for the next tick.
+ Refills `rate` tokens every `per` seconds, capped at `rate` (no
+ separate burst). Admits jobs while tokens are available; leaves the
+ rest pending for the next tick. State is persisted in
+ `partitions.gate_state.throttle`.
 
  ```ruby
  gate :throttle,
-   rate: 100,     # tokens
-   per: 1.minute, # refill window
-   burst: 100,    # bucket cap (optional, defaults to rate)
-   partition_by: ->(ctx) { "host:#{ctx[:host]}" }
+   rate: ->(ctx) { ctx[:rate_limit] },
+   per: 1.minute
  ```
 
- `rate` and `burst` accept lambdas, so the limit can come from
- configuration stored alongside the thing being rate-limited:
+ Throttle does **not** release tokens on completion; tokens refill
+ only with elapsed time.
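+
+ A back-of-the-envelope sketch of the lazy refill (plain Ruby for
+ illustration, not the gem's internals):
+
+ ```ruby
+ # Tokens accrue at rate/per per second and are capped at rate;
+ # each admission spends one token.
+ def refill(tokens, elapsed_seconds, rate:, per:)
+   [tokens + elapsed_seconds * (rate.to_f / per), rate.to_f].min
+ end
+
+ refill(0, 30, rate: 100, per: 60) # => 50.0 (half the window, half the bucket)
+ ```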
+
+ ### `:concurrency` — in-flight cap per partition
+
+ Caps the number of admitted-but-not-yet-completed jobs per partition.
+ Counts rows in `dispatch_policy_inflight_jobs` keyed by the policy's
+ canonical partition. Decremented by `InflightTracker.track`'s
+ `around_perform`; reaped by a periodic sweeper if a worker crashes.
 
  ```ruby
- gate :throttle,
-   rate: ->(ctx) { ctx[:rate_limit] },
-   per: 1.minute,
-   partition_by: ->(ctx) { ctx[:endpoint_id] }
+ gate :concurrency,
+   max: ->(ctx) { ctx[:max_per_account] || 5 }
  ```
 
- Unlike `:concurrency`, throttle does **not** release tokens on job
- completion; tokens refill only with elapsed time.
+ When the cap is full, the gate returns `retry_after = full_backoff`
+ (default 1s) so the partition sits out the next ticks instead of
+ hammering `count(*)` every iteration.
 
- ### `:global_cap` — single cap across all partitions
+ ### `:adaptive_concurrency` — per-partition cap that self-tunes
 
- A global version of `:concurrency`: at most `max` jobs admitted
- simultaneously across the whole policy, regardless of partition.
- Useful as a safety ceiling on top of per-partition limits.
+ Like `:concurrency` but the cap (`current_max`) shrinks when the
+ adapter queue backs up and grows when workers drain it quickly.
+ AIMD loop on a per-partition stats row in
+ `dispatch_policy_adaptive_concurrency_stats`.
 
  ```ruby
- gate :concurrency, max: 10, partition_by: ->(ctx) { ctx[:tenant] }
- gate :global_cap, max: 200
+ gate :adaptive_concurrency,
+   initial_max: 3,
+   target_lag_ms: 1000, # acceptable queue wait before backoff
+   min: 1               # floor; a partition can't lock itself out
  ```
 
- Reads: "up to 10 in flight per tenant, but never more than 200 total".
+ - **Feedback signal**: `admitted_at → perform_start` (queue wait in
+   the real adapter). Pure saturation signal — slow performs in the
+   downstream service don't punish admissions if workers still drain
+   the queue quickly.
+ - **Growth**: `current_max += 1` per fast success.
+ - **Slow shrink**: `current_max *= 0.95` when EWMA lag > target.
+ - **Failure shrink**: `current_max *= 0.5` when `perform` raises.
+ - **Safety valve**: when `in_flight == 0` the gate floors `remaining`
+   at `initial_max`, so a partition that AIMD shrank to `min` during
+   a past burst can re-grow when it idles.
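+
+ A compact sketch of the AIMD update those bullets describe
+ (illustrative Ruby, not the gem's code):
+
+ ```ruby
+ # Halve on failure, decay 5% when EWMA lag exceeds target,
+ # otherwise grow by one; never drop below the floor.
+ def next_max(current_max, ewma_lag_ms, target_lag_ms, failed:, min: 1)
+   updated =
+     if failed
+       current_max * 0.5
+     elsif ewma_lag_ms > target_lag_ms
+       current_max * 0.95
+     else
+       current_max + 1
+     end
+   [updated, min].max
+ end
+ ```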
 
- ### `:fair_interleave` — round-robin ordering across partitions
+ #### Choosing `target_lag_ms`
 
- Not a filter but a reordering step. Groups the batch by its primary
- partition and interleaves, so no single partition can starve others
- even if it has many pending jobs.
+ It's the knob that trades latency for throughput. Rough guide:
 
- ```ruby
- gate :concurrency, max: 10, partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
- gate :fair_interleave
- ```
+ - **Too low** (10–50 ms): the gate reacts to every tiny bump in
+   queue wait and shrinks aggressively. Workers idle while jobs sit
+   pending — overshoot.
+ - **Too high** (30 s+): the gate barely pushes back; throughput is
+   near-max but new admissions wait seconds before a worker picks
+   them up.
+ - **Reasonable starting point**: `≈ worker_threads × avg_perform_ms`.
+   E.g. 5 workers × 200 ms perform = 1000 ms means "queue depth up
+   to ~1 s is fine".
 
- Place it after a gate that assigned partitions; interleaving is keyed
- off the first partition a row picked up.
+ ## Fairness within a tick
 
- ### `:adaptive_concurrency` — per-partition cap that self-tunes
+ When several partitions compete for admission inside the same tick,
+ the gem reorders them by **least-recently-active first** so a hot
+ partition with thousands of pending jobs cannot starve a cold one
+ that just woke up.
 
- The cap per partition (`current_max`) shrinks when the adapter queue
- backs up (EWMA of queue lag > `target_lag_ms`) or when performs raise;
- grows back by +1 when lag stays under target. AIMD loop on a
- per-partition stats row (`dispatch_policy_adaptive_concurrency_stats`).
+ The mechanism has two knobs: an EWMA half-life (controls *how* the
+ order is decided) and an optional global tick cap (controls *how
+ much* each partition is allowed in one tick).
 
- ```ruby
- gate :adaptive_concurrency,
-   partition_by: ->(ctx) { ctx[:account_id] },
-   initial_max: 3,
-   target_lag_ms: 1000, # acceptable queue wait before admission
-   min: 1 # floor so a partition can't lock itself out
- end
+ ### `fairness half_life:`
+
+ Each partition keeps `decayed_admits` and `decayed_admits_at`,
+ updated atomically inside the admit transaction:
+
+ ```
+ decayed_admits := decayed_admits * exp(-Δt / τ) + admitted
+ where τ = half_life / ln(2)
  ```
 
- - **Feedback signal**: `admitted_at → perform_start` (queue wait in the
-   real adapter). Pure saturation signal; slow performs in the
-   downstream service don't punish admissions if workers still drain
-   the queue quickly.
- - **Growth**: +1 per fast success. No hard ceiling; the algorithm
-   self-limits via `target_lag_ms`. If the queue builds up, the cap
-   shrinks multiplicatively.
- - **Failure**: `current_max *= 0.5` (halve) when `perform` raises.
- - **Slow**: `current_max *= 0.95` when EWMA lag > target.
+ After `half_life` seconds without admitting, the value halves. The
+ Tick sorts the claimed batch by current `decayed_admits` ASC, so the
+ under-admitted go first.
 
- ### Choosing `target_lag_ms`
+ | Value | Behaviour |
+ |-------|-----------|
+ | 5–10 s | Reacts to brief pauses. Bursty workloads where short stalls deserve a head start. |
+ | **60 s** (default) | Stable steady-state. Hot partitions stay "hot" through normal latency variation. |
+ | 5–15 min | Long memory. Burst on partition A penalises A for many minutes. |
 
- It's the knob that trades latency for throughput. Rough guide:
+ Set `c.fairness_half_life_seconds = nil` to disable the reorder
+ entirely — partitions are processed in `claim_partitions` order
+ (least-recently-checked first).
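+
+ Worked numbers for the decay, assuming the default 60 s half-life
+ (plain Ruby, matching the formula above):
+
+ ```ruby
+ half_life = 60.0
+ tau = half_life / Math.log(2)      # ≈ 86.6 s
+ decayed = 40 * Math.exp(-60 / tau) # ≈ 20.0: one half-life halves it
+ decayed += 12                      # then add this tick's admissions
+ ```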
 
- - **Too low** (e.g. 10-50 ms). The gate reacts to every tiny bump in
-   queue wait and shrinks the cap aggressively. Workers can end up
-   idle with jobs still pending admission because the cap is
-   overcorrecting: classic contention / overshoot.
- - **Too high** (e.g. 30 s). The gate barely ever pushes back, so
-   you get near-maximum throughput at the cost of real queue buildup;
-   newly admitted jobs may wait seconds before a worker picks them
-   up.
- - **Reasonable starting point**: `≈ worker_max_threads × avg_perform_ms`.
-   If you run 5 workers at ~200 ms/perform, `target_lag_ms: 1000`
-   means "it's OK if the adapter queue stays at most ~1 second
-   deep". You'll want to tune from there based on what your
-   downstream tolerates and how fast you want bursts to drain.
-
- Pair it with `round_robin_by` for multi-tenant systems that want
- automatic backpressure without hand-tuned caps per tenant:
+ ### `tick_admission_budget`
+
+ Without this, each partition admits up to `admission_batch_size`.
+ With it set, the per-partition ceiling becomes `fair_share = ceil(cap
+ / claimed_partitions)`. Pass-1 walks the (decay-sorted) partitions
+ giving each up to `fair_share`; pass-2 redistributes any leftover to
+ those that filled their share.
 
  ```ruby
- round_robin_by ->(args) { args.first[:account_id] }
- gate :adaptive_concurrency,
-   partition_by: ->(ctx) { ctx[:account_id] },
-   initial_max: 3,
-   target_lag_ms: 1000
+ DispatchPolicy.configure do |c|
+   c.fairness_half_life_seconds = 60
+   c.tick_admission_budget = nil # default — no global cap
+ end
+
+ # Per-policy override:
+ dispatch_policy :endpoints do
+   partition_by ->(c) { c[:endpoint_id] }
+   fairness half_life: 30.seconds
+   tick_admission_budget 200
+   gate :throttle, rate: 100, per: 60
+ end
  ```
 
- ## Queues and partitioning
+ When the cap is hit before all partitions admit, the rest are denied
+ with reason `tick_cap_exhausted`. They were still observed
+ (`last_checked_at` bumped), so they sort to the front of the next
+ tick's fairness order.
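+
+ Worked numbers for the two-pass split (hypothetical figures, plain
+ Ruby):
+
+ ```ruby
+ budget  = 200                             # tick_admission_budget
+ claimed = 7                               # partitions claimed this tick
+ fair_share = (budget.to_f / claimed).ceil # => 29
+ # Pass 1: each partition may admit up to 29. If only 5 of the 7
+ # have pending work, ~55 budget slots are left over; pass 2
+ # re-offers them to the partitions that filled their share.
+ ```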
+
+ ### Anti-stagnation
 
- DispatchPolicy operates at the **policy** (class) level. A job's
- ActiveJob `queue` and `priority` travel through staging into admission
- and on to the real adapter (workers of each queue pick up their jobs
- normally), but neither affects which staged rows the gates see. All
- enqueues of the same job class share one policy, one throttle bucket,
- one concurrency cap.
+ The decay-based reorder only applies to partitions already claimed.
+ Selection (`Repository.claim_partitions`) still orders by
+ `last_checked_at NULLS FIRST, id`. Every active partition with
+ pending jobs is visited in at most ⌈N / partition_batch_size⌉ ticks
+ regardless of how hot or cold it is.
 
- Two consequences to be aware of:
+ ### Mixing `:adaptive_concurrency` with fairness
 
- - Enqueuing the same job to different queues does **not** give one
-   queue priority at admission; they share the policy's gates. If
-   urgent work should jump ahead, set a lower ActiveJob `priority`
-   (the admission SELECT is `ORDER BY priority, staged_at`) — or split
-   into a subclass with its own policy.
- - `dedupe_key` is queue-agnostic: the same key enqueued to
-   `:urgent` and `:low` dedupes to one row.
+ Adaptive and fairness operate at different layers and compose
+ without sharing state:
 
- ### Using queue as a partition
+ - **Fairness** writes `partitions.decayed_admits` inside the
+   per-partition admit TX.
+ - **Adaptive** writes `dispatch_policy_adaptive_concurrency_stats`
+   from the worker's `around_perform` via `record_observation`.
 
- The context hash has `queue_name` and `priority` injected automatically
- at stage time (user-supplied keys win). Use them in any `partition_by`:
+ Different tables, different locks. Each tick the actual `admit_count`
+ becomes `min(fair_share, current_max - in_flight)` (with the
+ adaptive safety valve when `in_flight == 0`). Fairness picks order +
+ budget per tick; adaptive shapes how aggressively each partition
+ consumes its share.
 
  ```ruby
- class SendEmailJob < ApplicationJob
-   include DispatchPolicy::Dispatchable
+ dispatch_policy :tenants do
+   partition_by ->(c) { c[:tenant] }
 
-   dispatch_policy do
-     context ->(args) { { account_id: args.first.account_id } }
+   gate :adaptive_concurrency,
+     initial_max: 5,
+     target_lag_ms: 1000,
+     min: 1
 
-     # Separate throttle bucket per (queue, account) — urgent and default
-     # don't share rate tokens.
-     gate :throttle,
-       rate: 100,
-       per: 1.minute,
-       partition_by: ->(ctx) { "#{ctx[:queue_name]}:#{ctx[:account_id]}" }
-   end
+   fairness half_life: 30.seconds
+   tick_admission_budget 60
  end
-
- SendEmailJob.set(queue: :urgent).perform_later(user)
- SendEmailJob.set(queue: :default).perform_later(user)
- # → two partitions, each with its own bucket.
  ```
 
- If you'd rather keep the two streams fully isolated (separate policies,
- admin rows, and dedupe scopes), subclass:
+ The dummy `AdaptiveDemoJob` declares both; the storm form drives it
+ across many tenants with a triangular weight distribution so you can
+ watch the EWMA reorder hot tenants AND the AIMD shrink their cap.
+ Integration test: `test/integration/adaptive_with_fairness_test.rb`.
+
+ ## Sharding a policy across worker pools
+
+ Shards split a policy's partitions horizontally: each tick worker
+ sees only the partitions on its own shard, so multiple workers can
+ admit in parallel for the same policy. Declare a `shard_by`:
 
  ```ruby
- class UrgentEmailJob < SendEmailJob
-   queue_as :urgent
-   dispatch_policy do
-     context ->(args) { { account_id: args.first.account_id } }
-     gate :throttle, rate: 500, per: 1.minute, partition_by: ->(ctx) { ctx[:account_id] }
-   end
+ dispatch_policy :events do
+   context ->(args) { { account_id: args.first[:account_id] } }
+   partition_by ->(c) { "acct:#{c[:account_id]}" }
+   shard_by ->(c) { "events-shard-#{c[:account_id].hash.abs % 4}" }
+
+   gate :concurrency, max: 50
  end
  ```
 
- ## Dedupe
+ Run one `DispatchTickLoopJob` per shard:
+
+ ```ruby
+ 4.times { |i| DispatchTickLoopJob.perform_later("events", "events-shard-#{i}") }
+ ```
+
+ The generated `DispatchTickLoopJob` template uses
+ `queue_as { arguments[1] }` so each tick is enqueued on the same
+ queue it monitors. Workers listening on `events-shard-*` queues run
+ both the tick loops and the admitted jobs from one pool per shard.
 
- `dedupe_key` is enforced by a partial unique index on
- `(policy_name, dedupe_key) WHERE completed_at IS NULL`. Semantics:
+ The gem's automatic context enrichment puts `:queue_name` into the
+ ctx hash so `shard_by` can use it directly without your `context`
+ proc having to know about it.
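+
+ For example (a sketch building on that enrichment; it assumes jobs
+ are already routed to per-shard queues with `set(queue:)`):
+
+ ```ruby
+ dispatch_policy :events do
+   context ->(args) { { account_id: args.first[:account_id] } }
+   partition_by ->(c) { "acct:#{c[:account_id]}" }
+   shard_by ->(c) { c[:queue_name] } # the injected key doubles as the shard
+
+   gate :concurrency, max: 50
+ end
+ ```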
 
- - Re-enqueuing while a previous staged row is pending or admitted →
-   silently dropped.
- - Re-enqueuing after the previous completes → fresh staged row.
- - Returning `nil` from the lambda → no dedup for that enqueue.
+ **`shard_by` must be as coarse as the most restrictive throttle's
+ scope.** If not, the bucket duplicates across shards and the
+ effective rate becomes `rate × N_shards`.
 
- Typical pattern: `"<domain>:<entity>:<id>"` (`"monitor:42"`,
- `"event:abc123"`). Keep it stable for the duration of a logical unit
- of work.
+ ## Atomic admission
 
- ## Round-robin batching (tenant fairness)
+ `Forwarder.dispatch` runs inside the per-partition admission
+ transaction. The adapter (good_job, solid_queue) uses
+ `ActiveRecord::Base.connection`, so its `INSERT INTO good_jobs`
+ joins the same TX as the `DELETE FROM staged_jobs` and the `INSERT
+ INTO inflight_jobs`. Any exception (deserialize, adapter error,
+ network) rolls everything back atomically — no window where staged
+ is gone but the adapter never received the job.
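+
+ The shape of that transaction, as a pseudocode-level sketch reusing
+ the names from the Flow section (not the gem's literal code):
+
+ ```ruby
+ ActiveRecord::Base.transaction do
+   jobs = claim_staged_jobs!(partition, limit: admit_count) # DELETE ... RETURNING
+   insert_inflight!(jobs)   # rows the concurrency gates count
+   Forwarder.dispatch(jobs) # adapter INSERT joins this same TX
+ end                        # any raise above rolls all three back
+ ```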
 
- For policies where every tenant should keep making progress even
- when one suddenly enqueues 100× its normal volume, neither throttle
- nor concurrency is a good fit; you want max throughput, just
- fairness. `round_robin_by` solves it at the batch SELECT layer:
+ The trade-off: the gem requires a PG-backed adapter for
+ at-least-once. The railtie warns at boot if the adapter doesn't
+ look PG-shared (Sidekiq, Resque, async, …) but doesn't hard-fail:
+ a custom PG-backed adapter we don't recognise can still work.
+
+ For Rails multi-DB (e.g. solid_queue on a separate `:queue` role):
 
  ```ruby
- dispatch_policy do
-   context ->(args) { { account_id: args.first.account_id } }
-   round_robin_by ->(args) { args.first.account_id }
+ DispatchPolicy.configure do |c|
+   c.database_role = :queue
  end
  ```
 
- At stage time the lambda's result is written into the dedicated
- `round_robin_key` column (indexed). `Tick.run` then uses a two-phase
- fetch:
+ `Repository.with_connection` wraps the admission TX in
+ `connected_to(role:)` when set. Staging tables and the adapter's
+ table must live in the same DB for atomicity to hold.
 
- 1. **LATERAL join** — distinct keys × per-key `LIMIT round_robin_quantum`.
-    Guarantees each active tenant gets at least `quantum` rows per
-    tick, so a tenant with 10 pending is served in the same tick as
-    a tenant with 50k pending.
- 2. **Top-up** — if the fairness floor doesn't fill `batch_size`, the
-    remaining slots go to the oldest pending (excluding the ids
-    already locked). Keeps single-tenant throughput at full capacity.
-
- Cost per tick is O(`quantum × active_keys`), not O(backlog) — so the
- admin stays snappy even with thousands of distinct tenants.
-
- ### Time-weighted variant
+ ## Running the tick
 
- Equal-quanta round-robin gives every active tenant the same number of
- admissions per tick, fair by *count*. If your tenants have very
- different per-job durations (slow webhooks, varied report sizes) and
- you want to balance the *total compute time* each consumes, pass
- `weight: :time`:
+ `DispatchPolicy::TickLoop.run(policy_name:, shard:, stop_when:)` is
+ the entry point. It claims partitions under `FOR UPDATE SKIP
+ LOCKED`, evaluates gates, atomically admits, and updates partition
+ state. The install generator scaffolds a `DispatchTickLoopJob` you
+ schedule like any other ActiveJob:
 
  ```ruby
- round_robin_by ->(args) { args.first[:account_id] }, weight: :time
+ DispatchTickLoopJob.perform_later                      # all policies
+ DispatchTickLoopJob.perform_later("endpoints")         # one policy
+ DispatchTickLoopJob.perform_later("endpoints", "shard-2")
  ```
 
- Solo tenants are unaffected; the fetch falls through to the trailing
- top-up and they consume up to `batch_size` per tick. When multiple
- tenants are active, each one's quantum is sized inversely to how much
- compute time it has used in the last `window` seconds (default 60),
- sourced from `dispatch_policy_partition_observations`. So if `slow`
- has burned 20 s of perform time recently and `fast` has burned 200 ms,
- this tick `fast` claims ~99% of `batch_size` while `slow` gets the
- floor — total compute per minute stays balanced and you don't need a
- throttle on top.
+ Each job uses `good_job_control_concurrency_with` (or solid_queue's
+ `limits_concurrency`) so only one tick is active per
+ (policy, shard) combination at a time. The job re-enqueues itself
+ with a 1-second tail wait, so the loop survives normal restarts.
 
- ## Recipes
+ ## Admin UI
 
- ### Multi-tenant webhook delivery
+ Mount the engine and visit `/dispatch_policy`:
 
- Mixed-latency tenants behind a shared worker pool — exactly the case
- that motivated `weight: :time` and adaptive concurrency. Pair them:
+ - **Dashboard** — totals, throughput windows, round-trip stats,
+   capacity gauges (admit rate vs adapter ceiling, avg tick vs
+   `tick_max_duration`), pending trend with up/down arrow, auto-hints
+   ("avg tick at 88% of tick_max_duration — shard or lower
+   admission_batch_size").
+ - **Policies** — per-policy throughput, denial reasons breakdown,
+   top partitions by lifetime/pending, pause/resume/drain.
+ - **Partitions** — searchable list, detail view with gate state,
+   decayed_admits + admits/min estimate, recent staged jobs,
+   force-admit, drain.
 
- ```ruby
- class WebhookDeliveryJob < ApplicationJob
-   include DispatchPolicy::Dispatchable
-
-   dispatch_policy do
-     context ->(args) { { account_id: args.first[:account_id] } }
-
-     # Fetch-level fairness by *compute time* (not request count). When
-     # several accounts compete, per-tick quanta are sized inverse to
-     # their recent perform duration; solo accounts top up to batch_size.
-     round_robin_by ->(args) { args.first[:account_id] },
-       weight: :time, window: 60
-
-     # Drip-feed admission per account based on adapter queue lag.
-     # Without this, a single account with thousands of pending could
-     # dump batch_size jobs into the adapter queue in one tick and lose
-     # the ability to react to performance changes mid-burst.
-     gate :adaptive_concurrency,
-       partition_by: ->(ctx) { ctx[:account_id] },
-       initial_max: 3,
-       target_lag_ms: 500
-   end
+ The UI auto-refreshes via Turbo morph + a controllable picker
+ (off / 2s / 5s / 10s) stored in sessionStorage; preserves scroll
+ position; and skips a refresh while a previous Turbo visit is in
+ flight so a slow page doesn't stack visits.
 
-   def perform(account_id:, **) = WebhookClient.deliver!(account_id)
+ CSRF and forgery protection use the host app's settings. The UI
+ ships unauthenticated; wrap the `mount` with a constraint or
+ `before_action` for auth in production.
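+
+ For example, with a plain routing constraint (a sketch;
+ `AdminConstraint` is yours to define, and anything responding to
+ `matches?(request)` works):
+
+ ```ruby
+ # config/routes.rb
+ class AdminConstraint
+   def matches?(request)
+     user_id = request.session[:user_id]
+     user_id && User.find_by(id: user_id)&.admin?
+   end
+ end
+
+ constraints AdminConstraint.new do
+   mount DispatchPolicy::Engine, at: "/dispatch_policy"
+ end
+ ```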
+
+ ## Configuration
+
+ ```ruby
+ # config/initializers/dispatch_policy.rb
+ DispatchPolicy.configure do |c|
+   c.tick_max_duration = 25            # seconds the tick job stays admitting
+   c.partition_batch_size = 50         # partitions claimed per tick iteration
+   c.admission_batch_size = 100        # max jobs admitted per partition per iteration
+   c.idle_pause = 0.5                  # seconds slept when a tick admits nothing
+   c.partition_inactive_after = 86_400 # GC partitions idle this long
+   c.inflight_stale_after = 300        # GC inflight rows whose worker stopped heartbeating
+   c.inflight_heartbeat_interval = 30  # how often the worker bumps heartbeat_at
+   c.sweep_every_ticks = 50            # sweeper cadence (in tick iterations)
+   c.metrics_retention = 86_400        # tick_samples kept this long
+   c.fairness_half_life_seconds = 60   # EWMA half-life for in-tick reorder; nil disables
+   c.tick_admission_budget = nil       # global cap on admissions per tick; nil = none
+   c.adapter_throughput_target = nil   # jobs/sec; UI shows admit rate as % of this
+   c.database_role = nil               # AR role for the admission TX (multi-DB)
  end
  ```
 
- What you get with no throttle, no manual tuning:
-
- - A solo account runs at whatever throughput its downstream allows;
-   `:adaptive_concurrency` grows `current_max` while queue lag stays
-   under `target_lag_ms`.
- - A slow account (1 s/perform) and a fast account (100 ms/perform)
-   competing → `weight: :time` gives the fast one most of each tick's
-   budget; the slow one's adaptive cap shrinks toward `min`. Total
-   compute time per minute stays balanced and the adapter queue
-   doesn't pile up behind whichever tenant happened to enqueue first.
- - A misbehaving downstream that suddenly goes from 100 ms to 5 s →
-   that tenant's `current_max` drops within a few completions and its
-   fetch quantum shrinks; the other tenants are unaffected.
-
- Tune `target_lag_ms` for the latency budget you can tolerate (see
- [Choosing target_lag_ms](#choosing-target_lag_ms)) and `window` for
- how reactive the time-balancing should be (smaller = noisier, larger
- = more stable).
+ You can override `admission_batch_size`, `fairness_half_life_seconds`,
+ and `tick_admission_budget` per policy via the DSL.
 
- ## Running the tick
+ ## `partitions.context` is refreshed on every enqueue
 
- The gem exposes `DispatchPolicy::TickLoop.run(policy_name:, stop_when:)`
- but **does not ship a tick job**; concurrency semantics are
- queue-adapter specific (GoodJob's `total_limit`, Sidekiq Enterprise
- uniqueness, etc.), so you write a small job in your app that wraps
- the loop with whatever dedup your adapter provides. Example for
- GoodJob:
+ When you call `perform_later`, the gem evaluates your `context` proc
+ and upserts the partition row with the resulting hash:
 
- ```ruby
- # app/jobs/dispatch_tick_loop_job.rb
- class DispatchTickLoopJob < ApplicationJob
-   include GoodJob::ActiveJobExtensions::Concurrency
-   good_job_control_concurrency_with(
-     total_limit: 1,
-     key: -> { "dispatch_tick_loop:#{arguments.first || 'all'}" }
-   )
-
-   def perform(policy_name = nil)
-     deadline = Time.current + DispatchPolicy.config.tick_max_duration
-     DispatchPolicy::TickLoop.run(
-       policy_name: policy_name,
-       stop_when: -> {
-         GoodJob.current_thread_shutting_down? || Time.current >= deadline
-       }
-     )
-     # Self-chain so the next run starts immediately; cron below is a safety net.
-     DispatchTickLoopJob.set(wait: 1.second).perform_later(policy_name)
-   end
- end
+ ```sql
+ INSERT INTO dispatch_policy_partitions (..., context, context_updated_at, ...) VALUES (...)
+ ON CONFLICT (policy_name, partition_key) DO UPDATE
+   SET context = EXCLUDED.context,
+       context_updated_at = EXCLUDED.context_updated_at,
+       pending_count = dispatch_policy_partitions.pending_count + 1,
+   ...
  ```
 
- Schedule it (every 10s as a safety net — the self-chain keeps one
- alive under normal operation):
+ Gates evaluate against `partition.context`, **not** the per-job
+ snapshot in `staged_jobs.context`. So if a tenant bumps their
+ `dispatch_concurrency` from 5 to 20 and a new job arrives, the next
+ admission uses the new value — no need to drain the partition
+ first. If a partition has no new traffic, the context stays at the
+ value seen by the last enqueue.
+
+ ## Retry strategies
+
+ By default a retry produced by `retry_on` re-enters the policy and
+ is staged again, so throttle/concurrency apply equally to first
+ attempts and retries. Use `retry_strategy :bypass` if you want
+ retries to skip the gem and go straight to the adapter:
 
  ```ruby
- # config/application.rb
- config.good_job.cron = {
-   dispatch_tick_loop: {
-     cron: "*/10 * * * * *",
-     class: "DispatchTickLoopJob"
-   }
- }
+ dispatch_policy :foo do
+   partition_by ->(_c) { "k" }
+   gate :throttle, rate: 5, per: 60
+   retry_strategy :bypass
+ end
  ```
 
- For adapters without a first-class dedup mechanism, implement it
- yourself (e.g. `pg_try_advisory_lock` inside `perform`) before calling
- `DispatchPolicy::TickLoop.run`.
+ ## Compatibility
 
- ## Admin UI
-
- `DispatchPolicy::Engine` ships a read-only admin mounted wherever
- you like. Features:
-
- - Policy index with pending / admitted / completed-24h totals.
- - Per-policy page with a **partition breakdown** (watched + searchable
-   list) showing pending-eligible / pending-scheduled / in-flight /
-   completed / adaptive cap / EWMA latency / last enqueue / last
-   dispatch per partition.
- - Line chart of avg EWMA queue lag (last hour, per minute) with
-   completions-per-minute bars behind it.
- - Per-partition sparkline with the same overlay; click to watch /
-   unwatch. Watched set is persisted in `localStorage` and synced into
-   the URL so reloading keeps your view.
- - Opt-in auto-refresh (off / 2s / 5s / 15s) stored in `localStorage`.
-   Page updates via Turbo morph — scroll position and tooltips survive.
+ - Rails 7.1+ (developed against 8.1).
+ - PostgreSQL 12+ (uses `FOR UPDATE SKIP LOCKED`, `JSONB`, `ON CONFLICT`).
+ - `good_job` 4.0 or `solid_queue` 1.0.
+ - Sidekiq / Resque are NOT supported — the at-least-once guarantee
+   needs the adapter to share Postgres with the gem.
 
  ## Testing
 
- ```
- bundle install
- bundle exec rake test
+ ```bash
+ bundle exec rake test         # 124 runs / 284 assertions
+ bundle exec rake bench        # manual benchmark suite (creates dispatch_policy_bench DB)
+ bundle exec rake bench:real   # end-to-end against good_job on the dummy DB
+ bundle exec rake bench:limits # stretches every path to its breaking point
  ```
 
- Tests require a PostgreSQL instance (uses `ON CONFLICT`, partial
- indexes, `FOR UPDATE SKIP LOCKED`, `jsonb`). `PGUSER` / `PGHOST` /
- `PGPASSWORD` env vars override the defaults in
- `test/dummy/config/database.yml`.
+ Integration tests skip when no Postgres is reachable (default DB
+ `dispatch_policy_test`; override via `DB_NAME`, `DB_HOST`,
+ `DB_USER`, `DB_PASS`).
 
  ## Releasing
 
- The gem uses the standard `bundler/gem_tasks` flow — there is no
- release automation in CI. To cut a new version:
-
- 1. Bump `DispatchPolicy::VERSION` in `lib/dispatch_policy/version.rb`
-    following SemVer. While the API is marked experimental, breaking
-    changes go in a minor bump and should be called out in the changelog.
- 2. Add a section to `CHANGELOG.md` above the previous one, grouping
-    entries (Added / Changed / Fixed / Removed). Link any relevant PRs.
- 3. Make sure the working tree is on `master`, clean, and CI is green
-    (`bundle exec rake test` locally for a sanity check).
- 4. Commit: `git commit -am "Release vX.Y.Z"`.
- 5. `bundle exec rake release` — Bundler will build the `.gem` into
-    `pkg/`, tag `vX.Y.Z`, push the commit and tag, and `gem push` to
-    RubyGems. The gemspec sets `rubygems_mfa_required`, so have your
-    OTP ready (`gem signin` first if you aren't authenticated).
- 6. Optional: publish a GitHub release from the tag, e.g.
-    `gh release create vX.Y.Z --notes-from-tag`, or paste the
-    changelog section into the release notes.
-
- If `rake release` fails partway through (e.g. RubyGems push rejects
- the version), do not retry blindly — inspect what already happened
- (tag created? commit pushed?) and clean up before re-running, since
- Bundler won't re-tag an existing version.
+ Cutting a new version is driven by `bin/release`. Steps:
+
+ 1. Bump `DispatchPolicy::VERSION` in
+    `lib/dispatch_policy/version.rb`.
+ 2. Add a `## <VERSION>` section in `CHANGELOG.md` describing the
+    release. The script extracts that section verbatim as the
+    GitHub release notes, so anything missing here will be missing
+    on GitHub.
+ 3. Commit both on `master` and push so `origin/master` matches
+    local.
+ 4. Run the script from the repo root:
+
+ ```bash
+ bin/release
+ ```
+
+ The script:
+
+ - Refuses to run unless you are on `master`, the working tree is
+   clean, the local branch matches `origin/master`, and the tag
+   `v<VERSION>` does not yet exist.
+ - Asks for a `y` confirmation before doing anything.
+ - Hands off to `bundle exec rake release` (builds the gem, creates
+   the `v<VERSION>` tag, pushes the tag to GitHub, pushes the gem to
+   RubyGems.org).
+ - Creates a GitHub release for `v<VERSION>` using the matching
+   CHANGELOG section as the body. Requires the `gh` CLI; if it is
+   missing, the gem ships but you'll need to create the GitHub
+   release manually with `gh release create v<VERSION> --notes-file
+   CHANGELOG.md`.
+
+ Prerequisites: a configured `~/.gem/credentials` for RubyGems push
+ and `gh auth login` for the GitHub release.
+
+ ## Status
+
+ Published on RubyGems. API may still shift between minors until
+ 1.0. The set of features that ship today:
+
+ - Gates: `:throttle`, `:concurrency`, `:adaptive_concurrency`.
+ - Fairness: in-tick EWMA reorder + optional `tick_admission_budget`.
+ - Sharding: `shard_by` + per-shard tick loops.
+ - Bulk handoff: `ActiveJob.perform_all_later` collapses to one
+   adapter `INSERT` per tick when admissible.
+ - Admin UI with capacity hints, pending trend, denial reasons.
+ - Manual benchmark suite.
+
+ Deferred ideas (with rationale) live in [`IDEAS.md`](IDEAS.md):
+ `gate :global_cap`, smarter sweeper defaults, `sweep_every_seconds`
+ instead of `sweep_every_ticks`.
 
  ## License