dispatch_policy 0.1.0
- checksums.yaml +7 -0
- data/CHANGELOG.md +12 -0
- data/MIT-LICENSE +21 -0
- data/README.md +435 -0
- data/app/controllers/dispatch_policy/application_controller.rb +9 -0
- data/app/controllers/dispatch_policy/policies_controller.rb +269 -0
- data/app/models/dispatch_policy/adaptive_concurrency_stats.rb +89 -0
- data/app/models/dispatch_policy/application_record.rb +7 -0
- data/app/models/dispatch_policy/partition_inflight_count.rb +42 -0
- data/app/models/dispatch_policy/partition_observation.rb +49 -0
- data/app/models/dispatch_policy/staged_job.rb +105 -0
- data/app/models/dispatch_policy/throttle_bucket.rb +41 -0
- data/app/views/dispatch_policy/policies/index.html.erb +52 -0
- data/app/views/dispatch_policy/policies/show.html.erb +241 -0
- data/app/views/layouts/dispatch_policy/application.html.erb +266 -0
- data/config/routes.rb +6 -0
- data/db/migrate/20260424000001_create_dispatch_policy_tables.rb +80 -0
- data/db/migrate/20260424000002_create_adaptive_concurrency_stats.rb +22 -0
- data/db/migrate/20260424000003_create_adaptive_concurrency_samples.rb +25 -0
- data/db/migrate/20260424000004_rename_samples_to_partition_observations.rb +32 -0
- data/lib/dispatch_policy/active_job_perform_all_later_patch.rb +32 -0
- data/lib/dispatch_policy/dispatch_context.rb +53 -0
- data/lib/dispatch_policy/dispatchable.rb +120 -0
- data/lib/dispatch_policy/engine.rb +36 -0
- data/lib/dispatch_policy/gate.rb +49 -0
- data/lib/dispatch_policy/gates/adaptive_concurrency.rb +123 -0
- data/lib/dispatch_policy/gates/concurrency.rb +43 -0
- data/lib/dispatch_policy/gates/fair_interleave.rb +32 -0
- data/lib/dispatch_policy/gates/global_cap.rb +26 -0
- data/lib/dispatch_policy/gates/throttle.rb +52 -0
- data/lib/dispatch_policy/install_generator.rb +23 -0
- data/lib/dispatch_policy/policy.rb +73 -0
- data/lib/dispatch_policy/tick.rb +214 -0
- data/lib/dispatch_policy/tick_loop.rb +45 -0
- data/lib/dispatch_policy/version.rb +5 -0
- data/lib/dispatch_policy.rb +64 -0
- metadata +182 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 78151d6562bbaa7ef349966a0c968ee26cc3c2c1830cb3f33461c5c1d5f66303
  data.tar.gz: 29360c499ccdab8bb98c865bfcf522efad89facd287ee335beb9eeec302cb5cf
SHA512:
  metadata.gz: 306526c1343773820a6a1df453a716551201433555843869b841c85ffefaea3b61c64c6b29cec2c7105718ba019f92c3dd19180ccd837260dd65c0b317fb75e5
  data.tar.gz: 8d5372b9feb3857b45ad0745feae75a123fa68a219d936ed6bf9af5eb4c9f97dbc183cae86cb36bf155b65c47137f4c897a90f2f5b85112fc9099397bdf35ca3
data/CHANGELOG.md
ADDED
@@ -0,0 +1,12 @@
# Changelog

## 0.1.0

Initial release.

- Rails engine + ActiveJob integration (`DispatchPolicy::Dispatchable`).
- Gates: `:throttle`, `:concurrency`, `:global_cap`, `:fair_interleave`, `:adaptive_concurrency`.
- Staged jobs with dedupe, round-robin fairness, per-partition counters, and throttle buckets.
- Admin UI (Chart.js + Turbo) with watched partitions, sparklines, and EWMA queue-lag charts.
- PostgreSQL required (uses `FOR UPDATE SKIP LOCKED`, `ON CONFLICT`, and `jsonb`).
- Experimental — being trialed on pulso.run.
data/MIT-LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 José Galisteo

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,435 @@
# DispatchPolicy

> **⚠️ Experimental.** The API, schema, and defaults can change between
> minor releases without notice. DispatchPolicy is currently running in
> production on [pulso.run](https://pulso.run) — that's how we learn
> what breaks. If you pick it up for your own project, pin the exact
> version and expect to follow the changelog.
>
> **PostgreSQL only (11+).** The staging, admission, and fairness
> machinery lean on `jsonb`, partial indexes, `FOR UPDATE SKIP LOCKED`,
> `ON CONFLICT`, and `CROSS JOIN LATERAL`. MySQL/SQLite support isn't
> closed off as a goal — being drop-in across every ActiveJob backend
> is the long-term direction — but it would take meaningful rework
> (shadow columns for `jsonb`, full indexes instead of partial, a
> different batch-fetch strategy for fairness). Contributions welcome.

Per-partition admission control for ActiveJob. Stages `perform_later`
into a dedicated table, runs a tick loop that admits jobs through
declared gates (throttle, concurrency, global_cap, fair_interleave,
adaptive_concurrency), then forwards survivors to the real adapter.

Use it when you need:

- **Per-tenant / per-endpoint throttle** that's exact (token bucket)
  instead of best-effort enqueue-side.
- **Per-partition concurrency** with a proper release hook on job
  completion (and lease-expiry recovery if the worker dies mid-perform).
- **Adaptive concurrency** — a cap that shrinks under queue pressure
  and grows back when workers keep up, without manual tuning.
- **Dedupe** against a partial unique index, not an in-memory key.
- **Round-robin fairness across tenants** (LATERAL batch fetch) so one
  tenant's burst can't starve the others.

## Install

Add to your `Gemfile`:

```ruby
gem "dispatch_policy"
```

Copy the migration and run it:

```
bundle exec rails dispatch_policy:install:migrations
bundle exec rails db:migrate
```

Mount the admin UI in `config/routes.rb` (optional):

```ruby
mount DispatchPolicy::Engine => "/admin/dispatch_policy"
```

Configure in `config/initializers/dispatch_policy.rb`:

```ruby
DispatchPolicy.configure do |c|
  c.enabled = ENV.fetch("DISPATCH_POLICY_ENABLED", "true") != "false"
  c.lease_duration = 15.minutes
  c.batch_size = 500
  c.round_robin_quantum = 50
  c.tick_sleep = 1         # idle
  c.tick_sleep_busy = 0.05 # after productive ticks
end
```

## Flow

```
ActiveJob#perform_later
  → Dispatchable#enqueue
  → StagedJob.stage! (insert into dispatch_policy_staged_jobs, pending)

(tick loop, periodically)
  → SELECT pending FOR UPDATE SKIP LOCKED
  → Run gates in declared order; survivors are the admitted set
  → StagedJob#mark_admitted! (increment counters, set admitted_at)
  → job.enqueue(_bypass_staging: true) (hand off to the real adapter)

(worker runs perform)
  → Dispatchable#around_perform
  → block.call
  → release counters, mark StagedJob completed_at, record observation
```

## Declaring a policy

```ruby
class SendWebhookJob < ApplicationJob
  include DispatchPolicy::Dispatchable

  dispatch_policy do
    # Persisted in the staged row so gates can read it without touching AR.
    context ->(args) {
      event = args.first
      { endpoint_id: event.endpoint_id, rate_limit: event.endpoint.rate_limit }
    }

    # Partial unique index dedupes identical keys while the previous is pending.
    dedupe_key ->(args) { "event:#{args.first.id}" }

    # Tenant fairness — see the "Round-robin" section below.
    round_robin_by ->(args) { args.first.account_id }

    gate :throttle,
      rate: ->(ctx) { ctx[:rate_limit] },
      per: 1.minute,
      partition_by: ->(ctx) { ctx[:endpoint_id] }

    gate :fair_interleave
  end

  def perform(event) = event.deliver!
end
```

`perform_later` stages the job; the tick admits it when its gates pass.

## Gates

Gates run in declared order, each narrowing the survivor set. Any option
that takes a value can alternatively take a lambda that receives the
`ctx` hash, so parameters can depend on per-job data.

### `:concurrency` — in-flight cap per partition

Caps the number of admitted-but-not-yet-completed jobs in each
partition. Tracks in-flight counts in
`dispatch_policy_partition_counts`; decremented by the `around_perform`
hook when the job finishes, or by the reaper when a lease expires
(worker crashed).

```ruby
gate :concurrency,
  max: ->(ctx) { ctx[:max_per_account] || 5 },
  partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
```

When to reach for it: external APIs with per-tenant concurrency limits,
database-heavy jobs you don't want to pile up per customer, anything
where "at most N running at once for this key" matters.

### `:throttle` — token-bucket rate limit per partition

Refills `rate` tokens every `per` seconds, capped at `burst` (defaults
to `rate`). Admits jobs while tokens are available; leaves the rest
pending for the next tick.

```ruby
gate :throttle,
  rate: 100,     # tokens
  per: 1.minute, # refill window
  burst: 100,    # bucket cap (optional, defaults to rate)
  partition_by: ->(ctx) { "host:#{ctx[:host]}" }
```

`rate` and `burst` accept lambdas, so the limit can come from
configuration stored alongside the thing being rate-limited:

```ruby
gate :throttle,
  rate: ->(ctx) { ctx[:rate_limit] },
  per: 1.minute,
  partition_by: ->(ctx) { ctx[:endpoint_id] }
```

Unlike `:concurrency`, throttle does **not** release tokens on job
completion — tokens refill only with elapsed time.
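
A rough sketch of the bucket math, assuming continuous refill (the gem's
`ThrottleBucket` model holds the real state; the helper below is an
illustration, not its API):

```ruby
# Tokens available after `elapsed` seconds, given the options above.
# Refill is proportional to elapsed time; the balance never exceeds burst.
def refilled_tokens(tokens, elapsed, rate:, per:, burst: rate)
  [tokens + elapsed * (rate.to_f / per), burst].min
end

refilled_tokens(0, 30, rate: 100, per: 60)  # => 50.0  (half a window elapsed)
refilled_tokens(80, 60, rate: 100, per: 60) # => 100.0 (capped at burst)
```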

### `:global_cap` — single cap across all partitions

A global version of `:concurrency`: at most `max` jobs admitted
simultaneously across the whole policy, regardless of partition.
Useful as a safety ceiling on top of per-partition limits.

```ruby
gate :concurrency, max: 10, partition_by: ->(ctx) { ctx[:tenant] }
gate :global_cap, max: 200
```

Reads: "up to 10 in flight per tenant, but never more than 200 total".

### `:fair_interleave` — round-robin ordering across partitions

Not a filter — a reordering step. Groups the batch by its primary
partition and interleaves, so no single partition can starve others
even if it has many pending jobs.

```ruby
gate :concurrency, max: 10, partition_by: ->(ctx) { "acct:#{ctx[:account_id]}" }
gate :fair_interleave
```

Place it after a gate that assigned partitions; interleaving is keyed
off the first partition a row picked up.
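
The reordering idea as a standalone sketch (not the gate's code; it only
shows what "group by partition and interleave" does to a batch):

```ruby
# Partition A dominates the batch; interleaving alternates partitions so
# B and C get admitted in the same pass instead of waiting behind A.
batch  = %w[A1 A2 A3 A4 B1 B2 C1]
groups = batch.group_by { |row| row[0] }.values # [["A1", ..., "A4"], ["B1", "B2"], ["C1"]]
rounds = groups.map(&:length).max
rounds.times.flat_map { |i| groups.map { |g| g[i] } }.compact
# => ["A1", "B1", "C1", "A2", "B2", "A3", "A4"]
```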

### `:adaptive_concurrency` — per-partition cap that self-tunes

The cap per partition (`current_max`) shrinks when the adapter queue
backs up (EWMA of queue lag > `target_lag_ms`) or when performs raise;
grows back by +1 when lag stays under target. AIMD loop on a
per-partition stats row (`dispatch_policy_adaptive_concurrency_stats`).

```ruby
gate :adaptive_concurrency,
  partition_by: ->(ctx) { ctx[:account_id] },
  initial_max: 3,
  target_lag_ms: 1000, # acceptable queue wait before admission
  min: 1               # floor so a partition can't be locked out
```

- **Feedback signal**: `admitted_at → perform_start` (queue wait in the
  real adapter). Pure saturation signal — slow performs in the
  downstream service don't punish admissions if workers still drain
  the queue quickly.
- **Growth**: +1 per fast success. No hard ceiling; the algorithm
  self-limits via `target_lag_ms`. If the queue builds up, the cap
  shrinks multiplicatively.
- **Failure**: `current_max *= 0.5` (halve) when `perform` raises.
- **Slow**: `current_max *= 0.95` when EWMA lag > target.
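
Condensed into code, one update step looks roughly like this (a sketch of
the rules above, not the gem's `AdaptiveConcurrencyStats` internals; the
smoothing factor and helper name are illustrative choices):

```ruby
# Per-partition AIMD step, run when a job finishes. `lag_ms` is the
# admitted_at → perform_start wait observed for that job.
def next_cap(current_max, ewma_lag_ms, lag_ms, error:, target_lag_ms:, min: 1, alpha: 0.2)
  ewma_lag_ms = alpha * lag_ms + (1 - alpha) * ewma_lag_ms # smooth the queue-wait signal

  current_max =
    if error                          then current_max * 0.5  # halve on a raised perform
    elsif ewma_lag_ms > target_lag_ms then current_max * 0.95 # gentle shrink while lagging
    else                                   current_max + 1    # additive growth when keeping up
    end

  [[current_max, min].max, ewma_lag_ms]
end
```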

### Choosing `target_lag_ms`

It's the knob that trades latency for throughput. Rough guide:

- **Too low** (e.g. 10-50 ms). The gate reacts to every tiny bump in
  queue wait and shrinks the cap aggressively. Workers can end up
  idle with jobs still pending admission because the cap is
  overcorrecting — classic contention / overshoot.
- **Too high** (e.g. 30 s). The gate barely ever pushes back, so
  you get near-maximum throughput at the cost of real queue buildup;
  newly admitted jobs may wait seconds before a worker picks them
  up.
- **Reasonable starting point**: `≈ worker_max_threads × avg_perform_ms`.
  If you run 5 workers at ~200 ms/perform, `target_lag_ms: 1000`
  means "it's OK if the adapter queue stays at most ~1 second
  deep". You'll want to tune from there based on what your
  downstream tolerates and how fast you want bursts to drain.

Pair it with `round_robin_by` for multi-tenant systems that want
automatic backpressure without hand-tuned caps per tenant:

```ruby
round_robin_by ->(args) { args.first[:account_id] }
gate :adaptive_concurrency,
  partition_by: ->(ctx) { ctx[:account_id] },
  initial_max: 3,
  target_lag_ms: 1000
```

## Queues and partitioning

DispatchPolicy operates at the **policy** (class) level. A job's
ActiveJob `queue` and `priority` travel through staging into admission
and on to the real adapter — workers of each queue pick up their jobs
normally — but neither affects which staged rows the gates see. All
enqueues of the same job class share one policy, one throttle bucket,
one concurrency cap.

Two consequences to be aware of:

- Enqueuing the same job to different queues does **not** give one
  queue priority at admission; they share the policy's gates. If
  urgent work should jump ahead, set a lower ActiveJob `priority`
  (the admission SELECT is `ORDER BY priority, staged_at`; see the
  example after this list) — or split into a subclass with its own
  policy.
- `dedupe_key` is queue-agnostic: the same key enqueued to
  `:urgent` and `:low` dedupes to one row.
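
For example, with the `SendWebhookJob` policy above (lower `priority`
values sort first in the admission SELECT):

```ruby
SendWebhookJob.set(priority: 0).perform_later(urgent_event)   # considered first,
SendWebhookJob.set(priority: 10).perform_later(routine_event) # same gates for both
```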

### Using queue as a partition

The context hash has `queue_name` and `priority` injected automatically
at stage time (user-supplied keys win). Use them in any `partition_by`:

```ruby
class SendEmailJob < ApplicationJob
  include DispatchPolicy::Dispatchable

  dispatch_policy do
    context ->(args) { { account_id: args.first.account_id } }

    # Separate throttle bucket per (queue, account) — urgent and default
    # don't share rate tokens.
    gate :throttle,
      rate: 100,
      per: 1.minute,
      partition_by: ->(ctx) { "#{ctx[:queue_name]}:#{ctx[:account_id]}" }
  end
end

SendEmailJob.set(queue: :urgent).perform_later(user)
SendEmailJob.set(queue: :default).perform_later(user)
# → two partitions, each with its own bucket.
```

If you'd rather keep the two streams fully isolated (separate policies,
admin rows, and dedupe scopes), subclass:

```ruby
class UrgentEmailJob < SendEmailJob
  queue_as :urgent
  dispatch_policy do
    context ->(args) { { account_id: args.first.account_id } }
    gate :throttle, rate: 500, per: 1.minute, partition_by: ->(ctx) { ctx[:account_id] }
  end
end
```

## Dedupe

`dedupe_key` is enforced by a partial unique index on
`(policy_name, dedupe_key) WHERE completed_at IS NULL`. Semantics:

- Re-enqueuing while a previous staged row is pending or admitted →
  silently dropped.
- Re-enqueuing after the previous completes → fresh staged row.
- Returning `nil` from the lambda → no dedup for that enqueue.

Typical pattern: `"<domain>:<entity>:<id>"` (`"monitor:42"`,
`"event:abc123"`). Keep it stable for the duration of a logical unit
of work.
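
In migration terms the index looks roughly like this (you don't create it
yourself; the gem's installed migration does, and its exact definition may
differ):

```ruby
add_index :dispatch_policy_staged_jobs,
          [:policy_name, :dedupe_key],
          unique: true,
          where: "completed_at IS NULL"
```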

## Round-robin batching (tenant fairness)

For policies where every tenant should keep making progress even
when one suddenly enqueues 100× its normal volume, neither throttle
nor concurrency is a good fit — you want max throughput, just
fairness. `round_robin_by` solves it at the batch SELECT layer:

```ruby
dispatch_policy do
  context ->(args) { { account_id: args.first.account_id } }
  round_robin_by ->(args) { args.first.account_id }
end
```

At stage time the lambda's result is written into the dedicated
`round_robin_key` column (indexed). `Tick.run` then uses a two-phase
fetch:

1. **LATERAL join** — distinct keys × per-key `LIMIT round_robin_quantum`.
   Guarantees each active tenant gets at least `quantum` rows per
   tick, so a tenant with 10 pending is served in the same tick as
   a tenant with 50k pending.
2. **Top-up** — if the fairness floor doesn't fill `batch_size`, the
   remaining slots go to the oldest pending (excluding the ids
   already locked). Keeps single-tenant throughput at full capacity.

Cost per tick is O(`quantum × active_keys`), not O(backlog) — so
admission stays snappy even with thousands of distinct tenants.
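
Phase 1 in SQL terms looks roughly like this (illustrative only, not the
query `Tick.run` builds; the column names are assumptions based on this
README, and "pending" is approximated as `admitted_at IS NULL`):

```ruby
# Fairness floor: for each distinct round_robin_key, take up to
# `round_robin_quantum` of its oldest pending rows. The real tick also
# locks rows (FOR UPDATE SKIP LOCKED) and tops up to batch_size afterwards.
DispatchPolicy::StagedJob.find_by_sql(<<~SQL)
  SELECT picked.*
  FROM (
    SELECT DISTINCT round_robin_key
    FROM dispatch_policy_staged_jobs
    WHERE policy_name = 'SendWebhookJob' AND admitted_at IS NULL
  ) keys
  CROSS JOIN LATERAL (
    SELECT *
    FROM dispatch_policy_staged_jobs j
    WHERE j.policy_name = 'SendWebhookJob'
      AND j.admitted_at IS NULL
      AND j.round_robin_key = keys.round_robin_key
    ORDER BY j.priority, j.staged_at
    LIMIT 50  -- round_robin_quantum
  ) picked
SQL
```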

## Running the tick

The gem exposes `DispatchPolicy::TickLoop.run(policy_name:, stop_when:)`
but **does not ship a tick job** — concurrency semantics are
queue-adapter specific (GoodJob's `total_limit`, Sidekiq Enterprise
uniqueness, etc.), so you write a small job in your app that wraps
the loop with whatever dedup your adapter provides. Example for
GoodJob:

```ruby
# app/jobs/dispatch_tick_loop_job.rb
class DispatchTickLoopJob < ApplicationJob
  include GoodJob::ActiveJobExtensions::Concurrency
  good_job_control_concurrency_with(
    total_limit: 1,
    key: -> { "dispatch_tick_loop:#{arguments.first || 'all'}" }
  )

  def perform(policy_name = nil)
    deadline = Time.current + DispatchPolicy.config.tick_max_duration
    DispatchPolicy::TickLoop.run(
      policy_name: policy_name,
      stop_when: -> {
        GoodJob.current_thread_shutting_down? || Time.current >= deadline
      }
    )
    # Self-chain so the next run starts immediately; cron below is a safety net.
    DispatchTickLoopJob.set(wait: 1.second).perform_later(policy_name)
  end
end
```

Schedule it (every 10s as a safety net — the self-chain keeps one
alive under normal operation):

```ruby
# config/application.rb
config.good_job.cron = {
  dispatch_tick_loop: {
    cron: "*/10 * * * * *",
    class: "DispatchTickLoopJob"
  }
}
```

For adapters without a first-class dedup mechanism, implement it
yourself (e.g. `pg_try_advisory_lock` inside `perform`) before calling
`DispatchPolicy::TickLoop.run`.
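
A sketch of that fallback (the lock key and the 60-second deadline are
arbitrary choices for the example, not gem defaults):

```ruby
class DispatchTickLoopJob < ApplicationJob
  ADVISORY_LOCK_KEY = 421_337 # any app-wide integer; advisory locks are keyed by number

  def perform(policy_name = nil)
    conn = ActiveRecord::Base.connection
    locked = ActiveModel::Type::Boolean.new.cast(
      conn.select_value("SELECT pg_try_advisory_lock(#{ADVISORY_LOCK_KEY})")
    )
    return unless locked # another process is already running the loop

    begin
      deadline = Time.current + 60
      DispatchPolicy::TickLoop.run(
        policy_name: policy_name,
        stop_when: -> { Time.current >= deadline }
      )
    ensure
      conn.execute("SELECT pg_advisory_unlock(#{ADVISORY_LOCK_KEY})")
    end
  end
end
```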

## Admin UI

`DispatchPolicy::Engine` ships a read-only admin mounted wherever
you like. Features:

- Policy index with pending / admitted / completed-24h totals.
- Per-policy page with a **partition breakdown** (watched + searchable
  list) showing pending-eligible / pending-scheduled / in-flight /
  completed / adaptive cap / EWMA latency / last enqueue / last
  dispatch per partition.
- Line chart of avg EWMA queue lag (last hour, per minute) with
  completions-per-minute bars behind it.
- Per-partition sparkline with the same overlay; click to watch /
  unwatch. Watched set is persisted in `localStorage` and synced into
  the URL so reloading keeps your view.
- Opt-in auto-refresh (off / 2s / 5s / 15s) stored in `localStorage`.
  Page updates via Turbo morph — scroll position and tooltips survive.

## Testing

```
bundle install
bundle exec rake test
```

Tests require a PostgreSQL instance (uses `ON CONFLICT`, partial
indexes, `FOR UPDATE SKIP LOCKED`, `jsonb`). `PGUSER` / `PGHOST` /
`PGPASSWORD` env vars override the defaults in
`test/dummy/config/database.yml`.

## License

MIT.