
dispatch_policy 0.4.3 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

checksums.yaml +4 -4
data/CHANGELOG.md +185 -0
data/README.md +30 -7
data/app/controllers/dispatch_policy/application_controller.rb +21 -2
data/app/controllers/dispatch_policy/dashboard_controller.rb +3 -0
data/app/controllers/dispatch_policy/partitions_controller.rb +51 -15
data/app/controllers/dispatch_policy/policies_controller.rb +26 -4
data/app/models/dispatch_policy/policy_setting.rb +14 -0
data/app/views/dispatch_policy/dashboard/index.html.erb +6 -1
data/app/views/dispatch_policy/partitions/index.html.erb +1 -1
data/app/views/dispatch_policy/partitions/show.html.erb +1 -1
data/app/views/dispatch_policy/policies/index.html.erb +11 -3
data/app/views/dispatch_policy/policies/show.html.erb +13 -4
data/app/views/dispatch_policy/shared/_partition_row.html.erb +9 -2
data/app/views/layouts/dispatch_policy/application.html.erb +21 -25
data/db/migrate/20260501000001_create_dispatch_policy_tables.rb +13 -0
data/lib/dispatch_policy/config.rb +5 -0
data/lib/dispatch_policy/context.rb +12 -2
data/lib/dispatch_policy/cursor_pagination.rb +24 -7
data/lib/dispatch_policy/gates/adaptive_concurrency.rb +14 -0
data/lib/dispatch_policy/gates/concurrency.rb +4 -0
data/lib/dispatch_policy/gates/throttle.rb +36 -9
data/lib/dispatch_policy/inflight_tracker.rb +72 -26
data/lib/dispatch_policy/job_extension.rb +33 -9
data/lib/dispatch_policy/manual_admission.rb +18 -0
data/lib/dispatch_policy/operator_hints.rb +14 -0
data/lib/dispatch_policy/policy.rb +12 -0
data/lib/dispatch_policy/policy_dsl.rb +10 -2
data/lib/dispatch_policy/railtie.rb +10 -0
data/lib/dispatch_policy/registry.rb +8 -4
data/lib/dispatch_policy/repository.rb +102 -30
data/lib/dispatch_policy/tick.rb +18 -2
data/lib/dispatch_policy/tick_loop.rb +15 -7
data/lib/dispatch_policy/version.rb +1 -1
data/lib/generators/dispatch_policy/install/templates/create_dispatch_policy_tables.rb.tt +9 -0
data/lib/generators/dispatch_policy/install/templates/dispatch_tick_loop_job.rb.tt +30 -2
metadata +2 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 23433a64c963b0e0908c185ad8dc8e6f97edbd8d476ee712d15023f74ba0e338
-  data.tar.gz: 64ff19e04a6d02b0f1eedb4fb6d74b0e073e3773efb9a3afc92ae1a3e9002aeb
+  metadata.gz: aa4e5f3f353ac2b0de80f0e1c584aa7d53395f75ad4644355cf14070bf36cb71
+  data.tar.gz: 01607ab2a98c331f791b46f6c11badbddeec17532d85d78c24d3987020945cae
 SHA512:
-  metadata.gz: e168e049dbb0d399dddc6e84427b7b557474d9ce10cb1983d3f4f24c6fde43ffda9c03c179b4590d9f09d225a8adeb5d7295a10788898ebc4ab0bc47a765163c
-  data.tar.gz: d8ef9debaebdf89de7cce28e5fa669484acafd27e4b5e65ff959f6f177c1aaa124cb1215da191c3a6759094aac467980489241062f907574860b7226dd9dbc9a
+  metadata.gz: c7ead479e4a623510eee9a4cfcf3a58e1cec0fb387239a8149743990b4bc560f9cab4883ffa2fa93bf9eb81a580cbc458425fdaa3f356c708f037e04341ebd52
+  data.tar.gz: b6c18523ea1f59184631bc7f0b471fdead3ce48166f65217e46c48de40214fc76a6364c5faf6d9c2b236f3b725cdde3cee7197d84fa2110c8fc7bda4aa891e7a

data/CHANGELOG.md CHANGED

@@ -1,5 +1,190 @@
 # Changelog
+## Unreleased
+## 0.5.0
+### Upgrade notes
+- **New table `dispatch_policy_policy_settings`.** Required by the
+  policy-level pause fix below. New installs get it from the updated
+  install generator. **Existing installs must add it** — the gem ships a
+  single migration, so either re-copy the migration via
+  `rails dispatch_policy:install:migrations` (or hand-apply) or run:
+  ```ruby
+  create_table :dispatch_policy_policy_settings do |t|
+    t.string  :policy_name, null: false
+    t.boolean :paused,      null: false, default: false
+    t.timestamps
+  end
+  add_index :dispatch_policy_policy_settings, :policy_name,
+            unique: true, name: "idx_dp_policy_settings_lookup"
+  ```
+  Until the table exists, the tick's `claim_partitions` raises
+  `PG::UndefinedTable`. One row per policy holds its pause flag; it's the
+  policy-wide source of truth `claim_partitions` consults.
+### Added
+- The `:throttle` gate's `per` now accepts a lambda (like `rate`), so the
+  rate-limit window can depend on per-job context. A resolved `per <= 0`
+  raises.
+- Policy-level **pause** now actually holds the whole policy. The pause
+  flag lives in the new `dispatch_policy_policy_settings` table and is
+  honored by `claim_partitions`, so it also stops partitions that first
+  appear *after* the pause — previously `pause` only flipped the `status`
+  of partition rows that existed at click time, and a tenant's first
+  enqueue afterwards created an `active` partition the next tick admitted.
+  The per-partition `status` update is kept for the partitions-index
+  display; `resume` clears the flag.
+- The admin UI now reflects the policy-level pause flag everywhere
+  (policies index + show, dashboard policy rows, partitions index + show):
+  partitions created after a pause render as effectively paused even
+  though their own `status` is still `active`, the pause/resume button
+  toggles to a single relevant action, and `policies#show` shows a PAUSED
+  badge. The per-policy operator hints also short-circuit to a single
+  "policy is paused" note instead of falsely warning about never-checked
+  partitions / growing backlog while admission is intentionally stopped.
+### Fixed
+- **The admin UI honors `config.database_role`.** The engine controllers
+  query the gem tables through the AR models directly (`Partition`,
+  `StagedJob`, `InflightJob`, `PolicySetting`, `TickSample`), which the
+  `Repository` role wrapper doesn't cover — under multi-DB every dashboard
+  page queried the default writing role (`PG::UndefinedTable` → 500), and
+  `pause`/`resume` updated the partition `status` in the wrong DB while
+  the policy flag went to the right one. An `around_action` in the
+  engine's `ApplicationController` now wraps every action — including view
+  rendering, so lazily-evaluated relations stay routed — in
+  `Repository.with_connection`. No-op without `database_role`.
+- **`pause`/`resume` write the policy flag and the partition statuses in
+  one transaction.** They were two autocommitted statements; a crash
+  between them left the partition list contradicting what admission
+  actually does until the next toggle.
+- **The generated `DispatchTickLoopJob` no longer dies after its first run
+  under good_job.** It re-enqueues itself at the end of `perform`, but
+  `good_job_control_concurrency_with(total_limit: 1)` counts the
+  still-running job in its enqueue check (`unfinished`), so the successor
+  was silently aborted and admission stopped after `tick_max_duration`.
+  Switched to `enqueue_limit: 1` + `perform_limit: 1` (the enqueue check
+  excludes the running job) and the job now logs an error if a re-enqueue
+  is ever refused. solid_queue was unaffected.
+- **`Tick#record_sample!` routes its two AR-model reads through
+  `config.database_role`.** They bypassed the `Repository` role wrapper, so
+  under a separate queue DB they queried the wrong role and the swallowed
+  error meant no `tick_sample` was ever written (empty dashboard/metrics).
+- **Multi-DB (`config.database_role`) is now honored everywhere.** It was
+  only applied at the three admission-TX boundaries (`Tick`,
+  `ManualAdmission`), leaving staging, partition claim, inflight
+  counts/tracking, sweeps and dashboard reads on the default writing role.
+  Under a separate queue DB (e.g. `solid_queue`) with the gem tables
+  there, staging wrote one DB while the tick read another — silent job
+  loss — and the concurrency gate counted inflight rows in a different DB
+  than the tracker wrote them to. Every public `Repository` method now
+  opens inside `connected_to(role:)`; `InflightTracker`'s direct access
+  (lookup + heartbeat thread) is routed too.
+- **A policy may declare each gate type at most once.** Two gates of the
+  same type shared a single `gate_state` key (both throttles wrote
+  `gate_state["throttle"]`), so the merged patch kept only the last gate's
+  bucket and the other then saw a permanently full bucket — silently
+  defeating the stricter limit (the classic 10/min + 600/hour idiom).
+  `Policy#validate!` now raises `InvalidPolicy`; use separate policies for
+  multi-window limits.
+- **Bulk `perform_all_later` correctness.** A job whose declared policy
+  wasn't registered was silently dropped (neither staged nor sent to the
+  adapter); jobs were marked `successfully_enqueued` before the INSERT
+  committed; and the bulk path ignored `bypass_retries`. It now mirrors
+  the single path: unstageable jobs fall through to the adapter, the
+  enqueued flag is set only after `stage_many!` returns, and retries on a
+  `:bypass` policy skip staging.
+- **`ManualAdmission.force!` pre-inserts inflight rows** in the same
+  transaction as the claim, like the Tick. Without it the concurrency
+  gate under-counted force-admitted jobs (UI admit/drain) until each one
+  started performing — an over-admission window proportional to the
+  backlog drained.
+- **Inflight rows are reaped when a job is discarded before performing.**
+  `discard_on ActiveJob::DeserializationError` (and any discard) fires
+  during argument deserialization, before `around_perform`, so
+  `InflightTracker.track`'s `ensure` never ran and the Tick's pre-inserted
+  row sat until the `inflight_queued_stale_after` sweeper (1h), holding a
+  concurrency slot. The railtie now subscribes to `discard.active_job` and
+  deletes the row by `active_job_id`.
+- **`throttle` no longer busy-loops on a zero/nil rate.** A `rate` of `0`
+  or `nil` (e.g. a paused tenant) denied with a NULL `retry_after`, which
+  left the partition immediately eligible — re-claimed and re-evaluated
+  every tick — and clobbered any existing backoff. It now backs off one
+  `per` window, and `bulk_record_partition_denies!` preserves the existing
+  `next_eligible_at` when `retry_after` is NULL instead of nulling it.
+- **`throttle` rate is read as `Float`.** A fractional rate (e.g. `2.5`)
+  kept its fractional part instead of truncating every refill (systematic
+  under-admission), and a sub-unit rate (`rate: 0.5`) accumulates a whole
+  token and admits instead of truncating to `0` and denying forever.
+- **`adaptive_concurrency` validates its tuning knobs.** Out-of-range
+  values silently inverted the AIMD loop: `ewma_alpha: 0` froze the EWMA
+  at its seed so the cap grew unbounded, and a decrease factor `>= 1`
+  turned the multiplicative *decrease* into a positive-feedback *increase*
+  under failure/overload. The constructor now requires
+  `0 < ewma_alpha <= 1` and `0 < failure/overload_decrease_factor < 1`.
+- **`partitions#admit` bounds its count.** An unbounded `count` forced a
+  single `DELETE…RETURNING` + dispatch of the whole backlog in one
+  transaction (bypassing the batching/cap that `drain` uses), and a
+  non-numeric value 500'd. It's now clamped to `[1, 10_000]` with a
+  fallback to `1`.
+- **Forged timestamp pagination cursors no longer 500.** A non-parseable
+  string on a `stale`/`recent` sort bound into a timestamp column and
+  raised `invalid input syntax for type timestamp`. `CursorPagination`
+  now requires a parseable ISO8601 value for timestamp sorts, falling back
+  to the first page otherwise.
+- `stage_many!` chunks its INSERT into batches of 1,000 rows so a bulk
+  `perform_all_later` larger than ~8,191 jobs no longer blows Postgres's
+  65,535 bind-param limit and fails the whole batch.
+- `InflightTracker.track` now inserts the inflight row and spawns the
+  heartbeat inside its `begin/ensure`, so a failure spawning the heartbeat
+  thread can't leave a ghost inflight row behind until the sweeper.
+- `Registry` reads (`fetch`/`names`/`each`/`size`) take the same mutex as
+  `register`/`clear` (snapshotting before iterating in `each`), removing a
+  data race on non-GVL runtimes (JRuby/TruffleRuby).
+- The DSL rejects `tick_admission_budget`/`admission_batch_size` of `0` or
+  negative (a silent full stop of the policy) and the `concurrency` /
+  `adaptive_concurrency` gates reject a negative `full_backoff` (which
+  would put `next_eligible_at` in the past and re-evaluate every tick).
+  `nil` still defers to config.
+- The policy-wide drain passes its remaining budget to each partition so
+  the total can't overshoot the 10,000 cap by nearly 2×, and a drain that
+  only leaves future-scheduled jobs now says "N scheduled for later
+  remain" instead of looping "click drain again" forever.
+- `partitions#show` lists recent staged jobs in the real admission order
+  (`priority DESC, scheduled_at NULLS FIRST, id`) and drops a dead,
+  mis-scoped `@inflight` query.
+- `Context` now exposes indifferent (symbol/string) access at every depth,
+  not just the top level — `ctx[:limits][:max]` no longer silently returns
+  nil when the host wrote a nested hash with symbol keys. `to_jsonb`/`to_h`
+  still return the plain string-keyed hash for storage.
+- The tick loop survives misconfigured pacing: `sweep_every_ticks <= 0`
+  now means "never sweep" instead of raising `ZeroDivisionError`, and a
+  negative `idle_pause`/`busy_pause` is treated as no pause instead of
+  raising in `sleep`. Both previously escaped the loop's rescues and
+  stopped admission.
+- Pass-2 budget redistribution denies (e.g. a throttle emptied after
+  pass-1) now feed the tick sample's denied-reason breakdown, so the
+  dashboard reflects why redistribution stopped.
+- Admin UI: `format_count` keeps the sign of negative values; durations
+  clamp at 0 so app↔DB clock skew can't render "-340ms"; the partition
+  search escapes `%`/`_` so a literal key containing them matches
+  literally; and the refresh/theme controls bind via a single delegated
+  document listener instead of per-button (Turbo's morph refresh dropped
+  the `data-bound` guard, leaking a new listener per refresh).
+- Dummy app: the throttle demos (`slow_api`, `mixed`) honor the form's
+  `per` field via the new callable `per` instead of a hardcoded window
+  (`slow_api` was stuck at 60000s), and the enqueue forms tolerate blank
+  numeric fields / unknown job names instead of 500ing.
+### Internal
+- Corrected the `bulk_record_partition_denies!` comment: `claim_partitions`
+  runs autocommitted, so its `FOR UPDATE SKIP LOCKED` locks don't guard the
+  end-of-tick deny flush — the one-tick-loop-per-(policy,shard) invariant
+  and the `last_checked_at` bump do.
 ## 0.4.3
 ### Fixed
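
Reviewer note on the `stage_many!` entry above: the fix comes down to arithmetic. Postgres caps one statement at 65,535 bind parameters, so the number of bound columns per row decides how many rows fit in a single INSERT. The sketch below assumes 8 bound columns per row; that figure is inferred from the changelog's "~8,191 jobs" and is not confirmed by the diff. `max_rows_per_insert` and `insert_batches` are hypothetical helpers, not the gem's API.

```ruby
PG_MAX_BIND_PARAMS = 65_535 # per-statement limit cited in the changelog
ASSUMED_COLUMNS    = 8      # assumption: inferred from the ~8,191 figure
BATCH_SIZE         = 1_000  # batch size named in the changelog

# Largest row count that still fits in one multi-row INSERT.
def max_rows_per_insert(columns)
  PG_MAX_BIND_PARAMS / columns
end

# Split the rows into batches that are each inserted separately.
def insert_batches(rows, batch_size = BATCH_SIZE)
  rows.each_slice(batch_size).to_a
end

batches = insert_batches((1..20_000).to_a)
puts max_rows_per_insert(ASSUMED_COLUMNS) # 8191
puts batches.size                         # 20
```

With 1,000-row batches, each statement binds 8,000 parameters under this assumption, well below the limit.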

data/README.md CHANGED

@@ -210,6 +210,12 @@ Gates run in declared order; each narrows the survivor count. Every
 option that takes a value can alternatively take a lambda receiving
 the `ctx` hash, so parameters can depend on per-job data.
+A policy may declare each gate type **at most once** — two gates of the
+same type would share a `gate_state` key and corrupt each other's
+persisted state, so the policy raises `InvalidPolicy` at definition
+time. For multi-window rate limiting (e.g. 10/min *and* 600/hour), use
+separate policies.
 ### `:throttle` — token-bucket rate limit per partition
 Refills `rate` tokens every `per` seconds, capped at `rate` (no
@@ -223,9 +229,20 @@ gate :throttle,
      per:  1.minute
 ```
+Both `rate` and `per` accept a lambda receiving the `ctx`, so the rate
+limit and its window can depend on per-job data (e.g. a per-tenant plan
+that sets both). A `per` that resolves to `<= 0` raises.
 Throttle does **not** release tokens on completion — tokens refill
 only with elapsed time.
+`rate` may be fractional (e.g. `2.5`): the bucket keeps the fractional
+part so the long-run rate is exact rather than truncated. A sub-unit
+rate works too — the bucket holds at least one whole token, so e.g.
+`rate: 1, per: 2.seconds` admits one job every two seconds. A `rate`
+of `0` (or `nil`) denies and backs the partition off for one `per`
+window. Prefer expressing low rates via a longer `per`.
 ### `:concurrency` — in-flight cap per partition
 Caps the number of admitted-but-not-yet-completed jobs per partition.
@@ -445,9 +462,12 @@ DispatchPolicy.configure do |c|
 end
 ```
-`Repository.with_connection` wraps the admission TX in
-`connected_to(role:)` when set. Staging tables and the adapter's
-table must live in the same DB for atomicity to hold.
+When set, **every** DB access the gem makes runs inside
+`connected_to(role:)` — staging on `perform_later`, the admission TX,
+inflight tracking and its heartbeat thread, sweeps, and the admin UI
+(an `around_action` routes each dashboard request, so its reads and
+operator actions hit the same DB the tick writes). Staging tables and
+the adapter's table must live in the same DB for atomicity to hold.
 ### Job identity across staging and adapter
@@ -509,7 +529,10 @@ Mount the engine and visit `/dispatch_policy`:
   ("avg tick at 88% of tick_max_duration — shard or lower
   admission_batch_size").
 - **Policies** — per-policy throughput, denial reasons breakdown,
-  top partitions by lifetime/pending, pause/resume/drain.
+  top partitions by lifetime/pending, pause/resume/drain. Pause is a
+  policy-level flag (stored in `dispatch_policy_policy_settings`) the
+  tick honors, so it also holds partitions that first appear *after*
+  the pause; resume clears it.
 - **Partitions** — searchable list, detail view with gate state,
   decayed_admits + admits/min estimate, recent staged jobs,
   force-admit, drain.
@@ -535,13 +558,13 @@ DispatchPolicy.configure do |c|
   c.partition_inactive_after  = 86_400   # GC partitions idle this long
   c.inflight_stale_after      = 300      # GC inflight rows whose worker stopped heartbeating
   c.inflight_queued_stale_after = 3_600  # GC inflight rows admitted but never started (queued)
-  c.inflight_heartbeat_interval = 30     # how often the worker bumps heartbeat_at
-  c.sweep_every_ticks         = 50       # sweeper cadence (in tick iterations)
+  c.inflight_heartbeat_interval = 30     # how often the worker bumps heartbeat_at; 0 disables the thread
+  c.sweep_every_ticks         = 50       # sweeper cadence (in tick iterations); <= 0 never sweeps
   c.metrics_retention         = 86_400   # tick_samples kept this long
   c.fairness_half_life_seconds = 60      # EWMA half-life for in-tick reorder; nil disables
   c.tick_admission_budget      = nil     # global cap on admissions per tick; nil = none
   c.adapter_throughput_target  = nil     # jobs/sec; UI shows admit rate as % of this
-  c.database_role              = nil     # AR role for the admission TX (multi-DB)
+  c.database_role              = nil     # AR role ALL gem DB access runs against (multi-DB)
 end
 ```
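
Reviewer note on the fractional-rate paragraphs in the README hunk above: a token bucket that keeps its balance as a `Float` shows why a sub-unit rate eventually admits, where an integer balance would stay at `0`. This is a minimal sketch, not the gem's `Throttle` gate. `FloatBucket` is a hypothetical class, and the capacity of `max(rate, 1.0)` is an assumption based on the README's "holds at least one whole token".

```ruby
# Hypothetical bucket used only to illustrate Float accounting.
class FloatBucket
  attr_reader :tokens

  def initialize(rate:, per:)
    raise ArgumentError, "per must be > 0" unless per.positive?

    @rate     = rate.to_f
    @per      = per.to_f
    @capacity = [@rate, 1.0].max # assumption: room for one whole token
    @tokens   = 0.0
  end

  # Add the tokens earned over elapsed_seconds, capped at capacity.
  def refill(elapsed_seconds)
    @tokens = [@tokens + elapsed_seconds * @rate / @per, @capacity].min
  end

  # Spend one whole token if available.
  def admit?
    return false if @tokens < 1.0

    @tokens -= 1.0
    true
  end
end

bucket = FloatBucket.new(rate: 0.5, per: 1)
bucket.refill(1)
first = bucket.admit?  # false: only 0.5 tokens so far
bucket.refill(1)
second = bucket.admit? # true: a whole token has accumulated
```

Truncating the rate to an integer would turn `0.5` into `0`, so the balance would never reach one token.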

data/app/controllers/dispatch_policy/application_controller.rb CHANGED

@@ -4,11 +4,24 @@ module DispatchPolicy
   class ApplicationController < ActionController::Base
     protect_from_forgery with: :exception
+    # The dashboard reads and writes the gem tables through the AR models
+    # directly (Partition, StagedJob, InflightJob, PolicySetting,
+    # TickSample), which — unlike Repository — have no role wrapper of
+    # their own. Under multi-DB (config.database_role) those queries would
+    # hit the default writing role, where the gem tables don't live.
+    # Wrapping the whole action keeps view rendering inside the role too,
+    # so lazily-evaluated relations (@partitions etc.) stay routed.
+    around_action :route_database_role
     helper_method :format_time, :format_count, :format_duration_seconds,
                   :format_duration_ms, :sparkline, :registered_policies
     private
+    def route_database_role(&action)
+      Repository.with_connection(&action)
+    end
     def registered_policies
       DispatchPolicy.registry.each.to_a
     end
@@ -20,12 +33,18 @@ module DispatchPolicy
     def format_count(value)
       return "0" if value.nil?
-      value.to_i.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
+      n      = value.to_i
+      sign   = n.negative? ? "-" : ""
+      digits = n.abs.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
+      "#{sign}#{digits}"
     end
     def format_duration_seconds(seconds)
       return "—" if seconds.nil?
-      s = seconds.to_f
+      # A duration is never meaningfully negative; clock skew between the
+      # app and Postgres (timestamps written by now(), subtracted in Ruby)
+      # can yield a small negative — clamp so the UI shows 0ms, not "-340ms".
+      s = [seconds.to_f, 0.0].max
       return "%.0fms" % (s * 1000) if s < 1
       return "%.1fs"  % s          if s < 60
       return "%.1fm"  % (s / 60)   if s < 3600
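
Reviewer note on the `format_count` change above: the old one-liner scanned only digit runs, so the minus sign was dropped. The two method bodies below are copied from the hunk; the `_old` and `_new` suffixes are added here for comparison and do not exist in the gem.

```ruby
# Body of the removed line: the "-" never matches \d, so it is lost.
def format_count_old(value)
  return "0" if value.nil?

  value.to_i.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
end

# Body of the added lines: group the absolute value, then restore the sign.
def format_count_new(value)
  return "0" if value.nil?

  n      = value.to_i
  sign   = n.negative? ? "-" : ""
  digits = n.abs.to_s.reverse.scan(/\d{1,3}/).join(",").reverse
  "#{sign}#{digits}"
end

puts format_count_old(-1_234_567) # 1,234,567
puts format_count_new(-1_234_567) # -1,234,567
```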

data/app/controllers/dispatch_policy/dashboard_controller.rb CHANGED

@@ -69,6 +69,8 @@ module DispatchPolicy
       denied_by = Repository.top_denied_reason_by_policy(since: one_min_ago)
       rt_by     = Repository.partition_round_trip_stats_by_policy
+      paused_policies = PolicySetting.paused.pluck(:policy_name).to_set
       names = (pending_by_policy.keys + in_flight_by_policy.keys).uniq.sort
       @policies = names.map do |name|
         info = pending_by_policy[name] || {}
@@ -79,6 +81,7 @@ module DispatchPolicy
         {
           name:           name,
+          paused:         paused_policies.include?(name),
           pending:        info[:pending] || 0,
           in_flight:      in_flight_by_policy[name] || 0,
           last_admit_at:  info[:last_admit_at],

data/app/controllers/dispatch_policy/partitions_controller.rb CHANGED

@@ -13,7 +13,11 @@ module DispatchPolicy
       base = Partition.all
       base = base.for_policy(params[:policy]) if params[:policy].present?
       base = base.for_shard(params[:shard])   if params[:shard].present?
-      base = base.where("partition_key ILIKE ?", "%#{params[:q]}%") if params[:q].present?
+      if params[:q].present?
+        # Escape %/_ so a literal key containing them (e.g. "discount_50%")
+        # matches literally instead of as ILIKE wildcards.
+        base = base.where("partition_key ILIKE ?", "%#{Partition.sanitize_sql_like(params[:q])}%")
+      end
       base = base.where("pending_count > 0")                         if params[:only_pending] == "1"
       @sort = DispatchPolicy::CursorPagination::SORTS.key?(params[:sort]) ? params[:sort] : DispatchPolicy::CursorPagination::DEFAULT_SORT
@@ -40,6 +44,11 @@ module DispatchPolicy
       @query         = params[:q]
       @only_pending  = params[:only_pending] == "1"
+      # Policy-level pause flags so rows show their EFFECTIVE state: a
+      # partition created after a pause has status 'active' but is not
+      # being admitted (claim_partitions skips the whole policy).
+      @paused_policies = PolicySetting.paused.pluck(:policy_name).to_set
       shards_scope = Partition.all
       shards_scope = shards_scope.for_policy(params[:policy]) if params[:policy].present?
       @shards      = shards_scope.distinct.pluck(:shard).sort
@@ -59,15 +68,25 @@ module DispatchPolicy
     helper_method :pagination_params
     def show
+      # Order matches the tick's claim order (claim_staged_jobs!) so the list
+      # reflects what would actually be admitted first, not the reverse.
       @recent_jobs = StagedJob
         .for_partition(@partition.policy_name, @partition.partition_key)
-        .order(:scheduled_at, :id)
+        .order(Arel.sql("priority DESC, scheduled_at ASC NULLS FIRST, id ASC"))
         .limit(50)
-      @inflight = InflightJob.where(policy_name: @partition.policy_name).limit(50)
+      # The whole policy may be paused even if this partition's own status
+      # is 'active' (it was created after the pause). claim_partitions skips
+      # the policy regardless, so surface the effective state.
+      @policy_paused = PolicySetting.for_policy(@partition.policy_name).pick(:paused) || false
     end
     def admit
-      count     = Integer(params[:count] || 1)
+      # Bound the count: an unbounded value would force a single
+      # DELETE…RETURNING + dispatch of the whole backlog in one transaction,
+      # bypassing the batching/cap that #drain uses precisely to avoid
+      # request timeouts and giant transactions. A non-numeric value falls
+      # back to 1 instead of raising (ArgumentError → 500).
+      count     = (Integer(params[:count], exception: false) || 1).clamp(1, DRAIN_MAX_PER_REQUEST)
       forwarded = ManualAdmission.force!(
         policy_name:   @partition.policy_name,
         partition_key: @partition.partition_key,
@@ -81,19 +100,33 @@ module DispatchPolicy
     # huge backlog can't time the controller out — the operator clicks again
     # for the next batch.
     def drain
-      drained, remaining = self.class.drain_partition!(@partition)
-      notice = if remaining.positive?
-        "Drained #{drained} job(s); #{remaining} still pending — click drain again to continue."
-      else
-        "Drained #{drained} job(s); partition empty."
-      end
+      drained, due_remaining, scheduled_remaining =
+        self.class.drain_partition!(@partition)
+      notice =
+        if due_remaining.positive?
+          "Drained #{drained} job(s); #{due_remaining} still pending — click drain again to continue."
+        elsif scheduled_remaining.positive?
+          # The claim only picks up rows whose scheduled_at has arrived, so
+          # future-scheduled jobs can't be drained now. Saying "click again"
+          # would just loop forwarding zero.
+          "Drained #{drained} job(s); #{scheduled_remaining} scheduled for later remain."
+        else
+          "Drained #{drained} job(s); partition empty."
+        end
       redirect_to partition_path(@partition), notice: notice
     end
-    def self.drain_partition!(partition)
+    # Force-admits up to DRAIN_MAX_PER_REQUEST due jobs in DRAIN_BATCH_SIZE
+    # batches. Optional `cap` lets the policy-wide drain bound the TOTAL
+    # across partitions. Returns [drained, due_remaining, scheduled_remaining]
+    # — due_remaining is claimable-now work the cap left behind;
+    # scheduled_remaining is future-scheduled rows the claim can't touch yet.
+    def self.drain_partition!(partition, cap: DRAIN_MAX_PER_REQUEST)
+      cap     = [cap, DRAIN_MAX_PER_REQUEST].min
       drained = 0
-      while drained < DRAIN_MAX_PER_REQUEST
-        batch_limit = [DRAIN_BATCH_SIZE, DRAIN_MAX_PER_REQUEST - drained].min
+      while drained < cap
+        batch_limit = [DRAIN_BATCH_SIZE, cap - drained].min
         forwarded   = ManualAdmission.force!(
           policy_name:   partition.policy_name,
           partition_key: partition.partition_key,
@@ -103,8 +136,11 @@ module DispatchPolicy
         drained += forwarded
       end
-      remaining = partition.class.where(id: partition.id).pick(:pending_count) || 0
-      [drained, remaining]
+      scope               = StagedJob.for_partition(partition.policy_name, partition.partition_key)
+      due_remaining       = scope.due.count
+      scheduled_remaining = scope.count - due_remaining
+      [drained, due_remaining, scheduled_remaining]
     end
     private
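
Reviewer note on the bounded `count` in `admit` above: the expression can be checked outside Rails. `bounded_count` is a hypothetical wrapper around the expression from the hunk, and `DRAIN_MAX_PER_REQUEST` is set to 10,000 here because the changelog gives the range as `[1, 10_000]`; the constant's definition is not shown in this diff.

```ruby
DRAIN_MAX_PER_REQUEST = 10_000 # assumption: value taken from the changelog

# Parse a request parameter into a count between 1 and the cap.
# Integer(..., exception: false) returns nil instead of raising.
def bounded_count(raw)
  (Integer(raw, exception: false) || 1).clamp(1, DRAIN_MAX_PER_REQUEST)
end

puts bounded_count("25")      # 25
puts bounded_count("abc")     # 1
puts bounded_count(nil)       # 1
puts bounded_count("-5")      # 1
puts bounded_count("999999")  # 10000
```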

data/app/controllers/dispatch_policy/policies_controller.rb CHANGED

@@ -15,12 +15,16 @@ module DispatchPolicy
       # One grouped query for pending / partition count / paused count
       # across every policy instead of three per policy.
       counts_by_policy    = Repository.partition_counts_by_policy
+      # Policy-level pause flags — the source of truth the tick honors
+      # (partitions.status alone misses partitions created after the pause).
+      paused_policies     = PolicySetting.paused.pluck(:policy_name).to_set
       @rows = names.map do |name|
         counts = counts_by_policy[name] || {}
         {
           name:           name,
           registered:     registry_names.include?(name),
+          paused:         paused_policies.include?(name),
           pending:        counts[:pending] || 0,
           in_flight:      in_flight_by_policy[name] || 0,
           partitions:     counts[:partitions] || 0,
@@ -31,6 +35,7 @@ module DispatchPolicy
     def show
       @policy_object = DispatchPolicy.registry.fetch(@policy_name)
+      @paused        = PolicySetting.for_policy(@policy_name).pick(:paused) || false
       @partitions    = Partition.for_policy(@policy_name)
                                 .order(Arel.sql("pending_count DESC, last_admit_at DESC NULLS LAST"))
                                 .limit(100)
@@ -77,17 +82,31 @@ module DispatchPolicy
         in_backoff:           @round_trip[:in_backoff],
         total_partitions:     @totals[:partitions],
         adapter_target_jps:   @capacity[:adapter_target_jps],
-        pending_trend:        @pending_trend
+        pending_trend:        @pending_trend,
+        paused:               @paused
       )
     end
     def pause
-      Partition.for_policy(@policy_name).update_all(status: "paused", updated_at: Time.current)
+      # Policy-level flag is the source of truth the tick honors (so a key
+      # that first appears AFTER the pause is held too). The per-partition
+      # status update is kept for the partitions index display. One TX so
+      # both writes commit or neither: a flag without the statuses (or vice
+      # versa) leaves the partition list contradicting what admission
+      # actually does until the next toggle. set_policy_paused! shares the
+      # connection (same role via around_action), so it joins this TX.
+      Partition.transaction do
+        Repository.set_policy_paused!(policy_name: @policy_name, paused: true)
+        Partition.for_policy(@policy_name).update_all(status: "paused", updated_at: Time.current)
+      end
       redirect_to policy_path(@policy_name), notice: "Policy paused."
     end
     def resume
-      Partition.for_policy(@policy_name).update_all(status: "active", updated_at: Time.current)
+      Partition.transaction do
+        Repository.set_policy_paused!(policy_name: @policy_name, paused: false)
+        Partition.for_policy(@policy_name).update_all(status: "active", updated_at: Time.current)
+      end
       redirect_to policy_path(@policy_name), notice: "Policy resumed."
     end
@@ -103,7 +122,10 @@ module DispatchPolicy
                .each do |partition|
         break if drained >= DRAIN_MAX_PER_REQUEST
-        batch, _ = PartitionsController.drain_partition!(partition)
+        # Pass the REMAINING budget so a single partition can't push the
+        # total past the cap (a fixed per-partition cap could overshoot by
+        # nearly 2× when the first partition nearly fills it).
+        batch, = PartitionsController.drain_partition!(partition, cap: DRAIN_MAX_PER_REQUEST - drained)
         drained += batch
       end
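
Reviewer note on the remaining-budget change above: a small simulation shows the overshoot that the fix removes. It models each partition as a plain backlog size; `drain_one`, `drain_fixed_cap`, and `drain_remaining_budget` are hypothetical stand-ins for the controller logic, and `CAP` mirrors the 10,000 cap named in the changelog.

```ruby
CAP = 10_000 # policy-wide cap named in the changelog

# Drain at most `cap` jobs from one backlog; returns the number drained.
def drain_one(backlog, cap)
  [backlog, cap].min
end

# Old behavior: every partition may drain up to the full cap.
def drain_fixed_cap(backlogs)
  drained = 0
  backlogs.each do |backlog|
    break if drained >= CAP

    drained += drain_one(backlog, CAP)
  end
  drained
end

# New behavior: each partition only gets what is left of the budget.
def drain_remaining_budget(backlogs)
  drained = 0
  backlogs.each do |backlog|
    break if drained >= CAP

    drained += drain_one(backlog, CAP - drained)
  end
  drained
end

backlogs = [9_999, 50_000]
puts drain_fixed_cap(backlogs)        # 19999
puts drain_remaining_budget(backlogs) # 10000
```

The first partition leaves one slot of budget, so the fixed cap lets the second partition add a full 10,000 on top.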

data/app/models/dispatch_policy/policy_setting.rb ADDED

@@ -0,0 +1,14 @@
+# frozen_string_literal: true
+module DispatchPolicy
+  # Policy-level settings (currently just the pause flag). One row per
+  # policy_name. The tick's claim_partitions consults this so a pause takes
+  # effect for partitions created after the pause too — not only the ones
+  # that existed when the operator clicked.
+  class PolicySetting < ApplicationRecord
+    self.table_name = "dispatch_policy_policy_settings"
+    scope :for_policy, ->(name) { where(policy_name: name) }
+    scope :paused,     -> { where(paused: true) }
+  end
+end

data/app/views/dispatch_policy/dashboard/index.html.erb CHANGED

@@ -84,7 +84,12 @@
         <tbody>
           <% @policies.each do |p| %>
             <tr>
-              <td><%= link_to p[:name], policy_path(p[:name]), class: "dp-link" %></td>
+              <td>
+                <%= link_to p[:name], policy_path(p[:name]), class: "dp-link" %>
+                <% if p[:paused] %>
+                  <span class="dp-warn" style="font-size:11px; border:1px solid currentColor; border-radius:4px; padding:1px 5px; margin-left:4px;">paused</span>
+                <% end %>
+              </td>
               <td class="dp-num"><%= format_count(p[:pending]) %></td>
               <td class="dp-num"><%= format_count(p[:in_flight]) %></td>
               <td class="dp-num"><%= format_count(p[:admitted_1m]) %></td>

data/app/views/dispatch_policy/partitions/index.html.erb CHANGED

@@ -35,7 +35,7 @@
     </thead>
     <tbody>
       <% @partitions.each do |p| %>
-        <%= render "dispatch_policy/shared/partition_row", partition: p %>
+        <%= render "dispatch_policy/shared/partition_row", partition: p, policy_paused: @paused_policies.include?(p.policy_name) %>
       <% end %>
     </tbody>
   </table>

data/app/views/dispatch_policy/partitions/show.html.erb CHANGED

@@ -23,7 +23,7 @@
   <div class="dp-stat"><span class="dp-stat-label">Policy</span><span class="dp-stat-value"><%= @partition.policy_name %></span></div>
   <div class="dp-stat"><span class="dp-stat-label">Shard</span><span class="dp-stat-value"><code><%= @partition.shard %></code></span></div>
   <div class="dp-stat"><span class="dp-stat-label">Queue</span><span class="dp-stat-value"><%= @partition.queue_name || "—" %></span></div>
-  <div class="dp-stat"><span class="dp-stat-label">Status</span><span class="dp-stat-value <%= "dp-warn" if @partition.paused? %>"><%= @partition.status %></span></div>
+  <div class="dp-stat"><span class="dp-stat-label">Status</span><span class="dp-stat-value <%= "dp-warn" if @partition.paused? || @policy_paused %>"><%= @policy_paused && !@partition.paused? ? "#{@partition.status} (policy paused)" : @partition.status %></span></div>
   <div class="dp-stat"><span class="dp-stat-label">Pending</span><span class="dp-stat-value"><%= format_count(@partition.pending_count) %></span></div>
   <div class="dp-stat"><span class="dp-stat-label">Lifetime admitted</span><span class="dp-stat-value"><%= format_count(@partition.total_admitted) %></span></div>
   <div class="dp-stat"><span class="dp-stat-label">Round-trip age</span><span class="dp-stat-value"><%= age_seconds ? format_duration_seconds(age_seconds) : "never" %></span></div>

data/app/views/dispatch_policy/policies/index.html.erb CHANGED

@@ -13,15 +13,23 @@
     <tbody>
       <% @rows.each do |row| %>
         <tr>
-          <td><%= link_to row[:name], policy_path(row[:name]), class: "dp-link" %></td>
+          <td>
+            <%= link_to row[:name], policy_path(row[:name]), class: "dp-link" %>
+            <% if row[:paused] %>
+              <span class="dp-warn" style="font-size:11px; border:1px solid currentColor; border-radius:4px; padding:1px 5px; margin-left:4px;">paused</span>
+            <% end %>
+          </td>
           <td class="dp-num"><%= format_count(row[:pending]) %></td>
           <td class="dp-num"><%= format_count(row[:in_flight]) %></td>
           <td class="dp-num"><%= format_count(row[:partitions]) %></td>
           <td class="dp-num"><%= row[:paused_count].positive? ? content_tag(:span, format_count(row[:paused_count]), class: "dp-warn") : 0 %></td>
           <td><%= row[:registered] ? "yes" : content_tag(:span, "no (orphan)", class: "dp-warn") %></td>
           <td>
-            <%= button_to "Pause",  pause_policy_path(row[:name]),  class: "dp-btn", method: :post, form: { class: "dp-form-inline" } %>
-            <%= button_to "Resume", resume_policy_path(row[:name]), class: "dp-btn dp-btn-ok", method: :post, form: { class: "dp-form-inline" } %>
+            <% if row[:paused] %>
+              <%= button_to "Resume", resume_policy_path(row[:name]), class: "dp-btn dp-btn-ok", method: :post, form: { class: "dp-form-inline" } %>
+            <% else %>
+              <%= button_to "Pause", pause_policy_path(row[:name]), class: "dp-btn", method: :post, form: { class: "dp-form-inline" } %>
+            <% end %>
           </td>
         </tr>
       <% end %>

data/app/views/dispatch_policy/policies/show.html.erb CHANGED

@@ -1,4 +1,9 @@
-<h1>Policy <code><%= @policy_name %></code></h1>
+<h1>
+  Policy <code><%= @policy_name %></code>
+  <% if @paused %>
+    <span class="dp-warn" style="font-size:14px; vertical-align:middle; border:1px solid currentColor; border-radius:4px; padding:2px 8px; margin-left:8px;">PAUSED</span>
+  <% end %>
+</h1>
 <section class="dp-stats">
   <div class="dp-stat"><span class="dp-stat-label">Partitions</span><span class="dp-stat-value"><%= format_count(@totals[:partitions]) %></span></div>
@@ -150,15 +155,19 @@
 <section class="dp-section">
   <h2>Actions</h2>
-  <%= button_to "Pause all partitions",  pause_policy_path(@policy_name),  class: "dp-btn",         method: :post, form: { class: "dp-form-inline" } %>
-  <%= button_to "Resume all partitions", resume_policy_path(@policy_name), class: "dp-btn dp-btn-ok", method: :post, form: { class: "dp-form-inline" } %>
+  <% if @paused %>
+    <%= button_to "Resume policy", resume_policy_path(@policy_name), class: "dp-btn dp-btn-ok", method: :post, form: { class: "dp-form-inline" } %>
+  <% else %>
+    <%= button_to "Pause policy", pause_policy_path(@policy_name), class: "dp-btn", method: :post, form: { class: "dp-form-inline" } %>
+  <% end %>
   <%= button_to "Drain policy",          drain_policy_path(@policy_name),
                 class:  "dp-btn dp-btn-warn",
                 method: :post,
                 form: { class: "dp-form-inline",
                         onsubmit: "return confirm('Force-admit every staged job across every partition of this policy, bypassing all gates?');" } %>
   <p class="dp-hint">
-    <strong>Pause</strong> stops admission but keeps staging — the queue keeps filling, in-flight jobs finish.
+    <strong>Pause</strong> stops admission for the whole policy — including partitions created
+    after the pause — but keeps staging: the queue keeps filling, in-flight jobs finish.
     <strong>Drain</strong> empties the staging table by force-admitting every job (bypassing gates).
     Capped at 10,000 jobs per click — click again for more.
   </p>