postburner 1.0.0.rc.5 → 1.0.0.rc.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 62187fb2fd681e87e1028c52cbd6f3756b32c895c05390bffca05d51e1287ec7
4
- data.tar.gz: 31ab083d1e9d00bfb7dc6fabbf09d7065f5062d73a6cd738835a0b76f54426b6
3
+ metadata.gz: b48c13a76dd9b1b6e54f825bbc50d95574c0decd4a57b8b6ba3d3fc8c5ea369c
4
+ data.tar.gz: 5a867eef5d36cce234b2874b5c18ff95d0ddb648fd1576e71c0a8c5946ddc12c
5
5
  SHA512:
6
- metadata.gz: d2e7a422e53852490b5e2ec8502f3dd69a34f17975ee808f463658f20f2ca228c5467d149987caaf7e905433b364b30c7f92142e07e7f7da3bbd7763d99c6284
7
- data.tar.gz: aa6ce11d8f045be5b259a1869339117ddf40fca8e6d3579ced469355d2c89fbd7f19da96903154dac11a0f0efed57db840c1581ba9512c2665300d58d0453e32
6
+ metadata.gz: 82e60e8af9d983550d7a07dac46b1c5611eb11245944e2ddac5a7411cf2d0b63a59697cab1a83965379bb92c274c1abd33a1a475498f981cbcb8589d520ac61c
7
+ data.tar.gz: 430829a786da22122f8ce4a47e8e966feb524877539f7a0e4086db912731235064c6888d0662d67c438e0c2e2d590a2a4496e33a56658b3022f70fbbb58ca050
data/CHANGELOG.md CHANGED
@@ -1,5 +1,81 @@
1
1
  # Changelog
2
2
 
3
+ ## v1.0.0.rc.6 - 2026-06-27
4
+
5
+ ### Highlights
6
+
7
+ - `Postburner::Schedule#reconcile!` — guarantees exactly one live future execution per enabled schedule and zero future executions for a disabled schedule. The watchdog delegates to it.
8
+ - `Postburner::Schedule#enable!` / `#disable!` — convenience wrappers that drive reconciliation.
9
+ - Auto-reconcile on schedule edits — an `after_update_commit` hook reconciles whenever a scheduling/snapshot attribute changes.
10
+ - `Postburner::ScheduleExecution#supersede!` — tears down an execution (Beanstalkd job + `Postburner::Job` AR row) and marks it `superseded`.
11
+ - `superseded` status (enum value `201`) on `Postburner::ScheduleExecution`, distinct from `skipped`. Neither counts as a live future execution.
12
+ - `live` and `future_live` scopes on `Postburner::ScheduleExecution` (`pending`/`scheduled` only; `future_live` adds `run_at > now`).
13
+ - Instrumentation events `supersede.schedule_execution.postburner` and `reconcile.schedule.postburner`.
14
+ - **`ScheduleExecution#skip!`** now skips one occurrence, then resumes, tearing down the associated `Postburner::Job` **and** the Beanstalkd job. The skipped row is retained as history.
15
+ - **`schedule.destroy`** now tears down each execution's Beanstalkd job and `Postburner::Job` (no orphans).
16
+ - The watchdog sweeps **disabled** schedules with a lingering live future execution and supersedes them.
17
+
18
+ ### Fixed
19
+
20
+ - Scheduled **tracked ActiveJob** executions had a `nil` `bkid`: the adapter stored the entire Beanstalkd put response hash (`{status:, id:}`) instead of the id, which cast to `nil` in the `bigint` column. Scheduled **non-tracked ActiveJob** executions discarded the Beanstalkd id entirely (`beanstalk_job_id` was always `nil`), so they could not be cancelled. Both now capture the real id via the adapter's `provider_job_id`.
21
+ - Scheduled **`Postburner::Job`** executions stored the wrong value in `ScheduleExecution#beanstalk_job_id` (the return value of `queue!`, not the Beanstalkd id). It now holds the real `job.bkid`.
22
+ - `Postburner::OrphanedJob` rows can now be soft-removed (`remove!`) and hard-deleted (`destroy`), each removing the Beanstalkd job as well, while `readonly?` still blocks saves and updates.
23
+
24
+ ### Upgrade notes
25
+
26
+ - This is a **template mutation** (like rc.4), not an additive migration. Existing installs must add a migration that:
27
+ 1. Replaces the total unique index on `(schedule_id, run_at)` with a **live-only partial** unique index (`WHERE status IN (0, 11)`), so a superseded/skipped row can coexist with a freshly recreated live execution at the same `run_at`.
28
+ 2. Drops `ON DELETE CASCADE` on the `schedule_id` foreign key (teardown is now owned by `dependent: :destroy` + `before_destroy`).
29
+ - **Behavior change — deleting a schedule.** With `ON DELETE CASCADE` removed, deleting a schedule by any path other than ActiveRecord `destroy` — raw SQL, `Schedule.delete`, `where(...).delete_all` — now raises a foreign-key violation when the schedule has executions, instead of silently cascading. Always use `schedule.destroy` (which runs `dependent: :destroy` + the `before_destroy` teardown). This is intentional: it prevents orphaned Beanstalkd jobs and `Postburner::Job` rows.
30
+
31
+ Standalone upgrade migration (written for zero-downtime on large `postburner_schedule_executions` tables — builds the index `CONCURRENTLY` and adds the FK unvalidated then validates it, neither of which blocks writes; requires `disable_ddl_transaction!`):
32
+
33
+ ```ruby
34
+ class UpgradePostburnerSchedulesForReconcile < ActiveRecord::Migration[7.2]
35
+ disable_ddl_transaction!
36
+
37
+ # Existing installs created the unique index via the generator with Rails'
38
+ # default name. Adjust if your install named it differently.
39
+ OLD_INDEX = 'index_postburner_schedule_executions_on_schedule_id_and_run_at'
40
+ NEW_INDEX = 'index_pb_sched_exec_live_schedule_run_at'
41
+
42
+ def up
43
+ # 1. Add the live-only partial unique index CONCURRENTLY (no write lock) so a
44
+ # superseded/skipped row can coexist with a live recreate at the same run_at.
45
+ # Narrowing uniqueness can't conflict with existing data (the old total
46
+ # index was stricter), so this build cannot fail on valid rows.
47
+ add_index :postburner_schedule_executions, [:schedule_id, :run_at],
48
+ unique: true, where: 'status IN (0, 11)',
49
+ name: NEW_INDEX, algorithm: :concurrently, if_not_exists: true
50
+
51
+ # Drop the old total unique index. Target by NAME: with the new index added,
52
+ # two indexes now match the (schedule_id, run_at) columns.
53
+ remove_index :postburner_schedule_executions,
54
+ name: OLD_INDEX, algorithm: :concurrently, if_exists: true
55
+
56
+ # 2. Replace the cascading FK with a plain one. Add unvalidated first (brief
57
+ # metadata-only lock), then validate separately (does not block writes).
58
+ # dependent: :destroy + before_destroy now own teardown.
59
+ remove_foreign_key :postburner_schedule_executions, column: :schedule_id
60
+ add_foreign_key :postburner_schedule_executions, :postburner_schedules,
61
+ column: :schedule_id, validate: false
62
+ validate_foreign_key :postburner_schedule_executions, column: :schedule_id
63
+ end
64
+
65
+ def down
66
+ remove_foreign_key :postburner_schedule_executions, column: :schedule_id
67
+ add_foreign_key :postburner_schedule_executions, :postburner_schedules,
68
+ column: :schedule_id, on_delete: :cascade, validate: false
69
+ validate_foreign_key :postburner_schedule_executions, column: :schedule_id
70
+
71
+ add_index :postburner_schedule_executions, [:schedule_id, :run_at],
72
+ unique: true, name: OLD_INDEX, algorithm: :concurrently, if_not_exists: true
73
+ remove_index :postburner_schedule_executions,
74
+ name: NEW_INDEX, algorithm: :concurrently, if_exists: true
75
+ end
76
+ end
77
+ ```
78
+
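To see why the index is narrowed to live rows, here is a minimal pure-Ruby sketch of the uniqueness rule the partial index enforces. The `Row` struct and `violates_live_unique?` helper are hypothetical illustrations, not part of Postburner; the status integers `0` and `11` are taken from the `WHERE status IN (0, 11)` clause above, and `201` is the `superseded` enum value named in the highlights.

```ruby
# Hypothetical in-memory model of the live-only partial unique index.
Row = Struct.new(:schedule_id, :run_at, :status)

LIVE_STATUSES = [0, 11].freeze # statuses covered by the partial index
SUPERSEDED = 201               # enum value named in the changelog

# Returns true when inserting `candidate` would collide with an existing
# live row for the same (schedule_id, run_at) pair. Rows outside the live
# statuses are invisible to the index, so they never collide.
def violates_live_unique?(rows, candidate)
  return false unless LIVE_STATUSES.include?(candidate.status)

  rows.any? do |row|
    LIVE_STATUSES.include?(row.status) &&
      row.schedule_id == candidate.schedule_id &&
      row.run_at == candidate.run_at
  end
end

run_at = Time.utc(2026, 7, 1, 9, 0, 0)
rows = [Row.new(1, run_at, SUPERSEDED)]

# A live recreate at the same run_at is allowed next to the superseded row.
recreated = Row.new(1, run_at, 0)
rows << recreated unless violates_live_unique?(rows, recreated)

# A second live row at the same run_at is rejected.
puts violates_live_unique?(rows, Row.new(1, run_at, 11))
```

Under the old total unique index, the first insert in this sketch would have collided with the superseded row, which is the case the migration is meant to unblock.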
3
79
  ## v1.0.0.rc.5 - 2026-06-26
4
80
 
5
81
  ### Fixed
data/README.md CHANGED
@@ -126,7 +126,7 @@ Postburner [beanstalkd](https://beanstalkd.github.io/) is used with PostgreSQL t
126
126
 
127
127
  ```ruby
128
128
  # Gemfile
129
- gem 'postburner', '~> 1.0.0.pre.18'
129
+ gem 'postburner', '~> 1.0.0.rc.6'
130
130
 
131
131
  # config/application.rb
132
132
  config.active_job.queue_adapter = :postburner
@@ -499,8 +499,7 @@ The scheduler uses **immediate enqueue** combined with a **watchdog safety net**
499
499
  ```
500
500
  4. When a worker reserves the watchdog, it instantiates `Postburner::Scheduler` which:
501
501
  - Acquires a PostgreSQL advisory lock for coordination
502
- - Auto-bootstraps any unstarted schedules
503
- - Ensures each schedule has a future execution queued
502
+ - Reconciles every schedule (see [Editing & Pausing Schedules](#editing--pausing-schedules)): auto-bootstraps unstarted schedules, ensures each enabled schedule has exactly one future execution queued, and tears down lingering futures on disabled schedules
504
503
  - Re-queues a new watchdog with delay for the next interval
505
504
 
506
505
  NOTE: The watchdog is ephemeral data in Beanstalkd, not a database record. `Postburner::Scheduler` is the handler class that does the work. This design requires no dedicated scheduler process - existing workers handle everything.
@@ -687,8 +686,9 @@ Postburner::Schedule.create!(
687
686
  schedule = Postburner::Schedule.find_by(name: 'daily_cleanup')
688
687
  Postburner::Schedule.enabled # All enabled schedules
689
688
 
690
- # Disable temporarily
691
- schedule.update!(enabled: false)
689
+ # Pause / resume (preferred — see "Editing & Pausing Schedules" below)
690
+ schedule.disable! # Tears down the queued future execution
691
+ schedule.enable! # Resumes from the next grid point after now
692
692
 
693
693
  # Change catch-up policy
694
694
  schedule.update!(catch_up: true)
@@ -700,11 +700,68 @@ schedule.next_run_at # => 2025-01-02 09:00:00 -0500
700
700
  schedule.next_run_at_times(5) # Next 5 run times
701
701
 
702
702
  # View executions
703
+ schedule.executions.live # pending + scheduled (real queued work)
704
+ schedule.executions.future_live # the live future execution(s)
703
705
  schedule.executions.pending
704
706
  schedule.executions.scheduled
705
- schedule.executions.skipped
707
+ schedule.executions.skipped # cancelled occurrences (history)
708
+ schedule.executions.superseded # replaced by a recreate (history)
709
+
710
+ # Delete a schedule (ALWAYS use destroy — see note below)
711
+ schedule.destroy # Tears down every execution's Beanstalkd job + Postburner::Job row
706
712
  ```
707
713
 
714
+ **Deleting schedules — always use `destroy`.** `schedule.destroy` runs `dependent: :destroy` plus a `before_destroy` teardown that removes each execution's Beanstalkd job and any `Postburner::Job` row, so nothing is orphaned. Deleting a schedule by any other path — raw SQL, `Schedule.delete`, `where(...).delete_all` — raises a foreign-key violation when the schedule has executions, by design. Reach for `schedule.destroy` (or `disable!` to keep the record and history).
715
+
716
+ #### Editing & Pausing Schedules
717
+
718
+ Postburner keeps a schedule's queued work in sync with its configuration through a single, idempotent convergence path: **reconciliation**. The guarantee is simple:
719
+
720
+ - An **enabled** schedule always has **exactly one** live future execution, sitting on the current grid with a snapshot matching the live config.
721
+ - A **disabled** schedule has **zero** future executions.
722
+
723
+ You rarely call reconciliation directly (`schedule.reconcile!` exists if you need it). Instead it runs for you in three places: the watchdog reconciles every schedule on each pass, schedule edits trigger it automatically, and `enable!` / `disable!` drive it.
724
+
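The convergence rule can be sketched as a pure function. This is a simplified model under stated assumptions, not the gem's implementation: `plan_reconcile` is a hypothetical name, each live future execution is represented as a Hash with `:run_at` and `:drifted`, and `desired` stands for the next grid point the schedule should occupy (or `nil` when none remains).

```ruby
# Hypothetical model of the reconcile decision. Returns a plan describing
# which live future executions to supersede and which run_at (if any) to
# create, without touching a database or Beanstalkd.
def plan_reconcile(enabled:, live:, desired:)
  # Disabled schedule: supersede everything, create nothing.
  return { supersede: live, create: nil } unless enabled

  # Stable: exactly one live future, on the desired grid point, not drifted.
  stable = live.size == 1 &&
           !desired.nil? &&
           live.first[:run_at].to_i == desired.to_i &&
           !live.first[:drifted]
  return { supersede: [], create: nil } if stable

  # Otherwise free the live slot(s) and recreate one execution at the
  # desired point (nothing is created when no grid points remain).
  { supersede: live, create: desired }
end

nine = Time.utc(2026, 7, 1, 9, 0, 0)
plan = plan_reconcile(enabled: true, live: [{ run_at: nine, drifted: true }], desired: nine)
puts plan[:supersede].size
puts plan[:create]
```

Because a stable schedule yields an empty plan, running the function again after a plan has been applied changes nothing, which is the idempotence the guarantee relies on.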
725
+ ##### Pausing & resuming
726
+
727
+ Use `disable!` / `enable!` rather than `update!(enabled: ...)` (both work, but these read better and make intent obvious):
728
+
729
+ ```ruby
730
+ schedule.disable! # Supersedes the queued future execution → zero future executions
731
+ schedule.enable! # Recreates a future execution at the next grid point after now
732
+ ```
733
+
734
+ Disabling now actually tears down the queued future execution in Beanstalkd — it doesn't just stop creating new ones. Enabling resumes from the **next grid point after now**; it does **not** backfill the gap while the schedule was paused (`catch_up` only governs the running-job chaining path, not resume).
735
+
736
+ ##### Editing a schedule
737
+
738
+ Editing any scheduling/snapshot attribute automatically supersedes the stale queued execution and recreates it from the current config (via an `after_update_commit` hook). The triggering attributes are `job_class`, `args`, `queue`, `priority`, the grid (`anchor`, `interval`, `interval_unit`, `cron`), `timezone`, `catch_up`, `enabled`, and `name`:
739
+
740
+ ```ruby
741
+ # Payload-only edit: recreates the future execution at the SAME run_at
742
+ schedule.update!(args: { report_type: 'weekly' })
743
+
744
+ # Grid edit: recreates the future execution at the NEW grid point
745
+ schedule.update!(interval: 2, interval_unit: 'hours')
746
+ ```
747
+
748
+ > **Best-effort, not synchronous.** This inline reconcile is non-blocking: it *tries* the global scheduler advisory lock without waiting, so it never holds up a web request. If the watchdog happens to hold the lock mid-sweep, the inline reconcile is skipped and the watchdog converges on its next pass. For UI-managed schedules, **don't assume the next run is updated immediately** — read it back after the watchdog interval (`scheduler_interval`) rather than right after the save.
749
+
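The "try without waiting" pattern described in the note can be illustrated with Ruby's standard `Mutex`. This is only an analogy: Postburner uses a PostgreSQL advisory lock, and `SCHEDULER_LOCK` and `try_reconcile` here are hypothetical stand-ins, not the gem's API.

```ruby
# Stand-in for the global scheduler lock (assumption: a Mutex replaces the
# PostgreSQL advisory lock purely for illustration).
SCHEDULER_LOCK = Mutex.new

# Runs the block only if the lock can be taken immediately.
# Returns the block's value, or nil when the lock is busy, so the caller
# never waits on whoever currently holds it.
def try_reconcile(lock = SCHEDULER_LOCK)
  return nil unless lock.try_lock

  begin
    yield
  ensure
    lock.unlock
  end
end

result = try_reconcile { :reconciled }
puts result.inspect
```

A `nil` result here corresponds to the skipped inline reconcile: the caller moves on and the periodic pass is expected to converge later.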
750
+ ##### Skipping a single occurrence
751
+
752
+ `skip!` cancels **one** upcoming occurrence and then resumes at the next grid point after the skipped slot. It tears down both the Beanstalkd job and the associated `Postburner::Job` row, marks the execution `skipped` (kept as history), and triggers a best-effort reconcile. The skipped occurrence does not run and is not recreated:
753
+
754
+ ```ruby
755
+ execution = schedule.executions.future_live.first
756
+ execution.skip! # This one won't run; the schedule continues at the next grid point
757
+ ```
758
+
759
+ To pause every upcoming run instead of just one, use `disable!`.
760
+
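How "resume at the next grid point after the skipped slot" works can be sketched with a fixed-interval grid. This is a simplified model under stated assumptions: `next_grid_point` and `next_unskipped_point` are hypothetical helpers, the grid is a plain anchor plus whole multiples of an interval in seconds, and cron schedules are not modeled.

```ruby
# Hypothetical fixed-interval grid: anchor plus whole multiples of
# `interval` seconds. Returns the first grid point strictly after `after`.
def next_grid_point(anchor, interval, after)
  return anchor if after < anchor

  steps = ((after - anchor) / interval).floor + 1
  anchor + steps * interval
end

# Advances past any grid point whose occurrence was skipped, so a cancelled
# occurrence is never recreated.
def next_unskipped_point(anchor, interval, after, skipped)
  point = next_grid_point(anchor, interval, after)
  point = next_grid_point(anchor, interval, point) while skipped.include?(point)
  point
end

anchor = Time.utc(2026, 1, 1, 9, 0, 0)
now = Time.utc(2026, 1, 1, 9, 30, 0)
skipped = [Time.utc(2026, 1, 1, 10, 0, 0)]

puts next_unskipped_point(anchor, 3600, now, skipped)
```

With an hourly grid and the 10:00 slot skipped, the sketch lands on the following hourly slot rather than reusing 10:00.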
761
+ ##### Superseding (internal)
762
+
763
+ `ScheduleExecution#supersede!` performs the same teardown as `skip!` but records a `superseded` status, meaning the execution was replaced by a recreate (config/grid drift resolved by reconciliation) rather than cancelled by an operator. Reconciliation uses it internally; you'll mostly encounter `superseded` rows when reading execution history. Like skipped rows, superseded rows are excluded from `live` / `future_live`.
764
+
708
765
  #### Starting Schedules
709
766
 
710
767
  When you create a schedule, it won't run until the first execution is created. You have two options:
@@ -748,19 +805,27 @@ Each scheduled run creates an execution record for tracking:
748
805
  ```ruby
749
806
  execution = Postburner::ScheduleExecution.find(123)
750
807
 
751
- execution.status # pending, scheduled, skipped
808
+ execution.status # pending, scheduled, skipped, superseded
752
809
  execution.run_at # Scheduled time
753
810
  execution.enqueued_at # When job was queued
754
811
  execution.beanstalk_job_id # Beanstalkd job ID
755
812
  execution.job_id # Postburner::Job ID (if using Postburner::Job)
756
813
  ```
757
814
 
815
+ **Status values:**
816
+ - `pending` — created, not yet enqueued to Beanstalkd
817
+ - `scheduled` — enqueued to Beanstalkd, waiting for `run_at`
818
+ - `skipped` — one occurrence cancelled by an operator (`skip!`); retained as history
819
+ - `superseded` — replaced by a recreate during reconciliation (`supersede!`); retained as history
820
+
821
+ Only `pending` and `scheduled` count as a live future execution (the `live` / `future_live` scopes). `skipped` and `superseded` are inert history.
822
+
758
823
  **Execution lifecycle:**
759
824
  1. Execution created with `pending` status and immediately enqueued to Beanstalkd
760
825
  2. Status changes to `scheduled` once enqueued
761
826
  3. At `run_at` time, Beanstalkd releases job to worker
762
827
  4. For `Postburner::Job` (and `ActiveJob` with `Postburner::Tracked`) schedules: `before_attempt` callback creates next execution
763
- 5. Watchdog periodically verifies future executions exist (safety net)
828
+ 5. Watchdog periodically reconciles each schedule, guaranteeing that every enabled schedule has a live future execution (safety net)
764
829
 
765
830
  #### Timezone Handling
766
831
 
@@ -1572,7 +1637,8 @@ Postburner emits ActiveSupport::Notifications events following Rails conventions
1572
1637
  |-------|------|--------------|
1573
1638
  | `create.schedule.postburner` | When schedule is created | `:schedule` |
1574
1639
  | `update.schedule.postburner` | When schedule is updated | `:schedule`, `:changes` |
1575
- | `audit.schedule.postburner` | When scheduler audits a schedule | `:schedule` |
1640
+ | `audit.schedule.postburner` | When scheduler audits a schedule | `:schedule`, `:execution_created`, `:orphans_enqueued` |
1641
+ | `reconcile.schedule.postburner` | When a schedule is reconciled (executions converged to config) | `:schedule`, `:superseded`, `:created` |
1576
1642
 
1577
1643
  **Schedule Payload Structure:**
1578
1644
 
@@ -1595,7 +1661,8 @@ Postburner emits ActiveSupport::Notifications events following Rails conventions
1595
1661
  |-------|------|--------------|
1596
1662
  | `create.schedule_execution.postburner` | When execution is created | `:schedule`, `:execution` |
1597
1663
  | `enqueue.schedule_execution.postburner` | When execution is enqueued to Beanstalkd | `:schedule`, `:execution`, `:beanstalk_job_id` |
1598
- | `skip.schedule_execution.postburner` | When execution is skipped | `:schedule`, `:execution` |
1664
+ | `skip.schedule_execution.postburner` | When an operator skips one occurrence | `:schedule`, `:execution` |
1665
+ | `supersede.schedule_execution.postburner` | When an execution is superseded (replaced by a recreate during reconciliation) | `:schedule`, `:execution` |
1599
1666
 
1600
1667
  **Execution Payload Structure:**
1601
1668
 
@@ -2368,7 +2435,7 @@ Key changes in v1.0:
2368
2435
 
2369
2436
  1. **Update Gemfile:**
2370
2437
  ```ruby
2371
- gem 'postburner', '~> 1.0.0.pre.18'
2438
+ gem 'postburner', '~> 1.0.0.rc.6'
2372
2439
  ```
2373
2440
 
2374
2441
  2. **Remove Backburner config:**
@@ -84,6 +84,10 @@ module Postburner
84
84
  # through instance save/update and therefore does not trigger
85
85
  # `ensure_proper_type` either).
86
86
  #
87
+ # Uses the STI base class for the relation so the row's missing `type` value
88
+ # (which differs from OrphanedJob's sti_name) does not scope it out — a
89
+ # `self.class.where` would match zero rows and silently fail to persist.
90
+ #
87
91
  # Idempotent: does nothing if already removed.
88
92
  #
89
93
  # @return [void]
@@ -93,10 +97,31 @@ module Postburner
93
97
 
94
98
  self.delete!
95
99
  now = Time.current
96
- self.class.where(id: self.id).update_all(removed_at: now)
100
+ self.class.base_class.where(id: self.id).update_all(removed_at: now)
97
101
  self.removed_at = now
98
102
  end
99
103
 
104
+ # Hard-deletes this orphaned row, removing BOTH its Beanstalkd job and its AR
105
+ # row, while keeping `readonly?` true for saves/updates.
106
+ #
107
+ # The base ActiveRecord#destroy raises ReadOnlyRecord on a readonly record,
108
+ # which would make ScheduleExecution#teardown_job! (which calls `job.destroy`
109
+ # uniformly across job shapes) fail for an execution whose job class was
110
+ # deleted/renamed. This override mirrors {#remove!}: it removes the Beanstalkd
111
+ # job via {#delete!} and deletes the row at the relation level — using the STI
112
+ # base class so the missing `type` value doesn't scope the row out — bypassing
113
+ # both `readonly?` and instance callbacks. The row is genuinely removed (not a
114
+ # soft-delete), matching destroy semantics.
115
+ #
116
+ # @return [self] frozen, with destroyed? == true
117
+ #
118
+ def destroy
119
+ self.delete!
120
+ self.class.base_class.where(id: self.id).delete_all
121
+ @destroyed = true
122
+ freeze
123
+ end
124
+
100
125
  # Prevents saves that would allow Rails' ensure_proper_type to overwrite the
101
126
  # original `type` value with 'Postburner::OrphanedJob'.
102
127
  #
@@ -1,5 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
+ require 'postburner/advisory_lock'
4
+
3
5
  module Postburner
4
6
  # Schedule model for recurring job execution with fixed-rate, grid-aligned scheduling.
5
7
  #
@@ -118,6 +120,21 @@ module Postburner
118
120
  after_create_commit :instrument_create
119
121
  after_update_commit :instrument_update
120
122
 
123
+ # Reconcile executions whenever a relevant attribute changes.
124
+ #
125
+ # Runs post-commit (so enqueueing to Beanstalkd is safe and observes the
126
+ # committed config) and only when a config attribute that affects scheduling
127
+ # or the cached snapshot actually changed. reconcile! mutates executions and
128
+ # uses update_column for last_audit_at, so it never re-triggers this callback.
129
+ after_update_commit :reconcile_after_change
130
+
131
+ # Schedule attributes whose change should trigger reconciliation. Mirrors the
132
+ # cacheable_attributes set plus :enabled (enable/disable drives reconcile).
133
+ RECONCILE_TRIGGER_ATTRS = %w[
134
+ enabled job_class args queue priority timezone
135
+ anchor interval interval_unit cron catch_up name
136
+ ].freeze
137
+
121
138
  # Scopes
122
139
  scope :enabled, -> { where(enabled: true) }
123
140
  scope :disabled, -> { where(enabled: false) }
@@ -183,7 +200,11 @@ module Postburner
183
200
  # will return nil with a warning logged.
184
201
  #
185
202
  def create_next_execution!(after: nil)
186
- # Check if a future execution already exists (any status - including skipped).
203
+ # Check if a LIVE future execution already exists.
204
+ #
205
+ # Only pending/scheduled rows count: a skipped or superseded row must NOT
206
+ # block creation of the next live future, so this matches the same
207
+ # "live future" notion reconcile! and future_live use.
187
208
  #
188
209
  # TIME PRECISION: When called from a job callback during time travel
189
210
  # (e.g., ImmediateTestQueue), Time.current and execution.run_at can differ
@@ -197,7 +218,7 @@ module Postburner
197
218
  # after.run_at when an execution is provided, we correctly exclude the
198
219
  # current execution from the future check.
199
220
  check_time = after.is_a?(ScheduleExecution) ? after.run_at : Time.current
200
- return nil if executions.where('run_at > ?', check_time).exists?
221
+ return nil if executions.live.where('run_at > ?', check_time).exists?
201
222
 
202
223
  # Determine base time for calculating next execution.
203
224
  #
@@ -217,7 +238,15 @@ module Postburner
217
238
  Time.current
218
239
  end
219
240
 
220
- create_execution!(after: after_time)
241
+ # Compute the next slot from the chosen base time using the SAME skip-aware
242
+ # computation reconcile! uses, so the before_attempt chaining path can never
243
+ # place a live execution onto a grid point that carries a skipped (cancelled)
244
+ # occurrence. With no skipped slots this is identical to the next grid point
245
+ # strictly after `after_time` (i.e. unchanged behavior).
246
+ desired = next_grid_point_skipping_cancelled(after: after_time)
247
+ return nil if desired.nil?
248
+
249
+ create_execution!(at: desired)
221
250
  rescue ActiveRecord::RecordNotUnique, ActiveRecord::RecordInvalid => e
222
251
  # Race condition - another process/thread created it between our check and insert
223
252
  # RecordNotUnique: PostgreSQL constraint violation
@@ -226,6 +255,73 @@ module Postburner
226
255
  nil
227
256
  end
228
257
 
258
+ # Converge this schedule's executions to its target invariants. Idempotent.
259
+ #
260
+ # This is the single convergence path that owns:
261
+ # 1. exactly one LIVE future execution per ENABLED schedule,
262
+ # 2. zero future executions for a DISABLED schedule,
263
+ # 3. the future execution sits on the current grid with a non-drifted
264
+ # cached snapshot.
265
+ #
266
+ # Always future-only: it never touches past or in-flight executions. Always
267
+ # resumes from the next grid point after now; it does NOT honor catch_up to
268
+ # backfill a gap (that is the before_attempt chaining path's concern, not
269
+ # reconcile's).
270
+ #
271
+ # Concurrency: mutual exclusion is provided by the SCHEDULER_LOCK_KEY advisory
272
+ # lock (NOT a wrapping AR transaction, so enqueue can happen after commit).
273
+ # Each supersede!/create runs in its own small transaction; the live-only
274
+ # partial unique index on (schedule_id, run_at) is the hard backstop.
275
+ #
276
+ # @param lock [Boolean] acquire SCHEDULER_LOCK_KEY. Pass false when the caller
277
+ # (the watchdog) already holds it.
278
+ # @param blocking [Boolean] when acquiring the lock, wait for it (true) or give
279
+ # up immediately if it's held (false). Admin-triggered paths (skip!, the
280
+ # after_update_commit edit hook) pass false so a web request never blocks for
281
+ # the duration of a watchdog sweep — if the lock is busy they skip the inline
282
+ # reconcile and let the watchdog converge on its next pass. Ignored when
283
+ # lock: false.
284
+ # @return [ScheduleExecution, nil] the execution created this run, or nil if
285
+ # the schedule was already stable/disabled OR (non-blocking) the lock was
286
+ # held. Callers use this to populate the `execution_created` instrumentation
287
+ # key.
288
+ #
289
+ def reconcile!(lock: true, blocking: true)
290
+ return reconcile_unlocked! unless lock
291
+
292
+ acquired = false
293
+ result = Postburner::AdvisoryLock.with_lock(
294
+ Postburner::AdvisoryLock::SCHEDULER_LOCK_KEY, blocking: blocking
295
+ ) do
296
+ acquired = true
297
+ reconcile_unlocked!
298
+ end
299
+
300
+ unless acquired
301
+ Rails.logger.debug(
302
+ "[Postburner::Schedule] reconcile! skipped for '#{name}': scheduler lock " \
303
+ "held; the watchdog will converge on its next pass"
304
+ )
305
+ end
306
+
307
+ result
308
+ end
309
+
310
+ # Enable this schedule (and reconcile via after_update_commit).
311
+ #
312
+ # @return [Boolean]
313
+ def enable!
314
+ update!(enabled: true)
315
+ end
316
+
317
+ # Disable this schedule (and reconcile via after_update_commit, which
318
+ # supersedes the future execution so zero future executions remain).
319
+ #
320
+ # @return [Boolean]
321
+ def disable!
322
+ update!(enabled: false)
323
+ end
324
+
229
325
  # Calculate next N run times.
230
326
  #
231
327
  # Uses either cron or anchor-based calculation depending on schedule mode.
@@ -336,7 +432,12 @@ module Postburner
336
432
  queue: queue,
337
433
  priority: priority,
338
434
  timezone: timezone,
339
- anchor: anchor,
435
+ # Emit anchor as a deterministic ISO-8601 string (ms precision) rather
436
+ # than a raw Time, so the cached snapshot and a live cacheable_attributes
437
+ # hash compare equal under ScheduleExecution#drifted? without a
438
+ # Time-vs-string false drift. Anchors are second/minute aligned in
439
+ # practice, so millisecond precision is more than sufficient.
440
+ anchor: anchor&.iso8601(3),
340
441
  interval: interval,
341
442
  interval_unit: interval_unit,
342
443
  cron: cron,
@@ -346,6 +447,131 @@ module Postburner
346
447
 
347
448
  private
348
449
 
450
+ # The reconciliation algorithm, assuming the caller holds SCHEDULER_LOCK_KEY.
451
+ #
452
+ # @return [ScheduleExecution, nil] the created execution, or nil
453
+ #
454
+ # @api private
455
+ #
456
+ def reconcile_unlocked!
457
+ live = executions.future_live.to_a
458
+
459
+ unless enabled?
460
+ # Invariant 2: a disabled schedule has zero future executions.
461
+ live.each(&:supersede!)
462
+ touch_audit!
463
+ instrument_reconcile(superseded: live.size, created: false)
464
+ return nil
465
+ end
466
+
467
+ # The next grid point after now that we should occupy. Advance past any
468
+ # grid point that was explicitly SKIPPED: a skipped occurrence is cancelled
469
+ # and must not be recreated. (A superseded slot, by contrast, IS
470
+ # recreatable — payload-drift recreates at the same run_at — so it must not
471
+ # be skipped past.)
472
+ desired = next_grid_point_skipping_cancelled
473
+
474
+ current = live.first
475
+
476
+ # Stable when there is exactly one live future, it sits on the desired grid
477
+ # point, and its cached snapshot matches the live config.
478
+ stable = live.size == 1 &&
479
+ desired.present? &&
480
+ current.run_at.to_i == desired.to_i &&
481
+ !current.drifted?(self)
482
+
483
+ if stable
484
+ touch_audit!
485
+ instrument_reconcile(superseded: 0, created: false)
486
+ return nil
487
+ end
488
+
489
+ # Free the live slot(s).
490
+ live.each(&:supersede!)
491
+
492
+ # If the schedule has no more grid points, create nothing.
493
+ if desired.nil?
494
+ touch_audit!
495
+ instrument_reconcile(superseded: live.size, created: false)
496
+ return nil
497
+ end
498
+
499
+ # Recreate exactly one live future at the desired grid point.
500
+ created = create_execution!(at: desired)
501
+ touch_audit!
502
+ instrument_reconcile(superseded: live.size, created: created.present?)
503
+ created
504
+ end
505
+
506
+ # The next grid point strictly after `after`, advanced past any grid point
507
+ # that already has a SKIPPED execution (a cancelled occurrence must not be
508
+ # recreated). A SUPERSEDED slot is NOT skipped — it remains recreatable at the
509
+ # same run_at (payload-drift recreate). Returns nil if the schedule has no
510
+ # further grid points.
511
+ #
512
+ # This is the single skip-aware "next occurrence" computation shared by both
513
+ # reconcile! (after: Time.current) and the before_attempt chaining path
514
+ # (create_next_execution!), so neither can place a live execution onto a
515
+ # skipped slot.
516
+ #
517
+ # @param after [Time] base time to calculate the next grid point after
518
+ # (default: Time.current, preserving reconcile!'s behavior)
519
+ # @return [Time, nil]
520
+ #
521
+ # @api private
522
+ #
523
+ def next_grid_point_skipping_cancelled(after: Time.current)
524
+ desired = next_run_at(after: after)
525
+ while desired && executions.skipped.exists?(run_at: desired)
526
+ nxt = next_run_at(after: desired)
527
+ break if nxt.nil? || nxt.to_i <= desired.to_i # guard against non-advancing grids
528
+ desired = nxt
529
+ end
530
+ desired
531
+ end
532
+
533
+ # Update last_audit_at without firing after_update_commit (which would
534
+ # otherwise re-trigger reconcile_after_change).
535
+ #
536
+ # @return [void]
537
+ #
538
+ # @api private
539
+ #
540
+ def touch_audit!
541
+ update_column(:last_audit_at, Time.current)
542
+ end
543
+
544
+ # after_update_commit hook: reconcile only when a scheduling/snapshot
545
+ # attribute actually changed.
546
+ #
547
+ # @return [void]
548
+ #
549
+ # @api private
550
+ #
551
+ def reconcile_after_change
552
+ return if (saved_changes.keys & RECONCILE_TRIGGER_ATTRS).empty?
553
+
554
+ # Best-effort: never block a web request on the global scheduler lock. If the
555
+ # watchdog is mid-sweep, skip the inline reconcile; the watchdog converges.
556
+ reconcile!(blocking: false)
557
+ end
558
+
559
+ # Instrument a reconcile run.
560
+ #
561
+ # @param superseded [Integer] number of executions superseded this run
562
+ # @param created [Boolean] whether a new execution was created this run
563
+ # @return [void]
564
+ #
565
+ # @api private
566
+ #
567
+ def instrument_reconcile(superseded:, created:)
568
+ ActiveSupport::Notifications.instrument('reconcile.schedule.postburner', {
569
+ schedule: Postburner::Instrumentation.schedule_payload(self),
570
+ superseded: superseded,
571
+ created: created
572
+ })
573
+ end
574
+
349
575
  # Create and save an execution, then enqueue it to Beanstalkd.
350
576
  #
351
577
  # The execution is enqueued immediately with appropriate delay - Beanstalkd's
@@ -355,13 +581,17 @@ module Postburner
355
581
  # Instruments with ActiveSupport::Notifications:
356
582
  # - create.schedule_execution.postburner: When execution is created
357
583
  #
358
- # @param after [Time, ScheduleExecution, nil] Calculate after this time/execution
584
+ # @param after [Time, ScheduleExecution, nil] Calculate the slot as the next
585
+ # grid point after this time/execution (legacy bootstrap path).
586
+ # @param at [Time, nil] Create the execution at this EXACT run_at (used by
587
+ # reconcile! / create_next_execution!, which have already computed the precise
588
+ # skip-aware grid point). Takes precedence over `after`.
359
589
  # @return [ScheduleExecution, nil] The created execution, or nil if no more runs
360
590
  #
361
591
  # @api private
362
592
  #
363
- def create_execution!(after: nil)
364
- execution = build_execution(after: after)
593
+ def create_execution!(after: nil, at: nil)
594
+ execution = build_execution(after: after, at: at)
365
595
  return nil if execution.nil?
366
596
 
367
597
  execution.save!
@@ -383,22 +613,35 @@ module Postburner
383
613
 
384
614
  # Build an execution record (does not save).
385
615
  #
386
- # Calculates the next two run times and builds an execution with run_at
387
- # and next_run_at set. The execution is not persisted to the database.
616
+ # Two modes:
617
+ # - `at:` create the execution at this EXACT run_at. The next_run_at
618
+ # lookahead is the next skip-aware grid point AFTER it, matching what the
619
+ # next reconcile/chaining pass will actually create. This is the path used
620
+ # by reconcile! and create_next_execution!, which have already computed the
621
+ # precise grid point (no `-1s` round-trip, no double grid computation).
622
+ # - `after:` — legacy path: run_at is the next grid point strictly after the
623
+ # given time/execution; next_run_at is the grid point after that.
388
624
  #
389
- # @param after [Time, ScheduleExecution, nil] Calculate after this time/execution
390
- # @return [ScheduleExecution, nil] The built execution, or nil if schedule has no more runs
625
+ # @param after [Time, ScheduleExecution, nil] Calculate the slot after this
626
+ # @param at [Time, nil] Build at this exact run_at (takes precedence)
627
+ # @return [ScheduleExecution, nil] The built execution, or nil if no more runs
391
628
  #
392
629
  # @api private
393
630
  #
394
- def build_execution(after: nil)
395
- after_time = after.is_a?(ScheduleExecution) ? after.run_at : after
396
- times = next_run_at_times(after: after_time, count: 2)
631
+ def build_execution(after: nil, at: nil)
632
+ if at
633
+ run_at = at
634
+ # Lookahead matches the next slot the scheduler would create next time.
635
+ next_run_at = next_grid_point_skipping_cancelled(after: at)
636
+ else
637
+ after_time = after.is_a?(ScheduleExecution) ? after.run_at : after
638
+ times = next_run_at_times(after: after_time, count: 2)
397
639
 
398
- return nil if times.empty?
640
+ return nil if times.empty?
399
641
 
400
- run_at = times[0]
401
- next_run_at = times[1] # May be nil if count is 1 or no more runs
642
+ run_at = times[0]
643
+ next_run_at = times[1] # May be nil if count is 1 or no more runs
644
+ end
402
645
 
403
646
  executions.build(
404
647
  run_at: run_at,