ruby_reactor 0.5.1 → 0.5.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 54ecb36ac72eedf48af0025dff29d83986449586683300ce3fe8fd874c1412d5
-   data.tar.gz: 5e3565e3e238bad746d93982ca7b01560893a68755be18fbfce95df6f54ce5b3
+   metadata.gz: 66c0a0c5591cd862dc61063d752f4d92111370808256bfdae749825fec68429b
+   data.tar.gz: 212ac4ce7ef87e5d28606cab0aff8358afde437389fbb5b2d717b1b5874daa77
  SHA512:
-   metadata.gz: 708acdb0c74582cea4c33210ea1bac055080c7dd19c69d166db4b2c22705de2935346d6b09d4fc6c9cbb1bda5ba235cade92a88f04fe791e364bb78bda256138
-   data.tar.gz: e3f03e46d71babe276224571eaad3e794c7d8c695e2865333f5021e680feb21a12cd3b33527035e1349dc600d01faec87b573b691d7e50521c4f0d81610c8b8f
+   metadata.gz: 75e3bd7ead2281ef7bd1a74fe42a1aaaa1ff5ac92db2b72172e779b0fa9591878268b9829c939def027a6d368ad0ad7b00eddaa05bbb5497f0678229b68b17c0
+   data.tar.gz: 46c44e73bef1e7f2a11d7a5de51de83a2a9a2505aa8c5b06cf64215bc97c4c272f67f23a220187431a188a6dcca216ee13741566fc61e1ead6a8d7dc7b392d74
@@ -1,3 +1,3 @@
  {
-   ".": "0.5.1"
+   ".": "0.5.2"
  }
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
  # Changelog

+ ## [0.5.2](https://github.com/arturictus/ruby_reactor/compare/v0.5.1...v0.5.2) (2026-06-14)
+
+
+ ### Features
+
+ * Nonce lock ([#26](https://github.com/arturictus/ruby_reactor/issues/26)) ([5925cac](https://github.com/arturictus/ruby_reactor/commit/5925cac7af93f59be6c0a8a98ab020f96080f60b))
+
  ## [0.5.1](https://github.com/arturictus/ruby_reactor/compare/v0.5.0...v0.5.1) (2026-06-14)


data/README.md CHANGED
@@ -24,7 +24,7 @@ The key value is **Reliability**: if any part of your workflow fails, Ruby React
  - **Compensation**: Automatic rollback of completed steps when a failure occurs.
  - **Interrupts**: Pause and resume workflows to wait for external events (webhooks, user approvals).
  - **Input Validation**: Integrated with `dry-validation` for robust input checking.
- - **Distributed Locks, Semaphores, Rate Limits & Periods**: Coordinate across processes with Redis-backed primitives — exclusive locks for at-most-one-runner, semaphores for capacity caps, fixed-window rate limits for external APIs (single or multi-window like "3/sec AND 100/min"), and `with_period` to dedup reactors to once per calendar bucket (once per day/month/year/etc). Async jobs snooze on contention with smart `retry_after` instead of consuming retry budget.
+ - **Distributed Locks, Semaphores, Rate Limits, Periods & Ordered Locks**: Coordinate across processes with Redis-backed primitives — exclusive locks for at-most-one-runner, semaphores for capacity caps, fixed-window rate limits for external APIs (single or multi-window like "3/sec AND 100/min"), `with_period` to dedup reactors to once per calendar bucket, and `with_ordered_lock` for strict transaction ordering via a monotonically increasing nonce assigned at enqueue. Async jobs snooze on contention with smart `retry_after` instead of consuming retry budget.

  ## Comparison

@@ -58,7 +58,7 @@ The key value is **Reliability**: if any part of your workflow fails, Ruby React
  - [Full Reactor Async](#full-reactor-async)
  - [Step-Level Async](#step-level-async)
  - [Interrupts (Pause & Resume)](#interrupts-pause--resume)
- - [Locks & Semaphores](#locks--semaphores)
+ - [Locks, Semaphores & Ordered Locks](#locks-semaphores--ordered-locks)
  - [Map & Parallel Execution](#map--parallel-execution)
  - [Map with Dynamic Source (ActiveRecord)](#map-with-dynamic-source-activerecord)
  - [Input Validation](#input-validation)
@@ -92,20 +92,31 @@ Or install it yourself as:

  Configure RubyReactor with your Sidekiq and Redis settings:

+ Every setting below is **optional** — RubyReactor ships with the defaults shown. Override only what you need.
+
  ```ruby
  RubyReactor.configure do |config|
-   # Redis configuration for state persistence
+   # Storage adapter. Default: :redis (the only adapter shipped today).
    config.storage.adapter = :redis
+   # Redis URL. Default: "redis://localhost:6379/0".
    config.storage.redis_url = ENV.fetch("REDIS_URL", "redis://localhost:6379/0")
+   # Extra options passed to Redis.new. Default: {}.
    config.storage.redis_options = { timeout: 1 }

-   # Sidekiq configuration for async execution
+   # Sidekiq queue used by RubyReactor's async worker. Default: :default.
    config.sidekiq_queue = :default
+   # Sidekiq retry count for infrastructure failures only (deserialization,
+   # Redis, network). Step retries are managed separately. Default: 3.
    config.sidekiq_retry_count = 3

-   # Lock contention snooze behavior for async reactors. When a Sidekiq worker
-   # cannot acquire a lock or semaphore, it re-enqueues itself with this delay
-   # (plus jitter) up to `lock_snooze_max_attempts` times before giving up.
+   # Lock/semaphore/rate-limit/ordered-lock contention snooze behavior for
+   # async reactors. When a Sidekiq worker cannot acquire a primitive it
+   # re-enqueues itself with `lock_snooze_base_delay + rand(0..lock_snooze_jitter)`
+   # seconds (rate-limit uses a precise `retry_after_seconds` hint from the error;
+   # ordered-lock waits re-poll at the base delay so a successor catches its
+   # blocker finishing fast), up to `lock_snooze_max_attempts` times before
+   # marking the context :failed. Defaults: 5 / 5 / 20. Set max_attempts to
+   # :infinity to never give up.
    config.lock_snooze_base_delay = 5
    config.lock_snooze_jitter = 5
    config.lock_snooze_max_attempts = 20
@@ -114,11 +125,18 @@ RubyReactor.configure do |config|
    # `with_rate_limit(:stripe)`. See Locks, Semaphores, Rate Limits & Periods.
    config.rate_limits.register(:stripe, limits: { second: 3, minute: 100 })

-   # Logger configuration
+   # Logger. Default: Logger.new($stderr).
    config.logger = Logger.new($stdout)
+
+   # Async router. Default: RubyReactor::SidekiqAdapter. Swap for a custom
+   # adapter if you don't use Sidekiq — the adapter only needs to respond to
+   # `perform_async(serialized_context, reactor_class_name, **)`.
+   # config.async_router = MyCustomAdapter
  end
  ```

+ You can also leave out the `configure` block entirely — defaults work for local development against a Redis on `localhost:6379`.
+

  ## Quick Start

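The snooze timing described in the configuration comments above can be sketched in plain Ruby. This is only an illustration of the arithmetic, not RubyReactor's code: `snooze_delay` and `give_up?` are hypothetical helper names, and the `attempt >= max_attempts` cutoff is an assumption about how the cap is applied.

```ruby
# Hypothetical helpers illustrating the snooze arithmetic from the README
# comments; they are not part of RubyReactor's API.

# Delay in seconds: the base delay plus a random integer in 0..jitter.
def snooze_delay(base_delay:, jitter:, rng: Random.new)
  base_delay + rng.rand(0..jitter)
end

# Assumed cutoff: stop snoozing once `attempt` reaches the cap. A cap of
# :infinity means the job keeps snoozing and is never given up on.
def give_up?(attempt, max_attempts)
  return false if max_attempts == :infinity

  attempt >= max_attempts
end

rng = Random.new(42) # seeded so repeated runs produce the same sequence
delays = Array.new(5) { snooze_delay(base_delay: 5, jitter: 5, rng: rng) }
puts delays.inspect              # each value falls in 5..10
puts give_up?(20, 20)            # true under the assumed cutoff
puts give_up?(1_000, :infinity)  # false
```

With the documented defaults (5 / 5 / 20), each snooze waits between 5 and 10 seconds.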
@@ -359,7 +377,7 @@ ApprovalReactor.continue_by_correlation_id(
  )
  ```

- ### Locks & Semaphores
+ ### Locks, Semaphores & Ordered Locks

  Coordinate across processes with Redis-backed primitives:

@@ -367,6 +385,7 @@ Coordinate across processes with Redis-backed primitives:
  - **`with_semaphore`** — cap total concurrent runners per key (capacity control).
  - **`with_rate_limit`** — fixed-window rate limit, single or multi-window ("3/sec AND 100/min"). Inline per-reactor, or reference a named limit registered once in `RubyReactor.configure` and shared across reactors.
  - **`with_period`** — run at most once per calendar bucket (dedup / once-per-day, once-per-month, etc).
+ - **`with_ordered_lock`** — strict transaction ordering via a monotonically increasing nonce assigned at enqueue. Workers can only proceed when their nonce equals `last_completed + 1`.

  ```ruby
  class RefundOrderReactor < RubyReactor::Reactor
@@ -423,6 +442,27 @@ class ChargeReactor < RubyReactor::Reactor
      run { |args| Stripe.charge(args[:account_id]) }
    end
  end
+
+ class OrderedTransactionReactor < RubyReactor::Reactor
+   async
+   input :account_id
+   input :transaction
+
+   # Strict order: a monotonically increasing nonce is assigned at enqueue
+   # time (inside `Reactor.run`). Workers only execute when their nonce
+   # equals last_completed + 1; otherwise they snooze. After the sequence
+   # fully drains the counter resets to 0.
+   with_ordered_lock(poison_pill_timeout: 300) { |inputs| "txs:#{inputs[:account_id]}" }
+
+   step :apply do
+     argument :transaction, input(:transaction)
+     run { |args| Ledger.apply(args[:transaction]) }
+   end
+ end
+
+ # Caller-side order is preserved; the worker pool may pick jobs in any order
+ # but the gate enforces sequential execution per key.
+ [tx1, tx2, tx3].each { |tx| OrderedTransactionReactor.run(account_id: 42, transaction: tx) }
  ```

  **Named global limits.** When several reactors hit the same external service, register the limit once and reference it by name. The name is the shared key base, so every reactor throttles against one bucket:
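The ordered-lock rule from the `OrderedTransactionReactor` example (run only when the nonce equals `last_completed + 1`, reset to 0 once drained) can be modeled without Redis. The sketch below is an in-memory stand-in: `InMemoryOrderedGate` and its method names are hypothetical, and re-queueing a job stands in for a Sidekiq snooze.

```ruby
# In-memory stand-in for the Redis-backed gate (hypothetical class; the real
# primitive keeps its counters in Redis).
class InMemoryOrderedGate
  def initialize
    @next = 0            # last nonce handed out
    @last_completed = 0  # last nonce that finished
  end

  # Called at enqueue time: hand out the next nonce.
  def assign
    @next += 1
  end

  # A worker may run only when it is the immediate successor.
  def ready?(nonce)
    nonce == @last_completed + 1
  end

  # Mark a nonce finished; reset both counters once the sequence drains.
  def complete(nonce)
    raise ArgumentError, "nonce #{nonce} is not at the head" unless ready?(nonce)

    @last_completed = nonce
    @next = @last_completed = 0 if @last_completed == @next
  end
end

gate = InMemoryOrderedGate.new
jobs = %w[tx1 tx2 tx3].map { |tx| { nonce: gate.assign, tx: tx } }

# Workers pick jobs in arbitrary order (here: reversed). A job that is not
# at the head "snoozes" by going back to the end of the queue.
queue = jobs.reverse
applied = []
until queue.empty?
  job = queue.shift
  if gate.ready?(job[:nonce])
    applied << job[:tx]
    gate.complete(job[:nonce])
  else
    queue.push(job)
  end
end

puts applied.inspect # ["tx1", "tx2", "tx3"]
```

Even though the queue starts reversed, the transactions are applied in enqueue order, and the next `assign` after the drain starts again at 1.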
@@ -449,8 +489,8 @@ Referencing an unregistered name raises `RubyReactor::RateLimitRegistry::Unknown

  On contention:

- - **Inline** (`Reactor.run`) raises `RubyReactor::Lock::AcquisitionError` / `RubyReactor::Semaphore::AcquisitionError` / `RubyReactor::RateLimit::ExceededError`.
- - **Async** (Sidekiq) snoozes the job via `perform_in(delay, ...)`. For rate limits the delay is the error's `retry_after_seconds` (precise wakeup); for locks/semaphores it's `lock_snooze_base_delay + jitter`. Snoozes do not count against the Sidekiq retry budget. After `lock_snooze_max_attempts` snoozes the context is marked failed.
+ - **Inline** (`Reactor.run`) raises `RubyReactor::Lock::AcquisitionError` / `RubyReactor::Semaphore::AcquisitionError` / `RubyReactor::RateLimit::ExceededError` / `RubyReactor::OrderedLock::WaitError`.
+ - **Async** (Sidekiq) snoozes the job via `perform_in(delay, ...)`. For rate limits the delay uses the error's `retry_after_seconds` hint (precise wakeup — the bucket roll time is known exactly); for locks, semaphores, and ordered-lock waits it's `lock_snooze_base_delay + jitter` (a short re-poll, since a held lock or a live blocker nonce typically clears in milliseconds). Snoozes do not count against the Sidekiq retry budget. After `lock_snooze_max_attempts` snoozes the context is marked failed (ordered-lock waits bypass the cap — see the ordered-lock docs).

  On dedup hits (period gate already marked), the reactor returns a `RubyReactor::Skipped` result instead — no steps run, no exception:

@@ -472,7 +512,7 @@ step :ensure_active do
  end
  ```

- See [Locks, Semaphores, Rate Limits & Periods](documentation/locks_and_semaphores.md) for re-entrancy, auto-extend, multi-window quotas, bucket semantics, owner identity, snooze tuning, and operational notes.
+ See [Locks, Semaphores, Rate Limits, Periods & Ordered Locks](documentation/locks_and_semaphores.md) for re-entrancy, auto-extend, multi-window quotas, bucket semantics, owner identity, snooze tuning, ordered-lock assignment + poison-pill semantics, and operational notes.

  ### Map & Parallel Execution

@@ -986,9 +1026,9 @@ Learn how to pause and resume reactors to handle long-running processes, manual
  ### [Testing with RSpec](documentation/testing.md)
  Comprehensive guide to testing reactors with RubyReactor's testing utilities. Learn about the `TestSubject` class for reactor execution and introspection, step mocking for isolating dependencies, testing nested and composed reactors, and custom RSpec matchers like `be_success`, `have_run_step`, and `have_retried_step`.

- ### [Locks, Semaphores, Rate Limits & Periods](documentation/locks_and_semaphores.md)
+ ### [Locks, Semaphores, Rate Limits, Periods & Ordered Locks](documentation/locks_and_semaphores.md)

- Coordinate access to shared resources across processes with Redis-backed primitives: exclusive locks (`with_lock`), concurrency-limiting semaphores (`with_semaphore`), fixed-window rate limits with multi-window quotas (`with_rate_limit`), and calendar-bucketed dedup (`with_period`, returning `Skipped` results). Covers re-entrancy across composed reactors, TTL auto-extend, inline-vs-async contention behavior, smart `retry_after` snoozes for rate limits, snooze tuning, the token-based semaphore safety model, and once-per-day/month/year scheduling patterns.
+ Coordinate access to shared resources across processes with Redis-backed primitives: exclusive locks (`with_lock`), concurrency-limiting semaphores (`with_semaphore`), fixed-window rate limits with multi-window quotas (`with_rate_limit`), calendar-bucketed dedup (`with_period`, returning `Skipped` results), and strict sequential ordering via a monotonically increasing nonce assigned at enqueue (`with_ordered_lock`). Covers re-entrancy across composed reactors, TTL auto-extend, inline-vs-async contention behavior, smart `retry_after` snoozes for rate limits, snooze tuning, the token-based semaphore safety model, once-per-day/month/year scheduling patterns, ordered-lock counter reset on drain, poison-pill timeouts, and deadlock-safe composition rules.

  ### [Middlewares & OpenTelemetry](documentation/middlewares.md)

@@ -41,6 +41,7 @@ module RubyReactor
  end

  def build
+   warn_if_child_has_ordered_lock!
    dependencies = extract_dependencies_from_mappings

    step_config = {
@@ -78,6 +79,25 @@ module RubyReactor

  private

+ # Composed children bypass `Reactor#run`, so `assign_ordered_lock_nonce!`
+ # never fires for them — their `with_ordered_lock` declaration is silently
+ # ignored. Surface this at class load so users don't expect ordering
+ # enforcement that isn't happening. Nested ordered-lock sequences must be
+ # invoked as top-level `Reactor.run` to participate.
+ def warn_if_child_has_ordered_lock!
+   return unless @composed_reactor_class
+   return unless @composed_reactor_class.respond_to?(:ordered_lock_config)
+   return unless @composed_reactor_class.ordered_lock_config
+
+   parent_name = @reactor&.name || "<anonymous>"
+   child_name = @composed_reactor_class.name || "<anonymous>"
+   RubyReactor.configuration.logger.warn(
+     "RubyReactor: `with_ordered_lock` on #{child_name} is ignored when " \
+     "composed by #{parent_name}##{@name}. Nested ordered-lock sequences " \
+     "are independent and must run via top-level `Reactor.run` to be enforced."
+   )
+ end
+
  def ensure_composed_reactor_class!
    raise ArgumentError, "No block provided for inline compose" unless @composed_reactor_class
  end
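The guard-clause pattern in `warn_if_child_has_ordered_lock!` can be tried in isolation. The sketch below is a simplified stand-in, not the gem's builder: `OrderedChild`, `PlainChild`, and `ComposeCheck` are hypothetical names, and the logger writes to a `StringIO` so the output can be inspected.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-ins: one child class exposes an ordered-lock config,
# the other does not.
class OrderedChild
  def self.ordered_lock_config
    { strict: true }
  end
end

class PlainChild; end

class ComposeCheck
  def initialize(child, step_name, logger)
    @child = child
    @step_name = step_name
    @logger = logger
  end

  # Guard clauses mirror the diff: no class, no reader, or no config means
  # there is nothing to warn about.
  def warn_if_child_has_ordered_lock!
    return unless @child
    return unless @child.respond_to?(:ordered_lock_config)
    return unless @child.ordered_lock_config

    @logger.warn("`with_ordered_lock` on #{@child.name} is ignored when composed as ##{@step_name}")
  end
end

io = StringIO.new
logger = Logger.new(io)
ComposeCheck.new(OrderedChild, :apply, logger).warn_if_child_has_ordered_lock!
ComposeCheck.new(PlainChild, :apply, logger).warn_if_child_has_ordered_lock!
puts io.string # one WARN line, for OrderedChild only
```

Checking `respond_to?` first keeps the guard safe for composed classes that never loaded the ordered-lock DSL.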
@@ -8,7 +8,7 @@ module RubyReactor
  end

  module ClassMethods
-   attr_reader :lock_config, :semaphore_config, :period_config, :rate_limit_config
+   attr_reader :lock_config, :semaphore_config, :period_config, :rate_limit_config, :ordered_lock_config

    # Propagate lock/semaphore/period/rate-limit config to subclasses;
    # without this a subclass of a configured reactor would silently lose
@@ -19,6 +19,7 @@ module RubyReactor
    subclass.instance_variable_set(:@semaphore_config, @semaphore_config) if @semaphore_config
    subclass.instance_variable_set(:@period_config, @period_config) if @period_config
    subclass.instance_variable_set(:@rate_limit_config, @rate_limit_config) if @rate_limit_config
+   subclass.instance_variable_set(:@ordered_lock_config, @ordered_lock_config) if @ordered_lock_config
  end

  # Configure locking for this reactor
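The reason the `inherited` hook has to copy `@ordered_lock_config` is that class-level instance variables belong to a single class object and are not visible from subclasses. A minimal sketch with a hypothetical `MiniDsl` module (not the gem's DSL module) shows the effect:

```ruby
# Hypothetical mini-DSL: config lives in a class-level instance variable,
# so the `inherited` hook must copy it to each subclass.
module MiniDsl
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_reader :ordered_lock_config

    def inherited(subclass)
      super
      subclass.instance_variable_set(:@ordered_lock_config, @ordered_lock_config) if @ordered_lock_config
    end

    def with_ordered_lock(strict: true, &block)
      @ordered_lock_config = { strict: strict, key_proc: block }
    end
  end
end

class BaseReactor
  include MiniDsl
  with_ordered_lock { |inputs| "txs:#{inputs[:account_id]}" }
end

class ChildReactor < BaseReactor; end

config = ChildReactor.ordered_lock_config
puts config[:key_proc].call({ account_id: 42 }) # txs:42
```

Without the hook, `ChildReactor.ordered_lock_config` would return `nil` and the subclass would silently run unordered.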
@@ -74,6 +75,45 @@ module RubyReactor
    }
  end

+ # Configure strict-ordering nonce gating for this reactor. A
+ # monotonically increasing nonce is assigned at enqueue time; the
+ # worker can only proceed when its nonce equals `last_completed + 1`.
+ # Otherwise the worker raises {OrderedLock::WaitError} and the Sidekiq
+ # worker snoozes via `perform_in`.
+ #
+ # Counters reset to 0 once the sequence fully drains (last_completed
+ # catches up to next). Re-entrancy is NOT supported — a nested reactor
+ # with its own `with_ordered_lock` is an independent sequence.
+ #
+ # @param poison_pill_timeout [Integer] seconds since the blocker nonce
+ # was assigned before the gate auto-advances past it. Protects
+ # against permanent head-of-line blocking from a caller that INCRed
+ # the counter but crashed before enqueueing.
+ # @param ttl [Integer] TTL on the counter keys, refreshed on every
+ # assign. Only fully-drained sequences GC themselves.
+ # @param strict [Boolean] When true (default), if any nonce in the
+ # sequence terminates with a `Failure`, all subsequent nonces are
+ # short-circuited with `Skipped(reason: :ordered_lock_chain_failed)`
+ # instead of executing. This models "stop the line on the first
+ # problem" pipelines (e.g. ledger transactions). When false, the
+ # sequence keeps executing every nonce in order regardless of prior
+ # failures. The poison state is per-key and clears on full drain. The
+ # check only applies to a fresh `execute`; an already-started run
+ # that paused (InterruptResult/AsyncResult) completes on resume even
+ # if the chain failed in the meantime.
+ # @yield [inputs] Block that returns the ordered-lock key string.
+ def with_ordered_lock(poison_pill_timeout: OrderedLock::DEFAULT_POISON_PILL_TIMEOUT,
+                       ttl: OrderedLock::DEFAULT_TTL,
+                       strict: true,
+                       &block)
+   @ordered_lock_config = {
+     poison_pill_timeout: poison_pill_timeout,
+     ttl: ttl,
+     strict: strict,
+     key_proc: block
+   }
+ end
+
  # Configure rate limiting for this reactor (fixed-window counter).
  # Pass either a single window via `limit:` + `period:`, or a hash of
  # windows via `limits:` for layered API quotas.
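The `strict:` flag documented for `with_ordered_lock` can be illustrated with a small simulation. This is a sketch of the described behavior only: `run_sequence` is a hypothetical helper, jobs are plain lambdas already in nonce order, and `:skipped_chain_failed` stands in for `Skipped(reason: :ordered_lock_chain_failed)`.

```ruby
# Hypothetical simulation of the `strict:` flag. Each job returns :ok or
# :failed; the array order plays the role of nonce order.
def run_sequence(jobs, strict:)
  chain_failed = false
  jobs.map do |job|
    # In strict mode, everything after the first failure is skipped.
    next :skipped_chain_failed if strict && chain_failed

    outcome = job.call
    chain_failed = true if outcome == :failed
    outcome
  end
end

jobs = [-> { :ok }, -> { :failed }, -> { :ok }]
puts run_sequence(jobs, strict: true).inspect   # [:ok, :failed, :skipped_chain_failed]
puts run_sequence(jobs, strict: false).inspect  # [:ok, :failed, :ok]
```

The simulation leaves out the per-key poison state clearing on full drain and the exemption for runs that were already in flight, both of which the doc comment describes.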
@@ -0,0 +1,307 @@
+ # frozen_string_literal: true
+
+ module RubyReactor
+   class Executor
+     # Gate check and terminal-advance logic for `with_ordered_lock`. Mixed
+     # into Executor to keep that class under the length limit. All methods
+     # read from `@context.private_data[:ordered_lock]`, which Reactor#run
+     # populates at enqueue time.
+     module OrderedLockSupport
+       # Thread-local stack of ordered-lock keys whose steps are currently
+       # running in this thread. Used to detect a synchronous `Reactor.run`
+       # nested under another ordered-lock reactor on the same key — which
+       # would deadlock since the outer holds the slot and the inner can never
+       # advance.
+       THREAD_LOCAL_ACTIVE_KEYS = :ruby_reactor_active_ordered_locks
+
+       # Minimum interval between liveness heartbeats; protects very small
+       # poison_pill_timeouts (mirrors Lock::MIN_EXTEND_INTERVAL).
+       HEARTBEAT_MIN_INTERVAL = 1.0
+
+       def self.active_keys
+         Thread.current[THREAD_LOCAL_ACTIVE_KEYS] ||= []
+       end
+
+       # Parse the ordered-lock stash from a context's private_data, surviving
+       # the JSON round-trip (symbol or string keys). Module-level so the
+       # Sidekiq worker can advance the nonce on escalation paths that never
+       # construct an Executor.
+       def self.info_from(context)
+         data = context.private_data[:ordered_lock] || context.private_data["ordered_lock"]
+         return nil unless data
+
+         strict_raw = data[:strict]
+         strict_raw = data["strict"] if strict_raw.nil?
+         strict_raw = true if strict_raw.nil?
+
+         {
+           key: data[:key] || data["key"],
+           nonce: (data[:nonce] || data["nonce"]).to_i,
+           epoch: (data[:epoch] || data["epoch"]).to_i,
+           poison_pill_timeout: (data[:poison_pill_timeout] || data["poison_pill_timeout"] ||
+                                 OrderedLock::DEFAULT_POISON_PILL_TIMEOUT).to_i,
+           ttl: (data[:ttl] || data["ttl"] || OrderedLock::DEFAULT_TTL).to_i,
+           strict: [true, "true"].include?(strict_raw)
+         }
+       end
+
+       # Strict-ordering gate. Runs BEFORE rate-limit / lock / semaphore so a
+       # waiting nonce never holds any other primitive — preventing
+       # hold-and-wait deadlocks when `with_lock` and `with_ordered_lock`
+       # share inputs. Raises {OrderedLock::WaitError}; the Sidekiq worker
+       # rescues and snoozes.
+       def check_ordered_lock_gate
+         info = ordered_lock_info
+         return :go unless info
+
+         OrderedLock.new(
+           info.fetch(:key),
+           nonce: info.fetch(:nonce),
+           epoch: info.fetch(:epoch),
+           poison_pill_timeout: info.fetch(:poison_pill_timeout),
+           strict: info.fetch(:strict)
+         ).check!
+       end
+
+       # Combined gate-check + thread-local push. Call at the top of
+       # `execute` / `resume_execution`. Pair with `leave_ordered_lock_scope`
+       # in `ensure`.
+       #
+       # The strict-mode chain-skip only fires on a *fresh* start (no step
+       # has run yet on this context). This lets an in-flight run that paused
+       # (Interrupt / AsyncResult) complete on resume regardless of chain
+       # failures that landed while it was parked, while still applying
+       # strict to a fresh Sidekiq job (which enters via `resume_execution`
+       # but has no prior step state).
+       def enter_ordered_lock_scope
+         gate = check_ordered_lock_gate
+         # A stale-batch run never participates regardless of fresh/resume state —
+         # its numbering belongs to a drained generation. Chain-skip stays gated
+         # on a fresh start so an in-flight paused run still completes on resume.
+         @ordered_lock_stale_batch = gate == :stale_batch
+         @ordered_lock_chain_skip = fresh_ordered_lock_start? && gate == :skip_chain_failed
+
+         # Drained-batch gate: the batch GC'd while this caller slept. A genuine
+         # late straggler runs (poison semantics); a Sidekiq redelivery of an
+         # ALREADY-terminal context must not re-execute its steps. Only the
+         # latter — confirmed by a terminal stored status — is short-circuited.
+         @ordered_lock_drained_replay = gate == :drained_go && stored_status_terminal?
+
+         info = ordered_lock_info
+         return unless info
+
+         OrderedLockSupport.active_keys << info[:key]
+
+         # Only a run that will actually execute steps needs a heartbeat. A
+         # short-circuiting run (stale batch / strict chain skip / drained
+         # redelivery) does no work and terminally advances immediately, so
+         # starting a thread for it is pointless churn.
+         return if @ordered_lock_stale_batch || @ordered_lock_chain_skip || @ordered_lock_drained_replay
+
+         start_ordered_lock_heartbeat(info)
+       end
+
+       def fresh_ordered_lock_start?
+         @context.intermediate_results.empty? && @context.current_step.nil?
+       end
+
+       def ordered_lock_chain_skip?
+         @ordered_lock_chain_skip == true
+       end
+
+       def ordered_lock_stale_batch?
+         @ordered_lock_stale_batch == true
+       end
+
+       def ordered_lock_drained_replay?
+         @ordered_lock_drained_replay == true
+       end
+
+       # Terminal Skipped result when the ordered-lock gate short-circuits this
+       # run (stale batch, strict chain failure, or a drained-batch redelivery of
+       # an already-terminal context), or nil to continue. Shared by `execute`
+       # and `resume_execution`.
+       def ordered_lock_short_circuit
+         return RubyReactor::Skipped.new(reason: :ordered_lock_stale_batch) if ordered_lock_stale_batch?
+         return RubyReactor::Skipped.new(reason: :ordered_lock_drained_replay) if ordered_lock_drained_replay?
+         return RubyReactor::Skipped.new(reason: :ordered_lock_chain_failed) if ordered_lock_chain_skip?
+
+         nil
+       end
+
+       # Pre-step short-circuit: ordered-lock gate skip or already-marked
+       # period bucket. Returns a terminal result or nil.
+       def short_circuit_result
+         ordered_lock_short_circuit || check_period_gate
+       end
+
+       def short_circuit!(result)
+         @result = result
+
+         # A stale-batch or drained-batch-redelivery skip means this run's epoch
+         # belongs to a drained generation — typically a slow straggler or a
+         # Sidekiq at-least-once redelivery. If the redelivery is of a job that
+         # ALREADY reached a terminal status, its stored context is the source of
+         # truth; writing :skipped over a :completed/:failed record would silently
+         # corrupt the outcome. Return the skip to the worker (so it stops)
+         # without saving. The `@skip_context_persist` flag also suppresses the
+         # ensure-block save in execute / resume_execution, which would otherwise
+         # clobber the stored terminal record with this run's stale in-memory
+         # status.
+         if redelivery_of_terminal?(result)
+           @skip_context_persist = true
+           return @result
+         end
+
+         update_context_status(@result)
+         save_context
+         @result
+       end
+
+       def skip_context_persist?
+         @skip_context_persist == true
+       end
+
+       # True when the skip is one of the drained-generation reasons (stale batch
+       # or drained-batch replay) AND the stored context already reached a
+       # terminal status — i.e. this is a redelivery of an already-finished run
+       # whose record must not be overwritten. The drained-replay flag is only
+       # set when the status was terminal, but re-checking keeps both paths
+       # uniform and self-guarding.
+       def redelivery_of_terminal?(result)
+         return false unless result.is_a?(RubyReactor::Skipped)
+         return false unless %i[ordered_lock_stale_batch ordered_lock_drained_replay].include?(result.reason)
+
+         stored_status_terminal?
+       end
+
+       def stored_status_terminal?
+         %w[completed failed skipped].include?(stored_context_status)
+       end
+
+       def stored_context_status
+         reactor_class_name = @reactor_class.name || "AnonymousReactor-#{@reactor_class.object_id}"
+         data = RubyReactor.configuration.storage_adapter.retrieve_context(@context.context_id, reactor_class_name)
+         return nil unless data
+
+         (data["status"] || data[:status]).to_s
+       rescue StandardError
+         nil
+       end
+
+       # Combined terminal-advance + thread-local pop. Idempotent: safe to call
+       # in `ensure` even if `enter_ordered_lock_scope` never pushed (gate
+       # raised, or no ordered_lock configured).
+       def leave_ordered_lock_scope
+         # Stop (and join) the heartbeat BEFORE advancing: the advance HDELs this
+         # nonce's assigned_at, and a heartbeat racing that HDEL could restamp it.
+         # The HEARTBEAT_SCRIPT's hexists guard makes a late restamp a harmless
+         # no-op, but joining first keeps the ordering deterministic.
+         stop_ordered_lock_heartbeat
+         advance_ordered_lock_if_terminal
+         info = ordered_lock_info
+         return unless info
+
+         stack = OrderedLockSupport.active_keys
+         idx = stack.rindex(info[:key])
+         stack.delete_at(idx) if idx
+       end
+
+       # Background thread that restamps this nonce's assigned_at every pp/3
+       # seconds (floored at HEARTBEAT_MIN_INTERVAL) while its steps run, so a
+       # legitimately slow blocker is not poison-passed by a successor. Mirrors
+       # the Lock auto-extend thread. The thread sleeps FIRST, so a job that
+       # finishes faster than one interval never touches Redis.
+       def start_ordered_lock_heartbeat(info)
+         return if @ordered_lock_heartbeat_running
+
+         pp = info[:poison_pill_timeout].to_f
+         interval = [pp / 3.0, HEARTBEAT_MIN_INTERVAL].max
+         @ordered_lock_heartbeat_running = true
+         lock = OrderedLock.new(
+           info.fetch(:key), nonce: info.fetch(:nonce), epoch: info.fetch(:epoch)
+         )
+
+         @ordered_lock_heartbeat = Thread.new do
+           while @ordered_lock_heartbeat_running
+             sleep interval
+             break unless @ordered_lock_heartbeat_running
+
+             begin
+               lock.heartbeat!
+             rescue StandardError => e
+               RubyReactor.configuration.logger.warn(
+                 "RubyReactor ordered_lock heartbeat failed for '#{info[:key]}' " \
+                 "nonce #{info[:nonce]}: #{e.message}"
+               )
+               break
+             end
+           end
+         end
+       end
+
+       def stop_ordered_lock_heartbeat
+         return unless @ordered_lock_heartbeat_running
+
+         @ordered_lock_heartbeat_running = false
+         thread = @ordered_lock_heartbeat
+         @ordered_lock_heartbeat = nil
+         return unless thread
+
+         thread.wakeup if thread.alive?
+         thread.join(0.1)
+       rescue StandardError
+         # Best-effort shutdown; never let heartbeat teardown break the ensure chain.
+       end
+
+       # Advance the cursor when this run reached a *terminal* status.
+       # Retry-queued, interrupted, or async-handed-off results keep the same
+       # nonce owning the slot — a Sidekiq retry must not double-advance. A
+       # terminal `Failure` is also recorded as the chain's poison marker
+       # (only the FIRST such failure sticks).
+       def advance_ordered_lock_if_terminal
+         info = ordered_lock_info
+         return unless info
+         return unless terminal_for_ordered_lock?(@result)
+
+         OrderedLockSupport.advance_with_retry(info, failed: @result.is_a?(RubyReactor::Failure))
+       end
+
+       # A missed advance on a terminal result stalls every successor for up to
+       # poison_pill_timeout with only a warn line as evidence, so one transient
+       # Redis blip is worth absorbing here before giving up.
+       def self.advance_with_retry(info, failed:)
+         attempts = 0
+         begin
+           attempts += 1
+           OrderedLock.new(
+             info.fetch(:key), nonce: info.fetch(:nonce), epoch: info.fetch(:epoch), ttl: info.fetch(:ttl)
+           ).advance!(failed: failed)
+         rescue StandardError => e
+           retry if attempts < 2
+
+           RubyReactor.configuration.logger.warn(
+             "RubyReactor failed to advance ordered_lock '#{info[:key]}' nonce #{info[:nonce]} " \
+             "after #{attempts} attempts: #{e.message} — successors will stall until " \
+             "poison_pill_timeout (#{info[:poison_pill_timeout]}s) expires"
+           )
+         end
+       end
+
+       private
+
+       def ordered_lock_info
+         OrderedLockSupport.info_from(@context)
+       end
+
+       def terminal_for_ordered_lock?(result)
+         case result
+         when RubyReactor::AsyncResult, RubyReactor::InterruptResult, RetryQueuedResult
+           false
+         when RubyReactor::Success, RubyReactor::Failure
+           true
+         end
+       end
+     end
+   end
+ end
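Two small pieces of the support module above can be exercised on their own: reading the stash with either symbol or string keys after a JSON round-trip, and the heartbeat interval of one third of the timeout with a floor. The sketch below is a simplified stand-in: `fetch_either`, `normalize_ordered_lock`, and `heartbeat_interval` are hypothetical names, and the default timeout of 300 is an assumed value for illustration only.

```ruby
require "json"

DEFAULT_POISON_PILL_TIMEOUT = 300 # assumed value, for this sketch only
HEARTBEAT_MIN_INTERVAL = 1.0

# Read a field whether the hash has symbol keys (in-process) or string keys
# (after JSON serialization). Uses nil checks so `false` survives.
def fetch_either(data, key)
  value = data[key]
  value.nil? ? data[key.to_s] : value
end

def normalize_ordered_lock(data)
  strict_raw = fetch_either(data, :strict)
  strict_raw = true if strict_raw.nil? # a missing flag defaults to strict

  {
    key: fetch_either(data, :key),
    nonce: fetch_either(data, :nonce).to_i,
    poison_pill_timeout: (fetch_either(data, :poison_pill_timeout) || DEFAULT_POISON_PILL_TIMEOUT).to_i,
    strict: [true, "true"].include?(strict_raw)
  }
end

# Heartbeat period: one third of the timeout, never below the floor.
def heartbeat_interval(poison_pill_timeout)
  [poison_pill_timeout.to_f / 3.0, HEARTBEAT_MIN_INTERVAL].max
end

original = { key: "txs:42", nonce: 7, strict: false }
round_tripped = JSON.parse(JSON.generate(original)) # keys become strings

puts normalize_ordered_lock(original) == normalize_ordered_lock(round_tripped) # true
puts heartbeat_interval(300) # 100.0
puts heartbeat_interval(2)   # 1.0
```

The nil check matters for `strict`: with `data[:strict] || data["strict"]`, an explicit `false` under the symbol key would fall through to the string lookup and end up defaulting to `true`.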