tobox 0.6.1 → 0.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 7fcb3a283fa1bb64036319c64c537891a3c4640331ef96058c0a80f311ff6d9a
- data.tar.gz: e6f6a8cd2b0ec8f38cc5e52added655c1922cc79be890fa79c72df3a56418990
+ metadata.gz: d1414826d4ba25bb3b9ac45856cfe3d064efd57c4d0dbac1f5ad6b0fa5887286
+ data.tar.gz: 194b0b90a514e1e4809170baf4389090a06cc62e4456412117ced3094dab89d5
  SHA512:
- metadata.gz: 9b48f7d89bfecba3313b4807fb7cc3139aac28f43c43755657feedddfca7c7fb9048e1b6898becbc3fe05ee9ba28b8a1e57948661743173870ea7b04785a2ddc
- data.tar.gz: 24f01a9794d77d90e406b71be0045c309e7280e46b98ad8aad2375935d47647963780c834833b76a0a4ed838ba5dd95aa184c17e3707059e51cd4ed2f334c932
+ metadata.gz: ec514b27d335a02c0eb2c3d3a6a01cdf3ddd289484b4aa80e51b4664a0f3eff3a179986004d1b6d3b0db338df0a15645a9b622dd1771b81b8d3670c554541367
+ data.tar.gz: 36eabe5466a4912d994893496e8cf76fbc445e7cf437003536cd5688531ec4786ff714f25c391cf969ec3d3fdfe77f6017ccacb0d5285d9b341c5fe5456c0978
data/CHANGELOG.md CHANGED
@@ -1,5 +1,35 @@
  ## [Unreleased]

+ ## [0.7.0] - 2024-12-18
+
+ ### Features
+
+ #### `:pg_notify` plugin
+
+ The `:pg_notify` plugin is introduced, which leverages the PostgreSQL-only `LISTEN`/`NOTIFY` statements to asynchronously notify workers when there are new events to be processed (instead of regularly polling the database).
+
+ #### `visibility_column` and `attempts_column` configuration
+
+ The `visibility_column` configuration (default: `:run_at`) can point not only to a differently named column: it can reference either a timestamp column (used as a visibility timeout for events) or a boolean column (used to hide events from other transactions); using the latter also disables exponential backoff in retries.
+
+ The `attempts_column` configuration (default: `:attempts`) can be used to point to a differently named column, or, when set to `nil`, to uncap retries.
+
+ See the recommendations section of the README for when to use them.
+
+ ### Improvements
+
+ #### fiber pool on async
+
+ Support for the `fiber_scheduler` gem, which is under-maintained, was removed; the fiber worker pool uses an [async](https://github.com/socketry/async) scheduler going forward. Functionality and APIs stay the same, but you'll have to add the gem to your Gemfile.
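For instance (a minimal sketch; the scheduler lives in the `async` gem):

```ruby
# Gemfile
gem "async"
```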
+
+ ### Bugfixes
+
+ * `datadog` integration: environment variables were wrongly named.
+
+ ### Chore
+
+ * `mutex_m` usage was removed.
+
  ## [0.6.1] - 2024-10-30

  ### Improvements
data/README.md CHANGED
@@ -21,8 +21,10 @@ Simple, data-first events processing framework based on the [transactional outbo
  - [Sentry](#sentry)
  - [Datadog](#datadog)
  - [Stats](#stats)
+ - [PG Notify](#pg-notify)
  - [Advanced](#advanced)
  - [Batch Events Handling](#batch-events)
+ - [Recommendations](#recommendations)
  - [Supported Rubies](#supported-rubies)
  - [Rails support](#rails-support)
  - [Why?](#why)
@@ -201,6 +203,28 @@ the name of the database table where outbox events are stored (`:outbox` by defa
  table :outbox
  ```

+ ### `visibility_column`
+
+ the name of the database column used to mark an event as invisible while being handled (`:run_at` by default).
+
+ The column type MUST be either a datetime (or timestamp, depending on your database) or a boolean (if your database supports it; MySQL, for example, doesn't).
+
+ If it's a datetime/timestamp column, its value is used, along with the `visibility_timeout` option, to mark the event as invisible for the given duration; this ensures that, when event handling is non-transactional (via the `:progress` plugin), the event will eventually be picked up again after a crash. If it's a boolean column, the event is marked as invisible indefinitely, so in case of a crash you'll need to recover it manually.
+
+ ```ruby
+ visibility_column :run_at
+ ```
+
+ ### `attempts_column`
+
+ the name of the database column storing the number of times an event was handled and failed (`:attempts` by default). If `nil`, events will be retried indefinitely.
+
+ ### `created_at_column`
+
+ the name of the database column where the event creation timestamp is stored (`:created_at` by default).
+
+ When creating the outbox table, you're **recommended** to set this column default to `CURRENT_TIMESTAMP` (or the equivalent in your database), instead of passing it manually in the corresponding `INSERT` statements.
+
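As an illustration, here is a minimal Sequel migration sketch for an outbox table matching the default column names above; the column list is illustrative rather than the gem's full schema, so adapt types and payload columns to your needs:

```ruby
Sequel.migration do
  change do
    create_table(:outbox) do
      primary_key :id
      column :type, String, null: false      # event type, matched by handlers
      column :attempts, Integer, default: 0  # attempts_column
      column :run_at, :timestamp             # visibility_column (timestamp variant)
      column :last_error, String, text: true
      column :created_at, :timestamp, default: Sequel::CURRENT_TIMESTAMP
    end
  end
end
```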
  ### `max_attempts`

  Maximum number of times a failed attempt to process an event will be retried (`10` by default).
@@ -536,7 +560,6 @@ end

  #### Configuration

-
  ##### inbox table

  Defines the name of the table to be used for inbox (`:inbox` by default).
@@ -589,16 +612,23 @@ end
  Plugin for [datadog](https://github.com/DataDog/dd-trace-rb) ruby SDK. It'll generate traces for event handling.

  ```ruby
- # you can init the datadog config in another file to load:
- Datadog.configure do |c|
- c.tracing.instrument :tobox
- end
-
  # tobox.rb
  plugin(:datadog)
+ # or, if you want to pass options to the tracing call:
+ plugin(:datadog, enabled: false)
+ # or, if you want to access the datadog configuration:
+ plugin(:datadog) do |c|
+ c.tracing.instrument :smth_else
+ end
  ```

- <a id="markdown-datadog" name="stats"></a>
+ `datadog` tracing functionality can also be enabled/disabled via environment variables, namely the following:
+
+ * `DD_TOBOX_ENABLED`: enables/disables tobox tracing (defaults to `true`)
+ * `DD_TOBOX_ANALYTICS_ENABLED`: enables/disables tobox analytics (defaults to `true`)
+ * `DD_TRACE_TOBOX_ANALYTICS_SAMPLE_RATE`: sets tobox tracing sample rate (defaults to `1.0`)
+
+ <a id="markdown-stats" name="stats"></a>
  ### Stats

  The `stats` plugin collects statistics related with the outbox table periodically, and exposes them to app code (which can then relay them to a statsD collector, or similar tool).
@@ -662,6 +692,29 @@ c.on_stats(5) do |stats_collector, db|
  end
  ```

+
+ <a id="markdown-pg-notify" name="pg-notify"></a>
+ ### PG Notify
+
+ The `pg_notify` plugin is a **PostgreSQL-only** plugin, which uses the [LISTEN](https://www.postgresql.org/docs/current/sql-listen.html) statement to pause the workers when no work is available in the outbox table, until the producer signals otherwise by using the [NOTIFY](https://www.postgresql.org/docs/current/sql-notify.html) statement on the channel the workers are listening to.
+
+ It reduces the `SELECT ... FOR UPDATE SKIP LOCKED` statements to the bare minimum required. Without this plugin, given enough load, these statements may become a source of overhead in the master replica: they're handled as "write statements", so resources must be allocated; their high frequency slows down applying changes to (and using) indexes on the outbox table, which may make subsequent queries fall back to a table scan, which holds dead tuples from used transaction xids for longer, which delays vacuuming, which increases replication lag, which... you get the gist.
+
+ ```ruby
+ plugin(:pg_notify)
+ notifier_channel :outbox_notifications # default
+
+ # that's it
+ ```
+
+ **NOTE**: this plugin can't be used with `jruby`.
+
+ #### Configuration
+
+ ##### `notifier_channel`
+
+ Identifies the name of the channel the `LISTEN` and `NOTIFY` SQL statements will refer to (`:outbox_notifications` by default).
+
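On the producer side, the channel has to be notified after an event is inserted. A minimal sketch of what that could look like with Sequel, assuming the default channel name (PostgreSQL only delivers the notification once the transaction commits):

```ruby
DB.transaction do
  DB[:outbox].insert(type: "user_created") # plus your event payload columns
  DB.notify(:outbox_notifications)         # delivered to listeners at commit time
end
```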
  <a id="markdown-advanced" name="advanced"></a>
  ## Advanced

@@ -711,6 +764,104 @@ on("user_created", "user_updated") do |*events| # 10 events at most
  end
  end
  ```
+ <a id="markdown-recommendations" name="recommendations"></a>
+ ## Recommendations
+
+ There is no free lunch. Having a transactional outbox has a cost: throughput is sacrificed in order to guarantee the processing of the event. The cost has to be reasonable, however.
+
+ ### PostgreSQL
+
+ PostgreSQL is the most popular database around, and for good reason: extensible, feature-rich, and quite performant for most workloads. It does have some known drawbacks though: its implementation of MVCC, with the creation of new tuples for UPDATEs and DELETEs, along with the requirement for indexes to point to the address of the most recent tuple, and WAL logs having to bookkeep all of that (which impacts, among other things, disk usage and replication), significantly impacts the performance of transaction management. This phenomenon is known as "write amplification".
+
+ Considering the additional overhead that a transactional outbox introduces to the same database your main application uses, certain issues may escalate badly, and it'll be up to you to apply strategies to mitigate them. Here are some recommendations.
+
+ ### Tweak `max_connections`
+
+ By default, a `tobox` consumer process will have as many database connections as there are workers (each worker polls the outbox table individually). As the system scales out to cope with more traffic, you may see that, as more workers are added, query latency (and database CPU usage) increases as well.
+
+ One way to address that is to limit the number of database connections the workers in a `tobox` consumer process can use, by setting the `max_connections` configuration option to a number lower than `concurrency` (e.g. 1/3 or 1/4 of it). As a result, when no connection is available, workers will wait for one before fetching work.
+
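A minimal sketch of that setup (the numbers are illustrative; pick a ratio that matches your workload):

```ruby
# tobox.rb
concurrency 16    # 16 workers...
max_connections 4 # ...sharing 4 database connections (1/4 of concurrency)
```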
+ #### Caveats
+
+ This is not the main source of query latency overhead, however, and you may start seeing "pool timeout" errors as a result, so do monitor performance and apply other mitigations accordingly.
+
+ ### Handling events in batches
+
+ By default, each worker will fetch-and-handle-then-delete events one by one. As surges happen and volume increases, the database will spend far more time and resources managing transactions than doing the actual work you need, thereby affecting overall turnaround time. In the case of PostgreSQL, the constant DELETEs and UPDATEs may lead the query planner to stop using indexes to find an event and fall back to a table scan, if an index is assumed to be "behind" due to a large queue of pending updates from valid transactions.
+
+ A way to mitigate this is to [handle more events at once](#batch-events). It's a strategy that makes sense if the event handler APIs support batching. For instance, if all your event handler does is relay to AWS SNS, you can use the [PublishBatch](https://docs.aws.amazon.com/sns/latest/api/API_PublishBatch.html) API (and adjust the batching window to the maximum you're able to handle at once).
+
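A rough sketch of such a batched relay, assuming the `aws-sdk-sns` gem and a hypothetical payload serialization (adapt the message body and topic ARN to your event columns and setup):

```ruby
require "json"
require "aws-sdk-sns"

sns = Aws::SNS::Client.new

# tobox.rb
batch_size 10 # PublishBatch accepts at most 10 entries

on("user_created", "user_updated") do |*events|
  sns.publish_batch(
    topic_arn: ENV.fetch("USER_EVENTS_TOPIC_ARN"),
    publish_batch_request_entries: events.map.with_index do |event, idx|
      { id: idx.to_s, message: JSON.dump(id: event[:id], type: event[:type]) }
    end
  )
end
```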
+ #### Caveats
+
+ As mentioned above, it makes sense to use this if events can be handled as a batch; if that's not the case, and the handling block iterates over the batch one by one, this will cause significant variance in single-event turnaround time (TaT) metrics, as a "slow to handle" event will delay subsequent events in the batch. Delays can also cause visibility timeouts to expire, and make events visible to other handlers earlier than expected.
+
+ Recovering from errors in a batch is also more convoluted (see `Tobox.raise_batch_errors`).
+
+ ### Disable retries and ordering
+
+ The `tobox` default configuration expects the `visibility_column` to be a datetime column (the default is `:run_at`), which is therefore used as a "visibility timeout" and, along with the `attempts` column, used to retry failed events gracefully with an exponential backoff interval.
+
+ As a consequence, and in order to ensure reliable performance of the worker polling query, a sorted index is recommended; in PostgreSQL, that's `CREATE INDEX ... (id, run_at DESC NULLS FIRST)`, which ensures that new events get handled before retries; you can also append `WHERE attempts < 10` to the index statement, in order to rule out events which have exhausted their attempts.
+
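Spelled out as a Sequel migration, that index could look like the following sketch (the index name is illustrative, and the `10` threshold should match your `max_attempts`):

```ruby
Sequel.migration do
  up do
    run <<~SQL
      CREATE INDEX outbox_pending_events_idx
      ON outbox (id, run_at DESC NULLS FIRST)
      WHERE attempts < 10;
    SQL
  end

  down do
    run "DROP INDEX outbox_pending_events_idx;"
  end
end
```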
+ This comes at the cost of increased overhead per event: when producing it via an `INSERT` statement, the sorted index will have to be rebalanced. When picking it up, setting the "visibility timeout" before handling it will rebalance it again; and after handling it, whether successfully or not, it'll be rebalanced yet again. This will increase the backlog associated with index management, which may have other consequences (described elsewhere in this section).
+
+ You may observe in your systems that your handler either never fails, or, when it does, it's the type of transient error which can be retried immediately after, at a marginal cost. In such situations, the default "planning for failure" exponential backoff strategy described above imposes too much overhead for little gain.
+
+ You can improve this by setting `visibility_column` to a boolean column, with default set to `false`:
+
+ ```ruby
+ # in migration
+ column :in_progress, :boolean, default: false
+
+ # tobox
+ visibility_column :in_progress
+ # and, if you require unbounded retries
+ attempts_column nil
+ ```
+
+ This should improve the performance of the main polling query by **not requiring a sorted index on the visibility column** (i.e. the primary key index is all you need), relying instead on boolean conditions (rather than the more expensive datetime comparisons).
+
+ #### Caveats
+
+ While using a boolean column as the `visibility_column` may improve the performance of most queries and reduce the overhead of writes, event handling will not be protected against database crashes, so you'll have to monitor idle events and recover them manually (by resetting the `visibility_column` to `false`).
+
+ ### Do not poll, get notified
+
+ The database must allocate resources and bookkeep some data for each transaction. In some cases (i.e. PostgreSQL), some of that bookkeeping does not happen **until** the first write statement is processed. However, due to the usage of locks via `SELECT ... FOR UPDATE`, most databases will treat the polling statement as a write statement, which means that, in a `tobox` process, transaction overhead is ever present. In a high availability configuration, transactional resources will need to be maintained and replicated to read replica nodes, which, given enough replication lag and inability to vacuum data, may snowball resource usage in the master replica, which may trigger autoscaling, causing more workers to poll the database for more work, and eventually bringing the whole system down.
+
+ This can be mitigated by either adjusting the polling interval (via the `wait_for_events_delay` option), or by replacing polling with asynchronously notifying workers when there's work to do. For PostgreSQL, you can use the [pg_notify](#pg-notify) plugin, which uses the PostgreSQL-only `LISTEN`/`NOTIFY` statements to that effect.
+
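A sketch of both options (the 5-second interval is illustrative):

```ruby
# tobox.rb

# option 1: poll less often
wait_for_events_delay 5 # seconds to sleep when no events were fetched

# option 2 (PostgreSQL only): stop polling, wait for NOTIFY
plugin(:pg_notify)
```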
+ #### Caveats
+
+ Using `LISTEN` requires maintaining a separate, long-lived, mostly idle database connection; this approach may not be compatible with your setup, such as when you're using a connection pooler with a particular configuration. For instance, if you're using the popular [pgbouncer](https://www.pgbouncer.org/features.html), this plugin will be incompatible with transaction pooling.
+
+ There will be a slight race condition between the moment a worker fails to fetch an event and the moment it starts listening to the notification channel; if an event arrives in the meantime, and the notification is broadcast before the worker starts listening, the worker won't pick up that work immediately. Given enough entropy and workers, this should remain a theoretical scenario.
+
+ ### Unlogged tables
+
+ By design (storing the event in the same transaction where the associated changes happen), a transactional outbox consumer requires that the outbox table is stored in the same database the application uses, and accesses it via the master replica. As already mentioned, this means associated bookkeeping overhead in the master replica, including WAL logs and replication lag, which under extreme load leads to all kinds of issues in guaranteeing data consistency, despite the outbox table being unused and irrelevant in read nodes.
+
+ In such cases, you may want to set the outbox table as [unlogged](https://www.postgresql.org/docs/current/sql-createtable.html#SQL-CREATETABLE-UNLOGGED), which ensures that associated write statements aren't part of WAL logs, and aren't replicated either. This will massively improve the throughput of the associated traffic, while preserving most of the desired properties of a transactional outbox solution, i.e. writing events along with the associated data, making them visible to consumers only after the transaction commits, and, **in case of a clean shutdown**, ensuring that data is flushed to disk.
+
+ #### Caveats
+
+ The last statement leads to the biggest shortcoming of this recommendation: by choosing to unlog the outbox table, your database cannot ensure 100% consistency for its data in case of a database crash or unclean shutdown, which means you may lose events in that situation. And while outbox data should not be business critical, having less than 100% event handling may be unacceptable to you.
+
+ You may decide to do it temporarily, though, whenever you expect the level of traffic that justifies foregoing 100% consistency; but be aware that an `ALTER TABLE ... SET UNLOGGED` statement **rewrites the table**, so bear that in mind if you try to do this during an ongoing traffic surge / incident. The recommendation is to do this **before** the surge happens, such as the Thursday before a Black Friday.
+
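A sketch of flipping an existing outbox table to unlogged (and back) in a Sequel migration, using raw SQL since this is PostgreSQL-specific; both statements rewrite the table:

```ruby
Sequel.migration do
  up do
    run "ALTER TABLE outbox SET UNLOGGED"
  end

  down do
    run "ALTER TABLE outbox SET LOGGED"
  end
end
```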
+ ### WAL outbox consumer (Debezium/Kafka)
+
+ It takes a lot of write statements to both produce and consume from the outbox table, in the manner in which it is implemented in `tobox`. In PostgreSQL, considering that each write statement on a given row generates a new tuple, that amounts to at least 3 tuples per event. In the long run, and given enough volume, the health of the whole database will be limited by how quickly dead tuples are vacuumed from the outbox table.
+
+ An alternative way to consume outbox events which does not require consuming events via SQL is by using a broker which is able to relay outbox events directly from the WAL logs. One such alternative is [Debezium](https://debezium.io/documentation/reference/stable/integrations/outbox.html), which relays them into Kafka streams.
+
+ This solution means not using `tobox` anymore.
+
+ #### Caveats
+
+ This solution is, at least at the time of writing, limited to Kafka streams; if events are to be relayed to other alternatives (AWS SNS, RabbitMQ...), or there's more to your event handler than relaying, this solution will not work for you either.
+
+ There are also several shortcomings to consider when using Kafka streams; for one, events are consumed one at a time, which will affect event handling turnaround time.

  <a id="markdown-supported-rubies" name="supported-rubies"></a>
  ## Supported Rubies
@@ -7,7 +7,8 @@ module Tobox
  class Configuration
  extend Forwardable

- attr_reader :plugins, :handlers, :lifecycle_events, :arguments_handler, :default_logger, :database, :fetcher_class,
+ attr_reader :plugins, :handlers, :lifecycle_events, :arguments_handler, :default_logger, :database,
+ :fetcher_class, :worker_class,
  :config

  def_delegator :@config, :[]
@@ -19,6 +20,8 @@ module Tobox
  database_uri: nil,
  database_options: nil,
  table: :outbox,
+ visibility_column: :run_at,
+ attempts_column: :attempts,
  created_at_column: nil,
  batch_size: 1,
  max_attempts: 10,
@@ -47,6 +50,7 @@ module Tobox
  @message_to_arguments = nil
  @plugins = []
  @fetcher_class = Class.new(Fetcher)
+ @worker_class = Class.new(Worker)

  if block
  case block.arity
@@ -117,6 +121,11 @@ module Tobox
  self
  end

+ def on_start_worker(&callback)
+ (@lifecycle_events[:start_worker] ||= []) << callback
+ self
+ end
+
  def on_error_worker(&callback)
  (@lifecycle_events[:error_worker] ||= []) << callback
  self
@@ -132,6 +141,16 @@ module Tobox
  self
  end

+ def visibility_type_bool?
+ _, visibility_info = @database.schema(@config[:table]).find do |column, _|
+ column == @config[:visibility_column]
+ end
+
+ raise Error, "a visibility column is required" unless visibility_info
+
+ visibility_info[:type] == :boolean
+ end
+
  def plugin(plugin, **options, &block)
  raise Error, "Cannot add a plugin to a frozen config" if frozen?

@@ -145,6 +164,7 @@ module Tobox
  extend(plugin::ConfigurationMethods) if defined?(plugin::ConfigurationMethods)

  @fetcher_class.__send__(:include, plugin::FetcherMethods) if defined?(plugin::FetcherMethods)
+ @worker_class.__send__(:include, plugin::WorkerMethods) if defined?(plugin::WorkerMethods)

  plugin.configure(self, **options, &block) if plugin.respond_to?(:configure)
  end
data/lib/tobox/fetcher.rb CHANGED
@@ -19,14 +19,27 @@ module Tobox

  @ds = @db[@table]

- run_at_conds = [
- { Sequel[@table][:run_at] => nil },
- (Sequel.expr(Sequel[@table][:run_at]) < Sequel::CURRENT_TIMESTAMP)
- ].reduce { |agg, cond| Sequel.expr(agg) | Sequel.expr(cond) }
+ @visibility_column = configuration[:visibility_column]
+ @attempts_column = configuration[:attempts_column]

- @pick_next_sql = @ds.where(Sequel[@table][:attempts] < max_attempts) # filter out exhausted attempts
- .where(run_at_conds)
- .order(Sequel.desc(:run_at, nulls: :first), :id)
+ @pick_next_sql = @ds
+
+ if @attempts_column
+ # filter out exhausted attempts
+ @pick_next_sql = @pick_next_sql.where(Sequel[@table][@attempts_column] < max_attempts)
+ end
+
+ if configuration.visibility_type_bool?
+ @pick_next_sql = @pick_next_sql.where(@visibility_column => false).order(:id)
+ else
+ visibility_conds = [
+ { Sequel[@table][@visibility_column] => nil },
+ (Sequel.expr(Sequel[@table][@visibility_column]) < Sequel::CURRENT_TIMESTAMP)
+ ].reduce { |agg, cond| Sequel.expr(agg) | Sequel.expr(cond) }
+
+ @pick_next_sql = @pick_next_sql.where(visibility_conds)
+ .order(Sequel.desc(@visibility_column, nulls: :first), :id)
+ end

  @batch_size = configuration[:batch_size]

@@ -129,17 +142,27 @@ module Tobox
  end
  end

- def log_message(msg)
- "(worker: #{@label}) -> #{msg}"
+ def log_message(msg, event)
+ tags = { type: event[:type], attempts: event[@attempts_column] }.compact
+
+ "(worker: #{@label}) -> outbox event " \
+ "(#{tags.map { |*pair| pair.join(": ") }.join(", ")}) #{msg}"
  end

  def mark_as_error(event, error)
+ # @type var update_params: Hash[Symbol, untyped]
  update_params = {
- run_at: calculate_event_retry_interval(event[:attempts]),
- attempts: Sequel[@table][:attempts] + 1,
  last_error: error.full_message(highlight: false)
  }

+ update_params[@attempts_column] = Sequel[@table][@attempts_column] + 1 if @attempts_column
+
+ update_params[@visibility_column] = if @configuration.visibility_type_bool?
+ false
+ else
+ calculate_event_retry_interval(event[@attempts_column])
+ end
+
  set_event_retry_attempts(event, update_params)
  end

@@ -177,7 +200,7 @@ module Tobox

  def handle_before_event(event)
  @logger.debug do
- log_message("outbox event (type: \"#{event[:type]}\", attempts: #{event[:attempts]}) starting...")
+ log_message("starting...", event)
  end
  @before_event_handlers.each do |hd|
  hd.call(event)
@@ -185,7 +208,7 @@ module Tobox
  end

  def handle_after_event(event)
- @logger.debug { log_message("outbox event (type: \"#{event[:type]}\", attempts: #{event[:attempts]}) completed") }
+ @logger.debug { log_message("completed", event) }
  @after_event_handlers.each do |hd|
  hd.call(event)
  end
@@ -193,9 +216,9 @@ module Tobox

  def handle_error_event(event, error)
  @logger.error do
- log_message("outbox event (type: \"#{event[:type]}\", attempts: #{event[:attempts]}) failed with error\n" \
+ log_message("failed with error\n" \
  "#{error.class}: #{error.message}\n" \
- "#{error.backtrace.join("\n")}")
+ "#{error.backtrace.join("\n")}", event)
  end
  @error_event_handlers.each do |hd|
  hd.call(event, error)
@@ -13,7 +13,7 @@ module Datadog
  if Gem::Version.new(DDTrace::VERSION::STRING) >= Gem::Version.new("1.13.0")
  option :enabled do |o|
  o.type :bool
- o.env "DD_TOBOX_SIDEKIQ_ENABLED"
+ o.env "DD_TOBOX_ENABLED"
  o.default true
  end

@@ -30,7 +30,7 @@ module Datadog
  end
  else
  option :enabled do |o|
- o.default { env_to_bool("DD_TOBOX_SIDEKIQ_ENABLED", true) }
+ o.default { env_to_bool("DD_TOBOX_ENABLED", true) }
  o.lazy
  end

@@ -58,7 +58,7 @@ module Tobox

  span.set_tag("tobox.event.id", event[:id])
  span.set_tag("tobox.event.type", event[:type])
- span.set_tag("tobox.event.retry", event[:attempts])
+ span.set_tag("tobox.event.retry", event[@attempts_column]) if @attempts_column
  span.set_tag("tobox.event.table", @db_table)
  span.set_tag("tobox.event.delay", (Time.now.utc - event[:created_at]).to_f)

@@ -26,8 +26,14 @@ module Tobox
  total_from_group = @ds.where(@group_column => group).count

  event_ids = @ds.where(@group_column => group)
- .order(Sequel.desc(:run_at, nulls: :first), :id)
- .for_update.skip_locked.select_map(:id)
+
+ event_ids = if @configuration.visibility_type_bool?
+ event_ids.order(:id)
+ else
+ event_ids.order(Sequel.desc(@visibility_column, nulls: :first), :id)
+ end
+
+ event_ids = event_ids.for_update.skip_locked.select_map(:id)

  if event_ids.size != total_from_group
  # this happens if concurrent workers locked different rows from the same group,
@@ -0,0 +1,85 @@
+ # frozen_string_literal: true
+
+ module Tobox
+ module Plugins
+ module PgNotify
+ class Notifier
+ def initialize(config)
+ @config = config
+ @running = false
+ @notifier_mutex = Thread::Mutex.new
+ @notifier_semaphore = Thread::ConditionVariable.new
+ end
+
+ def start
+ return if @running
+
+ config = @config
+
+ @db = Sequel.connect(config.database.opts.merge(max_connections: 1))
+
+ raise Error, "this plugin only works with postgresql" unless @db.database_type == :postgres
+
+ @db.loggers = config.database.loggers
+ Array(config.lifecycle_events[:database_connect]).each { |cb| cb.call(@db) }
+
+ channel = config[:notifier_channel]
+
+ @th = Thread.start do
+ Thread.current.name = "outbox-notifier"
+
+ @db.listen(channel, loop: true) do
+ signal
+ end
+ end
+
+ @running = true
+ end
+
+ def stop
+ return unless @running
+
+ @th.terminate
+
+ @db.disconnect
+
+ @running = false
+ end
+
+ def wait
+ @notifier_mutex.synchronize do
+ @notifier_semaphore.wait(@notifier_mutex)
+ end
+ end
+
+ def signal
+ @notifier_mutex.synchronize do
+ @notifier_semaphore.signal
+ end
+ end
+ end
+
+ module WorkerMethods
+ attr_writer :notifier
+
+ def wait_for_work
+ @notifier.wait
+ end
+ end
+
+ class << self
+ def configure(config)
+ config.config[:notifier_channel] = :outbox_notifications
+
+ notifier = Notifier.new(config)
+
+ config.on_start_worker { |wk| wk.notifier = notifier }
+
+ config.on_start(&notifier.method(:start))
+ config.on_stop(&notifier.method(:stop))
+ end
+ end
+ end
+ register_plugin :pg_notify, PgNotify
+ end
+ end
@@ -11,15 +11,22 @@ module Tobox
  private

  def do_fetch_events
- # mark events as invisible by using run_at as a visibility timeout
+ # mark events as invisible
+
+ # @type var mark_as_fetched_params: Hash[Symbol, untyped]
  mark_as_fetched_params = {
- run_at: Sequel.date_add(
- Sequel::CURRENT_TIMESTAMP,
- seconds: @configuration[:visibility_timeout]
- ),
- attempts: Sequel[@table][:attempts] + 1,
  last_error: nil
  }
+ mark_as_fetched_params[@attempts_column] = Sequel[@table][@attempts_column] + 1 if @attempts_column
+
+ mark_as_fetched_params[@visibility_column] = if @configuration.visibility_type_bool?
+ true
+ else
+ Sequel.date_add(
+ Sequel::CURRENT_TIMESTAMP,
+ seconds: @configuration[:visibility_timeout]
+ )
+ end

  if @ds.supports_returning?(:update)
  @ds.where(id: fetch_event_ids).returning.update(mark_as_fetched_params)
@@ -36,7 +43,7 @@ module Tobox
  end

  def set_event_retry_attempts(event, update_params)
- update_params.delete(:attempts)
+ update_params.delete(@attempts_column)
  super
  end

@@ -38,15 +38,15 @@ module Tobox
  scope = ::Sentry.get_current_scope

  scope.set_contexts(tobox: {
- id: event[:id],
- type: event[:type],
- attempts: event[:attempts],
- created_at: event[:created_at],
- run_at: event[:run_at],
- last_error: event[:last_error]&.byteslice(0..1000),
- version: Tobox::VERSION,
- db_adapter: @db_scheme
- })
+ id: event[:id],
+ type: event[:type],
+ @attempts_column => event[@config[:attempts_column]],
+ created_at: event[:created_at],
+ @visibility_column => event[@config[:visibility_column]],
+ last_error: event[:last_error]&.byteslice(0..1000),
+ version: Tobox::VERSION,
+ db_adapter: @db_scheme
+ }.compact)
  scope.set_tags(
  outbox: @db_table,
  event_id: event[:id],
@@ -116,7 +116,9 @@ module Tobox
  end

  def capture_exception(event, error)
- if ::Sentry.configuration.tobox.report_after_retries && event[:attempts] && event[:attempts] < @max_attempts
+ if ::Sentry.configuration.tobox.report_after_retries &&
+ event[@config[:attempts_column]] &&
+ event[@config[:attempts_column]] < @max_attempts
  return
  end

@@ -52,7 +52,14 @@ module Tobox

  if @created_at_column
  # discard already handled events
- @oldest_event_age_ds = @outbox_ds.where(last_error: nil, run_at: nil).order(Sequel.asc(:id))
+ #
+ @oldest_event_age_ds = @outbox_ds.where(last_error: nil)
+ @oldest_event_age_ds = if config.visibility_type_bool?
+ @oldest_event_age_ds.where(config[:visibility_column] => false)
+ else
+ @oldest_event_age_ds.where(config[:visibility_column] => nil)
+ end
+ @oldest_event_age_ds = @oldest_event_age_ds.order(Sequel.asc(:id))
  end

  logger = config.default_logger
@@ -104,7 +111,7 @@ module Tobox
  stats = @outbox_ds.group_and_count(
  Sequel.case([
  [{ last_error: nil }, "pending_count"],
- [Sequel.expr([:attempts]) < @max_attempts, "failing_count"]
+ [Sequel.expr(@config[:attempts_column]) < @max_attempts, "failing_count"]
  ],
  "failed_count").as(:status)
  )
@@ -1,7 +1,7 @@
  # frozen_string_literal: true

  require "timeout"
- require "fiber_scheduler"
+ require "async/scheduler"

  module Tobox
  class FiberPool < Pool
@@ -20,18 +20,19 @@ module Tobox
  Thread.current.name = "tobox-fibers-thread"

  begin
- FiberScheduler do
- @fiber_mtx.synchronize do
- @workers.each do |worker|
- @fibers << start_fiber_worker(worker)
- end
- @fiber_cond.signal
+ Fiber.set_scheduler(Async::Scheduler.new)
+
+ @fiber_mtx.synchronize do
+ @workers.each do |worker|
+ @fibers << start_fiber_worker(worker)
  end
+ @fiber_cond.signal
  end
  rescue KillError
  @fibers.each { |f| f.raise(KillError) }
  end
  end
+
  @fiber_mtx.synchronize do
  @fiber_cond.wait(@fiber_mtx)
  end
@@ -43,18 +44,29 @@ module Tobox

  super

- @fiber_thread.join(shutdown_timeout)
+ th = @fiber_thread
+
+ return unless th
+
+ th.join(shutdown_timeout)

- return unless @fiber_thread.alive?
+ return unless th.alive?

- @fiber_thread.raise(KillError)
- @fiber_thread.join(grace_shutdown_timeout)
- @fiber_thread.kill
- @fiber_thread.join(1)
+ th.raise(KillError)
+ th.join(grace_shutdown_timeout)
+ th.kill
+ th.join(1)
  end

  private

+ def handle_exception(wrk, exc)
+ # noop
+ return if exc.is_a?(::Async::Stop)
+
+ super
+ end
+
  def start_fiber_worker(worker)
  Fiber.schedule do
  do_work(worker)
@@ -67,12 +79,13 @@ module Tobox

  raise Error, "worker not found" unless idx

- subst_worker = Worker.new(worker.label, @configuration)
+ subst_worker = @configuration.worker_class.new(worker.label, @configuration)
  @workers[idx] = subst_worker
- subst_fiber = start_fiber_worker(subst_worker)
- @fiber_mtx.synchronize { @fibers << subst_fiber }
+ @fibers << start_fiber_worker(subst_worker)
  end
  end
+ rescue KillError
+ # noop
  end
  end
  end
@@ -1,20 +1,18 @@
  # frozen_string_literal: true

- require "monitor"
-
  module Tobox
  class ThreadedPool < Pool
  def initialize(_configuration)
  @parent_thread = Thread.main
  @threads = []
- @threads.extend(MonitorMixin)
+ @threads_mutex = Thread::Mutex.new
  super
  end

  def start
  @workers.each do |wk|
  th = start_thread_worker(wk)
- @threads.synchronize do
+ @threads_mutex.synchronize do
  @threads << th
  end
  end
@@ -32,7 +30,7 @@ module Tobox
  start = Process.clock_gettime(::Process::CLOCK_MONOTONIC)

  loop do
- terminating_th = @threads.synchronize { @threads.first }
+ terminating_th = @threads_mutex.synchronize { @threads.first }

  return unless terminating_th

@@ -47,9 +45,9 @@ module Tobox
  join.call(shutdown_timeout)

  # hard exit
- @threads.synchronize { @threads.each { |th| th.raise(KillError) } }
+ @threads_mutex.synchronize { @threads.each { |th| th.raise(KillError) } }
  join.call(grace_shutdown_timeout)
- @threads.synchronize { @threads.each(&:kill) }
+ @threads_mutex.synchronize { @threads.each(&:kill) }
  join.call(1)
  end

@@ -61,7 +59,7 @@ module Tobox

  do_work(worker)

- @threads.synchronize do
+ @threads_mutex.synchronize do
  @threads.delete(Thread.current)

  if worker.finished? && @running
@@ -69,7 +67,7 @@ module Tobox

  raise Error, "worker not found" unless idx

- subst_worker = Worker.new(worker.label, @configuration)
+ subst_worker = @configuration.worker_class.new(worker.label, @configuration)
  @workers[idx] = subst_worker
  subst_thread = start_thread_worker(subst_worker)
  @threads << subst_thread
data/lib/tobox/pool.rb CHANGED
@@ -9,7 +9,7 @@ module Tobox
  @logger = @configuration.default_logger
  @num_workers = configuration[:concurrency]
  @workers = Array.new(@num_workers) do |idx|
- Worker.new("tobox-worker-#{idx}", configuration)
+ @configuration.worker_class.new("tobox-worker-#{idx}", configuration)
  end
  @worker_error_handlers = Array(@configuration.lifecycle_events[:error_worker])
  @running = true
@@ -22,19 +22,28 @@ module Tobox
  @running = false
  end

+ private
+
  def do_work(wrk)
  wrk.work
- rescue KillError
- # noop
  rescue Exception => e # rubocop:disable Lint/RescueException
- wrk.finish!
- @logger.error do
- "(worker: #{wrk.label}) -> " \
- "crashed with error\n" \
- "#{e.class}: #{e.message}\n" \
- "#{e.backtrace.join("\n")}"
+ handle_exception(wrk, e)
+ end
+
+ def handle_exception(wrk, exc)
+ case exc
+ when KillError
+ # noop
+ when Exception
+ wrk.finish!
+ @logger.error do
+ "(worker: #{wrk.label}) -> " \
+ "crashed with error\n" \
+ "#{exc.class}: #{exc.message}\n" \
+ "#{exc.backtrace.join("\n")}"
+ end
+ @worker_error_handlers.each { |hd| hd.call(exc) }
  end
- @worker_error_handlers.each { |hd| hd.call(e) }
  end
  end

data/lib/tobox/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Tobox
- VERSION = "0.6.1"
+ VERSION = "0.7.0"
  end
data/lib/tobox/worker.rb CHANGED
@@ -11,9 +11,13 @@ module Tobox
  @fetcher = configuration.fetcher_class.new(label, configuration)
  @finished = false

- return unless (message_to_arguments = configuration.arguments_handler)
+ if (message_to_arguments = configuration.arguments_handler)
+ define_singleton_method(:message_to_arguments, &message_to_arguments)
+ end

- define_singleton_method(:message_to_arguments, &message_to_arguments)
+ Array(configuration.lifecycle_events[:start_worker]).each do |hd|
+ hd.call(self)
+ end
  end

  def finished?
@@ -47,7 +51,11 @@ module Tobox

  return if @finished

- sleep(@wait_for_events_delay) if sum_fetched_events.zero?
+ wait_for_work if sum_fetched_events.zero?
+ end
+
+ def wait_for_work
+ sleep(@wait_for_events_delay)
  end

  def message_to_arguments(event)
data/lib/tobox.rb CHANGED
@@ -4,25 +4,23 @@ require "sequel"

  require_relative "tobox/version"

- require "mutex_m"
-
  module Tobox
  class Error < StandardError; end

  EMPTY = [].freeze

  module Plugins
+ PLUGINS_MUTEX = Thread::Mutex.new
  @plugins = {}
- @plugins.extend(Mutex_m)

  # Loads a plugin based on a name. If the plugin hasn't been loaded, tries to load
  # it from the load path under "httpx/plugins/" directory.
  #
  def self.load_plugin(name)
  h = @plugins
- unless (plugin = h.synchronize { h[name] })
+ unless (plugin = PLUGINS_MUTEX.synchronize { h[name] })
  require "tobox/plugins/#{name}"
- raise "Plugin #{name} hasn't been registered" unless (plugin = h.synchronize { h[name] })
+ raise "Plugin #{name} hasn't been registered" unless (plugin = PLUGINS_MUTEX.synchronize { h[name] })
  end
  plugin
  end
@@ -31,7 +29,7 @@ module Tobox
  #
  def self.register_plugin(name, mod)
  h = @plugins
- h.synchronize { h[name] = mod }
+ PLUGINS_MUTEX.synchronize { h[name] = mod }
  end
  end

@@ -54,11 +52,12 @@ module Tobox
  # Tobox.raise_batch_error(batch_errors)
  # end
  def self.raise_batch_errors(batch_errors)
- unless batch_errors.respond_to?(:to_hash) && batch_errors.all? { |k, v| k.is_a?(Integer) && v.is_a?(Exception) }
- raise "batch errors must be an array of index-to-exception tuples"
+ batch_errors = Hash.try_convert(batch_errors)
+ unless batch_errors && batch_errors.all? { |k, v| k.is_a?(Integer) && v.is_a?(Exception) }
+ raise Error, "batch errors must be an array of index-to-exception tuples"
  end

- throw(:tobox_batch_errors, batch_errors.to_h)
+ throw(:tobox_batch_errors, batch_errors)
  end
  end

metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: tobox
  version: !ruby/object:Gem::Version
- version: 0.6.1
+ version: 0.7.0
  platform: ruby
  authors:
  - HoneyryderChuck
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-10-30 00:00:00.000000000 Z
+ date: 2024-12-18 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: logger
@@ -61,6 +61,7 @@ files:
  - lib/tobox/plugins/datadog/patcher.rb
  - lib/tobox/plugins/event_grouping.rb
  - lib/tobox/plugins/inbox.rb
+ - lib/tobox/plugins/pg_notify.rb
  - lib/tobox/plugins/progress.rb
  - lib/tobox/plugins/sentry.rb
  - lib/tobox/plugins/stats.rb