RubyGems - solid_queue - Versions diffs - 1.0.1 → 1.1.0 - Mend

solid_queue 1.0.1 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

checksums.yaml +4 -4
data/README.md +78 -5
data/app/models/solid_queue/claimed_execution.rb +1 -0
data/app/models/solid_queue/queue.rb +13 -0
data/app/models/solid_queue/queue_selector.rb +35 -5
data/lib/generators/solid_queue/install/install_generator.rb +5 -3
data/lib/solid_queue/processes/interruptible.rb +10 -18
data/lib/solid_queue/scheduler/recurring_schedule.rb +1 -0
data/lib/solid_queue/version.rb +1 -1
metadata +35 -7

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: f0d576a078aa17199edc6614015a0953027fe2660e98eab5ef31c23a7417c6e3
-  data.tar.gz: dfdac5d1ae0b53d48dea6e31abd70a39858cb48d41e2ef51f4db777797b5db1a
+  metadata.gz: e843e842397e1d8141e0457b5b278026b71effe6ebf83d8300d2c4098db56adf
+  data.tar.gz: a5db37bdea8dacec796f2dcb912ab8f2a69c929487dff081d8a53fe10c20086c
 SHA512:
-  metadata.gz: a0583ab02da0aac3e812a9e13a3a9853aca53cd67d15cd1a8ce2d666b7f48e9c669dd52bd45a74b03b5177f2b93dd98a1d85ef9f86ab2c0b4aeb82d38f26bb3f
-  data.tar.gz: 6e6cdb1ad994664f8132774e2d337758a2fea55bbb46bb618acbed3319490acf23f0cf59f12a66ac7c798c55a92b69167073fe932b41a453dbdfd91189cb3ae4
+  metadata.gz: a6831f7114c24d68ae8ed7b5aebfd4d3df0cc2c7b4e0046e6275a5bf5e46f8da8f5b79e9a303ae908e3581e8b96a132776f4c1a3f55bbf550dd008541e679665
+  data.tar.gz: e51b90234b3a60355a96c9163bfc7d95a1f5eb95fcf7a2dc1faf553f04a7c242da2d48493ba5b6574c2dbf4b0d259cb69af1dda576475f1510a31f8325be5f8b

data/README.md CHANGED

@@ -149,6 +149,9 @@ Here's an overview of the different options:
   This will create a worker fetching jobs from all queues starting with `staging`. The wildcard `*` is only allowed on its own or at the end of a queue name; you can't specify queue names such as `*_some_queue`. These will be ignored.
   Finally, you can combine prefixes with exact names, like `[ staging*, background ]`, and the behaviour with respect to order will be the same as with only exact names.
+  Check the sections below on [how queue order behaves combined with priorities](#queue-order-and-priorities), and [how the way you specify the queues per worker might affect performance](#queues-specification-and-performance).
 - `threads`: this is the max size of the thread pool that each worker will have to run jobs. Each worker will fetch this number of jobs from their queue(s), at most and will post them to the thread pool to be run. By default, this is `3`. Only workers have this setting.
 - `processes`: this is the number of worker processes that will be forked by the supervisor with the settings given. By default, this is `1`, just a single process. This setting is useful if you want to dedicate more than one CPU core to a queue or queues with the same configuration. Only workers have this setting.
 - `concurrency_maintenance`: whether the dispatcher will perform the concurrency maintenance work. This is `true` by default, and it's useful if you don't use any [concurrency controls](#concurrency-controls) and want to disable it or if you run multiple dispatchers and want some of them to just dispatch jobs without doing anything else.
@@ -164,6 +167,67 @@ This is useful when you run jobs with different importance or urgency in the sam
 We recommend not mixing queue order with priorities but either choosing one or the other, as that will make job execution order more straightforward for you.
+### Queues specification and performance
+To keep polling performant and ensure a covering index is always used, Solid Queue only does two types of polling queries:
+```sql
+-- No filtering by queue
+SELECT job_id
+FROM solid_queue_ready_executions
+ORDER BY priority ASC, job_id ASC
+LIMIT ?
+FOR UPDATE SKIP LOCKED;
+-- Filtering by a single queue
+SELECT job_id
+FROM solid_queue_ready_executions
+WHERE queue_name = ?
+ORDER BY priority ASC, job_id ASC
+LIMIT ?
+FOR UPDATE SKIP LOCKED;
+```
+The first one (no filtering by queue) is used when you specify
+```yml
+queues: *
+```
+and there aren't any queues paused, as we want to target all queues.
+In other cases, we need to have a list of queues to filter by, in order, because we can only filter by a single queue at a time to ensure we use an index to sort. This means that if you specify your queues as:
+```yml
+queues: beta*
+```
+we'll need to get a list of all existing queues matching that prefix first, with a query that would look like this:
+```sql
+SELECT DISTINCT(queue_name)
+FROM solid_queue_ready_executions
+WHERE queue_name LIKE 'beta%';
+```
+This type of `DISTINCT` query on a column that's the leftmost column in an index can be performed very fast in MySQL thanks to a technique called [Loose Index Scan](https://dev.mysql.com/doc/refman/8.0/en/group-by-optimization.html#loose-index-scan). PostgreSQL and SQLite, however, don't implement this technique, which means that if your `solid_queue_ready_executions` table is very big because your queues get very deep, this query will get slow. Normally your `solid_queue_ready_executions` table will be small, but it can happen.
+Similarly to using prefixes, the same will happen if you have paused queues, because we need to get a list of all queues with a query like
+```sql
+SELECT DISTINCT(queue_name)
+FROM solid_queue_ready_executions
+```
+and then remove the paused ones. Pausing in general should be something rare, used in special circumstances, and for a short period of time. If you don't want to process jobs from a queue anymore, the best way to do that is to remove it from your list of queues.
+💡 To sum up, **if you want to ensure optimal performance on polling**, the best way to do that is to always specify exact names for them, and not have any queues paused.
+Do this:
+```yml
+queues: background, backend
+```
+instead of this:
+```yml
+queues: back*
+```
 ### Threads, processes and signals
@@ -175,7 +239,9 @@ The supervisor is in charge of managing these processes, and it responds to the
 When receiving a `QUIT` signal, if workers still have jobs in-flight, these will be returned to the queue when the processes are deregistered.
-If processes have no chance of cleaning up before exiting (e.g. if someone pulls a cable somewhere), in-flight jobs might remain claimed by the processes executing them. Processes send heartbeats, and the supervisor checks and prunes processes with expired heartbeats, which will release any claimed jobs back to their queues. You can configure both the frequency of heartbeats and the threshold to consider a process dead. See the section below for this.
+If processes have no chance of cleaning up before exiting (e.g. if someone pulls a cable somewhere), in-flight jobs might remain claimed by the processes executing them. Processes send heartbeats, and the supervisor checks and prunes processes with expired heartbeats. Jobs that were claimed by processes with an expired heartbeat will be marked as failed with a `SolidQueue::Processes::ProcessPrunedError`. You can configure both the frequency of heartbeats and the threshold to consider a process dead. See the section below for this.
+In a similar way, if a worker is terminated in any other way not initiated by the above signals (e.g. a worker is sent a `KILL` signal), jobs in progress will be marked as failed so that they can be inspected, with a `SolidQueue::Processes::Process::ProcessExitError`. Sometimes a job in particular is responsible for this, for example, if it has a memory leak and you have a mechanism to kill processes over a certain memory threshold, so this will help identifying this kind of situation.
 ### Database configuration
@@ -348,13 +414,19 @@ to your `puma.rb` configuration.
 ## Jobs and transactional integrity
-:warning: Having your jobs in the same ACID-compliant database as your application data enables a powerful yet sharp tool: taking advantage of transactional integrity to ensure some action in your app is not committed unless your job is also committed and viceversa, and ensuring that your job won't be enqueued until the transaction within which you're enqueing it is committed. This can be very powerful and useful, but it can also backfire if you base some of your logic on this behaviour, and in the future, you move to another active job backend, or if you simply move Solid Queue to its own database, and suddenly the behaviour changes under you.
+:warning: Having your jobs in the same ACID-compliant database as your application data enables a powerful yet sharp tool: taking advantage of transactional integrity to ensure some action in your app is not committed unless your job is also committed and vice versa, and ensuring that your job won't be enqueued until the transaction within which you're enqueuing it is committed. This can be very powerful and useful, but it can also backfire if you base some of your logic on this behaviour, and in the future, you move to another active job backend, or if you simply move Solid Queue to its own database, and suddenly the behaviour changes under you. Because this can be quite tricky and many people shouldn't need to worry about it, by default Solid Queue is configured in a different database as the main app.
-Because this can be quite tricky and many people shouldn't need to worry about it, by default Solid Queue is configured in a different database as the main app, **job enqueuing is deferred until any ongoing transaction is committed** thanks to Active Job's built-in capability to do this. This means that even if you run Solid Queue in the same DB as your app, you won't be taking advantage of this transactional integrity.
+Starting from Rails 8, an option which doesn't rely on this transactional integrity and which Active Job provides is to defer the enqueueing of a job inside an Active Record transaction until that transaction successfully commits. This option can be set via the [`enqueue_after_transaction_commit`](https://edgeapi.rubyonrails.org/classes/ActiveJob/Enqueuing.html#method-c-enqueue_after_transaction_commit) class method on the job level and is by default disabled. Either it can be enabled for individual jobs or for all jobs through `ApplicationJob`:
-If you prefer to change this, you can set [`config.active_job.enqueue_after_transaction_commit`](https://edgeguides.rubyonrails.org/configuring.html#config-active-job-enqueue-after-transaction-commit) to `never`. You can also set this on a per-job basis.
+```ruby
+class ApplicationJob < ActiveJob::Base
+  self.enqueue_after_transaction_commit = true
+end
+```
+Using this option, you can also use Solid Queue in the same database as your app but not rely on transactional integrity.
-If you set that to `never` but still want to make sure you're not inadvertently on transactional integrity, you can make sure that:
+If you don't set this option but still want to make sure you're not inadvertently on transactional integrity, you can make sure that:
 - Your jobs relying on specific data are always enqueued on [`after_commit` callbacks](https://guides.rubyonrails.org/active_record_callbacks.html#after-commit-and-after-rollback) or otherwise from a place where you're certain that whatever data the job will use has been committed to the database before the job is enqueued.
 - Or, you configure a different database for Solid Queue, even if it's the same as your app, ensuring that a different connection on the thread handling requests or running jobs for your app will be used to enqueue jobs. For example:
@@ -369,6 +441,7 @@ If you set that to `never` but still want to make sure you're not inadvertently
   config.solid_queue.connects_to = { database: { writing: :primary, reading: :replica } }
   ```
 ## Recurring tasks
 Solid Queue supports defining recurring tasks that run at specific times in the future, on a regular basis like cron jobs. These are managed by the scheduler process and are defined in their own configuration file. By default, the file is located in `config/recurring.yml`, but you can set a different path using the environment variable `SOLID_QUEUE_RECURRING_SCHEDULE` or by using the `--recurring_schedule_file` option with `bin/jobs`, like this:

data/app/models/solid_queue/claimed_execution.rb CHANGED

@@ -64,6 +64,7 @@ class SolidQueue::ClaimedExecution < SolidQueue::Execution
       finished
     else
       failed_with(result.error)
+      raise result.error
     end
   ensure
     job.unblock_next_blocked_job
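
With the added `raise result.error`, a failed job is recorded as failed and the error is also propagated to the caller, while the `ensure` clause still runs. Below is a minimal standalone sketch of that control flow. `SketchResult` and `SketchExecution` are hypothetical stand-ins that only mirror the shape of the hunk above; they are not Solid Queue classes.

```ruby
# Hypothetical stand-ins; they only mirror the control flow shown in the diff.
SketchResult = Struct.new(:error) do
  def success?
    error.nil?
  end
end

class SketchExecution
  attr_reader :status, :unblocked

  def perform(result)
    if result.success?
      @status = :finished
    else
      @status = :failed   # stands in for failed_with(result.error)
      raise result.error  # re-raise so the caller also sees the failure
    end
  ensure
    @unblocked = true     # stands in for job.unblock_next_blocked_job
  end
end

execution = SketchExecution.new
begin
  execution.perform(SketchResult.new(RuntimeError.new("boom")))
rescue RuntimeError => e
  puts "#{execution.status}, unblocked=#{execution.unblocked}, raised=#{e.message}"
end
```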

data/app/models/solid_queue/queue.rb CHANGED

@@ -40,6 +40,19 @@ module SolidQueue
       @size ||= ReadyExecution.queued_as(name).count
     end
+    def latency
+      @latency ||= begin
+        now = Time.current
+        oldest_enqueued_at = ReadyExecution.queued_as(name).minimum(:created_at) || now
+        (now - oldest_enqueued_at).to_i
+      end
+    end
+    def human_latency
+      ActiveSupport::Duration.build(latency).inspect
+    end
     def ==(queue)
       name == queue.name
     end
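
The new `latency` method reports how long the oldest ready job in the queue has been waiting, in whole seconds, and falls back to zero when the queue is empty. The arithmetic can be sketched in plain Ruby as follows. `latency_in_seconds` is a hypothetical helper that takes a list of timestamps in place of the `ReadyExecution` query, and it uses `Time.now` instead of Active Support's `Time.current`.

```ruby
# Hypothetical helper, not part of Solid Queue: same arithmetic as the
# `latency` method in the diff, with the database query replaced by an array.
def latency_in_seconds(created_ats, now: Time.now)
  oldest_enqueued_at = created_ats.min || now # empty queue => zero latency
  (now - oldest_enqueued_at).to_i             # Time - Time is a Float of seconds
end

now = Time.at(1_700_000_000)
puts latency_in_seconds([now - 90, now - 30], now: now)
puts latency_in_seconds([], now: now)
```

In the gem, `human_latency` then wraps this number in an `ActiveSupport::Duration` to produce a readable string; that step is left out here because it needs Active Support.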

data/app/models/solid_queue/queue_selector.rb CHANGED

@@ -34,7 +34,7 @@ module SolidQueue
       def eligible_queues
         if include_all_queues? then all_queues
         else
-          exact_names + prefixed_names
+          in_raw_order(exact_names + prefixed_names)
         end
       end
@@ -42,8 +42,12 @@ module SolidQueue
         "*".in? raw_queues
       end
+      def all_queues
+        relation.distinct(:queue_name).pluck(:queue_name)
+      end
       def exact_names
-        raw_queues.select { |queue| !queue.include?("*") }
+        raw_queues.select { |queue| exact_name?(queue) }
       end
       def prefixed_names
@@ -54,15 +58,41 @@ module SolidQueue
       end
       def prefixes
-        @prefixes ||= raw_queues.select { |queue| queue.ends_with?("*") }.map { |queue| queue.tr("*", "%") }
+        @prefixes ||= raw_queues.select { |queue| prefixed_name?(queue) }.map { |queue| queue.tr("*", "%") }
       end
-      def all_queues
-        relation.distinct(:queue_name).pluck(:queue_name)
+      def exact_name?(queue)
+        !queue.include?("*")
+      end
+      def prefixed_name?(queue)
+        queue.ends_with?("*")
       end
       def paused_queues
         @paused_queues ||= Pause.all.pluck(:queue_name)
       end
+      def in_raw_order(queues)
+        # Only need to sort if we have prefixes and more than one queue name.
+        # Exact names are selected in the same order as they're found
+        if queues.one? || prefixes.empty?
+          queues
+        else
+          queues = queues.dup
+          raw_queues.flat_map { |raw_queue| delete_in_order(raw_queue, queues) }.compact
+        end
+      end
+      def delete_in_order(raw_queue, queues)
+        if exact_name?(raw_queue)
+          queues.delete(raw_queue)
+        elsif prefixed_name?(raw_queue)
+          prefix = raw_queue.tr("*", "")
+          queues.select { |queue| queue.start_with?(prefix) }.tap do |matches|
+            queues -= matches
+          end
+        end
+      end
   end
 end
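
The intent of `in_raw_order` is that the resolved queue names follow the order in which the worker's queues were written, including prefix entries such as `staging*`. The following is a simplified standalone sketch of that ordering, not the gem's implementation. `order_by_raw_queues` is a hypothetical function, and `resolved` stands in for the names returned by `exact_names + prefixed_names`.

```ruby
# Hypothetical sketch: order resolved queue names by the position of the
# raw entry (exact name or trailing-* prefix) that selected them.
def order_by_raw_queues(raw_queues, resolved)
  remaining = resolved.dup
  raw_queues.flat_map do |raw|
    if raw.end_with?("*")
      prefix = raw.delete_suffix("*")
      matches = remaining.select { |name| name.start_with?(prefix) }
      remaining -= matches # a name is emitted once, by the first entry matching it
      matches
    elsif remaining.delete(raw)
      [raw]
    else
      [] # entry that matched no resolved queue
    end
  end
end

raw = ["staging*", "background", "beta*"]
resolved = ["background", "beta_mail", "staging_a", "staging_b"]
p order_by_raw_queues(raw, resolved)
```

The sketch keeps `remaining` as a local variable of the one method that iterates, so names consumed by an earlier entry are not returned again by a later, overlapping prefix.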

data/lib/generators/solid_queue/install/install_generator.rb CHANGED

@@ -11,9 +11,11 @@ class SolidQueue::InstallGenerator < Rails::Generators::Base
     chmod "bin/jobs", 0755 & ~File.umask, verbose: false
   end
-  def configure_active_job_adapter
-    gsub_file Pathname(destination_root).join("config/environments/production.rb"),
-      /(# )?config\.active_job\.queue_adapter\s+=.*/,
+  def configure_adapter_and_database
+    pathname = Pathname(destination_root).join("config/environments/production.rb")
+    gsub_file pathname, /\n\s*config\.solid_queue\.connects_to\s+=.*\n/, "\n", verbose: false
+    gsub_file pathname, /(# )?config\.active_job\.queue_adapter\s+=.*\n/,
       "config.active_job.queue_adapter = :solid_queue\n" +
       "  config.solid_queue.connects_to = { database: { writing: :queue } }\n"
   end
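
The renamed generator step first strips any existing `config.solid_queue.connects_to` line and then rewrites the adapter line, which appears intended to avoid a duplicated `connects_to` entry when the generator is run more than once. The sketch below applies the same two regular expressions from the hunk to an in-memory string with `String#gsub` instead of Thor's `gsub_file`. The method takes a `source` string purely for illustration, and the sample `production.rb` content is made up.

```ruby
ADAPTER_LINES =
  "config.active_job.queue_adapter = :solid_queue\n" \
  "  config.solid_queue.connects_to = { database: { writing: :queue } }\n"

# Illustration only: same patterns as the diff, applied to a string.
def configure_adapter_and_database(source)
  source
    .gsub(/\n\s*config\.solid_queue\.connects_to\s+=.*\n/, "\n")           # drop a previous connects_to line
    .gsub(/(# )?config\.active_job\.queue_adapter\s+=.*\n/, ADAPTER_LINES) # set adapter, re-add connects_to
end

production_rb = <<~RUBY
  Rails.application.configure do
    # config.active_job.queue_adapter = :resque
  end
RUBY

once = configure_adapter_and_database(production_rb)
twice = configure_adapter_and_database(once)
puts once
puts "idempotent: #{once == twice}"
```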

data/lib/solid_queue/processes/interruptible.rb CHANGED

@@ -7,31 +7,23 @@ module SolidQueue::Processes
     end
     private
-      SELF_PIPE_BLOCK_SIZE = 11
       def interrupt
-        self_pipe[:writer].write_nonblock(".")
-      rescue Errno::EAGAIN, Errno::EINTR
-        # Ignore writes that would block and retry
-        # if another signal arrived while writing
-        retry
+        queue << true
       end
       def interruptible_sleep(time)
-        if time > 0 && self_pipe[:reader].wait_readable(time)
-          loop { self_pipe[:reader].read_nonblock(SELF_PIPE_BLOCK_SIZE) }
-        end
-      rescue Errno::EAGAIN, Errno::EINTR
+        # Invoking from the main thread can result in a 35% slowdown (at least when running the test suite).
+        # Using some form of Async (Futures) addresses this performance issue.
+        Concurrent::Promises.future(time) do |timeout|
+          if timeout > 0 && queue.pop(timeout:)
+            queue.clear
+          end
+        end.value
       end
-      # Self-pipe for signal-handling (http://cr.yp.to/docs/selfpipe.html)
-      def self_pipe
-        @self_pipe ||= create_self_pipe
-      end
-      def create_self_pipe
-        reader, writer = IO.pipe
-        { reader: reader, writer: writer }
+      def queue
+        @queue ||= Queue.new
       end
   end
 end
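
This hunk replaces the self-pipe with an in-memory queue: `interrupt` pushes a token, and the sleeper blocks on `pop` with a timeout. The following is a minimal standalone sketch of that idea, assuming Ruby 3.2 or later, where `Thread::Queue#pop` accepts a `timeout:` keyword and returns `nil` when it times out. `InterruptibleSleeper` is a hypothetical class, and the `Concurrent::Promises.future` wrapper from the diff is left out so the example has no dependency on concurrent-ruby.

```ruby
# Hypothetical sketch of a queue-based interruptible sleep (Ruby >= 3.2).
class InterruptibleSleeper
  def initialize
    @queue = Thread::Queue.new
  end

  # Wake up a current or future call to interruptible_sleep.
  def interrupt
    @queue << true
  end

  # Returns true when woken by an interrupt, false when the time ran out.
  def interruptible_sleep(seconds)
    return false unless seconds > 0

    woken = !@queue.pop(timeout: seconds).nil?
    @queue.clear if woken # collapse several pending interrupts into one wake-up
    woken
  end
end

sleeper = InterruptibleSleeper.new
waker = Thread.new { sleep 0.05; sleeper.interrupt }
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
woken = sleeper.interruptible_sleep(5)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
waker.join
puts "woken=#{woken} after #{elapsed.round(2)}s"
```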

data/lib/solid_queue/scheduler/recurring_schedule.rb CHANGED

@@ -41,6 +41,7 @@ module SolidQueue
     private
       def persist_tasks
+        SolidQueue::RecurringTask.static.where.not(key: task_keys).delete_all
         SolidQueue::RecurringTask.create_or_update_all configured_tasks
       end
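
The added line deletes stored recurring tasks in the `static` scope whose keys are no longer among the configured `task_keys`, before the configured tasks are created or updated. The sketch below models that pruning step on plain objects. It assumes that `static` marks tasks loaded from the configuration file, which the diff itself does not show. `SketchTask` and `prune_stale_static_tasks` are hypothetical names.

```ruby
# Hypothetical model of the pruning step; not Solid Queue's Active Record code.
SketchTask = Struct.new(:key, :static)

# Keep every task except static ones whose key is no longer configured.
def prune_stale_static_tasks(stored_tasks, configured_keys)
  stored_tasks.reject { |task| task.static && !configured_keys.include?(task.key) }
end

stored = [
  SketchTask.new("cleanup", true),
  SketchTask.new("old_report", true),  # removed from the config file
  SketchTask.new("ad_hoc", false)      # non-static, left untouched
]
p prune_stale_static_tasks(stored, ["cleanup"]).map(&:key)
```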

data/lib/solid_queue/version.rb CHANGED

@@ -1,3 +1,3 @@
 module SolidQueue
-  VERSION = "1.0.1"
+  VERSION = "1.1.0"
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: solid_queue
 version: !ruby/object:Gem::Version
-  version: 1.0.1
+  version: 1.1.0
 platform: ruby
 authors:
 - Rosa Gutierrez
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2024-11-08 00:00:00.000000000 Z
+date: 2024-12-05 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activerecord
@@ -98,16 +98,16 @@ dependencies:
   name: debug
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - ">="
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '0'
+        version: '1.9'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - ">="
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: '0'
+        version: '1.9'
 - !ruby/object:Gem::Dependency
   name: mocha
   requirement: !ruby/object:Gem::Requirement
@@ -192,6 +192,34 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: rdoc
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: logger
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: Database-backed Active Job backend.
 email:
 - rosa@37signals.com
@@ -302,7 +330,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.5.9
+rubygems_version: 3.5.16
 signing_key:
 specification_version: 4
 summary: Database-backed Active Job backend.