gouda 0.1.1 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 69ee0eda4e10e8777d9cb1df596472103558ace52ba2fd018de2fcbc28a59ead
- data.tar.gz: 7d2eb26cafd53af69e52f8a5de8adf8d08678d7618dbccc6d50c500e49780c92
+ metadata.gz: 0fa853c78222eb23897ccb31ed465fc231aa5894641fe0d1991ade90a5e3fc8d
+ data.tar.gz: e9680b441d3fe9c3da7fadf50272c033db712296104870d016b278da2c1d92bd
  SHA512:
- metadata.gz: 7bb748afdacae3fe76ee49a4b82e45c282d3feeee6fb69cb390b57e9b05db9108d692521454121724fbec81970c297117befd900dde9a687bfc51d8b08a4cd8e
- data.tar.gz: f14f7bad25ba4d0b2c22ec0dad98747ef1a1e79fa3a5978b987f13fbe340220eddfe6958fa1751083402c7f65a6e3f4129fdbb53bfa0d6e87ecc45692b7d4548
+ metadata.gz: 9a64544cd45d14400ab949a848e0325ee5d5305d648f7f38239279f93e1f8d2d32dac368708317aafbf470b48e16f88a0ffe4bad6890798ef53adea0566da5f6
+ data.tar.gz: e2140d4da50c4afe8edadd51bb3049b60935e4c0273d235dcb1988efa2900362ea81451a58df04ba3f4db58209380ea9a58ccedd42972806bd5be2cd9f19d7d4
@@ -15,9 +15,6 @@ jobs:
  matrix:
  ruby:
  - '2.7'
- - '3.0'
- - '3.1'
- - '3.2'
  - '3.3'
  services:
  postgres:
data/CHANGELOG.md CHANGED
@@ -7,3 +7,7 @@
  ## [0.1.1] - 2023-06-10
 
  - Fix support for older ruby versions until 2.7
+
+ ## [0.1.2] - 2023-06-11
+
+ - Updated readme and method renaming in Scheduler
data/README.md CHANGED
@@ -11,7 +11,96 @@ $ bundle install
  $ bin/rails g gouda:install
  ```
 
- ## Usage
+ Gouda is built as a lightweight alternative to [good_job](https://github.com/bensheldon/good_job) and was created before [solid_queue](https://github.com/rails/solid_queue/).
+ It is _smaller_ than solid_queue though.
+
+ It was designed to enable job processing using `SELECT ... FOR UPDATE SKIP LOCKED` on Postgres so that we could use pg_bouncer in our system setup.
+
+
+ ## Key concepts in Gouda: Workload
+
+ Gouda is built around the concept of a **Workload.** A workload is not the same as an ActiveJob. A workload is a single execution of a task - the task may be an entire ActiveJob, or a retry of an ActiveJob, or a part of a sequence of ActiveJobs initiated using [job-iteration](https://github.com/shopify/job-iteration).
+
+ You can easily have multiple `Workloads` stored in your queue which reference the same job. However, when you are using Gouda it is important to always keep the distinction between the two in mind.
+
+ When an ActiveJob is first initialised, it receives a randomly-generated ActiveJob ID, which is normally a UUID. This UUID will be reused when a job gets retried, or when job-iteration is in use - but it will exist across multiple Gouda workloads.
+
+ A `Workload` can only be in one of three states: `enqueued`, `executing` and `finished`. It does not matter whether the workload has raised an exception, or was manually canceled before it started performing, or succeeded - its terminal state is always going to be `finished`. This is done on purpose: Gouda uses a number of partial indexes in Postgres which allow it to maintain uniqueness, but only among jobs which are either waiting to start or already running. Additionally, _only the transitions between those states_ are guarded by `BEGIN...COMMIT`, and it is the selection on those states that is supplemented by `SELECT ... FOR UPDATE SKIP LOCKED`. The only time locks are placed on a particular `gouda_workloads` row is when this update is about to take place (`SELECT` then `UPDATE`). This makes Gouda a good fit for use with pg_bouncer in transaction mode.
+
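As a minimal sketch of the lifecycle described above, the three states and their transitions could be modelled as follows. This is a hypothetical in-memory model, not Gouda's actual `Workload` class: the class name `SketchWorkload`, the `outcome` field and the transition table are assumptions made for illustration only.

```ruby
# Hypothetical in-memory model of the workload lifecycle described above.
# This is NOT Gouda's implementation; all names here are illustrative.
class SketchWorkload
  # Allowed transitions. Every path ends in "finished", whether the
  # workload succeeded, raised, or was canceled before it started.
  TRANSITIONS = {
    "enqueued" => %w[executing finished],
    "executing" => %w[finished],
    "finished" => []
  }.freeze

  attr_reader :state, :outcome

  def initialize
    @state = "enqueued"
    @outcome = nil
  end

  # Moves to new_state, or raises ArgumentError if that move is not allowed.
  def transition_to!(new_state, outcome: nil)
    unless TRANSITIONS.fetch(@state).include?(new_state)
      raise ArgumentError, "cannot go from #{@state} to #{new_state}"
    end
    @state = new_state
    @outcome = outcome
    self
  end
end

succeeded = SketchWorkload.new.transition_to!("executing").transition_to!("finished", outcome: :ok)
failed = SketchWorkload.new.transition_to!("executing").transition_to!("finished", outcome: :raised)
canceled = SketchWorkload.new.transition_to!("finished", outcome: :canceled)
```

All three workloads end up `finished`; only the illustrative `outcome` field tells them apart, which mirrors the point that the terminal state itself carries no success or failure information.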
+ Understanding workload identity is key for making good use of Gouda. For example, an ActiveJob that gets retried can take the following shape in Gouda:
+
+ ```
+ ____________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="f67b-...123",state="finished") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ ____________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="5e52-...456",state="finished") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ ____________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="8a41-...789",state="enqueued") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ ```
+
+ This would happen if, for example, the ActiveJob raises an exception inside `perform` and is configured to `retry_on` after this exception. The same applies to job-iteration:
+
+ ```
+ _______________________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34",cursor=nil) | ----> | Workload(id="f67b-...123",state="finished") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ _______________________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34",cursor=123) | ----> | Workload(id="5e52-...456",state="finished") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ _______________________________________ _______________________________________________
+ | ActiveJob(id="0abc-...34",cursor=456) | ----> | Workload(id="8a41-...789",state="executing") |
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
+ ```
+
+ A key thing to remember when reading the Gouda source code is that **workloads and jobs are not the same thing.** A single job **may span multiple workloads.**
+
+ ## Key concepts in Gouda: concurrency keys
+
+ Gouda has a few indexes on the `gouda_workloads` table which will:
+
+ * Forbid inserting another `enqueued` workload with the same `enqueue_concurrency_key` value. Uniqueness is on that column only.
+ * Forbid a workload from transitioning into `executing` when another workload with the same `execution_concurrency_key` is already running.
+
+ These are compatible with good_job concurrency keys, with one major distinction: we use unique indices and not counters, so these keys can be used
+ to **prevent concurrent executions** but not to **limit the load on the system**, and the limit of 1 is always enforced.
+
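The effect of the two indexes described above can be sketched with a small in-memory model. Everything here is hypothetical (`SketchQueue`, `NotUnique` and the method names are not part of Gouda), and the sketch assumes that workloads without a key are left unconstrained, which is how Postgres unique indexes treat `NULL` by default.

```ruby
# Hypothetical in-memory stand-in for the two partial unique indexes.
# In Gouda the database enforces these rules; this only mimics the effect.
class SketchQueue
  Workload = Struct.new(:id, :state, :enqueue_concurrency_key, :execution_concurrency_key)
  class NotUnique < StandardError; end

  def initialize
    @rows = []
  end

  # Mimics the index over enqueue_concurrency_key among `enqueued` rows.
  def enqueue(id:, enqueue_key: nil, execution_key: nil)
    if enqueue_key && taken?("enqueued", :enqueue_concurrency_key, enqueue_key)
      raise NotUnique, "an enqueued workload already holds #{enqueue_key}"
    end
    row = Workload.new(id, "enqueued", enqueue_key, execution_key)
    @rows << row
    row
  end

  # Mimics the index over execution_concurrency_key among `executing` rows.
  def start(workload)
    key = workload.execution_concurrency_key
    if key && taken?("executing", :execution_concurrency_key, key)
      raise NotUnique, "an executing workload already holds #{key}"
    end
    workload.state = "executing"
    workload
  end

  def finish(workload)
    workload.state = "finished"
    workload
  end

  private

  def taken?(state, column, key)
    @rows.any? { |row| row.state == state && row[column] == key }
  end
end

queue = SketchQueue.new
first = queue.enqueue(id: 1, enqueue_key: "sync-user-42", execution_key: "sync-user-42")
queue.start(first)
# The first workload is no longer `enqueued`, so the enqueue key is free again.
second = queue.enqueue(id: 2, enqueue_key: "sync-user-42", execution_key: "sync-user-42")
```

In this sketch, calling `queue.start(second)` raises `NotUnique` until `queue.finish(first)` has run: the limit is always one, there is no counter to raise it.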
+ ## Key concepts in Gouda: `executing_on`
+
+ A `Workload` is executing on a particular `executing_on` entity - usually a worker thread. That entity gets a pseudorandom ID. The `executing_on` value can be used to see, for example, whether a particular worker thread has hung. If multiple workloads have an `updated_at` far in the past and are all `executing`, this likely means that the worker has crashed or hung. The value can also be used to build a table of currently running workers.
+
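A rough sketch of that check, using plain Ruby structs instead of database rows: the `WorkloadRow` struct, the `suspected_hung_workers` helper and the 300-second threshold are assumptions for illustration; only `executing_on`, `updated_at` and the state names come from the text above.

```ruby
# Hypothetical row shape standing in for records from `gouda_workloads`.
WorkloadRow = Struct.new(:id, :state, :executing_on, :updated_at)

# Returns { executing_on => [workload ids] } for workloads that are still
# `executing` but were last touched more than `stale_after` seconds ago.
def suspected_hung_workers(rows, now:, stale_after: 300)
  rows
    .select { |row| row.state == "executing" && (now - row.updated_at) > stale_after }
    .group_by(&:executing_on)
    .transform_values { |stale_rows| stale_rows.map(&:id) }
end

now = Time.at(1_000_000)
rows = [
  WorkloadRow.new(1, "executing", "worker-a", now - 900),
  WorkloadRow.new(2, "executing", "worker-a", now - 800),
  WorkloadRow.new(3, "executing", "worker-b", now - 5),
  WorkloadRow.new(4, "finished", "worker-a", now - 900)
]
suspects = suspected_hung_workers(rows, now: now)
```

Here `suspects` contains only `"worker-a"`, with workloads 1 and 2: `worker-b` was updated recently, and workload 4 is already `finished`.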
+ ## Usage tips: bulkify your enqueues
+
+ When possible, Gouda uses `enqueue_all` to `INSERT` as many jobs at once as possible. With modern servers this allows for very rapid insertion of very large
+ batches of jobs. It is supplemented by a module which will make all `perform_later` calls buffered and submitted to the queue in bulk:
+
+ ```ruby
+ Gouda.in_bulk do
+   User.joined_recently.find_each do |user|
+     WelcomeMailer.with(user:).welcome_email.deliver_later
+   end
+ end
+ ```
+
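To illustrate the buffering idea behind `in_bulk`, here is a hedged, self-contained sketch. It is not Gouda's implementation: `SketchBulk`, its thread-local buffer and the `inserted_batches` log (where each element stands for one multi-row `INSERT`) are all assumptions made for the example.

```ruby
# Hypothetical sketch of block-scoped enqueue buffering; not Gouda's code.
module SketchBulk
  BUFFER_KEY = :sketch_bulk_buffer

  # Each element of this array stands for one simulated multi-row INSERT.
  def self.inserted_batches
    @inserted_batches ||= []
  end

  def self.enqueue(job)
    buffer = Thread.current[BUFFER_KEY]
    if buffer
      buffer << job # inside in_bulk: defer until the block ends
    else
      inserted_batches << [job] # outside in_bulk: insert right away
    end
  end

  def self.in_bulk
    # A nested call reuses the buffer of the outer block.
    return yield if Thread.current[BUFFER_KEY]

    buffer = Thread.current[BUFFER_KEY] = []
    begin
      result = yield
    ensure
      Thread.current[BUFFER_KEY] = nil
    end
    # Reached only if the block did not raise: one batch for the whole block.
    inserted_batches << buffer unless buffer.empty?
    result
  end
end

SketchBulk.in_bulk do
  %w[ann bob cid].each { |name| SketchBulk.enqueue("welcome:#{name}") }
end
```

After the block, `SketchBulk.inserted_batches` holds a single batch with three jobs rather than three batches of one. A design choice in this sketch is that a block which raises discards its buffer, so nothing is submitted.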
+ If there are multiple ActiveJob adapters configured and you bulk-enqueue a job which uses an adapter different than Gouda, `in_bulk` will try to use `enqueue_all` on that
+ adapter as well.
+
+ ## Usage tips: co-commit
+
+ Gouda is designed to `COMMIT` the workload together with your business data. It does not need `after_commit` unless you so choose. In fact,
+ the main advantage of DB-based job queues such as Gouda is that you can always rely on the fact that the workload will be enqueued only
+ once the data it needs to operate on is already available for reading. This is guaranteed to work:
+
+ ```ruby
+ User.transaction do
+   freshly_joined_user = User.create!(user_params)
+   WelcomeMailer.with(user: freshly_joined_user).welcome_email.deliver_later
+ end
+ ```
+
+ ## Web UI
 
  At the moment the Gouda UI is proprietary, so this gem only provides a "headless" implementation. We expect this to change in the future.
 
data/lib/gouda/railtie.rb CHANGED
@@ -34,8 +34,6 @@ module Gouda
  # The `to_prepare` block which is executed once in production
  # and before each request in development.
  config.to_prepare do
- Gouda::Scheduler.update_schedule_from_config!
-
  if defined?(Rails) && Rails.respond_to?(:application)
  config_from_rails = Rails.application.config.try(:gouda)
  if config_from_rails
@@ -52,6 +50,9 @@ module Gouda
  Gouda.config.polling_sleep_interval_seconds = 0.2
  Gouda.config.logger.level = Gouda.config.log_level
  end
+
+ Gouda::Scheduler.build_scheduler_entries_list!
+ Gouda::Scheduler.upsert_workloads_from_entries_list!
  end
  end
  end
@@ -53,7 +53,33 @@ module Gouda::Scheduler
  end
  end
 
- def self.update_schedule_from_config!(cron_table_hash = nil)
+ # Takes in a Hash formatted with cron entries in the format similar
+ # to good_job, and builds a table of scheduler entries. A scheduler
+ # entry references a particular job class name, the set of arguments to
+ # be passed to the job when performing it, and either the interval
+ # to repeat the job after or a cron pattern. This method does not
+ # insert the actual Workloads into the database but just builds the
+ # table of the entries. That table gets consulted when workloads finish
+ # to determine whether the workload that just ran was scheduled or ad-hoc,
+ # and whether the subsequent workload has to be enqueued.
+ #
+ # If no table is given the method will attempt to read the table from
+ # Rails application config from `[:gouda][:cron]`.
+ #
+ # The table is a Hash of entries, and the keys are the names of the workload
+ # to be enqueued - those keys are also used to ensure scheduled workloads
+ # only get scheduled once.
+ #
+ # @param cron_table_hash[Hash] a hash of the following shape:
+ #   {
+ #     download_invoices_every_minute: {
+ #       cron: "* * * * *",
+ #       class: "DownloadInvoicesJob",
+ #       args: ["immediate"]
+ #     }
+ #   }
+ # @return Array[Entry]
+ def self.build_scheduler_entries_list!(cron_table_hash = nil)
  Gouda.logger.info "Updating scheduled workload entries..."
  if cron_table_hash.blank?
  config_from_rails = Rails.application.config.try(:gouda)
@@ -76,6 +102,12 @@ module Gouda::Scheduler
  end
  end
 
+ # Once a workload has finished (doesn't matter whether it raised an exception
+ # or completed successfully), it is going to be passed to this method to enqueue
+ # the next scheduled workload
+ #
+ # @param finished_workload[Gouda::Workload]
+ # @return void
  def self.enqueue_next_scheduled_workload_for(finished_workload)
  return unless finished_workload.scheduler_key
 
@@ -86,11 +118,23 @@ module Gouda::Scheduler
  Gouda.enqueue_jobs_via_their_adapters([timer_entry.build_active_job])
  end
 
+ # Returns the list of entries of the scheduler which are currently known. Normally the
+ # scheduler will hold the list of entries loaded from the Rails config.
+ #
+ # @return Array[Entry]
  def self.entries
  @cron_table || []
  end
 
- def self.update_scheduled_workloads!
+ # Will upsert (`INSERT ... ON CONFLICT UPDATE`) workloads for all entries which are in the scheduler entries
+ # table (the table needs to be read or hydrated first using `build_scheduler_entries_list!`). This is done
+ # in a transaction. Any workloads which have been previously inserted from the scheduled entries, but no
+ # longer have a corresponding scheduler entry, will be deleted from the database. If there already are workloads
+ # with the corresponding scheduler key they will not be touched and will be performed with their previously-defined
+ # arguments.
+ #
+ # @return void
+ def self.upsert_workloads_from_entries_list!
  table_entries = @cron_table || []
 
  # Remove any cron keyed workloads which no longer match config-wise
data/lib/gouda/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Gouda
- VERSION = "0.1.1"
+ VERSION = "0.1.2"
  end
data/lib/gouda.rb CHANGED
@@ -46,7 +46,7 @@ module Gouda
  end
 
  def self.start
- Gouda::Scheduler.update_scheduled_workloads!
+ Gouda::Scheduler.upsert_workloads_from_entries_list!
 
  queue_constraint = if ENV["GOUDA_QUEUES"]
  Gouda.parse_queue_constraint(ENV["GOUDA_QUEUES"])
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: gouda
  version: !ruby/object:Gem::Version
- version: 0.1.1
+ version: 0.1.2
  platform: ruby
  authors:
  - Sebastian van Hesteren
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-06-10 00:00:00.000000000 Z
+ date: 2024-06-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activerecord