gouda 0.1.1 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 69ee0eda4e10e8777d9cb1df596472103558ace52ba2fd018de2fcbc28a59ead
4
- data.tar.gz: 7d2eb26cafd53af69e52f8a5de8adf8d08678d7618dbccc6d50c500e49780c92
3
+ metadata.gz: 0fa853c78222eb23897ccb31ed465fc231aa5894641fe0d1991ade90a5e3fc8d
4
+ data.tar.gz: e9680b441d3fe9c3da7fadf50272c033db712296104870d016b278da2c1d92bd
5
5
  SHA512:
6
- metadata.gz: 7bb748afdacae3fe76ee49a4b82e45c282d3feeee6fb69cb390b57e9b05db9108d692521454121724fbec81970c297117befd900dde9a687bfc51d8b08a4cd8e
7
- data.tar.gz: f14f7bad25ba4d0b2c22ec0dad98747ef1a1e79fa3a5978b987f13fbe340220eddfe6958fa1751083402c7f65a6e3f4129fdbb53bfa0d6e87ecc45692b7d4548
6
+ metadata.gz: 9a64544cd45d14400ab949a848e0325ee5d5305d648f7f38239279f93e1f8d2d32dac368708317aafbf470b48e16f88a0ffe4bad6890798ef53adea0566da5f6
7
+ data.tar.gz: e2140d4da50c4afe8edadd51bb3049b60935e4c0273d235dcb1988efa2900362ea81451a58df04ba3f4db58209380ea9a58ccedd42972806bd5be2cd9f19d7d4
@@ -15,9 +15,6 @@ jobs:
15
15
  matrix:
16
16
  ruby:
17
17
  - '2.7'
18
- - '3.0'
19
- - '3.1'
20
- - '3.2'
21
18
  - '3.3'
22
19
  services:
23
20
  postgres:
data/CHANGELOG.md CHANGED
@@ -7,3 +7,7 @@
7
7
  ## [0.1.1] - 2023-06-10
8
8
 
9
9
  - Fix support for older ruby versions until 2.7
10
+
11
+ ## [0.1.2] - 2023-06-11
12
+
13
+ - Updated README and renamed methods in Scheduler
data/README.md CHANGED
@@ -11,7 +11,96 @@ $ bundle install
11
11
  $ bin/rails g gouda:install
12
12
  ```
13
13
 
14
- ## Usage
14
+ Gouda is built as a lightweight alternative to [good_job](https://github.com/bensheldon/good_job) and was created before [solid_queue](https://github.com/rails/solid_queue/).
15
+ It is _smaller_ than solid_queue though.
16
+
17
+ It was designed to enable job processing using `SELECT ... FOR UPDATE SKIP LOCKED` on Postgres so that we could use pg_bouncer in our system setup.
18
+
19
+
20
+ ## Key concepts in Gouda: Workload
21
+
22
+ Gouda is built around the concept of a **Workload.** A workload is not the same as an ActiveJob. A workload is a single execution of a task - the task may be an entire ActiveJob, or a retry of an ActiveJob, or a part of a sequence of ActiveJobs initiated using [job-iteration](https://github.com/shopify/job-iteration).
23
+
24
+ You can easily have multiple `Workloads` stored in your queue which reference the same job. However, when you are using Gouda it is important to always keep the distinction between the two in mind.
25
+
26
+ When an ActiveJob is first initialised, it receives a randomly-generated ActiveJob ID, which is normally a UUID. This UUID is reused when the job gets retried, or when job-iteration is in use, so the same ActiveJob ID will appear across multiple Gouda workloads.
27
+
28
+ A `Workload` can only be in one of three states: `enqueued`, `executing` and `finished`. It does not matter whether the workload has raised an exception, or was manually canceled before it started performing, or succeeded - its terminal state is always going to be `finished`. This is done on purpose: Gouda uses a number of partial indexes in Postgres which allow it to maintain uniqueness, but only among workloads which are either waiting to start or already running. Additionally, _only the transitions between those states_ are guarded by `BEGIN...COMMIT`, and it is the selection on those states that is supplemented by `SELECT ... FOR UPDATE SKIP LOCKED`. The only time locks are placed on a particular `gouda_workloads` row is when this update is about to take place (`SELECT` then `UPDATE`). This makes Gouda a good fit for use with pg_bouncer in transaction mode.
29
+
30
+ Understanding workload identity is key for making good use of Gouda. For example, an ActiveJob that gets retried can take the following shape in Gouda:
31
+
32
+ ```
33
+ ____________________________ _______________________________________________
34
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="f67b-...123",state="finished") |
35
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
36
+ ____________________________ _______________________________________________
37
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="5e52-...456",state="finished") |
38
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
39
+ ____________________________ _______________________________________________
40
+ | ActiveJob(id="0abc-...34") | ----> | Workload(id="8a41-...789",state="enqueued") |
41
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
42
+ ```
43
+
44
+ This would happen if, for example, the ActiveJob raises an exception inside `perform` and is configured to `retry_on` after this exception. Same for job-iteration:
45
+
46
+ ```
47
+ _______________________________________ _______________________________________________
48
+ | ActiveJob(id="0abc-...34",cursor=nil) | ----> | Workload(id="f67b-...123",state="finished") |
49
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
50
+ _______________________________________ _______________________________________________
51
+ | ActiveJob(id="0abc-...34",cursor=123) | ----> | Workload(id="5e52-...456",state="finished") |
52
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
53
+ _______________________________________ _______________________________________________
54
+ | ActiveJob(id="0abc-...34",cursor=456) | ----> | Workload(id="8a41-...789",state="executing") |
55
+ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾ ‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾
56
+ ```
57
+
58
+ A key thing to remember when reading the Gouda source code is that **workloads and jobs are not the same thing.** A single job **may span multiple workloads.**
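
The retry diagram above can be sketched in plain Ruby. This is a hypothetical in-memory model for illustration only: the `WorkloadSketch` struct, the `workload_for` helper and the `STATES` constant are assumptions and are not Gouda's classes or schema. It shows one ActiveJob ID being shared by three distinct workloads.

```ruby
# Hypothetical in-memory model; not Gouda's actual classes or schema.
require "securerandom"

WorkloadSketch = Struct.new(:id, :active_job_id, :state, keyword_init: true)

# The three states named in the README.
STATES = %w[enqueued executing finished].freeze

# Builds one workload per execution attempt of the same ActiveJob.
def workload_for(active_job_id, state)
  raise ArgumentError, "unknown state #{state}" unless STATES.include?(state)
  WorkloadSketch.new(id: SecureRandom.uuid, active_job_id: active_job_id, state: state)
end

active_job_id = SecureRandom.uuid
workloads = [
  workload_for(active_job_id, "finished"), # first attempt raised
  workload_for(active_job_id, "finished"), # first retry raised as well
  workload_for(active_job_id, "enqueued")  # second retry is waiting to run
]

job_ids = workloads.map(&:active_job_id).uniq
workload_ids = workloads.map(&:id).uniq
puts "#{job_ids.size} job, #{workload_ids.size} workloads"
```

Counting by `active_job_id` answers "how many jobs", while counting by workload `id` answers "how many executions", which is the distinction the section asks the reader to keep in mind.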
59
+
60
+ ## Key concepts in Gouda: concurrency keys
61
+
62
+ Gouda has a few indexes on the `gouda_workloads` table which will:
63
+
64
+ * Forbid inserting another `enqueued` workload with the same `enqueue_concurrency_key` value. Uniqueness is on that column only.
65
+ * Forbid a workload from transitioning into `executing` when another workload with the same `execution_concurrency_key` is already running.
66
+
67
+ These are compatible with good_job concurrency keys, with one major distinction: we use unique indices and not counters, so these keys can be used
68
+ to **prevent concurrent executions** but not to **limit the load on the system**, and the limit of 1 is always enforced.
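
The two rules above can be modelled with a small in-memory table. This is a sketch under stated assumptions, not Gouda's implementation: `WorkloadTableSketch` and its methods are hypothetical stand-ins for the unique partial indexes, and treating a `nil` key as "no constraint" is an assumption as well.

```ruby
# Hypothetical in-memory stand-in for the unique partial indexes; not Gouda's code.
class WorkloadTableSketch
  Row = Struct.new(:enqueue_key, :execution_key, :state)

  def initialize
    @rows = []
  end

  # Mimics uniqueness of enqueue_concurrency_key among `enqueued` rows.
  # Returns nil when the insert would be rejected.
  def enqueue(enqueue_key:, execution_key:)
    taken = @rows.any? { |r| r.state == "enqueued" && r.enqueue_key == enqueue_key }
    return nil if enqueue_key && taken
    Row.new(enqueue_key, execution_key, "enqueued").tap { |row| @rows << row }
  end

  # Mimics uniqueness of execution_concurrency_key among `executing` rows.
  # Returns false when the transition would be rejected.
  def start(row)
    taken = @rows.any? { |r| r.state == "executing" && r.execution_key == row.execution_key }
    return false if row.execution_key && taken
    row.state = "executing"
    true
  end

  def finish(row)
    row.state = "finished"
  end
end

table = WorkloadTableSketch.new
first = table.enqueue(enqueue_key: "sync-42", execution_key: "account-42")
duplicate = table.enqueue(enqueue_key: "sync-42", execution_key: "account-42")
table.start(first)
second = table.enqueue(enqueue_key: "sync-42", execution_key: "account-42")
blocked = table.start(second)
table.finish(first)
allowed = table.start(second)
```

Because the check is "does a row with this key already exist" rather than a counter, the sketch can only express a limit of 1, matching the distinction from good_job described above.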
69
+
70
+ ## Key concepts in Gouda: `executing_on`
71
+
72
+ A `Workload` is executing on a particular `executing_on` entity - usually a worker thread. That entity gets a pseudorandom ID. The `executing_on` value can be used to see, for example, whether a particular worker thread has hung. If multiple workloads have an `updated_at` that is far in the past and are all `executing`, this likely means that the worker has crashed or hung. The value can also be used to build a table of currently running workers.
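
One way to apply this check is to group stale `executing` workloads by their `executing_on` value. The sketch below works on plain structs rather than database rows; `RunningWorkloadSketch`, `suspected_hung_workers` and the 10-minute threshold are all assumptions made for illustration and are not part of Gouda.

```ruby
# Hypothetical sketch on plain structs; Gouda keeps this data in gouda_workloads rows.
RunningWorkloadSketch = Struct.new(:executing_on, :state, :updated_at, keyword_init: true)

# Returns a Hash of executing_on => number of stale `executing` workloads.
def suspected_hung_workers(workloads, now:, stale_after_seconds:)
  workloads
    .select { |w| w.state == "executing" && (now - w.updated_at) > stale_after_seconds }
    .group_by(&:executing_on)
    .transform_values(&:size)
end

now = Time.now
rows = [
  RunningWorkloadSketch.new(executing_on: "worker-a", state: "executing", updated_at: now - 900),
  RunningWorkloadSketch.new(executing_on: "worker-a", state: "executing", updated_at: now - 1200),
  RunningWorkloadSketch.new(executing_on: "worker-b", state: "executing", updated_at: now - 5),
  RunningWorkloadSketch.new(executing_on: "worker-c", state: "finished", updated_at: now - 3600)
]

suspects = suspected_hung_workers(rows, now: now, stale_after_seconds: 600)
```

Here only `worker-a` is flagged: `worker-b` was updated recently, and `worker-c` has no `executing` workloads, so an old `updated_at` on it says nothing about the worker.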
73
+
74
+ ## Usage tips: bulkify your enqueues
75
+
76
+ When possible, Gouda uses `enqueue_all` to `INSERT` as many jobs at once as possible. With modern servers this allows for very rapid insertion of very large
77
+ batches of jobs. It is supplemented by a module which will make all `perform_later` calls buffered and submitted to the queue in bulk:
78
+
79
+ ```ruby
80
+ Gouda.in_bulk do
81
+ User.joined_recently.find_each do |user|
82
+ WelcomeMailer.with(user:).welcome_email.deliver_later
83
+ end
84
+ end
85
+ ```
86
+
87
+ If there are multiple ActiveJob adapters configured and you bulk-enqueue a job which uses an adapter other than Gouda, `in_bulk` will try to use `enqueue_all` on that
88
+ adapter as well.
89
+
90
+ ## Usage tips: co-commit
91
+
92
+ Gouda is designed to `COMMIT` the workload together with your business data. It does not need `after_commit` unless you so choose. In fact,
93
+ the main advantage of DB-based job queues such as Gouda is that you can always rely on the fact that the workload will be enqueued only
94
+ once the data it needs to operate on is already available for reading. This is guaranteed to work:
95
+
96
+ ```ruby
97
+ User.transaction do
98
+ freshly_joined_user = User.create!(user_params)
99
+ WelcomeMailer.with(user: freshly_joined_user).welcome_email.deliver_later
100
+ end
101
+ ```
102
+
103
+ ## Web UI
15
104
 
16
105
  At the moment the Gouda UI is proprietary, so this gem only provides a "headless" implementation. We expect this to change in the future.
17
106
 
data/lib/gouda/railtie.rb CHANGED
@@ -34,8 +34,6 @@ module Gouda
34
34
  # The `to_prepare` block which is executed once in production
35
35
  # and before each request in development.
36
36
  config.to_prepare do
37
- Gouda::Scheduler.update_schedule_from_config!
38
-
39
37
  if defined?(Rails) && Rails.respond_to?(:application)
40
38
  config_from_rails = Rails.application.config.try(:gouda)
41
39
  if config_from_rails
@@ -52,6 +50,9 @@ module Gouda
52
50
  Gouda.config.polling_sleep_interval_seconds = 0.2
53
51
  Gouda.config.logger.level = Gouda.config.log_level
54
52
  end
53
+
54
+ Gouda::Scheduler.build_scheduler_entries_list!
55
+ Gouda::Scheduler.upsert_workloads_from_entries_list!
55
56
  end
56
57
  end
57
58
  end
@@ -53,7 +53,33 @@ module Gouda::Scheduler
53
53
  end
54
54
  end
55
55
 
56
- def self.update_schedule_from_config!(cron_table_hash = nil)
56
+ # Takes in a Hash formatted with cron entries in the format similar
57
+ # to good_job, and builds a table of scheduler entries. A scheduler
58
+ # entry references a particular job class name, the set of arguments to
59
+ # be passed to the job when performing it, and either the interval
60
+ # to repeat the job after or a cron pattern. This method does not
61
+ # insert the actual Workloads into the database but just builds the
62
+ # table of the entries. That table gets consulted when workloads finish
63
+ # to determine whether the workload that just ran was scheduled or ad-hoc,
64
+ # and whether the subsequent workload has to be enqueued.
65
+ #
66
+ # If no table is given the method will attempt to read the table from
67
+ # Rails application config from `[:gouda][:cron]`.
68
+ #
69
+ # The table is a Hash of entries, and the keys are the names of the workload
70
+ # to be enqueued - those keys are also used to ensure scheduled workloads
71
+ # only get scheduled once.
72
+ #
73
+ # @param cron_table_hash[Hash] a hash of the following shape:
74
+ # {
75
+ # download_invoices_every_minute: {
76
+ # cron: "* * * * *",
77
+ # class: "DownloadInvoicesJob",
78
+ # args: ["immediate"]
79
+ # }
80
+ # }
81
+ # @return Array[Entry]
82
+ def self.build_scheduler_entries_list!(cron_table_hash = nil)
57
83
  Gouda.logger.info "Updating scheduled workload entries..."
58
84
  if cron_table_hash.blank?
59
85
  config_from_rails = Rails.application.config.try(:gouda)
@@ -76,6 +102,12 @@ module Gouda::Scheduler
76
102
  end
77
103
  end
78
104
 
105
+ # Once a workload has finished (doesn't matter whether it raised an exception
106
+ # or completed successfully), it is going to be passed to this method to enqueue
107
+ # the next scheduled workload
108
+ #
109
+ # @param finished_workload[Gouda::Workload]
110
+ # @return void
79
111
  def self.enqueue_next_scheduled_workload_for(finished_workload)
80
112
  return unless finished_workload.scheduler_key
81
113
 
@@ -86,11 +118,23 @@ module Gouda::Scheduler
86
118
  Gouda.enqueue_jobs_via_their_adapters([timer_entry.build_active_job])
87
119
  end
88
120
 
121
+ # Returns the list of entries of the scheduler which are currently known. Normally the
122
+ # scheduler will hold the list of entries loaded from the Rails config.
123
+ #
124
+ # @return Array[Entry]
89
125
  def self.entries
90
126
  @cron_table || []
91
127
  end
92
128
 
93
- def self.update_scheduled_workloads!
129
+ # Will upsert (`INSERT ... ON CONFLICT UPDATE`) workloads for all entries which are in the scheduler entries
130
+ # table (the table needs to be read or hydrated first using `build_scheduler_entries_list!`). This is done
131
+ # in a transaction. Any workloads which have been previously inserted from the scheduled entries, but no
132
+ # longer have a corresponding scheduler entry, will be deleted from the database. If there already are workloads
133
+ # with the corresponding scheduler key they will not be touched and will be performed with their previously-defined
134
+ # arguments.
135
+ #
136
+ # @return void
137
+ def self.upsert_workloads_from_entries_list!
94
138
  table_entries = @cron_table || []
95
139
 
96
140
  # Remove any cron keyed workloads which no longer match config-wise
data/lib/gouda/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Gouda
4
- VERSION = "0.1.1"
4
+ VERSION = "0.1.2"
5
5
  end
data/lib/gouda.rb CHANGED
@@ -46,7 +46,7 @@ module Gouda
46
46
  end
47
47
 
48
48
  def self.start
49
- Gouda::Scheduler.update_scheduled_workloads!
49
+ Gouda::Scheduler.upsert_workloads_from_entries_list!
50
50
 
51
51
  queue_constraint = if ENV["GOUDA_QUEUES"]
52
52
  Gouda.parse_queue_constraint(ENV["GOUDA_QUEUES"])
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: gouda
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.1
4
+ version: 0.1.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Sebastian van Hesteren
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2024-06-10 00:00:00.000000000 Z
12
+ date: 2024-06-11 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: activerecord