skiplock 1.0.11 → 1.0.15
- checksums.yaml +4 -4
- data/README.md +34 -1
- data/bin/skiplock +2 -2
- data/lib/active_job/queue_adapters/skiplock_adapter.rb +1 -1
- data/lib/generators/skiplock/templates/migration.rb.erb +7 -8
- data/lib/skiplock/counter.rb +1 -0
- data/lib/skiplock/cron.rb +0 -1
- data/lib/skiplock/job.rb +91 -61
- data/lib/skiplock/manager.rb +67 -72
- data/lib/skiplock/version.rb +1 -1
- data/lib/skiplock/worker.rb +105 -0
- data/lib/skiplock.rb +1 -2
- metadata +2 -3
- data/lib/skiplock/dispatcher.rb +0 -116
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5c965cac0fea118b7a8295bd82999bd33cd2807829dee639b6b782279a27697d
+  data.tar.gz: f5357c225546ef4fcec74a10b3180d07950c7f7f3308a39b2fe19c5668677a0f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d42f8d156e11c25d117a246f30cee23e7c487ecfa80e74e0a10866013f2a6145afd42bb55e3f057ff1687f686ee4e030f851d88fd3c2d3a99b33df745c4617f0
+  data.tar.gz: 981e7adad4bae9281e40bda3e049f68b32458078b98f192f5a9721ba31b40d82347c19a4da1876b210826a066f49f3ff0f2b8c966e9708b2d5b7d7bb890ff8b4
data/README.md
CHANGED
@@ -4,7 +4,7 @@
 
 It only uses the `LISTEN/NOTIFY/SKIP LOCKED` features provided natively on PostgreSQL 9.5+ to efficiently and reliably dispatch jobs to worker processes and threads ensuring that each job can be completed successfully **only once**. No other polling or timer is needed.
 
-The library is quite small compared to other PostgreSQL job queues (eg. *delay_job*, *queue_classic*, *que*, *good_job*) with less than
+The library is quite small compared to other PostgreSQL job queues (eg. *delay_job*, *queue_classic*, *que*, *good_job*) with less than 500 lines of code; and it still provides a similar set of features and more...
 
 #### Compatibility:
 
@@ -54,6 +54,7 @@ The library is quite small compared to other PostgreSQL job queues (eg. *delay_j
 max_threads: 5
 max_retries: 20
 logfile: log/skiplock.log
+loglevel: info
 notification: custom
 extensions: false
 purge_completion: true
@@ -67,6 +68,7 @@ The library is quite small compared to other PostgreSQL job queues (eg. *delay_j
 - **max_threads** (*integer*): sets the maximum number of threads allowed to run jobs
 - **max_retries** (*integer*): sets the maximum attempt a job will be retrying before it is marked expired. See `Retry system` for more details
 - **logfile** (*string*): path filename for skiplock logs; empty logfile will disable logging
+- **loglevel** (*string*): sets logging level (`debug, info, warn, error, fatal, unknown`)
 - **notification** (*string*): sets the library to be used for notifying errors and exceptions (`auto, airbrake, bugsnag, exception_notification, custom`); using `auto` will detect library if available. See `Notification system` for more details
 - **extensions** (*boolean*): enable or disable the class method extension. See `ClassMethod extension` for more details
 - **purge_completion** (*boolean*): when set to **true** will delete jobs after they were completed successfully; if set to **false** then the completed jobs should be purged periodically to maximize performance (eg. clean up old jobs after 3 months)
@@ -185,6 +187,37 @@ If the `retry_on` block is not defined, then the built-in retry system of `skipl
 Subscription.skiplock(wait_until: Date.tomorrow.noon).charge(amount: 100)
 ```
 
+## Fault tolerant
+`Skiplock` ensures that jobs will be executed successfully only once even if the database connection is lost during or after the job was dispatched. Successful jobs are marked as completed or removed (with `purge_completion` turned on), and failed or interrupted jobs are marked for retry; however, when the database connection is dropped for any reason and the commit is lost, `Skiplock` will then save the commit data to local disk (as `tmp/skiplock/<job_id>`) and synchronize with the database when the connection resumes. This also protects in-progress jobs that were terminated abruptly during a graceful shutdown with timeout; they will be queued for retry.
+
+## Scalability
+`Skiplock` can scale both vertically and horizontally. To scale vertically, simply increase the number of `Skiplock` workers per host. To scale horizontally, simply deploy `Skiplock` to multiple hosts sharing the same PostgreSQL database.
+
+## Statistics and counters
+The `skiplock.workers` database table contains all the `Skiplock` workers running on all the hosts. Each active worker updates its timestamp column (`updated_at`) every minute, and dispatched jobs are associated with the running workers. At any given time, a list of active workers running a list of jobs can be determined using the database table.
+
+The `skiplock.counters` database table contains all historical job dispatches, completions, expiries, failures and retries. The counters are recorded by dates; so it's possible to get statistical data for any given day or range of dates.
+
+- **completions**: number of jobs completed successfully
+- **dispatches**: number of jobs dispatched for the first time (**retries** are not counted here)
+- **expiries**: number of jobs that exceeded `max_retries` and still failed to complete
+- **failures**: number of jobs interrupted by graceful shutdown or errors (exceptions)
+- **retries**: number of jobs dispatched for retrying
+
+Code examples of gathering counters information:
+- get counter information for today
+```ruby
+Skiplock::Counter.where(day: Date.today).first
+```
+- get total number of successfully completed jobs within the past 30 days
+```ruby
+Skiplock::Counter.where("day >= ?", 30.days.ago).sum(:completions)
+```
+- get total number of expired jobs
+```ruby
+Skiplock::Counter.sum(:expiries)
+```
+
 ## Contributing
 
 Bug reports and pull requests are welcome on GitHub at https://github.com/vtt/skiplock.
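The counters described in the README changes above are plain per-day totals, so they can also be combined outside of ActiveRecord. The following is a minimal sketch that does not touch the database: `CounterRow`, `summarize` and `completion_rate` are hypothetical names, and the sample numbers are invented for illustration only.

```ruby
require 'date'

# Hypothetical stand-in for one row of the skiplock.counters table.
CounterRow = Struct.new(:day, :completions, :dispatches, :expiries, :failures, :retries, keyword_init: true)

# Sums every counter over the rows whose day falls inside the given range.
def summarize(rows, range)
  totals = { completions: 0, dispatches: 0, expiries: 0, failures: 0, retries: 0 }
  rows.each do |row|
    next unless range.cover?(row.day)
    totals.each_key { |k| totals[k] += row[k].to_i }
  end
  totals
end

# Share of first-time dispatches that ended in a successful completion.
def completion_rate(totals)
  return nil if totals[:dispatches].zero?
  totals[:completions].to_f / totals[:dispatches]
end

# Invented sample data, not real statistics.
rows = [
  CounterRow.new(day: Date.new(2021, 9, 1), completions: 90, dispatches: 100, expiries: 2, failures: 15, retries: 12),
  CounterRow.new(day: Date.new(2021, 9, 2), completions: 45, dispatches: 50, expiries: 0, failures: 5, retries: 5),
  CounterRow.new(day: Date.new(2021, 8, 1), completions: 10, dispatches: 10, expiries: 0, failures: 0, retries: 0)
]
totals = summarize(rows, Date.new(2021, 9, 1)..Date.new(2021, 9, 30))
rate = completion_rate(totals)
```

Because **dispatches** excludes retries, dividing completions by dispatches gives a rough per-job success ratio rather than a per-attempt one.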
data/bin/skiplock
CHANGED
@@ -5,7 +5,7 @@ begin
   op = OptionParser.new do |opts|
     opts.banner = "Usage: #{File.basename($0)} [options]"
     opts.on('-e', '--environment STRING', String, 'Rails environment')
-    opts.on('-l', '--logfile STRING', String, '
+    opts.on('-l', '--logfile STRING', String, 'Log filename')
     opts.on('-s', '--graceful-shutdown NUM', Integer, 'Number of seconds to wait for graceful shutdown')
     opts.on('-r', '--max-retries NUM', Integer, 'Number of maxixum retries')
     opts.on('-t', '--max-threads NUM', Integer, 'Number of maximum threads')
@@ -25,4 +25,4 @@ options.transform_keys! { |k| k.to_s.gsub('-', '_').to_sym }
 env = options.delete(:environment)
 ENV['RAILS_ENV'] = env if env
 require File.expand_path("config/environment.rb")
-
+Rails.application.config.skiplock.standalone(**options)
data/lib/active_job/queue_adapters/skiplock_adapter.rb
CHANGED
@@ -2,7 +2,7 @@ module ActiveJob
   module QueueAdapters
     class SkiplockAdapter
       def initialize
-        Rails.application.config.after_initialize { Skiplock::Manager.new }
+        Rails.application.config.after_initialize { Rails.application.config.skiplock = Skiplock::Manager.new }
       end
 
       def enqueue(job)
data/lib/generators/skiplock/templates/migration.rb.erb
CHANGED
@@ -43,24 +43,23 @@ class CreateSkiplockSchema < ActiveRecord::Migration<%= "[#{ActiveRecord::VERSIO
   record = NEW;
   IF (TG_OP = 'DELETE') THEN
     record = OLD;
-    IF (record.
-
+    IF (record.running = TRUE) THEN
+      INSERT INTO skiplock.counters (day,completions) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET completions = skiplock.counters.completions + 1;
     END IF;
-    INSERT INTO skiplock.counters (day,completions) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET completions = skiplock.counters.completions + 1;
   ELSIF (record.running = TRUE) THEN
-    IF (record.executions
-      INSERT INTO skiplock.counters (day,dispatches) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET dispatches = skiplock.counters.dispatches + 1;
-    ELSE
+    IF (record.executions > 0) THEN
       INSERT INTO skiplock.counters (day,retries) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET retries = skiplock.counters.retries + 1;
+    ELSE
+      INSERT INTO skiplock.counters (day,dispatches) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET dispatches = skiplock.counters.dispatches + 1;
     END IF;
   ELSIF (record.finished_at IS NOT NULL) THEN
     INSERT INTO skiplock.counters (day,completions) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET completions = skiplock.counters.completions + 1;
   ELSIF (record.expired_at IS NOT NULL) THEN
     INSERT INTO skiplock.counters (day,expiries) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET expiries = skiplock.counters.expiries + 1;
-  ELSIF (record.executions
+  ELSIF (record.executions > 0) THEN
     INSERT INTO skiplock.counters (day,failures) VALUES (NOW(),1) ON CONFLICT (day) DO UPDATE SET failures = skiplock.counters.failures + 1;
   END IF;
-  PERFORM pg_notify('skiplock::jobs', CONCAT(TG_OP,',',record.id::TEXT,',',record.worker_id::TEXT,',',record.queue_name,',',record.running::TEXT,',',CAST(EXTRACT(EPOCH FROM record.expired_at) AS FLOAT)::TEXT,',',CAST(EXTRACT(EPOCH FROM record.finished_at) AS FLOAT)::TEXT,',',CAST(EXTRACT(EPOCH FROM CASE WHEN record.scheduled_at IS NULL THEN record.updated_at ELSE record.scheduled_at END) AS FLOAT)::TEXT));
+  PERFORM pg_notify('skiplock::jobs', CONCAT(TG_OP,',',record.id::TEXT,',',record.worker_id::TEXT,',',record.job_class,',',record.queue_name,',',record.running::TEXT,',',CAST(EXTRACT(EPOCH FROM record.expired_at) AS FLOAT)::TEXT,',',CAST(EXTRACT(EPOCH FROM record.finished_at) AS FLOAT)::TEXT,',',CAST(EXTRACT(EPOCH FROM CASE WHEN record.scheduled_at IS NULL THEN record.updated_at ELSE record.scheduled_at END) AS FLOAT)::TEXT));
   RETURN NULL;
 END;
 $$ LANGUAGE plpgsql
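The branch order of the 1.0.15 trigger in the hunk above decides which counter a row change increments. As a reading aid, the same decision can be sketched in plain Ruby. `counter_for` is a hypothetical helper, not part of the gem, and the hash stands in for the `NEW` row (or the `OLD` row on `DELETE`).

```ruby
# Returns the counter the 1.0.15 trigger would increment, or nil for none.
# op is the trigger operation ('INSERT', 'UPDATE' or 'DELETE').
def counter_for(op, record)
  executions = record[:executions].to_i # a NULL executions value counts as 0
  if op == 'DELETE'
    # Deleting a job that is still marked running counts as a completion
    record[:running] ? :completions : nil
  elsif record[:running]
    executions > 0 ? :retries : :dispatches
  elsif record[:finished_at]
    :completions
  elsif record[:expired_at]
    :expiries
  elsif executions > 0
    :failures
  end
end

counter_for('UPDATE', running: true, executions: 0)  # first dispatch
counter_for('UPDATE', running: true, executions: 2)  # retry
counter_for('UPDATE', running: false, executions: 1) # failure
```

The `DELETE` branch ties in with `purge_completion`: a purged job is deleted while still marked running, which is presumably how the completion gets counted.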
data/lib/skiplock/counter.rb
CHANGED
data/lib/skiplock/cron.rb
CHANGED
data/lib/skiplock/job.rb
CHANGED
@@ -1,38 +1,32 @@
 module Skiplock
   class Job < ActiveRecord::Base
     self.implicit_order_column = 'created_at'
+    attr_accessor :activejob_retry
+    belongs_to :worker, inverse_of: :jobs, required: false
 
-
-
-
-
-
-
-
-
-
-      job_data = job.attributes.slice('job_class', 'queue_name', 'locale', 'timezone', 'priority', 'executions', 'exception_executions').merge('job_id' => job.id, 'enqueued_at' => job.updated_at, 'arguments' => (job.data['arguments'] || []))
-      job.executions = (job.executions || 0) + 1
-      Skiplock.logger.info "[Skiplock] Performing #{job.job_class} (#{job.id}) from queue '#{job.queue_name || 'default'}'..."
-      Thread.current[:skiplock_dispatch_job] = job
-      start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
-      begin
-        ActiveJob::Base.execute(job_data)
-      rescue Exception => ex
-        Skiplock.logger.error(ex)
-      end
-      unless ex
-        end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
-        job_name = job.job_class
-        if job.job_class == 'Skiplock::Extension::ProxyJob'
-          target, method_name = ::YAML.load(job.data['arguments'].first)
-          job_name = "'#{target.name}.#{method_name}'"
+    # resynchronize jobs that could not commit to database and retry any abandoned jobs
+    def self.cleanup(purge_completion: true, max_retries: 20)
+      Dir.mkdir('tmp/skiplock') unless Dir.exist?('tmp/skiplock')
+      Dir.glob('tmp/skiplock/*').each do |f|
+        job_from_db = self.find_by(id: File.basename(f), running: true)
+        disposed = true
+        if job_from_db
+          job, ex = YAML.load_file(f) rescue nil
+          disposed = job.dispose(ex, purge_completion: purge_completion, max_retries: max_retries) if job
         end
-
+        (File.delete(f) rescue nil) if disposed
       end
-
-
-
+      self.where(running: true).where.not(worker_id: Worker.ids).update_all(running: false, worker_id: nil)
+    end
+
+    def self.dispatch(purge_completion: true, max_retries: 20)
+      job = nil
+      self.connection.transaction do
+        job = self.find_by_sql("SELECT id, scheduled_at FROM skiplock.jobs WHERE running = FALSE AND expired_at IS NULL AND finished_at IS NULL ORDER BY scheduled_at ASC NULLS FIRST, priority ASC NULLS LAST, created_at ASC FOR UPDATE SKIP LOCKED LIMIT 1").first
+        return if job.nil? || job.scheduled_at.to_f > Time.now.to_f
+        job = self.find_by_sql("UPDATE skiplock.jobs SET running = TRUE, updated_at = NOW() WHERE id = '#{job.id}' RETURNING *").first
+      end
+      self.dispatch(purge_completion: purge_completion, max_retries: max_retries) if job.execute(purge_completion: purge_completion, max_retries: max_retries)
     end
 
     def self.enqueue(activejob)
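The `cleanup` method in the hunk above replays results that could not be committed while the database was unreachable. The general spill-to-disk-then-replay pattern can be sketched without Rails as follows. This is a simplified illustration under stated assumptions, not the gem's implementation: `FlakyStore`, `dispose` and `resync` are hypothetical names, and the sketch skips the `running: true` lookup the real method performs before replaying a file.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical stand-in for a database that may be unreachable.
class FlakyStore
  attr_accessor :online
  attr_reader :rows

  def initialize
    @online = true
    @rows = {}
  end

  def commit(id, attrs)
    raise IOError, 'connection lost' unless online
    @rows[id] = attrs
  end
end

# Try to commit; on failure keep the result on disk as <dir>/<job_id>.
def dispose(store, dir, id, attrs)
  store.commit(id, attrs)
  true
rescue IOError
  File.write(File.join(dir, id), attrs.to_yaml)
  false
end

# Replay every spilled result; delete a file only after its commit succeeds.
def resync(store, dir)
  Dir.glob(File.join(dir, '*')).each do |path|
    attrs = YAML.load_file(path)
    begin
      store.commit(File.basename(path), attrs)
      File.delete(path)
    rescue IOError
      # still offline: leave the file in place for the next attempt
    end
  end
end

Dir.mktmpdir do |dir|
  store = FlakyStore.new
  store.online = false
  dispose(store, dir, 'job-1', { 'running' => false }) # spilled to disk
  store.online = true
  resync(store, dir)                                   # committed, file removed
end
```

Deleting the file only after a successful commit means a second outage during replay loses nothing; the file is simply retried on the next pass.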
@@ -42,13 +36,14 @@ module Skiplock
     def self.enqueue_at(activejob, timestamp)
       timestamp = Time.at(timestamp) if timestamp
       if Thread.current[:skiplock_dispatch_job].try(:id) == activejob.job_id
-        Thread.current[:skiplock_dispatch_job].
+        Thread.current[:skiplock_dispatch_job].activejob_retry = true
         Thread.current[:skiplock_dispatch_job].executions = activejob.executions
+        Thread.current[:skiplock_dispatch_job].exception_executions = activejob.exception_executions
         Thread.current[:skiplock_dispatch_job].scheduled_at = timestamp
         Thread.current[:skiplock_dispatch_job]
       else
         serialize = activejob.serialize
-
+        self.create!(serialize.slice(*self.column_names).merge('id' => serialize['job_id'], 'data' => { 'arguments' => serialize['arguments'] }, 'scheduled_at' => timestamp))
       end
     end
 
@@ -57,47 +52,82 @@ module Skiplock
     end
 
     def dispose(ex, purge_completion: true, max_retries: 20)
-
+      yaml = [self, ex].to_yaml
+      purging = false
       self.running = false
       self.worker_id = nil
-      self.updated_at =
+      self.updated_at = Time.now > self.updated_at ? Time.now : self.updated_at + 1 # in case of clock drifting
       if ex
-        self.exception_executions
-
+        self.exception_executions ||= {}
+        self.exception_executions["[#{ex.class.name}]"] = self.exception_executions["[#{ex.class.name}]"].to_i + 1 unless self.activejob_retry
+        if self.executions.to_i >= max_retries || self.activejob_retry
           self.expired_at = Time.now
         else
-          self.scheduled_at = Time.now + (5 * 2**self.executions)
+          self.scheduled_at = Time.now + (5 * 2**self.executions.to_i)
         end
-        self.save!
         Skiplock.on_errors.each { |p| p.call(ex) }
-      elsif self.
-        self.
-
-
-
-
-
-
-
-
-
-
-
-
-
+      elsif self.finished_at
+        if self.cron
+          self.data ||= {}
+          self.data['crons'] = (self.data['crons'] || 0) + 1
+          self.data['last_cron_at'] = Time.now.utc.to_s
+          next_cron_at = Cron.next_schedule_at(self.cron)
+          if next_cron_at
+            self.executions = nil
+            self.exception_executions = nil
+            self.scheduled_at = Time.at(next_cron_at)
+          else
+            Skiplock.logger.error("[Skiplock] ERROR: Invalid CRON '#{self.cron}' for Job #{self.job_class}") if Skiplock.logger
+            purging = true
+          end
+        elsif purge_completion
+          purging = true
         end
-      elsif purge_completion
-        self.delete
-      else
-        self.finished_at = Time.now
-        self.exception_executions = nil
-        self.save!
       end
-      self
+      purging ? self.delete : self.update_columns(self.attributes.slice(*self.changes.keys))
     rescue Exception => e
-
-
+      File.write("tmp/skiplock/#{self.id}", yaml) rescue nil
+      if Skiplock.logger
+        Skiplock.logger.error(e.to_s)
+        Skiplock.logger.error(e.backtrace.join("\n"))
+      end
+      Skiplock.on_errors.each { |p| p.call(e) }
       nil
     end
+
+    def execute(purge_completion: true, max_retries: 20)
+      Skiplock.logger.info("[Skiplock] Performing #{self.job_class} (#{self.id}) from queue '#{self.queue_name || 'default'}'...") if Skiplock.logger
+      self.data ||= {}
+      self.exception_executions ||= {}
+      self.activejob_retry = false
+      job_data = self.attributes.slice('job_class', 'queue_name', 'locale', 'timezone', 'priority', 'executions', 'exception_executions').merge('job_id' => self.id, 'enqueued_at' => self.updated_at, 'arguments' => (self.data['arguments'] || []))
+      self.executions = self.executions.to_i + 1
+      Thread.current[:skiplock_dispatch_job] = self
+      start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+      begin
+        ActiveJob::Base.execute(job_data)
+        self.finished_at = Time.now unless self.activejob_retry
+      rescue Exception => ex
+      end
+      if Skiplock.logger
+        if ex || self.activejob_retry
+          Skiplock.logger.error("[Skiplock] Job #{self.job_class} (#{self.id}) was interrupted by an exception#{ ' (rescued and retried by ActiveJob)' if self.activejob_retry }")
+          if ex
+            Skiplock.logger.error(ex.to_s)
+            Skiplock.logger.error(ex.backtrace.join("\n"))
+          end
+        else
+          end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+          job_name = self.job_class
+          if self.job_class == 'Skiplock::Extension::ProxyJob'
+            target, method_name = ::YAML.load(self.data['arguments'].first)
+            job_name = "'#{target.name}.#{method_name}'"
+          end
+          Skiplock.logger.info "[Skiplock] Performed #{job_name} (#{self.id}) from queue '#{self.queue_name || 'default'}' in #{end_time - start_time} seconds"
+        end
+      end
+    ensure
+      self.dispose(ex, purge_completion: purge_completion, max_retries: max_retries)
+    end
   end
 end
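The retry schedule in `dispose` above follows `5 * 2**executions` seconds. To make the growth concrete, here is a small sketch of that arithmetic. `retry_delay` and `next_action` are hypothetical helpers, and the sketch leaves out the `activejob_retry` condition that the real method also checks.

```ruby
# Delay in seconds before the next attempt, following 5 * 2**executions.
def retry_delay(executions)
  5 * 2**executions.to_i
end

# Simplified view of the failure branch: expire once executions reaches
# max_retries, otherwise schedule the next attempt after the backoff delay.
def next_action(executions, max_retries: 20, now: Time.now)
  if executions.to_i >= max_retries
    { expired_at: now }
  else
    { scheduled_at: now + retry_delay(executions) }
  end
end

delays = (1..5).map { |n| retry_delay(n) } # => [10, 20, 40, 80, 160]
```

Since `execute` increments `executions` before `dispose` runs, the first failure is rescheduled 10 seconds out and each later failure doubles the wait.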
data/lib/skiplock/manager.rb
CHANGED
@@ -6,30 +6,59 @@ module Skiplock
       @config.symbolize_keys!
       @config.transform_values! {|v| v.is_a?(String) ? v.downcase : v}
       @config.merge!(config)
-      Module.__send__(:include, Skiplock::Extension) if @config[:extensions] == true
-      return unless @config[:standalone] || (caller.any?{ |l| l =~ %r{/rack/} } && (@config[:workers] == 0 || Rails.env.development?))
       @config[:hostname] = `hostname -f`.strip
-
-
+      configure
+      setup_logger
+      Module.__send__(:include, Skiplock::Extension) if @config[:extensions] == true
+      if (caller.any?{ |l| l =~ %r{/rack/} } && @config[:workers] == 0)
+        cleanup_workers
+        @worker = create_worker
+        @worker.start(**@config)
+        at_exit { @worker.shutdown }
+      end
+    rescue Exception => ex
+      @logger.error(ex.to_s)
+      @logger.error(ex.backtrace.join("\n"))
+    end
+
+    def standalone(**options)
+      @config.merge!(options)
+      Rails.logger.reopen('/dev/null')
+      Rails.logger.extend(ActiveSupport::Logger.broadcast(@logger))
+      @config[:workers] = 1 if @config[:workers] <= 0
+      @config[:standalone] = true
+      banner
       cleanup_workers
-      create_worker
-
-
-
-
-
-
-
-
-
-
+      @worker = create_worker
+      @parent_id = Process.pid
+      @shutdown = false
+      Signal.trap('INT') { @shutdown = true }
+      Signal.trap('TERM') { @shutdown = true }
+      Signal.trap('HUP') { setup_logger }
+      (@config[:workers] - 1).times do |n|
+        fork do
+          sleep 1
+          worker = create_worker(master: false)
+          worker.start(worker_num: n + 1, **@config)
+          loop do
+            sleep 0.5
+            break if @shutdown || Process.ppid != @parent_id
+          end
+          worker.shutdown
         end
       end
-
-
+      @worker.start(**@config)
+      loop do
+        sleep 0.5
+        break if @shutdown
+      end
+      @logger.info "[Skiplock] Terminating signal... Waiting for jobs to finish (up to #{@config[:graceful_shutdown]} seconds)..." if @config[:graceful_shutdown]
+      Process.waitall
+      @worker.shutdown
+      @logger.info "[Skiplock] Shutdown completed."
     end
 
-
+    private
 
     def banner
       title = "Skiplock #{Skiplock::VERSION} (Rails #{Rails::VERSION::STRING} | Ruby #{RUBY_VERSION}-p#{RUBY_PATCHLEVEL})"
@@ -43,6 +72,7 @@ module Skiplock
       @logger.info " Min threads: #{@config[:min_threads]}"
       @logger.info " Max threads: #{@config[:max_threads]}"
       @logger.info " Environment: #{Rails.env}"
+      @logger.info " Loglevel: #{@config[:loglevel]}"
       @logger.info " Logfile: #{@config[:logfile] || '(disabled)'}"
       @logger.info " Workers: #{@config[:workers]}"
       @logger.info " Queues: #{@config[:queues].map {|k,v| k + '(' + v.to_s + ')'}.join(', ')}" if @config[:queues].is_a?(Hash)
@@ -52,24 +82,22 @@ module Skiplock
     end
 
     def cleanup_workers
+      Rails.application.eager_load! if Rails.env.development?
       delete_ids = []
       Worker.where(hostname: @config[:hostname]).each do |worker|
         sid = Process.getsid(worker.pid) rescue nil
-        delete_ids << worker.id if worker.sid != sid || worker.updated_at <
-      end
-      if delete_ids.count > 0
-        Job.where(running: true, worker_id: delete_ids).update_all(running: false, worker_id: nil)
-        Worker.where(id: delete_ids).delete_all
+        delete_ids << worker.id if worker.sid != sid || worker.updated_at < 10.minutes.ago
       end
+      Worker.where(id: delete_ids).delete_all if delete_ids.count > 0
     end
 
-    def create_worker(
-
+    def create_worker(master: true)
+      Worker.create!(pid: Process.pid, sid: Process.getsid(), master: master, hostname: @config[:hostname], capacity: @config[:max_threads])
     rescue
-
+      Worker.create!(pid: Process.pid, sid: Process.getsid(), master: false, hostname: @config[:hostname], capacity: @config[:max_threads])
     end
 
-    def
+    def configure
       @config[:graceful_shutdown] = 300 if @config[:graceful_shutdown] > 300
       @config[:graceful_shutdown] = nil if @config[:graceful_shutdown] < 0
       @config[:max_retries] = 20 if @config[:max_retries] > 20
@@ -78,19 +106,6 @@ module Skiplock
       @config[:max_threads] = 20 if @config[:max_threads] > 20
       @config[:min_threads] = 0 if @config[:min_threads] < 0
       @config[:workers] = 0 if @config[:workers] < 0
-      @config[:workers] = 1 if @config[:standalone] && @config[:workers] <= 0
-      @logger = ActiveSupport::Logger.new(STDOUT)
-      @logger.level = Rails.logger.level
-      Skiplock.logger = @logger
-      raise "Cannot create logfile '#{@config[:logfile]}'" if @config[:logfile] && !File.writable?(File.dirname(@config[:logfile]))
-      @config[:logfile] = nil if @config[:logfile].to_s.length == 0
-      if @config[:logfile]
-        @logger.extend(ActiveSupport::Logger.broadcast(::Logger.new(@config[:logfile])))
-        if @config[:standalone]
-          Rails.logger.reopen('/dev/null')
-          Rails.logger.extend(ActiveSupport::Logger.broadcast(@logger))
-        end
-      end
       @config[:queues].values.each { |v| raise 'Queue value must be an integer' unless v.is_a?(Integer) } if @config[:queues].is_a?(Hash)
       if @config[:notification] == 'auto'
         if defined?(Airbrake)
@@ -125,39 +140,19 @@ module Skiplock
       Skiplock.on_errors.freeze unless Skiplock.on_errors.frozen?
     end
 
-    def
-
-
-
-
-
-
-
-          worker = create_worker(master: false)
-          dispatcher = Dispatcher.new(worker: worker, worker_num: n + 1, **@config)
-          thread = dispatcher.run
-          loop do
-            sleep 0.5
-            break if shutdown || Process.ppid != parent_id
-          end
-          dispatcher.shutdown
-          thread.join(@config[:graceful_shutdown])
-          worker.delete
-          exit
-        end
-      end
-      dispatcher = Dispatcher.new(worker: @worker, **@config)
-      thread = dispatcher.run
-      loop do
-        sleep 0.5
-        break if shutdown
+    def setup_logger
+      @config[:loglevel] = 'info' unless ['debug','info','warn','error','fatal','unknown'].include?(@config[:loglevel].to_s)
+      @logger = ActiveSupport::Logger.new(STDOUT)
+      @logger.level = @config[:loglevel].to_sym
+      Skiplock.logger = @logger
+      if @config[:logfile].to_s.length > 0
+        @logger.extend(ActiveSupport::Logger.broadcast(::Logger.new(File.join(Rails.root, 'log', @config[:logfile].to_s), 'daily')))
+        ActiveJob::Base.logger = nil
       end
-
-
-
-
-      @worker.delete
-      @logger.info "[Skiplock] Shutdown completed."
+    rescue Exception => ex
+      puts "Exception with logger: #{ex.to_s}"
+      puts ex.backtrace.join("\n")
+      Skiplock.on_errors.each { |p| p.call(ex) }
     end
   end
 end
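The new `setup_logger` method above falls back to `info` when `loglevel` is not one of the six accepted names. That behaviour can be tried with Ruby's standard `Logger` alone. The sketch below is illustrative only: `normalize_loglevel` and `build_logger` are hypothetical helpers, and it uses the stdlib `Logger` rather than `ActiveSupport::Logger`.

```ruby
require 'logger'
require 'stringio'

LEVELS = %w[debug info warn error fatal unknown].freeze

# Falls back to 'info' when the configured value is not a known level name.
def normalize_loglevel(value)
  LEVELS.include?(value.to_s) ? value.to_s : 'info'
end

# Builds a logger on the given IO with the normalized level applied.
def build_logger(io, loglevel)
  logger = Logger.new(io)
  logger.level = normalize_loglevel(loglevel).to_sym
  logger
end

io = StringIO.new
logger = build_logger(io, 'warn')
logger.info('dropped: below the configured level')
logger.warn('kept: at the configured level')
```

An unrecognized value such as `'verbose'` yields a logger at `Logger::INFO` instead of raising, which matches the forgiving behaviour of the gem's config handling.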
data/lib/skiplock/version.rb
CHANGED
data/lib/skiplock/worker.rb
CHANGED
@@ -1,5 +1,110 @@
 module Skiplock
   class Worker < ActiveRecord::Base
     self.implicit_order_column = 'created_at'
+    has_many :jobs, inverse_of: :worker
+
+    def start(worker_num: 0, **config)
+      @config = config
+      @queues_order_query = @config[:queues].map { |q,v| "WHEN queue_name = '#{q}' THEN #{v}" }.join(' ') if @config[:queues].is_a?(Hash) && @config[:queues].count > 0
+      @next_schedule_at = Time.now.to_f
+      @executor = Concurrent::ThreadPoolExecutor.new(min_threads: @config[:min_threads] + 1, max_threads: @config[:max_threads] + 1, max_queue: @config[:max_threads], idletime: 60, auto_terminate: true, fallback_policy: :discard)
+      if self.master
+        Job.cleanup(purge_completion: @config[:purge_completion], max_retries: @config[:max_retries])
+        Cron.setup
+      end
+      @running = true
+      Process.setproctitle("skiplock-#{self.master ? 'master[0]' : 'worker[' + worker_num.to_s + ']'}") if @config[:standalone]
+      @executor.post { run }
+    end
+
+    def shutdown
+      @running = false
+      @executor.shutdown
+      @executor.kill unless @executor.wait_for_termination(@config[:graceful_shutdown])
+      self.delete
+    end
+
+    private
+
+    def get_next_available_job
+      @connection.transaction do
+        job = Job.find_by_sql("SELECT id, scheduled_at FROM skiplock.jobs WHERE running = FALSE AND expired_at IS NULL AND finished_at IS NULL ORDER BY scheduled_at ASC NULLS FIRST,#{@queues_order_query ? ' CASE ' + @queues_order_query + ' ELSE NULL END ASC NULLS LAST,' : ''} priority ASC NULLS LAST, created_at ASC FOR UPDATE SKIP LOCKED LIMIT 1").first
+        if job && job.scheduled_at.to_f <= Time.now.to_f
+          job = Job.find_by_sql("UPDATE skiplock.jobs SET running = TRUE, worker_id = '#{self.id}', updated_at = NOW() WHERE id = '#{job.id}' RETURNING *").first
+        end
+        job
+      end
+    end
+
+    def run
+      error = false
+      listen = false
+      timestamp = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+      while @running
+        Rails.application.reloader.wrap do
+          begin
+            unless listen
+              @connection = self.class.connection
+              @connection.exec_query('LISTEN "skiplock::jobs"')
+              listen = true
+            end
+            if error
+              unless @connection.active?
+                @connection.reconnect!
+                sleep(0.5)
+                @connection.exec_query('LISTEN "skiplock::jobs"')
+                @next_schedule_at = Time.now.to_f
+              end
+              Job.cleanup if self.master
+              error = false
+            end
+            if Time.now.to_f >= @next_schedule_at && @executor.remaining_capacity > 0
+              job = get_next_available_job
+              if job.try(:running)
+                @executor.post { Rails.application.reloader.wrap { job.execute(purge_completion: @config[:purge_completion], max_retries: @config[:max_retries]) } }
+              else
+                @next_schedule_at = (job ? job.scheduled_at.to_f : Float::INFINITY)
+              end
+            end
+            job_notifications = []
+            @connection.raw_connection.wait_for_notify(0.4) do |channel, pid, payload|
+              job_notifications << payload if payload
+              loop do
+                payload = @connection.raw_connection.notifies
+                break unless @running && payload
+                job_notifications << payload[:extra]
+              end
+              job_notifications.each do |n|
+                op, id, worker_id, job_class, queue_name, running, expired_at, finished_at, scheduled_at = n.split(',')
+                next if op == 'DELETE' || running == 'true' || expired_at.to_f > 0 || finished_at.to_f > 0
+                @next_schedule_at = scheduled_at.to_f if scheduled_at.to_f < @next_schedule_at
+              end
+            end
+            if Process.clock_gettime(Process::CLOCK_MONOTONIC) - timestamp > 60
+              self.touch
+              timestamp = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+            end
+          rescue Exception => ex
+            # most likely error with database connection
+            Skiplock.logger.error(ex.to_s)
+            Skiplock.logger.error(ex.backtrace.join("\n"))
+            Skiplock.on_errors.each { |p| p.call(ex, @last_exception) }
+            error = true
+            wait(5)
+            @last_exception = ex
+          end
+          sleep(0.3)
+        end
+      end
+      @connection.exec_query('UNLISTEN *')
+    end
+
+    def wait(timeout)
+      t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
+      while @running
+        sleep(0.5)
+        break if Process.clock_gettime(Process::CLOCK_MONOTONIC) - t > timeout
+      end
+    end
   end
 end
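The worker's `run` loop above splits each `skiplock::jobs` notification on commas and uses it to pull `@next_schedule_at` forward. That step can be sketched in isolation as follows. `parse_notification` and `next_schedule_at` are hypothetical helpers, and the sample payload is invented to match the nine-field layout built by the trigger's `pg_notify` call.

```ruby
FIELDS = %i[op id worker_id job_class queue_name running expired_at finished_at scheduled_at].freeze

# Splits a payload into a hash keyed by field name. NULL columns arrive as
# empty strings because CONCAT treats NULL as ''.
def parse_notification(payload)
  FIELDS.zip(payload.split(',', FIELDS.size)).to_h
end

# Returns the earlier of the current wake-up time and the job's schedule,
# ignoring notifications for jobs that cannot be dispatched.
def next_schedule_at(current, payload)
  n = parse_notification(payload)
  return current if n[:op] == 'DELETE' || n[:running] == 'true'
  return current if n[:expired_at].to_f > 0 || n[:finished_at].to_f > 0
  [current, n[:scheduled_at].to_f].min
end

# Invented sample: a newly enqueued, not yet running job.
payload = 'INSERT,abc,,MyJob,default,false,,,1631145600.5'
wake_at = next_schedule_at(Float::INFINITY, payload)
```

Starting from `Float::INFINITY` mirrors the worker's idle state: with no dispatchable job known, the first eligible notification sets the next wake-up time.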
data/lib/skiplock.rb
CHANGED
@@ -3,7 +3,6 @@ require 'active_job/queue_adapters/skiplock_adapter'
|
|
3
3
|
require 'active_record'
|
4
4
|
require 'skiplock/counter'
|
5
5
|
require 'skiplock/cron'
|
6
|
-
require 'skiplock/dispatcher'
|
7
6
|
require 'skiplock/extension'
|
8
7
|
require 'skiplock/job'
|
9
8
|
require 'skiplock/manager'
|
@@ -11,7 +10,7 @@ require 'skiplock/worker'
|
|
11
10
|
require 'skiplock/version'
|
12
11
|
|
13
12
|
module Skiplock
|
14
|
-
DEFAULT_CONFIG = { 'extensions' => false, 'logfile' => '
|
13
|
+
DEFAULT_CONFIG = { 'extensions' => false, 'logfile' => 'skiplock.log', 'loglevel' => 'info', 'graceful_shutdown' => 15, 'min_threads' => 1, 'max_threads' => 5, 'max_retries' => 20, 'notification' => 'custom', 'purge_completion' => true, 'queues' => { 'default' => 100, 'mailers' => 999 }, 'workers' => 0 }.freeze
|
15
14
|
|
16
15
|
def self.logger=(l)
|
17
16
|
@logger = l
|
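The `'queues'` entry in `DEFAULT_CONFIG` maps queue names to numeric priorities. The deleted dispatcher.rb further down in this diff turned such a hash into a run of SQL `WHEN ... THEN ...` clauses. The sketch below reproduces that expression in isolation. The helper name `queues_order_fragment` is hypothetical, and how 1.0.15 consumes the fragment is not visible in this section.

```ruby
# Priorities copied from the 'queues' entry of DEFAULT_CONFIG in the diff above.
queues = { 'default' => 100, 'mailers' => 999 }

# Hypothetical helper name; the expression mirrors the one in the deleted
# Dispatcher#initialize. Returns nil when no queue priorities are configured.
def queues_order_fragment(queues)
  return nil unless queues.is_a?(Hash) && queues.count > 0
  queues.map { |q, v| "WHEN queue_name = '#{q}' THEN #{v}" }.join(' ')
end

puts queues_order_fragment(queues)
# prints: WHEN queue_name = 'default' THEN 100 WHEN queue_name = 'mailers' THEN 999
```

Queue names are interpolated directly into the string, so this only suits names that come from trusted configuration.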
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: skiplock
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.0.
|
4
|
+
version: 1.0.15
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Tin Vo
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2021-
|
11
|
+
date: 2021-09-09 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activejob
|
@@ -84,7 +84,6 @@ files:
|
|
84
84
|
- lib/skiplock.rb
|
85
85
|
- lib/skiplock/counter.rb
|
86
86
|
- lib/skiplock/cron.rb
|
87
|
-
- lib/skiplock/dispatcher.rb
|
88
87
|
- lib/skiplock/extension.rb
|
89
88
|
- lib/skiplock/job.rb
|
90
89
|
- lib/skiplock/manager.rb
|
data/lib/skiplock/dispatcher.rb
DELETED
@@ -1,116 +0,0 @@
|
|
1
|
-
module Skiplock
|
2
|
-
class Dispatcher
|
3
|
-
def initialize(worker:, worker_num: nil, **config)
|
4
|
-
@config = config
|
5
|
-
@worker = worker
|
6
|
-
@queues_order_query = @config[:queues].map { |q,v| "WHEN queue_name = '#{q}' THEN #{v}" }.join(' ') if @config[:queues].is_a?(Hash) && @config[:queues].count > 0
|
7
|
-
@executor = Concurrent::ThreadPoolExecutor.new(min_threads: @config[:min_threads], max_threads: @config[:max_threads], max_queue: @config[:max_threads], idletime: 60, auto_terminate: true, fallback_policy: :discard)
|
8
|
-
@last_dispatch_at = 0
|
9
|
-
@next_schedule_at = Time.now.to_f
|
10
|
-
Process.setproctitle("skiplock-#{@worker.master ? 'master[0]' : 'worker[' + worker_num.to_s + ']'}") if @config[:standalone]
|
11
|
-
end
|
12
|
-
|
13
|
-
def run
|
14
|
-
@running = true
|
15
|
-
Thread.new do
|
16
|
-
ActiveRecord::Base.connection_pool.with_connection do |connection|
|
17
|
-
connection.exec_query('LISTEN "skiplock::jobs"')
|
18
|
-
if @worker.master
|
19
|
-
Dir.mkdir('tmp/skiplock') unless Dir.exist?('tmp/skiplock')
|
20
|
-
check_sync_errors
|
21
|
-
Cron.setup
|
22
|
-
end
|
23
|
-
error = false
|
24
|
-
timestamp = Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
25
|
-
while @running
|
26
|
-
begin
|
27
|
-
if error
|
28
|
-
unless connection.active?
|
29
|
-
connection.reconnect!
|
30
|
-
sleep(0.5)
|
31
|
-
connection.exec_query('LISTEN "skiplock::jobs"')
|
32
|
-
@next_schedule_at = Time.now.to_f
|
33
|
-
end
|
34
|
-
check_sync_errors
|
35
|
-
error = false
|
36
|
-
end
|
37
|
-
job_notifications = []
|
38
|
-
connection.raw_connection.wait_for_notify(0.1) do |channel, pid, payload|
|
39
|
-
job_notifications << payload if payload
|
40
|
-
loop do
|
41
|
-
payload = connection.raw_connection.notifies
|
42
|
-
break unless @running && payload
|
43
|
-
job_notifications << payload[:extra]
|
44
|
-
end
|
45
|
-
job_notifications.each do |n|
|
46
|
-
op, id, worker_id, queue_name, running, expired_at, finished_at, scheduled_at = n.split(',')
|
47
|
-
next if op == 'DELETE' || running == 'true' || expired_at.to_f > 0 || finished_at.to_f > 0 || scheduled_at.to_f < @last_dispatch_at
|
48
|
-
if scheduled_at.to_f <= Time.now.to_f
|
49
|
-
@next_schedule_at = Time.now.to_f
|
50
|
-
elsif scheduled_at.to_f < @next_schedule_at
|
51
|
-
@next_schedule_at = scheduled_at.to_f
|
52
|
-
end
|
53
|
-
end
|
54
|
-
end
|
55
|
-
if Time.now.to_f >= @next_schedule_at && @executor.remaining_capacity > 0
|
56
|
-
@executor.post { do_work }
|
57
|
-
end
|
58
|
-
if Process.clock_gettime(Process::CLOCK_MONOTONIC) - timestamp > 60
|
59
|
-
@worker.touch
|
60
|
-
timestamp = Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
61
|
-
end
|
62
|
-
rescue Exception => ex
|
63
|
-
# most likely error with database connection
|
64
|
-
Skiplock.logger.error(ex)
|
65
|
-
Skiplock.on_errors.each { |p| p.call(ex, @last_exception) }
|
66
|
-
error = true
|
67
|
-
t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
|
68
|
-
while @running
|
69
|
-
sleep(0.5)
|
70
|
-
break if Process.clock_gettime(Process::CLOCK_MONOTONIC) - t > 5
|
71
|
-
end
|
72
|
-
@last_exception = ex
|
73
|
-
end
|
74
|
-
sleep(0.2)
|
75
|
-
end
|
76
|
-
connection.exec_query('UNLISTEN *')
|
77
|
-
@executor.shutdown
|
78
|
-
@executor.kill unless @executor.wait_for_termination(@config[:graceful_shutdown])
|
79
|
-
end
|
80
|
-
end
|
81
|
-
end
|
82
|
-
|
83
|
-
def shutdown
|
84
|
-
@running = false
|
85
|
-
end
|
86
|
-
|
87
|
-
private
|
88
|
-
|
89
|
-
def check_sync_errors
|
90
|
-
# get performed jobs that could not sync with database
|
91
|
-
Dir.glob('tmp/skiplock/*').each do |f|
|
92
|
-
job_from_db = Job.find_by(id: File.basename(f), running: true)
|
93
|
-
disposed = true
|
94
|
-
if job_from_db
|
95
|
-
job, ex = YAML.load_file(f) rescue nil
|
96
|
-
disposed = job.dispose(ex, purge_completion: @config[:purge_completion], max_retries: @config[:max_retries])
|
97
|
-
end
|
98
|
-
File.delete(f) if disposed
|
99
|
-
end
|
100
|
-
end
|
101
|
-
|
102
|
-
def do_work
|
103
|
-
while @running
|
104
|
-
@last_dispatch_at = Time.now.to_f - 1 # 1 second allowance for time drift
|
105
|
-
result = Job.dispatch(queues_order_query: @queues_order_query, worker_id: @worker.id, purge_completion: @config[:purge_completion], max_retries: @config[:max_retries])
|
106
|
-
next if result.is_a?(Job) && Time.now.to_f >= @next_schedule_at
|
107
|
-
@next_schedule_at = result if result.is_a?(Float)
|
108
|
-
break
|
109
|
-
end
|
110
|
-
rescue Exception => ex
|
111
|
-
Skiplock.logger.error(ex)
|
112
|
-
Skiplock.on_errors.each { |p| p.call(ex, @last_exception) }
|
113
|
-
@last_exception = ex
|
114
|
-
end
|
115
|
-
end
|
116
|
-
end
|
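The inline retry delay in the deleted dispatcher's `rescue` block (sleep in 0.5 second slices until 5 seconds have passed or `@running` turns false) appears in worker.rb as the extracted `wait(timeout)` method. A self-contained sketch of that pattern is below. The class name `InterruptibleWaiter` and the `slice:` keyword are assumptions added for illustration; the gem's method takes only `timeout`.

```ruby
# Hypothetical stand-in class; only the loop inside `wait` mirrors the diff.
class InterruptibleWaiter
  def initialize
    @running = true
  end

  # Flips the flag that the wait loop checks on every pass.
  def shutdown
    @running = false
  end

  # Sleeps in short slices so a shutdown is noticed within roughly `slice`
  # seconds instead of blocking for the full timeout.
  def wait(timeout, slice: 0.5)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    while @running
      sleep(slice)
      break if Process.clock_gettime(Process::CLOCK_MONOTONIC) - started > timeout
    end
  end
end
```

Using the monotonic clock rather than `Time.now` keeps the timeout unaffected by wall-clock adjustments, which matches the choice made in both the old and the new code.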