gitlab-sidekiq-fetcher 0.4.0 → 0.5.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: cc2e47cf7679deb6a6d526e199a09c20a671e3b55bad22d7c05ce17405eb6103
- data.tar.gz: e71949587df8a635223ca8fa36339949df771f493e7edda0a4d9c34198600fb5
+ metadata.gz: a179330459058ca56cd918066793895684d63bbfcfed73eccd8687d5ed22ea15
+ data.tar.gz: fb0115218a6c35a6349f66c79a0a9eacbf1434a7fc6015100ff2dffb0bf8cef7
  SHA512:
- metadata.gz: ea7d6b7283354053a4f9fc24f419ab56da97efff3ce3ac3a2a10517ab2a1bd184a44b0ffb9d39b1dbdf27b23032c58b1139a18d8f2ec1bf5357deed65187e3ed
- data.tar.gz: ae4c78eca271dc63abf98bc112a582a2f56fc4e1e7be843189c30193ff0dabd99232f5a019b9b69d45cd1f08ca7bd88afe0c4341e0256c88d352441474a08400
+ metadata.gz: c4ef091732aa1cd0b69b889d7649c6f4ade001989360cdb717296cbfec6db7551e379b278b64fcb823b0cf293245b72293fbef8ad8b009158060983ecf421328
+ data.tar.gz: 932751a38d73e1f1d0d147fb654ce802bc3a0f9b20319ab76ef9525fd29661ac70b505aa47e71932f9e02d6651785f8779dfdbd3723226b1f53d55d7e1e3880d
data/.gitignore CHANGED
@@ -1,2 +1,3 @@
  *.gem
  coverage
+ .DS_Store
data/.gitlab-ci.yml CHANGED
@@ -3,7 +3,7 @@ image: "ruby:2.5"
  before_script:
  - ruby -v
  - which ruby
- - gem install bundler --no-ri --no-rdoc
+ - gem install bundler
  - bundle install --jobs $(nproc) "${FLAGS[@]}"

  variables:
@@ -25,7 +25,7 @@ rspec:
  .integration:
  stage: test
  script:
- - cd test
+ - cd tests/reliability
  - bundle exec ruby reliability_test.rb
  services:
  - redis:alpine
@@ -47,6 +47,22 @@ integration_basic:
  variables:
  JOB_FETCHER: basic

+ kill_interruption:
+ stage: test
+ script:
+ - cd tests/interruption
+ - bundle exec ruby test_kill_signal.rb
+ services:
+ - redis:alpine
+
+ term_interruption:
+ stage: test
+ script:
+ - cd tests/interruption
+ - bundle exec ruby test_term_signal.rb
+ services:
+ - redis:alpine
+

  # rubocop:
  # script:
data/README.md CHANGED
@@ -10,6 +10,17 @@ There are two strategies implemented: [Reliable fetch](http://redis.io/commands/
  semi-reliable fetch that uses regular `brpop` and `lpush` to pick the job and put it to working queue. The main benefit of "Reliable" strategy is that `rpoplpush` is atomic, eliminating a race condition in which jobs can be lost.
  However, it comes at a cost because `rpoplpush` can't watch multiple lists at the same time so we need to iterate over the entire queue list which significantly increases pressure on Redis when there are more than a few queues. The "semi-reliable" strategy is much more reliable than the default Sidekiq fetcher, though. Compared to the reliable fetch strategy, it does not increase pressure on Redis significantly.

+ ### Interruption handling
+
+ Sidekiq expects every job to report success or failure. On failure, Sidekiq stores a `retry_count` counter
+ in the job and keeps re-running the job until the counter reaches the maximum allowed value. When a job never
+ gets the chance to finish its work (to report success or failure), for example when it is killed forcibly or requeued after receiving a TERM signal, the standard retry mechanism never kicks in and the job would be retried indefinitely. This is why Reliable Fetcher maintains a special counter, `interrupted_count`,
+ which is used to limit the number of such retries. In both cases, Reliable Fetcher increments `interrupted_count` and rejects the job from running again once the counter exceeds `max_retries_after_interruption` (default: 3).
+ Such a job is put into the `interrupted` queue. This queue mostly behaves like the Sidekiq Dead queue, so it only stores a limited number of jobs for a limited time. As with the Dead queue, the limits are configurable via the `interrupted_max_jobs` (default: 10_000) and `interrupted_timeout_in_seconds` (default: 3 months) Sidekiq option keys.
+
+ You can also disable special handling of interrupted jobs by setting `max_retries_after_interruption` to `-1`.
+ In this case, interrupted jobs will run without any limits from Reliable Fetcher and will not be put into the interrupted queue.
+

  ## Installation

data/gitlab-sidekiq-fetcher.gemspec CHANGED
@@ -1,14 +1,14 @@
  Gem::Specification.new do |s|
- s.name = 'gitlab-sidekiq-fetcher'
- s.version = '0.4.0'
- s.authors = ['TEA', 'GitLab']
- s.email = 'valery@gitlab.com'
- s.license = 'LGPL-3.0'
- s.homepage = 'https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/'
- s.summary = 'Reliable fetch extension for Sidekiq'
- s.description = 'Redis reliable queue pattern implemented in Sidekiq'
+ s.name = 'gitlab-sidekiq-fetcher'
+ s.version = '0.5.3'
+ s.authors = ['TEA', 'GitLab']
+ s.email = 'valery@gitlab.com'
+ s.license = 'LGPL-3.0'
+ s.homepage = 'https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/'
+ s.summary = 'Reliable fetch extension for Sidekiq'
+ s.description = 'Redis reliable queue pattern implemented in Sidekiq'
  s.require_paths = ['lib']
- s.files = `git ls-files`.split($\)
- s.test_files = []
+ s.files = `git ls-files`.split($\)
+ s.test_files = []
  s.add_dependency 'sidekiq', '~> 5'
  end
data/lib/sidekiq-reliable-fetch.rb CHANGED
@@ -1,4 +1,5 @@
  require 'sidekiq'
+ require 'sidekiq/api'

  require_relative 'sidekiq/base_reliable_fetch'
  require_relative 'sidekiq/reliable_fetch'
data/lib/sidekiq/base_reliable_fetch.rb CHANGED
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require_relative 'interrupted_set'
+
  module Sidekiq
  class BaseReliableFetch
  DEFAULT_CLEANUP_INTERVAL = 60 * 60 # 1 hour
@@ -16,6 +18,9 @@ module Sidekiq
  # Defines the COUNT parameter that will be passed to Redis SCAN command
  SCAN_COUNT = 1000

+ # How many times a job may be interrupted
+ DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION = 3
+
  UnitOfWork = Struct.new(:queue, :job) do
  def acknowledge
  Sidekiq.redis { |conn| conn.lrem(Sidekiq::BaseReliableFetch.working_queue_name(queue), 1, job) }
@@ -82,22 +87,61 @@ module Sidekiq
  def self.bulk_requeue(inprogress, _options)
  return if inprogress.empty?

- Sidekiq.logger.debug('Re-queueing terminated jobs')
-
  Sidekiq.redis do |conn|
  inprogress.each do |unit_of_work|
  conn.multi do |multi|
- multi.lpush(unit_of_work.queue, unit_of_work.job)
+ preprocess_interrupted_job(unit_of_work.job, unit_of_work.queue, multi)
+
  multi.lrem(working_queue_name(unit_of_work.queue), 1, unit_of_work.job)
  end
  end
  end
-
- Sidekiq.logger.info("Pushed #{inprogress.size} jobs back to Redis")
  rescue => e
  Sidekiq.logger.warn("Failed to requeue #{inprogress.size} jobs: #{e.message}")
  end

+ def self.clean_working_queue!(working_queue)
+ original_queue = working_queue.gsub(/#{WORKING_QUEUE_PREFIX}:|:[^:]*:[0-9]*\z/, '')
+
+ Sidekiq.redis do |conn|
+ while job = conn.rpop(working_queue)
+ preprocess_interrupted_job(job, original_queue)
+ end
+ end
+ end
+
+ def self.preprocess_interrupted_job(job, queue, conn = nil)
+ msg = Sidekiq.load_json(job)
+ msg['interrupted_count'] = msg['interrupted_count'].to_i + 1
+
+ if interruption_exhausted?(msg)
+ send_to_quarantine(msg, conn)
+ else
+ requeue_job(queue, msg, conn)
+ end
+ end
+
+ # Detect "old" jobs and requeue them because the worker they were assigned
+ # to probably failed miserably.
+ def self.clean_working_queues!
+ Sidekiq.logger.info('Cleaning working queues')
+
+ Sidekiq.redis do |conn|
+ conn.scan_each(match: "#{WORKING_QUEUE_PREFIX}:queue:*", count: SCAN_COUNT) do |key|
+ # Example: "working:name_of_the_job:queue:{hostname}:{PID}"
+ hostname, pid = key.scan(/:([^:]*):([0-9]*)\z/).flatten
+
+ next if hostname.nil? || pid.nil?
+
+ clean_working_queue!(key) if worker_dead?(hostname, pid, conn)
+ end
+ end
+ end
+
+ def self.worker_dead?(hostname, pid, conn)
+ !conn.get(heartbeat_key(hostname, pid))
+ end
+
  def self.heartbeat_key(hostname, pid)
  "reliable-fetcher-heartbeat-#{hostname}-#{pid}"
  end
@@ -106,6 +150,57 @@ module Sidekiq
  "#{WORKING_QUEUE_PREFIX}:#{queue}:#{hostname}:#{pid}"
  end

+ def self.interruption_exhausted?(msg)
+ return false if max_retries_after_interruption(msg['class']) < 0
+
+ msg['interrupted_count'].to_i >= max_retries_after_interruption(msg['class'])
+ end
+
+ def self.max_retries_after_interruption(worker_class)
+ max_retries_after_interruption = nil
+
+ max_retries_after_interruption ||= begin
+ Object.const_get(worker_class).sidekiq_options[:max_retries_after_interruption]
+ rescue NameError
+ end
+
+ max_retries_after_interruption ||= Sidekiq.options[:max_retries_after_interruption]
+ max_retries_after_interruption ||= DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION
+ max_retries_after_interruption
+ end
+
+ def self.send_to_quarantine(msg, multi_connection = nil)
+ Sidekiq.logger.warn(
+ class: msg['class'],
+ jid: msg['jid'],
+ message: %(Reliable Fetcher: adding dead #{msg['class']} job #{msg['jid']} to interrupted queue)
+ )
+
+ job = Sidekiq.dump_json(msg)
+ Sidekiq::InterruptedSet.new.put(job, connection: multi_connection)
+ end
+
+ # If you want this method to run in the scope of a multi connection,
+ # you need to pass that connection in
+ def self.requeue_job(queue, msg, conn)
+ with_connection(conn) do |conn|
+ conn.lpush(queue, Sidekiq.dump_json(msg))
+ end
+
+ Sidekiq.logger.info(
+ message: "Pushed job #{msg['jid']} back to queue #{queue}",
+ jid: msg['jid'],
+ queue: queue
+ )
+ end
+
+ # Yields the block with an existing connection or creates a new one
+ def self.with_connection(conn, &block)
+ return yield(conn) if conn
+
+ Sidekiq.redis { |conn| yield(conn) }
+ end
+
  attr_reader :cleanup_interval, :last_try_to_take_lease_at, :lease_interval,
  :queues, :use_semi_reliable_fetch,
  :strictly_ordered_queues
@@ -119,7 +214,7 @@ module Sidekiq
  end

  def retrieve_work
- clean_working_queues! if take_lease
+ self.class.clean_working_queues! if take_lease

  retrieve_unit_of_work
  end
@@ -131,43 +226,6 @@ module Sidekiq

  private

- def clean_working_queue!(working_queue)
- original_queue = working_queue.gsub(/#{WORKING_QUEUE_PREFIX}:|:[^:]*:[0-9]*\z/, '')
-
- Sidekiq.redis do |conn|
- count = 0
-
- while conn.rpoplpush(working_queue, original_queue) do
- count += 1
- end
-
- Sidekiq.logger.info("Requeued #{count} dead jobs to #{original_queue}")
- end
- end
-
- # Detect "old" jobs and requeue them because the worker they were assigned
- # to probably failed miserably.
- def clean_working_queues!
- Sidekiq.logger.info("Cleaning working queues")
-
- Sidekiq.redis do |conn|
- conn.scan_each(match: "#{WORKING_QUEUE_PREFIX}:queue:*", count: SCAN_COUNT) do |key|
- # Example: "working:name_of_the_job:queue:{hostname}:{PID}"
- hostname, pid = key.scan(/:([^:]*):([0-9]*)\z/).flatten
-
- continue if hostname.nil? || pid.nil?
-
- clean_working_queue!(key) if worker_dead?(hostname, pid)
- end
- end
- end
-
- def worker_dead?(hostname, pid)
- Sidekiq.redis do |conn|
- !conn.get(self.class.heartbeat_key(hostname, pid))
- end
- end
-
  def take_lease
  return unless allowed_to_take_a_lease?

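The key-parsing step in `clean_working_queues!` above can be exercised standalone. This sketch uses the exact regex from the diff; the helper name is hypothetical, introduced only for illustration:

```ruby
# Same regex as clean_working_queues! above: extracts the trailing
# ":{hostname}:{PID}" pair from a working-queue key. Keys in unknown
# formats produce no match, so hostname and pid come back nil and the
# fetcher skips the key instead of cleaning it.
def parse_working_queue_key(key)
  hostname, pid = key.scan(/:([^:]*):([0-9]*)\z/).flatten
  [hostname, pid]
end

parse_working_queue_key('working:queue:assigned:myhost:1234')   # => ["myhost", "1234"]
parse_working_queue_key('working:queue:assigned:myhost:1234:X') # => [nil, nil]
```

The second call models the malformed key covered by the 'ignores working queue keys in unknown formats' spec further down in this diff.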
data/lib/sidekiq/interrupted_set.rb ADDED
@@ -0,0 +1,47 @@
+ require 'sidekiq/api'
+
+ module Sidekiq
+ class InterruptedSet < ::Sidekiq::JobSet
+ DEFAULT_MAX_CAPACITY = 10_000
+ DEFAULT_MAX_TIMEOUT = 90 * 24 * 60 * 60 # 3 months
+
+ def initialize
+ super "interrupted"
+ end
+
+ def put(message, opts = {})
+ now = Time.now.to_f
+
+ with_multi_connection(opts[:connection]) do |conn|
+ conn.zadd(name, now.to_s, message)
+ conn.zremrangebyscore(name, '-inf', now - self.class.timeout)
+ conn.zremrangebyrank(name, 0, - self.class.max_jobs)
+ end
+
+ true
+ end
+
+ # Yields the block inside an existing multi connection or creates a new one
+ def with_multi_connection(conn, &block)
+ return yield(conn) if conn
+
+ Sidekiq.redis do |c|
+ c.multi do |multi|
+ yield(multi)
+ end
+ end
+ end
+
+ def retry_all
+ each(&:retry) while size > 0
+ end
+
+ def self.max_jobs
+ Sidekiq.options[:interrupted_max_jobs] || DEFAULT_MAX_CAPACITY
+ end
+
+ def self.timeout
+ Sidekiq.options[:interrupted_timeout_in_seconds] || DEFAULT_MAX_TIMEOUT
+ end
+ end
+ end
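The trimming that `InterruptedSet#put` performs with `zremrangebyscore` and `zremrangebyrank` can be modeled without Redis. This is a minimal sketch using a plain `{ member => score }` Hash in place of a sorted set; `put` here is a hypothetical stand-in, not the gem's method, and the constants mirror the defaults above:

```ruby
# A minimal model of InterruptedSet#put's trimming rules.
MAX_JOBS = 10_000            # mirrors DEFAULT_MAX_CAPACITY
TIMEOUT  = 90 * 24 * 60 * 60 # mirrors DEFAULT_MAX_TIMEOUT (3 months)

def put(set, job, now: Time.now.to_f, max_jobs: MAX_JOBS, timeout: TIMEOUT)
  set[job] = now                                    # ZADD: score the job by insertion time
  set.reject! { |_, score| score < now - timeout }  # ZREMRANGEBYSCORE: drop expired entries
  if set.size > max_jobs                            # ZREMRANGEBYRANK: cap the set size,
    set.sort_by { |_, score| score }                # evicting the oldest members first
       .first(set.size - max_jobs)
       .each { |member, _| set.delete(member) }
  end
  set
end
```

With `max_jobs: 2`, inserting a third job evicts the oldest; inserting a job more than `timeout` seconds after the others evicts the expired ones.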
data/spec/base_reliable_fetch_spec.rb CHANGED
@@ -5,6 +5,8 @@ require 'sidekiq/reliable_fetch'
  require 'sidekiq/semi_reliable_fetch'

  describe Sidekiq::BaseReliableFetch do
+ let(:job) { Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo']) }
+
  before { Sidekiq.redis(&:flushdb) }

  describe 'UnitOfWork' do
@@ -12,7 +14,7 @@ describe Sidekiq::BaseReliableFetch do

  describe '#requeue' do
  it 'requeues job' do
- Sidekiq.redis { |conn| conn.rpush('queue:foo', 'msg') }
+ Sidekiq.redis { |conn| conn.rpush('queue:foo', job) }

  uow = fetcher.retrieve_work

@@ -25,7 +27,7 @@ describe Sidekiq::BaseReliableFetch do

  describe '#acknowledge' do
  it 'acknowledges job' do
- Sidekiq.redis { |conn| conn.rpush('queue:foo', 'msg') }
+ Sidekiq.redis { |conn| conn.rpush('queue:foo', job) }

  uow = fetcher.retrieve_work

@@ -38,19 +40,42 @@ describe Sidekiq::BaseReliableFetch do
  end

  describe '.bulk_requeue' do
+ let!(:queue1) { Sidekiq::Queue.new('foo') }
+ let!(:queue2) { Sidekiq::Queue.new('bar') }
+
  it 'requeues the bulk' do
- queue1 = Sidekiq::Queue.new('foo')
- queue2 = Sidekiq::Queue.new('bar')
+ uow = described_class::UnitOfWork
+ jobs = [ uow.new('queue:foo', job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
+ described_class.bulk_requeue(jobs, queues: [])

- expect(queue1.size).to eq 0
- expect(queue2.size).to eq 0
+ expect(queue1.size).to eq 2
+ expect(queue2.size).to eq 1
+ end

+ it 'puts jobs into interrupted queue' do
  uow = described_class::UnitOfWork
- jobs = [ uow.new('queue:foo', 'bob'), uow.new('queue:foo', 'bar'), uow.new('queue:bar', 'widget') ]
+ interrupted_job = Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo'], interrupted_count: 3)
+ jobs = [ uow.new('queue:foo', interrupted_job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
+ described_class.bulk_requeue(jobs, queues: [])
+
+ expect(queue1.size).to eq 1
+ expect(queue2.size).to eq 1
+ expect(Sidekiq::InterruptedSet.new.size).to eq 1
+ end
+
+ it 'does not put jobs into interrupted queue if it is disabled' do
+ Sidekiq.options[:max_retries_after_interruption] = -1
+
+ uow = described_class::UnitOfWork
+ interrupted_job = Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo'], interrupted_count: 3)
+ jobs = [ uow.new('queue:foo', interrupted_job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
  described_class.bulk_requeue(jobs, queues: [])

  expect(queue1.size).to eq 2
  expect(queue2.size).to eq 1
+ expect(Sidekiq::InterruptedSet.new.size).to eq 0
+
+ Sidekiq.options[:max_retries_after_interruption] = 3
  end
  end

data/spec/fetch_shared_examples.rb CHANGED
@@ -4,44 +4,65 @@ shared_examples 'a Sidekiq fetcher' do
  before { Sidekiq.redis(&:flushdb) }

  describe '#retrieve_work' do
+ let(:job) { Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo']) }
  let(:fetcher) { described_class.new(queues: ['assigned']) }

  it 'retrieves the job and puts it to working queue' do
- Sidekiq.redis { |conn| conn.rpush('queue:assigned', 'msg') }
+ Sidekiq.redis { |conn| conn.rpush('queue:assigned', job) }

  uow = fetcher.retrieve_work

  expect(working_queue_size('assigned')).to eq 1
  expect(uow.queue_name).to eq 'assigned'
- expect(uow.job).to eq 'msg'
+ expect(uow.job).to eq job
  expect(Sidekiq::Queue.new('assigned').size).to eq 0
  end

  it 'does not retrieve a job from foreign queue' do
- Sidekiq.redis { |conn| conn.rpush('queue:not_assigned', 'msg') }
+ Sidekiq.redis { |conn| conn.rpush('queue:not_assigned', job) }

  expect(fetcher.retrieve_work).to be_nil
  end

- it 'requeues jobs from dead working queue' do
+ it 'requeues jobs from dead working queue with incremented interrupted_count' do
  Sidekiq.redis do |conn|
- conn.rpush(other_process_working_queue_name('assigned'), 'msg')
+ conn.rpush(other_process_working_queue_name('assigned'), job)
  end

+ expected_job = Sidekiq.load_json(job)
+ expected_job['interrupted_count'] = 1
+ expected_job = Sidekiq.dump_json(expected_job)
+
  uow = fetcher.retrieve_work

- expect(uow.job).to eq 'msg'
+ expect(uow.job).to eq expected_job

  Sidekiq.redis do |conn|
  expect(conn.llen(other_process_working_queue_name('assigned'))).to eq 0
  end
  end

+ it 'ignores working queue keys in unknown formats' do
+ # Add a spurious non-numeric segment at the end; this simulates any
+ # malformed key in general
+ malformed_key = "#{other_process_working_queue_name('assigned')}:X"
+ Sidekiq.redis do |conn|
+ conn.rpush(malformed_key, job)
+ end
+
+ uow = fetcher.retrieve_work
+
+ Sidekiq.redis do |conn|
+ expect(conn.llen(malformed_key)).to eq 1
+ end
+ end
+
  it 'does not requeue jobs from live working queue' do
  working_queue = live_other_process_working_queue_name('assigned')

  Sidekiq.redis do |conn|
- conn.rpush(working_queue, 'msg')
+ conn.rpush(working_queue, job)
  end

  uow = fetcher.retrieve_work
@@ -56,8 +77,7 @@ shared_examples 'a Sidekiq fetcher' do
  it 'does not clean up orphaned jobs more than once per cleanup interval' do
  Sidekiq.redis = Sidekiq::RedisConnection.create(url: REDIS_URL, size: 10)

- expect_any_instance_of(described_class)
- .to receive(:clean_working_queues!).once
+ expect(described_class).to receive(:clean_working_queues!).once

  threads = 10.times.map do
  Thread.new do
data/spec/reliable_fetch_spec.rb CHANGED
@@ -1,5 +1,6 @@
  require 'spec_helper'
  require 'fetch_shared_examples'
+ require 'sidekiq/base_reliable_fetch'
  require 'sidekiq/reliable_fetch'

  describe Sidekiq::ReliableFetch do
data/spec/semi_reliable_fetch_spec.rb CHANGED
@@ -1,5 +1,6 @@
  require 'spec_helper'
  require 'fetch_shared_examples'
+ require 'sidekiq/base_reliable_fetch'
  require 'sidekiq/semi_reliable_fetch'

  describe Sidekiq::SemiReliableFetch do
data/tests/README.md ADDED
@@ -0,0 +1,37 @@
+ # How to run reliability tests
+
+ ```
+ cd tests/reliability
+ bundle exec ruby reliability_test.rb
+ ```
+
+ You can adjust some parameters of the test in `config.rb`.
+
+ JOB_FETCHER can be set to one of these values: `semi`, `reliable`, `basic`
+
+ You need to have a Redis server running on the default port `6379`. To use another port, you can define the
+ `REDIS_URL` environment variable with the port you need (example: `REDIS_URL="redis://localhost:9999"`).
+
+
+ ## How it works
+
+ This tool spawns a configured number of Sidekiq workers. When the number of processed jobs reaches about half of the original
+ number, it kills all the workers with `kill -9` and then spawns new workers again until all the jobs are processed. Redis keys/counters are used to track the progress and the counters.
+
+ # How to run interruption tests
+
+ ```
+ cd tests/interruption
+
+ # Verify "KILL" signal
+ bundle exec ruby test_kill_signal.rb
+
+ # Verify "TERM" signal
+ bundle exec ruby test_term_signal.rb
+ ```
+
+ It requires Redis to be running on port 6379.
+
+ ## How it works
+
+ It spawns Sidekiq workers, then creates a job that will kill itself after a moment. The reliable fetcher brings it back. The purpose is to verify that the job is run no more than the allowed number of times.
data/tests/interruption/config.rb ADDED
@@ -0,0 +1,19 @@
+ # frozen_string_literal: true
+
+ require_relative '../../lib/sidekiq-reliable-fetch'
+ require_relative 'worker'
+
+ TEST_CLEANUP_INTERVAL = 20
+ TEST_LEASE_INTERVAL = 5
+
+ Sidekiq.configure_server do |config|
+ config.options[:semi_reliable_fetch] = true
+
+ # We need to override these parameters to not wait too long
+ # The default values are good for production use only
+ # These will be ignored for :basic
+ config.options[:cleanup_interval] = TEST_CLEANUP_INTERVAL
+ config.options[:lease_interval] = TEST_LEASE_INTERVAL
+
+ Sidekiq::ReliableFetch.setup_reliable_fetch!(config)
+ end
data/tests/interruption/test_kill_signal.rb ADDED
@@ -0,0 +1,25 @@
+ # frozen_string_literal: true
+
+ require 'sidekiq'
+ require_relative 'config'
+ require_relative '../support/utils'
+
+ EXPECTED_NUM_TIMES_BEEN_RUN = 3
+ NUM_WORKERS = EXPECTED_NUM_TIMES_BEEN_RUN + 1
+
+ Sidekiq.redis(&:flushdb)
+
+ pids = spawn_workers(NUM_WORKERS)
+
+ RetryTestWorker.perform_async
+
+ sleep 300
+
+ Sidekiq.redis do |redis|
+ times_has_been_run = redis.get('times_has_been_run').to_i
+ assert 'The job has been run', times_has_been_run, EXPECTED_NUM_TIMES_BEEN_RUN
+ end
+
+ assert 'Found interruption exhausted jobs', Sidekiq::InterruptedSet.new.size, 1
+
+ stop_workers(pids)
data/tests/interruption/test_term_signal.rb ADDED
@@ -0,0 +1,25 @@
+ # frozen_string_literal: true
+
+ require 'sidekiq'
+ require_relative 'config'
+ require_relative '../support/utils'
+
+ EXPECTED_NUM_TIMES_BEEN_RUN = 3
+ NUM_WORKERS = EXPECTED_NUM_TIMES_BEEN_RUN + 1
+
+ Sidekiq.redis(&:flushdb)
+
+ pids = spawn_workers(NUM_WORKERS)
+
+ RetryTestWorker.perform_async('TERM', 60)
+
+ sleep 300
+
+ Sidekiq.redis do |redis|
+ times_has_been_run = redis.get('times_has_been_run').to_i
+ assert 'The job has been run', times_has_been_run, EXPECTED_NUM_TIMES_BEEN_RUN
+ end
+
+ assert 'Found interruption exhausted jobs', Sidekiq::InterruptedSet.new.size, 1
+
+ stop_workers(pids)
data/tests/interruption/worker.rb ADDED
@@ -0,0 +1,15 @@
+ # frozen_string_literal: true
+
+ class RetryTestWorker
+ include Sidekiq::Worker
+
+ def perform(signal = 'KILL', wait_seconds = 1)
+ Sidekiq.redis do |redis|
+ redis.incr('times_has_been_run')
+ end
+
+ Process.kill(signal, Process.pid)
+
+ sleep wait_seconds
+ end
+ end
data/tests/reliability/reliability_test.rb CHANGED
@@ -1,8 +1,6 @@
  # frozen_string_literal: true

- require_relative '../lib/sidekiq/base_reliable_fetch'
- require_relative '../lib/sidekiq/reliable_fetch'
- require_relative '../lib/sidekiq/semi_reliable_fetch'
+ require_relative '../../lib/sidekiq-reliable-fetch'
  require_relative 'worker'

  REDIS_FINISHED_LIST = 'reliable-fetcher-finished-jids'
@@ -89,7 +89,7 @@ Sidekiq.redis(&:flushdb)
  jobs = []

  NUMBER_OF_JOBS.times do
- jobs << TestWorker.perform_async
+ jobs << ReliabilityTestWorker.perform_async
  end

  puts "Queued #{NUMBER_OF_JOBS} jobs"
data/tests/reliability/worker.rb CHANGED
@@ -1,6 +1,6 @@
  # frozen_string_literal: true

- class TestWorker
+ class ReliabilityTestWorker
  include Sidekiq::Worker

  def perform
data/tests/support/utils.rb ADDED
@@ -0,0 +1,26 @@
+ def assert(text, actual, expected)
+ if actual == expected
+ puts "#{text}: #{actual} (Success)"
+ else
+ puts "#{text}: #{actual} (Failed). Expected: #{expected}"
+ exit 1
+ end
+ end
+
+ def spawn_workers(number)
+ pids = []
+
+ number.times do
+ pids << spawn('sidekiq -r ./config.rb')
+ end
+
+ pids
+ end
+
+ # Stop Sidekiq workers
+ def stop_workers(pids)
+ pids.each do |pid|
+ Process.kill('KILL', pid)
+ Process.wait pid
+ end
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: gitlab-sidekiq-fetcher
  version: !ruby/object:Gem::Version
- version: 0.4.0
+ version: 0.5.3
  platform: ruby
  authors:
  - TEA
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-12-19 00:00:00.000000000 Z
+ date: 2021-02-18 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: sidekiq
@@ -42,6 +42,7 @@ files:
  - gitlab-sidekiq-fetcher.gemspec
  - lib/sidekiq-reliable-fetch.rb
  - lib/sidekiq/base_reliable_fetch.rb
+ - lib/sidekiq/interrupted_set.rb
  - lib/sidekiq/reliable_fetch.rb
  - lib/sidekiq/semi_reliable_fetch.rb
  - spec/base_reliable_fetch_spec.rb
@@ -49,10 +50,15 @@ files:
  - spec/reliable_fetch_spec.rb
  - spec/semi_reliable_fetch_spec.rb
  - spec/spec_helper.rb
- - test/README.md
- - test/config.rb
- - test/reliability_test.rb
- - test/worker.rb
+ - tests/README.md
+ - tests/interruption/config.rb
+ - tests/interruption/test_kill_signal.rb
+ - tests/interruption/test_term_signal.rb
+ - tests/interruption/worker.rb
+ - tests/reliability/config.rb
+ - tests/reliability/reliability_test.rb
+ - tests/reliability/worker.rb
+ - tests/support/utils.rb
  homepage: https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/
  licenses:
  - LGPL-3.0
@@ -72,8 +78,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 2.7.6
+ rubygems_version: 3.1.4
  signing_key:
  specification_version: 4
  summary: Reliable fetch extension for Sidekiq
data/test/README.md DELETED
@@ -1,34 +0,0 @@
1
- # How to run
2
-
3
- ```
4
- cd test
5
- bundle exec ruby reliability_test.rb
6
- ```
7
-
8
- You can adjust some parameters of the test in the `config.rb`
9
-
10
-
11
- # How it works
12
-
13
- This tool spawns configured number of Sidekiq workers and when the amount of processed jobs is about half of origin
14
- number it will kill all the workers with `kill -9` and then it will spawn new workers again until all the jobs are processed. To track the process and counters we use Redis keys/counters.
15
-
16
- # How to run tests
17
-
18
- To run rspec:
19
-
20
- ```
21
- bundle exec rspec
22
- ```
23
-
24
- To run performance tests:
25
-
26
- ```
27
- cd test
28
- JOB_FETCHER=semi bundle exec ruby reliability_test.rb
29
- ```
30
-
31
- JOB_FETCHER can be set to one of these values: `semi`, `reliable`, `basic`
32
-
33
- To run both kind of tests you need to have redis server running on default HTTP port `6379`. To use other HTTP port, you can define
34
- `REDIS_URL` environment varible with the port you need(example: `REDIS_URL="redis://localhost:9999"`).