gitlab-sidekiq-fetcher 0.4.0 → 0.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: cc2e47cf7679deb6a6d526e199a09c20a671e3b55bad22d7c05ce17405eb6103
-  data.tar.gz: e71949587df8a635223ca8fa36339949df771f493e7edda0a4d9c34198600fb5
+  metadata.gz: 4815e3e75915230d7b2eaf2fe3fa0daa288ec4c670b2cd211cb659aff4838788
+  data.tar.gz: c8b491f2d1a2678ef40fe856a55bf89341e041070c40941ca693685fa3c048cf
 SHA512:
-  metadata.gz: ea7d6b7283354053a4f9fc24f419ab56da97efff3ce3ac3a2a10517ab2a1bd184a44b0ffb9d39b1dbdf27b23032c58b1139a18d8f2ec1bf5357deed65187e3ed
-  data.tar.gz: ae4c78eca271dc63abf98bc112a582a2f56fc4e1e7be843189c30193ff0dabd99232f5a019b9b69d45cd1f08ca7bd88afe0c4341e0256c88d352441474a08400
+  metadata.gz: 1667fb3ffb47117ac3c756c07a85867d96871b5489e820823735b62912df0d82b4d6c10be478e1f291d637d2ed15aff99a958f4fd884df98f026ee8c8f9c4c83
+  data.tar.gz: 5dad56f30515be87e79c58c3b5dce301213d427fe579309bca2a525e2b8664b94f0763c7eabfb2443324c5d8ecf5403f9826a0772e3b09b74e215e72ebfc4226
data/.gitignore CHANGED
@@ -1,2 +1,3 @@
 *.gem
 coverage
+.DS_Store
@@ -3,7 +3,7 @@ image: "ruby:2.5"
 before_script:
   - ruby -v
   - which ruby
-  - gem install bundler --no-ri --no-rdoc
+  - gem install bundler
   - bundle install --jobs $(nproc) "${FLAGS[@]}"
 
 variables:
@@ -25,7 +25,7 @@ rspec:
 .integration:
   stage: test
   script:
-    - cd test
+    - cd tests/reliability
     - bundle exec ruby reliability_test.rb
   services:
     - redis:alpine
@@ -47,6 +47,22 @@ integration_basic:
   variables:
     JOB_FETCHER: basic
 
+kill_interruption:
+  stage: test
+  script:
+    - cd tests/interruption
+    - bundle exec ruby test_kill_signal.rb
+  services:
+    - redis:alpine
+
+term_interruption:
+  stage: test
+  script:
+    - cd tests/interruption
+    - bundle exec ruby test_term_signal.rb
+  services:
+    - redis:alpine
+
 
 # rubocop:
 #   script:
data/README.md CHANGED
@@ -10,6 +10,17 @@ There are two strategies implemented: [Reliable fetch](http://redis.io/commands/
 semi-reliable fetch that uses regular `brpop` and `lpush` to pick the job and put it to working queue. The main benefit of "Reliable" strategy is that `rpoplpush` is atomic, eliminating a race condition in which jobs can be lost.
 However, it comes at a cost because `rpoplpush` can't watch multiple lists at the same time so we need to iterate over the entire queue list which significantly increases pressure on Redis when there are more than a few queues. The "semi-reliable" strategy is much more reliable than the default Sidekiq fetcher, though. Compared to the reliable fetch strategy, it does not increase pressure on Redis significantly.
 
+### Interruption handling
+
+Sidekiq expects every job to report success or failure. In the latter case, Sidekiq stores a `retry_count` counter
+in the job and keeps re-running it until the counter reaches the maximum allowed value. When a job has
+not been given a chance to finish its work (to report success or failure), for example when it was killed forcibly, or when it was requeued after receiving a TERM signal, the standard retry mechanism does not come into play and the job would otherwise be retried indefinitely. This is why Reliable Fetcher maintains a special counter, `interrupted_count`,
+which is used to limit the number of such retries. In both cases, Reliable Fetcher increments the `interrupted_count` counter and rejects the job from running again once the counter exceeds `max_retries_after_interruption` (default: 3 retries).
+Such a job is put into the `interrupted` queue. This queue behaves much like the Sidekiq Dead queue: it stores a limited number of jobs for a limited time. As with the Dead queue, the limits are configurable via the `interrupted_max_jobs` (default: 10_000) and `interrupted_timeout_in_seconds` (default: 3 months) Sidekiq option keys.
+
+You can also disable the special handling of interrupted jobs by setting `max_retries_after_interruption` to `-1`.
+In this case, interrupted jobs will be run without any limits from Reliable Fetcher and won't be put into the interrupted queue.
+
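The decision the README section above describes can be sketched in plain Ruby. This is a simplified illustration, not the gem's actual code; the constant and hash-key names merely mirror the gem's option keys:

```ruby
# Simplified sketch of the interruption bookkeeping described above.
# Names mirror the gem's option keys; this is illustrative only.
DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION = 3

def interruption_exhausted?(job, max_retries: DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION)
  return false if max_retries < 0 # -1 disables the limit entirely

  job['interrupted_count'].to_i >= max_retries
end

job = { 'interrupted_count' => 3 }
puts interruption_exhausted?(job)                  # true: goes to the interrupted queue
puts interruption_exhausted?(job, max_retries: -1) # false: handling disabled
```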
 
 ## Installation
 
@@ -1,14 +1,14 @@
 Gem::Specification.new do |s|
-  s.name = 'gitlab-sidekiq-fetcher'
-  s.version = '0.4.0'
-  s.authors = ['TEA', 'GitLab']
-  s.email = 'valery@gitlab.com'
-  s.license = 'LGPL-3.0'
-  s.homepage = 'https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/'
-  s.summary = 'Reliable fetch extension for Sidekiq'
-  s.description = 'Redis reliable queue pattern implemented in Sidekiq'
+  s.name = 'gitlab-sidekiq-fetcher'
+  s.version = '0.6.0'
+  s.authors = ['TEA', 'GitLab']
+  s.email = 'valery@gitlab.com'
+  s.license = 'LGPL-3.0'
+  s.homepage = 'https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/'
+  s.summary = 'Reliable fetch extension for Sidekiq'
+  s.description = 'Redis reliable queue pattern implemented in Sidekiq'
   s.require_paths = ['lib']
-  s.files = `git ls-files`.split($\)
-  s.test_files = []
-  s.add_dependency 'sidekiq', '~> 5'
+  s.files = `git ls-files`.split($\)
+  s.test_files = []
+  s.add_dependency 'sidekiq', '>= 5', '< 7'
 end
@@ -1,4 +1,5 @@
 require 'sidekiq'
+require 'sidekiq/api'
 
 require_relative 'sidekiq/base_reliable_fetch'
 require_relative 'sidekiq/reliable_fetch'
@@ -1,5 +1,7 @@
 # frozen_string_literal: true
 
+require_relative 'interrupted_set'
+
 module Sidekiq
   class BaseReliableFetch
     DEFAULT_CLEANUP_INTERVAL = 60 * 60 # 1 hour
@@ -16,6 +18,9 @@ module Sidekiq
     # Defines the COUNT parameter that will be passed to Redis SCAN command
     SCAN_COUNT = 1000
 
+    # How many times an interrupted job may be retried
+    DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION = 3
+
     UnitOfWork = Struct.new(:queue, :job) do
       def acknowledge
         Sidekiq.redis { |conn| conn.lrem(Sidekiq::BaseReliableFetch.working_queue_name(queue), 1, job) }
@@ -36,12 +41,10 @@ module Sidekiq
     end
 
     def self.setup_reliable_fetch!(config)
-      config.options[:fetch] = if config.options[:semi_reliable_fetch]
-                                 Sidekiq::SemiReliableFetch
-                               else
-                                 Sidekiq::ReliableFetch
-                               end
+      fetch = config.options[:semi_reliable_fetch] ? SemiReliableFetch : ReliableFetch
+      fetch = fetch.new(config.options) if Sidekiq::VERSION >= '6'
 
+      config.options[:fetch] = fetch
       Sidekiq.logger.info('GitLab reliable fetch activated!')
 
       start_heartbeat_thread
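For reference, calling `setup_reliable_fetch!` from the server side looks roughly like this. This is a config fragment adapted from the `tests/interruption/config.rb` shipped with the gem; the option keys are the gem's, the surrounding values are illustrative:

```ruby
# Server-side setup (adapted from the gem's tests/interruption/config.rb).
# :semi_reliable_fetch selects the strategy before setup_reliable_fetch! runs.
Sidekiq.configure_server do |config|
  config.options[:semi_reliable_fetch] = true # false selects Sidekiq::ReliableFetch

  Sidekiq::ReliableFetch.setup_reliable_fetch!(config)
end
```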
@@ -79,25 +82,68 @@ module Sidekiq
       Sidekiq.logger.debug("Heartbeat for hostname: #{hostname} and pid: #{pid}")
     end
 
+    def bulk_requeue(inprogress, options)
+      self.class.bulk_requeue(inprogress, options)
+    end
+
     def self.bulk_requeue(inprogress, _options)
       return if inprogress.empty?
 
-      Sidekiq.logger.debug('Re-queueing terminated jobs')
-
       Sidekiq.redis do |conn|
         inprogress.each do |unit_of_work|
           conn.multi do |multi|
-            multi.lpush(unit_of_work.queue, unit_of_work.job)
+            preprocess_interrupted_job(unit_of_work.job, unit_of_work.queue, multi)
+
             multi.lrem(working_queue_name(unit_of_work.queue), 1, unit_of_work.job)
           end
         end
       end
-
-      Sidekiq.logger.info("Pushed #{inprogress.size} jobs back to Redis")
     rescue => e
       Sidekiq.logger.warn("Failed to requeue #{inprogress.size} jobs: #{e.message}")
     end
 
+    def self.clean_working_queue!(working_queue)
+      original_queue = working_queue.gsub(/#{WORKING_QUEUE_PREFIX}:|:[^:]*:[0-9]*\z/, '')
+
+      Sidekiq.redis do |conn|
+        while job = conn.rpop(working_queue)
+          preprocess_interrupted_job(job, original_queue)
+        end
+      end
+    end
+
+    def self.preprocess_interrupted_job(job, queue, conn = nil)
+      msg = Sidekiq.load_json(job)
+      msg['interrupted_count'] = msg['interrupted_count'].to_i + 1
+
+      if interruption_exhausted?(msg)
+        send_to_quarantine(msg, conn)
+      else
+        requeue_job(queue, msg, conn)
+      end
+    end
+
+    # Detect "old" jobs and requeue them because the worker they were assigned
+    # to probably failed miserably.
+    def self.clean_working_queues!
+      Sidekiq.logger.info('Cleaning working queues')
+
+      Sidekiq.redis do |conn|
+        conn.scan_each(match: "#{WORKING_QUEUE_PREFIX}:queue:*", count: SCAN_COUNT) do |key|
+          # Example: "working:name_of_the_job:queue:{hostname}:{PID}"
+          hostname, pid = key.scan(/:([^:]*):([0-9]*)\z/).flatten
+
+          next if hostname.nil? || pid.nil?
+
+          clean_working_queue!(key) if worker_dead?(hostname, pid, conn)
+        end
+      end
+    end
+
+    def self.worker_dead?(hostname, pid, conn)
+      !conn.get(heartbeat_key(hostname, pid))
+    end
+
     def self.heartbeat_key(hostname, pid)
       "reliable-fetcher-heartbeat-#{hostname}-#{pid}"
     end
@@ -106,6 +152,57 @@ module Sidekiq
       "#{WORKING_QUEUE_PREFIX}:#{queue}:#{hostname}:#{pid}"
     end
 
+    def self.interruption_exhausted?(msg)
+      return false if max_retries_after_interruption(msg['class']) < 0
+
+      msg['interrupted_count'].to_i >= max_retries_after_interruption(msg['class'])
+    end
+
+    def self.max_retries_after_interruption(worker_class)
+      max_retries_after_interruption = nil
+
+      max_retries_after_interruption ||= begin
+        Object.const_get(worker_class).sidekiq_options[:max_retries_after_interruption]
+      rescue NameError
+      end
+
+      max_retries_after_interruption ||= Sidekiq.options[:max_retries_after_interruption]
+      max_retries_after_interruption ||= DEFAULT_MAX_RETRIES_AFTER_INTERRUPTION
+      max_retries_after_interruption
+    end
+
+    def self.send_to_quarantine(msg, multi_connection = nil)
+      Sidekiq.logger.warn(
+        class: msg['class'],
+        jid: msg['jid'],
+        message: %(Reliable Fetcher: adding dead #{msg['class']} job #{msg['jid']} to interrupted queue)
+      )
+
+      job = Sidekiq.dump_json(msg)
+      Sidekiq::InterruptedSet.new.put(job, connection: multi_connection)
+    end
+
+    # If you want this method to run in the scope of a multi connection,
+    # you need to pass it in
+    def self.requeue_job(queue, msg, conn)
+      with_connection(conn) do |conn|
+        conn.lpush(queue, Sidekiq.dump_json(msg))
+      end
+
+      Sidekiq.logger.info(
+        message: "Pushed job #{msg['jid']} back to queue #{queue}",
+        jid: msg['jid'],
+        queue: queue
+      )
+    end
+
+    # Yields the block with an existing connection or creates another one
+    def self.with_connection(conn, &block)
+      return yield(conn) if conn
+
+      Sidekiq.redis { |conn| yield(conn) }
+    end
+
     attr_reader :cleanup_interval, :last_try_to_take_lease_at, :lease_interval,
                 :queues, :use_semi_reliable_fetch,
                 :strictly_ordered_queues
@@ -119,7 +216,7 @@ module Sidekiq
     end
 
     def retrieve_work
-      clean_working_queues! if take_lease
+      self.class.clean_working_queues! if take_lease
 
       retrieve_unit_of_work
     end
@@ -131,43 +228,6 @@ module Sidekiq
 
     private
 
-    def clean_working_queue!(working_queue)
-      original_queue = working_queue.gsub(/#{WORKING_QUEUE_PREFIX}:|:[^:]*:[0-9]*\z/, '')
-
-      Sidekiq.redis do |conn|
-        count = 0
-
-        while conn.rpoplpush(working_queue, original_queue) do
-          count += 1
-        end
-
-        Sidekiq.logger.info("Requeued #{count} dead jobs to #{original_queue}")
-      end
-    end
-
-    # Detect "old" jobs and requeue them because the worker they were assigned
-    # to probably failed miserably.
-    def clean_working_queues!
-      Sidekiq.logger.info("Cleaning working queues")
-
-      Sidekiq.redis do |conn|
-        conn.scan_each(match: "#{WORKING_QUEUE_PREFIX}:queue:*", count: SCAN_COUNT) do |key|
-          # Example: "working:name_of_the_job:queue:{hostname}:{PID}"
-          hostname, pid = key.scan(/:([^:]*):([0-9]*)\z/).flatten
-
-          continue if hostname.nil? || pid.nil?
-
-          clean_working_queue!(key) if worker_dead?(hostname, pid)
-        end
-      end
-    end
-
-    def worker_dead?(hostname, pid)
-      Sidekiq.redis do |conn|
-        !conn.get(self.class.heartbeat_key(hostname, pid))
-      end
-    end
-
     def take_lease
       return unless allowed_to_take_a_lease?
 
@@ -0,0 +1,47 @@
+require 'sidekiq/api'
+
+module Sidekiq
+  class InterruptedSet < ::Sidekiq::JobSet
+    DEFAULT_MAX_CAPACITY = 10_000
+    DEFAULT_MAX_TIMEOUT = 90 * 24 * 60 * 60 # 3 months
+
+    def initialize
+      super "interrupted"
+    end
+
+    def put(message, opts = {})
+      now = Time.now.to_f
+
+      with_multi_connection(opts[:connection]) do |conn|
+        conn.zadd(name, now.to_s, message)
+        conn.zremrangebyscore(name, '-inf', now - self.class.timeout)
+        conn.zremrangebyrank(name, 0, - self.class.max_jobs)
+      end
+
+      true
+    end
+
+    # Yields the block inside an existing multi connection or creates a new one
+    def with_multi_connection(conn, &block)
+      return yield(conn) if conn
+
+      Sidekiq.redis do |c|
+        c.multi do |multi|
+          yield(multi)
+        end
+      end
+    end
+
+    def retry_all
+      each(&:retry) while size > 0
+    end
+
+    def self.max_jobs
+      Sidekiq.options[:interrupted_max_jobs] || DEFAULT_MAX_CAPACITY
+    end
+
+    def self.timeout
+      Sidekiq.options[:interrupted_timeout_in_seconds] || DEFAULT_MAX_TIMEOUT
+    end
+  end
+end
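The two trim calls in `put` above bound the set by age and by size. A rough pure-Ruby model of that behavior follows; an array of `[score, job]` pairs stands in for the Redis sorted set, and the size trimming is simplified rather than byte-for-byte equivalent to the `ZREMRANGEBYRANK` call:

```ruby
# Simplified model of InterruptedSet#put: entries expire by age (timeout)
# and the set is capped in size, dropping the oldest entries first.
MAX_JOBS = 3                  # stands in for interrupted_max_jobs
TIMEOUT  = 90 * 24 * 60 * 60  # stands in for interrupted_timeout_in_seconds

def put(set, job, now)
  set << [now, job]
  set.reject! { |score, _| score < now - TIMEOUT }      # like zremrangebyscore
  set.sort_by!(&:first)
  set.shift(set.size - MAX_JOBS) if set.size > MAX_JOBS # like zremrangebyrank
  set
end

set = []
5.times { |i| put(set, "job#{i}", i) }
p set.map(&:last) # => ["job2", "job3", "job4"]
```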
@@ -5,6 +5,8 @@ require 'sidekiq/reliable_fetch'
 require 'sidekiq/semi_reliable_fetch'
 
 describe Sidekiq::BaseReliableFetch do
+  let(:job) { Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo']) }
+
   before { Sidekiq.redis(&:flushdb) }
 
   describe 'UnitOfWork' do
@@ -12,7 +14,7 @@ describe Sidekiq::BaseReliableFetch do
 
     describe '#requeue' do
       it 'requeues job' do
-        Sidekiq.redis { |conn| conn.rpush('queue:foo', 'msg') }
+        Sidekiq.redis { |conn| conn.rpush('queue:foo', job) }
 
         uow = fetcher.retrieve_work
 
@@ -25,7 +27,7 @@ describe Sidekiq::BaseReliableFetch do
 
     describe '#acknowledge' do
       it 'acknowledges job' do
-        Sidekiq.redis { |conn| conn.rpush('queue:foo', 'msg') }
+        Sidekiq.redis { |conn| conn.rpush('queue:foo', job) }
 
         uow = fetcher.retrieve_work
 
@@ -38,24 +40,47 @@ describe Sidekiq::BaseReliableFetch do
   end
 
   describe '.bulk_requeue' do
+    let!(:queue1) { Sidekiq::Queue.new('foo') }
+    let!(:queue2) { Sidekiq::Queue.new('bar') }
+
     it 'requeues the bulk' do
-      queue1 = Sidekiq::Queue.new('foo')
-      queue2 = Sidekiq::Queue.new('bar')
+      uow = described_class::UnitOfWork
+      jobs = [ uow.new('queue:foo', job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
+      described_class.bulk_requeue(jobs, queues: [])
 
-      expect(queue1.size).to eq 0
-      expect(queue2.size).to eq 0
+      expect(queue1.size).to eq 2
+      expect(queue2.size).to eq 1
+    end
 
+    it 'puts jobs into interrupted queue' do
       uow = described_class::UnitOfWork
-      jobs = [ uow.new('queue:foo', 'bob'), uow.new('queue:foo', 'bar'), uow.new('queue:bar', 'widget') ]
+      interrupted_job = Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo'], interrupted_count: 3)
+      jobs = [ uow.new('queue:foo', interrupted_job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
+      described_class.bulk_requeue(jobs, queues: [])
+
+      expect(queue1.size).to eq 1
+      expect(queue2.size).to eq 1
+      expect(Sidekiq::InterruptedSet.new.size).to eq 1
+    end
+
+    it 'does not put jobs into interrupted queue if it is disabled' do
+      Sidekiq.options[:max_retries_after_interruption] = -1
+
+      uow = described_class::UnitOfWork
+      interrupted_job = Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo'], interrupted_count: 3)
+      jobs = [ uow.new('queue:foo', interrupted_job), uow.new('queue:foo', job), uow.new('queue:bar', job) ]
      described_class.bulk_requeue(jobs, queues: [])
 
       expect(queue1.size).to eq 2
       expect(queue2.size).to eq 1
+      expect(Sidekiq::InterruptedSet.new.size).to eq 0
+
+      Sidekiq.options[:max_retries_after_interruption] = 3
     end
   end
 
   it 'sets heartbeat' do
-    config = double(:sidekiq_config, options: {})
+    config = double(:sidekiq_config, options: { queues: [] })
 
     heartbeat_thread = described_class.setup_reliable_fetch!(config)
 
@@ -4,33 +4,38 @@ shared_examples 'a Sidekiq fetcher' do
   before { Sidekiq.redis(&:flushdb) }
 
   describe '#retrieve_work' do
+    let(:job) { Sidekiq.dump_json(class: 'Bob', args: [1, 2, 'foo']) }
     let(:fetcher) { described_class.new(queues: ['assigned']) }
 
     it 'retrieves the job and puts it to working queue' do
-      Sidekiq.redis { |conn| conn.rpush('queue:assigned', 'msg') }
+      Sidekiq.redis { |conn| conn.rpush('queue:assigned', job) }
 
       uow = fetcher.retrieve_work
 
       expect(working_queue_size('assigned')).to eq 1
       expect(uow.queue_name).to eq 'assigned'
-      expect(uow.job).to eq 'msg'
+      expect(uow.job).to eq job
       expect(Sidekiq::Queue.new('assigned').size).to eq 0
     end
 
     it 'does not retrieve a job from foreign queue' do
-      Sidekiq.redis { |conn| conn.rpush('queue:not_assigned', 'msg') }
+      Sidekiq.redis { |conn| conn.rpush('queue:not_assigned', job) }
 
       expect(fetcher.retrieve_work).to be_nil
     end
 
-    it 'requeues jobs from dead working queue' do
+    it 'requeues jobs from dead working queue with incremented interrupted_count' do
       Sidekiq.redis do |conn|
-        conn.rpush(other_process_working_queue_name('assigned'), 'msg')
+        conn.rpush(other_process_working_queue_name('assigned'), job)
       end
 
+      expected_job = Sidekiq.load_json(job)
+      expected_job['interrupted_count'] = 1
+      expected_job = Sidekiq.dump_json(expected_job)
+
       uow = fetcher.retrieve_work
 
-      expect(uow.job).to eq 'msg'
+      expect(uow.job).to eq expected_job
 
       Sidekiq.redis do |conn|
         expect(conn.llen(other_process_working_queue_name('assigned'))).to eq 0
@@ -41,7 +46,7 @@ shared_examples 'a Sidekiq fetcher' do
       working_queue = live_other_process_working_queue_name('assigned')
 
       Sidekiq.redis do |conn|
-        conn.rpush(working_queue, 'msg')
+        conn.rpush(working_queue, job)
       end
 
       uow = fetcher.retrieve_work
@@ -56,8 +61,7 @@ shared_examples 'a Sidekiq fetcher' do
     it 'does not clean up orphaned jobs more than once per cleanup interval' do
       Sidekiq.redis = Sidekiq::RedisConnection.create(url: REDIS_URL, size: 10)
 
-      expect_any_instance_of(described_class)
-        .to receive(:clean_working_queues!).once
+      expect(described_class).to receive(:clean_working_queues!).once
 
       threads = 10.times.map do
         Thread.new do
@@ -1,5 +1,6 @@
 require 'spec_helper'
 require 'fetch_shared_examples'
+require 'sidekiq/base_reliable_fetch'
 require 'sidekiq/reliable_fetch'
 
 describe Sidekiq::ReliableFetch do
@@ -1,5 +1,6 @@
 require 'spec_helper'
 require 'fetch_shared_examples'
+require 'sidekiq/base_reliable_fetch'
 require 'sidekiq/semi_reliable_fetch'
 
 describe Sidekiq::SemiReliableFetch do
@@ -0,0 +1,37 @@
+# How to run reliability tests
+
+```
+cd tests/reliability
+bundle exec ruby reliability_test.rb
+```
+
+You can adjust some parameters of the test in `config.rb`.
+
+`JOB_FETCHER` can be set to one of these values: `semi`, `reliable`, `basic`
+
+You need to have a Redis server running on the default port `6379`. To use another port, you can define
+the `REDIS_URL` environment variable with the port you need (example: `REDIS_URL="redis://localhost:9999"`).
+
+
+## How it works
+
+This tool spawns a configured number of Sidekiq workers, and when the number of processed jobs is about half the original
+number it kills all the workers with `kill -9`, then spawns new workers again until all the jobs are processed. To track the process and counters we use Redis keys/counters.
+
+# How to run interruption tests
+
+```
+cd tests/interruption
+
+# Verify "KILL" signal
+bundle exec ruby test_kill_signal.rb
+
+# Verify "TERM" signal
+bundle exec ruby test_term_signal.rb
+```
+
+It requires Redis to be running on port 6379.
+
+## How it works
+
+It spawns Sidekiq workers, then creates a job that will kill itself after a moment. The reliable fetcher will bring it back. The purpose is to verify that the job is run no more than the allowed number of times.
@@ -0,0 +1,19 @@
+# frozen_string_literal: true
+
+require_relative '../../lib/sidekiq-reliable-fetch'
+require_relative 'worker'
+
+TEST_CLEANUP_INTERVAL = 20
+TEST_LEASE_INTERVAL = 5
+
+Sidekiq.configure_server do |config|
+  config.options[:semi_reliable_fetch] = true
+
+  # We need to override these parameters to not wait too long.
+  # The default values are good for production use only.
+  # These will be ignored for :basic.
+  config.options[:cleanup_interval] = TEST_CLEANUP_INTERVAL
+  config.options[:lease_interval] = TEST_LEASE_INTERVAL
+
+  Sidekiq::ReliableFetch.setup_reliable_fetch!(config)
+end
@@ -0,0 +1,25 @@
+# frozen_string_literal: true
+
+require 'sidekiq'
+require_relative 'config'
+require_relative '../support/utils'
+
+EXPECTED_NUM_TIMES_BEEN_RUN = 3
+NUM_WORKERS = EXPECTED_NUM_TIMES_BEEN_RUN + 1
+
+Sidekiq.redis(&:flushdb)
+
+pids = spawn_workers(NUM_WORKERS)
+
+RetryTestWorker.perform_async
+
+sleep 300
+
+Sidekiq.redis do |redis|
+  times_has_been_run = redis.get('times_has_been_run').to_i
+  assert 'The job has been run', times_has_been_run, EXPECTED_NUM_TIMES_BEEN_RUN
+end
+
+assert 'Found interruption exhausted jobs', Sidekiq::InterruptedSet.new.size, 1
+
+stop_workers(pids)
@@ -0,0 +1,25 @@
+# frozen_string_literal: true
+
+require 'sidekiq'
+require_relative 'config'
+require_relative '../support/utils'
+
+EXPECTED_NUM_TIMES_BEEN_RUN = 3
+NUM_WORKERS = EXPECTED_NUM_TIMES_BEEN_RUN + 1
+
+Sidekiq.redis(&:flushdb)
+
+pids = spawn_workers(NUM_WORKERS)
+
+RetryTestWorker.perform_async('TERM', 60)
+
+sleep 300
+
+Sidekiq.redis do |redis|
+  times_has_been_run = redis.get('times_has_been_run').to_i
+  assert 'The job has been run', times_has_been_run, EXPECTED_NUM_TIMES_BEEN_RUN
+end
+
+assert 'Found interruption exhausted jobs', Sidekiq::InterruptedSet.new.size, 1
+
+stop_workers(pids)
@@ -0,0 +1,15 @@
+# frozen_string_literal: true
+
+class RetryTestWorker
+  include Sidekiq::Worker
+
+  def perform(signal = 'KILL', wait_seconds = 1)
+    Sidekiq.redis do |redis|
+      redis.incr('times_has_been_run')
+    end
+
+    Process.kill(signal, Process.pid)
+
+    sleep wait_seconds
+  end
+end
@@ -1,8 +1,6 @@
 # frozen_string_literal: true
 
-require_relative '../lib/sidekiq/base_reliable_fetch'
-require_relative '../lib/sidekiq/reliable_fetch'
-require_relative '../lib/sidekiq/semi_reliable_fetch'
+require_relative '../../lib/sidekiq-reliable-fetch'
 require_relative 'worker'
 
 REDIS_FINISHED_LIST = 'reliable-fetcher-finished-jids'
@@ -89,7 +89,7 @@ Sidekiq.redis(&:flushdb)
 jobs = []
 
 NUMBER_OF_JOBS.times do
-  jobs << TestWorker.perform_async
+  jobs << ReliabilityTestWorker.perform_async
 end
 
 puts "Queued #{NUMBER_OF_JOBS} jobs"
@@ -1,6 +1,6 @@
 # frozen_string_literal: true
 
-class TestWorker
+class ReliabilityTestWorker
   include Sidekiq::Worker
 
   def perform
@@ -0,0 +1,26 @@
+def assert(text, actual, expected)
+  if actual == expected
+    puts "#{text}: #{actual} (Success)"
+  else
+    puts "#{text}: #{actual} (Failed). Expected: #{expected}"
+    exit 1
+  end
+end
+
+def spawn_workers(number)
+  pids = []
+
+  number.times do
+    pids << spawn('sidekiq -r ./config.rb')
+  end
+
+  pids
+end
+
+# Stop Sidekiq workers
+def stop_workers(pids)
+  pids.each do |pid|
+    Process.kill('KILL', pid)
+    Process.wait pid
+  end
+end
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: gitlab-sidekiq-fetcher
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 0.6.0
 platform: ruby
 authors:
 - TEA
@@ -9,22 +9,28 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-12-19 00:00:00.000000000 Z
+date: 2020-07-22 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: sidekiq
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
       - !ruby/object:Gem::Version
         version: '5'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '7'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
      - !ruby/object:Gem::Version
        version: '5'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '7'
 description: Redis reliable queue pattern implemented in Sidekiq
 email: valery@gitlab.com
 executables: []
@@ -42,6 +48,7 @@ files:
 - gitlab-sidekiq-fetcher.gemspec
 - lib/sidekiq-reliable-fetch.rb
 - lib/sidekiq/base_reliable_fetch.rb
+- lib/sidekiq/interrupted_set.rb
 - lib/sidekiq/reliable_fetch.rb
 - lib/sidekiq/semi_reliable_fetch.rb
 - spec/base_reliable_fetch_spec.rb
@@ -49,10 +56,15 @@ files:
 - spec/reliable_fetch_spec.rb
 - spec/semi_reliable_fetch_spec.rb
 - spec/spec_helper.rb
-- test/README.md
-- test/config.rb
-- test/reliability_test.rb
-- test/worker.rb
+- tests/README.md
+- tests/interruption/config.rb
+- tests/interruption/test_kill_signal.rb
+- tests/interruption/test_term_signal.rb
+- tests/interruption/worker.rb
+- tests/reliability/config.rb
+- tests/reliability/reliability_test.rb
+- tests/reliability/worker.rb
+- tests/support/utils.rb
 homepage: https://gitlab.com/gitlab-org/sidekiq-reliable-fetch/
 licenses:
 - LGPL-3.0
@@ -72,8 +84,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
   version: '0'
 requirements: []
-rubyforge_project:
-rubygems_version: 2.7.6
+rubygems_version: 3.0.3
 signing_key:
 specification_version: 4
 summary: Reliable fetch extension for Sidekiq
@@ -1,34 +0,0 @@
-# How to run
-
-```
-cd test
-bundle exec ruby reliability_test.rb
-```
-
-You can adjust some parameters of the test in the `config.rb`
-
-
-# How it works
-
-This tool spawns configured number of Sidekiq workers and when the amount of processed jobs is about half of origin
-number it will kill all the workers with `kill -9` and then it will spawn new workers again until all the jobs are processed. To track the process and counters we use Redis keys/counters.
-
-# How to run tests
-
-To run rspec:
-
-```
-bundle exec rspec
-```
-
-To run performance tests:
-
-```
-cd test
-JOB_FETCHER=semi bundle exec ruby reliability_test.rb
-```
-
-JOB_FETCHER can be set to one of these values: `semi`, `reliable`, `basic`
-
-To run both kind of tests you need to have redis server running on default HTTP port `6379`. To use other HTTP port, you can define
-`REDIS_URL` environment varible with the port you need(example: `REDIS_URL="redis://localhost:9999"`).