sidekiq-paquet 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 2d30b4621c71e1966308bbe0a68f10aeb776a332
4
+ data.tar.gz: 13476e7dbafc931566f3e9cd6d2861fa1175583f
5
+ SHA512:
6
+ metadata.gz: d17cefb200e17b149424f2c1403b053e11c23da4c016e670b445f9108da393cc081c9281a91cb4a72faa320c5fd51fc4641d2579e0210daacae43367a7e9a525
7
+ data.tar.gz: 46be4131522f343c047226c2ae9b3b06082590a2054ac98cfd11bec473f2e4de72d1172fc3a532998165e3ff0001ad61e2a7fa52b674c9d00535e77b25d31ee8
data/.gitignore ADDED
@@ -0,0 +1,9 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
data/.travis.yml ADDED
@@ -0,0 +1,6 @@
1
+ language: ruby
2
+ services:
3
+ - redis-server
4
+ rvm:
5
+ - 2.2.2
6
+ before_install: gem install bundler -v 1.10.6
data/CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,13 @@
1
+ # Contributor Code of Conduct
2
+
3
+ As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
4
+
5
+ We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
6
+
7
+ Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
8
+
9
+ Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
10
+
11
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
12
+
13
+ This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in sidekiq-paquet.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2015 ccocchi
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,66 @@
1
+ # Sidekiq::Paquet
2
+
3
+ Instead of enqueueing and processing jobs one at a time, enqueue them one by one and process them in bulk.
4
+ Useful for grouping background API calls or intensive database inserts coming from multiple sources.
5
+
6
+ ## Installation
7
+
8
+ ```sh
9
+ gem install 'sidekiq-paquet'
10
+ ```
11
+
12
+ sidekiq-paquet requires Sidekiq 4+. If you're using Sidekiq < 4, take a look at [sidekiq-grouping](https://github.com/gzigzigzeo/sidekiq-grouping/) for similar features.
13
+
14
+ ## Usage
15
+
16
+ Add the `bulk: true` option to your worker's `sidekiq_options` to have jobs processed in bulk. The bulk size can be configured per worker with `bulk_size`. If not specified, `Sidekiq::Paquet.options[:default_bulk_size]` is used.
17
+
18
+ ```ruby
19
+ class ElasticIndexerWorker
20
+ include Sidekiq::Worker
21
+
22
+ sidekiq_options bulk: true, bulk_size: 100
23
+
24
+ def perform(values)
25
+ # Perform work with the array of values
26
+ end
27
+ end
28
+ ```
29
+
30
+ Instead of being processed directly by Sidekiq, jobs are stored in a separate list. A poller periodically retrieves them in slices of `bulk_size` and enqueues a regular Sidekiq job with each slice as its argument.
31
+ Thus, your worker will only be invoked with an array of values, never with single values themselves.
32
+
33
+ For example, if you call `perform_async` twice on the previous worker
34
+
35
+ ```ruby
36
+ ElasticIndexerWorker.perform_async({ delete: { _index: 'users', _id: 1, _type: 'user' } })
37
+ ElasticIndexerWorker.perform_async({ delete: { _index: 'users', _id: 2, _type: 'user' } })
38
+ ```
39
+
40
+ the worker instance will receive these values as a single argument
41
+
42
+ ```ruby
43
+ [
44
+ { delete: { _index: 'users', _id: 1, _type: 'user' } },
45
+ { delete: { _index: 'users', _id: 2, _type: 'user' } }
46
+ ]
47
+ ```
48
+
49
+ ## Configuration
50
+
51
+ You can change global configuration by modifying the `Sidekiq::Paquet.options` hash.
52
+
53
+ ```ruby
54
+ Sidekiq::Paquet.options[:default_bulk_size] = 500 # Default is 100
55
+ Sidekiq::Paquet.options[:average_bulk_flush_interval] = 30 # Default is 15
56
+ ```
57
+
58
+ The `average_bulk_flush_interval` represents the average time, in seconds, between two polls of the bulk lists. The per-process interval scales with the number of Sidekiq processes you're running: if you have 5 Sidekiq processes and set `average_bulk_flush_interval` to 15, each process checks for new bulk jobs every 75 seconds, so that on average the bulk queue is checked every 15 seconds.
59
+
60
+ ## Contributing
61
+
62
+ Bug reports and pull requests are welcome on GitHub at https://github.com/ccocchi/sidekiq-paquet. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
63
+
64
+ ## License
65
+
66
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
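The interval arithmetic from the README's Configuration section can be sketched in plain Ruby. This is an illustration only: `per_process_interval` is a hypothetical helper, not part of the gem, and it assumes the process count is passed in rather than read from Redis.

```ruby
# Hypothetical helper (not part of sidekiq-paquet): how long each process
# waits between polls so that the cluster as a whole polls roughly once
# every `average_interval` seconds.
def per_process_interval(process_count, average_interval)
  # Treat an empty process set as a single process, as the poller does.
  count = process_count.zero? ? 1 : process_count
  count * average_interval
end

puts per_process_interval(5, 15) # 5 processes, 15 second average => 75
puts per_process_interval(0, 15) # => 15
```

With more processes each one polls less often, which keeps the overall polling rate against Redis roughly constant.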
data/Rakefile ADDED
@@ -0,0 +1,12 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+
4
+ Rake::TestTask.new(:test) do |t|
5
+ t.libs << "test"
6
+ t.libs << "lib"
7
+
8
+ t.warning = false
9
+ t.test_files = FileList['test/**/test_*.rb']
10
+ end
11
+
12
+ task :default => :test
data/lib/sidekiq/paquet/batch.rb ADDED
@@ -0,0 +1,42 @@
1
+ module Sidekiq
2
+ module Paquet
3
+ module Batch
4
+
5
+ def self.append(item)
6
+ worker_name = item['class'.freeze]
7
+ args = item.fetch('args'.freeze, [])
8
+
9
+ Sidekiq.redis do |conn|
10
+ conn.multi do
11
+ conn.sadd('bulks'.freeze, worker_name)
12
+ conn.rpush("bulk:#{worker_name}", Sidekiq.dump_json(args))
13
+ end
14
+ end
15
+ end
16
+
17
+ def self.enqueue_jobs
18
+ Sidekiq.redis do |conn|
19
+ workers = conn.smembers('bulks'.freeze)
20
+
21
+ workers.each do |worker|
22
+ klass = Object.const_get(worker) # String#constantize is only available when ActiveSupport is loaded
23
+ opts = klass.get_sidekiq_options
24
+ items = conn.lrange("bulk:#{worker}", 0, -1)
25
+ items.map! { |i| Sidekiq.load_json(i) }
26
+
27
+ items.each_slice(opts.fetch('bulk_size'.freeze, Sidekiq::Paquet.options[:default_bulk_size])) do |vs|
28
+ Sidekiq::Client.push(
29
+ 'class' => worker,
30
+ 'queue' => opts['queue'.freeze],
31
+ 'args' => vs
32
+ )
33
+ end
34
+
35
+ conn.ltrim("bulk:#{worker}", items.size, -1)
36
+ end
37
+ end
38
+ end
39
+
40
+ end
41
+ end
42
+ end
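The append, slice, and trim flow of `Batch` can be followed without Redis. The sketch below is a simplification under stated assumptions: a plain Ruby array stands in for the `bulk:<worker>` list, the standard `json` library stands in for `Sidekiq.dump_json` / `Sidekiq.load_json`, and `bulk_size = 2` is an arbitrary value chosen for the example.

```ruby
require 'json'

# In-memory stand-in for the Redis list "bulk:<worker>" (assumption: no Redis here).
list = []

# Like Batch.append: store the JSON-encoded args array of each call.
[[1], [2], [3], [4], [5]].each { |args| list.push(JSON.generate(args)) }

# Like Batch.enqueue_jobs: read the whole list, decode it, slice it.
items = list.map { |raw| JSON.parse(raw) }
bulk_size = 2 # hypothetical value for the example
bulks = items.each_slice(bulk_size).to_a

# An entry appended after the read must survive the trim.
list.push(JSON.generate([6]))

# Equivalent of LTRIM key items.size -1: drop only the entries that were read.
list = list.drop(items.size)

p bulks # => [[[1], [2]], [[3], [4]], [[5]]]
p list  # => ["[6]"]
```

Trimming by the number of items read, rather than deleting the key, is what lets entries pushed between the read and the trim stay in the list for the next poll.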
data/lib/sidekiq/paquet/list.rb ADDED
@@ -0,0 +1,21 @@
1
+ module Sidekiq
2
+ module Paquet
3
+ class List
4
+ def initialize(name)
5
+ @lname = "bulk:#{name}"
6
+ end
7
+
8
+ def size
9
+ Sidekiq.redis { |c| c.llen(@lname) }
10
+ end
11
+
12
+ def items
13
+ Sidekiq.redis { |c| c.lrange(@lname, 0, -1) }
14
+ end
15
+
16
+ def clear
17
+ Sidekiq.redis { |c| c.del(@lname) }
18
+ end
19
+ end
20
+ end
21
+ end
data/lib/sidekiq/paquet/middleware.rb ADDED
@@ -0,0 +1,14 @@
1
+ module Sidekiq
2
+ module Paquet
3
+ class Middleware
4
+ def call(worker, item, queue, redis_pool = nil)
5
+ if item['bulk'.freeze]
6
+ Batch.append(item)
7
+ false
8
+ else
9
+ yield
10
+ end
11
+ end
12
+ end
13
+ end
14
+ end
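The middleware above relies on the fact that a client middleware returning a falsy value stops the regular push. A minimal sketch of that short-circuit, independent of Sidekiq, might look like the following; `BulkInterceptor` and `push` are hypothetical names, and the one-argument `call` is a simplification of the real four-argument signature.

```ruby
# Hypothetical, stripped-down version of the interception logic.
class BulkInterceptor
  attr_reader :captured

  def initialize
    @captured = []
  end

  def call(item)
    if item['bulk']
      @captured << item # stands in for Batch.append(item)
      false             # falsy result: the regular push is skipped
    else
      yield             # hand over to the rest of the chain
    end
  end
end

# Hypothetical push: only enqueue when the middleware returns a payload.
def push(item, middleware, queue)
  payload = middleware.call(item) { item }
  queue << payload if payload
  !!payload
end

interceptor = BulkInterceptor.new
queue = []
p push({ 'class' => 'A', 'bulk' => true }, interceptor, queue) # => false
p push({ 'class' => 'B' }, interceptor, queue)                 # => true
p queue.size                                                   # => 1
```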
data/lib/sidekiq/paquet/poller.rb ADDED
@@ -0,0 +1,84 @@
1
+ require 'sidekiq/util'
2
+ require 'sidekiq/scheduled'
3
+
4
+ module Sidekiq
5
+ module Paquet
6
+ class Poller < Sidekiq::Scheduled::Poller
7
+
8
+ def initialize
9
+ @sleeper = ConnectionPool::TimedStack.new
10
+ @done = false
11
+ end
12
+
13
+ def start
14
+ @thread ||= safe_thread('bulk') do
15
+ initial_wait
16
+
17
+ while !@done
18
+ enqueue
19
+ wait
20
+ end
21
+ Sidekiq.logger.info('Bulk exiting...')
22
+ end
23
+ end
24
+
25
+ def enqueue
26
+ begin
27
+ Batch.enqueue_jobs
28
+ rescue => ex
29
+ # Most likely a problem with redis networking.
30
+ # Punt and try again at the next interval
31
+ logger.error ex.message
32
+ logger.error ex.backtrace.first
33
+ end
34
+ end
35
+
36
+ private
37
+
38
+ # Calculates a random interval that is ±50% the desired average.
39
+ def random_poll_interval
40
+ avg = poll_interval_average.to_f
41
+ avg * rand + avg / 2
42
+ end
43
+
44
+ # We do our best to tune the poll interval to the size of the active Sidekiq
45
+ # cluster. If you have 30 processes and poll every 15 seconds, that means one
46
+ # Sidekiq is checking Redis every 0.5 seconds - way too often for most people
47
+ # and really bad if the retry or scheduled sets are large.
48
+ #
49
+ # Instead try to avoid polling more than once every 15 seconds. If you have
50
+ # 30 Sidekiq processes, we'll poll every 30 * 15 or 450 seconds.
51
+ # To keep things statistically random, we'll sleep a random amount between
52
+ # 225 and 675 seconds for each poll or 450 seconds on average. Otherwise restarting
53
+ # all your Sidekiq processes at the same time will lead to them all polling at
54
+ # the same time: the thundering herd problem.
55
+ #
56
+ # We only do this if dynamic_interval_scaling is enabled (the default); otherwise a fixed bulk_flush_interval is used when set.
57
+ def poll_interval_average
58
+ if Sidekiq::Paquet.options[:dynamic_interval_scaling]
59
+ scaled_poll_interval
60
+ else
61
+ Sidekiq::Paquet.options[:bulk_flush_interval] ||= scaled_poll_interval
62
+ end
63
+ end
64
+
65
+ # Calculates an average poll interval based on the number of known Sidekiq processes.
66
+ # This minimizes a single point of failure by dispersing check-ins but without taxing
67
+ # Redis if you run many Sidekiq processes.
68
+ def scaled_poll_interval
69
+ pcount = Sidekiq::ProcessSet.new.size
70
+ pcount = 1 if pcount == 0
71
+ pcount * Sidekiq::Paquet.options[:average_bulk_flush_interval]
72
+ end
73
+
74
+ def initial_wait
75
+ # Have all processes sleep for INITIAL_WAIT plus up to 15 random seconds.
76
+ # INITIAL_WAIT gives time for the heartbeat to register (the poll interval is calculated from the number
77
+ # of processes), and the random part ensures they don't all hit Redis at the same time.
78
+ total = INITIAL_WAIT + (15 * rand)
79
+ @sleeper.pop(total)
80
+ rescue Timeout::Error
81
+ end
82
+ end
83
+ end
84
+ end
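The jitter applied by `random_poll_interval` can be checked in isolation. The sketch below uses the same formula but takes the average as a parameter, an assumption made to keep it free of Sidekiq and Redis; `jittered_interval` is a hypothetical name.

```ruby
# Same formula as random_poll_interval, with the average passed in explicitly.
def jittered_interval(avg)
  avg = avg.to_f
  # rand is in [0, 1), so the result falls between avg / 2 and 3 * avg / 2.
  avg * rand + avg / 2
end

# With 30 processes and a 15 second target, the average is 450 seconds.
samples = Array.new(10_000) { jittered_interval(450) }

puts samples.min >= 225.0 # => true
puts samples.max <= 675.0 # => true
```

Because each process draws its own random interval, processes restarted at the same moment drift apart instead of polling Redis in lockstep.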
data/lib/sidekiq/paquet/version.rb ADDED
@@ -0,0 +1,5 @@
1
+ module Sidekiq
2
+ module Paquet
3
+ VERSION = '0.1.0'
4
+ end
5
+ end
data/lib/sidekiq/paquet.rb ADDED
@@ -0,0 +1,47 @@
1
+ require 'sidekiq'
2
+ require 'sidekiq/paquet/version'
3
+
4
+ require 'sidekiq/paquet/list'
5
+ require 'sidekiq/paquet/batch'
6
+ require 'sidekiq/paquet/middleware'
7
+ require 'sidekiq/paquet/poller'
8
+
9
+ module Sidekiq
10
+ module Paquet
11
+ DEFAULTS = {
12
+ default_bulk_size: 100,
13
+ bulk_flush_interval: nil,
14
+ average_bulk_flush_interval: 15,
15
+ dynamic_interval_scaling: true
16
+ }
17
+
18
+ def self.options
19
+ @options ||= DEFAULTS.dup
20
+ end
21
+
22
+ def self.options=(opts)
23
+ @options = opts
24
+ end
25
+ end
26
+ end
27
+
28
+ Sidekiq.configure_client do |config|
29
+ config.client_middleware do |chain|
30
+ chain.add Sidekiq::Paquet::Middleware
31
+ end
32
+ end
33
+
34
+ Sidekiq.configure_server do |config|
35
+ config.client_middleware do |chain|
36
+ chain.add Sidekiq::Paquet::Middleware
37
+ end
38
+
39
+ config.on(:startup) do
40
+ config.options[:bulk_poller] = Sidekiq::Paquet::Poller.new
41
+ config.options[:bulk_poller].start
42
+ end
43
+
44
+ config.on(:shutdown) do
45
+ config.options[:bulk_poller].terminate
46
+ end
47
+ end
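The `@options ||= DEFAULTS.dup` pattern above means that changing the options at runtime does not alter the defaults. A small sketch of that behaviour, using a hypothetical `PaquetOptionsSketch` module in place of `Sidekiq::Paquet` (the `freeze` call is an addition for the sketch, not something the gem does):

```ruby
# Hypothetical stand-in for Sidekiq::Paquet, reduced to the options handling.
module PaquetOptionsSketch
  DEFAULTS = { default_bulk_size: 100, average_bulk_flush_interval: 15 }.freeze

  def self.options
    # dup returns an unfrozen copy, so callers can modify it safely.
    @options ||= DEFAULTS.dup
  end
end

PaquetOptionsSketch.options[:default_bulk_size] = 500

p PaquetOptionsSketch.options[:default_bulk_size]   # => 500
p PaquetOptionsSketch::DEFAULTS[:default_bulk_size] # => 100
```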
data/sidekiq-bulk.gemspec ADDED
@@ -0,0 +1,26 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'sidekiq/paquet/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "sidekiq-paquet"
8
+ spec.version = Sidekiq::Paquet::VERSION
9
+ spec.authors = ["ccocchi"]
10
+ spec.email = ["cocchi.c@gmail.com"]
11
+
12
+ spec.summary = "Bulk processing for sidekiq 4+"
13
+ spec.homepage = "https://github.com/ccocchi/sidekiq-paquet"
14
+ spec.license = "MIT"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
17
+ spec.bindir = "exe"
18
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
19
+ spec.require_paths = ["lib"]
20
+
21
+ spec.add_dependency "sidekiq", ">= 4"
22
+
23
+ spec.add_development_dependency "bundler", "~> 1.10"
24
+ spec.add_development_dependency "rake", "~> 10.0"
25
+ spec.add_development_dependency "minitest"
26
+ end
metadata ADDED
@@ -0,0 +1,115 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: sidekiq-paquet
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.0
5
+ platform: ruby
6
+ authors:
7
+ - ccocchi
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2015-12-26 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: sidekiq
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '4'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '4'
27
+ - !ruby/object:Gem::Dependency
28
+ name: bundler
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '1.10'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '1.10'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rake
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - "~>"
46
+ - !ruby/object:Gem::Version
47
+ version: '10.0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - "~>"
53
+ - !ruby/object:Gem::Version
54
+ version: '10.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: minitest
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - ">="
60
+ - !ruby/object:Gem::Version
61
+ version: '0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - ">="
67
+ - !ruby/object:Gem::Version
68
+ version: '0'
69
+ description:
70
+ email:
71
+ - cocchi.c@gmail.com
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - ".gitignore"
77
+ - ".travis.yml"
78
+ - CODE_OF_CONDUCT.md
79
+ - Gemfile
80
+ - LICENSE.txt
81
+ - README.md
82
+ - Rakefile
83
+ - lib/sidekiq/paquet.rb
84
+ - lib/sidekiq/paquet/batch.rb
85
+ - lib/sidekiq/paquet/list.rb
86
+ - lib/sidekiq/paquet/middleware.rb
87
+ - lib/sidekiq/paquet/poller.rb
88
+ - lib/sidekiq/paquet/version.rb
89
+ - sidekiq-bulk.gemspec
90
+ homepage: https://github.com/ccocchi/sidekiq-paquet
91
+ licenses:
92
+ - MIT
93
+ metadata: {}
94
+ post_install_message:
95
+ rdoc_options: []
96
+ require_paths:
97
+ - lib
98
+ required_ruby_version: !ruby/object:Gem::Requirement
99
+ requirements:
100
+ - - ">="
101
+ - !ruby/object:Gem::Version
102
+ version: '0'
103
+ required_rubygems_version: !ruby/object:Gem::Requirement
104
+ requirements:
105
+ - - ">="
106
+ - !ruby/object:Gem::Version
107
+ version: '0'
108
+ requirements: []
109
+ rubyforge_project:
110
+ rubygems_version: 2.4.5
111
+ signing_key:
112
+ specification_version: 4
113
+ summary: Bulk processing for sidekiq 4+
114
+ test_files: []
115
+ has_rdoc: