karafka-sidekiq-backend 1.2.0.beta4 → 1.4.0

data/README.md CHANGED
@@ -1,6 +1,6 @@
  # Karafka Sidekiq Backend
 
- [![Build Status](https://travis-ci.org/karafka/sidekiq-backend.png)](https://travis-ci.org/karafka/karafka-sidekiq-backend)
+ [![Build Status](https://github.com/karafka/sidekiq-backend/workflows/ci/badge.svg)](https://github.com/karafka/sidekiq-backend/actions?query=workflow%3Aci)
  [![Join the chat at https://gitter.im/karafka/karafka](https://badges.gitter.im/karafka/karafka.svg)](https://gitter.im/karafka/karafka?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
 
  [Karafka Sidekiq Backend](https://github.com/karafka/sidekiq-backend) provides support for consuming (processing) received Kafka messages inside of Sidekiq workers.
@@ -45,7 +45,8 @@ or on a per topic level:
  App.routes.draw do
    consumer_group :videos_consumer do
      topic :binary_video_details do
-       controller Videos::DetailsController
+       backend :sidekiq
+       consumer Videos::DetailsConsumer
        worker Workers::DetailsWorker
        interchanger Interchangers::MyCustomInterchanger
      end
@@ -53,7 +54,7 @@ App.routes.draw do
  end
  ```
 
- You don't need to do anything beyond that. Karafka will know, that you want to run your controllers ```#perform``` method in a background job.
+ You don't need to do anything beyond that. Karafka will know that you want to run your consumer's ```#consume``` method in a background job.
 
  ## Configuration
 
@@ -67,7 +68,7 @@ There are two options you can set inside of the ```topic``` block:
 
  ### Workers
 
- Karafka by default will build a worker that will correspond to each of your controllers (so you will have a pair - controller and a worker). All of them will inherit from ```ApplicationWorker``` and will share all its settings.
+ Karafka by default will build a worker that will correspond to each of your consumers (so you will have a pair: a consumer and a worker). All of them will inherit from ```ApplicationWorker``` and will share all its settings.
 
  To run Sidekiq you should have a ```sidekiq.yml``` file in the *config* folder. An example ```sidekiq.yml``` file will be generated to config/sidekiq.yml.example once you run ```bundle exec karafka install```.
 
@@ -75,17 +76,17 @@ However, if you want to use a raw Sidekiq worker (without any Karafka additional
 
  ```ruby
  topic :incoming_messages do
-   controller MessagesController
+   consumer MessagesConsumer
    worker MyCustomWorker
  end
  ```
 
- Note that even then, you need to specify a controller that will schedule a background task.
+ Note that even then, you need to specify a consumer that will schedule a background task.
 
  Custom workers need to provide a ```#perform_async``` method. It needs to accept two arguments:
 
  - ```topic_id``` - the first argument is the id of the topic from which a given message comes
- - ```params_batch``` - all the params that came from Kafka + additional metadata. This data format might be changed if you use custom interchangers. Otherwise it will be an instance of Karafka::Params::ParamsBatch.
+ - ```params_batch``` - all the params that came from Kafka + additional metadata. This data format might be changed if you use custom interchangers. Otherwise, it will be an instance of Karafka::Params::ParamsBatch.
 
  **Note**: If you use custom interchangers, keep in mind that params inside the params batch might be in two states when passed to #perform_async: parsed or unparsed. This means that if you use custom interchangers and/or custom workers, you might want to look into Karafka's sources to see exactly how it works.
 
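For illustration, a minimal raw worker satisfying this contract could look like the sketch below (hypothetical code, not part of the gem). It reuses the `MyCustomWorker` name from the routing example above; note that the 1.4 backend shown later in this diff also passes an optional batch metadata hash as a third argument.

```ruby
# Hypothetical sketch of a raw Sidekiq worker for the `worker MyCustomWorker`
# route above. Including Sidekiq::Worker provides the class-level
# #perform_async that Karafka calls when scheduling a batch.
class MyCustomWorker
  include Sidekiq::Worker

  # @param topic_id [String] id of the topic the batch came from
  # @param params_batch [Array<Hash>] interchanged message data
  # @param metadata [Hash, nil] batch metadata passed by the 1.4 backend
  def perform(topic_id, params_batch, metadata = nil)
    # Custom handling instead of the default Karafka consumer flow
    Karafka.logger.info("#{params_batch.size} message(s) from #{topic_id}")
  end
end
```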
@@ -93,19 +94,50 @@ Custom workers need to provide a ```#perform_async``` method. It needs to accept
 
  Custom interchangers target issues with non-standard (binary, etc.) data that we want to store when we do ```#perform_async```. This data might be corrupted when fetched in a worker (see [this](https://github.com/karafka/karafka/issues/30) issue). With custom interchangers, you can encode/compress data before it is being passed to scheduling and decode/decompress it when it gets into the worker.
 
+ To use a custom interchanger for a topic, declare it inside the routes like this:
+
+ ```ruby
+ App.routes.draw do
+   consumer_group :videos_consumer do
+     topic :binary_video_details do
+       consumer Videos::DetailsConsumer
+       interchanger Interchangers::MyCustomInterchanger
+     end
+   end
+ end
+ ```
+
+ Each custom interchanger should define `encode` to encode params before they get stored in Redis, and `decode` to convert the params back to a hash format, as shown below:
+
+ ```ruby
+ class Base64Interchanger
+   class << self
+     def encode(params_batch)
+       # Note that you need to cast the params_batch to an array in order
+       # to get it to work with Sidekiq later
+       Base64.encode64(Marshal.dump(params_batch.to_a))
+     end
+
+     def decode(params_string)
+       Marshal.load(Base64.decode64(params_string))
+     end
+   end
+ end
+ ```
+
  **Warning**: if you decide to use slow interchangers, they might significantly slow down Karafka.
 
  ## References
 
  * [Karafka framework](https://github.com/karafka/karafka)
- * [Karafka Sidekiq Backend Travis CI](https://travis-ci.org/karafka/karafka-sidekiq-backend)
+ * [Karafka Sidekiq Backend Actions CI](https://github.com/karafka/sidekiq-backend/actions?query=workflow%3Aci)
  * [Karafka Sidekiq Backend Coditsu](https://app.coditsu.io/karafka/repositories/karafka-sidekiq-backend)
 
  ## Note on contributions
 
  First, thank you for considering contributing to Karafka! It's people like you that make the open source community such a great community!
 
- Each pull request must pass all the rspec specs and meet our quality requirements.
+ Each pull request must pass all the RSpec specs and meet our quality requirements.
 
  To check if everything is as it should be, we use [Coditsu](https://coditsu.io) that combines multiple linters and code analyzers for both code and documentation. Once you're done with your changes, submit a pull request.
 
data/certs/mensfeld.pem ADDED
@@ -0,0 +1,25 @@
+ -----BEGIN CERTIFICATE-----
+ MIIEODCCAqCgAwIBAgIBATANBgkqhkiG9w0BAQsFADAjMSEwHwYDVQQDDBhtYWNp
+ ZWovREM9bWVuc2ZlbGQvREM9cGwwHhcNMjAwODExMDkxNTM3WhcNMjEwODExMDkx
+ NTM3WjAjMSEwHwYDVQQDDBhtYWNpZWovREM9bWVuc2ZlbGQvREM9cGwwggGiMA0G
+ CSqGSIb3DQEBAQUAA4IBjwAwggGKAoIBgQDCpXsCgmINb6lHBXXBdyrgsBPSxC4/
+ 2H+weJ6L9CruTiv2+2/ZkQGtnLcDgrD14rdLIHK7t0o3EKYlDT5GhD/XUVhI15JE
+ N7IqnPUgexe1fbZArwQ51afxz2AmPQN2BkB2oeQHXxnSWUGMhvcEZpfbxCCJH26w
+ hS0Ccsma8yxA6hSlGVhFVDuCr7c2L1di6cK2CtIDpfDaWqnVNJEwBYHIxrCoWK5g
+ sIGekVt/admS9gRhIMaIBg+Mshth5/DEyWO2QjteTodItlxfTctrfmiAl8X8T5JP
+ VXeLp5SSOJ5JXE80nShMJp3RFnGw5fqjX/ffjtISYh78/By4xF3a25HdWH9+qO2Z
+ tx0wSGc9/4gqNM0APQnjN/4YXrGZ4IeSjtE+OrrX07l0TiyikzSLFOkZCAp8oBJi
+ Fhlosz8xQDJf7mhNxOaZziqASzp/hJTU/tuDKl5+ql2icnMv5iV/i6SlmvU29QNg
+ LCV71pUv0pWzN+OZbHZKWepGhEQ3cG9MwvkCAwEAAaN3MHUwCQYDVR0TBAIwADAL
+ BgNVHQ8EBAMCBLAwHQYDVR0OBBYEFImGed2AXS070ohfRidiCEhXEUN+MB0GA1Ud
+ EQQWMBSBEm1hY2llakBtZW5zZmVsZC5wbDAdBgNVHRIEFjAUgRJtYWNpZWpAbWVu
+ c2ZlbGQucGwwDQYJKoZIhvcNAQELBQADggGBAKiHpwoENVrMi94V1zD4o8/6G3AU
+ gWz4udkPYHTZLUy3dLznc/sNjdkJFWT3E6NKYq7c60EpJ0m0vAEg5+F5pmNOsvD3
+ 2pXLj9kisEeYhR516HwXAvtngboUcb75skqvBCU++4Pu7BRAPjO1/ihLSBexbwSS
+ fF+J5OWNuyHHCQp+kGPLtXJe2yUYyvSWDj3I2//Vk0VhNOIlaCS1+5/P3ZJThOtm
+ zJUBI7h3HgovwRpcnmk2mXTmU4Zx/bCzX8EA6VY0khEvnmiq7S6eBF0H9qH8KyQ6
+ EkVLpvmUDFcf/uNaBQdazEMB5jYtwoA8gQlANETNGPi51KlkukhKgaIEDMkBDJOx
+ 65N7DzmkcyY0/GwjIVIxmRhcrCt1YeCUElmfFx0iida1/YRm6sB2AXqScc1+ECRi
+ 2DND//YJUikn1zwbz1kT70XmHd97B4Eytpln7K+M1u2g1pHVEPW4owD/ammXNpUy
+ nt70FcDD4yxJQ+0YNiHd0N8IcVBM1TMIVctMNQ==
+ -----END CERTIFICATE-----
@@ -9,16 +9,21 @@ Gem::Specification.new do |spec|
    spec.version = Karafka::Backends::Sidekiq::VERSION
    spec.platform = Gem::Platform::RUBY
    spec.authors = ['Maciej Mensfeld']
-   spec.email = %w[maciej@coditsu.io]
+   spec.email = %w[maciej@mensfeld.pl]
    spec.homepage = 'https://github.com/karafka/karafka-sidekiq-backend'
    spec.summary = 'Karafka Sidekiq backend for background messages processing'
    spec.description = 'Karafka Sidekiq backend for background messages processing'
-   spec.license = 'MIT'
+   spec.license = 'LGPL-3.0'
 
-   spec.add_dependency 'karafka', '>= 1.2.0.beta4'
+   spec.add_dependency 'karafka', '~> 1.4.0.rc2'
    spec.add_dependency 'sidekiq', '>= 4.2'
-   spec.required_ruby_version = '>= 2.3.0'
+   spec.required_ruby_version = '>= 2.5.0'
 
+   if $PROGRAM_NAME.end_with?('gem')
+     spec.signing_key = File.expand_path('~/.ssh/gem-private_key.pem')
+   end
+
+   spec.cert_chain = %w[certs/mensfeld.pem]
    spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(spec)/}) }
    spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
    spec.require_paths = %w[lib]
@@ -1,34 +1,3 @@
  # frozen_string_literal: true
 
- # @note Active Support is already included since Karafka is using it directly so no need
- #   to require it again in the gemspec, etc
- %w[
-   karafka
-   sidekiq
-   active_support/core_ext/class/subclasses
- ].each(&method(:require))
-
- # Karafka framework namespace
- module Karafka
-   # Namespace for all the backends that process data
-   module Backends
-     # Sidekiq Karafka backend
-     module Sidekiq
-       class << self
-         # @return [String] path to Karafka gem root core
-         def core_root
-           Pathname.new(File.expand_path('karafka', __dir__))
-         end
-       end
-     end
-   end
- end
-
- # Uses Karafka loader to load all the sources that this backend needs
- Karafka::Loader.load!(Karafka::Backends::Sidekiq.core_root)
- Karafka::AttributesMap.prepend(Karafka::Extensions::SidekiqAttributesMap)
- # Register internal events for instrumentation
- %w[
-   backends.sidekiq.process
-   backends.sidekiq.base_worker.perform
- ].each(&Karafka.monitor.method(:register_event))
+ require 'karafka_sidekiq_backend'
@@ -1,11 +1,12 @@
  # frozen_string_literal: true
 
  module Karafka
+   # Namespace for alternative processing backends for Karafka framework
    module Backends
      # Sidekiq backend that schedules stuff to Sidekiq worker for delayed execution
      module Sidekiq
        # Karafka Sidekiq backend version
-       VERSION = '1.2.0.beta4'
+       VERSION = '1.4.0'
 
        # Enqueues the execution of perform method into a worker.
        # @note Each worker needs to have a class #perform_async method that will allow us to pass
@@ -16,7 +17,8 @@ module Karafka
        Karafka.monitor.instrument('backends.sidekiq.process', caller: self) do
          topic.worker.perform_async(
            topic.id,
-           topic.interchanger.encode(params_batch)
+           topic.interchanger.encode(params_batch),
+           respond_to?(:batch_metadata) ? batch_metadata.to_h : nil
          )
        end
      end
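Combined with the default interchanger at the bottom of this diff, the block above enqueues a Sidekiq job with three plain arguments. A hedged sketch of their shape (all values illustrative, not actual gem output):

```ruby
# Illustrative shape of the enqueued job arguments
topic.worker.perform_async(
  topic.id,                                               # e.g. 'app_videos_details'
  [{ raw_payload: '{"id":1}', metadata: { offset: 0 } }], # Interchanger#encode result
  { size: 1 }                                             # batch_metadata.to_h or nil
)
```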
@@ -5,11 +5,30 @@ module Karafka
    class BaseWorker
      include Sidekiq::Worker
 
+     class << self
+       # Returns the base worker class for the application.
+       #
+       # @return [Class] first worker that inherited from Karafka::BaseWorker. Karafka
+       #   assumes that it is the base worker for an application.
+       # @raise [Karafka::Errors::BaseWorkerDescentantMissing] raised when the application
+       #   base worker was not defined.
+       def base_worker
+         @inherited || raise(Errors::BaseWorkerDescentantMissing)
+       end
+
+       # @param subclass [Class] subclass of the worker
+       # @return [Class] subclass of the worker that was selected
+       def inherited(subclass)
+         @inherited ||= subclass
+       end
+     end
+
      # Executes the logic that lies in #perform Karafka consumer method
      # @param topic_id [String] Unique topic id that we will use to find a proper topic
-     # @param params_batch [Array] Array with messages batch
-     def perform(topic_id, params_batch)
-       consumer = consumer(topic_id, params_batch)
+     # @param params_batch [Array<Hash>] Array with messages batch
+     # @param metadata [Hash, nil] hash with all the metadata or nil if not present
+     def perform(topic_id, params_batch, metadata)
+       consumer = consumer(topic_id, params_batch, metadata)
 
        Karafka.monitor.instrument(
          'backends.sidekiq.base_worker.perform',
@@ -20,12 +39,27 @@ module Karafka
 
      private
 
+     # @see `#perform` for exact params descriptions
+     # @param topic_id [String]
+     # @param params_batch [Array<Hash>]
+     # @param metadata [Hash, nil]
      # @return [Karafka::Consumer] descendant of Karafka::BaseConsumer that matches the topic
      #   with params_batch assigned already (consumer is ready to use)
-     def consumer(topic_id, params_batch)
+     def consumer(topic_id, params_batch, metadata)
        topic = Karafka::Routing::Router.find(topic_id)
-       consumer = topic.consumer.new
-       consumer.params_batch = topic.interchanger.decode(params_batch)
+       consumer = topic.consumer.new(topic)
+       consumer.params_batch = Params::Builders::ParamsBatch.from_array(
+         topic.interchanger.decode(params_batch),
+         topic
+       )
+
+       if topic.batch_fetching
+         consumer.batch_metadata = Params::Builders::BatchMetadata.from_hash(
+           metadata,
+           topic
+         )
+       end
+
        consumer
      end
    end
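The `batch_metadata` branch above only runs for topics consumed with batch fetching enabled. A hypothetical routing sketch that would exercise it (`batch_fetching` is a standard Karafka consumer group option):

```ruby
# Hypothetical routing exercising the batch metadata path above
App.routes.draw do
  consumer_group :videos_consumer do
    batch_fetching true # enables topic.batch_fetching in the worker
    topic :binary_video_details do
      backend :sidekiq
      consumer Videos::DetailsConsumer
    end
  end
end
```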
@@ -0,0 +1,23 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Extensions
+     # Extension for metadata builder to allow building metadata from a hash
+     module BatchMetadataBuilder
+       # Builds batch metadata from a hash
+       # @param hash [Hash] hash with metadata
+       # @param topic [Karafka::Routing::Topic] topic instance
+       # @return [Karafka::Params::BatchMetadata] batch metadata instance
+       def from_hash(hash, topic)
+         # The deserializer needs to be merged in, as it is the only non-serializable
+         # object, so it gets reconstructed from the topic
+         Karafka::Params::BatchMetadata
+           .new(
+             **hash
+               .merge('deserializer' => topic.deserializer)
+               .transform_keys(&:to_sym)
+           )
+       end
+     end
+   end
+ end
@@ -0,0 +1,21 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Extensions
+     # Extension for params batch builder for reconstruction of the batch from an array
+     module ParamsBatchBuilder
+       # Builds params batch from an array of hashes
+       # @param array [Array<Hash>] array with hash messages
+       # @param topic [Karafka::Routing::Topic] topic for which we build the batch
+       # @return [Karafka::Params::ParamsBatch] built batch
+       # @note We rebuild the params batch from the array after the serialization
+       def from_array(array, topic)
+         params_array = array.map do |hash|
+           Karafka::Params::Builders::Params.from_hash(hash, topic)
+         end
+
+         Karafka::Params::ParamsBatch.new(params_array).freeze
+       end
+     end
+   end
+ end
@@ -0,0 +1,24 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Extensions
+     # Extension for rebuilding params from a hash
+     module ParamsBuilder
+       # Builds params from a hash
+       # @param hash [Hash] hash with params details
+       # @param topic [Karafka::Routing::Topic] topic for which we build the params
+       # @return [Karafka::Params::Params] built params
+       def from_hash(hash, topic)
+         metadata = Karafka::Params::Metadata.new(
+           **hash
+             .fetch('metadata')
+             .merge('deserializer' => topic.deserializer)
+             .transform_keys(&:to_sym)
+         ).freeze
+
+         Karafka::Params::Params
+           .new(hash.fetch('raw_payload'), metadata)
+       end
+     end
+   end
+ end
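`from_hash` fetches string keys ('metadata', 'raw_payload') because the symbol keys produced by `Interchanger#encode` come back stringified after the round trip through Sidekiq's JSON job payload; that is also why `transform_keys(&:to_sym)` is applied before building the metadata. A self-contained sketch of the effect:

```ruby
require 'json'

# What Sidekiq's JSON serialization does to an interchanged message hash
encoded = { raw_payload: '{"id":1}', metadata: { offset: 0 } }
restored = JSON.parse(JSON.generate(encoded))

restored.keys                                       # => ["raw_payload", "metadata"]
restored.fetch('metadata')                          # => {"offset"=>0}
restored.fetch('metadata').transform_keys(&:to_sym) # => {:offset=>0}
```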
@@ -1,6 +1,7 @@
  # frozen_string_literal: true
 
  module Karafka
+   # Extensions for Karafka components
    module Extensions
      # Additional Karafka::AttributesMap topic attributes that can be used when a worker
      #   is active and we use the Sidekiq backend
@@ -1,8 +1,6 @@
  # frozen_string_literal: true
 
  module Karafka
-   # Namespace for additional extensions that we include into some Karafka components, to gain
-   #   extra features that we require
    module Extensions
      # Additional Karafka::Routing::Topic methods that are required to work with Sidekiq backend
      module SidekiqTopicAttributes
@@ -16,7 +14,7 @@ module Karafka
        # @return [Class] Interchanger class (not an instance) that we want to use to interchange
        #   params between Karafka server and Karafka background job
        def interchanger
-         @interchanger ||= Karafka::Interchanger
+         @interchanger ||= Karafka::Interchanger.new
        end
 
        # Creates attributes writers for worker and interchanger, so they can be overwritten
@@ -28,5 +26,3 @@ module Karafka
      end
    end
  end
-
- Karafka::Routing::Topic.include Karafka::Extensions::SidekiqTopicAttributes
@@ -0,0 +1,27 @@
+ # frozen_string_literal: true
+
+ module Karafka
+   module Extensions
+     # Additional methods for the listener that listens on instrumentation related to the
+     #   Sidekiq backend of Karafka
+     module StdoutListener
+       # Logs info about scheduling of a certain dataset with a Sidekiq backend
+       # @param event [Dry::Events::Event] event details including payload
+       def on_backends_sidekiq_process(event)
+         count = event[:caller].send(:params_batch).size
+         topic = event[:caller].topic.name
+         time = event[:time]
+         info "Scheduling of #{count} messages to Sidekiq on topic #{topic} took #{time} ms"
+       end
+
+       # Logs info about processing certain events with a given Sidekiq worker
+       # @param event [Dry::Events::Event] event details including payload
+       def on_backends_sidekiq_base_worker_perform(event)
+         count = event[:consumer].send(:params_batch).size
+         topic = event[:consumer].topic.name
+         time = event[:time]
+         info "Sidekiq processing of topic #{topic} with #{count} messages took #{time} ms"
+       end
+     end
+   end
+ end
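These handlers fire only when the stdout listener is subscribed to Karafka's monitor, as in a stock Karafka 1.4 boot file. A sketch of the usual subscription line (assuming the backend has already mixed the module above into the listener):

```ruby
# In karafka.rb – subscribes the listener these extensions augment
Karafka.monitor.subscribe(Karafka::Instrumentation::StdoutListener.new)
```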
@@ -1,7 +1,7 @@
  # frozen_string_literal: true
 
  module Karafka
-   # Interchangers allow us to format/encode/pack data that is being send to perform_async
+   # Interchanger allows us to format/encode/pack data that is being sent to perform_async
    # This is meant to target mostly issues with data encoding like this one:
    # https://github.com/mperham/sidekiq/issues/197
    # Each custom interchanger should implement the following methods:
@@ -9,26 +9,22 @@ module Karafka
    #   - decode - decode params back to a hash format that we can use
    #
    # This interchanger uses default Sidekiq options to exchange data
-   # @note Since we use symbols for Karafka params (performance reasons), they will be
-   #   deserialized into string versions. Keep that in mind.
    class Interchanger
-     class << self
-       # @param params_batch [Karafka::Params::ParamsBatch] Karafka params batch object
-       # @return [Karafka::Params::ParamsBatch] parsed params batch. There are too many problems
-       #   with passing unparsed data from Karafka to Sidekiq, to make it a default. In case you
-       #   need this, please implement your own interchanger.
-       def encode(params_batch)
-         params_batch.parsed
+     # @param params_batch [Karafka::Params::ParamsBatch] Karafka params batch object
+     # @return [Array<Hash>] Array with hashes built out of params data
+     def encode(params_batch)
+       params_batch.map do |param|
+         {
+           raw_payload: param.raw_payload,
+           metadata: param.metadata.to_h
+         }
        end
+     end
 
-     # @param params_batch [Array<Hash>] Sidekiq params that are now an array
-     # @note Since Sidekiq does not like symbols, we restore symbolized keys for system keys, so
-     #   everything can work as expected. Keep in mind, that custom data will always be assigned
-     #   with string keys per design. To change it, please change this interchanger and create
-     #   your own custom parser
-     def decode(params_batch)
-       params_batch
-     end
+     # @param params_batch [Array<Hash>] Sidekiq params that are now an array
+     # @return [Array<Hash>] exactly what we've fetched from Sidekiq
+     def decode(params_batch)
+       params_batch
+     end
    end
  end
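Since the default `#encode` above now emits plain, JSON-friendly hashes, custom interchangers remain useful mainly for binary-unsafe or oversized payloads. Purely for illustration, an instance-based variant compatible with the new `Karafka::Interchanger.new` default (hypothetical, not part of the gem) that compresses the batch could look like this; its `#decode` output keeps the string keys the params builders above expect:

```ruby
require 'base64'
require 'json'
require 'zlib'

# Hypothetical instance-based interchanger mirroring the default contract
# above, but compressing the payload. Base64 keeps the binary deflate output
# safe inside Sidekiq's JSON job payload – the exact class of problem
# interchangers were introduced to solve.
class CompressedInterchanger
  # @param params_batch [Karafka::Params::ParamsBatch] batch to interchange
  # @return [String] Base64 encoded, deflated JSON array with messages data
  def encode(params_batch)
    plain = params_batch.map do |param|
      { raw_payload: param.raw_payload, metadata: param.metadata.to_h }
    end

    Base64.strict_encode64(Zlib::Deflate.deflate(JSON.generate(plain)))
  end

  # @param params_string [String] payload fetched back from Sidekiq
  # @return [Array<Hash>] string-keyed hashes for the params batch builder
  def decode(params_string)
    JSON.parse(Zlib::Inflate.inflate(Base64.strict_decode64(params_string)))
  end
end

# Routed per topic, e.g.: interchanger CompressedInterchanger.new
```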