racecar 0.2.1 → 0.3.0.beta1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: fbf8129bd78b4936e7cf40832517e6619a7f0e17
4
- data.tar.gz: 14f1432deab3e9f29236715e33f14d521e90ba42
3
+ metadata.gz: 21f3ecc4525a97eace384c5702c7f37939920292
4
+ data.tar.gz: 6bf40c5f643a9f0eb8fe6a962793a04e020becd4
5
5
  SHA512:
6
- metadata.gz: 740d45ad33ae89a961338fa08125f9f7365ecb1b5f3a6d45732a43574a6c44e97432ceeed43325df32d713d2d3ec42d258eb2cc44b87319385df72305317a589
7
- data.tar.gz: bb2b1b4eca7e542aaa53067d2adb6219eac9092d3f5a01cf51e9c67c33257f0ec83a6ea26955aa7d8a28105ede42d2c9400173b885552743eb1e1e460a4e3345
6
+ metadata.gz: 17a690c3206dc413c8795b36d2a8d01afbab74bbcd48a2c0e1d64bb8a662ae2d41516a4446d140967d5266e1bc5755b4e0814f269fc6c6f6dc1892eac7e05ed1
7
+ data.tar.gz: c811af7f6f5aa7cdd3953e1e954964ad6903e7740600e9ae16e5b8b2ef27234116b48c60fa6851d9166b157a07f1e44b851c310fd1a784a9717728d8218f43fc
data/.gitignore CHANGED
@@ -7,3 +7,4 @@
7
7
  /pkg/
8
8
  /spec/reports/
9
9
  /tmp/
10
+ /vendor/bundle/
data/README.md CHANGED
@@ -4,6 +4,21 @@ Introducing Racecar, your friendly and easy-to-approach Kafka consumer framework
4
4
 
5
5
  Using [ruby-kafka](https://github.com/zendesk/ruby-kafka) directly can be a challenge: it's a flexible library with lots of knobs and options. Most users don't need that level of flexibility, though. Racecar provides a simple and intuitive way to build and configure Kafka consumers that optionally integrates seamlessly with Rails.
6
6
 
7
+ ## Table of contents
8
+
9
+ 1. [Installation](#installation)
10
+ 2. [Usage](#usage)
11
+    1. [Creating consumers](#creating-consumers)
12
+    2. [Running consumers](#running-consumers)
13
+    3. [Configuration](#configuration)
14
+    4. [Testing consumers](#testing-consumers)
15
+    5. [Deploying consumers](#deploying-consumers)
16
+    6. [Operations](#operations)
17
+ 3. [Development](#development)
18
+ 4. [Contributing](#contributing)
19
+ 5. [Copyright and license](#copyright-and-license)
20
+
21
+
7
22
  ## Installation
8
23
 
9
24
  Add this line to your application's Gemfile:
@@ -98,31 +113,96 @@ subscribes_to "some-topic", start_from_beginning: false
98
113
 
99
114
  Note that once the consumer has started, it will commit the offsets it has processed and, in the future, will resume from those.
100
115
 
116
+ #### Processing messages in batches
117
+
118
+ If you want to process whole _batches_ of messages at a time, simply rename your `#process` method to `#process_batch`. The method will now be called with a "batch" object rather than a message:
119
+
120
+ ```ruby
121
+ class ArchiveEventsConsumer < Racecar::Consumer
122
+   subscribes_to "events"
123
+
124
+   def process_batch(batch)
125
+     file_name = [
126
+       batch.topic, # the topic this batch of messages came from.
127
+       batch.partition, # the partition this batch of messages came from.
128
+       batch.first_offset, # offset of the first message in the batch.
129
+       batch.last_offset, # offset of the last message in the batch.
130
+     ].join("-")
131
+
132
+     File.open(file_name, "w") do |file|
133
+       # the messages in the batch.
134
+       batch.messages.each do |message|
135
+         file << message.value
136
+       end
137
+     end
138
+   end
139
+ end
140
+ ```
141
+
142
+ An important detail is that, if an exception is raised while processing a batch, the _whole batch_ is re-processed.
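Because a failed batch is re-processed as a whole, any side effect inside `#process_batch` may run more than once. The following is a minimal sketch of one way to keep the archiving example safe under retries. It is not part of Racecar: `FakeMessage`, `FakeBatch` and `archive_batch` are hypothetical names, and the structs only model the accessors named in the README text above. The file is written under a temporary name and then renamed, so a retry replaces the file rather than leaving a partial one behind.

```ruby
require "tmpdir"

# Hypothetical stand-ins for the message and batch objects; only the
# accessors mentioned in the README are modelled here.
FakeMessage = Struct.new(:value)
FakeBatch = Struct.new(:topic, :partition, :first_offset, :last_offset, :messages)

# Writes the batch to a file named after its coordinates. The data goes to a
# temporary file first and is renamed into place afterwards, so re-processing
# the same batch overwrites the previous result instead of appending to it.
def archive_batch(batch, dir)
  name = [batch.topic, batch.partition, batch.first_offset, batch.last_offset].join("-")
  path = File.join(dir, name)
  tmp_path = "#{path}.tmp"

  File.open(tmp_path, "w") do |file|
    batch.messages.each { |message| file << message.value }
  end

  File.rename(tmp_path, path)
  path
end
```

Calling `archive_batch` twice with the same batch leaves a single file with the same content, which is the property you want when the whole batch can be delivered again.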
143
+
101
144
  ### Running consumers
102
145
 
103
146
  Racecar is first and foremost an executable _consumer runner_. The `racecar` executable takes as argument the name of the consumer class that should be run. Racecar automatically loads your Rails application before starting, and you can load any other library you need by passing the `--require` flag, e.g.
104
147
 
105
148
  $ bundle exec racecar --require dance_moves TapDanceConsumer
106
149
 
150
+ The first time you execute `racecar` with a consumer class, a _consumer group_ will be created with a group id derived from the class name (this can be configured). If you start `racecar` with the same consumer class argument multiple times, the processes will join the existing group – even if you start them on other nodes. You will typically want to have at least two consumers in each of your groups – preferably on separate nodes – in order to deal with failures.
151
+
107
152
  ### Configuration
108
153
 
109
154
  Racecar provides a flexible way to configure your consumer in a way that feels at home in a Rails application. If you haven't already, run `bundle exec rails generate racecar:install` in order to generate a config file. You'll get a separate section for each Rails environment, with the common configuration values in a shared `common` section.
110
155
 
111
- The possible configuration keys are:
156
+ **Note:** many of these configuration keys correspond directly to similarly named concepts in [ruby-kafka](https://github.com/zendesk/ruby-kafka); for more details on low-level operations, read that project's documentation.
112
157
 
113
- * `brokers` (_optional_) A list of Kafka brokers in the cluster that you're consuming from. Defaults to `localhost` on port 9092, the default Kafka port.
114
- * `client_id` (_optional_) – A string used to identify the client in logs and metrics.
115
- * `group_id_prefix` (_optional_) – A prefix used when generating consumer group names. For instance, if you set the prefix to be `kevin.` and your consumer class is named `BaconConsumer`, the resulting consumer group will be named `kevin.bacon_consumer`.
116
- * `offset_commit_interval` (_optional_) – How often to save the consumer's position in Kafka.
117
- * `heartbeat_interval` (_optional_) – How often to send a heartbeat message to Kafka.
118
- * `pause_timeout` (_optional_) – How long to pause a partition for if the consumer raises an exception while processing a message.
119
- * `connect_timeout` (_optional_) – How long to wait when trying to connect to a Kafka broker. Default is 10 seconds.
120
- * `socket_timeout` (_optional_) – How long to wait when trying to communicate with a Kafka broker. Default is 30 seconds.
121
- * `max_wait_time` (_optional_) – How long to allow the Kafka brokers to wait before returning messages. A higher number means larger batches, at the cost of higher latency. Default is 5 seconds.
158
+ It's also possible to configure Racecar using environment variables. For any given configuration key, there should be a corresponding environment variable with the prefix `RACECAR_`, in upper case. For instance, in order to configure the client id, set `RACECAR_CLIENT_ID=some-id` in the process in which the Racecar consumer is launched. You can set `brokers` by passing a comma-separated list, e.g. `RACECAR_BROKERS=kafka1:9092,kafka2:9092,kafka3:9092`.
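The naming rule for environment variables can be sketched as follows. This is an illustration only, not Racecar's loader: `env_key_for` and `read_brokers` are hypothetical helpers, and a plain hash stands in for `ENV`.

```ruby
# Hypothetical helper: maps a configuration key to its environment variable
# name by upper-casing it and adding the RACECAR_ prefix.
def env_key_for(config_key)
  "RACECAR_#{config_key.to_s.upcase}"
end

# Hypothetical helper: reads the broker list from an ENV-like hash and splits
# the comma-separated value into an array. Returns nil if the variable is unset.
def read_brokers(env)
  value = env[env_key_for(:brokers)]
  value && value.split(",")
end

# A hash standing in for the process environment.
env = {
  "RACECAR_CLIENT_ID" => "some-id",
  "RACECAR_BROKERS" => "kafka1:9092,kafka2:9092,kafka3:9092",
}

puts env[env_key_for(:client_id)]
puts read_brokers(env).inspect
```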
122
159
 
123
- Note that many of these configuration keys correspond directly with similarly named concepts in [ruby-kafka](https://github.com/zendesk/ruby-kafka) for more details on low-level operations, read that project's documentation.
160
+ #### Basic configuration
161
+
162
+ * `brokers` – A list of Kafka brokers in the cluster that you're consuming from. Defaults to `localhost` on port 9092, the default Kafka port.
163
+ * `client_id` – A string used to identify the client in logs and metrics.
164
+ * `group_id` – The group id to use for a given group of consumers. Note that this _must_ be different for each consumer class. If left blank a group id is generated based on the consumer class name.
165
+ * `group_id_prefix` – A prefix used when generating consumer group names. For instance, if you set the prefix to be `kevin.` and your consumer class is named `BaconConsumer`, the resulting consumer group will be named `kevin.bacon_consumer`.
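To make the `kevin.bacon_consumer` example concrete, here is a minimal sketch of how such a group id could be derived. The helper name `generated_group_id` and the underscoring rule are assumptions made for illustration; the diff does not show how Racecar itself generates the id.

```ruby
# Hypothetical helper: converts a CamelCase class name to snake_case and
# prepends the configured prefix. Assumes a simple, non-namespaced class name.
def generated_group_id(class_name, prefix = "")
  underscored = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{prefix}#{underscored}"
end

puts generated_group_id("BaconConsumer", "kevin.")
```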
166
+
167
+ #### Consumer checkpointing
168
+
169
+ The consumers will checkpoint their positions from time to time in order to be able to recover from failures. This is called _committing offsets_, since it's done by tracking the offset reached in each partition being processed, and committing those offset numbers to the Kafka offset storage API. If you can tolerate more double-processing after a failure, you can increase the interval between commits in order to improve performance. You can also do the opposite if you prefer less chance of double-processing.
170
+
171
+ * `offset_commit_interval` – How often to save the consumer's position in Kafka. Default is every 10 seconds.
172
+ * `offset_commit_threshold` – How many messages to process before forcing a checkpoint. Default is 0, which means there's no limit. Setting this to e.g. 100 makes the consumer stop every 100 messages to checkpoint its position.
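The interplay between the two settings can be illustrated with a small model. This is a sketch under stated assumptions, not the logic used by Racecar or ruby-kafka: `CheckpointPolicy` is a hypothetical class, time is passed in as plain numbers of seconds, and a commit is considered due when either the interval has elapsed or the message threshold has been reached.

```ruby
# Hypothetical model of when a consumer should commit its offsets.
class CheckpointPolicy
  def initialize(interval:, threshold: 0, now: 0)
    @interval = interval     # seconds between commits
    @threshold = threshold   # messages between commits; 0 means no limit
    @last_commit_at = now
    @uncommitted = 0
  end

  # Records that one more message has been processed.
  def mark_processed
    @uncommitted += 1
  end

  # True when there is something to commit and either limit has been hit.
  def commit_due?(now)
    return false if @uncommitted.zero?

    threshold_reached = @threshold > 0 && @uncommitted >= @threshold
    interval_elapsed = (now - @last_commit_at) >= @interval
    threshold_reached || interval_elapsed
  end

  # Resets the counters after a successful commit.
  def committed!(now)
    @last_commit_at = now
    @uncommitted = 0
  end
end
```

With `interval: 10, threshold: 100`, the model asks for a commit after 100 messages even if only one second has passed; with the default threshold of 0, only the interval matters.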
173
+
174
+ #### Timeouts & intervals
175
+
176
+ All timeouts are defined in number of seconds.
177
+
178
+ * `heartbeat_interval` – How often to send a heartbeat message to Kafka.
179
+ * `pause_timeout` – How long to pause a partition for if the consumer raises an exception while processing a message.
180
+ * `connect_timeout` – How long to wait when trying to connect to a Kafka broker. Default is 10 seconds.
181
+ * `socket_timeout` – How long to wait when trying to communicate with a Kafka broker. Default is 30 seconds.
182
+ * `max_wait_time` – How long to allow the Kafka brokers to wait before returning messages. A higher number means larger batches, at the cost of higher latency. Default is 5 seconds.
183
+
184
+ #### SSL encryption, authentication & authorization
185
+
186
+ * `ssl_ca_cert` – A valid SSL certificate authority, as a string.
187
+ * `ssl_ca_cert_file_path` – The path to a valid SSL certificate authority file.
188
+ * `ssl_client_cert` – A valid SSL client certificate, as a string.
189
+ * `ssl_client_cert_key` – A valid SSL client certificate key, as a string.
190
+
191
+ #### SASL encryption, authentication & authorization
192
+
193
+ Racecar has support for using SASL to authenticate clients using either the GSSAPI or PLAIN mechanism.
194
+
195
+ If using GSSAPI:
196
+
197
+ * `sasl_gssapi_principal` – The GSSAPI principal.
198
+ * `sasl_gssapi_keytab` – Optional GSSAPI keytab.
199
+
200
+ If using PLAIN:
201
+
202
+ * `sasl_plain_authzid` – The authorization identity to use.
203
+ * `sasl_plain_username` – The username used to authenticate.
204
+ * `sasl_plain_password` – The password used to authenticate.
124
205
 
125
- It's also possible to configure Racecar using environment variables. For any given configuration key, there should be a corresponding environment variable with the prefix `RACECAR_`, in upper case. For instance, in order to configure the client id, set `RACECAR_CLIENT_ID=some-id` in the process in which the Racecar consumer is launched. You can set `brokers` by passing a comma-separated list, e.g. `RACECAR_BROKERS=kafka1:9092,kafka2:9092,kafka3:9092`.
126
206
 
127
207
  ### Testing consumers
128
208
 
@@ -158,6 +238,30 @@ describe CreateContactsConsumer do
158
238
  end
159
239
  ```
160
240
 
241
+
242
+ ### Deploying consumers
243
+
244
+ If you're already deploying your Rails application using e.g. [Capistrano](http://capistranorb.com/), all you need to do to run your Racecar consumers in production is to have some _process supervisor_ start the processes and manage them for you.
245
+
246
+ [Foreman](https://ddollar.github.io/foreman/) is a very straightforward tool for interfacing with several process supervisor systems. You define your process types in a Procfile, e.g.
247
+
248
+ ```
249
+ racecar-process-payments: bundle exec racecar ProcessPaymentsConsumer
250
+ racecar-resize-images: bundle exec racecar ResizeImagesConsumer
251
+ ```
252
+
253
+ If you've ever used Heroku you'll recognize the format – indeed, deploying to Heroku should just work if you add Racecar invocations to your Procfile.
254
+
255
+ With Foreman, you can easily run these processes locally by executing `foreman run`; in production you'll want to _export_ to another process management format such as Upstart or Runit. [capistrano-foreman](https://github.com/hyperoslo/capistrano-foreman) allows you to do this with Capistrano.
256
+
257
+
258
+ ### Operations
259
+
260
+ In order to gracefully shut down a Racecar consumer process, send it the `SIGTERM` signal. Most process supervisors such as Runit and Kubernetes send this signal when shutting down a process, so using those systems will make things easier.
261
+
262
+ In order to introspect the configuration of a consumer process, send it the `SIGUSR1` signal. This will make Racecar print its configuration to the standard error file descriptor associated with the consumer process, so you'll need to know where that is written to.
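The two signals can be pictured with a small stand-alone script. This is a sketch, not Racecar's runner: the `config` hash and the `describe_config` helper are hypothetical, and the `running` flag stands in for whatever the real consumer uses to stop its loop. Only Ruby's built-in `trap` is used.

```ruby
# Hypothetical configuration, standing in for the consumer's real settings.
config = { "client_id" => "racecar", "brokers" => ["localhost:9092"] }

# Formats each setting as `key = value`, one setting per line.
def describe_config(config)
  config.map { |key, value| "#{key} = #{value.inspect}" }.join("\n")
end

# Flag a processing loop would check between messages.
running = true

# SIGTERM: ask the loop to finish its current work and exit.
trap("TERM") { running = false }

# SIGUSR1: print the configuration to standard error.
trap("USR1") { $stderr.puts describe_config(config) }
```

From a shell, `kill -USR1 <pid>` would then print the settings to wherever the process's standard error is redirected, and `kill -TERM <pid>` would request a graceful stop.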
263
+
264
+
161
265
  ## Development
162
266
 
163
267
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
@@ -170,7 +274,7 @@ Bug reports and pull requests are welcome on GitHub at https://github.com/zendes
170
274
 
171
275
  ## Copyright and license
172
276
 
173
- Copyright 2017 Zendesk
277
+ Copyright 2017 Daniel Schierbeck & Zendesk
174
278
 
175
279
  Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
176
280
 
data/examples/batch_consumer.rb ADDED
@@ -0,0 +1,9 @@
1
+ class BatchConsumer < Racecar::Consumer
2
+   subscribes_to "messages", start_from_beginning: false
3
+
4
+   def process_batch(batch)
5
+     batch.messages.each do |message|
6
+       puts message.value
7
+     end
8
+   end
9
+ end
data/lib/racecar/cli.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  require "optparse"
2
- require "racecar/config_loader"
2
+ require "racecar/rails_config_file_loader"
3
3
 
4
4
  module Racecar
5
5
  module Cli
@@ -24,7 +24,7 @@ module Racecar
24
24
 
25
25
  puts "=> Starting Racecar consumer #{consumer_name}..."
26
26
 
27
- ConfigLoader.load!
27
+ RailsConfigFileLoader.load!
28
28
 
29
29
  # Find the consumer class by name.
30
30
  consumer_class = Kernel.const_get(consumer_name)
data/lib/racecar/config.rb CHANGED
@@ -7,8 +7,10 @@ module Racecar
7
7
  ALLOWED_KEYS = %w(
8
8
  brokers
9
9
  client_id
10
+
10
11
  offset_commit_interval
11
12
  offset_commit_threshold
13
+
12
14
  heartbeat_interval
13
15
  pause_timeout
14
16
  connect_timeout
@@ -16,9 +18,21 @@ module Racecar
16
18
  group_id_prefix
17
19
  group_id
18
20
  subscriptions
19
- error_handler
20
21
  max_wait_time
22
+
23
+ error_handler
21
24
  log_to_stdout
25
+
26
+ ssl_ca_cert
27
+ ssl_ca_cert_file_path
28
+ ssl_client_cert
29
+ ssl_client_cert_key
30
+
31
+ sasl_gssapi_principal
32
+ sasl_gssapi_keytab
33
+ sasl_plain_authzid
34
+ sasl_plain_username
35
+ sasl_plain_password
22
36
  )
23
37
 
24
38
  REQUIRED_KEYS = %w(
@@ -70,6 +84,12 @@ module Racecar
70
84
  load_env!
71
85
  end
72
86
 
87
+ def inspect
88
+ ALLOWED_KEYS
89
+ .map {|key| [key, get(key).inspect].join(" = ") }
90
+ .join("\n")
91
+ end
92
+
73
93
  def validate!
74
94
  REQUIRED_KEYS.each do |key|
75
95
  if send(key).nil?
@@ -99,6 +119,10 @@ module Racecar
99
119
  load(data)
100
120
  end
101
121
 
122
+ def get(key)
123
+ public_send(key)
124
+ end
125
+
102
126
  def set(key, value)
103
127
  unless ALLOWED_KEYS.include?(key.to_s)
104
128
  raise ConfigError, "unknown configuration key `#{key}`"
@@ -148,6 +172,8 @@ module Racecar
148
172
  loader.integer(:connect_timeout)
149
173
  loader.integer(:socket_timeout)
150
174
  loader.integer(:max_wait_time)
175
+
176
+ loader.validate!
151
177
  end
152
178
  end
153
179
  end
data/lib/racecar/ctl.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  require "optparse"
2
- require "racecar/config_loader"
2
+ require "racecar/rails_config_file_loader"
3
3
 
4
4
  module Racecar
5
5
  class Ctl
@@ -46,7 +46,7 @@ module Racecar
46
46
  raise Racecar::Error, "no message value specified"
47
47
  end
48
48
 
49
- ConfigLoader.load!
49
+ RailsConfigFileLoader.load!
50
50
 
51
51
  Racecar.config.validate!
52
52
 
data/lib/racecar/env_loader.rb CHANGED
@@ -3,6 +3,7 @@ module Racecar
3
3
  def initialize(env, config)
4
4
  @env = env
5
5
  @config = config
6
+ @loaded_keys = []
6
7
  end
7
8
 
8
9
  def string(name)
@@ -23,6 +24,16 @@ module Racecar
23
24
  set(name) {|value| value.split(",") }
24
25
  end
25
26
 
27
+ def validate!
28
+ # Make sure the user hasn't made a typo and added a key we don't know
29
+ # about.
30
+ @env.keys.grep(/^RACECAR_/).each do |key|
31
+ unless @loaded_keys.include?(key)
32
+ raise ConfigError, "unknown config variable #{key}"
33
+ end
34
+ end
35
+ end
36
+
26
37
  private
27
38
 
28
39
  def set(name)
@@ -31,6 +42,7 @@ module Racecar
31
42
  if @env.key?(key)
32
43
  value = yield @env.fetch(key)
33
44
  @config.set(name, value)
45
+ @loaded_keys << key
34
46
  end
35
47
  end
36
48
  end
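The effect of the new `validate!` check can be shown with a simplified, stand-alone loader. This is a sketch under assumptions: `SketchEnvLoader` and its local `ConfigError` are hypothetical, it only supports string values, and it stores values in a hash rather than in a Racecar config object. The idea matches the hunks above: every variable that is read gets recorded, and any remaining `RACECAR_` variable is reported as unknown.

```ruby
# Local error class for the sketch; not the library's own.
class ConfigError < StandardError; end

# Hypothetical, simplified loader that remembers which RACECAR_ variables it
# has read, so that leftovers can be reported as probable typos.
class SketchEnvLoader
  attr_reader :values

  def initialize(env)
    @env = env
    @values = {}
    @loaded_keys = []
  end

  # Loads a string setting if the matching variable is present.
  def string(name)
    key = "RACECAR_#{name.to_s.upcase}"
    return unless @env.key?(key)

    @values[name] = @env.fetch(key)
    @loaded_keys << key
  end

  # Raises if the environment contains a RACECAR_ variable that no loader
  # call has consumed. Must run after all settings have been loaded.
  def validate!
    unknown = @env.keys.grep(/\ARACECAR_/) - @loaded_keys
    raise ConfigError, "unknown config variable #{unknown.first}" unless unknown.empty?
  end
end
```

Note the ordering constraint: a variable only counts as known once it has been loaded, which is why the check has to come last.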
@@ -1,5 +1,5 @@
1
1
  module Racecar
2
- module ConfigLoader
2
+ module RailsConfigFileLoader
3
3
  def self.load!
4
4
  config_file = "config/racecar.yml"
5
5
 
data/lib/racecar/runner.rb CHANGED
@@ -15,6 +15,9 @@ module Racecar
15
15
  logger: logger,
16
16
  connect_timeout: config.connect_timeout,
17
17
  socket_timeout: config.socket_timeout,
18
+ ssl_ca_cert: config.ssl_ca_cert,
19
+ ssl_client_cert: config.ssl_client_cert,
20
+ ssl_client_cert_key: config.ssl_client_cert_key,
18
21
  )
19
22
 
20
23
  consumer = kafka.consumer(
@@ -29,6 +32,9 @@ module Racecar
29
32
  trap("INT") { consumer.stop }
30
33
  trap("TERM") { consumer.stop }
31
34
 
35
+ # Print the consumer config to STDERR on USR1.
36
+ trap("USR1") { $stderr.puts config.inspect }
37
+
32
38
  config.subscriptions.each do |subscription|
33
39
  consumer.subscribe(
34
40
  subscription.topic,
@@ -38,8 +44,16 @@ module Racecar
38
44
  end
39
45
 
40
46
  begin
41
- consumer.each_message(max_wait_time: config.max_wait_time) do |message|
42
- processor.process(message)
47
+ if processor.respond_to?(:process)
48
+ consumer.each_message(max_wait_time: config.max_wait_time) do |message|
49
+ processor.process(message)
50
+ end
51
+ elsif processor.respond_to?(:process_batch)
52
+ consumer.each_batch(max_wait_time: config.max_wait_time) do |batch|
53
+ processor.process_batch(batch)
54
+ end
55
+ else
56
+ raise NotImplementedError, "Consumer class must implement process or process_batch method"
43
57
  end
44
58
  rescue Kafka::ProcessingError => e
45
59
  @logger.error "Error processing partition #{e.topic}/#{e.partition} at offset #{e.offset}"
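The dispatch rule introduced in this hunk can be tried out in isolation. The sketch below is not the runner itself: `PerMessageConsumer`, `PerBatchConsumer` and `dispatch` are hypothetical, and a plain array stands in for a Kafka batch. It mirrors the order of the checks above, so a class that defines both methods would be driven through `process`.

```ruby
# Hypothetical consumer that handles one message at a time.
class PerMessageConsumer
  attr_reader :seen

  def initialize
    @seen = []
  end

  def process(message)
    @seen << message
  end
end

# Hypothetical consumer that handles a whole batch at once.
class PerBatchConsumer
  attr_reader :seen

  def initialize
    @seen = []
  end

  def process_batch(batch)
    @seen << batch
  end
end

# Delivers a batch (here just an array) to the processor, choosing the
# delivery style from the methods it responds to.
def dispatch(processor, batch)
  if processor.respond_to?(:process)
    batch.each { |message| processor.process(message) }
  elsif processor.respond_to?(:process_batch)
    processor.process_batch(batch)
  else
    raise NotImplementedError, "Consumer class must implement process or process_batch method"
  end
end
```

One detail worth keeping in mind: `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it.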
data/lib/racecar/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Racecar
2
- VERSION = "0.2.1"
2
+ VERSION = "0.3.0.beta1"
3
3
  end
data/racecar.gemspec CHANGED
@@ -20,7 +20,7 @@ Gem::Specification.new do |spec|
20
20
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
21
  spec.require_paths = ["lib"]
22
22
 
23
- spec.add_runtime_dependency "ruby-kafka", "~> 0.3"
23
+ spec.add_runtime_dependency "ruby-kafka", "~> 0.4"
24
24
 
25
25
  spec.add_development_dependency "bundler", "~> 1.13"
26
26
  spec.add_development_dependency "rake", "~> 10.0"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: racecar
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.1
4
+ version: 0.3.0.beta1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Daniel Schierbeck
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: exe
11
11
  cert_chain: []
12
- date: 2017-07-17 00:00:00.000000000 Z
12
+ date: 2017-08-03 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: ruby-kafka
@@ -17,14 +17,14 @@ dependencies:
17
17
  requirements:
18
18
  - - "~>"
19
19
  - !ruby/object:Gem::Version
20
- version: '0.3'
20
+ version: '0.4'
21
21
  type: :runtime
22
22
  prerelease: false
23
23
  version_requirements: !ruby/object:Gem::Requirement
24
24
  requirements:
25
25
  - - "~>"
26
26
  - !ruby/object:Gem::Version
27
- version: '0.3'
27
+ version: '0.4'
28
28
  - !ruby/object:Gem::Dependency
29
29
  name: bundler
30
30
  requirement: !ruby/object:Gem::Requirement
@@ -85,6 +85,7 @@ files:
85
85
  - Rakefile
86
86
  - bin/console
87
87
  - bin/setup
88
+ - examples/batch_consumer.rb
88
89
  - examples/cat_consumer.rb
89
90
  - exe/racecar
90
91
  - exe/racecarctl
@@ -95,10 +96,10 @@ files:
95
96
  - lib/racecar.rb
96
97
  - lib/racecar/cli.rb
97
98
  - lib/racecar/config.rb
98
- - lib/racecar/config_loader.rb
99
99
  - lib/racecar/consumer.rb
100
100
  - lib/racecar/ctl.rb
101
101
  - lib/racecar/env_loader.rb
102
+ - lib/racecar/rails_config_file_loader.rb
102
103
  - lib/racecar/runner.rb
103
104
  - lib/racecar/version.rb
104
105
  - racecar.gemspec
@@ -117,9 +118,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
117
118
  version: '0'
118
119
  required_rubygems_version: !ruby/object:Gem::Requirement
119
120
  requirements:
120
- - - ">="
121
+ - - ">"
121
122
  - !ruby/object:Gem::Version
122
- version: '0'
123
+ version: 1.3.1
123
124
  requirements: []
124
125
  rubyforge_project:
125
126
  rubygems_version: 2.4.5.1