ruby-kafka 0.3.12 → 0.3.13.beta1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 29c5259a5c2650e5a9eec0c513aa997a90834f9d
-  data.tar.gz: b4eb6a8de42d27f2a3b215cb690d7c803f706bad
+  metadata.gz: e6c48d1b7996a28caddd510f0daf032fdfca84fe
+  data.tar.gz: 302d2211024a2fdb904947c97cb0c258dc8583eb
 SHA512:
-  metadata.gz: 43c219f2e1956264a864a9ce41e75f2a3b46ac6cc5c248cdc99a28a15b2871ed8fdb1554422797c9ca8e913996874981aadf3f35f23fea30110e616bfc3013e0
-  data.tar.gz: ad849644e4f75761164f41ffad9d69e67e0ac71eb2796d222afc224f1745bc9199491165d38fd346ebf0b8f7da6ea25e60fb9fe7a853b12bc1699f069331d8d5
+  metadata.gz: d5524a95270673acb7e7e0ba2d6907eb774c715fb467925a052a633c1ab0e76de0fe5a16806ef3a7b95bbc29d54bd832a2b0a43d496753f4b5f1a162e4f91dde
+  data.tar.gz: 34f6417d7d0fdb987f1c6fdf69595a9de413ad010ca391f1d5d5fa3632fe781fe9514ae3568d20c2ed120887e288bb42848ebb22ab42125215d50ca16c0523da
data/CHANGELOG.md CHANGED
@@ -4,6 +4,11 @@ Changes and additions to the library will be listed here.
 
 ## Unreleased
 
+## v0.3.13.beta1
+
+- Minimize the number of times messages are reprocessed after a consumer group resync.
+- Improve instrumentation of the async producer.
+
 ## v0.3.12
 
 - Fix a bug in the consumer.
data/README.md CHANGED
@@ -26,8 +26,9 @@ Although parts of this library work with Kafka 0.8 – specifically, the Produce
     1. [Consumer Groups](#consumer-groups)
     2. [Consumer Checkpointing](#consumer-checkpointing)
     3. [Topic Subscriptions](#topic-subscriptions)
-    4. [Consuming Messages in Batches](#consuming-messages-in-batches)
-    5. [Balancing Throughput and Latency](#balancing-throughput-and-latency)
+    4. [Shutting Down a Consumer](#shutting-down-a-consumer)
+    5. [Consuming Messages in Batches](#consuming-messages-in-batches)
+    6. [Balancing Throughput and Latency](#balancing-throughput-and-latency)
 4. [Thread Safety](#thread-safety)
 5. [Logging](#logging)
 6. [Instrumentation](#instrumentation)
@@ -526,6 +527,23 @@ consumer.subscribe("notifications", start_from_beginning: false)
 Once the consumer group has checkpointed its progress in the topic's partitions, the consumers will always start from the checkpointed offsets, regardless of `start_from_beginning`. As such, this setting only applies when the consumer initially starts consuming from a topic.
 
 
+#### Shutting Down a Consumer
+
+In order to shut down a running consumer process cleanly, call `#stop` on it. A common pattern is to trap a process signal and initiate the shutdown from there:
+
+```ruby
+consumer = kafka.consumer(...)
+
+# The consumer can be stopped from the command line by executing
+# `kill -s QUIT <process-id>`.
+trap("QUIT") { consumer.stop }
+
+consumer.each_message do |message|
+  ...
+end
+```
+
+
 #### Consuming Messages in Batches
 
 Sometimes it is easier to deal with messages in batches rather than individually. A _batch_ is a sequence of one or more Kafka messages that all belong to the same topic and partition. One common reason to want to use batches is when some external system has a batch or transactional API.
data/lib/kafka/async_producer.rb CHANGED
@@ -69,7 +69,7 @@ module Kafka
     # @param delivery_interval [Integer] if greater than zero, the number of
     #   seconds between automatic message deliveries.
     #
-    def initialize(sync_producer:, max_queue_size: 1000, delivery_threshold: 0, delivery_interval: 0, instrumenter:)
+    def initialize(sync_producer:, max_queue_size: 1000, delivery_threshold: 0, delivery_interval: 0, instrumenter:, logger:)
       raise ArgumentError unless max_queue_size > 0
       raise ArgumentError unless delivery_threshold >= 0
       raise ArgumentError unless delivery_interval >= 0
@@ -77,6 +77,7 @@ module Kafka
       @queue = Queue.new
       @max_queue_size = max_queue_size
       @instrumenter = instrumenter
+      @logger = logger
 
       @worker = Worker.new(
         queue: @queue,
@@ -102,6 +103,12 @@ module Kafka
       args = [value, **options.merge(topic: topic)]
       @queue << [:produce, args]
 
+      @instrumenter.instrument("enqueue_message.async_producer", {
+        topic: topic,
+        queue_size: @queue.size,
+        max_queue_size: @max_queue_size,
+      })
+
       nil
     end
 
@@ -122,6 +129,7 @@ module Kafka
     # @see Kafka::Producer#shutdown
     # @return [nil]
     def shutdown
+      @timer_thread && @timer_thread.exit
       @queue << [:shutdown, nil]
       @worker_thread && @worker_thread.join
 
@@ -149,6 +157,8 @@ module Kafka
           topic: topic,
         })
 
+        @logger.error "Buffer overflow: failed to enqueue message for #{topic}"
+
         raise BufferOverflow
       end
 
data/lib/kafka/client.rb CHANGED
@@ -200,6 +200,7 @@ module Kafka
         delivery_threshold: delivery_threshold,
         max_queue_size: max_queue_size,
         instrumenter: @instrumenter,
+        logger: @logger,
       )
     end
 
data/lib/kafka/consumer.rb CHANGED
@@ -211,8 +211,22 @@ module Kafka
     end
 
     def join_group
-      @offset_manager.clear_offsets
+      old_generation_id = @group.generation_id
+
       @group.join
+
+      if old_generation_id && @group.generation_id != old_generation_id + 1
+        # We've been out of the group for at least an entire generation, no
+        # sense in trying to hold on to offset data
+        @offset_manager.clear_offsets
+      else
+        # After rejoining the group we may have been assigned a new set of
+        # partitions. Keeping the old offset commits around forever would risk
+        # having the consumer go back and reprocess messages if it's assigned
+        # a partition it used to be assigned to way back. For that reason, we
+        # only keep commits for the partitions that we're still assigned.
+        @offset_manager.clear_offsets_excluding(@group.assigned_partitions)
+      end
     end
 
     def fetch_batches(min_bytes:, max_wait_time:)
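The hunk above hinges on Kafka group generation IDs: every rebalance increments the generation by one, so a gap larger than one means the consumer sat out at least a full generation and its offset data is stale. A minimal sketch of that decision, using a hypothetical `decide_offset_action` helper (not part of ruby-kafka) and made-up generation values:

```ruby
# Illustrative sketch of the generation-id check in `join_group` above.
# Returns which offset-clearing strategy the consumer would pick.
def decide_offset_action(old_generation_id, new_generation_id)
  if old_generation_id && new_generation_id != old_generation_id + 1
    # Missed at least one full generation: all offset data is stale.
    :clear_all_offsets
  else
    # First join or a normal rebalance: keep commits only for
    # partitions the consumer is still assigned.
    :clear_offsets_excluding_assigned
  end
end

decide_offset_action(5, 6)   # => :clear_offsets_excluding_assigned
decide_offset_action(5, 8)   # => :clear_all_offsets (missed generations 6 and 7)
decide_offset_action(nil, 1) # => :clear_offsets_excluding_assigned (first join)
```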
data/lib/kafka/consumer_group.rb CHANGED
@@ -3,7 +3,7 @@ require "kafka/round_robin_assignment_strategy"
 
 module Kafka
   class ConsumerGroup
-    attr_reader :assigned_partitions
+    attr_reader :assigned_partitions, :generation_id
 
     def initialize(cluster:, logger:, group_id:, session_timeout:)
       @cluster = cluster
data/lib/kafka/datadog.rb CHANGED
@@ -206,5 +206,27 @@ module Kafka
 
       attach_to "producer.kafka"
     end
+
+    class AsyncProducerSubscriber < StatsdSubscriber
+      def enqueue_message(event)
+        client = event.payload.fetch(:client_id)
+        topic = event.payload.fetch(:topic)
+        queue_size = event.payload.fetch(:queue_size)
+        max_queue_size = event.payload.fetch(:max_queue_size)
+        queue_fill_ratio = queue_size.to_f / max_queue_size.to_f
+
+        tags = {
+          client: client,
+        }
+
+        # This gets us the avg/max queue size per producer.
+        histogram("producer.queue.size", queue_size, tags: tags)
+
+        # This gets us the avg/max queue fill ratio per producer.
+        histogram("producer.queue.fill_ratio", queue_fill_ratio, tags: tags)
+      end
+
+      attach_to "async_producer.kafka"
+    end
   end
 end
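The Datadog subscriber above derives a queue fill ratio from the `queue_size` and `max_queue_size` fields that the `enqueue_message.async_producer` event carries. The arithmetic can be sketched standalone; the payload values below are made up for illustration:

```ruby
# Sample payload mimicking the enqueue_message.async_producer event above.
payload = { topic: "events", queue_size: 250, max_queue_size: 1000 }

queue_size = payload.fetch(:queue_size)
max_queue_size = payload.fetch(:max_queue_size)

# Integer division would truncate 250 / 1000 to 0, hence the `to_f`
# conversion used in the subscriber.
queue_fill_ratio = queue_size.to_f / max_queue_size.to_f
# => 0.25
```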
data/lib/kafka/offset_manager.rb CHANGED
@@ -59,14 +59,27 @@ module Kafka
     end
 
     def commit_offsets_if_necessary
-      if seconds_since_last_commit >= @commit_interval || commit_threshold_reached?
+      if commit_timeout_reached? || commit_threshold_reached?
         commit_offsets
       end
     end
 
     def clear_offsets
-      @uncommitted_offsets = 0
       @processed_offsets.clear
+
+      # Clear the cached commits from the brokers.
+      @committed_offsets = nil
+    end
+
+    def clear_offsets_excluding(excluded)
+      # Clear all offsets that aren't in `excluded`.
+      @processed_offsets.each do |topic, partitions|
+        partitions.keep_if do |partition, _|
+          excluded.fetch(topic, []).include?(partition)
+        end
+      end
+
+      # Clear the cached commits from the brokers.
       @committed_offsets = nil
     end
 
@@ -81,6 +94,10 @@ module Kafka
       @committed_offsets.offset_for(topic, partition)
     end
 
+    def commit_timeout_reached?
+      @commit_interval != 0 && seconds_since_last_commit >= @commit_interval
+    end
+
     def commit_threshold_reached?
       @commit_threshold != 0 && @uncommitted_offsets >= @commit_threshold
     end
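The new `clear_offsets_excluding` above keeps processed offsets only for partitions present in the assignment map it is given. Its effect on plain hashes can be sketched as follows; the topic names and offsets are made up:

```ruby
# Processed offsets keyed by topic, then partition => offset.
processed_offsets = {
  "greetings" => { 0 => 42, 1 => 13 },
  "events"    => { 0 => 5 },
}

# Partitions this consumer is still assigned after the rebalance.
assigned_partitions = { "greetings" => [0] }

# Same keep_if logic as in clear_offsets_excluding above.
processed_offsets.each do |topic, partitions|
  partitions.keep_if do |partition, _|
    assigned_partitions.fetch(topic, []).include?(partition)
  end
end

processed_offsets # => { "greetings" => { 0 => 42 }, "events" => {} }
```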
data/lib/kafka/version.rb CHANGED
@@ -1,3 +1,3 @@
 module Kafka
-  VERSION = "0.3.12"
+  VERSION = "0.3.13.beta1"
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-kafka
 version: !ruby/object:Gem::Version
-  version: 0.3.12
+  version: 0.3.13.beta1
 platform: ruby
 authors:
 - Daniel Schierbeck
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-07-28 00:00:00.000000000 Z
+date: 2016-08-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -303,9 +303,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: 2.1.0
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
-  - - ">="
+  - - ">"
     - !ruby/object:Gem::Version
-      version: '0'
+      version: 1.3.1
 requirements: []
 rubyforge_project:
 rubygems_version: 2.4.5.1