ruby-kafka 0.3.16 → 0.3.17
- checksums.yaml +4 -4
- data/CHANGELOG.md +8 -0
- data/README.md +37 -30
- data/lib/kafka/client.rb +13 -4
- data/lib/kafka/consumer.rb +1 -0
- data/lib/kafka/consumer_group.rb +3 -1
- data/lib/kafka/fetched_batch.rb +1 -1
- data/lib/kafka/message_buffer.rb +1 -1
- data/lib/kafka/offset_manager.rb +56 -14
- data/lib/kafka/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ae14c68f2b224bf04659de3a17203e7065d5efd7
+  data.tar.gz: d33dd2594e8159eb23d3a8589b789f2b66c5993b
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9f2d374b0237fad7d3678f18ee903640a40bc735f995b3138286928398d5d6b595bc3ce395a86d60810934617cdc2c1e207d305c5b245323c9281ffe4bc33bc5
+  data.tar.gz: 0403e6ba029d666660395fc666dc4003195e9aa947b2c44936415394b12abd14946830d4b369ac5587f5d71858f40c6ff0630628c51602603e98f340e065ef36
data/CHANGELOG.md
CHANGED
@@ -4,11 +4,19 @@ Changes and additions to the library will be listed here.
 
 ## Unreleased
 
+## v0.3.17
+
+- Re-commit previously committed offsets periodically with an interval of half
+  the offset retention time, starting with the first commit (#318).
+- Expose offset retention time in the Consumer API (#316).
+- Don't get blocked when there's temporarily no leader for a topic (#336).
+
 ## v0.3.16
 
 - Fix SSL socket timeout (#283).
 - Update to the latest Datadog gem (#296).
 - Automatically detect private key type (#297).
+- Only fetch messages for subscribed topics (#309).
 
 ## v0.3.15
 
data/README.md
CHANGED
@@ -9,35 +9,35 @@ Although parts of this library work with Kafka 0.8 – specifically, the Produce
 1. [Installation](#installation)
 2. [Compatibility](#compatibility)
 3. [Usage](#usage)
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+    1. [Setting up the Kafka Client](#setting-up-the-kafka-client)
+    2. [Producing Messages to Kafka](#producing-messages-to-kafka)
+        1. [Efficiently Producing Messages](#efficiently-producing-messages)
+            1. [Asynchronously Producing Messages](#asynchronously-producing-messages)
+        2. [Serialization](#serialization)
+        3. [Partitioning](#partitioning)
+        4. [Buffering and Error Handling](#buffering-and-error-handling)
+        5. [Message Durability](#message-durability)
+        6. [Message Delivery Guarantees](#message-delivery-guarantees)
+        7. [Compression](#compression)
+        8. [Producing Messages from a Rails Application](#producing-messages-from-a-rails-application)
+    3. [Consuming Messages from Kafka](#consuming-messages-from-kafka)
+        1. [Consumer Groups](#consumer-groups)
+        2. [Consumer Checkpointing](#consumer-checkpointing)
+        3. [Topic Subscriptions](#topic-subscriptions)
+        4. [Shutting Down a Consumer](#shutting-down-a-consumer)
+        5. [Consuming Messages in Batches](#consuming-messages-in-batches)
+        6. [Balancing Throughput and Latency](#balancing-throughput-and-latency)
+    4. [Thread Safety](#thread-safety)
+    5. [Logging](#logging)
+    6. [Instrumentation](#instrumentation)
+    7. [Monitoring](#monitoring)
+        1. [Reporting Metrics to Datadog](#reporting-metrics-to-datadog)
+    8. [Understanding Timeouts](#understanding-timeouts)
+    9. [Encryption and Authentication using SSL](#encryption-and-authentication-using-ssl)
 4. [Design](#design)
-
-
-
+    1. [Producer Design](#producer-design)
+    2. [Asynchronous Producer Design](#asynchronous-producer-design)
+    3. [Consumer Design](#consumer-design)
 5. [Development](#development)
 6. [Roadmap](#roadmap)
 
@@ -166,7 +166,7 @@ Read the docs for [Kafka::Producer](http://www.rubydoc.info/gems/ruby-kafka/Kafk
 
 #### Asynchronously Producing Messages
 
-A normal producer will block while `#deliver_messages` is sending messages to Kafka,
+A normal producer will block while `#deliver_messages` is sending messages to Kafka, possibly for tens of seconds or even minutes at a time, depending on your timeout and retry settings. Furthermore, you have to call `#deliver_messages` manually, with a frequency that balances batch size with message delay.
 
 In order to avoid blocking during message deliveries you can use the _asynchronous producer_ API. It is mostly similar to the synchronous API, with calls to `#produce` and `#deliver_messages`. The main difference is that rather than blocking, these calls will return immediately. The actual work will be done in a background thread, with the messages and operations being sent from the caller over a thread safe queue.
 
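For orientation, here is a minimal sketch of the asynchronous producer API that the rewritten paragraph describes; the delivery settings are illustrative values, not recommendations:

```ruby
# A sketch of the asynchronous producer described above. A background
# thread delivers buffered messages without blocking the caller.
producer = kafka.async_producer(
  delivery_interval: 10,    # flush buffered messages every 10 seconds...
  delivery_threshold: 100,  # ...or as soon as 100 messages have accumulated.
)

# Returns immediately; the message is handed to the background thread
# over a thread safe queue.
producer.produce("hello", topic: "greetings")

# Deliver any remaining buffered messages and stop the background thread.
producer.shutdown
```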
@@ -505,6 +505,10 @@ By default, offsets are committed every 10 seconds. You can increase the frequen
 
 In addition to the time based trigger it's possible to trigger checkpointing in response to _n_ messages having been processed, known as the _offset commit threshold_. This puts a bound on the number of messages that can be double-processed before the problem is detected. Setting this to 1 will cause an offset commit to take place every time a message has been processed. By default this trigger is disabled.
 
+Stale offsets are periodically purged by the broker. The broker setting `offsets.retention.minutes` controls the retention window for committed offsets, and defaults to 1 day. The length of the retention window, known as _offset retention time_, can be changed for the consumer.
+
+Previously committed offsets are re-committed, to reset the retention window, at the first commit and periodically at an interval of half the _offset retention time_.
+
 ```ruby
 consumer = kafka.consumer(
   group_id: "some-group",
@@ -514,6 +518,9 @@ consumer = kafka.consumer(
 
   # Commit offsets when 100 messages have been processed.
   offset_commit_threshold: 100,
+
+  # Increase the length of time that committed offsets are kept.
+  offset_retention_time: 7 * 60 * 60
 )
 ```
 
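To make the numbers concrete: with the `offset_retention_time` above and the re-commit behavior described earlier, offsets from this consumer are kept for 7 hours and refreshed at half that interval (a worked example, not additional API):

```ruby
offset_retention_time = 7 * 60 * 60           # => 25_200 seconds (7 hours)
recommit_interval = offset_retention_time / 2 # => 12_600 seconds (3.5 hours)
```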
@@ -668,7 +675,7 @@ end
 
 It is highly recommended that you monitor your Kafka client applications in production. Typical problems you'll see are:
 
-* high network
+* high network error rates, which may impact performance and time-to-delivery;
 * producer buffer growth, which may indicate that producers are unable to deliver messages at the rate they're being produced;
 * consumer processing errors, indicating exceptions are being raised in the processing code;
 * frequent consumer rebalances, which may indicate unstable network conditions or consumer configurations.
data/lib/kafka/client.rb
CHANGED
@@ -31,8 +31,8 @@ module Kafka
    # @param socket_timeout [Integer, nil] the timeout setting for socket
    #   connections. See {BrokerPool#initialize}.
    #
-    # @param ssl_ca_cert [String, nil] a PEM encoded CA cert to use with an
-    #   SSL connection.
+    # @param ssl_ca_cert [String, Array<String>, nil] a PEM encoded CA cert, or an Array of
+    #   PEM encoded CA certs, to use with an SSL connection.
    #
    # @param ssl_client_cert [String, nil] a PEM encoded client cert to use with an
    #   SSL connection. Must be used in combination with ssl_client_cert_key.
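A sketch of the new `ssl_ca_cert` form; the file paths are hypothetical:

```ruby
# ssl_ca_cert now accepts a single PEM string or an Array of PEM strings,
# e.g. when brokers present certificates issued by two different CAs.
kafka = Kafka.new(
  seed_brokers: ["kafka1.example.com:9093"],
  ssl_ca_cert: [
    File.read("/etc/ssl/certs/internal-ca.pem"),  # hypothetical path
    File.read("/etc/ssl/certs/legacy-ca.pem"),    # hypothetical path
  ],
)
```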
@@ -216,19 +216,25 @@ module Kafka
    #   not triggered by message processing.
    # @param heartbeat_interval [Integer] the interval between heartbeats; must be less
    #   than the session window.
+    # @param offset_retention_time [Integer] the time period that committed
+    #   offsets will be retained, in seconds. Defaults to the broker setting.
    # @return [Consumer]
-    def consumer(group_id:, session_timeout: 30, offset_commit_interval: 10, offset_commit_threshold: 0, heartbeat_interval: 10)
+    def consumer(group_id:, session_timeout: 30, offset_commit_interval: 10, offset_commit_threshold: 0, heartbeat_interval: 10, offset_retention_time: nil)
      cluster = initialize_cluster
 
      instrumenter = DecoratingInstrumenter.new(@instrumenter, {
        group_id: group_id,
      })
 
+      # The Kafka protocol expects the retention time to be in ms.
+      retention_time = (offset_retention_time && offset_retention_time * 1_000) || -1
+
      group = ConsumerGroup.new(
        cluster: cluster,
        logger: @logger,
        group_id: group_id,
        session_timeout: session_timeout,
+        retention_time: retention_time
      )
 
      offset_manager = OffsetManager.new(
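The conversion above maps the public seconds-based setting onto the protocol's milliseconds field, with `nil` falling through to `-1` (use the broker default). A sketch of both cases:

```ruby
# offset_retention_time is given in seconds; the protocol wants milliseconds.
(nil && nil * 1_000) || -1        # => -1, i.e. the broker's own default applies
(86_400 && 86_400 * 1_000) || -1  # => 86_400_000 ms (one day)

# Opting in from the Consumer API (group name is illustrative):
consumer = kafka.consumer(group_id: "some-group", offset_retention_time: 86_400)
```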
@@ -237,6 +243,7 @@ module Kafka
        logger: @logger,
        commit_interval: offset_commit_interval,
        commit_threshold: offset_commit_threshold,
+        offset_retention_time: offset_retention_time
      )
 
      heartbeat = Heartbeat.new(
@@ -447,7 +454,9 @@ module Kafka
 
      if ca_cert
        store = OpenSSL::X509::Store.new
-        store.add_cert(OpenSSL::X509::Certificate.new(ca_cert))
+        Array(ca_cert).each do |cert|
+          store.add_cert(OpenSSL::X509::Certificate.new(cert))
+        end
        ssl_context.cert_store = store
      end
 
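The `Array()` conversion is what lets `ssl_ca_cert` accept either form without an explicit type check; for reference:

```ruby
Array("-----BEGIN CERTIFICATE-----\n...")  # => ["-----BEGIN CERTIFICATE-----\n..."]
Array(["cert_a", "cert_b"])                # => ["cert_a", "cert_b"]
Array(nil)                                 # => []
```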
data/lib/kafka/consumer.rb
CHANGED
data/lib/kafka/consumer_group.rb
CHANGED
@@ -5,7 +5,7 @@ module Kafka
  class ConsumerGroup
    attr_reader :assigned_partitions, :generation_id
 
-    def initialize(cluster:, logger:, group_id:, session_timeout:)
+    def initialize(cluster:, logger:, group_id:, session_timeout:, retention_time:)
      @cluster = cluster
      @logger = logger
      @group_id = group_id
@@ -16,6 +16,7 @@ module Kafka
      @topics = Set.new
      @assigned_partitions = {}
      @assignment_strategy = RoundRobinAssignmentStrategy.new(cluster: @cluster)
+      @retention_time = retention_time
    end
 
    def subscribe(topic)
@@ -68,6 +69,7 @@ module Kafka
        member_id: @member_id,
        generation_id: @generation_id,
        offsets: offsets,
+        retention_time: @retention_time
      )
 
      response.topics.each do |topic, partitions|
data/lib/kafka/fetched_batch.rb
CHANGED
data/lib/kafka/message_buffer.rb
CHANGED
@@ -56,7 +56,7 @@ module Kafka
      return unless @buffer.key?(topic) && @buffer[topic].key?(partition)
 
      @size -= @buffer[topic][partition].count
-      @bytesize -= @buffer[topic][partition].map(&:bytesize).reduce(:+)
+      @bytesize -= @buffer[topic][partition].map(&:bytesize).reduce(0, :+)
 
      @buffer[topic].delete(partition)
      @buffer.delete(topic) if @buffer[topic].empty?
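The fix matters because `reduce(:+)` on an empty collection returns `nil`, which would blow up the `-=`; `reduce(0, :+)` returns `0` instead:

```ruby
[].reduce(:+)               # => nil -- nil would crash `@bytesize -= ...`
[].reduce(0, :+)            # => 0
[10, 20, 30].reduce(0, :+)  # => 60
```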
data/lib/kafka/offset_manager.rb
CHANGED
@@ -1,6 +1,10 @@
 module Kafka
  class OffsetManager
-    def initialize(cluster:, group:, logger:, commit_interval:, commit_threshold:)
+
+    # The default broker setting for offsets.retention.minutes is 1440.
+    DEFAULT_RETENTION_TIME = 1440 * 60
+
+    def initialize(cluster:, group:, logger:, commit_interval:, commit_threshold:, offset_retention_time:)
      @cluster = cluster
      @group = group
      @logger = logger
@@ -13,6 +17,8 @@ module Kafka
      @committed_offsets = nil
      @resolved_offsets = {}
      @last_commit = Time.now
+      @last_recommit = nil
+      @recommit_interval = (offset_retention_time || DEFAULT_RETENTION_TIME) / 2
    end
 
    def set_default_offset(topic, default_offset)
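With no explicit `offset_retention_time`, the manager assumes the broker default of `offsets.retention.minutes = 1440`, so re-commits happen every 12 hours:

```ruby
DEFAULT_RETENTION_TIME = 1440 * 60              # => 86_400 seconds (1 day)
recommit_interval = DEFAULT_RETENTION_TIME / 2  # => 43_200 seconds (12 hours)
```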
@@ -49,17 +55,15 @@ module Kafka
      end
    end
 
-    def commit_offsets
-      unless @processed_offsets.empty?
-        pretty_offsets = @processed_offsets.flat_map {|topic, partitions|
-          partitions.map {|partition, offset| "#{topic}/#{partition}:#{offset}" }
-        }.join(", ")
-
-        @logger.info "Committing offsets: #{pretty_offsets}"
+    def commit_offsets(recommit = false)
+      offsets = offsets_to_commit(recommit)
+      unless offsets.empty?
+        @logger.info "Committing offsets#{recommit ? ' with recommit' : ''}: #{prettify_offsets(offsets)}"
 
-        @group.commit_offsets(@processed_offsets)
+        @group.commit_offsets(offsets)
 
        @last_commit = Time.now
+        @last_recommit = Time.now if recommit
 
        @uncommitted_offsets = 0
        @committed_offsets = nil
@@ -67,8 +71,9 @@ module Kafka
    end
 
    def commit_offsets_if_necessary
-      if commit_timeout_reached? || commit_threshold_reached?
-        commit_offsets
+      recommit = recommit_timeout_reached?
+      if recommit || commit_timeout_reached? || commit_threshold_reached?
+        commit_offsets(recommit)
      end
    end
 
@@ -107,13 +112,44 @@ module Kafka
      @cluster.resolve_offsets(topic, partitions, default_offset)
    end
 
+    def seconds_since(time)
+      Time.now - time
+    end
+
    def seconds_since_last_commit
-      Time.now - @last_commit
+      seconds_since(@last_commit)
    end
 
-    def committed_offset_for(topic, partition)
+    def committed_offsets
      @committed_offsets ||= @group.fetch_offsets
-      @committed_offsets.offset_for(topic, partition)
+    end
+
+    def committed_offset_for(topic, partition)
+      committed_offsets.offset_for(topic, partition)
+    end
+
+    def offsets_to_commit(recommit = false)
+      if recommit
+        offsets_to_recommit.merge!(@processed_offsets) do |_topic, committed, processed|
+          committed.merge!(processed)
+        end
+      else
+        @processed_offsets
+      end
+    end
+
+    def offsets_to_recommit
+      committed_offsets.topics.each_with_object({}) do |(topic, partition_info), offsets|
+        topic_offsets = partition_info.keys.each_with_object({}) do |partition, partition_map|
+          offset = committed_offsets.offset_for(topic, partition)
+          partition_map[partition] = offset unless offset == -1
+        end
+        offsets[topic] = topic_offsets unless topic_offsets.empty?
+      end
+    end
+
+    def recommit_timeout_reached?
+      @last_recommit.nil? || seconds_since(@last_recommit) >= @recommit_interval
    end
 
    def commit_timeout_reached?
@@ -123,5 +159,11 @@ module Kafka
    def commit_threshold_reached?
      @commit_threshold != 0 && @uncommitted_offsets >= @commit_threshold
    end
+
+    def prettify_offsets(offsets)
+      offsets.flat_map do |topic, partitions|
+        partitions.map { |partition, offset| "#{topic}/#{partition}:#{offset}" }
+      end.join(', ')
+    end
  end
 end
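For reference, the log format produced by `prettify_offsets`, again with hypothetical offsets:

```ruby
prettify_offsets({ "events" => { 0 => 42, 1 => 17 } })
# => "events/0:42, events/1:17"
```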
data/lib/kafka/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-kafka
 version: !ruby/object:Gem::Version
-  version: 0.3.16
+  version: 0.3.17
 platform: ruby
 authors:
 - Daniel Schierbeck
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2017-
+date: 2017-04-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler