ruby-kafka 0.3.9 → 0.3.10
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -0
- data/ISSUE_TEMPLATE.md +23 -0
- data/README.md +49 -42
- data/lib/kafka/client.rb +42 -3
- data/lib/kafka/cluster.rb +40 -0
- data/lib/kafka/consumer.rb +5 -3
- data/lib/kafka/consumer_group.rb +1 -1
- data/lib/kafka/datadog.rb +2 -0
- data/lib/kafka/fetch_operation.rb +8 -1
- data/lib/kafka/offset_manager.rb +12 -2
- data/lib/kafka/version.rb +1 -1
- metadata +3 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c81a0d1df2ae4667c58001fd625b1df17916fad6
+  data.tar.gz: 6dcd1593cbe60056a675eca9b26a86972af41ac7
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 0255775eedb66b4fe2dac35df58854748746ebcebdba436f636105d99a59279211bd8f876fcf760d47fecd42041338991aacd0ced17c771791819a7a4a72267f
+  data.tar.gz: 7b9e20157c087717d6178a6cf5c466e011123c966e44ab384fa4b62c36c113a85bf7435a5e9e7cec2ecfa686b375e0ade50db8507a4589e7d4350d9c3a216456
data/CHANGELOG.md
CHANGED
@@ -4,6 +4,13 @@ Changes and additions to the library will be listed here.
 
 ## Unreleased
 
+## v0.3.10
+
+- Handle brokers becoming unavailable while in a consumer loop (#228).
+- Handle edge case when consuming from the end of a topic (#230).
+- Ensure the library can be loaded without Bundler (#224).
+- Add an API for fetching the last offset in a partition (#232).
+
 ## v0.3.9
 
 - Improve the default durability setting. The producer setting `required_acks` now defaults to `:all` (#210).
data/ISSUE_TEMPLATE.md
ADDED
@@ -0,0 +1,23 @@
+If this is a bug report, please fill out the following:
+
+* Version of Ruby:
+* Version of Kafka:
+* Version of ruby-kafka:
+
+Please verify that the problem you're seeing hasn't been fixed by the current `master` of ruby-kafka.
+
+###### Steps to reproduce
+
+```ruby
+kafka = Kafka.new(...)
+
+# Please write an example that reproduces the problem you're describing.
+```
+
+###### Expected outcome
+
+What you thought would happen when running the example.
+
+###### Actual outcome
+
+What actually happened.
data/README.md
CHANGED
@@ -23,10 +23,11 @@ Although parts of this library work with Kafka 0.8 – specifically, the Produce
     7. [Compression](#compression)
     8. [Producing Messages from a Rails Application](#producing-messages-from-a-rails-application)
 3. [Consuming Messages from Kafka](#consuming-messages-from-kafka)
-    1. [Consumer
-    2. [
-    3. [
-    4. [
+    1. [Consumer Groups](#consumer-groups)
+    2. [Consumer Checkpointing](#consumer-checkpointing)
+    3. [Topic Subscriptions](#topic-subscriptions)
+    4. [Consuming Messages in Batches](#consuming-messages-in-batches)
+    5. [Balancing Throughput and Latency](#balancing-throughput-and-latency)
 4. [Thread Safety](#thread-safety)
 5. [Logging](#logging)
 6. [Instrumentation](#instrumentation)
@@ -110,52 +111,52 @@ kafka = Kafka.new(...)
 kafka.deliver_message("Hello, World!", topic: "greetings")
 ```
 
-This will write the message to a random partition in the `greetings` topic.
-
-#### Efficiently Producing Messages
-
-While `#deliver_message` works fine for infrequent writes, there are a number of downside:
-
-* Kafka is optimized for transmitting _batches_ of messages rather than individual messages, so there's a significant overhead and performance penalty in using the single-message API.
-* The message delivery can fail in a number of different ways, but this simplistic API does not provide automatic retries.
-* The message is not buffered, so if there is an error, it is lost.
-
-The Producer API solves all these problems and more:
+This will write the message to a random partition in the `greetings` topic. If you want to write to a _specific_ partition, pass the `partition` parameter:
 
 ```ruby
-
+# Will write to partition 42.
+kafka.deliver_message("Hello, World!", topic: "greetings", partition: 42)
 ```
 
-
+If you don't know exactly how many partitions are in the topic, or if you'd rather have some level of indirection, you can pass in `partition_key` instead. Two messages with the same partition key will always be assigned to the same partition. This is useful if you want to make sure all messages with a given attribute are always written to the same partition, e.g. all purchase events for a given customer id.
 
 ```ruby
-
+# Partition keys assign a partition deterministically.
+kafka.deliver_message("Hello, World!", topic: "greetings", partition_key: "hello")
 ```
 
-
+Kafka also supports _message keys_. When passed, a message key can be used instead of a partition key. The message key is written alongside the message value and can be read by consumers. Message keys in Kafka can be used for interesting things such as [Log Compaction](http://kafka.apache.org/documentation.html#compaction). See [Partitioning](#partitioning) for more information.
 
 ```ruby
-
+# Set a message key; the key will be used for partitioning since no explicit
+# `partition_key` is set.
+kafka.deliver_message("Hello, World!", key: "hello", topic: "greetings")
 ```
 
-If you need to control which partition a message should be assigned to, you can pass in the `partition` parameter.
 
-```ruby
-producer.produce("hello3", topic: "test-messages", partition: 1)
-```
+#### Efficiently Producing Messages
 
-
+While `#deliver_message` works fine for infrequent writes, there are a number of downsides:
 
-
-
-
+* Kafka is optimized for transmitting _batches_ of messages rather than individual messages, so there's a significant overhead and performance penalty in using the single-message API.
+* The message delivery can fail in a number of different ways, but this simplistic API does not provide automatic retries.
+* The message is not buffered, so if there is an error, it is lost.
 
-
+The Producer API solves all these problems and more:
 
 ```ruby
+# Instantiate a new producer.
+producer = kafka.producer
+
+# Add a message to the producer buffer.
+producer.produce("hello1", topic: "test-messages")
+
+# Deliver the messages to Kafka.
 producer.deliver_messages
 ```
 
+`#produce` will buffer the message in the producer but will _not_ actually send it to the Kafka cluster. Buffered messages are only delivered to the Kafka cluster once `#deliver_messages` is called. Since messages may be destined for different partitions, this could involve writing to more than one Kafka broker. Note that a failure to send all buffered messages after the configured number of retries will result in `Kafka::DeliveryFailed` being raised. This can be rescued and ignored; the messages will be kept in the buffer until the next attempt.
+
 Read the docs for [Kafka::Producer](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Producer) for more details.
 
 #### Asynchronously Producing Messages
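The added paragraph above notes that `Kafka::DeliveryFailed` can be rescued and that failed messages stay in the buffer. A minimal sketch of that pattern (broker address and topic are placeholders; a real application would bound its retries):

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092"])
producer = kafka.producer

producer.produce("hello", topic: "greetings")

begin
  # Attempt to flush the buffered messages to the cluster.
  producer.deliver_messages
rescue Kafka::DeliveryFailed
  # The failed messages are retained in the buffer, so retrying is safe.
  # A production app would cap the number of attempts instead of looping.
  sleep 1
  retry
end
```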
@@ -291,15 +292,23 @@ The producer is designed for resilience in the face of temporary network errors,
 
 Typically, you'd configure the producer to retry failed attempts at sending messages, but sometimes all retries are exhausted. In that case, `Kafka::DeliveryFailed` is raised from `Kafka::Producer#deliver_messages`. If you wish to have your application be resilient to this happening (e.g. if you're logging to Kafka from a web application) you can rescue this exception. The failed messages are still retained in the buffer, so a subsequent call to `#deliver_messages` will still attempt to send them.
 
-Note that there's a maximum buffer size;
+Note that there's a maximum buffer size; by default, it's set to 1,000 messages and 10MB. It's possible to configure both these numbers:
+
+```ruby
+producer = kafka.producer(
+  max_buffer_size: 5_000, # Allow at most 5K messages to be buffered.
+  max_buffer_bytesize: 100_000_000, # Allow at most 100MB to be buffered.
+  ...
+)
+```
 
-A final note on buffers: local buffers give resilience against broker and network failures, and allow higher throughput due to message batching, but they also trade off consistency guarantees for higher
+A final note on buffers: local buffers give resilience against broker and network failures, and allow higher throughput due to message batching, but they also trade off consistency guarantees for higher availability and resilience. If your local process dies while messages are buffered, those messages will be lost. If you require high levels of consistency, you should call `#deliver_messages` immediately after `#produce`.
 
 #### Message Durability
 
-Once the client has delivered a set of messages to a Kafka broker the broker will forward them to its replicas, thus ensuring that a single broker failure will not result in message loss. However, the client can choose _when the leader acknowledges the write_. At one extreme, the client can choose fire-and-forget delivery, not even bothering to check whether the messages have been acknowledged. At the other end, the client can ask the broker to wait until _all_ its replicas have acknowledged the write before returning. This is the safest option, and the default. It's also possible to have the broker return as soon as it has written the messages to its own log but before the replicas have done so. This leaves a window of time where a failure of the leader will result in the messages being lost, although this should not be a common
+Once the client has delivered a set of messages to a Kafka broker the broker will forward them to its replicas, thus ensuring that a single broker failure will not result in message loss. However, the client can choose _when the leader acknowledges the write_. At one extreme, the client can choose fire-and-forget delivery, not even bothering to check whether the messages have been acknowledged. At the other end, the client can ask the broker to wait until _all_ its replicas have acknowledged the write before returning. This is the safest option, and the default. It's also possible to have the broker return as soon as it has written the messages to its own log but before the replicas have done so. This leaves a window of time where a failure of the leader will result in the messages being lost, although this should not be a common occurrence.
 
-Write latency and throughput are
+Write latency and throughput are negatively impacted by having more replicas acknowledge a write, so if you require low-latency, high throughput writes you may want to accept lower durability.
 
 This behavior is controlled by the `required_acks` option to `#producer` and `#async_producer`:
 
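A sketch of the acknowledgement levels described in the hunk above (`kafka` is assumed to be a `Kafka.new(...)` client; pick exactly one level per producer):

```ruby
# Wait for all in-sync replicas to acknowledge (the default, most durable).
producer = kafka.producer(required_acks: :all)

# Wait only for the partition leader's acknowledgement (faster, less durable).
producer = kafka.producer(required_acks: 1)

# Fire and forget (fastest, no delivery guarantee at all).
producer = kafka.producer(required_acks: 0)
```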
@@ -343,7 +352,7 @@ producer.deliver_messages
 That is, once `#deliver_messages` returns we can be sure that Kafka has received the message. Note that there are some big caveats here:
 
 - Depending on how your cluster and topic is configured the message could still be lost by Kafka.
-- If you configure the producer to not require acknowledgements from the Kafka brokers by setting `required_acks` to zero there is no guarantee that the
+- If you configure the producer to not require acknowledgements from the Kafka brokers by setting `required_acks` to zero there is no guarantee that the message will ever make it to a Kafka broker.
 - If you use the asynchronous producer there's no guarantee that messages will have been delivered after `#deliver_messages` returns. A way of blocking until a message has been delivered with the asynchronous producer may be implemented in the future.
 
 It's possible to improve your chances of success when calling `#deliver_messages`, at the price of a longer max latency:
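A hedged sketch of the retry tuning this line refers to; `max_retries` and `retry_backoff` are existing `#producer` options, and the values here are illustrative only:

```ruby
producer = kafka.producer(
  required_acks: :all,
  max_retries: 5,    # Retry a failed delivery up to 5 times before raising Kafka::DeliveryFailed.
  retry_backoff: 5,  # Wait 5 seconds between attempts.
)
```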
@@ -435,25 +444,23 @@ end
 
 **Warning:** The Consumer API is still alpha level and will likely change. The consumer code should not be considered stable, as it hasn't been exhaustively tested in production environments yet.
 
-
+Consuming messages from a Kafka topic is simple:
 
 ```ruby
 require "kafka"
 
 kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])
 
-
-
-messages.each do |message|
+kafka.each_message(topic: "greetings") do |message|
   puts message.offset, message.key, message.value
 end
 ```
 
 While this is great for extremely simple use cases, there are a number of downsides:
 
-- You can only fetch from a single topic
+- You can only fetch from a single topic at a time.
 - If you want to have multiple processes consume from the same topic, there's no way of coordinating which processes should fetch from which partitions.
-- If
+- If the process dies, there's no way to have another process resume fetching from the point in the partition that the original process had reached.
 
 
 #### Consumer Groups
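The Consumer Groups section introduced here addresses exactly those downsides. A minimal usage sketch (brokers, group id, and topic are placeholders):

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092", "kafka2:9092"])

# Consumers sharing the same group id coordinate partition assignment
# between themselves.
consumer = kafka.consumer(group_id: "greetings-group")
consumer.subscribe("greetings")

consumer.each_message do |message|
  puts message.topic, message.partition, message.offset, message.value
end
```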
@@ -487,7 +494,7 @@ Each consumer process will be assigned one or more partitions from each topic th
 
 In order to be able to resume processing after a consumer crashes, each consumer will periodically _checkpoint_ its position within each partition it reads from. Since each partition has a monotonically increasing sequence of message offsets, this works by _committing_ the offset of the last message that was processed in a given partition. Kafka handles these commits and allows another consumer in a group to resume from the last commit when a member crashes or becomes unresponsive.
 
-By default, offsets are committed every 10 seconds. You can increase the frequency, known as the _offset commit interval_, to limit the duration of double-processing scenarios, at the cost of a lower throughput due to the added coordination. If you want to improve
+By default, offsets are committed every 10 seconds. You can increase the frequency, known as the _offset commit interval_, to limit the duration of double-processing scenarios, at the cost of a lower throughput due to the added coordination. If you want to improve throughput, and double-processing is of less concern to you, then you can decrease the frequency.
 
 In addition to the time based trigger it's possible to trigger checkpointing in response to _n_ messages having been processed, known as the _offset commit threshold_. This puts a bound on the number of messages that can be double-processed before the problem is detected. Setting this to 1 will cause an offset commit to take place every time a message has been processed. By default this trigger is disabled.
 
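A sketch of tuning the two checkpointing triggers described above via the `#consumer` options (the values are illustrative):

```ruby
consumer = kafka.consumer(
  group_id: "greetings-group",
  offset_commit_interval: 5,    # Commit offsets every 5 seconds...
  offset_commit_threshold: 100, # ...or after every 100 processed messages.
)
```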
@@ -696,7 +703,7 @@ After checking out the repo, run `bin/setup` to install dependencies. Then, run
 
 ## Roadmap
 
-The current stable release is v0.
+The current stable release is v0.3. This release is running in production at Zendesk, but it's still not recommended that you use it when data loss is unacceptable. It will take a little while until all edge cases have been uncovered and handled.
 
 ### v0.4
 
data/lib/kafka/client.rb
CHANGED
@@ -1,4 +1,5 @@
 require "openssl"
+require "uri"
 
 require "kafka/cluster"
 require "kafka/producer"
@@ -215,6 +216,7 @@ module Kafka
       )
 
       offset_manager = OffsetManager.new(
+        cluster: cluster,
        group: group,
        logger: @logger,
        commit_interval: offset_commit_interval,
@@ -311,9 +313,32 @@ module Kafka
       operation.execute.flat_map {|batch| batch.messages }
     end
 
-    #
-
-
+    # Enumerate all messages in a topic.
+    #
+    # @param topic [String] the topic to consume messages from.
+    #
+    # @param start_from_beginning [Boolean] whether to start from the beginning
+    #   of the topic or just subscribe to new messages being produced. This
+    #   only applies when first consuming a topic partition – once the consumer
+    #   has checkpointed its progress, it will always resume from the last
+    #   checkpoint.
+    #
+    # @param max_wait_time [Integer] the maximum amount of time to wait before
+    #   the server responds, in seconds.
+    #
+    # @param min_bytes [Integer] the minimum number of bytes to wait for. If set to
+    #   zero, the broker will respond immediately, but the response may be empty.
+    #   The default is 1 byte, which means that the broker will respond as soon as
+    #   a message is written to the partition.
+    #
+    # @param max_bytes [Integer] the maximum number of bytes to include in the
+    #   response message set. Default is 1 MB. You need to set this higher if you
+    #   expect messages to be larger than this.
+    #
+    # @return [nil]
+    def each_message(topic:, start_from_beginning: true, max_wait_time: 5, min_bytes: 1, max_bytes: 1048576, &block)
+      default_offset ||= start_from_beginning ? :earliest : :latest
+      offsets = Hash.new { default_offset }
 
       loop do
         operation = FetchOperation.new(
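A usage sketch for the `#each_message` API documented in this hunk (broker and topic are placeholders):

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092"])

# Tail the topic rather than replaying it from the start, waiting at most
# 10 seconds for at least 1 KB of data per fetch.
kafka.each_message(
  topic: "greetings",
  start_from_beginning: false,
  max_wait_time: 10,
  min_bytes: 1024
) do |message|
  puts message.offset, message.value
end
```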
@@ -341,6 +366,7 @@ module Kafka
     #
     # @return [Array<String>] the list of topic names.
     def topics
+      @cluster.clear_target_topics
       @cluster.topics
     end
 
@@ -352,6 +378,19 @@ module Kafka
       @cluster.partitions_for(topic).count
     end
 
+    # Retrieve the offset of the last message in a partition. If there are no
+    # messages in the partition -1 is returned.
+    #
+    # @param topic [String]
+    # @param partition [Integer]
+    # @return [Integer] the offset of the last message in the partition, or -1 if
+    #   there are no messages in the partition.
+    def last_offset_for(topic, partition)
+      # The offset resolution API will return the offset of the "next" message to
+      # be written when resolving the "latest" offset, so we subtract one.
+      @cluster.resolve_offset(topic, partition, :latest) - 1
+    end
+
     # Closes all connections to the Kafka brokers and frees up used resources.
     #
     # @return [nil]
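A usage sketch for the new `#last_offset_for` API added by #232 (topic and partition are placeholders):

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092"])

offset = kafka.last_offset_for("greetings", 0)

if offset == -1
  puts "partition 0 of greetings is empty"
else
  puts "last message in partition 0 has offset #{offset}"
end
```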
data/lib/kafka/cluster.rb
CHANGED
@@ -32,6 +32,11 @@ module Kafka
       @target_topics = Set.new
     end
 
+    # Adds a list of topics to the target list. Only the topics on this list will
+    # be queried for metadata.
+    #
+    # @param topics [Array<String>]
+    # @return [nil]
     def add_target_topics(topics)
       new_topics = Set.new(topics) - @target_topics
 
@@ -44,6 +49,15 @@ module Kafka
       end
     end
 
+    # Clears the list of target topics.
+    #
+    # @see #add_target_topics
+    # @return [nil]
+    def clear_target_topics
+      @target_topics.clear
+      refresh_metadata!
+    end
+
     def mark_as_stale!
       @stale = true
     end
@@ -105,6 +119,32 @@ module Kafka
       raise
     end
 
+    def resolve_offset(topic, partition, offset)
+      add_target_topics([topic])
+      refresh_metadata_if_necessary!
+      broker = get_leader(topic, partition)
+
+      if offset == :earliest
+        offset = -2
+      elsif offset == :latest
+        offset = -1
+      end
+
+      response = broker.list_offsets(
+        topics: {
+          topic => [
+            {
+              partition: partition,
+              time: offset,
+              max_offsets: 1,
+            }
+          ]
+        }
+      )
+
+      response.offset_for(topic, partition)
+    end
+
     def topics
       cluster_info.topics.map(&:topic_name)
     end
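A sketch of how `#resolve_offset` maps the symbolic offsets onto the sentinel values used by Kafka's offsets API. `Cluster` is internal, so `cluster` here is assumed to be obtained from the library's internals rather than the public API:

```ruby
# -2 asks the broker for the earliest available offset;
# -1 asks for the offset that the *next* message will be written to.
earliest = cluster.resolve_offset("greetings", 0, :earliest)
latest   = cluster.resolve_offset("greetings", 0, :latest)

# This is why Client#last_offset_for subtracts one from the :latest result.
last = latest - 1
```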
data/lib/kafka/consumer.rb
CHANGED
@@ -60,8 +60,8 @@ module Kafka
     #
     # Typically you either want to start reading messages from the very
     # beginning of the topic's partitions or you simply want to wait for new
-    # messages to be written. In the former case, set `
-    #
+    # messages to be written. In the former case, set `start_from_beginning`
+    # to true (the default); in the latter, set it to false.
     #
     # @param topic [String] the name of the topic to subscribe to.
     # @param default_offset [Symbol] whether to start from the beginning or the
@@ -189,8 +189,10 @@ module Kafka
       while @running
         begin
           yield
-        rescue HeartbeatError, OffsetCommitError
+        rescue HeartbeatError, OffsetCommitError
           join_group
+        rescue FetchError
+          @cluster.mark_as_stale!
         rescue LeaderNotAvailable => e
           @logger.error "Leader not available; waiting 1s before retrying"
           sleep 1
data/lib/kafka/consumer_group.rb
CHANGED
@@ -71,7 +71,7 @@ module Kafka
         Protocol.handle_error(error_code)
       end
     end
-  rescue
+  rescue Kafka::Error => e
     @logger.error "Error committing offsets: #{e}"
     raise OffsetCommitError, e
   end
data/lib/kafka/datadog.rb
CHANGED
@@ -93,6 +93,7 @@ module Kafka
 
       tags = {
         client: event.payload.fetch(:client_id),
+        group_id: event.payload.fetch(:group_id),
         topic: event.payload.fetch(:topic),
         partition: event.payload.fetch(:partition),
       }
@@ -112,6 +113,7 @@
 
       tags = {
        client: event.payload.fetch(:client_id),
+       group_id: event.payload.fetch(:group_id),
        topic: event.payload.fetch(:topic),
        partition: event.payload.fetch(:partition),
      }
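These hunks add a `group_id` tag to the consumer metrics. A minimal sketch of enabling the Datadog reporter (this assumes the `dogstatsd-ruby` gem is installed and a statsd agent is listening on its default address):

```ruby
require "kafka"

# Requiring this file subscribes to ruby-kafka's instrumentation events and
# emits them as metrics to the local Datadog statsd agent.
require "kafka/datadog"
```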
data/lib/kafka/fetch_operation.rb
CHANGED
@@ -69,7 +69,14 @@ module Kafka
 
       response.topics.flat_map {|fetched_topic|
         fetched_topic.partitions.map {|fetched_partition|
-
+          begin
+            Protocol.handle_error(fetched_partition.error_code)
+          rescue Kafka::Error => e
+            topic = fetched_topic.name
+            partition = fetched_partition.partition
+            @logger.error "Failed to fetch from #{topic}/#{partition}: #{e.message}"
+            raise e
+          end
 
           messages = fetched_partition.messages.map {|message|
             FetchedMessage.new(
data/lib/kafka/offset_manager.rb
CHANGED
@@ -1,6 +1,7 @@
 module Kafka
   class OffsetManager
-    def initialize(group:, logger:, commit_interval:, commit_threshold:)
+    def initialize(cluster:, group:, logger:, commit_interval:, commit_threshold:)
+      @cluster = cluster
       @group = group
       @logger = logger
       @commit_interval = commit_interval
@@ -28,7 +29,16 @@ module Kafka
         committed_offset_for(topic, partition)
       }
 
-      offset
+      # A negative offset means that no offset has been committed, so we need to
+      # resolve the default offset for the topic.
+      if offset < 0
+        offset = @default_offsets.fetch(topic)
+        offset = @cluster.resolve_offset(topic, partition, offset)
+
+        # Make sure we commit this offset so that we don't have to resolve the
+        # default offset every time.
+        mark_as_processed(topic, partition, offset - 1)
+      end
 
       offset
     end
data/lib/kafka/version.rb
CHANGED
@@ -1,3 +1,3 @@
 module Kafka
-  VERSION = "0.3.9"
+  VERSION = "0.3.10"
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-kafka
 version: !ruby/object:Gem::Version
-  version: 0.3.9
+  version: 0.3.10
 platform: ruby
 authors:
 - Daniel Schierbeck
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-
+date: 2016-07-12 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -208,6 +208,7 @@ files:
 - CHANGELOG.md
 - Gemfile
 - Gemfile.lock
+- ISSUE_TEMPLATE.md
 - LICENSE.txt
 - Procfile
 - README.md