ruby-kafka 0.3.15.beta2 → 0.3.15.beta3
- checksums.yaml +4 -4
- data/CHANGELOG.md +4 -0
- data/Gemfile.lock +3 -1
- data/README.md +33 -2
- data/lib/kafka/consumer.rb +26 -6
- data/lib/kafka/version.rb +1 -1
- data/ruby-kafka.gemspec +1 -0
- metadata +16 -2
checksums.yaml CHANGED

```diff
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 99f22b235423d8050945416fa7b423e568ece95e
+  data.tar.gz: c9706320fa69bdcb0bfe20378dd606dbd9d9213c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 1095866be7a6181392ba636f22a171b867ce7c04e340172f18b0f72e6aa3692a47f5dfd98f7360794e75e3cd334260e53c87c7b4bab29b31a5178eb80bfbcf89
+  data.tar.gz: 0f5ed394bb0a57e4b854d254d503a8af10e2189702dfd7f5af7ed208b5232232738ba409044edfad4934805a854a7d08b822c350f1e3f52a18d2b3e9048b04f7
```
data/CHANGELOG.md CHANGED
data/Gemfile.lock CHANGED

```diff
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    ruby-kafka (0.3.
+    ruby-kafka (0.3.15.beta2)
 
 GEM
   remote: https://rubygems.org/
@@ -55,6 +55,7 @@ GEM
     slop (3.6.0)
     snappy (0.0.12)
     thread_safe (0.3.5)
+    timecop (0.8.0)
     tzinfo (1.2.2)
       thread_safe (~> 0.1)
 
@@ -76,6 +77,7 @@ DEPENDENCIES
   ruby-kafka!
   ruby-prof
   snappy
+  timecop
 
 BUNDLED WITH
   1.10.6
```
data/README.md CHANGED

```diff
@@ -34,8 +34,11 @@ Although parts of this library work with Kafka 0.8 – specifically, the Produce
    6. [Instrumentation](#instrumentation)
    7. [Understanding Timeouts](#understanding-timeouts)
    8. [Encryption and Authentication using SSL](#encryption-and-authentication-using-ssl)
-4. [
-
+4. [Design](#design)
+   1. [Producer Design](#producer-design)
+   2. [Asynchronous Producer Design](#asynchronous-producer-design)
+5. [Development](#development)
+6. [Roadmap](#roadmap)
 
 ## Installation
 
@@ -713,6 +716,34 @@ kafka = Kafka.new(
 
 Once client authentication is set up, it is possible to configure the Kafka cluster to [authorize client requests](http://kafka.apache.org/documentation.html#security_authz).
 
+## Design
+
+The library has been designed as a layered system, with each layer having a clear responsibility:
+
+* The **network layer** handles low-level connection tasks, such as keeping open connections to each Kafka broker, reconnecting when there's an error, etc. See [`Kafka::Connection`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Connection) for more details.
+* The **protocol layer** is responsible for encoding and decoding the Kafka protocol's various structures. See [`Kafka::Protocol`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Protocol) for more details.
+* The **operational layer** provides high-level operations, such as fetching messages from a topic, that may involve more than one API request to the Kafka cluster. Some complex operations are made available through [`Kafka::Cluster`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Cluster), which represents an entire cluster, while simpler ones are only available through [`Kafka::Broker`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Broker), which represents a single Kafka broker. In general, `Kafka::Cluster` is the high-level API, with more polish.
+* The **API layer** provides APIs to users of the libraries. The Consumer API is implemented in [`Kafka::Consumer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Consumer) while the Producer API is implemented in [`Kafka::Producer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Producer) and [`Kafka::AsyncProducer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/AsyncProducer).
+* The **configuration layer** provides a way to set up and configure the client, as well as easy entrypoints to the various APIs. [`Kafka::Client`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Client) implements the public APIs. For convenience, the method [`Kafka.new`](http://www.rubydoc.info/gems/ruby-kafka/Kafka.new) can instantiate the class for you.
+
+Note that only the API and configuration layers have any backwards compatibility guarantees – the other layers are considered internal and may change without warning. Don't use them directly.
+
+### Producer Design
+
+The producer is designed with resilience and operational ease of use in mind, sometimes at the cost of raw performance. For instance, the operation is heavily instrumented, allowing operators to monitor the producer at a very granular level.
+
+The producer has two main internal data structures: a list of _pending messages_ and a _message buffer_. When the user calls [`Kafka::Producer#produce`](http://www.rubydoc.info/gems/ruby-kafka/Kafka%2FProducer%3Aproduce), a message is appended to the pending message list, but no network communication takes place. This means that the call site does not have to handle the broad range of errors that can happen at the network or protocol level. Instead, those errors will only happen once [`Kafka::Producer#deliver_messages`](http://www.rubydoc.info/gems/ruby-kafka/Kafka%2FProducer%3Adeliver_messages) is called. This method will go through the pending messages one by one, making sure they're assigned a partition. This may fail for some messages, as it could require knowing the current configuration for the message's topic, necessitating API calls to Kafka. Messages that cannot be assigned a partition are kept in the list, while the others are written into the message buffer. The producer then figures out which topic partitions are led by which Kafka brokers so that messages can be sent to the right place – in Kafka, it is the responsibility of the client to do this routing. A separate _produce_ API request will be sent to each broker; the response will be inspected; and messages that were acknowledged by the broker will be removed from the message buffer. Any messages that were _not_ acknowledged will be kept in the buffer.
+
+If there are any messages left in either the pending message list _or_ the message buffer after this operation, [`Kafka::DeliveryFailed`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) will be raised. This exception must be rescued and handled by the user, possibly by calling `#deliver_messages` at a later time.
+
+### Asynchronous Producer Design
+
+The synchronous producer allows the user fine-grained control over when network activity and the possible errors arising from that will take place, but it requires the user to handle the errors nonetheless. The async producer provides a more hands-off approach that trades off control for ease of use and resilience.
+
+Instead of writing directly into the pending message list, [`Kafka::AsyncProducer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/AsyncProducer) writes the message to an internal thread-safe queue, returning immediately. A background thread reads messages off the queue and passes them to a synchronous producer.
+
+Rather than triggering message deliveries directly, users of the async producer will typically set up _automatic triggers_, such as a timer.
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
```
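To make the synchronous flow described in the new Design section concrete, here is a minimal sketch of the pending list to buffer life cycle. The broker address and topic name are placeholders; the calls themselves (`produce`, `deliver_messages`, `Kafka::DeliveryFailed`) are the ones named in the section above.

```ruby
require "kafka"

# The configuration layer: Kafka.new instantiates a Kafka::Client.
kafka = Kafka.new(seed_brokers: ["kafka1:9092"])

producer = kafka.producer

# Appends to the pending message list; no network communication yet.
producer.produce("hello", topic: "greetings")

begin
  # Assigns partitions, routes messages to the leader brokers, and
  # removes acknowledged messages from the message buffer.
  producer.deliver_messages
rescue Kafka::DeliveryFailed
  # Some messages are still pending or buffered; they can be retried
  # by calling #deliver_messages again after a backoff.
end

producer.shutdown
```

Separating `produce` from `deliver_messages` keeps network and protocol errors out of the call site, at the cost of the caller owning the retry policy.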
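The asynchronous variant, sketched under the same placeholder assumptions; `delivery_interval` and `delivery_threshold` are the async producer's documented trigger options.

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092"])

# A background thread delivers buffered messages every 10 seconds
# (a timer trigger) or once 100 messages have accumulated (a
# threshold trigger).
producer = kafka.async_producer(
  delivery_interval: 10,
  delivery_threshold: 100,
)

# Pushes onto an internal thread-safe queue and returns immediately.
producer.produce("hello", topic: "greetings")

# Delivers anything still buffered and stops the background thread.
producer.shutdown
```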
data/lib/kafka/consumer.rb CHANGED

```diff
@@ -100,12 +100,16 @@ module Kafka
     # idea to simply pause the partition until the error can be resolved, allowing
     # the rest of the partitions to continue being processed.
     #
+    # If the `timeout` argument is passed, the partition will automatically be
+    # resumed when the timeout expires.
+    #
     # @param topic [String]
     # @param partition [Integer]
+    # @param timeout [Integer] the number of seconds to pause the partition for.
     # @return [nil]
-    def pause(topic, partition)
-      @paused_partitions[topic] ||=
-      @paused_partitions[topic]
+    def pause(topic, partition, timeout: nil)
+      @paused_partitions[topic] ||= {}
+      @paused_partitions[topic][partition] = timeout && Time.now + timeout
     end
 
     # Resume processing of a topic partition.
@@ -115,7 +119,7 @@ module Kafka
     # @param partition [Integer]
     # @return [nil]
     def resume(topic, partition)
-      paused_partitions = @paused_partitions.fetch(topic,
+      paused_partitions = @paused_partitions.fetch(topic, {})
       paused_partitions.delete(partition)
     end
 
@@ -126,7 +130,23 @@ module Kafka
     # @param partition [Integer]
     # @return [Boolean] true if the partition is paused, false otherwise.
     def paused?(topic, partition)
-      @paused_partitions.fetch(topic,
+      partitions = @paused_partitions.fetch(topic, {})
+
+      if partitions.key?(partition)
+        # Users can set an optional timeout, after which the partition is
+        # automatically resumed. When pausing, the timeout is translated to an
+        # absolute point in time.
+        timeout = partitions.fetch(partition)
+
+        if timeout.nil?
+          true
+        elsif Time.now < timeout
+          true
+        else
+          resume(topic, partition)
+          false
+        end
+      end
     end
 
     # Fetches and enumerates the messages in the topics that the consumer group
@@ -161,7 +181,7 @@ module Kafka
             topic: message.topic,
             partition: message.partition,
             offset: message.offset,
-            offset_lag: batch.highwater_mark_offset - message.offset,
+            offset_lag: batch.highwater_mark_offset - message.offset - 1,
             key: message.key,
             value: message.value,
           )
```
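The new `timeout:` argument supports the back-off pattern described in the updated docstring. A brief usage sketch, where `process` stands in for a hypothetical application-level handler and the broker, group, and topic names are placeholders:

```ruby
require "kafka"

kafka = Kafka.new(seed_brokers: ["kafka1:9092"])
consumer = kafka.consumer(group_id: "my-group")
consumer.subscribe("events")

consumer.each_message do |message|
  begin
    process(message) # hypothetical application handler
  rescue StandardError
    # Skip this partition for 30 seconds. `paused?` translates the
    # timeout into an absolute deadline and automatically resumes the
    # partition once Time.now passes it, so no explicit resume call
    # is needed.
    consumer.pause(message.topic, message.partition, timeout: 30)
  end
end
```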
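The `offset_lag` fix reflects that a partition's high-water mark is the offset of the *next* message to be written, so the newest available message sits one below it. A quick check of the arithmetic:

```ruby
# A partition holding messages at offsets 0..9 has a high-water mark of 10.
highwater_mark_offset = 10

# A consumer that has just read offset 9 -- the newest message -- is
# fully caught up, so its lag should be zero.
message_offset = 9

offset_lag = highwater_mark_offset - message_offset - 1
#=> 0 (the previous formula reported 1)
```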
data/lib/kafka/version.rb CHANGED

data/ruby-kafka.gemspec CHANGED
metadata CHANGED

```diff
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-kafka
 version: !ruby/object:Gem::Version
-  version: 0.3.15.beta2
+  version: 0.3.15.beta3
 platform: ruby
 authors:
 - Daniel Schierbeck
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-08-
+date: 2016-08-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -192,6 +192,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: timecop
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: |-
   A client library for the Kafka distributed commit log.
 
```