ruby-kafka 0.3.15.beta2 → 0.3.15.beta3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 26c3ed03ad7cdfe31424d9b0f1bd169794631b75
-  data.tar.gz: 0619bf2d30a8f1f48c779911ba3a7b5c0f212b96
+  metadata.gz: 99f22b235423d8050945416fa7b423e568ece95e
+  data.tar.gz: c9706320fa69bdcb0bfe20378dd606dbd9d9213c
 SHA512:
-  metadata.gz: 3c5ff90a634a46859975c4c498350a208f8bfb485c25a6305c0b69a0e6094c3e8d2ef5746bc0968a401df0dc1a36a0ffda45c44fcc1cdf0c3d2ba4834c037ffd
-  data.tar.gz: 9a0ef026324fb0152776e90a6ce40606f4752a1369744f51ae21e19cb89f6de1ae0bdb933660feda3ca95fb77031862c1d343bfdc319c62780e987a45bda6712
+  metadata.gz: 1095866be7a6181392ba636f22a171b867ce7c04e340172f18b0f72e6aa3692a47f5dfd98f7360794e75e3cd334260e53c87c7b4bab29b31a5178eb80bfbcf89
+  data.tar.gz: 0f5ed394bb0a57e4b854d254d503a8af10e2189702dfd7f5af7ed208b5232232738ba409044edfad4934805a854a7d08b822c350f1e3f52a18d2b3e9048b04f7
data/CHANGELOG.md CHANGED
@@ -4,6 +4,10 @@ Changes and additions to the library will be listed here.
 
 ## Unreleased
 
+## v0.3.15.beta3
+
+- Allow setting a timeout on a partition pause (#272).
+
 ## v0.3.15.beta1
 
 - Allow pausing consumption of a partition (#268).
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    ruby-kafka (0.3.8)
+    ruby-kafka (0.3.15.beta2)
 
 GEM
   remote: https://rubygems.org/
@@ -55,6 +55,7 @@ GEM
     slop (3.6.0)
     snappy (0.0.12)
     thread_safe (0.3.5)
+    timecop (0.8.0)
     tzinfo (1.2.2)
       thread_safe (~> 0.1)
 
@@ -76,6 +77,7 @@ DEPENDENCIES
   ruby-kafka!
   ruby-prof
   snappy
+  timecop
 
 BUNDLED WITH
   1.10.6
data/README.md CHANGED
@@ -34,8 +34,11 @@ Although parts of this library work with Kafka 0.8 – specifically, the Produce
   6. [Instrumentation](#instrumentation)
   7. [Understanding Timeouts](#understanding-timeouts)
   8. [Encryption and Authentication using SSL](#encryption-and-authentication-using-ssl)
-4. [Development](#development)
-5. [Roadmap](#roadmap)
+4. [Design](#design)
+  1. [Producer Design](#producer-design)
+  2. [Asynchronous Producer Design](#asynchronous-producer-design)
+5. [Development](#development)
+6. [Roadmap](#roadmap)
 
 ## Installation
 
@@ -713,6 +716,34 @@ kafka = Kafka.new(
 
 Once client authentication is set up, it is possible to configure the Kafka cluster to [authorize client requests](http://kafka.apache.org/documentation.html#security_authz).
 
+## Design
+
+The library has been designed as a layered system, with each layer having a clear responsibility:
+
+* The **network layer** handles low-level connection tasks, such as keeping open connections to each Kafka broker, reconnecting when there's an error, etc. See [`Kafka::Connection`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Connection) for more details.
+* The **protocol layer** is responsible for encoding and decoding the Kafka protocol's various structures. See [`Kafka::Protocol`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Protocol) for more details.
+* The **operational layer** provides high-level operations, such as fetching messages from a topic, that may involve more than one API request to the Kafka cluster. Some complex operations are made available through [`Kafka::Cluster`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Cluster), which represents an entire cluster, while simpler ones are only available through [`Kafka::Broker`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Broker), which represents a single Kafka broker. In general, `Kafka::Cluster` is the high-level API, with more polish.
+* The **API layer** provides APIs to users of the library. The Consumer API is implemented in [`Kafka::Consumer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Consumer), while the Producer API is implemented in [`Kafka::Producer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Producer) and [`Kafka::AsyncProducer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/AsyncProducer).
+* The **configuration layer** provides a way to set up and configure the client, as well as easy entrypoints to the various APIs. [`Kafka::Client`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Client) implements the public APIs. For convenience, the method [`Kafka.new`](http://www.rubydoc.info/gems/ruby-kafka/Kafka.new) can instantiate the class for you.
+
+Note that only the API and configuration layers have any backwards compatibility guarantees – the other layers are considered internal and may change without warning. Don't use them directly.
+
+### Producer Design
+
+The producer is designed with resilience and operational ease of use in mind, sometimes at the cost of raw performance. For instance, the operation is heavily instrumented, allowing operators to monitor the producer at a very granular level.
+
+The producer has two main internal data structures: a list of _pending messages_ and a _message buffer_. When the user calls [`Kafka::Producer#produce`](http://www.rubydoc.info/gems/ruby-kafka/Kafka%2FProducer%3Aproduce), a message is appended to the pending message list, but no network communication takes place. This means that the call site does not have to handle the broad range of errors that can happen at the network or protocol level. Instead, those errors will only happen once [`Kafka::Producer#deliver_messages`](http://www.rubydoc.info/gems/ruby-kafka/Kafka%2FProducer%3Adeliver_messages) is called. This method will go through the pending messages one by one, making sure they're assigned a partition. This may fail for some messages, as it could require knowing the current configuration for the message's topic, necessitating API calls to Kafka. Messages that cannot be assigned a partition are kept in the list, while the others are written into the message buffer. The producer then figures out which topic partitions are led by which Kafka brokers so that messages can be sent to the right place – in Kafka, it is the responsibility of the client to do this routing. A separate _produce_ API request will be sent to each broker; the response will be inspected; and messages that were acknowledged by the broker will be removed from the message buffer. Any messages that were _not_ acknowledged will be kept in the buffer.
+
+If there are any messages left in either the pending message list _or_ the message buffer after this operation, [`Kafka::DeliveryFailed`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) will be raised. This exception must be rescued and handled by the user, possibly by calling `#deliver_messages` at a later time.
+
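The two-stage flow described above (buffer locally on `produce`, do all network work in `deliver_messages`) can be modeled in a few lines of plain Ruby. This is an illustrative sketch with made-up internals (`SketchProducer`, a hash of partition counts standing in for cluster metadata), not the library's actual implementation:

```ruby
# Simplified model of the producer's two internal data structures: a
# pending-message list and a message buffer. Illustrative only.
class SketchProducer
  DeliveryFailed = Class.new(StandardError)

  def initialize(partitions_per_topic)
    @partitions = partitions_per_topic # e.g. { "greetings" => 2 }
    @pending = [] # messages not yet assigned a partition
    @buffer  = [] # partitioned messages awaiting delivery
  end

  # Appending is purely local; no network I/O happens here.
  def produce(value, topic:, key: nil)
    @pending << { value: value, topic: topic, key: key }
    nil
  end

  def deliver_messages
    # Assign partitions; messages for unknown topics stay pending.
    @pending.delete_if do |msg|
      count = @partitions[msg[:topic]]
      next false unless count
      msg[:partition] = (msg[:key] || msg[:value]).hash % count
      @buffer << msg
      true
    end

    # Pretend every broker acknowledged its batch.
    acked = @buffer.dup
    @buffer.clear

    # Anything left in either structure means a failed delivery.
    raise DeliveryFailed unless @pending.empty? && @buffer.empty?
    acked
  end
end

producer = SketchProducer.new("greetings" => 2)
producer.produce("hi", topic: "greetings")
producer.deliver_messages
```

The real producer returns `nil` from `#deliver_messages` and retries unacknowledged batches; the sketch only shows where errors surface, which is the point of the design: the `produce` call site stays error-free.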
+### Asynchronous Producer Design
+
+The synchronous producer gives the user fine-grained control over when network activity – and the errors arising from it – takes place, but it requires the user to handle those errors nonetheless. The async producer provides a more hands-off approach that trades control for ease of use and resilience.
+
+Instead of writing directly into the pending message list, [`Kafka::AsyncProducer`](http://www.rubydoc.info/gems/ruby-kafka/Kafka/AsyncProducer) writes the message to an internal thread-safe queue, returning immediately. A background thread reads messages off the queue and passes them to a synchronous producer.
+
+Rather than triggering message deliveries directly, users of the async producer will typically set up _automatic triggers_, such as a timer.
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
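The queue-plus-background-thread hand-off described in the async producer section can be sketched with Ruby's core `Queue` and `Thread`. The class name and delivery callback are hypothetical; the real `Kafka::AsyncProducer` adds delivery triggers, buffering limits, and error handling on top of this basic shape:

```ruby
# Minimal sketch of the async hand-off: callers push onto a thread-safe
# queue and return immediately, while a background thread drains the
# queue into a synchronous delivery step.
class SketchAsyncProducer
  def initialize(&deliver)
    @queue = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a message arrives; a nil sentinel stops us.
      while (msg = @queue.pop)
        deliver.call(msg) # hand off to the "synchronous producer"
      end
    end
  end

  # Returns immediately; no network work happens on the caller's thread.
  def produce(message)
    @queue << message
    nil
  end

  def shutdown
    @queue << nil # sentinel: stop the worker after draining the queue
    @worker.join
  end
end
```

The design choice worth noting is the sentinel-based shutdown: because `Queue#pop` is FIFO, every message enqueued before `shutdown` is delivered before the worker exits.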
@@ -100,12 +100,16 @@ module Kafka
     # idea to simply pause the partition until the error can be resolved, allowing
     # the rest of the partitions to continue being processed.
     #
+    # If the `timeout` argument is passed, the partition will automatically be
+    # resumed when the timeout expires.
+    #
     # @param topic [String]
     # @param partition [Integer]
+    # @param timeout [Integer] the number of seconds to pause the partition for.
     # @return [nil]
-    def pause(topic, partition)
-      @paused_partitions[topic] ||= Set.new
-      @paused_partitions[topic] << partition
+    def pause(topic, partition, timeout: nil)
+      @paused_partitions[topic] ||= {}
+      @paused_partitions[topic][partition] = timeout && Time.now + timeout
     end
 
     # Resume processing of a topic partition.
@@ -115,7 +119,7 @@ module Kafka
     # @param partition [Integer]
     # @return [nil]
     def resume(topic, partition)
-      paused_partitions = @paused_partitions.fetch(topic, [])
+      paused_partitions = @paused_partitions.fetch(topic, {})
       paused_partitions.delete(partition)
     end
 
@@ -126,7 +130,23 @@ module Kafka
     # @param partition [Integer]
     # @return [Boolean] true if the partition is paused, false otherwise.
     def paused?(topic, partition)
-      @paused_partitions.fetch(topic, []).include?(partition)
+      partitions = @paused_partitions.fetch(topic, {})
+
+      if partitions.key?(partition)
+        # Users can set an optional timeout, after which the partition is
+        # automatically resumed. When pausing, the timeout is translated to an
+        # absolute point in time.
+        timeout = partitions.fetch(partition)
+
+        if timeout.nil?
+          true
+        elsif Time.now < timeout
+          true
+        else
+          resume(topic, partition)
+          false
+        end
+      end
     end
 
     # Fetches and enumerates the messages in the topics that the consumer group
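The pause bookkeeping changed in this hunk can be exercised in isolation. The sketch below mirrors the diff's logic in a hypothetical standalone `PauseList` class (the real methods live on the consumer): each paused partition maps to either `nil` (paused indefinitely) or an absolute `Time` deadline after which it auto-resumes.

```ruby
# Standalone model of the pause/resume bookkeeping from the diff above.
class PauseList
  def initialize
    @paused_partitions = {}
  end

  def pause(topic, partition, timeout: nil)
    @paused_partitions[topic] ||= {}
    # nil means "until resumed"; otherwise store an absolute deadline.
    @paused_partitions[topic][partition] = timeout && Time.now + timeout
    nil
  end

  def resume(topic, partition)
    @paused_partitions.fetch(topic, {}).delete(partition)
    nil
  end

  def paused?(topic, partition)
    partitions = @paused_partitions.fetch(topic, {})
    return false unless partitions.key?(partition)

    deadline = partitions.fetch(partition)
    if deadline.nil? || Time.now < deadline
      true
    else
      resume(topic, partition) # deadline passed: auto-resume
      false
    end
  end
end

pauses = PauseList.new
pauses.pause("greetings", 0, timeout: 30) # auto-resumes after 30 seconds
pauses.paused?("greetings", 0) # => true until the deadline passes
```

Storing the deadline as an absolute `Time` at pause time (rather than a countdown) is what makes `paused?` a cheap comparison, and is also why the test suite gains `timecop` as a development dependency elsewhere in this diff: time can be frozen and advanced deterministically.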
@@ -161,7 +181,7 @@ module Kafka
           topic: message.topic,
           partition: message.partition,
           offset: message.offset,
-          offset_lag: batch.highwater_mark_offset - message.offset,
+          offset_lag: batch.highwater_mark_offset - message.offset - 1,
           key: message.key,
           value: message.value,
         )
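The `- 1` adjustment in the `offset_lag` hunk above reflects the fact that the highwater mark points one past the newest available message. Without it, a consumer that has just read the newest message would report a lag of 1 instead of 0. A quick check with made-up offsets:

```ruby
highwater_mark_offset = 10 # next offset to be written; newest message sits at 9

# Consumer has just processed the newest message (offset 9):
offset_lag = highwater_mark_offset - 9 - 1 # 0, fully caught up

# Consumer is processing offset 5, with offsets 6..9 still unread:
behind_lag = highwater_mark_offset - 5 - 1 # 4 messages behind
```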
data/lib/kafka/version.rb CHANGED
@@ -1,3 +1,3 @@
 module Kafka
-  VERSION = "0.3.15.beta2"
+  VERSION = "0.3.15.beta3"
 end
data/ruby-kafka.gemspec CHANGED
@@ -40,4 +40,5 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "rspec_junit_formatter", "0.2.2"
   spec.add_development_dependency "dogstatsd-ruby"
   spec.add_development_dependency "ruby-prof"
+  spec.add_development_dependency "timecop"
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: ruby-kafka
 version: !ruby/object:Gem::Version
-  version: 0.3.15.beta2
+  version: 0.3.15.beta3
 platform: ruby
 authors:
 - Daniel Schierbeck
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-08-23 00:00:00.000000000 Z
+date: 2016-08-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -192,6 +192,20 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: '0'
+- !ruby/object:Gem::Dependency
+  name: timecop
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
 description: |-
   A client library for the Kafka distributed commit log.