rafka 0.0.10 → 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/README.md CHANGED
@@ -1,13 +1,30 @@
  rafka-rb: Ruby driver for Rafka
  ===============================================================================
+ [![Build Status](https://api.travis-ci.org/skroutz/rafka-rb.svg?branch=master)](https://travis-ci.org/skroutz/rafka-rb)
  [![Gem Version](https://badge.fury.io/rb/rafka.svg)](https://badge.fury.io/rb/rafka-rb)
  [![Documentation](http://img.shields.io/badge/yard-docs-blue.svg)](http://www.rubydoc.info/github/skroutz/rafka-rb)
 
- rafka-rb is a thin Ruby client library for [Rafka](https://github.com/skroutz/rafka),
- providing a consumer and a producer with simple semantics. It is backed by
- [redis-rb](https://github.com/redis/redis-rb).
+ rafka-rb is a Ruby client for [Rafka](https://github.com/skroutz/rafka),
+ providing consumer and producer implementations with simple semantics.
+ It is backed by [redis-rb](https://github.com/redis/redis-rb).
+
+ Refer to the [API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb)
+ for more information.
+
+ Features
+ -------------------------------------------------------------------------------
+
+ - Consumer implementation
+   - consumer groups
+   - offsets may be managed automatically or manually
+ - Producer implementation
+   - support for partition hashing key
 
- View the [API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb).
 
@@ -40,17 +57,19 @@ Usage
  ### Producer
 
  ```ruby
- require "rafka"
+ producer = Rafka::Producer.new(host: "localhost", port: 6380)
+ producer.produce("greetings", "Hello there!")
+ ```
+
+ Refer to the [Producer API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb/Rafka/Producer)
+ for more information.
 
- prod = Rafka::Producer.new(host: "localhost", port: 6380)
 
- # Produce to topic "greetings". The message will be assigned to a random partition.
- prod.produce("greetings", "Hello there!")
 
- # Produce using a key. Two or more messages with the same key will always be assigned to the same partition.
- prod.produce("greetings", "Hello there!", key: "hi")
- prod.produce("greetings", "Hi there!", key: "hi")
- ```
 
@@ -58,14 +77,66 @@ prod.produce("greetings", "Hi there!", key: "hi")
  ### Consumer
 
  ```ruby
- require "rafka"
-
- cons = Rafka::Consumer.new(topic: "greetings", group: "myapp", id: "greeter1")
- cons.consume # => "Hello there!"
+ consumer = Rafka::Consumer.new(topic: "greetings", group: "myapp")
+ msg = consumer.consume
+ msg.value # => "Hello there!"
 
  # with a block
- cons.consume { |msg| puts "Received: #{msg.value}" } # => "Hello there!"
+ consumer.consume { |msg| puts "Received: #{msg.value}" } # => "Hello there!"
+ ```
+
+ Offsets are managed automatically by default. If you need more control you can
+ turn off the feature and manually commit offsets:
+
+ ```ruby
+ consumer = Rafka::Consumer.new(topic: "greetings", group: "myapp", auto_commit: false)
+
+ # commit a single offset
+ msg = consumer.consume
+ consumer.commit(msg)
+
+ # or commit a bunch of offsets
+ msg1 = consumer.consume
+ msg2 = consumer.consume
+ consumer.commit(msg1, msg2)
+ ```
+
+ Refer to the [Consumer API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb/Rafka/Consumer)
+ for more information.
+
+ Development
+ -------------------------------------------------------------------------------
+
+ Running Rubocop:
+
+ ```shell
+ $ bundle exec rake rubocop
  ```
 
- `Rafka::Consumer#consume` automatically commits the offsets when the given block
- is executed without raising any exceptions.
+ Unit tests run as follows:
+
+ ```shell
+ $ bundle exec rake test
+ ```
+
+ rafka-rb is indirectly tested by [Rafka's end-to-end tests](https://github.com/skroutz/rafka/tree/master/test).
+
+ License
+ -------------------------------------------------------------------------------
+
+ rafka-rb is released under the GNU General Public License version 3. See [COPYING](COPYING).
data/Rakefile ADDED
@@ -0,0 +1,22 @@
+ begin
+   require "bundler/setup"
+ rescue LoadError
+   puts "You must `gem install bundler` and `bundle install` to run rake tasks"
+ end
+
+ require "rake/testtask"
+ Rake::TestTask.new(:test) do |t|
+   t.libs << "test"
+   t.pattern = "test/**/*_test.rb"
+   t.verbose = true
+ end
+
+ require "yard"
+ YARD::Rake::YardocTask.new do |t|
+   t.files = ["lib/**/*.rb"]
+ end
+
+ require "rubocop/rake_task"
+ RuboCop::RakeTask.new
+
+ task default: [:test, :rubocop]
data/lib/rafka.rb CHANGED
@@ -9,23 +9,18 @@ require "rafka/consumer"
  require "rafka/producer"
 
  module Rafka
-   DEFAULTS = {
+   REDIS_DEFAULTS = {
      host: "localhost",
      port: 6380,
-     reconnect_attempts: 5,
-   }
+     reconnect_attempts: 5
+   }.freeze
 
    def self.wrap_errors
      yield
    rescue Redis::CommandError => e
-     case
-     when e.message.start_with?("PROD ")
-       raise ProduceError, e.message[5..-1]
-     when e.message.start_with?("CONS ")
-       raise ConsumeError, e.message[5..-1]
-     else
-       raise CommandError, e.message
-     end
+     raise ProduceError, e.message[5..-1] if e.message.start_with?("PROD ")
+     raise ConsumeError, e.message[5..-1] if e.message.start_with?("CONS ")
+     raise CommandError, e.message
    end
 
    # redis-rb until 3.2.1 didn't retry to connect on
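The simplified `wrap_errors` above maps server-side error prefixes to exception classes: a `Redis::CommandError` whose message starts with `PROD ` or `CONS ` becomes a `ProduceError` or `ConsumeError`, with the five-character prefix stripped by `[5..-1]`. A standalone sketch of that dispatch, using plain `StandardError` stand-ins for the gem's error classes:

```ruby
# Stand-ins for Rafka's error classes; the real ones are defined elsewhere
# in the gem. Names other than the error classes are illustrative only.
class ProduceError < StandardError; end
class ConsumeError < StandardError; end
class CommandError < StandardError; end

# Mirror of the refactored wrap_errors dispatch: the "PROD "/"CONS "
# prefix selects the class and is stripped from the message.
def classify(message)
  raise ProduceError, message[5..-1] if message.start_with?("PROD ")
  raise ConsumeError, message[5..-1] if message.start_with?("CONS ")
  raise CommandError, message
end

begin
  classify("PROD broker transport failure")
rescue ProduceError => e
  puts e.message # => "broker transport failure"
end
```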
@@ -34,7 +29,7 @@ module Rafka
    #
    # TODO(agis): get rid of this method when we go to 3.2.1 or later, because
    # https://github.com/redis/redis-rb/pull/476/
-   def self.with_retry(times: DEFAULTS[:reconnect_attempts], every_sec: 1)
+   def self.with_retry(times: REDIS_DEFAULTS[:reconnect_attempts], every_sec: 1)
      attempts = 0
 
      begin
@@ -52,4 +47,3 @@ module Rafka
      end
    end
  end
-
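`with_retry` retries its block when the connection drops, up to `times` attempts with `every_sec` seconds between tries. Most of its body falls outside these hunks, so the following is only a plausible reconstruction of such a helper, not the gem's exact code; `RuntimeError` stands in for the `Redis::CannotConnectError` the real method targets:

```ruby
# Plausible shape of a retry helper like Rafka.with_retry (reconstructed,
# not taken verbatim from the diff).
def with_retry(times: 5, every_sec: 0)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue RuntimeError
    if attempts < times
      sleep(every_sec) if every_sec > 0
      retry # re-runs the begin block for another attempt
    end
    raise # out of attempts: propagate the last error
  end
end

calls = 0
begin
  with_retry(times: 3) { calls += 1; raise "connection refused" }
rescue RuntimeError
  # gave up after 3 attempts
end
# calls == 3
```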
data/lib/rafka/consumer.rb CHANGED
@@ -1,15 +1,20 @@
- require 'securerandom'
+ require "securerandom"
 
  module Rafka
+   # A Kafka consumer that consumes messages from a given Kafka topic
+   # and belongs to a specific consumer group. Offsets may be committed
+   # automatically or manually (see {#consume}).
+   #
+   # @see https://kafka.apache.org/documentation/#consumerapi
    class Consumer
      include GenericCommands
 
-     REQUIRED = [:group, :topic]
+     REQUIRED_OPTS = [:group, :topic].freeze
 
-     # The underlying Redis client object
+     # @return [Redis::Client] the underlying Redis client instance
      attr_reader :redis
 
-     # Create a new client instance.
+     # Initialize a new consumer.
      #
      # @param [Hash] opts
      # @option opts [String] :host ("localhost") server hostname
@@ -17,29 +22,50 @@ module Rafka
      # @option opts [String] :topic Kafka topic to consume (required)
      # @option opts [String] :group Kafka consumer group name (required)
      # @option opts [String] :id (random) Kafka consumer id
-     # @option opts [Hash] :redis ({}) Optional configuration for the
-     #   underlying Redis client
+     # @option opts [Boolean] :auto_commit (true) automatically commit
+     #   offsets
+     # @option opts [Hash] :redis ({}) Configuration for the
+     #   underlying Redis client (see {REDIS_DEFAULTS})
+     #
+     # @raise [RuntimeError] if a required option was not provided
+     #   (see {REQUIRED_OPTS})
+     #
+     # @return [Consumer]
      def initialize(opts={})
-       @options = parse_opts(opts)
-       @redis = Redis.new(@options)
-       @topic = "topics:#{opts[:topic]}"
+       opts[:id] ||= SecureRandom.hex
+       opts[:id] = "#{opts[:group]}:#{opts[:id]}"
+       opts[:auto_commit] = true if opts[:auto_commit].nil?
+
+       @rafka_opts, @redis_opts = parse_opts(opts)
+       @redis = Redis.new(@redis_opts)
+       @topic = "topics:#{@rafka_opts[:topic]}"
      end
 
-     # Fetches the next message. Offsets are commited automatically. In the
-     # block form, the offset is commited only if the given block haven't
-     # raised any exceptions.
+     # Consumes the next message.
+     #
+     # If :auto_commit is true, offsets are committed automatically.
+     # In the block form, offsets are committed only if the block executes
+     # without raising any exceptions.
      #
-     # @param timeout [Fixnum] the time in seconds to wait for a message
+     # If :auto_commit is false, offsets have to be committed manually using
+     # {#commit}.
+     #
+     # @param timeout [Fixnum] the time in seconds to wait for a message. If
+     #   reached, {#consume} returns nil.
+     #
+     # @yieldparam [Message] msg the consumed message
      #
      # @raise [MalformedMessageError] if the message cannot be parsed
+     # @raise [ConsumeError] if there was any error consuming a message
      #
-     # @return [nil, Message]
+     # @return [nil, Message] the consumed message, or nil if there wasn't any
      #
      # @example Consume a message
-     #   puts consume(5).value
+     #   msg = consumer.consume
+     #   msg.value # => "hi"
      #
-     # @example Consume and commit offset if the block runs successfully
-     #   consume(5) { |msg| puts "I received #{msg.value}" }
+     # @example Consume a message and commit offset if the block does not raise an exception
+     #   consumer.consume { |msg| puts "I received #{msg.value}" }
      def consume(timeout=5)
        # redis-rb didn't automatically call `CLIENT SETNAME` until v3.2.2
        # (https://github.com/redis/redis-rb/issues/510)
@@ -57,7 +83,7 @@ module Rafka
 
        begin
          Rafka.wrap_errors do
-           Rafka.with_retry(times: @options[:reconnect_attempts]) do
+           Rafka.with_retry(times: @redis_opts[:reconnect_attempts]) do
              msg = @redis.blpop(@topic, timeout: timeout)
            end
          end
@@ -90,28 +116,78 @@ module Rafka
 
        msg
      ensure
-       if msg && !raised
-         Rafka.wrap_errors do
-           @redis.rpush("acks", "#{msg.topic}:#{msg.partition}:#{msg.offset}")
+       if msg && !raised && @rafka_opts[:auto_commit]
+         commit(msg)
+       end
+     end
+
+     # Commit offsets for the given messages.
+     #
+     # If more than one message refers to the same topic/partition pair,
+     # only the largest offset amongst them is committed.
+     #
+     # @note This is a non-blocking operation; a successful server reply means
+     #   offsets are received by the server and will _eventually_ be committed
+     #   to Kafka.
+     #
+     # @param msgs [Array<Message>] the messages for which to commit offsets
+     #
+     # @raise [ConsumeError] if there was any error committing offsets
+     #
+     # @return [Hash{String=>Hash{Integer=>Integer}}] the actual offsets sent
+     #   for commit. Keys denote the topics while values contain the
+     #   partition=>offset pairs.
+     def commit(*msgs)
+       tp = prepare_for_commit(*msgs)
+
+       tp.each do |topic, po|
+         po.each do |partition, offset|
+           Rafka.wrap_errors do
+             @redis.rpush("acks", "#{topic}:#{partition}:#{offset}")
+           end
          end
        end
+
+       tp
      end
 
      private
 
-     # @return [Hash]
+     # @param opts [Hash] options hash as passed to {#initialize}
+     #
+     # @return [Array<Hash, Hash>] rafka opts, redis opts
      def parse_opts(opts)
-       REQUIRED.each do |opt|
+       REQUIRED_OPTS.each do |opt|
          raise "#{opt.inspect} option not provided" if opts[opt].nil?
        end
 
        rafka_opts = opts.reject { |k| k == :redis }
-       redis_opts = opts[:redis] || {}
 
-       options = DEFAULTS.dup.merge(rafka_opts).merge(redis_opts)
-       options[:id] = SecureRandom.hex if options[:id].nil?
-       options[:id] = "#{options[:group]}:#{options[:id]}"
-       options
+       redis_opts = REDIS_DEFAULTS.dup.merge(opts[:redis] || {})
+       redis_opts.merge!(
+         rafka_opts.select { |k| [:host, :port, :id].include?(k) }
+       )
+
+       [rafka_opts, redis_opts]
+     end
+
+     # Accepts one or more messages and prepares them for commit.
+     #
+     # @param msgs [Array<Message>]
+     #
+     # @return [Hash{String=>Hash{Integer=>Integer}}] the offsets to be committed.
+     #   Keys denote the topics while values contain the partition=>offset pairs.
+     def prepare_for_commit(*msgs)
+       tp = Hash.new { |h, k| h[k] = Hash.new(0) }
+
+       msgs.each do |msg|
+         if msg.offset >= tp[msg.topic][msg.partition]
+           tp[msg.topic][msg.partition] = msg.offset
+         end
+       end
+
+       tp
      end
    end
  end
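The `prepare_for_commit` helper added above groups messages by topic and partition, keeping only the largest offset per pair. Its logic can be exercised in isolation, with a `Struct` standing in for `Rafka::Message`:

```ruby
# Struct stands in for Rafka::Message, which exposes the same readers.
Msg = Struct.new(:topic, :partition, :offset)

# Same logic as Consumer#prepare_for_commit: a nested hash defaulting
# to 0, keeping the maximum offset seen per topic/partition pair.
def prepare_for_commit(*msgs)
  tp = Hash.new { |h, k| h[k] = Hash.new(0) }
  msgs.each do |msg|
    if msg.offset >= tp[msg.topic][msg.partition]
      tp[msg.topic][msg.partition] = msg.offset
    end
  end
  tp
end

offsets = prepare_for_commit(
  Msg.new("greetings", 0, 5),
  Msg.new("greetings", 0, 9),
  Msg.new("greetings", 1, 2)
)
# offsets == { "greetings" => { 0 => 9, 1 => 2 } }
```

This is why `commit(msg1, msg2)` in the README sends at most one ack per partition.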
data/lib/rafka/message.rb CHANGED
@@ -1,11 +1,19 @@
  module Rafka
    # Message represents a message consumed from a topic.
    class Message
-     attr :topic, :partition, :offset, :value
+     attr_reader :topic, :partition, :offset, :value
 
+     # @param msg [Array] a message as received by the server
+     #
+     # @raise [MalformedMessageError] if message is malformed
+     #
+     # @example
+     #   Message.new(
+     #     ["topic", "greetings", "partition", 2, "offset", 321123, "value", "Hi!"]
+     #   )
      def initialize(msg)
        if !msg.is_a?(Array) || msg.size != 8
-         raise MalformedMessageError.new(msg)
+         raise MalformedMessageError, msg
        end
 
        @topic = msg[1]
@@ -14,7 +22,7 @@ module Rafka
          @partition = Integer(msg[3])
          @offset = Integer(msg[5])
        rescue ArgumentError
-         raise MalformedMessageError.new(msg)
+         raise MalformedMessageError, msg
        end
 
        @value = msg[7]
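`Message#initialize` expects the flat `key, value` array the server replies with (eight elements) and raises `MalformedMessageError` for anything else. A condensed, self-contained version of that parsing:

```ruby
class MalformedMessageError < StandardError; end

# Condensed Rafka::Message: positions 1/3/5/7 of the flat server reply
# carry topic, partition, offset and value respectively.
class Message
  attr_reader :topic, :partition, :offset, :value

  def initialize(msg)
    raise MalformedMessageError, msg.inspect if !msg.is_a?(Array) || msg.size != 8

    @topic = msg[1]
    begin
      @partition = Integer(msg[3])
      @offset = Integer(msg[5])
    rescue ArgumentError
      raise MalformedMessageError, msg.inspect
    end
    @value = msg[7]
  end
end

msg = Message.new(["topic", "greetings", "partition", 2, "offset", 321123, "value", "Hi!"])
# msg.partition == 2, msg.offset == 321123, msg.value == "Hi!"
```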
data/lib/rafka/producer.rb CHANGED
@@ -1,16 +1,20 @@
  module Rafka
+   # A Kafka producer that can produce to different topics.
+   # See {#produce} for more info.
+   #
+   # @see https://kafka.apache.org/documentation/#producerapi
    class Producer
      include GenericCommands
 
      # Access the underlying Redis client object
      attr_reader :redis
 
-     # Create a new client instance.
+     # Create a new producer.
      #
      # @param [Hash] opts
      # @option opts [String] :host ("localhost") server hostname
      # @option opts [Fixnum] :port (6380) server port
-     # @options opts [Hash] :redis Configuration options for the underlying
+     # @option opts [Hash] :redis Configuration options for the underlying
      #   Redis client
      #
      # @return [Producer]
@@ -19,18 +23,21 @@ module Rafka
        @redis = Redis.new(@options)
      end
 
-     # Produce a message. This is an asynchronous operation.
+     # Produce a message to a topic. This is an asynchronous operation.
      #
      # @param topic [String]
-     # @param msg [#to_s]
-     # @param key [#to_s] two or more messages with the same key will always be
-     #   assigned to the same partition.
+     # @param msg [#to_s] the message
+     # @param key [#to_s] an optional partition hashing key. Two or more messages
+     #   with the same key will always be written to the same partition.
      #
-     # @example
-     #   produce("greetings", "Hello there!")
+     # @example Simple produce
+     #   producer = Rafka::Producer.new
+     #   producer.produce("greetings", "Hello there!")
      #
-     # @example
-     #   produce("greetings", "Hello there!", key: "hi")
+     # @example Produce two messages with a hashing key. Those messages are guaranteed to be written to the same partition
+     #   producer = Rafka::Producer.new
+     #   producer.produce("greetings", "Aloha", key: "abc")
+     #   producer.produce("greetings", "Hola", key: "abc")
      def produce(topic, msg, key: nil)
        Rafka.wrap_errors do
          Rafka.with_retry(times: @options[:reconnect_attempts]) do
@@ -41,10 +48,10 @@ module Rafka
        end
      end
 
-     # Flush any buffered messages. Blocks until all messages are flushed or
-     # timeout exceeds.
+     # Flush any buffered messages. Blocks until all messages are written or
+     # the given timeout is exceeded.
      #
-     # @param timeout_ms [Fixnum] (5000) The timeout in milliseconds
+     # @param timeout_ms [Fixnum] the timeout in milliseconds
      #
      # @return [Fixnum] The number of unflushed messages
      def flush(timeout_ms=5000)
@@ -59,7 +66,7 @@ module Rafka
      def parse_opts(opts)
        rafka_opts = opts.reject { |k| k == :redis }
        redis_opts = opts[:redis] || {}
-       DEFAULTS.dup.merge(opts).merge(redis_opts).merge(rafka_opts)
+       REDIS_DEFAULTS.dup.merge(opts).merge(redis_opts).merge(rafka_opts)
      end
    end
  end
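The new `Producer#parse_opts` chain merges four sources, with later merges winning: `REDIS_DEFAULTS`, then all of `opts`, then the nested `:redis` hash, then the top-level options again (minus `:redis`). Extracted from the diff and runnable on its own:

```ruby
REDIS_DEFAULTS = {
  host: "localhost",
  port: 6380,
  reconnect_attempts: 5
}.freeze

# Mirror of Producer#parse_opts: since Hash#merge lets later arguments
# win, top-level options override the nested :redis hash, which in turn
# overrides the defaults.
def parse_opts(opts)
  rafka_opts = opts.reject { |k| k == :redis }
  redis_opts = opts[:redis] || {}
  REDIS_DEFAULTS.dup.merge(opts).merge(redis_opts).merge(rafka_opts)
end

options = parse_opts(port: 6381, redis: { reconnect_attempts: 2 })
# options[:host]               == "localhost"  (default)
# options[:port]               == 6381         (top-level option)
# options[:reconnect_attempts] == 2            (nested :redis hash)
```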