rafka 0.0.10 → 0.1.0

data/README.md CHANGED
@@ -1,13 +1,30 @@
  rafka-rb: Ruby driver for Rafka
  ===============================================================================
+ [![Build Status](https://api.travis-ci.org/skroutz/rafka-rb.svg?branch=master)](https://travis-ci.org/skroutz/rafka-rb)
  [![Gem Version](https://badge.fury.io/rb/rafka.svg)](https://badge.fury.io/rb/rafka-rb)
  [![Documentation](http://img.shields.io/badge/yard-docs-blue.svg)](http://www.rubydoc.info/github/skroutz/rafka-rb)

- rafka-rb is a thin Ruby client library for [Rafka](https://github.com/skroutz/rafka),
- providing a consumer and a producer with simple semantics. It is backed by
- [redis-rb](https://github.com/redis/redis-rb).
+ rafka-rb is a Ruby client for [Rafka](https://github.com/skroutz/rafka),
+ providing consumer and producer implementations with simple semantics.
+ It is backed by [redis-rb](https://github.com/redis/redis-rb).
+
+ Refer to the [API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb)
+ for more information.
+
+
+
+
+
+
+ Features
+ -------------------------------------------------------------------------------
+
+ - Consumer implementation
+   - consumer groups
+   - offsets may be managed automatically or manually
+ - Producer implementation
+   - support for partition hashing key

- View the [API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb).



@@ -40,17 +57,19 @@ Usage
  ### Producer

  ```ruby
- require "rafka"
+ producer = Rafka::Producer.new(host: "localhost", port: 6380)
+ producer.produce("greetings", "Hello there!")
+ ```
+
+ Refer to the [Producer API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb/Rafka/Producer)
+ for more information.
+
+
+
+

- prod = Rafka::Producer.new(host: "localhost", port: 6380)

- # Produce to topic "greetings". The message will be assigned to a random partition.
- prod.produce("greetings", "Hello there!")

- # Produce using a key. Two or more messages with the same key will always be assigned to the same partition.
- prod.produce("greetings", "Hello there!", key: "hi")
- prod.produce("greetings", "Hi there!", key: "hi")
- ```



@@ -58,14 +77,66 @@ prod.produce("greetings", "Hi there!", key: "hi")
  ### Consumer

  ```ruby
- require "rafka"
-
- cons = Rafka::Consumer.new(topic: "greetings", group: "myapp", id: "greeter1")
- cons.consume # => "Hello there!"
+ consumer = Rafka::Consumer.new(topic: "greetings", group: "myapp")
+ msg = consumer.consume
+ msg.value # => "Hello there!"

  # with a block
- cons.consume { |msg| puts "Received: #{msg.value}" } # => "Hello there!"
+ consumer.consume { |msg| puts "Received: #{msg.value}" } # => "Hello there!"
+ ```
+
+ Offsets are managed automatically by default. If you need more control, you can
+ turn off the feature and manually commit offsets:
+
+ ```ruby
+ consumer = Rafka::Consumer.new(topic: "greetings", group: "myapp", auto_commit: false)
+
+ # commit a single offset
+ msg = consumer.consume
+ consumer.commit(msg) # => true
+
+ # or commit a bunch of offsets
+ msg1 = consumer.consume
+ msg2 = consumer.consume
+ consumer.commit(msg1, msg2) # => true
+ ```
+
+ Refer to the [Consumer API documentation](http://www.rubydoc.info/github/skroutz/rafka-rb/Rafka/Consumer)
+ for more information.
+
+
+
+
+
+
+
+
+
+
+
+
+ Development
+ -------------------------------------------------------------------------------
+
+ Running Rubocop:
+
+ ```shell
+ $ bundle exec rake rubocop
  ```

- `Rafka::Consumer#consume` automatically commits the offsets when the given block
- is executed without raising any exceptions.
+ Running the unit tests:
+
+ ```shell
+ $ bundle exec rake test
+ ```
+
+
+ rafka-rb is indirectly tested by [Rafka's end-to-end tests](https://github.com/skroutz/rafka/tree/master/test).
+
+
+
+
+
+
+ License
+ -------------------------------------------------------------------------------
+ rafka-rb is released under the GNU General Public License version 3. See [COPYING](COPYING).
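
Putting the new README examples together, here is a minimal end-to-end sketch. It is illustrative only: the `require` line and the default host/port are assumed from the examples above, and whether the consumer actually receives this particular message depends on the running Rafka server and the consumer group's committed offsets.

```ruby
require "rafka"

# Produce a message (asynchronous) and block until buffered messages are flushed.
producer = Rafka::Producer.new(host: "localhost", port: 6380)
producer.produce("greetings", "Hello there!")
producer.flush

# Consume from the same topic; #consume returns a Message or nil on timeout.
consumer = Rafka::Consumer.new(topic: "greetings", group: "myapp")
msg = consumer.consume
puts msg.value if msg
```
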
data/Rakefile ADDED
@@ -0,0 +1,22 @@
+ begin
+   require "bundler/setup"
+ rescue LoadError
+   puts "You must `gem install bundler` and `bundle install` to run rake tasks"
+ end
+
+ require "rake/testtask"
+ Rake::TestTask.new(:test) do |t|
+   t.libs << "test"
+   t.pattern = "test/**/*_test.rb"
+   t.verbose = true
+ end
+
+ require "yard"
+ YARD::Rake::YardocTask.new do |t|
+   t.files = ["lib/**/*.rb"]
+ end
+
+ require "rubocop/rake_task"
+ RuboCop::RakeTask.new
+
+ task default: [:test, :rubocop]
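
For orientation, the tasks defined by this new Rakefile could be invoked roughly as follows (assuming the gem's development dependencies are installed; `yard` is the YardocTask's default task name):

```shell
$ bundle exec rake          # default task: runs :test, then :rubocop
$ bundle exec rake yard     # generate the YARD documentation
```
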
data/lib/rafka.rb CHANGED
@@ -9,23 +9,18 @@ require "rafka/consumer"
  require "rafka/producer"

  module Rafka
-   DEFAULTS = {
+   REDIS_DEFAULTS = {
      host: "localhost",
      port: 6380,
-     reconnect_attempts: 5,
-   }
+     reconnect_attempts: 5
+   }.freeze

    def self.wrap_errors
      yield
    rescue Redis::CommandError => e
-     case
-     when e.message.start_with?("PROD ")
-       raise ProduceError, e.message[5..-1]
-     when e.message.start_with?("CONS ")
-       raise ConsumeError, e.message[5..-1]
-     else
-       raise CommandError, e.message
-     end
+     raise ProduceError, e.message[5..-1] if e.message.start_with?("PROD ")
+     raise ConsumeError, e.message[5..-1] if e.message.start_with?("CONS ")
+     raise CommandError, e.message
    end

    # redis-rb until 3.2.1 didn't retry to connect on
@@ -34,7 +29,7 @@ module Rafka
    #
    # TODO(agis): get rid of this method when we go to 3.2.1 or later, because
    # https://github.com/redis/redis-rb/pull/476/
-   def self.with_retry(times: DEFAULTS[:reconnect_attempts], every_sec: 1)
+   def self.with_retry(times: REDIS_DEFAULTS[:reconnect_attempts], every_sec: 1)
      attempts = 0

      begin
@@ -52,4 +47,3 @@ module Rafka
      end
    end
  end
-
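
The refactored `Rafka.wrap_errors` above keeps the same behaviour: server replies prefixed with `PROD ` or `CONS ` surface as `ProduceError`/`ConsumeError`, and anything else becomes `CommandError`. A hedged usage sketch (error classes as shown above; host, port, and topic are illustrative):

```ruby
require "rafka"

begin
  producer = Rafka::Producer.new(host: "localhost", port: 6380)
  producer.produce("greetings", "Hello there!")
  producer.flush
rescue Rafka::ProduceError => e
  # a server reply that started with "PROD " (prefix already stripped)
  warn "produce failed: #{e.message}"
rescue Rafka::CommandError => e
  # any other Redis::CommandError re-raised by wrap_errors
  warn "unexpected server error: #{e.message}"
end
```
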
data/lib/rafka/consumer.rb CHANGED
@@ -1,15 +1,20 @@
- require 'securerandom'
+ require "securerandom"

  module Rafka
+   # A Kafka consumer that consumes messages from a given Kafka topic
+   # and belongs to a specific consumer group. Offsets may be committed
+   # automatically or manually (see {#consume}).
+   #
+   # @see https://kafka.apache.org/documentation/#consumerapi
    class Consumer
      include GenericCommands

-     REQUIRED = [:group, :topic]
+     REQUIRED_OPTS = [:group, :topic].freeze

-     # The underlying Redis client object
+     # @return [Redis::Client] the underlying Redis client instance
      attr_reader :redis

-     # Create a new client instance.
+     # Initialize a new consumer.
      #
      # @param [Hash] opts
      # @option opts [String] :host ("localhost") server hostname
@@ -17,29 +22,50 @@ module Rafka
      # @option opts [String] :topic Kafka topic to consume (required)
      # @option opts [String] :group Kafka consumer group name (required)
      # @option opts [String] :id (random) Kafka consumer id
-     # @option opts [Hash] :redis ({}) Optional configuration for the
-     #   underlying Redis client
+     # @option opts [Boolean] :auto_commit (true) automatically commit
+     #   offsets
+     # @option opts [Hash] :redis ({}) Configuration for the
+     #   underlying Redis client (see {REDIS_DEFAULTS})
+     #
+     # @raise [RuntimeError] if a required option was not provided
+     #   (see {REQUIRED_OPTS})
+     #
+     # @return [Consumer]
      def initialize(opts={})
-       @options = parse_opts(opts)
-       @redis = Redis.new(@options)
-       @topic = "topics:#{opts[:topic]}"
+       opts[:id] ||= SecureRandom.hex
+       opts[:id] = "#{opts[:group]}:#{opts[:id]}"
+       opts[:auto_commit] = true if opts[:auto_commit].nil?
+
+       @rafka_opts, @redis_opts = parse_opts(opts)
+       @redis = Redis.new(@redis_opts)
+       @topic = "topics:#{@rafka_opts[:topic]}"
      end

-     # Fetches the next message. Offsets are commited automatically. In the
-     # block form, the offset is commited only if the given block haven't
-     # raised any exceptions.
+     # Consumes the next message.
+     #
+     # If :auto_commit is true, offsets are committed automatically.
+     # In the block form, offsets are committed only if the block executes
+     # without raising any exceptions.
      #
-     # @param timeout [Fixnum] the time in seconds to wait for a message
+     # If :auto_commit is false, offsets have to be committed manually using
+     # {#commit}.
+     #
+     # @param timeout [Fixnum] the time in seconds to wait for a message. If
+     #   reached, {#consume} returns nil.
+     #
+     # @yieldparam [Message] msg the consumed message
      #
      # @raise [MalformedMessageError] if the message cannot be parsed
+     # @raise [ConsumeError] if there was any error consuming a message
      #
-     # @return [nil, Message]
+     # @return [nil, Message] the consumed message, or nil if there wasn't any
      #
      # @example Consume a message
-     #   puts consume(5).value
+     #   msg = consumer.consume
+     #   msg.value # => "hi"
      #
-     # @example Consume and commit offset if the block runs successfully
-     #   consume(5) { |msg| puts "I received #{msg.value}" }
+     # @example Consume a message and commit offset if the block does not raise an exception
+     #   consumer.consume { |msg| puts "I received #{msg.value}" }
      def consume(timeout=5)
        # redis-rb didn't automatically call `CLIENT SETNAME` until v3.2.2
        # (https://github.com/redis/redis-rb/issues/510)
@@ -57,7 +83,7 @@ module Rafka

        begin
          Rafka.wrap_errors do
-           Rafka.with_retry(times: @options[:reconnect_attempts]) do
+           Rafka.with_retry(times: @redis_opts[:reconnect_attempts]) do
              msg = @redis.blpop(@topic, timeout: timeout)
            end
          end
@@ -90,28 +116,78 @@ module Rafka

        msg
      ensure
-       if msg && !raised
-         Rafka.wrap_errors do
-           @redis.rpush("acks", "#{msg.topic}:#{msg.partition}:#{msg.offset}")
+       if msg && !raised && @rafka_opts[:auto_commit]
+         commit(msg)
+       end
+     end
+
+     # Commit offsets for the given messages.
+     #
+     # If more than one message refers to the same topic/partition pair,
+     # only the largest offset amongst them is committed.
+     #
+     # @note This is a non-blocking operation; a successful server reply means
+     #   offsets are received by the server and will _eventually_ be committed
+     #   to Kafka.
+     #
+     # @param msgs [Array<Message>] the messages for which to commit offsets
+     #
+     # @raise [ConsumeError] if there was any error committing offsets
+     #
+     # @return [Hash] the actual offsets sent for commit
+     # @return [Hash{String=>Hash{Integer=>Integer}}] the actual offsets sent
+     #   for commit. Keys denote the topics while values contain the
+     #   partition=>offset pairs.
+     def commit(*msgs)
+       tp = prepare_for_commit(*msgs)
+
+       tp.each do |topic, po|
+         po.each do |partition, offset|
+           Rafka.wrap_errors do
+             @redis.rpush("acks", "#{topic}:#{partition}:#{offset}")
+           end
          end
        end
+
+       tp
      end

      private

-     # @return [Hash]
+     # @param opts [Hash] options hash as passed to {#initialize}
+     #
+     # @return [Array<Hash, Hash>] rafka opts, redis opts
      def parse_opts(opts)
-       REQUIRED.each do |opt|
+       REQUIRED_OPTS.each do |opt|
          raise "#{opt.inspect} option not provided" if opts[opt].nil?
        end

        rafka_opts = opts.reject { |k| k == :redis }
-       redis_opts = opts[:redis] || {}

-       options = DEFAULTS.dup.merge(rafka_opts).merge(redis_opts)
-       options[:id] = SecureRandom.hex if options[:id].nil?
-       options[:id] = "#{options[:group]}:#{options[:id]}"
-       options
+       redis_opts = REDIS_DEFAULTS.dup.merge(opts[:redis] || {})
+       redis_opts.merge!(
+         rafka_opts.select { |k| [:host, :port, :id].include?(k) }
+       )
+
+       [rafka_opts, redis_opts]
+     end
+
+     # Accepts one or more messages and prepares them for commit.
+     #
+     # @param msgs [Array<Message>]
+     #
+     # @return [Hash{String=>Hash{Integer=>Integer}}] the offsets to be committed.
+     #   Keys denote the topics while values contain the partition=>offset pairs.
+     def prepare_for_commit(*msgs)
+       tp = Hash.new { |h, k| h[k] = Hash.new(0) }
+
+       msgs.each do |msg|
+         if msg.offset >= tp[msg.topic][msg.partition]
+           tp[msg.topic][msg.partition] = msg.offset
+         end
+       end
+
+       tp
      end
    end
  end
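
To illustrate how `#commit` coalesces offsets per topic/partition as documented above, here is a small sketch; the topic, group, and returned offsets are made up, and it assumes both `consume` calls returned a message:

```ruby
consumer = Rafka::Consumer.new(
  topic: "greetings", group: "myapp", auto_commit: false
)

msg1 = consumer.consume
msg2 = consumer.consume

# If both messages came from the same topic/partition, only the largest
# offset is pushed to the "acks" list; the sent offsets are returned,
# e.g. {"greetings" => {0 => 42}}.
offsets = consumer.commit(msg1, msg2)
```
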
data/lib/rafka/message.rb CHANGED
@@ -1,11 +1,19 @@
  module Rafka
    # Message represents a message consumed from a topic.
    class Message
-     attr :topic, :partition, :offset, :value
+     attr_reader :topic, :partition, :offset, :value

+     # @param msg [Array] a message as received by the server
+     #
+     # @raise [MalformedMessageError] if message is malformed
+     #
+     # @example
+     #   Message.new(
+     #     ["topic", "greetings", "partition", 2, "offset", 321123, "value", "Hi!"]
+     #   )
      def initialize(msg)
        if !msg.is_a?(Array) || msg.size != 8
-         raise MalformedMessageError.new(msg)
+         raise MalformedMessageError, msg
        end

        @topic = msg[1]
@@ -14,7 +22,7 @@ module Rafka
          @partition = Integer(msg[3])
          @offset = Integer(msg[5])
        rescue ArgumentError
-         raise MalformedMessageError.new(msg)
+         raise MalformedMessageError, msg
        end

        @value = msg[7]
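
Based on the example in the new docstring above, the switch to `attr_reader` exposes plain reader methods for the parsed fields:

```ruby
msg = Rafka::Message.new(
  ["topic", "greetings", "partition", 2, "offset", 321123, "value", "Hi!"]
)

msg.topic     # => "greetings"
msg.partition # => 2
msg.offset    # => 321123
msg.value     # => "Hi!"
```
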
data/lib/rafka/producer.rb CHANGED
@@ -1,16 +1,20 @@
  module Rafka
+   # A Kafka producer that can produce to different topics.
+   # See {#produce} for more info.
+   #
+   # @see https://kafka.apache.org/documentation/#producerapi
    class Producer
      include GenericCommands

      # Access the underlying Redis client object
      attr_reader :redis

-     # Create a new client instance.
+     # Create a new producer.
      #
      # @param [Hash] opts
      # @option opts [String] :host ("localhost") server hostname
      # @option opts [Fixnum] :port (6380) server port
-     # @options opts [Hash] :redis Configuration options for the underlying
+     # @option opts [Hash] :redis Configuration options for the underlying
      #   Redis client
      #
      # @return [Producer]
@@ -19,18 +23,21 @@ module Rafka
        @redis = Redis.new(@options)
      end

-     # Produce a message. This is an asynchronous operation.
+     # Produce a message to a topic. This is an asynchronous operation.
      #
      # @param topic [String]
-     # @param msg [#to_s]
-     # @param key [#to_s] two or more messages with the same key will always be
-     #   assigned to the same partition.
+     # @param msg [#to_s] the message
+     # @param key [#to_s] an optional partition hashing key. Two or more messages
+     #   with the same key will always be written to the same partition.
      #
-     # @example
-     #   produce("greetings", "Hello there!")
+     # @example Simple produce
+     #   producer = Rafka::Producer.new
+     #   producer.produce("greetings", "Hello there!")
      #
-     # @example
-     #   produce("greetings", "Hello there!", key: "hi")
+     # @example Produce two messages with a hashing key. Those messages are guaranteed to be written to the same partition
+     #   producer = Rafka::Producer.new
+     #   producer.produce("greetings", "Aloha", key: "abc")
+     #   producer.produce("greetings", "Hola", key: "abc")
      def produce(topic, msg, key: nil)
        Rafka.wrap_errors do
          Rafka.with_retry(times: @options[:reconnect_attempts]) do
@@ -41,10 +48,10 @@ module Rafka
        end
      end

-     # Flush any buffered messages. Blocks until all messages are flushed or
-     # timeout exceeds.
+     # Flush any buffered messages. Blocks until all messages are written or the
+     # given timeout expires.
      #
-     # @param timeout_ms [Fixnum] (5000) The timeout in milliseconds
+     # @param timeout_ms [Fixnum]
      #
      # @return [Fixnum] The number of unflushed messages
      def flush(timeout_ms=5000)
@@ -59,7 +66,7 @@ module Rafka
      def parse_opts(opts)
        rafka_opts = opts.reject { |k| k == :redis }
        redis_opts = opts[:redis] || {}
-       DEFAULTS.dup.merge(opts).merge(redis_opts)
+       REDIS_DEFAULTS.dup.merge(opts).merge(redis_opts).merge(rafka_opts)
      end
    end
  end
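
Finally, tying the updated `#produce` documentation to `#flush`, a brief sketch (topic, key, and timeout values are illustrative):

```ruby
producer = Rafka::Producer.new(host: "localhost", port: 6380)

# Messages sharing a key are written to the same partition.
producer.produce("greetings", "Aloha", key: "abc")
producer.produce("greetings", "Hola", key: "abc")

# Block until buffered messages are written or 2000ms pass; the return
# value is the number of messages still unflushed.
unflushed = producer.flush(2000)
```
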