rdkafka 0.3.5 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: f2ca53e4caa266a4e9b4cd1fd5132626259629c1835c6923b8b5b252185884c5
-  data.tar.gz: d748f67df1abc442410de913a8b67b600264aeba55cef148ac01b0a7e863a7ce
+  metadata.gz: c361e42d0a0eb4ef3c6b7baa3509baa675f35ed8b96660e5095ba8c09ef58163
+  data.tar.gz: dbf82f43bb4f69fc2bdc43f931c28009578a331eb8e4bc4bb3606b2cdeedefc8
 SHA512:
-  metadata.gz: 65900548523763e0c448d8ceb8a08fdc96d8404fcef169db0724fc8ef50336a0bd4b50d622371eaf981ecc808d432c70b44b881a044903c7fa2f20bdacf23987
-  data.tar.gz: 8d157b0d6e3e82158c4c08c1bfed99e6ccb9c1d6ef287af81e20932c33adc82566a8df239fce2a227eea42b29f991b55499770ab85e87ca61b57afcc94b686d2
+  metadata.gz: 74485766bf68d0bb42e756dc3c381da604e01500682dcdd08b0d79a9c736696fbe4845ede037a820e5eb9482626f0ad8355344bb15f47235810e29d1e6219f28
+  data.tar.gz: ae8ec9d6a81fbdc1931c3f629fd92a32d31297fb055fcecca97f322248d31a73ce23f9fdc356645215960dd34b8a979e9288afdd8d5df9b031e2cd19a428825d
data/.travis.yml CHANGED
@@ -2,9 +2,13 @@ language: ruby
 
 sudo: false
 
+ services:
+ - docker
+
 env:
   global:
   - CC_TEST_REPORTER_ID=9f7f740ac1b6e264e1189fa07a6687a87bcdb9f3c0f4199d4344ab3b538e187e
+  - KAFKA_HEAP_OPTS="-Xmx512m -Xms512m"
 
 rvm:
   - 2.1
@@ -15,14 +19,10 @@ rvm:
 
 before_install:
   - gem update --system
-  - wget http://www.us.apache.org/dist/kafka/1.0.0/kafka_2.12-1.0.0.tgz -O kafka.tgz
-  - mkdir -p kafka && tar xzf kafka.tgz -C kafka --strip-components 1
-  - nohup bash -c "cd kafka && bin/zookeeper-server-start.sh config/zookeeper.properties &"
-  - nohup bash -c "cd kafka && bin/kafka-server-start.sh config/server.properties &"
 
 before_script:
+  - docker-compose up -d
   - cd ext && bundle exec rake && cd ..
-  - bundle exec rake create_topics
   - curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
   - chmod +x ./cc-test-reporter
   - ./cc-test-reporter before-build
@@ -31,5 +31,5 @@ script:
   - bundle exec rspec
 
 after_script:
+  - docker-compose stop
   - ./cc-test-reporter after-build --exit-code $TRAVIS_TEST_RESULT
-  - killall -9 java
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+# 0.4.0
+* Improvements in librdkafka archive download
+* Add global statistics callback
+* Use Time for timestamps; this is a potentially breaking change if you
+  rely on the previous behavior, where an integer number of milliseconds
+  was returned.
+* Bump librdkafka to 0.11.5
+* Implement TopicPartitionList in Ruby so we don't have to keep
+  track of native objects.
+* Support committing a topic partition list
+* Add consumer assignment method
+
 # 0.3.5
 * Fix crash when not waiting for delivery handles
 * Run specs on Ruby 2.5
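
The timestamp bullet above is the one potentially breaking change. A minimal migration sketch, assuming a broker at localhost:9092 and that the consumed message exposes a `timestamp` accessor (the accessor name is not shown in this diff):

```ruby
require "rdkafka"

config = {
  :"bootstrap.servers" => "localhost:9092",
  :"group.id"          => "timestamp_example"
}
consumer = Rdkafka::Config.new(config).consumer
consumer.subscribe("consume_test_topic")

consumer.each do |message|
  # 0.3.x returned an Integer with milliseconds since the epoch;
  # 0.4.0 returns a Time. Convert back if downstream code expects milliseconds.
  millis = (message.timestamp.to_f * 1000).round if message.timestamp
  puts "timestamp=#{message.timestamp.inspect} (#{millis.inspect} ms)"
  break
end
```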
data/README.md CHANGED
@@ -8,12 +8,16 @@
 The `rdkafka` gem is a modern Kafka client library for Ruby based on
 [librdkafka](https://github.com/edenhill/librdkafka/).
 It wraps the production-ready C client using the [ffi](https://github.com/ffi/ffi)
-gem and targets Kafka 0.10+ and Ruby 2.1+.
+gem and targets Kafka 1.0+ and Ruby 2.1+.
 
 This gem only provides a high-level Kafka consumer. If you are running
 an older version of Kafka and/or need the legacy simple consumer, we
 suggest using the [Hermann](https://github.com/reiseburo/hermann) gem.
 
+The most important pieces of a Kafka client are implemented. We're
+working towards feature completeness; you can track progress here:
+https://github.com/appsignal/rdkafka-ruby/milestone/1
+
 ## Installation
 
 This gem downloads and compiles librdkafka when it is installed. If you
@@ -25,6 +29,9 @@ See the [documentation](http://www.rubydoc.info/github/thijsc/rdkafka-ruby/maste
 
 ### Consuming messages
 
+Subscribe to a topic and get messages. Kafka will automatically spread
+the available partitions over consumers with the same group id.
+
 ```ruby
 config = {
   :"bootstrap.servers" => "localhost:9092",
@@ -40,25 +47,45 @@ end
 
 ### Producing messages
 
+Produce a number of messages, put the delivery handles in an array and
+wait for them before exiting. This way the messages will be batched and
+sent to Kafka efficiently.
+
 ```ruby
 config = {:"bootstrap.servers" => "localhost:9092"}
 producer = Rdkafka::Config.new(config).producer
+delivery_handles = []
 
 100.times do |i|
   puts "Producing message #{i}"
-  producer.produce(
+  delivery_handles << producer.produce(
     topic: "ruby-test-topic",
     payload: "Payload #{i}",
     key: "Key #{i}"
-  ).wait
+  )
 end
+
+delivery_handles.each(&:wait)
 ```
 
+## Known issues
+
+When using a forking process manager such as Unicorn, you currently need
+to make sure that you create rdkafka instances after forking. Otherwise
+they will not work and will crash your Ruby process when they are garbage
+collected. See https://github.com/appsignal/rdkafka-ruby/issues/19
+
 ## Development
 
-For development we expect a local zookeeper and kafka instance to be
-running. Run `bundle` and `cd ext && bundle exec rake && cd ..`. Then
-create the topics as expected in the specs: `bundle exec rake create_topics`.
+A Docker Compose file is included to run Kafka and Zookeeper. To run
+that:
+
+```
+docker-compose up
+```
+
+Run `bundle` and `cd ext && bundle exec rake && cd ..` to download and
+compile `librdkafka`.
 
 You can then run `bundle exec rspec` to run the tests. To see rdkafka
 debug output:
@@ -68,6 +95,15 @@ DEBUG_PRODUCER=true bundle exec rspec
 DEBUG_CONSUMER=true bundle exec rspec
 ```
 
+After running the tests, you can bring the cluster down to start with a
+clean slate:
+
+```
+docker-compose down
+```
+
+## Example
+
 To see everything working, run these in separate tabs:
 
 ```
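
The README's new "Known issues" section above boils down to: construct rdkafka clients after the fork, never before. A minimal sketch of one way to do that with Unicorn; the `$rdkafka_producer` global and the broker address are illustrative, not part of the gem:

```ruby
# config/unicorn.rb -- illustrative only
after_fork do |_server, _worker|
  # Create the rdkafka instance inside the worker, after the fork, so the
  # underlying librdkafka threads and file descriptors belong to this process.
  $rdkafka_producer = Rdkafka::Config.new(
    :"bootstrap.servers" => "localhost:9092"
  ).producer
end
```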
data/Rakefile CHANGED
@@ -1,33 +1,24 @@
 require "./lib/rdkafka"
 
-task :create_topics do
-  puts "Creating test topics"
-  kafka_topics = if ENV['TRAVIS']
-    'kafka/bin/kafka-topics.sh'
-  else
-    'kafka-topics'
-  end
-  `#{kafka_topics} --create --topic=consume_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-  `#{kafka_topics} --create --topic=empty_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-  `#{kafka_topics} --create --topic=load_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-  `#{kafka_topics} --create --topic=produce_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-  `#{kafka_topics} --create --topic=rake_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-end
-
 task :produce_messages do
   config = {:"bootstrap.servers" => "localhost:9092"}
   if ENV["DEBUG"]
     config[:debug] = "broker,topic,msg"
   end
   producer = Rdkafka::Config.new(config).producer
+
+  delivery_handles = []
   100.times do |i|
     puts "Producing message #{i}"
-    producer.produce(
+    delivery_handles << producer.produce(
       topic: "rake_test_topic",
       payload: "Payload #{i} from Rake",
       key: "Key #{i} from Rake"
-    ).wait
+    )
   end
+  puts 'Waiting for delivery'
+  delivery_handles.each(&:wait)
+  puts 'Done'
 end
 
 task :consume_messages do
@@ -35,11 +26,16 @@ task :consume_messages do
     :"bootstrap.servers" => "localhost:9092",
     :"group.id" => "rake_test",
     :"enable.partition.eof" => false,
-    :"auto.offset.reset" => "earliest"
+    :"auto.offset.reset" => "earliest",
+    :"statistics.interval.ms" => 10_000
   }
   if ENV["DEBUG"]
     config[:debug] = "cgrp,topic,fetch"
  end
+  Rdkafka::Config.statistics_callback = lambda do |stats|
+    puts stats
+  end
+  consumer = Rdkafka::Config.new(config).consumer
   consumer = Rdkafka::Config.new(config).consumer
   consumer.subscribe("rake_test_topic")
   consumer.each do |message|
data/docker-compose.yml ADDED
@@ -0,0 +1,18 @@
+version: '2'
+services:
+  zookeeper:
+    image: wurstmeister/zookeeper
+    ports:
+      - "2181:2181"
+  kafka:
+    image: wurstmeister/kafka:1.0.1
+    ports:
+      - "9092:9092"
+    environment:
+      KAFKA_ADVERTISED_HOST_NAME: localhost
+      KAFKA_ADVERTISED_PORT: 9092
+      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
+      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
+      KAFKA_CREATE_TOPICS: "consume_test_topic:3:1,empty_test_topic:3:1,load_test_topic:3:1,produce_test_topic:3:1,rake_test_topic:3:1"
+    volumes:
+      - /var/run/docker.sock:/var/run/docker.sock
data/ext/Rakefile CHANGED
@@ -3,9 +3,25 @@ require "mini_portile2"
 require "fileutils"
 
 task :default => :clean do
+  # MiniPortile#download_file_http is a monkey patch that removes the download
+  # progress indicator. This indicator relies on the 'Content-Length' response
+  # header, which is not set by GitHub
+  class MiniPortile
+    def download_file_http(url, full_path, _count)
+      filename = File.basename(full_path)
+      with_tempfile(filename, full_path) do |temp_file|
+        params = { 'Accept-Encoding' => 'identity' }
+        OpenURI.open_uri(url, 'rb', params) do |io|
+          temp_file.write(io.read)
+        end
+        output
+      end
+    end
+  end
+
   # Download and compile librdkafka
   recipe = MiniPortile.new("librdkafka", Rdkafka::LIBRDKAFKA_VERSION)
-  recipe.files = ["https://github.com/edenhill/librdkafka/archive/v#{Rdkafka::LIBRDKAFKA_VERSION}.tar.gz"]
+  recipe.files = ["https://codeload.github.com/edenhill/librdkafka/tar.gz/v#{Rdkafka::LIBRDKAFKA_VERSION}"]
   recipe.configure_options = ["--host=#{recipe.host}"]
   recipe.cook
   # Move dynamic library we're interested in
data/lib/rdkafka/bindings.rb CHANGED
@@ -1,4 +1,5 @@
 require "ffi"
+require "json"
 require "logger"
 
 module Rdkafka
@@ -43,7 +44,7 @@ module Rdkafka
     # TopicPartition and TopicPartitionList structs
 
     class TopicPartition < FFI::Struct
-      layout :topic, :string,
+      layout :topic, :string,
         :partition, :int32,
         :offset, :int64,
         :metadata, :pointer,
@@ -61,6 +62,7 @@ module Rdkafka
 
     attach_function :rd_kafka_topic_partition_list_new, [:int32], :pointer
     attach_function :rd_kafka_topic_partition_list_add, [:pointer, :string, :int32], :void
+    attach_function :rd_kafka_topic_partition_list_set_offset, [:pointer, :string, :int32, :int64], :void
     attach_function :rd_kafka_topic_partition_list_destroy, [:pointer], :void
     attach_function :rd_kafka_topic_partition_list_copy, [:pointer], :pointer
 
@@ -81,6 +83,8 @@ module Rdkafka
     attach_function :rd_kafka_conf_set, [:pointer, :string, :string, :pointer, :int], :kafka_config_response
     callback :log_cb, [:pointer, :int, :string, :string], :void
     attach_function :rd_kafka_conf_set_log_cb, [:pointer, :log_cb], :void
+    callback :stats_cb, [:pointer, :string, :int, :pointer], :int
+    attach_function :rd_kafka_conf_set_stats_cb, [:pointer, :stats_cb], :void
 
     # Log queue
     attach_function :rd_kafka_set_log_queue, [:pointer, :pointer], :void
@@ -106,6 +110,19 @@ module Rdkafka
       Rdkafka::Config.logger.add(severity) { "rdkafka: #{line}" }
     end
 
+    StatsCallback = FFI::Function.new(
+      :int, [:pointer, :string, :int, :pointer]
+    ) do |_client_ptr, json, _json_len, _opaque|
+      # Pass the stats hash to callback in config
+      if Rdkafka::Config.statistics_callback
+        stats = JSON.parse(json)
+        Rdkafka::Config.statistics_callback.call(stats)
+      end
+
+      # Return 0 so librdkafka frees the json string
+      0
+    end
+
     # Handle
 
     enum :kafka_type, [
@@ -121,6 +138,7 @@ module Rdkafka
     attach_function :rd_kafka_subscribe, [:pointer, :pointer], :int
     attach_function :rd_kafka_unsubscribe, [:pointer], :int
     attach_function :rd_kafka_subscription, [:pointer, :pointer], :int
+    attach_function :rd_kafka_assign, [:pointer, :pointer], :int
     attach_function :rd_kafka_assignment, [:pointer, :pointer], :int
     attach_function :rd_kafka_committed, [:pointer, :pointer, :int], :int
     attach_function :rd_kafka_commit, [:pointer, :pointer, :bool], :int, blocking: true
data/lib/rdkafka/config.rb CHANGED
@@ -7,6 +7,7 @@ module Rdkafka
   class Config
     # @private
     @@logger = Logger.new(STDOUT)
+    @@statistics_callback = nil
 
     # Returns the current logger, by default this is a logger to stdout.
     #
@@ -25,6 +26,25 @@ module Rdkafka
       @@logger=logger
     end
 
+    # Set a callback that will be called every time the underlying client emits statistics.
+    # You can configure if and how often this happens using `statistics.interval.ms`.
+    # The callback is called with a hash that's documented here: https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md
+    #
+    # @param callback [Proc] The callback
+    #
+    # @return [nil]
+    def self.statistics_callback=(callback)
+      raise TypeError.new("Callback has to be a proc or lambda") unless callback.is_a? Proc
+      @@statistics_callback = callback
+    end
+
+    # Returns the current statistics callback, by default this is nil.
+    #
+    # @return [Proc, nil]
+    def self.statistics_callback
+      @@statistics_callback
+    end
+
     # Default config that can be overwritten.
     DEFAULT_CONFIG = {
       # Request api version so advanced features work
@@ -122,8 +142,12 @@ module Rdkafka
           raise ConfigError.new(error_buffer.read_string)
         end
       end
+      # Set opaque pointer back to this config
+      #Rdkafka::Bindings.rd_kafka_conf_set_opaque(config, self)
       # Set log callback
       Rdkafka::Bindings.rd_kafka_conf_set_log_cb(config, Rdkafka::Bindings::LogCallback)
+      # Set stats callback
+      Rdkafka::Bindings.rd_kafka_conf_set_stats_cb(config, Rdkafka::Bindings::StatsCallback)
     end
   end
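
A short usage sketch for the statistics callback wired up above, assuming a broker at localhost:9092; the `name` and `msg_cnt` keys are top-level fields from librdkafka's STATISTICS.md, shown purely for illustration:

```ruby
require "rdkafka"

# Receive the parsed statistics hash every 5 seconds.
Rdkafka::Config.statistics_callback = lambda do |stats|
  puts "stats for #{stats["name"]}: #{stats["msg_cnt"]} messages in the queue"
end

config = Rdkafka::Config.new(
  :"bootstrap.servers"      => "localhost:9092",
  :"statistics.interval.ms" => 5_000
)
producer = config.producer
```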
data/lib/rdkafka/consumer.rb CHANGED
@@ -65,34 +65,68 @@ module Rdkafka
     # @return [TopicPartitionList]
     def subscription
       tpl = FFI::MemoryPointer.new(:pointer)
+      tpl.autorelease = false
       response = Rdkafka::Bindings.rd_kafka_subscription(@native_kafka, tpl)
       if response != 0
         raise Rdkafka::RdkafkaError.new(response)
       end
-      Rdkafka::Consumer::TopicPartitionList.new(tpl.get_pointer(0))
+      Rdkafka::Consumer::TopicPartitionList.from_native_tpl(tpl.get_pointer(0))
+    end
+
+    # Atomic assignment of partitions to consume
+    #
+    # @param list [TopicPartitionList] The topic with partitions to assign
+    #
+    # @raise [RdkafkaError] When assigning fails
+    def assign(list)
+      unless list.is_a?(TopicPartitionList)
+        raise TypeError.new("list has to be a TopicPartitionList")
+      end
+      tpl = list.to_native_tpl
+      response = Rdkafka::Bindings.rd_kafka_assign(@native_kafka, tpl)
+      if response != 0
+        raise Rdkafka::RdkafkaError.new(response, "Error assigning '#{list.to_h}'")
+      end
+    ensure
+      Rdkafka::Bindings.rd_kafka_topic_partition_list_destroy(tpl) if tpl
+    end
+
+    # Returns the current partition assignment.
+    #
+    # @raise [RdkafkaError] When getting the assignment fails.
+    #
+    # @return [TopicPartitionList]
+    def assignment
+      tpl = FFI::MemoryPointer.new(:pointer)
+      tpl.autorelease = false
+      response = Rdkafka::Bindings.rd_kafka_assignment(@native_kafka, tpl)
+      if response != 0
+        raise Rdkafka::RdkafkaError.new(response)
+      end
+      Rdkafka::Consumer::TopicPartitionList.from_native_tpl(tpl.get_pointer(0))
     end
 
     # Return the current committed offset per partition for this consumer group.
     # The offset field of each requested partition will either be set to the stored offset or to -1001 in case there was no stored offset for that partition.
     #
-    # TODO: This should use the subscription or assignment by default.
-    #
-    # @param list [TopicPartitionList] The topic with partitions to get the offsets for.
+    # @param list [TopicPartitionList, nil] The topic with partitions to get the offsets for or nil to use the current assignment.
     # @param timeout_ms [Integer] The timeout for fetching this information.
     #
     # @raise [RdkafkaError] When getting the committed positions fails.
     #
     # @return [TopicPartitionList]
-    def committed(list, timeout_ms=200)
-      unless list.is_a?(TopicPartitionList)
-        raise TypeError.new("list has to be a TopicPartitionList")
+    def committed(list=nil, timeout_ms=200)
+      if list.nil?
+        list = assignment
+      elsif !list.is_a?(TopicPartitionList)
+        raise TypeError.new("list has to be nil or a TopicPartitionList")
       end
-      tpl = list.copy_tpl
+      tpl = list.to_native_tpl
       response = Rdkafka::Bindings.rd_kafka_committed(@native_kafka, tpl, timeout_ms)
       if response != 0
         raise Rdkafka::RdkafkaError.new(response)
       end
-      Rdkafka::Consumer::TopicPartitionList.new(tpl)
+      TopicPartitionList.from_native_tpl(tpl)
     end
 
     # Query broker for low (oldest/beginning) and high (newest/end) offsets for a partition.
@@ -160,11 +194,21 @@ module Rdkafka
     # @raise [RdkafkaError] When committing fails
     #
     # @return [nil]
-    def commit(async=false)
-      response = Rdkafka::Bindings.rd_kafka_commit(@native_kafka, nil, async)
+    def commit(list=nil, async=false)
+      if !list.nil? && !list.is_a?(TopicPartitionList)
+        raise TypeError.new("list has to be nil or a TopicPartitionList")
+      end
+      tpl = if list
+        list.to_native_tpl
+      else
+        nil
+      end
+      response = Rdkafka::Bindings.rd_kafka_commit(@native_kafka, tpl, async)
       if response != 0
         raise Rdkafka::RdkafkaError.new(response)
       end
+    ensure
+      Rdkafka::Bindings.rd_kafka_topic_partition_list_destroy(tpl) if tpl
     end
 
     # Poll for the next message on one of the subscribed topics
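
To tie the consumer changes above together, a minimal sketch that exercises the new argument-less `commit` plus `assignment` and `committed`, assuming the docker-compose cluster from this release and its pre-created `consume_test_topic`:

```ruby
require "rdkafka"

consumer = Rdkafka::Config.new(
  :"bootstrap.servers" => "localhost:9092",
  :"group.id"          => "assignment_example"
).consumer
consumer.subscribe("consume_test_topic")

consumer.each do |message|
  puts "Received: #{message.payload}"

  # Synchronous commit of the current assignment (the list argument defaults to nil).
  consumer.commit

  # The partitions this consumer currently owns, and their committed offsets.
  puts consumer.assignment.to_h.inspect
  puts consumer.committed.to_h.inspect
  break
end
```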