rdkafka 0.3.5 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f2ca53e4caa266a4e9b4cd1fd5132626259629c1835c6923b8b5b252185884c5
-   data.tar.gz: d748f67df1abc442410de913a8b67b600264aeba55cef148ac01b0a7e863a7ce
+   metadata.gz: c361e42d0a0eb4ef3c6b7baa3509baa675f35ed8b96660e5095ba8c09ef58163
+   data.tar.gz: dbf82f43bb4f69fc2bdc43f931c28009578a331eb8e4bc4bb3606b2cdeedefc8
  SHA512:
-   metadata.gz: 65900548523763e0c448d8ceb8a08fdc96d8404fcef169db0724fc8ef50336a0bd4b50d622371eaf981ecc808d432c70b44b881a044903c7fa2f20bdacf23987
-   data.tar.gz: 8d157b0d6e3e82158c4c08c1bfed99e6ccb9c1d6ef287af81e20932c33adc82566a8df239fce2a227eea42b29f991b55499770ab85e87ca61b57afcc94b686d2
+   metadata.gz: 74485766bf68d0bb42e756dc3c381da604e01500682dcdd08b0d79a9c736696fbe4845ede037a820e5eb9482626f0ad8355344bb15f47235810e29d1e6219f28
+   data.tar.gz: ae8ec9d6a81fbdc1931c3f629fd92a32d31297fb055fcecca97f322248d31a73ce23f9fdc356645215960dd34b8a979e9288afdd8d5df9b031e2cd19a428825d
data/.travis.yml CHANGED
@@ -2,9 +2,13 @@ language: ruby

  sudo: false

+ services:
+   - docker
+
  env:
    global:
      - CC_TEST_REPORTER_ID=9f7f740ac1b6e264e1189fa07a6687a87bcdb9f3c0f4199d4344ab3b538e187e
+     - KAFKA_HEAP_OPTS="-Xmx512m -Xms512m"

  rvm:
    - 2.1
@@ -15,14 +19,10 @@ rvm:

  before_install:
    - gem update --system
-   - wget http://www.us.apache.org/dist/kafka/1.0.0/kafka_2.12-1.0.0.tgz -O kafka.tgz
-   - mkdir -p kafka && tar xzf kafka.tgz -C kafka --strip-components 1
-   - nohup bash -c "cd kafka && bin/zookeeper-server-start.sh config/zookeeper.properties &"
-   - nohup bash -c "cd kafka && bin/kafka-server-start.sh config/server.properties &"

  before_script:
+   - docker-compose up -d
    - cd ext && bundle exec rake && cd ..
-   - bundle exec rake create_topics
    - curl -L https://codeclimate.com/downloads/test-reporter/test-reporter-latest-linux-amd64 > ./cc-test-reporter
    - chmod +x ./cc-test-reporter
    - ./cc-test-reporter before-build
@@ -31,5 +31,5 @@ script:
    - bundle exec rspec

  after_script:
+   - docker-compose stop
    - ./cc-test-reporter after-build --exit-code $TRAVIS_TEST_RESULT
-   - killall -9 java
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+ # 0.4.0
+ * Improvements in librdkafka archive download
+ * Add global statistics callback
+ * Use Time for timestamps. This is a potentially breaking change if you
+   rely on the previous behavior, where an integer number of
+   milliseconds was returned.
+ * Bump librdkafka to 0.11.5
+ * Implement TopicPartitionList in Ruby so we don't have to keep
+   track of native objects.
+ * Support committing a topic partition list
+ * Add consumer assignment method
+
  # 0.3.5
  * Fix crash when not waiting for delivery handles
  * Run specs on Ruby 2.5
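The timestamp entry above is the potentially breaking change when upgrading. A minimal sketch of the new behavior, assuming a consumer set up as in the README below, and assuming `message.timestamp` can be nil when the broker supplied no timestamp:

```ruby
consumer.each do |message|
  # As of 0.4.0, message.timestamp is a Time; in 0.3.x it was an integer
  # number of milliseconds since the epoch.
  next unless message.timestamp
  puts "Consumed at #{message.timestamp.utc}"
  # To recover the old integer-milliseconds value:
  puts((message.timestamp.to_f * 1000).to_i)
end
```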
data/README.md CHANGED
@@ -8,12 +8,16 @@
  The `rdkafka` gem is a modern Kafka client library for Ruby based on
  [librdkafka](https://github.com/edenhill/librdkafka/).
  It wraps the production-ready C client using the [ffi](https://github.com/ffi/ffi)
- gem and targets Kafka 0.10+ and Ruby 2.1+.
+ gem and targets Kafka 1.0+ and Ruby 2.1+.

  This gem only provides a high-level Kafka consumer. If you are running
  an older version of Kafka and/or need the legacy simple consumer, we
  suggest using the [Hermann](https://github.com/reiseburo/hermann) gem.

+ The most important pieces of a Kafka client are implemented. We're
+ working towards feature completeness; you can track progress here:
+ https://github.com/appsignal/rdkafka-ruby/milestone/1
+
  ## Installation

  This gem downloads and compiles librdkafka when it is installed. If you
@@ -25,6 +29,9 @@ See the [documentation](http://www.rubydoc.info/github/thijsc/rdkafka-ruby/maste

  ### Consuming messages

+ Subscribe to a topic and get messages. Kafka will automatically spread
+ the available partitions over consumers with the same group id.
+
  ```ruby
  config = {
    :"bootstrap.servers" => "localhost:9092",
@@ -40,25 +47,45 @@ end

  ### Producing messages

+ Produce a number of messages, put the delivery handles in an array, and
+ wait for them before exiting. This way the messages are batched and
+ sent to Kafka efficiently.
+
  ```ruby
  config = {:"bootstrap.servers" => "localhost:9092"}
  producer = Rdkafka::Config.new(config).producer
+ delivery_handles = []

  100.times do |i|
    puts "Producing message #{i}"
-   producer.produce(
+   delivery_handles << producer.produce(
      topic: "ruby-test-topic",
      payload: "Payload #{i}",
      key: "Key #{i}"
-   ).wait
+   )
  end
+
+ delivery_handles.each(&:wait)
  ```

+ ## Known issues
+
+ When using a forking server such as Unicorn, you currently need to make
+ sure that you create rdkafka instances after forking. Otherwise they
+ will not work and will crash your Ruby process when they are garbage
+ collected. See https://github.com/appsignal/rdkafka-ruby/issues/19
+
  ## Development

- For development we expect a local zookeeper and kafka instance to be
- running. Run `bundle` and `cd ext && bundle exec rake && cd ..`. Then
- create the topics as expected in the specs: `bundle exec rake create_topics`.
+ A Docker Compose file is included to run Kafka and Zookeeper. To start
+ it:
+
+ ```
+ docker-compose up
+ ```
+
+ Run `bundle` and `cd ext && bundle exec rake && cd ..` to download and
+ compile `librdkafka`.

  You can then run `bundle exec rspec` to run the tests. To see rdkafka
  debug output:
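Regarding the "Known issues" section added above: with a forking server this means creating clients in each worker, not in the master. A minimal sketch for Unicorn's `config/unicorn.rb`, where `$producer` is a hypothetical global used by the application:

```ruby
# config/unicorn.rb
after_fork do |_server, _worker|
  # Create the rdkafka producer only after the worker has forked;
  # an instance inherited from the master process crashes the worker
  # when it is garbage collected.
  $producer = Rdkafka::Config.new(
    :"bootstrap.servers" => "localhost:9092"
  ).producer
end
```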
@@ -68,6 +95,15 @@ DEBUG_PRODUCER=true bundle exec rspec
  DEBUG_CONSUMER=true bundle exec rspec
  ```

+ After running the tests you can bring the cluster down to start with a
+ clean slate:
+
+ ```
+ docker-compose down
+ ```
+
+ ## Example
+
  To see everything working run these in separate tabs:

  ```
data/Rakefile CHANGED
@@ -1,33 +1,24 @@
  require "./lib/rdkafka"

- task :create_topics do
-   puts "Creating test topics"
-   kafka_topics = if ENV['TRAVIS']
-     'kafka/bin/kafka-topics.sh'
-   else
-     'kafka-topics'
-   end
-   `#{kafka_topics} --create --topic=consume_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-   `#{kafka_topics} --create --topic=empty_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-   `#{kafka_topics} --create --topic=load_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-   `#{kafka_topics} --create --topic=produce_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
-   `#{kafka_topics} --create --topic=rake_test_topic --zookeeper=127.0.0.1:2181 --partitions=3 --replication-factor=1`
- end
-
  task :produce_messages do
    config = {:"bootstrap.servers" => "localhost:9092"}
    if ENV["DEBUG"]
      config[:debug] = "broker,topic,msg"
    end
    producer = Rdkafka::Config.new(config).producer
+
+   delivery_handles = []
    100.times do |i|
      puts "Producing message #{i}"
-     producer.produce(
+     delivery_handles << producer.produce(
        topic: "rake_test_topic",
        payload: "Payload #{i} from Rake",
        key: "Key #{i} from Rake"
-     ).wait
+     )
    end
+   puts 'Waiting for delivery'
+   delivery_handles.each(&:wait)
+   puts 'Done'
  end

  task :consume_messages do
@@ -35,11 +26,16 @@ task :consume_messages do
    :"bootstrap.servers" => "localhost:9092",
    :"group.id" => "rake_test",
    :"enable.partition.eof" => false,
-   :"auto.offset.reset" => "earliest"
+   :"auto.offset.reset" => "earliest",
+   :"statistics.interval.ms" => 10_000
  }
  if ENV["DEBUG"]
    config[:debug] = "cgrp,topic,fetch"
  end
+ Rdkafka::Config.statistics_callback = lambda do |stats|
+   puts stats
+ end
  consumer = Rdkafka::Config.new(config).consumer
  consumer.subscribe("rake_test_topic")
  consumer.each do |message|
data/docker-compose.yml ADDED
@@ -0,0 +1,18 @@
+ version: '2'
+ services:
+   zookeeper:
+     image: wurstmeister/zookeeper
+     ports:
+       - "2181:2181"
+   kafka:
+     image: wurstmeister/kafka:1.0.1
+     ports:
+       - "9092:9092"
+     environment:
+       KAFKA_ADVERTISED_HOST_NAME: localhost
+       KAFKA_ADVERTISED_PORT: 9092
+       KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
+       KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
+       KAFKA_CREATE_TOPICS: "consume_test_topic:3:1,empty_test_topic:3:1,load_test_topic:3:1,produce_test_topic:3:1,rake_test_topic:3:1"
+     volumes:
+       - /var/run/docker.sock:/var/run/docker.sock
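Each entry in `KAFKA_CREATE_TOPICS` above uses the `name:partitions:replicas` format of the wurstmeister/kafka image, so `consume_test_topic:3:1` recreates the same 3-partition, replication-factor-1 topics that the removed `create_topics` Rake task used to set up.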
@@ -3,9 +3,25 @@ require "mini_portile2"
3
3
  require "fileutils"
4
4
 
5
5
  task :default => :clean do
6
+ # MiniPortile#download_file_http is a monkey patch that removes the download
7
+ # progress indicator. This indicator relies on the 'Content Length' response
8
+ # headers, which is not set by GitHub
9
+ class MiniPortile
10
+ def download_file_http(url, full_path, _count)
11
+ filename = File.basename(full_path)
12
+ with_tempfile(filename, full_path) do |temp_file|
13
+ params = { 'Accept-Encoding' => 'identity' }
14
+ OpenURI.open_uri(url, 'rb', params) do |io|
15
+ temp_file.write(io.read)
16
+ end
17
+ output
18
+ end
19
+ end
20
+ end
21
+
6
22
  # Download and compile librdkafka
7
23
  recipe = MiniPortile.new("librdkafka", Rdkafka::LIBRDKAFKA_VERSION)
8
- recipe.files = ["https://github.com/edenhill/librdkafka/archive/v#{Rdkafka::LIBRDKAFKA_VERSION}.tar.gz"]
24
+ recipe.files = ["https://codeload.github.com/edenhill/librdkafka/tar.gz/v#{Rdkafka::LIBRDKAFKA_VERSION}"]
9
25
  recipe.configure_options = ["--host=#{recipe.host}"]
10
26
  recipe.cook
11
27
  # Move dynamic library we're interested in
data/lib/rdkafka/bindings.rb CHANGED
@@ -1,4 +1,5 @@
  require "ffi"
+ require "json"
  require "logger"

  module Rdkafka
@@ -43,7 +44,7 @@ module Rdkafka
  # TopicPartition and TopicPartitionList structs

  class TopicPartition < FFI::Struct
-   layout :topic, :string,
+   layout :topic, :string,
           :partition, :int32,
           :offset, :int64,
           :metadata, :pointer,
@@ -61,6 +62,7 @@ module Rdkafka

  attach_function :rd_kafka_topic_partition_list_new, [:int32], :pointer
  attach_function :rd_kafka_topic_partition_list_add, [:pointer, :string, :int32], :void
+ attach_function :rd_kafka_topic_partition_list_set_offset, [:pointer, :string, :int32, :int64], :void
  attach_function :rd_kafka_topic_partition_list_destroy, [:pointer], :void
  attach_function :rd_kafka_topic_partition_list_copy, [:pointer], :pointer

@@ -81,6 +83,8 @@ module Rdkafka
  attach_function :rd_kafka_conf_set, [:pointer, :string, :string, :pointer, :int], :kafka_config_response
  callback :log_cb, [:pointer, :int, :string, :string], :void
  attach_function :rd_kafka_conf_set_log_cb, [:pointer, :log_cb], :void
+ callback :stats_cb, [:pointer, :string, :int, :pointer], :int
+ attach_function :rd_kafka_conf_set_stats_cb, [:pointer, :stats_cb], :void

  # Log queue
  attach_function :rd_kafka_set_log_queue, [:pointer, :pointer], :void
@@ -106,6 +110,19 @@ module Rdkafka
    Rdkafka::Config.logger.add(severity) { "rdkafka: #{line}" }
  end

+ StatsCallback = FFI::Function.new(
+   :int, [:pointer, :string, :int, :pointer]
+ ) do |_client_ptr, json, _json_len, _opaque|
+   # Pass the stats hash to the callback set in the config
+   if Rdkafka::Config.statistics_callback
+     stats = JSON.parse(json)
+     Rdkafka::Config.statistics_callback.call(stats)
+   end
+
+   # Return 0 so librdkafka frees the JSON string
+   0
+ end
+
  # Handle

  enum :kafka_type, [
@@ -121,6 +138,7 @@ module Rdkafka
  attach_function :rd_kafka_subscribe, [:pointer, :pointer], :int
  attach_function :rd_kafka_unsubscribe, [:pointer], :int
  attach_function :rd_kafka_subscription, [:pointer, :pointer], :int
+ attach_function :rd_kafka_assign, [:pointer, :pointer], :int
  attach_function :rd_kafka_assignment, [:pointer, :pointer], :int
  attach_function :rd_kafka_committed, [:pointer, :pointer, :int], :int
  attach_function :rd_kafka_commit, [:pointer, :pointer, :bool], :int, blocking: true
data/lib/rdkafka/config.rb CHANGED
@@ -7,6 +7,7 @@ module Rdkafka
  class Config
    # @private
    @@logger = Logger.new(STDOUT)
+   @@statistics_callback = nil

    # Returns the current logger, by default this is a logger to stdout.
    #
@@ -25,6 +26,25 @@ module Rdkafka
      @@logger=logger
    end

+   # Set a callback that will be called every time the underlying client emits statistics.
+   # You can configure if and how often this happens using `statistics.interval.ms`.
+   # The callback is called with a hash that's documented here: https://github.com/edenhill/librdkafka/blob/master/STATISTICS.md
+   #
+   # @param callback [Proc] The callback
+   #
+   # @return [nil]
+   def self.statistics_callback=(callback)
+     raise TypeError.new("Callback has to be a proc or lambda") unless callback.is_a? Proc
+     @@statistics_callback = callback
+   end
+
+   # Returns the current statistics callback, by default this is nil.
+   #
+   # @return [Proc, nil]
+   def self.statistics_callback
+     @@statistics_callback
+   end
+
    # Default config that can be overwritten.
    DEFAULT_CONFIG = {
      # Request api version so advanced features work
@@ -122,8 +142,12 @@ module Rdkafka
          raise ConfigError.new(error_buffer.read_string)
        end
      end
+     # Set opaque pointer back to this config
+     # Rdkafka::Bindings.rd_kafka_conf_set_opaque(config, self)
      # Set log callback
      Rdkafka::Bindings.rd_kafka_conf_set_log_cb(config, Rdkafka::Bindings::LogCallback)
+     # Set stats callback
+     Rdkafka::Bindings.rd_kafka_conf_set_stats_cb(config, Rdkafka::Bindings::StatsCallback)
    end
  end

data/lib/rdkafka/consumer.rb CHANGED
@@ -65,34 +65,68 @@ module Rdkafka
  # @return [TopicPartitionList]
  def subscription
    tpl = FFI::MemoryPointer.new(:pointer)
+   tpl.autorelease = false
    response = Rdkafka::Bindings.rd_kafka_subscription(@native_kafka, tpl)
    if response != 0
      raise Rdkafka::RdkafkaError.new(response)
    end
-   Rdkafka::Consumer::TopicPartitionList.new(tpl.get_pointer(0))
+   Rdkafka::Consumer::TopicPartitionList.from_native_tpl(tpl.get_pointer(0))
+ end
+
+ # Atomic assignment of partitions to consume
+ #
+ # @param list [TopicPartitionList] The topic with partitions to assign
+ #
+ # @raise [RdkafkaError] When assigning fails
+ def assign(list)
+   unless list.is_a?(TopicPartitionList)
+     raise TypeError.new("list has to be a TopicPartitionList")
+   end
+   tpl = list.to_native_tpl
+   response = Rdkafka::Bindings.rd_kafka_assign(@native_kafka, tpl)
+   if response != 0
+     raise Rdkafka::RdkafkaError.new(response, "Error assigning '#{list.to_h}'")
+   end
+ ensure
+   Rdkafka::Bindings.rd_kafka_topic_partition_list_destroy(tpl) if tpl
+ end
+
+ # Returns the current partition assignment.
+ #
+ # @raise [RdkafkaError] When getting the assignment fails.
+ #
+ # @return [TopicPartitionList]
+ def assignment
+   tpl = FFI::MemoryPointer.new(:pointer)
+   tpl.autorelease = false
+   response = Rdkafka::Bindings.rd_kafka_assignment(@native_kafka, tpl)
+   if response != 0
+     raise Rdkafka::RdkafkaError.new(response)
+   end
+   Rdkafka::Consumer::TopicPartitionList.from_native_tpl(tpl.get_pointer(0))
  end

  # Return the current committed offset per partition for this consumer group.
  # The offset field of each requested partition will either be set to the stored offset or to -1001 in case there was no stored offset for that partition.
  #
- # TODO: This should use the subscription or assignment by default.
- #
- # @param list [TopicPartitionList] The topic with partitions to get the offsets for.
+ # @param list [TopicPartitionList, nil] The topic with partitions to get the offsets for, or nil to use the current assignment.
  # @param timeout_ms [Integer] The timeout for fetching this information.
  #
  # @raise [RdkafkaError] When getting the committed positions fails.
  #
  # @return [TopicPartitionList]
- def committed(list, timeout_ms=200)
-   unless list.is_a?(TopicPartitionList)
-     raise TypeError.new("list has to be a TopicPartitionList")
+ def committed(list=nil, timeout_ms=200)
+   if list.nil?
+     list = assignment
+   elsif !list.is_a?(TopicPartitionList)
+     raise TypeError.new("list has to be nil or a TopicPartitionList")
    end
-   tpl = list.copy_tpl
+   tpl = list.to_native_tpl
    response = Rdkafka::Bindings.rd_kafka_committed(@native_kafka, tpl, timeout_ms)
    if response != 0
      raise Rdkafka::RdkafkaError.new(response)
    end
-   Rdkafka::Consumer::TopicPartitionList.new(tpl)
+   TopicPartitionList.from_native_tpl(tpl)
  end

  # Query broker for low (oldest/beginning) and high (newest/end) offsets for a partition.
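A usage sketch for the new `assign` and `assignment` methods, assuming the Ruby `TopicPartitionList` mentioned in the changelog provides an `add_topic(topic, partitions)` helper (hypothetical signature here):

```ruby
consumer = Rdkafka::Config.new(config).consumer

# Manually assign partitions 0 and 1 instead of subscribing.
list = Rdkafka::Consumer::TopicPartitionList.new
list.add_topic("consume_test_topic", [0, 1]) # hypothetical helper
consumer.assign(list)

# assignment returns a TopicPartitionList; to_h is used by assign's
# error message above, so it is part of the list's API.
puts consumer.assignment.to_h
```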
@@ -160,11 +194,21 @@ module Rdkafka
  # @raise [RdkafkaError] When committing fails
  #
  # @return [nil]
- def commit(async=false)
-   response = Rdkafka::Bindings.rd_kafka_commit(@native_kafka, nil, async)
+ def commit(list=nil, async=false)
+   if !list.nil? && !list.is_a?(TopicPartitionList)
+     raise TypeError.new("list has to be nil or a TopicPartitionList")
+   end
+   tpl = if list
+     list.to_native_tpl
+   else
+     nil
+   end
+   response = Rdkafka::Bindings.rd_kafka_commit(@native_kafka, tpl, async)
    if response != 0
      raise Rdkafka::RdkafkaError.new(response)
    end
+ ensure
+   Rdkafka::Bindings.rd_kafka_topic_partition_list_destroy(tpl) if tpl
  end

  # Poll for the next message on one of the subscribed topics
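`commit` keeps its old no-argument behavior but now also accepts an optional `TopicPartitionList` and an async flag. A minimal sketch of the async variant; offsets for an explicit list would first need to be set, for example via the newly bound `rd_kafka_topic_partition_list_set_offset`:

```ruby
# Commit the current offsets for this consumer group without blocking
# until the commit completes (async = true):
consumer.commit(nil, true)
```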