heller 0.0.3-java → 0.2.0-java

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: eb77b9b32f52f9f8cb4aefbdf3d27af37b402283
+   data.tar.gz: dcf3fbab3d4ee6cb31f0da70e98f18fe3994b2b6
+ SHA512:
+   metadata.gz: 8a4b163c1f7523d3035cf73087b8d26dc366845b66c3c39845e9476fcb9d00b32944a7307a9e746deb833db722605290e2576322116b0a5ccbd65bb68027d6f1
+   data.tar.gz: 9658bb7e731c353d0269bcce8c150c7f47883d21bf0c592d937eaab50c641b0f064037dabe846698443d2acc7987995eb9084152aac60f2d5b06ed9bd4d5fa9d
data/README.md ADDED
@@ -0,0 +1,193 @@
+ # Heller
+
+ [![Build Status](https://travis-ci.org/mthssdrbrg/heller.svg?branch=master)](https://travis-ci.org/mthssdrbrg/heller)
+ [![Coverage Status](https://coveralls.io/repos/mthssdrbrg/heller/badge.svg?branch=master)](https://coveralls.io/r/mthssdrbrg/heller?branch=master)
+
+ Heller is a JRuby wrapper around the Kafka Producer and (Simple)Consumer
+ APIs, much like [Mikka](https://github.com/iconara/mikka) is for Akka's Java API.
+
+ The goal of Heller is to make the Producer and Consumer APIs of Kafka a bit more
+ Rubyesque, and useful as building blocks for creating more advanced producer and
+ consumer implementations.
+
+ ## Producer API
+
+ ```Heller::Producer``` is an extremely simple wrapper class around
+ ```kafka.javaapi.producer.Producer``` and provides some convenience in configuring the
+ producer with more Rubyesque names for configuration parameters.
+
+ All configuration parameters are supported and can be used in the following way:
+
+ ```ruby
+ require 'heller'
+
+ producer = Heller::Producer.new('localhost:9092,localhost:9093', {
+   :type => :async,
+   :serializer => 'kafka.serializer.StringEncoder',
+   :key_serializer => 'kafka.serializer.DefaultEncoder',
+   :partitioner => 'kafka.producer.DefaultPartitioner',
+   :compression => :gzip,
+   :num_retries => 5,
+   :retry_backoff => 1500,
+   :metadata_refresh_interval => 5000,
+   :batch_size => 2000,
+   :client_id => 'spec-client',
+   :request_timeout => 10000,
+   :buffer_limit => 100 * 100,
+   :buffer_timeout => 1000 * 100,
+   :enqueue_timeout => 1000,
+   :socket_buffer => 1024 * 1000,
+   :ack => -1
+ })
+ ```
+
+ Check the official [Kafka docs](http://kafka.apache.org/documentation.html#producerconfigs) for possible values for each parameter.
+
+ To send messages you create instances of ```Heller::Message``` and feed them to the
+ ```#push``` method of the producer:
+
+ ```ruby
+ messages = 3.times.map { Heller::Message.new('test', 'my message!') }
+ producer.push(messages)
+ ```
+
+ Want to partition messages based on some key? Sure, no problem:
+
+ ```ruby
+ messages = [0, 1, 2].map { |key| Heller::Message.new('test', "my message using #{key} as key!", key.to_s) }
+ producer.push(messages)
+ ```
+
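+ When you're done with the producer you can disconnect it, either via
+ ```#disconnect``` or its alias ```#close```, both of which close the underlying
+ producer:
+
+ ```ruby
+ producer.disconnect
+ ```
+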
+ ## Consumer API
+
+ ```Heller::Consumer``` wraps ```kafka.javaapi.consumer.SimpleConsumer``` and provides
+ basically the same methods, but with a bit more convenience (or at least I'd
+ like to think so).
+
+ A ```Consumer``` can be created in the following way:
+
+ ```ruby
+ require 'heller'
+
+ options = {
+   # 'generic' consumer options
+   :timeout => 5000,            # socket timeout
+   :buffer_size => 128 * 1024,  # socket buffer size
+   :client_id => 'my-consumer', # id of consumer
+   # fetch request related options
+   :max_wait => 4500, # maximum time (ms) the consumer will wait for a response to a request
+   :min_bytes => 1024 # minimum amount of bytes the server (broker) should return for a fetch request
+ }
+
+ consumer = Heller::Consumer.new('localhost:9092', options)
+ ```
+
+ The options specified in the options hash are also described in the official
+ [Kafka docs](http://kafka.apache.org/documentation.html#consumerconfigs), albeit in the context of the high-level
+ consumer.
+
+ The consumer API exposes the following methods: ```#fetch```, ```#metadata```,
+ ```#offsets_before```, ```#earliest_offset``` and ```#latest_offset```, and
+ their usage is described below.
+
+ ### Fetching messages
+
+ ```ruby
+ topic = 'my-topic'
+ partition = offset = 0
+
+ response = consumer.fetch(Heller::FetchRequest.new(topic, partition, offset))
+
+ if response.error? && (error_code = response.error(topic, partition)) != 0
+   puts "Got error #{Heller::Errors.error_for(error_code)}!"
+ else
+   message_enumerator = response.messages(topic, partition)
+   message_enumerator.each do |offset, message_payload|
+     puts "#{offset}: #{message_payload}"
+   end
+ end
+ ```
+
+ See ```Heller::FetchResponse``` (and the related specs) for usage of other
+ methods.
+
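+ One such method is ```#high_watermark```, which returns the high watermark for
+ a topic-partition combination, roughly the largest offset that is available
+ for consumption:
+
+ ```ruby
+ puts "High watermark for #{topic}:#{partition} is #{response.high_watermark(topic, partition)}"
+ ```
+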
+ It's also possible to pass an array of ```FetchRequest``` objects to ```#fetch```:
+
+ ```ruby
+ requests = [0, 1, 2].map { |i| Heller::FetchRequest.new(topic, i, offset) }
+ fetch_response = consumer.fetch(requests)
+ ```
+
+ ### Topic and partition metadata
+
+ ```kafka.javaapi.consumer.SimpleConsumer``` exposes a method called ```#topic_metadata```, which in Heller has been "renamed" to just ```#metadata```.
+
+ ```ruby
+ topics = [1, 2, 3].map { |i| "my-topic-#{i}" }
+
+ response = consumer.metadata(topics)
+
+ response.each do |topic, partition_metadata|
+   puts "Got metadata for (#{topic}:#{partition_metadata.partition_id})"
+ end
+
+ leader = response.leader_for('my-topic-1', 0)
+ puts "Leader for my-topic-1:0 is at #{leader.connection_string} (#{leader.zk_string})"
+
+ isrs = response.isr_for('my-topic-1', 0) # also aliased as #in_sync_replicas_for
+ isrs.each do |isr|
+   puts "An in-sync replica for my-topic-1:0 is at #{isr.connection_string} (#{isr.zk_string})"
+ end
+ ```
+
+ ### Get offsets for topic-partition combinations
+
+ ```ruby
+ # arguments: topic, partition, timestamp (ms), maximum number of offsets
+ request = Heller::OffsetRequest.new('my-topic', 0, Time.now.to_i * 1000, 10)
+ response = consumer.offsets_before(request)
+
+ if response.error? && (error_code = response.error('my-topic', 0)) != 0
+   puts "Got error #{Heller::Errors.error_for(error_code)}!"
+ else
+   offsets = response.offsets('my-topic', 0)
+   puts "Got #{offsets.join(', ')} offsets for my-topic:0"
+ end
+ ```
+
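+ Instead of an explicit timestamp you can use the special "timestamps" that
+ Kafka understands, which ```Heller::OffsetRequest``` exposes as
+ ```.earliest_time``` and ```.latest_time```:
+
+ ```ruby
+ earliest = Heller::OffsetRequest.new('my-topic', 0, Heller::OffsetRequest.earliest_time)
+ latest = Heller::OffsetRequest.new('my-topic', 0, Heller::OffsetRequest.latest_time)
+ ```
+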
+ ```Heller::Consumer``` also exposes (as ```SimpleConsumer``` does) two convenience
+ methods for retrieving the earliest / latest offset for a topic-partition
+ combination:
+
+ ```ruby
+ earliest_offset = consumer.earliest_offset('my-topic', 0)
+ latest_offset = consumer.latest_offset('my-topic', 0)
+
+ puts "Earliest available offset is #{earliest_offset}"
+ puts "Latest available offset is #{latest_offset}"
+ ```
+
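+ As with the producer, disconnect the consumer once you're done with it
+ (```#disconnect``` is also aliased as ```#close```):
+
+ ```ruby
+ consumer.disconnect
+ ```
+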
+ ## Status
+
+ The project is currently under development, and I wouldn't really recommend
+ using it in any form of production environment.
+ There is still quite some work that needs to be done, especially for the Consumer API.
+ The Producer API is more or less done, for the moment at least.
+
+ It's getting there, though I'm mostly doing this during my spare time, which is
+ sparse at times.
+
+ ## Copyright
+
+ Copyright 2013-2015 Mathias Söderberg and contributors
+
+ Licensed under the Apache License, Version 2.0 (the "License"); you may not use
+ this file except in compliance with the License. You may obtain a copy of the
+ License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software distributed
+ under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
+ CONDITIONS OF ANY KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations under the License.
@@ -0,0 +1,41 @@
+ # encoding: utf-8
+
+ module Heller
+   class Configuration
+     def initialize(options={})
+       @configuration = merge_with_defaults(options)
+     end
+
+     def [](key)
+       @configuration[key.to_sym]
+     end
+
+     def to_java
+       kafka_config_class.new(to_properties)
+     end
+
+     protected
+
+     def defaults
+       {}
+     end
+
+     private
+
+     def merge_with_defaults(options)
+       options.each_with_object(defaults) do |(k, v), h|
+         h[k.to_sym] = v
+       end
+     end
+
+     def convert_key(key)
+       key_mappings.key?(key) ? key_mappings[key] : key.to_s.gsub('_', '.')
+     end
+
+     def to_properties
+       @configuration.each_with_object(Properties.new) do |(key, value), props|
+         props.put(convert_key(key), value.to_s)
+       end
+     end
+   end
+ end
@@ -1,59 +1,95 @@
+ # encoding: utf-8
+
+ require 'securerandom'
+
+
  module Heller
-   class Consumer < Kafka::Consumer::SimpleConsumer
-
-     LATEST_OFFSET = -1.freeze
-     EARLIEST_OFFSET = -2.freeze
-
-     MAX_FETCH_SIZE = 1000000.freeze
-
-     def latest_offset(topic, partition)
-       single_offset(topic, partition, LATEST_OFFSET)
-     end
-
-     def earliest_offset(topic, partition)
-       single_offset(topic, partition, EARLIEST_OFFSET)
-     end
-
-     def fetch_request(topic, partition, offset, max_size)
-       Kafka::Api::FetchRequest.new(topic, partition, offset, max_size)
-     end
-
-     def consume(topic, partition, offset, max_size = MAX_FETCH_SIZE)
-       request = fetch_request(topic, partition, offset, max_size)
-
-       messages = fetch(request)
-       messages.to_a
-     end
-
-     def multi_fetch(topics_hash, max_size = MAX_FETCH_SIZE)
-       requests = topics_hash.map { |topic, hash| fetch_request(topic, hash[:partition], hash[:offset], max_size) }
-
-       response = multifetch(ArrayList.new(requests))
-       parse_multi_fetch_response(topics_hash.keys, response)
-     end
-
-     protected
-
-     def single_offset(topic, partition, time)
-       offsets = get_offsets_before(topic, partition, time, 1)
-       offsets = offsets.to_a
-
-       offsets.first unless offsets.empty?
-     end
-
-     def parse_multi_fetch_response(topics, response)
-       if response.respond_to?(:to_a)
-         response_array = topics.zip(response.to_a)
-
-         response_array.inject({}) do |response_hash, (topic, messages)|
-           response_hash[topic] = messages.to_a
-           response_hash
-         end
-       else
-         puts 'response does not respond to #to_a'
-         puts "response is #{response.inspect}"
-         puts "topics were: #{topics.inspect}"
-       end
-     end
-   end
+   class Consumer
+     def initialize(connect_string, options = {})
+       @host, @port = connect_string.split(':')
+       options = defaults.merge(options)
+       @consumer = create_consumer(options)
+       @build_options = options.select { |k, _| BUILD_OPTIONS.include?(k) }
+       @decoder = Kafka::Serializer::StringDecoder.new(nil)
+     end
+
+     def client_id
+       @consumer.client_id
+     end
+
+     def fetch(fetch_requests, fetch_size = DEFAULT_FETCH_SIZE)
+       builder = create_builder(@build_options)
+       Array(fetch_requests).each do |request|
+         builder.add_fetch(request.topic, request.partition, request.offset, fetch_size)
+       end
+       raw_response = @consumer.fetch(builder.build)
+       FetchResponse.new(raw_response, @decoder)
+     end
+
+     def metadata(topics=[])
+       request = Kafka::JavaApi::TopicMetadataRequest.new(topics)
+       TopicMetadataResponse.new(@consumer.send(request))
+     end
+     alias_method :topic_metadata, :metadata
+
+     def offsets_before(offset_requests)
+       request_info = Array(offset_requests).each_with_object({}) do |request, memo|
+         topic_partition = Kafka::Common::TopicAndPartition.new(request.topic, request.partition)
+         partition_offset = Kafka::Api::PartitionOffsetRequestInfo.new(request.time.to_i, request.max_offsets)
+
+         memo[topic_partition] = partition_offset
+       end
+
+       request = Kafka::JavaApi::OffsetRequest.new(request_info, OffsetRequest.current_version, client_id)
+       OffsetResponse.new(@consumer.get_offsets_before(request))
+     end
+
+     def earliest_offset(topic, partition)
+       response = offsets_before(OffsetRequest.new(topic, partition, OffsetRequest.earliest_time))
+       response.offsets(topic, partition).first
+     end
+
+     def latest_offset(topic, partition)
+       response = offsets_before(OffsetRequest.new(topic, partition, OffsetRequest.latest_time))
+       response.offsets(topic, partition).last
+     end
+
+     def disconnect
+       @consumer.close
+     end
+     alias_method :close, :disconnect
+
+     private
+
+     DEFAULT_FETCH_SIZE = 1024 * 1024
+     BUILD_OPTIONS = [:client_id, :max_wait, :min_bytes].freeze
+
+     def defaults
+       {
+         timeout: 30 * 1000,
+         buffer_size: 64 * 1024,
+         client_id: generate_client_id
+       }
+     end
+
+     def generate_client_id
+       "heller-#{self.class.name.split('::').last.downcase}-#{SecureRandom.uuid}"
+     end
+
+     def create_consumer(options)
+       consumer_impl = options.delete(:consumer_impl) || Kafka::Consumer::SimpleConsumer
+       extra_options = options.values_at(:timeout, :buffer_size, :client_id)
+       consumer_impl.new(@host, @port.to_i, *extra_options)
+     end
+
+     def create_builder(options)
+       builder = Kafka::Api::FetchRequestBuilder.new
+
+       BUILD_OPTIONS.each do |symbol|
+         builder.send(symbol, options[symbol]) if options[symbol]
+       end
+
+       builder
+     end
+   end
  end
@@ -0,0 +1,38 @@
+ # encoding: utf-8
+
+ module Heller
+   class ConsumerConfiguration < Configuration
+
+     protected
+
+     def key_mappings
+       @key_mappings ||= {
+         auto_commit: 'auto.commit.enable',
+         auto_commit_interval: 'auto.commit.interval.ms',
+         auto_reset_offset: 'auto.offset.reset',
+         client_id: 'client.id',
+         consumer_id: 'consumer.id',
+         fetch_message_max_bytes: 'fetch.message.max.bytes',
+         fetch_min_bytes: 'fetch.min.bytes',
+         fetch_max_wait: 'fetch.wait.max.ms',
+         group_id: 'group.id',
+         num_fetchers: 'num.consumer.fetchers',
+         max_queued_message_chunks: 'queued.max.message.chunks',
+         receive_buffer: 'socket.receive.buffer.bytes',
+         rebalance_retries: 'rebalance.max.retries',
+         rebalance_retry_backoff: 'rebalance.backoff.ms',
+         refresh_leader_backoff: 'refresh.leader.backoff.ms',
+         socket_timeout: 'socket.timeout.ms',
+         timeout: 'consumer.timeout.ms',
+         zk_connect: 'zookeeper.connect',
+         zk_session_timeout: 'zookeeper.session.timeout.ms',
+         zk_connection_timeout: 'zookeeper.connection.timeout.ms',
+         zk_sync_time: 'zookeeper.sync.time.ms',
+       }.freeze
+     end
+
+     def kafka_config_class
+       Kafka::Consumer::ConsumerConfig
+     end
+   end
+ end
@@ -0,0 +1,9 @@
+ # encoding: utf-8
+
+ module Heller
+   Errors = Kafka::Errors::ErrorMapping
+
+   class Errors
+     self.singleton_class.send(:alias_method, :error_for, :exception_for)
+   end
+ end
@@ -0,0 +1,11 @@
+ # encoding: utf-8
+
+ module Heller
+   class FetchRequest
+     attr_reader :topic, :partition, :offset
+
+     def initialize(*args)
+       @topic, @partition, @offset = *args
+     end
+   end
+ end
@@ -0,0 +1,33 @@
+ # encoding: utf-8
+
+ module Heller
+   class FetchResponse
+     def initialize(underlying, decoder)
+       @underlying, @decoder = underlying, decoder
+     end
+
+     def error?
+       @underlying.has_error?
+     end
+
+     def error(topic, partition)
+       convert_error { @underlying.error_code(topic, partition) }
+     end
+
+     def messages(topic, partition)
+       convert_error { MessageSetEnumerator.new(@underlying.message_set(topic, partition), @decoder) }
+     end
+
+     def high_watermark(topic, partition)
+       convert_error { @underlying.high_watermark(topic, partition) }
+     end
+
+     private
+
+     def convert_error
+       yield
+     rescue IllegalArgumentException => e
+       raise NoSuchTopicPartitionCombinationError, e.message, e.backtrace
+     end
+   end
+ end
@@ -0,0 +1,9 @@
+ # encoding: utf-8
+
+ module Heller
+   class Message < Kafka::Producer::KeyedMessage
+     def initialize(topic, message, key = nil)
+       super(topic, key, message)
+     end
+   end
+ end
@@ -0,0 +1,35 @@
+ # encoding: utf-8
+
+ module Heller
+   class MessageSetEnumerator
+     include Enumerable
+
+     def initialize(message_set, decoder)
+       @iterator, @decoder = message_set.iterator, decoder
+     end
+
+     def each
+       loop do
+         yield self.next
+       end
+     end
+
+     def next
+       if @iterator.has_next?
+         item = @iterator.next
+         offset, payload = item.offset, item.message.payload
+         [offset, decode(payload)]
+       else
+         raise StopIteration
+       end
+     end
+
+     private
+
+     def decode(payload)
+       bytes = Java::byte[payload.limit].new
+       payload.get(bytes)
+       @decoder.from_bytes(bytes)
+     end
+   end
+ end
@@ -0,0 +1,23 @@
+ # encoding: utf-8
+
+ module Heller
+   class OffsetRequest
+     attr_reader :topic, :partition, :time, :max_offsets
+
+     def self.latest_time
+       Kafka::Api::OffsetRequest.latest_time
+     end
+
+     def self.earliest_time
+       Kafka::Api::OffsetRequest.earliest_time
+     end
+
+     def self.current_version
+       Kafka::Api::OffsetRequest.current_version
+     end
+
+     def initialize(topic, partition, time, offsets = 1)
+       @topic, @partition, @time, @max_offsets = topic, partition, time, offsets
+     end
+   end
+ end
@@ -0,0 +1,29 @@
+ # encoding: utf-8
+
+ module Heller
+   class OffsetResponse
+     def initialize(underlying)
+       @underlying = underlying
+     end
+
+     def offsets(topic, partition)
+       convert_error { @underlying.offsets(topic, partition).to_a }
+     end
+
+     def error?
+       @underlying.has_error?
+     end
+
+     def error(topic, partition)
+       convert_error { @underlying.error_code(topic, partition) }
+     end
+
+     private
+
+     def convert_error
+       yield
+     rescue NoSuchElementException => e
+       raise NoSuchTopicPartitionCombinationError, e.message, e.backtrace
+     end
+   end
+ end
@@ -1,39 +1,25 @@
- module Heller
-   class Producer < Kafka::Producer::Producer
-
-     attr_reader :configuration
-
-     def initialize(zk_connect, options = {})
-       options.merge!({
-         'zk.connect' => zk_connect
-       })
-
-       @configuration = Kafka::Producer::ProducerConfig.new(hash_to_properties(options))
-       super @configuration
-     end
-
-     def produce(topic_mappings)
-       producer_data = topic_mappings.map do |topic, hash|
-         if hash[:key]
-           Kafka::Producer::ProducerData.new(topic, hash[:key], hash[:messages])
-         else
-           Kafka::Producer::ProducerData.new(topic, hash[:messages])
-         end
-       end
+ # encoding: utf-8

-       send(producer_data)
-     end
-
-     protected
-
-     def hash_to_properties(options)
-       properties = java.util.Properties.new
-
-       options.each do |key, value|
-         properties.put(key.to_s, value.to_s)
-       end
-
-       properties
-     end
-   end
- end
+ module Heller
+   class Producer
+     def initialize(broker_list, options = {})
+       @producer = create_producer(options.merge(brokers: broker_list))
+     end
+
+     def push(messages)
+       @producer.send(ArrayList.new(Array(messages)))
+     end
+
+     def disconnect
+       @producer.close
+     end
+     alias_method :close, :disconnect
+
+     private
+
+     def create_producer(options)
+       producer_impl = options.delete(:producer_impl) || Kafka::Producer::Producer
+       producer_impl.new(ProducerConfiguration.new(options).to_java)
+     end
+   end
+ end