heller 0.0.3-java → 0.2.0-java

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: eb77b9b32f52f9f8cb4aefbdf3d27af37b402283
+   data.tar.gz: dcf3fbab3d4ee6cb31f0da70e98f18fe3994b2b6
+ SHA512:
+   metadata.gz: 8a4b163c1f7523d3035cf73087b8d26dc366845b66c3c39845e9476fcb9d00b32944a7307a9e746deb833db722605290e2576322116b0a5ccbd65bb68027d6f1
+   data.tar.gz: 9658bb7e731c353d0269bcce8c150c7f47883d21bf0c592d937eaab50c641b0f064037dabe846698443d2acc7987995eb9084152aac60f2d5b06ed9bd4d5fa9d
data/README.md ADDED
@@ -0,0 +1,193 @@
+ # Heller
+
+ [![Build Status](https://travis-ci.org/mthssdrbrg/heller.svg?branch=master)](https://travis-ci.org/mthssdrbrg/heller)
+ [![Coverage Status](https://coveralls.io/repos/mthssdrbrg/heller/badge.svg?branch=master)](https://coveralls.io/r/mthssdrbrg/heller?branch=master)
+
+ Heller is a JRuby wrapper around the Kafka Producer and (Simple)Consumer
+ APIs, much like [Mikka](https://github.com/iconara/mikka) is for Akka's Java API.
+
+ The goal of Heller is to make the Producer and Consumer APIs of Kafka a bit more
+ Rubyesque, and useful as building blocks for creating more advanced producer and
+ consumer implementations.
+
+ ## Producer API
+
+ ```Heller::Producer``` is an extremely simple wrapper class around
+ ```kafka.javaapi.producer.Producer``` and provides some convenience in configuring the
+ producer with more Rubyesque names for configuration parameters.
+
+ All configuration parameters are supported and can be used in the following way:
+
+ ```ruby
+ require 'heller'
+
+ producer = Heller::Producer.new('localhost:9092,localhost:9093', {
+   :type => :async,
+   :serializer => 'kafka.serializer.StringEncoder',
+   :key_serializer => 'kafka.serializer.DefaultEncoder',
+   :partitioner => 'kafka.producer.DefaultPartitioner',
+   :compression => :gzip,
+   :num_retries => 5,
+   :retry_backoff => 1500,
+   :metadata_refresh_interval => 5000,
+   :batch_size => 2000,
+   :client_id => 'spec-client',
+   :request_timeout => 10000,
+   :buffer_limit => 100 * 100,
+   :buffer_timeout => 1000 * 100,
+   :enqueue_timeout => 1000,
+   :socket_buffer => 1024 * 1000,
+   :ack => -1
+ })
+ ```
+
+ Check the official [Kafka docs](http://kafka.apache.org/documentation.html#producerconfigs) for possible values for each parameter.
+
+ To send messages you create instances of ```Heller::Message``` and feed them to the
+ ```#push``` method of the producer:
+
+ ```ruby
+ messages = 3.times.map { Heller::Message.new('test', 'my message!') }
+ producer.push(messages)
+ ```
+
+ Want to partition messages based on some key? Sure, no problem:
+
+ ```ruby
+ messages = [0, 1, 2].map { |key| Heller::Message.new('test', "my message using #{key} as key!", key.to_s) }
+ producer.push(messages)
+ ```
+
+ ## Consumer API
+
+ ```Heller::Consumer``` wraps ```kafka.javaapi.consumer.SimpleConsumer``` and provides
+ basically the same methods, but with a bit more convenience (or at least I'd
+ like to think so).
+
+ A ```Consumer``` can be created in the following way:
+
+ ```ruby
+ require 'heller'
+
+ options = {
+   # 'generic' consumer options
+   :timeout => 5000, # socket timeout
+   :buffer_size => 128 * 1024, # socket buffer size
+   :client_id => 'my-consumer', # id of the consumer
+   # fetch request related options
+   :max_wait => 4500, # maximum time (ms) the consumer will wait for a response to a request
+   :min_bytes => 1024 # minimum number of bytes the server (broker) should return for a fetch request
+ }
+
+ consumer = Heller::Consumer.new('localhost:9092', options)
+ ```
+
+ The options specified in the options hash are also described in the official
+ [Kafka docs](http://kafka.apache.org/documentation.html#consumerconfigs), although they're described in the context of the high-level
+ consumer.
+
+ The consumer API exposes the following methods: ```#fetch```, ```#metadata```,
+ ```#offsets_before```, ```#earliest_offset``` and ```#latest_offset```, and
+ their usage is described below.
+
+ ### Fetching messages
+
+ ```ruby
+ topic = 'my-topic'
+ partition = offset = 0
+
+ response = consumer.fetch(Heller::FetchRequest.new(topic, partition, offset))
+
+ if response.error? && (error_code = response.error(topic, partition)) != 0
+   puts "Got error #{Heller::Errors.error_for(error_code)}!"
+ else
+   message_enumerator = response.messages(topic, partition)
+   message_enumerator.each do |offset, message_payload|
+     puts "#{offset}: #{message_payload}"
+   end
+ end
+ ```
+
+ See ```Heller::FetchResponse``` (and the related specs) for usage of other
+ methods.
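One of those methods is ```#high_watermark```, which appears in the ```FetchResponse``` portion of this diff and returns the high watermark offset for a given topic-partition. A plain-Ruby stand-in, purely to illustrate the calling shape without a running broker; ```FakeFetchResponse``` and its data are hypothetical, and the real method delegates to the underlying Kafka fetch response:

```ruby
# Plain-Ruby stand-in for Heller::FetchResponse#high_watermark, showing the
# (topic, partition) calling convention. Hypothetical class and data.
class FakeFetchResponse
  def initialize(watermarks)
    @watermarks = watermarks # { [topic, partition] => high watermark offset }
  end

  def high_watermark(topic, partition)
    @watermarks.fetch([topic, partition]) do
      # the real class raises NoSuchTopicPartitionCombinationError here
      raise ArgumentError, "no such topic-partition combination: #{topic}:#{partition}"
    end
  end
end

response = FakeFetchResponse.new({ ['my-topic', 0] => 1337 })
puts response.high_watermark('my-topic', 0) # => 1337
```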
+
+ It's also possible to pass an array of ```FetchRequest``` objects to ```#fetch```:
+
+ ```ruby
+ requests = [0, 1, 2].map { |i| Heller::FetchRequest.new(topic, i, offset) }
+ fetch_response = consumer.fetch(requests)
+ ```
+
+ ### Topic and partition metadata
+
+ ```kafka.javaapi.consumer.SimpleConsumer``` exposes a method called ```#topic_metadata```, which in Heller has been "renamed" to just ```#metadata```.
+
+ ```ruby
+ topics = [1, 2, 3].map { |i| "my-topic-#{i}" }
+
+ response = consumer.metadata(topics)
+
+ response.each do |topic, partition_metadata|
+   puts "Got metadata for (#{topic}:#{partition_metadata.partition_id})"
+ end
+
+ leader = response.leader_for('my-topic-1', 0)
+ puts "Leader for my-topic-1:0 is at #{leader.connection_string} (#{leader.zk_string})"
+
+ isrs = response.isr_for('my-topic-1', 0) # also aliased as #in_sync_replicas_for
+ isrs.each do |isr|
+   puts "An in-sync replica for my-topic-1:0 is at #{isr.connection_string} (#{isr.zk_string})"
+ end
+ ```
+
+ ### Get offsets for topic-partition combinations
+
+ ```ruby
+ # arguments: topic, partition, timestamp (ms), maximum number of offsets
+ request = Heller::OffsetRequest.new('my-topic', 0, Time.now.to_i * 1000, 10)
+ response = consumer.offsets_before(request)
+
+ if response.error? && (error_code = response.error('my-topic', 0)) != 0
+   puts "Got error #{Heller::Errors.error_for(error_code)}!"
+ else
+   offsets = response.offsets('my-topic', 0)
+   puts "Got #{offsets.join(', ')} offsets for my-topic:0"
+ end
+ ```
+
+ ```Heller::Consumer``` also exposes (as ```SimpleConsumer``` does) two convenience
+ methods for retrieving the earliest / latest offset for a topic-partition
+ combination.
+
+ ```ruby
+ earliest_offset = consumer.earliest_offset('my-topic', 0)
+ latest_offset = consumer.latest_offset('my-topic', 0)
+
+ puts "Earliest available offset is #{earliest_offset}"
+ puts "Latest available offset is #{latest_offset}"
+ ```
+
+ ## Status
+
+ The project is currently under development, and I wouldn't really recommend
+ using it in any kind of production environment.
+ There is still quite a lot of work to be done, especially for the Consumer API.
+ The Producer API is more or less done, for the moment at least.
+
+ It's getting there, though I'm mostly doing this during my spare time, which is
+ sparse at times.
+
+ ## Copyright
+
+ Copyright 2013-2015 Mathias Söderberg and contributors
+
+ Licensed under the Apache License, Version 2.0 (the "License"); you may not use
+ this file except in compliance with the License. You may obtain a copy of the
+ License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software distributed
+ under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
+ CONDITIONS OF ANY KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations under the License.
@@ -0,0 +1,41 @@
+ # encoding: utf-8
+
+ module Heller
+   class Configuration
+     def initialize(options={})
+       @configuration = merge_with_defaults(options)
+     end
+
+     def [](key)
+       @configuration[key.to_sym]
+     end
+
+     def to_java
+       kafka_config_class.new(to_properties)
+     end
+
+     protected
+
+     def defaults
+       {}
+     end
+
+     private
+
+     def merge_with_defaults(options)
+       options.each_with_object(defaults) do |(k, v), h|
+         h[k.to_sym] = v
+       end
+     end
+
+     def convert_key(key)
+       key_mappings.key?(key) ? key_mappings[key] : key.to_s.gsub('_', '.')
+     end
+
+     def to_properties
+       @configuration.each_with_object(Properties.new) do |(key, value), props|
+         props.put(convert_key(key), value.to_s)
+       end
+     end
+   end
+ end
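The ```convert_key``` logic above means that any option key without an explicit mapping becomes a Kafka property name by swapping underscores for dots. A small pure-Ruby sketch of that default conversion; the ```KEY_MAPPINGS``` table here is a hypothetical one-entry example, not the gem's actual ```key_mappings```:

```ruby
# Sketch of Heller::Configuration#convert_key's behaviour: mapped keys use the
# mapping table, all other keys have underscores replaced by dots.
KEY_MAPPINGS = { client_id: 'client.id' } # hypothetical example mapping

def convert_key(key, mappings = KEY_MAPPINGS)
  mappings.key?(key) ? mappings[key] : key.to_s.gsub('_', '.')
end

puts convert_key(:client_id)       # => "client.id"
puts convert_key(:request_timeout) # => "request.timeout"
```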
@@ -1,59 +1,95 @@
+ # encoding: utf-8
+
+ require 'securerandom'
+
+
  module Heller
-   class Consumer < Kafka::Consumer::SimpleConsumer
-
-     LATEST_OFFSET = -1.freeze
-     EARLIEST_OFFSET = -2.freeze
-
-     MAX_FETCH_SIZE = 1000000.freeze
-
-     def latest_offset(topic, partition)
-       single_offset(topic, partition, LATEST_OFFSET)
-     end
-
-     def earliest_offset(topic, partition)
-       single_offset(topic, partition, EARLIEST_OFFSET)
-     end
-
-     def fetch_request(topic, partition, offset, max_size)
-       Kafka::Api::FetchRequest.new(topic, partition, offset, max_size)
-     end
-
-     def consume(topic, partition, offset, max_size = MAX_FETCH_SIZE)
-       request = fetch_request(topic, partition, offset, max_size)
-
-       messages = fetch(request)
-       messages.to_a
-     end
-
-     def multi_fetch(topics_hash, max_size = MAX_FETCH_SIZE)
-       requests = topics_hash.map { |topic, hash| fetch_request(topic, hash[:partition], hash[:offset], max_size) }
-
-       response = multifetch(ArrayList.new(requests))
-       parse_multi_fetch_response(topics_hash.keys, response)
-     end
-
-     protected
-
-     def single_offset(topic, partition, time)
-       offsets = get_offsets_before(topic, partition, time, 1)
-       offsets = offsets.to_a
-
-       offsets.first unless offsets.empty?
-     end
-
-     def parse_multi_fetch_response(topics, response)
-       if response.respond_to?(:to_a)
-         response_array = topics.zip(response.to_a)
-
-         response_array.inject({}) do |response_hash, (topic, messages)|
-           response_hash[topic] = messages.to_a
-           response_hash
-         end
-       else
-         puts 'response does not respond to #to_a'
-         puts "response is #{response.inspect}"
-         puts "topics were: #{topics.inspect}"
-       end
-     end
-   end
+   class Consumer
+     def initialize(connect_string, options = {})
+       @host, @port = connect_string.split(':')
+       options = defaults.merge(options)
+       @consumer = create_consumer(options)
+       @build_options = options.select { |k, _| BUILD_OPTIONS.include?(k) }
+       @decoder = Kafka::Serializer::StringDecoder.new(nil)
+     end
+
+     def client_id
+       @consumer.client_id
+     end
+
+     def fetch(fetch_requests, fetch_size = DEFAULT_FETCH_SIZE)
+       builder = create_builder(@build_options)
+       Array(fetch_requests).each do |request|
+         builder.add_fetch(request.topic, request.partition, request.offset, fetch_size)
+       end
+       raw_response = @consumer.fetch(builder.build)
+       FetchResponse.new(raw_response, @decoder)
+     end
+
+     def metadata(topics=[])
+       request = Kafka::JavaApi::TopicMetadataRequest.new(topics)
+       TopicMetadataResponse.new(@consumer.send(request))
+     end
+     alias_method :topic_metadata, :metadata
+
+     def offsets_before(offset_requests)
+       request_info = Array(offset_requests).each_with_object({}) do |request, memo|
+         topic_partition = Kafka::Common::TopicAndPartition.new(request.topic, request.partition)
+         partition_offset = Kafka::Api::PartitionOffsetRequestInfo.new(request.time.to_i, request.max_offsets)
+
+         memo[topic_partition] = partition_offset
+       end
+
+       request = Kafka::JavaApi::OffsetRequest.new(request_info, OffsetRequest.current_version, client_id)
+       OffsetResponse.new(@consumer.get_offsets_before(request))
+     end
+
+     def earliest_offset(topic, partition)
+       response = offsets_before(OffsetRequest.new(topic, partition, OffsetRequest.earliest_time))
+       response.offsets(topic, partition).first
+     end
+
+     def latest_offset(topic, partition)
+       response = offsets_before(OffsetRequest.new(topic, partition, OffsetRequest.latest_time))
+       response.offsets(topic, partition).last
+     end
+
+     def disconnect
+       @consumer.close
+     end
+     alias_method :close, :disconnect
+
+     private
+
+     DEFAULT_FETCH_SIZE = 1024 * 1024
+     BUILD_OPTIONS = [:client_id, :max_wait, :min_bytes].freeze
+
+     def defaults
+       {
+         timeout: 30 * 1000,
+         buffer_size: 64 * 1024,
+         client_id: generate_client_id
+       }
+     end
+
+     def generate_client_id
+       "heller-#{self.class.name.split('::').last.downcase}-#{SecureRandom.uuid}"
+     end
+
+     def create_consumer(options)
+       consumer_impl = options.delete(:consumer_impl) || Kafka::Consumer::SimpleConsumer
+       extra_options = options.values_at(:timeout, :buffer_size, :client_id)
+       consumer_impl.new(@host, @port.to_i, *extra_options)
+     end
+
+     def create_builder(options)
+       builder = Kafka::Api::FetchRequestBuilder.new
+
+       BUILD_OPTIONS.each do |symbol|
+         builder.send(symbol, options[symbol]) if options[symbol]
+       end
+
+       builder
+     end
+   end
  end
@@ -0,0 +1,38 @@
+ # encoding: utf-8
+
+ module Heller
+   class ConsumerConfiguration < Configuration
+
+     protected
+
+     def key_mappings
+       @key_mappings ||= {
+         auto_commit: 'auto.commit.enable',
+         auto_commit_interval: 'auto.commit.interval.ms',
+         auto_reset_offset: 'auto.offset.reset',
+         client_id: 'client.id',
+         consumer_id: 'consumer.id',
+         fetch_message_max_bytes: 'fetch.message.max.bytes',
+         fetch_min_bytes: 'fetch.min.bytes',
+         fetch_max_wait: 'fetch.wait.max.ms',
+         group_id: 'group.id',
+         num_fetchers: 'num.consumer.fetchers',
+         max_queued_message_chunks: 'queued.max.message.chunks',
+         receive_buffer: 'socket.receive.buffer.bytes',
+         rebalance_retries: 'rebalance.max.retries',
+         rebalance_retry_backoff: 'rebalance.backoff.ms',
+         refresh_leader_backoff: 'refresh.leader.backoff.ms',
+         socket_timeout: 'socket.timeout.ms',
+         timeout: 'consumer.timeout.ms',
+         zk_connect: 'zookeeper.connect',
+         zk_session_timeout: 'zookeeper.session.timeout.ms',
+         zk_connection_timeout: 'zookeeper.connection.timeout.ms',
+         zk_sync_time: 'zookeeper.sync.time.ms',
+       }.freeze
+     end
+
+     def kafka_config_class
+       Kafka::Consumer::ConsumerConfig
+     end
+   end
+ end
@@ -0,0 +1,9 @@
+ # encoding: utf-8
+
+ module Heller
+   Errors = Kafka::Errors::ErrorMapping
+
+   class Errors
+     self.singleton_class.send(:alias_method, :error_for, :exception_for)
+   end
+ end
@@ -0,0 +1,11 @@
+ # encoding: utf-8
+
+ module Heller
+   class FetchRequest
+     attr_reader :topic, :partition, :offset
+
+     def initialize(*args)
+       @topic, @partition, @offset = *args
+     end
+   end
+ end
@@ -0,0 +1,33 @@
+ # encoding: utf-8
+
+ module Heller
+   class FetchResponse
+     def initialize(underlying, decoder)
+       @underlying, @decoder = underlying, decoder
+     end
+
+     def error?
+       @underlying.has_error?
+     end
+
+     def error(topic, partition)
+       convert_error { @underlying.error_code(topic, partition) }
+     end
+
+     def messages(topic, partition)
+       convert_error { MessageSetEnumerator.new(@underlying.message_set(topic, partition), @decoder) }
+     end
+
+     def high_watermark(topic, partition)
+       convert_error { @underlying.high_watermark(topic, partition) }
+     end
+
+     private
+
+     def convert_error
+       yield
+     rescue IllegalArgumentException => e
+       raise NoSuchTopicPartitionCombinationError, e.message, e.backtrace
+     end
+   end
+ end
@@ -0,0 +1,9 @@
+ # encoding: utf-8
+
+ module Heller
+   class Message < Kafka::Producer::KeyedMessage
+     def initialize(topic, message, key = nil)
+       super(topic, key, message)
+     end
+   end
+ end
@@ -0,0 +1,35 @@
+ # encoding: utf-8
+
+ module Heller
+   class MessageSetEnumerator
+     include Enumerable
+
+     def initialize(message_set, decoder)
+       @iterator, @decoder = message_set.iterator, decoder
+     end
+
+     def each
+       loop do
+         yield self.next
+       end
+     end
+
+     def next
+       if @iterator.has_next?
+         item = @iterator.next
+         offset, payload = item.offset, item.message.payload
+         [offset, decode(payload)]
+       else
+         raise StopIteration
+       end
+     end
+
+     private
+
+     def decode(payload)
+       bytes = Java::byte[payload.limit].new
+       payload.get(bytes)
+       @decoder.from_bytes(bytes)
+     end
+   end
+ end
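The enumerator above relies on Ruby's StopIteration protocol: ```Kernel#loop``` rescues ```StopIteration```, so ```#each``` terminates cleanly when ```#next``` runs out of items. A pure-Ruby sketch of the same pattern, with a plain array (hypothetical ```PairEnumerator```) standing in for the Kafka message set:

```ruby
# Same iteration pattern as Heller::MessageSetEnumerator: #next raises
# StopIteration when exhausted, and Kernel#loop rescues it to end #each.
class PairEnumerator
  include Enumerable

  def initialize(pairs)
    @items = pairs.dup # stand-in for the underlying Kafka iterator
  end

  def each
    loop do
      yield self.next
    end
  end

  def next
    raise StopIteration if @items.empty?
    @items.shift
  end
end

enum = PairEnumerator.new([[0, 'a'], [1, 'b']])
puts enum.to_a.inspect # => [[0, "a"], [1, "b"]]
```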
@@ -0,0 +1,23 @@
+ # encoding: utf-8
+
+ module Heller
+   class OffsetRequest
+     attr_reader :topic, :partition, :time, :max_offsets
+
+     def self.latest_time
+       Kafka::Api::OffsetRequest.latest_time
+     end
+
+     def self.earliest_time
+       Kafka::Api::OffsetRequest.earliest_time
+     end
+
+     def self.current_version
+       Kafka::Api::OffsetRequest.current_version
+     end
+
+     def initialize(topic, partition, time, offsets = 1)
+       @topic, @partition, @time, @max_offsets = topic, partition, time, offsets
+     end
+   end
+ end
@@ -0,0 +1,29 @@
+ # encoding: utf-8
+
+ module Heller
+   class OffsetResponse
+     def initialize(underlying)
+       @underlying = underlying
+     end
+
+     def offsets(topic, partition)
+       convert_error { @underlying.offsets(topic, partition).to_a }
+     end
+
+     def error?
+       @underlying.has_error?
+     end
+
+     def error(topic, partition)
+       convert_error { @underlying.error_code(topic, partition) }
+     end
+
+     private
+
+     def convert_error
+       yield
+     rescue NoSuchElementException => e
+       raise NoSuchTopicPartitionCombinationError, e.message, e.backtrace
+     end
+   end
+ end
@@ -1,39 +1,25 @@
- module Heller
-   class Producer < Kafka::Producer::Producer
-
-     attr_reader :configuration
-
-     def initialize(zk_connect, options = {})
-       options.merge!({
-         'zk.connect' => zk_connect
-       })
-
-       @configuration = Kafka::Producer::ProducerConfig.new(hash_to_properties(options))
-       super @configuration
-     end
-
-     def produce(topic_mappings)
-       producer_data = topic_mappings.map do |topic, hash|
-         if hash[:key]
-           Kafka::Producer::ProducerData.new(topic, hash[:key], hash[:messages])
-         else
-           Kafka::Producer::ProducerData.new(topic, hash[:messages])
-         end
-       end
+ # encoding: utf-8

-       send(producer_data)
-     end
-
-     protected
-
-     def hash_to_properties(options)
-       properties = java.util.Properties.new
-
-       options.each do |key, value|
-         properties.put(key.to_s, value.to_s)
-       end
-
-       properties
-     end
-   end
- end
+ module Heller
+   class Producer
+     def initialize(broker_list, options = {})
+       @producer = create_producer(options.merge(brokers: broker_list))
+     end
+
+     def push(messages)
+       @producer.send(ArrayList.new(Array(messages)))
+     end
+
+     def disconnect
+       @producer.close
+     end
+     alias_method :close, :disconnect
+
+     private
+
+     def create_producer(options)
+       producer_impl = options.delete(:producer_impl) || Kafka::Producer::Producer
+       producer_impl.new(ProducerConfiguration.new(options).to_java)
+     end
+   end
+ end