roched-fluent-plugin-kafka 0.6.5

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: e881322a89e987344e6548fa9072160f02f36972
+   data.tar.gz: 891fe76d4b08be328dc184d482670c04c6cf0169
+ SHA512:
+   metadata.gz: 24e79b9778e49e92e380d7a0f5df370557f3c19d0baee536e0e216d3e0d30365865280b35c525ec019ed3a261c89a043f4c36f6be0bf941398d40e93e8cf305b
+   data.tar.gz: 6ec09a40c4ff6928933d2c7e38d5bdf47813966786a61b696b39051955c4ea7bbb23d42639d8b8444c6c0d276ca57dae62892e5ccef1e34a2b2f9780b77118c5
data/.gitignore ADDED
@@ -0,0 +1,2 @@
+ /Gemfile.lock
+ *.swp
data/.travis.yml ADDED
@@ -0,0 +1,18 @@
+ language: ruby
+
+ rvm:
+   - 2.1
+   - 2.2
+   - 2.3.1
+   - 2.4.1
+   - ruby-head
+
+ script:
+   - bundle exec rake test
+
+ sudo: false
+
+ matrix:
+   allow_failures:
+     - rvm: ruby-head
+
data/ChangeLog ADDED
@@ -0,0 +1,94 @@
+ Release 0.6.3 - 2017/11/14
+
+ * in_kafka_group: re-create consumer when error happens during event fetch
+
+ Release 0.6.2 - 2017/11/1
+
+ * Fix ltsv parsing issue which generates symbol keys
+
+ Release 0.6.1 - 2017/08/30
+
+ * Add stats and datadog monitoring support
+ * ssl_ca_certs now accepts multiple paths
+ * Fix bug by ruby-kafka 0.4.1 changes
+ * Update ruby-kafka dependency to v0.4.1
+
+ Release 0.6.0 - 2017/07/25
+
+ * Add principal and keytab parameters for SASL support
+
+ Release 0.5.7 - 2017/07/13
+
+ * out_kafka_buffered: Add kafka_agg_max_messages parameter
+
+ Release 0.5.6 - 2017/07/10
+
+ * output: Add ActiveSupport notification support
+
+ Release 0.5.5 - 2017/04/19
+
+ * output: Some trace log level changed to debug
+ * out_kafka_buffered: Add discard_kafka_delivery_failed parameter
+
+ Release 0.5.4 - 2017/04/12
+
+ * out_kafka_buffered: Add max_send_limit_bytes parameter
+ * out_kafka: Improve buffer overflow handling of ruby-kafka
+
+ Release 0.5.3 - 2017/02/13
+
+ * Relax ruby-kafka dependency
+
+ Release 0.5.2 - 2017/02/13
+
+ * in_kafka_group: Add max_bytes parameter
+
+ Release 0.5.1 - 2017/02/06
+
+ * in_kafka_group: Fix uninitialized constant error
+
+ Release 0.5.0 - 2017/01/17
+
+ * output: Add out_kafka2 plugin with v0.14 API
+
+ Release 0.4.2 - 2016/12/10
+
+ * input: Add use_record_time and time_format parameters
+ * Update ruby-kafka dependency to 0.3.16.beta2
+
+ Release 0.4.1 - 2016/12/01
+
+ * output: Support specifying partition
+
+ Release 0.4.0 - 2016/11/08
+
+ * Remove zookeeper dependency
+
+ Release 0.3.5 - 2016/10/21
+
+ * output: Support message key and related parameters. #91
+
+ Release 0.3.4 - 2016/10/20
+
+ * output: Add exclude_topic_key and exclude_partition_key. #89
+
+ Release 0.3.3 - 2016/10/17
+
+ * out_kafka_buffered: Add get_kafka_client_log parameter. #83
+ * out_kafka_buffered: Skip and log invalid record to avoid buffer stuck. #86
+ * in_kafka_group: Add retry_emit_limit to handle BufferQueueLimitError. #87
+
+ Release 0.3.2 - 2016/10/06
+
+ * in_kafka_group: Re-fetch events after consumer error. #79
+
+ Release 0.3.1 - 2016/08/28
+
+ * output: Change default required_acks to -1. #70
+ * Support ruby version changed to 2.1.0 or later
+
+ Release 0.3.0 - 2016/08/24
+
+ * Fully replace poseidon ruby library with ruby-kafka to support latest kafka versions
+
+ See git commits for older changes
data/Gemfile ADDED
@@ -0,0 +1,4 @@
+ source 'https://rubygems.org'
+
+ # Specify your gem's dependencies in fluent-plugin-kafka.gemspec
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,14 @@
+ Copyright (C) 2014 htgc
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
data/README.md ADDED
@@ -0,0 +1,244 @@
+ # fluent-plugin-kafka, a plugin for [Fluentd](http://fluentd.org)
+
+ [![Build Status](https://travis-ci.org/htgc/fluent-plugin-kafka.svg?branch=master)](https://travis-ci.org/htgc/fluent-plugin-kafka)
+
+ A fluentd plugin to both consume and produce data for Apache Kafka.
+
+ TODO: Also, I need to write tests
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem 'fluent-plugin-kafka'
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install fluent-plugin-kafka
+
+ If you want to use zookeeper-related parameters, you also need to install the zookeeper gem. The zookeeper gem includes a native extension, so development tools such as gcc and make are needed.
+
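+ For example, the gem can be installed with:
+
+     $ gem install zookeeper
+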
+ ## Requirements
+
+ - Ruby 2.1 or later
+ - Input plugins work with Kafka v0.9 or later
+ - Output plugins work with Kafka v0.8 or later
+
+ ## Usage
+
+ ### Common parameters
+
+ #### SSL authentication
+
+ - ssl_ca_cert
+ - ssl_client_cert
+ - ssl_client_cert_key
+
+ Set the paths to SSL-related files. See [Encryption and Authentication using SSL](https://github.com/zendesk/ruby-kafka#encryption-and-authentication-using-ssl) for more details.
+
+ #### SASL authentication
+
+ - principal
+ - keytab
+
+ Set the principal and the path to the keytab for SASL/GSSAPI authentication. See [Authentication using SASL](https://github.com/zendesk/ruby-kafka#authentication-using-sasl) for more details.
+
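+ For illustration, a minimal sketch of a consumer source using the SSL parameters above; the broker address, topic name, and certificate paths are placeholders:
+
+     <source>
+       @type kafka
+       brokers broker1:9093
+       topics my_topic
+       ssl_ca_cert /path/to/ca.crt
+       ssl_client_cert /path/to/client.crt
+       ssl_client_cert_key /path/to/client.key
+     </source>
+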
+ ### Input plugin (@type 'kafka')
+
+ Consume events with a single consumer.
+
+     <source>
+       @type kafka
+
+       brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
+       topics <listening topics(separate with comma',')>
+       format <input text type (text|json|ltsv|msgpack)> :default => json
+       message_key <key (Optional, for text format only, default is message)>
+       add_prefix <tag prefix (Optional)>
+       add_suffix <tag suffix (Optional)>
+
+       # Optionally, you can manage topic offset by using zookeeper
+       offset_zookeeper <zookeeper node list (<zookeeper1_host>:<zookeeper1_port>,<zookeeper2_host>:<zookeeper2_port>,..)>
+       offset_zk_root_node <offset path in zookeeper> default => '/fluent-plugin-kafka'
+
+       # ruby-kafka consumer options
+       max_bytes (integer) :default => nil (Use default of ruby-kafka)
+       max_wait_time (integer) :default => nil (Use default of ruby-kafka)
+       min_bytes (integer) :default => nil (Use default of ruby-kafka)
+     </source>
+
+ Processing can also start from an assigned offset for specific topics:
+
+     <source>
+       @type kafka
+
+       brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
+       format <input text type (text|json|ltsv|msgpack)>
+       <topic>
+         topic <listening topic>
+         partition <listening partition: default=0>
+         offset <listening start offset: default=-1>
+       </topic>
+       <topic>
+         topic <listening topic>
+         partition <listening partition: default=0>
+         offset <listening start offset: default=-1>
+       </topic>
+     </source>
+
+ See also the [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-messages-from-kafka) for more detailed documentation about ruby-kafka.
+
+ ### Input plugin (@type 'kafka_group', supports kafka group)
+
+ Consume events using Kafka consumer group features.
+
+     <source>
+       @type kafka_group
+
+       brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
+       consumer_group <consumer group name, must set>
+       topics <listening topics(separate with comma',')>
+       format <input text type (text|json|ltsv|msgpack)> :default => json
+       message_key <key (Optional, for text format only, default is message)>
+       add_prefix <tag prefix (Optional)>
+       add_suffix <tag suffix (Optional)>
+       retry_emit_limit <Wait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil, which means waiting until BufferQueueLimitError is resolved>
+       use_record_time <If true, replace event time with contents of 'time' field of fetched record>
+       time_format <string (Optional when use_record_time is used)>
+
+       # ruby-kafka consumer options
+       max_bytes (integer) :default => 1048576
+       max_wait_time (integer) :default => nil (Use default of ruby-kafka)
+       min_bytes (integer) :default => nil (Use default of ruby-kafka)
+       offset_commit_interval (integer) :default => nil (Use default of ruby-kafka)
+       offset_commit_threshold (integer) :default => nil (Use default of ruby-kafka)
+       start_from_beginning (bool) :default => true
+     </source>
+
+ See also the [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-messages-from-kafka) for more detailed documentation about ruby-kafka options. A sketch of the use_record_time option follows below.
+
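+ For example, to take the event time from a 'time' field in each fetched record, a configuration along these lines should work; the group name, topic name, and time format are illustrative:
+
+     <source>
+       @type kafka_group
+       brokers broker1:9092
+       consumer_group my_group
+       topics my_topic
+       format json
+       use_record_time true
+       time_format %Y-%m-%dT%H:%M:%S%z
+     </source>
+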
+ ### Buffered output plugin
+
+ This plugin uses the ruby-kafka producer for writing data. It works with recent Kafka versions.
+
+     <match *.**>
+       @type kafka_buffered
+
+       # Brokers: you can choose either brokers or zookeeper. If you are not familiar with zookeeper, use the brokers parameter.
+       brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,.. # Set brokers directly
+       zookeeper <zookeeper_host>:<zookeeper_port> # Set brokers via Zookeeper
+       zookeeper_path <broker path in zookeeper> :default => /brokers/ids # Set path in zookeeper for kafka
+
+       default_topic (string) :default => nil
+       default_partition_key (string) :default => nil
+       default_message_key (string) :default => nil
+       output_data_type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
+       output_include_tag (bool) :default => false
+       output_include_time (bool) :default => false
+       exclude_topic_key (bool) :default => false
+       exclude_partition_key (bool) :default => false
+       get_kafka_client_log (bool) :default => false
+
+       # See fluentd document for buffer related parameters: http://docs.fluentd.org/articles/buffer-plugin-overview
+
+       # ruby-kafka producer options
+       max_send_retries (integer) :default => 1
+       required_acks (integer) :default => -1
+       ack_timeout (integer) :default => nil (Use default of ruby-kafka)
+       compression_codec (gzip|snappy) :default => nil (No compression)
+       kafka_agg_max_bytes (integer) :default => 4096
+       kafka_agg_max_messages (integer) :default => nil (No limit)
+       max_send_limit_bytes (integer) :default => nil (No drop)
+       discard_kafka_delivery_failed (bool) :default => false (No discard)
+       monitoring_list (array) :default => []
+     </match>
+
+ `<formatter name>` of `output_data_type` uses fluentd's formatter plugins, as shown in the sketch below. See the [formatter article](http://docs.fluentd.org/articles/formatter-plugin-overview).
+
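+ For instance, a sketch using the attr: form of `output_data_type` to send only the 'message' field of each record; the broker, topic, and field names are illustrative:
+
+     <match *.**>
+       @type kafka_buffered
+       brokers broker1:9092
+       default_topic my_topic
+       output_data_type attr:message
+     </match>
+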
+ ruby-kafka sometimes returns a `Kafka::DeliveryFailed` error without useful information.
+ In this case, `get_kafka_client_log` is useful for identifying the cause of the error.
+ ruby-kafka's log is routed to the fluentd log, so you can see ruby-kafka's output in the fluentd logs.
+
+ The following ruby-kafka producer options are supported:
+
+ - max_send_retries - default: 1 - Number of times to retry sending of messages to a leader.
+ - required_acks - default: -1 - The number of acks required per request. If you need higher flush performance, set a lower value, e.g. 1 or 2.
+ - ack_timeout - default: nil - How long the producer waits for acks, in seconds.
+ - compression_codec - default: nil - The codec the producer uses to compress messages.
+ - kafka_agg_max_bytes - default: 4096 - Maximum total message size, in bytes, to include in one batch transmission.
+ - kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.
+ - max_send_limit_bytes - default: nil - Maximum byte size of a message to send, to avoid MessageSizeTooLarge. For example, if you set 1000000 (message.max.bytes in kafka), messages larger than 1000000 bytes will be dropped.
+ - discard_kafka_delivery_failed - default: false - Discard the record when [Kafka::DeliveryFailed](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) occurs.
+ - monitoring_list - default: [] - Libraries to use for monitoring; statsd and datadog are supported.
+
+ For more details about monitoring, see https://github.com/zendesk/ruby-kafka#monitoring; a sketch of enabling it follows below.
+
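+ As a hypothetical snippet, assuming the relevant datadog client library is installed, monitoring might be enabled like this:
+
+     <match *.**>
+       @type kafka_buffered
+       brokers broker1:9092
+       default_topic my_topic
+       monitoring_list ["datadog"]
+     </match>
+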
+ See also [Kafka::Client](http://www.rubydoc.info/gems/ruby-kafka/Kafka/Client) for more detailed documentation about ruby-kafka.
+
+ This plugin also supports the "snappy" compression codec.
+ Install the snappy module before you use snappy compression:
+
+     $ gem install snappy
+
+ The snappy gem uses a native extension, so you need to install several packages first.
+ On Ubuntu, the development packages and the snappy library are needed:
+
+     $ sudo apt-get install build-essential autoconf automake libtool libsnappy-dev
+
+ #### Load balancing
+
+ By default, ruby-kafka assigns each message to a partition at random, but messages with the same partition key are always assigned to the same partition when `default_partition_key` is set in the config file.
+ If a key named `partition_key` exists in a message, this plugin uses its value as the partition key.
+
+ |default_partition_key|partition_key| behavior |
+ | --- | --- | --- |
+ |Not set|Not exists| All messages are assigned a partition at random |
+ |Set| Not exists| All messages are assigned to the specific partition |
+ |Not set| Exists | Messages which have a partition_key record are assigned to the specific partition, others are assigned a partition at random |
+ |Set| Exists | Messages which have a partition_key record are assigned to the specific partition with partition_key, others are assigned to the specific partition with default_partition_key |
+
+ If a key named `message_key` exists in a message, this plugin publishes the value of message_key to Kafka, where it can be read by consumers. The same message key can be assigned to all messages by setting `default_message_key` in the config file. If message_key exists and partition_key is not set explicitly, message_key will be used for partitioning. An example follows below.
+
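+ As an illustration, given this hypothetical record:
+
+     {"message": "hello", "partition_key": "user_123", "message_key": "user_123"}
+
+ the plugin would route the message to the partition derived from "user_123" and publish it to Kafka with "user_123" as its message key, so all records for that user land in the same partition.
+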
+ ### Non-buffered output plugin
+
+ This plugin uses the ruby-kafka producer for writing data. For performance and reliability, use the `kafka_buffered` output instead. This one is mainly for testing.
+
+     <match *.**>
+       @type kafka
+
+       # Brokers: you can choose either brokers or zookeeper.
+       brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,.. # Set brokers directly
+       zookeeper <zookeeper_host>:<zookeeper_port> # Set brokers via Zookeeper
+       zookeeper_path <broker path in zookeeper> :default => /brokers/ids # Set path in zookeeper for kafka
+
+       default_topic (string) :default => nil
+       default_partition_key (string) :default => nil
+       default_message_key (string) :default => nil
+       output_data_type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
+       output_include_tag (bool) :default => false
+       output_include_time (bool) :default => false
+       exclude_topic_key (bool) :default => false
+       exclude_partition_key (bool) :default => false
+
+       # ruby-kafka producer options
+       max_send_retries (integer) :default => 1
+       required_acks (integer) :default => -1
+       ack_timeout (integer) :default => nil (Use default of ruby-kafka)
+       compression_codec (gzip|snappy) :default => nil
+       max_buffer_size (integer) :default => nil (Use default of ruby-kafka)
+       max_buffer_bytesize (integer) :default => nil (Use default of ruby-kafka)
+     </match>
+
+ This plugin also supports ruby-kafka related parameters. See the Buffered output plugin section.
+
+ ## Contributing
+
+ 1. Fork it
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
+ 3. Commit your changes (`git commit -am 'Added some feature'`)
+ 4. Push to the branch (`git push origin my-new-feature`)
+ 5. Create a new Pull Request
data/Rakefile ADDED
@@ -0,0 +1,12 @@
+ require 'bundler'
+ Bundler::GemHelper.install_tasks
+
+ require 'rake/testtask'
+
+ Rake::TestTask.new(:test) do |test|
+   test.libs << 'lib' << 'test'
+   test.test_files = FileList['test/**/test_*.rb']
+   test.verbose = true
+ end
+
+ task :default => [:build]
data/fluent-plugin-kafka.gemspec ADDED
@@ -0,0 +1,24 @@
+ # -*- encoding: utf-8 -*-
+
+ Gem::Specification.new do |gem|
+   gem.authors = ["Hidemasa Togashi", "Masahiro Nakagawa"]
+   gem.email = ["togachiro@gmail.com", "repeatedly@gmail.com"]
+   gem.description = %q{Fluentd plugin for Apache Kafka > 0.8}
+   gem.summary = %q{Fluentd plugin for Apache Kafka > 0.8}
+   gem.homepage = "https://github.com/roche-d/fluent-plugin-kafka"
+   gem.license = "Apache-2.0"
+
+   gem.files = `git ls-files`.split($\)
+   gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
+   gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
+   gem.name = "roched-fluent-plugin-kafka"
+   gem.require_paths = ["lib"]
+   gem.version = '0.6.5'
+   gem.required_ruby_version = ">= 2.1.0"
+
+   gem.add_dependency "fluentd", [">= 0.10.58", "< 2"]
+   gem.add_dependency 'ltsv'
+   gem.add_dependency 'ruby-kafka', '~> 0.4.1'
+   gem.add_development_dependency "rake", ">= 0.9.2"
+   gem.add_development_dependency "test-unit", ">= 3.0.8"
+ end
+ end