fluent-plugin-kafka 0.12.1 → 0.13.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 497c5a450bc4f55ddaf1b46454ed80f028c283775666e2a38ee0d91f16d391ef
- data.tar.gz: 5f40c26e75f06e4ee73cc0b3ba32ef8438da48bea465549fb95f4bcb89e98819
+ metadata.gz: 05cbb3ff005fbf6f27ab1a87ebe799f176e53ebfa6cebadf2b4f4418dfb6cb7b
+ data.tar.gz: 45bec524fc7a727031cf65b98c4b1db41dc8f9a1bfe86696fe0e7e3df7f8a0fa
  SHA512:
- metadata.gz: dcac3281832905427ad5799b6b1d7c9dab9b13562c19ba8283fa4eaf2ea1b88e7c782714be7cec9546878e26ab223725b0cbc06259bc8e375eff5e407575bb3f
- data.tar.gz: deef97208222254e9fc4b0279f0dd9045c177354fa1a8dcca3a304d3b2a5d1090e1e7970b5cd9066d7ff917e8020b798d4dd674623285c52041c173fe37d48f5
+ metadata.gz: fdee5e4b8d0e15f835d2b7d3bedb0d735e0b00b9e725fac19ed68c710a751c99279aa9966590cab0fef7c5b88a28cd7cefb80ee923423cc47e49a13ba88ac8f5
+ data.tar.gz: a33c14f6c4927e3cd91c8aabbf2edc35a34582aa5859b5369829721810234edcdad39a2b148d4bff3377177f707bae3e74909d1f8b553cb8ee3682d00e4d58d7
data/ChangeLog CHANGED
@@ -1,3 +1,24 @@
+ Release 0.13.1 - 2020/07/17
+
+ * in_kafka_group: Support ssl_verify_hostname parameter
+ * out_kafka2/out_rdkafka2: Support topic parameter with placeholders
+
+ Release 0.13.0 - 2020/03/09
+
+ * Accept ruby-kafka v1 or later
+
+ Release 0.12.4 - 2020/03/03
+
+ * output: Follow rdkafka log level
+
+ Release 0.12.3 - 2020/02/06
+
+ * output: Show warning message for v0.12 plugins
+
+ Release 0.12.2 - 2020/01/07
+
+ * input: Refer sasl_over_ssl parameter in plain SASL
+
  Release 0.12.1 - 2019/10/14
 
  * input: Add time_source parameter to replace use_record_time
data/README.md CHANGED
@@ -4,8 +4,6 @@
 
  A fluentd plugin to both consume and produce data for Apache Kafka.
 
- TODO: Also, I need to write tests
-
  ## Installation
 
  Add this line to your application's Gemfile:
@@ -123,7 +121,8 @@ Consume events by kafka consumer group features..
  add_prefix <tag prefix (Optional)>
  add_suffix <tag suffix (Optional)>
  retry_emit_limit <Wait retry_emit_limit x 1s when BuffereQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
- use_record_time <If true, replace event time with contents of 'time' field of fetched record>
+ use_record_time (Deprecated. Use 'time_source record' instead.) <If true, replace event time with contents of 'time' field of fetched record>
+ time_source <source for message timestamp (now|kafka|record)> :default => now
  time_format <string (Optional when use_record_time is used)>
 
  # ruby-kafka consumer options
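Note: `time_source` above supersedes the deprecated `use_record_time`, and this release also adds a companion `record_time_key` input parameter (see the in_kafka/in_kafka_group hunks below). A minimal sketch, assuming a local broker and JSON records carrying a `timestamp` field (broker, group, and topic names are placeholders):

    <source>
      @type kafka_group
      brokers localhost:9092
      consumer_group fluentd
      topics app-events
      format json
      time_source record
      record_time_key timestamp
      time_format %Y-%m-%dT%H:%M:%S.%N%z
    </source>

Without `time_format`, the raw value of the `record_time_key` field is used as the event time.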
@@ -142,7 +141,7 @@ Consuming topic name is used for event tag. So when the target topic name is `ap
 
  ### Output plugin
 
- This plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.
+ This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.
 
  <match app.**>
  @type kafka2
@@ -157,6 +156,8 @@ This plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the
  default_message_key (string) :default => nil
  exclude_topic_key (bool) :default => false
  exclude_partition_key (bool) :default => false
+ exclude_partition (bool) :default => false
+ exclude_message_key (bool) :default => false
  get_kafka_client_log (bool) :default => false
  headers (hash) :default => {}
  headers_from_record (hash) :default => {}
@@ -180,7 +181,7 @@ This plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the
 
  # ruby-kafka producer options
  idempotent (bool) :default => false
- sasl_over_ssl (bool) :default => false
+ sasl_over_ssl (bool) :default => true
  max_send_retries (integer) :default => 1
  required_acks (integer) :default => -1
  ack_timeout (integer) :default => nil (Use default of ruby-kafka)
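The flipped `sasl_over_ssl` default (now `true`) means SASL PLAIN against a non-TLS listener must opt out explicitly after upgrading. A minimal sketch, assuming plain SASL on a plaintext broker (broker, topic, and credentials are placeholders):

    <match app.**>
      @type kafka2
      brokers broker1:9092
      default_topic app-events
      username fluentd
      password secret
      sasl_over_ssl false
      <format>
        @type json
      </format>
    </match>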
@@ -199,8 +200,6 @@ Supports following ruby-kafka's producer options.
  - required_acks - default: -1 - The number of acks required per request. If you need flush performance, set lower value, e.g. 1, 2.
  - ack_timeout - default: nil - How long the producer waits for acks. The unit is seconds.
  - compression_codec - default: nil - The codec the producer uses to compress messages.
- - kafka_agg_max_bytes - default: 4096 - Maximum value of total message size to be included in one batch transmission.
- - kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.
  - max_send_limit_bytes - default: nil - Max byte size to send message to avoid MessageSizeTooLarge. For example, if you set 1000000(message.max.bytes in kafka), Message more than 1000000 byes will be dropped.
  - discard_kafka_delivery_failed - default: false - discard the record where [Kafka::DeliveryFailed](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) occurred
  - monitoring_list - default: [] - library to be used to monitor. statsd and datadog are supported
@@ -277,6 +276,7 @@ The configuration format is jsonpath. It is descibed in https://docs.fluentd.org
  ### Buffered output plugin
 
  This plugin uses ruby-kafka producer for writing data. This plugin is for v0.12. If you use v1, see `kafka2`.
+ Support of fluentd v0.12 has ended. `kafka_buffered` will be an alias of `kafka2` and will be removed in the future.
 
  <match app.**>
  @type kafka_buffered
@@ -293,6 +293,10 @@ This plugin uses ruby-kafka producer for writing data. This plugin is for v0.12.
  default_topic (string) :default => nil
  default_partition_key (string) :default => nil
  default_message_key (string) :default => nil
+ exclude_topic_key (bool) :default => false
+ exclude_partition_key (bool) :default => false
+ exclude_partition (bool) :default => false
+ exclude_message_key (bool) :default => false
  output_data_type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
  output_include_tag (bool) :default => false
  output_include_time (bool) :default => false
@@ -304,7 +308,7 @@ This plugin uses ruby-kafka producer for writing data. This plugin is for v0.12.
 
  # ruby-kafka producer options
  idempotent (bool) :default => false
- sasl_over_ssl (bool) :default => false
+ sasl_over_ssl (bool) :default => true
  max_send_retries (integer) :default => 1
  required_acks (integer) :default => -1
  ack_timeout (integer) :default => nil (Use default of ruby-kafka)
@@ -316,6 +320,11 @@ This plugin uses ruby-kafka producer for writing data. This plugin is for v0.12.
  monitoring_list (array) :default => []
  </match>
 
+ `kafka_buffered` has two additional parameters:
+
+ - kafka_agg_max_bytes - default: 4096 - Maximum value of total message size to be included in one batch transmission.
+ - kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.
+
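A hedged `kafka_buffered` sketch using the two aggregation parameters above (broker and topic are placeholders; values are illustrative and should stay below the broker's `message.max.bytes`):

    <match app.**>
      @type kafka_buffered
      brokers broker1:9092
      default_topic app-events
      output_data_type json
      kafka_agg_max_bytes 32768
      kafka_agg_max_messages 1000
    </match>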
  ### Non-buffered output plugin
 
  This plugin uses ruby-kafka producer for writing data. For performance and reliability concerns, use `kafka_bufferd` output instead. This is mainly for testing.
@@ -350,10 +359,10 @@ This plugin also supports ruby-kafka related parameters. See Buffered output plu
 
  ### rdkafka based output plugin
 
- This plugin uses `rdkafka` instead of `ruby-kafka` for ruby client.
+ This plugin uses `rdkafka` instead of `ruby-kafka` as the kafka client.
  You need to install rdkafka gem.
 
- # rdkafka is C extension library so need development tools like ruby-devel, gcc and etc
+ # rdkafka is a C extension library, so you need development tools such as ruby-devel and gcc to install it
  # for v0.12 or later
  $ gem install rdkafka --no-document
  # for v0.11 or earlier
@@ -426,6 +435,18 @@ If you use v0.12, use `rdkafka` instead.
  }
  </match>
 
+ ## FAQ
+
+ ### Why can't fluent-plugin-kafka send data to our kafka cluster?
+
+ We get many similar questions. In almost all cases, the problem is a version mismatch between ruby-kafka and the kafka cluster.
+ See the ruby-kafka README for more details: https://github.com/zendesk/ruby-kafka#compatibility
+
+ To avoid the problem, there are 2 approaches:
+
+ - Upgrade your kafka cluster to the latest version. This is the better option because recent versions are faster and more robust.
+ - Downgrade ruby-kafka/fluent-plugin-kafka to a version that works with your older kafka.
+
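For the downgrade approach, pinning an older plugin release also pins a compatible ruby-kafka through the gemspec constraint shown later in this diff (version illustrative):

    $ gem install fluent-plugin-kafka -v 0.12.1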
  ## Contributing
 
  1. Fork it
data/fluent-plugin-kafka.gemspec CHANGED
@@ -13,12 +13,12 @@ Gem::Specification.new do |gem|
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
  gem.name = "fluent-plugin-kafka"
  gem.require_paths = ["lib"]
- gem.version = '0.12.1'
+ gem.version = '0.13.1'
  gem.required_ruby_version = ">= 2.1.0"
 
  gem.add_dependency "fluentd", [">= 0.10.58", "< 2"]
  gem.add_dependency 'ltsv'
- gem.add_dependency 'ruby-kafka', '>= 0.7.8', '< 0.8.0'
+ gem.add_dependency 'ruby-kafka', '>= 0.7.8', '< 2'
  gem.add_development_dependency "rake", ">= 0.9.2"
  gem.add_development_dependency "test-unit", ">= 3.0.8"
  end
data/lib/fluent/plugin/in_kafka.rb CHANGED
@@ -39,6 +39,8 @@ class Fluent::KafkaInput < Fluent::Input
  :deprecated => "Use 'time_source record' instead."
  config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
  :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+ :desc => "Time field when time_source is 'record'"
  config_param :get_kafka_client_log, :bool, :default => false
  config_param :time_format, :string, :default => nil,
  :desc => "Time format to be used to parse 'time' field."
@@ -190,7 +192,8 @@ class Fluent::KafkaInput < Fluent::Input
  elsif @username != nil && @password != nil
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
- ssl_ca_certs_from_system: @ssl_ca_certs_from_system,sasl_plain_username: @username, sasl_plain_password: @password)
+ ssl_ca_certs_from_system: @ssl_ca_certs_from_system,sasl_plain_username: @username, sasl_plain_password: @password,
+ sasl_over_ssl: @sasl_over_ssl)
  else
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
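This hunk is the 0.12.2 fix "input: Refer sasl_over_ssl parameter in plain SASL": `in_kafka` previously dropped `sasl_over_ssl` on the SASL PLAIN path. A minimal sketch exercising that path, assuming plain SASL without TLS (broker, topic, and credentials are placeholders):

    <source>
      @type kafka
      brokers broker1:9092
      topics app-events
      format json
      username fluentd
      password secret
      sasl_over_ssl false
    </source>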
@@ -291,9 +294,9 @@ class Fluent::KafkaInput < Fluent::Input
  record_time = Fluent::Engine.now
  when :record
  if @time_format
- record_time = @time_parser.parse(record['time'])
+ record_time = @time_parser.parse(record[@record_time_key])
  else
- record_time = record['time']
+ record_time = record[@record_time_key]
  end
  else
  $log.fatal "BUG: invalid time_source: #{@time_source}"
data/lib/fluent/plugin/in_kafka_group.rb CHANGED
@@ -29,6 +29,8 @@ class Fluent::KafkaGroupInput < Fluent::Input
  :deprecated => "Use 'time_source record' instead."
  config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
  :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+ :desc => "Time field when time_source is 'record'"
  config_param :get_kafka_client_log, :bool, :default => false
  config_param :time_format, :string, :default => nil,
  :desc => "Time format to be used to parse 'time' field."
@@ -166,15 +168,17 @@ class Fluent::KafkaGroupInput < Fluent::Input
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
  ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_scram_username: @username, sasl_scram_password: @password,
- sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl)
+ sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  elsif @username != nil && @password != nil
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
- ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_plain_username: @username, sasl_plain_password: @password)
+ ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_plain_username: @username, sasl_plain_password: @password,
+ sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  else
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
- ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab)
+ ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab,
+ ssl_verify_hostname: @ssl_verify_hostname)
  end
 
  @consumer = setup_consumer
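All three authentication branches above now forward `ssl_verify_hostname` to ruby-kafka. A hedged sketch disabling hostname verification against a TLS broker, e.g. for a test cluster with a self-signed certificate (broker, topic, and CA path are placeholders; the parameter is assumed to default to true):

    <source>
      @type kafka_group
      brokers broker1:9093
      consumer_group fluentd
      topics app-events
      format json
      ssl_ca_cert /path/to/ca.crt
      ssl_verify_hostname false
    </source>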
@@ -197,7 +201,14 @@ class Fluent::KafkaGroupInput < Fluent::Input
  def setup_consumer
  consumer = @kafka.consumer(@consumer_opts)
  @topics.each { |topic|
- consumer.subscribe(topic, start_from_beginning: @start_from_beginning, max_bytes_per_partition: @max_bytes)
+ if m = /^\/(.+)\/$/.match(topic)
+ topic_or_regex = Regexp.new(m[1])
+ $log.info "Subscribe to topics matching the regex #{topic}"
+ else
+ topic_or_regex = topic
+ $log.info "Subscribe to topic #{topic}"
+ end
+ consumer.subscribe(topic_or_regex, start_from_beginning: @start_from_beginning, max_bytes_per_partition: @max_bytes)
  }
  consumer
  end
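Per the branch above, a topic written as `/pattern/` is compiled with `Regexp.new` and handed to ruby-kafka's `subscribe`, which accepts a regex. A minimal sketch subscribing to every topic under a prefix (names are placeholders; the pattern must not contain a comma, since `topics` is split on commas):

    <source>
      @type kafka_group
      brokers broker1:9092
      consumer_group fluentd
      topics /^app\..+/
      format json
    </source>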
@@ -242,9 +253,9 @@ class Fluent::KafkaGroupInput < Fluent::Input
  record_time = Fluent::Engine.now
  when :record
  if @time_format
- record_time = @time_parser.parse(record['time'].to_s)
+ record_time = @time_parser.parse(record[@record_time_key].to_s)
  else
- record_time = record['time']
+ record_time = record[@record_time_key]
  end
  else
  log.fatal "BUG: invalid time_source: #{@time_source}"
data/lib/fluent/plugin/out_kafka.rb CHANGED
@@ -131,6 +131,8 @@ DESC
  def configure(conf)
  super
 
+ log.warn "Support of fluentd v0.12 has ended. Use kafka2 instead. kafka will be an alias of kafka2"
+
  if @zookeeper
  require 'zookeeper'
  else
data/lib/fluent/plugin/out_kafka2.rb CHANGED
@@ -15,6 +15,7 @@ module Fluent::Plugin
  Set brokers directly:
  <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
  DESC
+ config_param :topic, :string, :default => nil, :desc => "kafka topic. Placeholders are supported"
  config_param :topic_key, :string, :default => 'topic', :desc => "Field for kafka topic"
  config_param :default_topic, :string, :default => nil,
  :desc => "Default output topic when record doesn't have topic field"
@@ -215,7 +216,11 @@ DESC
  # TODO: optimize write performance
  def write(chunk)
  tag = chunk.metadata.tag
- topic = (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+ topic = if @topic
+ extract_placeholders(@topic, chunk)
+ else
+ (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+ end
 
  messages = 0
  record_buf = nil
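The new `topic` parameter runs through `extract_placeholders`, so chunk placeholders resolve once the buffer is keyed on them; `out_rdkafka2` below gets the identical change. A minimal sketch routing each tag to a topic of the same name (broker is a placeholder):

    <match app.**>
      @type kafka2
      brokers broker1:9092
      topic ${tag}
      <format>
        @type json
      </format>
      <buffer tag>
        flush_interval 10s
      </buffer>
    </match>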
data/lib/fluent/plugin/out_kafka_buffered.rb CHANGED
@@ -159,6 +159,8 @@ DESC
  def configure(conf)
  super
 
+ log.warn "Support of fluentd v0.12 has ended. Use kafka2 instead. kafka_buffered will be an alias of kafka2"
+
  if @zookeeper
  require 'zookeeper'
  else
data/lib/fluent/plugin/out_rdkafka.rb CHANGED
@@ -1,4 +1,5 @@
  require 'thread'
+ require 'logger'
  require 'fluent/output'
  require 'fluent/plugin/kafka_plugin_util'
 
@@ -91,8 +92,22 @@ DESC
  super
  log.instance_eval {
  def add(level, &block)
- if block
+ return unless block
+
+ # Follow rdkakfa's log level. See also rdkafka-ruby's bindings.rb: https://github.com/appsignal/rdkafka-ruby/blob/e5c7261e3f2637554a5c12b924be297d7dca1328/lib/rdkafka/bindings.rb#L117
+ case level
+ when Logger::FATAL
+ self.fatal(block.call)
+ when Logger::ERROR
+ self.error(block.call)
+ when Logger::WARN
+ self.warn(block.call)
+ when Logger::INFO
  self.info(block.call)
+ when Logger::DEBUG
+ self.debug(block.call)
+ else
+ self.trace(block.call)
  end
  end
  }
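With this mapping, librdkafka messages keep their own severity instead of all being logged at info, so fluentd's global log level now filters rdkafka output as well. For example, raising the level silences librdkafka debug/info chatter:

    <system>
      log_level warn
    </system>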
data/lib/fluent/plugin/out_rdkafka2.rb CHANGED
@@ -1,4 +1,5 @@
  require 'thread'
+ require 'logger'
  require 'fluent/plugin/output'
  require 'fluent/plugin/kafka_plugin_util'
 
@@ -32,6 +33,7 @@ Set brokers directly:
  <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
  Brokers: you can choose to use either brokers or zookeeper.
  DESC
+ config_param :topic, :string, :default => nil, :desc => "kafka topic. Placeholders are supported"
  config_param :topic_key, :string, :default => 'topic', :desc => "Field for kafka topic"
  config_param :default_topic, :string, :default => nil,
  :desc => "Default output topic when record doesn't have topic field"
@@ -108,8 +110,22 @@ DESC
  super
  log.instance_eval {
  def add(level, &block)
- if block
+ return unless block
+
+ # Follow rdkakfa's log level. See also rdkafka-ruby's bindings.rb: https://github.com/appsignal/rdkafka-ruby/blob/e5c7261e3f2637554a5c12b924be297d7dca1328/lib/rdkafka/bindings.rb#L117
+ case level
+ when Logger::FATAL
+ self.fatal(block.call)
+ when Logger::ERROR
+ self.error(block.call)
+ when Logger::WARN
+ self.warn(block.call)
+ when Logger::INFO
  self.info(block.call)
+ when Logger::DEBUG
+ self.debug(block.call)
+ else
+ self.trace(block.call)
  end
  end
  }
@@ -263,7 +279,11 @@ DESC
 
  def write(chunk)
  tag = chunk.metadata.tag
- topic = (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+ topic = if @topic
+ extract_placeholders(@topic, chunk)
+ else
+ (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+ end
 
  handlers = []
  record_buf = nil
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-kafka
  version: !ruby/object:Gem::Version
- version: 0.12.1
+ version: 0.13.1
  platform: ruby
  authors:
  - Hidemasa Togashi
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-10-15 00:00:00.000000000 Z
+ date: 2020-07-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fluentd
@@ -54,7 +54,7 @@ dependencies:
  version: 0.7.8
  - - "<"
  - !ruby/object:Gem::Version
- version: 0.8.0
+ version: '2'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
@@ -64,7 +64,7 @@ dependencies:
  version: 0.7.8
  - - "<"
  - !ruby/object:Gem::Version
- version: 0.8.0
+ version: '2'
  - !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement