fluent-plugin-kafka 0.12.3 → 0.14.1

This diff reflects the changes between the publicly released package versions as they appear in their public registry, and is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a5c5fbae81fab15c17ce05013a3a26d924b91cb58c5be753ffdd2c87733667c6
- data.tar.gz: fc40bae1a4d1c391c9da98c52e442065f2291e7899a8ab20eb4067181efd6a68
+ metadata.gz: 27a59b5835dff5d64dcf78bd5d3bf945341c9b734de476b16f9217afa2839a22
+ data.tar.gz: fdac125fa11e88712059f0f0794ca199ef248d562b1e1aad389e6b0dace9c777
  SHA512:
- metadata.gz: f5df728c0e3516ab4548ad139972fa79729dd5f25d4c5f94423217110f668f15d65fb2d191c6231f446775874ffd2e6b834ad678b8587c1bf47cede4248587f4
- data.tar.gz: 8a88fc87aef958ec0c43400a6a67e25a01475a607e9d4f38e4da39ac5a926cc2a29199fc911c1258a619a8ca844a2e32d0e2755e8f1c029056fb3e1fb66794d7
+ metadata.gz: faf2abd472b6af6b010409750d6b0e3483ba8748af8930c4dad05d6e6b4d9aca5dc7c55d29b2e4ed3b092b12612f0d19e46b9878463ecbe417e417d7c3ee522b
+ data.tar.gz: f51803596ea03e0f6dfc9f83abaf070179136f497dc89ad48eed0582c8d580ad88e105f5f4e075096421f6dc0acf8bf1e58fd75738e7c49636acca01ec347a46
data/ChangeLog CHANGED
@@ -1,3 +1,25 @@
+ Release 0.14.1 - 2020/08/11
+
+ * kafka_producer_ext: Fix regression by v0.14.0 changes
+
+ Release 0.14.0 - 2020/08/07
+
+ * Update ruby-kafka dependency to v1.2.0 or later. Check https://github.com/zendesk/ruby-kafka#compatibility
+ * kafka_producer_ext: Follow Partitioner API change
+
+ Release 0.13.1 - 2020/07/17
+
+ * in_kafka_group: Support ssl_verify_hostname parameter
+ * out_kafka2/out_rdkafka2: Support topic parameter with placeholders
+
+ Release 0.13.0 - 2020/03/09
+
+ * Accept ruby-kafka v1 or later
+
+ Release 0.12.4 - 2020/03/03
+
+ * output: Follow rdkafka log level
+
  Release 0.12.3 - 2020/02/06

  * output: Show warning message for v0.12 plugins
data/README.md CHANGED
@@ -121,7 +121,8 @@ Consume events by kafka consumer group features..
  add_prefix <tag prefix (Optional)>
  add_suffix <tag suffix (Optional)>
  retry_emit_limit <Wait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
- use_record_time <If true, replace event time with contents of 'time' field of fetched record>
+ use_record_time (Deprecated. Use 'time_source record' instead.) <If true, replace event time with contents of 'time' field of fetched record>
+ time_source <source for message timestamp (now|kafka|record)> :default => now
  time_format <string (Optional when use_record_time is used)>

  # ruby-kafka consumer options
@@ -140,7 +141,7 @@ Consuming topic name is used for event tag. So when the target topic name is `ap

  ### Output plugin

- This plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.
+ This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.

  <match app.**>
    @type kafka2
@@ -155,6 +156,8 @@ This plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the
  default_message_key (string) :default => nil
  exclude_topic_key (bool) :default => false
  exclude_partition_key (bool) :default => false
+ exclude_partition (bool) :default => false
+ exclude_message_key (bool) :default => false
  get_kafka_client_log (bool) :default => false
  headers (hash) :default => {}
  headers_from_record (hash) :default => {}
@@ -197,8 +200,6 @@ Supports following ruby-kafka's producer options.
  - required_acks - default: -1 - The number of acks required per request. If you need flush performance, set lower value, e.g. 1, 2.
  - ack_timeout - default: nil - How long the producer waits for acks. The unit is seconds.
  - compression_codec - default: nil - The codec the producer uses to compress messages.
- - kafka_agg_max_bytes - default: 4096 - Maximum value of total message size to be included in one batch transmission.
- - kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.
  - max_send_limit_bytes - default: nil - Max byte size to send message to avoid MessageSizeTooLarge. For example, if you set 1000000(message.max.bytes in kafka), Message more than 1000000 bytes will be dropped.
  - discard_kafka_delivery_failed - default: false - discard the record where [Kafka::DeliveryFailed](http://www.rubydoc.info/gems/ruby-kafka/Kafka/DeliveryFailed) occurred
  - monitoring_list - default: [] - library to be used to monitor. statsd and datadog are supported
@@ -292,6 +293,10 @@ Support of fluentd v0.12 has ended. `kafka_buffered` will be an alias of `kafka2
  default_topic (string) :default => nil
  default_partition_key (string) :default => nil
  default_message_key (string) :default => nil
+ exclude_topic_key (bool) :default => false
+ exclude_partition_key (bool) :default => false
+ exclude_partition (bool) :default => false
+ exclude_message_key (bool) :default => false
  output_data_type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
  output_include_tag (bool) :default => false
  output_include_time (bool) :default => false
@@ -315,6 +320,11 @@ Support of fluentd v0.12 has ended. `kafka_buffered` will be an alias of `kafka2
  monitoring_list (array) :default => []
  </match>

+ `kafka_buffered` has two additional parameters:
+
+ - kafka_agg_max_bytes - default: 4096 - Maximum value of total message size to be included in one batch transmission.
+ - kafka_agg_max_messages - default: nil - Maximum number of messages to include in one batch transmission.
+
  ### Non-buffered output plugin

  This plugin uses ruby-kafka producer for writing data. For performance and reliability concerns, use `kafka_buffered` output instead. This is mainly for testing.
@@ -349,10 +359,10 @@ This plugin also supports ruby-kafka related parameters. See Buffered output plu

  ### rdkafka based output plugin

- This plugin uses `rdkafka` instead of `ruby-kafka` for ruby client.
+ This plugin uses `rdkafka` instead of `ruby-kafka` for kafka client.
  You need to install rdkafka gem.

-     # rdkafka is C extension library so need development tools like ruby-devel, gcc and etc
+     # rdkafka is C extension library. Need to install development tools like ruby-devel, gcc and etc
      # for v0.12 or later
      $ gem install rdkafka --no-document
      # for v0.11 or earlier
@@ -434,7 +444,7 @@ See ruby-kafka README for more details: https://github.com/zendesk/ruby-kafka#co

  To avoid the problem, there are 2 approaches:

- - Upgrade your kafka cluster to latest version. This is better becase recent version is faster and robust.
+ - Upgrade your kafka cluster to latest version. This is better because recent version is faster and robust.
  - Downgrade ruby-kafka/fluent-plugin-kafka to work with your older kafka.

  ## Contributing
data/fluent-plugin-kafka.gemspec CHANGED
@@ -13,12 +13,12 @@ Gem::Specification.new do |gem|
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
  gem.name = "fluent-plugin-kafka"
  gem.require_paths = ["lib"]
- gem.version = '0.12.3'
+ gem.version = '0.14.1'
  gem.required_ruby_version = ">= 2.1.0"

  gem.add_dependency "fluentd", [">= 0.10.58", "< 2"]
  gem.add_dependency 'ltsv'
- gem.add_dependency 'ruby-kafka', '>= 0.7.8', '< 0.8.0'
+ gem.add_dependency 'ruby-kafka', '>= 1.2.0', '< 2'
  gem.add_development_dependency "rake", ">= 0.9.2"
  gem.add_development_dependency "test-unit", ">= 3.0.8"
  end
data/lib/fluent/plugin/in_kafka.rb CHANGED
@@ -39,6 +39,8 @@ class Fluent::KafkaInput < Fluent::Input
               :deprecated => "Use 'time_source record' instead."
  config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
               :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+              :desc => "Time field when time_source is 'record'"
  config_param :get_kafka_client_log, :bool, :default => false
  config_param :time_format, :string, :default => nil,
               :desc => "Time format to be used to parse 'time' field."
@@ -292,9 +294,9 @@ class Fluent::KafkaInput < Fluent::Input
    record_time = Fluent::Engine.now
  when :record
    if @time_format
-     record_time = @time_parser.parse(record['time'])
+     record_time = @time_parser.parse(record[@record_time_key])
    else
-     record_time = record['time']
+     record_time = record[@record_time_key]
    end
  else
    $log.fatal "BUG: invalid time_source: #{@time_source}"
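For context on the change above: `time_source record` now reads the field named by `record_time_key` (default `'time'`) instead of a hard-coded `'time'` key. A minimal standalone sketch, assuming a record that carries its own timestamp and a fluentd install that provides `Fluent::TimeParser`; the field name and format below are made up:

    require 'fluent/time'

    record = { 'logged_at' => '2020-08-11T12:34:56+09:00', 'message' => 'hello' }

    # With record_time_key 'logged_at' and a matching time_format,
    # the configured field is parsed into the event time.
    time_parser = Fluent::TimeParser.new('%Y-%m-%dT%H:%M:%S%z')
    event_time  = time_parser.parse(record['logged_at'])

    # Without time_format, the raw field value is used as-is.
    raw_time = record['logged_at']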
data/lib/fluent/plugin/in_kafka_group.rb CHANGED
@@ -29,6 +29,8 @@ class Fluent::KafkaGroupInput < Fluent::Input
               :deprecated => "Use 'time_source record' instead."
  config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
               :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+              :desc => "Time field when time_source is 'record'"
  config_param :get_kafka_client_log, :bool, :default => false
  config_param :time_format, :string, :default => nil,
               :desc => "Time format to be used to parse 'time' field."
@@ -166,16 +168,17 @@ class Fluent::KafkaGroupInput < Fluent::Input
    @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
                       ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
                       ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_scram_username: @username, sasl_scram_password: @password,
-                      sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl)
+                      sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  elsif @username != nil && @password != nil
    @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
                       ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
                       ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_plain_username: @username, sasl_plain_password: @password,
-                      sasl_over_ssl: @sasl_over_ssl)
+                      sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  else
    @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, connect_timeout: @connect_timeout, socket_timeout: @socket_timeout, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
                       ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
-                      ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab)
+                      ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab,
+                      ssl_verify_hostname: @ssl_verify_hostname)
  end

  @consumer = setup_consumer
@@ -198,7 +201,14 @@ class Fluent::KafkaGroupInput < Fluent::Input
  def setup_consumer
    consumer = @kafka.consumer(@consumer_opts)
    @topics.each { |topic|
-     consumer.subscribe(topic, start_from_beginning: @start_from_beginning, max_bytes_per_partition: @max_bytes)
+     if m = /^\/(.+)\/$/.match(topic)
+       topic_or_regex = Regexp.new(m[1])
+       $log.info "Subscribe to topics matching the regex #{topic}"
+     else
+       topic_or_regex = topic
+       $log.info "Subscribe to topic #{topic}"
+     end
+     consumer.subscribe(topic_or_regex, start_from_beginning: @start_from_beginning, max_bytes_per_partition: @max_bytes)
    }
    consumer
  end
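For reference, a standalone sketch of the slash-delimited convention introduced above: a `topics` entry wrapped in `/.../` is compiled into a `Regexp` and subscribed as a pattern, while anything else is treated as a literal topic name. The sample topic strings below are made up:

    ["app.events", "/^app\\..+/"].each do |topic|
      if m = /^\/(.+)\/$/.match(topic)
        pattern = Regexp.new(m[1])
        # Slash-wrapped entry: subscribe by regex.
        puts "regex subscription: #{pattern.inspect}, matches app.web? #{pattern.match?('app.web')}"
      else
        # Plain entry: subscribe by literal topic name.
        puts "literal subscription: #{topic}"
      end
    end
    # => literal subscription: app.events
    # => regex subscription: /^app\..+/, matches app.web? true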
@@ -243,9 +253,9 @@ class Fluent::KafkaGroupInput < Fluent::Input
    record_time = Fluent::Engine.now
  when :record
    if @time_format
-     record_time = @time_parser.parse(record['time'].to_s)
+     record_time = @time_parser.parse(record[@record_time_key].to_s)
    else
-     record_time = record['time']
+     record_time = record[@record_time_key]
    end
  else
    log.fatal "BUG: invalid time_source: #{@time_source}"
data/lib/fluent/plugin/kafka_producer_ext.rb CHANGED
@@ -69,12 +69,13 @@ module Kafka
        retry_backoff: retry_backoff,
        max_buffer_size: max_buffer_size,
        max_buffer_bytesize: max_buffer_bytesize,
+       partitioner: @partitioner,
      )
    end
  end

  class TopicProducer
-   def initialize(topic, cluster:, transaction_manager:, logger:, instrumenter:, compressor:, ack_timeout:, required_acks:, max_retries:, retry_backoff:, max_buffer_size:, max_buffer_bytesize:)
+   def initialize(topic, cluster:, transaction_manager:, logger:, instrumenter:, compressor:, ack_timeout:, required_acks:, max_retries:, retry_backoff:, max_buffer_size:, max_buffer_bytesize:, partitioner:)
      @cluster = cluster
      @transaction_manager = transaction_manager
      @logger = logger
@@ -86,6 +87,7 @@ module Kafka
    @max_buffer_size = max_buffer_size
    @max_buffer_bytesize = max_buffer_bytesize
    @compressor = compressor
+   @partitioner = partitioner

    @topic = topic
    @cluster.add_target_topics(Set.new([topic]))
@@ -250,7 +252,7 @@ module Kafka

    begin
      if partition.nil?
-       partition = Partitioner.partition_for_key(partition_count, message)
+       partition = @partitioner.call(partition_count, message)
      end

      @buffer.write(
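Background for the hunks above: ruby-kafka v1.x replaced the `Partitioner.partition_for_key` class method with a partitioner object that responds to `call(partition_count, message)`, so the extension now threads a `partitioner:` through to `TopicProducer`. A minimal sketch of such a callable; the CRC32 scheme is illustrative rather than ruby-kafka's exact default, and `partition_key`/`key` are assumed message accessors:

    require 'zlib'

    # Any object responding to call(partition_count, message) can act as the partitioner.
    crc32_partitioner = lambda do |partition_count, message|
      key = message.partition_key || message.key
      key ? Zlib.crc32(key) % partition_count : rand(partition_count)
    end

    # TopicProducer then picks a partition per pending message:
    #   partition = crc32_partitioner.call(partition_count, message)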
data/lib/fluent/plugin/out_kafka2.rb CHANGED
@@ -15,6 +15,7 @@ module Fluent::Plugin
  Set brokers directly:
  <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
  DESC
+ config_param :topic, :string, :default => nil, :desc => "kafka topic. Placeholders are supported"
  config_param :topic_key, :string, :default => 'topic', :desc => "Field for kafka topic"
  config_param :default_topic, :string, :default => nil,
               :desc => "Default output topic when record doesn't have topic field"
@@ -215,7 +216,11 @@ DESC
  # TODO: optimize write performance
  def write(chunk)
    tag = chunk.metadata.tag
-   topic = (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+   topic = if @topic
+             extract_placeholders(@topic, chunk)
+           else
+             (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+           end

    messages = 0
    record_buf = nil
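The hunk above gives the new `topic` parameter priority, expanding Fluentd placeholders via `extract_placeholders`, and only then falls back to the chunk's topic key, `default_topic`, and finally the tag. A small standalone sketch of that precedence; the helper below merely stands in for Fluentd's placeholder expansion and the names are illustrative:

    # Mirrors the fallback chain in write(chunk) above (illustrative only).
    def resolve_topic(configured_topic, chunk_topic, default_topic, tag)
      if configured_topic
        configured_topic.gsub('${tag}', tag)  # stand-in for extract_placeholders
      else
        chunk_topic || default_topic || tag
      end
    end

    resolve_topic('logs.${tag}', nil, nil, 'app.web')  # => "logs.app.web"
    resolve_topic(nil, nil, 'events', 'app.web')       # => "events"
    resolve_topic(nil, nil, nil, 'app.web')            # => "app.web"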
data/lib/fluent/plugin/out_rdkafka.rb CHANGED
@@ -1,4 +1,5 @@
  require 'thread'
+ require 'logger'
  require 'fluent/output'
  require 'fluent/plugin/kafka_plugin_util'

@@ -91,8 +92,22 @@ DESC
  super
  log.instance_eval {
    def add(level, &block)
-     if block
+     return unless block
+
+     # Follow rdkafka's log level. See also rdkafka-ruby's bindings.rb: https://github.com/appsignal/rdkafka-ruby/blob/e5c7261e3f2637554a5c12b924be297d7dca1328/lib/rdkafka/bindings.rb#L117
+     case level
+     when Logger::FATAL
+       self.fatal(block.call)
+     when Logger::ERROR
+       self.error(block.call)
+     when Logger::WARN
+       self.warn(block.call)
+     when Logger::INFO
        self.info(block.call)
+     when Logger::DEBUG
+       self.debug(block.call)
+     else
+       self.trace(block.call)
      end
    end
  }
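The case statement above dispatches on Ruby's stdlib `Logger` severity constants, so each rdkafka log line is forwarded to the matching Fluentd log method, with anything outside the known constants falling through to `trace`. As a quick reference (stdlib values):

    require 'logger'

    # Severity constants the mapping above dispatches on:
    #   Logger::DEBUG => 0  -> log.debug
    #   Logger::INFO  => 1  -> log.info
    #   Logger::WARN  => 2  -> log.warn
    #   Logger::ERROR => 3  -> log.error
    #   Logger::FATAL => 4  -> log.fatal
    puts [Logger::DEBUG, Logger::INFO, Logger::WARN, Logger::ERROR, Logger::FATAL].inspect
    # => [0, 1, 2, 3, 4]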
data/lib/fluent/plugin/out_rdkafka2.rb CHANGED
@@ -1,4 +1,5 @@
  require 'thread'
+ require 'logger'
  require 'fluent/plugin/output'
  require 'fluent/plugin/kafka_plugin_util'

@@ -32,6 +33,7 @@ Set brokers directly:
  <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>,..
  Brokers: you can choose to use either brokers or zookeeper.
  DESC
+ config_param :topic, :string, :default => nil, :desc => "kafka topic. Placeholders are supported"
  config_param :topic_key, :string, :default => 'topic', :desc => "Field for kafka topic"
  config_param :default_topic, :string, :default => nil,
               :desc => "Default output topic when record doesn't have topic field"
@@ -108,8 +110,22 @@ DESC
  super
  log.instance_eval {
    def add(level, &block)
-     if block
+     return unless block
+
+     # Follow rdkafka's log level. See also rdkafka-ruby's bindings.rb: https://github.com/appsignal/rdkafka-ruby/blob/e5c7261e3f2637554a5c12b924be297d7dca1328/lib/rdkafka/bindings.rb#L117
+     case level
+     when Logger::FATAL
+       self.fatal(block.call)
+     when Logger::ERROR
+       self.error(block.call)
+     when Logger::WARN
+       self.warn(block.call)
+     when Logger::INFO
        self.info(block.call)
+     when Logger::DEBUG
+       self.debug(block.call)
+     else
+       self.trace(block.call)
      end
    end
  }
@@ -263,7 +279,11 @@ DESC

  def write(chunk)
    tag = chunk.metadata.tag
-   topic = (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+   topic = if @topic
+             extract_placeholders(@topic, chunk)
+           else
+             (chunk.metadata.variables && chunk.metadata.variables[@topic_key_sym]) || @default_topic || tag
+           end

    handlers = []
    record_buf = nil
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-kafka
  version: !ruby/object:Gem::Version
-   version: 0.12.3
+   version: 0.14.1
  platform: ruby
  authors:
  - Hidemasa Togashi
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-02-06 00:00:00.000000000 Z
+ date: 2020-08-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: fluentd
@@ -51,20 +51,20 @@ dependencies:
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
-       version: 0.7.8
+       version: 1.2.0
    - - "<"
      - !ruby/object:Gem::Version
-       version: 0.8.0
+       version: '2'
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: 0.7.8
+         version: 1.2.0
      - - "<"
        - !ruby/object:Gem::Version
-         version: 0.8.0
+         version: '2'
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
  requirement: !ruby/object:Gem::Requirement