fluent-plugin-kafka 0.14.1 → 0.15.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 27a59b5835dff5d64dcf78bd5d3bf945341c9b734de476b16f9217afa2839a22
- data.tar.gz: fdac125fa11e88712059f0f0794ca199ef248d562b1e1aad389e6b0dace9c777
+ metadata.gz: 730f14a0d24b98ac7df9683bc0836f826ff80def0ee61c833bc4c1e61bf435ac
+ data.tar.gz: 0a647554fb96f88ae75c8b872399545bfdb6b26d12247f3715c56dda2e628d57
  SHA512:
- metadata.gz: faf2abd472b6af6b010409750d6b0e3483ba8748af8930c4dad05d6e6b4d9aca5dc7c55d29b2e4ed3b092b12612f0d19e46b9878463ecbe417e417d7c3ee522b
- data.tar.gz: f51803596ea03e0f6dfc9f83abaf070179136f497dc89ad48eed0582c8d580ad88e105f5f4e075096421f6dc0acf8bf1e58fd75738e7c49636acca01ec347a46
+ metadata.gz: 802bef06a0cb0703e0e3c1d8c9581a96007b23a6b574d2c64f3bca362f6d4c531cbc4c9c17c78162705d655fc2e864af16182772a1db0ce2150c0e853b4f55ef
+ data.tar.gz: 0d3695bc1289b75cee1071fcf8e697250de9f0177b3facd69dd62e931a7b705fd93b24f886e683586d01f9daf9888d487ff6058da791d3779d09c076af854dd8
data/ChangeLog CHANGED
@@ -1,3 +1,25 @@
+ Release 0.15.3 - 2020/12/08
+
+ * in_kafka: Fix `record_time_key` parameter not working
+
+ Release 0.15.2 - 2020/09/30
+
+ * input: Support 3rd party parser
+
+ Release 0.15.1 - 2020/09/17
+
+ * out_kafka2: Fix wrong class name for configuration error
+
+ Release 0.15.0 - 2020/09/14
+
+ * Add experimental `in_rdkafka_group`
+ * in_kafka: Expose `ssl_verify_hostname` parameter
+
+ Release 0.14.2 - 2020/08/26
+
+ * in_kafka_group: Add `add_headers` parameter
+ * out_kafka2/out_rdkafka2: Support `discard_kafka_delivery_failed` parameter
+
  Release 0.14.1 - 2020/08/11
 
  * kafka_producer_ext: Fix regression by v0.14.0 changes
@@ -10,6 +32,7 @@ Release 0.14.0 - 2020/08/07
  Release 0.13.1 - 2020/07/17
 
  * in_kafka_group: Support ssl_verify_hostname parameter
+ * in_kafka_group: Support regex based topics
  * out_kafka2/out_rdkafka2: Support topic parameter with placeholders
 
  Release 0.13.0 - 2020/03/09
data/README.md CHANGED
@@ -118,6 +118,8 @@ Consume events by kafka consumer group features..
  topics <listening topics(separate with comma',')>
  format <input text type (text|json|ltsv|msgpack)> :default => json
  message_key <key (Optional, for text format only, default is message)>
+ kafka_message_key <key (Optional, If specified, set kafka's message key to this key)>
+ add_headers <If true, add kafka's message headers to record>
  add_prefix <tag prefix (Optional)>
  add_suffix <tag suffix (Optional)>
  retry_emit_limit <Wait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
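
For reference, a minimal `in_kafka_group` source that exercises the two new parameters might look like the sketch below; the broker list, consumer group, and topic names are placeholders, not values taken from this release.

  <source>
    @type kafka_group
    brokers localhost:9092
    consumer_group fluentd-example-group
    topics app_event
    format json
    # store each message's Kafka key under "_key" in the emitted record
    kafka_message_key _key
    # copy Kafka message headers into the emitted record
    add_headers true
  </source>
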
@@ -137,11 +139,47 @@ Consume events by kafka consumer group features..
 
  See also [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-messages-from-kafka) for more detailed documentation about ruby-kafka options.
 
+ `topics` supports regex pattern since v0.13.1. If you want to use a regex pattern, use `/pattern/` like `/foo.*/`.
+
+ Consuming topic name is used for event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify tag, use `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.
+
+ ### Input plugin (@type 'rdkafka_group', supports kafka consumer groups, uses rdkafka-ruby)
+
+ :warning: **The in_rdkafka_group consumer was not yet tested under heavy production load. Use it at your own risk!**
+
+ With the introduction of the rdkafka-ruby based input plugin we hope to support Kafka brokers above version 2.1 where we saw [compatibility issues](https://github.com/fluent/fluent-plugin-kafka/issues/315) when using the ruby-kafka based @kafka_group input type. The rdkafka-ruby lib wraps the highly performant and production ready librdkafka C lib.
+
+ <source>
+ @type rdkafka_group
+ topics <listening topics(separate with comma',')>
+ format <input text type (text|json|ltsv|msgpack)> :default => json
+ message_key <key (Optional, for text format only, default is message)>
+ kafka_message_key <key (Optional, If specified, set kafka's message key to this key)>
+ add_headers <If true, add kafka's message headers to record>
+ add_prefix <tag prefix (Optional)>
+ add_suffix <tag suffix (Optional)>
+ retry_emit_limit <Wait retry_emit_limit x 1s when BufferQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
+ use_record_time (Deprecated. Use 'time_source record' instead.) <If true, replace event time with contents of 'time' field of fetched record>
+ time_source <source for message timestamp (now|kafka|record)> :default => now
+ time_format <string (Optional when use_record_time is used)>
+
+ # kafka consumer options
+ max_wait_time_ms 500
+ max_batch_size 10000
+ kafka_configs {
+ "bootstrap.servers": "brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>",
+ "group.id": "<consumer group name>"
+ }
+ </source>
+
+ See also [rdkafka-ruby](https://github.com/appsignal/rdkafka-ruby) and [librdkafka](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) for more detailed documentation about Kafka consumer options.
+
  Consuming topic name is used for event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify tag, use `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.
 
  ### Output plugin
 
- This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.
+ This `kafka2` plugin is for fluentd v1 or later. This plugin uses `ruby-kafka` producer for writing data.
+ If `ruby-kafka` doesn't fit your kafka environment, check `rdkafka2` plugin instead. This will be `out_kafka` plugin in the future.
 
  <match app.**>
  @type kafka2
@@ -162,6 +200,7 @@ This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plug
  headers (hash) :default => {}
  headers_from_record (hash) :default => {}
  use_default_for_unknown_topic (bool) :default => false
+ discard_kafka_delivery_failed (bool) :default => false (No discard)
 
  <format>
  @type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
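
As a usage sketch (broker and topic names are placeholders), enabling `discard_kafka_delivery_failed` makes `out_kafka2` log a warning and clear the producer buffer when `Kafka::DeliveryFailed` is raised, instead of re-raising the error and letting Fluentd retry the chunk:

  <match app.**>
    @type kafka2
    brokers localhost:9092
    default_topic app_event
    # on Kafka::DeliveryFailed: warn, clear the producer buffer, and move on
    discard_kafka_delivery_failed true
    <format>
      @type json
    </format>
  </match>
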
@@ -385,6 +424,7 @@ You need to install rdkafka gem.
  default_message_key (string) :default => nil
  exclude_topic_key (bool) :default => false
  exclude_partition_key (bool) :default => false
+ discard_kafka_delivery_failed (bool) :default => false (No discard)
 
  # same with kafka2
  headers (hash) :default => {}
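
The librdkafka-based output accepts the same flag; a comparable sketch (again with placeholder broker and topic names) would be:

  <match app.**>
    @type rdkafka2
    brokers localhost:9092
    default_topic app_event
    # on a send exception: warn and drop the events instead of raising for retry
    discard_kafka_delivery_failed true
    <format>
      @type json
    </format>
  </match>
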
fluent-plugin-kafka.gemspec CHANGED
@@ -13,7 +13,7 @@ Gem::Specification.new do |gem|
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
  gem.name = "fluent-plugin-kafka"
  gem.require_paths = ["lib"]
- gem.version = '0.14.1'
+ gem.version = '0.15.3'
  gem.required_ruby_version = ">= 2.1.0"
 
  gem.add_dependency "fluentd", [">= 0.10.58", "< 2"]
lib/fluent/plugin/in_kafka.rb CHANGED
@@ -113,7 +113,7 @@ class Fluent::KafkaInput < Fluent::Input
 
  require 'zookeeper' if @offset_zookeeper
 
- @parser_proc = setup_parser
+ @parser_proc = setup_parser(conf)
 
  @time_source = :record if @use_record_time
 
@@ -126,7 +126,7 @@ class Fluent::KafkaInput < Fluent::Input
  end
  end
 
- def setup_parser
+ def setup_parser(conf)
  case @format
  when 'json'
  begin
@@ -165,6 +165,14 @@ class Fluent::KafkaInput < Fluent::Input
  add_offset_in_hash(r, te, msg.offset) if @add_offset_in_record
  r
  }
+ else
+ @custom_parser = Fluent::Plugin.new_parser(conf['format'])
+ @custom_parser.configure(conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.value) {|_time, record|
+ record
+ }
+ }
  end
  end
 
@@ -188,16 +196,17 @@ class Fluent::KafkaInput < Fluent::Input
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
  ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_scram_username: @username, sasl_scram_password: @password,
- sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl)
+ sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  elsif @username != nil && @password != nil
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
  ssl_ca_certs_from_system: @ssl_ca_certs_from_system,sasl_plain_username: @username, sasl_plain_password: @password,
- sasl_over_ssl: @sasl_over_ssl)
+ sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  else
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
- ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab)
+ ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab,
+ ssl_verify_hostname: @ssl_verify_hostname)
  end
 
  @zookeeper = Zookeeper.new(@offset_zookeeper) if @offset_zookeeper
@@ -215,6 +224,7 @@ class Fluent::KafkaInput < Fluent::Input
  router,
  @kafka_message_key,
  @time_source,
+ @record_time_key,
  opt)
  }
  @topic_watchers.each {|tw|
@@ -239,7 +249,7 @@ class Fluent::KafkaInput < Fluent::Input
  end
 
  class TopicWatcher < Coolio::TimerWatcher
- def initialize(topic_entry, kafka, interval, parser, add_prefix, add_suffix, offset_manager, router, kafka_message_key, time_source, options={})
+ def initialize(topic_entry, kafka, interval, parser, add_prefix, add_suffix, offset_manager, router, kafka_message_key, time_source, record_time_key, options={})
  @topic_entry = topic_entry
  @kafka = kafka
  @callback = method(:consume)
@@ -251,6 +261,7 @@ class Fluent::KafkaInput < Fluent::Input
  @router = router
  @kafka_message_key = kafka_message_key
  @time_source = time_source
+ @record_time_key = record_time_key
 
  @next_offset = @topic_entry.offset
  if @topic_entry.offset == -1 && offset_manager
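
Taken together, the `in_kafka` changes above forward `ssl_verify_hostname` to the ruby-kafka client and thread `record_time_key` through to the topic watcher. A hedged configuration sketch using both (broker address, topic, and certificate paths are placeholders):

  <source>
    @type kafka
    brokers broker1.example.com:9093
    topics app_event
    format json
    ssl_ca_cert /path/to/ca.crt
    ssl_client_cert /path/to/client.crt
    ssl_client_cert_key /path/to/client.key
    # now passed through to Kafka.new in every SASL/SSL branch
    ssl_verify_hostname false
    # take the event time from the "logged_at" field of each record
    time_source record
    record_time_key logged_at
  </source>
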
lib/fluent/plugin/in_kafka_group.rb CHANGED
@@ -18,6 +18,8 @@ class Fluent::KafkaGroupInput < Fluent::Input
  :desc => "Supported format: (json|text|ltsv|msgpack)"
  config_param :message_key, :string, :default => 'message',
  :desc => "For 'text' format only."
+ config_param :add_headers, :bool, :default => false,
+ :desc => "Add kafka's message headers to event record"
  config_param :add_prefix, :string, :default => nil,
  :desc => "Tag prefix (Optional)"
  config_param :add_suffix, :string, :default => nil,
@@ -115,7 +117,7 @@ class Fluent::KafkaGroupInput < Fluent::Input
  @max_wait_time = conf['max_wait_ms'].to_i / 1000
  end
 
- @parser_proc = setup_parser
+ @parser_proc = setup_parser(conf)
 
  @consumer_opts = {:group_id => @consumer_group}
  @consumer_opts[:session_timeout] = @session_timeout if @session_timeout
@@ -138,7 +140,7 @@ class Fluent::KafkaGroupInput < Fluent::Input
  end
  end
 
- def setup_parser
+ def setup_parser(conf)
  case @format
  when 'json'
  begin
@@ -157,6 +159,14 @@ class Fluent::KafkaGroupInput < Fluent::Input
  Proc.new { |msg| MessagePack.unpack(msg.value) }
  when 'text'
  Proc.new { |msg| {@message_key => msg.value} }
+ else
+ @custom_parser = Fluent::Plugin.new_parser(conf['format'])
+ @custom_parser.configure(conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.value) {|_time, record|
+ record
+ }
+ }
  end
  end
 
@@ -263,6 +273,11 @@ class Fluent::KafkaGroupInput < Fluent::Input
  if @kafka_message_key
  record[@kafka_message_key] = msg.key
  end
+ if @add_headers
+ msg.headers.each_pair { |k, v|
+ record[k] = v
+ }
+ end
  es.add(record_time, record)
  rescue => e
  log.warn "parser error in #{batch.topic}/#{batch.partition}", :error => e.to_s, :value => msg.value, :offset => msg.offset
lib/fluent/plugin/in_rdkafka_group.rb ADDED
@@ -0,0 +1,305 @@
+ require 'fluent/plugin/input'
+ require 'fluent/time'
+ require 'fluent/plugin/kafka_plugin_util'
+
+ require 'rdkafka'
+
+ class Fluent::Plugin::RdKafkaGroupInput < Fluent::Plugin::Input
+ Fluent::Plugin.register_input('rdkafka_group', self)
+
+ helpers :thread, :parser, :compat_parameters
+
+ config_param :topics, :string,
+ :desc => "Listening topics(separate with comma',')."
+
+ config_param :format, :string, :default => 'json',
+ :desc => "Supported format: (json|text|ltsv|msgpack)"
+ config_param :message_key, :string, :default => 'message',
+ :desc => "For 'text' format only."
+ config_param :add_headers, :bool, :default => false,
+ :desc => "Add kafka's message headers to event record"
+ config_param :add_prefix, :string, :default => nil,
+ :desc => "Tag prefix (Optional)"
+ config_param :add_suffix, :string, :default => nil,
+ :desc => "Tag suffix (Optional)"
+ config_param :use_record_time, :bool, :default => false,
+ :desc => "Replace message timestamp with contents of 'time' field.",
+ :deprecated => "Use 'time_source record' instead."
+ config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
+ :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+ :desc => "Time field when time_source is 'record'"
+ config_param :time_format, :string, :default => nil,
+ :desc => "Time format to be used to parse 'time' field."
+ config_param :kafka_message_key, :string, :default => nil,
+ :desc => "Set kafka's message key to this field"
+
+ config_param :retry_emit_limit, :integer, :default => nil,
+ :desc => "How long to stop event consuming when BufferQueueLimitError happens. Wait retry_emit_limit x 1s. The default is waiting until BufferQueueLimitError is resolved"
+ config_param :retry_wait_seconds, :integer, :default => 30
+ config_param :disable_retry_limit, :bool, :default => false,
+ :desc => "If set true, it disables retry_limit and make Fluentd retry indefinitely (default: false)"
+ config_param :retry_limit, :integer, :default => 10,
+ :desc => "The maximum number of retries for connecting kafka (default: 10)"
+
+ config_param :max_wait_time_ms, :integer, :default => 250,
+ :desc => "How long to block polls in milliseconds until the server sends us data."
+ config_param :max_batch_size, :integer, :default => 10000,
+ :desc => "Maximum number of log lines emitted in a single batch."
+
+ config_param :kafka_configs, :hash, :default => {},
+ :desc => "Kafka configuration properties as described in https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md"
+
+ config_section :parse do
+ config_set_default :@type, 'json'
+ end
+
+ include Fluent::KafkaPluginUtil::SSLSettings
+ include Fluent::KafkaPluginUtil::SaslSettings
+
+ class ForShutdown < StandardError
+ end
+
+ BufferError = Fluent::Plugin::Buffer::BufferOverflowError
+
+ def initialize
+ super
+
+ @time_parser = nil
+ @retry_count = 1
+ end
+
+ def _config_to_array(config)
+ config_array = config.split(',').map {|k| k.strip }
+ if config_array.empty?
+ raise Fluent::ConfigError, "kafka_group: '#{config}' is a required parameter"
+ end
+ config_array
+ end
+
+ def multi_workers_ready?
+ true
+ end
+
+ private :_config_to_array
+
+ def configure(conf)
+ compat_parameters_convert(conf, :parser)
+
+ super
+
+ log.warn "The in_rdkafka_group consumer was not yet tested under heavy production load. Use it at your own risk!"
+
+ log.info "Will watch for topics #{@topics} at brokers " \
+ "#{@kafka_configs["bootstrap.servers"]} and '#{@kafka_configs["group.id"]}' group"
+
+ @topics = _config_to_array(@topics)
+
+ parser_conf = conf.elements('parse').first
+ unless parser_conf
+ raise Fluent::ConfigError, "<parse> section or format parameter is required."
+ end
+ unless parser_conf["@type"]
+ raise Fluent::ConfigError, "parse/@type is required."
+ end
+ @parser_proc = setup_parser(parser_conf)
+
+ @time_source = :record if @use_record_time
+
+ if @time_source == :record and @time_format
+ @time_parser = Fluent::TimeParser.new(@time_format)
+ end
+ end
+
+ def setup_parser(parser_conf)
+ format = parser_conf["@type"]
+ case format
+ when 'json'
+ begin
+ require 'oj'
+ Oj.default_options = Fluent::DEFAULT_OJ_OPTIONS
+ Proc.new { |msg| Oj.load(msg.payload) }
+ rescue LoadError
+ require 'yajl'
+ Proc.new { |msg| Yajl::Parser.parse(msg.payload) }
+ end
+ when 'ltsv'
+ require 'ltsv'
+ Proc.new { |msg| LTSV.parse(msg.payload, {:symbolize_keys => false}).first }
+ when 'msgpack'
+ require 'msgpack'
+ Proc.new { |msg| MessagePack.unpack(msg.payload) }
+ when 'text'
+ Proc.new { |msg| {@message_key => msg.payload} }
+ else
+ @custom_parser = parser_create(usage: 'in-rdkafka-plugin', conf: parser_conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.payload) {|_time, record|
+ record
+ }
+ }
+ end
+ end
+
+ def start
+ super
+
+ @consumer = setup_consumer
+
+ thread_create(:in_rdkafka_group, &method(:run))
+ end
+
+ def shutdown
+ # This nil assignment should be guarded by mutex in multithread programming manner.
+ # But the situation is very low contention, so we don't use mutex for now.
+ # If the problem happens, we will add a guard for consumer.
+ consumer = @consumer
+ @consumer = nil
+ consumer.close
+
+ super
+ end
+
+ def setup_consumer
+ consumer = Rdkafka::Config.new(@kafka_configs).consumer
+ consumer.subscribe(*@topics)
+ consumer
+ end
+
+ def reconnect_consumer
+ log.warn "Stopping Consumer"
+ consumer = @consumer
+ @consumer = nil
+ if consumer
+ consumer.close
+ end
+ log.warn "Could not connect to broker. retry_time:#{@retry_count}. Next retry will be in #{@retry_wait_seconds} seconds"
+ @retry_count = @retry_count + 1
+ sleep @retry_wait_seconds
+ @consumer = setup_consumer
+ log.warn "Re-starting consumer #{Time.now.to_s}"
+ @retry_count = 0
+ rescue =>e
+ log.error "unexpected error during re-starting consumer object access", :error => e.to_s
+ log.error_backtrace
+ if @retry_count <= @retry_limit or disable_retry_limit
+ reconnect_consumer
+ end
+ end
+
+ class Batch
+ attr_reader :topic
+ attr_reader :messages
+
+ def initialize(topic)
+ @topic = topic
+ @messages = []
+ end
+ end
+
+ # Executes the passed codeblock on a batch of messages.
+ # It is guaranteed that every message in a given batch belongs to the same topic, because the tagging logic in :run expects that property.
+ # The number of maximum messages in a batch is capped by the :max_batch_size configuration value. It ensures that consuming from a single
+ # topic for a long time (e.g. with `auto.offset.reset` set to `earliest`) does not lead to memory exhaustion. Also, calling consumer.poll
+ # advances the consumer offset, so in case the process crashes we might lose at most :max_batch_size messages.
+ def each_batch(&block)
+ batch = nil
+ message = nil
+ while @consumer
+ message = @consumer.poll(@max_wait_time_ms)
+ if message
+ if not batch
+ batch = Batch.new(message.topic)
+ elsif batch.topic != message.topic || batch.messages.size >= @max_batch_size
+ yield batch
+ batch = Batch.new(message.topic)
+ end
+ batch.messages << message
+ else
+ yield batch if batch
+ batch = nil
+ end
+ end
+ yield batch if batch
+ end
+
+ def run
+ while @consumer
+ begin
+ each_batch { |batch|
+ log.debug "A new batch for topic #{batch.topic} with #{batch.messages.size} messages"
+ es = Fluent::MultiEventStream.new
+ tag = batch.topic
+ tag = @add_prefix + "." + tag if @add_prefix
+ tag = tag + "." + @add_suffix if @add_suffix
+
+ batch.messages.each { |msg|
+ begin
+ record = @parser_proc.call(msg)
+ case @time_source
+ when :kafka
+ record_time = Fluent::EventTime.from_time(msg.timestamp)
+ when :now
+ record_time = Fluent::Engine.now
+ when :record
+ if @time_format
+ record_time = @time_parser.parse(record[@record_time_key].to_s)
+ else
+ record_time = record[@record_time_key]
+ end
+ else
+ log.fatal "BUG: invalid time_source: #{@time_source}"
+ end
+ if @kafka_message_key
+ record[@kafka_message_key] = msg.key
+ end
+ if @add_headers
+ msg.headers.each_pair { |k, v|
+ record[k] = v
+ }
+ end
+ es.add(record_time, record)
+ rescue => e
+ log.warn "parser error in #{msg.topic}/#{msg.partition}", :error => e.to_s, :value => msg.payload, :offset => msg.offset
+ log.debug_backtrace
+ end
+ }
+
+ unless es.empty?
+ emit_events(tag, es)
+ end
+ }
+ rescue ForShutdown
+ rescue => e
+ log.error "unexpected error during consuming events from kafka. Re-fetch events.", :error => e.to_s
+ log.error_backtrace
+ reconnect_consumer
+ end
+ end
+ rescue => e
+ log.error "unexpected error during consumer object access", :error => e.to_s
+ log.error_backtrace
+ end
+
+ def emit_events(tag, es)
+ retries = 0
+ begin
+ router.emit_stream(tag, es)
+ rescue BufferError
+ raise ForShutdown if @consumer.nil?
+
+ if @retry_emit_limit.nil?
+ sleep 1
+ retry
+ end
+
+ if retries < @retry_emit_limit
+ retries += 1
+ sleep 1
+ retry
+ else
+ raise RuntimeError, "Exceeds retry_emit_limit"
+ end
+ end
+ end
+ end
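
Since `in_rdkafka_group` declares a `<parse>` section (with `compat_parameters_convert` mapping a flat `format` parameter onto it), the parser can also be configured explicitly. A hedged sketch with placeholder brokers, group id, and topic:

  <source>
    @type rdkafka_group
    topics app_event
    add_headers true
    # librdkafka properties are passed as a JSON hash
    kafka_configs {"bootstrap.servers":"localhost:9092","group.id":"fluentd-rdkafka-example"}
    <parse>
      # any registered parser type works here via parser_create
      @type json
    </parse>
  </source>
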
lib/fluent/plugin/out_kafka2.rb CHANGED
@@ -69,6 +69,7 @@ The codec the producer uses to compress messages.
  Supported codecs depends on ruby-kafka: https://github.com/zendesk/ruby-kafka#compression
  DESC
  config_param :max_send_limit_bytes, :size, :default => nil
+ config_param :discard_kafka_delivery_failed, :bool, :default => false
  config_param :active_support_notification_regex, :string, :default => nil,
  :desc => <<-DESC
  Add a regular expression to capture ActiveSupport notifications from the Kafka client
@@ -127,7 +128,7 @@ DESC
  @seed_brokers = @brokers
  log.info "brokers has been set: #{@seed_brokers}"
  else
- raise Fluent::Config, 'No brokers specified. Need one broker at least.'
+ raise Fluent::ConfigError, 'No brokers specified. Need one broker at least.'
  end
 
  formatter_conf = conf.elements('format').first
@@ -267,7 +268,16 @@ DESC
 
  if messages > 0
  log.debug { "#{messages} messages send." }
- producer.deliver_messages
+ if @discard_kafka_delivery_failed
+ begin
+ producer.deliver_messages
+ rescue Kafka::DeliveryFailed => e
+ log.warn "DeliveryFailed occurred. Discard broken event:", :error => e.to_s, :error_class => e.class.to_s, :tag => tag
+ producer.clear_buffer
+ end
+ else
+ producer.deliver_messages
+ end
  end
  rescue Kafka::UnknownTopicOrPartition
  if @use_default_for_unknown_topic && topic != @default_topic
lib/fluent/plugin/out_rdkafka2.rb CHANGED
@@ -73,6 +73,7 @@ The codec the producer uses to compress messages. Used for compression.codec
  Supported codecs: (gzip|snappy)
  DESC
  config_param :max_send_limit_bytes, :size, :default => nil
+ config_param :discard_kafka_delivery_failed, :bool, :default => false
  config_param :rdkafka_buffering_max_ms, :integer, :default => nil, :desc => 'Used for queue.buffering.max.ms'
  config_param :rdkafka_buffering_max_messages, :integer, :default => nil, :desc => 'Used for queue.buffering.max.messages'
  config_param :rdkafka_message_max_bytes, :integer, :default => nil, :desc => 'Used for message.max.bytes'
@@ -325,9 +326,13 @@ DESC
  }
  end
  rescue Exception => e
- log.warn "Send exception occurred: #{e} at #{e.backtrace.first}"
- # Raise exception to retry sendind messages
- raise e
+ if @discard_kafka_delivery_failed
+ log.warn "Delivery failed. Discard events:", :error => e.to_s, :error_class => e.class.to_s, :tag => tag
+ else
+ log.warn "Send exception occurred: #{e} at #{e.backtrace.first}"
+ # Raise exception to retry sending messages
+ raise e
+ end
  end
 
  def enqueue_with_retry(producer, topic, record_buf, message_key, partition, headers)
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-kafka
  version: !ruby/object:Gem::Version
- version: 0.14.1
+ version: 0.15.3
  platform: ruby
  authors:
  - Hidemasa Togashi
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-08-11 00:00:00.000000000 Z
+ date: 2020-12-08 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fluentd
@@ -111,6 +111,7 @@ files:
  - fluent-plugin-kafka.gemspec
  - lib/fluent/plugin/in_kafka.rb
  - lib/fluent/plugin/in_kafka_group.rb
+ - lib/fluent/plugin/in_rdkafka_group.rb
  - lib/fluent/plugin/kafka_plugin_util.rb
  - lib/fluent/plugin/kafka_producer_ext.rb
  - lib/fluent/plugin/out_kafka.rb