fluent-plugin-kafka 0.14.0 → 0.15.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 0ceff7f76a27f8be74e0f6ffd01cf793a951a78aa14c73c2758472a08d5d6b7e
- data.tar.gz: c95a162564fea2b7abbedc1032151c87d5c5938c56c6fa6d871eb678306d4e7e
+ metadata.gz: 866fb421d7097ccbac1bde2e279d9975bce4de086dab74609cadbe50429bb312
+ data.tar.gz: 0e763fc4276177949f6cec2b4839ad5168892cd4d346d633f81449fdb8df84d7
  SHA512:
- metadata.gz: a78e16a0b4e0995699f5d0d5e7010eed8ecfa52aeff647e83c9f885299396d184b8e2b656fd4626519f1e93297210a0b6ae59e6ca1dc5c066cbc6b644e83e36f
- data.tar.gz: 98cfcb42d5b225861d90d7adb3f2a1eaba93a57c7f08ed142e767f4a971c15359e8c866e73f554405021bb037ea1935f0d85d5a0b08e999e2181e31421f99ba4
+ metadata.gz: b41f5cb35d1c4dea3743e513b505e11f634dfcbc33e339f188f2ac4c2b710ed1357c00779e24873c5f4a0bdd5326f8c1a731b4f2a4c323dc6fde8b85bc78ef28
+ data.tar.gz: df4061316f692fbe264b2344fd74f7ba1d15174bb91a617c09f5d9d3de6d50a5f0b6c8aa702ce9322d438ee6ed83b09cf620e71537e96e13db73882f01e291cb
data/ChangeLog CHANGED
@@ -1,3 +1,25 @@
+ Release 0.15.2 - 2020/09/30
+
+ * input: Support 3rd party parser
+
+ Release 0.15.1 - 2020/09/17
+
+ * out_kafka2: Fix wrong class name for configuration error
+
+ Release 0.15.0 - 2020/09/14
+
+ * Add experimental `in_rdkafka_group`
+ * in_kafka: Expose `ssl_verify_hostname` parameter
+
+ Release 0.14.2 - 2020/08/26
+
+ * in_kafka_group: Add `add_headers` parameter
+ * out_kafka2/out_rdkafka2: Support `discard_kafka_delivery_failed` parameter
+
+ Release 0.14.1 - 2020/08/11
+
+ * kafka_producer_ext: Fix regression by v0.14.0 changes
+
  Release 0.14.0 - 2020/08/07

  * Update ruby-kafka dependency to v1.2.0 or later. Check https://github.com/zendesk/ruby-kafka#compatibility
data/README.md CHANGED
@@ -118,6 +118,8 @@ Consume events by kafka consumer group features..
  topics <listening topics(separate with comma',')>
  format <input text type (text|json|ltsv|msgpack)> :default => json
  message_key <key (Optional, for text format only, default is message)>
+ kafka_mesasge_key <key (Optional, If specified, set kafka's message key to this key)>
+ add_headers <If true, add kafka's message headers to record>
  add_prefix <tag prefix (Optional)>
  add_suffix <tag suffix (Optional)>
  retry_emit_limit <Wait retry_emit_limit x 1s when BuffereQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
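A minimal sketch of an `in_kafka_group` source using the two newly documented options. Broker, consumer group and topic names are placeholders; note that the code in this release defines the parameter as `kafka_message_key` (see the `in_rdkafka_group` source added further down), so the `kafka_mesasge_key` spelling above appears to be an upstream typo.

    <source>
      @type kafka_group
      brokers broker1:9092
      consumer_group fluentd
      topics web_access
      format json
      # Copy the Kafka message key into this field of each record
      kafka_message_key _kafka_key
      # Merge Kafka message headers into each record
      add_headers true
    </source>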
@@ -139,9 +141,43 @@ See also [ruby-kafka README](https://github.com/zendesk/ruby-kafka#consuming-mes

  Consuming topic name is used for event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify tag, use `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.

+ ### Input plugin (@type 'rdkafka_group', supports kafka consumer groups, uses rdkafka-ruby)
+
+ :warning: **The in_rdkafka_group consumer was not yet tested under heavy production load. Use it at your own risk!**
+
+ With the introduction of the rdkafka-ruby based input plugin we hope to support Kafka brokers above version 2.1 where we saw [compatibility issues](https://github.com/fluent/fluent-plugin-kafka/issues/315) when using the ruby-kafka based @kafka_group input type. The rdkafka-ruby lib wraps the highly performant and production ready librdkafka C lib.
+
+ <source>
+ @type rdkafka_group
+ topics <listening topics(separate with comma',')>
+ format <input text type (text|json|ltsv|msgpack)> :default => json
+ message_key <key (Optional, for text format only, default is message)>
+ kafka_mesasge_key <key (Optional, If specified, set kafka's message key to this key)>
+ add_headers <If true, add kafka's message headers to record>
+ add_prefix <tag prefix (Optional)>
+ add_suffix <tag suffix (Optional)>
+ retry_emit_limit <Wait retry_emit_limit x 1s when BuffereQueueLimitError happens. The default is nil and it means waiting until BufferQueueLimitError is resolved>
+ use_record_time (Deprecated. Use 'time_source record' instead.) <If true, replace event time with contents of 'time' field of fetched record>
+ time_source <source for message timestamp (now|kafka|record)> :default => now
+ time_format <string (Optional when use_record_time is used)>
+
+ # kafka consumer options
+ max_wait_time_ms 500
+ max_batch_size 10000
+ kafka_configs {
+ "bootstrap.servers": "brokers <broker1_host>:<broker1_port>,<broker2_host>:<broker2_port>",
+ "group.id": "<consumer group name>"
+ }
+ </source>
+
+ See also [rdkafka-ruby](https://github.com/appsignal/rdkafka-ruby) and [librdkafka](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) for more detailed documentation about Kafka consumer options.
+
+ Consuming topic name is used for event tag. So when the target topic name is `app_event`, the tag is `app_event`. If you want to modify tag, use `add_prefix` or `add_suffix` parameter. With `add_prefix kafka`, the tag is `kafka.app_event`.
+
  ### Output plugin

- This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plugin in the future.
+ This `kafka2` plugin is for fluentd v1 or later. This plugin uses `ruby-kafka` producer for writing data.
+ If `ruby-kafka` doesn't fit your kafka environment, check `rdkafka2` plugin instead. This will be `out_kafka` plugin in the future.

  <match app.**>
  @type kafka2
@@ -162,6 +198,7 @@ This `kafka2` plugin is for fluentd v1.0 or later. This will be `out_kafka` plug
  headers (hash) :default => {}
  headers_from_record (hash) :default => {}
  use_default_for_unknown_topic (bool) :default => false
+ discard_kafka_delivery_failed (bool) :default => false (No discard)

  <format>
  @type (json|ltsv|msgpack|attr:<record name>|<formatter name>) :default => json
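The new `discard_kafka_delivery_failed` flag documented above changes the failure handling shown in the out_kafka2.rb hunk further down: when it is enabled, a `Kafka::DeliveryFailed` is logged and the producer buffer is cleared instead of the chunk being retried. A minimal sketch of an output using it; brokers, tag pattern and topic name are placeholders:

    <match app.**>
      @type kafka2
      brokers broker1:9092
      default_topic app_events
      # Trade durability for liveness: failed deliveries are logged and dropped
      # instead of being retried from Fluentd's buffer.
      discard_kafka_delivery_failed true
      <format>
        @type json
      </format>
      <buffer topic>
        flush_interval 3s
      </buffer>
    </match>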
@@ -385,6 +422,7 @@ You need to install rdkafka gem.
  default_message_key (string) :default => nil
  exclude_topic_key (bool) :default => false
  exclude_partition_key (bool) :default => false
+ discard_kafka_delivery_failed (bool) :default => false (No discard)

  # same with kafka2
  headers (hash) :default => {}
data/fluent-plugin-kafka.gemspec CHANGED
@@ -13,7 +13,7 @@ Gem::Specification.new do |gem|
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
  gem.name = "fluent-plugin-kafka"
  gem.require_paths = ["lib"]
- gem.version = '0.14.0'
+ gem.version = '0.15.2'
  gem.required_ruby_version = ">= 2.1.0"

  gem.add_dependency "fluentd", [">= 0.10.58", "< 2"]
data/lib/fluent/plugin/in_kafka.rb CHANGED
@@ -113,7 +113,7 @@ class Fluent::KafkaInput < Fluent::Input

  require 'zookeeper' if @offset_zookeeper

- @parser_proc = setup_parser
+ @parser_proc = setup_parser(conf)

  @time_source = :record if @use_record_time

@@ -126,7 +126,7 @@ class Fluent::KafkaInput < Fluent::Input
  end
  end

- def setup_parser
+ def setup_parser(conf)
  case @format
  when 'json'
  begin
@@ -165,6 +165,14 @@ class Fluent::KafkaInput < Fluent::Input
  add_offset_in_hash(r, te, msg.offset) if @add_offset_in_record
  r
  }
+ else
+ @custom_parser = Fluent::Plugin.new_parser(conf['format'])
+ @custom_parser.configure(conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.value) {|_time, record|
+ record
+ }
+ }
  end
  end

@@ -188,16 +196,17 @@ class Fluent::KafkaInput < Fluent::Input
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
  ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_scram_username: @username, sasl_scram_password: @password,
- sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl)
+ sasl_scram_mechanism: @scram_mechanism, sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  elsif @username != nil && @password != nil
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
  ssl_ca_certs_from_system: @ssl_ca_certs_from_system,sasl_plain_username: @username, sasl_plain_password: @password,
- sasl_over_ssl: @sasl_over_ssl)
+ sasl_over_ssl: @sasl_over_ssl, ssl_verify_hostname: @ssl_verify_hostname)
  else
  @kafka = Kafka.new(seed_brokers: @brokers, client_id: @client_id, logger: logger, ssl_ca_cert: read_ssl_file(@ssl_ca_cert),
  ssl_client_cert: read_ssl_file(@ssl_client_cert), ssl_client_cert_key: read_ssl_file(@ssl_client_cert_key),
- ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab)
+ ssl_ca_certs_from_system: @ssl_ca_certs_from_system, sasl_gssapi_principal: @principal, sasl_gssapi_keytab: @keytab,
+ ssl_verify_hostname: @ssl_verify_hostname)
  end

  @zookeeper = Zookeeper.new(@offset_zookeeper) if @offset_zookeeper
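With 0.15.0 the `ssl_verify_hostname` option, defined alongside the other shared SSL settings, is forwarded to every `Kafka.new` call above. A sketch of a TLS-enabled source that disables hostname verification, e.g. for brokers whose certificate does not match their hostname; the broker address and file paths are placeholders:

    <source>
      @type kafka
      brokers kafka1:9093
      topics web_access
      format json
      ssl_ca_cert /path/to/ca.crt
      ssl_client_cert /path/to/client.crt
      ssl_client_cert_key /path/to/client.key
      # Now passed through to ruby-kafka; only disable this when the broker
      # certificate genuinely cannot match its hostname.
      ssl_verify_hostname false
    </source>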
data/lib/fluent/plugin/in_kafka_group.rb CHANGED
@@ -18,6 +18,8 @@ class Fluent::KafkaGroupInput < Fluent::Input
  :desc => "Supported format: (json|text|ltsv|msgpack)"
  config_param :message_key, :string, :default => 'message',
  :desc => "For 'text' format only."
+ config_param :add_headers, :bool, :default => false,
+ :desc => "Add kafka's message headers to event record"
  config_param :add_prefix, :string, :default => nil,
  :desc => "Tag prefix (Optional)"
  config_param :add_suffix, :string, :default => nil,
@@ -115,7 +117,7 @@ class Fluent::KafkaGroupInput < Fluent::Input
  @max_wait_time = conf['max_wait_ms'].to_i / 1000
  end

- @parser_proc = setup_parser
+ @parser_proc = setup_parser(conf)

  @consumer_opts = {:group_id => @consumer_group}
  @consumer_opts[:session_timeout] = @session_timeout if @session_timeout
@@ -138,7 +140,7 @@ class Fluent::KafkaGroupInput < Fluent::Input
  end
  end

- def setup_parser
+ def setup_parser(conf)
  case @format
  when 'json'
  begin
@@ -157,6 +159,14 @@ class Fluent::KafkaGroupInput < Fluent::Input
  Proc.new { |msg| MessagePack.unpack(msg.value) }
  when 'text'
  Proc.new { |msg| {@message_key => msg.value} }
+ else
+ @custom_parser = Fluent::Plugin.new_parser(conf['format'])
+ @custom_parser.configure(conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.value) {|_time, record|
+ record
+ }
+ }
  end
  end

@@ -263,6 +273,11 @@ class Fluent::KafkaGroupInput < Fluent::Input
  if @kafka_message_key
  record[@kafka_message_key] = msg.key
  end
+ if @add_headers
+ msg.headers.each_pair { |k, v|
+ record[k] = v
+ }
+ end
  es.add(record_time, record)
  rescue => e
  log.warn "parser error in #{batch.topic}/#{batch.partition}", :error => e.to_s, :value => msg.value, :offset => msg.offset
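The new `else` branch added to `setup_parser` in both in_kafka.rb and in_kafka_group.rb above is what the ChangeLog calls third-party parser support: any parser type registered with Fluentd can be named in `format`, and the whole plugin configuration is handed to the parser's `configure`. A sketch, assuming the bundled regexp parser and that its `expression` parameter can therefore sit at the top level of the source; broker, group and topic names are placeholders:

    <source>
      @type kafka_group
      brokers broker1:9092
      consumer_group fluentd
      topics nginx_access
      # Not one of the built-in formats, so it is resolved via Fluent::Plugin.new_parser
      format regexp
      expression /^(?<remote>[^ ]*) (?<code>[^ ]*) (?<size>[^ ]*)$/
    </source>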
data/lib/fluent/plugin/in_rdkafka_group.rb ADDED
@@ -0,0 +1,305 @@
+ require 'fluent/plugin/input'
+ require 'fluent/time'
+ require 'fluent/plugin/kafka_plugin_util'
+
+ require 'rdkafka'
+
+ class Fluent::Plugin::RdKafkaGroupInput < Fluent::Plugin::Input
+ Fluent::Plugin.register_input('rdkafka_group', self)
+
+ helpers :thread, :parser, :compat_parameters
+
+ config_param :topics, :string,
+ :desc => "Listening topics(separate with comma',')."
+
+ config_param :format, :string, :default => 'json',
+ :desc => "Supported format: (json|text|ltsv|msgpack)"
+ config_param :message_key, :string, :default => 'message',
+ :desc => "For 'text' format only."
+ config_param :add_headers, :bool, :default => false,
+ :desc => "Add kafka's message headers to event record"
+ config_param :add_prefix, :string, :default => nil,
+ :desc => "Tag prefix (Optional)"
+ config_param :add_suffix, :string, :default => nil,
+ :desc => "Tag suffix (Optional)"
+ config_param :use_record_time, :bool, :default => false,
+ :desc => "Replace message timestamp with contents of 'time' field.",
+ :deprecated => "Use 'time_source record' instead."
+ config_param :time_source, :enum, :list => [:now, :kafka, :record], :default => :now,
+ :desc => "Source for message timestamp."
+ config_param :record_time_key, :string, :default => 'time',
+ :desc => "Time field when time_source is 'record'"
+ config_param :time_format, :string, :default => nil,
+ :desc => "Time format to be used to parse 'time' field."
+ config_param :kafka_message_key, :string, :default => nil,
+ :desc => "Set kafka's message key to this field"
+
+ config_param :retry_emit_limit, :integer, :default => nil,
+ :desc => "How long to stop event consuming when BufferQueueLimitError happens. Wait retry_emit_limit x 1s. The default is waiting until BufferQueueLimitError is resolved"
+ config_param :retry_wait_seconds, :integer, :default => 30
+ config_param :disable_retry_limit, :bool, :default => false,
+ :desc => "If set true, it disables retry_limit and make Fluentd retry indefinitely (default: false)"
+ config_param :retry_limit, :integer, :default => 10,
+ :desc => "The maximum number of retries for connecting kafka (default: 10)"
+
+ config_param :max_wait_time_ms, :integer, :default => 250,
+ :desc => "How long to block polls in milliseconds until the server sends us data."
+ config_param :max_batch_size, :integer, :default => 10000,
+ :desc => "Maximum number of log lines emitted in a single batch."
+
+ config_param :kafka_configs, :hash, :default => {},
+ :desc => "Kafka configuration properties as desribed in https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md"
+
+ config_section :parse do
+ config_set_default :@type, 'json'
+ end
+
+ include Fluent::KafkaPluginUtil::SSLSettings
+ include Fluent::KafkaPluginUtil::SaslSettings
+
+ class ForShutdown < StandardError
+ end
+
+ BufferError = Fluent::Plugin::Buffer::BufferOverflowError
+
+ def initialize
+ super
+
+ @time_parser = nil
+ @retry_count = 1
+ end
+
+ def _config_to_array(config)
+ config_array = config.split(',').map {|k| k.strip }
+ if config_array.empty?
+ raise Fluent::ConfigError, "kafka_group: '#{config}' is a required parameter"
+ end
+ config_array
+ end
+
+ def multi_workers_ready?
+ true
+ end
+
+ private :_config_to_array
+
+ def configure(conf)
+ compat_parameters_convert(conf, :parser)
+
+ super
+
+ log.warn "The in_rdkafka_group consumer was not yet tested under heavy production load. Use it at your own risk!"
+
+ log.info "Will watch for topics #{@topics} at brokers " \
+ "#{@kafka_configs["bootstrap.servers"]} and '#{@kafka_configs["group.id"]}' group"
+
+ @topics = _config_to_array(@topics)
+
+ parser_conf = conf.elements('parse').first
+ unless parser_conf
+ raise Fluent::ConfigError, "<parse> section or format parameter is required."
+ end
+ unless parser_conf["@type"]
+ raise Fluent::ConfigError, "parse/@type is required."
+ end
+ @parser_proc = setup_parser(parser_conf)
+
+ @time_source = :record if @use_record_time
+
+ if @time_source == :record and @time_format
+ @time_parser = Fluent::TimeParser.new(@time_format)
+ end
+ end
+
+ def setup_parser(parser_conf)
+ format = parser_conf["@type"]
+ case format
+ when 'json'
+ begin
+ require 'oj'
+ Oj.default_options = Fluent::DEFAULT_OJ_OPTIONS
+ Proc.new { |msg| Oj.load(msg.payload) }
+ rescue LoadError
+ require 'yajl'
+ Proc.new { |msg| Yajl::Parser.parse(msg.payload) }
+ end
+ when 'ltsv'
+ require 'ltsv'
+ Proc.new { |msg| LTSV.parse(msg.payload, {:symbolize_keys => false}).first }
+ when 'msgpack'
+ require 'msgpack'
+ Proc.new { |msg| MessagePack.unpack(msg.payload) }
+ when 'text'
+ Proc.new { |msg| {@message_key => msg.payload} }
+ else
+ @custom_parser = parser_create(usage: 'in-rdkafka-plugin', conf: parser_conf)
+ Proc.new { |msg|
+ @custom_parser.parse(msg.payload) {|_time, record|
+ record
+ }
+ }
+ end
+ end
+
+ def start
+ super
+
+ @consumer = setup_consumer
+
+ thread_create(:in_rdkafka_group, &method(:run))
+ end
+
+ def shutdown
+ # This nil assignment should be guarded by mutex in multithread programming manner.
+ # But the situation is very low contention, so we don't use mutex for now.
+ # If the problem happens, we will add a guard for consumer.
+ consumer = @consumer
+ @consumer = nil
+ consumer.close
+
+ super
+ end
+
+ def setup_consumer
+ consumer = Rdkafka::Config.new(@kafka_configs).consumer
+ consumer.subscribe(*@topics)
+ consumer
+ end
+
+ def reconnect_consumer
+ log.warn "Stopping Consumer"
+ consumer = @consumer
+ @consumer = nil
+ if consumer
+ consumer.close
+ end
+ log.warn "Could not connect to broker. retry_time:#{@retry_count}. Next retry will be in #{@retry_wait_seconds} seconds"
+ @retry_count = @retry_count + 1
+ sleep @retry_wait_seconds
+ @consumer = setup_consumer
+ log.warn "Re-starting consumer #{Time.now.to_s}"
+ @retry_count = 0
+ rescue =>e
+ log.error "unexpected error during re-starting consumer object access", :error => e.to_s
+ log.error_backtrace
+ if @retry_count <= @retry_limit or disable_retry_limit
+ reconnect_consumer
+ end
+ end
+
+ class Batch
+ attr_reader :topic
+ attr_reader :messages
+
+ def initialize(topic)
+ @topic = topic
+ @messages = []
+ end
+ end
+
+ # Executes the passed codeblock on a batch of messages.
+ # It is guaranteed that every message in a given batch belongs to the same topic, because the tagging logic in :run expects that property.
+ # The number of maximum messages in a batch is capped by the :max_batch_size configuration value. It ensures that consuming from a single
+ # topic for a long time (e.g. with `auto.offset.reset` set to `earliest`) does not lead to memory exhaustion. Also, calling consumer.poll
+ # advances thes consumer offset, so in case the process crashes we might lose at most :max_batch_size messages.
+ def each_batch(&block)
+ batch = nil
+ message = nil
+ while @consumer
+ message = @consumer.poll(@max_wait_time_ms)
+ if message
+ if not batch
+ batch = Batch.new(message.topic)
+ elsif batch.topic != message.topic || batch.messages.size >= @max_batch_size
+ yield batch
+ batch = Batch.new(message.topic)
+ end
+ batch.messages << message
+ else
+ yield batch if batch
+ batch = nil
+ end
+ end
+ yield batch if batch
+ end
+
+ def run
+ while @consumer
+ begin
+ each_batch { |batch|
+ log.debug "A new batch for topic #{batch.topic} with #{batch.messages.size} messages"
+ es = Fluent::MultiEventStream.new
+ tag = batch.topic
+ tag = @add_prefix + "." + tag if @add_prefix
+ tag = tag + "." + @add_suffix if @add_suffix
+
+ batch.messages.each { |msg|
+ begin
+ record = @parser_proc.call(msg)
+ case @time_source
+ when :kafka
+ record_time = Fluent::EventTime.from_time(msg.timestamp)
+ when :now
+ record_time = Fluent::Engine.now
+ when :record
+ if @time_format
+ record_time = @time_parser.parse(record[@record_time_key].to_s)
+ else
+ record_time = record[@record_time_key]
+ end
+ else
+ log.fatal "BUG: invalid time_source: #{@time_source}"
+ end
+ if @kafka_message_key
+ record[@kafka_message_key] = msg.key
+ end
+ if @add_headers
+ msg.headers.each_pair { |k, v|
+ record[k] = v
+ }
+ end
+ es.add(record_time, record)
+ rescue => e
+ log.warn "parser error in #{msg.topic}/#{msg.partition}", :error => e.to_s, :value => msg.payload, :offset => msg.offset
+ log.debug_backtrace
+ end
+ }
+
+ unless es.empty?
+ emit_events(tag, es)
+ end
+ }
+ rescue ForShutdown
+ rescue => e
+ log.error "unexpected error during consuming events from kafka. Re-fetch events.", :error => e.to_s
+ log.error_backtrace
+ reconnect_consumer
+ end
+ end
+ rescue => e
+ log.error "unexpected error during consumer object access", :error => e.to_s
+ log.error_backtrace
+ end
+
+ def emit_events(tag, es)
+ retries = 0
+ begin
+ router.emit_stream(tag, es)
+ rescue BufferError
+ raise ForShutdown if @consumer.nil?
+
+ if @retry_emit_limit.nil?
+ sleep 1
+ retry
+ end
+
+ if retries < @retry_emit_limit
+ retries += 1
+ sleep 1
+ retry
+ else
+ raise RuntimeError, "Exceeds retry_emit_limit"
+ end
+ end
+ end
+ end
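Putting the new input together: a sketch of an `rdkafka_group` source. Everything in `kafka_configs` is handed to librdkafka via `Rdkafka::Config.new`, so any property from librdkafka's CONFIGURATION.md may appear there; the broker list, group id, topic and the `auto.offset.reset` value below are illustrative only:

    <source>
      @type rdkafka_group
      topics web_access
      add_prefix kafka
      max_wait_time_ms 500
      max_batch_size 10000
      kafka_configs {
        "bootstrap.servers": "kafka1:9092,kafka2:9092",
        "group.id": "fluentd-consumers",
        "auto.offset.reset": "earliest"
      }
      <parse>
        @type json
      </parse>
    </source>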
data/lib/fluent/plugin/kafka_producer_ext.rb CHANGED
@@ -69,12 +69,13 @@ module Kafka
  retry_backoff: retry_backoff,
  max_buffer_size: max_buffer_size,
  max_buffer_bytesize: max_buffer_bytesize,
+ partitioner: @partitioner,
  )
  end
  end

  class TopicProducer
- def initialize(topic, cluster:, transaction_manager:, logger:, instrumenter:, compressor:, ack_timeout:, required_acks:, max_retries:, retry_backoff:, max_buffer_size:, max_buffer_bytesize:)
+ def initialize(topic, cluster:, transaction_manager:, logger:, instrumenter:, compressor:, ack_timeout:, required_acks:, max_retries:, retry_backoff:, max_buffer_size:, max_buffer_bytesize:, partitioner:)
  @cluster = cluster
  @transaction_manager = transaction_manager
  @logger = logger
@@ -86,6 +87,7 @@ module Kafka
  @max_buffer_size = max_buffer_size
  @max_buffer_bytesize = max_buffer_bytesize
  @compressor = compressor
+ @partitioner = partitioner

  @topic = topic
  @cluster.add_target_topics(Set.new([topic]))
@@ -250,7 +252,7 @@ module Kafka

  begin
  if partition.nil?
- partition = Partitioner.call(partition_count, message)
+ partition = @partitioner.call(partition_count, message)
  end

  @buffer.write(
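These hunks make the partitioner injectable: `TopicProducer` now receives a `partitioner:` argument and invokes it as `@partitioner.call(partition_count, message)` where the built-in `Partitioner` used to be hard-wired. A hedged Ruby sketch of what such a callable could look like; the lambda and its routing rule are illustrative only, and it assumes the pending message responds to `partition_key` and `key` as ruby-kafka's messages do:

    require 'zlib'

    # Keep records with the same key on the same partition; fall back to a
    # random partition when no key is present.
    stable_partitioner = ->(partition_count, message) do
      key = message.partition_key || message.key
      key ? Zlib.crc32(key) % partition_count : rand(partition_count)
    end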
data/lib/fluent/plugin/out_kafka2.rb CHANGED
@@ -69,6 +69,7 @@ The codec the producer uses to compress messages.
  Supported codecs depends on ruby-kafka: https://github.com/zendesk/ruby-kafka#compression
  DESC
  config_param :max_send_limit_bytes, :size, :default => nil
+ config_param :discard_kafka_delivery_failed, :bool, :default => false
  config_param :active_support_notification_regex, :string, :default => nil,
  :desc => <<-DESC
  Add a regular expression to capture ActiveSupport notifications from the Kafka client
@@ -127,7 +128,7 @@ DESC
  @seed_brokers = @brokers
  log.info "brokers has been set: #{@seed_brokers}"
  else
- raise Fluent::Config, 'No brokers specified. Need one broker at least.'
+ raise Fluent::ConfigError, 'No brokers specified. Need one broker at least.'
  end

  formatter_conf = conf.elements('format').first
@@ -267,7 +268,16 @@ DESC

  if messages > 0
  log.debug { "#{messages} messages send." }
- producer.deliver_messages
+ if @discard_kafka_delivery_failed
+ begin
+ producer.deliver_messages
+ rescue Kafka::DeliveryFailed => e
+ log.warn "DeliveryFailed occurred. Discard broken event:", :error => e.to_s, :error_class => e.class.to_s, :tag => tag
+ producer.clear_buffer
+ end
+ else
+ producer.deliver_messages
+ end
  end
  rescue Kafka::UnknownTopicOrPartition
  if @use_default_for_unknown_topic && topic != @default_topic
data/lib/fluent/plugin/out_rdkafka2.rb CHANGED
@@ -73,6 +73,7 @@ The codec the producer uses to compress messages. Used for compression.codec
  Supported codecs: (gzip|snappy)
  DESC
  config_param :max_send_limit_bytes, :size, :default => nil
+ config_param :discard_kafka_delivery_failed, :bool, :default => false
  config_param :rdkafka_buffering_max_ms, :integer, :default => nil, :desc => 'Used for queue.buffering.max.ms'
  config_param :rdkafka_buffering_max_messages, :integer, :default => nil, :desc => 'Used for queue.buffering.max.messages'
  config_param :rdkafka_message_max_bytes, :integer, :default => nil, :desc => 'Used for message.max.bytes'
@@ -325,9 +326,13 @@ DESC
  }
  end
  rescue Exception => e
- log.warn "Send exception occurred: #{e} at #{e.backtrace.first}"
- # Raise exception to retry sendind messages
- raise e
+ if @discard_kafka_delivery_failed
+ log.warn "Delivery failed. Discard events:", :error => e.to_s, :error_class => e.class.to_s, :tag => tag
+ else
+ log.warn "Send exception occurred: #{e} at #{e.backtrace.first}"
+ # Raise exception to retry sendind messages
+ raise e
+ end
  end

  def enqueue_with_retry(producer, topic, record_buf, message_key, partition, headers)
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-kafka
  version: !ruby/object:Gem::Version
- version: 0.14.0
+ version: 0.15.2
  platform: ruby
  authors:
  - Hidemasa Togashi
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-08-07 00:00:00.000000000 Z
+ date: 2020-09-30 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: fluentd
@@ -111,6 +111,7 @@ files:
  - fluent-plugin-kafka.gemspec
  - lib/fluent/plugin/in_kafka.rb
  - lib/fluent/plugin/in_kafka_group.rb
+ - lib/fluent/plugin/in_rdkafka_group.rb
  - lib/fluent/plugin/kafka_plugin_util.rb
  - lib/fluent/plugin/kafka_producer_ext.rb
  - lib/fluent/plugin/out_kafka.rb