logstash-output-kafka 6.2.2 → 6.2.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
-SHA1:
-  metadata.gz: 7af0fa54637415fa08242236f0467adff7737c14
-  data.tar.gz: d02e499764ef636fd652359ec37f13e9ed59acdf
+SHA256:
+  metadata.gz: 870dfa346e6d0aece5fa86221a1b54f4e01d7223012a7c6af5bdb759d8e9563d
+  data.tar.gz: 50608eda6ee0f6644de6831d13e607b04230bc9d027e692b98973b16e1482304
 SHA512:
-  metadata.gz: f48e203cc01529eda57a3bb6997e5edc7e101b365592dcff3deb5aa23aa32823032358cb50790283a308e7d1857414d6d8b380b238710c03e13097bc44aeee98
-  data.tar.gz: 2b0d3ad356f646e22afafcfddf6a323e9f85e5a9d85908b21f9f3bc1aeca1e469af0689751898242d76aabfe5ffb757ebb76b288374e3f671df8df0abfeb69d6
+  metadata.gz: 9444f1648daaf96411cec7cc8f0ed576f060ec3c7f4b16ba37bba600b54a295297626deadb38aae5fcf6b1f36cc6ad64d9b449bc3e326dff13c39f857364a753
+  data.tar.gz: 4f68f1f9f01542372c621e8686b9604ff911851c8bab0477b4f42069b7bb7e269c324815c8bb65eadee6a8835a6cc56be7f385b5fc98227f40ced706a5026e13
data/CHANGELOG.md CHANGED
@@ -1,3 +1,15 @@
+## 6.2.4
+  - Backport of fixes from more recent branches:
+    - Fixed unnecessary sleep after exhausted retries [#166](https://github.com/logstash-plugins/logstash-output-kafka/pull/166)
+    - Changed Kafka send errors to log as warn [#179](https://github.com/logstash-plugins/logstash-output-kafka/pull/179)
+
+## 6.2.3
+  - Bugfix: Sends are now retried until successful. Previously, failed transmissions to Kafka
+    could have been lost by the KafkaProducer library. Now we verify transmission explicitly.
+    This changes the default `retries` from 0 to retry-forever. It was a bug that we defaulted
+    to a retry count of 0.
+    https://github.com/logstash-plugins/logstash-output-kafka/pull/151
+
 ## 6.2.2
   - bump kafka dependency to 0.11.0.0
 
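The retry policy introduced in 6.2.3 can be sketched in a few lines of Ruby. This is a hypothetical simplification, not the plugin's code (the real `retrying_send` appears in the library diff below); `send_batch` is a stub standing in for the Kafka producer call and returns the records that failed.

```ruby
# Hypothetical model of the 6.2.3 retry policy: retries = nil (the new
# default) keeps retrying until the batch succeeds; a finite value allows
# retries + 1 total attempts, after which the remaining events are dropped.
$attempts = 0
def send_batch(batch)          # stub: fail twice, then succeed
  $attempts += 1
  $attempts < 3 ? batch : []   # returns the records that failed
end

def retrying_send(batch, retries: nil, backoff_s: 0.1)
  remaining = retries
  until batch.empty?
    unless remaining.nil?
      break if remaining < 0    # finite budget exhausted: drop what is left
      remaining -= 1
    end
    batch = send_batch(batch)   # keep only the failures for the next pass
    will_retry = !batch.empty? && (remaining.nil? || remaining >= 0)
    sleep(backoff_s) if will_retry   # back off only when another try follows
  end
end

retrying_send(%w[a b c])        # retries forever; succeeds on the third try
```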
data/docs/index.asciidoc CHANGED
@@ -20,26 +20,11 @@ include::{include_path}/plugin_header.asciidoc[]
 
 ==== Description
 
-Write events to a Kafka topic. This uses the Kafka Producer API to write messages to a topic on
-the broker.
-
-Here's a compatibility matrix that shows the Kafka client versions that are compatible with each combination
-of Logstash and the Kafka output plugin:
-
-[options="header"]
-|==========================================================
-|Kafka Client Version |Logstash Version |Plugin Version |Why?
-|0.8 |2.0.0 - 2.x.x |<3.0.0 |Legacy, 0.8 is still popular
-|0.9 |2.0.0 - 2.3.x | 3.x.x |Works with the old Ruby Event API (`event['product']['price'] = 10`)
-|0.9 |2.4.x - 5.x.x | 4.x.x |Works with the new getter/setter APIs (`event.set('[product][price]', 10)`)
-|0.10.0.x |2.4.x - 5.x.x | 5.x.x |Not compatible with the <= 0.9 broker
-|0.10.1.x |2.4.x - 5.x.x | 6.x.x |
-|0.11.0.0 |2.4.x - 5.x.x | 6.2.2 |Not compatible with the <= 0.9 broker
-|==========================================================
-
-NOTE: We recommended that you use matching Kafka client and broker versions. During upgrades, you should
-upgrade brokers before clients because brokers target backwards compatibility. For example, the 0.9 broker
-is compatible with both the 0.8 consumer and 0.9 consumer APIs, but not the other way around.
+Write events to a Kafka topic.
+
+This plugin uses Kafka Client 0.11.0.0. For broker compatibility, see the official https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka compatibility reference].
+
+If you're using a plugin version that was released after {version}, see the https://www.elastic.co/guide/en/logstash/master/plugins-inputs-kafka.html[latest plugin documentation] for updated information about Kafka compatibility. If you require features not yet available in this plugin (including client version upgrades), please file an issue with details about what you need.
 
 This output supports connecting to Kafka over:
 
@@ -302,10 +287,17 @@ retries are exhausted.
 ===== `retries`
 
 * Value type is <<number,number>>
-* Default value is `0`
+* There is no default value for this setting.
+
+The default retry behavior is to retry until successful. To prevent data loss,
+the use of this setting is discouraged.
+
+If you choose to set `retries`, a value greater than zero will cause the
+client to only retry a fixed number of times. This will result in data loss
+if a transport fault exists for longer than your retry count (network outage,
+Kafka down, etc).
 
-Setting a value greater than zero will cause the client to
-resend any record whose send fails with a potentially transient error.
+A value less than zero is a configuration error.
 
 [id="plugins-{type}s-{plugin}-retry_backoff_ms"]
 ===== `retry_backoff_ms`
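As a hedged illustration of the new semantics (the construction pattern mirrors this plugin's specs; the `topic_id` value is a placeholder): leaving `retries` unset retries forever, while a finite value permits `retries + 1` total send attempts before the batch is dropped.

```ruby
require "logstash/outputs/kafka"

# Placeholder topic. retries => 5 allows 6 total attempts per failing batch
# (1 initial send + 5 retries); after that the remaining events are dropped.
kafka = LogStash::Outputs::Kafka.new("topic_id" => "logs", "retries" => 5)
kafka.register   # warns at startup that a finite retry can lose data
```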
data/lib/logstash/outputs/kafka.rb CHANGED
@@ -111,9 +111,15 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
   # elapses the client will resend the request if necessary or fail the request if
   # retries are exhausted.
   config :request_timeout_ms, :validate => :string
-  # Setting a value greater than zero will cause the client to
-  # resend any record whose send fails with a potentially transient error.
-  config :retries, :validate => :number, :default => 0
+  # The default retry behavior is to retry until successful. To prevent data loss,
+  # the use of this setting is discouraged.
+  #
+  # If you choose to set `retries`, a value greater than zero will cause the
+  # client to only retry a fixed number of times. This will result in data loss
+  # if a transient error outlasts your retry count.
+  #
+  # A value less than zero is a configuration error.
+  config :retries, :validate => :number
   # The amount of time to wait before attempting to retry a failed produce request to a given topic partition.
   config :retry_backoff_ms, :validate => :number, :default => 100
   # The size of the TCP send buffer to use when sending data.
@@ -175,6 +181,17 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
 
   public
   def register
+    @thread_batch_map = Concurrent::Hash.new
+
+    if !@retries.nil?
+      if @retries < 0
+        raise ConfigurationError, "A negative retry count (#{@retries}) is not valid. Must be a value >= 0"
+      end
+
+      @logger.warn("Kafka output is configured with finite retry. This instructs Logstash to LOSE DATA after a set number of send attempts fails. If you do not want to lose data if Kafka is down, then you must remove the retry setting.", :retries => @retries)
+    end
+
+
     @producer = create_producer
     @codec.on_event do |event, data|
       begin
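A hedged companion to the `register` validation above: a negative count now fails fast instead of being passed through to the producer (the error class is assumed to resolve to `LogStash::ConfigurationError` inside the class).

```ruby
require "logstash/outputs/kafka"

kafka = LogStash::Outputs::Kafka.new("topic_id" => "logs", "retries" => -1)
kafka.register   # raises LogStash::ConfigurationError: negative retry count
```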
@@ -183,7 +200,7 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
       else
         record = org.apache.kafka.clients.producer.ProducerRecord.new(event.sprintf(@topic_id), event.sprintf(@message_key), data)
       end
-      @producer.send(record)
+      prepare(record)
     rescue LogStash::ShutdownSignal
       @logger.debug('Kafka producer got shutdown signal')
     rescue => e
@@ -191,14 +208,92 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
                      :exception => e)
       end
     end
-
   end # def register
 
-  def receive(event)
-    if event == LogStash::SHUTDOWN
-      return
+  def prepare(record)
+    # This output is threadsafe, so we need to keep a batch per thread.
+    @thread_batch_map[Thread.current].add(record)
+  end
+
+  def multi_receive(events)
+    t = Thread.current
+    if !@thread_batch_map.include?(t)
+      @thread_batch_map[t] = java.util.ArrayList.new(events.size)
+    end
+
+    events.each do |event|
+      break if event == LogStash::SHUTDOWN
+      @codec.encode(event)
+    end
+
+    batch = @thread_batch_map[t]
+    if batch.any?
+      retrying_send(batch)
+      batch.clear
+    end
+  end
+
+  def retrying_send(batch)
+    remaining = @retries
+
+    while batch.any?
+      if !remaining.nil?
+        if remaining < 0
+          # TODO(sissel): Offer to DLQ? Then again, if it's a transient fault,
+          # DLQing would make things worse (you dlq data that would be successful
+          # after the fault is repaired)
+          logger.info("Exhausted user-configured retry count when sending to Kafka. Dropping these events.",
+                      :max_retries => @retries, :drop_count => batch.count)
+          break
+        end
+
+        remaining -= 1
+      end
+
+      failures = []
+
+      futures = batch.collect do |record|
+        begin
+          # send() can throw an exception even before the future is created.
+          @producer.send(record)
+        rescue org.apache.kafka.common.errors.TimeoutException => e
+          failures << record
+          nil
+        rescue org.apache.kafka.common.errors.InterruptException => e
+          failures << record
+          nil
+        rescue org.apache.kafka.common.errors.SerializationException => e
+          # TODO(sissel): Retrying will fail because the data itself has a problem serializing.
+          # TODO(sissel): Let's add DLQ here.
+          failures << record
+          nil
+        end
+      end.compact
+
+      futures.each_with_index do |future, i|
+        begin
+          result = future.get()
+        rescue => e
+          # TODO(sissel): Add metric to count failures, possibly by exception type.
+          logger.debug? && logger.debug("KafkaProducer.send() failed: #{e}", :exception => e)
+          failures << batch[i]
+        end
+      end
+
+      # No failures? Cool. Let's move on.
+      break if failures.empty?
+
+      # Otherwise, retry with any failed transmissions
+      if remaining != nil && remaining < 0
+        logger.info("Sending batch to Kafka failed.", :batch_size => batch.size, :failures => failures.size)
+      else
+        delay = @retry_backoff_ms / 1000.0
+        logger.info("Sending batch to Kafka failed. Will retry after a delay.", :batch_size => batch.size,
+                    :failures => failures.size, :sleep => delay)
+        batch = failures
+        sleep(delay)
+      end
     end
-    @codec.encode(event)
   end
 
   def close
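The per-event `receive` path is gone; events now flow through `multi_receive`, which encodes into a per-thread batch and hands it to `retrying_send`. A minimal usage sketch, mirroring the updated specs (`events` is assumed to be an array of `LogStash::Event`):

```ruby
kafka = LogStash::Outputs::Kafka.new("topic_id" => "logs")
kafka.register                 # sets up @thread_batch_map and the producer
kafka.multi_receive(events)    # encode into this thread's batch, send with retry
kafka.close
```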
@@ -222,8 +317,8 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
     props.put(kafka::MAX_REQUEST_SIZE_CONFIG, max_request_size.to_s)
     props.put(kafka::RECONNECT_BACKOFF_MS_CONFIG, reconnect_backoff_ms) unless reconnect_backoff_ms.nil?
     props.put(kafka::REQUEST_TIMEOUT_MS_CONFIG, request_timeout_ms) unless request_timeout_ms.nil?
-    props.put(kafka::RETRIES_CONFIG, retries.to_s)
-    props.put(kafka::RETRY_BACKOFF_MS_CONFIG, retry_backoff_ms.to_s)
+    props.put(kafka::RETRIES_CONFIG, retries.to_s) unless retries.nil?
+    props.put(kafka::RETRY_BACKOFF_MS_CONFIG, retry_backoff_ms.to_s)
     props.put(kafka::SEND_BUFFER_CONFIG, send_buffer_bytes.to_s)
     props.put(kafka::VALUE_SERIALIZER_CLASS_CONFIG, value_serializer)
 
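With the guard above, the `retries` producer property is only forwarded when the user configured it; otherwise the Kafka client keeps its own built-in default. A minimal JRuby sketch of the same pattern (hypothetical helper, not plugin code):

```ruby
require "java"

# Hypothetical helper: forward "retries" only when set, so the Kafka
# client's built-in default applies otherwise.
def producer_props(retries: nil, retry_backoff_ms: 100)
  props = java.util.Properties.new
  props.put("retry.backoff.ms", retry_backoff_ms.to_s)
  props.put("retries", retries.to_s) unless retries.nil?
  props
end

producer_props               # no "retries" key; client default applies
producer_props(retries: 5)   # "retries" => "5"
```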
@@ -241,7 +336,9 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
 
       org.apache.kafka.clients.producer.KafkaProducer.new(props)
     rescue => e
-      logger.error("Unable to create Kafka producer from given configuration", :kafka_error_message => e)
+      logger.error("Unable to create Kafka producer from given configuration",
+                   :kafka_error_message => e,
+                   :cause => e.respond_to?(:getCause) ? e.getCause() : nil)
       raise e
     end
   end
data/logstash-output-kafka.gemspec CHANGED
@@ -1,7 +1,7 @@
 Gem::Specification.new do |s|
 
   s.name = 'logstash-output-kafka'
-  s.version = '6.2.2'
+  s.version = '6.2.4'
   s.licenses = ['Apache License (2.0)']
   s.summary = 'Output events to a Kafka topic. This uses the Kafka Producer API to write messages to a topic on the broker'
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
data/spec/integration/outputs/kafka_spec.rb CHANGED
@@ -157,7 +157,7 @@ describe "outputs/kafka", :integration => true do
   def load_kafka_data(config)
     kafka = LogStash::Outputs::Kafka.new(config)
     kafka.register
-    num_events.times do kafka.receive(event) end
+    kafka.multi_receive(num_events.times.collect { event })
     kafka.close
   end
 
data/spec/unit/outputs/kafka_spec.rb CHANGED
@@ -25,34 +25,168 @@ describe "outputs/kafka" do
   context 'when outputting messages' do
     it 'should send logstash event to kafka broker' do
       expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
-        .with(an_instance_of(org.apache.kafka.clients.producer.ProducerRecord))
+        .with(an_instance_of(org.apache.kafka.clients.producer.ProducerRecord)).and_call_original
       kafka = LogStash::Outputs::Kafka.new(simple_kafka_config)
       kafka.register
-      kafka.receive(event)
+      kafka.multi_receive([event])
     end
 
     it 'should support Event#sprintf placeholders in topic_id' do
       topic_field = 'topic_name'
       expect(org.apache.kafka.clients.producer.ProducerRecord).to receive(:new)
-        .with("my_topic", event.to_s)
-      expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+        .with("my_topic", event.to_s).and_call_original
+      expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send).and_call_original
       kafka = LogStash::Outputs::Kafka.new({'topic_id' => "%{#{topic_field}}"})
       kafka.register
-      kafka.receive(event)
+      kafka.multi_receive([event])
     end
 
     it 'should support field referenced message_keys' do
       expect(org.apache.kafka.clients.producer.ProducerRecord).to receive(:new)
-        .with("test", "172.0.0.1", event.to_s)
-      expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+        .with("test", "172.0.0.1", event.to_s).and_call_original
+      expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send).and_call_original
       kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge({"message_key" => "%{host}"}))
       kafka.register
-      kafka.receive(event)
+      kafka.multi_receive([event])
     end
-
+
     it 'should raise config error when truststore location is not set and ssl is enabled' do
-      kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge({"ssl" => "true"}))
+      kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("security_protocol" => "SSL"))
       expect { kafka.register }.to raise_error(LogStash::ConfigurationError, /ssl_truststore_location must be set when SSL is enabled/)
     end
   end
+
+  context "when KafkaProducer#send() raises an exception" do
+    let(:failcount) { (rand * 10).to_i }
+    let(:sendcount) { failcount + 1 }
+
+    let(:exception_classes) { [
+      org.apache.kafka.common.errors.TimeoutException,
+      org.apache.kafka.common.errors.InterruptException,
+      org.apache.kafka.common.errors.SerializationException
+    ] }
+
+    before do
+      count = 0
+      expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+        .exactly(sendcount).times
+        .and_wrap_original do |m, *args|
+          if count < failcount # fail 'failcount' times in a row.
+            count += 1
+            # Pick an exception at random
+            raise exception_classes.shuffle.first.new("injected exception for testing")
+          else
+            m.call(*args) # call original
+          end
+        end
+    end
+
+    it "should retry until successful" do
+      kafka = LogStash::Outputs::Kafka.new(simple_kafka_config)
+      kafka.register
+      kafka.multi_receive([event])
+    end
+  end
+
+  context "when a send fails" do
+    context "and the default retries behavior is used" do
+      # Fail this many times and then finally succeed.
+      let(:failcount) { (rand * 10).to_i }
+
+      # Expect KafkaProducer.send() to get called again after every failure, plus the successful one.
+      let(:sendcount) { failcount + 1 }
+
+      it "should retry until successful" do
+        count = 0
+
+        expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+          .exactly(sendcount).times
+          .and_wrap_original do |m, *args|
+            if count < failcount
+              count += 1
+              # inject some failures.
+
+              # Return a custom Future that will raise an exception to simulate a Kafka send() problem.
+              future = java.util.concurrent.FutureTask.new { raise "Failed" }
+              future.run
+              future
+            else
+              m.call(*args)
+            end
+          end
+        kafka = LogStash::Outputs::Kafka.new(simple_kafka_config)
+        kafka.register
+        kafka.multi_receive([event])
+      end
+    end
+
+    context 'when retries is 0' do
+      let(:retries) { 0 }
+      let(:max_sends) { 1 }
+
+      it "should only send once" do
+        expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+          .once
+          .and_wrap_original do |m, *args|
+            # Always fail.
+            future = java.util.concurrent.FutureTask.new { raise "Failed" }
+            future.run
+            future
+          end
+        kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("retries" => retries))
+        kafka.register
+        kafka.multi_receive([event])
+      end
+
+      it 'should not sleep' do
+        expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+          .once
+          .and_wrap_original do |m, *args|
+            # Always fail.
+            future = java.util.concurrent.FutureTask.new { raise "Failed" }
+            future.run
+            future
+          end
+
+        kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("retries" => retries))
+        expect(kafka).not_to receive(:sleep).with(anything)
+        kafka.register
+        kafka.multi_receive([event])
+      end
+    end
+
+    context "and when retries is set by the user" do
+      let(:retries) { (rand * 10).to_i }
+      let(:max_sends) { retries + 1 }
+
+      it "should give up after retries are exhausted" do
+        expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+          .at_most(max_sends).times
+          .and_wrap_original do |m, *args|
+            # Always fail.
+            future = java.util.concurrent.FutureTask.new { raise "Failed" }
+            future.run
+            future
+          end
+        kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("retries" => retries))
+        kafka.register
+        kafka.multi_receive([event])
+      end
+
+      it 'should only sleep retries number of times' do
+        expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
+          .at_most(max_sends)
+          .and_wrap_original do |m, *args|
+            # Always fail.
+            future = java.util.concurrent.FutureTask.new { raise "Failed" }
+            future.run
+            future
+          end
+        kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("retries" => retries))
+        expect(kafka).to receive(:sleep).exactly(retries).times
+        kafka.register
+        kafka.multi_receive([event])
+      end
+    end
+  end
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-output-kafka
 version: !ruby/object:Gem::Version
-  version: 6.2.2
+  version: 6.2.4
 platform: ruby
 authors:
 - Elasticsearch
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-07-11 00:00:00.000000000 Z
+date: 2019-02-12 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement
@@ -114,7 +114,9 @@ dependencies:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
-description: This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program
+description: This gem is a Logstash plugin required to be installed on top of the
+  Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This
+  gem is not a stand-alone program
 email: info@elastic.co
 executables: []
 extensions: []
@@ -176,10 +178,11 @@ requirements:
 - jar 'org.slf4j:slf4j-log4j12', '1.7.21'
 - jar 'org.apache.logging.log4j:log4j-1.2-api', '2.6.2'
 rubyforge_project:
-rubygems_version: 2.4.8
+rubygems_version: 2.6.13
 signing_key:
 specification_version: 4
-summary: Output events to a Kafka topic. This uses the Kafka Producer API to write messages to a topic on the broker
+summary: Output events to a Kafka topic. This uses the Kafka Producer API to write
+  messages to a topic on the broker
 test_files:
 - spec/integration/outputs/kafka_spec.rb
 - spec/unit/outputs/kafka_spec.rb