logstash-integration-kafka 10.4.0-java → 10.6.0-java

Files changed (33)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +24 -1
  3. data/docs/index.asciidoc +7 -2
  4. data/docs/input-kafka.asciidoc +71 -10
  5. data/docs/output-kafka.asciidoc +42 -16
  6. data/lib/logstash-integration-kafka_jars.rb +13 -4
  7. data/lib/logstash/inputs/kafka.rb +33 -29
  8. data/lib/logstash/outputs/kafka.rb +31 -47
  9. data/lib/logstash/plugin_mixins/common.rb +92 -0
  10. data/lib/logstash/plugin_mixins/kafka_support.rb +29 -0
  11. data/logstash-integration-kafka.gemspec +1 -1
  12. data/spec/fixtures/trust-store_stub.jks +0 -0
  13. data/spec/integration/inputs/kafka_spec.rb +186 -11
  14. data/spec/unit/inputs/avro_schema_fixture_payment.asvc +8 -0
  15. data/spec/unit/inputs/kafka_spec.rb +16 -0
  16. data/spec/unit/outputs/kafka_spec.rb +58 -15
  17. data/vendor/jar-dependencies/com/github/luben/zstd-jni/1.4.4-7/zstd-jni-1.4.4-7.jar +0 -0
  18. data/vendor/jar-dependencies/io/confluent/common-config/5.5.1/common-config-5.5.1.jar +0 -0
  19. data/vendor/jar-dependencies/io/confluent/common-utils/5.5.1/common-utils-5.5.1.jar +0 -0
  20. data/vendor/jar-dependencies/io/confluent/kafka-avro-serializer/5.5.1/kafka-avro-serializer-5.5.1.jar +0 -0
  21. data/vendor/jar-dependencies/io/confluent/kafka-schema-registry-client/5.5.1/kafka-schema-registry-client-5.5.1.jar +0 -0
  22. data/vendor/jar-dependencies/io/confluent/kafka-schema-serializer/5.5.1/kafka-schema-serializer-5.5.1.jar +0 -0
  23. data/vendor/jar-dependencies/javax/ws/rs/javax.ws.rs-api/2.1.1/javax.ws.rs-api-2.1.1.jar +0 -0
  24. data/vendor/jar-dependencies/org/apache/avro/avro/1.9.2/avro-1.9.2.jar +0 -0
  25. data/vendor/jar-dependencies/org/apache/kafka/kafka-clients/{2.4.1/kafka-clients-2.4.1.jar → 2.5.1/kafka-clients-2.5.1.jar} +0 -0
  26. data/vendor/jar-dependencies/org/apache/kafka/kafka_2.12/2.5.1/kafka_2.12-2.5.1.jar +0 -0
  27. data/vendor/jar-dependencies/org/glassfish/jersey/core/jersey-common/2.30/jersey-common-2.30.jar +0 -0
  28. data/vendor/jar-dependencies/org/lz4/lz4-java/1.7.1/lz4-java-1.7.1.jar +0 -0
  29. data/vendor/jar-dependencies/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar +0 -0
  30. metadata +21 -6
  31. data/vendor/jar-dependencies/com/github/luben/zstd-jni/1.4.3-1/zstd-jni-1.4.3-1.jar +0 -0
  32. data/vendor/jar-dependencies/org/lz4/lz4-java/1.6.0/lz4-java-1.6.0.jar +0 -0
  33. data/vendor/jar-dependencies/org/slf4j/slf4j-api/1.7.28/slf4j-api-1.7.28.jar +0 -0
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 6ebbcd2d18d130e9fac997330c3c4b4bd9a959a982fe83215762b03638497ba4
- data.tar.gz: 2b54ba231d9f74344a5ec321e0dcdec256ea5664001e7c3dc0323b2150761e30
+ metadata.gz: 1d8b40d779e91e9c05dece65249660ab5c272b6833658cbca51b977d92936f42
+ data.tar.gz: 58323e216be645aede9f0b49c27824958b6627485bdf79a6774cd5f87b818245
  SHA512:
- metadata.gz: a8c2aa5c2123fa001f58fc3670bd90face614fed72cf24f17ad645ba4de3bd689923d51ba5b5dd3a9201507657f6ed54326ed48495d274bb7c2284525470bdf7
- data.tar.gz: cebe4abeda34edd6d5d1872c96f1b119abfa7abb2e40c52fb061e2c0953789441223e4b5a93a0d2fd7e3de1918c592dce9f5fee91bb6713b0e16f167033c13ce
+ metadata.gz: 0f05eec028758745a2ab04b90d721128c088c46d0bb9a01923c389118c99a718a561d2ca2420fc2e206e4cf75e4e3695e1704a17d9eff9f4f18e66da3d3ccb85
+ data.tar.gz: 85e117a64d14d013674869ccadfb5487a04991ec83c6a9d6874496b862be200bf4e7203d3ac4fc779d67131ecbe82212fafd66884114337d17ed33dda7ad0963
data/CHANGELOG.md CHANGED
@@ -1,8 +1,31 @@
+ ## 10.6.0
+ - Added functionality to Kafka input to use Avro deserializer in retrieving data from Kafka. The schema is retrieved
+ from an instance of Confluent's Schema Registry service [#51](https://github.com/logstash-plugins/logstash-integration-kafka/pull/51)
+
+ ## 10.5.3
+ - Fix: set (optional) truststore when endpoint id check disabled [#60](https://github.com/logstash-plugins/logstash-integration-kafka/pull/60).
+ Since **10.1.0** disabling server host-name verification (`ssl_endpoint_identification_algorithm => ""`) did not allow
+ the (output) plugin to set `ssl_truststore_location => "..."`.
+
+ ## 10.5.2
+ - Docs: explain group_id in case of multiple inputs [#59](https://github.com/logstash-plugins/logstash-integration-kafka/pull/59)
+
+ ## 10.5.1
+ - [DOC]Replaced plugin_header file with plugin_header-integration file. [#46](https://github.com/logstash-plugins/logstash-integration-kafka/pull/46)
+ - [DOC]Update kafka client version across kafka integration docs [#47](https://github.com/logstash-plugins/logstash-integration-kafka/pull/47)
+ - [DOC]Replace hard-coded kafka client and doc path version numbers with attributes to simplify doc maintenance [#48](https://github.com/logstash-plugins/logstash-integration-kafka/pull/48)
+
+ ## 10.5.0
+ - Changed: retry sending messages only for retriable exceptions [#27](https://github.com/logstash-plugins/logstash-integration-kafka/pull/29)
+
+ ## 10.4.1
+ - [DOC] Fixed formatting issues and made minor content edits [#43](https://github.com/logstash-plugins/logstash-integration-kafka/pull/43)
+
  ## 10.4.0
  - added the input `isolation_level` to allow fine control of whether to return transactional messages [#44](https://github.com/logstash-plugins/logstash-integration-kafka/pull/44)

  ## 10.3.0
- - added the input and output `client_dns_lookup` parameter to allow control of how DNS requests are made
+ - added the input and output `client_dns_lookup` parameter to allow control of how DNS requests are made [#28](https://github.com/logstash-plugins/logstash-integration-kafka/pull/28)

  ## 10.2.0
  - Changed: config defaults to be aligned with Kafka client defaults [#30](https://github.com/logstash-plugins/logstash-integration-kafka/pull/30)
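The 10.5.3 entry above describes an interaction worth illustrating: with that fix, disabling host-name verification no longer prevents the output from using an explicit truststore. A minimal sketch of that configuration, with placeholder broker address, topic, and paths (not taken from the changelog):

output {
  kafka {
    bootstrap_servers => "broker.internal:9093"
    topic_id => "logs"
    security_protocol => "SSL"
    # empty string disables server host-name verification
    ssl_endpoint_identification_algorithm => ""
    # since 10.5.3 the truststore settings are still applied in this case
    ssl_truststore_location => "/etc/logstash/kafka-truststore.jks"
    ssl_truststore_password => "changeit"
  }
}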
data/docs/index.asciidoc CHANGED
@@ -1,6 +1,7 @@
  :plugin: kafka
  :type: integration
  :no_codec:
+ :kafka_client: 2.4

  ///////////////////////////////////////////
  START - GENERATED VARIABLES, DO NOT EDIT!
@@ -21,11 +22,15 @@ include::{include_path}/plugin_header.asciidoc[]

  ==== Description

- The Kafka Integration Plugin provides integrated plugins for working with the https://kafka.apache.org/[Kafka] distributed streaming platform.
+ The Kafka Integration Plugin provides integrated plugins for working with the
+ https://kafka.apache.org/[Kafka] distributed streaming platform.

  - {logstash-ref}/plugins-inputs-kafka.html[Kafka Input Plugin]
  - {logstash-ref}/plugins-outputs-kafka.html[Kafka Output Plugin]

- This plugin uses Kafka Client 2.4. For broker compatibility, see the official https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka compatibility reference]. If the linked compatibility wiki is not up-to-date, please contact Kafka support/community to confirm compatibility.
+ This plugin uses Kafka Client {kafka_client}. For broker compatibility, see the official
+ https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka
+ compatibility reference]. If the linked compatibility wiki is not up-to-date,
+ please contact Kafka support/community to confirm compatibility.

  :no_codec!:
data/docs/input-kafka.asciidoc CHANGED
@@ -1,6 +1,9 @@
+ :integration: kafka
  :plugin: kafka
  :type: input
  :default_codec: plain
+ :kafka_client: 2.4
+ :kafka_client_doc: 24

  ///////////////////////////////////////////
  START - GENERATED VARIABLES, DO NOT EDIT!
@@ -17,15 +20,20 @@ END - GENERATED VARIABLES, DO NOT EDIT!

  === Kafka input plugin

- include::{include_path}/plugin_header.asciidoc[]
+ include::{include_path}/plugin_header-integration.asciidoc[]

  ==== Description

  This input will read events from a Kafka topic.

- This plugin uses Kafka Client 2.3.0. For broker compatibility, see the official https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka compatibility reference]. If the linked compatibility wiki is not up-to-date, please contact Kafka support/community to confirm compatibility.
+ This plugin uses Kafka Client {kafka_client}. For broker compatibility, see the
+ official
+ https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka
+ compatibility reference]. If the linked compatibility wiki is not up-to-date,
+ please contact Kafka support/community to confirm compatibility.

- If you require features not yet available in this plugin (including client version upgrades), please file an issue with details about what you need.
+ If you require features not yet available in this plugin (including client
+ version upgrades), please file an issue with details about what you need.

  This input supports connecting to Kafka over:

@@ -46,9 +54,9 @@ the same `group_id`.
  Ideally you should have as many threads as the number of partitions for a perfect balance --
  more threads than partitions means that some threads will be idle

- For more information see https://kafka.apache.org/24/documentation.html#theconsumer
+ For more information see https://kafka.apache.org/{kafka_client_doc}/documentation.html#theconsumer

- Kafka consumer configuration: https://kafka.apache.org/24/documentation.html#consumerconfigs
+ Kafka consumer configuration: https://kafka.apache.org/{kafka_client_doc}/documentation.html#consumerconfigs

  ==== Metadata fields

@@ -59,7 +67,11 @@ The following metadata from Kafka broker are added under the `[@metadata]` field
  * `[@metadata][kafka][partition]`: Partition info for this message.
  * `[@metadata][kafka][offset]`: Original record offset for this message.
  * `[@metadata][kafka][key]`: Record key, if any.
- * `[@metadata][kafka][timestamp]`: Timestamp in the Record. Depending on your broker configuration, this can be either when the record was created (default) or when it was received by the broker. See more about property log.message.timestamp.type at https://kafka.apache.org/10/documentation.html#brokerconfigs
+ * `[@metadata][kafka][timestamp]`: Timestamp in the Record.
+ Depending on your broker configuration, this can be
+ either when the record was created (default) or when it was received by the
+ broker. See more about property log.message.timestamp.type at
+ https://kafka.apache.org/{kafka_client_doc}/documentation.html#brokerconfigs

  Metadata is only added to the event if the `decorate_events` option is set to true (it defaults to false).

@@ -73,7 +85,7 @@ This plugin supports these configuration options plus the <<plugins-{type}s-{plu

  NOTE: Some of these options map to a Kafka option. Defaults usually reflect the Kafka default setting,
  and might change if Kafka's consumer defaults change.
- See the https://kafka.apache.org/24/documentation for more details.
+ See the https://kafka.apache.org/{kafka_client_doc}/documentation for more details.

  [cols="<,<,<",options="header",]
  |=======================================================================
@@ -112,6 +124,10 @@ See the https://kafka.apache.org/24/documentation for more details.
  | <<plugins-{type}s-{plugin}-sasl_jaas_config>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-sasl_kerberos_service_name>> |<<string,string>>|No
  | <<plugins-{type}s-{plugin}-sasl_mechanism>> |<<string,string>>|No
+ | <<plugins-{type}s-{plugin}-schema_registry_key>> |<<string,string>>|No
+ | <<plugins-{type}s-{plugin}-schema_registry_proxy>> |<<uri,uri>>|No
+ | <<plugins-{type}s-{plugin}-schema_registry_secret>> |<<string,string>>|No
+ | <<plugins-{type}s-{plugin}-schema_registry_url>> |<<uri,uri>>|No
  | <<plugins-{type}s-{plugin}-security_protocol>> |<<string,string>>, one of `["PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"]`|No
  | <<plugins-{type}s-{plugin}-send_buffer_bytes>> |<<number,number>>|No
  | <<plugins-{type}s-{plugin}-session_timeout_ms>> |<<number,number>>|No
@@ -302,7 +318,11 @@ before answering the request.

  The identifier of the group this consumer belongs to. Consumer group is a single logical subscriber
  that happens to be made up of multiple processors. Messages in a topic will be distributed to all
- Logstash instances with the same `group_id`
+ Logstash instances with the same `group_id`.
+
+ NOTE: In cases when multiple inputs are being used in a single pipeline, reading from different topics,
+ it's essential to set a different `group_id => ...` for each input. Setting a unique `client_id => ...`
+ is also recommended.

  [id="plugins-{type}s-{plugin}-heartbeat_interval_ms"]
  ===== `heartbeat_interval_ms`
@@ -421,7 +441,7 @@ partition ownership amongst consumer instances, supported options are:
  * `sticky`
  * `cooperative_sticky`

- These map to Kafka's corresponding https://kafka.apache.org/24/javadoc/org/apache/kafka/clients/consumer/ConsumerPartitionAssignor.html[`ConsumerPartitionAssignor`]
+ These map to Kafka's corresponding https://kafka.apache.org/{kafka_client_doc}/javadoc/org/apache/kafka/clients/consumer/ConsumerPartitionAssignor.html[`ConsumerPartitionAssignor`]
  implementations.

  [id="plugins-{type}s-{plugin}-poll_timeout_ms"]
@@ -512,6 +532,44 @@ http://kafka.apache.org/documentation.html#security_sasl[SASL mechanism] used fo
  This may be any mechanism for which a security provider is available.
  GSSAPI is the default mechanism.

+ [id="plugins-{type}s-{plugin}-schema_registry_key"]
+ ===== `schema_registry_key`
+
+ * Value type is <<string,string>>
+ * There is no default value for this setting.
+
+ Set the username for basic authorization to access remote Schema Registry.
+
+ [id="plugins-{type}s-{plugin}-schema_registry_proxy"]
+ ===== `schema_registry_proxy`
+
+ * Value type is <<uri,uri>>
+ * There is no default value for this setting.
+
+ Set the address of a forward HTTP proxy. An empty string is treated as if proxy was not set.
+
+ [id="plugins-{type}s-{plugin}-schema_registry_secret"]
+ ===== `schema_registry_secret`
+
+ * Value type is <<string,string>>
+ * There is no default value for this setting.
+
+ Set the password for basic authorization to access remote Schema Registry.
+
+ [id="plugins-{type}s-{plugin}-schema_registry_url"]
+ ===== `schema_registry_url`
+
+ * Value type is <<uri,uri>>
+
+ The URI that points to an instance of the
+ https://docs.confluent.io/current/schema-registry/index.html[Schema Registry] service,
+ used to manage Avro schemas. Be sure that the Avro schemas for deserializing the data from
+ the specified topics have been uploaded to the Schema Registry service.
+ The schemas must follow a naming convention with the pattern <topic name>-value.
+
+ Use either the Schema Registry config option or the
+ <<plugins-{type}s-{plugin}-value_deserializer_class>> config option, but not both.
+
  [id="plugins-{type}s-{plugin}-security_protocol"]
  ===== `security_protocol`

@@ -625,7 +683,10 @@ The topics configuration will be ignored when using this configuration.
  * Value type is <<string,string>>
  * Default value is `"org.apache.kafka.common.serialization.StringDeserializer"`

- Java Class used to deserialize the record's value
+ Java Class used to deserialize the record's value.
+ A custom value deserializer can be used only if you are not using a Schema Registry.
+ Use either the value_deserializer_class config option or the
+ <<plugins-{type}s-{plugin}-schema_registry_url>> config option, but not both.

  [id="plugins-{type}s-{plugin}-common-options"]
  include::{include_path}/{type}.asciidoc[]
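Putting the new schema registry options documented above together, a minimal sketch of an input that reads Avro records via Confluent Schema Registry; the topic, registry address, and credentials are placeholders, not values taken from the docs:

input {
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["payments"]                          # subject "payments-value" must exist in the registry
    schema_registry_url => "http://localhost:8081"
    schema_registry_key => "registry-user"          # optional basic-auth username
    schema_registry_secret => "registry-pass"       # required whenever schema_registry_key is set
    group_id => "payments_reader"                   # per the group_id NOTE: distinct group_id (and client_id) per input
    decorate_events => true
  }
}

When `schema_registry_url` is set, `value_deserializer_class` and `topics_pattern` must be left at their defaults; the plugin raises a configuration error for either combination at register time.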
data/docs/output-kafka.asciidoc CHANGED
@@ -1,6 +1,9 @@
+ :integration: kafka
  :plugin: kafka
  :type: output
  :default_codec: plain
+ :kafka_client: 2.4
+ :kafka_client_doc: 24

  ///////////////////////////////////////////
  START - GENERATED VARIABLES, DO NOT EDIT!
@@ -17,15 +20,20 @@ END - GENERATED VARIABLES, DO NOT EDIT!

  === Kafka output plugin

- include::{include_path}/plugin_header.asciidoc[]
+ include::{include_path}/plugin_header-integration.asciidoc[]

  ==== Description

  Write events to a Kafka topic.

- This plugin uses Kafka Client 2.3.0. For broker compatibility, see the official https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka compatibility reference]. If the linked compatibility wiki is not up-to-date, please contact Kafka support/community to confirm compatibility.
+ This plugin uses Kafka Client {kafka_client}. For broker compatibility, see the
+ official
+ https://cwiki.apache.org/confluence/display/KAFKA/Compatibility+Matrix[Kafka
+ compatibility reference]. If the linked compatibility wiki is not up-to-date,
+ please contact Kafka support/community to confirm compatibility.

- If you require features not yet available in this plugin (including client version upgrades), please file an issue with details about what you need.
+ If you require features not yet available in this plugin (including client
+ version upgrades), please file an issue with details about what you need.

  This output supports connecting to Kafka over:

@@ -36,9 +44,12 @@ By default security is disabled but can be turned on as needed.

  The only required configuration is the topic_id.

- The default codec is plain. Logstash will encode your events with not only the message field but also with a timestamp and hostname.
+ The default codec is plain. Logstash will encode your events with not only the
+ message field but also with a timestamp and hostname.
+
+ If you want the full content of your events to be sent as json, you should set
+ the codec in the output configuration like this:

- If you want the full content of your events to be sent as json, you should set the codec in the output configuration like this:
  [source,ruby]
      output {
        kafka {
@@ -47,9 +58,11 @@ If you want the full content of your events to be sent as t
        }
      }

- For more information see https://kafka.apache.org/24/documentation.html#theproducer
+ For more information see
+ https://kafka.apache.org/{kafka_client_doc}/documentation.html#theproducer

- Kafka producer configuration: https://kafka.apache.org/24/documentation.html#producerconfigs
+ Kafka producer configuration:
+ https://kafka.apache.org/{kafka_client_doc}/documentation.html#producerconfigs

  [id="plugins-{type}s-{plugin}-options"]
  ==== Kafka Output Configuration Options
@@ -58,7 +71,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ

  NOTE: Some of these options map to a Kafka option. Defaults usually reflect the Kafka default setting,
  and might change if Kafka's producer defaults change.
- See the https://kafka.apache.org/24/documentation for more details.
+ See the https://kafka.apache.org/{kafka_client_doc}/documentation for more details.

  [cols="<,<,<",options="header",]
  |=======================================================================
@@ -115,10 +128,13 @@ output plugins.
  The number of acknowledgments the producer requires the leader to have received
  before considering a request complete.

- acks=0, the producer will not wait for any acknowledgment from the server at all.
- acks=1, This will mean the leader will write the record to its local log but
- will respond without awaiting full acknowledgement from all followers.
- acks=all, This means the leader will wait for the full set of in-sync replicas to acknowledge the record.
+ `acks=0`. The producer will not wait for any acknowledgment from the server.
+
+ `acks=1`. The leader will write the record to its local log, but will respond
+ without waiting for full acknowledgement from all followers.
+
+ `acks=all`. The leader will wait for the full set of in-sync replicas before
+ acknowledging the record.

  [id="plugins-{type}s-{plugin}-batch_size"]
  ===== `batch_size`
@@ -154,11 +170,12 @@ The total bytes of memory the producer can use to buffer records waiting to be s
  ===== `client_dns_lookup`

  * Value type is <<string,string>>
+ * Valid options are `use_all_dns_ips`, `resolve_canonical_bootstrap_servers_only`, `default`
  * Default value is `"default"`

- How DNS lookups should be done. If set to `use_all_dns_ips`, when the lookup returns multiple
- IP addresses for a hostname, they will all be attempted to connect to before failing the
- connection. If the value is `resolve_canonical_bootstrap_servers_only` each entry will be
+ Controls how DNS lookups are done. If set to `use_all_dns_ips`, Logstash tries
+ all IP addresses returned for a hostname before failing the connection.
+ If set to `resolve_canonical_bootstrap_servers_only`, each entry will be
  resolved and expanded into a list of canonical names.

  [id="plugins-{type}s-{plugin}-client_id"]
@@ -178,7 +195,7 @@ ip/port by allowing a logical application name to be included with the request
  * Default value is `"none"`

  The compression type for all data generated by the producer.
- The default is none (i.e. no compression). Valid values are none, gzip, or snappy.
+ The default is none (i.e. no compression). Valid values are none, gzip, snappy, or lz4.

  [id="plugins-{type}s-{plugin}-jaas_path"]
  ===== `jaas_path`
@@ -323,6 +340,15 @@ Kafka down, etc).

  A value less than zero is a configuration error.

+ Starting with version 10.5.0, this plugin will only retry exceptions that are a subclass of
+ https://kafka.apache.org/{kafka_client_doc}/javadoc/org/apache/kafka/common/errors/RetriableException.html[RetriableException]
+ and
+ https://kafka.apache.org/{kafka_client_doc}/javadoc/org/apache/kafka/common/errors/InterruptException.html[InterruptException].
+ If producing a message throws any other exception, an error is logged and the message is dropped without retrying.
+ This prevents the Logstash pipeline from hanging indefinitely.
+
+ In versions prior to 10.5.0, any exception is retried indefinitely unless the `retries` option is configured.
+
  [id="plugins-{type}s-{plugin}-retry_backoff_ms"]
  ===== `retry_backoff_ms`

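The acks, retries, and codec options described above are commonly combined; a minimal sketch of an output that ships whole events as JSON with stronger delivery guarantees (broker address and topic are placeholders):

output {
  kafka {
    bootstrap_servers => "localhost:9092"
    topic_id => "logstash_events"
    codec => json
    acks => "all"              # wait for the full set of in-sync replicas
    retries => 3               # since 10.5.0, only retriable exceptions are retried
    compression_type => "lz4"  # lz4 is among the accepted values listed above
  }
}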
data/lib/logstash-integration-kafka_jars.rb CHANGED
@@ -1,8 +1,17 @@
  # AUTOGENERATED BY THE GRADLE SCRIPT. DO NOT EDIT.

  require 'jar_dependencies'
- require_jar('org.apache.kafka', 'kafka-clients', '2.4.1')
- require_jar('com.github.luben', 'zstd-jni', '1.4.3-1')
- require_jar('org.slf4j', 'slf4j-api', '1.7.28')
- require_jar('org.lz4', 'lz4-java', '1.6.0')
+ require_jar('io.confluent', 'kafka-avro-serializer', '5.5.1')
+ require_jar('io.confluent', 'kafka-schema-serializer', '5.5.1')
+ require_jar('io.confluent', 'common-config', '5.5.1')
+ require_jar('org.apache.avro', 'avro', '1.9.2')
+ require_jar('io.confluent', 'kafka-schema-registry-client', '5.5.1')
+ require_jar('org.apache.kafka', 'kafka_2.12', '2.5.1')
+ require_jar('io.confluent', 'common-utils', '5.5.1')
+ require_jar('javax.ws.rs', 'javax.ws.rs-api', '2.1.1')
+ require_jar('org.glassfish.jersey.core', 'jersey-common', '2.30')
+ require_jar('org.apache.kafka', 'kafka-clients', '2.5.1')
+ require_jar('com.github.luben', 'zstd-jni', '1.4.4-7')
+ require_jar('org.slf4j', 'slf4j-api', '1.7.30')
+ require_jar('org.lz4', 'lz4-java', '1.7.1')
  require_jar('org.xerial.snappy', 'snappy-java', '1.1.7.3')
@@ -3,6 +3,11 @@ require 'logstash/inputs/base'
3
3
  require 'stud/interval'
4
4
  require 'java'
5
5
  require 'logstash-integration-kafka_jars.rb'
6
+ require 'logstash/plugin_mixins/kafka_support'
7
+ require "faraday"
8
+ require "json"
9
+ require "logstash/json"
10
+ require_relative '../plugin_mixins/common'
6
11
 
7
12
  # This input will read events from a Kafka topic. It uses the 0.10 version of
8
13
  # the consumer API provided by Kafka to read messages from the broker.
@@ -48,6 +53,12 @@ require 'logstash-integration-kafka_jars.rb'
48
53
  # Kafka consumer configuration: http://kafka.apache.org/documentation.html#consumerconfigs
49
54
  #
50
55
  class LogStash::Inputs::Kafka < LogStash::Inputs::Base
56
+
57
+ DEFAULT_DESERIALIZER_CLASS = "org.apache.kafka.common.serialization.StringDeserializer"
58
+
59
+ include LogStash::PluginMixins::KafkaSupport
60
+ include ::LogStash::PluginMixins::KafkaAvroSchemaRegistry
61
+
51
62
  config_name 'kafka'
52
63
 
53
64
  default :codec, 'plain'
@@ -163,7 +174,7 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
163
174
  # and a rebalance operation is triggered for the group identified by `group_id`
164
175
  config :session_timeout_ms, :validate => :number, :default => 10_000 # (10s) Kafka default
165
176
  # Java Class used to deserialize the record's value
166
- config :value_deserializer_class, :validate => :string, :default => "org.apache.kafka.common.serialization.StringDeserializer"
177
+ config :value_deserializer_class, :validate => :string, :default => DEFAULT_DESERIALIZER_CLASS
167
178
  # A list of topics to subscribe to, defaults to ["logstash"].
168
179
  config :topics, :validate => :array, :default => ["logstash"]
169
180
  # A topic regex pattern to subscribe to.
@@ -232,11 +243,11 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
232
243
  # `timestamp`: The timestamp of this message
233
244
  config :decorate_events, :validate => :boolean, :default => false
234
245
 
235
-
236
246
  public
237
247
  def register
238
248
  @runner_threads = []
239
- end # def register
249
+ check_schema_registry_parameters
250
+ end
240
251
 
241
252
  public
242
253
  def run(logstash_queue)
@@ -274,6 +285,13 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
274
285
  for record in records do
275
286
  codec_instance.decode(record.value.to_s) do |event|
276
287
  decorate(event)
288
+ if schema_registry_url
289
+ json = LogStash::Json.load(record.value.to_s)
290
+ json.each do |k, v|
291
+ event.set(k, v)
292
+ end
293
+ event.remove("message")
294
+ end
277
295
  if @decorate_events
278
296
  event.set("[@metadata][kafka][topic]", record.topic)
279
297
  event.set("[@metadata][kafka][consumer_group]", @group_id)
@@ -333,7 +351,18 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
333
351
  props.put(kafka::CLIENT_RACK_CONFIG, client_rack) unless client_rack.nil?
334
352
 
335
353
  props.put("security.protocol", security_protocol) unless security_protocol.nil?
336
-
354
+ if schema_registry_url
355
+ props.put(kafka::VALUE_DESERIALIZER_CLASS_CONFIG, Java::io.confluent.kafka.serializers.KafkaAvroDeserializer.java_class)
356
+ serdes_config = Java::io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig
357
+ props.put(serdes_config::SCHEMA_REGISTRY_URL_CONFIG, schema_registry_url.to_s)
358
+ if schema_registry_proxy && !schema_registry_proxy.empty?
359
+ props.put(serdes_config::PROXY_HOST, @schema_registry_proxy_host)
360
+ props.put(serdes_config::PROXY_PORT, @schema_registry_proxy_port)
361
+ end
362
+ if schema_registry_key && !schema_registry_key.empty?
363
+ props.put(serdes_config::USER_INFO_CONFIG, schema_registry_key + ":" + schema_registry_secret.value)
364
+ end
365
+ end
337
366
  if security_protocol == "SSL"
338
367
  set_trustore_keystore_config(props)
339
368
  elsif security_protocol == "SASL_PLAINTEXT"
@@ -370,29 +399,4 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
370
399
  end
371
400
  end
372
401
 
373
- def set_trustore_keystore_config(props)
374
- props.put("ssl.truststore.type", ssl_truststore_type) unless ssl_truststore_type.nil?
375
- props.put("ssl.truststore.location", ssl_truststore_location) unless ssl_truststore_location.nil?
376
- props.put("ssl.truststore.password", ssl_truststore_password.value) unless ssl_truststore_password.nil?
377
-
378
- # Client auth stuff
379
- props.put("ssl.keystore.type", ssl_keystore_type) unless ssl_keystore_type.nil?
380
- props.put("ssl.key.password", ssl_key_password.value) unless ssl_key_password.nil?
381
- props.put("ssl.keystore.location", ssl_keystore_location) unless ssl_keystore_location.nil?
382
- props.put("ssl.keystore.password", ssl_keystore_password.value) unless ssl_keystore_password.nil?
383
- props.put("ssl.endpoint.identification.algorithm", ssl_endpoint_identification_algorithm) unless ssl_endpoint_identification_algorithm.nil?
384
- end
385
-
386
- def set_sasl_config(props)
387
- java.lang.System.setProperty("java.security.auth.login.config", jaas_path) unless jaas_path.nil?
388
- java.lang.System.setProperty("java.security.krb5.conf", kerberos_config) unless kerberos_config.nil?
389
-
390
- props.put("sasl.mechanism", sasl_mechanism)
391
- if sasl_mechanism == "GSSAPI" && sasl_kerberos_service_name.nil?
392
- raise LogStash::ConfigurationError, "sasl_kerberos_service_name must be specified when SASL mechanism is GSSAPI"
393
- end
394
-
395
- props.put("sasl.kerberos.service.name", sasl_kerberos_service_name) unless sasl_kerberos_service_name.nil?
396
- props.put("sasl.jaas.config", sasl_jaas_config) unless sasl_jaas_config.nil?
397
- end
398
402
  end #class LogStash::Inputs::Kafka
@@ -2,6 +2,7 @@ require 'logstash/namespace'
2
2
  require 'logstash/outputs/base'
3
3
  require 'java'
4
4
  require 'logstash-integration-kafka_jars.rb'
5
+ require 'logstash/plugin_mixins/kafka_support'
5
6
 
6
7
  # Write events to a Kafka topic. This uses the Kafka Producer API to write messages to a topic on
7
8
  # the broker.
@@ -50,6 +51,8 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
50
51
 
51
52
  java_import org.apache.kafka.clients.producer.ProducerRecord
52
53
 
54
+ include LogStash::PluginMixins::KafkaSupport
55
+
53
56
  declare_threadsafe!
54
57
 
55
58
  config_name 'kafka'
@@ -236,7 +239,7 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
236
239
  remaining = @retries
237
240
 
238
241
  while batch.any?
239
- if !remaining.nil?
242
+ unless remaining.nil?
240
243
  if remaining < 0
241
244
  # TODO(sissel): Offer to DLQ? Then again, if it's a transient fault,
242
245
  # DLQing would make things worse (you dlq data that would be successful
@@ -255,27 +258,39 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
255
258
  begin
256
259
  # send() can throw an exception even before the future is created.
257
260
  @producer.send(record)
258
- rescue org.apache.kafka.common.errors.TimeoutException => e
259
- failures << record
260
- nil
261
- rescue org.apache.kafka.common.errors.InterruptException => e
261
+ rescue org.apache.kafka.common.errors.InterruptException,
262
+ org.apache.kafka.common.errors.RetriableException => e
263
+ logger.info("producer send failed, will retry sending", :exception => e.class, :message => e.message)
262
264
  failures << record
263
265
  nil
264
- rescue org.apache.kafka.common.errors.SerializationException => e
265
- # TODO(sissel): Retrying will fail because the data itself has a problem serializing.
266
- # TODO(sissel): Let's add DLQ here.
267
- failures << record
266
+ rescue org.apache.kafka.common.KafkaException => e
267
+ # This error is not retriable, drop event
268
+ # TODO: add DLQ support
269
+ logger.warn("producer send failed, dropping record",:exception => e.class, :message => e.message,
270
+ :record_value => record.value)
268
271
  nil
269
272
  end
270
- end.compact
273
+ end
271
274
 
272
275
  futures.each_with_index do |future, i|
273
- begin
274
- result = future.get()
275
- rescue => e
276
- # TODO(sissel): Add metric to count failures, possibly by exception type.
277
- logger.warn("producer send failed", :exception => e.class, :message => e.message)
278
- failures << batch[i]
276
+ # We cannot skip nils using `futures.compact` because then our index `i` will not align with `batch`
277
+ unless future.nil?
278
+ begin
279
+ future.get
280
+ rescue java.util.concurrent.ExecutionException => e
281
+ # TODO(sissel): Add metric to count failures, possibly by exception type.
282
+ if e.get_cause.is_a? org.apache.kafka.common.errors.RetriableException or
283
+ e.get_cause.is_a? org.apache.kafka.common.errors.InterruptException
284
+ logger.info("producer send failed, will retry sending", :exception => e.cause.class,
285
+ :message => e.cause.message)
286
+ failures << batch[i]
287
+ elsif e.get_cause.is_a? org.apache.kafka.common.KafkaException
288
+ # This error is not retriable, drop event
289
+ # TODO: add DLQ support
290
+ logger.warn("producer send failed, dropping record", :exception => e.cause.class,
291
+ :message => e.cause.message, :record_value => batch[i].value)
292
+ end
293
+ end
279
294
  end
280
295
  end
281
296
 
@@ -377,35 +392,4 @@ class LogStash::Outputs::Kafka < LogStash::Outputs::Base
377
392
  end
378
393
  end
379
394
 
380
- def set_trustore_keystore_config(props)
381
- unless ssl_endpoint_identification_algorithm.to_s.strip.empty?
382
- if ssl_truststore_location.nil?
383
- raise LogStash::ConfigurationError, "ssl_truststore_location must be set when SSL is enabled"
384
- end
385
- props.put("ssl.truststore.type", ssl_truststore_type) unless ssl_truststore_type.nil?
386
- props.put("ssl.truststore.location", ssl_truststore_location)
387
- props.put("ssl.truststore.password", ssl_truststore_password.value) unless ssl_truststore_password.nil?
388
- end
389
-
390
- # Client auth stuff
391
- props.put("ssl.keystore.type", ssl_keystore_type) unless ssl_keystore_type.nil?
392
- props.put("ssl.key.password", ssl_key_password.value) unless ssl_key_password.nil?
393
- props.put("ssl.keystore.location", ssl_keystore_location) unless ssl_keystore_location.nil?
394
- props.put("ssl.keystore.password", ssl_keystore_password.value) unless ssl_keystore_password.nil?
395
- props.put("ssl.endpoint.identification.algorithm", ssl_endpoint_identification_algorithm) unless ssl_endpoint_identification_algorithm.nil?
396
- end
397
-
398
- def set_sasl_config(props)
399
- java.lang.System.setProperty("java.security.auth.login.config", jaas_path) unless jaas_path.nil?
400
- java.lang.System.setProperty("java.security.krb5.conf", kerberos_config) unless kerberos_config.nil?
401
-
402
- props.put("sasl.mechanism",sasl_mechanism)
403
- if sasl_mechanism == "GSSAPI" && sasl_kerberos_service_name.nil?
404
- raise LogStash::ConfigurationError, "sasl_kerberos_service_name must be specified when SASL mechanism is GSSAPI"
405
- end
406
-
407
- props.put("sasl.kerberos.service.name", sasl_kerberos_service_name) unless sasl_kerberos_service_name.nil?
408
- props.put("sasl.jaas.config", sasl_jaas_config) unless sasl_jaas_config.nil?
409
- end
410
-
411
395
  end #class LogStash::Outputs::Kafka
data/lib/logstash/plugin_mixins/common.rb ADDED
@@ -0,0 +1,92 @@
+ module LogStash
+ module PluginMixins
+ module KafkaAvroSchemaRegistry
+
+ def self.included(base)
+ base.extend(self)
+ base.setup_schema_registry_config
+ end
+
+ def setup_schema_registry_config
+ # Option to set key to access Schema Registry.
+ config :schema_registry_key, :validate => :string
+
+ # Option to set secret to access Schema Registry.
+ config :schema_registry_secret, :validate => :password
+
+ # Option to set the endpoint of the Schema Registry.
+ # This option permit the usage of Avro Kafka deserializer which retrieve the schema of the Avro message from an
+ # instance of schema registry. If this option has value `value_deserializer_class` nor `topics_pattern` could be valued
+ config :schema_registry_url, :validate => :uri
+
+ # Option to set the proxy of the Schema Registry.
+ # This option permits to define a proxy to be used to reach the schema registry service instance.
+ config :schema_registry_proxy, :validate => :uri
+ end
+
+ def check_schema_registry_parameters
+ if @schema_registry_url
+ check_for_schema_registry_conflicts
+ @schema_registry_proxy_host, @schema_registry_proxy_port = split_proxy_into_host_and_port(schema_registry_proxy)
+ check_for_key_and_secret
+ check_for_schema_registry_connectivity_and_subjects
+ end
+ end
+
+ private
+ def check_for_schema_registry_conflicts
+ if @value_deserializer_class != LogStash::Inputs::Kafka::DEFAULT_DESERIALIZER_CLASS
+ raise LogStash::ConfigurationError, 'Option schema_registry_url prohibit the customization of value_deserializer_class'
+ end
+ if @topics_pattern && !@topics_pattern.empty?
+ raise LogStash::ConfigurationError, 'Option schema_registry_url prohibit the customization of topics_pattern'
+ end
+ end
+
+ private
+ def check_for_schema_registry_connectivity_and_subjects
+ client = Faraday.new(@schema_registry_url.to_s) do |conn|
+ if schema_registry_proxy && !schema_registry_proxy.empty?
+ conn.proxy = schema_registry_proxy.to_s
+ end
+ if schema_registry_key and !schema_registry_key.empty?
+ conn.basic_auth(schema_registry_key, schema_registry_secret.value)
+ end
+ end
+ begin
+ response = client.get('/subjects')
+ rescue Faraday::Error => e
+ raise LogStash::ConfigurationError.new("Schema registry service doesn't respond, error: #{e.message}")
+ end
+ registered_subjects = JSON.parse response.body
+ expected_subjects = @topics.map { |t| "#{t}-value"}
+ if (expected_subjects & registered_subjects).size != expected_subjects.size
+ undefined_topic_subjects = expected_subjects - registered_subjects
+ raise LogStash::ConfigurationError, "The schema registry does not contain definitions for required topic subjects: #{undefined_topic_subjects}"
+ end
+ end
+
+ def split_proxy_into_host_and_port(proxy_uri)
+ return nil unless proxy_uri && !proxy_uri.empty?
+
+ port = proxy_uri.port
+
+ host_spec = ""
+ host_spec << proxy_uri.scheme || "http"
+ host_spec << "://"
+ host_spec << "#{proxy_uri.userinfo}@" if proxy_uri.userinfo
+ host_spec << proxy_uri.host
+
+ [host_spec, port]
+ end
+
+ def check_for_key_and_secret
+ if schema_registry_key and !schema_registry_key.empty?
+ if !schema_registry_secret or schema_registry_secret.value.empty?
+ raise LogStash::ConfigurationError, "Setting `schema_registry_secret` is required when `schema_registry_key` is provided."
+ end
+ end
+ end
+ end
+ end
+ end
data/lib/logstash/plugin_mixins/kafka_support.rb ADDED
@@ -0,0 +1,29 @@
+ module LogStash module PluginMixins module KafkaSupport
+
+ def set_trustore_keystore_config(props)
+ props.put("ssl.truststore.type", ssl_truststore_type) unless ssl_truststore_type.nil?
+ props.put("ssl.truststore.location", ssl_truststore_location) unless ssl_truststore_location.nil?
+ props.put("ssl.truststore.password", ssl_truststore_password.value) unless ssl_truststore_password.nil?
+
+ # Client auth stuff
+ props.put("ssl.keystore.type", ssl_keystore_type) unless ssl_keystore_type.nil?
+ props.put("ssl.key.password", ssl_key_password.value) unless ssl_key_password.nil?
+ props.put("ssl.keystore.location", ssl_keystore_location) unless ssl_keystore_location.nil?
+ props.put("ssl.keystore.password", ssl_keystore_password.value) unless ssl_keystore_password.nil?
+ props.put("ssl.endpoint.identification.algorithm", ssl_endpoint_identification_algorithm) unless ssl_endpoint_identification_algorithm.nil?
+ end
+
+ def set_sasl_config(props)
+ java.lang.System.setProperty("java.security.auth.login.config", jaas_path) unless jaas_path.nil?
+ java.lang.System.setProperty("java.security.krb5.conf", kerberos_config) unless kerberos_config.nil?
+
+ props.put("sasl.mechanism", sasl_mechanism)
+ if sasl_mechanism == "GSSAPI" && sasl_kerberos_service_name.nil?
+ raise LogStash::ConfigurationError, "sasl_kerberos_service_name must be specified when SASL mechanism is GSSAPI"
+ end
+
+ props.put("sasl.kerberos.service.name", sasl_kerberos_service_name) unless sasl_kerberos_service_name.nil?
+ props.put("sasl.jaas.config", sasl_jaas_config) unless sasl_jaas_config.nil?
+ end
+
+ end end end
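This shared mixin maps the plugin's `ssl_*` and `sasl_*` options onto Kafka client properties for both the input and the output. A sketch of an input that exercises that path over SASL_SSL; the broker address, credentials, and paths are placeholders:

input {
  kafka {
    bootstrap_servers => "broker.internal:9093"
    topics => ["secure_topic"]
    security_protocol => "SASL_SSL"
    sasl_mechanism => "PLAIN"
    sasl_jaas_config => "org.apache.kafka.common.security.plain.PlainLoginModule required username='reader' password='secret';"
    ssl_truststore_location => "/etc/logstash/kafka-truststore.jks"
    ssl_truststore_password => "changeit"
  }
}

With the default GSSAPI mechanism, `sasl_kerberos_service_name` (and typically `jaas_path`/`kerberos_config`) must be provided instead, as enforced by `set_sasl_config`.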
data/logstash-integration-kafka.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |s|
  s.name = 'logstash-integration-kafka'
- s.version = '10.4.0'
+ s.version = '10.6.0'
  s.licenses = ['Apache-2.0']
  s.summary = "Integration with Kafka - input and output plugins"
  s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline "+
File without changes
@@ -2,6 +2,9 @@
2
2
  require "logstash/devutils/rspec/spec_helper"
3
3
  require "logstash/inputs/kafka"
4
4
  require "rspec/wait"
5
+ require "stud/try"
6
+ require "faraday"
7
+ require "json"
5
8
 
6
9
  # Please run kafka_test_setup.sh prior to executing this integration test.
7
10
  describe "inputs/kafka", :integration => true do
@@ -120,20 +123,192 @@ describe "inputs/kafka", :integration => true do
120
123
  end
121
124
  end
122
125
  end
126
+ end
127
+
128
+ private
129
+
130
+ def consume_messages(config, queue: Queue.new, timeout:, event_count:)
131
+ kafka_input = LogStash::Inputs::Kafka.new(config)
132
+ t = Thread.new { kafka_input.run(queue) }
133
+ begin
134
+ t.run
135
+ wait(timeout).for { queue.length }.to eq(event_count) unless timeout.eql?(false)
136
+ block_given? ? yield(queue, kafka_input) : queue
137
+ ensure
138
+ t.kill
139
+ t.join(30_000)
140
+ end
141
+ end
142
+
143
+
144
+ describe "schema registry connection options" do
145
+ context "remote endpoint validation" do
146
+ it "should fail if not reachable" do
147
+ config = {'schema_registry_url' => 'http://localnothost:8081'}
148
+ kafka_input = LogStash::Inputs::Kafka.new(config)
149
+ expect { kafka_input.register }.to raise_error LogStash::ConfigurationError, /Schema registry service doesn't respond.*/
150
+ end
151
+
152
+ it "should fail if any topic is not matched by a subject on the schema registry" do
153
+ config = {
154
+ 'schema_registry_url' => 'http://localhost:8081',
155
+ 'topics' => ['temperature_stream']
156
+ }
157
+
158
+ kafka_input = LogStash::Inputs::Kafka.new(config)
159
+ expect { kafka_input.register }.to raise_error LogStash::ConfigurationError, /The schema registry does not contain definitions for required topic subjects: \["temperature_stream-value"\]/
160
+ end
161
+
162
+ context "register with subject present" do
163
+ SUBJECT_NAME = "temperature_stream-value"
164
+
165
+ before(:each) do
166
+ response = save_avro_schema_to_schema_registry(File.join(Dir.pwd, "spec", "unit", "inputs", "avro_schema_fixture_payment.asvc"), SUBJECT_NAME)
167
+ expect( response.status ).to be(200)
168
+ end
123
169
 
124
- private
170
+ after(:each) do
171
+ schema_registry_client = Faraday.new('http://localhost:8081')
172
+ delete_remote_schema(schema_registry_client, SUBJECT_NAME)
173
+ end
125
174
 
126
- def consume_messages(config, queue: Queue.new, timeout:, event_count:)
127
- kafka_input = LogStash::Inputs::Kafka.new(config)
128
- t = Thread.new { kafka_input.run(queue) }
129
- begin
130
- t.run
131
- wait(timeout).for { queue.length }.to eq(event_count) unless timeout.eql?(false)
132
- block_given? ? yield(queue, kafka_input) : queue
133
- ensure
134
- t.kill
135
- t.join(30_000)
175
+ it "should correctly complete registration phase" do
176
+ config = {
177
+ 'schema_registry_url' => 'http://localhost:8081',
178
+ 'topics' => ['temperature_stream']
179
+ }
180
+ kafka_input = LogStash::Inputs::Kafka.new(config)
181
+ kafka_input.register
182
+ end
136
183
  end
137
184
  end
185
+ end
138
186
 
187
+ def save_avro_schema_to_schema_registry(schema_file, subject_name)
188
+ raw_schema = File.readlines(schema_file).map(&:chomp).join
189
+ raw_schema_quoted = raw_schema.gsub('"', '\"')
190
+ response = Faraday.post("http://localhost:8081/subjects/#{subject_name}/versions",
191
+ '{"schema": "' + raw_schema_quoted + '"}',
192
+ "Content-Type" => "application/vnd.schemaregistry.v1+json")
193
+ response
139
194
  end
195
+
196
+ def delete_remote_schema(schema_registry_client, subject_name)
197
+ expect(schema_registry_client.delete("/subjects/#{subject_name}").status ).to be(200)
198
+ expect(schema_registry_client.delete("/subjects/#{subject_name}?permanent=true").status ).to be(200)
199
+ end
200
+
201
+ # AdminClientConfig = org.alpache.kafka.clients.admin.AdminClientConfig
202
+
203
+ describe "Schema registry API", :integration => true do
204
+
205
+ let(:schema_registry) { Faraday.new('http://localhost:8081') }
206
+
207
+ context 'listing subject on clean instance' do
208
+ it "should return an empty set" do
209
+ subjects = JSON.parse schema_registry.get('/subjects').body
210
+ expect( subjects ).to be_empty
211
+ end
212
+ end
213
+
214
+ context 'send a schema definition' do
215
+ it "save the definition" do
216
+ response = save_avro_schema_to_schema_registry(File.join(Dir.pwd, "spec", "unit", "inputs", "avro_schema_fixture_payment.asvc"), "schema_test_1")
217
+ expect( response.status ).to be(200)
218
+ delete_remote_schema(schema_registry, "schema_test_1")
219
+ end
220
+
221
+ it "delete the schema just added" do
222
+ response = save_avro_schema_to_schema_registry(File.join(Dir.pwd, "spec", "unit", "inputs", "avro_schema_fixture_payment.asvc"), "schema_test_1")
223
+ expect( response.status ).to be(200)
224
+
225
+ expect( schema_registry.delete('/subjects/schema_test_1?permanent=false').status ).to be(200)
226
+ sleep(1)
227
+ subjects = JSON.parse schema_registry.get('/subjects').body
228
+ expect( subjects ).to be_empty
229
+ end
230
+ end
231
+
232
+ context 'use the schema to serialize' do
233
+ after(:each) do
234
+ expect( schema_registry.delete('/subjects/topic_avro-value').status ).to be(200)
235
+ sleep 1
236
+ expect( schema_registry.delete('/subjects/topic_avro-value?permanent=true').status ).to be(200)
237
+
238
+ Stud.try(3.times, [StandardError, RSpec::Expectations::ExpectationNotMetError]) do
239
+ wait(10).for do
240
+ subjects = JSON.parse schema_registry.get('/subjects').body
241
+ subjects.empty?
242
+ end.to be_truthy
243
+ end
244
+ end
245
+
246
+ let(:group_id_1) {rand(36**8).to_s(36)}
247
+
248
+ let(:avro_topic_name) { "topic_avro" }
249
+
250
+ let(:plain_config) do
251
+ { 'schema_registry_url' => 'http://localhost:8081',
252
+ 'topics' => [avro_topic_name],
253
+ 'codec' => 'plain',
254
+ 'group_id' => group_id_1,
255
+ 'auto_offset_reset' => 'earliest' }
256
+ end
257
+
258
+ def delete_topic_if_exists(topic_name)
259
+ props = java.util.Properties.new
260
+ props.put(Java::org.apache.kafka.clients.admin.AdminClientConfig::BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
261
+
262
+ admin_client = org.apache.kafka.clients.admin.AdminClient.create(props)
263
+ topics_list = admin_client.listTopics().names().get()
264
+ if topics_list.contains(topic_name)
265
+ result = admin_client.deleteTopics([topic_name])
266
+ result.values.get(topic_name).get()
267
+ end
268
+ end
269
+
270
+ def write_some_data_to(topic_name)
271
+ props = java.util.Properties.new
272
+ config = org.apache.kafka.clients.producer.ProducerConfig
273
+
274
+ serdes_config = Java::io.confluent.kafka.serializers.AbstractKafkaAvroSerDeConfig
275
+ props.put(serdes_config::SCHEMA_REGISTRY_URL_CONFIG, "http://localhost:8081")
276
+
277
+ props.put(config::BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
278
+ props.put(config::KEY_SERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringSerializer.java_class)
279
+ props.put(config::VALUE_SERIALIZER_CLASS_CONFIG, Java::io.confluent.kafka.serializers.KafkaAvroSerializer.java_class)
280
+
281
+ parser = org.apache.avro.Schema::Parser.new()
282
+ user_schema = '''{"type":"record",
283
+ "name":"myrecord",
284
+ "fields":[
285
+ {"name":"str_field", "type": "string"},
286
+ {"name":"map_field", "type": {"type": "map", "values": "string"}}
287
+ ]}'''
288
+ schema = parser.parse(user_schema)
289
+ avro_record = org.apache.avro.generic.GenericData::Record.new(schema)
290
+ avro_record.put("str_field", "value1")
291
+ avro_record.put("map_field", {"inner_field" => "inner value"})
292
+
293
+ producer = org.apache.kafka.clients.producer.KafkaProducer.new(props)
294
+ record = org.apache.kafka.clients.producer.ProducerRecord.new(topic_name, "avro_key", avro_record)
295
+ producer.send(record)
296
+ end
297
+
298
+ it "stored a new schema using Avro Kafka serdes" do
299
+ delete_topic_if_exists avro_topic_name
300
+ write_some_data_to avro_topic_name
301
+
302
+ subjects = JSON.parse schema_registry.get('/subjects').body
303
+ expect( subjects ).to contain_exactly("topic_avro-value")
304
+
305
+ num_events = 1
306
+ queue = consume_messages(plain_config, timeout: 30, event_count: num_events)
307
+ expect(queue.length).to eq(num_events)
308
+ elem = queue.pop
309
+ expect( elem.to_hash).not_to include("message")
310
+ expect( elem.get("str_field") ).to eq("value1")
311
+ expect( elem.get("map_field")["inner_field"] ).to eq("inner value")
312
+ end
313
+ end
314
+ end
data/spec/unit/inputs/avro_schema_fixture_payment.asvc ADDED
@@ -0,0 +1,8 @@
+ {"namespace": "io.confluent.examples.clients.basicavro",
+ "type": "record",
+ "name": "Payment",
+ "fields": [
+ {"name": "id", "type": "string"},
+ {"name": "amount", "type": "double"}
+ ]
+ }
@@ -37,6 +37,22 @@ describe LogStash::Inputs::Kafka do
37
37
  expect { subject.register }.to_not raise_error
38
38
  end
39
39
 
40
+ context "register parameter verification" do
41
+ let(:config) do
42
+ { 'schema_registry_url' => 'http://localhost:8081', 'topics' => ['logstash'], 'consumer_threads' => 4 }
43
+ end
44
+
45
+ it "schema_registry_url conflict with value_deserializer_class should fail" do
46
+ config['value_deserializer_class'] = 'my.fantasy.Deserializer'
47
+ expect { subject.register }.to raise_error LogStash::ConfigurationError, /Option schema_registry_url prohibit the customization of value_deserializer_class/
48
+ end
49
+
50
+ it "schema_registry_url conflict with topics_pattern should fail" do
51
+ config['topics_pattern'] = 'topic_.*'
52
+ expect { subject.register }.to raise_error LogStash::ConfigurationError, /Option schema_registry_url prohibit the customization of topics_pattern/
53
+ end
54
+ end
55
+
40
56
  context 'with client_rack' do
41
57
  let(:config) { super.merge('client_rack' => 'EU-R1') }
42
58
 
@@ -50,20 +50,22 @@ describe "outputs/kafka" do
50
50
  kafka.multi_receive([event])
51
51
  end
52
52
 
53
- it 'should raise config error when truststore location is not set and ssl is enabled' do
53
+ it 'should not raise config error when truststore location is not set and ssl is enabled' do
54
54
  kafka = LogStash::Outputs::Kafka.new(simple_kafka_config.merge("security_protocol" => "SSL"))
55
- expect { kafka.register }.to raise_error(LogStash::ConfigurationError, /ssl_truststore_location must be set when SSL is enabled/)
55
+ expect(org.apache.kafka.clients.producer.KafkaProducer).to receive(:new)
56
+ expect { kafka.register }.to_not raise_error
56
57
  end
57
58
  end
58
59
 
59
- context "when KafkaProducer#send() raises an exception" do
60
+ context "when KafkaProducer#send() raises a retriable exception" do
60
61
  let(:failcount) { (rand * 10).to_i }
61
62
  let(:sendcount) { failcount + 1 }
62
63
 
63
64
  let(:exception_classes) { [
64
65
  org.apache.kafka.common.errors.TimeoutException,
66
+ org.apache.kafka.common.errors.DisconnectException,
67
+ org.apache.kafka.common.errors.CoordinatorNotAvailableException,
65
68
  org.apache.kafka.common.errors.InterruptException,
66
- org.apache.kafka.common.errors.SerializationException
67
69
  ] }
68
70
 
69
71
  before do
@@ -88,6 +90,37 @@ describe "outputs/kafka" do
88
90
  end
89
91
  end
90
92
 
93
+ context "when KafkaProducer#send() raises a non-retriable exception" do
94
+ let(:failcount) { (rand * 10).to_i }
95
+
96
+ let(:exception_classes) { [
97
+ org.apache.kafka.common.errors.SerializationException,
98
+ org.apache.kafka.common.errors.RecordTooLargeException,
99
+ org.apache.kafka.common.errors.InvalidTopicException
100
+ ] }
101
+
102
+ before do
103
+ count = 0
104
+ expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
105
+ .exactly(1).times
106
+ .and_wrap_original do |m, *args|
107
+ if count < failcount # fail 'failcount' times in a row.
108
+ count += 1
109
+ # Pick an exception at random
110
+ raise exception_classes.shuffle.first.new("injected exception for testing")
111
+ else
112
+ m.call(*args) # call original
113
+ end
114
+ end
115
+ end
116
+
117
+ it "should not retry" do
118
+ kafka = LogStash::Outputs::Kafka.new(simple_kafka_config)
119
+ kafka.register
120
+ kafka.multi_receive([event])
121
+ end
122
+ end
123
+
91
124
  context "when a send fails" do
92
125
  context "and the default retries behavior is used" do
93
126
  # Fail this many times and then finally succeed.
@@ -107,7 +140,7 @@ describe "outputs/kafka" do
107
140
  # inject some failures.
108
141
 
109
142
  # Return a custom Future that will raise an exception to simulate a Kafka send() problem.
110
- future = java.util.concurrent.FutureTask.new { raise "Failed" }
143
+ future = java.util.concurrent.FutureTask.new { raise org.apache.kafka.common.errors.TimeoutException.new("Failed") }
111
144
  future.run
112
145
  future
113
146
  else
@@ -129,7 +162,7 @@ describe "outputs/kafka" do
129
162
  .once
130
163
  .and_wrap_original do |m, *args|
131
164
  # Always fail.
132
- future = java.util.concurrent.FutureTask.new { raise "Failed" }
165
+ future = java.util.concurrent.FutureTask.new { raise org.apache.kafka.common.errors.TimeoutException.new("Failed") }
133
166
  future.run
134
167
  future
135
168
  end
@@ -143,7 +176,7 @@ describe "outputs/kafka" do
143
176
  .once
144
177
  .and_wrap_original do |m, *args|
145
178
  # Always fail.
146
- future = java.util.concurrent.FutureTask.new { raise "Failed" }
179
+ future = java.util.concurrent.FutureTask.new { raise org.apache.kafka.common.errors.TimeoutException.new("Failed") }
147
180
  future.run
148
181
  future
149
182
  end
@@ -164,7 +197,7 @@ describe "outputs/kafka" do
164
197
  .at_most(max_sends).times
165
198
  .and_wrap_original do |m, *args|
166
199
  # Always fail.
167
- future = java.util.concurrent.FutureTask.new { raise "Failed" }
200
+ future = java.util.concurrent.FutureTask.new { raise org.apache.kafka.common.errors.TimeoutException.new("Failed") }
168
201
  future.run
169
202
  future
170
203
  end
@@ -175,10 +208,10 @@ describe "outputs/kafka" do
175
208
 
176
209
  it 'should only sleep retries number of times' do
177
210
  expect_any_instance_of(org.apache.kafka.clients.producer.KafkaProducer).to receive(:send)
178
- .at_most(max_sends)
211
+ .at_most(max_sends).times
179
212
  .and_wrap_original do |m, *args|
180
213
  # Always fail.
181
- future = java.util.concurrent.FutureTask.new { raise "Failed" }
214
+ future = java.util.concurrent.FutureTask.new { raise org.apache.kafka.common.errors.TimeoutException.new("Failed") }
182
215
  future.run
183
216
  future
184
217
  end
@@ -193,21 +226,31 @@ describe "outputs/kafka" do
193
226
  context 'when ssl endpoint identification disabled' do
194
227
 
195
228
  let(:config) do
196
- simple_kafka_config.merge('ssl_endpoint_identification_algorithm' => '', 'security_protocol' => 'SSL')
229
+ simple_kafka_config.merge(
230
+ 'security_protocol' => 'SSL',
231
+ 'ssl_endpoint_identification_algorithm' => '',
232
+ 'ssl_truststore_location' => truststore_path,
233
+ )
234
+ end
235
+
236
+ let(:truststore_path) do
237
+ File.join(File.dirname(__FILE__), '../../fixtures/trust-store_stub.jks')
197
238
  end
198
239
 
199
240
  subject { LogStash::Outputs::Kafka.new(config) }
200
241
 
201
- it 'does not configure truststore' do
242
+ it 'sets empty ssl.endpoint.identification.algorithm' do
202
243
  expect(org.apache.kafka.clients.producer.KafkaProducer).
203
- to receive(:new).with(hash_excluding('ssl.truststore.location' => anything))
244
+ to receive(:new).with(hash_including('ssl.endpoint.identification.algorithm' => ''))
204
245
  subject.register
205
246
  end
206
247
 
207
- it 'sets empty ssl.endpoint.identification.algorithm' do
248
+ it 'configures truststore' do
208
249
  expect(org.apache.kafka.clients.producer.KafkaProducer).
209
- to receive(:new).with(hash_including('ssl.endpoint.identification.algorithm' => ''))
250
+ to receive(:new).with(hash_including('ssl.truststore.location' => truststore_path))
210
251
  subject.register
211
252
  end
253
+
212
254
  end
255
+
213
256
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: logstash-integration-kafka
3
3
  version: !ruby/object:Gem::Version
4
- version: 10.4.0
4
+ version: 10.6.0
5
5
  platform: java
6
6
  authors:
7
7
  - Elastic
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-07-03 00:00:00.000000000 Z
11
+ date: 2020-10-28 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  requirement: !ruby/object:Gem::Requirement
@@ -183,15 +183,28 @@ files:
183
183
  - lib/logstash-integration-kafka_jars.rb
184
184
  - lib/logstash/inputs/kafka.rb
185
185
  - lib/logstash/outputs/kafka.rb
186
+ - lib/logstash/plugin_mixins/common.rb
187
+ - lib/logstash/plugin_mixins/kafka_support.rb
186
188
  - logstash-integration-kafka.gemspec
189
+ - spec/fixtures/trust-store_stub.jks
187
190
  - spec/integration/inputs/kafka_spec.rb
188
191
  - spec/integration/outputs/kafka_spec.rb
192
+ - spec/unit/inputs/avro_schema_fixture_payment.asvc
189
193
  - spec/unit/inputs/kafka_spec.rb
190
194
  - spec/unit/outputs/kafka_spec.rb
191
- - vendor/jar-dependencies/com/github/luben/zstd-jni/1.4.3-1/zstd-jni-1.4.3-1.jar
192
- - vendor/jar-dependencies/org/apache/kafka/kafka-clients/2.4.1/kafka-clients-2.4.1.jar
193
- - vendor/jar-dependencies/org/lz4/lz4-java/1.6.0/lz4-java-1.6.0.jar
194
- - vendor/jar-dependencies/org/slf4j/slf4j-api/1.7.28/slf4j-api-1.7.28.jar
195
+ - vendor/jar-dependencies/com/github/luben/zstd-jni/1.4.4-7/zstd-jni-1.4.4-7.jar
196
+ - vendor/jar-dependencies/io/confluent/common-config/5.5.1/common-config-5.5.1.jar
197
+ - vendor/jar-dependencies/io/confluent/common-utils/5.5.1/common-utils-5.5.1.jar
198
+ - vendor/jar-dependencies/io/confluent/kafka-avro-serializer/5.5.1/kafka-avro-serializer-5.5.1.jar
199
+ - vendor/jar-dependencies/io/confluent/kafka-schema-registry-client/5.5.1/kafka-schema-registry-client-5.5.1.jar
200
+ - vendor/jar-dependencies/io/confluent/kafka-schema-serializer/5.5.1/kafka-schema-serializer-5.5.1.jar
201
+ - vendor/jar-dependencies/javax/ws/rs/javax.ws.rs-api/2.1.1/javax.ws.rs-api-2.1.1.jar
202
+ - vendor/jar-dependencies/org/apache/avro/avro/1.9.2/avro-1.9.2.jar
203
+ - vendor/jar-dependencies/org/apache/kafka/kafka-clients/2.5.1/kafka-clients-2.5.1.jar
204
+ - vendor/jar-dependencies/org/apache/kafka/kafka_2.12/2.5.1/kafka_2.12-2.5.1.jar
205
+ - vendor/jar-dependencies/org/glassfish/jersey/core/jersey-common/2.30/jersey-common-2.30.jar
206
+ - vendor/jar-dependencies/org/lz4/lz4-java/1.7.1/lz4-java-1.7.1.jar
207
+ - vendor/jar-dependencies/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar
195
208
  - vendor/jar-dependencies/org/xerial/snappy/snappy-java/1.1.7.3/snappy-java-1.1.7.3.jar
196
209
  homepage: http://www.elastic.co/guide/en/logstash/current/index.html
197
210
  licenses:
@@ -222,7 +235,9 @@ signing_key:
222
235
  specification_version: 4
223
236
  summary: Integration with Kafka - input and output plugins
224
237
  test_files:
238
+ - spec/fixtures/trust-store_stub.jks
225
239
  - spec/integration/inputs/kafka_spec.rb
226
240
  - spec/integration/outputs/kafka_spec.rb
241
+ - spec/unit/inputs/avro_schema_fixture_payment.asvc
227
242
  - spec/unit/inputs/kafka_spec.rb
228
243
  - spec/unit/outputs/kafka_spec.rb