logstash-filter-kafka_time_machine 0.5.0 → 2.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f737fcbdc55c777f9ff5000e778c8fd26ceda00fdb87ee48ecb6c65d7dafaf90
- data.tar.gz: 5c4f1cd0dc80f63cd97a93701c2d12d79e0c97a93f913187c6dcaa1e19ca306a
+ metadata.gz: dcb429711e99220eb57f095154d68f2b3f477956587762c4ba03ef7bc434ba30
+ data.tar.gz: 3c66923274a218187bef908747091baa540523729c6d961f02ab002eb6701fce
  SHA512:
- metadata.gz: e0549b7084ce70c4af5e94d249ebf244d215f4bffbdc4db5c9b0356041cc50b736e0e2dcf0680b0d049b0f38402a9f3931aa2eae6ace9b41cd3628ae04c73fe0
- data.tar.gz: '04338e54be18ff407221ed442d62d641cbcf4391adb5cd85d919efa0c37fe158f18bcfa88cace01c69f8ed44b7156a1ac96e9e6b42124fed87a1873d4e49eee1'
+ metadata.gz: 840b0fbdef1e7096c2e51cdaac3a4ef38e5e7830fb8263e8511d8e09116a13a92a90dcf13ffb72ffef51da4f66aaf898235d30ee50c6f7adddab3c79a428bb80
+ data.tar.gz: 6ffbb0731c74f2b7168dccd79f4a78f00b168baab2d5d111ac9a872b39c24f3aca8fa5b92e67053a7f5abef71a0fc44561604f56f669c40812a10c237e5b65e9
data/README.md CHANGED
@@ -1,3 +1,374 @@
- # logstash-filter-kafka_time_machine
+ [![Gem Version](https://badge.fury.io/rb/logstash-filter-kafka_time_machine.svg)](https://badge.fury.io/rb/logstash-filter-kafka_time_machine)
 
- TBD
+ # Logstash Plugin: logstash-filter-kafka_time_machine
+
+ This is a filter plugin for [Logstash](https://github.com/elastic/logstash).
+
+ ## Description
+
+ This filter plugin generates new events in the logstash pipeline. These events are built from fields that are extracted from the original event and passed to the filter. The new events carry metrics for log events that have traversed multiple kafka and logstash blocks, ready for aggregation.
+
+ The typical flow for log events:
+
+ ```
+
+ Service Log ---> kafka_shipper <--- logstash_shipper ---> | ott_network_link | ---> kafka_indexer <--- logstash_indexer ---> elastic_search
+
+ ```
+
+ The filter leverages metadata inserted into the log event on both `logstash_shipper` and `logstash_indexer` nodes to track dwell time of log events through this pipeline.
+
+ ## Kafka Time Machine Result
+
+ When the `kafka_time_machine` executes it will return an [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) formatted metric, e.g.:
+
+ ```
+ ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000
+ ```
+
+ The plugin will also emit a metric if an error was encountered, e.g.:
+
+ ```
+ ktm_error,datacenter=kafka_datacenter_shipper-test,owner=ktm_test@cisco.com,source=shipper count=1i 1634662795000000000
+ ```
+
+ To ensure a logstash `output{}` block can properly route this metric, the new events are tagged with a `[@metadata][ktm_tags][ktm_metric]` field, e.g.:
+
+ ```
+ {
+   "ktm_metric" => "ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000",
+   "@timestamp" => 2021-10-20T23:46:24.704Z,
+   "@metadata" => {
+     "ktm_tags" => {
+       "ktm_metric" => "true"
+     }
+   },
+   "@version" => "1"
+ }
+ ```
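+
+ One way to route these events (a sketch only; the destination is simply `stdout` here, and any output that accepts raw line protocol could be substituted) is to test that metadata tag in the `output{}` block:
+
+ ```
+ output {
+   if [@metadata][ktm_tags][ktm_metric] == "true" {
+     # Emit the pre-formatted line-protocol string as-is
+     stdout { codec => line { format => "%{ktm_metric}" } }
+   }
+ }
+ ```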
+
+ ### Metric Event Breakdown
+
+ The `kafka_time_machine` can insert one or more new events into the pipeline. The `ktm_metric` created will be one of:
+
+ - `ktm`
+ - `ktm_error`
+
+ In the case of `ktm` the metric breakdown is:
+
+ | Line Protocol Element | Line Protocol Type | Description                                 |
+ | --------------------- | ------------------ | ------------------------------------------- |
+ | datacenter            | tag                | Echo of `kafka_datacenter_shipper`          |
+ | lag_type              | tag                | Calculated lag type                         |
+ | owner                 | tag                | Echo of `event_owner`                       |
+ | lag_ms                | field              | Calculated lag in milliseconds              |
+ | payload_size_bytes    | field              | Calculated size of `payload` field in bytes |
+
+ Meaning of `lag_type`:
+
+ - `total`: Lag calculated includes dwell time on both shipper and indexer
+ - `indexer`: Lag calculated is dwell time for indexer only. Insufficient data provided for shipper to compute `total` lag.
+ - `shipper`: Lag calculated is dwell time for shipper only. Insufficient data provided for indexer to compute `total` lag.
+
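+ For example, with a hypothetical `kafka_append_time_shipper` of `1634662794700` and `logstash_kafka_read_time_indexer` of `1634662795000`, the `total` lag works out to `lag_ms=300i`, as in the sample metric above.
+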
+ In the case of `ktm_error` the metric breakdown is:
+
+ | Line Protocol Element | Line Protocol Type | Description                          |
+ | --------------------- | ------------------ | ------------------------------------ |
+ | datacenter            | tag                | Echo of `kafka_datacenter_shipper`   |
+ | source                | tag                | Source of the error metric           |
+ | owner                 | tag                | Echo of `event_owner`                |
+ | count                 | field              | Count to track error; not cumulative |
+
+ Meaning of `source`:
+
+ - `indexer`: Insufficient data provided for indexer to compute `total` lag.
+ - `shipper`: Insufficient data provided for shipper to compute `total` lag.
+ - `insufficient_data`: Insufficient data provided for both indexer and shipper to compute `total` lag.
+ - `unknown`: Unknown error encountered.
+
+ ### Metric Event Timestamp
+
+ When the `kafka_time_machine` generates the [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) metric it must also set the timestamp on the event. To ensure the caller of the filter has control of this, the `event_time_ms` configuration is used to set the metric timestamp.
+
+ For example, if `event_time_ms` is provided as `1634662795000` the resulting metric would be:
+
+ ```
+ ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000
+ ```
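+
+ Note that `event_time_ms` is given in milliseconds while InfluxDB Line Protocol timestamps are in nanoseconds; the plugin multiplies the supplied value by 1,000,000, so `1634662795000` becomes `1634662795000000000`.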
+
+ ## Kafka Time Machine Configuration Options
+
+ This plugin requires the following configuration options:
+
+ | Setting                                                                | Input Type | Required |
+ | ---------------------------------------------------------------------- | ---------- | -------- |
+ | [kafka_datacenter_shipper](#kafka_datacenter_shipper)                  | string     | Yes      |
+ | [kafka_topic_shipper](#kafka_topic_shipper)                            | string     | Yes      |
+ | [kafka_consumer_group_shipper](#kafka_consumer_group_shipper)          | string     | Yes      |
+ | [kafka_append_time_shipper](#kafka_append_time_shipper)                | string     | Yes      |
+ | [logstash_kafka_read_time_shipper](#logstash_kafka_read_time_shipper)  | string     | Yes      |
+ | [kafka_topic_indexer](#kafka_topic_indexer)                            | string     | Yes      |
+ | [kafka_consumer_group_indexer](#kafka_consumer_group_indexer)          | string     | Yes      |
+ | [kafka_append_time_indexer](#kafka_append_time_indexer)                | string     | Yes      |
+ | [logstash_kafka_read_time_indexer](#logstash_kafka_read_time_indexer)  | string     | Yes      |
+ | [event_owner](#event_owner)                                            | string     | Yes      |
+ | [event_time_ms](#event_time_ms)                                        | string     | Yes      |
+
+ > Why are all settings required?
+ >
+ >> This was a design decision based on the use case: tracking a Kafka "lag by time" metric without knowing the topic and consumer group would be essentially useless. By leveraging the [Kafka input `decorate_events`](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#_metadata_fields) feature we know we'll always have the required fields.
+ >>
+ >> While they are required, they can be passed as empty strings. The plugin will handle these cases; e.g. if the `kafka_consumer_group_shipper` name is an empty string, only `indexer` results are returned. A sketch of a complete configuration is shown below, followed by the details for each option.
+
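+ As an illustrative sketch of how the pieces fit together (the `[ktm_shipper][...]`, `[@metadata][indexer_read_time_ms]`, `[service][owner]`, and `[@metadata][event_time_ms]` field names are placeholders, not fields this plugin creates; the `[@metadata][kafka][...]` fields assume the Kafka input runs with `decorate_events` enabled; and the 2.0.x source in this release additionally requires `elasticsearch_cluster` and `elasticsearch_cluster_index`):
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_datacenter_shipper         => "%{[ktm_shipper][datacenter]}"
+     kafka_topic_shipper              => "%{[ktm_shipper][topic]}"
+     kafka_consumer_group_shipper     => "%{[ktm_shipper][consumer_group]}"
+     kafka_append_time_shipper        => "%{[ktm_shipper][kafka_append_time]}"
+     logstash_kafka_read_time_shipper => "%{[ktm_shipper][logstash_read_time]}"
+     kafka_topic_indexer              => "%{[@metadata][kafka][topic]}"
+     kafka_consumer_group_indexer     => "%{[@metadata][kafka][consumer_group]}"
+     kafka_append_time_indexer        => "%{[@metadata][kafka][timestamp]}"
+     logstash_kafka_read_time_indexer => "%{[@metadata][indexer_read_time_ms]}"
+     event_owner                      => "%{[service][owner]}"
+     event_time_ms                    => "%{[@metadata][event_time_ms]}"
+     elasticsearch_cluster            => "es_cluster-test"
+     elasticsearch_cluster_index      => "es_cluster_index-test"
+   }
+ }
+ ```
+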
+ ### kafka_datacenter_shipper
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the datacenter the log event originated from, i.e. the datacenter `kafka_shipper` is in. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_datacenter_shipper => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_datacenter_shipper => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_topic_shipper
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the Kafka topic the log event was read from on the shipper. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_topic_shipper => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_topic_shipper => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_consumer_group_shipper
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the Kafka consumer group used to read the log event on the shipper. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_consumer_group_shipper => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_consumer_group_shipper => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_append_time_shipper
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the epoch time in milliseconds the log event was added to `kafka_shipper`. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_append_time_shipper => 1624394191000
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_append_time_shipper => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### logstash_kafka_read_time_shipper
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the epoch time in milliseconds the log event was read from `kafka_shipper` by logstash. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     logstash_kafka_read_time_shipper => 1624394191000
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     logstash_kafka_read_time_shipper => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_topic_indexer
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the Kafka topic the log event was read from on the indexer. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_topic_indexer => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_topic_indexer => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_consumer_group_indexer
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the Kafka consumer group used to read the log event on the indexer. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_consumer_group_indexer => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_consumer_group_indexer => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### kafka_append_time_indexer
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the epoch time in milliseconds the log event was added to `kafka_indexer`. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_append_time_indexer => 1624394191000
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     kafka_append_time_indexer => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### logstash_kafka_read_time_indexer
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the epoch time in milliseconds the log event was read from `kafka_indexer` by logstash. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     logstash_kafka_read_time_indexer => 1624394191000
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     logstash_kafka_read_time_indexer => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### event_owner
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the event owner; this represents the owner of the log. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     event_owner => "static_field"
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     event_owner => "%{[dynamic_field]}"
+   }
+ }
+ ```
+
+ ### event_time_ms
+
+ - Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
+ - There is no default value for this setting.
+
+ Provide the epoch time in milliseconds at which this event is being processed. This time will be appended to the generated InfluxDB Line Protocol metric. Field values can be static or dynamic:
+
+ ```
+ filter {
+   kafka_time_machine {
+     event_time_ms => 1624394191000
+   }
+ }
+ ```
+
+ ```
+ filter {
+   kafka_time_machine {
+     event_time_ms => "%{[dynamic_field]}"
+   }
+ }
+ ```
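+
+ Since `event_time_ms` usually needs to be "now", one way to populate it is with the stock `ruby` filter ahead of `kafka_time_machine` (a sketch only; the `[@metadata][event_time_ms]` field name is just an example):
+
+ ```
+ filter {
+   ruby {
+     # Record the current wall-clock time in epoch milliseconds
+     code => "event.set('[@metadata][event_time_ms]', (Time.now.to_f * 1000).to_i)"
+   }
+   kafka_time_machine {
+     event_time_ms => "%{[@metadata][event_time_ms]}"
+     # ... plus the other required settings
+   }
+ }
+ ```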
data/lib/logstash/filters/kafka_time_machine.rb ADDED
@@ -0,0 +1,252 @@
+ # encoding: utf-8
+ require "logstash/filters/base"
+ require "logstash/namespace"
+ require "logstash/event"
+ require "influxdb-client"
+
+ class LogStash::Filters::KafkaTimeMachine < LogStash::Filters::Base
+
+   config_name "kafka_time_machine"
+
+   # Datacenter the kafka message originated from.
+   config :kafka_datacenter_shipper, :validate => :string, :required => true
+
+   # Kafka Topic on shipper datacenter
+   config :kafka_topic_shipper, :validate => :string, :required => true
+
+   # Kafka Consumer Group on shipper datacenter
+   config :kafka_consumer_group_shipper, :validate => :string, :required => true
+
+   # Time message was appended to kafka on shipper datacenter
+   config :kafka_append_time_shipper, :validate => :string, :required => true
+
+   # Time message read from kafka by logstash on shipper datacenter
+   config :logstash_kafka_read_time_shipper, :validate => :string, :required => true
+
+   # Kafka Topic on indexer datacenter
+   config :kafka_topic_indexer, :validate => :string, :required => true
+
+   # Kafka Consumer Group on indexer datacenter
+   config :kafka_consumer_group_indexer, :validate => :string, :required => true
+
+   # Time message was appended to kafka on indexer datacenter
+   config :kafka_append_time_indexer, :validate => :string, :required => true
+
+   # Time message read from kafka by logstash on indexer datacenter
+   config :logstash_kafka_read_time_indexer, :validate => :string, :required => true
+
+   # Owner of the event currently being processed.
+   config :event_owner, :validate => :string, :required => true
+
+   # Current time since EPOCH in ms that should be set in the influxdb generated metric
+   config :event_time_ms, :validate => :string, :required => true
+
+   # Elasticsearch cluster the event is destined for; echoed as the es_cluster tag in the generated metric
+   config :elasticsearch_cluster, :validate => :string, :required => true
+
+   # Elasticsearch index the event is destined for; echoed as the es_cluster_index tag in the generated metric
+   config :elasticsearch_cluster_index, :validate => :string, :required => true
+
+   public
+   def register
+
+   end
+
+   public
+   def filter(event)
+
+     @logger.debug("Starting filter calculations")
+
+     # Note - It was considered to error check for strings that are invalid, i.e. "%{[@metadata][ktm][kafka_datacenter_shipper]}". However, this string being present is a good way to identify
+     # shipper/indexer logstash configs that are wrong, so it's allowed to pass through unaltered.
+     #
+     # Extract all string values to local variables.
+     event_owner = event.sprintf(@event_owner)
+     shipper_kafka_datacenter = event.sprintf(@kafka_datacenter_shipper)
+     shipper_kafka_topic = event.sprintf(@kafka_topic_shipper)
+     shipper_kafka_consumer_group = event.sprintf(@kafka_consumer_group_shipper)
+     indexer_kafka_topic = event.sprintf(@kafka_topic_indexer)
+     indexer_kafka_consumer_group = event.sprintf(@kafka_consumer_group_indexer)
+     elasticsearch_cluster = event.sprintf(@elasticsearch_cluster)
+     elasticsearch_cluster_index = event.sprintf(@elasticsearch_cluster_index)
+
+     # Extract all the "time" related values to local variables. These need special handling due to the Float() operation.
+     #
+     # We must check for a valid numeric value; otherwise the Float() operation would raise an error and stop the logstash pipeline.
+     event_time_ms = get_numeric(event.sprintf(@event_time_ms))
+     shipper_kafka_append_time = get_numeric(event.sprintf(@kafka_append_time_shipper))
+     shipper_logstash_kafka_read_time = get_numeric(event.sprintf(@logstash_kafka_read_time_shipper))
+     indexer_kafka_append_time = get_numeric(event.sprintf(@kafka_append_time_indexer))
+     indexer_logstash_kafka_read_time = get_numeric(event.sprintf(@logstash_kafka_read_time_indexer))
+
+     # Validate the shipper data
+     shipper_kafka_array = Array[shipper_kafka_datacenter, shipper_kafka_topic, shipper_kafka_consumer_group, shipper_kafka_append_time, shipper_logstash_kafka_read_time, event_owner, event_time_ms, elasticsearch_cluster, elasticsearch_cluster_index]
+     if (shipper_kafka_array.any? { |text| text.nil? || text.to_s.empty? })
+       @logger.debug("shipper_kafka_array invalid: Found null")
+       error_string_shipper = sprintf("Error in shipper data: %s", shipper_kafka_array)
+       @logger.debug(error_string_shipper)
+       shipper_valid = false
+     else
+       @logger.debug("shipper_kafka_array valid")
+       shipper_valid = true
+       shipper_logstash_kafka_read_time = shipper_logstash_kafka_read_time.to_i
+       shipper_kafka_append_time = shipper_kafka_append_time.to_i
+       shipper_kafka_lag_ms = shipper_logstash_kafka_read_time - shipper_kafka_append_time
+     end
+
+     # Validate the indexer data
+     indexer_kafka_array = Array[shipper_kafka_datacenter, indexer_kafka_topic, indexer_kafka_consumer_group, indexer_kafka_append_time, indexer_logstash_kafka_read_time, event_owner, event_time_ms, elasticsearch_cluster, elasticsearch_cluster_index]
+     if (indexer_kafka_array.any? { |text| text.nil? || text.to_s.empty? })
+       @logger.debug("indexer_kafka_array invalid: Found null")
+       error_string_indexer = sprintf("Error in indexer data: %s", indexer_kafka_array)
+       @logger.debug(error_string_indexer)
+       indexer_valid = false
+     else
+       @logger.debug("indexer_kafka_array valid")
+       indexer_valid = true
+       indexer_logstash_kafka_read_time = indexer_logstash_kafka_read_time.to_i
+       indexer_kafka_append_time = indexer_kafka_append_time.to_i
+       indexer_kafka_lag_ms = indexer_logstash_kafka_read_time - indexer_kafka_append_time
+     end
+
+     # Add in the size of the payload field
+     payload_bytesize = 0
+     if event.get("[payload]")
+       payload_bytesize = event.get("[payload]").bytesize
+     end
+
+     # Set time (nanoseconds) for influxdb line protocol
+     epoch_time_ns = nil
+     if (event_time_ms != nil)
+       epoch_time_ns = event_time_ms * 1000000
+     end
+
+     # Create array to hold one or more ktm metric events
+     ktm_metric_event_array = Array.new
+
+     # Populate the event and set tags
+     if (shipper_valid == true && indexer_valid == true && epoch_time_ns != nil)
+       total_kafka_lag_ms = indexer_logstash_kafka_read_time - shipper_kafka_append_time
+
+       point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "total", total_kafka_lag_ms, epoch_time_ns, elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+     elsif (shipper_valid == true && indexer_valid == false && epoch_time_ns != nil)
+       point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "shipper", shipper_kafka_lag_ms, epoch_time_ns, elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+       point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "indexer", elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+     elsif (indexer_valid == true && shipper_valid == false && epoch_time_ns != nil)
+       point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "indexer", indexer_kafka_lag_ms, epoch_time_ns, elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+       point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "shipper", elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+     elsif (indexer_valid == false && shipper_valid == false)
+
+       point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "insufficient_data", elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+       error_string = sprintf("Error kafka_time_machine: Could not build valid response --> %s, %s", error_string_shipper, error_string_indexer)
+       @logger.debug(error_string)
+
+     else
+
+       point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "unknown", elasticsearch_cluster, elasticsearch_cluster_index)
+       ktm_metric_event_array.push point_influxdb
+
+       error_string = "Unknown error encountered"
+       @logger.debug(error_string)
+
+     end
+
+     # Publish every event in our array
+     ktm_metric_event_array.each do |metric_event|
+
+       # Create new event for KTM metric
+       event_ktm = LogStash::Event.new
+
+       event_ktm.set("ktm_metric", metric_event)
+       event_ktm.set("[@metadata][ktm_tags][ktm_metric]", "true")
+
+       filter_matched(event_ktm)
+       yield event_ktm
+
+     end
+
+   end # def filter
+
+   # Creates an InfluxDB line-protocol data point to return
+   public
+   def create_influxdb_point_ktm(datacenter, event_owner, payload_size_bytes, lag_type, lag_ms, epoch_time_ns, elasticsearch_cluster, elasticsearch_cluster_index)
+
+     point = InfluxDB2::Point.new(name: "ktm",
+                                  tags: {datacenter: datacenter, owner: event_owner, lag_type: lag_type, es_cluster: elasticsearch_cluster, es_cluster_index: elasticsearch_cluster_index},
+                                  fields: {payload_size_bytes: payload_size_bytes, lag_ms: lag_ms},
+                                  time: epoch_time_ns)
+
+     point_influxdb = point.to_line_protocol
+     return point_influxdb
+
+   end # def create_influxdb_point_ktm
+
+   # Creates an InfluxDB line-protocol data point to return
+   public
+   def create_influxdb_point_ktm_error(datacenter, event_owner, epoch_time_ns, type, elasticsearch_cluster, elasticsearch_cluster_index)
+
+     # Check for nil values
+     if (nil == datacenter)
+       datacenter = "unknown"
+     end
+
+     if (nil == event_owner)
+       event_owner = "unknown"
+     end
+
+     # Set time if we didn't receive it
+     if (nil == epoch_time_ns)
+       epoch_time_ns = ((Time.now.to_f * 1000).to_i) * 1000000
+     end
+
+     point = InfluxDB2::Point.new(name: "ktm_error",
+                                  tags: {datacenter: datacenter, owner: event_owner, source: type, es_cluster: elasticsearch_cluster, es_cluster_index: elasticsearch_cluster_index},
+                                  fields: {count: 1},
+                                  time: epoch_time_ns)
+
+     point_influxdb = point.to_line_protocol
+     return point_influxdb
+
+   end # def create_influxdb_point_ktm_error
+
+   # Ensures the provided value is numeric; if not, returns nil
+   public
+   def get_numeric(input_str)
+
+     @logger.debug("get_numeric operating on: #{input_str}")
+
+     is_numeric = input_str.to_s.match(/\A[+-]?\d+?(\.\d+)?\Z/) == nil ? false : true
+     if (true == is_numeric)
+       @logger.debug("get_numeric - valid value provided")
+       num_value = Float(input_str)
+
+       if (false == num_value.positive?)
+         @logger.debug("get_numeric - non-positive value provided")
+         num_value = nil
+       end
+
+     else
+       @logger.debug("get_numeric - invalid value provided")
+       num_value = nil
+     end
+
+     @logger.debug("get_numeric response --> #{num_value}")
+
+     return num_value
+
+   end # def get_numeric
+
+ end # class LogStash::Filters::KafkaTimeMachine
data/logstash-filter-kafka_time_machine.gemspec CHANGED
@@ -1,6 +1,6 @@
  Gem::Specification.new do |s|
    s.name = 'logstash-filter-kafka_time_machine'
-   s.version = '0.5.0'
+   s.version = '2.0.1'
    s.licenses = ['Apache-2.0']
    s.summary = "Calculate total time of logstash event that traversed 2 Kafka queues from a shipper site to an indexer site"
    s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
@@ -20,5 +20,6 @@ Gem::Specification.new do |s|
 
    # Gem dependencies
    s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99"
+   s.add_runtime_dependency "influxdb-client", "~> 2.0.0"
    s.add_development_dependency 'logstash-devutils', '~> 0'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-filter-kafka_time_machine
  version: !ruby/object:Gem::Version
-   version: 0.5.0
+   version: 2.0.1
  platform: ruby
  authors:
  - Chris Foster
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2021-06-17 00:00:00.000000000 Z
+ date: 2021-10-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: logstash-core-plugin-api
@@ -30,6 +30,20 @@ dependencies:
      - - "<="
        - !ruby/object:Gem::Version
          version: '2.99'
+ - !ruby/object:Gem::Dependency
+   name: influxdb-client
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 2.0.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: 2.0.0
  - !ruby/object:Gem::Dependency
    name: logstash-devutils
    requirement: !ruby/object:Gem::Requirement
@@ -54,7 +68,7 @@ extra_rdoc_files: []
  files:
  - Gemfile
  - README.md
- - lib/logstash/filters/kafkatimemachine.rb
+ - lib/logstash/filters/kafka_time_machine.rb
  - logstash-filter-kafka_time_machine.gemspec
  homepage: http://www.elastic.co/guide/en/logstash/current/index.html
  licenses:
data/lib/logstash/filters/kafkatimemachine.rb DELETED
@@ -1,83 +0,0 @@
- # encoding: utf-8
- require "logstash/filters/base"
- require "logstash/namespace"
- require "logstash/event"
-
- class LogStash::Filters::KafkaTimeMachine < LogStash::Filters::Base
-
-   config_name "kafkatimemachine"
-
-   public
-   def register
-
-   end
-
-   public
-   def filter(event)
-
-     # Extract shipper data and check for validity; note that kafka_datacenter_shipper is used for both shipper and indexer arrays
-     kafka_datacenter_shipper = event.get("[@metadata][kafka_datacenter_shipper]")
-     kafka_topic_shipper = event.get("[@metadata][kafka_topic_shipper]")
-     kafka_consumer_group_shipper = event.get("[@metadata][kafka_consumer_group_shipper]")
-     kafka_append_time_shipper = Float(event.get("[@metadata][kafka_append_time_shipper]")) rescue nil
-     logstash_kafka_read_time_shipper = Float(event.get("[@metadata][logstash_kafka_read_time_shipper]")) rescue nil
-
-     kafka_shipper_array = Array[kafka_datacenter_shipper, kafka_topic_shipper, kafka_consumer_group_shipper, kafka_append_time_shipper, logstash_kafka_read_time_shipper]
-     @logger.debug("kafka_shipper_array: #{kafka_shipper_array}")
-
-     if (kafka_shipper_array.any? { |text| text.nil? || text.to_s.empty? })
-       @logger.debug("kafka_shipper_array invalid: Found null")
-       error_string_shipper = "Error in shipper data: #{kafka_shipper_array}"
-       shipper_valid = false
-     else
-       @logger.debug("kafka_shipper_array valid")
-       shipper_valid = true
-       logstash_kafka_read_time_shipper = logstash_kafka_read_time_shipper.to_i
-       kafka_append_time_shipper = kafka_append_time_shipper.to_i
-       kafka_shipper_lag_ms = logstash_kafka_read_time_shipper - kafka_append_time_shipper
-     end
-
-     # Extract indexer data and check for validity
-     kafka_topic_indexer = event.get("[@metadata][kafka_topic_indexer]")
-     kafka_consumer_group_indexer = event.get("[@metadata][kafka_consumer_group_indexer]")
-     kafka_append_time_indexer = Float(event.get("[@metadata][kafka_append_time_indexer]")) rescue nil
-     logstash_kafka_read_time_indexer = Float(event.get("[@metadata][logstash_kafka_read_time_indexer]")) rescue nil
-
-     kafka_indexer_array = Array[kafka_datacenter_shipper, kafka_topic_indexer, kafka_consumer_group_indexer, kafka_append_time_indexer, logstash_kafka_read_time_indexer]
-     @logger.debug("kafka_indexer_array: #{kafka_indexer_array}")
-
-     if (kafka_indexer_array.any? { |text| text.nil? || text.to_s.empty? })
-       @logger.debug("kafka_indexer_array invalid: Found null")
-       error_string_indexer = "Error in indexer data: #{kafka_indexer_array}"
-       indexer_valid = false
-     else
-       @logger.debug("kafka_indexer_array valid")
-       indexer_valid = true
-       logstash_kafka_read_time_indexer = logstash_kafka_read_time_indexer.to_i
-       kafka_append_time_indexer = kafka_append_time_indexer.to_i
-       kafka_indexer_lag_ms = logstash_kafka_read_time_indexer - kafka_append_time_indexer
-     end
-
-     if (shipper_valid == true && indexer_valid == true)
-       kafka_total_lag_ms = logstash_kafka_read_time_indexer - kafka_append_time_shipper
-       event.set("[ktm]", {"lag_total_ms" => kafka_total_lag_ms, "lag_indexer" => kafka_indexer_lag_ms, "lag_shipper" => kafka_shipper_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_indexer" => kafka_topic_indexer, "kafka_consumer_group_indexer" => kafka_consumer_group_indexer, "kafka_topic_shipper" => kafka_topic_shipper, "kafka_consumer_group_shipper" => kafka_consumer_group_shipper, "tags" => ["ktm_lag_total"] })
-     elsif (shipper_valid == true && indexer_valid == false)
-       event.set("[ktm]", {"lag_shipper_ms" => kafka_shipper_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_shipper" => kafka_topic_shipper, "kafka_consumer_group_shipper" => kafka_consumer_group_shipper, "tags" => ["ktm_lag_shipper"] })
-     elsif (indexer_valid == true && shipper_valid == false)
-       event.set("[ktm]", {"lag_indexer_ms" => kafka_indexer_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_indexer" => kafka_topic_indexer, "kafka_consumer_group_indexer" => kafka_consumer_group_indexer, "tags" => ["ktm_lag_indexer"] })
-     elsif (indexer_valid == false && shipper_valid == false)
-       @logger.debug("Error kafkatimemachine: Could not build valid response --> #{error_string_shipper}, #{error_string_indexer}")
-     end
-
-     # Add in the size of the payload field
-     if event.get("[payload]")
-       payload_bytesize = event.get("[payload]").bytesize
-       event.set("[ktm][payload_size_bytes]", payload_bytesize)
-     end
-
-     # filter_matched should go in the last line of our successful code
-     filter_matched(event)
-
-   end # def filter
-
- end # class LogStash::Filters::KafkaTimeMachine