logstash-filter-kafka_time_machine 0.4.0 → 2.0.0
- checksums.yaml +4 -4
- data/README.md +373 -2
- data/lib/logstash/filters/kafka_time_machine.rb +242 -0
- data/logstash-filter-kafka_time_machine.gemspec +2 -1
- metadata +17 -3
- data/lib/logstash/filters/kafkatimemachine.rb +0 -77
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a0ab8be37285b43b8785aea650cc6eb6d456d701ba36aeb0d5c2cbcb020d470d
+  data.tar.gz: d6ccd08fa8024d7cced6c529708ad5cd5f15ef683c043206966c58b989d28f42
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 73b806dbef6c52765e674dc7acec647c90427ff40270c9665eaf07c29818ecca558c564cb812aa9cd81601a9b3009ccdd5b7d465ef18352a4f146c195f9daf16
+  data.tar.gz: 3f318fbafd7bd1283599ac68b4d7b29ea9187f7772d8353235d0d4d2cc06be34afc2bf440d226600ba125ad4a9a34d058d77f05c87c94c45d9f3339cbdc79e31
data/README.md
CHANGED
@@ -1,3 +1,374 @@
[![Gem Version](https://badge.fury.io/rb/logstash-filter-kafka_time_machine.svg)](https://badge.fury.io/rb/logstash-filter-kafka_time_machine)

# Logstash Plugin: logstash-filter-kafka_time_machine

This is a filter plugin for [Logstash](https://github.com/elastic/logstash).

## Description

This filter plugin generates new events in the Logstash pipeline. These events are built from fields that are extracted from the original event and passed to the filter, and they carry metrics for log events that have traversed multiple Kafka and Logstash blocks, for aggregation.

The typical flow for log events:

```
Service Log ---> kafka_shipper <--- logstash_shipper ---> | ott_network_link | ---> kafka_indexer <--- logstash_indexer ---> elastic_search
```

The filter leverages metadata inserted into the log event on both the `logstash_shipper` and `logstash_indexer` nodes to track the dwell time of log events through this pipeline. A sketch of how that metadata can be captured is shown below.
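A minimal shipper-side sketch, assuming the [Kafka input `decorate_events`](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#_metadata_fields) feature; the `[@metadata][ktm]` namespace mirrors the `%{[@metadata][ktm][...]}` pattern referenced in the plugin source, while the topic and group names are illustrative:

```
input {
  kafka {
    topics          => ["service_logs"]
    group_id        => "logstash_shipper"
    decorate_events => true
  }
}

filter {
  # Copy the Kafka metadata this plugin later consumes into a scratch namespace
  mutate {
    add_field => {
      "[@metadata][ktm][kafka_topic_shipper]"          => "%{[@metadata][kafka][topic]}"
      "[@metadata][ktm][kafka_consumer_group_shipper]" => "%{[@metadata][kafka][consumer_group]}"
      "[@metadata][ktm][kafka_append_time_shipper]"    => "%{[@metadata][kafka][timestamp]}"
    }
  }
}
```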
## Kafka Time Machine Result

When the `kafka_time_machine` filter executes, it returns an [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) formatted metric, e.g.:

```
ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000
```

The plugin will also emit a metric if an error was encountered, e.g.:

```
ktm_error,datacenter=kafka_datacenter_shipper-test,owner=ktm_test@cisco.com,source=shipper count=1i 1634662795000000000
```

To ensure a logstash `output{}` block can properly route this metric, each new event is tagged with a `[@metadata][ktm_tags][ktm_metric]` field, e.g.:

```
{
    "ktm_metric" => "ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000",
    "@timestamp" => 2021-10-20T23:46:24.704Z,
     "@metadata" => {
        "ktm_tags" => {
            "ktm_metric" => "true"
        }
    },
      "@version" => "1"
}
```
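A minimal sketch of an `output{}` block that routes on this tag; the HTTP output usage is one possible backend, and the InfluxDB write URL, org, bucket, and token are placeholders:

```
output {
  if [@metadata][ktm_tags][ktm_metric] == "true" {
    http {
      url          => "https://influxdb.example.com/api/v2/write?org=my-org&bucket=my-bucket"
      http_method  => "post"
      format       => "message"
      content_type => "text/plain"
      message      => "%{ktm_metric}"
      headers      => { "Authorization" => "Token ${INFLUX_TOKEN}" }
    }
  }
}
```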
### Metric Event Breakdown

The `kafka_time_machine` can insert one or more new events into the pipeline. The `ktm_metric` created will be one of:

- `ktm`
- `ktm_error`

In the case of `ktm` the metric breakdown is:

| Line Protocol Element | Line Protocol Type | Description                                 |
| --------------------- | ------------------ | ------------------------------------------- |
| datacenter            | tag                | Echo of `kafka_datacenter_shipper`          |
| lag_type              | tag                | Calculated lag type                         |
| owner                 | tag                | Echo of `event_owner`                       |
| lag_ms                | field              | Calculated lag in milliseconds              |
| payload_size_bytes    | field              | Calculated size of `payload` field in bytes |

Meaning of `lag_type`:

- `total`: Calculated lag includes dwell time on both the shipper and the indexer.
- `indexer`: Calculated lag is dwell time for the indexer only; insufficient data was provided for the shipper to compute `total` lag.
- `shipper`: Calculated lag is dwell time for the shipper only; insufficient data was provided for the indexer to compute `total` lag.

In the case of `ktm_error` the metric breakdown is:

| Line Protocol Element | Line Protocol Type | Description                          |
| --------------------- | ------------------ | ------------------------------------ |
| datacenter            | tag                | Echo of `kafka_datacenter_shipper`   |
| source                | tag                | Source of the error metric           |
| owner                 | tag                | Echo of `event_owner`                |
| count                 | field              | Count to track error; not cumulative |

Meaning of `source`:

- `indexer`: Insufficient data provided for the indexer to compute `total` lag.
- `shipper`: Insufficient data provided for the shipper to compute `total` lag.
- `insufficient_data`: Insufficient data provided for both the indexer and the shipper to compute `total` lag.
- `unknown`: Unknown error encountered.
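For example, when only the shipper data validates, the filter emits a `ktm` metric with `lag_type=shipper` together with a companion `ktm_error` metric with `source=indexer`; the lag value below is illustrative:

```
ktm,datacenter=kafka_datacenter_shipper-test,lag_type=shipper,owner=ktm_test@cisco.com lag_ms=120i,payload_size_bytes=40i 1634662795000000000
ktm_error,datacenter=kafka_datacenter_shipper-test,owner=ktm_test@cisco.com,source=indexer count=1i 1634662795000000000
```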
### Metric Event Timestamp

When the `kafka_time_machine` generates the [InfluxDB Line Protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) metric, it must also set the timestamp on the event. To ensure the caller of the filter has control of this, the `event_time_ms` setting is used as the metric timestamp.

For example, if `event_time_ms` is provided as `1634662795000`, the resulting metric would be:

```
ktm,datacenter=kafka_datacenter_shipper-test,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000
```
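Internally, the filter simply scales the supplied milliseconds to the nanosecond precision that line protocol expects, mirroring the `epoch_time_ns = event_time_ms * 1000000` line in the plugin source:

```
# Epoch milliseconds -> epoch nanoseconds (1 ms = 1,000,000 ns)
event_time_ms = 1634662795000
epoch_time_ns = event_time_ms * 1000000   # => 1634662795000000000
```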
## Kafka Time Machine Configuration Options

This plugin requires the following configurations:

| Setting                                                                | Input Type | Required |
| ---------------------------------------------------------------------- | ---------- | -------- |
| [kafka_datacenter_shipper](#kafka_datacenter_shipper)                  | string     | Yes      |
| [kafka_topic_shipper](#kafka_topic_shipper)                            | string     | Yes      |
| [kafka_consumer_group_shipper](#kafka_consumer_group_shipper)          | string     | Yes      |
| [kafka_append_time_shipper](#kafka_append_time_shipper)                | string     | Yes      |
| [logstash_kafka_read_time_shipper](#logstash_kafka_read_time_shipper)  | string     | Yes      |
| [kafka_topic_indexer](#kafka_topic_indexer)                            | string     | Yes      |
| [kafka_consumer_group_indexer](#kafka_consumer_group_indexer)          | string     | Yes      |
| [kafka_append_time_indexer](#kafka_append_time_indexer)                | string     | Yes      |
| [logstash_kafka_read_time_indexer](#logstash_kafka_read_time_indexer)  | string     | Yes      |
| [event_owner](#event_owner)                                            | string     | Yes      |
| [event_time_ms](#event_time_ms)                                        | string     | Yes      |

> Why are all settings required?
>
>> This was a design decision based on the use case. Tracking a Kafka "lag by time" metric without knowing the topic and consumer group would be essentially useless. By leveraging the [Kafka input `decorate_events`](https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html#_metadata_fields) feature we know we'll always have the required fields.
>>
>> While the settings are required, they can be passed as empty strings. The plugin handles these cases; e.g., if the `kafka_consumer_group_shipper` name is an empty string, it will only return `indexer` results. A complete wiring sketch follows.
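Below is one possible end-to-end invocation, assuming the shipper-side values were forwarded inside the event; every `%{...}` field reference here is illustrative, not mandated by the plugin:

```
filter {
  kafka_time_machine {
    kafka_datacenter_shipper         => "%{[@metadata][ktm][kafka_datacenter_shipper]}"
    kafka_topic_shipper              => "%{[@metadata][ktm][kafka_topic_shipper]}"
    kafka_consumer_group_shipper     => "%{[@metadata][ktm][kafka_consumer_group_shipper]}"
    kafka_append_time_shipper        => "%{[@metadata][ktm][kafka_append_time_shipper]}"
    logstash_kafka_read_time_shipper => "%{[@metadata][ktm][logstash_kafka_read_time_shipper]}"
    kafka_topic_indexer              => "%{[@metadata][kafka][topic]}"
    kafka_consumer_group_indexer     => "%{[@metadata][kafka][consumer_group]}"
    kafka_append_time_indexer        => "%{[@metadata][kafka][timestamp]}"
    logstash_kafka_read_time_indexer => "%{[@metadata][ktm][logstash_kafka_read_time_indexer]}"
    event_owner                      => "%{[owner]}"
    event_time_ms                    => "%{[@metadata][event_time_ms]}"
  }
}
```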
### kafka_datacenter_shipper

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the datacenter the log event originated from, i.e. the datacenter `kafka_shipper` is in. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_datacenter_shipper => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_datacenter_shipper => "%{[dynamic_field]}"
  }
}
```

### kafka_topic_shipper

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the Kafka topic the log event was read from on the shipper. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_topic_shipper => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_topic_shipper => "%{[dynamic_field]}"
  }
}
```

### kafka_consumer_group_shipper

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the Kafka consumer group the log event was read from on the shipper. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_consumer_group_shipper => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_consumer_group_shipper => "%{[dynamic_field]}"
  }
}
```

### kafka_append_time_shipper

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the epoch time in milliseconds the log event was added to `kafka_shipper`. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_append_time_shipper => 1624394191000
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_append_time_shipper => "%{[dynamic_field]}"
  }
}
```

### logstash_kafka_read_time_shipper

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the epoch time in milliseconds the log event was read from `kafka_shipper`. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    logstash_kafka_read_time_shipper => 1624394191000
  }
}
```

```
filter {
  kafka_time_machine {
    logstash_kafka_read_time_shipper => "%{[dynamic_field]}"
  }
}
```

### kafka_topic_indexer

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the Kafka topic the log event was read from on the indexer. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_topic_indexer => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_topic_indexer => "%{[dynamic_field]}"
  }
}
```

### kafka_consumer_group_indexer

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the Kafka consumer group the log event was read from on the indexer. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_consumer_group_indexer => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_consumer_group_indexer => "%{[dynamic_field]}"
  }
}
```

### kafka_append_time_indexer

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the epoch time in milliseconds the log event was added to `kafka_indexer`. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    kafka_append_time_indexer => 1624394191000
  }
}
```

```
filter {
  kafka_time_machine {
    kafka_append_time_indexer => "%{[dynamic_field]}"
  }
}
```

### logstash_kafka_read_time_indexer

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the epoch time in milliseconds the log event was read from `kafka_indexer`. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    logstash_kafka_read_time_indexer => 1624394191000
  }
}
```

```
filter {
  kafka_time_machine {
    logstash_kafka_read_time_indexer => "%{[dynamic_field]}"
  }
}
```

### event_owner

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the event owner; this represents the owner of the log. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    event_owner => "static_field"
  }
}
```

```
filter {
  kafka_time_machine {
    event_owner => "%{[dynamic_field]}"
  }
}
```

### event_time_ms

- Value type is [string](https://www.elastic.co/guide/en/logstash/7.13/configuration-file-structure.html#string)
- There is no default value for this setting.

Provide the epoch time in milliseconds at which this event is being processed. This time will be appended to the generated InfluxDB Line Protocol metric. Field values can be static or dynamic:

```
filter {
  kafka_time_machine {
    event_time_ms => 1624394191000
  }
}
```

```
filter {
  kafka_time_machine {
    event_time_ms => "%{[dynamic_field]}"
  }
}
```
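One common approach is to stamp the processing time immediately before invoking the filter; a minimal sketch using the stock `ruby` filter, where the `[@metadata][event_time_ms]` field name is illustrative:

```
filter {
  ruby {
    # Record the current wall-clock time as epoch milliseconds
    code => "event.set('[@metadata][event_time_ms]', (Time.now.to_f * 1000).to_i)"
  }
  kafka_time_machine {
    # ... other required settings ...
    event_time_ms => "%{[@metadata][event_time_ms]}"
  }
}
```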
data/lib/logstash/filters/kafka_time_machine.rb
ADDED
@@ -0,0 +1,242 @@
# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
require "logstash/event"
require "influxdb-client"

class LogStash::Filters::KafkaTimeMachine < LogStash::Filters::Base

  config_name "kafka_time_machine"

  # Datacenter the kafka message originated from.
  config :kafka_datacenter_shipper, :validate => :string, :required => true

  # Kafka topic on shipper datacenter
  config :kafka_topic_shipper, :validate => :string, :required => true

  # Kafka consumer group on shipper datacenter
  config :kafka_consumer_group_shipper, :validate => :string, :required => true

  # Time message was appended to kafka on shipper datacenter
  config :kafka_append_time_shipper, :validate => :string, :required => true

  # Time message read from kafka by logstash on shipper datacenter
  config :logstash_kafka_read_time_shipper, :validate => :string, :required => true

  # Kafka topic on indexer datacenter
  config :kafka_topic_indexer, :validate => :string, :required => true

  # Kafka consumer group on indexer datacenter
  config :kafka_consumer_group_indexer, :validate => :string, :required => true

  # Time message was appended to kafka on indexer datacenter
  config :kafka_append_time_indexer, :validate => :string, :required => true

  # Time message read from kafka by logstash on indexer datacenter
  config :logstash_kafka_read_time_indexer, :validate => :string, :required => true

  # Owner of the event currently being processed.
  config :event_owner, :validate => :string, :required => true

  # Current time since epoch in ms that should be set in the generated InfluxDB metric
  config :event_time_ms, :validate => :string, :required => true

  public
  def register

  end

  public
  def filter(event)

    @logger.debug("Starting filter calculations")

    # Note - It was considered to error check for strings that are invalid, i.e. "%{[@metadata][ktm][kafka_datacenter_shipper]}". However, this string being present is a good way to identify
    # shipper/indexer logstash configs that are wrong, so it's allowed to pass through unaltered.
    #
    # Extract all string values to local variables.
    event_owner = event.sprintf(@event_owner)
    shipper_kafka_datacenter = event.sprintf(@kafka_datacenter_shipper)
    shipper_kafka_topic = event.sprintf(@kafka_topic_shipper)
    shipper_kafka_consumer_group = event.sprintf(@kafka_consumer_group_shipper)
    indexer_kafka_topic = event.sprintf(@kafka_topic_indexer)
    indexer_kafka_consumer_group = event.sprintf(@kafka_consumer_group_indexer)

    # Extract all the "time" related values to local variables. These need special handling due to the Float() operation.
    #
    # We must check for a valid numeric value; if not, the Float() operation would raise an error and stop the logstash pipeline.
    event_time_ms = get_numeric(event.sprintf(@event_time_ms))
    shipper_kafka_append_time = get_numeric(event.sprintf(@kafka_append_time_shipper))
    shipper_logstash_kafka_read_time = get_numeric(event.sprintf(@logstash_kafka_read_time_shipper))
    indexer_kafka_append_time = get_numeric(event.sprintf(@kafka_append_time_indexer))
    indexer_logstash_kafka_read_time = get_numeric(event.sprintf(@logstash_kafka_read_time_indexer))

    # Validate the shipper data
    shipper_kafka_array = Array[shipper_kafka_datacenter, shipper_kafka_topic, shipper_kafka_consumer_group, shipper_kafka_append_time, shipper_logstash_kafka_read_time, event_owner, event_time_ms]
    if (shipper_kafka_array.any? { |text| text.nil? || text.to_s.empty? })
      @logger.debug("shipper_kafka_array invalid: Found null")
      error_string_shipper = sprintf("Error in shipper data: %s", shipper_kafka_array)
      shipper_valid = false
    else
      @logger.debug("shipper_kafka_array valid")
      shipper_valid = true
      shipper_logstash_kafka_read_time = shipper_logstash_kafka_read_time.to_i
      shipper_kafka_append_time = shipper_kafka_append_time.to_i
      shipper_kafka_lag_ms = shipper_logstash_kafka_read_time - shipper_kafka_append_time
    end

    # Validate the indexer data
    indexer_kafka_array = Array[shipper_kafka_datacenter, indexer_kafka_topic, indexer_kafka_consumer_group, indexer_kafka_append_time, indexer_logstash_kafka_read_time, event_owner, event_time_ms]
    if (indexer_kafka_array.any? { |text| text.nil? || text.to_s.empty? })
      @logger.debug("indexer_kafka_array invalid: Found null")
      error_string_indexer = sprintf("Error in indexer data: %s", indexer_kafka_array)
      indexer_valid = false
    else
      @logger.debug("indexer_kafka_array valid")
      indexer_valid = true
      indexer_logstash_kafka_read_time = indexer_logstash_kafka_read_time.to_i
      indexer_kafka_append_time = indexer_kafka_append_time.to_i
      indexer_kafka_lag_ms = indexer_logstash_kafka_read_time - indexer_kafka_append_time
    end

    # Add in the size of the payload field
    payload_bytesize = 0
    if event.get("[payload]")
      payload_bytesize = event.get("[payload]").bytesize
    end

    # Set time (nanoseconds) for influxdb line protocol
    epoch_time_ns = nil
    if (event_time_ms != nil)
      epoch_time_ns = event_time_ms * 1000000
    end

    # Create array to hold one or more ktm metric events
    ktm_metric_event_array = Array.new

    # Populate the event and set tags
    if (shipper_valid == true && indexer_valid == true && epoch_time_ns != nil)
      total_kafka_lag_ms = indexer_logstash_kafka_read_time - shipper_kafka_append_time

      point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "total", total_kafka_lag_ms, epoch_time_ns)
      ktm_metric_event_array.push point_influxdb

    elsif (shipper_valid == true && indexer_valid == false && epoch_time_ns != nil)
      point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "shipper", shipper_kafka_lag_ms, epoch_time_ns)
      ktm_metric_event_array.push point_influxdb

      point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "indexer")
      ktm_metric_event_array.push point_influxdb

    elsif (indexer_valid == true && shipper_valid == false && epoch_time_ns != nil)
      point_influxdb = create_influxdb_point_ktm(shipper_kafka_datacenter, event_owner, payload_bytesize, "indexer", indexer_kafka_lag_ms, epoch_time_ns)
      ktm_metric_event_array.push point_influxdb

      point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "shipper")
      ktm_metric_event_array.push point_influxdb

    elsif (indexer_valid == false && shipper_valid == false)

      point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "insufficient_data")
      ktm_metric_event_array.push point_influxdb

      error_string = sprintf("Error kafka_time_machine: Could not build valid response --> %s, %s", error_string_shipper, error_string_indexer)
      @logger.debug(error_string)

    else

      point_influxdb = create_influxdb_point_ktm_error(shipper_kafka_datacenter, event_owner, epoch_time_ns, "unknown")
      ktm_metric_event_array.push point_influxdb

      error_string = "Unknown error encountered"
      @logger.debug(error_string)

    end

    # Publish each event in our array
    ktm_metric_event_array.each do |metric_event|

      # Create new event for KTM metric
      event_ktm = LogStash::Event.new

      event_ktm.set("ktm_metric", metric_event)
      event_ktm.set("[@metadata][ktm_tags][ktm_metric]", "true")

      filter_matched(event_ktm)
      yield event_ktm

    end

  end # def filter

  # Creates an InfluxDB line-protocol data point for a ktm metric
  public
  def create_influxdb_point_ktm(datacenter, event_owner, payload_size_bytes, lag_type, lag_ms, epoch_time_ns)

    point = InfluxDB2::Point.new(name: "ktm",
                                 tags: {datacenter: datacenter, owner: event_owner, lag_type: lag_type},
                                 fields: {payload_size_bytes: payload_size_bytes, lag_ms: lag_ms},
                                 time: epoch_time_ns)

    point_influxdb = point.to_line_protocol
    return point_influxdb

  end # def create_influxdb_point_ktm

  # Creates an InfluxDB line-protocol data point for a ktm_error metric
  public
  def create_influxdb_point_ktm_error(datacenter, event_owner, epoch_time_ns, type)

    # Check for nil values
    if (nil == datacenter)
      datacenter = "unknown"
    end

    if (nil == event_owner)
      event_owner = "unknown"
    end

    # Set time if we didn't receive it
    if (nil == epoch_time_ns)
      epoch_time_ns = ((Time.now.to_f * 1000).to_i) * 1000000
    end

    point = InfluxDB2::Point.new(name: "ktm_error",
                                 tags: {datacenter: datacenter, owner: event_owner, source: type},
                                 fields: {count: 1},
                                 time: epoch_time_ns)

    point_influxdb = point.to_line_protocol
    return point_influxdb

  end # def create_influxdb_point_ktm_error

  # Ensures the provided value is numeric; if not, returns nil
  public
  def get_numeric(input_str)

    @logger.debug("get_numeric operating on: #{input_str}")

    is_numeric = input_str.to_s.match(/\A[+-]?\d+?(\.\d+)?\Z/) == nil ? false : true
    if (true == is_numeric)
      @logger.debug("get_numeric - valid value provided")
      num_value = Float(input_str)

      if (false == num_value.positive?)
        @logger.debug("get_numeric - negative value provided")
        num_value = nil
      end

    else
      @logger.debug("get_numeric - invalid value provided")
      num_value = nil
    end

    @logger.debug("get_numeric response --> #{num_value}")

    return num_value

  end # def get_numeric

end # class LogStash::Filters::KafkaTimeMachine
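For reference, a minimal irb-style sketch of the `influxdb-client` gem call the two `create_influxdb_point_*` helpers above rely on; the tag and field values are illustrative:

require "influxdb-client"

point = InfluxDB2::Point.new(
  name:   "ktm",
  tags:   { datacenter: "dc1", owner: "ktm_test@cisco.com", lag_type: "total" },
  fields: { payload_size_bytes: 40, lag_ms: 300 },
  time:   1634662795000000000
)

# Should print something like:
# ktm,datacenter=dc1,lag_type=total,owner=ktm_test@cisco.com lag_ms=300i,payload_size_bytes=40i 1634662795000000000
puts point.to_line_protocol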
data/logstash-filter-kafka_time_machine.gemspec
CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-filter-kafka_time_machine'
-  s.version = '0.4.0'
+  s.version = '2.0.0'
   s.licenses = ['Apache-2.0']
   s.summary = "Calculate total time of logstash event that traversed 2 Kafka queues from a shipper site to an indexer site"
   s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
@@ -20,5 +20,6 @@ Gem::Specification.new do |s|
 
   # Gem dependencies
   s.add_runtime_dependency "logstash-core-plugin-api", ">= 1.60", "<= 2.99"
+  s.add_runtime_dependency "influxdb-client", "~> 2.0.0"
   s.add_development_dependency 'logstash-devutils', '~> 0'
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-kafka_time_machine
 version: !ruby/object:Gem::Version
-  version: 0.4.0
+  version: 2.0.0
 platform: ruby
 authors:
 - Chris Foster
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2021-
+date: 2021-10-21 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: logstash-core-plugin-api
@@ -30,6 +30,20 @@ dependencies:
     - - "<="
       - !ruby/object:Gem::Version
         version: '2.99'
+- !ruby/object:Gem::Dependency
+  name: influxdb-client
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.0.0
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.0.0
 - !ruby/object:Gem::Dependency
   name: logstash-devutils
   requirement: !ruby/object:Gem::Requirement
@@ -54,7 +68,7 @@ extra_rdoc_files: []
 files:
 - Gemfile
 - README.md
-- lib/logstash/filters/kafkatimemachine.rb
+- lib/logstash/filters/kafka_time_machine.rb
 - logstash-filter-kafka_time_machine.gemspec
 homepage: http://www.elastic.co/guide/en/logstash/current/index.html
 licenses:
data/lib/logstash/filters/kafkatimemachine.rb
DELETED
@@ -1,77 +0,0 @@
# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
require "logstash/event"

class LogStash::Filters::KafkaTimeMachine < LogStash::Filters::Base

  config_name "kafkatimemachine"

  public
  def register

  end

  public
  def filter(event)

    # Extract shipper data and check for validity; note that kafka_datacenter_shipper is used for both shipper and indexer arrays
    kafka_datacenter_shipper = event.get("[@metadata][kafka_datacenter_shipper]")
    kafka_topic_shipper = event.get("[@metadata][kafka_topic_shipper]")
    kafka_consumer_group_shipper = event.get("[@metadata][kafka_consumer_group_shipper]")
    kafka_append_time_shipper = Float(event.get("[@metadata][kafka_append_time_shipper]")) rescue nil
    logstash_kafka_read_time_shipper = Float(event.get("[@metadata][logstash_kafka_read_time_shipper]")) rescue nil

    kafka_shipper_array = Array[kafka_datacenter_shipper, kafka_topic_shipper, kafka_consumer_group_shipper, kafka_append_time_shipper, logstash_kafka_read_time_shipper]
    @logger.debug("kafka_shipper_array: #{kafka_shipper_array}")

    if (kafka_shipper_array.any? { |text| text.nil? || text.to_s.empty? })
      @logger.debug("kafka_shipper_array invalid: Found null")
      error_string_shipper = "Error in shipper data: #{kafka_shipper_array}"
      shipper_valid = false
    else
      @logger.debug("kafka_shipper_array valid")
      shipper_valid = true
      logstash_kafka_read_time_shipper = logstash_kafka_read_time_shipper.to_i
      kafka_append_time_shipper = kafka_append_time_shipper.to_i
      kafka_shipper_lag_ms = logstash_kafka_read_time_shipper - kafka_append_time_shipper
    end

    # Extract indexer data and check for validity
    kafka_topic_indexer = event.get("[@metadata][kafka_topic_indexer]")
    kafka_consumer_group_indexer = event.get("[@metadata][kafka_consumer_group_indexer]")
    kafka_append_time_indexer = Float(event.get("[@metadata][kafka_append_time_indexer]")) rescue nil
    logstash_kafka_read_time_indexer = Float(event.get("[@metadata][logstash_kafka_read_time_indexer]")) rescue nil

    kafka_indexer_array = Array[kafka_datacenter_shipper, kafka_topic_indexer, kafka_consumer_group_indexer, kafka_append_time_indexer, logstash_kafka_read_time_indexer]
    @logger.debug("kafka_indexer_array: #{kafka_indexer_array}")

    if (kafka_indexer_array.any? { |text| text.nil? || text.to_s.empty? })
      @logger.debug("kafka_indexer_array invalid: Found null")
      error_string_indexer = "Error in indexer data: #{kafka_indexer_array}"
      indexer_valid = false
    else
      @logger.debug("kafka_indexer_array valid")
      indexer_valid = true
      logstash_kafka_read_time_indexer = logstash_kafka_read_time_indexer.to_i
      kafka_append_time_indexer = kafka_append_time_indexer.to_i
      kafka_indexer_lag_ms = logstash_kafka_read_time_indexer - kafka_append_time_indexer
    end

    if (shipper_valid == true && indexer_valid == true)
      kafka_total_lag_ms = logstash_kafka_read_time_indexer - kafka_append_time_shipper
      event.set("[ktm]", {"lag_total" => kafka_total_lag_ms, "lag_indexer" => kafka_indexer_lag_ms, "lag_shipper" => kafka_shipper_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_indexer" => kafka_topic_indexer, "kafka_consumer_group_indexer" => kafka_consumer_group_indexer, "kafka_topic_shipper" => kafka_topic_shipper, "kafka_consumer_group_shipper" => kafka_consumer_group_shipper, "tags" => ["ktm_lag_complete"] })
    elsif (shipper_valid == true && indexer_valid == false)
      event.set("[ktm]", {"lag_shipper" => kafka_shipper_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_shipper" => kafka_topic_shipper, "kafka_consumer_group_shipper" => kafka_consumer_group_shipper, "tags" => ["ktm_lag_shipper"] })
    elsif (indexer_valid == true && shipper_valid == false)
      event.set("[ktm]", {"lag_indexer" => kafka_indexer_lag_ms, "datacenter_shipper" => kafka_datacenter_shipper, "kafka_topic_indexer" => kafka_topic_indexer, "kafka_consumer_group_indexer" => kafka_consumer_group_indexer, "tags" => ["ktm_lag_indexer"] })
    elsif (indexer_valid == false && shipper_valid == false)
      @logger.debug("Error kafkatimemachine: Could not build valid response --> #{error_string_shipper}, #{error_string_indexer}")
    end

    # filter_matched should go in the last line of our successful code
    filter_matched(event)

  end # def filter

end # class LogStash::Filters::KafkaTimeMachine