logstash-filter-aggregate 2.3.1 → 2.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -0
- data/README.md +20 -16
- data/lib/logstash/filters/aggregate.rb +121 -72
- data/logstash-filter-aggregate.gemspec +1 -1
- data/spec/filters/aggregate_spec.rb +56 -43
- data/spec/filters/aggregate_spec_helper.rb +5 -5
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 86bf41e2f4183eb1bca1053dcdd54bba16ce715d
+  data.tar.gz: edd15d9f860d6becda2d0faabf246de87cb404bd
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: adf8addee7fcfd1efeddf980a85933b9ccb35e89f1d3e1e784fb302902a0308ec31209decb5c3d3da83151860d212e2d7c79f41349bc5e2cc4a16acdc2352d5c
+  data.tar.gz: 320a2b40266aa4cc506ef01ec6e61e6b0f92985f2ccd4914a621b725fc9ace12f50e8d3314c36afedff97c5e0bb886a2460c2bc87549ebf41ff92389754b6842
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,9 @@
+## 2.4.0
+ - new feature: you can now define timeout options per task_id pattern (#42).
+   Timeout options are: `timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags`
+ - validation: a configuration error is thrown at startup if you define any timeout option on several aggregate filters for the same task_id pattern
+ - breaking: if you use the `aggregate_maps_path` option, the storage format has changed, so you have to delete the `aggregate_maps_path` file before starting Logstash
+
 ## 2.3.1
  - new feature: add new option "timeout_tags" so that you can add tags to generated timeout events
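Since the per-pattern timeout rule is the headline change of 2.4.0, here is a minimal sketch of what it means in practice. It is not taken from the package: it instantiates the filter the way the plugin's own specs do, and the pattern name and the 120-second timeout are illustrative assumptions.

```ruby
require "logstash/filters/aggregate"

# One filter per task_id pattern owns ALL the timeout options (hypothetical values).
timeout_owner = LogStash::Filters::Aggregate.new(
  "task_id"               => "%{taskid}",
  "code"                  => "map['sql_duration'] ||= 0",
  "timeout"               => 120,
  "timeout_task_id_field" => "task_id",
  "timeout_tags"          => ["_aggregatetimeout"]
)
timeout_owner.register

# Other filters on the same pattern define no timeout option at all;
# if one of them did, register would raise LogStash::ConfigurationError (the new 2.4.0 validation).
plain_filter = LogStash::Filters::Aggregate.new(
  "task_id" => "%{taskid}",
  "code"    => "map['sql_duration'] += event['duration']"
)
plain_filter.register
```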
data/README.md
CHANGED
@@ -204,7 +204,9 @@ In that case, you don't want to wait task timeout to flush aggregation map.
 - after the final event, the map attached to task is deleted
 - in one filter configuration, it is recommanded to define a timeout option to protect the filter against unterminated tasks. It tells the filter to delete expired maps
 - if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
-
+- all timeout options have to be defined in only one aggregate filter per task_id pattern.
+  Timeout options are : `timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags`
+- if `code` execution raises an exception, the error is logged and event is tagged '_aggregateexception'
 
 ## Use Cases
 - extract some cool metrics from task logs and push them into task final log event (like in example #1 and #2)
@@ -236,29 +238,20 @@ Tell the filter what to do with aggregate map (default : "create_or_update").
 Default value: `create_or_update`
 
 - **end_of_task:**
-Tell the filter that task is ended, and therefore, to delete map after code execution.
+Tell the filter that task is ended, and therefore, to delete aggregate map after code execution.
 Default value: `false`
 
-- **timeout:**
-The amount of seconds after a task "end event" can be considered lost.
-When timeout occurs for a task, The task "map" is evicted.
-If no timeout is defined, default timeout will be applied : 1800 seconds.
-
 - **aggregate_maps_path:**
 The path to file where aggregate maps are stored when logstash stops and are loaded from when logstash starts.
 If not defined, aggregate maps will not be stored at logstash stop and will be lost.
 Must be defined in only one aggregate filter (as aggregate maps are global).
 Example value : `"/path/to/.aggregate_maps"`
 
-- **
-
-
-
-
-
-- **push_map_as_event_on_timeout**
-When this option is enabled, each time a task timeout is detected, it pushes task aggregation map as a new logstash event.
-This enables to detect and process task timeouts in logstash, but also to manage tasks that have no explicit end event.
+- **timeout:**
+The amount of seconds after a task "end event" can be considered lost.
+When timeout occurs for a task, The task "map" is evicted.
+Timeout can be defined for each "task_id" pattern.
+If no timeout is defined, default timeout will be applied : 1800 seconds.
 
 - **timeout_code**
 The code to execute to complete timeout generated event, when 'push_map_as_event_on_timeout' or 'push_previous_map_as_event' is set to true.
@@ -266,6 +259,17 @@ The code block will have access to the newly generated timeout event that is pre
 If 'timeout_task_id_field' is set, the event is also populated with the task_id value
 Example value: `"event['state'] = 'timeout'"`
 
+- **push_map_as_event_on_timeout**
+When this option is enabled, each time a task timeout is detected, it pushes task aggregation map as a new logstash event.
+This enables to detect and process task timeouts in logstash, but also to manage tasks that have no explicit end event.
+Default value: `false`
+
+- **push_previous_map_as_event:**
+When this option is enabled, each time aggregate plugin detects a new task id, it pushes previous aggregate map as a new logstash event,
+and then creates a new empty map for the next task.
+_WARNING:_ this option works fine only if tasks come one after the other. It means : all task1 events, then all task2 events, etc...
+Default value: `false`
+
 - **timeout_task_id_field**
 This option indicates the timeout generated event's field for the "task_id" value.
 The task id will then be set into the timeout event. This can help correlate which tasks have been timed out.
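To make the timeout-event options described above concrete, the hedged sketch below shows roughly what a generated timeout event would carry when `timeout_task_id_field`, `timeout_code` and `timeout_tags` are all set. The field names and values ("my_id", "12345", the tag) are assumptions reusing the README's own example values, and the event calls follow the Logstash 2.x event API used throughout this diff.

```ruby
require "logstash/event"

aggregation_map = { "sql_duration" => 46 }             # the task's map at eviction time

timeout_event = LogStash::Event.new(aggregation_map)   # generated event starts from the aggregation map
timeout_event["my_id"] = "12345"                       # timeout_task_id_field => "my_id"
timeout_event["state"] = "timeout"                     # timeout_code => "event['state'] = 'timeout'"
timeout_event["tags"]  = ["aggregate_timeout"]         # timeout_tags => ["aggregate_timeout"]
```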
data/lib/logstash/filters/aggregate.rb
CHANGED
@@ -220,7 +220,8 @@ require "logstash/util/decorators"
 # * after the final event, the map attached to task is deleted
 # * in one filter configuration, it is recommanded to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
 # * if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
-# *
+# * all timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
+# * if `code` execution raises an exception, the error is logged and event is tagged '_aggregateexception'
 #
 #
 # ==== Use Cases
@@ -252,30 +253,6 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   # Example value : `"map['sql_duration'] += event['duration']"`
   config :code, :validate => :string, :required => true
 
-
-
-  # The code to execute to complete timeout generated event, when 'push_map_as_event_on_timeout' or 'push_previous_map_as_event' is set to true.
-  # The code block will have access to the newly generated timeout event that is pre-populated with the aggregation map.
-  #
-  # If 'timeout_task_id_field' is set, the event is also populated with the task_id value
-  #
-  # Example value: `"event['state'] = 'timeout'"`
-  config :timeout_code, :validate => :string, :required => false
-
-
-  # This option indicates the timeout generated event's field for the "task_id" value.
-  # The task id will then be set into the timeout event. This can help correlate which tasks have been timed out.
-  #
-  # This field has no default value and will not be set on the event if not configured.
-  #
-  # Example:
-  #
-  # If the task_id is "12345" and this field is set to "my_id", the generated event will have:
-  # event[ "my_id" ] = "12345"
-  #
-  config :timeout_task_id_field, :validate => :string, :required => false
-
-
   # Tell the filter what to do with aggregate map.
   #
   # `create`: create the map, and execute the code only if map wasn't created before
@@ -285,16 +262,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   # `create_or_update`: create the map if it wasn't created before, execute the code in all cases
   config :map_action, :validate => :string, :default => "create_or_update"
 
-  # Tell the filter that task is ended, and therefore, to delete map after code execution.
+  # Tell the filter that task is ended, and therefore, to delete aggregate map after code execution.
   config :end_of_task, :validate => :boolean, :default => false
 
-  # The amount of seconds after a task "end event" can be considered lost.
-  #
-  # When timeout occurs for a task, The task "map" is evicted.
-  #
-  # If no timeout is defined, default timeout will be applied : 1800 seconds.
-  config :timeout, :validate => :number, :required => false
-
   # The path to file where aggregate maps are stored when logstash stops
   # and are loaded from when logstash starts.
   #
@@ -304,15 +274,44 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   # Example value : `"/path/to/.aggregate_maps"`
   config :aggregate_maps_path, :validate => :string, :required => false
 
+  # The amount of seconds after a task "end event" can be considered lost.
+  #
+  # When timeout occurs for a task, The task "map" is evicted.
+  #
+  # Timeout can be defined for each "task_id" pattern.
+  #
+  # If no timeout is defined, default timeout will be applied : 1800 seconds.
+  config :timeout, :validate => :number, :required => false
+
+  # The code to execute to complete timeout generated event, when 'push_map_as_event_on_timeout' or 'push_previous_map_as_event' is set to true.
+  # The code block will have access to the newly generated timeout event that is pre-populated with the aggregation map.
+  #
+  # If 'timeout_task_id_field' is set, the event is also populated with the task_id value
+  #
+  # Example value: `"event['state'] = 'timeout'"`
+  config :timeout_code, :validate => :string, :required => false
+
+  # When this option is enabled, each time a task timeout is detected, it pushes task aggregation map as a new logstash event.
+  # This enables to detect and process task timeouts in logstash, but also to manage tasks that have no explicit end event.
+  config :push_map_as_event_on_timeout, :validate => :boolean, :required => false, :default => false
+
   # When this option is enabled, each time aggregate plugin detects a new task id, it pushes previous aggregate map as a new logstash event,
   # and then creates a new empty map for the next task.
   #
   # WARNING: this option works fine only if tasks come one after the other. It means : all task1 events, then all task2 events, etc...
   config :push_previous_map_as_event, :validate => :boolean, :required => false, :default => false
 
-  #
-  #
-
+  # This option indicates the timeout generated event's field for the "task_id" value.
+  # The task id will then be set into the timeout event. This can help correlate which tasks have been timed out.
+  #
+  # This field has no default value and will not be set on the event if not configured.
+  #
+  # Example:
+  #
+  # If the task_id is "12345" and this field is set to "my_id", the generated event will have:
+  # event[ "my_id" ] = "12345"
+  #
+  config :timeout_task_id_field, :validate => :string, :required => false
 
   # Defines tags to add when a timeout event is generated and yield
   config :timeout_tags, :validate => :array, :required => false, :default => []
@@ -329,11 +328,15 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   # Mutex used to synchronize access to 'aggregate_maps'
   @@mutex = Mutex.new
 
-  #
-  @@
+  # Default timeout for task_id patterns where timeout is not defined in logstash filter configuration
+  @@default_timeout = nil
+
+  # For each "task_id" pattern, defines which Aggregate instance will evict all expired Aggregate elements (older than timeout)
+  # For each entry, key is "task_id pattern" and value is "aggregate instance"
+  @@eviction_instance_map = {}
 
-  # last time where eviction was launched
-  @@
+  # last time where eviction was launched, per "task_id" pattern
+  @@last_eviction_timestamp_map = {}
 
   # flag indicating if aggregate_maps_path option has been already set on one aggregate instance
   @@aggregate_maps_path_set = false
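As a reading aid for the hunk above, this sketch (plain Ruby hashes with made-up values, not code from the plugin) shows the shape the new pattern-keyed class variables take once a couple of patterns are registered; in the real plugin the innermost values are Element objects carrying a creation timestamp and a map, not bare hashes.

```ruby
# @@aggregate_maps: keyed by task_id pattern first, then by task id value
aggregate_maps = {
  "%{taskid}" => { "id123" => { "sql_duration" => 0 } },
  "%{ppm_id}" => {}
}

# one filter instance per pattern is in charge of evicting that pattern's expired maps
eviction_instance_map = { "%{taskid}" => :filter_instance_owning_timeout_options }

# per-pattern eviction bookkeeping, plus the shared fallback timeout
last_eviction_timestamp_map = { "%{taskid}" => Time.now }
default_timeout = 1800   # DEFAULT_TIMEOUT is used when no filter defines a timeout
```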
@@ -349,30 +352,44 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     if @timeout_code
       eval("@timeout_codeblock = lambda { |event| #{@timeout_code} }", binding, "(aggregate filter timeout code)")
     end
-
+
     @@mutex.synchronize do
-
-
-
-
+
+      # timeout management : define eviction_instance for current task_id pattern
+      if has_timeout_options?
+        if @@eviction_instance_map.has_key?(@task_id)
+          # all timeout options have to be defined in only one aggregate filter per task_id pattern
+          raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern #{@task_id}, there are more than one filter which defines timeout options. All timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : #{display_timeout_options}"
+        end
+        @@eviction_instance_map[@task_id] = self
+        @logger.info("Aggregate plugin: timeout for '#{@task_id}' pattern: #{@timeout} seconds")
+      end
+
+      # timeout management : define default_timeout
+      if !@timeout.nil? && (@@default_timeout.nil? || @timeout < @@default_timeout)
+        @@default_timeout = @timeout
+        @logger.info("Aggregate plugin: default timeout: #{@timeout} seconds")
       end
 
       # check if aggregate_maps_path option has already been set on another instance
-      if
-      if
+      if !@aggregate_maps_path.nil?
+        if @@aggregate_maps_path_set
           @@aggregate_maps_path_set = false
-          raise LogStash::ConfigurationError, "Option 'aggregate_maps_path' must be set on only one aggregate filter"
+          raise LogStash::ConfigurationError, "Aggregate plugin: Option 'aggregate_maps_path' must be set on only one aggregate filter"
         else
           @@aggregate_maps_path_set = true
         end
       end
 
       # load aggregate maps from file (if option defined)
-      if
+      if !@aggregate_maps_path.nil? && File.exist?(@aggregate_maps_path)
         File.open(@aggregate_maps_path, "r") { |from_file| @@aggregate_maps = Marshal.load(from_file) }
         File.delete(@aggregate_maps_path)
-        @logger.info("Aggregate
+        @logger.info("Aggregate plugin: load aggregate maps from : #{@aggregate_maps_path}")
       end
+
+      # init aggregate_maps
+      @@aggregate_maps[@task_id] ||= {}
     end
   end
 
@@ -380,18 +397,21 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   public
   def close
 
-    #
-    @@aggregate_maps_path_set = false if @@aggregate_maps_path_set
-    @@eviction_instance = nil unless @@eviction_instance.nil?
-
+    # store aggregate maps to file (if option defined)
     @@mutex.synchronize do
-
-      if
+      @@aggregate_maps.delete_if { |key, value| value.empty? }
+      if !@aggregate_maps_path.nil? && !@@aggregate_maps.empty?
         File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(@@aggregate_maps, to_file) }
-
-        @logger.info("Aggregate, store aggregate maps to : #{@aggregate_maps_path}")
+        @logger.info("Aggregate plugin: store aggregate maps to : #{@aggregate_maps_path}")
       end
+      @@aggregate_maps.clear()
     end
+
+    # Protection against logstash reload
+    @@aggregate_maps_path_set = false if @@aggregate_maps_path_set
+    @@default_timeout = nil unless @@default_timeout.nil?
+    @@eviction_instance_map = {} unless @@eviction_instance_map.empty?
+
   end
 
   # This method is invoked each time an event matches the filter
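The breaking change called out in the CHANGELOG follows directly from this close/register pair: the Marshal-dumped structure is now the pattern-keyed hash, so an `aggregate_maps_path` file written by 2.3.1 no longer matches what 2.4.0 loads. Below is a minimal sketch of the same store-and-load round trip; the file name and map contents are illustrative assumptions, only the Marshal/File calls mirror what the diff shows.

```ruby
maps = { "%{taskid}" => { "id123" => { "sql_duration" => 2 } } }   # simplified, hypothetical contents

# store on close ...
File.open("/tmp/.aggregate_maps", "w") { |to_file| Marshal.dump(maps, to_file) }

# ... and load again on register, then remove the file, as the plugin does with aggregate_maps_path
restored = File.open("/tmp/.aggregate_maps", "r") { |from_file| Marshal.load(from_file) }
File.delete("/tmp/.aggregate_maps")
```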
@@ -409,19 +429,19 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     @@mutex.synchronize do
 
       # retrieve the current aggregate map
-      aggregate_maps_element = @@aggregate_maps[task_id]
+      aggregate_maps_element = @@aggregate_maps[@task_id][task_id]
 
 
       # create aggregate map, if it doesn't exist
-      if
+      if aggregate_maps_element.nil?
         return if @map_action == "update"
         # create new event from previous map, if @push_previous_map_as_event is enabled
-        if
-          previous_map = @@aggregate_maps.shift[1].map
+        if @push_previous_map_as_event && !@@aggregate_maps[@task_id].empty?
+          previous_map = @@aggregate_maps[@task_id].shift[1].map
           event_to_yield = create_timeout_event(previous_map, task_id)
         end
         aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(Time.now);
-        @@aggregate_maps[task_id] = aggregate_maps_element
+        @@aggregate_maps[@task_id][task_id] = aggregate_maps_element
       else
         return if @map_action == "create"
       end
@@ -437,7 +457,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     end
 
     # delete the map if task is ended
-    @@aggregate_maps.delete(task_id) if @end_of_task
+    @@aggregate_maps[@task_id].delete(task_id) if @end_of_task
 
   end
 
@@ -484,15 +504,20 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   # This method is invoked by LogStash every 5 seconds.
   def flush(options = {})
     # Protection against no timeout defined by logstash conf : define a default eviction instance with timeout = DEFAULT_TIMEOUT seconds
-    if
-    @@
-
+    if @@default_timeout.nil?
+      @@default_timeout = DEFAULT_TIMEOUT
+    end
+    if !@@eviction_instance_map.has_key?(@task_id)
+      @@eviction_instance_map[@task_id] = self
+      @timeout = @@default_timeout
+    elsif @@eviction_instance_map[@task_id].timeout.nil?
+      @@eviction_instance_map[@task_id].timeout = @@default_timeout
     end
 
     # Launch eviction only every interval of (@timeout / 2) seconds
-    if
+    if @@eviction_instance_map[@task_id] == self && (!@@last_eviction_timestamp_map.has_key?(@task_id) || Time.now > @@last_eviction_timestamp_map[@task_id] + @timeout / 2)
       events_to_flush = remove_expired_maps()
-      @@
+      @@last_eviction_timestamp_map[@task_id] = Time.now
       return events_to_flush
     end
 
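As a stand-alone restatement of the scheduling rule in the flush hunk above (an assumed simplification, not the plugin's code): eviction for a pattern runs only in the filter instance registered for that pattern, and at most once every `timeout / 2` seconds.

```ruby
# Returns true when a flush call should trigger eviction for the given pattern.
def eviction_due?(last_eviction_timestamp_map, task_id_pattern, timeout)
  last_run = last_eviction_timestamp_map[task_id_pattern]
  last_run.nil? || Time.now > last_run + timeout / 2
end

# Hypothetical usage: with timeout = 120, eviction runs at most every 60 seconds per pattern.
timestamps = {}
puts eviction_due?(timestamps, "%{taskid}", 120)   # => true  (never evicted yet)
timestamps["%{taskid}"] = Time.now
puts eviction_due?(timestamps, "%{taskid}", 120)   # => false (just evicted)
```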
@@ -507,9 +532,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
     @@mutex.synchronize do
 
-      @@aggregate_maps.delete_if do |key, element|
-        if
-          if
+      @@aggregate_maps[@task_id].delete_if do |key, element|
+        if element.creation_timestamp < min_timestamp
+          if @push_previous_map_as_event || @push_map_as_event_on_timeout
             events_to_flush << create_timeout_event(element.map, key)
           end
           next true
@@ -520,6 +545,30 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
     return events_to_flush
   end
+
+  # return if this filter instance has any timeout option enabled in logstash configuration
+  def has_timeout_options?()
+    return (
+      timeout ||
+      timeout_code ||
+      push_map_as_event_on_timeout ||
+      push_previous_map_as_event ||
+      timeout_task_id_field ||
+      !timeout_tags.empty?
+    )
+  end
+
+  # display all possible timeout options
+  def display_timeout_options()
+    return [
+      "timeout",
+      "timeout_code",
+      "push_map_as_event_on_timeout",
+      "push_previous_map_as_event",
+      "timeout_task_id_field",
+      "timeout_tags"
+    ].join(", ")
+  end
 
 end # class LogStash::Filters::Aggregate
 
data/logstash-filter-aggregate.gemspec
CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-filter-aggregate'
-  s.version = '2.3.1'
+  s.version = '2.4.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = "The aim of this filter is to aggregate information available among several events (typically log lines) belonging to a same task, and finally push aggregated information into final task event."
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
data/spec/filters/aggregate_spec.rb
CHANGED
@@ -6,7 +6,7 @@ require_relative "aggregate_spec_helper"
 describe LogStash::Filters::Aggregate do
 
   before(:each) do
-
+    reset_timeout_management()
     aggregate_maps.clear()
     @start_filter = setup_filter({ "map_action" => "create", "code" => "map['sql_duration'] = 0" })
     @update_filter = setup_filter({ "map_action" => "update", "code" => "map['sql_duration'] += event['duration']" })
@@ -17,7 +17,7 @@ describe LogStash::Filters::Aggregate do
     describe "and receiving an event without task_id" do
       it "does not record it" do
         @start_filter.filter(event())
-        expect(aggregate_maps).to be_empty
+        expect(aggregate_maps["%{taskid}"]).to be_empty
       end
     end
     describe "and receiving an event with task_id" do
@@ -25,10 +25,10 @@ describe LogStash::Filters::Aggregate do
         event = start_event("taskid" => "id123")
         @start_filter.filter(event)
 
-        expect(aggregate_maps.size).to eq(1)
-        expect(aggregate_maps["id123"]).not_to be_nil
-        expect(aggregate_maps["id123"].creation_timestamp).to be >= event["@timestamp"]
-        expect(aggregate_maps["id123"].map["sql_duration"]).to eq(0)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
+        expect(aggregate_maps["%{taskid}"]["id123"]).not_to be_nil
+        expect(aggregate_maps["%{taskid}"]["id123"].creation_timestamp).to be >= event["@timestamp"]
+        expect(aggregate_maps["%{taskid}"]["id123"].map["sql_duration"]).to eq(0)
       end
     end
 
@@ -45,9 +45,9 @@ describe LogStash::Filters::Aggregate do
         second_start_event = start_event("taskid" => "id124")
         @start_filter.filter(second_start_event)
 
-        expect(aggregate_maps.size).to eq(1)
-        expect(aggregate_maps["id124"].creation_timestamp).to be < second_start_event["@timestamp"]
-        expect(aggregate_maps["id124"].map["sql_duration"]).to eq(first_update_event["duration"])
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
+        expect(aggregate_maps["%{taskid}"]["id124"].creation_timestamp).to be < second_start_event["@timestamp"]
+        expect(aggregate_maps["%{taskid}"]["id124"].map["sql_duration"]).to eq(first_update_event["duration"])
       end
     end
   end
@@ -59,7 +59,7 @@ describe LogStash::Filters::Aggregate do
       end_event = end_event("taskid" => "id124")
       @end_filter.filter(end_event)
 
-      expect(aggregate_maps).to be_empty
+      expect(aggregate_maps["%{taskid}"]).to be_empty
       expect(end_event["sql_duration"]).to be_nil
     end
   end
@@ -72,7 +72,7 @@ describe LogStash::Filters::Aggregate do
       @task_id_value = "id_123"
       @start_event = start_event({"taskid" => @task_id_value})
       @start_filter.filter(@start_event)
-      expect(aggregate_maps.size).to eq(1)
+      expect(aggregate_maps["%{taskid}"].size).to eq(1)
    end
 
    describe "and receiving an end event" do
@@ -80,7 +80,7 @@ describe LogStash::Filters::Aggregate do
      it "does nothing" do
        end_event = end_event()
        @end_filter.filter(end_event)
-        expect(aggregate_maps.size).to eq(1)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
        expect(end_event["sql_duration"]).to be_nil
      end
    end
@@ -90,21 +90,21 @@ describe LogStash::Filters::Aggregate do
        different_id_value = @task_id_value + "_different"
        @end_filter.filter(end_event("taskid" => different_id_value))
 
-        expect(aggregate_maps.size).to eq(1)
-        expect(aggregate_maps[@task_id_value]).not_to be_nil
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
+        expect(aggregate_maps["%{taskid}"][@task_id_value]).not_to be_nil
      end
    end
 
    describe "and the same id of the 'start event'" do
      it "add 'sql_duration' field to the end event and deletes the aggregate map associated to taskid" do
-        expect(aggregate_maps.size).to eq(1)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
 
        @update_filter.filter(update_event("taskid" => @task_id_value, "duration" => 2))
 
        end_event = end_event("taskid" => @task_id_value)
        @end_filter.filter(end_event)
 
-        expect(aggregate_maps).to be_empty
+        expect(aggregate_maps["%{taskid}"]).to be_empty
        expect(end_event["sql_duration"]).to eq(2)
      end
 
@@ -117,7 +117,7 @@ describe LogStash::Filters::Aggregate do
    it "works as well as with a string task id" do
      start_event = start_event("taskid" => 124)
      @start_filter.filter(start_event)
-      expect(aggregate_maps.size).to eq(1)
+      expect(aggregate_maps["%{taskid}"].size).to eq(1)
    end
  end
 
@@ -138,25 +138,25 @@ describe LogStash::Filters::Aggregate do
      @task_id_value = "id_123"
      @start_event = start_event({"taskid" => @task_id_value})
      @start_filter.filter(@start_event)
-      expect(aggregate_maps.size).to eq(1)
+      expect(aggregate_maps["%{taskid}"].size).to eq(1)
    end
 
    describe "no timeout defined in none filter" do
      it "defines a default timeout on a default filter" do
-
-        expect(
+        reset_timeout_management()
+        expect(taskid_eviction_instance).to be_nil
        @end_filter.flush()
-        expect(
+        expect(taskid_eviction_instance).to eq(@end_filter)
        expect(@end_filter.timeout).to eq(LogStash::Filters::Aggregate::DEFAULT_TIMEOUT)
      end
    end
 
    describe "timeout is defined on another filter" do
-      it "eviction_instance is not updated" do
-        expect(
+      it "taskid eviction_instance is not updated" do
+        expect(taskid_eviction_instance).not_to be_nil
        @start_filter.flush()
-        expect(
-        expect(
+        expect(taskid_eviction_instance).not_to eq(@start_filter)
+        expect(taskid_eviction_instance).to eq(@end_filter)
      end
    end
 
@@ -164,20 +164,20 @@ describe LogStash::Filters::Aggregate do
      it "event is not removed" do
        sleep(2)
        @start_filter.flush()
-        expect(aggregate_maps.size).to eq(1)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
      end
    end
 
    describe "timeout defined on the filter" do
      it "event is not removed if not expired" do
        entries = @end_filter.flush()
-        expect(aggregate_maps.size).to eq(1)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
        expect(entries).to be_empty
      end
      it "removes event if expired and creates a new timeout event" do
        sleep(2)
        entries = @end_filter.flush()
-        expect(aggregate_maps).to be_empty
+        expect(aggregate_maps["%{taskid}"]).to be_empty
        expect(entries.size).to eq(1)
        expect(entries[0]['my_id']).to eq("id_123") # task id
        expect(entries[0]["sql_duration"]).to eq(0) # Aggregation map
@@ -186,21 +186,29 @@ describe LogStash::Filters::Aggregate do
      end
    end
 
+    describe "timeout defined on another filter with another task_id pattern" do
+      it "does not remove event" do
+        another_filter = setup_filter({ "task_id" => "%{another_taskid}", "code" => "", "timeout" => 1 })
+        sleep(2)
+        entries = another_filter.flush()
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
+        expect(entries).to be_empty
+      end
+    end
  end
 
  context "aggregate_maps_path option is defined, " do
    describe "close event append then register event append, " do
      it "stores aggregate maps to configured file and then loads aggregate maps from file" do
-
        store_file = "aggregate_maps"
        expect(File.exist?(store_file)).to be false
 
        store_filter = setup_filter({ "code" => "map['sql_duration'] = 0", "aggregate_maps_path" => store_file })
-        expect(aggregate_maps).to be_empty
+        expect(aggregate_maps["%{taskid}"]).to be_empty
 
        start_event = start_event("taskid" => 124)
        filter = store_filter.filter(start_event)
-        expect(aggregate_maps.size).to eq(1)
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
 
        store_filter.close()
        expect(File.exist?(store_file)).to be true
@@ -208,44 +216,49 @@ describe LogStash::Filters::Aggregate do
 
        store_filter = setup_filter({ "code" => "map['sql_duration'] = 0", "aggregate_maps_path" => store_file })
        expect(File.exist?(store_file)).to be false
-        expect(aggregate_maps.size).to eq(1)
-
+        expect(aggregate_maps["%{taskid}"].size).to eq(1)
      end
    end
 
    describe "when aggregate_maps_path option is defined in 2 instances, " do
      it "raises Logstash::ConfigurationError" do
-
        expect {
          setup_filter({ "code" => "", "aggregate_maps_path" => "aggregate_maps1" })
          setup_filter({ "code" => "", "aggregate_maps_path" => "aggregate_maps2" })
        }.to raise_error(LogStash::ConfigurationError)
-
      end
    end
  end
 
  context "push_previous_map_as_event option is defined, " do
+    describe "when push_previous_map_as_event option is activated on another filter with same task_id pattern" do
+      it "should throw a LogStash::ConfigurationError" do
+        expect {
+          setup_filter({"code" => "map['taskid'] = event['taskid']", "push_previous_map_as_event" => true})
+        }.to raise_error(LogStash::ConfigurationError)
+      end
+    end
+
    describe "when a new task id is detected, " do
      it "should push previous map as new event" do
-        push_filter = setup_filter({ "code" => "map['
-        push_filter.filter(event({"
-        push_filter.filter(event({"
-        expect(aggregate_maps.size).to eq(1)
+        push_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['ppm_id'] = event['ppm_id']", "push_previous_map_as_event" => true, "timeout" => 5 })
+        push_filter.filter(event({"ppm_id" => "1"})) { |yield_event| fail "task 1 shouldn't have yield event" }
+        push_filter.filter(event({"ppm_id" => "2"})) { |yield_event| expect(yield_event["ppm_id"]).to eq("1") }
+        expect(aggregate_maps["%{ppm_id}"].size).to eq(1)
      end
    end
 
    describe "when timeout happens, " do
      it "flush method should return last map as new event" do
-        push_filter = setup_filter({ "code" => "map['
-        push_filter.filter(event({"
+        push_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['ppm_id'] = event['ppm_id']", "push_previous_map_as_event" => true, "timeout" => 1, "timeout_code" => "event['test'] = 'testValue'" })
+        push_filter.filter(event({"ppm_id" => "1"}))
        sleep(2)
        events_to_flush = push_filter.flush()
        expect(events_to_flush).not_to be_nil
        expect(events_to_flush.size).to eq(1)
-        expect(events_to_flush[0]["
+        expect(events_to_flush[0]["ppm_id"]).to eq("1")
        expect(events_to_flush[0]['test']).to eq("testValue")
-        expect(aggregate_maps.size).to eq(0)
+        expect(aggregate_maps["%{ppm_id}"].size).to eq(0)
      end
    end
  end
data/spec/filters/aggregate_spec_helper.rb
CHANGED
@@ -39,11 +39,11 @@ def aggregate_maps()
   LogStash::Filters::Aggregate.class_variable_get(:@@aggregate_maps)
 end
 
-def
-  LogStash::Filters::Aggregate.class_variable_get(:@@
+def taskid_eviction_instance()
+  LogStash::Filters::Aggregate.class_variable_get(:@@eviction_instance_map)["%{taskid}"]
 end
 
-def
-  LogStash::Filters::Aggregate.class_variable_set(:@@
+def reset_timeout_management()
+  LogStash::Filters::Aggregate.class_variable_set(:@@default_timeout, nil)
+  LogStash::Filters::Aggregate.class_variable_get(:@@eviction_instance_map).clear()
 end
-
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-aggregate
 version: !ruby/object:Gem::Version
-  version: 2.3.1
+  version: 2.4.0
 platform: ruby
 authors:
 - Elastic
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2016-10-
+date: 2016-10-15 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement