logstash-filter-aggregate 2.9.2 → 2.11.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 15460afa4f78789d3f4ee56ae45e06e5f31b8ef4
4
- data.tar.gz: f4bd241aa7207b29366d2456d6c42cb6ecc499c3
2
+ SHA256:
3
+ metadata.gz: e58c8dca379dfba4361b616d161cd03b8c32211f48fd696b1acc977e4506c3d3
4
+ data.tar.gz: a6ec0abe65bb04db42f28c2db73095d5bf20f0f23c5c8710547a1834572975dc
5
5
  SHA512:
6
- metadata.gz: 10f18942af2c7cd6f343502478d6d6d2ffc530700b9462e377770a0adbd6bb38b71c2a7debeeaba832aa0aafa9eba85acd041c35a468f2876ad61d0a51566f40
7
- data.tar.gz: d6c92aa3b6bf04bfdd1be6e6b3fafbeb8b31416e20db4ee2c795d3fa20ac193db9a8422634b90df90efed88f135e5941a2f7373707b0e20be886a68a30abd8c7
6
+ metadata.gz: 222a895a6d0106f19bd642a131babb8f4ec2dee84fc1b608d901a3c1865e062e252dc966971e5adb71321128c7292a4d4dd8845b6eea60878d3011ccea968406
7
+ data.tar.gz: e4380712a5b97a69dd7d3e53e634f2fa8caed0fc373f9fdae899d9ae73b2d5e724c07883c1ad582c8851c1f51101cd96c528639a4c0c123fe2b169d762d77de1
data/CHANGELOG.md CHANGED
@@ -1,3 +1,9 @@
1
+ ## 2.11.0
2
+ - new feature: log a warning message when the number of tasks stored in memory exceeds the configured threshold (#125)
3
+
4
+ ## 2.10.0
5
+ - new feature: add ability to generate new event during code execution (#116)
6
+
1
7
  ## 2.9.2
2
8
  - bugfix: remove 'default_timeout' at pipeline level (fix #112)
3
9
  - ci: update travis ci configuration
data/docs/index.asciidoc CHANGED
@@ -26,7 +26,7 @@ include::{include_path}/plugin_header.asciidoc[]
26
26
  The aim of this filter is to aggregate information available among several events (typically log lines) belonging to a same task,
27
27
  and finally push aggregated information into final task event.
28
28
 
29
- You should be very careful to set Logstash filter workers to 1 (`-w 1` flag) for this filter to work correctly
29
+ You should be very careful to set Logstash filter workers to 1 (`-w 1` in [command-line flag](https://www.elastic.co/guide/en/logstash/current/running-logstash-command-line.html#command-line-flags)) for this filter to work correctly
30
30
  otherwise events may be processed out of sequence and unexpected results will occur.
31
31
 
32
32
 
@@ -364,6 +364,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ
364
364
  | <<plugins-{type}s-{plugin}-timeout_tags>> |<<array,array>>|No
365
365
  | <<plugins-{type}s-{plugin}-timeout_task_id_field>> |<<string,string>>|No
366
366
  | <<plugins-{type}s-{plugin}-timeout_timestamp_field>> |<<string,string>>|No
367
+ | <<plugins-{type}s-{plugin}-map_count_warning_threshold>> |<<number,number>>|No
367
368
  |=======================================================================
368
369
 
369
370
  Also see <<plugins-{type}s-{plugin}-common-options>> for a list of options supported by all
@@ -402,7 +403,7 @@ The code to execute to update aggregated map, using current event.
402
403
 
403
404
  Or on the contrary, the code to execute to update event, using aggregated map.
404
405
 
405
- Available variables are :
406
+ Available variables are:
406
407
 
407
408
  `event`: current Logstash event
408
409
 
@@ -411,8 +412,11 @@ Available variables are :
411
412
  `map_meta`: meta informations associated to aggregate map. It allows to set a custom `timeout` or `inactivity_timeout`.
412
413
  It allows also to get `creation_timestamp`, `lastevent_timestamp` and `task_id`.
413
414
 
415
+ `new_event_block`: block used to emit new Logstash events. See the second example on how to use it.
416
+
414
417
  When option push_map_as_event_on_timeout=true, if you set `map_meta.timeout=0` in `code` block, then aggregated map is immediately pushed as a new event.
415
418
 
419
+
416
420
  Example:
417
421
  [source,ruby]
418
422
  filter {
@@ -421,6 +425,26 @@ Example:
421
425
  }
422
426
  }
423
427
 
428
+
429
+ To create additional events during the code execution, to be emitted immediately, you can use `new_event_block.call(event)` function, like in the following example:
430
+
431
+ [source,ruby]
432
+ filter {
433
+ aggregate {
434
+ code => "
435
+ data = {:my_sql_duration => map['sql_duration']}
436
+ generated_event = LogStash::Event.new(data)
437
+ generated_event.set('my_other_field', 34)
438
+ new_event_block.call(generated_event)
439
+ "
440
+ }
441
+ }
442
+
443
+ The parameter of the function `new_event_block.call` must be of type `LogStash::Event`.
444
+ To create such an object, the constructor of the same class can be used: `LogStash::Event.new()`.
445
+ `LogStash::Event.new()` can receive a parameter of type ruby http://ruby-doc.org/core-1.9.1/Hash.html[Hash] to initialize the new event fields.
446
+
447
+
424
448
  [id="plugins-{type}s-{plugin}-end_of_task"]
425
449
  ===== `end_of_task`
426
450
 
@@ -591,6 +615,29 @@ Example:
591
615
  }
592
616
  }
593
617
 
618
+ [id="plugins-{type}s-{plugin}-map_count_warning_threshold"]
619
+ ===== `map_count_warning_threshold`
620
+
621
+ * Value type is <<number,number>>
622
+ * Default value is `5000`
623
+
624
+ Defines the threshold for the number of tasks in memory before a warning is logged.
625
+ When the number of maps for a task_id pattern exceeds this threshold, a warning message is logged to help identify potential memory issues caused by unterminated tasks.
626
+
627
+ The warning is repeated every 20% of the threshold events (e.g., with threshold 5000, warnings appear every 1000 events while above threshold).
628
+
629
+ Set to `0` to disable memory warnings.
630
+
631
+ Example:
632
+ [source,ruby]
633
+ filter {
634
+ aggregate {
635
+ task_id => "%{taskid}"
636
+ code => "map['count'] ||= 0; map['count'] += 1"
637
+ map_count_warning_threshold => 10000
638
+ }
639
+ }
640
+
594
641
 
595
642
  [id="plugins-{type}s-{plugin}-common-options"]
596
643
  include::{include_path}/{type}.asciidoc[]
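Note on `new_event_block`: the documentation hunk above describes the block only from the user's side. The sketch below shows the underlying Ruby mechanism, a lambda that declares a trailing `&new_event_block` parameter and receives whatever block is passed to `call`. It is a minimal illustration, not the plugin's code: `FakeEvent` is a hypothetical stand-in for `LogStash::Event` so the snippet runs without Logstash, and the field names are taken from the documented example.

```ruby
# Hypothetical stand-in for LogStash::Event (assumption, for illustration only).
FakeEvent = Struct.new(:data) do
  def get(key)
    data[key]
  end

  def set(key, value)
    data[key] = value
  end
end

# Same parameter list as the lambda shown in this diff:
# the trailing &new_event_block captures the block given to #call.
codeblock = lambda do |event, map, map_meta, &new_event_block|
  map['sql_duration'] = event.get('duration')
  generated = FakeEvent.new({ 'my_sql_duration' => map['sql_duration'] })
  generated.set('my_other_field', 34)
  # Hand the generated event back to the caller immediately.
  new_event_block.call(generated)
end

emitted = []
map = {}
# The block passed here plays the role of the block Logstash gives to filter().
codeblock.call(FakeEvent.new({ 'duration' => 2 }), map, nil) { |e| emitted << e }

puts emitted.length
puts emitted.first.get('my_sql_duration')
```

If `call` is invoked without a block, `new_event_block` is `nil` inside the lambda, so user code that may run in both situations would presumably need a `nil` check before calling it.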
@@ -42,6 +42,8 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
42
42
 
43
43
  config :timeout_tags, :validate => :array, :required => false, :default => []
44
44
 
45
+ config :map_count_warning_threshold, :validate => :number, :required => false, :default => 5000
46
+
45
47
 
46
48
  # ################## #
47
49
  # INSTANCE VARIABLES #
@@ -62,6 +64,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
62
64
  # Default timeout (in seconds) when not defined in plugin configuration
63
65
  DEFAULT_TIMEOUT = 1800
64
66
 
67
+ # Warning frequency divisor: warn every (threshold / divisor) events when above threshold
68
+ WARNING_FREQUENCY_DIVISOR = 5
69
+
65
70
  # Store all shared aggregate attributes per pipeline id
66
71
  @@pipelines = {}
67
72
 
@@ -83,7 +88,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
83
88
  end
84
89
 
85
90
  # process lambda expression to call in each filter call
86
- eval("@codeblock = lambda { |event, map, map_meta| #{@code} }", binding, "(aggregate filter code)")
91
+ eval("@codeblock = lambda { |event, map, map_meta, &new_event_block| #{@code} }", binding, "(aggregate filter code)")
87
92
 
88
93
  # process lambda expression to call in the timeout case or previous event case
89
94
  if @timeout_code
@@ -138,6 +143,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
138
143
  @current_pipeline.aggregate_maps[@task_id] ||= {}
139
144
  update_aggregate_maps_metric()
140
145
 
146
+ # calculate warning frequency (warn every threshold/divisor events, minimum 1)
147
+ @map_count_warning_frequency = [@map_count_warning_threshold / WARNING_FREQUENCY_DIVISOR, 1].max
148
+
141
149
  end
142
150
  end
143
151
 
@@ -168,7 +176,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
168
176
 
169
177
  # This method is invoked each time an event matches the filter
170
178
  public
171
- def filter(event)
179
+ def filter(event, &new_event_block)
172
180
 
173
181
  # define task id
174
182
  task_id = event.sprintf(@task_id)
@@ -180,6 +188,8 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
180
188
  # protect aggregate_maps against concurrent access, using a mutex
181
189
  @current_pipeline.mutex.synchronize do
182
190
 
191
+ check_map_count_warning()
192
+
183
193
  # if timeout is based on event timestamp, check if task_id map is expired and should be removed
184
194
  if @timeout_timestamp_field
185
195
  event_to_yield = remove_expired_map_based_on_event_timestamp(task_id, event)
@@ -213,7 +223,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
213
223
  # execute the code to read/update map and event
214
224
  map = aggregate_maps_element.map
215
225
  begin
216
- @codeblock.call(event, map, aggregate_maps_element)
226
+ @codeblock.call(event, map, aggregate_maps_element, &new_event_block)
217
227
  @logger.debug("Aggregate successful filter code execution", :code => @code)
218
228
  noError = true
219
229
  rescue => exception
@@ -485,6 +495,26 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
485
495
  end
486
496
  end
487
497
 
498
+ # checks if map count exceeds warning threshold and logs a warning if so
499
+ def check_map_count_warning()
500
+ return if @map_count_warning_threshold == 0
501
+
502
+ map_count = @current_pipeline.aggregate_maps[@task_id].length
503
+ if map_count >= @map_count_warning_threshold
504
+ @events_since_last_warning ||= 0
505
+ if @events_since_last_warning == 0
506
+ @logger.warn("Aggregate filter memory warning: task_id pattern '#{@task_id}' has #{map_count} maps in memory (threshold: #{@map_count_warning_threshold}).",
507
+ :task_id_pattern => @task_id,
508
+ :map_count => map_count,
509
+ :threshold => @map_count_warning_threshold)
510
+ end
511
+ @events_since_last_warning = (@events_since_last_warning + 1) % @map_count_warning_frequency
512
+ else
513
+ @events_since_last_warning = 0
514
+ end
515
+
516
+ end
517
+
488
518
  end # class LogStash::Filters::Aggregate
489
519
 
490
520
  # Element of "aggregate_maps"
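Note on the warning cadence: the counter logic in `check_map_count_warning` is easier to follow in isolation. The sketch below restates it as a standalone class so the cadence can be checked without Logstash. `WarningCadence` and `warnings_for` are hypothetical names introduced for illustration. The assumption that each event creates exactly one new map, and that the check runs before the map is stored, is inferred from the hunks and specs in this diff.

```ruby
# Standalone restatement of the cadence logic (hypothetical class name).
class WarningCadence
  WARNING_FREQUENCY_DIVISOR = 5

  def initialize(threshold)
    @threshold = threshold
    # Integer division clamped to at least 1, mirroring the diff.
    @frequency = [threshold / WARNING_FREQUENCY_DIVISOR, 1].max
    @events_since_last_warning = 0
  end

  # Returns true when a warning would be logged for this event.
  def warn?(map_count)
    return false if @threshold == 0

    if map_count >= @threshold
      warn_now = @events_since_last_warning.zero?
      @events_since_last_warning = (@events_since_last_warning + 1) % @frequency
      warn_now
    else
      # Dropping below the threshold resets the counter.
      @events_since_last_warning = 0
      false
    end
  end
end

# Counts warnings for `events` consecutive events, assuming the map count
# seen by the check grows by one per event, starting at 0.
def warnings_for(threshold, events)
  cadence = WarningCadence.new(threshold)
  (0...events).count { |map_count| cadence.warn?(map_count) }
end

puts warnings_for(3, 4)
puts warnings_for(5, 7)
puts warnings_for(5000, 7001)
puts warnings_for(0, 7)
```

For thresholds below 5 the integer division yields 0 and the frequency is clamped to 1, so a warning is logged on every event while the count stays at or above the threshold.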
@@ -1,8 +1,8 @@
1
1
  Gem::Specification.new do |s|
2
2
  s.name = 'logstash-filter-aggregate'
3
- s.version = '2.9.2'
4
- s.licenses = ['Apache License (2.0)']
5
- s.summary = "Aggregates information from several events originating with a single task"
3
+ s.version = '2.11.0'
4
+ s.licenses = ['Apache-2.0']
5
+ s.summary = 'Aggregates information from several events originating with a single task'
6
6
  s.description = 'This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program'
7
7
  s.authors = ['Elastic', 'Fabien Baligand']
8
8
  s.email = 'info@elastic.co'
@@ -420,4 +420,50 @@ describe LogStash::Filters::Aggregate do
420
420
  end
421
421
  end
422
422
 
423
- end
423
+ context "Custom event generation code is used" do
424
+ describe "when a new event is manually generated" do
425
+ it "should push a new event immediately" do
426
+ agg_filter = setup_filter({ "task_id" => "%{task_id}", "code" => "map['sql_duration'] = 2; new_event_block.call(LogStash::Event.new({:my_sql_duration => map['sql_duration']}))", "timeout" => 120 })
427
+ agg_filter.filter(event({"task_id" => "1"})) do |yield_event|
428
+ expect(yield_event).not_to be_nil
429
+ expect(yield_event.get("my_sql_duration")).to eq(2)
430
+ end
431
+ end
432
+ end
433
+
434
+ end
435
+
436
+ context "map_count_warning_threshold option" do
437
+ describe "when threshold is set to 0 (disabled)" do
438
+ it "should not log any warning" do
439
+ filter = setup_filter({ "task_id" => "%{warn_id}", "code" => "", "map_count_warning_threshold" => 0 })
440
+ expect(filter.logger).not_to receive(:warn)
441
+ 3.times { |i| filter.filter(event({"warn_id" => "task_#{i}"})) }
442
+ end
443
+ end
444
+
445
+ describe "when map count reaches threshold" do
446
+ it "should log a warning" do
447
+ filter = setup_filter({ "task_id" => "%{warn_id}", "code" => "", "map_count_warning_threshold" => 3 })
448
+ expect(filter.logger).to receive(:warn).with(
449
+ /Aggregate filter memory warning.*has 3 maps in memory/,
450
+ hash_including(:task_id_pattern => "%{warn_id}", :map_count => 3, :threshold => 3)
451
+ ).once
452
+ 4.times { |i| filter.filter(event({"warn_id" => "task_#{i}"})) }
453
+ end
454
+
455
+ it "should log warning every 20% of map_count_warning_threshold occurrences" do
456
+ filter = setup_filter({ "task_id" => "%{warn_id}", "code" => "", "map_count_warning_threshold" => 5 })
457
+ expect(filter.logger).to receive(:warn).twice
458
+ 7.times { |i| filter.filter(event({"warn_id" => "task_#{i}"})) }
459
+ end
460
+ end
461
+
462
+ describe "when using default threshold" do
463
+ it "should have default threshold of 5000" do
464
+ filter = setup_filter({ "task_id" => "%{warn_id}", "code" => "" })
465
+ expect(filter.map_count_warning_threshold).to eq(5000)
466
+ end
467
+ end
468
+ end
469
+ end
metadata CHANGED
@@ -1,17 +1,17 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: logstash-filter-aggregate
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.9.2
4
+ version: 2.11.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Elastic
8
8
  - Fabien Baligand
9
- autorequire:
10
9
  bindir: bin
11
10
  cert_chain: []
12
- date: 2021-04-25 00:00:00.000000000 Z
11
+ date: 2026-02-06 00:00:00.000000000 Z
13
12
  dependencies:
14
13
  - !ruby/object:Gem::Dependency
14
+ name: logstash-core-plugin-api
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
17
  - - ">="
@@ -20,9 +20,8 @@ dependencies:
20
20
  - - "<="
21
21
  - !ruby/object:Gem::Version
22
22
  version: '2.99'
23
- name: logstash-core-plugin-api
24
- prerelease: false
25
23
  type: :runtime
24
+ prerelease: false
26
25
  version_requirements: !ruby/object:Gem::Requirement
27
26
  requirements:
28
27
  - - ">="
@@ -32,20 +31,22 @@ dependencies:
32
31
  - !ruby/object:Gem::Version
33
32
  version: '2.99'
34
33
  - !ruby/object:Gem::Dependency
34
+ name: logstash-devutils
35
35
  requirement: !ruby/object:Gem::Requirement
36
36
  requirements:
37
37
  - - ">="
38
38
  - !ruby/object:Gem::Version
39
39
  version: '0'
40
- name: logstash-devutils
41
- prerelease: false
42
40
  type: :development
41
+ prerelease: false
43
42
  version_requirements: !ruby/object:Gem::Requirement
44
43
  requirements:
45
44
  - - ">="
46
45
  - !ruby/object:Gem::Version
47
46
  version: '0'
48
- description: This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program
47
+ description: This gem is a Logstash plugin required to be installed on top of the
48
+ Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This
49
+ gem is not a stand-alone program
49
50
  email: info@elastic.co
50
51
  executables: []
51
52
  extensions: []
@@ -65,11 +66,10 @@ files:
65
66
  - spec/filters/aggregate_spec_helper.rb
66
67
  homepage: https://github.com/logstash-plugins/logstash-filter-aggregate
67
68
  licenses:
68
- - Apache License (2.0)
69
+ - Apache-2.0
69
70
  metadata:
70
71
  logstash_plugin: 'true'
71
72
  logstash_group: filter
72
- post_install_message:
73
73
  rdoc_options: []
74
74
  require_paths:
75
75
  - lib
@@ -84,9 +84,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
84
84
  - !ruby/object:Gem::Version
85
85
  version: '0'
86
86
  requirements: []
87
- rubyforge_project:
88
- rubygems_version: 2.4.8
89
- signing_key:
87
+ rubygems_version: 3.6.3
90
88
  specification_version: 4
91
89
  summary: Aggregates information from several events originating with a single task
92
90
  test_files: