logstash-filter-aggregate 2.6.4 → 2.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -0
- data/docs/index.asciidoc +29 -16
- data/lib/logstash/filters/aggregate.rb +105 -75
- data/logstash-filter-aggregate.gemspec +1 -1
- data/spec/filters/aggregate_spec.rb +8 -8
- data/spec/filters/aggregate_spec_helper.rb +22 -13
- metadata +2 -2
checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 155d6cb60a93bfdc3fb4d14e67c775e59e0b25cc
+  data.tar.gz: 3d61aa6dbd824619fe613d295f958d5e4abe70a7
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ab53e0c19de46018254cfc5f545d96a6796556321a8841775aff1c69d1b1ed02c0e3f8df1b89a9934f3e6475f1d720b6215b81f530000275c4ff7b7cd8f6ecc4
+  data.tar.gz: 1b7a7d1fc22c049b0593c84cde188c3bcc1654ae267840789786110d9bb78e1fe0917f3dcb7393173a9b0bf37588d14358252b959b438716761034466731cf46
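The checksums above are the standard RubyGems release digests: the SHA1 and SHA512 of the gem's `metadata.gz` and `data.tar.gz` archives. As a hedged illustration (the `package_digests` helper is made up for this sketch, not part of the gem or of RubyGems), such digests can be recomputed in Ruby with the stdlib:

```ruby
require "digest"

# Hypothetical helper (not part of the gem): compute the two digests that
# a RubyGems checksums.yaml records for each archive in a release.
def package_digests(path)
  data = File.binread(path)
  {
    "SHA1"   => Digest::SHA1.hexdigest(data),
    "SHA512" => Digest::SHA512.hexdigest(data)
  }
end

# Any file works for illustration; a real check would read metadata.gz / data.tar.gz
# and compare the results against the values stored in checksums.yaml.
digests = package_digests(__FILE__)
puts digests["SHA1"].length   # SHA1 hex digests are 40 characters, as above
puts digests["SHA512"].length # SHA512 hex digests are 128 characters
```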
data/CHANGELOG.md CHANGED

@@ -1,3 +1,9 @@
+## 2.7.0
+- new feature: add support for multiple pipelines (for Logstash 6.0+)
+  aggregate maps, timeout options, and aggregate_maps_path are now stored per pipeline.
+  each pipeline is independent.
+- docs: fix break lines in documentation examples
+
 ## 2.6.4
 - bugfix: fix an NPE issue at Logstash 6.0 shutdown
 - docs: remove all redundant documentation in aggregate.rb (now only present in docs/index.asciidoc)
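The headline 2.7.0 change, storing shared aggregate state per pipeline, boils down to replacing process-wide class variables with a registry keyed by pipeline id. A minimal sketch of that pattern; the names here (`PipelineContext`, `REGISTRY`, `context_for`) are simplified stand-ins for the plugin's actual `Pipeline` class and `@@pipelines` variable:

```ruby
# Simplified stand-in for the plugin's Pipeline class: each Logstash
# pipeline gets its own maps, mutex, and default timeout.
class PipelineContext
  attr_accessor :aggregate_maps, :mutex, :default_timeout

  def initialize
    @aggregate_maps = {}   # per task_id pattern, then per task_id value
    @mutex = Mutex.new     # guards aggregate_maps
    @default_timeout = nil
  end
end

# Process-wide registry keyed by pipeline id, like @@pipelines in 2.7.0.
REGISTRY = {}

def context_for(pipeline_id)
  REGISTRY[pipeline_id] ||= PipelineContext.new
end

main = context_for("main")
main.aggregate_maps["%{taskid}"] = { "42" => { "sql_duration" => 0 } }

puts main.equal?(context_for("main"))            # same id -> same context
puts context_for("second").aggregate_maps.empty? # other pipelines are independent
```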
data/docs/index.asciidoc CHANGED

@@ -195,9 +195,11 @@ filter {
 [id="plugins-{type}s-{plugin}-example4"]
 ==== Example #4 : no end event and tasks come one after the other
 
-Fourth use case : like example #3, you have no specific end event, but also, tasks come one after the other.
-
-
+Fourth use case : like example #3, you have no specific end event, but also, tasks come one after the other.
+
+That is to say : tasks are not interlaced. All task1 events come, then all task2 events come, ...
+
+In that case, you don't want to wait task timeout to flush aggregation map.
 
 * A typical case is aggregating results from jdbc input plugin.
 * Given that you have this SQL query : `SELECT country_name, town_name FROM town`

@@ -245,20 +247,31 @@ In that case, you don't want to wait task timeout to flush aggregation map. +
 [id="plugins-{type}s-{plugin}-example5"]
 ==== Example #5 : no end event and push events as soon as possible
 
-Fifth use case: like example #3, there is no end event.
-
-
+Fifth use case: like example #3, there is no end event.
+
+Events keep coming for an indefinite time and you want to push the aggregation map as soon as possible after the last user interaction, without waiting for the `timeout`.
+
+This allows the aggregated events to be pushed closer to real time.
+
+
+A typical case is aggregating or tracking user behaviour.
+
+We can track a user by its ID through the events; however, once the user stops interacting, the events stop coming in.
+
+There is no specific event indicating the end of the user's interaction.
+
+The user interaction will be considered as ended when no events for the specified user (task_id) arrive after the specified `inactivity_timeout`.
+
+If the user continues interacting for longer than `timeout` seconds (since the first event), the aggregation map will still be deleted and pushed as a new event when the timeout occurs.
 
-A typical case is aggregating or tracking user behaviour. +
-We can track a user by its ID through the events, however once the user stops interacting, the events stop coming in. +
-There is no specific event indicating the end of the user's interaction. +
-The user ineraction will be considered as ended when no events for the specified user (task_id) arrive after the specified inactivity_timeout`. +
-If the user continues interacting for longer than `timeout` seconds (since first event), the aggregation map will still be deleted and pushed as a new event when timeout occurs. +
 The difference with example #3 is that the events will be pushed as soon as the user stops interacting for `inactivity_timeout` seconds instead of waiting for the end of `timeout` seconds since first event.
 
-In this case, we can enable the option 'push_map_as_event_on_timeout' to enable pushing the aggregation map as a new event when inactivity timeout occurs.
-
-
+In this case, we can enable the option 'push_map_as_event_on_timeout' to push the aggregation map as a new event when the inactivity timeout occurs.
+
+In addition, we can enable 'timeout_code' to execute code on the populated timeout event.
+
+We can also add 'timeout_task_id_field' so we can correlate the task_id, which in this case would be the user's ID.
+
 
 * Given these logs:
 

@@ -315,7 +328,7 @@ filter {
 * an aggregate map is tied to one task_id value which is tied to one task_id pattern. So if you have 2 filters with different task_id patterns, even if you have same task_id value, they won't share the same aggregate map.
 * in one filter configuration, it is recommended to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
 * if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
-* all timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
+* all timeout options have to be defined in only one aggregate filter per task_id pattern (per pipeline). Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
 * if `code` execution raises an exception, the error is logged and event is tagged '_aggregateexception'
 
 

@@ -366,7 +379,7 @@ The path to file where aggregate maps are stored when Logstash stops
 and are loaded from when Logstash starts.
 
 If not defined, aggregate maps will not be stored at Logstash stop and will be lost.
-Must be defined in only one aggregate filter (as aggregate maps are
+Must be defined in only one aggregate filter per pipeline (as aggregate maps are shared at pipeline level).
 
 Example:
 [source,ruby]
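Examples #4 and #5 both hinge on when an aggregate map is considered expired: either its age exceeds `timeout` (counted from the first event), or no event has arrived for `inactivity_timeout` seconds. A standalone sketch of that rule, mirroring the condition used in the plugin's `remove_expired_maps()`; the `expired?` helper is illustrative, not part of the plugin's API:

```ruby
# Illustrative helper (not the plugin's API): should this aggregate map be
# flushed? True when the map is older than `timeout` (since its first event)
# or when no event has been received for `inactivity_timeout` seconds.
def expired?(creation_timestamp, lastevent_timestamp, timeout:, inactivity_timeout:, now: Time.now)
  min_timestamp = now - timeout
  min_inactivity_timestamp = now - inactivity_timeout
  creation_timestamp < min_timestamp || lastevent_timestamp < min_inactivity_timestamp
end

now = Time.now
# started 10s ago, last event 1s ago: still alive (timeout 30, inactivity_timeout 5)
puts expired?(now - 10, now - 1, timeout: 30, inactivity_timeout: 5, now: now)
# last event 6s ago: the 5s inactivity_timeout has elapsed
puts expired?(now - 10, now - 6, timeout: 30, inactivity_timeout: 5, now: now)
# started 40s ago: the 30s timeout has elapsed, even with recent events
puts expired?(now - 40, now - 1, timeout: 30, inactivity_timeout: 5, now: now)
```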
data/lib/logstash/filters/aggregate.rb CHANGED

@@ -41,6 +41,15 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 config :timeout_tags, :validate => :array, :required => false, :default => []
 
 
+# ################## #
+# INSTANCE VARIABLES #
+# ################## #
+
+
+# pointer to current pipeline context
+attr_accessor :current_pipeline
+
+
 # ################ #
 # STATIC VARIABLES #
 # ################ #

@@ -48,29 +57,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 # Default timeout (in seconds) when not defined in plugin configuration
 DEFAULT_TIMEOUT = 1800
-
-#
-
-@@aggregate_maps = {}
-
-# Mutex used to synchronize access to 'aggregate_maps'
-@@mutex = Mutex.new
-
-# Default timeout for task_id patterns where timeout is not defined in Logstash filter configuration
-@@default_timeout = nil
-
-# For each "task_id" pattern, defines which Aggregate instance will process flush() call, processing expired Aggregate elements (older than timeout)
-# For each entry, key is "task_id pattern" and value is "aggregate instance"
-@@flush_instance_map = {}
-
-# last time where timeout management in flush() method was launched, per "task_id" pattern
-@@last_flush_timestamp_map = {}
-
-# flag indicating if aggregate_maps_path option has been already set on one aggregate instance
-@@aggregate_maps_path_set = false
-
-# defines which Aggregate instance will close Aggregate static variables
-@@static_close_instance = nil
+
+# Store all shared aggregate attributes per pipeline id
+@@pipelines = {}
 
 
 # ####### #

@@ -88,7 +77,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 if !@task_id.match(/%\{.+\}/)
 raise LogStash::ConfigurationError, "Aggregate plugin: task_id pattern '#{@task_id}' must contain a dynamic expression like '%{field}'"
 end
-
+
 # process lambda expression to call in each filter call
 eval("@codeblock = lambda { |event, map| #{@code} }", binding, "(aggregate filter code)")
 

@@ -97,54 +86,60 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 eval("@timeout_codeblock = lambda { |event| #{@timeout_code} }", binding, "(aggregate filter timeout code)")
 end
 
-
+# init pipeline context
+@@pipelines[pipeline_id] ||= LogStash::Filters::Aggregate::Pipeline.new();
+@current_pipeline = @@pipelines[pipeline_id]
+
+@current_pipeline.mutex.synchronize do
 
 # timeout management : define eviction_instance for current task_id pattern
 if has_timeout_options?
-if
+if @current_pipeline.flush_instance_map.has_key?(@task_id)
 # all timeout options have to be defined in only one aggregate filter per task_id pattern
 raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern '#{@task_id}', there are more than one filter which defines timeout options. All timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : #{display_timeout_options}"
 end
-
+@current_pipeline.flush_instance_map[@task_id] = self
 @logger.debug("Aggregate timeout for '#{@task_id}' pattern: #{@timeout} seconds")
 end
 
 # timeout management : define default_timeout
-if !@timeout.nil? && (
-
+if !@timeout.nil? && (@current_pipeline.default_timeout.nil? || @timeout < @current_pipeline.default_timeout)
+@current_pipeline.default_timeout = @timeout
 @logger.debug("Aggregate default timeout: #{@timeout} seconds")
 end
 
 # inactivity timeout management: make sure it is lower than timeout
-if !@inactivity_timeout.nil? && ((!@timeout.nil? && @inactivity_timeout > @timeout) || (
+if !@inactivity_timeout.nil? && ((!@timeout.nil? && @inactivity_timeout > @timeout) || (!@current_pipeline.default_timeout.nil? && @inactivity_timeout > @current_pipeline.default_timeout))
 raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern #{@task_id}, inactivity_timeout must be lower than timeout"
 end
 
-# reinit
-if
-
+# reinit pipeline_close_instance (if necessary)
+if !@current_pipeline.aggregate_maps_path_set && !@current_pipeline.pipeline_close_instance.nil?
+@current_pipeline.pipeline_close_instance = nil
 end
 
-# check if aggregate_maps_path option has already been set on another instance else set
+# check if aggregate_maps_path option has already been set on another instance else set @current_pipeline.aggregate_maps_path_set
 if !@aggregate_maps_path.nil?
-if
-
+if @current_pipeline.aggregate_maps_path_set
+@current_pipeline.aggregate_maps_path_set = false
 raise LogStash::ConfigurationError, "Aggregate plugin: Option 'aggregate_maps_path' must be set on only one aggregate filter"
 else
-
-
+@current_pipeline.aggregate_maps_path_set = true
+@current_pipeline.pipeline_close_instance = self
 end
 end
 
 # load aggregate maps from file (if option defined)
 if !@aggregate_maps_path.nil? && File.exist?(@aggregate_maps_path)
-File.open(@aggregate_maps_path, "r") { |from_file|
+File.open(@aggregate_maps_path, "r") { |from_file| @current_pipeline.aggregate_maps.merge!(Marshal.load(from_file)) }
 File.delete(@aggregate_maps_path)
 @logger.info("Aggregate maps loaded from : #{@aggregate_maps_path}")
 end
 
 # init aggregate_maps
-
+@current_pipeline.aggregate_maps[@task_id] ||= {}
+
+
 end
 end
 

@@ -154,25 +149,21 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 @logger.debug("Aggregate close call", :code => @code)
 
-# define
-
+# define pipeline close instance if none is already defined
+@current_pipeline.pipeline_close_instance = self if @current_pipeline.pipeline_close_instance.nil?
 
-if
+if @current_pipeline.pipeline_close_instance == self
 # store aggregate maps to file (if option defined)
-
-
-if !@aggregate_maps_path.nil? &&
-File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(
+@current_pipeline.mutex.synchronize do
+@current_pipeline.aggregate_maps.delete_if { |key, value| value.empty? }
+if !@aggregate_maps_path.nil? && !@current_pipeline.aggregate_maps.empty?
+File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(@current_pipeline.aggregate_maps, to_file) }
 @logger.info("Aggregate maps stored to : #{@aggregate_maps_path}")
 end
-@@aggregate_maps.clear()
 end
 
-#
-@@
-@@flush_instance_map = {}
-@@last_flush_timestamp_map = {}
-@@aggregate_maps_path_set = false
+# remove pipeline context for Logstash reload
+@@pipelines.delete(pipeline_id)
 end
 
 end

@@ -189,21 +180,21 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 event_to_yield = nil
 
 # protect aggregate_maps against concurrent access, using a mutex
-
+@current_pipeline.mutex.synchronize do
 
 # retrieve the current aggregate map
-aggregate_maps_element =
+aggregate_maps_element = @current_pipeline.aggregate_maps[@task_id][task_id]
 
 
 # create aggregate map, if it doesn't exist
 if aggregate_maps_element.nil?
 return if @map_action == "update"
 # create new event from previous map, if @push_previous_map_as_event is enabled
-if @push_previous_map_as_event &&
+if @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
 event_to_yield = extract_previous_map_as_event()
 end
 aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(Time.now);
-
+@current_pipeline.aggregate_maps[@task_id][task_id] = aggregate_maps_element
 else
 return if @map_action == "create"
 end

@@ -225,7 +216,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 end
 
 # delete the map if task is ended
-
+@current_pipeline.aggregate_maps[@task_id].delete(task_id) if @end_of_task
 
 end
 

@@ -272,7 +263,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 # Extract the previous map in aggregate maps, and return it as a new Logstash event
 def extract_previous_map_as_event
-previous_entry =
+previous_entry = @current_pipeline.aggregate_maps[@task_id].shift()
 previous_task_id = previous_entry[0]
 previous_map = previous_entry[1].map
 return create_timeout_event(previous_map, previous_task_id)

@@ -289,26 +280,26 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 @logger.debug("Aggregate flush call with #{options}")
 
 # Protection against no timeout defined by Logstash conf : define a default eviction instance with timeout = DEFAULT_TIMEOUT seconds
-if
-
+if @current_pipeline.default_timeout.nil?
+@current_pipeline.default_timeout = DEFAULT_TIMEOUT
 end
-if
-
-@timeout =
-elsif
-
+if !@current_pipeline.flush_instance_map.has_key?(@task_id)
+@current_pipeline.flush_instance_map[@task_id] = self
+@timeout = @current_pipeline.default_timeout
+elsif @current_pipeline.flush_instance_map[@task_id].timeout.nil?
+@current_pipeline.flush_instance_map[@task_id].timeout = @current_pipeline.default_timeout
 end
 
-if
-
+if @current_pipeline.flush_instance_map[@task_id].inactivity_timeout.nil?
+@current_pipeline.flush_instance_map[@task_id].inactivity_timeout = @current_pipeline.flush_instance_map[@task_id].timeout
 end
 
 # Launch timeout management only every interval of (@inactivity_timeout / 2) seconds or at Logstash shutdown
-if
+if @current_pipeline.flush_instance_map[@task_id] == self && !@current_pipeline.aggregate_maps[@task_id].nil? && (!@current_pipeline.last_flush_timestamp_map.has_key?(@task_id) || Time.now > @current_pipeline.last_flush_timestamp_map[@task_id] + @inactivity_timeout / 2 || options[:final])
 events_to_flush = remove_expired_maps()
 
 # at Logstash shutdown, if push_previous_map_as_event is enabled, it's important to force flush (particularly for jdbc input plugin)
-if options[:final] && @push_previous_map_as_event &&
+if options[:final] && @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
 events_to_flush << extract_previous_map_as_event()
 end
 

@@ -318,7 +309,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 end
 
 # update last flush timestamp
-
+@current_pipeline.last_flush_timestamp_map[@task_id] = Time.now
 
 # return events to flush into Logstash pipeline
 return events_to_flush

@@ -329,18 +320,18 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 end
 
 
-# Remove the expired Aggregate maps from
+# Remove the expired Aggregate maps from @current_pipeline.aggregate_maps if they are older than timeout or if no new event has been received since inactivity_timeout.
 # If @push_previous_map_as_event option is set, or @push_map_as_event_on_timeout is set, expired maps are returned as new events to be flushed to Logstash pipeline.
 def remove_expired_maps()
 events_to_flush = []
 min_timestamp = Time.now - @timeout
 min_inactivity_timestamp = Time.now - @inactivity_timeout
 
-
+@current_pipeline.mutex.synchronize do
 
-@logger.debug("Aggregate remove_expired_maps call with '#{@task_id}' pattern and #{
+@logger.debug("Aggregate remove_expired_maps call with '#{@task_id}' pattern and #{@current_pipeline.aggregate_maps[@task_id].length} maps")
 
-
+@current_pipeline.aggregate_maps[@task_id].delete_if do |key, element|
 if element.creation_timestamp < min_timestamp || element.lastevent_timestamp < min_inactivity_timestamp
 if @push_previous_map_as_event || @push_map_as_event_on_timeout
 events_to_flush << create_timeout_event(element.map, key)

@@ -379,6 +370,15 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 "timeout_tags"
 ].join(", ")
 end
+
+# return current pipeline id
+def pipeline_id()
+if @execution_context
+return @execution_context.pipeline_id
+else
+return pipeline_id = "main"
+end
+end
 
 end # class LogStash::Filters::Aggregate
 

@@ -393,3 +393,33 @@ class LogStash::Filters::Aggregate::Element
 @map = {}
 end
 end
+
+# shared aggregate attributes for each pipeline
+class LogStash::Filters::Aggregate::Pipeline
+
+attr_accessor :aggregate_maps, :mutex, :default_timeout, :flush_instance_map, :last_flush_timestamp_map, :aggregate_maps_path_set, :pipeline_close_instance
+
+def initialize()
+# Stores all aggregate maps, per task_id pattern, then per task_id value
+@aggregate_maps = {}
+
+# Mutex used to synchronize access to 'aggregate_maps'
+@mutex = Mutex.new
+
+# Default timeout for task_id patterns where timeout is not defined in Logstash filter configuration
+@default_timeout = nil
+
+# For each "task_id" pattern, defines which Aggregate instance will process flush() call, processing expired Aggregate elements (older than timeout)
+# For each entry, key is "task_id pattern" and value is "aggregate instance"
+@flush_instance_map = {}
+
+# last time where timeout management in flush() method was launched, per "task_id" pattern
+@last_flush_timestamp_map = {}
+
+# flag indicating if aggregate_maps_path option has been already set on one aggregate instance
+@aggregate_maps_path_set = false
+
+# defines which Aggregate instance will close Aggregate variables associated to current pipeline
+@pipeline_close_instance = nil
+end
+end
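The register/close changes above key all shared state off `pipeline_id`, falling back to "main" when no execution context is available, and drop the whole context on close so a Logstash reload starts clean. A simplified model of that lifecycle; `PIPELINES`, `Context`, `ExecutionContext`, and the free-standing methods are stand-ins for illustration, not the plugin's API:

```ruby
# Toy model of the 2.7.0 lifecycle: PIPELINES stands in for @@pipelines and
# Context for LogStash::Filters::Aggregate::Pipeline.
PIPELINES = {}
Context = Struct.new(:aggregate_maps)
ExecutionContext = Struct.new(:pipeline_id)

# Mirrors the plugin's pipeline_id(): use the execution context's id when
# present, otherwise fall back to "main".
def pipeline_id(execution_context)
  execution_context ? execution_context.pipeline_id : "main"
end

def register(execution_context)
  id = pipeline_id(execution_context)
  PIPELINES[id] ||= Context.new({}) # init pipeline context on first use
  PIPELINES[id]
end

def close(execution_context)
  # remove pipeline context, so a Logstash reload starts from scratch
  PIPELINES.delete(pipeline_id(execution_context))
end

ctx = ExecutionContext.new("second")
register(ctx)
register(nil)                    # no execution context: keyed as "main"
puts PIPELINES.keys.sort.inspect # both pipelines registered
close(ctx)
puts PIPELINES.key?("second")    # "second" cleaned up, "main" remains
```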
data/logstash-filter-aggregate.gemspec CHANGED

@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
 s.name = 'logstash-filter-aggregate'
-s.version = '2.
+s.version = '2.7.0'
 s.licenses = ['Apache License (2.0)']
 s.summary = 'The aim of this filter is to aggregate information available among several events (typically log lines) belonging to a same task, and finally push aggregated information into final task event.'
 s.description = 'This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program'
data/spec/filters/aggregate_spec.rb CHANGED

@@ -6,7 +6,7 @@ require_relative "aggregate_spec_helper"
 describe LogStash::Filters::Aggregate do
 
 before(:each) do
-
+reset_pipeline_variables()
 @start_filter = setup_filter({ "map_action" => "create", "code" => "map['sql_duration'] = 0" })
 @update_filter = setup_filter({ "map_action" => "update", "code" => "map['sql_duration'] += event.get('duration')" })
 @end_filter = setup_filter({"timeout_task_id_field" => "my_id", "push_map_as_event_on_timeout" => true, "map_action" => "update", "code" => "event.set('sql_duration', map['sql_duration'])", "end_of_task" => true, "timeout" => 5, "inactivity_timeout" => 2, "timeout_code" => "event.set('test', 'testValue')", "timeout_tags" => ["tag1", "tag2"] })

@@ -268,6 +268,7 @@ describe LogStash::Filters::Aggregate do
 describe "close event append then register event append, " do
 it "stores aggregate maps to configured file and then loads aggregate maps from file" do
 store_file = "aggregate_maps"
+File.delete(store_file) if File.exist?(store_file)
 expect(File.exist?(store_file)).to be false
 
 one_filter = setup_filter({ "task_id" => "%{one_special_field}", "code" => ""})

@@ -284,7 +285,7 @@ describe LogStash::Filters::Aggregate do
 
 store_filter.close()
 expect(File.exist?(store_file)).to be true
-expect(
+expect(current_pipeline).to be_nil
 
 one_filter = setup_filter({ "task_id" => "%{one_special_field}", "code" => ""})
 store_filter = setup_filter({ "code" => "map['sql_duration'] = 0", "aggregate_maps_path" => store_file })

@@ -306,15 +307,14 @@ describe LogStash::Filters::Aggregate do
 
 context "Logstash reload occurs, " do
 describe "close method is called, " do
-it "reinitializes
+it "reinitializes pipelines" do
 @end_filter.close()
-expect(
-expect(taskid_eviction_instance).to be_nil
-expect(static_close_instance).not_to be_nil
-expect(aggregate_maps_path_set).to be false
+expect(current_pipeline).to be_nil
 
 @end_filter.register()
-expect(
+expect(current_pipeline).not_to be_nil
+expect(aggregate_maps).not_to be_nil
+expect(pipeline_close_instance).to be_nil
 end
 end
 end
data/spec/filters/aggregate_spec_helper.rb CHANGED

@@ -33,31 +33,40 @@ def filter(event)
 @end_filter.filter(event)
 end
 
+def pipelines()
+LogStash::Filters::Aggregate.class_variable_get(:@@pipelines)
+end
+
+def current_pipeline()
+pipelines()['main']
+end
+
 def aggregate_maps()
-
+current_pipeline().aggregate_maps
 end
 
 def taskid_eviction_instance()
-
+current_pipeline().flush_instance_map["%{taskid}"]
 end
 
-def
-
+def pipeline_close_instance()
+current_pipeline().pipeline_close_instance
 end
 
 def aggregate_maps_path_set()
-
+current_pipeline().aggregate_maps_path_set
 end
 
 def reset_timeout_management()
-
-
-
+current_pipeline().default_timeout = nil
+current_pipeline().flush_instance_map.clear()
+current_pipeline().last_flush_timestamp_map.clear()
 end
 
-def
-
-
-
-
+def reset_pipeline_variables()
+pipelines().clear()
+# reset_timeout_management()
+# aggregate_maps().clear()
+# current_pipeline().pipeline_close_instance = nil
+# current_pipeline().aggregate_maps_path_set = false
 end
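The new spec helpers reach into the filter's class-level registry with `class_variable_get`. A self-contained illustration of that Ruby mechanism; the `Aggregate` class here is a toy stand-in, not the real filter:

```ruby
# Toy stand-in for the filter class: state lives in a class variable,
# the kind of variable the spec helper reads with class_variable_get.
class Aggregate
  @@pipelines = {}

  def self.add(id)
    @@pipelines[id] = Object.new
  end
end

Aggregate.add("main")

# Read the class variable from outside the class, as the pipelines() helper does.
pipelines = Aggregate.class_variable_get(:@@pipelines)
puts pipelines.key?("main")

# It is the same Hash object, so mutating it mutates the class's state,
# which is what reset_pipeline_variables() relies on.
pipelines.clear
puts Aggregate.class_variable_get(:@@pipelines).empty?
```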
metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-aggregate
 version: !ruby/object:Gem::Version
-version: 2.
+version: 2.7.0
 platform: ruby
 authors:
 - Elastic

@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-11-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 requirement: !ruby/object:Gem::Requirement