logstash-filter-aggregate 2.6.4 → 2.7.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -0
- data/docs/index.asciidoc +29 -16
- data/lib/logstash/filters/aggregate.rb +105 -75
- data/logstash-filter-aggregate.gemspec +1 -1
- data/spec/filters/aggregate_spec.rb +8 -8
- data/spec/filters/aggregate_spec_helper.rb +22 -13
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 155d6cb60a93bfdc3fb4d14e67c775e59e0b25cc
+  data.tar.gz: 3d61aa6dbd824619fe613d295f958d5e4abe70a7
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ab53e0c19de46018254cfc5f545d96a6796556321a8841775aff1c69d1b1ed02c0e3f8df1b89a9934f3e6475f1d720b6215b81f530000275c4ff7b7cd8f6ecc4
+  data.tar.gz: 1b7a7d1fc22c049b0593c84cde188c3bcc1654ae267840789786110d9bb78e1fe0917f3dcb7393173a9b0bf37588d14358252b959b438716761034466731cf46
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,9 @@
+## 2.7.0
+ - new feature: add support for multiple pipelines (for Logstash 6.0+).
+   Aggregate maps, timeout options, and aggregate_maps_path are now stored per pipeline.
+   Each pipeline is independent.
+ - docs: fix break lines in documentation examples
+
 ## 2.6.4
  - bugfix: fix an NPE issue at Logstash 6.0 shutdown
  - docs: remove all redundant documentation in aggregate.rb (now only present in docs/index.asciidoc)
data/docs/index.asciidoc
CHANGED
@@ -195,9 +195,11 @@ filter {
 [id="plugins-{type}s-{plugin}-example4"]
 ==== Example #4 : no end event and tasks come one after the other
 
-Fourth use case : like example #3, you have no specific end event, but also, tasks come one after the other. +
-That is to say : tasks are not interlaced. All task1 events come, then all task2 events come, ... +
-In that case, you don't want to wait task timeout to flush aggregation map. +
+Fourth use case : like example #3, you have no specific end event, but also, tasks come one after the other.
+
+That is to say : tasks are not interlaced. All task1 events come, then all task2 events come, ...
+
+In that case, you don't want to wait task timeout to flush aggregation map.
 
 * A typical case is aggregating results from jdbc input plugin.
 * Given that you have this SQL query : `SELECT country_name, town_name FROM town`
@@ -245,20 +247,31 @@ In that case, you don't want to wait task timeout to flush aggregation map. +
 [id="plugins-{type}s-{plugin}-example5"]
 ==== Example #5 : no end event and push events as soon as possible
 
-Fifth use case: like example #3, there is no end event. +
-Events keep coming for an indefinite time and you want to push the aggregation map as soon as possible after the last user interaction without waiting for the `timeout`. +
-This allows the aggregated events to be pushed closer to real time. +
+Fifth use case: like example #3, there is no end event.
+
+Events keep coming for an indefinite time and you want to push the aggregation map as soon as possible after the last user interaction without waiting for the `timeout`.
+
+This allows the aggregated events to be pushed closer to real time.
+
+
+A typical case is aggregating or tracking user behaviour.
+
+We can track a user by its ID through the events, however once the user stops interacting, the events stop coming in.
+
+There is no specific event indicating the end of the user's interaction.
+
+The user interaction will be considered as ended when no events for the specified user (task_id) arrive after the specified `inactivity_timeout`.
+
+If the user continues interacting for longer than `timeout` seconds (since first event), the aggregation map will still be deleted and pushed as a new event when timeout occurs.
 
-A typical case is aggregating or tracking user behaviour. +
-We can track a user by its ID through the events, however once the user stops interacting, the events stop coming in. +
-There is no specific event indicating the end of the user's interaction. +
-The user interaction will be considered as ended when no events for the specified user (task_id) arrive after the specified `inactivity_timeout`. +
-If the user continues interacting for longer than `timeout` seconds (since first event), the aggregation map will still be deleted and pushed as a new event when timeout occurs. +
 The difference with example #3 is that the events will be pushed as soon as the user stops interacting for `inactivity_timeout` seconds instead of waiting for the end of `timeout` seconds since first event.
 
-In this case, we can enable the option 'push_map_as_event_on_timeout' to enable pushing the aggregation map as a new event when inactivity timeout occurs.
-
-
+In this case, we can enable the option 'push_map_as_event_on_timeout' to enable pushing the aggregation map as a new event when inactivity timeout occurs.
+
+In addition, we can enable 'timeout_code' to execute code on the populated timeout event.
+
+We can also add 'timeout_task_id_field' so we can correlate the task_id, which in this case would be the user's ID.
+
 
 * Given these logs:
 
@@ -315,7 +328,7 @@ filter {
 * an aggregate map is tied to one task_id value which is tied to one task_id pattern. So if you have 2 filters with different task_id patterns, even if you have same task_id value, they won't share the same aggregate map.
 * in one filter configuration, it is recommended to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
 * if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
-* all timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
+* all timeout options have to be defined in only one aggregate filter per task_id pattern (per pipeline). Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
 * if `code` execution raises an exception, the error is logged and event is tagged '_aggregateexception'
 
 
@@ -366,7 +379,7 @@ The path to file where aggregate maps are stored when Logstash stops
 and are loaded from when Logstash starts.
 
 If not defined, aggregate maps will not be stored at Logstash stop and will be lost.
-Must be defined in only one aggregate filter (as aggregate maps are
+Must be defined in only one aggregate filter per pipeline (as aggregate maps are shared at pipeline level).
 
 Example:
 [source,ruby]
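The eviction rule described in examples #4 and #5 (a map is flushed either `timeout` seconds after its creation, or `inactivity_timeout` seconds after its last event) can be sketched in plain Ruby. The `Element` struct and the `expired?` helper below are illustrative stand-ins, not the plugin's actual classes:

```ruby
# Illustrative stand-in for the plugin's aggregate map element:
# it records when the map was created and when the last event arrived.
Element = Struct.new(:creation_timestamp, :lastevent_timestamp)

# A map is expired when it is older than `timeout` seconds,
# or when no event has arrived for `inactivity_timeout` seconds.
def expired?(element, now, timeout, inactivity_timeout)
  element.creation_timestamp < now - timeout ||
    element.lastevent_timestamp < now - inactivity_timeout
end

now = Time.now
fresh    = Element.new(now - 10,  now - 1)   # recent activity, young map
inactive = Element.new(now - 100, now - 50)  # quiet for 50 s
ancient  = Element.new(now - 400, now - 1)   # still active, but too old

puts expired?(fresh,    now, 300, 30)  # false
puts expired?(inactive, now, 300, 30)  # true  (inactivity_timeout exceeded)
puts expired?(ancient,  now, 300, 30)  # true  (timeout exceeded)
```

This is why `inactivity_timeout` must be lower than `timeout`: the inactivity check only makes sense as an earlier, activity-based deadline inside the hard `timeout` window.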
data/lib/logstash/filters/aggregate.rb
CHANGED
@@ -41,6 +41,15 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   config :timeout_tags, :validate => :array, :required => false, :default => []
 
 
+  # ################## #
+  # INSTANCE VARIABLES #
+  # ################## #
+
+
+  # pointer to current pipeline context
+  attr_accessor :current_pipeline
+
+
   # ################ #
   # STATIC VARIABLES #
   # ################ #
@@ -48,29 +57,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
   # Default timeout (in seconds) when not defined in plugin configuration
   DEFAULT_TIMEOUT = 1800
-
-  #
-
-  @@aggregate_maps = {}
-
-  # Mutex used to synchronize access to 'aggregate_maps'
-  @@mutex = Mutex.new
-
-  # Default timeout for task_id patterns where timeout is not defined in Logstash filter configuration
-  @@default_timeout = nil
-
-  # For each "task_id" pattern, defines which Aggregate instance will process flush() call, processing expired Aggregate elements (older than timeout)
-  # For each entry, key is "task_id pattern" and value is "aggregate instance"
-  @@flush_instance_map = {}
-
-  # last time where timeout management in flush() method was launched, per "task_id" pattern
-  @@last_flush_timestamp_map = {}
-
-  # flag indicating if aggregate_maps_path option has been already set on one aggregate instance
-  @@aggregate_maps_path_set = false
-
-  # defines which Aggregate instance will close Aggregate static variables
-  @@static_close_instance = nil
+
+  # Store all shared aggregate attributes per pipeline id
+  @@pipelines = {}
 
 
   # ####### #
@@ -88,7 +77,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     if !@task_id.match(/%\{.+\}/)
       raise LogStash::ConfigurationError, "Aggregate plugin: task_id pattern '#{@task_id}' must contain a dynamic expression like '%{field}'"
     end
-
+
     # process lambda expression to call in each filter call
     eval("@codeblock = lambda { |event, map| #{@code} }", binding, "(aggregate filter code)")
 
@@ -97,54 +86,60 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
       eval("@timeout_codeblock = lambda { |event| #{@timeout_code} }", binding, "(aggregate filter timeout code)")
     end
 
-    @@mutex.synchronize do
+    # init pipeline context
+    @@pipelines[pipeline_id] ||= LogStash::Filters::Aggregate::Pipeline.new();
+    @current_pipeline = @@pipelines[pipeline_id]
+
+    @current_pipeline.mutex.synchronize do
 
       # timeout management : define eviction_instance for current task_id pattern
       if has_timeout_options?
-        if @@flush_instance_map.has_key?(@task_id)
+        if @current_pipeline.flush_instance_map.has_key?(@task_id)
           # all timeout options have to be defined in only one aggregate filter per task_id pattern
           raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern '#{@task_id}', there are more than one filter which defines timeout options. All timeout options have to be defined in only one aggregate filter per task_id pattern. Timeout options are : #{display_timeout_options}"
         end
-        @@flush_instance_map[@task_id] = self
+        @current_pipeline.flush_instance_map[@task_id] = self
         @logger.debug("Aggregate timeout for '#{@task_id}' pattern: #{@timeout} seconds")
       end
 
       # timeout management : define default_timeout
-      if !@timeout.nil? && (@@default_timeout.nil? || @timeout < @@default_timeout)
-        @@default_timeout = @timeout
+      if !@timeout.nil? && (@current_pipeline.default_timeout.nil? || @timeout < @current_pipeline.default_timeout)
+        @current_pipeline.default_timeout = @timeout
         @logger.debug("Aggregate default timeout: #{@timeout} seconds")
       end
 
       # inactivity timeout management: make sure it is lower than timeout
-      if !@inactivity_timeout.nil? && ((!@timeout.nil? && @inactivity_timeout > @timeout) || (!@@default_timeout.nil? && @inactivity_timeout > @@default_timeout))
+      if !@inactivity_timeout.nil? && ((!@timeout.nil? && @inactivity_timeout > @timeout) || (!@current_pipeline.default_timeout.nil? && @inactivity_timeout > @current_pipeline.default_timeout))
         raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern #{@task_id}, inactivity_timeout must be lower than timeout"
       end
 
-      # reinit static_close_instance (if necessary)
-      if !@@aggregate_maps_path_set && !@@static_close_instance.nil?
-        @@static_close_instance = nil
+      # reinit pipeline_close_instance (if necessary)
+      if !@current_pipeline.aggregate_maps_path_set && !@current_pipeline.pipeline_close_instance.nil?
+        @current_pipeline.pipeline_close_instance = nil
       end
 
-      # check if aggregate_maps_path option has already been set on another instance else set @@aggregate_maps_path_set
+      # check if aggregate_maps_path option has already been set on another instance else set @current_pipeline.aggregate_maps_path_set
       if !@aggregate_maps_path.nil?
-        if @@aggregate_maps_path_set
-          @@aggregate_maps_path_set = false
+        if @current_pipeline.aggregate_maps_path_set
+          @current_pipeline.aggregate_maps_path_set = false
           raise LogStash::ConfigurationError, "Aggregate plugin: Option 'aggregate_maps_path' must be set on only one aggregate filter"
         else
-          @@aggregate_maps_path_set = true
-          @@static_close_instance = self
+          @current_pipeline.aggregate_maps_path_set = true
+          @current_pipeline.pipeline_close_instance = self
         end
       end
 
       # load aggregate maps from file (if option defined)
       if !@aggregate_maps_path.nil? && File.exist?(@aggregate_maps_path)
-        File.open(@aggregate_maps_path, "r") { |from_file| @@aggregate_maps.merge!(Marshal.load(from_file)) }
+        File.open(@aggregate_maps_path, "r") { |from_file| @current_pipeline.aggregate_maps.merge!(Marshal.load(from_file)) }
         File.delete(@aggregate_maps_path)
        @logger.info("Aggregate maps loaded from : #{@aggregate_maps_path}")
       end
 
       # init aggregate_maps
-      @@aggregate_maps[@task_id] ||= {}
+      @current_pipeline.aggregate_maps[@task_id] ||= {}
+
+
     end
   end
 
@@ -154,25 +149,21 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
     @logger.debug("Aggregate close call", :code => @code)
 
-    # define static_close_instance if none is already defined
-    @@static_close_instance = self if @@static_close_instance.nil?
+    # define pipeline close instance if none is already defined
+    @current_pipeline.pipeline_close_instance = self if @current_pipeline.pipeline_close_instance.nil?
 
-    if @@static_close_instance == self
+    if @current_pipeline.pipeline_close_instance == self
       # store aggregate maps to file (if option defined)
-      @@mutex.synchronize do
-        @@aggregate_maps.delete_if { |key, value| value.empty? }
-        if !@aggregate_maps_path.nil? && !@@aggregate_maps.empty?
-          File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(@@aggregate_maps, to_file) }
+      @current_pipeline.mutex.synchronize do
+        @current_pipeline.aggregate_maps.delete_if { |key, value| value.empty? }
+        if !@aggregate_maps_path.nil? && !@current_pipeline.aggregate_maps.empty?
+          File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(@current_pipeline.aggregate_maps, to_file) }
           @logger.info("Aggregate maps stored to : #{@aggregate_maps_path}")
         end
-        @@aggregate_maps.clear()
       end
 
-      # reinit static variables for Logstash reload
-      @@default_timeout = nil
-      @@flush_instance_map = {}
-      @@last_flush_timestamp_map = {}
-      @@aggregate_maps_path_set = false
+      # remove pipeline context for Logstash reload
+      @@pipelines.delete(pipeline_id)
     end
 
   end
@@ -189,21 +180,21 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     event_to_yield = nil
 
     # protect aggregate_maps against concurrent access, using a mutex
-    @@mutex.synchronize do
+    @current_pipeline.mutex.synchronize do
 
       # retrieve the current aggregate map
-      aggregate_maps_element = @@aggregate_maps[@task_id][task_id]
+      aggregate_maps_element = @current_pipeline.aggregate_maps[@task_id][task_id]
 
 
       # create aggregate map, if it doesn't exist
       if aggregate_maps_element.nil?
         return if @map_action == "update"
         # create new event from previous map, if @push_previous_map_as_event is enabled
-        if @push_previous_map_as_event && !@@aggregate_maps[@task_id].empty?
+        if @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
           event_to_yield = extract_previous_map_as_event()
         end
         aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(Time.now);
-        @@aggregate_maps[@task_id][task_id] = aggregate_maps_element
+        @current_pipeline.aggregate_maps[@task_id][task_id] = aggregate_maps_element
       else
         return if @map_action == "create"
       end
@@ -225,7 +216,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
       end
 
       # delete the map if task is ended
-      @@aggregate_maps[@task_id].delete(task_id) if @end_of_task
+      @current_pipeline.aggregate_maps[@task_id].delete(task_id) if @end_of_task
 
     end
 
@@ -272,7 +263,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
   # Extract the previous map in aggregate maps, and return it as a new Logstash event
   def extract_previous_map_as_event
-    previous_entry = @@aggregate_maps[@task_id].shift()
+    previous_entry = @current_pipeline.aggregate_maps[@task_id].shift()
     previous_task_id = previous_entry[0]
     previous_map = previous_entry[1].map
     return create_timeout_event(previous_map, previous_task_id)
@@ -289,26 +280,26 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     @logger.debug("Aggregate flush call with #{options}")
 
     # Protection against no timeout defined by Logstash conf : define a default eviction instance with timeout = DEFAULT_TIMEOUT seconds
-    if @@default_timeout.nil?
-      @@default_timeout = DEFAULT_TIMEOUT
+    if @current_pipeline.default_timeout.nil?
+      @current_pipeline.default_timeout = DEFAULT_TIMEOUT
     end
-    if !@@flush_instance_map.has_key?(@task_id)
-      @@flush_instance_map[@task_id] = self
-      @timeout = @@default_timeout
-    elsif @@flush_instance_map[@task_id].timeout.nil?
-      @@flush_instance_map[@task_id].timeout = @@default_timeout
+    if !@current_pipeline.flush_instance_map.has_key?(@task_id)
+      @current_pipeline.flush_instance_map[@task_id] = self
+      @timeout = @current_pipeline.default_timeout
+    elsif @current_pipeline.flush_instance_map[@task_id].timeout.nil?
+      @current_pipeline.flush_instance_map[@task_id].timeout = @current_pipeline.default_timeout
     end
 
-    if @@flush_instance_map[@task_id].inactivity_timeout.nil?
-      @@flush_instance_map[@task_id].inactivity_timeout = @@flush_instance_map[@task_id].timeout
+    if @current_pipeline.flush_instance_map[@task_id].inactivity_timeout.nil?
+      @current_pipeline.flush_instance_map[@task_id].inactivity_timeout = @current_pipeline.flush_instance_map[@task_id].timeout
     end
 
     # Launch timeout management only every interval of (@inactivity_timeout / 2) seconds or at Logstash shutdown
-    if @@flush_instance_map[@task_id] == self && !@@aggregate_maps[@task_id].nil? && (!@@last_flush_timestamp_map.has_key?(@task_id) || Time.now > @@last_flush_timestamp_map[@task_id] + @inactivity_timeout / 2 || options[:final])
+    if @current_pipeline.flush_instance_map[@task_id] == self && !@current_pipeline.aggregate_maps[@task_id].nil? && (!@current_pipeline.last_flush_timestamp_map.has_key?(@task_id) || Time.now > @current_pipeline.last_flush_timestamp_map[@task_id] + @inactivity_timeout / 2 || options[:final])
       events_to_flush = remove_expired_maps()
 
       # at Logstash shutdown, if push_previous_map_as_event is enabled, it's important to force flush (particularly for jdbc input plugin)
-      if options[:final] && @push_previous_map_as_event && !@@aggregate_maps[@task_id].empty?
+      if options[:final] && @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
         events_to_flush << extract_previous_map_as_event()
       end
 
@@ -318,7 +309,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
     end
 
     # update last flush timestamp
-    @@last_flush_timestamp_map[@task_id] = Time.now
+    @current_pipeline.last_flush_timestamp_map[@task_id] = Time.now
 
     # return events to flush into Logstash pipeline
     return events_to_flush
@@ -329,18 +320,18 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
   end
 
 
-  # Remove the expired Aggregate maps from @@aggregate_maps if they are older than timeout or if no new event has been received since inactivity_timeout.
+  # Remove the expired Aggregate maps from @current_pipeline.aggregate_maps if they are older than timeout or if no new event has been received since inactivity_timeout.
   # If @push_previous_map_as_event option is set, or @push_map_as_event_on_timeout is set, expired maps are returned as new events to be flushed to Logstash pipeline.
   def remove_expired_maps()
     events_to_flush = []
     min_timestamp = Time.now - @timeout
     min_inactivity_timestamp = Time.now - @inactivity_timeout
 
-    @@mutex.synchronize do
+    @current_pipeline.mutex.synchronize do
 
-      @logger.debug("Aggregate remove_expired_maps call with '#{@task_id}' pattern and #{@@aggregate_maps[@task_id].length} maps")
+      @logger.debug("Aggregate remove_expired_maps call with '#{@task_id}' pattern and #{@current_pipeline.aggregate_maps[@task_id].length} maps")
 
-      @@aggregate_maps[@task_id].delete_if do |key, element|
+      @current_pipeline.aggregate_maps[@task_id].delete_if do |key, element|
         if element.creation_timestamp < min_timestamp || element.lastevent_timestamp < min_inactivity_timestamp
           if @push_previous_map_as_event || @push_map_as_event_on_timeout
             events_to_flush << create_timeout_event(element.map, key)
@@ -379,6 +370,15 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
       "timeout_tags"
     ].join(", ")
   end
+
+  # return current pipeline id
+  def pipeline_id()
+    if @execution_context
+      return @execution_context.pipeline_id
+    else
+      return "main"
+    end
+  end
 
 end # class LogStash::Filters::Aggregate
 
@@ -393,3 +393,33 @@ class LogStash::Filters::Aggregate::Element
     @map = {}
   end
 end
+
+# shared aggregate attributes for each pipeline
+class LogStash::Filters::Aggregate::Pipeline
+
+  attr_accessor :aggregate_maps, :mutex, :default_timeout, :flush_instance_map, :last_flush_timestamp_map, :aggregate_maps_path_set, :pipeline_close_instance
+
+  def initialize()
+    # Stores all aggregate maps, per task_id pattern, then per task_id value
+    @aggregate_maps = {}
+
+    # Mutex used to synchronize access to 'aggregate_maps'
+    @mutex = Mutex.new
+
+    # Default timeout for task_id patterns where timeout is not defined in Logstash filter configuration
+    @default_timeout = nil
+
+    # For each "task_id" pattern, defines which Aggregate instance will process flush() call, processing expired Aggregate elements (older than timeout)
+    # For each entry, key is "task_id pattern" and value is "aggregate instance"
+    @flush_instance_map = {}
+
+    # last time where timeout management in flush() method was launched, per "task_id" pattern
+    @last_flush_timestamp_map = {}
+
+    # flag indicating if aggregate_maps_path option has been already set on one aggregate instance
+    @aggregate_maps_path_set = false
+
+    # defines which Aggregate instance will close Aggregate variables associated to current pipeline
+    @pipeline_close_instance = nil
+  end
+end
data/logstash-filter-aggregate.gemspec
CHANGED
@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
   s.name = 'logstash-filter-aggregate'
-  s.version = '2.6.4'
+  s.version = '2.7.0'
   s.licenses = ['Apache License (2.0)']
   s.summary = 'The aim of this filter is to aggregate information available among several events (typically log lines) belonging to a same task, and finally push aggregated information into final task event.'
   s.description = 'This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program'
data/spec/filters/aggregate_spec.rb
CHANGED
@@ -6,7 +6,7 @@ require_relative "aggregate_spec_helper"
 describe LogStash::Filters::Aggregate do
 
   before(:each) do
-    reset_static_variables()
+    reset_pipeline_variables()
     @start_filter = setup_filter({ "map_action" => "create", "code" => "map['sql_duration'] = 0" })
     @update_filter = setup_filter({ "map_action" => "update", "code" => "map['sql_duration'] += event.get('duration')" })
     @end_filter = setup_filter({"timeout_task_id_field" => "my_id", "push_map_as_event_on_timeout" => true, "map_action" => "update", "code" => "event.set('sql_duration', map['sql_duration'])", "end_of_task" => true, "timeout" => 5, "inactivity_timeout" => 2, "timeout_code" => "event.set('test', 'testValue')", "timeout_tags" => ["tag1", "tag2"] })
@@ -268,6 +268,7 @@ describe LogStash::Filters::Aggregate do
   describe "close event append then register event append, " do
     it "stores aggregate maps to configured file and then loads aggregate maps from file" do
       store_file = "aggregate_maps"
+      File.delete(store_file) if File.exist?(store_file)
       expect(File.exist?(store_file)).to be false
 
       one_filter = setup_filter({ "task_id" => "%{one_special_field}", "code" => ""})
@@ -284,7 +285,7 @@ describe LogStash::Filters::Aggregate do
 
       store_filter.close()
       expect(File.exist?(store_file)).to be true
-      expect(aggregate_maps).to be_empty
+      expect(current_pipeline).to be_nil
 
       one_filter = setup_filter({ "task_id" => "%{one_special_field}", "code" => ""})
       store_filter = setup_filter({ "code" => "map['sql_duration'] = 0", "aggregate_maps_path" => store_file })
@@ -306,15 +307,14 @@ describe LogStash::Filters::Aggregate do
 
   context "Logstash reload occurs, " do
     describe "close method is called, " do
-      it "reinitializes static variables" do
+      it "reinitializes pipelines" do
         @end_filter.close()
-        expect(aggregate_maps).to be_empty
-        expect(taskid_eviction_instance).to be_nil
-        expect(static_close_instance).not_to be_nil
-        expect(aggregate_maps_path_set).to be false
+        expect(current_pipeline).to be_nil
 
         @end_filter.register()
-        expect(static_close_instance).to be_nil
+        expect(current_pipeline).not_to be_nil
+        expect(aggregate_maps).not_to be_nil
+        expect(pipeline_close_instance).to be_nil
       end
     end
   end
|
@@ -33,31 +33,40 @@ def filter(event)
|
|
33
33
|
@end_filter.filter(event)
|
34
34
|
end
|
35
35
|
|
36
|
+
def pipelines()
|
37
|
+
LogStash::Filters::Aggregate.class_variable_get(:@@pipelines)
|
38
|
+
end
|
39
|
+
|
40
|
+
def current_pipeline()
|
41
|
+
pipelines()['main']
|
42
|
+
end
|
43
|
+
|
36
44
|
def aggregate_maps()
|
37
|
-
|
45
|
+
current_pipeline().aggregate_maps
|
38
46
|
end
|
39
47
|
|
40
48
|
def taskid_eviction_instance()
|
41
|
-
|
49
|
+
current_pipeline().flush_instance_map["%{taskid}"]
|
42
50
|
end
|
43
51
|
|
44
|
-
def
|
45
|
-
|
52
|
+
def pipeline_close_instance()
|
53
|
+
current_pipeline().pipeline_close_instance
|
46
54
|
end
|
47
55
|
|
48
56
|
def aggregate_maps_path_set()
|
49
|
-
|
57
|
+
current_pipeline().aggregate_maps_path_set
|
50
58
|
end
|
51
59
|
|
52
60
|
def reset_timeout_management()
|
53
|
-
|
54
|
-
|
55
|
-
|
61
|
+
current_pipeline().default_timeout = nil
|
62
|
+
current_pipeline().flush_instance_map.clear()
|
63
|
+
current_pipeline().last_flush_timestamp_map.clear()
|
56
64
|
end
|
57
65
|
|
58
|
-
def
|
59
|
-
|
60
|
-
|
61
|
-
|
62
|
-
|
66
|
+
def reset_pipeline_variables()
|
67
|
+
pipelines().clear()
|
68
|
+
# reset_timeout_management()
|
69
|
+
# aggregate_maps().clear()
|
70
|
+
# current_pipeline().pipeline_close_instance = nil
|
71
|
+
# current_pipeline().aggregate_maps_path_set = false
|
63
72
|
end
|
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-aggregate
 version: !ruby/object:Gem::Version
-  version: 2.6.4
+  version: 2.7.0
 platform: ruby
 authors:
 - Elastic
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-11-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   requirement: !ruby/object:Gem::Requirement