logstash-filter-aggregate 2.7.1 → 2.9.2
- checksums.yaml +5 -5
- data/CHANGELOG.md +33 -8
- data/LICENSE +199 -10
- data/README.md +1 -1
- data/docs/index.asciidoc +57 -16
- data/lib/logstash/filters/aggregate.rb +174 -67
- data/logstash-filter-aggregate.gemspec +1 -1
- data/spec/filters/aggregate_spec.rb +53 -1
- data/spec/filters/aggregate_spec_helper.rb +4 -1
- metadata +4 -6
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA1:
+  metadata.gz: 15460afa4f78789d3f4ee56ae45e06e5f31b8ef4
+  data.tar.gz: f4bd241aa7207b29366d2456d6c42cb6ecc499c3
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 10f18942af2c7cd6f343502478d6d6d2ffc530700b9462e377770a0adbd6bb38b71c2a7debeeaba832aa0aafa9eba85acd041c35a468f2876ad61d0a51566f40
+  data.tar.gz: d6c92aa3b6bf04bfdd1be6e6b3fafbeb8b31416e20db4ee2c795d3fa20ac193db9a8422634b90df90efed88f135e5941a2f7373707b0e20be886a68a30abd8c7
data/CHANGELOG.md
CHANGED

@@ -1,5 +1,29 @@
+## 2.9.2
+- bugfix: remove 'default_timeout' at pipeline level (fix #112)
+- ci: update travis ci configuration
+
+## 2.9.1
+- bugfix: fix inactivity timeout feature when processing old logs (PR [#103](https://github.com/logstash-plugins/logstash-filter-aggregate/pull/103), thanks @jdratlif for his contribution!)
+- docs: fix several typos in documentation
+- docs: enhance example 4 documentation
+- ci: enhance plugin continuous integration
+
+## 2.9.0
+- new feature: add ability to dynamically define a custom `timeout` or `inactivity_timeout` in `code` block (fix issues [#91](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/91) and [#92](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/92))
+- new feature: add meta informations available in `code` block through `map_meta` variable
+- new feature: add Logstash metrics, specific to aggregate plugin: aggregate_maps, pushed_events, task_timeouts, code_errors, timeout_code_errors
+- new feature: validate at startup that `map_action` option equals to 'create', 'update' or 'create_or_update'
+
+## 2.8.0
+- new feature: add 'timeout_timestamp_field' option (fix issue [#81](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/81))
+  When set, this option lets to compute timeout based on event timestamp field (and not system time).
+  It's particularly useful when processing old logs.
+
+## 2.7.2
+- bugfix: fix synchronisation issue at Logstash shutdown (issue [#75](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/75))
+
 ## 2.7.1
--
+- docs: update gemspec summary
 
 ## 2.7.0
 - new feature: add support for multiple pipelines (for Logstash 6.0+)
@@ -21,10 +45,11 @@
 - docs: bump patch level for doc build
 
 ## 2.6.0
-- new feature: 'inactivity_timeout'
+- new feature: add 'inactivity_timeout' option.
+  Events for a given `task_id` will be aggregated for as long as they keep arriving within the defined `inactivity_timeout` option - the inactivity timeout is reset each time a new event happens. On the contrary, `timeout` is never reset and happens after `timeout` seconds since aggregation map creation.
 
 ## 2.5.2
-- bugfix: fix 'aggregate_maps_path' load (issue #62). Re-start of Logstash died when no data were provided in 'aggregate_maps_path' file for some aggregate task_id patterns
+- bugfix: fix 'aggregate_maps_path' load (issue [#62](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/62)). Re-start of Logstash died when no data were provided in 'aggregate_maps_path' file for some aggregate task_id patterns
 - enhancement: at Logstash startup, check that 'task_id' option contains a field reference expression (else raise error)
 - docs: enhance examples
 - docs: precise that tasks are tied to their task_id pattern, even if they have same task_id value
@@ -42,7 +67,7 @@
 - breaking: need Logstash 2.4 or later
 
 ## 2.4.0
-- new feature: You can now define timeout options per task_id pattern (#42)
+- new feature: You can now define timeout options per task_id pattern (fix issue [#42](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/42))
   timeout options are : `timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags`
 - validation: a configuration error is thrown at startup if you define any timeout option on several aggregate filters for the same task_id pattern
 - breaking: if you use `aggregate_maps_path` option, storage format has changed. So you have to delete `aggregate_maps_path` file before starting Logstash
@@ -76,14 +101,14 @@
 - internal,deps: New dependency requirements for logstash-core for the 5.0 release
 
 ## 2.0.3
-- bugfix: fix issue #10 : numeric task_id is now well processed
+- bugfix: fix issue [#10](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/10) : numeric task_id is now well processed
 
 ## 2.0.2
-- bugfix: fix issue #5 : when code call raises an exception, the error is logged and the event is tagged '_aggregateexception'. It avoids logstash crash.
+- bugfix: fix issue [#5](https://github.com/logstash-plugins/logstash-filter-aggregate/issues/5) : when code call raises an exception, the error is logged and the event is tagged '_aggregateexception'. It avoids logstash crash.
 
 ## 2.0.0
-- internal: Plugins were updated to follow the new shutdown semantic, this mainly allows Logstash to instruct input plugins to terminate gracefully,
-
+- internal: Plugins were updated to follow the new shutdown semantic, this mainly allows Logstash to instruct input plugins to terminate gracefully, instead of using Thread.raise on the plugins' threads.
+  Ref: https://github.com/elastic/logstash/pull/3895
 - internal,deps: Dependency on logstash-core update to 2.0
 
 ## 0.1.3
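The `inactivity_timeout` semantics introduced in 2.6.0 (and refined in the versions above) are easiest to read from a configuration. A minimal sketch, not taken from the plugin's own docs — the `user_id` field and the timeout values are illustrative: the map is pushed either one hour after the first event, or five minutes after the last event for that task_id, whichever happens first.

filter {
  aggregate {
    task_id => "%{user_id}"
    code => "map['events'] ||= 0; map['events'] += 1"
    push_map_as_event_on_timeout => true
    timeout => 3600            # hard limit, counted from map creation and never reset
    inactivity_timeout => 300  # reset on every new event for this task_id
  }
}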
data/LICENSE
CHANGED

@@ -1,13 +1,202 @@
-Copyright (c) 2012-2015 Elasticsearch <http://www.elasticsearch.org>
 
-
-
-
+Apache License
+Version 2.0, January 2004
+http://www.apache.org/licenses/
 
-
+TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
 
-
-
-
-
-
+1. Definitions.
+
+"License" shall mean the terms and conditions for use, reproduction,
+and distribution as defined by Sections 1 through 9 of this document.
+
+"Licensor" shall mean the copyright owner or entity authorized by
+the copyright owner that is granting the License.
+
+"Legal Entity" shall mean the union of the acting entity and all
+other entities that control, are controlled by, or are under common
+control with that entity. For the purposes of this definition,
+"control" means (i) the power, direct or indirect, to cause the
+direction or management of such entity, whether by contract or
+otherwise, or (ii) ownership of fifty percent (50%) or more of the
+outstanding shares, or (iii) beneficial ownership of such entity.
+
+"You" (or "Your") shall mean an individual or Legal Entity
+exercising permissions granted by this License.
+
+"Source" form shall mean the preferred form for making modifications,
+including but not limited to software source code, documentation
+source, and configuration files.
+
+"Object" form shall mean any form resulting from mechanical
+transformation or translation of a Source form, including but
+not limited to compiled object code, generated documentation,
+and conversions to other media types.
+
+"Work" shall mean the work of authorship, whether in Source or
+Object form, made available under the License, as indicated by a
+copyright notice that is included in or attached to the work
+(an example is provided in the Appendix below).
+
+"Derivative Works" shall mean any work, whether in Source or Object
+form, that is based on (or derived from) the Work and for which the
+editorial revisions, annotations, elaborations, or other modifications
+represent, as a whole, an original work of authorship. For the purposes
+of this License, Derivative Works shall not include works that remain
+separable from, or merely link (or bind by name) to the interfaces of,
+the Work and Derivative Works thereof.
+
+"Contribution" shall mean any work of authorship, including
+the original version of the Work and any modifications or additions
+to that Work or Derivative Works thereof, that is intentionally
+submitted to Licensor for inclusion in the Work by the copyright owner
+or by an individual or Legal Entity authorized to submit on behalf of
+the copyright owner. For the purposes of this definition, "submitted"
+means any form of electronic, verbal, or written communication sent
+to the Licensor or its representatives, including but not limited to
+communication on electronic mailing lists, source code control systems,
+and issue tracking systems that are managed by, or on behalf of, the
+Licensor for the purpose of discussing and improving the Work, but
+excluding communication that is conspicuously marked or otherwise
+designated in writing by the copyright owner as "Not a Contribution."
+
+"Contributor" shall mean Licensor and any individual or Legal Entity
+on behalf of whom a Contribution has been received by Licensor and
+subsequently incorporated within the Work.
+
+2. Grant of Copyright License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+copyright license to reproduce, prepare Derivative Works of,
+publicly display, publicly perform, sublicense, and distribute the
+Work and such Derivative Works in Source or Object form.
+
+3. Grant of Patent License. Subject to the terms and conditions of
+this License, each Contributor hereby grants to You a perpetual,
+worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+(except as stated in this section) patent license to make, have made,
+use, offer to sell, sell, import, and otherwise transfer the Work,
+where such license applies only to those patent claims licensable
+by such Contributor that are necessarily infringed by their
+Contribution(s) alone or by combination of their Contribution(s)
+with the Work to which such Contribution(s) was submitted. If You
+institute patent litigation against any entity (including a
+cross-claim or counterclaim in a lawsuit) alleging that the Work
+or a Contribution incorporated within the Work constitutes direct
+or contributory patent infringement, then any patent licenses
+granted to You under this License for that Work shall terminate
+as of the date such litigation is filed.
+
+4. Redistribution. You may reproduce and distribute copies of the
+Work or Derivative Works thereof in any medium, with or without
+modifications, and in Source or Object form, provided that You
+meet the following conditions:
+
+(a) You must give any other recipients of the Work or
+Derivative Works a copy of this License; and
+
+(b) You must cause any modified files to carry prominent notices
+stating that You changed the files; and
+
+(c) You must retain, in the Source form of any Derivative Works
+that You distribute, all copyright, patent, trademark, and
+attribution notices from the Source form of the Work,
+excluding those notices that do not pertain to any part of
+the Derivative Works; and
+
+(d) If the Work includes a "NOTICE" text file as part of its
+distribution, then any Derivative Works that You distribute must
+include a readable copy of the attribution notices contained
+within such NOTICE file, excluding those notices that do not
+pertain to any part of the Derivative Works, in at least one
+of the following places: within a NOTICE text file distributed
+as part of the Derivative Works; within the Source form or
+documentation, if provided along with the Derivative Works; or,
+within a display generated by the Derivative Works, if and
+wherever such third-party notices normally appear. The contents
+of the NOTICE file are for informational purposes only and
+do not modify the License. You may add Your own attribution
+notices within Derivative Works that You distribute, alongside
+or as an addendum to the NOTICE text from the Work, provided
+that such additional attribution notices cannot be construed
+as modifying the License.
+
+You may add Your own copyright statement to Your modifications and
+may provide additional or different license terms and conditions
+for use, reproduction, or distribution of Your modifications, or
+for any such Derivative Works as a whole, provided Your use,
+reproduction, and distribution of the Work otherwise complies with
+the conditions stated in this License.
+
+5. Submission of Contributions. Unless You explicitly state otherwise,
+any Contribution intentionally submitted for inclusion in the Work
+by You to the Licensor shall be under the terms and conditions of
+this License, without any additional terms or conditions.
+Notwithstanding the above, nothing herein shall supersede or modify
+the terms of any separate license agreement you may have executed
+with Licensor regarding such Contributions.
+
+6. Trademarks. This License does not grant permission to use the trade
+names, trademarks, service marks, or product names of the Licensor,
+except as required for reasonable and customary use in describing the
+origin of the Work and reproducing the content of the NOTICE file.
+
+7. Disclaimer of Warranty. Unless required by applicable law or
+agreed to in writing, Licensor provides the Work (and each
+Contributor provides its Contributions) on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+implied, including, without limitation, any warranties or conditions
+of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+PARTICULAR PURPOSE. You are solely responsible for determining the
+appropriateness of using or redistributing the Work and assume any
+risks associated with Your exercise of permissions under this License.
+
+8. Limitation of Liability. In no event and under no legal theory,
+whether in tort (including negligence), contract, or otherwise,
+unless required by applicable law (such as deliberate and grossly
+negligent acts) or agreed to in writing, shall any Contributor be
+liable to You for damages, including any direct, indirect, special,
+incidental, or consequential damages of any character arising as a
+result of this License or out of the use or inability to use the
+Work (including but not limited to damages for loss of goodwill,
+work stoppage, computer failure or malfunction, or any and all
+other commercial damages or losses), even if such Contributor
+has been advised of the possibility of such damages.
+
+9. Accepting Warranty or Additional Liability. While redistributing
+the Work or Derivative Works thereof, You may choose to offer,
+and charge a fee for, acceptance of support, warranty, indemnity,
+or other liability obligations and/or rights consistent with this
+License. However, in accepting such obligations, You may act only
+on Your own behalf and on Your sole responsibility, not on behalf
+of any other Contributor, and only if You agree to indemnify,
+defend, and hold each Contributor harmless for any liability
+incurred by, or claims asserted against, such Contributor by reason
+of your accepting any such warranty or additional liability.
+
+END OF TERMS AND CONDITIONS
+
+APPENDIX: How to apply the Apache License to your work.
+
+To apply the Apache License to your work, attach the following
+boilerplate notice, with the fields enclosed by brackets "[]"
+replaced with your own identifying information. (Don't include
+the brackets!) The text should be enclosed in the appropriate
+comment syntax for the file format. We also recommend that a
+file or class name and description of purpose be included on the
+same "printed page" as the copyright notice for easier
+identification within third-party archives.
+
+Copyright 2020 Elastic and contributors
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
data/README.md
CHANGED

@@ -1,6 +1,6 @@
 # Aggregate Logstash Plugin
 
-[![Travis Build Status](https://travis-ci.
+[![Travis Build Status](https://travis-ci.com/logstash-plugins/logstash-filter-aggregate.svg)](https://travis-ci.com/logstash-plugins/logstash-filter-aggregate)
 
 This is a plugin for [Logstash](https://github.com/elastic/logstash).
 
data/docs/index.asciidoc
CHANGED

@@ -228,7 +228,7 @@ In that case, you don't want to wait task timeout to flush aggregation map.
 aggregate {
 task_id => "%{country_name}"
 code => "
-map['country_name']
+map['country_name'] ||= event.get('country_name')
 map['towns'] ||= []
 map['towns'] << {'town_name' => event.get('town_name')}
 event.cancel()
@@ -240,8 +240,9 @@ In that case, you don't want to wait task timeout to flush aggregation map.
 ----------------------------------
 
 * The key point is that each time aggregate plugin detects a new `country_name`, it pushes previous aggregate map as a new Logstash event, and then creates a new empty map for the next country
-* When
-*
+* When 3s timeout comes, the last aggregate map is pushed as a new event
+* Initial events (which are not aggregated) are dropped because useless (thanks to `event.cancel()`)
+* Last point: if a field is not fulfilled for every event (say "town_postcode" field), the `||=` operator will let you to push into aggregate map, the first "not null" value. Example: `map['town_postcode'] ||= event.get('town_postcode')`
 
 
 [id="plugins-{type}s-{plugin}-example5"]
@@ -249,7 +250,7 @@ In that case, you don't want to wait task timeout to flush aggregation map.
 
 Fifth use case: like example #3, there is no end event.
 
-Events keep
+Events keep coming for an indefinite time and you want to push the aggregation map as soon as possible after the last user interaction without waiting for the `timeout`.
 
 This allows to have the aggregated events pushed closer to real time.
 
@@ -260,7 +261,7 @@ We can track a user by its ID through the events, however once the user stops in
 
 There is no specific event indicating the end of the user's interaction.
 
-The user
+The user interaction will be considered as ended when no events for the specified user (task_id) arrive after the specified inactivity_timeout`.
 
 If the user continues interacting for longer than `timeout` seconds (since first event), the aggregation map will still be deleted and pushed as a new event when timeout occurs.
 
@@ -295,7 +296,7 @@ filter {
 code => "map['clicks'] ||= 0; map['clicks'] += 1;"
 push_map_as_event_on_timeout => true
 timeout_task_id_field => "user_id"
-timeout => 3600 # 1 hour timeout, user activity will be considered finished one hour after the first event, even if events keep
+timeout => 3600 # 1 hour timeout, user activity will be considered finished one hour after the first event, even if events keep coming
 inactivity_timeout => 300 # 5 minutes timeout, user activity will be considered finished if no new events arrive 5 minutes after the last event
 timeout_tags => ['_aggregatetimeout']
 timeout_code => "event.set('several_clicks', event.get('clicks') > 1)"
@@ -326,9 +327,9 @@ filter {
 * in the final event, you can execute a last code (for instance, add map data to final event)
 * after the final event, the map attached to task is deleted (thanks to `end_of_task => true`)
 * an aggregate map is tied to one task_id value which is tied to one task_id pattern. So if you have 2 filters with different task_id patterns, even if you have same task_id value, they won't share the same aggregate map.
-* in one filter configuration, it is
+* in one filter configuration, it is recommended to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
 * if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
-* all timeout options have to be defined in only one aggregate filter per task_id pattern (per pipeline). Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_task_id_field, timeout_tags
+* all timeout options have to be defined in only one aggregate filter per task_id pattern (per pipeline). Timeout options are : timeout, inactivity_timeout, timeout_code, push_map_as_event_on_timeout, push_previous_map_as_event, timeout_timestamp_field, timeout_task_id_field, timeout_tags
 * if `code` execution raises an exception, the error is logged and event is tagged '_aggregateexception'
 
 
@@ -362,6 +363,7 @@ This plugin supports the following configuration options plus the <<plugins-{typ
 | <<plugins-{type}s-{plugin}-timeout_code>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-timeout_tags>> |<<array,array>>|No
 | <<plugins-{type}s-{plugin}-timeout_task_id_field>> |<<string,string>>|No
+| <<plugins-{type}s-{plugin}-timeout_timestamp_field>> |<<string,string>>|No
 |=======================================================================
 
 Also see <<plugins-{type}s-{plugin}-common-options>> for a list of options supported by all
@@ -396,11 +398,20 @@ Example:
 * Value type is <<string,string>>
 * There is no default value for this setting.
 
-The code to execute to update map, using current event.
+The code to execute to update aggregated map, using current event.
 
-Or on the contrary, the code to execute to update event, using
+Or on the contrary, the code to execute to update event, using aggregated map.
 
-
+Available variables are :
+
+`event`: current Logstash event
+
+`map`: aggregated map associated to `task_id`, containing key/value pairs. Data structure is a ruby http://ruby-doc.org/core-1.9.1/Hash.html[Hash]
+
+`map_meta`: meta informations associated to aggregate map. It allows to set a custom `timeout` or `inactivity_timeout`.
+It allows also to get `creation_timestamp`, `lastevent_timestamp` and `task_id`.
+
+When option push_map_as_event_on_timeout=true, if you set `map_meta.timeout=0` in `code` block, then aggregated map is immediately pushed as a new event.
 
 Example:
 [source,ruby]
@@ -532,7 +543,7 @@ Example:
 [source,ruby]
 filter {
 aggregate {
-timeout_tags => ["aggregate_timeout
+timeout_tags => ["aggregate_timeout"]
 }
 }
 
@@ -542,13 +553,43 @@ Example:
 * Value type is <<string,string>>
 * There is no default value for this setting.
 
-This option indicates the timeout generated event's field
-
-
-For example, with option `timeout_task_id_field => "my_id"` ,when timeout task id is `"12345"`, the generated timeout event will contain `'my_id' => '12345'`.
+This option indicates the timeout generated event's field where the current "task_id" value will be set.
+This can help to correlate which tasks have been timed out.
 
 By default, if this option is not set, task id value won't be set into timeout generated event.
 
+Example:
+[source,ruby]
+filter {
+aggregate {
+timeout_task_id_field => "task_id"
+}
+}
+
+[id="plugins-{type}s-{plugin}-timeout_timestamp_field"]
+===== `timeout_timestamp_field`
+
+* Value type is <<string,string>>
+* There is no default value for this setting.
+
+By default, timeout is computed using system time, where Logstash is running.
+
+When this option is set, timeout is computed using event timestamp field indicated in this option.
+It means that when a first event arrives on aggregate filter and induces a map creation, map creation time will be equal to this event timestamp.
+Then, each time a new event arrives on aggregate filter, event timestamp is compared to map creation time to check if timeout happened.
+
+This option is particularly useful when processing old logs with option `push_map_as_event_on_timeout => true`.
+It lets to generate aggregated events based on timeout on old logs, where system time is inappropriate.
+
+Warning : so that this option works fine, it must be set on first aggregate filter.
+
+Example:
+[source,ruby]
+filter {
+aggregate {
+timeout_timestamp_field => "@timestamp"
+}
+}
 
 
 [id="plugins-{type}s-{plugin}-common-options"]
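The `map_meta` variable documented for the `code` option above can be used to close an aggregation early. A minimal sketch under those semantics — the `user_id` and `action` fields and the values are illustrative, not taken from the plugin docs: setting `map_meta.timeout = 0` inside `code` causes the map to be pushed immediately when `push_map_as_event_on_timeout` is enabled, instead of waiting for the configured timeout.

filter {
  aggregate {
    task_id => "%{user_id}"
    code => "
      map['clicks'] ||= 0
      map['clicks'] += 1
      # illustrative: end the aggregation as soon as a logout event is seen
      map_meta.timeout = 0 if event.get('action') == 'logout'
    "
    push_map_as_event_on_timeout => true
    timeout => 300
  }
}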
data/lib/logstash/filters/aggregate.rb
CHANGED

@@ -20,7 +20,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 config :code, :validate => :string, :required => true
 
-config :map_action, :validate =>
+config :map_action, :validate => ["create", "update", "create_or_update"], :default => "create_or_update"
 
 config :end_of_task, :validate => :boolean, :default => false
 
@@ -36,6 +36,8 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 config :push_previous_map_as_event, :validate => :boolean, :required => false, :default => false
 
+config :timeout_timestamp_field, :validate => :string, :required => false
+
 config :timeout_task_id_field, :validate => :string, :required => false
 
 config :timeout_tags, :validate => :array, :required => false, :default => []
@@ -44,11 +46,13 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 # ################## #
 # INSTANCE VARIABLES #
 # ################## #
-
+
 
 # pointer to current pipeline context
 attr_accessor :current_pipeline
 
+# boolean indicating if expired maps should be checked on every flush call (typically because custom timeout has beeen set on a map)
+attr_accessor :check_expired_maps_on_every_flush
 
 # ################ #
 # STATIC VARIABLES #
@@ -57,7 +61,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 # Default timeout (in seconds) when not defined in plugin configuration
 DEFAULT_TIMEOUT = 1800
-
+
 # Store all shared aggregate attributes per pipeline id
 @@pipelines = {}
 
@@ -77,9 +81,9 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 if !@task_id.match(/%\{.+\}/)
 raise LogStash::ConfigurationError, "Aggregate plugin: task_id pattern '#{@task_id}' must contain a dynamic expression like '%{field}'"
 end
-
+
 # process lambda expression to call in each filter call
-eval("@codeblock = lambda { |event, map| #{@code} }", binding, "(aggregate filter code)")
+eval("@codeblock = lambda { |event, map, map_meta| #{@code} }", binding, "(aggregate filter code)")
 
 # process lambda expression to call in the timeout case or previous event case
 if @timeout_code
@@ -87,7 +91,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 end
 
 # init pipeline context
-@@pipelines[pipeline_id] ||= LogStash::Filters::Aggregate::Pipeline.new()
+@@pipelines[pipeline_id] ||= LogStash::Filters::Aggregate::Pipeline.new()
 @current_pipeline = @@pipelines[pipeline_id]
 
 @current_pipeline.mutex.synchronize do
@@ -102,24 +106,18 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 @logger.debug("Aggregate timeout for '#{@task_id}' pattern: #{@timeout} seconds")
 end
 
-# timeout management : define default_timeout
-if !@timeout.nil? && (@current_pipeline.default_timeout.nil? || @timeout < @current_pipeline.default_timeout)
-@current_pipeline.default_timeout = @timeout
-@logger.debug("Aggregate default timeout: #{@timeout} seconds")
-end
-
 # inactivity timeout management: make sure it is lower than timeout
-if
-raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern #{@task_id}, inactivity_timeout must be lower than timeout"
+if @inactivity_timeout && ((@timeout && @inactivity_timeout > @timeout) || (@timeout.nil? && @inactivity_timeout > DEFAULT_TIMEOUT))
+raise LogStash::ConfigurationError, "Aggregate plugin: For task_id pattern #{@task_id}, inactivity_timeout (#{@inactivity_timeout}) must be lower than timeout (#{@timeout})"
 end
 
 # reinit pipeline_close_instance (if necessary)
-if !@current_pipeline.aggregate_maps_path_set &&
+if !@current_pipeline.aggregate_maps_path_set && @current_pipeline.pipeline_close_instance
 @current_pipeline.pipeline_close_instance = nil
 end
 
 # check if aggregate_maps_path option has already been set on another instance else set @current_pipeline.aggregate_maps_path_set
-if
+if @aggregate_maps_path
 if @current_pipeline.aggregate_maps_path_set
 @current_pipeline.aggregate_maps_path_set = false
 raise LogStash::ConfigurationError, "Aggregate plugin: Option 'aggregate_maps_path' must be set on only one aggregate filter"
@@ -130,7 +128,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 end
 
 # load aggregate maps from file (if option defined)
-if
+if @aggregate_maps_path && File.exist?(@aggregate_maps_path)
 File.open(@aggregate_maps_path, "r") { |from_file| @current_pipeline.aggregate_maps.merge!(Marshal.load(from_file)) }
 File.delete(@aggregate_maps_path)
 @logger.info("Aggregate maps loaded from : #{@aggregate_maps_path}")
@@ -138,8 +136,8 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 # init aggregate_maps
 @current_pipeline.aggregate_maps[@task_id] ||= {}
-
-
+update_aggregate_maps_metric()
+
 end
 end
 
@@ -156,7 +154,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 # store aggregate maps to file (if option defined)
 @current_pipeline.mutex.synchronize do
 @current_pipeline.aggregate_maps.delete_if { |key, value| value.empty? }
-if
+if @aggregate_maps_path && !@current_pipeline.aggregate_maps.empty?
 File.open(@aggregate_maps_path, "w"){ |to_file| Marshal.dump(@current_pipeline.aggregate_maps, to_file) }
 @logger.info("Aggregate maps stored to : #{@aggregate_maps_path}")
 end
@@ -182,28 +180,40 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 # protect aggregate_maps against concurrent access, using a mutex
 @current_pipeline.mutex.synchronize do
 
+# if timeout is based on event timestamp, check if task_id map is expired and should be removed
+if @timeout_timestamp_field
+event_to_yield = remove_expired_map_based_on_event_timestamp(task_id, event)
+end
+
 # retrieve the current aggregate map
 aggregate_maps_element = @current_pipeline.aggregate_maps[@task_id][task_id]
 
-
-# create aggregate map, if it doesn't exist
+# case where aggregate map isn't already created
 if aggregate_maps_element.nil?
 return if @map_action == "update"
+
 # create new event from previous map, if @push_previous_map_as_event is enabled
 if @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
 event_to_yield = extract_previous_map_as_event()
 end
-
+
+# create aggregate map
+creation_timestamp = reference_timestamp(event)
+aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(creation_timestamp, task_id)
 @current_pipeline.aggregate_maps[@task_id][task_id] = aggregate_maps_element
+update_aggregate_maps_metric()
 else
 return if @map_action == "create"
 end
-
+
 # update last event timestamp
-aggregate_maps_element.lastevent_timestamp =
+aggregate_maps_element.lastevent_timestamp = reference_timestamp(event)
+aggregate_maps_element.difference_from_lastevent_to_now = (Time.now - aggregate_maps_element.lastevent_timestamp).to_i
+
 # execute the code to read/update map and event
+map = aggregate_maps_element.map
 begin
-@codeblock.call(event, map)
+@codeblock.call(event, map, aggregate_maps_element)
 @logger.debug("Aggregate successful filter code execution", :code => @code)
 noError = true
 rescue => exception
@@ -213,10 +223,17 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 :map => map,
 :event_data => event.to_hash_with_metadata)
 event.tag("_aggregateexception")
+metric.increment(:code_errors)
 end
 
 # delete the map if task is ended
 @current_pipeline.aggregate_maps[@task_id].delete(task_id) if @end_of_task
+update_aggregate_maps_metric()
+
+# process custom timeout set by code block
+if (aggregate_maps_element.timeout || aggregate_maps_element.inactivity_timeout)
+event_to_yield = process_map_timeout(aggregate_maps_element)
+end
 
 end
 
@@ -224,8 +241,26 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 filter_matched(event) if noError
 
 # yield previous map as new event if set
-yield event_to_yield
+yield event_to_yield if event_to_yield
+end
 
+# Process a custom timeout defined in aggregate map element
+# Returns an event to yield if timeout=0 and push_map_as_event_on_timeout=true
+def process_map_timeout(element)
+event_to_yield = nil
+init_pipeline_timeout_management()
+if (element.timeout == 0 || element.inactivity_timeout == 0)
+@current_pipeline.aggregate_maps[@task_id].delete(element.task_id)
+if @current_pipeline.flush_instance_map[@task_id].push_map_as_event_on_timeout
+event_to_yield = create_timeout_event(element.map, element.task_id)
+end
+@logger.debug("Aggregate remove expired map with task_id=#{element.task_id} and custom timeout=0")
+metric.increment(:task_timeouts)
+update_aggregate_maps_metric()
+else
+@current_pipeline.flush_instance_map[@task_id].check_expired_maps_on_every_flush ||= true
+end
+return event_to_yield
 end
 
 # Create a new event from the aggregation_map and the corresponding task_id
@@ -245,7 +280,8 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 
 LogStash::Util::Decorators.add_tags(@timeout_tags, event_to_yield, "filters/#{self.class.name}")
 
-
+
+# Call timeout code block if available
 if @timeout_code
 begin
 @timeout_codeblock.call(event_to_yield)
@@ -255,9 +291,12 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 :timeout_code => @timeout_code,
 :timeout_event_data => event_to_yield.to_hash_with_metadata)
 event_to_yield.tag("_aggregateexception")
+metric.increment(:timeout_code_errors)
 end
 end
 
+metric.increment(:pushed_events)
+
 return event_to_yield
 end
 
@@ -266,6 +305,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 previous_entry = @current_pipeline.aggregate_maps[@task_id].shift()
 previous_task_id = previous_entry[0]
 previous_map = previous_entry[1].map
+update_aggregate_maps_metric()
 return create_timeout_event(previous_map, previous_task_id)
 end
 
@@ -277,31 +317,23 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 # This method is invoked by LogStash every 5 seconds.
 def flush(options = {})
 
-@logger.
+@logger.trace("Aggregate flush call with #{options}")
 
-#
-
-
-
-if
-@current_pipeline.flush_instance_map[@task_id] = self
-@timeout = @current_pipeline.default_timeout
-elsif @current_pipeline.flush_instance_map[@task_id].timeout.nil?
-@current_pipeline.flush_instance_map[@task_id].timeout = @current_pipeline.default_timeout
-end
-
-if @current_pipeline.flush_instance_map[@task_id].inactivity_timeout.nil?
-@current_pipeline.flush_instance_map[@task_id].inactivity_timeout = @current_pipeline.flush_instance_map[@task_id].timeout
-end
-
-# Launch timeout management only every interval of (@inactivity_timeout / 2) seconds or at Logstash shutdown
-if @current_pipeline.flush_instance_map[@task_id] == self && !@current_pipeline.aggregate_maps[@task_id].nil? && (!@current_pipeline.last_flush_timestamp_map.has_key?(@task_id) || Time.now > @current_pipeline.last_flush_timestamp_map[@task_id] + @inactivity_timeout / 2 || options[:final])
+# init flush/timeout properties for current pipeline
+init_pipeline_timeout_management()
+
+# launch timeout management only every interval of (@inactivity_timeout / 2) seconds or at Logstash shutdown
+if @current_pipeline.flush_instance_map[@task_id] == self && @current_pipeline.aggregate_maps[@task_id] && (!@current_pipeline.last_flush_timestamp_map.has_key?(@task_id) || Time.now > @current_pipeline.last_flush_timestamp_map[@task_id] + @inactivity_timeout / 2 || options[:final] || @check_expired_maps_on_every_flush)
 events_to_flush = remove_expired_maps()
 
 # at Logstash shutdown, if push_previous_map_as_event is enabled, it's important to force flush (particularly for jdbc input plugin)
-
-
+@current_pipeline.mutex.synchronize do
+if options[:final] && @push_previous_map_as_event && !@current_pipeline.aggregate_maps[@task_id].empty?
+events_to_flush << extract_previous_map_as_event()
+end
 end
+
+update_aggregate_maps_metric()
 
 # tag flushed events, indicating "final flush" special event
 if options[:final]
@@ -316,35 +348,92 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 else
 return []
 end
-
 end
+
+# init flush/timeout properties for current pipeline
+def init_pipeline_timeout_management()
+
+# Define default flush instance that manages timeout (if not defined by user)
+if !@current_pipeline.flush_instance_map.has_key?(@task_id)
+@current_pipeline.flush_instance_map[@task_id] = self
+end
 
+# Define timeout and inactivity_timeout (if not defined by user)
+if @current_pipeline.flush_instance_map[@task_id] == self
+if @timeout.nil?
+@timeout = DEFAULT_TIMEOUT
+@logger.debug("Aggregate timeout for '#{@task_id}' pattern: #{@timeout} seconds (default value)")
+end
+if @inactivity_timeout.nil?
+@inactivity_timeout = @timeout
+end
+end
+
+end
 
 # Remove the expired Aggregate maps from @current_pipeline.aggregate_maps if they are older than timeout or if no new event has been received since inactivity_timeout.
 # If @push_previous_map_as_event option is set, or @push_map_as_event_on_timeout is set, expired maps are returned as new events to be flushed to Logstash pipeline.
 def remove_expired_maps()
 events_to_flush = []
-
-
+default_min_timestamp = Time.now - @timeout
+default_min_inactivity_timestamp = Time.now - @inactivity_timeout
 
 @current_pipeline.mutex.synchronize do
 
 @logger.debug("Aggregate remove_expired_maps call with '#{@task_id}' pattern and #{@current_pipeline.aggregate_maps[@task_id].length} maps")
 
 @current_pipeline.aggregate_maps[@task_id].delete_if do |key, element|
-
+min_timestamp = element.timeout ? Time.now - element.timeout : default_min_timestamp
+min_inactivity_timestamp = element.inactivity_timeout ? Time.now - element.inactivity_timeout : default_min_inactivity_timestamp
+if element.creation_timestamp + element.difference_from_creation_to_now < min_timestamp || element.lastevent_timestamp + element.difference_from_lastevent_to_now < min_inactivity_timestamp
 if @push_previous_map_as_event || @push_map_as_event_on_timeout
 events_to_flush << create_timeout_event(element.map, key)
 end
+@logger.debug("Aggregate remove expired map with task_id=#{key}")
+metric.increment(:task_timeouts)
 next true
 end
 next false
 end
 end
+
+# disable check_expired_maps_on_every_flush if there is not anymore maps
+if @current_pipeline.aggregate_maps[@task_id].length == 0 && @check_expired_maps_on_every_flush
+@check_expired_maps_on_every_flush = nil
+end
 
 return events_to_flush
 end
+
+# Remove the expired Aggregate map associated to task_id if it is older than timeout or if no new event has been received since inactivity_timeout (relative to current event timestamp).
+# If @push_previous_map_as_event option is set, or @push_map_as_event_on_timeout is set, expired map is returned as new event to be flushed to Logstash pipeline.
+def remove_expired_map_based_on_event_timestamp(task_id, event)
+
+@logger.debug("Aggregate remove_expired_map_based_on_event_timestamp call with task_id : '#{@task_id}'")
+
+# get aggregate map element
+element = @current_pipeline.aggregate_maps[@task_id][task_id]
+return nil if element.nil?
+
+init_pipeline_timeout_management()
+
+event_to_flush = nil
+event_timestamp = reference_timestamp(event)
+min_timestamp = element.timeout ? event_timestamp - element.timeout : event_timestamp - @timeout
+min_inactivity_timestamp = element.inactivity_timeout ? event_timestamp - element.inactivity_timeout : event_timestamp - @inactivity_timeout
+
+if element.creation_timestamp < min_timestamp || element.lastevent_timestamp < min_inactivity_timestamp
+if @push_previous_map_as_event || @push_map_as_event_on_timeout
+event_to_flush = create_timeout_event(element.map, task_id)
+end
+@current_pipeline.aggregate_maps[@task_id].delete(task_id)
+@logger.debug("Aggregate remove expired map with task_id=#{task_id}")
+metric.increment(:task_timeouts)
+end
+
+return event_to_flush
+end
+
 # return if this filter instance has any timeout option enabled in logstash configuration
 def has_timeout_options?()
 return (
@@ -353,6 +442,7 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 timeout_code ||
 push_map_as_event_on_timeout ||
 push_previous_map_as_event ||
+timeout_timestamp_field ||
 timeout_task_id_field ||
 !timeout_tags.empty?
 )
@@ -366,17 +456,32 @@ class LogStash::Filters::Aggregate < LogStash::Filters::Base
 "timeout_code",
 "push_map_as_event_on_timeout",
 "push_previous_map_as_event",
+"timeout_timestamp_field",
 "timeout_task_id_field",
 "timeout_tags"
 ].join(", ")
 end
-
+
 # return current pipeline id
 def pipeline_id()
 if @execution_context
 return @execution_context.pipeline_id
 else
-return
+return "main"
+end
+end
+
+# compute and return "reference" timestamp to compute timeout :
+# by default "system current time" or event timestamp if timeout_timestamp_field option is defined
+def reference_timestamp(event)
+return (@timeout_timestamp_field) ? event.get(@timeout_timestamp_field).time : Time.now
+end
+
+# update "aggregate_maps" metric, with aggregate maps count associated to configured taskid pattern
+def update_aggregate_maps_metric()
+aggregate_maps = @current_pipeline.aggregate_maps[@task_id]
+if aggregate_maps
+metric.gauge(:aggregate_maps, aggregate_maps.length)
 end
 end
 
@@ -385,40 +490,42 @@ end # class LogStash::Filters::Aggregate
 # Element of "aggregate_maps"
 class LogStash::Filters::Aggregate::Element
 
-attr_accessor :creation_timestamp, :lastevent_timestamp, :map
+attr_accessor :creation_timestamp, :lastevent_timestamp, :difference_from_creation_to_now, :difference_from_lastevent_to_now, :timeout, :inactivity_timeout, :task_id, :map
 
-def initialize(creation_timestamp)
+def initialize(creation_timestamp, task_id)
 @creation_timestamp = creation_timestamp
-@lastevent_timestamp = creation_timestamp
+@lastevent_timestamp = creation_timestamp
+@difference_from_creation_to_now = (Time.now - creation_timestamp).to_i
+@difference_from_lastevent_to_now = @difference_from_creation_to_now
+@timeout = nil
+@inactivity_timeout = nil
+@task_id = task_id
 @map = {}
 end
 end
 
 # shared aggregate attributes for each pipeline
 class LogStash::Filters::Aggregate::Pipeline
-
-attr_accessor :aggregate_maps, :mutex, :
+
+attr_accessor :aggregate_maps, :mutex, :flush_instance_map, :last_flush_timestamp_map, :aggregate_maps_path_set, :pipeline_close_instance
 
 def initialize()
 # Stores all aggregate maps, per task_id pattern, then per task_id value
 @aggregate_maps = {}
-
+
 # Mutex used to synchronize access to 'aggregate_maps'
 @mutex = Mutex.new
-
-# Default timeout for task_id patterns where timeout is not defined in Logstash filter configuration
-@default_timeout = nil
-
+
 # For each "task_id" pattern, defines which Aggregate instance will process flush() call, processing expired Aggregate elements (older than timeout)
 # For each entry, key is "task_id pattern" and value is "aggregate instance"
 @flush_instance_map = {}
-
+
 # last time where timeout management in flush() method was launched, per "task_id" pattern
 @last_flush_timestamp_map = {}
-
+
 # flag indicating if aggregate_maps_path option has been already set on one aggregate instance
 @aggregate_maps_path_set = false
-
+
 # defines which Aggregate instance will close Aggregate variables associated to current pipeline
 @pipeline_close_instance = nil
 end
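The new `reference_timestamp` and `remove_expired_map_based_on_event_timestamp` code paths above are what make `timeout_timestamp_field` usable when replaying old logs: expirations are evaluated against the events' own timestamps rather than the system clock. A minimal configuration sketch mirroring the new spec test added below (the `transaction_id` and `duration` fields are illustrative):

filter {
  aggregate {
    task_id => "%{transaction_id}"
    code => "map['duration'] ||= 0; map['duration'] += event.get('duration')"
    push_map_as_event_on_timeout => true
    timeout => 120
    timeout_timestamp_field => "@timestamp"  # compare event timestamps, not Time.now
  }
}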
data/logstash-filter-aggregate.gemspec
CHANGED

@@ -1,6 +1,6 @@
 Gem::Specification.new do |s|
 s.name = 'logstash-filter-aggregate'
-s.version = '2.
+s.version = '2.9.2'
 s.licenses = ['Apache License (2.0)']
 s.summary = "Aggregates information from several events originating with a single task"
 s.description = 'This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program'
data/spec/filters/aggregate_spec.rb
CHANGED

@@ -163,6 +163,7 @@ describe LogStash::Filters::Aggregate do
 describe "no timeout defined in none filter" do
 it "defines a default timeout on a default filter" do
 reset_timeout_management()
+@end_filter.timeout = nil
 expect(taskid_eviction_instance).to be_nil
 @end_filter.flush()
 expect(taskid_eviction_instance).to eq(@end_filter)
@@ -332,7 +333,10 @@ describe LogStash::Filters::Aggregate do
 it "should push previous map as new event" do
 push_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['ppm_id'] = event.get('ppm_id')", "push_previous_map_as_event" => true, "timeout" => 5, "timeout_task_id_field" => "timeout_task_id_field" })
 push_filter.filter(event({"ppm_id" => "1"})) { |yield_event| fail "task 1 shouldn't have yield event" }
-push_filter.filter(event({"ppm_id" => "2"}))
+push_filter.filter(event({"ppm_id" => "2"})) do |yield_event|
+expect(yield_event.get("ppm_id")).to eq("1")
+expect(yield_event.get("timeout_task_id_field")).to eq("1")
+end
 expect(aggregate_maps["%{ppm_id}"].size).to eq(1)
 end
 end
@@ -367,5 +371,53 @@ describe LogStash::Filters::Aggregate do
 end
 end
 
+context "timeout_timestamp_field option is defined, " do
+describe "when 3 old events arrive, " do
+it "should push a new aggregated event using timeout based on events timestamp" do
+agg_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['sql_duration'] ||= 0; map['sql_duration'] += event.get('duration')", "timeout_timestamp_field" => "@timestamp", "push_map_as_event_on_timeout" => true, "timeout" => 120 })
+agg_filter.filter(event({"ppm_id" => "1", "duration" => 2, "@timestamp" => timestamp("2018-01-31T00:00:00Z")})) { |yield_event| fail "it shouldn't have yield event" }
+agg_filter.filter(event({"ppm_id" => "1", "duration" => 3, "@timestamp" => timestamp("2018-01-31T00:00:01Z")})) { |yield_event| fail "it shouldn't have yield event" }
+events_to_flush = agg_filter.flush()
+expect(events_to_flush).to be_empty
+agg_filter.filter(event({"ppm_id" => "1", "duration" => 4, "@timestamp" => timestamp("2018-01-31T00:05:00Z")})) do |yield_event|
+expect(yield_event).not_to be_nil
+expect(yield_event.get("sql_duration")).to eq(5)
+end
+expect(aggregate_maps["%{ppm_id}"].size).to eq(1)
+expect(aggregate_maps["%{ppm_id}"]["1"].map["sql_duration"]).to eq(4)
+end
+end
+end
+
+context "custom timeout on map_meta, " do
+describe "when map_meta.timeout=0, " do
+it "should push a new aggregated event immediately" do
+agg_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['sql_duration'] = 2; map_meta.timeout = 0", "push_map_as_event_on_timeout" => true, "timeout" => 120 })
+agg_filter.filter(event({"ppm_id" => "1"})) do |yield_event|
+expect(yield_event).not_to be_nil
+expect(yield_event.get("sql_duration")).to eq(2)
+end
+expect(aggregate_maps["%{ppm_id}"]).to be_empty
+end
+end
+describe "when map_meta.timeout=0 and push_map_as_event_on_timeout=false, " do
+it "should just remove expired map and not push an aggregated event" do
+agg_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map_meta.timeout = 0", "push_map_as_event_on_timeout" => false, "timeout" => 120 })
+agg_filter.filter(event({"ppm_id" => "1"})) { |yield_event| fail "it shouldn't have yield event" }
+expect(aggregate_maps["%{ppm_id}"]).to be_empty
+end
+end
+describe "when map_meta.inactivity_timeout=1, " do
+it "should push a new aggregated event at next flush call" do
+agg_filter = setup_filter({ "task_id" => "%{ppm_id}", "code" => "map['sql_duration'] = 2; map_meta.inactivity_timeout = 1", "push_map_as_event_on_timeout" => true, "timeout" => 120 })
+agg_filter.filter(event({"ppm_id" => "1"})) { |yield_event| fail "it shouldn't have yield event" }
+expect(aggregate_maps["%{ppm_id}"].size).to eq(1)
+sleep(2)
+events_to_flush = agg_filter.flush()
+expect(events_to_flush.size).to eq(1)
+expect(aggregate_maps["%{ppm_id}"]).to be_empty
+end
+end
+end
 
 end
data/spec/filters/aggregate_spec_helper.rb
CHANGED

@@ -5,6 +5,10 @@ def event(data = {})
 LogStash::Event.new(data)
 end
 
+def timestamp(iso8601)
+LogStash::Timestamp.new(iso8601)
+end
+
 def start_event(data = {})
 data["logger"] = "TASK_START"
 event(data)
@@ -58,7 +62,6 @@ def aggregate_maps_path_set()
 end
 
 def reset_timeout_management()
-current_pipeline().default_timeout = nil
 current_pipeline().flush_instance_map.clear()
 current_pipeline().last_flush_timestamp_map.clear()
 end
metadata
CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: logstash-filter-aggregate
 version: !ruby/object:Gem::Version
-version: 2.
+version: 2.9.2
 platform: ruby
 authors:
 - Elastic
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2021-04-25 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 requirement: !ruby/object:Gem::Requirement
@@ -45,9 +45,7 @@ dependencies:
 - - ">="
 - !ruby/object:Gem::Version
 version: '0'
-description: This gem is a Logstash plugin required to be installed on top of the
-Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This
-gem is not a stand-alone program
+description: This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program
 email: info@elastic.co
 executables: []
 extensions: []
@@ -87,7 +85,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.4.8
 signing_key:
 specification_version: 4
 summary: Aggregates information from several events originating with a single task