logstash-filter-aggregate 0.1.3
- checksums.yaml +7 -0
- data/BUILD.md +86 -0
- data/CHANGELOG.md +0 -0
- data/CONTRIBUTORS +10 -0
- data/Gemfile +2 -0
- data/LICENSE +13 -0
- data/README.md +136 -0
- data/lib/logstash/filters/aggregate.rb +255 -0
- data/logstash-filter-aggregate.gemspec +24 -0
- data/spec/filters/aggregate_spec.rb +167 -0
- data/spec/filters/aggregate_spec_helper.rb +49 -0
- metadata +92 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 60e752551408fc57869c2ab7866e76f8612aa0c6
  data.tar.gz: 3757e961e87a984a827b07a748811638c2df4cb7
SHA512:
  metadata.gz: b9cb0e54a99fd0933499f98101d72b11943a24bd039c84a76fa9e9578a4580ef4b4ff514c9141415fc6740f93e2c9b34fed4fa1df9b4d980df70eb520e399154
  data.tar.gz: caec2ae83b4d7d0985e2a96b1cf40d5be58de8915b2b8a17f3c8043085db270d895e8c6f60fe250d0333390d507458306a3b39ec6a192ae18d7a4ab0071c89cf
data/BUILD.md
ADDED
@@ -0,0 +1,86 @@
# Logstash Plugin

This is a plugin for [Logstash](https://github.com/elastic/logstash).

It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want.

## Documentation

Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation, so any comments in the source code are first converted into asciidoc and then into HTML. All plugin documentation is placed under one [central location](http://www.elasticsearch.org/guide/en/logstash/current/).

- For formatting code or config examples, you can use the asciidoc `[source,ruby]` directive
- For more asciidoc formatting tips, see the excellent reference at https://github.com/elastic/docs#asciidoc-guide

## Need Help?

Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.

## Developing

### 1. Plugin Development and Testing

#### Code
- To get started, you'll need JRuby with the Bundler gem installed.

- Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).

- Install dependencies
```sh
bundle install
```

#### Test

- Update your dependencies

```sh
bundle install
```

- Run tests

```sh
bundle exec rspec
```

### 2. Running your unpublished Plugin in Logstash

#### 2.1 Run in a local Logstash clone

- Edit the Logstash `Gemfile` and add the local plugin path, for example:
```ruby
gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
```
- Install the plugin
```sh
bin/plugin install --no-verify
```
- Run Logstash with your plugin
```sh
bin/logstash -e 'filter {awesome {}}'
```
At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.

#### 2.2 Run in an installed Logstash

You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it using:

- Build your plugin gem
```sh
gem build logstash-filter-awesome.gemspec
```
- Install the plugin from the Logstash home
```sh
bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
```
- Start Logstash and proceed to test the plugin

## Contributing

All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.

Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.

It is more important to the community that you are able to contribute.

For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/CHANGELOG.md
ADDED
File without changes
data/CONTRIBUTORS
ADDED
@@ -0,0 +1,10 @@
The following is a list of people who have contributed ideas, code, bug
reports, or in general have helped logstash along its way.

Contributors:
* Fabien Baligand (fbaligand)

Note: If you've sent us patches, bug reports, or otherwise contributed to
Logstash, and you aren't on the list above and want to be, please let us know
and we'll make sure you're here. Contributions from folks like you are what make
open source awesome.
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,13 @@
Copyright (c) 2012-2015 Elasticsearch <http://www.elasticsearch.org>

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
data/README.md
ADDED
@@ -0,0 +1,136 @@
# Logstash Filter Aggregate Documentation

The aim of this filter is to aggregate information available among several events (typically log lines) belonging to the same task, and finally push the aggregated information into the final task event.

## Example #1

* given these logs:
```
INFO - 12345 - TASK_START - start
INFO - 12345 - SQL - sqlQuery1 - 12
INFO - 12345 - SQL - sqlQuery2 - 34
INFO - 12345 - TASK_END - end
```

* you can aggregate the "sql duration" for the whole task with this configuration:
```ruby
filter {
  grok {
    match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
  }

  if [logger] == "TASK_START" {
    aggregate {
      task_id => "%{taskid}"
      code => "map['sql_duration'] = 0"
      map_action => "create"
    }
  }

  if [logger] == "SQL" {
    aggregate {
      task_id => "%{taskid}"
      code => "map['sql_duration'] += event['duration']"
      map_action => "update"
    }
  }

  if [logger] == "TASK_END" {
    aggregate {
      task_id => "%{taskid}"
      code => "event['sql_duration'] = map['sql_duration']"
      map_action => "update"
      end_of_task => true
      timeout => 120
    }
  }
}
```

* the final event then looks like:
```ruby
{
  "message" => "INFO - 12345 - TASK_END - end",
  "sql_duration" => 46
}
```

The field `sql_duration` is added and contains the sum of all sql query durations (12 + 34 = 46).

## Example #2

* If you have the same logs as in example #1, but without a start log:
```
INFO - 12345 - SQL - sqlQuery1 - 12
INFO - 12345 - SQL - sqlQuery2 - 34
INFO - 12345 - TASK_END - end
```

* you can still aggregate the "sql duration" with a slightly different configuration:
```ruby
filter {
  grok {
    match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
  }

  if [logger] == "SQL" {
    aggregate {
      task_id => "%{taskid}"
      code => "map['sql_duration'] ||= 0 ; map['sql_duration'] += event['duration']"
    }
  }

  if [logger] == "TASK_END" {
    aggregate {
      task_id => "%{taskid}"
      code => "event['sql_duration'] = map['sql_duration']"
      end_of_task => true
      timeout => 120
    }
  }
}
```

* the final event is exactly the same as in example #1
* the key point is the `||=` Ruby operator: it initializes the 'sql_duration' map entry to 0 only if that entry is not already initialized (see the short illustration below)
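
For readers less familiar with Ruby, here is a standalone sketch of how `||=` behaves across successive events; the plain hash below simply stands in for the plugin's per-task aggregate map:
```ruby
map = {}                   # stands in for the per-task aggregate map
map['sql_duration'] ||= 0  # key absent: initialized to 0
map['sql_duration'] += 12  # first SQL event  => 12
map['sql_duration'] ||= 0  # key already set: '||=' leaves 12 untouched
map['sql_duration'] += 34  # second SQL event => 46
puts map['sql_duration']   # => 46
```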

## How it works
- the filter needs a "task_id" to correlate the events (log lines) of the same task
- at the beginning of the task, the filter creates a map attached to the task_id
- for each event, you can execute code using 'event' and 'map' (for instance, copy an event field to the map)
- on the final event, you can execute one last piece of code (for instance, add map data to the final event)
- after the final event, the map attached to the task is deleted
- in each filter configuration, it is recommended to define a timeout option to protect the filter against unterminated tasks: it tells the filter to delete expired maps
- if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted (see the sketch below)
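
To make this lifecycle concrete, here is a conceptual Ruby sketch of the state the filter maintains. It is an illustration only (the names `store` and `aggregate` are made up here), not the plugin's actual code, which appears later in this diff:
```ruby
# Conceptual sketch of the aggregate lifecycle (illustration only).
store = {}  # task_id => { :created => Time, :map => {} }

def aggregate(store, task_id, end_of_task: false, timeout: 1800)
  entry = (store[task_id] ||= { :created => Time.now, :map => {} })  # create map on first event
  yield entry[:map]                                                  # run the user-supplied 'code'
  store.delete(task_id) if end_of_task                               # final event: drop the map
  store.delete_if { |_, e| e[:created] < Time.now - timeout }        # evict expired tasks
end

aggregate(store, "12345") { |map| map["sql_duration"] = 0 }
aggregate(store, "12345") { |map| map["sql_duration"] += 12 }
aggregate(store, "12345", end_of_task: true) { |map| puts map["sql_duration"] }  # => 12
```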

## Aggregate Plugin Options
- **task_id:**
  The expression defining the task ID used to correlate logs.
  This value must uniquely identify the task in the system.
  This option is required.
  Example value: `"%{application}%{my_task_id}"`

- **code:**
  The code to execute to update the map, using the current event.
  Or, conversely, the code to execute to update the event, using the current map.
  You will have a 'map' variable and an 'event' variable available (the latter is the event itself).
  This option is required.
  Example value: `"map['sql_duration'] += event['duration']"`

- **map_action:**
  Tell the filter what to do with the aggregate map.
  `create`: create the map, and execute the code only if the map wasn't created before
  `update`: doesn't create the map, and executes the code only if the map was created before
  `create_or_update`: create the map if it wasn't created before, and execute the code in all cases
  Default value: `create_or_update`

- **end_of_task:**
  Tell the filter that the task is ended, and therefore to delete the map after the code execution.
  Default value: `false`

- **timeout:**
  The number of seconds after which a task without an "end event" is considered lost.
  The task "map" is then evicted.
  The default value is 0, which means no timeout, so no auto-eviction (illustrated below).
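
As an illustration of the `timeout` option, here is a hypothetical end-of-task block (the `taskid` field and `sql_duration` entry are placeholders carried over from the examples above) that evicts any task map still present ten minutes after its creation:
```ruby
if [logger] == "TASK_END" {
  aggregate {
    task_id => "%{taskid}"
    code => "event['sql_duration'] = map['sql_duration']"
    end_of_task => true
    timeout => 600   # maps older than 600 seconds are evicted
  }
}
```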
data/lib/logstash/filters/aggregate.rb
ADDED
@@ -0,0 +1,255 @@
# encoding: utf-8

require "logstash/filters/base"
require "logstash/namespace"
require "thread"

#
# The aim of this filter is to aggregate information available among several events (typically log lines) belonging to the same task,
# and finally push the aggregated information into the final task event.
#
# An example of use can be:
#
# * given these logs:
# [source,log]
# ----------------------------------
# INFO - 12345 - TASK_START - start
# INFO - 12345 - SQL - sqlQuery1 - 12
# INFO - 12345 - SQL - sqlQuery2 - 34
# INFO - 12345 - TASK_END - end
# ----------------------------------
#
# * you can aggregate the "sql duration" with this configuration:
# [source,ruby]
# ----------------------------------
# filter {
#   grok {
#     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
#   }
#
#   if [logger] == "TASK_START" {
#     aggregate {
#       task_id => "%{taskid}"
#       code => "map['sql_duration'] = 0"
#       map_action => "create"
#     }
#   }
#
#   if [logger] == "SQL" {
#     aggregate {
#       task_id => "%{taskid}"
#       code => "map['sql_duration'] += event['duration']"
#       map_action => "update"
#     }
#   }
#
#   if [logger] == "TASK_END" {
#     aggregate {
#       task_id => "%{taskid}"
#       code => "event['sql_duration'] = map['sql_duration']"
#       map_action => "update"
#       end_of_task => true
#       timeout => 120
#     }
#   }
# }
# ----------------------------------
#
# * the final event then looks like:
# [source,json]
# ----------------------------------
# {
#   "message" => "INFO - 12345 - TASK_END - end",
#   "sql_duration" => 46
# }
# ----------------------------------
#
# the field `sql_duration` is added and contains the sum of all sql query durations.
#
#
# * Another example: imagine you have the same logs as in example #1, but without a start log:
# [source,log]
# ----------------------------------
# INFO - 12345 - SQL - sqlQuery1 - 12
# INFO - 12345 - SQL - sqlQuery2 - 34
# INFO - 12345 - TASK_END - end
# ----------------------------------
#
# * you can still aggregate the "sql duration" with a slightly different configuration:
# [source,ruby]
# ----------------------------------
# filter {
#   grok {
#     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
#   }
#
#   if [logger] == "SQL" {
#     aggregate {
#       task_id => "%{taskid}"
#       code => "map['sql_duration'] ||= 0 ; map['sql_duration'] += event['duration']"
#     }
#   }
#
#   if [logger] == "TASK_END" {
#     aggregate {
#       task_id => "%{taskid}"
#       code => "event['sql_duration'] = map['sql_duration']"
#       end_of_task => true
#       timeout => 120
#     }
#   }
# }
# ----------------------------------
#
# * the final event is exactly the same as in example #1
# * the key point is the "||=" ruby operator. +
# it initializes the 'sql_duration' map entry to 0 only if that entry is not already initialized
#
#
# How it works:
# - the filter needs a "task_id" to correlate the events (log lines) of the same task
# - at the beginning of the task, the filter creates a map attached to the task_id
# - for each event, you can execute code using 'event' and 'map' (for instance, copy an event field to the map)
# - on the final event, you can execute one last piece of code (for instance, add map data to the final event)
# - after the final event, the map attached to the task is deleted
# - in each filter configuration, it is recommended to define a timeout option to protect the feature against unterminated tasks. It tells the filter to delete expired maps
# - if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
#
#
class LogStash::Filters::Aggregate < LogStash::Filters::Base

  config_name "aggregate"

  # The expression defining the task ID used to correlate logs. +
  # This value must uniquely identify the task in the system. +
  # Example value : "%{application}%{my_task_id}" +
  config :task_id, :validate => :string, :required => true

  # The code to execute to update the map, using the current event. +
  # Or, conversely, the code to execute to update the event, using the current map. +
  # You will have a 'map' variable and an 'event' variable available (the latter is the event itself). +
  # Example value : "map['sql_duration'] += event['duration']" +
  config :code, :validate => :string, :required => true

  # Tell the filter what to do with the aggregate map (default : "create_or_update"). +
  # create: create the map, and execute the code only if the map wasn't created before +
  # update: doesn't create the map, and executes the code only if the map was created before +
  # create_or_update: create the map if it wasn't created before, and execute the code in all cases +
  config :map_action, :validate => :string, :default => "create_or_update"

  # Tell the filter that the task is ended, and therefore to delete the map after the code execution.
  config :end_of_task, :validate => :boolean, :default => false

  # The number of seconds after which a task without an "end event" is considered lost. +
  # The task "map" is then evicted. +
  # The default value is 0, which means no timeout, so no auto-eviction. +
  config :timeout, :validate => :number, :required => false, :default => 0


  # Default timeout (in seconds) when not defined in the plugin configuration
  DEFAULT_TIMEOUT = 1800

  # This is the state of the filter.
  # For each entry, the key is a "task_id" and the value is a map freely updatable by the 'code' config
  @@aggregate_maps = {}

  # Mutex used to synchronize access to 'aggregate_maps'
  @@mutex = Mutex.new

  # Aggregate instance which will evict all zombie Aggregate elements (older than timeout)
  @@eviction_instance = nil

  # last time eviction was launched
  @@last_eviction_timestamp = nil

  # Initialize plugin
  public
  def register
    # compile the lambda expression to call on each filter invocation
    eval("@codeblock = lambda { |event, map| #{@code} }", binding, "(aggregate filter code)")

    # define eviction_instance
    @@mutex.synchronize do
      if (@timeout > 0 && (@@eviction_instance.nil? || @timeout < @@eviction_instance.timeout))
        @@eviction_instance = self
        @logger.info("Aggregate, timeout: #{@timeout} seconds")
      end
    end
  end


  # This method is invoked each time an event matches the filter
  public
  def filter(event)
    # return nothing unless there's an actual filter event
    return unless filter?(event)

    # define the task id
    task_id = event.sprintf(@task_id)
    return if task_id.nil? || task_id.empty? || task_id == @task_id

    @@mutex.synchronize do
      # retrieve the current aggregate map
      aggregate_maps_element = @@aggregate_maps[task_id]
      if (aggregate_maps_element.nil?)
        return if @map_action == "update"
        aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(Time.now)
        @@aggregate_maps[task_id] = aggregate_maps_element
      else
        return if @map_action == "create"
      end
      map = aggregate_maps_element.map

      # execute the code to read/update the map and the event
      @codeblock.call(event, map)

      # delete the map if the task is ended
      @@aggregate_maps.delete(task_id) if @end_of_task
    end

    filter_matched(event)
  end

  # Necessary so that Logstash periodically calls the 'flush' method
  def periodic_flush
    true
  end

  # This method is invoked by Logstash every 5 seconds.
  def flush(options = {})
    # Protection against no timeout defined in the logstash conf : define a default eviction instance with timeout = DEFAULT_TIMEOUT seconds
    if (@@eviction_instance.nil?)
      @@eviction_instance = self
      @timeout = DEFAULT_TIMEOUT
    end

    # Launch eviction only every interval of (@timeout / 2) seconds
    if (@@eviction_instance == self && (@@last_eviction_timestamp.nil? || Time.now > @@last_eviction_timestamp + @timeout / 2))
      remove_expired_elements()
      @@last_eviction_timestamp = Time.now
    end

    return nil
  end


  # Remove the expired Aggregate elements from "aggregate_maps" if they are older than timeout
  def remove_expired_elements()
    min_timestamp = Time.now - @timeout
    @@mutex.synchronize do
      @@aggregate_maps.delete_if { |key, element| element.creation_timestamp < min_timestamp }
    end
  end

end # class LogStash::Filters::Aggregate

# Element of "aggregate_maps"
class LogStash::Filters::Aggregate::Element

  attr_accessor :creation_timestamp, :map

  def initialize(creation_timestamp)
    @creation_timestamp = creation_timestamp
    @map = {}
  end
end
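
A note on the `register` method above: it compiles the user's `code` string into a lambda once, so the filter does not re-`eval` Ruby source on every event. A standalone sketch of that pattern, under the simplifying assumption that `event` is a plain hash:
```ruby
code = "map['sql_duration'] += event['duration']"    # what a user would configure
codeblock = eval("lambda { |event, map| #{code} }")  # compiled once, at register time

event = { 'duration' => 12 }
map   = { 'sql_duration' => 0 }
codeblock.call(event, map)                           # cheap call per event
puts map['sql_duration']                             # => 12
```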
data/logstash-filter-aggregate.gemspec
ADDED
@@ -0,0 +1,24 @@
Gem::Specification.new do |s|
  s.name = 'logstash-filter-aggregate'
  s.version = '0.1.3'
  s.licenses = ['Apache License (2.0)']
  s.summary = "The aim of this filter is to aggregate information available among several events (typically log lines) belonging to the same task, and finally push the aggregated information into the final task event."
  s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
  s.authors = ["Elastic", "Fabien Baligand"]
  s.email = 'info@elastic.co'
  s.homepage = "https://github.com/logstash-plugins/logstash-filter-aggregate"
  s.require_paths = ["lib"]

  # Files
  s.files = Dir['lib/**/*','spec/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','LICENSE']

  # Tests
  s.test_files = s.files.grep(%r{^(test|spec|features)/})

  # Special flag to let us know this is actually a logstash plugin
  s.metadata = { "logstash_plugin" => "true", "logstash_group" => "filter" }

  # Gem dependencies
  s.add_runtime_dependency 'logstash-core', '>= 1.4.0', '< 2.0.0'
  s.add_development_dependency 'logstash-devutils', '~> 0'
end
data/spec/filters/aggregate_spec.rb
ADDED
@@ -0,0 +1,167 @@
# encoding: utf-8
require "logstash/devutils/rspec/spec_helper"
require "logstash/filters/aggregate"
require_relative "aggregate_spec_helper"

describe LogStash::Filters::Aggregate do

  before(:each) do
    set_eviction_instance(nil)
    aggregate_maps.clear()
    @start_filter = setup_filter({ "map_action" => "create", "code" => "map['sql_duration'] = 0" })
    @update_filter = setup_filter({ "map_action" => "update", "code" => "map['sql_duration'] += event['duration']" })
    @end_filter = setup_filter({ "map_action" => "update", "code" => "event.to_hash.merge!(map)", "end_of_task" => true, "timeout" => 5 })
  end

  context "Start event" do
    describe "and receiving an event without task_id" do
      it "does not record it" do
        @start_filter.filter(event())
        expect(aggregate_maps).to be_empty
      end
    end
    describe "and receiving an event with task_id" do
      it "records it" do
        event = start_event("taskid" => "id123")
        @start_filter.filter(event)

        expect(aggregate_maps.size).to eq(1)
        expect(aggregate_maps["id123"]).not_to be_nil
        expect(aggregate_maps["id123"].creation_timestamp).to be >= event["@timestamp"]
        expect(aggregate_maps["id123"].map["sql_duration"]).to eq(0)
      end
    end

    describe "and receiving two 'start events' for the same task_id" do
      it "keeps the first one and does nothing with the second one" do

        first_start_event = start_event("taskid" => "id124")
        @start_filter.filter(first_start_event)

        first_update_event = update_event("taskid" => "id124", "duration" => 2)
        @update_filter.filter(first_update_event)

        sleep(1)
        second_start_event = start_event("taskid" => "id124")
        @start_filter.filter(second_start_event)

        expect(aggregate_maps.size).to eq(1)
        expect(aggregate_maps["id124"].creation_timestamp).to be < second_start_event["@timestamp"]
        expect(aggregate_maps["id124"].map["sql_duration"]).to eq(first_update_event["duration"])
      end
    end
  end

  context "End event" do
    describe "receiving an 'end event'" do
      describe "but without a previous 'start event'" do
        it "does nothing with the event" do
          end_event = end_event("taskid" => "id124")
          @end_filter.filter(end_event)

          expect(aggregate_maps).to be_empty
          expect(end_event["sql_duration"]).to be_nil
        end
      end
    end
  end

  context "Start/end events interaction" do
    describe "receiving a 'start event'" do
      before(:each) do
        @task_id_value = "id_123"
        @start_event = start_event({"taskid" => @task_id_value})
        @start_filter.filter(@start_event)
        expect(aggregate_maps.size).to eq(1)
      end

      describe "and receiving an end event" do
        describe "and without an id" do
          it "does nothing" do
            end_event = end_event()
            @end_filter.filter(end_event)
            expect(aggregate_maps.size).to eq(1)
            expect(end_event["sql_duration"]).to be_nil
          end
        end

        describe "and an id different from the one of the 'start event'" do
          it "does nothing" do
            different_id_value = @task_id_value + "_different"
            @end_filter.filter(end_event("taskid" => different_id_value))

            expect(aggregate_maps.size).to eq(1)
            expect(aggregate_maps[@task_id_value]).not_to be_nil
          end
        end

        describe "and the same id as the 'start event'" do
          it "adds the 'sql_duration' field to the end event and deletes the recorded 'start event'" do
            expect(aggregate_maps.size).to eq(1)

            @update_filter.filter(update_event("taskid" => @task_id_value, "duration" => 2))

            end_event = end_event("taskid" => @task_id_value)
            @end_filter.filter(end_event)

            expect(aggregate_maps).to be_empty
            expect(end_event["sql_duration"]).to eq(2)
          end

        end
      end
    end
  end

  context "flush call" do
    before(:each) do
      @end_filter.timeout = 1
      expect(@end_filter.timeout).to eq(1)
      @task_id_value = "id_123"
      @start_event = start_event({"taskid" => @task_id_value})
      @start_filter.filter(@start_event)
      expect(aggregate_maps.size).to eq(1)
    end

    describe "no timeout defined in any filter" do
      it "defines a default timeout on a default filter" do
        set_eviction_instance(nil)
        expect(eviction_instance).to be_nil
        @end_filter.flush()
        expect(eviction_instance).to eq(@end_filter)
        expect(@end_filter.timeout).to eq(LogStash::Filters::Aggregate::DEFAULT_TIMEOUT)
      end
    end

    describe "timeout is defined on another filter" do
      it "does not update eviction_instance" do
        expect(eviction_instance).not_to be_nil
        @start_filter.flush()
        expect(eviction_instance).not_to eq(@start_filter)
        expect(eviction_instance).to eq(@end_filter)
      end
    end

    describe "no timeout defined on the filter" do
      it "does not remove the event" do
        sleep(2)
        @start_filter.flush()
        expect(aggregate_maps.size).to eq(1)
      end
    end

    describe "timeout defined on the filter" do
      it "does not remove the event if not expired" do
        @end_filter.flush()
        expect(aggregate_maps.size).to eq(1)
      end
      it "removes the event if expired" do
        sleep(2)
        @end_filter.flush()
        expect(aggregate_maps).to be_empty
      end
    end

  end

end
data/spec/filters/aggregate_spec_helper.rb
ADDED
@@ -0,0 +1,49 @@
# encoding: utf-8
require "logstash/filters/aggregate"

def event(data = {})
  data["message"] ||= "Log message"
  data["@timestamp"] ||= Time.now
  LogStash::Event.new(data)
end

def start_event(data = {})
  data["logger"] = "TASK_START"
  event(data)
end

def update_event(data = {})
  data["logger"] = "SQL"
  event(data)
end

def end_event(data = {})
  data["logger"] = "TASK_END"
  event(data)
end

def setup_filter(config = {})
  config["task_id"] ||= "%{taskid}"
  filter = LogStash::Filters::Aggregate.new(config)
  filter.register()
  return filter
end

def filter(event)
  @start_filter.filter(event)
  @update_filter.filter(event)
  @end_filter.filter(event)
end

def aggregate_maps()
  LogStash::Filters::Aggregate.class_variable_get(:@@aggregate_maps)
end

def eviction_instance()
  LogStash::Filters::Aggregate.class_variable_get(:@@eviction_instance)
end

def set_eviction_instance(new_value)
  LogStash::Filters::Aggregate.class_variable_set(:@@eviction_instance, new_value)
end
metadata
ADDED
@@ -0,0 +1,92 @@
--- !ruby/object:Gem::Specification
name: logstash-filter-aggregate
version: !ruby/object:Gem::Version
  version: 0.1.3
platform: ruby
authors:
- Elastic
- Fabien Baligand
autorequire:
bindir: bin
cert_chain: []
date: 2015-07-04 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: logstash-core
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - '>='
      - !ruby/object:Gem::Version
        version: 1.4.0
    - - <
      - !ruby/object:Gem::Version
        version: 2.0.0
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - '>='
      - !ruby/object:Gem::Version
        version: 1.4.0
    - - <
      - !ruby/object:Gem::Version
        version: 2.0.0
  prerelease: false
  type: :runtime
- !ruby/object:Gem::Dependency
  name: logstash-devutils
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - ~>
      - !ruby/object:Gem::Version
        version: '0'
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - ~>
      - !ruby/object:Gem::Version
        version: '0'
  prerelease: false
  type: :development
description: This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program
email: info@elastic.co
executables: []
extensions: []
extra_rdoc_files: []
files:
- BUILD.md
- CHANGELOG.md
- CONTRIBUTORS
- Gemfile
- LICENSE
- README.md
- lib/logstash/filters/aggregate.rb
- logstash-filter-aggregate.gemspec
- spec/filters/aggregate_spec.rb
- spec/filters/aggregate_spec_helper.rb
homepage: https://github.com/logstash-plugins/logstash-filter-aggregate
licenses:
- Apache License (2.0)
metadata:
  logstash_plugin: 'true'
  logstash_group: filter
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - '>='
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - '>='
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.4.5
signing_key:
specification_version: 4
summary: The aim of this filter is to aggregate information available among several events (typically log lines) belonging to the same task, and finally push the aggregated information into the final task event.
test_files:
- spec/filters/aggregate_spec.rb
- spec/filters/aggregate_spec_helper.rb