logstash-filter-aggregate 0.1.3

checksums.yaml.gz ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 60e752551408fc57869c2ab7866e76f8612aa0c6
+   data.tar.gz: 3757e961e87a984a827b07a748811638c2df4cb7
+ SHA512:
+   metadata.gz: b9cb0e54a99fd0933499f98101d72b11943a24bd039c84a76fa9e9578a4580ef4b4ff514c9141415fc6740f93e2c9b34fed4fa1df9b4d980df70eb520e399154
+   data.tar.gz: caec2ae83b4d7d0985e2a96b1cf40d5be58de8915b2b8a17f3c8043085db270d895e8c6f60fe250d0333390d507458306a3b39ec6a192ae18d7a4ab0071c89cf
data/BUILD.md ADDED
@@ -0,0 +1,86 @@
+ # Logstash Plugin
+
+ This is a plugin for [Logstash](https://github.com/elastic/logstash).
+
+ It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want.
+
+ ## Documentation
+
+ Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation, so any comments in the source code will first be converted into asciidoc and then into html. All plugin documentation is placed under one [central location](http://www.elasticsearch.org/guide/en/logstash/current/).
+
+ - For formatting code or config examples, you can use the asciidoc `[source,ruby]` directive, as shown below
+ - For more asciidoc formatting tips, see the excellent reference at https://github.com/elastic/docs#asciidoc-guide
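+
+ For example, this gem's own source file (`lib/logstash/filters/aggregate.rb`, shown below) documents its config with that directive; a minimal sketch of the comment style:
+ ```
+ # [source,ruby]
+ # ----------------------------------
+ # filter {
+ #   aggregate { task_id => "%{taskid}" }
+ # }
+ # ----------------------------------
+ ```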
+
+ ## Need Help?
+
+ Need help? Try #logstash on freenode IRC or the https://discuss.elastic.co/c/logstash discussion forum.
+
+ ## Developing
+
+ ### 1. Plugin Development and Testing
+
+ #### Code
+ - To get started, you'll need JRuby with the Bundler gem installed.
+
+ - Create a new plugin or clone an existing one from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization. We also provide [example plugins](https://github.com/logstash-plugins?query=example).
+
+ - Install dependencies
+ ```sh
+ bundle install
+ ```
+
+ #### Test
+
+ - Update your dependencies
+
+ ```sh
+ bundle install
+ ```
+
+ - Run tests
+
+ ```sh
+ bundle exec rspec
+ ```
+
+ ### 2. Running your unpublished Plugin in Logstash
+
+ #### 2.1 Run in a local Logstash clone
+
+ - Edit Logstash `Gemfile` and add the local plugin path, for example:
+ ```ruby
+ gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
+ ```
+ - Install plugin
+ ```sh
+ bin/plugin install --no-verify
+ ```
+ - Run Logstash with your plugin
+ ```sh
+ bin/logstash -e 'filter {awesome {}}'
+ ```
+ At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
+
+ #### 2.2 Run in an installed Logstash
+
+ You can use the same **2.1** method to run your plugin in an installed Logstash by editing its `Gemfile` and pointing the `:path` to your local plugin development directory, or you can build the gem and install it using:
+
+ - Build your plugin gem
+ ```sh
+ gem build logstash-filter-awesome.gemspec
+ ```
+ - Install the plugin from the Logstash home
+ ```sh
+ bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
+ ```
+ - Start Logstash and proceed to test the plugin
+
+ ## Contributing
+
+ All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
+
+ Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
+
+ It is more important to the community that you are able to contribute.
+
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
data/CHANGELOG.md ADDED
File without changes
data/CONTRIBUTORS ADDED
@@ -0,0 +1,10 @@
+ The following is a list of people who have contributed ideas, code, bug
+ reports, or in general have helped logstash along its way.
+
+ Contributors:
+ * Fabien Baligand (fbaligand)
+
+ Note: If you've sent us patches, bug reports, or otherwise contributed to
+ Logstash, and you aren't on the list above and want to be, please let us know
+ and we'll make sure you're here. Contributions from folks like you are what make
+ open source awesome.
data/Gemfile ADDED
@@ -0,0 +1,2 @@
+ source 'https://rubygems.org'
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,13 @@
+ Copyright (c) 2012-2015 Elasticsearch <http://www.elasticsearch.org>
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
data/README.md ADDED
@@ -0,0 +1,136 @@
+ # Logstash Filter Aggregate Documentation
+
+ The aim of this filter is to aggregate information available across several events (typically log lines) belonging to the same task, and finally to push the aggregated information into the final task event.
+
+ ## Example #1
+
+ * given these logs:
+ ```
+ INFO - 12345 - TASK_START - start
+ INFO - 12345 - SQL - sqlQuery1 - 12
+ INFO - 12345 - SQL - sqlQuery2 - 34
+ INFO - 12345 - TASK_END - end
+ ```
+
+ * you can aggregate "sql duration" for the whole task with this configuration:
+ ``` ruby
+ filter {
+   grok {
+     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
+   }
+
+   if [logger] == "TASK_START" {
+     aggregate {
+       task_id => "%{taskid}"
+       code => "map['sql_duration'] = 0"
+       map_action => "create"
+     }
+   }
+
+   if [logger] == "SQL" {
+     aggregate {
+       task_id => "%{taskid}"
+       code => "map['sql_duration'] += event['duration']"
+       map_action => "update"
+     }
+   }
+
+   if [logger] == "TASK_END" {
+     aggregate {
+       task_id => "%{taskid}"
+       code => "event['sql_duration'] = map['sql_duration']"
+       map_action => "update"
+       end_of_task => true
+       timeout => 120
+     }
+   }
+ }
+ ```
+
+ * the final event then looks like:
+ ``` ruby
+ {
+   "message" => "INFO - 12345 - TASK_END - end",
+   "sql_duration" => 46
+ }
+ ```
+
+ the field `sql_duration` is added and contains the sum of all SQL query durations (12 + 34 = 46).
+
+ ## Example #2
+
+ * If you have the same logs as in example #1, but without a start log:
+ ```
+ INFO - 12345 - SQL - sqlQuery1 - 12
+ INFO - 12345 - SQL - sqlQuery2 - 34
+ INFO - 12345 - TASK_END - end
+ ```
+
+ * you can also aggregate "sql duration" with a slightly different configuration:
+ ``` ruby
+ filter {
+   grok {
+     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
+   }
+
+   if [logger] == "SQL" {
+     aggregate {
+       task_id => "%{taskid}"
+       code => "map['sql_duration'] ||= 0 ; map['sql_duration'] += event['duration']"
+     }
+   }
+
+   if [logger] == "TASK_END" {
+     aggregate {
+       task_id => "%{taskid}"
+       code => "event['sql_duration'] = map['sql_duration']"
+       end_of_task => true
+       timeout => 120
+     }
+   }
+ }
+ ```
+
+ * the final event is exactly the same as in example #1
+ * the key point is the "||=" ruby operator:
+ it initializes the 'sql_duration' map entry to 0 only if this map entry is not already initialized
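+
+ As a minimal standalone Ruby sketch of that behavior (durations taken from the logs above):
+ ``` ruby
+ map = {}
+ map['sql_duration'] ||= 0   # key absent: initialized to 0
+ map['sql_duration'] += 12
+ map['sql_duration'] ||= 0   # key already present: left unchanged
+ map['sql_duration'] += 34
+ map['sql_duration']         # => 46
+ ```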
+
+
+ ## How it works
+ - the filter needs a "task_id" to correlate events (log lines) of the same task
+ - at the task beginning, the filter creates a map, attached to the task_id
+ - for each event, you can execute code using 'event' and 'map' (for instance, copy an event field to the map)
+ - in the final event, you can execute a last piece of code (for instance, add map data to the final event)
+ - after the final event, the map attached to the task is deleted
+ - in at least one filter configuration, it is recommended to define a timeout option to protect the filter against unterminated tasks. It tells the filter to delete expired maps (a sketch of the eviction sweep follows this list)
+ - if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
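+
+ Conceptually, eviction is a single timestamp sweep over the map store; a minimal Ruby sketch mirroring `remove_expired_elements` in `lib/logstash/filters/aggregate.rb` below:
+ ``` ruby
+ min_timestamp = Time.now - timeout
+ aggregate_maps.delete_if { |task_id, element| element.creation_timestamp < min_timestamp }
+ ```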
+
+ ## Aggregate Plugin Options
+ - **task_id:**
+ The expression defining the task ID to correlate logs.
+ This value must uniquely identify the task in the system.
+ This option is required.
+ Example value: `"%{application}%{my_task_id}"`
+
+ - **code:**
+ The code to execute to update the map, using the current event.
+ Or, on the contrary, the code to execute to update the event, using the current map.
+ You will have a 'map' variable and an 'event' variable available (that is the event itself).
+ This option is required.
+ Example value: `"map['sql_duration'] += event['duration']"`
+
+ - **map_action:**
+ Tells the filter what to do with the aggregate map.
+ `create`: creates the map, and executes the code only if the map wasn't created before
+ `update`: doesn't create the map, and executes the code only if the map was created before
+ `create_or_update`: creates the map if it wasn't created before, and executes the code in all cases
+ Default value: `create_or_update`
+
+ - **end_of_task:**
+ Tells the filter that the task is ended, and therefore, to delete the map after code execution.
+ Default value: `false`
+
+ - **timeout:**
+ The amount of seconds after which a task "end event" is considered lost.
+ The task "map" is then evicted.
+ The default value is 0, which means no timeout, so no auto eviction (apart from the global 1800-second default described above).
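+
+ Putting these together, a minimal sketch of a single aggregate block (the `taskid` and `sql_duration` names come from the examples above):
+ ``` ruby
+ aggregate {
+   task_id     => "%{taskid}"                                  # required: correlates events of the same task
+   code        => "map['sql_duration'] += event['duration']"   # required: runs with 'map' and 'event' in scope
+   map_action  => "update"                                     # optional, default "create_or_update"
+   end_of_task => false                                        # optional, default false
+   timeout     => 120                                          # optional, default 0
+ }
+ ```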
data/lib/logstash/filters/aggregate.rb ADDED
@@ -0,0 +1,255 @@
+ # encoding: utf-8
+
+ require "logstash/filters/base"
+ require "logstash/namespace"
+ require "thread"
+
+ #
+ # The aim of this filter is to aggregate information available across several events (typically log lines) belonging to the same task,
+ # and finally to push the aggregated information into the final task event.
+ #
+ # An example of use can be:
+ #
+ # * given these logs:
+ # [source,log]
+ # ----------------------------------
+ # INFO - 12345 - TASK_START - start
+ # INFO - 12345 - SQL - sqlQuery1 - 12
+ # INFO - 12345 - SQL - sqlQuery2 - 34
+ # INFO - 12345 - TASK_END - end
+ # ----------------------------------
+ #
+ # * you can aggregate "sql duration" with this configuration:
+ # [source,ruby]
+ # ----------------------------------
+ # filter {
+ #   grok {
+ #     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
+ #   }
+ #
+ #   if [logger] == "TASK_START" {
+ #     aggregate {
+ #       task_id => "%{taskid}"
+ #       code => "map['sql_duration'] = 0"
+ #       map_action => "create"
+ #     }
+ #   }
+ #
+ #   if [logger] == "SQL" {
+ #     aggregate {
+ #       task_id => "%{taskid}"
+ #       code => "map['sql_duration'] += event['duration']"
+ #       map_action => "update"
+ #     }
+ #   }
+ #
+ #   if [logger] == "TASK_END" {
+ #     aggregate {
+ #       task_id => "%{taskid}"
+ #       code => "event['sql_duration'] = map['sql_duration']"
+ #       map_action => "update"
+ #       end_of_task => true
+ #       timeout => 120
+ #     }
+ #   }
+ # }
+ # ----------------------------------
+ #
+ # * the final event then looks like:
+ # [source,json]
+ # ----------------------------------
+ # {
+ #   "message" => "INFO - 12345 - TASK_END - end",
+ #   "sql_duration" => 46
+ # }
+ # ----------------------------------
+ #
+ # the field `sql_duration` is added and contains the sum of all sql query durations.
+ #
+ #
+ # * Another example: imagine you have the same logs as in example #1, but without a start log:
+ # [source,log]
+ # ----------------------------------
+ # INFO - 12345 - SQL - sqlQuery1 - 12
+ # INFO - 12345 - SQL - sqlQuery2 - 34
+ # INFO - 12345 - TASK_END - end
+ # ----------------------------------
+ #
+ # * you can also aggregate "sql duration" with a slightly different configuration:
+ # [source,ruby]
+ # ----------------------------------
+ # filter {
+ #   grok {
+ #     match => [ "message", "%{LOGLEVEL:loglevel} - %{NOTSPACE:taskid} - %{NOTSPACE:logger} - %{WORD:label}( - %{INT:duration:int})?" ]
+ #   }
+ #
+ #   if [logger] == "SQL" {
+ #     aggregate {
+ #       task_id => "%{taskid}"
+ #       code => "map['sql_duration'] ||= 0 ; map['sql_duration'] += event['duration']"
+ #     }
+ #   }
+ #
+ #   if [logger] == "TASK_END" {
+ #     aggregate {
+ #       task_id => "%{taskid}"
+ #       code => "event['sql_duration'] = map['sql_duration']"
+ #       end_of_task => true
+ #       timeout => 120
+ #     }
+ #   }
+ # }
+ # ----------------------------------
+ #
+ # * the final event is exactly the same as in example #1
+ # * the key point is the "||=" ruby operator. +
+ # it initializes the 'sql_duration' map entry to 0 only if this map entry is not already initialized
+ #
+ #
+ # How it works:
+ # - the filter needs a "task_id" to correlate events (log lines) of the same task
+ # - at the task beginning, the filter creates a map, attached to the task_id
+ # - for each event, you can execute code using 'event' and 'map' (for instance, copy an event field to the map)
+ # - in the final event, you can execute a last piece of code (for instance, add map data to the final event)
+ # - after the final event, the map attached to the task is deleted
+ # - in at least one filter configuration, it is recommended to define a timeout option to protect the filter against unterminated tasks. It tells the filter to delete expired maps
+ # - if no timeout is defined, by default, all maps older than 1800 seconds are automatically deleted
+ #
+ #
+ class LogStash::Filters::Aggregate < LogStash::Filters::Base
+
+   config_name "aggregate"
+
+   # The expression defining the task ID to correlate logs. +
+   # This value must uniquely identify the task in the system. +
+   # Example value: "%{application}%{my_task_id}" +
+   config :task_id, :validate => :string, :required => true
+
+   # The code to execute to update the map, using the current event. +
+   # Or, on the contrary, the code to execute to update the event, using the current map. +
+   # You will have a 'map' variable and an 'event' variable available (that is the event itself). +
+   # Example value: "map['sql_duration'] += event['duration']" +
+   config :code, :validate => :string, :required => true
+
+   # Tells the filter what to do with the aggregate map (default: "create_or_update"). +
+   # create: creates the map, and executes the code only if the map wasn't created before +
+   # update: doesn't create the map, and executes the code only if the map was created before +
+   # create_or_update: creates the map if it wasn't created before, and executes the code in all cases +
+   config :map_action, :validate => :string, :default => "create_or_update"
+
+   # Tells the filter that the task is ended, and therefore, to delete the map after code execution.
+   config :end_of_task, :validate => :boolean, :default => false
+
+   # The amount of seconds after which a task "end event" is considered lost. +
+   # The task "map" is then evicted. +
+   # The default value is 0, which means no timeout and so no auto eviction. +
+   config :timeout, :validate => :number, :required => false, :default => 0
+
+
+   # Default timeout (in seconds) when not defined in the plugin configuration
+   DEFAULT_TIMEOUT = 1800
+
+   # This is the state of the filter.
+   # For each entry, the key is the "task_id" and the value is a map freely updatable by the 'code' config
+   @@aggregate_maps = {}
+
+   # Mutex used to synchronize access to 'aggregate_maps'
+   @@mutex = Mutex.new
+
+   # Aggregate instance which will evict all zombie Aggregate elements (older than timeout)
+   @@eviction_instance = nil
+
+   # last time eviction was launched
+   @@last_eviction_timestamp = nil
+
+   # Initialize the plugin
+   public
+   def register
+     # compile the 'code' config into a lambda, called on each filter invocation with the event and its task map
+     eval("@codeblock = lambda { |event, map| #{@code} }", binding, "(aggregate filter code)")
+
+     # define the eviction instance: the aggregate filter with the smallest timeout drives eviction
+     @@mutex.synchronize do
+       if (@timeout > 0 && (@@eviction_instance.nil? || @timeout < @@eviction_instance.timeout))
+         @@eviction_instance = self
+         @logger.info("Aggregate, timeout: #{@timeout} seconds")
+       end
+     end
+   end
+
+
+   # This method is invoked each time an event matches the filter
+   public
+   def filter(event)
+     # return nothing unless there's an actual filter event
+     return unless filter?(event)
+
+     # define the task id
+     task_id = event.sprintf(@task_id)
+     # give up if the sprintf pattern could not be resolved (task_id still equals the raw pattern)
+     return if task_id.nil? || task_id.empty? || task_id == @task_id
+
+     @@mutex.synchronize do
+       # retrieve the current aggregate map
+       aggregate_maps_element = @@aggregate_maps[task_id]
+       if (aggregate_maps_element.nil?)
+         return if @map_action == "update"
+         aggregate_maps_element = LogStash::Filters::Aggregate::Element.new(Time.now)
+         @@aggregate_maps[task_id] = aggregate_maps_element
+       else
+         return if @map_action == "create"
+       end
+       map = aggregate_maps_element.map
+
+       # execute the code to read/update map and event
+       @codeblock.call(event, map)
+
+       # delete the map if the task is ended
+       @@aggregate_maps.delete(task_id) if @end_of_task
+     end
+
+     filter_matched(event)
+   end
+
+   # Necessary to tell Logstash to periodically call the 'flush' method
+   def periodic_flush
+     true
+   end
+
+   # This method is invoked by Logstash every 5 seconds.
+   def flush(options = {})
+     # Protection against no timeout defined in the Logstash conf: define a default eviction instance with timeout = DEFAULT_TIMEOUT seconds
+     if (@@eviction_instance.nil?)
+       @@eviction_instance = self
+       @timeout = DEFAULT_TIMEOUT
+     end
+
+     # Launch eviction only every interval of (@timeout / 2) seconds
+     if (@@eviction_instance == self && (@@last_eviction_timestamp.nil? || Time.now > @@last_eviction_timestamp + @timeout / 2))
+       remove_expired_elements()
+       @@last_eviction_timestamp = Time.now
+     end
+
+     return nil
+   end
+
+
+   # Remove the expired Aggregate elements from "aggregate_maps" if they are older than timeout
+   def remove_expired_elements()
+     min_timestamp = Time.now - @timeout
+     @@mutex.synchronize do
+       @@aggregate_maps.delete_if { |key, element| element.creation_timestamp < min_timestamp }
+     end
+   end
+
+ end # class LogStash::Filters::Aggregate
+
+ # Element of "aggregate_maps"
+ class LogStash::Filters::Aggregate::Element
+
+   attr_accessor :creation_timestamp, :map
+
+   def initialize(creation_timestamp)
+     @creation_timestamp = creation_timestamp
+     @map = {}
+   end
+ end
data/logstash-filter-aggregate.gemspec ADDED
@@ -0,0 +1,24 @@
+ Gem::Specification.new do |s|
+   s.name = 'logstash-filter-aggregate'
+   s.version = '0.1.3'
+   s.licenses = ['Apache License (2.0)']
+   s.summary = "The aim of this filter is to aggregate information available across several events (typically log lines) belonging to the same task, and finally push the aggregated information into the final task event."
+   s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program."
+   s.authors = ["Elastic", "Fabien Baligand"]
+   s.email = 'info@elastic.co'
+   s.homepage = "https://github.com/logstash-plugins/logstash-filter-aggregate"
+   s.require_paths = ["lib"]
+
+   # Files
+   s.files = Dir['lib/**/*','spec/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','LICENSE']
+
+   # Tests
+   s.test_files = s.files.grep(%r{^(test|spec|features)/})
+
+   # Special flag to let us know this is actually a logstash plugin
+   s.metadata = { "logstash_plugin" => "true", "logstash_group" => "filter" }
+
+   # Gem dependencies
+   s.add_runtime_dependency 'logstash-core', '>= 1.4.0', '< 2.0.0'
+   s.add_development_dependency 'logstash-devutils', '~> 0'
+ end
data/spec/filters/aggregate_spec.rb ADDED
@@ -0,0 +1,167 @@
+ # encoding: utf-8
+ require "logstash/devutils/rspec/spec_helper"
+ require "logstash/filters/aggregate"
+ require_relative "aggregate_spec_helper"
+
+ describe LogStash::Filters::Aggregate do
+
+   before(:each) do
+     set_eviction_instance(nil)
+     aggregate_maps.clear()
+     @start_filter = setup_filter({ "map_action" => "create", "code" => "map['sql_duration'] = 0" })
+     @update_filter = setup_filter({ "map_action" => "update", "code" => "map['sql_duration'] += event['duration']" })
+     @end_filter = setup_filter({ "map_action" => "update", "code" => "event.to_hash.merge!(map)", "end_of_task" => true, "timeout" => 5 })
+   end
+
+   context "Start event" do
+     describe "and receiving an event without task_id" do
+       it "does not record it" do
+         @start_filter.filter(event())
+         expect(aggregate_maps).to be_empty
+       end
+     end
+     describe "and receiving an event with task_id" do
+       it "records it" do
+         event = start_event("taskid" => "id123")
+         @start_filter.filter(event)
+
+         expect(aggregate_maps.size).to eq(1)
+         expect(aggregate_maps["id123"]).not_to be_nil
+         expect(aggregate_maps["id123"].creation_timestamp).to be >= event["@timestamp"]
+         expect(aggregate_maps["id123"].map["sql_duration"]).to eq(0)
+       end
+     end
+
+     describe "and receiving two 'start events' for the same task_id" do
+       it "keeps the first one and does nothing with the second one" do
+
+         first_start_event = start_event("taskid" => "id124")
+         @start_filter.filter(first_start_event)
+
+         first_update_event = update_event("taskid" => "id124", "duration" => 2)
+         @update_filter.filter(first_update_event)
+
+         sleep(1)
+         second_start_event = start_event("taskid" => "id124")
+         @start_filter.filter(second_start_event)
+
+         expect(aggregate_maps.size).to eq(1)
+         expect(aggregate_maps["id124"].creation_timestamp).to be < second_start_event["@timestamp"]
+         expect(aggregate_maps["id124"].map["sql_duration"]).to eq(first_update_event["duration"])
+       end
+     end
+   end
+
+   context "End event" do
+     describe "receiving an end event" do
+       describe "but without a previous 'start event'" do
+         it "does nothing with the event" do
+           end_event = end_event("taskid" => "id124")
+           @end_filter.filter(end_event)
+
+           expect(aggregate_maps).to be_empty
+           expect(end_event["sql_duration"]).to be_nil
+         end
+       end
+     end
+   end
+
+   context "Start/end events interaction" do
+     describe "receiving a 'start event'" do
+       before(:each) do
+         @task_id_value = "id_123"
+         @start_event = start_event({"taskid" => @task_id_value})
+         @start_filter.filter(@start_event)
+         expect(aggregate_maps.size).to eq(1)
+       end
+
+       describe "and receiving an end event" do
+         describe "and without an id" do
+           it "does nothing" do
+             end_event = end_event()
+             @end_filter.filter(end_event)
+             expect(aggregate_maps.size).to eq(1)
+             expect(end_event["sql_duration"]).to be_nil
+           end
+         end
+
+         describe "and an id different from the one of the 'start event'" do
+           it "does nothing" do
+             different_id_value = @task_id_value + "_different"
+             @end_filter.filter(end_event("taskid" => different_id_value))
+
+             expect(aggregate_maps.size).to eq(1)
+             expect(aggregate_maps[@task_id_value]).not_to be_nil
+           end
+         end
+
+         describe "and the same id as the 'start event'" do
+           it "adds the 'sql_duration' field to the end event and deletes the recorded 'start event'" do
+             expect(aggregate_maps.size).to eq(1)
+
+             @update_filter.filter(update_event("taskid" => @task_id_value, "duration" => 2))
+
+             end_event = end_event("taskid" => @task_id_value)
+             @end_filter.filter(end_event)
+
+             expect(aggregate_maps).to be_empty
+             expect(end_event["sql_duration"]).to eq(2)
+           end
+
+         end
+       end
+     end
+   end
+
+   context "flush call" do
+     before(:each) do
+       @end_filter.timeout = 1
+       expect(@end_filter.timeout).to eq(1)
+       @task_id_value = "id_123"
+       @start_event = start_event({"taskid" => @task_id_value})
+       @start_filter.filter(@start_event)
+       expect(aggregate_maps.size).to eq(1)
+     end
+
+     describe "no timeout defined in any filter" do
+       it "defines a default timeout on a default filter" do
+         set_eviction_instance(nil)
+         expect(eviction_instance).to be_nil
+         @end_filter.flush()
+         expect(eviction_instance).to eq(@end_filter)
+         expect(@end_filter.timeout).to eq(LogStash::Filters::Aggregate::DEFAULT_TIMEOUT)
+       end
+     end
+
+     describe "timeout is defined on another filter" do
+       it "eviction_instance is not updated" do
+         expect(eviction_instance).not_to be_nil
+         @start_filter.flush()
+         expect(eviction_instance).not_to eq(@start_filter)
+         expect(eviction_instance).to eq(@end_filter)
+       end
+     end
+
+     describe "no timeout defined on the filter" do
+       it "event is not removed" do
+         sleep(2)
+         @start_filter.flush()
+         expect(aggregate_maps.size).to eq(1)
+       end
+     end
+
+     describe "timeout defined on the filter" do
+       it "event is not removed if not expired" do
+         @end_filter.flush()
+         expect(aggregate_maps.size).to eq(1)
+       end
+       it "event is removed if expired" do
+         sleep(2)
+         @end_filter.flush()
+         expect(aggregate_maps).to be_empty
+       end
+     end
+
+   end
+
+ end
data/spec/filters/aggregate_spec_helper.rb ADDED
@@ -0,0 +1,49 @@
+ # encoding: utf-8
+ require "logstash/filters/aggregate"
+
+ def event(data = {})
+   data["message"] ||= "Log message"
+   data["@timestamp"] ||= Time.now
+   LogStash::Event.new(data)
+ end
+
+ def start_event(data = {})
+   data["logger"] = "TASK_START"
+   event(data)
+ end
+
+ def update_event(data = {})
+   data["logger"] = "SQL"
+   event(data)
+ end
+
+ def end_event(data = {})
+   data["logger"] = "TASK_END"
+   event(data)
+ end
+
+ def setup_filter(config = {})
+   config["task_id"] ||= "%{taskid}"
+   filter = LogStash::Filters::Aggregate.new(config)
+   filter.register()
+   return filter
+ end
+
+ def filter(event)
+   @start_filter.filter(event)
+   @update_filter.filter(event)
+   @end_filter.filter(event)
+ end
+
+ def aggregate_maps()
+   LogStash::Filters::Aggregate.class_variable_get(:@@aggregate_maps)
+ end
+
+ def eviction_instance()
+   LogStash::Filters::Aggregate.class_variable_get(:@@eviction_instance)
+ end
+
+ def set_eviction_instance(new_value)
+   LogStash::Filters::Aggregate.class_variable_set(:@@eviction_instance, new_value)
+ end
+
metadata ADDED
@@ -0,0 +1,92 @@
+ --- !ruby/object:Gem::Specification
+ name: logstash-filter-aggregate
+ version: !ruby/object:Gem::Version
+   version: 0.1.3
+ platform: ruby
+ authors:
+ - Elastic
+ - Fabien Baligand
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2015-07-04 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: logstash-core
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: 1.4.0
+     - - <
+       - !ruby/object:Gem::Version
+         version: 2.0.0
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: 1.4.0
+     - - <
+       - !ruby/object:Gem::Version
+         version: 2.0.0
+   prerelease: false
+   type: :runtime
+ - !ruby/object:Gem::Dependency
+   name: logstash-devutils
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '0'
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: '0'
+   prerelease: false
+   type: :development
+ description: This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program.
+ email: info@elastic.co
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - BUILD.md
+ - CHANGELOG.md
+ - CONTRIBUTORS
+ - Gemfile
+ - LICENSE
+ - README.md
+ - lib/logstash/filters/aggregate.rb
+ - logstash-filter-aggregate.gemspec
+ - spec/filters/aggregate_spec.rb
+ - spec/filters/aggregate_spec_helper.rb
+ homepage: https://github.com/logstash-plugins/logstash-filter-aggregate
+ licenses:
+ - Apache License (2.0)
+ metadata:
+   logstash_plugin: 'true'
+   logstash_group: filter
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.4.5
+ signing_key:
+ specification_version: 4
+ summary: The aim of this filter is to aggregate information available across several events (typically log lines) belonging to the same task, and finally push the aggregated information into the final task event.
+ test_files:
+ - spec/filters/aggregate_spec.rb
+ - spec/filters/aggregate_spec_helper.rb