logstash-filter-multiline 0.1.2 → 0.1.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: ebb52611156af53a3586ad87b94264defff21b4d
- data.tar.gz: d5678da5ec5b5eb35ae1c57047bb4cb08ccb318f
+ metadata.gz: c8be8f44cc90d0771e53b9c642ccd5456c81ef7d
+ data.tar.gz: 57cb66091186f91f1b0d7df4292b8dc38e2b4386
  SHA512:
- metadata.gz: 0c6c2c36a07e1b30119beb9e76a9271c898a5e2d2259f349aaed5e941beaef81258f6fe02af4eb17844820fdfa9eda4a20ca20afa2dd289a6907d11ba6829c14
- data.tar.gz: e6f495b4efa8cec399b9ce6aab65fdd1edd7cd2c5ba952c703d949fa65b2899c94b6646c84c947228963891877ece5730805e182cbb72d291533c8cf5c921304
+ metadata.gz: 240769acdc6b6efce550a9d052c124015d307de079c84a689fbaf2a906e3ca5f029c52d6ce793d95e4f7ee7721a0699940112cf13891c097196973fa9b59301c
+ data.tar.gz: 07af22be599d635feb5922a6a3b4997f70b6e8ee5990aaaf4cfb6007909e36b3abbfdd3c1d77d903cb41cad6a35bc315be54402e3c2c0fabf0be55e82677cb14
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2012-2014 Elasticsearch <http://www.elasticsearch.org>
+ Copyright (c) 2012-2015 Elasticsearch <http://www.elasticsearch.org>
  
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
data/README.md ADDED
@@ -0,0 +1,95 @@
+ # Logstash Plugin
+
+ This is a plugin for [Logstash](https://github.com/elasticsearch/logstash).
+
+ It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
+
+ ## Documentation
+
+ Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one [central location](http://www.elasticsearch.org/guide/en/logstash/current/).
+
+ - For formatting code or config example, you can use the asciidoc `[source,ruby]` directive
+ - For more asciidoc formatting tips, see the excellent reference here https://github.com/elasticsearch/docs#asciidoc-guide
+
+ ## Need Help?
+
+ Need help? Try #logstash on freenode IRC or the logstash-users@googlegroups.com mailing list.
+
+ ## Developing
+
+ ### 1. Plugin Developement and Testing
+
+ #### Code
+ - To get started, you'll need JRuby with the Bundler gem installed.
+
+ - Create a new plugin or clone and existing from the GitHub [logstash-plugins](https://github.com/logstash-plugins) organization.
+
+ - Install dependencies
+ ```sh
+ bundle install
+ ```
+
+ #### Test
+
+ ```sh
+ bundle exec rspec
+ ```
+
+ The Logstash code required to run the tests/specs is specified in the `Gemfile` by the line similar to:
+ ```ruby
+ gem "logstash", :github => "elasticsearch/logstash", :branch => "1.5"
+ ```
+ To test against another version or a local Logstash, edit the `Gemfile` to specify an alternative location, for example:
+ ```ruby
+ gem "logstash", :github => "elasticsearch/logstash", :ref => "master"
+ ```
+ ```ruby
+ gem "logstash", :path => "/your/local/logstash"
+ ```
+
+ Then update your dependencies and run your tests:
+
+ ```sh
+ bundle install
+ bundle exec rspec
+ ```
+
+ ### 2. Running your unpublished Plugin in Logstash
+
+ #### 2.1 Run in a local Logstash clone
+
+ - Edit Logstash `tools/Gemfile` and add the local plugin path, for example:
+ ```ruby
+ gem "logstash-filter-awesome", :path => "/your/local/logstash-filter-awesome"
+ ```
+ - Update Logstash dependencies
+ ```sh
+ rake vendor:gems
+ ```
+ - Run Logstash with your plugin
+ ```sh
+ bin/logstash -e 'filter {awesome {}}'
+ ```
+ At this point any modifications to the plugin code will be applied to this local Logstash setup. After modifying the plugin, simply rerun Logstash.
+
+ #### 2.2 Run in an installed Logstash
+
+ - Build your plugin gem
+ ```sh
+ gem build logstash-filter-awesome.gemspec
+ ```
+ - Install the plugin from the Logstash home
+ ```sh
+ bin/plugin install /your/local/plugin/logstash-filter-awesome.gem
+ ```
+ - Start Logstash and proceed to test the plugin
+
+ ## Contributing
+
+ All contributions are welcome: ideas, patches, documentation, bug reports, complaints, and even something you drew up on a napkin.
+
+ Programming is not a required skill. Whatever you've seen about open source and maintainers or community members saying "send patches or die" - you will not see that here.
+
+ It is more important to me that you are able to contribute.
+
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elasticsearch/logstash/blob/master/CONTRIBUTING.md) file.
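Editor's note on the README's "Test" section: instead of hand-editing the `Gemfile` each time, the Logstash source could be chosen from the environment. The sketch below is only an illustration. The `LOGSTASH_PATH` variable and the `logstash_source` helper are hypothetical names, not part of the plugin or of Logstash; the option hashes are the ones quoted in the README.

```ruby
# Hypothetical helper: returns the option hash for the `gem "logstash"` line.
# LOGSTASH_PATH is an assumed variable name chosen for this sketch.
def logstash_source(env = ENV)
  local = env["LOGSTASH_PATH"]
  if local && !local.empty?
    # Use a local Logstash checkout, as in the README's :path example.
    { :path => local }
  else
    # Fall back to the branch named in the README.
    { :github => "elasticsearch/logstash", :branch => "1.5" }
  end
end

# Inside a Gemfile the helper would be used like this (commented out so the
# sketch also runs outside Bundler):
#   gem "logstash", logstash_source
p logstash_source("LOGSTASH_PATH" => "/your/local/logstash")
p logstash_source({})
```

After changing the variable, `bundle install` still has to be rerun before `bundle exec rspec`, exactly as the README describes.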
data/lib/logstash/filters/multiline.rb CHANGED
@@ -63,7 +63,12 @@ require "set"
  class LogStash::Filters::Multiline < LogStash::Filters::Base
  
  config_name "multiline"
- milestone 3
+
+ # The field name to execute the pattern match on.
+ config :source, :validate => :string, :default => "message"
+
+ # Allow duplcate values on the source field.
+ config :allow_duplicates, :validate => :boolean, :default => true
  
  # The regular expression to match.
  config :pattern, :validate => :string, :required => true
@@ -109,8 +114,7 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  # Optional.
  config :periodic_flush, :validate => :boolean, :default => true
  
-
- # Detect if we are running from a jarfile, pick the right path.
+ # Register default pattern paths
  @@patterns_path = Set.new
  @@patterns_path += [LogStash::Patterns::Core.path]
  
@@ -137,10 +141,7 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  
  @patterns_dir = @@patterns_path.to_a + @patterns_dir
  @patterns_dir.each do |path|
- if File.directory?(path)
- path = File.join(path, "*")
- end
-
+ path = File.join(path, "*") if File.directory?(path)
  Dir.glob(path).each do |file|
  @logger.info("Grok loading patterns from file", :path => file)
  @grok.add_patterns_from_file(file)
@@ -166,17 +167,14 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  def filter(event)
  return unless filter?(event)
  
- match = event["message"].is_a?(Array) ? @grok.match(event["message"].first) : @grok.match(event["message"])
- match = (match and !@negate) || (!match and @negate) # add negate option
+ match = event[@source].is_a?(Array) ? @grok.match(event[@source].first) : @grok.match(event[@source])
+ match = (match && !@negate) || (!match && @negate) # add negate option
  
- @logger.debug? && @logger.debug("Multiline", :pattern => @pattern, :message => event["message"], :match => match, :negate => @negate)
+ @logger.debug? && @logger.debug("Multiline", :pattern => @pattern, :message => event[@source], :match => match, :negate => @negate)
  
  multiline_filter!(event, match)
  
- unless event.cancelled?
- collapse_event!(event)
- filter_matched(event) if match
- end
+ filter_matched(event) unless event.cancelled?
  end # def filter
  
  # flush any pending messages
@@ -186,25 +184,28 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  # @return [Array<LogStash::Event>] list of flushed events
  public
  def flush(options = {})
- expired = nil
-
  # note that thread safety concerns are not necessary here because the multiline filter
- # is not thread safe thus cannot be run in multiple folterworker threads and flushing
+ # is not thread safe thus cannot be run in multiple filterworker threads and flushing
  # is called by the same thread
  
  # select all expired events from the @pending hash into a new expired hash
  # if :final flush then select all events
- expired = @pending.inject({}) do |r, (key, event)|
- age = Time.now - Array(event["@timestamp"]).first.time
- r[key] = event if (age >= @max_age) || options[:final]
- r
+ expired = @pending.inject({}) do |result, (key, events)|
+ unless events.empty?
+ age = Time.now - events.first["@timestamp"].time
+ result[key] = events if (age >= @max_age) || options[:final]
+ end
+ result
  end
  
- # delete expired items from @pending hash
- expired.each{|key, event| @pending.delete(key)}
-
- # return list of uncancelled and collapsed expired events
- expired.map{|key, event| event.uncancel; collapse_event!(event)}
+ # return list of uncancelled expired events
+ expired.map do |key, events|
+ @pending.delete(key)
+ event = merge(events)
+ event.uncancel
+ filter_matched(event)
+ event
+ end
  end # def flush
  
  public
@@ -216,29 +217,24 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  
  def previous_filter!(event, match)
  key = event.sprintf(@stream_identity)
-
- pending = @pending[key]
+ pending = @pending[key] ||= []
  
  if match
+ # previous previous line is part of this event. append it to the event and cancel it
  event.tag(MULTILINE_TAG)
- # previous previous line is part of this event.
- # append it to the event and cancel it
- if pending
- pending.append(event)
- else
- @pending[key] = event
- end
+ pending << event
  event.cancel
  else
- # this line is not part of the previous event
- # if we have a pending event, it's done, send it.
+ # this line is not part of the previous event if we have a pending event, it's done, send it.
  # put the current event into pending
- if pending
+ unless pending.empty?
  tmp = event.to_hash
- event.overwrite(pending)
- @pending[key] = LogStash::Event.new(tmp)
+ event.overwrite(merge(pending))
+ pending.clear # avoid array creation
+ pending << LogStash::Event.new(tmp)
  else
- @pending[key] = event
+ pending.clear # avoid array creation
+ pending << event
  event.cancel
  end
  end # if match
@@ -246,35 +242,66 @@ class LogStash::Filters::Multiline < LogStash::Filters::Base
  
  def next_filter!(event, match)
  key = event.sprintf(@stream_identity)
-
- # protect @pending for race condition between the flush thread and the worker thread
- pending = @pending[key]
+ pending = @pending[key] ||= []
  
  if match
+ # this line is part of a multiline event, the next line will be part, too, put it into pending.
  event.tag(MULTILINE_TAG)
- # this line is part of a multiline event, the next
- # line will be part, too, put it into pending.
- if pending
- pending.append(event)
- else
- @pending[key] = event
- end
+ pending << event
  event.cancel
  else
- # if we have something in pending, join it with this message
- # and send it. otherwise, this is a new message and not part of
- # multiline, send it.
- if pending
- pending.append(event)
- event.overwrite(pending)
- @pending.delete(key)
+ # if we have something in pending, join it with this message and send it.
+ # otherwise, this is a new message and not part of multiline, send it.
+ unless pending.empty?
+ event.overwrite(merge(pending << event))
+ pending.clear
  end
  end # if match
  end
  
- def collapse_event!(event)
- event["message"] = event["message"].join("\n") if event["message"].is_a?(Array)
- event.timestamp = event.timestamp.first if event.timestamp.is_a?(Array)
- event
+ # merge a list of events. @timestamp for the resulting merged event will be from
+ # the "oldest" (events.first). all @source fields will be deduplicated depending
+ # on @allow_duplicates and joined with \n. all other fields will be deduplicated.
+ # @param events [Array<Event>] the list of events to merge
+ # @return [Event] the resulting merged event
+ def merge(events)
+ dups_key = @allow_duplicates ? @source : nil
+
+ data = events.inject({}) do |result, event|
+ self.class.event_hash_merge!(result, event.to_hash_with_metadata, dups_key)
+ end
+
+ # merged event @timestamp is from first event in sequence
+ data["@timestamp"] = Array(data["@timestamp"]).first
+ # collapse all @source field values
+ data[@source] = Array(data[@source]).join("\n")
+ LogStash::Event.new(data)
  end
+
+ # merge two events data hash, src into dst and handle duplicate values for dups_key
+ # @param dst [Hash] the event to merge into, dst will be mutated
+ # @param src [Hash] the event to merge in dst
+ # @param dups_key [String] the field key to keep duplicate values
+ # @return [Hash] mutated dst
+ def self.event_hash_merge!(dst, src, dups_key = nil)
+ src.each do |key, svalue|
+ dst[key] = if dst.has_key?(key)
+ dvalue = dst[key]
+
+ if dvalue.is_a?(Hash) && svalue.is_a?(Hash)
+ event_hash_merge!(dvalue, svalue, dups_key)
+ else
+ v = (dups_key == key) ? Array(dvalue) + Array(svalue) : Array(dvalue) | Array(svalue)
+ # the v result is always an Array, if none of the fields were arrays and there is a
+ # single value in the array, return the value, not the array
+ dvalue.is_a?(Array) || svalue.is_a?(Array) ? v : (v.size == 1 ? v.first : v)
+ end
+ else
+ svalue
+ end
+ end
+
+ dst
+ end # def self.hash_merge
+
  end # class LogStash::Filters::Multiline
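Editor's note: the merge rules introduced in this hunk can be tried outside Logstash. The sketch below restates `event_hash_merge!` on plain Ruby hashes and pairs it with a simplified stand-in for `merge`. The name `merge_hashes` is hypothetical, and the sketch assumes plain hashes in place of `LogStash::Event` objects, so the `@timestamp` handling is left out.

```ruby
# Merge src into dst. Values under dups_key are concatenated (duplicates kept);
# values under every other key are combined with a set union (duplicates dropped).
def event_hash_merge!(dst, src, dups_key = nil)
  src.each do |key, svalue|
    dst[key] =
      if dst.key?(key)
        dvalue = dst[key]
        if dvalue.is_a?(Hash) && svalue.is_a?(Hash)
          event_hash_merge!(dvalue, svalue, dups_key)
        else
          v = (dups_key == key) ? Array(dvalue) + Array(svalue) : Array(dvalue) | Array(svalue)
          # Unwrap a single value unless one of the inputs was already an Array.
          dvalue.is_a?(Array) || svalue.is_a?(Array) ? v : (v.size == 1 ? v.first : v)
        end
      else
        svalue
      end
  end
  dst
end

# Simplified stand-in for the filter's merge method, working on hashes only.
def merge_hashes(hashes, source, allow_duplicates = true)
  dups_key = allow_duplicates ? source : nil
  data = hashes.inject({}) { |result, h| event_hash_merge!(result, h, dups_key) }
  data[source] = Array(data[source]).join("\n")
  data
end

lines = [
  { "message" => "bar", "foo" => " match1" },
  { "message" => "bar", "foo" => " match1" },
  { "message" => "baz", "foo" => "nomatch1" }
]

kept = merge_hashes(lines, "foo")
p kept["foo"]      # " match1\n match1\nnomatch1"
p kept["message"]  # ["bar", "baz"]

deduped = merge_hashes(lines, "foo", false)
p deduped["foo"]   # " match1\nnomatch1"
```

The first result mirrors the "keep duplicates only on @source field" spec, the second the `allow_duplicates => false` spec: only the source field is exempt from deduplication, and only when `allow_duplicates` is true.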
data/logstash-filter-multiline.gemspec CHANGED
@@ -1,7 +1,7 @@
  Gem::Specification.new do |s|
  
  s.name = 'logstash-filter-multiline'
- s.version = '0.1.2'
+ s.version = '0.1.4'
  s.licenses = ['Apache License (2.0)']
  s.summary = "This filter will collapse multiline messages from a single source into one Logstash event."
  s.description = "This gem is a logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/plugin install gemname. This gem is not a stand-alone program"
@@ -87,6 +87,7 @@ describe LogStash::Filters::Multiline do
  end
  end
  
+
  describe "multiline add/remove tags and fields only when matched" do
  config <<-CONFIG
  filter {
@@ -108,9 +109,9 @@ describe LogStash::Filters::Multiline do
  insist { subject.size } == 2
  
  subject.each do |s|
- insist { s["tags"].include?("nope") } == false
- insist { s["tags"].include?("dummy") } == true
- insist { s.include?("dummy2") } == false
+ insist { s["tags"].include?("nope") } == true
+ insist { s["tags"].include?("dummy") } == false
+ insist { s.include?("dummy2") } == true
  end
  end
  end
@@ -121,7 +122,6 @@ describe LogStash::Filters::Multiline do
  multiline {
  pattern => "^\s"
  what => "next"
- add_tag => ["multi"]
  }
  }
  CONFIG
@@ -138,7 +138,6 @@ describe LogStash::Filters::Multiline do
  multiline {
  pattern => "^\s"
  what => "next"
- add_tag => ["multi"]
  }
  }
  CONFIG
@@ -150,4 +149,100 @@ describe LogStash::Filters::Multiline do
  insist { subject[1]["message"] } == " match2\nnomatch2"
  end
  end
+
+ describe "keep duplicates by default on message field" do
+ config <<-CONFIG
+ filter {
+ multiline {
+ pattern => "^\s"
+ what => "next"
+ }
+ }
+ CONFIG
+
+ sample [" match1", " match1", "nomatch1", " 1match2", " 2match2", " 1match2", "nomatch2"] do
+ expect(subject).to be_a(Array)
+ insist { subject.size } == 2
+ insist { subject[0]["message"] } == " match1\n match1\nnomatch1"
+ insist { subject[1]["message"] } == " 1match2\n 2match2\n 1match2\nnomatch2"
+ end
+ end
+
+ describe "remove duplicates using :allow_duplicates => false on message field" do
+ config <<-CONFIG
+ filter {
+ multiline {
+ allow_duplicates => false
+ pattern => "^\s"
+ what => "next"
+ }
+ }
+ CONFIG
+
+ sample [" match1", " match1", "nomatch1", " 1match2", " 2match2", " 1match2", "nomatch2"] do
+ expect(subject).to be_a(Array)
+ insist { subject.size } == 2
+ insist { subject[0]["message"] } == " match1\nnomatch1"
+ insist { subject[1]["message"] } == " 1match2\n 2match2\nnomatch2"
+ end
+ end
+
+ describe "keep duplicates only on @source field" do
+ config <<-CONFIG
+ filter {
+ multiline {
+ source => "foo"
+ pattern => "^\s"
+ what => "next"
+ }
+ }
+ CONFIG
+
+ sample [
+ {"message" => "bar", "foo" => " match1"},
+ {"message" => "bar", "foo" => " match1"},
+ {"message" => "baz", "foo" => "nomatch1"},
+ {"foo" => " 1match2"},
+ {"foo" => " 2match2"},
+ {"foo" => " 1match2"},
+ {"foo" => "nomatch2"}
+ ] do
+ expect(subject).to be_a(Array)
+ insist { subject.size } == 2
+ insist { subject[0]["foo"] } == " match1\n match1\nnomatch1"
+ insist { subject[0]["message"] } == ["bar", "baz"]
+ insist { subject[1]["foo"] } == " 1match2\n 2match2\n 1match2\nnomatch2"
+ end
+ end
+
+ describe "fix dropped duplicated lines" do
+ # as reported in https://github.com/logstash-plugins/logstash-filter-multiline/issues/3
+
+ config <<-CONFIG
+ filter {
+ multiline {
+ pattern => "^START"
+ what => "previous"
+ negate=> true
+ }
+ }
+ CONFIG
+
+ messages = [
+ "START",
+ "<Tag1 Id=\"1\">",
+ "<Tag2>Foo</Tag2>",
+ "</Tag1>",
+ "<Tag1 Id=\"2\">",
+ "<Tag2>Foo</Tag2>",
+ "</Tag1>",
+ "START",
+ ]
+ sample messages do
+ expect(subject).to be_a(Array)
+ insist { subject.size } == 2
+ insist { subject[0]["message"] } == messages[0..-2].join("\n")
+ end
+ end
+
  end
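Editor's note: the "fix dropped duplicated lines" spec can be followed by hand with a small simulation of the `what => "previous"` grouping. This is a sketch on plain strings, not the plugin's code path. `group_previous` is a hypothetical name, a Ruby `Regexp` stands in for the grok pattern, and the final flush is modelled by emitting whatever is still pending after the last line.

```ruby
# Group lines the way what => "previous" does: a matching line is appended to
# the pending group; a non-matching line closes that group and starts a new one.
def group_previous(lines, pattern, negate = false)
  groups = []
  pending = []
  lines.each do |line|
    match = !!(line =~ pattern)
    match = !match if negate
    if match
      # Continuation line: it belongs with what is already pending.
      pending << line
    else
      # New logical event: emit the finished group, then start over.
      groups << pending.join("\n") unless pending.empty?
      pending = [line]
    end
  end
  # Stand-in for the final flush of still-pending lines.
  groups << pending.join("\n") unless pending.empty?
  groups
end

messages = [
  "START",
  "<Tag1 Id=\"1\">",
  "<Tag2>Foo</Tag2>",
  "</Tag1>",
  "<Tag1 Id=\"2\">",
  "<Tag2>Foo</Tag2>",
  "</Tag1>",
  "START"
]

groups = group_previous(messages, /^START/, true)
puts groups.size                                  # 2
puts groups[0] == messages[0..-2].join("\n")      # true
```

Because the lines are simply appended and joined, the repeated `<Tag2>Foo</Tag2>` and `</Tag1>` lines survive, which is the behaviour the spec asserts for the default `allow_duplicates => true`.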
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-filter-multiline
  version: !ruby/object:Gem::Version
- version: 0.1.2
+ version: 0.1.4
  platform: ruby
  authors:
  - Elasticsearch
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-11-19 00:00:00.000000000 Z
+ date: 2015-01-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: logstash
@@ -95,6 +95,7 @@ files:
  - .gitignore
  - Gemfile
  - LICENSE
+ - README.md
  - Rakefile
  - lib/logstash/filters/multiline.rb
  - logstash-filter-multiline.gemspec
@@ -121,7 +122,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.4.4
+ rubygems_version: 2.1.9
  signing_key:
  specification_version: 4
  summary: This filter will collapse multiline messages from a single source into one Logstash event.