logstash-filter-dissect 1.0.8 → 1.0.9

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 8ab8fdb5cc4c1f091bb5e2b439497b40fd1bfef2
- data.tar.gz: 59eced571c856df8dd9fb4b22a8e4d0922b47d69
+ metadata.gz: 214e5686732a713c4f413999fe85595782d0746c
+ data.tar.gz: 528f4a1d7afa2d6ea0bb7d8c55d49ce3342af913
  SHA512:
- metadata.gz: d48240a786ab7f95ec12bbd1ab31e700cc101d17793f597263d28b01318f44cabce4d71428f8333a996152dce91dfa38683b9e7cb8c872dcb1a59c8dfeaf4cfe
- data.tar.gz: 69df41b6eff092e685b921fb0c98b10a3b63cc2ce45b0a2e0dbcd4325e0f211178326fe0c31a680339011e0431bc53ead783313fd3162f834f806b0c9b00aae6
+ metadata.gz: 57f4154b1bc1e3eb59622d33d9e08a706f2a603957b5910bde47bf8bbab23c0271890957ba302abec26b7b633a3ce60081144a0d1026072bbff59275ca1f7089
+ data.tar.gz: 87662b417269eb478b762f6376db447f329af697c1b7df129ee068426e1530accf331ab30d4e7023f3b69fef55a0d2e9a9f500b5aa45aed86ecd0dce320c7dce
data/CHANGELOG.md CHANGED
@@ -1,3 +1,7 @@
+ ## 1.0.9
+ - Docs: Fix doc generation error by removing illegal heading
+ - Add metrics to track the number of matches and failures
+
  ## 1.0.8
  - Add "vendor/jars" to require_paths in gemspec
 
data/Gemfile CHANGED
@@ -1,3 +1,11 @@
  source 'https://rubygems.org'
+
  gemspec
 
+ logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash"
+ use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1"
+
+ if Dir.exist?(logstash_path) && use_logstash_source
+   gem 'logstash-core', :path => "#{logstash_path}/logstash-core"
+   gem 'logstash-core-plugin-api', :path => "#{logstash_path}/logstash-core-plugin-api"
+ end
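The added Gemfile block only swaps in a local `logstash-core` when both the directory and the opt-in flag are present. A minimal standalone sketch of that same guard (the `/tmp` path is just a stand-in for an existing directory; in practice both variables come from the shell):

```ruby
# Sketch of the guard the new Gemfile lines use. Setting ENV here only
# illustrates the logic; normally these come from the invoking shell.
ENV["LOGSTASH_PATH"] = "/tmp"   # stand-in for a local Logstash checkout
ENV["LOGSTASH_SOURCE"] = "1"

logstash_path = ENV["LOGSTASH_PATH"] || "../../logstash"
use_logstash_source = ENV["LOGSTASH_SOURCE"] && ENV["LOGSTASH_SOURCE"].to_s == "1"

# Both the existing directory AND the opt-in flag are required:
puts(Dir.exist?(logstash_path) && use_logstash_source)
# => true when the directory exists and the flag is "1"
```

Invoked as, say, `LOGSTASH_SOURCE=1 LOGSTASH_PATH=/path/to/logstash bundle install` (paths hypothetical), Bundler resolves `logstash-core` from the checkout instead of rubygems.org.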
data/README.md CHANGED
@@ -1,3 +1,31 @@
+ ## Description
+
+ Dissect filter is an alternative to Grok filter and can be used to extract structured fields from an unstructured line.
+ However, if the structure of your text varies from line to line then Grok is more suitable. There is a hybrid case where Dissect can be used to de-structure the section of the line that is reliably repeated and then Grok can be used on the remaining field values with more regex predictability and less overall work to do.
+
+ A set of fields and delimiters is called a *dissection*.
+
+ The dissection is described using a set of `%{}` sections:
+ ....
+ %{a} - %{b} - %{c}
+ ....
+
+ A *field* is the text from `%` to `}` inclusive.
+
+ A *delimiter* is the text between `}` and `%` characters. Delimiters can't contain these `}{%` characters.
+
+ The config might look like this:
+
+ ```
+ filter {
+   dissect {
+     mapping => {
+       "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
+     }
+   }
+ }
+ ```
+
  ### NOTE
  Please read BUILD_INSTRUCTIONS.md
 
@@ -98,4 +126,4 @@ Programming is not a required skill. Whatever you've seen about open source and
 
  It is more important to the community that you are able to contribute.
 
- For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
+ For more information about contributing, see the [CONTRIBUTING](https://github.com/elastic/logstash/blob/master/CONTRIBUTING.md) file.
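The left-to-right dissection that the README's Description section introduces can be sketched in a few lines of plain Ruby. This toy version handles only bare `%{field}` patterns (no `+`, `?`, `&` modifiers) and is not the plugin's actual Java-backed implementation:

```ruby
# Toy dissection: split a pattern into field names and the literal
# delimiters between them, then peel the input apart left to right.
def toy_dissect(pattern, text)
  fields = pattern.scan(/%\{(.*?)\}/).flatten
  delims = pattern.split(/%\{.*?\}/).reject(&:empty?)
  result = {}
  rest = text
  fields.each_with_index do |field, i|
    if delims[i]
      value, _sep, rest = rest.partition(delims[i])
    else
      value = rest                               # last field keeps the remaining text
    end
    result[field] = value unless field.empty?    # %{} acts as a skip field
  end
  result
end

toy_dissect("%{ts} %{src} %{msg}", "12:01:02 host1 service started")
# => {"ts"=>"12:01:02", "src"=>"host1", "msg"=>"service started"}
```

Note how the final field captures everything that remains, which is exactly the "remaining text is stored in the last field" rule the real filter documents.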
data/VERSION CHANGED
@@ -1 +1 @@
- 1.0.8
+ 1.0.9
data/docs/index.asciidoc ADDED
@@ -0,0 +1,213 @@
+ :plugin: dissect
+ :type: filter
+
+ ///////////////////////////////////////////
+ START - GENERATED VARIABLES, DO NOT EDIT!
+ ///////////////////////////////////////////
+ :version: %VERSION%
+ :release_date: %RELEASE_DATE%
+ :changelog_url: %CHANGELOG_URL%
+ :include_path: ../../../../logstash/docs/include
+ ///////////////////////////////////////////
+ END - GENERATED VARIABLES, DO NOT EDIT!
+ ///////////////////////////////////////////
+
+ [id="plugins-{type}-{plugin}"]
+
+ === Dissect filter plugin
+
+ include::{include_path}/plugin_header.asciidoc[]
+
+ ==== Description
+
+ The Dissect filter is a kind of split operation. Unlike a regular split operation where one delimiter is applied to the whole string, this operation applies a set of delimiters to a string value. +
+ Dissect does not use regular expressions and is very fast. +
+ However, if the structure of your text varies from line to line then Grok is more suitable. +
+ There is a hybrid case where Dissect can be used to de-structure the section of the line that is reliably repeated and then Grok can be used on the remaining field values with more regex predictability and less overall work to do. +
+
+ A set of fields and delimiters is called a *dissection*.
+
+ The dissection is described using a set of `%{}` sections:
+ ....
+ %{a} - %{b} - %{c}
+ ....
+
+ A *field* is the text from `%` to `}` inclusive.
+
+ A *delimiter* is the text between `}` and `%` characters.
+
+ [NOTE]
+ Delimiters can't contain these `}{%` characters.
+
+ The config might look like this:
+ ....
+ filter {
+   dissect {
+     mapping => {
+       "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
+     }
+   }
+ }
+ ....
+ When dissecting a string from left to right, text is captured up to the first delimiter - this captured text is stored in the first field. This is repeated for each field/delimiter pair thereafter until the last delimiter is reached, then *the remaining text is stored in the last field*. +
+
+ *The Key:* +
+ The key is the text between the `%{` and `}`, exclusive of the ?, +, & prefixes and the ordinal suffix. +
+ `%{?aaa}` - key is `aaa` +
+ `%{+bbb/3}` - key is `bbb` +
+ `%{&ccc}` - key is `ccc` +
+
+ *Normal field notation:* +
+ The found value is added to the Event using the key. +
+ `%{some_field}` - a normal field has no prefix or suffix
+
+ *Skip field notation:* +
+ The found value is stored internally but not added to the Event. +
+ The key, if supplied, is prefixed with a `?`.
+
+ `%{}` is an empty skip field.
+
+ `%{?foo}` is a named skip field.
+
+ *Append field notation:* +
+ The value is appended to another value or stored if it's the first field seen. +
+ The key is prefixed with a `+`. +
+ The final value is stored in the Event using the key. +
+
+ [NOTE]
+ ====
+ The delimiter found before the field is appended with the value. +
+ If no delimiter is found before the field, a single space character is used.
+ ====
+
+ `%{+some_field}` is an append field. +
+ `%{+some_field/2}` is an append field with an order modifier.
+
+ An order modifier, `/digits`, allows one to reorder the append sequence. +
+ e.g. for a text of `1 2 3 go`, this `%{+a/2} %{+a/1} %{+a/4} %{+a/3}` will build a key/value of `a => 2 1 go 3` +
+ Append fields without an order modifier will append in declared order. +
+ e.g. for a text of `1 2 3 go`, this `%{a} %{b} %{+a}` will build two key/values of `a => 1 3 go, b => 2` +
+
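The reordering can be reproduced in a few lines of Ruby: pair each captured value with its `/digits` ordinal and reassemble, mirroring the `%{+a/2} %{+a/1} %{+a/4} %{+a/3}` example above (a simplified sketch, not the plugin's code):

```ruby
# Captures for "1 2 3 go" against %{+a/2} %{+a/1} %{+a/4} %{+a/3}:
# each value carries the ordinal from its position in the pattern.
captures = [["1", 2], ["2", 1], ["3", 4], ["go", 3]]

# Sort by ordinal, then join with the delimiter (a single space here).
a = captures.sort_by { |_value, ordinal| ordinal }
            .map { |value, _ordinal| value }
            .join(" ")
puts a   # => "2 1 go 3"
```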
+ *Indirect field notation:* +
+ The found value is added to the Event using the found value of another field as the key. +
+ The key is prefixed with a `&`. +
+ `%{&some_field}` - an indirect field where the key is indirectly sourced from the value of `some_field`. +
+ e.g. for a text of `error: some_error, some_description`, this `error: %{?err}, %{&err}` will build a key/value of `some_error => some_description`.
+
+ [NOTE]
+ For append and indirect fields, the key can refer to a field that already exists in the event before dissection.
+
+ [NOTE]
+ Use a Skip field if you do not want the indirection key/value stored.
+
+ e.g. for a text of `google: 77.98`, this `%{?a}: %{&a}` will build a key/value of `google => 77.98`.
+
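The `%{?a}: %{&a}` example can be mimicked directly in Ruby: the skip field's captured text becomes the key and the indirect field's captured text becomes the value (a sketch of the observable behavior, not the implementation):

```ruby
# "%{?a}: %{&a}" on "google: 77.98": the text captured by ?a is kept
# internally as the key, and the text captured by &a is stored under it.
text = "google: 77.98"
skip_value, indirect_value = text.split(": ", 2)
event = { skip_value => indirect_value }
puts event.inspect   # => {"google"=>"77.98"}
```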
+ [NOTE]
+ ===============================
+ Append and indirect cannot be combined and will fail validation. +
+ `%{+&something}` will add a value to the `&something` key, probably not the intended outcome. +
+ `%{&+something}` will add a value to the `+something` key, again probably unintended. +
+ ===============================
+
+ *Delimiter repetition:* +
+ In the source text if a field has variable width padded with delimiters, the padding will be ignored. +
+ e.g. for texts of:
+ ....
+ 00000043 ViewReceiver I
+ 000000b3 Peer        I
+ ....
+ with a dissection of `%{a} %{b} %{c}`; the padding is ignored, `event.get([c]) -> "I"`
+
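The padding behavior resembles Ruby's whitespace-collapsing `String#split(" ")`: a run of the repeated delimiter is consumed as a single separation. An analogy for the sample lines above, not the plugin's parser:

```ruby
# The " " delimiter between %{a}, %{b} and %{c} may repeat; the run of
# spaces padding each column is treated as one separation.
line = "00000043      ViewReceiver  I"
a, b, c = line.split(" ")   # split(" ") collapses whitespace runs
puts c   # => "I"
```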
+ [NOTE]
+ ====
+ You probably want to use this filter inside an `if` block. +
+ This ensures that the event contains a field value with a suitable structure for the dissection.
+ ====
+
+ For example...
+ ....
+ filter {
+   if [type] == "syslog" or "syslog" in [tags] {
+     dissect {
+       mapping => {
+         "message" => "%{ts} %{+ts} %{+ts} %{src} %{} %{prog}[%{pid}]: %{msg}"
+       }
+     }
+   }
+ }
+ ....
+
+ [id="plugins-{type}s-{plugin}-options"]
+ ==== Dissect Filter Configuration Options
+
+ This plugin supports the following configuration options plus the <<plugins-{type}s-{plugin}-common-options>> described later.
+
+ [cols="<,<,<",options="header",]
+ |=======================================================================
+ |Setting |Input type|Required
+ | <<plugins-{type}s-{plugin}-convert_datatype>> |<<hash,hash>>|No
+ | <<plugins-{type}s-{plugin}-mapping>> |<<hash,hash>>|No
+ | <<plugins-{type}s-{plugin}-tag_on_failure>> |<<array,array>>|No
+ |=======================================================================
+
+ Also see <<plugins-{type}s-{plugin}-common-options>> for a list of options supported by all
+ filter plugins.
+
+ &nbsp;
+
+ [id="plugins-{type}s-{plugin}-convert_datatype"]
+ ===== `convert_datatype`
+
+ * Value type is <<hash,hash>>
+ * Default value is `{}`
+
+ With this setting `int` and `float` datatype conversions can be specified. +
+ These will be done after all `mapping` dissections have taken place. +
+ Feel free to use this setting on its own without a `mapping` section. +
+
+ For example
+ [source, ruby]
+ filter {
+   dissect {
+     convert_datatype => {
+       cpu => "float"
+       code => "int"
+     }
+   }
+ }
+
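The conversion step amounts to applying the configured coercion to each named field after dissection. A hedged sketch of the idea; the `fields` hash and its names are illustrative, not plugin internals:

```ruby
# Post-dissection conversion, mirroring
# convert_datatype => { cpu => "float", code => "int" }.
conversions = { "cpu" => "float", "code" => "int" }
fields = { "cpu" => "97.3", "code" => "200", "host" => "web1" }

converted = fields.map do |name, value|
  case conversions[name]
  when "int"   then [name, Integer(value)]
  when "float" then [name, Float(value)]
  else              [name, value]          # unconfigured fields pass through
  end
end.to_h

puts converted.inspect   # => {"cpu"=>97.3, "code"=>200, "host"=>"web1"}
```

`Integer()`/`Float()` raise on malformed input, which is one reason the real filter tags failures rather than silently coercing.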
+ [id="plugins-{type}s-{plugin}-mapping"]
+ ===== `mapping`
+
+ * Value type is <<hash,hash>>
+ * Default value is `{}`
+
+ A hash of dissections of `field => value` +
+ A later dissection can be done on values from a previous dissection or they can be independent.
+
+ For example
+ [source, ruby]
+ filter {
+   dissect {
+     mapping => {
+       "message" => "%{field1} %{field2} %{description}"
+       "description" => "%{field3} %{field4} %{field5}"
+     }
+   }
+ }
+
+ This is useful if you want to keep the field `description` but also
+ dissect it some more.
+
+ [id="plugins-{type}s-{plugin}-tag_on_failure"]
+ ===== `tag_on_failure`
+
+ * Value type is <<array,array>>
+ * Default value is `["_dissectfailure"]`
+
+ Append values to the `tags` field when dissection fails
+
+
+
+ [id="plugins-{type}s-{plugin}-common-options"]
+ include::{include_path}/{type}.asciidoc[]
data/lib/jruby-dissect-library_jars.rb CHANGED
@@ -1,4 +1,4 @@
  # AUTOGENERATED BY THE GRADLE SCRIPT. DO NOT EDIT.
 
  require 'jar_dependencies'
- require_jar('org.logstash.dissect', 'jruby-dissect-library', '1.0.8')
+ require_jar('org.logstash.dissect', 'jruby-dissect-library', '1.0.9')
data/lib/logstash/filters/dissect.rb CHANGED
@@ -6,8 +6,6 @@ require "java"
  require "jruby-dissect-library_jars"
  require "jruby_dissector"
 
- # ==== *Dissect or how to de-structure text*
- #
  # The Dissect filter is a kind of split operation. Unlike a regular split operation where one delimiter is applied to the whole string, this operation applies a set of delimiters # to a string value. +
  # Dissect does not use regular expressions and is very fast. +
  # However, if the structure of your text varies from line to line then Grok is more suitable. +
@@ -80,7 +78,7 @@ require "jruby_dissector"
  # The found value is added to the Event using the found value of another field as the key. +
  # The key is prefixed with a `&`. +
  # `%{&some_field}` - an indirect field where the key is indirectly sourced from the value of `some_field`. +
- # e.g. for a text of `error: some_error, some_description`, this `error: %{?err}, %{&err}` will build a key/value of `some_error => description`.
+ # e.g. for a text of `error: some_error, some_description`, this `error: %{?err}, %{&err}` will build a key/value of `some_error => some_description`.
  #
  # [NOTE]
  # for append and indirect field the key can refer to a field that already exists in the event before dissection.
@@ -184,4 +182,8 @@ module LogStash module Filters class Dissect < LogStash::Filters::Base
  @dissector.dissect_multi(events, self)
  events
  end
+
+ def metric_increment(metric_name)
+   metric.increment(metric_name)
+ end
  end end end
data/logstash-filter-dissect.gemspec CHANGED
@@ -12,7 +12,7 @@ Gem::Specification.new do |s|
  s.require_paths = ["lib", "vendor/jars"]
 
  # Files
- s.files = Dir['lib/**/*','spec/**/*','vendor/**/*','*.gemspec','*.md','CONTRIBUTORS','Gemfile','VERSION','LICENSE','NOTICE.TXT']
+ s.files = Dir["lib/**/*","spec/**/*","*.gemspec","*.md","CONTRIBUTORS","Gemfile","LICENSE","NOTICE.TXT", "vendor/jar-dependencies/**/*.jar", "vendor/jar-dependencies/**/*.rb", "VERSION", "docs/**/*"]
  # Tests
  s.test_files = s.files.grep(%r{^(test|spec|features)/})
 
data/spec/filters/dissect_spec.rb CHANGED
@@ -222,4 +222,56 @@ describe LogStash::Filters::Dissect do
  end
  end
  end
+
+ describe "metrics tracking" do
+
+   let(:options) { { "mapping" => { "message" => "%{a} %{b}" } } }
+   subject { described_class.new(options) }
+
+   before(:each) { subject.register }
+
+   context "when match is successful" do
+     let(:event) { LogStash::Event.new("message" => "1 2") }
+
+     it "should increment the matches metric" do
+       expect(subject).to receive(:metric_increment).once.with(:matches)
+       subject.filter(event)
+     end
+   end
+
+   context "when match is not successful" do
+     let(:event) { LogStash::Event.new("message" => "") }
+
+     it "should increment the failures metric" do
+       expect(subject).to receive(:metric_increment).once.with(:failures)
+       subject.filter(event)
+     end
+   end
+ end
+
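These specs stub `metric_increment`; underneath, the plugin's `metric.increment` call behaves like a named counter that is bumped once per event. A toy stand-in showing the contract the specs assert, not Logstash's real metric API:

```ruby
# Minimal counter with the same increment surface the specs rely on.
class ToyMetric
  def initialize
    @counters = Hash.new(0)
  end

  def increment(name)
    @counters[name] += 1
  end

  def [](name)
    @counters[name]
  end
end

metric = ToyMetric.new
metric.increment(:matches)    # a successful dissection
metric.increment(:matches)
metric.increment(:failures)   # a failed dissection
puts metric[:matches]   # => 2
```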
+ describe "Basic dissection" do
+
+   let(:options) { { "mapping" => { "message" => "%{a} %{b}" } } }
+   subject { described_class.new(options) }
+   let(:event) { LogStash::Event.new(event_data) }
+
+   before(:each) do
+     subject.register
+     subject.filter(event)
+   end
+
+   context "when no field" do
+     let(:event_data) { {} }
+     it "should not add tags to the event" do
+       expect(event.get("tags")).to be_nil
+     end
+   end
+
+   context "when field is empty" do
+     let(:event_data) { { "message" => "" } }
+     it "should add tags to the event" do
+       expect(event.get("tags")).to include("_dissectfailure")
+     end
+   end
+ end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: logstash-filter-dissect
  version: !ruby/object:Gem::Version
-   version: 1.0.8
+   version: 1.0.9
  platform: ruby
  authors:
  - Elastic
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-10-22 00:00:00.000000000 Z
+ date: 2017-06-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    requirement: !ruby/object:Gem::Requirement
@@ -87,12 +87,12 @@ files:
  - NOTICE.TXT
  - README.md
  - VERSION
+ - docs/index.asciidoc
  - lib/jruby-dissect-library_jars.rb
  - lib/logstash/filters/dissect.rb
  - logstash-filter-dissect.gemspec
  - spec/filters/dissect_spec.rb
  - spec/spec_helper.rb
- - vendor/jars/org/logstash/dissect/jruby-dissect-library/1.0.8/jruby-dissect-library-1.0.8.jar
  homepage: http://www.elastic.co/guide/en/logstash/current/index.html
  licenses:
  - Apache License (2.0)
@@ -116,7 +116,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.6.6
+ rubygems_version: 2.4.8
  signing_key:
  specification_version: 4
  summary: This dissect filter will de-structure text into multiple fields.