fluent-plugin-burrow 1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 39340b1b558d45c2bc90b371d73d33dd0fa4d26c
4
+ data.tar.gz: ac0f5ce4759eba67450f13bb07a023f304becd69
5
+ SHA512:
6
+ metadata.gz: d547faabd129b91df6a46f5f9145933fd40f9c4cebdb460094992057368814ba3e5354e7c3a2132272bb043bad9ae7d5e61c1a128eff20dcd9ca7a70b346d5b7
7
+ data.tar.gz: cb4bdfff79da208322ffd2890ddb04c217ae2ddbd8821ad76959eb44d2ad71473834bc3acec18aaef088b11c1069add50ad7321ce59708e2d784060205042bf3
data/.gitignore ADDED
@@ -0,0 +1,5 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ vendor/*
data/.travis.yml ADDED
@@ -0,0 +1,6 @@
1
+ language: ruby
2
+
3
+ rvm:
4
+ - 2.1
5
+ - 2.0.0
6
+ - 1.9.3
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in fluent-plugin-burrow.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2014 Vanilla Forums
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,201 @@
1
+ fluent-plugin-burrow
2
+ ====================
3
+
4
+ This plugin for [Fluentd](http://fluentd.org) lets you extract a single key from an existing record and re-parse it with
5
+ a supplied format. A new event is then emitted, with the record modified by the now-decoded key's value.
6
+
7
+ ## Motivation
8
+
9
+ **out_burrow is designed to allow post-facto re-parsing of nested key elements.**
10
+
11
+ For example, let's say your source application writes to syslog, but instead of plain string messages, it writes JSON
12
+ encoded data. /var/log/syslog contains the following entry:
13
+
14
+ Jun 17 21:16:22 app1 5012162: {"event":"csrf_failure","msg":"Invalid transient key for System.","username":"System","userid":"1","ip":"192.34.93.74","method":"GET","domain":"http://timgunter.ca","path":"/dashboard/settings","tags":["csrf","failure"],"accountid":5009392,"siteid":5012162}
15
+
16
+ In td-agent.conf, you might have something like this to read this event:
17
+
18
+ ```
19
+ <source>
20
+ type syslog
21
+ port 5140
22
+ bind 127.0.0.1
23
+ tag raw.app.vanilla.events
24
+ </source>
25
+ ```
26
+
27
+ Unfortunately, in_syslog does not understand that the `message` field is encoded with JSON, so it escapes all the data
28
+ and makes it unusable down the line. If we piped these events to a file, we would see something like this:
29
+
30
+ ```
31
+ 2014-06-17T21:16:22Z raw.app.vanilla.events.local0.err {"host":"app1","ident":"5012162","message":"{\"event\":\"csrf_failure\",\"msg\":\"Invalid transient key for System.\",\"username\":\"System\",\"userid\":\"1\",\"ip\":\"192.34.93.74\",\"method\":\"GET\",\"domain\":\"http://timgunter.ca\",\"path\":\"/dashboard/settings\",\"tags\":[\"csrf\",\"failure\"],\"accountid\":5009392,\"siteid\":5012162}"}
32
+ ```
33
+
34
+ Note how the `message` field has been escaped. This means that when this event eventually makes its way to a file, or
35
+ another system (like elasticsearch for example), it will not be ready for consumption. That's where `out_burrow` comes in.
36
+
37
+ Adding the following `match` block to td-agent.conf allows us to intercept the raw syslog events and re-parse the
38
+ message field as JSON:
39
+
40
+ ```
41
+ <match raw.app.vanilla.events.**>
42
+ type burrow
43
+ key_name message
44
+ action inplace
45
+ remove_prefix raw
46
+ format json
47
+ </match>
48
+ ```
49
+
50
+ There are several components to this rule, but for now let's look at the output:
51
+
52
+ ```
53
+ 2014-06-17T21:16:23Z app.vanilla.events.local0.err {"host":"app1","ident":"5012162","message":{"event":"csrf_failure","msg":"Invalid transient key for System.","username":"System","userid":"1","ip":"192.34.93.74","method":"GET","domain":"http://timgunter.ca","path":"/dashboard/settings","tags":["csrf","failure"],"accountid":5009392,"siteid":5012162}}
54
+ ```
55
+
56
+ Now the JSON is no longer escaped, and can be easily parsed by both fluentd and elasticsearch.
57
+
58
+ ## Settings
59
+
60
+ ### key_name
61
+
62
+ `required`
63
+
64
+ This is the name of the key we want to examine and re-parse, and is required.
65
+
66
+ ### format
67
+
68
+ `required`
69
+
70
+ This is the format that Fluentd should expect the `key_name` field to be encoded with. out_burrow supports the same built-in
71
+ formats as Fluent::TextParser (and in_tail):
72
+
73
+ - apache
74
+ - apache2
75
+ - nginx
76
+ - syslog
77
+ - json
78
+ - csv
79
+ - tsv
80
+ - ltsv
81
+
82
+ ### tag
83
+
84
+ `optional`
85
+
86
+ When this event is re-emitted, change its tag to this setting's value.
87
+
88
+ ### remove_prefix
89
+
90
+ `optional`
91
+
92
+ When this event is re-emitted, remove this prefix from the source tag and use the resulting string as the new event's
93
+ tag. This setting automatically adds a trailing period `.` to its value before stripping.
94
+
95
+ ### add_prefix
96
+
97
+ `optional`
98
+
99
+ When this event is re-emitted, prepend this prefix to the source tag and use the resulting string as the new event's tag.
100
+ This setting automatically adds a trailing period `.` to its value before prepending.
101
+
102
+ **One of the `tag`, `remove_prefix`, or `add_prefix` settings is required.**
103
+ **`remove_prefix` and `add_prefix` can be used together, but neither can be combined with `tag`.**
104
+
105
+ ### action
106
+
107
+ `optional` and defaults to `inplace`
108
+
109
+ The value of this setting determines how the new event will be constructed. There are three distinct options here:
110
+
111
+ - inplace
112
+
113
+ Perform decoding 'in place'. When the `key_name` field is successfully parsed, its contents will be written back to its
114
+ original key in the original record, which will then be re-emitted.
115
+
116
+ - overlay
117
+
118
+ Overlay decoded data on top of original record, and re-emit. In our example above, if 'overlay' was used instead of
119
+ 'inplace', the resulting record would have been:
120
+
121
+ ```
122
+ {
123
+ "host":"app1",
124
+ "ident":"5012162",
125
+ "event":"csrf_failure",
126
+ "msg":"Invalid transient key for System.",
127
+ "username":"System",
128
+ "userid":"1",
129
+ "ip":"192.34.93.74",
130
+ "method":"GET",
131
+ "domain":"http://timgunter.ca",
132
+ "path":"/dashboard/settings",
133
+ "tags":["csrf","failure"],
134
+ "accountid":5009392,
135
+ "siteid":5012162
136
+ }
137
+ ```
138
+
139
+ - replace
140
+
141
+ Replace the original entirely with the contents of the decoded field. In our example above, if 'replace' was used
142
+ instead of 'inplace', the resulting record would have been:
143
+
144
+ ```
145
+ {
146
+ "event":"csrf_failure",
147
+ "msg":"Invalid transient key for System.",
148
+ "username":"System",
149
+ "userid":"1",
150
+ "ip":"192.34.93.74",
151
+ "method":"GET",
152
+ "domain":"http://timgunter.ca",
153
+ "path":"/dashboard/settings",
154
+ "tags":["csrf","failure"],
155
+ "accountid":5009392,
156
+ "siteid":5012162
157
+ }
158
+ ```
159
+
160
+ ### keep_key
161
+
162
+ `optional` and defaults to `false`
163
+
164
+ Keep original source key (only valid with 'overlay' and 'replace' actions). When this is `true`, the original encoded
165
+ source key is retained in the output.
166
+
167
+ ### keep_time
168
+
169
+ `optional` and defaults to `false`
170
+
171
+ Keep the original record's "time" key. If the original top level record contains a
172
+
173
+ ### record_time_key
174
+
175
+ `optional` and defaults to `time`
176
+
177
+ If `keep_time` is `true`, this field specifies the key that contains the original record's time. The value of this key
178
+ will be copied into the new record after it has been parsed.
179
+
180
+ ### time_key
181
+
182
+ `optional` and defaults to `time`
183
+
184
+ When the `key_name` field's value is being parsed, look for this key and interpret it as the record's `time` key.
185
+
186
+ ### time_format
187
+
188
+ `optional` and defaults to nil
189
+
190
+ When the `key_name` field's value is parsed and `time_key` is set, this field specifies the format that the `time`
191
+ value is expected to be in.
192
+
193
+ ## Contributing
194
+
195
+ 1. Fork it
196
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
197
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
198
+ 4. Push to the branch (`git push origin my-new-feature`)
199
+ 5. Create new Pull Request
200
+
201
+ If you have a question, [open an Issue](https://github.com/vanilla/fluent-plugin-burrow/issues).
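The three `action` modes described in the README can be illustrated outside Fluentd with plain hashes. The following is a minimal sketch, not the plugin's code: `burrow_record` is a hypothetical helper, JSON is assumed to be the only format, and the sample record is a shortened version of the README example.

```ruby
require 'json'

# Hypothetical helper (not part of the gem): applies one of the three
# README actions to a plain Hash, assuming the nested value is JSON.
def burrow_record(record, key_name, action, keep_key = false)
  decoded = JSON.parse(record[key_name])
  result =
    case action
    when 'inplace' then record.merge(key_name => decoded) # decoded value replaces the string
    when 'overlay' then record.merge(decoded)             # decoded keys land on top of the record
    when 'replace' then decoded.dup                       # decoded value becomes the whole record
    else raise ArgumentError, "unknown action: #{action}"
    end
  # Mirrors the plugin source: overlay and replace drop the source key
  # unless keep_key is set.
  result.delete(key_name) if action != 'inplace' && !keep_key
  result
end

record = {
  'host'    => 'app1',
  'ident'   => '5012162',
  'message' => '{"event":"csrf_failure","userid":"1"}'
}

inplace = burrow_record(record, 'message', 'inplace')
overlay = burrow_record(record, 'message', 'overlay')
replace = burrow_record(record, 'message', 'replace')
```

Here `inplace` keeps `host` and `ident` and turns `message` into a nested hash, `overlay` flattens the decoded keys next to `host` and `ident`, and `replace` keeps only the decoded keys.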
data/Rakefile ADDED
@@ -0,0 +1,9 @@
1
+ require "bundler/gem_tasks"
2
+ require "rake/testtask"
3
+ Rake::TestTask.new(:test) do |test|
4
+ test.libs << 'lib' << 'test'
5
+ test.pattern = 'test/**/test_*.rb'
6
+ test.verbose = true
7
+ end
8
+
9
+ task :default => :test
data/examples/syslog.json.txt ADDED
@@ -0,0 +1,5 @@
1
+ #
2
+ # This example demonstrates a method of extracting the "message" portion of an encoded syslog event and bringing the
3
+ # data in that portion into the main record scope.
4
+ #
5
+ #
data/fluent-plugin-burrow.gemspec ADDED
@@ -0,0 +1,21 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = "fluent-plugin-burrow"
6
+ s.description = "Extract a single key (in formats Fluent can natively understand) from an event and re-emit a new event that replaces the entire original record with that key's values."
7
+ s.version = "1.0"
8
+ s.license = "MIT"
9
+ s.authors = ["Tim Gunter"]
10
+ s.email = ["tim@vanillaforums.com"]
11
+ s.homepage = "https://github.com/vanilla/fluent-plugin-burrow"
12
+ s.summary = %q{Fluentd output filter plugin. Extract a single key (in formats Fluent can natively understand) from an event and re-emit a new event that replaces the entire original record with that key's values.}
13
+
14
+ s.files = `git ls-files`.split("\n")
15
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
16
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
17
+ s.require_paths = ["lib"]
18
+
19
+ s.add_development_dependency "rake"
20
+ s.add_runtime_dependency "fluentd"
21
+ end
data/lib/fluent/plugin/out_burrow.rb ADDED
@@ -0,0 +1,143 @@
1
+ # Burrow Output Plugin
2
+ # @author Tim Gunter <tim@vanillaforums.com>
3
+ #
4
+ # This plugin lets you extract a single key from an existing event and re-parse it with a given
5
+ # format, and then re-emit a new event with the key's value replaced, or with the whole record replaced.
6
+ #
7
+
8
+ class Fluent::BurrowPlugin < Fluent::Output
9
+ # Register type
10
+ Fluent::Plugin.register_output('burrow', self)
11
+
12
+ # Required
13
+ config_param :key_name, :string
14
+ config_param :format, :string
15
+
16
+ # Optional - tag format
17
+ config_param :tag, :string, :default => nil # Create a new tag for the re-emitted event
18
+ config_param :remove_prefix, :string, :default => nil # Remove a prefix from the existing tag
19
+ config_param :add_prefix, :string, :default => nil # Add a prefix to the existing tag
20
+
21
+ # Optional - record format
22
+ config_param :action, :string, :default => 'inplace' # The action to take once key parsing is complete
23
+ config_param :keep_key, :bool, :default => false # Keep original source key (only valid with 'overlay' and 'replace' actions)
24
+
25
+ # Optional - time format
26
+ config_param :keep_time, :bool, :default => false # Keep the original event's "time" key
27
+ config_param :record_time_key, :string, :default => 'time' # Allow a custom time field in the record
28
+ config_param :time_key, :string, :default => 'time' # Allow a custom time field in the sub-record
29
+ config_param :time_format, :string, :default => nil # Allow a custom time format for the new record
30
+
31
+ # Parse config hash
32
+ def configure(conf)
33
+ super
34
+
35
+ # One of 'tag', 'remove_prefix' or 'add_prefix' must be specified
36
+ if not @tag and not @remove_prefix and not @add_prefix
37
+ raise Fluent::ConfigError, "One of 'tag', 'remove_prefix' or 'add_prefix' must be specified"
38
+ end
39
+ if @tag and (@remove_prefix or @add_prefix)
40
+ raise Fluent::ConfigError, "Specifying both 'tag' and either 'remove_prefix' or 'add_prefix' is not supported"
41
+ end
42
+
43
+ # Prepare for tag modification if required
44
+ if @remove_prefix
45
+ @removed_prefix_string = @remove_prefix.chomp('.') + '.'
46
+ @removed_length = @removed_prefix_string.length
47
+ end
48
+ if @add_prefix
49
+ @added_prefix_string = @add_prefix.chomp('.') + '.'
50
+ end
51
+
52
+ # Validate action
53
+ actions = ['replace','overlay','inplace']
54
+ if not actions.include? @action
55
+ raise Fluent::ConfigError, "Invalid 'action', must be one of #{actions.join(',')}"
56
+ end
57
+
58
+ # Validate action-based restrictions
59
+ if @action == 'inplace' and @keep_key
60
+ raise Fluent::ConfigError, "Specifying 'keep_key' with action 'inplace' is not supported"
61
+ end
62
+
63
+ # Prepare fluent's built-in parser
64
+ @parser = Fluent::TextParser.new()
65
+ @parser.configure(conf)
66
+ end
67
+
68
+ # This method is called when starting.
69
+ def start
70
+ super
71
+ end
72
+
73
+ # This method is called when shutting down.
74
+ def shutdown
75
+ end
76
+
77
+ # This method is called when an event reaches Fluentd.
78
+ def emit(tag, es, chain)
79
+
80
+ # Figure out new event tag (either manually specified, or modified with add_prefix|remove_prefix)
81
+ if @tag
82
+ tag = @tag
83
+ else
84
+ if @remove_prefix and
85
+ ( (tag.start_with?(@removed_prefix_string) and tag.length > @removed_length) or tag == @remove_prefix)
86
+ tag = tag[@removed_length..-1]
87
+ end
88
+ if @add_prefix
89
+ if tag and tag.length > 0
90
+ tag = @added_prefix_string + tag
91
+ else
92
+ tag = @add_prefix
93
+ end
94
+ end
95
+ end
96
+
97
+ # Handle all currently available events in stream
98
+ es.each do |time,record|
99
+ # Extract raw key value
100
+ raw_value = record[@key_name]
101
+
102
+ # Remember original time key, or raw event time
103
+ raw_time = record[@record_time_key]
104
+
105
+ # Try to parse it according to 'format'
106
+ t,values = raw_value ? @parser.parse(raw_value) : [nil, nil]
107
+
108
+ # Fall back to the original record's time key if the parser did not return a time
109
+ t ||= raw_time
110
+
111
+ r = values
112
+
113
+ # Overlay new record on top of original record?
114
+ case @action
115
+ when 'inplace'
116
+ r = record.merge({@key_name => r})
117
+ when 'overlay'
118
+ r = record.merge(r) if r
119
+ when 'replace'
120
+ # noop
121
+ end
122
+
123
+ if ['overlay','replace'].include? @action
124
+ if not @keep_key
125
+ r.delete(@key_name) if r
126
+ end
127
+ end
128
+
129
+ # Preserve 'time' key?
130
+ if @keep_time
131
+ r[@record_time_key] = raw_time if r
132
+ end
133
+
134
+ # Emit event back to Fluent
135
+ if r
136
+ Fluent::Engine.emit(tag, t, r)
137
+ end
138
+ end
139
+
140
+ chain.next
141
+ end
142
+
143
+ end
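The tag handling in `emit` (strip `remove_prefix`, then prepend `add_prefix`, each normalized to end in a single `.`) can be tried in isolation. This is a simplified sketch under stated assumptions: `rewrite_tag` is a hypothetical standalone function, not a method of the plugin, and it maps a tag that exactly equals the prefix to an empty string, which is a simplification of what the plugin does in that case.

```ruby
# Hypothetical standalone version of the tag rewriting performed in emit.
# Simplification: a tag equal to the bare prefix becomes an empty string.
def rewrite_tag(tag, remove_prefix: nil, add_prefix: nil)
  if remove_prefix
    bare   = remove_prefix.chomp('.')
    prefix = bare + '.'            # normalize to exactly one trailing period
    if tag.start_with?(prefix) && tag.length > prefix.length
      tag = tag[prefix.length..-1] # drop "prefix." from the front
    elsif tag == bare
      tag = ''
    end
  end
  if add_prefix
    head = add_prefix.chomp('.')
    tag = tag.empty? ? head : "#{head}.#{tag}"
  end
  tag
end

stripped = rewrite_tag('raw.app.vanilla.events.local0.err', remove_prefix: 'raw')
swapped  = rewrite_tag('raw.app', remove_prefix: 'raw.', add_prefix: 'parsed')
```

Tags that do not start with the prefix pass through unchanged, so a single `match` block can safely receive mixed tags.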
data/test/helper.rb ADDED
@@ -0,0 +1,28 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+ require 'test/unit'
11
+
12
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
13
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
14
+ require 'fluent/test'
15
+ unless ENV.has_key?('VERBOSE')
16
+ nulllogger = Object.new
17
+ nulllogger.instance_eval {|obj|
18
+ def method_missing(method, *args)
19
+ # pass
20
+ end
21
+ }
22
+ $log = nulllogger
23
+ end
24
+
25
+ require 'fluent/plugin/out_burrow'
26
+
27
+ class Test::Unit::TestCase
28
+ end
metadata ADDED
@@ -0,0 +1,87 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: fluent-plugin-burrow
3
+ version: !ruby/object:Gem::Version
4
+ version: '1.0'
5
+ platform: ruby
6
+ authors:
7
+ - Tim Gunter
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2015-03-19 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: rake
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - '>='
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :development
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - '>='
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: fluentd
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - '>='
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - '>='
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ description: Extract a single key (in formats Fluent can natively understand) from
42
+ an event and re-emit a new event that replaces the entire original record with that
43
+ key's values.
44
+ email:
45
+ - tim@vanillaforums.com
46
+ executables: []
47
+ extensions: []
48
+ extra_rdoc_files: []
49
+ files:
50
+ - .gitignore
51
+ - .travis.yml
52
+ - Gemfile
53
+ - LICENSE
54
+ - README.md
55
+ - Rakefile
56
+ - examples/syslog.json.txt
57
+ - fluent-plugin-burrow.gemspec
58
+ - lib/fluent/plugin/out_burrow.rb
59
+ - test/helper.rb
60
+ homepage: https://github.com/vanilla/fluent-plugin-burrow
61
+ licenses:
62
+ - MIT
63
+ metadata: {}
64
+ post_install_message:
65
+ rdoc_options: []
66
+ require_paths:
67
+ - lib
68
+ required_ruby_version: !ruby/object:Gem::Requirement
69
+ requirements:
70
+ - - '>='
71
+ - !ruby/object:Gem::Version
72
+ version: '0'
73
+ required_rubygems_version: !ruby/object:Gem::Requirement
74
+ requirements:
75
+ - - '>='
76
+ - !ruby/object:Gem::Version
77
+ version: '0'
78
+ requirements: []
79
+ rubyforge_project:
80
+ rubygems_version: 2.0.14
81
+ signing_key:
82
+ specification_version: 4
83
+ summary: Fluentd output filter plugin. Extract a single key (in formats Fluent can
84
+ natively understand) from an event and re-emit a new event that replaces the entire
85
+ original record with that key's values.
86
+ test_files:
87
+ - test/helper.rb