fluent-plugin-norikra 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: aa64957f7aa640bcb15c8bf4ac747e01617ecac0
4
+ data.tar.gz: cc7f497f0fda85c0dd5b45fb70d5470d631d4317
5
+ SHA512:
6
+ metadata.gz: 0e9ecc0d5c4eda53f6426e82991dcea65c9152f03b1a7846ad202428a315b1dcc91502460f4b405e6a57bd543457d177b6b24c87441846302e118f31e67fa422
7
+ data.tar.gz: c81c72fba462086810c65f45dd2d29fd6fbb042d78185052ab0fd92fbec2b706f0a7f217c4b6aacd7f8e85ce4b2a1be1d19353c903b9ccb2ded4417de9eff85f
data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in fluent-plugin-norikra.gemspec
4
+ gemspec
data/LICENSE.txt ADDED
@@ -0,0 +1,13 @@
1
+ Copyright (c) 2013- TAGOMORI Satoshi
2
+
3
+ Licensed under the Apache License, Version 2.0 (the "License");
4
+ you may not use this file except in compliance with the License.
5
+ You may obtain a copy of the License at
6
+
7
+ http://www.apache.org/licenses/LICENSE-2.0
8
+
9
+ Unless required by applicable law or agreed to in writing, software
10
+ distributed under the License is distributed on an "AS IS" BASIS,
11
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12
+ See the License for the specific language governing permissions and
13
+ limitations under the License.
data/README.md ADDED
@@ -0,0 +1,179 @@
1
+ # fluent-plugin-norikra
2
+
3
+ Fluentd output plugin that sends events to a Norikra server and fetches events back from it, re-emitting them on the Fluentd network.
4
+
5
+ With NorikraOutput, we can:
6
+
7
+ * execute Norikra server as built-in process dynamically
8
+ * generate Norikra's target automatically with Fluentd's tags
9
+ * register queries automatically with Fluentd's tags and messages
10
+ * get all events on Norikra and emit on Fluentd network automatically
11
+
12
+ # Setup
13
+
14
+ First, install JRuby and Norikra on your host, unless you are using stand-alone Norikra servers.
15
+
16
+ 1. install latest jruby
17
+ * (rbenv) `rbenv install jruby-1.7.4`
18
+ * (rvm) `rvm install jruby-1.7.4`
19
+ * or other tools you want.
20
+ 2. switch to jruby, and install Norikra
21
+ * `gem install norikra`
22
+ 3. check norikra path
23
+ * `which norikra`
24
+ 4. switch to CRuby (with Fluentd), and install this plugin
25
+ * `gem install fluent-plugin-norikra` (or use `fluent-gem`)
26
+ 5. configure Fluentd, and execute.
27
+ * and write `path` configuration of `<server>` section (if you want)
28
+
29
+ # Configuration
30
+
31
+ For variations, see `example` directory.
32
+
33
+ ## NorikraOutput
34
+
35
+ The following configuration runs a built-in Norikra server, receives tags like `event.foo`, sends their records to the Norikra target `foo`, and counts those records per minute and per hour.
36
+
37
+ <match event.*>
38
+ type norikra
39
+ norikra localhost:26571 # this is default
40
+ <server>
41
+ execute yes
42
+ path /home/user/.rbenv/versions/jruby-1.7.4/bin/norikra
43
+ </server>
44
+
45
+ remove_tag_prefix event
46
+ target_map_tag yes
47
+
48
+ <default>
49
+ <query>
50
+ name count_min_${target}
51
+ expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minute)
52
+ tag count.min.${target}
53
+ </query>
54
+ <query>
55
+ name count_hour_${target}
56
+ expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 hour)
57
+ tag count.hour.${target}
58
+ </query>
59
+ </default>
60
+ </match>
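
The `${target}` placeholders in `name`, `expression` and `tag` are filled in once per target. The sketch below mirrors the plain string substitution done by `QueryGenerator.replace_target` in this gem; `expand_template` is a hypothetical name used only for illustration, and the target `foo` assumes the tag `event.foo` with `remove_tag_prefix event`.

```ruby
# Hypothetical helper: replaces every literal "${target}" with the target name,
# the same substitution QueryGenerator.replace_target performs.
def expand_template(template, target)
  template.gsub('${target}', target)
end

# Assumption: tag "event.foo" with `remove_tag_prefix event` yields target "foo".
target = 'foo'

name       = expand_template('count_min_${target}', target)
expression = expand_template('SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minute)', target)
tag        = expand_template('count.min.${target}', target)

puts name
puts expression
puts tag
```

Each new target therefore gets its own copy of every query in `<default>`, with a distinct query name and output tag.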
61
+
62
+ With the default settings, all fields are defined as 'string', so you must use `cast` for numerical processing in queries (for details, see the Norikra and Esper documentation).
63
+
64
+ If you know the types of some fields in your records, you can declare them. This plugin registers these field types before it sends records to the Norikra server.
65
+
66
+ <match event.*>
67
+ type norikra
68
+ norikra localhost:26571 # this is default
69
+ <server>
70
+ execute yes
71
+ path /home/user/.rbenv/versions/jruby-1.7.4/bin/norikra
72
+ </server>
73
+
74
+ remove_tag_prefix event
75
+ target_map_tag yes
76
+
77
+ <default>
78
+ field_int amount
79
+ field_long size
80
+ field_double price
81
+
82
+ <query>
83
+ name sales_${target}
84
+ expression SELECT price * amount FROM ${target}.win:time_batch(1 minute) WHERE size > 0
85
+ tag sales.min.${target}
86
+ </query>
87
+ </default>
88
+ </match>
89
+
90
+ Additional field definitions and query registrations should be written in `<target TARGET_NAME>` sections.
91
+
92
+ <default>
93
+ ... # for all of access logs
94
+ </default>
95
+ <target login>
96
+ field_string protocol # like 'oauth', 'openid', ...
97
+ field_int proto_num # integer means internal id of protocols
98
+ <query>
99
+ name protocol
100
+ expression SELECT protocol, count(*) AS cnt FROM ${target}.win:time_batch(1 hour) WHERE proto_num != 0 GROUP BY protocol
101
+ tag login.counts
102
+ </query>
103
+ </target>
104
+ <target other_action>
105
+ ...
106
+ </target>
107
+ # ...
108
+
109
+ ### Input event data filtering
110
+
111
+ If you want to send known fields only, specify `exclude *` and `include` or `include_regexp` like this:
112
+
113
+ <default>
114
+ exclude *
115
+ include path,status,method,bytes,rhost,referer,agent,duration
116
+ include_regexp ^(query_|header_).*
117
+
118
+ # ...
119
+ </default>
120
+
121
+ Or you can include all fields by default and exclude specific known fields:
122
+
123
+ <default>
124
+ include *
125
+ exclude user_secret
126
+ exclude_regexp ^(header_).*
127
+
128
+ # ...
129
+ </default>
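
The two policies can be sketched in plain Ruby as follows. This is a simplified restatement of what `RecordFilter#filter` does, not the plugin's code; `filter_record` and the sample record are hypothetical.

```ruby
# Hypothetical helper restating the two filtering policies.
# policy :exclude keeps only listed or matching fields (`exclude *` case);
# policy :include drops listed or matching fields (`include *` case).
def filter_record(record, policy, fields, pattern = nil)
  matched = lambda { |key| fields.include?(key) || (pattern && pattern.match(key)) }
  if policy == :exclude
    record.select { |key, _| matched.call(key) }
  else
    record.reject { |key, _| matched.call(key) }
  end
end

record = {
  'path' => '/', 'status' => '200',
  'header_host' => 'example.com', 'user_secret' => 'x'
}

# `exclude *` with `include path,status` and `include_regexp ^(query_|header_).*`
kept = filter_record(record, :exclude, %w[path status], /^(query_|header_).*/)

# `include *` with `exclude user_secret`
rest = filter_record(record, :include, %w[user_secret])

p kept.keys
p rest.keys
```

Both calls return a new hash and leave the original record untouched.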
130
+
131
+ NOTE: Filter settings in a `<target>` section override the corresponding settings in the `<default>` section.
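
A minimal sketch of that precedence, following the merge done by `ConfigSection#+` in this gem: filter parameters set in `<target>` replace the defaults, while field definitions (and queries) are concatenated. The parameter values below are made up for illustration.

```ruby
# Hypothetical filter parameters of <default> and of a <target> section.
default_params = { :include => '*', :include_regexp => nil,
                   :exclude => 'user_secret', :exclude_regexp => nil }
target_params  = { :include => nil, :include_regexp => nil,
                   :exclude => 'user_secret,password', :exclude_regexp => nil }

# Only values actually set in <target> replace the defaults.
overrides = target_params.reject { |_, value| value.nil? }
merged    = default_params.merge(overrides)

# Field definitions are appended rather than replaced.
default_ints = %w[status]
target_ints  = %w[proto_num]
merged_ints  = default_ints + target_ints

p merged
p merged_ints
```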
132
+
133
+ ### Target mapping
134
+
135
+ Norikra's target (like table name) can be generated from:
136
+
137
+ * tag
138
+ * one target per one tag
139
+ * `target_map_tag yes`
140
+ * value of specified field
141
+ * targets from values in specified field of record, dynamically
142
+ * `target_map_key foo`
143
+ * fixed string (in configuration file)
144
+ * all records are sent in single target
145
+ * `target_string from_fluentd`
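
The three options can be pictured as one lookup. The sketch below is an illustration only: the option names come from this README, but `resolve_target` is a hypothetical method, and the way `remove_tag_prefix` strips the prefix and the following dot is an assumption.

```ruby
# Hypothetical helper: the option names come from the README, the method does not.
def resolve_target(tag, record, conf)
  if conf[:target_map_tag]
    prefix = conf[:remove_tag_prefix]
    # Assumption: the prefix and the dot after it are stripped from the tag.
    if prefix && tag.start_with?(prefix + '.')
      tag[(prefix.length + 1)..-1]
    else
      tag
    end
  elsif conf[:target_map_key]
    # One target per distinct value of the named field.
    record[conf[:target_map_key]]
  else
    # Every record goes to the same fixed target.
    conf[:target_string]
  end
end

p resolve_target('event.foo', {}, { :target_map_tag => true, :remove_tag_prefix => 'event' })
p resolve_target('event.foo', { 'foo' => 'login' }, { :target_map_key => 'foo' })
p resolve_target('event.foo', {}, { :target_string => 'from_fluentd' })
```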
146
+
147
+ ### Event sweeping
148
+
149
+ A Norikra server also accepts queries and events from clients other than Fluentd. This plugin can fetch the output events of those queries as well.
150
+
151
+ To gather all events of Norikra server, including queries from outside of Fluentd configurations, write `<event>` section.
152
+
153
+ <events>
154
+ method sweep
155
+ tag query_name
156
+ # tag field FIELDNAME
157
+ # tag string FIXED_STRING
158
+ tag_prefix norikra.event # actual tag: norikra.event.QUERYNAME
159
+ sweep_interval 5s
160
+ </events>
161
+
162
+ NOTE: `sweep` fetches all events from Norikra, so other clients can no longer get those events. Take care if other clients read from the same server.
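
How the `tag` and `tag_prefix` settings might combine is sketched below. Only the `query_name` case with a prefix is documented above (`norikra.event.QUERYNAME`); applying the prefix the same way to the `field` and `string` modes is an assumption, and `event_tag` is a hypothetical helper, not part of the plugin.

```ruby
# Hypothetical helper: builds the Fluentd tag for one fetched event.
# Assumption: tag_prefix is joined with a dot in every mode.
def event_tag(mode, arg, prefix, query_name, event)
  base = case mode
         when 'query_name' then query_name       # tag query_name
         when 'field'      then event[arg].to_s  # tag field FIELDNAME
         when 'string'     then arg              # tag string FIXED_STRING
         else raise ArgumentError, "unknown tag mode: #{mode}"
         end
  prefix ? "#{prefix}.#{base}" : base
end

puts event_tag('query_name', nil, 'norikra.event', 'count_min_foo', {})
puts event_tag('field', 'target', 'count', 'count_foo', { 'target' => 'foo', 'cnt' => 3 })
```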
163
+
164
+ # FAQ
165
+
166
+ * TODO: write this section
167
+ * `fetch_interval`
168
+ * error logs for new target, success logs of retry
169
+
170
+ # TODO
171
+
172
+ * TODO: write this section
173
+
174
+ # Copyright
175
+
176
+ * Copyright (c) 2013- TAGOMORI Satoshi (tagomoris)
177
+ * License
178
+ * Apache License, version 2.0
179
+
data/Rakefile ADDED
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+
3
+ require 'rake/testtask'
4
+ Rake::TestTask.new(:test) do |test|
5
+ test.libs << 'lib' << 'test'
6
+ test.pattern = 'test/**/test_*.rb'
7
+ test.verbose = true
8
+ end
9
+
10
+ task :default => :test
@@ -0,0 +1,15 @@
1
+ <source>
2
+ type forward
3
+ </source>
4
+
5
+ <match event.*>
6
+ type norikra
7
+ norikra localhost:26571 # this is default
8
+ <server>
9
+ execute yes
10
+ path /Users/tagomoris/.rbenv/versions/jruby-1.7.3/bin/norikra
11
+ </server>
12
+
13
+ remove_tag_prefix event
14
+ target_map_tag yes
15
+ </match>
@@ -0,0 +1,68 @@
1
+ <match event.*>
2
+ type norikra
3
+ norikra localhost:26571
4
+
5
+ <server>
6
+ execute yes # (default)no
7
+ path /home/user/.rbenv/versions/jruby-1.7.4/bin/norikra
8
+ </server>
9
+
10
+ remove_tag_prefix event
11
+
12
+ target_map_tag yes
13
+ # or
14
+ # target_map_key KEYNAME
15
+ # or
16
+ # target_string TARGET_STRING
17
+
18
+ <default>
19
+ include *
20
+ exclude yyyymmdd,hhmmss
21
+ exclude_regexp f_.*
22
+ # OR
23
+ # exclude *
24
+ # include foo,bar,baz
25
+ # include_regexp status.*
26
+ field_boolean flag
27
+ field_int status
28
+ field_long duration,bytes
29
+
30
+ <query>
31
+ name pv_${target}
32
+ expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minutes) WHERE not flag
33
+ tag pv.${target}
34
+ fetch_interval 15s # default: time_batch period / 5, or 60s if the expression has no time_batch
35
+ # fetch_interval is ignored when <events> section specified
36
+ </query>
37
+ <query>
38
+ name errors_${target}
39
+ expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minutes) WHERE status >= 500
40
+ tag errors.${target}
41
+ fetch_interval 15s
42
+ </query>
43
+ </default>
44
+
45
+ <target search>
46
+ field_int display
47
+
48
+ <query>
49
+ name search_words
50
+ expression SELECT count(distinct query_search) AS cnt FROM ${target}.win:time_batch(1 minutes) WHERE query_search.length() > 0
51
+ tag search.words
52
+ </query>
53
+ <query>
54
+ name search_rate
55
+ expression SELECT count(*) AS cnt FROM ${target}.win:time_batch(1 minutes) WHERE query_search.length() > 0
56
+ tag search.rate
57
+ </query>
58
+ </target>
59
+
60
+ <events>
61
+ method sweep # listen(not implemented)
62
+ tag query_name
63
+ # tag field FIELDNAME
64
+ # tag string TAG_STRING
65
+ tag_prefix cep
66
+ sweep_interval 5s
67
+ </events>
68
+ </match>
@@ -0,0 +1,36 @@
1
+ <source>
2
+ type forward
3
+ </source>
4
+
5
+ <match test.*>
6
+ type norikra
7
+ norikra localhost:26571
8
+ <server>
9
+ execute yes
10
+ path /Users/tagomoris/.rbenv/versions/jruby-1.7.3/bin/norikra
11
+ </server>
12
+
13
+ remove_tag_prefix test
14
+ target_map_tag yes
15
+
16
+ <default>
17
+ <query>
18
+ name count_${target}
19
+ expression SELECT '${target}' as target,count(*) AS cnt FROM ${target}.win:time_batch(30 sec)
20
+ </query>
21
+ </default>
22
+ <event>
23
+ method sweep
24
+ tag field target
25
+ tag_prefix count
26
+ sweep_interval 5s
27
+ </event>
28
+ </match>
29
+
30
+ <match fluent.*>
31
+ type null
32
+ </match>
33
+
34
+ <match **>
35
+ type stdout
36
+ </match>
@@ -0,0 +1,23 @@
1
+ # coding: utf-8
2
+
3
+ Gem::Specification.new do |spec|
4
+ spec.name = "fluent-plugin-norikra"
5
+ spec.version = "0.0.1"
6
+ spec.authors = ["TAGOMORI Satoshi"]
7
+ spec.email = ["tagomoris@gmail.com"]
8
+ spec.description = %q{process events on fluentd with SQL like query, with built-in Norikra server if needed.}
9
+ spec.summary = %q{Fluentd plugin to do CEP with norikra}
10
+ spec.homepage = "https://github.com/tagomoris/fluent-plugin-norikra"
11
+ spec.license = "APLv2"
12
+
13
+ spec.files = `git ls-files`.split($/)
14
+ spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
15
+ spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
16
+ spec.require_paths = ["lib"]
17
+
18
+ spec.add_runtime_dependency "norikra-client", ">= 0.0.2"
19
+ spec.add_runtime_dependency "fluentd"
20
+
21
+ spec.add_development_dependency "bundler", "~> 1.3"
22
+ spec.add_development_dependency "rake"
23
+ end
@@ -0,0 +1,217 @@
1
+ class Fluent::NorikraOutput
2
+ class Query
3
+ attr_accessor :name, :expression, :tag, :interval
4
+
5
+ def initialize(name, expression, tag, interval)
6
+ @name = name
7
+ @expression = expression
8
+ @tag = tag
9
+ @interval = interval
10
+ end
11
+ end
12
+
13
+ class QueryGenerator
14
+ attr_reader :fetch_interval
15
+
16
+ def initialize(name_template, expression_template, tag_template, opts={})
17
+ @name_template = name_template || ''
18
+ @expression_template = expression_template || ''
19
+ @tag_template = tag_template || ''
20
+ if @name_template.empty? || @expression_template.empty?
21
+ raise Fluent::ConfigError, "query's name/expression must be specified"
22
+ end
23
+ @fetch_interval = case
24
+ when opts['fetch_interval']
25
+ Fluent::Config.time_value(opts['fetch_interval'])
26
+ when @expression_template =~ /\.win:time_batch\(([^\)]+)\)/
27
+ y,mon,w,d,h,m,s,msec = self.class.parse_time_period($1)
28
+ (h * 3600 + m * 60 + s) / 5
29
+ else
30
+ 60
31
+ end
32
+ end
33
+
34
+ def generate(target)
35
+ Fluent::NorikraOutput::Query.new(
36
+ self.class.replace_target(target, @name_template),
37
+ self.class.replace_target(target, @expression_template),
38
+ self.class.replace_target(target, @tag_template),
39
+ @fetch_interval
40
+ )
41
+ end
42
+
43
+ def self.replace_target(target, str)
44
+ str.gsub('${target}', target)
45
+ end
46
+
47
+ def self.parse_time_period(string)
48
+ #### http://esper.codehaus.org/esper-4.9.0/doc/reference/en-US/html/epl_clauses.html#epl-syntax-time-periods
49
+ # time-period : [year-part] [month-part] [week-part] [day-part] [hour-part] [minute-part] [seconds-part] [milliseconds-part]
50
+ # year-part : (number|variable_name) ("years" | "year")
51
+ # month-part : (number|variable_name) ("months" | "month")
52
+ # week-part : (number|variable_name) ("weeks" | "week")
53
+ # day-part : (number|variable_name) ("days" | "day")
54
+ # hour-part : (number|variable_name) ("hours" | "hour")
55
+ # minute-part : (number|variable_name) ("minutes" | "minute" | "min")
56
+ # seconds-part : (number|variable_name) ("seconds" | "second" | "sec")
57
+ # milliseconds-part : (number|variable_name) ("milliseconds" | "millisecond" | "msec")
58
+ m = /^\s*(\d+ years?)? ?(\d+ months?)? ?(\d+ weeks?)? ?(\d+ days?)? ?(\d+ hours?)? ?(\d+ (?:min|minute|minutes))? ?(\d+ (?:sec|second|seconds))? ?(\d+ (?:msec|millisecond|milliseconds))?/.match(string)
59
+ years = (m[1] || '').split(' ',2).first.to_i
60
+ months = (m[2] || '').split(' ',2).first.to_i
61
+ weeks = (m[3] || '').split(' ',2).first.to_i
62
+ days = (m[4] || '').split(' ',2).first.to_i
63
+ hours = (m[5] || '').split(' ',2).first.to_i
64
+ minutes = (m[6] || '').split(' ',2).first.to_i
65
+ seconds = (m[7] || '').split(' ',2).first.to_i
66
+ msecs = (m[8] || '').split(' ',2).first.to_i
67
+ return [years, months, weeks, days, hours, minutes, seconds, msecs]
68
+ end
69
+ end
70
+
71
+ class RecordFilter
72
+ attr_reader :default_policy, :include_fields, :include_regexp, :exclude_fields, :exclude_regexp
73
+
74
+ def initialize(include='', include_regexp='', exclude='', exclude_regexp='')
75
+ include ||= ''
76
+ include_regexp ||= ''
77
+ exclude ||= ''
78
+ exclude_regexp ||= ''
79
+
80
+ @default_policy = nil
81
+ if include == '*' && exclude == '*'
82
+ raise Fluent::ConfigError, "invalid configuration, both of 'include' and 'exclude' are '*'"
83
+ end
84
+ if include.empty? && include_regexp.empty? && exclude.empty? && exclude_regexp.empty? # assuming "include *"
85
+ @default_policy = :include
86
+ elsif exclude.empty? && exclude_regexp.empty? || exclude == '*' # assuming "exclude *"
87
+ @default_policy = :exclude
88
+ elsif include.empty? && include_regexp.empty? || include == '*' # assuming "include *"
89
+ @default_policy = :include
90
+ else
91
+ raise Fluent::ConfigError, "unknown default policy. specify 'include *' or 'exclude *'"
92
+ end
93
+
94
+ @include_fields = nil
95
+ @include_regexp = nil
96
+ @exclude_fields = nil
97
+ @exclude_regexp = nil
98
+
99
+ if @default_policy == :exclude
100
+ @include_fields = include.split(',')
101
+ @include_regexp = Regexp.new(include_regexp) unless include_regexp.empty?
102
+ if @include_fields.empty? && @include_regexp.nil?
103
+ raise Fluent::ConfigError, "no fields specified. specify 'include' or 'include_regexp'"
104
+ end
105
+ else
106
+ @exclude_fields = exclude.split(',')
107
+ @exclude_regexp = Regexp.new(exclude_regexp) unless exclude_regexp.empty?
108
+ end
109
+ end
110
+
111
+ def filter(record)
112
+ if @default_policy == :include
113
+ if @exclude_fields.empty? && @exclude_regexp.nil?
114
+ record
115
+ else
116
+ record = record.dup
117
+ record.keys.each do |f|
118
+ record.delete(f) if @exclude_fields.include?(f) || @exclude_regexp && @exclude_regexp.match(f)
119
+ end
120
+ record
121
+ end
122
+ else # default policy exclude
123
+ data = {}
124
+ record.keys.each do |f|
125
+ data[f] = record[f] if @include_fields.include?(f) || @include_regexp && @include_regexp.match(f)
126
+ end
127
+ data
128
+ end
129
+ end
130
+ end
131
+
132
+ class ConfigSection
133
+ attr_accessor :target, :filter_params, :field_definitions, :query_generators
134
+
135
+ def initialize(section)
136
+ @target = case section.name
137
+ when 'default'
138
+ nil
139
+ when 'target'
140
+ section.arg
141
+ else
142
+ raise ArgumentError, "invalid section for this class, #{section.name}: ConfigSection"
143
+ end
144
+ @filter_params = {
145
+ :include => section['include'],
146
+ :include_regexp => section['include_regexp'],
147
+ :exclude => section['exclude'],
148
+ :exclude_regexp => section['exclude_regexp']
149
+ }
150
+ @field_definitions = {
151
+ :string => (section['field_string'] || '').split(','),
152
+ :boolean => (section['field_boolean'] || '').split(','),
153
+ :int => (section['field_int'] || '').split(','),
154
+ :long => (section['field_long'] || '').split(','),
155
+ :float => (section['field_float'] || '').split(','),
156
+ :double => (section['field_double'] || '').split(',')
157
+ }
158
+ @query_generators = []
159
+ section.elements.each do |element|
160
+ if element.name == 'query'
161
+ opt = {}
162
+ if element.has_key?('fetch_interval')
163
+ opt['fetch_interval'] = element['fetch_interval'] # keep the unit suffix (e.g. "5m") so Fluent::Config.time_value can parse it
164
+ end
165
+ @query_generators.push(QueryGenerator.new(element['name'], element['expression'], element['tag'], opt))
166
+ end
167
+ end
168
+ end
169
+
170
+ def +(other)
171
+ if other.nil?
172
+ other = self.class.new(Fluent::Config::Element.new('target', 'dummy', {}, []))
173
+ end
174
+ r = self.class.new(Fluent::Config::Element.new('target', (other.target ? other.target : self.target), {}, []))
175
+ others_filter = {}
176
+ other.filter_params.keys.each do |k|
177
+ others_filter[k] = other.filter_params[k] if other.filter_params[k]
178
+ end
179
+ r.filter_params = self.filter_params.merge(others_filter)
180
+ r.field_definitions = {
181
+ :string => self.field_definitions[:string] + other.field_definitions[:string],
182
+ :boolean => self.field_definitions[:boolean] + other.field_definitions[:boolean],
183
+ :int => self.field_definitions[:int] + other.field_definitions[:int],
184
+ :long => self.field_definitions[:long] + other.field_definitions[:long],
185
+ :float => self.field_definitions[:float] + other.field_definitions[:float],
186
+ :double => self.field_definitions[:double] + other.field_definitions[:double]
187
+ }
188
+ r.query_generators = self.query_generators + other.query_generators
189
+ r
190
+ end
191
+ end
192
+
193
+ class Target
194
+ attr_accessor :name, :fields, :queries
195
+
196
+ def initialize(target, config)
197
+ @name = target
198
+ @filter = RecordFilter.new(*([:include, :include_regexp, :exclude, :exclude_regexp].map{|s| config.filter_params[s]}))
199
+ @fields = config.field_definitions
200
+ @queries = config.query_generators.map{|g| g.generate(target)}
201
+ end
202
+
203
+ def filter(record)
204
+ @filter.filter(record)
205
+ end
206
+
207
+ def reserve_fields
208
+ f = {}
209
+ @fields.keys.each do |type_sym|
210
+ @fields[type_sym].each do |fieldname|
211
+ f[fieldname] = type_sym.to_s
212
+ end
213
+ end
214
+ f
215
+ end
216
+ end
217
+ end
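
When a `<query>` has no `fetch_interval`, `QueryGenerator` derives one from the `time_batch` window in the expression: the hour, minute and second parts are summed and divided by 5, with 60 seconds as the fallback. The stand-alone sketch below reproduces that arithmetic with a simplified parser that only looks at those three parts; `default_fetch_interval` is a hypothetical name, not a method of the plugin.

```ruby
# Hypothetical, simplified version of the default fetch interval calculation.
# Only hour/minute/second parts of the time period are considered.
def default_fetch_interval(expression)
  # No time_batch window in the expression: fall back to 60 seconds.
  return 60 unless expression =~ /\.win:time_batch\(([^\)]+)\)/
  period = $1
  hours   = period[/(\d+) hours?/, 1].to_i
  minutes = period[/(\d+) (?:minutes?|min)/, 1].to_i
  seconds = period[/(\d+) (?:seconds?|sec)/, 1].to_i
  # Integer division, as in QueryGenerator#initialize.
  (hours * 3600 + minutes * 60 + seconds) / 5
end

p default_fetch_interval('SELECT count(*) AS cnt FROM foo.win:time_batch(1 minute)')
p default_fetch_interval('SELECT count(*) AS cnt FROM foo.win:time_batch(1 hour)')
p default_fetch_interval('SELECT count(*) AS cnt FROM foo')
```

Because of the integer division, windows shorter than 5 seconds yield an interval of 0, so an explicit `fetch_interval` is the safer choice for very short windows.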