fluent-plugin-rewrite-tag-filter 1.1.1 → 1.2.0

@@ -0,0 +1,162 @@
+ # fluent-plugin-rewrite-tag-filter
+
+ ## Overview
+
+ ### RewriteTagFilterOutput
+
+ Fluentd Output filter plugin. It is designed to rewrite tags like mod_rewrite.
+ It re-emits a record with the rewritten tag when a value matches the regular expression.
+ You can also change a tag from an Apache log by domain, status code (e.g. 500 error),
+ user-agent, request-uri, regex-backreference and so on with regular expressions.
+
+ ## Installation
+
+ ```
+ # native gem
+ gem install fluent-plugin-rewrite-tag-filter
+
+ # td-agent gem
+ /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-rewrite-tag-filter
+ ```
+
+ ## Configuration
+
+ ### Syntax
+
+ ```
+ rewriterule<num:1-200> <attribute> <regex_pattern> <new_tag>
+
+ # Optional: Capitalize every matched regex backreference. (ex: $1, $2)
+ capitalize_regex_backreference <yes/no> (default no)
+ ```
+
+ ### Usage
+
+ This sample excludes some static file logs before splitting the tag by domain.
+
+ ```
+ <source>
+ type tail
+ path /var/log/httpd/access_log
+ format /^(?<domain>[^ ]*) (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<status>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<response_time>[^ ]*))?$/
+ time_format %d/%b/%Y:%H:%M:%S %z
+ tag td.apache.access
+ pos_file /var/log/td-agent/apache_access.pos
+ </source>
+
+ # In rewriterule4, "site.$2$1" becomes "site.ExampleMail" because of the capitalize_regex_backreference option.
+ <match td.apache.access>
+ type rewrite_tag_filter
+ capitalize_regex_backreference yes
+ rewriterule1 path \.(gif|jpe?g|png|pdf|zip)$ clear
+ rewriterule2 domain ^maps\.example\.com$ site.ExampleMaps
+ rewriterule3 domain ^news\.example\.com$ site.ExampleNews
+ rewriterule4 domain ^(mail)\.(example)\.com$ site.$2$1
+ rewriterule5 domain .+ site.unmatched
+ </match>
+
+ <match site.*>
+ type mongo
+ host localhost
+ database apache_access
+ remove_tag_prefix site
+ tag_mapped
+ capped
+ capped_size 100m
+ </match>
+
+ <match clear>
+ type null
+ </match>
+ ```
+
+ ### Result
+
+ ```
+ $ mongo
+ MongoDB shell version: 2.2.0
+ > use apache_access
+ switched to db apache_access
+ > show collections
+ ExampleMaps
+ ExampleNews
+ ExampleMail
+ unmatched
+ ```
+
+ ### Debug
+
+ When td-agent starts, the plugin logs each rule it registers, as shown below.
+
+ ```
+ $ tailf /var/log/td-agent/td-agent.log
+ 2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]
+ ```
+
+ ### Tag placeholder
+
+ The following placeholders are supported in new_tag (the rewritten tag).
+ They can be combined with the `remove_tag_prefix` option,
+ which removes the given prefix from the tag substituted for `${tag}` or `__TAG__`.
+
+ - `${hostname}`
+ - `__HOSTNAME__`
+ - `${tag}`
+ - `__TAG__`
+
+ This sample rewrites a tag using placeholders.
+
+ ```
+ # The result is "rewrited.access.ExampleMail"
+ <match apache.access>
+ type rewrite_tag_filter
+ rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.${tag}.$2$1
+ remove_tag_prefix apache
+ </match>
+
+ # The result is "rewrited.ExampleMail.app30-124.foo.com" when the hostname is "app30-124.foo.com"
+ <match apache.access>
+ type rewrite_tag_filter
+ rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.$2$1.${hostname}
+ </match>
+ ```
+
+ ## Example
+
+ - Example1: how to analyze response_time, response_code and user_agent for each virtual domain website.
+ https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example.conf
+
+ - Example2: how to exclude specified patterns before analyzing response_time for each virtual domain website.
+ https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example2.conf
+
+ ## Related Articles
+
+ - 自在にタグを書き換える fluent-plugin-rewrite-tag-filter でログ解析が捗るお話 #fluentd
+ http://d.hatena.ne.jp/yoshi-ken/20120701/1341137269
+
+ - Fluentd & TreasureDataで こっそり始めるログ集計 Fluentd Meetup #2 @mikeda
+ http://www.slideshare.net/baguzy/fluentd-meetup-2-14073930
+
+ - 似てる #fluentd プラグインの比較
+ http://matsumana.wordpress.com/2012/11/15/%E4%BC%BC%E3%81%A6%E3%82%8B-fluentd-%E3%83%97%E3%83%A9%E3%82%B0%E3%82%A4%E3%83%B3%E3%81%AE%E6%AF%94%E8%BC%83/
+
+ - Fluentdの集約サーバ用設定ファイル (fluent-plugin-rewrite-tag-filter版)
+ https://gist.github.com/matsumana/4078096
+
+ - 稼働中のFluentdにflowcounter pluginを導入してみた
+ http://dayafterneet.blogspot.jp/2012/12/fluentdflowcounter-plugin.html
+
+ ## TODO
+
+ Pull requests are very welcome!!
+
+ ## Copyright
+
+ Copyright : Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
+ License : Apache License, Version 2.0
+
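The rule syntax and the `capitalize_regex_backreference` option documented in the README hunk above can be tried with a small stand-alone Ruby sketch. It is not the plugin's implementation: `Rule`, `parse_rule` and `rewrite_tag` are hypothetical names. It assumes that rules are tried in order, that the first match wins, and that `$1`, `$2`, ... in the new tag are replaced by the matched groups.

```ruby
# Hypothetical stand-alone sketch; NOT the plugin's actual code.
Rule = Struct.new(:key, :regex, :new_tag)

# Parse the part of a rule that follows "rewriteruleN":
# "<attribute> <regex_pattern> <new_tag>".
def parse_rule(line)
  key, pattern, new_tag = line.split(/\s+/, 3)
  Rule.new(key, Regexp.new(pattern), new_tag)
end

# Return the rewritten tag for the first matching rule, or nil if none match.
# Assumption: rules are evaluated in order and the first match wins.
def rewrite_tag(rules, record, capitalize: false)
  rules.each do |rule|
    value = record[rule.key]
    next if value.nil?

    match = rule.regex.match(value.to_s)
    next unless match

    # Replace each "$N" with the N-th captured group, optionally capitalized.
    return rule.new_tag.gsub(/\$(\d+)/) do |ref|
      captured = match[ref[1..-1].to_i].to_s
      capitalize ? captured.capitalize : captured
    end
  end
  nil
end

rules = [
  'path \.(gif|jpe?g|png|pdf|zip)$ clear',
  'domain ^(mail)\.(example)\.com$ site.$2$1',
  'domain .+ site.unmatched'
].map { |line| parse_rule(line) }

mail = { 'domain' => 'mail.example.com', 'path' => '/index.html' }
logo = { 'domain' => 'maps.example.com', 'path' => '/logo.png' }

rewrite_tag(rules, mail, capitalize: true) # => "site.ExampleMail"
rewrite_tag(rules, logo, capitalize: true) # => "clear"
```

With `capitalize: false` the first call would return `site.examplemail`, which is why the README sample enables the option before mapping tags to MongoDB collection names.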
@@ -3,18 +3,17 @@ $:.push File.expand_path("../lib", __FILE__)
 
  Gem::Specification.new do |s|
  s.name = "fluent-plugin-rewrite-tag-filter"
- s.version = "1.1.1"
+ s.version = "1.2.0"
  s.authors = ["Kentaro Yoshida"]
  s.email = ["y.ken.studio@gmail.com"]
  s.homepage = "https://github.com/y-ken/fluent-plugin-rewrite-tag-filter"
- s.summary = %q{Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite. Re-emmit a record with rewrited tag when a value matches with the regular expression. Also you can change a tag with a values from apache log by domain, status-code(ex. 500 error), user-agent, request-uri, regex-backreference and so on with regular expression.}
+ s.summary = %q{Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite. Re-emmit a record with rewrited tag when a value matches with the regular expression. Also you can change a tag from apache log by domain, status-code(ex. 500 error), user-agent, request-uri, regex-backreference and so on with regular expression.}
 
  s.files = `git ls-files`.split("\n")
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
  s.require_paths = ["lib"]
 
- # specify any dependencies here; for example:
- s.add_development_dependency "fluentd"
+ s.add_development_dependency "rake"
  s.add_runtime_dependency "fluentd"
  end
@@ -8,6 +8,7 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
  config_param ('rewriterule' + i.to_s).to_sym, :string, :default => nil # NAME REGEXP
  end
  config_param :capitalize_regex_backreference, :bool, :default => false
+ config_param :remove_tag_prefix, :string, :default => nil
 
  def configure(conf)
  super
@@ -37,6 +38,10 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
  unless @rewriterules.length == rewriterule_names.uniq.length
  raise Fluent::ConfigError, "duplicated rewriterules found #{@rewriterules.inspect}"
  end
+
+ unless conf['remove_tag_prefix'].nil?
+ @remove_tag_prefix = Regexp.new("^#{Regexp.escape(remove_tag_prefix)}\.?")
+ end
  end
 
  def emit(tag, es, chain)
@@ -68,17 +73,15 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
  end
 
  def map_regex_table(elements)
- # TODO: use each.with_index http://shirusu-ni-tarazu.hatenablog.jp/entry/2012/11/04/173513
  hash_table = Hash.new
- index = 1
- elements.each do |value|
+ elements.each.with_index(1) do |value, index|
  hash_table["$#{index}"] = @capitalize_regex_backreference ? value.capitalize : value
- index += 1
  end
  return hash_table
  end
 
  def get_placeholder(tag)
+ tag = tag.sub(@remove_tag_prefix, '') if @remove_tag_prefix
  return {
  '__HOSTNAME__' => @hostname,
  '${hostname}' => @hostname,
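The `remove_tag_prefix` handling added in this hunk can be tried in isolation with the following Ruby sketch. It is an illustration only: `prefix_regex` and `expand_placeholders` are hypothetical helpers, not methods of the plugin, and the hostname is passed in explicitly instead of being looked up. The sketch writes the optional dot as `\\.` inside the double-quoted string so that the resulting pattern contains a literal `\.`.

```ruby
# Hypothetical stand-alone sketch; NOT the plugin's actual code.

# Build a regex that matches the prefix at the start of a tag,
# plus one optional literal dot after it.
def prefix_regex(prefix)
  Regexp.new("^#{Regexp.escape(prefix)}\\.?")
end

# Expand ${tag}, __TAG__, ${hostname} and __HOSTNAME__ in new_tag.
# When remove_tag_prefix is given, it is stripped from the tag first.
def expand_placeholders(new_tag, tag, hostname, remove_tag_prefix = nil)
  tag = tag.sub(prefix_regex(remove_tag_prefix), '') if remove_tag_prefix
  placeholders = {
    '__HOSTNAME__' => hostname,
    '${hostname}'  => hostname,
    '__TAG__'      => tag,
    '${tag}'       => tag
  }
  # Unknown placeholders are left untouched.
  new_tag.gsub(/\$\{[a-z]+\}|__[A-Z]+__/) { |key| placeholders.fetch(key, key) }
end

host = 'app30-124.foo.com'

expand_placeholders('rewrited.${tag}.ExampleMail', 'apache.access', host, 'apache')
# => "rewrited.access.ExampleMail"

expand_placeholders('rewrited.${tag}.ExampleMail', 'apache.access', host)
# => "rewrited.apache.access.ExampleMail"

expand_placeholders('${tag}', 'input.access', host, 'input')
# => "access"
```

The last call mirrors the configuration used by the `remove_tag_prefix` test in this release: a record tagged `input.access` is re-emitted as `access`.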
@@ -24,6 +24,12 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
  rewriterule4 agent "(Googlebot|CustomBot)-([a-zA-Z]+)" agent.$1-$2
  ]
 
+ # remove_tag_prefix test
+ CONFIG3 = %[
+ rewriterule1 domain ^www\.google\.com$ ${tag}
+ remove_tag_prefix input
+ ]
+
  def create_driver(conf=CONFIG,tag='test')
  Fluent::Test::OutputTestDriver.new(Fluent::RewriteTagFilterOutput, tag).configure(conf)
  end
@@ -43,7 +49,6 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
 
  def test_emit
  d1 = create_driver(CONFIG, 'input.access')
- time = Time.parse("2012-01-02 13:14:15").to_i
  d1.run do
  d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
  d1.emit({'domain' => 'news.google.com', 'path' => '/', 'agent' => 'Googlebot-Mobile', 'response_time' => 900000})
@@ -68,7 +73,6 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
 
  def test_emit2
  d1 = create_driver(CONFIG2, 'input.access')
- time = Time.parse("2012-01-02 13:14:15").to_i
  d1.run do
  d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
  d1.emit({'domain' => 'news.google.com', 'path' => '/', 'agent' => 'Googlebot-Mobile', 'response_time' => 900000})
@@ -87,5 +91,16 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
  p emits[3]
  assert_equal 'agent.Googlebot-Foobar', emits[3][0] #tag
  end
+
+ def test_emit3
+ d1 = create_driver(CONFIG3, 'input.access')
+ d1.run do
+ d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
+ end
+ emits = d1.emits
+ assert_equal 1, emits.length
+ p emits[0]
+ assert_equal 'access', emits[0][0] # tag
+ end
  end
 
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-rewrite-tag-filter
  version: !ruby/object:Gem::Version
- version: 1.1.1
+ version: 1.2.0
  prerelease:
  platform: ruby
  authors:
@@ -9,10 +9,10 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-02-13 00:00:00.000000000 Z
+ date: 2013-05-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
- name: fluentd
+ name: rake
  requirement: !ruby/object:Gem::Requirement
  none: false
  requirements:
@@ -53,7 +53,7 @@ files:
  - .gitignore
  - Gemfile
  - LICENSE.txt
- - README.rdoc
+ - README.md
  - Rakefile
  - example.conf
  - example2.conf
@@ -86,9 +86,8 @@ signing_key:
  specification_version: 3
  summary: Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite.
  Re-emmit a record with rewrited tag when a value matches with the regular expression.
- Also you can change a tag with a values from apache log by domain, status-code(ex.
- 500 error), user-agent, request-uri, regex-backreference and so on with regular
- expression.
+ Also you can change a tag from apache log by domain, status-code(ex. 500 error),
+ user-agent, request-uri, regex-backreference and so on with regular expression.
  test_files:
  - test/helper.rb
  - test/plugin/test_out_rewrite_tag_filter.rb
@@ -1,140 +0,0 @@
- = fluent-plugin-rewrite-tag-filter
-
- == Overview
-
- === RewriteTagFilterOutput
-
- It has designed to rewrite tags like mod_rewrite.
- Also you can change a tag from apache log by domain, status-code(ex. 500 error),
- user-agent, request-uri, regex-backreference and so on with regular expression.
-
- == Installation
-
- # native gem
- gem install fluent-plugin-rewrite-tag-filter
-
- # td-agent gem
- /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-rewrite-tag-filter
-
- == Configuration
-
- === Syntax
-
- rewruterule<num:1-200> <attribute> <regex_pattern> <new_tag>
-
- # Optional: Capitalize every matched regex backreference. (ex: $1, $2)
- capitalize_regex_backreference <yes/no> (default no)
-
- === Usage
-
- It's a sample to exclude some static file log before split tag by domain.
-
- <source>
- type tail
- path /var/log/httpd/access_log
- format /^(?<domain>[^ ]*) (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<status>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<response_time>[^ ]*))?$/
- time_format %d/%b/%Y:%H:%M:%S %z
- tag td.apache.access
- pos_file /var/log/td-agent/apache_access.pos
- </source>
-
- # At rewriterule4, "site.$2$1" to be "site.ExampleMail" by capitalize_regex_backreference option.
- <match td.apache.access>
- type rewrite_tag_filter
- capitalize_regex_backreference yes
- rewriterule1 path \.(gif|jpe?g|png|pdf|zip)$ clear
- rewriterule2 domain ^maps\.example\.com$ site.ExampleMaps
- rewriterule3 domain ^news\.example\.com$ site.ExampleNews
- rewriterule4 domain ^(mail)\.(example)\.com$ site.$2$1
- rewriterule5 domain .+ site.unmatched
- </match>
-
- <match site.*>
- type mongo
- host localhost
- database apache_access
- remove_tag_prefix site
- tag_mapped
- capped
- capped_size 100m
- </match>
-
- <match clear>
- type null
- </match>
-
- === Result
-
- $ mongo
- MongoDB shell version: 2.2.0
- > use apache_access
- switched to db apache_access
- > show collections
- ExampleMaps
- ExampleNews
- ExampleMail
- unmatched
-
- === Debug
-
- On starting td-agent, Logging supported like below.
-
- $ tailf /var/log/td-agent/td-agent.log
- 2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]
-
- === Tag placeholder
-
- Supporing these placeholder for new_tag(rewrited tag).
-
- - ${hostname}
- - __HOSTNAME__
- - ${tag}
- - __TAG__
-
- It's a sample to rewrite a tag with placeholder.
-
- # It will get "rewrited.apache.access.ExampleMail"
- <match apache.access>
- type rewrite_tag_filter
- rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.${tag}.$2$1
- </match>
-
- # It will get "rewrited.ExampleMail.app30-124.foo.com" when hostname is "app30-124.foo.com"
- <match apache.access>
- type rewrite_tag_filter
- rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.$2$1.${hostname}
- </match>
-
- == Example
-
- - Example1: how to analyze response_time, response_code and user_agent for each virtual domain websites.
- - https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example.conf
-
- - Example2: how to exclude specified patterns before analyze response_time for each virtual domain websites.
- - https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example2.conf
-
- == Topic
-
- - 自在にタグを書き換える fluent-plugin-rewrite-tag-filter でログ解析が捗るお話 #fluentd
- - http://d.hatena.ne.jp/yoshi-ken/20120701/1341137269
-
- - Fluentd & TreasureDataで こっそり始めるログ集計 Fluentd Meetup #2 @mikeda
- - http://www.slideshare.net/baguzy/fluentd-meetup-2-14073930
-
- - 似てる #fluentd プラグインの比較
- - http://matsumana.wordpress.com/2012/11/15/%E4%BC%BC%E3%81%A6%E3%82%8B-fluentd-%E3%83%97%E3%83%A9%E3%82%B0%E3%82%A4%E3%83%B3%E3%81%AE%E6%AF%94%E8%BC%83/
-
- == TODO
-
- - patches welcome!
-
- == Copyright
-
- Copyright:: Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
- License:: Apache License, Version 2.0
-