fluent-plugin-rewrite-tag-filter 1.1.1 → 1.2.0

@@ -0,0 +1,162 @@
1
+ # fluent-plugin-rewrite-tag-filter
2
+
3
+ ## Overview
4
+
5
+ ### RewriteTagFilterOutput
6
+
7
+ Fluentd Output filter plugin. It is designed to rewrite tags like mod_rewrite.
8
+ It re-emits a record with a rewritten tag when a value matches the regular expression.
9
+ You can also change a tag from an Apache log by domain, status code (e.g. 500 error),
10
+ user-agent, request-uri, regex backreference, and so on using regular expressions.
11
+
12
+ ## Installation
13
+
14
+ ```
15
+ # native gem
16
+ gem install fluent-plugin-rewrite-tag-filter
17
+
18
+ # td-agent gem
19
+ /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-rewrite-tag-filter
20
+ ```
21
+
22
+ ## Configuration
23
+
24
+ ### Syntax
25
+
26
+ ```
27
+ rewriterule<num:1-200> <attribute> <regex_pattern> <new_tag>
28
+
29
+ # Optional: Capitalize every matched regex backreference. (ex: $1, $2)
30
+ capitalize_regex_backreference <yes/no> (default no)
31
+ ```
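To make the three-field syntax concrete, here is a minimal Ruby sketch that splits one `rewriterule` value into attribute, pattern, and new tag. `parse_rewriterule` and `RULE_FORMAT` are hypothetical names for illustration; the plugin's own parser is not shown in this diff and may behave differently.

```ruby
# Hypothetical helper: split one rewriterule value into its three fields.
# This only mirrors the documented syntax, not the plugin's real parser.
RULE_FORMAT = /\A(\S+)\s+(.+)\s+(\S+)\z/

def parse_rewriterule(line)
  m = RULE_FORMAT.match(line.strip)
  raise ArgumentError, "invalid rewriterule: #{line.inspect}" unless m

  attribute, pattern, new_tag = m.captures
  # Assumption: a pattern may be wrapped in double quotes, as in the test config.
  if pattern.length >= 2 && pattern.start_with?('"') && pattern.end_with?('"')
    pattern = pattern[1..-2]
  end
  { attribute: attribute, regexp: Regexp.new(pattern), new_tag: new_tag }
end

rule = parse_rewriterule('domain ^(mail)\.(example)\.com$ site.$2$1')
puts rule[:attribute]
puts rule[:regexp].source
puts rule[:new_tag]
```

The greedy middle group lets the pattern itself contain spaces, while the first and last fields stay whitespace-free.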
32
+
33
+ ### Usage
34
+
35
+ This sample excludes logs for some static files before splitting the tag by domain.
36
+
37
+ ```
38
+ <source>
39
+ type tail
40
+ path /var/log/httpd/access_log
41
+ format /^(?<domain>[^ ]*) (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<status>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<response_time>[^ ]*))?$/
42
+ time_format %d/%b/%Y:%H:%M:%S %z
43
+ tag td.apache.access
44
+ pos_file /var/log/td-agent/apache_access.pos
45
+ </source>
46
+
47
+ # At rewriterule4, "site.$2$1" becomes "site.ExampleMail" because of the capitalize_regex_backreference option.
48
+ <match td.apache.access>
49
+ type rewrite_tag_filter
50
+ capitalize_regex_backreference yes
51
+ rewriterule1 path \.(gif|jpe?g|png|pdf|zip)$ clear
52
+ rewriterule2 domain ^maps\.example\.com$ site.ExampleMaps
53
+ rewriterule3 domain ^news\.example\.com$ site.ExampleNews
54
+ rewriterule4 domain ^(mail)\.(example)\.com$ site.$2$1
55
+ rewriterule5 domain .+ site.unmatched
56
+ </match>
57
+
58
+ <match site.*>
59
+ type mongo
60
+ host localhost
61
+ database apache_access
62
+ remove_tag_prefix site
63
+ tag_mapped
64
+ capped
65
+ capped_size 100m
66
+ </match>
67
+
68
+ <match clear>
69
+ type null
70
+ </match>
71
+ ```
72
+
73
+ ### Result
74
+
75
+ ```
76
+ $ mongo
77
+ MongoDB shell version: 2.2.0
78
+ > use apache_access
79
+ switched to db apache_access
80
+ > show collections
81
+ ExampleMaps
82
+ ExampleNews
83
+ ExampleMail
84
+ unmatched
85
+ ```
86
+
87
+ ### Debug
88
+
89
+ When td-agent starts, it logs the registered rules as shown below.
90
+
91
+ ```
92
+ $ tailf /var/log/td-agent/td-agent.log
93
+ 2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
94
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
95
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
96
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
97
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
98
+ 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]
99
+ ```
100
+
101
+ ### Tag placeholder
102
+
103
+ The following placeholders are supported in new_tag (the rewritten tag).
104
+ They can be combined with the `remove_tag_prefix` option,
105
+ which removes the given prefix from the tag before it is substituted for `${tag}` or `__TAG__`.
106
+
107
+ - `${hostname}`
108
+ - `__HOSTNAME__`
109
+ - `${tag}`
110
+ - `__TAG__`
111
+
112
+ This sample rewrites a tag using placeholders.
113
+
114
+ ```
115
+ # It will get "rewrited.access.ExampleMail" (with capitalize_regex_backreference enabled; otherwise "rewrited.access.examplemail")
116
+ <match apache.access>
117
+ type rewrite_tag_filter
118
+ rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.${tag}.$2$1
119
+ remove_tag_prefix apache
120
+ </match>
121
+
122
+ # It will get "rewrited.ExampleMail.app30-124.foo.com" when hostname is "app30-124.foo.com" (with capitalize_regex_backreference enabled)
123
+ <match apache.access>
124
+ type rewrite_tag_filter
125
+ rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.$2$1.${hostname}
126
+ </match>
127
+ ```
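The interaction between `remove_tag_prefix`, `${tag}`, and regex backreferences can be sketched in plain Ruby as follows. `rewrite_tag` is a hypothetical standalone function, not the plugin's API, and it leaves out the hostname placeholders for brevity. It only illustrates the order of operations described above: strip the prefix, build the substitution table, then expand the template.

```ruby
# Hypothetical sketch of tag rewriting with placeholders (not the plugin's code).
def rewrite_tag(template, tag, match, prefix: nil, capitalize: false)
  # Strip the configured prefix (and one following dot) before substitution.
  tag = tag.sub(/\A#{Regexp.escape(prefix)}\.?/, '') if prefix

  table = { '${tag}' => tag, '__TAG__' => tag }
  # Map captures to "$1", "$2", ... and optionally capitalize them.
  match.captures.each.with_index(1) do |value, index|
    table["$#{index}"] = capitalize ? value.capitalize : value
  end

  # Replace every known placeholder; leave unknown ones untouched.
  template.gsub(/\$\{tag\}|__TAG__|\$\d+/) { |key| table.fetch(key, key) }
end

m = /^(mail)\.(example)\.com$/.match('mail.example.com')
puts rewrite_tag('rewrited.${tag}.$2$1', 'apache.access', m,
                 prefix: 'apache', capitalize: true)
puts rewrite_tag('rewrited.${tag}.$2$1', 'apache.access', m)
```

With the prefix and capitalization enabled, the first call yields `rewrited.access.ExampleMail`. The second call keeps the full tag and the lowercase captures.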
128
+
129
+ ## Example
130
+
131
+ - Example1: how to analyze response_time, response_code and user_agent for each virtual-domain website.
132
+ https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example.conf
133
+
134
+ - Example2: how to exclude specified patterns before analyzing response_time for each virtual-domain website.
135
+ https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example2.conf
136
+
137
+ ## Related Articles
138
+
139
+ - How freely rewriting tags with fluent-plugin-rewrite-tag-filter speeds up log analysis #fluentd
140
+ http://d.hatena.ne.jp/yoshi-ken/20120701/1341137269
141
+
142
+ - Quietly getting started with log aggregation using Fluentd & TreasureData, Fluentd Meetup #2 @mikeda
143
+ http://www.slideshare.net/baguzy/fluentd-meetup-2-14073930
144
+
145
+ - A comparison of similar #fluentd plugins
146
+ http://matsumana.wordpress.com/2012/11/15/%E4%BC%BC%E3%81%A6%E3%82%8B-fluentd-%E3%83%97%E3%83%A9%E3%82%B0%E3%82%A4%E3%83%B3%E3%81%AE%E6%AF%94%E8%BC%83/
147
+
148
+ - Fluentd configuration file for an aggregation server (fluent-plugin-rewrite-tag-filter version)
149
+ https://gist.github.com/matsumana/4078096
150
+
151
+ - Trying out the flowcounter plugin on a running Fluentd
152
+ http://dayafterneet.blogspot.jp/2012/12/fluentdflowcounter-plugin.html
153
+
154
+ ## TODO
155
+
156
+ Pull requests are very welcome!!
157
+
158
+ ## Copyright
159
+
160
+ Copyright : Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
161
+ License : Apache License, Version 2.0
162
+
@@ -3,18 +3,17 @@ $:.push File.expand_path("../lib", __FILE__)
3
3
 
4
4
  Gem::Specification.new do |s|
5
5
  s.name = "fluent-plugin-rewrite-tag-filter"
6
- s.version = "1.1.1"
6
+ s.version = "1.2.0"
7
7
  s.authors = ["Kentaro Yoshida"]
8
8
  s.email = ["y.ken.studio@gmail.com"]
9
9
  s.homepage = "https://github.com/y-ken/fluent-plugin-rewrite-tag-filter"
10
- s.summary = %q{Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite. Re-emmit a record with rewrited tag when a value matches with the regular expression. Also you can change a tag with a values from apache log by domain, status-code(ex. 500 error), user-agent, request-uri, regex-backreference and so on with regular expression.}
10
+ s.summary = %q{Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite. Re-emmit a record with rewrited tag when a value matches with the regular expression. Also you can change a tag from apache log by domain, status-code(ex. 500 error), user-agent, request-uri, regex-backreference and so on with regular expression.}
11
11
 
12
12
  s.files = `git ls-files`.split("\n")
13
13
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
14
14
  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
15
15
  s.require_paths = ["lib"]
16
16
 
17
- # specify any dependencies here; for example:
18
- s.add_development_dependency "fluentd"
17
+ s.add_development_dependency "rake"
19
18
  s.add_runtime_dependency "fluentd"
20
19
  end
@@ -8,6 +8,7 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
8
8
  config_param ('rewriterule' + i.to_s).to_sym, :string, :default => nil # NAME REGEXP
9
9
  end
10
10
  config_param :capitalize_regex_backreference, :bool, :default => false
11
+ config_param :remove_tag_prefix, :string, :default => nil
11
12
 
12
13
  def configure(conf)
13
14
  super
@@ -37,6 +38,10 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
37
38
  unless @rewriterules.length == rewriterule_names.uniq.length
38
39
  raise Fluent::ConfigError, "duplicated rewriterules found #{@rewriterules.inspect}"
39
40
  end
41
+
42
+ unless conf['remove_tag_prefix'].nil?
43
+ @remove_tag_prefix = Regexp.new("^#{Regexp.escape(remove_tag_prefix)}\.?")
44
+ end
40
45
  end
41
46
 
42
47
  def emit(tag, es, chain)
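One detail in the hunk above may be worth checking when reviewing: the prefix pattern is built from a double-quoted string, and in Ruby a double-quoted `"\."` collapses to a plain `.`. The following sketch, using the illustrative variable names `loose` and `strict`, compares that construction with a regexp literal. It is only a demonstration of Ruby's escaping rules, not a statement about how the plugin is meant to behave.

```ruby
prefix = 'input'

# Built from a double-quoted string, as in the hunk: "\." becomes ".",
# so the tail of the pattern means "one optional character of any kind".
loose = Regexp.new("^#{Regexp.escape(prefix)}\.?")

# Built as a regexp literal: \. stays a literal dot.
strict = /\A#{Regexp.escape(prefix)}\.?/

puts loose.source
puts 'input.access'.sub(loose, '')
puts 'input.access'.sub(strict, '')
puts 'input_access'.sub(loose, '')
puts 'input_access'.sub(strict, '')
```

For dot-separated tags both forms give the same result. They differ only when the character after the prefix is not a dot.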
@@ -68,17 +73,15 @@ class Fluent::RewriteTagFilterOutput < Fluent::Output
68
73
  end
69
74
 
70
75
  def map_regex_table(elements)
71
- # TODO: use each.with_index http://shirusu-ni-tarazu.hatenablog.jp/entry/2012/11/04/173513
72
76
  hash_table = Hash.new
73
- index = 1
74
- elements.each do |value|
77
+ elements.each.with_index(1) do |value, index|
75
78
  hash_table["$#{index}"] = @capitalize_regex_backreference ? value.capitalize : value
76
- index += 1
77
79
  end
78
80
  return hash_table
79
81
  end
80
82
 
81
83
  def get_placeholder(tag)
84
+ tag = tag.sub(@remove_tag_prefix, '') if @remove_tag_prefix
82
85
  return {
83
86
  '__HOSTNAME__' => @hostname,
84
87
  '${hostname}' => @hostname,
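The `each.with_index(1)` change in `map_regex_table` replaces a hand-maintained counter with an enumerator that starts counting at 1. A standalone sketch of the same idea, with the instance variable turned into a keyword argument for illustration (an assumption, not the plugin's signature):

```ruby
# Standalone variant of the table builder; `capitalize:` stands in for
# the plugin's @capitalize_regex_backreference setting.
def map_regex_table(elements, capitalize: false)
  table = {}
  # with_index(1) yields 1-based indexes, matching "$1", "$2", ...
  elements.each.with_index(1) do |value, index|
    table["$#{index}"] = capitalize ? value.capitalize : value
  end
  table
end

captures = /^(mail)\.(example)\.com$/.match('mail.example.com').captures
table = map_regex_table(captures, capitalize: true)
puts table['$1']
puts table['$2']
```

Starting the index at 1 keeps the keys aligned with regex backreference numbering without any manual `index += 1` bookkeeping.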
@@ -24,6 +24,12 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
24
24
  rewriterule4 agent "(Googlebot|CustomBot)-([a-zA-Z]+)" agent.$1-$2
25
25
  ]
26
26
 
27
+ # remove_tag_prefix test
28
+ CONFIG3 = %[
29
+ rewriterule1 domain ^www\.google\.com$ ${tag}
30
+ remove_tag_prefix input
31
+ ]
32
+
27
33
  def create_driver(conf=CONFIG,tag='test')
28
34
  Fluent::Test::OutputTestDriver.new(Fluent::RewriteTagFilterOutput, tag).configure(conf)
29
35
  end
@@ -43,7 +49,6 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
43
49
 
44
50
  def test_emit
45
51
  d1 = create_driver(CONFIG, 'input.access')
46
- time = Time.parse("2012-01-02 13:14:15").to_i
47
52
  d1.run do
48
53
  d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
49
54
  d1.emit({'domain' => 'news.google.com', 'path' => '/', 'agent' => 'Googlebot-Mobile', 'response_time' => 900000})
@@ -68,7 +73,6 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
68
73
 
69
74
  def test_emit2
70
75
  d1 = create_driver(CONFIG2, 'input.access')
71
- time = Time.parse("2012-01-02 13:14:15").to_i
72
76
  d1.run do
73
77
  d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
74
78
  d1.emit({'domain' => 'news.google.com', 'path' => '/', 'agent' => 'Googlebot-Mobile', 'response_time' => 900000})
@@ -87,5 +91,16 @@ class RewriteTagFilterOutputTest < Test::Unit::TestCase
87
91
  p emits[3]
88
92
  assert_equal 'agent.Googlebot-Foobar', emits[3][0] #tag
89
93
  end
94
+
95
+ def test_emit3
96
+ d1 = create_driver(CONFIG3, 'input.access')
97
+ d1.run do
98
+ d1.emit({'domain' => 'www.google.com', 'path' => '/foo/bar?key=value', 'agent' => 'Googlebot', 'response_time' => 1000000})
99
+ end
100
+ emits = d1.emits
101
+ assert_equal 1, emits.length
102
+ p emits[0]
103
+ assert_equal 'access', emits[0][0] # tag
104
+ end
90
105
  end
91
106
 
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-rewrite-tag-filter
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.1.1
4
+ version: 1.2.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,10 +9,10 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-02-13 00:00:00.000000000 Z
12
+ date: 2013-05-03 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
- name: fluentd
15
+ name: rake
16
16
  requirement: !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
@@ -53,7 +53,7 @@ files:
53
53
  - .gitignore
54
54
  - Gemfile
55
55
  - LICENSE.txt
56
- - README.rdoc
56
+ - README.md
57
57
  - Rakefile
58
58
  - example.conf
59
59
  - example2.conf
@@ -86,9 +86,8 @@ signing_key:
86
86
  specification_version: 3
87
87
  summary: Fluentd Output filter plugin. It has designed to rewrite tag like mod_rewrite.
88
88
  Re-emmit a record with rewrited tag when a value matches with the regular expression.
89
- Also you can change a tag with a values from apache log by domain, status-code(ex.
90
- 500 error), user-agent, request-uri, regex-backreference and so on with regular
91
- expression.
89
+ Also you can change a tag from apache log by domain, status-code(ex. 500 error),
90
+ user-agent, request-uri, regex-backreference and so on with regular expression.
92
91
  test_files:
93
92
  - test/helper.rb
94
93
  - test/plugin/test_out_rewrite_tag_filter.rb
@@ -1,140 +0,0 @@
1
- = fluent-plugin-rewrite-tag-filter
2
-
3
- == Overview
4
-
5
- === RewriteTagFilterOutput
6
-
7
- It has designed to rewrite tags like mod_rewrite.
8
- Also you can change a tag from apache log by domain, status-code(ex. 500 error),
9
- user-agent, request-uri, regex-backreference and so on with regular expression.
10
-
11
- == Installation
12
-
13
- # native gem
14
- gem install fluent-plugin-rewrite-tag-filter
15
-
16
- # td-agent gem
17
- /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-rewrite-tag-filter
18
-
19
- == Configuration
20
-
21
- === Syntax
22
-
23
- rewruterule<num:1-200> <attribute> <regex_pattern> <new_tag>
24
-
25
- # Optional: Capitalize every matched regex backreference. (ex: $1, $2)
26
- capitalize_regex_backreference <yes/no> (default no)
27
-
28
- === Usage
29
-
30
- It's a sample to exclude some static file log before split tag by domain.
31
-
32
- <source>
33
- type tail
34
- path /var/log/httpd/access_log
35
- format /^(?<domain>[^ ]*) (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<status>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<response_time>[^ ]*))?$/
36
- time_format %d/%b/%Y:%H:%M:%S %z
37
- tag td.apache.access
38
- pos_file /var/log/td-agent/apache_access.pos
39
- </source>
40
-
41
- # At rewriterule4, "site.$2$1" to be "site.ExampleMail" by capitalize_regex_backreference option.
42
- <match td.apache.access>
43
- type rewrite_tag_filter
44
- capitalize_regex_backreference yes
45
- rewriterule1 path \.(gif|jpe?g|png|pdf|zip)$ clear
46
- rewriterule2 domain ^maps\.example\.com$ site.ExampleMaps
47
- rewriterule3 domain ^news\.example\.com$ site.ExampleNews
48
- rewriterule4 domain ^(mail)\.(example)\.com$ site.$2$1
49
- rewriterule5 domain .+ site.unmatched
50
- </match>
51
-
52
- <match site.*>
53
- type mongo
54
- host localhost
55
- database apache_access
56
- remove_tag_prefix site
57
- tag_mapped
58
- capped
59
- capped_size 100m
60
- </match>
61
-
62
- <match clear>
63
- type null
64
- </match>
65
-
66
- === Result
67
-
68
- $ mongo
69
- MongoDB shell version: 2.2.0
70
- > use apache_access
71
- switched to db apache_access
72
- > show collections
73
- ExampleMaps
74
- ExampleNews
75
- ExampleMail
76
- unmatched
77
-
78
- === Debug
79
-
80
- On starting td-agent, Logging supported like below.
81
-
82
- $ tailf /var/log/td-agent/td-agent.log
83
- 2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
84
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
85
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
86
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
87
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
88
- 2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]
89
-
90
- === Tag placeholder
91
-
92
- Supporing these placeholder for new_tag(rewrited tag).
93
-
94
- - ${hostname}
95
- - __HOSTNAME__
96
- - ${tag}
97
- - __TAG__
98
-
99
- It's a sample to rewrite a tag with placeholder.
100
-
101
- # It will get "rewrited.apache.access.ExampleMail"
102
- <match apache.access>
103
- type rewrite_tag_filter
104
- rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.${tag}.$2$1
105
- </match>
106
-
107
- # It will get "rewrited.ExampleMail.app30-124.foo.com" when hostname is "app30-124.foo.com"
108
- <match apache.access>
109
- type rewrite_tag_filter
110
- rewriterule1 domain ^(mail)\.(example)\.com$ rewrited.$2$1.${hostname}
111
- </match>
112
-
113
- == Example
114
-
115
- - Example1: how to analyze response_time, response_code and user_agent for each virtual domain websites.
116
- - https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example.conf
117
-
118
- - Example2: how to exclude specified patterns before analyze response_time for each virtual domain websites.
119
- - https://github.com/y-ken/fluent-plugin-rewrite-tag-filter/blob/master/example2.conf
120
-
121
- == Topic
122
-
123
- - 自在にタグを書き換える fluent-plugin-rewrite-tag-filter でログ解析が捗るお話 #fluentd
124
- - http://d.hatena.ne.jp/yoshi-ken/20120701/1341137269
125
-
126
- - Fluentd & TreasureDataで こっそり始めるログ集計 Fluentd Meetup #2 @mikeda
127
- - http://www.slideshare.net/baguzy/fluentd-meetup-2-14073930
128
-
129
- - 似てる #fluentd プラグインの比較
130
- - http://matsumana.wordpress.com/2012/11/15/%E4%BC%BC%E3%81%A6%E3%82%8B-fluentd-%E3%83%97%E3%83%A9%E3%82%B0%E3%82%A4%E3%83%B3%E3%81%AE%E6%AF%94%E8%BC%83/
131
-
132
- == TODO
133
-
134
- - patches welcome!
135
-
136
- == Copyright
137
-
138
- Copyright:: Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
139
- License:: Apache License, Version 2.0
140
-