fluent-plugin-webhdfs 0.4.2 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 61f1146fdfa68de12937082f2321f097e4aa5858
- data.tar.gz: 502673646e6db2db20973079a4c2056515b7de11
+ metadata.gz: c6a284ca4dbae16f8154b790f6201d37f75b27a1
+ data.tar.gz: bbdd1477dd222f2a44308e1afcb7ec1ddd0f57ac
  SHA512:
- metadata.gz: b634dd49ac2d1e3d552e7dbda396cc5aca47c0cd090219c5464a3a834ec9ffd5c0030d013474e4b4ac83870252f45af567788d97e0671113278d41bd168b969a
- data.tar.gz: 90dfe3fc5e45a3667df905874607875cba4cd18e49e22cc05318dd4141ec1e1cec3d36d933ac15634d780ef46ab75565d63fea6ffcc5c3678b901ced071042c4
+ metadata.gz: 986dd5af2013f754606db0a51abbfd1c4870e7e4c2cebf759bb27c02d9e73f1b7c1c1f2b5d57a53a5578ae65febcc20bb4022dcabbcef90c0ce40f34a660d433
+ data.tar.gz: 8768473997ac56b9af81c517e671a3c22ecf5048a841cc1b119c53ac6952b133751c7206b55237d06fc8268d152ccd454559360f270943172d6d105cc6ba9d5a
data/.travis.yml CHANGED
@@ -1,9 +1,11 @@
+ sudo: false
  language: ruby
 
  rvm:
  - 2.0.0
  - 2.1
  - 2.2
+ - 2.3.0
 
  branches:
    only:
@@ -13,3 +15,7 @@ before_install:
  - gem update bundler
 
  script: bundle exec rake test
+
+ gemfile:
+ - Gemfile
+ - gemfiles/fluentd_v0.12.gemfile
data/Appraisals ADDED
@@ -0,0 +1,4 @@
+ appraise "fluentd v0.12" do
+ gem "fluentd", "~>0.12.0"
+ end
+
data/README.md CHANGED
@@ -17,10 +17,10 @@ And you can specify output file path as 'path /path/to/dir/access.%Y%m%d.log', t
 
  ### WebHDFSOutput
 
- To store data by time,tag,json (same with 'type file') over WebHDFS:
+ To store data by time,tag,json (same with '@type file') over WebHDFS:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -29,7 +29,7 @@ To store data by time,tag,json (same with 'type file') over WebHDFS:
  If you want JSON object only (without time or tag or both on header of lines), specify it by `output_include_time` or `output_include_tag` (default true):
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -40,7 +40,7 @@ If you want JSON object only (without time or tag or both on header of lines), s
  To specify namenode, `namenode` is also available:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  namenode master.your.cluster.local:50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
  </match>
@@ -48,7 +48,7 @@ To specify namenode, `namenode` is also available:
  To store data as LTSV without time and tag over WebHDFS:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -58,7 +58,7 @@ To store data as LTSV without time and tag over WebHDFS:
  With username of pseudo authentication:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -68,7 +68,7 @@ With username of pseudo authentication:
  Store data over HttpFs (instead of WebHDFS):
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host httpfs.node.your.cluster.local
  port 14000
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -78,7 +78,7 @@ Store data over HttpFs (instead of WebHDFS):
  Store data as TSV (TAB separated values) of specified keys, without time, with tag (removed prefix 'access'):
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -96,7 +96,7 @@ If message doesn't have specified attribute, fluent-plugin-webhdfs outputs 'NULL
  With ssl:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -113,7 +113,7 @@ and [openssl](http://www.ruby-doc.org/stdlib-2.1.3/libdoc/openssl/rdoc/OpenSSL.h
  With kerberos authentication:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -123,21 +123,21 @@ With kerberos authentication:
  If you want to compress data before storing it:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /path/on/hdfs/access.log.%Y%m%d_%H
- compress gzip # currently only support gzip
+ compress gzip # or 'bzip2', 'lzo_command'
  </match>
 
- Note that if you set `compress gzip`, then the suffix `.gz` will be added to path.
+ Note that if you set `compress gzip`, then the suffix `.gz` will be added to path (or `.bz2`, `.lzo`).
 
  ### Namenode HA / Auto retry for WebHDFS known errors
 
  `fluent-plugin-webhdfs` (v0.2.0 or later) accepts 2 namenodes for Namenode HA (active/standby). Use `standby_namenode` like this:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  namenode master1.your.cluster.local:50070
  standby_namenode master2.your.cluster.local:50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
@@ -146,7 +146,7 @@ Note that if you set `compress gzip`, then the suffix `.gz` will be added to pat
  And you can also specify to retry known hdfs errors (such like `LeaseExpiredException`) automatically. With this configuration, fluentd doesn't write logs for this errors if retry successed.
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  namenode master1.your.cluster.local:50070
  path /path/on/hdfs/access.log.%Y%m%d_%H.log
  retry_known_errors yes
@@ -162,7 +162,7 @@ You can use '${hostname}' or '${uuid:random}' placeholders in configuration for
  For hostname:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /log/access/%Y%m%d/${hostname}.log
@@ -171,7 +171,7 @@ For hostname:
  Or with random filename (to avoid duplicated file name only):
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /log/access/%Y%m%d/${uuid:random}.log
@@ -182,7 +182,7 @@ With configurations above, you can handle all of files of '/log/access/20120820/
  For high load cluster nodes, you can specify timeouts for HTTP requests.
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  namenode master.your.cluster.local:50070
  path /log/access/%Y%m%d/${hostname}.log
  open_timeout 180 # [sec] default: 30
@@ -196,7 +196,7 @@ With default configuration, fluent-plugin-webhdfs checks HDFS filesystem status
  If you were usging unstable NameNodes and have wanted to ignore NameNode errors on startup of fluentd, enable `ignore_start_check_error` option like below:
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
  path /log/access/%Y%m%d/${hostname}.log
@@ -208,7 +208,7 @@ If you were usging unstable NameNodes and have wanted to ignore NameNode errors
  With unstable datanodes that frequently downs, appending over WebHDFS may produce broken files. In such cases, specify `append no` and `${chunk_id}` parameter.
 
  <match access.**>
- type webhdfs
+ @type webhdfs
  host namenode.your.cluster.local
  port 50070
 
data/fluent-plugin-webhdfs.gemspec CHANGED
@@ -2,7 +2,7 @@
 
  Gem::Specification.new do |gem|
  gem.name = "fluent-plugin-webhdfs"
- gem.version = "0.4.2"
+ gem.version = "0.5.0"
  gem.authors = ["TAGOMORI Satoshi"]
  gem.email = ["tagomoris@gmail.com"]
  gem.summary = %q{Fluentd plugin to write data on HDFS over WebHDFS, with flexible formatting}
@@ -17,7 +17,8 @@ Gem::Specification.new do |gem|
 
  gem.add_development_dependency "rake"
  gem.add_development_dependency "test-unit"
- gem.add_runtime_dependency "fluentd", '>= 0.10.53'
+ gem.add_development_dependency "appraisal"
+ gem.add_runtime_dependency "fluentd", '>= 0.10.59'
  gem.add_runtime_dependency "fluent-mixin-plaintextformatter", '>= 0.2.1'
  gem.add_runtime_dependency "fluent-mixin-config-placeholders", ">= 0.3.0"
  gem.add_runtime_dependency "webhdfs", '>= 0.6.0'
data/gemfiles/fluentd_v0.12.gemfile ADDED
@@ -0,0 +1,7 @@
+ # This file was generated by Appraisal
+
+ source "https://rubygems.org"
+
+ gem "fluentd", "~>0.12.0"
+
+ gemspec :path => "../"
data/lib/fluent/plugin/out_webhdfs.rb CHANGED
@@ -85,7 +85,7 @@ class Fluent::WebHDFSOutput < Fluent::TimeSlicedOutput
  desc 'Use kerberos authentication or not'
  config_param :kerberos, :bool, :default => false
 
- SUPPORTED_COMPRESS = ['gzip', 'bzip2']
+ SUPPORTED_COMPRESS = ['gzip', 'bzip2', 'lzo_command']
  desc "Compress method (#{SUPPORTED_COMPRESS.join(',')})"
  config_param :compress, :default => nil do |val|
  unless SUPPORTED_COMPRESS.include? val
@@ -357,3 +357,4 @@ end
  require 'fluent/plugin/webhdfs_compressor_text'
  require 'fluent/plugin/webhdfs_compressor_gzip'
  require 'fluent/plugin/webhdfs_compressor_bzip2'
+ require 'fluent/plugin/webhdfs_compressor_lzo_command'
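The `SUPPORTED_COMPRESS` change above extends the whitelist that the `compress` parameter is validated against. A minimal standalone sketch of that validate-against-a-whitelist pattern (the helper name `validate_compress` and `ArgumentError` are illustrative; the real plugin raises `Fluent::ConfigError` inside a `config_param` block):

```ruby
# Standalone sketch of the whitelist validation used for the `compress`
# parameter; simplified and not the plugin's actual code.
SUPPORTED_COMPRESS = ['gzip', 'bzip2', 'lzo_command']

def validate_compress(val)
  unless SUPPORTED_COMPRESS.include?(val)
    raise ArgumentError, "unsupported compress: #{val}"
  end
  val
end

puts validate_compress('lzo_command')   # => lzo_command
begin
  validate_compress('zstd')
rescue ArgumentError => e
  puts e.message                        # => unsupported compress: zstd
end
```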
data/lib/fluent/plugin/webhdfs_compressor_lzo_command.rb ADDED
@@ -0,0 +1,31 @@
+ module Fluent
+ class WebHDFSOutput
+ class LZOCommandCompressor < Compressor
+ WebHDFSOutput.register_compressor('lzo_command', self)
+
+ config_param :command_parameter, :string, :default => '-qf1'
+
+ def configure(conf)
+ super
+ check_command('lzop', 'LZO')
+ end
+
+ def ext
+ '.lzo'
+ end
+
+ def compress(chunk, tmp)
+ w = Tempfile.new("chunk-lzo-tmp-")
+ w.binmode
+ chunk.write_to(w)
+ w.close
+
+ # We don't check the return code because we can't recover lzop failure.
+ system "lzop #{@command_parameter} -o #{tmp.path} #{w.path}"
+ ensure
+ w.close rescue nil
+ w.unlink rescue nil
+ end
+ end
+ end
+ end
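The `compress` method added above spools the buffered chunk to a temporary file, shells out to `lzop`, and cleans the tempfile up in `ensure`. A runnable sketch of that spool-then-compress flow, with stdlib `Zlib` standing in for the external `lzop` binary so no extra tooling is needed (the helper name `compress_chunk` is illustrative, not the plugin's API):

```ruby
require 'tempfile'
require 'tmpdir'
require 'zlib'

# Sketch of the spool-then-compress flow: write the data to a tempfile,
# compress it into the target path, and always unlink the tempfile in
# `ensure`. Zlib replaces the `lzop` subprocess used by the real compressor.
def compress_chunk(data, out_path)
  w = Tempfile.new('chunk-tmp-')
  w.binmode
  w.write(data)                 # stands in for chunk.write_to(w)
  w.close
  Zlib::GzipWriter.open(out_path) { |gz| gz.write(File.binread(w.path)) }
ensure
  w.close rescue nil
  w.unlink rescue nil
end

out = File.join(Dir.tmpdir, 'chunk-demo.gz')
compress_chunk("log line\n", out)
puts Zlib::GzipReader.open(out, &:read)   # => log line
```

Cleaning up in `ensure` mirrors the compressor above: even if the compression step fails, the spooled tempfile is removed.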
data/test/plugin/test_out_webhdfs.rb CHANGED
@@ -68,7 +68,8 @@ kerberos true
  end
 
  data(gzip: ['gzip', Fluent::WebHDFSOutput::GzipCompressor],
- bzip2: ['bzip2', Fluent::WebHDFSOutput::Bzip2Compressor])
+ bzip2: ['bzip2', Fluent::WebHDFSOutput::Bzip2Compressor],
+ lzo: ['lzo_command', Fluent::WebHDFSOutput::LZOCommandCompressor])
  def test_compress(data)
  compress_type, compressor_class = data
  d = create_driver %[
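The `data(...)` call above is test-unit's data-driven test feature: each labeled entry is passed to `test_compress` in turn, so adding the `lzo` row covers the new compressor with no new test method. A plain-Ruby sketch of the same table-driven iteration (compressor classes replaced here by the file suffix each method produces, per the README note; no test-unit required):

```ruby
# Table-driven sketch mirroring the data() cases above; the suffix column
# follows the README note (.gz / .bz2 / .lzo). Plain Ruby stand-in.
CASES = {
  gzip:  ['gzip', '.gz'],
  bzip2: ['bzip2', '.bz2'],
  lzo:   ['lzo_command', '.lzo'],
}

CASES.each do |label, (compress_type, suffix)|
  puts "#{label}: compress #{compress_type} adds #{suffix}"
end
```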
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-webhdfs
  version: !ruby/object:Gem::Version
- version: 0.4.2
+ version: 0.5.0
  platform: ruby
  authors:
  - TAGOMORI Satoshi
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2016-01-07 00:00:00.000000000 Z
+ date: 2016-02-03 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rake
@@ -38,20 +38,34 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: appraisal
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: fluentd
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.10.53
+ version: 0.10.59
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.10.53
+ version: 0.10.59
  - !ruby/object:Gem::Dependency
  name: fluent-mixin-plaintextformatter
  requirement: !ruby/object:Gem::Requirement
@@ -117,14 +131,17 @@ extra_rdoc_files: []
  files:
  - ".gitignore"
  - ".travis.yml"
+ - Appraisals
  - Gemfile
  - LICENSE.txt
  - README.md
  - Rakefile
  - fluent-plugin-webhdfs.gemspec
+ - gemfiles/fluentd_v0.12.gemfile
  - lib/fluent/plugin/out_webhdfs.rb
  - lib/fluent/plugin/webhdfs_compressor_bzip2.rb
  - lib/fluent/plugin/webhdfs_compressor_gzip.rb
+ - lib/fluent/plugin/webhdfs_compressor_lzo_command.rb
  - lib/fluent/plugin/webhdfs_compressor_text.rb
  - test/helper.rb
  - test/plugin/test_out_webhdfs.rb
@@ -148,7 +165,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.4.5
+ rubygems_version: 2.5.1
  signing_key:
  specification_version: 4
  summary: Fluentd plugin to write data on HDFS over WebHDFS, with flexible formatting