fluent-plugin-s3 0.2.6 → 0.3.0

data/ChangeLog CHANGED
@@ -1,4 +1,19 @@
1
- Release 0.2.6 - 2012/01/15
1
+ Release 0.3.0 - 2013/02/19
2
+
3
+ * Enable dynamic and configurable S3 object keys
4
+ https://github.com/fluent/fluent-plugin-s3/pull/12
5
+ * Fix a bug where many temporary files were left in /tmp when the plugin failed to write to S3
6
+ https://github.com/fluent/fluent-plugin-s3/pull/15
7
+ * Enable fluent-mixin-config-placeholders to support hostname, uuid and other parameters in configuration
8
+ https://github.com/fluent/fluent-plugin-s3/pull/19
9
+ * Update 'aws-sdk' version requirement to '~> 1.8.2'
10
+ https://github.com/fluent/fluent-plugin-s3/pull/21
11
+ * Create a new S3 bucket if it does not exist
12
+ https://github.com/fluent/fluent-plugin-s3/pull/22
13
+ * Check permissions and bucket existence in the start method, not the write method.
14
+
15
+
16
+ Release 0.2.6 - 2013/01/15
2
17
 
3
18
  * Add use_ssl option
4
19
 
data/README.rdoc CHANGED
@@ -22,6 +22,7 @@ Simply use RubyGems:
22
22
  aws_sec_key YOUR_AWS_SECRET/KEY
23
23
  s3_bucket YOUR_S3_BUCKET_NAME
24
24
  s3_endpoint s3-ap-northeast-1.amazonaws.com
25
+ s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
25
26
  path logs/
26
27
  buffer_path /var/log/fluent/s3
27
28
 
@@ -38,6 +39,48 @@ Simply use RubyGems:
38
39
 
39
40
  [s3_endpoint] S3 endpoint name. For example, the Tokyo region is "s3-ap-northeast-1.amazonaws.com".
40
41
 
42
+ [s3_object_key_format] The format of S3 object keys. You can use several built-in variables:
43
+
44
+ - %{path}
45
+ - %{time_slice}
46
+ - %{index}
47
+ - %{file_extension}
48
+
49
+ to decide keys dynamically.
50
+
51
+ %{path} is exactly the value of *path* configured in the configuration file. E.g., "logs/" in the example configuration above.
52
+ %{time_slice} is the time slice as text, formatted with *time_slice_format*.
53
+ %{index} is a sequential number that starts from 0 and increments when multiple files are uploaded to S3 in the same time slice.
54
+ %{file_extension} is always "gz" for now.
55
+
56
+ The default format is "%{path}%{time_slice}_%{index}.%{file_extension}".
57
+
58
+ For instance, using the example configuration above, actual object keys on S3 will be something like:
59
+
60
+ "logs/20130111-22_0.gz"
61
+ "logs/20130111-23_0.gz"
62
+ "logs/20130111-23_1.gz"
63
+ "logs/20130112-00_0.gz"
64
+
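The way %{index} advances within one time slice can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `next_free_key` is a hypothetical helper, and a plain `Set` of key strings stands in for the bucket's existence check. It is not the plugin's code, which asks S3 whether each candidate key exists.

```ruby
require 'set'

# Hypothetical helper (not part of the plugin): returns the first key in the
# default format "%{path}%{time_slice}_%{index}.gz" that is not already taken.
# existing_keys is an assumed in-memory stand-in for the S3 existence check.
def next_free_key(existing_keys, path, time_slice)
  index = 0
  loop do
    key = "#{path}#{time_slice}_#{index}.gz"
    return key unless existing_keys.include?(key)
    index += 1 # key already taken, try the next index
  end
end

existing = Set.new(["logs/20130111-23_0.gz"])
puts next_free_key(existing, "logs/", "20130111-23")
puts next_free_key(existing, "logs/", "20130112-00")
```

With one object already stored for the 23:00 slice, the first call moves on to index 1, while the new slice starts again at index 0.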
65
+ With the configuration:
66
+
67
+ s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}
68
+ path log
69
+ time_slice_format %Y%m%d-%H
70
+
71
+ You get:
72
+
73
+ "log/events/ts=20130111-22/events_0.gz"
74
+ "log/events/ts=20130111-23/events_0.gz"
75
+ "log/events/ts=20130111-23/events_1.gz"
76
+ "log/events/ts=20130112-00/events_0.gz"
77
+
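The placeholder expansion described above can be approximated as follows. This is a minimal sketch, not the plugin's implementation: `expand_s3_object_key` is a hypothetical name, and leaving unknown placeholders untouched is a choice made for this sketch only.

```ruby
# Hypothetical helper (not part of the plugin): replaces each %{name} in the
# format string with the matching entry from values.
def expand_s3_object_key(format, values)
  format.gsub(/%\{([^}]+)\}/) do
    name = Regexp.last_match(1)
    # Assumption for this sketch: unknown placeholders are kept as written.
    values.fetch(name) { "%{#{name}}" }.to_s
  end
end

values = {
  "path"           => "log",
  "time_slice"     => "20130111-23",
  "index"          => 1,
  "file_extension" => "gz"
}
format = "%{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}"
puts expand_s3_object_key(format, values)
```

Because %{path} is substituted verbatim, a path of "log" without a trailing slash needs the explicit "/" in the format string, as in the configuration above.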
78
+ The {fluent-mixin-config-placeholders}[https://github.com/tagomoris/fluent-mixin-config-placeholders] mixin is also incorporated, so additional variables such as %{hostname}, %{uuid}, etc. can be used in the s3_object_key_format. This could prove useful in preventing filename conflicts when writing from multiple servers.
79
+
80
+ s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
81
+
82
+ [auto_create_bucket] Create the S3 bucket if it does not exist. Default is true.
83
+
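The effect of auto_create_bucket can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: `FakeS3` is a hypothetical in-memory stand-in, not the aws-sdk API, and this `ensure_bucket` only mirrors the decision the option controls.

```ruby
# Hypothetical in-memory stand-in for the S3 service (not the aws-sdk API).
class FakeS3
  def initialize(names = [])
    @names = names.dup
  end

  def bucket_exists?(name)
    @names.include?(name)
  end

  def create_bucket(name)
    @names << name
  end
end

# Sketch of the decision: do nothing if the bucket exists, create it when
# auto_create_bucket is true, otherwise fail early with a clear message.
def ensure_bucket(s3, name, auto_create_bucket)
  return if s3.bucket_exists?(name)
  unless auto_create_bucket
    raise "The specified bucket does not exist: bucket = #{name}"
  end
  s3.create_bucket(name)
end

s3 = FakeS3.new
ensure_bucket(s3, "test_bucket", true)
puts s3.bucket_exists?("test_bucket")

begin
  ensure_bucket(FakeS3.new, "missing_bucket", false)
rescue RuntimeError => e
  puts e.message
end
```

Setting auto_create_bucket to false is the safer choice when a mistyped bucket name should surface as a startup error rather than create a new bucket.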
41
84
  [path] path prefix of the files on S3. Default is "" (no prefix).
42
85
 
43
86
  [buffer_path (required)] path prefix of the files to buffer logs.
@@ -48,8 +91,6 @@ Simply use RubyGems:
48
91
 
49
92
  [utc] Use UTC instead of local time.
50
93
 
51
- The actual path on S3 will be: "{path}{time_slice_format}_{sequential_number}.gz"
52
-
53
94
 
54
95
  == Copyright
55
96
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.2.6
1
+ 0.3.0
@@ -17,7 +17,9 @@ Gem::Specification.new do |gem|
17
17
  gem.require_paths = ['lib']
18
18
 
19
19
  gem.add_dependency "fluentd", "~> 0.10.0"
20
- gem.add_dependency "aws-sdk", "~> 1.7"
20
+ gem.add_dependency "aws-sdk", "~> 1.8.2"
21
21
  gem.add_dependency "yajl-ruby", "~> 1.0"
22
+ gem.add_dependency "fluent-mixin-config-placeholders", "~> 0.2.0"
22
23
  gem.add_development_dependency "rake", ">= 0.9.2"
24
+ gem.add_development_dependency "flexmock", ">= 1.2.0"
23
25
  end
@@ -1,5 +1,6 @@
1
1
  module Fluent
2
2
 
3
+ require 'fluent/mixin/config_placeholders'
3
4
 
4
5
  class S3Output < Fluent::TimeSlicedOutput
5
6
  Fluent::Plugin.register_output('s3', self)
@@ -27,6 +28,16 @@ class S3Output < Fluent::TimeSlicedOutput
27
28
  config_param :aws_sec_key, :string, :default => nil
28
29
  config_param :s3_bucket, :string
29
30
  config_param :s3_endpoint, :string, :default => nil
31
+ config_param :s3_object_key_format, :string, :default => "%{path}%{time_slice}_%{index}.%{file_extension}"
32
+ config_param :auto_create_bucket, :bool, :default => true
33
+
34
+ attr_reader :bucket
35
+
36
+ include Fluent::Mixin::ConfigPlaceholders
37
+
38
+ def placeholders
39
+ [:percent]
40
+ end
30
41
 
31
42
  def configure(conf)
32
43
  super
@@ -63,6 +74,9 @@ class S3Output < Fluent::TimeSlicedOutput
63
74
 
64
75
  @s3 = AWS::S3.new(options)
65
76
  @bucket = @s3.buckets[@s3_bucket]
77
+
78
+ ensure_bucket
79
+ check_apikeys
66
80
  end
67
81
 
68
82
  def format(tag, time, record)
@@ -88,7 +102,15 @@ class S3Output < Fluent::TimeSlicedOutput
88
102
  def write(chunk)
89
103
  i = 0
90
104
  begin
91
- s3path = "#{@path}#{chunk.key}_#{i}.gz"
105
+ values_for_s3_object_key = {
106
+ "path" => @path,
107
+ "time_slice" => chunk.key,
108
+ "file_extension" => "gz",
109
+ "index" => i
110
+ }
111
+ s3path = @s3_object_key_format.gsub(%r(%{[^}]+})) { |expr|
112
+ values_for_s3_object_key[expr[2...expr.size-1]]
113
+ }
92
114
  i += 1
93
115
  end while @bucket.objects[s3path].exists?
94
116
 
@@ -99,11 +121,30 @@ class S3Output < Fluent::TimeSlicedOutput
99
121
  w.close
100
122
  @bucket.objects[s3path].write(Pathname.new(tmp.path), :content_type => 'application/x-gzip')
101
123
  ensure
124
+ tmp.close(true) rescue nil
102
125
  w.close rescue nil
103
126
  end
104
127
  end
105
- end
106
128
 
129
+ private
130
+
131
+ def ensure_bucket
132
+ if !@bucket.exists?
133
+ if @auto_create_bucket
134
+ $log.info "Creating bucket #{@s3_bucket} on #{@s3_endpoint}"
135
+ @s3.buckets.create(@s3_bucket)
136
+ else
137
+ raise "The specified bucket does not exist: bucket = #{@s3_bucket}"
138
+ end
139
+ end
140
+ end
107
141
 
142
+ def check_apikeys
143
+ @bucket.empty?
144
+ rescue
145
+ raise "aws_key_id or aws_sec_key is invalid. Please check your configuration"
146
+ end
108
147
  end
109
148
 
149
+
150
+ end
data/test/out_s3.rb CHANGED
@@ -1,8 +1,12 @@
1
1
  require 'fluent/test'
2
2
  require 'fluent/plugin/out_s3'
3
3
 
4
+ require 'flexmock/test_unit'
5
+ require 'zlib'
6
+
4
7
  class S3OutputTest < Test::Unit::TestCase
5
8
  def setup
9
+ require 'aws-sdk'
6
10
  Fluent::Test.setup
7
11
  end
8
12
 
@@ -20,6 +24,11 @@ class S3OutputTest < Test::Unit::TestCase
20
24
  def write(chunk)
21
25
  chunk.read
22
26
  end
27
+
28
+ private
29
+
30
+ def check_apikeys
31
+ end
23
32
  end.configure(conf)
24
33
  end
25
34
 
@@ -115,7 +124,7 @@ class S3OutputTest < Test::Unit::TestCase
115
124
  d.run
116
125
  end
117
126
 
118
- def test_write
127
+ def test_chunk_to_write
119
128
  d = create_driver
120
129
 
121
130
  time = Time.parse("2011-01-02 13:14:15 UTC").to_i
@@ -130,5 +139,109 @@ class S3OutputTest < Test::Unit::TestCase
130
139
  data
131
140
  end
132
141
 
133
- end
142
+ CONFIG2 = %[
143
+ hostname testing.node.local
144
+ aws_key_id test_key_id
145
+ aws_sec_key test_sec_key
146
+ s3_bucket test_bucket
147
+ s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
148
+ time_slice_format %Y%m%d-%H
149
+ path log
150
+ utc
151
+ buffer_type memory
152
+ auto_create_bucket false
153
+ ]
154
+
155
+ def create_time_sliced_driver(additional_conf = '')
156
+ d = Fluent::Test::TimeSlicedOutputTestDriver.new(Fluent::S3Output) do
157
+ private
158
+
159
+ def check_apikeys
160
+ end
161
+ end.configure([CONFIG2, additional_conf].join("\n"))
162
+ d
163
+ end
164
+
165
+ def test_write_with_custom_s3_object_key_format
166
+ # Assert content of event logs which are being sent to S3
167
+ s3obj = flexmock(AWS::S3::S3Object)
168
+ s3obj.should_receive(:exists?).with_any_args.
169
+ and_return { false }
170
+ s3obj.should_receive(:write).with(
171
+ on { |pathname|
172
+ data = nil
173
+ # Event logs are compressed in GZip
174
+ pathname.open { |f|
175
+ gz = Zlib::GzipReader.new(f)
176
+ data = gz.read
177
+ gz.close
178
+ }
179
+ assert_equal %[2011-01-02T13:14:15Z\ttest\t{"a":1}\n] +
180
+ %[2011-01-02T13:14:15Z\ttest\t{"a":2}\n],
181
+ data
182
+
183
+ pathname.to_s.match(%r|s3-|)
184
+ },
185
+ {:content_type=>"application/x-gzip"})
186
+
187
+ # Assert the key of S3Object, which event logs are stored in
188
+ s3obj_col = flexmock(AWS::S3::ObjectCollection)
189
+ s3obj_col.should_receive(:[]).with(
190
+ on { |key|
191
+ key == "log/events/ts=20110102-13/events_0-testing.node.local.gz"
192
+ }).
193
+ and_return {
194
+ s3obj
195
+ }
196
+
197
+ # Partial mock the S3Bucket, not to make an actual connection to Amazon S3
198
+ flexmock(AWS::S3::Bucket).new_instances do |bucket|
199
+ bucket.should_receive(:objects).with_any_args.
200
+ and_return {
201
+ s3obj_col
202
+ }
203
+ end
204
+
205
+ # We must use TimeSlicedOutputTestDriver instead of BufferedOutputTestDriver,
206
+ # to make assertions on chunks' keys
207
+ d = create_time_sliced_driver
134
208
 
209
+ time = Time.parse("2011-01-02 13:14:15 UTC").to_i
210
+ d.emit({"a"=>1}, time)
211
+ d.emit({"a"=>2}, time)
212
+
213
+ # Finally, the instance of S3Output is initialized and then invoked
214
+ d.run
215
+ end
216
+
217
+ def setup_mocks
218
+ s3bucket = flexmock(AWS::S3::Bucket)
219
+ s3bucket.should_receive(:exists?).with_any_args.and_return { false }
220
+ s3bucket_col = flexmock(AWS::S3::BucketCollection)
221
+ s3bucket_col.should_receive(:[]).with_any_args.and_return { s3bucket }
222
+ flexmock(AWS::S3).new_instances do |bucket|
223
+ bucket.should_receive(:buckets).with_any_args.and_return { s3bucket_col }
224
+ end
225
+
226
+ return s3bucket, s3bucket_col
227
+ end
228
+
229
+ def test_auto_create_bucket_false_with_non_existence_bucket
230
+ s3bucket, s3bucket_col = setup_mocks
231
+
232
+ d = create_time_sliced_driver('auto_create_bucket false')
233
+ assert_raise(RuntimeError, "The specified bucket does not exist: bucket = test_bucket") {
234
+ d.run
235
+ }
236
+ end
237
+
238
+ def test_auto_create_bucket_true_with_non_existence_bucket
239
+ s3bucket, s3bucket_col = setup_mocks
240
+ s3bucket_col.should_receive(:create).with_any_args.and_return { true }
241
+
242
+ d = create_time_sliced_driver('auto_create_bucket true')
243
+ assert_nothing_raised {
244
+ d.run
245
+ }
246
+ end
247
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: fluent-plugin-s3
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.2.6
4
+ version: 0.3.0
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2013-01-14 00:00:00.000000000 Z
12
+ date: 2013-02-19 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: fluentd
@@ -34,7 +34,7 @@ dependencies:
34
34
  requirements:
35
35
  - - ~>
36
36
  - !ruby/object:Gem::Version
37
- version: '1.7'
37
+ version: 1.8.2
38
38
  type: :runtime
39
39
  prerelease: false
40
40
  version_requirements: !ruby/object:Gem::Requirement
@@ -42,7 +42,7 @@ dependencies:
42
42
  requirements:
43
43
  - - ~>
44
44
  - !ruby/object:Gem::Version
45
- version: '1.7'
45
+ version: 1.8.2
46
46
  - !ruby/object:Gem::Dependency
47
47
  name: yajl-ruby
48
48
  requirement: !ruby/object:Gem::Requirement
@@ -59,6 +59,22 @@ dependencies:
59
59
  - - ~>
60
60
  - !ruby/object:Gem::Version
61
61
  version: '1.0'
62
+ - !ruby/object:Gem::Dependency
63
+ name: fluent-mixin-config-placeholders
64
+ requirement: !ruby/object:Gem::Requirement
65
+ none: false
66
+ requirements:
67
+ - - ~>
68
+ - !ruby/object:Gem::Version
69
+ version: 0.2.0
70
+ type: :runtime
71
+ prerelease: false
72
+ version_requirements: !ruby/object:Gem::Requirement
73
+ none: false
74
+ requirements:
75
+ - - ~>
76
+ - !ruby/object:Gem::Version
77
+ version: 0.2.0
62
78
  - !ruby/object:Gem::Dependency
63
79
  name: rake
64
80
  requirement: !ruby/object:Gem::Requirement
@@ -75,6 +91,22 @@ dependencies:
75
91
  - - ! '>='
76
92
  - !ruby/object:Gem::Version
77
93
  version: 0.9.2
94
+ - !ruby/object:Gem::Dependency
95
+ name: flexmock
96
+ requirement: !ruby/object:Gem::Requirement
97
+ none: false
98
+ requirements:
99
+ - - ! '>='
100
+ - !ruby/object:Gem::Version
101
+ version: 1.2.0
102
+ type: :development
103
+ prerelease: false
104
+ version_requirements: !ruby/object:Gem::Requirement
105
+ none: false
106
+ requirements:
107
+ - - ! '>='
108
+ - !ruby/object:Gem::Version
109
+ version: 1.2.0
78
110
  description: Amazon S3 output plugin for Fluent event collector
79
111
  email: frsyuki@gmail.com
80
112
  executables: []
@@ -104,7 +136,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
104
136
  version: '0'
105
137
  segments:
106
138
  - 0
107
- hash: -957364254525035982
139
+ hash: -741948344279215557
108
140
  required_rubygems_version: !ruby/object:Gem::Requirement
109
141
  none: false
110
142
  requirements:
@@ -113,7 +145,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
113
145
  version: '0'
114
146
  segments:
115
147
  - 0
116
- hash: -957364254525035982
148
+ hash: -741948344279215557
117
149
  requirements: []
118
150
  rubyforge_project:
119
151
  rubygems_version: 1.8.23