fluent-plugin-s3 0.2.6 → 0.3.0

data/ChangeLog CHANGED
@@ -1,4 +1,19 @@
- Release 0.2.6 - 2012/01/15
+ Release 0.3.0 - 2013/02/19
+
+ * Enable dynamic and configurable S3 object keys
+ https://github.com/fluent/fluent-plugin-s3/pull/12
+ * Fix a bug where many temporary files were left in /tmp when the plugin failed to write to S3
+ https://github.com/fluent/fluent-plugin-s3/pull/15
+ * Enable fluent-mixin-config-placeholders to support hostname, uuid and other parameters in configuration
+ https://github.com/fluent/fluent-plugin-s3/pull/19
+ * Update 'aws-sdk' version requirement to '~> 1.8.2'
+ https://github.com/fluent/fluent-plugin-s3/pull/21
+ * Create a new S3 bucket if it does not exist
+ https://github.com/fluent/fluent-plugin-s3/pull/22
+ * Check the permission and bucket existence in the start method, not the write method.
+
+
+ Release 0.2.6 - 2013/01/15

  * Add use_ssl option

data/README.rdoc CHANGED
@@ -22,6 +22,7 @@ Simply use RubyGems:
  aws_sec_key YOUR_AWS_SECRET/KEY
  s3_bucket YOUR_S3_BUCKET_NAME
  s3_endpoint s3-ap-northeast-1.amazonaws.com
+ s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
  path logs/
  buffer_path /var/log/fluent/s3

@@ -38,6 +39,48 @@ Simply use RubyGems:

  [s3_endpoint] s3 endpoint name. Example, Tokyo region is "s3-ap-northeast-1.amazonaws.com".

+ [s3_object_key_format] The format of S3 object keys. You can use several built-in variables:
+
+ - %{path}
+ - %{time_slice}
+ - %{index}
+ - %{file_extension}
+
+ to decide keys dynamically.
+
+ %{path} is exactly the value of *path* configured in the configuration file. E.g., "logs/" in the example configuration above.
+ %{time_slice} is the time slice, formatted as text according to *time_slice_format*.
+ %{index} is a sequential number that starts from 0 and increments when multiple files are uploaded to S3 in the same time slice.
+ %{file_extension} is always "gz" for now.
+
+ The default format is "%{path}%{time_slice}_%{index}.%{file_extension}".
+
+ For instance, using the example configuration above, actual object keys on S3 will be something like:
+
+ "logs/20130111-22_0.gz"
+ "logs/20130111-23_0.gz"
+ "logs/20130111-23_1.gz"
+ "logs/20130112-00_0.gz"
+
+ With the configuration:
+
+ s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}
+ path log
+ time_slice_format %Y%m%d-%H
+
+ You get:
+
+ "log/events/ts=20130111-22/events_0.gz"
+ "log/events/ts=20130111-23/events_0.gz"
+ "log/events/ts=20130111-23/events_1.gz"
+ "log/events/ts=20130112-00/events_0.gz"
+
+ The {fluent-mixin-config-placeholders}[https://github.com/tagomoris/fluent-mixin-config-placeholders] mixin is also incorporated, so additional variables such as %{hostname}, %{uuid}, etc. can be used in the s3_object_key_format. This can help prevent filename conflicts when writing from multiple servers.
+
+ s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
+
+ [auto_create_bucket] Create the S3 bucket if it does not exist. Default is true.
+
  [path] path prefix of the files on S3. Default is "" (no prefix).

  [buffer_path (required)] path prefix of the files to buffer logs.
@@ -48,8 +91,6 @@ Simply use RubyGems:

  [utc] Use UTC instead of local time.

- The actual path on S3 will be: "{path}{time_slice_format}_{sequential_number}.gz"
-

  == Copyright

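Reviewer note: the placeholder substitution described in the README change can be sketched in a few lines of Ruby. This is a minimal illustration, not the plugin's code; the helper name expand_s3_object_key is hypothetical, and the values are taken from the README example.

```ruby
# Hypothetical helper (not part of the plugin): expands %{name} placeholders
# in an S3 object key format using a hash of values.
def expand_s3_object_key(format, values)
  format.gsub(/%\{[^}]+\}/) do |expr|
    # expr looks like "%{name}"; drop the leading "%{" and the trailing "}".
    # Unknown names expand to an empty string in this sketch.
    values[expr[2...-1]].to_s
  end
end

values = {
  "path"           => "log",
  "time_slice"     => "20130111-22",
  "index"          => 0,
  "file_extension" => "gz",
}

puts expand_s3_object_key(
  "%{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}", values
)
# prints "log/events/ts=20130111-22/events_0.gz"
```

With the default format and path set to "logs/", the same helper yields keys of the form "logs/20130111-22_0.gz".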
data/VERSION CHANGED
@@ -1 +1 @@
- 0.2.6
+ 0.3.0
@@ -17,7 +17,9 @@ Gem::Specification.new do |gem|
    gem.require_paths = ['lib']

    gem.add_dependency "fluentd", "~> 0.10.0"
-   gem.add_dependency "aws-sdk", "~> 1.7"
+   gem.add_dependency "aws-sdk", "~> 1.8.2"
    gem.add_dependency "yajl-ruby", "~> 1.0"
+   gem.add_dependency "fluent-mixin-config-placeholders", "~> 0.2.0"
    gem.add_development_dependency "rake", ">= 0.9.2"
+   gem.add_development_dependency "flexmock", ">= 1.2.0"
  end
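Reviewer note: the aws-sdk constraint moves from "~> 1.7" to "~> 1.8.2", which narrows the accepted range considerably. A small sketch using RubyGems' own Gem::Requirement shows the difference; the helper name satisfies? and the sample version numbers are chosen for illustration only.

```ruby
require "rubygems"

# Hypothetical helper: does version string `ver` satisfy requirement `req`?
def satisfies?(req, ver)
  Gem::Requirement.new(req).satisfied_by?(Gem::Version.new(ver))
end

%w[1.7.0 1.8.2 1.8.9 1.9.0].each do |ver|
  old_ok = satisfies?("~> 1.7", ver)    # means >= 1.7 and < 2.0
  new_ok = satisfies?("~> 1.8.2", ver)  # means >= 1.8.2 and < 1.9
  puts "#{ver}: old=#{old_ok} new=#{new_ok}"
end
```

Under the old constraint all four sample versions are accepted; under the new one only 1.8.2 and 1.8.9 are.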
@@ -1,5 +1,6 @@
  module Fluent

+ require 'fluent/mixin/config_placeholders'

  class S3Output < Fluent::TimeSlicedOutput
    Fluent::Plugin.register_output('s3', self)
@@ -27,6 +28,16 @@ class S3Output < Fluent::TimeSlicedOutput
    config_param :aws_sec_key, :string, :default => nil
    config_param :s3_bucket, :string
    config_param :s3_endpoint, :string, :default => nil
+   config_param :s3_object_key_format, :string, :default => "%{path}%{time_slice}_%{index}.%{file_extension}"
+   config_param :auto_create_bucket, :bool, :default => true
+
+   attr_reader :bucket
+
+   include Fluent::Mixin::ConfigPlaceholders
+
+   def placeholders
+     [:percent]
+   end

    def configure(conf)
      super
@@ -63,6 +74,9 @@ class S3Output < Fluent::TimeSlicedOutput

      @s3 = AWS::S3.new(options)
      @bucket = @s3.buckets[@s3_bucket]
+
+     ensure_bucket
+     check_apikeys
    end

    def format(tag, time, record)
@@ -88,7 +102,15 @@ class S3Output < Fluent::TimeSlicedOutput
    def write(chunk)
      i = 0
      begin
-       s3path = "#{@path}#{chunk.key}_#{i}.gz"
+       values_for_s3_object_key = {
+         "path" => @path,
+         "time_slice" => chunk.key,
+         "file_extension" => "gz",
+         "index" => i
+       }
+       s3path = @s3_object_key_format.gsub(%r(%{[^}]+})) { |expr|
+         values_for_s3_object_key[expr[2...expr.size-1]]
+       }
        i += 1
      end while @bucket.objects[s3path].exists?

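Reviewer note: the loop in this hunk keeps incrementing the index until it finds a key that does not exist yet. The sketch below reproduces that behaviour against an in-memory Set standing in for the bucket; next_free_key and existing_keys are hypothetical names, and no S3 call is made.

```ruby
require "set"

# Hypothetical helper: returns the first key for which no object exists yet.
# `existing_keys` is an in-memory stand-in for @bucket.objects[key].exists?.
def next_free_key(format, path, time_slice, existing_keys)
  i = 0
  begin
    values = {
      "path" => path,
      "time_slice" => time_slice,
      "file_extension" => "gz",
      "index" => i,
    }
    key = format.gsub(/%\{[^}]+\}/) { |expr| values[expr[2...-1]].to_s }
    i += 1
  end while existing_keys.include?(key)
  key
end

existing = Set.new(["logs/20130111-23_0.gz"])
default_format = "%{path}%{time_slice}_%{index}.%{file_extension}"

puts next_free_key(default_format, "logs/", "20130111-23", existing)
# prints "logs/20130111-23_1.gz"
puts next_free_key(default_format, "logs/", "20130112-00", existing)
# prints "logs/20130112-00_0.gz"
```

One property of this sketch worth keeping in mind: if the format omits %{index}, the generated key never changes, so the loop would not terminate once that key exists.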
@@ -99,11 +121,30 @@ class S3Output < Fluent::TimeSlicedOutput
        w.close
        @bucket.objects[s3path].write(Pathname.new(tmp.path), :content_type => 'application/x-gzip')
      ensure
+       tmp.close(true) rescue nil
        w.close rescue nil
      end
    end
- end

+   private
+
+   def ensure_bucket
+     if !@bucket.exists?
+       if @auto_create_bucket
+         $log.info "Creating bucket #{@s3_bucket} on #{@s3_endpoint}"
+         @s3.buckets.create(@s3_bucket)
+       else
+         raise "The specified bucket does not exist: bucket = #{@s3_bucket}"
+       end
+     end
+   end

+   def check_apikeys
+     @bucket.empty?
+   rescue
+     raise "aws_key_id or aws_sec_key is invalid. Please check your configuration"
+   end
  end

+
+ end
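Reviewer note: the new ensure_bucket method either creates a missing bucket or fails fast at startup, depending on auto_create_bucket. The decision logic can be exercised without AWS using a small stand-in; FakeBuckets below is a hypothetical class whose interface is simplified and does not match the aws-sdk API.

```ruby
# Hypothetical in-memory stand-in for a bucket collection (not the aws-sdk API).
class FakeBuckets
  def initialize(names)
    @names = names.dup
  end

  def exists?(name)
    @names.include?(name)
  end

  def create(name)
    @names << name
  end
end

# Sketch of the same decision the diff adds: do nothing if the bucket exists,
# create it when allowed, otherwise raise at startup.
def ensure_bucket(buckets, name, auto_create)
  return if buckets.exists?(name)
  raise "The specified bucket does not exist: bucket = #{name}" unless auto_create
  buckets.create(name)
end

buckets = FakeBuckets.new(["existing_bucket"])
ensure_bucket(buckets, "test_bucket", true)
puts buckets.exists?("test_bucket")
# prints "true"
```

Calling ensure_bucket(FakeBuckets.new([]), "test_bucket", false) raises a RuntimeError with the same message the plugin uses.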
data/test/out_s3.rb CHANGED
@@ -1,8 +1,12 @@
  require 'fluent/test'
  require 'fluent/plugin/out_s3'

+ require 'flexmock/test_unit'
+ require 'zlib'
+
  class S3OutputTest < Test::Unit::TestCase
    def setup
+     require 'aws-sdk'
      Fluent::Test.setup
    end

@@ -20,6 +24,11 @@ class S3OutputTest < Test::Unit::TestCase
        def write(chunk)
          chunk.read
        end
+
+       private
+
+       def check_apikeys
+       end
      end.configure(conf)
    end

@@ -115,7 +124,7 @@ class S3OutputTest < Test::Unit::TestCase
      d.run
    end

-   def test_write
+   def test_chunk_to_write
      d = create_driver

      time = Time.parse("2011-01-02 13:14:15 UTC").to_i
@@ -130,5 +139,109 @@ class S3OutputTest < Test::Unit::TestCase
                   data
    end

- end
+   CONFIG2 = %[
+     hostname testing.node.local
+     aws_key_id test_key_id
+     aws_sec_key test_sec_key
+     s3_bucket test_bucket
+     s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
+     time_slice_format %Y%m%d-%H
+     path log
+     utc
+     buffer_type memory
+     auto_create_bucket false
+   ]
+
+   def create_time_sliced_driver(additional_conf = '')
+     d = Fluent::Test::TimeSlicedOutputTestDriver.new(Fluent::S3Output) do
+       private
+
+       def check_apikeys
+       end
+     end.configure([CONFIG2, additional_conf].join("\n"))
+     d
+   end
+
+   def test_write_with_custom_s3_object_key_format
+     # Assert content of event logs which are being sent to S3
+     s3obj = flexmock(AWS::S3::S3Object)
+     s3obj.should_receive(:exists?).with_any_args.
+       and_return { false }
+     s3obj.should_receive(:write).with(
+       on { |pathname|
+         data = nil
+         # Event logs are compressed in GZip
+         pathname.open { |f|
+           gz = Zlib::GzipReader.new(f)
+           data = gz.read
+           gz.close
+         }
+         assert_equal %[2011-01-02T13:14:15Z\ttest\t{"a":1}\n] +
+                      %[2011-01-02T13:14:15Z\ttest\t{"a":2}\n],
+                      data
+
+         pathname.to_s.match(%r|s3-|)
+       },
+       {:content_type=>"application/x-gzip"})
+
+     # Assert the key of S3Object, which event logs are stored in
+     s3obj_col = flexmock(AWS::S3::ObjectCollection)
+     s3obj_col.should_receive(:[]).with(
+       on { |key|
+         key == "log/events/ts=20110102-13/events_0-testing.node.local.gz"
+       }).
+       and_return {
+         s3obj
+       }
+
+     # Partial mock the S3Bucket, not to make an actual connection to Amazon S3
+     flexmock(AWS::S3::Bucket).new_instances do |bucket|
+       bucket.should_receive(:objects).with_any_args.
+         and_return {
+           s3obj_col
+         }
+     end
+
+     # We must use TimeSlicedOutputTestDriver instead of BufferedOutputTestDriver,
+     # to make assertions on chunks' keys
+     d = create_time_sliced_driver

+     time = Time.parse("2011-01-02 13:14:15 UTC").to_i
+     d.emit({"a"=>1}, time)
+     d.emit({"a"=>2}, time)
+
+     # Finally, the instance of S3Output is initialized and then invoked
+     d.run
+   end
+
+   def setup_mocks
+     s3bucket = flexmock(AWS::S3::Bucket)
+     s3bucket.should_receive(:exists?).with_any_args.and_return { false }
+     s3bucket_col = flexmock(AWS::S3::BucketCollection)
+     s3bucket_col.should_receive(:[]).with_any_args.and_return { s3bucket }
+     flexmock(AWS::S3).new_instances do |bucket|
+       bucket.should_receive(:buckets).with_any_args.and_return { s3bucket_col }
+     end
+
+     return s3bucket, s3bucket_col
+   end
+
+   def test_auto_create_bucket_false_with_non_existence_bucket
+     s3bucket, s3bucket_col = setup_mocks
+
+     d = create_time_sliced_driver('auto_create_bucket false')
+     assert_raise(RuntimeError, "The specified bucket does not exist: bucket = test_bucket") {
+       d.run
+     }
+   end
+
+   def test_auto_create_bucket_true_with_non_existence_bucket
+     s3bucket, s3bucket_col = setup_mocks
+     s3bucket_col.should_receive(:create).with_any_args.and_return { true }
+
+     d = create_time_sliced_driver('auto_create_bucket true')
+     assert_nothing_raised {
+       d.run
+     }
+   end
+ end
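Reviewer note: the new test reads the uploaded file back through Zlib::GzipReader to check its content. The same round trip, together with the temporary-file cleanup that this release adds, can be tried with the standard library alone. The helper name gzip_round_trip is hypothetical, and unlike the plugin this sketch closes the gzip writer before the temp file.

```ruby
require "tempfile"
require "zlib"

# Hypothetical helper: gzip `text` into a temp file, read it back, and make
# sure the temp file is removed even if something in between fails.
# Returns [decompressed_text, path_of_the_removed_temp_file].
def gzip_round_trip(text)
  tmp = Tempfile.new("s3-")
  tmp.binmode
  path = tmp.path
  w = Zlib::GzipWriter.new(tmp)
  begin
    w.write(text)
    w.close # writes the gzip trailer and closes the underlying file
    data = Zlib::GzipReader.open(path) { |gz| gz.read }
    [data, path]
  ensure
    w.close rescue nil          # no-op if already closed
    tmp.close(true) rescue nil  # close and unlink the temp file
  end
end

lines = %[2011-01-02T13:14:15Z\ttest\t{"a":1}\n] +
        %[2011-01-02T13:14:15Z\ttest\t{"a":2}\n]
data, path = gzip_round_trip(lines)
puts data == lines      # prints "true"
puts File.exist?(path)  # prints "false"
```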
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-s3
  version: !ruby/object:Gem::Version
-   version: 0.2.6
+   version: 0.3.0
  prerelease:
  platform: ruby
  authors:
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-01-14 00:00:00.000000000 Z
+ date: 2013-02-19 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: fluentd
@@ -34,7 +34,7 @@ dependencies:
      requirements:
      - - ~>
        - !ruby/object:Gem::Version
-         version: '1.7'
+         version: 1.8.2
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
@@ -42,7 +42,7 @@ dependencies:
      requirements:
      - - ~>
        - !ruby/object:Gem::Version
-         version: '1.7'
+         version: 1.8.2
  - !ruby/object:Gem::Dependency
    name: yajl-ruby
    requirement: !ruby/object:Gem::Requirement
@@ -59,6 +59,22 @@ dependencies:
      - - ~>
        - !ruby/object:Gem::Version
          version: '1.0'
+ - !ruby/object:Gem::Dependency
+   name: fluent-mixin-config-placeholders
+   requirement: !ruby/object:Gem::Requirement
+     none: false
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: 0.2.0
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     none: false
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: 0.2.0
  - !ruby/object:Gem::Dependency
    name: rake
    requirement: !ruby/object:Gem::Requirement
@@ -75,6 +91,22 @@ dependencies:
      - - ! '>='
        - !ruby/object:Gem::Version
          version: 0.9.2
+ - !ruby/object:Gem::Dependency
+   name: flexmock
+   requirement: !ruby/object:Gem::Requirement
+     none: false
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: 1.2.0
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     none: false
+     requirements:
+     - - ! '>='
+       - !ruby/object:Gem::Version
+         version: 1.2.0
  description: Amazon S3 output plugin for Fluent event collector
  email: frsyuki@gmail.com
  executables: []
@@ -104,7 +136,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
        version: '0'
        segments:
        - 0
-       hash: -957364254525035982
+       hash: -741948344279215557
  required_rubygems_version: !ruby/object:Gem::Requirement
    none: false
    requirements:
@@ -113,7 +145,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
        version: '0'
        segments:
        - 0
-       hash: -957364254525035982
+       hash: -741948344279215557
  requirements: []
  rubyforge_project:
  rubygems_version: 1.8.23