fluent-plugin-s3 0.6.0 → 0.6.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 9a324977d9062b01736950c76f666d42161e93d6
-  data.tar.gz: 6d70b12fabf887f6ac991798ac4e5a0200c5afe5
+  metadata.gz: fd2f564114dd0e1f90168baa3f8ed983a9eba33d
+  data.tar.gz: 259cb9cbe6213eeaf6e2734706eaf0701cb17812
 SHA512:
-  metadata.gz: 23a7cb16fca1b7dbe7f7c785d3096d8670c1bc7af033685e57ebfe7b8bcecc225bd3ad55abb72e71514d983e7319bea13629df0f1f51638fffffd60ac23c2e27
-  data.tar.gz: 30b2f703798b58b3b866b1f3fe1bdcf64081c4c73eede9d570a631d4c297a7be662267c59bbd46a17c286047a00481898493e8bf1cefed2acf2c15c3fc55b00f
+  metadata.gz: 7437b11fcd3ac8ce9ce0aabc34b2959eee14f003ba5c54ec3e5feed94658f18094811490c2505e548c0f9ea00136b333a0f86a5c914729a1e9d31a85ed0535e6
+  data.tar.gz: 3037249250cfed2cbe07343f826097ecdfabf33a3e3ebb9dac7a10c574b8224f4fefeceb1ae5bde87409b74d87da84355559eb93fcd91d1a52539868ba0c5589
data/.gitignore CHANGED
@@ -9,3 +9,5 @@
 Gemfile.lock
 vendor
 .ruby-version
+
+test/tmp/
data/ChangeLog CHANGED
@@ -1,3 +1,10 @@
+Release 0.6.1 - 2015/10/30
+
+	* Fix server_side_encryption error
+	* Keep hex random identity on rebooting
+	* Fix Tempfile handling on Windows
+
+
 Release 0.6.0 - 2015/10/09
 
 	* Allow path based calling format
data/README.md ADDED
@@ -0,0 +1,465 @@
+# Amazon S3 output plugin for [Fluentd](http://github.com/fluent/fluentd)
+
+[<img src="https://travis-ci.org/fluent/fluent-plugin-s3.svg?branch=master" alt="Build Status" />](https://travis-ci.org/fluent/fluent-plugin-s3) [<img src="https://codeclimate.com/github/fluent/fluent-plugin-s3/badges/gpa.svg" />](https://codeclimate.com/github/fluent/fluent-plugin-s3)
+
+## Overview
+
+**s3** output plugin buffers event logs in a local file and uploads them to S3 periodically.
+
+This plugin splits files by the time of the event logs (not the time when the logs are received). For example, if a log '2011-01-02 message B' arrives, and then another log '2011-01-03 message B' arrives in this order, the former is stored in the "20110102.gz" file, and the latter in the "20110103.gz" file.
+
+## Installation
+
+Simply use RubyGems:
+
+    gem install fluent-plugin-s3
+
+## Configuration
+
+    <match pattern>
+      type s3
+
+      aws_key_id YOUR_AWS_KEY_ID
+      aws_sec_key YOUR_AWS_SECRET_KEY
+      s3_bucket YOUR_S3_BUCKET_NAME
+      s3_region ap-northeast-1
+      s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
+      path logs/
+      buffer_path /var/log/fluent/s3
+
+      time_slice_format %Y%m%d-%H
+      time_slice_wait 10m
+      utc
+    </match>
+
+**aws_key_id**
+
+AWS access key id. This parameter is required when your agent is not running on an EC2 instance with an IAM Role.
+
+**aws_sec_key**
+
+AWS secret key. This parameter is required when your agent is not running on an EC2 instance with an IAM Role.
+
+**aws_iam_retries**
+
+The number of attempts to make (with exponential backoff) when loading instance profile credentials from the EC2 metadata service using an IAM role. Defaults to 5 retries.
+
+**s3_bucket (required)**
+
+S3 bucket name.
+
+**s3_region**
+
+S3 region name. For example, the US West (Oregon) Region is "us-west-2". The full list of regions is available at http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We recommend using `s3_region` instead of `s3_endpoint`.
+
+**s3_endpoint**
+
+Endpoint for S3-compatible services, for example Riak CS based storage. This option doesn't work on S3; use `s3_region` instead.
+
+**s3_object_key_format**
+
+The format of S3 object keys. You can use several built-in variables:
+
+* %{path}
+* %{time_slice}
+* %{index}
+* %{file_extension}
+* %{uuid_flush}
+* %{hex_random}
+
+to decide keys dynamically.
+
+* %{path} is exactly the value of **path** configured in the configuration file. E.g., "logs/" in the example configuration above.
+* %{time_slice} is the time-slice in text, formatted with **time_slice_format**.
+* %{index} is a sequential number starting from 0, incremented when multiple files are uploaded to S3 in the same time slice.
+* %{file_extension} is always "gz" for now.
+* %{uuid_flush} is a uuid that is replaced every time the buffer is flushed.
+* %{hex_random} is a random hex string that is replaced for each buffer chunk, not assured to be unique. It follows the performance-tuning technique `Add a Hex Hash Prefix to Key Name` described in [Request Rate and Performance Considerations - Amazon Simple Storage Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html). You can configure the length of the string with the `hex_random_length` parameter (default: 4).
+
+The default format is `%{path}%{time_slice}_%{index}.%{file_extension}`.
+
+For instance, using the example configuration above, actual object keys on S3 will be something like:
+
+    "logs/20130111-22_0.gz"
+    "logs/20130111-23_0.gz"
+    "logs/20130111-23_1.gz"
+    "logs/20130112-00_0.gz"
+
+With the configuration:
+
+    s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}
+    path log
+    time_slice_format %Y%m%d-%H
+
+You get:
+
+    "log/events/ts=20130111-22/events_0.gz"
+    "log/events/ts=20130111-23/events_0.gz"
+    "log/events/ts=20130111-23/events_1.gz"
+    "log/events/ts=20130112-00/events_0.gz"
+
+The [fluent-mixin-config-placeholders](https://github.com/tagomoris/fluent-mixin-config-placeholders) mixin is also incorporated, so additional variables such as %{hostname}, %{uuid}, etc. can be used in the s3_object_key_format. This could prove useful in preventing filename conflicts when writing from multiple servers.
+
+    s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
+
+**force_path_style**
+
+:force_path_style (Boolean) — default: false — When set to true, the bucket name is always left in the request URI and never moved to the host as a sub-domain. See Plugins::S3BucketDns for more details.
+
+**store_as**
+
+Archive format on S3. You can use several formats:
+
+* gzip (default)
+* json
+* text
+* lzo (needs the lzop command)
+* lzma2 (needs the xz command)
+* gzip_command (needs the gzip command)
+  * This compressor uses an external gzip command and hence utilizes CPU cores better than `gzip`
+
+See the `Use your compression algorithm` section for adding another format.
+
+**format**
+
+Changes the one-line format in the S3 object. Supported formats are "out_file", "json", "ltsv" and "single_value".
+
+* out_file (default).
+
+      time\ttag\t{..json1..}
+      time\ttag\t{..json2..}
+      ...
+
+* json
+
+      {..json1..}
+      {..json2..}
+      ...
+
+  In this format, "time" and "tag" are omitted, but you can add this information to the record by setting the "include_tag_key" / "tag_key" and "include_time_key" / "time_key" options. If you set the following configuration in the S3 output:
+
+      format json
+      include_time_key true
+      time_key log_time # default is time
+
+  then the record has a log_time field.
+
+      {"log_time":"time string",...}
+
+* ltsv
+
+      key1:value1\tkey2:value2
+      key1:value1\tkey2:value2
+      ...
+
+  "ltsv" format also accepts the "include_xxx" related options. See the "json" section.
+
+* single_value
+
+  Uses the specified value instead of the entire record. If you get '{"message":"my log"}', then the contents are
+
+      my log1
+      my log2
+      ...
+
+  You can change the key name with the "message_key" option.
+
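As an illustrative sketch (the record key `log` here is hypothetical, not from the plugin), a single_value setup that writes just that field per line might look like:

    <match pattern>
      type s3
      # ...credentials, bucket and buffer settings as in the example above...
      format single_value
      message_key log
    </match>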
+**auto_create_bucket**
+
+Create the S3 bucket if it does not exist. Default is true.
+
+**check_apikey_on_start**
+
+Check the AWS keys on start. Default is true.
+
+**proxy_uri**
+
+URI of a proxy environment.
+
+**path**
+
+Path prefix of the files on S3. Default is "" (no prefix).
+
+**buffer_path (required)**
+
+Path prefix of the files used to buffer logs.
+
+**time_slice_format**
+
+Format of the time used in the file name. Default is '%Y%m%d'. Use '%Y%m%d%H' to split files hourly.
+
+**time_slice_wait**
+
+The time to wait for old logs. Default is 10 minutes. Specify a larger value if old logs may arrive late.
+
+**utc**
+
+Use UTC instead of local time.
+
+**reduced_redundancy**
+
+Use S3 reduced redundancy storage for 33% cheaper pricing. Default is false.
+
+**acl**
+
+Permission for the object in S3. This is useful for cross-account access using IAM roles. Valid values are:
+
+* private (default)
+* public_read
+* public_read_write (not recommended - see [Canned ACL](http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl))
+* authenticated_read
+* bucket_owner_read
+* bucket_owner_full_control
+
+To use cross-account access, you will need to create a bucket policy granting the specific access required. Refer to the [AWS documentation](http://docs.aws.amazon.com/AmazonS3/latest/dev/example-walkthroughs-managing-access-example3.html) for examples.
+
+**hex_random_length**
+
+The length of the `%{hex_random}` placeholder. Default is 4, as described in [Request Rate and Performance Considerations - Amazon Simple Storage Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html). The maximum length is 16.
+
+**overwrite**
+
+Overwrite an already existing path. Default is false, which raises an error if an S3 object at the same path already exists, or increments the `%{index}` placeholder until an absent path is found.
+
+### assume_role_credentials
+
+Typically, you use AssumeRole for cross-account access or federation.
+
+    <match *>
+      type s3
+
+      <assume_role_credentials>
+        role_arn ROLE_ARN
+        role_session_name ROLE_SESSION_NAME
+      </assume_role_credentials>
+    </match>
+
+See also:
+
+* [Using IAM Roles - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
+* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
+* [Aws::AssumeRoleCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)
+
+**role_arn (required)**
+
+The Amazon Resource Name (ARN) of the role to assume.
+
+**role_session_name (required)**
+
+An identifier for the assumed role session.
+
+**policy**
+
+An IAM policy in JSON format.
+
+**duration_seconds**
+
+The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value is set to 3600 seconds.
+
+**external_id**
+
+A unique identifier that is used by third parties when assuming roles in their customers' accounts.
+
+### instance_profile_credentials
+
+Retrieve temporary security credentials via an HTTP request. This is useful on an EC2 instance.
+
+    <match *>
+      type s3
+
+      <instance_profile_credentials>
+        ip_address IP_ADDRESS
+        port PORT
+      </instance_profile_credentials>
+    </match>
+
+See also:
+
+* [Aws::InstanceProfileCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
+* [Temporary Security Credentials - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
+* [Instance Metadata and User Data - Amazon Elastic Compute Cloud](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
+
+**retries**
+
+Number of times to retry when retrieving credentials. Default is 5.
+
+**ip_address**
+
+Default is 169.254.169.254.
+
+**port**
+
+Default is 80.
+
+**http_open_timeout**
+
+Default is 5.
+
+**http_read_timeout**
+
+Default is 5.
+
+### shared_credentials
+
+This loads AWS access credentials from a local ini file. This is useful for local development.
+
+    <match *>
+      type s3
+
+      <shared_credentials>
+        path PATH
+        profile_name PROFILE_NAME
+      </shared_credentials>
+    </match>
+
+See also:
+
+* [Aws::SharedCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)
+
+**path**
+
+Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".
+
+**profile_name**
+
+Defaults to 'default' or `ENV['AWS_PROFILE']`.
+
+
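For reference, the shared file uses the standard AWS credentials ini format; a minimal sketch with placeholder values:

    [default]
    aws_access_key_id = YOUR_AWS_KEY_ID
    aws_secret_access_key = YOUR_AWS_SECRET_KEY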
+## IAM Policy
+
+The following is an example of a minimal IAM policy needed to write to an s3 bucket (matches my-s3bucket/logs, my-s3bucket-test, etc.).
+
+    { "Statement": [
+     { "Effect":"Allow",
+       "Action":"s3:*",
+       "Resource":"arn:aws:s3:::my-s3bucket*"
+      } ]
+    }
+
+Note that the bucket must already exist and **auto_create_bucket** has no effect in this case.
+
+Refer to the [AWS documentation](http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html) for example policies.
+
+Using [IAM roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/WorkingWithRoles.html) with a properly configured IAM policy is preferred over embedding access keys on EC2 instances.
+
+## Use your compression algorithm
+
+The s3 plugin has a pluggable compression mechanism like Fluentd's input / output plugins. If you set 'store_as xxx', the s3 plugin searches for `fluent/plugin/s3_compressor_xxx.rb`. You can define your compression with the 'S3Output::Compressor' class. The Compressor API is:
+
+    module Fluent
+      class S3Output
+        class XXXCompressor < Compressor
+          S3Output.register_compressor('xxx', self)
+
+          # Used for the file extension
+          def ext
+            'xxx'
+          end
+
+          # Used for the file content type
+          def content_type
+            'application/x-xxx'
+          end
+
+          # chunk is a buffer chunk. tmp is the destination file for the upload
+          def compress(chunk, tmp)
+            # call a command or something
+          end
+        end
+      end
+    end
+
+See the bundled Compressor classes for more detail.
+
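To make the API concrete, here is a minimal hypothetical compressor that shells out to an external `bzip2` command, mirroring the structure of the bundled lzo/lzma2 compressors (the class name, registered name, and command are illustrative, not part of the plugin):

    module Fluent
      class S3Output
        class Bzip2Compressor < Compressor
          S3Output.register_compressor('bzip2', self)

          def ext
            'bz2'
          end

          def content_type
            'application/x-bzip2'
          end

          def compress(chunk, tmp)
            w = Tempfile.new("chunk-bzip2-tmp")
            w.binmode # avoid newline translation on Windows
            chunk.write_to(w)
            w.close
            # compress the buffered chunk into the upload destination
            system("bzip2 -c #{w.path} > #{tmp.path}")
          ensure
            w.close(true) rescue nil
          end
        end
      end
    end

With this file on the load path as `fluent/plugin/s3_compressor_bzip2.rb`, `store_as bzip2` would select it.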
+## Website, license, et al.
+
+| Web site          | http://fluentd.org/                        |
+|-------------------|--------------------------------------------|
+| Documents         | http://docs.fluentd.org/                   |
+| Source repository | http://github.com/fluent/fluent-plugin-s3  |
+| Discussion        | http://groups.google.com/group/fluentd     |
+| Author            | Sadayuki Furuhashi                         |
+| Copyright         | (c) 2011 FURUHASHI Sadayuki                |
+| License           | Apache License, Version 2.0                |
data/VERSION CHANGED
@@ -1 +1 @@
-0.6.0
+0.6.1
data/appveyor.yml ADDED
@@ -0,0 +1,25 @@
+version: '{build}'
+
+install:
+  - SET PATH=C:\Ruby%ruby_version%\bin;%PATH%
+  - "%devkit%\\devkitvars.bat"
+  - ruby --version
+  - gem --version
+  - bundle install
+build: off
+test_script:
+  - bundle exec rake test TESTOPTS=-v
+
+environment:
+  matrix:
+    - ruby_version: "22-x64"
+      devkit: C:\Ruby21-x64\DevKit
+    - ruby_version: "22"
+      devkit: C:\Ruby21\DevKit
+    - ruby_version: "21-x64"
+      devkit: C:\Ruby21-x64\DevKit
+    - ruby_version: "21"
+      devkit: C:\Ruby21\DevKit
+matrix:
+  allow_failures:
+    - ruby_version: "21"
data/lib/fluent/plugin/out_s3.rb CHANGED
@@ -59,6 +59,7 @@ module Fluent
     attr_reader :bucket
 
     include Fluent::Mixin::ConfigPlaceholders
+    MAX_HEX_RANDOM_LENGTH = 16
 
     def placeholders
       [:percent]
@@ -92,6 +93,10 @@ module Fluent
         }
       end
 
+      if @hex_random_length > MAX_HEX_RANDOM_LENGTH
+        raise ConfigError, "hex_random_length parameter must be less than or equal to #{MAX_HEX_RANDOM_LENGTH}"
+      end
+
       @storage_class = "REDUCED_REDUNDANCY" if @reduced_redundancy
       @values_for_s3_object_chunk = {}
     end
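The effect of the new guard, sketched as configuration (mirroring the test added below): values up to 16 are accepted, anything larger now fails at configure time instead of silently truncating:

    hex_random_length 16   # OK: at the maximum
    hex_random_length 17   # raises Fluent::ConfigError on startup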
@@ -102,7 +107,6 @@ module Fluent
       options[:region] = @s3_region if @s3_region
       options[:endpoint] = @s3_endpoint if @s3_endpoint
       options[:http_proxy] = @proxy_uri if @proxy_uri
-      options[:s3_server_side_encryption] = @use_server_side_encryption.to_sym if @use_server_side_encryption
       options[:force_path_style] = @force_path_style
 
       s3_client = Aws::S3::Client.new(options)
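This removal pairs with the put_options change below: in aws-sdk v2, server-side encryption is a per-request parameter of Object#put rather than a client constructor option, which matches the "Fix server_side_encryption error" ChangeLog entry. A minimal sketch of the corrected call pattern (bucket, key, and region are placeholders):

    require 'aws-sdk'

    s3 = Aws::S3::Resource.new(region: 'ap-northeast-1')
    obj = s3.bucket('my-s3bucket').object('logs/20151030_0.gz')
    obj.put(body: 'data', server_side_encryption: 'AES256') # SSE requested per put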
@@ -127,16 +131,16 @@ module Fluent
       begin
         path = @path_slicer.call(@path)
 
-        @values_for_s3_object_chunk[chunk.key] ||= {
-          "hex_random" => hex_random,
-          "uuid_flush" => uuid_random,
+        @values_for_s3_object_chunk[chunk.unique_id] ||= {
+          "hex_random" => hex_random(chunk),
         }
         values_for_s3_object_key = {
           "path" => path,
           "time_slice" => chunk.key,
           "file_extension" => @compressor.ext,
           "index" => i,
-        }.merge!(@values_for_s3_object_chunk[chunk.key])
+          "uuid_flush" => uuid_random,
+        }.merge!(@values_for_s3_object_chunk[chunk.unique_id])
 
         s3path = @s3_object_key_format.gsub(%r(%{[^}]+})) { |expr|
           values_for_s3_object_key[expr[2...expr.size-1]]
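Keying the memo by chunk.unique_id instead of chunk.key matters because several chunks can share one time slice (the key), while unique_id identifies the chunk itself, so %{hex_random} stays stable for a given chunk across retries. For reference, a standalone sketch of the placeholder substitution these values feed (the values are illustrative):

    values = { "path" => "logs/", "time_slice" => "20151030-12",
               "file_extension" => "gz", "index" => 0, "uuid_flush" => "0123abcd" }
    format = "%{path}%{time_slice}_%{index}.%{file_extension}"
    s3path = format.gsub(/%{[^}]+}/) { |expr| values[expr[2...expr.size - 1]].to_s }
    puts s3path # => "logs/20151030-12_0.gz"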
@@ -155,23 +159,43 @@ module Fluent
       end while @bucket.object(s3path).exists?
 
       tmp = Tempfile.new("s3-")
+      tmp.binmode
       begin
         @compressor.compress(chunk, tmp)
         tmp.rewind
-        log.debug { "out_s3: trying to write {object_id:#{chunk.object_id},time_slice:#{chunk.key}} to s3://#{@s3_bucket}/#{s3path}" }
-        @bucket.object(s3path).put(:body => tmp,
-                                   :content_type => @compressor.content_type,
-                                   :storage_class => @storage_class)
+        log.debug { "out_s3: write chunk: {key:#{chunk.key},tsuffix:#{tsuffix(chunk)}} to s3://#{@s3_bucket}/#{s3path}" }
+
+        put_options = {:body => tmp, :content_type => @compressor.content_type, :storage_class => @storage_class}
+        put_options[:server_side_encryption] = @use_server_side_encryption if @use_server_side_encryption
+        @bucket.object(s3path).put(put_options)
+
+        @values_for_s3_object_chunk.delete(chunk.unique_id)
       ensure
-        @values_for_s3_object_chunk.delete(chunk.key)
         tmp.close(true) rescue nil
       end
     end
 
     private
 
-    def hex_random
-      SecureRandom.hex(@hex_random_n)[0...@hex_random_length]
+    # tsuffix is the suffix that the file buffer's filename has
+    def tsuffix(chunk)
+      if chunk.is_a?(Fluent::FileBufferChunk)
+        unique_id = chunk.unique_id
+        tsuffix = unique_id[0...(unique_id.size/2)].unpack('C*').map {|x| x.to_s(16) }.join('') # size: 16
+      else
+        nil
+      end
+    end
+
+    def hex_random(chunk)
+      if chunk.is_a?(Fluent::FileBufferChunk)
+        # use tsuffix because its value is kept on retry, even after rebooting
+        tsuffix = tsuffix(chunk)
+        tsuffix.reverse! # tsuffix is like (time_sec, time_usec, rand) => reversing gives more randomness
+        tsuffix[0...@hex_random_length]
+      else
+        SecureRandom.hex(@hex_random_n)[0...@hex_random_length]
+      end
     end
 
     def ensure_bucket
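To make the tsuffix derivation concrete, here is a standalone sketch (not part of the diff) using the same 16-byte unique_id the test below stubs in; note that `to_s(16)` does not zero-pad, a quirk the code above shares:

    unique_id = "R&\xC3\xC4\xFB=I\xB1R&\xC3\xC4\xFB=I\xB1".b  # 16 bytes
    half      = unique_id[0...(unique_id.size / 2)]           # first 8 bytes
    tsuffix   = half.unpack('C*').map { |x| x.to_s(16) }.join
    puts tsuffix                # => "5226c3c4fb3d49b1"
    puts tsuffix.reverse[0...4] # => "1b94" -- %{hex_random} with the default length 4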
data/lib/fluent/plugin/s3_compressor_gzip_command.rb CHANGED
@@ -24,6 +24,7 @@ module Fluent
           chunk.path
         else
           w = Tempfile.new("chunk-gzip-tmp")
+          w.binmode
           chunk.write_to(w)
           w.close
           w.path
data/lib/fluent/plugin/s3_compressor_lzma2.rb CHANGED
@@ -20,6 +20,7 @@ module Fluent
 
       def compress(chunk, tmp)
         w = Tempfile.new("chunk-xz-tmp")
+        w.binmode
         chunk.write_to(w)
         w.close
 
data/lib/fluent/plugin/s3_compressor_lzo.rb CHANGED
@@ -20,6 +20,7 @@ module Fluent
 
       def compress(chunk, tmp)
         w = Tempfile.new("chunk-tmp")
+        w.binmode
         chunk.write_to(w)
         w.close
 
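The same one-line fix lands in all three compressors and corresponds to the "Fix Tempfile handling on Windows" ChangeLog entry: on Windows, Ruby opens files in text mode by default, so LF bytes in chunk data would be rewritten as CRLF and corrupt the compressed output. A minimal sketch of the behavior being guarded against:

    require 'tempfile'

    w = Tempfile.new("chunk-tmp")
    w.binmode             # no-op on Unix; prevents LF -> CRLF rewriting on Windows
    w.write("\x1f\x8b\n") # binary bytes now survive intact
    w.close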
data/test/test_out_s3.rb CHANGED
@@ -11,6 +11,10 @@ class S3OutputTest < Test::Unit::TestCase
     Fluent::Test.setup
   end
 
+  def teardown
+    Dir.glob('test/tmp/*').each {|file| FileUtils.rm_f(file) }
+  end
+
   CONFIG = %[
     aws_key_id test_key_id
     aws_sec_key test_sec_key
@@ -95,6 +99,16 @@ class S3OutputTest < Test::Unit::TestCase
     assert d.instance.force_path_style
   end
 
+  def test_configure_with_hex_random_length
+    conf = CONFIG.clone
+    assert_raise Fluent::ConfigError do
+      create_driver(conf + "\nhex_random_length 17\n")
+    end
+    assert_nothing_raised do
+      create_driver(conf + "\nhex_random_length 16\n")
+    end
+  end
+
   def test_path_slicing
     config = CONFIG.clone.gsub(/path\slog/, "path log/%Y/%m/%d")
     d = create_driver(config)
@@ -323,17 +337,36 @@ class S3OutputTest < Test::Unit::TestCase
       data
     end
     FileUtils.rm_f(s3_test_file_path)
+    Dir.glob('tmp/*').each {|file| FileUtils.rm_f(file) }
+  end
+
+  def test_write_with_custom_s3_object_key_format_containing_hex_random_placeholder_memory_buffer
+    hex = "012345"
+    mock(SecureRandom).hex(3) { hex }
+
+    config = CONFIG_TIME_SLICE.gsub(/%{hostname}/,"%{hex_random}") << "\nhex_random_length 5"
+    write_with_custom_s3_object_key_format_containing_hex_random_placeholder(config, hex[0...5])
+  end
+
+  def test_write_with_custom_s3_object_key_format_containing_hex_random_placeholder_file_buffer
+    tsuffix = "5226c3c4fb3d49b1"
+    any_instance_of(Fluent::FileBufferChunk) do |klass|
+      unique_id = "R&\xC3\xC4\xFB=I\xB1R&\xC3\xC4\xFB=I\xB1" # the unique_id corresponding to tsuffix
+      stub(klass).unique_id { unique_id }
+    end
+    hex = tsuffix.reverse
+
+    config = CONFIG_TIME_SLICE.gsub(/%{hostname}/,"%{hex_random}") << "\nhex_random_length 16"
+    config = config.gsub(/buffer_type memory/, "buffer_type file\nbuffer_path test/tmp/buf")
+    write_with_custom_s3_object_key_format_containing_hex_random_placeholder(config, hex)
   end
 
   # ToDo: need to test that hex_random does not change on retry, but it is difficult with
   # the current fluentd test helper because it does not provide a way to run with the same chunks
-  def test_write_with_custom_s3_object_key_format_containing_hex_random_placeholder
+  def write_with_custom_s3_object_key_format_containing_hex_random_placeholder(config, hex)
     # Partially mock the S3Bucket, not to make an actual connection to Amazon S3
     setup_mocks(true)
 
-    hex = "012345"
-    mock(SecureRandom).hex(3) { hex }
-
     # Assert content of event logs which are being sent to S3
     s3obj = stub(Aws::S3::Object.new(:bucket_name => "test_bucket",
                                      :key => "test",
@@ -345,11 +378,8 @@ class S3OutputTest < Test::Unit::TestCase
     s3obj.put(:body => tempfile,
               :content_type => "application/x-gzip",
              :storage_class => "STANDARD")
-    @s3_bucket.object("log/events/ts=20110102-13/events_0-#{hex[0...5]}.gz") { s3obj }
+    @s3_bucket.object("log/events/ts=20110102-13/events_0-#{hex}.gz") { s3obj }
 
-    # We must use TimeSlicedOutputTestDriver instead of BufferedOutputTestDriver,
-    # to make assertions on chunks' keys
-    config = CONFIG_TIME_SLICE.gsub(/%{hostname}/,"%{hex_random}") << "\nhex_random_length 5"
     d = create_time_sliced_driver(config)
 
     time = Time.parse("2011-01-02 13:14:15 UTC").to_i
metadata CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: fluent-plugin-s3
 version: !ruby/object:Gem::Version
-  version: 0.6.0
+  version: 0.6.1
 platform: ruby
 authors:
 - Sadayuki Furuhashi
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2015-10-09 00:00:00.000000000 Z
+date: 2015-10-29 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: fluentd
@@ -126,9 +126,10 @@ files:
 - AUTHORS
 - ChangeLog
 - Gemfile
-- README.rdoc
+- README.md
 - Rakefile
 - VERSION
+- appveyor.yml
 - fluent-plugin-s3.gemspec
 - lib/fluent/plugin/out_s3.rb
 - lib/fluent/plugin/s3_compressor_gzip_command.rb
@@ -155,7 +156,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.2.3
+rubygems_version: 2.4.5.1
 signing_key:
 specification_version: 4
 summary: Amazon S3 output plugin for Fluentd event collector
data/README.rdoc DELETED
@@ -1,319 +0,0 @@
-= Amazon S3 output plugin for {Fluentd}[http://github.com/fluent/fluentd]
-
-{<img src="https://travis-ci.org/fluent/fluent-plugin-s3.svg?branch=master" alt="Build Status" />}[https://travis-ci.org/fluent/fluent-plugin-s3] {<img src="https://codeclimate.com/github/fluent/fluent-plugin-s3/badges/gpa.svg" />}[https://codeclimate.com/github/fluent/fluent-plugin-s3]
-
-== Overview
-
-*s3* output plugin buffers event logs in local file and upload it to S3 periodically.
-
-This plugin splits files exactly by using the time of event logs (not the time when the logs are received). For example, a log '2011-01-02 message B' is reached, and then another log '2011-01-03 message B' is reached in this order, the former one is stored in "20110102.gz" file, and latter one in "20110103.gz" file.
-
-
-== Installation
-
-Simply use RubyGems:
-
-    gem install fluent-plugin-s3
-
-== Configuration
-
-    <match pattern>
-      type s3
-
-      aws_key_id YOUR_AWS_KEY_ID
-      aws_sec_key YOUR_AWS_SECRET_KEY
-      s3_bucket YOUR_S3_BUCKET_NAME
-      s3_region ap-northeast-1
-      s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}
-      path logs/
-      buffer_path /var/log/fluent/s3
-
-      time_slice_format %Y%m%d-%H
-      time_slice_wait 10m
-      utc
-    </match>
-
-[aws_key_id] AWS access key id. This parameter is required when your agent is not running on EC2 instance with an IAM Role.
-
-[aws_sec_key] AWS secret key. This parameter is required when your agent is not running on EC2 instance with an IAM Role.
-
-[aws_iam_retries] The number of attempts to make (with exponential backoff) when loading instance profile credentials from the EC2 metadata service using an IAM role. Defaults to 5 retries.
-
-[s3_bucket (required)] S3 bucket name.
-
-[s3_region] s3 region name. For example, US West (Oregon) Region is "us-west-2". The full list of regions are available here. > http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We recommend using `s3_region` instead of `s3_endpoint`.
-
-[s3_endpoint] endpoint for S3 compatible services. For example, Riak CS based storage or something. This option doesn't work on S3, use `s3_region` instead.
-
-[s3_object_key_format] The format of S3 object keys. You can use several built-in variables:
-
-  - %{path}
-  - %{time_slice}
-  - %{index}
-  - %{file_extension}
-  - %{uuid_flush}
-  - %{hex_random}
-
-  to decide keys dynamically.
-
-  %{path} is exactly the value of *path* configured in the configuration file. E.g., "logs/" in the example configuration above.
-  %{time_slice} is the time-slice in text that are formatted with *time_slice_format*.
-  %{index} is the sequential number starts from 0, increments when multiple files are uploaded to S3 in the same time slice.
-  %{file_extention} is always "gz" for now.
-  %{uuid_flush} a uuid that is replaced for each buffer chunk to be flushed
-  %{hex_random} a random hex string that is replaced for each buffer chunk, not assured to be unique. This is used to follow a way of peformance tuning, `Add a Hex Hash Prefix to Key Name`, written in [Request Rate and Performance Considerations - Amazon Simple Storage Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html). You can configure the length of string with a `hex_random_length` parameter (Default: 4).
-
-  The default format is "%{path}%{time_slice}_%{index}.%{file_extension}".
-
-  For instance, using the example configuration above, actual object keys on S3 will be something like:
-
-    "logs/20130111-22_0.gz"
-    "logs/20130111-23_0.gz"
-    "logs/20130111-23_1.gz"
-    "logs/20130112-00_0.gz"
-
-  With the configuration:
-
-    s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}
-    path log
-    time_slice_format %Y%m%d-%H
-
-  You get:
-
-    "log/events/ts=20130111-22/events_0.gz"
-    "log/events/ts=20130111-23/events_0.gz"
-    "log/events/ts=20130111-23/events_1.gz"
-    "log/events/ts=20130112-00/events_0.gz"
-
-  The {fluent-mixin-config-placeholders}[https://github.com/tagomoris/fluent-mixin-config-placeholders] mixin is also incorporated, so additional variables such as %{hostname}, %{uuid}, etc. can be used in the s3_object_key_format. This could prove useful in preventing filename conflicts when writing from multiple servers.
-
-    s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}-%{hostname}.%{file_extension}
-
-[force_path_style] :force_path_style (Boolean) — default: false — When set to true, the bucket name is always left in the request URI and never moved to the host as a sub-domain. See Plugins::S3BucketDns for more details.
-
-[store_as] archive format on S3. You can use serveral format:
-
-  - gzip (default)
-  - json
-  - text
-  - lzo (Need lzop command)
-  - lzma2 (Need xz command)
-  - gzip_command (Need gzip command)
-    - This compressor uses an external gzip command, hence would result in utilizing CPU cores well compared with `gzip`
-
-  See 'Use your compression algorithm' section for adding another format.
-
-[format] Change one line format in the S3 object. Supported formats are "out_file", "json", "ltsv" and "single_value".
-
-  - out_file (default).
-
-    time\ttag\t{..json1..}
-    time\ttag\t{..json2..}
-    ...
-
-  - json
-
-    {..json1..}
-    {..json2..}
-    ...
-
-    At this format, "time" and "tag" are omitted.
-    But you can set these information to the record by setting "include_tag_key" / "tag_key" and "include_time_key" / "time_key" option.
-    If you set following configuration in S3 output:
-
-      format json
-      include_time_key true
-      time_key log_time # default is time
-
-    then the record has log_time field.
-
-      {"log_time":"time string",...}
-
-  - ltsv
-
-    key1:value1\tkey2:value2
-    key1:value1\tkey2:value2
-    ...
-
-    "ltsv" format also accepts "include_xxx" related options. See "json" section.
-
-  - single_value
-
-    Use specified value instead of entire recode. If you get '{"message":"my log"}', then contents are
-
-      my log1
-      my log2
-      ...
-
-    You can change key name by "message_key" option.
-
-[auto_create_bucket] Create S3 bucket if it does not exists. Default is true.
-
-[check_apikey_on_start] Check AWS key on start. Default is true.
-
-[proxy_uri] uri of proxy environment.
-
-[path] path prefix of the files on S3. Default is "" (no prefix).
-
-[buffer_path (required)] path prefix of the files to buffer logs.
-
-[time_slice_format] Format of the time used as the file name. Default is '%Y%m%d'. Use '%Y%m%d%H' to split files hourly.
-
-[time_slice_wait] The time to wait old logs. Default is 10 minutes. Specify larger value if old logs may reache.
-
-[utc] Use UTC instead of local time.
-
-[reduced_redundancy] Use S3 reduced redundancy storage for 33% cheaper pricing. Default is false.
-
-[acl] Permission for the object in S3. This is useful for cross-account access using IAM roles. Valid values are:
-
-  - private (default)
-  - public_read
-  - public_read_write (not recommended - see {Canned ACL}[http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl])
-  - authenticated_read
-  - bucket_owner_read
-  - bucket_owner_full_control
-
-  To use cross-account access, you will need to create a bucket policy granting
-  the specific access required. Refer to the {AWS documentation}[http://docs.aws.amazon.com/AmazonS3/latest/dev/example-walkthroughs-managing-access-example3.html] for examples.
-
-[hex_random_length] The length of `%{hex_random}` placeholder. Default is 4 as written in [Request Rate and Performance Considerations - Amazon Simple Storage Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html).
-
-[overwrite] Overwrite already existing path. Default is false, which raises an error if a s3 object of the same path already exists, or increment the `%{index}` placeholder until finding an absent path.
-
-=== assume_role_credentials
-
-Typically, you use AssumeRole for cross-account access or federation.
-
-    <match *>
-      type s3
-
-      <assume_role_credentials>
-        role_arn ROLE_ARN
-        role_session_name ROLE_SESSION_NAME
-      </assume_role_credentials>
-    </match>
-
-See also:
-
-- {Using IAM Roles - AWS Identity and Access Management}[http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html]
-- {Aws::STS::Client}[http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html]
-- {Aws::AssumeRoleCredentials}[http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html]
-
-[role_arn (required)] The Amazon Resource Name (ARN) of the role to assume.
-
-[role_session_name (required)] An identifier for the assumed role session.
-
-[policy] An IAM policy in JSON format.
-
-[duration_seconds] The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value is set to 3600 seconds.
-
-[external_id] A unique identifier that is used by third parties when assuming roles in their customers' accounts.
-
-=== instance_profile_credentials
-
-Retrieve temporary security credentials via HTTP request. This is useful on EC2 instance.
-
-    <match *>
-      type s3
-
-      <instance_profile_credentials>
-        ip_address IP_ADDRESS
-        port PORT
-      </instance_profile_credentials>
-    </match>
-
-See also:
-
-- {Aws::InstanceProfileCredentials}[http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html]
-- {Temporary Security Credentials - AWS Identity and Access Management}[http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html]
-- {Instance Metadata and User Data - Amazon Elastic Compute Cloud}[http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html]
-
-[retries] Number of times to retry when retrieving credentials. Default is 5.
-
-[ip_address] Default is 169.254.169.254.
-
-[port] Default is 80.
-
-[http_open_timeout] Default is 5.
-
-[http_read_timeout] Default is 5.
-
-=== shared_credentials
-
-This loads AWS access credentials from local ini file. This is useful for local developing.
-
-    <match *>
-      type s3
-
-      <shared_credentials>
-        path PATH
-        profile_name PROFILE_NAME
-      </shared_credentials>
-    </match>
-
-See also:
-
-- {Aws::SharedCredentials}[http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html]
-
-[path] Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".
-
-[profile_name] Defaults to 'default' or `ENV['AWS_PROFILE']`.
-
-== IAM Policy
-
-The following is an example for a minimal IAM policy needed to write to an s3 bucket (matches my-s3bucket/logs, my-s3bucket-test, etc.).
-
-    { "Statement": [
-     { "Effect":"Allow",
-       "Action":"s3:*",
-       "Resource":"arn:aws:s3:::my-s3bucket*"
-      } ]
-    }
-
-Note that the bucket must already exist and *auto_create_bucket* has no effect in this case.
-
-Refer to the {AWS documentation}[http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html] for example policies.
-
-Using {IAM roles}[http://docs.aws.amazon.com/IAM/latest/UserGuide/WorkingWithRoles.html] with a properly configured IAM policy are preferred over embedding access keys on EC2 instances.
-
-== Use your compression algorithm
-
-s3 plugin has plugabble compression mechanizm like Fleuntd's input / output plugin.
-If you set 'store_as xxx', s3 plugin searches `fluent/plugin/s3_compressor_xxx.rb`.
-You can define your compression with 'S3Output::Compressor' class. Compressor API is here:
-
-    module Fluent
-      class S3Output
-        class XXXCompressor < Compressor
-          S3Output.register_compressor('xxx', self)
-
-          # Used to file extension
-          def ext
-            'xxx'
-          end
-
-          # Used to file content type
-          def content_type
-            'application/x-xxx'
-          end
-
-          # chunk is buffer chunk. tmp is destination file for upload
-          def compress(chunk, tmp)
-            # call command or something
-          end
-        end
-      end
-    end
-
-See bundled Compressor classes for more detail.
-
-== Website, license, et. al.
-
-Web site:: http://fluentd.org/
-Documents:: http://docs.fluentd.org/
-Source repository:: http://github.com/fluent
-Discussion:: http://groups.google.com/group/fluentd
-Author:: Sadayuki Furuhashi
-Copyright:: (c) 2011 FURUHASHI Sadayuki
-License:: Apache License, Version 2.0