fluent-plugin-s3 1.3.3 → 1.6.0
- checksums.yaml +4 -4
- data/.github/workflows/linux.yml +26 -0
- data/ChangeLog +21 -0
- data/Gemfile +0 -2
- data/README.md +11 -781
- data/VERSION +1 -1
- data/docs/credentials.md +171 -0
- data/docs/howto.md +92 -0
- data/docs/input.md +90 -0
- data/docs/output.md +445 -0
- data/docs/v0.12.md +52 -0
- data/lib/fluent/plugin/in_s3.rb +1 -1
- data/lib/fluent/plugin/out_s3.rb +38 -19
- data/lib/fluent/plugin/s3_compressor_parquet.rb +83 -0
- data/test/test_out_s3.rb +132 -8
- metadata +11 -7
- data/.travis.yml +0 -26
data/VERSION
CHANGED
@@ -1 +1 @@
-1.3.3
+1.6.0
data/docs/credentials.md
ADDED
@@ -0,0 +1,171 @@
# Configuration: credentials

Both the S3 input and output plugins provide several credential methods for authentication/authorization.

## AWS key and secret authentication

These parameters are required when your agent is not running on an EC2 instance with an IAM role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.

### aws_key_id

AWS access key id.

### aws_sec_key

AWS secret key.
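
For example, a minimal output configuration using static credentials might look like the following sketch (the bucket and region values are placeholders):

    <match *>
      @type s3

      aws_key_id YOUR_AWS_KEY_ID
      aws_sec_key YOUR_AWS_SECRET_KEY
      s3_bucket YOUR_S3_BUCKET_NAME
      s3_region ap-northeast-1
    </match>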

## \<assume_role_credentials\> section

Typically, you use AssumeRole for cross-account access or federation.

    <match *>
      @type s3

      <assume_role_credentials>
        role_arn ROLE_ARN
        role_session_name ROLE_SESSION_NAME
      </assume_role_credentials>
    </match>

See also:

* [Using IAM Roles - AWS Identity and Access
  Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
* [Aws::AssumeRoleCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)

### role_arn (required)

The Amazon Resource Name (ARN) of the role to assume.

### role_session_name (required)

An identifier for the assumed role session.

### policy

An IAM policy in JSON format.

### duration_seconds

The duration, in seconds, of the role session. The value can range from
900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value
is set to 3600 seconds.

### external_id

A unique identifier that is used by third parties when assuming roles in
their customers' accounts.

## \<web_identity_credentials\> section

Similar to `<assume_role_credentials>`, but for use in EKS.

    <match *>
      @type s3

      <web_identity_credentials>
        role_arn ROLE_ARN
        role_session_name ROLE_SESSION_NAME
        web_identity_token_file AWS_WEB_IDENTITY_TOKEN_FILE
      </web_identity_credentials>
    </match>

See also:

* [Using IAM Roles - AWS Identity and Access
  Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
* [IAM Roles For Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html)
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
* [Aws::AssumeRoleWebIdentityCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/AssumeRoleWebIdentityCredentials.html)

### role_arn (required)

The Amazon Resource Name (ARN) of the role to assume.

### role_session_name (required)

An identifier for the assumed role session.

### web_identity_token_file (required)

The absolute path to the file on disk containing the OIDC token.

### policy

An IAM policy in JSON format.

### duration_seconds

The duration, in seconds, of the role session. The value can range from
900 seconds (15 minutes) to 43200 seconds (12 hours). By default, the value
is set to 3600 seconds.

## \<instance_profile_credentials\> section

Retrieves temporary security credentials via an HTTP request. This is useful
on an EC2 instance.

    <match *>
      @type s3

      <instance_profile_credentials>
        ip_address IP_ADDRESS
        port PORT
      </instance_profile_credentials>
    </match>

See also:

* [Aws::InstanceProfileCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
* [Temporary Security Credentials - AWS Identity and Access
  Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
* [Instance Metadata and User Data - Amazon Elastic Compute
  Cloud](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)

### retries

Number of times to retry when retrieving credentials. Default is 5.

### ip_address

Default is 169.254.169.254.

### port

Default is 80.

### http_open_timeout

Default is 5.

### http_read_timeout

Default is 5.
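
The parameters above map directly onto the section; as a sketch, here they are with the documented defaults spelled out explicitly:

    <instance_profile_credentials>
      retries 5
      ip_address 169.254.169.254
      port 80
      http_open_timeout 5
      http_read_timeout 5
    </instance_profile_credentials>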

## \<shared_credentials\> section

This loads AWS access credentials from a local ini file. This is useful for
local development.

    <match *>
      @type s3

      <shared_credentials>
        path PATH
        profile_name PROFILE_NAME
      </shared_credentials>
    </match>

See also:

* [Aws::SharedCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)

### path

Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".

### profile_name

Defaults to 'default' or `ENV['AWS_PROFILE']`.
data/docs/howto.md
ADDED
@@ -0,0 +1,92 @@
# Object Metadata Added To Records

If the [`add_object_metadata`](input.md#add_object_metadata) option is set to true, then the name of the bucket
and the key for a given object will be added to each log record as [`s3_bucket`](input.md#s3_bucket)
and [`s3_key`](input.md#s3_key), respectively. This metadata can be used by filter plugins or other
downstream processors to better identify the source of a given record.

# IAM Policy

The following is an example of an IAM policy needed to write to an S3 bucket (matches my-s3bucket/logs, my-s3bucket-test, etc.).

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:ListBucket"
          ],
          "Resource": "arn:aws:s3:::my-s3bucket"
        },
        {
          "Effect": "Allow",
          "Action": [
            "s3:PutObject",
            "s3:GetObject"
          ],
          "Resource": "arn:aws:s3:::my-s3bucket/*"
        }
      ]
    }

Note that the bucket must already exist and **[`auto_create_bucket`](output.md#auto_create_bucket)** has no effect in this case.

`s3:GetObject` is needed for the object existence check to avoid overwriting objects.
If you set `check_object false`, `s3:GetObject` is not needed.

Refer to the [AWS
documentation](http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html) for example policies.

Using [IAM
roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/WorkingWithRoles.html)
with a properly configured IAM policy is preferred over embedding access keys
on EC2 instances.

## Example when `check_bucket false` and `check_object false`

With this configuration, fluentd works with the following minimal IAM policy:

    "Statement": [{
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": ["*"]
    }]
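
The corresponding output configuration is sketched below; the bucket name and region are placeholders, and both checks are disabled explicitly:

    <match pattern>
      @type s3

      s3_bucket YOUR_S3_BUCKET_NAME
      s3_region ap-northeast-1
      check_bucket false
      check_object false
    </match>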

# Use your (de)compression algorithm

The S3 plugin has a pluggable compression mechanism like Fluentd's input / output
plugins. If you set `store_as xxx`, the `out_s3` plugin searches for
`fluent/plugin/s3_compressor_xxx.rb` and the `in_s3` plugin searches for
`fluent/plugin/s3_extractor_xxx.rb`. You can define your (de)compression with
the `S3Output::Compressor`/`S3Input::Extractor` classes. The Compressor API is here:

    module Fluent # Since fluent-plugin-s3 v1.0.0 or later, use Fluent::Plugin instead of Fluent
      class S3Output
        class XXXCompressor < Compressor
          S3Output.register_compressor('xxx', self)

          # Used as the file extension
          def ext
            'xxx'
          end

          # Used as the file content type
          def content_type
            'application/x-xxx'
          end

          # chunk is the buffer chunk. tmp is the destination file for upload
          def compress(chunk, tmp)
            # call a command or something
          end
        end
      end
    end

`Extractor` is similar to `Compressor`.
See the bundled `Compressor`/`Extractor` classes for more detail.

data/docs/input.md
ADDED
@@ -0,0 +1,90 @@
# Input: Setup

1. Create a new [SQS](https://aws.amazon.com/documentation/sqs/) queue (use the same region as S3)
2. Set proper permissions on the new queue
3. [Configure S3 event notification](http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html)
4. Write a configuration file such as fluent.conf
5. Run fluentd

# Configuration: Input

See also [Configuration: credentials](credentials.md) for common comprehensive parameters.

    <source>
      @type s3

      aws_key_id YOUR_AWS_KEY_ID
      aws_sec_key YOUR_AWS_SECRET_KEY
      s3_bucket YOUR_S3_BUCKET_NAME
      s3_region ap-northeast-1
      add_object_metadata true

      <sqs>
        queue_name YOUR_SQS_QUEUE_NAME
      </sqs>
    </source>

## add_object_metadata

Whether or not object metadata should be added to the record. Defaults to `false`. See below for details.

## s3_bucket (required)

S3 bucket name.

## s3_region

S3 region name. For example, the US West (Oregon) Region is
"us-west-2". The full list of regions is available here:
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We
recommend using `s3_region` instead of `s3_endpoint`.

## store_as

Archive format on S3. You can use several formats:

* gzip (default)
* json
* text
* lzo (Need lzop command)
* lzma2 (Need xz command)
* gzip_command (Need gzip command)
  * This compressor uses an external gzip command, so it utilizes CPU cores better than `gzip`

See the [Use your compression algorithm](howto.md#use-your-compression-algorithm) section for adding another format.

## format

Parse a line as this format in the S3 object. Supported formats are
"apache_error", "apache2", "syslog", "json", "tsv", "ltsv", "csv",
"nginx" and "none".

## check_apikey_on_start

Check the AWS key on start. Default is true.

## proxy_uri

URI of the proxy environment.

## \<sqs\> section

### queue_name (required)

SQS queue name. The SQS queue must be created in the same region as the S3 bucket.

### queue_owner_aws_account_id

SQS owner account ID.

### skip_delete

When true, messages are not deleted after polling. Default is false.

### wait_time_seconds

The long polling interval. Default is 20.

### retry_error_interval

Interval to retry polling SQS if polling is unsuccessful, in seconds. Default is 300.
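
Putting these together, a sketch of an `<sqs>` section with the optional parameters set to their documented defaults:

    <sqs>
      queue_name YOUR_SQS_QUEUE_NAME
      skip_delete false
      wait_time_seconds 20
      retry_error_interval 300
    </sqs>
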
data/docs/output.md
ADDED
@@ -0,0 +1,445 @@
# Configuration: Output

Here is a sample configuration and the available parameters for fluentd v1 or later.
See also [Configuration: credentials](credentials.md) for common comprehensive parameters.

    <match pattern>
      @type s3

      aws_key_id YOUR_AWS_KEY_ID
      aws_sec_key YOUR_AWS_SECRET_KEY
      s3_bucket YOUR_S3_BUCKET_NAME
      s3_region ap-northeast-1

      path logs/${tag}/%Y/%m/%d/
      s3_object_key_format %{path}%{time_slice}_%{index}.%{file_extension}

      # if you want to use ${tag} or %Y/%m/%d/ like syntax in path / s3_object_key_format,
      # need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
      <buffer tag,time>
        @type file
        path /var/log/fluent/s3
        timekey 3600 # 1 hour partition
        timekey_wait 10m
        timekey_use_utc true # use utc
      </buffer>
      <format>
        @type json
      </format>
    </match>

For [`<buffer>`](https://docs.fluentd.org/configuration/buffer-section), you can use any record field in `path` / `s3_object_key_format`.

    path logs/${tag}/${foo}
    <buffer tag,foo>
      # parameters...
    </buffer>

See the official article for available parameters and placeholder usage in detail: [Config: Buffer Section](https://docs.fluentd.org/configuration/buffer-section#placeholders)

Note that this configuration doesn't work with fluentd v0.12. See [v0.12](v0.12.md) for the v0.12 style.

## aws_iam_retries

This parameter is deprecated. Use [instance_profile_credentials](credentials.md#instance_profile_credentials) instead.

The number of attempts to make (with exponential backoff) when loading
instance profile credentials from the EC2 metadata service using an IAM
role. Defaults to 5 retries.

## s3_bucket (required)

S3 bucket name.

## s3_region

S3 region name. For example, the US West (Oregon) Region is "us-west-2". The
full list of regions is available here:
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We
recommend using `s3_region` instead of [`s3_endpoint`](#s3_endpoint).

## s3_endpoint

Endpoint for S3-compatible services, for example Riak CS based storage.
This option is deprecated for AWS S3; use [`s3_region`](#s3_region) instead.

See also the AWS article: [Working with Regions](https://aws.amazon.com/blogs/developer/working-with-regions/).

## enable_transfer_acceleration

Enable [S3 Transfer Acceleration](https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html) for uploads. **IMPORTANT**: For this to work, you must first enable this feature on your destination S3 bucket.

## enable_dual_stack

Enable [Amazon S3 Dual-Stack Endpoints](https://docs.aws.amazon.com/AmazonS3/latest/dev/dual-stack-endpoints.html) for uploads. This makes it possible to use either IPv4 or IPv6 when connecting to S3.

## use_bundled_cert

For cases where the default SSL certificate is unavailable (e.g. Windows), you can set this option to true in order to use the AWS SDK bundled certificate. Default is false.

This fixes the following error often seen on Windows:

    SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (Seahorse::Client::NetworkingError)

## ssl_verify_peer

Verify the SSL certificate of the endpoint. Default is true. Set false when you want to ignore the endpoint SSL certificate.

## s3_object_key_format

The format of S3 object keys. You can use several built-in variables:

* %{path}
* %{time_slice}
* %{index}
* %{file_extension}
* %{hex_random}
* %{uuid_flush}
* %{hostname}

to decide keys dynamically.

* %{path} is exactly the value of **path** configured in the configuration file.
  E.g., "logs/" in the example configuration above.
* %{time_slice} is the
  time slice in text that is formatted with **time_slice_format**.
* %{index} is the sequential number starting from 0, incremented when multiple files are uploaded to S3 in the same time slice.
* %{file_extension} depends on the **store_as** parameter.
* %{uuid_flush} is a uuid that is replaced every time the buffer is flushed.
* %{hostname} is replaced with the `Socket.gethostname` result.
* %{hex_random} is a random hex string that is replaced for each buffer chunk, not
  assured to be unique. This is used to follow a way of performance tuning, `Add
  a Hex Hash Prefix to Key Name`, written in [Request Rate and Performance
  Considerations - Amazon Simple Storage
  Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html).
  You can configure the length of the string with the
  `hex_random_length` parameter (Default: 4).

The default format is `%{path}%{time_slice}_%{index}.%{file_extension}`.
In addition, you can use [buffer placeholders](https://docs.fluentd.org/configuration/buffer-section#placeholders) in this parameter,
so you can embed tag, time and record values like below:

    s3_object_key_format %{path}/events/%Y%m%d/${tag}_%{index}.%{file_extension}
    <buffer tag,time>
      # buffer parameters...
    </buffer>

For instance, using the example configuration above, actual object keys on S3
will be something like:

    "logs/20130111-22_0.gz"
    "logs/20130111-23_0.gz"
    "logs/20130111-23_1.gz"
    "logs/20130112-00_0.gz"

With the configuration:

    s3_object_key_format %{path}/events/ts=%{time_slice}/events_%{index}.%{file_extension}
    path log
    time_slice_format %Y%m%d-%H

You get:

    "log/events/ts=20130111-22/events_0.gz"
    "log/events/ts=20130111-23/events_0.gz"
    "log/events/ts=20130111-23/events_1.gz"
    "log/events/ts=20130112-00/events_0.gz"

NOTE: The ${hostname} placeholder is deprecated since v0.8. You can get the same result by using the [configuration's embedded ruby code feature](https://docs.fluentd.org/configuration/config-file#embedded-ruby-code).

    s3_object_key_format %{path}%{time_slice}_%{hostname}%{index}.%{file_extension}
    s3_object_key_format "%{path}%{time_slice}_#{Socket.gethostname}%{index}.%{file_extension}"

The above two configurations are the same. The important point is that wrapping with `""` is needed for `#{Socket.gethostname}`.

## force_path_style

When set to true, the bucket name is always left in the request URI and
never moved to the host as a sub-domain. Default is false.
See Plugins::S3BucketDns for more details.

This parameter is deprecated. See the AWS announcement: https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/

## store_as

Archive format on S3. You can use several formats:

* gzip (default)
* json
* text
* lzo (Need lzop command)
* lzma2 (Need xz command)
* gzip_command (Need gzip command)
  * This compressor uses an external gzip command, so it utilizes
    CPU cores better than `gzip`
* parquet (Need columnify command)
  * This compressor uses an external [columnify](https://github.com/reproio/columnify) command.
  * Use the [`<compress>`](#compress-for-parquet-compressor-only) section to configure the columnify command behavior.

See the [Use your compression algorithm](howto.md#use-your-compression-algorithm) section for adding another format.

## \<compress\> (for parquet compressor only) section

### parquet_compression_codec

parquet compression codec.

* uncompressed
* snappy (default)
* gzip
* lzo (unsupported by columnify)
* brotli (unsupported by columnify)
* lz4 (unsupported by columnify)
* zstd

### parquet_page_size

parquet file page size. default: 8192 bytes

### parquet_row_group_size

parquet file row group size. default: 128 MB

### record_type

record data format type.

* avro
* csv
* jsonl
* msgpack (default)
* tsv
* json

### schema_type

schema type.

* avro (default)
* bigquery

### schema_file (required)

path to schema file.
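
For example, a sketch of a parquet configuration using the parameters above (the schema path is a placeholder):

    store_as parquet
    <compress>
      parquet_compression_codec snappy
      record_type jsonl
      schema_type avro
      schema_file /path/to/schema.avsc
    </compress>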

## \<format\> section

Change the one-line format in the S3 object. Supported formats are "out_file",
"json", "ltsv", "single_value" and other formatter plugins. See also the [official Formatter article](https://docs.fluentd.org/formatter).

* out_file (default).

      time\ttag\t{..json1..}
      time\ttag\t{..json2..}
      ...

* json

      {..json1..}
      {..json2..}
      ...

  In this format, "time" and "tag" are omitted. But you can add this
  information to the record by setting the `<inject>` section. If you set the following configuration in
  the S3 output:

      <format>
        @type json
      </format>
      <inject>
        time_key log_time
      </inject>

  then the record has a log_time field.

      {"log_time":"time string",...}

  See also the [official Inject Section article](https://docs.fluentd.org/configuration/inject-section).

* ltsv

      key1:value1\tkey2:value2
      key1:value1\tkey2:value2
      ...

* single_value

  Use the specified value instead of the entire record. If you get '{"message":"my
  log"}', then the contents are

      my log1
      my log2
      ...

  You can change the key name with the "message_key" option, as sketched after this list.
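
A sketch combining single_value with a custom key (the key name here is a placeholder):

    <format>
      @type single_value
      message_key my_log_field
    </format>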

## auto_create_bucket

Create the S3 bucket if it does not exist. Default is true.

## check_bucket

Check whether the specified bucket exists in AWS or not. Default is true.

When it is false, fluentd will not check AWS S3 for the existence of the specified bucket.
This is the case where the bucket is pre-created before running fluentd.

## check_object

Check whether the object exists or not before creation. Default is true.

When it is false, s3_object_key_format will be %{path}%{time_slice}_%{hms_slice}.%{file_extension} by default, where
hms_slice is the time slice in hhmmss format, so that each object will be unique.
Example object name, assuming it is created on 2016/16/11 3:30:54 PM: 20161611_153054.txt (the extension can be anything as per the user's choice).

## check_apikey_on_start

Check the AWS key on start. Default is true.

## proxy_uri

URI of the proxy environment.

## path

Path prefix of the files on S3. Default is "" (no prefix).
[Buffer placeholders](https://docs.fluentd.org/configuration/buffer-section#placeholders) are supported,
so you can embed tag, time and record values like below.

    path logs/%Y%m%d/${tag}/
    <buffer tag,time>
      # buffer parameters...
    </buffer>

## utc

Use UTC instead of local time.

## storage_class

Set the storage class. Possible values are `STANDARD`, `REDUCED_REDUNDANCY`, `STANDARD_IA` from the [Ruby SDK](http://docs.aws.amazon.com/sdkforruby/api/Aws/S3/Object.html#storage_class-instance_method).
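
For example, to keep infrequently accessed logs in Standard-IA (a sketch; pick the class that fits your access pattern):

    storage_class STANDARD_IA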

## reduced_redundancy

Use S3 reduced redundancy storage for 33% cheaper pricing. Default is
false.

This is deprecated. Use `storage_class REDUCED_REDUNDANCY` instead.

## acl

Permission for the object in S3. This is useful for cross-account access
using IAM roles. Valid values are:

* private (default)
* public-read
* public-read-write (not recommended - see [Canned
  ACL](http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl))
* authenticated-read
* bucket-owner-read
* bucket-owner-full-control

To use cross-account access, you will need to create a bucket policy granting
the specific access required. Refer to the [AWS
documentation](http://docs.aws.amazon.com/AmazonS3/latest/dev/example-walkthroughs-managing-access-example3.html) for examples.
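
For instance, when writing into a bucket owned by another account, a common sketch is to hand control of the objects to the bucket owner:

    acl bucket-owner-full-control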

## grant_full_control

Allows the grantee READ, READ_ACP, and WRITE_ACP permissions on the object.
This is useful for cross-account access using IAM roles.

Valid values are `id="Grantee-CanonicalUserID"`. Please specify the grantee's canonical user ID.

e.g. `id="79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be"`

Note that a canonical user ID is different from an AWS account ID.
Please refer to the [AWS documentation](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html) for more details.

## grant_read

Allows the grantee to read the object data and its metadata.
Valid values are `id="Grantee-CanonicalUserID"`. Please specify the grantee's canonical user ID.

e.g. `id="79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be"`

## grant_read_acp

Allows the grantee to read the object ACL.
Valid values are `id="Grantee-CanonicalUserID"`. Please specify the grantee's canonical user ID.

e.g. `id="79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be"`

## grant_write_acp

Allows the grantee to write the ACL for the applicable object.
Valid values are `id="Grantee-CanonicalUserID"`. Please specify the grantee's canonical user ID.

e.g. `id="79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be"`

## hex_random_length

The length of the `%{hex_random}` placeholder. Default is 4 as written in
[Request Rate and Performance Considerations - Amazon Simple Storage
Service](https://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html).
The maximum length is 16.
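
A sketch using a longer hex prefix in the object key:

    hex_random_length 8
    s3_object_key_format %{hex_random}/%{path}%{time_slice}_%{index}.%{file_extension}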

## index_format

`%{index}` is formatted by [sprintf](http://ruby-doc.org/core-2.2.0/Kernel.html#method-i-sprintf) using this format_string. Default is '%d'. Zero padding is supported, e.g. `%04d`, to ensure a minimum length of four digits. `%{index}` can be in lowercase or uppercase hex using '%x' or '%X'.
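
For example, to zero-pad the index to four digits:

    index_format %04d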

## overwrite

Overwrite an already existing path. Default is false, which raises an error
if an S3 object with the same path already exists, or increments the
`%{index}` placeholder until finding an absent path.

## use_server_side_encryption

The server-side encryption algorithm used when storing this object in S3
(e.g., AES256, aws:kms).

## ssekms_key_id

Specifies the AWS KMS key ID to use for object encryption. You have to
set "aws:kms" for [`use_server_side_encryption`](#use_server_side_encryption) to use KMS encryption.

## sse_customer_algorithm

Specifies the algorithm to use when encrypting the object (e.g., AES256).

## sse_customer_key

Specifies the AWS KMS key ID to use for object encryption.

## sse_customer_key_md5

Specifies the 128-bit MD5 digest of the encryption key according to RFC 1321.

## compute_checksums

The AWS SDK uses MD5 for API requests/responses by default. On a FIPS-enabled environment,
OpenSSL returns an error because MD5 is disabled. If you want to use
this plugin on a FIPS-enabled environment, set `compute_checksums false`.

## signature_version

Signature version for API requests. `s3` means signature version 2 and
`v4` means signature version 4. Default is `nil` (following the SDK's default).
It would be useful when you use S3-compatible storage that accepts only signature version 2.
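
For example, when targeting an S3-compatible store that only supports signature version 2, a sketch is:

    signature_version s3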

## warn_for_delay

Given a threshold to treat events as delayed, output warning logs if delayed events were put into S3.

## \<bucket_lifecycle_rule\> section

Specify one or more lifecycle rules for the bucket.

    <bucket_lifecycle_rule>
      id UNIQUE_ID_FOR_THE_RULE
      prefix OPTIONAL_PREFIX # Objects whose keys begin with this prefix will be affected by the rule. If not specified all objects of the bucket will be affected
      expiration_days NUMBER_OF_DAYS # The number of days before the object will expire
    </bucket_lifecycle_rule>