fluent-plugin-s3 1.5.1 → 1.7.0
- checksums.yaml +4 -4
- data/.github/ISSUE_TEMPLATE/bug_report.yaml +72 -0
- data/.github/ISSUE_TEMPLATE/config.yml +5 -0
- data/.github/ISSUE_TEMPLATE/feature_request.yaml +38 -0
- data/.github/workflows/linux.yml +5 -3
- data/.github/workflows/stale-actions.yml +22 -0
- data/ChangeLog +15 -0
- data/README.md +13 -781
- data/VERSION +1 -1
- data/docs/credentials.md +171 -0
- data/docs/howto.md +92 -0
- data/docs/input.md +98 -0
- data/docs/output.md +453 -0
- data/docs/v0.12.md +52 -0
- data/fluent-plugin-s3.gemspec +3 -0
- data/lib/fluent/plugin/in_s3.rb +26 -1
- data/lib/fluent/plugin/out_s3.rb +12 -3
- data/lib/fluent/plugin/s3_compressor_parquet.rb +83 -0
- data/test/test_in_s3.rb +108 -5
- data/test/test_out_s3.rb +167 -118
- metadata +28 -7
- data/.travis.yml +0 -24
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
1.5.1
|
1
|
+
1.7.0
|
data/docs/credentials.md
ADDED
@@ -0,0 +1,171 @@
|
|
1
|
+
# Configuration: credentials
|
2
|
+
|
3
|
+
Both the S3 input and output plugins provide several credential methods for authentication/authorization.
|
4
|
+
|
5
|
+
## AWS key and secret authentication
|
6
|
+
|
7
|
+
These parameters are required when your agent is not running on an EC2 instance with an IAM role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.
|
8
|
+
|
9
|
+
### aws_key_id
|
10
|
+
|
11
|
+
AWS access key id.
|
12
|
+
|
13
|
+
### aws_sec_key
|
14
|
+
|
15
|
+
AWS secret key.
|
16
|
+
|
17
|
+
## \<assume_role_credentials\> section
|
18
|
+
|
19
|
+
Typically, you use AssumeRole for cross-account access or federation.
|
20
|
+
|
21
|
+
<match *>
|
22
|
+
@type s3
|
23
|
+
|
24
|
+
<assume_role_credentials>
|
25
|
+
role_arn ROLE_ARN
|
26
|
+
role_session_name ROLE_SESSION_NAME
|
27
|
+
</assume_role_credentials>
|
28
|
+
</match>
|
29
|
+
|
30
|
+
See also:
|
31
|
+
|
32
|
+
* [Using IAM Roles - AWS Identity and Access
|
33
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
|
34
|
+
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
|
35
|
+
* [Aws::AssumeRoleCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)
|
36
|
+
|
37
|
+
### role_arn (required)
|
38
|
+
|
39
|
+
The Amazon Resource Name (ARN) of the role to assume.
|
40
|
+
|
41
|
+
### role_session_name (required)
|
42
|
+
|
43
|
+
An identifier for the assumed role session.
|
44
|
+
|
45
|
+
### policy
|
46
|
+
|
47
|
+
An IAM policy in JSON format.
|
48
|
+
|
49
|
+
### duration_seconds
|
50
|
+
|
51
|
+
The duration, in seconds, of the role session. The value can range from
|
52
|
+
900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value
|
53
|
+
is set to 3600 seconds.
|
54
|
+
|
55
|
+
### external_id
|
56
|
+
|
57
|
+
A unique identifier that is used by third parties when assuming roles in
|
58
|
+
their customers' accounts.
|
59
|
+
|
60
|
+
## \<web_identity_credentials\> section
|
61
|
+
|
62
|
+
Similar to `assume_role_credentials`, but for use in EKS.
|
63
|
+
|
64
|
+
<match *>
|
65
|
+
@type s3
|
66
|
+
|
67
|
+
<web_identity_credentials>
|
68
|
+
role_arn ROLE_ARN
|
69
|
+
role_session_name ROLE_SESSION_NAME
|
70
|
+
web_identity_token_file AWS_WEB_IDENTITY_TOKEN_FILE
|
71
|
+
</web_identity_credentials>
|
72
|
+
</match>
|
73
|
+
|
74
|
+
See also:
|
75
|
+
|
76
|
+
* [Using IAM Roles - AWS Identity and Access
|
77
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
|
78
|
+
* [IAM Roles For Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html)
|
79
|
+
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
|
80
|
+
* [Aws::AssumeRoleWebIdentityCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/AssumeRoleWebIdentityCredentials.html)
|
81
|
+
|
82
|
+
### role_arn (required)
|
83
|
+
|
84
|
+
The Amazon Resource Name (ARN) of the role to assume.
|
85
|
+
|
86
|
+
### role_session_name (required)
|
87
|
+
|
88
|
+
An identifier for the assumed role session.
|
89
|
+
|
90
|
+
### web_identity_token_file (required)
|
91
|
+
|
92
|
+
The absolute path to the file on disk containing the OIDC token.
|
93
|
+
|
94
|
+
### policy
|
95
|
+
|
96
|
+
An IAM policy in JSON format.
|
97
|
+
|
98
|
+
### duration_seconds
|
99
|
+
|
100
|
+
The duration, in seconds, of the role session. The value can range from
|
101
|
+
900 seconds (15 minutes) to 43200 seconds (12 hours). By default, the value
|
102
|
+
is set to 3600 seconds.
|
103
|
+
|
104
|
+
|
105
|
+
## \<instance_profile_credentials\> section
|
106
|
+
|
107
|
+
Retrieve temporary security credentials via HTTP request. This is useful on
|
108
|
+
an EC2 instance.
|
109
|
+
|
110
|
+
<match *>
|
111
|
+
@type s3
|
112
|
+
|
113
|
+
<instance_profile_credentials>
|
114
|
+
ip_address IP_ADDRESS
|
115
|
+
port PORT
|
116
|
+
</instance_profile_credentials>
|
117
|
+
</match>
|
118
|
+
|
119
|
+
See also:
|
120
|
+
|
121
|
+
* [Aws::InstanceProfileCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
|
122
|
+
* [Temporary Security Credentials - AWS Identity and Access
|
123
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
|
124
|
+
* [Instance Metadata and User Data - Amazon Elastic Compute
|
125
|
+
Cloud](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
|
126
|
+
|
127
|
+
### retries
|
128
|
+
|
129
|
+
Number of times to retry when retrieving credentials. Default is 5.
|
130
|
+
|
131
|
+
### ip_address
|
132
|
+
|
133
|
+
Default is 169.254.169.254.
|
134
|
+
|
135
|
+
### port
|
136
|
+
|
137
|
+
Default is 80.
|
138
|
+
|
139
|
+
### http_open_timeout
|
140
|
+
|
141
|
+
Default is 5.
|
142
|
+
|
143
|
+
### http_read_timeout
|
144
|
+
|
145
|
+
Default is 5.
|
146
|
+
|
147
|
+
## \<shared_credentials\> section
|
148
|
+
|
149
|
+
This loads AWS access credentials from a local ini file. This is useful for
|
150
|
+
local development.
|
151
|
+
|
152
|
+
<match *>
|
153
|
+
@type s3
|
154
|
+
|
155
|
+
<shared_credentials>
|
156
|
+
path PATH
|
157
|
+
profile_name PROFILE_NAME
|
158
|
+
</shared_credentials>
|
159
|
+
</match>
|
160
|
+
|
161
|
+
See also:
|
162
|
+
|
163
|
+
* [Aws::SharedCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)
|
164
|
+
|
165
|
+
### path
|
166
|
+
|
167
|
+
Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".
|
168
|
+
|
169
|
+
### profile_name
|
170
|
+
|
171
|
+
Defaults to 'default' or `ENV['AWS_PROFILE']`.
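The defaults for `path` and `profile_name` can be sketched in plain Ruby as below. The helper names are hypothetical and are not part of fluent-plugin-s3 (the plugin hands these values to `Aws::SharedCredentials`). The precedence shown, with an explicitly configured value first, then `AWS_PROFILE`, then 'default', is an assumption for illustration.

```ruby
# Hypothetical helpers illustrating the documented defaults.
# They are NOT part of fluent-plugin-s3 or the AWS SDK.

# Default location of the shared credentials file.
def default_credentials_path
  File.join(Dir.home, '.aws', 'credentials')
end

# Assumed precedence: configured value, then AWS_PROFILE, then 'default'.
def resolve_profile_name(configured = nil, env = ENV)
  configured || env['AWS_PROFILE'] || 'default'
end

puts default_credentials_path
puts resolve_profile_name(nil, { 'AWS_PROFILE' => 'staging' })
```

Passing the environment as an argument keeps the sketch easy to try without changing the real `AWS_PROFILE`.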
|
data/docs/howto.md
ADDED
@@ -0,0 +1,92 @@
|
|
1
|
+
# Object Metadata Added To Records
|
2
|
+
|
3
|
+
If the [`add_object_metadata`](input.md#add_object_metadata) option is set to true, then the name of the bucket
|
4
|
+
and the key for a given object will be added to each log record as [`s3_bucket`](input.md#s3_bucket)
|
5
|
+
and [`s3_key`](input.md#s3_key), respectively. This metadata can be used by filter plugins or other
|
6
|
+
downstream processors to better identify the source of a given record.
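As a rough illustration of the effect on a record, here is a minimal Ruby sketch. The helper `with_object_metadata` and the sample bucket and key values are hypothetical and are not taken from the plugin's source; only the `s3_bucket` and `s3_key` field names come from the description above.

```ruby
# Hypothetical helper (not the plugin's actual code): returns a copy of the
# record with the bucket name and object key attached when the option is on.
def with_object_metadata(record, bucket:, key:, add_object_metadata: false)
  return record unless add_object_metadata

  record.merge('s3_bucket' => bucket, 's3_key' => key)
end

record = { 'message' => 'GET /index.html 200' }
tagged = with_object_metadata(
  record,
  bucket: 'my-s3bucket',
  key: 'logs/2021/app.log.gz',
  add_object_metadata: true
)

# A downstream filter could now select records by where they came from.
from_logs = tagged['s3_key'].start_with?('logs/')
puts tagged, from_logs
```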
|
7
|
+
|
8
|
+
# IAM Policy
|
9
|
+
|
10
|
+
The following is an example of an IAM policy needed to write to an s3 bucket (matches my-s3bucket/logs, my-s3bucket-test, etc.).
|
11
|
+
|
12
|
+
{
|
13
|
+
"Version": "2012-10-17",
|
14
|
+
"Statement": [
|
15
|
+
{
|
16
|
+
"Effect": "Allow",
|
17
|
+
"Action": [
|
18
|
+
"s3:ListBucket"
|
19
|
+
],
|
20
|
+
"Resource": "arn:aws:s3:::my-s3bucket"
|
21
|
+
},
|
22
|
+
{
|
23
|
+
"Effect": "Allow",
|
24
|
+
"Action": [
|
25
|
+
"s3:PutObject",
|
26
|
+
"s3:GetObject"
|
27
|
+
],
|
28
|
+
"Resource": "arn:aws:s3:::my-s3bucket/*"
|
29
|
+
}
|
30
|
+
]
|
31
|
+
}
|
32
|
+
|
33
|
+
Note that the bucket must already exist and **[`auto_create_bucket`](output.md#auto_create_bucket)** has no effect in this case.
|
34
|
+
|
35
|
+
`s3:GetObject` is needed for the object check, which prevents existing objects from being overwritten.
|
36
|
+
If you set `check_object false`, `s3:GetObject` is not needed.
|
37
|
+
|
38
|
+
Refer to the [AWS
|
39
|
+
documentation](http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html) for example policies.
|
40
|
+
|
41
|
+
Using [IAM
|
42
|
+
roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/WorkingWithRoles.html)
|
43
|
+
with a properly configured IAM policy are preferred over embedding access keys
|
44
|
+
on EC2 instances.
|
45
|
+
|
46
|
+
## Example when `check_bucket false` and `check_object false`
|
47
|
+
|
48
|
+
With this configuration, fluentd works with the
|
49
|
+
minimum IAM policy, like:
|
50
|
+
|
51
|
+
|
52
|
+
"Statement": [{
|
53
|
+
"Effect": "Allow",
|
54
|
+
"Action": "s3:PutObject",
|
55
|
+
"Resource": ["*"]
|
56
|
+
}]
|
57
|
+
|
58
|
+
|
59
|
+
# Use your (de)compression algorithm
|
60
|
+
|
61
|
+
The s3 plugin has a pluggable compression mechanism, like Fluentd's input / output
|
62
|
+
plugin. If you set `store_as xxx`, the `out_s3` plugin searches
|
63
|
+
`fluent/plugin/s3_compressor_xxx.rb` and the `in_s3` plugin searches
|
64
|
+
`fluent/plugin/s3_extractor_xxx.rb`. You can define your (de)compression with
|
65
|
+
`S3Output::Compressor`/`S3Input::Extractor` classes. The Compressor API is as follows:
|
66
|
+
|
67
|
+
module Fluent # Since fluent-plugin-s3 v1.0.0 or later, use Fluent::Plugin instead of Fluent
|
68
|
+
class S3Output
|
69
|
+
class XXXCompressor < Compressor
|
70
|
+
S3Output.register_compressor('xxx', self)
|
71
|
+
|
72
|
+
# Used as the file extension
|
73
|
+
def ext
|
74
|
+
'xxx'
|
75
|
+
end
|
76
|
+
|
77
|
+
# Used as the file content type
|
78
|
+
def content_type
|
79
|
+
'application/x-xxx'
|
80
|
+
end
|
81
|
+
|
82
|
+
# chunk is the buffer chunk; tmp is the destination file for upload
|
83
|
+
def compress(chunk, tmp)
|
84
|
+
# call command or something
|
85
|
+
end
|
86
|
+
end
|
87
|
+
end
|
88
|
+
end
|
89
|
+
|
90
|
+
`Extractor` is similar to `Compressor`.
|
91
|
+
See bundled `Compressor`/`Extractor` classes for more detail.
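To make the `compress(chunk, tmp)` contract more concrete, here is a self-contained sketch that gzips a chunk into a temporary file using only the Ruby standard library. `FakeChunk` is a hypothetical stand-in for a Fluentd buffer chunk, assumed here to expose `write_to(io)`. In a real compressor this method body would live inside a `Compressor` subclass as outlined above.

```ruby
require 'zlib'
require 'tempfile'

# Hypothetical stand-in for a Fluentd buffer chunk; assumed to expose
# write_to(io), which streams the chunk content into the given IO.
class FakeChunk
  def initialize(data)
    @data = data
  end

  def write_to(io)
    io.write(@data)
  end
end

# Sketch of a compress(chunk, tmp) body that gzips the chunk into tmp.
def compress(chunk, tmp)
  writer = Zlib::GzipWriter.new(tmp)
  chunk.write_to(writer)
  writer.finish # writes the gzip trailer without closing tmp
  tmp.flush
end

tmp = Tempfile.new('s3-compressor-sketch')
tmp.binmode
compress(FakeChunk.new("line 1\nline 2\n"), tmp)

# Read the file back to confirm the round trip.
tmp.rewind
restored = Zlib::GzipReader.new(tmp).read
tmp.close!
puts restored
```

Using `finish` rather than `close` leaves `tmp` open, so the caller can still upload the file afterwards.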
|
92
|
+
|
data/docs/input.md
ADDED
@@ -0,0 +1,98 @@
|
|
1
|
+
# Input: Setup
|
2
|
+
|
3
|
+
1. Create a new [SQS](https://aws.amazon.com/documentation/sqs/) queue (use the same region as S3)
|
4
|
+
2. Set the proper permissions on the new queue
|
5
|
+
3. [Configure S3 event notification](http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html)
|
6
|
+
4. Write a configuration file such as fluent.conf
|
7
|
+
5. Run fluentd
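With this setup, S3 publishes an event notification to the SQS queue for each new object, and the input plugin reads those messages. The sketch below shows how the bucket name and object key could be pulled out of such a message body. The helper name and sample values are hypothetical, and this is an illustration of the S3 event notification layout rather than the plugin's own code. Object keys in these events are URL-encoded, so the sketch decodes them.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: extracts [bucket, key] pairs from an S3 event
# notification body as delivered to SQS. Messages without "Records"
# (such as the test event S3 sends on setup) yield an empty list.
def s3_objects_from_event(body)
  event = JSON.parse(body)
  (event['Records'] || []).map do |rec|
    s3 = rec.fetch('s3')
    bucket = s3.fetch('bucket').fetch('name')
    key = URI.decode_www_form_component(s3.fetch('object').fetch('key'))
    [bucket, key]
  end
end

sample = JSON.generate(
  'Records' => [
    {
      'eventName' => 'ObjectCreated:Put',
      's3' => {
        'bucket' => { 'name' => 'my-s3bucket' },
        'object' => { 'key' => 'logs/my+app.log.gz' }
      }
    }
  ]
)

objects = s3_objects_from_event(sample)
p objects
```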
|
8
|
+
|
9
|
+
# Configuration: Input
|
10
|
+
|
11
|
+
See also [Configuration: credentials](credentials.md) for common credential parameters.
|
12
|
+
|
13
|
+
<source>
|
14
|
+
@type s3
|
15
|
+
|
16
|
+
aws_key_id YOUR_AWS_KEY_ID
|
17
|
+
aws_sec_key YOUR_AWS_SECRET_KEY
|
18
|
+
s3_bucket YOUR_S3_BUCKET_NAME
|
19
|
+
s3_region ap-northeast-1
|
20
|
+
add_object_metadata true
|
21
|
+
|
22
|
+
<sqs>
|
23
|
+
queue_name YOUR_SQS_QUEUE_NAME
|
24
|
+
</sqs>
|
25
|
+
</source>
|
26
|
+
|
27
|
+
## add_object_metadata
|
28
|
+
|
29
|
+
Whether or not object metadata should be added to the record. Defaults to `false`. See below for details.
|
30
|
+
|
31
|
+
## s3_bucket (required)
|
32
|
+
|
33
|
+
S3 bucket name.
|
34
|
+
|
35
|
+
## s3_region
|
36
|
+
|
37
|
+
S3 region name. For example, US West (Oregon) Region is
|
38
|
+
"us-west-2". The full list of regions is available here:
|
39
|
+
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We
|
40
|
+
recommend using `s3_region` instead of `s3_endpoint`.
|
41
|
+
|
42
|
+
## store_as
|
43
|
+
|
44
|
+
Archive format on S3. You can use several formats:
|
45
|
+
|
46
|
+
* gzip (default)
|
47
|
+
* json
|
48
|
+
* text
|
49
|
+
* lzo (needs the lzop command)
|
50
|
+
* lzma2 (needs the xz command)
|
51
|
+
* gzip_command (needs the gzip command)
|
52
|
+
* This compressor uses an external gzip command, so it can make better use of CPU cores than `gzip`
|
53
|
+
|
54
|
+
See the [Use your (de)compression algorithm](howto.md#use-your-decompression-algorithm) section for adding another format.
|
55
|
+
|
56
|
+
## format
|
57
|
+
|
58
|
+
Format used to parse each line in the S3 object. Supported formats are
|
59
|
+
"apache_error", "apache2", "syslog", "json", "tsv", "ltsv", "csv",
|
60
|
+
"nginx" and "none".
|
61
|
+
|
62
|
+
## check_apikey_on_start
|
63
|
+
|
64
|
+
Check AWS key on start. Default is true.
|
65
|
+
|
66
|
+
## proxy_uri
|
67
|
+
|
68
|
+
URI of proxy environment.
|
69
|
+
|
70
|
+
## \<sqs\> section
|
71
|
+
|
72
|
+
### queue_name (required)
|
73
|
+
|
74
|
+
SQS queue name. The SQS queue must be created in the same region as the S3 bucket.
|
75
|
+
|
76
|
+
### queue_owner_aws_account_id
|
77
|
+
|
78
|
+
SQS Owner Account ID
|
79
|
+
|
80
|
+
### aws_key_id
|
81
|
+
|
82
|
+
Alternative AWS key id for SQS
|
83
|
+
|
84
|
+
### aws_sec_key
|
85
|
+
|
86
|
+
Alternative AWS secret key for SQS
|
87
|
+
|
88
|
+
### skip_delete
|
89
|
+
|
90
|
+
When true, messages are not deleted after the polling block. Default is false.
|
91
|
+
|
92
|
+
### wait_time_seconds
|
93
|
+
|
94
|
+
The long polling interval. Default is 20.
|
95
|
+
|
96
|
+
### retry_error_interval
|
97
|
+
|
98
|
+
Interval, in seconds, to retry polling SQS if polling is unsuccessful. Default is 300.
|