fluent-plugin-s3 1.5.1 → 1.7.0
- checksums.yaml +4 -4
- data/.github/ISSUE_TEMPLATE/bug_report.yaml +72 -0
- data/.github/ISSUE_TEMPLATE/config.yml +5 -0
- data/.github/ISSUE_TEMPLATE/feature_request.yaml +38 -0
- data/.github/workflows/linux.yml +5 -3
- data/.github/workflows/stale-actions.yml +22 -0
- data/ChangeLog +15 -0
- data/README.md +13 -781
- data/VERSION +1 -1
- data/docs/credentials.md +171 -0
- data/docs/howto.md +92 -0
- data/docs/input.md +98 -0
- data/docs/output.md +453 -0
- data/docs/v0.12.md +52 -0
- data/fluent-plugin-s3.gemspec +3 -0
- data/lib/fluent/plugin/in_s3.rb +26 -1
- data/lib/fluent/plugin/out_s3.rb +12 -3
- data/lib/fluent/plugin/s3_compressor_parquet.rb +83 -0
- data/test/test_in_s3.rb +108 -5
- data/test/test_out_s3.rb +167 -118
- metadata +28 -7
- data/.travis.yml +0 -24
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
1.
|
1
|
+
1.7.0
|
data/docs/credentials.md
ADDED
@@ -0,0 +1,171 @@
|
|
1
|
+
# Configuration: credentials
|
2
|
+
|
3
|
+
Both the S3 input and output plugins provide several credential methods for authentication/authorization.
|
4
|
+
|
5
|
+
## AWS key and secret authentication
|
6
|
+
|
7
|
+
These parameters are required when your agent is not running on an EC2 instance with an IAM role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.
|
8
|
+
|
9
|
+
### aws_key_id
|
10
|
+
|
11
|
+
AWS access key id.
|
12
|
+
|
13
|
+
### aws_sec_key
|
14
|
+
|
15
|
+
AWS secret key.
|
16
|
+
|
17
|
+
## \<assume_role_credentials\> section
|
18
|
+
|
19
|
+
Typically, you use AssumeRole for cross-account access or federation.
|
20
|
+
|
21
|
+
<match *>
|
22
|
+
@type s3
|
23
|
+
|
24
|
+
<assume_role_credentials>
|
25
|
+
role_arn ROLE_ARN
|
26
|
+
role_session_name ROLE_SESSION_NAME
|
27
|
+
</assume_role_credentials>
|
28
|
+
</match>
|
29
|
+
|
30
|
+
See also:
|
31
|
+
|
32
|
+
* [Using IAM Roles - AWS Identity and Access
|
33
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
|
34
|
+
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
|
35
|
+
* [Aws::AssumeRoleCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)
|
36
|
+
|
37
|
+
### role_arn (required)
|
38
|
+
|
39
|
+
The Amazon Resource Name (ARN) of the role to assume.
|
40
|
+
|
41
|
+
### role_session_name (required)
|
42
|
+
|
43
|
+
An identifier for the assumed role session.
|
44
|
+
|
45
|
+
### policy
|
46
|
+
|
47
|
+
An IAM policy in JSON format.
|
48
|
+
|
49
|
+
### duration_seconds
|
50
|
+
|
51
|
+
The duration, in seconds, of the role session. The value can range from
|
52
|
+
900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value
|
53
|
+
is set to 3600 seconds.
|
54
|
+
|
55
|
+
### external_id
|
56
|
+
|
57
|
+
A unique identifier that is used by third parties when assuming roles in
|
58
|
+
their customers' accounts.
|
59
|
+
|
60
|
+
## \<web_identity_credentials\> section
|
61
|
+
|
62
|
+
Similar to the assume_role_credentials, but for usage in EKS.
|
63
|
+
|
64
|
+
<match *>
|
65
|
+
@type s3
|
66
|
+
|
67
|
+
<web_identity_credentials>
|
68
|
+
role_arn ROLE_ARN
|
69
|
+
role_session_name ROLE_SESSION_NAME
|
70
|
+
web_identity_token_file AWS_WEB_IDENTITY_TOKEN_FILE
|
71
|
+
</web_identity_credentials>
|
72
|
+
</match>
|
73
|
+
|
74
|
+
See also:
|
75
|
+
|
76
|
+
* [Using IAM Roles - AWS Identity and Access
|
77
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
|
78
|
+
* [IAM Roles For Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html)
|
79
|
+
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
|
80
|
+
* [Aws::AssumeRoleWebIdentityCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/AssumeRoleWebIdentityCredentials.html)
|
81
|
+
|
82
|
+
### role_arn (required)
|
83
|
+
|
84
|
+
The Amazon Resource Name (ARN) of the role to assume.
|
85
|
+
|
86
|
+
### role_session_name (required)
|
87
|
+
|
88
|
+
An identifier for the assumed role session.
|
89
|
+
|
90
|
+
### web_identity_token_file (required)
|
91
|
+
|
92
|
+
The absolute path to the file on disk containing the OIDC token.
|
93
|
+
|
94
|
+
### policy
|
95
|
+
|
96
|
+
An IAM policy in JSON format.
|
97
|
+
|
98
|
+
### duration_seconds
|
99
|
+
|
100
|
+
The duration, in seconds, of the role session. The value can range from
|
101
|
+
900 seconds (15 minutes) to 43200 seconds (12 hours). By default, the value
|
102
|
+
is set to 3600 seconds.
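The two `duration_seconds` ranges documented above differ only in their upper bound. A minimal sketch of a range check follows; the constant and method names are hypothetical, and whether the plugin itself validates the value before calling AWS is not stated here.

```ruby
# Ranges taken from the text above; the names are illustrative only.
DURATION_LIMITS = {
  assume_role_credentials: 900..3600,
  web_identity_credentials: 900..43_200
}.freeze

# Documented default for both sections.
DEFAULT_DURATION_SECONDS = 3600

# Returns the value unchanged when it lies inside the documented range,
# otherwise raises ArgumentError.
def validate_duration(section, seconds = DEFAULT_DURATION_SECONDS)
  range = DURATION_LIMITS.fetch(section)
  unless range.cover?(seconds)
    raise ArgumentError,
          "duration_seconds for #{section} must be within #{range}, got #{seconds}"
  end
  seconds
end
```

For example, `validate_duration(:web_identity_credentials, 43_200)` passes, while the same value for `:assume_role_credentials` raises.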
|
103
|
+
|
104
|
+
|
105
|
+
## \<instance_profile_credentials\> section
|
106
|
+
|
107
|
+
Retrieve temporary security credentials via HTTP request. This is useful on
|
108
|
+
an EC2 instance.
|
109
|
+
|
110
|
+
<match *>
|
111
|
+
@type s3
|
112
|
+
|
113
|
+
<instance_profile_credentials>
|
114
|
+
ip_address IP_ADDRESS
|
115
|
+
port PORT
|
116
|
+
</instance_profile_credentials>
|
117
|
+
</match>
|
118
|
+
|
119
|
+
See also:
|
120
|
+
|
121
|
+
* [Aws::InstanceProfileCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
|
122
|
+
* [Temporary Security Credentials - AWS Identity and Access
|
123
|
+
Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
|
124
|
+
* [Instance Metadata and User Data - Amazon Elastic Compute
|
125
|
+
Cloud](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
|
126
|
+
|
127
|
+
### retries
|
128
|
+
|
129
|
+
Number of times to retry when retrieving credentials. Default is 5.
|
130
|
+
|
131
|
+
### ip_address
|
132
|
+
|
133
|
+
Default is 169.254.169.254.
|
134
|
+
|
135
|
+
### port
|
136
|
+
|
137
|
+
Default is 80.
|
138
|
+
|
139
|
+
### http_open_timeout
|
140
|
+
|
141
|
+
Default is 5.
|
142
|
+
|
143
|
+
### http_read_timeout
|
144
|
+
|
145
|
+
Default is 5.
|
146
|
+
|
147
|
+
## \<shared_credentials\> section
|
148
|
+
|
149
|
+
This loads AWS access credentials from a local ini file. This is useful for
|
150
|
+
local development.
|
151
|
+
|
152
|
+
<match *>
|
153
|
+
@type s3
|
154
|
+
|
155
|
+
<shared_credentials>
|
156
|
+
path PATH
|
157
|
+
profile_name PROFILE_NAME
|
158
|
+
</shared_credentials>
|
159
|
+
</match>
|
160
|
+
|
161
|
+
See also:
|
162
|
+
|
163
|
+
* [Aws::SharedCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)
|
164
|
+
|
165
|
+
### path
|
166
|
+
|
167
|
+
Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".
|
168
|
+
|
169
|
+
### profile_name
|
170
|
+
|
171
|
+
Defaults to 'default' or `ENV['AWS_PROFILE']`.
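As a rough illustration of how the `path` and `profile_name` defaults could resolve, here is a minimal sketch. The helper name is hypothetical. The precedence (explicit value, then `AWS_PROFILE`, then `'default'`) is an assumption; the actual resolution is done by `Aws::SharedCredentials` in the AWS SDK.

```ruby
# Hypothetical helper; it only models the documented defaults.
# `env` is injectable so the lookup can be exercised without touching ENV.
def shared_credentials_defaults(path: nil, profile_name: nil, env: ENV)
  {
    # Documented default: "#{Dir.home}/.aws/credentials"
    path: path || File.join(Dir.home, '.aws', 'credentials'),
    # Assumed precedence: explicit value, then AWS_PROFILE, then 'default'
    profile_name: profile_name || env['AWS_PROFILE'] || 'default'
  }
end
```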
|
data/docs/howto.md
ADDED
@@ -0,0 +1,92 @@
|
|
1
|
+
# Object Metadata Added To Records
|
2
|
+
|
3
|
+
If the [`add_object_metadata`](input.md#add_object_metadata) option is set to true, then the name of the bucket
|
4
|
+
and the key for a given object will be added to each log record as [`s3_bucket`](input.md#s3_bucket)
|
5
|
+
and [`s3_key`](input.md#s3_key), respectively. This metadata can be used by filter plugins or other
|
6
|
+
downstream processors to better identify the source of a given record.
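The effect described above can be sketched in plain Ruby. Both helper names are hypothetical and this is not the plugin's code; it only shows the shape of a record before and after the two keys are added, and one way a downstream step might use them.

```ruby
# Hypothetical helper mirroring the documented behavior:
# the record is returned unchanged unless the option is enabled.
def with_object_metadata(record, bucket:, key:, add_object_metadata: false)
  return record unless add_object_metadata

  record.merge('s3_bucket' => bucket, 's3_key' => key)
end

# A downstream consumer could group records by their source object.
def group_by_source(records)
  records.group_by { |r| [r['s3_bucket'], r['s3_key']] }
end
```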
|
7
|
+
|
8
|
+
# IAM Policy
|
9
|
+
|
10
|
+
The following is an example of an IAM policy needed to write to an S3 bucket (matches my-s3bucket/logs, my-s3bucket-test, etc.).
|
11
|
+
|
12
|
+
{
|
13
|
+
"Version": "2012-10-17",
|
14
|
+
"Statement": [
|
15
|
+
{
|
16
|
+
"Effect": "Allow",
|
17
|
+
"Action": [
|
18
|
+
"s3:ListBucket"
|
19
|
+
],
|
20
|
+
"Resource": "arn:aws:s3:::my-s3bucket"
|
21
|
+
},
|
22
|
+
{
|
23
|
+
"Effect": "Allow",
|
24
|
+
"Action": [
|
25
|
+
"s3:PutObject",
|
26
|
+
"s3:GetObject"
|
27
|
+
],
|
28
|
+
"Resource": "arn:aws:s3:::my-s3bucket/*"
|
29
|
+
}
|
30
|
+
]
|
31
|
+
}
|
32
|
+
|
33
|
+
Note that the bucket must already exist and **[`auto_create_bucket`](output.md#auto_create_bucket)** has no effect in this case.
|
34
|
+
|
35
|
+
`s3:GetObject` is needed for the object check that prevents existing objects from being overwritten.
|
36
|
+
If you set `check_object false`, `s3:GetObject` is not needed.
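The relationship between `check_object` and `s3:GetObject` can be made concrete with a small generator for the policy shown above. This is a sketch only: the method name is hypothetical, and it models just the `check_object` flag, leaving the `s3:ListBucket` statement as it appears in the example.

```ruby
require 'json'

# Builds the example policy for a given bucket name.
# s3:GetObject is included only when the object check is enabled.
def s3_write_policy(bucket, check_object: true)
  object_actions = ['s3:PutObject']
  object_actions << 's3:GetObject' if check_object

  {
    'Version' => '2012-10-17',
    'Statement' => [
      {
        'Effect' => 'Allow',
        'Action' => ['s3:ListBucket'],
        'Resource' => "arn:aws:s3:::#{bucket}"
      },
      {
        'Effect' => 'Allow',
        'Action' => object_actions,
        'Resource' => "arn:aws:s3:::#{bucket}/*"
      }
    ]
  }
end

puts JSON.pretty_generate(s3_write_policy('my-s3bucket', check_object: false))
```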
|
37
|
+
|
38
|
+
Refer to the [AWS
|
39
|
+
documentation](http://docs.aws.amazon.com/IAM/latest/UserGuide/ExampleIAMPolicies.html) for example policies.
|
40
|
+
|
41
|
+
Using [IAM
|
42
|
+
roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/WorkingWithRoles.html)
|
43
|
+
with a properly configured IAM policy is preferred over embedding access keys
|
44
|
+
on EC2 instances.
|
45
|
+
|
46
|
+
## Example when `check_bucket false` and `check_object false`
|
47
|
+
|
48
|
+
With this configuration, fluentd will work with the
|
49
|
+
minimum IAM policy, like:
|
50
|
+
|
51
|
+
|
52
|
+
"Statement": [{
|
53
|
+
"Effect": "Allow",
|
54
|
+
"Action": "s3:PutObject",
|
55
|
+
"Resource": ["*"]
|
56
|
+
}]
|
57
|
+
|
58
|
+
|
59
|
+
# Use your (de)compression algorithm
|
60
|
+
|
61
|
+
The s3 plugin has a pluggable compression mechanism, like Fluentd's input / output
|
62
|
+
plugins. If you set `store_as xxx`, the `out_s3` plugin searches for
|
63
|
+
`fluent/plugin/s3_compressor_xxx.rb` and `in_s3` plugin searches
|
64
|
+
`fluent/plugin/s3_extractor_xxx.rb`. You can define your (de)compression with
|
65
|
+
`S3Output::Compressor`/`S3Input::Extractor` classes. The Compressor API is as follows:
|
66
|
+
|
67
|
+
module Fluent # Since fluent-plugin-s3 v1.0.0 or later, use Fluent::Plugin instead of Fluent
|
68
|
+
class S3Output
|
69
|
+
class XXXCompressor < Compressor
|
70
|
+
S3Output.register_compressor('xxx', self)
|
71
|
+
|
72
|
+
# Used as the file extension
|
73
|
+
def ext
|
74
|
+
'xxx'
|
75
|
+
end
|
76
|
+
|
77
|
+
# Used as the file content type
|
78
|
+
def content_type
|
79
|
+
'application/x-xxx'
|
80
|
+
end
|
81
|
+
|
82
|
+
# chunk is the buffer chunk; tmp is the destination file for upload
|
83
|
+
def compress(chunk, tmp)
|
84
|
+
# call command or something
|
85
|
+
end
|
86
|
+
end
|
87
|
+
end
|
88
|
+
end
|
89
|
+
|
90
|
+
`Extractor` is similar to `Compressor`.
|
91
|
+
See bundled `Compressor`/`Extractor` classes for more detail.
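To show what a `compress(chunk, tmp)` body might look like, here is a standalone sketch that uses only Ruby's standard library. `FakeChunk` and `SketchGzipCompressor` are hypothetical names: the class deliberately does not inherit from the plugin's `Compressor` and is not registered, and the sketch assumes the buffer chunk exposes `write_to(io)`.

```ruby
require 'zlib'
require 'tempfile'

# Hypothetical stand-in for a Fluentd buffer chunk; only write_to is modeled.
class FakeChunk
  def initialize(data)
    @data = data
  end

  def write_to(io)
    io.write(@data)
  end
end

# Standalone sketch of the three-method interface (ext, content_type, compress).
class SketchGzipCompressor
  def ext
    'gz'
  end

  def content_type
    'application/x-gzip'
  end

  # Streams the chunk through a gzip writer into tmp.
  def compress(chunk, tmp)
    gz = Zlib::GzipWriter.new(tmp)
    chunk.write_to(gz)
    gz.finish # writes the gzip trailer but leaves tmp open
    tmp.rewind
  end
end

# Compresses data into a temporary file and reads it back.
def gzip_round_trip(data)
  tmp = Tempfile.new('s3-sketch')
  tmp.binmode
  SketchGzipCompressor.new.compress(FakeChunk.new(data), tmp)
  Zlib::GzipReader.new(tmp).read
ensure
  tmp.close! if tmp
end
```

Calling `finish` rather than `close` on the gzip writer keeps the destination file open, which matters if the caller still needs to read `tmp` for the upload.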
|
92
|
+
|
data/docs/input.md
ADDED
@@ -0,0 +1,98 @@
|
|
1
|
+
# Input: Setup
|
2
|
+
|
3
|
+
1. Create a new [SQS](https://aws.amazon.com/documentation/sqs/) queue (use the same region as S3)
|
4
|
+
2. Set the proper permissions on the new queue
|
5
|
+
3. [Configure S3 event notification](http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html)
|
6
|
+
4. Write a configuration file such as fluent.conf
|
7
|
+
5. Run fluentd
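Once the steps above are in place, each SQS message body carries an S3 event notification in JSON. The sketch below shows how the bucket name and object key could be pulled out of such a body. The method name is hypothetical, the field layout follows the AWS S3 event notification format, and URL-decoding the key is an assumption based on that format rather than something stated in this document.

```ruby
require 'json'
require 'uri'

# Extracts bucket/key pairs from an S3 event notification body.
# Bodies without a "Records" array yield an empty list.
def s3_objects_from_notification(body)
  records = JSON.parse(body).fetch('Records', [])
  records.map do |record|
    s3 = record.fetch('s3')
    {
      bucket: s3.fetch('bucket').fetch('name'),
      # Keys arrive URL-encoded in notifications ('+' for space, %XX escapes).
      key: URI.decode_www_form_component(s3.fetch('object').fetch('key'))
    }
  end
end
```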
|
8
|
+
|
9
|
+
# Configuration: Input
|
10
|
+
|
11
|
+
See also [Configuration: credentials](credentials.md) for the credential parameters common to both plugins.
|
12
|
+
|
13
|
+
<source>
|
14
|
+
@type s3
|
15
|
+
|
16
|
+
aws_key_id YOUR_AWS_KEY_ID
|
17
|
+
aws_sec_key YOUR_AWS_SECRET_KEY
|
18
|
+
s3_bucket YOUR_S3_BUCKET_NAME
|
19
|
+
s3_region ap-northeast-1
|
20
|
+
add_object_metadata true
|
21
|
+
|
22
|
+
<sqs>
|
23
|
+
queue_name YOUR_SQS_QUEUE_NAME
|
24
|
+
</sqs>
|
25
|
+
</source>
|
26
|
+
|
27
|
+
## add_object_metadata
|
28
|
+
|
29
|
+
Whether or not object metadata should be added to the record. Defaults to `false`. See below for details.
|
30
|
+
|
31
|
+
## s3_bucket (required)
|
32
|
+
|
33
|
+
S3 bucket name.
|
34
|
+
|
35
|
+
## s3_region
|
36
|
+
|
37
|
+
S3 region name. For example, US West (Oregon) Region is
|
38
|
+
"us-west-2". The full list of regions is available here:
|
39
|
+
http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region. We
|
40
|
+
recommend using `s3_region` instead of `s3_endpoint`.
|
41
|
+
|
42
|
+
## store_as
|
43
|
+
|
44
|
+
Archive format on S3. You can use several formats:
|
45
|
+
|
46
|
+
* gzip (default)
|
47
|
+
* json
|
48
|
+
* text
|
49
|
+
* lzo (Need lzop command)
|
50
|
+
* lzma2 (Need xz command)
|
51
|
+
* gzip_command (Need gzip command)
|
52
|
+
* This compressor uses an external gzip command, so it can make better use of CPU cores than `gzip`
|
53
|
+
|
54
|
+
See the [Use your (de)compression algorithm](howto.md#use-your-decompression-algorithm) section for adding another format.
|
55
|
+
|
56
|
+
## format
|
57
|
+
|
58
|
+
Parse a line as this format in the S3 object. Supported formats are
|
59
|
+
"apache_error", "apache2", "syslog", "json", "tsv", "ltsv", "csv",
|
60
|
+
"nginx" and "none".
|
61
|
+
|
62
|
+
## check_apikey_on_start
|
63
|
+
|
64
|
+
Check AWS key on start. Default is true.
|
65
|
+
|
66
|
+
## proxy_uri
|
67
|
+
|
68
|
+
URI of proxy environment.
|
69
|
+
|
70
|
+
## \<sqs\> section
|
71
|
+
|
72
|
+
### queue_name (required)
|
73
|
+
|
74
|
+
SQS queue name. The SQS queue must be created in the same region as the S3 bucket.
|
75
|
+
|
76
|
+
### queue_owner_aws_account_id
|
77
|
+
|
78
|
+
SQS Owner Account ID
|
79
|
+
|
80
|
+
### aws_key_id
|
81
|
+
|
82
|
+
Alternative aws key id for SQS
|
83
|
+
|
84
|
+
### aws_sec_key
|
85
|
+
|
86
|
+
Alternative aws key secret for SQS
|
87
|
+
|
88
|
+
### skip_delete
|
89
|
+
|
90
|
+
When true, messages are not deleted after polling block. Default is false.
|
91
|
+
|
92
|
+
### wait_time_seconds
|
93
|
+
|
94
|
+
The long polling interval. Default is 20.
|
95
|
+
|
96
|
+
### retry_error_interval
|
97
|
+
|
98
|
+
Interval, in seconds, to wait before retrying after an unsuccessful SQS poll. Default is 300.
|