adp-fluent-plugin-kinesis 0.0.1
- checksums.yaml +7 -0
- data/.github/PULL_REQUEST_TEMPLATE.md +6 -0
- data/.gitignore +15 -0
- data/.travis.yml +56 -0
- data/CHANGELOG.md +172 -0
- data/CODE_OF_CONDUCT.md +4 -0
- data/CONTRIBUTING.md +61 -0
- data/CONTRIBUTORS.txt +8 -0
- data/Gemfile +18 -0
- data/LICENSE.txt +201 -0
- data/Makefile +44 -0
- data/NOTICE.txt +2 -0
- data/README.md +559 -0
- data/Rakefile +26 -0
- data/adp-fluent-plugin-kinesis.gemspec +71 -0
- data/benchmark/task.rake +106 -0
- data/gemfiles/Gemfile.fluentd-0.14.22 +6 -0
- data/gemfiles/Gemfile.fluentd-1.13.3 +6 -0
- data/gemfiles/Gemfile.td-agent-3.1.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.1.1 +17 -0
- data/gemfiles/Gemfile.td-agent-3.2.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.2.1 +17 -0
- data/gemfiles/Gemfile.td-agent-3.3.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.4.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.4.1 +17 -0
- data/gemfiles/Gemfile.td-agent-3.5.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.5.1 +17 -0
- data/gemfiles/Gemfile.td-agent-3.6.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.7.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.7.1 +17 -0
- data/gemfiles/Gemfile.td-agent-3.8.0 +17 -0
- data/gemfiles/Gemfile.td-agent-3.8.1 +18 -0
- data/gemfiles/Gemfile.td-agent-4.0.0 +25 -0
- data/gemfiles/Gemfile.td-agent-4.0.1 +21 -0
- data/gemfiles/Gemfile.td-agent-4.1.0 +21 -0
- data/gemfiles/Gemfile.td-agent-4.1.1 +21 -0
- data/gemfiles/Gemfile.td-agent-4.2.0 +21 -0
- data/lib/fluent/plugin/kinesis.rb +174 -0
- data/lib/fluent/plugin/kinesis_helper/aggregator.rb +101 -0
- data/lib/fluent/plugin/kinesis_helper/api.rb +254 -0
- data/lib/fluent/plugin/kinesis_helper/client.rb +210 -0
- data/lib/fluent/plugin/kinesis_helper/compression.rb +27 -0
- data/lib/fluent/plugin/out_kinesis_firehose.rb +60 -0
- data/lib/fluent/plugin/out_kinesis_streams.rb +72 -0
- data/lib/fluent/plugin/out_kinesis_streams_aggregated.rb +79 -0
- data/lib/fluent_plugin_kinesis/version.rb +17 -0
- metadata +339 -0
data/README.md
ADDED
@@ -0,0 +1,559 @@
# Fluent plugin for Amazon Kinesis

[![Build Status](https://api.travis-ci.com/awslabs/aws-fluent-plugin-kinesis.svg?branch=master)](https://app.travis-ci.com/github/awslabs/aws-fluent-plugin-kinesis)
[![Gem Version](https://badge.fury.io/rb/fluent-plugin-kinesis.svg)](https://rubygems.org/gems/fluent-plugin-kinesis)
[![Gem Downloads](https://img.shields.io/gem/dt/fluent-plugin-kinesis.svg)](https://rubygems.org/gems/fluent-plugin-kinesis)

[Fluentd][fluentd] output plugin
that sends events to [Amazon Kinesis Data Streams][streams] and [Amazon Kinesis Data Firehose][firehose]. It also supports the [KPL Aggregated Record Format][kpl]. This gem includes three output plugins:

- `kinesis_streams`
- `kinesis_firehose`
- `kinesis_streams_aggregated`

There is also [documentation on the official Fluentd site][fluentd-doc-kinesis].

**Note**: This README is for v3. Plugin v3 is mostly compatible with v2. If you use v1, see the [old README][v1-readme].

## Installation
This Fluentd plugin is available as the `fluent-plugin-kinesis` gem from RubyGems.

    gem install fluent-plugin-kinesis

Or you can install this plugin for [td-agent][td-agent] as:

    td-agent-gem install fluent-plugin-kinesis

If you would like to build and install the plugin yourself, follow the steps below. You need [bundler][bundler] for this.

To use it with Fluentd (Fluentd itself is also installed by these steps):

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    bundle exec rake build
    bundle exec rake install

To use it with td-agent, install td-agent first, then build and install the plugin:

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    bundle exec rake build
    fluent-gem install pkg/fluent-plugin-kinesis

Or just download the source and add it to your Ruby library path. Below is a sample that sets the library path via `RUBYLIB`:

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    export RUBYLIB=$RUBYLIB:/path/to/aws-fluent-plugin-kinesis/lib

## Dependencies
* Ruby 2.3.0+
* Fluentd 0.14.22+ (td-agent v3.1.0+)

## Basic Usage
The general procedure for using this plugin is:

1. Install the plugin.
2. Edit the configuration.
3. Run Fluentd or td-agent.

To run this plugin with Fluentd:

1. Install the plugin.
2. Edit the configuration file and save it as 'fluentd.conf'.
3. Run `fluentd -c /path/to/fluentd.conf`.

To run this plugin with td-agent:

1. Install the plugin.
2. Edit the configuration file provided by td-agent.
3. Run or restart td-agent.

## Getting started
The examples below assume that you run on Amazon EC2 instances with an instance profile. If you want to use specific credentials, see [Credentials](#configuration-credentials).

### kinesis_streams
    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      partition_key key # Otherwise, use random partition key
    </match>
For more details, see [Configuration: kinesis_streams](#configuration-kinesis_streams).

### kinesis_firehose
    <match your_tag>
      @type kinesis_firehose
      region us-east-1
      delivery_stream_name your_stream
    </match>
For more details, see [Configuration: kinesis_firehose](#configuration-kinesis_firehose).

### kinesis_streams_aggregated
    <match your_tag>
      @type kinesis_streams_aggregated
      region us-east-1
      stream_name your_stream
      # Unlike kinesis_streams, there is no way to use dynamic partition key.
      # fixed_partition_key or random.
    </match>
For more details, see [Configuration: kinesis_streams_aggregated](#configuration-kinesis_streams_aggregated).

### For better throughput
Add configuration like the following:

    flush_interval 1
    chunk_limit_size 1m
    flush_thread_interval 0.1
    flush_thread_burst_interval 0.01
    flush_thread_count 15

When you use Fluentd v1.0 (td-agent 3), write these settings in the buffer section. For more details, see [Config: Buffer Section][fluentd-buffer-section].

Note: Adjust each value to suit your own system.

## Configuration: Credentials
To put records into Amazon Kinesis Data Streams or Firehose, you need to provide AWS security credentials. If no credentials are specified in the config file, this plugin fetches them automatically, the same way the AWS SDK for Ruby does (environment variables, shared profile, and instance profile).

This plugin uses the same configuration as [fluent-plugin-s3][fluent-plugin-s3], but also supports AWS session tokens for temporary credentials.

**aws_key_id**

AWS access key id. This parameter is required when your agent is not running on an EC2 instance with an IAM role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.

**aws_sec_key**

AWS secret key. This parameter is required when your agent is not running on an EC2 instance with an IAM role.

**aws_ses_token**

AWS session token. This parameter is optional, but can be provided if you use MFA or temporary credentials when your agent is not running on an EC2 instance with an IAM role.

**aws_iam_retries**

The number of attempts to make (with exponential backoff) when loading instance profile credentials from the EC2 metadata service using an IAM role. Defaults to 5 retries.

### assume_role_credentials
Typically, you can use AssumeRole for cross-account access or federation.

    <match *>
      @type kinesis_streams

      <assume_role_credentials>
        role_arn          ROLE_ARN
        role_session_name ROLE_SESSION_NAME
      </assume_role_credentials>
    </match>

See also:

* [Using IAM Roles - AWS Identity and Access Management](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
* [Aws::STS::Client](https://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
* [Aws::AssumeRoleCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)

**role_arn (required)**

The Amazon Resource Name (ARN) of the role to assume.

**role_session_name (required)**

An identifier for the assumed role session.

**policy**

An IAM policy in JSON format.

**duration_seconds**

The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value is set to 3600 seconds.

**external_id**

A unique identifier that is used by third parties when assuming roles in their customers' accounts.

**sts_http_proxy**

Proxy URL for requests to the Amazon STS API. This is configured independently of the global `http_proxy` parameter, for the case where requests to the Kinesis API go through a Kinesis VPC endpoint but requests to the STS API have to go through an HTTP proxy.
Add it to the `assume_role_credentials` configuration stanza in the following format:

    sts_http_proxy http://[username:password]@hostname:port

**sts_endpoint_url**

STS API endpoint URL. This can be used to override the default global STS API endpoint of sts.amazonaws.com. A regional endpoint may be preferable to reduce latency, and is required if you use a PrivateLink VPC endpoint for STS API calls.


### web_identity_credentials

Similar to `assume_role_credentials`, but for use in EKS.

    <match *>
      @type kinesis_streams

      <web_identity_credentials>
        role_arn          ROLE_ARN
        role_session_name ROLE_SESSION_NAME
        web_identity_token_file AWS_WEB_IDENTITY_TOKEN_FILE
      </web_identity_credentials>
    </match>

See also:

* [Using IAM Roles - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
* [IAM Roles For Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html)
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
* [Aws::AssumeRoleWebIdentityCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/AssumeRoleWebIdentityCredentials.html)

**role_arn (required)**

The Amazon Resource Name (ARN) of the role to assume.

**role_session_name (required)**

An identifier for the assumed role session.

**web_identity_token_file (required)**

The absolute path to the file on disk containing the OIDC token.

**policy**

An IAM policy in JSON format.

**duration_seconds**

The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 43200 seconds (12 hours). By default, the value is set to 3600 seconds (1 hour).

### instance_profile_credentials

Retrieve temporary security credentials via HTTP request. This is useful on EC2 instances.

    <match *>
      @type kinesis_streams

      <instance_profile_credentials>
        ip_address IP_ADDRESS
        port       PORT
      </instance_profile_credentials>
    </match>

See also:

* [Aws::InstanceProfileCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
* [Temporary Security Credentials - AWS Identity and Access Management](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
* [Instance Metadata and User Data - Amazon Elastic Compute Cloud](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)

**retries**

Number of times to retry when retrieving credentials. Default is 5.

**ip_address**

Default is 169.254.169.254.

**port**

Default is 80.

**http_open_timeout**

Default is 5.

**http_read_timeout**

Default is 5.

### shared_credentials

This loads AWS access credentials from a local ini file. This is useful for local development.

    <match *>
      @type kinesis_streams

      <shared_credentials>
        path         PATH
        profile_name PROFILE_NAME
      </shared_credentials>
    </match>

See also:

* [Aws::SharedCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)

**path**

Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".

**profile_name**

Defaults to 'default' or `ENV['AWS_PROFILE']`.

### process_credentials

This loads AWS access credentials from an external process.

    <match *>
      @type kinesis_streams

      <process_credentials>
        process CMD
      </process_credentials>
    </match>

See also:

* [Aws::ProcessCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/ProcessCredentials.html)
* [Sourcing Credentials From External Processes](https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#sourcing-credentials-from-external-processes)

**process (required)**

Command to be executed as an external process.

## Configuration: Format

### format (section)
This plugin uses `Fluent::TextFormatter` to serialize each record to a string. See [formatter.rb] for more details. By default, it uses the `json` formatter, which is equivalent to specifying the following:

    <match *>
      @type kinesis_streams

      <format>
        @type json
      </format>
    </match>

For other configurations of the `json` formatter, see [json formatter plugin][fluentd-formatter-json].

### inject (section)
This plugin uses `Fluent::TimeFormatter` and other injection configurations. See [inject.rb] for more details.

For example, the config below adds a `time` field whose value is the event time with nanosecond precision and a `tag` field whose value is the event's tag.

    <match *>
      @type kinesis_streams

      <inject>
        time_key time
        tag_key tag
      </inject>
    </match>

By default, `time_type string` and `time_format %Y-%m-%dT%H:%M:%S.%N%z` are already set, so that the output matches the Elasticsearch sub-second format. You can override them with any configuration you like.
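
To see what the default `time_format` pattern produces, the following minimal sketch formats a fixed timestamp with Ruby's `Time#strftime`. The timestamp is an arbitrary value chosen for illustration, and the plugin itself applies the format through Fluentd's injection helper rather than through code like this.

```ruby
# Format a fixed UTC timestamp with the default time_format pattern.
# The epoch value is an arbitrary example, not something the plugin uses.
time_format = '%Y-%m-%dT%H:%M:%S.%N%z'

# 1_600_000_000 seconds plus 123_456_789 nanoseconds since the Unix epoch.
event_time = Time.at(1_600_000_000, 123_456_789, :nsec).utc

formatted = event_time.strftime(time_format)
puts formatted
```

Here `%N` emits nine fractional digits (nanoseconds) and `%z` emits the UTC offset without a colon.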

There are also some format-related options, described below:

### data_key
If your record contains a field whose string should be sent to Amazon Kinesis directly (without the formatter), use this parameter to specify that field. In that case, fields other than **data_key** are thrown away and never sent to Amazon Kinesis. Default `nil`, which means the whole record is formatted and sent.
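
The effect of `data_key` can be sketched in a few lines of Ruby. This is a minimal illustration and not the plugin's actual code: `payload_for` is a hypothetical helper, `JSON.generate` stands in for whatever formatter is configured, and raising on a missing field is an assumption made for the example.

```ruby
require 'json'

# Hypothetical helper: returns the payload that would be sent for one record.
def payload_for(record, data_key: nil)
  if data_key.nil?
    # No data_key: the whole record is serialized (JSON stands in for the
    # configured formatter).
    JSON.generate(record)
  else
    # data_key set: only that field's string is sent; other fields are dropped.
    # Assumption: a missing field raises KeyError.
    record.fetch(data_key).to_s
  end
end

record = { 'message' => 'hello world', 'host' => 'web-1' }

puts payload_for(record)                       # whole record as JSON
puts payload_for(record, data_key: 'message')  # only the message string
```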

### compression
Specifies how the data of each record is compressed. Currently accepted options are `zlib` and `gzip`. Otherwise, no compression is performed.
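
Consumers have to reverse whichever compression the producer chose. The sketch below uses Ruby's standard `zlib` library to show the two formats round-tripping; which `Zlib` calls the plugin makes internally is an assumption here, and the sample payload is made up.

```ruby
require 'zlib'

data = '{"message":"hello"}' * 20

# zlib: deflate data with a zlib header and checksum.
zlib_bytes = Zlib::Deflate.deflate(data)

# gzip: deflate data wrapped in a gzip header and trailer.
gzip_bytes = Zlib.gzip(data)

# Reading side: undo the compression that was applied.
restored_from_zlib = Zlib::Inflate.inflate(zlib_bytes)
restored_from_gzip = Zlib.gunzip(gzip_bytes)

puts "original: #{data.bytesize} bytes"
puts "zlib:     #{zlib_bytes.bytesize} bytes"
puts "gzip:     #{gzip_bytes.bytesize} bytes"
```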

### log_truncate_max_size
Integer, default 1024. When a log entry is emitted, the message is truncated to this size to avoid an infinite loop when the log is also sent to Kinesis. The value 0 means no truncation.

### chomp_record
Boolean. Default `false`. If it is enabled, the plugin calls chomp and removes the separator from the end of each record. This option provides a format compatible with plugin v2. See [#142](https://github.com/awslabs/aws-fluent-plugin-kinesis/issues/142) for more details.
When you use the [kinesis_firehose](#kinesis_firehose) output, the [append_new_line](#append_new_line) option is `true` by default. If [append_new_line](#append_new_line) is enabled, the plugin calls chomp, as if [chomp_record](#chomp_record) were `true`, before appending `\n` to each record. Therefore, you don't need to enable [chomp_record](#chomp_record) when you use [kinesis_firehose](#kinesis_firehose) with the default configuration. If you set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (compatible format with plugin v2).
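
The interaction between the two options can be sketched as follows. `finalize` is a hypothetical helper written for this example, not the plugin's code, and both options default to `false` here even though `kinesis_firehose` defaults `append_new_line` to `true`. The sample string assumes formatter output that ends in a newline.

```ruby
# Hypothetical helper mirroring the behaviour described above.
def finalize(serialized, chomp_record: false, append_new_line: false)
  # append_new_line implies a chomp first, so a record never ends in "\n\n".
  serialized = serialized.chomp if chomp_record || append_new_line
  serialized += "\n" if append_new_line
  serialized
end

formatted = "{\"message\":\"hello\"}\n" # assumed formatter output

p finalize(formatted)                         # left as-is
p finalize(formatted, chomp_record: true)     # trailing newline removed
p finalize(formatted, append_new_line: true)  # exactly one trailing newline
```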

## Configuration: API
### region
AWS region of your stream. It should be in a form like `us-east-1` or `us-west-2`. Refer to [Regions and Endpoints in AWS General Reference][region] for supported regions.

Default `nil`, which means the plugin tries to read the region from the environment variable `AWS_REGION`.

### max_record_size
The upper limit on the size of each record. Default is 1 MB, which is the Kinesis limit.

### http_proxy
HTTP proxy for API calls. Default `nil`.

### endpoint
API endpoint URL, for testing. Default `nil`.

### ssl_verify_peer
Boolean. Disable it if you do not want to verify the SSL connection (for testing). Default `true`.

### debug
Boolean. Enable it if you need to debug Amazon Kinesis Data Firehose API calls. Default is `false`.

## Configuration: Batch request
### retries_on_batch_request
Integer, default is 8. The plugin puts multiple records to Amazon Kinesis Data Streams in batches using PutRecords. A set of records in a batch may fail for reasons documented in the Kinesis Service API Reference for PutRecords. Failed records are retried **retries_on_batch_request** times. If a record fails all retries, an error log is emitted.

### reset_backoff_if_success
Boolean, default `true`. If enabled, each retry checks how many records succeeded in the previous batch request and resets the exponential backoff if any of them succeeded. Because a batch request can contain records bound for different shards, a simple exponential backoff over the whole batch request does not work well in some cases.
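
The retry behaviour described in the two options above can be simulated without any network calls. The sketch below is not the plugin's implementation: `put_with_retries` and `scripted_sender` are hypothetical names, the block stands in for a batch request and returns the records that failed, and the base delay of 1 with doubling on each step is an assumed backoff policy. Delays are collected instead of slept.

```ruby
# Hypothetical simulation of batch retries with an optional backoff reset.
# The block receives the records to send and returns the ones that failed.
def put_with_retries(records, retries:, reset_backoff_if_success: true, base_delay: 1)
  delays = []   # delays that would have been slept between attempts
  step = 0      # current exponent of the backoff
  pending = records
  attempt = 0

  loop do
    failed = yield(pending)
    return [failed, delays] if failed.empty? || attempt >= retries

    # Some records got through: start the backoff over.
    step = 0 if reset_backoff_if_success && failed.size < pending.size
    delays << base_delay * (2**step)
    step += 1
    pending = failed
    attempt += 1
  end
end

# Scripted responses: each call returns the next list of failed records.
def scripted_sender(responses)
  queue = responses.dup
  ->(_batch) { queue.shift }
end

responses = [%i[b c], %i[b c], %i[c], []]

with_reset = scripted_sender(responses)
_, delays_with_reset = put_with_retries(%i[a b c], retries: 8) { |b| with_reset.call(b) }

without_reset = scripted_sender(responses)
_, delays_without_reset = put_with_retries(%i[a b c], retries: 8, reset_backoff_if_success: false) { |b| without_reset.call(b) }

p delays_with_reset
p delays_without_reset
```

With the reset enabled, the delay drops back to the base value whenever an attempt makes partial progress; without it, the delay keeps doubling even though records are getting through.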

### batch_request_max_count
Integer. The maximum number of records in a batch request built from a record chunk. It can't exceed the default value, because that is the API limit.

Default:

- `kinesis_streams`: 500
- `kinesis_firehose`: 500
- `kinesis_streams_aggregated`: 100,000

### batch_request_max_size
Integer. The maximum total size of a batch request built from a record chunk. It can't exceed the default value, because that is the API limit.

Default:

- `kinesis_streams`: 5 MB
- `kinesis_firehose`: 4 MB
- `kinesis_streams_aggregated`: 1 MB
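
How the count and size limits combine can be sketched as a greedy split. `split_into_batches` is a hypothetical helper, not the plugin's code, and the tiny limits in the example are chosen only so that both limits trigger; in practice the values would be the defaults listed above. Handling of a single record larger than `max_size` is left out.

```ruby
# Hypothetical helper: greedily packs records into batches so that no batch
# holds more than max_count records or more than max_size bytes.
def split_into_batches(records, max_count:, max_size:)
  batches = []
  current = []
  current_size = 0

  records.each do |record|
    size = record.bytesize
    # Start a new batch when adding this record would break either limit.
    if !current.empty? && (current.size >= max_count || current_size + size > max_size)
      batches << current
      current = []
      current_size = 0
    end
    current << record
    current_size += size
  end

  batches << current unless current.empty?
  batches
end

records = %w[aaaa bbbb ccc d e f g]
batches = split_into_batches(records, max_count: 3, max_size: 10)
p batches
```

In this run the first batch is closed by the size limit and the second by the count limit.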

### drop_failed_records_after_batch_request_retries
Boolean, default `true`.

If *drop_failed_records_after_batch_request_retries* is enabled (default), the plugin drops failed records when a batch request still fails after the maximum number of retries configured as *retries_on_batch_request*. This dropping can be monitored from [monitor_agent](https://docs.fluentd.org/input/monitor_agent) or [fluent-plugin-prometheus](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus) as the *retry_count* or *num_errors* metrics.

If *drop_failed_records_after_batch_request_retries* is disabled, the plugin raises an error and returns the chunk to the Fluentd buffer when a batch request still fails after the maximum number of retries. Fluentd then retries sending the chunk records according to the retry config in [Buffer Section](https://docs.fluentd.org/configuration/buffer-section). Note that this retrying may create duplicate records, since the [PutRecords API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html) of Kinesis Data Streams and the [PutRecordBatch API](https://docs.aws.amazon.com/firehose/latest/APIReference/API_PutRecordBatch.html) of Kinesis Data Firehose may return a partially successful response.

### monitor_num_of_batch_request_retries
Boolean, default `false`. If enabled, the plugin increments the *retry_count* monitoring metric after each internal retry of a batch request. This configuration enables you to monitor [ProvisionedThroughputExceededException](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html) from [monitor_agent](https://docs.fluentd.org/input/monitor_agent) or [fluent-plugin-prometheus](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus). Note that, if *monitor_num_of_batch_request_retries* is enabled, the *retry_count* metric is counted by the plugin in addition to the original Fluentd buffering mechanism.

## Configuration: kinesis_streams
Here are the `kinesis_streams` specific configurations.

### stream_name
Name of the stream to put data.

As of Fluentd v1, built-in placeholders are supported, and you can use them for this parameter.

**NOTE:**
Built-in placeholders require target key information in your buffer section attributes.

For example, when you specify the following `stream_name` configuration with a built-in placeholder:

```aconf
stream_name "${$.kubernetes.annotations.kinesis_streams}"
```

you need to specify the corresponding attributes in the buffer section:

```aconf
# $.kubernetes.annotations.kinesis_streams needs to be set in buffer attributes
<buffer $.kubernetes.annotations.kinesis_streams>
  # ...
</buffer>
```

For more details, refer to the [Placeholders section in the official Fluentd documentation](https://docs.fluentd.org/configuration/buffer-section#placeholders).

### partition_key
A key used to extract the partition key from the JSON object. Default `nil`, which means the partition key is generated randomly.
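
A minimal sketch of the choice described above, using a hypothetical helper named `partition_key_for` rather than the plugin's code. The 32-character random hex key and the error raised for a record that lacks the configured field are assumptions made for illustration.

```ruby
require 'securerandom'

# Hypothetical helper: picks the partition key for one record.
def partition_key_for(record, partition_key: nil)
  # Nothing configured: use a random key so records spread across shards.
  return SecureRandom.hex(16) if partition_key.nil?

  # Assumption: a record without the configured field raises KeyError.
  record.fetch(partition_key).to_s
end

record = { 'user_id' => 'u-42', 'message' => 'hello' }

puts partition_key_for(record, partition_key: 'user_id') # value taken from the record
puts partition_key_for(record)                           # random hex string
```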

## Configuration: kinesis_firehose
Here are the `kinesis_firehose` specific configurations.

### delivery_stream_name
Name of the delivery stream to put data.

As of Fluentd v1, built-in placeholders are supported, and you can use them for this parameter.

**NOTE:**
Built-in placeholders require target key information in your buffer section attributes.

For example, when you specify the following `delivery_stream_name` configuration with a built-in placeholder:

```aconf
delivery_stream_name "${$.kubernetes.annotations.kinesis_firehose_streams}"
```

you need to specify the corresponding attributes in the buffer section:

```aconf
# $.kubernetes.annotations.kinesis_firehose_streams needs to be set in buffer attributes
<buffer $.kubernetes.annotations.kinesis_firehose_streams>
  # ...
</buffer>
```

For more details, refer to the [Placeholders section in the official Fluentd documentation](https://docs.fluentd.org/configuration/buffer-section#placeholders).

### append_new_line
Boolean. Default `true`. If it is enabled, the plugin adds a newline character (`\n`) to each serialized record.
Before appending `\n`, the plugin calls chomp and removes the separator from the end of each record, as if [chomp_record](#chomp_record) were `true`. Therefore, you don't need to enable [chomp_record](#chomp_record) when you use the [kinesis_firehose](#kinesis_firehose) output with the default configuration ([append_new_line](#append_new_line) is `true`). If you set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (compatible format with plugin v2).

## Configuration: kinesis_streams_aggregated
Here are the `kinesis_streams_aggregated` specific configurations.

### stream_name
Name of the stream to put data.

As of Fluentd v1, built-in placeholders are supported, and you can use them for this parameter.

**NOTE:**
Built-in placeholders require target key information in your buffer section attributes.

For example, when you specify the following `stream_name` configuration with a built-in placeholder:

```aconf
stream_name "${$.kubernetes.annotations.kinesis_streams_aggregated}"
```

you need to specify the corresponding attributes in the buffer section:

```aconf
# $.kubernetes.annotations.kinesis_streams_aggregated needs to be set in buffer attributes
<buffer $.kubernetes.annotations.kinesis_streams_aggregated>
  # ...
</buffer>
```

For more details, refer to the [Placeholders section in the official Fluentd documentation](https://docs.fluentd.org/configuration/buffer-section#placeholders).

### fixed_partition_key
The value of a fixed partition key. Default `nil`, which means the partition key is generated randomly.

Note: if you specify this option, all records go to a single shard.
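
The reason a fixed key pins every record to one shard is that Kinesis Data Streams maps a partition key to a shard through the MD5 hash of the key. The sketch below models that mapping in Ruby; the evenly split hash space, the shard count of 4, and the key name `my-fixed-key` are assumptions for illustration, since real shards carry their own hash key ranges.

```ruby
require 'digest/md5'
require 'securerandom'

HASH_SPACE = 2**128 # MD5 yields a 128-bit integer

# Hypothetical model: shard_count shards that split the hash space evenly.
def shard_index(partition_key, shard_count)
  hash_key = Digest::MD5.hexdigest(partition_key).to_i(16)
  hash_key * shard_count / HASH_SPACE
end

# A fixed key always hashes to the same value, hence the same shard.
fixed = Array.new(5) { shard_index('my-fixed-key', 4) }

# Random keys hash to different values and spread over the shards.
random = Array.new(100) { shard_index(SecureRandom.hex(16), 4) }

puts "fixed key shards:  #{fixed.uniq.inspect}"
puts "random key shards: #{random.uniq.sort.inspect}"
```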

## Development

To launch a `fluentd` process with this plugin for development, follow the steps below:

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    make # installs gem dependencies
    bundle exec fluentd -c /path/to/fluent.conf

To launch with a specific version of Fluentd, use the `BUNDLE_GEMFILE` environment variable:

    BUNDLE_GEMFILE=$PWD/gemfiles/Gemfile.td-agent-3.3.0 bundle exec fluentd -c /path/to/fluent.conf

## Contributing

Bug reports and pull requests are welcome on [GitHub][github].

## Related Resources

* [Amazon Kinesis Data Streams Developer Guide](http://docs.aws.amazon.com/kinesis/latest/dev/introduction.html)
* [Amazon Kinesis Data Firehose Developer Guide](http://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html)

[fluentd]: https://www.fluentd.org/
[streams]: https://aws.amazon.com/kinesis/streams/
[firehose]: https://aws.amazon.com/kinesis/firehose/
[kpl]: https://github.com/awslabs/amazon-kinesis-producer/blob/master/aggregation-format.md
[td-agent]: https://github.com/treasure-data/omnibus-td-agent
[bundler]: https://bundler.io/
[region]: https://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region
[fluentd-buffer-section]: https://docs.fluentd.org/configuration/buffer-section
[fluentd-formatter-json]: https://docs.fluentd.org/formatter/json
[github]: https://github.com/awslabs/aws-fluent-plugin-kinesis
[formatter.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/formatter.rb
[inject.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin_helper/inject.rb
[fluentd-doc-kinesis]: https://docs.fluentd.org/how-to-guides/kinesis-stream
[fluent-plugin-s3]: https://github.com/fluent/fluent-plugin-s3
[v1-readme]: https://github.com/awslabs/aws-fluent-plugin-kinesis/blob/v1/README.md
data/Rakefile
ADDED
@@ -0,0 +1,26 @@
#
# Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

require "bundler/gem_tasks"

require 'rake/testtask'

task default: [:test]
Rake::TestTask.new do |test|
  test.libs << 'lib' << 'test'
  test.test_files = FileList['test/**/test_*.rb']
  test.options = '-v'
end

load 'benchmark/task.rake'
data/adp-fluent-plugin-kinesis.gemspec
ADDED
@@ -0,0 +1,71 @@
# coding: utf-8
#
# Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'fluent_plugin_kinesis/version'

Gem::Specification.new do |spec|
  spec.name = "adp-fluent-plugin-kinesis"
  spec.version = FluentPluginKinesis::VERSION
  spec.author = 'Amazon Web Services'
  spec.summary = %q{Fork of plugin created by AWS}
  spec.homepage = "https://github.com/awslabs/aws-fluent-plugin-kinesis"
  spec.license = "Apache-2.0"

  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]
  spec.required_ruby_version = '>= 2.3'

  spec.add_dependency "fluentd", ">= 0.14.22", "< 2"

  # This plugin is sometimes used with s3 plugin, so watch out for conflicts
  # https://rubygems.org/gems/fluent-plugin-s3
  # Exclude v1.5 to avoid aws-sdk dependency problem due to this issue
  # https://github.com/aws/aws-sdk-ruby/issues/1872
  # Exclude aws-sdk-kinesis v1.4 to avoid aws-sdk-core dependency problem with td-agent v3.1.1
  # NoMethodError: undefined method `event=' for #<Seahorse::Model::Shapes::ShapeRef:*>
  # https://github.com/aws/aws-sdk-ruby/commit/03d60f9d3d821e645bd2a3efca066f37350ef906#diff-c69f15af8ea3eb9ab152659476e04608R401
  # https://github.com/aws/aws-sdk-ruby/commit/571c2d0e5ff9c24ff72893a08a74790db591fb57#diff-a55155f04aa6559460a0814e264eb0cdR43
  # Exclude aws-sdk-kinesis v1.14 to avoid aws-sdk-core dependency problem with td-agent v3.4.1
  # LoadError: cannot load such file -- aws-sdk-core/plugins/transfer_encoding.rb
  # https://github.com/aws/aws-sdk-ruby/commit/bb61ed0a2fabc6b1f90b757f13f37d5aeae48d8a#diff-b493e941d32289cd2df7eebc3fc5be2cR26
  # https://github.com/aws/aws-sdk-ruby/commit/e26577d2a426a4be79cd2d9edc1a4a4176e388ba#diff-10f50e27b30c3dc522b3c25db5782e2e
  spec.add_dependency "aws-sdk-kinesis", "~> 1", "!= 1.4", "!= 1.5", "!= 1.14"
  # Exclude aws-sdk-firehose v1.9 to avoid aws-sdk-core dependency problem with td-agent v3.2.1
  # LoadError: cannot load such file -- aws-sdk-core/plugins/endpoint_discovery.rb
  # https://github.com/aws/aws-sdk-ruby/commit/85d8538a62255e58d9e176ee524a9f94354b51a0#diff-d51486091a10ada65b308b7f45966af1R18
  # https://github.com/aws/aws-sdk-ruby/commit/7c9584bc6473100df9aec9333ab491ad4faeeca8#diff-be94f87e58e00329a6c0e03e43d5c292
  # Exclude aws-sdk-firehose v1.15 to avoid aws-sdk-core dependency problem with td-agent v3.4.1
  # LoadError: cannot load such file -- aws-sdk-core/plugins/transfer_encoding.rb
  # https://github.com/aws/aws-sdk-ruby/commit/bb61ed0a2fabc6b1f90b757f13f37d5aeae48d8a#diff-d51486091a10ada65b308b7f45966af1R26
  # https://github.com/aws/aws-sdk-ruby/commit/e26577d2a426a4be79cd2d9edc1a4a4176e388ba#diff-10f50e27b30c3dc522b3c25db5782e2e
  spec.add_dependency "aws-sdk-firehose", "~> 1", "!= 1.5", "!= 1.9", "!= 1.15"

  spec.add_dependency "google-protobuf", "~> 3"

  spec.add_development_dependency "bundler", ">= 1.10"
  spec.add_development_dependency "rake", ">= 10.0"
  spec.add_development_dependency "test-unit", ">= 3.0.8"
  spec.add_development_dependency "test-unit-rr", ">= 1.0.3"
  spec.add_development_dependency "pry", ">= 0.10.1"
  spec.add_development_dependency "pry-byebug", ">= 3.3.0"
  spec.add_development_dependency "pry-stack_explorer", ">= 0.4.9.2"
  spec.add_development_dependency "net-empty_port", ">= 0.0.2"
  spec.add_development_dependency "mocha", ">= 1.1.0"
  spec.add_development_dependency "webmock", ">= 1.24.2"
  spec.add_development_dependency "fakefs", ">= 0.8.1"
end