fluent-plugin-kinesis-modified 3.1.3

data/Makefile ADDED
#
# Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

.PHONY: run run-td-agent-2.3.5 test test-td-agent-2.3.5 install $(wildcard test/test_*.rb) $(wildcard test/**/test_*.rb) benchmark benchmark-remote

all:
	bundle install

run:
	bundle exec fluentd -v

run-td-agent-2.3.5:
	RBENV_VERSION=2.1.10 BUNDLE_GEMFILE=./gemfiles/Gemfile.td-agent-2.3.5 bundle update
	RBENV_VERSION=2.1.10 BUNDLE_GEMFILE=./gemfiles/Gemfile.td-agent-2.3.5 bundle exec fluentd -v

test:
	bundle exec rake test

test-td-agent-2.3.5:
	RBENV_VERSION=2.1.10 BUNDLE_GEMFILE=./gemfiles/Gemfile.td-agent-2.3.5 bundle update
	RBENV_VERSION=2.1.10 BUNDLE_GEMFILE=./gemfiles/Gemfile.td-agent-2.3.5 bundle exec rake test

install:
	bundle exec rake install:local

$(wildcard test/test_*.rb) $(wildcard test/**/test_*.rb):
	bundle exec rake test TEST=$@

benchmark:
	bundle exec rake benchmark:local

benchmark-remote:
	bundle exec rake benchmark:remote
data/NOTICE.txt ADDED
Fluent Plugin for Amazon Kinesis
Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
data/README.md ADDED
# Fluent plugin for Amazon Kinesis

[![Gitter](https://badges.gitter.im/awslabs/aws-fluent-plugin-kinesis.svg)](https://gitter.im/awslabs/aws-fluent-plugin-kinesis?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
[![Build Status](https://travis-ci.org/awslabs/aws-fluent-plugin-kinesis.svg?branch=master)](https://travis-ci.org/awslabs/aws-fluent-plugin-kinesis)
[![Gem Version](https://badge.fury.io/rb/fluent-plugin-kinesis.svg)](https://rubygems.org/gems/fluent-plugin-kinesis)

[Fluentd][fluentd] output plugin that sends events to [Amazon Kinesis Data Streams][streams] and [Amazon Kinesis Data Firehose][firehose]. It also supports the [KPL aggregated record format][kpl]. This gem includes three output plugins:

- `kinesis_streams`
- `kinesis_firehose`
- `kinesis_streams_aggregated`

There is also [documentation on the official Fluentd site][fluentd-doc-kinesis].

**Note**: This README is for v3. Plugin v3 is almost compatible with v2. If you use v1, see the [old README][v1-readme].

## Installation
This Fluentd plugin is available as the `fluent-plugin-kinesis` gem from RubyGems.

    gem install fluent-plugin-kinesis

Or you can install this plugin for [td-agent][td-agent] as:

    td-agent-gem install fluent-plugin-kinesis

If you would like to build and install it yourself, please see below. You need [bundler][bundler] for this.

To use it with Fluentd (Fluentd will also be installed by the steps below):

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    bundle exec rake build
    bundle exec rake install

To use it with td-agent (you have to install td-agent before installing this plugin):

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    bundle exec rake build
    fluent-gem install pkg/fluent-plugin-kinesis

Or just download the source and add it to your Ruby library path. Below is a sample that specifies the library path via RUBYLIB:

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    bundle install
    export RUBYLIB=$RUBYLIB:/path/to/aws-fluent-plugin-kinesis/lib

## Dependencies
* Ruby 2.1.0+
* Fluentd 0.14.10+

## Basic Usage
Here are the general steps for using this plugin:

1. Install.
1. Edit the configuration.
1. Run Fluentd or td-agent.

You can run this plugin with Fluentd as follows:

1. Install.
1. Edit the configuration file and save it as 'fluentd.conf'.
1. Then, run `fluentd -c /path/to/fluentd.conf`.

To run with td-agent:

1. Install.
1. Edit the configuration file provided by td-agent.
1. Then, run or restart td-agent.

## Getting started
This section assumes you use Amazon EC2 instances with an instance profile. If you want to use specific credentials, see [Credentials](#configuration-credentials).

### kinesis_streams

    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      partition_key key # Otherwise, use random partition key
    </match>

For more details, see [Configuration: kinesis_streams](#configuration-kinesis_streams).

### kinesis_firehose

    <match your_tag>
      @type kinesis_firehose
      region us-east-1
      delivery_stream_name your_stream
    </match>

For more details, see [Configuration: kinesis_firehose](#configuration-kinesis_firehose).

### kinesis_streams_aggregated

    <match your_tag>
      @type kinesis_streams_aggregated
      region us-east-1
      stream_name your_stream
      # Unlike kinesis_streams, there is no way to use a dynamic partition key:
      # fixed_partition_key or random.
    </match>

For more details, see [Configuration: kinesis_streams_aggregated](#configuration-kinesis_streams_aggregated).

### For better throughput
Add configurations like below:

    flush_interval 1
    chunk_limit_size 1m
    flush_thread_interval 0.1
    flush_thread_burst_interval 0.01
    flush_thread_count 15

When you use Fluentd v1.0 (td-agent 3), write these configurations in the buffer section. For more details, see [Buffer section configurations](https://docs.fluentd.org/articles/buffer-section).

Note: Each value should be adjusted to your own system.
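For example, with Fluentd v1.0 the throughput settings above would sit in a `<buffer>` subsection of the match block; this is a minimal sketch (the tag and stream name are placeholders, and the values should still be tuned for your system):

    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      <buffer>
        flush_interval 1
        chunk_limit_size 1m
        flush_thread_interval 0.1
        flush_thread_burst_interval 0.01
        flush_thread_count 15
      </buffer>
    </match>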

## Configuration: Credentials
To put records into Amazon Kinesis Data Streams or Firehose, you need to provide AWS security credentials. If no credentials are specified in the config file, this plugin automatically fetches them the same way the AWS SDK for Ruby does (environment variables, shared profile, and instance profile).

This plugin uses the same configuration as [fluent-plugin-s3][fluent-plugin-s3].

**aws_key_id**

AWS access key id. This parameter is required when your agent is not running on an EC2 instance with an IAM role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.

**aws_sec_key**

AWS secret key. This parameter is required when your agent is not running on an EC2 instance with an IAM role.
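For example, a static-credentials configuration might look like the following sketch (the key values and stream name are placeholders):

    <match your_tag>
      @type kinesis_streams
      aws_key_id YOUR_ACCESS_KEY_ID
      aws_sec_key YOUR_SECRET_ACCESS_KEY
      region us-east-1
      stream_name your_stream
    </match>

On EC2 with an IAM role, omit both keys and use `instance_profile_credentials` instead.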

**aws_iam_retries**

The number of attempts to make (with exponential backoff) when loading instance profile credentials from the EC2 metadata service using an IAM role. Defaults to 5 retries.

### assume_role_credentials
Typically, you can use AssumeRole for cross-account access or federation.

    <match *>
      @type kinesis_streams

      <assume_role_credentials>
        role_arn ROLE_ARN
        role_session_name ROLE_SESSION_NAME
      </assume_role_credentials>
    </match>

See also:

* [Using IAM Roles - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
* [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
* [Aws::AssumeRoleCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)

**role_arn (required)**

The Amazon Resource Name (ARN) of the role to assume.

**role_session_name (required)**

An identifier for the assumed role session.

**policy**

An IAM policy in JSON format.

**duration_seconds**

The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value is set to 3600 seconds.

**external_id**

A unique identifier that is used by third parties when assuming roles in their customers' accounts.

**sts_http_proxy**

Proxy URL for proxying requests to the AWS STS API. This needs to be set up independently of the global http_proxy parameter, for the use case in which requests to the Kinesis API go via a Kinesis VPC endpoint while requests to the STS API have to go through an HTTP proxy.
It should be added to the assume_role_credentials configuration stanza in the following format:

    sts_http_proxy http://[username:password]@hostname:port

### instance_profile_credentials

Retrieve temporary security credentials via an HTTP request. This is useful on EC2 instances.

    <match *>
      @type kinesis_streams

      <instance_profile_credentials>
        ip_address IP_ADDRESS
        port PORT
      </instance_profile_credentials>
    </match>

See also:

* [Aws::InstanceProfileCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
* [Temporary Security Credentials - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
* [Instance Metadata and User Data - Amazon Elastic Compute Cloud](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)

**retries**

Number of times to retry when retrieving credentials. Default is 5.

**ip_address**

Default is 169.254.169.254.

**port**

Default is 80.

**http_open_timeout**

Default is 5.

**http_read_timeout**

Default is 5.

### shared_credentials

This loads AWS access credentials from a local ini file. This is useful for local development.

    <match *>
      @type kinesis_streams

      <shared_credentials>
        path PATH
        profile_name PROFILE_NAME
      </shared_credentials>
    </match>

See also:

* [Aws::SharedCredentials](http://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)

**path**

Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".

**profile_name**

Defaults to 'default', or the value of the `AWS_PROFILE` environment variable if set.

### process_credentials

This loads AWS access credentials from an external process.

    <match *>
      @type kinesis_streams

      <process_credentials>
        process CMD
      </process_credentials>
    </match>

See also:

* [Aws::ProcessCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/ProcessCredentials.html)
* [Sourcing Credentials From External Processes](https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#sourcing-credentials-from-external-processes)

**process (required)**

Command to be executed as an external process.

## Configuration: Format

### format (section)
This plugin uses `Fluent::TextFormatter` to serialize records to strings. See [formatter.rb] for more details. By default, it uses the `json` formatter, which is the same as specifying:

    <match *>
      @type kinesis_streams

      <format>
        @type json
      </format>
    </match>

For other configurations of the `json` formatter, see [json Formatter Plugin](https://docs.fluentd.org/articles/formatter_json).

### inject (section)
This plugin uses `Fluent::TimeFormatter` and other injection configurations. See [inject.rb] for more details.

For example, the config below will add a `time` field whose value is the event time with nanosecond precision and a `tag` field whose value is its tag.

    <match *>
      @type kinesis_streams

      <inject>
        time_key time
        tag_key tag
      </inject>
    </match>

By default, `time_type string` and `time_format %Y-%m-%dT%H:%M:%S.%N%z` are set, which matches Elasticsearch's sub-second format. You can override them with any configuration.

Also, there are some format-related options below:

### data_key
If your record contains a field whose string value should be sent to Amazon Kinesis directly (without the formatter), use this parameter to specify the field. In that case, fields other than **data_key** are thrown away and never sent to Amazon Kinesis. Default is `nil`, which means the whole record will be formatted and sent.
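For instance, with the hypothetical config below, a record like `{"log":"hello","level":"info"}` would be sent as just the string `hello`; the `level` field would be dropped:

    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      data_key log
    </match>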

### compression
Specifies the compression applied to the data of each record. The only currently accepted option is `zlib`; with any other value, no compression is performed.
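To enable zlib compression, for instance (tag and stream name are placeholders):

    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      compression zlib
    </match>

Note that consumers must then decompress each record themselves.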

### log_truncate_max_size
Integer, default 1024. When emitting a log entry, the message will be truncated to this size to avoid an infinite loop when the log is also sent to Kinesis. The value 0 means no truncation.

### chomp_record
Boolean. Default `false`. If enabled, the plugin calls chomp and removes the separator from the end of each record. This option provides a format compatible with plugin v2. See [#142](https://github.com/awslabs/aws-fluent-plugin-kinesis/issues/142) for more details.
When you use the [kinesis_firehose](#kinesis_firehose) output, the [append_new_line](#append_new_line) option is `true` by default. If [append_new_line](#append_new_line) is enabled, the plugin calls chomp (as if [chomp_record](#chomp_record) were `true`) before appending `\n` to each record. Therefore, you don't need to enable the [chomp_record](#chomp_record) option when you use [kinesis_firehose](#kinesis_firehose) with the default configuration. If you want to set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (format compatible with plugin v2).

## Configuration: API
### region
AWS region of your stream. It should be in a form like `us-east-1` or `us-west-2`. Refer to [Regions and Endpoints in the AWS General Reference][region] for supported regions.

Default is `nil`, which means the region is taken from the environment variable `AWS_REGION`.

### max_record_size
The upper limit on the size of each record. Default is 1 MB, which is the Kinesis limit.

### http_proxy
HTTP proxy for API calls. Default is `nil`.

### endpoint
API endpoint URL, for testing. Default is `nil`.

### ssl_verify_peer
Boolean. Disable it if you do not want to verify the SSL connection, for testing. Default is `true`.

### debug
Boolean. Enable it if you need to debug Amazon Kinesis Data Firehose API calls. Default is `false`.

## Configuration: Batch request
### retries_on_batch_request
Integer, default is 8. The plugin puts multiple records to Amazon Kinesis Data Streams in batches using PutRecords. A set of records in a batch may fail for reasons documented in the Kinesis Service API Reference for PutRecords. Failed records will be retried **retries_on_batch_request** times. If a record fails all retries, an error log will be emitted.

### reset_backoff_if_success
Boolean, default `true`. If enabled, each retry checks the number of records that succeeded in the previous batch request and resets the exponential backoff if any succeeded. Because a batch request can span multiple shards, simple exponential backoff for the whole batch would not work in some cases.

### batch_request_max_count
Integer. The maximum number of records in a batch request built from a record chunk. It can't exceed the default value because that is the API limit.

Default:

- `kinesis_streams`: 500
- `kinesis_firehose`: 500
- `kinesis_streams_aggregated`: 100,000

### batch_request_max_size
Integer. The maximum size of a batch request built from a record chunk. It can't exceed the default value because that is the API limit.

Default:

- `kinesis_streams`: 5 MB
- `kinesis_firehose`: 4 MB
- `kinesis_streams_aggregated`: 1 MB

## Configuration: kinesis_streams
Here are the `kinesis_streams` specific configurations.

### stream_name
Name of the stream to put data into.

### partition_key
The name of the key used to extract the partition key from each JSON record. Default is `nil`, which means the partition key will be generated randomly.
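For example, with the hypothetical field name `user_id` below, a record like `{"user_id":"alice","action":"login"}` would be put with partition key `alice`, so all of that user's events land on the same shard:

    <match your_tag>
      @type kinesis_streams
      region us-east-1
      stream_name your_stream
      partition_key user_id
    </match>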

## Configuration: kinesis_firehose
Here are the `kinesis_firehose` specific configurations.

### delivery_stream_name
Name of the delivery stream to put data into.

### append_new_line
Boolean. Default `true`. If enabled, the plugin adds a newline character (`\n`) to each serialized record.
Before appending `\n`, the plugin calls chomp and removes the separator from the end of each record, as if [chomp_record](#chomp_record) were `true`. Therefore, you don't need to enable the [chomp_record](#chomp_record) option when you use the [kinesis_firehose](#kinesis_firehose) output with the default configuration ([append_new_line](#append_new_line) is `true`). If you want to set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (format compatible with plugin v2).
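For instance, to deliver records exactly as formatted, without the trailing newline (the delivery stream name is a placeholder):

    <match your_tag>
      @type kinesis_firehose
      region us-east-1
      delivery_stream_name your_stream
      append_new_line false
    </match>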

## Configuration: kinesis_streams_aggregated
Here are the `kinesis_streams_aggregated` specific configurations.

### stream_name
Name of the stream to put data into.

### fixed_partition_key
A fixed partition key value. Default is `nil`, which means the partition key will be generated randomly.

Note: if you specify this option, all records go to a single shard.
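For example, to pin every aggregated record to one shard with a fixed key (the key value below is arbitrary):

    <match your_tag>
      @type kinesis_streams_aggregated
      region us-east-1
      stream_name your_stream
      fixed_partition_key my_fixed_key
    </match>

Leaving `fixed_partition_key` unset keeps the random key and spreads aggregated records across shards.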

## Development

To launch the `fluentd` process with this plugin for development, follow the steps below:

    git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
    cd aws-fluent-plugin-kinesis
    make # installs gem dependencies
    bundle exec fluentd -c /path/to/fluent.conf

To launch with a specific version of Fluentd, use the `BUNDLE_GEMFILE` environment variable:

    BUNDLE_GEMFILE=$PWD/gemfiles/Gemfile.td-agent-3.3.0 bundle exec fluentd -c /path/to/fluent.conf

## Contributing

Bug reports and pull requests are welcome on [GitHub][github].

## Related Resources

* [Amazon Kinesis Data Streams Developer Guide](http://docs.aws.amazon.com/kinesis/latest/dev/introduction.html)
* [Amazon Kinesis Data Firehose Developer Guide](http://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html)

[fluentd]: http://fluentd.org/
[streams]: https://aws.amazon.com/kinesis/streams/
[firehose]: https://aws.amazon.com/kinesis/firehose/
[kpl]: https://github.com/awslabs/amazon-kinesis-producer/blob/master/aggregation-format.md
[td-agent]: https://github.com/treasure-data/td-agent
[bundler]: http://bundler.io/
[region]: http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region
[fluentd_buffer]: http://docs.fluentd.org/articles/buffer-plugin-overview
[github]: https://github.com/awslabs/aws-fluent-plugin-kinesis
[formatter.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/formatter.rb
[inject.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin_helper/inject.rb
[fluentd-doc-kinesis]: http://docs.fluentd.org/articles/kinesis-stream
[fluent-plugin-s3]: https://github.com/fluent/fluent-plugin-s3
[v1-readme]: https://github.com/awslabs/aws-fluent-plugin-kinesis/blob/v1/README.md