adp-fluent-plugin-kinesis 0.0.1

Files changed (47)
  1. checksums.yaml +7 -0
  2. data/.github/PULL_REQUEST_TEMPLATE.md +6 -0
  3. data/.gitignore +15 -0
  4. data/.travis.yml +56 -0
  5. data/CHANGELOG.md +172 -0
  6. data/CODE_OF_CONDUCT.md +4 -0
  7. data/CONTRIBUTING.md +61 -0
  8. data/CONTRIBUTORS.txt +8 -0
  9. data/Gemfile +18 -0
  10. data/LICENSE.txt +201 -0
  11. data/Makefile +44 -0
  12. data/NOTICE.txt +2 -0
  13. data/README.md +559 -0
  14. data/Rakefile +26 -0
  15. data/adp-fluent-plugin-kinesis.gemspec +71 -0
  16. data/benchmark/task.rake +106 -0
  17. data/gemfiles/Gemfile.fluentd-0.14.22 +6 -0
  18. data/gemfiles/Gemfile.fluentd-1.13.3 +6 -0
  19. data/gemfiles/Gemfile.td-agent-3.1.0 +17 -0
  20. data/gemfiles/Gemfile.td-agent-3.1.1 +17 -0
  21. data/gemfiles/Gemfile.td-agent-3.2.0 +17 -0
  22. data/gemfiles/Gemfile.td-agent-3.2.1 +17 -0
  23. data/gemfiles/Gemfile.td-agent-3.3.0 +17 -0
  24. data/gemfiles/Gemfile.td-agent-3.4.0 +17 -0
  25. data/gemfiles/Gemfile.td-agent-3.4.1 +17 -0
  26. data/gemfiles/Gemfile.td-agent-3.5.0 +17 -0
  27. data/gemfiles/Gemfile.td-agent-3.5.1 +17 -0
  28. data/gemfiles/Gemfile.td-agent-3.6.0 +17 -0
  29. data/gemfiles/Gemfile.td-agent-3.7.0 +17 -0
  30. data/gemfiles/Gemfile.td-agent-3.7.1 +17 -0
  31. data/gemfiles/Gemfile.td-agent-3.8.0 +17 -0
  32. data/gemfiles/Gemfile.td-agent-3.8.1 +18 -0
  33. data/gemfiles/Gemfile.td-agent-4.0.0 +25 -0
  34. data/gemfiles/Gemfile.td-agent-4.0.1 +21 -0
  35. data/gemfiles/Gemfile.td-agent-4.1.0 +21 -0
  36. data/gemfiles/Gemfile.td-agent-4.1.1 +21 -0
  37. data/gemfiles/Gemfile.td-agent-4.2.0 +21 -0
  38. data/lib/fluent/plugin/kinesis.rb +174 -0
  39. data/lib/fluent/plugin/kinesis_helper/aggregator.rb +101 -0
  40. data/lib/fluent/plugin/kinesis_helper/api.rb +254 -0
  41. data/lib/fluent/plugin/kinesis_helper/client.rb +210 -0
  42. data/lib/fluent/plugin/kinesis_helper/compression.rb +27 -0
  43. data/lib/fluent/plugin/out_kinesis_firehose.rb +60 -0
  44. data/lib/fluent/plugin/out_kinesis_streams.rb +72 -0
  45. data/lib/fluent/plugin/out_kinesis_streams_aggregated.rb +79 -0
  46. data/lib/fluent_plugin_kinesis/version.rb +17 -0
  47. metadata +339 -0
data/README.md ADDED
@@ -0,0 +1,559 @@
1
+ # Fluent plugin for Amazon Kinesis
2
+
3
+ [![Build Status](https://api.travis-ci.com/awslabs/aws-fluent-plugin-kinesis.svg?branch=master)](https://app.travis-ci.com/github/awslabs/aws-fluent-plugin-kinesis)
4
+ [![Gem Version](https://badge.fury.io/rb/fluent-plugin-kinesis.svg)](https://rubygems.org/gems/fluent-plugin-kinesis)
5
+ [![Gem Downloads](https://img.shields.io/gem/dt/fluent-plugin-kinesis.svg)](https://rubygems.org/gems/fluent-plugin-kinesis)
6
+
7
+ [Fluentd][fluentd] output plugin
8
+ that sends events to [Amazon Kinesis Data Streams][streams] and [Amazon Kinesis Data Firehose][firehose]. It also supports the [KPL Aggregated Record Format][kpl]. This gem includes three output plugins:
9
+
10
+ - `kinesis_streams`
11
+ - `kinesis_firehose`
12
+ - `kinesis_streams_aggregated`
13
+
14
+ See also the [documentation on the official Fluentd site][fluentd-doc-kinesis].
15
+
16
+ **Note**: This README is for v3. Plugin v3 is almost compatible with v2. If you use v1, see the [old README][v1-readme].
17
+
18
+ ## Installation
19
+ This Fluentd plugin is available as the `fluent-plugin-kinesis` gem from RubyGems.
20
+
21
+ gem install fluent-plugin-kinesis
22
+
23
+ Or you can install this plugin for [td-agent][td-agent] as:
24
+
25
+ td-agent-gem install fluent-plugin-kinesis
26
+
27
+ If you would like to build and install the plugin yourself, see the steps below. You need [bundler][bundler] for this.
28
+
29
+ When using plain Fluentd, the process below also installs Fluentd itself:
30
+
31
+ git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
32
+ cd aws-fluent-plugin-kinesis
33
+ bundle install
34
+ bundle exec rake build
35
+ bundle exec rake install
36
+
37
+ You can also use this plugin with td-agent. You have to install td-agent before installing this plugin.
38
+
39
+ git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
40
+ cd aws-fluent-plugin-kinesis
41
+ bundle install
42
+ bundle exec rake build
43
+ fluent-gem install pkg/fluent-plugin-kinesis
44
+
45
+ Or just download the source and specify your Ruby library path. Below is a sample that sets the library path via RUBYLIB.
46
+
47
+ git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
48
+ cd aws-fluent-plugin-kinesis
49
+ bundle install
50
+ export RUBYLIB=$RUBYLIB:/path/to/aws-fluent-plugin-kinesis/lib
51
+
52
+ ## Dependencies
53
+ * Ruby 2.3.0+
54
+ * Fluentd 0.14.22+ (td-agent v3.1.0+)
55
+
56
+ ## Basic Usage
57
+ Here are general procedures for using this plugin:
58
+
59
+ 1. Install.
60
+ 1. Edit configuration
61
+ 1. Run Fluentd or td-agent
62
+
63
+ You can run this plugin with Fluentd as follows:
64
+
65
+ 1. Install.
66
+ 1. Edit configuration file and save it as 'fluentd.conf'.
67
+ 1. Then, run `fluentd -c /path/to/fluentd.conf`
68
+
69
+ To run with td-agent, it would be as follows:
70
+
71
+ 1. Install.
72
+ 1. Edit configuration file provided by td-agent.
73
+ 1. Then, run or restart td-agent.
74
+
75
+ ## Getting started
76
+ Assume you use Amazon EC2 instances with Instance profile. If you want to use specific credentials, see [Credentials](#configuration-credentials).
77
+
78
+ ### kinesis_streams
79
+ <match your_tag>
80
+ @type kinesis_streams
81
+ region us-east-1
82
+ stream_name your_stream
83
+ partition_key key # Otherwise, use random partition key
84
+ </match>
85
+ For more details, see [Configuration: kinesis_streams](#configuration-kinesis_streams).
86
+
87
+ ### kinesis_firehose
88
+ <match your_tag>
89
+ @type kinesis_firehose
90
+ region us-east-1
91
+ delivery_stream_name your_stream
92
+ </match>
93
+ For more details, see [Configuration: kinesis_firehose](#configuration-kinesis_firehose).
94
+
95
+ ### kinesis_streams_aggregated
96
+ <match your_tag>
97
+ @type kinesis_streams_aggregated
98
+ region us-east-1
99
+ stream_name your_stream
100
+ # Unlike kinesis_streams, there is no way to use dynamic partition key.
101
+ # fixed_partition_key or random.
102
+ </match>
103
+ For more details, see [Configuration: kinesis_streams_aggregated](#configuration-kinesis_streams_aggregated).
104
+
105
+ ### For better throughput
106
+ Add configurations like below:
107
+
108
+ flush_interval 1
109
+ chunk_limit_size 1m
110
+ flush_thread_interval 0.1
111
+ flush_thread_burst_interval 0.01
112
+ flush_thread_count 15
113
+
114
+ When you use Fluentd v1.0 (td-agent 3), write these configurations in the buffer section. For more details, see [Config: Buffer Section][fluentd-buffer-section].
115
+
116
+ Note: Adjust each value to suit your own system.
117
+
118
+ ## Configuration: Credentials
119
+ To put records into Amazon Kinesis Data Streams or Firehose, you need to provide AWS security credentials. If you do not specify credentials in the config file, this plugin fetches them automatically, just as the AWS SDK for Ruby does (environment variables, shared profile, and instance profile).
120
+
121
+ This plugin uses the same configuration in [fluent-plugin-s3][fluent-plugin-s3], but also supports aws session tokens for temporary credentials.
122
+
123
+ **aws_key_id**
124
+
125
+ AWS access key id. This parameter is required when your agent is not running on EC2 instance with an IAM Role. When using an IAM role, make sure to configure `instance_profile_credentials`. Usage can be found below.
126
+
127
+ **aws_sec_key**
128
+
129
+ AWS secret key. This parameter is required when your agent is not running on EC2 instance with an IAM Role.
130
+
131
+ **aws_ses_token**
132
+
133
+ AWS session token. This parameter is optional, but can be provided if using MFA or temporary credentials when your agent is not running on EC2 instance with an IAM Role.
134
+
135
+ **aws_iam_retries**
136
+
137
+ The number of attempts to make (with exponential backoff) when loading instance profile credentials from the EC2 metadata service using an IAM role. Defaults to 5 retries.
138
+
139
+ ### assume_role_credentials
140
+ Typically, you can use AssumeRole for cross-account access or federation.
141
+
142
+ <match *>
143
+ @type kinesis_streams
144
+
145
+ <assume_role_credentials>
146
+ role_arn ROLE_ARN
147
+ role_session_name ROLE_SESSION_NAME
148
+ </assume_role_credentials>
149
+ </match>
150
+
151
+ See also:
152
+
153
+ * [Using IAM Roles - AWS Identity and Access
154
+ Management](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
155
+ * [Aws::STS::Client](https://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
156
+ * [Aws::AssumeRoleCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/AssumeRoleCredentials.html)
157
+
158
+ **role_arn (required)**
159
+
160
+ The Amazon Resource Name (ARN) of the role to assume.
161
+
162
+ **role_session_name (required)**
163
+
164
+ An identifier for the assumed role session.
165
+
166
+ **policy**
167
+
168
+ An IAM policy in JSON format.
169
+
170
+ **duration_seconds**
171
+
172
+ The duration, in seconds, of the role session. The value can range from 900 seconds (15 minutes) to 3600 seconds (1 hour). By default, the value is set to 3600 seconds.
173
+
174
+ **external_id**
175
+
176
+ A unique identifier that is used by third parties when assuming roles in their customers' accounts.
177
+
178
+ **sts_http_proxy**
179
+
180
+ Proxy URL for proxying requests to the Amazon STS service API. This needs to be set up independently from the global http_proxy parameter for the use case in which requests to the Kinesis API go via a Kinesis VPC endpoint but requests to the STS API have to go via an HTTP proxy.
181
+ It should be added to the assume_role_credentials configuration stanza in the following format:
182
+ sts_http_proxy http://[username:password]@hostname:port
183
+
184
+ **sts_endpoint_url**
185
+
186
+ STS API endpoint url. This can be used to override the default global STS API endpoint of sts.amazonaws.com. Using regional endpoints may be preferred to reduce latency, and are required if utilizing a PrivateLink VPC Endpoint for STS API calls.
187
+
188
+
189
+ ### web_identity_credentials
190
+
191
+ Similar to the assume_role_credentials, but for usage in EKS.
192
+
193
+ <match *>
194
+ @type kinesis_streams
195
+
196
+ <web_identity_credentials>
197
+ role_arn ROLE_ARN
198
+ role_session_name ROLE_SESSION_NAME
199
+ web_identity_token_file AWS_WEB_IDENTITY_TOKEN_FILE
200
+ </web_identity_credentials>
201
+ </match>
202
+
203
+ See also:
204
+
205
+ * [Using IAM Roles - AWS Identity and Access Management](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use.html)
206
+ * [IAM Roles For Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html)
207
+ * [Aws::STS::Client](http://docs.aws.amazon.com/sdkforruby/api/Aws/STS/Client.html)
208
+ * [Aws::AssumeRoleWebIdentityCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/AssumeRoleWebIdentityCredentials.html)
209
+
210
+ **role_arn (required)**
211
+
212
+ The Amazon Resource Name (ARN) of the role to assume.
213
+
214
+ **role_session_name (required)**
215
+
216
+ An identifier for the assumed role session.
217
+
218
+ **web_identity_token_file (required)**
219
+
220
+ The absolute path to the file on disk containing the OIDC token.
221
+
222
+ **policy**
223
+
224
+ An IAM policy in JSON format.
225
+
226
+ **duration_seconds**
227
+
228
+ The duration, in seconds, of the role session. The value can range from
229
+ 900 seconds (15 minutes) to 43200 seconds (12 hours). By default, the value
230
+ is set to 3600 seconds (1 hour).
231
+
232
+ ### instance_profile_credentials
233
+
234
+ Retrieve temporary security credentials via HTTP request. This is useful on an EC2 instance.
235
+
236
+ <match *>
237
+ @type kinesis_streams
238
+
239
+ <instance_profile_credentials>
240
+ ip_address IP_ADDRESS
241
+ port PORT
242
+ </instance_profile_credentials>
243
+ </match>
244
+
245
+ See also:
246
+
247
+ * [Aws::InstanceProfileCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/InstanceProfileCredentials.html)
248
+ * [Temporary Security Credentials - AWS Identity and Access
249
+ Management](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html)
250
+ * [Instance Metadata and User Data - Amazon Elastic Compute
251
+ Cloud](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html)
252
+
253
+ **retries**
254
+
255
+ Number of times to retry when retrieving credentials. Default is 5.
256
+
257
+ **ip_address**
258
+
259
+ Default is 169.254.169.254.
260
+
261
+ **port**
262
+
263
+ Default is 80.
264
+
265
+ **http_open_timeout**
266
+
267
+ Default is 5.
268
+
269
+ **http_read_timeout**
270
+
271
+ Default is 5.
272
+
273
+ ### shared_credentials
274
+
275
+ This loads AWS access credentials from a local ini file. This is useful for local development.
276
+
277
+ <match *>
278
+ @type kinesis_streams
279
+
280
+ <shared_credentials>
281
+ path PATH
282
+ profile_name PROFILE_NAME
283
+ </shared_credentials>
284
+ </match>
285
+
286
+ See also:
287
+
288
+ * [Aws::SharedCredentials](https://docs.aws.amazon.com/sdkforruby/api/Aws/SharedCredentials.html)
289
+
290
+ **path**
291
+
292
+ Path to the shared file. Defaults to "#{Dir.home}/.aws/credentials".
293
+
294
+ **profile_name**
295
+
296
+ Defaults to 'default' or `ENV['AWS_PROFILE']`.
297
+
298
+ ### process_credentials
299
+
300
+ This loads AWS access credentials from an external process.
301
+
302
+ <match *>
303
+ @type kinesis_streams
304
+
305
+ <process_credentials>
306
+ process CMD
307
+ </process_credentials>
308
+ </match>
309
+
310
+ See also:
311
+
312
+ * [Aws::ProcessCredentials](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/ProcessCredentials.html)
313
+ * [Sourcing Credentials From External Processes](https://docs.aws.amazon.com/cli/latest/topic/config-vars.html#sourcing-credentials-from-external-processes)
314
+
315
+ **process (required)**
316
+
317
+ Command to be executed as an external process.
318
+
319
+ ## Configuration: Format
320
+
321
+ ### format (section)
322
+ This plugin uses `Fluent::TextFormatter` to serialize each record to a string. See [formatter.rb] for more details. By default, it uses the `json` formatter, the same as specifying it explicitly as below:
323
+
324
+ <match *>
325
+ @type kinesis_streams
326
+
327
+ <format>
328
+ @type json
329
+ </format>
330
+ </match>
331
+
332
+ For other configurations of `json` formatter, see [json formatter plugin][fluentd-formatter-json].
333
+
334
+ ### inject (section)
335
+ This plugin uses `Fluent::TimeFormatter` and other injection configurations. See [inject.rb] for more details.
336
+
337
+ For example, the config below adds a `time` field whose value is the event time with nanosecond precision and a `tag` field whose value is the event's tag.
338
+
339
+ <match *>
340
+ @type kinesis_streams
341
+
342
+ <inject>
343
+ time_key time
344
+ tag_key tag
345
+ </inject>
346
+ </match>
347
+
348
+ By default, `time_type string` and `time_format %Y-%m-%dT%H:%M:%S.%N%z` are already set, to match the Elasticsearch sub-second format. You can still override them with any other configuration.
349
+
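As a rough illustration of what that default `time_format` produces, the sketch below renders a timestamp with Ruby's `Time#strftime`. The helper name `format_event_time` and the constant are hypothetical and are not part of the plugin or of Fluentd.

```ruby
# Hypothetical helper: render an event time with the default
# time_format string quoted above.
DEFAULT_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%N%z'

def format_event_time(time, format = DEFAULT_TIME_FORMAT)
  time.strftime(format)
end

# One second and 123456789 nanoseconds after the Unix epoch, in UTC.
event_time = Time.at(1, 123_456_789, :nsec).utc
puts format_event_time(event_time)
```

`%N` keeps nine fractional digits, which is what gives the injected `time` field its nanosecond resolution.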
350
+ Also, there are some format related options below:
351
+
352
+ ### data_key
353
+ If your record contains a field whose string should be sent to Amazon Kinesis directly (without the formatter), use this parameter to specify the field. In that case, fields other than **data_key** are thrown away and never sent to Amazon Kinesis. Default `nil`, which means the whole record will be formatted and sent.
354
+
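The effect of `data_key` can be sketched in a few lines of Ruby. This is only an illustration of the behaviour described above, not the plugin's code: the method name `serialize` is hypothetical, and JSON stands in for whatever formatter is configured.

```ruby
require 'json'

# Hypothetical sketch of the data_key behaviour.
def serialize(record, data_key: nil)
  if data_key.nil?
    # No data_key: the whole record goes through the formatter (JSON here).
    JSON.generate(record)
  else
    # data_key set: only that field's string is sent, the rest is dropped.
    # fetch raises KeyError if the field is missing from the record.
    record.fetch(data_key).to_s
  end
end

record = { 'message' => 'hello', 'host' => 'web-1' }
puts serialize(record)
puts serialize(record, data_key: 'message')
```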
355
+ ### compression
356
+ Specifies the compression method applied to the data of each record. Currently accepted options are `zlib` and `gzip`. Otherwise, no compression will be performed.
357
+
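Consumers of the stream have to reverse whichever compression was chosen. The sketch below uses Ruby's standard `zlib` library to show the two options side by side. The method names `compress` and `decompress` are hypothetical, and the mapping of the `zlib` option to `Zlib::Deflate` is an assumption about the plugin, not something this README states.

```ruby
require 'zlib'

# Hypothetical sketch: compress record data by the configured method.
def compress(data, method)
  case method
  when :zlib then Zlib::Deflate.deflate(data)
  when :gzip then Zlib.gzip(data)
  else data # any other value: no compression
  end
end

# What a consumer would do to get the original data back.
def decompress(data, method)
  case method
  when :zlib then Zlib::Inflate.inflate(data)
  when :gzip then Zlib.gunzip(data)
  else data
  end
end

payload = '{"message":"hello"}' * 10
%i[zlib gzip none].each do |method|
  packed = compress(payload, method)
  puts "#{method}: #{payload.bytesize} -> #{packed.bytesize} bytes"
end
```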
358
+ ### log_truncate_max_size
359
+ Integer, default 1024. When emitting a log entry, the message is truncated to this size to avoid an infinite loop when the log is also sent to Kinesis. The value 0 means no truncation.
360
+
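A minimal sketch of that truncation rule, assuming the size is counted in characters. The method name `truncate_log` is hypothetical.

```ruby
# Hypothetical sketch: cut a log message to at most max_size characters,
# where 0 disables truncation.
def truncate_log(message, max_size = 1024)
  text = message.to_s
  return text if max_size.zero? || text.size <= max_size

  text[0, max_size]
end

puts truncate_log('a' * 2000).size
puts truncate_log('a' * 2000, 0).size
```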
361
+ ### chomp_record
362
+ Boolean. Default `false`. If it is enabled, the plugin calls chomp and removes the separator from the end of each record. This option provides a format compatible with plugin v2. See [#142](https://github.com/awslabs/aws-fluent-plugin-kinesis/issues/142) for more details.
363
+ When you use [kinesis_firehose](#kinesis_firehose) output, the [append_new_line](#append_new_line) option is `true` by default. If [append_new_line](#append_new_line) is enabled, the plugin calls chomp, as if [chomp_record](#chomp_record) were `true`, before appending `\n` to each record. Therefore, you don't need to enable the [chomp_record](#chomp_record) option when you use [kinesis_firehose](#kinesis_firehose) with the default configuration. If you set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (compatible format with plugin v2).
364
+
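The interaction between `chomp_record` and `append_new_line` can be sketched as follows. This is an illustration of the rules described above under the assumption that the separator is a trailing newline. The method name `finalize_record` is hypothetical.

```ruby
# Hypothetical sketch of how the two options combine.
def finalize_record(serialized, chomp_record: false, append_new_line: false)
  # append_new_line implies a chomp first, so the record never ends up
  # with two trailing newlines.
  data = chomp_record || append_new_line ? serialized.chomp : serialized
  append_new_line ? "#{data}\n" : data
end

formatted = "{\"message\":\"hello\"}\n" # formatter output with a separator
p finalize_record(formatted)                        # separator kept as-is
p finalize_record(formatted, chomp_record: true)    # separator removed
p finalize_record(formatted, append_new_line: true) # exactly one "\n"
```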
365
+ ## Configuration: API
366
+ ### region
367
+ AWS region of your stream. It should be in form like `us-east-1`, `us-west-2`. Refer to [Regions and Endpoints in AWS General Reference][region] for supported regions.
368
+
369
+ Default `nil`, which means the region is read from the environment variable `AWS_REGION`.
370
+
371
+ ### max_record_size
372
+ The upper limit on the size of each record. Default is 1 MB, which is the limit imposed by Kinesis.
373
+
374
+ ### http_proxy
375
+ HTTP proxy for API calls. Default `nil`.
376
+
377
+ ### endpoint
378
+ API endpoint URL, for testing. Default `nil`.
379
+
380
+ ### ssl_verify_peer
381
+ Boolean. Disable if you want to skip SSL certificate verification, for testing. Default `true`.
382
+
383
+ ### debug
384
+ Boolean. Enable if you need to debug Amazon Kinesis Data Firehose API call. Default is `false`.
385
+
386
+ ## Configuration: Batch request
387
+ ### retries_on_batch_request
388
+ Integer, default is 8. The plugin will put multiple records to Amazon Kinesis Data Streams in batches using PutRecords. A set of records in a batch may fail for reasons documented in the Kinesis Service API Reference for PutRecords. Failed records will be retried **retries_on_batch_request** times. If a record fails all retries, an error log will be emitted.
389
+
390
+ ### reset_backoff_if_success
391
+ Boolean, default `true`. If enabled, each retry checks how many records succeeded in the previous batch request and resets the exponential backoff if any record succeeded. Because a batch request can contain records for multiple shards, a simple exponential backoff over the whole batch request would not work well in some cases.
392
+
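The retry behaviour described by `retries_on_batch_request` and `reset_backoff_if_success` can be sketched as below. This is a simplified model, not the plugin's implementation: the method name `put_with_retries`, the `base_wait` value, the doubling schedule and the injectable `sleeper` are all assumptions made for illustration. The block stands in for the batch API call and returns the records that failed.

```ruby
# Hypothetical sketch: resend only the failed records, with exponential backoff.
def put_with_retries(records, max_retries: 8, base_wait: 0.1,
                     reset_backoff_if_success: true,
                     sleeper: ->(seconds) { sleep(seconds) })
  pending = records
  retries = 0
  backoff_step = 0
  loop do
    failed = yield(pending)
    return [] if failed.empty?
    return failed if retries >= max_retries # give up; caller logs or drops these

    # Some records got through, so start the backoff over.
    backoff_step = 0 if reset_backoff_if_success && failed.size < pending.size

    sleeper.call(base_wait * (2**backoff_step))
    backoff_step += 1
    retries += 1
    pending = failed
  end
end

# Usage: record "b" fails twice before it succeeds. Waits are recorded
# instead of slept so the example finishes immediately.
waits = []
calls = 0
left = put_with_retries(%w[a b c], base_wait: 1, sleeper: ->(s) { waits << s }) do |batch|
  calls += 1
  calls < 3 ? batch.select { |r| r == 'b' } : []
end
puts "calls=#{calls} waits=#{waits.inspect} left=#{left.inspect}"
```

Resetting the backoff after a partial success keeps one throttled shard from slowing down retries for records bound to other shards.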
393
+ ### batch_request_max_count
394
+ Integer. The maximum number of records in a batch request built from a record chunk. It can't exceed the default value because that is the API limit.
395
+
396
+ Default:
397
+
398
+ - `kinesis_streams`: 500
399
+ - `kinesis_firehose`: 500
400
+ - `kinesis_streams_aggregated`: 100,000
401
+
402
+ ### batch_request_max_size
403
+ Integer. The maximum size of a batch request built from a record chunk. It can't exceed the default value because that is the API limit.
404
+
405
+ Default:
406
+
407
+ - `kinesis_streams`: 5 MB
408
+ - `kinesis_firehose`: 4 MB
409
+ - `kinesis_streams_aggregated`: 1 MB
410
+
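Taken together, `batch_request_max_count` and `batch_request_max_size` bound each request. The sketch below shows one way a chunk could be split under both limits. It is an assumed, simplified algorithm and not the plugin's actual code, and the method name `split_to_batches` is hypothetical.

```ruby
# Hypothetical sketch: split serialized records into batches that respect
# both a maximum record count and a maximum total size in bytes.
def split_to_batches(records, max_count:, max_size:)
  batches = []
  batch = []
  size = 0
  records.each do |record|
    record_size = record.bytesize
    # Close the current batch if adding this record would break a limit.
    if !batch.empty? && (batch.size + 1 > max_count || size + record_size > max_size)
      batches << batch
      batch = []
      size = 0
    end
    batch << record
    size += record_size
  end
  batches << batch unless batch.empty?
  batches
end

p split_to_batches(%w[aa bb cc dddd], max_count: 2, max_size: 5)
```

In this toy run the first batch closes on the count limit and the second on the size limit.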
411
+ ### drop_failed_records_after_batch_request_retries
412
+ Boolean, default `true`.
413
+
414
+ If *drop_failed_records_after_batch_request_retries* is enabled (default), the plugin will drop failed records when batch request fails after retrying max times configured as *retries_on_batch_request*. This dropping can be monitored from [monitor_agent](https://docs.fluentd.org/input/monitor_agent) or [fluent-plugin-prometheus](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus) as *retry_count* or *num_errors* metrics.
415
+
416
+ If *drop_failed_records_after_batch_request_retries* is disabled, the plugin will raise an error and return the chunk to the Fluentd buffer when a batch request fails after the maximum number of retries. Fluentd will then retry sending the chunk records according to the retry config in [Buffer Section](https://docs.fluentd.org/configuration/buffer-section). Note that this retrying may create duplicate records, since the [PutRecords API](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html) of Kinesis Data Streams and the [PutRecordBatch API](https://docs.aws.amazon.com/firehose/latest/APIReference/API_PutRecordBatch.html) of Kinesis Data Firehose may return a partially successful response.
417
+
418
+ ### monitor_num_of_batch_request_retries
419
+ Boolean, default `false`. If enabled, the plugin will increment *retry_count* monitoring metrics after internal retrying to send batch request. This configuration enables you to monitor [ProvisionedThroughputExceededException](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html) from [monitor_agent](https://docs.fluentd.org/input/monitor_agent) or [fluent-plugin-prometheus](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus). Note that *retry_count* metrics will be counted by the plugin in addition to original Fluentd buffering mechanism if *monitor_num_of_batch_request_retries* is enabled.
420
+
421
+ ## Configuration: kinesis_streams
422
+ Here are `kinesis_streams` specific configurations.
423
+
424
+ ### stream_name
425
+ Name of the stream to put data.
426
+
427
+ As of Fluentd v1, built-in placeholders are supported. Now, you can also use built-in placeholders for this parameter.
428
+
429
+ **NOTE:**
430
+ Built-in placeholders require target key information in your buffer section attributes.
431
+
432
+ e.g.)
433
+
434
+ When you specify the following `stream_name` configuration with built-in placeholder:
435
+
436
+ ```aconf
437
+ stream_name "${$.kubernetes.annotations.kinesis_streams}"
438
+ ```
439
+
440
+ you ought to specify the corresponding attributes in buffer section:
441
+
442
+ ```aconf
443
+ # $.kubernetes.annotations.kinesis_streams needs to be set in buffer attributes
444
+ <buffer $.kubernetes.annotations.kinesis_streams>
445
+ # ...
446
+ </buffer>
447
+ ```
448
+
449
+ For more details, refer to the [Placeholders section in the official Fluentd document](https://docs.fluentd.org/configuration/buffer-section#placeholders).
450
+
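Conceptually, a `${$.a.b.c}` placeholder names a path into a nested record. The sketch below only illustrates that lookup in plain Ruby. Fluentd itself resolves placeholders from the chunk keys declared in the buffer section rather than from individual records, and the method names `resolve_accessor` and `expand` are hypothetical.

```ruby
# Hypothetical sketch: look up a "$.a.b.c" path in a nested record.
def resolve_accessor(record, accessor)
  keys = accessor.sub(/\A\$\./, '').split('.')
  record.dig(*keys)
end

# Replace every ${$.path} placeholder in a template with the looked-up value.
def expand(template, record)
  template.gsub(/\$\{(\$\.[^}]+)\}/) do
    resolve_accessor(record, Regexp.last_match(1)).to_s
  end
end

record = {
  'kubernetes' => { 'annotations' => { 'kinesis_streams' => 'orders-stream' } }
}
puts expand('${$.kubernetes.annotations.kinesis_streams}', record)
```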
451
+ ### partition_key
452
+ A key to extract partition key from JSON object. Default `nil`, which means partition key will be generated randomly.
453
+
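A minimal sketch of that choice, assuming a random key is a 32-character hex string. The method name `partition_key_for` is hypothetical and the random-key format is an assumption rather than a documented guarantee.

```ruby
require 'securerandom'

# Hypothetical sketch: pick the partition key for one record.
def partition_key_for(record, partition_key: nil)
  # No partition_key configured: spread records with a random key.
  return SecureRandom.hex(16) if partition_key.nil?

  # Otherwise take the key from the record; fetch raises KeyError
  # when the record does not contain the configured field.
  record.fetch(partition_key).to_s
end

record = { 'user_id' => 'u-42', 'message' => 'hello' }
puts partition_key_for(record, partition_key: 'user_id')
puts partition_key_for(record)
```

Extracting the key from a field keeps all records with the same value on the same shard, while random keys spread the load evenly.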
454
+ ## Configuration: kinesis_firehose
455
+ Here are `kinesis_firehose` specific configurations.
456
+
457
+ ### delivery_stream_name
458
+ Name of the delivery stream to put data.
459
+
460
+ As of Fluentd v1, built-in placeholders are supported. Now, you can also use built-in placeholders for this parameter.
461
+
462
+ **NOTE:**
463
+ Built-in placeholders require target key information in your buffer section attributes.
464
+
465
+ e.g.)
466
+
467
+ When you specify the following `delivery_stream_name` configuration with built-in placeholder:
468
+
469
+ ```aconf
470
+ delivery_stream_name "${$.kubernetes.annotations.kinesis_firehose_streams}"
471
+ ```
472
+
473
+ you ought to specify the corresponding attributes in buffer section:
474
+
475
+ ```aconf
476
+ # $.kubernetes.annotations.kinesis_firehose_streams needs to be set in buffer attributes
477
+ <buffer $.kubernetes.annotations.kinesis_firehose_streams>
478
+ # ...
479
+ </buffer>
480
+ ```
481
+
482
+ For more details, refer to the [Placeholders section in the official Fluentd document](https://docs.fluentd.org/configuration/buffer-section#placeholders).
483
+
484
+ ### append_new_line
485
+ Boolean. Default `true`. If it is enabled, the plugin adds new line character (`\n`) to each serialized record.
486
+ Before appending `\n`, the plugin calls chomp and removes the separator from the end of each record, as if [chomp_record](#chomp_record) were `true`. Therefore, you don't need to enable the [chomp_record](#chomp_record) option when you use [kinesis_firehose](#kinesis_firehose) output with the default configuration ([append_new_line](#append_new_line) is `true`). If you set [append_new_line](#append_new_line) to `false`, you can choose [chomp_record](#chomp_record) `false` (default) or `true` (compatible format with plugin v2).
487
+
488
+ ## Configuration: kinesis_streams_aggregated
489
+ Here are `kinesis_streams_aggregated` specific configurations.
490
+
491
+ ### stream_name
492
+ Name of the stream to put data.
493
+
494
+ As of Fluentd v1, built-in placeholders are supported. Now, you can also use built-in placeholders for this parameter.
495
+
496
+ **NOTE:**
497
+ Built-in placeholders require target key information in your buffer section attributes.
498
+
499
+ e.g.)
500
+
501
+ When you specify the following `stream_name` configuration with built-in placeholder:
502
+
503
+ ```aconf
504
+ stream_name "${$.kubernetes.annotations.kinesis_streams_aggregated}"
505
+ ```
506
+
507
+ you ought to specify the corresponding attributes in buffer section:
508
+
509
+ ```aconf
510
+ # $.kubernetes.annotations.kinesis_streams_aggregated needs to be set in buffer attributes
511
+ <buffer $.kubernetes.annotations.kinesis_streams_aggregated>
512
+ # ...
513
+ </buffer>
514
+ ```
515
+
516
+ For more details, refer to the [Placeholders section in the official Fluentd document](https://docs.fluentd.org/configuration/buffer-section#placeholders).
517
+
518
+ ### fixed_partition_key
519
+ A value of fixed partition key. Default `nil`, which means partition key will be generated randomly.
520
+
521
+ Note: if you specify this option, all records go to a single shard.
522
+
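The reason a fixed key pins everything to one shard is that Kinesis Data Streams maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The sketch below reproduces the hashing step. The even split of the key space in `shard_index` is a simplifying assumption, since real shard ranges come from the stream's configuration, and both method names are hypothetical.

```ruby
require 'digest'

KEY_SPACE = 2**128 # MD5 yields a 128-bit value

# Hash a partition key to its 128-bit integer hash key.
def hash_key(partition_key)
  Digest::MD5.hexdigest(partition_key).to_i(16)
end

# Assumption for illustration: shards split the key space evenly.
def shard_index(partition_key, shard_count)
  hash_key(partition_key) * shard_count / KEY_SPACE
end

# A fixed key always lands on the same shard, whatever the record holds.
indexes = Array.new(5) { shard_index('my-fixed-key', 4) }
puts indexes.uniq.size
```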
523
+ ## Development
524
+
525
+ To launch `fluentd` process with this plugin for development, follow the steps below:
526
+
527
+ git clone https://github.com/awslabs/aws-fluent-plugin-kinesis.git
528
+ cd aws-fluent-plugin-kinesis
529
+ make # installs gem dependencies
530
+ bundle exec fluentd -c /path/to/fluent.conf
531
+
532
+ To launch using a specific version of Fluentd, use the `BUNDLE_GEMFILE` environment variable:
533
+
534
+ BUNDLE_GEMFILE=$PWD/gemfiles/Gemfile.td-agent-3.3.0 bundle exec fluentd -c /path/to/fluent.conf
535
+
536
+ ## Contributing
537
+
538
+ Bug reports and pull requests are welcome on [GitHub][github].
539
+
540
+ ## Related Resources
541
+
542
+ * [Amazon Kinesis Data Streams Developer Guide](http://docs.aws.amazon.com/kinesis/latest/dev/introduction.html)
543
+ * [Amazon Kinesis Data Firehose Developer Guide](http://docs.aws.amazon.com/firehose/latest/dev/what-is-this-service.html)
544
+
545
+ [fluentd]: https://www.fluentd.org/
546
+ [streams]: https://aws.amazon.com/kinesis/streams/
547
+ [firehose]: https://aws.amazon.com/kinesis/firehose/
548
+ [kpl]: https://github.com/awslabs/amazon-kinesis-producer/blob/master/aggregation-format.md
549
+ [td-agent]: https://github.com/treasure-data/omnibus-td-agent
550
+ [bundler]: https://bundler.io/
551
+ [region]: https://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region
552
+ [fluentd-buffer-section]: https://docs.fluentd.org/configuration/buffer-section
553
+ [fluentd-formatter-json]: https://docs.fluentd.org/formatter/json
554
+ [github]: https://github.com/awslabs/aws-fluent-plugin-kinesis
555
+ [formatter.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/formatter.rb
556
+ [inject.rb]: https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin_helper/inject.rb
557
+ [fluentd-doc-kinesis]: https://docs.fluentd.org/how-to-guides/kinesis-stream
558
+ [fluent-plugin-s3]: https://github.com/fluent/fluent-plugin-s3
559
+ [v1-readme]: https://github.com/awslabs/aws-fluent-plugin-kinesis/blob/v1/README.md
data/Rakefile ADDED
@@ -0,0 +1,26 @@
1
+ #
2
+ # Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Apache License, Version 2.0 (the "License"). You
5
+ # may not use this file except in compliance with the License. A copy of
6
+ # the License is located at
7
+ #
8
+ # http://www.apache.org/licenses/LICENSE-2.0
9
+ #
10
+ # or in the "license" file accompanying this file. This file is
11
+ # distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
12
+ # ANY KIND, either express or implied. See the License for the specific
13
+ # language governing permissions and limitations under the License.
14
+
15
+ require "bundler/gem_tasks"
16
+
17
+ require 'rake/testtask'
18
+
19
+ task default: [:test]
20
+ Rake::TestTask.new do |test|
21
+ test.libs << 'lib' << 'test'
22
+ test.test_files = FileList['test/**/test_*.rb']
23
+ test.options = '-v'
24
+ end
25
+
26
+ load 'benchmark/task.rake'
data/adp-fluent-plugin-kinesis.gemspec ADDED
@@ -0,0 +1,71 @@
1
+ # coding: utf-8
2
+ #
3
+ # Copyright 2014-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
4
+ #
5
+ # Licensed under the Apache License, Version 2.0 (the "License"). You
6
+ # may not use this file except in compliance with the License. A copy of
7
+ # the License is located at
8
+ #
9
+ # http://www.apache.org/licenses/LICENSE-2.0
10
+ #
11
+ # or in the "license" file accompanying this file. This file is
12
+ # distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
13
+ # ANY KIND, either express or implied. See the License for the specific
14
+ # language governing permissions and limitations under the License.
15
+
16
+ lib = File.expand_path('../lib', __FILE__)
17
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
18
+ require 'fluent_plugin_kinesis/version'
19
+
20
+ Gem::Specification.new do |spec|
21
+ spec.name = "adp-fluent-plugin-kinesis"
22
+ spec.version = FluentPluginKinesis::VERSION
23
+ spec.author = 'Amazon Web Services'
24
+ spec.summary = %q{Fork of plugin created by AWS}
25
+ spec.homepage = "https://github.com/awslabs/aws-fluent-plugin-kinesis"
26
+ spec.license = "Apache-2.0"
27
+
28
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
29
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
30
+ spec.require_paths = ["lib"]
31
+ spec.required_ruby_version = '>= 2.3'
32
+
33
+ spec.add_dependency "fluentd", ">= 0.14.22", "< 2"
34
+
35
+ # This plugin is sometimes used with s3 plugin, so watch out for conflicts
36
+ # https://rubygems.org/gems/fluent-plugin-s3
37
+ # Exclude v1.5 to avoid aws-sdk dependency problem due to this issue
38
+ # https://github.com/aws/aws-sdk-ruby/issues/1872
39
+ # Exclude aws-sdk-kinesis v1.4 to avoid aws-sdk-core dependency problem with td-agent v3.1.1
40
+ # NoMethodError: undefined method `event=' for #<Seahorse::Model::Shapes::ShapeRef:*>
41
+ # https://github.com/aws/aws-sdk-ruby/commit/03d60f9d3d821e645bd2a3efca066f37350ef906#diff-c69f15af8ea3eb9ab152659476e04608R401
42
+ # https://github.com/aws/aws-sdk-ruby/commit/571c2d0e5ff9c24ff72893a08a74790db591fb57#diff-a55155f04aa6559460a0814e264eb0cdR43
43
+ # Exclude aws-sdk-kinesis v1.14 to avoid aws-sdk-core dependency problem with td-agent v3.4.1
44
+ # LoadError: cannot load such file -- aws-sdk-core/plugins/transfer_encoding.rb
45
+ # https://github.com/aws/aws-sdk-ruby/commit/bb61ed0a2fabc6b1f90b757f13f37d5aeae48d8a#diff-b493e941d32289cd2df7eebc3fc5be2cR26
46
+ # https://github.com/aws/aws-sdk-ruby/commit/e26577d2a426a4be79cd2d9edc1a4a4176e388ba#diff-10f50e27b30c3dc522b3c25db5782e2e
47
+ spec.add_dependency "aws-sdk-kinesis", "~> 1", "!= 1.4", "!= 1.5", "!= 1.14"
48
+ # Exclude aws-sdk-firehose v1.9 to avoid aws-sdk-core dependency problem with td-agent v3.2.1
49
+ # LoadError: cannot load such file -- aws-sdk-core/plugins/endpoint_discovery.rb
50
+ # https://github.com/aws/aws-sdk-ruby/commit/85d8538a62255e58d9e176ee524a9f94354b51a0#diff-d51486091a10ada65b308b7f45966af1R18
51
+ # https://github.com/aws/aws-sdk-ruby/commit/7c9584bc6473100df9aec9333ab491ad4faeeca8#diff-be94f87e58e00329a6c0e03e43d5c292
52
+ # Exclude aws-sdk-firehose v1.15 to avoid aws-sdk-core dependency problem with td-agent v3.4.1
53
+ # LoadError: cannot load such file -- aws-sdk-core/plugins/transfer_encoding.rb
54
+ # https://github.com/aws/aws-sdk-ruby/commit/bb61ed0a2fabc6b1f90b757f13f37d5aeae48d8a#diff-d51486091a10ada65b308b7f45966af1R26
55
+ # https://github.com/aws/aws-sdk-ruby/commit/e26577d2a426a4be79cd2d9edc1a4a4176e388ba#diff-10f50e27b30c3dc522b3c25db5782e2e
56
+ spec.add_dependency "aws-sdk-firehose", "~> 1", "!= 1.5", "!= 1.9", "!= 1.15"
57
+
58
+ spec.add_dependency "google-protobuf", "~> 3"
59
+
60
+ spec.add_development_dependency "bundler", ">= 1.10"
61
+ spec.add_development_dependency "rake", ">= 10.0"
62
+ spec.add_development_dependency "test-unit", ">= 3.0.8"
63
+ spec.add_development_dependency "test-unit-rr", ">= 1.0.3"
64
+ spec.add_development_dependency "pry", ">= 0.10.1"
65
+ spec.add_development_dependency "pry-byebug", ">= 3.3.0"
66
+ spec.add_development_dependency "pry-stack_explorer", ">= 0.4.9.2"
67
+ spec.add_development_dependency "net-empty_port", ">= 0.0.2"
68
+ spec.add_development_dependency "mocha", ">= 1.1.0"
69
+ spec.add_development_dependency "webmock", ">= 1.24.2"
70
+ spec.add_development_dependency "fakefs", ">= 0.8.1"
71
+ end
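The version constraints passed to `spec.add_dependency` above can be checked with RubyGems' own `Gem::Requirement` class. The sketch below evaluates the `aws-sdk-kinesis` constraint against a few sample version numbers chosen for illustration. The helper name `allowed?` is hypothetical.

```ruby
require 'rubygems'

# The aws-sdk-kinesis constraint from the gemspec above.
requirement = Gem::Requirement.new('~> 1', '!= 1.4', '!= 1.5', '!= 1.14')

# Hypothetical helper: does a version string satisfy the requirement?
def allowed?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

%w[1.3.0 1.4.0 1.4.1 1.14.0 1.30.0 2.0.0].each do |version|
  puts "#{version}: #{allowed?(requirement, version)}"
end
```

Note that `!= 1.4` excludes only the exact version 1.4 (which RubyGems treats as equal to 1.4.0), so a patch release such as 1.4.1 still satisfies the requirement.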