aws-kclrb 1.0.1 → 2.0.0
- checksums.yaml +5 -5
- data/README.md +47 -10
- data/VERSION +1 -1
- data/lib/aws/kclrb.rb +1 -0
- data/lib/aws/kclrb/kcl_process.rb +29 -13
- data/lib/aws/kclrb/messages.rb +53 -0
- data/lib/aws/kclrb/record_processor.rb +115 -3
- metadata +4 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: ef941c4d394cb56204cbc3bd8fde1ac88eecf7cb131cfc9f3827c3d418451629
+  data.tar.gz: aad7a5ddfbd195755f49af66b74a097e2de513ee1763e7b97509619542d0aa92
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d28c8c92de60f53d4b506ad3736b89396f2359c61524f088212d6a902f5c3856de7f255ebd28227ff2a97cf2752ca9b14604829d114b96373015ee15ac74840b
+  data.tar.gz: 98149d6c3f10b61f8f1072b74f6e09fdc55059ceac5e669a060a14cb8301d7ae147d77bef1e387f01652849060106c6dac1f723f34c8f29cb37c5265742018ea
data/README.md
CHANGED
@@ -15,17 +15,25 @@ executable. A record processor in Ruby typically looks something like:
 
 require 'aws/kclrb'
 
-class SampleRecordProcessor < Aws::KCLrb::RecordProcessorBase
-  def init_processor(
+class SampleRecordProcessor < Aws::KCLrb::V2::RecordProcessorBase
+  def init_processor(initialize_input)
     # initialize
   end
 
-  def process_records(
+  def process_records(process_records_input)
     # process batch of records
   end
 
-  def
-    # cleanup
+  def lease_lost(lease_lost_input)
+    # lease was lost, cleanup
+  end
+
+  def shard_ended(shard_ended_input)
+    # shard has ended, cleanup
+  end
+
+  def shutdown_requested(shutdown_requested_input)
+    # shutdown has been requested
   end
 end
 
@@ -69,10 +77,11 @@ The sample application consists of two components:
 The following defaults are used in the sample application:
 
 * *Stream name*: `kclrbsample`
+* *Region*: `us-east-1`
 * *Number of shards*: 2
 * *Amazon KCL application name*: `RubyKCLSample`
-* *Amazon DynamoDB table for
-* *
+* *Amazon DynamoDB table for KCL application*: `RubyKCLSample`
+* *Amazon CloudWatch metrics namespace for KCL application*: `RubyKCLSample`
 
 ### Running the Data Producer
 
@@ -105,7 +114,7 @@ To run the data processor, run the following commands:
 
 ```sh
 cd samples
-rake run
+rake run properties_file=sample.properties
 ```
 
 #### Notes
@@ -118,6 +127,7 @@ To run the data processor, run the following commands:
 * `executableName = samples/sample_kcl.rb`
 * `streamName = kclrbsample`
 * `applicationName = RubyKCLSample`
+* `regionName = us-east-1`
 
 ### Cleaning Up
 
@@ -126,7 +136,7 @@ create a real DynamoDB table to track the Amazon KCL application state, thus pot
 incurring AWS costs. Once done, you can log in to AWS management console and delete these
 resources. Specifically, the sample application will create in your default AWS region
 
-* an *Amazon Kinesis
+* an *Amazon Kinesis Data Stream* named `kclrbsample`
 * an *Amazon DynamoDB table* named `RubyKCLSample`
 
 ## Running on Amazon EC2
@@ -147,7 +157,7 @@ Amazon Linux can be found at `/usr/bin/java` and should be 1.7 or greater.
 cd kclrb/samples
 rake run_producer
 # ... and in another terminal
-rake run
+rake run properties_file=sample.properties
 ```
 
 ## Under the Hood - What You Should Know about Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
@@ -177,6 +187,33 @@ all languages.
 
 ## Release Notes
 
+### Release 2.0.0 (February 26, 2019)
+* Added support for [Enhanced Fan-Out](https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/).
+  Enhanced Fan-Out provides dedicated throughput per stream consumer, and uses an HTTP/2 push API (SubscribeToShard) to deliver records with lower latency.
+* Updated the Amazon Kinesis Client Library for Java to version 2.1.2.
+  * Version 2.1.2 uses 4 additional Kinesis APIs
+    __WARNING: These additional APIs may require updating any explicit IAM policies__
+    * [`RegisterStreamConsumer`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_RegisterStreamConsumer.html)
+    * [`SubscribeToShard`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SubscribeToShard.html)
+    * [`DescribeStreamConsumer`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStreamConsumer.html)
+    * [`DescribeStreamSummary`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStreamSummary.html)
+  * For more information about Enhanced Fan-Out with the Amazon Kinesis Client Library please see the [announcement](https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/) and [developer documentation](https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html).
+* Added version 2 of the [`RecordProcessorBase`](https://github.com/awslabs/amazon-kinesis-client-ruby/blob/d5c2bbafb232b5e1ab947980a0bd8505c87978f9/lib/aws/kclrb/record_processor.rb#L102) which supports the new `ShardRecordProcessor` interface
+  * The `shutdown` method from version 1 has been replaced by `lease_lost` and `shard_ended`.
+  * Added the `lease_lost` method which is invoked when a lease is lost.
+    `lease_lost` replaces `shutdown(checkpointer, 'ZOMBIE')`.
+  * Added the `shard_ended` method which is invoked when all records from a split or merge have been processed.
+    `shard_ended` replaces `shutdown(checkpointer, 'TERMINATE')`.
+  * Added an optional method, `shutdown_requested`, which provides the record processor a last chance to checkpoint during the Amazon Kinesis Client Library shutdown process before the lease is canceled.
+    * To control how long the Amazon Kinesis Client Library waits for the record processors to complete shutdown, add `timeoutInSeconds=<seconds to wait>` to your properties file.
+* Updated the AWS Java SDK version to 2.4.0
+* MultiLangDaemon now provides logging using Logback.
+  * MultiLangDaemon supports custom configurations for logging via a Logback XML configuration file.
+  * The example [Rakefile](https://github.com/awslabs/amazon-kinesis-client-ruby/blob/master/samples/Rakefile) supports setting the logging configuration by adding `log_configuration=<log configuration file>` to the Rake command line.
+
+### Release 1.0.1 (January 19, 2017)
+* Upgraded to use version 1.7.2 of the [Amazon Kinesis Client library][amazon-kcl-github]
+
 ### Release 1.0.0 (December 30, 2014)
 * **aws-kclrb** gem which exposes an interface to allow implementation of record processors in Ruby
   using the Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
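The README skeleton in this diff leaves the five V2 callbacks empty. As a rough illustration of how they might be filled in, here is a minimal sketch. Everything inside `ReadmeSketch` is a hypothetical stand-in so that the example runs without the gem installed: the `Struct`s only mirror the reader names of the new input objects, `RecordingCheckpointer` is invented for the example, and the record layout (a Hash with Base64 `'data'` and a `'sequenceNumber'`) is an assumption about the MultiLangDaemon payload rather than something this diff shows.

```ruby
# Hypothetical stand-ins for the gem's V2 input objects (same reader names as
# in lib/aws/kclrb/messages.rb) so the sketch runs without aws-kclrb installed.
module ReadmeSketch
  InitializeInput     = Struct.new(:shard_id, :sequence_number)
  ProcessRecordsInput = Struct.new(:records, :millis_behind_latest, :checkpointer)
  LeaseLostInput      = Class.new
  ShardEndedInput     = Struct.new(:checkpointer)

  # Hypothetical checkpointer: it only remembers each requested checkpoint.
  class RecordingCheckpointer
    attr_reader :calls

    def initialize
      @calls = []
    end

    # Assumed shape: called with a sequence number, or with no argument.
    def checkpoint(sequence_number = nil)
      @calls << sequence_number
    end
  end

  # A real application would subclass Aws::KCLrb::V2::RecordProcessorBase.
  class SampleRecordProcessor
    attr_reader :shard_id, :payloads

    def init_processor(initialize_input)
      @shard_id = initialize_input.shard_id
      @payloads = []
    end

    def process_records(process_records_input)
      records = process_records_input.records
      # Assumption: 'data' is Base64 text; unpack1('m') decodes it.
      records.each { |record| @payloads << record.fetch('data').unpack1('m') }
      last = records.last
      process_records_input.checkpointer.checkpoint(last.fetch('sequenceNumber')) if last
    end

    def lease_lost(_lease_lost_input)
      # Another worker may hold the lease now, so no checkpoint here.
    end

    def shard_ended(shard_ended_input)
      # The doc comment in this diff says clients need to checkpoint here.
      shard_ended_input.checkpointer.checkpoint
    end
  end
end
```

The design point the sketch tries to show is that the checkpointer now travels inside the input object instead of being a separate positional argument, and that `lease_lost` receives no checkpointer at all.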
data/VERSION
CHANGED
@@ -1 +1 @@
-
+2.0.0
data/lib/aws/kclrb.rb
CHANGED
data/lib/aws/kclrb/kcl_process.rb
CHANGED
@@ -14,13 +14,16 @@
 
 require 'aws/kclrb/io_proxy'
 require 'aws/kclrb/checkpointer'
+require 'aws/kclrb/messages'
+require 'aws/kclrb/record_processor'
 
 module Aws
   module KCLrb
     # Error raised if the {KCLProcess} received an input action that it
     # could not parse or it could not handle.
-    class MalformedAction < RuntimeError;
-
+    class MalformedAction < RuntimeError;
+    end
+
     # Entry point for a KCL application in Ruby.
     #
     # Implementers of KCL applications in Ruby should instantiate this
@@ -31,12 +34,16 @@ module Aws
       # @param input [IO] An `IO`-like object to read input lines from.
       # @param output [IO] An `IO`-like object to write output lines to.
       # @param error [IO] An `IO`-like object to write error lines to.
-      def initialize(processor, input
-
+      def initialize(processor, input = $stdin, output = $stdout, error = $stderr)
+        if processor.version == 1
+          @processor = Aws::KCLrb::V2::V2ToV1Adapter.new(processor)
+        else
+          @processor = processor
+        end
         @io_proxy = IOProxy.new(input, output, error)
         @checkpointer = CheckpointerImpl.new(@io_proxy)
       end
-
+
       # Starts this KCL processor's main loop.
       def run
         action = @io_proxy.read_action
@@ -45,9 +52,9 @@ module Aws
           action = @io_proxy.read_action
         end
       end
-
+
       private
-
+
       # @api private
       # Parses an input action and invokes the appropriate method of the
       # record processor.
@@ -63,11 +70,20 @@ module Aws
         action_name = action.fetch('action')
         case action_name
         when 'initialize'
-          dispatch_to_processor(:init_processor,
+          dispatch_to_processor(:init_processor,
+                                Aws::KCLrb::V2::InitializeInput.new(action.fetch('shardId'),
+                                                                    action.fetch('sequenceNumber')))
         when 'processRecords'
-          dispatch_to_processor(:process_records,
-
-
+          dispatch_to_processor(:process_records,
+                                Aws::KCLrb::V2::ProcessRecordsInput.new(action.fetch('records'),
+                                                                        action.fetch('millisBehindLatest'),
+                                                                        @checkpointer))
+        when 'leaseLost'
+          dispatch_to_processor(:lease_lost, Aws::KCLrb::V2::LeaseLostInput.new)
+        when 'shardEnded'
+          dispatch_to_processor(:shard_ended, Aws::KCLrb::V2::ShardEndedInput.new(@checkpointer))
+        when 'shutdownRequested'
+          dispatch_to_processor(:shutdown_requested, Aws::KCLrb::V2::ShutdownRequestedInput.new(@checkpointer))
         else
           raise MalformedAction.new("Received an action which couldn't be understood. Action was '#{action}'")
         end
@@ -75,7 +91,7 @@ module Aws
       rescue KeyError => ke
         raise MalformedAction.new("Action '#{action}': #{ke.message}")
       end
-
+
       # @api private
       # Calls the specified method on the record processor, and handles
       # any resulting exceptions by writing to the error stream.
@@ -89,7 +105,7 @@ module Aws
         # of issue.
         @io_proxy.write_error(processor_error)
       end
-
+
     end
   end
 end
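The hunks above change how `KCLProcess` turns each incoming action into a processor call: the action name selects the callback, and the action's fields are packed into a single input object. The following is a simplified sketch of that routing, not the gem's implementation. `DispatchSketch`, `route`, and `dispatch` are hypothetical names, the input classes are local stand-ins, and parsing the JSON line in the same method is an assumption made to keep the example self-contained (in the gem, reading actions is the job of `IOProxy`). The action names and field names are taken from the diff.

```ruby
require 'json'

module DispatchSketch
  # Local stand-ins (hypothetical) for the gem's error class and input objects.
  class MalformedAction < RuntimeError; end
  InitializeInput        = Struct.new(:shard_id, :sequence_number)
  ProcessRecordsInput    = Struct.new(:records, :millis_behind_latest, :checkpointer)
  LeaseLostInput         = Class.new
  ShardEndedInput        = Struct.new(:checkpointer)
  ShutdownRequestedInput = Struct.new(:checkpointer)

  # Turns one JSON action line into a [method_name, input_object] pair.
  def self.route(line, checkpointer)
    action = JSON.parse(line)
    case action.fetch('action')
    when 'initialize'
      [:init_processor,
       InitializeInput.new(action.fetch('shardId'), action.fetch('sequenceNumber'))]
    when 'processRecords'
      [:process_records,
       ProcessRecordsInput.new(action.fetch('records'),
                               action.fetch('millisBehindLatest'), checkpointer)]
    when 'leaseLost'
      [:lease_lost, LeaseLostInput.new]
    when 'shardEnded'
      [:shard_ended, ShardEndedInput.new(checkpointer)]
    when 'shutdownRequested'
      [:shutdown_requested, ShutdownRequestedInput.new(checkpointer)]
    else
      raise MalformedAction, "Unrecognized action: #{action}"
    end
  rescue KeyError => e
    # A missing field is reported the same way as an unknown action.
    raise MalformedAction, "Action '#{action}': #{e.message}"
  end

  # Routes the line and invokes the matching callback on the processor.
  def self.dispatch(processor, line, checkpointer)
    method_name, input = route(line, checkpointer)
    processor.public_send(method_name, input)
  end
end
```

For example, `DispatchSketch.route('{"action":"shardEnded"}', checkpointer)` yields `:shard_ended` paired with an input object carrying the checkpointer, while a `processRecords` action without `millisBehindLatest` raises `MalformedAction`.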
data/lib/aws/kclrb/messages.rb
ADDED
@@ -0,0 +1,53 @@
+module Aws
+  module KCLrb
+    module V2
+      # @abstract
+      # Input object for RecordProcessorBase#init_processor method.
+      class InitializeInput
+        attr_reader :shard_id, :sequence_number
+
+        def initialize(shard_id, sequence_number)
+          @shard_id = shard_id
+          @sequence_number = sequence_number
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#process_records method.
+      class ProcessRecordsInput
+        attr_reader :records, :millis_behind_latest, :checkpointer
+
+        def initialize(records, millis_behind_latest, checkpointer = nil)
+          @records = records
+          @millis_behind_latest = millis_behind_latest
+          @checkpointer = checkpointer
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#lease_lost method.
+      class LeaseLostInput
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#shard_ended method.
+      class ShardEndedInput
+        attr_reader :checkpointer
+
+        def initialize(checkpointer = nil)
+          @checkpointer = checkpointer
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#shutdown_requested method.
+      class ShutdownRequestedInput
+        attr_reader :checkpointer
+
+        def initialize(checkpointer = nil)
+          @checkpointer = checkpointer
+        end
+      end
+    end
+  end
+end
data/lib/aws/kclrb/record_processor.rb
CHANGED
@@ -1,5 +1,5 @@
 #
-# Copyright
+# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 #
 # Licensed under the Amazon Software License (the "License").
 # You may not use this file except in compliance with the License.
@@ -32,7 +32,7 @@ module Aws
       def init_processor(shard_id)
         fail NotImplementedError.new
       end
-
+
       # @abstract
       # Called by a KCLProcess with a list of records to be processed and a checkpointer
       # which accepts sequence numbers from the records to indicate where in the stream
@@ -52,7 +52,7 @@ module Aws
       def process_records(records, checkpointer)
         fail NotImplementedError.new
       end
-
+
       # @abstract
       # Called by a KCLProcess instance to indicate that this record processor
       # should shutdown. After this is called, there will be no more calls to
@@ -72,6 +72,118 @@ module Aws
       def shutdown(checkpointer, reason)
         fail NotImplementedError.new
       end
+
+      # @abstract
+      # Called by a KCLProcess instance to indicate that this record processor
+      # is requesting a shutdown. This method should be overridden if required.
+      #
+      # @param checkpointer [Checkpointer] A checkpointer which accepts a sequence
+      # number or no parameters.
+      def shutdown_requested(checkpointer)
+      end
+
+      def version
+        1
+      end
+    end
+
+    module V2
+      # @abstract
+      # Base class for implementing a record processor.
+      #
+      # A `RecordProcessor` processes a shard in a stream. See {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/clientlibrary/interfaces/IRecordProcessor.java the corresponding KCL interface}.
+      # Its methods will be called as follows:
+      #
+      # 1. {#init_processor} will be called once
+      # 2. {#process_records} will be called zero or more times
+      # 3. {#lease_lost} will be called zero to one time
+      # 4. {#shard_ended} will be called zero or more times
+      # 5. {#shutdown_requested} will be called zero to one time
+      class RecordProcessorBase
+        # @abstract
+        # Called once by a KCLProcess before any calls to process_records.
+        #
+        # @param initialize_input [InitializeInput] Initialize processor input
+        # object
+        def init_processor(initialize_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess with a list of records to be processed and a
+        # checkpointer which accepts sequence numbers from the records to
+        # indicate where in the stream to checkpoint.
+        #
+        # @param process_records_input [ProcessRecordsInput] Process records
+        # input object
+        def process_records(process_records_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # should shutdown. After this is called, there will be no more calls to
+        # any other methods of this record processor.
+        #
+        # @param lease_lost_input [LeaseLostInput] Lease lost input object
+        #
+        # - Clients should not checkpoint because there is possibly another
+        # record processor which has acquired the lease for this shard.
+        def lease_lost(lease_lost_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # should shutdown. After this is called, there will be no more calls to
+        # any other methods of this record processor.
+        #
+        # @param shard_ended_input [ShardEndedInput] Shard ended input object
+        #
+        # - Clients need to checkpoint at this time.
+        def shard_ended(shard_ended_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # is requesting a shutdown. This method should be overridden if required.
+        #
+        # @param shutdown_requested_input [ShutdownRequestedInput]
+        def shutdown_requested(shutdown_requested_input)
+        end
+
+        def version
+          2
+        end
+      end
+
+      class V2ToV1Adapter < Aws::KCLrb::V2::RecordProcessorBase
+        def initialize(processor)
+          @processor = processor
+        end
+
+        def init_processor(initialize_input)
+          @processor.init_processor(initialize_input.shard_id)
+        end
+
+        def process_records(process_records_input)
+          @processor.process_records(process_records_input.records,
+                                     process_records_input.checkpointer)
+        end
+
+        def lease_lost(lease_lost_input)
+          @processor.shutdown(nil, 'ZOMBIE')
+        end
+
+        def shard_ended(shard_ended_input)
+          @processor.shutdown(shard_ended_input.checkpointer, 'TERMINATE')
+        end
+
+        def shutdown_requested(shutdown_requested_input)
+          @processor.shutdown_requested(shutdown_requested_input.checkpointer)
+        end
+      end
     end
   end
 end
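The `V2ToV1Adapter` added above is what lets a version 1 processor keep working: `KCLProcess` checks `processor.version` and, for version 1, wraps the processor so that the new `lease_lost` and `shard_ended` callbacks are translated back into `shutdown` calls with the old reason strings. The sketch below reproduces only that shutdown mapping under simplifying assumptions. `AdapterSketch`, `LegacyProcessor`, and `wrap` are hypothetical names, `ShardEndedInput` is a local stand-in, and the adapter here does not inherit from the gem's base class.

```ruby
module AdapterSketch
  # Hypothetical stand-in for Aws::KCLrb::V2::ShardEndedInput.
  ShardEndedInput = Struct.new(:checkpointer)

  # A version 1 style processor: a single shutdown callback with a reason.
  class LegacyProcessor
    attr_reader :shutdowns

    def initialize
      @shutdowns = []
    end

    def shutdown(checkpointer, reason)
      @shutdowns << [checkpointer, reason]
    end

    def version
      1
    end
  end

  # Simplified adapter: exposes the V2 shutdown callbacks on a V1 processor.
  class V2ToV1Adapter
    def initialize(processor)
      @processor = processor
    end

    def lease_lost(_lease_lost_input)
      # No checkpointer is passed on: the lease may belong to another worker.
      @processor.shutdown(nil, 'ZOMBIE')
    end

    def shard_ended(shard_ended_input)
      @processor.shutdown(shard_ended_input.checkpointer, 'TERMINATE')
    end

    def version
      2
    end
  end

  # Mirrors the version check shown in KCLProcess#initialize.
  def self.wrap(processor)
    processor.version == 1 ? V2ToV1Adapter.new(processor) : processor
  end
end
```

Wrapping at construction time keeps the dispatch code single-path: it only ever calls V2 method names, and processors that already report version 2 pass through `wrap` unchanged.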
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: aws-kclrb
 version: !ruby/object:Gem::Version
-  version:
+  version: 2.0.0
 platform: ruby
 authors:
 - Amazon Web Services
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2019-02-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: multi_json
@@ -40,6 +40,7 @@ files:
 - lib/aws/kclrb/checkpointer.rb
 - lib/aws/kclrb/io_proxy.rb
 - lib/aws/kclrb/kcl_process.rb
+- lib/aws/kclrb/messages.rb
 - lib/aws/kclrb/record_processor.rb
 - spec/checkpointer_spec.rb
 - spec/io_proxy_spec.rb
@@ -65,7 +66,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.7.7
 signing_key:
 specification_version: 4
 summary: Amazon Kinesis Client Library for Ruby