aws-kclrb 1.0.1 → 2.0.0
- checksums.yaml +5 -5
- data/README.md +47 -10
- data/VERSION +1 -1
- data/lib/aws/kclrb.rb +1 -0
- data/lib/aws/kclrb/kcl_process.rb +29 -13
- data/lib/aws/kclrb/messages.rb +53 -0
- data/lib/aws/kclrb/record_processor.rb +115 -3
- metadata +4 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: ef941c4d394cb56204cbc3bd8fde1ac88eecf7cb131cfc9f3827c3d418451629
+  data.tar.gz: aad7a5ddfbd195755f49af66b74a097e2de513ee1763e7b97509619542d0aa92
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d28c8c92de60f53d4b506ad3736b89396f2359c61524f088212d6a902f5c3856de7f255ebd28227ff2a97cf2752ca9b14604829d114b96373015ee15ac74840b
+  data.tar.gz: 98149d6c3f10b61f8f1072b74f6e09fdc55059ceac5e669a060a14cb8301d7ae147d77bef1e387f01652849060106c6dac1f723f34c8f29cb37c5265742018ea
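The `checksums.yaml` entries are hex digests of the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. As a minimal sketch, assuming the member has already been extracted to disk, a digest can be compared like this. The helper name `checksum_matches?` is hypothetical and not part of RubyGems or this gem.

```ruby
require 'digest'

# Hypothetical helper: true when the file's SHA256 hex digest equals the
# expected value taken from checksums.yaml (compared case-insensitively).
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Example call, assuming data.tar.gz was extracted into the current directory:
#   checksum_matches?('data.tar.gz',
#                     'aad7a5ddfbd195755f49af66b74a097e2de513ee1763e7b97509619542d0aa92')
```

The same pattern works for the SHA512 entries by swapping in `Digest::SHA512`.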
data/README.md
CHANGED
@@ -15,17 +15,25 @@ executable. A record processor in Ruby typically looks something like:
 
 require 'aws/kclrb'
 
-class SampleRecordProcessor < Aws::KCLrb::RecordProcessorBase
-  def init_processor(
+class SampleRecordProcessor < Aws::KCLrb::V2::RecordProcessorBase
+  def init_processor(initialize_input)
     # initialize
   end
 
-  def process_records(
+  def process_records(process_records_input)
     # process batch of records
   end
 
-  def
-    # cleanup
+  def lease_lost(lease_lost_input)
+    # lease was lost, cleanup
+  end
+
+  def shard_ended(shard_ended_input)
+    # shard has ended, cleanup
+  end
+
+  def shutdown_requested(shutdown_requested_input)
+    # shutdown has been requested
   end
 end
 
@@ -69,10 +77,11 @@ The sample application consists of two components:
 The following defaults are used in the sample application:
 
 * *Stream name*: `kclrbsample`
+* *Region*: `us-east-1`
 * *Number of shards*: 2
 * *Amazon KCL application name*: `RubyKCLSample`
-* *Amazon DynamoDB table for
-* *
+* *Amazon DynamoDB table for KCL application*: `RubyKCLSample`
+* *Amazon CloudWatch metrics namespace for KCL application*: `RubyKCLSample`
 
 ### Running the Data Producer
 
@@ -105,7 +114,7 @@ To run the data processor, run the following commands:
 
 ```sh
 cd samples
-rake run
+rake run properties_file=sample.properties
 ```
 
 #### Notes
@@ -118,6 +127,7 @@ To run the data processor, run the following commands:
 * `executableName = samples/sample_kcl.rb`
 * `streamName = kclrbsample`
 * `applicationName = RubyKCLSample`
+* `regionName = us-east-1`
 
 ### Cleaning Up
 
@@ -126,7 +136,7 @@ create a real DynamoDB table to track the Amazon KCL application state, thus pot
 incurring AWS costs. Once done, you can log in to AWS management console and delete these
 resources. Specifically, the sample application will create in your default AWS region
 
-* an *Amazon Kinesis
+* an *Amazon Kinesis Data Stream* named `kclrbsample`
 * an *Amazon DynamoDB table* named `RubyKCLSample`
 
 ## Running on Amazon EC2
@@ -147,7 +157,7 @@ Amazon Linux can be found at `/usr/bin/java` and should be 1.7 or greater.
 cd kclrb/samples
 rake run_producer
 # ... and in another terminal
-rake run
+rake run properties_file=sample.properties
 ```
 
 ## Under the Hood - What You Should Know about Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
@@ -177,6 +187,33 @@ all languages.
 
 ## Release Notes
 
+### Release 2.0.0 (February 26, 2019)
+* Added support for [Enhanced Fan-Out](https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/).
+  Enhanced Fan-Out provides dedicated throughput per stream consumer, and uses an HTTP/2 push API (SubscribeToShard) to deliver records with lower latency.
+* Updated the Amazon Kinesis Client Library for Java to version 2.1.2.
+  * Version 2.1.2 uses 4 additional Kinesis API's
+    __WARNING: These additional API's may require updating any explicit IAM policies__
+    * [`RegisterStreamConsumer`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_RegisterStreamConsumer.html)
+    * [`SubscribeToShard`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SubscribeToShard.html)
+    * [`DescribeStreamConsumer`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStreamConsumer.html)
+    * [`DescribeStreamSummary`](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStreamSummary.html)
+  * For more information about Enhanced Fan-Out with the Amazon Kinesis Client Library please see the [announcement](https://aws.amazon.com/blogs/aws/kds-enhanced-fanout/) and [developer documentation](https://docs.aws.amazon.com/streams/latest/dev/introduction-to-enhanced-consumers.html).
+* Added version 2 of the [`RecordProcessorBase`](https://github.com/awslabs/amazon-kinesis-client-ruby/blob/d5c2bbafb232b5e1ab947980a0bd8505c87978f9/lib/aws/kclrb/record_processor.rb#L102) which supports the new `ShardRecordProcessor` interface
+  * The `shutdown` method from version 1 has been replaced by `lease_lost` and `shard_ended`.
+  * Added the `lease_lost` method which is invoked when a lease is lost.
+    `lease_lost` replaces `shutdown(checkpointer, 'ZOMBIE')`.
+  * Added the `shard_ended` method which is invoked when all records from a split or merge have been processed.
+    `shard_ended` replaces `shutdown(checkpointer, 'TERMINATE')`.
+  * Added an optional method, `shutdown_requested`, which provides the record processor a last chance to checkpoint during the Amazon Kinesis Client Library shutdown process before the lease is canceled.
+    * To control how long the Amazon Kinesis Client Library waits for the record processors to complete shutdown, add `timeoutInSeconds=<seconds to wait>` to your properties file.
+* Updated the AWS Java SDK version to 2.4.0
+* MultiLangDaemon now provides logging using Logback.
+  * MultiLangDaemon supports custom configurations for logging via a Logback XML configuration file.
+  * The example [Rakefile](https://github.com/awslabs/amazon-kinesis-client-ruby/blob/master/samples/Rakefile) supports setting the logging configuration by adding `log_configuration=<log configuration file>` to the Rake command line.
+
+### Release 1.0.1 (January 19, 2017)
+* Upgraded to use version 1.7.2 of the [Amazon Kinesis Client library][amazon-kcl-github]
+
 ### Release 1.0.0 (December 30, 2014)
 * **aws-kclrb** gem which exposes an interface to allow implementation of record processors in Ruby
 using the Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
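The 2.0.0 release notes above say that `lease_lost` replaces `shutdown(checkpointer, 'ZOMBIE')` and `shard_ended` replaces `shutdown(checkpointer, 'TERMINATE')`. The following is a minimal migration sketch, not the sample application's code. It assumes gem version 2.0.0 or later when the gem is installed and otherwise falls back to small stand-in classes that mirror only what this diff shows. `MigratedProcessor` and `CountingCheckpointer` are hypothetical names, and the `checkpoint` method with an optional sequence number is an assumption about the checkpointer.

```ruby
begin
  require 'aws/kclrb' # real gem, version 2.0.0 or later
rescue LoadError
  # Stand-ins (assumption) so the sketch runs without the gem installed.
  module Aws
    module KCLrb
      module V2
        class RecordProcessorBase
          def version
            2
          end
        end

        class LeaseLostInput; end

        class ShardEndedInput
          attr_reader :checkpointer

          def initialize(checkpointer = nil)
            @checkpointer = checkpointer
          end
        end
      end
    end
  end
end

# Hypothetical test double that only counts checkpoint calls.
class CountingCheckpointer
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def checkpoint(_sequence_number = nil)
    @calls += 1
  end
end

class MigratedProcessor < Aws::KCLrb::V2::RecordProcessorBase
  # Was: shutdown(checkpointer, 'ZOMBIE'). Another worker may hold the
  # lease now, so release resources without checkpointing.
  def lease_lost(_lease_lost_input)
    @closed = true
  end

  # Was: shutdown(checkpointer, 'TERMINATE'). The shard has been fully
  # read, so checkpoint to mark it complete.
  def shard_ended(shard_ended_input)
    shard_ended_input.checkpointer.checkpoint
    @closed = true
  end

  def closed?
    @closed == true
  end
end

checkpointer = CountingCheckpointer.new
processor = MigratedProcessor.new
processor.lease_lost(Aws::KCLrb::V2::LeaseLostInput.new)
processor.shard_ended(Aws::KCLrb::V2::ShardEndedInput.new(checkpointer))
puts "checkpoints taken: #{checkpointer.calls}"
```

Only `shard_ended` touches the checkpointer here, which matches the split the release notes describe between the two former `shutdown` reasons.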
data/VERSION
CHANGED
@@ -1 +1 @@
-
+2.0.0
data/lib/aws/kclrb.rb
CHANGED
data/lib/aws/kclrb/kcl_process.rb
CHANGED
@@ -14,13 +14,16 @@
 
 require 'aws/kclrb/io_proxy'
 require 'aws/kclrb/checkpointer'
+require 'aws/kclrb/messages'
+require 'aws/kclrb/record_processor'
 
 module Aws
   module KCLrb
     # Error raised if the {KCLProcess} received an input action that it
     # could not parse or it could not handle.
-    class MalformedAction < RuntimeError;
-
+    class MalformedAction < RuntimeError;
+    end
+
     # Entry point for a KCL application in Ruby.
     #
     # Implementers of KCL applications in Ruby should instantiate this
@@ -31,12 +34,16 @@ module Aws
       # @param input [IO] An `IO`-like object to read input lines from.
       # @param output [IO] An `IO`-like object to write output lines to.
       # @param error [IO] An `IO`-like object to write error lines to.
-      def initialize(processor, input
-
+      def initialize(processor, input = $stdin, output = $stdout, error = $stderr)
+        if processor.version == 1
+          @processor = Aws::KCLrb::V2::V2ToV1Adapter.new(processor)
+        else
+          @processor = processor
+        end
         @io_proxy = IOProxy.new(input, output, error)
         @checkpointer = CheckpointerImpl.new(@io_proxy)
       end
-
+
       # Starts this KCL processor's main loop.
       def run
         action = @io_proxy.read_action
@@ -45,9 +52,9 @@ module Aws
           action = @io_proxy.read_action
         end
       end
-
+
       private
-
+
       # @api private
       # Parses an input action and invokes the appropriate method of the
       # record processor.
@@ -63,11 +70,20 @@ module Aws
         action_name = action.fetch('action')
         case action_name
         when 'initialize'
-          dispatch_to_processor(:init_processor,
+          dispatch_to_processor(:init_processor,
+                                Aws::KCLrb::V2::InitializeInput.new(action.fetch('shardId'),
+                                                                    action.fetch('sequenceNumber')))
         when 'processRecords'
-          dispatch_to_processor(:process_records,
-
-
+          dispatch_to_processor(:process_records,
+                                Aws::KCLrb::V2::ProcessRecordsInput.new(action.fetch('records'),
+                                                                        action.fetch('millisBehindLatest'),
+                                                                        @checkpointer))
+        when 'leaseLost'
+          dispatch_to_processor(:lease_lost, Aws::KCLrb::V2::LeaseLostInput.new)
+        when 'shardEnded'
+          dispatch_to_processor(:shard_ended, Aws::KCLrb::V2::ShardEndedInput.new(@checkpointer))
+        when 'shutdownRequested'
+          dispatch_to_processor(:shutdown_requested, Aws::KCLrb::V2::ShutdownRequestedInput.new(@checkpointer))
         else
           raise MalformedAction.new("Received an action which couldn't be understood. Action was '#{action}'")
         end
@@ -75,7 +91,7 @@ module Aws
       rescue KeyError => ke
         raise MalformedAction.new("Action '#{action}': #{ke.message}")
       end
-
+
       # @api private
       # Calls the specified method on the record processor, and handles
       # any resulting exceptions by writing to the error stream.
@@ -89,7 +105,7 @@ module Aws
         # of issue.
         @io_proxy.write_error(processor_error)
       end
-
+
     end
   end
 end
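The dispatch hunk above maps each incoming action name to a processor method and a V2 input object, and turns a missing key into `MalformedAction`. The routing can be sketched on its own as below. This is an illustration under stated assumptions, not the gem's code: the `DispatchSketch` module, its `route` method, and the `Struct`-based inputs are hypothetical stand-ins, and the JSON line with its values is an assumed example of what the MultiLangDaemon might send.

```ruby
require 'json'

# Hypothetical namespace so the sketch does not clash with the real gem.
module DispatchSketch
  class MalformedAction < RuntimeError; end

  # Stand-ins for the V2 input classes added in messages.rb.
  InitializeInput = Struct.new(:shard_id, :sequence_number)
  ProcessRecordsInput = Struct.new(:records, :millis_behind_latest, :checkpointer)
  LeaseLostInput = Class.new
  ShardEndedInput = Struct.new(:checkpointer)
  ShutdownRequestedInput = Struct.new(:checkpointer)

  # Returns [method_name, input_object] for a parsed action hash.
  def self.route(action, checkpointer = nil)
    case action.fetch('action')
    when 'initialize'
      [:init_processor,
       InitializeInput.new(action.fetch('shardId'), action.fetch('sequenceNumber'))]
    when 'processRecords'
      [:process_records,
       ProcessRecordsInput.new(action.fetch('records'),
                               action.fetch('millisBehindLatest'),
                               checkpointer)]
    when 'leaseLost'
      [:lease_lost, LeaseLostInput.new]
    when 'shardEnded'
      [:shard_ended, ShardEndedInput.new(checkpointer)]
    when 'shutdownRequested'
      [:shutdown_requested, ShutdownRequestedInput.new(checkpointer)]
    else
      raise MalformedAction, "Received an action which couldn't be understood. Action was '#{action}'"
    end
  rescue KeyError => e
    # A required key such as 'shardId' was absent from the action.
    raise MalformedAction, "Action '#{action}': #{e.message}"
  end
end

# Assumed example line; the shard id and sequence number are made up.
line = '{"action":"initialize","shardId":"shardId-000000000000","sequenceNumber":"1"}'
method_name, input = DispatchSketch.route(JSON.parse(line))
puts "#{method_name} for #{input.shard_id}"
```

Returning the method name and input as a pair keeps the routing testable without a processor; the real `KCLProcess` calls `dispatch_to_processor` directly instead.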
data/lib/aws/kclrb/messages.rb
ADDED
@@ -0,0 +1,53 @@
+module Aws
+  module KCLrb
+    module V2
+      # @abstract
+      # Input object for RecordProcessorBase#init_processor method.
+      class InitializeInput
+        attr_reader :shard_id, :sequence_number
+
+        def initialize(shard_id, sequence_number)
+          @shard_id = shard_id
+          @sequence_number = sequence_number
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#process_records method.
+      class ProcessRecordsInput
+        attr_reader :records, :millis_behind_latest, :checkpointer
+
+        def initialize(records, millis_behind_latest, checkpointer = nil)
+          @records = records
+          @millis_behind_latest = millis_behind_latest
+          @checkpointer = checkpointer
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#lease_lost method.
+      class LeaseLostInput
+      end
+
+      # @abstract
+      # Input object forRecordProcessorBase#shard_ended method.
+      class ShardEndedInput
+        attr_reader :checkpointer
+
+        def initialize(checkpointer = nil)
+          @checkpointer = checkpointer
+        end
+      end
+
+      # @abstract
+      # Input object for RecordProcessorBase#shutdown_requested method.
+      class ShutdownRequestedInput
+        attr_reader :checkpointer
+
+        def initialize(checkpointer = nil)
+          @checkpointer = checkpointer
+        end
+      end
+    end
+  end
+end
data/lib/aws/kclrb/record_processor.rb
CHANGED
@@ -1,5 +1,5 @@
 #
-# Copyright
+# Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 #
 # Licensed under the Amazon Software License (the "License").
 # You may not use this file except in compliance with the License.
@@ -32,7 +32,7 @@ module Aws
       def init_processor(shard_id)
         fail NotImplementedError.new
       end
-
+
       # @abstract
       # Called by a KCLProcess with a list of records to be processed and a checkpointer
       # which accepts sequence numbers from the records to indicate where in the stream
@@ -52,7 +52,7 @@ module Aws
       def process_records(records, checkpointer)
         fail NotImplementedError.new
       end
-
+
       # @abstract
       # Called by a KCLProcess instance to indicate that this record processor
       # should shutdown. After this is called, there will be no more calls to
@@ -72,6 +72,118 @@ module Aws
       def shutdown(checkpointer, reason)
         fail NotImplementedError.new
       end
+
+      # @abstract
+      # Called by a KCLProcess instance to indicate that this record processor
+      # is requesting a shutdown. This method should be overriden if required.
+      #
+      # @param checkpointer [Checkpointer] A checkpointer which accepts a sequence
+      # number or no parameters.
+      def shutdown_requested(checkpointer)
+      end
+
+      def version
+        1
+      end
+    end
+
+    module V2
+      # @abstract
+      # Base class for implementing a record processor.
+      #
+      # A `RecordProcessor` processes a shard in a stream. See {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/clientlibrary/interfaces/IRecordProcessor.java the corresponding KCL interface}.
+      # Its methods will be called as follows:
+      #
+      # 1. {#init_processor} will be called once
+      # 2. {#process_records} will be called zero or more times
+      # 3. {#lease_lost} will be called zero to one time
+      # 4. {#shard_ended} will be called zero or more times
+      # 5. {#shutdown_requested} will be called zero to one time
+      class RecordProcessorBase
+        # @abstract
+        # Called once by a KCLProcess before any calls to process_records.
+        #
+        # @param initialize_input [InitializeInput] Initialize processor input
+        # object
+        def init_processor(initialize_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess with a list of records to be processed and a
+        # checkpointer which accepts sequence numbers from the records to
+        # indicate where in the stream to checkpoint.
+        #
+        # @param record_processor_input [RecordProcessorInput] Process records
+        # input object
+        def process_records(process_records_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # should shutdown. After this is called, there will be no more calls to
+        # any other methods of this record processor.
+        #
+        # @param lease_lost_input [LeaseLostInput] Lease lost input object
+        #
+        # - Clients should not checkpoint because there is possibly another
+        # record processor which has acquired the lease for this shard.
+        def lease_lost(lease_lost_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # should shutdown. After this is called, there will be no more calls to
+        # any other methods of this record processor.
+        #
+        # @param shard_ended_input [ShardEndedInput] Shard ended input object
+        #
+        # - Clients need to checkpoint at this time.
+        def shard_ended(shard_ended_input)
+          fail NotImplementedError.new
+        end
+
+        # @abstract
+        # Called by a KCLProcess instance to indicate that this record processor
+        # is requesting a shutdown. This method should be overriden if required.
+        #
+        # @param shutdown_requested_input [ShutdownRequestedInput]
+        def shutdown_requested(shutdown_requested_input)
+        end
+
+        def version
+          2
+        end
+      end
+
+      class V2ToV1Adapter < Aws::KCLrb::V2::RecordProcessorBase
+        def initialize(processor)
+          @processor = processor
+        end
+
+        def init_processor(initialize_input)
+          @processor.init_processor(initialize_input.shard_id)
+        end
+
+        def process_records(process_records_input)
+          @processor.process_records(process_records_input.records,
+                                     process_records_input.checkpointer)
+        end
+
+        def lease_lost(lease_lost_input)
+          @processor.shutdown(nil, 'ZOMBIE')
+        end
+
+        def shard_ended(shard_ended_input)
+          @processor.shutdown(shard_ended_input.checkpointer, 'TERMINATE')
+        end
+
+        def shutdown_requested(shutdown_requested_input)
+          @processor.shutdown_requested(shutdown_requested_input.checkpointer)
+        end
+      end
     end
   end
 end
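The `V2ToV1Adapter` above lets an existing version 1 processor keep working: `KCLProcess` checks `processor.version` and wraps version 1 processors so that `lease_lost` becomes `shutdown(nil, 'ZOMBIE')` and `shard_ended` becomes `shutdown(checkpointer, 'TERMINATE')`. The sketch below replays that wrapping in isolation. Everything under `AdapterSketch` is a hypothetical stand-in written for illustration, including `LegacyProcessor`, `wrap`, and the `checkpoint` method name assumed on the checkpointer.

```ruby
# Hypothetical namespace so the sketch does not clash with the real gem.
module AdapterSketch
  ShardEndedInput = Struct.new(:checkpointer)

  # Mirrors the two shutdown-related methods of V2ToV1Adapter in the diff.
  class Adapter
    def initialize(processor)
      @processor = processor
    end

    def lease_lost(_lease_lost_input)
      @processor.shutdown(nil, 'ZOMBIE')
    end

    def shard_ended(shard_ended_input)
      @processor.shutdown(shard_ended_input.checkpointer, 'TERMINATE')
    end
  end

  # A version 1 style processor that records the reasons it was shut down.
  class LegacyProcessor
    attr_reader :shutdowns

    def initialize
      @shutdowns = []
    end

    def version
      1
    end

    def shutdown(checkpointer, reason)
      # The adapter passes nil on 'ZOMBIE', so only checkpoint on 'TERMINATE'.
      checkpointer.checkpoint if reason == 'TERMINATE'
      @shutdowns << reason
    end
  end

  # Same decision KCLProcess#initialize makes in the diff.
  def self.wrap(processor)
    processor.version == 1 ? Adapter.new(processor) : processor
  end
end

legacy = AdapterSketch::LegacyProcessor.new
wrapped = AdapterSketch.wrap(legacy)
wrapped.lease_lost(nil)
puts legacy.shutdowns.inspect
```

One consequence worth noting: because the adapter passes `nil` as the checkpointer for `'ZOMBIE'`, a version 1 `shutdown` that calls the checkpointer unconditionally would raise `NoMethodError` when a lease is lost.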
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: aws-kclrb
 version: !ruby/object:Gem::Version
-  version:
+  version: 2.0.0
 platform: ruby
 authors:
 - Amazon Web Services
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2019-02-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: multi_json
@@ -40,6 +40,7 @@ files:
 - lib/aws/kclrb/checkpointer.rb
 - lib/aws/kclrb/io_proxy.rb
 - lib/aws/kclrb/kcl_process.rb
+- lib/aws/kclrb/messages.rb
 - lib/aws/kclrb/record_processor.rb
 - spec/checkpointer_spec.rb
 - spec/io_proxy_spec.rb
@@ -65,7 +66,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.7.7
 signing_key:
 specification_version: 4
 summary: Amazon Kinesis Client Library for Ruby