aws-kclrb 1.0.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 08f02b6b3c02cdfc113ee2a43823c81d94b5b2bb
4
+ data.tar.gz: b55771d377c8b4b6880245049f93fc9498bd18f7
5
+ SHA512:
6
+ metadata.gz: 9602145a147e21e3cf5a2285bfc0933ed747303dc354683fa8b0e83a943ea92b94a19997e61d3a37faa173ce38cc20851e11dbe3f10d71d94adba75e78e72940
7
+ data.tar.gz: 6993ffb17e798357b00a22a00da7d6ba1fd43283c98b03651445ae07e4c58766899df26e5421f4f5454b7fa39cc068bd4d4be291ced66a5e998bed2213ff2bcb
data/.rspec ADDED
@@ -0,0 +1,4 @@
1
+ --color
2
+ --format progress
3
+ -r spec_helper
4
+ -I./lib
@@ -0,0 +1,6 @@
1
+ --title 'Amazon Kinesis Client Library for Ruby'
2
+ --markup markdown
3
+ --markup-provider redcarpet
4
+ --hide-api private
5
+ lib/**/*.rb - README.md LICENSE.txt NOTICE.txt
6
+
@@ -0,0 +1,40 @@
1
+
2
+ Amazon Software License
3
+
4
+ This Amazon Software License (“License”) governs your use, reproduction, and distribution of the accompanying software as specified below.
5
+ 1. Definitions
6
+
7
+ “Licensor” means any person or entity that distributes its Work.
8
+
9
+ “Software” means the original work of authorship made available under this License.
10
+
11
+ “Work” means the Software and any additions to or derivative works of the Software that are made available under this License.
12
+
13
+ The terms “reproduce,” “reproduction,” “derivative works,” and “distribution” have the meaning as provided under U.S. copyright law; provided, however, that for the purposes of this License, derivative works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work.
14
+
15
+ Works, including the Software, are “made available” under this License by including in or with the Work either (a) a copyright notice referencing the applicability of this License to the Work, or (b) a copy of this License.
16
+ 2. License Grants
17
+
18
+ 2.1 Copyright Grant. Subject to the terms and conditions of this License, each Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free, copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, sublicense and distribute its Work and any resulting derivative works in any form.
19
+
20
+ 2.2 Patent Grant. Subject to the terms and conditions of this License, each Licensor grants to you a perpetual, worldwide, non-exclusive, royalty-free patent license to make, have made, use, sell, offer for sale, import, and otherwise transfer its Work, in whole or in part. The foregoing license applies only to the patent claims licensable by Licensor that would be infringed by Licensor’s Work (or portion thereof) individually and excluding any combinations with any other materials or technology.
21
+ 3. Limitations
22
+
23
+ 3.1 Redistribution. You may reproduce or distribute the Work only if (a) you do so under this License, (b) you include a complete copy of this License with your distribution, and (c) you retain without modification any copyright, patent, trademark, or attribution notices that are present in the Work.
24
+
25
+ 3.2 Derivative Works. You may specify that additional or different terms apply to the use, reproduction, and distribution of your derivative works of the Work (“Your Terms”) only if (a) Your Terms provide that the use limitation in Section 3.3 applies to your derivative works, and (b) you identify the specific derivative works that are subject to Your Terms. Notwithstanding Your Terms, this License (including the redistribution requirements in Section 3.1) will continue to apply to the Work itself.
26
+
27
+ 3.3 Use Limitation. The Work and any derivative works thereof only may be used or intended for use with the web services, computing platforms or applications provided by Amazon.com, Inc. or its affiliates, including Amazon Web Services, Inc.
28
+
29
+ 3.4 Patent Claims. If you bring or threaten to bring a patent claim against any Licensor (including any claim, cross-claim or counterclaim in a lawsuit) to enforce any patents that you allege are infringed by any Work, then your rights under this License from such Licensor (including the grants in Sections 2.1 and 2.2) will terminate immediately.
30
+
31
+ 3.5 Trademarks. This License does not grant any rights to use any Licensor’s or its affiliates’ names, logos, or trademarks, except as necessary to reproduce the notices described in this License.
32
+
33
+ 3.6 Termination. If you violate any term of this License, then your rights under this License (including the grants in Sections 2.1 and 2.2) will terminate immediately.
34
+ 4. Disclaimer of Warranty.
35
+
36
+ THE WORK IS PROVIDED “AS IS” WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WARRANTIES OR CONDITIONS OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE OR NON-INFRINGEMENT. YOU BEAR THE RISK OF UNDERTAKING ANY ACTIVITIES UNDER THIS LICENSE. SOME STATES’ CONSUMER LAWS DO NOT ALLOW EXCLUSION OF AN IMPLIED WARRANTY, SO THIS DISCLAIMER MAY NOT APPLY TO YOU.
37
+ 5. Limitation of Liability.
38
+
39
+ EXCEPT AS PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER IN TORT (INCLUDING NEGLIGENCE), CONTRACT, OR OTHERWISE SHALL ANY LICENSOR BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF OR RELATED TO THIS LICENSE, THE USE OR INABILITY TO USE THE WORK (INCLUDING BUT NOT LIMITED TO LOSS OF GOODWILL, BUSINESS INTERRUPTION, LOST PROFITS OR DATA, COMPUTER FAILURE OR MALFUNCTION, OR ANY OTHER COMMERCIAL DAMAGES OR LOSSES), EVEN IF THE LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
40
+
@@ -0,0 +1,2 @@
1
+ Amazon Kinesis Client Library for Ruby
2
+ Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
@@ -0,0 +1,195 @@
1
+ # Amazon Kinesis Client Library for Ruby
2
+
3
+ This package provides an interface to the Amazon Kinesis Client Library's (KCL) [MultiLangDaemon][multi-lang-daemon]
4
+ for the Ruby language.
5
+ Developers can use the [Amazon KCL][amazon-kcl] to build distributed applications that process streaming data reliably
6
+ at scale. The [Amazon KCL][amazon-kcl] takes care of many of the complex tasks associated with distributed computing,
7
+ such as load-balancing across multiple instances, responding to instance failures, checkpointing processed records,
8
+ and reacting to changes in stream volume.
9
+ This package wraps and manages the interaction with the [MultiLangDaemon][multi-lang-daemon] which is part of the
10
+ [Amazon KCL for Java][amazon-kcl-github] so that developers can focus on implementing their record processor
11
+ executable. A record processor in Ruby typically looks something like:
12
+
13
+ ```ruby
+ #! /usr/bin/env ruby
+
+ require 'aws/kclrb'
+
+ class SampleRecordProcessor < Aws::KCLrb::RecordProcessorBase
+   def init_processor(shard_id)
+     # initialize
+   end
+
+   def process_records(records, checkpointer)
+     # process batch of records
+   end
+
+   def shutdown(checkpointer, reason)
+     # cleanup
+   end
+ end
+
+ if __FILE__ == $0
+   # Start the main processing loop
+   record_processor = SampleRecordProcessor.new
+   driver = Aws::KCLrb::KCLProcess.new(record_processor)
+   driver.run
+ end
+ ```
39
+
40
+ ## Before You Get Started
41
+
42
+ Before running the samples, you'll want to make sure that your environment is
43
+ configured to allow the samples to use your
44
+ [AWS Security Credentials](http://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html).
45
+
46
+ By default the samples use the [DefaultAWSCredentialsProviderChain][DefaultAWSCredentialsProviderChain]
47
+ so you'll want to make your credentials available to one of the credentials providers in that
48
+ provider chain. There are several ways to do this such as providing a `~/.aws/credentials` file,
49
+ or if you're running on Amazon EC2, you can associate an IAM role with your instance with appropriate
50
+ access.
51
+
52
+ For questions regarding [Amazon Kinesis Service][amazon-kinesis] and the client libraries please check the
53
+ [official documentation][amazon-kinesis-docs] as well as the [Amazon Kinesis Forums][kinesis-forum].
54
+
55
+ ## Running the Sample
56
+
57
+ Using the Amazon KCL for Ruby package requires the [MultiLangDaemon][multi-lang-daemon] which
58
+ is provided by the [Amazon KCL for Java][amazon-kcl-github]. Rake tasks are provided to start the sample
59
+ application(s) and download all the required dependencies.
60
+
61
+ The sample application consists of two components:
62
+
63
+ * A data producer (`samples/sample_kcl_producer.rb`): this script creates an Amazon Kinesis
64
+ stream and starts putting random records into it.
65
+ * A data processor (`samples/sample_kcl.rb`): this script is invoked by the
66
+ [MultiLangDaemon][multi-lang-daemon] and consumes the data from the Amazon Kinesis
67
+ stream and stores it into files (1 file per shard).
68
+
69
+ The following defaults are used in the sample application:
70
+
71
+ * *Stream name*: `kclrbsample`
72
+ * *Number of shards*: 2
73
+ * *Amazon KCL application name*: `RubyKCLSample`
74
+ * *Amazon DynamoDB table for Amazon KCL application*: `RubyKCLSample`
75
+ * *Sample application output directory*: `/tmp/kclrbsample/`
76
+
77
+ ### Running the Data Producer
78
+
79
+ To run the data producer, run the following commands:
80
+
81
+ ```sh
82
+ cd samples
83
+ rake run_producer
84
+ ```
85
+
86
+ #### Notes
87
+
88
+ * The [AWS Ruby SDK gem][aws-ruby-sdk-gem] needs to be installed as a pre-requisite. To install,
89
+ run:
90
+
91
+ ```sh
92
+ sudo gem install aws-sdk
93
+ ```
94
+
95
+ * The script `samples/sample_kcl_producer.rb` takes several parameters that you can use
96
+ to customize its behavior. To see the available options, run:
97
+
98
+ ```sh
99
+ samples/sample_kcl_producer.rb --help
100
+ ```
101
+
102
+ ### Running the Data Processor
103
+
104
+ To run the data processor, run the following commands:
105
+
106
+ ```sh
107
+ cd samples
108
+ rake run
109
+ ```
110
+
111
+ #### Notes
112
+
113
+ * The `JAVA_HOME` environment variable needs to point to a valid JVM.
114
+ * The rake task invokes the [MultiLangDaemon][multi-lang-daemon] passing to it the
115
+ properties file `samples/sample.properties`. This file contains the
116
+ information needed to bootstrap the sample application, e.g.
117
+
118
+ * `executableName = samples/sample_kcl.rb`
119
+ * `streamName = kclrbsample`
120
+ * `applicationName = RubyKCLSample`
121
+
122
+ ### Cleaning Up
123
+
124
+ This sample application creates a real Amazon Kinesis stream and sends real data to it, and
125
+ creates a real DynamoDB table to track the Amazon KCL application state, thus potentially
126
+ incurring AWS costs. Once done, you can log in to the AWS Management Console and delete these
127
+ resources. Specifically, the sample application will create in your default AWS region
128
+
129
+ * an *Amazon Kinesis stream* named `kclrbsample`
130
+ * an *Amazon DynamoDB table* named `RubyKCLSample`
131
+
132
+ ## Running on Amazon EC2
133
+
134
+ Running on Amazon EC2 is simple. Assuming you are already logged into an Amazon EC2
135
+ instance running Amazon Linux, the following steps will prepare your environment
136
+ for running the sample application. Note that the version of Java that ships with
137
+ Amazon Linux can be found at `/usr/bin/java` and should be 1.7 or greater.
138
+
139
+ ```sh
140
+ # install some prerequisites if missing
141
+ sudo yum install gcc patch git ruby rake rubygems ruby-devel
142
+ # install the AWS Ruby SDK (prerequisite for the producer) and the aws-kclrb gem
143
+ sudo gem install aws-sdk aws-kclrb
144
+ # clone the git repository to work with the samples
145
+ git clone https://github.com/awslabs/amazon-kinesis-client-ruby.git kclrb
146
+ # run the sample
147
+ cd kclrb/samples
148
+ rake run_producer
149
+ # ... and in another terminal
150
+ rake run
151
+ ```
152
+
153
+ ## Under the Hood - What You Should Know about Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
154
+
155
+ Amazon KCL for Ruby uses [Amazon KCL for Java][amazon-kcl-github] internally. We have implemented
156
+ a Java-based daemon, called the *MultiLangDaemon* that does all the heavy lifting. Our approach
157
+ has the daemon spawn the user-defined record processor script/program as a sub-process. The
158
+ *MultiLangDaemon* communicates with this sub-process over standard input/output using a simple
159
+ protocol, and therefore the record processor script/program can be written in any language.
160
+
161
+ At runtime, there will always be a one-to-one correspondence between a record processor, a child process,
162
+ and an [Amazon Kinesis Shard][amazon-kinesis-shard]. The *MultiLangDaemon* will make sure of
163
+ that, without any need for the developer to intervene.
164
+
165
+ In this release, we have abstracted these implementation details away and exposed an interface that enables
166
+ you to focus on writing record processing logic in Ruby. This approach enables [Amazon KCL][amazon-kcl] to
167
+ be language agnostic, while providing identical features and a similar parallel processing model across
168
+ all languages.
169
+
170
+ ## See Also
171
+
172
+ * [Developing Consumer Applications for Amazon Kinesis Using the Amazon Kinesis Client Library][amazon-kcl]
173
+ * The [Amazon KCL for Java][amazon-kcl-github]
174
+ * The [Amazon KCL for Python][amazon-kinesis-python-github]
175
+ * The [Amazon Kinesis Documentation][amazon-kinesis-docs]
176
+ * The [Amazon Kinesis Forum][kinesis-forum]
177
+
178
+ ## Release Notes
179
+
180
+ ### Release 1.0.0 (December 30, 2014)
181
+ * **aws-kclrb** gem which exposes an interface to allow implementation of record processors in Ruby
182
+ using the Amazon KCL's [MultiLangDaemon][multi-lang-daemon]
183
+ * **samples** directory contains a sample producer and processing applications using the Amazon KCL
184
+ for Ruby library.
185
+
186
+ [amazon-kinesis]: http://aws.amazon.com/kinesis
187
+ [amazon-kinesis-docs]: http://aws.amazon.com/documentation/kinesis/
188
+ [amazon-kinesis-shard]: http://docs.aws.amazon.com/kinesis/latest/dev/key-concepts.html
189
+ [amazon-kcl]: http://docs.aws.amazon.com/kinesis/latest/dev/kinesis-record-processor-app.html
190
+ [amazon-kcl-github]: https://github.com/awslabs/amazon-kinesis-client
191
+ [amazon-kinesis-python-github]: https://github.com/awslabs/amazon-kinesis-client-python
192
+ [multi-lang-daemon]: https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java
193
+ [DefaultAWSCredentialsProviderChain]: http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html
194
+ [kinesis-forum]: http://developer.amazonwebservices.com/connect/forum.jspa?forumID=169
195
+ [aws-ruby-sdk-gem]: https://rubygems.org/gems/aws-sdk
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 1.0.0
@@ -0,0 +1,16 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/record_processor'
16
+ require 'aws/kclrb/kcl_process'
@@ -0,0 +1,99 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/io_proxy'
16
+
17
+ module Aws
18
+ module KCLrb
19
+ # Error class used for wrapping exception names passed through the
20
+ # input stream.
21
+ class CheckpointError < RuntimeError
22
+ # @!attribute [r] value
23
+ # @return [String] the name of the exception wrapped by this instance.
24
+ attr_reader :value
25
+
26
+ # @param value [String] The name of the exception that was received
27
+ # while checkpointing. For more details see
28
+ # {https://github.com/awslabs/amazon-kinesis-client/tree/master/src/main/java/com/amazonaws/services/kinesis/clientlibrary/exceptions KCL exceptions}.
29
+ # Any of these exception names could be returned by the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}
30
+ # as a response to a checkpoint action.
31
+ def initialize(value)
32
+ @value = value
33
+ end
34
+
35
+ # @return [String] the name of the wrapped exception.
36
+ def to_s
37
+ @value.to_s
38
+ end
39
+ end
40
+
41
+ # @abstract
42
+ # A checkpointer class which allows you to make checkpoint requests.
43
+ #
44
+ # A checkpoint marks a point in a shard where you've successfully
45
+ # processed to. If this processor fails or loses its lease to that
46
+ # shard, another processor will be started either by this
47
+ # {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}
48
+ # or a different instance and resume at the most recent checkpoint
49
+ # in this shard.
50
+ class Checkpointer
51
+
52
+ # Checkpoints at a particular sequence number you provide or if `nil`
53
+ # was passed, the checkpoint will be at the end of the most recently
54
+ # delivered list of records.
55
+ #
56
+ # @param sequence_number [String, nil] The sequence number to checkpoint at
57
+ # or `nil` if you want to checkpoint at the farthest record.
58
+ # @raise [CheckpointError] if the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}
59
+ # returned a response indicating an error, or if the checkpointer
60
+ # encountered unexpected input.
61
+ def checkpoint(sequence_number=nil)
62
+ fail NotImplementedError.new
63
+ end
64
+ end
65
+
66
+
67
+ # @api private
68
+ # Default implementation of the {Checkpointer} abstract class.
69
+ class CheckpointerImpl
70
+ # @param io_proxy [IOProxy] An {IOProxy} object to be used to read/write
71
+ # checkpoint actions from/to the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}.
72
+ def initialize(io_proxy)
73
+ @io_proxy = io_proxy
74
+ end
75
+
76
+ # (see Checkpointer#checkpoint)
77
+ def checkpoint(sequence_number=nil)
78
+ @io_proxy.write_action('checkpoint', 'checkpoint' => sequence_number)
79
+ # Consume the response action
80
+ action = @io_proxy.read_action
81
+ # Happy response is expected to be of the form:
82
+ # `{"action":"checkpoint","checkpoint":"<seq-number>"}`
83
+ # Error response would look like the following:
84
+ # `{"action":"checkpoint","checkpoint":"<seq-number>","error":"<error-type>"}`
85
+ if action && action['action'] == 'checkpoint'
86
+ raise CheckpointError.new(action['error']) if action['error']
87
+ else
88
+ # We are in an invalid state. We will raise a checkpoint exception
89
+ # to the RecordProcessor indicating that the KCL (or KCLrb) is in
90
+ # an invalid state. See KCL documentation for description of this
91
+ # exception. Note that the documented guidance is that this exception
92
+ # is NOT retriable so the client code should exit (see
93
+ # https://github.com/awslabs/amazon-kinesis-client/tree/master/src/main/java/com/amazonaws/services/kinesis/clientlibrary/exceptions)
94
+ raise CheckpointError.new('InvalidStateException')
95
+ end
96
+ end
97
+ end
98
+ end
99
+ end
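The checkpointer documented above raises `CheckpointError` carrying the name of the exception reported by the MultiLangDaemon, and its comments state that `InvalidStateException` is not retriable. A minimal sketch of how a caller might act on that is shown below. It does not load the gem: `CheckpointError` here is a local stand-in with the same shape, while `checkpoint_with_retry` and `FlakyCheckpointer` are hypothetical names introduced only for illustration. Treating `ThrottlingException` as worth retrying is an assumption of this sketch, not something the gem enforces.

```ruby
# Local stand-in for Aws::KCLrb::CheckpointError so the sketch runs without the gem.
class CheckpointError < RuntimeError
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def to_s
    @value.to_s
  end
end

# Hypothetical helper: retries a checkpoint only while the reported error is
# 'ThrottlingException'; any other error name is re-raised to the caller.
def checkpoint_with_retry(checkpointer, sequence_number = nil, attempts: 3, delay: 0.1)
  tries = 0
  begin
    tries += 1
    checkpointer.checkpoint(sequence_number)
  rescue CheckpointError => e
    raise unless e.value == 'ThrottlingException' && tries < attempts
    sleep(delay * tries) # simple linear backoff between attempts
    retry
  end
end

# Hypothetical fake that is throttled a fixed number of times, then succeeds.
class FlakyCheckpointer
  attr_reader :calls

  def initialize(failures)
    @failures = failures
    @calls = 0
  end

  def checkpoint(sequence_number = nil)
    @calls += 1
    raise CheckpointError.new('ThrottlingException') if @calls <= @failures
    sequence_number
  end
end
```

In a real record processor the same helper could wrap the `checkpointer` passed to `process_records`; errors such as `InvalidStateException` propagate unchanged so the process can exit.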
@@ -0,0 +1,99 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+ #
15
+
16
+ require 'multi_json'
17
+
18
+ module Aws
19
+ module KCLrb
20
+ # @api private
21
+ # Internal class used by {KCLProcess} and {Checkpointer} to communicate
22
+ # with the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon} via the input and output streams.
23
+ class IOProxy
24
+ # @param input [IO, #readline] An `IO`-like object to read input lines from (e.g. `$stdin`).
25
+ # @param output [IO] An `IO`-like object to write output lines to (e.g. `$stdout`).
26
+ # @param error [IO] An `IO`-like object to write error lines to (e.g. `$stderr`).
27
+ def initialize(input, output, error)
28
+ @input = input
29
+ @output = output
30
+ @error = error
31
+ end
32
+
33
+ # Reads one line from the input IO, strips any
34
+ # leading/trailing whitespace, and skips empty lines.
35
+ #
36
+ # @return [String, nil] The line read from the input IO or `nil`
37
+ # if end of stream was reached.
38
+ def read_line
39
+ line = nil
40
+ begin
41
+ line = @input.readline
42
+ break unless line
43
+ line.strip!
44
+ end while line.empty?
45
+ line
46
+ rescue EOFError
47
+ nil
48
+ end
49
+
50
+ # Reads a line and decodes it as a message from the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}.
51
+ #
52
+ # @return [Hash] A hash representing the contents of the line, e.g.
53
+ # `{"action" => "initialize", "shardId" => "shardId-000001"}`
54
+ def read_action
55
+ line = read_line
56
+ if line
57
+ MultiJson.load(line)
58
+ end
59
+ end
60
+
61
+ # Writes a line to the output stream. The line is preceded and followed by a
62
+ # new line because other libraries could be writing to the output stream as
63
+ # well (e.g. some libs might write debugging info to `$stdout`) so we would
64
+ # like to prevent our lines from being interlaced with other messages so
65
+ # the MultiLangDaemon can understand them.
66
+ #
67
+ # @param line [String] A line to write to the output stream, e.g.
68
+ # `{"action":"status","responseFor":"<someAction>"}`
69
+ def write_line(line)
70
+ @output.write("\n#{line}\n")
71
+ @output.flush
72
+ end
73
+
74
+
75
+ # Writes a line to the error file.
76
+ #
77
+ # @param error [String,Exception] An exception or error message
78
+ def write_error(error)
79
+ if error.is_a?(Exception)
80
+ error = "#{error.class}: #{error.message}\n\t#{error.backtrace.join("\n\t")}"
81
+ end
82
+ @error.write("#{error}\n")
83
+ @error.flush
84
+ end
85
+
86
+ # Writes a response action to the {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon},
87
+ # in JSON of the form:
88
+ # `{"action":"<action>","detail1":"value1",...}`
89
+ # where the details depend on the type of the action. See {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon documentation} for more information.
90
+ #
91
+ # @param action [String] The action name that will be put into the output JSON's `action` attribute.
92
+ # @param details [Hash] Additional key-value pairs to be added to the action response.
93
+ def write_action(action, details={})
94
+ response = {'action' => action}.merge(details)
95
+ write_line(MultiJson.dump(response))
96
+ end
97
+ end
98
+ end
99
+ end
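The framing rules in the `IOProxy` file above (each outgoing message wrapped in newlines, blank lines skipped on input, payloads encoded as JSON) can be exercised without the gem. The following is an illustrative restatement, not the gem's code: `write_framed`, `read_framed`, and `framing_round_trip` are hypothetical names, and the standard library's `JSON` is used in place of `multi_json` so the sketch has no extra dependencies.

```ruby
require 'json'
require 'stringio'

# Writes one action as a JSON line, preceded and followed by a newline so it
# cannot be fused with unrelated output on the same stream.
def write_framed(io, action, details = {})
  payload = JSON.generate({ 'action' => action }.merge(details))
  io.write("\n#{payload}\n")
  io.flush
end

# Returns the next non-blank line parsed as JSON, or nil at end of stream.
def read_framed(io)
  while (line = io.gets)
    line = line.strip
    return JSON.parse(line) unless line.empty?
  end
  nil
end

# Writes two messages into an in-memory stream and reads them back.
def framing_round_trip
  pipe = StringIO.new
  write_framed(pipe, 'status', 'responseFor' => 'initialize')
  write_framed(pipe, 'checkpoint', 'checkpoint' => nil)
  pipe.rewind
  [read_framed(pipe), read_framed(pipe), read_framed(pipe)]
end
```

The third read returns `nil` because the stream is exhausted, which mirrors how `read_action` signals end of input to the main loop.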
@@ -0,0 +1,95 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/io_proxy'
16
+ require 'aws/kclrb/checkpointer'
17
+
18
+ module Aws
19
+ module KCLrb
20
+ # Error raised if the {KCLProcess} received an input action that it
21
+ # could not parse or it could not handle.
22
+ class MalformedAction < RuntimeError; end
23
+
24
+ # Entry point for a KCL application in Ruby.
25
+ #
26
+ # Implementers of KCL applications in Ruby should instantiate this
27
+ # class and invoke the {#run} method to start processing records.
28
+ class KCLProcess
29
+ # @param processor [RecordProcessorBase] A record processor
30
+ # to use for processing a shard.
31
+ # @param input [IO] An `IO`-like object to read input lines from.
32
+ # @param output [IO] An `IO`-like object to write output lines to.
33
+ # @param error [IO] An `IO`-like object to write error lines to.
34
+ def initialize(processor, input=$stdin, output=$stdout, error=$stderr)
35
+ @processor = processor
36
+ @io_proxy = IOProxy.new(input, output, error)
37
+ @checkpointer = CheckpointerImpl.new(@io_proxy)
38
+ end
39
+
40
+ # Starts this KCL processor's main loop.
41
+ def run
42
+ action = @io_proxy.read_action
43
+ while action do
44
+ process_action(action)
45
+ action = @io_proxy.read_action
46
+ end
47
+ end
48
+
49
+ private
50
+
51
+ # @api private
52
+ # Parses an input action and invokes the appropriate method of the
53
+ # record processor.
54
+ #
55
+ # @param action [Hash] A hash that represents an action to take with
56
+ # appropriate attributes, as retrieved from {IOProxy#read_action}, e.g.
57
+ #
58
+ # - `{"action":"initialize","shardId":"shardId-123"}`
59
+ # - `{"action":"processRecords","records":[{"data":"bWVvdw==","partitionKey":"cat","sequenceNumber":"456"}]}`
60
+ # - `{"action":"shutdown","reason":"TERMINATE"}`
61
+ # @raise [MalformedAction] if the action is missing expected attributes.
62
+ def process_action(action)
63
+ action_name = action.fetch('action')
64
+ case action_name
65
+ when 'initialize'
66
+ dispatch_to_processor(:init_processor, action.fetch('shardId'))
67
+ when 'processRecords'
68
+ dispatch_to_processor(:process_records, action.fetch('records'), @checkpointer)
69
+ when 'shutdown'
70
+ dispatch_to_processor(:shutdown, @checkpointer, action.fetch('reason'))
71
+ else
72
+ raise MalformedAction.new("Received an action which couldn't be understood. Action was '#{action}'")
73
+ end
74
+ @io_proxy.write_action('status', {'responseFor' => action_name})
75
+ rescue KeyError => ke
76
+ raise MalformedAction.new("Action '#{action}': #{ke.message}")
77
+ end
78
+
79
+ # @api private
80
+ # Calls the specified method on the record processor, and handles
81
+ # any resulting exceptions by writing to the error stream.
82
+ def dispatch_to_processor(method, *args)
83
+ @processor.send(method, *args)
84
+ rescue => processor_error
85
+ # We don't know what the client's code could raise and we have
86
+ # no way to recover if we let it propagate up further. We will
87
+ # mimic the KCL and pass over client errors. We print their
88
+ # stack trace to STDERR to help them notice and debug this type
89
+ # of issue.
90
+ @io_proxy.write_error(processor_error)
91
+ end
92
+
93
+ end
94
+ end
95
+ end
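One consequence of the `KCLProcess` file above is easy to miss: because `dispatch_to_processor` rescues errors raised by the record processor and only logs them, a `status` response is still written for that action. A self-contained sketch of that behaviour follows. It is a simplified stand-in rather than the gem's class: `run_actions`, `call_processor`, `FailingProcessor`, and `demo_dispatch` are hypothetical names, `nil` is passed where the real loop passes a checkpointer, and the standard `JSON` library replaces `multi_json`.

```ruby
require 'json'
require 'stringio'

# Hypothetical processor whose process_records always fails.
class FailingProcessor
  def init_processor(shard_id); end

  def process_records(records, checkpointer)
    raise 'boom'
  end

  def shutdown(checkpointer, reason); end
end

# Calls the processor and logs (rather than propagates) any error it raises.
def call_processor(processor, error, method, *args)
  processor.public_send(method, *args)
rescue StandardError => e
  error.puts("#{e.class}: #{e.message}")
end

# Reads JSON actions line by line, dispatches them, and acknowledges each one.
def run_actions(processor, input, output, error)
  input.each_line do |line|
    next if line.strip.empty?
    action = JSON.parse(line)
    name = action.fetch('action')
    case name
    when 'initialize'     then call_processor(processor, error, :init_processor, action.fetch('shardId'))
    when 'processRecords' then call_processor(processor, error, :process_records, action.fetch('records'), nil)
    when 'shutdown'       then call_processor(processor, error, :shutdown, nil, action.fetch('reason'))
    else raise ArgumentError, "unknown action #{name.inspect}"
    end
    output.puts(JSON.generate({ 'action' => 'status', 'responseFor' => name }))
  end
end

# Feeds the three action types through the loop and collects the results.
def demo_dispatch
  input = StringIO.new(<<~ACTIONS)
    {"action":"initialize","shardId":"shardId-123"}
    {"action":"processRecords","records":[]}
    {"action":"shutdown","reason":"TERMINATE"}
  ACTIONS
  output = StringIO.new
  error = StringIO.new
  run_actions(FailingProcessor.new, input, output, error)
  [output.string.lines.map { |l| JSON.parse(l)['responseFor'] }, error.string]
end
```

Since failures are acknowledged like successes, a processor that must not lose records would need its own error handling inside `process_records` before checkpointing.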
@@ -0,0 +1,77 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ module Aws
16
+ module KCLrb
17
+ # @abstract
18
+ # Base class for implementing a record processor.
19
+ #
20
+ # A `RecordProcessor` processes a shard in a stream. See {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/clientlibrary/interfaces/IRecordProcessor.java the corresponding KCL interface}.
21
+ # Its methods will be called as follows:
22
+ #
23
+ # 1. {#init_processor} will be called once
24
+ # 2. {#process_records} will be called zero or more times
25
+ # 3. {#shutdown} will be called if this {https://github.com/awslabs/amazon-kinesis-client/blob/master/src/main/java/com/amazonaws/services/kinesis/multilang/package-info.java MultiLangDaemon}
26
+ # instance loses the lease to this shard
27
+ class RecordProcessorBase
28
+ # @abstract
29
+ # Called once by a KCLProcess before any calls to process_records.
30
+ #
31
+ # @param shard_id [String] The shard id that this processor is going to be working on.
32
+ def init_processor(shard_id)
33
+ fail NotImplementedError.new
34
+ end
35
+
36
+ # @abstract
37
+ # Called by a KCLProcess with a list of records to be processed and a checkpointer
38
+ # which accepts sequence numbers from the records to indicate where in the stream
39
+ # to checkpoint.
40
+ #
41
+ # @param records [Array<Hash>] A list of records that are to be processed. A record
42
+ # looks like:
43
+ #
44
+ # ```
45
+ # {"data":"<base64 encoded string>","partitionKey":"someKey","sequenceNumber":"1234567890"}
46
+ # ```
47
+ #
48
+ # Note that `data` attribute is a base64 encoded string. You can use `Base64.decode64`
49
+ # in the `base64` module to get the original data as a string.
50
+ # @param checkpointer [Checkpointer] A checkpointer which accepts a sequence
51
+ # number or no parameters.
52
+ def process_records(records, checkpointer)
53
+ fail NotImplementedError.new
54
+ end
55
+
56
+ # @abstract
57
+ # Called by a KCLProcess instance to indicate that this record processor
58
+ # should shut down. After this is called, there will be no more calls to
59
+ # any other methods of this record processor.
60
+ #
61
+ # @param checkpointer [Checkpointer] A checkpointer which accepts a sequence
62
+ # number or no parameters.
63
+ # @param reason [String] The reason this record processor is being shut down,
64
+ # can be either `TERMINATE` or `ZOMBIE`.
65
+ #
66
+ # - If `ZOMBIE`, clients should not checkpoint because there is possibly
67
+ # another record processor which has acquired the lease for this shard.
68
+ # - If `TERMINATE` then `checkpointer.checkpoint()` (without parameters)
69
+ # should be called to checkpoint at the end of the shard so that this
70
+ # processor will be shut down and new processor(s) will be created
71
+ # for the child(ren) of this shard.
72
+ def shutdown(checkpointer, reason)
73
+ fail NotImplementedError.new
74
+ end
75
+ end
76
+ end
77
+ end
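The contract documented in `RecordProcessorBase` above (base64-encoded `data`, checkpointing by sequence number, and checkpointing on `TERMINATE` but not on `ZOMBIE`) can be sketched as follows. This is an assumption-laden illustration: `DecodingProcessor` and `RecordingCheckpointer` are hypothetical names, the processor is a plain class so the sketch runs without the gem (a real one would subclass `Aws::KCLrb::RecordProcessorBase`), and checkpointing after every batch is a design choice of this sketch rather than a requirement.

```ruby
# Hypothetical fake that records every checkpoint request it receives.
class RecordingCheckpointer
  attr_reader :checkpoints

  def initialize
    @checkpoints = []
  end

  def checkpoint(sequence_number = nil)
    @checkpoints << sequence_number
  end
end

# Hypothetical processor that decodes each record and checkpoints per batch.
class DecodingProcessor
  attr_reader :seen

  def init_processor(shard_id)
    @shard_id = shard_id
    @seen = []
  end

  def process_records(records, checkpointer)
    records.each do |record|
      # String#unpack1('m') performs the same decoding as Base64.decode64.
      @seen << record['data'].unpack1('m')
    end
    last = records.last
    checkpointer.checkpoint(last['sequenceNumber']) if last
  end

  def shutdown(checkpointer, reason)
    # Checkpoint only on TERMINATE; on ZOMBIE another processor may already
    # hold the lease for this shard.
    checkpointer.checkpoint if reason == 'TERMINATE'
  end
end
```

Checkpointing less often (for example every N batches) reduces checkpoint traffic at the cost of reprocessing more records after a failure.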
@@ -0,0 +1,50 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/io_proxy.rb'
16
+ require 'aws/kclrb/checkpointer.rb'
17
+
18
+ module Aws::KCLrb
19
+ describe Checkpointer do
20
+ describe "#checkpoint" do
21
+ it "should emit a checkpoint action and consume response action" do
22
+ seq_number = rand(100_000).to_s
23
+ expected_output_string = %Q[{"action":"checkpoint","checkpoint":"#{seq_number}"}]
24
+ input_string = %Q[{"action":"checkpoint","checkpoint":"#{seq_number}"}]
25
+ input = StringIO.new(input_string)
26
+ output = StringIO.new
27
+ error = StringIO.new
28
+ io_proxy = IOProxy.new(input, output, error)
29
+ checkpointer = CheckpointerImpl.new(io_proxy)
30
+ checkpointer.checkpoint(seq_number)
31
+ expect( output.string.strip ).to eq(expected_output_string.strip)
32
+ expect( input.eof? ).to eq(true)
33
+ end
34
+
35
+ it "should raise a CheckpointError when error is received from MultiLangDaemon" do
36
+ seq_number = rand(100_000).to_s
37
+ expected_output_string = %Q[{"action":"checkpoint","checkpoint":"#{seq_number}"}]
38
+ input_string = %Q[{"action":"checkpoint","checkpoint":"#{seq_number}","error":"ThrottlingException"}]
39
+ input = StringIO.new(input_string)
40
+ output = StringIO.new
41
+ error = StringIO.new
42
+ io_proxy = IOProxy.new(input, output, error)
43
+ checkpointer = CheckpointerImpl.new(io_proxy)
44
+ expect { checkpointer.checkpoint(seq_number) }.to raise_error(CheckpointError, /ThrottlingException/)
45
+ expect( output.string.strip ).to eq(expected_output_string.strip)
46
+ expect( input.eof? ).to eq(true)
47
+ end
48
+ end
49
+ end
50
+ end
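The exchange these specs exercise (write a `checkpoint` action, then read back one response line that may carry an `error` field) might be sketched as below. This is an assumption-laden illustration rather than the gem's `CheckpointerImpl`: `checkpoint_over` and `SketchCheckpointError` are hypothetical names, and the standard `json` library stands in for `multi_json`.

```ruby
require 'json'
require 'stringio'

# Hypothetical error type for this sketch (the gem defines its own CheckpointError).
class SketchCheckpointError < StandardError; end

# Writes a checkpoint action to `output`, then reads one response line from
# `input` and raises if the response carries an "error" field.
def checkpoint_over(input, output, sequence_number = nil)
  output.puts(JSON.generate('action' => 'checkpoint', 'checkpoint' => sequence_number))
  line = input.gets
  raise SketchCheckpointError, 'no response received' if line.nil?

  response = JSON.parse(line)
  raise SketchCheckpointError, response['error'] if response['error']

  response
end

# Successful round trip.
ok_input = StringIO.new(%({"action":"checkpoint","checkpoint":"456"}\n))
ok_output = StringIO.new
checkpoint_over(ok_input, ok_output, '456')

# Response carrying an error, as in the second spec above.
failing_input = StringIO.new(
  %({"action":"checkpoint","checkpoint":"456","error":"ThrottlingException"}\n)
)
error_message =
  begin
    checkpoint_over(failing_input, StringIO.new, '456')
    nil
  rescue SketchCheckpointError => e
    e.message
  end
```

Calling the helper without a sequence number serializes `"checkpoint":null`, which matches the shape of the no-argument checkpoint expected in the `KCLProcess` spec.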
@@ -0,0 +1,75 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/io_proxy'
16
+
17
+ module Aws::KCLrb
18
+ describe IOProxy do
19
+ describe "#read_line" do
20
+ it "should skip blank lines" do
21
+ input_string = " \nline1\n\n\n \nline2\n \n"
22
+ input = StringIO.new(input_string)
23
+ output = StringIO.new
24
+ error = StringIO.new
25
+ io_proxy = IOProxy.new(input, output, error)
26
+ expect( io_proxy.read_line ).to eq("line1")
27
+ expect( io_proxy.read_line ).to eq("line2")
28
+ expect( io_proxy.read_line ).to be_nil
29
+ end
30
+ it "should return nil on EOF" do
31
+ input_string = "line1\n"
32
+ input = StringIO.new(input_string)
33
+ output = StringIO.new
34
+ error = StringIO.new
35
+ io_proxy = IOProxy.new(input, output, error)
36
+ expect( io_proxy.read_line ).to eq("line1")
37
+ expect( io_proxy.read_line ).to be_nil
38
+ expect( io_proxy.read_line ).to be_nil
39
+ end
40
+ end
41
+ describe "#write_error" do
42
+ it "should write an error message to the error stream" do
43
+ input = StringIO.new
44
+ output = StringIO.new
45
+ error = StringIO.new
46
+ io_proxy = IOProxy.new(input, output, error)
47
+ io_proxy.write_error('an error message')
48
+ expect( error.string.strip ).to eq('an error message')
49
+ end
50
+ it "should write exception details to the error stream" do
51
+ input = StringIO.new
52
+ output = StringIO.new
53
+ error = StringIO.new
54
+ io_proxy = IOProxy.new(input, output, error)
55
+ begin
56
+ raise RuntimeError.new("Test error")
57
+ rescue => e
58
+ io_proxy.write_error(e)
59
+ end
60
+ #puts error.string
61
+ expect( error.string.strip ).to match(/RuntimeError.*Test error/)
62
+ end
63
+ end
64
+ describe "#write_action" do
65
+ it "should write a valid JSON action to the output stream" do
66
+ input = StringIO.new
67
+ output = StringIO.new
68
+ error = StringIO.new
69
+ io_proxy = IOProxy.new(input, output, error)
70
+ io_proxy.write_action('status', 'responseFor' => 'initialize')
71
+ expect( output.string.strip ).to eq('{"action":"status","responseFor":"initialize"}')
72
+ end
73
+ end
74
+ end
75
+ end
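The `#read_line` behavior checked above (skip blank lines, return `nil` at EOF) can be approximated with a few lines of plain Ruby. This is only a sketch of the behavior the specs describe; `read_nonblank_line` is a hypothetical helper and not the gem's `IOProxy` implementation.

```ruby
require 'stringio'

# Hypothetical helper: returns the next non-blank line with surrounding
# whitespace stripped, or nil once the stream is exhausted.
def read_nonblank_line(io)
  while (line = io.gets)
    stripped = line.strip
    return stripped unless stripped.empty?
  end
  nil
end

input = StringIO.new(" \nline1\n\n\n \nline2\n \n")
lines = [
  read_nonblank_line(input),
  read_nonblank_line(input),
  read_nonblank_line(input)
]
```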
@@ -0,0 +1,103 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'aws/kclrb/kcl_process'
16
+ require 'aws/kclrb/record_processor'
17
+
18
+ module Aws::KCLrb
19
+ # Dummy test class.
20
+ # The {#process_records} method will retry the checkpointing call
21
+ # in case of a throttling exception.
22
+ class TestRecordProcessor < RecordProcessorBase
23
+ def init_processor(shard_id)
24
+ # no-op
25
+ end
26
+
27
+ def process_records(records, checkpointer)
28
+ seq = records[0]['sequenceNumber']
29
+ begin
30
+ checkpointer.checkpoint(seq)
31
+ rescue CheckpointError => cpe
32
+ if cpe.value == 'ThrottlingException'
33
+ checkpointer.checkpoint(seq)
34
+ else
35
+ raise
36
+ end
37
+ end
38
+ end
39
+
40
+ def shutdown(checkpointer, reason)
41
+ checkpointer.checkpoint if reason == 'TERMINATE'
42
+ end
43
+ end
44
+
45
+ describe KCLProcess do
46
+ describe "#run" do
47
+ it "should respond to each action by invoking the corresponding processor's method and write a status message to the output IO" do
48
+ input_specs = [
49
+ {:method => :init_processor, :action => 'initialize', :input => '{"action":"initialize","shardId":"shard-000001"}'},
50
+ {:method => :process_records, :action => 'processRecords', :input => '{"action":"processRecords","records":[]}'},
51
+ {:method => :shutdown, :action => 'shutdown', :input => '{"action":"shutdown","reason":"TERMINATE"}'},
52
+ ]
53
+ # pick any of the actions randomly to avoid writing a test for each
54
+ input_spec = input_specs.sample
55
+ processor = double(RecordProcessorBase)
56
+ expect(processor).to receive(input_spec[:method]).once
57
+ input = StringIO.new(input_spec[:input])
58
+ output = StringIO.new
59
+ error = StringIO.new
60
+ driver = KCLProcess.new(processor, input, output, error)
61
+ driver.run
62
+
63
+ expected_output = %Q[{"action":"status","responseFor":"#{input_spec[:action]}"}]
64
+ expect( output.string.gsub(/\s+/, "") ).to eq(expected_output.gsub(/\s+/, ""))
65
+ expect( error.string ).to eq("")
66
+ expect( input.eof? ).to eq(true)
67
+ end
68
+ it "should process a normal stream of actions and produce expected output" do
69
+ input_string = <<-INPUT
70
+ {"action":"initialize","shardId":"shardId-123"}
71
+ {"action":"processRecords","records":[{"data":"bWVvdw==","partitionKey":"cat","sequenceNumber":"456"}]}
72
+ {"action":"checkpoint","checkpoint":"456","error":"ThrottlingException"}
73
+ {"action":"checkpoint","checkpoint":"456"}
74
+ {"action":"shutdown","reason":"TERMINATE"}
75
+ {"action":"checkpoint","checkpoint":"456"}
76
+ INPUT
77
+
78
+ # NOTE: The first checkpoint is expected to fail
79
+ # with a ThrottlingException and hence the
80
+ # retry.
81
+ expected_output_string = <<-OUTPUT
82
+ {"action":"status","responseFor":"initialize"}
83
+ {"action":"checkpoint","checkpoint":"456"}
84
+ {"action":"checkpoint","checkpoint":"456"}
85
+ {"action":"status","responseFor":"processRecords"}
86
+ {"action":"checkpoint","checkpoint":null}
87
+ {"action":"status","responseFor":"shutdown"}
88
+ OUTPUT
89
+ processor = TestRecordProcessor.new
90
+ input = StringIO.new(input_string)
91
+ output = StringIO.new
92
+ error = StringIO.new
93
+ driver = KCLProcess.new(processor, input, output, error)
94
+ driver.run
95
+
96
+ # outputs should be the same modulo whitespace
97
+ expect( output.string.gsub(/\s+/, "") ).to eq(expected_output_string.gsub(/\s+/, ""))
98
+ expect( error.string ).to eq("")
99
+ expect( input.eof? ).to eq(true)
100
+ end
101
+ end
102
+ end
103
+ end
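The action-to-callback mapping these specs rely on (`initialize` to `init_processor`, `processRecords` to `process_records`, `shutdown` to `shutdown`, each followed by a `status` message) could be sketched as a simple dispatch loop. This is a guess at the overall shape based on the spec inputs and outputs, not the gem's `KCLProcess#run`: `run_actions` and `LoggingProcessor` are hypothetical, and the checkpointer argument is passed as `nil` to keep the sketch small.

```ruby
require 'json'
require 'stringio'

# Hypothetical processor that only logs which callbacks were invoked.
class LoggingProcessor
  attr_reader :calls

  def initialize
    @calls = []
  end

  def init_processor(shard_id)
    @calls << [:init_processor, shard_id]
  end

  def process_records(records, _checkpointer)
    @calls << [:process_records, records.size]
  end

  def shutdown(_checkpointer, reason)
    @calls << [:shutdown, reason]
  end
end

# Reads one JSON action per line, invokes the matching callback, and writes
# a status message naming the action it responded to.
def run_actions(processor, input, output)
  while (line = input.gets)
    next if line.strip.empty?

    message = JSON.parse(line)
    case message['action']
    when 'initialize'     then processor.init_processor(message['shardId'])
    when 'processRecords' then processor.process_records(message['records'], nil)
    when 'shutdown'       then processor.shutdown(nil, message['reason'])
    else raise ArgumentError, "unknown action: #{message['action']}"
    end
    output.puts(JSON.generate('action' => 'status', 'responseFor' => message['action']))
  end
end

input = StringIO.new(<<~INPUT)
  {"action":"initialize","shardId":"shardId-123"}
  {"action":"processRecords","records":[]}
  {"action":"shutdown","reason":"TERMINATE"}
INPUT
output = StringIO.new
processor = LoggingProcessor.new
run_actions(processor, input, output)
```

A real driver would also hand each callback a working checkpointer that shares the same input and output streams, since checkpoint responses arrive interleaved with the actions, as the second spec above shows.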
@@ -0,0 +1,19 @@
1
+ #
2
+ # Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
3
+ #
4
+ # Licensed under the Amazon Software License (the "License").
5
+ # You may not use this file except in compliance with the License.
6
+ # A copy of the License is located at
7
+ #
8
+ # http://aws.amazon.com/asl/
9
+ #
10
+ # or in the "license" file accompanying this file. This file is distributed
11
+ # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
12
+ # express or implied. See the License for the specific language governing
13
+ # permissions and limitations under the License.
14
+
15
+ require 'simplecov'
16
+ # See http://rubydoc.info/gems/rspec-core/RSpec/Core/Configuration
17
+ RSpec.configure do |config|
18
+ config.run_all_when_everything_filtered = true
19
+ end
metadata ADDED
@@ -0,0 +1,72 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: aws-kclrb
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Amazon Web Services
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2014-12-30 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: multi_json
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.0'
27
+ description: A ruby interface for the Amazon Kinesis Client Library MultiLangDaemon
28
+ email:
29
+ executables: []
30
+ extensions: []
31
+ extra_rdoc_files: []
32
+ files:
33
+ - ".rspec"
34
+ - ".yardopts"
35
+ - LICENSE.txt
36
+ - NOTICE.txt
37
+ - README.md
38
+ - VERSION
39
+ - lib/aws/kclrb.rb
40
+ - lib/aws/kclrb/checkpointer.rb
41
+ - lib/aws/kclrb/io_proxy.rb
42
+ - lib/aws/kclrb/kcl_process.rb
43
+ - lib/aws/kclrb/record_processor.rb
44
+ - spec/checkpointer_spec.rb
45
+ - spec/io_proxy_spec.rb
46
+ - spec/kcl_process_spec.rb
47
+ - spec/spec_helper.rb
48
+ homepage: https://github.com/awslabs/amazon-kinesis-client-ruby
49
+ licenses:
50
+ - Amazon Software License
51
+ metadata: {}
52
+ post_install_message:
53
+ rdoc_options: []
54
+ require_paths:
55
+ - lib
56
+ required_ruby_version: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - ">="
59
+ - !ruby/object:Gem::Version
60
+ version: '0'
61
+ required_rubygems_version: !ruby/object:Gem::Requirement
62
+ requirements:
63
+ - - ">="
64
+ - !ruby/object:Gem::Version
65
+ version: '0'
66
+ requirements: []
67
+ rubyforge_project:
68
+ rubygems_version: 2.2.0
69
+ signing_key:
70
+ specification_version: 4
71
+ summary: Amazon Kinesis Client Library for Ruby
72
+ test_files: []