elasticity 1.3.1 → 1.4
data/README.md
CHANGED
@@ -1,7 +1,5 @@
 Elasticity provides programmatic access to Amazon's Elastic Map Reduce service. The aim is to conveniently wrap the API operations in a manner that makes working with EMR job flows from Ruby more productive and more enjoyable, without having to understand the nuts and bolts of the EMR REST API. At the very least, using Elasticity allows you to easily experiment with the EMR API :)
 
-**BACKLOG**: Have a look at the [backlog](https://www.pivotaltracker.com/projects/272429) to see where this is headed.
-
 **CREDITS**: AWS signing was used from [RightScale's](http://www.rightscale.com/) amazing [right_aws gem](https://github.com/rightscale/right_aws) which works extraordinarily well! If you need access to any AWS service (EC2, S3, etc.), have a look. Used camelize from ActiveSupport as well, thank you \Rails :)
 
 **CONTRIBUTIONS**:
@@ -18,7 +16,7 @@ All you have to do is <code>require 'elasticity'</code> and you're all set!
 
 # Simplified API Reference
 
-Elasticity currently provides simplified access to launching Hive and
+Elasticity currently provides simplified access to launching Hive, Pig and Custom Jar job flows, specifying several default values that you may optionally override:
 
 <pre>
 @action_on_failure = "TERMINATE_JOB_FLOW"
@@ -30,7 +28,7 @@ Elasticity currently provides simplified access to launching Hive and Pig job fl
 @slave_instance_type = "m1.small"
 </pre>
 
-These are all accessible from
+These are all accessible from the simplified jobs. See the PigJob description for an example.
 
 ### Bootstrap Actions
 
@@ -96,6 +94,23 @@ Use this as you would any other Pig variable.
 ...
 </pre>
 
+## Custom Jar
+
+Custom jar jobs are also available. To kick off a custom job, specify the path to the jar and any arguments you'd like passed to the jar.
+
+<pre>
+custom_jar = Elasticity::CustomJarJob.new(ENV["AWS_ACCESS_KEY_ID"], ENV["AWS_SECRET_KEY"])
+custom_jar.log_uri = "s3n://slif-test/output/logs"
+custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+  "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+  "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+  "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+])
+
+> "j-1IU6NM8OUPS9I"
+</pre>
+
 # Amazon API Reference
 
 Elasticity wraps all of the EMR API calls. Please see the Amazon guide for details on these operations because the default values aren't obvious (e.g. the meaning of <code>DescribeJobFlows</code> without parameters).
data/lib/elasticity.rb
CHANGED
data/lib/elasticity/custom_jar_job.rb
ADDED
@@ -0,0 +1,37 @@
+module Elasticity
+
+  class CustomJarJob < Elasticity::SimpleJob
+
+    def initialize(aws_access_key_id, aws_secret_access_key)
+      super
+      @name = "Elasticity Custom Jar Job"
+    end
+
+    def run(jar, arguments=nil)
+      jobflow_config = {
+        :name => @name,
+        :instances => {
+          :ec2_key_name => @ec2_key_name,
+          :hadoop_version => @hadoop_version,
+          :instance_count => @instance_count,
+          :master_instance_type => @master_instance_type,
+          :slave_instance_type => @slave_instance_type,
+        },
+        :steps => [
+          {
+            :action_on_failure => @action_on_failure,
+            :hadoop_jar_step => {
+              :jar => jar
+            },
+            :name => "Execute Custom Jar"
+          }
+        ]
+      }
+      jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
+      jobflow_config[:steps].first[:hadoop_jar_step][:args] = arguments if arguments
+      @emr.run_job_flow(jobflow_config)
+    end
+
+  end
+
+end
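Reviewer note: the new `run` method only adds `:log_uri` and `:args` to the job flow configuration when they are set. The sketch below isolates that behaviour in plain Ruby so it can be tried without AWS credentials. `build_custom_jar_config` and the `s3n://bucket/...` paths are hypothetical names used for illustration only; they are not part of Elasticity.

```ruby
# Hypothetical standalone helper (not part of the gem) that mirrors how
# CustomJarJob#run assembles its configuration hash.
def build_custom_jar_config(jar, arguments = nil, log_uri = nil)
  config = {
    :name => "Elasticity Custom Jar Job",
    :steps => [
      {
        :action_on_failure => "TERMINATE_JOB_FLOW",
        :hadoop_jar_step => { :jar => jar },
        :name => "Execute Custom Jar"
      }
    ]
  }
  # Optional keys are added only when a value was supplied.
  config[:log_uri] = log_uri if log_uri
  config[:steps].first[:hadoop_jar_step][:args] = arguments if arguments
  config
end

# Placeholder S3 paths, assumed for the example.
with_args    = build_custom_jar_config("s3n://bucket/job.jar", ["in", "out"], "s3n://bucket/logs")
without_args = build_custom_jar_config("s3n://bucket/job.jar")
```

Leaving the keys out entirely, rather than sending `nil`, matches what the two `#run` spec contexts in this release expect.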
data/lib/elasticity/version.rb
CHANGED
data/spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml
ADDED
@@ -0,0 +1,35 @@
+---
+- !ruby/struct:VCR::HTTPInteraction
+  request: !ruby/struct:VCR::Request
+    method: :get
+    uri: !ruby/regexp /^https:\/\/elasticmapreduce.amazonaws.com\/\?AWSAccessKeyId=1V37GRJQT0BF1NS7NT82&Instances.Ec2KeyName=sharethrough_dev&Instances.HadoopVersion=0.20&Instances.InstanceCount=2&Instances.MasterInstanceType=m1.small&Instances.SlaveInstanceType=m1.small&Name=Elasticity%20Custom%20Jar%20Job&Operation=RunJobFlow&.*Steps.member.1.ActionOnFailure=TERMINATE_JOB_FLOW&Steps.member.1.HadoopJarStep.Args.member.1=s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br&Steps.member.1.HadoopJarStep.Args.member.2=s3n://elasticmapreduce/samples/cloudburst/input/100k.br&Steps.member.1.HadoopJarStep.Args.member.3=s3n://slif_hadoop_test/cloudburst/output/2011-12-09&Steps.member.1.HadoopJarStep.Jar=s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar&Steps.member.1.Name=Execute%20Custom%20Jar/
+    body:
+    headers:
+      accept:
+      - "*/*; q=0.5, application/xml"
+      accept-encoding:
+      - gzip, deflate
+  response: !ruby/struct:VCR::Response
+    status: !ruby/struct:VCR::ResponseStatus
+      code: 200
+      message: OK
+    headers:
+      x-amzn-requestid:
+      - ba9e82df-22b9-11e1-beee-7d8a92482267
+      content-type:
+      - text/xml
+      date:
+      - Fri, 09 Dec 2011 23:01:29 GMT
+      content-length:
+      - "297"
+    body: |
+      <RunJobFlowResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
+        <RunJobFlowResult>
+          <JobFlowId>j-1IU6NM8OUPS9I</JobFlowId>
+        </RunJobFlowResult>
+        <ResponseMetadata>
+          <RequestId>ba9e82df-22b9-11e1-beee-7d8a92482267</RequestId>
+        </ResponseMetadata>
+      </RunJobFlowResponse>
+
+    http_version: "1.1"
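Reviewer note: the recorded request URI shows the nested job flow configuration flattened into dotted, camel-cased query parameters such as `Steps.member.1.HadoopJarStep.Args.member.1`. The sketch below is one way such a mapping could be written in plain Ruby. `camelize` and `flatten_params` here are hypothetical helpers, and Elasticity's actual implementation may differ; URL encoding and request signing are deliberately left out.

```ruby
# Hypothetical helper: "hadoop_jar_step" becomes "HadoopJarStep".
def camelize(term)
  term.to_s.split("_").map { |part| part.capitalize }.join
end

# Hypothetical helper: walk nested hashes and arrays, building dotted names.
# Array entries use the 1-based "member.N" form seen in the cassette URI.
def flatten_params(value, prefix = nil, out = {})
  case value
  when Hash
    value.each do |key, child|
      name = [prefix, camelize(key)].compact.join(".")
      flatten_params(child, name, out)
    end
  when Array
    value.each_with_index do |child, index|
      flatten_params(child, "#{prefix}.member.#{index + 1}", out)
    end
  else
    out[prefix] = value.to_s
  end
  out
end

config = {
  :name => "Elasticity Custom Jar Job",
  :instances => { :ec2_key_name => "sharethrough_dev", :instance_count => 2 },
  :steps => [
    {
      :action_on_failure => "TERMINATE_JOB_FLOW",
      :hadoop_jar_step => {
        :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
        :args => [
          "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
          "s3n://elasticmapreduce/samples/cloudburst/input/100k.br"
        ]
      },
      :name => "Execute Custom Jar"
    }
  ]
}

params = flatten_params(config)
```

With this input, `params` contains keys such as `Instances.Ec2KeyName` and `Steps.member.1.HadoopJarStep.Jar`, which line up with the parameter names in the recorded URI.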
data/spec/lib/elasticity/custom_jar_job_spec.rb
ADDED
@@ -0,0 +1,118 @@
+require 'spec_helper'
+
+describe Elasticity::CustomJarJob do
+
+  describe ".new" do
+
+    it "should have good defaults" do
+      custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+      custom_jar.aws_access_key_id.should == "access"
+      custom_jar.aws_secret_access_key.should == "secret"
+      custom_jar.ec2_key_name.should == "default"
+      custom_jar.hadoop_version.should == "0.20"
+      custom_jar.instance_count.should == 2
+      custom_jar.master_instance_type.should == "m1.small"
+      custom_jar.name.should == "Elasticity Custom Jar Job"
+      custom_jar.slave_instance_type.should == "m1.small"
+      custom_jar.action_on_failure.should == "TERMINATE_JOB_FLOW"
+      custom_jar.log_uri.should == nil
+    end
+
+  end
+
+  describe "#run" do
+
+    context "when there are arguments provided" do
+      it "should run the script with the specified variables and return the jobflow_id" do
+        aws = Elasticity::EMR.new("", "")
+        aws.should_receive(:run_job_flow).with(
+          {
+            :name => "Elasticity Custom Jar Job",
+            :log_uri => "s3n://slif-test/output/logs",
+            :instances => {
+              :ec2_key_name => "default",
+              :hadoop_version => "0.20",
+              :instance_count => 2,
+              :master_instance_type => "m1.small",
+              :slave_instance_type => "m1.small",
+            },
+            :steps => [
+              {
+                :action_on_failure => "TERMINATE_JOB_FLOW",
+                :hadoop_jar_step => {
+                  :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
+                  :args => [
+                    "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+                    "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+                    "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+                  ],
+                },
+                :name => "Execute Custom Jar"
+              }
+            ]
+          }).and_return("new_jobflow_id")
+        Elasticity::EMR.should_receive(:new).with("access", "secret").and_return(aws)
+
+        custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+        custom_jar.log_uri = "s3n://slif-test/output/logs"
+        custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+        jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+          "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+          "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+          "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+        ])
+        jobflow_id.should == "new_jobflow_id"
+      end
+    end
+
+    context "when there are no arguments provided" do
+      it "should run the script with the specified variables and return the jobflow_id" do
+        aws = Elasticity::EMR.new("", "")
+        aws.should_receive(:run_job_flow).with(
+          {
+            :name => "Elasticity Custom Jar Job",
+            :log_uri => "s3n://slif-test/output/logs",
+            :instances => {
+              :ec2_key_name => "default",
+              :hadoop_version => "0.20",
+              :instance_count => 2,
+              :master_instance_type => "m1.small",
+              :slave_instance_type => "m1.small",
+            },
+            :steps => [
+              {
+                :action_on_failure => "TERMINATE_JOB_FLOW",
+                :hadoop_jar_step => {
+                  :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar"
+                },
+                :name => "Execute Custom Jar"
+              }
+            ]
+          }).and_return("new_jobflow_id")
+        Elasticity::EMR.should_receive(:new).with("access", "secret").and_return(aws)
+
+        custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+        custom_jar.log_uri = "s3n://slif-test/output/logs"
+        custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+        jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar')
+        jobflow_id.should == "new_jobflow_id"
+      end
+    end
+
+  end
+
+  describe "integration happy path" do
+    use_vcr_cassette "custom_jar_job/cloudburst", :record => :none
+    it "should kick off the sample Amazon EMR CloudBurst application" do
+      custom_jar = Elasticity::CustomJarJob.new(AWS_ACCESS_KEY_ID, AWS_SECRET_KEY)
+      custom_jar.ec2_key_name = "sharethrough_dev"
+      jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+        "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+        "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+        "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+      ])
+      jobflow_id.should == "j-1IU6NM8OUPS9I"
+    end
+  end
+
+end
metadata
CHANGED
@@ -1,13 +1,12 @@
 --- !ruby/object:Gem::Specification
 name: elasticity
 version: !ruby/object:Gem::Version
-  hash:
+  hash: 7
   prerelease:
   segments:
   - 1
-  -
-
-  version: 1.3.1
+  - 4
+  version: "1.4"
 platform: ruby
 authors:
 - Robert Slifka
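Reviewer note: the `segments` list in the gemspec is the version string split into its numeric parts, which is also what RubyGems compares when ordering releases. A minimal check using `Gem::Version` from RubyGems, with the two version strings taken from this diff:

```ruby
require "rubygems"

old_version = Gem::Version.new("1.3.1")
new_version = Gem::Version.new("1.4")

# Segments are the parsed numeric components of each version string.
old_segments = old_version.segments
new_segments = new_version.segments

# Versions compare segment by segment, so 1.4 sorts after 1.3.1.
is_upgrade = new_version > old_version
```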
@@ -15,8 +14,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2011-
-default_executable:
+date: 2011-12-09 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rest-client
@@ -171,6 +169,7 @@ files:
 - elasticity.gemspec
 - lib/elasticity.rb
 - lib/elasticity/aws_request.rb
+- lib/elasticity/custom_jar_job.rb
 - lib/elasticity/emr.rb
 - lib/elasticity/hive_job.rb
 - lib/elasticity/job_flow.rb
@@ -181,6 +180,7 @@ files:
 - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml
 - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml
 - spec/fixtures/vcr_cassettes/add_jobflow_steps/add_multiple_steps.yml
+- spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml
 - spec/fixtures/vcr_cassettes/describe_jobflows/all_jobflows.yml
 - spec/fixtures/vcr_cassettes/direct/terminate_jobflow.yml
 - spec/fixtures/vcr_cassettes/hive_job/hive_ads.yml
@@ -192,13 +192,13 @@ files:
 - spec/fixtures/vcr_cassettes/set_termination_protection/protect_multiple_job_flows.yml
 - spec/fixtures/vcr_cassettes/terminate_jobflows/one_jobflow.yml
 - spec/lib/elasticity/aws_request_spec.rb
+- spec/lib/elasticity/custom_jar_job_spec.rb
 - spec/lib/elasticity/emr_spec.rb
 - spec/lib/elasticity/hive_job_spec.rb
 - spec/lib/elasticity/job_flow_spec.rb
 - spec/lib/elasticity/job_flow_step_spec.rb
 - spec/lib/elasticity/pig_job_spec.rb
 - spec/spec_helper.rb
-has_rdoc: true
 homepage: http://www.github.com/rslifka/elasticity
 licenses: []
 
@@ -228,7 +228,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 
 rubyforge_project:
-rubygems_version: 1.
+rubygems_version: 1.8.6
 signing_key:
 specification_version: 3
 summary: Programmatic access to Amazon's Elastic Map Reduce service.
@@ -236,6 +236,7 @@ test_files:
 - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml
 - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml
 - spec/fixtures/vcr_cassettes/add_jobflow_steps/add_multiple_steps.yml
+- spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml
 - spec/fixtures/vcr_cassettes/describe_jobflows/all_jobflows.yml
 - spec/fixtures/vcr_cassettes/direct/terminate_jobflow.yml
 - spec/fixtures/vcr_cassettes/hive_job/hive_ads.yml
@@ -247,6 +248,7 @@ test_files:
 - spec/fixtures/vcr_cassettes/set_termination_protection/protect_multiple_job_flows.yml
 - spec/fixtures/vcr_cassettes/terminate_jobflows/one_jobflow.yml
 - spec/lib/elasticity/aws_request_spec.rb
+- spec/lib/elasticity/custom_jar_job_spec.rb
 - spec/lib/elasticity/emr_spec.rb
 - spec/lib/elasticity/hive_job_spec.rb
 - spec/lib/elasticity/job_flow_spec.rb