elasticity 1.3.1 → 1.4

data/README.md CHANGED
@@ -1,7 +1,5 @@
  Elasticity provides programmatic access to Amazon's Elastic Map Reduce service. The aim is to conveniently wrap the API operations in a manner that makes working with EMR job flows from Ruby more productive and more enjoyable, without having to understand the nuts and bolts of the EMR REST API. At the very least, using Elasticity allows you to easily experiment with the EMR API :)
 
- **BACKLOG**: Have a look at the [backlog](https://www.pivotaltracker.com/projects/272429) to see where this is headed.
-
  **CREDITS**: AWS signing was used from [RightScale's](http://www.rightscale.com/) amazing [right_aws gem](https://github.com/rightscale/right_aws) which works extraordinarily well! If you need access to any AWS service (EC2, S3, etc.), have a look. Used camelize from ActiveSupport as well, thank you \Rails :)
 
  **CONTRIBUTIONS**:
@@ -18,7 +16,7 @@ All you have to do is <code>require 'elasticity'</code> and you're all set!
 
  # Simplified API Reference
 
- Elasticity currently provides simplified access to launching Hive and Pig job flows, specifying several default values that you may optionally override:
+ Elasticity currently provides simplified access to launching Hive, Pig and Custom Jar job flows, specifying several default values that you may optionally override:
 
  <pre>
  @action_on_failure = "TERMINATE_JOB_FLOW"
@@ -30,7 +28,7 @@ Elasticity currently provides simplified access to launching Hive and Pig job fl
  @slave_instance_type = "m1.small"
  </pre>
 
- These are all accessible from HiveJob and PigJob. See the PigJob description for an example.
+ These are all accessible from the simplified jobs. See the PigJob description for an example.
 
  ### Bootstrap Actions
 
@@ -96,6 +94,23 @@ Use this as you would any other Pig variable.
  ...
  </pre>
 
+ ## Custom Jar
+
+ Custom jar jobs are also available. To kick off a custom job, specify the path to the jar and any arguments you'd like passed to the jar.
+
+ <pre>
+ custom_jar = Elasticity::CustomJarJob.new(ENV["AWS_ACCESS_KEY_ID"], ENV["AWS_SECRET_KEY"])
+ custom_jar.log_uri = "s3n://slif-test/output/logs"
+ custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+ jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+   "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+   "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+   "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+ ])
+
+ > "j-1IU6NM8OUPS9I"
+ </pre>
+
  # Amazon API Reference
 
  Elasticity wraps all of the EMR API calls. Please see the Amazon guide for details on these operations because the default values aren't obvious (e.g. the meaning of <code>DescribeJobFlows</code> without parameters).
@@ -10,6 +10,8 @@ require 'elasticity/job_flow'
  require 'elasticity/job_flow_step'
 
  require 'elasticity/simple_job'
+
+ require 'elasticity/custom_jar_job'
  require 'elasticity/hive_job'
  require 'elasticity/pig_job'
 
@@ -0,0 +1,37 @@
+ module Elasticity
+
+   class CustomJarJob < Elasticity::SimpleJob
+
+     def initialize(aws_access_key_id, aws_secret_access_key)
+       super
+       @name = "Elasticity Custom Jar Job"
+     end
+
+     def run(jar, arguments=nil)
+       jobflow_config = {
+         :name => @name,
+         :instances => {
+           :ec2_key_name => @ec2_key_name,
+           :hadoop_version => @hadoop_version,
+           :instance_count => @instance_count,
+           :master_instance_type => @master_instance_type,
+           :slave_instance_type => @slave_instance_type,
+         },
+         :steps => [
+           {
+             :action_on_failure => @action_on_failure,
+             :hadoop_jar_step => {
+               :jar => jar
+             },
+             :name => "Execute Custom Jar"
+           }
+         ]
+       }
+       jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
+       jobflow_config[:steps].first[:hadoop_jar_step][:args] = arguments if arguments
+       @emr.run_job_flow(jobflow_config)
+     end
+
+   end
+
+ end
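The `run` method above only adds `:log_uri` and `:args` to the job flow hash when they are set. The following is a minimal standalone sketch of that behavior, not the gem's code: `build_custom_jar_config` and the `defaults` hash are hypothetical names introduced here, with default values copied from the spec in this release. The real method then hands the hash to `@emr.run_job_flow`.

```ruby
# Hypothetical helper mirroring the hash assembled by CustomJarJob#run.
# It builds the configuration only; nothing is sent to AWS.
def build_custom_jar_config(settings, jar, arguments = nil)
  config = {
    :name => settings[:name],
    :instances => {
      :ec2_key_name => settings[:ec2_key_name],
      :hadoop_version => settings[:hadoop_version],
      :instance_count => settings[:instance_count],
      :master_instance_type => settings[:master_instance_type],
      :slave_instance_type => settings[:slave_instance_type]
    },
    :steps => [
      {
        :action_on_failure => settings[:action_on_failure],
        :hadoop_jar_step => { :jar => jar },
        :name => "Execute Custom Jar"
      }
    ]
  }
  # Optional keys are added only when a value was supplied.
  config[:log_uri] = settings[:log_uri] if settings[:log_uri]
  config[:steps].first[:hadoop_jar_step][:args] = arguments if arguments
  config
end

# Defaults as asserted by the "should have good defaults" spec.
defaults = {
  :name => "Elasticity Custom Jar Job",
  :ec2_key_name => "default",
  :hadoop_version => "0.20",
  :instance_count => 2,
  :master_instance_type => "m1.small",
  :slave_instance_type => "m1.small",
  :action_on_failure => "TERMINATE_JOB_FLOW",
  :log_uri => nil
}

jar = "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar"

# No arguments and no log URI: neither optional key appears.
without_args = build_custom_jar_config(defaults, jar)

# With both supplied: :log_uri and :args are present.
with_args = build_custom_jar_config(
  defaults.merge(:log_uri => "s3n://slif-test/output/logs"),
  jar,
  ["s3n://elasticmapreduce/samples/cloudburst/input/100k.br"]
)
```

Leaving the keys out entirely, rather than sending `nil`, matches what the two `#run` spec contexts expect.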
@@ -1,3 +1,3 @@
  module Elasticity
-   VERSION = "1.3.1"
+   VERSION = "1.4"
  end
@@ -0,0 +1,35 @@
+ ---
+ - !ruby/struct:VCR::HTTPInteraction
+   request: !ruby/struct:VCR::Request
+     method: :get
+     uri: !ruby/regexp /^https:\/\/elasticmapreduce.amazonaws.com\/\?AWSAccessKeyId=1V37GRJQT0BF1NS7NT82&Instances.Ec2KeyName=sharethrough_dev&Instances.HadoopVersion=0.20&Instances.InstanceCount=2&Instances.MasterInstanceType=m1.small&Instances.SlaveInstanceType=m1.small&Name=Elasticity%20Custom%20Jar%20Job&Operation=RunJobFlow&.*Steps.member.1.ActionOnFailure=TERMINATE_JOB_FLOW&Steps.member.1.HadoopJarStep.Args.member.1=s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br&Steps.member.1.HadoopJarStep.Args.member.2=s3n://elasticmapreduce/samples/cloudburst/input/100k.br&Steps.member.1.HadoopJarStep.Args.member.3=s3n://slif_hadoop_test/cloudburst/output/2011-12-09&Steps.member.1.HadoopJarStep.Jar=s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar&Steps.member.1.Name=Execute%20Custom%20Jar/
+     body:
+     headers:
+       accept:
+       - "*/*; q=0.5, application/xml"
+       accept-encoding:
+       - gzip, deflate
+   response: !ruby/struct:VCR::Response
+     status: !ruby/struct:VCR::ResponseStatus
+       code: 200
+       message: OK
+     headers:
+       x-amzn-requestid:
+       - ba9e82df-22b9-11e1-beee-7d8a92482267
+       content-type:
+       - text/xml
+       date:
+       - Fri, 09 Dec 2011 23:01:29 GMT
+       content-length:
+       - "297"
+     body: |
+       <RunJobFlowResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
+         <RunJobFlowResult>
+           <JobFlowId>j-1IU6NM8OUPS9I</JobFlowId>
+         </RunJobFlowResult>
+         <ResponseMetadata>
+           <RequestId>ba9e82df-22b9-11e1-beee-7d8a92482267</RequestId>
+         </ResponseMetadata>
+       </RunJobFlowResponse>
+
+     http_version: "1.1"
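The parameter names in the cassette URI above (`Instances.Ec2KeyName`, `Steps.member.1.HadoopJarStep.Args.member.1`, and so on) line up with the nested hash passed to `run_job_flow`. The following is a rough sketch of one way such a mapping could be produced. It is an assumption for illustration, not Elasticity's actual request-building code; `camelize` and `flatten_params` are hypothetical helpers written for this example.

```ruby
# Hypothetical helper: :hadoop_jar_step becomes "HadoopJarStep".
def camelize(key)
  key.to_s.split("_").map(&:capitalize).join
end

# Hypothetical helper: walk a nested hash/array and emit dotted
# parameter names, numbering array items as "member.N" starting at 1.
def flatten_params(value, prefix = nil, out = {})
  case value
  when Hash
    value.each do |key, child|
      name = camelize(key)
      flatten_params(child, prefix ? "#{prefix}.#{name}" : name, out)
    end
  when Array
    value.each_with_index do |child, index|
      flatten_params(child, "#{prefix}.member.#{index + 1}", out)
    end
  else
    out[prefix] = value
  end
  out
end

# A trimmed version of the job flow hash from the integration spec.
config = {
  :name => "Elasticity Custom Jar Job",
  :instances => { :ec2_key_name => "sharethrough_dev", :instance_count => 2 },
  :steps => [
    {
      :action_on_failure => "TERMINATE_JOB_FLOW",
      :hadoop_jar_step => {
        :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
        :args => [
          "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
          "s3n://elasticmapreduce/samples/cloudburst/input/100k.br"
        ]
      },
      :name => "Execute Custom Jar"
    }
  ]
}

params = flatten_params(config)

# Print the parameters sorted by name, one per line.
params.sort.each { |name, value| puts "#{name}=#{value}" }
```

The recorded URI also carries parameters not modeled here, such as `AWSAccessKeyId` and `Operation`, plus whatever the `.*` in the regexp absorbs.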
@@ -0,0 +1,118 @@
+ require 'spec_helper'
+
+ describe Elasticity::CustomJarJob do
+
+   describe ".new" do
+
+     it "should have good defaults" do
+       custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+       custom_jar.aws_access_key_id.should == "access"
+       custom_jar.aws_secret_access_key.should == "secret"
+       custom_jar.ec2_key_name.should == "default"
+       custom_jar.hadoop_version.should == "0.20"
+       custom_jar.instance_count.should == 2
+       custom_jar.master_instance_type.should == "m1.small"
+       custom_jar.name.should == "Elasticity Custom Jar Job"
+       custom_jar.slave_instance_type.should == "m1.small"
+       custom_jar.action_on_failure.should == "TERMINATE_JOB_FLOW"
+       custom_jar.log_uri.should == nil
+     end
+
+   end
+
+   describe "#run" do
+
+     context "when there are arguments provided" do
+       it "should run the script with the specified variables and return the jobflow_id" do
+         aws = Elasticity::EMR.new("", "")
+         aws.should_receive(:run_job_flow).with(
+           {
+             :name => "Elasticity Custom Jar Job",
+             :log_uri => "s3n://slif-test/output/logs",
+             :instances => {
+               :ec2_key_name => "default",
+               :hadoop_version => "0.20",
+               :instance_count => 2,
+               :master_instance_type => "m1.small",
+               :slave_instance_type => "m1.small",
+             },
+             :steps => [
+               {
+                 :action_on_failure => "TERMINATE_JOB_FLOW",
+                 :hadoop_jar_step => {
+                   :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar",
+                   :args => [
+                     "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+                     "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+                     "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+                   ],
+                 },
+                 :name => "Execute Custom Jar"
+               }
+             ]
+           }).and_return("new_jobflow_id")
+         Elasticity::EMR.should_receive(:new).with("access", "secret").and_return(aws)
+
+         custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+         custom_jar.log_uri = "s3n://slif-test/output/logs"
+         custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+         jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+           "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+           "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+           "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+         ])
+         jobflow_id.should == "new_jobflow_id"
+       end
+     end
+
+     context "when there are no arguments provided" do
+       it "should run the script with the specified variables and return the jobflow_id" do
+         aws = Elasticity::EMR.new("", "")
+         aws.should_receive(:run_job_flow).with(
+           {
+             :name => "Elasticity Custom Jar Job",
+             :log_uri => "s3n://slif-test/output/logs",
+             :instances => {
+               :ec2_key_name => "default",
+               :hadoop_version => "0.20",
+               :instance_count => 2,
+               :master_instance_type => "m1.small",
+               :slave_instance_type => "m1.small",
+             },
+             :steps => [
+               {
+                 :action_on_failure => "TERMINATE_JOB_FLOW",
+                 :hadoop_jar_step => {
+                   :jar => "s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar"
+                 },
+                 :name => "Execute Custom Jar"
+               }
+             ]
+           }).and_return("new_jobflow_id")
+         Elasticity::EMR.should_receive(:new).with("access", "secret").and_return(aws)
+
+         custom_jar = Elasticity::CustomJarJob.new("access", "secret")
+         custom_jar.log_uri = "s3n://slif-test/output/logs"
+         custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
+         jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar')
+         jobflow_id.should == "new_jobflow_id"
+       end
+     end
+
+   end
+
+   describe "integration happy path" do
+     use_vcr_cassette "custom_jar_job/cloudburst", :record => :none
+     it "should kick off the sample Amazon EMR CloudBurst application" do
+       custom_jar = Elasticity::CustomJarJob.new(AWS_ACCESS_KEY_ID, AWS_SECRET_KEY)
+       custom_jar.ec2_key_name = "sharethrough_dev"
+       jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
+         "s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br",
+         "s3n://elasticmapreduce/samples/cloudburst/input/100k.br",
+         "s3n://slif_hadoop_test/cloudburst/output/2011-12-09",
+       ])
+       jobflow_id.should == "j-1IU6NM8OUPS9I"
+     end
+   end
+
+ end
metadata CHANGED
@@ -1,13 +1,12 @@
  --- !ruby/object:Gem::Specification
  name: elasticity
  version: !ruby/object:Gem::Version
-   hash: 25
+   hash: 7
    prerelease:
    segments:
    - 1
-   - 3
-   - 1
-   version: 1.3.1
+   - 4
+   version: "1.4"
  platform: ruby
  authors:
  - Robert Slifka
@@ -15,8 +14,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2011-11-16 00:00:00 -08:00
- default_executable:
+ date: 2011-12-09 00:00:00 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rest-client
@@ -171,6 +169,7 @@ files:
  - elasticity.gemspec
  - lib/elasticity.rb
  - lib/elasticity/aws_request.rb
+ - lib/elasticity/custom_jar_job.rb
  - lib/elasticity/emr.rb
  - lib/elasticity/hive_job.rb
  - lib/elasticity/job_flow.rb
@@ -181,6 +180,7 @@ files:
  - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml
  - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml
  - spec/fixtures/vcr_cassettes/add_jobflow_steps/add_multiple_steps.yml
+ - spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml
  - spec/fixtures/vcr_cassettes/describe_jobflows/all_jobflows.yml
  - spec/fixtures/vcr_cassettes/direct/terminate_jobflow.yml
  - spec/fixtures/vcr_cassettes/hive_job/hive_ads.yml
@@ -192,13 +192,13 @@ files:
  - spec/fixtures/vcr_cassettes/set_termination_protection/protect_multiple_job_flows.yml
  - spec/fixtures/vcr_cassettes/terminate_jobflows/one_jobflow.yml
  - spec/lib/elasticity/aws_request_spec.rb
+ - spec/lib/elasticity/custom_jar_job_spec.rb
  - spec/lib/elasticity/emr_spec.rb
  - spec/lib/elasticity/hive_job_spec.rb
  - spec/lib/elasticity/job_flow_spec.rb
  - spec/lib/elasticity/job_flow_step_spec.rb
  - spec/lib/elasticity/pig_job_spec.rb
  - spec/spec_helper.rb
- has_rdoc: true
  homepage: http://www.github.com/rslifka/elasticity
  licenses: []
 
@@ -228,7 +228,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  requirements: []
 
  rubyforge_project:
- rubygems_version: 1.4.1
+ rubygems_version: 1.8.6
  signing_key:
  specification_version: 3
  summary: Programmatic access to Amazon's Elastic Map Reduce service.
@@ -236,6 +236,7 @@ test_files:
  - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml
  - spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml
  - spec/fixtures/vcr_cassettes/add_jobflow_steps/add_multiple_steps.yml
+ - spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml
  - spec/fixtures/vcr_cassettes/describe_jobflows/all_jobflows.yml
  - spec/fixtures/vcr_cassettes/direct/terminate_jobflow.yml
  - spec/fixtures/vcr_cassettes/hive_job/hive_ads.yml
@@ -247,6 +248,7 @@ test_files:
  - spec/fixtures/vcr_cassettes/set_termination_protection/protect_multiple_job_flows.yml
  - spec/fixtures/vcr_cassettes/terminate_jobflows/one_jobflow.yml
  - spec/lib/elasticity/aws_request_spec.rb
+ - spec/lib/elasticity/custom_jar_job_spec.rb
  - spec/lib/elasticity/emr_spec.rb
  - spec/lib/elasticity/hive_job_spec.rb
  - spec/lib/elasticity/job_flow_spec.rb