elasticity 1.4.1 → 1.5

Files changed (29)
  1. data/.rvmrc +1 -1
  2. data/.travis.yml +11 -0
  3. data/HISTORY.md +6 -0
  4. data/README.md +23 -6
  5. data/elasticity.gemspec +4 -7
  6. data/lib/elasticity/custom_jar_job.rb +3 -2
  7. data/lib/elasticity/hive_job.rb +1 -3
  8. data/lib/elasticity/pig_job.rb +1 -4
  9. data/lib/elasticity/simple_job.rb +4 -3
  10. data/lib/elasticity/version.rb +1 -1
  11. data/spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml +33 -27
  12. data/spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml +33 -27
  13. data/spec/fixtures/vcr_cassettes/add_jobflow_steps/add_multiple_steps.yml +236 -222
  14. data/spec/fixtures/vcr_cassettes/custom_jar_job/cloudburst.yml +33 -27
  15. data/spec/fixtures/vcr_cassettes/describe_jobflows/all_jobflows.yml +64 -58
  16. data/spec/fixtures/vcr_cassettes/direct/terminate_jobflow.yml +30 -24
  17. data/spec/fixtures/vcr_cassettes/hive_job/hive_ads.yml +30 -24
  18. data/spec/fixtures/vcr_cassettes/modify_instance_groups/set_instances_to_3.yml +30 -24
  19. data/spec/fixtures/vcr_cassettes/pig_job/apache_log_reports.yml +32 -26
  20. data/spec/fixtures/vcr_cassettes/pig_job/apache_log_reports_with_bootstrap.yml +33 -27
  21. data/spec/fixtures/vcr_cassettes/run_jobflow/word_count.yml +33 -27
  22. data/spec/fixtures/vcr_cassettes/set_termination_protection/nonexistent_job_flows.yml +33 -27
  23. data/spec/fixtures/vcr_cassettes/set_termination_protection/protect_multiple_job_flows.yml +30 -24
  24. data/spec/fixtures/vcr_cassettes/terminate_jobflows/one_jobflow.yml +30 -24
  25. data/spec/lib/elasticity/emr_spec.rb +3 -0
  26. data/spec/lib/elasticity/hive_job_spec.rb +3 -9
  27. data/spec/lib/elasticity/pig_job_spec.rb +16 -1
  28. data/spec/spec_helper.rb +39 -26
  29. metadata +14 -90
data/.rvmrc CHANGED
@@ -1 +1 @@
- rvm use ree-1.8.7-2010.02@elasticity --create
+ rvm use ruby-1.9.2-p180@elasticity --create
data/.travis.yml ADDED
@@ -0,0 +1,11 @@
+ language: ruby
+ rvm:
+   - 1.8.7
+   - 1.9.2
+   - 1.9.3
+   - ree
+ branches:
+   only:
+     - master
+     - travis-testing
+
data/HISTORY.md CHANGED
@@ -1,3 +1,9 @@
+ ### 1.5
+
+ + Added support for Hadoop bootstrap actions to all job types (Pig, Hive and Custom Jar).
+ + Added support for REE 1.8.7-2011.12, Ruby 1.9.2 and 1.9.3.
+ + Updated to the latest versions of all development dependencies (notably VCR 2).
+
  ### 1.4.1

  + Added Elasticity::EMR#describe_jobflow("jobflow_id") for describing a specific job. If you happen to run hundreds of EMR jobs, this makes retrieving jobflow status much faster than using Elasticity::EMR#describe_jobflowS which pulls down and parses XML status for hundreds of jobs.
data/README.md CHANGED
@@ -1,10 +1,6 @@
  Elasticity provides programmatic access to Amazon's Elastic Map Reduce service. The aim is to conveniently wrap the API operations in a manner that makes working with EMR job flows from Ruby more productive and more enjoyable, without having to understand the nuts and bolts of the EMR REST API. At the very least, using Elasticity allows you to easily experiment with the EMR API :)

- **CREDITS**: AWS signing was used from [RightScale's](http://www.rightscale.com/) amazing [right_aws gem](https://github.com/rightscale/right_aws) which works extraordinarily well! If you need access to any AWS service (EC2, S3, etc.), have a look. Used camelize from ActiveSupport as well, thank you \Rails :)
-
- **CONTRIBUTIONS**:
-
- + [Wouter Broekhof](https://github.com/wouter/) - HTTPS and AWS region support, additional params to describe_jobflows.
+ [![Build Status](https://secure.travis-ci.org/rslifka/elasticity.png)](http://travis-ci.org/rslifka/elasticity)

  # Installation and Usage

@@ -99,7 +95,7 @@ Use this as you would any other Pig variable.
  Custom jar jobs are also available. To kick off a custom job, specify the path to the jar and any arguments you'd like passed to the jar.

  <pre>
- custom_jar = Elasticity::PigJob.new(ENV["AWS_ACCESS_KEY_ID"], ENV["AWS_SECRET_KEY"])
+ custom_jar = Elasticity::CustomJarJob.new(ENV["AWS_ACCESS_KEY_ID"], ENV["AWS_SECRET_KEY"])
  custom_jar.log_uri = "s3n://slif-test/output/logs"
  custom_jar.action_on_failure = "TERMINATE_JOB_FLOW"
  jobflow_id = custom_jar.run('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar', [
@@ -111,6 +107,12 @@ Custom jar jobs are also available. To kick off a custom job, specify the path
  > "j-1IU6NM8OUPS9I"
  </pre>

+ Custom jar jobs support arbitrary entry points. Specify the class on which to call main() either via the JAR manifest or as the first argument to the job:
+
+ <pre>
+ Elasticity::CustomJarJob.new(key, secret).run(s3_jar_path, ['MyCustomClass', 'arg1', 'arg2'])
+ </pre>
+
  # Amazon API Reference

  Elasticity wraps all of the EMR API calls. Please see the Amazon guide for details on these operations because the default values aren't obvious (e.g. the meaning of <code>DescribeJobFlows</code> without parameters).
@@ -343,6 +345,19 @@ If you're chomping at the bit to initiate some EMR functionality that isn't wrap
  > &lt;DescribeJobFlowsResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009...
  </pre>

+ # Something Borrowed...
+
+ AWS signing was used from [RightScale's](http://www.rightscale.com/) amazing [right_aws gem](https://github.com/rightscale/right_aws) which works extraordinarily well! If you need access to any AWS service (EC2, S3, etc.), have a look.
+
+ Used camelize from ActiveSupport as well, thank you \Rails :)
+
+ # Thanks!
+
+ Thanks to the following people who have contributed patches or helpful suggestions:
+
+ + [Aram Price](https://github.com/aramprice/)
+ + [Wouter Broekhof](https://github.com/wouter/)
+
  # License

  <pre>
@@ -363,6 +378,8 @@ If you're chomping at the bit to initiate some EMR functionality that isn't wrap

  ### Development Notes for Slif

+ [Versioning Guide](http://docs.rubygems.org/read/chapter/7#page27), c/o [@brokenladder](https://twitter.com/#!/brokenladder)
+
  <pre>
  rake build # Build lorem-0.0.2.gem into the pkg directory
  rake install # Build and install lorem-0.0.2.gem into system gems
data/elasticity.gemspec CHANGED
@@ -9,18 +9,15 @@ Gem::Specification.new do |s|
  s.authors = ["Robert Slifka"]
  s.homepage = "http://www.github.com/rslifka/elasticity"
  s.summary = %q{Programmatic access to Amazon's Elastic Map Reduce service.}
- s.description = %q{Programmatic access to Amazon's Elastic Map Reduce service.}
+ s.description = %q{Programmatic access to Amazon's Elastic Map Reduce service, driven by the Sharethrough team's requirements for belting out EMR jobs.}

  s.add_dependency("rest-client")
  s.add_dependency("nokogiri")

- s.add_development_dependency("autotest-fsevent")
- s.add_development_dependency("autotest-growl")
  s.add_development_dependency("rake")
- s.add_development_dependency("rspec", ">= 2.5.0")
- s.add_development_dependency("vcr", ">= 1.5.1")
- s.add_development_dependency("webmock", ">= 1.6.2")
- s.add_development_dependency("ZenTest")
+ s.add_development_dependency("rspec", ">= 2.8.0")
+ s.add_development_dependency("vcr", "~> 2.0")
+ s.add_development_dependency("webmock", "~> 1.8.0")

  s.files = `git ls-files`.split("\n")
  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
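Note on the gemspec change: the development dependencies move from open-ended `>=` bounds to pessimistic `~>` constraints for vcr and webmock. The sketch below checks what each constraint admits, using RubyGems' own `Gem::Requirement`. The `CONSTRAINTS` hash, the `allowed?` helper, and the version strings passed to it are illustrative assumptions, not part of the gem.

```ruby
require "rubygems"

# Constraints copied from the added lines of the gemspec hunk.
CONSTRAINTS = {
  "rspec"   => Gem::Requirement.new(">= 2.8.0"),
  "vcr"     => Gem::Requirement.new("~> 2.0"),
  "webmock" => Gem::Requirement.new("~> 1.8.0")
}

# Hypothetical helper: true when the version string satisfies the gem's constraint.
def allowed?(gem_name, version)
  CONSTRAINTS.fetch(gem_name).satisfied_by?(Gem::Version.new(version))
end

# The version strings below are arbitrary test inputs.
puts allowed?("vcr", "2.9.3")       # "~> 2.0" means >= 2.0 and < 3.0
puts allowed?("vcr", "3.0.0")
puts allowed?("webmock", "1.8.11")  # "~> 1.8.0" means >= 1.8.0 and < 1.9.0
puts allowed?("webmock", "1.9.0")
puts allowed?("rspec", "3.0.0")     # ">=" has no upper bound
```

The practical effect is that a future major release of vcr, or a minor release of webmock, would not be picked up silently by the development bundle.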
data/lib/elasticity/custom_jar_job.rb CHANGED
@@ -27,9 +27,10 @@ module Elasticity
  }
  ]
  }
- jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
+
  jobflow_config[:steps].first[:hadoop_jar_step][:args] = arguments if arguments
- @emr.run_job_flow(jobflow_config)
+
+ run_job(jobflow_config)
  end

  end
data/lib/elasticity/hive_job.rb CHANGED
@@ -61,9 +61,7 @@ module Elasticity
  ]
  }

- jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
-
- @emr.run_job_flow(jobflow_config)
+ run_job(jobflow_config)
  end

  end
data/lib/elasticity/pig_job.rb CHANGED
@@ -75,10 +75,7 @@ module Elasticity
  ]
  }

- jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
- jobflow_config.merge!(get_bootstrap_actions)
-
- @emr.run_job_flow(jobflow_config)
+ run_job(jobflow_config)
  end

  private
data/lib/elasticity/simple_job.rb CHANGED
@@ -40,9 +40,10 @@ module Elasticity

  private

- def get_bootstrap_actions
- return {} unless @hadoop_actions && !@hadoop_actions.empty?
- { :bootstrap_actions => @hadoop_actions }
+ def run_job(jobflow_config)
+ jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
+ jobflow_config.merge!(:bootstrap_actions => @hadoop_actions) if @hadoop_actions && !@hadoop_actions.empty?
+ @emr.run_job_flow(jobflow_config)
  end

  end
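Note on the refactor: the three job classes previously merged `:log_uri` (and, for Pig only, bootstrap actions) themselves before calling `@emr.run_job_flow`. The new `run_job` centralizes both merges, which appears to be how bootstrap actions reach every job type. A minimal standalone sketch of that behavior follows. `FakeEMR`, `SketchJob`, and `add_bootstrap_action` are hypothetical stand-ins for illustration and are not Elasticity's classes or API; only the body of `run_job` is taken from the diff.

```ruby
# Hypothetical stand-in for the EMR client: returns the config it was given.
class FakeEMR
  def run_job_flow(config)
    config
  end
end

# Hypothetical job class reusing the run_job body from the diff above.
class SketchJob
  attr_accessor :log_uri

  def initialize(emr)
    @emr = emr
    @hadoop_actions = []
  end

  # Hypothetical helper: records one bootstrap action hash.
  def add_bootstrap_action(action)
    @hadoop_actions << action
  end

  def run(steps)
    run_job({ :name => "Sketch job", :steps => steps })
  end

  private

  # Optional settings are merged only when present, then the job is submitted.
  def run_job(jobflow_config)
    jobflow_config.merge!(:log_uri => @log_uri) if @log_uri
    jobflow_config.merge!(:bootstrap_actions => @hadoop_actions) if @hadoop_actions && !@hadoop_actions.empty?
    @emr.run_job_flow(jobflow_config)
  end
end

# A job with nothing configured sends neither optional key.
plain = SketchJob.new(FakeEMR.new).run([])

# A job with a log URI and one bootstrap action sends both.
job = SketchJob.new(FakeEMR.new)
job.log_uri = "s3n://example-bucket/logs"
job.add_bootstrap_action({ :name => "example action" })
full = job.run([])

puts plain.keys.inspect
puts full.keys.inspect
```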
data/lib/elasticity/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Elasticity
- VERSION = "1.4.1"
+ VERSION = "1.5"
  end
data/spec/fixtures/vcr_cassettes/add_instance_groups/one_group_successful.yml CHANGED
@@ -1,38 +1,44 @@
  ---
- - !ruby/struct:VCR::HTTPInteraction
- request: !ruby/struct:VCR::Request
- method: :get
- uri: !ruby/regexp /^https:\/\/elasticmapreduce.amazonaws.com\/\?AWSAccessKeyId=\S+&InstanceGroups.member.1.InstanceCount=1&InstanceGroups.member.1.InstanceRole=TASK&InstanceGroups.member.1.InstanceType=m1.small&InstanceGroups.member.1.Market=ON_DEMAND&InstanceGroups.member.1.Name=Go%20Canucks%20Go!&JobFlowId=j-OALI7TZTQMHX&Operation=AddInstanceGroups/
+ http_interactions:
+ - request:
+ method: get
+ uri: !ruby/regexp /^https/
  body:
+ encoding: US-ASCII
+ string: ""
  headers:
- accept:
+ Accept:
  - "*/*; q=0.5, application/xml"
- accept-encoding:
+ Accept-Encoding:
  - gzip, deflate
- response: !ruby/struct:VCR::Response
- status: !ruby/struct:VCR::ResponseStatus
+ response:
+ status:
  code: 200
  message: OK
  headers:
- x-amzn-requestid:
- - ddd0d158-67eb-11e0-ba06-2b5c43005be2
- content-type:
- - text/xml
- date:
- - Sat, 16 Apr 2011 05:39:15 GMT
- content-length:
+ Content-Length:
  - "411"
- body: |
- <AddInstanceGroupsResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
- <AddInstanceGroupsResult>
- <JobFlowId>j-OALI7TZTQMHX</JobFlowId>
- <InstanceGroupIds>
- <member>ig-2GOVEN6HVJZID</member>
- </InstanceGroupIds>
- </AddInstanceGroupsResult>
- <ResponseMetadata>
- <RequestId>ddd0d158-67eb-11e0-ba06-2b5c43005be2</RequestId>
- </ResponseMetadata>
- </AddInstanceGroupsResponse>
+ Date:
+ - Sat, 16 Apr 2011 05:39:15 GMT
+ Content-Type:
+ - text/xml
+ X-Amzn-Requestid:
+ - ddd0d158-67eb-11e0-ba06-2b5c43005be2
+ body:
+ encoding: US-ASCII
+ string: |
+ <AddInstanceGroupsResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
+ <AddInstanceGroupsResult>
+ <JobFlowId>j-OALI7TZTQMHX</JobFlowId>
+ <InstanceGroupIds>
+ <member>ig-2GOVEN6HVJZID</member>
+ </InstanceGroupIds>
+ </AddInstanceGroupsResult>
+ <ResponseMetadata>
+ <RequestId>ddd0d158-67eb-11e0-ba06-2b5c43005be2</RequestId>
+ </ResponseMetadata>
+ </AddInstanceGroupsResponse>

  http_version: "1.1"
+ recorded_at: Sat, 03 Mar 2012 22:59:32 GMT
+ recorded_with: VCR 2.0.0
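Note on the cassette change: the VCR 1 cassettes serialize Ruby structs (`!ruby/struct:VCR::HTTPInteraction`), while the re-recorded VCR 2 cassettes are plain hashes under a top-level `http_interactions` key. One consequence is that the new layout can be inspected with the standard YAML library alone. The sketch below assumes a trimmed, hypothetical cassette with the nesting reconstructed by hand (the diff view above has lost its indentation) and a plain string URI in place of the `!ruby/regexp` matcher, which a safe YAML load would reject.

```ruby
require "yaml"

# Hypothetical, trimmed cassette in the VCR 2 layout; indentation is assumed.
CASSETTE_YAML = <<~CASSETTE
  ---
  http_interactions:
  - request:
      method: get
      uri: https://elasticmapreduce.amazonaws.com/
    response:
      status:
        code: 400
        message: Bad Request
  recorded_with: VCR 2.0.0
CASSETTE

# Plain hashes and arrays come back; no VCR classes are needed to read them.
cassette = YAML.safe_load(CASSETTE_YAML)
interaction = cassette["http_interactions"].first
status = interaction["response"]["status"]

puts "#{interaction["request"]["method"]} -> #{status["code"]} #{status["message"]}"
```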
data/spec/fixtures/vcr_cassettes/add_instance_groups/one_group_unsuccessful.yml CHANGED
@@ -1,35 +1,41 @@
- ---
- - !ruby/struct:VCR::HTTPInteraction
- request: !ruby/struct:VCR::Request
- method: :get
- uri: !ruby/regexp /^https:\/\/elasticmapreduce.amazonaws.com\/\?AWSAccessKeyId=\S+&InstanceGroups.member.1.BidPrice=0&InstanceGroups.member.1.InstanceCount=1&InstanceGroups.member.1.InstanceRole=TASK&InstanceGroups.member.1.InstanceType=m1.small&InstanceGroups.member.1.Market=ON_DEMAND&InstanceGroups.member.1.Name=Go%20Canucks%20Go!&JobFlowId=j-19WDDS68ZUENP&Operation=AddInstanceGroups/
+ ---
+ http_interactions:
+ - request:
+ method: get
+ uri: !ruby/regexp /^https/
  body:
- headers:
- accept:
+ encoding: US-ASCII
+ string: ""
+ headers:
+ Accept:
  - "*/*; q=0.5, application/xml"
- accept-encoding:
+ Accept-Encoding:
  - gzip, deflate
- response: !ruby/struct:VCR::Response
- status: !ruby/struct:VCR::ResponseStatus
+ response:
+ status:
  code: 400
  message: Bad Request
- headers:
- x-amzn-requestid:
- - 0c8d744d-67ea-11e0-bf8a-ed57a5465c87
- content-type:
- - text/xml
- date:
- - Sat, 16 Apr 2011 05:26:15 GMT
- content-length:
+ headers:
+ Content-Length:
  - "337"
- body: |
- <ErrorResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
- <Error>
- <Type>Sender</Type>
- <Code>ValidationError</Code>
- <Message>Task instance group already exists in the job flow, cannot add more task groups</Message>
- </Error>
- <RequestId>0c8d744d-67ea-11e0-bf8a-ed57a5465c87</RequestId>
- </ErrorResponse>
+ Date:
+ - Sat, 16 Apr 2011 05:26:15 GMT
+ Content-Type:
+ - text/xml
+ X-Amzn-Requestid:
+ - 0c8d744d-67ea-11e0-bf8a-ed57a5465c87
+ body:
+ encoding: US-ASCII
+ string: |
+ <ErrorResponse xmlns="http://elasticmapreduce.amazonaws.com/doc/2009-03-31">
+ <Error>
+ <Type>Sender</Type>
+ <Code>ValidationError</Code>
+ <Message>Task instance group already exists in the job flow, cannot add more task groups</Message>
+ </Error>
+ <RequestId>0c8d744d-67ea-11e0-bf8a-ed57a5465c87</RequestId>
+ </ErrorResponse>

  http_version: "1.1"
+ recorded_at: Sat, 03 Mar 2012 22:59:41 GMT
+ recorded_with: VCR 2.0.0