elasticity 5.0.3 → 6.0
- checksums.yaml +4 -4
- data/HISTORY.md +26 -0
- data/README.md +35 -28
- data/elasticity.gemspec +2 -2
- data/lib/elasticity.rb +5 -3
- data/lib/elasticity/aws_request_v4.rb +15 -3
- data/lib/elasticity/aws_session.rb +4 -23
- data/lib/elasticity/aws_utils.rb +0 -29
- data/lib/elasticity/cluster_status.rb +38 -0
- data/lib/elasticity/cluster_step_status.rb +51 -0
- data/lib/elasticity/emr.rb +208 -78
- data/lib/elasticity/job_flow.rb +16 -17
- data/lib/elasticity/version.rb +1 -1
- data/spec/factories/cluster_status_factory.rb +12 -0
- data/spec/factories/cluster_step_status_factory.rb +17 -0
- data/spec/lib/elasticity/aws_request_v4_spec.rb +54 -4
- data/spec/lib/elasticity/aws_session_spec.rb +22 -88
- data/spec/lib/elasticity/aws_utils_spec.rb +0 -46
- data/spec/lib/elasticity/bootstrap_action_spec.rb +7 -3
- data/spec/lib/elasticity/cluster_status_spec.rb +98 -0
- data/spec/lib/elasticity/cluster_step_status_spec.rb +80 -0
- data/spec/lib/elasticity/custom_jar_step_spec.rb +10 -7
- data/spec/lib/elasticity/emr_spec.rb +422 -132
- data/spec/lib/elasticity/ganglia_bootstrap_action_spec.rb +8 -3
- data/spec/lib/elasticity/hadoop_bootstrap_action_spec.rb +8 -3
- data/spec/lib/elasticity/hadoop_file_bootstrap_action_spec.rb +7 -3
- data/spec/lib/elasticity/hive_step_spec.rb +21 -17
- data/spec/lib/elasticity/instance_group_spec.rb +9 -5
- data/spec/lib/elasticity/job_flow_integration_spec.rb +4 -4
- data/spec/lib/elasticity/job_flow_spec.rb +102 -76
- data/spec/lib/elasticity/job_flow_step_spec.rb +1 -1
- data/spec/lib/elasticity/looper_spec.rb +1 -1
- data/spec/lib/elasticity/pig_step_spec.rb +13 -9
- data/spec/lib/elasticity/s3distcp_step_spec.rb +7 -5
- data/spec/lib/elasticity/script_step_spec.rb +11 -6
- data/spec/lib/elasticity/setup_hadoop_debugging_step_spec.rb +9 -5
- data/spec/lib/elasticity/streaming_step_spec.rb +13 -9
- data/spec/spec_helper.rb +8 -0
- data/spec/support/factory_girl.rb +8 -0
- metadata +24 -21
- data/lib/elasticity/aws_request_v2.rb +0 -42
- data/lib/elasticity/job_flow_status.rb +0 -91
- data/lib/elasticity/job_flow_status_step.rb +0 -38
- data/spec/lib/elasticity/aws_request_v2_spec.rb +0 -38
- data/spec/lib/elasticity/job_flow_status_spec.rb +0 -265
- data/spec/lib/elasticity/job_flow_status_step_spec.rb +0 -80
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 9e3ac59080cc54551c89cc60e8aa0eab2a83bfd5
+data.tar.gz: 8533393df0bc0f7cfd5fb05e3e339dc98721b44e
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: b8db8baed719babf9489f48ca949283e8ca60c0262f864496a505f145b6490243ddfe3003563ce9720a72a6977066cf2ea3f60afcd6a654446591e984374a04b
+data.tar.gz: e2e705bdb17457727cf12df927033edca67795ffe480e85b4863ef050a26999bf861fed9cd23e5f2f089c9225e2abe383a7543ce1b85e91371377abd414a0100
data/HISTORY.md
CHANGED
@@ -1,3 +1,29 @@
+## 6.0 - July 17, 2015
+
+Amazon is in the process of transitioning from the notion of "Job Flows" to "Clusters" and is updating their APIs as such. You've already seen this in the EMR web UI as all mentions of "job flows" are gone and now you create "Clusters".
+
+On the API side, all of the newer commands take `cluster_id` rather than `job_flow_id`. On the API submission side, they are transitioning from a 'flat' structure to a nested JSON structure requiring no transformation. Finally, XML is all gone and commands return JSON (i.e. no more Nokogiri as a dependency).
+
+They've also begun deprecating APIs, starting with `DescribeJobFlows`. Given the sweeping set of changes, a major release was deemed appropriate.
+
+- [#88](https://github.com/rslifka/elasticity/issues/88) - Removed support for deprecated `DescribeJobFlows`.
+- [#89](https://github.com/rslifka/elasticity/issues/89) - Add support for `AddTags`.
+- [#90](https://github.com/rslifka/elasticity/issues/90) - Add support for `RemoveTags`.
+- [#91](https://github.com/rslifka/elasticity/issues/91) - Add support for `SetVisibleToAllUsers`.
+- [#92](https://github.com/rslifka/elasticity/issues/92) - Add support for `ListSteps`.
+- [#93](https://github.com/rslifka/elasticity/issues/93) - Add support for `ListInstances`.
+- [#94](https://github.com/rslifka/elasticity/issues/94) - Add support for `ListInstanceGroups`.
+- [#95](https://github.com/rslifka/elasticity/issues/95) - Add support for `ListClusters`.
+- [#96](https://github.com/rslifka/elasticity/issues/96) - Add support for `ListBootstrapActions`.
+- [#97](https://github.com/rslifka/elasticity/issues/97) - Add support for `DescribeCluster`.
+- [#98](https://github.com/rslifka/elasticity/issues/98) - Add support for `DescribeStep`.
+- [#101](https://github.com/rslifka/elasticity/issues/101) - Fix plurality of `TerminateJobFlows`; now requires an array of IDs to terminate.
+- [#102](https://github.com/rslifka/elasticity/issues/102) - Simplify interface to `AddJobFlowSteps`; no longer require extraneous `:steps => []`.
+- [#104](https://github.com/rslifka/elasticity/issues/104) - Expose return value from `AddJobFlowSteps`.
+- [#105](https://github.com/rslifka/elasticity/issues/105) - `JobFlow#status` has been removed in favour of `JobFlow#cluster_status` and `JobFlow#cluster_step_status`.
+- [#107](https://github.com/rslifka/elasticity/issues/107) - Add support for temporary credentials via `Elasticity.configure`.
+- [#109](https://github.com/rslifka/elasticity/issues/109) - Credential specification relocated to `Elasticity.configure`.
+
 ## 5.0.3 - July 8, 2015
 
 - Fix for issue [#86](https://github.com/rslifka/elasticity/issues/86).
|
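The 6.0 notes in `HISTORY.md` describe the move from a "flat" request structure to nested JSON. The sketch below contrasts the two shapes as a rough illustration only; it is not the gem's code. `flatten_params` and `camelize_keys` are hypothetical helper names, and the `Key.member.N` convention is taken from the `convert_ruby_to_aws` helper that this release removes from `aws_utils.rb`.

```ruby
require 'json'

# Hypothetical helpers for illustration; not part of the elasticity API.

# Convert snake_case to CamelCase, e.g. :job_flow_ids -> "JobFlowIds".
def camelize(word)
  word.to_s.gsub(/(^|_)(.)/) { $2.upcase }
end

# Old style: flatten nested params into "Key.member.N.SubKey" pairs.
def flatten_params(params)
  result = {}
  params.each do |key, value|
    name = camelize(key)
    case value
    when Array
      value.each_with_index do |item, index|
        prefix = "#{name}.member.#{index + 1}"
        if item.is_a?(Hash)
          flatten_params(item).each { |k, v| result["#{prefix}.#{k}"] = v }
        else
          result[prefix] = item
        end
      end
    when Hash
      flatten_params(value).each { |k, v| result["#{name}.#{k}"] = v }
    else
      result[name] = value
    end
  end
  result
end

# New style: keep the nesting, rename the keys, then serialize to JSON.
def camelize_keys(value)
  case value
  when Hash  then value.each_with_object({}) { |(k, v), h| h[camelize(k)] = camelize_keys(v) }
  when Array then value.map { |v| camelize_keys(v) }
  else value
  end
end

params = { job_flow_ids: ['j-1', 'j-2'], steps: [{ name: 'First' }] }
flat   = flatten_params(params)                  # one key per leaf value
nested = JSON.generate(camelize_keys(params))    # structure preserved
```

With the nested form the request body mirrors the Ruby hash directly, which is presumably why the flattening helper could be dropped.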
data/README.md
CHANGED
@@ -18,7 +18,7 @@ gem install elasticity
 or in your Gemfile
 
 ```
-gem 'elasticity', '~>
+gem 'elasticity', '~> 6.0'
 ```
 
 This will ensure that you protect yourself from API changes, which will only be made in major revisions.
@@ -30,11 +30,14 @@ If you're familiar with the AWS EMR UI, you'll recall there are sample jobs Amaz
 ```ruby
 require 'elasticity'
 
-#
-
+# Specify your AWS credentials
+Elasticity.configure do |c|
+  c.access_key = ENV['AWS_ACCESS_KEY_ID']
+  c.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+end
 
-#
-
+# Create a job flow
+jobflow = Elasticity::JobFlow.new
 
 # NOTE: Amazon requires that all new accounts specify a VPC subnet when launching jobs.
 # If you're on an existing account, this is unnecessary however new AWS accounts require
@@ -44,7 +47,7 @@ jobflow = Elasticity::JobFlow.new('AWS access key', 'AWS secret key')
 # This is the first step in the jobflow - running a custom jar
 step = Elasticity::CustomJarStep.new('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar')
 
-# Here are the arguments to pass to the jar
+# Here are the arguments to pass to the jar (replace OUTPUT_BUCKET)
 step.arguments = %w(s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br s3n://elasticmapreduce/samples/cloudburst/input/100k.br s3n://OUTPUT_BUCKET/cloudburst/output/2012-06-22 36 3 0 1 240 48 24 24 128 16)
 
 # Add the step to the jobflow
@@ -60,6 +63,7 @@ Note that this example is only for ```CustomJarStep```. Other steps will have d
 
 Job flows are the center of the EMR universe. The general order of operations is:
 
+1. Specify AWS credentials
 1. Create a job flow.
 1. Specify options.
 1. (optional) Configure instance groups.
@@ -71,31 +75,30 @@ Job flows are the center of the EMR universe. The general order of operations i
 1. (optional) Wait for the job flow to complete.
 1. (optional) Shutdown the job flow.
 
-## 1 -
-
-Only your AWS credentials are needed.
+## 1 - Specify AWS Credentials
 
 ```ruby
-
-
+Elasticity.configure do |c|
+  c.access_key = ENV['AWS_ACCESS_KEY_ID']
+  c.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+end
+```
 
-
+## 2 - Create a Job Flow
+
+```ruby
 jobflow = Elasticity::JobFlow.new
 ```
 
 If you want to access a job flow that's already running:
 
 ```ruby
-
-jobflow = Elasticity::JobFlow.from_jobflow_id('AWS access key', 'AWS secret key', 'jobflow ID', 'region')
-
-# Use the standard environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
-jobflow = Elasticity::JobFlow.from_jobflow_id(nil, nil, 'jobflow ID', 'region')
+jobflow = Elasticity::JobFlow.from_jobflow_id('jobflow ID', 'region')
 ```
 
 This is useful if you'd like to attach to a running job flow and add more steps, etc. The ```region``` parameter is necessary because job flows are only accessible from the API when you connect to the same endpoint that created them (e.g. us-west-1). If you don't specify the ```region``` parameter, us-east-1 is assumed.
 
-##
+## 3 - Specifying Options
 
 Configuration job flow options, shown below with default values. Note that these defaults are subject to change - they are reasonable defaults at the time(s) I work on them (e.g. the latest version of Hadoop).
 
@@ -106,7 +109,7 @@ jobflow.name = 'Elasticity Job Flow'
 
 # For new AWS accounts, this is required to be set
 jobflow.ec2_subnet_id = nil
-jobflow.
+jobflow.job_flow_role = nil
 jobflow.service_role = nil
 
 jobflow.action_on_failure = 'TERMINATE_JOB_FLOW'
@@ -124,7 +127,7 @@ jobflow.master_instance_type = 'm1.small'
 jobflow.slave_instance_type = 'm1.small'
 ```
 
-##
+## 4 - Configure Instance Groups (optional)
 
 Technically this is optional since Elasticity creates MASTER and CORE instance groups for you (one m1.small instance in each). If you'd like your jobs to finish in an appreciable amount of time, you'll want to at least add a few instances to the CORE group :)
 
@@ -170,7 +173,7 @@ ig.set_spot_instances(0.25) # Makes this a SPOT group with a $0.25 bid p
 jobflow.set_core_instance_group(ig)
 ```
 
-##
+## 5 - Add Bootstrap Actions (optional)
 
 Bootstrap actions are run as part of setting up the job flow, so be sure to configure these before running the job.
 
@@ -206,7 +209,7 @@ action = Elasticity::HadoopFileBootstrapAction.new('s3n://my-bucket/job-config.x
 jobflow.add_bootstrap_action(action)
 ```
 
-##
+## 6 - Add Steps (optional)
 
 Each type of step has ```#name``` and ```#action_on_failure``` fields that can be specified. Apart from that, steps are configured differently - exhaustively described below.
 
@@ -308,7 +311,7 @@ copy_step.arguments = [...]
 jobflow.add_step(copy_step)
 ```
 
-##
+## 7 - Upload Assets (optional)
 
 This isn't part of ```JobFlow```; more of an aside. Elasticity provides a very basic means of uploading assets to S3 so that your EMR job has access to them. Most commonly this will be a set of resources to run the job (e.g. JAR files, streaming scripts, etc.) and a set of resources used by the job itself (e.g. a TSV file with a range of valid values, join tables, etc.).
 
@@ -332,7 +335,7 @@ If the bucket doesn't exist, it will be created.
 
 If a file already exists, there is an MD5 checksum evaluation. If the checksums are the same, the file will be skipped. Now you can use something like ```s3n://my-bucket/remote-dir/this-job/tables/join.tsv``` in your EMR jobs.
 
-##
+## 8 - Run the Job Flow
 
 Submit the job flow to Amazon, storing the ID of the running job flow.
 
@@ -340,11 +343,11 @@ Submit the job flow to Amazon, storing the ID of the running job flow.
 jobflow_id = jobflow.run
 ```
 
-##
+## 9 - Add Additional Steps (optional)
 
 Steps can be added to a running jobflow just by calling ```#add_step``` on the job flow exactly how you add them prior to submitting the job.
 
-##
+## 10 - Wait For the Job Flow to Complete (optional)
 
 Elasticity has the ability to block until the status of a job flow is not STARTING or RUNNING. There are two flavours. Without a status callback:
 
@@ -362,7 +365,7 @@ jobflow.wait_for_completion do |elapsed_time, job_flow_status|
 end
 ```
 
-##
+## 11 - Shut Down the Job Flow (optional)
 
 By default, job flows are set to terminate when there are no more running steps. You can tell the job flow to stay alive when it has nothing left to do:
 
@@ -378,11 +381,15 @@ jobflow.shutdown
 
 # Elasticity Configuration
 
-Elasticity supports a
+Elasticity supports a handful of configuration options, all of which are shown below.
 
 ```ruby
 Elasticity.configure do |config|
 
+  # AWS credentials
+  config.access_key = ENV['AWS_ACCESS_KEY_ID']
+  config.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+
   # If using Hive, it will be configured via the directives here
   config.hive_site = 's3://bucket/hive-site.xml'
 
data/elasticity.gemspec
CHANGED
@@ -12,13 +12,13 @@ Gem::Specification.new do |s|
   s.description = %q{Streamlined, programmatic access to Amazon's Elastic Map Reduce service, driven by the Sharethrough team's requirements for belting out EMR jobs.}
 
   s.add_dependency('rest-client', '~> 1.0')
-  s.add_dependency('nokogiri', '~> 1.0')
   s.add_dependency('fog', '~> 1.0')
   s.add_dependency('unf', '~> 0.1')
 
+  s.add_development_dependency('factory_girl', '~> 4.0')
   s.add_development_dependency('fakefs', '~> 0.4.0')
   s.add_development_dependency('rake', '~> 0.9')
-  s.add_development_dependency('rspec', '~>
+  s.add_development_dependency('rspec', '~> 3.0')
   s.add_development_dependency('timecop', '~> 0.6')
 
   s.files = `git ls-files`.split("\n")
data/lib/elasticity.rb
CHANGED
@@ -9,7 +9,6 @@ require 'elasticity/version'
 
 require 'elasticity/aws_utils'
 require 'elasticity/aws_session'
-require 'elasticity/aws_request_v2'
 require 'elasticity/aws_request_v4'
 require 'elasticity/emr'
 
@@ -25,8 +24,8 @@ require 'elasticity/looper'
 require 'elasticity/job_flow'
 require 'elasticity/instance_group'
 
-require 'elasticity/
-require 'elasticity/
+require 'elasticity/cluster_status'
+require 'elasticity/cluster_step_status'
 
 require 'elasticity/custom_jar_step'
 require 'elasticity/setup_hadoop_debugging_step'
@@ -56,6 +55,9 @@ module Elasticity
 
   class Configuration
     attr_accessor :hive_site
+    attr_accessor :access_key
+    attr_accessor :secret_key
+    attr_accessor :security_token
   end
 
 end
|
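The `Configuration` class gains `access_key`, `secret_key` and `security_token`, which is what lets credentials move into `Elasticity.configure`. The sketch below shows how a block-style configuration like this typically fits together. It assumes a stand-in module named `ConfigSketch` and a hypothetical `credentials!` helper, neither of which is in the gem; the error messages are copied from the 6.0 request class, and `AWS_SESSION_TOKEN` is simply the conventional environment variable for temporary credentials.

```ruby
# Self-contained sketch of the block-style configuration pattern.
# ConfigSketch is a hypothetical stand-in for the Elasticity module.
module ConfigSketch
  class Configuration
    attr_accessor :hive_site, :access_key, :secret_key, :security_token
  end

  # Lazily create a single shared configuration object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yield the shared configuration so callers can set fields in a block.
  def self.configure
    yield configuration
  end

  # Hypothetical helper: fail fast when a required credential is missing.
  def self.credentials!
    raise ArgumentError, '.access_key must be set in the configuration block' if configuration.access_key.nil?
    raise ArgumentError, '.secret_key must be set in the configuration block' if configuration.secret_key.nil?
    [configuration.access_key, configuration.secret_key]
  end
end

ConfigSketch.configure do |c|
  # Placeholder fallbacks keep the sketch runnable without real credentials.
  c.access_key = ENV.fetch('AWS_ACCESS_KEY_ID', 'example-access-key')
  c.secret_key = ENV.fetch('AWS_SECRET_ACCESS_KEY', 'example-secret-key')
  # Only needed when using temporary credentials.
  c.security_token = ENV['AWS_SESSION_TOKEN']
end
```

Keeping credentials in one shared object means constructors such as `JobFlow.new` no longer need key arguments, which matches the signature changes shown in the README diff.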
data/lib/elasticity/aws_request_v4.rb
CHANGED
@@ -20,11 +20,21 @@ module Elasticity
       @ruby_service_hash.delete(:operation)
 
       @timestamp = Time.now.utc
+
+      @access_key = Elasticity.configuration.access_key
+      if @access_key == nil
+        raise ArgumentError, '.access_key must be set in the configuration block'
+      end
+
+      @secret_key = Elasticity.configuration.secret_key
+      if @secret_key == nil
+        raise ArgumentError, '.secret_key must be set in the configuration block'
+      end
     end
 
     def headers
-      {
-        'Authorization' => "AWS4-HMAC-SHA256 Credential=#{@
+      headers = {
+        'Authorization' => "AWS4-HMAC-SHA256 Credential=#{@access_key}/#{credential_scope}, SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-target, Signature=#{aws_v4_signature}",
         'Content-Type' => 'application/x-amz-json-1.1',
         'Host' => host,
         'User-Agent' => "elasticity/#{Elasticity::VERSION}",
@@ -32,6 +42,8 @@ module Elasticity
         'X-Amz-Date' => @timestamp.strftime('%Y%m%dT%H%M%SZ'),
         'X-Amz-Target' => "ElasticMapReduce.#{@operation}",
       }
+      headers.merge!('X-Amz-Security-Token' => Elasticity.configuration.security_token) if Elasticity.configuration.security_token
+      headers
     end
 
     def url
@@ -85,7 +97,7 @@ module Elasticity
     # Task 3: Calculate the AWS Signature Version 4
     # http://docs.aws.amazon.com/general/latest/gr/sigv4-calculate-signature.html
     def aws_v4_signature
-      date = OpenSSL::HMAC.digest('sha256', 'AWS4' + @
+      date = OpenSSL::HMAC.digest('sha256', 'AWS4' + @secret_key, @timestamp.strftime('%Y%m%d'))
       region = OpenSSL::HMAC.digest('sha256', date, @aws_session.region)
       service = OpenSSL::HMAC.digest('sha256', region, SERVICE_NAME)
       signing_key = OpenSSL::HMAC.digest('sha256', service, 'aws4_request')
|
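The last hunk stops partway through `aws_v4_signature`. As a standalone sketch of the same idea, the code below follows AWS's documented Signature Version 4 key derivation and adds the session token header only when one is present. The method names are hypothetical, the service name `'elasticmapreduce'` is an assumption about the value of `SERVICE_NAME` (not shown in this diff), and the final `hexdigest` step comes from the AWS procedure rather than from the truncated source.

```ruby
require 'openssl'

# Derive a SigV4 signing key: an HMAC-SHA256 chain over the date, region,
# service name, and the literal string "aws4_request".
def sigv4_signing_key(secret_key, time, region, service)
  date_key    = OpenSSL::HMAC.digest('sha256', 'AWS4' + secret_key, time.strftime('%Y%m%d'))
  region_key  = OpenSSL::HMAC.digest('sha256', date_key, region)
  service_key = OpenSSL::HMAC.digest('sha256', region_key, service)
  OpenSSL::HMAC.digest('sha256', service_key, 'aws4_request')
end

# Sign an already-built "string to sign" and return the hex signature.
def sigv4_signature(signing_key, string_to_sign)
  OpenSSL::HMAC.hexdigest('sha256', signing_key, string_to_sign)
end

# Return headers with the session token added only when temporary
# credentials are in use; the input hash is left untouched.
def with_security_token(headers, security_token)
  security_token ? headers.merge('X-Amz-Security-Token' => security_token) : headers
end

# Example values are placeholders, not real credentials.
key = sigv4_signing_key('example-secret', Time.utc(2015, 7, 17), 'us-east-1', 'elasticmapreduce')
signature = sigv4_signature(key, 'example string to sign')
```

Because the date, region and service are folded into the key, a signing key is only valid for one service in one region on one UTC day.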
data/lib/elasticity/aws_session.rb
CHANGED
@@ -7,15 +7,13 @@ module Elasticity
 
   class AwsSession
 
-    attr_reader :access_key
-    attr_reader :secret_key
     attr_reader :host
     attr_reader :region
 
     # Supported values for options:
     # :region - AWS region (e.g. us-west-1)
     # :secure - true or false, default true.
-    def initialize(
+    def initialize(options={})
       # There is a cryptic error if this isn't set
       if options.has_key?(:region) && options[:region] == nil
         raise MissingRegionError, 'A valid :region is required to connect to EMR'
@@ -23,8 +21,6 @@ module Elasticity
       options[:region] = 'us-east-1' unless options[:region]
       @region = options[:region]
 
-      @access_key = get_access_key(access)
-      @secret_key = get_secret_key(secret)
       @host = "elasticmapreduce.#@region.amazonaws.com"
     end
 
@@ -39,32 +35,17 @@ module Elasticity
 
     def ==(other)
       return false unless other.is_a? AwsSession
-      return false unless @access_key == other.access_key
-      return false unless @secret_key == other.secret_key
       return false unless @host == other.host
       true
     end
 
     private
 
-    def get_access_key(access)
-      return access if access
-      return ENV['AWS_ACCESS_KEY_ID'] if ENV['AWS_ACCESS_KEY_ID']
-      raise MissingKeyError, 'Please provide an access key or set AWS_ACCESS_KEY_ID.'
-    end
-
-    def get_secret_key(secret)
-      return secret if secret
-      return ENV['AWS_SECRET_ACCESS_KEY'] if ENV['AWS_SECRET_ACCESS_KEY']
-      raise MissingKeyError, 'Please provide a secret key or set AWS_SECRET_ACCESS_KEY.'
-    end
-
     # AWS error responses all follow the same form. Extract the message from
     # the error document.
-    def self.parse_error_response(
-
-
-      xml_doc.xpath('/ErrorResponse/Error/Message').text
+    def self.parse_error_response(error_json)
+      error = JSON.parse(error_json)
+      "AWS EMR API Error (#{error['__type']}): #{error['message']}"
     end
 
   end
|
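`parse_error_response` now reads a JSON body instead of walking an XML document with XPath. The sketch below exercises the same formatting on a made-up payload. `format_emr_error` is a hypothetical name, the sample body is illustrative rather than a captured AWS response, and the `rescue` fallback is an addition that the source does not have.

```ruby
require 'json'

# Build a readable message from a JSON error body, using the same two
# fields ('__type' and 'message') as the 6.0 implementation.
def format_emr_error(error_json)
  error = JSON.parse(error_json)
  "AWS EMR API Error (#{error['__type']}): #{error['message']}"
rescue JSON::ParserError
  # Assumption: fall back to the raw body if it is not valid JSON.
  "AWS EMR API Error: #{error_json}"
end

# Illustrative payload only.
sample = '{"__type":"ValidationException","message":"Cluster id is not valid."}'
message = format_emr_error(sample)
```

Dropping XPath here is one of the changes that allows the `nokogiri` dependency to be removed from the gemspec.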
data/lib/elasticity/aws_utils.rb
CHANGED
@@ -27,35 +27,6 @@ module Elasticity
       end
     end
 
-    # Since we use the same structure as AWS, we can generate AWS param names
-    # from the Ruby versions of those names (and the param nesting).
-    def self.convert_ruby_to_aws(params)
-      result = {}
-      params.each do |key, value|
-        case value
-        when Array
-          prefix = "#{camelize(key.to_s)}.member"
-          value.each_with_index do |item, index|
-            if item.is_a?(String)
-              result["#{prefix}.#{index+1}"] = item
-            else
-              convert_ruby_to_aws(item).each do |nested_key, nested_value|
-                result["#{prefix}.#{index+1}.#{nested_key}"] = nested_value
-              end
-            end
-          end
-        when Hash
-          prefix = "#{camelize(key.to_s)}"
-          convert_ruby_to_aws(value).each do |nested_key, nested_value|
-            result["#{prefix}.#{nested_key}"] = nested_value
-          end
-        else
-          result[camelize(key.to_s)] = value
-        end
-      end
-      result
-    end
-
     def self.camelize(word)
       word.to_s.gsub(/\/(.?)/) { '::' + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
     end
data/lib/elasticity/cluster_status.rb
ADDED
@@ -0,0 +1,38 @@
+module Elasticity
+
+  class ClusterStatus
+
+    attr_accessor :name
+    attr_accessor :cluster_id
+    attr_accessor :state
+    attr_accessor :created_at
+    attr_accessor :ready_at
+    attr_accessor :ended_at
+    attr_accessor :last_state_change_reason
+    attr_accessor :master_public_dns_name
+    attr_accessor :normalized_instance_hours
+
+    # ClusterStatus is created via the results of the DescribeCluster API call
+    def self.from_aws_data(cluster_data)
+      cluster_data = cluster_data['Cluster']
+      ClusterStatus.new.tap do |c|
+        c.name = cluster_data['Name']
+        c.cluster_id = cluster_data['Id']
+        c.state = cluster_data['Status']['State']
+        c.created_at = Time.at(cluster_data['Status']['Timeline']['CreationDateTime'])
+        c.ready_at = Time.at(cluster_data['Status']['Timeline']['ReadyDateTime'])
+        c.ended_at = Time.at(cluster_data['Status']['Timeline']['EndDateTime'])
+        c.last_state_change_reason = cluster_data['Status']['StateChangeReason']['Code']
+        c.master_public_dns_name = cluster_data['MasterPublicDnsName']
+        c.normalized_instance_hours = cluster_data['NormalizedInstanceHours']
+      end
+    end
+
+    # http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/ProcessingCycle.html
+    def active?
+      %w{RUNNING STARTING BOOTSTRAPPING WAITING SHUTTING_DOWN}.include?(@state)
+    end
+
+  end
+
+end
|
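`ClusterStatus.from_aws_data` passes each timeline value straight to `Time.at`, which raises `TypeError` when given `nil`. If AWS omits `ReadyDateTime` or `EndDateTime` for a cluster that has not yet reached those states (an assumption, not something this diff confirms), a nil-safe conversion may be worth considering. The sketch below uses a hypothetical `ClusterSummary` struct and made-up sample data; it is not the gem's implementation.

```ruby
# States treated as active, copied from ClusterStatus#active?.
ACTIVE_STATES = %w[RUNNING STARTING BOOTSTRAPPING WAITING SHUTTING_DOWN].freeze

# Hypothetical, trimmed-down status object for illustration.
ClusterSummary = Struct.new(:cluster_id, :state, :created_at, :ready_at, :ended_at) do
  def active?
    ACTIVE_STATES.include?(state)
  end
end

# Time.at(nil) raises TypeError, so convert only when a value is present.
def epoch_to_time(epoch)
  epoch.nil? ? nil : Time.at(epoch)
end

# Map a DescribeCluster-shaped hash onto the summary struct.
def summarize_cluster(response)
  cluster  = response.fetch('Cluster')
  status   = cluster.fetch('Status')
  timeline = status.fetch('Timeline', {})
  ClusterSummary.new(
    cluster['Id'],
    status['State'],
    epoch_to_time(timeline['CreationDateTime']),
    epoch_to_time(timeline['ReadyDateTime']),
    epoch_to_time(timeline['EndDateTime'])
  )
end

# Made-up response for a cluster that is still starting.
response = {
  'Cluster' => {
    'Id' => 'j-EXAMPLE',
    'Status' => {
      'State' => 'STARTING',
      'Timeline' => { 'CreationDateTime' => 1_436_475_491.0 }
    }
  }
}
summary = summarize_cluster(response)
```

Leaving absent timestamps as `nil` lets callers distinguish "not finished yet" from a real end time without rescuing exceptions.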
data/lib/elasticity/cluster_step_status.rb
ADDED
@@ -0,0 +1,51 @@
+module Elasticity
+
+  class ClusterStepStatus
+
+    attr_accessor :action_on_failure
+    attr_accessor :args
+    attr_accessor :jar
+    attr_accessor :main_class
+    attr_accessor :properties
+    attr_accessor :step_id
+    attr_accessor :name
+    attr_accessor :state
+    attr_accessor :state_change_reason
+    attr_accessor :state_change_reason_message
+    attr_accessor :created_at
+    attr_accessor :started_at
+    attr_accessor :ended_at
+
+    # Constructed from http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_ListSteps.html
+    def self.from_aws_list_data(cluster_step_data)
+      cluster_step_data['Steps'].map do |s|
+        ClusterStepStatus.new.tap do |c|
+          c.action_on_failure = s['ActionOnFailure']
+          c.args = s['Config']['Args']
+          c.jar = s['Config']['Jar']
+          c.main_class = s['Config']['MainClass']
+          c.properties = s['Config']['Properties']
+          c.step_id = s['Id']
+          c.name = s['Name']
+          c.state = s['Status']['State']
+          c.state_change_reason = s['Status']['StateChangeReason']['Code']
+          c.state_change_reason_message = s['Status']['StateChangeReason']['Message']
+          c.created_at = Time.at(s['Status']['Timeline']['CreationDateTime'])
+          c.started_at = Time.at(s['Status']['Timeline']['StartDateTime'])
+          c.ended_at = Time.at(s['Status']['Timeline']['EndDateTime'])
+        end
+      end
+    end
+
+    def self.installed_steps(cluster_step_statuses)
+      step_names = cluster_step_statuses.map(&:name)
+      installed_steps = []
+      Elasticity::JobFlowStep.steps_requiring_installation.each do |step|
+        installed_steps << step if step_names.include?(step.aws_installation_step_name)
+      end
+      installed_steps
+    end
+
+  end
+
+end