elasticity 5.0.3 → 6.0

Files changed (46)
  1. checksums.yaml +4 -4
  2. data/HISTORY.md +26 -0
  3. data/README.md +35 -28
  4. data/elasticity.gemspec +2 -2
  5. data/lib/elasticity.rb +5 -3
  6. data/lib/elasticity/aws_request_v4.rb +15 -3
  7. data/lib/elasticity/aws_session.rb +4 -23
  8. data/lib/elasticity/aws_utils.rb +0 -29
  9. data/lib/elasticity/cluster_status.rb +38 -0
  10. data/lib/elasticity/cluster_step_status.rb +51 -0
  11. data/lib/elasticity/emr.rb +208 -78
  12. data/lib/elasticity/job_flow.rb +16 -17
  13. data/lib/elasticity/version.rb +1 -1
  14. data/spec/factories/cluster_status_factory.rb +12 -0
  15. data/spec/factories/cluster_step_status_factory.rb +17 -0
  16. data/spec/lib/elasticity/aws_request_v4_spec.rb +54 -4
  17. data/spec/lib/elasticity/aws_session_spec.rb +22 -88
  18. data/spec/lib/elasticity/aws_utils_spec.rb +0 -46
  19. data/spec/lib/elasticity/bootstrap_action_spec.rb +7 -3
  20. data/spec/lib/elasticity/cluster_status_spec.rb +98 -0
  21. data/spec/lib/elasticity/cluster_step_status_spec.rb +80 -0
  22. data/spec/lib/elasticity/custom_jar_step_spec.rb +10 -7
  23. data/spec/lib/elasticity/emr_spec.rb +422 -132
  24. data/spec/lib/elasticity/ganglia_bootstrap_action_spec.rb +8 -3
  25. data/spec/lib/elasticity/hadoop_bootstrap_action_spec.rb +8 -3
  26. data/spec/lib/elasticity/hadoop_file_bootstrap_action_spec.rb +7 -3
  27. data/spec/lib/elasticity/hive_step_spec.rb +21 -17
  28. data/spec/lib/elasticity/instance_group_spec.rb +9 -5
  29. data/spec/lib/elasticity/job_flow_integration_spec.rb +4 -4
  30. data/spec/lib/elasticity/job_flow_spec.rb +102 -76
  31. data/spec/lib/elasticity/job_flow_step_spec.rb +1 -1
  32. data/spec/lib/elasticity/looper_spec.rb +1 -1
  33. data/spec/lib/elasticity/pig_step_spec.rb +13 -9
  34. data/spec/lib/elasticity/s3distcp_step_spec.rb +7 -5
  35. data/spec/lib/elasticity/script_step_spec.rb +11 -6
  36. data/spec/lib/elasticity/setup_hadoop_debugging_step_spec.rb +9 -5
  37. data/spec/lib/elasticity/streaming_step_spec.rb +13 -9
  38. data/spec/spec_helper.rb +8 -0
  39. data/spec/support/factory_girl.rb +8 -0
  40. metadata +24 -21
  41. data/lib/elasticity/aws_request_v2.rb +0 -42
  42. data/lib/elasticity/job_flow_status.rb +0 -91
  43. data/lib/elasticity/job_flow_status_step.rb +0 -38
  44. data/spec/lib/elasticity/aws_request_v2_spec.rb +0 -38
  45. data/spec/lib/elasticity/job_flow_status_spec.rb +0 -265
  46. data/spec/lib/elasticity/job_flow_status_step_spec.rb +0 -80
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: f55ef4c0da52c5540b84f2f5c443a481dc3b0f63
-   data.tar.gz: 24d3f7bf29a3b61e0775d343b19c27fce4da6150
+   metadata.gz: 9e3ac59080cc54551c89cc60e8aa0eab2a83bfd5
+   data.tar.gz: 8533393df0bc0f7cfd5fb05e3e339dc98721b44e
  SHA512:
-   metadata.gz: e790d9f3b835d039b2c1a8522981bff5dfa7eb167e2a04e944c5a11c96bf3b44e881e61bc43bedaa80e0810da027f2eadf3fbb39730ccf98cc8c82a62995baa6
-   data.tar.gz: 8397623a0f99f13f6407e6619a22b33185a7ad2f480177d7bcef075d6e83e6dc1e3b32d928881f0546510d06232c73d62aa7c94c4184979f3037b9bb3f4d96f2
+   metadata.gz: b8db8baed719babf9489f48ca949283e8ca60c0262f864496a505f145b6490243ddfe3003563ce9720a72a6977066cf2ea3f60afcd6a654446591e984374a04b
+   data.tar.gz: e2e705bdb17457727cf12df927033edca67795ffe480e85b4863ef050a26999bf861fed9cd23e5f2f089c9225e2abe383a7543ce1b85e91371377abd414a0100
data/HISTORY.md CHANGED
@@ -1,3 +1,29 @@
+ ## 6.0 - July 17, 2015
+
+ Amazon is in the process of transitioning from the notion of "Job Flows" to "Clusters" and is updating their APIs as such. You've already seen this in the EMR web UI as all mentions of "job flows" are gone and now you create "Clusters".
+
+ On the API side, all of the newer commands take `cluster_id` rather than `job_flow_id`. On the API submission side, they are transitioning from a 'flat' structure to a nested JSON structure requiring no transformation. Finally, XML is all gone and commands return JSON (i.e. no more Nokogiri as a dependency).
+
+ They've also begun deprecating APIs, starting with `DescribeJobFlows`. Given the sweeping set of changes, a major release was deemed appropriate.
+
+ - [#88](https://github.com/rslifka/elasticity/issues/88) - Removed support for deprecated `DescribeJobFlows`.
+ - [#89](https://github.com/rslifka/elasticity/issues/89) - Add support for `AddTags`.
+ - [#90](https://github.com/rslifka/elasticity/issues/90) - Add support for `RemoveTags`.
+ - [#91](https://github.com/rslifka/elasticity/issues/91) - Add support for `SetVisibleToAllUsers`.
+ - [#92](https://github.com/rslifka/elasticity/issues/92) - Add support for `ListSteps`.
+ - [#93](https://github.com/rslifka/elasticity/issues/93) - Add support for `ListInstances`.
+ - [#94](https://github.com/rslifka/elasticity/issues/94) - Add support for `ListInstanceGroups`.
+ - [#95](https://github.com/rslifka/elasticity/issues/95) - Add support for `ListClusters`.
+ - [#96](https://github.com/rslifka/elasticity/issues/96) - Add support for `ListBootstrapActions`.
+ - [#97](https://github.com/rslifka/elasticity/issues/97) - Add support for `DescribeCluster`.
+ - [#98](https://github.com/rslifka/elasticity/issues/98) - Add support for `DescribeStep`.
+ - [#101](https://github.com/rslifka/elasticity/issues/101) - Fix plurality of `TerminateJobFlows`; now requires an array of IDs to terminate.
+ - [#102](https://github.com/rslifka/elasticity/issues/102) - Simplify interface to `AddJobFlowSteps`; no longer require extraneous `:steps => []`.
+ - [#104](https://github.com/rslifka/elasticity/issues/104) - Expose return value from `AddJobFlowSteps`.
+ - [#105](https://github.com/rslifka/elasticity/issues/105) - `JobFlow#status` has been removed in favour of `JobFlow#cluster_status` and `JobFlow#cluster_step_status`.
+ - [#107](https://github.com/rslifka/elasticity/issues/107) - Add support for temporary credentials via `Elasticity.configure`.
+ - [#109](https://github.com/rslifka/elasticity/issues/109) - Credential specification relocated to `Elasticity.configure`.
+
  ## 5.0.3 - July 8, 2015

  - Fix for issue [#86](https://github.com/rslifka/elasticity/issues/86).
data/README.md CHANGED
@@ -18,7 +18,7 @@ gem install elasticity
  or in your Gemfile

  ```
- gem 'elasticity', '~> 5.0'
+ gem 'elasticity', '~> 6.0'
  ```

  This will ensure that you protect yourself from API changes, which will only be made in major revisions.
@@ -30,11 +30,14 @@ If you're familiar with the AWS EMR UI, you'll recall there are sample jobs Amaz
  ```ruby
  require 'elasticity'

- # Create a job flow with your AWS credentials
- jobflow = Elasticity::JobFlow.new('AWS access key', 'AWS secret key')
+ # Specify your AWS credentials
+ Elasticity.configure do |c|
+   c.access_key = ENV['AWS_ACCESS_KEY_ID']
+   c.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+ end

- # Omit credentials to use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
- # jobflow = Elasticity::JobFlow.new
+ # Create a job flow
+ jobflow = Elasticity::JobFlow.new

  # NOTE: Amazon requires that all new accounts specify a VPC subnet when launching jobs.
  # If you're on an existing account, this is unnecessary however new AWS accounts require
@@ -44,7 +47,7 @@ jobflow = Elasticity::JobFlow.new('AWS access key', 'AWS secret key')
  # This is the first step in the jobflow - running a custom jar
  step = Elasticity::CustomJarStep.new('s3n://elasticmapreduce/samples/cloudburst/cloudburst.jar')

- # Here are the arguments to pass to the jar
+ # Here are the arguments to pass to the jar (replace OUTPUT_BUCKET)
  step.arguments = %w(s3n://elasticmapreduce/samples/cloudburst/input/s_suis.br s3n://elasticmapreduce/samples/cloudburst/input/100k.br s3n://OUTPUT_BUCKET/cloudburst/output/2012-06-22 36 3 0 1 240 48 24 24 128 16)

  # Add the step to the jobflow
@@ -60,6 +63,7 @@ Note that this example is only for ```CustomJarStep```. Other steps will have d

  Job flows are the center of the EMR universe. The general order of operations is:

+ 1. Specify AWS credentials
  1. Create a job flow.
  1. Specify options.
  1. (optional) Configure instance groups.
@@ -71,31 +75,30 @@ Job flows are the center of the EMR universe. The general order of operations i
  1. (optional) Wait for the job flow to complete.
  1. (optional) Shutdown the job flow.

- ## 1 - Create a Job Flow
-
- Only your AWS credentials are needed.
+ ## 1 - Specify AWS Credentials

  ```ruby
- # Manually specify AWS credentials
- jobflow = Elasticity::JobFlow.new('AWS access key', 'AWS secret key')
+ Elasticity.configure do |c|
+   c.access_key = ENV['AWS_ACCESS_KEY_ID']
+   c.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+ end
+ ```

- # Use the standard environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
+ ## 2 - Create a Job Flow
+
+ ```ruby
  jobflow = Elasticity::JobFlow.new
  ```

  If you want to access a job flow that's already running:

  ```ruby
- # Manually specify AWS credentials
- jobflow = Elasticity::JobFlow.from_jobflow_id('AWS access key', 'AWS secret key', 'jobflow ID', 'region')
-
- # Use the standard environment variables (AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY)
- jobflow = Elasticity::JobFlow.from_jobflow_id(nil, nil, 'jobflow ID', 'region')
+ jobflow = Elasticity::JobFlow.from_jobflow_id('jobflow ID', 'region')
  ```

  This is useful if you'd like to attach to a running job flow and add more steps, etc. The ```region``` parameter is necessary because job flows are only accessible from the API when you connect to the same endpoint that created them (e.g. us-west-1). If you don't specify the ```region``` parameter, us-east-1 is assumed.

- ## 2 - Specifying Options
+ ## 3 - Specifying Options

  Configuration job flow options, shown below with default values. Note that these defaults are subject to change - they are reasonable defaults at the time(s) I work on them (e.g. the latest version of Hadoop).

@@ -106,7 +109,7 @@ jobflow.name = 'Elasticity Job Flow'

  # For new AWS accounts, this is required to be set
  jobflow.ec2_subnet_id = nil
- jobflow.jobflow_role = nil
+ jobflow.job_flow_role = nil
  jobflow.service_role = nil

  jobflow.action_on_failure = 'TERMINATE_JOB_FLOW'
@@ -124,7 +127,7 @@ jobflow.master_instance_type = 'm1.small'
  jobflow.slave_instance_type = 'm1.small'
  ```

- ## 3 - Configure Instance Groups (optional)
+ ## 4 - Configure Instance Groups (optional)

  Technically this is optional since Elasticity creates MASTER and CORE instance groups for you (one m1.small instance in each). If you'd like your jobs to finish in an appreciable amount of time, you'll want to at least add a few instances to the CORE group :)

@@ -170,7 +173,7 @@ ig.set_spot_instances(0.25) # Makes this a SPOT group with a $0.25 bid p
  jobflow.set_core_instance_group(ig)
  ```

- ## 4 - Add Bootstrap Actions (optional)
+ ## 5 - Add Bootstrap Actions (optional)

  Bootstrap actions are run as part of setting up the job flow, so be sure to configure these before running the job.

@@ -206,7 +209,7 @@ action = Elasticity::HadoopFileBootstrapAction.new('s3n://my-bucket/job-config.x
  jobflow.add_bootstrap_action(action)
  ```

- ## 5 - Add Steps (optional)
+ ## 6 - Add Steps (optional)

  Each type of step has ```#name``` and ```#action_on_failure``` fields that can be specified. Apart from that, steps are configured differently - exhaustively described below.

@@ -308,7 +311,7 @@ copy_step.arguments = [...]
  jobflow.add_step(copy_step)
  ```

- ## 6 - Upload Assets (optional)
+ ## 7 - Upload Assets (optional)

  This isn't part of ```JobFlow```; more of an aside. Elasticity provides a very basic means of uploading assets to S3 so that your EMR job has access to them. Most commonly this will be a set of resources to run the job (e.g. JAR files, streaming scripts, etc.) and a set of resources used by the job itself (e.g. a TSV file with a range of valid values, join tables, etc.).

@@ -332,7 +335,7 @@ If the bucket doesn't exist, it will be created.

  If a file already exists, there is an MD5 checksum evaluation. If the checksums are the same, the file will be skipped. Now you can use something like ```s3n://my-bucket/remote-dir/this-job/tables/join.tsv``` in your EMR jobs.

- ## 7 - Run the Job Flow
+ ## 8 - Run the Job Flow

  Submit the job flow to Amazon, storing the ID of the running job flow.

@@ -340,11 +343,11 @@ Submit the job flow to Amazon, storing the ID of the running job flow.
  jobflow_id = jobflow.run
  ```

- ## 8 - Add Additional Steps (optional)
+ ## 9 - Add Additional Steps (optional)

  Steps can be added to a running jobflow just by calling ```#add_step``` on the job flow exactly how you add them prior to submitting the job.

- ## 9 - Wait For the Job Flow to Complete (optional)
+ ## 10 - Wait For the Job Flow to Complete (optional)

  Elasticity has the ability to block until the status of a job flow is not STARTING or RUNNING. There are two flavours. Without a status callback:

@@ -362,7 +365,7 @@ jobflow.wait_for_completion do |elapsed_time, job_flow_status|
  end
  ```

- ## 10 - Shut Down the Job Flow (optional)
+ ## 11 - Shut Down the Job Flow (optional)

  By default, job flows are set to terminate when there are no more running steps. You can tell the job flow to stay alive when it has nothing left to do:

@@ -378,11 +381,15 @@ jobflow.shutdown

  # Elasticity Configuration

- Elasticity supports a wide range of configuration options :) all of which are shown below.
+ Elasticity supports a handful of configuration options, all of which are shown below.

  ```ruby
  Elasticity.configure do |config|

+   # AWS credentials
+   config.access_key = ENV['AWS_ACCESS_KEY_ID']
+   config.secret_key = ENV['AWS_SECRET_ACCESS_KEY']
+
    # If using Hive, it will be configured via the directives here
    config.hive_site = 's3://bucket/hive-site.xml'

data/elasticity.gemspec CHANGED
@@ -12,13 +12,13 @@ Gem::Specification.new do |s|
    s.description = %q{Streamlined, programmatic access to Amazon's Elastic Map Reduce service, driven by the Sharethrough team's requirements for belting out EMR jobs.}

    s.add_dependency('rest-client', '~> 1.0')
-   s.add_dependency('nokogiri', '~> 1.0')
    s.add_dependency('fog', '~> 1.0')
    s.add_dependency('unf', '~> 0.1')

+   s.add_development_dependency('factory_girl', '~> 4.0')
    s.add_development_dependency('fakefs', '~> 0.4.0')
    s.add_development_dependency('rake', '~> 0.9')
-   s.add_development_dependency('rspec', '~> 2.14.0')
+   s.add_development_dependency('rspec', '~> 3.0')
    s.add_development_dependency('timecop', '~> 0.6')

    s.files = `git ls-files`.split("\n")
data/lib/elasticity.rb CHANGED
@@ -9,7 +9,6 @@ require 'elasticity/version'

  require 'elasticity/aws_utils'
  require 'elasticity/aws_session'
- require 'elasticity/aws_request_v2'
  require 'elasticity/aws_request_v4'
  require 'elasticity/emr'

@@ -25,8 +24,8 @@ require 'elasticity/looper'
  require 'elasticity/job_flow'
  require 'elasticity/instance_group'

- require 'elasticity/job_flow_status'
- require 'elasticity/job_flow_status_step'
+ require 'elasticity/cluster_status'
+ require 'elasticity/cluster_step_status'

  require 'elasticity/custom_jar_step'
  require 'elasticity/setup_hadoop_debugging_step'
@@ -56,6 +55,9 @@ module Elasticity

    class Configuration
      attr_accessor :hive_site
+     attr_accessor :access_key
+     attr_accessor :secret_key
+     attr_accessor :security_token
    end

  end
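The three new accessors are what allow temporary credentials to be supplied alongside the usual key pair. The following is a minimal stand-alone sketch of that pattern, not the gem's code: the `ElasticitySketch` module and the bodies of `configure` and `configuration` are assumptions made for illustration, since this diff shows only the `Configuration` class itself. All credential values are placeholders.

```ruby
# Stand-in for the gem's configuration pattern. ElasticitySketch and the
# bodies of configure/configuration are assumptions for illustration; only
# the Configuration accessors below appear in the diff.
module ElasticitySketch
  class Configuration
    attr_accessor :hive_site
    attr_accessor :access_key
    attr_accessor :secret_key
    attr_accessor :security_token
  end

  # Lazily build a single shared configuration object.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yield the shared configuration so callers can set values in a block.
  def self.configure
    yield configuration
  end
end

# Temporary credentials carry a session token in addition to the key pair.
# These values are placeholders, not real credentials.
ElasticitySketch.configure do |c|
  c.access_key     = 'EXAMPLE_ACCESS_KEY'
  c.secret_key     = 'EXAMPLE_SECRET_KEY'
  c.security_token = 'EXAMPLE_SESSION_TOKEN'
end
```

Leaving `security_token` unset keeps it `nil`, which is the case a request builder can test for before adding a token header.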
data/lib/elasticity/aws_request_v4.rb CHANGED
@@ -20,11 +20,21 @@ module Elasticity
        @ruby_service_hash.delete(:operation)

        @timestamp = Time.now.utc
+
+       @access_key = Elasticity.configuration.access_key
+       if @access_key == nil
+         raise ArgumentError, '.access_key must be set in the configuration block'
+       end
+
+       @secret_key = Elasticity.configuration.secret_key
+       if @secret_key == nil
+         raise ArgumentError, '.secret_key must be set in the configuration block'
+       end
      end

      def headers
-       {
-         'Authorization' => "AWS4-HMAC-SHA256 Credential=#{@aws_session.access_key}/#{credential_scope}, SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-target, Signature=#{aws_v4_signature}",
+       headers = {
+         'Authorization' => "AWS4-HMAC-SHA256 Credential=#{@access_key}/#{credential_scope}, SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date;x-amz-target, Signature=#{aws_v4_signature}",
          'Content-Type' => 'application/x-amz-json-1.1',
          'Host' => host,
          'User-Agent' => "elasticity/#{Elasticity::VERSION}",
@@ -32,6 +42,8 @@ module Elasticity
          'X-Amz-Date' => @timestamp.strftime('%Y%m%dT%H%M%SZ'),
          'X-Amz-Target' => "ElasticMapReduce.#{@operation}",
        }
+       headers.merge!('X-Amz-Security-Token' => Elasticity.configuration.security_token) if Elasticity.configuration.security_token
+       headers
      end

      def url
@@ -85,7 +97,7 @@ module Elasticity
      # Task 3: Calculate the AWS Signature Version 4
      # http://docs.aws.amazon.com/general/latest/gr/sigv4-calculate-signature.html
      def aws_v4_signature
-       date = OpenSSL::HMAC.digest('sha256', 'AWS4' + @aws_session.secret_key, @timestamp.strftime('%Y%m%d'))
+       date = OpenSSL::HMAC.digest('sha256', 'AWS4' + @secret_key, @timestamp.strftime('%Y%m%d'))
        region = OpenSSL::HMAC.digest('sha256', date, @aws_session.region)
        service = OpenSSL::HMAC.digest('sha256', region, SERVICE_NAME)
        signing_key = OpenSSL::HMAC.digest('sha256', service, 'aws4_request')
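The chained digests in `aws_v4_signature` and the conditional token header can be exercised in isolation. The sketch below lifts both into free-standing methods; `sigv4_signing_key` and `with_security_token` are hypothetical helper names, and the secret, date, region and service values are placeholders (the value of the gem's `SERVICE_NAME` constant is not visible in this diff, so `'elasticmapreduce'` is an assumption).

```ruby
require 'openssl'

# Derive a Signature Version 4 signing key by chaining HMAC-SHA256 over the
# date, region, service and the literal 'aws4_request', mirroring the four
# digest calls in aws_v4_signature. All argument values are placeholders.
def sigv4_signing_key(secret_key, timestamp, region, service)
  date_key    = OpenSSL::HMAC.digest('sha256', 'AWS4' + secret_key, timestamp.strftime('%Y%m%d'))
  region_key  = OpenSSL::HMAC.digest('sha256', date_key, region)
  service_key = OpenSSL::HMAC.digest('sha256', region_key, service)
  OpenSSL::HMAC.digest('sha256', service_key, 'aws4_request')
end

# Add the session-token header only when a token is present, which is the
# same condition the new headers method applies. Returns a new hash.
def with_security_token(headers, security_token)
  return headers unless security_token
  headers.merge('X-Amz-Security-Token' => security_token)
end

key = sigv4_signing_key('EXAMPLE_SECRET_KEY', Time.utc(2015, 7, 17), 'us-east-1', 'elasticmapreduce')
puts key.unpack('H*').first # hex form of the 32-byte signing key
```

Because the date is folded into the first digest, a key derived on one UTC day does not match one derived on the next, which is why the request captures `@timestamp` once and reuses it for both the header and the signature.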
data/lib/elasticity/aws_session.rb CHANGED
@@ -7,15 +7,13 @@ module Elasticity

    class AwsSession

-     attr_reader :access_key
-     attr_reader :secret_key
      attr_reader :host
      attr_reader :region

      # Supported values for options:
      # :region - AWS region (e.g. us-west-1)
      # :secure - true or false, default true.
-     def initialize(access=nil, secret=nil, options={})
+     def initialize(options={})
        # There is a cryptic error if this isn't set
        if options.has_key?(:region) && options[:region] == nil
          raise MissingRegionError, 'A valid :region is required to connect to EMR'
@@ -23,8 +21,6 @@ module Elasticity
        options[:region] = 'us-east-1' unless options[:region]
        @region = options[:region]

-       @access_key = get_access_key(access)
-       @secret_key = get_secret_key(secret)
        @host = "elasticmapreduce.#@region.amazonaws.com"
      end

@@ -39,32 +35,17 @@ module Elasticity

      def ==(other)
        return false unless other.is_a? AwsSession
-       return false unless @access_key == other.access_key
-       return false unless @secret_key == other.secret_key
        return false unless @host == other.host
        true
      end

      private

-     def get_access_key(access)
-       return access if access
-       return ENV['AWS_ACCESS_KEY_ID'] if ENV['AWS_ACCESS_KEY_ID']
-       raise MissingKeyError, 'Please provide an access key or set AWS_ACCESS_KEY_ID.'
-     end
-
-     def get_secret_key(secret)
-       return secret if secret
-       return ENV['AWS_SECRET_ACCESS_KEY'] if ENV['AWS_SECRET_ACCESS_KEY']
-       raise MissingKeyError, 'Please provide a secret key or set AWS_SECRET_ACCESS_KEY.'
-     end
-
      # AWS error responses all follow the same form. Extract the message from
      # the error document.
-     def self.parse_error_response(error_xml)
-       xml_doc = Nokogiri::XML(error_xml)
-       xml_doc.remove_namespaces!
-       xml_doc.xpath('/ErrorResponse/Error/Message').text
+     def self.parse_error_response(error_json)
+       error = JSON.parse(error_json)
+       "AWS EMR API Error (#{error['__type']}): #{error['message']}"
      end

    end
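For reference, the new JSON error handling can be tried without the gem. This sketch copies the two-line body of `parse_error_response` into a free-standing method; the sample payload is invented for illustration and only assumes the `__type` and `message` keys that the method reads.

```ruby
require 'json'

# Build a readable message from a JSON error body, following the new
# parse_error_response. The sample payload below is made up for illustration.
def parse_error_response(error_json)
  error = JSON.parse(error_json)
  "AWS EMR API Error (#{error['__type']}): #{error['message']}"
end

sample = '{"__type":"ValidationException","message":"Cluster id is not valid."}'
puts parse_error_response(sample)
```

Note that a body that is not valid JSON makes `JSON.parse` raise `JSON::ParserError`, so callers of a method shaped like this may want to rescue that case.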
data/lib/elasticity/aws_utils.rb CHANGED
@@ -27,35 +27,6 @@ module Elasticity
        end
      end

-     # Since we use the same structure as AWS, we can generate AWS param names
-     # from the Ruby versions of those names (and the param nesting).
-     def self.convert_ruby_to_aws(params)
-       result = {}
-       params.each do |key, value|
-         case value
-         when Array
-           prefix = "#{camelize(key.to_s)}.member"
-           value.each_with_index do |item, index|
-             if item.is_a?(String)
-               result["#{prefix}.#{index+1}"] = item
-             else
-               convert_ruby_to_aws(item).each do |nested_key, nested_value|
-                 result["#{prefix}.#{index+1}.#{nested_key}"] = nested_value
-               end
-             end
-           end
-         when Hash
-           prefix = "#{camelize(key.to_s)}"
-           convert_ruby_to_aws(value).each do |nested_key, nested_value|
-             result["#{prefix}.#{nested_key}"] = nested_value
-           end
-         else
-           result[camelize(key.to_s)] = value
-         end
-       end
-       result
-     end
-
      def self.camelize(word)
        word.to_s.gsub(/\/(.?)/) { '::' + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
      end
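To see what the removed helper produced, the sketch below reproduces `convert_ruby_to_aws` and `camelize` as free-standing methods and runs them on a small input. The input hash is invented for illustration, and the closing `JSON.generate` call is only a rough picture of the nested style the changelog describes, not the exact payload the gem sends.

```ruby
require 'json'

# camelize as it appears in aws_utils.rb.
def camelize(word)
  word.to_s.gsub(/\/(.?)/) { '::' + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
end

# The removed flattening logic, reproduced as a free-standing method.
# Arrays become "Name.member.N" keys and hashes become dotted prefixes.
def convert_ruby_to_aws(params)
  result = {}
  params.each do |key, value|
    case value
    when Array
      prefix = "#{camelize(key.to_s)}.member"
      value.each_with_index do |item, index|
        if item.is_a?(String)
          result["#{prefix}.#{index + 1}"] = item
        else
          convert_ruby_to_aws(item).each do |nested_key, nested_value|
            result["#{prefix}.#{index + 1}.#{nested_key}"] = nested_value
          end
        end
      end
    when Hash
      convert_ruby_to_aws(value).each do |nested_key, nested_value|
        result["#{camelize(key.to_s)}.#{nested_key}"] = nested_value
      end
    else
      result[camelize(key.to_s)] = value
    end
  end
  result
end

# Illustrative input only; the IDs and tag are placeholders.
params = { :job_flow_ids => ['j-1', 'j-2'], :tags => [{ :key => 'env', :value => 'test' }] }
flat = convert_ruby_to_aws(params)
flat.each { |k, v| puts "#{k}=#{v}" }

# With JSON requests the nesting survives as-is (key naming here is assumed).
puts JSON.generate('JobFlowIds' => ['j-1', 'j-2'], 'Tags' => [{ 'Key' => 'env', 'Value' => 'test' }])
```

The flattened form needs one key per leaf value with 1-based `member` indices, which is the transformation the nested JSON submission style makes unnecessary.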
data/lib/elasticity/cluster_status.rb ADDED
@@ -0,0 +1,38 @@
+ module Elasticity
+
+   class ClusterStatus
+
+     attr_accessor :name
+     attr_accessor :cluster_id
+     attr_accessor :state
+     attr_accessor :created_at
+     attr_accessor :ready_at
+     attr_accessor :ended_at
+     attr_accessor :last_state_change_reason
+     attr_accessor :master_public_dns_name
+     attr_accessor :normalized_instance_hours
+
+     # ClusterStatus is created via the results of the DescribeCluster API call
+     def self.from_aws_data(cluster_data)
+       cluster_data = cluster_data['Cluster']
+       ClusterStatus.new.tap do |c|
+         c.name = cluster_data['Name']
+         c.cluster_id = cluster_data['Id']
+         c.state = cluster_data['Status']['State']
+         c.created_at = Time.at(cluster_data['Status']['Timeline']['CreationDateTime'])
+         c.ready_at = Time.at(cluster_data['Status']['Timeline']['ReadyDateTime'])
+         c.ended_at = Time.at(cluster_data['Status']['Timeline']['EndDateTime'])
+         c.last_state_change_reason = cluster_data['Status']['StateChangeReason']['Code']
+         c.master_public_dns_name = cluster_data['MasterPublicDnsName']
+         c.normalized_instance_hours = cluster_data['NormalizedInstanceHours']
+       end
+     end
+
+     # http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/ProcessingCycle.html
+     def active?
+       %w{RUNNING STARTING BOOTSTRAPPING WAITING SHUTTING_DOWN}.include?(@state)
+     end
+
+   end
+
+ end
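One thing worth noting about `from_aws_data`: `Time.at(nil)` raises a `TypeError`, so the mapping as written relies on all three timeline fields being present in the response. The sketch below shows a nil-tolerant variant under the assumption (not confirmed by this diff) that a cluster which has not yet ended may omit `EndDateTime`. `ClusterStatusSketch`, `time_or_nil` and the sample payload are hypothetical and cover only a subset of the real attributes.

```ruby
# Nil-tolerant variant of the ClusterStatus mapping. ClusterStatusSketch,
# time_or_nil and the sample data are illustrative, not part of the gem.
class ClusterStatusSketch
  ACTIVE_STATES = %w{RUNNING STARTING BOOTSTRAPPING WAITING SHUTTING_DOWN}.freeze

  attr_accessor :name, :cluster_id, :state, :created_at, :ready_at, :ended_at

  # Convert an epoch value to Time, passing nil through untouched.
  def self.time_or_nil(epoch)
    epoch.nil? ? nil : Time.at(epoch)
  end

  # Map a DescribeCluster-shaped hash onto the sketch object.
  def self.from_aws_data(cluster_data)
    cluster  = cluster_data['Cluster']
    timeline = cluster['Status']['Timeline']
    new.tap do |c|
      c.name       = cluster['Name']
      c.cluster_id = cluster['Id']
      c.state      = cluster['Status']['State']
      c.created_at = time_or_nil(timeline['CreationDateTime'])
      c.ready_at   = time_or_nil(timeline['ReadyDateTime'])
      c.ended_at   = time_or_nil(timeline['EndDateTime'])
    end
  end

  # Same state list as ClusterStatus#active? in the diff.
  def active?
    ACTIVE_STATES.include?(@state)
  end
end

# A made-up payload for a cluster that is still running (no EndDateTime).
running = ClusterStatusSketch.from_aws_data(
  'Cluster' => {
    'Name'   => 'Example cluster',
    'Id'     => 'j-EXAMPLE',
    'Status' => {
      'State'    => 'RUNNING',
      'Timeline' => { 'CreationDateTime' => 1436918400.0, 'ReadyDateTime' => 1436918700.0 }
    }
  }
)
puts running.active?
puts running.ended_at.inspect
```

Whether the real API omits the field or always returns it is something to verify against the DescribeCluster documentation before relying on either behaviour.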
data/lib/elasticity/cluster_step_status.rb ADDED
@@ -0,0 +1,51 @@
+ module Elasticity
+
+   class ClusterStepStatus
+
+     attr_accessor :action_on_failure
+     attr_accessor :args
+     attr_accessor :jar
+     attr_accessor :main_class
+     attr_accessor :properties
+     attr_accessor :step_id
+     attr_accessor :name
+     attr_accessor :state
+     attr_accessor :state_change_reason
+     attr_accessor :state_change_reason_message
+     attr_accessor :created_at
+     attr_accessor :started_at
+     attr_accessor :ended_at
+
+     # Constructed from http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_ListSteps.html
+     def self.from_aws_list_data(cluster_step_data)
+       cluster_step_data['Steps'].map do |s|
+         ClusterStepStatus.new.tap do |c|
+           c.action_on_failure = s['ActionOnFailure']
+           c.args = s['Config']['Args']
+           c.jar = s['Config']['Jar']
+           c.main_class = s['Config']['MainClass']
+           c.properties = s['Config']['Properties']
+           c.step_id = s['Id']
+           c.name = s['Name']
+           c.state = s['Status']['State']
+           c.state_change_reason = s['Status']['StateChangeReason']['Code']
+           c.state_change_reason_message = s['Status']['StateChangeReason']['Message']
+           c.created_at = Time.at(s['Status']['Timeline']['CreationDateTime'])
+           c.started_at = Time.at(s['Status']['Timeline']['StartDateTime'])
+           c.ended_at = Time.at(s['Status']['Timeline']['EndDateTime'])
+         end
+       end
+     end
+
+     def self.installed_steps(cluster_step_statuses)
+       step_names = cluster_step_statuses.map(&:name)
+       installed_steps = []
+       Elasticity::JobFlowStep.steps_requiring_installation.each do |step|
+         installed_steps << step if step_names.include?(step.aws_installation_step_name)
+       end
+       installed_steps
+     end
+
+   end
+
+ end
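The `installed_steps` lookup matches step names reported by the cluster against the installation step name each step class declares. A stand-alone sketch of that matching follows; `FakePigStep`, `FakeHiveStep`, their installation step names and the `StepStatus` struct are hypothetical stand-ins, and the candidate list is passed in explicitly rather than read from `Elasticity::JobFlowStep.steps_requiring_installation`.

```ruby
# Stand-ins for step classes that need a one-time installation step.
# The class names and installation step names are hypothetical.
class FakePigStep
  def self.aws_installation_step_name
    'Fake Pig Setup'
  end
end

class FakeHiveStep
  def self.aws_installation_step_name
    'Fake Hive Setup'
  end
end

# Minimal status object exposing only the name that the lookup reads.
StepStatus = Struct.new(:name)

# Return the candidate step classes whose installation step already
# appears among the cluster's reported steps.
def installed_steps(cluster_step_statuses, candidates)
  step_names = cluster_step_statuses.map(&:name)
  candidates.select { |step| step_names.include?(step.aws_installation_step_name) }
end

statuses = [StepStatus.new('Fake Pig Setup'), StepStatus.new('My analysis job')]
puts installed_steps(statuses, [FakePigStep, FakeHiveStep]).inspect
```

Using `select` here is a stylistic choice for the sketch; it returns the same classes, in the same order, as the accumulate-into-array loop in the diff.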