elasticity 2.0 → 2.1
- data/HISTORY.md +20 -12
- data/README.md +57 -8
- data/lib/elasticity.rb +2 -0
- data/lib/elasticity/custom_jar_step.rb +1 -1
- data/lib/elasticity/instance_group.rb +57 -0
- data/lib/elasticity/job_flow.rb +46 -12
- data/lib/elasticity/version.rb +1 -1
- data/spec/lib/elasticity/custom_jar_step_spec.rb +3 -3
- data/spec/lib/elasticity/instance_group_spec.rb +140 -0
- data/spec/lib/elasticity/job_flow_integration_spec.rb +43 -10
- data/spec/lib/elasticity/job_flow_spec.rb +111 -5
- metadata +7 -4
data/HISTORY.md
CHANGED
@@ -1,3 +1,9 @@
+## 2.1 - July 7, 2012
+
++ TASK instance group support added.
++ SPOT instance support added for all instance group types.
++ Removed name of jar from default name of ```CustomJarStep``` since the AWS UI already calls it out in a separate column when looking at job flow steps.
+
 ## 2.0 - June 26, 2012

 2.0 is a rewrite of the simplified API after a year's worth of daily use at [Sharethrough](http://www.sharethrough.com/). We're investing heavily in our data processing infrastucture and many Elasticity feature ideas have come from those efforts.
@@ -21,59 +27,61 @@ In order to move more quickly and support interesting features like a command-li
 + Drastic simplification of the testing around EMR submission, reducing LoC (however important that metric is you :) and complexity by ~50%.
 + Development dependency updates: updated to ruby-1.9.3-p194 and rspec-2.10. Removed dependency on VCR and WebMock (no longer using either of these).

-## 1.5
+## 1.5 - March 5, 2012

 + Added support for Hadoop bootstrap actions to all job types (Pig, Hive and Custom Jar).
 + Added support for REE 1.8.7-2011.12, Ruby 1.9.2 and 1.9.3.
 + Updated to the latest versions of all development dependencies (notably VCR 2).

-## 1.4.1
+## 1.4.1 - December 17, 2011

 + Added ```Elasticity::EMR#describe_jobflow("jobflow_id")``` for describing a specific job. If you happen to run hundreds of EMR jobs, this makes retrieving jobflow status much faster than using ```Elasticity::EMR#describe_jobflowS``` which pulls down and parses XML status for hundreds of jobs.

-## 1.4
+## 1.4 - December 9, 2011

 + Added ```Elasticity::CustomJarJob``` for launching "Custom Jar" jobs.

-## 1.3.1
+## 1.3.1 - November 16, 2011

 + Explicitly requiring 'time' (only a problem if you aren't running from within a Rails environment).
 + ```Elasticity::JobFlow``` now exposes ```last_state_change_reason```.

-## 1.3
+## 1.3 - October 10, 2011
+
+This release primarily contains contributions from Wouter Broekhof

 + The default mode of communication is now via HTTPS.
 + ```Elasticity::AwsRequest``` new option ```:secure => true|false``` (whether to use HTTPS).
 + ```Elasticity::AwsRequest``` new option ```:region => eu-west-1|...``` (which region to run the EMR job).
 + ```Elasticity::EMR#describe_jobflows``` now accepts additional params for filtering the jobflow query (see docs).

-## 1.2.2
+## 1.2.2 - May 10, 2011

 + ```HiveJob``` and ```PigJob``` now support configuring Hadoop options via ```#add_hadoop_bootstrap_action()```.

-## 1.2.1
+## 1.2.1 - May 7, 2011

 + Shipping up E_PARALLELS Pig variable with each invocation; reasonable default value for PARALLEL based on the number and type of instances configured.

-## 1.2
+## 1.2 - May 4, 2011

 + Added ```PigJob```!

-## 1.1.1
+## 1.1.1 - April 25, 2011

 + ```HiveJob``` critical bug fixed, now it works :)
 + Added ```log_uri``` and ```action_on_failure``` as options to ```HiveJob```.
 + Added integration tests to ```HiveJob```.

-## 1.1
+## 1.1 - April 24, 2011

 + Added ```HiveJob```, a simplified way to launch basic Hive job flows.
 + Added HISTORY.

-## 1.0.1
+## 1.0.1 - April 22, 2011

 + Added LICENSE.

-## 1.0
+## 1.0 - April 22, 2011

 + Released!
data/README.md
CHANGED
@@ -52,7 +52,8 @@ Job flows are the center of the EMR universe. The general order of operations i

 1. Create a job flow.
 1. Specify options.
-1.
+1. (optional) Configure instance groups.
+1. (optional) Add bootstrap actions.
 1. Create steps.
 1. Run the job flow.
 1. (optional) Add additional steps.
@@ -78,15 +79,63 @@ jobflow.ami_version = 'latest'
 jobflow.ec2_key_name = 'default'
 jobflow.ec2_subnet_id = nil
 jobflow.hadoop_version = '0.20.205'
-jobflow.instance_count = 2
 jobflow.keep_job_flow_alive_when_no_steps = true
 jobflow.log_uri = nil
-jobflow.master_instance_type = 'm1.small'
 jobflow.name = 'Elasticity Job Flow'
+jobflow.instance_count = 2
+jobflow.master_instance_type = 'm1.small'
 jobflow.slave_instance_type = 'm1.small'
 ```

-## 3 -
+## 3 - Configuring Instance Groups (optional)
+
+Technically this is optional since Elasticity creates MASTER and CORE instance groups for you (one m1.small instance in each). If you'd like your jobs to finish in an appreciable amount of time, you'll want to at least add a few instances to the CORE group :)
+
+### The Easy Way™
+
+If all you'd like to do is change the type or number of instances, ```JobFlow``` provides a few shortcuts to do just that.
+
+```
+jobflow.instance_count = 10
+jobflow.master_instance_type = 'm1.small'
+jobflow.slave_instance_type = 'c1.medium'
+```
+
+This says "I want 10 instances from EMR: one m1.small MASTER instance and nine c1.medium CORE instances."
+
+### The Still-Easy Way™
+
+Elasticity supports all EMR instance group types and all configuration options. The MASTER, CORE and TASK instance groups can be configured via ```JobFlow#set_master_instance_group```, ```JobFlow#set_core_instance_group``` and ```JobFlow#set_task_instance_group``` respectively.
+
+#### On-Demand Instance Groups
+
+These instances will be available for the life of your EMR job, versus Spot instances which are transient depending on your bid price (see below).
+
+```
+ig = Elasticity::InstanceGroup.new
+ig.count = 10 # Provision 10 instances
+ig.type = 'c1.medium' # See the EMR docs for a list of supported types
+ig.set_on_demand_instances # This is the default setting
+
+
+jobflow.set_core_instance_group(ig)
+```
+
+#### Spot Instance Groups
+
+*When Amazon EC2 has unused capacity, it offers EC2 instances at a reduced cost, called the Spot Price. This price fluctuates based on availability and demand. You can purchase Spot Instances by placing a request that includes the highest bid price you are willing to pay for those instances. When the Spot Price is below your bid price, your Spot Instances are launched and you are billed the Spot Price. If the Spot Price rises above your bid price, Amazon EC2 terminates your Spot Instances.* - [EMR Developer Guide](http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_SpotInstances.html)
+
+```
+ig = Elasticity::InstanceGroup.new
+ig.count = 10 # Provision 10 instances
+ig.type = 'c1.medium' # See the EMR docs for a list of supported types
+ig.set_spot_instances(0.25) # Makes this a SPOT group with a $0.25 bid price
+
+
+jobflow.set_core_instance_group(ig)
+```
+
+## 4 - Adding Bootstrap Actions (optional)

 Bootstrap actions are run as part of setting up the job flow, so be sure to configure these before running the job.

@@ -100,7 +149,7 @@ Bootstrap actions are run as part of setting up the job flow, so be sure to conf
 end
 ```

-##
+## 5 - Adding Steps

 Each type of step has a default name that can be overridden (the :name field). Apart from that, steps are configured differently - exhaustively described below.

@@ -170,7 +219,7 @@ jar_step.arguments = ['arg1', 'arg2']
 jobflow.add_step(jar_step)
 ```

-##
+## 6 - Running the Job Flow

 Submit the job flow to Amazon, storing the ID of the running job flow.

@@ -178,11 +227,11 @@ Submit the job flow to Amazon, storing the ID of the running job flow.
 jobflow_id = jobflow.run
 ```

-##
+## 7 - Adding Additional Steps (optional)

 Steps can be added to a running jobflow just by calling ```#add_step``` on the job flow exactly how you add them prior to submitting the job.

-##
+## 8 - Shutting Down the Job Flow (optional)

 By default, job flows are set to terminate when there are no more running steps. You can tell the job flow to stay alive when it has nothing left to do:

data/lib/elasticity.rb
CHANGED
@@ -0,0 +1,57 @@
+module Elasticity
+
+  class InstanceGroup
+
+    ROLES = %w(MASTER CORE TASK)
+
+    attr_accessor :count
+    attr_accessor :type
+    attr_accessor :role
+
+    attr_reader :bid_price
+    attr_reader :market
+
+    def initialize
+      @count = 1
+      @type = 'm1.small'
+      @market = 'ON_DEMAND'
+      @role = 'CORE'
+    end
+
+    def count=(instance_count)
+      raise_if instance_count <= 0, ArgumentError, "Instance groups require at least 1 instance (#{instance_count} requested)"
+      raise_if @role == 'MASTER' && instance_count != 1, ArgumentError, "MASTER instance groups can only have 1 instance (#{instance_count} requested)"
+      @count = instance_count
+    end
+
+    def role=(group_role)
+      raise_unless ROLES.include?(group_role), ArgumentError, "Role must be one of MASTER, CORE or TASK (#{group_role} was requested)"
+      @count = 1 if group_role == 'MASTER'
+      @role = group_role
+    end
+
+    def set_spot_instances(bid_price)
+      raise_unless bid_price > 0, ArgumentError, "The bid price for spot instances should be greater than 0 (#{bid_price} requested)"
+      @bid_price = bid_price
+      @market = 'SPOT'
+    end
+
+    def set_on_demand_instances
+      @bid_price = nil
+      @market = 'ON_DEMAND'
+    end
+
+    def to_aws_instance_config
+      {
+        :market => @market,
+        :instance_count => @count,
+        :instance_type => @type,
+        :instance_role => @role,
+      }.tap do |config|
+        config.merge!(:bid_price => @bid_price) if @market == 'SPOT'
+      end
+    end
+
+  end
+
+end
|
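The validation rules in `InstanceGroup` are easiest to see by exercising them. What follows is a minimal, self-contained sketch and not the gem's code: `SketchInstanceGroup` is a hypothetical stand-in that copies the logic shown in this diff and uses plain `raise` in place of the gem's `raise_if` / `raise_unless` helpers, which are defined outside this diff.

```ruby
# Hypothetical stand-in for Elasticity::InstanceGroup (name is an assumption).
# Plain `raise` replaces the gem's raise_if / raise_unless helpers.
class SketchInstanceGroup
  ROLES = %w(MASTER CORE TASK)

  attr_accessor :type
  attr_reader :count, :role, :bid_price, :market

  def initialize
    @count = 1
    @type = 'm1.small'
    @market = 'ON_DEMAND'
    @role = 'CORE'
  end

  # Every group needs at least one instance; MASTER groups allow exactly one.
  def count=(instance_count)
    raise ArgumentError, "Instance groups require at least 1 instance (#{instance_count} requested)" if instance_count <= 0
    raise ArgumentError, "MASTER instance groups can only have 1 instance (#{instance_count} requested)" if @role == 'MASTER' && instance_count != 1
    @count = instance_count
  end

  # Switching to MASTER resets the count to 1 instead of raising.
  def role=(group_role)
    raise ArgumentError, "Role must be one of MASTER, CORE or TASK (#{group_role} was requested)" unless ROLES.include?(group_role)
    @count = 1 if group_role == 'MASTER'
    @role = group_role
  end

  def set_spot_instances(bid_price)
    raise ArgumentError, "The bid price for spot instances should be greater than 0 (#{bid_price} requested)" unless bid_price > 0
    @bid_price = bid_price
    @market = 'SPOT'
  end

  def set_on_demand_instances
    @bid_price = nil
    @market = 'ON_DEMAND'
  end

  # :bid_price is only included for SPOT groups.
  def to_aws_instance_config
    config = {
      :market => @market,
      :instance_count => @count,
      :instance_type => @type,
      :instance_role => @role
    }
    config[:bid_price] = @bid_price if @market == 'SPOT'
    config
  end
end

group = SketchInstanceGroup.new
group.count = 5
group.type = 'c1.medium'
group.set_spot_instances(0.25)
p group.to_aws_instance_config
```

Two behaviours are worth noting: switching a group to MASTER silently resets its count to 1, while setting a count other than 1 on a group that is already MASTER raises an `ArgumentError`.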
data/lib/elasticity/job_flow.rb
CHANGED
@@ -22,30 +22,61 @@ module Elasticity
       @action_on_failure = 'TERMINATE_JOB_FLOW'
       @ec2_key_name = 'default'
       @hadoop_version = '0.20.205'
-      @instance_count = 2
-      @master_instance_type = 'm1.small'
       @name = 'Elasticity Job Flow'
-      @slave_instance_type = 'm1.small'
       @ami_version = 'latest'
       @keep_job_flow_alive_when_no_steps = false

-      @emr = Elasticity::EMR.new(access, secret)
-
       @bootstrap_actions = []
       @jobflow_steps = []
       @installed_steps = []
+
+      @instance_groups = {}
+      set_master_instance_group(Elasticity::InstanceGroup.new)
+      set_core_instance_group(Elasticity::InstanceGroup.new)
+
+      @instance_count = 2
+      @master_instance_type = 'm1.small'
+      @slave_instance_type = 'm1.small'
+
+      @emr = Elasticity::EMR.new(access, secret)
     end

     def instance_count=(count)
       raise ArgumentError, 'Instance count cannot be set to less than 2 (requested 1)' unless count > 1
+      @instance_groups[:core].count = count - 1
       @instance_count = count
     end

+    def master_instance_type=(type)
+      @instance_groups[:master].type = type
+      @master_instance_type = type
+    end
+
+    def slave_instance_type=(type)
+      @instance_groups[:core].type = type
+      @slave_instance_type = type
+    end
+
     def add_bootstrap_action(bootstrap_action)
       raise_if is_jobflow_running?, JobFlowRunningError, 'To modify bootstrap actions, please create a new job flow.'
       @bootstrap_actions << bootstrap_action
     end

+    def set_master_instance_group(instance_group)
+      instance_group.role = 'MASTER'
+      @instance_groups[:master] = instance_group
+    end
+
+    def set_core_instance_group(instance_group)
+      instance_group.role = 'CORE'
+      @instance_groups[:core] = instance_group
+    end
+
+    def set_task_instance_group(instance_group)
+      instance_group.role = 'TASK'
+      @instance_groups[:task] = instance_group
+    end
+
     def add_step(jobflow_step)
       if is_jobflow_running?
         jobflow_steps = []
@@ -90,20 +121,18 @@ module Elasticity
     end

     def jobflow_preamble
-      {
+      preamble = {
         :name => @name,
         :ami_version => @ami_version,
         :instances => {
           :keep_job_flow_alive_when_no_steps => @keep_job_flow_alive_when_no_steps,
           :ec2_key_name => @ec2_key_name,
           :hadoop_version => @hadoop_version,
-          :
-          :master_instance_type => @master_instance_type,
-          :slave_instance_type => @slave_instance_type,
+          :instance_groups => jobflow_instance_groups
         }
-      }
-
-
+      }
+      preamble.merge!(:ec2_subnet_id => @ec2_subnet_id) if @ec2_subnet_id
+      preamble
     end

     def jobflow_steps
@@ -118,6 +147,11 @@ module Elasticity
       steps
     end

+    def jobflow_instance_groups
+      groups = [:master, :core, :task].map{|role| @instance_groups[role]}.compact
+      groups.map(&:to_aws_instance_config)
+    end
+
   end

 end
|
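How `JobFlow` turns its shortcuts into instance group configs can be sketched without the gem. This is an illustrative sketch under stated assumptions: `SketchGroup`, `sketch_instance_groups` and `sketch_set_instance_count` are hypothetical names, and only the fixed role ordering, the `compact` step and the count-minus-one rule are taken from the diff above.

```ruby
# Hypothetical stand-in for an instance group; only the fields needed here.
SketchGroup = Struct.new(:role, :count, :type) do
  def to_aws_instance_config
    {
      :instance_role => role,
      :instance_count => count,
      :instance_type => type,
      :market => 'ON_DEMAND'
    }
  end
end

# Same shape as JobFlow#jobflow_instance_groups in the diff: look groups up
# in a fixed role order and drop any role that was never configured.
def sketch_instance_groups(groups)
  [:master, :core, :task].map { |role| groups[role] }.compact.map(&:to_aws_instance_config)
end

# Same rule as JobFlow#instance_count=: one instance is the MASTER and the
# remainder go to the CORE group. The message in the diff hard-codes
# "requested 1"; this sketch interpolates the value instead.
def sketch_set_instance_count(groups, total)
  raise ArgumentError, "Instance count cannot be set to less than 2 (requested #{total})" unless total > 1
  groups[:core].count = total - 1
end

groups = {
  :core => SketchGroup.new('CORE', 1, 'm1.small'),
  :master => SketchGroup.new('MASTER', 1, 'm1.small')
}
sketch_set_instance_count(groups, 10)
p sketch_instance_groups(groups)
```

Because the lookup walks a fixed role list rather than the hash itself, the emitted order is always MASTER, CORE, TASK regardless of the order in which groups were set, and an unset TASK group simply disappears from the result.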
data/lib/elasticity/version.rb
CHANGED
@@ -6,7 +6,7 @@ describe Elasticity::CustomJarStep do

   it { should be_a Elasticity::JobFlowStep }

-  its(:name) { should == 'Elasticity Custom Jar Step
+  its(:name) { should == 'Elasticity Custom Jar Step' }
   its(:jar) { should == 'jar' }
   its(:arguments) { should == [] }
   its(:action_on_failure) { should == 'TERMINATE_JOB_FLOW' }
@@ -24,7 +24,7 @@ describe Elasticity::CustomJarStep do
         :hadoop_jar_step => {
           :jar => 'jar'
         },
-        :name => 'Elasticity Custom Jar Step
+        :name => 'Elasticity Custom Jar Step'
       }
     end
   end
@@ -43,7 +43,7 @@ describe Elasticity::CustomJarStep do
           :jar => 'jar',
           :args => ['arg1', 'arg2',],
         },
-        :name => 'Elasticity Custom Jar Step
+        :name => 'Elasticity Custom Jar Step'
       }
     end
   end
@@ -0,0 +1,140 @@
+describe Elasticity::InstanceGroup do
+
+  its(:bid_price) { should == nil }
+  its(:count) { should == 1 }
+  its(:type) { should == 'm1.small' }
+  its(:market) { should == 'ON_DEMAND' }
+  its(:role) { should == 'CORE' }
+
+  describe '#count=' do
+
+    it 'should set the count' do
+      subject.count = 10
+      subject.count.should == 10
+    end
+
+    context 'when the role is not MASTER' do
+      context 'and the count is <= 0' do
+        it 'should be an error' do
+          subject.role = 'CORE'
+          expect {
+            subject.count = 0
+          }.to raise_error(ArgumentError, 'Instance groups require at least 1 instance (0 requested)')
+        end
+      end
+    end
+
+    context 'when the role is MASTER' do
+      context 'and a count != 1 is attempted' do
+        it 'should be an error' do
+          subject.role = 'MASTER'
+          expect {
+            subject.count = 2
+          }.to raise_error(ArgumentError, 'MASTER instance groups can only have 1 instance (2 requested)')
+        end
+      end
+    end
+
+  end
+
+  describe '#role=' do
+
+    it 'should set the role' do
+      subject.role = 'MASTER'
+      subject.role.should == 'MASTER'
+    end
+
+    context 'when the role is unknown' do
+      it 'should be an error' do
+        expect {
+          subject.role = '_'
+        }.to raise_error(ArgumentError, 'Role must be one of MASTER, CORE or TASK (_ was requested)')
+      end
+    end
+
+    context 'when the role is switching to MASTER' do
+      context 'and the count is != 1' do
+        it 'should set the count to 1' do
+          subject.role = 'CORE'
+          subject.count = 2
+          expect {
+            subject.role = 'MASTER'
+          }.to change { subject.count }.to(1)
+        end
+      end
+    end
+
+  end
+
+  describe '#set_spot_instances' do
+
+    it 'should set the type and price' do
+      subject.set_spot_instances(0.25)
+      subject.market.should == 'SPOT'
+      subject.bid_price.should == 0.25
+    end
+
+    context 'when the price is <= 0' do
+      it 'should be an error' do
+        expect {
+          subject.set_spot_instances(-1)
+        }.to raise_error(ArgumentError, 'The bid price for spot instances should be greater than 0 (-1 requested)')
+      end
+    end
+
+  end
+
+  describe '#set_on_demand_instances' do
+
+    it 'should set the type and price' do
+      subject.set_on_demand_instances
+      subject.market.should == 'ON_DEMAND'
+      subject.bid_price.should == nil
+    end
+
+  end
+
+  describe '#to_aws_instance_config' do
+
+    context 'when an ON_DEMAND group' do
+      let(:on_demand_instance_group) do
+        Elasticity::InstanceGroup.new.tap do |i|
+          i.count = 5
+          i.type = 'c1.medium'
+          i.role = 'CORE'
+          i.set_on_demand_instances
+        end
+      end
+      it 'should generate an AWS config' do
+        on_demand_instance_group.to_aws_instance_config.should == {
+          :market => 'ON_DEMAND',
+          :instance_count => 5,
+          :instance_type => 'c1.medium',
+          :instance_role => 'CORE',
+        }
+      end
+    end
+
+    context 'when a SPOT group' do
+      let(:on_demand_instance_group) do
+        Elasticity::InstanceGroup.new.tap do |i|
+          i.count = 5
+          i.type = 'c1.medium'
+          i.role = 'CORE'
+          i.set_spot_instances(0.25)
+        end
+      end
+      it 'should generate an AWS config' do
+        on_demand_instance_group.to_aws_instance_config.should == {
+          :market => 'SPOT',
+          :bid_price => 0.25,
+          :instance_count => 5,
+          :instance_type => 'c1.medium',
+          :instance_role => 'CORE',
+        }
+      end
+    end
+
+  end
+
+end
@@ -31,9 +31,20 @@ describe 'Elasticity::JobFlow Integration Examples' do
         :keep_job_flow_alive_when_no_steps => false,
         :ec2_key_name => 'default',
         :hadoop_version => '0.20.205',
-        :
-
-
+        :instance_groups => [
+          {
+            :instance_count => 1,
+            :instance_role => 'MASTER',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND',
+          },
+          {
+            :instance_count => 1,
+            :instance_role => 'CORE',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND'
+          },
+        ],
       },
       :steps => [
         {
@@ -98,9 +109,20 @@ describe 'Elasticity::JobFlow Integration Examples' do
         :keep_job_flow_alive_when_no_steps => false,
         :ec2_key_name => 'default',
         :hadoop_version => '0.20.205',
-        :
-
-
+        :instance_groups => [
+          {
+            :instance_count => 1,
+            :instance_role => 'MASTER',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND',
+          },
+          {
+            :instance_count => 7,
+            :instance_role => 'CORE',
+            :instance_type => 'm1.xlarge',
+            :market => 'ON_DEMAND'
+          },
+        ]
       },
       :steps => [
         {
@@ -169,9 +191,20 @@ describe 'Elasticity::JobFlow Integration Examples' do
         :keep_job_flow_alive_when_no_steps => false,
         :ec2_key_name => 'default',
         :hadoop_version => '0.20.205',
-        :
-
-
+        :instance_groups => [
+          {
+            :instance_count => 1,
+            :instance_role => 'MASTER',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND',
+          },
+          {
+            :instance_count => 1,
+            :instance_role => 'CORE',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND'
+          },
+        ]
       },
       :steps => [
         {
@@ -184,7 +217,7 @@ describe 'Elasticity::JobFlow Integration Examples' do
               's3n://slif_hadoop_test/cloudburst/output/2011-12-09',
             ],
           },
-          :name => 'Elasticity Custom Jar Step
+          :name => 'Elasticity Custom Jar Step'
         }
       ]
     }).and_return('CUSTOM_JAR_JOBFLOW_ID')
@@ -23,18 +23,69 @@ describe Elasticity::JobFlow do
   describe '#instance_count=' do

     context 'when set to more than 1' do
+
       it 'should set the number of instances' do
         subject.instance_count = 10
         subject.instance_count.should == 10
       end
+
+      it 'should set the CORE group instance count to COUNT-1 instances' do
+        instance_group = Elasticity::InstanceGroup.new
+        instance_group.count = 4
+        instance_group.role = 'CORE'
+
+        subject.instance_count = 5
+        subject.send(:jobflow_instance_groups).should be_include(instance_group.to_aws_instance_config)
+      end
+
     end

     context 'when set to less than 2' do
-
+
+      it 'should be an error and not set the instance count' do
+        subject.instance_count = 10
         expect {
           subject.instance_count = 1
         }.to raise_error(ArgumentError, 'Instance count cannot be set to less than 2 (requested 1)')
+        subject.instance_count.should == 10
       end
+
+    end
+
+  end
+
+  describe '#master_instance_type=' do
+
+    it 'should set the master_instance_type' do
+      subject.master_instance_type = '_'
+      subject.master_instance_type.should == '_'
+    end
+
+    it 'should set the MASTER group instance type' do
+      instance_group = Elasticity::InstanceGroup.new
+      instance_group.type = 'c1.medium'
+      instance_group.role = 'MASTER'
+
+      subject.master_instance_type = 'c1.medium'
+      subject.send(:jobflow_instance_groups).should be_include(instance_group.to_aws_instance_config)
+    end
+
+  end
+
+  describe '#slave_instance_type=' do
+
+    it 'should set the slave_instance_type' do
+      subject.slave_instance_type = '_'
+      subject.slave_instance_type.should == '_'
+    end
+
+    it 'should set the CORE group instance type' do
+      instance_group = Elasticity::InstanceGroup.new
+      instance_group.type = 'c1.medium'
+      instance_group.role = 'CORE'
+
+      subject.slave_instance_type = 'c1.medium'
+      subject.send(:jobflow_instance_groups).should be_include(instance_group.to_aws_instance_config)
     end

   end
@@ -231,6 +282,59 @@ describe Elasticity::JobFlow do

   end

+  describe '#jobflow_instance_groups' do
+
+    describe 'default instance groups' do
+
+      let(:default_instance_groups) do
+        [
+          {
+            :instance_count => 1,
+            :instance_role => 'MASTER',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND',
+          },
+          {
+            :instance_count => 1,
+            :instance_role => 'CORE',
+            :instance_type => 'm1.small',
+            :market => 'ON_DEMAND'
+          },
+        ]
+      end
+
+      it 'should create a properly specified instance group config' do
+        subject.send(:jobflow_instance_groups).should == default_instance_groups
+      end
+
+    end
+
+    context 'when a task instance group is specified' do
+
+      let(:task_instance_group) do
+        Elasticity::InstanceGroup.new.tap do |i|
+          i.count = 2
+          i.type = 'c1.medium'
+        end
+      end
+
+      let(:task_instance_group_config) do
+        {
+          :instance_count => 2,
+          :instance_role => 'TASK',
+          :instance_type => 'c1.medium',
+          :market => 'ON_DEMAND'
+        }
+      end
+
+      it 'should include it in the group config' do
+        subject.set_task_instance_group(task_instance_group)
+        subject.send(:jobflow_instance_groups).should be_include(task_instance_group_config)
+      end
+
+    end
+  end
+
   describe '#jobflow_preamble' do

     let(:basic_preamble) do
@@ -241,13 +345,15 @@ describe Elasticity::JobFlow do
           :keep_job_flow_alive_when_no_steps => false,
           :ec2_key_name => 'default',
           :hadoop_version => '0.20.205',
-          :
-          :master_instance_type => 'm1.small',
-          :slave_instance_type => 'm1.small',
+          :instance_groups => ['INSTANCE_GROUP_CONFIGURATION']
         }
       }
     end

+    before do
+      subject.stub(:jobflow_instance_groups).and_return(['INSTANCE_GROUP_CONFIGURATION'])
+    end
+
     it 'should create a jobflow configuration section' do
       subject.send(:jobflow_preamble).should == basic_preamble
     end
@@ -353,7 +459,7 @@ describe Elasticity::JobFlow do
   describe '#shutdown' do

     context 'when the jobflow has not yet been started' do
-      let(:unstarted_job_flow) { Elasticity::JobFlow.new('_', '_')}
+      let(:unstarted_job_flow) { Elasticity::JobFlow.new('_', '_') }
       it 'should be an error' do
         expect {
           unstarted_job_flow.shutdown
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: elasticity
 version: !ruby/object:Gem::Version
-  version: '2.
+  version: '2.1'
 prerelease:
 platform: ruby
 authors:
@@ -9,7 +9,7 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2012-07-
+date: 2012-07-07 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rest-client
@@ -98,6 +98,7 @@ files:
 - lib/elasticity/emr.rb
 - lib/elasticity/hadoop_bootstrap_action.rb
 - lib/elasticity/hive_step.rb
+- lib/elasticity/instance_group.rb
 - lib/elasticity/job_flow.rb
 - lib/elasticity/job_flow_status.rb
 - lib/elasticity/job_flow_status_step.rb
@@ -110,6 +111,7 @@ files:
 - spec/lib/elasticity/emr_spec.rb
 - spec/lib/elasticity/hadoop_bootstrap_action_spec.rb
 - spec/lib/elasticity/hive_step_spec.rb
+- spec/lib/elasticity/instance_group_spec.rb
 - spec/lib/elasticity/job_flow_integration_spec.rb
 - spec/lib/elasticity/job_flow_spec.rb
 - spec/lib/elasticity/job_flow_status_spec.rb
@@ -133,7 +135,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: -3013863256678255822
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
@@ -142,7 +144,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
       segments:
       - 0
-      hash:
+      hash: -3013863256678255822
 requirements: []
 rubyforge_project:
 rubygems_version: 1.8.24
@@ -155,6 +157,7 @@ test_files:
 - spec/lib/elasticity/emr_spec.rb
 - spec/lib/elasticity/hadoop_bootstrap_action_spec.rb
 - spec/lib/elasticity/hive_step_spec.rb
+- spec/lib/elasticity/instance_group_spec.rb
 - spec/lib/elasticity/job_flow_integration_spec.rb
 - spec/lib/elasticity/job_flow_spec.rb
 - spec/lib/elasticity/job_flow_status_spec.rb