large-hadron-migrator 0.1.2 → 0.1.3
- data/CHANGES.md +9 -0
- data/{README.markdown → README.md} +10 -10
- data/VERSION +1 -1
- data/lib/large_hadron_migration.rb +23 -17
- data/spec/large_hadron_migration_spec.rb +7 -1
- data/spec/spec_helper.rb +15 -0
- metadata +6 -32
- data/CHANGES.markdown +0 -0
data/CHANGES.md
ADDED
@@ -0,0 +1,9 @@
+# 0.1.3
+* code cleanup
+* Merged [Pullrequest #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
+* Merged [Pullrequest #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
+* Merged [Pullrequest #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
+* Merged [Pullrequest #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
+
+# 0.1.2
+* Initial Release
data/{README.markdown → README.md}
RENAMED
@@ -5,7 +5,7 @@ an agile manner. Most Rails projects start like this, and at first, making
 changes is fast and easy.

 That is until your tables grow to millions of records. At this point, the
-locking nature of `ALTER TABLE` may take your site down for an hour
+locking nature of `ALTER TABLE` may take your site down for an hour or more
 while critical tables are migrated. In order to avoid this, developers begin
 to design around the problem by introducing join tables or moving the data
 into another layer. Development gets less and less agile as tables grow and
@@ -89,7 +89,7 @@ there can only ever be one version of the record in the journal table.

 If the journalling trigger hits an already persisted record, it will be
 replaced with the latest data and action. `ON DUPLICATE KEY` comes in handy
-here. This
+here. This ensures that all journal records will be consistent with the
 original table.

 ### Perform alter statement on new table
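For readers skimming the diff, the journalling idea the README describes can be pictured as one trigger per action plus `ON DUPLICATE KEY UPDATE`. The sketch below covers only the update case and is purely hypothetical: the table, column, and trigger names are invented, it is not the DDL the gem generates, and it assumes it runs inside an ActiveRecord migration where `execute` is available.

```ruby
# Hypothetical sketch of the journalling trigger described above: every update
# on the original table writes (or overwrites) one row per id in the journal,
# so the journal always holds the latest version of each touched record.
execute %Q{
  create trigger users_journal_update
    after update on users for each row
    insert into users_journal (id, email, hadron_action)
    values (NEW.id, NEW.email, 'update')
    on duplicate key update
      email         = NEW.email,
      hadron_action = 'update';
}
```

Because `id` is the journal table's primary key, re-journalling an already journalled record replaces its data and action instead of inserting a duplicate, which is exactly the consistency property described above.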
@@ -100,20 +100,20 @@ indexes at the end of the copying process.

 ### Copy in chunks up to max primary key value to new table

-Currently InnoDB
+Currently InnoDB acquires a read lock on the source rows in `INSERT INTO...
 SELECT`. LHM reads 35K ranges and pauses for a specified number of milliseconds
 so that contention can be minimized.

 ### Switch new and original table names and remove triggers

-The
+The original and copy table are now atomically switched with `RENAME TABLE
 original TO archive_original, copy_table TO original`. The triggers are removed
 so that journalling stops and all mutations and reads now go against the
 original table.

 ### Replay journal: insert, update, deletes

-Because the chunked copy stops at the
+Because the chunked copy stops at the initial maximum id, we can simply replay
 all inserts in the journal table without worrying about collisions.

 Updates and deletes are then replayed.
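The chunked copy and the atomic switch can be pictured with a short, hedged sketch. Chunk size, pause, column list, and table names below are illustrative assumptions rather than the gem's internals, `max_id` stands for the maximum primary key captured before the copy started, and `execute`/`sleep` are assumed to run inside an ActiveRecord migration.

```ruby
# Illustrative only: copy id ranges up to the maximum id captured before the
# copy started, pausing between chunks so the read lock taken by
# INSERT INTO ... SELECT is only held briefly at a time.
chunk_size = 35_000
(1..max_id).step(chunk_size) do |lower|
  upper = [lower + chunk_size - 1, max_id].min
  execute %Q{
    insert into lhmn_users (id, email, created_at)
    select id, email, created_at from users
    where id between #{lower} and #{upper}
  }
  sleep 0.2
end

# One atomic statement swaps the tables; reads and writes now hit the new table.
execute "rename table users to lhmo_users, lhmn_users to users"
```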
@@ -131,8 +131,8 @@ pass, so this will be quite short compared to the copy phase. The
 inconsistency during replay is similar in effect to a slave which is slightly
 behind master.

-There is also caveat with the current journalling scheme; stale journal
-'update' entries are still replayed. Imagine an update to
+There is also a caveat with the current journalling scheme; stale journal
+'update' entries are still replayed. Imagine an update to a record in the
 migrated table while the journal is replaying. The journal may already contain
 an update for this record, which becomes stale now. When it is replayed, the
 second change will be lost. So if a record is updated twice, once before and
@@ -162,9 +162,9 @@ Several hours into the migration, a critical fix had to be deployed to the
 site. We rolled out the fix and restarted the app servers in mid migration.
 This was not a good idea.

-TL;DR: Never restart during migrations when removing columns with
-
-
+TL;DR: Never restart during migrations when removing columns with LHM.
+You can restart while adding migrations as long as active record reads column
+definitions from the slave.

 The information below is only relevant if you want to restart your app servers
 while migrating in a master slave setup.
data/VERSION
CHANGED
@@ -1 +1 @@
-0.1.2
+0.1.3
data/lib/large_hadron_migration.rb
CHANGED
@@ -100,16 +100,17 @@ class LargeHadronMigration < ActiveRecord::Migration

     raise "chunk_size must be >= 1" unless chunk_size >= 1

-
-
-
+    started = Time.now.strftime("%Y_%m_%d_%H_%M_%S_%3N")
+    new_table = "lhmn_%s" % curr_table
+    old_table = "lhmo_%s_%s" % [started, curr_table]
+    journal_table = "lhmj_%s_%s" % [started, curr_table]

     last_insert_id = last_insert_id(curr_table)
     say "last inserted id in #{curr_table}: #{last_insert_id}"

     begin
       # clean tables. old tables are never deleted to guard against rollbacks.
-      execute %Q
+      execute %Q{drop table if exists %s} % new_table

       clone_table(curr_table, new_table, id_window)
       clone_table_for_changes(curr_table, journal_table)
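For illustration, the new naming scheme above produces names like the following when migrating a hypothetical `users` table; the timestamp shown is just an example value of `Time.now.strftime("%Y_%m_%d_%H_%M_%S_%3N")`.

```ruby
started       = "2011_09_28_12_00_00_123"          # example timestamp
new_table     = "lhmn_%s" % "users"                # => "lhmn_users"
old_table     = "lhmo_%s_%s" % [started, "users"]  # => "lhmo_2011_09_28_12_00_00_123_users"
journal_table = "lhmj_%s_%s" % [started, "users"]  # => "lhmj_2011_09_28_12_00_00_123_users"
```

Timestamping the archive and journal names keeps repeated runs from colliding, which fits the in-code comment that old tables are never deleted so rollbacks stay possible.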
@@ -253,8 +254,8 @@ class LargeHadronMigration < ActiveRecord::Migration
     end
   end

-  def self.clone_table(source, dest, window = 0)
-    execute schema_sql(source, dest, window)
+  def self.clone_table(source, dest, window = 0, add_action_column = false)
+    execute schema_sql(source, dest, window, add_action_column)
   end

   def self.common_columns(t1, t2)
@@ -262,11 +263,7 @@ class LargeHadronMigration < ActiveRecord::Migration
   end

   def self.clone_table_for_changes(table, journal_table)
-    clone_table(table, journal_table)
-    execute %Q{
-      alter table %s
-        add column hadron_action varchar(15);
-    } % journal_table
+    clone_table(table, journal_table, 0, true)
   end

   def self.rename_tables(tables = {})
@@ -333,11 +330,15 @@ class LargeHadronMigration < ActiveRecord::Migration
   end

   def self.replay_delete_changes(table, journal_table)
-
-
-
-
-
+    with_master do
+      if connection.select_values("select id from #{journal_table} where hadron_action = 'delete' LIMIT 1").any?
+        execute %Q{
+          delete from #{table} where id in (
+            select id from #{journal_table} where hadron_action = 'delete'
+          )
+        }
+      end
+    end
   end

   def self.replay_update_changes(table, journal_table, chunk_size = 10000, wait = 0.2)
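A note on the rewritten `replay_delete_changes`: the `select ... LIMIT 1` plus `.any?` is a cheap existence check against the (now indexed) `hadron_action` column, so the bulk `DELETE ... IN (subquery)` is only issued when the journal actually recorded deletes, and `with_master` keeps both the check and the delete on the master rather than a possibly lagging slave. The same pattern, pulled out of the gem into a standalone hedged sketch with placeholder names and an assumed ActiveRecord connection:

```ruby
# Placeholder names; assumes an established ActiveRecord connection.
journal_table = "lhmj_2011_09_28_12_00_00_123_users"
table         = "users"
conn          = ActiveRecord::Base.connection

# Existence check first: one indexed lookup instead of an unconditional DELETE.
has_deletes = conn.select_values(
  "select id from #{journal_table} where hadron_action = 'delete' LIMIT 1"
).any?

if has_deletes
  conn.execute(
    "delete from #{table} where id in " \
    "(select id from #{journal_table} where hadron_action = 'delete')"
  )
end
```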
@@ -359,12 +360,17 @@ class LargeHadronMigration < ActiveRecord::Migration
   # behavior with the latter where the auto_increment of the source table
   # got modified when updating the destination.
   #
-  def self.schema_sql(source, dest, window)
+  def self.schema_sql(source, dest, window, add_action_column = false)
     show_create(source).tap do |schema|
       schema.gsub!(/auto_increment=(\d+)/i) do
         "auto_increment=#{ $1.to_i + window }"
       end

+      if add_action_column
+        schema.sub!(/\) ENGINE=/,
+          ", hadron_action ENUM('update', 'insert', 'delete'), INDEX hadron_action (hadron_action) USING BTREE) ENGINE=")
+      end
+
       schema.gsub!('CREATE TABLE `%s`' % source, 'CREATE TABLE `%s`' % dest)
     end
   end
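To make the string surgery in `schema_sql` concrete, here is a self-contained sketch run against an invented `SHOW CREATE TABLE` result. Only the three rewrites mirror the method above; the `users` schema, the window value, and the destination name `lhmj_users` are made up for illustration.

```ruby
# Invented schema; only the string rewrites below mirror schema_sql.
schema = <<-SQL
CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `email` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1000 DEFAULT CHARSET=utf8
SQL

window = 5000

# Offset the table-level auto_increment by the id window.
schema.gsub!(/auto_increment=(\d+)/i) { "auto_increment=#{ $1.to_i + window }" }

# With add_action_column, splice the hadron_action column and its BTREE index
# in just before the table options.
schema.sub!(/\) ENGINE=/,
  ", hadron_action ENUM('update', 'insert', 'delete'), INDEX hadron_action (hadron_action) USING BTREE) ENGINE=")

# Point the CREATE TABLE at the destination (here: a journal table name).
schema.gsub!('CREATE TABLE `users`', 'CREATE TABLE `lhmj_users`')

puts schema
# => CREATE TABLE `lhmj_users` ( ...,
#    hadron_action ENUM('update', 'insert', 'delete'),
#    INDEX hadron_action (hadron_action) USING BTREE) ENGINE=InnoDB auto_increment=6000 ...
```

This is why `clone_table_for_changes` no longer needs the follow-up `ALTER TABLE`: the journal column and index are baked into the `CREATE TABLE` statement itself.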
data/spec/large_hadron_migration_spec.rb
CHANGED
@@ -99,7 +99,8 @@ describe "LargeHadronMigration", "triggers" do
   end

   it "should create a table for triggered changes" do
-    truthiness_column "triggerme_changes", "hadron_action", "varchar"
+    truthiness_column "triggerme_changes", "hadron_action", "enum"
+    truthiness_index "triggerme_changes", "hadron_action", [ "hadron_action" ], false
   end

   it "should trigger on insert" do
@@ -341,6 +342,11 @@ describe "LargeHadronMigration", "replaying changes" do
     end
   end

+  it "doesn't replay delete if there are any" do
+    LargeHadronMigration.should_receive(:execute).never
+    LargeHadronMigration.replay_delete_changes("source", "source_changes")
+  end
+
 end

 describe "LargeHadronMigration", "units" do
data/spec/spec_helper.rb
CHANGED
@@ -90,6 +90,21 @@ module SpecHelper

   end

+  def truthiness_index(table, expected_index_name, indexed_columns, unique)
+    index = sql("SHOW INDEXES FROM #{table}").all_hashes.inject({}) do |a, part|
+      index_name = part['Key_name']
+      a[index_name] ||= { 'unique' => '0' == part['Non_unique'], 'columns' => [] }
+      column_index = part['Seq_in_index'].to_i - 1
+      a[index_name]['columns'][column_index] = part['Column_name']
+      a
+    end[expected_index_name]
+
+    flunk("no index named #{expected_index_name} found on #{table}") unless index
+
+    index['columns'].should == indexed_columns
+    index['unique'].should == unique
+  end
+
 end

 # Mock Rails Environment
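To see what the inject in `truthiness_index` builds, here is a hedged sketch with fabricated `SHOW INDEXES` rows for an invented composite index; the real helper gets its rows from `sql(...).all_hashes` against a live connection.

```ruby
# Fabricated rows: MySQL returns one row per column of each index.
rows = [
  { 'Key_name' => 'idx_account_created', 'Non_unique' => '1',
    'Seq_in_index' => '1', 'Column_name' => 'account_id' },
  { 'Key_name' => 'idx_account_created', 'Non_unique' => '1',
    'Seq_in_index' => '2', 'Column_name' => 'created_at' }
]

indexes = rows.inject({}) do |a, part|
  name = part['Key_name']
  a[name] ||= { 'unique' => '0' == part['Non_unique'], 'columns' => [] }
  a[name]['columns'][part['Seq_in_index'].to_i - 1] = part['Column_name']
  a
end

indexes['idx_account_created']
# => { 'unique' => false, 'columns' => ['account_id', 'created_at'] }
```

`Seq_in_index` is 1-based, hence the `- 1` when filling the columns array, and a `Non_unique` value of '0' marks a unique index.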
metadata
CHANGED
@@ -1,12 +1,8 @@
 --- !ruby/object:Gem::Specification
 name: large-hadron-migrator
 version: !ruby/object:Gem::Version
-  prerelease:
-  segments:
-  - 0
-  - 1
-  - 2
-  version: 0.1.2
+  prerelease:
+  version: 0.1.3
 platform: ruby
 authors:
 - SoundCloud
@@ -16,8 +12,7 @@ autorequire:
 bindir: bin
 cert_chain: []

-date: 2011-
-default_executable:
+date: 2011-09-28 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activerecord
@@ -27,10 +22,6 @@ dependencies:
     requirements:
     - - ~>
       - !ruby/object:Gem::Version
-        segments:
-        - 2
-        - 3
-        - 8
         version: 2.3.8
   type: :runtime
   version_requirements: *id001
@@ -42,10 +33,6 @@ dependencies:
     requirements:
     - - ~>
       - !ruby/object:Gem::Version
-        segments:
-        - 2
-        - 3
-        - 8
         version: 2.3.8
   type: :runtime
   version_requirements: *id002
@@ -57,10 +44,6 @@ dependencies:
     requirements:
     - - "="
       - !ruby/object:Gem::Version
-        segments:
-        - 2
-        - 8
-        - 1
         version: 2.8.1
   type: :runtime
   version_requirements: *id003
@@ -72,10 +55,6 @@ dependencies:
     requirements:
     - - "="
       - !ruby/object:Gem::Version
-        segments:
-        - 1
-        - 3
-        - 1
         version: 1.3.1
   type: :development
   version_requirements: *id004
@@ -89,11 +68,11 @@ extra_rdoc_files: []

 files:
 - .gitignore
-- CHANGES.markdown
+- CHANGES.md
 - Gemfile
 - Gemfile.lock
 - LICENSE
-- README.markdown
+- README.md
 - Rakefile
 - VERSION
 - large-hadron-migrator.gemspec
@@ -101,7 +80,6 @@ files:
 - spec/large_hadron_migration_spec.rb
 - spec/migrate/add_new_column.rb
 - spec/spec_helper.rb
-has_rdoc: true
 homepage: http://github.com/soundcloud/large-hadron-migrator
 licenses: []

@@ -115,21 +93,17 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      segments:
-      - 0
       version: "0"
 required_rubygems_version: !ruby/object:Gem::Requirement
   none: false
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      segments:
-      - 0
       version: "0"
 requirements: []

 rubyforge_project:
-rubygems_version: 1.
+rubygems_version: 1.8.10
 signing_key:
 specification_version: 3
 summary: online schema changer for mysql
data/CHANGES.markdown
DELETED
File without changes