lhm 1.0.0.rc.1
- data/.gitignore +5 -0
- data/.travis.yml +10 -0
- data/CHANGELOG.md +32 -0
- data/Gemfile +3 -0
- data/LICENSE +27 -0
- data/README.md +98 -0
- data/Rakefile +16 -0
- data/TODO +11 -0
- data/lhm.gemspec +29 -0
- data/lib/lhm.rb +20 -0
- data/lib/lhm/chunker.rb +73 -0
- data/lib/lhm/command.rb +70 -0
- data/lib/lhm/entangler.rb +105 -0
- data/lib/lhm/intersection.rb +42 -0
- data/lib/lhm/invoker.rb +37 -0
- data/lib/lhm/locked_switcher.rb +78 -0
- data/lib/lhm/migration.rb +34 -0
- data/lib/lhm/migrator.rb +125 -0
- data/lib/lhm/table.rb +87 -0
- data/spec/bootstrap.rb +16 -0
- data/spec/fixtures/destination.ddl +7 -0
- data/spec/fixtures/origin.ddl +7 -0
- data/spec/fixtures/users.ddl +11 -0
- data/spec/integration/chunker_spec.rb +31 -0
- data/spec/integration/entangler_spec.rb +60 -0
- data/spec/integration/integration_helper.rb +74 -0
- data/spec/integration/lhm_spec.rb +118 -0
- data/spec/integration/locked_switcher_spec.rb +41 -0
- data/spec/unit/chunker_spec.rb +79 -0
- data/spec/unit/entangler_spec.rb +79 -0
- data/spec/unit/intersection_spec.rb +42 -0
- data/spec/unit/locked_switcher_spec.rb +54 -0
- data/spec/unit/migration_spec.rb +26 -0
- data/spec/unit/migrator_spec.rb +81 -0
- data/spec/unit/table_spec.rb +88 -0
- data/spec/unit/unit_helper.rb +17 -0
- metadata +165 -0
data/.travis.yml
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,32 @@
+# 1.0.0-RC1
+
+* rewrite.
+
+# 0.9.1
+
+# 0.2.1 (November 26, 2011)
+
+* Include changelog in gem
+
+# 0.2.0 (November 26, 2011)
+
+* Add Ruby 1.8 compatibility
+* Setup travis continuous integration
+* Fix record loss issue
+* Fix and speed up specs
+
+# 0.1.4
+
+* Merged [Pullrequest #9](https://github.com/soundcloud/large-hadron-migrator/pull/9)
+
+# 0.1.3
+
+* code cleanup
+* Merged [Pullrequest #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
+* Merged [Pullrequest #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
+* Merged [Pullrequest #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
+* Merged [Pullrequest #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
+
+# 0.1.2
+
+* Initial Release
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,27 @@
+Copyright (c) 2011, SoundCloud, Rany Keddo, Tobias Bielohlawek, Tobias Schmidt
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+- Redistributions of source code must retain the above copyright notice, this
+  list of conditions and the following disclaimer.
+- Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+- Neither the name of the SoundCloud nor the names of its contributors may be
+  used to endorse or promote products derived from this software without
+  specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
data/README.md
ADDED
@@ -0,0 +1,98 @@
+# Large Hadron Migrator [![Build Status](https://secure.travis-ci.org/soundcloud/large-hadron-migrator.png)](http://travis-ci.org/soundcloud/large-hadron-migrator)
+
+Rails style database migrations are a useful way to evolve your data schema in
+an agile manner. Most Rails projects start like this, and at first, making
+changes is fast and easy.
+
+That is until your tables grow to millions of records. At this point, the
+locking nature of `ALTER TABLE` may take your site down for an hour or more
+while critical tables are migrated. In order to avoid this, developers begin
+to design around the problem by introducing join tables or moving the data
+into another layer. Development gets less and less agile as tables grow and
+grow. To make the problem worse, adding or changing indices to optimize data
+access becomes just as difficult.
+
+> Side effects may include black holes and universe implosion.
+
+There are few things that can be done at the server or engine level. It is
+possible to change default values in an `ALTER TABLE` without locking the
+table. The InnoDB Plugin provides facilities for online index creation, which
+is great if you are using this engine, but only solves half the problem.
+
+At SoundCloud we started having migration pains quite a while ago, and after
+looking around for third party solutions [0] [1] [2], we decided to create our
+own. We called it Large Hadron Migrator, and it is a gem for online
+ActiveRecord migrations.
+
+![LHC](http://farm4.static.flickr.com/3093/2844971993_17f2ddf2a8_z.jpg)
+
+[The Large Hadron collider at CERN](http://en.wikipedia.org/wiki/Large_Hadron_Collider)
+
+## The idea
+
+The basic idea is to perform the migration online while the system is live,
+without locking the table. Similar to OAK (online alter table) [2] and the
+facebook tool [0], we use a copy table, triggers and a journal table.
+
+The Large Hadron is a test driven Ruby solution which can easily be dropped
+into an ActiveRecord migration. It presumes a single auto incremented
+numerical primary key called id as per the Rails convention. Unlike the
+twitter solution [1], it does not require the presence of an indexed
+`updated_at` column.
+
+## Usage
+
+After including Lhm, `hadron_change_table` becomes available
+with the following methods:
+
+    class MigrateArbitrary < ActiveRecord::Migration
+      include Lhm
+
+      def self.up
+        hadron_change_table(:users) do |t|
+          t.add_column(:arbitrary, "INT(12)")
+          t.add_index([:arbitrary, :created_at])
+          t.ddl("alter table %s add column flag tinyint(1)" % t.name)
+        end
+      end
+
+      def self.down
+        hadron_change_table(:users) do |t|
+          t.remove_index([:arbitrary, :created_at])
+          t.remove_column(:arbitrary)
+        end
+      end
+    end
+
+## Migration phases
+
+_TODO_
+
+### When adding a column
+
+_TODO_
+
+### When removing a column
+
+_TODO_
+
+## Contributing
+
+We'll check out your contribution if you:
+
+- Provide a comprehensive suite of tests for your fork.
+- Have a clear and documented rationale for your changes.
+- Package these up in a pull request.
+
+We'll do our best to help you out with any contribution issues you may have.
+
+## License
+
+The license is included as LICENSE in this directory.
+
+## Footnotes
+
+[0]: http://www.facebook.com/note.php?note\_id=430801045932 "Facebook"
+[1]: https://github.com/freels/table\_migrator "Twitter"
+[2]: http://openarkkit.googlecode.com "OAK online alter table"
+
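The README's "The idea" section describes copying rows into a second table while triggers keep that table in step with live writes. A toy in-memory model may help make the interplay concrete. Everything in it is an assumption for illustration: tables are plain Hashes keyed by id, the "triggers" are lambdas, and `simulate_online_copy` is a hypothetical name that does not exist in Lhm.

```ruby
# Toy model only: tables are Hashes keyed by id, and the "triggers" are plain
# lambdas. Nothing here talks to MySQL or uses the Lhm API.
def simulate_online_copy
  origin      = { 1 => "a", 2 => "b", 3 => "c", 4 => "d" }
  destination = {}

  # Trigger stand-ins: every write to origin is repeated on destination.
  write  = lambda { |id, value| origin[id] = value; destination[id] = value }
  delete = lambda { |id| origin.delete(id); destination.delete(id) }

  # Chunked copy that never overwrites a row a trigger has already written.
  copy_chunk = lambda do |lowest, highest|
    (lowest..highest).each do |id|
      next unless origin.key?(id)
      destination[id] = origin[id] unless destination.key?(id)
    end
  end

  copy_chunk.call(1, 2)  # first chunk
  write.call(3, "c2")    # update arrives before its chunk is copied
  write.call(5, "e")     # insert arrives while the copy is running
  delete.call(1)         # delete of a row that was already copied
  copy_chunk.call(3, 4)  # second chunk

  [origin, destination]
end

origin, destination = simulate_online_copy
puts origin == destination # prints true
```

Both tables end up with the same rows even though writes were interleaved with the copy, which is the property the approach relies on before the tables are switched.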
data/Rakefile
ADDED
@@ -0,0 +1,16 @@
+require 'rake/testtask'
+
+Rake::TestTask.new("unit") do |t|
+  t.libs.push "lib"
+  t.test_files = FileList['spec/unit/*_spec.rb']
+  t.verbose = true
+end
+
+Rake::TestTask.new("integration") do |t|
+  t.libs.push "lib"
+  t.test_files = FileList['spec/integration/*_spec.rb']
+  t.verbose = true
+end
+
+task :default => [:unit, :integration]
+
data/TODO
ADDED
data/lhm.gemspec
ADDED
@@ -0,0 +1,29 @@
+# -*- encoding: utf-8 -*-
+
+lib = File.expand_path('../lib', __FILE__)
+$:.unshift(lib) unless $:.include?(lib)
+
+puts $:.inspect
+
+require 'lhm'
+
+Gem::Specification.new do |s|
+  s.name = "lhm"
+  s.version = Lhm::VERSION
+  s.platform = Gem::Platform::RUBY
+  s.authors = ["SoundCloud", "Rany Keddo", "Tobias Bielohlawek", "Tobias Schmidt"]
+  s.email = %q{rany@soundcloud.com, tobi@soundcloud.com, ts@soundcloud.com}
+  s.summary = %q{online schema changer for mysql}
+  s.description = %q{Migrate large tables without downtime by copying to a temporary table in chunks. The old table is not dropped. Instead, it is moved to timestamp_table_name for verification.}
+  s.homepage = %q{http://github.com/soundcloud/large-hadron-migrator}
+  s.files = `git ls-files`.split("\n")
+  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
+  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
+  s.require_paths = ["lib"]
+
+  # this should be a real dependency, but we're using a different gem in our code
+  s.add_development_dependency "mysql", "~> 2.8.1"
+  s.add_development_dependency "rspec", "=1.3.1"
+  s.add_development_dependency "rake"
+end
+
data/lib/lhm.rb
ADDED
@@ -0,0 +1,20 @@
+#
+#  Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+#  Schmidt
+#
+
+require 'lhm/table'
+require 'lhm/invoker'
+require 'lhm/migration'
+
+module Lhm
+  VERSION = "1.0.0.rc.1"
+
+  def hadron_change_table(table_name, chunk_options = {}, &block)
+    origin = Table.parse(table_name, connection)
+    invoker = Invoker.new(origin, connection)
+    block.call(invoker.migrator)
+    invoker.run(chunk_options)
+  end
+end
+
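`hadron_change_table` calls the block with a migrator first and only then runs the invoker with the chunk options. The sketch below imitates that call order with stand-ins so it can run without a database. `StatementCollector` and `change_table_sketch` are hypothetical names, and the assumption that the yielded object merely records statements is ours; the real migrator is not shown in this part of the diff and may behave differently.

```ruby
# Hypothetical stand-in for the object yielded to the block; the real
# migrator class is not shown here and may behave differently.
class StatementCollector
  attr_reader :name, :statements

  def initialize(name)
    @name = name
    @statements = []
  end

  # Record a raw DDL statement instead of executing it.
  def ddl(statement)
    @statements << statement
  end
end

# Mirrors the call order in hadron_change_table: build, yield, then "run".
def change_table_sketch(table_name, chunk_options = {})
  collector = StatementCollector.new(table_name)
  yield collector
  { :statements => collector.statements, :chunk_options => chunk_options }
end

plan = change_table_sketch(:users, :stride => 10_000) do |t|
  t.ddl("alter table %s add column flag tinyint(1)" % t.name)
end
p plan[:statements]
```

The point of the ordering is that the block only describes the change; nothing about chunking is decided until the options are handed over afterwards.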
data/lib/lhm/chunker.rb
ADDED
@@ -0,0 +1,73 @@
+#
+#  Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+#  Schmidt
+#
+
+require 'lhm/migration'
+require 'lhm/command'
+
+module Lhm
+  class Chunker
+    include Command
+
+    #
+    # Copy from origin to destination in chunks of size `stride`. Sleeps for
+    # `throttle` milliseconds between each stride.
+    #
+
+    def initialize(migration, limit = 1, connection = nil, options = {})
+      @stride = options[:stride] || 40_000
+      @throttle = options[:throttle] || 100
+      @limit = limit
+      @connection = connection
+      @migration = migration
+    end
+
+    #
+    # Copies chunks of size `stride`, starting from id 1 up to id `limit`.
+    #
+
+    def up_to(limit)
+      traversable_chunks_up_to(limit).times do |n|
+        yield(bottom(n + 1), top(n + 1, limit)) && sleep(throttle_seconds)
+      end
+    end
+
+    def traversable_chunks_up_to(limit)
+      (limit / @stride.to_f).ceil
+    end
+
+    def bottom(chunk)
+      (chunk - 1) * @stride + 1
+    end
+
+    def top(chunk, limit)
+      [chunk * @stride, limit].min
+    end
+
+    def copy(lowest, highest)
+      "insert ignore into `#{ @migration.destination.name }` (#{ cols.joined }) " +
+      "select #{ cols.joined } from `#{ @migration.origin.name }` " +
+      "where `id` between #{ lowest } and #{ highest }"
+    end
+
+  private
+
+    def cols
+      @cols ||= @migration.intersection
+    end
+
+    def execute
+      up_to(@limit) do |lowest, highest|
+        print "."
+
+        sql copy(lowest, highest)
+      end
+    end
+
+    def throttle_seconds
+      @throttle / 1000.0 # `throttle` is in milliseconds; Kernel#sleep takes seconds
+    end
+  end
+end
+
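The boundary arithmetic in `bottom`, `top` and `traversable_chunks_up_to` is easier to check with concrete numbers. The helper below is a hypothetical stand-alone restatement of that arithmetic (`chunk_ranges` is not part of Lhm); it lists the inclusive id ranges a chunked copy would visit for a given limit and stride.

```ruby
# Hypothetical helper, not part of Lhm: lists the inclusive id ranges that a
# chunked copy would visit for a given upper limit and stride.
def chunk_ranges(limit, stride)
  chunks = (limit / stride.to_f).ceil          # number of chunks, rounded up
  (1..chunks).map do |chunk|
    lowest  = (chunk - 1) * stride + 1         # first id of this chunk
    highest = [chunk * stride, limit].min      # last id, capped at the limit
    [lowest, highest]
  end
end

chunk_ranges(100_000, 40_000).each do |lowest, highest|
  puts "where `id` between #{lowest} and #{highest}"
end
```

With a limit of 100,000 and the default stride of 40,000 this yields three ranges: 1 to 40,000, 40,001 to 80,000 and 80,001 to 100,000. The last chunk is shorter because `top` caps it at the limit.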
data/lib/lhm/command.rb
ADDED
@@ -0,0 +1,70 @@
+#
+#  Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+#  Schmidt
+#
+#  Apply a change to the database.
+#
+
+module Lhm
+  module Command
+    def self.included(base)
+      base.send :attr_reader, :connection
+    end
+
+    #
+    # Command Interface
+    #
+
+    def validate; end
+
+    def revert
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def run(&block)
+      validate
+
+      if(block_given?)
+        before
+        block.call(self)
+        after
+      else
+        execute
+      end
+    end
+
+  private
+
+    def execute
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def before
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def after
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def table?(table_name)
+      @connection.table_exists?(table_name)
+    end
+
+    def error(msg)
+      raise Exception.new("#{ self.class }: #{ msg }")
+    end
+
+    def sql(statements)
+      [*statements].each do |statement|
+        begin
+          @connection.execute(statement)
+        rescue Mysql::Error => e
+          revert
+          error "#{ statement } failed: #{ e.inspect }"
+        end
+      end
+    end
+  end
+end
+
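`Command#run` has two modes: with a block it wraps the block in `before` and `after`, and without one it calls `execute`. The sketch below reproduces only that dispatch so the two paths can be observed without a database connection. `CommandSketch` and `RecordingCommand` are hypothetical names introduced for illustration, and the validation, error handling and SQL helpers of the real module are left out.

```ruby
# Stand-in for the Command interface, reduced to the dispatch in #run.
module CommandSketch
  def run(&block)
    validate
    if block_given?
      before
      block.call(self)
      after
    else
      execute
    end
  end

  def validate; end
end

# Records which hooks were invoked, in order.
class RecordingCommand
  include CommandSketch
  attr_reader :calls

  def initialize
    @calls = []
  end

  private

  def before;  @calls << :before;  end
  def after;   @calls << :after;   end
  def execute; @calls << :execute; end
end

wrapping = RecordingCommand.new
wrapping.run { |command| command.calls << :block }
p wrapping.calls   # [:before, :block, :after]

standalone = RecordingCommand.new
standalone.run
p standalone.calls # [:execute]
```

A command that must hold some state open around other work would implement `before` and `after`, while a command that does a single self-contained job would implement `execute`.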
data/lib/lhm/entangler.rb
ADDED
@@ -0,0 +1,105 @@
+#
+#  Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+#  Schmidt
+#
+#  Creates entanglement between two tables. All creates, updates and deletes
+#  to origin will be repeated on the destination table.
+#
+
+require 'lhm/command'
+
+module Lhm
+  class Entangler
+    include Command
+
+    attr_reader :epoch
+
+    def initialize(migration, connection = nil)
+      @common = migration.intersection
+      @origin = migration.origin
+      @destination = migration.destination
+      @connection = connection
+    end
+
+    def entangle
+      [
+        create_trigger_del,
+        create_trigger_ins,
+        create_trigger_upd
+      ]
+    end
+
+    def untangle
+      [
+        "drop trigger if exists `#{ trigger(:del) }`",
+        "drop trigger if exists `#{ trigger(:ins) }`",
+        "drop trigger if exists `#{ trigger(:upd) }`"
+      ]
+    end
+
+    def create_trigger_ins
+      strip %Q{
+        create trigger `#{ trigger(:ins) }`
+        after insert on `#{ @origin.name }` for each row
+        replace into `#{ @destination.name }` (#{ @common.joined })
+        values (#{ @common.typed("NEW") })
+      }
+    end
+
+    def create_trigger_upd
+      strip %Q{
+        create trigger `#{ trigger(:upd) }`
+        after update on `#{ @origin.name }` for each row
+        replace into `#{ @destination.name }` (#{ @common.joined })
+        values (#{ @common.typed("NEW") })
+      }
+    end
+
+    def create_trigger_del
+      strip %Q{
+        create trigger `#{ trigger(:del) }`
+        after delete on `#{ @origin.name }` for each row
+        delete ignore from `#{ @destination.name }`
+        where `#{ @destination.name }`.`id` = OLD.`id`
+      }
+    end
+
+    def trigger(type)
+      "lhmt_#{ type }_#{ @origin.name }"
+    end
+
+    #
+    # Command implementation
+    #
+
+    def validate
+      unless table?(@origin.name)
+        error("#{ @origin.name } does not exist")
+      end
+
+      unless table?(@destination.name)
+        error("#{ @destination.name } does not exist")
+      end
+    end
+
+    def before
+      sql(entangle)
+      @epoch = connection.select_value("select max(id) from #{ @origin.name }").to_i
+    end
+
+    def after
+      sql(untangle)
+    end
+
+    def revert
+      after
+    end
+
+  private
+
+    def strip(sql)
+      sql.strip.gsub(/\n */, "\n")
+    end
+  end
+end
+
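To see what one of the generated trigger statements looks like, the sketch below rebuilds the insert trigger outside the class. Several details are assumptions: the backtick quoting of column names and the `NEW.` prefix stand in for `@common.joined` and `@common.typed("NEW")`, whose implementation is not shown here, and the destination name `lhmn_users` is made up for the example.

```ruby
# Hypothetical stand-alone version of the insert trigger SQL. Column quoting
# and the destination table name are assumptions for illustration.
def trigger_name(type, origin)
  "lhmt_#{type}_#{origin}"
end

# Same normalisation as Entangler#strip: trim the ends and drop indentation.
def strip_sql(sql)
  sql.strip.gsub(/\n */, "\n")
end

def insert_trigger(origin, destination, columns)
  joined = columns.map { |column| "`#{column}`" }.join(", ")
  typed  = columns.map { |column| "NEW.`#{column}`" }.join(", ")
  strip_sql(%Q{
    create trigger `#{trigger_name(:ins, origin)}`
    after insert on `#{origin}` for each row
    replace into `#{destination}` (#{joined})
    values (#{typed})
  })
end

puts insert_trigger("users", "lhmn_users", %w[id name])
```

The statement uses `replace into` rather than a plain insert, so a row that already exists in the destination is overwritten with the newest values instead of raising a duplicate key error.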