lhm 1.0.0.rc.1
- data/.gitignore +5 -0
- data/.travis.yml +10 -0
- data/CHANGELOG.md +32 -0
- data/Gemfile +3 -0
- data/LICENSE +27 -0
- data/README.md +98 -0
- data/Rakefile +16 -0
- data/TODO +11 -0
- data/lhm.gemspec +29 -0
- data/lib/lhm.rb +20 -0
- data/lib/lhm/chunker.rb +73 -0
- data/lib/lhm/command.rb +70 -0
- data/lib/lhm/entangler.rb +105 -0
- data/lib/lhm/intersection.rb +42 -0
- data/lib/lhm/invoker.rb +37 -0
- data/lib/lhm/locked_switcher.rb +78 -0
- data/lib/lhm/migration.rb +34 -0
- data/lib/lhm/migrator.rb +125 -0
- data/lib/lhm/table.rb +87 -0
- data/spec/bootstrap.rb +16 -0
- data/spec/fixtures/destination.ddl +7 -0
- data/spec/fixtures/origin.ddl +7 -0
- data/spec/fixtures/users.ddl +11 -0
- data/spec/integration/chunker_spec.rb +31 -0
- data/spec/integration/entangler_spec.rb +60 -0
- data/spec/integration/integration_helper.rb +74 -0
- data/spec/integration/lhm_spec.rb +118 -0
- data/spec/integration/locked_switcher_spec.rb +41 -0
- data/spec/unit/chunker_spec.rb +79 -0
- data/spec/unit/entangler_spec.rb +79 -0
- data/spec/unit/intersection_spec.rb +42 -0
- data/spec/unit/locked_switcher_spec.rb +54 -0
- data/spec/unit/migration_spec.rb +26 -0
- data/spec/unit/migrator_spec.rb +81 -0
- data/spec/unit/table_spec.rb +88 -0
- data/spec/unit/unit_helper.rb +17 -0
- metadata +165 -0
data/.travis.yml
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,32 @@
+# 1.0.0-RC1
+
+* rewrite.
+
+# 0.9.1
+
+# 0.2.1 (November 26, 2011)
+
+* Include changelog in gem
+
+# 0.2.0 (November 26, 2011)
+
+* Add Ruby 1.8 compatibility
+* Setup travis continuous integration
+* Fix record lose issue
+* Fix and speed up specs
+
+# 0.1.4
+
+* Merged [Pullrequest #9](https://github.com/soundcloud/large-hadron-migrator/pull/9)
+
+# 0.1.3
+
+* code cleanup
+* Merged [Pullrequest #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
+* Merged [Pullrequest #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
+* Merged [Pullrequest #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
+* Merged [Pullrequest #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
+
+# 0.1.2
+
+* Initial Release
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,27 @@
+Copyright (c) 2011, SoundCloud, Rany Keddo, Tobias Bielohlawek, Tobias Schmidt
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+- Redistributions of source code must retain the above copyright notice, this
+  list of conditions and the following disclaimer.
+- Redistributions in binary form must reproduce the above copyright notice,
+  this list of conditions and the following disclaimer in the documentation
+  and/or other materials provided with the distribution.
+- Neither the name of the SoundCloud nor the names of its contributors may be
+  used to endorse or promote products derived from this software without
+  specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
data/README.md
ADDED
@@ -0,0 +1,98 @@
+# Large Hadron Migrator [](http://travis-ci.org/soundcloud/large-hadron-migrator)
+
+Rails style database migrations are a useful way to evolve your data schema in
+an agile manner. Most Rails projects start like this, and at first, making
+changes is fast and easy.
+
+That is until your tables grow to millions of records. At this point, the
+locking nature of `ALTER TABLE` may take your site down for an hour or more
+while critical tables are migrated. In order to avoid this, developers begin
+to design around the problem by introducing join tables or moving the data
+into another layer. Development gets less and less agile as tables grow and
+grow. To make the problem worse, adding or changing indices to optimize data
+access becomes just as difficult.
+
+> Side effects may include black holes and universe implosion.
+
+There are few things that can be done at the server or engine level. It is
+possible to change default values in an `ALTER TABLE` without locking the
+table. The InnoDB Plugin provides facilities for online index creation, which
+is great if you are using this engine, but only solves half the problem.
+
+At SoundCloud we started having migration pains quite a while ago, and after
+looking around for third party solutions [0] [1] [2], we decided to create our
+own. We called it Large Hadron Migrator, and it is a gem for online
+ActiveRecord migrations.
+
+
+
+[The Large Hadron collider at CERN](http://en.wikipedia.org/wiki/Large_Hadron_Collider)
+
+## The idea
+
+The basic idea is to perform the migration online while the system is live,
+without locking the table. Similar to OAK (online alter table) [2] and the
+facebook tool [0], we use a copy table, triggers and a journal table.
+
+The Large Hadron is a test driven Ruby solution which can easily be dropped
+into an ActiveRecord migration. It presumes a single auto incremented
+numerical primary key called id as per the Rails convention. Unlike the
+twitter solution [1], it does not require the presence of an indexed
+`updated_at` column.
+
+## Usage
+
+After including Lhm, `hadron_change_table` becomes available
+with the following methods:
+
+    class MigrateArbitrary < ActiveRecord::Migration
+      include Lhm
+
+      def self.up
+        hadron_change_table(:users) do |t|
+          t.add_column(:arbitrary, "INT(12)")
+          t.add_index([:arbitrary, :created_at])
+          t.ddl("alter table %s add column flag tinyint(1)" % t.name)
+        end
+      end
+
+      def self.down
+        hadron_change_table(:users) do |t|
+          t.remove_index([:arbitrary, :created_at])
+          t.remove_column(:arbitrary)
+        end
+      end
+    end
+
+## Migration phases
+
+_TODO_
+
+### When adding a column
+
+_TODO_
+
+### When removing a column
+
+_TODO_
+
+## Contributing
+
+We'll check out your contribution if you:
+
+- Provide a comprehensive suite of tests for your fork.
+- Have a clear and documented rationale for your changes.
+- Package these up in a pull request.
+
+We'll do our best to help you out with any contribution issues you may have.
+
+## License
+
+The license is included as LICENSE in this directory.
+
+## Footnotes
+
+[0]: http://www.facebook.com/note.php?note\_id=430801045932 "Facebook"
+[1]: https://github.com/freels/table\_migrator "Twitter"
+[2]: http://openarkkit.googlecode.com "OAK online alter table"
+
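Reviewer note on "The idea": the combination of a chunked copy and triggers can be illustrated with a toy simulation. This is only a sketch, assuming plain Ruby Hashes stand in for the origin and destination tables and lambdas stand in for triggers. It mirrors the `insert ignore` copy in `lib/lhm/chunker.rb` and the `replace into` / `delete` triggers in `lib/lhm/entangler.rb` from this diff, but none of the names below exist in the gem.

```ruby
# Toy simulation: Hashes (id => row) stand in for tables. All names here are
# hypothetical illustrations; the gem itself issues SQL against MySQL.
origin      = { 1 => "a", 2 => "b", 3 => "c", 4 => "d" }
destination = {}

# Trigger stand-ins: every write to origin is mirrored on destination.
write  = lambda { |id, row| origin[id] = row; destination[id] = row } # "replace into"
remove = lambda { |id| origin.delete(id); destination.delete(id) }    # "delete"

# Chunked copy with "insert ignore" semantics: never overwrite a row that a
# trigger has already placed in destination.
copy_chunk = lambda do |lowest, highest|
  origin.each do |id, row|
    next unless id.between?(lowest, highest)
    destination[id] = row unless destination.key?(id)
  end
end

copy_chunk.call(1, 2)  # first chunk
write.call(3, "c2")    # update arrives before its chunk is copied
write.call(5, "e")     # insert beyond the range being copied
remove.call(1)         # delete of an already copied row
copy_chunk.call(3, 4)  # second chunk must not clobber the newer row 3

p destination == origin
```

The point of the sketch is the ordering: because the copy ignores rows that already exist and the triggers always replace, a live write that lands between two chunks is not overwritten by stale data.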
data/Rakefile
ADDED
@@ -0,0 +1,16 @@
+require 'rake/testtask'
+
+Rake::TestTask.new("unit") do |t|
+  t.libs.push "lib"
+  t.test_files = FileList['spec/unit/*_spec.rb']
+  t.verbose = true
+end
+
+Rake::TestTask.new("integration") do |t|
+  t.libs.push "lib"
+  t.test_files = FileList['spec/integration/*_spec.rb']
+  t.verbose = true
+end
+
+task :default => [:unit, :integration]
+
data/TODO
ADDED
data/lhm.gemspec
ADDED
@@ -0,0 +1,29 @@
+# -*- encoding: utf-8 -*-
+
+lib = File.expand_path('../lib', __FILE__)
+$:.unshift(lib) unless $:.include?(lib)
+
+puts $:.inspect
+
+require 'lhm'
+
+Gem::Specification.new do |s|
+  s.name = "lhm"
+  s.version = Lhm::VERSION
+  s.platform = Gem::Platform::RUBY
+  s.authors = ["SoundCloud", "Rany Keddo", "Tobias Bielohlawek", "Tobias Schmidt"]
+  s.email = %q{rany@soundcloud.com, tobi@soundcloud.com, ts@soundcloud.com}
+  s.summary = %q{online schema changer for mysql}
+  s.description = %q{Migrate large tables without downtime by copying to a temporary table in chunks. The old table is not dropped. Instead, it is moved to timestamp_table_name for verification.}
+  s.homepage = %q{http://github.com/soundcloud/large-hadron-migrator}
+  s.files = `git ls-files`.split("\n")
+  s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
+  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
+  s.require_paths = ["lib"]
+
+  # this should be a real dependency, but we're using a different gem in our code
+  s.add_development_dependency "mysql", "~> 2.8.1"
+  s.add_development_dependency "rspec", "=1.3.1"
+  s.add_development_dependency "rake"
+end
+
data/lib/lhm.rb
ADDED
@@ -0,0 +1,20 @@
+#
+# Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+# Schmidt
+#
+
+require 'lhm/table'
+require 'lhm/invoker'
+require 'lhm/migration'
+
+module Lhm
+  VERSION = "1.0.0.rc.1"
+
+  def hadron_change_table(table_name, chunk_options = {}, &block)
+    origin = Table.parse(table_name, connection)
+    invoker = Invoker.new(origin, connection)
+    block.call(invoker.migrator)
+    invoker.run(chunk_options)
+  end
+end
+
data/lib/lhm/chunker.rb
ADDED
@@ -0,0 +1,73 @@
+#
+# Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+# Schmidt
+#
+
+require 'lhm/migration'
+require 'lhm/command'
+
+module Lhm
+  class Chunker
+    include Command
+
+    #
+    # Copy from origin to destination in chunks of size `stride`. Sleeps for
+    # `throttle` milliseconds between each stride.
+    #
+
+    def initialize(migration, limit = 1, connection = nil, options = {})
+      @stride = options[:stride] || 40_000
+      @throttle = options[:throttle] || 100
+      @limit = limit
+      @connection = connection
+      @migration = migration
+    end
+
+    #
+    # Copies chunks of size `stride`, starting from id 1 up to id `limit`.
+    #
+
+    def up_to(limit)
+      traversable_chunks_up_to(limit).times do |n|
+        yield(bottom(n + 1), top(n + 1, limit)) && sleep(throttle_seconds)
+      end
+    end
+
+    def traversable_chunks_up_to(limit)
+      (limit / @stride.to_f).ceil
+    end
+
+    def bottom(chunk)
+      (chunk - 1) * @stride + 1
+    end
+
+    def top(chunk, limit)
+      [chunk * @stride, limit].min
+    end
+
+    def copy(lowest, highest)
+      "insert ignore into `#{ @migration.destination.name }` (#{ cols.joined }) " +
+      "select #{ cols.joined } from `#{ @migration.origin.name }` " +
+      "where `id` between #{ lowest } and #{ highest }"
+    end
+
+  private
+
+    def cols
+      @cols ||= @migration.intersection
+    end
+
+    def execute
+      up_to(@limit) do |lowest, highest|
+        print "."
+
+        sql copy(lowest, highest)
+      end
+    end
+
+    def throttle_seconds
+      @throttle / 100.0
+    end
+  end
+end
+
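Reviewer note on the chunk arithmetic: the id ranges produced by `traversable_chunks_up_to`, `bottom` and `top` can be checked in isolation. The following is a minimal sketch, assuming a standalone helper named `chunk_ranges` (hypothetical, not part of the gem) that repeats the same three formulas without any database connection.

```ruby
# Standalone sketch of the chunk boundary arithmetic. `chunk_ranges` is a
# hypothetical helper; it reuses the formulas from Chunker without SQL.
def chunk_ranges(limit, stride)
  chunks = (limit / stride.to_f).ceil  # same as traversable_chunks_up_to
  (1..chunks).map do |n|
    lowest  = (n - 1) * stride + 1     # same as bottom(n)
    highest = [n * stride, limit].min  # same as top(n, limit)
    [lowest, highest]
  end
end

chunk_ranges(100_001, 40_000).each do |lowest, highest|
  puts "where `id` between #{lowest} and #{highest}"
end
```

With a limit of 100,001 and the default stride of 40,000 this yields three inclusive ranges, the last one clipped to the limit. Separately, the class comment describes `throttle` as milliseconds while `throttle_seconds` divides by 100.0, so as written the default of 100 appears to sleep for 1.0 second per stride rather than 0.1 second.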
data/lib/lhm/command.rb
ADDED
@@ -0,0 +1,70 @@
+#
+# Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+# Schmidt
+#
+# Apply a change to the database.
+#
+
+module Lhm
+  module Command
+    def self.included(base)
+      base.send :attr_reader, :connection
+    end
+
+    #
+    # Command Interface
+    #
+
+    def validate; end
+
+    def revert
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def run(&block)
+      validate
+
+      if(block_given?)
+        before
+        block.call(self)
+        after
+      else
+        execute
+      end
+    end
+
+  private
+
+    def execute
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def before
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def after
+      raise NotImplementedError.new(self.class.name)
+    end
+
+    def table?(table_name)
+      @connection.table_exists?(table_name)
+    end
+
+    def error(msg)
+      raise Exception.new("#{ self.class }: #{ msg }")
+    end
+
+    def sql(statements)
+      [*statements].each do |statement|
+        begin
+          @connection.execute(statement)
+        rescue Mysql::Error => e
+          revert
+          error "#{ statement } failed: #{ e.inspect }"
+        end
+      end
+    end
+  end
+end
+
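Reviewer note on `Command#run`: the method has two modes. With a block it calls `before`, the block, then `after`; without a block it calls `execute`. The sketch below reproduces only that control flow, assuming two hypothetical names (`MiniCommand`, `Recorder`) and no database connection.

```ruby
# Minimal sketch of the run/before/after/execute contract. `MiniCommand` and
# `Recorder` are hypothetical names used only for this illustration.
module MiniCommand
  def run(&block)
    validate
    if block_given?
      before
      block.call(self)
      after
    else
      execute
    end
  end

  def validate; end
end

class Recorder
  include MiniCommand
  attr_reader :log

  def initialize
    @log = []
  end

  private

  def before;  @log << :before;  end
  def after;   @log << :after;   end
  def execute; @log << :execute; end
end

wrapped = Recorder.new
wrapped.run { |cmd| cmd.log << :block }
p wrapped.log

plain = Recorder.new
plain.run
p plain.log
```

One consequence of this shape, in the sketch and apparently in the original as well: there is no `ensure`, so `after` is skipped if the block raises. In the original, `sql` calls `revert` only when a `Mysql::Error` is rescued.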
data/lib/lhm/entangler.rb
ADDED
@@ -0,0 +1,105 @@
+#
+# Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+# Schmidt
+#
+# Creates entanglement between two tables. All creates, updates and deletes
+# to origin will be repeated on the the destination table.
+#
+
+require 'lhm/command'
+
+module Lhm
+  class Entangler
+    include Command
+
+    attr_reader :epoch
+
+    def initialize(migration, connection = nil)
+      @common = migration.intersection
+      @origin = migration.origin
+      @destination = migration.destination
+      @connection = connection
+    end
+
+    def entangle
+      [
+        create_trigger_del,
+        create_trigger_ins,
+        create_trigger_upd
+      ]
+    end
+
+    def untangle
+      [
+        "drop trigger if exists `#{ trigger(:del) }`",
+        "drop trigger if exists `#{ trigger(:ins) }`",
+        "drop trigger if exists `#{ trigger(:upd) }`"
+      ]
+    end
+
+    def create_trigger_ins
+      strip %Q{
+        create trigger `#{ trigger(:ins) }`
+        after insert on `#{ @origin.name }` for each row
+        replace into `#{ @destination.name }` (#{ @common.joined })
+        values (#{ @common.typed("NEW") })
+      }
+    end
+
+    def create_trigger_upd
+      strip %Q{
+        create trigger `#{ trigger(:upd) }`
+        after update on `#{ @origin.name }` for each row
+        replace into `#{ @destination.name }` (#{ @common.joined })
+        values (#{ @common.typed("NEW") })
+      }
+    end
+
+    def create_trigger_del
+      strip %Q{
+        create trigger `#{ trigger(:del) }`
+        after delete on `#{ @origin.name }` for each row
+        delete ignore from `#{ @destination.name }`
+        where `#{ @destination.name }`.`id` = OLD.`id`
+      }
+    end
+
+    def trigger(type)
+      "lhmt_#{ type }_#{ @origin.name }"
+    end
+
+    #
+    # Command implementation
+    #
+
+    def validate
+      unless table?(@origin.name)
+        error("#{ @origin.name } does not exist")
+      end
+
+      unless table?(@destination.name)
+        error("#{ @destination.name } does not exist")
+      end
+    end
+
+    def before
+      sql(entangle)
+      @epoch = connection.select_value("select max(id) from #{ @origin.name }").to_i
+    end
+
+    def after
+      sql(untangle)
+    end
+
+    def revert
+      after
+    end
+
+  private
+
+    def strip(sql)
+      sql.strip.gsub(/\n */, "\n")
+    end
+  end
+end
+
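Reviewer note on the trigger statements: to see what the three `create trigger` statements look like once interpolated, they can be built as plain strings. This is a sketch under assumptions. `entangle_sql`, the destination name `lhmn_users` and the column list are hypothetical. The output format of `@common.joined` and `@common.typed("NEW")` is not shown in this section, so the sketch assumes comma-separated backticked column names and ``NEW.`col` `` references. The statements are also joined on one line rather than split by newlines as `strip` would leave them.

```ruby
# Sketch: build the trigger statements as plain strings, no database needed.
# `entangle_sql` and the table and column names are hypothetical; the column
# list formatting is an assumption about Intersection#joined and #typed.
def trigger_name(type, origin)
  "lhmt_#{type}_#{origin}"
end

def entangle_sql(origin, destination, columns)
  joined  = columns.map { |c| "`#{c}`" }.join(", ")
  typed   = columns.map { |c| "NEW.`#{c}`" }.join(", ")
  replace = "replace into `#{destination}` (#{joined}) values (#{typed})"

  {
    :del => "create trigger `#{trigger_name(:del, origin)}` " \
            "after delete on `#{origin}` for each row " \
            "delete ignore from `#{destination}` " \
            "where `#{destination}`.`id` = OLD.`id`",
    :ins => "create trigger `#{trigger_name(:ins, origin)}` " \
            "after insert on `#{origin}` for each row #{replace}",
    :upd => "create trigger `#{trigger_name(:upd, origin)}` " \
            "after update on `#{origin}` for each row #{replace}"
  }
end

statements = entangle_sql("users", "lhmn_users", %w[id name])
statements.each_value { |statement| puts statement }
```

Inserts and updates share the same `replace into` body, so a row replayed by a trigger overwrites whatever copy of that id already exists on the destination, while the chunked copy in `Chunker#copy` uses `insert ignore` and leaves such rows alone.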