lhm 1.0.0.rc.1

@@ -0,0 +1,5 @@
+ *.gem
+ .bundle
+ Gemfile.lock
+ pkg/*
+ .rvmrc
@@ -0,0 +1,10 @@
+ rvm:
+   - 1.8.7
+   - 1.9.2
+   - 1.9.3
+ before_script:
+   - "mysql -e 'create database large_hadron_migrator;'"
+ branches:
+   only:
+     - master
+     - stable
@@ -0,0 +1,32 @@
+ # 1.0.0-RC1
+
+ * rewrite.
+
+ # 0.9.1
+
+ # 0.2.1 (November 26, 2011)
+
+ * Include changelog in gem
+
+ # 0.2.0 (November 26, 2011)
+
+ * Add Ruby 1.8 compatibility
+ * Setup travis continuous integration
+ * Fix record loss issue
+ * Fix and speed up specs
+
+ # 0.1.4
+
+ * Merged [Pullrequest #9](https://github.com/soundcloud/large-hadron-migrator/pull/9)
+
+ # 0.1.3
+
+ * code cleanup
+ * Merged [Pullrequest #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
+ * Merged [Pullrequest #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
+ * Merged [Pullrequest #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
+ * Merged [Pullrequest #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
+
+ # 0.1.2
+
+ * Initial Release
data/Gemfile ADDED
@@ -0,0 +1,3 @@
+ source "http://rubygems.org"
+
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,27 @@
+ Copyright (c) 2011, SoundCloud, Rany Keddo, Tobias Bielohlawek, Tobias Schmidt
+
+ All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions are met:
+
+ - Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+ - Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+ - Neither the name of the SoundCloud nor the names of its contributors may be
+   used to endorse or promote products derived from this software without
+   specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
@@ -0,0 +1,98 @@
+ # Large Hadron Migrator [![Build Status](https://secure.travis-ci.org/soundcloud/large-hadron-migrator.png)](http://travis-ci.org/soundcloud/large-hadron-migrator)
+
+ Rails-style database migrations are a useful way to evolve your data schema in
+ an agile manner. Most Rails projects start like this, and at first, making
+ changes is fast and easy.
+
+ That is until your tables grow to millions of records. At this point, the
+ locking nature of `ALTER TABLE` may take your site down for an hour or more
+ while critical tables are migrated. In order to avoid this, developers begin
+ to design around the problem by introducing join tables or moving the data
+ into another layer. Development gets less and less agile as tables grow and
+ grow. To make the problem worse, adding or changing indices to optimize data
+ access becomes just as difficult.
+
+ > Side effects may include black holes and universe implosion.
+
+ There are a few things that can be done at the server or engine level. It is
+ possible to change default values in an `ALTER TABLE` without locking the
+ table. The InnoDB Plugin provides facilities for online index creation, which
+ is great if you are using this engine, but only solves half the problem.
+
+ At SoundCloud we started having migration pains quite a while ago, and after
+ looking around for third-party solutions [0] [1] [2], we decided to create our
+ own. We called it Large Hadron Migrator, and it is a gem for online
+ ActiveRecord migrations.
+
+ ![LHC](http://farm4.static.flickr.com/3093/2844971993_17f2ddf2a8_z.jpg)
+
+ [The Large Hadron Collider at CERN](http://en.wikipedia.org/wiki/Large_Hadron_Collider)
+
+ ## The idea
+
+ The basic idea is to perform the migration online while the system is live,
+ without locking the table. Similar to OAK (online alter table) [2] and the
+ Facebook tool [0], we use a copy table, triggers and a journal table.
+
+ The Large Hadron is a test-driven Ruby solution which can easily be dropped
+ into an ActiveRecord migration. It presumes a single auto-incremented
+ numerical primary key called id as per the Rails convention. Unlike the
+ Twitter solution [1], it does not require the presence of an indexed
+ `updated_at` column.
+
+ ## Usage
+
+ After including Lhm, `hadron_change_table` becomes available
+ with the following methods:
+
+     class MigrateArbitrary < ActiveRecord::Migration
+       include Lhm
+
+       def self.up
+         hadron_change_table(:users) do |t|
+           t.add_column(:arbitrary, "INT(12)")
+           t.add_index([:arbitrary, :created_at])
+           t.ddl("alter table %s add column flag tinyint(1)" % t.name)
+         end
+       end
+
+       def self.down
+         hadron_change_table(:users) do |t|
+           t.remove_index([:arbitrary, :created_at])
+           t.remove_column(:arbitrary)
+         end
+       end
+     end
+
+ ## Migration phases
+
+ _TODO_
+
+ ### When adding a column
+
+ _TODO_
+
+ ### When removing a column
+
+ _TODO_
+
+ ## Contributing
+
+ We'll check out your contribution if you:
+
+ - Provide a comprehensive suite of tests for your fork.
+ - Have a clear and documented rationale for your changes.
+ - Package these up in a pull request.
+
+ We'll do our best to help you out with any contribution issues you may have.
+
+ ## License
+
+ The license is included as LICENSE in this directory.
+
+ ## Footnotes
+
+ [0]: http://www.facebook.com/note.php?note\_id=430801045932 "Facebook"
+ [1]: https://github.com/freels/table\_migrator "Twitter"
+ [2]: http://openarkkit.googlecode.com "OAK online alter table"
+
@@ -0,0 +1,16 @@
+ require 'rake/testtask'
+
+ Rake::TestTask.new("unit") do |t|
+   t.libs.push "lib"
+   t.test_files = FileList['spec/unit/*_spec.rb']
+   t.verbose = true
+ end
+
+ Rake::TestTask.new("integration") do |t|
+   t.libs.push "lib"
+   t.test_files = FileList['spec/integration/*_spec.rb']
+   t.verbose = true
+ end
+
+ task :default => [:unit, :integration]
+
data/TODO ADDED
@@ -0,0 +1,11 @@
+ todo
+
+ . locked_switcher
+ . schema_creator
+ . chunker
+
+ checklist
+
+ . consider entanglement epoch
+   - should not lose changes due to low epoch
+
@@ -0,0 +1,29 @@
+ # -*- encoding: utf-8 -*-
+
+ lib = File.expand_path('../lib', __FILE__)
+ $:.unshift(lib) unless $:.include?(lib)
+
+ puts $:.inspect
+
+ require 'lhm'
+
+ Gem::Specification.new do |s|
+   s.name = "lhm"
+   s.version = Lhm::VERSION
+   s.platform = Gem::Platform::RUBY
+   s.authors = ["SoundCloud", "Rany Keddo", "Tobias Bielohlawek", "Tobias Schmidt"]
+   s.email = %q{rany@soundcloud.com, tobi@soundcloud.com, ts@soundcloud.com}
+   s.summary = %q{online schema changer for mysql}
+   s.description = %q{Migrate large tables without downtime by copying to a temporary table in chunks. The old table is not dropped. Instead, it is moved to timestamp_table_name for verification.}
+   s.homepage = %q{http://github.com/soundcloud/large-hadron-migrator}
+   s.files = `git ls-files`.split("\n")
+   s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
+   s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
+   s.require_paths = ["lib"]
+
+   # this should be a real dependency, but we're using a different gem in our code
+   s.add_development_dependency "mysql", "~> 2.8.1"
+   s.add_development_dependency "rspec", "=1.3.1"
+   s.add_development_dependency "rake"
+ end
+
@@ -0,0 +1,20 @@
+ #
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+ # Schmidt
+ #
+
+ require 'lhm/table'
+ require 'lhm/invoker'
+ require 'lhm/migration'
+
+ module Lhm
+   VERSION = "1.0.0.rc.1"
+
+   def hadron_change_table(table_name, chunk_options = {}, &block)
+     origin = Table.parse(table_name, connection)
+     invoker = Invoker.new(origin, connection)
+     block.call(invoker.migrator)
+     invoker.run(chunk_options)
+   end
+ end
+
@@ -0,0 +1,73 @@
+ #
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+ # Schmidt
+ #
+
+ require 'lhm/migration'
+ require 'lhm/command'
+
+ module Lhm
+   class Chunker
+     include Command
+
+     #
+     # Copy from origin to destination in chunks of size `stride`. Sleeps for
+     # `throttle` milliseconds between each stride.
+     #
+
+     def initialize(migration, limit = 1, connection = nil, options = {})
+       @stride = options[:stride] || 40_000
+       @throttle = options[:throttle] || 100
+       @limit = limit
+       @connection = connection
+       @migration = migration
+     end
+
+     #
+     # Copies chunks of size `stride`, starting from id 1 up to id `limit`.
+     #
+
+     def up_to(limit)
+       traversable_chunks_up_to(limit).times do |n|
+         yield(bottom(n + 1), top(n + 1, limit)) && sleep(throttle_seconds)
+       end
+     end
+
+     def traversable_chunks_up_to(limit)
+       (limit / @stride.to_f).ceil
+     end
+
+     def bottom(chunk)
+       (chunk - 1) * @stride + 1
+     end
+
+     def top(chunk, limit)
+       [chunk * @stride, limit].min
+     end
+
+     def copy(lowest, highest)
+       "insert ignore into `#{ @migration.destination.name }` (#{ cols.joined }) " +
+         "select #{ cols.joined } from `#{ @migration.origin.name }` " +
+         "where `id` between #{ lowest } and #{ highest }"
+     end
+
+     private
+
+     def cols
+       @cols ||= @migration.intersection
+     end
+
+     def execute
+       up_to(@limit) do |lowest, highest|
+         print "."
+
+         sql copy(lowest, highest)
+       end
+     end
+
+     def throttle_seconds
+       @throttle / 1000.0
+     end
+   end
+ end
+
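The chunk arithmetic in `Chunker` (`traversable_chunks_up_to`, `bottom`, `top`) is easier to check in isolation. The following is a minimal standalone sketch of the same calculation, not the gem's class: `chunk_ranges` is a hypothetical helper name, and the `limit` of 100,000 is an arbitrary example value.

```ruby
# Standalone sketch of the id ranges the chunker walks through.
# `chunk_ranges` is a hypothetical helper, not part of Lhm.
def chunk_ranges(limit, stride = 40_000)
  # Number of chunks needed to cover ids 1..limit, rounding up.
  chunks = (limit / stride.to_f).ceil

  (1..chunks).map do |n|
    bottom = (n - 1) * stride + 1    # first id in chunk n
    top    = [n * stride, limit].min # last id, clamped to limit
    [bottom, top]
  end
end

# Print the where-clause each copy statement would use.
chunk_ranges(100_000).each do |lowest, highest|
  puts "where `id` between #{lowest} and #{highest}"
end
```

The ranges are inclusive on both ends and do not overlap, which matches the `between` clause in `copy`. A `limit` of 0 yields no chunks at all.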
@@ -0,0 +1,70 @@
+ #
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+ # Schmidt
+ #
+ # Apply a change to the database.
+ #
+
+ module Lhm
+   module Command
+     def self.included(base)
+       base.send :attr_reader, :connection
+     end
+
+     #
+     # Command Interface
+     #
+
+     def validate; end
+
+     def revert
+       raise NotImplementedError.new(self.class.name)
+     end
+
+     def run(&block)
+       validate
+
+       if(block_given?)
+         before
+         block.call(self)
+         after
+       else
+         execute
+       end
+     end
+
+     private
+
+     def execute
+       raise NotImplementedError.new(self.class.name)
+     end
+
+     def before
+       raise NotImplementedError.new(self.class.name)
+     end
+
+     def after
+       raise NotImplementedError.new(self.class.name)
+     end
+
+     def table?(table_name)
+       @connection.table_exists?(table_name)
+     end
+
+     def error(msg)
+       raise Exception.new("#{ self.class }: #{ msg }")
+     end
+
+     def sql(statements)
+       [*statements].each do |statement|
+         begin
+           @connection.execute(statement)
+         rescue Mysql::Error => e
+           revert
+           error "#{ statement } failed: #{ e.inspect }"
+         end
+       end
+     end
+   end
+ end
+
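`Command#run` has two modes: with a block it wraps the block in `before` and `after`, and without one it calls `execute`. The sketch below illustrates that control flow with a reduced copy of the module, so it runs without a database. `MiniCommand` and `Recorder` are hypothetical names introduced for illustration and are not part of the gem.

```ruby
# Reduced, database-free copy of the run/before/after/execute flow.
# MiniCommand and Recorder are hypothetical names for illustration.
module MiniCommand
  def validate; end

  def run
    validate

    if block_given?
      before
      yield self
      after
    else
      execute
    end
  end
end

class Recorder
  include MiniCommand

  attr_reader :log

  def initialize
    @log = []
  end

  private

  # Each hook records its own name so the call order is visible.
  def before;  @log << :before;  end
  def after;   @log << :after;   end
  def execute; @log << :execute; end
end

wrapped = Recorder.new
wrapped.run { |cmd| cmd.log << :block }
p wrapped.log

direct = Recorder.new
direct.run
p direct.log
```

As in the original, `after` is not wrapped in an `ensure`, so it is skipped if the block raises. In the original, cleanup on failure is handled by `sql` calling `revert`.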
@@ -0,0 +1,105 @@
+ #
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
+ # Schmidt
+ #
+ # Creates entanglement between two tables. All creates, updates and deletes
+ # to origin will be repeated on the destination table.
+ #
+
+ require 'lhm/command'
+
+ module Lhm
+   class Entangler
+     include Command
+
+     attr_reader :epoch
+
+     def initialize(migration, connection = nil)
+       @common = migration.intersection
+       @origin = migration.origin
+       @destination = migration.destination
+       @connection = connection
+     end
+
+     def entangle
+       [
+         create_trigger_del,
+         create_trigger_ins,
+         create_trigger_upd
+       ]
+     end
+
+     def untangle
+       [
+         "drop trigger if exists `#{ trigger(:del) }`",
+         "drop trigger if exists `#{ trigger(:ins) }`",
+         "drop trigger if exists `#{ trigger(:upd) }`"
+       ]
+     end
+
+     def create_trigger_ins
+       strip %Q{
+         create trigger `#{ trigger(:ins) }`
+         after insert on `#{ @origin.name }` for each row
+         replace into `#{ @destination.name }` (#{ @common.joined })
+         values (#{ @common.typed("NEW") })
+       }
+     end
+
+     def create_trigger_upd
+       strip %Q{
+         create trigger `#{ trigger(:upd) }`
+         after update on `#{ @origin.name }` for each row
+         replace into `#{ @destination.name }` (#{ @common.joined })
+         values (#{ @common.typed("NEW") })
+       }
+     end
+
+     def create_trigger_del
+       strip %Q{
+         create trigger `#{ trigger(:del) }`
+         after delete on `#{ @origin.name }` for each row
+         delete ignore from `#{ @destination.name }`
+         where `#{ @destination.name }`.`id` = OLD.`id`
+       }
+     end
+
+     def trigger(type)
+       "lhmt_#{ type }_#{ @origin.name }"
+     end
+
+     #
+     # Command implementation
+     #
+
+     def validate
+       unless table?(@origin.name)
+         error("#{ @origin.name } does not exist")
+       end
+
+       unless table?(@destination.name)
+         error("#{ @destination.name } does not exist")
+       end
+     end
+
+     def before
+       sql(entangle)
+       @epoch = connection.select_value("select max(id) from #{ @origin.name }").to_i
+     end
+
+     def after
+       sql(untangle)
+     end
+
+     def revert
+       after
+     end
+
+     private
+
+     def strip(sql)
+       sql.strip.gsub(/\n */, "\n")
+     end
+   end
+ end
+
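To see what the entangler sends to MySQL, it can help to render one trigger for a concrete table. The sketch below is standalone and makes several assumptions: `Columns` is a hypothetical stand-in for whatever `migration.intersection` returns (its `joined` and `typed` behaviour is guessed from how they are used in the trigger templates), and the table names `users` and `lhmn_users` are example values only.

```ruby
# Hypothetical stand-in for the column intersection; the real object's
# behaviour is not shown in this diff, so this is an assumption.
class Columns
  def initialize(names)
    @names = names
  end

  # Backtick-quoted, comma-separated column list.
  def joined
    @names.map { |name| "`#{name}`" }.join(", ")
  end

  # Same list, each column qualified with NEW or OLD.
  def typed(type)
    @names.map { |name| "#{type}.`#{name}`" }.join(", ")
  end
end

# Builds the insert trigger in the same shape as create_trigger_ins.
def insert_trigger(origin, destination, columns)
  [
    "create trigger `lhmt_ins_#{origin}`",
    "after insert on `#{origin}` for each row",
    "replace into `#{destination}` (#{columns.joined})",
    "values (#{columns.typed('NEW')})"
  ].join("\n")
end

# Example table names; `lhmn_users` is an assumed destination name.
puts insert_trigger("users", "lhmn_users", Columns.new(%w[id name]))
```

Because the trigger uses `replace into`, a row that the chunker already copied is overwritten with the newer values rather than raising a duplicate-key error.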