lhm 1.0.0.rc.1

@@ -0,0 +1,5 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ .rvmrc
@@ -0,0 +1,10 @@
1
+ rvm:
2
+   - 1.8.7
3
+   - 1.9.2
4
+   - 1.9.3
5
+ before_script:
6
+   - "mysql -e 'create database large_hadron_migrator;'"
7
+ branches:
8
+   only:
9
+     - master
10
+     - stable
@@ -0,0 +1,32 @@
1
+ # 1.0.0-RC1
2
+
3
+ * rewrite.
4
+
5
+ # 0.9.1
6
+
7
+ # 0.2.1 (November 26, 2011)
8
+
9
+ * Include changelog in gem
10
+
11
+ # 0.2.0 (November 26, 2011)
12
+
13
+ * Add Ruby 1.8 compatibility
14
+ * Set up Travis continuous integration
15
+ * Fix record loss issue
16
+ * Fix and speed up specs
17
+
18
+ # 0.1.4
19
+
20
+ * Merged [pull request #9](https://github.com/soundcloud/large-hadron-migrator/pull/9)
21
+
22
+ # 0.1.3
23
+
24
+ * code cleanup
25
+ * Merged [pull request #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
26
+ * Merged [pull request #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
27
+ * Merged [pull request #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
28
+ * Merged [pull request #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
29
+
30
+ # 0.1.2
31
+
32
+ * Initial Release
data/Gemfile ADDED
@@ -0,0 +1,3 @@
1
+ source "http://rubygems.org"
2
+
3
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,27 @@
1
+ Copyright (c) 2011, SoundCloud, Rany Keddo, Tobias Bielohlawek, Tobias Schmidt
2
+
3
+ All rights reserved.
4
+
5
+ Redistribution and use in source and binary forms, with or without
6
+ modification, are permitted provided that the following conditions are met:
7
+
8
+ - Redistributions of source code must retain the above copyright notice, this
9
+ list of conditions and the following disclaimer.
10
+ - Redistributions in binary form must reproduce the above copyright notice,
11
+ this list of conditions and the following disclaimer in the documentation
12
+ and/or other materials provided with the distribution.
13
+ - Neither the name of the SoundCloud nor the names of its contributors may be
14
+ used to endorse or promote products derived from this software without
15
+ specific prior written permission.
16
+
17
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
18
+ AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
19
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
20
+ DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
21
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
22
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
23
+ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
24
+ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
25
+ OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
26
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
27
+
@@ -0,0 +1,98 @@
1
+ # Large Hadron Migrator [![Build Status](https://secure.travis-ci.org/soundcloud/large-hadron-migrator.png)](http://travis-ci.org/soundcloud/large-hadron-migrator)
2
+
3
+ Rails-style database migrations are a useful way to evolve your data schema in
4
+ an agile manner. Most Rails projects start like this, and at first, making
5
+ changes is fast and easy.
6
+
7
+ That is until your tables grow to millions of records. At this point, the
8
+ locking nature of `ALTER TABLE` may take your site down for an hour or more
9
+ while critical tables are migrated. In order to avoid this, developers begin
10
+ to design around the problem by introducing join tables or moving the data
11
+ into another layer. Development gets less and less agile as tables grow and
12
+ grow. To make the problem worse, adding or changing indices to optimize data
13
+ access becomes just as difficult.
14
+
15
+ > Side effects may include black holes and universe implosion.
16
+
17
+ There are a few things that can be done at the server or engine level. It is
18
+ possible to change default values in an `ALTER TABLE` without locking the
19
+ table. The InnoDB Plugin provides facilities for online index creation, which
20
+ is great if you are using this engine, but only solves half the problem.
21
+
22
+ At SoundCloud we started having migration pains quite a while ago, and after
23
+ looking around for third-party solutions [0] [1] [2], we decided to create our
24
+ own. We called it Large Hadron Migrator, and it is a gem for online
25
+ ActiveRecord migrations.
26
+
27
+ ![LHC](http://farm4.static.flickr.com/3093/2844971993_17f2ddf2a8_z.jpg)
28
+
29
+ [The Large Hadron Collider at CERN](http://en.wikipedia.org/wiki/Large_Hadron_Collider)
30
+
31
+ ## The idea
32
+
33
+ The basic idea is to perform the migration online while the system is live,
34
+ without locking the table. Similar to OAK (online alter table) [2] and the
35
+ Facebook tool [0], we use a copy table, triggers and a journal table.
36
+
37
+ The Large Hadron is a test-driven Ruby solution which can easily be dropped
38
+ into an ActiveRecord migration. It presumes a single auto-incremented
39
+ numerical primary key called `id`, as per the Rails convention. Unlike the
40
+ Twitter solution [1], it does not require the presence of an indexed
41
+ `updated_at` column.
42
+
43
+ ## Usage
44
+
45
+ After including Lhm, `hadron_change_table` becomes available
46
+ with the following methods:
47
+
48
+     class MigrateArbitrary < ActiveRecord::Migration
49
+       include Lhm
50
+
51
+       def self.up
52
+         hadron_change_table(:users) do |t|
53
+           t.add_column(:arbitrary, "INT(12)")
54
+           t.add_index([:arbitrary, :created_at])
55
+           t.ddl("alter table %s add column flag tinyint(1)" % t.name)
56
+         end
57
+       end
58
+
59
+       def self.down
60
+         hadron_change_table(:users) do |t|
61
+           t.remove_index([:arbitrary, :created_at])
62
+           t.remove_column(:arbitrary)
63
+         end
64
+       end
65
+     end
66
+
67
+ ## Migration phases
68
+
69
+ _TODO_
70
+
71
+ ### When adding a column
72
+
73
+ _TODO_
74
+
75
+ ### When removing a column
76
+
77
+ _TODO_
78
+
79
+ ## Contributing
80
+
81
+ We'll check out your contribution if you:
82
+
83
+ - Provide a comprehensive suite of tests for your fork.
84
+ - Have a clear and documented rationale for your changes.
85
+ - Package these up in a pull request.
86
+
87
+ We'll do our best to help you out with any contribution issues you may have.
88
+
89
+ ## License
90
+
91
+ The license is included as LICENSE in this directory.
92
+
93
+ ## Footnotes
94
+
95
+ [0]: http://www.facebook.com/note.php?note\_id=430801045932 "Facebook"
96
+ [1]: https://github.com/freels/table\_migrator "Twitter"
97
+ [2]: http://openarkkit.googlecode.com "OAK online alter table"
98
+
@@ -0,0 +1,16 @@
1
+ require 'rake/testtask'
2
+
3
+ Rake::TestTask.new("unit") do |t|
4
+ t.libs.push "lib"
5
+ t.test_files = FileList['spec/unit/*_spec.rb']
6
+ t.verbose = true
7
+ end
8
+
9
+ Rake::TestTask.new("integration") do |t|
10
+ t.libs.push "lib"
11
+ t.test_files = FileList['spec/integration/*_spec.rb']
12
+ t.verbose = true
13
+ end
14
+
15
+ task :default => [:unit, :integration]
16
+
data/TODO ADDED
@@ -0,0 +1,11 @@
1
+ todo
2
+
3
+ . locked_switcher
4
+ . schema_creator
5
+ . chunker
6
+
7
+ checklist
8
+
9
+ . consider entanglement epoch
10
+ - should not lose changes due to low epoch
11
+
@@ -0,0 +1,29 @@
1
+ # -*- encoding: utf-8 -*-
2
+
3
+ lib = File.expand_path('../lib', __FILE__)
4
+ $:.unshift(lib) unless $:.include?(lib)
5
+
6
+ puts $:.inspect
7
+
8
+ require 'lhm'
9
+
10
+ Gem::Specification.new do |s|
11
+ s.name = "lhm"
12
+ s.version = Lhm::VERSION
13
+ s.platform = Gem::Platform::RUBY
14
+ s.authors = ["SoundCloud", "Rany Keddo", "Tobias Bielohlawek", "Tobias Schmidt"]
15
+ s.email = %q{rany@soundcloud.com, tobi@soundcloud.com, ts@soundcloud.com}
16
+ s.summary = %q{online schema changer for mysql}
17
+ s.description = %q{Migrate large tables without downtime by copying to a temporary table in chunks. The old table is not dropped. Instead, it is moved to timestamp_table_name for verification.}
18
+ s.homepage = %q{http://github.com/soundcloud/large-hadron-migrator}
19
+ s.files = `git ls-files`.split("\n")
20
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
21
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
22
+ s.require_paths = ["lib"]
23
+
24
+ # this should be a real dependency, but we're using a different gem in our code
25
+ s.add_development_dependency "mysql", "~> 2.8.1"
26
+ s.add_development_dependency "rspec", "=1.3.1"
27
+ s.add_development_dependency "rake"
28
+ end
29
+
@@ -0,0 +1,20 @@
1
+ #
2
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
3
+ # Schmidt
4
+ #
5
+
6
+ require 'lhm/table'
7
+ require 'lhm/invoker'
8
+ require 'lhm/migration'
9
+
10
+ module Lhm
11
+ VERSION = "1.0.0.rc.1"
12
+
13
+ def hadron_change_table(table_name, chunk_options = {}, &block)
14
+ origin = Table.parse(table_name, connection)
15
+ invoker = Invoker.new(origin, connection)
16
+ block.call(invoker.migrator)
17
+ invoker.run(chunk_options)
18
+ end
19
+ end
20
+
@@ -0,0 +1,73 @@
1
+ #
2
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
3
+ # Schmidt
4
+ #
5
+
6
+ require 'lhm/migration'
7
+ require 'lhm/command'
8
+
9
+ module Lhm
10
+ class Chunker
11
+ include Command
12
+
13
+ #
14
+ # Copy from origin to destination in chunks of size `stride`. Sleeps for
15
+ # `throttle` milliseconds between each stride.
16
+ #
17
+
18
+ def initialize(migration, limit = 1, connection = nil, options = {})
19
+ @stride = options[:stride] || 40_000
20
+ @throttle = options[:throttle] || 100
21
+ @limit = limit
22
+ @connection = connection
23
+ @migration = migration
24
+ end
25
+
26
+ #
27
+ # Copies chunks of size `stride`, starting from id 1 up to id `limit`.
28
+ #
29
+
30
+ def up_to(limit)
31
+ traversable_chunks_up_to(limit).times do |n|
32
+ yield(bottom(n + 1), top(n + 1, limit)) && sleep(throttle_seconds)
33
+ end
34
+ end
35
+
36
+ def traversable_chunks_up_to(limit)
37
+ (limit / @stride.to_f).ceil
38
+ end
39
+
40
+ def bottom(chunk)
41
+ (chunk - 1) * @stride + 1
42
+ end
43
+
44
+ def top(chunk, limit)
45
+ [chunk * @stride, limit].min
46
+ end
47
+
48
+ def copy(lowest, highest)
49
+ "insert ignore into `#{ @migration.destination.name }` (#{ cols.joined }) " +
50
+ "select #{ cols.joined } from `#{ @migration.origin.name }` " +
51
+ "where `id` between #{ lowest } and #{ highest }"
52
+ end
53
+
54
+ private
55
+
56
+ def cols
57
+ @cols ||= @migration.intersection
58
+ end
59
+
60
+ def execute
61
+ up_to(@limit) do |lowest, highest|
62
+ print "."
63
+
64
+ sql copy(lowest, highest)
65
+ end
66
+ end
67
+
68
+ def throttle_seconds
69
+ @throttle / 1000.0
70
+ end
71
+ end
72
+ end
73
+
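The chunk arithmetic in `up_to`, `bottom` and `top` can be tried in isolation. The following is a minimal sketch and not part of the gem: `chunk_ranges` is a hypothetical helper that mirrors those three methods and returns the inclusive id ranges that would be handed to `copy`.

```ruby
# Hypothetical helper (not part of lhm): mirrors Chunker#up_to, #bottom and
# #top to list the inclusive [lowest, highest] id pairs for a given limit.
def chunk_ranges(limit, stride)
  chunks = (limit / stride.to_f).ceil       # same as traversable_chunks_up_to
  (1..chunks).map do |chunk|
    lowest  = (chunk - 1) * stride + 1      # same as bottom(chunk)
    highest = [chunk * stride, limit].min   # same as top(chunk, limit)
    [lowest, highest]
  end
end

# Print the where clause each chunk would use for limit 100, stride 40.
chunk_ranges(100, 40).each do |lowest, highest|
  puts "where `id` between #{lowest} and #{highest}"
end
```

The final range is clamped to `limit`, so the last chunk may be shorter than `stride`. A `limit` of 0 yields no chunks at all.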
@@ -0,0 +1,70 @@
1
+ #
2
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
3
+ # Schmidt
4
+ #
5
+ # Apply a change to the database.
6
+ #
7
+
8
+ module Lhm
9
+ module Command
10
+ def self.included(base)
11
+ base.send :attr_reader, :connection
12
+ end
13
+
14
+ #
15
+ # Command Interface
16
+ #
17
+
18
+ def validate; end
19
+
20
+ def revert
21
+ raise NotImplementedError.new(self.class.name)
22
+ end
23
+
24
+ def run(&block)
25
+ validate
26
+
27
+ if(block_given?)
28
+ before
29
+ block.call(self)
30
+ after
31
+ else
32
+ execute
33
+ end
34
+ end
35
+
36
+ private
37
+
38
+ def execute
39
+ raise NotImplementedError.new(self.class.name)
40
+ end
41
+
42
+ def before
43
+ raise NotImplementedError.new(self.class.name)
44
+ end
45
+
46
+ def after
47
+ raise NotImplementedError.new(self.class.name)
48
+ end
49
+
50
+ def table?(table_name)
51
+ @connection.table_exists?(table_name)
52
+ end
53
+
54
+ def error(msg)
55
+ raise Exception.new("#{ self.class }: #{ msg }")
56
+ end
57
+
58
+ def sql(statements)
59
+ [*statements].each do |statement|
60
+ begin
61
+ @connection.execute(statement)
62
+ rescue Mysql::Error => e
63
+ revert
64
+ error "#{ statement } failed: #{ e.inspect }"
65
+ end
66
+ end
67
+ end
68
+ end
69
+ end
70
+
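The two paths through `run` can be seen with a small stand-in. This is a sketch under stated assumptions: `SketchCommand` is a trimmed copy of the `run` lifecycle only (no connection, no `sql`, no error handling), and `Recorder` is a hypothetical command that records the order in which its hooks fire.

```ruby
# Simplified stand-in for Lhm::Command (assumption: trimmed to the run
# lifecycle only; no database connection or error handling).
module SketchCommand
  def run(&block)
    validate
    if block_given?
      before
      block.call(self)
      after
    else
      execute
    end
  end

  def validate; end
end

# Hypothetical command that records the order in which hooks fire.
class Recorder
  include SketchCommand

  attr_reader :calls

  def initialize
    @calls = []
  end

  def validate
    @calls << :validate
  end

  private

  def before
    @calls << :before
  end

  def after
    @calls << :after
  end

  def execute
    @calls << :execute
  end
end

wrapped = Recorder.new
wrapped.run { |command| command.calls << :block }
p wrapped.calls  # [:validate, :before, :block, :after]

plain = Recorder.new
plain.run
p plain.calls    # [:validate, :execute]
```

With a block, the command wraps the caller's work between `before` and `after`. Without one, it performs its own `execute`. Note that `run` as written has no `ensure`, so `after` is skipped if the block raises.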
@@ -0,0 +1,105 @@
1
+ #
2
+ # Copyright (c) 2011, SoundCloud Ltd., Rany Keddo, Tobias Bielohlawek, Tobias
3
+ # Schmidt
4
+ #
5
+ # Creates entanglement between two tables. All creates, updates and deletes
6
+ # to origin will be repeated on the destination table.
7
+ #
8
+
9
+ require 'lhm/command'
10
+
11
+ module Lhm
12
+ class Entangler
13
+ include Command
14
+
15
+ attr_reader :epoch
16
+
17
+ def initialize(migration, connection = nil)
18
+ @common = migration.intersection
19
+ @origin = migration.origin
20
+ @destination = migration.destination
21
+ @connection = connection
22
+ end
23
+
24
+ def entangle
25
+ [
26
+ create_trigger_del,
27
+ create_trigger_ins,
28
+ create_trigger_upd
29
+ ]
30
+ end
31
+
32
+ def untangle
33
+ [
34
+ "drop trigger if exists `#{ trigger(:del) }`",
35
+ "drop trigger if exists `#{ trigger(:ins) }`",
36
+ "drop trigger if exists `#{ trigger(:upd) }`"
37
+ ]
38
+ end
39
+
40
+ def create_trigger_ins
41
+ strip %Q{
42
+ create trigger `#{ trigger(:ins) }`
43
+ after insert on `#{ @origin.name }` for each row
44
+ replace into `#{ @destination.name }` (#{ @common.joined })
45
+ values (#{ @common.typed("NEW") })
46
+ }
47
+ end
48
+
49
+ def create_trigger_upd
50
+ strip %Q{
51
+ create trigger `#{ trigger(:upd) }`
52
+ after update on `#{ @origin.name }` for each row
53
+ replace into `#{ @destination.name }` (#{ @common.joined })
54
+ values (#{ @common.typed("NEW") })
55
+ }
56
+ end
57
+
58
+ def create_trigger_del
59
+ strip %Q{
60
+ create trigger `#{ trigger(:del) }`
61
+ after delete on `#{ @origin.name }` for each row
62
+ delete ignore from `#{ @destination.name }`
63
+ where `#{ @destination.name }`.`id` = OLD.`id`
64
+ }
65
+ end
66
+
67
+ def trigger(type)
68
+ "lhmt_#{ type }_#{ @origin.name }"
69
+ end
70
+
71
+ #
72
+ # Command implementation
73
+ #
74
+
75
+ def validate
76
+ unless table?(@origin.name)
77
+ error("#{ @origin.name } does not exist")
78
+ end
79
+
80
+ unless table?(@destination.name)
81
+ error("#{ @destination.name } does not exist")
82
+ end
83
+ end
84
+
85
+ def before
86
+ sql(entangle)
87
+ @epoch = connection.select_value("select max(id) from #{ @origin.name }").to_i
88
+ end
89
+
90
+ def after
91
+ sql(untangle)
92
+ end
93
+
94
+ def revert
95
+ after
96
+ end
97
+
98
+ private
99
+
100
+ def strip(sql)
101
+ sql.strip.gsub(/\n */, "\n")
102
+ end
103
+ end
104
+ end
105
+
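To make the generated trigger SQL concrete, here is a minimal sketch of the insert trigger. `insert_trigger_sql` is a hypothetical helper, not part of the gem, and the destination table name `lhmn_users` is made up for illustration. The `Intersection` class is not shown in this diff, so the sketch assumes that `joined` yields a backtick-quoted column list and that `typed("NEW")` prefixes each column with `NEW.`.

```ruby
# Hypothetical helper (not part of lhm): builds the insert-trigger statement
# in the shape used by Entangler#create_trigger_ins.
# Assumption: `joined` yields "`a`, `b`" and `typed("NEW")` yields
# "NEW.`a`, NEW.`b`"; the Intersection class is not shown in this diff.
def insert_trigger_sql(origin, destination, columns)
  joined = columns.map { |c| "`#{c}`" }.join(", ")
  typed  = columns.map { |c| "NEW.`#{c}`" }.join(", ")

  sql = %Q{
    create trigger `lhmt_ins_#{origin}`
    after insert on `#{origin}` for each row
    replace into `#{destination}` (#{joined})
    values (#{typed})
  }
  sql.strip.gsub(/\n */, "\n")  # same normalisation as Entangler#strip
end

puts insert_trigger_sql("users", "lhmn_users", %w[id name])
```

The triggers write with `replace into` while the chunked copy uses `insert ignore`. One plausible reading is that a row already written by a trigger is then left alone when the chunk copy later reaches the same id.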