large-hadron-migrator 0.1.2 → 0.1.3

CHANGES.md CHANGED
@@ -0,0 +1,9 @@
+ # 0.1.3
+ * code cleanup
+ * Merged [Pullrequest #8](https://github.com/soundcloud/large-hadron-migrator/pull/8)
+ * Merged [Pullrequest #7](https://github.com/soundcloud/large-hadron-migrator/pull/7)
+ * Merged [Pullrequest #4](https://github.com/soundcloud/large-hadron-migrator/pull/4)
+ * Merged [Pullrequest #1](https://github.com/soundcloud/large-hadron-migrator/pull/1)
+
+ # 0.1.2
+ * Initial Release
README.md CHANGED
@@ -5,7 +5,7 @@ an agile manner. Most Rails projects start like this, and at first, making
  changes is fast and easy.
 
  That is until your tables grow to millions of records. At this point, the
- locking nature of `ALTER TABLE` may take your site down for an hour our more
+ locking nature of `ALTER TABLE` may take your site down for an hour or more
  while critical tables are migrated. In order to avoid this, developers begin
  to design around the problem by introducing join tables or moving the data
  into another layer. Development gets less and less agile as tables grow and
@@ -89,7 +89,7 @@ there can only ever be one version of the record in the journal table.
 
  If the journalling trigger hits an already persisted record, it will be
  replaced with the latest data and action. `ON DUPLICATE KEY` comes in handy
- here. This insures that all journal records will be consistent with the
+ here. This ensures that all journal records will be consistent with the
  original table.
 
  ### Perform alter statement on new table
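The journalling upsert described above can be sketched as follows. This is an illustrative reconstruction, not the gem's exact trigger bodies; the table names and the single-column journal payload are assumptions. The point is that `ON DUPLICATE KEY UPDATE` collapses repeated writes to one row per primary key, always reflecting the latest action:

```ruby
# Illustrative sketch of an LHM-style journalling trigger (names assumed).
# Because the journal's primary key mirrors the source table's, a second
# write to the same row replaces the earlier journal entry instead of
# inserting a duplicate -- the journal stays consistent with the original.
def journal_trigger_sql(table, journal_table, action)
  %Q{
    CREATE TRIGGER #{table}_#{action}_journal
    AFTER #{action.upcase} ON #{table} FOR EACH ROW
      INSERT INTO #{journal_table} (id, hadron_action)
      VALUES (NEW.id, '#{action}')
      ON DUPLICATE KEY UPDATE hadron_action = '#{action}'
  }
end

puts journal_trigger_sql("accounts", "lhmj_accounts", "insert")
```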
@@ -100,20 +100,20 @@ indexes at the end of the copying process.
 
  ### Copy in chunks up to max primary key value to new table
 
- Currently InnoDB aquires a read lock on the source rows in `INSERT INTO...
+ Currently InnoDB acquires a read lock on the source rows in `INSERT INTO...
  SELECT`. LHM reads 35K ranges and pauses for a specified number of milliseconds
  so that contention can be minimized.
 
  ### Switch new and original table names and remove triggers
 
- The orignal and copy table are now atomically switched with `RENAME TABLE
+ The original and copy table are now atomically switched with `RENAME TABLE
  original TO archive_original, copy_table TO original`. The triggers are removed
  so that journalling stops and all mutations and reads now go against the
  original table.
 
  ### Replay journal: insert, update, deletes
 
- Because the chunked copy stops at the intial maximum id, we can simply replay
+ Because the chunked copy stops at the initial maximum id, we can simply replay
  all inserts in the journal table without worrying about collisions.
 
  Updates and deletes are then replayed.
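The chunked-copy loop above can be sketched with a small helper that splits the key space into fixed-size id ranges (helper and table names are assumptions, not the gem's API). Bounding each `INSERT INTO ... SELECT` to one range bounds how long it holds its read lock on the source rows:

```ruby
# Split ids 1..max_id into inclusive [from, to] ranges of chunk_size each.
def chunk_ranges(max_id, chunk_size)
  (1..max_id).step(chunk_size).map do |start|
    [start, [start + chunk_size - 1, max_id].min]
  end
end

chunk_ranges(100_000, 35_000).each do |from, to|
  sql = "insert ignore into new_table select * from original " \
        "where id between #{from} and #{to}"
  # execute(sql); sleep(wait)  -- pause between chunks to reduce contention
end

chunk_ranges(100_000, 35_000)
# => [[1, 35000], [35001, 70000], [70001, 100000]]
```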
@@ -131,8 +131,8 @@ pass, so this will be quite short compared to the copy phase. The
  inconsistency during replay is similar in effect to a slave which is slightly
  behind master.
 
- There is also caveat with the current journalling scheme; stale journal
- 'update' entries are still replayed. Imagine an update to the a record in the
+ There is also a caveat with the current journalling scheme; stale journal
+ 'update' entries are still replayed. Imagine an update to a record in the
  migrated table while the journal is replaying. The journal may already contain
  an update for this record, which becomes stale now. When it is replayed, the
  second change will be lost. So if a record is updated twice, once before and
@@ -162,9 +162,9 @@ Several hours into the migration, a critical fix had to be deployed to the
  site. We rolled out the fix and restarted the app servers in mid migration.
  This was not a good idea.
 
- TL;DR: Never restart during migrations when removing columns with large
- hadron. You can restart while adding migrations as long as active record reads
- column definitions from the slave.
+ TL;DR: Never restart during migrations when removing columns with LHM.
+ You can restart while adding migrations as long as active record reads column
+ definitions from the slave.
 
  The information below is only relevant if you want to restart your app servers
  while migrating in a master slave setup.
data/VERSION CHANGED
@@ -1 +1 @@
- 0.1.2
+ 0.1.3
lib/large_hadron_migration.rb CHANGED
@@ -100,16 +100,17 @@ class LargeHadronMigration < ActiveRecord::Migration
 
  raise "chunk_size must be >= 1" unless chunk_size >= 1
 
- new_table = "new_#{curr_table}"
- old_table = "%s_#{curr_table}" % Time.now.strftime("%Y_%m_%d_%H_%M_%S_%3N")
- journal_table = "#{old_table}_changes"
+ started = Time.now.strftime("%Y_%m_%d_%H_%M_%S_%3N")
+ new_table = "lhmn_%s" % curr_table
+ old_table = "lhmo_%s_%s" % [started, curr_table]
+ journal_table = "lhmj_%s_%s" % [started, curr_table]
 
  last_insert_id = last_insert_id(curr_table)
  say "last inserted id in #{curr_table}: #{last_insert_id}"
 
  begin
  # clean tables. old tables are never deleted to guard against rollbacks.
- execute %Q/drop table if exists %s/ % new_table
+ execute %Q{drop table if exists %s} % new_table
 
  clone_table(curr_table, new_table, id_window)
  clone_table_for_changes(curr_table, journal_table)
@@ -253,8 +254,8 @@ class LargeHadronMigration < ActiveRecord::Migration
  end
  end
 
- def self.clone_table(source, dest, window = 0)
- execute schema_sql(source, dest, window)
+ def self.clone_table(source, dest, window = 0, add_action_column = false)
+ execute schema_sql(source, dest, window, add_action_column)
  end
 
  def self.common_columns(t1, t2)
@@ -262,11 +263,7 @@ class LargeHadronMigration < ActiveRecord::Migration
  end
 
  def self.clone_table_for_changes(table, journal_table)
- clone_table(table, journal_table)
- execute %Q{
- alter table %s
- add column hadron_action varchar(15);
- } % journal_table
+ clone_table(table, journal_table, 0, true)
  end
 
  def self.rename_tables(tables = {})
@@ -333,11 +330,15 @@ class LargeHadronMigration < ActiveRecord::Migration
  end
 
  def self.replay_delete_changes(table, journal_table)
- execute %Q{
- delete from #{table} where id in (
- select id from #{journal_table} where hadron_action = 'delete'
- )
- }
+ with_master do
+ if connection.select_values("select id from #{journal_table} where hadron_action = 'delete' LIMIT 1").any?
+ execute %Q{
+ delete from #{table} where id in (
+ select id from #{journal_table} where hadron_action = 'delete'
+ )
+ }
+ end
+ end
  end
 
  def self.replay_update_changes(table, journal_table, chunk_size = 10000, wait = 0.2)
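The guard introduced in this hunk can be shown in isolation: only issue the (potentially expensive) replay `DELETE` when the journal actually holds `'delete'` entries. Here the journal and connection are stubbed in memory; the names `JOURNAL` and `select_values` are illustrative, not the gem's API:

```ruby
# In-memory stand-in for "select id from ... where hadron_action = 'delete'".
JOURNAL = { 'source_changes' => [] }  # no delete entries journalled

def select_values(journal_table)
  JOURNAL.fetch(journal_table, [])
end

def replay_delete_changes(table, journal_table)
  ids = select_values(journal_table)
  return :skipped if ids.empty?  # nothing journalled -> no DELETE issued
  # execute "delete from #{table} where id in (#{ids.join(',')})"
  :deleted
end

replay_delete_changes("source", "source_changes")  # => :skipped
```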
@@ -359,12 +360,17 @@ class LargeHadronMigration < ActiveRecord::Migration
  # behavior with the latter where the auto_increment of the source table
  # got modified when updating the destination.
  #
- def self.schema_sql(source, dest, window)
+ def self.schema_sql(source, dest, window, add_action_column = false)
  show_create(source).tap do |schema|
  schema.gsub!(/auto_increment=(\d+)/i) do
  "auto_increment=#{ $1.to_i + window }"
  end
 
+ if add_action_column
+ schema.sub!(/\) ENGINE=/,
+ ", hadron_action ENUM('update', 'insert', 'delete'), INDEX hadron_action (hadron_action) USING BTREE) ENGINE=")
+ end
+
  schema.gsub!('CREATE TABLE `%s`' % source, 'CREATE TABLE `%s`' % dest)
  end
  end
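The `schema_sql` change above is plain string surgery on `SHOW CREATE TABLE` output. A self-contained sketch of the same rewrites, run against a fabricated schema string (the `accounts` table is an assumption; the real input comes from MySQL):

```ruby
# Fabricated SHOW CREATE TABLE output standing in for show_create(source).
schema = <<-SQL
CREATE TABLE `accounts` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=100 DEFAULT CHARSET=utf8
SQL

window = 1000
# Bump the auto_increment so the copy table starts past the source's ids.
schema.gsub!(/auto_increment=(\d+)/i) { "AUTO_INCREMENT=#{$1.to_i + window}" }
# Splice the journal's action column and its index in before ") ENGINE=".
schema.sub!(/\) ENGINE=/,
  ", hadron_action ENUM('update', 'insert', 'delete'), INDEX hadron_action (hadron_action) USING BTREE) ENGINE=")
# Point the CREATE at the destination table.
schema.gsub!('CREATE TABLE `accounts`', 'CREATE TABLE `lhmj_accounts`')
```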
spec/large_hadron_migration_spec.rb CHANGED
@@ -99,7 +99,8 @@ describe "LargeHadronMigration", "triggers" do
  end
 
  it "should create a table for triggered changes" do
- truthiness_column "triggerme_changes", "hadron_action", "varchar"
+ truthiness_column "triggerme_changes", "hadron_action", "enum"
+ truthiness_index "triggerme_changes", "hadron_action", [ "hadron_action" ], false
  end
 
  it "should trigger on insert" do
@@ -341,6 +342,11 @@ describe "LargeHadronMigration", "replaying changes" do
  end
  end
 
+ it "doesn't replay delete if there are any" do
+ LargeHadronMigration.should_receive(:execute).never
+ LargeHadronMigration.replay_delete_changes("source", "source_changes")
+ end
+
  end
 
  describe "LargeHadronMigration", "units" do
spec/spec_helper.rb CHANGED
@@ -90,6 +90,21 @@ module SpecHelper
 
  end
 
+ def truthiness_index(table, expected_index_name, indexed_columns, unique)
+ index = sql("SHOW INDEXES FROM #{table}").all_hashes.inject({}) do |a, part|
+ index_name = part['Key_name']
+ a[index_name] ||= { 'unique' => '0' == part['Non_unique'], 'columns' => [] }
+ column_index = part['Seq_in_index'].to_i - 1
+ a[index_name]['columns'][column_index] = part['Column_name']
+ a
+ end[expected_index_name]
+
+ flunk("no index named #{expected_index_name} found on #{table}") unless index
+
+ index['columns'].should == indexed_columns
+ index['unique'].should == unique
+ end
+
  end
 
  # Mock Rails Environment
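The `truthiness_index` helper added above folds `SHOW INDEXES` rows into one hash per index. The same fold, run against a fabricated row (real input comes from MySQL; column keys follow `SHOW INDEXES` output):

```ruby
# One SHOW INDEXES row for a single-column, non-unique index.
rows = [
  { 'Key_name' => 'hadron_action', 'Non_unique' => '1',
    'Seq_in_index' => '1', 'Column_name' => 'hadron_action' }
]

# Group parts by index name; Seq_in_index is 1-based, so subtract 1
# to place each column at its position within the index.
indexes = rows.inject({}) do |a, part|
  name = part['Key_name']
  a[name] ||= { 'unique' => '0' == part['Non_unique'], 'columns' => [] }
  a[name]['columns'][part['Seq_in_index'].to_i - 1] = part['Column_name']
  a
end

indexes['hadron_action']
# => {"unique"=>false, "columns"=>["hadron_action"]}
```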
metadata CHANGED
@@ -1,12 +1,8 @@
  --- !ruby/object:Gem::Specification
  name: large-hadron-migrator
  version: !ruby/object:Gem::Version
- prerelease: false
- segments:
- - 0
- - 1
- - 2
- version: 0.1.2
+ prerelease:
+ version: 0.1.3
  platform: ruby
  authors:
  - SoundCloud
@@ -16,8 +12,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2011-05-04 00:00:00 +02:00
- default_executable:
+ date: 2011-09-28 00:00:00 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activerecord
@@ -27,10 +22,6 @@ dependencies:
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- segments:
- - 2
- - 3
- - 8
  version: 2.3.8
  type: :runtime
  version_requirements: *id001
@@ -42,10 +33,6 @@ dependencies:
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- segments:
- - 2
- - 3
- - 8
  version: 2.3.8
  type: :runtime
  version_requirements: *id002
@@ -57,10 +44,6 @@ dependencies:
  requirements:
  - - "="
  - !ruby/object:Gem::Version
- segments:
- - 2
- - 8
- - 1
  version: 2.8.1
  type: :runtime
  version_requirements: *id003
@@ -72,10 +55,6 @@ dependencies:
  requirements:
  - - "="
  - !ruby/object:Gem::Version
- segments:
- - 1
- - 3
- - 1
  version: 1.3.1
  type: :development
  version_requirements: *id004
@@ -89,11 +68,11 @@ extra_rdoc_files: []
 
  files:
  - .gitignore
- - CHANGES.markdown
+ - CHANGES.md
  - Gemfile
  - Gemfile.lock
  - LICENSE
- - README.markdown
+ - README.md
  - Rakefile
  - VERSION
  - large-hadron-migrator.gemspec
@@ -101,7 +80,6 @@ files:
  - spec/large_hadron_migration_spec.rb
  - spec/migrate/add_new_column.rb
  - spec/spec_helper.rb
- has_rdoc: true
  homepage: http://github.com/soundcloud/large-hadron-migrator
  licenses: []
 
@@ -115,21 +93,17 @@ required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- segments:
- - 0
  version: "0"
  required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- segments:
- - 0
  version: "0"
  requirements: []
 
  rubyforge_project:
- rubygems_version: 1.3.7
+ rubygems_version: 1.8.10
  signing_key:
  specification_version: 3
  summary: online schema changer for mysql