migratrix 0.8.5 → 0.8.7
- data/README.md +190 -30
- data/bin/migratrix +1 -1
- data/lib/migratrix.rb +2 -0
- data/lib/migratrix/loads/active_record.rb +73 -0
- data/lib/migratrix/migration.rb +9 -5
- data/lib/migratrix/transforms/transform.rb +0 -12
- data/spec/lib/migratrix/migration_spec.rb +1 -1
- metadata +6 -5
data/README.md
CHANGED
@@ -4,17 +4,89 @@ Dominate your legacy Rails migrations! Migratrix is a gem to help you
|
|
4
4
|
generate and control Migrations, which extract data from legacy systems
|
5
5
|
and import them into your current one.
|
6
6
|
|
7
|
-
##
|
7
|
+
## General Info
|
8
8
|
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
9
|
+
Migratrix is a framework that supports various migration strategies.
|
10
|
+
You tell it how you want to connect to a data source, define any
|
11
|
+
transformations, and then describe the "load target": how and where
|
12
|
+
you want the new data to come out. You can, of course, extract data
|
13
|
+
from a legacy database and load it into your all-new database, but you
|
14
|
+
can also read and write from logs, flat files, or even external
|
15
|
+
services.
|
14
16
|
|
15
|
-
|
17
|
+
If you can get at the data from Ruby, Migratrix can use it as a source
|
18
|
+
of extractable data. If you can get at the new storage mechanism from
|
19
|
+
Ruby, Migratrix can use it as a load target.
|
20
|
+
|
21
|
+
You can extract data from an API and load it into your database. You
|
22
|
+
can extract enumerables from a database and load them into source code
|
23
|
+
that defines constants for the enumeration.
|
24
|
+
|
25
|
+
## Weapons-Grade Migrations
|
26
|
+
|
27
|
+
Migratrix tries to keep simple things simple, but really shines when
|
28
|
+
it scales.
|
29
|
+
|
30
|
+
Migratrix is intended to be as simple and easy to use as possible for
|
31
|
+
small cases, but to have no problems scaling upwards. As a result, if
|
32
|
+
you have an ultra-simple migration that one developer on the team will
|
33
|
+
only ever run once, Migratrix will work but may not be the best tool.
|
34
|
+
|
35
|
+
If, however, you have
|
36
|
+
|
37
|
+
* Complex and/or complicated migrations
|
38
|
+
|
39
|
+
* Migrations that need to be run more than once, especially if they
|
40
|
+
need to be run frequently and/or regularly
|
41
|
+
|
42
|
+
* Migrations that need to be run by multiple developers or by
|
43
|
+
distributed machines
|
44
|
+
|
45
|
+
then your migration strategy is complicated enough that it needs to be
|
46
|
+
part of the codebase and knowledgebase for your project, and you want
|
47
|
+
a heavier-duty migration tool. Migratrix is the perfect tool in that
|
48
|
+
situation.
|
49
|
+
|
50
|
+
|
51
|
+
## Object-Oriented Data Transformation
|
52
|
+
|
53
|
+
The first few times you write a special-purpose data migration tool,
|
54
|
+
you're going to be tempted to just say "here's a hash for the source,
|
55
|
+
here's a hash for the destination, and here's a function to transform
|
56
|
+
from one to the other". And it will work. But then the extra
|
57
|
+
requirements start to roll in:
|
58
|
+
|
59
|
+
* We have 20 million rows, can you do the migration in batches?
|
60
|
+
|
61
|
+
* We need to split the users table into users and addresses tables,
|
62
|
+
how hard is that?
|
16
63
|
|
17
|
-
|
64
|
+
* We have a TON of duplicate data, can you merge it down when you
|
65
|
+
migrate?
|
66
|
+
|
67
|
+
* Can we just migrate one or two exemplar rows to test the migration
|
68
|
+
tool?
|
69
|
+
|
70
|
+
And my all-time personal favorite:
|
71
|
+
|
72
|
+
* We've decided we want to keep the old site live while the beta site
|
73
|
+
is running, can you keep the databases in sync?
|
74
|
+
|
75
|
+
I started writing Migratrix when my third client asked me to keep a
|
76
|
+
legacy site and a beta site synchronized. For me this tool reduced the
|
77
|
+
problem from "utterly impossible" to "merely very difficult". :-)
|
78
|
+
|
79
|
+
As a result, Migratrix is very much object based. You get at a data
|
80
|
+
source by using an Extract object. If Migratrix does not support the
|
81
|
+
kind of extraction you want to do, it provides the Extract base class
|
82
|
+
so you can write your own extractor.
|
83
|
+
|
84
|
+
Then the data is handed off to a Transform object, and finally it is
|
85
|
+
given to a Load object.
|
86
|
+
|
87
|
+
Migratrix provides a handful of Extract, Transform and Load classes to
|
88
|
+
handle common types of migration (reading and writing models with
|
89
|
+
ActiveRecord, writing YAML files, etc).
|
18
90
|
|
19
91
|
## Motivation
|
20
92
|
|
@@ -22,42 +94,116 @@ So... much... legacy... data....
|
|
22
94
|
|
23
95
|
## Rails and Ruby Requirements
|
24
96
|
|
25
|
-
### Rails 3
|
97
|
+
### Rails 3
|
26
98
|
|
27
|
-
Migratrix
|
28
|
-
|
29
|
-
|
30
|
-
|
99
|
+
Migratrix depends on Rails 3 for its ActiveRecord and ActiveSupport
|
100
|
+
libraries. If your project was written in an older version of Rails,
|
101
|
+
and you want to use ActiveRecord as your extractor, you may need to
|
102
|
+
define new models that are compatible with Rails 3. But try just
|
103
|
+
loading the old models first; as long as the models aren't too
|
104
|
+
complicated or interconnected, they may work.
|
31
105
|
|
32
|
-
### Ruby 1.9
|
106
|
+
### Ruby 1.9.2 or later
|
33
107
|
|
34
108
|
Because I can.
|
35
109
|
|
36
|
-
##
|
110
|
+
## Examples
|
37
111
|
|
38
|
-
|
112
|
+
### ETL
|
39
113
|
|
40
114
|
I use the term "ETL" here in roughly the same sense as in data
|
41
115
|
warehousing: Extract, Transform and Load. Migratrix approaches
|
42
116
|
migrations in three phases:
|
43
117
|
|
44
|
-
* **Extract** The Migration obtains the legacy data from 1 or more
|
45
|
-
|
46
|
-
|
47
|
-
|
48
|
-
outputs
|
49
|
-
|
118
|
+
* **Extract** The Migration obtains the legacy data from 1 or more sources
|
119
|
+
|
120
|
+
* **Transform** The Migration transforms the data into 1 or more outputs
|
121
|
+
|
50
122
|
* **Load** The Migration saves the data into the new database or other
|
51
|
-
|
52
|
-
|
123
|
+
output(s)
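The three phases above can be sketched in plain Ruby. This is only an illustration of the idea and not Migratrix's own API: `LEGACY_ROWS`, `extract`, `transform` and `load_rows` are hypothetical names invented for this sketch, and the "load target" is assumed to be an in-memory Hash.

```ruby
# Hypothetical stand-in for a legacy data source (not part of Migratrix).
LEGACY_ROWS = [
  { id: 1, name: "Utah",   state_code: "UT" },
  { id: 2, name: "Nevada", state_code: "NV" }
].freeze

# Extract: obtain the legacy data from a source.
def extract
  LEGACY_ROWS
end

# Transform: build new rows whose keys are the new attribute names and
# whose values come from the matching legacy attribute.
def transform(rows, mapping)
  rows.map do |row|
    mapping.each_with_object({}) { |(new_key, old_key), out| out[new_key] = row[old_key] }
  end
end

# Load: save the transformed rows; here the target is just a Hash keyed by id.
def load_rows(rows, target = {})
  rows.each { |row| target[row[:id]] = row }
  target
end

STATES = load_rows(transform(extract, id: :id, name: :name, abbreviation: :state_code))
```

Keeping each phase behind its own method is what makes it possible to swap a database extract for a flat-file extract, or a Hash load for a YAML load, without touching the other two phases.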
|
124
|
+
|
125
|
+
### General Structure
|
126
|
+
|
127
|
+
I like to create a folder structure in db/legacy with a /migrations
|
128
|
+
folder and a /models folder. Then I write rake tasks to handle the
|
129
|
+
more common migrations. Migration classes go in /migrations,
|
130
|
+
naturally; /models is where I store ActiveRecord models that must
|
131
|
+
access the legacy database. It is important to keep these namespaced
|
132
|
+
so that your `User` model doesn't conflict with your `Legacy::User`
|
133
|
+
model.
|
134
|
+
|
135
|
+
### Simple Example: Straight-up ActiveRecord Copy
|
136
|
+
|
137
|
+
Let's say your legacy app has a simple table that you want to keep
|
138
|
+
unchanged. Migratrix can bring this over using straight ActiveRecord
|
139
|
+
copies:
|
140
|
+
|
141
|
+
TODO: Write Sample App so these examples make sense
|
142
|
+
TODO: And then come write this example
|
143
|
+
|
144
|
+
|
145
|
+
### Slightly More Complicated: Defining your own extensible class
|
146
|
+
|
147
|
+
Let's say your legacy app has a table that stores what amount to
|
148
|
+
constants in a table in the database. Either the data never changes
|
149
|
+
(in fact, you don't even have an admin page to edit the data), or the
|
150
|
+
data changes rarely enough that you're willing to restart the
|
151
|
+
webserver if the data DOES change. It's also really simple data, and
|
152
|
+
has no dependencies. For the sake of demonstration, let's say that
|
153
|
+
your app stores all 50 US States and their abbreviations. (And you put
|
154
|
+
it in the database because ONE day you were SURE you were going to
|
155
|
+
need to add Canadian provinces or Japanese boroughs.)
|
156
|
+
|
157
|
+
Let's migrate this to a YAML file and store it in config/constants,
|
158
|
+
and at boot time we'll write some code to load the states.yml file and
|
159
|
+
create a frozen hash called MyApp::STATES.
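A minimal sketch of that boot-time code might look like the following. The layout of states.yml is an assumption here (a hash keyed by id with string attribute names), `MyApp.load_constants` is a hypothetical helper, and the temporary directory only exists so the sketch can run on its own; in a real app the file would live under config/constants.

```ruby
require 'yaml'
require 'tmpdir'

module MyApp
  # Hypothetical helper: read a constants file and freeze the result,
  # including each per-record hash, so nothing can modify it at runtime.
  def self.load_constants(path)
    YAML.load_file(path).each_value(&:freeze).freeze
  end
end

Dir.mktmpdir do |dir|
  # Assumed file layout; the real output of the YAML load may differ.
  path = File.join(dir, "states.yml")
  File.write(path, { 1 => { "name" => "Utah", "abbreviation" => "UT" } }.to_yaml)

  MyApp::STATES = MyApp.load_constants(path)
end
```

Because the hash is frozen, any attempt to change it raises an error, which suits data that only changes when the webserver restarts.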
|
160
|
+
|
161
|
+
This should be straightforward at this point, but wait. It also turns
|
162
|
+
out we need to do the same thing with the countries table, the
|
163
|
+
shipping_carriers table, and about six other tables. They're all the
|
164
|
+
same: we need to extract from the legacy database and write to a
|
165
|
+
YAML file in the constants folder.
|
166
|
+
|
167
|
+
We can do that. Here's the entire migration for States:
|
168
|
+
|
169
|
+
require 'constants_migration'
|
170
|
+
module Legacy
|
171
|
+
class StatesMigration < ConstantsMigration
|
172
|
+
extend_extraction source: Legacy::State
|
173
|
+
extend_transform transform: { id: :id, name: :name, abbreviation: :state_code }
|
174
|
+
extend_load filename: MyApp::CONSTANTS_PATH + "states.yml"
|
175
|
+
end
|
176
|
+
end
|
177
|
+
|
178
|
+
And the reason this works is that you've written your own custom
|
179
|
+
Migration class, which looks like this:
|
180
|
+
|
181
|
+
module Legacy
|
182
|
+
# Migrates a Legacy ActiveRecord table through a transform to a
|
183
|
+
# constants file. Child classes should extend_extraction with
|
184
|
+
# :source, transform with a :transform hash, and load with the
|
185
|
+
# filename to write to.
|
186
|
+
class ConstantsMigration < Migratrix::Migration
|
187
|
+
set_extraction :active_record, source: Model
|
188
|
+
set_transform :transform, {
|
189
|
+
transform_collection: Hash,
|
190
|
+
transform_class: Hash,
|
191
|
+
extract_attribute: :[],
|
192
|
+
apply_attribute: :[]=,
|
193
|
+
store_transformed_object: ->(object,collection){ collection[object[:id]] = object }
|
194
|
+
}
|
195
|
+
set_load :yaml
|
196
|
+
end
|
197
|
+
end
|
53
198
|
|
54
199
|
## Migration Dependencies
|
55
200
|
|
56
201
|
Migratrix isn't quite smart enough to know that a migrator depends on
|
57
202
|
another migrator. (Actually, that's easy. What's hard is knowing if a
|
58
|
-
dependent migrator has run and is up-to-date.
|
203
|
+
dependent migrator has run and is up-to-date. This is a solvable
|
204
|
+
problem, I just haven't needed to solve it with Migratrix yet.)
|
59
205
|
|
60
|
-
## Strategies
|
206
|
+
## Migration Strategies
|
61
207
|
|
62
208
|
Migratrix supports multiple migration strategies:
|
63
209
|
|
@@ -89,9 +235,8 @@ Migratrix supports multiple migration strategies:
|
|
89
235
|
## Slices of Data
|
90
236
|
|
91
237
|
Migratrix supports taking partial slices of data. You can migrate a
|
92
|
-
single record to test a migraton, grab
|
93
|
-
idea for how long a full migration will take
|
94
|
-
migration.
|
238
|
+
single record to test a migration, or grab a few thousand to get an
|
239
|
+
idea for how long a full migration will take.
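The slicing idea can be sketched in plain Ruby. The `:limit` and `:offset` names match options listed in the migration spec, but `slice_records` is a hypothetical helper written for this illustration and says nothing about how Migratrix applies those options internally.

```ruby
# Hypothetical helper: apply :offset and :limit to any enumerable source.
def slice_records(records, options = {})
  sliced = records.drop(options.fetch(:offset, 0))
  options[:limit] ? sliced.first(options[:limit]) : sliced
end

legacy_ids = (1..20_000).to_a

# A single record, to test that the migration works at all.
one_record = slice_records(legacy_ids, limit: 1)

# A few thousand records, to estimate how long a full run will take.
timing_batch = slice_records(legacy_ids, offset: 100, limit: 5_000)
```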
|
95
240
|
|
96
241
|
## Ongoing Migrations
|
97
242
|
|
@@ -107,7 +252,9 @@ can also record the source object, which is useful for debugging or
|
|
107
252
|
handling migration cases where legacy records get changed or deleted
|
108
253
|
after migrating.
|
109
254
|
|
110
|
-
##
|
255
|
+
## Known Limitations and Issues
|
256
|
+
|
257
|
+
### Migration Tests
|
111
258
|
|
112
259
|
Sorry, nothing to see here yet. Migratrix was originally developed in
|
113
260
|
an environment where the migrations were so heavy-duty and hairy that
|
@@ -118,6 +265,12 @@ mostly a note to remind myself that heavy-duty migrations can and
|
|
118
265
|
should be developed in a TDD style, and Migratrix should make this
|
119
266
|
easy.
|
120
267
|
|
268
|
+
### Rake Tasks and/or Rails Generators
|
269
|
+
|
270
|
+
Migratrix is definitely a heavyweight tool for migrating data. It
|
271
|
+
would be nice if the gem provided rake tasks or Rails generators to
|
272
|
+
help with the boilerplate.
|
273
|
+
|
121
274
|
## A note about the name
|
122
275
|
|
123
276
|
In old Latin, -or versus -ix endings aren't just about feminine and
|
@@ -142,3 +295,10 @@ MIT. See the license file.
|
|
142
295
|
|
143
296
|
* David Brady -- github@shinybit.com
|
144
297
|
|
298
|
+
## Contributing
|
299
|
+
|
300
|
+
YES PLEASE! For bugs and other issues, send a pull request. If you
|
301
|
+
would like to extend or change a feature of Migratrix, discussion is
|
302
|
+
also very welcome.
|
303
|
+
|
304
|
+
|
data/bin/migratrix
CHANGED
@@ -1,2 +1,2 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
|
-
puts "Nothing to see here yet --
|
2
|
+
puts "Nothing to see here yet -- call Migratrix.migrate!(:migrator, options) directly from your own driver code (scripts, rake tasks, etc)"
|
data/lib/migratrix.rb
CHANGED
@@ -29,6 +29,7 @@ module Migratrix
|
|
29
29
|
require APP + 'transforms/map'
|
30
30
|
|
31
31
|
require APP + 'loads/load'
|
32
|
+
require APP + 'loads/active_record'
|
32
33
|
require APP + 'loads/no_op'
|
33
34
|
require APP + 'loads/yaml'
|
34
35
|
# require APP + 'loads/csv'
|
@@ -84,6 +85,7 @@ module Migratrix
|
|
84
85
|
register_transform :no_op, Transforms::NoOp
|
85
86
|
|
86
87
|
register_load :load, Loads::Load
|
88
|
+
register_load :active_record, Loads::ActiveRecord
|
87
89
|
register_load :no_op, Loads::NoOp
|
88
90
|
register_load :yaml, Loads::Yaml
|
89
91
|
|
data/lib/migratrix/loads/active_record.rb
ADDED
@@ -0,0 +1,73 @@
|
|
1
|
+
require 'yaml'
|
2
|
+
|
3
|
+
module Migratrix
|
4
|
+
module Loads
|
5
|
+
# An ActiveRecord-based Load that tries to update existing objects
|
6
|
+
# rather than always doing new saves. If :update is true, before
|
7
|
+
# saving we attempt to find the object by the primary_key column.
|
8
|
+
# If found, we call .update_attributes on that record instead of
|
9
|
+
# .save.
|
10
|
+
#
|
11
|
+
# TODO: Verify that update_attributes still calls callbacks, e.g.
|
12
|
+
# validations and before_save? If not we'll need to load the
|
13
|
+
# object, copy attributes manually from the transformed object,
|
14
|
+
# and save it.
|
15
|
+
#
|
16
|
+
# TODO: primary_key is a bit presumptive. Would be better if it
|
17
|
+
# were a where clause.
|
18
|
+
class ActiveRecord < Load
|
19
|
+
set_valid_options :primary_key, :legacy_key, :finder, :update, :cache_key
|
20
|
+
|
21
|
+
def seen
|
22
|
+
@seen ||= { }
|
23
|
+
end
|
24
|
+
|
25
|
+
def seen?(object)
|
26
|
+
if options[:cache_key]
|
27
|
+
seen[object[options[:cache_key]]]
|
28
|
+
end
|
29
|
+
end
|
30
|
+
|
31
|
+
def seen!(object)
|
32
|
+
if options[:cache_key]
|
33
|
+
seen[object[options[:cache_key]]] = true
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
def load(transformed_objects)
|
38
|
+
transformed_objects.each do |transformed_object|
|
39
|
+
next if seen?(transformed_object)
|
40
|
+
if options[:update]
|
41
|
+
object = if options[:finder]
|
42
|
+
options[:finder].call(transformed_object)
|
43
|
+
elsif options[:primary_key] && options[:legacy_key]
|
44
|
+
transformed_object.class.where("#{options[:primary_key]}=?", transformed_object[options[:legacy_key]]).first
|
45
|
+
end
|
46
|
+
if object
|
47
|
+
update_object object, transformed_object
|
48
|
+
else
|
49
|
+
save_object transformed_object
|
50
|
+
end
|
51
|
+
else
|
52
|
+
save_object transformed_object
|
53
|
+
end
|
54
|
+
end
|
55
|
+
transformed_objects
|
56
|
+
end
|
57
|
+
|
58
|
+
def save_object(transformed_object)
|
59
|
+
return if seen?(transformed_object)
|
60
|
+
transformed_object.save
|
61
|
+
seen! transformed_object
|
62
|
+
transformed_object
|
63
|
+
end
|
64
|
+
|
65
|
+
def update_object(original_object, transformed_object)
|
66
|
+
return if seen?(transformed_object)
|
67
|
+
original_object.update_attributes transformed_object.attributes
|
68
|
+
seen! original_object
|
69
|
+
original_object
|
70
|
+
end
|
71
|
+
end
|
72
|
+
end
|
73
|
+
end
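The update-or-save behaviour with a "seen" cache can be illustrated without ActiveRecord. Everything in this sketch is hypothetical (`FakeTable`, `UpsertLoader`, plain hashes as records); it only mirrors the control flow of the load above: skip records already seen, update a record that already exists, otherwise save a new one.

```ruby
# Hypothetical in-memory stand-in for a database table keyed by :id.
class FakeTable
  attr_reader :rows

  def initialize
    @rows = {}
  end

  def find(id)
    @rows[id]
  end

  def save(record)
    @rows[record[:id]] = record.dup
  end
end

# Hypothetical loader mirroring the skip / update / save decision.
class UpsertLoader
  def initialize(table, cache_key: nil)
    @table = table
    @cache_key = cache_key
    @seen = {}
  end

  def load(records)
    records.each do |record|
      next if seen?(record)               # already handled in this run
      existing = @table.find(record[:id])
      if existing
        existing.merge!(record)           # update the stored record in place
      else
        @table.save(record)               # no match found, create a new row
      end
      seen!(record)
    end
    records
  end

  private

  def seen?(record)
    @cache_key && @seen[record[@cache_key]]
  end

  def seen!(record)
    @seen[record[@cache_key]] = true if @cache_key
  end
end

table = FakeTable.new
table.save(id: 1, name: "Uta")

loader = UpsertLoader.new(table, cache_key: :id)
loader.load([{ id: 1, name: "Utah" },
             { id: 2, name: "Nevada" },
             { id: 2, name: "duplicate" }])
```

Without a `cache_key` nothing is remembered, so every record is processed; with one, the third record is skipped because a record with the same id was already loaded in this run.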
|
data/lib/migratrix/migration.rb
CHANGED
@@ -34,11 +34,11 @@ module Migratrix
|
|
34
34
|
opts += transform.valid_options
|
35
35
|
end
|
36
36
|
end
|
37
|
-
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
|
37
|
+
if loads
|
38
|
+
loads.each do |name, load|
|
39
|
+
opts += load.valid_options
|
40
|
+
end
|
41
|
+
end
|
42
42
|
opts.uniq.sort
|
43
43
|
end
|
44
44
|
|
@@ -86,6 +86,8 @@ module Migratrix
|
|
86
86
|
self.class.extractions
|
87
87
|
end
|
88
88
|
|
89
|
+
# TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
|
90
|
+
|
89
91
|
# transform crap
|
90
92
|
# set_transform :nickname, :registered_name, options_hash
|
91
93
|
# set_transform :nickname, :registered_name # options = {}
|
@@ -128,6 +130,8 @@ module Migratrix
|
|
128
130
|
self.class.transforms
|
129
131
|
end
|
130
132
|
|
133
|
+
# TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
|
134
|
+
|
131
135
|
# load crap
|
132
136
|
# set_load :nickname, :registered_name, options_hash
|
133
137
|
# set_load :nickname, :registered_name # options = {}
|
data/lib/migratrix/transforms/transform.rb
CHANGED
@@ -78,18 +78,6 @@ module Migratrix
|
|
78
78
|
# parts.
|
79
79
|
#
|
80
80
|
# ----------------------------------------------------------------------
|
81
|
-
# Map's strategy, as used by PetsMigration
|
82
|
-
#
|
83
|
-
# create_transformed_collection -> Hash.new
|
84
|
-
# create_new_object -> Hash.new
|
85
|
-
# transformation -> {:id => :id, :name => :name }
|
86
|
-
# extract_attribute -> object[attribute_or_extract]
|
87
|
-
# apply_attribute -> object[attribute] = attribute_or_apply
|
88
|
-
# finalize_object -> no-op
|
89
|
-
# store_transformed_object -> collection[object[:id]] = object
|
90
|
-
# ----------------------------------------------------------------------
|
91
|
-
#
|
92
|
-
# ----------------------------------------------------------------------
|
93
81
|
# Default strategy:
|
94
82
|
#
|
95
83
|
# create_transformed_collection -> Array.new
|
data/spec/lib/migratrix/migration_spec.rb
CHANGED
@@ -170,7 +170,7 @@ describe Migratrix::Migration do
|
|
170
170
|
|
171
171
|
describe ".valid_options" do
|
172
172
|
it "returns valid options from itself and components" do
|
173
|
-
TestMigration.valid_options.should == [:console, :fetchall, :limit, :map, :offset, :order, :where]
|
173
|
+
TestMigration.valid_options.should == [:console, :fetchall, :filename, :limit, :map, :offset, :order, :where]
|
174
174
|
end
|
175
175
|
end
|
176
176
|
end
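The new expectation follows from the migration.rb change, which now collects valid options from loads as well as from extractions and transforms. A small plain-Ruby sketch of that aggregation is below; `Component` and `combined_valid_options` are hypothetical, and which component contributes which option is an assumption made for the illustration.

```ruby
# Hypothetical component that advertises the options it understands.
Component = Struct.new(:valid_options)

# Merge the migration's own options with those of every registered
# component, then de-duplicate and sort, as the spec expects.
def combined_valid_options(own, *component_groups)
  opts = own.dup
  component_groups.each do |group|
    next unless group                      # a group may not be configured
    group.each { |_name, component| opts += component.valid_options }
  end
  opts.uniq.sort
end

# Assumed assignment of options to components, for illustration only.
extractions = { default: Component.new([:fetchall, :limit, :offset, :order, :where]) }
transforms  = { default: Component.new([:map]) }
loads       = { default: Component.new([:filename]) }

options = combined_valid_options([:console], extractions, transforms, loads)
```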
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: migratrix
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.8.
|
4
|
+
version: 0.8.7
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,11 +9,11 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date:
|
12
|
+
date: 2012-04-20 00:00:00.000000000Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: trollop
|
16
|
-
requirement: &
|
16
|
+
requirement: &2152597120 !ruby/object:Gem::Requirement
|
17
17
|
none: false
|
18
18
|
requirements:
|
19
19
|
- - ! '>='
|
@@ -21,7 +21,7 @@ dependencies:
|
|
21
21
|
version: '0'
|
22
22
|
type: :runtime
|
23
23
|
prerelease: false
|
24
|
-
version_requirements: *
|
24
|
+
version_requirements: *2152597120
|
25
25
|
description: Migratrix, a Rails legacy database migration tool supporting multiple
|
26
26
|
strategies, including arbitrary n-ary migrations (1->n, n->1, n->m), arbitrary inputs
|
27
27
|
and outputs (ActiveRecord, bare SQL, CSV) and migration logging
|
@@ -42,6 +42,7 @@ files:
|
|
42
42
|
- lib/migratrix/extractions/active_record.rb
|
43
43
|
- lib/migratrix/extractions/extraction.rb
|
44
44
|
- lib/migratrix/extractions/no_op.rb
|
45
|
+
- lib/migratrix/loads/active_record.rb
|
45
46
|
- lib/migratrix/loads/load.rb
|
46
47
|
- lib/migratrix/loads/no_op.rb
|
47
48
|
- lib/migratrix/loads/yaml.rb
|
@@ -102,7 +103,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
102
103
|
version: '0'
|
103
104
|
requirements: []
|
104
105
|
rubyforge_project:
|
105
|
-
rubygems_version: 1.8.
|
106
|
+
rubygems_version: 1.8.10
|
106
107
|
signing_key:
|
107
108
|
specification_version: 3
|
108
109
|
summary: Rails 3 legacy database migratrion tool supporting multiple strategies
|