migratrix 0.8.5 → 0.8.7
- data/README.md +190 -30
- data/bin/migratrix +1 -1
- data/lib/migratrix.rb +2 -0
- data/lib/migratrix/loads/active_record.rb +73 -0
- data/lib/migratrix/migration.rb +9 -5
- data/lib/migratrix/transforms/transform.rb +0 -12
- data/spec/lib/migratrix/migration_spec.rb +1 -1
- metadata +6 -5
data/README.md
CHANGED
@@ -4,17 +4,89 @@ Dominate your legacy Rails migrations! Migratrix is a gem to help you
 generate and control Migrations, which extract data from legacy systems
 and import them into your current one.
 
-##
+## General Info
 
-
-
-
-
-
+Migratrix is a framework that supports various migration strategies.
+You tell it how you want to connect to a data source, define any
+transformations, and then describe the "load target": how and where
+you want the new data to come out. You can, of course, extract data
+from a legacy database and load it into your all-new database, but you
+can also read and write from logs, flat files, or even external
+services.
 
-
+If you can get at the data from Ruby, Migratrix can use it as a source
+of extractable data. If you can get at the new storage mechanism from
+Ruby, Migratrix can use it as a load target.
+
+You can extract data from an API and load it into your database. You
+can extract enumerables from a database and load it to source code
+that defines constants for the enumeration.
+
+## Weapons-Grade Migrations
+
+Migratrix tries to keep simple things simple, but really shines when
+it scales.
+
+Migratrix is intended to be as simple and easy to use as possible for
+small cases, but to have no problems scaling upwards. As a result if
+you have an ultra-simple migration that one developer on the team will
+only ever run once, Migratrix will work but may not be the best tool.
+
+If, however, you have
+
+* Complex and/or complicated migrations
+
+* Migrations that need to be run more than once, especially if they
+need to be run frequently and/or regularly
+
+* Migrations that need to be run by multiple developers or by
+distributed machines
+
+then your migration strategy is complicated enough that it needs to be
+part of the codebase and knowledgebase for your project, and you want
+a heavier-duty migration tool. Migratrix is the perfect tool in that
+situation.
+
+
+## Object-Oriented Data Transformation
+
+The first few times you write a special-purpose data migration tool,
+you're going to be tempted to just say "here's a hash for the source,
+here's a hash for the destination, and here's a function to transform
+from one to the other". And it will work. But then the extra
+requirements start to roll in:
+
+* We have 20 million rows, can you do the migration in batches?
+
+* We need to split the users table into users and addresses tables,
+how hard is that?
 
-
+* We have a TON of duplicate data, can you merge it down when you
+migrate?
+
+* Can we just migrate one or two exemplar rows to test the migration
+tool?
+
+And my all-time personal favorite:
+
+* We've decided we want to keep the old site live while the beta site
+is running, can you keep the databases in sync?
+
+I started writing Migratrix when my third client asked me to keep a
+legacy site and a beta site synchronized. For me this tool reduced the
+problem from "utterly impossible" to "merely very difficult". :-)
+
+As a result, Migratrix is very much object based. You get at a data
+source by using an Extract object. If Migratrix does not support the
+kind of extraction you want to do, it provides the Extract base class
+so you can write your own extractor.
+
+Then the data is handed off to a Transform object, and finally it is
+given to a Load object.
+
+Migratrix provides a handful of Extract, Transform and Load classes to
+handle common types of migration (reading and writing models with
+ActiveRecord, writing YAML files, etc).
 
 ## Motivation
 
@@ -22,42 +94,116 @@ So... much... legacy... data....
 
 ## Rails and Ruby Requirements
 
-### Rails 3
+### Rails 3
 
-Migratrix
-
-
-
+Migratrix depends on Rails 3 for its ActiveRecord and ActiveSupport
+libraries. If your project was written in an older version of Rails,
+and you want to use ActiveRecord as your extractor, you may need to
+define new models that are compatible with Rails 3. But try just
+loading the old models first; as long as the models aren't too
+complicated or interconnected, they may work.
 
-### Ruby 1.9
+### Ruby 1.9.2 or later
 
 Because I can.
 
-##
+## Examples
 
-
+### ETL
 
 I use the term "ETL" here in a loosely similar mechanism as in data
 warehousing: Extract, Transform and Load. Migratrix approaches
 migrations in three phases:
 
-* **Extract** The Migration obtains the legacy data from 1 or more
-
-
-
-outputs
-
+* **Extract** The Migration obtains the legacy data from 1 or more sources
+
+* **Transform** The Migration transforms the data into 1 or more outputs
+
 * **Load** The Migration saves the data into the new database or other
-
-
+output(s)
+
+### General Structure
+
+I like to create a folder structure in db/legacy with a /migrations
+folder and a /models folder. Then I write rake tasks to handle the
+more common migrations. Migration classes go in /migrations,
+naturally; /models is where I store ActiveRecord models that must
+access the legacy database. It is important to keep these namespaced
+so that your `User`model doesn't conflict with your `Legacy::User`
+model.
+
+### Simple Example: Straight-up ActiveRecord Copy
+
+Let's say your legacy app has a simple table that you want to keep
+unchanged. Migratrix can bring this over using straight ActiveRecord
+copies:
+
+TODO: Write Sample App so these examples make sense
+TODO: And then come write this example
+
+
+### Slightly More Complicated: Defining your own extensible class
+
+Let's say your legacy app has a table that stores what amount to
+constants in a table in the database. Either the data never changes
+(in fact, you don't even have an admin page to edit the data), or the
+data changes rarely enough that you're willing to restart the
+webserver if the data DOES change. It's also really simple data, and
+has no dependencies. For the sake of debmonstration, let's say that
+your app stores all 50 US States and their abbreviations. (And you put
+it in the database because ONE day you were SURE you were going to
+need to add Canadian provinces or Japanese boroughs.)
+
+Let's migrate this to a YAML file and store it in config/constants,
+and at boot time we'll write some code to load the states.yml file and
+create a frozen hash called MyApp::STATES.
+
+This should be straightforward at this point, but wait. It also turns
+out we need to do the same thing with the countries table, the
+shipping_carriers table, and about six other tables. They're all the
+same: we need to extract from the legacy database and write to a
+yaml file in the constants folder.
+
+We can do that. Here's the entire migration for States:
+
+    require 'constants_migration'
+    module Legacy
+      class StatesMigration < ConstantsMigration
+        extend_extraction source: Legacy::State
+        extend_transform transform: { id: :id, name: :name, abbreviation: :state_code }
+        extend_load filename: MyApp::CONSTANTS_PATH + "states.yml"
+      end
+    end
+
+And the reason this works is that you've written your own custom
+Migration class, which looks like this:
+
+    module Legacy
+      # Migrates a Legacy ActiveRecord table through a transform to a
+      # constants file. Child classes should extend_extraction with
+      # :source, transform with a :transform hash, and load with the
+      # filename to write to.
+      class ConstantsMigration < Migratrix::Migration
+        set_extraction :active_record, source: Model
+        set_transform :transform, {
+          transform_collection: Hash,
+          transform_class: Hash,
+          extract_attribute: :[],
+          apply_attribute: :[]=,
+          store_transformed_object: ->(object,collection){ collection[object[:id]] = object }
+        }
+        set_load :yaml
+      end
+    end
 
 ## Migration Dependencies
 
 Migratrix isn't quite smart enough to know that a migrator depends on
 another migrator. (Actually, that's easy. What's hard is knowing if a
-dependent migrator has run and is up-to-date.
+dependent migrator has run and is up-to-date. This is a solvable
+problem, I just haven't needed to solve it with Migratrix yet.)
 
-## Strategies
+## Migration Strategies
 
 Migratrix supports multiple migration strategies:
 
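The options that the README's `ConstantsMigration` passes to `set_transform` read like a strategy: which collection and object classes to build, how to read an attribute from the source, how to write it to the target, and how to store the finished object. The sketch below walks through that flow in plain Ruby. It does not call Migratrix; `map_transform` and `LegacyState` are hypothetical names, and treating the mapping hash as target attribute => source attribute is an assumption based on the `abbreviation: :state_code` entry.

```ruby
# Hypothetical stand-in for Legacy::State rows; Struct supports row[:attr].
LegacyState = Struct.new(:id, :name, :state_code)

# Assumed reading of the strategy options shown in the README example.
STRATEGY = {
  transform_collection: Hash,
  transform_class: Hash,
  extract_attribute: :[],
  apply_attribute: :[]=,
  store_transformed_object: ->(object, collection) { collection[object[:id]] = object }
}

# Hypothetical helper: applies a mapping of target attribute => source
# attribute to every extracted row, following the strategy options.
def map_transform(extracted, mapping, strategy)
  collection = strategy[:transform_collection].new
  extracted.each do |source|
    target = strategy[:transform_class].new
    mapping.each do |target_attribute, source_attribute|
      value = source.send(strategy[:extract_attribute], source_attribute)
      target.send(strategy[:apply_attribute], target_attribute, value)
    end
    strategy[:store_transformed_object].call(target, collection)
  end
  collection
end

rows = [LegacyState.new(1, "Alabama", "AL"), LegacyState.new(2, "Alaska", "AK")]
TRANSFORMED_STATES = map_transform(
  rows,
  { id: :id, name: :name, abbreviation: :state_code },
  STRATEGY
)
```

Because every step is looked up in the options hash, swapping `Hash` for another class or `:[]` for another reader method changes the output shape without touching the loop, which seems to be the point of the README's extensible-class example.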
@@ -89,9 +235,8 @@ Migratrix supports multiple migration strategies:
 ## Slices of Data
 
 Migratrix supports taking partial slices of data. You can migrate a
-single record to test a migraton, grab
-idea for how long a full migration will take
-migration.
+single record to test a migraton, or grab a few thousand to get an
+idea for how long a full migration will take.
 
 ## Ongoing Migrations
 
@@ -107,7 +252,9 @@ can also record the source object, which is useful for debugging or
 handling migration cases where legacy records get changed or deleted
 after migrating.
 
-##
+## Known Limitations and Issues
+
+### Migration Tests
 
 Sorry, nothing to see here yet. Migratrix was originally developed in
 an environment where the migrations were so heavy-duty and hairy that
@@ -118,6 +265,12 @@ mostly a note to remind myself that that heavy-duty migrations can and
 should be developed in a TDD style, and Migratrix should make this
 easy.
 
+### Rake Tasks and/or Rails Generators
+
+Migratrix is definitely a heavyweight tool for migrating data. It
+would be nice if the gem provided rake tasks or Rails generators to
+help with the boilerplate.
+
 ## A note about the name
 
 In old Latin, -or versus -ix endings aren't just about feminine and
@@ -142,3 +295,10 @@ MIT. See the license file.
 
 * David Brady -- github@shinybit.com
 
+## Contributing
+
+YES PLEASE! For bugs and other issues, send a pull request. If you
+would like to extend or change a feature of Migratrix, discussion is
+also very welcome.
+
+
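The README's states example says the application will load `states.yml` at boot and expose it as a frozen hash called `MyApp::STATES`, but the diff does not show that code. One way it might look is sketched here. The YAML layout (records keyed by id, with string attribute names) and the `deep_freeze` helper are assumptions for illustration; a real app would presumably read the file with `YAML.load_file` rather than an inline string.

```ruby
require 'yaml'

# Hypothetical contents of config/constants/states.yml.
STATES_YAML = <<YAML
1:
  id: 1
  name: Alabama
  abbreviation: AL
2:
  id: 2
  name: Alaska
  abbreviation: AK
YAML

module MyApp
  # Hypothetical helper: freezes a nested structure of hashes and arrays
  # so that neither the collection nor its records can be modified.
  def self.deep_freeze(value)
    case value
    when Hash
      value.each do |key, nested|
        deep_freeze(key)
        deep_freeze(nested)
      end
    when Array
      value.each { |nested| deep_freeze(nested) }
    end
    value.freeze
  end

  # In a real app: deep_freeze(YAML.load_file(path_to_states_yml))
  STATES = deep_freeze(YAML.load(STATES_YAML))
end
```

Freezing the nested records as well as the outer hash means an accidental `MyApp::STATES[1]["name"] = "..."` raises instead of silently changing a constant.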
data/bin/migratrix
CHANGED
@@ -1,2 +1,2 @@
 #!/usr/bin/env ruby
-puts "Nothing to see here yet --
+puts "Nothing to see here yet -- call Migratrix.migrate!(:migrator, options) directly from your own driver code (scripts, rake tasks, etc)"
data/lib/migratrix.rb
CHANGED
@@ -29,6 +29,7 @@ module Migratrix
   require APP + 'transforms/map'
 
   require APP + 'loads/load'
+  require APP + 'loads/active_record'
   require APP + 'loads/no_op'
   require APP + 'loads/yaml'
   # require APP + 'loads/csv'
@@ -84,6 +85,7 @@ module Migratrix
   register_transform :no_op, Transforms::NoOp
 
   register_load :load, Loads::Load
+  register_load :active_record, Loads::ActiveRecord
   register_load :no_op, Loads::NoOp
   register_load :yaml, Loads::Yaml
 
data/lib/migratrix/loads/active_record.rb
ADDED
@@ -0,0 +1,73 @@
+require 'yaml'
+
+module Migratrix
+  module Loads
+    # An ActiveRecord-based Load that tries to update existing objects
+    # rather than always doing new saves. If :update is true, before
+    # saving we attempt to find the object by the primary_key column.
+    # If found, we call .update_attributes on that record instead of
+    # .save.
+    #
+    # TODO: Verify that update_attributes still calls callbacks, e.g.
+    # validations and before_save? If not we'll need to load the
+    # object, copy attributes manually from the transformed object,
+    # and save it.
+    #
+    # TODO: primary_key is a bit presumptive. Would be better if it
+    # were a where clause.
+    class ActiveRecord < Load
+      set_valid_options :primary_key, :legacy_key, :finder, :update, :cache_key
+
+      def seen
+        @seen ||= { }
+      end
+
+      def seen?(object)
+        if options[:cache_key]
+          seen[object[options[:cache_key]]]
+        end
+      end
+
+      def seen!(object)
+        if options[:cache_key]
+          seen[object[options[:cache_key]]] = true
+        end
+      end
+
+      def load(transformed_objects)
+        transformed_objects.each do |transformed_object|
+          next if seen?(transformed_object)
+          if options[:update]
+            object = if options[:finder]
+                       options[:finder].call(transformed_object)
+                     elsif options[:primary_key] && options[:legacy_key]
+                       transformed_object.class.where("#{options[:primary_key]}=?", transformed_object[options[:legacy_key]]).first
+                     end
+            if object
+              update_object object, transformed_object
+            else
+              save_object transformed_object
+            end
+          else
+            save_object transformed_object
+          end
+        end
+        transformed_objects
+      end
+
+      def save_object(transformed_object)
+        return if seen?(transformed_object)
+        transformed_object.save
+        seen! transformed_object
+        transformed_object
+      end
+
+      def update_object(original_object, transformed_object)
+        return if seen?(transformed_object)
+        original_object.update_attributes transformed_object.attributes
+        seen! original_object
+        original_object
+      end
+    end
+  end
+end
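The new `Loads::ActiveRecord` class combines three decisions per object: skip it if its cache key was already seen, update an existing record if `:update` is set and a finder locates one, and otherwise save it as new. The sketch below reproduces that decision flow without a database so it can be run on its own. `FakeRecord` and `SketchUpsertLoad` are hypothetical stand-ins, only the `:finder` branch is modelled (not the `:primary_key`/`:legacy_key` lookup), and none of this is the gem's code.

```ruby
# Hypothetical in-memory stand-in for an ActiveRecord model.
class FakeRecord
  STORE = {}

  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes.dup
  end

  def [](key)
    attributes[key]
  end

  def save
    STORE[self[:id]] = self
    true
  end

  def update_attributes(new_attributes)
    attributes.merge!(new_attributes)
    true
  end

  def self.find_by_id(id)
    STORE[id]
  end
end

# Simplified load loop: skip cached keys, update when the finder returns
# a record, otherwise save the transformed object as a new record.
class SketchUpsertLoad
  def initialize(options)
    @options = options
    @seen = {}
  end

  def load(transformed_objects)
    transformed_objects.each do |transformed|
      key = @options[:cache_key] ? transformed[@options[:cache_key]] : nil
      next if key && @seen[key]

      existing = nil
      if @options[:update] && @options[:finder]
        existing = @options[:finder].call(transformed)
      end

      if existing
        existing.update_attributes(transformed.attributes)
      else
        transformed.save
      end
      @seen[key] = true if key
    end
    transformed_objects
  end
end

# One record already exists in the "new" database.
FakeRecord.new(id: 1, name: "Old name").save

loader = SketchUpsertLoad.new(
  update: true,
  cache_key: :id,
  finder: ->(record) { FakeRecord.find_by_id(record[:id]) }
)

loader.load([
  FakeRecord.new(id: 1, name: "New name"),  # found by the finder, so updated
  FakeRecord.new(id: 1, name: "Duplicate"), # same cache key, so skipped
  FakeRecord.new(id: 2, name: "Fresh")      # not found, so saved
])
```

The cache only takes effect when `:cache_key` is given, which matches the `if options[:cache_key]` guards in `seen?` and `seen!` in the added file.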
data/lib/migratrix/migration.rb
CHANGED
@@ -34,11 +34,11 @@ module Migratrix
         opts += transform.valid_options
       end
     end
-
-
-
-
-
+    if loads
+      loads.each do |name, load|
+        opts += load.valid_options
+      end
+    end
     opts.uniq.sort
   end
 
@@ -86,6 +86,8 @@ module Migratrix
     self.class.extractions
   end
 
+  # TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
+
   # transform crap
   # set_transform :nickname, :registered_name, options_hash
   # set_transform :nickname, :registered_name # options = {}
@@ -128,6 +130,8 @@ module Migratrix
     self.class.transforms
   end
 
+  # TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
+
   # load crap
   # set_load :nickname, :registered_name, options_hash
   # set_load :nickname, :registered_name # options = {}
data/lib/migratrix/transforms/transform.rb
CHANGED
@@ -78,18 +78,6 @@ module Migratrix
 # parts.
 #
 # ----------------------------------------------------------------------
-# Map's strategy, as used by PetsMigration
-#
-# create_transformed_collection -> Hash.new
-# create_new_object -> Hash.new
-# transformation -> {:id => :id, :name => :name }
-# extract_attribute -> object[attribute_or_extract]
-# apply_attribute -> object[attribute] = attribute_or_apply
-# finalize_object -> no-op
-# store_transformed_object -> collection[object[:id]] = object
-# ----------------------------------------------------------------------
-#
-# ----------------------------------------------------------------------
 # Default strategy:
 #
 # create_transformed_collection -> Array.new
data/spec/lib/migratrix/migration_spec.rb
CHANGED
@@ -170,7 +170,7 @@ describe Migratrix::Migration do
 
   describe ".valid_options" do
     it "returns valid options from itself and components" do
-      TestMigration.valid_options.should == [:console, :fetchall, :limit, :map, :offset, :order, :where]
+      TestMigration.valid_options.should == [:console, :fetchall, :filename, :limit, :map, :offset, :order, :where]
     end
   end
 end
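The spec change follows from the `migration.rb` hunk: once `valid_options` also walks the migration's loads, an option declared by a load (here `:filename`) shows up in the sorted, de-duplicated list. The sketch below shows that aggregation in isolation. The class names are hypothetical, and which component contributes which option is an assumption; only `set_valid_options`, `valid_options`, and the `uniq.sort` step come from the diff.

```ruby
# Hypothetical base class: each component declares the options it accepts.
class SketchComponent
  def self.set_valid_options(*options)
    @valid_options = options
  end

  def self.valid_options
    @valid_options || []
  end
end

# Assumed split of options between components, for illustration only.
class SketchExtraction < SketchComponent
  set_valid_options :fetchall, :limit, :offset, :order, :where
end

class SketchMapTransform < SketchComponent
  set_valid_options :map
end

class SketchYamlLoad < SketchComponent
  set_valid_options :filename
end

class SketchMigration
  def self.extractions
    { default: SketchExtraction }
  end

  def self.transforms
    { default: SketchMapTransform }
  end

  def self.loads
    { default: SketchYamlLoad }
  end

  # Collects the migration's own options plus those of every registered
  # extraction, transform, and load, then de-duplicates and sorts them.
  def self.valid_options
    opts = [:console]
    [extractions, transforms, loads].each do |components|
      next unless components
      components.each do |_name, component|
        opts += component.valid_options
      end
    end
    opts.uniq.sort
  end
end

VALID_OPTIONS = SketchMigration.valid_options
```

Under these assumptions `VALID_OPTIONS` matches the array the updated spec expects; before the `loads` loop was added, `:filename` would have been missing.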
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: migratrix
 version: !ruby/object:Gem::Version
-  version: 0.8.
+  version: 0.8.7
   prerelease:
 platform: ruby
 authors:
@@ -9,11 +9,11 @@ authors:
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2012-04-20 00:00:00.000000000Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: trollop
-  requirement: &
+  requirement: &2152597120 !ruby/object:Gem::Requirement
     none: false
     requirements:
     - - ! '>='
@@ -21,7 +21,7 @@ dependencies:
         version: '0'
   type: :runtime
   prerelease: false
-  version_requirements: *
+  version_requirements: *2152597120
 description: Migratrix, a Rails legacy database migration tool supporting multiple
   strategies, including arbitrary n-ary migrations (1->n, n->1, n->m), arbitrary inputs
   and outputs (ActiveRecord, bare SQL, CSV) and migration logging
@@ -42,6 +42,7 @@ files:
 - lib/migratrix/extractions/active_record.rb
 - lib/migratrix/extractions/extraction.rb
 - lib/migratrix/extractions/no_op.rb
+- lib/migratrix/loads/active_record.rb
 - lib/migratrix/loads/load.rb
 - lib/migratrix/loads/no_op.rb
 - lib/migratrix/loads/yaml.rb
@@ -102,7 +103,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 1.8.
+rubygems_version: 1.8.10
 signing_key:
 specification_version: 3
 summary: Rails 3 legacy database migratrion tool supporting multiple strategies