migratrix 0.8.5 → 0.8.7

data/README.md CHANGED
@@ -4,17 +4,89 @@ Dominate your legacy Rails migrations! Migratrix is a gem to help you
4
4
  generate and control Migrations, which extract data from legacy systems
5
5
  and import them into your current one.
6
6
 
7
- ## Warning: Experimental Developmental In-Progress Stuff
7
+ ## General Info
8
8
 
9
- I am currently extracting Migratrix from an ancient legacy codebase.
10
- (Oh the irony.) A lot of the stuff I say Migratrix supports is stuff
11
- that it supports over there, not in here. Be aware that most of this
12
- document is more of a TODO list than a statement of fact. I'll remove
13
- this message once Migratrix does what it says on the tin.
9
+ Migratrix is a framework that supports various migration strategies.
10
+ You tell it how you want to connect to a data source, define any
11
+ transformations, and then describe the "load target": how and where
12
+ you want the new data to come out. You can, of course, extract data
13
+ from a legacy database and load it into your all-new database, but you
14
+ can also read and write from logs, flat files, or even external
15
+ services.
14
16
 
15
- ## General Info
17
+ If you can get at the data from Ruby, Migratrix can use it as a source
18
+ of extractable data. If you can get at the new storage mechanism from
19
+ Ruby, Migratrix can use it as a load target.
20
+
21
+ You can extract data from an API and load it into your database. You
22
+ can extract enumerables from a database and load them into source code
23
+ that defines constants for the enumeration.
24
+
25
+ ## Weapons-Grade Migrations
26
+
27
+ Migratrix tries to keep simple things simple, but really shines when
28
+ it scales.
29
+
30
+ Migratrix is intended to be as simple and easy to use as possible for
31
+ small cases, but to have no problems scaling upwards. As a result, if
32
+ you have an ultra-simple migration that one developer on the team will
33
+ only ever run once, Migratrix will work but may not be the best tool.
34
+
35
+ If, however, you have
36
+
37
+ * Complex and/or complicated migrations
38
+
39
+ * Migrations that need to be run more than once, especially if they
40
+ need to be run frequently and/or regularly
41
+
42
+ * Migrations that need to be run by multiple developers or by
43
+ distributed machines
44
+
45
+ then your migration strategy is complicated enough that it needs to be
46
+ part of the codebase and knowledgebase for your project, and you want
47
+ a heavier-duty migration tool. Migratrix is the perfect tool in that
48
+ situation.
49
+
50
+
51
+ ## Object-Oriented Data Transformation
52
+
53
+ The first few times you write a special-purpose data migration tool,
54
+ you're going to be tempted to just say "here's a hash for the source,
55
+ here's a hash for the destination, and here's a function to transform
56
+ from one to the other". And it will work. But then the extra
57
+ requirements start to roll in:
58
+
59
+ * We have 20 million rows, can you do the migration in batches?
60
+
61
+ * We need to split the users table into users and addresses tables,
62
+ how hard is that?
16
63
 
17
- Migratrix is a legacy migration tool strategy
64
+ * We have a TON of duplicate data, can you merge it down when you
65
+ migrate?
66
+
67
+ * Can we just migrate one or two exemplar rows to test the migration
68
+ tool?
69
+
70
+ And my all-time personal favorite:
71
+
72
+ * We've decided we want to keep the old site live while the beta site
73
+ is running, can you keep the databases in sync?
74
+
75
+ I started writing Migratrix when my third client asked me to keep a
76
+ legacy site and a beta site synchronized. For me this tool reduced the
77
+ problem from "utterly impossible" to "merely very difficult". :-)
78
+
79
+ As a result, Migratrix is very much object based. You get at a data
80
+ source by using an Extract object. If Migratrix does not support the
81
+ kind of extraction you want to do, it provides the Extract base class
82
+ so you can write your own extractor.
83
+
84
+ Then the data is handed off to a Transform object, and finally it is
85
+ given to a Load object.
86
+
87
+ Migratrix provides a handful of Extract, Transform and Load classes to
88
+ handle common types of migration (reading and writing models with
89
+ ActiveRecord, writing YAML files, etc).
18
90
 
19
91
  ## Motivation
20
92
 
@@ -22,42 +94,116 @@ So... much... legacy... data....
22
94
 
23
95
  ## Rails and Ruby Requirements
24
96
 
25
- ### Rails 3 Only
97
+ ### Rails 3
26
98
 
27
- Migratrix was originally developed under Rails 2, but come on. Rails 2
28
- apps are legacy SOURCES now, not destinations. Migratrix requires
29
- Rails 3. Once everything's in place I'll bump the Migratrix version to
30
- 3.x to indicate that Migratrix is in keeping with Rails 3.
99
+ Migratrix depends on Rails 3 for its ActiveRecord and ActiveSupport
100
+ libraries. If your project was written in an older version of Rails,
101
+ and you want to use ActiveRecord as your extractor, you may need to
102
+ define new models that are compatible with Rails 3. But try just
103
+ loading the old models first; as long as the models aren't too
104
+ complicated or interconnected, they may work.
31
105
 
32
- ### Ruby 1.9
106
+ ### Ruby 1.9.2 or later
33
107
 
34
108
  Because I can.
35
109
 
36
- ## Example
110
+ ## Examples
37
111
 
38
- ## ETL
112
+ ### ETL
39
113
 
40
114
  I use the term "ETL" here in roughly the same sense as in data
41
115
  warehousing: Extract, Transform and Load. Migratrix approaches
42
116
  migrations in three phases:
43
117
 
44
- * **Extract** The Migration obtains the legacy data from 1 or more
45
- sources
46
-
47
- * **Transform** The Migration transforms the data into 1 or more
48
- outputs
49
-
118
+ * **Extract** The Migration obtains the legacy data from 1 or more sources
119
+
120
+ * **Transform** The Migration transforms the data into 1 or more outputs
121
+
50
122
  * **Load** The Migration saves the data into the new database or other
51
- output(s)
52
-
123
+ output(s)
124
+
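The three phases can be illustrated with a minimal plain-Ruby sketch. It does not use Migratrix itself: the `extract`, `transform` and `load_into` methods, the `LEGACY_ROWS` data and the `MAPPING` hash are all hypothetical stand-ins, chosen only to show how data might flow from one phase to the next.

```ruby
# Hypothetical legacy rows; in a real migration these would come from a
# legacy database, a log, a flat file, or an external service.
LEGACY_ROWS = [
  { id: 1, name: "Utah", state_code: "UT" },
  { id: 2, name: "Ohio", state_code: "OH" }
].freeze

# Extract: obtain the legacy data from a source (here, an in-memory array).
def extract(source)
  source.map(&:dup)
end

# Transform: build new records whose keys are the new attribute names and
# whose values are read from the corresponding legacy attributes.
def transform(rows, mapping)
  rows.map do |row|
    mapping.each_with_object({}) do |(new_key, old_key), out|
      out[new_key] = row[old_key]
    end
  end
end

# Load: save the transformed records into the target (here, a hash keyed
# by id, standing in for the new database or other output).
def load_into(target, records)
  records.each { |record| target[record[:id]] = record }
  target
end

# Assumed mapping of new attribute name => legacy attribute name.
MAPPING = { id: :id, name: :name, abbreviation: :state_code }.freeze

STORE = load_into({}, transform(extract(LEGACY_ROWS), MAPPING))
```

Keeping each phase behind its own method is what lets one part be swapped out (a different source, a different output) without touching the other two.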
125
+ ### General Structure
126
+
127
+ I like to create a folder structure in db/legacy with a /migrations
128
+ folder and a /models folder. Then I write rake tasks to handle the
129
+ more common migrations. Migration classes go in /migrations,
130
+ naturally; /models is where I store ActiveRecord models that must
131
+ access the legacy database. It is important to keep these namespaced
132
+ so that your `User` model doesn't conflict with your `Legacy::User`
133
+ model.
134
+
135
+ ### Simple Example: Straight-up ActiveRecord Copy
136
+
137
+ Let's say your legacy app has a simple table that you want to keep
138
+ unchanged. Migratrix can bring this over using straight ActiveRecord
139
+ copies:
140
+
141
+ TODO: Write Sample App so these examples make sense
142
+ TODO: And then come write this example
143
+
144
+
145
+ ### Slightly More Complicated: Defining your own extensible class
146
+
147
+ Let's say your legacy app has a table that stores what amount to
148
+ constants in a table in the database. Either the data never changes
149
+ (in fact, you don't even have an admin page to edit the data), or the
150
+ data changes rarely enough that you're willing to restart the
151
+ webserver if the data DOES change. It's also really simple data, and
152
+ has no dependencies. For the sake of demonstration, let's say that
153
+ your app stores all 50 US States and their abbreviations. (And you put
154
+ it in the database because ONE day you were SURE you were going to
155
+ need to add Canadian provinces or Japanese boroughs.)
156
+
157
+ Let's migrate this to a YAML file and store it in config/constants,
158
+ and at boot time we'll write some code to load the states.yml file and
159
+ create a frozen hash called MyApp::STATES.
160
+
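The boot-time half of that plan might look roughly like the sketch below. It is an assumption-laden illustration, not code from the gem: the `MyApp.load_constants` helper is hypothetical, and the file layout (integer ids mapping to string-keyed records) is a guess at what a constants file could contain, not the documented output of the YAML load. A temporary directory stands in for config/constants so the example is self-contained.

```ruby
require 'yaml'
require 'tmpdir'

module MyApp
  # Hypothetical helper: reads a constants file and freezes both the
  # outer hash and each record in it.
  def self.load_constants(path)
    data = YAML.load_file(path)
    data.each_value(&:freeze)
    data.freeze
  end
end

# Demonstration with a throwaway file. In a real app the path would point
# into config/constants and this would run from an initializer.
Dir.mktmpdir do |dir|
  path = File.join(dir, "states.yml")
  sample = {
    1 => { "name" => "Utah", "abbreviation" => "UT" },
    2 => { "name" => "Ohio", "abbreviation" => "OH" }
  }
  File.write(path, sample.to_yaml)

  MyApp::STATES = MyApp.load_constants(path)
end
```

String keys are used in the sample file because newer versions of Ruby's YAML library refuse to load symbols by default.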
161
+ This should be straightforward at this point, but wait. It also turns
162
+ out we need to do the same thing with the countries table, the
163
+ shipping_carriers table, and about six other tables. They're all the
164
+ same: we need to extract from the legacy database and write to a
165
+ yaml file in the constants folder.
166
+
167
+ We can do that. Here's the entire migration for States:
168
+
169
+ require 'constants_migration'
170
+ module Legacy
171
+ class StatesMigration < ConstantsMigration
172
+ extend_extraction source: Legacy::State
173
+ extend_transform transform: { id: :id, name: :name, abbreviation: :state_code }
174
+ extend_load filename: MyApp::CONSTANTS_PATH + "states.yml"
175
+ end
176
+ end
177
+
178
+ And the reason this works is that you've written your own custom
179
+ Migration class, which looks like this:
180
+
181
+ module Legacy
182
+ # Migrates a Legacy ActiveRecord table through a transform to a
183
+ # constants file. Child classes should extend_extraction with
184
+ # :source, transform with a :transform hash, and load with the
185
+ # filename to write to.
186
+ class ConstantsMigration < Migratrix::Migration
187
+ set_extraction :active_record, source: Model
188
+ set_transform :transform, {
189
+ transform_collection: Hash,
190
+ transform_class: Hash,
191
+ extract_attribute: :[],
192
+ apply_attribute: :[]=,
193
+ store_transformed_object: ->(object,collection){ collection[object[:id]] = object }
194
+ }
195
+ set_load :yaml
196
+ end
197
+ end
53
198
 
54
199
  ## Migration Dependencies
55
200
 
56
201
  Migratrix isn't quite smart enough to know that a migrator depends on
57
202
  another migrator. (Actually, that's easy. What's hard is knowing if a
58
- dependent migrator has run and is up-to-date.)
203
+ dependent migrator has run and is up-to-date. This is a solvable
204
+ problem, I just haven't needed to solve it with Migratrix yet.)
59
205
 
60
- ## Strategies
206
+ ## Migration Strategies
61
207
 
62
208
  Migratrix supports multiple migration strategies:
63
209
 
@@ -89,9 +235,8 @@ Migratrix supports multiple migration strategies:
89
235
  ## Slices of Data
90
236
 
91
237
  Migratrix supports taking partial slices of data. You can migrate a
92
- single record to test a migraton, grab 100 records or 1000 to get an
93
- idea for how long a full migration will take, or perform the entire
94
- migration.
238
+ single record to test a migration, or grab a few thousand to get an
239
+ idea for how long a full migration will take.
95
240
 
96
241
  ## Ongoing Migrations
97
242
 
@@ -107,7 +252,9 @@ can also record the source object, which is useful for debugging or
107
252
  handling migration cases where legacy records get changed or deleted
108
253
  after migrating.
109
254
 
110
- ## Migration Tests
255
+ ## Known Limitations and Issues
256
+
257
+ ### Migration Tests
111
258
 
112
259
  Sorry, nothing to see here yet. Migratrix was originally developed in
113
260
  an environment where the migrations were so heavy-duty and hairy that
@@ -118,6 +265,12 @@ mostly a note to remind myself that that heavy-duty migrations can and
118
265
  should be developed in a TDD style, and Migratrix should make this
119
266
  easy.
120
267
 
268
+ ### Rake Tasks and/or Rails Generators
269
+
270
+ Migratrix is definitely a heavyweight tool for migrating data. It
271
+ would be nice if the gem provided rake tasks or Rails generators to
272
+ help with the boilerplate.
273
+
121
274
  ## A note about the name
122
275
 
123
276
  In old Latin, -or versus -ix endings aren't just about feminine and
@@ -142,3 +295,10 @@ MIT. See the license file.
142
295
 
143
296
  * David Brady -- github@shinybit.com
144
297
 
298
+ ## Contributing
299
+
300
+ YES PLEASE! For bugs and other issues, send a pull request. If you
301
+ would like to extend or change a feature of Migratrix, discussion is
302
+ also very welcome.
303
+
304
+
@@ -1,2 +1,2 @@
1
1
  #!/usr/bin/env ruby
2
- puts "Nothing to see here yet -- use rake tasks to call Migratrix.migrate!(:migrator, options) directly"
2
+ puts "Nothing to see here yet -- call Migratrix.migrate!(:migrator, options) directly from your own driver code (scripts, rake tasks, etc)"
@@ -29,6 +29,7 @@ module Migratrix
29
29
  require APP + 'transforms/map'
30
30
 
31
31
  require APP + 'loads/load'
32
+ require APP + 'loads/active_record'
32
33
  require APP + 'loads/no_op'
33
34
  require APP + 'loads/yaml'
34
35
  # require APP + 'loads/csv'
@@ -84,6 +85,7 @@ module Migratrix
84
85
  register_transform :no_op, Transforms::NoOp
85
86
 
86
87
  register_load :load, Loads::Load
88
+ register_load :active_record, Loads::ActiveRecord
87
89
  register_load :no_op, Loads::NoOp
88
90
  register_load :yaml, Loads::Yaml
89
91
 
@@ -0,0 +1,73 @@
1
+ require 'yaml'
2
+
3
+ module Migratrix
4
+ module Loads
5
+ # An ActiveRecord-based Load that tries to update existing objects
6
+ # rather than always doing new saves. If :update is true, before
7
+ # saving we attempt to find the object by the primary_key column.
8
+ # If found, we call .update_attributes on that record instead of
9
+ # .save.
10
+ #
11
+ # TODO: Verify that update_attributes still calls callbacks, e.g.
12
+ # validations and before_save? If not we'll need to load the
13
+ # object, copy attributes manually from the transformed object,
14
+ # and save it.
15
+ #
16
+ # TODO: primary_key is a bit presumptive. Would be better if it
17
+ # were a where clause.
18
+ class ActiveRecord < Load
19
+ set_valid_options :primary_key, :legacy_key, :finder, :update, :cache_key
20
+
21
+ def seen
22
+ @seen ||= { }
23
+ end
24
+
25
+ def seen?(object)
26
+ if options[:cache_key]
27
+ seen[object[options[:cache_key]]]
28
+ end
29
+ end
30
+
31
+ def seen!(object)
32
+ if options[:cache_key]
33
+ seen[object[options[:cache_key]]] = true
34
+ end
35
+ end
36
+
37
+ def load(transformed_objects)
38
+ transformed_objects.each do |transformed_object|
39
+ next if seen?(transformed_object)
40
+ if options[:update]
41
+ object = if options[:finder]
42
+ options[:finder].call(transformed_object)
43
+ elsif options[:primary_key] && options[:legacy_key]
44
+ transformed_object.class.where("#{options[:primary_key]}=?", transformed_object[options[:legacy_key]]).first
45
+ end
46
+ if object
47
+ update_object object, transformed_object
48
+ else
49
+ save_object transformed_object
50
+ end
51
+ else
52
+ save_object transformed_object
53
+ end
54
+ end
55
+ transformed_objects
56
+ end
57
+
58
+ def save_object(transformed_object)
59
+ return if seen?(transformed_object)
60
+ transformed_object.save
61
+ seen! transformed_object
62
+ transformed_object
63
+ end
64
+
65
+ def update_object(original_object, transformed_object)
66
+ return if seen?(transformed_object)
67
+ original_object.update_attributes transformed_object.attributes
68
+ seen! original_object
69
+ original_object
70
+ end
71
+ end
72
+ end
73
+ end
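The branching in the new `Loads::ActiveRecord#load` can be hard to follow in diff form, so here is a rough in-memory sketch of the same decision: skip records already seen, update when an existing row is found, otherwise save a new one. Everything in it is hypothetical (`FakeTable`, `load_records`, and the hash-based records stand in for ActiveRecord models), and it simplifies the real class: lookup is always by `:id`, and there is no `:finder` option.

```ruby
# Hypothetical in-memory stand-in for a database table; not part of Migratrix.
class FakeTable
  attr_reader :rows

  def initialize
    @rows = {}
  end

  def find(id)
    @rows[id]
  end

  # Simplification: inserting an existing id overwrites the row, whereas a
  # real ActiveRecord save would go through the database.
  def insert(attrs)
    @rows[attrs[:id]] = attrs.dup
  end
end

# Mirrors the shape of the load loop: the cache check comes first, then the
# update-or-save decision, then the record is marked as seen.
def load_records(table, records, update: false, cache_key: nil)
  seen = {}
  records.each do |record|
    next if cache_key && seen[record[cache_key]]

    existing = update ? table.find(record[:id]) : nil
    if existing
      existing.merge!(record)   # stands in for update_attributes
    else
      table.insert(record)      # stands in for save
    end
    seen[record[cache_key]] = true if cache_key
  end
  table
end

table = FakeTable.new
table.insert(id: 1, name: "Utah")

incoming = [
  { id: 1, name: "UTAH" },   # existing row: updated in place
  { id: 2, name: "Ohio" },   # new row: saved
  { id: 2, name: "Dup" }     # same cache key as the previous record: skipped
]
load_records(table, incoming, update: true, cache_key: :id)
```

The third record is dropped only because `cache_key` is set. Without it, the sketch (like the real class) performs no duplicate check of its own.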
@@ -34,11 +34,11 @@ module Migratrix
34
34
  opts += transform.valid_options
35
35
  end
36
36
  end
37
- # if loads
38
- # loads.each do |name, load|
39
- # opts += load.valid_options
40
- # end
41
- # end
37
+ if loads
38
+ loads.each do |name, load|
39
+ opts += load.valid_options
40
+ end
41
+ end
42
42
  opts.uniq.sort
43
43
  end
44
44
 
@@ -86,6 +86,8 @@ module Migratrix
86
86
  self.class.extractions
87
87
  end
88
88
 
89
+ # TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
90
+
89
91
  # transform crap
90
92
  # set_transform :nickname, :registered_name, options_hash
91
93
  # set_transform :nickname, :registered_name # options = {}
@@ -128,6 +130,8 @@ module Migratrix
128
130
  self.class.transforms
129
131
  end
130
132
 
133
+ # TODO: THIS IS HUGE DUPLICATION, REFACTOR REFACTOR REFACTOR
134
+
131
135
  # load crap
132
136
  # set_load :nickname, :registered_name, options_hash
133
137
  # set_load :nickname, :registered_name # options = {}
@@ -78,18 +78,6 @@ module Migratrix
78
78
  # parts.
79
79
  #
80
80
  # ----------------------------------------------------------------------
81
- # Map's strategy, as used by PetsMigration
82
- #
83
- # create_transformed_collection -> Hash.new
84
- # create_new_object -> Hash.new
85
- # transformation -> {:id => :id, :name => :name }
86
- # extract_attribute -> object[attribute_or_extract]
87
- # apply_attribute -> object[attribute] = attribute_or_apply
88
- # finalize_object -> no-op
89
- # store_transformed_object -> collection[object[:id]] = object
90
- # ----------------------------------------------------------------------
91
- #
92
- # ----------------------------------------------------------------------
93
81
  # Default strategy:
94
82
  #
95
83
  # create_transformed_collection -> Array.new
@@ -170,7 +170,7 @@ describe Migratrix::Migration do
170
170
 
171
171
  describe ".valid_options" do
172
172
  it "returns valid options from itself and components" do
173
- TestMigration.valid_options.should == [:console, :fetchall, :limit, :map, :offset, :order, :where]
173
+ TestMigration.valid_options.should == [:console, :fetchall, :filename, :limit, :map, :offset, :order, :where]
174
174
  end
175
175
  end
176
176
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: migratrix
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.8.5
4
+ version: 0.8.7
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,11 +9,11 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2011-10-31 00:00:00.000000000Z
12
+ date: 2012-04-20 00:00:00.000000000Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: trollop
16
- requirement: &2164754380 !ruby/object:Gem::Requirement
16
+ requirement: &2152597120 !ruby/object:Gem::Requirement
17
17
  none: false
18
18
  requirements:
19
19
  - - ! '>='
@@ -21,7 +21,7 @@ dependencies:
21
21
  version: '0'
22
22
  type: :runtime
23
23
  prerelease: false
24
- version_requirements: *2164754380
24
+ version_requirements: *2152597120
25
25
  description: Migratrix, a Rails legacy database migration tool supporting multiple
26
26
  strategies, including arbitrary n-ary migrations (1->n, n->1, n->m), arbitrary inputs
27
27
  and outputs (ActiveRecord, bare SQL, CSV) and migration logging
@@ -42,6 +42,7 @@ files:
42
42
  - lib/migratrix/extractions/active_record.rb
43
43
  - lib/migratrix/extractions/extraction.rb
44
44
  - lib/migratrix/extractions/no_op.rb
45
+ - lib/migratrix/loads/active_record.rb
45
46
  - lib/migratrix/loads/load.rb
46
47
  - lib/migratrix/loads/no_op.rb
47
48
  - lib/migratrix/loads/yaml.rb
@@ -102,7 +103,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
102
103
  version: '0'
103
104
  requirements: []
104
105
  rubyforge_project:
105
- rubygems_version: 1.8.6
106
+ rubygems_version: 1.8.10
106
107
  signing_key:
107
108
  specification_version: 3
108
109
  summary: Rails 3 legacy database migratrion tool supporting multiple strategies