replicate 1.0

data/COPYING ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2011 Ryan Tomayko <http://tomayko.com/about>
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to
5
+ deal in the Software without restriction, including without limitation the
6
+ rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
7
+ sell copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in
11
+ all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
16
+ THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,99 @@
1
+ # Replicate
2
+
3
+ Dump and load relational objects between Ruby environments.
4
+
5
+ The project was started at GitHub to ease the process of getting real production
6
+ data into staging and development environments. We have a custom command that
7
+ uses the replicate machinery to dump entire repository data (including
8
+ associated objects like issues, pull requests, commit comments, etc.) from
9
+ production and load it into the current environment. This is extremely useful
10
+ for troubleshooting issues, support requests, and exception reports.
11
+
12
+ ## Synopsis
13
+
14
+ Dumping objects:
15
+
16
+ $ replicate -r config/environment -d 'User.find(1)' > user.dump
17
+ ==> dumped 4 total objects:
18
+ Profile 1
19
+ User 1
20
+ UserEmail 2
21
+
22
+ Loading objects:
23
+
24
+ $ replicate -r config/environment -l < user.dump
25
+ ==> loaded 4 total objects:
26
+ Profile 1
27
+ User 1
28
+ UserEmail 2
29
+
30
+ Dumping and loading over SSH:
31
+
32
+ $ remote_command="replicate -r /app/config/environment -d 'User.find(1234)'"
33
+ $ ssh example.org "$remote_command" |replicate -r config/environment -l
34
+
35
+ ## ActiveRecord
36
+
37
+ *NOTE: Replicate has been tested only under ActiveRecord 2.2. Support for
38
+ ActiveRecord 3.x is planned.*
39
+
40
+ Basic support for dumping and loading ActiveRecord objects is included. When an
41
+ object is dumped, all `belongs_to` and `has_one` associations are automatically
42
+ followed and included in the dump. You can mark `has_many` and
43
+ `has_and_belongs_to_many` associations for automatic inclusion using the
44
+ `replicate_associations` macro:
45
+
46
+ class User < ActiveRecord::Base
47
+ belongs_to :profile
48
+ has_many :email_addresses
49
+
50
+ replicate_associations :email_addresses
51
+ end
52
+
53
+ By default, the loader attempts to create a new record for all objects. This can
54
+ lead to unique constraint errors when a record already exists with matching
55
+ attributes. To update existing records instead of always creating new ones,
56
+ define a natural key for the model using the `replicate_natural_key` macro:
57
+
58
+ class User < ActiveRecord::Base
59
+ belongs_to :profile
60
+ has_many :email_addresses
61
+
62
+ replicate_natural_key :login
63
+ replicate_associations :email_addresses
64
+ end
65
+
66
+ Multiple attribute names may be specified to define a compound key.
67
+
68
+ ## Custom Objects
69
+
70
+ Other object types may be included in the dump stream so long as they implement
71
+ the `dump_replicant` and `load_replicant` methods.
72
+
73
+ The dump side calls `#dump_replicant(dumper)` on each object. The method must
74
+ call `dumper.write()` with the class name, id, hash of primitively typed
75
+ attributes, and the object itself:
76
+
77
+ class User
78
+ attr_reader :id
79
+ attr_accessor :name, :email
80
+
81
+ def dump_replicant(dumper)
82
+ attributes = { 'name' => name, 'email' => email }
83
+ dumper.write self.class, id, attributes, self
84
+ end
85
+ end
86
+
87
+ The load side calls `::load_replicant(type, id, attributes)` on the class to
88
+ load each object into the current environment. The method must return an
89
+ `[id, object]` tuple:
90
+
91
+ class User
92
+ def self.load_replicant(type, id, attributes)
93
+ user = User.new
94
+ user.name = attributes['name']
95
+ user.email = attributes['email']
96
+ user.save!
97
+ [user.id, user]
98
+ end
99
+ end
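
The two halves of the custom object protocol can be exercised together without a database. The following is a sketch under stated assumptions: `SketchDumper`, `Note`, and `round_trip` are hypothetical stand-ins, and the dumper here only marshals `[type, id, attributes]` tuples to an IO in the way `Replicate::Dumper#marshal_to` is described as doing.

```ruby
require 'stringio'

# Hypothetical stand-in for the dumper: marshals each replicant tuple to an IO.
class SketchDumper
  def initialize(io)
    @io = io
  end

  # Same argument order as Replicate::Dumper#write; the object is not marshalled.
  def write(type, id, attributes, object)
    Marshal.dump([type.to_s, id, attributes], @io)
  end
end

# Hypothetical custom object implementing both sides of the protocol.
class Note
  attr_reader :id
  attr_accessor :title

  def initialize(id = nil, title = nil)
    @id = id
    @title = title
  end

  def dump_replicant(dumper)
    dumper.write self.class, id, { 'title' => title }, self
  end

  # Must return an [id, object] tuple. The +1000 offset only simulates the
  # local store assigning a different id than the dump system used.
  def self.load_replicant(type, id, attributes)
    note = new(id + 1000, attributes['title'])
    [note.id, note]
  end
end

# Dump one object to an in-memory stream, then load it back through its class.
def round_trip(object)
  io = StringIO.new
  object.dump_replicant(SketchDumper.new(io))
  io.rewind
  type, id, attributes = Marshal.load(io)
  Object.const_get(type).load_replicant(type, id, attributes)
end

new_id, note = round_trip(Note.new(1, 'hello'))
puts "#{new_id} #{note.title}"   # prints "1001 hello"
```

Because only the class name, id, and attributes hash cross the stream, the attributes must stay primitively typed so that `Marshal.load` works in an environment where the original object graph does not exist.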
@@ -0,0 +1,6 @@
1
+ task :default => :test
2
+
3
+ desc "Run tests"
4
+ task :test do
5
+ sh "testrb test/*_test.rb", :verbose => false
6
+ end
@@ -0,0 +1,71 @@
1
+ #!/usr/bin/env ruby
2
+ #/ script/replicate [-r <lib>] --dump "<ruby>" > objects.dump
3
+ #/ script/replicate [-r <lib>] --load < objects.dump
4
+ #/ Dump and load objects between environments. The --dump form writes to stdout
5
+ #/ the objects returned by evaluating "<ruby>", which must be a valid Ruby
6
+ #/ expression. The --load form reads dump data from stdin and loads into the
7
+ #/ current environment.
8
+ #/
9
+ #/ Options:
10
+ #/ -r, --require Require the library. Often used with 'config/environment'.
11
+ #/ -d, --dump Dump the objects returned by "<ruby>" and all related objects to stdout.
12
+ #/ -l, --load Load dump file data from stdin.
13
+ #/
14
+ #/ -v, --verbose Write more status output.
15
+ #/ -q, --quiet Write less status output.
16
+ $stderr.sync = true
17
+ require 'optparse'
18
+
19
+ # default options
20
+ mode = nil
21
+ verbose = false
22
+ quiet = false
23
+ out = $stdout
24
+
25
+ # parse arguments
26
+ file = __FILE__
27
+ usage = lambda { exec "grep ^#/<'#{file}'|cut -c4-" }
28
+ ARGV.options do |opts|
29
+ opts.on("-d", "--dump") { mode = :dump }
30
+ opts.on("-l", "--load") { mode = :load }
31
+ opts.on("-r", "--require=f") { |lib| require lib }
32
+ opts.on("-v", "--verbose") { verbose = true }
33
+ opts.on("-q", "--quiet") { quiet = true }
34
+ opts.on_tail("-h", "--help", &usage)
35
+ opts.parse!
36
+ end
37
+
38
+ # load rails environment and replicator lib.
39
+ require 'replicate'
40
+
41
+ # hack to enable AR query cache
42
+ if defined?(ActiveRecord::Base)
43
+ ActiveRecord::ConnectionAdapters::QueryCache.
44
+ send :attr_writer, :query_cache, :query_cache_enabled
45
+ ActiveRecord::Base.connection.send(:query_cache=, {})
46
+ ActiveRecord::Base.connection.send(:query_cache_enabled=, true)
47
+ end
48
+
49
+ # dump mode means we're reading records from the database here and writing to
50
+ # stdout. the database should not be modified at all by this operation.
51
+ if mode == :dump
52
+ usage.call if ARGV.empty? || ARGV[0].empty?
53
+ objects = eval(ARGV[0])
54
+ Replicate::Dumper.new do |dumper|
55
+ dumper.marshal_to out
56
+ dumper.log_to $stderr, verbose, quiet
57
+ dumper.dump objects
58
+ end
59
+
60
+ # load mode means we're reading objects from stdin and creating them under
61
+ # the current environment.
62
+ elsif mode == :load
63
+ Replicate::Loader.new do |loader|
64
+ loader.log_to $stderr, verbose, quiet
65
+ loader.read $stdin
66
+ end
67
+
68
+ # mode not set means no -l or -d arg was given. show usage and bail.
69
+ else
70
+ usage.call
71
+ end
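
The `usage` lambda in the script above shells out to `grep` and `cut` to print the `#/` header comments. A pure-Ruby equivalent might look like the sketch below; `usage_text` and `SCRIPT` are hypothetical names, and this is an illustration of the technique rather than what the script does.

```ruby
# Hypothetical helper: collect the "#/" header lines of a script as usage text.
# Dropping the first three characters mirrors `cut -c4-`.
def usage_text(source)
  source.each_line
        .select { |line| line.start_with?('#/') }
        .map { |line| line.chomp[3..-1].to_s }
        .join("\n")
end

# Sample script text used only for this illustration.
SCRIPT = <<~'RUBY'
  #!/usr/bin/env ruby
  #/ Usage: example --dump "<ruby>" > objects.dump
  #/
  #/ Options:
  #/   -d, --dump    Dump objects to stdout.
  puts "not part of the usage text"
RUBY

puts usage_text(SCRIPT)
```

Keeping the usage text in header comments means the help output and the documentation at the top of the file cannot drift apart.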
@@ -0,0 +1,10 @@
1
+ module Replicate
2
+ autoload :Emitter, 'replicate/emitter'
3
+ autoload :Dumper, 'replicate/dumper'
4
+ autoload :Loader, 'replicate/loader'
5
+ autoload :Object, 'replicate/object'
6
+ autoload :Status, 'replicate/status'
7
+ autoload :AR, 'replicate/active_record'
8
+
9
+ AR if defined?(::ActiveRecord::Base)
10
+ end
@@ -0,0 +1,217 @@
1
+ module Replicate
2
+ # ActiveRecord::Base instance methods used to dump replicant objects for the
3
+ # record and all 1:1 associations. This module implements the replicant_id
4
+ # and dump_replicant methods using AR's reflection API to determine
5
+ # relationships with other objects.
6
+ module AR
7
+ # Mixin for the ActiveRecord instance.
8
+ module InstanceMethods
9
+ # Replicate::Dumper calls this method on objects to trigger dumping a
10
+ # replicant object tuple. The default implementation dumps all belongs_to
11
+ # associations, then self, then all has_one associations, then any
12
+ # has_many or has_and_belongs_to_many associations declared with the
13
+ # replicate_associations macro.
14
+ #
15
+ # dumper - Dumper object whose #write method must be called with the
16
+ # type, id, and attributes hash.
17
+ #
18
+ # Returns nothing.
19
+ def dump_replicant(dumper)
20
+ dump_all_association_replicants dumper, :belongs_to
21
+ dumper.write self.class.to_s, id, replicant_attributes, self
22
+ dump_all_association_replicants dumper, :has_one
23
+ self.class.replicate_associations.each do |association|
24
+ dump_association_replicants dumper, association
25
+ end
26
+ end
27
+
28
+ # Attributes hash used to persist this object. This consists of simply
29
+ # typed values (no complex types or objects) with the exception of special
30
+ # foreign key values. When an attribute value is [:id, "SomeClass", 1234],
31
+ # the loader will handle translating the id value to the local system's
32
+ # version of the same object.
33
+ def replicant_attributes
34
+ attributes = self.attributes.dup
35
+ self.class.reflect_on_all_associations(:belongs_to).each do |reflection|
36
+ foreign_key = (reflection.options[:foreign_key] || "#{reflection.name}_id").to_s
37
+ if id = attributes[foreign_key]
38
+ attributes[foreign_key] = [:id, reflection.klass.to_s, id]
39
+ end
40
+ end
41
+ attributes
42
+ end
43
+
44
+ # The replicant id is a two tuple containing the class and object id. This
45
+ # is used by Replicate::Dumper to determine if the object has already been
46
+ # dumped or not.
47
+ def replicant_id
48
+ [self.class.name, id]
49
+ end
50
+
51
+ # Dump all associations of a given type.
52
+ #
53
+ # dumper - The Dumper object used to dump additional objects.
54
+ # association_type - :has_one, :belongs_to, :has_many
55
+ #
56
+ # Returns nothing.
57
+ def dump_all_association_replicants(dumper, association_type)
58
+ self.class.reflect_on_all_associations(association_type).each do |reflection|
59
+ next if (dependent = __send__(reflection.name)).nil?
60
+ case dependent
61
+ when ActiveRecord::Base, Array
62
+ dumper.dump(dependent)
63
+ else
64
+ warn "warn: #{self.class}##{reflection.name} #{association_type} association " \
65
+ "unexpectedly returned a #{dependent.class}. skipping."
66
+ end
67
+ end
68
+ end
69
+
70
+ # Dump objects associated with an AR object through an association name.
71
+ #
72
+ # dumper - The Dumper object used to dump additional objects.
73
+ # association - Name of the association whose objects should be dumped.
74
+ #
75
+ # Returns nothing.
76
+ def dump_association_replicants(dumper, association)
77
+ if reflection = self.class.reflect_on_association(association)
78
+ objects = __send__(reflection.name)
79
+ dumper.dump(objects)
80
+ if reflection.macro == :has_and_belongs_to_many
81
+ dump_has_and_belongs_to_many_replicant(dumper, reflection)
82
+ end
83
+ else
84
+ warn "error: #{self.class}##{association} is invalid"
85
+ end
86
+ end
87
+
88
+ # Dump the special Habtm object used to establish many-to-many
89
+ # relationships between objects that have already been dumped. Note that
90
+ # this object and all objects referenced must have already been dumped
91
+ # before calling this method.
92
+ def dump_has_and_belongs_to_many_replicant(dumper, reflection)
93
+ dumper.dump Habtm.new(self, reflection)
94
+ end
95
+ end
96
+
97
+ # Mixin for the ActiveRecord class.
98
+ module ClassMethods
99
+ # Set and retrieve list of association names that should be dumped when
100
+ # objects of this class are dumped. This method may be called multiple
101
+ # times to add associations.
102
+ def replicate_associations(*names)
103
+ self.replicate_associations += names if names.any?
104
+ @replicate_associations || superclass.replicate_associations
105
+ end
106
+
107
+ # Set the list of association names to dump to the specific set of values.
108
+ def replicate_associations=(names)
109
+ @replicate_associations = names.uniq.map { |name| name.to_sym }
110
+ end
111
+
112
+ # Compound key used during load to locate existing objects for update.
113
+ # When no natural key is defined, objects are created new.
114
+ #
115
+ # attribute_names - Macro style setter.
116
+ def replicate_natural_key(*attribute_names)
117
+ self.replicate_natural_key = attribute_names if attribute_names.any?
118
+ @replicate_natural_key || superclass.replicate_natural_key
119
+ end
120
+
121
+ # Set the compound key used to locate existing objects for update when
122
+ # loading. When not set, loading will always create new records.
123
+ #
124
+ # attribute_names - Array of attribute name symbols
125
+ def replicate_natural_key=(attribute_names)
126
+ @replicate_natural_key = attribute_names
127
+ end
128
+
129
+ # Load an individual record into the database. If the model defines a
130
+ # replicate_natural_key then an existing record will be updated if found
131
+ # instead of a new record being created.
132
+ #
133
+ # type - Model class name as a String.
134
+ # id - Primary key id of the record on the dump system. This must be
135
+ # translated to the local system and stored in the keymap.
136
+ # attributes - Hash of attributes to set on the new record.
137
+ #
138
+ # Returns an [id, object] tuple: the local id and the ActiveRecord instance.
139
+ def load_replicant(type, id, attributes)
140
+ instance = replicate_find_existing_record(attributes) || new
141
+ create_or_update_replicant instance, attributes
142
+ end
143
+
144
+ # Locate an existing record using the replicate_natural_key attribute
145
+ # values.
146
+ #
147
+ # Returns the existing record if found, nil otherwise.
148
+ def replicate_find_existing_record(attributes)
149
+ return if replicate_natural_key.empty?
150
+ conditions = {}
151
+ replicate_natural_key.each do |attribute_name|
152
+ conditions[attribute_name] = attributes[attribute_name.to_s]
153
+ end
154
+ find(:first, :conditions => conditions)
155
+ end
156
+
157
+ # Update an AR object's attributes and persist to the database without
158
+ # running validations or callbacks.
159
+ def create_or_update_replicant(instance, attributes)
160
+ def instance.callback(*args);end # Rails 2.x hack to disable callbacks.
161
+
162
+ attributes.each do |key, value|
163
+ next if key == primary_key
164
+ instance.write_attribute key, value
165
+ end
166
+
167
+ instance.save false
168
+ [instance.id, instance]
169
+ end
170
+ end
171
+
172
+ # Special object used to dump the list of associated ids for a
173
+ # has_and_belongs_to_many association. The object includes attributes for
174
+ # locating the source object and writing the list of ids to the appropriate
175
+ # association method.
176
+ class Habtm
177
+ def initialize(object, reflection)
178
+ @object = object
179
+ @reflection = reflection
180
+ end
181
+
182
+ def id
183
+ end
184
+
185
+ def attributes
186
+ ids = @object.__send__("#{@reflection.name.to_s.singularize}_ids")
187
+ {
188
+ 'id' => [:id, @object.class.to_s, @object.id],
189
+ 'class' => @object.class.to_s,
190
+ 'ref_class' => @reflection.klass.to_s,
191
+ 'ref_name' => @reflection.name.to_s,
192
+ 'collection' => [:id, @reflection.klass.to_s, ids]
193
+ }
194
+ end
195
+
196
+ def dump_replicant(dumper)
197
+ type = self.class.name
198
+ id = "#{@object.class.to_s}:#{@reflection.name}:#{@object.id}"
199
+ dumper.write type, id, attributes, self
200
+ end
201
+
202
+ def self.load_replicant(type, id, attrs)
203
+ object = attrs['class'].constantize.find(attrs['id'])
204
+ ids = attrs['collection']
205
+ object.__send__("#{attrs['ref_name'].to_s.singularize}_ids=", ids)
206
+ [id, new(object, nil)]
207
+ end
208
+ end
209
+
210
+ # Load active record and install the extension methods.
211
+ require 'active_record'
212
+ ::ActiveRecord::Base.send :include, InstanceMethods
213
+ ::ActiveRecord::Base.send :extend, ClassMethods
214
+ ::ActiveRecord::Base.replicate_associations = []
215
+ ::ActiveRecord::Base.replicate_natural_key = []
216
+ end
217
+ end
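
The foreign key handling in `replicant_attributes` can be illustrated in isolation. The sketch below tags a `belongs_to` column with an `[:id, class_name, id]` reference on the dump side, then resolves it on the load side. The loader is not shown in this diff, so the keymap format (`"Class:remote_id" => local_id`) and both helper names are assumptions made for illustration only.

```ruby
# Dump side: replace each non-nil foreign key value with an
# [:id, class_name, id] reference, as replicant_attributes does.
# foreign_keys maps column name => associated class name.
def tag_foreign_keys(attributes, foreign_keys)
  tagged = attributes.dup
  foreign_keys.each do |column, class_name|
    value = tagged[column]
    tagged[column] = [:id, class_name, value] if value
  end
  tagged
end

# Load side (assumed behavior): swap each reference for the local id recorded
# in a keymap that is built up as earlier records are loaded.
def resolve_foreign_keys(attributes, keymap)
  attributes.each_with_object({}) do |(column, value), resolved|
    if value.is_a?(Array) && value[0] == :id
      _tag, class_name, remote_id = value
      resolved[column] = keymap["#{class_name}:#{remote_id}"]
    else
      resolved[column] = value
    end
  end
end

dumped = tag_foreign_keys({ 'login' => 'kneath', 'profile_id' => 12 },
                          { 'profile_id' => 'Profile' })
loaded = resolve_foreign_keys(dumped, { 'Profile:12' => 345 })

p dumped['profile_id']   # prints [:id, "Profile", 12]
p loaded['profile_id']   # prints 345
```

This is also why `dump_replicant` writes `belongs_to` associations before the record itself: the referenced object has to be loaded first so that its local id is known when the reference is resolved.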
@@ -0,0 +1,109 @@
1
+ module Replicate
2
+ # Dump replicants in a streaming fashion.
3
+ #
4
+ # The Dumper takes objects and generates one or more replicant objects. A
5
+ # replicant has the form [type, id, attributes] and describes exactly one
6
+ # addressable record in a datastore. The type and id identify the model
7
+ # class name and model primary key id. The attributes Hash is a set of attribute
8
+ # name to primitively typed object value mappings.
9
+ #
10
+ # Example dump session:
11
+ #
12
+ # >> Replicate::Dumper.new do |dumper|
13
+ # >> dumper.marshal_to $stdout
14
+ # >> dumper.log_to $stderr
15
+ # >> dumper.dump User.find(1234)
16
+ # >> end
17
+ #
18
+ class Dumper < Emitter
19
+ # Create a new Dumper.
20
+ #
21
+ # io - IO object to write marshalled replicant objects to.
22
+ # block - Dump context block. If given, the end of the block's execution
23
+ # is assumed to be the end of the dump stream.
24
+ def initialize(io=nil)
25
+ @memo = Hash.new { |hash,k| hash[k] = {} }
26
+ super() do
27
+ marshal_to io if io
28
+ yield self if block_given?
29
+ end
30
+ end
31
+
32
+ # Register a filter to write marshalled data to the given IO object.
33
+ def marshal_to(io)
34
+ listen { |type, id, attrs, obj| Marshal.dump([type, id, attrs], io) }
35
+ end
36
+
37
+ # Register a filter to write status information to the given stream. By
38
+ # default, a single line is used to report object counts while the dump is
39
+ # in progress; dump counts for each class are written when complete. The
40
+ # verbose and quiet options can be used to increase or decrease
41
+ # verbosity.
42
+ #
43
+ # out - An IO object to write to, like stderr.
44
+ # verbose - Whether verbose output should be enabled.
45
+ # quiet - Whether quiet output should be enabled.
46
+ #
47
+ # Returns the Replicate::Status object.
48
+ def log_to(out=$stderr, verbose=false, quiet=false)
49
+ use Replicate::Status, 'dump', out, verbose, quiet
50
+ end
51
+
52
+ # Dump one or more objects to the internal array or provided dump
53
+ # stream. This method guarantees that the same object will not be dumped
54
+ # more than once.
55
+ #
56
+ # objects - ActiveRecord object instances.
57
+ #
58
+ # Returns nothing.
59
+ def dump(*objects)
60
+ objects = objects[0] if objects.size == 1 && objects[0].respond_to?(:to_ary)
61
+ objects.each do |object|
62
+ next if object.nil? || dumped?(object)
63
+ if object.respond_to?(:dump_replicant)
64
+ object.dump_replicant(self)
65
+ else
66
+ raise NoMethodError, "#{object.class} must respond to #dump_replicant"
67
+ end
68
+ end
69
+ end
70
+
71
+ # Check if object has been written yet.
72
+ def dumped?(object)
73
+ if object.respond_to?(:replicant_id)
74
+ type, id = object.replicant_id
75
+ elsif object.is_a?(Array)
76
+ type, id = object
77
+ else
78
+ return false
79
+ end
80
+ @memo[type.to_s][id]
81
+ end
82
+
83
+ # Called exactly once per unique type and id. Emits to all listeners.
84
+ #
85
+ # type - The model class name as a String.
86
+ # id - The record's id. Usually an integer.
87
+ # attributes - All model attributes.
88
+ # object - The object this dump is generated for.
89
+ #
90
+ # Returns the object.
91
+ def write(type, id, attributes, object)
92
+ type = type.to_s
93
+ return if dumped?([type, id])
94
+ @memo[type][id] = true
95
+
96
+ emit type, id, attributes, object
97
+ end
98
+
99
+ # Retrieve dumped object counts for all classes.
100
+ #
101
+ # Returns a Hash of { class_name => count } where count is the number of
102
+ # objects dumped with a class of class_name.
103
+ def stats
104
+ stats = {}
105
+ @memo.each { |class_name, items| stats[class_name] = items.size }
106
+ stats
107
+ end
108
+ end
109
+ end
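
The once-only guarantee of `Dumper#write` rests on the nested memo hash. Here is a minimal standalone sketch of that bookkeeping, assuming a hypothetical `MemoSketch` class that records tuples in an array instead of emitting them to listeners.

```ruby
# Hypothetical reduction of the Dumper's memo logic (not the library class).
class MemoSketch
  attr_reader :written

  def initialize
    # type name => { id => true }, with inner hashes created on first write.
    @memo = Hash.new { |hash, type| hash[type] = {} }
    @written = []
  end

  # fetch avoids creating an empty entry for types that were only queried.
  def dumped?(type, id)
    @memo.fetch(type.to_s, {}).key?(id)
  end

  # Record the tuple only the first time a given type and id is seen.
  def write(type, id, attributes)
    type = type.to_s
    return if dumped?(type, id)
    @memo[type][id] = true
    @written << [type, id, attributes]
  end

  # Count of written objects per class name.
  def stats
    @memo.each_with_object({}) { |(type, items), counts| counts[type] = items.size }
  end
end

memo = MemoSketch.new
memo.write('User', 1, { 'login' => 'defunkt' })
memo.write('User', 1, { 'login' => 'defunkt' })   # duplicate, ignored
memo.write('User', 2, { 'login' => 'mojombo' })
memo.write('Profile', 1, { 'name' => 'Chris' })

p memo.stats          # prints {"User"=>2, "Profile"=>1} (spacing varies by Ruby version)
p memo.written.size   # prints 3
```

Keying the memo by type and then id keeps the duplicate check cheap and lets the per-class counts fall out of the same structure.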