replicate 1.0

data/COPYING ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2011 Ryan Tomayko <http://tomayko.com/about>
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to
5
+ deal in the Software without restriction, including without limitation the
6
+ rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
7
+ sell copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in
11
+ all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
16
+ THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,99 @@
1
+ # Replicate
2
+
3
+ Dump and load relational objects between Ruby environments.
4
+
5
+ The project was started at GitHub to ease the process of getting real production
6
+ data into staging and development environments. We have a custom command that
7
+ uses the replicate machinery to dump entire repository data (including
8
+ associated objects like issues, pull requests, commit comments, etc.) from
9
+ production and load it into the current environment. This is extremely useful
10
+ for troubleshooting issues, support requests, and exception reports.
11
+
12
+ ## Synopsis
13
+
14
+ Dumping objects:
15
+
16
+ $ replicate -r config/environment -d 'User.find(1)' > user.dump
17
+ ==> dumped 4 total objects:
18
+ Profile 1
19
+ User 1
20
+ UserEmail 2
21
+
22
+ Loading objects:
23
+
24
+ $ replicate -r config/environment -l < user.dump
25
+ ==> loaded 4 total objects:
26
+ Profile 1
27
+ User 1
28
+ UserEmail 2
29
+
30
+ Dumping and loading over SSH:
31
+
32
+ $ remote_command="replicate -r /app/config/environment -d 'User.find(1234)'"
33
+ $ ssh example.org "$remote_command" |replicate -r config/environment -l
34
+
35
+ ## ActiveRecord
36
+
37
+ *NOTE: Replicate has been tested only under ActiveRecord 2.2. Support for
38
+ ActiveRecord 3.x is planned.*
39
+
40
+ Basic support for dumping and loading ActiveRecord objects is included. When an
41
+ object is dumped, all `belongs_to` and `has_one` associations are automatically
42
+ followed and included in the dump. You can mark `has_many` and
43
+ `has_and_belongs_to_many` associations for automatic inclusion using the
44
+ `replicate_associations` macro:
45
+
46
+ class User < ActiveRecord::Base
47
+ belongs_to :profile
48
+ has_many :email_addresses
49
+
50
+ replicate_associations :email_addresses
51
+ end
52
+
53
+ By default, the loader attempts to create a new record for all objects. This can
54
+ lead to unique constraint errors when a record already exists with matching
55
+ attributes. To update existing records instead of always creating new ones,
56
+ define a natural key for the model using the `replicate_natural_key` macro:
57
+
58
+ class User < ActiveRecord::Base
59
+ belongs_to :profile
60
+ has_many :email_addresses
61
+
62
+ replicate_natural_key :login
63
+ replicate_associations :email_addresses
64
+ end
65
+
66
+ Multiple attribute names may be specified to define a compound key.
67
+
68
+ ## Custom Objects
69
+
70
+ Other object types may be included in the dump stream so long as they implement
71
+ the `dump_replicant` and `load_replicant` methods.
72
+
73
+ The dump side calls `#dump_replicant(dumper)` on each object. The method must
74
+ call `dumper.write()` with the class name, id, hash of primitively typed
75
+ attributes, and the object itself:
76
+
77
+ class User
78
+ attr_reader :id
79
+ attr_accessor :name, :email
80
+
81
+ def dump_replicant(dumper)
82
+ attributes = { 'name' => name, 'email' => email }
83
+ dumper.write self.class, id, attributes, self
84
+ end
85
+ end
86
+
87
+ The load side calls `::load_replicant(type, id, attributes)` on the class to
88
+ load each object into the current environment. The method must return an
89
+ `[id, object]` tuple:
90
+
91
+ class User
92
+ def self.load_replicant(type, id, attributes)
93
+ user = User.new
94
+ user.name = attributes['name']
95
+ user.email = attributes['email']
96
+ user.save!
97
+ [user.id, user]
98
+ end
99
+ end
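Review note: the two custom-object snippets in the README hunk above can be exercised together without a database. The following is a minimal round-trip sketch, not the library's own code. `RecordingDumper` is a hypothetical stand-in whose `write` takes the same four arguments as `Replicate::Dumper#write` in this diff; the `User#initialize` signature and the absence of a real `save!` are assumptions made for illustration.

```ruby
# Hypothetical stand-in for Replicate::Dumper; it only records what is written.
class RecordingDumper
  attr_reader :written

  def initialize
    @written = []
  end

  # Same arity as Replicate::Dumper#write: type, id, attributes, object.
  def write(type, id, attributes, object)
    @written << [type.to_s, id, attributes]
  end
end

class User
  attr_reader :id
  attr_accessor :name, :email

  # Assumption: the caller assigns ids; a real model would get them from its
  # datastore, so the local id would usually differ from the dumped one.
  def initialize(id = nil)
    @id = id
  end

  def dump_replicant(dumper)
    attributes = { 'name' => name, 'email' => email }
    dumper.write self.class, id, attributes, self
  end

  # Returns an [id, object] tuple, as the README requires.
  def self.load_replicant(type, id, attributes)
    user = new(id)
    user.name  = attributes['name']
    user.email = attributes['email']
    [user.id, user]
  end
end

# Made-up sample data.
source = User.new(1)
source.name  = 'Alice'
source.email = 'alice@example.org'

dumper = RecordingDumper.new
source.dump_replicant(dumper)

# Replay the recorded tuple through the load side.
type, id, attributes = dumper.written.first
new_id, copy = Object.const_get(type).load_replicant(type, id, attributes)
```

The recorded tuple contains only strings, integers, and a hash, which is what makes it safe to marshal between environments.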
@@ -0,0 +1,6 @@
1
+ task :default => :test
2
+
3
+ desc "Run tests"
4
+ task :test do
5
+ sh "testrb test/*_test.rb", :verbose => false
6
+ end
@@ -0,0 +1,71 @@
1
+ #!/usr/bin/env ruby
2
+ #/ script/replicate [-r <lib>] --dump "<ruby>" > objects.dump
3
+ #/ script/replicate [-r <lib>] --load < objects.dump
4
+ #/ Dump and load objects between environments. The --dump form writes to stdout
5
+ #/ the objects returned by evaluating "<ruby>", which must be a valid Ruby
6
+ #/ expression. The --load form reads dump data from stdin and loads into the
7
+ #/ current environment.
8
+ #/
9
+ #/ Options:
10
+ #/ -r, --require Require the library. Often used with 'config/environment'.
11
+ #/ -d, --dump Dump the objects returned by "<ruby>" and all related objects to stdout.
12
+ #/ -l, --load Load dump file data from stdin.
13
+ #/
14
+ #/ -v, --verbose Write more status output.
15
+ #/ -q, --quiet Write less status output.
16
+ $stderr.sync = true
17
+ require 'optparse'
18
+
19
+ # default options
20
+ mode = nil
21
+ verbose = false
22
+ quiet = false
23
+ out = $stdout
24
+
25
+ # parse arguments
26
+ file = __FILE__
27
+ usage = lambda { exec "grep ^#/<'#{file}'|cut -c4-" }
28
+ ARGV.options do |opts|
29
+ opts.on("-d", "--dump") { mode = :dump }
30
+ opts.on("-l", "--load") { mode = :load }
31
+ opts.on("-r", "--require=f") { |lib| require lib }
32
+ opts.on("-v", "--verbose") { verbose = true }
33
+ opts.on("-q", "--quiet") { quiet = true }
34
+ opts.on_tail("-h", "--help", &usage)
35
+ opts.parse!
36
+ end
37
+
38
+ # load rails environment and replicator lib.
39
+ require 'replicate'
40
+
41
+ # hack to enable AR query cache
42
+ if defined?(ActiveRecord::Base)
43
+ ActiveRecord::ConnectionAdapters::QueryCache.
44
+ send :attr_writer, :query_cache, :query_cache_enabled
45
+ ActiveRecord::Base.connection.send(:query_cache=, {})
46
+ ActiveRecord::Base.connection.send(:query_cache_enabled=, true)
47
+ end
48
+
49
+ # dump mode means we're reading records from the database here and writing to
50
+ # stdout. the database should not be modified at all by this operation.
51
+ if mode == :dump
52
+ usage.call if ARGV.empty? || ARGV[0].empty?
53
+ objects = eval(ARGV[0])
54
+ Replicate::Dumper.new do |dumper|
55
+ dumper.marshal_to out
56
+ dumper.log_to $stderr, verbose, quiet
57
+ dumper.dump objects
58
+ end
59
+
60
+ # load mode means we're reading objects from stdin and creating them under
61
+ # the current environment.
62
+ elsif mode == :load
63
+ Replicate::Loader.new do |loader|
64
+ loader.log_to $stderr, verbose, quiet
65
+ loader.read $stdin
66
+ end
67
+
68
+ # mode not set means no -l or -d arg was given. show usage and bail.
69
+ else
70
+ usage.call
71
+ end
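Review note: the `--dump` and `--load` modes above communicate through a plain byte stream. Judging by `Dumper#marshal_to` later in this diff, that stream is a sequence of `Marshal.dump([type, id, attrs], io)` records written back to back. The sketch below shows how such a stream can be read one record at a time. The `each_replicant` helper and the sample records are hypothetical, and the real `Replicate::Loader` (not included in this diff) may work differently.

```ruby
require 'stringio'

# Build a small stream the same way Dumper#marshal_to does: one Marshal
# record per replicant, written consecutively. The data is made up.
io = StringIO.new
Marshal.dump(['User', 1, { 'login' => 'alice' }], io)
Marshal.dump(['Profile', 1, { 'user_id' => [:id, 'User', 1] }], io)
io.rewind

# Hypothetical reader: Marshal.load consumes exactly one record per call,
# so looping until EOF yields every [type, id, attributes] tuple in order.
def each_replicant(io)
  until io.eof?
    yield(*Marshal.load(io))
  end
end

records = []
each_replicant(io) { |type, id, attrs| records << [type, id, attrs] }
```

Because each record is self-delimiting, the same loop works on `$stdin` when the dump arrives over an SSH pipe.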
@@ -0,0 +1,10 @@
1
+ module Replicate
2
+ autoload :Emitter, 'replicate/emitter'
3
+ autoload :Dumper, 'replicate/dumper'
4
+ autoload :Loader, 'replicate/loader'
5
+ autoload :Object, 'replicate/object'
6
+ autoload :Status, 'replicate/status'
7
+ autoload :AR, 'replicate/active_record'
8
+
9
+ AR if defined?(::ActiveRecord::Base)
10
+ end
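Review note: the bare `AR if defined?(::ActiveRecord::Base)` line above relies on `autoload` semantics, where merely referencing the constant triggers the `require`. The sketch below demonstrates that behavior in isolation. The `Demo` module, the `LazyPart` constant, and the temporary file are all hypothetical and have nothing to do with replicate itself.

```ruby
require 'tmpdir'

Demo = Module.new
pending_before = pending_after = nil

Dir.mktmpdir do |dir|
  # Write a throwaway file that defines the constant being autoloaded.
  path = File.join(dir, 'lazy_part.rb')
  File.write(path, "module Demo\n  module LazyPart\n  end\nend\n")

  # Register the constant; nothing is loaded yet.
  Demo.autoload :LazyPart, path
  pending_before = Demo.autoload?(:LazyPart)

  # Referencing the constant is what actually requires the file.
  part = Demo::LazyPart

  # Once loaded, the constant is no longer a pending autoload.
  pending_after = Demo.autoload?(:LazyPart)
end
```

This is why the extension methods are installed on `ActiveRecord::Base` only when ActiveRecord is already present at the time `replicate` is required.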
@@ -0,0 +1,217 @@
1
+ module Replicate
2
+ # ActiveRecord::Base instance methods used to dump replicant objects for the
3
+ # record and all 1:1 associations. This module implements the replicant_id
4
+ # and dump_replicant methods using AR's reflection API to determine
5
+ # relationships with other objects.
6
+ module AR
7
+ # Mixin for the ActiveRecord instance.
8
+ module InstanceMethods
9
+ # Replicate::Dumper calls this method on objects to trigger dumping a
10
+ # replicant object tuple. The default implementation dumps all belongs_to
11
+ # associations, then self, then all has_one associations, then any
12
+ # has_many or has_and_belongs_to_many associations declared with the
13
+ # replicate_associations macro.
14
+ #
15
+ # dumper - Dumper object whose #write method must be called with the
16
+ # type, id, and attributes hash.
17
+ #
18
+ # Returns nothing.
19
+ def dump_replicant(dumper)
20
+ dump_all_association_replicants dumper, :belongs_to
21
+ dumper.write self.class.to_s, id, replicant_attributes, self
22
+ dump_all_association_replicants dumper, :has_one
23
+ self.class.replicate_associations.each do |association|
24
+ dump_association_replicants dumper, association
25
+ end
26
+ end
27
+
28
+ # Attributes hash used to persist this object. This consists of simply
29
+ # typed values (no complex types or objects) with the exception of special
30
+ # foreign key values. When an attribute value is [:id, "SomeClass", 1234],
31
+ # the loader will handle translating the id value to the local system's
32
+ # version of the same object.
33
+ def replicant_attributes
34
+ attributes = self.attributes.dup
35
+ self.class.reflect_on_all_associations(:belongs_to).each do |reflection|
36
+ foreign_key = (reflection.options[:foreign_key] || "#{reflection.name}_id").to_s
37
+ if id = attributes[foreign_key]
38
+ attributes[foreign_key] = [:id, reflection.klass.to_s, id]
39
+ end
40
+ end
41
+ attributes
42
+ end
43
+
44
+ # The replicant id is a two-tuple containing the class and object id. This
45
+ # is used by Replicate::Dumper to determine if the object has already been
46
+ # dumped or not.
47
+ def replicant_id
48
+ [self.class.name, id]
49
+ end
50
+
51
+ # Dump all associations of a given type.
52
+ #
53
+ # dumper - The Dumper object used to dump additional objects.
54
+ # association_type - :has_one, :belongs_to, :has_many
55
+ #
56
+ # Returns nothing.
57
+ def dump_all_association_replicants(dumper, association_type)
58
+ self.class.reflect_on_all_associations(association_type).each do |reflection|
59
+ next if (dependent = __send__(reflection.name)).nil?
60
+ case dependent
61
+ when ActiveRecord::Base, Array
62
+ dumper.dump(dependent)
63
+ else
64
+ warn "warn: #{self.class}##{reflection.name} #{association_type} association " \
65
+ "unexpectedly returned a #{dependent.class}. skipping."
66
+ end
67
+ end
68
+ end
69
+
70
+ # Dump objects associated with an AR object through an association name.
71
+ #
72
+ # dumper - The Dumper object used to dump additional objects.
73
+ # association - Name of the association whose objects should be dumped.
74
+ #
75
+ # Returns nothing.
76
+ def dump_association_replicants(dumper, association)
77
+ if reflection = self.class.reflect_on_association(association)
78
+ objects = __send__(reflection.name)
79
+ dumper.dump(objects)
80
+ if reflection.macro == :has_and_belongs_to_many
81
+ dump_has_and_belongs_to_many_replicant(dumper, reflection)
82
+ end
83
+ else
84
+ warn "error: #{self.class}##{association} is invalid"
85
+ end
86
+ end
87
+
88
+ # Dump the special Habtm object used to establish many-to-many
89
+ # relationships between objects that have already been dumped. Note that
90
+ # this object and all objects referenced must have already been dumped
91
+ # before calling this method.
92
+ def dump_has_and_belongs_to_many_replicant(dumper, reflection)
93
+ dumper.dump Habtm.new(self, reflection)
94
+ end
95
+ end
96
+
97
+ # Mixin for the ActiveRecord class.
98
+ module ClassMethods
99
+ # Set and retrieve list of association names that should be dumped when
100
+ # objects of this class are dumped. This method may be called multiple
101
+ # times to add associations.
102
+ def replicate_associations(*names)
103
+ self.replicate_associations += names if names.any?
104
+ @replicate_associations || superclass.replicate_associations
105
+ end
106
+
107
+ # Set the list of association names to dump to the specific set of values.
108
+ def replicate_associations=(names)
109
+ @replicate_associations = names.uniq.map { |name| name.to_sym }
110
+ end
111
+
112
+ # Compound key used during load to locate existing objects for update.
113
+ # When no natural key is defined, objects are created new.
114
+ #
115
+ # attribute_names - Macro style setter.
116
+ def replicate_natural_key(*attribute_names)
117
+ self.replicate_natural_key = attribute_names if attribute_names.any?
118
+ @replicate_natural_key || superclass.replicate_natural_key
119
+ end
120
+
121
+ # Set the compound key used to locate existing objects for update when
122
+ # loading. When not set, loading will always create new records.
123
+ #
124
+ # attribute_names - Array of attribute name symbols
125
+ def replicate_natural_key=(attribute_names)
126
+ @replicate_natural_key = attribute_names
127
+ end
128
+
129
+ # Load an individual record into the database. If the model defines a
130
+ # replicate_natural_key then an existing record will be updated if found
131
+ # instead of a new record being created.
132
+ #
133
+ # type - Model class name as a String.
134
+ # id - Primary key id of the record on the dump system. This must be
135
+ # translated to the local system and stored in the keymap.
136
+ # attributes - Hash of attributes to set on the record.
137
+ #
138
+ # Returns an [id, instance] tuple for the created or updated record.
139
+ def load_replicant(type, id, attributes)
140
+ instance = replicate_find_existing_record(attributes) || new
141
+ create_or_update_replicant instance, attributes
142
+ end
143
+
144
+ # Locate an existing record using the replicate_natural_key attribute
145
+ # values.
146
+ #
147
+ # Returns the existing record if found, nil otherwise.
148
+ def replicate_find_existing_record(attributes)
149
+ return if replicate_natural_key.empty?
150
+ conditions = {}
151
+ replicate_natural_key.each do |attribute_name|
152
+ conditions[attribute_name] = attributes[attribute_name.to_s]
153
+ end
154
+ find(:first, :conditions => conditions)
155
+ end
156
+
157
+ # Update an AR object's attributes and persist to the database without
158
+ # running validations or callbacks.
159
+ def create_or_update_replicant(instance, attributes)
160
+ def instance.callback(*args);end # Rails 2.x hack to disable callbacks.
161
+
162
+ attributes.each do |key, value|
163
+ next if key == primary_key
164
+ instance.write_attribute key, value
165
+ end
166
+
167
+ instance.save false
168
+ [instance.id, instance]
169
+ end
170
+ end
171
+
172
+ # Special object used to dump the list of associated ids for a
173
+ # has_and_belongs_to_many association. The object includes attributes for
174
+ # locating the source object and writing the list of ids to the appropriate
175
+ # association method.
176
+ class Habtm
177
+ def initialize(object, reflection)
178
+ @object = object
179
+ @reflection = reflection
180
+ end
181
+
182
+ def id
183
+ end
184
+
185
+ def attributes
186
+ ids = @object.__send__("#{@reflection.name.to_s.singularize}_ids")
187
+ {
188
+ 'id' => [:id, @object.class.to_s, @object.id],
189
+ 'class' => @object.class.to_s,
190
+ 'ref_class' => @reflection.klass.to_s,
191
+ 'ref_name' => @reflection.name.to_s,
192
+ 'collection' => [:id, @reflection.klass.to_s, ids]
193
+ }
194
+ end
195
+
196
+ def dump_replicant(dumper)
197
+ type = self.class.name
198
+ id = "#{@object.class.to_s}:#{@reflection.name}:#{@object.id}"
199
+ dumper.write type, id, attributes, self
200
+ end
201
+
202
+ def self.load_replicant(type, id, attrs)
203
+ object = attrs['class'].constantize.find(attrs['id'])
204
+ ids = attrs['collection']
205
+ object.__send__("#{attrs['ref_name'].to_s.singularize}_ids=", ids)
206
+ [id, new(object, nil)]
207
+ end
208
+ end
209
+
210
+ # Load active record and install the extension methods.
211
+ require 'active_record'
212
+ ::ActiveRecord::Base.send :include, InstanceMethods
213
+ ::ActiveRecord::Base.send :extend, ClassMethods
214
+ ::ActiveRecord::Base.replicate_associations = []
215
+ ::ActiveRecord::Base.replicate_natural_key = []
216
+ end
217
+ end
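Review note: `replicant_attributes` above rewrites each `belongs_to` foreign key as `[:id, class_name, remote_id]`, and `Habtm#attributes` uses the same marker with an array of ids. The loader that resolves these markers is not part of this diff, so the following is only a sketch of how such a translation could work. The `translate_ids` helper and the `"Type:remote_id" => local_id` keymap format are assumptions, not the library's API.

```ruby
# Hypothetical helper: replace [:id, type, remote] markers with local ids
# looked up in a keymap of "Type:remote_id" => local_id. Other values pass
# through unchanged.
def translate_ids(attributes, keymap)
  translated = {}
  attributes.each do |name, value|
    if value.is_a?(Array) && value.size == 3 && value[0] == :id
      _marker, type, remote = value
      translated[name] =
        if remote.is_a?(Array)
          # Collection form, as produced by Habtm#attributes.
          remote.map { |remote_id| keymap["#{type}:#{remote_id}"] }
        else
          keymap["#{type}:#{remote}"]
        end
    else
      translated[name] = value
    end
  end
  translated
end

# Made-up sample data.
keymap = { 'Profile:7' => 42, 'Group:1' => 10, 'Group:2' => 11 }
attrs = {
  'login'      => 'alice',
  'profile_id' => [:id, 'Profile', 7],
  'collection' => [:id, 'Group', [1, 2]]
}
local = translate_ids(attrs, keymap)
```

This ordering requirement is why `dump_replicant` writes `belongs_to` targets before the record itself: the referenced object must already be in the keymap when the referencing record is loaded.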
@@ -0,0 +1,109 @@
1
+ module Replicate
2
+ # Dump replicants in a streaming fashion.
3
+ #
4
+ # The Dumper takes objects and generates one or more replicant objects. A
5
+ # replicant has the form [type, id, attributes] and describes exactly one
6
+ # addressable record in a datastore. The type and id identify the model
7
+ # class name and model primary key id. The attributes Hash is a set of attribute
8
+ # name to primitively typed object value mappings.
9
+ #
10
+ # Example dump session:
11
+ #
12
+ # >> Replicate::Dumper.new do |dumper|
13
+ # >> dumper.marshal_to $stdout
14
+ # >> dumper.log_to $stderr
15
+ # >> dumper.dump User.find(1234)
16
+ # >> end
17
+ #
18
+ class Dumper < Emitter
19
+ # Create a new Dumper.
20
+ #
21
+ # io - IO object to write marshalled replicant objects to.
22
+ # block - Dump context block. If given, the end of the block's execution
23
+ # is assumed to be the end of the dump stream.
24
+ def initialize(io=nil)
25
+ @memo = Hash.new { |hash,k| hash[k] = {} }
26
+ super() do
27
+ marshal_to io if io
28
+ yield self if block_given?
29
+ end
30
+ end
31
+
32
+ # Register a filter to write marshalled data to the given IO object.
33
+ def marshal_to(io)
34
+ listen { |type, id, attrs, obj| Marshal.dump([type, id, attrs], io) }
35
+ end
36
+
37
+ # Register a filter to write status information to the given stream. By
38
+ # default, a single line is used to report object counts while the dump is
39
+ # in progress; dump counts for each class are written when complete. The
40
+ # verbose and quiet options can be used to increase or decrease
41
+ # verbosity.
42
+ #
43
+ # out - An IO object to write to, like stderr.
44
+ # verbose - Whether verbose output should be enabled.
45
+ # quiet - Whether quiet output should be enabled.
46
+ #
47
+ # Returns the Replicate::Status object.
48
+ def log_to(out=$stderr, verbose=false, quiet=false)
49
+ use Replicate::Status, 'dump', out, verbose, quiet
50
+ end
51
+
52
+ # Dump one or more objects to all registered listeners, such as a marshalled
53
+ # dump stream. This method guarantees that the same object will not be dumped
54
+ # more than once.
55
+ #
56
+ # objects - ActiveRecord object instances.
57
+ #
58
+ # Returns nothing.
59
+ def dump(*objects)
60
+ objects = objects[0] if objects.size == 1 && objects[0].respond_to?(:to_ary)
61
+ objects.each do |object|
62
+ next if object.nil? || dumped?(object)
63
+ if object.respond_to?(:dump_replicant)
64
+ object.dump_replicant(self)
65
+ else
66
+ raise NoMethodError, "#{object.class} must respond to #dump_replicant"
67
+ end
68
+ end
69
+ end
70
+
71
+ # Check if object has been written yet.
72
+ def dumped?(object)
73
+ if object.respond_to?(:replicant_id)
74
+ type, id = object.replicant_id
75
+ elsif object.is_a?(Array)
76
+ type, id = object
77
+ else
78
+ return false
79
+ end
80
+ @memo[type.to_s][id]
81
+ end
82
+
83
+ # Called exactly once per unique type and id. Emits to all listeners.
84
+ #
85
+ # type - The model class name as a String.
86
+ # id - The record's id. Usually an integer.
87
+ # attributes - All model attributes.
88
+ # object - The object this dump is generated for.
89
+ #
90
+ # Returns the object.
91
+ def write(type, id, attributes, object)
92
+ type = type.to_s
93
+ return if dumped?([type, id])
94
+ @memo[type][id] = true
95
+
96
+ emit type, id, attributes, object
97
+ end
98
+
99
+ # Retrieve dumped object counts for all classes.
100
+ #
101
+ # Returns a Hash of { class_name => count } where count is the number of
102
+ # objects dumped with a class of class_name.
103
+ def stats
104
+ stats = {}
105
+ @memo.each { |class_name, items| stats[class_name] = items.size }
106
+ stats
107
+ end
108
+ end
109
+ end
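Review note: the deduplication in `Dumper#write` and `Dumper#stats` above rests on a two-level memo, `Hash.new { |hash, k| hash[k] = {} }`, keyed by type and then by id. The sketch below isolates that technique. `TinyDumper` is a hypothetical class that collects tuples in an array instead of emitting to listeners through `Emitter`, which is not shown in this diff.

```ruby
# Hypothetical, stripped-down dumper that keeps only the memo logic.
class TinyDumper
  attr_reader :out

  def initialize
    # Looking up an unknown type creates an empty per-type hash on demand.
    @memo = Hash.new { |hash, k| hash[k] = {} }
    @out = []
  end

  # Records each unique (type, id) pair once; repeats are ignored.
  def write(type, id, attributes, object)
    type = type.to_s
    return if @memo[type][id]
    @memo[type][id] = true
    @out << [type, id, attributes]
  end

  # Returns { class_name => count } for everything written so far.
  def stats
    stats = {}
    @memo.each { |class_name, items| stats[class_name] = items.size }
    stats
  end
end

dumper = TinyDumper.new
dumper.write 'User', 1, { 'login' => 'alice' }, nil
dumper.write 'User', 1, { 'login' => 'alice' }, nil # duplicate, skipped
dumper.write 'Profile', 1, { 'user_id' => [:id, 'User', 1] }, nil
```

One side effect of the default block is that merely checking a type that was never written still creates an empty entry for it, so `stats` can report a count of zero for that type.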