nsync 0.0.2

data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ Copyright (c) 2010 Ben Hughes
2
+ Copyright (c) 2010 NabeWise Media, Inc.
3
+
4
+ Permission is hereby granted, free of charge, to any person obtaining a copy
5
+ of this software and associated documentation files (the "Software"), to deal
6
+ in the Software without restriction, including without limitation the rights
7
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the Software is
9
+ furnished to do so, subject to the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be included in
12
+ all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20
+ THE SOFTWARE.
21
+
data/README.md ADDED
@@ -0,0 +1,204 @@
1
+ Nsync: Git based database synchronization
2
+ =========================================
3
+
4
+ Nsync allows you to keep disparate data sources synchronized for core data.
5
+ The use case this is designed to solve is when you have a data processing
6
+ system and one or many consumer facing services that depend on a canonical,
7
+ processed version of the data. All of this is based on the power of Git.
8
+ Nsync makes no assumptions about your data stores, ORMs, or storage practices
9
+ other than that the data from the producer has something that can serve as a
10
+ unique primary key and that the consumer can be queried by a key indicating
11
+ its source. That said, Nsync comes with extensions for ActiveRecord 2.3.x
12
+ that handle the simple case.
13
+
14
+ A NabeWise Story
15
+ ----------------
16
+
17
+ Nsync was born out of our needs at NabeWise (http://nabewise.com). We deal
18
+ with neighborhoods and cities. Our source data consists of about 70,000
19
+ neighborhoods across the US. We carefully curate neighborhoods for each of
20
+ our cities. Oftentimes this involves editing boundaries, making changes to
21
+ neighborhood names, or other slight adjustments to the underlying data. We
22
+ also process and refine the boundaries for display through a number of
23
+ automated processes to get the data to where we want. This occurs in a Rails
24
+ app based on top of PostgreSQL with PostGIS. We have to get this data to our
25
+ website, which runs on top of MySQL and Redis. We have enough data that full
26
+ reloads from files are impractical, and we also want to handle events like
27
+ neighborhood deletion intelligently. Nsync solves these issues.
28
+
29
+ Installation
30
+ ------------
31
+
32
+ gem install nsync
33
+
34
+ Nsync depends on two gems that I've forked, schleyfox-grit and
35
+ schleyfox-lockfile. I'm sorry, but this is how it has to be. Nsync also
36
+ currently depends on ActiveSupport ~> 2.3.5, but I am working to remove this
37
+ dependency.
38
+
39
+ Terminology
40
+ -----------
41
+
42
+ In Nsync lingo, a producer is an object/class that creates data that will go
43
+ into a repository and propagate to consumers. It adheres to the Producer
44
+ interface. A consumer is an object/class that takes data from the repo and
45
+ updates itself accordingly. It adheres to the Consumer interface. A producer
46
+ is also a consumer of itself.
47
+
48
+ Producer Usage
49
+ --------------
50
+
51
+ To start off with, you have to configure your shiny new producer app. This
52
+ configuration should happen before the producer is ever used.
53
+
54
+ Nsync::Config.run do |c|
55
+ # The producer uses a standard repository
56
+ # This will automatically be created if it does not exist
57
+ c.repo_path = "/local/path/to/hold/data"
58
+ # The remote repository url will get data pushed to it
59
+ c.repo_push_url = "git@examplegithost:username/data.git"
60
+
61
+ # This must be Nsync::GitVersionManager if you want things like
62
+ # rollback to work.
63
+ c.version_manager = Nsync::GitVersionManager.new
64
+
65
+ # A lock file path to use for this app
66
+ c.lock_file = "/tmp/app_name_nsync.lock"
67
+ end
68
+
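The configuration block above relies on a process-wide config object that is created lazily and yielded to the block. A minimal sketch of that pattern, assuming a stand-in module named `MiniSync` (hypothetical, not part of Nsync) and only two of the settings:

```ruby
# Hypothetical stand-in for the Nsync config singleton; names are illustrative.
module MiniSync
  class Config
    attr_accessor :repo_path, :lock_file

    def initialize
      # Same default lock file path that the diff's Config#initialize sets.
      self.lock_file = "/tmp/nsync.lock"
    end

    # Yields the shared config object so callers can set values on it.
    def self.run
      yield MiniSync.config
    end
  end

  # Lazily creates the shared config on first access.
  def self.config
    @config ||= Config.new
  end

  # Drops the shared config so the next access starts fresh.
  def self.reset_config
    @config = nil
  end
end

MiniSync::Config.run do |c|
  c.repo_path = "/local/path/to/hold/data"
end

puts MiniSync.config.repo_path
puts MiniSync.config.lock_file
```

Because every `run` block receives the same object, settings made in one place are visible everywhere else until `reset_config` is called.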
69
+ Now you need to let your objects know the joy of Nsync. This is not strictly
70
+ necessary, but can help out. If you are using ActiveRecord, do this:
71
+
72
+ ActiveRecord::Base.send(:extend, Nsync::ActiveRecord::ClassMethods)
73
+
74
+ If you are using something else, do this:
75
+
76
+ YourBaseObject.send(:extend, Nsync::ClassMethods)
77
+
78
+ Now, set your data classes up as producers:
79
+
80
+ class Post < ActiveRecord::Base
81
+ nsync_producer
82
+ end
83
+
84
+ By default, this will write out the JSON-ified contents of its attributes (if
85
+ it's ActiveRecord; otherwise you have to define its representation) to
86
+ "CLASS_NAME/ID.json" in the repo.
87
+
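A minimal sketch of the default file layout described above, using plain Ruby and the standard `json` library rather than Nsync itself. `FakePost`, `nsync_filename`, and `write_nsync_file` are hypothetical names for illustration, and using the class name unchanged as the directory is an assumption; the real behavior lives in Nsync::Producer::InstanceMethods and may differ.

```ruby
require 'json'
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-in for a producer record; not part of Nsync.
FakePost = Struct.new(:id, :title) do
  # Plays the role of to_nsync_hash from the diff: the data to serialize.
  def to_nsync_hash
    { "id" => id, "title" => title }
  end
end

# Assumed layout: "<CLASS_NAME>/<ID>.json" relative to the repo root.
def nsync_filename(record)
  File.join(record.class.name, "#{record.id}.json")
end

# Writes the record's hash as JSON under repo_path and returns the full path.
def write_nsync_file(repo_path, record)
  path = File.join(repo_path, nsync_filename(record))
  FileUtils.mkdir_p(File.dirname(path))
  File.open(path, "w") { |f| f.write(JSON.generate(record.to_nsync_hash)) }
  path
end

Dir.mktmpdir do |repo|
  path = write_nsync_file(repo, FakePost.new(7, "Hello"))
  puts nsync_filename(FakePost.new(7, "Hello"))
  puts File.read(path)
end
```

One file per record keeps each change small in Git history, which is what lets consumers process additions, modifications, and deletions individually.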
88
+ If not all of your data should be exported, you can specify an :if function
89
+
90
+ class Post
91
+ nsync_producer :if => lambda {|o| o.should_be_exported }
92
+ end
93
+
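A rough sketch of how an `:if` option like the one above could gate exporting. The helper `exportable?` and the `Draft` struct are hypothetical; this shows only the predicate check, not how Nsync wires it into its callbacks.

```ruby
# Hypothetical helper: decides whether a record should be exported,
# given the options hash passed to nsync_producer.
def exportable?(record, opts)
  predicate = opts[:if]
  # With no :if option, every record is exported.
  predicate.nil? || !!predicate.call(record)
end

# Hypothetical record type with the method used by the README's lambda.
Draft = Struct.new(:published) do
  def should_be_exported
    published
  end
end

export_opts = { :if => lambda { |o| o.should_be_exported } }

puts exportable?(Draft.new(true), export_opts)
puts exportable?(Draft.new(false), export_opts)
puts exportable?(Draft.new(false), {})
```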
94
+ After you make some data changes, you can commit and push them by doing
95
+
96
+ producer = Nsync::Producer.new
97
+ producer.commit("Short Message Describing Changes")
98
+
99
+ See Nsync::ClassMethods, Nsync::ActiveRecord::ClassMethods,
100
+ Nsync::Producer::InstanceMethods, and
101
+ Nsync::ActiveRecord::Producer::InstanceMethods for more information
102
+
103
+ Consumer Usage
104
+ --------------
105
+
106
+ Every good producer needs one (or many) good consumers. Again, the first step
107
+ is configuration.
108
+
109
+ The Consumer is a little less straightforward. It requires that the classes
110
+ from the Producer side be mapped to classes on the Consumer side. This
111
+ happens using Nsync::Config#map_class, which maps from a Producer class name
112
+ to one or many Consumer classes.
113
+
114
+ It also requires that Nsync::Config#version_manager is set to a class or
115
+ instance that conforms to the VersionManager interface. This is probably a
116
+ class on top of a database that stores versions (by commit id) as they are
117
+ loaded into the system, such that the current version and all previous
118
+ versions can be easily accessed. The ActiveRecord integration tests
119
+ demonstrate this.
120
+
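To make the version-tracking idea above concrete, here is an in-memory sketch. The class name `InMemoryVersionManager` and the method names `version`, `version=`, and `previous_versions` are assumptions chosen for illustration; check Nsync::GitVersionManager for the actual VersionManager interface. A real consumer would back this with a database table rather than an array.

```ruby
# Hypothetical in-memory version manager. Method names are assumptions,
# not the confirmed Nsync VersionManager interface.
class InMemoryVersionManager
  def initialize
    @versions = [] # commit ids, oldest first
  end

  # Records a newly loaded commit id as the current version.
  def version=(commit_id)
    @versions << commit_id
  end

  # The most recently loaded commit id, or nil if nothing was loaded yet.
  def version
    @versions.last
  end

  # All earlier commit ids, newest first.
  def previous_versions
    @versions[0...-1].reverse
  end
end

vm = InMemoryVersionManager.new
vm.version = "a1"
vm.version = "b2"
vm.version = "c3"
puts vm.version
puts vm.previous_versions.inspect
```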
121
+ Nsync::Config.run do |c|
122
+ # The consumer uses a read-only, bare repository (one ending in .git)
123
+ # This will automatically be created if it does not exist
124
+ c.repo_path = "/local/path/to/hold/data.git"
125
+ # The remote repository url from which to pull data
126
+ c.repo_url = "git@examplegithost:username/data.git"
127
+
128
+ # An object that implements the VersionManager interface
129
+ # (see Nsync::GitVersionManager for an example)
130
+ c.version_manager = MyCustomVersionManager.new
131
+
132
+ # A lock file path to use for this app
133
+ c.lock_file = "/tmp/app_name_nsync.lock"
134
+
135
+ # The class mapping maps from the class names of the producer classes to
136
+ # the class names of their associated consuming classes. A producer can
137
+ # map to one or many consumers, and a consumer can be mapped to one or many
138
+ # producers. Consumer classes should implement the Consumer interface.
139
+ c.map_class "RawDataPostClass", "Post"
140
+ c.map_class "RawDataInfo", "Info"
141
+ end
142
+
143
+ Now you should let your classes know about the Nsync way. If you are using
144
+ ActiveRecord, do this:
145
+
146
+ ActiveRecord::Base.send(:extend, Nsync::ActiveRecord::ClassMethods)
147
+
148
+ If you are using something else, do this:
149
+
150
+ YourBaseObject.send(:extend, Nsync::ClassMethods)
151
+
152
+ Now it's time to let your objects know that they are consumers
153
+
154
+ class Post < ActiveRecord::Base
155
+ nsync_consumer
156
+ end
157
+
158
+ You can (and probably should) override all or some of the default methods in
159
+ the Consumer interface. By default, it basically just attempts to copy the hash
160
+ from the file into the consuming database. If your object has any relations,
161
+ this will probably fail tragically. A better Post class would be
162
+
163
+ class Post < ActiveRecord::Base
164
+
165
+ nsync_consumer
166
+
167
+ def self.nsync_add_data(consumer, event_type, filename, data)
168
+ post = new
169
+ post.source_id = data['id']
170
+ post.nsync_update(consumer, event_type, filename, data)
171
+ end
172
+
173
+ def nsync_update(consumer, event_type, filename, data)
174
+ if event_type == :deleted
175
+ destroy
176
+ else
177
+ self.author = Author.nsync_find(data['author_id']).first
178
+ self.content = data['content']
179
+
180
+ related_post_source_ids = data['related_post_ids']
181
+ post = self
182
+
183
+ consumer.after_current_class_finished(lambda {
184
+ post.related_posts = Post.all(:conditions => {:source_id =>
185
+ related_post_source_ids})
186
+ })
187
+
188
+ self.save
189
+ end
190
+ end
191
+ end
192
+
193
+ This also demonstrates how to add callbacks to queues.
194
+
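The callback in the example above defers linking related posts until every post of the class has been processed, so that the referenced rows exist. A minimal sketch of such a deferred queue, assuming a hypothetical `CallbackQueue` class (this is not Nsync's implementation of `after_current_class_finished`):

```ruby
# Hypothetical sketch of a deferred-callback queue; not Nsync's actual code.
class CallbackQueue
  def initialize
    @callbacks = []
  end

  # Queue a callable to run later.
  def add(callable)
    @callbacks << callable
  end

  # Run queued callables in order, clear the queue, and return how many ran.
  def run_all
    callbacks, @callbacks = @callbacks, []
    callbacks.each { |c| c.call }
    callbacks.size
  end
end

log = []
queue = CallbackQueue.new
%w[post-1 post-2].each do |name|
  log << "saved #{name}"
  queue.add(lambda { log << "linked #{name}" })
end
queue.run_all
puts log
```

All "saved" entries appear before any "linked" entry, which is the ordering the README example depends on.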
195
+ You can update from the repo like so:
196
+
197
+ consumer = Nsync::Consumer.new
198
+ consumer.update
199
+
200
+
201
+
202
+
203
+
204
+
data/Rakefile ADDED
@@ -0,0 +1,47 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ gem "schleyfox-grit", ">= 2.3.0.1"
5
+ require 'grit'
6
+
7
+ require 'jeweler'
8
+ require 'jeweler_monkey_patch'
9
+
10
+
11
+ Jeweler::Tasks.new do |gem|
12
+ gem.name = "nsync"
13
+ gem.summary = %Q{Keep your data processors and apps in sync}
14
+ gem.description = %Q{Nsync is designed to allow you to have a separate data
15
+ processing app with its own data processing optimized database and a consumer
16
+ app with its own database, while keeping the data as in sync as you want it.}
17
+ gem.email = "ben@pixelmachine.org"
18
+ gem.homepage = "http://github.com/schleyfox/nsync"
19
+ gem.authors = ["Ben Hughes"]
20
+
21
+ gem.add_dependency "json"
22
+ gem.add_dependency "activesupport", "~> 2.3.5"
23
+ gem.add_dependency "schleyfox-grit", ">= 2.3.0.1"
24
+ gem.add_dependency "schleyfox-lockfile", ">= 1.0.0"
25
+
26
+ gem.add_development_dependency "shoulda"
27
+ gem.add_development_dependency "mocha"
28
+ end
29
+
30
+
31
+ require 'rake/testtask'
32
+ Rake::TestTask.new(:test) do |test|
33
+ test.libs << 'lib' << 'test'
34
+ test.pattern = 'test/**/*_test.rb'
35
+ test.verbose = true
36
+ end
37
+
38
+ desc "Generate RCov test coverage and open in your browser"
39
+ task :coverage do
40
+ require 'rcov'
41
+ sh "rm -fr coverage"
42
+ sh "rcov -Itest -Ilib -x \"rubygems/*,/Library/Ruby/Site/*,gems/*,rcov*\" --html test/*_test.rb"
43
+ sh "open coverage/index.html"
44
+ end
45
+
46
+ require 'yard'
47
+ YARD::Rake::YardocTask.new
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.0.2
@@ -0,0 +1,59 @@
1
+ # because the outdated git-ruby version is not interface compatible with grit,
2
+ # we have to prevent it from being used
3
+ module Jeweler::Specification
4
+ def set_jeweler_defaults(base_dir, git_base_dir = nil)
5
+ base_dir = File.expand_path(base_dir)
6
+ git_base_dir = if git_base_dir
7
+ File.expand_path(git_base_dir)
8
+ else
9
+ base_dir
10
+ end
11
+ can_git = git_base_dir && base_dir.include?(git_base_dir) && File.directory?(File.join(git_base_dir, '.git'))
12
+
13
+ Dir.chdir(git_base_dir) do
14
+ all_files = `git ls-files`.split("\n").reject{|file| file =~ /^\./ }
15
+
16
+ if blank?(files)
17
+ base_dir_with_trailing_separator = File.join(base_dir, "")
18
+
19
+ self.files = all_files.reject{|file| file =~ /^(doc|pkg|test|spec|examples)/ }.compact.map do |file|
20
+ File.expand_path(file).sub(base_dir_with_trailing_separator, "")
21
+ end
22
+ end
23
+
24
+ if blank?(test_files)
25
+ self.test_files = all_files.select{|file| file =~ /^(test|spec|examples)/ }.compact.map do |file|
26
+ File.expand_path(file).sub(base_dir_with_trailing_separator, "")
27
+ end
28
+ end
29
+
30
+ if blank?(executables)
31
+ self.executables = all_files.select{|file| file =~ /^bin/}.map do |file|
32
+ File.basename(file)
33
+ end
34
+ end
35
+
36
+ if blank?(extensions)
37
+ self.extensions = FileList['ext/**/{extconf,mkrf_conf}.rb']
38
+ end
39
+
40
+ self.has_rdoc = true
41
+
42
+ if blank?(extra_rdoc_files)
43
+ self.extra_rdoc_files = FileList['README*', 'ChangeLog*', 'LICENSE*', 'TODO']
44
+ end
45
+
46
+ if File.exist?('Gemfile')
47
+ require 'bundler'
48
+ bundler = Bundler.load
49
+ bundler.dependencies_for(:default, :runtime).each do |dependency|
50
+ self.add_dependency dependency.name, *dependency.requirement.as_list
51
+ end
52
+ bundler.dependencies_for(:development).each do |dependency|
53
+ self.add_development_dependency dependency.name, *dependency.requirement.as_list
54
+ end
55
+ end
56
+
57
+ end
58
+ end
59
+ end
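The patched method above sorts the output of `git ls-files` into gem files, test files, and executables using regular expressions. The sketch below applies the same patterns to a hard-coded file list so it runs without a Git checkout; the helper `classify_files` and the sample paths (including `bin/nsync-demo`) are hypothetical.

```ruby
# Hypothetical helper applying the same regular expressions as the patch
# above to a list of repository-relative paths.
def classify_files(paths)
  # Dotfiles are dropped first, as in the patch.
  all_files = paths.reject { |file| file =~ /^\./ }
  {
    :files       => all_files.reject { |file| file =~ /^(doc|pkg|test|spec|examples)/ },
    :test_files  => all_files.select { |file| file =~ /^(test|spec|examples)/ },
    :executables => all_files.select { |file| file =~ /^bin/ }.map { |file| File.basename(file) }
  }
end

# Sample paths standing in for `git ls-files` output.
sample = %w[
  .gitignore README.md Rakefile bin/nsync-demo
  lib/nsync.rb lib/nsync/config.rb test/helper.rb test/nsync_test.rb
]

result = classify_files(sample)
puts result[:files].inspect
puts result[:test_files].inspect
puts result[:executables].inspect
```

Note that the patterns are anchored only at the start of the path, so a top-level file such as `testing_notes.md` would also be treated as a test file.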
@@ -0,0 +1,47 @@
1
+ module Nsync
2
+ module ActiveRecord
3
+ module Consumer
4
+ module ClassMethods
5
+ def nsync_find(ids)
6
+ nsync_opts = read_inheritable_attribute(:nsync_opts)
7
+ all(:conditions => {nsync_opts[:id_key] => ids})
8
+ end
9
+
10
+ def nsync_add_data(consumer, event_type, filename, data)
11
+ data = data.dup
12
+ nsync_opts = read_inheritable_attribute(:nsync_opts)
13
+ if nsync_opts[:id_key].to_s != "id"
14
+ data[nsync_opts[:id_key].to_s] = data.delete("id")
15
+ create(data)
16
+ else
17
+ id = data.delete("id")
18
+ obj = new(data)
19
+ obj.id = id
20
+ obj.save
21
+ end
22
+ end
23
+ end
24
+
25
+ module InstanceMethods
26
+ def self.included(base)
27
+ base.send(:extend, ClassMethods)
28
+ end
29
+ def nsync_update(consumer, event_type, filename, data)
30
+ data = data.dup
31
+ if event_type == :deleted
32
+ destroy
33
+ else
34
+ nsync_opts = self.class.read_inheritable_attribute(:nsync_opts)
35
+ if nsync_opts[:id_key].to_s != "id"
36
+ data[nsync_opts[:id_key].to_s] = data.delete("id")
37
+ else
38
+ self.id = data.delete("id")
39
+ end
40
+ update_attributes(data)
41
+ end
42
+ end
43
+ end
44
+ end
45
+ end
46
+ end
47
+
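Both consumer methods above move the producer's "id" value under the consumer's configured key before writing attributes. A small sketch of that remapping on a plain hash, with `remap_id` as a hypothetical helper name:

```ruby
# Hypothetical helper mirroring the key remapping in the diff above.
def remap_id(data, id_key)
  data = data.dup # leave the caller's hash untouched
  key = id_key.to_s
  # Only move the value when the consumer stores it under a different key.
  data[key] = data.delete("id") if key != "id"
  data
end

incoming = { "id" => 42, "title" => "Hello" }
puts remap_id(incoming, :source_id).inspect
puts remap_id(incoming, :id).inspect
puts incoming.inspect
```

Keeping the producer's id in a separate column such as `source_id` lets the consumer database keep its own primary keys.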
@@ -0,0 +1,23 @@
1
+ require File.join(File.dirname(__FILE__), 'consumer/methods')
2
+ require File.join(File.dirname(__FILE__), 'producer/methods')
3
+
4
+ module Nsync
5
+ module ActiveRecord
6
+ module ClassMethods
7
+ # Makes this class an Nsync Consumer
8
+ def nsync_consumer(opts={})
9
+ nsync_opts = {:id_key => :source_id}.merge(opts)
10
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
11
+ include Nsync::ActiveRecord::Consumer::InstanceMethods
12
+ end
13
+
14
+ def nsync_producer(opts={})
15
+ nsync_opts = {:id_key => :id}.merge(opts)
16
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
17
+ include Nsync::ActiveRecord::Consumer::InstanceMethods
18
+ include Nsync::Producer::InstanceMethods
19
+ include Nsync::ActiveRecord::Producer::InstanceMethods
20
+ end
21
+ end
22
+ end
23
+ end
@@ -0,0 +1,21 @@
1
+ module Nsync
2
+ module ActiveRecord
3
+ module Producer
4
+ module InstanceMethods
5
+ def self.included(base)
6
+ puts "foo"
7
+ base.class_eval do
8
+ p base
9
+ after_save :nsync_write
10
+ before_destroy :nsync_destroy
11
+ end
12
+ end
13
+
14
+ def to_nsync_hash
15
+ attributes
16
+ end
17
+ end
18
+ end
19
+ end
20
+ end
21
+
@@ -0,0 +1,17 @@
1
+ require File.join(File.dirname(__FILE__), 'producer/methods')
2
+ module Nsync
3
+ module ClassMethods
4
+ # Makes this class an Nsync Consumer
5
+ def nsync_consumer(opts={})
6
+ nsync_opts = {:id_key => :source_id}.merge(opts)
7
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
8
+ end
9
+
10
+ def nsync_producer(opts={})
11
+ nsync_opts = {:id_key => :id}.merge(opts)
12
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
13
+ include Nsync::Producer::InstanceMethods
14
+ end
15
+ end
16
+ end
17
+
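The class methods above store per-class options with ActiveSupport's `write_inheritable_attribute`. As a rough illustration of the same idea without ActiveSupport, the sketch below keeps the options in a class-level instance variable and falls back to the superclass on lookup. `MiniNsync`, `SketchRecord`, and the other class names are hypothetical, and the lookup is a simplification of what ActiveSupport does.

```ruby
# Hypothetical plain-Ruby analogue of the option storage in the diff above.
module MiniNsync
  module ClassMethods
    # Consumers default to looking records up by :source_id.
    def nsync_consumer(opts = {})
      @nsync_opts = { :id_key => :source_id }.merge(opts)
    end

    # Producers default to their own primary key.
    def nsync_producer(opts = {})
      @nsync_opts = { :id_key => :id }.merge(opts)
    end

    # Walk up the superclass chain so subclasses see their parent's options.
    def nsync_opts
      return @nsync_opts if defined?(@nsync_opts) && @nsync_opts
      superclass.respond_to?(:nsync_opts) ? superclass.nsync_opts : nil
    end
  end
end

class SketchRecord
  extend MiniNsync::ClassMethods
end

class SketchPost < SketchRecord
  nsync_consumer
end

class SketchComment < SketchRecord
  nsync_consumer :id_key => :remote_id
end

class SketchFeaturedPost < SketchPost; end

puts SketchPost.nsync_opts.inspect
puts SketchComment.nsync_opts.inspect
puts SketchFeaturedPost.nsync_opts.inspect
```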
@@ -0,0 +1,96 @@
1
+ module Nsync
2
+ def self.config
3
+ @config ||= Config.new
4
+ end
5
+
6
+ def self.reset_config
7
+ @config = nil
8
+ end
9
+
10
+ class Config
11
+ #required to be user specified
12
+ attr_accessor :version_manager, :repo_path
13
+
14
+ #optional
15
+ attr_accessor :ordering, :repo_url, :repo_push_url, :log, :lock_file,
16
+ :producer_instance
17
+
18
+ include Lockfile
19
+
20
+ def initialize
21
+ clear_class_mappings
22
+ self.log = ::Logger.new(STDERR)
23
+ self.lock_file = "/tmp/nsync.lock"
24
+ end
25
+
26
+ def lock
27
+ ret = nil
28
+ success = with_lock_file(lock_file) do
29
+ ret = yield
30
+ end
31
+ if success != false
32
+ return ret
33
+ else
34
+ log.error("[NSYNC] Could not obtain lock!; exiting")
35
+ return false
36
+ end
37
+ end
38
+
39
+ def cd
40
+ old_pwd = FileUtils.pwd
41
+ begin
42
+ FileUtils.cd repo_path
43
+ yield
44
+ ensure
45
+ FileUtils.cd old_pwd
46
+ end
47
+ end
48
+
49
+
50
+ def map_class(producer_class, *consumer_classes)
51
+ @class_mappings[producer_class] ||= []
52
+ @class_mappings[producer_class] += consumer_classes
53
+ end
54
+
55
+ def clear_class_mappings
56
+ @class_mappings = {}
57
+ end
58
+
59
+ def consumer_classes_for(producer_class)
60
+ Array(@class_mappings[producer_class]).map do |klass|
61
+ begin
62
+ klass.constantize
63
+ rescue NameError => e
64
+ log.error(e.inspect)
65
+ log.warn("[NSYNC] Could not find class '#{klass}'; skipping")
66
+ nil
67
+ end
68
+ end.compact
69
+ end
70
+
71
+ def version_manager
72
+ return @version_manager if @version_manager
73
+ raise "Must define config.version_manager"
74
+ end
75
+
76
+ def producer_instance
77
+ @producer_instance ||= Nsync::Producer.new
78
+ end
79
+
80
+ def self.run
81
+ yield Nsync.config
82
+ end
83
+
84
+ def local?
85
+ !repo_url
86
+ end
87
+
88
+ def remote?
89
+ !!repo_url
90
+ end
91
+
92
+ def remote_push?
93
+ !!repo_push_url
94
+ end
95
+ end
96
+ end
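The mapping methods in the config class above accumulate consumer class names per producer and resolve them to classes at lookup time, skipping names that do not exist. The sketch below reproduces that behavior in plain Ruby, assuming `Object.const_get` in place of ActiveSupport's `constantize` (so it handles only top-level constants) and `warn` in place of the logger. `MiniConfig`, `SketchInfo`, and `MissingClass` are hypothetical names.

```ruby
# Hypothetical stand-in for the class-mapping part of the config above.
class MiniConfig
  def initialize
    @class_mappings = {}
  end

  # Append one or more consumer class names for a producer class name.
  def map_class(producer_class, *consumer_classes)
    @class_mappings[producer_class] ||= []
    @class_mappings[producer_class] += consumer_classes
  end

  # Resolve mapped names to classes, dropping any that cannot be found.
  def consumer_classes_for(producer_class)
    Array(@class_mappings[producer_class]).map do |name|
      begin
        Object.const_get(name)
      rescue NameError
        warn "[sketch] Could not find class '#{name}'; skipping"
        nil
      end
    end.compact
  end
end

class SketchInfo; end

config = MiniConfig.new
config.map_class "RawDataInfo", "SketchInfo"
config.map_class "RawDataInfo", "MissingClass"

puts config.consumer_classes_for("RawDataInfo").inspect
puts config.consumer_classes_for("Unmapped").inspect
```

Storing names as strings and resolving them late means the configuration can be loaded before the consumer classes themselves are defined.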