nsync 0.0.2

data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ Copyright (c) 2010 Ben Hughes
2
+ Copyright (c) 2010 NabeWise Media, Inc.
3
+
4
+ Permission is hereby granted, free of charge, to any person obtaining a copy
5
+ of this software and associated documentation files (the "Software"), to deal
6
+ in the Software without restriction, including without limitation the rights
7
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the Software is
9
+ furnished to do so, subject to the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be included in
12
+ all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
15
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
16
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
17
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
18
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
19
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
20
+ THE SOFTWARE.
21
+
data/README.md ADDED
@@ -0,0 +1,204 @@
1
+ Nsync: Git based database synchronization
2
+ =========================================
3
+
4
+ Nsync allows you to keep disparate data sources synchronized for core data.
5
+ The use case this is designed to solve is when you have a data processing
6
+ system and one or many consumer facing services that depend on a canonical,
7
+ processed version of the data. All of this is based on the power of Git.
8
+ Nsync makes no assumptions about your data stores, ORMs, or storage practices
9
+ other than that the data from the producer has something that can serve as a
10
+ unique primary key and that the consumer can be queried by a key indicating
11
+ its source. That said, Nsync comes with extensions for ActiveRecord 2.3.x
12
+ that handle the simple case.
13
+
14
+ A NabeWise Story
15
+ ----------------
16
+
17
+ Nsync was born out of our needs at NabeWise (http://nabewise.com). We deal
18
+ with neighborhoods and cities. Our source data consists of about 70,000
19
+ neighborhoods across the US. We carefully curate neighborhoods for each of
20
+ our cities. Oftentimes this involves editing boundaries, making changes to
21
+ neighborhood names, or other slight adjustments to the underlying data. We
22
+ also process and refine the boundaries for display through a number of
23
+ automated processes to get the data to where we want. This occurs in a Rails
24
+ app based on top of PostgreSQL with PostGIS. We have to get this data to our
25
+ website, which runs on top of MySQL and Redis. We have enough data that full
26
+ reloads from files are impractical, and we also want to handle events like
27
+ neighborhood deletion intelligently. Nsync solves these issues.
28
+
29
+ Installation
30
+ ------------
31
+
32
+ gem install nsync
33
+
34
+ Nsync depends on two gems that I've forked, schleyfox-grit and
35
+ schleyfox-lockfile. I'm sorry, but this is how it has to be. Nsync also
36
+ currently depends on ActiveSupport ~> 2.3.5, but I am working to remove this
37
+ dependency.
38
+
39
+ Terminology
40
+ -----------
41
+
42
+ In Nsync lingo, a producer is an object/class that creates data that will go
43
+ into a repository and propagate to consumers. It adheres to the Producer
44
+ interface. A consumer is an object/class that takes data from the repo and
45
+ updates itself accordingly. It adheres to the Consumer interface. A producer
46
+ is also a consumer of itself.
47
+
48
+ Producer Usage
49
+ --------------
50
+
51
+ To start off with, you have to configure your shiny new producer app. This
52
+ configuration should happen before the producer is ever used.
53
+
54
+ Nsync::Config.run do |c|
55
+ # The producer uses a standard repository
56
+ # This will automatically be created if it does not exist
57
+ c.repo_path = "/local/path/to/hold/data"
58
+ # The remote repository url will get data pushed to it
59
+ c.repo_push_url = "git@examplegithost:username/data.git"
60
+
61
+ # This must be Nsync::GitVersionManager if you want things like
62
+ # rollback to work.
63
+ c.version_manager = Nsync::GitVersionManager.new
64
+
65
+ # A lock file path to use for this app
66
+ c.lock_file = "/tmp/app_name_nsync.lock"
67
+ end
68
+
69
+ Now you need to let your objects know the joy of Nsync. This is not strictly
70
+ necessary, but can help out. If you are using ActiveRecord, do this:
71
+
72
+ ActiveRecord::Base.send(:extend, Nsync::ActiveRecord::ClassMethods)
73
+
74
+ If you are using something else, do this:
75
+
76
+ YourBaseObject.send(:extend, Nsync::ClassMethods)
77
+
78
+ Now, set your data classes up as producers
79
+
80
+ class Post < ActiveRecord::Base
81
+ nsync_producer
82
+ end
83
+
84
+ By default, this will write out the JSON-ified contents of its attributes (if
85
+ it's ActiveRecord; otherwise you have to define its representation) to
86
+ "CLASS_NAME/ID.json" in the repo.
87
+
88
+ If not all of your data should be exported, you can specify an :if function
89
+
90
+ class Post
91
+ nsync_producer :if => lambda {|o| o.should_be_exported }
92
+ end
93
+
94
+ After you make some data changes, you can commit and push them by doing
95
+
96
+ producer = Nsync::Producer.new
97
+ producer.commit("Short Message Describing Changes")
98
+
99
+ See Nsync::ClassMethods, Nsync::ActiveRecord::ClassMethods,
100
+ Nsync::Producer::InstanceMethods, and
101
+ Nsync::ActiveRecord::Producer::InstanceMethods for more information
102
+
103
+ Consumer Usage
104
+ --------------
105
+
106
+ Every good producer needs one (or many) good consumers. Again, the first step
107
+ is configuration.
108
+
109
+ The Consumer is a little less straightforward. It requires that the classes
110
+ from the Producer side be mapped to classes on the Consumer side. This
111
+ happens using Nsync::Config#map_class, which maps from a Producer class name
112
+ to one or many Consumer classes.
113
+
114
+ It also requires that Nsync::Config#version_manager is set to a class or
115
+ instance that conforms to the VersionManager interface. This is probably a
116
+ class on top of a database that stores versions (by commit id) as they are
117
+ loaded into the system, such that the current version and all previous
118
+ versions can be easily accessed. The ActiveRecord integration tests
119
+ demonstrate this.
120
+
121
+ Nsync::Config.run do |c|
122
+ # The consumer uses a read-only, bare repository (one ending in .git)
123
+ # This will automatically be created if it does not exist
124
+ c.repo_path = "/local/path/to/hold/data.git"
125
+ # The remote repository url from which to pull data
126
+ c.repo_url = "git@examplegithost:username/data.git"
127
+
128
+ # An object that implements the VersionManager interface
129
+ # (see Nsync::GitVersionManager for an example)
130
+ c.version_manager = MyCustomVersionManager.new
131
+
132
+ # A lock file path to use for this app
133
+ c.lock_file = "/tmp/app_name_nsync.lock"
134
+
135
+ # The class mapping maps from the class names of the producer classes to
136
+ # the class names of their associated consuming classes. A producer can
137
+ # map to one or many consumers, and a consumer can be mapped to one or many
138
+ # producers. Consumer classes should implement the Consumer interface.
139
+ c.map_class "RawDataPostClass", "Post"
140
+ c.map_class "RawDataInfo", "Info"
141
+ end
142
+
143
+ Now you should let your classes know about the Nsync way. If you are using
144
+ ActiveRecord, do this:
145
+
146
+ ActiveRecord::Base.send(:extend, Nsync::ActiveRecord::ClassMethods)
147
+
148
+ If you are using something else, do this:
149
+
150
+ YourBaseObject.send(:extend, Nsync::ClassMethods)
151
+
152
+ Now it's time to let your objects know that they are consumers
153
+
154
+ class Post < ActiveRecord::Base
155
+ nsync_consumer
156
+ end
157
+
158
+ You can (and probably should) override all or some of the default methods in
159
+ the Consumer interface. By default, it just attempts to copy the hash
160
+ from the file into the consuming database. If your object has any relations,
161
+ this will probably fail tragically. A better Post class would be
162
+
163
+ class Post < ActiveRecord::Base
164
+
165
+ nsync_consumer
166
+
167
+ def self.nsync_add_data(consumer, event_type, filename, data)
168
+ post = new
169
+ post.source_id = data['id']
170
+ post.nsync_update(consumer, event_type, filename, data)
171
+ end
172
+
173
+ def nsync_update(consumer, event_type, filename, data)
174
+ if event_type == :deleted
175
+ destroy
176
+ else
177
+ self.author = Author.nsync_find(data['author_id']).first
178
+ self.content = data['content']
179
+
180
+ related_post_source_ids = data['related_post_ids']
181
+ post = self
182
+
183
+ consumer.after_current_class_finished(lambda {
184
+ post.related_posts = Post.all(:conditions => {:source_id =>
185
+ related_post_source_ids})
186
+ })
187
+
188
+ self.save
189
+ end
190
+ end
191
+ end
192
+
193
+ This also demonstrates how to add callbacks to queues.
194
+
195
+ You can update from the repo like so:
196
+
197
+ consumer = Nsync::Consumer.new
198
+ consumer.update
199
+
200
+
201
+
202
+
203
+
204
+
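The `after_current_class_finished` call in the README's Post example defers work until every record of the current class has been processed, so related posts can be looked up once they all exist. The following is a minimal plain-Ruby sketch of that idea only. `SketchConsumer`, `process_class`, and the in-memory `store` hash are hypothetical names for illustration and are not Nsync's actual implementation.

```ruby
# Hypothetical, in-memory illustration of a deferred-callback queue.
class SketchConsumer
  def initialize
    @after_class_queue = []
  end

  # Queue a callable to run once all records of the current class are handled.
  def after_current_class_finished(callback)
    @after_class_queue << callback
  end

  # Yield each record to the block, then drain the queue in FIFO order.
  def process_class(records)
    records.each { |record| yield record }
    @after_class_queue.shift.call until @after_class_queue.empty?
  end
end

store = {} # source_id => post hash (stands in for the consumer database)

consumer = SketchConsumer.new
records = [
  { "id" => 1, "related_post_ids" => [2] },
  { "id" => 2, "related_post_ids" => [1] }
]

consumer.process_class(records) do |data|
  post = { :source_id => data["id"], :related => [] }
  store[post[:source_id]] = post
  related_ids = data["related_post_ids"]
  # Post 2 does not exist yet while post 1 is processed, so defer the lookup.
  consumer.after_current_class_finished(lambda {
    post[:related] = related_ids.map { |id| store[id][:source_id] }
  })
end
```

Resolving relations inside the loop would fail for post 1, because post 2 has not been stored at that point; the queue sidesteps the ordering problem.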
data/Rakefile ADDED
@@ -0,0 +1,47 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ gem "schleyfox-grit", ">= 2.3.0.1"
5
+ require 'grit'
6
+
7
+ require 'jeweler'
8
+ require 'jeweler_monkey_patch'
9
+
10
+
11
+ Jeweler::Tasks.new do |gem|
12
+ gem.name = "nsync"
13
+ gem.summary = %Q{Keep your data processors and apps in sync}
14
+ gem.description = %Q{Nsync is designed to allow you to have a separate data
15
+ processing app with its own data processing optimized database and a consumer
16
+ app with its own database, while keeping the data as in sync as you want it.}
17
+ gem.email = "ben@pixelmachine.org"
18
+ gem.homepage = "http://github.com/schleyfox/nsync"
19
+ gem.authors = ["Ben Hughes"]
20
+
21
+ gem.add_dependency "json"
22
+ gem.add_dependency "activesupport", "~> 2.3.5"
23
+ gem.add_dependency "schleyfox-grit", ">= 2.3.0.1"
24
+ gem.add_dependency "schleyfox-lockfile", ">= 1.0.0"
25
+
26
+ gem.add_development_dependency "shoulda"
27
+ gem.add_development_dependency "mocha"
28
+ end
29
+
30
+
31
+ require 'rake/testtask'
32
+ Rake::TestTask.new(:test) do |test|
33
+ test.libs << 'lib' << 'test'
34
+ test.pattern = 'test/**/*_test.rb'
35
+ test.verbose = true
36
+ end
37
+
38
+ desc "Generate RCov test coverage and open in your browser"
39
+ task :coverage do
40
+ require 'rcov'
41
+ sh "rm -fr coverage"
42
+ sh "rcov -Itest -Ilib -x \"rubygems/*,/Library/Ruby/Site/*,gems/*,rcov*\" --html test/*_test.rb"
43
+ sh "open coverage/index.html"
44
+ end
45
+
46
+ require 'yard'
47
+ YARD::Rake::YardocTask.new
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.0.2
@@ -0,0 +1,59 @@
1
+ # because the outdated git-ruby version is not interface compatible with grit,
2
+ # we have to prevent it from being used
3
+ module Jeweler::Specification
4
+ def set_jeweler_defaults(base_dir, git_base_dir = nil)
5
+ base_dir = File.expand_path(base_dir)
6
+ git_base_dir = if git_base_dir
7
+ File.expand_path(git_base_dir)
8
+ else
9
+ base_dir
10
+ end
11
+ can_git = git_base_dir && base_dir.include?(git_base_dir) && File.directory?(File.join(git_base_dir, '.git'))
12
+
13
+ Dir.chdir(git_base_dir) do
14
+ all_files = `git ls-files`.split("\n").reject{|file| file =~ /^\./ }
15
+
16
+ if blank?(files)
17
+ base_dir_with_trailing_separator = File.join(base_dir, "")
18
+
19
+ self.files = all_files.reject{|file| file =~ /^(doc|pkg|test|spec|examples)/ }.compact.map do |file|
20
+ File.expand_path(file).sub(base_dir_with_trailing_separator, "")
21
+ end
22
+ end
23
+
24
+ if blank?(test_files)
25
+ self.test_files = all_files.select{|file| file =~ /^(test|spec|examples)/ }.compact.map do |file|
26
+ File.expand_path(file).sub(base_dir_with_trailing_separator, "")
27
+ end
28
+ end
29
+
30
+ if blank?(executables)
31
+ self.executables = all_files.select{|file| file =~ /^bin/}.map do |file|
32
+ File.basename(file)
33
+ end
34
+ end
35
+
36
+ if blank?(extensions)
37
+ self.extensions = FileList['ext/**/{extconf,mkrf_conf}.rb']
38
+ end
39
+
40
+ self.has_rdoc = true
41
+
42
+ if blank?(extra_rdoc_files)
43
+ self.extra_rdoc_files = FileList['README*', 'ChangeLog*', 'LICENSE*', 'TODO']
44
+ end
45
+
46
+ if File.exist?('Gemfile')
47
+ require 'bundler'
48
+ bundler = Bundler.load
49
+ bundler.dependencies_for(:default, :runtime).each do |dependency|
50
+ self.add_dependency dependency.name, *dependency.requirement.as_list
51
+ end
52
+ bundler.dependencies_for(:development).each do |dependency|
53
+ self.add_development_dependency dependency.name, *dependency.requirement.as_list
54
+ end
55
+ end
56
+
57
+ end
58
+ end
59
+ end
@@ -0,0 +1,47 @@
1
+ module Nsync
2
+ module ActiveRecord
3
+ module Consumer
4
+ module ClassMethods
5
+ def nsync_find(ids)
6
+ nsync_opts = read_inheritable_attribute(:nsync_opts)
7
+ all(:conditions => {nsync_opts[:id_key] => ids})
8
+ end
9
+
10
+ def nsync_add_data(consumer, event_type, filename, data)
11
+ data = data.dup
12
+ nsync_opts = read_inheritable_attribute(:nsync_opts)
13
+ if nsync_opts[:id_key].to_s != "id"
14
+ data[nsync_opts[:id_key].to_s] = data.delete("id")
15
+ create(data)
16
+ else
17
+ id = data.delete("id")
18
+ obj = new(data)
19
+ obj.id = id
20
+ obj.save
21
+ end
22
+ end
23
+ end
24
+
25
+ module InstanceMethods
26
+ def self.included(base)
27
+ base.send(:extend, ClassMethods)
28
+ end
29
+ def nsync_update(consumer, event_type, filename, data)
30
+ data = data.dup
31
+ if event_type == :deleted
32
+ destroy
33
+ else
34
+ nsync_opts = self.class.read_inheritable_attribute(:nsync_opts)
35
+ if nsync_opts[:id_key].to_s != "id"
36
+ data[nsync_opts[:id_key].to_s] = data.delete("id")
37
+ else
38
+ self.id = data.delete("id")
39
+ end
40
+ update_attributes(data)
41
+ end
42
+ end
43
+ end
44
+ end
45
+ end
46
+ end
47
+
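The consumer's `nsync_add_data` above moves the producer's `"id"` under the configured `:id_key` (`:source_id` by default for consumers) before creating the record. Here is a plain-Ruby sketch of just that hash transformation, with no ActiveRecord involved. The helper name `remap_id` is an assumption made for this illustration.

```ruby
# Hypothetical helper mirroring the key remapping in nsync_add_data.
# Returns a copy of data with "id" moved under id_key; the input is untouched.
def remap_id(data, id_key)
  data = data.dup
  key = id_key.to_s
  data[key] = data.delete("id") unless key == "id"
  data
end

producer_row = { "id" => 42, "content" => "hello" }
consumer_row = remap_id(producer_row, :source_id)
same_key_row = remap_id(producer_row, :id)
```

In the ActiveRecord version the `"id"` case is handled separately by assigning `obj.id` directly, presumably because ActiveRecord does not mass-assign the primary key.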
@@ -0,0 +1,23 @@
1
+ require File.join(File.dirname(__FILE__), 'consumer/methods')
2
+ require File.join(File.dirname(__FILE__), 'producer/methods')
3
+
4
+ module Nsync
5
+ module ActiveRecord
6
+ module ClassMethods
7
+ # Makes this class an Nsync Consumer
8
+ def nsync_consumer(opts={})
9
+ nsync_opts = {:id_key => :source_id}.merge(opts)
10
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
11
+ include Nsync::ActiveRecord::Consumer::InstanceMethods
12
+ end
13
+
14
+ def nsync_producer(opts={})
15
+ nsync_opts = {:id_key => :id}.merge(opts)
16
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
17
+ include Nsync::ActiveRecord::Consumer::InstanceMethods
18
+ include Nsync::Producer::InstanceMethods
19
+ include Nsync::ActiveRecord::Producer::InstanceMethods
20
+ end
21
+ end
22
+ end
23
+ end
@@ -0,0 +1,21 @@
1
+ module Nsync
2
+ module ActiveRecord
3
+ module Producer
4
+ module InstanceMethods
5
+ def self.included(base)
6
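+ # Hook Nsync into the including class's ActiveRecord callbacks.
7
+ base.class_eval do
8
+ # Run nsync_write after save and nsync_destroy before destroy.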
9
+ after_save :nsync_write
10
+ before_destroy :nsync_destroy
11
+ end
12
+ end
13
+
14
+ def to_nsync_hash
15
+ attributes
16
+ end
17
+ end
18
+ end
19
+ end
20
+ end
21
+
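`to_nsync_hash` above supplies the hash that, per the README, is written as JSON to "CLASS_NAME/ID.json", optionally filtered by an `:if` predicate. The sketch below shows how those pieces might fit together. `nsync_export` is a hypothetical helper, and unlike the README's lambda (which receives the object) its predicate receives the attribute hash, to keep the example free of ActiveRecord.

```ruby
require 'json'

# Hypothetical helper: returns [relative_path, json_string] for a record,
# or nil when the optional :if predicate rejects it.
def nsync_export(class_name, attributes, opts = {})
  predicate = opts[:if]
  return nil if predicate && !predicate.call(attributes)
  path = File.join(class_name, "#{attributes['id']}.json")
  [path, JSON.generate(attributes)]
end

published = { "id" => 7, "title" => "Hello", "published" => true }
draft     = { "id" => 8, "title" => "WIP",   "published" => false }
only_published = lambda { |attrs| attrs["published"] }

path, payload = nsync_export("Post", published, :if => only_published)
skipped = nsync_export("Post", draft, :if => only_published)
```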
@@ -0,0 +1,17 @@
1
+ require File.join(File.dirname(__FILE__), 'producer/methods')
2
+ module Nsync
3
+ module ClassMethods
4
+ # Makes this class an Nsync Consumer
5
+ def nsync_consumer(opts={})
6
+ nsync_opts = {:id_key => :source_id}.merge(opts)
7
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
8
+ end
9
+
10
+ def nsync_producer(opts={})
11
+ nsync_opts = {:id_key => :id}.merge(opts)
12
+ write_inheritable_attribute(:nsync_opts, nsync_opts)
13
+ include Nsync::Producer::InstanceMethods
14
+ end
15
+ end
16
+ end
17
+
@@ -0,0 +1,96 @@
1
+ module Nsync
2
+ def self.config
3
+ @config ||= Config.new
4
+ end
5
+
6
+ def self.reset_config
7
+ @config = nil
8
+ end
9
+
10
+ class Config
11
+ #required to be user specified
12
+ attr_accessor :version_manager, :repo_path
13
+
14
+ #optional
15
+ attr_accessor :ordering, :repo_url, :repo_push_url, :log, :lock_file,
16
+ :producer_instance
17
+
18
+ include Lockfile
19
+
20
+ def initialize
21
+ clear_class_mappings
22
+ self.log = ::Logger.new(STDERR)
23
+ self.lock_file = "/tmp/nsync.lock"
24
+ end
25
+
26
+ def lock
27
+ ret = nil
28
+ success = with_lock_file(lock_file) do
29
+ ret = yield
30
+ end
31
+ if success != false
32
+ return ret
33
+ else
34
+ log.error("[NSYNC] Could not obtain lock; exiting")
35
+ return false
36
+ end
37
+ end
38
+
39
+ def cd
40
+ old_pwd = FileUtils.pwd
41
+ begin
42
+ FileUtils.cd repo_path
43
+ yield
44
+ ensure
45
+ FileUtils.cd old_pwd
46
+ end
47
+ end
48
+
49
+
50
+ def map_class(producer_class, *consumer_classes)
51
+ @class_mappings[producer_class] ||= []
52
+ @class_mappings[producer_class] += consumer_classes
53
+ end
54
+
55
+ def clear_class_mappings
56
+ @class_mappings = {}
57
+ end
58
+
59
+ def consumer_classes_for(producer_class)
60
+ Array(@class_mappings[producer_class]).map do |klass|
61
+ begin
62
+ klass.constantize
63
+ rescue NameError => e
64
+ log.error(e.inspect)
65
+ log.warn("[NSYNC] Could not find class '#{klass}'; skipping")
66
+ nil
67
+ end
68
+ end.compact
69
+ end
70
+
71
+ def version_manager
72
+ return @version_manager if @version_manager
73
+ raise "Must define config.version_manager"
74
+ end
75
+
76
+ def producer_instance
77
+ @producer_instance ||= Nsync::Producer.new
78
+ end
79
+
80
+ def self.run
81
+ yield Nsync.config
82
+ end
83
+
84
+ def local?
85
+ !repo_url
86
+ end
87
+
88
+ def remote?
89
+ !!repo_url
90
+ end
91
+
92
+ def remote_push?
93
+ !!repo_push_url
94
+ end
95
+ end
96
+ end
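`map_class` and `consumer_classes_for` above accumulate producer-to-consumer name mappings and resolve them lazily, skipping names that do not resolve to a class. The standalone sketch below shows that behavior. `MappingSketch` is a hypothetical stand-in for the relevant part of `Nsync::Config`, and it swaps ActiveSupport's `constantize` for `Object.const_get` so it runs without dependencies; logging is omitted.

```ruby
# Hypothetical stand-in for the class-mapping part of Nsync::Config.
class MappingSketch
  def initialize
    @class_mappings = {}
  end

  # Map one producer class name to one or more consumer class names.
  # Repeated calls for the same producer append rather than overwrite.
  def map_class(producer_class, *consumer_classes)
    @class_mappings[producer_class] ||= []
    @class_mappings[producer_class] += consumer_classes
  end

  # Resolve names to classes, dropping any that are not defined.
  def consumer_classes_for(producer_class)
    Array(@class_mappings[producer_class]).map do |name|
      begin
        Object.const_get(name)
      rescue NameError
        nil
      end
    end.compact
  end
end

class Post; end
class Info; end

mapping = MappingSketch.new
mapping.map_class "RawDataPostClass", "Post"
mapping.map_class "RawDataPostClass", "Info", "MissingClass"

resolved = mapping.consumer_classes_for("RawDataPostClass")
unmapped = mapping.consumer_classes_for("Unknown")
```

Because names are stored as strings and resolved on lookup, the configuration block can run before the consumer classes are loaded.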