ssync 0.1.0

data/.gitignore ADDED
@@ -0,0 +1,2 @@
1
+ .bundle/*
2
+ pkg/*
data/Gemfile ADDED
@@ -0,0 +1,3 @@
1
+ source :rubygems
2
+
3
+ gemspec
data/Gemfile.lock ADDED
@@ -0,0 +1,23 @@
1
+ PATH
2
+ remote: .
3
+ specs:
4
+ ssync (0.1.0)
5
+ aws-s3 (~> 0.6.2)
6
+
7
+ GEM
8
+ remote: http://rubygems.org/
9
+ specs:
10
+ aws-s3 (0.6.2)
11
+ builder
12
+ mime-types
13
+ xml-simple
14
+ builder (2.1.2)
15
+ mime-types (1.16)
16
+ xml-simple (1.0.12)
17
+
18
+ PLATFORMS
19
+ ruby
20
+
21
+ DEPENDENCIES
22
+ aws-s3 (~> 0.6.2)
23
+ ssync!
data/MIT-LICENSE ADDED
@@ -0,0 +1,19 @@
1
+ Copyright (c) 2010 Ryan Allen, Envato Pty Ltd
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to deal
5
+ in the Software without restriction, including without limitation the rights
6
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7
+ copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in
11
+ all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
19
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,70 @@
1
+ # Ssync
2
+
3
+ __Ssync__, an optimised S3 sync tool using the power of *nix!
4
+
5
+ ## Requirements
6
+
7
+ - Ruby 1.8 or 1.9
8
+ - RubyGems
9
+ - `aws-s3` RubyGem
10
+ - `openssl`
11
+ - `find` and `xargs`
12
+
13
+ ## Installation
14
+
15
+ gem install ssync
16
+
17
+ ## Configuration
18
+
19
+ To configure, run `ssync setup` and follow the prompts. You'll
20
+ need your AWS keys, the local file path you want to back up, the bucket name
21
+ to back up to, and any extra options to pass to `find` (e.g. to exclude
22
+ certain paths). The config is written to `~/.ssync.yml`.
23
+
24
+ ## Synchronisation
25
+
26
+ To sync, run `ssync sync` and away it goes.
27
+
28
+ In the case of a corrupted/incomplete synchronisation, run `ssync sync -f`
29
+ or `ssync sync --force` to force a checksum comparison.
30
+
31
+ ## Why?
32
+
33
+ This library was written because we needed to be able to back up loads of
34
+ data without having to worry about if we had enough disk space on the remote.
35
+ That's where S3 is nice.
36
+
37
+ We tried [s3sync](http://www.s3sync.net/) but it drove our server load too high. We serve in excess of
38
+ 500,000 requests a day (page views, not including hits for images and the like),
39
+ and the server needs to stay responsive. The secret sauce is using the *nix
40
+ `find`, `xargs` and `openssl` commands to generate MD5 checksums for comparison.
41
+ It works quite well for us (we have almost 90,000 files to compare).
42
+
43
+ Initially the plan was to use `find` with `-ctime` but S3 isn't particularly nice about
44
+ returning a full list of objects in a bucket (default is 1000, and I want all
45
+ 90,000, and it ignores me when I ask for 1,000,000 objects). Manifest generation
46
+ on a server under load is fast enough and low enough on resources so we're sticking
47
+ with that in the interim.
48
+
49
+ FYI when you run sync, the output will look something like this:
50
+
51
+ [Thu Apr 01 11:50:25 +1100 2010] Starting, performing pre-sync checks ...
52
+ [Thu Apr 01 11:50:26 +1100 2010] Generating local manifest ...
53
+ [Thu Apr 01 11:50:26 +1100 2010] Fetching remote manifest ...
54
+ [Thu Apr 01 11:50:27 +1100 2010] Performing checksum comparison ...
55
+ [Thu Apr 01 11:50:27 +1100 2010] Pushing /tmp/backups/deep/four ...
56
+ [Thu Apr 01 11:50:28 +1100 2010] Pushing /tmp/backups/three ...
57
+ [Thu Apr 01 11:50:29 +1100 2010] Pushing /tmp/backups/two ...
58
+ [Thu Apr 01 11:50:30 +1100 2010] Pushing local manifest up to remote ...
59
+ [Thu Apr 01 11:50:31 +1100 2010] Sync complete!
60
+
61
+ You could pipe sync into a log file, which might be nice.
62
+
63
+ Have fun!
64
+
65
+ ## Authors
66
+
67
+ - [Ryan Allen](https://github.com/ryan-allen)
68
+ - [Fred Wu](https://github.com/fredwu)
69
+
70
+ This project is brought to you by [Envato](http://envato.com/) Pty Ltd.
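The checksum approach described in the README can be sketched in plain Ruby. This is a minimal illustration rather than the gem's implementation: `Digest::MD5` stands in for the `find`/`xargs`/`openssl md5` pipeline, and `local_manifest` and `files_to_push` are hypothetical names. Entries are `{ :path, :checksum }` hashes, so an array difference leaves only the files that are new or changed.

```ruby
require "digest/md5"
require "find"

# Build a manifest of { :path, :checksum } hashes for every regular file under root.
# Digest::MD5 is an assumption here; the gem shells out to `openssl md5` instead.
def local_manifest(root)
  entries = []
  Find.find(root) do |path|
    next unless File.file?(path)
    entries << { :path => path, :checksum => Digest::MD5.file(path).hexdigest }
  end
  entries.sort_by { |entry| entry[:path] }
end

# Files to push are those whose path/checksum pair is absent from the remote manifest.
def files_to_push(local, remote)
  local - remote
end
```

A file whose remote checksum differs from the local one stays in the result, as does a file that is missing remotely. Files that exist only remotely are ignored, which matches the note in `sync.rb` that deletions are not propagated.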
data/Rakefile ADDED
@@ -0,0 +1,5 @@
1
+ begin
2
+ require 'bundler'
3
+ Bundler::GemHelper.install_tasks
4
+ rescue LoadError, NameError # Bundler (or its GemHelper) is unavailable; skip the gem tasks
5
+ end
data/bin/ssync ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ $:.unshift File.dirname(__FILE__) + "/../lib"
4
+ require "ssync/command"
5
+
6
+ Ssync::Command.run!(*ARGV)
data/lib/ssync.rb ADDED
@@ -0,0 +1,8 @@
1
+ # encoding: utf-8
2
+
3
+ require "aws/s3"
4
+ require "yaml"
5
+ require "ssync/helpers"
6
+ require "ssync/setup"
7
+ require "ssync/sync"
8
+ require "ssync/version"
data/lib/ssync/command.rb ADDED
@@ -0,0 +1,52 @@
1
+ require "ssync"
2
+
3
+ module Ssync
4
+ class Command
5
+ include Helpers
6
+
7
+ def self.action
8
+ @@action
9
+ end
10
+
11
+ def self.args
12
+ @@args
13
+ end
14
+
15
+ def self.run!(*args)
16
+ new(*args).run!
17
+ end
18
+
19
+ def initialize(action = :sync, *args)
20
+ @@action = action.to_sym
21
+ @@args = args
22
+ end
23
+
24
+ def run!
25
+ pre_run_check!
26
+ perform_action!
27
+ end
28
+
29
+ private
30
+
31
+ def pre_run_check!
32
+ if action_eq?(:setup) && config_exists?
33
+ e! "Cannot run the setup, there is already an Ssync configuration in '#{config_path}'."
34
+ elsif action_eq?(:sync) && !config_exists?
35
+ e! "Cannot run the sync, there is no Ssync configuration, try 'ssync setup' to create one first."
36
+ end
37
+ end
38
+
39
+ def perform_action!
40
+ case @@action
41
+ when :setup
42
+ aquire_lock! { Ssync::Setup.run! }
43
+ when :sync
44
+ aquire_lock! { Ssync::Sync.run! }
45
+ when :help
46
+ display_help!
47
+ else
48
+ e! "Cannot perform action '#{@@action}', try 'ssync help' for usage."
49
+ end
50
+ end
51
+ end
52
+ end
data/lib/ssync/helpers.rb ADDED
@@ -0,0 +1,76 @@
1
+ module Ssync
2
+ module Helpers
3
+ def display_help!
4
+ display("Not implemented yet.")
5
+ end
6
+
7
+ def display(message)
8
+ puts("[#{Time.now}] #{message}")
9
+ end
10
+
11
+ def display_error(message)
12
+ display("Error! " + message)
13
+ end
14
+
15
+ def exit_with_error!(message)
16
+ display_error(message)
17
+ exit 1
18
+ end
19
+
20
+ alias :e :display_error
21
+ alias :e! :exit_with_error!
22
+
23
+ def ask(config_item, question)
24
+ print(question + " [#{config_item}]: ")
25
+ a = $stdin.readline.chomp
26
+ a.empty? ? config_item : a
27
+ end
28
+
29
+ def action_eq?(action)
30
+ Ssync::Command.action == action.to_sym
31
+ end
32
+
33
+ def config_exists?
34
+ File.exist?(config_path)
35
+ end
36
+
37
+ def config_path
38
+ ENV['HOME'] + "/.ssync.yml"
39
+ end
40
+
41
+ def lock_path
42
+ ENV['HOME'] + "/.ssync.lock"
43
+ end
44
+
45
+ def aquire_lock!
46
+ # better way is to write out the pid ($$) and read it back in, to make sure it's the same
47
+ e! "Found a lock at #{lock_path}, is another instance of Ssync running?" if File.exist?(lock_path)
48
+
49
+ begin
50
+ system "touch #{lock_path}"
51
+ yield
52
+ ensure
53
+ system "rm #{lock_path}"
54
+ end
55
+ end
56
+
57
+ def read_config
58
+ begin
59
+ open(config_path, "r") { |f| YAML::load(f) }
60
+ rescue
61
+ {}
62
+ end
63
+ end
64
+
65
+ def write_config!(config)
66
+ open(config_path, "w") { |f| YAML::dump(config, f) }
67
+ end
68
+
69
+ def options_set?(*options)
70
+ # True when any of the given flags was passed on the command line.
71
+ options.flatten.any? do |option|
72
+ Command.args.include?(option.to_s)
73
+ end
74
+ end
75
+ end
76
+ end
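The comment in `aquire_lock!` suggests writing the PID into the lock file and reading it back. A minimal sketch of that idea follows, assuming a hypothetical `with_pid_lock` helper that is not part of the gem. It creates the lock file exclusively, records `Process.pid`, and removes the file afterwards only if it still holds the same PID.

```ruby
# Hypothetical helper, not part of the gem: hold a PID-stamped lock around a block.
def with_pid_lock(lock_path)
  begin
    # File::EXCL makes the open fail with Errno::EEXIST if the lock file already exists.
    File.open(lock_path, File::WRONLY | File::CREAT | File::EXCL) do |f|
      f.write(Process.pid.to_s)
    end
  rescue Errno::EEXIST
    raise "Found a lock at #{lock_path} held by PID #{File.read(lock_path).strip}"
  end

  begin
    yield
  ensure
    # Remove the lock only if it is still the one this process wrote.
    if File.exist?(lock_path) && File.read(lock_path).to_i == Process.pid
      File.delete(lock_path)
    end
  end
end
```

Because the failed acquisition raises before the `ensure` block is entered, a second caller cannot delete a lock that belongs to someone else.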
data/lib/ssync/setup.rb ADDED
@@ -0,0 +1,92 @@
1
+ module Ssync
2
+ class Setup
3
+ class << self
4
+ include Helpers
5
+
6
+ def config
7
+ @config
8
+ end
9
+
10
+ def config=(config)
11
+ @config = config
12
+ end
13
+
14
+ def run!
15
+ display "Welcome to Ssync! You will now be asked a few questions; the answers will be stored at '#{config_path}'."
16
+
17
+ config = read_config
18
+
19
+ config[:aws_access_key] = ask config[:aws_access_key], "What is the AWS Access Key ID?"
20
+ config[:aws_secret_key] = ask config[:aws_secret_key], "What is the AWS Secret Access Key?"
21
+
22
+ display "Please wait while Ssync is connecting to AWS ..."
23
+
24
+ if aws_credentials_is_valid?(config)
25
+ display "Successfully connected to AWS."
26
+
27
+ config[:aws_dest_bucket] = ask config[:aws_dest_bucket], "Which bucket would you like to put your backups in? Ssync will create the bucket for you if it doesn't exist."
28
+
29
+ if bucket_exists?(config)
30
+ if bucket_empty?(config)
31
+ display "The bucket exists and is empty, great!"
32
+ else
33
+ e! "The bucket exists but is not empty, we cannot sync to a bucket that is not empty!"
34
+ end
35
+ else
36
+ display "The bucket doesn't exist, creating it now ..."
37
+ create_bucket(config)
38
+ display "The bucket has been created."
39
+ end
40
+ else
41
+ e! "Ssync wasn't able to connect to AWS, please check the credentials you supplied are correct."
42
+ end
43
+
44
+ # File.expand_path (unlike Pathname#realpath) does not raise for a missing path, so the check below can report it.
45
+ config[:local_file_path] = ask config[:local_file_path], "What is the path you would like to back up? (e.g. '/var/www')."
46
+ config[:local_file_path] = File.expand_path(config[:local_file_path])
47
+
48
+ if local_file_path_exists?(config)
49
+ display "The path is set to '#{config[:local_file_path]}'."
50
+ else
51
+ e! "The path you specified does not exist!"
52
+ end
53
+
54
+ config[:find_options] = ask config[:find_options], "Do you have any options for 'find'? (e.g. \! -path *.git*)."
55
+
56
+ display "Writing the supplied details to '#{config_path}' for future reference ..."
57
+ write_config!(config)
58
+ display "All done! You may now use 'ssync sync' to synchronise your files to the S3 bucket."
59
+ end
60
+
61
+ def aws_credentials_is_valid?(config = read_config)
62
+ AWS::S3::Base.establish_connection!(:access_key_id => config[:aws_access_key], :secret_access_key => config[:aws_secret_key])
63
+ begin
64
+ # AWS::S3 doesn't try to connect at all until you ask it for something.
65
+ AWS::S3::Service.buckets
66
+ rescue AWS::S3::InvalidAccessKeyId => e
67
+ false
68
+ else
69
+ true
70
+ end
71
+ end
72
+
73
+ def bucket_exists?(config = read_config)
74
+ AWS::S3::Bucket.find(config[:aws_dest_bucket])
75
+ rescue AWS::S3::NoSuchBucket => e
76
+ false
77
+ end
78
+
79
+ def bucket_empty?(config = read_config)
80
+ AWS::S3::Bucket.find(config[:aws_dest_bucket]).empty?
81
+ end
82
+
83
+ def create_bucket(config = read_config)
84
+ AWS::S3::Bucket.create(config[:aws_dest_bucket])
85
+ end
86
+
87
+ def local_file_path_exists?(config = read_config)
88
+ File.exist?(config[:local_file_path])
89
+ end
90
+ end
91
+ end
92
+ end
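The setup flow stores its answers as a hash with symbol keys in a YAML file. The round trip can be sketched as follows, assuming hypothetical `write_settings` and `read_settings` helpers and a Ruby recent enough to support `YAML.safe_load` with `permitted_classes:` (the gem itself targets Ruby 1.8/1.9 and uses `YAML::load`).

```ruby
require "yaml"

# Hypothetical helper: dump a settings hash (symbol keys) to a YAML file.
def write_settings(path, settings)
  File.open(path, "w") { |f| YAML.dump(settings, f) }
end

# Hypothetical helper: read the settings back, or return an empty hash when
# the file is missing or empty.
def read_settings(path)
  return {} unless File.exist?(path)
  # Symbol keys must be explicitly permitted by the safe loader.
  YAML.safe_load(File.read(path), permitted_classes: [Symbol]) || {}
end
```

Returning an empty hash for a missing file mirrors how `read_config` falls back to `{}`, so the first run of setup can prompt with blank defaults.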
data/lib/ssync/sync.rb ADDED
@@ -0,0 +1,126 @@
1
+ module Ssync
2
+ class Sync
3
+ class << self
4
+ include Helpers
5
+
6
+ def run!
7
+ display "Initialising Ssync, performing pre-sync checks ..."
8
+
9
+ e! "Couldn't connect to AWS with the credentials specified in '#{config_path}'." unless Setup.aws_credentials_is_valid?
10
+ e! "Couldn't find the S3 bucket specified in '#{config_path}'." unless Setup.bucket_exists?
11
+ e! "The local path specified in '#{config_path}' does not exist." unless Setup.local_file_path_exists?
12
+
13
+ if options_set?("-f", "--force")
14
+ display "Clearing previous sync state ..."
15
+ clear_sync_state
16
+ end
17
+ create_tmp_sync_state
18
+
19
+ if last_sync_recorded?
20
+ display "Performing time based comparison ..."
21
+ files_modified_since_last_sync
22
+ else
23
+ display "Performing (potentially expensive) MD5 checksum comparison ..."
24
+ display "Generating local manifest ..."
25
+ generate_local_manifest
26
+ display "Traversing S3 for remote manifest ..."
27
+ fetch_remote_manifest
28
+ # note that we do not remove files on s3 that no longer exist on local host.
29
+ # this behaviour may be desirable (ala rsync --delete) but we currently don't support it.
30
+ display "Performing checksum comparison ..."
31
+ files_on_localhost_with_checksums - files_on_s3
32
+ end.each { |file| push_file(file) }
33
+
34
+ finalize_sync_state
35
+
36
+ display "Sync complete!"
37
+ end
38
+
39
+ def clear_sync_state
40
+ `rm -f #{last_sync_started} #{last_sync_completed}`
41
+ end
42
+
43
+ def create_tmp_sync_state
44
+ `touch #{last_sync_started}`
45
+ end
46
+
47
+ def last_sync_started
48
+ ENV['HOME'] + "/.ssync.last-sync.started"
49
+ end
50
+
51
+ def last_sync_completed
52
+ ENV['HOME'] + "/.ssync.last-sync.completed"
53
+ end
54
+
55
+ def last_sync_recorded?
56
+ File.exist?(last_sync_completed)
57
+ end
58
+
59
+ def finalize_sync_state
60
+ `cp #{last_sync_started} #{last_sync_completed}`
61
+ end
62
+
63
+ def files_modified_since_last_sync
64
+ # '! -type d' skips directories: when building the local manifest, openssl reports directories on stderr (which is discarded), but this query would otherwise return them.
65
+ `find #{read_config[:local_file_path]} #{read_config[:find_options]} \! -type d -cnewer #{last_sync_completed}`.split("\n").collect { |path| { :path => path } }
66
+ end
67
+
68
+ def update_config_with_sync_state(sync_start)
69
+ config = read_config()
70
+ config[:last_sync_at] = sync_start
71
+ write_config!(config)
72
+ end
73
+
74
+ def generate_local_manifest
75
+ `find #{read_config[:local_file_path]} #{read_config[:find_options]} -print0 | xargs -0 openssl md5 2> /dev/null > #{local_manifest_path}`
76
+ end
77
+
78
+ def fetch_remote_manifest
79
+ @remote_objects_cache = []
80
+ traverse_s3_for_objects(AWS::S3::Bucket.find(read_config[:aws_dest_bucket]), @remote_objects_cache)
81
+ end
82
+
83
+ def traverse_s3_for_objects(bucket, collection, n = 1000, upto = 0, marker = nil)
84
+ objects = bucket.objects(:marker => marker, :max_keys => n)
85
+ if objects.size == 0
86
+ return
87
+ else
88
+ objects.each { |object| collection << { :path => "/#{object.key}", :checksum => object.etag } }
89
+ traverse_s3_for_objects(bucket, collection, n, upto+n, objects.last.key)
90
+ end
91
+ end
92
+
93
+ def files_on_localhost_with_checksums
94
+ parse_manifest(local_manifest_path)
95
+ end
96
+
97
+ def files_on_s3
98
+ @remote_objects_cache
99
+ end
100
+
101
+ def local_manifest_path
102
+ "/tmp/ssync.manifest.local"
103
+ end
104
+
105
+ def parse_manifest(location)
106
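+ # Without a manifest there is nothing to compare, so return an empty list.
107
+ return [] unless File.exist?(location)
108
+ # Each line looks like "MD5(/path/to/file)= <checksum>".
109
+ open(location, "r") do |file|
110
+ file.collect do |line|
111
+ path, checksum = *line.chomp.match(/^MD5\((.*)\)= (.*)$/).captures
112
+ { :path => path, :checksum => checksum }
113
+ end
114
+ end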
115
+ end
116
+
117
+ def push_file(file)
118
+ # xfer speed, logging, etc can occur in this method
119
+ display "Pushing '#{file[:path]}' ..."
120
+ File.open(file[:path]) { |io| AWS::S3::S3Object.store(file[:path], io, read_config[:aws_dest_bucket]) }
121
+ rescue
122
+ e "Could not push '#{file[:path]}': #{$!.inspect}"
123
+ end
124
+ end
125
+ end
126
+ end
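Listing a bucket in pages, as `traverse_s3_for_objects` does recursively, can also be written as a loop. The sketch below is an illustration under stated assumptions: `FakeBucket` and its `keys_after` method are hypothetical stand-ins, not part of aws-s3 or the gem, and they return at most `max_keys` sorted keys that follow a marker.

```ruby
# Hypothetical stand-in for a bucket listing that pages through sorted keys.
class FakeBucket
  def initialize(keys)
    @keys = keys.sort
  end

  # Return up to max_keys keys strictly after marker (or from the start if nil).
  def keys_after(marker, max_keys)
    remaining = marker ? @keys.select { |key| key > marker } : @keys
    remaining.first(max_keys)
  end
end

# Collect every key by repeatedly asking for the page after the last key seen.
def all_keys(bucket, page_size = 1000)
  collected = []
  marker = nil
  loop do
    page = bucket.keys_after(marker, page_size)
    break if page.empty?
    collected.concat(page)
    marker = page.last
  end
  collected
end
```

The loop stops on the first empty page, the same termination condition the recursive version uses, without growing the call stack for large buckets.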
data/lib/ssync/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module Ssync
2
+ VERSION = "0.1.0"
3
+ end
data/ssync.gemspec ADDED
@@ -0,0 +1,24 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.dirname(__FILE__) + "/lib/ssync/version"
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = "ssync"
6
+ s.version = Ssync::VERSION
7
+ s.date = Date.today.to_s
8
+ s.authors = ["Fred Wu", "Ryan Allen"]
9
+ s.email = ["fred@envato.com", "ryan@envato.com"]
10
+ s.summary = %q{Ssync, an optimised S3 sync tool using the power of *nix!}
11
+ s.description = %q{Ssync, an optimised S3 sync tool using the power of *nix!}
12
+ s.homepage = %q{http://github.com/fredwu/ssync}
13
+ s.extra_rdoc_files = ["README.md"]
14
+ s.rdoc_options = ["--charset=UTF-8"]
15
+ s.require_paths = ["lib"]
16
+ s.rubyforge_project = s.name
17
+
18
+ s.files = `git ls-files`.split("\n")
19
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
20
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
21
+ s.require_paths = ["lib"]
22
+
23
+ s.add_runtime_dependency(%q<aws-s3>, ["~> 0.6.2"])
24
+ end
metadata ADDED
@@ -0,0 +1,94 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ssync
3
+ version: !ruby/object:Gem::Version
4
+ prerelease: false
5
+ segments:
6
+ - 0
7
+ - 1
8
+ - 0
9
+ version: 0.1.0
10
+ platform: ruby
11
+ authors:
12
+ - Fred Wu
13
+ - Ryan Allen
14
+ autorequire:
15
+ bindir: bin
16
+ cert_chain: []
17
+
18
+ date: 2010-11-12 00:00:00 +11:00
19
+ default_executable:
20
+ dependencies:
21
+ - !ruby/object:Gem::Dependency
22
+ name: aws-s3
23
+ prerelease: false
24
+ requirement: &id001 !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ~>
28
+ - !ruby/object:Gem::Version
29
+ segments:
30
+ - 0
31
+ - 6
32
+ - 2
33
+ version: 0.6.2
34
+ type: :runtime
35
+ version_requirements: *id001
36
+ description: Ssync, an optimised S3 sync tool using the power of *nix!
37
+ email:
38
+ - fred@envato.com
39
+ - ryan@envato.com
40
+ executables:
41
+ - ssync
42
+ extensions: []
43
+
44
+ extra_rdoc_files:
45
+ - README.md
46
+ files:
47
+ - .gitignore
48
+ - Gemfile
49
+ - Gemfile.lock
50
+ - MIT-LICENSE
51
+ - README.md
52
+ - Rakefile
53
+ - bin/ssync
54
+ - lib/ssync.rb
55
+ - lib/ssync/command.rb
56
+ - lib/ssync/helpers.rb
57
+ - lib/ssync/setup.rb
58
+ - lib/ssync/sync.rb
59
+ - lib/ssync/version.rb
60
+ - ssync.gemspec
61
+ has_rdoc: true
62
+ homepage: http://github.com/fredwu/ssync
63
+ licenses: []
64
+
65
+ post_install_message:
66
+ rdoc_options:
67
+ - --charset=UTF-8
68
+ require_paths:
69
+ - lib
70
+ required_ruby_version: !ruby/object:Gem::Requirement
71
+ none: false
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ segments:
76
+ - 0
77
+ version: "0"
78
+ required_rubygems_version: !ruby/object:Gem::Requirement
79
+ none: false
80
+ requirements:
81
+ - - ">="
82
+ - !ruby/object:Gem::Version
83
+ segments:
84
+ - 0
85
+ version: "0"
86
+ requirements: []
87
+
88
+ rubyforge_project: ssync
89
+ rubygems_version: 1.3.7
90
+ signing_key:
91
+ specification_version: 3
92
+ summary: Ssync, an optimised S3 sync tool using the power of *nix!
93
+ test_files: []
94
+