atduskgreg-slipcover 0.2.0

@@ -0,0 +1,19 @@
+ Two simple scripts for helping with CouchDB clustering and parallelization:
+
+ =Slipcover
+
+ - run a single query across a multi-member cluster (i.e. a group of CouchDBs) and zip up the results
+
+ =California King
+
+ - run a series of queries in parallel against a single CouchDB
+
+ =Details
+
+ Check out slipcover.doctest and california_king.doctest for usage and explanation.
+
+ To run the doctests:
+
+ gem install rubydoctest
+ cd slipcover
+ rubydoctest doctest/*.doctest
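The "single query across several members" idea in the README can be illustrated without a running CouchDB. The following is a minimal sketch, assuming plain Ruby hashes as stand-ins for databases; `fan_out` is a hypothetical helper name and is not part of the gem.

```ruby
# Hypothetical stand-ins for cluster members: plain hashes instead of CouchDB databases.
members = [
  { "a" => 1 },
  { "a" => 10, "b" => 20 }
]

# Call the same method with the same arguments on every member, one thread per
# member, and gather the return values. Thread#value waits for the thread and
# returns the value of its block, so results come back in member order.
def fan_out(members, method, *args)
  members.map { |m| Thread.new { m.send(method, *args) } }.map { |t| t.value }
end

p fan_out(members, :[], "a")   # prints [1, 10]
```

This sketch keeps results in member order because it reads each thread's value in turn; an implementation that appends to a shared array from inside the threads would instead return results in completion order.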
@@ -0,0 +1,35 @@
+ >> $:.unshift File.dirname(__FILE__)
+ >> require 'doctest_helper'
+
+ ## CaliforniaKing ##
+
+
+ - named after the biggest mattress you can get.
+ - where Slipcover lets you run a single query against a cluster, CaliforniaKing runs multiple simultaneous queries against a single CouchDB
+ - furnish the king with an array of queries, and it will execute them across a series of threads (because Erlang/CouchDB eats up concurrency like a king)
+ - using CaliforniaKing you can usually get done in seconds what might take minutes without threading.
+
+ # Example: run a collection of queries in parallel against a single db.
+
+ Require CaliforniaKing:
+
+ >> require 'lib/california_king'
+ >> cr = CouchRest.new('localhost:5984')
+ >> cr.database('california_king-test').delete! rescue nil
+ >> db = cr.create_db('california_king-test')
+
+ Create a bunch of docs:
+
+ >> docs = []
+ >> 400.times{|n|docs.push({:number => n})}
+ >> saved = db.bulk_save(docs)
+ >> saved['new_revs'].length
+ => 400
+
+ Retrieve them using 10 threads:
+
+ >> queries = saved['new_revs'].collect{|r|[:get, r['id']]}
+ >> king = CaliforniaKing.new('localhost:5984','california_king-test', 10)
+ >> results = king.query queries
+ >> results.collect{|doc|doc['number']}.uniq.length
+ => 400
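To make the "array of queries across a series of threads" step concrete, here is a small sketch of how a list of queries might be divided into one slice per thread. It assumes the slice size is rounded up so that the number of slices never exceeds the requested thread count; `slices_for` is a hypothetical helper, and the document ids are made up for illustration.

```ruby
# 400 illustrative queries in the same [method, argument] shape used in the doctest.
queries = (0...400).map { |n| [:get, "doc-#{n}"] }

# Split queries into at most `width` slices by rounding the slice size up.
# The floor of 1 keeps each_slice valid when there are no queries at all.
def slices_for(queries, width)
  size = [(queries.length.to_f / width).ceil, 1].max
  queries.each_slice(size).to_a
end

p slices_for(queries, 10).map { |s| s.length }  # prints [40, 40, 40, 40, 40, 40, 40, 40, 40, 40]
p slices_for(queries, 7).map { |s| s.length }   # prints [58, 58, 58, 58, 58, 58, 52]
```

Rounding up trades a slightly shorter final slice for a hard cap on thread count; rounding to the nearest integer instead would produce an eighth slice in the second case.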
@@ -0,0 +1,11 @@
+ $:.unshift File.dirname(__FILE__) + "/.."
+
+ class Proc
+   # Returns true if calling the proc raises an instance of +err+, false otherwise.
+   def raises_error?(err)
+     self.call
+     false
+   rescue Exception => e
+     e.is_a? err
+   end
+ end
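The behaviour expected from an error-checking predicate like this is easiest to see across a few cases, including the one where nothing is raised. The following is a standalone sketch that takes a block rather than extending `Proc`; `raises?` is a hypothetical name chosen for the illustration.

```ruby
# Returns true only when the block raises an instance of err (or of a subclass).
def raises?(err)
  yield
  false                 # nothing was raised
rescue Exception => e
  e.is_a?(err)          # something was raised; report whether it matches
end

puts(raises?(ZeroDivisionError) { 1 / 0 })  # true
puts(raises?(StandardError)     { 1 / 0 })  # true, ZeroDivisionError is a StandardError
puts(raises?(ArgumentError)     { 1 / 0 })  # false, a different error class
puts(raises?(StandardError)     { 1 + 1 })  # false, no error at all
```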
@@ -0,0 +1,76 @@
+ >> $:.unshift File.dirname(__FILE__)
+ >> require 'doctest_helper'
+
+ ## Slipcover ##
+
+ - treat a series of objects as a cluster by:
+   - systematically calling the same method on all of them in parallel
+   - zipping up the results
+
+ # Example: query a cluster of CouchDB databases through CouchRest (http://github.com/jchris/couchrest/tree/master)
+
+ Require Slipcover and CouchRest:
+
+ >> require 'lib/slipcover'
+ >> require 'couchrest'
+
+ Create the members of the cluster (these could be on different hosts, but we'll just simulate that here):
+
+ >> cr1 = CouchRest.new('localhost:5984')
+ >> cr1.database('slipcover-test').delete! rescue nil
+ >> cr1.create_db('slipcover-test')
+
+ Create cluster member two:
+
+ >> cr2 = CouchRest.new('127.0.0.1:5984')
+ >> cr2.database('slipcover-test2').delete! rescue nil
+ >> cr2.create_db('slipcover-test2')
+ >> db1 = cr1.database('slipcover-test')
+ >> db2 = cr2.database('slipcover-test2')
+ >> saved = db1.save({"test"=>"doc"})
+ >> saved['ok']
+ => true
+
+ Assign them to Slipcover for management:
+
+ >> cluster = Slipcover.new( [db1, db2] )
+
+ By default, our cluster will re-raise any errors that occur in individual members:
+
+ >> lambda{ cluster.get( saved['id'] )}.raises_error? RestClient::ResourceNotFound
+ => true
+
+ But if we want to ignore certain errors in the members (as in this case, where we only want to hear back from the member that actually has the document we're looking for), we can tell Slipcover to silence errors of a certain type:
+
+ >> cluster.silenced_errors << RestClient::ResourceNotFound
+ => [RestClient::ResourceNotFound]
+
+ Getting a document that's present on only one of the members will then return the document without any noise from the other cluster members:
+
+ >> result = cluster.get( saved['id'] )
+ >> result.first['test']
+ => "doc"
+
+ If members raise other errors that aren't included in the list to be silenced, however, they will bring things to a halt. For example, if we add another member to the cluster on a broken connection:
+
+ >> db3 = CouchRest.new('broken-socket').database('no-couch-here')
+ >> cluster.add_member db3
+
+ then the resulting errors will still raise:
+
+ >> lambda{ cluster.get( saved['id'] )}.raises_error? Object
+ => true
+
+ Let's remove this broken cluster member so we can continue our tests:
+
+ >> cluster.remove_member{ |m| m.host == 'broken-socket' }
+ >> cluster.members.length == 2
+ => true
+
+ We could also remove a particular member if we had it handy:
+
+ >> cluster.add_member db3
+ >> cluster.remove_member db3
+ >> cluster.members.length == 2
+ => true
+
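The silencing behaviour walked through above can be reproduced without CouchDB. This is a minimal sketch, assuming hashes as stand-in members and `KeyError` (raised by `Hash#fetch` for a missing key) in place of `RestClient::ResourceNotFound`; `MiniCluster` and `call_all` are hypothetical names, and the sketch re-raises in the calling thread rather than inside the worker threads.

```ruby
class MiniCluster
  attr_reader :silenced_errors

  def initialize(members)
    @members = members
    @silenced_errors = []
  end

  # Run the same call on every member in its own thread. Each thread reports
  # either [:ok, value] or [:error, exception] instead of raising.
  def call_all(method, *args)
    outcomes = @members.map do |m|
      Thread.new do
        begin
          [:ok, m.send(method, *args)]
        rescue StandardError => e
          [:error, e]
        end
      end
    end.map { |t| t.value }

    results = []
    outcomes.each do |status, payload|
      if status == :ok
        results << payload
      elsif @silenced_errors.none? { |klass| payload.is_a?(klass) }
        raise payload   # not silenced: surface the member's error to the caller
      end
    end
    results
  end
end

# Only the first member holds the document.
cluster = MiniCluster.new([{ "doc-1" => "doc" }, {}])

begin
  cluster.call_all(:fetch, "doc-1")
rescue KeyError => e
  puts "raised #{e.class}"            # prints: raised KeyError
end

cluster.silenced_errors << KeyError
p cluster.call_all(:fetch, "doc-1")   # prints ["doc"]
```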
@@ -0,0 +1,51 @@
+ require 'rubygems'
+ require 'enumerator'
+ require 'couchrest'
+
+ class CaliforniaKing
+   attr_accessor :silenced_errors, :width
+
+   def initialize(server, dbname, width=7)
+     @server = server
+     @dbname = dbname
+     @width = width
+     @silenced_errors = []
+   end
+
+   def query(queries)
+     results = []
+     slice_size = [(queries.length.to_f / @width).ceil, 1].max
+     # each thread gets one slice to process, so it doesn't have to wait on others; rounding up keeps the thread count at or below @width
+     threads = []
+     puts "slicing #{queries.length} queries into slices of size #{slice_size}"
+     queries.each_slice(slice_size) do |qs|
+       puts "feeding #{qs.length} queries to thread #{threads.length}"
+       threads << Thread.new(database) do |db|
+         qs.each do |q|
+           method, *args = q # destructure instead of q.shift, so the caller's query arrays are left intact
+           begin
+             results << db.send(method, *args)
+             $stdout.putc '.'
+             $stdout.flush
+           rescue Exception => e
+             raise e unless silenced_errors_include? e
+           end
+         end
+         puts "thread finished"
+       end
+     end
+     threads.each{|t| t.join}
+     results
+   end
+
+   private
+
+   def database
+     CouchRest.new(@server).database(@dbname)
+   end
+
+   def silenced_errors_include? e
+     @silenced_errors.any?{|eklass| e.is_a? eklass}
+   end
+
+ end
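Several threads appending to one shared array works under MRI's global lock, but it is not something every Ruby runtime guarantees. The following sketch shows one possible way to make the collection step explicit with a `Mutex`. It assumes a hash as a stand-in for the database, and `parallel_query` is a hypothetical function, not the gem's API.

```ruby
# Hypothetical stand-in for a database: 200 documents keyed by id.
db = {}
200.times { |n| db["doc-#{n}"] = { "number" => n } }

# Run [method, *args] queries against db using at most `width` threads,
# guarding the shared results array with a mutex.
def parallel_query(db, queries, width)
  results = []
  lock = Mutex.new
  slice_size = [(queries.length.to_f / width).ceil, 1].max

  threads = queries.each_slice(slice_size).map do |slice|
    Thread.new do
      slice.each do |method, *args|     # destructure without mutating the query
        value = db.send(method, *args)
        lock.synchronize { results << value }
      end
    end
  end

  threads.each { |t| t.join }
  results
end

queries = db.keys.map { |id| [:fetch, id] }
results = parallel_query(db, queries, 10)
puts results.map { |doc| doc["number"] }.uniq.length   # prints 200
```

Results arrive in completion order, so callers that need the original ordering would have to carry an index alongside each query.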
@@ -0,0 +1,43 @@
+ class Slipcover
+   attr_accessor :silenced_errors, :members
+
+   def initialize(members)
+     @members = Array(members)
+     @silenced_errors = []
+   end
+
+   def add_member member
+     @members << member
+   end
+
+   def remove_member(member=nil, &block)
+     @members.delete(member) if member
+     @members.reject!{ |m| block.call(m) } if block_given?
+   end
+
+   # forward any unknown method to every member in parallel and collect the results
+   def method_missing(method, *args, &block)
+     results = []
+     threads = []
+
+     @members.each do |m|
+       threads << Thread.new(m) do |member|
+         begin
+           results << member.send(method, *args, &block)
+         rescue Exception => e
+           raise e unless silenced_errors_include? e
+         end
+       end
+     end
+
+     threads.each{|t| t.join}
+     results
+   end
+
+   private
+
+   def silenced_errors_include? e
+     @silenced_errors.any?{|eklass| e.is_a? eklass}
+   end
+
+ end
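A proxy built on `method_missing` answers `respond_to?` with false for every forwarded method unless it also defines `respond_to_missing?`. The sketch below shows one way the two could be paired in current Ruby; `Broadcast` is a hypothetical class written for this illustration and uses strings as stand-in members.

```ruby
class Broadcast
  def initialize(members)
    @members = Array(members)
  end

  # Forward a call to every member in parallel, but only when at least one
  # member understands it; otherwise fall back to the normal NoMethodError.
  def method_missing(name, *args, &block)
    return super unless @members.any? { |m| m.respond_to?(name) }
    @members.map { |m| Thread.new { m.send(name, *args, &block) } }.map { |t| t.value }
  end

  # Keep respond_to? consistent with what method_missing will accept.
  def respond_to_missing?(name, include_private = false)
    @members.any? { |m| m.respond_to?(name, include_private) } || super
  end
end

words = Broadcast.new(["slip", "cover"])
p words.upcase                 # prints ["SLIP", "COVER"]
p words.respond_to?(:upcase)   # prints true
p words.respond_to?(:nope)     # prints false
```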
@@ -0,0 +1,20 @@
+ Gem::Specification.new do |s|
+
+   s.name = "slipcover"
+   s.version = "0.2.0"
+   s.date = "2008-09-09"
+   s.summary = "CouchDB clustering and parallelization."
+   s.email = "greg@grabb.it"
+   s.homepage = "http://github.com/atduskgreg/slipcover"
+   s.description = "Slipcover runs a single query across a multi-member cluster (i.e. a group of CouchDBs) and zips up the results. CaliforniaKing runs a series of queries in parallel against a single CouchDB."
+   s.has_rdoc = false
+   s.authors = ["Greg Borenstein", "J. Chris Anderson"]
+   s.files = %w{
+     lib/slipcover.rb lib/california_king.rb
+     README.rdoc
+     slipcover.gemspec
+     doctest/slipcover.doctest doctest/california_king.doctest doctest/doctest_helper.rb
+   }
+   s.require_path = "lib"
+   s.add_dependency("couchrest", [">= 0.9"])
+ end
metadata ADDED
@@ -0,0 +1,68 @@
+ --- !ruby/object:Gem::Specification
+ name: atduskgreg-slipcover
+ version: !ruby/object:Gem::Version
+   version: 0.2.0
+ platform: ruby
+ authors:
+ - Greg Borenstein
+ - J. Chris Anderson
+ autorequire:
+ bindir: bin
+ cert_chain: []
+
+ date: 2008-09-09 00:00:00 -07:00
+ default_executable:
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: couchrest
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: "0.9"
+     version:
+ description: Slipcover runs a single query across a multi-member cluster (i.e. a group of CouchDBs) and zips up the results. CaliforniaKing runs a series of queries in parallel against a single CouchDB.
+ email: greg@grabb.it
+ executables: []
+
+ extensions: []
+
+ extra_rdoc_files: []
+
+ files:
+ - lib/slipcover.rb
+ - lib/california_king.rb
+ - README.rdoc
+ - slipcover.gemspec
+ - doctest/slipcover.doctest
+ - doctest/california_king.doctest
+ - doctest/doctest_helper.rb
+ has_rdoc: false
+ homepage: http://github.com/atduskgreg/slipcover
+ post_install_message:
+ rdoc_options: []
+
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: "0"
+   version:
+ requirements: []
+
+ rubyforge_project:
+ rubygems_version: 1.2.0
+ signing_key:
+ specification_version: 2
+ summary: CouchDB clustering and parallelization.
+ test_files: []
+