atduskgreg-slipcover 0.2.0
- data/README.rdoc +19 -0
- data/doctest/california_king.doctest +35 -0
- data/doctest/doctest_helper.rb +11 -0
- data/doctest/slipcover.doctest +76 -0
- data/lib/california_king.rb +51 -0
- data/lib/slipcover.rb +43 -0
- data/slipcover.gemspec +20 -0
- metadata +68 -0
data/README.rdoc
ADDED
@@ -0,0 +1,19 @@
+Two simple scripts for helping with CouchDB clustering and parallelization:
+
+=Slipcover
+
+- run a single query across a multi-member cluster (i.e. a group of CouchDBs) and zip up the results
+
+=California King
+
+- run a series of queries in parallel against a single CouchDB
+
+=Details
+
+Check out slipcover.doctest and california_king.doctest for usage and explanation.
+
+To run the doctests:
+
+  gem install rubydoctest
+  cd slipcover
+  rubydoctest doctest/*.doctest
data/doctest/california_king.doctest
ADDED
@@ -0,0 +1,35 @@
+>> $:.unshift File.dirname(__FILE__)
+>> require 'doctest_helper'
+
+## CaliforniaKing ##
+
+
+- named after the biggest mattress you can get.
+- where Slipcover lets you run a single query against a cluster, CaliforniaKing runs multiple simultaneous queries against a single CouchDB
+- furnish the king with an array of queries, and it will execute them across a series of threads (because Erlang/CouchDB eats up concurrency like a king)
+- using CaliforniaKing you can usually get done in seconds what might take minutes without threading.
+
+# Example: run a collection of queries in parallel against a single db.
+
+Require CaliforniaKing:
+
+>> require 'lib/california_king'
+>> cr = CouchRest.new('localhost:5984')
+>> cr.database('california_king-test').delete! rescue nil
+>> db = cr.create_db('california_king-test')
+
+Create a bunch of docs:
+
+>> docs = []
+>> 400.times{|n|docs.push({:number => n})}
+>> saved = db.bulk_save(docs)
+>> saved['new_revs'].length
+=> 400
+
+Retrieve them using 10 threads:
+
+>> queries = saved['new_revs'].collect{|r|[:get, r['id']]}
+>> king = CaliforniaKing.new('localhost:5984','california_king-test', 10)
+>> results = king.query queries
+>> results.collect{|doc|doc['number']}.uniq.length
+=> 400
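The 10-thread retrieval above depends on how `CaliforniaKing#query` slices the query list, and that arithmetic can be tried without a CouchDB server. The following is a minimal sketch: `slice_plan` is a hypothetical helper, not part of the gem, and it reproduces only the `each_slice` sizing from `lib/california_king.rb`.

```ruby
# Hypothetical helper (not part of the gem): mirrors the slice sizing used in
# CaliforniaKing#query so the work split can be inspected without CouchDB.
def slice_plan(query_count, width)
  slice_size = [(query_count.to_f / width).round, 1].max
  (0...query_count).each_slice(slice_size).map(&:length)
end

plan = slice_plan(400, 10)
puts "400 queries, width 10: #{plan.length} slices of #{plan.first}"

# Because the slice size is rounded, the number of slices (one thread each)
# is not always equal to width.
uneven = slice_plan(24, 10)
puts "24 queries, width 10: #{uneven.length} slices of #{uneven.first}"
```

With 400 queries and a width of 10 the split is even, but with 24 queries the rounded slice size of 2 produces 12 slices, so `width` is better read as a target than as a hard cap on thread count.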
data/doctest/slipcover.doctest
ADDED
@@ -0,0 +1,76 @@
+>> $:.unshift File.dirname(__FILE__)
+>> require 'doctest_helper'
+
+## Slipcover ##
+
+- treat a series of objects as a cluster by:
+  - systematically calling the same method on all of them in parallel
+  - zipping up the results
+
+# Example: access a cluster of CouchDB databases via CouchRest (http://github.com/jchris/couchrest/tree/master)
+
+Require Slipcover and CouchRest:
+
+>> require 'lib/slipcover'
+>> require 'couchrest'
+
+Create the members of the cluster (these could be on different hosts, but we'll just simulate that here):
+
+>> cr1 = CouchRest.new('localhost:5984')
+>> cr1.database('slipcover-test').delete! rescue nil
+>> cr1.create_db('slipcover-test')
+
+Create cluster member two.
+
+>> cr2 = CouchRest.new('127.0.0.1:5984')
+>> cr2.database('slipcover-test2').delete! rescue nil
+>> cr2.create_db('slipcover-test2')
+>> db1 = cr1.database('slipcover-test')
+>> db2 = cr2.database('slipcover-test2')
+>> saved = db1.save({"test"=>"doc"})
+>> saved['ok']
+=> true
+
+Assign them to Slipcover for management:
+
+>> cluster = Slipcover.new( [db1, db2] )
+
+By default, our cluster will re-raise any errors that occur in individual members:
+
+>> lambda{ cluster.get( saved['id'] )}.raises_error? RestClient::ResourceNotFound
+=> true
+
+But if we want to ignore certain errors in the members (as in this case, where we only want to hear back from the member of the cluster that actually has the document we're looking for), we can tell Slipcover to silence errors of a certain type:
+
+>> cluster.silenced_errors << RestClient::ResourceNotFound
+=> [RestClient::ResourceNotFound]
+
+Then getting a document that's present on only one of the members will return the document without any noise from the other cluster members:
+
+>> result = cluster.get( saved['id'] )
+>> result.first['test']
+=> "doc"
+
+If members raise other errors that aren't in the list to be silenced, however, they will bring things to a halt. For example, if we add another member to the cluster on a broken connection:
+
+>> db3 = CouchRest.new('broken-socket').database('no-couch-here')
+>> cluster.add_member db3
+
+then the resulting errors will still be raised:
+
+>> lambda{ cluster.get( saved['id'] )}.raises_error? Object
+=> true
+
+Let's remove this broken cluster member so we can continue our tests:
+
+>> cluster.remove_member{ |m| m.host == 'broken-socket' }
+>> cluster.members.length == 2
+=> true
+
+We could also remove a particular member if we had it handy:
+
+>> cluster.add_member db3
+>> cluster.remove_member db3
+>> cluster.members.length == 2
+=> true
+
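The silenced-error behaviour walked through in this doctest can also be exercised without CouchDB or CouchRest. The sketch below is an illustration under stated assumptions, not the gem's code: `MiniCluster` is a condensed stand-in for Slipcover, and `FakeStore` with its `NotFound` error is a hypothetical in-memory member playing the role of a database that raises `RestClient::ResourceNotFound`.

```ruby
# Condensed stand-in for Slipcover (same fan-out idea, not the gem's file),
# so the example runs without CouchDB or CouchRest.
class MiniCluster
  attr_reader :members, :silenced_errors

  def initialize(members)
    @members = Array(members)
    @silenced_errors = []
  end

  # Forward any unknown method to every member, one thread per member.
  def method_missing(name, *args)
    results = Queue.new # thread-safe collection point
    threads = @members.map do |member|
      Thread.new do
        Thread.current.report_on_exception = false # keep stderr quiet
        begin
          results << member.public_send(name, *args)
        rescue StandardError => e
          raise unless @silenced_errors.any? { |klass| e.is_a?(klass) }
        end
      end
    end
    threads.each(&:join) # join re-raises an unsilenced member error here
    Array.new(results.size) { results.pop }
  end
end

# Hypothetical in-memory member; a plain Hash stands in for a database.
class FakeStore
  class NotFound < StandardError; end

  def initialize(docs)
    @docs = docs
  end

  def get(id)
    @docs.fetch(id) { raise NotFound, id }
  end
end

cluster = MiniCluster.new([FakeStore.new({ "a" => { "test" => "doc" } }),
                           FakeStore.new({})])

begin
  cluster.get("a")
rescue FakeStore::NotFound
  puts "the member without the document raised NotFound"
end

cluster.silenced_errors << FakeStore::NotFound
puts cluster.get("a").first["test"]
```

The first call fails because one member lacks the document; once `NotFound` is on the silenced list, only the member that holds the document contributes a result.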
data/lib/california_king.rb
ADDED
@@ -0,0 +1,51 @@
+require 'rubygems'
+require 'enumerator'
+require 'couchrest'
+
+class CaliforniaKing
+  attr_accessor :silenced_errors, :width
+
+  def initialize(server, dbname, width=7)
+    @server = server
+    @dbname = dbname
+    @width = width
+    @silenced_errors = []
+  end
+
+  def query(queries)
+    results = []
+    slice_size = queries.length.to_f / @width
+    # each thread gets a slice to process, so it doesn't have to wait on others
+    threads = []
+    puts "slicing #{queries.length} queries into slices of size #{slice_size}"
+    queries.each_slice([slice_size.round,1].max) do |qs|
+      puts "feeding #{qs.length} queries to thread #{threads.length}"
+      threads << Thread.new(database) do |db|
+        qs.each do |q|
+          method = q.shift
+          begin
+            results << db.send(method, *q)
+            $stdout.putc '.'
+            $stdout.flush
+          rescue Exception => e
+            raise e unless silenced_errors_include? e
+          end
+        end
+        puts "thread finished"
+      end
+    end
+    threads.each{|t| t.join}
+    results
+  end
+
+  private
+
+  def database
+    CouchRest.new(@server).database(@dbname)
+  end
+
+  def silenced_errors_include? e
+    @silenced_errors.any?{|eklass| e.is_a? eklass}
+  end
+
+end
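Two details of `query` are worth noting: `q.shift` removes the method name from each of the caller's query arrays, and every thread appends to the same `results` array. A possible variant is sketched below under assumptions: `parallel_query` and `FakeDb` are hypothetical names, error silencing is left out for brevity, and each query is assumed to be `[method_name, *args]` as in the doctest.

```ruby
# Hypothetical variant of the query loop (not the gem's code). Assumes each
# query is [method_name, *args] and that `db` answers those methods.
def parallel_query(db, queries, width = 7)
  slice_size = [(queries.length.to_f / width).round, 1].max
  threads = queries.each_slice(slice_size).map do |slice|
    Thread.new do
      # Destructure instead of shift, so the caller's arrays stay intact.
      slice.map { |(name, *args)| db.public_send(name, *args) }
    end
  end
  # Thread#value joins the thread and returns its block's result, so no
  # array is shared between threads and results keep the input order.
  threads.flat_map(&:value)
end

# Stand-in database for trying this without CouchDB.
class FakeDb
  def get(id)
    { "id" => id, "number" => id.to_i }
  end
end

queries = (0...400).map { |n| [:get, n.to_s] }
results = parallel_query(FakeDb.new, queries, 10)
puts results.length
puts queries.first.inspect # the query still carries its method name
```

Collecting through `Thread#value` trades the progress dots of the original for ordered results and reusable query arrays; whether that trade suits a given workload is a judgment call.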
data/lib/slipcover.rb
ADDED
@@ -0,0 +1,43 @@
+class Slipcover
+  attr_accessor :silenced_errors, :members
+
+  def initialize(members)
+    @members = Array(members)
+    @silenced_errors = []
+  end
+
+  def add_member member
+    @members << member
+  end
+
+  def remove_member(member=nil, &block)
+    @members.delete(member) if member
+    @members.reject!{ |m| block.call(m) } if block_given?
+  end
+
+
+  def method_missing(method, *args, &block)
+    results = []
+    threads = []
+
+    @members.each do |m|
+      threads << Thread.new(m) do |member|
+        begin
+          results << member.send(method, *args)
+        rescue Exception => e
+          raise e unless silenced_errors_include? e
+        end
+      end
+    end
+
+    threads.each{|t| t.join}
+    results
+  end
+
+  private
+
+  def silenced_errors_include? e
+    @silenced_errors.any?{|eklass| e.is_a? eklass}
+  end
+
+end
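Because `method_missing` here forwards every unknown call, the cluster object reports `false` from `respond_to?` for methods it would in fact handle. One common Ruby idiom for this is to pair `method_missing` with `respond_to_missing?`. The sketch below shows that pairing in isolation; `FanOut` is a hypothetical class, it runs sequentially (threads and error silencing are omitted), and its guard changes behaviour slightly from the gem by refusing calls that not every member supports.

```ruby
# Hypothetical illustration of pairing method_missing with
# respond_to_missing?; not the gem's code, and deliberately single-threaded.
class FanOut
  def initialize(members)
    @members = Array(members)
  end

  def method_missing(name, *args, &block)
    # Fall back to the normal NoMethodError when members cannot handle it.
    return super unless respond_to_missing?(name)

    @members.map { |m| m.public_send(name, *args, &block) }
  end

  def respond_to_missing?(name, include_private = false)
    @members.any? && @members.all? { |m| m.respond_to?(name) }
  end
end

words = FanOut.new(%w[a bb])
puts words.length.inspect
puts words.respond_to?(:length)
puts words.respond_to?(:no_such_method)
```

Here plain strings serve as members, which also shows that the fan-out idea is not tied to CouchRest database objects.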
data/slipcover.gemspec
ADDED
@@ -0,0 +1,20 @@
+Gem::Specification.new do |s|
+
+  s.name = "slipcover"
+  s.version = "0.2.0"
+  s.date = "2008-09-09"
+  s.summary = "CouchDB clustering and parallelization."
+  s.email = "greg@grabb.it"
+  s.homepage = "http://github.com/atduskgreg/slipcover"
+  s.description = "Slipcover runs a single query across a multi-member cluster (i.e. a group of CouchDBs) and zips up the results. CaliforniaKing runs a series of queries in parallel against a single CouchDB."
+  s.has_rdoc = false
+  s.authors = ["Greg Borenstein", "J. Chris Anderson"]
+  s.files = %w{
+    lib/slipcover.rb lib/california_king.rb
+    README.rdoc
+    slipcover.gemspec
+    doctest/slipcover.doctest doctest/california_king.doctest doctest/doctest_helper.rb
+  }
+  s.require_path = "lib"
+  s.add_dependency("couchrest", [">= 0.9"])
+end
metadata
ADDED
@@ -0,0 +1,68 @@
|
+--- !ruby/object:Gem::Specification
+name: atduskgreg-slipcover
+version: !ruby/object:Gem::Version
+  version: 0.2.0
+platform: ruby
+authors:
+- Greg Borenstein
+- J. Chris Anderson
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2008-09-09 00:00:00 -07:00
+default_executable:
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: couchrest
+  version_requirement:
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: "0.9"
+    version:
+description: Slipcover runs a single query across a multi-member cluster (i.e. a group of CouchDBs) and zips up the results. CaliforniaKing runs a series of queries in parallel against a single CouchDB.
+email: greg@grabb.it
+executables: []
+
+extensions: []
+
+extra_rdoc_files: []
+
+files:
+- lib/slipcover.rb
+- lib/california_king.rb
+- README.rdoc
+- slipcover.gemspec
+- doctest/slipcover.doctest
+- doctest/california_king.doctest
+- doctest/doctest_helper.rb
+has_rdoc: false
+homepage: http://github.com/atduskgreg/slipcover
+post_install_message:
+rdoc_options: []
+
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: "0"
+  version:
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: "0"
+  version:
+requirements: []
+
+rubyforge_project:
+rubygems_version: 1.2.0
+signing_key:
+specification_version: 2
+summary: CouchDB clustering and parallelization.
+test_files: []
+