peach 0.2

data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2008 Ben Hughes
2
+
3
+ Permission is hereby granted, free of charge, to any person
4
+ obtaining a copy of this software and associated documentation
5
+ files (the "Software"), to deal in the Software without
6
+ restriction, including without limitation the rights to use,
7
+ copy, modify, merge, publish, distribute, sublicense, and/or sell
8
+ copies of the Software, and to permit persons to whom the
9
+ Software is furnished to do so, subject to the following
10
+ conditions:
11
+
12
+ The above copyright notice and this permission notice shall be
13
+ included in all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
16
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
17
+ OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
18
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
19
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
20
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
21
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
22
+ OTHER DEALINGS IN THE SOFTWARE.
data/README ADDED
@@ -0,0 +1,22 @@
1
+ Parallel Each (for ruby with threads)
2
+
3
+ It is pretty common to have iterations over Arrays that can be safely run in parallel. With multicore chips becoming pretty common, single-threaded processing is about as cool as Pog. Unfortunately, standard Ruby hates real threads pretty hardcore at the present time; however, for some Ruby projects, alternate VMs like JRuby do give multicores some lovin'. Peach exists to make this power simple to use with minimal code changes.
4
+
5
+ Functions like map, each, and delete_if are often used in a functional, side-effect free style. If the operation in the block is computationally intense, performance can often be gained by multithreading the process. That's where Peach comes in. In the simplest case, you are one letter away from harnessing the power of parallelism and unlocking the secret of a guilt-free tan. At this stage, the goggles are purely optional.
6
+
7
+ Using Peach
8
+
9
+ Suppose you are going about your day job hacking away at code for the WOPR when you stumble upon the code:
10
+
11
+ cities.each {|city| thermonuclear_war(city)}
12
+
13
+ Clearly, the only winning move is to declare war in parallel. With Peach, the new code is:
14
+ require 'peach'
15
+
16
+ cities.peach {|city| thermonuclear_war(city)}
17
+
18
+ Requiring peach.rb monkey patches Array into submission. Currently Peach provides peach, pmap, and pselect. Each of these methods takes an optional argument n, the desired number of worker threads. If n is omitted, the global $peach_default_threads is used when set; otherwise the default is one thread per Array element. For cheaper operations on a large number of elements, you probably want to set n to something reasonably low.
19
+
20
+ (0...10000).to_a.pmap(4) {|x| process(x)}
21
+
22
+ Constructing the threads and adding a few layers of indirection does add a bit of overhead to the iteration, especially on MRI. Keep this in mind and remember to benchmark when unsure.
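The advice about overhead can be checked with a small timing script. The sketch below is not Peach itself: `threaded_map` is a hypothetical stand-in for `pmap`, assumed here only so the script runs without the gem installed, and the squaring block is a placeholder for deliberately cheap work. Timings vary by machine and Ruby implementation, so no numbers are shown.

```ruby
require 'benchmark'

# Hypothetical stand-in for pmap (an assumption, not the gem's code):
# split the array into at most n slices and map each slice in its own thread.
def threaded_map(array, n, &block)
  return [] if array.empty?

  n = [[n, 1].max, array.size].min
  slice_size = (array.size / n.to_f).ceil
  array.each_slice(slice_size)
       .map { |slice| Thread.new { slice.map(&block) } }
       .flat_map(&:value) # Thread#value joins the thread and returns its result
end

data = (0...2_000).to_a
work = lambda { |x| x * x } # cheap on purpose, so thread overhead dominates

serial = few = many = nil
t_serial = Benchmark.realtime { serial = data.map(&work) }
t_few    = Benchmark.realtime { few    = threaded_map(data, 4, &work) }
t_many   = Benchmark.realtime { many   = threaded_map(data, data.size, &work) }

puts format("map:                   %.4fs", t_serial)
puts format("4 threads:             %.4fs", t_few)
puts format("one thread per element: %.4fs", t_many)
puts "results agree: #{serial == few && serial == many}"
```

For cheap blocks like this one, the one-thread-per-element run is the case to watch, since it pays the thread creation cost once per element.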
@@ -0,0 +1,8 @@
1
+ require 'rake'
2
+ require 'rake/testtask'
3
+
4
+ Rake::TestTask.new(:test) do |t|
5
+ t.libs << 'lib'
6
+ t.pattern = 'test/*_test.rb'
7
+ t.verbose = true
8
+ end
@@ -0,0 +1,57 @@
1
+ #Benchmark for Peach <http://peach.rubyforge.org>
2
+ #Count intrawiki links in wikipedia data
3
+
4
+ require 'peach'
5
+ require 'benchmark'
6
+ require 'digest/md5'
7
+
8
+ puts "PEACH BENCHMARK"
9
+ puts "Wikipedia Processing"
10
+ puts
11
+
12
+ #Read a small slice of the Wikipedia XML file
13
+ fn = "peach_bn_data.txt"
14
+ puts "Reading in dataset #{fn}"
15
+ puts "Dataset is #{File.size(fn)/1024} kb"
16
+ dataset = ""
17
+ puts Benchmark.measure("read dataset") { dataset = File.read(fn) }
18
+
19
+ puts "Splitting dataset into articles"
20
+ articles = []
21
+ puts Benchmark.measure("split dataset") {
22
+ articles = dataset.scan(/<text xml:space=\"preserve\">.*?<\/text>/m)
23
+ articles.delete_if {|x| /#redirect/i.match(x) }
24
+ }
25
+ puts "Found #{articles.size} articles"
26
+ puts
27
+ puts "BEGIN REAL BENCHMARK"
28
+ puts
29
+ puts "map:"
30
+ links1 = []
31
+ for i in (1...5)
32
+ puts Benchmark.measure {
33
+ links1 = articles.map do |article|
34
+ article.scan(/\[\[[\w -']+?\]\]/m)
35
+ #.each do |link|
36
+ # Digest::MD5.hexdigest(link)
37
+ #end
38
+ end
39
+ }
40
+ end
41
+ puts "Found #{links1.flatten.size} links"
42
+ puts
43
+ puts "pmap:"
44
+ links2 = []
45
+ for i in (1...5)
46
+ puts Benchmark.measure {
47
+ links2 = articles.pmap(6) do |article|
48
+ article.scan(/\[\[[\w -']+?\]\]/m)
49
+ #.each do |link|
50
+ # Digest::MD5.hexdigest(link)
51
+ #end
52
+ end
53
+ }
54
+ end
55
+ puts "Found #{links2.flatten.size} links"
56
+ p links2 - links1
57
+ puts "END"
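A note on the link pattern used in both benchmark loops: inside a character class, ` -'` is read as a range from space to apostrophe, not as three literal characters. The sketch below compares the benchmark's pattern with a variant that escapes the hyphen. Whether a literal hyphen was intended is an assumption, and the sample text is made up for illustration.

```ruby
# Pattern as written in the benchmark: " -'" is a range (space through apostrophe).
original = /\[\[[\w -']+?\]\]/m
# Variant with the hyphen escaped, assuming a literal hyphen was intended.
escaped  = /\[\[[\w \-']+?\]\]/m

text = "[[Main Page]] [[Jean-Luc Picard]] [[C#]] [[O'Brien]]"

p text.scan(original) # => ["[[Main Page]]", "[[C#]]", "[[O'Brien]]"]
p text.scan(escaped)  # => ["[[Main Page]]", "[[Jean-Luc Picard]]", "[[O'Brien]]"]
```

Both loops use the same pattern, so the map versus pmap comparison is unaffected either way; only the reported link count changes.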
@@ -0,0 +1,46 @@
1
+ require 'peach'
2
+ require 'benchmark'
3
+
4
+ def fac(n)
5
+ if n == 0
6
+ return 1
7
+ else
8
+ n*fac(n-1)
9
+ end
10
+ end
11
+
12
+
13
+ puts "PEACH TEST"
14
+ puts "each:"
15
+ def each_test
16
+ (0...1000).to_a.sort_by{rand}.each do |x|
17
+ fac(x)
18
+ end
19
+ end
20
+ puts Benchmark.measure { each_test }
21
+
22
+ puts "peach:"
23
+ def peach_test
24
+ (0...1000).to_a.sort_by{rand}.peach(4) do |x|
25
+ fac(x)
26
+ end
27
+ end
28
+ puts Benchmark.measure { peach_test }
29
+
30
+ puts "map:"
31
+ def map_test
32
+ (0...1000).to_a.sort_by{rand}.map do |x|
33
+ fac(x)
34
+ end
35
+ end
36
+ puts Benchmark.measure { map_test }
37
+
38
+
39
+ puts "pmap:"
40
+ def pmap_test
41
+ (0...1000).to_a.sort_by{rand}.pmap(4) do |x|
42
+ fac(x)
43
+ end
44
+ end
45
+ puts Benchmark.measure { pmap_test }
46
+
@@ -0,0 +1,46 @@
1
+ module Peach
2
+ def peach(n = nil, &b)
3
+ peach_run(:each, b, n)
4
+ end
5
+
6
+ def pmap(n = nil, &b)
7
+ peach_run(:map, b, n)
8
+ end
9
+
10
+ def pselect(n = nil, &b)
11
+ peach_run(:select, b, n)
12
+ end
13
+
14
+
15
+
16
+ protected
17
+ def peach_run(meth, b, n = nil)
18
+ threads, results, result = [],[],[]
19
+ peach_divvy(n).each_with_index do |x,i|
20
+ threads << Thread.new { results[i] = x.send(meth, &b)}
21
+ end
22
+ threads.each {|t| t.join }
23
+ results.each {|x| result += x if x}
24
+ result
25
+ end
26
+
27
+ def peach_divvy(n = nil)
28
+ return [] if size == 0
29
+
30
+ n ||= $peach_default_threads || size
31
+ n = size if n > size
32
+
33
+ lists = []
34
+
35
+ div = (size/n).floor
36
+ offset = 0
37
+ for i in (0...n-1)
38
+ lists << slice(offset, div)
39
+ offset += div
40
+ end
41
+ lists << slice(offset...size)
42
+ lists
43
+ end
44
+ end
45
+
46
+ Array.send(:include, Peach)
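To see how the slicing in peach_divvy distributes elements, here is a standalone copy of the same logic written as a plain function. The name `divvy` is hypothetical, and the `$peach_default_threads` fallback is left out so the sketch runs without the gem.

```ruby
# Standalone copy of the peach_divvy slicing logic (hypothetical helper name).
def divvy(array, n = nil)
  return [] if array.empty?

  n ||= array.size
  n = array.size if n > array.size

  div = array.size / n # Integer division already rounds down
  lists = []
  offset = 0
  (n - 1).times do
    lists << array.slice(offset, div)
    offset += div
  end
  lists << array.slice(offset...array.size) # the last slice takes the remainder
  lists
end

p divvy([1, 2, 3, 4, 5], 2)          # => [[1, 2], [3, 4, 5]]
p divvy((1..11).to_a, 4).map(&:size) # => [2, 2, 2, 5]
p divvy([1, 2, 3], 42).map(&:size)   # => [1, 1, 1]
```

The second call shows that the remainder all lands in the final slice, so the last thread can end up with noticeably more work than the others. Passing n = 0 would raise ZeroDivisionError at the division.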
@@ -0,0 +1,56 @@
1
+ require File.join(File.dirname(__FILE__), "test_helper")
2
+
3
+ require File.join(File.dirname(__FILE__), "..", "lib", "peach")
4
+
5
+ class PeachTest < Test::Unit::TestCase
6
+ [:peach, :pmap, :pselect].each do |f|
7
+ context "Parallel function #{f}" do
8
+ normal_f = f.to_s[1..-1].to_sym
9
+
10
+ setup do
11
+ @data = [1, 2, 3, 5, 8]
12
+ @block = lambda{|i| i**2}
13
+ end
14
+ should "return the same result as #{normal_f}" do
15
+ assert_equal @data.send(normal_f, &@block),
16
+ @data.send(f, nil, &@block)
17
+ end
18
+ end
19
+ end
20
+
21
+ context "divvy" do
22
+ setup do
23
+ @data = [1, 2, 3, 4, 5]
24
+ end
25
+
26
+ context "on empty list" do
27
+ should "return empty list" do
28
+ assert_equal [], [].send(:peach_divvy)
29
+ end
30
+ end
31
+
32
+ context "when n is nil" do
33
+ should "put 1 element into each division" do
34
+ assert_equal @data.size, @data.send(:peach_divvy).size
35
+ end
36
+ end
37
+
38
+ context "when n is less than array size" do
39
+ should "create n divisions" do
40
+ assert_equal 2, @data.send(:peach_divvy, 2).size
41
+ end
42
+
43
+ should "not lose any array elements" do
44
+ assert_equal @data.size, @data.send(:peach_divvy, 2).inject(0) {|sum, i|
45
+ sum + i.size
46
+ }
47
+ end
48
+ end
49
+
50
+ context "when n is greater than array size" do
51
+ should "only create 'array size' divisions" do
52
+ assert_equal @data.size, @data.send(:peach_divvy, 42).size
53
+ end
54
+ end
55
+ end
56
+ end
@@ -0,0 +1,8 @@
1
+ $:.unshift(File.dirname(__FILE__) + '/../lib')
2
+ $:.unshift(File.dirname(__FILE__)) unless
3
+ $:.include?(File.dirname(__FILE__)) ||
4
+ $:.include?(File.expand_path(File.dirname(__FILE__)))
5
+
6
+ require 'rubygems'
7
+ require 'test/unit'
8
+ require 'shoulda'
Binary file
@@ -0,0 +1,128 @@
1
+ <html>
2
+ <head>
3
+ <title>Peach - Parallel Each</title>
4
+ <style>
5
+ pre {
6
+ background-color: #f1f1f3;
7
+ color: #112;
8
+ padding: 10px;
9
+ font-size: 1.1em;
10
+ overflow: auto;
11
+ margin: 4px 0px;
12
+ width: 95%;
13
+ }
14
+
15
+
16
+
17
+ /* Syntax highlighting */
18
+ pre .normal {}
19
+ pre .comment { color: #005; font-style: italic; }
20
+ pre .keyword { color: #A00; font-weight: bold; }
21
+ pre .method { color: #077; }
22
+ pre .class { color: #074; }
23
+ pre .module { color: #050; }
24
+ pre .punct { color: #447; font-weight: bold; }
25
+ pre .symbol { color: #099; }
26
+ pre .string { color: #944; background: #FFE; }
27
+ pre .char { color: #F07; }
28
+ pre .ident { color: #004; }
29
+ pre .constant { color: #07F; }
30
+ pre .regex { color: #B66; background: #FEF; }
31
+ pre .number { color: #F99; }
32
+ pre .attribute { color: #5bb; }
33
+ pre .global { color: #7FB; }
34
+ pre .expr { color: #227; }
35
+ pre .escape { color: #277; }
36
+ </style>
37
+ </head>
38
+ <body style="background-color: #ecad27;">
39
+ <center><div style="border: dotted #000 1px;
40
+ background-color: #fff;
41
+ width: 745px;
42
+ text-align: left;">
43
+ <img src="Peach.sketch.png" alt="Peach">
44
+ <div style ="padding: 0px 20px 20px 20px;">
45
+ <h1>Parallel Each <small><small>
46
+ (for ruby with threads)
47
+ </small></small></h1>
48
+ <p>
49
+ It is pretty common to have iterations over Arrays that can be safely
50
+ run in parallel. With multicore chips becoming pretty common,
51
+ single threaded processing is about as cool as Pog. Unfortunately,
52
+ standard Ruby hates real threads pretty hardcore at the present time;
53
+ however, for some Ruby projects, alternate VMs like
54
+ <a href="http://jruby.codehaus.org/" title="JRuby: Coolest Thing Ever">
55
+ JRuby</a> do give multicores some lovin'. <i>Peach</i> exists to
56
+ make this power simple to use with minimal code changes.
57
+ </p>
58
+ <p>Functions like <tt>map</tt>, <tt>each</tt>, and <tt>delete_if</tt>
59
+ are often used in a functional, side-effect free style. If the
60
+ operation in the block is computationally intense, performance can
61
+ often be gained by multithreading the process. That's where
62
+ <i>Peach</i> comes in. In the simplest case, you are one letter away
63
+ from harnessing the power of parallelism and unlocking the secret of
64
+ a guilt-free tan. At this stage, the goggles are purely optional.
65
+ </p>
66
+ <h2>Using Peach</h2>
67
+ <p>Suppose you are going about your day job hacking away at code for
68
+ the <a href="http://en.wikipedia.org/wiki/WOPR">WOPR</a> when you
69
+ stumble upon the code:
70
+ </p>
71
+ <pre><span class=ident>cities</span><span class=punct>.</span><span class=ident>each</span> <span class=punct>{|</span><span class=ident>city</span><span class=punct>|</span> <span class=ident>thermonuclear_war</span><span class=punct>(</span><span class=ident>city</span><span class=punct>)}</span>
72
+ </pre>
73
+ <p>Clearly, the only winning move is to declare war in parallel. With
74
+ <i>Peach</i>, the new code is:</p>
75
+ <pre>
76
+ <span class="ident">require</span> <span class="punct">'</span><span class="string">peach</span><span class="punct">'</span>
77
+
78
+ <span class=ident>cities</span><span class=punct>.</span><span class=ident>peach</span> <span class=punct>{|</span><span class=ident>city</span><span class=punct>|</span> <span class=ident>thermonuclear_war</span><span class=punct>(</span><span class=ident>city</span><span class=punct>)}</span>
79
+ </pre>
80
+ <p>
81
+ Requiring peach.rb monkey patches Array into submission.
82
+ Currently <i>Peach</i> provides <tt>peach</tt>, <tt>pmap</tt>, and
83
+ <tt>pselect</tt>. Each of these functions takes an optional
84
+ argument <i>n</i>, which represents the desired number of worker
85
+ threads with the default being one thread per Array element. For
86
+ cheaper operations on a large number of elements, you probably want
87
+ to set <i>n</i> to something reasonably low.
88
+ </p>
89
+ <pre><span class="punct">(</span><span class="number">0</span><span class="punct">...</span><span class="number">10000</span><span class="punct">).</span><span class="ident">to_a</span><span class="punct">.</span><span class="ident">pmap</span><span class="punct">(</span><span class="number">4</span><span class="punct">)</span> <span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">process</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span>
90
+ </pre>
91
+ <p>
92
+ Constructing the threads and adding on a few layers of indirection does
93
+ add a bit of overhead to the iteration especially on MRI. Keep this in
94
+ mind and remember to benchmark when unsure.</p>
95
+ <h3>Syntax (without all the words)</h3>
96
+ <pre><span class="ident">require</span> <span class="punct">'</span><span class="string">peach</span><span class="punct">'</span>
97
+
98
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">peach</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 4 threads, =&gt; [1,2,3,4]</span>
99
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pmap</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 4 threads, =&gt; [f(1),f(2),f(3),f(4)]</span>
100
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pselect</span><span class="punct">{|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">x</span> <span class="punct">&gt;</span> <span class="number">2</span><span class="punct">}</span> <span class="comment">#Spawns 4 threads, =&gt; [3,4]</span>
101
+
102
+
103
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">peach</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 2 threads, =&gt; [1,2,3,4]</span>
104
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pmap</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">f</span><span class="punct">(</span><span class="ident">x</span><span class="punct">)}</span> <span class="comment">#Spawns 2 threads, =&gt; [f(1),f(2),f(3),f(4)]</span>
105
+ <span class="punct">[</span><span class="number">1</span><span class="punct">,</span><span class="number">2</span><span class="punct">,</span><span class="number">3</span><span class="punct">,</span><span class="number">4</span><span class="punct">].</span><span class="ident">pselect</span><span class="punct">(</span><span class="number">2</span><span class="punct">){|</span><span class="ident">x</span><span class="punct">|</span> <span class="ident">x</span> <span class="punct">&gt;</span> <span class="number">2</span><span class="punct">}</span> <span class="comment">#Spawns 2 threads, =&gt; [3,4]</span>
106
+ </pre>
107
+ <h2>FAQ</h2>
108
+ <p><b>Q: I use normal ruby (MRI 1.8 or 1.9), will Peach confer superpowers and great performance upon my code?</b><br/>
109
+ A: No, on MRI your code will be slightly slower because of the increased overhead for Thread creation. MRI does not run Ruby threads in parallel (1.8 uses green threads and 1.9 holds a global interpreter lock), so Peach will not make it magically parallel.</p>
110
+ <p><b>Q: Why should I switch to JRuby to get the benefits of Peach?</b><br/>
111
+ A: Switching to JRuby for code that needs better performance is a good idea even without Peach. JRuby is insanely fast; real multithreading and the ability to use this humble utility are just another feature.</p>
112
+ <p><b>Q: Benchmarks?</b><br/>
113
+ A: I am pretty bad at benchmarking code, but I do have a simple test comparing performance between <tt>map</tt> and <tt>pmap</tt> on MRI and JRuby. Headius helped in the preparation of these materials. <a href="http://pastie.caboo.se/177240">Check it out</a>. JRuby on Java 1.6 is <a href="http://pastie.caboo.se/177263">even faster</a>. If you come up with any benchmarks, do let me know.</p>
114
+
115
+ <h2>Eat a Peach</h2>
116
+ <p>
117
+ <i>Peach</i> is distributed as a gem from github, so:<br/>
118
+ <tt>gem install schleyfox-peach --source=http://gems.github.com</tt>.
119
+ <ul>
120
+ <li>Project Page: <a href="http://rubyforge.org/projects/peach/">
121
+ RubyForge</a>, <a href="http://github.com/schleyfox/peach">Github</a></li>
122
+ </ul>
123
+ </p>
124
+ </div>
125
+ </div></center>
126
+ </body>
127
+ </html>
128
+
metadata ADDED
@@ -0,0 +1,62 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: peach
3
+ version: !ruby/object:Gem::Version
4
+ version: "0.2"
5
+ platform: ruby
6
+ authors:
7
+ - Ben Hughes
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2009-04-05 00:00:00 -04:00
13
+ default_executable:
14
+ dependencies: []
15
+
16
+ description:
17
+ email: ben@pixelmachine.org
18
+ executables: []
19
+
20
+ extensions: []
21
+
22
+ extra_rdoc_files: []
23
+
24
+ files:
25
+ - README
26
+ - LICENSE
27
+ - Rakefile
28
+ - lib/peach.rb
29
+ - bn/peach_bn.rb
30
+ - bn/peach_test.rb
31
+ - test/test_helper.rb
32
+ - test/peach_test.rb
33
+ - web/index.html
34
+ - web/Peach.sketch.png
35
+ has_rdoc: false
36
+ homepage: http://peach.rubyforge.org
37
+ post_install_message:
38
+ rdoc_options: []
39
+
40
+ require_paths:
41
+ - lib
42
+ required_ruby_version: !ruby/object:Gem::Requirement
43
+ requirements:
44
+ - - ">="
45
+ - !ruby/object:Gem::Version
46
+ version: "0"
47
+ version:
48
+ required_rubygems_version: !ruby/object:Gem::Requirement
49
+ requirements:
50
+ - - ">="
51
+ - !ruby/object:Gem::Version
52
+ version: "0"
53
+ version:
54
+ requirements: []
55
+
56
+ rubyforge_project:
57
+ rubygems_version: 1.3.1
58
+ signing_key:
59
+ specification_version: 2
60
+ summary: Parallel Each and other parallel things
61
+ test_files: []
62
+