bud 0.1.0.pre1 → 0.9.0
- data/History.txt +23 -0
- data/{README → README.md} +6 -2
- data/docs/cheat.md +1 -8
- data/docs/intro.md +1 -1
- data/lib/bud/aggs.rb +16 -16
- data/lib/bud/bud_meta.rb +8 -15
- data/lib/bud/collections.rb +85 -172
- data/lib/bud/errors.rb +5 -1
- data/lib/bud/executor/elements.rb +133 -118
- data/lib/bud/executor/group.rb +6 -6
- data/lib/bud/executor/join.rb +25 -22
- data/lib/bud/metrics.rb +1 -1
- data/lib/bud/monkeypatch.rb +18 -29
- data/lib/bud/rebl.rb +5 -4
- data/lib/bud/rewrite.rb +21 -160
- data/lib/bud/source.rb +5 -5
- data/lib/bud/state.rb +13 -12
- data/lib/bud/storage/dbm.rb +13 -23
- data/lib/bud/storage/zookeeper.rb +0 -4
- data/lib/bud.rb +184 -162
- metadata +144 -216
- data/docs/deploy.md +0 -96
- data/lib/bud/deploy/countatomicdelivery.rb +0 -38
- data/lib/bud/joins.rb +0 -526
data/History.txt
ADDED
@@ -0,0 +1,23 @@
+== 0.9.0 / 2012-03-20
+
+* Major performance enhancements
+* Much, much faster: rewritten runtime that now uses a push-based dataflow
+* Operator state is cached; only deltas are updated across ticks in many cases
+* Joins that use collection keys can use collection storage for improved
+performance
+* Improved compatibility: Bud now works with MRI 1.9 (as well as 1.8.7)
+* Switched from ParseTree to ruby_parser
+* Rewritten Bloom module system
+* Tuples are now represented as Ruby Structs, rather than Arrays
+* This avoids the need to define column accessor methods by hand
+* Tests now use MiniTest rather than Test::Unit
+* Observe the following incompatibilities:
+* Support for "semi-structured" collections have been removed. That is,
+previously you could store extra field values into a tuple; those values
+would be collapsed into a single array that was tacked onto the end of the
+tuple.
+* Support for Bloom-based signal handling has been removed
+* Support for the "with" syntax has been removed
+* The Bloom-based "deployment" framework has been removed
+* Support for Tokyo Cabinet-based collections has been removed
+
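The changelog entry about Struct-based tuples can be illustrated in plain Ruby. The sketch below does not use Bud's API: the `LinkTuple` name and its three columns are hypothetical, chosen only to show why a Struct removes the need for hand-written column accessors.

```ruby
# Hypothetical tuple type for illustration; this is plain Ruby, not Bud's API.
LinkTuple = Struct.new(:src, :dst, :cost)

# Array-style tuple: columns are reachable only by position.
array_tuple = ["a", "b", 3]
array_cost = array_tuple[2]

# Struct-style tuple: Struct.new generates the named accessors,
# and positional access keeps working.
struct_tuple = LinkTuple.new("a", "b", 3)
struct_cost = struct_tuple.cost

puts array_cost == struct_cost
puts struct_tuple[2] == struct_tuple.cost
puts struct_tuple.to_a.inspect
```

Because a Struct still supports `[]` and `to_a`, code that indexes tuples by position would presumably keep working; that is an observation about Ruby's Struct, not a statement about Bud's internals.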
data/{README → README.md}
RENAMED
@@ -14,23 +14,27 @@ Main deficiencies at this point are:
 available, including mutable state. This allows programmers to get outside
 the Bloom framework and lose cleanliness.
 
-- Compatibility: Bud only works with Ruby (MRI) 1.8 and 1.9. JRuby and other
+- Compatibility: Bud only works with Ruby (MRI) 1.8.7 and 1.9. JRuby and other
 Ruby implementations are currently not supported.
 
 ## Installation
 
 To install the latest release:
+
 % gem install bud
 
 To build and install a new gem from the current development sources:
+
 % gem build bud.gemspec ; gem install bud*.gem
 
-Note that GraphViz must be installed.
+Note that [GraphViz](http://www.graphviz.org/) must be installed.
 
 Simple example programs can be found in examples. A much larger set of example
 programs and libraries can be found in the bud-sandbox repository.
 
 To run the unit tests:
+
+% gem install minitest # unless already installed
 % cd test; ruby ts_bud.rb
 
 ## Optional Dependencies
data/docs/cheat.md
CHANGED
@@ -101,13 +101,6 @@ Statements with stdio on lhs must use async merge (`<~`).<br>
 Using `stdio` on the lhs of an async merge results in writing to the `IO` object specified by the `:stdout` Bud option (`$stdout` by default).<br>
 To use `stdio` on rhs, instantiate Bud with `:stdin` option set to an `IO` object (e.g., `$stdin`).<br>
 
-### signals ###
-Built-in read-only scratch collection for receiving OS signals.<br>
-System-provided attributes: `[:key] => []`
-
-Currently catches only SIGINT ("INT") and SIGTERM ("TERM"). If Bud option `:signal_handling=>:bloom` is set, the signal is trapped and Bloom rules
-are responsible to deal with the content of `signals`.
-
 ### halt ###
 Built-in scratch collection to be used on the lhs of a rule; permanently halts the Bud instance upon first insertion.
 
@@ -310,7 +303,7 @@ Like `pairs`, but implicitly includes a block that projects down to the right it
 `outer(`*hash pairs*`)`:<br>
 Left Outer Join. Like `pairs`, but items in the first collection will be produced nil-padded if they have no match in the second collection.
 
-## Temp Collections
+## Temp Collections ##
 `temp`<br>
 Temp collections are scratches defined within a `bloom` block:
 
data/docs/intro.md
CHANGED
@@ -29,7 +29,7 @@ This alpha is targeted at "friends and family", and at developers who'd like to
 ## Getting Started ##
 We're shipping Bud with a [sandbox](http://github.com/bloom-lang/bud-sandbox) of libraries and example applications for distributed systems. These illustrate the language and how it can be used, and also can serve as mixins for new code you might want to write. You may be surprised at how short the provided Bud code is, but don't be fooled.
 
-To get you started with Bud, we've provided a [quick-start tutorial](getstarted.md)
+To get you started with Bud, we've provided a [quick-start tutorial](getstarted.md) and a number of other docs you can find linked from the [README](README.md).
 
 We welcome both constructive criticism and (hopefully occasional) smoke-out-your-ears, hair-tearing shouts of frustration. Please point your feedback cannon at the [Bloom mailing list](http://groups.google.com/group/bloom-lang).
 
data/lib/bud/aggs.rb
CHANGED
@@ -4,7 +4,7 @@ module Bud
 def init(val)
 val
 end
-
+
 # In order to support argagg, trans must return a pair:
 # 1. the running aggregate state
 # 2. a flag to indicate what the caller should do with the input tuple for argaggs
@@ -40,16 +40,16 @@ module Bud
 
 class Min < ArgExemplary #:nodoc: all
 def trans(the_state, val)
-if the_state < val
+if the_state < val
 return the_state, :ignore
 elsif the_state == val
 return the_state, :keep
-else
+else
 return val, :replace
 end
 end
 end
-# exemplary aggregate method to be used in Bud::BudCollection.group.
+# exemplary aggregate method to be used in Bud::BudCollection.group.
 # computes minimum of x entries aggregated.
 def min(x)
 [Min.new, x]
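The pair returned by `Min#trans` (running state plus a flag) is easier to follow with a small driver. The sketch below is plain Ruby and is not Bud's implementation: `min_trans` copies the branch structure from the hunk above, while `arg_min` and the way it interprets `:ignore`, `:keep`, and `:replace` are assumptions about how a caller might track the exemplar tuples.

```ruby
# Same branch structure as Min#trans in the hunk above.
def min_trans(state, val)
  if state < val
    [state, :ignore]   # current minimum stands; the input is not an exemplar
  elsif state == val
    [state, :keep]     # tie: the input is an additional exemplar
  else
    [val, :replace]    # new minimum: the input replaces earlier exemplars
  end
end

# Hypothetical driver (an assumption, not Bud's code): returns the minimum
# of the block's values together with every tuple that attains it.
def arg_min(tuples)
  return [nil, []] if tuples.empty?
  state = yield(tuples.first)
  exemplars = [tuples.first]
  tuples.drop(1).each do |t|
    state, flag = min_trans(state, yield(t))
    case flag
    when :keep    then exemplars << t
    when :replace then exemplars = [t]
    end
  end
  [state, exemplars]
end

rows = [["x", 5], ["y", 2], ["z", 2], ["w", 9]]
min_val, winners = arg_min(rows) { |r| r[1] }
puts "min=#{min_val} winners=#{winners.inspect}"
```

Returning the flag alongside the state lets a single pass compute both the aggregate value and the tuples responsible for it.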
@@ -64,7 +64,7 @@ module Bud
 end
 end
 end
-# exemplary aggregate method to be used in Bud::BudCollection.group.
+# exemplary aggregate method to be used in Bud::BudCollection.group.
 # computes maximum of x entries aggregated.
 def max(x)
 [Max.new, x]
@@ -83,7 +83,7 @@ module Bud
 end
 end
 
-# exemplary aggregate method to be used in Bud::BudCollection.group.
+# exemplary aggregate method to be used in Bud::BudCollection.group.
 # arbitrarily but deterministically chooses among x entries being aggregated.
 def choose(x)
 [Choose.new, x]
@@ -93,7 +93,7 @@ module Bud
 def init(x=nil) # Vitter's reservoir sampling, sample size = 1
 the_state = {:cnt => 1, :val => x}
 end
-
+
 def trans(the_state, val)
 the_state[:cnt] += 1
 j = rand(the_state[:cnt])
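The hunk above shows only the first lines of the reservoir-sampling `trans` method. As a rough, self-contained illustration of the technique named in the comment (reservoir sampling with a sample size of 1), the sketch below uses hypothetical helper names and an injectable `rng` parameter; it is not a reconstruction of the elided Bud code.

```ruby
# Reservoir sampling with a sample size of 1 (a sketch, not Bud's code).
# `rng` is a hypothetical parameter so that callers can seed the choice.
def reservoir_init(x)
  {:cnt => 1, :val => x}
end

def reservoir_trans(state, val, rng = Random.new)
  state[:cnt] += 1
  # rng.rand(n) is uniform over 0...n, so the new value replaces the
  # current sample with probability 1/cnt.
  state[:val] = val if rng.rand(state[:cnt]) == 0
  state
end

def sample_one(items, rng = Random.new)
  state = reservoir_init(items.first)
  items.drop(1).each { |v| reservoir_trans(state, v, rng) }
  state[:val]
end

puts sample_one([10, 20, 30, 40], Random.new(42))
```

Keeping only a counter and one value means the input can be consumed as a stream, which matches the one-tuple-at-a-time shape of `trans`.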
@@ -113,7 +113,7 @@ module Bud
 end
 end
 
-# exemplary aggregate method to be used in Bud::BudCollection.group.
+# exemplary aggregate method to be used in Bud::BudCollection.group.
 # randomly chooses among x entries being aggregated.
 def choose_rand(x=nil)
 [ChooseOneRand.new, x]
@@ -124,8 +124,8 @@ module Bud
 return the_state + val, nil
 end
 end
-
-# aggregate method to be used in Bud::BudCollection.group.
+
+# aggregate method to be used in Bud::BudCollection.group.
 # computes sum of x entries aggregated.
 def sum(x)
 [Sum.new, x]
@@ -139,8 +139,8 @@ module Bud
 return the_state + 1, nil
 end
 end
-
-# aggregate method to be used in Bud::BudCollection.group.
+
+# aggregate method to be used in Bud::BudCollection.group.
 # counts number of entries aggregated. argument is ignored.
 def count(x=nil)
 [Count.new]
@@ -159,8 +159,8 @@ module Bud
 the_state[0]*1.0 / the_state[1]
 end
 end
-
-# aggregate method to be used in Bud::BudCollection.group.
+
+# aggregate method to be used in Bud::BudCollection.group.
 # computes average of a multiset of x values
 def avg(x)
 [Avg.new, x]
@@ -175,8 +175,8 @@ module Bud
 return the_state, nil
 end
 end
-
-# aggregate method to be used in Bud::BudCollection.group.
+
+# aggregate method to be used in Bud::BudCollection.group.
 # accumulates all x inputs into an array. note that the order of the elements
 # in the resulting array is undefined.
 def accum(x)
data/lib/bud/bud_meta.rb
CHANGED
@@ -37,7 +37,6 @@ class BudMeta #:nodoc: all
 # Cleanup
 stratified_rules = stratified_rules.reject{|r| r.empty?}
 dump_rewrite(stratified_rules) if @bud_instance.options[:dump_rewrite]
-
 end
 return stratified_rules
 end
@@ -140,7 +139,7 @@ class BudMeta #:nodoc: all
 end
 
 next if i == 1 and n.sexp_type == :nil # a block got rewritten to an empty block
-
+
 # Check for a common case
 if n.sexp_type == :lasgn
 return [n, "illegal operator: '='"]
@@ -192,7 +191,7 @@ class BudMeta #:nodoc: all
 lhs = (nodes[d.lhs.to_s] ||= Node.new(d.lhs.to_s, :init, 0, [], true, false, false, false))
 lhs.in_lhs = true
 body = (nodes[d.body.to_s] ||= Node.new(d.body.to_s, :init, 0, [], false, true, false, false))
-temporal = d.op != "<="
+temporal = d.op != "<="
 lhs.edges << Edge.new(body, d.op, d.nm, temporal)
 body.in_body = true
 end
@@ -200,7 +199,7 @@ class BudMeta #:nodoc: all
 nodes.values.each {|n| calc_stratum(n, false, false, [n.name])}
 # Normalize stratum numbers because they may not be 0-based or consecutive
 remap = {}
-# if the nodes stratum numbers are [2, 3, 2, 4], remap = {2 => 0, 3 => 1, 4 => 2}
+# if the nodes stratum numbers are [2, 3, 2, 4], remap = {2 => 0, 3 => 1, 4 => 2}
 nodes.values.map {|n| n.stratum}.uniq.sort.each_with_index{|num, i|
 remap[num] = i
 }
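The normalization step in the hunk above can be tried in isolation. The sketch below wraps the same `uniq.sort.each_with_index` chain in a hypothetical helper, `normalize_strata`, that takes a plain array of stratum numbers instead of Node objects.

```ruby
# Maps arbitrary stratum numbers onto 0-based, consecutive indices.
# `normalize_strata` is a hypothetical helper name used for illustration.
def normalize_strata(strata)
  remap = {}
  strata.uniq.sort.each_with_index { |num, i| remap[num] = i }
  remap
end

remap = normalize_strata([2, 3, 2, 4])
puts remap.inspect
puts [2, 3, 2, 4].map { |s| remap[s] }.inspect
```

Sorting before indexing preserves the relative order of the strata, so evaluation order is unchanged after renumbering.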
@@ -239,25 +238,19 @@ class BudMeta #:nodoc: all
 
 
 def analyze_dependencies(nodes) # nodes = {node name => node}
-
-
-preds_in_lhs = nodes.inject(Set.new) {|preds, name_n| preds.add(name_n[0]) if name_n[1].in_lhs; preds}
-preds_in_body = nodes.inject(Set.new) {|preds, name_n| preds.add(name_n[0]) if name_n[1].in_body; preds}
+preds_in_lhs = nodes.select {|_, node| node.in_lhs}.map {|name, _| name}.to_set
+preds_in_body = nodes.select {|_, node| node.in_body}.map {|name, _| name}.to_set
 
+bud = @bud_instance
 bud.t_provides.each do |p|
 pred, input = p.interface, p.input
 if input
-# an interface pred is a source if it is an input and it is not in any rule's lhs
-#bud.sources << [pred] unless (preds_in_lhs.include? pred)
 unless preds_in_body.include? pred.to_s
 # input interface is underspecified if not used in any rule body
 bud.t_underspecified << [pred, true] # true indicates input mode
 puts "Warning: input interface #{pred} not used"
 end
 else
-# an interface pred is a sink if it is not an input and it is not in any rule's body
-#(if it is in the body, then it is an intermediate node feeding some lhs)
-#bud.sinks << [pred] unless (preds_in_body.include? pred)
 unless preds_in_lhs.include? pred.to_s
 # output interface underspecified if not in any rule's lhs
 bud.t_underspecified << [pred, false] #false indicates output mode.
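The hunk above replaces an `inject`-based set construction with a `select`/`map`/`to_set` chain. A quick way to check that the two forms agree is to run both on a small hash. In the sketch below, `FakeNode`, `SAMPLE_NODES`, and the two helper methods are hypothetical stand-ins; only the two expressions themselves come from the diff.

```ruby
require 'set'

# Hypothetical stand-in for BudMeta's Node; only the two flags that
# analyze_dependencies reads are modelled here.
FakeNode = Struct.new(:in_lhs, :in_body)

# The removed form: accumulate names into a Set by hand.
def lhs_preds_inject(nodes)
  nodes.inject(Set.new) { |preds, name_n| preds.add(name_n[0]) if name_n[1].in_lhs; preds }
end

# The added form: filter, project to the name, convert to a Set.
def lhs_preds_select(nodes)
  nodes.select { |_, node| node.in_lhs }.map { |name, _| name }.to_set
end

SAMPLE_NODES = {
  "link" => FakeNode.new(false, true),
  "path" => FakeNode.new(true, true),
  "out"  => FakeNode.new(true, false)
}

puts lhs_preds_inject(SAMPLE_NODES) == lhs_preds_select(SAMPLE_NODES)
```

The `map {|name, _| name}` step destructures each entry as a pair, which works whether `select` returns a Hash or an array of key/value pairs.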
@@ -287,8 +280,8 @@ class BudMeta #:nodoc: all
 fout.puts "Declarations:"
 
 strata.each_with_index do |rules, i|
-fout.
-fout.
+fout.puts "================================="
+fout.puts "Stratum #{i}"
 rules.each do |r|
 fout.puts "#{r.bud_obj.class}##{r.bud_obj.object_id} #{r.rule_id}"
 fout.puts "\tsrc: #{r.src}"