bud 0.1.0.pre1 → 0.9.0

data/History.txt ADDED
@@ -0,0 +1,23 @@
+ == 0.9.0 / 2012-03-20
+
+ * Major performance enhancements
+ * Much, much faster: rewritten runtime that now uses a push-based dataflow
+ * Operator state is cached; only deltas are updated across ticks in many cases
+ * Joins that use collection keys can use collection storage for improved
+ performance
+ * Improved compatibility: Bud now works with MRI 1.9 (as well as 1.8.7)
+ * Switched from ParseTree to ruby_parser
+ * Rewritten Bloom module system
+ * Tuples are now represented as Ruby Structs, rather than Arrays
+ * This avoids the need to define column accessor methods by hand
+ * Tests now use MiniTest rather than Test::Unit
+ * Observe the following incompatibilities:
+ * Support for "semi-structured" collections has been removed. That is,
+ previously you could store extra field values into a tuple; those values
+ would be collapsed into a single array that was tacked onto the end of the
+ tuple.
+ * Support for Bloom-based signal handling has been removed
+ * Support for the "with" syntax has been removed
+ * The Bloom-based "deployment" framework has been removed
+ * Support for Tokyo Cabinet-based collections has been removed
+
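The changelog entry about tuples becoming Ruby Structs can be illustrated without Bud itself. The sketch below uses plain Ruby only; `LinkTuple` and its column names are hypothetical and are not taken from the Bud sources.

```ruby
# Hypothetical schema for illustration; these column names are not from Bud.
LinkTuple = Struct.new(:from, :to, :cost)

array_tuple  = ["a", "b", 3]              # Array style: positional access only
struct_tuple = LinkTuple.new("a", "b", 3) # Struct style: positional and named

# Positional access works on both representations.
array_cost  = array_tuple[2]
struct_cost = struct_tuple[2]

# Named access comes with the Struct; no hand-written accessor is needed.
named_cost = struct_tuple.cost

# A Struct still converts to a plain Array when one is required.
as_array = struct_tuple.to_a
```

With an Array-backed tuple, a reader method such as `cost` would have to be defined by hand, which is the chore the changelog says the Struct representation avoids.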
@@ -14,23 +14,27 @@ Main deficiencies at this point are:
  available, including mutable state. This allows programmers to get outside
  the Bloom framework and lose cleanliness.

- - Compatibility: Bud only works with Ruby (MRI) 1.8 and 1.9. JRuby and other
+ - Compatibility: Bud only works with Ruby (MRI) 1.8.7 and 1.9. JRuby and other
  Ruby implementations are currently not supported.

  ## Installation

  To install the latest release:
+
  % gem install bud

  To build and install a new gem from the current development sources:
+
  % gem build bud.gemspec ; gem install bud*.gem

- Note that GraphViz must be installed.
+ Note that [GraphViz](http://www.graphviz.org/) must be installed.

  Simple example programs can be found in examples. A much larger set of example
  programs and libraries can be found in the bud-sandbox repository.

  To run the unit tests:
+
+ % gem install minitest # unless already installed
  % cd test; ruby ts_bud.rb

  ## Optional Dependencies
data/docs/cheat.md CHANGED
@@ -101,13 +101,6 @@ Statements with stdio on lhs must use async merge (`<~`).<br>
  Using `stdio` on the lhs of an async merge results in writing to the `IO` object specified by the `:stdout` Bud option (`$stdout` by default).<br>
  To use `stdio` on rhs, instantiate Bud with `:stdin` option set to an `IO` object (e.g., `$stdin`).<br>

- ### signals ###
- Built-in read-only scratch collection for receiving OS signals.<br>
- System-provided attributes: `[:key] => []`
-
- Currently catches only SIGINT ("INT") and SIGTERM ("TERM"). If Bud option `:signal_handling=>:bloom` is set, the signal is trapped and Bloom rules
- are responsible to deal with the content of `signals`.
-
  ### halt ###
  Built-in scratch collection to be used on the lhs of a rule; permanently halts the Bud instance upon first insertion.

@@ -310,7 +303,7 @@ Like `pairs`, but implicitly includes a block that projects down to the right it
  `outer(`*hash pairs*`)`:<br>
  Left Outer Join. Like `pairs`, but items in the first collection will be produced nil-padded if they have no match in the second collection.

- ## Temp Collections and With Blocks ##
+ ## Temp Collections ##
  `temp`<br>
  Temp collections are scratches defined within a `bloom` block:

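The left outer join semantics described in the context lines of this hunk can be sketched in plain Ruby. This is only an illustration of the nil-padding idea: `left_outer_sketch` is a hypothetical helper, not Bud's `outer` method, and representing the missing right-hand side as a bare `nil` is an assumption.

```ruby
# left_outer_sketch is a hypothetical helper, not part of Bud.
# The block decides whether a left item and a right item match.
def left_outer_sketch(left, right)
  left.flat_map do |l|
    matches = right.select { |r| yield(l, r) }
    # Assumption: an unmatched left item is paired with nil.
    matches.empty? ? [[l, nil]] : matches.map { |r| [l, r] }
  end
end

emp  = [[:alice, 1], [:bob, 2], [:carol, 9]]
dept = [[1, :eng], [2, :ops]]

joined = left_outer_sketch(emp, dept) { |e, d| e[1] == d[0] }
```

Every item of the first collection appears in `joined` at least once; `:carol` has no matching department and is therefore paired with `nil`.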
data/docs/intro.md CHANGED
@@ -29,7 +29,7 @@ This alpha is targeted at "friends and family", and at developers who'd like to
  ## Getting Started ##
  We're shipping Bud with a [sandbox](http://github.com/bloom-lang/bud-sandbox) of libraries and example applications for distributed systems. These illustrate the language and how it can be used, and also can serve as mixins for new code you might want to write. You may be surprised at how short the provided Bud code is, but don't be fooled.

- To get you started with Bud, we've provided a [quick-start tutorial](getstarted.md), instructions for [deploying distributed Bud](deploy.md) programs on Amazon's EC2 cloud, and a number of other docs you can find linked from the [README](README.md).
+ To get you started with Bud, we've provided a [quick-start tutorial](getstarted.md) and a number of other docs you can find linked from the [README](README.md).

  We welcome both constructive criticism and (hopefully occasional) smoke-out-your-ears, hair-tearing shouts of frustration. Please point your feedback cannon at the [Bloom mailing list](http://groups.google.com/group/bloom-lang).

data/lib/bud/aggs.rb CHANGED
@@ -4,7 +4,7 @@ module Bud
  def init(val)
  val
  end
-
+
  # In order to support argagg, trans must return a pair:
  # 1. the running aggregate state
  # 2. a flag to indicate what the caller should do with the input tuple for argaggs
@@ -40,16 +40,16 @@ module Bud

  class Min < ArgExemplary #:nodoc: all
  def trans(the_state, val)
- if the_state < val
+ if the_state < val
  return the_state, :ignore
  elsif the_state == val
  return the_state, :keep
- else
+ else
  return val, :replace
  end
  end
  end
- # exemplary aggregate method to be used in Bud::BudCollection.group.
+ # exemplary aggregate method to be used in Bud::BudCollection.group.
  # computes minimum of x entries aggregated.
  def min(x)
  [Min.new, x]
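The `trans` contract in this hunk (return the running state plus a flag telling the caller what to do with the input tuple) can be exercised outside Bud. The sketch below copies the body of `Min#trans` into a stand-alone class; `MinSketch` and the driver `argmin_sketch` are hypothetical names, and the driver loop is an assumption about how a caller might interpret the flags, not Bud's actual argagg code.

```ruby
# Stand-alone imitation of the Min#trans contract shown in the diff.
class MinSketch
  def init(val)
    val
  end

  # Returns [new_state, flag]. The flag says what to do with the input:
  # :ignore it, :keep it alongside the current winners, or :replace them.
  def trans(the_state, val)
    if the_state < val
      return the_state, :ignore
    elsif the_state == val
      return the_state, :keep
    else
      return val, :replace
    end
  end
end

# Assumed driver: collects every tuple whose value equals the running minimum.
# The block extracts the value to minimize from a tuple.
def argmin_sketch(tuples)
  agg = MinSketch.new
  state = nil
  winners = []
  tuples.each do |tup|
    val = yield(tup)
    if state.nil?
      state = agg.init(val)
      winners = [tup]
      next
    end
    state, flag = agg.trans(state, val)
    case flag
    when :keep    then winners << tup
    when :replace then winners = [tup]
    end
  end
  winners
end

result = argmin_sketch([[:a, 3], [:b, 1], [:c, 2], [:d, 1]]) { |t| t[1] }
```

Here `result` holds both tuples that tie for the minimum value, which is why `trans` distinguishes `:keep` from `:replace` instead of returning the state alone.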
@@ -64,7 +64,7 @@ module Bud
  end
  end
  end
- # exemplary aggregate method to be used in Bud::BudCollection.group.
+ # exemplary aggregate method to be used in Bud::BudCollection.group.
  # computes maximum of x entries aggregated.
  def max(x)
  [Max.new, x]
@@ -83,7 +83,7 @@ module Bud
  end
  end

- # exemplary aggregate method to be used in Bud::BudCollection.group.
+ # exemplary aggregate method to be used in Bud::BudCollection.group.
  # arbitrarily but deterministically chooses among x entries being aggregated.
  def choose(x)
  [Choose.new, x]
@@ -93,7 +93,7 @@ module Bud
  def init(x=nil) # Vitter's reservoir sampling, sample size = 1
  the_state = {:cnt => 1, :val => x}
  end
-
+
  def trans(the_state, val)
  the_state[:cnt] += 1
  j = rand(the_state[:cnt])
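The hunk above stops right after the random draw, so the replacement step is not visible. The following is a minimal sketch of size-1 reservoir sampling in plain Ruby, using the same `:cnt`/`:val` state layout. The helper name `reservoir_pick` is hypothetical, and the `j == 0` replacement test is the textbook rule, assumed here rather than copied from Bud.

```ruby
# Size-1 reservoir sampling over any enumerable.
# Assumption: the replacement test (j == 0) is the textbook rule; the diff
# cuts off before the corresponding line in Bud.
def reservoir_pick(items, rng = Random.new)
  state = nil
  items.each do |item|
    if state.nil?
      state = {:cnt => 1, :val => item}
    else
      state[:cnt] += 1
      j = rng.rand(state[:cnt])      # uniform integer in 0...cnt
      state[:val] = item if j == 0   # happens with probability 1/cnt
    end
  end
  state && state[:val]
end

picked = reservoir_pick([:a, :b, :c, :d], Random.new(42))
```

Because the n-th item replaces the current sample with probability 1/n, each item ends up as the final sample with equal probability, and only one value plus a counter has to be kept.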
@@ -113,7 +113,7 @@ module Bud
  end
  end

- # exemplary aggregate method to be used in Bud::BudCollection.group.
+ # exemplary aggregate method to be used in Bud::BudCollection.group.
  # randomly chooses among x entries being aggregated.
  def choose_rand(x=nil)
  [ChooseOneRand.new, x]
@@ -124,8 +124,8 @@ module Bud
  return the_state + val, nil
  end
  end
-
- # aggregate method to be used in Bud::BudCollection.group.
+
+ # aggregate method to be used in Bud::BudCollection.group.
  # computes sum of x entries aggregated.
  def sum(x)
  [Sum.new, x]
@@ -139,8 +139,8 @@ module Bud
  return the_state + 1, nil
  end
  end
-
- # aggregate method to be used in Bud::BudCollection.group.
+
+ # aggregate method to be used in Bud::BudCollection.group.
  # counts number of entries aggregated. argument is ignored.
  def count(x=nil)
  [Count.new]
@@ -159,8 +159,8 @@ module Bud
  the_state[0]*1.0 / the_state[1]
  end
  end
-
- # aggregate method to be used in Bud::BudCollection.group.
+
+ # aggregate method to be used in Bud::BudCollection.group.
  # computes average of a multiset of x values
  def avg(x)
  [Avg.new, x]
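Only the final division of the average aggregate is visible in this hunk. A plausible reading is that the state is a `[sum, count]` pair, and the sketch below is built on that inference. `AvgSketch` is a hypothetical stand-in, and its `init` and `trans` bodies are assumptions rather than Bud's code.

```ruby
# AvgSketch is a hypothetical stand-in; only the final division is taken
# from the diff. The [sum, count] state layout is inferred from it.
class AvgSketch
  def init(val)
    [val, 1]
  end

  # Returns [new_state, flag]; the flag is nil, as for the other
  # non-exemplary aggregates shown in this file.
  def trans(the_state, val)
    return [the_state[0] + val, the_state[1] + 1], nil
  end

  # Multiplying by 1.0 forces floating-point division for Integer inputs.
  def final(the_state)
    the_state[0] * 1.0 / the_state[1]
  end
end

agg = AvgSketch.new
values = [2, 3, 7]
state = agg.init(values.first)
values.drop(1).each { |v| state, _flag = agg.trans(state, v) }
average = agg.final(state)
```

Without the `* 1.0`, averaging the integers 1 and 2 would yield 1 under Ruby's integer division instead of 1.5.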
@@ -175,8 +175,8 @@ module Bud
  return the_state, nil
  end
  end
-
- # aggregate method to be used in Bud::BudCollection.group.
+
+ # aggregate method to be used in Bud::BudCollection.group.
  # accumulates all x inputs into an array. note that the order of the elements
  # in the resulting array is undefined.
  def accum(x)
data/lib/bud/bud_meta.rb CHANGED
@@ -37,7 +37,6 @@ class BudMeta #:nodoc: all
  # Cleanup
  stratified_rules = stratified_rules.reject{|r| r.empty?}
  dump_rewrite(stratified_rules) if @bud_instance.options[:dump_rewrite]
-
  end
  return stratified_rules
  end
@@ -140,7 +139,7 @@ class BudMeta #:nodoc: all
  end

  next if i == 1 and n.sexp_type == :nil # a block got rewritten to an empty block
-
+
  # Check for a common case
  if n.sexp_type == :lasgn
  return [n, "illegal operator: '='"]
@@ -192,7 +191,7 @@ class BudMeta #:nodoc: all
  lhs = (nodes[d.lhs.to_s] ||= Node.new(d.lhs.to_s, :init, 0, [], true, false, false, false))
  lhs.in_lhs = true
  body = (nodes[d.body.to_s] ||= Node.new(d.body.to_s, :init, 0, [], false, true, false, false))
- temporal = d.op != "<="
+ temporal = d.op != "<="
  lhs.edges << Edge.new(body, d.op, d.nm, temporal)
  body.in_body = true
  end
@@ -200,7 +199,7 @@ class BudMeta #:nodoc: all
  nodes.values.each {|n| calc_stratum(n, false, false, [n.name])}
  # Normalize stratum numbers because they may not be 0-based or consecutive
  remap = {}
- # if the nodes stratum numbers are [2, 3, 2, 4], remap = {2 => 0, 3 => 1, 4 => 2}
+ # if the nodes stratum numbers are [2, 3, 2, 4], remap = {2 => 0, 3 => 1, 4 => 2}
  nodes.values.map {|n| n.stratum}.uniq.sort.each_with_index{|num, i|
  remap[num] = i
  }
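The normalization step in this hunk can be run on its own with the example from the code comment. The sketch below replaces the `nodes` hash with a plain array of stratum numbers; the `strata` and `normalized` names are only for illustration.

```ruby
strata = [2, 3, 2, 4]   # example stratum numbers from the comment in the diff

# Same idiom as in the diff: rank each distinct stratum number in sorted order.
remap = {}
strata.uniq.sort.each_with_index { |num, i| remap[num] = i }

# Applying the mapping yields 0-based, consecutive stratum numbers.
normalized = strata.map { |s| remap[s] }
```

The relative order of the strata is preserved, and two nodes that shared a stratum before normalization still share one afterwards.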
@@ -239,25 +238,19 @@ class BudMeta #:nodoc: all


  def analyze_dependencies(nodes) # nodes = {node name => node}
- bud = @bud_instance
-
- preds_in_lhs = nodes.inject(Set.new) {|preds, name_n| preds.add(name_n[0]) if name_n[1].in_lhs; preds}
- preds_in_body = nodes.inject(Set.new) {|preds, name_n| preds.add(name_n[0]) if name_n[1].in_body; preds}
+ preds_in_lhs = nodes.select {|_, node| node.in_lhs}.map {|name, _| name}.to_set
+ preds_in_body = nodes.select {|_, node| node.in_body}.map {|name, _| name}.to_set

+ bud = @bud_instance
  bud.t_provides.each do |p|
  pred, input = p.interface, p.input
  if input
- # an interface pred is a source if it is an input and it is not in any rule's lhs
- #bud.sources << [pred] unless (preds_in_lhs.include? pred)
  unless preds_in_body.include? pred.to_s
  # input interface is underspecified if not used in any rule body
  bud.t_underspecified << [pred, true] # true indicates input mode
  puts "Warning: input interface #{pred} not used"
  end
  else
- # an interface pred is a sink if it is not an input and it is not in any rule's body
- #(if it is in the body, then it is an intermediate node feeding some lhs)
- #bud.sinks << [pred] unless (preds_in_body.include? pred)
  unless preds_in_lhs.include? pred.to_s
  # output interface underspecified if not in any rule's lhs
  bud.t_underspecified << [pred, false] #false indicates output mode.
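The rewrite of `preds_in_lhs` and `preds_in_body` in this hunk swaps a hand-rolled `inject` for a `select`/`map`/`to_set` chain. The sketch below checks on sample data that the two forms build the same set. `NodeSketch` and the sample `nodes` hash are hypothetical stand-ins for the real `Node` struct and dependency graph.

```ruby
require 'set'

# NodeSketch is a hypothetical stand-in for the Node struct used by BudMeta.
NodeSketch = Struct.new(:in_lhs, :in_body)

nodes = {
  "link" => NodeSketch.new(false, true),
  "path" => NodeSketch.new(true, true),
  "out"  => NodeSketch.new(true, false)
}

# Old form from the diff: accumulate names into a Set with inject.
old_lhs = nodes.inject(Set.new) { |preds, name_n| preds.add(name_n[0]) if name_n[1].in_lhs; preds }

# New form from the diff: filter the pairs, project the names, convert to a Set.
new_lhs = nodes.select { |_, node| node.in_lhs }.map { |name, _| name }.to_set
```

Both expressions produce a set containing `"path"` and `"out"`; the newer form states the filter and the projection as separate steps instead of threading an accumulator through the block.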
@@ -287,8 +280,8 @@ class BudMeta #:nodoc: all
  fout.puts "Declarations:"

  strata.each_with_index do |rules, i|
- fout.print "=================================\n"
- fout.print "Stratum #{i}\n"
+ fout.puts "================================="
+ fout.puts "Stratum #{i}"
  rules.each do |r|
  fout.puts "#{r.bud_obj.class}##{r.bud_obj.object_id} #{r.rule_id}"
  fout.puts "\tsrc: #{r.src}"