bud 0.9.4 → 0.9.5

data/History.txt CHANGED
@@ -1,3 +1,26 @@
+ == 0.9.5 / 2012-11-24
+
+ * Lattice branch (Bloom^L) merged
+ * Compatibility with recent versions of ruby_parser (3.0.2+) and ruby2ruby
+ (2.0.1+) -- #. Older versions of these two gems are no longer supported
+ * Add support for aggregate functions that take multiple input columns
+ * Add built-in aggregate function accum_pair(x, y), which produces a Set of
+ pairs (two-element arrays [x,y])
+ * Support user-specified code blocks in payloads(), argagg(), argmin() and
+ argmax()
+ * Change behavior of BudChannel#payloads for channels with two
+ columns. Previously we returned a single *column* (scalar) value in this case;
+ now we always return a tuple with k-1 columns
+ * More consistent behavior for BudCollection#sort when used outside Bloom
+ programs
+ * Restore support for each_with_index() over Bud collections
+ * Restore functionality of Zookeeper-backed Bud collections and fix
+ incompatibility with recent (> 0.4.4) versions of the Zookeeper gem
+ * Optimize parsing of Bloom statements, particularly for large Bloom programs
+ * Fix bug in argagg state materialization
+ * Fix bug in chaining argmin() or argmax() expressions
+ * Fix bug in chaining notin() expressions
+
  == 0.9.4 / 2012-09-06
 
  * Optimize grouping performance
data/bin/budlabel ADDED
@@ -0,0 +1,63 @@
+ #!/usr/bin/env ruby
+
+ require 'rubygems'
+ require 'bud'
+ require 'getopt/std'
+ require 'bud/labeling/labeling'
+ require 'bud/labeling/bloomgraph'
+ require 'bud/labeling/budplot_style'
+
+ $LOAD_PATH.unshift(".")
+
+ @opts = Getopt::Std.getopts("r:i:p:O:CP")
+
+ unless @opts["r"] and @opts["i"]
+ puts "USAGE:"
+ puts "-r REQUIRE"
+ puts "-i INCLUDE"
+ puts "[-p INCLUDE PATH]"
+ puts "[-O <FMT> Output a graphviz representation of the module in FMT format (pdf if not specified)."
+ puts "-C Concise output -- Associate a single label with each output interface"
+ puts "-P Path-based output -- For each output interface, attribute a label to paths from each input interface"
+ exit
+ end
+
+ hreadable = {
+ "D" => "Diffluent: Nondeterministic output contents.",
+ "A" => "Asynchronous. Nondeterministic output orders.",
+ "N" => "Nonmonotonic. Output contents are sensitive to input orders.",
+ "Bot" => "Monotonic. Order-insensitive and retraction-free."
+ }
+
+ if @opts["p"]
+ $LOAD_PATH.unshift @opts["p"]
+ end
+
+ require @opts["r"]
+ c = Label.new(@opts["i"])
+
+ puts "--- Report for module #{@opts["i"]} ---"
+
+ if @opts["C"]
+ puts "---------------"
+ puts "Output\t\tLabel"
+ puts "---------------"
+ c.output_report.each_pair do |k, v|
+ puts [k, hreadable[v]].join("\t")
+ end
+ end
+
+ if @opts["P"]
+ c.path_report.each_pair do |output, inpaths|
+ puts ""
+ puts "--------------------"
+ puts "Output\tInput\tLabel"
+ puts "--------------------"
+ puts output
+ inpaths.each_pair do |inp, lbl|
+ puts "\t#{inp}\t#{hreadable[lbl]}"
+ end
+ end
+ end
+
+ c.write_graph(@opts["O"]) if @opts["O"]
data/bin/budtimelines CHANGED
@@ -10,7 +10,7 @@ include VizUtil
 
  module Depends
  state do
- table :depends, [:rid, :lhs, :op, :rhs, :nm]
+ table :depends, [:rid, :lhs, :op, :rhs, :nm, :in_body]
  end
  end
 
data/docs/cheat.md CHANGED
@@ -239,7 +239,7 @@ Finally, we output every tuple of `bc` that does *not* appear in `t`.
  * `bc.group([:col1, :col2], min(:col3))`. *akin to min(col3) GROUP BY col1,col2*
  * exemplary aggs: `min`, `max`, `bool_and`, `bool_or`, `choose`
  * summary aggs: `sum`, `avg`, `count`
- * structural aggs: `accum` *accumulates inputs into a Set*
+ * structural aggs: `accum`, `accum_pair` *accumulates inputs into a Set; accum_pair takes two inputs and accumulates a Set of pairs (two element arrays)*
  * `bc.argmax([:attr1], :attr2)` &nbsp;&nbsp;&nbsp;&nbsp; *returns the bc items per attr1 that have highest attr2*
  * `bc.argmin([:attr1], :attr2)`
  * `bc.argagg(:exemplary_agg_name, [:attr1], :attr2))`. *generalizes argmin/max: returns the bc items per attr1 that are chosen by the exemplary
data/docs/getstarted.md CHANGED
@@ -128,7 +128,7 @@ Now that we've seen a bit of Bloom, we're ready to write our first interesting s
 
  Even though we're getting ahead of ourselves, let's have a peek at the Bloom statements that implement the server in `examples/chat/chat_server.rb`:
 
- nodelist <= signup.payloads
+ nodelist <= connect { |c| [c.client, c.nick] }
  mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }
 
  That's it! There is one statement for each of the two sentences describing the behavior of the "basic idea" above. We'll go through these two statements in more detail shortly. But it's nice to see right away how concisely and naturally a Bloom program can fit our intuitive description of a distributed service.
@@ -139,11 +139,11 @@ Now that we've satisfied our need to peek, let's take this a bit more methodical
 
  module ChatProtocol
  state do
+ channel :connect, [:@addr, :client] => [:nick]
  channel :mcast
- channel :connect
  end
 
- DEFAULT_ADDR = "localhost:12345"
+ DEFAULT_ADDR = "localhost:12345"
  end
 
  This defines a [Ruby mixin module](http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_modules.html) called `ChatProtocol` that has a couple special Bloom features:
@@ -167,7 +167,7 @@ Given this protocol (and the Ruby constant at the bottom), we're now ready to ex
  state { table :nodelist }
 
  bloom do
- nodelist <= connect.payloads
+ nodelist <= connect { |c| [c.client, c.nick] }
  mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }
  end
  end
@@ -190,11 +190,11 @@ With those preliminaries aside, we have our first `bloom` block, which is how Bl
 
  The first is pretty simple:
 
- nodelist <= connect.payloads
+ nodelist <= connect { |c| [c.client, c.nick] }
 
- This says that whenever messages arrive on the channel named "connect", their payloads (i.e. their non-address field) should be instantaneously merged into the table nodelist, which will store them persistently. Note that nodelist has a \[key/val\] pair structure, so we expect the payloads will have that structure as well.
+ This says that whenever messages arrive on the channel named "connect", the client address and user-provided nickname should be instantaneously merged into the table "nodelist", which will store them persistently. Note that nodelist has a \[key/val\] pair structure, so it is suitable for storing pairs of (IP address, nickname).
 
- The next Bloom statement is more complex. Remember the description in the "basic idea" at the beginning of this section: the server needs to accept inbound chat messages from clients, and forward them to other clients.
+ The next Bloom statement is more complex. Remember the description in the "basic idea" at the beginning of this section: the server needs to accept inbound chat messages from clients and forward them to other clients.
 
  mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }
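
To make the data flow of the two server statements concrete, here is a plain-Ruby sketch that does not use Bud at all. The `ConnectMsg` and `McastMsg` structs, the sample addresses, and the Hash standing in for the `nodelist` table are assumptions for illustration only; in a real Bloom program the runtime evaluates these rules at every timestep.

```ruby
# Plain-Ruby emulation of the two chat-server rules; this is not Bud code.
# ConnectMsg and McastMsg are hypothetical stand-ins for channel tuples.
ConnectMsg = Struct.new(:addr, :client, :nick)
McastMsg   = Struct.new(:addr, :val)

server = "localhost:12345"
connect = [
  ConnectMsg.new(server, "127.0.0.1:5001", "alice"),
  ConnectMsg.new(server, "127.0.0.1:5002", "bob")
]
mcast_in = [McastMsg.new(server, "hello")]

# nodelist <= connect { |c| [c.client, c.nick] }
# nodelist has a key/val structure, emulated here with a Hash.
nodelist = {}
connect.each { |c| nodelist[c.client] = c.nick }

# mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }
# Pair every inbound message with every known client address.
outbound = mcast_in.product(nodelist.to_a).map { |m, (key, _nick)| [key, m.val] }

outbound.each { |dest, text| puts "#{dest} <- #{text}" }
```

The sketch shows why the projection now names `c.client` and `c.nick` explicitly: the stored pairs are exactly what the second statement later reads back as `n.key`.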
 
@@ -233,7 +233,7 @@ And here's the code:
  end
 
  bootstrap do
- connect <~ [[@server, [ip_port, @nick]]]
+ connect <~ [[@server, ip_port, @nick]]
  end
 
  bloom do
@@ -7,3 +7,5 @@ To run the chat example, do each of the following in a different terminal:
  # ruby chat.rb bob
 
  # ruby chat.rb harvey
+
+ Note that the "backports" gem should be installed.
@@ -1,6 +1,7 @@
  require 'rubygems'
+ require 'backports'
  require 'bud'
- require 'chat_protocol'
+ require_relative 'chat_protocol'
 
  class ChatClient
  include Bud
@@ -13,7 +14,7 @@ class ChatClient
  end
 
  bootstrap do
- connect <~ [[@server, [ip_port, @nick]]]
+ connect <~ [[@server, ip_port, @nick]]
  end
 
  bloom do
@@ -1,7 +1,7 @@
  module ChatProtocol
  state do
+ channel :connect, [:@addr, :client] => [:nick]
  channel :mcast
- channel :connect
  end
 
  DEFAULT_ADDR = "localhost:12345"
@@ -1,6 +1,7 @@
  require 'rubygems'
+ require 'backports'
  require 'bud'
- require 'chat_protocol'
+ require_relative 'chat_protocol'
 
  class ChatServer
  include Bud
@@ -9,7 +10,7 @@ class ChatServer
  state { table :nodelist }
 
  bloom do
- nodelist <= connect.payloads
+ nodelist <= connect { |c| [c.client, c.nick] }
  mcast <~ (mcast * nodelist).pairs { |m,n| [n.key, m.val] }
  end
  end
data/lib/bud/aggs.rb CHANGED
@@ -1,5 +1,3 @@
- require 'set'
-
  module Bud
  ######## Agg definitions
  class Agg #:nodoc: all
@@ -207,4 +205,20 @@ module Bud
  def accum(x)
  [Accum.new, x]
  end
+
+ class AccumPair < Agg #:nodoc: all
+ def init(fst, snd)
+ [[fst, snd]].to_set
+ end
+ def trans(the_state, fst, snd)
+ the_state << [fst, snd]
+ return the_state, nil
+ end
+ end
+
+ # aggregate method to be used in Bud::BudCollection.group.
+ # accumulates x, y inputs into a set of pairs (two element arrays).
+ def accum_pair(x, y)
+ [AccumPair.new, x, y]
+ end
  end
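
The `init`/`trans` pair added above can be exercised outside Bud. The following is a minimal sketch, assuming a standalone copy of the class (without the `Agg` base) and a hypothetical driver, `accum_pairs_by_key`, that folds rows per group; how Bud itself drives aggregates during `group` is not shown in this diff.

```ruby
require 'set'

# Standalone copy of the init/trans protocol from AccumPair in the diff.
# It omits the Agg base class so it runs without the bud gem.
class AccumPairSketch
  def init(fst, snd)
    [[fst, snd]].to_set
  end

  def trans(the_state, fst, snd)
    the_state << [fst, snd]
    return the_state, nil
  end
end

# Hypothetical driver: fold rows of [group_key, x, y] into one Set per key.
def accum_pairs_by_key(rows)
  agg = AccumPairSketch.new
  groups = {}
  rows.each do |key, x, y|
    if groups.key?(key)
      # The second return value of trans is ignored in this sketch.
      groups[key], _ignored = agg.trans(groups[key], x, y)
    else
      groups[key] = agg.init(x, y)
    end
  end
  groups
end

rows = [[:a, 1, "x"], [:a, 2, "y"], [:b, 3, "z"], [:a, 1, "x"]]
result = accum_pairs_by_key(rows)
result.each { |key, pairs| puts "#{key}: #{pairs.to_a.inspect}" }
```

Because the state is a Set, the repeated `[1, "x"]` pair for group `:a` is stored only once.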
data/lib/bud/bud_meta.rb CHANGED
@@ -1,6 +1,5 @@
  require 'bud/rewrite'
 
-
  class BudMeta #:nodoc: all
  def initialize(bud_instance, declarations)
  @bud_instance = bud_instance
@@ -111,11 +110,11 @@ class BudMeta #:nodoc: all
 
  def get_qual_name(pt)
  # expect to see a parse tree corresponding to a dotted name
- # a.b.c == s(:call, s1, :c, (:args))
- # where s1 == s(:call, s2, :b, (:args))
- # where s2 == s(:call, nil, :a, (:args))
- tag, recv, name, args = pt
- return nil unless tag == :call and args.length == 1
+ # a.b.c == s(:call, s1, :c)
+ # where s1 == s(:call, s2, :b)
+ # where s2 == s(:call, nil, :a)
+ tag, recv, name, *args = pt
+ return nil unless tag == :call and args.empty?
 
  if recv
  qn = get_qual_name(recv)
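
The new parse-tree shape described in the comment can be illustrated with plain Arrays standing in for `Sexp` objects. This is only a sketch: `qual_name_sketch` is a hypothetical name, and the way the name parts are joined is an assumption, since the rest of the real method body is not visible in this hunk.

```ruby
# Walk a dotted-name parse tree in the ruby_parser 3.x shape, where a call
# with no arguments is s(:call, receiver, name) with nothing after the name.
# Plain Arrays stand in for Sexp objects.
def qual_name_sketch(pt)
  tag, recv, name, *args = pt
  return nil unless tag == :call && args.empty?

  return name.to_s if recv.nil?

  prefix = qual_name_sketch(recv)
  prefix.nil? ? nil : "#{prefix}.#{name}"
end

a_b_c = [:call, [:call, [:call, nil, :a], :b], :c]
puts qual_name_sketch(a_b_c)

# A call that carries an argument is not a plain dotted name.
with_args = [:call, nil, :foo, [:lit, 1]]
p qual_name_sketch(with_args)
```

Using a splat for `args` means the check is "no trailing elements" rather than "an empty arglist node", which matches the change from `args.length == 1` to `args.empty?`.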
@@ -127,23 +126,16 @@ class BudMeta #:nodoc: all
  end
 
  # Perform some basic sanity checks on the AST of a rule block. We expect a
- # rule block to consist of a :defn, a nested :scope, and then a sequence of
+ # rule block to consist of a :defn whose body consists of a sequence of
  # statements. Each statement is a :call node. Returns nil (no error found), a
  # Sexp (containing an error), or a pair of [Sexp, error message].
  def check_rule_ast(pt)
- # :defn format: node tag, block name, args, nested scope
- return pt if pt.sexp_type != :defn
- scope = pt[3]
- return pt if scope.sexp_type != :scope
- block = scope[1]
-
- block.each_with_index do |n,i|
- if i == 0
- return pt if n != :block
- next
- end
+ # :defn format: node tag, block name, args, body_0, ..., body_n
+ tag, name, args, *body = pt
+ return pt if tag != :defn
 
- next if i == 1 and n.sexp_type == :nil # a block got rewritten to an empty block
+ body.each_with_index do |n,i|
+ next if i == 0 and n == s(:nil) # a block got rewritten to an empty block
 
  # Check for a common case
  if n.sexp_type == :lasgn
@@ -157,8 +149,9 @@ class BudMeta #:nodoc: all
  # Check that LHS references a named collection
  lhs_name = get_qual_name(lhs)
  return [n, "unexpected lhs format: #{lhs}"] if lhs_name.nil?
- unless @bud_instance.tables.has_key? lhs_name.to_sym
- return [n, "collection does not exist: '#{lhs_name}'"]
+ unless @bud_instance.tables.has_key? lhs_name.to_sym or
+ @bud_instance.lattices.has_key? lhs_name.to_sym
+ return [n, "Collection does not exist: '#{lhs_name}'"]
  end
 
  return [n, "illegal operator: '#{op}'"] unless [:<, :<=].include? op
@@ -171,13 +164,11 @@ class BudMeta #:nodoc: all
  # XXX: We don't check for illegal superators (e.g., "<--"). That would be
  # tricky, because they are encoded as a nested unary op in the rule body.
  if op == :<
- return n unless rhs.sexp_type == :arglist
- body = rhs[1]
- return n unless body.sexp_type == :call
- op_tail = body[2]
+ return n unless rhs.sexp_type == :call
+ op_tail = rhs[2]
  return n unless [:~, :-@, :+@].include? op_tail
- rhs_args = body[3]
- return n if rhs_args.sexp_type != :arglist or rhs_args.length != 1
+ rhs_args = rhs[3..-1]
+ return n unless rhs_args.empty?
  end
  end
 
@@ -193,7 +184,7 @@ class BudMeta #:nodoc: all
  bud = @bud_instance.toplevel
  nodes = {}
  bud.t_depends.each do |d|
- #t_depends [:bud_instance, :rule_id, :lhs, :op, :body] => [:nm]
+ #t_depends [:bud_instance, :rule_id, :lhs, :op, :body] => [:nm, :in_body]
  lhs = (nodes[d.lhs] ||= Node.new(d.lhs, :init, 0, [], true, false, false, false))
  lhs.in_lhs = true
  body = (nodes[d.body] ||= Node.new(d.body, :init, 0, [], false, true, false, false))
@@ -1,5 +1,3 @@
- require 'msgpack'
-
  $struct_classes = {}
  module Bud
  ########
@@ -59,6 +57,7 @@ module Bud
  end
 
  @key_colnums = @key_cols.map {|k| @cols.index(k)}
+ @val_colnums = val_cols.map {|k| @cols.index(k)}
 
  if @cols.empty?
  @cols = nil
@@ -195,7 +194,24 @@ module Bud
  pusher_pro.tabname = the_name
  pusher_pro
  else
- @storage.map(&blk)
+ rv = []
+ self.each do |t|
+ t = blk.call(t)
+ rv << t unless t.nil?
+ end
+ rv
+ end
+ end
+
+ # XXX: Although we support each_with_index over Bud collections, using it is
+ # probably not a great idea: the index assigned to a given collection member
+ # is not defined by the language semantics.
+ def each_with_index(the_name=tabname, the_schema=schema, &blk)
+ if @bud_instance.wiring?
+ pusher = to_push_elem(the_name, the_schema)
+ pusher.each_with_index(the_name, the_schema, &blk)
+ else
+ super(&blk)
  end
  end
 
@@ -218,7 +234,7 @@ module Bud
  end
  elem.set_block(&f)
  toplevel.push_elems[[self.object_id, :flatten]] = elem
- return elem
+ elem
  else
  @storage.flat_map(&blk)
  end
@@ -230,14 +246,16 @@ module Bud
  pusher = self.pro
  pusher.sort("sort#{object_id}", @bud_instance, @cols, &blk)
  else
- @storage.sort
+ @storage.values.sort(&blk)
  end
  end
 
  def rename(the_name, the_schema=nil, &blk)
  raise unless @bud_instance.wiring?
  # a scratch with this name should have been defined during rewriting
- raise Bud::Error, "rename failed to define a scratch named #{the_name}" unless @bud_instance.respond_to? the_name
+ unless @bud_instance.respond_to? the_name
+ raise Bud::Error, "rename failed to define a scratch named #{the_name}"
+ end
  pro(the_name, the_schema, &blk)
  end
 
@@ -377,6 +395,11 @@ module Bud
  raise Bud::KeyConstraintError, "key conflict inserting #{new.inspect} into \"#{tabname}\": existing tuple #{old.inspect}, key = #{key.inspect}"
  end
 
+ private
+ def is_lattice_val(v)
+ v.kind_of? Bud::Lattice
+ end
+
  private
  def prep_tuple(o)
  return o if o.class == @struct
@@ -391,9 +414,16 @@ module Bud
  raise Bud::TypeError, "array or struct type expected in \"#{qualified_tabname}\": #{o.inspect}"
  end
 
+ @key_colnums.each do |i|
+ next if i >= o.length
+ if is_lattice_val(o[i])
+ raise Bud::TypeError, "lattice value cannot be a key for #{qualified_tabname}: #{o[i].inspect}"
+ end
+ end
  if o.length > @structlen
  raise Bud::TypeError, "too many columns for \"#{qualified_tabname}\": #{o.inspect}"
  end
+
  return @struct.new(*o)
  end
 
@@ -420,13 +450,44 @@ module Bud
  end
 
  # Merge "tup" with key values "key" into "buf". "old" is an existing tuple
- # with the same key columns as "tup" (if any such tuple exists).
+ # with the same key columns as "tup" (if any such tuple exists). If "old"
+ # exists and "tup" is not a duplicate, check whether the two tuples disagree
+ # on a non-key, non-lattice value; if so, raise a PK error. Otherwise,
+ # construct and return a merged tuple by using lattice merge functions.
  private
  def merge_to_buf(buf, key, tup, old)
- if old.nil? # no matching tuple found
+ if old.nil? # no matching tuple found
  buf[key] = tup
- elsif old != tup # ignore duplicates
- raise_pk_error(tup, old)
+ return
+ end
+ return if tup == old # ignore duplicates
+
+ # Check for PK violation
+ @val_colnums.each do |i|
+ old_v = old[i]
+ new_v = tup[i]
+
+ unless old_v == new_v || (is_lattice_val(old_v) && is_lattice_val(new_v))
+ raise_pk_error(tup, old)
+ end
+ end
+
+ # Construct new tuple version. We discard the newly-constructed tuple if
+ # merging every lattice field doesn't yield a new value.
+ new_t = null_tuple
+ saw_change = false
+ @val_colnums.each do |i|
+ if old[i] == tup[i]
+ new_t[i] = old[i]
+ else
+ new_t[i] = old[i].merge(tup[i])
+ saw_change = true if new_t[i].reveal != old[i].reveal
+ end
+ end
+
+ if saw_change
+ @key_colnums.each {|k| new_t[k] = old[k]}
+ buf[key] = new_t
  end
  end
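
The merge rule in `merge_to_buf` can be tried in isolation. The sketch below is not Bud code: `MaxLatticeSketch` is a hypothetical max lattice that models only `merge` and `reveal`, tuples are plain Arrays, and `merge_tuple` returns the tuple to store (or `nil` when nothing changes) instead of writing into a buffer.

```ruby
# Hypothetical max lattice; only merge, reveal and equality are modeled.
class MaxLatticeSketch
  attr_reader :reveal

  def initialize(v)
    @reveal = v
  end

  def merge(other)
    MaxLatticeSketch.new([reveal, other.reveal].max)
  end

  def ==(other)
    other.is_a?(MaxLatticeSketch) && other.reveal == reveal
  end
end

class KeyConflictSketch < StandardError; end

# key_cols and val_cols are column indexes into the Array tuples.
def merge_tuple(old, tup, key_cols, val_cols)
  return tup if old.nil?   # no matching tuple found
  return nil if old == tup # ignore duplicates

  lattice = ->(v) { v.is_a?(MaxLatticeSketch) }

  # Disagreement on a non-lattice value column is a key violation.
  val_cols.each do |i|
    unless old[i] == tup[i] || (lattice.call(old[i]) && lattice.call(tup[i]))
      raise KeyConflictSketch, "conflict in column #{i}"
    end
  end

  # Merge lattice columns; keep the old tuple if no merge yields a new value.
  merged = Array.new(old.length)
  changed = false
  val_cols.each do |i|
    if old[i] == tup[i]
      merged[i] = old[i]
    else
      merged[i] = old[i].merge(tup[i])
      changed = true if merged[i].reveal != old[i].reveal
    end
  end
  return nil unless changed

  key_cols.each { |k| merged[k] = old[k] }
  merged
end

old = ["k", MaxLatticeSketch.new(3), "x"]
grown = merge_tuple(old, ["k", MaxLatticeSketch.new(5), "x"], [0], [1, 2])
stale = merge_tuple(old, ["k", MaxLatticeSketch.new(2), "x"], [0], [1, 2])
puts grown[1].reveal
p stale
```

A larger lattice value replaces the stored tuple, a smaller one is absorbed without a change, and a differing plain value in the same key group raises the conflict error.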
 
@@ -508,6 +569,12 @@ module Bud
  add_merge_target
  tbl = register_coll_expr(o)
  tbl.pro.wire_to self
+ elsif o.class <= Bud::LatticePushElement
+ add_merge_target
+ o.wire_to self
+ elsif o.class <= Bud::LatticeWrapper
+ add_merge_target
+ o.to_push_elem.wire_to self
  else
  unless o.nil?
  o = o.uniq.compact if o.respond_to?(:uniq)
@@ -557,6 +624,12 @@ module Bud
  add_merge_target
  tbl = register_coll_expr(o)
  tbl.pro.wire_to(self, :pending)
+ elsif o.class <= Bud::LatticePushElement
+ add_merge_target
+ o.wire_to(self, :pending)
+ elsif o.class <= Bud::LatticeWrapper
+ add_merge_target
+ o.to_push_elem.wire_to(self, :pending)
  else
  pending_merge(o)
  end
@@ -656,10 +729,10 @@ module Bud
  # for each distinct value of the grouping key columns, return the items in that group
  # that have the value of the exemplary aggregate +aggname+
  public
- def argagg(aggname, gbkey_cols, collection)
+ def argagg(aggname, gbkey_cols, collection, &blk)
  elem = to_push_elem
  gbkey_cols = gbkey_cols.map{|k| canonicalize_col(k)} unless gbkey_cols.nil?
- retval = elem.argagg(aggname, gbkey_cols, canonicalize_col(collection))
+ retval = elem.argagg(aggname, gbkey_cols, canonicalize_col(collection), &blk)
  # PushElement inherits the schema accessors from this Collection
  retval.extend @cols_access
  retval
@@ -669,37 +742,46 @@ module Bud
  # that group that have the minimum value of the attribute +col+. Note that
  # multiple tuples might be returned.
  public
- def argmin(gbkey_cols, col)
- argagg(:min, gbkey_cols, col)
+ def argmin(gbkey_cols, col, &blk)
+ argagg(:min, gbkey_cols, col, &blk)
  end
 
  # for each distinct value of the grouping key columns, return the items in
  # that group that have the maximum value of the attribute +col+. Note that
  # multiple tuples might be returned.
  public
- def argmax(gbkey_cols, col)
- argagg(:max, gbkey_cols, col)
+ def argmax(gbkey_cols, col, &blk)
+ argagg(:max, gbkey_cols, col, &blk)
  end
 
  # form a collection containing all pairs of items in +self+ and items in
  # +collection+
  public
  def *(collection)
- elem1 = to_push_elem
- return elem1.join(collection)
+ return to_push_elem.join(collection)
+ end
+
+ def prep_aggpairs(aggpairs)
+ aggpairs.map do |ap|
+ agg, *rest = ap
+ if rest.empty?
+ [agg]
+ else
+ [agg] + rest.map {|c| canonicalize_col(c)}
+ end
+ end
  end
 
  def group(key_cols, *aggpairs, &blk)
- elem = to_push_elem
  key_cols = key_cols.map{|k| canonicalize_col(k)} unless key_cols.nil?
- aggpairs = aggpairs.map{|ap| [ap[0], canonicalize_col(ap[1])].compact} unless aggpairs.nil?
- return elem.group(key_cols, *aggpairs, &blk)
+ aggpairs = prep_aggpairs(aggpairs)
+ return to_push_elem.group(key_cols, *aggpairs, &blk)
  end
 
  def notin(collection, *preds, &blk)
  elem1 = to_push_elem
  elem2 = collection.to_push_elem
- return elem1.notin(elem2, preds, &blk)
+ return elem1.notin(elem2, *preds, &blk)
  end
 
  def canonicalize_col(col)
@@ -725,6 +807,7 @@ module Bud
  false
  end
 
+ # tick_delta for scratches is @storage, so iterate over that instead
  public
  def each_tick_delta(&block)
  @storage.each_value(&block)
@@ -870,28 +953,45 @@ module Bud
  raise Bud::Error, "'#{t[@locspec_idx]}', channel '#{@tabname}'" if the_locspec[0].nil? or the_locspec[1].nil? or the_locspec[0] == '' or the_locspec[1] == ''
  end
  puts "channel #{qualified_tabname}.send: #{t}" if $BUD_DEBUG
- toplevel.dsock.send_datagram([qualified_tabname.to_s, t].to_msgpack,
- the_locspec[0], the_locspec[1])
+
+ # Convert the tuple into a suitable wire format. Because MsgPack cannot
+ # marshal arbitrary Ruby objects that we need to send via channels (in
+ # particular, lattice values and Class instances), we first encode such
+ # values using Marshal, and then encode the entire tuple with
+ # MsgPack. Obviously, this is gross. The wire format also includes an
+ # array of indices, indicating which fields hold Marshall'd objects.
+ marshall_indexes = []
+ wire_tuple = Array.new(t.length)
+ t.each_with_index do |f, i|
+ if [Bud::Lattice, Class].any?{|t| f.class <= t}
+ marshall_indexes << i
+ wire_tuple[i] = Marshal.dump(f)
+ else
+ wire_tuple[i] = f
+ end
+ end
+ wire_str = [qualified_tabname.to_s, wire_tuple, marshall_indexes].to_msgpack
+ toplevel.dsock.send_datagram(wire_str, the_locspec[0], the_locspec[1])
  end
  @pending.clear
  end
 
  public
  # project to the non-address fields
- def payloads
- return self.pro if @is_loopback
-
- if cols.size > 2
- # bundle up each tuple's non-locspec fields into an array
- retval = case @locspec_idx
- when 0 then self.pro{|t| t.values_at(1..(t.size-1))}
- when (schema.size - 1) then self.pro{|t| t.values_at(0..(t.size-2))}
- else self.pro{|t| t.values_at(0..(@locspec_idx-1), @locspec_idx+1..(t.size-1))}
- end
- else
- # just return each tuple's non-locspec field value
- retval = self.pro{|t| t[(@locspec_idx == 0) ? 1 : 0]}
+ def payloads(&blk)
+ return self.pro(&blk) if @is_loopback
+
+ if @payload_struct.nil?
+ payload_cols = cols.dup
+ payload_cols.delete_at(@locspec_idx)
+ @payload_struct = Struct.new(*payload_cols)
+ @payload_colnums = payload_cols.map {|k| cols.index(k)}
+ end
+
+ retval = self.pro do |t|
+ @payload_struct.new(*t.values_at(*@payload_colnums))
  end
+ retval = retval.pro(&blk) unless blk.nil?
  return retval
  end
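
The behavior change in `payloads` (always a tuple of the k-1 non-address columns, never a bare scalar) can be sketched without Bud. The helper `payload_projector` below is hypothetical; it reuses the `cols`, `payload_cols` and `payload_colnums` names from the diff but operates on plain Arrays, with column names assumed to be plain symbols.

```ruby
# Build a projection that drops the address column and wraps the remaining
# fields in a Struct, following the steps of the new payloads method.
def payload_projector(cols, locspec_idx)
  payload_cols = cols.dup
  payload_cols.delete_at(locspec_idx)
  payload_struct = Struct.new(*payload_cols)
  payload_colnums = payload_cols.map { |k| cols.index(k) }
  ->(tuple) { payload_struct.new(*tuple.values_at(*payload_colnums)) }
end

# Three-column channel: address, client, nick.
project = payload_projector([:addr, :client, :nick], 0)
p1 = project.call(["localhost:12345", "127.0.0.1:5001", "alice"])
puts p1.client, p1.nick

# Two-column channel: the payload is a one-column tuple, not a scalar.
two_col = payload_projector([:addr, :val], 0)
p2 = two_col.call(["localhost:12345", "hello"])
p p2.to_a
```

Code that relied on the old scalar result for two-column channels would now need to read the single field by name or index.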
 
@@ -903,6 +1003,12 @@ module Bud
  elsif o.class <= Proc
  tbl = register_coll_expr(o)
  tbl.pro.wire_to(self, :pending)
+ elsif o.class <= Bud::LatticePushElement
+ add_merge_target
+ o.wire_to(self, :pending)
+ elsif o.class <= Bud::LatticeWrapper
+ add_merge_target
+ o.to_push_elem.wire_to(self, :pending)
  else
  pending_merge(o)
  end
@@ -946,7 +1052,7 @@ module Bud
  port = toplevel.port
  EventMachine::schedule do
  socket = EventMachine::open_datagram_socket("127.0.0.1", 0)
- socket.send_datagram([tabname, tup].to_msgpack, ip, port)
+ socket.send_datagram([tabname, tup, []].to_msgpack, ip, port)
  end
  end
  rescue Exception
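
The channel code in this diff now sends a three-element message, `[tabname, tuple, marshall_indexes]`, which is why the sender above passes an empty array as the third element. The sketch below illustrates only the per-field `Marshal` step using the standard library. The outer MsgPack encoding is omitted, `needs_marshal?` is a simplified stand-in for the `Bud::Lattice`/`Class` check, and `decode_tuple` is a hypothetical receiver, since this diff shows only the sending side.

```ruby
# Simplified stand-in for the Bud::Lattice / Class check in the diff:
# here only Class objects are treated as needing Marshal.
def needs_marshal?(field)
  field.is_a?(Class)
end

# Build [tabname, wire_tuple, marshall_indexes]; the real code then encodes
# this whole array with MsgPack, which is left out of this sketch.
def encode_tuple(tabname, tuple)
  marshall_indexes = []
  wire_tuple = tuple.each_with_index.map do |f, i|
    if needs_marshal?(f)
      marshall_indexes << i
      Marshal.dump(f)
    else
      f
    end
  end
  [tabname.to_s, wire_tuple, marshall_indexes]
end

# Hypothetical receiver: restore the fields listed in marshall_indexes.
def decode_tuple(wire)
  tabname, wire_tuple, marshall_indexes = wire
  tuple = wire_tuple.dup
  marshall_indexes.each { |i| tuple[i] = Marshal.load(tuple[i]) }
  [tabname, tuple]
end

wire = encode_tuple(:chan, ["localhost:12345", String, 42])
p wire[2]
p decode_tuple(wire)
```

Carrying the index list alongside the tuple lets a receiver tell a Marshal-encoded field apart from an ordinary string without inspecting its contents.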
@@ -1195,7 +1301,19 @@ module Bud
 
  public
  def each(&block)
- @expr.call.each(&block)
+ v = @expr.call
+
+ # XXX: Gross hack. We want to support RHS expressions that do not
+ # necessarily return BudCollections (they might instead return lattice
+ # values or hashes). Since it isn't easy to distinguish between these two
+ # cases statically, instead we just always use CollExpr; at runtime, if
+ # the value doesn't look like a traditional Bloom collection, we don't try
+ # to break it up into tuples.
+ if v.class <= Array || v.class <= BudCollection
+ v.each(&block)
+ else
+ yield v
+ end
  end
 
  public